
Persistency storage

Data model

We use the following data model, implemented in yaptide/persistence/

Simulation model and dependent classes:

  class SimulationModel {
    id: int
    job_id: str
    user_id: int
    start_time: datetime
    end_time: datetime
    title: str
    platform: str
    input_type: str
    sim_type: str
    job_state: str
    update_key_hash: str
  }

  class CelerySimulationModel {
    id: int
    merge_id: str
  }

  class BatchSimulationModel {
    id: int
    cluster_id: int
    job_dir: str
    array_id: int
    collect_id: int
  }

  class TaskModel {
    id: int
    simulation_id: int
    task_id: int
    requested_primaries: int
    simulated_primaries: int
    task_state: str
    estimated_time: int
    start_time: datetime
    end_time: datetime
    platform: str
    last_update_time: datetime
  }

  class CeleryTaskModel {
    id: int
    celery_id: str
  }

  class BatchTaskModel {
    id: int
  }

  class InputModel {
    id: int
    simulation_id: int
    compressed_data: bytes
  }

  class EstimatorModel {
    id: int
    simulation_id: int
    name: str
    compressed_data: bytes
  }

  class PageModel {
    id: int
    estimator_id: int
    page_number: int
    compressed_data: bytes
  }

  class LogfilesModel {
    id: int
    simulation_id: int
    compressed_data: bytes
  }

  SimulationModel <|-- CelerySimulationModel
  SimulationModel <|-- BatchSimulationModel
  TaskModel <|-- CeleryTaskModel
  TaskModel <|-- BatchTaskModel
  SimulationModel "1" *-- "0..*" TaskModel
  SimulationModel "1" *-- "0..*" EstimatorModel
  EstimatorModel "1" *-- "0..*" PageModel
  SimulationModel "1" *-- "0..*" LogfilesModel
  SimulationModel *-- InputModel
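
Several of the models above store their payload in a compressed_data field of type bytes. The following is a minimal sketch of how a JSON-serializable payload could be packed into and read back from such a field. It assumes UTF-8 JSON compressed with gzip; the actual scheme is defined in the yaptide/persistence/ code and may differ. The helper names compress and decompress are hypothetical.

```python
import gzip
import json


def compress(data) -> bytes:
    """Serialize a JSON-compatible object and gzip it (assumed scheme)."""
    return gzip.compress(json.dumps(data).encode("utf-8"))


def decompress(blob: bytes):
    """Inverse of compress(): gunzip the bytes and parse the JSON."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Hypothetical payload resembling one page of an estimator.
page = {"page_number": 0, "values": [0.0, 1.5, 3.0]}
blob = compress(page)        # this is what would go into compressed_data
restored = decompress(blob)  # this is what a reader would get back
```

Storing the payload as opaque bytes keeps the table schema independent of the structure of the simulation input and results.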

Other classes we use are:

  class UserModel {
    id: int
    username: str
    auth_provider: str
  }

  class YaptideUserModel {
    id: int
    password_hash: str
  }

  class KeycloakUserModel {
    id: int
    cert: str
    private_key: str
  }

  class ClusterModel {
    id: int
    cluster_name: str
  }

  UserModel <|-- YaptideUserModel
  UserModel <|-- KeycloakUserModel

We did not write the mermaid code for these diagrams by hand; ChatGPT does a good job of generating it. Whenever you need to update the diagrams, copy the code from the yaptide/persistence/ file and ask ChatGPT to generate the diagram for you.


The production version uses a PostgreSQL database, while the unit test suite uses an in-memory SQLite database.
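
To experiment with the one-to-many composition between SimulationModel and TaskModel without a running PostgreSQL instance, an in-memory SQLite database is enough. The sketch below uses the standard-library sqlite3 module directly rather than the project's SQLAlchemy models. The table names, the reduced set of columns, and the ON DELETE CASCADE behaviour are assumptions made for illustration, not the project's actual schema.

```python
import sqlite3

# In-memory database: it lives only as long as this connection.
conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Assumed, simplified schema: one simulation owns zero or more tasks.
conn.executescript(
    """
    CREATE TABLE simulation (
        id INTEGER PRIMARY KEY,
        job_id TEXT,
        title TEXT,
        job_state TEXT
    );
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        simulation_id INTEGER NOT NULL
            REFERENCES simulation(id) ON DELETE CASCADE,
        task_id INTEGER,
        task_state TEXT
    );
    """
)

with conn:  # commits on success
    conn.execute(
        "INSERT INTO simulation (id, job_id, title, job_state) VALUES (?, ?, ?, ?)",
        (1, "job-1", "demo", "PENDING"),
    )
    conn.executemany(
        "INSERT INTO task (simulation_id, task_id, task_state) VALUES (?, ?, ?)",
        [(1, 0, "PENDING"), (1, 1, "PENDING")],
    )

tasks_before = conn.execute(
    "SELECT COUNT(*) FROM task WHERE simulation_id = 1"
).fetchone()[0]

with conn:
    # Deleting the owning simulation removes its tasks via the cascade.
    conn.execute("DELETE FROM simulation WHERE id = 1")

tasks_after = conn.execute("SELECT COUNT(*) FROM task").fetchone()[0]
```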

Sometimes it may be convenient to connect to the production DB from outside the container, e.g. to inspect its contents. The following command prints the DB URL. Note that it is called with hide_password=False, so the output contains the DB password in plain text.

docker exec -it yaptide_flask bash -c "cd /usr/local/app && python -c 'from yaptide.application import create_app; app = create_app(); app.app_context().push() or print(app.extensions[\"sqlalchemy\"].engine.url.render_as_string(hide_password=False))'"

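The command above is written as a handy one-liner, so it may look tricky, especially the app.app_context().push() or part. The reason for this hack is simple. The regular ways of getting the DB URL require an application context. This is usually set up with the with app.app_context(): construct, which cannot be used here, because a compound statement such as with cannot follow other statements on the same line. Instead, push() activates the context and returns None, so the or expression goes on to evaluate print(...).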
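
The idiom can be tried in isolation. The sketch below uses FakeAppContext, a hypothetical stand-in for a Flask application context, and current_db_url, a hypothetical function that only works while the context is active; neither is part of yaptide or Flask. It shows that the push() or ... form and the with form give the same result.

```python
class FakeAppContext:
    """Hypothetical stand-in for a Flask application context."""

    def __init__(self):
        self.active = False

    def push(self):
        # Activates the context; returns None implicitly.
        self.active = True

    def __enter__(self):
        self.push()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.active = False


def current_db_url(ctx):
    """Hypothetical lookup that requires an active context."""
    if not ctx.active:
        raise RuntimeError("working outside of application context")
    return "sqlite://"


# One-liner form: push() returns None (falsy), so `or` evaluates the right side.
ctx = FakeAppContext()
one_liner_url = ctx.push() or current_db_url(ctx)

# Regular form: needs its own indented block, so it does not fit after `;`.
with FakeAppContext() as ctx2:
    with_url = current_db_url(ctx2)
```

One difference worth keeping in mind is that the with form deactivates the context when the block ends, while a bare push() leaves it active, which is harmless in a short-lived python -c process.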

Knowing the DB URL, you can connect to the DB using any DB client, e.g. psql or pgAdmin. You can also use the script from the yaptide/admin directory. For example, to list all users in the DB, you can use the following command from outside the container:

FLASK_SQLALCHEMY_DATABASE_URI=postgresql+psycopg://yaptide_user:yaptide_password@localhost:5432/yaptide_db ./yaptide/admin/ list-users

This is equivalent to the following command executed inside the container:

docker exec -it yaptide_flask ./yaptide/admin/ list-users
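
When connecting with psql instead, the URL usually cannot be passed as is, because the +psycopg part of the scheme is SQLAlchemy's driver suffix rather than something psql understands. The sketch below splits such a URL into the standard psql options -h, -p, -U and -d. The helper name psql_args is hypothetical, and the fallback to port 5432 is an assumption based on the PostgreSQL default.

```python
from urllib.parse import urlsplit


def psql_args(sqlalchemy_url: str) -> list:
    """Build a psql argument list from a SQLAlchemy-style PostgreSQL URL."""
    parts = urlsplit(sqlalchemy_url)
    return [
        "psql",
        "-h", parts.hostname,
        "-p", str(parts.port or 5432),  # assume the PostgreSQL default port
        "-U", parts.username,
        "-d", parts.path.lstrip("/"),   # the path component is the DB name
    ]


url = "postgresql+psycopg://yaptide_user:yaptide_password@localhost:5432/yaptide_db"
args = psql_args(url)
# The password is deliberately left out of the argument list;
# psql prompts for it, or reads it from the PGPASSWORD environment variable.
password = urlsplit(url).password
```

The resulting list can be passed to subprocess.run or joined into a shell command.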