# Database
The backend uses PostgreSQL via SQLAlchemy with Flask-Migrate (Alembic) for schema management. In tests, an in-memory SQLite database is used.
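Switching between the two backends is just a matter of the connection URI handed to SQLAlchemy. A minimal sketch of how that selection could look (the `PYTEST_RUNNING` marker and the default credentials are illustrative assumptions, not the project's actual code; `FLASK_SQLALCHEMY_DATABASE_URI` is the standard Flask-SQLAlchemy setting also used later on this page):

```python
# Illustrative sketch: pick the database URI depending on context.
# PYTEST_RUNNING is an assumed test marker; the real project may
# detect test mode differently.
import os

def database_uri() -> str:
    if os.environ.get("PYTEST_RUNNING"):
        # In-memory SQLite: fast, and discarded when the process exits.
        return "sqlite:///:memory:"
    return os.environ.get(
        "FLASK_SQLALCHEMY_DATABASE_URI",
        # Illustrative default; real credentials come from deployment config.
        "postgresql://yaptide:password@localhost:5432/yaptide",
    )
```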
## Data Model

The database contains 12 tables organized around users, simulations, tasks, and results.
### Entity Relationships

```
UserModel (polymorphic)
├── YaptideUserModel
└── KeycloakUserModel

UserModel ──< SimulationModel (polymorphic)
              ├── CelerySimulationModel
              └── BatchSimulationModel

SimulationModel ──< TaskModel (polymorphic)
                    ├── CeleryTaskModel
                    └── BatchTaskModel

SimulationModel ──< InputModel
SimulationModel ──< EstimatorModel ──< PageModel
SimulationModel ──< LogfilesModel

ClusterModel ──< BatchSimulationModel
```

## Tables

### User Tables

User — base user table with polymorphic inheritance on auth_provider.
| Column | Type | Description |
|---|---|---|
id | Integer (PK) | Auto-incremented user ID |
username | String | Unique username |
auth_provider | String | "yaptide" or "keycloak" (discriminator) |
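The joined-table layout behind this inheritance can be illustrated with plain SQL: the base row carries the `auth_provider` discriminator, and the subtype table shares its primary key with the base table. The SQL below is a stdlib-only sketch following the columns documented here, not the real schema DDL:

```python
# Sketch of joined-table polymorphic inheritance using sqlite3
# (mirrors the in-memory database used in tests). Table and column
# names follow the tables on this page; the DDL itself is illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "User" (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT UNIQUE NOT NULL,
    auth_provider TEXT NOT NULL          -- 'yaptide' or 'keycloak'
);
CREATE TABLE "YaptideUser" (
    id INTEGER PRIMARY KEY REFERENCES "User"(id),
    password_hash TEXT
);
""")
conn.execute('INSERT INTO "User" (username, auth_provider) VALUES (?, ?)',
             ("alice", "yaptide"))
uid = conn.execute('SELECT id FROM "User" WHERE username = ?',
                   ("alice",)).fetchone()[0]
conn.execute('INSERT INTO "YaptideUser" (id, password_hash) VALUES (?, ?)',
             (uid, "pbkdf2:sha256:..."))
# Loading a YaptideUser joins the two tables on the shared primary key.
row = conn.execute("""
    SELECT u.username, u.auth_provider, y.password_hash
    FROM "User" u JOIN "YaptideUser" y ON y.id = u.id
""").fetchone()
```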
YaptideUser — native auth users.
| Column | Type | Description |
|---|---|---|
id | Integer (FK → User) | |
password_hash | String | Werkzeug PBKDF2 hash |
KeycloakUser — Keycloak/PLGrid users.
| Column | Type | Description |
|---|---|---|
id | Integer (FK → User) | |
cert | Text | SSH certificate (PEM) |
private_key | Text | SSH private key (PEM) |
### Simulation Tables

Simulation — base simulation table with polymorphic inheritance on platform.
| Column | Type | Description |
|---|---|---|
id | Integer (PK) | Auto-incremented |
job_id | String | UUID, used as the external identifier |
user_id | Integer (FK → User) | Owner |
title | String | Simulation title |
platform | String | "DIRECT" or "BATCH" (discriminator) |
input_type | String | "editor" or "files" |
sim_type | String | "shieldhit", "fluka", "topas" |
job_state | String | Current state (see lifecycle) |
start_time | DateTime | Job submission time |
end_time | DateTime | Job completion time |
update_key_hash | String | Hashed JWT for worker auth |
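The `update_key_hash` pattern means the row stores only a digest of the worker's JWT, so reading the database does not yield a usable credential. A sketch of the idea is below; SHA-256 and both function names are assumptions for illustration (the source says only that the JWT is hashed):

```python
# Sketch of the update-key idea: store a hash, compare on each worker
# request. SHA-256 and these function names are illustrative assumptions.
import hashlib
import hmac

def hash_update_key(token: str) -> str:
    """Digest stored in the update_key_hash column."""
    return hashlib.sha256(token.encode()).hexdigest()

def verify_update_key(token: str, update_key_hash: str) -> bool:
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(hash_update_key(token), update_key_hash)
```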
CelerySimulation — direct (Celery) simulations.
| Column | Type | Description |
|---|---|---|
id | Integer (FK → Simulation) | |
merge_id | String | Celery task ID for the merge step |
BatchSimulation — batch (Slurm) simulations.
| Column | Type | Description |
|---|---|---|
id | Integer (FK → Simulation) | |
cluster_id | Integer (FK → Cluster) | Target HPC cluster |
job_dir | String | Remote working directory on cluster |
array_id | String | Slurm array job ID |
collect_id | String | Slurm collect job ID |
### Task Tables

Task — individual simulation tasks within a job.
| Column | Type | Description |
|---|---|---|
id | Integer (PK) | |
task_id | Integer | 0-based task index |
simulation_id | Integer (FK → Simulation) | |
task_state | String | Task state |
requested_primaries | BigInteger | Target number of primaries |
simulated_primaries | BigInteger | Completed primaries |
estimated_time | Integer | Estimated remaining seconds |
start_time | DateTime | |
end_time | DateTime | |
CeleryTask — Celery-specific task data.
| Column | Type | Description |
|---|---|---|
id | Integer (FK → Task) | |
celery_id | String | Celery task UUID |
BatchTask — batch-specific task data (minimal, inherits from Task).
### Result Tables

Input — stores simulation input files.
| Column | Type | Description |
|---|---|---|
id | Integer (PK) | |
simulation_id | Integer (FK → Simulation) | |
compressed_data | LargeBinary | gzip-compressed JSON (file dict) |
Estimator — named result containers.
| Column | Type | Description |
|---|---|---|
id | Integer (PK) | |
simulation_id | Integer (FK → Simulation) | |
name | String | Estimator name |
file_name | String | Original filename |
compressed_data | LargeBinary | gzip-compressed metadata |
Page — individual scoring dimensions within an estimator.
| Column | Type | Description |
|---|---|---|
id | Integer (PK) | |
estimator_id | Integer (FK → Estimator) | |
page_number | Integer | 0-based page index |
page_name | String | Descriptive name |
page_dimension | Integer | Number of dimensions |
compressed_data | LargeBinary | gzip-compressed result data (axes, values) |
Logfiles — simulation log output.
| Column | Type | Description |
|---|---|---|
id | Integer (PK) | |
simulation_id | Integer (FK → Simulation) | |
compressed_data | LargeBinary | gzip-compressed log text |
### Cluster Table

Cluster — registered HPC clusters.
| Column | Type | Description |
|---|---|---|
id | Integer (PK) | |
cluster_name | String | e.g., "ares", "prometheus" |
## Data Compression

All large binary fields use gzip compression:
```python
from yaptide.persistence.models import compress, decompress

# Store
model.compressed_data = compress(json.dumps(data).encode())

# Retrieve
data = json.loads(decompress(model.compressed_data).decode())
```

This reduces storage for large simulation results (estimator pages can be several MB uncompressed).
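For reference, helpers with this behavior can be written with the standard library alone. This is a sketch of what `compress`/`decompress` plausibly look like internally, not the guaranteed implementation in `yaptide.persistence.models`:

```python
# Stdlib sketch of gzip helpers for LargeBinary columns.
import gzip
import json

def compress(data: bytes) -> bytes:
    """gzip-compress raw bytes before storing them in a LargeBinary column."""
    return gzip.compress(data)

def decompress(blob: bytes) -> bytes:
    """Inverse of compress: recover the original bytes."""
    return gzip.decompress(blob)

# Round trip, matching the store/retrieve pattern above:
payload = {"axes": [0, 1, 2], "values": [0.1, 0.2, 0.3]}
blob = compress(json.dumps(payload).encode())
restored = json.loads(decompress(blob).decode())
```

Repetitive numeric result data compresses well, which is why estimator pages shrink substantially under gzip.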
## Migration Workflow

The project uses Flask-Migrate (Alembic) for schema changes.
### Creating a New Migration

```bash
# After modifying models.py
poetry run flask --app yaptide.application db migrate -m "Add new column"

# Review the generated migration in migrations/versions/
# Then apply:
poetry run flask --app yaptide.application db upgrade
```

### Development Workflow
1. Modify `persistence/models.py`
2. Generate a migration: `flask db migrate -m "description"`
3. Review the auto-generated migration script
4. Test locally: `flask db upgrade`
5. Commit the migration file alongside the model changes
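For orientation, an auto-generated migration script placed under `migrations/versions/` typically looks like the fragment below. The revision IDs, table, and column names are made up for illustration, and the script runs only inside Alembic's migration context, not standalone:

```python
"""Add new column

Revision ID: 1a2b3c4d5e6f (illustrative)
Revises: 0f9e8d7c6b5a (illustrative)
"""
from alembic import op
import sqlalchemy as sa

revision = "1a2b3c4d5e6f"
down_revision = "0f9e8d7c6b5a"

def upgrade():
    # Autogenerated diff between models.py and the live schema;
    # always review before applying.
    op.add_column("Simulation", sa.Column("new_column", sa.String(), nullable=True))

def downgrade():
    # Keep downgrade reversible so `flask db downgrade` works.
    op.drop_column("Simulation", "new_column")
```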
### Production Migration

**Always back up the database before running migrations in production.**
```bash
# Backup
docker compose exec postgresql pg_dump -U yaptide yaptide > backup.sql

# Apply migration
docker compose exec yaptide_flask flask --app yaptide.application db upgrade
```

### Testing Migrations
To test a migration against a copy of the production database:
1. Dump production: `pg_dump -U yaptide yaptide > prod_backup.sql`
2. Load into a test database: `psql -U yaptide test_db < prod_backup.sql`
3. Point `FLASK_SQLALCHEMY_DATABASE_URI` at the test DB
4. Run `flask db upgrade`
5. Verify the application works correctly
## Database Access from Outside Docker

When running the database in Docker, access it with pgAdmin or psql:
```bash
# Direct psql access
docker compose exec postgresql psql -U yaptide yaptide

# Or with the develop compose file (includes pgAdmin on port 9999)
docker compose -f docker-compose.yml -f docker-compose-develop.yml up -d
# Open http://localhost:9999, login with admin@admin.com / admin
```