Skip to content

Implement Alembic for database migrations #2

@nanotaboada

Description

@nanotaboada

Problem

The FastAPI project currently lacks a robust database migration management system. Without proper migration tooling, schema changes are difficult to track, version control, and deploy consistently across environments. This creates risks during database evolution and makes collaboration challenging when multiple developers work on schema changes.

Additional Challenge: The project uses async SQLAlchemy with aiosqlite, which requires special consideration when setting up Alembic to ensure compatibility with the async database engine and session management.

Project Architecture Note: The project has an unconventional but functional structure:

  • models/player_model.py → Pydantic models for API validation
  • schemas/player_schema.py → SQLAlchemy ORM models (actual database tables)

This is opposite to the typical FastAPI convention where "schemas" are Pydantic and "models" are SQLAlchemy, but Alembic will work with the existing structure.

Required by: #542 (Add PostgreSQL support with unified migration-based initialization)

Proposed Solution

Implement Alembic as the database migration tool for the FastAPI project. Alembic will provide:

  • Automatic migration generation from SQLAlchemy model changes
  • Version-controlled schema changes that can be tracked in Git
  • Bidirectional migrations (upgrade/downgrade capabilities)
  • Dual-dialect support: migrations must be compatible with both SQLite (local dev/test) and PostgreSQL (Docker/production, see Add PostgreSQL support with unified migration-based initialization #542)
  • Seamless SQLAlchemy integration leveraging existing ORM models

Why Alembic over Prisma Client?

After evaluation, Alembic is the recommended choice for the following reasons:

Alembic Advantages:

  • Native SQLAlchemy integration (built by the same author)
  • Pure Python implementation, no additional toolchain required
  • Mature ecosystem with extensive community support
  • Fine-grained control over migration scripts
  • Works with any SQLAlchemy-supported database

Prisma Client Limitations:

  • Requires Node.js runtime alongside Python
  • Uses its own schema definition language (Prisma Schema), not SQLAlchemy models
  • Would require maintaining two sources of truth for database schema
  • Less mature Python support (Prisma primarily TypeScript-focused)

Suggested Approach

1. Project Structure

project_root/
├── alembic/
│   ├── versions/
│   ├── env.py
│   ├── script.py.mako
│   └── README
├── alembic.ini
├── databases/player_database.py   # Async SQLAlchemy setup ✓
├── models/player_model.py         # Pydantic models ✓
├── schemas/player_schema.py       # SQLAlchemy ORM models ✓
├── main.py
└── pyproject.toml

2. Installation and Setup

  • Add alembic and asyncpg to pyproject.toml dependencies
  • Initialize: alembic init alembic
  • Configure alembic/env.py (see below)

3. Primary key design: UUID, not auto-increment integer

The Player schema uses a custom HyphenatedUUID type as the primary key (stored as String(36), hyphenated UUID format):

# schemas/player_schema.py
id = Column(
    HyphenatedUUID(),   # String(36) under the hood
    primary_key=True,
    default=uuid4,
    nullable=False,
)

Alembic autogenerate will detect HyphenatedUUID as String(36) and generate the column correctly. No manual column type override needed — but the generated migration should be reviewed to confirm the String(36) mapping.

Seed data in migrations must use deterministic UUID v5 values (stable across environments), not random UUID v4. This is consistent with the docstring in schemas/player_schema.py:

Records seeded by migration scripts use deterministic UUID v5 values so that IDs are stable across environments and can be safely referenced in tests.

4. Configure alembic/env.py

Read DATABASE_URL from the environment, supporting both SQLite and PostgreSQL:

import asyncio
import os
from logging.config import fileConfig
from sqlalchemy import pool
from sqlalchemy.engine import Connection
from sqlalchemy.ext.asyncio import async_engine_from_config
from alembic import context

from databases.player_database import Base
from schemas.player_schema import Player

# Supports both SQLite (local) and PostgreSQL (Docker, see #542):
#   sqlite+aiosqlite:///players-sqlite3.db
#   postgresql+asyncpg://postgres:password@postgres:5432/playersdb
database_url = os.getenv("DATABASE_URL", "sqlite+aiosqlite:///players-sqlite3.db")

config = context.config
config.set_main_option("sqlalchemy.url", database_url)

if config.config_file_name is not None:
    fileConfig(config.config_file_name)

target_metadata = Base.metadata


def run_migrations_offline() -> None:
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        dialect_opts={"paramstyle": "named"},
        render_as_batch=True,   # Required for SQLite ALTER TABLE support
    )
    with context.begin_transaction():
        context.run_migrations()


def do_run_migrations(connection: Connection) -> None:
    context.configure(
        connection=connection,
        target_metadata=target_metadata,
        render_as_batch=True,   # Required for SQLite ALTER TABLE support
    )
    with context.begin_transaction():
        context.run_migrations()


async def run_async_migrations() -> None:
    connectable = async_engine_from_config(
        config.get_section(config.config_ini_section, {}),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )
    async with connectable.connect() as connection:
        await connection.run_sync(do_run_migrations)
    await connectable.dispose()


def run_migrations_online() -> None:
    asyncio.run(run_async_migrations())


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()

render_as_batch=True: Required for SQLite, which does not support ALTER TABLE natively. Alembic uses a batch process (copy-transform-replace) instead. Harmless on PostgreSQL.

5. Apply migrations on startup

Update the lifespan handler in main.py:

from alembic.config import Config
from alembic import command

@asynccontextmanager
async def lifespan(_: FastAPI) -> AsyncIterator[None]:
    logger.info("Applying database migrations...")
    alembic_cfg = Config("alembic.ini")
    command.upgrade(alembic_cfg, "head")
    logger.info("Database migrations applied successfully.")
    yield
    logger.info("Application shutting down...")

6. Cross-dialect migration considerations

  • UUID primary key: HyphenatedUUID autogenerates as VARCHAR(36) — works in both SQLite and PostgreSQL
  • render_as_batch=True: Required for SQLite ALTER TABLE; harmless on PostgreSQL
  • Seed data: Use UUID v5 with a consistent namespace so values are deterministic across environments

7. Migration structure

Three migrations, each independently reversible:

alembic/versions/
  001_create_players_table.py    ← autogenerated from Player schema
  002_seed_starting11.py         ← manual: inserts 11 Starting XI players
  003_seed_substitutes.py        ← manual: inserts 14 Substitute players

Educational note: Splitting seed data across two migrations illustrates the incremental, composable nature of the migration system. alembic downgrade -1 removes only the substitutes, leaving the starting 11 intact.

Seed migrations are written manually (not autogenerated) and use op.bulk_insert() or raw op.execute():

# 002_seed_starting11.py
def upgrade() -> None:
    op.execute("""
        INSERT INTO players (id, firstName, ...) VALUES
        ('uuid-v5-value-1', 'Emiliano', ...),
        -- ... 10 more Starting XI players
    """)

def downgrade() -> None:
    op.execute("DELETE FROM players WHERE starting11 = true")
# 003_seed_substitutes.py
def upgrade() -> None:
    op.execute("""
        INSERT INTO players (id, firstName, ...) VALUES
        ('uuid-v5-value-12', 'Gerónimo', ...),
        -- ... 13 more Substitute players
    """)

def downgrade() -> None:
    op.execute("DELETE FROM players WHERE starting11 = false")

8. Migration Workflow

# Auto-generate schema migration from model changes
alembic revision --autogenerate -m "description"

# Apply all pending migrations
alembic upgrade head

# Roll back last migration
alembic downgrade -1

9. Remove pre-seeded database

  • Remove committed .db file from git (if present)
  • Add *.db to .gitignore
  • Move seed data into an Alembic data migration using deterministic UUID v5 values

10. Update documentation

  • Document DATABASE_URL environment variable
  • Add migration workflow to README.md
  • Explain UUID v5 vs v4 seed strategy

Acceptance Criteria

  • Alembic installed and initialized
  • alembic/env.py reads DATABASE_URL from environment (SQLite default, PostgreSQL compatible)
  • alembic/env.py configured for async SQLAlchemy (run_async_migrations)
  • render_as_batch=True set for SQLite ALTER TABLE compatibility
  • target_metadata references Base.metadata from schemas/player_schema.py
  • HyphenatedUUID primary key generates as VARCHAR(36) in migration — verified in autogenerated script
  • 002_seed_starting11.py seeds the 11 Starting XI players with deterministic UUID v5 values
  • 003_seed_substitutes.py seeds the 14 Substitute players with deterministic UUID v5 values
  • Rolling back 003 removes only substitutes; 002 removes only Starting XI
  • Startup lifespan applies migrations automatically (alembic upgrade head)
  • asyncpg added to pyproject.toml
  • Migrations verified compatible with both SQLite and PostgreSQL
  • Pre-seeded .db file removed from git; *.db added to .gitignore
  • All existing tests pass
  • README.md documents migration workflow and DATABASE_URL variable

References

Metadata

Metadata

Assignees

Labels

good first issueGood for newcomerspriority highImportant for production readiness. Schedule for current milestone.pythonPull requests that update Python codequestionFurther information is requested

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions