Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema check on startup #168

Closed
wants to merge 44 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
44 commits
Select commit Hold shift + click to select a range
ff847b2
feat: add initial check function
cbrinson-rise8 Jan 6, 2025
b878a89
feat: rebased
cbrinson-rise8 Jan 8, 2025
f61c3e1
feat: working tests
cbrinson-rise8 Jan 8, 2025
cbf1f19
feat: lint fixes
cbrinson-rise8 Jan 8, 2025
23120cd
feat: testing typemismatch
cbrinson-rise8 Jan 8, 2025
1c5b184
feat: put back in the type check
cbrinson-rise8 Jan 9, 2025
3f918a4
feat: working
cbrinson-rise8 Jan 9, 2025
c6512c5
feat: test without verify tables
cbrinson-rise8 Jan 9, 2025
6781797
feat: fix mysql test
cbrinson-rise8 Jan 9, 2025
052bc80
feat: put verify tables back int
cbrinson-rise8 Jan 9, 2025
7c5e603
feat: fix stuff
cbrinson-rise8 Jan 9, 2025
2867fef
feat: check smoke tests
cbrinson-rise8 Jan 13, 2025
6e86a25
testing something
cbrinson-rise8 Jan 13, 2025
6a00b0c
testing another thing
cbrinson-rise8 Jan 13, 2025
d15fd19
final check
cbrinson-rise8 Jan 13, 2025
dfcb34c
feat: temporary change
cbrinson-rise8 Jan 13, 2025
532db79
feat: revert smoke test temp change
cbrinson-rise8 Jan 17, 2025
bd40d67
feat: collect logs for debug
cbrinson-rise8 Jan 21, 2025
7bb9ccd
feat: type equivalence work around
cbrinson-rise8 Jan 21, 2025
7071101
making the comparison more robost
ericbuckley Jan 23, 2025
2f06501
reverting smoke tests back
ericbuckley Jan 23, 2025
fcef723
moving smoke tests to a script
ericbuckley Jan 23, 2025
1d334b1
fixing space
ericbuckley Jan 23, 2025
33adf61
adding a matrix of databases to run smoke tests against
ericbuckley Jan 23, 2025
e30af60
removing sqlite from the smoke tests
ericbuckley Jan 23, 2025
066fbdd
trying just postgres for now
ericbuckley Jan 23, 2025
a7458fe
taking out health check
ericbuckley Jan 23, 2025
4b38049
adding health check back
ericbuckley Jan 23, 2025
9e39cc4
removing sleep call
ericbuckley Jan 23, 2025
f05eda4
updating the health check
ericbuckley Jan 23, 2025
dfa73cf
adding more databases to the matrix
ericbuckley Jan 23, 2025
cb25fa7
skip verify_tables_match_orm for now
ericbuckley Jan 23, 2025
5b56798
print rl service logs when it fails
ericbuckley Jan 23, 2025
39f757c
trying again
ericbuckley Jan 23, 2025
f7de7c8
more debugging
ericbuckley Jan 23, 2025
a6451e8
adding quotes
ericbuckley Jan 23, 2025
49a1e6f
fixing quotes
ericbuckley Jan 23, 2025
deb933c
fixing brackets
ericbuckley Jan 23, 2025
bb25671
Merge branch 'main' into feat/schema-check-on-startup
ericbuckley Jan 24, 2025
5c44d94
removing smoke_tests script for now
ericbuckley Jan 24, 2025
04311ac
adding configuration variables for create_tables and verify_tables on…
ericbuckley Jan 24, 2025
d5e2ead
adding debugging
ericbuckley Jan 24, 2025
0a56121
fixing retry logic
ericbuckley Jan 24, 2025
8b5c0e4
Merge branch 'main' into feat/schema-check-on-startup
ericbuckley Jan 25, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 14 additions & 14 deletions .github/workflows/check_smoke_tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ jobs:

- name: Build RecordLinker Docker Image
run: |
docker build -t rl-service-image .
docker build -t rl-service-image .

- name: Start RecordLinker Service and Run Smoke Tests
run: |
Expand All @@ -67,18 +67,18 @@ jobs:
rl-service-image

# Wait for the RL Service to be healthy
TRIES=5
COUNT=0
TRIES=3
until curl -s http://localhost:8080/ | grep "OK"; do
echo "Waiting for RL Service to become healthy... Attempt $((COUNT+1)) of $TRIES"
echo "Waiting for RL Service to become healthy..."
sleep 5
COUNT=$((COUNT+1))
if [ "$COUNT" -ge "$TRIES" ]; then
TRIES=$((TRIES-1))
if [ "$TRIES" -lt "0" ]; then
echo "RL Service did not become healthy in time."
docker logs rl-service
exit 1
fi
done

# Run smoke tests and print the response
JSON_BODY_1='{"record": {"birth_date": "2053-11-07", "sex": "M", "identifiers":[{"value": "123456789", "type": "MR"}], "name":[{"family":"Shepard", "given":["John"]}]}}'
JSON_BODY_2='{"algorithm": "dibbs-enhanced", "record": {"birth_date": "2000-12-06", "sex": "M", "identifiers":[{"value": "9876543210", "type": "MR"}], "name":[{"family":"Smith", "given":["William"]}]}}'
Expand All @@ -89,7 +89,7 @@ jobs:
-H "Content-Type: application/json")

echo "Response: $RESPONSE_1"
echo "$RESPONSE_1" | jq -e '.prediction == "no_match"'
echo "$RESPONSE_1" | jq -e '.prediction == "no_match"'

PERSON_REFERENCE_ID=$(echo "$RESPONSE_1" | jq -r '.person_reference_id')

Expand All @@ -98,7 +98,7 @@ jobs:
-H "Content-Type: application/json")

echo "Response: $RESPONSE_2"
echo "$RESPONSE_2" | jq -e '.prediction == "match"'
echo "$RESPONSE_2" | jq -e '.prediction == "match"'
echo "$RESPONSE_2" | jq -e --arg id "$PERSON_REFERENCE_ID" '.person_reference_id == $id'

#enhanced tests
Expand All @@ -107,7 +107,7 @@ jobs:
-H "Content-Type: application/json")

echo "Response: $RESPONSE_3"
echo "$RESPONSE_3" | jq -e '.prediction == "no_match"'
echo "$RESPONSE_3" | jq -e '.prediction == "no_match"'

PERSON_REFERENCE_ID=$(echo "$RESPONSE_3" | jq -r '.person_reference_id')

Expand All @@ -116,13 +116,13 @@ jobs:
-H "Content-Type: application/json")

echo "Response: $RESPONSE_4"
echo "$RESPONSE_4" | jq -e '.prediction == "match"'
echo "$RESPONSE_4" | jq -e --arg id "$PERSON_REFERENCE_ID" '.person_reference_id == $id'
echo "$RESPONSE_4" | jq -e '.prediction == "match"'
echo "$RESPONSE_4" | jq -e --arg id "$PERSON_REFERENCE_ID" '.person_reference_id == $id'

#invalid tests
RESPONSE_5=$(curl -s -X POST http://localhost:8080/link \
-d '{"algorithm": "invalid", "record": {}}' \
-H "Content-Type: application/json")

echo "Response: $RESPONSE_5"
echo "$RESPONSE_5" | grep -q "No algorithm found"
echo "$RESPONSE_5" | grep -q "No algorithm found"
14 changes: 7 additions & 7 deletions .github/workflows/check_unit_tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -57,26 +57,26 @@ jobs:
run: |
# Add Microsoft repository keys
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -

# Add the Microsoft SQL Server repository for Ubuntu
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list

# Update package lists
sudo apt-get update

# Accept the EULA and install msodbcsql18
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18 unixodbc-dev

- name: Run Pytest with the matrix database
run: |
if [[ "${{ matrix.database }}" == "postgres" ]]; then
export TEST_DB_URI=postgresql+psycopg2://postgres:pw@localhost:5432/postgres
export TEST_DB_URI="postgresql+psycopg2://postgres:pw@localhost:5432/postgres"
elif [[ "${{ matrix.database }}" == "sqlite" ]]; then
export TEST_DB_URI=sqlite:///testdb.sqlite3
export TEST_DB_URI="sqlite:///testdb.sqlite3"
elif [[ "${{ matrix.database }}" == "mysql" ]]; then
export TEST_DB_URI=mysql+pymysql://root:pw@localhost:3306/mysql
export TEST_DB_URI="mysql+pymysql://root:pw@localhost:3306/mysql"
elif [[ "${{ matrix.database }}" == "sqlserver" ]]; then
export TEST_DB_URI=mssql+pyodbc://sa:YourStrong!Passw0rd@localhost:1433/master?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes
export TEST_DB_URI="mssql+pyodbc://sa:YourStrong!Passw0rd@localhost:1433/master?driver=ODBC+Driver+18+for+SQL+Server&TrustServerCertificate=yes"
fi

pytest --cov=recordlinker tests/unit
Expand Down
16 changes: 16 additions & 0 deletions docs/site/configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,22 @@ Each setting can be configured as follows:

**Development Default**: `""`

`DB_CREATE_TABLES (Optional)`

: Whether to create the database tables on startup, if they do not already exist.

**Docker Default**: `true`

**Development Default**: `true`

`DB_VERIFY_TABLES (Optional)`

: Whether to verify that the database tables match the ORM definitions on startup.

**Docker Default**: `true`

**Development Default**: `true`

`TEST_DB_URI (Optional)`

: The URI for the application database used when running tests.
Expand Down
8 changes: 8 additions & 0 deletions src/recordlinker/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,14 @@ class Settings(pydantic_settings.BaseSettings):
description="The prefix for all database tables",
default="",
)
db_create_tables: bool = pydantic.Field(
description="Whether to create the database tables on startup, if they are missing",
default=True,
)
db_verify_tables: bool = pydantic.Field(
description="Whether to verify that the database tables match the ORM definitions on startup",
default=True,
)
test_db_uri: str = pydantic.Field(
description="The URI for the MPI database to run tests against",
default="sqlite:///testdb.sqlite3",
Expand Down
42 changes: 40 additions & 2 deletions src/recordlinker/database/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,13 +9,15 @@
import typing

from sqlalchemy import create_engine
from sqlalchemy import inspect
from sqlalchemy import orm
from sqlalchemy.exc import SQLAlchemyError

from recordlinker import models
from recordlinker.config import settings


def create_sessionmaker(init_tables: bool = True) -> orm.sessionmaker:
def create_sessionmaker(init_tables: bool = True, verify_tables: bool = True) -> orm.sessionmaker:
"""
Create a new sessionmaker for the database connection.
"""
Expand All @@ -25,11 +27,47 @@
if settings.connection_pool_max_overflow is not None:
kwargs["max_overflow"] = settings.connection_pool_max_overflow
engine = create_engine(settings.db_uri, **kwargs)

if init_tables:
models.Base.metadata.create_all(engine)
if verify_tables:
verify_tables_match_orm(engine)

return orm.sessionmaker(bind=engine)


def verify_tables_match_orm(engine):
"""
Verify that database tables match ORM definitions.
"""
inspector = inspect(engine)

for table_name, orm_table in models.Base.metadata.tables.items():
if not inspector.has_table(table_name):
raise SQLAlchemyError(f"Table '{table_name}' is missing in the database.")

db_columns = {c["name"]: c["type"] for c in inspector.get_columns(table_name)}

for orm_col in orm_table.columns:
if orm_col.name not in db_columns:
raise SQLAlchemyError(
f"Column '{orm_col.name}' is missing in the database for table '{table_name}'."
)

db_col_type = db_columns[orm_col.name]
orm_col_type = orm_col.type
if (
# check if the database column is the same type as the ORM column
not isinstance(db_col_type, type(orm_col_type))
# check if the database column compiles to the same type as the ORM column
and db_col_type.compile() != orm_col_type.compile(engine.dialect)
):
raise SQLAlchemyError(

Check warning on line 65 in src/recordlinker/database/__init__.py

View check run for this annotation

Codecov / codecov/patch

src/recordlinker/database/__init__.py#L65

Added line #L65 was not covered by tests
f"Type mismatch for column '{orm_col.name}' in table '{table_name}': "
f"DB type is {db_col_type}, ORM type is {orm_col_type}."
)


def get_session() -> typing.Iterator[orm.Session]:
"""
Get a new session from the sessionmaker.
Expand Down Expand Up @@ -62,4 +100,4 @@
models.Base.metadata.drop_all(engine)


SessionMaker = create_sessionmaker()
SessionMaker = create_sessionmaker(settings.db_create_tables, settings.db_verify_tables)
59 changes: 59 additions & 0 deletions tests/unit/database/test_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,38 @@

This module contains the unit tests for the recordlinker.database module.
"""

import os
import unittest.mock

import pytest
from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.sql import text

from recordlinker import models
from recordlinker.config import settings
from recordlinker.database import create_sessionmaker
from recordlinker.database import verify_tables_match_orm


@pytest.fixture
def in_memory_engine():
"""
Fixture for an in-memory SQLite engine.
"""
engine = create_engine("sqlite:///:memory:")
models.Base.metadata.create_all(engine) # Create tables as defined by ORM
return engine

class TableForTesting(models.Base):
__tablename__ = "test_table"
id = Column(Integer, primary_key=True)
name = Column(String(255))
age = Column(Integer)

def test_create_sessionmaker():
"""
Expand All @@ -33,3 +59,36 @@ def test_create_sessionmaker():
os.remove("test.db")
except FileNotFoundError:
pass

def test_verify_tables_match_orm_no_mismatch(in_memory_engine):
"""
Test that verify_tables_match_orm passes when the database schema matches the ORM.
"""
models.Base.metadata.create_all(in_memory_engine) # Create tables as defined by ORM

try:
verify_tables_match_orm(in_memory_engine)
except SQLAlchemyError:
pytest.fail("verify_tables_match_orm raised an exception with a matching schema.")

def test_verify_tables_match_orm_missing_column(in_memory_engine):
"""
Test that verify_tables_match_orm raises an error when a column is missing.
"""
# Drop a column from the database schema
with in_memory_engine.connect() as connection:
connection.execute(text("ALTER TABLE test_table DROP COLUMN age"))

with pytest.raises(SQLAlchemyError, match="Column 'age' is missing in the database for table 'test_table'"):
verify_tables_match_orm(in_memory_engine)

def test_verify_tables_match_orm_missing_table(in_memory_engine):
"""
Test that verify_tables_match_orm raises an error when a table is missing.
"""
# Drop the table from the database
with in_memory_engine.connect() as connection:
connection.execute(text("DROP TABLE test_table"))

with pytest.raises(SQLAlchemyError, match="Table 'test_table' is missing in the database."):
verify_tables_match_orm(in_memory_engine)
Loading