A Python application to validate and store published OC4IDS datasets.
- Python 3.12
- Postgres
python -m venv .venv
source .venv/bin/activate
pip install -r requirements_dev.txt
export DATABASE_URL="postgresql://oc4ids_datastore@localhost/oc4ids_datastore"
alembic upgrade head
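Before running the migration, it can help to confirm that `DATABASE_URL` points where you expect. The following is a minimal sketch using only the standard library. The function name `describe_database_url` is hypothetical and not part of this project, and the fallback port 5432 is the usual Postgres default rather than something this project sets.

```python
import os
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a Postgres connection URL into its parts (hypothetical helper)."""
    parts = urlsplit(url)
    # Accept "postgresql" as well as driver-qualified forms like "postgresql+psycopg2".
    if parts.scheme.split("+")[0] not in ("postgresql", "postgres"):
        raise ValueError(f"Not a Postgres URL: scheme is {parts.scheme!r}")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # standard Postgres port when none is given
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    default_url = "postgresql://oc4ids_datastore@localhost/oc4ids_datastore"
    print(describe_database_url(os.environ.get("DATABASE_URL", default_url)))
```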
If enabled, the pipeline will upload the files to a DigitalOcean Spaces bucket.
First create the bucket with DigitalOcean.
If doing this via the UI, take the following steps:
- Choose any region
- Enable CDN
- Choose any bucket name
- Click "Create a Spaces Bucket"
After the bucket is created, create an access key in DigitalOcean.
If doing this via the UI, take the following steps:
- Go to your bucket
- Go to settings
- Under "Access Keys" click "Create Access Key"
- Set the access scope to "Limited Access"
- Select your bucket from the list and set "Permissions" to "Read/Write/Delete"
- Choose any name
- Click "Create Access Key"
Securely store the access key ID and secret.
Once you have created the bucket and access key, set the following environment variables for the pipeline:
- ENABLE_UPLOAD: 1 to enable, 0 to disable
- BUCKET_REGION: e.g. fra1
- BUCKET_NAME: e.g. my-bucket
- BUCKET_ACCESS_KEY_ID: e.g. access-key-id
- BUCKET_ACCESS_KEY_SECRET: e.g. access-key-secret
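As an illustration of how these variables fit together, here is a minimal sketch that reads them into a config object. `BucketConfig` and `read_bucket_config` are hypothetical names, not the pipeline's actual code. The endpoint format follows DigitalOcean's `<region>.digitaloceanspaces.com` convention for Spaces.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BucketConfig:
    """Upload settings collected from the environment (hypothetical type)."""

    region: str
    name: str
    access_key_id: str
    access_key_secret: str

    @property
    def endpoint_url(self) -> str:
        # Spaces exposes an S3-compatible endpoint per region.
        return f"https://{self.region}.digitaloceanspaces.com"


def read_bucket_config(env: Mapping[str, str] = os.environ) -> Optional[BucketConfig]:
    """Return the bucket settings, or None when uploads are disabled."""
    if env.get("ENABLE_UPLOAD", "0") != "1":
        return None
    required = [
        "BUCKET_REGION",
        "BUCKET_NAME",
        "BUCKET_ACCESS_KEY_ID",
        "BUCKET_ACCESS_KEY_SECRET",
    ]
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return BucketConfig(
        region=env["BUCKET_REGION"],
        name=env["BUCKET_NAME"],
        access_key_id=env["BUCKET_ACCESS_KEY_ID"],
        access_key_secret=env["BUCKET_ACCESS_KEY_SECRET"],
    )
```

Failing early on a missing variable, as sketched here, surfaces configuration mistakes before any upload is attempted.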
To make this easier, the project uses python-dotenv to load environment variables from a config file.
For local development, create a file called .env.local, which will be used by default.
You can change which file is loaded by setting the environment variable APP_ENV.
For example, the tests set APP_ENV=test, which loads variables from .env.test.
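The file selection described above could work roughly as follows. This is a sketch of the idea, not the project's actual loading code: `env_file_name` and `load_env_file` are hypothetical names, and the `.env.<APP_ENV>` pattern is inferred from the two examples given (`.env.local` and `.env.test`).

```python
import os
from typing import Mapping


def env_file_name(env: Mapping[str, str] = os.environ) -> str:
    """Pick the config file for the current APP_ENV, defaulting to 'local'."""
    app_env = env.get("APP_ENV", "local")
    return f".env.{app_env}"


def load_env_file() -> bool:
    """Load the selected file into os.environ; returns False if nothing was loaded."""
    # Imported here so the helper above works without python-dotenv installed.
    from dotenv import load_dotenv

    return load_dotenv(env_file_name())
```

By default `load_dotenv` does not override variables that are already set, so values exported in the shell would take precedence over the file.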
To send failure notifications by email, the following environment variables must be set:
- NOTIFICATIONS_ENABLED: 1 to enable, 0 to disable
- NOTIFICATIONS_SMTP_HOST
- NOTIFICATIONS_SMTP_PORT
- NOTIFICATIONS_SMTP_SSL_ENABLED: 1 to enable, 0 to disable
- NOTIFICATIONS_SENDER_EMAIL
- NOTIFICATIONS_RECEIVER_EMAIL
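To show how these settings might be consumed, here is a minimal sketch built on the standard library's `smtplib`. The function names are hypothetical and the pipeline's real notification code may differ. The sketch assumes the SMTP server needs no login, since no credential variables are listed above.

```python
import os
import smtplib
from email.message import EmailMessage
from typing import Mapping


def build_failure_message(env: Mapping[str, str], subject: str, body: str) -> EmailMessage:
    """Compose a plain-text email from the configured sender to the receiver."""
    message = EmailMessage()
    message["From"] = env["NOTIFICATIONS_SENDER_EMAIL"]
    message["To"] = env["NOTIFICATIONS_RECEIVER_EMAIL"]
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_failure_notification(
    subject: str, body: str, env: Mapping[str, str] = os.environ
) -> bool:
    """Send the email if notifications are enabled; return whether it was sent."""
    if env.get("NOTIFICATIONS_ENABLED", "0") != "1":
        return False
    host = env["NOTIFICATIONS_SMTP_HOST"]
    port = int(env["NOTIFICATIONS_SMTP_PORT"])
    use_ssl = env.get("NOTIFICATIONS_SMTP_SSL_ENABLED", "0") == "1"
    # SMTP_SSL wraps the connection in TLS from the start; SMTP connects in plain text.
    smtp_class = smtplib.SMTP_SSL if use_ssl else smtplib.SMTP
    with smtp_class(host, port) as server:
        server.send_message(build_failure_message(env, subject, body))
    return True
```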
pip install -e .
oc4ids-datastore-pipeline
black oc4ids_datastore_pipeline/ tests/
isort oc4ids_datastore_pipeline/ tests/
flake8 oc4ids_datastore_pipeline/ tests/
mypy oc4ids_datastore_pipeline/ tests/
pytest
alembic revision --autogenerate -m "<MESSAGE HERE>"
To publish a new version, raise a PR to main updating the version in pyproject.toml. Once merged, create a git tag and GitHub release for the new version, named vX.Y.Z. This will trigger a Docker image to be built and pushed, tagged with the version and latest.
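A small check like the one below can catch a mistyped tag before the release is created. It is only a sketch: the helper names are hypothetical, and it assumes X, Y and Z are plain integers with no pre-release suffix.

```python
import re

# Matches tags of the form vX.Y.Z, e.g. v1.2.3 (assumed: digits only, no suffix).
TAG_PATTERN = re.compile(r"^v\d+\.\d+\.\d+$")


def is_release_tag(tag: str) -> bool:
    """Return True when the tag follows the vX.Y.Z naming scheme."""
    return TAG_PATTERN.match(tag) is not None


def expected_tag(version: str) -> str:
    """Build the tag name for a version string taken from pyproject.toml."""
    return f"v{version}"
```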