Warning
It is important to structure your data science project based on a certain standard so that your teammates can easily maintain and modify your project.
This repository provides a template that incorporates best practices for creating a maintainable and reproducible data science project.
- Poetry: Dependency manager
- Hydra: Configuration file manager
- pre-commit plugins: Automate code review and formatting
- DVC: Data version control
- mkdocs-material: Automatically create API documentation for your project
.
├── config                      # Project configuration
│   ├── main.yaml               # Main configuration file
│   ├── model                   # Model settings
│   │   ├── model1.yaml         # Model 1 configuration
│   │   └── model2.yaml         # Model 2 configuration
│   └── process                 # Process settings
│       ├── process1.yaml       # Process 1 configuration
│       └── process2.yaml       # Process 2 configuration
├── data                        # Project data
│   ├── external                # External data
│   ├── final                   # Final data ready for analysis
│   ├── processed               # Processed data
│   ├── raw                     # Raw unprocessed data
│   └── raw.dvc                 # DVC file to manage raw data
├── docs                        # Project documentation
│   ├── assets                  # Documentation resources
│   │   └── img                 # Images for documentation
│   └── index.md                # Documentation index
├── Makefile                    # Task automation script
├── mkdocs.yml                  # MkDocs configuration for documentation
├── models                      # Trained models
├── notebooks                   # Jupyter notebooks for analysis and development
├── pyproject.toml              # Project configuration in TOML format
├── README.md                   # README file with the project description
├── references                  # References and additional resources
├── reports                     # Reports generated from the analysis
│   └── figures                 # Figures and graphs from the reports
├── src                         # Project source code
│   ├── __init__.py             # Initialization of the src package
│   ├── process.py              # Script for data processing
│   └── train_model.py          # Script for training models
└── tests                       # Tests for the source code
    ├── __init__.py             # Initialization of the tests package
    ├── test_process.py         # Tests for the processing script
    └── test_train_model.py     # Tests for the model training script
Install Cookiecutter:
pip install cookiecutter
Create a project based on the template:
cookiecutter https://github.com/SENTUstudio/cookiecutter-data-science --checkout main
This Python script is a command-line interface (CLI) tool for managing various tasks in a data science project. The script uses the `typer` library to handle command-line arguments, providing an easy-to-use interface for initializing a Git repository, managing a Conda environment, running tests, building documentation, and handling PostgreSQL databases with Docker Compose.
The script requires the following Python libraries:
- `os`: For interacting with the operating system.
- `subprocess`: For running shell commands.
- `sys`: For handling command-line arguments and system exit codes.
- `typer`: For creating the CLI application.
- `PROJECT_DIR`: The absolute path to the project directory. This is determined using the `__file__` attribute and `os.path.dirname`.
- `CONDA_ENV`: The path to the Conda environment within the project directory. It is constructed by appending `"env"` to the `PROJECT_DIR` path.
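The two constants can be derived along the following lines. This is a minimal sketch, not the script's actual source: the `project_paths` helper is a hypothetical name introduced for the example, and the use of `os.path.abspath` is an assumption about how the absolute path is obtained.

```python
import os


def project_paths(script_path: str) -> tuple[str, str]:
    """Return (PROJECT_DIR, CONDA_ENV) for the script located at script_path."""
    # Absolute path of the directory that contains the script.
    project_dir = os.path.dirname(os.path.abspath(script_path))
    # The Conda environment is expected in an "env" subdirectory of the project.
    conda_env = os.path.join(project_dir, "env")
    return project_dir, conda_env


# Inside the script itself, this would typically be called with __file__:
# PROJECT_DIR, CONDA_ENV = project_paths(__file__)
```

Keeping the environment inside the project directory means each project carries its own isolated environment, at the cost of activating it by path rather than by name.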
graph TD;
run_command["run_command(command, capture_output, use_conda)"]
run_command -->|Uses| CONDA_ENV["CONDA_ENV"]
The `run_command` function executes shell commands from within a Python script. It can optionally capture the output of the command and run the command within a specified Conda environment.
def run_command(command: str, capture_output: bool = False, use_conda: bool = False):
- `command` (str): The shell command to be executed. This should be a valid command string that can be run in a terminal or shell.
- `capture_output` (bool, optional): A flag that indicates whether the function should capture and return the standard output of the command.
    - If `True`, the command's output is captured and returned as a string.
    - If `False`, the output is displayed directly in the terminal, and nothing is returned.
    - Default value: `False`.
- `use_conda` (bool, optional): A flag that determines whether the command should be executed within a Conda environment.
    - If `True`, the command is prefixed with the necessary shell commands to activate the Conda environment specified by the global variable `CONDA_ENV`.
    - If `False`, the command is executed in the current shell environment.
    - Default value: `False`.
- `str`: If `capture_output` is `True`, the function returns the captured standard output of the command as a decoded string. If `capture_output` is `False`, the function does not return anything.
- `subprocess.CalledProcessError`: This exception is raised if the command returns a non-zero exit code (indicating failure). In this case:
    - The function catches the exception.
    - The standard error output of the command (if available) is decoded and printed. If decoding fails or if there is no error output, a generic error message is printed.
    - The function then exits the script with the same return code as the failed command by calling `sys.exit(e.returncode)`.
- If the `use_conda` flag is `True`, the function modifies the `command` string to include the necessary commands for activating the Conda environment.
- The command is prepended with `eval "$(conda shell.bash hook)" && conda activate "{CONDA_ENV}" &&`, which:
    - Initializes the Conda shell environment.
    - Activates the Conda environment specified by the global variable `CONDA_ENV`.
    - Executes the provided command within this activated environment.
- The command is executed using the `subprocess.run()` function with the following parameters:
    - `shell=True`: Allows the command to be executed in the shell, enabling shell features like pipelines and file redirection.
    - `check=True`: Ensures that a `CalledProcessError` is raised if the command exits with a non-zero status code.
    - `capture_output`: If set to `True`, the command's standard output is captured.
- If `capture_output` is `True`, the captured output is decoded to a string and returned.
- If the command fails (i.e., returns a non-zero exit code), the function catches the `subprocess.CalledProcessError` exception.
- The function then attempts to decode and print the command's standard error output (`e.stderr`). If decoding fails or if no error output is available, a generic error message is printed instead.
- Finally, the script exits with the return code of the failed command to propagate the error status.
run_command("ls -l")
- This example runs the `ls -l` command, listing the contents of the current directory with details. The output is printed directly to the terminal.
output = run_command("echo 'Hello, World!'", capture_output=True)
print(output) # Output: Hello, World!
- This example runs the `echo 'Hello, World!'` command. The output is captured by the function and printed to the console.
run_command("python script.py", use_conda=True)
- This example runs `python script.py` within the Conda environment specified by `CONDA_ENV`.
run_command("exit 1")
- This command forces an exit with a status code of `1`. The function catches the resulting `CalledProcessError`, prints the error message, and exits the script with the same status code.
The `install_pre_commit` function automates the installation of `pre-commit` hooks in a project. `pre-commit` is a framework for managing and maintaining multi-language pre-commit hooks, which are scripts that run automatically before a commit is made in a version control system like Git. This function ensures that the `pre-commit` hooks are installed and activated in the project's Git repository.
def install_pre_commit():
- The function begins by printing a message to the console: `"Installing pre-commit..."`. This informs the user that the process of installing `pre-commit` hooks is starting.
- The function calls the `run_command` function with the command `"pre-commit install"` to install the pre-commit hooks.
- The command is executed in the shell, and the `use_conda=True` argument ensures that the command is run within the Conda environment specified by the global variable `CONDA_ENV`.
- `pre-commit install`:
    - This command installs a hook script into the repository's `.git/hooks` directory so that Git runs `pre-commit` before each commit. The hooks themselves are defined in the project's `.pre-commit-config.yaml` file.
    - Once installed, these hooks will automatically run specified checks and validations (e.g., code formatting, linting, security checks) before any commit is made.
install_pre_commit()
- This will install `pre-commit` hooks in the Git repository, ensuring that the predefined checks are enforced every time a commit is made. The installation command runs within the project's Conda environment.
- `run_command`:
    - The `install_pre_commit` function relies on the `run_command` function to execute the shell command. It is important that the `run_command` function is correctly defined in the same script or module, as it handles the execution of the shell command and the activation of the Conda environment.
- `pre-commit`:
    - The `pre-commit` framework must be available and installed in the project's environment for this function to work. It is typically included as a dependency in the `environment.yml` or `requirements.txt` file of the project.
- `pre-commit`: `pre-commit` is a tool that helps manage and maintain hooks for Git repositories. More information can be found at the official pre-commit documentation.
- Conda Environment:
    - The function is designed to run the installation command within a Conda environment. The specific environment is determined by the global `CONDA_ENV` variable, which should point to the environment where `pre-commit` is installed.
Note
- The `install_pre_commit` function is typically called during the setup or initialization phase of a project to ensure that all necessary hooks are in place from the beginning. This helps maintain code quality and enforce project-specific guidelines throughout the development process.
graph TD;
init["init()"]
init -->|Calls| show_logo["show_logo()"]
init -->|Calls| run_command["run_command()"]
init -->|Calls| install_pre_commit["install_pre_commit()"]
install_pre_commit -->|Calls| run_command
The `init` function is a command-line interface (CLI) command that automates the initialization of a new project repository. It performs the following tasks:
- Initializes a Git repository in the project directory.
- Renames the default Git branch to `main`.
- Creates and configures a Conda environment based on the `environment.yml` file.
- Installs `pre-commit` hooks to enforce code quality standards.
This function is typically used at the beginning of a project to set up the development environment and version control system.
@app.command()
def init():
- `@app.command()`: This decorator is part of the `typer` library and designates the `init` function as a command within a CLI application. Users can invoke this command via the command line, making it an integral part of the project's setup process.
- The function starts by calling `show_logo()`, which displays an ASCII logo or banner representing the project. This provides a visual identifier for the project during the setup process, making it clear to the user that the initialization is starting.
- The function prints `"Initializing Git..."` to inform the user that Git initialization is in progress.
- It then runs the command `git init` using the `run_command` function to initialize a new Git repository in the current project directory.
- The function also renames the default branch from `master` to `main` by executing `git branch -m main`. This reflects modern Git practices, where `main` is commonly used as the default branch name.
- The function prints `"Installing dependencies..."` to indicate the start of the environment setup process.
- It executes the command `conda env create --prefix {CONDA_ENV} --file environment.yml` using the `run_command` function to create a Conda environment.
    - `--prefix {CONDA_ENV}` specifies the directory where the Conda environment will be created, as defined by the global variable `CONDA_ENV`.
    - `--file environment.yml` tells Conda to use the `environment.yml` file to install the required packages and dependencies.
- After the Conda environment is created, the function calls `install_pre_commit()` to set up `pre-commit` hooks. These hooks will automatically run checks (e.g., linting, code formatting) before any commit is made, helping to maintain code quality and consistency throughout the project.
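The order of these steps can be sketched as below. This is a simplified illustration, not the script's source: the `typer` decorator and the `show_logo()` call are left out, and the `run` and `conda_env` parameters are assumptions added so the sketch can be exercised without Git or Conda installed. In the real script, `run` would be `run_command` and `conda_env` would be `CONDA_ENV`.

```python
from typing import Callable


def init(run: Callable[..., object], conda_env: str) -> None:
    """Issue the initialization commands in the documented order."""
    print("Initializing Git...")
    run("git init")
    run("git branch -m main")

    print("Installing dependencies...")
    run(f"conda env create --prefix {conda_env} --file environment.yml")

    # pre-commit lives inside the new environment, so it needs Conda activation.
    print("Installing pre-commit...")
    run("pre-commit install", use_conda=True)
```

Only the last step passes `use_conda=True`, since the environment does not exist until the `conda env create` step has finished.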
To initialize a new project, a user would execute the following command in the terminal:
python manage.py init
- This command will:
    - Display the project logo.
    - Initialize a Git repository and rename the default branch to `main`.
    - Create a Conda environment and install all dependencies listed in `environment.yml`.
    - Install `pre-commit` hooks.
- `show_logo()`:
    - Displays the project logo at the beginning of the initialization process. This function must be defined elsewhere in the script or module.
- `run_command()`:
    - A utility function that executes shell commands. It is used throughout the `init` function to run Git and Conda commands.
- `install_pre_commit()`:
    - Installs `pre-commit` hooks after the Conda environment has been set up. This function must be defined in the same script or module.
- `CONDA_ENV`:
    - A global variable that specifies the path to the Conda environment directory. It should be set before this function is called.
- Git:
    - Git is a distributed version control system used to track changes in source code. This function initializes a Git repository, which is essential for managing project versioning.
- Conda:
    - Conda is an open-source package management and environment management system. It helps manage project dependencies and environments, ensuring consistency across different development setups.
- Pre-commit: `pre-commit` is a framework for managing Git pre-commit hooks. These hooks run checks before code is committed, helping to enforce code quality standards.
Note
- This function is intended to be part of the initial project setup process. Running it multiple times might result in warnings or errors if the Git repository or Conda environment already exists.
- The `init` function assumes that the `environment.yml` file is correctly configured and present in the project directory. Any issues with this file (e.g., missing dependencies) will cause the environment setup to fail.
graph TD;
env["env()"]
env -->|Calls| show_logo["show_logo()"]
env -->|Calls| print_activate_command["print_activate_command()"]
The `env` function is a command-line interface (CLI) command that displays the command needed to activate the project's Conda environment. This guides users in setting up their development environment after the Conda environment has been created.
@app.command()
def env():
- `@app.command()`: This decorator, provided by the `typer` library, marks the `env` function as a command within the CLI application. Users can execute this command from the terminal to get instructions on how to activate the Conda environment.
- The function begins by calling `show_logo()`, which displays an ASCII logo or banner representing the project. This is a visual element that helps users identify the project and provides a consistent user experience when using the CLI.
- The function then calls `print_activate_command()`, which prints the specific command that the user needs to run in their terminal to activate the Conda environment.
- This activation command is essential for users to ensure that they are working within the correct Conda environment, which contains all the dependencies and configurations specified for the project.
To display the Conda environment activation command, a user would run the following command in the terminal:
python manage.py env
- This command will:
- Display the project logo.
- Print the command needed to activate the Conda environment, guiding the user on how to proceed with setting up their development environment.
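The exact text printed by `print_activate_command()` is not shown in this document. A plausible minimal sketch, assuming the environment was created with `--prefix` and is therefore activated by its path, might look like the following. The `activate_command` helper, the `conda_env` parameter, and the message wording are assumptions made for the example.

```python
def activate_command(conda_env: str) -> str:
    """Return the shell command that activates a prefix-based Conda environment."""
    return f'conda activate "{conda_env}"'


def print_activate_command(conda_env: str) -> None:
    """Tell the user how to activate the environment in their own shell."""
    print("To activate the environment, run:")
    print(f"    {activate_command(conda_env)}")
```

The command is printed rather than executed because a child process cannot change the environment of the shell that launched it.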
- `show_logo()`:
    - A function that displays the project logo or banner. It must be defined elsewhere in the script or module. This function is called to provide a consistent and recognizable visual output for the user.
- `print_activate_command()`:
    - A function that prints the command necessary to activate the Conda environment. This function must also be defined in the script or module. It is crucial for guiding users on how to activate the Conda environment that contains all the project's dependencies.
- Conda:
    - Conda is an open-source package management and environment management system. It helps manage project dependencies and environments, ensuring consistency across different development setups.
- Environment Activation:
    - Activating a Conda environment is a key step in ensuring that the correct dependencies are used when developing or running a project. The environment typically contains all the libraries and tools needed for the project, as specified in an `environment.yml` file.
Note
- The `env` function does not create or modify the Conda environment; it simply provides instructions on how to activate an existing environment.
- It is assumed that the user has already created the Conda environment using a command like `conda env create`, and the `CONDA_ENV` variable correctly points to this environment.
- Running the activation command printed by this function ensures that the user is working within the appropriate environment, which is necessary for avoiding dependency conflicts and ensuring the project runs smoothly.
graph TD;
tests["tests()"]
tests -->|Calls| show_logo["show_logo()"]
tests -->|Calls| run_command["run_command()"]
The `tests` function is a command-line interface (CLI) command that automates running the project's tests using `pytest`. This function ensures that tests are executed within the correct Conda environment, which contains all the necessary dependencies for the testing process.
@app.command()
def tests():
- `@app.command()`: This decorator, provided by the `typer` library, registers the `tests` function as a command in the CLI application. Users can invoke this command via the command line to run the project's test suite.
- The function begins by calling `show_logo()`, which displays an ASCII logo or banner representing the project. This provides a consistent and recognizable visual element to the user, indicating that the test execution process is starting.
- The function prints `"Running tests..."` to the console. This message informs the user that the test suite is about to be executed.
- The function calls `run_command("pytest", use_conda=True)` to execute the `pytest` testing framework within the Conda environment specified by the global variable `CONDA_ENV`.
- `pytest` is a powerful testing framework for Python that simplifies the process of writing and running tests. It can discover and execute tests automatically based on a set of naming conventions.
- The `use_conda=True` argument ensures that the `pytest` command is run within the appropriate Conda environment, where all the project's dependencies are installed.
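As a rough illustration of those naming conventions, the sketch below shows what a file such as `tests/test_process.py` might contain. The `drop_missing` function is a hypothetical stand-in for real processing code and is not part of the template. With pytest's default settings, the two test functions are collected because the file name and the function names start with `test_`.

```python
# tests/test_process.py (hypothetical content; the template's real tests may differ)


def drop_missing(rows: list[dict]) -> list[dict]:
    """Hypothetical processing step: keep only rows that contain no None values."""
    return [row for row in rows if None not in row.values()]


def test_drop_missing_removes_incomplete_rows():
    rows = [{"x": 1, "y": 2}, {"x": None, "y": 3}]
    assert drop_missing(rows) == [{"x": 1, "y": 2}]


def test_drop_missing_keeps_complete_rows():
    rows = [{"x": 1}, {"x": 2}]
    assert drop_missing(rows) == rows
```

In a real project the function under test would be imported from the `src` package instead of being defined in the test file.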
To run the project's test suite, a user would execute the following command in the terminal:
python manage.py tests
- This command will:
- Display the project logo.
- Print a message indicating that the tests are being run.
- Execute the `pytest` command within the Conda environment to run the project's test suite.
- `show_logo()`:
    - A function that displays the project logo or banner. It must be defined elsewhere in the script or module. This function is called at the beginning of the `tests` function to provide a consistent and recognizable visual output for the user.
- `run_command()`:
    - A utility function used to execute shell commands. The `run_command` function is responsible for running the `pytest` command within the Conda environment. It must be defined in the same script or module.
- `pytest`: `pytest` is a testing framework for Python that allows for easy writing and execution of test cases. The `pytest` command must be available in the Conda environment for the tests to run successfully.
- Conda:
    - Conda is an open-source package management and environment management system. The function uses Conda to ensure that the tests are run in an environment where all dependencies are correctly installed.
- Pytest: `pytest` is a widely used testing framework for Python. It simplifies the process of writing and executing tests, supporting various testing needs such as unit testing, functional testing, and more.
Note
- The `tests` function assumes that the Conda environment is already created and that `pytest` is installed within that environment.
- Running tests within the correct environment helps to avoid issues related to missing dependencies or version conflicts, ensuring that the tests reflect the true state of the project.
- This function is typically used as part of a continuous integration (CI) pipeline or during the development process to verify that the code works as expected.
graph TD;
docs["docs()"]
docs -->|Calls| show_logo["show_logo()"]
docs -->|Calls| run_command["run_command()"]
The `docs` function is a command-line interface (CLI) command that automates building and serving project documentation using MkDocs. MkDocs is a static site generator geared towards project documentation, making it easy to write and maintain documentation in Markdown. This function ensures that the documentation is built and served within the correct Conda environment, where all dependencies are properly managed.
@app.command()
def docs():
- `@app.command()`: This decorator, provided by the `typer` library, registers the `docs` function as a command within the CLI application. Users can run this command from the terminal to build and serve the project's documentation.
- The function starts by calling `show_logo()`, which displays an ASCII logo or banner representing the project. This visual element gives a consistent and professional appearance to the user, indicating that the documentation process is starting.
- The function prints `"Building documentation cache..."` to the console. This message informs the user that the documentation build process is beginning.
- The function then calls `run_command("mkdocs build", use_conda=True)` to build the documentation using MkDocs.
- `mkdocs build`:
    - This command generates the static site for the documentation by processing the Markdown files and applying the configured theme and structure.
    - The `use_conda=True` argument ensures that this command is executed within the Conda environment specified by the global variable `CONDA_ENV`, where MkDocs and its dependencies are installed.
- After building the documentation, the function prints `"Serving documentation..."` to inform the user that the documentation is being made available for local viewing.
- It then calls `run_command("mkdocs serve", use_conda=True)` to serve the documentation locally.
- `mkdocs serve`:
    - This command starts a local web server, making the documentation accessible via a web browser at `http://localhost:8000` by default.
    - The documentation is automatically rebuilt and refreshed in the browser when changes are detected in the source files, which is particularly useful during the writing and editing process.
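The build-then-serve sequence can be sketched as follows. This is a simplified illustration rather than the script's source: the `typer` decorator and `show_logo()` are omitted, and the `run` parameter is an assumption that stands in for `run_command` so the flow can be checked without MkDocs installed.

```python
from typing import Callable


def docs(run: Callable[..., object]) -> None:
    """Build the static site, then serve it locally."""
    print("Building documentation cache...")
    run("mkdocs build", use_conda=True)

    print("Serving documentation...")
    # The development server keeps running until it is interrupted (e.g. Ctrl+C),
    # so this call does not return while the site is being served.
    run("mkdocs serve", use_conda=True)
```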
To build and serve the project's documentation, a user would execute the following command in the terminal:
python manage.py docs
- This command will:
- Display the project logo.
- Print a message indicating that the documentation is being built.
- Build the static site for the documentation using MkDocs.
- Print a message indicating that the documentation is being served locally.
- Serve the documentation locally, making it accessible through a web browser.
- `show_logo()`:
    - A function that displays the project logo or banner. It must be defined elsewhere in the script or module. This function is called at the beginning of the `docs` function to provide a consistent and recognizable visual output for the user.
- `run_command()`:
    - A utility function used to execute shell commands. The `run_command` function is responsible for running the `mkdocs build` and `mkdocs serve` commands within the appropriate Conda environment. It must be defined in the same script or module.
- `MkDocs`:
    - MkDocs is a static site generator specifically designed for project documentation. The `mkdocs build` and `mkdocs serve` commands must be available in the Conda environment for this function to work properly.
- Conda:
    - Conda is an open-source package management and environment management system. The `docs` function uses Conda to ensure that the documentation is built and served in an environment where all necessary dependencies are installed.
- MkDocs:
    - MkDocs is a static site generator that's designed for building project documentation from Markdown files. It is easy to configure and supports themes and plugins that can be used to customize the appearance and functionality of the documentation.
Note
- The `docs` function assumes that the Conda environment has already been created and that MkDocs is installed within that environment.
- The function is particularly useful during the documentation development process, allowing for continuous previewing of changes in real time.
- It is recommended to run this function during or after significant updates to the documentation to ensure that all changes are correctly built and reflected in the served site.
graph TD;
db_up["db_up()"]
db_up -->|Calls| show_logo["show_logo()"]
db_up -->|Calls| run_command["run_command()"]
The `db_up` function is a command-line interface (CLI) command that starts a PostgreSQL database using Docker Compose. Docker Compose is a tool that allows users to define and manage multi-container Docker applications, including databases, through simple configuration files. This function automates the process of bringing up the PostgreSQL database container, making it easier to start the database as part of the project setup or development workflow.
@app.command()
def db_up():
- `@app.command()`: This decorator is provided by the `typer` library and registers the `db_up` function as a command within the CLI application. Users can invoke this command from the terminal to start the PostgreSQL database using Docker Compose.
- The function begins by calling `show_logo()`, which displays an ASCII logo or banner representing the project. This provides a consistent and recognizable visual element to the user, indicating that the process of starting the database is beginning.
- The function prints `"Starting PostgreSQL database with Docker Compose..."` to the console. This message informs the user that the PostgreSQL database is about to be started using Docker Compose.
- The function calls `run_command("docker compose up -d")` to start the PostgreSQL database container.
- `docker compose up -d`:
    - This command starts the services defined in the Docker Compose file (`docker-compose.yml`) in detached mode (i.e., running in the background).
    - The PostgreSQL service, as defined in the `docker-compose.yml` file, will be started, initializing the database and making it available for use by the application or developers.
    - Running the command in detached mode allows the terminal to be freed up for other tasks while the database runs in the background.
To start the PostgreSQL database using Docker Compose, a user would execute the following command in the terminal:
python manage.py db_up
- This command will:
- Display the project logo.
- Print a message indicating that the PostgreSQL database is being started.
- Execute the `docker compose up -d` command to start the PostgreSQL database container in the background.
- `show_logo()`:
    - A function that displays the project logo or banner. It must be defined elsewhere in the script or module. This function is called at the beginning of the `db_up` function to provide a consistent and recognizable visual output for the user.
- `run_command()`:
    - A utility function used to execute shell commands. The `run_command` function is responsible for running the `docker compose up -d` command. It must be defined in the same script or module.
- `Docker` and `Docker Compose`:
    - Docker is a platform that enables the creation, deployment, and management of containerized applications. Docker Compose is a tool that defines and runs multi-container Docker applications. The `db_up` function relies on these tools to start the PostgreSQL database.
- Docker:
    - Docker is a tool designed to make it easier to create, deploy, and run applications by using containers. Containers allow developers to package an application with all its dependencies and ship it as a single unit.
- Docker Compose:
    - Docker Compose is a tool for defining and running multi-container Docker applications. With Compose, you can use a YAML file to configure your application's services, networks, and volumes.
- PostgreSQL:
    - PostgreSQL is an open-source relational database management system (RDBMS) known for its robustness, scalability, and standards compliance. It is often used as the database backend for web applications and other software projects.
Note
- The `db_up` function assumes that a valid `docker-compose.yml` file is present in the project directory and that it defines a PostgreSQL service.
- Docker and Docker Compose must be installed and properly configured on the host machine for this function to work.
- This function is particularly useful during development, allowing developers to quickly spin up a database instance without needing to manage the database manually.
graph TD;
db_down["db_down()"]
db_down -->|Calls| show_logo["show_logo()"]
db_down -->|Calls| run_command["run_command()"]
The `db_down` function is a command-line interface (CLI) command that stops and removes the PostgreSQL database container using Docker Compose. Docker Compose is a tool that allows users to manage multi-container Docker applications, including databases. This function automates the process of shutting down the PostgreSQL database, ensuring that all associated containers are properly stopped and cleaned up.
@app.command()
def db_down():
- `@app.command()`: This decorator, provided by the `typer` library, registers the `db_down` function as a command within the CLI application. Users can invoke this command from the terminal to stop the PostgreSQL database using Docker Compose.
- The function begins by calling `show_logo()`, which displays an ASCII logo or banner representing the project. This provides a consistent and recognizable visual element to the user, indicating that the process of stopping the database is beginning.
- The function prints `"Stopping PostgreSQL database..."` to the console. This message informs the user that the PostgreSQL database is about to be stopped using Docker Compose.
- The function calls
run_command("docker compose down")
to stop and remove the PostgreSQL database container.docker compose down
:- This command stops and removes all the containers, networks, and volumes associated with the services defined in the
docker-compose.yml
file. - Specifically, for the PostgreSQL service, this command stops the running database container and removes it, freeing up resources on the host machine.
- This command ensures that the database and any related services are completely shut down and cleaned up.
- This command stops and removes all the containers, networks, and volumes associated with the services defined in the
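The steps above can be sketched as follows. This is a minimal illustration, not the script's actual source: the bodies of `show_logo` and `run_command` are stand-ins, the `runner` parameter is a hypothetical addition that lets the function be exercised without Docker, and the `@app.command()` decorator is left out so the sketch needs only the standard library.

```python
import subprocess


def show_logo() -> None:
    """Stand-in for the project's banner function (the real one prints ASCII art)."""
    print("[project logo]")


def run_command(command: str) -> None:
    """Stand-in helper: run a shell command (the real helper also reports failures)."""
    subprocess.run(command, shell=True)


def db_down(runner=run_command) -> None:
    """Stop the PostgreSQL database.

    In the real script this function is registered with @app.command();
    `runner` is a hypothetical parameter added for this sketch only.
    """
    show_logo()                               # 1. banner
    print("Stopping PostgreSQL database...")  # 2. status message
    runner("docker compose down")             # 3. stop and remove the containers
```

Calling `db_down(runner=print)` prints the three steps without touching Docker, which is a convenient way to check the order of operations.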
To stop the PostgreSQL database using Docker Compose, a user would execute the following command in the terminal. Typer exposes the `db_down` function under the command name `db-down`, as listed in the script's help output:
python manage.py db-down
- This command will:
  - Display the project logo.
  - Print a message indicating that the PostgreSQL database is being stopped.
  - Execute the `docker compose down` command to stop and remove the PostgreSQL database container and associated resources.
- `show_logo()`:
  - A function that displays the project logo or banner. It must be defined elsewhere in the script or module. This function is called at the beginning of the `db_down` function to provide a consistent and recognizable visual output for the user.
- `run_command()`:
  - A utility function used to execute shell commands. The `run_command` function is responsible for running the `docker compose down` command. It must be defined in the same script or module.
- Docker and Docker Compose:
  - Docker is a platform that enables the creation, deployment, and management of containerized applications. Docker Compose is a tool that defines and manages multi-container Docker applications. The `db_down` function relies on these tools to stop the PostgreSQL database and clean up the associated resources.
- Docker:
  - Docker is a tool designed to make it easier to create, deploy, and run applications by using containers. Containers package an application with all its dependencies into a standardized unit of software.
- Docker Compose:
  - Docker Compose is a tool for defining and running multi-container Docker applications. It allows users to manage their application's services, networks, and volumes through a simple configuration file.
- PostgreSQL:
  - PostgreSQL is an open-source relational database management system (RDBMS) known for its robustness, scalability, and standards compliance. It is often used as the database backend for web applications and other software projects.
Note
- The `db_down` function assumes that the PostgreSQL database was started using Docker Compose and that a valid `docker-compose.yml` file is present in the project directory.
- Docker and Docker Compose must be installed and properly configured on the host machine for this function to work.
- This function is useful when you need to stop the database to free up system resources or prepare the environment for other tasks, such as cleaning up before deployment or restarting services.
graph TD;
show_logo["show_logo()"]
show_logo -->|Prints| logo["ASCII Art Logo"]
The `show_logo` function displays an ASCII art logo representing the project. The logo serves as a visual identifier, giving users a consistent and recognizable element whenever the function is called. It is typically used at the start of CLI commands to create a branded user experience and to indicate that the project-specific script is running.
def show_logo():
#### Detailed Description
- The function defines a multi-line string named `logo` that contains the ASCII art.
- The logo is designed using special characters and includes color codes to enhance its appearance in the terminal.
- The color codes used are ANSI escape sequences:
  - `\033[1m`: Bold text.
  - `\033[33m`: Yellow text.
  - `\033[0m`: Reset all attributes (text color and style).
- The function uses the `print()` function to output the `logo` string to the console.
- The displayed logo features the following elements:
  - A stylized title, "█▀ █▀▀ █▄░█ ▀█▀ █░█" and "▄█ ██▄ █░▀█ ░█░ █▄█", which spells out text in block characters.
  - The text "Data Science with Python on Archlinux", displayed prominently to indicate the focus of the project.
  - A small ".studio" text at the bottom, which adds a branding element.
- The color and bold styling make the logo visually striking, helping it stand out in the terminal output.
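To illustrate how the three escape sequences combine, here is a small sketch. It uses a simplified, hypothetical logo string rather than the project's exact art, and the `build_logo` helper and the constant names are assumptions introduced only for this example.

```python
BOLD = "\033[1m"     # bold text
YELLOW = "\033[33m"  # yellow foreground
RESET = "\033[0m"    # reset all attributes


def build_logo() -> str:
    """Return a simplified logo string (not the project's exact ASCII art)."""
    title = f"{BOLD}{YELLOW}Data Science with Python on Archlinux{RESET}"
    brand = f"{YELLOW}.studio{RESET}"
    return f"{title}\n{brand}"


def show_logo() -> None:
    """Print the logo; the styling only appears in ANSI-capable terminals."""
    print(build_logo())


show_logo()
```

Ending every styled segment with the reset sequence keeps the bold and yellow attributes from leaking into whatever the script prints next.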
When the function is executed, the following logo is displayed in the terminal (with colors and bold formatting applied):
█▀ █▀▀ █▄░█ ▀█▀ █░█ ┎┤ Data Science ├┒
▄█ ██▄ █░▀█ ░█░ █▄█ ┖┤ with Python on Archlinux ├┚
.studio
To display the project logo, a user or script would simply call the `show_logo` function:
show_logo()
- This call prints the ASCII art logo to the terminal, providing a visual indication that the project-related script or command is being executed.
- The `show_logo` function is self-contained and does not depend on any external libraries or functions. It relies solely on Python's built-in string handling and print capabilities.
- ANSI Escape Codes:
  - The function uses ANSI escape codes to add color and style to the text. These codes are widely supported in terminal emulators, allowing for enhanced text formatting.
- ASCII Art:
  - ASCII art is a graphic design technique that uses printable characters from the ASCII standard to create images or text. It is commonly used in terminal-based applications to add visual appeal.
Note
- The logo is designed with specific text and styling to represent a data science project that uses Python on Archlinux. If this function is used in a different context, the content and styling of the logo may need to be adjusted.
- The use of color codes assumes that the terminal supports ANSI escape sequences. If the script is run in a terminal that does not support these codes, the text may not display as intended.
graph TD;
print_activate_command["print_activate_command()"]
print_activate_command -->|Prints| activation_command["Activation Command"]
The `print_activate_command` function gives users the exact command needed to activate the Conda environment associated with a project. It is particularly useful for users who are not familiar with Conda or with the steps required to activate the environment in their terminal.
def print_activate_command():
- The function begins by printing a clear and instructive message to the console: "To activate the Conda environment, run the following command in your terminal:"
  - This message informs the user that the next step in setting up their development environment is to activate the Conda environment.
- The function then prints the actual command needed to activate the Conda environment: `\033[1meval conda activate ./env\033[0m`
  - The command includes the `eval` and `conda activate` instructions, which are required to properly initialize and activate the environment located at `./env`.
  - The command is formatted with ANSI escape sequences:
    - `\033[1m`: This sequence makes the command text bold so that it stands out in the terminal output.
    - `\033[0m`: This sequence resets the text formatting back to normal after the command is printed.
- `eval conda activate ./env`:
  - `eval`: This shell built-in command evaluates and executes the command string that follows it. This ensures that the Conda environment is activated in the current shell session.
  - `conda activate ./env`: This command activates the Conda environment located at `./env`, which is typically created during the project's setup process.
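Based on the description above, the function could look roughly like the sketch below. The two strings come from the text; the exact layout of the real function is an assumption.

```python
def print_activate_command() -> None:
    """Print the instruction, then the activation command in bold."""
    print("To activate the Conda environment, run the following command in your terminal:")
    # \033[1m switches bold on; \033[0m resets the formatting afterwards.
    print("\033[1meval conda activate ./env\033[0m")
```

Note that the function only prints the command. A Python process cannot activate an environment in the shell that launched it, so the user still has to run the printed line themselves.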
To instruct users on how to activate the Conda environment, a script or developer would call the `print_activate_command` function:
print_activate_command()
- This call will:
  - Display an instructional message to the user about activating the Conda environment.
  - Print the exact command needed to activate the environment, formatted in bold for emphasis.
When the function is executed, the following output will be displayed in the terminal (with the command text in bold):
To activate the Conda environment, run the following command in your terminal:
eval conda activate ./env
- The `print_activate_command` function is self-contained and does not depend on any external libraries or functions. It uses Python's built-in `print()` function to display the messages.
- Conda:
  - Conda is an open-source package management and environment management system. It helps manage dependencies and environments, ensuring that the correct libraries are used in a project.
- Environment Activation:
  - Activating a Conda environment is a crucial step in using the specific dependencies and configurations required for a project. This function provides a simple way to ensure users are correctly activating the environment.
Note
- The function assumes that the Conda environment has already been created in the `./env` directory. If the environment is located elsewhere, the command string may need to be adjusted accordingly.
- The use of ANSI escape sequences for bold text formatting assumes that the terminal supports these codes. If run in a terminal that does not support ANSI codes, the text may not display as intended.
The script uses the `typer` library to create a simple and intuitive CLI. The following commands are available:
Usage: manage.py [OPTIONS] COMMAND [ARGS]...
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion Install completion for the current shell. │
│ --show-completion Show completion for the current shell, to copy it or customize the installation. │
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ db-down Stop the PostgreSQL database. │
│ db-up Start the PostgreSQL database with Docker Compose. │
│ env Show the command to activate the Conda environment. │
│ init Initialize Git and install dependencies with Conda. │
│ tests Run tests with pytest. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
To use this script, save it as a `.py` file in your project directory and execute it with the desired command. For example:
python manage.py init
This will initialize the Git repository and install all dependencies.
The script handles errors primarily through the `run_command` function. If a command fails, an error message is printed, and the script exits with the appropriate return code.
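The behavior described here can be sketched with the standard library alone. This is one plausible shape for such a helper, not the script's actual code: the error message wording and the choice to write it to standard error are assumptions.

```python
import subprocess
import sys


def run_command(command: str) -> None:
    """Run a shell command; on failure, report it and exit with its return code."""
    result = subprocess.run(command, shell=True)
    if result.returncode != 0:
        # Message wording is hypothetical; the source only says "an error message is printed".
        print(f"Error: '{command}' failed with exit code {result.returncode}", file=sys.stderr)
        sys.exit(result.returncode)
```

Propagating the child's return code means that a wrapper such as a Makefile target or a CI job sees the same failure status it would have seen by running the command directly.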
The script can be customized by modifying the commands executed in the `run_command` function or by adding new `typer` commands. The `typer` library makes it easy to extend the CLI with additional functionality.
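As an illustration of adding a command, the sketch below registers a second function next to `db_down`. The `db_status` command is hypothetical and not part of the script, and `run_command` is reduced to a stand-in; only `typer.Typer()`, `@app.command()`, and `docker compose ps` are existing APIs and commands.

```python
import subprocess

import typer

app = typer.Typer()


def run_command(command: str) -> None:
    """Stand-in for the script's helper: run a shell command."""
    subprocess.run(command, shell=True)


@app.command()
def db_down():
    """Stop the PostgreSQL database."""
    run_command("docker compose down")


@app.command()
def db_status():
    """Show the state of the Compose services (hypothetical new command)."""
    run_command("docker compose ps")
```

In `manage.py` the module would end with a call to `app()` under the usual `if __name__ == "__main__":` guard, which is omitted here. Typer turns underscores in function names into dashes, so the new command would be invoked as `python manage.py db-status`, and its docstring becomes the description shown in the help output.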