A command-line tool for preparing and sending curated files for integration.
It runs several sanity checks on the files (including the syntax checkers), performs the FTP transfers, backs up the files, and cleans the submission directories.
- Python 3.8 or higher (the Anaconda/Miniconda Python distribution is recommended).
- UniProt curation environment, which includes the syntax checker scripts.
- FTP credentials for connecting to the remote server.
Ideally the package should be installed into a clean virtual environment. This command creates a conda environment named "sending":
conda create --name sending
Activate the new environment:
conda activate sending
Finally, install the sending package into the environment:
conda install --channel ehatton sending
Environment variables configure the locations of the submission directories and other resources. A full list of the required environment variables can be found in the section below. The easiest way to set this up is a batch script that sets all the environment variables, activates the Python environment and opens a cmd shell.
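As an alternative to a batch script, the same setup can be sketched in Python. This is only an illustration: the helper names `build_env` and `run_sending` and the placeholder paths are hypothetical, and the sketch assumes the `sending` executable is already on the PATH (that is, the conda environment is active).

```python
import os
import subprocess

# Placeholder values for illustration only. The variable names come from the
# list in this README; the paths are invented and must be replaced.
EXAMPLE_SETTINGS = {
    "LOG_DIR": "/path/to/logs",
    "NEW_DIR": "/path/to/new",
}


def build_env(settings, base=None):
    """Return a copy of the base environment with the settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(settings)
    return env


def run_sending(subcommand, settings):
    """Run e.g. `sending check` in an environment extended with the settings.

    Assumes the `sending` executable can be found on the PATH.
    """
    return subprocess.run(["sending", subcommand], env=build_env(settings), check=False)
```

Calling `run_sending("check", EXAMPLE_SETTINGS)` would then start the check stage with the extra variables set, without changing the environment of the calling process.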
There are four commands, one for each stage: check (checking), info (information), send (sending) and tidy (backup).
-
To run checks on the files, use the check command.
sending check
This automatically runs the syntax checkers on all files.
It also checks for valid accessions in newly curated entries.
For curated TrEMBL entries, it checks that the accessions and protein IDs match the original TrEMBL entries in the database.
For pep and sub files, it checks that none of the secondary accessions are present in TrEMBL.
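The accession checks described above can be illustrated with a short sketch. It is not the tool's actual implementation: the function names are hypothetical, the format test uses the accession pattern published in the UniProt help pages, and the set of TrEMBL accessions is assumed to be available in memory rather than looked up in the database.

```python
import re

# Accession pattern as published in the UniProt help pages.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_valid_accession(accession):
    """Return True if the whole string is a well-formed UniProt accession."""
    return ACCESSION_RE.fullmatch(accession) is not None


def secondary_accessions_in_trembl(secondary, trembl_accessions):
    """Return the secondary accessions that also occur in TrEMBL (hypothetical helper).

    An empty result means the pep/sub file passes this check.
    """
    return sorted(set(secondary) & set(trembl_accessions))
```

A format test like this only catches malformed accessions; confirming that an accession matches the original TrEMBL entry still requires a database lookup.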
-
To view summary information about the files, use the info command.
sending info
This command lists the number of files, and the total number of entries created/updated, for each directory.
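One way such an entry count could be obtained is sketched below. It assumes the curated files use the UniProt flat-file format, in which every entry ends with a line containing only `//`; the function name `count_entries` is hypothetical and the tool may count entries differently.

```python
def count_entries(flat_file_text):
    """Count entries by counting the `//` terminator lines (assumed flat-file format)."""
    return sum(1 for line in flat_file_text.splitlines() if line.rstrip() == "//")
```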
-
To execute the FTP transfer, use the send command.
sending send
This automatically transfers all the files to the remote FTP server. New entries are concatenated into a single file named allnew with a date stamp appended to the filename (e.g. allnew_20200707.swp).
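The naming scheme for the concatenated file can be sketched as follows. The date format (year, month, day) is inferred from the example filename above and is therefore an assumption, as are the helper names `allnew_filename` and `concatenate`.

```python
from datetime import date


def allnew_filename(day):
    """Build the date-stamped filename, e.g. allnew_20200707.swp (assumed YYYYMMDD)."""
    return f"allnew_{day:%Y%m%d}.swp"


def concatenate(contents):
    """Join the contents of several files, making sure each part ends with a newline."""
    parts = []
    for content in contents:
        if not content:
            continue  # skip empty files
        parts.append(content if content.endswith("\n") else content + "\n")
    return "".join(parts)
```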
-
To back up files and clean submission directories ready for the following week, use the tidy command.
sending tidy
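Conceptually, backing up and cleaning a submission directory amounts to moving its files somewhere else. The sketch below shows that idea only; the function name, the backup location and the choice to move rather than copy and delete are all assumptions, not the behaviour of the tidy command itself.

```python
import shutil
from pathlib import Path


def back_up_and_clean(submission_dir, backup_dir):
    """Move every file from submission_dir into backup_dir and return the moved names.

    Hypothetical helper: subdirectories are left untouched.
    """
    submission_dir = Path(submission_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(submission_dir.iterdir()):
        if path.is_file():
            shutil.move(str(path), str(backup_dir / path.name))
            moved.append(path.name)
    return moved
```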
-
Help documentation is also available:
sending --help
The following environment variables are required. Usage examples can be found in the .env file in the tests folder.
- LOG_DIR
- LOG_REMOTE_DIR
- TREMBL_DIR
- TREMBL_REMOTE_DIR
- NEW_DIR
- NEW_REMOTE_DIR
- PEP_DIR
- PEP_REMOTE_DIR
- SUB_DIR
- SUB_REMOTE_DIR
- PID_DIR
- PID_REMOTE_DIR
- SEQ_DIR
- SEQ_REMOTE_DIR
- REMOTE_HOST_NAME
- REMOTE_USER
- REMOTE_SERVER
- REMOTE_KEY
- KNOWN_HOSTS
- TREMBL_SERVER
- BINPROT
- PERLLIB
- SPROT
- JIRA_URL
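Before running any command, it can be useful to confirm that every variable in the list above is set. The following is a minimal sketch of such a check; the function name `missing_variables` is hypothetical and is not part of the sending package.

```python
import os

# Variable names copied from the list above.
REQUIRED_VARIABLES = [
    "LOG_DIR", "LOG_REMOTE_DIR",
    "TREMBL_DIR", "TREMBL_REMOTE_DIR",
    "NEW_DIR", "NEW_REMOTE_DIR",
    "PEP_DIR", "PEP_REMOTE_DIR",
    "SUB_DIR", "SUB_REMOTE_DIR",
    "PID_DIR", "PID_REMOTE_DIR",
    "SEQ_DIR", "SEQ_REMOTE_DIR",
    "REMOTE_HOST_NAME", "REMOTE_USER", "REMOTE_SERVER", "REMOTE_KEY",
    "KNOWN_HOSTS", "TREMBL_SERVER",
    "BINPROT", "PERLLIB", "SPROT", "JIRA_URL",
]


def missing_variables(environ=None):
    """Return the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```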