Cortex is a client-server system that receives special snapshots files from the user. Each such file includes photos the user has taken, along with metadata such as a depth map of the photographed image, the position and rotation of the camera, and the feelings the user had while taking the image. Cortex parses the snapshots, saves the results to a database and presents the parsed snapshots as a gallery on a website it renders.
The workflow of the Cortex system is presented in the following flow chart:
As shown in the above flow chart, the Cortex system consists of several components, some operating on the client side and some on the server side. Each component is written to operate independently, so that it can easily be replaced by a different component that fits the same role (exposes and uses the same APIs), allowing the other components of the system to treat it as a "black box". This modularity also allows wrapping each server-side component in a Docker container, running as a micro-service.
On the client side, the user uploads to Cortex, using the upload-sample command, a special snapshots file, which contains all the data to be parsed by the system. The currently supported file format is a gzipped Google Protobuf 3 stream, which has the following structure:
- A user data message, preceded by a 4-byte unsigned integer representing the size of the data in bytes, and containing the following fields:
- user_id: an 8-byte unsigned integer;
- username: a string;
- birthday: a 4-byte unsigned integer representing the user's birth date in UNIX time;
- gender: an enumerator with 3 options: 0 = MALE, 1 = FEMALE, 2 = OTHER.
- Multiple snapshot messages, each preceded by a 4-byte unsigned integer representing the size of the data in bytes, and containing the following fields:
- datetime: an 8-byte unsigned integer representing the time, in microseconds (UNIX time), at which the snapshot was taken;
- pose: representation of the user's pose while taking the snapshot, having the following fields:
- translation: three double-precision floats, representing camera position on axes X, Y and Z;
- rotation: four double-precision floats, representing the camera rotation quaternion;
- color_image: two 4-byte unsigned integers representing width and height, followed by the pixel data of the photographed image, with 3 bytes per pixel;
- depth_image: two 4-byte unsigned integers representing width and height, followed by the pixel data of the image's depth map, with a 4-byte float per pixel;
- feelings: four single-precision floats, representing the user's feelings of hunger, thirst, exhaustion and happiness while taking the snapshot, on a scale of -1 to 1 each.
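As a minimal sketch of this layout (the User and Snapshot message classes are hypothetical protoc-generated classes, and the little-endian size prefix is an assumption), reading such a file might look like this:

import gzip
import struct

from cortex_pb2 import User, Snapshot  # hypothetical protoc-generated classes

def read_sample(path):
    with gzip.open(path, 'rb') as f:
        def read_message(message_cls):
            header = f.read(4)
            if len(header) < 4:
                return None  # end of stream
            size, = struct.unpack('<I', header)  # 4-byte size prefix, assumed little-endian
            message = message_cls()
            message.ParseFromString(f.read(size))
            return message

        user = read_message(User)  # the user data message comes first
        snapshots = []
        while (snapshot := read_message(Snapshot)) is not None:  # then one snapshot at a time
            snapshots.append(snapshot)
        return user, snapshots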
The client reads the file using the default reader protocol (currently Protobuf 3): first the user data, then one snapshot at a time. It re-packs each message using a Protobuf 3 protocol (the same as the default reader's, but not coupled with it in the code) and sends the messages to the server over a socket connection.
The client exposes the following Python API:
from bci.client import upload_sample
upload_sample(host='127.0.0.1', port=5000, path='sample.mind.gz')
where host is the IP address or hostname of the server, port is the port on which the server communicates, and path is the relative or absolute path to a snapshots file;
And the following command line interface:
python -m bci.client upload-sample -h/--host '127.0.0.1' -p/--port 5000 'snapshot.mind.gz'
with the same arguments.
The default arguments for both APIs are: host = '127.0.0.1', port = 5000.
An additional format argument, available in both the Python API and the CLI (as -f/--format), allows the user to specify the reader module with which the uploaded snapshots file should be read. Currently, only 'protobuf' is available, and it is set as the default.
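For example, explicitly selecting the default reader looks like this:

upload_sample(host='127.0.0.1', port=5000, path='sample.mind.gz', format='protobuf')

or, from the command line:

python -m bci.client upload-sample -f 'protobuf' 'sample.mind.gz'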
In order to add a different reader module to the client, so that snapshots files in formats other than Protobuf 3 can be read, one needs to do the following:
- write a new reader class and put it in a file bci/readers/<format_name>.py in the project;
- at the bottom of the file, add the following: reader_cls = <format_name>
- from then on, the new format becomes available in the CLI with bci.client upload-sample ... -f/--format <format_name> and in the Python API with bci.client.upload_sample(..., format=<format_name>);
- to set the new format as default, specify DEFAULT_FORMAT = <format_name> in bci/utils/constants.py.
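A minimal sketch of such a reader module, assuming a hypothetical CSV-based format (the class name and method names here are illustrative; match them to the interface of the existing 'protobuf' reader in bci/readers/):

# bci/readers/csv_reader.py -- a hypothetical reader module.
class CsvReader:
    def __init__(self, path):
        self.path = path  # path to the uploaded snapshots file

    def read_user(self):
        ...  # parse and return the user data message

    def read_snapshot(self):
        ...  # parse and return the next snapshot, or None at end of stream

reader_cls = CsvReader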
The server, via the run-server command, receives messages from clients over a socket connection. It unpacks each message using the same Protobuf 3 protocol as the client, and publishes it to the message queue given in the message-queue-url argument (see usage below), using the protocol given in that URL's scheme (currently supporting only rabbitmq).
The current rabbitmq publisher does the following:
- For user data messages, it simply repacks the message with the same Protobuf 3 protocol as the server and publishes it to the message queue under the users topic;
- For snapshot messages, it repacks the message with the same Protobuf 3 protocol as the server, saves it as a snapshot.raw file under the data/<user_id>/<timestamp> path, and publishes to the message queue:
1. the address of the raw snapshot, to a raw_snapshot fanout exchange; and
2. the metadata (ID, user ID, timestamp) of the snapshot, under the snapshots topic, in JSON format.
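As a hedged sketch of what this publishing can look like with the pika library (the topic exchange name and the illustrative values are assumptions; the raw_snapshot fanout exchange and the snapshots routing key come from the description above):

import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host='127.0.0.1', port=5672))
channel = connection.channel()

raw_path = 'data/42/1575446887339/snapshot.raw'                  # illustrative
metadata = {'id': 1, 'user_id': 42, 'timestamp': 1575446887339}  # illustrative

# Fan the raw snapshot's address out to all parsers at once.
channel.exchange_declare(exchange='raw_snapshot', exchange_type='fanout')
channel.basic_publish(exchange='raw_snapshot', routing_key='', body=raw_path)

# Publish the snapshot metadata under the 'snapshots' topic.
channel.exchange_declare(exchange='cortex', exchange_type='topic')  # exchange name is an assumption
channel.basic_publish(exchange='cortex', routing_key='snapshots', body=json.dumps(metadata))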
After a successful publish, or upon a failure, the server sends an acknowledgement (ACK) message back to the client, which can then proceed to send the next message.
The server exposes the following Python API:
from bci.server import run_server
run_server(host='127.0.0.1', port=5000, publish=publisher_function)
where host is the IP address or hostname of the server, port is the port on which the server communicates, and publish is a publishing function which takes one positional argument and **kwargs, e.g.:
def log_message(message, **kwargs):
    print(message, file=open('/tmp/test.log', 'a'))
And the following command line interface:
python -m bci.server run-server -h/--host '127.0.0.1' -p/--port 5000 'rabbitmq://127.0.0.1:5672/'
where host and port are the same as above, while the last argument is the address (IP:port) of the message queue, preceded by the protocol used (currently, only rabbitmq is supported).
The default arguments for both APIs are: host = '127.0.0.1', port = 5000.
In order to add a custom publisher module, write a new publisher function with the signature publish(message, **kwargs) and put it in a file bci/publishers/<publisher_name>.py in the project. A module named <publisher_name> will then become available for use as a scheme in the publisher URL in the CLI above.
The following keyword arguments (kwargs) are available to publisher functions:
- provided by the Python API run_server: msg_type, user_id;
- provided by the CLI function: publisher_host, publisher_port.
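A minimal sketch of such a publisher module (the file name and the body are hypothetical; only the signature and the kwargs listed above come from the interface described here):

# bci/publishers/stdout.py -- a hypothetical publisher module.
def publish(message, **kwargs):
    # msg_type and user_id are provided by run_server;
    # publisher_host and publisher_port by the CLI.
    msg_type = kwargs.get('msg_type')
    user_id = kwargs.get('user_id')
    print(f'[{msg_type}] user={user_id}: {message!r}')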
Parsers are micro-services which receive raw snapshot data (a file path is consumed from the message queue through the raw_snapshot fanout exchange, and the data itself from the file system, see the server above), parse it in some way and publish the results to the message queue on a dedicated topic (in JSON format), to be later saved to the database by the saver component.
For color and depth images, only metadata is published (image ID, height, width, image URL), while the parsed image itself is saved as a file in the file system, under data/{color,depth}_images/<id>.
The parser module has a parse function, which calls the requested parsing engine and parses the data from a file at a given raw_data_path, and a run-parser function, which runs continuously as a service, consuming messages from the message queue and calling parse as part of its operation.
Each parser exposes the following Python API:
from bci.parsers import run_parser
result = run_parser('<parser_name>', '<raw_data_path>')
which accepts a parser name and the path to a raw data file (as consumed from the message queue), and returns the result (as published to the message queue);
And the following command line interface:
python -m bci.parsers parse '<parser_name>' '<raw_data_path>' > '<parsed_data_path>'
which accepts a parser name and the path to a raw data file (as consumed from the message queue), and prints the result (as published to the message queue) or redirects it to a result file as in the example above.
A parsing service that continuously consumes from the message queue can be run with:
python -m bci.parsers run-parser '<parser_name>' 'rabbitmq://127.0.0.1:5672/'
where the last argument is the address (IP:port) of the message queue, preceded by the protocol used (currently, only rabbitmq is supported).
Parsers currently available are:
- feelings: parses and publishes a JSON of the user's feelings while taking the snapshot;
- pose: parses and publishes a JSON of the camera position and rotation while taking the snapshot;
- color_image: renders the actual image taken by the user, in JPEG format, and publishes the image metadata;
- depth_image: renders a heat-map representing the depth dimension of the image taken by the user, in PNG format (using the pyplot library), and publishes the resulting image metadata.
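For illustration, a published feelings result might have the following shape (the field names are assumptions; only the four feeling values and their -1 to 1 scale come from the snapshot format described above):

# Hypothetical shape of a published 'feelings' result, as a Python dict.
result = {
    'user_id': 42,               # illustrative
    'timestamp': 1575446887339,  # illustrative
    'feelings': {
        'hunger': 0.0,           # each value is on a scale of -1 to 1
        'thirst': 0.5,
        'exhaustion': -0.25,
        'happiness': 1.0,
    },
}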
In order to add a custom parser module, one has to create a file named <parser_name>.py in the bci/parsing directory, containing the following:
from ..parsers import BasicParser

class MyParser(BasicParser):
    def parse(self, raw_snapshot_path):
        # your parsing code here, generating some savable result
        return result

parser_cls = MyParser
A parser named <parser_name> will become available for use in the Python API and CLI above.
The saver consumes messages from the message queue through the various topic exchanges, and saves the message data to a database at the given URL, using the database service provided in the run-saver function's URL scheme (currently, only mongodb is supported).
The saver exposes the following Python API:
from bci.saver import Saver
saver = Saver('mongodb://127.0.0.1:27017')
data = '<some data in JSON-serializable format>'
saver.save('<topic_name>', data)
where a URL of a running database service is provided to the Saver class constructor, and topic_name is the name of the topic (i.e. a table in relational databases or a collection in non-relational ones) to which the data should be saved;
And the following command line interface:
python -m bci.saver save -d/--database 'mongodb://127.0.0.1:27017' '<topic_name>' '<parsed_data_path>'
where topic_name is the name of the topic (table/collection) in the database to which the data should be saved, and parsed_data_path is a path to a file containing JSON-serializable data to be saved to the database.
The default argument for --database is 'mongodb://127.0.0.1:27017'.
The command line also supports running the saver as a service, which continuously consumes messages to be saved from all the topics listed under TOPICS in bci/utils/constants.py, and saves them to the appropriate topics in the database:
python -m bci.saver run-saver 'mongodb://127.0.0.1:27017' 'rabbitmq://127.0.0.1:5672'
where the first argument is the URL of a running database service (currently only mongodb is supported), and the second is the URL of a running message queue service (currently only rabbitmq is supported).
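As a hedged sketch of what saving to mongodb can look like with pymongo (the database name and the illustrative data are assumptions; the actual Saver encapsulates these details):

from pymongo import MongoClient

client = MongoClient('mongodb://127.0.0.1:27017')
db = client['cortex']                    # database name is an assumption
data = {'user_id': 42, 'result': '...'}  # illustrative parsed data
db['snapshots'].insert_one(data)         # one collection per topic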
The API server is a Flask server which receives REST calls from command line interface (CLI) and graphical user interface (GUI) clients and responds with results read from the database according to the requested REST endpoint, using the database service provided in the run-server function's URL scheme (currently, only mongodb is supported).
The following REST endpoints are supported:
- GET /users: returns the list of all the users Cortex currently knows, each with its user ID and name;
- GET /users/<user-id>: returns the specified user's details: user ID, name, birthday and gender;
- GET /users/<user-id>/snapshots: returns the list of the specified user's snapshot IDs and datetimes;
- GET /users/<user-id>/snapshots/<snapshot-id>: returns the specified snapshot's details: ID, datetime, and a list of the available results' names (e.g. [pose, feelings]);
- GET /users/<user-id>/snapshots/<snapshot-id>/<result-name>: returns the specified snapshot's result details. <result-name> can be pose, color_image, depth_image or feelings, where for color and depth images, only the image's metadata and the URL address of the image are shown.
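For example, a minimal sketch of querying these endpoints with the requests library (assuming the API server runs on its defaults, and that each returned user object carries a user_id field):

import requests

API = 'http://127.0.0.1:8000'
users = requests.get(API + '/users').json()
for user in users:
    snapshots = requests.get(API + '/users/{}/snapshots'.format(user['user_id'])).json()
    print(user, len(snapshots))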
The API server exposes the following Python API:
from bci.api import run_api_server
run_api_server(host='127.0.0.1', port=8000, database_url='mongodb://127.0.0.1:27017')
which starts the server and runs it continuously;
And the following command line interface:
python -m bci.api run-server -h/--host '127.0.0.1' -p/--port 8000 -d/--database 'mongodb://127.0.0.1:27017'
which does the same.
The default arguments for both APIs are: host = '127.0.0.1', port = 8000, database_url = 'mongodb://127.0.0.1:27017'.
The CLI client is a simple command-line client which requests each of the REST endpoints of the API server (see above), gets a JSON response and tabulates it to the terminal.
Available commands are:
python -m bci.cli get-users
python -m bci.cli get-user 1
python -m bci.cli get-snapshots 1
python -m bci.cli get-snapshot 1 2
python -m bci.cli get-result 1 2 '<result_name>'
where, in commands with arguments, the first argument is the user ID, the second is the snapshot ID (discoverable by running the get-snapshots command first), and result_name names the result whose details should be shown for the specified snapshot: this can be pose, color_image, depth_image or feelings, where for color and depth images, only the image's metadata and the URL address of the image are shown.
For each of the above commands, -h/--host and -p/--port arguments are available for providing the address of the API server to which the CLI client connects. Default arguments are: host = '127.0.0.1', port = 8000.
The GUI is a website run by a Flask server which receives HTTP requests from a GUI client - such as a web browser - then makes REST requests to the API server according to the requested endpoint, and generates HTML pages based on the received data.
The GUI exposes the following Python API:
from bci.gui import run_server
run_server(host='127.0.0.1', port=8080, api_host='127.0.0.1', api_port=8000)
which starts the web server and runs it continuously;
And the following command line interface:
python -m bci.gui run-server -h/--host '127.0.0.1' -p/--port 8080 -H/--api-host '127.0.0.1' -P/--api-port 8000
which does the same.
The default arguments for both APIs are the ones presented in the examples above.
The website consists of three mutually-navigable parts:
- An index of snapshot galleries uploaded by users;
- A snapshot gallery for each user, complete with preview of each snapshot's color image;
- Detailed graphical representation for each snapshot. The pose is represented by two plots generated using the pyplot library (see the sketch after this list):
- translation is a point in 3D space;
- rotation is an arrow inside a 3D sphere, pointing to the direction viewed by the camera.
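A minimal sketch of such plots with matplotlib's pyplot (the figure layout and the illustrative values are assumptions; only the two plot types come from the description above):

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: registers the 3d projection

translation = (0.1, 0.2, 0.3)  # illustrative camera position
direction = (0.0, 0.0, 1.0)    # illustrative viewing direction, derived from the quaternion

fig = plt.figure()

# Translation: a point in 3D space.
ax1 = fig.add_subplot(1, 2, 1, projection='3d')
ax1.scatter(*translation)

# Rotation: an arrow pointing in the direction viewed by the camera.
ax2 = fig.add_subplot(1, 2, 2, projection='3d')
ax2.quiver(0, 0, 0, *direction)

plt.savefig('pose.png')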
The installation of Cortex requires Python 3.8.
In order to install and deploy the system, the following steps are required:
- Clone the repository and enter it:
git clone [email protected]:advanced-system-design/project-304366891.git
cd project-304366891/
- Run the installation script and activate the virtual environment:
./scripts/install.sh
source .env/bin/activate
...
[Mike's BCI] $ # you're good to go!
The install.sh script will also deploy two Docker containers: one running a mongodb database with port 27017 open, and another running a rabbitmq message queue with port 5672 open.
The project includes pytest tests, which are available in the tests directory of the project. They can be run by typing pytest in the command line while in the project's main directory.
The tests are also integrated with TravisCI, and are run with each push to Github.
The system has a simple logging service, which stores logs under log/<component_name>.log for debugging purposes.
A script provided with the project enables quick deployment of the entire Cortex system in one command, using Docker containers: run ./run-pipeline.sh from the main directory of the project (sudo might be needed, depending on the Docker installation).
The script requires:
- Docker installed;
- The project's virtual environment active (see above);
- No container name conflicts during installation. These are unlikely, since all the containers are prefixed with qd_ (for "quick deployment") specifically to avoid such issues;
- No port conflicts. The script will use ports 5000, 8000, 8080 on localhost, so make sure these are available before running.
The script will build a single Docker image of the project, on which all containers are based, then deploy a Docker container for each server-side component: the server, the message queue (rabbitmq), the database (mongodb), one container for each parser (feelings, pose, color image and depth image), the saver, the API server and the GUI. The containers share a dedicated network named cortex, as well as a shared data volume mounted at /data for saving to and loading from the file system.
Once the script has finished running, the server will become available on localhost:5000, the API server on localhost:8000 and the GUI website on localhost:8080. At this point, one can start uploading snapshot files. E.g.:
python -m bci.client upload-sample tests/good_proto.mind.gz
python -m bci.client upload-sample tests/test_proto.mind.gz
(this will upload two minimalistic snapshots files used for testing)