convergo: GTFS/GTFS-RT to NeTEx/SIRI Converter

convergo (Latin, “I converge”) — To come together from different directions so as to meet at a point; to tend to a common result.

The name convergo is an acronym for Chimera cONVERter Gtfs to transmOdel, reflecting its role in enabling convergence towards the standard data models (NeTEx, SIRI) that are based on Transmodel and mandated by European regulation. The Latin word was also chosen to highlight the tool's initial focus on the Italian profiles of NeTEx and SIRI.

Description

The conversion between mobility data formats is enabled by leveraging the open-source Chimera solution (based on Apache Camel), which adopts a low-code approach to data conversion. This method is fundamental as it decouples the definition of conversion rules from the software itself, significantly increasing scalability and maintainability. The Camel framework also facilitates integration, allowing the configuration of data integration pipelines by reusing existing components, reducing the need for ad-hoc development.

This approach has been applied to develop two key converters integrated into the same software component:

  • GTFS → NeTEx (Italian Profile)
  • GTFS-RT → SIRI (Italian Profile)

Functionalities

The conversion functionalities are exposed by convergo through a set of APIs. For detailed API documentation, see api_doc.yml.

GTFS → NeTEx Conversion

  • Asynchronous API: Used due to the complexity of the process and potentially long processing times for GTFS feeds.

  • Functionality: A dedicated API allows the user to start the process and configure the output. A second API allows the user to monitor the process status and, upon completion, provides a compressed package containing the converted NeTEx data.

  • Parameters: The output of the conversion can be configured to generate a single file with all NeTEx Frames or multiple files organized by Frame type or service line.
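From the client side, the start-then-monitor interaction described above amounts to a polling loop. The following is a minimal sketch under stated assumptions, not the convergo API: `get_status` is a hypothetical wrapper around the monitoring API, and the status strings `"RUNNING"`, `"COMPLETED"`, and `"FAILED"` are placeholders. The actual endpoints and payloads are those documented in api_doc.yml.

```python
import time
from typing import Callable


def wait_for_conversion(
    get_status: Callable[[], str],
    poll_interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a status callable until the conversion job finishes.

    `get_status` is a hypothetical wrapper around the monitoring API, and the
    status strings are assumptions, not values taken from api_doc.yml.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"conversion still {status!r} after {waited:.0f}s")
        # Wait before asking again, and keep track of the total time spent.
        sleep(poll_interval_s)
        waited += poll_interval_s
```

The `sleep` parameter is injectable only so that the loop can be exercised without real delays. A long interval is usually enough, since large GTFS feeds may take a while to convert.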

GTFS-RT → SIRI Conversion

  • Synchronous API: Chosen as the execution time is suitable for real-time transformation.

  • Functionality: The converter exposes three endpoints that support the different functions of the Italian SIRI profile: Situation Exchange (SX), Vehicle Monitoring (VM), and Estimated Timetable (ET).

  • Parameters: The output of the conversion can be configured to access static GTFS data together with the input GTFS-RT and enrich the generated SIRI output. Note: the enrichment of GTFS-RT feeds with GTFS data increases the latency.

Mapping Rules

The mapping rules are written in the Mapping Template Language (MTL), a template-based approach built on Apache Velocity. The following templates are defined in the templates folder:

  • netex.vm (template for the GTFS to NeTEx conversion)
  • siri-vm.vm (template for the GTFS-RT VehiclePosition to SIRI Vehicle Monitoring conversion)
  • siri-et.vm (template for the GTFS-RT TripUpdate to SIRI Estimated Timetable conversion)
  • siri-sx.vm (template for the GTFS-RT Alert to SIRI Situation Exchange conversion)

The component can be easily extended by implementing additional mapping rules or modifying the existing ones. For example, they can be modified to:

  • Ensure compatibility with NeTEx profiles different from the Italian one
  • Include management of other GTFS fields not currently handled

The flexibility of Camel also allows the existing pipelines to be customized (e.g., to retrieve data from additional data sources when executing the mapping).

Build and Execution

A Docker image is made available via the GitHub Docker Registry and can be executed using the docker-compose.yaml file. Alternatively, a Dockerfile is provided to build the image from the source code.

To run the compiled JAR file or perform a manual build of the project, the following artefacts should be installed:

  • Java Development Kit (JDK): Version 17 or higher (required for both JAR execution and building)
  • Apache Maven: Version 3.9.5 (required for manual project compilation and dependency management)

Requirements

For the average case, machines with at least 8GB of RAM are recommended to execute the converter. Machines with 16GB or 32GB of RAM are advisable when processing large GTFS files. If used only to support a limited number of GTFS-RT feeds, the converter can run smoothly with 4GB of RAM.

Please note that you may want to adjust the JVM memory directives for executing the JAR or the Docker container (default: -Xmx8g).

Usage Notes

The converter does not handle fields that are not defined in the General Transit Feed Specification (GTFS) Reference. It is therefore recommended to pre-process GTFS feeds and check their correctness with an available validation tool.

The current version of the converter considers and processes the following GTFS files:

  • feed_info.txt
  • agency.txt
  • stops.txt
  • routes.txt
  • trips.txt
  • stop_times.txt
  • calendar.txt
  • calendar_dates.txt
  • shapes.txt
  • levels.txt
  • transfers.txt
  • pathways.txt
  • fare_attributes.txt (Fares L1)
  • fare_rules.txt (Fares L1)

The information is mapped according to levels 1, 2, 3 and 5 of the NeTEx Italian profile. For specific concepts (e.g., for the pathways.txt file), additional NeTEx entities are used even if they are not specified in the NeTEx Italian profile.

The following GTFS-RT messages are currently supported:

  • TripUpdate
  • VehiclePosition
  • Alert

Additional notes:

  • GTFS feeds without the feed_info.txt file can be mapped, provided an appropriate configuration is supplied.
  • The converter supports multi-agency feed files (multiple agencies in the agency.txt file), provided an appropriate configuration is supplied.
  • All location_type values in the stops.txt file are managed, mapped in NeTEx as Quay, StopPlace, StopPlaceEntrance, BoardingPosition, and PathJunction. However, some types are not directly supported by the Italian NeTEx profile.
  • If not present, the shape_dist_traveled field in both the stop_times.txt and shapes.txt files is calculated using the haversine formula applied to the coordinates of consecutive points.
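The haversine-based fallback for shape_dist_traveled mentioned above can be illustrated as follows. This is a minimal sketch, not the converter's implementation: the function names, the choice of metres as the unit, and the mean Earth radius value are assumptions made for the example.

```python
import math
from typing import Iterable, List, Tuple

# Mean Earth radius in metres. Both the value and the unit are assumptions
# for this sketch; the unit used by the converter is not stated here.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two WGS84 points, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def cumulative_distances(points: Iterable[Tuple[float, float]]) -> List[float]:
    """Running distance along an ordered sequence of (lat, lon) points.

    The first point gets 0.0; each later point gets the sum of the haversine
    distances between all consecutive pairs up to and including it.
    """
    distances: List[float] = []
    total = 0.0
    previous = None
    for point in points:
        if previous is not None:
            total += haversine_m(previous[0], previous[1], point[0], point[1])
        distances.append(total)
        previous = point
    return distances
```

The points are expected in shape_pt_sequence (or stop_sequence) order, because the result depends on which points are treated as consecutive.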

Converter Configuration

For the conversion to work, a map file must be defined for each GTFS/GTFS-RT feed. Map files must be placed in the config folder and must be in .txt format. A map file contains information that the converter needs to perform the conversion and is referenced by the templates listed above. It is important to define a single map file for the GTFS and GTFS-RT feeds of the same operator to ensure proper alignment of identifiers.

The map should be named map-[feed_id].txt and follow the configuration detailed in the CONFIG.md file.
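A pre-flight check for the naming convention above can be sketched as follows. The helper name `map_file_for` and the default location of the config folder relative to the working directory are assumptions for illustration. The contents of the map file are not inspected here; they are described in CONFIG.md.

```python
from pathlib import Path


def map_file_for(feed_id: str, config_dir: Path = Path("config")) -> Path:
    """Return the expected map file path for a feed, or raise if it is missing.

    Follows the map-[feed_id].txt naming convention; `config_dir` defaults to
    a relative `config` folder, which is an assumption of this sketch.
    """
    path = config_dir / f"map-{feed_id}.txt"
    if not path.is_file():
        raise FileNotFoundError(f"expected map file not found: {path}")
    return path
```

Running such a check before submitting a feed surfaces a missing or misnamed map file early, rather than during a conversion.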

An example for the Gruppo Torinese Trasporti (GTT) feed is available in the config folder. The corresponding GTFS file can be downloaded from here to test the converter. Note that the provided configuration generates the NeTEx output only for a specific route, to limit resource requirements. Remove the routeParameter variable from the config to get the complete output.

Converter Files

Once compiled and started correctly, the converter creates a support folder named appFolder in the main directory. This folder contains three sub-folders:

  • log
  • error
  • templates

The log folder contains a convergo.log file where all the logs of the running application are saved.

The error folder will contain a sub-folder for each day in which at least one error/exception occurred during conversion, and each sub-folder will contain a file named errors.txt.
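Given this layout, the recorded error files can be listed with a short script. This is a minimal sketch: the function name and the default location of appFolder relative to the working directory are assumptions, and because the naming format of the per-day sub-folders is not specified here, the sub-folder names are returned as-is rather than parsed as dates.

```python
from pathlib import Path
from typing import Dict


def find_error_files(app_folder: Path = Path("appFolder")) -> Dict[str, Path]:
    """Map each per-day sub-folder name under error/ to its errors.txt file.

    Sub-folders without an errors.txt file are skipped. The default
    `app_folder` location is an assumption of this sketch.
    """
    error_root = app_folder / "error"
    if not error_root.is_dir():
        return {}
    found: Dict[str, Path] = {}
    for day_dir in sorted(error_root.iterdir()):
        errors_file = day_dir / "errors.txt"
        if day_dir.is_dir() and errors_file.is_file():
            found[day_dir.name] = errors_file
    return found
```

An empty result means either that no conversion error has been recorded or that the application has not yet created the folder.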

The templates folder contains a copy of the templates; these copies are the ones actually used by the converter at runtime. Do not delete or modify them while the converter is running.

How to Contribute

Before contributing, carefully read, complete, and sign the Contributor License Agreement.

Contributions to the repository must be made via a pull request. It is preferable to discuss any modifications or problems encountered with the repository owners beforehand.

License

Licensed under the EUPL (European Union Public Licence) v1.2. You may not use the content of this repository except in compliance with the License. You may obtain a copy of the License at

https://eupl.eu/
