Skip to content

Repo to hold the homeworks for the data engineering zoomcamp by DataTalks

Notifications You must be signed in to change notification settings

Tiburso/DataEngineeringZoomcamp

Repository files navigation

Instalation

Everything in this project can be run via the docker compose up command. The bigquery and gcs buckets have to be setup via de terraform apply command in the infra folder

Objective

The main goal of this project was to learn and practice data engineering basics. This is an end-to-end project where it goes from the data ingestion part to the presentation layer with Metabase.

Tools

For this project the following tools were used:

  • airflow (orchestration)
  • Docker (containerization and development)
  • terraform (infrastructure)
  • Google Big Query/Google Cloud Storage (cloud -> data warehousing and lake)
  • Apache Spark (batch processing)
  • dbt (data transformation)
  • Metabase (visualization)

About

Repo to hold the homeworks for the data engineering zoomcamp by DataTalks

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published