spark-example

This repository contains a sample Spark job in Python. The script demonstrates:

  1. Reading data from a CSV file into a Spark DataFrame (the Titanic dataset).
  2. Performing a simple transformation (grouping by the Age column and counting).
  3. Optionally running a user-provided SQL query on the data (via Valohai parameters).
  4. Writing the transformation output and, when a query is provided, the SQL results to disk.
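
The four steps above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual script. The flag names (`--input-path`, `--output-dir`, `--sql-query`), the default file name `titanic.csv`, the temporary view name `titanic`, and the output sub-directories are all hypothetical. It also assumes the Valohai parameter reaches the script as a command-line flag.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the job parameters (all flag names here are hypothetical)."""
    parser = argparse.ArgumentParser(description="Sample Spark job (sketch)")
    parser.add_argument("--input-path", default="titanic.csv")
    parser.add_argument("--output-dir", default="output")
    # An empty string means "no SQL query was provided".
    parser.add_argument("--sql-query", default="")
    return parser.parse_args(argv)


def run_job(input_path, output_dir, sql_query=""):
    """Run the four steps: read CSV, group and count, optional SQL, write."""
    # Imported lazily so parse_args can be used without a Spark installation.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("spark-example-sketch").getOrCreate()
    try:
        # Step 1: read the CSV file into a DataFrame, using the header row
        # for column names and letting Spark infer the column types.
        df = spark.read.csv(input_path, header=True, inferSchema=True)

        # Step 2: group by the Age column and count the rows in each group.
        age_counts = df.groupBy("Age").count()

        # Step 4 (first half): write the transformation result to disk.
        age_counts.write.mode("overwrite").csv(
            os.path.join(output_dir, "age_counts"), header=True
        )

        # Step 3 and step 4 (second half): if a query was given, expose the
        # data as a temporary view (assumed name: "titanic"), run the query,
        # and write its result as well.
        if sql_query:
            df.createOrReplaceTempView("titanic")
            spark.sql(sql_query).write.mode("overwrite").csv(
                os.path.join(output_dir, "sql_result"), header=True
            )
    finally:
        spark.stop()
```

With these assumptions, calling `run_job(**vars(parse_args()))` would run the whole job, and a query such as `SELECT Sex, COUNT(*) FROM titanic GROUP BY Sex` could be passed through `--sql-query`, provided the dataset has a `Sex` column.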