spark-example

This repository contains a sample Spark job in Python. The script demonstrates:

Reading data from a CSV file into a Spark DataFrame (the Titanic dataset).
Performing a simple transformation (grouping by the Age column and counting).
Optionally running a user-provided SQL query on the data (via Valohai parameters).
Writing the transformation output and, when a query is provided, the SQL results to disk.
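
The steps above can be sketched roughly as follows. This is a minimal illustration, not the script shipped in this repository: the flag names (--input, --output, --sql-query), the default file name titanic.csv, the output subdirectories, and the temporary view name titanic are all assumptions made for the example. The actual job receives its query through Valohai parameters, which may be wired differently.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    # Hypothetical flag names and defaults, chosen for this sketch only.
    parser = argparse.ArgumentParser(description="Sample Spark job sketch")
    parser.add_argument("--input", default="titanic.csv")
    parser.add_argument("--output", default="output")
    parser.add_argument("--sql-query", default=None)
    return parser.parse_args(argv)


def run_job(args: argparse.Namespace) -> None:
    # Imported lazily so argument parsing works even without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("spark-example").getOrCreate()
    try:
        # Read the CSV into a DataFrame, using the first row as column names.
        df = spark.read.csv(args.input, header=True, inferSchema=True)

        # Simple transformation: number of rows per value of the Age column.
        age_counts = df.groupBy("Age").count()
        age_counts.write.mode("overwrite").csv(
            f"{args.output}/age_counts", header=True
        )

        # Optional user-provided SQL query, run against a temporary view.
        # The view name "titanic" is an assumption of this sketch.
        if args.sql_query:
            df.createOrReplaceTempView("titanic")
            result = spark.sql(args.sql_query)
            result.write.mode("overwrite").csv(
                f"{args.output}/sql_result", header=True
            )
    finally:
        spark.stop()
```

With these assumed names, calling run_job(parse_args(["--sql-query", "SELECT Sex, COUNT(*) FROM titanic GROUP BY Sex"])) would write both the Age counts and the query result under the output directory. Leaving the query out skips the SQL step.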