Execute the SQL queries that create the dataset used to build our models, then download the results.
Create a Cloud Platform project with billing enabled. You won't be charged, but billing is required to save large query results. Download the Google Cloud SDK and follow the prompts to add the project you created. After you are done, use the web interface or the bq mk command to create a dataset.
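For the command-line route, a minimal sketch of the dataset creation step might look like the following. The function name `create_dataset` and the example IDs are hypothetical placeholders, not part of this repository; the sketch assumes the Google Cloud SDK is installed and authenticated so that `bq` is on your PATH.

```shell
# Hypothetical helper: wraps `bq mk` to create a BigQuery dataset.
# Both arguments are values you supply; nothing here is taken from run_queries.sh.
create_dataset() {
  local project_id="$1"
  local dataset_id="$2"
  # The PROJECT:DATASET form names the dataset explicitly within the project.
  bq mk --dataset "${project_id}:${dataset_id}"
}

# Example call with made-up IDs (uncomment and substitute your own):
# create_dataset my-project-id subreddit_algebra
```

Passing the project explicitly avoids depending on whichever default project the SDK currently has configured.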
Running:
./run_queries.sh PROJECT_ID DATASET_ID
will execute the queries, create the necessary tables, and save the results to the working directory as subreddit-algebra-output-%Y-%m-%d.csv.
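To check which file to expect after a run, the date pattern in the filename can be expanded by hand. This sketch assumes the script fills in `%Y-%m-%d` with the local date at run time, as `date` format specifiers would; the variable name `output_file` is illustrative only.

```shell
# Assumption: the script names its output using today's local date.
output_file="subreddit-algebra-output-$(date +%Y-%m-%d).csv"

# Report whether a file with that name exists in the working directory.
if [ -f "${output_file}" ]; then
  echo "Found ${output_file}"
else
  echo "No ${output_file} in $(pwd) yet"
fi
```

If the run crosses midnight or the script uses a different time zone, the date in the actual filename may differ from the one computed here.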
These queries were copied from FiveThirtyEight's public data repository.