To get started with setting up the website for local development, see the Setup Guide.
This website repository ("website") holds code for the frontend of Data Commons, including https://datacommons.org, Custom Data Commons, the JavaScript client, the web components, and more.
The file structure is as follows:

```
├── .github/        # GitHub Actions workflows (CI/CD) and templates
├── build/          # Build scripts and configuration (Dockerfiles)
├── custom_dc/      # Sample configurations for Custom Data Commons instances
├── deploy/         # Deployment scripts (GKE/Cloud Run)
├── docs/           # Developer guides and documentation
├── gke/            # Google Kubernetes Engine configuration files
├── import/         # Submodule: import scripts used for loading Custom DC data
├── mixer/          # Submodule: code for the Data Commons Mixer (backend)
├── model_server/   # Code for the model hosting server
├── nl_server/      # Code for the Natural Language (NL) search server
├── packages/       # Shared internal packages/libraries (often for UI)
├── scripts/        # Utility and maintenance scripts
├── server/         # Main Python website server code (Flask/endpoints)
├── shared/         # Shared resources and logic used across the website and NL servers
├── static/         # Static assets: CSS, JavaScript, images, and data files for the website
│   └── src/        # Entry point for the NodeJS charts server
│       └── nodejs_server/  # Main lib code for the NodeJS charts server
├── tools/          # Developer tools (e.g., golden generators, verifiers)
├── nl_app.py       # Entry point for the NL server
├── web_app.py      # Entry point for the main website server
├── run_*.sh        # Convenience scripts to run the servers/tests locally
└── skaffold.yaml   # Configuration for Skaffold (Kubernetes development, Cloud Deploy)
```
For changes that do not involve GCP deployment or mixer changes, you can simply run Flask in a local environment (on a Mac or Linux machine). The local Flask app talks to the autopush mixer.
Note: the autopush mixer contains the latest data and mixer code changes. It
is necessary to update the mixer submodule if compatibility is required between
website and mixer changes.
Run the following to watch static files and rebuild on code edits:

```bash
./run_npm.sh
```

If there are errors, make sure to run `nvm use v18.4.0` to set the correct Node version.
Start the Flask webserver locally at localhost:8080:

```bash
./run_server.sh
```

To enable NL search, follow the "Start NL Server" instructions in the next section.
Then, start the Flask webserver with language models enabled via `-m`:

```bash
./run_server.sh -m
```

If you don't have access to the Data Commons Maps API, you can bring up the website without place search functionality:

```bash
./run_server.sh -e lite
```

There are multiple environments for the server, specified by the `-e` option. For example, `custom` is for Custom Data Commons and `iitm` is for IITM Data Commons.
To start multiple instances, bind each server instance to a different port (the default is 8080). The following starts a server on port 8081:

```bash
./run_server.sh -p 8081
```

Please note the script's strict flag syntax: leave a space after the flag. `./run_server.sh -p 8081` works, but `./run_server.sh -p=8081` does not.
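Why the space matters: POSIX-style `getopts` parsing treats everything after the flag letter in the same word as part of the argument, so `-p=8081` yields the literal value `=8081`. A minimal sketch, assuming the script uses `getopts`-style parsing (`parse_port` is an illustrative stand-in, not a function in the repo):

```shell
# Illustrative only: parse_port stands in for run_server.sh's flag parsing.
parse_port() {
  local OPTIND=1 opt port=8080   # 8080 is the documented default port
  while getopts "p:" opt; do
    case "$opt" in
      p) port="$OPTARG" ;;
    esac
  done
  echo "$port"
}

parse_port -p 8081   # prints 8081
parse_port -p=8081   # prints =8081, because "=" is swallowed into the argument
```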
`ModuleNotFoundError`: missing python libraries...

Clear the environment and rebuild all required libraries by running:

```bash
rm -rf .venv
./run_test.sh --setup_python
```

Natural language models are hosted on a separate server. For features that depend on them (all NL-based interfaces and endpoints), the NL server needs to be brought up locally, in a separate process:

```bash
./run_nl_server.sh -p 6060
```

By default the NL server runs on port 6060.
If you run into problems starting the server, try running these commands before restarting it:

```bash
./run_test.sh --setup_python
rm -rf ~/.datacommons
rm -rf /tmp/datcom-nl-models
rm -rf /tmp/datcom-nl-models-dev
```

If a local mixer is needed, you can start it by following these instructions. This allows development with a custom Bigtable or mixer code changes. Make sure to also run ESP locally.
Then start the Flask server with the `-l` option to make it use the local mixer:

```bash
./run_server.sh -l
```

❗ IMPORTANT: Make sure that your ChromeDriver version is compatible with your local Google Chrome version.
Before running the tests, install a browser and webdriver. We recommend the Google Chrome browser and ChromeDriver.

Instructions for installing Google Chrome and ChromeDriver:

- Chrome browser can be downloaded here.
- ChromeDriver can be downloaded here. You can view the latest ChromeDriver version here. Or, download it using a package manager directly:

  ```bash
  npm install chromedriver
  ```

- Make sure `PATH` is updated with ChromeDriver's location.
If you're using a Linux system, you can run the following commands to download the Chrome browser and ChromeDriver; this also includes the path setup:

```bash
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb; sudo apt-get -fy install
CHROMEDRIVERV=$(curl https://chromedriver.storage.googleapis.com/LATEST_RELEASE)
wget https://chromedriver.storage.googleapis.com/${CHROMEDRIVERV}/chromedriver_linux64.zip
unset CHROMEDRIVERV
unzip chromedriver_linux64.zip
sudo mv chromedriver /usr/bin/chromedriver
sudo chown root:root /usr/bin/chromedriver
sudo chmod +x /usr/bin/chromedriver
```

❗ NOTE: If using macOS with an ARM processor (M1 chip), run the local NL server before running the tests:
```bash
./run_nl_server.sh -p 6060
```

Run all tests:

```bash
./run_test.sh -a
```

Run client-side tests:

```bash
./run_test.sh -c
```

Run server-side tests:

```bash
./run_test.sh -p
```

Run webdriver tests:

```bash
./run_test.sh -w
```

Update React test snapshots:

```bash
cd static
npm test . -- -u
```

The website is deployed in a Kubernetes cluster. A deployment contains the following containers:
- website: A Flask app with static files compiled by Webpack.
- mixer: A Data Commons API server.
- esp: Google Extensible Service Proxy, used for endpoints management.
The code for mixer lives in our mixer repo and is included in website as a submodule. We read mixer's deployment info from the submodule.
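The three-container layout described above can be sketched as a Kubernetes pod spec. This is purely illustrative; the real manifests live under `deploy/` and `gke/`, and the container/image names here are assumptions for the sketch:

```yaml
# Illustrative sketch only, not the actual Helm chart in deploy/.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: website
spec:
  template:
    spec:
      containers:
        - name: website   # Flask app serving Webpack-compiled static files
          image: datacommons-website:dev-<git-hash>
        - name: mixer     # Data Commons API server, built from the mixer submodule
          image: datacommons-mixer:dev-<mixer-git-hash>
        - name: esp       # Extensible Service Proxy for endpoints management
          image: gcr.io/endpoints-release/endpoints-runtime:2
```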
Note: This section covers deploying the full website stack (Website + Mixer) to datcom-website-dev.
- If you need to deploy mixer changes to datcom-mixer-dev, please see mixer/deploy/README.md.
The deployment process involves:
- Building and pushing artifacts (Docker images) for Website and/or Mixer servers to Artifact Registry.
- Triggering a rollout via Google Cloud Deploy using those artifacts.
If you have website changes, commit them locally. Then run:

```bash
gcloud auth login
gcloud auth configure-docker
# Builds and pushes the website image to Artifact Registry
./scripts/push_image.sh datcom-ci DEV
```

- This will push the `datacommons-website`, `datacommons-nl`, and `datacommons-nodejs` images tagged with `dev-<git-hash>` (e.g., `dev-72c634f`).
- Note: This script does not push a mixer image.
- Check for the image in Artifact Registry (`datacommons-website`).
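The tag format can be reconstructed by hand. A hedged sketch, using the example hash from above (in a real checkout you would derive it with `git rev-parse --short HEAD`):

```shell
# Illustrative: rebuild the dev image tag from a short git hash.
short_hash="72c634f"            # real usage: short_hash=$(git rev-parse --short HEAD)
tag="dev-${short_hash}"
echo "$tag"                     # dev-72c634f
```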
Standard Case: Use an existing image tag available in Artifact Registry.

Alternative: If you need to incorporate local mixer changes:

- Server Code Change: Push the mixer image to Artifact Registry. In your fork of the `mixer` repo, run:

  ```bash
  # in mixer repo
  ./scripts/push_image.sh datcom-ci DEV
  ```

  This will push an image tagged with `dev-<mixer-git-hash>`.

- Deployment Change: If you have modified deployment configurations (e.g., `deploy/helm_charts/values.yaml`, `deploy/helm_charts/envs/*.yaml`), you MUST pull these changes into the `website` repository prior to deploying. Update your local `website` repo's mixer submodule to point to your local `mixer` commit.
Once you have your hashes, run the cloud deploy script:

```bash
# Set your hashes (include the "dev-" prefix)
# Example: website_hash="dev-72c634f"
website_hash=
mixer_hash=

# Deploy BOTH Website and Mixer to datcom-website-dev using the datacommons-website-dev delivery pipeline
./scripts/deploy_website_cloud_deploy.sh $website_hash $mixer_hash datacommons-website-dev
```

The `deploy_website_cloud_deploy.sh` script creates a new release in Google Cloud Deploy using the specified image tags. It does not build images locally; it deploys the already-pushed artifacts to the GKE dev instance.
Images tagged with "dev-" will not be picked up by our CI/CD pipeline for autodeployment.
View the deployment at link.
Monitor rollout progress at: Cloud Deploy Delivery Pipeline
Force-stopping a deployment can leave secrets stuck in a pending/upgrading state and block future dev deployments by Helm. Run the commands below to find the blocking releases:

```bash
helm history --max 20 dc-website
helm history --max 20 dc-mixer
```

Then roll back to the previous version:

```bash
helm rollback <RELEASE_NAME> [REVISION]
```

After the rollback, deployment can proceed again.
The autopush instance (autopush.datacommons.org) always has the latest code and data. To get the same in another dev/demo instance, simply run the following from a clean git checkout:

```bash
./script/deploy_latest.sh <ENV_NAME> <REGION>
```
- [Optional] Update variables in the `env` of the `Flask` configuration in `.vscode/launch.json` as needed.
- In the left-hand menu of VS Code, click on "Run and Debug".
- At the top of the "Run and Debug" pane, select "DC Website Flask" and click the green "Play" button.
- In the "DEBUG CONSOLE" (not "TERMINAL"), check that the server logs show up.

This brings up the Flask server under the debugger. Now you can set breakpoints and inspect variables from the debugger pane.

TIP: you can inspect variables at the bottom of the "DEBUG CONSOLE" window.

A full tutorial on debugging a Flask app in Visual Studio Code is here.
Feature flags are used to gate the rollout of features, and can easily be turned on/off in various environments. Please read the Feature Flags guide.
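The core idea can be sketched in a few lines of Python. This is not the website's actual feature-flag API (see the Feature Flags guide for that); `load_flags` and the JSON shape below are purely illustrative:

```python
import json

def load_flags(config_json: str) -> dict:
    """Parse an environment's flag config into {flag_name: enabled}.

    Illustrative only: the real flag config format lives in the repo.
    """
    return {
        name: bool(spec.get("enabled", False))
        for name, spec in json.loads(config_json).items()
    }

# A flag is simply on or off per environment, so gating a feature is one lookup.
flags = load_flags('{"new_chart_ui": {"enabled": true}, "beta_search": {"enabled": false}}')
if flags["new_chart_ui"]:
    pass  # render the gated feature
```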
- Update `server/config/chart_config/<category>.json` with the new chart:

  ```jsonc
  {
    "category": "",      // The top-level category this chart belongs to. Order of charts in the spec matters.
    "topic": "",         // Strongly encouraged - a page-level grouping for this chart.
    "titleId": "",       // Strictly for translation purposes.
    "title": "",         // Default (EN) display string.
    "description": "",   // Strictly for translation purposes.
    "statsVars": [""],   // List of stat vars to include in the chart.
    "isOverview": true,  // Optional - default false. If the chart should be added to the overview page.
    "isChoropleth": true, // Optional - default false. If a map should be used to display the data.
    "unit": "",
    "scaling": 100,
    "relatedChart": {
      // Defined if comparison charts should be added.
      // All chart fields from above can be specified; if unspecified, they are inherited.
    }
  }
  ```
- Update related files:
  - If adding a new category, create a new config file in `server/chart_config` and add the new category to:
  - If a new stat var is introduced, also update:
    - Labels that appear as chips under comparison charts: `static/js/i18n/strings/en/stats_var_labels.json`
    - Titles on ranking pages: `static/js/i18n/strings/en/stats_var_titles.json`
    - New stat vars which have not been cached: `NEW_STAT_VARS`
  - If a new unit is required, update:
    - `static/js/i18n/i18n.tsx`
    - `static/js/i18n/strings/*/units.json` (with display names and labels for the unit in ALL languages)

  Note: Please add very detailed descriptions to guide our translators. See localization.md for more details.
- Run these commands:

  ```bash
  ./scripts/extract_messages.sh
  ./scripts/compile_messages.sh
  ```

- IMPORTANT: Manually restart Flask to reload the config and translations. Most likely, this means re-running `run_server.sh`.
- Test the data on a place page!
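As a concrete illustration of the chart config schema above, here is a hypothetical filled-in entry. Every value (category, title, titleId, stat var name) is invented for illustration and does not come from a real config file:

```json
{
  "category": "Health",
  "topic": "Life Expectancy",
  "titleId": "CHART_TITLE-Life_expectancy",
  "title": "Life expectancy",
  "description": "Average number of years a newborn is expected to live.",
  "statsVars": ["LifeExpectancy_Person"],
  "isOverview": true,
  "relatedChart": {
    "title": "Life expectancy compared with other places"
  }
}
```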
For detailed debugging instructions (disabling headless mode, screenshots, flakiness), see the WebDriver Testing Guide.
The GKE configuration is stored here.
Redis memcache is used for production deployment. Each cluster has a Redis instance located in the same region.
To test .yaml cloudbuild files, you can use cloud-build-local to dry run the file before actually pushing. Find documentation for how to install and use cloud-build-local here.
The Data Commons site makes use of Material Design icons. In certain cases, font-based Material Design icon usage can result in flashes of unstyled content that can be avoided by using SVG icons.
We have provided tools to facilitate the creation and use of Material SVG icons in both Jinja templates and React components. For instructions on how to generate and use these SVGs and components, please see the Icon Readme.