@mgithub40

Dr. Alex Reinhart,

As discussed in pull request #384, COVID mobility data can now be used as an API source. I have committed all the required files. Could you please review them and share your feedback?

cmu-delphi/covidcast#384

Adding COVID mobility data
Adding COVID mobility data insertion files
Adding COVID mobility test file
Adding COVID mobility data table creation script
Adding COVID mobility source
Add COVID Mobility API
Add COVID mobility API
Add COVID mobility API
Add COVID mobility API
@capnrefsmmat
Contributor

Thanks for this! This is quite a lot of work, and I'm impressed you were able to find your way around the codebase quickly.

At this point we have a few design decisions to make before we integrate this code. First is our prioritization; once we add these indicators, we'll need to maintain them and run them regularly, which will require devoting some resources. @krivard and the engineering team will need to discuss this.

Second, it looks like you've added new Epidata endpoints, and you store county-level data for Google and some similar level (county_and_city?) for Apple. Is that right? Are the counties encoded by name, by FIPS code, or by some other identifier? Because if counties are available by FIPS code, another good option is to format the data in the way expected by the covidcast acquisitions code. In our covidcast-indicators repository are several examples of codebases that do this.

The advantage here is that covidcast-indicators contains a utility module that can, for instance, take county-level data and produce aggregates for states, metropolitan statistical areas, hospital referral regions, HHS regions, and so on. Using the covidcast system would also allow the R and Python covidcast packages to fetch the data easily, and since you've already done the hard work of extracting the data, maybe the code wouldn't be too difficult. (Depends on how closely their geographic coding matches ours, I think.)
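To make the aggregation idea concrete, here is a minimal sketch of rolling county-level values up to states. It is not the covidcast-indicators utility module, whose actual interface is not shown in this thread. The function name `aggregate_to_state`, the choice of a population-weighted mean, and all values and population figures are assumptions made up for illustration. The one fact it relies on is that the first two digits of a five-digit county FIPS code identify the state.

```python
from collections import defaultdict


def aggregate_to_state(county_values, county_population):
    """Population-weighted mean of county values, grouped by state.

    Both arguments are dicts keyed by 5-digit county FIPS strings.
    Counties without a positive population weight are skipped.
    (Hypothetical helper, not part of any Delphi package.)
    """
    weighted_sum = defaultdict(float)
    total_population = defaultdict(float)
    for fips, value in county_values.items():
        population = county_population.get(fips)
        if population is None or population <= 0:
            continue
        state_fips = fips[:2]  # first two digits identify the state
        weighted_sum[state_fips] += value * population
        total_population[state_fips] += population
    return {
        state: weighted_sum[state] / total_population[state]
        for state in weighted_sum
    }


# Illustrative inputs only: the values and populations are invented.
county_values = {"42003": 10.0, "42101": 20.0, "06037": -5.0}
county_population = {"42003": 100, "42101": 300, "06037": 50}
print(aggregate_to_state(county_values, county_population))
# → {'42': 17.5, '06': -5.0}
```

Keeping FIPS codes as zero-padded strings matters here: storing them as integers would turn `06037` into `6037` and break the two-digit state prefix.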

If you'd like to go that route, it'd be a good idea to coordinate with @krivard and the engineering team so they can guide you to the best way to implement this code so we can integrate it.

@mgithub40
Author

Dr. Alex Reinhart,

Thank you for the reviews and your quick response.
I will update the tables to include the FIPS code for each county so that the data can be used by covidcast.
I will also connect with @krivard and the engineering team to integrate this code.

Added fips code to Google and Apple mobility table
Added fips code to Apple and Google mobility data
Added fips code to Apple mobility data
fips code value for each county
fips code value for each county
Added fips code for Google mobility data
Added fips code to Apple and Google mobility data
Added fips code to test cases
@mgithub40
Author

Dr. Alex Reinhart,

The FIPS code for each county has been added to the Apple and Google mobility data. Could you please review the changes? I am also planning to add vaccination data by creating a data source to capture vaccination details, but it may not contain county-level detail. Would that be fine, Dr. Alex Reinhart?

Dr. Katie Mazaitis, could you please let me know how we can integrate the mobility data with the main stream? I would appreciate your assistance.

@krivard
Contributor

krivard commented Jan 11, 2021

Hello, I agree with Alex that we will want to approach this as a covidcast indicator and not its own Epidata endpoint. I am sorry to be giving you the runaround -- answers to the questions on the original PR would have helped us give you better instructions.

Covidcast indicator pipelines are hosted in this repository: https://github.com/cmu-delphi/covidcast-indicators

The rough requirements of an indicator are:

  • it must be a python module
  • it must run using python -m [module-name] (so, no arguments)
  • it must be configured (if needed) using a file called params.json located in the working directory
  • it must use pytest for testing
  • it must not attempt to read or write the epidata database directly
  • it must generate output files in COVIDcast CSV format
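The requirements above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the template module's actual code: the `export_dir` key in params.json, the function names, the signal name `mobility_example`, and the stubbed data are all hypothetical. The file naming (`YYYYMMDD_geo_signal.csv`) and the `geo_id,val,se,sample_size` columns reflect my understanding of the COVIDcast CSV format and should be checked against the covidcast-indicators documentation before use.

```python
import csv
import json
from datetime import date
from pathlib import Path


def load_params(path="params.json"):
    """Read configuration from a JSON file in the working directory."""
    with open(path) as f:
        return json.load(f)


def fetch_values():
    """Stub standing in for the real download step (invented data)."""
    return {"42003": 17.5}


def write_covidcast_csv(export_dir, day, geo_type, signal, values):
    """Write one CSV for one day, geography type, and signal.

    Assumed layout: <YYYYMMDD>_<geo>_<signal>.csv with the columns
    geo_id, val, se, sample_size; unknown fields are written as NA.
    """
    out_path = Path(export_dir) / f"{day:%Y%m%d}_{geo_type}_{signal}.csv"
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["geo_id", "val", "se", "sample_size"])
        for geo_id, val in sorted(values.items()):
            writer.writerow([geo_id, val, "NA", "NA"])
    return out_path


def run(params_path="params.json"):
    """Entry point: takes no command-line arguments, only the params file."""
    params = load_params(params_path)
    export_dir = Path(params["export_dir"])  # hypothetical key name
    export_dir.mkdir(parents=True, exist_ok=True)
    return write_covidcast_csv(
        export_dir, date(2021, 1, 11), "county", "mobility_example",
        fetch_values(),
    )
```

In a package, a `__main__.py` that simply calls `run()` would let `python -m [module-name]` start the pipeline without arguments. Because the sketch only writes files and never touches the epidata database, a pytest test can point `run` at a temporary params.json and assert on the CSV it produces.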

There is a template module there: https://github.com/cmu-delphi/covidcast-indicators/tree/main/_template_python

You can also look at other CSV-reading indicators, such as jhu and usafacts, for examples.

Before we accept your PR and deploy the new indicator, we will also want to know:

  • How often does the pipeline need to run?
  • How many regions are updated each time, on average?
  • Are the values provided in the file downloaded on day X ever updated on subsequent days? What is the largest expected delay between when a value is first published and the final time it is updated? See also our documentation on date coding and revisions.

Feel free to email me ([email protected]) if you want to discuss further.

@mgithub40
Author

Dr. Alex Reinhart and Dr. Katie Mazaitis

Thank you for the review and guidance. I will add the mobility data as covidcast indicators.
