Add Texas and improve documentation for the process (3SP) #195

Open
ryanrolds opened this issue Aug 24, 2019 · 3 comments
Labels
enhancement New feature or request

Comments

@ryanrolds
Contributor

ryanrolds commented Aug 24, 2019

Is your feature request related to a problem? Please describe.
We need to add states gradually and monitor performance. It's been a month since we added WA and ID, and we need to maintain the pace. At this time we have only a very basic section on loading boundaries in our vector tileset documents.

Describe the solution you'd like

  • Get the Census Tract and Census Block boundaries for Montana
  • Create the JSON files following the Boundaries section of the document above
  • Get the JSON files uploaded to public S3 (talk to Ryan or Diego)
  • Load the MT boundaries into your local environment (a rake task does this)
  • Modify existing submissions to be in Montana and confirm they show on the map
  • Use the sensor settings in the Chrome dev tools to put you in MT and run a speed test
  • Confirm it geocoded you correctly and your data shows up (may need to run stats cache rake task)

Additional context
If you have any questions, ask Ryan. You shouldn't need the BigQuery service key, as you won't be loading data from M-Lab; that will be another ticket.

@ryanrolds ryanrolds added the enhancement label Aug 24, 2019
@mattsayre mattsayre changed the title from "Add Montana and improve documentation for the process (2SP)" to "Add Mississippi and improve documentation for the process (2SP)" Aug 24, 2019
@mattsayre
Collaborator

mattsayre commented Aug 24, 2019

After talking with Ryan, we have identified an opportunity to prioritize Mississippi. The reason is that US Senator Roger Wicker heads the Senate Committee where broadband mapping is being actively discussed ahead of federal legislation.

@mattsayre
Collaborator

2021 update: Texas is interested in being the next state to be added to the national map.

@mattsayre mattsayre changed the title from "Add Mississippi and improve documentation for the process (2SP)" to "Add Texas and improve documentation for the process (3SP)" Mar 3, 2021
@webaissance
Collaborator

webaissance commented Sep 29, 2021

Hi @mattsayre and @ryanrolds and team,
I want to let you know I've been pursuing this task - to add Texas - steadily for more than a week - and I'm making progress.

I think there are a few challenges that are making it take a while - but I'm thinking of a strategy to expedite adding states moving forward.
The main challenge is that there is a LOT of data - and as we add states the data keeps growing and growing.
Per the document linked above on Vector Tilesets and Boundaries, I've been working with the populate_boundaries.rake method. While it's a cool method, it hasn't been working very well for me.
What has worked better for me is loading a dataset from a .sql file, such as sua_20191022.sql and sua_lane_20191022.sql.

So I have an idea - which is to create a rake task that generates a set of .sql files - one for each state we want to add, plus a few more for the ancillary data that's needed, such as zips, counties, etc. - and then build another rake task that would load all of these .sql files into the db.

There are a few advantages to this. For one thing, we could inspect the .sql files and make sure the data is all there and formatted correctly. Each one could be reused - so the rake task would only have to be run once per state - and it could be updated in future years as needed. It would also be more robust, because the current populate_boundaries method can break, which breaks the whole process.
The .sql file for each state would be a manageable size - whereas even the current combined file sua_20191022.sql of Oregon, Washington and Idaho (referenced in README.md) is too large to be managed well - as it is 4.3 GB in size.
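
To make the per-state idea concrete, here is a rough sketch of how the loader could decide which files to load and check that they are present before touching the db. Everything named here is hypothetical: the `db/boundaries` directory, the `state_<abbr>.sql` naming, the `zips`/`counties` file names, and the choice to load ancillary files first are assumptions for illustration, not anything that exists in the repo today.

```ruby
# Hypothetical ancillary datasets shared by all states (names are assumptions).
ANCILLARY = %w[zips counties].freeze

# Returns the .sql file paths in load order: ancillary files first, then one
# file per state. Directory and naming scheme are placeholders.
def sql_manifest(states, dir: "db/boundaries")
  names = ANCILLARY + states.map { |s| "state_#{s.downcase}" }
  names.map { |n| File.join(dir, "#{n}.sql") }
end

# Splits a manifest into [ready, missing]: a file counts as ready when it
# exists and is non-empty. This only checks presence, not SQL validity.
def check_manifest(paths)
  paths.partition { |p| File.file?(p) && File.size(p) > 0 }
end
```

A loading rake task could call `check_manifest(sql_manifest(%w[OR WA ID TX]))` and refuse to start if the second list is not empty, so one missing state file would not break a run halfway through.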

I would plan to use the 2021 census bureau data for the tract and tabblock files here:
https://www2.census.gov/geo/tiger/TIGER2021/TRACT/
https://www2.census.gov/geo/tiger/TIGER2021/TABBLOCK20/
(this document matches the numbers to the states)
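
For reference, a small sketch of how the download URLs for a given state could be built from those two directories. The numbers are the standard state FIPS codes (Texas is 48). The `tl_2021_<fips>_<layer>.zip` file naming is my assumption based on how the TIGER directories are usually laid out, so check it against the directory listings before relying on it; `tiger_urls` is a made-up helper name.

```ruby
# State FIPS codes for the states mentioned in this thread.
STATE_FIPS = {
  "OR" => "41", "WA" => "53", "ID" => "16",
  "MT" => "30", "MS" => "28", "TX" => "48"
}.freeze

TIGER_BASE = "https://www2.census.gov/geo/tiger/TIGER2021"

# Returns the tract and block shapefile URLs for a two-letter state
# abbreviation. The file name pattern is an assumption; verify it against
# the TRACT/ and TABBLOCK20/ directory listings.
def tiger_urls(state)
  fips = STATE_FIPS.fetch(state.upcase) do
    raise ArgumentError, "unknown state: #{state}"
  end
  {
    tract: "#{TIGER_BASE}/TRACT/tl_2021_#{fips}_tract.zip",
    block: "#{TIGER_BASE}/TABBLOCK20/tl_2021_#{fips}_tabblock20.zip"
  }
end
```

Calling `tiger_urls("tx")` gives the two Texas URLs, and an abbreviation that isn't in the table raises an `ArgumentError` rather than building a bad URL.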

So I want to let you know that I'm making progress and also propose this plan of action. Let me know your feedback on this plan. I will start building it in the meantime - and make adjustments based on your feedback.
-Dave @webaissance
