- Code available in the GitHub [repo](https://github.com/walkerke/umich-workshop-2024/blob/main/census-2020/bonus-chart.R) or R/Workshops/tidycensus-umich-workshop-2024-main/census-2020/bonus-chart.R
2176
2176
- The shape of the distribution stays roughly the same, but counts decrease for most age cohorts, i.e. people are leaving the state across most age groups.
2177
2177
- e.g. The large hump representing the group of people in their mid-40s in 2000 steadily decreases over time.
2178
2178
2179
-
- [Example]{.ribbon-highlight}: Compare 2010 to 2020 Population Densities for Dallas-Ft. Worth\
2179
+
- [Example 3]{.ribbon-highlight}: Compare 2010 to 2020 Population Densities for Dallas-Ft. Worth\
- [Example]{.ribbon-highlight}: Compare 2022 5-Year ACS to the 2017 5-Year ACS
2291
+
- [Example 4]{.ribbon-highlight}: Compare 2022 5-Year ACS to the 2017 5-Year ACS (*County Level*)
2292
2292
2293
-
- County Level
2293
+
``` r
2294
2294
2295
-
``` r
2295
+
utah_wfh_compare <- get_acs(
2296
+
geography = "county",
2297
+
variables = c(
2298
+
work_from_home17 = "CP03_2017_024",
2299
+
work_from_home22 = "CP03_2022_024"
2300
+
),
2301
+
state = "UT",
2302
+
year = 2022
2303
+
)
2304
+
```
2296
2305
2297
-
utah_wfh_compare <- get_acs(
2298
-
geography = "county",
2299
-
variables = c(
2300
-
work_from_home17 = "CP03_2017_024",
2301
-
work_from_home22 = "CP03_2022_024"
2302
-
),
2303
-
state = "UT",
2304
-
year = 2022
2305
-
)
2306
-
```
2306
+
- The Comparison Profile dataset has aggregated statistics to compare between ACS 5-Year surveys (See [tidycensus \>\> Variables](surveys-census-data.qmd#sec-surv-cens-tidyc-vars){style="color: green"} \>\> Search Variables)
2307
+
- This dataset only goes down to the county level
2308
+
2309
+
- [Example 5]{.ribbon-highlight}: Compare 2022 5-Year ACS to the 2017 5-Year ACS (*Tract Level*)
2307
2310
2308
-
- The Comparison Profile dataset has aggregated statistics to compare between ACS 5-Year surveys (See [tidycensus \>\> Variables](surveys-census-data.qmd#sec-surv-cens-tidyc-vars){style="color: green"} \>\> Search Variables)
2311
+
- There are two methods to calculate change at the census tract level
2309
2312
2310
-
- This dataset only goes down to the county level
2313
+
- Interpolate data from 2022 boundaries to 2017 boundaries. Then calculate change.
2314
+
- Interpolate data from 2017 boundaries to 2022 boundaries. Then calculate change
2311
2315
2312
-
- Census Tract Level
2316
+
- Data\
2317
+
The data are the number of remote workers by census tract in Salt Lake County (which contains Salt Lake City) for the 2013-2017 and 2018-2022 periods
2313
2318
2314
2319
``` r
2315
2320
library(sf)
@@ -2332,3 +2337,98 @@ lightbox:
2332
2337
geometry = TRUE) |>
2333
2338
st_transform(6620)
2334
2339
```
2340
+
2341
+
- The process is quicker on a projected coordinate system
2342
+
- [EPSG:6620](https://epsg.io/6620) is NAD83(2011) / Utah North
2343
+
2344
+
- 2022 to 2017 Boundaries\
2345
+
{.lightbox group="tsa-ex5-1" width="582"}
- **Area-Weighted Interpolation** allocates information from one geography to another geography by weights based on the area of overlap ([Walker, Ch. 7.3.1](https://walker-data.com/census-r/spatial-analysis-with-us-census-data.html?q=small#area-weighted-areal-interpolation))
2363
+
- Typically more accurate when going *backward* (aka rolling backwards), as many new tracts will “roll up” within parent tracts from a previous Census (though not always)
2364
+
- The book has an example that rolls *forwards* from 2015 to 2020.
2365
+
- Beware: This may be very inaccurate because it assumes that population is evenly distributed over area. It can incorrectly allocate large values to low-density / empty areas.
2366
+
- Better to use Population-Weighted Areal Interpolation
2367
+
- The 2022 data is weighted and "rolled" into 2017 census tract boundaries.
2368
+
- [extensive = TRUE]{.arg-text} says weighted sums will be computed. Alternatively, if [extensive = FALSE]{.arg-text}, the function returns weighted means.
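To make the arithmetic behind area weighting concrete, here is a minimal base-R sketch on a made-up overlap table. The tract labels, areas, and values are all hypothetical, and no spatial function is called; a real interpolation works on geometries and may differ in detail. It only illustrates what a weighted sum ([extensive = TRUE]{.arg-text}) versus a weighted mean ([extensive = FALSE]{.arg-text}) means.

``` r
# Hypothetical overlap table: each row is the intersection of one source
# tract with one target tract (all numbers are made up for illustration).
overlaps <- data.frame(
  source       = c("A", "A", "B"),
  target       = c("X", "Y", "Y"),
  overlap_area = c(3, 1, 2),      # area of the intersection
  source_value = c(100, 100, 60)  # value reported for the source tract
)

# Total area of each source tract, recovered from its pieces
source_area <- tapply(overlaps$overlap_area, overlaps$source, sum)

# Extensive case (counts): split each source value in proportion to the
# share of the source tract's area that falls in each target, then sum.
share <- overlaps$overlap_area /
  as.numeric(source_area[as.character(overlaps$source)])
extensive_sum <- tapply(overlaps$source_value * share, overlaps$target, sum)

# Intensive case (rates, medians, ...): area-weighted mean of the source
# values that overlap each target tract.
intensive_mean <-
  tapply(overlaps$source_value * overlaps$overlap_area, overlaps$target, sum) /
  tapply(overlaps$overlap_area, overlaps$target, sum)

print(extensive_sum)
print(intensive_mean)
```

In the extensive case the total is preserved (the target values add up to the 160 reported across the two source tracts). Tract A's value is split 3:1 purely by area, which is exactly the even-distribution assumption the warning above refers to.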
2369
+
2370
+
- 2017 to 2022 Boundaries\
2371
+
{.lightbox group="tsa-ex5-1" width="582"}
2372
+
2373
+
``` r
2374
+
library(tigris)
2375
+
options(tigris_use_cache = TRUE)
2376
+
2377
+
salt_lake_blocks <-
2378
+
tigris::blocks(
2379
+
"UT",
2380
+
"Salt Lake",
2381
+
year = 2020
2382
+
)
2383
+
2384
+
wfh_17_to_22 <-
2385
+
tidycensus::interpolate_pw(
2386
+
from = wfh_17,
2387
+
to = wfh_22,
2388
+
to_id = "GEOID",
2389
+
weights = salt_lake_blocks,
2390
+
weight_column = "POP20",
2391
+
crs = 6620,
2392
+
extensive = TRUE
2393
+
)
2394
+
2395
+
# check result
2396
+
m17b <-
2397
+
mapview(wfh_17,
2398
+
zcol = "estimate",
2399
+
layer.name = "2017 geographies")
2400
+
m22b <-
2401
+
mapview(wfh_17_to_22,
2402
+
zcol = "estimate",
2403
+
layer.name = "2022 geographies")
2404
+
2405
+
sync(m17b, m22b)
2406
+
```
2407
+
2408
+
- **Population-Weighted Interpolation** uses an underlying dataset that explains the population distribution as weights.
2409
+
- It is recommended to use census block-level data to create the weights. The ACS only has geographies down to the block group level, so Decennial Census values are used.
2410
+
- `blocks` gets the 2020 Decennial population values at the census block level to calculate the weights
2411
+
- `interpolate_pw` creates weights based on the 2020 census block populations. Then it uses those weights to reallocate the 2017 data to the 2022 geographies.
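The idea of population weighting can be sketched with plain arithmetic. The base-R toy below uses made-up blocks and populations (the `pop` column only plays the role that `POP20` plays above) and assumes each block sits inside exactly one source tract and one target tract. It is an illustration of the weighting logic under those assumptions, not a description of how `interpolate_pw` is implemented.

``` r
# Hypothetical census blocks: which source (old) and target (new) tract
# each block falls in, plus its population. All numbers are made up.
blocks_toy <- data.frame(
  source = c("A", "A", "A", "B"),
  target = c("X", "X", "Y", "Y"),
  pop    = c(90, 0, 10, 50)
)

# Hypothetical counts reported for the source tracts
source_value <- c(A = 100, B = 60)

# Weight of each block = its share of the source tract's population
source_pop <- tapply(blocks_toy$pop, blocks_toy$source, sum)
src <- as.character(blocks_toy$source)
blocks_toy$weight <- blocks_toy$pop / as.numeric(source_pop[src])

# Allocate the source count to blocks, then sum the blocks by target tract
blocks_toy$allocated <- blocks_toy$weight * as.numeric(source_value[src])
pw_sum <- tapply(blocks_toy$allocated, blocks_toy$target, sum)

print(pw_sum)
```

The block with zero population receives nothing, however large its area might be, which is the main advantage over area weighting for unevenly settled tracts.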
2412
+
2413
+
- Calculate Change\
2414
+
{.lightbox group="tsa-ex5-1" width="582"}
2415
+
2416
+
``` r
2417
+
wfh_shift <- wfh_17_to_22 %>%
2418
+
select(GEOID, estimate17 = estimate) %>%
2419
+
left_join(
2420
+
select(st_drop_geometry(wfh_22),
2421
+
GEOID,
2422
+
estimate22 = estimate),
2423
+
by = "GEOID"
2424
+
) |>
2425
+
mutate(
2426
+
shift = estimate22 - estimate17,
2427
+
pct_shift = 100 * (shift / estimate17)
2428
+
)
2429
+
2430
+
mapview(wfh_shift, zcol = "shift")
2431
+
```
2432
+
2433
+
- Uses the 2017 data that's been interpolated to 2022 census tract boundaries.
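One thing to watch in the percent-change step: a tract with an estimate of 0 in the earlier period makes `100 * (shift / estimate17)` evaluate to `Inf` or `NaN`. A minimal base-R sketch of one possible guard is shown below; the toy data frame and its values are hypothetical, and returning `NA` for those tracts is just one reasonable choice.

``` r
# Hypothetical tract estimates (made-up values) with zero baselines
wfh_toy <- data.frame(
  GEOID      = c("t1", "t2", "t3"),
  estimate17 = c(200, 0, 0),
  estimate22 = c(260, 15, 0)
)

# Absolute change is always defined
wfh_toy$shift <- wfh_toy$estimate22 - wfh_toy$estimate17

# Percent change is only defined when the baseline is positive;
# otherwise return NA instead of Inf / NaN
wfh_toy$pct_shift <- ifelse(
  wfh_toy$estimate17 > 0,
  100 * wfh_toy$shift / wfh_toy$estimate17,
  NA_real_
)

print(wfh_toy)
```

Mapping `shift` rather than `pct_shift`, as the chunk above does, sidesteps the issue entirely.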
scrapsheet.qmd: 62 additions & 26 deletions
@@ -640,7 +640,9 @@ title: "Scrapsheet"
640
640
641
641
- get more details
642
642
643
-
- Aerial Interpolation (see [book ](https://walker-data.com/census-r/spatial-analysis-with-us-census-data.html?q=small#small-area-time-series-analysis)for more details)
643
+
- Areal Interpolation (see [book](https://walker-data.com/census-r/spatial-analysis-with-us-census-data.html?q=small#small-area-time-series-analysis) for more details)
644
+
645
+
- Interpolating data between sets of boundaries involves the use of weights to re-distribute data from one geography to another
644
646
645
647
- Check for incongruent boundaries
646
648
@@ -666,55 +668,79 @@ title: "Scrapsheet"
666
668
st_transform(6620)
667
669
```
668
670
669
-
- Process is quicker on a projected coordinated system
671
+
- The process is quicker on a projected coordinate system
670
672
671
673
- [EPSG:6620](https://epsg.io/6620) is NAD83(2011) / Utah North
672
674
673
-
- get details on how he found incongruent boundaries
- **Area-Weighted Interpolation** allocates information from one geography to another geography by weights based on the area of overlap ([Walker, Ch. 7.3.1](https://walker-data.com/census-r/spatial-analysis-with-us-census-data.html?q=small#area-weighted-areal-interpolation))
693
+
- Typically more accurate when going *backward* (aka rolling backwards), as many new tracts will “roll up” within parent tracts from a previous Census (though not always)
694
+
- The book has an example that rolls *forwards* from 2015 to 2020.
695
+
- Beware: This may be very inaccurate because it assumes that population is evenly distributed over area. It can incorrectly allocate large values to low-density / empty areas.
696
+
- Better to use Population-Weighted Areal Interpolation
697
+
- [extensive = TRUE]{.arg-text} says weighted sums will be computed. Alternatively, if [extensive = FALSE]{.arg-text}, the function returns weighted means.
698
+
699
+
- Population-Weighted Areal Interpolation
690
700
691
701
``` r
692
702
library(tigris)
693
703
options(tigris_use_cache = TRUE)
694
704
695
-
salt_lake_blocks <- blocks(
696
-
"UT",
697
-
"Salt Lake",
698
-
year = 2020
699
-
)
700
-
701
-
wfh_17_to_22 <- interpolate_pw(
702
-
from = wfh_17,
703
-
to = wfh_22,
704
-
to_id = "GEOID",
705
-
weights = salt_lake_blocks,
706
-
weight_column = "POP20",
707
-
crs = 6620,
708
-
extensive = TRUE
709
-
)
705
+
salt_lake_blocks <-
706
+
tigris::blocks(
707
+
"UT",
708
+
"Salt Lake",
709
+
year = 2020
710
+
)
711
+
712
+
wfh_17_to_22 <-
713
+
tidycensus::interpolate_pw(
714
+
from = wfh_17,
715
+
to = wfh_22,
716
+
to_id = "GEOID",
717
+
weights = salt_lake_blocks,
718
+
weight_column = "POP20",
719
+
crs = 6620,
720
+
extensive = TRUE
721
+
)
722
+
723
+
# check result
724
+
# m17b <-
725
+
# mapview(wfh_17,
726
+
# zcol = "estimate",
727
+
# layer.name = "2017 geographies")
728
+
# m22b <-
729
+
# mapview(wfh_17_to_22,
730
+
# zcol = "estimate",
731
+
# layer.name = "2022 geographies")
732
+
#
733
+
# sync(m17b, m22b)
710
734
711
735
# calculate change over time
712
736
wfh_shift <- wfh_17_to_22 %>%
713
737
select(GEOID, estimate17 = estimate) %>%
714
738
left_join(
715
739
select(st_drop_geometry(wfh_22),
716
-
GEOID, estimate22 = estimate), by = "GEOID"
717
-
) %>%
740
+
GEOID,
741
+
estimate22 = estimate),
742
+
by = "GEOID"
743
+
) |>
718
744
mutate(
719
745
shift = estimate22 - estimate17,
720
746
pct_shift = 100 * (shift / estimate17)
@@ -723,6 +749,16 @@ title: "Scrapsheet"
723
749
mapview(wfh_shift, zcol = "shift")
724
750
```
725
751
752
+
- **Population-Weighted Interpolation** uses an underlying dataset that explains the population distribution as weights.
753
+
754
+
- It is recommended to use census block-level data to create the weights. The ACS only has geographies down to the block group level, so Decennial Census values are used.
755
+
756
+
- `blocks` gets the 2020 Decennial population values at the census block level to calculate the weights
757
+
758
+
- `interpolate_pw` creates weights based on the 2020 census block populations. Then it uses those weights to reallocate the 2017 data to the 2022 geographies.
759
+
760
+
- The 2022 data is joined to the new 2017 data and percent-change can now be calculated since both have 2022 geometries.