Updated place-level arrest file and prepared for first review #466

Open

vpancini wants to merge 1 commit into version2025 from iss411b

+1,280 −0

Collaborator

vpancini commented Feb 25, 2025

**Mobility metric pull request template**

**Please include the following points in your PR:**

1) A link to the issue that this PR relates to: #411

**2) A description of the content in this pull request.**

**What was changed?**

The 2022 metric update included two files: 01_agency_geo_place.qmd and juvenile-arrests-place-all.qmd. I merged them into a single file, juvenile-arrests-place-combined.qmd, so the workflow is easier to follow.

What should the reviewer be focusing on?

The overall logic of the code and the aggregation of counts from the agency level to the place level.

Is there a logical order to review the files in?

No, there is only one file.

Detail on any issues or flags that the metric reviewer/data-team should be aware of.
Rates for 2021 and 2022 may have changed because the underlying data have been updated


          Updated place-level arrest file and prepared for first review

ec89d8e

cdsolari requested a review from ridhi96

February 26, 2025 16:57

cdsolari assigned ridhi96

ridhi96 requested changes


ridhi96 left a comment

Hi @vpancini! I have added my review comments. Please let me know if you have any questions or concerns. Thank you!

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd

+                  ethnicity_of_arrestee
+                ) |>
+                mutate(
+                  age_of_arrestee = as.numeric(age_of_arrestee)

ridhi96 Mar 5, 2025

This introduces around 2769 new NA values. Do we know why? When I execute `any(arrests_a$age_of_arrestee %in% c("NA", "na", "n/a", "N/A"))` I get `FALSE`, so I'm not sure where the NAs are coming from.

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd

+                  ethnicity_of_arrestee
+                ) |>
+                mutate(
+                  age_of_arrestee = as.numeric(age_of_arrestee)

ridhi96 Mar 5, 2025

This introduces around 3832 new NA values. Do we know why? When I execute `any(arrests_b$age_of_arrestee %in% c("NA", "na", "n/a", "N/A"))` I get `FALSE`, so I'm not sure where the NAs are coming from.
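One way to trace the new NAs is to compare the column before and after coercion and tabulate the raw strings that fail to parse. The sketch below uses a made-up vector in place of `arrests_a$age_of_arrestee`; the sample strings are assumptions for illustration, not values confirmed in the data. The same comparison should apply to `arrests_b`.

```r
# Hypothetical stand-in for arrests_a$age_of_arrestee before coercion
age_raw <- c("17", "15", "unknown", "over 98", NA, "", "12")

# Coerce as the chunk under review does, silencing the coercion warning
age_num <- suppressWarnings(as.numeric(age_raw))

# TRUE where a non-missing raw value became NA during coercion
new_na <- !is.na(age_raw) & is.na(age_num)

# Tabulate the offending raw strings, most frequent first
na_sources <- sort(table(age_raw[new_na]), decreasing = TRUE)
print(na_sources)
```

If the table turns out to be dominated by a few text labels, those could be recoded explicitly before `as.numeric()` so the NAs are intentional rather than silent.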

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd


		Note that this metric includes the following subgroups by race, sex, and age subgroups, so those subgroups are also aggregated in this section:

		* Race: white, Black, Hispanic, Asian/other

ridhi96 Mar 5, 2025

Can we please add a check which shows the unique race-ethnicity and sex values that exist in the original data before grouping?
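A minimal sketch of such a check, assuming the raw records are in a data frame with the `race_of_arrestee` and `ethnicity_of_arrestee` columns from the file. The data frame name `arrests_raw`, the column name `sex_of_arrestee`, and the sample rows are hypothetical.

```r
library(dplyr)

# Hypothetical stand-in for the raw arrest records before grouping
arrests_raw <- tibble(
  race_of_arrestee = c("white", "black", "unknown", "white", "asian"),
  ethnicity_of_arrestee = c(
    "not hispanic or latino", "unknown", "hispanic or latino",
    "unknown", "not hispanic or latino"
  ),
  sex_of_arrestee = c("male", "female", "male", "male", "female")
)

# Every race-by-ethnicity combination present, with its record count
race_eth_values <- arrests_raw |>
  count(race_of_arrestee, ethnicity_of_arrestee, sort = TRUE)

# Every sex value present, with its record count
sex_values <- arrests_raw |>
  count(sex_of_arrestee, sort = TRUE)

print(race_eth_values)
print(sex_values)
```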

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd

+                group_by(year, ori) |>
+                summarize(
+                  arr_total_juv = n(),
+                  # Race subgroups

ridhi96 Mar 5, 2025

I don't feel very confident about the race subgroups because race and ethnicity values in the data are sometimes misaligned, e.g., race is "white" but ethnicity is "unknown" instead of "not hispanic or latino", or race is "unknown" but ethnicity is "hispanic or latino". It would make sense to explicitly define these categories, and how we want to code them for this metric, in text outside the code block before coding the race subgroups. It may make sense to use both the "race_of_arrestee" and "ethnicity_of_arrestee" variables to create the race-ethnicity subgroups.
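To make the discussion concrete, one possible coding rule is sketched below. The rule itself (Hispanic driven by ethnicity alone, race flags driven by race alone, and a flag for records that land in no subgroup) is an assumption put forward for discussion, not an agreed definition. The `arrests_raw` name and the sample rows are hypothetical.

```r
library(dplyr)

# Hypothetical records covering the misaligned cases described above
arrests_raw <- tibble(
  race_of_arrestee = c("white", "white", "unknown", "black", "unknown"),
  ethnicity_of_arrestee = c(
    "not hispanic or latino", "unknown", "hispanic or latino",
    "hispanic or latino", "unknown"
  )
)

arrests_coded <- arrests_raw |>
  mutate(
    # Hispanic is driven by ethnicity alone, whatever the race value
    hispanic = ethnicity_of_arrestee == "hispanic or latino",
    # Race flags are driven by race alone, so they can overlap with hispanic
    white = race_of_arrestee == "white",
    black = race_of_arrestee == "black",
    # Records that contribute to no subgroup under this rule
    uncoded = !hispanic & race_of_arrestee == "unknown"
  )

# Count each combination of flags to see how records are distributed
arrests_coded |> count(hispanic, white, black, uncoded)
```

Writing the rule as separate flags keeps overlapping membership visible (for example, a record flagged both `black` and `hispanic`), and the `uncoded` count shows how many records drop out of every subgroup.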

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd



		## 4.1 Instructions to manually download NIBRS Batch Header Segment
		REVIEWER - the storage of these files has changed on Harvard Dataverse since I downloaded them and wrote this section, but there has been an error, and the Batch Header File section now lists the Group B Arrest Report Files. These instructions will need to be updated next year once Jacob Kaplan fixes the file storage.

ridhi96 Mar 5, 2025

Please update the instructions for this segment!

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd

+              ### 5.1a  Check ACS variable names and identify those we need
+              Load ACS variables for our first and last years. Manually explore each file and spot check several observations. The naming conventions of ACS variables do not seem to change during our time period. Note that if they did we would need to split up the code that reads in the years below.
+              ```{r check-acs-variables, eval = FALSE}

ridhi96 Mar 5, 2025

Please briefly explain the ACS variables used.

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd

+                * Sex: male, female
+                * Age: ages 10-14, ages 15-17
+              Note that there is a Non-Hispanic white category, but not a Non-Hispanic category for other races (Black, Asian, etc.). This means that the categories will have a small but non-zero overlap. The race/ethnicity categories actually used in the `juvenile-arrests` metric are white, Black, Asian/other, and Hispanic. These are not mutually exclusive (e.g., someone could be counted as both Black and Hispanic), which matches the population denominators created here (this is because ACS doesn't have counts of non-Hispanic Black or other non-Hispanic races other than white).

ridhi96 Mar 5, 2025

Can we add some test for this?
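One form such a test could take is comparing the sum of the race/ethnicity denominators with the total population for each place and flagging places where the ratio falls outside an expected band. This is a sketch under assumptions: only `pop_1017` is a name from the file, while `pop_place`, the subgroup column names, the sample numbers, and the 1.25 tolerance are hypothetical. It also assumes the subgroups jointly cover the whole population, which should be confirmed first.

```r
library(dplyr)

# Hypothetical place-level denominators; only pop_1017 is named in the file
pop_place <- tibble(
  place = c("A", "B", "C"),
  pop_1017 = c(1000, 500, 200),
  pop_white = c(600, 200, 100),
  pop_black = c(250, 150, 60),
  pop_hispanic = c(150, 140, 90),
  pop_asian_other = c(50, 40, 10)
)

# Assumed upper bound on how much the overlapping subgroups may exceed the total
max_ratio <- 1.25

overlap_check <- pop_place |>
  mutate(
    subgroup_sum = pop_white + pop_black + pop_hispanic + pop_asian_other,
    ratio = subgroup_sum / pop_1017,
    # Below 1 means a gap in coverage; above max_ratio means large overlap
    flagged = ratio < 1 | ratio > max_ratio
  )

# Places that need a closer look
overlap_check |> filter(flagged)
```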

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd

+                mutate(
+                  # Total age 10-17
+                  pop_1017 = age_m_1014 + age_m_1517 + age_f_1014 + age_f_1517,
+                  # Race subgroups

ridhi96 Mar 5, 2025

The code would be clearer if we stated explicitly what the race subgroups encode, either in a comment or in text outside the code block.

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd


		```

		## 7. Validation

ridhi96 Mar 5, 2025

Can we see agency and place level distributions (visualization) of the crime data before any transformations are done?
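A minimal sketch of an agency-level distribution plot, assuming `ggplot2` is available in the project. The `year`, `ori`, and `arr_total_juv` names follow the summarize step in the file; the `agency_counts` data frame and its values are made up. A place-level version would follow the same pattern with the place identifier in place of `ori`.

```r
library(ggplot2)

# Hypothetical agency-level counts standing in for the untransformed data
agency_counts <- data.frame(
  year = rep(c(2021, 2022), each = 5),
  ori = rep(paste0("ORI", 1:5), times = 2),
  arr_total_juv = c(0, 3, 12, 45, 210, 1, 4, 9, 60, 180)
)

# Histogram of arrests per agency, one panel per year
p <- ggplot(agency_counts, aes(x = arr_total_juv)) +
  geom_histogram(bins = 10) +
  facet_wrap(~year) +
  labs(x = "Juvenile arrests per agency", y = "Number of agencies")

print(p)
```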

07_safety/juvenile-arrests/juvenile-arrests-place-combined.qmd

		```


		## 8. Save and write out data

ridhi96 Mar 5, 2025

This file should include a final data evaluation function before the data are written out. Please refer to the project Wiki on GitHub for more details. Also, please make sure to include places that should be in the urban universe but for which we couldn't calculate statistics. These should just have N/A values for the metric.
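For the second point, a left join from the full list of places onto the computed metric keeps every place and leaves `NA` where no statistic exists. The sketch below is illustrative only: `place_universe`, `metric`, `rate_juv_arrest`, and the sample rows are hypothetical names and values, and it does not stand in for the evaluation function described in the Wiki.

```r
library(dplyr)

# Hypothetical universe of places and hypothetical computed rates
place_universe <- tibble(
  state = c("01", "01", "02"),
  place = c("00100", "00200", "00300")
)
metric <- tibble(
  state = c("01", "02"),
  place = c("00100", "00300"),
  rate_juv_arrest = c(12.5, 8.1)
)

# Left join keeps every universe place; unmatched places get NA
final <- place_universe |>
  left_join(metric, by = c("state", "place"))

# Guard against dropped or duplicated places before writing out
stopifnot(nrow(final) == nrow(place_universe))
print(final)
```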
