Register an API key at http://api.census.gov/data/key_signup.html
Create a .Renviron file in the main directory with “KEY=XXXXXXXXXXX”.
Note: this will not work with spaces on either side of the equal sign.
Also note: tidycensus already has this utility worked into it (read ?census_api_key). They call their api key CENSUS_API_KEY (it is common for this key to be in all caps), so that is what I also called mine. This will be especially helpful in not mixing up API keys if I use other API keys in the future.
library(tidycensus)
# First time, reload your environment so you can use the key without restarting R.
# .../ tells the machine to go one folder outside the folder it is in
readRenviron("../../../.Renviron")
# You can check it with:
# Sys.getenv("CENSUS_API_KEY")
- load variables with load_variables(year, dataset, chache=T/F)
Read ?load_variables for various datasets and more information.
Note that label shows the estimates by total, and then sex and age range. concept is by sex, then race, origins, and ancestry.
v15 <- load_variables(2019, "acs1")
v15
## # A tibble: 35,528 x 3
## name label concept
## <chr> <chr> <chr>
## 1 B01001_001 Estimate!!Total: SEX BY AGE
## 2 B01001_002 Estimate!!Total:!!Male: SEX BY AGE
## 3 B01001_003 Estimate!!Total:!!Male:!!Under 5 years SEX BY AGE
## 4 B01001_004 Estimate!!Total:!!Male:!!5 to 9 years SEX BY AGE
## 5 B01001_005 Estimate!!Total:!!Male:!!10 to 14 years SEX BY AGE
## 6 B01001_006 Estimate!!Total:!!Male:!!15 to 17 years SEX BY AGE
## 7 B01001_007 Estimate!!Total:!!Male:!!18 and 19 years SEX BY AGE
## 8 B01001_008 Estimate!!Total:!!Male:!!20 years SEX BY AGE
## 9 B01001_009 Estimate!!Total:!!Male:!!21 years SEX BY AGE
## 10 B01001_010 Estimate!!Total:!!Male:!!22 to 24 years SEX BY AGE
## # … with 35,518 more rows
Let’s only focus on the first line for now, “B01001_001” which should be the total estimates. Then we can use get_acs() to get data population data by state from the American Community Survey.
get_acs(geography = "state", year = 2019, variable = "B01001_001")
## Getting data from the 2015-2019 5-year ACS
## # A tibble: 52 x 5
## GEOID NAME variable estimate moe
## <chr> <chr> <chr> <dbl> <dbl>
## 1 01 Alabama B01001_001 4876250 NA
## 2 02 Alaska B01001_001 737068 NA
## 3 04 Arizona B01001_001 7050299 NA
## 4 05 Arkansas B01001_001 2999370 NA
## 5 06 California B01001_001 39283497 NA
## 6 08 Colorado B01001_001 5610349 NA
## 7 09 Connecticut B01001_001 3575074 NA
## 8 10 Delaware B01001_001 957248 NA
## 9 11 District of Columbia B01001_001 692683 NA
## 10 12 Florida B01001_001 20901636 NA
## # … with 42 more rows
We can get similar population estimates setting the variable = c(“POP), with get_estimates. As well as”DENSITY”; for housing unit estimates, c(“HUEST”); and for components of change estimates, c(“BIRTHS”, “DEATHS”, “DOMESTICMIG”, “INTERNATIONALMIG”, “NATURALINC”, “NETMIG”, “RBIRTH”, “RDEATH”, “RDOMESTICMIG”, “RINTERNATIONALMIG”, “RNATURALINC”, “RNETMIG”).
get_estimates(geography = "state", year = 2019, variable = c("POP"))
## # A tibble: 52 x 4
## NAME GEOID variable value
## <chr> <chr> <chr> <dbl>
## 1 Mississippi 28 POP 2976149
## 2 Missouri 29 POP 6137428
## 3 Montana 30 POP 1068778
## 4 Nebraska 31 POP 1934408
## 5 Nevada 32 POP 3080156
## 6 New Hampshire 33 POP 1359711
## 7 New Jersey 34 POP 8882190
## 8 New Mexico 35 POP 2096829
## 9 New York 36 POP 19453561
## 10 North Carolina 37 POP 10488084
## # … with 42 more rows
get_estimates(geography = "county", state = "OR", year = 2019, variable = c("POP"))
## # A tibble: 36 x 4
## NAME GEOID variable value
## <chr> <chr> <chr> <dbl>
## 1 Jackson County, Oregon 41029 POP 220944
## 2 Grant County, Oregon 41023 POP 7199
## 3 Clackamas County, Oregon 41005 POP 418187
## 4 Tillamook County, Oregon 41057 POP 27036
## 5 Josephine County, Oregon 41033 POP 87487
## 6 Umatilla County, Oregon 41059 POP 77950
## 7 Columbia County, Oregon 41009 POP 52354
## 8 Wasco County, Oregon 41065 POP 26682
## 9 Lane County, Oregon 41039 POP 382067
## 10 Washington County, Oregon 41067 POP 601592
## # … with 26 more rows