Command-Line Cartography, Census BlockGroup level data

Tin Ho
7 min readJun 7, 2020

Mike Bostock published a 4-parts series of excellent articles on generating maps with census population data. Those were at the Census TRACT level. I have adopted them for BLOCK GROUP level data, and am sharing my commands here.

For a quick peek, the map on the left is by TRACT, right is by BLOCK GROUP. There are just a bit more detailed granularity of where people live 🤗

The detailed command are provided below. Explanations are in Mike Bostock original article. So, when I refer to step “2c”, it is in Part 2, command “c” of his article.

Part 1: Download map shape data and generate a svg graphics file for it.

California State FIPS code is 06. Filename prefix changed from Bostok’s “ca-” to “ca2018bg-” to indicate the newer data and change to Block Group level data:

curl ‘https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_06_bg_500k.zip' -o cb_2018_06_bg_500k.zipunzip cb_2018_06_bg_500k.zipshp2json cb_2018_06_bg_500k.shp -o ca2018bg.json

The resulting .json is a geojson file with [longitude,latitude] coordinates:

Step 1b: Apply California Albers Equal Area Conic projection to the json file. Note that this projection will convert the coordinates and the result will no longer be a true geojson. Thus, omit this step is want to keep the coordinate in Longitude/Latitude:

geoproject 'd3.geoConicEqualArea().parallels([34, 40.5]).rotate([120, 0]).fitSize([960, 960], d)' < ca2018bg.json > ca2018bg-albers.json

The above resulting in a file with coordinates like [164.1468809671912,437.62295438355295]. More specifically, it looks like this:

Step 1c: we convert to svg:

geo2svg -w 960 -h 960 < ca2018bg-albers.json > ca2018bg-albers.svg

Which result in fairly large file, 11 MB. And if visualized by the browser based GIS tool mapshaper.org . Alternatively, VScode also have a previewer. Both show map “upside down”:

And that was for Part 1 😃

Part 2: Join shape data with population data.

Step “2a”, convert [geo]json to ndjson:

ndjson-split 'd.features' < ca2018bg-albers.json  > ca2018bg-albers.ndjson

You should get something like this for TRACTCE: 980401

census block group (2018) TRACTCE: 980401
{"type":"Feature","properties":{"STATEFP":"06","COUNTYFP":"075","TRACTCE":"980401","BLKGRPCE":"1","AFFGEOID":"1500000US060759804011","GEOID":"060759804011","NAME":"1","LSAD":"BG","ALAND":419323,"AWATER":247501289},"geometry":{"type":"Polygon","coordinates":[[[164.1468809671912,437.62295438355295],[164.63136562909594,437.78130627883274],[164.66061334779198,437.4585472897443],[164.99020559033386,437.24058568718465],[165.18788475165627,437.5895682364189],[165.34708696199812,437.95636228142894],[165.00971718370104,438.4217441413507],[164.76560417638595,438.337767038262],[164.64069467117463,438.0862550961165],[164.22534302939806,438.0159600586103],[164.1468809671912,437.62295438355295]]]}}

Step “2b”

ndjson-map 'd.id = d.properties.GEOID.slice(2), d'  < ca2018bg-albers.ndjson  > ca2018bg-albers-id.ndjson

Next step is to download population data. You need to download more data via Census API, which require that you have an API KEY. Example below download all block group in State FIPS=06 (california) and County FIPS=025 (Imperial county, the newest county addition to CA):

ApiKey=c9b728xxx123...789zzz
curl "https://api.census.gov/data/2018/acs/acs5?get=NAME,B01003_001E&for=block%20group:*&in=state:06%20county:025&key=$ApiKey" -o cb_2018_06_bg_B01003.025.json

Full county FIPS are listed in Wikipedia. They are sequentially numbered, skipping 1 (ie always odd-numbered). To download all 58 counties of Califoria, run:

for FIPS in $(seq -w 19 2 115); do
echo curl "https://api.census.gov/data/2018/acs/acs5?get=NAME,B01003_001E&for=block%20group:*&in=state:06%20county:$FIPS&key=$ApiKey" -o cb_2018_06_bg_B01003.$FIPS.json
done

At this point, the first 3 lines of your json file for County=001 (Alameda) should look like:

[["NAME","B01003_001E","state","county","tract","block group"],
["Block Group 2, Census Tract 4384, Alameda County, California","1623","06","001","438400","2"],
["Block Group 1, Census Tract 4384, Alameda County, California","945" ,"06","001","438400","1"],

Step “2d”. The ndjson-map changed a bit here, as we now have to combine 3 columns (d[3] + d[4] + d[5] to generate the id field. The population count remains the same and placed in B01003. We also need to loop for all 58 counties:

for FIPS in $(seq -w 001 2 115); do
ndjson-cat cb_2018_06_bg_B01003.$FIPS.json | ndjson-split 'd.slice(1)' | ndjson-map '{id: d[3] + d[4] +d[5], B01003: +d[1]}' > cb_2018_06_bg_B01003.$FIPS.ndjson
done
cat cb_2018_06_bg_B01003.???.ndjson > cb_2018_06_bg_B01003.CA.ndjson

Your new ndjson should now looks like:

{"id":"0014441003","B01003":1755}
{"id":"0014441002","B01003":1320}

Step “2e”:

ndjson-join 'd.id' \
ca2018bg-albers-id.ndjson \
cb_2018_06_bg_B01003.CA.ndjson \
> ca2018bg-albers-join.ndjson

The result here differs from Bostock example, as we have Block Group level data here. For the same TRACTCE=400300, we now have 4 records instead of 1. Further more, there is a new field added: BLKGRPCE:

[{"type":"Feature","properties":{"STATEFP":"06","COUNTYFP":"001","TRACTCE":"400300","BLKGRPCE":"2","AFFGEOID":"1500000US060014003002","GEOID":"060014003002","NAME":"2","LSAD":"BG","ALAND":269347,"AWATER":0},"geometry":{"type":"Polygon","coordinates":[[[224.44636755999747,425.3691423744949],[224.5939145191771,425.21708244189904],[224.7830672777208,425.34538644862505],[225.07612515752876,425.52808473848654],[225.1067691336628,425.4894377365358],[225.31806810037276,425.1724692650978],[225.58249395352414,425.05059707011105],[225.35882837057437,425.47619464326226],[225.22516372508392,425.73538936106115],[224.86658222608307,425.5294755512],[224.6649471804986,425.47993141313145],[224.43926884491924,425.4361850983005],[224.44636755999747,425.3691423744949]]]},"id":"0014003002"},{"id":"0014003002","B01003":1404}]

Step “2f”. The input data here has changed. However, we really only care for the population data (keyed under “B01003”) and the land area, ALAND. Population density is calculated as:

ndjson-map 'd[0].properties = {density: Math.floor(d[1].B01003 / d[0].properties.ALAND * 2589975.2356)}, d[0]' \
< ca2018bg-albers-join.ndjson \
> ca2018bg-albers-density.ndjson

Step “2g alt”. Here I am using the alternate method which worked more reliably for me:

ndjson-reduce 'p.features.push(d), p' '{type: "FeatureCollection", features: []}' \
< ca2018bg-albers-density.ndjson \
> ca2018bg-albers-density.json

The above result is a json file that you can view with your GIS program of choice. Bostock hinted at mapshaper.org web tool, which shows the map Australian-style (ie, South is top). If in step 1c you have skipped the Calif Albers projection step, the json file is a true geojson with [Lng,Lat] coordinates in it.

Next, step “2h” is to color the map according to population density. Bostock used d3 tool that he wrote, and this step only works when the coordinates are in California Albers:

ndjson-map -r d3 \
'(d.properties.fill = d3.scaleSequential(d3.interpolateViridis).domain([0, 4000])(d.properties.density), d)' \
< ca2018bg-albers-density.ndjson \
> ca2018bg-albers-color.ndjson
geo2svg -n --stroke none -p 1 -w 960 -h 960 \
< ca2018bg-albers-color.ndjson \
> ca2018bg-albers-color.purple.svg

The above produced a svg map that looks all purple, similar map that Bostock presented at the end of Part 2 of his article. Here we have population density by the Block Group rather than Tract. If you are using Linux Mint, you can use the program xviewer to look at the resulting svg file.

Part 3: TopoJSON

In Part 3, Bostock simplified the map (reducing its size) by using TopoJSON. All the steps are exactly the same, except that I changed filenames. Here is the whole enchilada :)

geo2topo -n \
tracts=ca2018bg-albers-density.ndjson \
> ca2018bg-tracts-topo.json
toposimplify -p 1 -f \
< ca2018bg-tracts-topo.json \
> ca2018bg-simple-topo.json
topoquantize 1e5 \
< ca2018bg-simple-topo.json \
> ca2018bg-quantized-topo.json
topomerge -k 'd.id.slice(0, 3)' counties=tracts \
< ca2018bg-quantized-topo.json \
> ca2018bg-merge-topo.json
topomerge --mesh -f 'a !== b' counties=counties \
< ca2018bg-merge-topo.json \
> ca2018bg-topo.json

Part 4: Coloring map

Bostock showed several way to color the map. The last result in step “4e” is the prettiest and most useful one, using log scale and the OrRd color scheme. So that’s what we will use here. Note that I had to drop “d3=” in the second -r clause for it to work (ie, use -r d3-scale-chromatic)

topo2geo tracts=- \
< ca2018bg-topo.json \
| ndjson-map -r d3 -r d3-scale-chromatic 'z = d3.scaleThreshold().domain([1, 10, 50, 200, 500, 1000, 2000, 4000]).range(d3.schemeOrRd[9]), d.features.forEach(f => f.properties.fill = z(f.properties.density)), d' \
| ndjson-split 'd.features' \
| geo2svg -n --stroke none -p 1 -w 960 -h 960 \
> ca-2018bg-threshold.svg

Finally, in step “4e”, county borders are added to help orient the viewer of what is where:

(topo2geo tracts=- \
< ca2018bg-topo.json \
| ndjson-map -r d3 -r d3-scale-chromatic 'z = d3.scaleThreshold().domain([1, 10, 50, 200, 500, 1000, 2000, 4000]).range(d3.schemeOrRd[9]), d.features.forEach(f => f.properties.fill = z(f.properties.density)), d' \
| ndjson-split 'd.features'; \
topo2geo counties=- \
< ca2018bg-topo.json \
| ndjson-map 'd.properties = {"stroke": "#000", "stroke-opacity": 0.3}, d')\
| geo2svg -n --stroke none -p 1 -w 960 -h 960 \
> ca2018bg.svg

Here is the resulting svg (converted to png cuz that’s what Medium supports):

Detailed map of Calif population density by census block groups.
color scalebar for map above

You can also download the final json (with CA Albers projection) and .svg.

For those interested in using other Census data, I found this really useful example page. Note that for the 2018 Estimate data, it seems that Census have changed things a bit? Or at least the API I used changed the result, it has an extra column with NAME in it, thus offsetting column number by 1. (I have accounted these in my examples above).

For those wanting to use GIS programs that use [Longitude, Latitude] coordinate, eg Mapbox, you would need to avoid geo projection steps like 1c above. You can still run the commands up to step 2g and create a true geojson file, which you can download here as well. A JavaScript program using Mapbox GL JS API can render the data on an interactive web page that looks like this:

Population density by Census Block Group near Stockton via Mapbox

How to generate geojson with simplified/compressed geometry using TopoJSON is left as an exercise for the reader. Show/link your work in the comment section! Cheers! 🥰

--

--

Tin Ho

Linux geek at Univ of Calif. Opinions expressed are my own and not my employers.