---+ How to Use Census Bureau Data

There is an enormous amount of information about Census Bureau data, so much that it is hard to sort through it all. Furthermore, a lot of the information assumes that you already have some understanding of how to work with the data, and/or that you are using a commercial software package (e.g. <nop>ArcInfo).

If you want to massage the data yourself -- without going through a commercial tool -- you will need to learn more than you ever wanted to know about the data. This document is what I've taught myself. I'm not certain that I've got it all completely correct, but it's the best I could do.

---++ Terminology

[[http://www.mdp.state.md.us/msdc/census/cen2000/DOC/geographic_term.pdf][Census 2000 Geographic Terms and Concepts]] is a good start at explaining the census data terminology. There are a few things, however, that it does not make clear:

   * What is the relationship between geographical entities? Are blocks subsets of tracts, for example? (They are: blocks nest within block groups, block groups within tracts, and tracts within counties.) Note that tracts are numbered uniquely _within_ counties. Tract 2031 in Santa Clara County is different from tract 2031 in Calaveras County, so a tract number only identifies a tract when it is combined with its state and county.

---++ Data files

The Census Bureau data is split into many files. If you just care about tabulating data, you will only need a data file; if you want to draw maps, you will also need a shapefile set.

---+++ Shapefile sets

Shapefiles hold information about regions on maps, e.g. the outlines of states, counties, census tracts, etc. Shapefiles define the outlines in terms of points, and the points are given in latitude/longitude pairs.

A shapefile is actually three different files: a =.dbf= file, a =.shp= file, and a =.shx= file. The =.shp= file holds the shape geometry itself, the =.shx= file is an index that records where each shape's record starts in the =.shp= file, and the =.dbf= file holds one row of attributes (names, ID codes, etc.) for each shape.

One place to get shapefiles is from ESRI (the makers of <nop>ArcInfo). Download from http://arcdata.esri.com/data/tiger2000/tiger_download.cfm .
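
To make the three-file split a bit more concrete, here is a minimal Python sketch that decodes the fixed 100-byte header that starts every =.shp= (and =.shx=) file, following the layout in ESRI's published shapefile description. The function names =parse_shp_header= and =read_shp_header= are my own inventions, not part of Shapelib or any other library, and the sketch only looks at the header, not at the shape records themselves.

```python
import struct


def parse_shp_header(header: bytes) -> dict:
    """Decode the fixed 100-byte header at the start of a .shp or .shx file."""
    if len(header) < 100:
        raise ValueError("shapefile header must be at least 100 bytes")
    # Bytes 0-27: file code, five unused ints, file length; all big-endian.
    file_code, *_unused, length_words = struct.unpack(">7i", header[:28])
    if file_code != 9994:
        raise ValueError("not a shapefile: bad file code %d" % file_code)
    # Bytes 28-35: version and shape type; little-endian.
    version, shape_type = struct.unpack("<2i", header[28:36])
    # Bytes 36-67: bounding box as four little-endian doubles.
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    return {
        "length_bytes": length_words * 2,  # stored as a count of 16-bit words
        "version": version,
        "shape_type": shape_type,
        "bbox": (xmin, ymin, xmax, ymax),
    }


def read_shp_header(path: str) -> dict:
    """Read and decode the header of the file at `path` (a hypothetical helper)."""
    with open(path, "rb") as f:
        return parse_shp_header(f.read(100))
```

For lat/long data, the bounding box should come back as (min longitude, min latitude, max longitude, max latitude). A shape type of 5 means polygon, which is what I would expect for tract and county outlines.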
ESRI has documented the shapefile format [[http://shapelib.maptools.org/dl/shapefile.pdf][here]].

Note that shapefiles are so big and unwieldy that the ESRI shapefiles are split into multiple pieces. Generally, a file covers a specific region (county or state) and a specific category of shape. There are lots of different categories of shapes -- census tracts, cities, voting districts, etc.

Example: If I request California, then "Census Tracts 2000" from the [[http://arcdata.esri.com/data/tiger2000/tiger_download.cfm][ESRI download page]], it will ask me which counties I want. If I say all counties, then I'll receive a file which, when I unzip it, will have a bunch of zip files, one for each California county. Unzipping each of those files will give me one =.dbf= file, one =.shp= file, and one =.shx= file.

On the other hand, you can get [[http://www.census.gov/geo/www/cob/bdy_files.html][boundary files]] directly from the Census Bureau. These have an entire state's worth of data in them, though perhaps aren't as well-documented.

---++++ Shapefile utilities

[[http://shapelib.maptools.org/][Shapelib]] is a wonderful thing, and it has a [[ftp://intevation.de/users/bh/pyshapelib/][Python binding]]. Its [[http://shapelib.maptools.org/shp_api.html][Shape API]] is very nice and will let you pull out individual shapes. The attribute fields that go with each shape live in the associated =.dbf= file; to figure out what the different fields are, query that file with Shapelib's [[http://shapelib.maptools.org/dbf_api.html][DBF API]].

The Census Bureau docs are really lame at telling you whether a field is char or int or double. Never fear -- use the Shapelib distro's =dbfdump= with the =-h= flag, and it will tell you what you need to know.

---+++ Data files

One of the juiciest data files that you can get is the SF1 data file. (SF1 stands for Summary File 1.) As the [[http://www.census.gov/prod/cen2000/doc/sf1.pdf][SF1 documentation]] shows, it has ALL KINDS of yummy population information broken down sixteen ways from Sunday.
The SF1 file is also a =.dbf= file, so you can use the [[http://shapelib.maptools.org/dbf_api.html][DBF API]] to extract that information, as above.

   * Set ALLOWTOPICCHANGE = DuckySherwood
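
If you don't have Shapelib handy, the field names and types in a =.dbf= file can also be read straight out of its header. The following is a minimal Python sketch, assuming the classic dBase III header layout (a 32-byte file header followed by 32-byte field descriptors, ended by a 0x0D byte). The function names =parse_dbf_fields= and =read_dbf_fields= are my own, not part of Shapelib, and the sketch does not read any of the actual records.

```python
import struct


def parse_dbf_fields(data: bytes) -> list:
    """Return (name, type, length, decimals) for each field in a .dbf header."""
    # Bytes 8-9 of the file header: total header size, little-endian uint16.
    header_len = struct.unpack("<H", data[8:10])[0]
    fields = []
    pos = 32  # field descriptors start right after the 32-byte file header
    # Each descriptor is 32 bytes; a 0x0D byte marks the end of the list.
    while pos + 32 <= header_len and data[pos] != 0x0D:
        # 11-byte name, 1-byte type code, 4 skipped bytes, length, decimal count.
        raw_name, ftype, flen, fdec = struct.unpack(
            "<11sc4xBB", data[pos:pos + 18]
        )
        name = raw_name.split(b"\x00", 1)[0].decode("ascii")
        fields.append((name, ftype.decode("ascii"), flen, fdec))
        pos += 32
    return fields


def read_dbf_fields(path: str) -> list:
    """Read the field list of the file at `path` (a hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(32)
        header_len = struct.unpack("<H", head[8:10])[0]
        # Read the rest of the header so all descriptors are available.
        return parse_dbf_fields(head + f.read(header_len - 32))
```

In the output, a type code of =C= is a character field and =N= is a numeric field; an =N= field with a nonzero decimal count holds fractional values rather than integers.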
Topic revision: r1 - 2005-11-18 - DuckySherwood
Copyright © 2008-2025 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.