README.tmp 27-Jan-04



TIGER database conversion to FORCES 'roads' database


======================================================================

General Bits

------------


1. The input source, the TIGER/Line files, is generated by the US Census

Bureau for public use.

- They are a series of files containing a wealth of information,

including road segments, road names (with alternates when a road has

more than 1 name), landmarks (museums, schools, etc.), and even

Census information.

- The files are organized by county and are named:

TGR<state-code><county-code>.RT<record-type>

- The files are plain text, with fixed-length records and fixed-length

fields (think COBOL).

- All of the files for a county are zipped into a county.zip.

- We have downloaded ALL the TIGER 'database' .zip files from

2002, for the entire USA.
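
The naming scheme above can be sketched as a small shell helper. This is
only an illustration: tiger_filename is a hypothetical function, not part
of the conversion code, and it is written in plain sh rather than csh. It
assumes the state code is 2 digits and the county code is 3 digits, as in
the file names used elsewhere in this README (TGR11001, TGR51013, TGR51059).

```shell
#!/bin/sh
# Hypothetical helper: build a TIGER/Line file name from its parts.
#   $1 = 2-digit state code, $2 = 3-digit county code, $3 = record type
# Codes are handled as strings so that leading zeros are preserved.
tiger_filename() {
    case "$1" in
        [0-9][0-9]) ;;
        *) echo "bad state code: $1" >&2; return 1 ;;
    esac
    case "$2" in
        [0-9][0-9][0-9]) ;;
        *) echo "bad county code: $2" >&2; return 1 ;;
    esac
    printf 'TGR%s%s.RT%s\n' "$1" "$2" "$3"
}

# The three state+county codes used in the sample script in this README.
for code in 11001 51013 51059; do
    state=${code%???}    # first two digits
    county=${code#??}    # last three digits
    tiger_filename "$state" "$county" 1
done
```

Run as-is, the loop prints the three .RT1 names that are passed to
build_roads1 in the sample script.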


2. Specification information is found in the tgr108cd.pdf file

"TIGER/Line Files, Technical Documentation".

- Most important is Appendix A, which contains the state and

county codes needed to determine which files to include in a

conversion for the 'roads' database.

- Open question: is there a .pdf-to-plaintext converter, so we can

automate the cross-reference between county name and ID?


3. This code was written to the 2002 Census specs. Care should be

taken when loading more recent TIGER files, as the document says

there will be changes to the file layouts. You'll need to review

the Technical Documentation before assuming the existing code will

convert them correctly.


4. Features that we could add to FORCES:

1. curves in roads; presently we straight-line each segment. TIGER

supports a separate file containing multiple lat/long points to

describe curve(s) in a segment.

2. Roads can have multiple names ('I-25' = 'Valley Highway', etc),

and a cross reference exists in TIGER to support them.

3. Currently our features table contains a single lat/long point for a

landmark. TIGER also supports the definition of (larger) landmarks

where multiple lat/long points define a polygon. This may be

useful in defining, for example, a mall, where we'd need a better

idea of the perimeter and which roads access it.

a. Features detected include:

Hospitals

Fire Depts

Military Bases

Police Stations

Power/utility lines

Pipelines (fluid or slurry)

Gov't Agencies/Buildings

Railroads - stations, yards, rails.

Marinas

Schools/Religious Institutions

Airports

Levees

Towers

Public structures (shopping centers, industrial bldgs, office parks,

gov't centers). I did check, and also see monuments, galleries,

and museums.

Rivers -- the documentation says we can map rivers (boundaries/shorelines)

as well. However, I haven't verified the usefulness of the information

provided.

-------------------------------------------------------------------

Items we've discussed that are not in the spec:

Communications

Power Utility Substations

Telephone network/substations

Water/treatment facilities

Evacuation Centers/Bomb Shelters

Bridges

Shipping/Water Transportation

Fuel Reserve

Zoos


... Note also that in the DC load, the TIGER files

did NOT include Arlington Cemetery, any of the Smithsonian

facilities, or The Mall.


4. We can distinguish tunnels, bridges, underpasses, and

overpasses.



======================================================================

To build a roads database for a particular area

-----------------------------------------------


1. cvt_tiger_to_roads.csh shows the correct run sequence. Currently,

I edit this script by hand to list the input TIGER files needed for

a region (in the build_roads1 and rt2_extract commands).

-- county_file (in sim/maintenance/tiger_db/county/) will list the

TIGER filename when you specify a state and county.


2. The environment variable TIGER_FILES needs to be defined to specify

the directory location of the TIGER files to read.


3. What I've done so far is create a directory under /tmp as the

source location, point TIGER_FILES to it, and unzip the

appropriate TIGER files into that directory. The conversion

programs will access the *.RT1, *.RT7 and *.RT8 files.
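
As a rough sketch of steps 2 and 3, the helper below checks that the
record types the conversion programs read are present before a run.
check_tiger_files is a hypothetical name, the /tmp/tiger path is only an
example, and the code is plain sh (in csh the variable would be set with
setenv instead of export).

```shell
#!/bin/sh
# Hypothetical helper: verify that every county prefix (e.g. TGR11001)
# has its .RT1, .RT7 and .RT8 files in the $TIGER_FILES directory.
# Returns 0 when all files are present, 1 otherwise.
check_tiger_files() {
    missing=0
    for prefix in "$@"; do
        for rt in RT1 RT7 RT8; do
            if [ ! -f "$TIGER_FILES/$prefix.$rt" ]; then
                echo "missing: $TIGER_FILES/$prefix.$rt" >&2
                missing=1
            fi
        done
    done
    return "$missing"
}

# Usage sketch (directory and zip names are examples only):
#   mkdir -p /tmp/tiger
#   TIGER_FILES=/tmp/tiger; export TIGER_FILES
#   unzip -o county.zip -d "$TIGER_FILES"
#   check_tiger_files TGR11001 TGR51013 TGR51059 || exit 1
```

Checking up front is cheap compared with discovering a missing file
partway through a conversion that has already re-initialized the tables.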


4. You will also need to select the 'level' of road detail you want

in your database. Road classes begin at 'A11' = Interstates.

The conversion code accepts a 'level', which is the highest-numbered

road type (lowest level of road) to include. Trial and error so far

indicates that '38' (A11-A38) covers the major road systems:

Interstates, US Highways, 4+ lane State Highways, etc.

- A39-A74 gets everything: driveways, 4WD trails, ferry routes,

etc. (be careful!).

- A41 gets down to residential streets. With 2 counties and DC,

it really bogged the system down, but it seems some major road

segments were tagged as A41s!


See tgr108cd.pdf pages 3-27 to 3-31 for road class (CFCC)

descriptions. Note that the conversion programs also grab 'P'

roads, which are: "Provisional features are those streets that were

added from reference sources or other programs in preparation for

Census 2000, but were not field verified by census staff...".

build_roads1 simply converts the 'P' to an 'A', without changing

the numeric part.
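
The selection rule described in item 4 can be illustrated with a short
shell sketch. select_cfcc is a hypothetical function that mimics the rule
as described here; it is not the actual logic inside build_roads1. It
assumes a CFCC is one letter followed by two digits, and that 'P' roads
are subject to the same level cutoff as 'A' roads.

```shell
#!/bin/sh
# Hypothetical helper mimicking the road selection rule described above.
#   $1 = 3-character CFCC (e.g. A38, P41), $2 = level (e.g. 38)
# Prints the CFCC to store (with P rewritten to A) and returns 0 when the
# road is selected; prints nothing and returns 1 otherwise.
select_cfcc() {
    class=${1%??}     # leading letter
    number=${1#?}     # two-digit road type
    case "$class" in
        A|P) ;;
        *) return 1 ;;
    esac
    case "$number" in
        [0-9][0-9]) ;;
        *) return 1 ;;
    esac
    if [ "$number" -ge 11 ] && [ "$number" -le "$2" ]; then
        echo "A$number"
        return 0
    fi
    return 1
}

# Try a few codes against level 38.
for cfcc in A11 P31 A41 H11; do
    if stored=$(select_cfcc "$cfcc" 38); then
        echo "$cfcc -> $stored"
    else
        echo "$cfcc -> skipped"
    fi
done
```

With level 38, A11 is kept, P31 is kept and stored as A31, and A41 and
the non-road class H11 are skipped.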


5. The code is not designed to be re-run against the same database.

There are multiple fields that are sequenced at runtime, and SQL

INSERTs that can't be duplicated. For that reason,

cvt_tiger_to_roads.csh initializes all the tables before running.


======================================================================

Notes

-----

1. Census documentation says: "The ability to group chains together to

include the entire length of a street feature, such as US Route 66,

depends on the uniqueness of the identifiers and the consistency of

the feature identifiers along the length of the feature. {{{ The

Census Bureau makes no guarantee that the complete chains have

uniform names or contain all of the known feature identifiers }}}".


2. We may want to go back and do the TGR RT* files that define curves

within a segment.


3. rt2_extract.c (building 'features' table): Polygon landmark features

are defined in FORCES as a center-point. We could go in and support

the multi-point polygons that the TIGER files offer.


4. If you happen to notice segment_ids looking like 207141431

instead of 76525141,

they're correct! TLIDs are 10 characters wide, so TIGER can hold

both. It's just a little suspect, since there are only a few of

them. I haven't seen documentation explaining the difference.


5. Need to support 'separated' roads. 'Separated' roads are

highways where TIGER defines (for example) the northbound side as a

separate road from the southbound, so we end up with 2 parallel

'roads'. Perhaps look into a future capability where we detect and

specify the 2 roads by directionality.


6. A38 seems to be a reasonable performance threshold for converting

multiple TIGER files (multiple counties). A41 takes hours, which

may or may not be a problem. I've found major roads where most of

the road would be (for example) A31-A35, then a segment would be

'missing', only to turn up in the TIGER db as an A41!


7. Still seeing fragments of roads (like Interstates!), even after

populating up to level 38 CFCCs. Research so far indicates the

segments are there, but 1 segment of a major road can be classed as

A41! Loading A41, however, pulls in residential streets, which

degrades system performance to the point of being unusable.


8. I tried to incorporate road type A63 "Access ramp; ala a cloverleaf

or limited access interchange", thinking it might plug some gaps in

the road system/database. It didn't appear to help.


9. FORCES: ROAD DISPLAY, Lowest Road Class to Display -- what are the

valid values? machine_class? Or clss?



----------------------------------------------------------


to dump:

pg_dump --create --no-owner -U postgres roads > roads.dump


to load:

su postgres

dropdb roads

psql template1 < roads.dump

exit


----------------------------------------------------------


This builds Washington DC, along with 2 nearby VA counties, with up to

A38 level roads (state highways):

---------------------------------------------------------------------------

#!/bin/csh

#

# cvt_tiger_to_roads.csh

#

# Create 'roads' database and convert TIGER datafiles into it

#

# NOTE: if you want to keep the data you already have in 'roads', DO NOT

# RUN THIS SCRIPT!

#

psql roads < initialize_db.sql

echo "================================================================="

#build_roads1 -l 41 TGR11001.RT1 TGR51013.RT1 TGR51059.RT1

build_roads1 -l 38 TGR11001.RT1 TGR51013.RT1 TGR51059.RT1

echo "Creating the road_segments table from build_roads1 data"

psql roads -U postgres < road_segments.tmp

echo "================================================================="

echo "Creating the roads table from the road_segments"

psql roads < Insert_Roads.sql

echo "================================================================="

echo "Mashing it all together"

psql roads < Create_Roads.sql

echo "================================================================="

echo "Creating the features table"

rt2_extract TGR11001.RT7 TGR51013.RT7 TGR51059.RT7 > rt2_extract.log



This builds Wash DC, along with 2 nearby VA counties, with up to A38

level roads (interstates):

---------------------------------------------------------------------------