Abstract
A geocoding process can take several forms depending on the format of the starting data. To many people, geocoding is the process of linking a standard mailing address with a point defined by a latitude and a longitude. More generally, it is any process in which a unique geographic identifier is associated with a spatially defined area. In this general sense, linking aggregate data to a zip code boundary may also be considered a form of geocoding.
Mapping Address Data with ArcGIS
In this exercise we will geocode address information in three separate ways:
- Using zip codes to relate a data table with a zip code file
- Using existing latitude and longitude data to display points
- Using addresses to match to an indexed street file
All of the files needed for this exercise are located in EDS in the following directory: E:\work\workshops\geocoding_workshop\
Navigate to the folder and add all layers (2 shapefiles and one dbf file)
The shapefiles for the New York City streets and zip codes were obtained from ESRI. The streets were originally created from the US Census Bureau's TIGER (Topologically Integrated Geographic Encoding and Referencing) files.
The zip code file was originally produced by Geographic Data Technologies (GDT) and represents zip codes in use in 2003. The data contained in the dbf were obtained from InfoUSA and represent business records for a small subset of NYC businesses. These include art stores, lumber yards, and restaurants in zip code 10025
Relating the businesses to zip codes
- After importing the files, right click on the zip code file in the left-hand explorer pane
- Select Joins and Relates --> Relate. You will receive a dialog box that looks like this:

- In the top pulldown menu, select ZIP as the field for the relate
- In the middle pulldown, select the all_abi_addresses file as the table to relate
- In the bottom pulldown menu, select zip and select OK
- If all went well, you will have created a rudimentary relational database that associates the zip code file with a number of different businesses
- Open the attribute table for zip codes, and at the bottom select Options --> Select By Attributes
- In the dialog box that appears, enter the following query:
"ZIP"='10025'
- Hit Apply
- You should see at the bottom of the table '1 out of 195 selected'
- Go again to Options and select Related tables
- The attribute table for the business data should open
- All businesses associated with zip code 10025 should be selected
- This should be 187 out of 434
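The select-then-relate sequence above can be sketched outside ArcGIS with Python's built-in sqlite3 module. This is only an illustration of the one-to-many logic, not what ArcMap does internally; the table names and the three business rows are hypothetical stand-ins for the zip code file and the all_abi_addresses table.

```python
import sqlite3

# Hypothetical miniature versions of the two tables used in this exercise.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE zipcodes ("ZIP" TEXT PRIMARY KEY)')
conn.execute("CREATE TABLE businesses (name TEXT, zip TEXT)")
conn.executemany("INSERT INTO zipcodes VALUES (?)", [("10025",), ("10027",)])
conn.executemany(
    "INSERT INTO businesses VALUES (?, ?)",
    [
        ("Art store A", "10025"),
        ("Restaurant B", "10025"),
        ("Lumber yard C", "10027"),
    ],
)

# Step 1: the attribute query, written the same way as in Select By Attributes.
# The zip code is compared as text, so it is quoted.
query = """SELECT "ZIP" FROM zipcodes WHERE "ZIP"='10025'"""
selected = [row[0] for row in conn.execute(query)]

# Step 2: follow the relate from the selected zip codes to the business table.
placeholders = ",".join("?" for _ in selected)
related = [
    row[0]
    for row in conn.execute(
        f"SELECT name FROM businesses WHERE zip IN ({placeholders}) ORDER BY name",
        selected,
    )
]

print(len(selected), "zip code selected;", len(related), "related businesses")
```

One zip code record leads to many business records, which is why the exercise uses a relate rather than a join.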
Displaying X/Y data
- At the bottom of the left-hand pane you should see two tabs: Display and Source
- Make certain that Source is selected
- In the left-hand pane, right click on the all_abi_addresses table
- Select Display X-Y Data
- A dialog box will appear that looks like this:

- Ensure that the left-hand pulldown shows 'lon' and the right-hand pulldown shows 'lat'
- Note the section at the bottom that asks for projection information (i.e. Spatial Reference)
If we have information about the map projection it is a good idea to include this information now. These data are in decimal degrees and probably in the North American Datum of 1983. Since we do not know for certain, we will leave this unchanged.
In most cases, the software can determine whether geographic data are in latitude and longitude coordinates and will automatically adjust the display so that data sets are correctly registered.
- Hit OK and the points should be displayed
- Note that this is a temporary data set called all_abi_addresses Events. To make it permanent, the data must be saved (exported)
- To save, right click on the layer and choose: Data --> Export Data
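Before displaying X/Y data it can help to confirm that the coordinate columns hold usable decimal degrees, with longitude as X and latitude as Y. The sketch below is a plain-Python check, not an ArcGIS call; the function name and the sample rows are hypothetical, and only the field names 'lon' and 'lat' come from the exercise.

```python
import csv
import io


def rows_to_points(rows, x_field="lon", y_field="lat"):
    """Return (x, y, row) tuples for rows with usable decimal-degree values."""
    points = []
    for row in rows:
        try:
            x = float(row[x_field])  # longitude is the X coordinate
            y = float(row[y_field])  # latitude is the Y coordinate
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric coordinate
        # Keep only values that are possible in decimal degrees.
        if -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0:
            points.append((x, y, row))
    return points


# Hypothetical sample table; the coordinates are illustrative only.
sample = io.StringIO(
    "name,lon,lat\n"
    "Art store A,-73.968,40.799\n"
    "Bad latitude B,-73.968,400.0\n"
    "No coordinates C,,\n"
)
points = rows_to_points(csv.DictReader(sample))
print(len(points), "of 3 rows can be displayed as points")
```

A range check like this cannot catch swapped columns when both values happen to fall within valid ranges, so it is still worth looking at the plotted points.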
Geocoding Addresses
Prior to geocoding addresses we must create an address locator file using appropriately indexed street data. For this exercise we will use the TIGER files for the five counties that make up New York City, merged into a single file
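Street files of this kind typically store an address range for each side of every street segment, and a locator places a house number by interpolating along the matching segment. The sketch below illustrates that general idea together with a side offset. It is an assumption-laden illustration, not the ArcGIS implementation: the Segment class, its field names, and the locate function are hypothetical, coordinates are assumed to be planar, and each side is assumed to hold house numbers of a single parity.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical street segment with one address range per side."""
    start: tuple      # (x, y) where the address ranges begin
    end: tuple        # (x, y) where the address ranges end
    left_from: int
    left_to: int
    right_from: int
    right_to: int


def locate(segment, house_number, side_offset=0.0):
    """Interpolate a house number along a segment; return (x, y) or None."""
    sides = (
        (segment.left_from, segment.left_to, 1.0),     # left of travel direction
        (segment.right_from, segment.right_to, -1.0),  # right of travel direction
    )
    for lo, hi, sign in sides:
        in_range = min(lo, hi) <= house_number <= max(lo, hi)
        same_parity = house_number % 2 == lo % 2
        if not (in_range and same_parity):
            continue
        # Fraction of the way along the segment for this house number.
        t = 0.5 if hi == lo else (house_number - lo) / (hi - lo)
        (x0, y0), (x1, y1) = segment.start, segment.end
        dx, dy = x1 - x0, y1 - y0
        x, y = x0 + t * dx, y0 + t * dy
        length = math.hypot(dx, dy)
        if length > 0:
            # (-dy, dx) / length is the unit vector pointing to the left.
            x += sign * side_offset * (-dy / length)
            y += sign * side_offset * (dx / length)
        return (x, y)
    return None


# A segment running east: odd numbers on the left, even numbers on the right.
block = Segment((0.0, 0.0), (100.0, 0.0), 1, 99, 2, 100)
print(locate(block, 99, side_offset=10.0))  # left side, far end of the block
print(locate(block, 50, side_offset=10.0))  # right side, near the middle
print(locate(block, 200))                   # outside both ranges
```

Without a side offset every address falls on the street centerline, so businesses on opposite sides of a street plot on top of one another.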
- Click on the button that looks like a small filing cabinet - this will open the ArcCatalog application
- In the left-hand pane, scroll to the bottom where it says Address Locators - click on this and double click on Create New Address Locator
- You will be given a number of different options - Choose US Streets with Zone (File)
- A dialog box that looks like this will appear:

- Fill in the fields so that it is identical with the example above
- Many of the fields will be automatically filled in when you browse to the reference data (mergednyc_TIGER2000)
- Change the name of the locator at the top so that you will remember what reference data are being used
- Also set a 'side offset' (the option is toward the lower right of the dialog box)
- Click OK
- Now, return to the map document
- Right click again on the data table - this time select Geocode Addresses
- A dialog box will appear that looks like this:

- Click on Add and navigate to the address locator you just created
- Pick the locator that you just made - you may have to scroll to the bottom of the explorer pulldown menu to find the Address Locators area

- Once you have successfully chosen your locator, hit OK and you will get yet another dialog box:

- In the Output section you can name the data set you are about to create - this is worth doing; if you don't, you end up with many shapefiles all named something like Geocoding_Result_23.shp
- At this point you can also change some of the options that you established in your locator - for instance, if you forgot to put a side offset you can do that now. You can also change the acceptance score
- We will briefly discuss this:

- Once you are happy with these, hit OK and OK again
- You will get a certain number of matched, unmatched, and tied addresses
- In general there is not much you can do about matches that are incorrect. You are unlikely to ever see them
- You can fix the unmatched and tied addresses, though - select the interactive match button that shows up on the dialog

- You can click through any unmatched addresses one at a time and experiment with the Modify button to change the address parameters
- Geocoding is often a trial-and-error process, and achieving high match rates can be exceptionally time consuming
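The matched, tied, and unmatched outcomes, and the effect of the acceptance score, can be illustrated with a small sketch. The scoring here uses Python's difflib string similarity as a stand-in. It is not the scoring method ArcGIS uses, and the function names, the 0 to 100 scale, and the default threshold of 80 are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(address):
    """Upper-case, drop periods, and collapse repeated spaces."""
    return " ".join(address.upper().replace(".", "").split())


def score(a, b):
    """Similarity of two addresses on a 0-100 scale (stand-in scoring)."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return round(100 * ratio)


def classify(address, candidates, min_score=80):
    """Return ('matched' | 'tied' | 'unmatched', best candidates)."""
    scored = sorted(((score(address, c), c) for c in candidates), reverse=True)
    passing = [(s, c) for s, c in scored if s >= min_score]
    if not passing:
        return "unmatched", []
    top = passing[0][0]
    best = [c for s, c in passing if s == top]
    # One best candidate is a match; several with the same score is a tie.
    return ("matched" if len(best) == 1 else "tied"), best


# Hypothetical input address and candidate lists.
print(classify("2880 broadway", ["2880 Broadway", "1 Wall St"]))
print(classify("100 Main St.", ["100 Main St", "100 MAIN ST"]))
print(classify("2880 Broadway", ["1 Wall St"]))
```

Lowering min_score turns some unmatched addresses into matches, at the cost of accepting more incorrect ones, which is the trade-off to keep in mind when changing the acceptance score.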

