How To: Edit the geocoding standard abbreviations for street types

Summary

ArcView's geocoding engine uses US Postal Service street abbreviation conventions. These conventions are controlled through classification files that standardize address component abbreviations. For example, the 'AV' in the address '123 MAPLE AV' will be converted to 'AVE' when compared to the street theme's attribute table. Although this design works well with most datasets, you may encounter data that does not match this convention.

Although it is possible to alter the way ArcView standardizes addresses, editing the necessary files is a customization and, like any customization, is not supported.

Procedure

The files mentioned below are located in the ..\Esri\Av_gis30\Arcview\Geocode directory. Make backup copies of these files before you start editing.

You can modify the way ArcView handles street types by editing the following ArcView geocoding classification (CLS) files:

¤ us_addr.cls
¤ us_intsc.cls
¤ stname.cls

Once you edit these files, the changes will affect every geocoding operation you perform.
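
The backup step can be scripted. The following is a minimal sketch, not an Esri tool: the function name backup_cls_files and the .bak suffix are illustrative assumptions, and the Geocode directory must be supplied by you, since its full path depends on your installation.

```python
import shutil
from pathlib import Path

# The three classification files named in this article.
CLS_FILES = ("us_addr.cls", "us_intsc.cls", "stname.cls")


def backup_cls_files(geocode_dir, suffix=".bak"):
    """Copy each CLS file to '<name><suffix>' in the same directory.

    geocode_dir is the ArcView Geocode directory on your machine
    (hypothetical parameter; this sketch does not guess the path).
    Returns the list of backup paths that were written.
    """
    geocode_dir = Path(geocode_dir)
    backups = []
    for name in CLS_FILES:
        source = geocode_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"expected CLS file not found: {source}")
        target = source.with_name(source.name + suffix)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        backups.append(target)
    return backups
```

Run it once before editing; to undo your edits, copy each .bak file back over its original.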

  • The street definitions

    1. Open us_addr.cls in a text editor such as Notepad. The first 30 lines list the string classes and describe how they are used in the geocoding process.

    2. Scroll down to the section with these four columns:

    ... 
    AVENUE AVE T 800.0
    AVE AVE T
    AVEN AVE T
    AV AVE T
    AVD AVE T
    AVNUE AVE T
    AVENIDA AVE T
    BCH BEACH A
    BLF BLUFF A
    BLUF BLUFF A
    BOULEVARD BLVD T 790.0
    BLVD BLVD T
    BVD BLVD T
    BOUL BLVD T
    ...

    ¤ The first column lists street types as they may appear in your address table. No string may appear more than once anywhere in this column.

    ¤ The second column contains the form to which ArcView standardizes the entry in the first column. This abbreviation should be the same as the abbreviation used in your street theme's attribute table.

    ¤ The third column sets the class of the record. Class 'T' represents street types.

    ¤ The fourth column is optional and can contain a spelling sensitivity factor. The number '800.0' means the event string can be 20% different from the string in the first column and still be standardized properly. The number '750.0' means the event string can be 25% different. For example, in the case of AVENUE, you could type AVENU and it would still standardize to AVE, even though there is no entry of AVENU in the first column. Use this fourth column only when the entry in the first column is long and easy to misspell.
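
    The four columns described above can be read with a short script. This is a sketch under two assumptions: that each entry is a single line of whitespace-separated, single-word columns, as in the excerpt, and that the sensitivity factor maps linearly to a percentage (the formula below is inferred only from the two values given, 800.0 and 750.0). The names ClsEntry, parse_cls_line, and allowed_difference_percent are illustrative and not part of ArcView.

```python
from typing import NamedTuple, Optional


class ClsEntry(NamedTuple):
    raw: str                      # column 1: string as it appears in the address table
    standard: str                 # column 2: standardized form
    cls: str                      # column 3: class, e.g. 'T' for street type
    sensitivity: Optional[float]  # column 4: optional spelling sensitivity factor


def parse_cls_line(line: str) -> Optional[ClsEntry]:
    """Parse one entry line; return None for lines that are not entries."""
    parts = line.split()
    if len(parts) not in (3, 4):
        return None
    sensitivity = None
    if len(parts) == 4:
        try:
            sensitivity = float(parts[3])
        except ValueError:
            return None
    return ClsEntry(parts[0], parts[1], parts[2], sensitivity)


def allowed_difference_percent(sensitivity: float) -> float:
    """Assumed linear reading: 800.0 -> 20%, 750.0 -> 25%."""
    return (1000.0 - sensitivity) / 10.0


entry = parse_cls_line("AVENUE AVE T 800.0")
print(entry.standard, allowed_difference_percent(entry.sensitivity))
```

    Under this reading, the 790.0 on the BOULEVARD line would allow a 21% difference.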

  • What you should edit

    Take the example given in the Summary section of this document, in which AV is converted to AVE. If the street theme's attribute table does not use the standard AVE to represent Avenue, then no address using AV will be marked as a perfect match, even though the address table and the street table store the same values.

    To fix this, change all occurrences of AVE in the second column of this file to AV. Similarly, if your street attribute table uses BL instead of BLVD, change all occurrences of BLVD in the second column to BL.
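
    For a file with many entries, the second-column change can be scripted rather than made by hand. The sketch below assumes whitespace-separated columns and that exact column alignment does not matter to ArcView, which this article does not confirm; the function name restandardize is hypothetical. It touches only the second column of entries in the given class and leaves every other line unchanged.

```python
import re

# leading text + column 1, column 2, gap, column 3, rest of line
_ENTRY = re.compile(r"^(\s*\S+\s+)(\S+)(\s+)(\S+)(.*)$")


def restandardize(lines, old, new, cls="T"):
    """Replace `old` with `new` in column 2 of entries whose class is `cls`.

    `lines` are strings without trailing newlines (e.g. from str.splitlines()).
    """
    result = []
    for line in lines:
        m = _ENTRY.match(line)
        if m and m.group(2) == old and m.group(4) == cls:
            line = m.group(1) + new + m.group(3) + m.group(4) + m.group(5)
        result.append(line)
    return result


sample = ["AVENUE AVE T 800.0", "AV AVE T", "BLVD BLVD T", "BCH BEACH A"]
print("\n".join(restandardize(sample, "AVE", "AV")))
```

    Work on a copy of the file and compare the result with the original before replacing it.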

  • What you should avoid

    You can also make whole new line entries in these files. For example, there is a street type in Florida called Nene. If you make an entry like this:

    NENE		NENE	T

    ArcView will recognize this as a street type and compare this string to the proper field in the street theme's attribute table. But let's also say that the word "Nene" can be abbreviated as "NE", so you make this second entry:

    NE		NENE	T

    ArcView will take all occurrences of NE, standardize them to NENE, and code the string as a street type. This is fine for Nene, but since Northeast can also be abbreviated as NE, this will cause a problem for matching the NE street direction. The other problem is that you now have two entries of NE in the first column, which is not allowed. You will need to decide which one to keep.
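
    Because a string may appear only once in the first column, it can help to scan the file for duplicates after adding entries. A minimal sketch, assuming whitespace-separated columns and a case-insensitive comparison (both assumptions); find_duplicate_keys is a hypothetical helper, and the 'NE NE D' line in the sample is an illustrative stand-in for the existing directional entry, not a line copied from the file.

```python
from collections import defaultdict


def find_duplicate_keys(lines):
    """Return {first-column string: [line numbers]} for strings listed more than once."""
    seen = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if len(parts) >= 3:  # skip blank lines and lines that are not entries
            seen[parts[0].upper()].append(number)
    return {key: numbers for key, numbers in seen.items() if len(numbers) > 1}


sample = ["NENE NENE T", "NE NE D", "NE NENE T"]
print(find_duplicate_keys(sample))
```

    Lines in the descriptive header at the top of the file may be reported as well if they happen to contain three or more words, so review the results by hand.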

  • The ST string

    Be careful when you are attempting to change the way the word STREET is standardized. This word is treated differently than other street types because of its many possible meanings. ST could be:

    1. An abbreviation for the word "Street"
    2. An abbreviation for the word "Saint"
    3. An abbreviation for the word "State"
    4. An ordinal number suffix ( 1st, 31st, 101st, etc.)

    Because of this, ST has its own class, called S. In an extreme case, you could have an address '123 1ST ST ST ST 123', which can be read as "123 First State Street Suite #123". ArcView's pattern recognition files try to determine which pieces of the address string to compare to which fields in the street theme's attribute table, based on the classes of the strings assigned in the CLS files, and the way those class token strings are assigned to the MatchKey object.

  • Directional strings

    In addition to street types of the class T, you can also experiment with editing directional information in your addresses. Directionals are strings such as W, NE, and North, and they have the class D in the third column. The integrity of these CLS files is paramount to the success of ArcView geocoding.

  • Completing the edits

    Any edits related to street types that you make in the us_addr.cls file must also be made in the us_intsc.cls and stname.cls files.

    You will need to rebuild the geocoding index and restart ArcView unless you edited only street types and directions.