Indiana State Agencies
ARC/INFO Data Collection
Standards
Preliminary
August 1995
Chapter 1 Introduction
1.00 INTRODUCTION
1.01 APPLICATION OF STANDARDS
1.02 ADDITIONAL STANDARDS AND INITIATIVES
1.03 STATE AGENCIES GIS DATA LIBRARY
Chapter 2 Map Preparation and Source Media Guidelines
2.00 INTRODUCTION
2.01 STANDARD SOURCE MAP
2.02 OVERLAY LABELING
2.03 NON GEO-REFERENCED MAPS
2.04 SCALE
2.05 MAP MEDIA
2.06 EDGE MATCHING
2.07 LINE WIDTH
2.08 ACCURACY GOALS
2.09 COORDINATE SYSTEMS
2.10 ENTITY IDENTIFIERS
2.11 PROJECTION
2.12 DATUM
2.13 CONTROL/REGISTRATION
Chapter 3 Data Automation
3.00 INTRODUCTION
3.01 COVERAGE NAMING CONVENTIONS
3.02 COINCIDENT FEATURES
3.03 ARC/INFO DIGITIZING
3.04 ATTRIBUTE CODING
3.05 TOPOLOGY
3.06 EDGE MATCHING ADJOINING COVERAGES
3.07 PROOFPLOTS
3.08 ACCURACY ASSESSMENT/DIGITAL MAP STANDARDS
Chapter 4 Data Documentation
DATA DOCUMENTATION
Appendixes
APPENDIX 1 ABBREVIATIONS
APPENDIX 2 GLOSSARY
APPENDIX 3 US NATIONAL MAP ACCURACY STANDARD
1.00 INTRODUCTION
The increasing use of Geographic Information Systems (GIS), the need for locational data for specific programs, the need to share locational data across program areas and agencies, and the implementation of EPA's Locational Data Policy (LDP) have created the need for spatial data collection standards. Standards adopted by State Agencies will help ensure that the data being collected and the technology being used comply with established guidelines of quality and compatibility. Standards for GIS are critically important to ensure consistency in the databases and GIS applications that are being developed within the State Agencies. Without standards, it will be very difficult to transfer files, overlay data, share data or develop integrated systems.
This document is intended to be a reference manual on standards for State Agencies staff who collect and/or automate locational data in both tabular and graphic format. This document provides guidelines for map preparation and conversion of spatial data into digital ARC coverages, sets specifications for digitizing data, and provides standard procedures for documenting the history of each data layer and source map. The first two chapters refer to both tabular and GIS data collection efforts. Chapters 3 and 4 should be used for GIS data automation by staff who have had formal training in digitizing and other GIS procedures. This document defines terms, specifies standards, and describes general procedures. This document is not intended to be a tutorial or training manual for GIS use or GIS concepts. This document will be reviewed and updated periodically by the interagency GIS Coordinators. Questions regarding this document may be directed to each State Agency GIS Coordinator.
EPA has instituted a Locational Data Policy (LDP) to ensure the collection of accurate, consistently formatted, fully-documented latitude/longitude coordinates as part of all spatially relevant State Agencies sponsored data collection activities. This policy applies to all EPA organizations and agents, including state and local government personnel directly responsible for, or who have delegated authority for, implementing federal environmental laws.
When gathering data, the following elements MUST be collected in accordance with the LDP:
Latitude/longitude coordinates
Method used to determine latitude/longitude coordinates
Reference datum
Map scale
Description of data
Accuracy of data
The LDP also strongly recommends that the date and source of collection be documented. The State Agencies will require ALL of the above elements for both tabular and GIS data collection. More stringent standards apply to GIS data sets (see Chapter 4).
For further information on EPA's LDP, a four-volume guide is available in the IDEM GIS library at the IDEM MIS Division, 1249 IGCN.
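The required and recommended LDP elements listed above can be carried as a single record so that incomplete entries are easy to spot. The following is a minimal sketch only; the class name, field names and helper function are hypothetical and are not part of the LDP or of these standards.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class LocationRecord:
    """One locational data record (field names are illustrative only)."""
    latitude: Optional[float] = None       # decimal degrees
    longitude: Optional[float] = None      # decimal degrees
    method: Optional[str] = None           # method used to determine lat/long
    datum: Optional[str] = None            # reference datum, e.g. "NAD83"
    map_scale: Optional[str] = None        # e.g. "1:24,000"
    description: Optional[str] = None      # description of data
    accuracy: Optional[str] = None         # accuracy of data
    collection_date: Optional[str] = None  # recommended by the LDP
    source: Optional[str] = None           # recommended by the LDP


def missing_elements(record: LocationRecord) -> List[str]:
    """Return the names of elements that have not been filled in."""
    return [f.name for f in fields(record)
            if getattr(record, f.name) in (None, "")]


record = LocationRecord(latitude=39.77, longitude=-86.16, datum="NAD83")
print(missing_elements(record))
```

A record would be accepted only when `missing_elements` returns an empty list, since the State Agencies require all of the elements.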
1.01 APPLICATION OF STANDARDS
These standards must be applied realistically. They are intended to facilitate data collection and subsequent use, not impede it. For data being collected and automated within the State Agencies, the standards should be followed rigorously. In cases where automation began before the adoption of standards, procedures may need to be modified so that automation proceeds according to accepted standards.
The standards should not be interpreted as precluding the use of existing digital data that do not meet standards. Neither should they preclude cooperative efforts between the State Agencies and other entities who cannot meet these standards. In all cases where a data set does not meet standards, statements defining the specifics of the deviation shall be a permanent part of the documentation to be distributed with that data set.
The short term costs of adhering to standards may be high, but the long term costs of bypassing standards are far greater. Data developed within the State Agencies may be used by many people for a variety of purposes. Data may be copied and re-used many times. Any errors or inaccuracies in the data will be multiplied every time the data is used. Each new processing step may compound existing errors, and those errors may not be well documented, which can lead to inaccurate assessments of the data and incorrect decisions based on the data. New technology on the market today allows instant access and unlimited capabilities to combine and overlay data from any source. This also allows unlimited capabilities to unknowingly combine good and bad data.
1.02 ADDITIONAL STANDARDS AND INITIATIVES
The EPA currently considers Global Positioning Systems (GPS) to be the best practicable technology because of its potential for yielding consistent, highly accurate measurements. With the full constellation of GPS satellites completed in 1993, and with the proliferation of commercial measuring devices, GPS will serve as the method of choice for the collection of highly accurate locations. GPS will not be addressed in this version of the State Agencies Spatial Data Collection Standards.
Also not discussed in this version are guidelines for scanning data since that technology is not currently in common use. Scanning standards will be addressed with the appointment of an interagency scanning technology workgroup.
1.03 STATE AGENCIES GIS DATA LIBRARY
The State Agencies GIS data library is a function that will be developed in the future. The intent of the GIS library is to serve three purposes. The library should store a complete set of source maps for digitizing purposes as well as any reference documents necessary for proper implementation of these standards. The second function of the library should be to maintain a current electronic data dictionary and on-line access to data needed by State Agencies staff. The third function should be to maintain a data archival system.
Until a statewide GIS data library is created, the cooperating agencies will develop joint naming conventions and templates for use with GIS data.
Chapter 2
Map Preparation and Source Media Guidelines
2.00 INTRODUCTION
Map preparation and source media guidelines are necessary to ensure consistency in data compilation. Data that are collected from field work, other maps and reference materials should be compiled onto a source map. This chapter addresses standards for defining the standard source map(s) and for drafting maps to be digitized.
2.01 STANDARD SOURCE MAP
The standard source maps for thematic data layers shall be the United States Geological Survey (USGS) 1:250,000, 1:100,000, and 1:24,000 (7.5') quadrangles. Engineering and site maps of a variety of larger scales may be used for project specific data collection provided that the source map can be accurately geo-referenced (see Section 2.01 and 2.12). Mylar source maps are recommended over paper, but paper source maps can be used with care and caution. Geographic data should be compiled onto a standard source map by drafting features of interest onto a mylar or vellum overlay to maintain a clean source map. The overlay should be registered to the source map using standard control points such as the quadrangle corner tics. The features to be digitized can then be drafted onto the overlay. There is no need to draft the features existing on the published quadrangle map onto the overlay. Existing features should be digitized directly from the quadrangle, if necessary. Features compiled by the program area should be drafted onto the overlay prior to digitizing. Source maps shall NOT be folded, torn, water stained, laminated or modified in any manner which would alter the accuracy of the map.
2.02 OVERLAY LABELING
On the overlay, there must be clear notation of the map's theme, series (if any), scale, datum, projection, date, staff person and State Agencies program area. A minimum of four registration tics must be accurately drafted and labeled.
2.03 NON GEO-REFERENCED MAPS
Examples of non geo-referenced maps are aerial photos and survey maps. When data are to be compiled on or extracted from an aerial photo, the photo must be capable of being registered to section corner points (or other mapped features) as published on the USGS 1:24,000 quadrangles. A minimum of four registration tics must be drafted onto the overlay for sketch maps, and five registration tics for aerial photos. The additional tic is required due to the variable scale existing on aerial photos.
In order to register any source, there is a need to have known coordinates which can be accurately referenced to create registration points. Before using aerial photos, additional information should also be known: the scale of the photo, whether the photo has been rectified, and the date and time of flight. To minimize distortion, only the effective area of aerial photos should be used. The effective area is that area of the aerial photo where distortion is minimized. In relatively flat rural areas of Indiana the effective area of a 9" x 9" photo would be approximately 6" x 6".
2.04 SCALE
The State Agencies standard source map scale for spatial data capture and database development is 1:250,000, 1:100,000 or 1:24,000, depending on application and user needs. Engineering and site maps of larger scales may be used for project specific areas.
Users must carefully evaluate and knowledgeably select the appropriate scale for the project they are undertaking. Thematic layers being automated should be built from a single scale, rather than multiple scales. The choice of scale should be determined by:
Accuracy and resolution needs
Map and data availability
Available time to complete a project
Digitizing and digital data storage costs
2.05 MAP MEDIA
Spatial data should be prepared on stable media.
Paper maps or drawings are susceptible to shrinkage and swelling with humidity changes. Digitizing from paper maps should be avoided whenever it is feasible. The following is a list of media types, ordered from most preferable to least preferable.
Mylar original
Mylar contact reproduction from mylar original
Vellum or paper contact reproduction from mylar original with an accurate mylar registration overlay
Vellum or paper original with an accurate mylar registration overlay
2.06 EDGE MATCHING
Adjacent source maps should be edge matched.
Features/mapping units that intersect the boundaries of source maps should be accurately edge matched with the corresponding features/mapping units on adjacent source maps.
Attributes of features that cross boundaries should also logically match.
2.07 LINE WIDTH
Lines to be digitized should be precisely drawn, having a consistent width. The most desirable maximum width is 0.01 inches (0.254 mm). This can be achieved with a 000 drafting pen.
Lines wider than 0.01" can cause a substantial loss of accuracy, both when being drafted and when being digitized.
2.08 ACCURACY GOALS
Map compilation and data being prepared for digitizing should attempt to meet the National Map Accuracy Standard (NMAS) for the scale being utilized (see Appendix 3). This standard requires that, in a random sample of well defined map locations, 90% will be found (on the map) within 0.01 inches (at scale) of their true location (on the ground) and 100% will be found within 0.02 inches of their true location. The table below lists the relative accuracies of a selection of quadrangle maps.
Scale        Positional accuracy
1:24,000     +/- 40 feet or 12.2 meters
1:63,360     +/- 105.6 feet or 32.2 meters
1:100,000    +/- 166.7 feet or 50.8 meters
1:250,000    +/- 416.7 feet or 127 meters
Locational data not obtained from USGS maps listed above should comply with the 25 meter accuracy goals of these standards.
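The positional figures tabulated above follow from multiplying a distance on the map by the scale denominator. The sketch below shows that arithmetic. It assumes the tabulated values correspond to 0.02 inches at map scale, which is an inference from the numbers (0.02 in x 24,000 = 480 in = 40 ft) rather than something the table states; the function name is illustrative.

```python
INCHES_PER_FOOT = 12
METERS_PER_FOOT = 0.3048


def ground_tolerance(scale_denominator, map_inches=0.02):
    """Convert a distance measured on the map (inches) to ground feet and meters.

    map_inches defaults to 0.02, the value assumed to underlie the table.
    """
    feet = map_inches * scale_denominator / INCHES_PER_FOOT
    return feet, feet * METERS_PER_FOOT


for denominator in (24_000, 100_000, 250_000):
    feet, meters = ground_tolerance(denominator)
    print(f"1:{denominator:,}  +/- {feet:.1f} feet or {meters:.1f} meters")
```

Passing `map_inches=0.01` gives the ground distance for the 90% threshold, which is half of each tabulated value.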
2.09 COORDINATE SYSTEMS
The standard coordinate system will be UTM. Other coordinate systems may be used for locational data collection, but the coordinates must be converted to UTM before they may be incorporated into the State Agencies GIS data library. The following three coordinate systems may be utilized.
Universal Transverse Mercator (UTM) coordinates
Latitude/Longitude (not a true coordinate system)
State Plane Coordinates (SPC)
Each of the three coordinate systems will be discussed below.
UNIVERSAL TRANSVERSE MERCATOR (UTM)
UTM is a specialized application of the Transverse Mercator Projection. The globe is divided into sixty zones, each spanning six degrees of longitude. Each zone has its own central meridian. There are minimal distortions of large shapes within the zone and minimal distortion of angles. Indiana is located entirely in UTM zone 16. The USGS 1:24,000 scale quadrangle maps contain UTM grid tics along the map margins.
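Because the sixty zones are each six degrees wide, the zone number can be derived from longitude alone. The following sketch illustrates this; the function names are illustrative, the regular zone grid is assumed (the special zone exceptions near Norway and Svalbard are ignored), and the Indiana longitudes used in the example are approximate values chosen for illustration.

```python
import math


def utm_zone(longitude_deg):
    """Return the UTM zone number (1-60) for a longitude in decimal degrees.

    East longitudes are positive, west longitudes negative.
    """
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    # Zone 1 starts at 180 W; longitude 180 E is folded into zone 60.
    return min(int(math.floor((longitude_deg + 180.0) / 6.0)) + 1, 60)


def central_meridian(zone):
    """Return the central meridian of a UTM zone in decimal degrees."""
    return zone * 6 - 183


# Approximate western and eastern limits of Indiana (illustrative values).
for lon in (-88.1, -84.8):
    zone = utm_zone(lon)
    print(lon, zone, central_meridian(zone))
```

Both longitudes fall in zone 16, whose central meridian is 87 degrees west.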
LATITUDE/LONGITUDE
Latitude/longitude, or geographic coordinates, are used in a variety of settings for a diverse set of applications. Like UTM coordinates, they can represent large areas in a single coordinate system. The disadvantage of this reference system is that the length of a degree, minute and second of longitude on the ground varies with the latitude. Because of the variable lengths of units, distance and area measurements are very difficult using geographic coordinates, as are other GIS operations. The 1:24,000 USGS quadrangle maps contain registration tics for latitude and longitude.
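The variation in the ground length of a degree of longitude can be illustrated with a simple spherical approximation, in which the length shrinks with the cosine of the latitude. This is a rough sketch only: the mean earth radius and the sphere itself are simplifying assumptions, and the latitudes in the example are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean radius; spherical earth is an assumption


def longitude_degree_length_m(latitude_deg):
    """Approximate ground length of one degree of longitude, in meters."""
    return (math.pi / 180.0) * EARTH_RADIUS_M * math.cos(math.radians(latitude_deg))


# The same one-degree difference in longitude covers less ground farther north.
for lat in (0.0, 38.0, 41.5):
    print(lat, round(longitude_degree_length_m(lat)))
```

A projected system such as UTM avoids this problem because its units are meters everywhere within the zone.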
STATE PLANE COORDINATE (SPC)
SPC are most useful when map tiles are to be used independently or limited joining of adjacent tiles is expected. There are two SPC zones in Indiana: east and west. Each county is placed entirely within one of the two zones. Counties on either side of a zonal boundary cannot be joined with those across the line without transformation and reprojection. The USGS 1:24,000 quadrangles contain SPC grid tics along the map margins. The SPC coordinate units are in feet.
2.10 ENTITY IDENTIFIERS
Entity identifiers use relative references (Area or Feature Identifiers) to specify general or specific location. In all cases, the locations of the entities must be specified in one of the coordinate systems noted above before any GIS applications can be carried out on the entities. Often, the entity locations are defined in a GIS data layer, such as the statewide Township, Range, and Section or Public Land Survey (PLS) coverage. Several entity identifier systems are commonly used in the State Agencies and should be used in any locational data collections. Examples of standard entity identifiers are as follows:
Counties
PLS components (e.g. sections) and ownership parcels
US EPA Reach File 3 segment and milepoint
Unique IDNR well record id numbers
Minor Civil Divisions (Towns, Cities and Unorganized Territories)
Master Entity Facility IDs (FINDS)
Permit Numbers
FIPS (Federal Information Processing Standards) Codes
2.11 PROJECTION
The State Agencies will use Universal Transverse Mercator (UTM) as a standard projection. Before a map is digitized, the nature of its projection should be known. For example, USGS 1:24,000 quadrangle maps in Indiana may be published in Transverse Mercator (TM), Polyconic, or Lambert Conformal projections. The type of projection is usually included with the reference material at the bottom of a USGS quadrangle. Data capture must occur using the same projection as the source map. If the source map is not TM, the coverage shall be projected into latitude/longitude and then projected again into the standard UTM zone 16 (see Coordinate Systems, above) to maintain accuracy. If necessary, once a map is digitized, it can be transformed to a number of other projections. When capturing data, records shall be kept indicating the projection of the source maps and whether or not the projection was later transformed. The method of transformation must also be identified.
2.12 DATUM
By legislative mandate, NAD83 is now the Indiana standard datum. NAD83 is also the EPA LDP standard. All new data generated or captured by State Agencies shall be collected in or converted to NAD83. Existing data should be converted as opportunities arise. NADCON (developed and recommended by the National Geodetic Survey) is the standard software package to make the conversions. ARC/INFO has a datum conversion utility based on NADCON which will be the standard for converting map feature data within ARC/INFO GIS databases.
2.13 CONTROL/REGISTRATION
A minimum of four registration tics are required for registering a manuscript for digitizing. Corner tics from USGS 7.5 minute quadrangles are the preferred registration coordinates to establish control. Internal tics may also be included to increase the registration accuracy. When using USGS quadrangles, high positional accuracy can be obtained by also using the four internal tics and the eight edge tics. The recommended maximum RMS (Root Mean Square) error for registration of a manuscript to be digitized is 0.005 inches, or between 3 and 4 meters at 1 to 24,000 scale.
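The RMS figure above can be related to ground distance with a short calculation. The sketch below is illustrative only: it computes an RMS error from pairs of digitized and expected tic positions and converts table inches to ground meters. The function names and sample coordinates are hypothetical, and ARC/INFO reports its own RMS value during registration, so this is not a description of that software's internals.

```python
import math

METERS_PER_INCH = 0.0254


def rms_error(digitized, expected):
    """Root mean square of the distances between matching tic positions.

    Both arguments are equal-length lists of (x, y) pairs in the same units.
    """
    if not digitized or len(digitized) != len(expected):
        raise ValueError("need two non-empty lists of equal length")
    squared = [(dx - ex) ** 2 + (dy - ey) ** 2
               for (dx, dy), (ex, ey) in zip(digitized, expected)]
    return math.sqrt(sum(squared) / len(squared))


def table_inches_to_ground_meters(inches, scale_denominator):
    """Convert a distance in table inches to meters on the ground."""
    return inches * scale_denominator * METERS_PER_INCH


# Four hypothetical tics, in table inches.
expected = [(0.0, 0.0), (18.0, 0.0), (18.0, 22.0), (0.0, 22.0)]
digitized = [(0.003, 0.0), (18.0, -0.004), (17.996, 22.0), (0.0, 22.003)]
rms = rms_error(digitized, expected)
print(rms <= 0.005, table_inches_to_ground_meters(0.005, 24_000))
```

At 1:24,000, 0.005 table inches works out to 3.048 meters on the ground, consistent with the "between 3 and 4 meters" quoted above.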
Register quadrangle map corners by using the published latitude/longitude corners of the map, rather than the dashed NAD83 corner tics. An ARC/INFO coverage of the coordinates for these corners is available through the State Agencies GIS data library.
Tics are simple constructs with very high significance. They must be digitized with EXTREME precision. They are the very foundation of spatial data structures, and should be handled with care.
Tics have a number of functions as listed below:
1) Relating the map to the digitizer table surface. If a map is digitized, taken off the table and remounted for additional digitizing or editing, the tics allow the digitizing software to transform new coordinates digitized to the coordinate space of the file established during the initial digitizing session. This is also known as registering the map sheet to the digitizer.
2) Relating adjacent map files to one another. Tics, when properly selected, can be utilized by transformation and joining software to append two or more adjacent map sheet files into a single coverage, even if they are stored in an arbitrary coordinate system or in different projections.
3) Registering aerial photos or nonstandard scale maps to a standard base. In this application, tics, internal to the map, are located at identifiable locations (e.g. a road intersection). Coordinate transformation software can link the new coordinates to existing coverages. This use can be particularly tricky with unrectified aerial photos, as many factors can impact accuracy.
4) Transforming from one coordinate system to another. When tic files are available in the original coordinate system and in a second desirable coordinate system, transformation software can be utilized to manipulate coordinates. This is especially useful for converting data digitized in table inches to the coordinate reference system of the source map.
5) Generating coordinates specific to a map projection in native units. The latitude and longitude coordinates of tics at the corners of USGS quadrangle maps are known. Projection software can project these coordinates into the coordinate system which is native to the projection on the map. These calculated tics can then be used for subsequent digitizing from the map.
Tic points should initially be entered in table inches or map projection coordinates. Tics are very similar to point features or polygon label points, in that they consist of an X coordinate, Y coordinate and a tic ID. They are different in that their significance is geometric, rather than substantive or topological.
ARC/INFO requires a minimum of 4 tics for map registration and five tics for aerial photo registration. These are necessary for the transformation from one coordinate system to another.
Chapter 3
Data Automation
3.00 INTRODUCTION
Locational data automation provides the means to ensure consistent quality information. Digital data is more efficiently shared among State Agencies divisions and other entities than paper files. Care must be taken to follow the Spatial Data Collection Standards to ensure accuracy. This chapter provides guidelines for creating digital data sets and performing quality assurance checks.
3.01 COVERAGE NAMING CONVENTIONS
All coverage names and tile names will follow standard naming conventions and directory tree structures. A separate reference document will be created which describes these standards.
3.02 COINCIDENT FEATURES
Coincident features MUST match. If a digital base layer exists, lines or points that represent the same features as those about to be digitized should be copied from it, and NOT re-digitized. An alternative is to digitize these features, and then snap them to the coincident features in the base layer. The base layer must be at a scale at least as accurate as the scale being digitized.
3.03 ARC/INFO DIGITIZING
Digitize enough points to accurately represent line and polygon (area) features. The plotted digitized features must lie within 0.01 inches of the source map features. Enter only enough points to meet this standard, as unnecessary points require extra computer storage space and extra processing time. The recommended digitizing tolerance for distance between points (weed tolerance in ARC/INFO) is 0.002 table inches at the scale of the manuscript. Digitize along the centers of lines and in the center of point features. Save changes frequently. Avoid unnecessary coverage cleans.
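The effect of a minimum distance between points can be shown with a simple thinning routine. This is a sketch of the general idea only, assuming a plain distance test against the last kept vertex; it is not a description of how ARC/INFO applies its weed tolerance, and the function name is hypothetical.

```python
import math


def weed(points, tolerance):
    """Drop interior vertices closer than `tolerance` to the last kept vertex.

    `points` is a list of (x, y) pairs in table inches. The first and last
    points (the nodes) are always kept.
    """
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for point in points[1:-1]:
        if math.dist(kept[-1], point) >= tolerance:
            kept.append(point)
    kept.append(points[-1])
    return kept


line = [(0.0, 0.0), (0.001, 0.0), (0.003, 0.0), (0.0035, 0.0), (0.01, 0.0)]
print(weed(line, 0.002))
```

With a tolerance of 0.002 table inches, the two vertices that lie within 0.002 inches of the previously kept vertex are dropped and the end points are retained.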
Point Digitizing
POINT features can be digitized, key-entered or loaded from machine-readable data files. When key-entered, latitude/longitude (in decimal degrees), UTM, or State Plane coordinate/reference systems may be used. When digitized, table inches or map projection coordinates shall be utilized.
For all sets of point features, each point will have an X coordinate, a Y coordinate, and a Point ID. Point IDs can be either an attribute string of numeric information on the point, or a relational key to a point attribute table.
Line Digitizing
LINE features shall be digitized in table inches or map projection coordinates. The basic unit of digitizing is the line segment. Each line segment will consist of two nodes (end points), a variable number of vertices between the nodes, and an ID. Line IDs can be either an attribute string of numeric information on the line, or a relational key to a line attribute table.
Definitions
A NODE is either the beginning or end point of a line. Each node defines the end point of 3 or more segments, or the end point of a single "dangle".
A VERTEX is a point between nodes where the line changes direction. A line segment without vertices is, by definition, straight. A line segment with one vertex is composed of two straight subsegments.
There is also one special case of line nodes: PSEUDO-NODES. Pseudo-nodes are nodes within line segments. They are identical to vertices in their topological function, except in one of the following regards:
1) they have attributes associated with them (e.g. river gauging station);
2) they have topological addressability for NETWORK applications (e.g. river mile indices, highway mileposts);
3) they exist solely to subdivide very long lines into transparent subunits for ease or efficiency (e.g. a very long stream segment of 9,000 vertices is subdivided into two pseudo-segments of 4,500 vertices); or,
4) they exist at map borders solely to facilitate map joining or map extraction.
Line features can be either topological or not. River/stream and highway networks are examples of topological line features. By definition, segments must be linked together to form a logical network. Geological faults and lineaments are examples of non topological line features. Such segments may be "floating" (i.e. not touching other segments, as in two strands of spaghetti lying on different parts of a plate) or randomly crisscrossed. Line features which are inherently non topological can be "spaghetti" or randomly digitized. Line segments do not have to connect, and line intersections do not have to be explicitly identified as nodes.
Line features to be digitized which have inherent topology must go through a verification process to ensure topological consistency. The ability to do network analysis is an important GIS function. Linear connectivity is required for this operation and topological consistency checking ensures such connectivity. Topological processing is described in section 3.05.
Polygon (Area) Digitizing
POLYGON features shall be digitized in digitizer inch or map projection coordinates. The basic units of digitizing are the line segments forming polygon borders and the label points with which the polygon IDs are associated. Each polygon line segment will consist of 2 nodes (end points) and a variable number of vertices. In cases where the polygon is defined by a single line closing on itself there will only be one node defining the feature (see pseudo-nodes below).
Definitions: A NODE is either the beginning or the end point of a polygon line segment where three or more polygon line segments come together (e.g. a section corner is a node in a PLS coverage). Each node defines the end point of 3 or more polygon line segments.
A VERTEX is a point between nodes where the line segment changes direction. A line segment without vertices is, by definition, straight. A line segment with one vertex is composed of 2 straight subsegments. A line segment with N vertices is composed of N+1 subsegments.
A LABEL POINT consists of an X coordinate, Y coordinate, and a polygon ID. Label points MUST fall within the border of the parent polygon. Only 1 label point is allowed per polygon. The polygon ID can be either an attribute string of numeric information on the polygon, or a relational key to a polygon attribute table. Conceptually, polygon label points are similar to point features, except they are linked by their location to a particular polygon.
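Whether a label point falls within its parent polygon can be checked with a standard ray-casting test. The following is a minimal sketch, assuming a simple polygon without holes given as a list of border vertices; the function name is illustrative, and points lying exactly on the border are not handled specially.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test for a point against a polygon border.

    `ring` is a list of (x, y) vertices; the last vertex is joined back to
    the first. Returns True when the point is inside the border.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # X coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


border = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(point_in_polygon(2.0, 2.0, border), point_in_polygon(5.0, 2.0, border))
```

A label point that fails this test against its parent polygon corresponds to the "polygon labels outside polygons" condition that the topology checks are meant to catch.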
There is also one special case of polygon segment nodes: PSEUDO-NODES. Pseudo-nodes are nodes within a polygon line segment. They are identical to vertices in their topological function, except that:
1) they exist solely to subdivide very long polygon line segments into transparent subsegments for processing ease or efficiency;
2) they exist at map borders solely to facilitate map joining or map extraction; or
3) they can connect a single arc to itself.
All polygons MUST be topologically structured. By definition, polygon line segments must be linked together into a logical structure, and all polygon areas must have label points. All polygons digitized MUST go through a verification process to ensure topological consistency. Topological processing is described in section 3.05.
The ability to do polygon overlays with other polygon, line or point coverages is an ESSENTIAL GIS capability. It is at the heart of most spatial analysis tasks. Topological data structuring is required to perform these functions. While such structuring and consistency checking may seem tedious, it is a small price to pay for spatial intelligence and integrity.
3.04 ATTRIBUTE CODING
Attribute accuracy: each point, line and polygon must have a single, unique feature identification number. Ninety-eight (98) percent of all attributes shall be coded correctly. The 98% figure is based on the National Center of Health Statistics Standards. Each program area shall develop an electronic data dictionary which should contain a listing of all existing coding schemes. Other references for coding include the USGS Spatial Data Transfer Standard. These reference documents are available through the State Agencies GIS Coordinator in NUS.
3.05 TOPOLOGY
The graphic component of all digital data produced must be topologically clean. This means there shall be no:
Sliver polygons
Open polygons
Unlabeled polygons
Duplicate labels
Polygon labels outside polygons
Overshoots or undershoots
Unresolved line segment intersections
Topological Processing in ARC/INFO
Upon completion of the linework entry, line coverages should be BUILT or CLEANed. Polygon coverages are usually CLEANed. CLEANing includes intersecting of lines, clipping of dangles, snapping of undershoots, and BUILDing of topology. BUILDing includes only the latter step. The choice of which to use for line coverages is a function of the data type, the digitizing procedures employed, and its intended uses. If you are unsure of which to use, consult with your State Agency GIS Coordinator. Excessive use of the CLEAN command may cause features to shift in space as topological structure is created. This shift, called 'fuzzy creep', is created when closely spaced arcs and nodes are snapped together.
Definition of Errors
Errors are detected on an error plot produced with ARCPLOT. This plot will show node errors and label errors. Node errors include undershoots (i.e. those arcs not connected to a node) and dangling arcs (i.e. those which overshoot past an intersection). Label errors include missing and duplicate labels. Avoidance and correction of these errors are described below.
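The node conditions described above can be found by counting how many arc ends meet at each node. The sketch below illustrates that idea outside ARC/INFO; the data layout and function name are hypothetical, and node coordinates are assumed to match exactly (that is, snapping has already been done). A dangling node found this way may be either an undershoot or an overshoot; the error plot is still needed to tell them apart.

```python
from collections import Counter


def classify_nodes(arcs):
    """Return (dangling_nodes, pseudo_nodes) for a list of arcs.

    Each arc is a list of (x, y) pairs; its first and last pairs are nodes.
    A node touched by one arc end is dangling; by exactly two, a pseudo-node.
    """
    ends = Counter()
    for arc in arcs:
        ends[arc[0]] += 1
        ends[arc[-1]] += 1
    dangling = sorted(node for node, count in ends.items() if count == 1)
    pseudo = sorted(node for node, count in ends.items() if count == 2)
    return dangling, pseudo


arcs = [
    [(0, 0), (1, 0)],
    [(1, 0), (1, 1)],
    [(1, 0), (2, 0)],
    [(2, 0), (3, 0)],
]
print(classify_nodes(arcs))
```

In this example the node at (1, 0) joins three arcs and is a true node, (2, 0) joins two arcs and is a pseudo-node, and the remaining three end points are dangling. An arc that closes on itself contributes two ends to the same node and is reported as a pseudo-node.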
Avoidance of Errors
To avoid node errors, it is strongly recommended that the operator overshoot intersections slightly. When CLEANed, overshoots less than the length of the dangle distance are clipped to the node, thereby avoiding undershoot errors and further editing. An overshoot longer than the dangle distance can easily be deleted after the coverage is CLEANed. The clean coverage will have the overshoots identified as separate dangling line segments. It is much easier and more accurate to delete an arc overshoot than to snap one arc to another. Alternatively, when ARCEDIT is used, the SNAP environment should be set to enable either ARCSNAP or NODESNAP functions so as to avoid node errors.
To avoid label errors, consider not entering labels until after all the linework has been completed and certified for topological consistency. The CREATELABELS command will generate a unique label point for each polygon in the coverage. The actual feature ID can then be loaded into the attribute table or replace the default ID created by the CREATELABELS command, or a separate field may be added to the Polygon Attribute Table (PAT) for the feature ID.
Fixing Errors
In general, the use of CLEAN should be minimized. One way of doing this is to use CLEAN diagnostically. That is, run CLEAN and produce error plots from the output of the CLEANing, but correct any error detected on the original (pre-CLEAN) coverage. Repeated CLEAN runs cause a problem called "fuzzy creep" which results in small shifts of features.
If CLEANing uncovers many node errors, a change in one or both of the two parameters used by CLEAN may be indicated. If there are many undershoots with very small gaps, the original coverage should be CLEANed again with a larger fuzzy tolerance. The default fuzzy tolerance in CLEAN is 1/10,000th of the width of the coverage (0.002 if the coverage is in inch coordinates). This can easily be doubled, and can be increased to as much as 1/1,000th of the coverage width, or 0.02 for inch coordinates.
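The fuzzy tolerance figures quoted above are simple fractions of the coverage width. As a small illustration, assuming the width is taken from the coverage's minimum and maximum X coordinates (the function name is hypothetical, and the fractions are those stated in this section):

```python
def fuzzy_tolerance_range(xmin, xmax):
    """Return (default, upper limit) fuzzy tolerance for a coverage width.

    Default is 1/10,000th of the width; the upper limit is 1/1,000th.
    """
    width = xmax - xmin
    return width / 10_000, width / 1_000


# A coverage 20 inches wide, digitized in table inches.
default, upper = fuzzy_tolerance_range(0.0, 20.0)
print(default, upper)
```

A coverage 20 inches wide gives the 0.002 and 0.02 inch figures quoted in the text.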
If there are many dangling arcs, the original coverage should be CLEANed again with a larger dangle distance. The default dangle distance is 0. This parameter can be set to a range comparable to that described above for the fuzzy tolerance. If both problems exist, both parameters should be increased before CLEANing the original coverage again. If the coverage is reCLEANed, another error plot should be produced which will, hopefully, contain fewer node errors. These remaining node errors should be corrected in the original coverage with ARCEDIT or ADS.
If there are many, relatively large undershoots in a coverage, an alternative snapping procedure should be considered. Rather than extending undershot arcs manually, they can be automatically snapped with the ARCEDIT EXTEND FEATURE function, or MATCHNODE with the EXTEND option.
The process for correcting label errors is much more straightforward. If there are duplicate labels, either the arc which is supposed to separate them must be snapped to define two polygons, or one of the label points must be deleted. A missing label simply needs to be entered within the unlabeled polygon.
Once all node and label errors have been fixed, the final edited version of the original coverages should be BUILT or CLEANed and an error plot produced as above. When there are no further errors, the coverage can be considered topologically certified and the digitizing phase is complete. A copy should be sent to the State Agencies GIS data library (when created), on high density 3.5" diskettes, 8mm DAT 2.5 GB tapes, or 4mm DAT 5 GB tapes in ARC export format. Until a State Agencies GIS Library is created, each agency GIS Coordinator or Manager will maintain his/her agency's library resources.
3.06 EDGE MATCHING ADJOINING COVERAGES
Adjacent coverages should be edge matched. Arcs that intersect the boundaries of a coverage should be accurately edge matched with the corresponding arcs in adjacent coverages. Arcs must not overshoot or undershoot the coverage boundary. Attributes of features that cross coverage boundaries should also logically match in item definition and values.
3.07 PROOFPLOTS
A proof plot should be made of each map, at the source document scale, to verify its accuracy when compared to the original manuscript. Mylar is the preferred medium for a proof plot; the recommended line width is 0.01 inches.
3.08 ACCURACY ASSESSMENT/DIGITAL MAP STANDARDS
It is the responsibility of each State Agency program producing the map to verify the accuracy of the digitized product according to the following standards.
1) 90% of the digitized features on a same scale, stable base proof plot must lie within 0.01 inches of the corresponding features on the original manuscript.
2) 100% of the digitized features on a same scale, stable base proof plot must lie within 0.02 inches of the corresponding features on the original manuscript.
3) Digital data must be topologically clean and free of errors.
Positional accuracy can be measured by testing a number of points on a stable base proof plot. Measure the distance between the points on the plot and the original manuscript, measuring from the center of the line or point.
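The 90% and 100% criteria above can be applied to a set of measured offsets as follows. This is a minimal sketch; the function name and the sample measurements are hypothetical, and how many test points make an adequate sample is not specified here.

```python
def meets_digital_map_standard(offsets_inches):
    """Check measured plot-to-manuscript offsets against both criteria.

    At least 90% of offsets must be within 0.01 inches and every offset
    must be within 0.02 inches.
    """
    if not offsets_inches:
        raise ValueError("at least one measurement is required")
    within_first = sum(1 for d in offsets_inches if d <= 0.01)
    share = within_first / len(offsets_inches)
    return share >= 0.90 and max(offsets_inches) <= 0.02


# Ten hypothetical test points measured on a proof plot, in inches.
measurements = [0.004] * 9 + [0.015]
print(meets_digital_map_standard(measurements))
```

The sample passes because nine of ten points are within 0.01 inches and the largest offset is within 0.02 inches; a single offset above 0.02 inches would cause the check to fail.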