Vector Models
Data Models Topological Models Representation of surfaces
Geographic information is represented based on the projection and coordinate system used.
Features include roads, rivers, rail lines, etc.
All information represented on a map must be translated into electronic form.
The electronic form of the map may contain just the graphic representation of the map, or the information on the map may be separated into groups of objects (features), with each group stored electronically.
Graphic representations and geographical space can be presented in raster form or vector form.
In the raster format, the graphic is represented as a combination of individual units, where each unit can represent only one value. All units are stored to represent the graphic.
Example: bitmap images, where the image is composed of individual pixels
In the vector format, the graphic is represented by a set of points, joined by a certain relationship or function. Only the points and the relationship are stored. Intermediate points are determined using the relationship
Example: A CAD drawing (engineering drawing)
IS 645: Geographic Information Systems, Summer 2000, J. Wolfe
Flat File
4753456 623412
4753436 623424
4753462 623478
4753432 623482
4753405 623429
4753401 623508
4753462 623555
4753398 623634
0000000000000000
0001100000100000
1010100001010000
1100100001010000
0000100010001000
0000100010000100
0001000100000010
0010000100000001
0111001000000001
0000111000000000
0000000000000000
Raster-based line
Water dominates: W W G / W W G / W W G
Edges separate: W E W / E E E / G G G
Lossy: there is a certain loss of accuracy in exchange for greatly increased compression.
Lossless: there is a guarantee that the exact input stream will be regenerated after the compress/expand cycle.
Data compression = Modeling + Coding
[Figure: input stream -> symbols -> model -> probabilities -> encoder -> codes -> output stream]
Lossless coding techniques: Huffman, Run Length Encoding, LZ (uses an adaptive dictionary), LZW, etc.
Variable-length coding: Use short codes for the most frequent symbols, and longer codes for less frequent symbols. Examples: Huffman, Shannon-Fano
Dictionary and Sliding Window Compression: Uses previously seen data as a dictionary. Examples: Lempel-Ziv (LZ77 with a sliding window, as used in PKZip; LZ78), LZW (used in ARC, GIFs)
Run Length Encoding (RLE): Replace repeated data by the count of the data elements
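The idea of variable-length coding can be illustrated with a small Huffman sketch. This is a minimal illustration in Python, not a production encoder; the function name `huffman_codes` and the sample string are assumptions made for the example.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each symbol in text to its Huffman bit string."""
    freq = Counter(text)
    if len(freq) == 1:
        # Degenerate input: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


message = "WWWWWWGGE"          # hypothetical run of raster cell values
codes = huffman_codes(message)
encoded = "".join(codes[s] for s in message)
```

The most frequent symbol (W) receives a one-bit code, so the nine symbols are coded in 12 bits instead of the 18 bits a fixed two-bit code would need.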
BMP
Used by MS Windows and OS/2 systems.
An 8-bit bitmap uses a color table of 256 entries (2^8) to define its palette. Each pixel is coded as an index into the color table.
Pixel data size is the number of pixels multiplied by bits per pixel.
Color may vary from computer to computer based on the default palette.
A 24-bit bitmap uses 3 bytes for each pixel to provide a depth of about 16 million colors.
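The size rule above can be checked with a few lines of arithmetic. This sketch assumes a 640 x 480 image (a size chosen only for illustration) and counts pixel data only; file headers, the color table, and row padding are deliberately ignored.

```python
def bitmap_pixel_bytes(width, height, bits_per_pixel):
    """Uncompressed pixel data size in bytes (headers and color table excluded)."""
    return width * height * bits_per_pixel // 8


size_8bit = bitmap_pixel_bytes(640, 480, 8)    # one palette index per pixel
size_24bit = bitmap_pixel_bytes(640, 480, 24)  # three bytes per pixel
palette_entries = 2 ** 8                       # 256 colors in an 8-bit bitmap
colors_24bit = 2 ** 24                         # about 16 million colors
```

With these assumptions the 8-bit image needs 307,200 bytes of pixel data and the 24-bit image needs 921,600 bytes.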
Device-independent bitmap (DIB) format allows Windows to display the bitmap on any type of display device. The term "device independent" means that the bitmap specifies pixel color in a form independent of the method used by a display to represent color.
GIF
Interplatform format created by CompuServe.
One file can contain multiple images.
Uses the LZW (Lempel-Ziv-Welch) algorithm for bitmap compression. The patented algorithm requires a license for each use.
Two versions: 87a and 89a.
Allows images to be interlaced, have a transparent background, or be animated.
Extension blocks provide mechanisms for file annotation.
Can have only 256 colors in an image.
Run Length Encoding (RLE)
Earliest and simplest method of data compression.
A repeated string of characters is replaced by two bytes: the number of times the character appears and the character itself.
Basis for many CCITT group standards, where it is combined with Huffman coding.
Two-dimensional coding schemes: the first line is coded using a 1D scheme, and the next K lines are coded with reference to the first line.
The size of K varies over applications.
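The two-byte scheme described above (count, then character) can be sketched as follows. This is a minimal one-dimensional illustration in Python; the function names and the sample row are assumptions, and runs are capped at 255 so the count fits in one byte.

```python
def rle_encode(data: bytes) -> bytes:
    """Replace each run of a repeated byte with the pair (count, byte)."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte repeats, up to the one-byte limit.
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand (count, byte) pairs back into the original byte string."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        count, value = encoded[k], encoded[k + 1]
        out += bytes([value]) * count
    return bytes(out)


row = b"WWWWWWWWGGGW"      # hypothetical raster row: water, grass, water
packed = rle_encode(row)   # three runs, so six bytes
```

The 12-byte row packs into 6 bytes here. Data with few repeats would grow instead, since every run costs two bytes.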
Color Characteristics
Luminance or Brightness: Measure of the brightness of light emitted by an object. The human eye responds differently to different colors. Response is highest at a wavelength of 575 nm (yellow).
Hue: Sensation produced by the presence of certain wavelengths of color.
Saturation: Measure of the color intensity. Example: red and pink have the same predominant wavelength, but pink has more white.
Color Models
Chromaticity model:
3-D model which uses x and y for color and the third dimension for luminance. Additive model
RGB model:
Combines different intensities of red, green, and blue to generate various colors. Additive model
Lossy compression
Lossless compression is difficult, especially for continuous tone images, because of slight variations in color.
Lossy compression:
Based on the principle that the human eye sees finer detail in an image more because of brightness variations than because of color variations. Hence, certain color pixels can be dropped without any perceptible loss.
To determine which pixels should be dropped, the image is converted from the spatial domain to the frequency domain.
Most raster image formats use some form of lossy compression technique
JPEG compression
[Figure: JPEG compression pipeline. 8x8 image block -> apply DCT -> matrix quantization (using a quantizer table) -> entropy coding (using a Huffman table)]
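The first two stages of that pipeline, the DCT and the quantization of an 8x8 block, can be sketched in plain Python. This is an illustration only: the uniform quantizer table is a hypothetical stand-in for a real JPEG table, the level shift that JPEG applies before the DCT is omitted, and the Huffman stage is not shown.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def dct_2d(block):
    """Forward 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def quantize(coeffs, table):
    """Divide each frequency coefficient by its table entry and round."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]


flat_block = [[100] * N for _ in range(N)]     # uniform gray block
uniform_table = [[16] * N for _ in range(N)]   # hypothetical quantizer table
quantized = quantize(dct_2d(flat_block), uniform_table)
```

For a block with no spatial variation, all the energy lands in the single zero-frequency coefficient and every other entry quantizes to zero, which is what makes the subsequent coding stage so effective.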
An image encoded in a TIFF file is wholly defined by its tags, and the file format is highly extensible because additional features can be added simply by defining additional tags.
The TIFF file format specification defines more than 70 different types of tags. For example, tags are used for: image width in pixels; image height; a color table (if required); compression type.
A TIFF file can contain multiple images.
The TIFF file format is one of the best for transferring bitmaps across platforms, because it is flexible enough to allow virtually any image to be encoded in binary form without losing any of its attributes, visual or otherwise.
A digital orthophoto quadrangle (DOQ) or quarter quadrangle (DOQQ, 3.75 minute) is a georeferenced image developed from photographs and other data.
Displacements due to sensor orientation and terrain relief have been removed.
DOQQs have a ground pixel distance of 1 meter.
A DOQ is created by mosaicking DOQQs and other photography chips.
Photographs are exposed at 20,000 feet above mean terrain with a 6-inch focal length camera.
The photograph is scanned at a resolution of 7.5 to 30 micrometers (generally 25 micrometers).
A black and white quarter quad (QQ) generated from a 240 mm square photograph at 25 micrometers produces an image of 45 to 50 megabytes uncompressed, and yields a ground pixel of 1 meter.
A standard ASCII header is used. The file has the image stored west to east with north on top.
Uses the UTM coordinate system and the NAD 83 datum.
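The 1 meter ground pixel follows from the flying height, focal length, and scan resolution quoted above. The sketch below works through that arithmetic, assuming a vertical photograph so that photo scale equals flying height divided by focal length; the variable names are illustrative.

```python
FEET_TO_METERS = 0.3048
INCH_TO_METERS = 0.0254

flying_height_m = 20_000 * FEET_TO_METERS   # height above mean terrain
focal_length_m = 6 * INCH_TO_METERS         # 6-inch focal length camera

# Photo scale 1:N for a vertical photograph (assumed here).
scale_denominator = flying_height_m / focal_length_m

scan_pixel_m = 25e-6                        # 25 micrometer scan resolution
ground_pixel_m = scan_pixel_m * scale_denominator

# Number of scanned pixels along one 240 mm edge of the photograph.
pixels_per_side = 0.240 / scan_pixel_m
```

Under these assumptions the photo scale is about 1:40,000, each 25 micrometer scan pixel covers about 1 meter on the ground, and one edge of the photograph spans about 9,600 pixels.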
Digital Elevation Model (DEM) data files are digital representations of cartographic information in a raster form. DEMs consist of a sampled array of elevations for a number of ground positions at regularly spaced intervals. These digital cartographic/geographic data files are produced by the U.S. Geological Survey (USGS) as part of the National Mapping Program. DEM data for 7.5-minute units correspond to the USGS 7.5-minute topographic quadrangle map series for all of the United States and its territories
DEMs are generated using several techniques, such as interpolation, use of the hypsography layer, etc.
Some uses of DEMs:
Cut and fill volume estimation
Coarse contour maps
Line of sight (viewshed) maps
Shaded relief maps
Several models are used for storing information to a file, and hence appropriate software is required to read and analyze them.
DEM Examples
One grid cell is one unit or holds one attribute.
Every cell has a value, even if that value is a code for missing data.
A cell can hold a number or an index value standing for an attribute.
A cell has a resolution, given as the cell size in ground units.
A thematic map of grid cells where each cell represents the same theme is called a coverage.
Figure 3.1 Generic structure for a grid. Labels: grid extent, resolution, columns, rows, grid cell.
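The grid structure in the figure (extent, resolution, rows, columns, cells) can be sketched as a small Python class. The class name `Grid`, its fields, and the 30-unit cell size are assumptions made for illustration; the cell values reuse the "water dominates" example.

```python
from dataclasses import dataclass


@dataclass
class Grid:
    xmin: float        # west edge of the grid extent, in ground units
    ymax: float        # north edge of the grid extent, in ground units
    cell_size: float   # resolution: cell edge length in ground units
    values: list       # list of rows, top (north) row first

    @property
    def rows(self):
        return len(self.values)

    @property
    def columns(self):
        return len(self.values[0])

    def extent(self):
        """Return (xmin, ymin, xmax, ymax) covered by the grid."""
        return (self.xmin,
                self.ymax - self.rows * self.cell_size,
                self.xmin + self.columns * self.cell_size,
                self.ymax)

    def cell_at(self, x, y):
        """Return the value of the cell containing ground position (x, y)."""
        col = int((x - self.xmin) // self.cell_size)
        row = int((self.ymax - y) // self.cell_size)
        if not (0 <= row < self.rows and 0 <= col < self.columns):
            raise ValueError("point lies outside the grid extent")
        return self.values[row][col]


land_cover = Grid(xmin=0.0, ymax=90.0, cell_size=30.0,
                  values=[["W", "W", "G"],
                          ["W", "W", "G"],
                          ["W", "W", "G"]])
```

Because every cell has the same size, a ground coordinate maps to a row and column by simple division, with no search required.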
Water dominates: W W G / W W G / W W G
Edges separate: W E W / E E E / G G G
IDRISI calls raster files images. Each image consists of a defined count of rows and columns, thus forming cells. These cells are stored as a sequence of numbers (byte, integer, or real) representing values (vegetation class codes, reflectance numbers, political units, z-values in a DEM, ...).
A raster in IDRISI carries no information about itself; it stores that metadata separately, in so-called raster documentation files (*.DOC). All images must have their corresponding DOC files. These are ASCII files made up of a sequence of lines, each representing metadata.
The values may represent some code for land usage. IDRISI starts in the upper-left corner (row 0, column 0), then advances column by column and row by row. In the simplest format, ASCII, the cell values are stored one per line.
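The storage order described above (upper-left first, column by column, then row by row) can be sketched as follows. This is not an IDRISI reader: the function names are hypothetical, and the row and column counts are passed in directly, whereas in practice they would come from the documentation file.

```python
def read_ascii_image(lines, rows, columns):
    """Rebuild a row-by-row image from ASCII cell values stored one per line."""
    values = [int(line) for line in lines if line.strip()]
    if len(values) != rows * columns:
        raise ValueError("cell count does not match rows x columns")
    return [values[r * columns:(r + 1) * columns] for r in range(rows)]


def cell_offset(row, column, columns):
    """Position of cell (row, column) in the stored sequence, counting from 0."""
    return row * columns + column


# Hypothetical land-use codes for a 2-row by 3-column image.
ascii_lines = ["1", "1", "2", "1", "3", "2"]
image = read_ascii_image(ascii_lines, rows=2, columns=3)
```

Here the value on the fifth line (offset 4) is the cell at row 1, column 1.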
[Figure: quadtree subdivision with cells labeled by quadrant number (0 to 3); example cell 210]
Pros
Can be easily created from existing pixel data in memory.
Pixel values can be modified individually or in a group by using a palette.
Translate well to CRT-based output.
Cons
Can be very large, based on size and number of colors.
Do not scale very well. Decimation (throwing away pixels) may make an image unacceptable.
Vectors are line segments minimally defined by a starting point, a direction, and a length.
A vector data model uses points stored by their real (earth) coordinates.
Lines and areas are built from sequences of points in order. Lines have a direction based on the ordering of the points.
Straight lines, curved lines, and simple shapes can be used to create more complex shapes.
Vector models can store information about topology.
Topological model
A line is a segment between two points (vertices).
A link (or arc or chain) is a connection between two nodes. A link may consist of several lines which are joined at points (vertices).
Links can only originate, terminate, or be connected at nodes.
A point (vertex) is where a line originates or terminates.
A node is where a link originates or terminates.
A polygon (area) is composed of links. Adjacent polygons have only one link between them.
Direction
All links have a direction (the FROM node and TO node)
Connectivity
Keep track of which links are connected at a node
Adjacency
A link can determine the polygon to its left and its right
Nestedness
Which nodes, links, and other polygons are within a polygon
These characteristics allow the software to determine the relationships between the individual graphic objects as well as values for length, perimeter, area, etc.
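Direction, connectivity, and adjacency can all be read from a link table that stores a FROM node, a TO node, and the polygons to the left and right of each link. The sketch below is a minimal illustration with invented identifiers (two polygons, A and B, sharing link 1); it is not the layout of any particular GIS file format.

```python
from collections import namedtuple

Link = namedtuple("Link", "link_id from_node to_node left_poly right_poly")

# Hypothetical topology: polygons A and B share link 1.
links = [
    Link(1, "n1", "n2", "A", "B"),
    Link(2, "n2", "n1", "A", "outside"),
    Link(3, "n2", "n1", "outside", "B"),
]


def links_at_node(node):
    """Connectivity: ids of the links that start or end at the node."""
    return [l.link_id for l in links if node in (l.from_node, l.to_node)]


def neighbors(polygon):
    """Adjacency: polygons found across any link that bounds the polygon."""
    found = set()
    for l in links:
        if l.left_poly == polygon:
            found.add(l.right_poly)
        elif l.right_poly == polygon:
            found.add(l.left_poly)
    return found


def boundary(polygon):
    """Ids of the links that make up the polygon's boundary."""
    return [l.link_id for l in links if polygon in (l.left_poly, l.right_poly)]
```

None of these queries touches a coordinate; the relationships come entirely from the stored topology.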
The map is maintained as a conceptual model. It is a one-for-one translation of the analog map.
Imagine covering each graphic object on the analog map with a piece of spaghetti, where each piece acts as a single entity, without any structure between the pieces.
Each entity is a single, logical record coded as a variable-length string of (X,Y) coordinate pairs.
A polygon is a closed-loop coordinate string.
No two adjacent polygons share the same spaghetti string, hence the shared boundary is stored twice.
Lack of topology increases computational overhead.
Generally used where analysis is not important.
Good for plotting.
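The overhead can be seen in a small sketch: with polygons stored as independent closed coordinate strings, finding a shared boundary means comparing the coordinates of both strings. The two unit squares and the function names below are assumptions made for illustration.

```python
def segments(ring):
    """Return each edge of a closed coordinate string as an unordered point pair."""
    return {frozenset((a, b)) for a, b in zip(ring, ring[1:])}


def shared_edges(ring_a, ring_b):
    """Edges present in both strings; found only by comparing coordinates."""
    return segments(ring_a) & segments(ring_b)


def ring_area(ring):
    """Area of a closed coordinate string, by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Two adjacent unit squares; the edge from (1, 0) to (1, 1) is stored twice.
west = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
east = [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)]
common = shared_edges(west, east)
```

Plotting either polygon needs nothing more than its own string, which is why the model remains convenient for output even though adjacency has to be rediscovered each time.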
Copyright: ESRI
The best known topological model is the Geographic Base File / Dual Independent Map Encoding (GBF/DIME) model.
Developed by the US Census Bureau to store street map data for the decennial census.
Street addresses and UTM coordinates of each link are defined, permitting street addresses to be accessed by geographic coordinates.
Suffers from the same problems as the spaghetti model, because the program must perform a sequential search to find a particular link (ex: a street is broken at intersections).
Topologically Integrated Geographic Encoding and Referencing System (TIGER)
Designed for use in the 1990 census.
Points, lines, and areas can be explicitly addressed.
63
[Figure: example TIGER map and files. The map shows First St., Second St., Third St., Avenue A, Avenue B, Avenue C, Avenue D, Lake Drive, and a lake, with numbered nodes, links, and blocks. Files: zero cells (nodes with (x,y) values), one cells (links 1 to 18 with addresses), two cells (Lake; blocks 86, 87, 88, 89, 90, 91). From Clarke, p. 91.]
Digital Line Graph (DLG)
Developed by the USGS to produce digital versions of the 7.5 minute and 15 minute Topographic Map series.
Uses the currently available 7.5 minute topographic maps as a base.
Information is separated into layers, such as hydrography, transportation, etc.
In addition to topological data, points, lines, and areas have attribute codes attached to them, consisting of a three-digit major code and a four-digit minor code. Ex: 050 0200 is a Shoreline on the hydrography layer.
The file also has header records for metadata information.
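The major/minor attribute code can be split and looked up with a few lines of Python. This sketch assumes the code is written as two space-separated digit groups, as in the example above; the function name is hypothetical and the lookup table holds only the single code quoted in the text.

```python
def parse_attribute_code(text):
    """Split 'MMM NNNN' into integer (major, minor) codes."""
    major, minor = text.split()
    if len(major) != 3 or len(minor) != 4 or not (major + minor).isdigit():
        raise ValueError("expected a three-digit major and four-digit minor code")
    return int(major), int(minor)


# Only the one code mentioned in the text; a real table would be far larger.
KNOWN_CODES = {(50, 200): "Shoreline (hydrography layer)"}

code = parse_attribute_code("050 0200")
description = KNOWN_CODES.get(code, "unknown")
```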
Topology Matters
Topology allows automated error detection and elimination.
The tolerances controlling snapping, elimination, and merging must be considered carefully, because they can move features.
Complete topology makes map overlay feasible.
Topology allows many GIS operations to be done without accessing the point files. This can mean considerable speed improvements for many operations. Examples: bounding box tests, route analysis, polygon neighbor (contiguity) operations.
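A bounding box test is a good example of an operation that avoids the point files: once each feature's box (xmin, ymin, xmax, ymax) is stored, two features can be ruled out as overlapping by comparing four numbers. The sketch below is illustrative; the function names and coordinates are assumptions.

```python
def bounding_box(points):
    """Compute (xmin, ymin, xmax, ymax) once, when the feature is built."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def boxes_overlap(a, b):
    """True if two boxes share any area or edge; no point data is needed."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Hypothetical features; only their stored boxes are used in the test.
lake_box = bounding_box([(2, 2), (6, 3), (5, 7), (1, 5)])
park_box = bounding_box([(5, 6), (9, 6), (9, 9), (5, 9)])
road_box = bounding_box([(10, 0), (12, 1)])
```

Features whose boxes do not overlap cannot intersect, so the expensive point-by-point comparison is needed only for the pairs that pass this test.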
[Figure: unsnapped node. From Clarke.]
[Figure: bounding rectangle with corner (xmin, ymin). From Clarke.]
Network Analysis
Network models are built on top of topological models to establish routes between connected nodes.
Data associated with individual routes is stored in tables associated with the route.
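A route between connected nodes can be found from a link table with a shortest-path search. The sketch below uses Dijkstra's algorithm, which is one common choice rather than something the slides prescribe; the table contents, the cost column, and the function name are assumptions for illustration.

```python
import heapq

# Hypothetical route table: link id -> (from node, to node, travel cost).
route_table = {
    1: ("A", "B", 4.0),
    2: ("B", "C", 3.0),
    3: ("A", "C", 9.0),
    4: ("C", "D", 2.0),
}


def shortest_route(table, start, goal):
    """Return (total cost, node list) for the cheapest route, or None."""
    graph = {}
    for a, b, cost in table.values():
        # Links are treated as two-way for this sketch.
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, []):
            candidate = dist + cost
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                previous[nxt] = node
                heapq.heappush(queue, (candidate, nxt))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[goal], path[::-1]


result = shortest_route(route_table, "A", "D")
```

The search works only on node connectivity and the per-link cost stored in the table, which is why it sits naturally on top of a topological model.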
In vector models, the space between the graphical entities is implied.
Volumes (continuous surfaces) are represented with the Triangulated Irregular Network (TIN) model, including edge or triangle topology.
It allows surface models to be generated efficiently to analyze and display terrain and other types of surfaces.
The fundamental building block of the TIN data model is the node. A surface is modeled by placing irregularly spaced nodes that act as vertices.
Each node has an explicit topographic value.
Nodes are connected to their nearest neighbors by edges, according to a set of rules, to represent an area of uniform topography.
Finally, the TIN model creates a network of triangles by storing the topological relationships of the triangles.
TINs use an optimal Delaunay triangulation of a set of irregularly distributed points.
TINs are popular in CAD and surveying packages.
Creating a TIN
Initially approximate the map by a square with 2 triangles, 4 points, and 5 edges.
Find the most deviant point in either triangle and split that triangle into 3 by inserting a new point and 3 edges.
Next, check all quadrilaterals composed of a new triangle and an old triangle to see if the diagonal should be swapped (based on certain criteria).
Finally, find the new most deviant point and repeat.
From: W.R.Franklin, RPI.edu
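One refinement pass of the procedure above can be sketched in Python. This is a simplified illustration, not Franklin's implementation: deviation is measured as the vertical distance between a sample elevation and the triangle's plane, the diagonal-swap check is omitted entirely, and the sample points and function names are invented for the example.

```python
def barycentric(tri, x, y):
    """Barycentric weights of (x, y) in a triangle of (x, y, z) vertices."""
    (x1, y1, _), (x2, y2, _), (x3, y3, _) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)  # zero if degenerate
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def most_deviant(tri, samples):
    """Sample inside tri whose z differs most from the triangle's plane."""
    worst, worst_err = None, 0.0
    for x, y, z in samples:
        weights = barycentric(tri, x, y)
        if min(weights) < -1e-12:
            continue  # sample lies outside this triangle
        plane_z = sum(w * v[2] for w, v in zip(weights, tri))
        err = abs(z - plane_z)
        if err > worst_err:
            worst, worst_err = (x, y, z), err
    return worst, worst_err


def refine_once(triangles, samples):
    """Split the triangle holding the most deviant sample into three."""
    best = None
    for index, tri in enumerate(triangles):
        point, err = most_deviant(tri, samples)
        if point is not None and (best is None or err > best[0]):
            best = (err, index, point)
    if best is None:
        return triangles
    _, index, p = best
    a, b, c = triangles[index]
    rest = triangles[:index] + triangles[index + 1:]
    return rest + [(a, b, p), (b, c, p), (c, a, p)]


# Initial square: 4 corner points, 2 triangles sharing one diagonal.
c0, c1, c2, c3 = (0, 0, 0), (10, 0, 0), (10, 10, 0), (0, 10, 0)
start = [(c0, c1, c2), (c0, c2, c3)]
samples = [(7, 2, 5), (8, 3, 1), (2, 7, 2)]   # hypothetical elevations
refined = refine_once(start, samples)
```

After one pass the square holds four triangles, and the inserted point no longer deviates because it is now a vertex. Repeating the pass, with the swap check added, would continue the refinement.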
Page description languages (PDLs), such as PostScript and PDF, are used for output on print or display devices and are not true graphics file formats.
A DXF file consists of up to seven sections: header, tables, blocks, classes, objects, entities, and end-of-file.
A DXF file consists of group codes and associated values. For example: code 9 introduces the name of a header variable, and code 999 marks a comment.
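In an ASCII DXF file each group code sits on one line and its value on the next, so the file can be read as a sequence of pairs. The sketch below parses a tiny hand-written fragment; the fragment and the function name are illustrative assumptions, and code 9 is treated as introducing a header variable name.

```python
def read_group_pairs(text):
    """Return a list of (group code, value) pairs from ASCII DXF text."""
    lines = [line.strip() for line in text.splitlines()]
    pairs = []
    for i in range(0, len(lines) - 1, 2):
        pairs.append((int(lines[i]), lines[i + 1]))
    return pairs


# Hand-written illustrative fragment, not taken from a real drawing.
sample = """999
illustrative comment
0
SECTION
2
HEADER
9
$ACADVER
1
AC1009
0
ENDSEC
0
EOF"""

pairs = read_group_pairs(sample)
comments = [value for code, value in pairs if code == 999]
header_variables = [value for code, value in pairs if code == 9]
```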
Raster Models:
Good for surface representation, whereas TIN models must be used to represent surfaces in vector models.
Easy to conceptualize space representation.
Allow easy integration of image data (satellite, remotely sensed, etc.).
Do not provide precise locational and area computation information, because of the grid cells.
Require large storage capacity.
Blocky appearance when the image is viewed in detail.
Vector
Provides precise locational information of points.
Vectors can represent point, line, and area features very accurately. Hence measurements are very accurate.
Topological models enable many types of analysis.
Vectors are far more efficient in storage than grids.
Vectors work well with pen and light-plotting devices and tablet digitizers.
Vectors are not good at continuous coverages or plotters that fill areas.
Spatial analysis is difficult.
Image data overlay requires special image tools.
Data Exchange
A GIS is based on either a vector model or a raster model. Within each, it may use (import) many file formats and convert them to its internal model or data structure.
A vector-based GIS may support raster-based layers (and vice versa), but most analysis tools will be based on its native model (either raster or vector).
Changing vector to raster is less time consuming and simpler than raster to vector. Loss of data and inaccuracies may result with either conversion.
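Vector-to-raster conversion of a single line segment can be sketched with Bresenham's line algorithm, one standard way to pick the grid cells along a line; the slides do not name a specific method, and the grid size and endpoints below are illustrative.

```python
def rasterize_line(x0, y0, x1, y1):
    """Bresenham's line algorithm: grid cells (col, row) along the segment."""
    cells = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        cells.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:   # step in x
            err += dy
            x0 += sx
        if e2 <= dx:   # step in y
            err += dx
            y0 += sy
    return cells


def to_grid(cells, columns, rows):
    """Mark the chosen cells with 1 in an otherwise empty grid."""
    grid = [[0] * columns for _ in range(rows)]
    for col, row in cells:
        grid[row][col] = 1
    return grid


cells = rasterize_line(0, 0, 4, 2)
grid = to_grid(cells, columns=5, rows=3)
```

The five marked cells only approximate the original segment, which illustrates the inaccuracy the conversion introduces: going back from these cells to a vector line cannot recover the exact endpoints or slope.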
Data Exchange
Data are also often exchanged or transferred between entirely different GIS packages and computer systems.
In the past, GIS data exchange has been isolated. However, with the development and use of Open Standards for systems, architecture, databases, and interfaces, handshaking between GISs of various types is now possible.
Also, the use of GIS as a component of an Integrated System (Enterprise system) has made it necessary to share and exchange data.
Blind data exchange by translation (export and import) can lead to significant errors in attributes and in geometry.
In the United States, the Spatial Data Transfer Standard (SDTS) was developed to facilitate data transfer. It became a federal standard (FIPS 173) in 1992. Website: http://mcmcweb.er.usgs.gov/sdts/index.html
SDTS is quite complex and contains a terminology, a set of references, a list of features, a transfer mechanism, and an accuracy standard.
Both DLG and TIGER data are available in SDTS format.