Module 4. Vector Data
Learning Objectives
Describe what vector data is and identify types of vector data.
Identify an appropriate spatial operation for a given task or question.
Explain different types of joins and their potential uses.
Differentiate between Spatial Operations and Spatial Analysis.
Lecture Slides
Assignments
Overview
As we learned earlier, geospatial data can be stored in vector data models. Vector data is represented by vertices: discrete geometric coordinates (x,y). Vertices can be connected to one another through arcs, or edges, each defined by two end nodes. A point is a zero-dimensional location defined by a single vertex. A tree could be represented by a point. A line is a collection of two or more vertices that are connected to one another. Lines are one-dimensional. Rivers are often represented by lines in GIS. Finally, a polygon is a collection of three or more vertices that are connected to form a closed shape. A polygon is a two-dimensional feature. Building footprints are commonly stored as polygon features.
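As a minimal sketch of these three geometry types, the Python snippet below stores a point, a line, and a polygon as plain (x,y) tuples. The feature names and coordinates are made up for illustration, and the helper functions line_length and polygon_area are hypothetical names rather than part of any GIS library. It assumes planar coordinates, so lengths and areas are in the same units as the coordinates.

```python
import math

# Hypothetical features with made-up planar (x, y) coordinates.
tree = (2.0, 3.0)                                  # point: a single vertex
river = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]       # line: two or more connected vertices
building = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]  # polygon: closed ring


def line_length(vertices):
    """Sum the straight-line distance between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Planar area via the shoelace formula.

    The ring is closed implicitly: the last vertex connects back to the first.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(line_length(river))      # 8.0
print(polygon_area(building))  # 12.0
```

The point has no length or area, the line has length but no area, and only the polygon has an area, which mirrors the zero-, one-, and two-dimensional description above.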
Benefits of Vector Data
Both raster and vector data models have strengths and weaknesses. We talked about these briefly in the previous module. Vector data is a strong choice when the features you want to represent have discrete boundaries, such as buildings. Vector data can also be stored at its original resolution without generalization, so geographic location can be recorded precisely. The vector format makes it possible to store multiple attributes for each feature. For example, you might have a vector dataset that contains building footprints where each footprint has attributes related to its use, size, and age. This is more difficult to capture in raster data models. Finally, it is possible to record the topological relationships of vector data.
The two most common data structures used for vector data are the spaghetti model and the topological data model. The spaghetti model represents points, lines, and polygons as strings of (x,y) coordinate pairs. This model does not have an inherent structure, thus the name spaghetti model. Each polygon in the spaghetti model is represented by its own set of (x,y) coordinate pairs. This means that even when two polygons share a border, the border is recorded separately for each polygon. Topological data models, by contrast, include inherent information about the spatial relationships between vector features.
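The difference between the two models can be sketched with two side-by-side square parcels. The layout, the identifiers (A, B, a_shared, and so on), and the helper ring_edges are all hypothetical; real topological formats differ in their details. The sketch only shows that the shared border is stored twice in the spaghetti version and once in the topological version.

```python
# Spaghetti model: every polygon carries its own complete coordinate list,
# so the border between A and B, (1, 0)-(1, 1), appears in both lists.
spaghetti = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}

# Topological-style model (simplified): coordinates are stored once as nodes,
# arcs are sequences of node ids, and polygons are lists of arc ids.
nodes = {1: (0, 0), 2: (1, 0), 3: (2, 0), 4: (2, 1), 5: (1, 1), 6: (0, 1)}
arcs = {
    "a_shared": [2, 5],        # the common border, stored a single time
    "a_left": [5, 6, 1, 2],    # remaining boundary of A
    "a_right": [2, 3, 4, 5],   # remaining boundary of B
}
polygons = {"A": ["a_left", "a_shared"], "B": ["a_right", "a_shared"]}


def ring_edges(ring):
    """Return the undirected edges of a ring whose last vertex joins the first."""
    return {frozenset(pair) for pair in zip(ring, ring[1:] + ring[:1])}


# In the spaghetti model a shared border must be discovered by comparing coordinates.
shared_edges = ring_edges(spaghetti["A"]) & ring_edges(spaghetti["B"])

# In the topological model the shared border is explicit in the structure.
shared_arcs = set(polygons["A"]) & set(polygons["B"])

spaghetti_coords = sum(len(ring) for ring in spaghetti.values())

print(shared_edges)                  # the single edge between (1, 0) and (1, 1)
print(shared_arcs)                   # {'a_shared'}
print(spaghetti_coords, len(nodes))  # 8 6
```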
Topology
Topology is a set of rules and behaviors that establish how points, lines, and polygons share coincident geometry. Topology is used to ensure data quality with respect to how features are connected to one another and where they sit relative to one another. There are four main types of topological relationships: adjacency, connectivity, containment, and coincidence. Adjacency (Contiguity) is the topological relationship in which two or more polygons share a common boundary; because each arc has a direction, the polygons on its left and right sides can be identified. Connectivity is the topological relationship of coincident arcs and nodes. Containment (Area) refers to the creation of boundaries through the connection of arcs to form polygons. Coincidence occurs when features share the same (x,y) coordinates but do not have connectivity. For example, street lines can have coincident geometry with census block boundaries.
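One way to picture adjacency and connectivity is an arc table in which each directed arc records its start node, end node, and the polygons on its left and right. The table below is a hypothetical, simplified sketch for two square parcels A and B that share one border; the field names are assumptions and do not follow any particular GIS format.

```python
from collections import defaultdict

# Hypothetical arc table. Arc a1 runs north along the shared border, so the
# western parcel A is on its left and the eastern parcel B is on its right.
# Arcs a2 and a3 trace the outer boundaries counterclockwise (intermediate
# vertices omitted), so the parcel is on the left and the outside (None) on the right.
arc_table = [
    {"arc": "a1", "from": 2, "to": 5, "left": "A", "right": "B"},
    {"arc": "a2", "from": 5, "to": 2, "left": "A", "right": None},
    {"arc": "a3", "from": 2, "to": 5, "left": "B", "right": None},
]

# Adjacency: two polygons are adjacent when an arc has one of them on each side.
adjacent_pairs = {
    frozenset((row["left"], row["right"]))
    for row in arc_table
    if row["left"] is not None and row["right"] is not None
}

# Connectivity: arcs are connected when they meet at the same node.
arcs_at_node = defaultdict(set)
for row in arc_table:
    arcs_at_node[row["from"]].add(row["arc"])
    arcs_at_node[row["to"]].add(row["arc"])

print(adjacent_pairs)         # one pair containing 'A' and 'B'
print(sorted(arcs_at_node[2]))  # ['a1', 'a2', 'a3']
```

Neither relationship required comparing coordinates, which is the practical advantage of storing topology explicitly.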
Many GISystems have tools available for evaluating topological relationships and ensuring data integrity. Topological integrity is important when trying to perform an analysis based on spatial relationships between features.
Topological Errors
Topological errors can occur during digitization or raster-to-vector conversion. They cause errors in analyses when the relationship between features is incorrectly defined. For example, network analysis, such as calculating shortest-path distances, requires valid topology in order for the algorithm to determine the elements of a network and their connectivity. Error propagation arises when inaccuracies in the original data are carried through to the output layer, so topological errors in the input can lead to errors in the final product. It is therefore important to evaluate the topological accuracy of data before running analyses that depend on spatial relationships.
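A common digitizing error is an undershoot, where a line stops just short of the node it should connect to. The sketch below flags endpoints that touch only one line (dangles) and then snaps endpoints together within a tolerance. The street coordinates, the tolerance value, and the function names dangling_endpoints and snap_endpoints are all hypothetical; GIS packages provide their own topology validation tools with more careful rules.

```python
import math
from collections import Counter

# Hypothetical street segments that should form a closed loop.
streets = [
    [(0.0, 0.0), (5.0, 0.0)],
    [(5.0, 0.0), (5.0, 5.0)],
    [(5.0, 5.0), (0.0, 5.0)],
    [(0.0, 5.0), (0.0, 0.1)],  # stops 0.1 units short of (0, 0): an undershoot
]


def dangling_endpoints(lines):
    """Return endpoints used by exactly one line.

    Note: a genuine dead-end street would also be reported, so the result
    is a list of candidates to review, not a list of confirmed errors.
    """
    counts = Counter(pt for line in lines for pt in (line[0], line[-1]))
    return sorted(pt for pt, n in counts.items() if n == 1)


def snap_endpoints(lines, tolerance):
    """Move each endpoint onto the first earlier endpoint within the tolerance."""
    anchors = []
    snapped = []
    for line in lines:
        new_line = list(line)
        for idx in (0, -1):
            pt = new_line[idx]
            match = next((a for a in anchors if math.dist(a, pt) <= tolerance), None)
            if match is None:
                anchors.append(pt)
            else:
                new_line[idx] = match
        snapped.append(new_line)
    return snapped


print(dangling_endpoints(streets))                       # [(0.0, 0.0), (0.0, 0.1)]
print(dangling_endpoints(snap_endpoints(streets, 0.5)))  # []
```

Without the snapping step, a shortest-path algorithm would treat the loop as broken at the gap, which is how a small digitizing error propagates into the analysis result.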
Vector Analysis
Vector data analysis encompasses a wide range of spatial analyses that utilize geometric objects (points, lines, polygons). Vector data analysis falls under the umbrella term geoprocessing, a collection of operations that allow users to perform spatially explicit analyses on one or more datasets to create a new dataset. A common geoprocessing operation is buffering. Buffering creates a zone around a point, line, or polygon that encompasses all of the area within a specified distance of the feature or collection of features. For example, an analyst might create a 1-mile buffer around a school and then determine how many parks are located within that buffer. This information can be used to make decisions about school siting, land use planning, and community development.
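The school and parks example can be sketched in a few lines of Python. The locations are invented, the coordinates are assumed to be planar with units of miles, and parks are simplified to points. The functions within_buffer and buffer_polygon are hypothetical helpers; a GIS would build the buffer geometry for you and handle lines and polygons as well.

```python
import math

# Hypothetical planar coordinates in miles, with the school at the origin.
school = (0.0, 0.0)
parks = {
    "Oak Park": (0.5, 0.5),
    "Elm Park": (1.2, 0.0),
    "Pine Park": (-0.3, 0.4),
}
BUFFER_MILES = 1.0


def within_buffer(center, point, distance):
    """A point lies inside a point buffer when it is within the buffer distance."""
    return math.dist(center, point) <= distance


def buffer_polygon(center, distance, segments=32):
    """Approximate the circular buffer as a polygon with the given number of vertices."""
    cx, cy = center
    return [
        (cx + distance * math.cos(2 * math.pi * i / segments),
         cy + distance * math.sin(2 * math.pi * i / segments))
        for i in range(segments)
    ]


parks_in_buffer = sorted(
    name for name, xy in parks.items() if within_buffer(school, xy, BUFFER_MILES)
)
buffer_ring = buffer_polygon(school, BUFFER_MILES)

print(parks_in_buffer)   # ['Oak Park', 'Pine Park']
print(len(buffer_ring))  # 32
```

With real data, the coordinates would first need to be in a projected coordinate system so that the buffer distance is measured in linear units rather than degrees.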
Overlay analysis is another form of geoprocessing where multiple data layers are overlaid to find combinations of data attributes. An example of using overlay analysis in GIS is determining the areas of land that are vulnerable to flooding. In this case, a GIS analyst would overlay a map of flood zones on top of a map of land use. This would allow the analyst to see which areas, such as residential neighborhoods or industrial parks, are located in flood-prone areas and may be at risk for flooding. There are many different types of geoprocessing tools available in GISystems. The table below outlines some of the most common ones.
| Tool | Description | Example use |
| --- | --- | --- |
| Buffer | Define a region by establishing a boundary at a given distance from a feature | Isolating riparian zones along a river |
| Overlay | Overlay multiple data layers to find attribute combinations | Habitat suitability |
| Clip | Reduce a dataset to the boundary of another dataset | Reducing data size by keeping only the features inside a study area |
| Merge | Combine two datasets of the same type | Combine two location point files (Dunkin Donuts and Starbucks) to evaluate the location of coffee shops |
| Dissolve | Unify boundaries based on a common attribute value | Combine census blocks based on average income |
| Intersect | Overlay in which only the areas where the input layers overlap are output | Combine residential locations with a flood zone map to evaluate the parcels within different flood zones |
| Union | Combine two layers while maintaining all input feature boundaries and attributes | Combine land use and zoning data into one dataset |
| Erase/Difference | Remove the area of one dataset from another dataset | Erase a forest fire extent from a land cover dataset to examine the extent of fragmentation |
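Two of these tools, Intersect and Merge, can be sketched with deliberately simplified data. The sketch below assumes every polygon is an axis-aligned rectangle stored as (xmin, ymin, xmax, ymax), which keeps the intersection logic to a few lines; real polygons need a geometry library. All datasets, attribute names, and the helpers intersect and area are hypothetical.

```python
def intersect(a, b):
    """Return the rectangle where a and b overlap, or None if they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def area(rect):
    """Area of an (xmin, ymin, xmax, ymax) rectangle."""
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


# Intersect: keep only the overlapping area and carry attributes from both layers.
land_use = [
    {"use": "residential", "geom": (0, 0, 4, 4)},
    {"use": "industrial", "geom": (4, 0, 8, 4)},
]
flood_zones = [{"zone": "100-year", "geom": (3, 1, 6, 3)}]

flooded_land = []
for lu in land_use:
    for fz in flood_zones:
        geom = intersect(lu["geom"], fz["geom"])
        if geom is not None:
            flooded_land.append(
                {"use": lu["use"], "zone": fz["zone"], "geom": geom, "area": area(geom)}
            )

# Merge: combine two point datasets of the same type into one dataset,
# tagging each feature with its source so the brands stay distinguishable.
dunkin = [{"xy": (1.0, 2.0)}, {"xy": (5.0, 1.0)}]
starbucks = [{"xy": (2.5, 3.0)}]
coffee_shops = (
    [dict(shop, brand="Dunkin Donuts") for shop in dunkin]
    + [dict(shop, brand="Starbucks") for shop in starbucks]
)

for row in flooded_land:
    print(row["use"], row["zone"], row["area"])  # residential 100-year 2, industrial 100-year 4
print(len(coffee_shops))                         # 3
```

Each output feature of the intersect carries attributes from both inputs (land use and flood zone), which is what makes overlay useful for finding attribute combinations.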
GIScience of Vector Data
GIScience research related to vector data aims to develop new methods and techniques for managing, analyzing, and visualizing vector data, enhancing the quality and accuracy of vector data, and utilizing the information within vector data to make informed decisions.
Geographic Information Science addresses vector data from a variety of perspectives. For example, research in spatial data modeling addresses the development of models for representing and storing vector data. Spatial data analysis focuses on the development of new spatial statistical methods. Spatial data integration involves developing methods for integrating multisource vector data, while machine learning research involves developing methods for extracting information from geographic data using machine learning algorithms. Another interesting area of research concerning vector data is uncertainty quantification and visualization, which examines topics such as error propagation and uncertainty modeling.
Readings
You may need to obtain these from the University of Illinois Library.
Ricker, B. A., Rickles, P. R., Fagg, G. A., & Haklay, M. E. (2020). Tool, toolmaker, and scientist: Case study experiences using GIS in interdisciplinary research. Cartography and Geographic Information Science, 47(4), 350-366. https://doi.org/10.1080/15230406.2020.1748113
Heikinheimo, V., Tenkanen, H., Bergroth, C., Järv, O., Hiippala, T., & Toivonen, T. (2020). Understanding the use of urban green spaces from user-generated geographic information. Landscape and Urban Planning, 201, 103845. https://doi.org/10.1016/j.landurbplan.2020.103845