Analytical Methods

UNDERSTANDING OF DATA SELECTION QUERIES AND VIEWS

UNDERSTANDING OF TECHNIQUES AND IMPLICATIONS OF DATA CLASSIFICATION

UNDERSTANDING OF ANALYTICAL OPERATIONS AND METHODS

KNOWLEDGE OF MAP ALGEBRA

KNOWLEDGE OF DESCRIPTIVE AND SPATIAL STATISTICS

UNDERSTANDING OF DATA SELECTION QUERIES AND VIEWS

Data selection and queries are fundamental for extracting relevant information from both spatial and non-spatial datasets.

KEY CONCEPTS AND TERMINOLOGY

Data selection:
- Selection involves choosing a subset of features (points, lines, polygons) or records (rows) from a dataset based on specific criteria.
- Common scenarios for data selection include:
  - Spatial Selection: Choosing features within a defined area (e.g., selecting all buildings within a city boundary).
  - Attribute Selection: Filtering features based on attribute values (e.g., selecting all roads with a speed limit above 40 mph).
- Tools for Data Selection:
  - Select by Location: Select features based on their spatial relationship to other features (e.g., selecting all parks intersecting a river).
  - Select by Attributes: Choose features based on attribute conditions (e.g., selecting all parcels with a land use of “residential”).
  - Interactive Selection: Manually select features using the mouse or touch interface.
Querying in GIS:
- Querying involves asking questions about geographic features and their attributes.
- Queries help retrieve specific information from a dataset.
- Types of Queries:
  - Query by Attribute: Retrieve features based on attribute values (e.g., finding all hospitals with more than 100 beds).
  - Query by Geography: Retrieve features based on their spatial location (e.g., finding all rivers within a specific distance of a road).
SQL Expressions in GIS:
- Many GIS applications, such as ArcGIS and QGIS, support standard SQL expressions for querying.
- You can build WHERE clauses to filter data based on field values (e.g., STATE_NAME = 'Alabama').
- Subqueries and compound queries are also supported.
- Different SQL dialects are used depending on the data source (file-based, SQL Server, MS Access, ArcSDE geodatabase).
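The WHERE clauses, compound queries, and subqueries described above can be sketched with Python's built-in sqlite3 module. The Roads table, its columns, and its values are invented for illustration, and SQLite is only one SQL dialect; a real GIS data source may differ in syntax details.

```python
import sqlite3

# Hypothetical in-memory table; names and values are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Roads (RoadName TEXT, SpeedLimit INTEGER, State TEXT)")
conn.executemany(
    "INSERT INTO Roads VALUES (?, ?, ?)",
    [
        ("Main St", 25, "Alabama"),
        ("Highway 9", 55, "Alabama"),
        ("River Rd", 45, "Georgia"),
        ("Oak Ave", 40, "Georgia"),
    ],
)

# Simple WHERE clause on a single field.
fast = conn.execute(
    "SELECT RoadName FROM Roads WHERE SpeedLimit > 40 ORDER BY RoadName"
).fetchall()

# Compound query: two conditions joined with AND.
fast_alabama = conn.execute(
    "SELECT RoadName FROM Roads WHERE SpeedLimit > 40 AND State = 'Alabama'"
).fetchall()

# Subquery: roads whose speed limit is above the average of all roads (41.25).
above_avg = conn.execute(
    "SELECT RoadName FROM Roads "
    "WHERE SpeedLimit > (SELECT AVG(SpeedLimit) FROM Roads) "
    "ORDER BY RoadName"
).fetchall()

print(fast)          # [('Highway 9',), ('River Rd',)]
print(fast_alabama)  # [('Highway 9',)]
print(above_avg)     # [('Highway 9',), ('River Rd',)]
```

Note that the string literal uses straight single quotes; curly quotes copied from a word processor will usually cause a syntax error.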
Benefits of Data Selection and Queries:
- Efficiency: Selecting relevant data reduces the volume of information to work with.
- Precision: Queries allow you to pinpoint specific features or records.
- Analysis: Data selection and queries support spatial analysis, visualization, and decision-making.
Database View: A database view is a virtual representation of data stored in one or more database tables. It is essentially a named query saved within the database that persists and can be called upon when needed.
- Views encapsulate complex joins, calculations, and aggregations.
- Users can query views as if they were regular tables.
- Views ensure consistent data presentation across different applications.
- Changes to the underlying tables automatically reflect in the view results.
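To make the last point concrete, here is a minimal sketch of a view using Python's sqlite3 module. The Parcels table and the LandUseSummary view are hypothetical names chosen for this example. The view stores the query, not the rows, so a new record in the underlying table shows up the next time the view is queried.

```python
import sqlite3

# Hypothetical table and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parcels (ParcelId INTEGER, LandUse TEXT, Acres REAL)")
conn.executemany(
    "INSERT INTO Parcels VALUES (?, ?, ?)",
    [(1, "residential", 0.5), (2, "commercial", 2.0), (3, "residential", 1.5)],
)

# The view encapsulates an aggregation; it is a saved query, not a copy of the data.
conn.execute(
    "CREATE VIEW LandUseSummary AS "
    "SELECT LandUse, COUNT(*) AS ParcelCount, SUM(Acres) AS TotalAcres "
    "FROM Parcels GROUP BY LandUse"
)

def summary():
    """Query the view exactly as if it were a regular table."""
    return conn.execute("SELECT * FROM LandUseSummary ORDER BY LandUse").fetchall()

before = summary()
conn.execute("INSERT INTO Parcels VALUES (4, 'commercial', 1.0)")
after = summary()

print(before)  # [('commercial', 1, 2.0), ('residential', 2, 2.0)]
print(after)   # [('commercial', 2, 3.0), ('residential', 2, 2.0)]
```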

SAMPLE QUESTION

Which of the following SQL expressions would you use to select all fields for roads with a speed limit greater than 40 mph from a road network dataset?

A) SELECT * FROM Roads WHERE SpeedLimit > 40

B) SELECT RoadName FROM Roads WHERE SpeedLimit = 40

C) SELECT SpeedLimit FROM Roads WHERE SpeedLimit > 40

D) SELECT RoadName, SpeedLimit FROM Roads WHERE SpeedLimit > 40

Answer: A) SELECT * FROM Roads WHERE SpeedLimit > 40

Explanation:

Option A selects all fields (*) from the Roads table where the SpeedLimit is greater than 40 mph.
Option B only retrieves the specific column (RoadName) where the SpeedLimit is equal to 40 mph.
Option C only retrieves the specific column (SpeedLimit) where the SpeedLimit is greater than 40 mph.
Option D retrieves only the RoadName and SpeedLimit columns for roads meeting the condition.

Remember that SQL expressions in GIS generally follow standard SQL syntax, although details vary with the data source, and the correct choice depends on the specific query requirements.

ADDITIONAL RESOURCES

SQL Tutorial - Essential SQL For The Beginners

GIS and SQL | Geography Realm

UNDERSTANDING OF TECHNIQUES AND IMPLICATIONS OF DATA CLASSIFICATION

Data classification is essential for visualizing and analyzing spatial data. The choice of classification method depends on the data type/scale, distribution, visualization goals, and context of your analysis.

KEY CONCEPTS AND TERMINOLOGY

Manual Interval:
- Description: Manually define custom class ranges based on your understanding of the data.
- Use Case: Useful when you want to tailor class breaks to specific context or domain knowledge.
Defined Interval:
- Description: Specify an interval size to create classes with equal value ranges.
- Use Case: Appropriate for evenly distributed data, such as temperature or elevation.
Equal Interval:
- Description: Divide the attribute value range into equal-sized subranges.
- Use Case: Best applied to familiar data ranges (e.g., percentages), emphasizing relative differences.
Quantile:
- Description: Assign an equal number of features to each class.
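- Use Case: Well suited for linearly (evenly) distributed data, but can produce misleading maps because similar values may fall into different classes while very different values share a class.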
Natural Breaks (Jenks):
- Description: Groups data based on natural groupings inherent in the data.
- Use Case: Maximizes differences between classes, but not suitable for comparing different maps.
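The difference between equal interval and quantile classification is easiest to see on skewed data. The sketch below is a simplified illustration with made-up values and hypothetical helper names; GIS packages may compute breaks slightly differently, and Natural Breaks (Jenks) is not implemented here.

```python
def equal_interval_breaks(values, k):
    """Upper class limits for k classes of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]

def quantile_breaks(values, k):
    """Upper class limits so each class holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    # Use the last value of each roughly equal-sized slice as the class limit.
    return [ordered[(n * i) // k - 1] for i in range(1, k + 1)]

def classify(value, breaks):
    """Index of the first class whose upper limit is >= value."""
    for idx, upper in enumerate(breaks):
        if value <= upper:
            return idx
    return len(breaks) - 1  # guard against floating-point rounding at the top

def class_counts(values, breaks):
    """Number of values that fall into each class."""
    counts = [0] * len(breaks)
    for v in values:
        counts[classify(v, breaks)] += 1
    return counts

# Skewed sample: eight small values and one outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]

ei = equal_interval_breaks(values, 3)
qt = quantile_breaks(values, 3)

print(ei, class_counts(values, ei))  # [34.0, 67.0, 100.0] [8, 0, 1]
print(qt, class_counts(values, qt))  # [3, 6, 100] [3, 3, 3]
```

With equal intervals the outlier leaves one class empty and lumps almost everything into the first class. Quantiles balance the counts but place 7, 8, and 100 in the same class, which is the kind of misleading grouping noted above.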
Four main types of measurement scales help characterize data:
- Nominal Scale of Measurement:
  - Description: Nominal data defines the identity property of data points.
  - Characteristics:
    - Categories have no inherent order.
    - Examples include names, labels, and categories.
    - Nominal data can be used for grouping and categorization.
  - Example: Classifying animals into categories like “mammals,” “birds,” or “reptiles.”
- Ordinal Scale of Measurement:
  - Description: Ordinal data defines data placed in a specific order.
  - Characteristics:
    - Categories have a natural order.
    - Differences between categories are not uniform.
    - Examples include ranks, ratings, and survey responses (e.g., “strongly agree,” “agree,” “neutral,” “disagree,” “strongly disagree”).
  - Example: Ranking students based on their exam scores.
- Interval Scale of Measurement:
  - Description: Interval data can be categorized, ranked, and has evenly spaced intervals.
  - Characteristics:
    - Intervals between values are consistent.
    - Zero point is arbitrary (no true zero).
    - Examples include temperature (measured in Celsius or Fahrenheit) and calendar dates.
  - Example: Measuring temperature differences (e.g., 20°C to 30°C).
- Ratio Scale of Measurement:
  - Description: Ratio data has all the properties of interval data, plus a natural zero point.
  - Characteristics:
    - Ratios between values are meaningful.
    - True zero indicates the absence of the measured attribute.
    - Examples include height, weight, income, and time (measured in seconds).
  - Example: Counting the number of books on a shelf (zero books means an empty shelf).

SAMPLE QUESTION

Which of the following classification methods emphasizes natural groupings inherent in the data and maximizes differences between classes?

A) Equal Interval

B) Quantile

C) Natural Breaks (Jenks)

D) Defined Interval

Answer: C) Natural Breaks (Jenks)

Explanation: Natural breaks classification (also known as Jenks classification) groups data based on inherent patterns in the data. It sets class boundaries where there are relatively significant differences in data values. This method is data-specific and not suitable for comparing multiple maps built from different underlying information.

ADDITIONAL RESOURCES

AM-09 - Classification and Clustering | GIS&T Body of Knowledge (ucgis.org)

Data Classification (saylordotorg.github.io)

UNDERSTANDING OF ANALYTICAL OPERATIONS AND METHODS

Analytical operations and methods in GIS allow you to extract meaningful insights from both spatial and non-spatial data. GIS analytical operations empower decision-making by revealing spatial patterns, relationships, and trends.

KEY CONCEPTS AND TERMINOLOGY

Spatial Analysis:
- Description: Spatial analysis involves studying the characteristics of places and the relationships among them.
- Purpose:
  - Solve complex location-oriented problems.
  - Explore and understand data from a geographic perspective.
  - Determine relationships, detect patterns, assess trends, and make predictions.
- Capabilities:
  - Overlay Analysis: Combine and compare multiple layers to identify intersections, containment, or proximity.
  - Buffer Analysis: Create zones around features based on a specified distance.
  - Network Analysis: Optimize routes, find nearest facilities, and perform service area analysis.
  - Spatial Statistics: Calculate statistics related to spatial patterns and distributions.
  - Interpolation: Estimate values at unmeasured locations based on nearby measurements.
  - Hot Spot Analysis: Identify statistically significant clusters of high or low values.
  - Viewshed Analysis: Determine visible areas from a specific location.
  - Terrain Analysis: Analyze elevation data for slope, aspect, and visibility.
  - Time Series Analysis: Study changes over time using spatiotemporal data.
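Two of the capabilities listed above, buffer analysis and interpolation, can be sketched in a few lines of plain Python. This is a simplified illustration, not how any particular GIS implements them: it assumes planar coordinates in consistent units, uses made-up feature names and values, and uses inverse distance weighting (IDW) as one possible interpolation method.

```python
import math

def within_buffer(point, center, radius):
    """True if point lies inside a circular buffer around center."""
    return math.dist(point, center) <= radius

def idw(target, samples, power=2):
    """Inverse distance weighted estimate at target from (xy, value) samples."""
    num = den = 0.0
    for xy, value in samples:
        d = math.dist(target, xy)
        if d == 0:
            return value  # target coincides with a measured location
        w = 1 / d ** power  # nearer samples get larger weights
        num += w * value
        den += w
    return num / den

# Buffer analysis: which (hypothetical) schools are within 7 units of a station?
schools = {"A": (1.0, 1.0), "B": (4.0, 5.0), "C": (10.0, 0.0)}
station = (0.0, 0.0)
selected = sorted(
    name for name, xy in schools.items() if within_buffer(xy, station, 7.0)
)
print(selected)  # ['A', 'B']

# Interpolation: estimate a value halfway between two measurements.
samples = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
estimate = idw((1.0, 0.0), samples)
print(estimate)  # 15.0
```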
Geoprocessing:
- Description: Geoprocessing involves performing operations on geographic data.
- Purpose:
  - Transform, analyze, and manage data.
  - Automate repetitive tasks.
- Tools and Techniques:
  - Vector Operations: Clip, dissolve, union, intersect, and more.
  - Raster Operations: Reclassify, resample, mosaic, and calculate.
  - Model Builder: Create custom workflows by chaining geoprocessing tools.
  - Python Scripting: Write custom scripts for specific tasks.
Raster Analysis:
- Description: Raster analysis focuses on grid-based data (e.g., elevation, satellite imagery).
- Capabilities:
  - Surface Analysis: Calculate slope, aspect, hillshade, and viewshed.
  - Distance Analysis: Compute proximity, cost distance, and least-cost paths.
  - Density Analysis: Assess point density, line density, and kernel density.
  - Change Detection: Identify differences between raster datasets.
  - Image Classification: Categorize pixels based on spectral characteristics.
Statistical Analysis:
- Description: Statistical methods help uncover patterns and relationships in spatial data.
- Techniques:
  - Descriptive Statistics: Mean, median, standard deviation, etc.
  - Regression Analysis: Explore relationships between variables.
  - Cluster Analysis: Group similar features.
  - Correlation Analysis: Assess associations between variables.
  - Spatial Autocorrelation: Detect spatial patterns.

SAMPLE QUESTION

Which of the following spatial analysis techniques is used to identify statistically significant clusters of high or low values in a dataset?

A) Buffer Analysis

B) Natural Breaks (Jenks)

C) Viewshed Analysis

D) Hot Spot Analysis

Answer: D) Hot Spot Analysis

Explanation: Hot Spot Analysis (also known as Getis-Ord Gi*) identifies statistically significant spatial clusters (hot spots or cold spots) based on attribute values. It helps detect areas with unusually high or low values compared to the overall pattern.

ADDITIONAL RESOURCES

AM-03 - Buffers | GIS&T Body of Knowledge (ucgis.org)

AM-04 - Overlay | GIS&T Body of Knowledge (ucgis.org)

AM-07 - Point Pattern Analysis | GIS&T Body of Knowledge (ucgis.org)

AM-08 - Kernels and Density Estimation | GIS&T Body of Knowledge (ucgis.org)

AM-09 - Classification and Clustering | GIS&T Body of Knowledge (ucgis.org)

AM-29 - Kriging Interpolation | GIS&T Body of Knowledge (ucgis.org)

AM-40 - Areal Interpolation | GIS&T Body of Knowledge (ucgis.org)

AM-20 - Geospatial Analysis and Model Building | GIS&T Body of Knowledge (ucgis.org)

KNOWLEDGE OF MAP ALGEBRA

Map algebra involves performing mathematical operations on raster data (gridded data) within a GIS environment. Unlike traditional algebra, which deals with scalar values, map algebra operates on entire raster datasets (individual pixels and groups of pixels). It allows you to combine, transform, and analyze raster layers using various mathematical functions.

KEY CONCEPTS AND TERMINOLOGY

Types of Map Algebra Operations:
- Local Operations:
  - Apply a function (add, subtract, multiply) to each cell in a raster independently.
  - Examples: addition, subtraction, multiplication, division.
- Global Operations:
  - Compute output values that may depend on all cells in the raster.
  - Examples: raster-wide statistics (minimum, maximum, mean), Euclidean distance, and normalization based on the raster's overall range.
- Focal Operations:
  - Compute an output value for each cell based on its neighborhood values.
  - Examples: convolution, kernel filters, moving windows.
- Zonal Operations:
  - Apply a function to a group of cells within a specified zone.
  - Zones can be defined by vector or raster features.
  - Example: calculating average temperature within watersheds.
Applications of Map Algebra:
- Terrain Analysis: Derive slope, aspect, hillshade, and viewshed.
- Distance Measurement: Calculate Euclidean distance, cost distance, and least-cost paths.
- Change Detection: Identify differences between raster datasets.
- Spatial Modeling: Combine multiple layers to create new information.
- Image Classification: Assign land cover classes based on spectral characteristics.
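The four operation types can be compared on a tiny raster. The sketch below represents rasters as nested Python lists with made-up values and zone labels; real GIS software works on raster datasets through its own raster calculator or API, so treat this only as an illustration of which cells each operation reads.

```python
# Hypothetical 3x3 rasters: cell values and the zone each cell belongs to.
elevation = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
zones = [
    ["a", "a", "b"],
    ["a", "a", "b"],
    ["c", "c", "c"],
]

# Local: each output cell depends only on the same cell in the input.
local = [[v * 2 for v in row] for row in elevation]

# Focal: each output cell is the mean of its 3x3 neighborhood (clipped at edges).
def focal_mean(grid, r, c):
    cells = [
        grid[i][j]
        for i in range(max(0, r - 1), min(len(grid), r + 2))
        for j in range(max(0, c - 1), min(len(grid[0]), c + 2))
    ]
    return sum(cells) / len(cells)

focal = [[focal_mean(elevation, r, c) for c in range(3)] for r in range(3)]

# Zonal: one summary value per zone.
def zonal_mean(grid, zone_grid):
    totals = {}
    for row, zrow in zip(grid, zone_grid):
        for v, z in zip(row, zrow):
            s, n = totals.get(z, (0, 0))
            totals[z] = (s + v, n + 1)
    return {z: s / n for z, (s, n) in totals.items()}

zonal = zonal_mean(elevation, zones)

# Global: a value derived from every cell in the raster.
global_max = max(max(row) for row in elevation)

print(local[0])     # [2, 4, 6]
print(focal[1][1])  # 5.0
print(zonal)        # {'a': 3.0, 'b': 4.5, 'c': 8.0}
print(global_max)   # 9
```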

SAMPLE QUESTION

Which of the following map algebra operations involves applying a function to each cell in a raster independently?

A) Focal Operations

B) Global Operations

C) Zonal Operations

D) Local Operations

Answer: D) Local Operations

Explanation: Local operations in map algebra apply a function to each cell individually without considering neighboring cells. Examples include addition, subtraction, multiplication, and division.

ADDITIONAL RESOURCES

Map algebra - Wikipedia

What is Map Algebra? [Raster Math] - GIS Geography

KNOWLEDGE OF DESCRIPTIVE AND SPATIAL STATISTICS

Descriptive statistics provide simple numeric descriptions of data, summarizing its characteristics. These statistics help us understand the central tendency, variability, and distribution of a dataset. Spatial statistics is a field of applied statistics that deals with spatial data. It involves various techniques for analyzing and understanding data with a geographic or spatial context.

KEY CONCEPTS AND TERMINOLOGY

Measures of Central Tendency: These statistics describe the central value around which data points tend to cluster. Common measures include:
- Mean (Average): Sum of all values divided by the number of values.
- Median: Middle value when data is sorted in ascending order.
- Mode: Most frequently occurring value.
Measures of Dispersion (Variability): These statistics quantify how spread out or dispersed the data points are. Common measures include:
- Range: Difference between the maximum and minimum values.
- Variance: Average of squared differences from the mean.
- Standard Deviation: Square root of the variance.
Frequency Distribution: A table or graph showing how often each value occurs in a dataset. Useful for understanding the distribution of data.
Percentiles and Quartiles: Percentiles divide sorted data into 100 equal parts. Quartiles are the three cut points (Q1, Q2, Q3) that split the data into four equal parts; Q2 is the median.
Skewness and Kurtosis: Skewness measures the asymmetry of the data distribution. Kurtosis describes the shape of the distribution (peakedness or flatness).
Graphical Descriptions: Histograms, box plots, and scatter plots visually represent data distributions.
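The measures above can be computed with Python's standard statistics module. The values are made up for illustration. Variance here is the population variance, matching the "average of squared differences" definition given above; sample variance divides by n - 1 instead. Skewness is computed by hand as the mean cubed deviation divided by the cubed standard deviation, which is one common definition.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

# Central tendency
mean = statistics.mean(values)      # 40 / 8 = 5
median = statistics.median(values)  # average of the two middle values = 4.5
mode = statistics.mode(values)      # 4 occurs most often

# Dispersion
value_range = max(values) - min(values)  # 9 - 2 = 7
variance = statistics.pvariance(values)  # 32 / 8 = 4 (population variance)
std_dev = statistics.pstdev(values)      # square root of the variance = 2.0

# Quartiles: three cut points; the middle one equals the median.
quartiles = statistics.quantiles(values, n=4)

# Skewness: positive means a longer tail toward high values.
skewness = sum((v - mean) ** 3 for v in values) / len(values) / std_dev ** 3

print(mean, median, mode)
print(value_range, variance, std_dev)
print(quartiles)
print(skewness)
```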
Spatial Relationships and Patterns: Spatial statistics explore relationships between data points based on their spatial proximity. Techniques help identify patterns, clusters, and trends in spatial data.
Applications of Spatial Statistics:
- Geostatistics: Analyzing spatial variability and interpolation (e.g., kriging).
- Point Pattern Analysis: Studying the distribution of point features (e.g., crime incidents, tree locations).
- Spatial Autocorrelation: Detecting spatial patterns (positive or negative spatial dependence).
- Spatial Regression: Modeling relationships between spatial variables.
- Hot Spot Analysis: Identifying statistically significant clusters (hot spots or cold spots).
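Spatial autocorrelation can be quantified in several ways. The sketch below uses Moran's I, a widely used global measure that this guide does not name explicitly, so its choice here is an assumption. It also assumes a regular grid with rook adjacency (cells sharing an edge are neighbors, each with weight 1) and made-up values. Values near +1 indicate clustering of similar values, values near -1 indicate a dispersed pattern of alternating values.

```python
def morans_i(grid):
    """Global Moran's I for a grid, using rook adjacency with binary weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(v for row in grid for v in row) / n
    dev = [[v - mean for v in row] for row in grid]

    cross = 0.0     # sum over neighbor pairs of dev_i * dev_j
    weight_sum = 0  # total number of (directed) neighbor links
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_sum += 1

    # Undefined when every cell has the same value (denominator is zero).
    denom = sum(d * d for row in dev for d in row)
    return (n / weight_sum) * (cross / denom)

# Similar values side by side: positive spatial autocorrelation.
clustered = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]

# Checkerboard: every neighbor differs, so autocorrelation is strongly negative.
checker = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

i_clustered = morans_i(clustered)
i_checker = morans_i(checker)
print(i_clustered)  # positive (about 0.67)
print(i_checker)    # -1.0
```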

SAMPLE QUESTION

Which of the following spatial statistics techniques is used to measure the spatial dependence or pattern in a dataset?

A) Geostatistics

B) Point Pattern Analysis

C) Spatial Autocorrelation

D) Hot Spot Analysis

Answer: C) Spatial Autocorrelation

Explanation: Spatial autocorrelation refers to the degree of similarity or dissimilarity between spatially adjacent data points within a geographic dataset.

ADDITIONAL RESOURCES

Descriptive statistics - Wikipedia

Summary statistics - Wikipedia

Coefficient of determination - Wikipedia

AM-22 - Global Measures of Spatial Association | GIS&T Body of Knowledge (ucgis.org)