|Understanding PostGIS Spatial Queries|
In this tutorial you will familiarize yourself with PostgreSQL geometry tables and PostGIS functions. PostgreSQL is a database that can store GIS data. PostGIS is the extensions that allow PostgreSQL to store GIS data, as well as many GIS functions to analyze and manage your data. You will practice writing SQL and visually see the output from a PostGIS query to help you understand the power of using spatial queries.
A PostgreSQL geometry table is a data table for a particular GIS layer. In many respects it is similar to a shapefile. For example, you may have a GIS shapefile containing USA states. This GIS data can be loaded into PostgreSQL and would be stored into the database as a single table. This table will have all the associated attribute columns as well as a geometry column. The geometry column stores the actual geographic coordinates for the geometry of each state in Well Known Binary (WKB) format. This is a format specified by the OpenGIS Consortium (OGC) which defines information about the type of the object (point, line, polygon, multi-polygon, etc.) and the coordinates which form the geometry. In this example, there would be a record (row) for each state.
PostGIS is a set of GIS functions that is added onto PostgreSQL allowing the database to store and analyze GIS data. A PostGIS query browser has been developed for this tutorial so that you can view the geographical output of PostGIS queries.
PostGIS functions are executed through a query on the geometry column of a GIS data table. There are many GIS functions supported by PostGIS, a comprehensive list can be viewed here.http://postgis.refractions.net/docs/reference.html
Let’s start by writing a simple SELECT statement to view the GIS data from the usa_states table. Basic syntax for a PostgreSQL SELECT statement is as follows:
SELECT column1, column2,... etc.
FROM tablename1, tablename2,... etc.
Substituting column1, column2, etc. for the columns you wish to view, and tablename1, tablename2, etc. for the tables you are requesting data from. To view geographical output, you must select a geometry column containing the GIS data.
For the purpose of this tutorial, we have set up a PostgreSQL database with 2 tables: usa_states and usa_rivers. Both tables have a geometry column called the_geom. Click the “Run” button in the query window below to execute the query. The "C" button will clear any PostGIS layers. You will see each US state appear in orange on the map as well as the attribute table returned from the query. This query selects all columns (* means all columns) from the usa_states table.
The output is shown in graphical form, as well as a result table. What this query means is that we are selecting all columns including the geometry column (GIS Data) from the usa_states table. Remember this table contains the USA states GIS layer.
PostGIS SQL Function
Now let’s perform a GIS analysis function. The following query buffers the geometry of California by 30km. This is done using the ST_Buffer() function on the_geom column and selects only California by using a WHERE clause (columnname = value). Note that the map units are in meters so we buffer by a value of 30000 to get 30km.
Nested PostGIS queries
A powerful feature of using SQL for GIS analysis is that you can execute multiple spatial functions in 1 query. This is done by nesting PostGIS function around one another. The inside function executes first and the outside function executes last. In this example we are first selecting the centroids of each state, then implementing a 200km buffer around the centroids, and then making a union of all the buffers creating 1 record with dissolved boundaries.
Using GROUP BY with PostGIS functions
GROUP BY can be used to perform a GIS function on groups of attributes. For example, we may want to create boundaries for each US region using the boundaries of the states. We can perform a union on each region using the following example:
This creates 4 distinct polygons that have been unioned based on the region column.
Using multiple GIS layers with PostGIS functions
Often we want to analyze GIS data in relationship to each other. We can use multiple layers in PostgreSQL by specifying more than 1 table in the FROM clause. Since we are using multiple tables, we also need to specify the tables when selecting our columns. For example, you can use the tablename.columnname format or you can give your tables an alias and use the alias to reference the tablename. In the following example we define an alias for our 2 tables: usa_states and usa_rivers.
FROM usa_states as s, usa_rivers as r
We then make reference to the table when selecting columns. For example, SELECT s.state means we are selecting the “state” column from the usa_states table.
The following query will select the geometries where the rivers intersect with the state of Texas:
ST_Intersects() is a PostGIS function used in the WHERE clause and is a test that returns TRUE where the geographies of the 2 layers intersect. ST_Intersection() executes the actual intersection and returns the geography that represents the shared portion between the rivers and the state of Texas.
Write a query that selects a 50km buffer around any rivers intersecting the West region. Make sure to only select the river geometry that is intersecting the West region and no other geometry outside this boundary. Union this geography for each state. There are 11 states in the West region so there should only be 11 records. Note that map units are in meters:
Congratulations on completing this tutorial. You should now have PostGIS technical know how and understand the power of writing spatial queries using PostGIS. You can use PostGIS to complete full analyses in a similar manner to other desktop GIS programs. In most desktop GIS programs you would use specific analysis tools to create an output (usually a new shapefile). That output is typically used as input to the next GIS tool and a series of shapefiles is created throughout the analysis. You can do that same thing in PostGIS by creating a table with the output of your query. The syntax for this to add 1 line at the beginning of your query as follows:
CREATE TABLE tablename AS
This means that a new table will be created based on your SELECT query. You can use this flow to create tables and use them as input to the next spatial query to do your analysis.
Most Popular PostgreSQL/PostGIS Tutorials
- How to Import or Export a CSV File using PostgreSQL COPY TO and COPY FROM Queries
- Importing Shapefile GIS Data into PostgreSQL
- How to install pgAdmin onto a Linux or Windows system, and configure a PostgreSQL server to accept remote connections
- Understanding PostGIS Spatial Queries
- Setting Up pgAdmin Server Instrumentation