Spatial SQL – Multi-Point to Line Example

I have been using Spatial SQL for a while now.  I like it.  A few lines of code can do a lot of analysis or data processing.  I’ve covered a number of basic topics but there are always more to do.  Here is a script to take a series of points and convert them into a single line.   This script will use a few SQL commands, including using a cursor, developing a linestring, and STLineFromText.

The sample data comes from NOAA and the National Hurricane Center. The points represent some sample tropical cyclone forecast points for Hurricane Sandy.

The basic idea of the script is to link a set of points together using a common attribute using a cursor, combine the coordinate pairs into a line string, and use the  STLineFromText method to convert the coordinate pairs into a single line.  The script works pretty well, but since I am using a cursor it can slow down with larger (a few hundred thousand rows) datasets. A good SQL programmer probably wouldn’t use a cursor here since they can be slow and cumbersome to use.  In fact, I’m sure a good SQL programmer wouldn’t use a cursor.  I am investigating ways to not use a cursor, so if you have a suggestion let me know!

The sample script includes a sample dataset that is available here.  Take a look at the script.  If you have any suggestions to make this script faster or more efficient let me know.

/*#################################################

Script Name: multi_points_to_Line.sql
Purpose: This script will take pairs of coordinates
 of the same line and convert them into a LineString 
that can be used to generate lines of the geometry 
data type. User will need to update the database, 
tables, and column names relevant to their own analysis.

Sample Sandy data tracks represent five day
models from the National Hurricane Center.

Prepping the data - The user will need to download 
the following file and load into their SQL database:

http://www.gisdoctor.com/downloads/Line_Parts.txt

Here is a quick script to take the text file and load 
it into a table generated for this exercise:

create table Spatial_Database.dbo.Distinct_Points
([ADVISNUM] varchar(3), [lat] float, [lon] float, 
[MaxWind] int)

BULK INSERT Spatial_Database.dbo.Distinct_Points
FROM 'Path to Line_Parts.txt file'
WITH (FIELDTERMINATOR =',',FirstRow = 2);

###################################################*/
 
--Set the database to process in
use Spatial_Database
--Drop temporary table
drop table #Sandy_hur_tracks
--Create new temporary table
create table #Sandy_hur_tracks
([EventID] int, [line] geometry)

--Declare cursor variable
DECLARE @eventID varchar(10)
--Declare text string that will store coordinate pairs to
--populate the LineString
DECLARE @coordString VARCHAR(MAX)

--Initialize the cursor using the ADVISNUM column
DECLARE db_cursor CURSOR FOR
select distinct ADVISNUM
from Spatial_Database.dbo.Distinct_Points
order by ADVISNUM asc

OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @eventID

WHILE @@FETCH_STATUS = 0
BEGIN

-- Clear the coordinate string with each iteration of the 
-- cursor - otherwise the coordinate string will 
-- append itself each time
set @coordString = ''
--collect all coordinate pairs and add them to a single row. 
-- Coordinate pairs are separated by a comma.
select @coordString = (COALESCE(@coordString + ', ', ' ') + 
(cast(Lon as varchar) +' ' + CAST(lat as varchar)))
FROM Spatial_Database.dbo.Distinct_Points
WHERE ADVISNUM = @EventID

--Insert the eventId and coordinate pairs into the table.         
--Coordinate pairs string is used to build the LineString to      
--create the line geometry
insert into #Sandy_hur_tracks
select @eventID as EventID,
Geometry::STLineFromText('Linestring 
(' + right(@coordString,LEN(@coordString)-1) + ')' 
, 4326) as line

FETCH NEXT FROM db_cursor INTO @eventID
END
--Close and delete the cursor
CLOSE db_cursor
DEALLOCATE db_cursor

--Select results from the temp table.
use Spatial_Database
select * from #Sandy_hur_tracks

Spatial SQL for the Geographer – Part 1 – Welcome to Spatial SQL

Before we begin I would like to make it clear that learning all the capabilities and limitations of SQL Server and Spatial SQL cannot be covered in three short lessons.  If you believe you might need some more training please visit the Microsoft SQL Transact reference guide.

What is Microsoft SQL Server?

Microsoft SQL Server is a relational database system that stores data and provides several mechanisms to retrieve stored data.  One of the beauties of relational database systems is that they allow you to effectively store a variety of data sets and they are both scalable and securable, and in my experience, much more trustworthy than any flat file.

There are a number of good references out there explaining SQL Server in much more detail than I would attempt.  I recommend that you check a couple of them out:

Explanation of Spatial Data Types

As you probably know by now a column in a table or view can be assigned one of many different data types.  Most are familiar with the traditional data types such as varchars, integers, bits, floats, or decimals.  With SQL Server 2008 two new spatial data types were introduced.  These data types support the storage of spatial data, and are broken up into two different types.  Their characteristics are important to understand:

  • Geometry:  Primary characteristic – spatial data stored in a planar system.
  • Geography: Primary characteristic – spatial data stored using the ellipsoid.
  • Here is a great blog post that describes the spatial data types in more detail.

Tutorial

Step One: Getting Started

The user first needs to download some software and data to their local machine.

Downloading Microsoft SQL Server 2008 R2 may take a few minutes.  The user needs to make sure that they read the download and installation instructions before they proceed. Also, when installing the software the users needs to pay particular attention to the installation instructions.

Note:  This tutorial was completed on a 64 bit machine running Windows 7.  I cannot guarantee that any of this will work on anything different.

Notes about the Data.  The populated places, countries, and states and provinces data sets were downloaded from Natural Earth and the roads data set came from Data.gov. The data represent the three different vector data types, points, lines and polygons. They also provide a variety of query options to work with during for the tutorials. Also, I renamed the data from the original names so for clarity during the tutorials.

Once you download the data I recommend first viewing it in a GIS program to make sure you downloaded it correctly.  Here the user will see the four data sets in Quantum:

Tutorial Data

Step Two: Create a Database

Now that the software has been downloaded the user can open Microsoft SQL Server Management Studio.  To do this the user will select to Start–> All Programs–> Microsoft SQL Server 2008 R2–>SQL Server Management Studio.  Since this will probably be the first time the user opens MSSQL it may take a couple minutes to configure and load.

Next, the user will be prompted to connect to a server.  In this case the server is the local machine.   In this example I connect to my local machine instance and use Windows Authentication.  Unless you configured your SQL instance differently you will connect the same way.  Once these options are selected click connect.

connect to server

Now that the user is connected to the server they can create a new database.  First, the user will expand the database instance.  Then the user will right click on the database folder and select New Database from the menu.

Create Database

The new database window will now open.  Here the user can configure a number of settings before the database is created.  In this tutorial we want to name the database Spatial_Database which is set in the Database Name text box.  Another option I like to configure is the location of the data and log files.  This is done by clicking the … button under the Path column.  I recommend storing the database in a location that has plenty of room to grow, as the default location on the C drive may not have space to expand on your local machine.

Change Location

Once the settings are all set the user will click OK to create the database.  The user can refresh the Databases folder to see the new, empty database.

Spatial Database

The user is the system admin on this sever and this database be default.  This means the user has permissions to create databases, delete databases, create tables and load data into them.  With these privileges already in place the admin user can load data into the database without any further configuration.

Step Three: Load Data into the Database using Shape2SQL

Using Shape2SQL is really easy.  The user will first open Shape2SQL and add a shapefile that will be loaded into the database as an individual table.  To do this the user will first navigate to shapefile stored on the local machine by clicking on the Select Button.

Select  Shapefile

The next step is to select the server and database where the data will live.  The user will click on the Configure Button to do this.  In the Configure window the user enter the name of the server in the Server Name window.  The user will then select Use Windows Authentication and then select the appropriate database.  In this tutorial the database has been named Spatial_Database.  Once this information has been configured the user will click OK.

Configure Server

Now, back in the uploader window the user has several additional options to configure before loading the data into the database.  First, the user needs to select the data type. The user will select either the geometry or geography data type.  In this tutorial we load the data as the geometry data, as it is more flexible, in terms of what it will accept, than the geography data type.

The second setting that needs to be set is the name of the geometry column.  The default for this setting is ‘geom’, but the user can change this if desired.  Remember, all the tutorials will use the ‘geom’ column.

The user will then make sure the Create Spatial Index option is selected.  This will ensure that a spatial index is built, which will improve the performance of the table in SQL Server.  Finally the user will set the name of the table in the Table Name window.

There are a few settings that do not need to be selected or altered.  Since this tutorial is using the geometry data type the user does not need to set the SRID.  The user can also change which columns are loaded into the table.  By default all columns are taken from the shapefile and loaded into the table. If a user does not want certain columns they will simply uncheck the columns they don’t want.  Once all the settings are selected the user will select Upload to Database.  Again, I may use all the columns in future tutorials, so you are best off keeping all of them.

Once the user clicks Upload to Database the shapefile will be loaded into the database as a table and the spatial index will be created.  Depending on the size of the shapefile and the power of the user’s machine it may take several minutes to load the table into the database.

The user will repeat these steps and load all four shapefiles into their database.  Once that is completed the user can proceed to the next section of the tutorial.

Step Four: Perform a Basic Query

Now that the shapefiles have been loaded into SQL Server the user can start to perform some queries.  Now, if the user does not have any SQL experience at all it may make sense to read through this tutorial from Microsoft.

The first example query is probably the most simple SQL query one could perform. In this first example I will use the dbo.States_Provinces table.  The user will open SQL Server, connect to their server, and then click the New Query button.  The user will then paste the following into the query window:

use spatial_database
go

select top 100 * from dbo.States_Provinces

The results will return the first 100 records from the dbo.States_Provinces table.  In SQL Server the user will notice there is a Results tab and a Spatial results tab.  The results tab will display the tabular data and the spatial results will display the spatial data.  The Spatial results tab will only appear if there is spatial data returned in the query.

Step Five: Set a Where Statement to Query Specific Data

The user can narrow down what is selected by incorporating some basic SQL statements, such as a where statement.  The basic premise of a where clause is to limit the data returned from a query. The user is setting a condition in regards to what they want returned.  For example, let’s say we only want the return records from the States_Provinces table that are for the United States.  If the user explores the table they will set that this information is stored in the Name_0 column. However, we are unsure of the exact syntax of “United States” so we can use a like operator in the where clause. For example:

use spatial_database
go

select * from dbo.States_Provinces
where NAME_0 like '%United States%'

The user will see only those records that have ‘United States’ in the Name_0 column. If the user clicks on the spatial results tab they will see the spatial representation of those records.

Spatial Results

Alright, that is it! We have completed the first part of this tutorial. The user has downloaded the appropriate software and data, installed the software, loaded a number of shapefiles using Shape2SQL into SQL Server, and have completed two basic queries.

Before the next tutorial I recommend the user, if they are light in their SQL experience, read up on some basic queries from either the W3 Schools or from Microsoft.

The next part to this tutorial will cover some more Spatial SQL queries.  I’ll go into more detail about some spatial operations and provide a number of examples.  If there is anything the you want to see let me know!

Spatial SQL for the Geographer – Introduction

During WhereCamp Boston one of the impromptu break-out sessions was on spatial SQL. In WhereCamp fashion the person who proposed the session was looking to learn more about spatial SQL and its many uses.  With about 20 people in the session I volunteered to give a few examples of spatial SQL within Microsoft SQL Server.  Now, I was not prepared to give an in-depth or very informative session.  In fact, it was probably the worst presentation I have ever given.  I went through a number of examples of of some basic principals and I said I would put together a “Hello World” example of spatial SQL sometime after the meeting.

But, to complete a “Hello World” example we need to develop some background first.  In order to develop the guide I will be putting it together from the ground-up.  I use Microsoft SQL Server everyday, but I am GIS professional first, and many of the readers of this site are also GIS professionals.  Therefore, I will write the how-to from this point of view.  If you are a database professional this guide might be a little boring, but if you are a GIS professional I hope you’ll get something out of it.

The guide will be broken into three additional posts.  The first will be cover some background and vocab.  Part two will discuss basic spatial SQL syntax, and the third part will walk through how to expand basic scripts into larger sql processes. Once the guides have been finished I will provide links below.

Part 1Welcome to Spatial SQL – 11/14/2011

Part 2Basic Spatial SQL Scripts – 11/21/2011

Part 3 – More Basic SQL Scripts – 1/2012

Part 4 – Working Spatial SQL into Larger Processes – 2/2012

*Update – 12/4/2011- with the holidays and other commitments I am going to push back the last two installments!

This guide is being developed for use with Microsoft SQL Server.  That’s the DBMS that I use on a daily basis.

Before I release these posts feel free to browse some resources on spatial SQL and Microsoft SQL Server: