Sunday Morning Geo-Fun

Why is it that the only time I have to blog is on Sunday mornings?  Here are a few quick geo-items that tickled my fancy from the previous week.

  • Check out the article, The New Cartographers: OpenStreetMap’s World Takeover, from Carl Franzen at Talking Points Memo.  The first two parts of this story have been tweeted a lot this last week and I can see why.  The article provides a fairly good overview of OSM, including some background on the project, the nuances of licensing OSM data, and adoption in the tech industry.  Part three of the article comes out on Sunday.  Makes me feel good about calling 2012 the year of OSM.
  • Years ago I used to pump out Google Map Mash-Ups on a regular basis, some of which were developed during my time at the Map and Geographic Center at the University of Connecticut.  Well, after nearly two years one of those mash-ups got some press!  Check out the article in the Atlantic, Pre-Sprawl Aerial Images: ‘Next Best Thing to a Time Machine’.  The article discusses the dual-map mash-up that I developed for the On the Line Project, which compares the drastic changes in Connecticut’s landscape using current and historical aerial photography.  Pretty cool.
  • The guys at Google’s NC data center, which just got the indoor Street View treatment, definitely Rickrolled Street View (check out the image on the screens; also, why didn’t they blur out Rick Astley’s face too?).
  • Brian Flood has been doing a lot of great things for the online mapping and spatial data communities for a while now.  This video and post on the MapBox blog is the latest example.  Using Arc2Earth Sync to integrate with MapBox and ArcGIS appears smooth and simple.  Awesome.  There is a lot of great work happening in “spatial” and it’s only going to make what we do as geo-professionals better.
  • Speaking of MapBox, when does Esri try to scoop them up (if they haven’t already), like they just did with GeoLoqi?
  • Avid Geo Boston had their October meet-up this past week.  The video is here. Since I am a horrible member and missed the meet-up for the third straight month I cannot comment on the talks, but I’m sure everyone had a good time.
  • Avid Geo will be hosting their wildly successful annual Ignite Spatial event on November 14th at the Center for Geographic Analysis at Harvard.  Tickets are available here, and they are currently looking for presenters.
  • Don’t forget to take the totally unscientific ArcGIS 10.1 survey!
  • Finally, I’ll be updating some pages on my site this week, including the blogs page and some of the mash-ups.
  • As always, follow me on Twitter @GISDoctor, and hopefully I’ll blog more this week.  I have tons of ideas!


Sunday Geo-Notes

It’s summertime and I’m not blogging or twittering as much.  Typical.  But, it’s Sunday morning and before I head out to the garden I wanted to share these few items:

  • I’m doing a lot of geo-analysis using MS SQL Server lately and, as I tweeted, the spatial index can be the key to a fast query.  However, the spatial index is sometimes tricky to understand.  Check out “The Black Art Of Spatial Index Tuning In SQL Server” for a good overview of SQL Server’s spatial indexing.  For other spatial SQL inspiration check out Bob Beauchemin’s blog.
  • Speaking of analysis, the folks at Somerville’s ResiStat, who I am a fan of, did a nice geo-analysis, actually using some real statistics, not just what they saw from Google Earth, to debunk a “study” equating tree coverage to income in Somerville, MA.
  • There is an Avid Geo meet-up this coming Thursday (7/19).  If you are a Boston-based geo-pro or geo-nerd you should check out the group!
  • The biggest geo-news of the week, GeoIQ being bought by Esri, got a lot of people talking.  Some positive, some not-so-positive.  I just hope the talented folks at GeoIQ are given room to do their own thing and bring positive innovations to Esri’s product line.  Sometimes when the little guy is bought out by the big guy their ideas and creativity languish in the corporate culture.  I hope this doesn’t happen to them.
  • The Esri UC festivities start at the end of this coming week with the ed and business summit kicking things off.  I’ll be heading out with a few of my coworkers and meeting up with some old grad school friends for the full UC.  Like last year, I’ll be focusing on the analysis presentation tracks.  I hope to see something about speeding up large geo-analyses.  I’ll touch on improving performance in my talk, but I want to see what others are doing, as everyone nowadays is using huge datasets (whatever happened to the sample?).

Until next time, check out my Twitter feed @GISDoctor.  I almost have 100 followers!!!!!!!!!  Only 11M+ more followers until I catch Ashton Kutcher…

The First Law of Geography – Today’s Geo Inspiration

Every now and then I look for inspiration when designing a model or writing some code to solve a geo-problem.  Recently, while searching for some geo-inspiration, I came across one of my favorite papers.  If you have ever taken any type of GIS or spatial analysis course you have probably heard some variation of the following phrase, commonly referred to as the First Law of Geography:

“Everything is related to everything else,
but near things are more related than distant things”

This idea, my friends, is what defines the field of geo-analysis.  From interpolation to distance decay, and spatial autocorrelation to gravity models, the idea that locations that are closer together are more related than those that are far apart provides the base for the field of geographic analysis.  Like any great quote, many may know it but few know its origins.  So, where did this phrase come from?  Here it is…

A Computer Movie Simulating Urban Growth in the Detroit Region
W. R. Tobler
Economic Geography
Vol. 46, Supplement: Proceedings. International Geographical Union. Commission on Quantitative Methods (Jun., 1970), pp. 234-240

Waldo Tobler is a very well-known geographer (he has his own Wikipedia page!) and if you have ever taken a geography or GIS course, at some point the professor probably referred to his work, either directly or indirectly.  A copy of the paper is available here and a reply to the First Law of Geography can be found here.  If you are in the learning mood both are quick reads.
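
One way to see the First Law at work is inverse distance weighting (IDW), a common interpolation method.  The sketch below is only an illustration and is not taken from Tobler's paper; the function name idw_estimate, the power of 2, and the sample values are all made up for the example.  Each observation is weighted by the inverse of its distance to the target location, so near things count for more than distant things.

```python
import math

def idw_estimate(target, samples, power=2.0):
    """Estimate a value at `target` from (x, y, value) samples using
    inverse distance weighting: nearer samples get larger weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0:
            # The target sits exactly on a sample, so use its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical observations: (x, y, measured value)
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (10.0, 10.0, 100.0)]

# The target lies midway between the first two samples and far from the third.
estimate = idw_estimate((0.5, 0.0), samples)
print(round(estimate, 2))
```

The estimate lands close to the average of the two nearby samples, and the distant sample (value 100) barely moves it.  A plain average of all three values would be pulled much higher.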

Dear ACS, if you are killed off, I will miss you…

Since killing the American Community Survey is widely regarded as a bad idea, it makes complete sense that portions of the Senate still want to end it.  The main arguments against the ACS are that it is expensive, intrudes on privacy and is “unconstitutional”, whereas the benefits (which in my opinion far outweigh any of the negatives) generally go along the lines of better data equals better decision making.

I wanted to write a full post about the dangers of ending the ACS from the point of view of a geographer, but I became so frustrated reading the stories about why it should be ended that I just deleted everything I had except the first paragraph you just read.  However, I do think it is important to at least give my opinion as a geographer and someone who values good, unbiased data:

Without current and quality spatial data you won’t know where you are or where you are going…

In a world where big data and quantitative analytics are essential to data-driven decision making the loss of the ACS could send shock waves through the business, academic, non-profit, and government worlds.

Now, I am going to volunteer some personal information, tag my location, and post some pictures to Facebook.  Now, there is a data collector I can trust!


Note: It is reported that the White House will veto this if passed.

Open Job – Senior Software Engineer – GIS / SQL Spatial Application / C# – Boston, MA

AIR Worldwide, based in the Back Bay section of Boston, Massachusetts, is advertising an open position for a GIS/SQL spatial software engineer.  The position requires 3-5 years of experience in software development and a strong base of GIS and GIScience skills.  AIR develops natural catastrophe modeling software (a lot of spatial analysis) that is used by a variety of insurance companies, financial institutions and governments to help them understand their risk from natural catastrophes.

Are you a developer with some serious GIS chops, or a GIS pro with some serious programming chops?  Looking for a job in Boston?  Would you like to work with scientists, engineers, programmers, analysts and others on a variety of interesting projects?  Then I would recommend you check this job out!

FYI, this is a shameless plug

Geospatial Topology, the Basics

The concept of topology isn’t something that every spatially enabled person fully understands.  That is OK, because I too had to learn (and relearn) how spatial topology works over the years, especially early on back in the ArcView 3.X days.  I think this experience is fairly typical of someone who uses GIS.  If one is taking a GIS course or a course that uses GIS it is not very often that the concept of spatial topology is covered in-depth or at all.  Spatial topology also may not be something that people are overly concerned about during their day-to-day workflow, meaning they may let their geospatial topology skills slide from time to time.  As a public service here is a basic overview of geospatial topology.

First question: What is topology?

You have probably heard the term topology before, whether it was in a GIS course where the instructor lightly glossed over the topic, or in a geometry/mathematics course.

Technically speaking, topology is a field of mathematics, related to geometry and graph theory, that studies which properties of a shape are preserved under transformations like bending, stretching, or twisting.  The field of topology is well established within mathematics and far more complicated than I wish to get in this post.

Second question: How does topology relate to GIS and spatial analysis?

Spatial analysis is at its core an analysis of shapes in space.  Geospatial topology is used to determine and preserve the relationships between shapes in the vector data model.

The GIS software we use for analysis and data storage incorporates a set of “topological rules” to define how vector objects are stored and how they can interact with each other.  These rules can dictate how nodes interact within a network, how the edges or faces of polygons coexist, or how points are organized across space.

Back in the “olden-days” (which was before “my time”) GIS users, particularly ArcInfo users, were well versed in geospatial topology because of the coverage.  The coverage data model, a precursor to today’s ubiquitous shapefile format, was unique in that topology was stored within the file.  This data format gave users a set of controls over the spatial relationships within the dataset that later went away with the shapefile.  The shapefile is not a topological data format, as geometric relationships are not enforced.  For example, how many of you have downloaded (or bought) a shapefile from a data provider and it was FULL of slivers?  In the Esri world geospatial topology came back with the geodatabase, and it has been incorporated into a number of other geospatial data formats, including spatial databases supported by Oracle, PostGIS (2.0) and SpatiaLite.

Today, topology is important in geodatabase design (for those who pay attention to it!), and data creation/editing.  By understanding the set of geospatial topology rules and creating topologically sound data, the user can have a level of trust in their data during analysis.
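
To make the idea of stored topology a little more concrete, here is a toy sketch in Python.  It is not the coverage format or any vendor's data model; the node table, the polygon names, and the helper functions are all invented for illustration.  The point is that when polygons reference shared nodes instead of carrying their own copies of coordinates, adjacency can be read straight from the data and a shared boundary cannot drift apart.

```python
# Shared node table: each coordinate is stored once and referenced by ID.
nodes = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1), 5: (2, 0), 6: (2, 1)}

# Each polygon is a ring of node IDs rather than its own copy of coordinates.
polygons = {
    "A": [1, 2, 3, 4],
    "B": [2, 5, 6, 3],
}

def ring_edges(ring):
    """Return the undirected edges of a closed ring as a set of frozensets."""
    return {frozenset((ring[i], ring[(i + 1) % len(ring)]))
            for i in range(len(ring))}

def shared_edges(poly_a, poly_b):
    """Edges that two polygons have in common, i.e. their shared boundary."""
    return ring_edges(polygons[poly_a]) & ring_edges(polygons[poly_b])

# A and B are adjacent because they both use the edge between nodes 2 and 3.
print(shared_edges("A", "B"))

# Moving a shared node moves the boundary of both polygons at once,
# so no sliver can open up between them.
nodes[3] = (1.2, 1.1)
ring_a = [nodes[n] for n in polygons["A"]]
ring_b = [nodes[n] for n in polygons["B"]]
```

In a format where every polygon stores its own coordinates, the same edit would have to be made twice, and a small mismatch between the two copies is exactly how slivers appear.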

Additional Resources:

Esri white paper on GIS topology 

PostGIS Topology

PostGIS 2.0 Topology Support

Oracle Topology Data Model

Vector topology in GRASS

Esri Coverage Topology

Esri Geodatabase Topology

Real topology

Vector topology cleaning with Quantum and GRASS – YouTube video

Blogs for Wannabe GIS Programmers (like me)

Anyone who knows me and works with me on a regular basis knows that I am not a developer.  I am a geographer who develops code for visualization and analysis applications that will hopefully work.  In our line of work knowing how to write and understand code is critical and on my quest to become a better programmer I am continuously searching for the next resource to add to my collection of how-to guides and programming resources.  Some of my favorite resources are spatially focused programming blogs.  Usually the bloggers are facing the same problems I am dealing with and they are using jargon that I understand.  These two factors make it much easier to follow their examples and ideas.

Here are three blogs (from people who know what they are doing) that I follow (do people still follow blogs, or is that so 2007?) and have referenced in my quest to improve my marginal programming skills (clear overuse of () in a sentence).  Check these blogs out when you get the chance:

GeoChalkboard – A number of Esri JavaScript posts, which is great for me, since I am doing a lot of that type of work lately.  Also, professional courses are made available through the site.

Guerilla GIS – I’ve referenced this blog before, mostly because I like two things about it: the variety of code examples and the GIS snark.

odoenet – A number of GIS programming examples, along with a number of tips and tricks.  Those of you new to GIS programming should check out the two blog posts about simplifying GIS development in ArcGIS.  Not a bad read.  Also, this post is very true.

I know there are many more blogs like this out on the interwebs.  If you know of one or have a favorite spatial programming resource post it in the comments section!

Spatial Random Sample, Sample

Often, when performing spatial analysis, one may need to execute some type of sampling across space.  For example, one may need to sample locations across a geographically continuous surface (think soils, anything weather related, etc.).  A spatial random sample can be used to select locations without bias.  With a simple Python script one can develop a spatial random sample with relative ease.  In this post I will cover a few definitions, provide a code sample, and discuss some additional points.

First, a few definitions:

Random Number: A number chosen as if by chance from some specified distribution such that selection of a large set of these numbers reproduces the underlying distribution.

Statistical Randomness: A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; sequences such as the results of an ideal dice roll, or the digits of π exhibit statistical randomness.

Simple Random Sample: A sample in which every element in the population has an equal chance of being selected.

Second, what is a spatial random sample?

Spatial Random Sample: Locations obtained by choosing x-coordinates and y-coordinates at random (p. 58). Any points that do not intersect the landform will be dropped from the list of random points.  

Third, give me some Python code to do this!

import random
from time import strftime

# How many points will be generated
numpoints = random.randint(0, 1000)

# Create the bounding box
# Set longitude values - X values
minx = -180
maxx = 180

# Set latitude values - Y values
miny = -23.5
maxy = 23.5

print("Start Time:", strftime("%a, %d %b %Y %H:%M:%S"))

# 'w' mode creates the file, or overwrites it if it already exists
with open("C:\\Data\\output\\spatial_random_sample.csv", "w") as f:
    # Print the column headers
    print("ID,X,Y", file=f)
    # Write one row per point: ID, random longitude, random latitude
    for i in range(numpoints):
        print(i, random.uniform(minx, maxx), random.uniform(miny, maxy),
              sep=",", file=f)

print("Script Complete, Hooray!", numpoints, "random points generated")
print("End Time:", strftime("%a, %d %b %Y %H:%M:%S"))

This quick, dirty and very simple script does a few things. First, it creates a csv file in a local directory, and by using the ‘w’ mode the file will be created if it doesn’t exist and will be overwritten every time the code is run (so be careful).

Next, the code selects a random number of points to be generated. In this case it will be a random integer between zero and 1,000. The user will then set the bounding box that will contain the points. If using ArcPy and ArcGIS the user could easily set the bounding box to that of a particular layer. In this example, it is simply longitudes from -180 to 180 and latitudes between the approximate Tropic of Capricorn and Tropic of Cancer (-23.5 to 23.5).

The next block of code will generate the random number of points in the specified ranges and print them to a csv file.  The output is fairly straightforward: three columns, an ID field and X and Y. The user can open the file in OpenOffice as they could any other csv file.

Well, that’s great.  With this data the user can easily visualize it in Quantum using the Add Delimited Text Layer tool from the Layer menu. Since the output was formatted with X and Y fields the tool will populate itself.

Once the user clicks OK the points will be added to the map.  From there the user can export the data to any number of formats and perform their analysis.

As you can see it is pretty easy to generate random points with the script.  In fact, ArcMap and Quantum have tools that will do this, but because they have many more options both run much slower than the simple spatial random sample demonstrated here.  Also, the Arc version will only work if the user has ArcEditor or the Spatial Analyst extension.  The folks at SpatialEcology have a tool that will do this within ArcMap too, and I am sure there are other tools out there.

But before we wrap this up, here are a couple notes:

  • This is a simple example, and not intended to be an “end-all, be-all example”.
  • Python generates pseudo-random values.
  • Every coordinate pair within the bounding box has an equal chance of being generated, meaning that whatever is being sampled at those coordinates has an equal chance of being selected as well.
  • The script presented here does not check against any boundaries, only a bounding box.
  • The above code can easily be extended to work within ArcPy and ArcGIS.  I can post the code later on if there is interest.
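
On the boundary point above: one way to drop points that fall outside the landform, without ArcPy, is to test each random point against a polygon and keep only the hits.  The sketch below is one possible approach, not the only one.  It uses a standard ray-casting test; the function names, the triangular study area, and the seed are all made up for the example, and points that land exactly on an edge are not handled specially.  It also assumes the polygon has some area, otherwise the loop would never finish.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) falls inside `polygon`,
    a list of (x, y) vertices."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does this edge straddle the horizontal line through y?
        if (yi > y) != (yj > y):
            # x-coordinate where the edge crosses that line
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside

def random_points_in_polygon(polygon, count, seed=None):
    """Draw uniform points in the bounding box and keep those inside."""
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    minx, maxx, miny, maxy = min(xs), max(xs), min(ys), max(ys)
    points = []
    while len(points) < count:
        x = rng.uniform(minx, maxx)
        y = rng.uniform(miny, maxy)
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points

# Hypothetical study area: a triangle in lon/lat coordinates
study_area = [(-10.0, -5.0), (10.0, -5.0), (0.0, 15.0)]
sample = random_points_in_polygon(study_area, 25, seed=42)
print(len(sample), "points inside the study area")
```

Because rejected points are simply thrown away and redrawn, the points that remain are still uniformly distributed within the polygon in coordinate space.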

GISDoctor Spatial Analysis Post Series

There once was a well-known GIS blog post that compared geographic information systems to word processors.  No matter what you think about the post, we will always need people who are skilled at “writing” and have something to “write” about.

As I have said before, and will say again, if you are using GIS technologies you should have a grasp on the fundamentals.  You wouldn’t write a paper or a report without a grasp on the basics of the topic or without a knowledge of writing in general.  So, to improve the world’s GIS grammar (or at least my own), I will be posting a number of spatial analysis related topics over the course of the next few months.  Here are a few of the topics I will cover:

  • Data classification schemes
  • Understanding spatial random samples
  • Topology, from a spatial point of view
  • The basics of projections
  • Avoiding false accuracy
  • Using root mean square
  • Geary’s c and Moran’s I
  • The First Law of Geography
  • Spatial autocorrelation
  • and many more…

I’ll use a variety of software, data, and problems to explain these topics, in order to expose the reader to the broad language of GIS.