30 Python libraries to harness the power of geospatial data 🌎

Business use-cases around Location Intelligence are quite fascinating to me.

Since 2012, I have been learning about geospatial data analytics. I used ArcGIS and Python to analyse and visualize geo-data during my Master's program at Virginia Tech, and I have since solved a few business use-cases with it.

ArcGIS training by Author in 2014.

In this post, I will share how you can get started with geospatial data in Python.

The agenda is to cover the following topics:

  1. What is Location Intelligence?
  2. Understanding GeoSpatial Data
  3. 30 Python libraries for Geospatial Data Analysis

What is Location Intelligence?

Location Intelligence uses spatial information to empower understanding, insight, decision-making, and prediction.

It has applications everywhere, from retail site selection and solving traffic bottlenecks to maintaining and repairing vital infrastructure.

Here is a brief introduction to Location Intelligence from ESRI.

Location Intelligence (ESRI)

Understanding GeoSpatial Data

Spatial data, geospatial data, GIS data, and geo-data are all names for numeric data that identifies the geographic location of a physical object, such as a building, a street, a town, a city, or a country, according to a geographic coordinate system.

From spatial data, you can find out not only the location of an object but also its length, size, area, or shape. A typical example of spatial data is the set of coordinates of an object: latitude, longitude, and elevation.
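To make the idea of deriving a length from coordinates concrete, here is a minimal sketch of the haversine formula, which estimates the great-circle distance between two latitude/longitude pairs. It assumes a perfectly spherical Earth with a mean radius of about 6371 km, so it is an approximation, not a survey-grade measurement. The function name `haversine_km` is illustrative and does not come from any library.

```python
import math

# Assumption: a spherical Earth with this mean radius (in kilometres).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points.
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator is roughly 111 km.
one_degree_at_equator = haversine_km(0.0, 0.0, 0.0, 1.0)
```

For production work, a geodesic calculation on an ellipsoid is usually more accurate than this spherical shortcut.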

Geographic Information Systems (GIS) or other specialized software applications can be used to access, visualize, manipulate and analyze geospatial data. Some examples of geospatial data include:

1. Vectors and Attributes

Points, lines, polygons, and other descriptive information about a location.

Understanding Vector Data.

Vector data represents a spatial element through its x and y coordinates. The most basic form of vector data is a point. Two or more connected points form a line, and three or more line segments that close into a ring form a polygon.
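As a rough illustration, the three basic vector shapes can be written as plain Python tuples and lists. The coordinates below are made up and are treated as flat (planar) x/y values, which is an assumption that only holds for projected data. The helper names `line_length` and `polygon_area` are hypothetical.

```python
import math

# A point is a single (x, y) pair.
point = (2.0, 3.0)

# A line is an ordered sequence of two or more points.
line = [(0.0, 0.0), (3.0, 4.0)]

# A polygon is a ring of points; the last point connects back to the first.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def line_length(coords):
    """Sum of straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Planar area of a simple polygon using the shoelace formula."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


print(line_length(line))      # 5.0
print(polygon_area(polygon))  # 12.0
```

Libraries such as Shapely wrap the same ideas in dedicated geometry objects; the sketch above only shows the underlying structure.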

The simplest way to store vector data is to include one or more extra columns in a table that hold the geospatial coordinates of each row. More formal encoding formats such as GeoJSON also come in handy.
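The "extra columns" approach can be sketched with nothing more than the standard library. The CSV content, the store names, and the column names `latitude` and `longitude` are all made up for illustration.

```python
import csv
import io

# Hypothetical table: ordinary attributes plus two coordinate columns.
raw = """name,latitude,longitude
Store A,37.23,-80.41
Store B,38.90,-77.04
"""

# Read the table and convert the coordinate columns to floats.
rows = list(csv.DictReader(io.StringIO(raw)))
points = [
    (row["name"], float(row["latitude"]), float(row["longitude"]))
    for row in rows
]

for name, lat, lon in points:
    print(f"{name}: lat={lat}, lon={lon}")
```

In practice the same table would often be loaded into a dataframe, but the principle is identical: each row carries its own coordinates.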

GeoJSON, a format built on JSON, describes each feature with a geometry object that can be a Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, or GeometryCollection.
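Because GeoJSON is ordinary JSON, a small feature can be built and parsed with Python's built-in `json` module. The coordinates and the `name` property below are placeholders. Note that GeoJSON positions are written as longitude first, then latitude.

```python
import json

# A single GeoJSON Feature: a geometry plus free-form properties.
feature = {
    "type": "Feature",
    "geometry": {
        "type": "Point",
        # GeoJSON order is [longitude, latitude]; the values are placeholders.
        "coordinates": [-80.41, 37.23],
    },
    "properties": {"name": "Example location"},
}

# Features are usually grouped into a FeatureCollection.
collection = {"type": "FeatureCollection", "features": [feature]}

# Serialize to text and parse it back, as you would when writing or reading a file.
text = json.dumps(collection, indent=2)
parsed = json.loads(text)

lon, lat = parsed["features"][0]["geometry"]["coordinates"]
geometry_type = parsed["features"][0]["geometry"]["type"]
```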

There are many other formats for representing geospatial data, and the Geospatial Data Abstraction Library (GDAL) documents and supports a large number of them.

2. Point Clouds

Point clouds are sets of 3D points collected by LiDAR systems, and they can be used to create 3D models.

Understanding Point Cloud data from LiDAR systems
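In its simplest form, a point cloud is just a long list of (x, y, z) measurements. The sketch below uses four made-up points in metres and two hypothetical helpers, `bounding_box` and `mean_elevation`, to show the kind of summary you might compute before building a 3D model.

```python
# Hypothetical LiDAR returns as (x, y, z) tuples in metres; the values are made up.
points = [
    (0.0, 0.0, 10.2),
    (1.0, 0.5, 10.8),
    (2.0, 1.5, 12.1),
    (0.5, 2.0, 11.3),
]


def bounding_box(cloud):
    """Return the (min_x, min_y, min_z) and (max_x, max_y, max_z) corners."""
    xs, ys, zs = zip(*cloud)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def mean_elevation(cloud):
    """Average of the z values across the whole cloud."""
    return sum(z for _, _, z in cloud) / len(cloud)


lower, upper = bounding_box(points)
average_z = mean_elevation(points)
```

Real point clouds contain millions of points and are normally stored in dedicated formats, so this only illustrates the data model.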

3. Raster and Satellite Imagery

Raster data is used when spatial information across an area is observed. It consists of a matrix of rows and columns with some information associated with each cell.

Understanding Raster and Satellite Imagery

An example of raster data is a satellite image of a country or a city, represented as a matrix in which each cell holds weather information.
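A raster can be sketched as a nested list of values plus enough metadata to tie each cell to a location. In the example below, the temperature values, the origin, and the cell size are all invented, and the helpers `cell_to_xy` and `value_at` are hypothetical names. The grid is assumed to be north-up, with the origin at the top-left corner.

```python
# Hypothetical 3 x 4 grid of temperatures; row 0 is the northernmost row.
grid = [
    [21.0, 21.5, 22.0, 22.5],
    [20.0, 20.5, 21.0, 21.5],
    [19.0, 19.5, 20.0, 20.5],
]

# Assumed georeferencing: top-left corner of the grid and a square cell size.
ORIGIN_X, ORIGIN_Y = 100.0, 50.0
CELL_SIZE = 0.5


def cell_to_xy(row, col):
    """Coordinates of the centre of a cell."""
    x = ORIGIN_X + (col + 0.5) * CELL_SIZE
    y = ORIGIN_Y - (row + 0.5) * CELL_SIZE  # y decreases as rows go down
    return x, y


def value_at(x, y):
    """Look up the cell value that covers the coordinate (x, y)."""
    col = int((x - ORIGIN_X) // CELL_SIZE)
    row = int((ORIGIN_Y - y) // CELL_SIZE)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("coordinate falls outside the raster")
    return grid[row][col]


centre = cell_to_xy(0, 0)
sample = value_at(101.6, 48.6)
```

Raster libraries store the same origin and cell-size information as part of the file, usually in the form of an affine transform.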

There are several ways to work with raster data in Python. One relatively recent and user-friendly package is xarray, which can read NetCDF files.

Additional Terminologies

  1. Shapefile: data file format used to represent items on a map
  2. Geometry: a vector (generally a column in a dataframe) used to represent points, polygons, and other geometric shapes or locations, usually represented as well-known text (WKT)
  3. Polygon: an area
  4. Point: a specific location
  5. Basemap: the background setting for a map, such as county borders in California
  6. Projection: since the Earth is a 3D spheroid, choose a method for flattening an area into a 2D map, using some coordinate reference system (CRS)
  7. Colormap: choice of a color palette for rendering data, selected with the ‘cmap’ parameter
  8. Overplotting: stacking several different plots on top of one another
  9. Choropleth: using different hues to color polygons, as a way to represent data levels
  10. Kernel Density Estimation (KDE): a data smoothing technique that creates contours of shading to represent data levels
  11. Cartogram: warping the relative area of polygons to represent data levels
  12. Quantiles: binning data values into a specified number of equal-sized groups
  13. Voronoi Diagram: dividing an area into polygons such that each polygon contains exactly one generating point and every point in a given polygon is closer to its generating point than to any other; also called a Dirichlet tessellation
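To illustrate one of the terms above, quantile binning can be sketched with the standard library alone. The ten values stand in for a per-region statistic and are made up, and `quantile_bins` is a hypothetical helper. The resulting bin index is what a choropleth would map to a colour.

```python
import bisect
import statistics

# Hypothetical per-region values, for example sales per county.
values = [3, 8, 15, 22, 27, 35, 41, 58, 63, 90]


def quantile_bins(data, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1 using quantile cut points."""
    # n_bins groups need n_bins - 1 cut points.
    cuts = statistics.quantiles(data, n=n_bins, method="inclusive")
    bins = [bisect.bisect_right(cuts, v) for v in data]
    return bins, cuts


bins, cuts = quantile_bins(values, 5)
print(bins)  # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
```

With five bins and ten values, each bin receives two regions, which is the "equal-sized groups" property from the definition. Ties at a cut point can make the groups slightly uneven on real data.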

30 Python libraries for Geospatial Data Analysis

I have included a full list of 30 Python libraries here.

About Author

Ishan is an experienced data scientist with expertise in building data science and analytics capabilities from scratch including analysing unstructured/structured data, building end-to-end ML-based solutions, and deploying ML/DL models at scale on public cloud in production.

You may find him on LinkedIn.

In upcoming articles, I will be adding handy tricks for handling geospatial data in Python, such as coordinates, cities, and countries.

Follow📱 to stay updated on the upcoming articles! 🔔

Thank you for reading this article. 🙌 🙌


