A good proportion of the data out there in the real world is inherently spatial. From the population recorded in the national census, to every shop in your neighborhood, the majority of datasets have a location aspect that you can exploit to make the most of what they have to offer. This course will show you how to integrate spatial data into your Python Data Science workflow. You will learn how to interact with, manipulate and augment real-world data using their geographic dimension. You will learn to read tabular spatial data in the most common formats (e.g. GeoJSON, shapefile, geopackage) and visualize them in maps. You will then combine different sources using their location as the bridge that puts them in relation to each other. And, by the end of the course, you will be able to understand what makes geographic data unique, allowing you to transform and repurpose them in different contexts.
In this chapter, you will be introduced to the concepts of geospatial data, and more specifically of vector data. You will then learn how to represent such data in Python using the GeoPandas library, and the basics to read, explore and visualize such data. And you will exercise all this with some datasets about the city of Paris.
One of the key aspects of geospatial data is how they relate to each other in space. In this chapter, you will learn the different spatial relationships, and how to use them in Python to query the data or to perform spatial joins. Finally, you will also learn in more detail about choropleth visualizations.
In this chapter, we will take a deeper look into how the coordinates of the geometries are expressed based on their Coordinate Reference System (CRS). You will learn the importance of those reference systems and how to handle it in practice with GeoPandas. Further, you will also learn how to create new geometries based on the spatial relationships, which will allow you to overlay spatial datasets. And you will further practice this all with Paris datasets!
In this final chapter, we leave the Paris data behind us, and apply everything we have learnt up to now on a brand new dataset about artisanal mining sites in Eastern Congo. Further, you will still learn some new spatial operations, how to apply custom spatial operations, and you will get a sneak preview into raster data.
Open Source Software Developer; Core Developer of Pandas, GeoPandas and scikit-learn
Senior Lecturer in Geographic Data Science
“I've used other sites—Coursera, Udacity, things like that—but DataCamp's been the one that I've stuck with.”
Devon Edwards Joseph
Lloyds Banking Group
“DataCamp is the top resource I recommend for learning data science.”
Harvard Business School
“DataCamp is by far my favorite website to learn from.”
Decision Science Analytics, USAA