Meteorological Stations of Germany
Let's assume we are supposed to create a Python visualization, using data scattered across over 7,000 files of total size of 7.5 GB. Difficult task? With SQLAlchemy, no problem!
This was in fact one of the most exciting tasks we were asked to solve at Spiced Academy. I chose to create a hexagonal animated meteorological map.
The task was:
1. Create a single data frame from a set of meteorological measurements in TXT files.
2. Create two tables in a database using SQLAlchemy (measurements + locations of meteorological stations).
3. Extract relevant information using SQLAlchemy.
4. Create an interactive animated hexagonal map of temperature in Germany.
Solution
First, I downloaded all the files to my computer from the website of the European Climate Assessment & Dataset and created a database using Google Cloud Console.
Then, the four steps are easy to perform in a simple Jupyter Notebook once we import SQLAlchemy, a handy library to use SQL inside of Python code.
The sequence consists of:
1. In Jupyter notebook using Pandas, loop over all files and add the data to a single data frame. Export to CSV. This is our first table with measurements.
2. Log into Google Cloud database, create two tables corresponding to our data frames and load the data, all using SQLAlchemy.
3. With SQLAlchemy, use postgreSQL to write any SELECT statements. Use JOIN to get data from both tables at once and filter with smart WHERE statement.
4. Create the animation using plotly.figure_factory module.
See the project in my GitHub repo. You can find the most important part of the code in the readme file and all the code in the Jupyter notebook.
The resulting interactive animation shows the average yearly temperature for each hexagon with at least one meteorological station.