Get the code
By Kalonji Bankole | Published November 28, 2018 - Updated November 28, 2018
In this code pattern, we demonstrate how to analyze a large air quality dataset provided by the EPA, which can be considered a “smart cities” use case. The analysis uses Watson Studio and Python data science packages, and the accompanying Jupyter notebook offers a few different examples of how to take advantage of open source software packages to analyze datasets.
This pattern requires a structured dataset, which can be generated in a variety of ways. One way is to follow our related pattern titled “Setting up the hardware platform for long-range IoT systems that use LoRaWAN networking,” which goes through the process of deploying a long-range network to collect sensor data.
As an alternative, we’ll use a dataset that has been generated by the EPA, which measures pollutant levels at several locations throughout the United States. Measurements are taken hourly throughout the year, which enables us to leverage time series analysis.
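To give a rough idea of what hourly measurements make possible, here is a minimal sketch that rolls hourly readings up into daily averages, a common first step in time series analysis. It is not taken from the notebook: the `daily_means` helper and the readings are hypothetical, the values are synthetic rather than EPA data, and the sketch sticks to the Python standard library so it runs anywhere.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def daily_means(readings):
    """Group (timestamp, value) pairs by calendar day and average each day.

    Hypothetical helper for illustration; not part of the notebook.
    """
    by_day = defaultdict(list)
    for timestamp, value in readings:
        by_day[timestamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Synthetic hourly readings covering two days (made-up values, not EPA data).
start = datetime(2018, 1, 1)
hourly = [(start + timedelta(hours=h), 10.0 + (h % 24)) for h in range(48)]

averages = daily_means(hourly)
for day, value in averages.items():
    print(day, round(value, 2))
```

In a notebook, the same aggregation would more typically be done with a data science package; pandas, for example, provides `resample` on time-indexed data for this purpose.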
When you have completed this code pattern, you will understand how to:
Ready to get started? For detailed instructions, and especially for a walk-through of the analyses done by the Jupyter notebook, please see the README.