A Densely-Deployed, High Sampling Rate, Open-Source Air Pollution Monitoring WSN

Citation Author(s):
Bartolomeo Montrucchio, Politecnico di Torino, Dipartimento di Automatica e Informatica
Edoardo Giusto, Politecnico di Torino, Dipartimento di Automatica e Informatica
Mohammad Ghazi Vakili, Politecnico di Torino, Dipartimento di Automatica e Informatica
Stefano Quer, Politecnico di Torino, Dipartimento di Automatica e Informatica
Renato Ferrero, Politecnico di Torino, Dipartimento di Automatica e Informatica
Claudio Fornaro, Università Telematica Internazionale UNINETTUNO, Roma
Submitted by:
Mohammad Ghazivakili
Last updated:
Tue, 05/17/2022 - 22:21
DOI:
10.21227/m4pb-g538

Abstract 

This work contains data gathered by a series of sensors (PM 10, PM 2.5, temperature, relative humidity, and pressure) in the city of Turin, in northern Italy (more precisely, at coordinates 45.041903N, 7.625850E). The data were collected over a period of 5 months, from October 2018 to February 2019. The aim of the study was to address the calibration of low-cost particulate matter sensors and to compare their readings against the official measurements provided by the Italian environmental agency (ARPA Piemonte). The proposed database is designed to be general enough to handle not only PM measurements, temperature, and relative humidity, but also almost any other quantity, such as altitude, wind speed and direction, radioactivity, electromagnetic pollution, etc.

The total size of the database is about 50 GB of time-stamped data. The directory also contains several scripts that can be used to calibrate and analyze the acquired data, for example by plotting graphs, displaying the correlation with the reference values, and printing measurement errors. The scripts implement two commonly used calibration techniques, namely Multivariate Linear Regression and Random Forest, using the scikit-learn Python library.

The README files included in the main subdirectories give hints and comments on the data set format and the logic of the scripts. Please refer to them for further details. Please note that, following article 18.5 of Italian Decree 155/2010 on the dissemination of air quality data, which implements EU Directive 2008/50/EC, ARPA Piemonte (http://www.arpa.piemonte.it/english-version) cannot be held responsible for any mistake in these data, which cannot be considered official, unlike the data provided by ARPA itself.

 

Authors can be contacted at the following addresses:

{bartolomeo.montrucchio, edoardo.giusto, mohammad.ghazivakili, stefano.quer, renato.ferrero}@polito.it, and c.fornaro@uninettunouniversity.net

 

Instructions: 

A Densely-Deployed, High Sampling Rate, Open-Source Air Pollution Monitoring WSN

Documentation for the air pollution monitoring station developed at Politecnico di Torino by:
Edoardo Giusto and Mohammad Ghazi Vakili, under the supervision of Prof. Bartolomeo Montrucchio.

System Overview

This section describes our architecture from several points of view, from the hardware and software architecture to the communication protocols.

Hardware Architecture

We target the following key characteristics of our system:

  1. Rapid and easy prototyping,
  2. Flexibility in connection scenarios, and
  3. Low-cost but dependable components.

As each board has to include only a limited number of modules, and to facilitate prototype development, we selected the Raspberry Pi single-board computer as the monitoring board.
Due to our constraints in terms of cost, size, and power consumption, we selected its Zero Wireless version, based on the ARM11 microprocessor.

The basic operating principle of the system is the following. The data gathered from the sensors are stored on the MicroSD card of the RPi. At fixed time intervals the RPi tries to connect to a Wi-Fi network and, if the connection is established, it uploads the newly acquired data to a remote server.
The Wi-Fi network is provided by a mobile phone operating as a personal hotspot, while the remote server hosts the database storing all the measurements.

Software Architecture

Wi-Fi connectivity was one of the requirements for the system but, at the same time, the system itself should not produce unnecessary electromagnetic noise that could affect the operation of the host's appliances.
To reduce the time during which the Wi-Fi connection was active, the Linux OS was set to activate the wireless interface only at predefined time instants, in order to connect to the portable hotspot.
Once connected to the network, the system performed the following tasks:

  1. synchronization of the system and RTC clock with a remote Network Time Protocol (NTP) server,
  2. synchronization of the local samples directory with the remote directory residing on the server.
    The latter task is performed using the UNIX rsync utility, which has to be installed on both machines.
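
The directory synchronization step can be sketched as follows. This is only an illustration under stated assumptions, not the script deployed on the boards: the remote target `user@example.org:/srv/ws/board01/` and the helper names `build_rsync_command` and `sync_once` are hypothetical, while `-a`, `-z`, and `--timeout` are standard rsync options.

```python
import subprocess
from typing import List


def build_rsync_command(local_dir: str, remote: str, timeout_s: int = 60) -> List[str]:
    """Return the rsync argument list that mirrors local_dir to remote.

    The trailing slash on the source makes rsync copy the directory
    contents rather than the directory itself.
    """
    source = local_dir.rstrip("/") + "/"
    return ["rsync", "-az", f"--timeout={timeout_s}", source, remote]


def sync_once(local_dir: str, remote: str) -> bool:
    """Run one upload attempt; return True if rsync exited successfully."""
    result = subprocess.run(build_rsync_command(local_dir, remote), check=False)
    return result.returncode == 0


# Hypothetical remote target: the real server name is not given in this document.
command = build_rsync_command("/home/alarm/ws/data", "user@example.org:/srv/ws/board01/")
print(" ".join(command))
```

Because rsync only transfers files that changed, a failed attempt (for instance when the hotspot is out of range) can simply be retried at the next scheduled connection.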

To gather data from the sensors, a Python program has been implemented. It runs continuously, with a separate process reading from each physical sensor plugged into the board and writing to the MicroSD card.
Note that, since the UART communication with the PM sensors had to take place over GPIOs, the pigpiod daemon has been used to create software serial ports on the Pi's pins.
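
The one-process-per-sensor layout can be sketched as follows. This is a simplified illustration, not the authors' acquisition program: the readers `fake_pm` and `fake_dht` are hypothetical stand-ins for the real sensor drivers, the CSV row layout is an assumption, and a temporary directory replaces the data directory on the MicroSD card.

```python
import csv
import tempfile
import time
from datetime import datetime
from multiprocessing import Process
from pathlib import Path
from typing import Callable, Sequence


def log_samples(name: str, read: Callable[[], Sequence[float]],
                out_dir: Path, n_samples: int, period_s: float) -> Path:
    """Poll one sensor and append time-stamped rows to daily CSV files."""
    sensor_dir = out_dir / name
    sensor_dir.mkdir(parents=True, exist_ok=True)
    for _ in range(n_samples):
        now = datetime.now()
        path = sensor_dir / f"{now:%Y-%m-%d}.csv"  # one file per acquisition day
        with path.open("a", newline="") as handle:
            csv.writer(handle).writerow([now.isoformat(), *read()])
        time.sleep(period_s)
    return sensor_dir


def fake_pm() -> Sequence[float]:
    """Hypothetical stand-in for a PM driver: returns (PM2.5, PM10)."""
    return (12.0, 20.0)


def fake_dht() -> Sequence[float]:
    """Hypothetical stand-in for a DHT22 driver: returns (temperature, humidity)."""
    return (21.5, 40.0)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sensors = {"pm_0": fake_pm, "dht22": fake_dht}
        workers = [Process(target=log_samples, args=(name, read, Path(tmp), 3, 0.1))
                   for name, read in sensors.items()]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
        print(sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*.csv")))
```

Keeping each sensor in its own process means that a stalled or failing sensor does not block the acquisition from the others.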

The directories on the remote server are a simple copy of the MicroSD cards mounted on the boards.
Data in these directories have been inserted in a MySQL database.

Mechanical Design and Hardware Components

In order to easily stack more than one device together, a 3D printed modular case has been designed.
Several enclosing frames can be tied together using nuts and bolts, with the use of a single cap on top.
The figure shows the 3D board design, together with the final sensor and board configurations.

Each platform is equipped with 4 PM sensors (a good trade-off between size and redundancy), 1 Temperature (T) and Relative Humidity (HT) sensor and 1 Pressure (P) sensor.
As our target was to capture a significant sampling of particulate matter data, we adopted the following sensors:

  1. The Honeywell HPMA115S0-XXX as PM sensor.
    As one of our targets was to evaluate these sensors' suitability for air pollution monitoring applications, we inserted 4 instances of this sensor in every platform.
    This redundancy allows us to detect anomalous readings and to tolerate several kinds of malfunction, making the overall system more stable.

  2. The DHT22 as temperature and relative humidity sensor.
    This sensor is very widespread in prototyping applications, with several open-source implementations of its library publicly available on the internet.

  3. The Bosch BME280 as a pressure sensor.
    This is a cheap but precise barometric pressure and temperature sensor which comes pre-soldered on a small PCB for easy prototyping.
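
One way the redundancy of the four PM sensors could be exploited is a median-based consistency check. The sketch below is an assumption made for illustration, not the method used by the authors: the function name `flag_outliers`, the sensor labels, the tolerance, and the sample values are all hypothetical.

```python
from statistics import median
from typing import Dict, List, Tuple


def flag_outliers(readings: Dict[str, float], tolerance: float) -> Tuple[float, List[str]]:
    """Return the median reading and the sensors deviating from it by more than tolerance."""
    centre = median(readings.values())
    suspects = [name for name, value in readings.items() if abs(value - centre) > tolerance]
    return centre, suspects


# Made-up PM2.5 readings from the four sensors of one board.
sample = {"pm_a": 21.0, "pm_b": 23.0, "pm_c": 22.0, "pm_d": 57.0}
centre, suspects = flag_outliers(sample, tolerance=10.0)
print(centre, suspects)  # 22.5 ['pm_d']
```

The median is used here because, with four sensors, a single faulty unit cannot pull it far from the value reported by the other three.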

The system also includes a Real Time Clock (RTC) module for the operating system to retrieve the correct time after a sudden power loss. The chosen device is the DS3231.
The DS3231 communicates via I2C interface and has native support in the Linux kernel.

Finally, a Printed Circuit Board (PCB) was designed to facilitate the connection and soldering of the sensors and the other components.

Database

Create database

The database structure can be created using the SQL scripts located in the SQL_Table folder of the Dataset directory.

mysql -u <user> [-h <host>] [-p] < create_db.sql

Load SQL data (SQL Format)

Data formatted as SQL can be loaded using the command mysql -u username -p WEATHER_STATION < db_whole_data.sql. The dump is available in the SQL_data/ folder of the Dataset directory, where it is shipped compressed as db_whole_data.sql.gz.
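
The directory listing shows the dump as a gzip archive (db_whole_data.sql.gz), so it has to be decompressed before it can be redirected into mysql. A minimal sketch using only the Python standard library follows; the helper name `decompress_dump` is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def decompress_dump(archive: Path) -> Path:
    """Write the decompressed copy of archive next to it and return its path."""
    target = archive.with_suffix("")  # db_whole_data.sql.gz -> db_whole_data.sql
    # Stream the data in chunks so the whole dump never has to fit in memory.
    with gzip.open(archive, "rb") as source, target.open("wb") as sink:
        shutil.copyfileobj(source, sink)
    return target


if __name__ == "__main__":
    dump = Path("SQL_data/db_whole_data.sql.gz")
    if dump.exists():
        print(decompress_dump(dump))
```

The resulting db_whole_data.sql can then be loaded with the mysql command given above.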

Load RAW data (CSV)

Data can be loaded using the Python script sql_ins.py, available in the mysql_insertion folder of the Dataset directory.

python sql_ins.py <data_folder>

The script assumes the following folder structure:

* data_folder
|-- 01-board_table
|-- 02-unit_of_measure_table
|-- 03-param_type_table
|-- 04-board_config_table
|-- 05-physical_sensor_table
|-- 06-logical_sensor_table
|-- 07-board_sensor_connection_table
|-- 08-measure_table
    |-- arpa
    |-- mobility
    |-- stations

Each folder contains a set of CSV files. The script automatically loads the data into the appropriate table using the correct fields, which are specified as a list of parameters in the script. It is possible to edit the script to load only a subset of the folders.
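
The folder layout above can be traversed as in the following sketch, which only enumerates the files and does not reproduce the logic of sql_ins.py. It assumes that the numeric prefixes encode the intended load order; the function name `iter_csv_files` is hypothetical.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_csv_files(data_folder: Path) -> Iterator[Tuple[str, Path]]:
    """Yield (top-level folder name, CSV path) pairs in numeric-prefix order."""
    for table_dir in sorted(p for p in data_folder.iterdir() if p.is_dir()):
        # rglob also descends into subfolders such as 08-measure_table/arpa.
        for csv_path in sorted(table_dir.rglob("*.csv")):
            yield table_dir.name, csv_path


if __name__ == "__main__":
    root = Path("data_folder")
    if root.is_dir():
        for folder, csv_path in iter_csv_files(root):
            print(folder, csv_path)
```

Loading only a subset of the folders then amounts to filtering on the first element of each pair.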

System Usage

To replicate the experiments, the user should clone the Raspberry Pi image onto a MicroSD card (16-32 GB).
On Linux, this can be done with the command dd if=/path/to/image of=/path/of/microsd bs=4M.
The sampling scripts are run automatically at system startup by a systemd unit. The same unit also respawns the processes automatically if a problem occurs. The data are stored in the /home/alarm/ws/data directory, with filenames corresponding to the date of acquisition.

In order to upload these data to a database, it is possible to use the guide contained in the "database" directory.

In order to perform calibration and tests, it is recommended to take a look at the guide contained in the "analysis" directory. A Python class has been implemented to perform calibration of sensors against the ARPA reference ones. The resulting calibration can then be applied to a time window of choice.
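
The idea of fitting a calibration against the reference and then applying it to other readings can be illustrated with a single-variable least-squares fit. This is a deliberately simplified stand-in, not the API of the ws_analysis class and not the multivariate or Random Forest models used by the scripts; the function names and the numbers are made up for illustration.

```python
from typing import List, Sequence, Tuple


def fit_linear(raw: Sequence[float], reference: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of reference = slope * raw + intercept."""
    n = len(raw)
    mean_x = sum(raw) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in raw)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, reference))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apply_calibration(raw: Sequence[float], slope: float, intercept: float) -> List[float]:
    """Map raw sensor readings onto the reference scale."""
    return [slope * x + intercept for x in raw]


# Made-up numbers, not taken from the data set.
raw_pm = [10.0, 20.0, 30.0, 40.0]    # low-cost sensor readings
arpa_pm = [8.0, 13.0, 18.0, 23.0]    # co-located reference readings
slope, intercept = fit_linear(raw_pm, arpa_pm)
print(slope, intercept)  # 0.5 3.0

# Apply the fitted calibration to readings from another time window.
print(apply_calibration([50.0, 60.0], slope, intercept))  # [28.0, 33.0]
```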

3D Model

A 3D model of the case has been developed using the SketchUp online software.
The resulting model is split into 5 parts, each small enough to fit in our 3D printer (Makerbot Replicator 2X).
The model is stackable, meaning that several cases can be put on top of each other, with a single roof piece.

Printed Circuit Board

A PCB has been developed using the KiCad software, to create a HAT for the RPi0 that connects all the sensors.

WS Analysis library documentation (v0.2)

The aim of this package is to provide fast and easy access to, and analysis of, the Weather Station database. The package is located in the analysis directory and is compatible only with Python 3. Please refer to the readme file for more information.

Directory Structure

project
├── 3D_Box
│   ├── Cap_v0_1stpart.skp
│   ├── Cap_v0_2dpart.skp
│   ├── ws_rpzero_noGPS_v1.skp
│   ├── ws_sensors_2d_half_v2.skp
│   └── ws_sensors_half_v2.skp
├── analysis
│   ├── arpa_station.json
│   ├── board.json
│   ├── example.py
│   ├── extract.py
│   ├── out.pdf
│   ├── requirements.txt
│   ├── ws_analysis
│   │   ├── __pycache__
│   │   │   └── ws_analysis.cpython-37.pyc
│   │   ├── rpt.txt
│   │   └── script_offset.py
│   ├── ws_analysis.md
│   ├── ws_analysis.pdf
│   ├── ws_analysis.py
│   └── ws_analysis.pyc
├── Dataset
│   ├── db_setup.html
│   ├── db_setup.md
│   ├── db_setup.pdf
│   ├── er_diagram.pdf
│   ├── mysql_insertion
│   │   ├── extract_to_file.py
│   │   ├── remove_duplicate.py
│   │   └── sql_ins.py
│   ├── SQL_Table
│   │   ├── create_db.sql
│   │   ├── create_measure_table.sql
│   │   └── load_data.sql
│   └── SQL_data
│       └── db_whole_data.sql.gz
├── PCB
│   └── WS_v2_output.tar.xz
├── readme.html
├── readme.md
├── readme.pdf
└── scripts
    ├── python
    │   ├── csv
    │   │   ├── arpa_retrieve.py
    │   │   ├── filemerge.py
    │   │   ├── gpx2geohash.py
    │   │   ├── parse_csv.py
    │   │   └── validation.py
    │   └── mpu9250
    │       └── gyro.py
    └── README.md