Energy-efficient indoor localization WiFi-fingerprint dataset

Citation Author(s):
Jose Luis Salazar González, Universidad de Sevilla
Luis Miguel Soria Morillo, Universidad de Sevilla
Juan Antonio Álvarez García, Universidad de Sevilla
Fernando Enríquez, Universidad de Sevilla
Antonio Ramon Jimenez Ruiz, (CAR) CSIC-UPM
Submitted by: Jose Luis Salaz...
Last updated: Tue, 05/17/2022 - 22:21
DOI: 10.21227/49yg-5d21

Abstract 

WiFi measurement dataset for WiFi-fingerprint indoor localization, compiled on the ground and first floors of the Escuela Técnica Superior de Ingeniería Informática in Seville, Spain. The facility covers approximately 24,000 m², although only accessible areas were surveyed. The dataset has two parts: a training set of 7175 WiFi fingerprints taken at 489 different locations, and a testing set of 390 samples taken with two different mobile devices, one sample per location. The testing set was compiled two days after the training set by sampling at random locations. The training set also includes magnetic magnitude values obtained through the Android magnetic field sensor API.

Instructions: 

The training dataset consists of 7175 fingerprints collected from 489 different locations. Each fingerprint is stored as a JSON object corresponding to a unique scan, with the following values:

  • _id: unique identifier for the fingerprint, used to differentiate one fingerprint from another.

  • avgMagneticMagnitude: average magnetic magnitude measured by the mobile phone sensor during the scan. This value is not used, but it is provided in case it proves useful.

  • location: object with the real-world coordinates at which the sample was captured.

    • floor: number indicating the floor on which the sample was captured.

    • lat: latitude of the point at which the sample was captured.

    • lon: longitude of the point at which the sample was captured.

  • timestamp: UNIX timestamp at which the sample was captured.

  • userId: identifier of the user who captured the sample. This value has been anonymized so that it is not directly identifiable but remains unique.

  • wifiDevices: list of APs appearing in the sample.

    • bssid: unique AP identifier. This value has been anonymized so that it is not directly identifiable but remains unique.

    • frequency: AP WiFi frequency.

    • level: AP WiFi signal strength (RSSI).

    • ssid: AP name. This value has been anonymized so that it is not directly identifiable but can still be used to compare APs with the same name.

The training dataset was compiled by taking samples every 3 meters on average, with 15 samples per location. Approximately 40 seconds were spent at each location performing consecutive scans with a bq Aquaris E5 4G device running stock Android 6.0.1, without moving during the process. The following is an example of a fingerprint; the list of WiFi devices has been shortened to two APs because the full list was too long.

{
   "_id":"5cc81e8ac28d6d2533709425",
   "avgMagneticMagnitude":40.615368,
   "location":{
      "floor":1,
      "lat": 37.357746,
      "lon": -5.9878354
   },
   "timestamp":1556618890,
   "userId":"USER-0",
   "wifiDevices":[
      {
         "bssid":"AP-BSSID-0",
         "frequency":2457,
         "level":-75,
         "ssid":"AP-SSID-0"
      },
      ...
      {  
         "bssid":"AP-BSSID-23",
         "frequency":2437,
         "level":-64,
         "ssid":"AP-SSID-6"
      }
   ]
}
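As a minimal sketch of how fingerprints with this structure might be prepared for a localization algorithm, the Python snippet below groups scans by location and converts each scan into a fixed-length RSSI vector. The inline sample, the function names, the -100 placeholder for APs not seen in a scan, and the grouping by exact coordinate equality are all assumptions made for illustration; the names and container layout of the dataset files are not described here.

```python
import json
from collections import defaultdict

# Hypothetical inline sample that mirrors the fingerprint structure above.
# Real data would be read from the dataset files instead.
SAMPLE = """
[
  {"_id": "a", "avgMagneticMagnitude": 40.6,
   "location": {"floor": 1, "lat": 37.357746, "lon": -5.9878354},
   "timestamp": 1556618890, "userId": "USER-0",
   "wifiDevices": [
     {"bssid": "AP-BSSID-0", "frequency": 2457, "level": -75, "ssid": "AP-SSID-0"},
     {"bssid": "AP-BSSID-23", "frequency": 2437, "level": -64, "ssid": "AP-SSID-6"}]},
  {"_id": "b", "avgMagneticMagnitude": 40.1,
   "location": {"floor": 1, "lat": 37.357746, "lon": -5.9878354},
   "timestamp": 1556618893, "userId": "USER-0",
   "wifiDevices": [
     {"bssid": "AP-BSSID-0", "frequency": 2457, "level": -77, "ssid": "AP-SSID-0"}]},
  {"_id": "c", "avgMagneticMagnitude": 39.8,
   "location": {"floor": 0, "lat": 37.358547, "lon": -5.9867215},
   "timestamp": 1556618950, "userId": "USER-0",
   "wifiDevices": [
     {"bssid": "AP-BSSID-23", "frequency": 2437, "level": -60, "ssid": "AP-SSID-6"}]}
]
"""

# Assumed placeholder RSSI (dBm) for APs that do not appear in a scan.
MISSING_RSSI = -100


def group_by_location(fingerprints):
    """Group fingerprints by their (floor, lat, lon) triple.

    Assumes repeated scans at one location share identical coordinates.
    """
    groups = defaultdict(list)
    for fp in fingerprints:
        loc = fp["location"]
        groups[(loc["floor"], loc["lat"], loc["lon"])].append(fp)
    return dict(groups)


def to_vector(fingerprint, bssids):
    """Return the RSSI levels of one scan, ordered like `bssids`."""
    seen = {ap["bssid"]: ap["level"] for ap in fingerprint["wifiDevices"]}
    return [seen.get(bssid, MISSING_RSSI) for bssid in bssids]


fingerprints = json.loads(SAMPLE)
groups = group_by_location(fingerprints)

# Stable column order: every BSSID seen anywhere in the sample, sorted.
bssids = sorted({ap["bssid"] for fp in fingerprints for ap in fp["wifiDevices"]})
vectors = [to_vector(fp, bssids) for fp in fingerprints]

for (floor, lat, lon), scans in groups.items():
    print(f"floor={floor} lat={lat} lon={lon}: {len(scans)} scan(s)")
```

Keeping the BSSID order fixed means vectors from different scans can be compared element by element, which is what distance-based fingerprint matching typically requires.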

The testing dataset consists of two tests with a total of 390 samples, taken with different devices at random locations within the areas covered by the training dataset. The dataset is grouped by test, and each test contains its captured samples, so both the per-sample error and the average error are available, and the error can be recalculated to evaluate different algorithms. Each test is stored as a JSON object with the following values:

  • _id: unique identifier for the test, used to differentiate one test from another.

  • userId: identifier of the user who performed the test. This value has been anonymized so that it is not directly identifiable but remains unique.

  • startTimestamp: UNIX timestamp indicating when the test started.

  • endTimestamp: UNIX timestamp indicating when the test ended.

  • samples: list of samples taken during testing.

    • timestamp: UNIX timestamp indicating when the sample was collected.

    • real: object with the real-world coordinates at which the sample was captured.

      • floor: number indicating the floor on which the sample was captured.

      • lat: latitude of the point at which the sample was captured.

      • lon: longitude of the point at which the sample was captured.

    • predicted: object with the predicted real-world coordinates.

      • floor: number indicating the predicted floor.

      • lat: latitude of the predicted point.

      • lon: longitude of the predicted point.

    • wifiDevices: list of APs appearing in the sample.

      • bssid: unique AP identifier. This value has been anonymized so that it is not directly identifiable but remains unique.

      • frequency: AP WiFi frequency.

      • level: AP WiFi signal strength (RSSI).

      • ssid: AP name. This value has been anonymized so that it is not directly identifiable but can still be used to compare APs with the same name.

    • error: approximate distance between the actual location and the predicted location.

  • error: average distance between the actual locations and the predicted locations.

The testing dataset was compiled two days after the training phase by taking samples at random locations spaced 3 meters apart on average, with a single scan per location. The samples were taken with two devices, one per test: a bq Aquaris E5 4G running stock Android 6.0.1 and a Xiaomi Redmi 4X running Android 7.1.2 with MIUI 10 Global 9.5.16. Before each sample was taken, the device was held still for 5 seconds. The following is an example of a test entry; the list of samples has been shortened to one sample and the list of WiFi devices to two APs because the full lists were too long.

{
   "_id":"5d13245e279a550b548e3bfe",
   "userId":"USER-0",
   "startTimestamp": 1557212799.6555429,
   "endTimestamp": 1557222705.0710876,
   "samples":[
      {
         "timestamp":1557212799.6552203,
         "real":{
            "floor":0,
            "lat":37.358547,
            "lon":-5.9867215
         },
         "predicted":{
            "floor":0,
            "lat":37.358547,
            "lon":-5.9868493
         },
         "wifiDevices":[
            {
                "bssid":"AP-BSSID-156",
                "frequency":2412,
                "level":-80,
                "ssid":"AP-SSID-5"
            },
            ...
            {
                "bssid":"AP-BSSID-146",
                "frequency":2462,
                "level":-36,
                "ssid":"AP-SSID-6"
            }
         ],
         "error":5.233510868645419
      },
      ...
   ],
   "error":3.975672826048607
}
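Since the per-sample and per-test errors can be recalculated, the Python sketch below shows one possible way to do so from the real and predicted coordinates. The description does not state which distance formula or unit produced the stored error values, so the haversine great-circle distance in meters used here is an assumption and its results may differ from the stored ones. The function names, the Earth radius constant, and the second sample in the inline data are also illustrative only, and floor mismatches are ignored.

```python
import math

# Mean Earth radius in meters; an assumption, not a value from the dataset.
EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def sample_error(sample):
    """Horizontal distance between the real and predicted positions of a sample."""
    real, pred = sample["real"], sample["predicted"]
    return haversine_m(real["lat"], real["lon"], pred["lat"], pred["lon"])


def mean_error(test):
    """Average of the per-sample distances within one test."""
    errors = [sample_error(s) for s in test["samples"]]
    return sum(errors) / len(errors)


# Reduced test entry: the first sample reuses the coordinates from the example
# above, the second is a hypothetical perfect prediction added for illustration.
example_test = {
    "samples": [
        {"real": {"floor": 0, "lat": 37.358547, "lon": -5.9867215},
         "predicted": {"floor": 0, "lat": 37.358547, "lon": -5.9868493}},
        {"real": {"floor": 0, "lat": 37.358547, "lon": -5.9867215},
         "predicted": {"floor": 0, "lat": 37.358547, "lon": -5.9867215}},
    ]
}

for s in example_test["samples"]:
    print(f"sample error: {sample_error(s):.2f} m")
print(f"mean error: {mean_error(example_test):.2f} m")
```

Replacing the predicted coordinates with the output of another algorithm and calling mean_error again gives a comparable figure for that algorithm under the same distance assumption.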

To identify the device used for each fingerprint in the dataset, the following mapping between users and devices is given:

USER-0: Xiaomi Redmi 4X (Android 7.1.2 with MIUI 10 Global 9.5.16)

USER-1: BQ Aquaris E5 4G (Android stock 6.0.1)
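The mapping above can be kept as a small lookup table when processing the data. The following Python sketch assumes records shaped like the examples in this description; the names DEVICE_BY_USER and device_for are hypothetical.

```python
# User-to-device mapping copied from the list above.
DEVICE_BY_USER = {
    "USER-0": "Xiaomi Redmi 4X (Android 7.1.2 with MIUI 10 Global 9.5.16)",
    "USER-1": "BQ Aquaris E5 4G (Android stock 6.0.1)",
}


def device_for(record):
    """Return the device description for a fingerprint or test record."""
    return DEVICE_BY_USER.get(record["userId"], "unknown device")


print(device_for({"userId": "USER-1"}))
```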
