Improving transportation research with passively-collected location data

Danielle McCool and Barry Schouten


WIN sensor data projects

  • Travel/mobility
    • Stakeholders RWS/KiM
    • Location (GSM, Wi-Fi, GPS) and motion sensors
  • Time use
    • Stakeholder SCP
    • Location (GSM, Wi-Fi, GPS) and motion sensors
    • Possibly wearables
  • Budget expenditure
    • Eurostat project @ HBS
    • Location (GSM, Wi-Fi, GPS) and camera
  • Fitness/physical activity
    • Stakeholders RIVM, GGD
    • Wearables

WIN sensor data projects

  • Travel/mobility for RWS/KiM
  • Time use for SCP
  • Budget expenditure as Eurostat project @ HBS
  • Fitness/physical activity for RIVM and GGD

This project on Travel and mobility

  • Both UU and CBS
  • Stakeholders RWS/KiM
  • Location (GSM, Wi-Fi, GPS)

CBS verplaatsingen app

Field test (Nov-Dec) to evaluate recruitment and data collection strategies, as well as general data quality

  • Analyses
    • Quick-turnaround set for making future decisions
    • More comprehensive set for quality analysis and methodology development
  • Team
    • Danielle McCool as PhD candidate
    • Two EMOS students (Katie Roth and Laurent Smeets)
    • Trainee (Lars Killaars)
    • WIN researchers

Dimensions of the data


Important bits

  • Device information
  • Location data
  • Track data
  • Stop data
  • Daily questions

Device information

  • Make
  • Model
  • OS
  • OS version

Location data

  • High-tracking mode: 1 measurement per second
  • Low-tracking mode: 1 measurement per minute

Track data

  • Start time
  • Stop time
  • Transportation mode

Stop data

  • Start time
  • Stop time
  • Stop name
  • Stop motive

Daily questions

  • “Did you have your phone with you today?”
  • “Was today a normal day for you?”

Interesting challenges

  • Incomplete data
  • Device differences
  • Strange sensor measurements
  • Sensitivity vs. battery life
  • What is a stop?

What is a stop?


What is a stop (lvl 2)


What is a stop (lvl 3)


Not a stop

  • Waiting at a stoplight
  • Being stuck in traffic
  • Switching Wi-Fi on and having your position change

A stop

  • Going from one building to another on campus
  • Taking your dog to the dog park
  • Dropping your kid off at school


  • Waiting for your train at the station
  • Taking your dog for a walk
  • Going to ask the neighbors for your package

Our stop definition

Two levels

  1. Data collection
  2. User interface

Data collection

Parameters trigger ‘high-tracking’ and ‘low-tracking’ modes on the device.

  1. Distance Delta Limit
  2. Time period within that radius
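
The switching rule can be sketched in a few lines of Python. This is an illustration under assumptions, not the app's actual logic: it assumes the device drops to low-tracking once it has stayed within the Distance Delta Limit of an anchor point for the full time period, and returns to high-tracking as soon as it leaves that radius. The names `Fix`, `tracking_modes`, and the default values (50 m, 180 s) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Fix:
    t: float    # seconds since start of recording
    lat: float  # degrees
    lon: float  # degrees


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def tracking_modes(fixes, distance_delta_limit_m=50.0, time_period_s=180.0):
    """Label each fix 'high' or 'low' (assumed rule, hypothetical defaults)."""
    modes = []
    anchor = None
    for fix in fixes:
        moved = (
            anchor is None
            or haversine_m(anchor.lat, anchor.lon, fix.lat, fix.lon)
            > distance_delta_limit_m
        )
        if moved:
            # Left the radius: reset the anchor and track at high frequency.
            anchor = fix
            modes.append("high")
        elif fix.t - anchor.t >= time_period_s:
            # Stayed inside the radius long enough: save battery.
            modes.append("low")
        else:
            modes.append("high")
    return modes
```

A larger radius or shorter time period saves battery but risks missing short stops, which is the sensitivity vs. battery life trade-off listed earlier.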

User interface

  1. Grouping radius
  2. Time
  3. Minimum Stop Accuracy
  4. Stop merge radius
  5. Stop merge max travel radius
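
A minimal Python sketch of how the first three parameters might combine into a stop detector. The parameter semantics are assumptions: fixes with a reported accuracy radius worse than the Minimum Stop Accuracy are dropped, and a stop is a run of fixes that stays within the grouping radius of its first fix for at least the time parameter. The function name, the tuple layout, and the default values are hypothetical, and the two merge parameters are not covered.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def detect_stops(fixes, grouping_radius_m=100.0, min_duration_s=300.0,
                 min_accuracy_m=50.0):
    """fixes: (t_seconds, lat, lon, accuracy_m) tuples sorted by time.

    Returns a list of (start_t, end_t, mean_lat, mean_lon) tuples.
    """
    # Assumed meaning of "Minimum Stop Accuracy": ignore imprecise fixes.
    usable = [f for f in fixes if f[3] <= min_accuracy_m]
    stops = []
    i = 0
    while i < len(usable):
        t0, lat0, lon0, _ = usable[i]
        j = i
        # Extend the run while the next fix stays within the grouping radius.
        while (j + 1 < len(usable)
               and haversine_m(lat0, lon0, usable[j + 1][1], usable[j + 1][2])
               <= grouping_radius_m):
            j += 1
        if usable[j][0] - t0 >= min_duration_s:
            cluster = usable[i:j + 1]
            mean_lat = sum(f[1] for f in cluster) / len(cluster)
            mean_lon = sum(f[2] for f in cluster) / len(cluster)
            stops.append((t0, usable[j][0], mean_lat, mean_lon))
            i = j + 1
        else:
            i += 1
    return stops
```

With these defaults, waiting at a stoplight for a minute is not a stop, while a ten-minute school drop-off is. The ambiguous cases (waiting for a train, walking the dog) depend entirely on where the thresholds are set.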

Grouping radius parameter


Time parameter


Better interpretation


Missing data

Missing data occur at many levels within this dataset.

  1. Recruitment
  2. App/device incompatibility
  3. App installation
  4. App closes itself
  5. App only has location on Wi-Fi or GPS
  6. Device dies
  7. Short losses due to tunnels or buildings
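
Levels 4 through 7 show up in the location stream as gaps between consecutive timestamps. A minimal Python sketch of gap detection follows. The function names, the tolerance factor, and the duration thresholds used for labelling are hypothetical choices for illustration; the source does not define them.

```python
def find_gaps(timestamps, expected_interval_s=60.0, tolerance=2.0):
    """Return (gap_start, gap_end, duration_s) for each gap in the stream.

    A gap is any spacing between consecutive timestamps that exceeds
    expected_interval_s * tolerance. The 60 s default matches the
    low-tracking rate of one measurement per minute.
    """
    ts = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ts, ts[1:]):
        if cur - prev > expected_interval_s * tolerance:
            gaps.append((prev, cur, cur - prev))
    return gaps


def classify_gap(duration_s):
    """Rough label for a gap; thresholds are illustrative assumptions."""
    if duration_s < 15 * 60:
        return "short loss"      # e.g. a tunnel or building
    if duration_s < 24 * 3600:
        return "within-day gap"  # e.g. app closed or battery empty
    return "multi-day gap"       # e.g. participant stopped using the app
```

For example, `find_gaps([0, 60, 120, 600, 660])` reports a single 480-second gap between 120 and 600, which `classify_gap` labels a short loss.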

Missing data in the recruitment phase


Missing data over time per OS


Missing data within a day


Missing data within a trip


Next steps

The PhD project consists of five subprojects (2018-2021):

  • A descriptive paper about the app
  • Adjustment for missing data in CBS verplaatsingen app
  • Adjustment for measurement error/inaccurate measurements in CBS verplaatsingen app
  • Two projects linked to time-use sensor data


  • Field test with 1900 to make a first foray into replacing paper surveys
  • App generally successful
  • Self-reported stops are difficult to reproduce programmatically
  • Lots of flavors of missingness
  • Immediate issues: reporting distance to the stakeholders
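
One simple way to derive a reportable distance from raw location data is to sum the great-circle distance between consecutive fixes. The Python sketch below is an illustration under assumptions, not the method used in the project. The name `track_distance_m` and the `min_step_m` jitter filter are hypothetical; the filter is there because sensor noise while stationary would otherwise add up to spurious distance.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def track_distance_m(points, min_step_m=0.0):
    """Sum distances along a track given as (lat, lon) pairs.

    Steps shorter than min_step_m, measured from the last accepted
    point, are treated as jitter and skipped.
    """
    total = 0.0
    last = None
    for lat, lon in points:
        if last is None:
            last = (lat, lon)
            continue
        step = haversine_m(last[0], last[1], lat, lon)
        if step >= min_step_m:
            total += step
            last = (lat, lon)
    return total
```

Missing data within a trip still biases this estimate downward, since a straight line across a gap is shorter than the route actually travelled.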

Like to know more?