Simulator Pre-Screening of Underprepared Drivers Prior to Licensing On-Road Examination: Clustering of Virtual Driving Test Time Series Data

David Grethlein; Flaura Koplin Winston; Elizabeth Walshe; Sean Tanner; Venk Kandadai; Santiago Ontanon

doi:10.2196/13995

Journal article

Open access

Peer reviewed

Journal of Medical Internet Research, v 22(6), e13995

18 Jun 2020

DOI: https://doi.org/10.2196/13995

PMID: 32554384

Featured in Collection: UN Sustainable Development Goals @ Drexel

Files and links (1)

https://doi.org/10.2196/13995

Published, Version of Record (VoR), CC BY V4.0, Open

Abstract

Health Care Sciences & Services

Life Sciences & Biomedicine

Medical Informatics

Science & Technology

Background: A large Midwestern state commissioned a virtual driving test (VDT) to assess driving skills preparedness before the on-road examination (ORE). Since July 2017, a pilot deployment of the VDT in state licensing centers (VDT pilot) has collected both VDT and ORE data from new license applicants with the aim of creating a scoring algorithm that could predict those who were underprepared.

Objective: Leveraging data collected from the VDT pilot, this study aimed to develop and conduct an initial evaluation of a novel machine learning (ML)-based classifier using limited domain knowledge and minimal feature engineering to reliably predict applicant pass/fail on the ORE. Such methods, if proven useful, could be applicable to the classification of other time series data collected within medical and other settings.

Methods: We analyzed an initial dataset that comprised 4308 drivers who completed both the VDT and the ORE, in which 1096 (25.4%) drivers went on to fail the ORE. We studied 2 different approaches to constructing feature sets to use as input to ML algorithms: the standard method of reducing the time series data to a set of manually defined variables that summarize driving behavior and a novel approach using time series clustering. We then fed these representations into different ML algorithms to compare their ability to predict a driver's ORE outcome (pass/fail).

Results: The new method using time series clustering performed similarly to the standard method in overall accuracy for predicting the pass or fail outcome (76.1% vs 76.2%) and in area under the curve (0.656 vs 0.682). However, time series clustering slightly outperformed the standard method in differentially predicting failure on the ORE: the novel clustering method yielded a risk ratio for failure of 3.07 (95% CI 2.75-3.43), whereas the standard variables method yielded a risk ratio for failure of 2.68 (95% CI 2.41-2.99). In addition, the time series clustering method with logistic regression produced the lowest ratio of false alarms (those who were predicted to fail but went on to pass the ORE; 27.2%).

Conclusions: Our results provide initial evidence that the clustering method is useful for feature construction in classification tasks involving time series data when resources for creating multiple domain-relevant variables are limited.
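
The risk ratios quoted in the Results can be read as comparing the ORE failure rate among applicants a classifier flags as likely to fail with the failure rate among those it does not flag. The abstract does not state how the confidence intervals were computed. The sketch below assumes the common log-method (Katz) interval. The function name risk_ratio_ci, its parameters, and the counts in the usage lines are hypothetical illustrations, not the study's code or data.

```python
import math


def risk_ratio_ci(fail_flagged, n_flagged, fail_unflagged, n_unflagged, z=1.96):
    """Risk ratio of failure (flagged vs. unflagged) with a log-method CI.

    Assumption: the Katz log method; the source abstract does not say
    which interval method the authors used.
    """
    if fail_flagged <= 0 or fail_unflagged <= 0:
        raise ValueError("the log method needs at least one failure in each group")

    # Failure risk in each group.
    risk_flagged = fail_flagged / n_flagged
    risk_unflagged = fail_unflagged / n_unflagged
    rr = risk_flagged / risk_unflagged

    # Standard error of ln(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / fail_flagged - 1 / n_flagged + 1 / fail_unflagged - 1 / n_unflagged
    )

    # Build the interval on the log scale, then exponentiate back.
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts for illustration only (NOT taken from the study):
# 200 of 400 flagged applicants fail; 100 of 600 unflagged applicants fail.
rr, lower, upper = risk_ratio_ci(200, 400, 100, 600)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A risk ratio whose interval lies entirely above 1, as both reported intervals do, indicates that flagged applicants fail the ORE at a higher rate than unflagged applicants.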

Metrics

6 Record Views

6 citations in Web of Science

7 citations in Scopus

Details

Title: Simulator Pre-Screening of Underprepared Drivers Prior to Licensing On-Road Examination: Clustering of Virtual Driving Test Time Series Data
Creators: David Grethlein - Drexel University
Flaura Koplin Winston - Children's Hospital of Philadelphia
Elizabeth Walshe - Annenberg Public Policy Center
Sean Tanner - Rutgers, The State University of New Jersey
Venk Kandadai - Diagnostic Driving, Inc, Philadelphia, PA, United States.
Santiago Ontanon - Drexel University
Publication Details: Journal of Medical Internet Research, v 22(6), e13995
Publisher: JMIR Publications, Inc
Number of pages: 14
Resource Type: Journal article
Language: English
Academic Unit: Computer Science
Web of Science ID: WOS:000540902300001
Scopus ID: 2-s2.0-85086691482
Other Identifier: 991019168078304721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration
Web of Science research areas: Health Care Sciences & Services; Medical Informatics
