Completed 2018 Research Project

Machine Learning Tools for Informing Transportation Technology and Policy

Principal Investigator
Missy Cummings
Duke University


Transportation analysts are inundated with requests to apply popular machine learning techniques to data sets in the hope of uncovering never-before-seen relationships that could revolutionize safety, congestion, and mobility. To demonstrate some of the pitfalls of such analytics, which include subjectivity at several points in the modeling process, we developed Logistic Regression (LR) and Neural Network (NN) models for a driving injury/fatality data set and a pedestrian fatality data set. We then developed five different representations for each data set: one LR model and one NN model, the latter examined with four different feature interpretations commonly used in the machine learning community.

This study showed that when attempting to determine whether road design variables significantly influenced driver injuries and fatalities, the answer was unclear, with many possible interpretations of the results. In the pedestrian model, results indicated that speed and age mattered, but no conclusions could be drawn about road design variables. However, interpretations could be very different depending on the model and parameters selected.
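
To make the point about interpretation-dependence concrete, the following is a minimal sketch, not the study's method or data. It fits a small logistic regression to synthetic data and ranks two features under two common readings of the same fitted model: standardized coefficients and raw-unit coefficients. The feature names (speed, curve), the effect sizes, and the sample size are all assumptions made up for illustration.

```python
import math
import random

def standardize(col):
    """Return the z-scored column and its standard deviation."""
    m = sum(col) / len(col)
    sd = math.sqrt(sum((v - m) ** 2 for v in col) / len(col))
    return [(v - m) / sd for v in col], sd

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Unregularized logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

# Synthetic data (assumed, not from the study): a continuous speed in mph
# and a binary flag for a curved road segment.
random.seed(0)
n = 2000
speed = [random.uniform(20, 70) for _ in range(n)]
curve = [1.0 if random.random() < 0.5 else 0.0 for _ in range(n)]
y = []
for s, c in zip(speed, curve):
    p = 1.0 / (1.0 + math.exp(-(0.08 * (s - 45) + 1.0 * c - 0.5)))
    y.append(1.0 if random.random() < p else 0.0)

# Fit on standardized features so gradient descent converges quickly.
names = ["speed", "curve"]
z_speed, sd_speed = standardize(speed)
z_curve, sd_curve = standardize(curve)
w, b = fit_logistic(list(zip(z_speed, z_curve)), y)

# Interpretation 1: effect per standard deviation of each feature.
std_coef = dict(zip(names, w))
# Interpretation 2: effect per raw unit (per mph, per curve flag).
raw_coef = {"speed": w[0] / sd_speed, "curve": w[1] / sd_curve}

rank_std = sorted(names, key=lambda k: abs(std_coef[k]), reverse=True)
rank_raw = sorted(names, key=lambda k: abs(raw_coef[k]), reverse=True)
print("ranking by standardized coefficient:", rank_std)
print("ranking by raw-unit coefficient:   ", rank_raw)
```

Under these assumed effect sizes, the two rankings disagree even though they come from the same fitted model: speed dominates per standard deviation, while the curve flag dominates per raw unit. Neither ranking is wrong; each answers a different question, which is one way the choice of interpretation can change the apparent conclusion.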


Project Details

Project Type: Research
Project Status: Completed
Start Date: 3-1-2018
End Date: 1-31-2021
Contract Year: Year 2
Total Funding from CSCRS: $45,000