linahcharif/NTDS-Team24

NTDS-Team24 : Analysis of delays on the New Jersey railway network

This is the GitHub repository for the final project for the course EE558 - A Network Tour of Data Science (EPFL).

The project analyzes train delays on the New Jersey railway network. The dataset is available on Kaggle (https://www.kaggle.com/pranavbadami/nj-transit-amtrak-nec-performance). Because of the large amount of data available, we chose to focus mainly on data from March 2018.

Repository structure

The results (figures and GIFs) are stored in the Images and Gifs folders.

There are 4 notebooks in total:

  1. Data exploration and processing for the network analysis: Preprocessing_def
  2. Clustering: Clusters_original
  3. ML Model 1 (RNN): Final_LSTM_Model
  4. ML Models 2 (ANN) and 3 (Ridge): Classification_Model

The data for March and June 2018 are available on the main page, as well as the March data split into inward and outward trips (going to or coming from New York). The coordinates of the network's stations are also available.
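
As an illustration of the inward/outward split, here is a minimal sketch; it is not the code used in the notebooks. It assumes each record is one stop of a trip with the fields `date`, `train_id`, `stop_sequence`, `from` and `to`, and that New York is labelled `"New York Penn Station"`; both the field names and the label are assumptions to check against the actual CSV files. A trip counts as inward if its last stop arrives at New York and as outward if its first stop departs from it; trips that do neither are dropped.

```python
from collections import defaultdict

# Assumed station label for New York; verify against the data files.
NEW_YORK = "New York Penn Station"


def split_by_direction(rows, hub=NEW_YORK):
    """Split stop records into inward (ending at hub) and outward (starting at hub) trips."""
    # Group the stop records of each trip: one trip = one train on one date.
    trips = defaultdict(list)
    for row in rows:
        trips[(row["date"], row["train_id"])].append(row)

    inward, outward = [], []
    for stops in trips.values():
        # Order the stops along the trip before looking at its endpoints.
        stops.sort(key=lambda r: r["stop_sequence"])
        if stops[-1]["to"] == hub:
            inward.extend(stops)
        elif stops[0]["from"] == hub:
            outward.extend(stops)
        # Trips that neither start nor end at the hub are ignored.
    return inward, outward


# Tiny hand-made sample (not real data) to show the expected input shape.
rows = [
    {"date": "2018-03-01", "train_id": "A", "stop_sequence": 1,
     "from": "Trenton", "to": "Princeton Junction"},
    {"date": "2018-03-01", "train_id": "A", "stop_sequence": 2,
     "from": "Princeton Junction", "to": "New York Penn Station"},
    {"date": "2018-03-01", "train_id": "B", "stop_sequence": 1,
     "from": "New York Penn Station", "to": "Secaucus"},
]
inward, outward = split_by_direction(rows)
```

Grouping by `(date, train_id)` matters because the same train number runs every day; classifying individual rows by station name alone would split a single trip across both sets.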
