Solving the Cargo 2000 dataset problem

Suraj Negi
Jan 24, 2021

Hi all,
I was recently inspired to tackle the Cargo 2000 dataset problem: finding the number of legs (i.e., stops).

Source:

Andreas Metzger (andreas.metzger ‘@’ paluno.uni-due.de)
paluno (The Ruhr Institute for Software Technology)
University of Duisburg-Essen
Gerlingstraße 16
45127 Essen, Germany

Data Set Information:

A description of the dataset and the underlying Cargo 2000 standard is available at https://archive.ics.uci.edu/ml/datasets/Cargo+2000+Freight+Tracking+and+Tracing.

The dataset contains a large number of null values, so I filled them with the mean.
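
As a rough illustration of mean imputation, here is a minimal sketch using pandas. The column names and values are hypothetical stand-ins, not the actual Cargo 2000 fields, and it assumes the mean is taken per column.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the Cargo 2000 table; column names are hypothetical.
df = pd.DataFrame({
    "planned_minutes": [100.0, np.nan, 140.0, 120.0],
    "effective_minutes": [110.0, 130.0, np.nan, np.nan],
})

# Replace each missing value with the mean of its own column.
filled = df.fillna(df.mean(numeric_only=True))
print(filled)
```

Mean imputation keeps every row but shrinks the variance of heavily imputed columns, so it is worth checking how many values per column were missing before relying on it.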

DATASET

I used a correlation heatmap to see how the features relate to one another.
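
A correlation heatmap is a coloured view of the pairwise correlation matrix. The sketch below computes that matrix with pandas on synthetic data; the column names are made up for illustration, and the plotting call is left as a comment because it assumes seaborn is installed.

```python
import numpy as np
import pandas as pd

# Synthetic data: `effective` is almost a copy of `planned`, `unrelated` is noise.
rng = np.random.default_rng(0)
planned = rng.normal(100, 10, size=200)
df = pd.DataFrame({
    "planned": planned,
    "effective": planned + rng.normal(0, 1, size=200),
    "unrelated": rng.normal(0, 1, size=200),
})

# Pairwise Pearson correlation (the pandas default) between all columns.
corr = df.corr()
print(corr.round(2))

# To draw the heatmap itself (assumes seaborn is available):
# import seaborn as sns
# sns.heatmap(corr, annot=True, cmap="coolwarm")
```

Pairs with a correlation close to 1 or -1 carry largely the same information, which is one hint that a dimensionality reduction step may help.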

I tried several classification algorithms and found that the models were overfitting.
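
One common way to spot overfitting is to compare accuracy on the training split with accuracy on a held-out split. The sketch below does this with scikit-learn on synthetic data; the decision tree and the data are assumptions for illustration, not necessarily the models or features used in this post.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Cargo 2000 features; flip_y adds label noise.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=3, flip_y=0.3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# An unconstrained tree can memorise the training rows, noise included.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

A large gap between the two numbers suggests the model has fitted noise in the training data rather than a pattern that generalises.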

I then applied PCA, which resolved the issue.
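
A minimal sketch of putting PCA in front of a classifier with scikit-learn is shown below. The synthetic data, the 95% variance threshold, and the logistic regression model are all assumptions chosen for illustration; the linked repository has the actual setup.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with many redundant (linearly dependent) features.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=3, n_redundant=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Scale first (PCA is sensitive to feature scale), keep enough components
# to explain 95% of the variance, then fit the classifier on those components.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95, svd_solver="full"),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

n_kept = model.named_steps["pca"].n_components_
test_acc = model.score(X_test, y_test)
print(f"components kept: {n_kept} of {X.shape[1]}, test accuracy: {test_acc:.2f}")
```

Wrapping the steps in a pipeline means the scaler and PCA are fitted on the training split only, so no information from the test split leaks into the transformation.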

Model values after using PCA

Please find the full code + dataset here: https://github.com/whosurajnegi/Cargo2000
