MeLi Data Challenge 2019 | Deep Learning

Author: Mariano Leonel Acosta | Leaderboard #17 - 0.89764

https://ml-challenge.mercadolibre.com/final_results

I developed a predictive system that classifies products into 1588 different categories. Using Natural Language Processing (NLP) combined with Deep Learning, I analyzed over two million product descriptions from Mercado Libre and predicted new cases with a balanced accuracy of 89.76%.
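For readers unfamiliar with the metric: balanced accuracy is the mean of the per-class recall, so rare categories count as much as frequent ones. The sketch below is only an illustration in plain Python and is not code from this repository; the function name and the toy labels are hypothetical. scikit-learn's `balanced_accuracy_score` computes the same quantity.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean per-class recall over the classes present in y_true."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy labels (hypothetical category names, not taken from the dataset).
y_true = ["PHONES", "PHONES", "PHONES", "PHONES", "SHOES", "SHOES"]
y_pred = ["PHONES", "PHONES", "PHONES", "SHOES", "SHOES", "SHOES"]

score = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy: {score:.3f}, plain accuracy: {plain_accuracy:.3f}")
```

In this toy case the two metrics differ because the classes are imbalanced, which is the situation balanced accuracy is meant to handle.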

The final model is a neural network ensemble that combines Long Short-Term Memory (LSTM) RNNs and Convolutional Neural Networks (CNNs). Each sub-model was trained independently on a different subset of the dataset. To make the final prediction, the outputs of the sub-models are combined using a weighted sum.
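The weighted-sum step can be sketched as follows. This is a minimal illustration, assuming each sub-model outputs one probability per class for every sample; the function name, the toy probabilities, and the weights are hypothetical and are not the values used in main.py.

```python
def weighted_ensemble(prob_sets, weights):
    """Combine per-model class probabilities with a weighted sum.

    prob_sets: one entry per model; each entry is a list of rows, and each
               row is a list with one probability per class.
    weights:   one weight per model.
    Returns the index of the highest-scoring class for every row.
    """
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per model")
    n_rows = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    predictions = []
    for row in range(n_rows):
        # Weighted sum of each model's probability for every class.
        combined = [
            sum(w * probs[row][cls] for probs, w in zip(prob_sets, weights))
            for cls in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=combined.__getitem__))
    return predictions


# Hypothetical outputs of two sub-models for two samples and three classes.
lstm_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
cnn_probs = [[0.2, 0.7, 0.1], [0.1, 0.1, 0.8]]

preds = weighted_ensemble([lstm_probs, cnn_probs], weights=[0.7, 0.3])
equal_preds = weighted_ensemble([lstm_probs, cnn_probs], weights=[0.5, 0.5])
print(preds, equal_preds)
```

Changing the weights can change the predicted class, which is why the weights are usually tuned on a held-out validation set.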

Implementation

To try this project yourself, first download the dataset into the resources directory (using Bash):

$ wget -P resources https://meli-data-challenge.s3.amazonaws.com/train.csv.gz
$ wget -P resources https://meli-data-challenge.s3.amazonaws.com/test.csv
$ wget -P resources https://meli-data-challenge.s3.amazonaws.com/sample_submission.csv
$ gunzip resources/train.csv.gz

(Alternatively, the resources can be downloaded manually from HERE)

Next, simply run the main.py script.

The following libraries are required:

NumPy
Pandas
scikit-learn
TensorFlow
Keras
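Before running main.py, it may help to confirm that these libraries are installed. The snippet below is an optional sketch and is not part of the repository; it uses the standard import names (for example, scikit-learn is imported as `sklearn`), and the helper name `missing_libraries` is hypothetical.

```python
from importlib.util import find_spec

# Display name -> module name used in an import statement.
REQUIRED = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "scikit-learn": "sklearn",
    "TensorFlow": "tensorflow",
    "Keras": "keras",
}


def missing_libraries(required=REQUIRED):
    """Return the display names of libraries that cannot be found."""
    return [name for name, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```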
