This repository contains code and resources for a Kaggle competition using NLP techniques to classify tweets as disaster-related or not.

phitrann/Natural-Language-Processing-with-Disaster-Tweets-Competition

Natural-Language-Processing-with-Disaster-Tweets-Competition

  • This repository contains the code and resources for our entry in the Kaggle competition "Natural Language Processing with Disaster Tweets". The goal of the competition is to use natural language processing techniques to classify tweets as disaster-related or not.
  • Our submission ranked 45th on the competition leaderboard.

Overview

  • Participants are given a dataset of tweets, each labeled as disaster-related or not. The task is to build a model that accurately classifies unseen tweets into these two classes. The competition encourages the use of natural language processing (NLP) techniques to extract meaningful features from the text and train a predictive model.
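As an illustration of what "extracting features from text" can look like, here is a minimal sketch that cleans a tweet and turns it into bag-of-words counts. The cleaning rules, the function names `tokenize` and `bag_of_words`, and the example tweet are all hypothetical; they are not taken from the notebooks, which may preprocess the data differently.

```python
from __future__ import annotations

import re
from collections import Counter

# Hypothetical cleaning rules, chosen for illustration only.
URL_RE = re.compile(r"https?://\S+")      # links carry little lexical signal
MENTION_RE = re.compile(r"@\w+")          # user handles
TOKEN_RE = re.compile(r"[a-z0-9']+")      # runs of lowercase letters, digits, apostrophes


def tokenize(tweet: str) -> list[str]:
    """Lowercase a tweet, drop URLs and @mentions, and split it into word tokens."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return TOKEN_RE.findall(text)


def bag_of_words(tweet: str) -> Counter:
    """Count how often each token occurs in a single tweet."""
    return Counter(tokenize(tweet))


if __name__ == "__main__":
    # Made-up example tweet, not a row from the competition dataset.
    example = "Huge FIRE near the station, fire crews on site @bob http://t.co/x #wildfire"
    print(tokenize(example))
    print(bag_of_words(example).most_common(3))
```

Counts like these could feed a simple linear classifier; a real submission would more likely rely on richer representations, but the cleaning step is similar in spirit.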

Contents

  • datasets/: This directory contains the dataset for the competition.
  • sources/: This directory contains Jupyter notebooks for exploratory data analysis, feature engineering, and model development.
    • data_collecting.ipynb: Explains the dataset.
    • data_exploring.ipynb: Explores the data using visualization techniques.
    • natural-language-processing-with-disaster-tweets.ipynb: Performs exploratory data analysis (EDA) and data preprocessing, then splits the data into train/val/test sets.
    • natural-language-processing-with-disaster-tweet-v2.ipynb: Builds the model using the processed data from natural-language-processing-with-disaster-tweets.ipynb.
    • data_modeling (2).ipynb: An alternative model by a teammate that achieves the same result.
  • Report_Final.pdf: The final report for this project.
  • linkyoutube.txt: A link to our team's presentation.
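The train/val/test split mentioned for natural-language-processing-with-disaster-tweets.ipynb can be sketched roughly as follows. This is a standard-library illustration, not the notebook's actual code: the function name `split_train_val_test`, the 80/10/10 proportions, and the seed are assumptions.

```python
import random


def split_train_val_test(rows, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle rows reproducibly and split them into train/val/test lists.

    The fractions and the seed are illustrative defaults (assumptions),
    not values taken from the notebooks.
    """
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("val_frac and test_frac must be >= 0 and sum to less than 1")

    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable

    n_val = int(len(shuffled) * val_frac)
    n_test = int(len(shuffled) * test_frac)

    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


if __name__ == "__main__":
    # Placeholder rows standing in for (tweet, label) pairs.
    rows = list(range(100))
    train, val, test = split_train_val_test(rows)
    print(len(train), len(val), len(test))
```

Fixing the seed keeps the validation and test sets stable across runs, which matters when several notebooks are meant to be compared on the same held-out data.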

Getting Started

To get started with this project, follow these steps:

  • Clone this repository.
  • Explore the notebooks in the sources/ directory to understand the data and the NLP techniques used.
  • Evaluate the model's performance and make necessary improvements.
