rania-azad/text_summarization_LimitelessDeepLearning-Course

Text Summarization Project with AraGPT2

Project Overview

This project provides hands-on experience in fine-tuning the Arabic GPT-2 (AraGPT2) model for text summarization. Students work with a custom dataset and get to understand and implement several aspects of Natural Language Processing (NLP), particularly summarization with transformer models.

Objectives

  • Understanding the fundamentals of text summarization.
  • Exploring the architecture and capabilities of the AraGPT2 model.
  • Fine-tuning AraGPT2 on a custom dataset for summarization tasks.
  • Evaluating the performance of the fine-tuned model.
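
As an illustration of the evaluation objective, here is a minimal sketch of ROUGE-1 F1, a common unigram-overlap score for summaries. The repository does not say which metric it uses, so the choice of ROUGE-1, the whitespace tokenization, and the function name `rouge1_f1` are all assumptions made for this example.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Hypothetical helper, not part of the repository. Tokens are split on
    whitespace, which is a simplification for Arabic text.
    """
    ref_tokens = reference.split()
    cand_tokens = candidate.split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it occurs in both.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

In practice a maintained ROUGE implementation with proper Arabic tokenization would likely be preferable; this version only shows what the score measures.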

Repository Structure

  • data/: Directory containing the dataset for training and validation.
  • src/: Contains the source code with TODOs for students.
    • utils_data.py: Custom dataset class for handling the summarization dataset.
    • utils_tokenizer.py: Custom tokenizer for text summarization.
    • train.py: Main training script with placeholders (TODOs) for students to complete.
  • main.ipynb: Notebook for the main calls; also contains TODOs to be filled in by students.
  • requirements.txt: List of Python dependencies for the project.
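
To give a feel for what a summarization dataset class for a decoder-only model such as GPT-2 typically prepares, here is a minimal sketch in plain Python. It is not the code in `utils_data.py`: the function name `build_example`, the separator and end-of-sequence ids, and the choice to mask the article tokens in the labels are assumptions for illustration.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def build_example(text_ids, summary_ids, sep_id, eos_id, max_len):
    """Build one training example: article ids, separator, summary ids, EOS.

    Hypothetical helper. All arguments are plain lists of ints or ints;
    a real pipeline would get the ids from the tokenizer.
    """
    # Reserve room for the full summary plus the separator and EOS tokens.
    budget = max_len - len(summary_ids) - 2
    if budget < 0:
        raise ValueError("max_len is too small to hold the summary")
    prompt = list(text_ids[:budget]) + [sep_id]  # truncate the article, not the summary
    target = list(summary_ids) + [eos_id]
    input_ids = prompt + target
    # Mask the article part so the loss is computed only on the summary tokens.
    labels = [IGNORE_INDEX] * len(prompt) + target
    return {"input_ids": input_ids, "labels": labels}
```

For example, `build_example([1, 2, 3, 4, 5], [7, 8], sep_id=50, eos_id=51, max_len=6)` keeps the first two article ids and the whole summary. Whether to mask the article tokens is a design choice; training on the full sequence is also common.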

Getting Started

You can upload the notebook and the code directly to Google Colab =)
