A Great Collection of Deep Learning Tutorials and Repositories for Natural Language Processing (NLP)
- Great NLP Posts
- Awesome NLP Paper Discussions - Hugging Face [Excellent]
- Ten trends in Deep learning NLP
- Attention in RNNs
- Understanding self-attention and other types of attention mechanisms
- BERT - TensorFlow
- Understanding XLNet
- XLNet - TensorFlow
- XLM (PyTorch implementation of Cross-lingual Language Model Pretraining)
- Pretrained PyTorch models for BERT
- Library of state-of-the-art pretrained models for NLP [Excellent]
- DistilBERT
- FastBert
- FastBert Linkedin Post
- PyTorch Hub - BERT
- A Simple Guide On Using BERT for Binary Text Classification
- Core ML 3 implementation of BERT for Question answering
- NLP - Keras - Intro
- AllenNLP [General NLP]
- Stanza - A Python NLP Library for Many Human Languages
- The Best NLP Papers From ICLR 2020
- Deep learning for natural language processing and information retrieval at the University of Waterloo
- Natural Language Processing With spaCy in Python [Great]
- NLP Papers
- A Great NLP Course
- KerasNLP: Modular NLP Workflows for Keras
- NLP Test: Deliver Safe & Effective Models
- Karpathy minbpe
- Karpathy's 2 Hours Tutorial for Building GPT Tokenizer
- Learning Core Foundational Concepts in NLP by Examples and by Calculating by Hand
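
For readers following the minbpe material above, here is a toy byte-pair-encoding sketch (an illustration in the spirit of minbpe, not Karpathy's actual code): start from raw bytes and repeatedly merge the most frequent adjacent pair into a new token id.

```python
from collections import Counter

def get_pair_counts(ids):
    """Count how often each adjacent token pair occurs."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

# Start from raw bytes and greedily merge the most frequent pair.
text = "aaabdaaabac"
ids = list(text.encode("utf-8"))
for step in range(3):
    pair = get_pair_counts(ids).most_common(1)[0][0]
    ids = merge(ids, pair, 256 + step)  # new ids start after the 256 byte values
print(ids)
```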
- SetFit: Efficient Few-shot Learning with Sentence Transformers
- Parsivar: library for Persian text preprocessing
- Hazm
- persianNLP
- ParsiNLU: Comprehensive suite of high-level NLP tasks for the Persian language
- FarsTail: A Persian Natural Language Inference Dataset
- wordfreq: Access a database of word frequencies
- Persian Stop Words List
- Persian Stop Words List in Hazm Repo
- PCoQA: Persian Conversational Question Answering Dataset
- Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language? [Good paper & dataset]
- Basalam Dataset via RadeAI Team
- Basalam Datasets for LLM Fine-tuning
- Beyond Word Embeddings Part 1
- Beyond Word Embeddings Part 2
- Learning Word Embedding
- Introduction to Word Embedding and Word2Vec
- Word Embedding
- Understanding Word Embeddings
- Introduction to Word Vectors
- Word2vec Made Easy
- What is GloVe? Part I
- What is GloVe? Part II
- What is GloVe? Part III
- What is GloVe? Part IV
- What is GloVe? Part V
- ELMo: Deep Contextualized Word Representation
- A Step-by-Step NLP Guide to Learn ELMo
- ELMo: Contextual language embedding
- word embeddings with ELMo
- Doc2Vec - Gensim
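
As a companion to the word-embedding posts above, a minimal gensim Word2Vec sketch (the toy corpus and hyperparameters are illustrative only; a real corpus would be far larger):

```python
from gensim.models import Word2Vec

# A few toy "sentences" (token lists).
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["dogs", "and", "cats", "are", "animals"],
]

# sg=1 selects skip-gram; sg=0 would use CBOW instead.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=50)

print(model.wv.most_similar("king", topn=3))  # nearest words by cosine similarity
print(model.wv["queen"][:5])                  # first 5 dims of the learned vector
```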
- Self-Supervised Representation Learning in NLP: https://amitness.com/2020/05/self-supervised-learning-nlp/
- COSINE: Fine-Tuning Pre-trained Language Model with Weak Supervision
- Understanding LSTM Networks
- Illustrated Guide to LSTM’s and GRU’s
- Animated RNN, LSTM and GRU
- Recurrent Neural Networks and LSTM explained
- Long Short-Term Memory (LSTM): Concept
- Understanding architecture of LSTM cell from scratch
- Basic understanding of LSTM
- Taming LSTMs with PyTorch
- Introduction to LSTM
- Introduction to RNNs
- xLSTM - Post1
- Were RNNs All We Needed? [Interesting Paper]
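
To make the LSTM posts above concrete, a minimal PyTorch LSTM text classifier sketch (shapes and sizes are arbitrary illustrations):

```python
import torch
import torch.nn as nn

# Embed tokens, run them through an LSTM, and classify from the final hidden state.
class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):                 # x: (batch, seq_len) of token ids
        emb = self.embed(x)               # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(emb)      # h_n: (1, batch, hidden_dim)
        return self.fc(h_n[-1])           # (batch, num_classes)

model = LSTMClassifier()
tokens = torch.randint(0, 1000, (4, 20))  # a fake batch of 4 sequences
print(model(tokens).shape)                # torch.Size([4, 2])
```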
- How Transformers Work
- The Illustrated Transformer
- Transformers from Scratch
- What is a Transformer?
- How Transformers work in deep learning and NLP
- Transformer: A Novel Neural Network Architecture for Language Understanding
- How do Transformers Work in NLP?
- The Essence of Transformers [Good]
- Transformers and Multi-Head Attention
- Multi-Head Attention
- BERT for Dummies
- The Dark Secrets of BERT
- A Survey of Long-Term Context in Transformers [Great]
- The Transformer Family
- The Transformer Isn’t As Hard To Understand As You Might Think
- Review of Compact Transformer Architectures [Great]
- REFORMER: The Efficient Transformer
- GPT-3: Language Models are Few-Shot Learners
- GPT-3 Sandbox
- Microsoft will launch GPT-4
- OpenAI GPT-4
- Some information about GPT-4
- Regular Expressions (Regex) Generated by GPT-3
- Auto Regex: Converting English description to Regex [Good]
- minGPT
- NVIDIA FasterTransformer: Transformer related optimization, including BERT & GPT
- OpenNMT CTranslate2: Fast inference engine for Transformer models
- Deploying GPT-J and T5 with FasterTransformer and Triton Inference Server [Interesting]
- MEND: Fast Model Editing at Scale [Excellent Work]
- BorealisAI Transformers I: Introduction
- OpenAI Best Practices for Deploying Language Models
- OPT-IML
- RetNet: an Alternative to Transformers
- Transformer Taxonomy [Great]
- Generative AI exists because of the transformer: Great Visual Explanation [Great]
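
The core mechanism behind every transformer post above is scaled dot-product attention; a minimal PyTorch sketch of the formula softmax(QK^T / sqrt(d_k))V:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V"""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k**0.5   # (batch, seq, seq)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v, weights

q = k = v = torch.randn(1, 5, 16)   # self-attention: queries = keys = values
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)        # torch.Size([1, 5, 16]) torch.Size([1, 5, 5])
```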
- RLHF Tutorial
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model (a new method replacing RLHF)
- Finetuning an LLM: RLHF and alternatives (Part I)
- Finetuning an LLM: RLHF and alternatives (Part II)
- Finetuning an LLM: RLHF and alternatives (Part III)
- How good is AI feedback?
- Direct Preference Optimization (DPO) for LLM Alignment (From Scratch)
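
A minimal sketch of the DPO loss discussed in the posts above, assuming you already have sequence-level log-probabilities from the policy and a frozen reference model (simplified relative to production implementations such as TRL's):

```python
import torch
import torch.nn.functional as F

def dpo_loss(pi_chosen_logps, pi_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO pushes the policy to prefer chosen over rejected responses,
    measured relative to a frozen reference model (no reward model needed)."""
    chosen_ratio = pi_chosen_logps - ref_chosen_logps        # log(pi/ref), preferred
    rejected_ratio = pi_rejected_logps - ref_rejected_logps  # log(pi/ref), dispreferred
    logits = beta * (chosen_ratio - rejected_ratio)
    return -F.logsigmoid(logits).mean()

# Sequence-level log-probs for a toy batch of 4 preference pairs.
loss = dpo_loss(torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4))
print(loss)
```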
- New paper by Meta claims that we can get rid of tokenizers: Byte Latent Transformer: Patches Scale Better Than Tokens
- Byte Latent Transformer: Patches Scale Better Than Tokens (paper)
- LLM Reading Papers
- LLaMA
- Toolformer: Language Models Can Teach Themselves to Use Tools [Great]
- Toolformer GitHub
- Amazon Multimodal Chain-of-Thought Reasoning in Language Models
- LLaMA-based ChatGPT Training [Great]
- The Wisdom of Hindsight Makes Language Models Better Instruction Followers
- Stanford Alpaca: An Instruction-following LLaMA model
- Alpaca: A Strong, Replicable Instruction-Following Model
- Fine-Tune Alpaca in Arabic
- TRL: Transformer Reinforcement Learning
- Large Language Model (LLM) Primers Tutorial [Great]
- Dolly
- Microsoft JARVIS & HuggingGPT [Interesting]
- open-source LLMs
- GPT4Free
- HuggingChat
- LaMini-LM: A Diverse Herd of Distilled Models
- RedPajama-Data: An Open Source Recipe to Reproduce LLaMA training dataset
- BigCode
- OpenLLaMA
- Dromedary: towards helpful, ethical and reliable LLMs
- MPT-7B Model with Commercial Licence
- MPT-7B Story Writer
- MPT-7B
- MPT-7B Blog
- Open LLMs
- Google PaLM 2
- BLOOMChat
- LLMs Practical Guide
- FrugalGPT
- ChatALL [Great]
- Falcon LLM
- The Falcon has landed in the Hugging Face ecosystem [Great]
- Open LLMs [Great]
- OpenLLMs: Less is More for Open-source Models [Great]
- LLaMA2
- source code of llama2-chatbot
- Notes about OpenAI's GPT-4 Model
- GPT-4 is getting worse over time
- OpenChat: Less is More for Open-source Models
- Instruction Tuning Datasets
- ToolLLM
- Falcon 180B
- Fine-tune Falcon 180B using QLoRA and Flash Attention on Amazon SageMaker
- Large Language Models as Optimizers
- Favourite LLM Authors
- Open Source LLMs for Commercial Use
- Optimizing your LLM in production [Important]
- In-Context Vectors (ICV): an alternative to Few-Shot Learning and fine-tuning techniques like LoRA to improve an LLM's performance
- NexusRaven v2 13B Function-Calling LLM Surpassing GPT-4
- Phixtral model
- Eagle-7B LLM: 100% attention-free RNN Model!
- Eagle-7B LLM: Blog Post
- Can LLMs improve themselves? Self-play fine-tuning (SPIN)
- AI2 OLMo Model: Linkedin Post
- AI2 OLMo Model: HuggingFace
- AI2 OLMo Model: Original Blog post
- Some Notes about OLMo Model
- Mixtral in colab [Great]
- Grok-1 LLM with 314B Size: Post1
- Grok-1 LLM: Post2
- DBRX LLM
- DBRX LLM: Post1
- DBRX LLM: Post2
- LLMs via Multi-Token Prediction
- Linkedin Post
- Colab Notebook
- Main Github of Mergekit
- huggingface merge-models blog post
- Making the NeuralBeagle14-7B LLM Model (via Merging models and other methods)
- Merge Large Language Models with mergekit
- Fine-tune a Mistral-7b model with Direct Preference Optimization
- AutoMerger
- Evolutionary LLM Merging - Post1
- Evolutionary LLM Merging - Post2
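
The simplest flavor of model merging covered above is linear interpolation of checkpoints; a toy PyTorch sketch (mergekit implements this plus far more sophisticated methods such as SLERP and TIES):

```python
import torch

def linear_merge(state_dict_a, state_dict_b, alpha=0.5):
    """Naive linear interpolation of two checkpoints with identical architecture:
    merged = alpha * A + (1 - alpha) * B for every weight tensor."""
    return {
        name: alpha * state_dict_a[name] + (1 - alpha) * state_dict_b[name]
        for name in state_dict_a
    }

# Toy demonstration with two tiny "models".
a = {"w": torch.ones(2, 2), "b": torch.zeros(2)}
b = {"w": torch.zeros(2, 2), "b": torch.ones(2)}
print(linear_merge(a, b, alpha=0.25))  # w == 0.25, b == 0.75
```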
- Mixture of Experts (MoEs) Explained [Great]
- Mixture of Experts (MoEs) Papers List
- Mixture of Experts (MoEs) Linkedin Post
- Mixture-of-Depths - Post1
- Mixture-of-Depths (MoD) - Post2
- AutoLoRA-Merging Linkedin Post
- A colab gradio web UI for running Large Language Models [Great]
- llama-2-7b-chat-GPTQ-4bit
- camenduru
- llama-2 philschmid
- fine-tuning LLMs with TRL
- lora tuning peft finetuning llama2
- LLaMA2 with PEFT
- Baby LLaMA2 in C
- Releasing LLongMA-2 16k
- LLaMA2 API in Hugging Face Inference
- LLaMA2 API in Monster API
- LLaMA2-Accessory
- Hermes-LLongMA-2 8k
- Training Llama 2
- Llama-2-7B-32K-Instruct — and fine-tuning for Llama-2 models with Together API
- LLaMA-Factory
- LLaMA-Factory Notes
- Purple llama by Meta - Link1
- Purple llama by Meta - Link2
- Purple llama by Meta - Link3
- TinyLLaMa-1.1B
- Can LLaMA learn a new language?
- Persian LLaMa
- LLaMA3 Linkedin Post1
- Meta LLaMA3-8B
- Fine tune LLaMA3
- LLaMA3 Long Context
- LLaMA3.1
- LLaMA 3.1 Some Notes
- LLaMA 3.1 Model Fine-tuning
- LLaMA 3.1 Detailed Notes
- LLaMA 3.2 Detailed Notes
- Mobile LLaMA 3.2
- Llama-3.3-70B-Instruct
- How an online gifting site is using Llama to help protect customer privacy [interesting]
- Mistral AI models
- Is Mistral's first model a good replacement for OpenAI?
- Mistral Mixture of Experts (MoE) Model
- Mixtral - a SOTA Mixture of Experts
- MistralTrix
- Nous-Hermes-Mixtral model
- Brev.dev Notebooks: Fine-tuning mistral, mixtral, phi-2 and etc [Excellent]
- Optimized LLM inference api for mistral-7b using vllm and AWQ [Excellent]
- Run Mistral7b Quantized for free on any computer (CPU or GPU) [Interesting]
- Mixtral 8x22B a 176B MoE Model
- Mistral-7B-Instruct-v0.3
- Codestral: A model fluent in 80+ programming languages
- Mistral Finetune: the official repo and guide on how to fine-tune Mistral open-source models
- Mistral Large 2 Model
- Introducing Qwen1.5 Blog Post
- Qwen1.5 Linkedin Post
- Qwen1.5 HuggingFace
- Qwen2 HuggingFace
- Qwen MoE Model
- Qwen2
- Qwen 2.5 - Linkedin Post
- Qwen 2.5 - Models
- Gemma an open Gemini LLM released by Google! - Linkedin Post
- Gemma - another linkedin post
- Google's Gemma Detailed Notes
- Gemma usage via TRL
- Gemma usage in Hugging Face via OpenAI SDK
- Does Gemma overfit the Open LLM Leaderboard?
- Zephyr 7B Gemma
- Gemma 2
- Gemma2 Detailed Notes
- Gemma 2-2b
- 1-bit LLMs (AlphaSignal Post)
- 1-bit Quantization
- Some Notes about 1-bit LLMs (Their benefits and drawbacks)
- AutoBitnet (Train your 1.58-bit LLM based on LLama Architecture for free on Colab T4 GPU)
- Llama2 7b in 1-bit precision
- Microsoft 1-Bit LLM
- Claude LLM
- Some Notes about the 100K Claude LLM Model
- Anthropic's Claude-2
- Claude-2, Anthropic's ChatGPT competitor
- Some Information about Claude 3
- LongNet: Scaling Transformers to 1B Tokens
- Lost in the Middle: How Language Models Use Long Contexts
- Notes about How Language Models Use Long Contexts
- Scaling LLaMA and GPTNeoX to >8k input context
- Unofficial Claude-API
- Claude Unofficial API
- YaRN & LongLLaMA
- YaRN: Efficient Context Window Extension of LLMs
- LLMs get lost when the context becomes too long: Lost in the Middle: How Language Models Use Long Contexts [Very Important]
- LongLoRA: Efficient Fine-tuning of Long-Context LLMs
- LongLoRA: Efficient Fine-tuning of Long-Context LLMs (another post)
- Efficient Streaming LLMs with Attention Sinks for infinite-length inputs
- MemGPT: Teaching LLMs memory management for unbounded context
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs [Interesting]
- Llmlingua Prompt Compress [Interesting]
- Microsoft Phi-2 Model (with 2.7B Parameters)
- Can "small" finetuned LLMs with less than 2B parameters outperform larger openly available LLMs (Mixtral, Llama 2 Chat) and proprietary LLMs (ChatGPT)?
- Smol LM
- Hymba Small LM
- ColossalAI: Library for LLMs
- LangChain: Library for Building applications with LLMs
- LangChain Chat
- LangChain Crash Course
- LangChain 101
- LangChain Resources
- LangChain & Vector Databases in Production Course
- Building LLM Powered Apps via LangChain Course
- OpenFlamingo
- Deepset Haystack Framework
- LMQL: A query language for programming LLMs
- LLM Training Frameworks List
- NeMo Guardrails
- Lamini: The LLM engine for rapidly customizing models
- Scikit-LLM: Sklearn Meets Large Language Models
- Chainlit
- ChatUI
- Streamlit-Chat
- Gradio: Creating a Streaming chatbot fast
- Streamlit-Weaviate Connection: provides a custom streamlit connection to query data from weaviate
- LangKit: an open-source text metrics toolkit for monitoring language models
- HuggingFace Transformers Agents
- privateGPT: Ask questions to your documents using the power of LLMs
- Spacy LLM
- Lit-GPT
- Zero to LitGPT Tutorial: Getting Started with Pretraining, Finetuning, and Using LLMs [Great]
- GPTCache: A Library for Creating Semantic Cache for LLM Queries
- AutoTrain-Advanced
- Monster API: API for using & fine-tuning LLMs
- AnythingLLM: A full-stack personalized AI assistant
- EasyLLM: helpful tools and methods for working with LLMs
- gpt-llm-trainer: input a description of your task, and it fine-tunes a LLaMA 2 model for you
- Embedchain: a framework to easily create LLM powered bots
- PandasAI [Not strictly related to this section, but interesting]
- GPT Engineer: Specify what you want it to build, the AI asks for clarification, and then builds it
- Ludwig: a low-code framework for building custom AI models like LLMs
- open-interpreter
- kani: a lightweight and highly hackable framework for chat-based language models with tool usage/function calling
- Kani colab samples
- Kani Linkedin Post
- Argilla: the open-source data curation platform for LLMs
- LiteLLM: Call all LLM APIs using the OpenAI format
- LLM Finetuning with PEFT
- ChatGPT-AutoExpert: Supercharged Custom Instructions for ChatGPT
- PyTorch Thunder (PyTorch compiler for speeding up training of LLMs) - Linkedin Post
- PyTorch Lightning Thunder
- unsloth library: 2-5X faster, 70% less memory QLoRA & LoRA fine-tuning [Great for fine-tuning LLMs]
- TorchTune: A Native-PyTorch Library for LLM Fine-tuning
- LLM Finetuning with PEFT Colab Notebooks
- Self Instruct TRL for LLMs
- Self Instruct TRL for LLMs - Link2
- How to Fine-Tune LLMs in 2024 with Hugging Face
- Fine-tune LLMs on your own hardware via PyTorch team [Great]
- RLHF in 2024 with DPO & Hugging Face
- A little guide to building Large Language Models in 2024 (PPT by HuggingFace Team) [Great]
- Video Link1 of A little guide to building Large Language Models in 2024 (PPT by HuggingFace Team)
- Video Link2 of A little guide to building Large Language Models in 2024 (PPT by HuggingFace Team)
- Understanding the instruction fine-tuning process in LLMs
- Top 5 Tips and Tricks for LLM Fine-Tuning and Inference from Intel Experts
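
A minimal PEFT LoRA setup in the spirit of the fine-tuning guides above (GPT-2 and the hyperparameters are purely illustrative; target_modules depends on the model architecture):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Wrap a small base model with LoRA adapters; only the low-rank matrices train.
model = AutoModelForCausalLM.from_pretrained("gpt2")
lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused QKV projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total params
```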
- Design2Code: How Far Are We From Automating Front-End Engineering?
- Llama Coder: Can generate full React apps
- LLM Bootcamp - Spring 2023
- LLM University
- List of LLM Courses
- Anti-hype LLM reading list
- Microsoft Generative AI Course
- Google and Kaggle five-day generative AI course [Good]
- Best Resources for learning to work with LLMs
- Start with Large Language Models (LLMs) - Become an expert for free! [Interesting]
- Intro to LLMs: Andrej Karpathy 1 Hour Lecture
- LLM Course [good]
- LLM Course in ChatGPT Plus
- Build a Large Language Model (From Scratch) great Course and Book Tutorial [Great]
- Learning Resources about LLMs
- The Transformer Layer by Layer Course
- The Transformer Layer by Layer Course: Linkedin
- Hands-on LLMs Course
- Direct Preference Optimization (DPO) Method for LLMs Tutorial
- CS25: Transformers United V3 Courses - Autumn 2023
- CS336: Language Modeling from Scratch
- Visual and Animated Lecture about LLMs and Transformers and Deep Learning
- LLMs Roadmap [Great]
- LLM Summer School
- LLM Engineer's Handbook
- LLM Twin Course: Building Your Production-Ready AI Replica
- Hands-On Large Language Models Book
- Open LLM Leaderboard
- Chatbot Arena Leaderboard
- AlpacaEval Leaderboard
- CanAiCode Leaderboard
- Small LLMs Performance Ranking
- Chatbot Arena: Benchmarking LLMs in the Wild [Great]
- Chatbot Arena Leaderboard
- AI2 WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild [Great]
- AI2 WildBench Linkedin Post
- Persian LLM Leaderboard (via Part AI)
Building NLP Applications Powered by LLMs (methods for augmenting LLMs with external knowledge, i.e., Retrieval-Augmented Generation (RAG) applications):
- Ask a Book Questions with LangChain OpenAI [Great]
- OpenAI Web QA Embeddings
- Deepset Haystack Framework
- Stanford Retrieval-based NLP
- Hypothetical Document Embeddings (HyDE)
- ChatDB: Augmenting LLMs with Databases
- ChatNode
- Emerging Architectures for LLM Applications
- Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
- Fine tuning vs. RAG for LLMs
- Building RAG-based LLM Applications for Production (Part 1) [Good]
- Verba: The Golden RAGtriever, user-friendly interface for Retrieval-Augmented Generation (RAG) applications
- DocsGPT: GPT-powered chat for documentation, chat with your documents
- RAFT: Retrieval Augmented Fine Tuning - Post1
- RAFT: Retrieval Augmented Fine Tuning - Post2
- RAFT: Retrieval Augmented Fine Tuning - Microsoft Blog
- RAFT: Retrieval Augmented Fine Tuning - Berkeley Blog
- RAFT Code
- Long context LLMs vs RAG [Interesting]
- RAGFlow: an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding
- Two Step RAG: Speculative RAG: Enhancing retrieval augmented generation through drafting
- Exploring Multimodal RAG with LlamaIndex and GPT-4 or the New Anthropic Sonnet Model
- PaperQA2: High accuracy RAG for answering questions from scientific documents with citations
- Sophisticated Controllable Agent for Complex RAG Tasks
- Anthropic's Claude: Introducing Contextual Retrieval RAG
- Docling: Get your docs ready for gen AI
- Recent RAG Research from Google
- ArangoDB: The Most Complete And Scalable Platform For Graph-Powered GenAI
- Microsoft GraphRAG
- llamaindex Graph RAG
- Gephi: The Open Graph Viz Platform
- JanusGraph: a scalable graph database optimized for storing and querying graphs
- cayley: Open-Source Graph Database
- Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering (Paper)
- The GraphRAG Manifesto: Adding Knowledge to GenAI
- Neo4j for GenAI
- weaviate
- weaviate GitHub
- chroma
- Qdrant: Vector Database for AI Applications
- pinecone
- rektor-db
- pgvector
- LlamaIndex: comprehensive toolkit to perform data augmentation for LLMs
- jina-ai VectorDB
- sqlite-vec: A vector search SQLite extension
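
A minimal retrieval sketch with Chroma, one of the vector databases above (the documents and query are toy examples; Chroma embeds them with its default embedding function):

```python
import chromadb

client = chromadb.Client()  # in-memory instance; use PersistentClient for disk
collection = client.create_collection(name="docs")

# Index a few toy documents; Chroma embeds them automatically.
collection.add(
    documents=[
        "LoRA fine-tunes LLMs with low-rank adapter matrices.",
        "RAG retrieves relevant documents and feeds them to the LLM.",
        "Quantization shrinks models by lowering weight precision.",
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Retrieve the most relevant chunks for a question, ready to paste into a prompt.
results = collection.query(
    query_texts=["How does retrieval augmentation work?"], n_results=2
)
print(results["documents"])
```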
Great Embedding Models for Search (for Augmenting External Knowledge into a Chatbot's Vector DB) [Retrieval-Augmented Generation (RAG)]:
- Massive Text Embedding Benchmark (MTEB) Leaderboard
- Word and sentence embeddings is how LLMs understand text
- FlagEmbedding
- E5 embedding vs OpenAI Ada
- M2-BERT-80M-32k-Retrieval
- Embedding Quantization - Post1
- Embedding Quantization - Post2
- Embedding Quantization - HuggingFace Blog Post
- Quantization Fundamentals with Hugging Face Course
- Is Cosine-Similarity of Embeddings Really About Similarity?
- LLM2Vec [Great]
- Fine tuning embedding models for RAG (Linkedin post)
- Fine tuning embedding models for RAG (Original Post)
- all-MiniLM-L6-v2: Sentence-Transformers Model for Embedding
- Learn How to Fine-tune Embedding Models Course [Great]
- LLMs Embedding Course - Link1
- LLMs Embedding Course - Link2
- txtai: All-in-one embeddings database
- NVIDIA NV-Embed-v2 embeddings
- jina-embeddings-v3: Multilingual Embeddings With Task LoRA
- ModernBert: Linkedin Post1
- ModernBert: Linkedin Post2
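
A minimal sketch of embedding-based semantic search with the all-MiniLM-L6-v2 model listed above (the sentences are toy examples):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "The cat sits on the mat.",
    "Transformers use self-attention.",
    "Stock markets fell sharply today.",
]
query = "How do attention mechanisms work?"

# Normalized embeddings make cosine similarity a simple dot product.
corpus_emb = model.encode(corpus, normalize_embeddings=True)
query_emb = model.encode(query, normalize_embeddings=True)

scores = util.cos_sim(query_emb, corpus_emb)[0]
best = scores.argmax().item()
print(corpus[best], float(scores[best]))
```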
- Deep Dive Into LLM Hallucinations Across Generative Tasks
- Controlled Generation Tools
- Guidance: Controlling LLMs
- NeMo Guardrails
- Minimising Hallucinations in LLM Applications: NeMo Guardrails Video Tutorial
- Mitigate Hallucination in LLMs
- LLMs Hallucinations Benchmark
- Mitigating LLM Hallucinations: a multifaceted approach [Great]
- Cramming: Training a Language Model on a Single GPU in One Day [Great]
- Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU [Great]
- PEFT: State-of-the-art Parameter-Efficient Fine-Tuning [Great]
- PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware [Great]
- Introduction to 8-bit Matrix Multiplication for transformers at scale using Hugging Face Transformers, Accelerate and bitsandbytes
- bitsandbytes: 8-bit CUDA functions for PyTorch
- Alpaca-LoRA: Low-Rank LLaMA Instruct-Tuning on consumer hardware [Great]
- LLaMA & Alpaca Tutorial: “ChatGPT” On Your Local Computer
- Dalai: The simplest way to run LLaMA on your local machine
- pyllama
- Alpaca-LoRA-Serve
- llama.cpp: Port of Facebook's LLaMA model in C/C++
- alpaca.cpp
- SparseGPT: Remove 100 Billion Parameters of LLMs
- xFormers: Toolbox to Accelerate Research on Transformers
- LLaMA-Adapter: Efficient Fine-tuning of LLaMA (Fine-tuning LLaMA to follow instructions within 1 Hour and 1.2M Parameters)
- GPT4All [Great]
- Vicuna web page [Great]
- Vicuna GitHub: FastChat
- PetGPT
- GPT-4-LLM
- baize Chatbot
- Koala
- Gorilla: An API store for LLMs
- Lit-LLaMA
- Auto-GPT
- xTuring
- GPTCache
- Dolly-v2-12B
- Web LLM
- P-tuning v2
- QLoRA: Efficient Finetuning of Quantized LLMs
- AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
- GPTQ Quantization Method in Transformers
- Optimize open LLMs using GPTQ and Hugging Face Optimum
- GPTQ vs. bitsandbytes (BNB)
- BNB Blog: Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA
- GPTQ Blog: Making LLMs lighter with AutoGPTQ and transformers
- TensorRT-LLM
- Overview of 🤗 Transformers Quantization: GPTQ vs bitsandbytes
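
A minimal 4-bit (NF4) loading sketch with bitsandbytes via Transformers, in the spirit of the quantization posts above (the model id is just an example; any causal LM repo works):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit quantization with bf16 compute, as popularized by QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model_id = "mistralai/Mistral-7B-v0.1"  # example checkpoint
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# A 7B model now fits in roughly 4-5 GB of GPU memory instead of ~14 GB in fp16.
```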
- LoRA Exchange (LoRAX): Serve 100s of Fine-Tuned LLMs for the Cost of 1
- Introducing LoRAX
- DeepSparse: Sparsity-aware deep learning inference runtime for CPUs
- Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation) [Great]
- Dare method for improving LLMs performance
- Small model that surpasses GPT-4 [Interesting]
- Efficient LLMs Survey [Great]
- LoRAX (LoRA eXchange): Framework that allows users to serve thousands of fine-tuned models on a single GPU
- PowerInfer: High-speed LLMs Serving on PCs with Consumer-grade GPUs
- LoRA From Scratch Implementation
- Improving LoRA (DoRA): Implementing Weight-Decomposed Low-Rank Adaptation (DoRA)
- DoRA Link2
- Proxy-Tuning (new method for fine-tuning LLMs)
- AutoQuantize (GGUF, AWQ, EXL2, GPTQ) Colab Notebook [Great]
- DoRA: Weight-Decomposed Low-Rank Adaptation - Linkedin Post
- DoRA: Weight-Decomposed Low-Rank Adaptation - Paper
- GaLore: Memory Efficient Fine-tuning Technique
- Quanto: a pytorch quantization toolkit [Great]
- Quanto: Linkedin Post
- Deleting 40% of LLM Layers Without Drop in Accuracy
- The Unreasonable Ineffectiveness of the Deeper Layers
- Continual Pretraining of LLMs
- NOLA: run 10,000 customized LLaMA2 (70B) (4bit) models on a single 48GB GPU
- NOLA LLaMA3
- LoRA Learns Less and Forgets Less in comparison to full fine-tuning
- Best Practices for Fine-Tuning & Training LLMs
- TorchChat
- The Evolution of Extreme LLM Compression: From QuIP to AQLM with PV-Tuning
- Calculating GPU memory for serving LLMs
- How Much GPU Memory is Needed to Serve a Large Language Model (LLM)?
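
The two links above walk through the arithmetic; as a rough, hedged rule of thumb (real usage also depends on KV cache size, context length, batch size, and framework overhead):

```python
def serving_memory_gb(params_billions, bytes_per_param=2.0, overhead=1.2):
    """Crude estimate: weights = params x bytes/param, plus ~20% overhead
    for KV cache, activations, and framework buffers."""
    return params_billions * bytes_per_param * overhead

print(serving_memory_gb(7))                       # fp16 7B: ~16.8 GB
print(serving_memory_gb(7, bytes_per_param=0.5))  # 4-bit 7B: ~4.2 GB
print(serving_memory_gb(70))                      # fp16 70B: ~168 GB
```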
- CUDA-Free Inference for LLMs (PyTorch Blog)
- Building LLM applications for production
- Bard API
- Amazon Bedrock: build and scale generative AI applications [Great]
- Text-to-SQL GitHub Repos
- vanna
- sqlchat
- dataherald
- WrenAI
- Practical text-to-SQL for data analytics by Linkedin [Great]
- Persian summary of the above Practical text-to-SQL for data analytics article - Out of Distribution Telegram Channel
- Different Kinds of Prompt Engineering
- Prompt Engineering Guide
- PromptTools: tools for prompt testing and experimentation
- Prompt engineering for Claude's long context window
- Chain of Verification Prompt engineering method
- Analogical Prompting
- Prompt Flow: Build high-quality LLM apps
- Contrastive Chain-of-Thought Prompting (CCoT)
- New Prompting Techniques
- Openai Prompt Engineering Guide - Linkedin Post
- Openai Prompt Engineering Guide
- Anthropic Claude Metaprompt Tool
- Anthropic Prompt Improver
- Anthropic Prompt Improver Linkedin Post
- Anthropic Evaluate Prompts Tool
- Cohere Prompt Tuner: Prompt Optimization at Your Fingertips
- Quality Prompts: Use and evaluate prompting techniques quickly
- Prompt Design at Character.AI
- Structured Prompting
- Writing with AI: Five ways professional writers are leveraging ChatGPT
- Google Prompt Gallery
- ell: The Language Model Programming Library
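
A minimal chain-of-thought prompting sketch with the OpenAI Python SDK, tying together several techniques from the guides above (the model name and prompts are illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Chain-of-thought prompting: ask the model to reason step by step before answering.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model
    messages=[
        {"role": "system", "content": "You are a careful math tutor."},
        {"role": "user", "content": (
            "A train travels 120 km in 1.5 hours. "
            "Think step by step, then state its average speed."
        )},
    ],
)
print(response.choices[0].message.content)
```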
- Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science
- LLMs for Tabular Data - Linkedin post
- MetaGPT: Multi-Agent Framework
- DevOpsGPT: AI-Driven Software Development Automation Solution
- LLM Agent Survey
- Microsoft AutoGen: development of LLM applications using multiple agents
- OpenDevin: autonomous AI software engineer
- Composio: the best toolset to integrate AI Agents
- MindSearch: An LLM-based Multi-agent Framework of Web Search Engine
- OpenAI Swarm Library for Multi-Agent
- Don't Sleep on Single-agent Systems
- Linkedin post for Don't Sleep on Single-agent Systems
- Microsoft TinyTroupe library for simulate human agents with LLMs [Interesting]
- HuggingFace smolagents Library blog post [Useful]
- Cost to Deploy LLaMA2 vs. ChatGPT [Very Important]
- Anyscale Training Cost
- LLMs APIs Pricing Benchmark: pricing of AWS Bedrock, OpenAI, Microsoft Azure
- LLM Token-based Price Sheet
- LLM Pricing Table Sheet
- LLM Pricing Table Linkedin Post
- Pricing Sheet for Hosted LLMs
- LLM Pricing Comparison Tool in HuggingFace Space
- e2eml transformers from scratch [Excellent]
- annotated-transformer: Learning transformers from code
- Transformers Recipe
- ALBERT-Persian
- ALBERT-Persian Demo Page
- ALBERT-Farsi-base-v2 in HuggingFace
- ParsBERT - Model for Persian Language Understanding
- ARMAN [Great]
- ParsBigBird: Persian Bert For Long-Range Sequences [Great]
- PersianQA
- Persian (Farsi) Pre-trained Language Models [Great]
- Hezar: The all-in-one AI library for Persian, supporting a wide variety of tasks and modalities [Great & Important]
- XLM-RoBERTa (Multilingual & supports Persian)
- TookaBERT by PartAI [Great]
- Dorna PartAI LLM
- Transfer Learning for NLP via BERT for Text Classification
- Text Classification with BERT Tokenizer
- Bert Text Classification
- Persian Semantic Search
- Toward fine-tuning a state of the art Natural Language Inference (NLI) model for Persian
- Attention Mechanism
- Visualizing A Neural Machine Translation Model - Attention Mechanism
- Intuitive Understanding of Attention Mechanism in Deep Learning
- Structured Attention Networks
- WaveNet: Increasing receptive field using dilated convolution
- Understanding WaveNet architecture
- WaveNet: A Generative Model for Raw Audio
- How WaveNet Works
- PyTorch Tutorial to Sequence Labeling
- Bert Extractive Summarizer [Great]
- Generating Text Summaries Using GPT-2 on PyTorch with Minimal Training [Good]
- A Gentle Introduction to Text Summarization in Machine Learning
- Taming Recurrent Neural Networks for Better Summarization
- PyTorch implementation of "Get to the point"
- TensorFlow implementation of "Get to the point"
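
For a quick start alongside the summarization tutorials above, a minimal abstractive-summarization sketch with the Transformers pipeline (the checkpoint is one common example):

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

article = (
    "Large language models have grown rapidly in size and capability. "
    "Training them requires massive datasets and compute, so researchers "
    "increasingly study methods like distillation, quantization, and "
    "parameter-efficient fine-tuning to make them cheaper to deploy."
)
summary = summarizer(article, max_length=40, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```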
- A Comprehensive Guide to Build your own Language Model in Python
- D2L: Language Models and Dataset
- Develop a word-level Neural Language Model in Keras
- IBM deep learning language model
- BERT language model
- Facebook AI: GSLM
- Language Modeling Great Tutorial
- GALACTICA: general-purpose scientific language model [Great]
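
A classic way to make the language-modeling material above concrete is computing perplexity with GPT-2 (a minimal sketch; long texts should be chunked with a sliding window):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

text = "Language models assign probabilities to sequences of words."
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels == input_ids, the model returns the mean cross-entropy loss;
    # perplexity is exp(loss).
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"perplexity: {torch.exp(loss).item():.2f}")
```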
- Distributed Training of Language Models with Reinforcement Learning via Human Feedback (RLHF) [Excellent]
- Over-Sampling using SMOTE [SMOTE for high-dimensional class-imbalanced data]
- Over-sampling via imbalanced-learn library
- Imbalanced Data Handling
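
A minimal SMOTE sketch with imbalanced-learn matching the posts above (the synthetic dataset is illustrative):

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# A 95/5 imbalanced toy dataset.
X, y = make_classification(n_samples=1000, weights=[0.95], random_state=0)
print("before:", Counter(y))

# SMOTE synthesizes new minority-class samples by interpolating
# between existing minority samples and their nearest neighbors.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("after: ", Counter(y_res))
```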
- Rasa Chatbot [Great]
- Learn how to Build and Deploy a Chatbot in Minutes using Rasa
- chatbot with DialoGPT
- DialoGPT: huggingface Transformer
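
A minimal single-turn DialoGPT sketch via Hugging Face Transformers, following the two links above (adapted from the common model-card pattern):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")

# Encode the user's message, terminated with EOS, and generate a reply.
input_ids = tokenizer.encode(
    "Hello, how are you?" + tokenizer.eos_token, return_tensors="pt"
)
reply_ids = model.generate(
    input_ids, max_length=100, pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens (the reply).
print(tokenizer.decode(reply_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```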
- deeppavlov [Great]
- PyTorch Chatbot Tutorial
- Implement a Simple Chat Bot With PyTorch
- GPT2 Chatbot PyTorch
- PyTorch Official Chatbot Tutorial
- PaddlePaddle Knover: toolkit for knowledge grounded dialogue generation
- PaddlePaddle PLATO-2
- ParlAI [Great]
- huggingface: Transformers [Great]
- huggingface: Blenderbot [Great]
- huggingface: Blenderbot Small [Great]
- huggingface: GPT-2 Text Generation [Great]
- Seq2seq Chatbot
- seq2seq Chatbot implemented in Pytorch
- papers with code: chatbot
- Proudly Leading the Chatbot
- Real Python: Build a Chatbot with Python ChatterBot
- A step-by-step guide to building a chatbot based on your own documents with GPT
- GitHub Models
- Git Ingest: Quickly turn a GitHub repository into text for LLMs [Great]
- Create a Chatbot for any GitHub repo [Great]
- Chatbot Analytics: 9 Key Metrics
- Chatbot Statistics for 2023
- Chatbot Analytics 101: Essential Metrics to Track
- 12 Metrics For Chatbot Analytics
- ParlAI Evaluation Metrics for Chatbot
- Chatbot Evaluation Metrics [Great]
- Databricks' report on LLM evaluation methods
- AgentBench: Evaluating LLMs as Agents
- Prometheus: Using GPT-4 as an Evaluator for SLMs
- LLM Model Evaluation Metrics - When and How to Use Them
- OpenAI ChatGPT [Amazing]
- Description of How OpenAI ChatGPT Works: Illustrating Reinforcement Learning from Human Feedback (RLHF)
- How ChatGPT was Trained
- ChatGPT Android SDK
- ChatGPT awesome apps
- A Categorical Archive of ChatGPT Failures
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver?
- aman.ai chatGPT Tutorial [Great]
- ChatGPT for customer service
- Chatgpt Retrieval Plugin
- Trending AI Tools
- Merlin: OpenAI ChatGPT Plus extension on all websites
- Adrenaline
- Using LLMs as agents that orchestrate tools [Interesting]
- ChatGPT API Using Python
- Parthean: a startup building a financial expert via ChatGPT
- Notes on the cost of ChatGPT
- Ortus - your YouTube AI buddy
- How Is ChatGPT’s Behavior Changing over Time?
- LLM Drifts: How Is ChatGPT’s Behavior Changing over Time?
- ChatGPT app Builder
- GPT4 Turbo 128k analysis Notes (its price)
- Designer GPT: website creator
- OpenAI DevDay Breakout Sessions Videos
- GPT Seed Parameter Notes
- Awesome ChatGPT Prompts
- GPT-4o Full Data Analysis
- GPT-4o Architecture
- Introducing Structured Outputs in the OpenAI API
- OpenAI Realtime-api
- OpenAI Model Distillation in the API
- OpenAI Prompt Caching
- LibreChat: Enhanced ChatGPT Clone [Great]
- Learning to Reason with LLMs: OpenAI o1 Model
- How does OpenAI train the Strawberry (o1) model to spend more time thinking?
- Learning to Reason before you speak is how OpenAI o1 generates its response
- 5 Papers for Better Understanding OpenAI o1 Models
- Anthropic Claude Tool Use
- Anthropic Prompt Generator
- Switched to Claude 3.5
- Anthropic Message Batches API
- Anthropic Message Batches API - Linkedin Post
- OpenAI Prompt Caching in GPT 4o and o1: How Does It Compare To Claude Prompt Caching?
- Anthropic Blog: Transformer Circuits Thread
- Anthropic MCP (Model Context Protocol)
- 100 Times Faster Natural Language Processing in Python
- Multi-label Text Classification using BERT
- Learning Meaning in Natural Language Processing
- Train and Deploy the Mighty Transformer NLP models using FastBert and AWS SageMaker
- Distilling knowledge from Neural Networks to build smaller and faster models
- HarfBuzz - a text shaping library [Useful]
- PruneBERT - Hugging Face
- spacy-streamlit: spaCy building blocks for Streamlit apps
- HuggingFace Evaluate Library
- NeMo - toolkit for Conversational AI [Excellent]