I read a paper every day for 100 days and documented them here. A ⭐ means I need to come back to the paper once I know more. A ❌ means you can skip it; it was a meh read.
- ImageNet Classification with Deep Convolutional Neural Networks
- ⭐Attention Is All You Need
- Visualizing and Understanding Convolutional Networks
- A ConvNet for the 2020s
- ⭐The Matrix Calculus you need for Deep Learning
- DeepFace: Closing the Gap to Human-Level Performance in Face Verification
- ⭐Improving Language Understanding by Generative Pre-Training (GPT-1)
- ⭐Language Models are Unsupervised Multitask Learners (GPT-2)
- Language Models are Few-Shot Learners (GPT-3)
- ❌Character-Level Language Modeling with Deeper Self-Attention
- ⭐Recurrent Neural Networks (RNNs): A gentle Introduction and Overview
- ⭐RWKV: Reinventing RNNs for the Transformer Era
- ❌PyTorch
- DiffEdit: Diffusion-Based Semantic Image Editing with Mask Guidance
- ⭐RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
- ⭐RT-1: Robotics Transformer for Real-World Control at Scale
- Platypus: Quick, Cheap, and Powerful Refinement of LLMs
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Pointer networks
- Layer normalization
- Going Deeper with Convolutions
- Rethinking the Inception Architecture for Computer Vision
- Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
- Long Short-Term Memory
- Deep Residual Learning for Image Recognition (ResNet)
- U-Net: Convolutional Networks for Biomedical Image Segmentation
- Very Deep Convolutional Networks for Large-Scale Image Recognition
- A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting
- Bagging Predictors
- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
- Improved Techniques for Training GANs
- Conditional Generative Adversarial Nets
- Generative Adversarial Networks (the original paper)
- Diffusion Models Beat GANs on Image Synthesis
- Denoising Diffusion Probabilistic Models
- Understanding Diffusion Models: A Unified Perspective
- Deep Unsupervised Learning using Nonequilibrium Thermodynamics (the original paper)
- High-Resolution Image Synthesis with Latent Diffusion Models
- Hierarchical Text-Conditional Image Generation with CLIP Latents
- Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
- Adding Conditional Control to Text-to-Image Diffusion Models
- Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
- Greedy Function Approximation: A Gradient Boosting Machine
- XGBoost: A Scalable Tree Boosting System
- Random Forests
- Mastering the game of Go with Deep Neural Networks & Tree Search
- Generally capable agents emerge from open-ended play
- Highly accurate protein structure prediction with AlphaFold
- Adam: A Method for Stochastic Optimization
- Autograd: Effortless Gradients in NumPy
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
- Torch: A modular machine learning software library
- Automatic differentiation in PyTorch
- TensorFlow: A system for large-scale machine learning
- Experiments on Learning by Back Propagation
- Support-Vector Networks (maybe; this one's a bit old)
- Latent Dirichlet Allocation (maybe; this one's a bit old)
- Statistical Modeling: The Two Cultures
- Textbooks are all you need
- A Fast Learning Algorithm for Deep Belief Nets
- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
- Subject-Diffusion
- Generating long sequences with sparse transformers
- Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding
- Dynamic Evaluation of Neural Sequence Models
- Grandmaster level in StarCraft II using multi-agent reinforcement learning
- Efficient Transformers: A Survey
- FlashAttention
- FlashAttention 2
- SpikeGPT
- Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
- Learning to Model the World with Language
- Learning Transferable Visual Models From Natural Language Supervision
- FLatten Transformer: Vision Transformer using Focused Linear Attention
- DeepSpeed Chat
- MusicGen
- PaLM-E: An Embodied Multimodal Language Model
- PaLI-X: On Scaling up a Multilingual Vision and Language Model
- Visual Instruction Tuning
- Image Transformer
- Reinforcement learning papers
- NLP papers
- Ethics papers
- ViTs (Vision Transformers)
- Statistics (PCA, Fisher vectors and Fisher kernels, Boltzmann machines)
- SIFT+FV networks
- GELUs and other activation functions (see the GELU sketch after this list)
- Facial recognition
- 3D model representation
- https://ai.googleblog.com/2018/03/using-evolutionary-automl-to-discover.html
- https://www.fast.ai/posts/2018-08-10-fastai-diu-imagenet.html
- Mixup, CutMix, RandAugment, Random Erasing, and regularization schemes like Stochastic Depth and Label Smoothing (a mixup sketch follows this list)
- Inverted bottleneck (sketched after this list)
- BatchNorm vs LayerNorm (sketched after this list)
- https://www.reddit.com/r/MachineLearning/comments/hj4cx/comment/c1vt6ny/
- Bishop's Pattern Recognition and Machine Learning
- http://alumni.media.mit.edu/~tpminka/statlearn/glossary/
- Statistical Pattern Recognition by Andrew Webb
- Programming Collective Intelligence
- https://www.d2l.ai/
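
A few of the backlog bullets above are concrete enough to sketch, so here are some quick notes-to-self. First, the GELU from the activation-functions bullet: the exact form is x·Φ(x), where Φ is the standard normal CDF, and the tanh approximation from the GELU paper is the one shipped in the original BERT and GPT-2 code. A minimal sketch, assuming PyTorch (the function names are mine):

```python
import math
import torch

def gelu_exact(x: torch.Tensor) -> torch.Tensor:
    # GELU(x) = x * Phi(x), with Phi the standard normal CDF (via erf).
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: torch.Tensor) -> torch.Tensor:
    # Tanh approximation from the GELU paper.
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)))

x = torch.linspace(-3, 3, 7)
print(torch.allclose(gelu_exact(x), gelu_tanh(x), atol=1e-3))  # True: close, not identical
```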
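Next, the mixup bullet: train on convex combinations of pairs of examples, and take the same convex combination of the two labels' losses. A minimal sketch, again assuming PyTorch (the `mixup` helper and the alpha value are illustrative, not from any particular repo):

```python
import torch
from torch import nn

def mixup(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.2):
    """Blend each example with a randomly chosen partner from the same batch."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mixed = lam * x + (1 - lam) * x[perm]
    return x_mixed, y, y[perm], lam

# Usage inside a training step: the loss mirrors the input blend.
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
x_mixed, y_a, y_b, lam = mixup(x, y)
logits = nn.Conv2d(3, 10, 32)(x_mixed).flatten(1)  # stand-in "model"
criterion = nn.CrossEntropyLoss()
loss = lam * criterion(logits, y_a) + (1 - lam) * criterion(logits, y_b)
```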
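The inverted-bottleneck bullet: a classic ResNet bottleneck goes wide → narrow → wide, while MobileNetV2 (and ConvNeXt after it) inverts that to narrow → wide → narrow, with a cheap depthwise conv doing the spatial mixing. A ConvNeXt-style sketch under those assumptions (the dimensions and expansion factor are illustrative):

```python
import torch
from torch import nn

class InvertedBottleneck(nn.Module):
    """Narrow -> wide -> narrow: the reverse of a classic ResNet bottleneck."""
    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        hidden = dim * expansion
        self.block = nn.Sequential(
            # Depthwise conv: spatial mixing, one filter per channel (cheap).
            nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim),
            # 1x1 conv expands channels -- the "inverted" wide middle.
            nn.Conv2d(dim, hidden, kernel_size=1),
            nn.GELU(),
            # 1x1 conv projects back down to the residual width.
            nn.Conv2d(hidden, dim, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.block(x)  # residual connection around the block

x = torch.randn(1, 64, 32, 32)
print(InvertedBottleneck(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```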
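Finally, BatchNorm vs LayerNorm: the difference is just which axis the statistics are computed over. BatchNorm normalizes each feature across the batch (coupling samples together, and needing running statistics at test time), while LayerNorm normalizes each sample across its own features (batch-size independent, which is part of why transformers use it). A minimal sketch, assuming PyTorch:

```python
import torch

# Toy activations: batch of 4 samples, 8 features.
x = torch.randn(4, 8)

# BatchNorm: normalize each feature across the batch (dim=0),
# so statistics are shared between samples.
bn_mean = x.mean(dim=0, keepdim=True)
bn_var = x.var(dim=0, unbiased=False, keepdim=True)
x_bn = (x - bn_mean) / torch.sqrt(bn_var + 1e-5)

# LayerNorm: normalize each sample across its features (dim=1),
# so statistics are per-sample and independent of batch size.
ln_mean = x.mean(dim=1, keepdim=True)
ln_var = x.var(dim=1, unbiased=False, keepdim=True)
x_ln = (x - ln_mean) / torch.sqrt(ln_var + 1e-5)

# The built-in modules match (their learnable scale/shift start at 1 and 0).
print(torch.allclose(x_bn, torch.nn.BatchNorm1d(8)(x), atol=1e-4))  # True
print(torch.allclose(x_ln, torch.nn.LayerNorm(8)(x), atol=1e-4))   # True
```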