- Compression for AGI 2023.02
- Language Modeling Is Compression 2023.09
- Training Compute-Optimal Large Language Models 2022.03
- Predicting Emergent Abilities with Infinite Resolution Evaluation 2023.10
- The Platonic Representation Hypothesis 2024.05
- Parables on the Power of Planning in AI: From Poker to Diplomacy 2024.05
- Attention Is All You Need 2017.06
- Improving Language Understanding by Generative Pre-Training 2018.06
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding 2018.10
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models 2024.01
- Scaling Instruction-Finetuned Language Models 2022.10
- Reflexion: Language Agents with Verbal Reinforcement Learning 2023.03
- RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment 2023.04
- WizardLM: Empowering Large Language Models to Follow Complex Instructions 2023.04
- LIMA: Less Is More for Alignment 2023.05
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model 2023.05
- Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models 2023.05
- Preference Ranking Optimization for Human Alignment 2023.06
- Orca: Progressive Learning from Complex Explanation Traces of GPT-4 2023.06
- Self-Alignment with Instruction Backtranslation 2023.08
- Taken out of context: On measuring situational awareness in LLMs 2023.09
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback 2023.09
- Self-Rewarding Language Models 2024.01
- A Survey of Monte Carlo Tree Search Methods 2012.03
- From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large Language Models 2024.04
- From Instructions to Constraints: Language Model Alignment with Automatic Constraint Verification 2024.03
- Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models 2024.04
- Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models 2024.06
- Inverse Constitutional AI: Compressing Preferences into Principles 2024.06
- Following Length Constraints in Instructions 2024.06
- The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions 2024.04
- Self-critiquing models for assisting human evaluators 2022.06
- Weak-to-strong generalization 2023.12
- Prover-Verifier Games improve legibility of LLM outputs 2024.07
- Larger language models do in-context learning differently 2023.03
- Many-Shot In-Context Learning 2024.04
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models 2022.01
- Let’s Verify Step by Step 2023.05
- Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks 2023.05
- Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations 2023.12
- Solving olympiad geometry without human demonstrations 2024.01
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models 2024.02
- AlphaMath Almost Zero: Process Supervision without Process 2024.05
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models 2023.05
- Large Language Models Can Learn Temporal Reasoning 2024.01
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking 2024.03
- GPT-4 Architecture, Infrastructure, Training Dataset, Costs, Vision, MoE 2023.07
- Llama 2: Open Foundation and Fine-Tuned Chat Models 2023.07
- Gemini 1.0 2023.12
- Gemini 1.5 2024.02
- DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence 2024.06
- GPT-4o mini: advancing cost-efficient intelligence 2024.07
- AI achieves silver-medal standard solving International Mathematical Olympiad problems 2024.07
- The Llama 3 Herd of Models 2024.07
- Learning to Reason with LLMs 2024.09
- Introducing ChatGPT Pro 2024.12
- State of GPT 2023.05
- Some intuitions about large language models 2023.11
- MiniCPM: Unveiling the Unlimited Potential of On-Device Large Language Models 2024.04
- Llama 3 Opens the Second Chapter of the Game of Scale 2024.04
- Successful language model evals 2024.05
- Three hypotheses on LLM reasoning 2024.12
- Claude’s Character 2024.06
- Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process 2024.07
- Physics of Language Models: Part 2.1, YouTube 2024.09
- Physics of Language Models: Part 3.1, Knowledge Storage and Extraction 2023.09
- Physics of Language Models: Part 3.2, Knowledge Manipulation 2023.09
- Physics of Language Models: Part 3.1 + 3.2, YouTube 2023.11
- Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws 2024.04
- Challenging BIG-Bench tasks and whether chain-of-thought can solve them 2022.10
- COLLIE: Systematic Construction of Constrained Text Generation Tasks 2023.07
- FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models 2023.10
- Instruction-Following Evaluation for Large Language Models 2023.11
- Beyond Instruction Following: Evaluating Rule Following of Large Language Models 2024.07
- Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning 2024.06
- Introducing SimpleQA 2024.10
- OpenAI Model Spec (2024-05-08) 2024.05