Conference schedule, top papers, and analysis of the data for NeurIPS 2023!

jacobmarks/awesome-neurips-2023

Awesome NeurIPS 2023 Info

Caption: Word cloud of all NeurIPS 2023 titles

Welcome to the hub for all things NeurIPS 2023! We scraped the data for all 3500+ NeurIPS projects and dove into the depths of Hugging Face, GitHub, LinkedIn, and arXiv to pick out the most interesting content.

In this repo, you will find:

  • Data Analysis: detailed analysis of the titles and abstracts from NeurIPS 2023 accepted papers
  • Awesome Projects: curated collection of 40 NeurIPS 2023 papers you won't want to miss
  • Conference Schedule: comprehensive listing of all NeurIPS 2023 projects (title, authors, abstract) organized by poster session and sorted alphabetically

Data Analysis

The raw data is included in this repo. If you have ideas for other interesting analyses, feel free to create an issue or submit a PR!

For now, insights are organized into the following categories:

  • Authors
  • Titles
  • Abstracts

🔍 For the data analysis itself, check out the Jupyter Notebook!

🔍 And check out the blog post synthesizing the results here.

Authors

Neurips num authors

Most prolific authors

The top 10 authors with the most papers at NeurIPS 2023 are:

  • Bo Li: 15 papers
  • Ludwig Schmidt: 14 papers
  • Bo Han: 13 papers
  • Mihaela van der Schaar: 13 papers
  • Hao Wang: 12 papers
  • Dacheng Tao: 11 papers
  • Bernhard Schölkopf: 11 papers
  • Masashi Sugiyama: 11 papers
  • Andreas Krause: 11 papers
  • Tongliang Liu: 11 papers
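A ranking like the one above can be produced with a simple frequency count. The following is a minimal sketch, not the notebook's actual code: it assumes each paper is available as a list of author name strings, and the function name `top_authors` and the sample names are made up for illustration. Counting by name string cannot tell apart different people who share a name, so counts for common names may be inflated.

```python
from collections import Counter


def top_authors(papers, n=10):
    """Return the n most frequent author names as (name, count) pairs.

    `papers` is assumed to be an iterable of author-name lists, one per paper.
    """
    counts = Counter()
    for authors in papers:
        # set() guards against a name listed twice on the same paper
        counts.update(set(authors))
    return counts.most_common(n)


# Made-up sample data, for illustration only
papers = [
    ["Ada Example", "Bo Sample"],
    ["Bo Sample", "Ada Example"],
    ["Bo Sample"],
    ["Cy Placeholder"],
]
print(top_authors(papers, n=2))
```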

Number of unique authors

There were 13,012 unique authors at NeurIPS 2023, up from 9,913 at NeurIPS 2022.

This continues the rapid growth in the number of unique authors over the past decade.

Neurips unique authors history

Number of authors per paper

Titles

Title Length

Neurips 2023 title length histogram

  • The average title length was 8.72 words, up from 8.48 at NeurIPS 2022. This continues an ongoing trend of title lengthening:

Neurips title length history
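For reference, here is one way an average title length in words could be computed. This sketch assumes that a "word" is any whitespace-separated token, which may differ from the tokenization used in the notebook; `mean_title_length` is a hypothetical helper name.

```python
def mean_title_length(titles):
    """Average number of whitespace-separated words per title.

    Raises ZeroDivisionError if `titles` is empty.
    """
    lengths = [len(title.split()) for title in titles]
    return sum(lengths) / len(lengths)


# Three titles from the project list, used as a tiny sample
titles = [
    "Visual Instruction Tuning",
    "Segment Anything in High Quality",
    "LIMA: Less Is More for Alignment",
]
print(round(mean_title_length(titles), 2))
```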

Prevalence of Acronyms

22% of titles introduced an acronym, up from 18% at NeurIPS 2022.
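How a title is judged to "introduce an acronym" is not spelled out here, so the sketch below uses an assumed heuristic: a title counts if the text before its first colon is a single token containing at least two uppercase letters (for example, "QLoRA: ..."). Both the rule and the helper names are illustrative and may not match the notebook.

```python
import re

# A single whitespace-free token at the start of the title, followed by a colon
ACRONYM_PREFIX = re.compile(r"^\s*(\S+)\s*:")


def introduces_acronym(title):
    """Heuristic: one token before the first colon with >= 2 uppercase letters."""
    match = ACRONYM_PREFIX.match(title)
    if match is None:
        return False
    token = match.group(1)
    return sum(ch.isupper() for ch in token) >= 2


def acronym_share(titles):
    """Fraction of titles flagged by the heuristic."""
    return sum(introduces_acronym(t) for t in titles) / len(titles)


titles = [
    "QLoRA: Efficient Finetuning of Quantized LLMs",
    "LIMA: Less Is More for Alignment",
    "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models",
    "Visual Instruction Tuning",
]
print(acronym_share(titles))
```

A rule like this will miss acronyms introduced in parentheses and will flag camel-case names such as "DataComp", so it is best treated as a rough estimate.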

LaTeX in Titles

  • 1.3% of titles contained LaTeX, whereas none did at NeurIPS 2022.
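One plausible way to flag LaTeX in a title is to look for inline math delimiters or backslash commands. The pattern below is an assumption made for illustration, not the notebook's rule, and the example titles in the usage lines are invented.

```python
import re

# Either a $...$ pair or a backslash command such as \alpha
LATEX_PATTERN = re.compile(r"\$[^$]+\$|\\[A-Za-z]+")


def contains_latex(title):
    """Heuristic check for LaTeX markup in a title string."""
    return LATEX_PATTERN.search(title) is not None


# Invented titles, for illustration only
print(contains_latex(r"Learning with $\ell_1$ Regularization"))
print(contains_latex("A $5 Benchmark"))
```

A lone dollar sign is ignored because the pattern requires a closing delimiter.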

Abstracts

abstract length

Abstract Length

GitHub Reigns Supreme

  • Of the 3,581 abstracts, 675 explicitly mention GitHub, including a link to their code, models, or data.
  • Only 79 abstracts include a URL that is not GitHub.
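Counts like these can be approximated with a substring check and a URL pattern. The sketch below rests on two assumptions: "mentions GitHub" means the word appears anywhere in the abstract (case-insensitive), and "a URL that is not GitHub" means at least one http(s) URL whose text does not contain "github". Under this reading the two counts can overlap, and github.io project pages count as GitHub.

```python
import re

# Rough http(s) URL matcher; stops at whitespace and common closing brackets
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+", re.IGNORECASE)


def url_stats(abstracts):
    """Return (abstracts mentioning GitHub, abstracts with a non-GitHub URL)."""
    github_mentions = 0
    non_github_urls = 0
    for text in abstracts:
        if "github" in text.lower():
            github_mentions += 1
        urls = URL_PATTERN.findall(text)
        if any("github" not in url.lower() for url in urls):
            non_github_urls += 1
    return github_mentions, non_github_urls


# Invented abstracts, for illustration only
abstracts = [
    "Code is available at https://github.com/example/repo.",
    "See https://example.org/project for details.",
    "We propose a new method.",
]
print(url_stats(abstracts))
```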

Modalities, Models, and Tasks

Using a CLIP model, we zero-shot classified the modality of focus for each paper based on its abstract. The candidate labels were ["vision", "text", "audio", "tabular", "time series", "multimodal"].

By far the biggest category was multimodal, with a count of 1296. However, this count probably overstates the true share: when an abstract partially matches several modalities, "multimodal" may come out as the closest label. For comparison, the words multi-modal and multimodal show up in only 156 abstracts, and phrases like vision-language and text-to-image appear only a handful of times across the dataset.
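The core of zero-shot classification is to embed the abstract and each candidate label, then pick the label with the highest cosine similarity. The sketch below shows only that selection step. The `embed` argument is a hypothetical callable standing in for CLIP's text encoder, and `toy_embed` is a letter-frequency placeholder so the example runs without any model. It is not CLIP, and the checkpoint and prompts used in the notebook are not specified here.

```python
import math

LABELS = ["vision", "text", "audio", "tabular", "time series", "multimodal"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def zero_shot_label(text, embed, labels=LABELS):
    """Return the label whose embedding is closest to the text's embedding."""
    text_vec = embed(text)
    scores = {label: cosine(text_vec, embed(label)) for label in labels}
    return max(scores, key=scores.get)


def toy_embed(text):
    """Placeholder embedder (letter counts). Assumption: swap in a real encoder."""
    text = text.lower()
    return [text.count(ch) for ch in "abcdefghijklmnopqrstuvwxyz"]


print(zero_shot_label("audio", toy_embed))
```

Because the prediction is always the nearest label, an abstract that is moderately similar to several labels still receives exactly one of them, which is one way a broad label can end up over-represented.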

Themes occurring frequently include:

  • "benchmark": 730
  • ("generation", "generate"): 681
  • ("efficient", "efficiency"): 963
  • "agent": 280
  • ("llm", "large language model"): 238
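Theme counts of this kind can be reproduced by counting the abstracts that contain at least one keyword from a group. The sketch assumes case-insensitive substring matching, so "agent" also matches "agents". Whether the notebook matched substrings or whole words is not stated, and `count_abstracts_with` is a hypothetical helper.

```python
def count_abstracts_with(abstracts, keywords):
    """Number of abstracts containing at least one keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return sum(
        any(k in text.lower() for k in lowered)
        for text in abstracts
    )


# Invented abstracts, for illustration only
abstracts = [
    "We introduce a benchmark for agents.",
    "An efficient LLM agent.",
    "Large language models generate text.",
]
print(count_abstracts_with(abstracts, ["llm", "large language model"]))
```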

Cool NeurIPS Projects

| Title | Paper | Code | Project Page | Hugging Face | Blog |
| --- | --- | --- | --- | --- | --- |
| An Inverse Scaling Law for CLIP Training | arXiv | GitHub | | | |
| Augmenting Language Models with Long-Term Memory | arXiv | GitHub | | | |
| Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models | arXiv | GitHub | Project | | Blog |
| Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models | arXiv | GitHub | Project | | Blog |
| DataComp: In search of the next generation of multimodal datasets | arXiv | GitHub | Project | | Blog |
| Direct Preference Optimization: Your Language Model is Secretly a Reward Model | arXiv | GitHub | | | Blog |
| DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data | arXiv | GitHub | Project | | Blog |
| Fine-Tuning Language Models with Just Forward Passes | arXiv | GitHub | | | Blog |
| Generating Images with Multimodal Language Models | arXiv | GitHub | Project | | |
| Holistic Evaluation of Text-To-Image Models | arXiv | GitHub | Project | | Blog |
| HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face | arXiv | GitHub | | Hugging Face | |
| ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation | arXiv | GitHub | | Hugging Face | Blog |
| InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning | arXiv | GitHub | | Hugging Face | Blog |
| Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | arXiv | GitHub | | | |
| LAMM: Multi-Modal Large Language Models and Applications as AI Agents | arXiv | GitHub | Project | Hugging Face | |
| LIMA: Less Is More for Alignment | arXiv | | | | Blog |
| LLM-Pruner: On the Structural Pruning of Large Language Models | arXiv | GitHub | | | |
| LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenario | arXiv | GitHub | | | |
| MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion | arXiv | GitHub | Project | | |
| MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing | arXiv | GitHub | Project | Hugging Face | Blog |
| Mathematical Capabilities of ChatGPT | arXiv | GitHub | Project | | |
| Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation | arXiv | GitHub | Project | | |
| Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset | arXiv | GitHub | Project | | |
| MotionGPT: Human Motion as Foreign Language | arXiv | GitHub | Project | Hugging Face | Blog |
| OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents | arXiv | GitHub | | Hugging Face | Blog |
| Photoswap: Personalized Subject Swapping in Images | arXiv | GitHub | Project | | |
| Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation | arXiv | GitHub | | Hugging Face | Blog |
| QLoRA: Efficient Finetuning of Quantized LLMs | arXiv | GitHub | | Hugging Face | Blog |
| Reflexion: Language Agents with Verbal Reinforcement Learning | arXiv | GitHub | | | Blog |
| ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting | arXiv | GitHub | Project | | Blog |
| Segment Anything in 3D with NeRFs | arXiv | GitHub | Project | | Blog |
| Segment Anything in High Quality | arXiv | GitHub | | Hugging Face | Blog |
| Segment Everything Everywhere All at Once | arXiv | GitHub | | | |
| Self-Refine: Iterative Refinement with Self-Feedback | arXiv | GitHub | Project | | Blog |
| Simple and Controllable Music Generation | arXiv | GitHub | | | Blog |
| Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale From A New Perspective | arXiv | GitHub | | | |
| The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only | arXiv | | | Hugging Face | Blog |
| Toolformer: Language Models Can Teach Themselves to Use Tools | arXiv | | | | Blog |
| Unlimiformer: Long-Range Transformers with Unlimited Length Input | arXiv | GitHub | | | Blog |
| Visual Instruction Tuning | arXiv | GitHub | Project | Hugging Face | Blog |

Conference Schedule

Note: GitHub automatically truncates files larger than 512 KB. To have all papers display on GitHub, we've split the file up by session.

Poster Session 1

Poster Session 2

Poster Session 3

Poster Session 4

Poster Session 5

Poster Session 6

Posters Not Presented