LLMs on the Line: Data Determines Loss-To-Loss Scaling Laws

Official code to reproduce the results and data presented in the paper LLMs on the Line: Data Determines Loss-To-Loss Scaling Laws.

Repository Structure

main.py: Main evaluation script for Hugging Face models
config.py: Configuration handling and argument parsing
utils.py: Utility classes and functions
models.yaml: Model configurations
tasks.yaml: Task configurations with few-shot settings
config/: Configuration files for different models
data/: Contains a CSV file with all evaluation results
lingua/: Submodule containing the lingua-huggingface repository, with all the changes and config files needed to train and evaluate models from scratch

Setup

Load required CUDA modules:

module purge
module load cuda/11.7

Initialize and update submodules:

git submodule update --init --recursive

Create and activate virtual environment:

python3.10 -m venv ~/.llm_line
source ~/.llm_line/bin/activate

Install dependencies. The basic requirements are:

Python 3.10
CUDA 11.7
cuDNN 8.4.1

Then install the rest of the dependencies:

pip install --upgrade pip
pip install -r requirements.txt

Configuration Files

models.yaml

Specifies models to evaluate with their configurations:

models:
  - name: "model/name"
    cls: "hf" # lm_eval argument
    batch_size: 8
    device: "cuda"

tasks.yaml

Defines tasks and their few-shot settings:

task_configs:
  commonsense_qa:
    shots: [0, 5, 10]  # Will evaluate with 0, 5, and 10 shots
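
As an illustration of how a `shots` list fans out into individual evaluation runs, here is a minimal sketch. It assumes the YAML has already been parsed into a plain dictionary (for example with PyYAML's `yaml.safe_load`). The helper name `expand_task_runs` and the zero-shot fallback are hypothetical and not taken from the repository.

```python
from typing import Any, Dict, List, Tuple

# Parsed equivalent of the tasks.yaml example above (assumed structure).
parsed_tasks: Dict[str, Any] = {
    "task_configs": {
        "commonsense_qa": {"shots": [0, 5, 10]},
    }
}


def expand_task_runs(config: Dict[str, Any]) -> List[Tuple[str, int]]:
    """Return one (task_name, num_fewshot) pair per requested shot count."""
    runs: List[Tuple[str, int]] = []
    for task_name, task_cfg in config.get("task_configs", {}).items():
        # Assumption: a task without an explicit "shots" list runs zero-shot.
        for shots in task_cfg.get("shots", [0]):
            runs.append((task_name, int(shots)))
    return runs


runs = expand_task_runs(parsed_tasks)
print(runs)
# [('commonsense_qa', 0), ('commonsense_qa', 5), ('commonsense_qa', 10)]
```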

Usage

Basic usage:

python main.py

With custom paths:

python main.py \
    --models_yaml path/to/models.yaml \
    --tasks_yaml path/to/tasks.yaml \
    --results_dir custom_results_dir

Arguments

--models_yaml: Path to models configuration file (default: 'models.yaml')
--tasks_yaml: Path to tasks configuration file (default: 'tasks.yaml')
--results_dir: Directory to store evaluation results (default: 'results')
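
The three arguments and their defaults can be sketched with `argparse` as follows. This is only an approximation of what config.py might do; the function name `build_parser` and the help strings are assumptions, and the actual parser in the repository may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(description="Evaluate Hugging Face models")
    parser.add_argument("--models_yaml", default="models.yaml",
                        help="Path to models configuration file")
    parser.add_argument("--tasks_yaml", default="tasks.yaml",
                        help="Path to tasks configuration file")
    parser.add_argument("--results_dir", default="results",
                        help="Directory to store evaluation results")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is reproducible.
args = build_parser().parse_args(["--results_dir", "custom_results_dir"])
print(args.models_yaml, args.tasks_yaml, args.results_dir)
# models.yaml tasks.yaml custom_results_dir
```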

Troubleshooting

If you encounter CUDA-related errors:

Ensure you have the correct CUDA modules loaded (cuda/11.7, cudnn/8.4.1-cu11.6)
Try cleaning the virtual environment and reinstalling:

rm -rf ~/.llm_line
python3.10 -m venv ~/.llm_line
source ~/.llm_line/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

Adding New Model Arguments

When adding new arguments to model configurations, several files need to be updated to ensure proper handling:

Model YAML Files (models.yaml, etc.):

models:
  - name: "model/name"
    revision: "step-123"  # New argument
    cls: "hf"
    batch_size: 8
    device: "cuda"

Configuration Class (config.py):
- Add the new field to ModelConfig class
- Update from_dict method if special handling is needed
```
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelConfig:
    name: str
    revision: Optional[str] = None  # New field
```
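
For the `from_dict` step, one possible shape is sketched below. The field defaults, the rejection of unknown keys, and the body of `from_dict` are all assumptions made for illustration; the method in config.py may handle these cases differently.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class ModelConfig:
    name: str
    cls: str = "hf"           # assumed default, taken from the YAML example
    batch_size: int = 8       # assumed default
    device: str = "cuda"      # assumed default
    revision: Optional[str] = None  # New field

    @classmethod
    def from_dict(klass, raw: Dict[str, Any]) -> "ModelConfig":
        """Build a config from one parsed `models:` entry.

        The first parameter is named `klass` to avoid confusion with the
        `cls` field above.
        """
        known = {f.name for f in fields(klass)}
        unknown = set(raw) - known
        if unknown:
            # Hypothetical behavior: fail loudly on misspelled YAML keys.
            raise ValueError(f"Unknown model config keys: {sorted(unknown)}")
        return klass(**raw)


example = ModelConfig.from_dict({"name": "model/name", "revision": "step-123"})
print(example.revision)
# step-123
```

Rejecting unknown keys is a design choice in this sketch: it surfaces a misspelled YAML key immediately rather than silently dropping a new argument.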

Filename Generation (config.py):

Update format_model_info to include new args in filenames

def format_model_info(model_config):
    filename = f"name={model_config.name}"
    if model_config.revision:
        filename += f"__checkpoint={model_config.revision}"
    return filename

Metadata Storage (config.py):

Update get_model_metadata to include new args in results

def get_model_metadata(model_config):
    metadata = {
        "model_name": model_config.name,
        # Add new fields
        "checkpoint": model_config.revision if model_config.revision else None
    }
    return metadata

Model Loading (main.py):

Update model loading logic if the new argument affects how models are loaded

model_args = f"pretrained={model_config.name}"
if model_config.revision:
    model_args += f",revision={model_config.revision}"

This ensures that:

New arguments are properly parsed from YAML
Arguments are included in result filenames for tracking
Metadata in result files includes new information
Model loading uses new arguments correctly
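
A quick way to check that a new argument flows through both filename generation and model loading is sketched below. It restates the two snippets above as standalone functions; `build_model_args` is a hypothetical name for the main.py logic, and `SimpleNamespace` stands in for the real `ModelConfig`.

```python
from types import SimpleNamespace


def format_model_info(model_config) -> str:
    """Build the result filename stem, including the checkpoint if set."""
    filename = f"name={model_config.name}"
    if model_config.revision:
        filename += f"__checkpoint={model_config.revision}"
    return filename


def build_model_args(model_config) -> str:
    """Build the comma-separated model_args string (hypothetical helper)."""
    model_args = f"pretrained={model_config.name}"
    if model_config.revision:
        model_args += f",revision={model_config.revision}"
    return model_args


with_rev = SimpleNamespace(name="model/name", revision="step-123")
without_rev = SimpleNamespace(name="model/name", revision=None)

print(format_model_info(with_rev))     # name=model/name__checkpoint=step-123
print(build_model_args(with_rev))      # pretrained=model/name,revision=step-123
print(format_model_info(without_rev))  # name=model/name
print(build_model_args(without_rev))   # pretrained=model/name
```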
