Language models can be used to provide interactive, personalized student feedback in educational settings. However, real-world deployment faces three key challenges: privacy concerns, limited computational resources, and the need for pedagogically valid responses. These constraints require small, open-source models that can run locally and reliably ground their outputs in correct information. We introduce SCRIBE, a framework for multi-hop, tool-augmented reasoning designed to generate valid responses to student questions about feedback reports. SCRIBE combines domain-specific tools with a self-reflective inference pipeline that supports iterative reasoning, tool use, and error recovery. We distil these capabilities into 3B and 8B models via two-stage LoRA fine-tuning on synthetic GPT-4o-generated data. Evaluation with a human-aligned GPT-Judge and a user study with 108 students shows that SCRIBE models achieve comparable or superior quality to much larger models in key dimensions such as relevance and actionability, while being perceived on par with GPT-4o and Llama-3.3 70B by students. These findings demonstrate the viability of SCRIBE for low-resource, privacy-sensitive educational applications.
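At inference time, SCRIBE models interleave reasoning steps with calls to domain-specific tools and can revise course when a call fails. The sketch below illustrates one way such a self-reflective, tool-augmented loop can be structured; the function names, the `{"tool": ..., "args": ...}` calling convention, and the example tool are illustrative assumptions, not the actual SCRIBE API.

```python
import json
from typing import Callable

# Illustrative sketch only: in SCRIBE, `generate` would be the fine-tuned
# model and `tools` the domain-specific feedback-report tools. All names and
# the JSON tool-call convention below are assumptions, not the actual API.
Tools = dict[str, Callable[[dict], str]]


def tool_augmented_answer(generate: Callable[[str], str],
                          tools: Tools,
                          question: str,
                          max_hops: int = 4) -> str:
    """Iteratively query the model, execute any tool it requests, and feed
    the tool result (or error) back so the model can reflect and recover."""
    prompt = question
    for _ in range(max_hops):
        output = generate(prompt)

        # Assumed convention: the model emits a JSON tool call such as
        # {"tool": "lookup_feedback", "args": {...}} when it needs grounding,
        # and plain text once it is ready to answer.
        try:
            call = json.loads(output)
        except json.JSONDecodeError:
            call = None
        if not isinstance(call, dict) or "tool" not in call:
            return output  # plain text: treat as the final answer

        tool = tools.get(call["tool"])
        if tool is None:
            # Error recovery: tell the model the requested tool does not exist.
            prompt += f"\n[error] unknown tool: {call['tool']}"
            continue
        try:
            prompt += f"\n[tool result] {tool(call.get('args', {}))}"
        except Exception as exc:  # self-reflection on a failed tool call
            prompt += f"\n[tool error] {exc}"

    # Hop budget exhausted: ask for a best-effort answer from what was gathered.
    return generate(prompt + "\n[note] answer with the information gathered so far")
```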
To fine-tune a model with this framework, run a command like the following (replace the paths with your actual model, adapter, and checkpoint locations):
```bash
accelerate launch --config_file /path/to/your/accelerate_config.yaml /path/to/finetune_lama.py \
  --step multi_step \
  --adapter_checkpoint /path/to/your/adapter_checkpoint/initial_adapter \
  --model_name /path/to/your/model_directory \
  --epochs 3
```
Note: In your `accelerate_default_config.yaml`, make sure to set `num_processes` to the number of GPUs you want to use for training.
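For reference, a minimal multi-GPU configuration might look like the sketch below; the exact fields depend on how you answered the `accelerate config` prompts, and `num_processes: 4` assumes a machine with four GPUs.

```yaml
# accelerate_default_config.yaml (sketch only; adjust to your hardware)
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
mixed_precision: bf16
num_machines: 1
num_processes: 4   # number of GPUs to use for training
gpu_ids: all
```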
We have made the synthetic training data used for SCRIBE open source. You can access and download it here:
This dataset contains the multi-hop, tool-augmented reasoning examples used for model finetuning and evaluation.
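Once downloaded, the examples can be loaded with standard tooling; the sketch below uses the Hugging Face `datasets` library, and the file name is a placeholder for wherever you saved the download.

```python
from datasets import load_dataset

# "scribe_synthetic.jsonl" is a placeholder for the downloaded data file.
train_data = load_dataset("json", data_files="scribe_synthetic.jsonl", split="train")
print(train_data[0])  # inspect one multi-hop, tool-augmented example
```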
The following SCRIBE models are available as open source on Hugging Face: