This Jupyter Notebook demonstrates the Toolformer approach, where a language model (GPT-J in this case) is augmented with API call capabilities to enhance its ability to complete tasks requiring external information. By inserting API calls during text generation and leveraging the responses, the model can dynamically incorporate real-time data into its outputs.
- API Call Integration: Automatically inserts API calls into text where external information is required.
- Dynamic API Execution: Executes API calls and incorporates their results into the output text.
- Loss-Based Evaluation: Compares loss values for text with and without API call results to determine the optimal representation for fine-tuning.
- Model Fine-Tuning: Uses the selected dataset for fine-tuning GPT-J to improve API-aware text generation.
- Iterative Inference: Implements an iterative process where API results dynamically influence subsequent text generation.
- Install the required Python libraries:

  ```bash
  pip install torch transformers
  ```

- Ensure access to a compatible GPU for optimal performance.
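A quick way to confirm that a GPU is visible to PyTorch before loading the model:

```python
import torch

# GPT-J has ~6B parameters; generation on CPU works but is very slow.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
```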
- **Setup:**
  - Load the GPT-J model and tokenizer from the Hugging Face Transformers library.
  - Define a custom `Calendar` API tool for dynamic date-related queries (a sketch follows below).
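A minimal sketch of this setup. The Hub ID `EleutherAI/gpt-j-6B` is the standard checkpoint for GPT-J; the `calendar` helper name is illustrative, not necessarily the notebook's:

```python
from datetime import datetime
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load GPT-J and its tokenizer from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

def calendar() -> str:
    """Calendar API tool: returns the current date as a sentence."""
    return datetime.now().strftime("Today is %A, %B %d, %Y.")

print(calendar())  # e.g. "Today is Thursday, November 30, 2023."
```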
- **Prompt and Data Preparation:**
  - Define a prompt instructing the model to integrate API calls where necessary (an example follows below).
  - Provide example input data for generating API calls.
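For concreteness, a few-shot prompt in the style of the Toolformer paper's `Calendar` prompt; the notebook's exact wording may differ:

```python
# Few-shot prompt teaching the model where to insert [Calendar()] calls.
prompt_template = (
    'Your task is to add calls to a Calendar API to a piece of text. '
    'You can call the API by writing "[Calendar()]".\n'
    'Input: Today is the first Friday of the year.\n'
    'Output: Today is the first [Calendar()] Friday of the year.\n'
    'Input: The president of the United States is Joe Biden.\n'
    'Output: The president of the United States is [Calendar()] Joe Biden.\n'
    'Input: {text}\n'
    'Output:'
)
```

`prompt_template.format(text=...)` then yields the model input for one example.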
- **API Call Generation and Execution:**
  - The model generates text with `[Calendar()]` API calls inserted where external information is needed.
  - API calls are executed, and their results are incorporated back into the text (see the execution sketch below).
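A sketch of the execution step under the bracketed-call syntax, reusing the illustrative `calendar()` helper from the setup sketch:

```python
import re

def execute_api_calls(text: str) -> str:
    """Replace each bare [Calendar()] placeholder with the call plus its result."""
    def run(match: re.Match) -> str:
        return f"[Calendar() -> {calendar()}]"  # execute the tool
    return re.sub(r"\[Calendar\(\)\]", run, text)

annotated = "The store is never open on the [Calendar()] weekend, so today it is closed."
print(execute_api_calls(annotated))
# The store is never open on the [Calendar() -> Today is ...] weekend, so today it is closed.
```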
- **Loss Calculation:**
  - Compute the loss for three variations of the text:
    - With the API result: `[Calendar() -> Today is Thursday, November 30, 2023.]`
    - With the API placeholder only: `[Calendar()]`
    - Plain text (no API call)
  - Compare the losses to identify the most appropriate representation for fine-tuning (a computation sketch follows this list).
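A minimal sketch of how the three losses could be computed, scoring each full variant with the model's built-in language-modeling loss (labels set to the input IDs). Note that the Toolformer paper weights the loss over only the tokens following the call; scoring the whole sequence is a simplification:

```python
import torch

def lm_loss(text: str) -> float:
    """Mean cross-entropy of `text` under the model."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        output = model(**inputs, labels=inputs["input_ids"])
    return output.loss.item()

api_with_result_loss = lm_loss(
    "The store is never open on the "
    "[Calendar() -> Today is Thursday, November 30, 2023.] "
    "weekend, so today it is closed."
)
api_without_result_loss = lm_loss(
    "The store is never open on the [Calendar()] weekend, so today it is closed."
)
plain_loss = lm_loss(
    "The store is never open on the weekend, so today it is closed."
)
```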
- **Fine-Tuning:**
  - Create a fine-tuning dataset based on the loss comparison.
  - Fine-tune the GPT-J model for better API-aware text generation (a `Trainer`-based sketch follows below).
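One way to run this step, sketched with the Hugging Face `Trainer`; `finetune_texts` (the list of examples kept by the filtering step) and the hyperparameters are placeholder assumptions, not the notebook's actual values:

```python
import torch
from transformers import Trainer, TrainingArguments

class LMTextDataset(torch.utils.data.Dataset):
    """Wraps the filtered fine-tuning texts as causal-LM examples."""
    def __init__(self, texts):
        self.items = [
            tokenizer(t, return_tensors="pt")["input_ids"].squeeze(0)
            for t in texts
        ]
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        return {"input_ids": self.items[i], "labels": self.items[i]}

args = TrainingArguments(
    output_dir="gptj-toolformer",   # placeholder output path
    num_train_epochs=1,             # placeholder hyperparameters
    per_device_train_batch_size=1,
)
Trainer(model=model, args=args, train_dataset=LMTextDataset(finetune_texts)).train()
```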
- **Inference:**
  - Feed new inputs to the fine-tuned model.
  - When the model generates an API call (e.g., emits `->`), execute the call, integrate the result, and continue decoding iteratively (see the loop sketch below).
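A sketch of that iterative loop under the same assumptions: generate, pause when an unresolved `[Calendar() ->` appears, splice in the tool output, and resume decoding:

```python
def generate_with_tools(prompt: str, max_rounds: int = 3) -> str:
    """Alternate between model decoding and Calendar tool execution."""
    text = prompt
    for _ in range(max_rounds):
        inputs = tokenizer(text, return_tensors="pt")
        out = model.generate(
            **inputs, max_new_tokens=40, pad_token_id=tokenizer.eos_token_id
        )
        text = tokenizer.decode(out[0], skip_special_tokens=True)
        head, sep, tail = text.rpartition("[Calendar() ->")
        if sep and "]" not in tail:
            # Unfinished API call: execute the tool and resume decoding.
            text = head + f"[Calendar() -> {calendar()}]"
        else:
            break
    return text
```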
Example input and output:

- Input: `The store is never open on the weekend, so today it is closed.`
- Output: `The store is never open on the [Calendar() -> Today is Saturday, November 25, 2023.] weekend, so today it is closed.`
- **Loss-Based Filtering:** The example shows how incorporating API results lowers the loss:

  ```python
  # api_with_result_output.loss:    2.84
  # api_without_result_output.loss: 2.98
  # plain_output.loss:              3.83
  # filtering_threshold: tunable, e.g. 1.0
  if min(api_without_result_output.loss, plain_output.loss) - api_with_result_output.loss >= filtering_threshold:
      finetune_dataset = including_API_without_result + next_words
  else:
      finetune_dataset = plain_text + next_words
  ```
- **Dynamic API Calls:** Enables the model to seamlessly interact with tools during text generation.
- **Customizable APIs:** Easily extendable to include other APIs beyond `Calendar` (a hypothetical example follows below).
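As an illustration of such an extension, a hypothetical `Calculator` tool following the same bracketed-call convention (not part of the notebook):

```python
def calculator(expression: str) -> str:
    """Hypothetical Calculator API tool for simple arithmetic."""
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError(f"unsupported expression: {expression!r}")
    # eval is tolerable here only because the input is restricted above.
    return str(eval(expression))

# A generated call like "[Calculator(400 / 1400)]" would resolve to:
print(calculator("400 / 1400"))  # 0.2857142857142857
```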
- Understand the integration of external API tools into language models.
- Explore loss-based filtering to evaluate the impact of API calls.
- Fine-tune language models for dynamic, tool-enhanced text generation.
- Implement iterative inference pipelines for real-time API utilization.
- Original paper: *Toolformer: Language Models Can Teach Themselves to Use Tools* (Schick et al., 2023)
- Reference implementation: lucidrains/toolformer-pytorch
This notebook serves as an introduction to Toolformer principles. It can be extended with additional APIs and customized pipelines to address specific use cases, such as database queries, real-time weather updates, or knowledge retrieval systems.