This Python-based tool integrates with a range of AI APIs, including OpenAI's ChatGPT, Google's Gemini, Meta's Llama, Microsoft's Copilot, and Mistral's language models. Users can submit the same input to each model and retrieve its response, with the collected responses organized and stored in a CSV file for easy comparison and analysis. The tool computes similarity scores against the expected outputs, enabling users to evaluate the performance of different models efficiently. With a focus on usability, this automated solution streamlines performance assessment across multiple AI platforms.
The project aims to facilitate in-depth analysis and comparison of AI models, making it an essential resource for researchers, developers, and enthusiasts in the field of Artificial Intelligence.
In this section, you will learn how to set up the project. The setup process includes cloning the repository, generating API keys for the AI models, and configuring the `.env` file with the API keys.
- Clone the repository with the following commands:

```bash
git clone https://github.com/BrenoFariasDaSilva/Multi-AI-API-Response-Collector.git
cd Multi-AI-API-Response-Collector
```
To interact with the various AI models supported by this tool, you will need to generate API keys for each respective service. Below are the steps to obtain API keys for ChatGPT, Gemini, Llama, and Mistral.
To access the ChatGPT API, follow these steps:
- Visit the OpenAI Platform and sign up or log in to your account.
- The ChatGPT API allows you to integrate AI capabilities into your applications, enabling natural language processing, semantic search, and more.
- Generate an API key in the OpenAI dashboard by going to API Keys.
For more details, you can explore the OpenAI Developer Quickstart.
To obtain an API key for the Gemini API, follow these steps:
- Visit the Google AI Studio to get a Gemini API key.
- Sign in with your Google account or create one if you don't have one already.
- Navigate to the Google AI Studio API Key page, and with a few clicks, generate your key.
To access the Llama API, follow these steps:
- Visit the Llama API website and create an account by signing up.
- Once registered, note that Llama is currently in a private beta. You will be added to the waitlist after signing up.
- After receiving an invitation, log in and navigate to the API Token section to generate your token.
More details on obtaining your token can be found in the Llama API Documentation.
To use the Mistral API, follow these steps:
- Visit the Mistral AI Documentation to learn more about Mistral and its API.
- Sign in or create an account in the Mistral Console and generate your API key.
For more information, refer to the Mistral Getting Started Guide.
To ensure the tool can properly authenticate with each AI API, you will need to provide your API keys in a `.env` file. Follow these steps to configure it:
- Open the `.env_example` file in the root directory of the project.
- Replace the placeholder values with your actual API keys, which you should have obtained by following the instructions for each respective AI service.

Here's what the `.env_example` file looks like:
```bash
CHATGPT_API_KEY=
GEMINI_API_KEY=
LLAMA_API_KEY=
MISTRAL_API_KEY=
```
- After filling in the keys, rename the file from `.env_example` to `.env`. You can do this using the command line:

```bash
mv .env_example .env
```
Now the tool will automatically load the API keys from your `.env` file when making requests to the respective AI models. Make sure the `.env` file is not shared publicly to keep your API keys secure.
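For illustration, loading a `.env` file boils down to reading `KEY=VALUE` lines into the process environment. The sketch below uses only the standard library; the project itself may rely on a dedicated package such as `python-dotenv`, so treat `load_env` as an illustrative helper rather than the tool's actual code:

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: read KEY=VALUE lines into os.environ."""
    with open(path) as env_file:
        for line in env_file:
            line = line.strip()
            if not line or line.startswith("#"):  # skip blanks and comments
                continue
            key, _, value = line.partition("=")
            # setdefault keeps any value already set in the real environment
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()
    print("CHATGPT_API_KEY set:", bool(os.environ.get("CHATGPT_API_KEY")))
```

After loading, each model class can fetch its key with `os.environ.get("GEMINI_API_KEY")` and so on.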
To run the project, you must have `python3` and `pip` installed on your machine. If you don't have them installed, you can use the following commands:
To install `python3` and `pip` on Linux (Debian-based distributions), run:

```bash
sudo apt install python3 -y
sudo apt install python3-pip -y
```
To install `python3` and `pip` on macOS, run:

```bash
brew install python3
```
To install `python3` and `pip` on Windows, run the following command if you have Chocolatey (`choco`) installed:

```bash
choco install python3
```
Or just download the installer from the official website.
Great, you now have python3 and pip installed. Now, we need to install the project requirements/dependencies.
This project depends on the following libraries:
- `google-generativeai` -> Used to interact with the Google Gemini API.
- `mistralai` -> Used to interact with the Mistral API.
- `NumPy` -> Used for the linear regression predictions and for various operations on the metric lists.
- `openai` -> Used to interact with the OpenAI API.
- `Pandas` -> Used mainly to read and write the CSV files.
- `scikit-learn` -> Used to fit the linear regression and generate its predictions.
- Install the project dependencies with the following command:

```bash
make dependencies
```
This command will create a virtual environment and install all the dependencies needed to run the project inside it.
In this section, you will learn how to use the project: submitting the same input to each configured AI model, collecting the responses into a CSV file, and computing similarity scores against the expected outputs for easy comparison and analysis.
To set up the input for this project, you need to populate the `Inputs/input.csv` file. This file contains the tasks or text prompts that the AI models will process and, optionally, the expected output for each task. The comparison between the models' outputs and the expected results is based on these entries. Follow the steps below to properly configure the input file:
- Navigate to the `Inputs` directory in the project folder.
- Open the `input.csv` file.
- Ensure that the file has the following header:

```csv
Task,Expected Output (Optional)
```
Task:

- This column is mandatory and should contain the task or input text you want to provide to the AI models. Surround it with double quotes so that commas inside the text do not break the CSV file.
- Each line represents a separate task for evaluation.
- Examples of tasks might include questions, statements, or instructions for the models to interpret and respond to.

Expected Output (Optional):

- This column is optional and should contain the output you expect from the AI models for the corresponding task. Surround it with double quotes as well, for the same reason.
- If provided, each model's output will be compared to this expected result, and a similarity score will be computed.
- Leaving this column blank will disable the comparison for that specific task.
```csv
Task,Expected Output (Optional)
"Explain the 'sudo' command in Linux.","The 'sudo' command in Linux allows a permitted user to execute a command as the superuser or another user, as specified by the security policy."
"Explain the 'chmod' command in Linux.","The 'chmod' command in Linux changes the permissions of a file or directory."
```
- Make sure each task is clear and concise to ensure the AI models can generate appropriate responses.
- If you want to test model outputs without expecting a specific result, you can leave the "Expected Output (Optional)" column blank. The system will still process the task but won't perform any comparisons.
- You can add as many tasks as needed, with or without expected outputs, to the `input.csv` file. The system will automatically process each task in the file.
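To illustrate how such a file can be consumed, here is a minimal sketch using only Python's standard library. The project's own reading code may differ; `read_tasks` and its path argument are illustrative names:

```python
import csv

def read_tasks(path):
    """Read (task, expected_output) pairs from an input CSV.

    csv.DictReader handles the double-quoted fields, so commas inside
    a task or an expected output do not break the parsing.
    """
    tasks = []
    with open(path, newline="", encoding="utf-8") as csv_file:
        for row in csv.DictReader(csv_file):
            task = row["Task"]
            # An empty or missing expected output disables the comparison
            expected = row.get("Expected Output (Optional)") or ""
            tasks.append((task, expected))
    return tasks
```

Each `(task, expected)` pair can then be dispatched to every configured model.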
After setting up the input file and the API keys, you must open the `main.py` file and modify a few constants in order to customize the project to your needs. The constants that you can modify are:
```python
EXECUTE_MODELS = {"ChatGPT": "ChatGPTModel", "Copilot": "CopilotModel", "Gemini": "GeminiModel", "Llama": "LlamaModel", "Mistral": "MistralModel"}
```
The `EXECUTE_MODELS` constant is a dictionary that maps each model's name to the class that will be executed. You can remove models from this dictionary if you don't want to execute them, or if you simply don't have the API key for them. To add a new model, create a new class using the `template.py` file as a template and implement the new model's logic. Then add the model's name and class name to the `EXECUTE_MODELS` dictionary.
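As a rough sketch of what such a class might look like, the snippet below defines a hypothetical base class and a toy implementation. The names `BaseModel`, `get_response`, and `EchoModel` are illustrative assumptions; the real contract is defined by the repository's `template.py`:

```python
from abc import ABC, abstractmethod

class BaseModel(ABC):
    """Illustrative base class; template.py defines the actual interface."""

    def __init__(self, api_key):
        self.api_key = api_key

    @abstractmethod
    def get_response(self, task):
        """Send the task to the provider's API and return the text response."""

class EchoModel(BaseModel):
    """Toy stand-in for a real provider class such as ChatGPTModel."""

    def get_response(self, task):
        # A real implementation would call the provider's API here
        return f"Echo: {task}"

# The new model is then registered by name, mirroring EXECUTE_MODELS in main.py
EXECUTE_MODELS = {"Echo": "EchoModel"}
```

Registering by class name lets the main loop instantiate only the models you actually want to run.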
Lastly, open the `utils.py` file and set the `VERBOSE` constant to `True` if you want the program to log everything it does. I personally only enable it for debugging purposes.
Finally, with the input file, the API keys, and the constants in the `main.py` file set up, you can run the project.
To run the project, run the following command:

```bash
make
```
This command will always ensure that the virtual environment and the dependencies are installed, and then run the project.
In this section, the results generated by the tool from the input tasks in the `input.csv` file are discussed. The tool writes its results to `Outputs/output.csv`. This file includes details about the tasks provided, the expected outputs, and the comparison results of the AI models' responses. For each task, the tool calculates several similarity metrics between the AI model responses and the expected output. The `output.csv` file includes the following columns:
- Task: The input task that was provided to the tool.
- Expected Output: The expected correct response for the task.
- Similarity Score: The computed similarity between each model's response and the expected output. This includes:
  - Standard Deviation Similarity
  - Most Similar Model
  - Minimum Similarity
  - Maximum Similarity
  - Average Similarity
  - Median Similarity
- Models Output: For each AI model evaluated, the following details are provided:
  - Model Name: The name of the AI model being evaluated, with its respective output.
  - Model Similarity: The similarity score between the model's response and the expected output.
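The README does not specify which similarity measure the tool uses, so the sketch below uses `difflib.SequenceMatcher` as a stand-in text-similarity metric and shows how the aggregate columns (most similar model, minimum, maximum, average, median, standard deviation) can be derived from the per-model scores:

```python
import statistics
from difflib import SequenceMatcher

def similarity(a, b):
    """Percentage similarity between two strings.

    difflib's ratio is an assumption here; the project's actual
    metric may differ.
    """
    return SequenceMatcher(None, a, b).ratio() * 100

def summarize(scores):
    """Derive the aggregate output.csv columns from per-model scores."""
    values = list(scores.values())
    return {
        "Most Similar Model": max(scores, key=scores.get),
        "Minimum Similarity": min(values),
        "Maximum Similarity": max(values),
        "Average Similarity": statistics.mean(values),
        "Median Similarity": statistics.median(values),
        "Standard Deviation Similarity": statistics.stdev(values) if len(values) > 1 else 0.0,
    }

expected = "The 'chmod' command in Linux changes the permissions of a file or directory."
scores = {
    "Gemini": similarity(expected, "chmod changes the permissions of files and directories."),
    "Copilot": similarity(expected, "The chmod command is used to change file permissions."),
}
print(summarize(scores))
```

Note that with only two models, the average and median coincide, which matches the example figures reported below.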
This subsection provides an example of the output file structure and discusses the results generated by the tool for two example tasks: "Explain the 'sudo' command in Linux" and "Explain the 'chmod' command in Linux." The input tasks are read from the `input.csv` file, and the responses from two models (`Gemini` and `Copilot`) are evaluated.
The example tasks provided in the `input.csv` file are as follows:

```csv
Task,Expected Output (Optional)
"Explain the 'sudo' command in Linux.","The 'sudo' command in Linux allows a permitted user to execute a command as the superuser or another user, as specified by the security policy."
"Explain the 'chmod' command in Linux.","The 'chmod' command in Linux changes the permissions of a file or directory."
```
Based on the tasks and models executed (Gemini and Copilot), here are the computed similarity metrics and model responses for each task. These results provide insight into how well the models perform against the expected outputs.
Expected Output: "The 'sudo' command in Linux allows a permitted user to execute a command as the superuser or another user, as specified by the security policy."
- Most Similar Model: Gemini (49.92%)
- Minimum Similarity: 44.56%
- Maximum Similarity: 49.92%
- Average Similarity: 47.24%
- Median Similarity: 47.24%
- Standard Deviation Similarity: 2.68%
"• sudo is a command in Linux that allows a user to run a command with elevated privileges, usually as the root user or another privileged user. • It stands for 'Super User Do'. • When a command is run with sudo, it prompts the user for their password to verify their identity. • Once authenticated, the command is executed with the privileges of the specified user, allowing access to protected files and system resources. • sudo is often used to perform administrative tasks that require elevated privileges, such as installing software, modifying system configurations, or managing services. • It provides a way to control access to critical system operations and helps prevent unauthorized changes that could potentially harm the system."
"The sudo command in Linux is a powerful tool that allows you to execute commands with the privileges of another user, typically the root user (also known as superuser). It stands for 'superuser do'. // Here's how it works: // 1. Authentication: When you type sudo followed by a command, the system prompts you for your password. This is to ensure that you are authorized to use the elevated privileges. // 2. Privilege Escalation: If your password is correct, sudo temporarily grants you the permissions of the specified user (usually root). // 3. Command Execution: The command you entered is then executed with the elevated privileges. // 4. Privilege Reversion: Once the command finishes, your privileges revert back to your normal user account. // Why use sudo? // * Security: It's generally considered unsafe to log in as root directly. sudo allows you to perform specific tasks with elevated privileges without compromising the entire system. // * Convenience: It's easier to use sudo for occasional tasks requiring root access than logging in as root every time. // * Fine-grained Control: You can configure sudo to allow specific users to execute specific commands with root privileges, enhancing security and control."
Expected Output: "The 'chmod' command in Linux changes the permissions of a file or directory."
- Most Similar Model: Gemini (55.32%)
- Minimum Similarity: 37.21%
- Maximum Similarity: 55.32%
- Average Similarity: 46.26%
- Median Similarity: 46.26%
- Standard Deviation Similarity: 9.06%
"• chmod is a command used in Linux to change the permissions of files and directories. • It stands for 'change mode'. • It modifies the read, write, and execute permissions for the owner, group, and others. • The general syntax is chmod [OPTIONS] MODE FILE.... • OPTIONS can include: • -c to display a message for each file that is changed. • -R to recursively change the permissions of directories and their contents. • -v to show a message for each file processed. • MODE specifies the desired permissions using a symbolic or numeric representation. • Symbolic representation uses letters: u for user/owner, g for group, o for others, a for all. • Numeric representation uses a 3-digit octal number: the first digit represents the owner, the second digit represents the group, and the third digit represents others. Each digit is a sum of the desired permissions: 4 for read, 2 for write, 1 for execute."
"The chmod command in Linux is used to change the permissions of files and directories. Permissions determine who can access a file or directory and what they can do with it. // Understanding Permissions: // Linux uses a system of three sets of permissions: // * Owner: The user who created the file or directory. // * Group: A group of users who have specific access rights. // * Others: All other users on the system. // Each set of permissions has three types: // * Read (r): Allows viewing the contents of a file or listing the contents of a directory. // * Write (w): Allows modifying the contents of a file or adding/removing files/directories within a directory. // * Execute (x): Allows running a file (if it's an executable) or entering a directory."
Contributions are what make the open-source community such an amazing place to learn, inspire, and create, and any contributions you make are greatly appreciated. If you have suggestions for improving the code, your insights are highly welcome. To contribute to this project, follow the guidelines below, or read the CONTRIBUTING.md file for more details, as it covers the commit standards and the entire pull request process. Please follow these guidelines to make your contributions smooth and effective:
- Set Up Your Environment: Ensure you've followed the setup instructions in the Setup section to prepare your development environment.
- Make Your Changes:
  - Create a Branch:

    ```bash
    git checkout -b feature/YourFeatureName
    ```

  - Implement Your Changes: Make sure to test your changes thoroughly.
  - Commit Your Changes: Use clear commit messages, for example:
    - For new features: `git commit -m "FEAT: Add some AmazingFeature"`
    - For bug fixes: `git commit -m "FIX: Resolve Issue #123"`
    - For documentation: `git commit -m "DOCS: Update README with new instructions"`
    - For refactorings: `git commit -m "REFACTOR: Enhance component for better aspect"`
    - For snapshots: `git commit -m "SNAPSHOT: Temporary commit to save the current state for later reference"`
  - See more about crafting commit messages in the CONTRIBUTING.md file.
- Submit Your Contribution:
  - Push Your Changes:

    ```bash
    git push origin feature/YourFeatureName
    ```

  - Open a Pull Request (PR): Navigate to the repository on GitHub and open a PR with a detailed description of your changes.
- Stay Engaged: Respond to any feedback from the project maintainers and make necessary adjustments to your PR.
- Celebrate: Once your PR is merged, celebrate your contribution to the project!
We thank the following people who contributed to this project:
Breno Farias da Silva
This project is licensed under the Apache License 2.0. This license permits use, modification, distribution, and sublicensing of the code for both private and commercial purposes, provided that the original copyright notice and a disclaimer of warranty are included in all copies or substantial portions of the software. It also requires clear attribution to the original author(s) of the repository. For more details, see the LICENSE file in this repository.