A Chatbot tool powered by LLM for data analysis and KDD processes

fab2112/AI-Datanalysis

AI-Datanalysis


AI-Datanalysis is a chatbot tool, powered by an AI agent, for analyzing data and discovering knowledge in it.

This project aims to make data interpretation and manipulation easier: users ask questions in natural language and receive insights from their data as graphs, tables, text, or maps. The application is built around generative models (LLMs) integrated with the following libraries: Plotly (charting), Langchain (LLM interface), Statsmodels and Scipy (statistics), Pandas (dataframes), Numpy (numerical computing), Streamlit (web interface), and Scikit-learn (machine learning). Other modules can also be added to the execution environment.



Prerequisites


Get started

Web app access

Local installation

Option-1 Docker (recommended):

git clone https://github.com/fab2112/AI-Datanalysis.git
cd AI-Datanalysis
docker-compose up --build -d

Option-2 Poetry:

git clone https://github.com/fab2112/AI-Datanalysis.git
cd AI-Datanalysis
poetry shell
poetry install

streamlit run app.py 

Usage

Load dataset

Set language and privacy

  • In anonymous privacy mode, only metadata is sent to the model
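
To illustrate what "only metadata" could mean in practice, here is a minimal sketch that summarizes a CSV by row count, column names, and inferred column types, without including any cell values. It is an assumption about the general idea, not the app's actual implementation: the function names `infer_type` and `dataset_metadata` are hypothetical, and the sketch uses only the standard library rather than Pandas so that it runs anywhere.

```python
import csv
import io


def infer_type(values):
    """Guess a coarse column type from its string values (hypothetical helper)."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "empty"
    for caster, label in ((int, "integer"), (float, "float")):
        try:
            for v in non_empty:
                caster(v)
            return label
        except ValueError:
            continue
    return "text"


def dataset_metadata(csv_text):
    """Summarize a CSV without exposing any cell values (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    return {
        "n_rows": len(rows),
        # Missing cells are treated as empty strings.
        "columns": {c: infer_type([(r[c] or "") for r in rows]) for c in columns},
    }


sample = "name,age,height\nAna,31,1.62\nBruno,45,1.80\n"
print(dataset_metadata(sample))
```

A summary like this carries the structure of the dataset while the rows themselves stay on the user's machine.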

Set model and api-key

For local model (Ollama)

For maps in Plotly with Mapbox

Get Python code

  • Retrieve the Python code generated for each response

Add new modules to environment

  • To add new modules to the execution environment, set WHITELIST_ENV in app.py:
WHITELIST_ENV = ["json", "statsmodels", "scipy", "datetime"]
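
How app.py consumes WHITELIST_ENV is not shown here, so the following is only a sketch of one common way an import whitelist can be enforced when running generated code: replacing `__import__` in the builtins passed to `exec`. The helpers `make_restricted_import` and `run_generated_code` are hypothetical names, and this approach is a convenience guard, not a security sandbox.

```python
import builtins

WHITELIST_ENV = ["json", "statsmodels", "scipy", "datetime"]


def make_restricted_import(whitelist):
    """Return an __import__ replacement that only allows whitelisted modules."""
    allowed = set(whitelist)

    def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
        # Check the top-level package, so "scipy.stats" is allowed via "scipy".
        root = name.split(".")[0]
        if root not in allowed:
            raise ImportError(f"module '{root}' is not in WHITELIST_ENV")
        return builtins.__import__(name, globals, locals, fromlist, level)

    return restricted_import


def run_generated_code(code, whitelist=WHITELIST_ENV):
    """Execute code with import statements limited to the whitelist (sketch)."""
    env_builtins = dict(vars(builtins))
    env_builtins["__import__"] = make_restricted_import(whitelist)
    env = {"__builtins__": env_builtins}
    exec(code, env)
    return env


env = run_generated_code("import json\nresult = json.dumps({'a': 1})")
print(env["result"])

try:
    run_generated_code("import os")
except ImportError as exc:
    print(exc)
```

Under this scheme, adding a module name to the list is all that is needed for generated code to import it, provided the module is installed in the environment.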

LLMs available

  • Obtain the API key for the model you want to use from the corresponding provider:

| Provider | Model | Api-Key URL |
|----------|-------|-------------|
| Groq | llama3-70b-8192 | Groq Cloud |
| Groq | mixtral-8x7b-32768 | Groq Cloud |
| Google | gemini-1.5-pro-latest | Google AIStudio |
| Google | gemini-1.5-flash-latest | Google AIStudio |
| OpenAI | gpt-3.5-turbo | OpenAI API |
| Cohere | command-r-plus | Cohere API |
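
As a small illustration of how the table above can be used programmatically, the sketch below stores the listed models in a dictionary and looks up which provider's API key a given model needs. The names `AVAILABLE_MODELS` and `provider_for` are hypothetical and are not taken from app.py.

```python
# Provider -> model names, copied from the table above.
AVAILABLE_MODELS = {
    "Groq": ["llama3-70b-8192", "mixtral-8x7b-32768"],
    "Google": ["gemini-1.5-pro-latest", "gemini-1.5-flash-latest"],
    "OpenAI": ["gpt-3.5-turbo"],
    "Cohere": ["command-r-plus"],
}


def provider_for(model_name):
    """Return the provider whose API key is needed for model_name."""
    for provider, models in AVAILABLE_MODELS.items():
        if model_name in models:
            return provider
    raise ValueError(f"unknown model: {model_name!r}")


print(provider_for("gemini-1.5-flash-latest"))
```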

Ecosystem

  • Infrastructure

| Component | Version |
|-----------|---------|
| Docker Engine | 20.10.0+ |
| Docker Compose | 1.29.0+ |

  • Applications

| Poetry | Python |
|--------|--------|
| 1.8.3 | >= 3.11 |
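
Before installing, it can help to confirm that the local machine meets these requirements. The sketch below checks the Python version against 3.11 and reports which command-line tools are missing from PATH. It only checks that the tools are present, not their versions, and the helper names `python_ok` and `missing_tools` are hypothetical.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # from the table above


def python_ok(version=None):
    """Return True if the given (or running) Python version is >= 3.11."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def missing_tools(tools=("docker", "docker-compose", "poetry")):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print("Python version OK:", python_ok())
print("Missing tools:", missing_tools())
```

Only the tools for the chosen installation option are needed: Docker and Docker Compose for Option-1, Poetry for Option-2.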

Experiments - screenshots

  • Density maps analysis
  • Scatter maps analysis
  • Choropleth maps
  • Scatter 3D analysis
  • OHLC data analysis
  • Bar plot analysis
  • Heatmap correlations
  • Surfaces analysis
  • Landscape analysis
  • Pie plots analysis
  • Box-plot and Violin-plot
  • Polar plots analysis
  • Machine Learning
  • Manifold analysis
  • Line and Area plots
  • Scatter plots

Experiments reproducibility

  • Datasets and prompts are available in the datasets directory
Datasets
cancer_data.csv
everest_data.csv
sales_data.csv
digits.csv
iris.csv
ohlcv.csv
geojson_brasil.json

License

MIT

