AI-Datanalysis is a chatbot tool, powered by an AI agent, for analyzing data and discovering knowledge in it.
This project aims to make data interpretation and manipulation easier: users ask questions in natural language and receive insights directly from their data as graphs, tables, text, or maps. The application's core relies on generative models (LLMs) integrated with the following frameworks: Plotly (graphics engine), Langchain (LLM interface), Statsmodels and Scipy (statistics), Pandas (dataframes), Numpy (mathematics), Streamlit (web interface), and Scikit-learn (machine learning). Other modules can also be added to the execution environment.
- Docker
- Docker Compose
- Git
- Poetry
- Python >= 3.11
- Linux or Windows environment
- Access the web application at https://ai-datanalysis.streamlit.app/
git clone https://github.com/fab2112/AI-Datanalysis.git
cd AI-Datanalysis
docker-compose up --build -d
git clone https://github.com/fab2112/AI-Datanalysis.git
cd AI-Datanalysis
poetry shell
poetry install
streamlit run app.py
- After installation, open the app in a browser at http://0.0.0.0:8501 or http://localhost:8501
- Anonymous privacy: only metadata is sent to the model
- Mapbox API token: Mapbox token
- Get the Python code for all response solutions
- To add new modules to the execution environment, set WHITELIST_ENV in app.py
WHITELIST_ENV = ["json", "statsmodels", "scipy", "datetime"]
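A whitelist like this is typically checked before a module is loaded into the execution environment. The following is a minimal sketch of that idea, not the application's actual mechanism: `is_allowed` and `guarded_import` are hypothetical helper names, and matching on the top-level package name is an assumption.

```python
import importlib

# Same value as the WHITELIST_ENV setting shown above.
WHITELIST_ENV = ["json", "statsmodels", "scipy", "datetime"]


def is_allowed(module_name: str, whitelist: list[str] = WHITELIST_ENV) -> bool:
    """Return True if the top-level package of module_name is whitelisted."""
    # "scipy.stats" is accepted when "scipy" is in the whitelist (assumption).
    top_level = module_name.split(".")[0]
    return top_level in whitelist


def guarded_import(module_name: str, whitelist: list[str] = WHITELIST_ENV):
    """Import a module only when its top-level package is whitelisted."""
    if not is_allowed(module_name, whitelist):
        raise ImportError(f"module '{module_name}' is not in WHITELIST_ENV")
    return importlib.import_module(module_name)
```

Under these assumptions, `guarded_import("json")` returns the `json` module, while `guarded_import("os")` raises `ImportError` until `"os"` is added to the list.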
- Obtain the API key for the model you want to use in the app
Provider | Model | API key source |
---|---|---|
Groq | llama3-70b-8192 | Groq Cloud |
Groq | mixtral-8x7b-32768 | Groq Cloud |
Google | gemini-1.5-pro-latest | Google AI Studio |
Google | gemini-1.5-flash-latest | Google AI Studio |
OpenAI | gpt-3.5-turbo | OpenAI API |
Cohere | command-r-plus | Cohere API |
- Infrastructure
Component | Version |
---|---|
Docker Engine | 20.10.0+ |
Docker Compose | 1.29.0+ |
- Applications
Component | Version |
---|---|
Poetry | 1.8.3 |
Python | >= 3.11 |
- Density maps analysis
- Scatter maps analysis
- Choropleth maps
- 3D scatter analysis
- OHLC data analysis
- Bar plot analysis
- Heatmap correlations
- Surface analysis
- Landscape analysis
- Pie plots analysis
- Box-plot and Violin-plot
- Polar plots analysis
- Machine Learning
- Manifold analysis
- Line and Area plots
- Scatter plots
- Datasets and prompts available in the datasets directory
Datasets |
---|
cancer_data.csv |
everest_data.csv |
sales_data.csv |
digits.csv |
iris.csv |
ohlcv.csv |
geojson_brasil.json |
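To get a quick overview of the bundled CSV files before asking questions about them, a small helper can list each file's columns and row count. This is only a sketch: `describe_csv` and `describe_datasets` are hypothetical names that are not part of the application, and it uses the standard library rather than Pandas so it runs without extra dependencies.

```python
import csv
from pathlib import Path


def describe_csv(path: Path) -> dict:
    """Return the column names and data-row count of a CSV file."""
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # first line holds the column names
        row_count = sum(1 for _ in reader)  # remaining lines are data rows
    return {"file": path.name, "columns": header, "rows": row_count}


def describe_datasets(directory: Path) -> list[dict]:
    """Describe every CSV file in a directory, sorted by file name."""
    return [describe_csv(p) for p in sorted(directory.glob("*.csv"))]
```

Assuming the repository layout above, `describe_datasets(Path("datasets"))` would return one entry per CSV file; non-CSV files such as geojson_brasil.json are skipped.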