@upgini

Upgini • Data search & enrichment for Machine Learning and AI

Easily find and add relevant features to your ML & AI pipeline from hundreds of public, community and premium external data sources, including open & commercial LLMs

🚀 Awesome features of Upgini Python Library

⭐️ Automatically find only the relevant features that improve ML model accuracy, not just those correlated with the target variable
⭐️ Automated feature generation from the sources: feature generation with data augmentation from Large Language Models, RNNs and GraphNNs; ensembling of multiple data sources
⭐️ Automatic search key augmentation from all connected sources. If your search request is missing some search keys, such as postal/ZIP code, Upgini will try to add those keys based on the search keys you did provide. This broadens the search across all available data sources
⭐️ Calculate accuracy metrics and uplifts after enriching an existing ML model with external features
⭐️ Check the stability of the accuracy gain from external data on out-of-time intervals and verification datasets. Mitigate the risk of unstable external data dependencies in the ML pipeline
⭐️ Easy to use: a single request enriches the training dataset with all of the keys at once: date/datetime, country, postal/ZIP code, phone number, hashed email/HEM, IP address
⭐️ Simple Drag & Drop Search UI: upgini_data_search_for_ML
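
To make the "uplift" and "stability on out-of-time intervals" ideas above concrete, here is a minimal sketch in plain Python. It is not Upgini's implementation or API: the function names `uplift_by_period` and `is_stable`, the period labels and the AUC values are all hypothetical and made up for illustration. It assumes uplift means the enriched model's metric minus the baseline model's metric, computed separately for each time interval.

```python
from typing import Dict


def uplift_by_period(baseline: Dict[str, float],
                     enriched: Dict[str, float]) -> Dict[str, float]:
    """Absolute metric gain (enriched minus baseline) for each period."""
    return {period: enriched[period] - baseline[period] for period in baseline}


def is_stable(uplifts: Dict[str, float], min_gain: float = 0.0) -> bool:
    """Treat the gain as stable only if every period beats min_gain."""
    return all(gain > min_gain for gain in uplifts.values())


# Hypothetical AUC values per out-of-time interval (illustrative only).
baseline_auc = {"2023-Q1": 0.71, "2023-Q2": 0.70, "2023-Q3": 0.69}
enriched_auc = {"2023-Q1": 0.75, "2023-Q2": 0.74, "2023-Q3": 0.68}

uplifts = uplift_by_period(baseline_auc, enriched_auc)
stable = is_stable(uplifts)

for period, gain in uplifts.items():
    print(f"{period}: uplift {gain:+.2f}")
print("stable across periods:", stable)
```

In this made-up example the external features help in the first two quarters but hurt in the third, so the gain would be flagged as unstable. That is the kind of dependency risk the stability check is meant to surface.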

📊 Total: 239 countries and up to 41 years of history

| Data source | Countries | History, years | # sources for ensemble | Update | Search keys | API Key required |
|---|---|---|---|---|---|---|
| Historical weather & Climate normals | 68 | 22 | - | Monthly | date, country, postal/ZIP code | No |
| Location/Places/POI/Area/Proximity information from OpenStreetMap | 221 | 2 | - | Monthly | date, country, postal/ZIP code | No |
| International holidays & events, Workweek calendar | 232 | 22 | - | Monthly | date, country | No |
| Consumer Confidence index | 44 | 22 | - | Monthly | date, country | No |
| World economic indicators | 191 | 41 | - | Monthly | date, country | No |
| Markets data | - | 17 | - | Monthly | date, datetime | No |
| World mobile & fixed broadband network coverage and performance | 167 | - | 3 | Monthly | country, postal/ZIP code | No |
| World demographic data | 90 | - | 2 | Annual | country, postal/ZIP code | No |
| World house prices | 44 | - | 3 | Annual | country, postal/ZIP code | No |
| Public social media profile data | 104 | - | - | Monthly | date, email/HEM, phone | Yes |
| Car ownership data and Parking statistics | 3 | - | - | Annual | country, postal/ZIP code, email/HEM, phone | Yes |
| Geolocation profile for phone & IPv4 & email | 239 | - | 6 | Monthly | date, email/HEM, phone, IPv4 | Yes |
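
As a reading aid for the table above, the sketch below encodes its "Search keys" and "API Key required" columns in plain Python and answers two lookup questions: which sources need no API key, and which sources list a given search key. The `Source` type and the helper names are hypothetical. This is only a lookup over the table, not a call into the Upgini library, and it does not model how Upgini actually matches search keys to sources.

```python
from typing import List, NamedTuple, Tuple


class Source(NamedTuple):
    name: str
    search_keys: Tuple[str, ...]
    api_key_required: bool


# Rows copied from the data source table (name, search keys, API key flag).
CATALOG: List[Source] = [
    Source("Historical weather & Climate normals", ("date", "country", "postal/ZIP code"), False),
    Source("Location/Places/POI/Area/Proximity information from OpenStreetMap", ("date", "country", "postal/ZIP code"), False),
    Source("International holidays & events, Workweek calendar", ("date", "country"), False),
    Source("Consumer Confidence index", ("date", "country"), False),
    Source("World economic indicators", ("date", "country"), False),
    Source("Markets data", ("date", "datetime"), False),
    Source("World mobile & fixed broadband network coverage and performance", ("country", "postal/ZIP code"), False),
    Source("World demographic data", ("country", "postal/ZIP code"), False),
    Source("World house prices", ("country", "postal/ZIP code"), False),
    Source("Public social media profile data", ("date", "email/HEM", "phone"), True),
    Source("Car ownership data and Parking statistics", ("country", "postal/ZIP code", "email/HEM", "phone"), True),
    Source("Geolocation profile for phone & IPv4 & email", ("date", "email/HEM", "phone", "IPv4"), True),
]


def free_sources(catalog: List[Source]) -> List[str]:
    """Names of sources that the table marks as not requiring an API key."""
    return [s.name for s in catalog if not s.api_key_required]


def sources_listing(key: str, catalog: List[Source]) -> List[str]:
    """Names of sources whose 'Search keys' column mentions the given key."""
    return [s.name for s in catalog if key in s.search_keys]


print(len(free_sources(CATALOG)), "sources need no API key")
print("Sources listing postal/ZIP code:")
for name in sources_listing("postal/ZIP code", CATALOG):
    print(" -", name)
```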

👉 Details on datasets and features

Repositories

Showing 4 of 4 repositories
  • upgini Public

    Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

    Python 317 BSD-3-Clause 25 1 4 Updated Oct 21, 2024
  • upgini-docs
    0 1 0 0 Updated Mar 6, 2024
  • .github Public
    0 0 0 0 Updated Jan 21, 2024
  • staged-recipes Public Forked from conda-forge/staged-recipes

    A place to submit conda recipes before they become fully fledged conda-forge feedstocks

    Python 0 BSD-3-Clause 4,968 0 0 Updated Aug 5, 2022
