Skip to content

Latest commit

 

History

History
303 lines (273 loc) · 39.7 KB

File metadata and controls

303 lines (273 loc) · 39.7 KB

Applications and Frameworks

Applications, Frameworks, and User Interface (UI/UX)

LLM Training/Build

  1. fastText: A library for efficient learning of word representations and sentence classification [Aug 2016] GitHub Repo stars
  2. Pytorch: PyTorch is the most favorite library among researchers. Papers with code Trends [Sep 2016]
  3. fairseq: a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling [Sep 2017] GitHub Repo stars
  4. huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. (github.com) [Oct 2018] GitHub Repo stars
  5. jax: JAX is Autograd (automatically differentiate native Python & Numpy) and XLA (compile and run NumPy) [Oct 2018] GitHub Repo stars
  6. Sentence Transformers: Python framework for state-of-the-art sentence, text and image embeddings. Useful for semantic textual similar, semantic search, or paraphrase mining. git [27 Aug 2019] GitHub Repo stars
  7. Weights & Biases: Visualizing and tracking your machine learning experiments wandb.ai doc: deeplearning.ai/wandb [Jan 2020] GitHub Repo stars
  8. mosaicml/llm-foundry: LLM training code for MosaicML foundation models [Jun 2022] GitHub Repo stars
  9. vLLM: Easy-to-use library for LLM inference and serving. [Feb 2023] GitHub Repo stars
  10. string2string: an open-source tool that offers a comprehensive suite of efficient algorithms for a broad range of string-to-string problems. [Mar 2023] GitHub Repo stars
  11. GPT4All: Open-source large language models that run locally on your CPU [Mar 2023] GitHub Repo stars
  12. Visual Blocks: Google visual programming framework that lets you create ML pipelines in a no-code graph editor. [Mar 2023] GitHub Repo stars
  13. LLaMA-Factory: Unify Efficient Fine-Tuning of 100+ LLMs [May 2023] GitHub Repo stars
  14. ollama: Running with Large language models locally [Jun 2023] GitHub Repo stars
  15. unsloth: Finetune Mistral, Gemma, Llama 2-5x faster with less memory! [Nov 2023] GitHub Repo stars
  16. LM Studio: UI for Discover, download, and run local LLMs [May 2024]
  17. YaFSDP: Yet another Fully Sharded Data Parallel (FSDP): enhanced for distributed training. YaFSDP vs DeepSpeed. [May 2024] GitHub Repo stars
  18. exo: Run your own AI cluster at home with everyday devices [Jun 2024] GitHub Repo stars
  19. BitNet: Official inference framework for 1-bit LLMs [Aug 2024] GitHub Repo stars
  20. Meta Lingua: a minimal and fast LLM training and inference library designed for research. [Oct 2024] GitHub Repo stars

LLM Application Development

  1. mindsdb: The open-source virtual database for building AI from enterprise data. It supports SQL syntax for development and deployment, with over 70 technology and data integrations. [Aug 2018] GitHub Repo stars
  2. Jina-Serve: a framework for building and deploying AI services that communicate via gRPC, HTTP and WebSockets. [Feb 2020] GitHub Repo stars
  3. superduper: Build end-to-end AI-data workflows and applications with your favourite tools. [Aug 2022] GitHub Repo stars
  4. langflow: LangFlow is a UI for LangChain, designed with react-flow. [Feb 2023] GitHub Repo stars
  5. MiniChain: A tiny library for coding with llm [Feb 2023] GitHub Repo stars
  6. marvin: a lightweight AI toolkit for building natural language interfaces. [Mar 2023] GitHub Repo stars
  7. microsoft/Tokenizer: Tiktoken in C#: .NET and TypeScript implementation of BPE tokenizer for OpenAI LLMs. [Mar 2023] GitHub Repo stars
  8. Azure OpenAI Proxy: OpenAI API requests converting into Azure OpenAI API requests [Mar 2023] GitHub Repo stars
  9. ChainForge: An open-source visual programming environment for battle-testing prompts to LLMs. [Mar 2023] GitHub Repo stars
  10. E2B: an open-source infrastructure that allows you run to AI-generated code in secure isolated sandboxes in the cloud. [Mar 2023] GitHub Repo stars
  11. Flowise Drag & drop UI to build your customized LLM flow [Apr 2023] GitHub Repo stars
  12. Dify: an open-source platform for building applications with LLMs, featuring an intuitive interface for AI workflows and model management. [Apr 2023] GitHub Repo stars
  13. ThinkGPT: Chain of Thoughts library [Apr 2023] GitHub Repo stars
  14. langfuse: Traces, evals, prompt management and metrics to debug and improve your LLM application. [May 2023] GitHub Repo stars
  15. Superagent: AI Assistant Framework & API [May 2023] GitHub Repo stars
  16. DemoGPT: Automatic generation of LangChain code [Jun 2023] GitHub Repo stars
  17. Spring AI: Developing AI applications for Java. [Jul 2023] GitHub Repo stars
  18. litellm: Python SDK to call 100+ LLM APIs in OpenAI format [Jul 2023] GitHub Repo stars
  19. Opencopilot: Build and embed open-source AI Copilots. [Aug 2023] GitHub Repo stars
  20. BISHENG: an open LLM application devops platform, focusing on enterprise scenarios. [Aug 2023] GitHub Repo stars
  21. langfun: leverages PyGlove to integrate LLMs and programming. [Aug 2023] GitHub Repo stars
  22. mirascope: a library that simplifies working with LLMs via a unified interface for multiple providers. [Dec 2023] GitHub Repo stars
  23. Pipecat: Open Source framework for voice and multimodal conversational AI [Dec 2023] GitHub Repo stars
  24. Refly: WYSIWYG AI editor to create llm application. [Feb 2024] GitHub Repo stars
  25. Llama Stack:💡building blocks for Large Language Model (LLM) development [Jun 2024] GitHub Repo stars
  26. aisuite: Andrew Ng launches a tool offering a simple, unified interface for multiple generative AI providers. [26 Nov 2024] GitHub Repo stars vs litellm vs OpenRouter
  27. PocketFlow: Minimalist LLM Framework in 100 Lines. Enable LLMs to Program Themselves. [Dec 2024] GitHub Repo stars

LLM Memory

  1. zep: Long term memory layer. Zep intelligently integrates new information into the user's Knowledge Graph. GitHub Repo stars [May 2023]
  2. Mem0:💡A self-improving memory layer for personalized AI experiences. [Jun 2023] GitHub Repo stars
  3. Letta (previously MemGPT): Virtual context management to extend the limited context of LLM. A tiered memory system and a set of functions that allow it to manage its own memory. ref / git:old [12 Oct 2023] GitHub Repo stars
  4. Memary: memary mimics how human memory evolves and learns over time. The memory module comprises the Memory Stream and Entity Knowledge Store. [May 2024] GitHub Repo stars

LLM Application

  1. OpenBB: The first financial Platform that is free and fully open source. AI-powered workspace [Dec 2020] GitHub Repo stars
  2. knowledge: Tool for saving, searching, accessing, and exploring websites and files. Electron based app, built-in Chromium browser, knowledge graph [Jul 2021] GitHub Repo stars
  3. Nomic python client: Generate, store and retrieve embeddings for your unstructured data. supports from hundreds to tens of millions of points. [Jul 2022] GitHub Repo stars
  4. guardrails: Adding guardrails to large language models. [Jan 2023] GitHub Repo stars
  5. aider: AI pair programming in your terminal [Jan 2023] GitHub Repo stars
  6. BookGPT: Generate books based on your specification [Jan 2023] GitHub Repo stars
  7. KnowledgeGPT: Upload your documents and get answers to your questions, with citations [Jan 2023] GitHub Repo stars
  8. DocsGPT: Chatbot for document with your data [Feb 2023] GitHub Repo stars
  9. LibreChat: a free, open source AI chat platform. [8 Mar 2023] GitHub Repo stars
  10. BIG-AGI FKA nextjs-chatgpt-app [Mar 2023] GitHub Repo stars
  11. Next.js AI Chatbot:💡An Open-Source AI Chatbot Template Built With Next.js and the AI SDK by Vercel. [May 2023] GitHub Repo stars
  12. dataline: Chat with your data - AI data analysis and visualization [Apr 2023] GitHub Repo stars
  13. pyspark-ai: English instructions and compile them into PySpark objects like DataFrames. [Apr 2023] GitHub Repo stars
  14. vanna: Chat with your SQL database [May 2023] GitHub Repo stars
  15. Continue: open-source AI code assistant inside of VS Code and JetBrains. [May 2023] GitHub Repo stars
  16. localGPT: Chat with your documents on your local device [May 2023] GitHub Repo stars
  17. anything-llm: All-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more. [Jun 2023] GitHub Repo stars
  18. Dialoqbase: Create custom chatbots with your own knowledge base using PostgreSQL [Jun 2023] GitHub Repo stars
  19. GPT Researcher: Autonomous agent designed for comprehensive online research [Jul 2023] / GPT Newspaper: Autonomous agent designed to create personalized newspapers [Jan 2024] GitHub Repo stars GitHub Repo stars
  20. Postiz: AI social media scheduling tool. An alternative to: Buffer.com, Hypefury, Twitter Hunter. [Jul 2023] GitHub Repo stars
  21. SolidGPT: AI searching assistant for developers (VSCode Extension) [Aug 2023] GitHub Repo stars
  22. notesGPT: Record voice notes & transcribe, summarize, and get tasks [Nov 2023] GitHub Repo stars
  23. screenshot-to-code: Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue) [Nov 2023] GitHub Repo stars
  24. Geppeto: Advanced Slack bot using multiple AI models [Jan 2024] GitHub Repo stars
  25. code2prompt: a command-line tool (CLI) that converts your codebase into a single LLM prompt with a source tree [Mar 2024]
  26. OpenHands: OpenHands (formerly OpenDevin), a platform for software development agents [Mar 2024] GitHub Repo stars
  27. LlamaFS: Automatically renames and organizes your files based on their contents [May 2024] GitHub Repo stars
  28. Cellm: Use LLMs in Excel formulas [Jul 2024] GitHub Repo stars
  29. Nyro: AI-Powered Desktop Productivity Tool [Aug 2024] GitHub Repo stars
  30. Auto_Jobs_Applier_AIHawk: automates the jobs application [Aug 2024] GitHub Repo stars
  31. PDF2Audio: an open-source alternative to NotebookLM for podcast creation [Sep 2024] GitHub Repo stars
  32. o1-engineer: a command-line tool designed to assist developers [Sep 2024] GitHub Repo stars
  33. Zed: AI code editor from the creators of Atom and Tree-sitter [Sep 2024] GitHub Repo stars
  34. Cofounder: full stack generative web apps ; backend + db + stateful web apps [Sep 2024] GitHub Repo stars
  35. Podcastfy.ai: An Open Source API alternative to NotebookLM's podcast feature. [Oct 2024] GitHub Repo stars

Code editor incl. Proprietary Software

  • AI Code Editor: Replit Agent [09 Sep 2024] / Cursor [Mar 2023]
  • Vercel AI Vercel AI Toolkit for TypeScript
  • Cline: CLI aNd Editor. Autonomous coding agent. VSCode Extension. [Jul 2024] GitHub Repo stars
  • void OSS Cursor alternative. a fork of vscode [Oct 2024] GitHub Repo stars
  • Github Spark: an AI-powered tool for creating and sharing micro apps (“sparks”) [29 Oct 2024]
  • bolt.new: Dev Sanbox with AI from stackblitz [Sep 2024] GitHub Repo stars
  • Windsurf editor: Flows = Agents + Copilots. Cascades (a specific implementation of AI Flows. Advanced chat interface). [13 Nov 2024]
  • devin.cursorrules: Transform your Cursor or Windsurf IDE into a Devin-like AI Assistant [Dec 2024] GitHub Repo stars

UI/UX

  1. Gradio: Build Machine Learning Web Apps - in Python [Mar 2023] GitHub Repo stars
  2. GPT 学术优化 (GPT Academic): UI Platform for Academic & Coding Tasks. Optimized for paper reading, writing, and editing. [Mar 2023] GitHub Repo stars
  3. Text generation web UI: Text generation web UI [Mar 2023] GitHub Repo stars
  4. Open AI Chat Mockup: An open source ChatGPT UI. mckaywrigley/chatbot-ui [Mar 2023] GitHub Repo stars
  5. chainlit: Build production-ready Conversational AI applications in minutes. [Mar 2023] GitHub Repo stars
  6. CopilotKit: Built-in React UI components [Jun 2023] GitHub Repo stars
  7. Open-source GPT Wrappers 1. ChatGPT-Next-Web [Mar 2023] 2. FastGPT [Feb 2023] 3. Lobe Chat [Jan 2024] GitHub Repo stars GitHub Repo stars GitHub Repo stars
  8. anse: UI for multiple models such as ChatGPT, DALL-E and Stable Diffusion. [Apr 2023] GitHub Repo stars
  9. Open WebUI: User-friendly AI Interface (Supports Ollama, OpenAI API, ...) [Oct 2023] GitHub Repo stars

Data Processing and Management

  1. Camelot a Python library that can help you extract tables from PDFs! ref: Comparison with other PDF Table Extraction libraries [Jul 2016] GitHub Repo stars
  2. Trafilatura: Gather text from the web and convert raw HTML into structured, meaningful data. [Apr 2019] GitHub Repo stars
  3. Math formula OCR: MathPix, OSS LaTeX-OCR [Jan 2021] GitHub Repo stars
  4. activeloopai/deeplake: AI Vector Database for LLMs/LangChain. Doubles as a Data Lake for Deep Learning. Store, query, version, & visualize any data. Stream data in real-time to PyTorch/TensorFlow. ref [Jun 2021] GitHub Repo stars
  5. PostgresML: The GPU-powered AI application database. [Apr 2022] GitHub Repo stars
  6. unstructured: Open-Source Pre-Processing Tools for Unstructured Data [Sep 2022] GitHub Repo stars
  7. outlines: Structured Text Generation [Mar 2023] GitHub Repo stars
  8. pandas-ai: Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). [Apr 2023] GitHub Repo stars
  9. Instructor: Structured outputs for LLMs, easily map LLM outputs to structured data. [Jun 2023] GitHub Repo stars
  10. Nougat: Neural Optical Understanding for Academic Documents: The academic document PDF parser that understands LaTeX math and tables. git [25 Aug 2023] GitHub Repo stars
  11. Marker: converts PDF to markdown [Oct 2023] GitHub Repo stars
  12. Maxun: Open-Source No-Code Web Data Extraction Platform [Oct 2023] GitHub Repo stars
  13. firecrawl: Scrap entire websites into LLM-ready markdown or structured data. [Apr 2024] GitHub Repo stars
  14. Crawl4AI: Open-source LLM Friendly Web Crawler & Scrapper [May 2024] GitHub Repo stars
  15. MegaParse: a powerful and versatile parser that can handle various types of documents. Focus on having no information loss during parsing. [30 May 2024] GitHub Repo stars
  16. Zerox OCR: Zero shot pdf OCR with gpt-4o-mini [Jul 2024] GitHub Repo stars
  17. docling: IBM. Docling parses documents and exports them to the desired format. [13 Nov 2024] GitHub Repo stars
  18. markitdown: Python tool for converting files and office documents to Markdown. [14 Nov 2024] GitHub Repo stars
  19. Azure AI Document Intelligence (FKA. Azure Form Recognizer): ref: Table and Meta data Extraction in the Document
  20. Table to Markdown: LLM can recognize Markdown-formatted tables more effectively than raw table formats.

Sample code

  • Streaming with Azure OpenAI SSE [May 2023] GitHub Repo stars
  • TaxyAI/browser-extension: Browser Automation by Chrome debugger API and Prompt > src/helpers/determineNextAction.ts [Mar 2023] GitHub Repo stars
  • Embedding does not use Open AI. Can be executed locally: pdfGPT [Mar 2023] GitHub Repo stars
  • Langchain Ask PDF (Tutorial): git [Apr 2023] GitHub Repo stars

Cross-reference

  • RAG: x-ref
  • Agent Applications and Libraries: x-ref
  • OSS Alternatives for OpenAI Code Interpreter: x-ref
  • LLMOps: Large Language Model Operations: x-ref

Caching

  • Caching: A technique to store data that has been previously retrieved or computed, so that future requests for the same data can be served faster.
  • To reduce latency, cost, and LLM requests by serving pre-computed or previously served responses.
  • Strategies for caching: Caching can be based on item IDs, pairs of item IDs, constrained input, or pre-computation. Caching can also leverage embedding-based retrieval, approximate nearest neighbor search, and LLM-based evaluation. ref
  • GPTCache: Semantic cache for LLMs. Fully integrated with LangChain and llama_index. git [Mar 2023] GitHub Repo stars
  • Prompt Cache: Modular Attention Reuse for Low-Latency Inference: LLM inference by reusing precomputed attention states from overlapping prompts. [7 Nov 2023]
  • Prompt caching with Claude: Reducing costs by up to 90% and latency by up to 85% for long prompts. [15 Aug 2024]

Defensive UX

  • Defensive UX: A design strategy that aims to prevent and handle errors in user interactions with machine learning or LLM-based products.
  • Why defensive UX?: Machine learning and LLMs can produce inaccurate or inconsistent output, which can affect user trust and satisfaction. Defensive UX can help by increasing accessibility, trust, and UX quality.
  • Guidelines for Human-AI Interaction: Microsoft: Based on a survey of 168 potential guidelines from various sources, they narrowed it down to 18 action rules organized by user interaction stages.
  • People + AI Guidebook: Google: Google’s product teams and academic research, they provide 23 patterns grouped by common questions during the product development process3.
  • Human Interface Guidelines for Machine Learning: Apple: Based on practitioner knowledge and experience, emphasizing aspects of UI rather than model functionality4.

Proposals & Other topics

  • /llms.txt: Proposal for an /llms.txt file to guide LLMs in using websites during inference. git [3 Sep 2024] GitHub Repo stars
  • Model Context Protocol (MCP): Anthropic proposes an open protocol for seamless LLM integration with external data and tools. git [26 Nov 2024] GitHub Repo stars

LLM for Robotics: Bridging AI and Robotics

  • PromptCraft-Robotics: Robotics and a robot simulator with ChatGPT integration git [Feb 2023] GitHub Repo stars
  • ChatGPT-Robot-Manipulation-Prompts: A set of prompts for Communication between humans and robots for executing tasks. git [Apr 2023] GitHub Repo stars
  • Siemens Industrial Copilot ref [31 Oct 2023]
  • LeRobot: Hugging Face. LeRobot aims to provide models, datasets, and tools for real-world robotics in PyTorch. git [Jan 2024] GitHub Repo stars
  • Mobile ALOHA: Stanford’s mobile ALOHA robot learns from humans to cook, clean, do laundry. Mobile ALOHA extends the original ALOHA system by mounting it on a wheeled base ref [4 Jan 2024] / ALOHA: A Low-cost Open-source Hardware System for Bimanual Teleoperation.
  • Figure 01 + OpenAI: Humanoid Robots Powered by OpenAI ChatGPT 📺 [Mar 2024]

Awesome demo

  • FRVR Official Teaser: Prompt to Game: AI-powered end-to-end game creation [16 Jun 2023]
  • rewind.ai: Rewind captures everything you’ve seen on your Mac and iPhone [Nov 2023]
  • Vercel announced V0.dev: Make a snake game with chat [Oct 2023]
  • Mobile ALOHA: A day of Mobile ALOHA [4 Jan 2024]
  • groq: An LPU Inference Engine, the LPU is reported to be 10 times faster than NVIDIA’s GPU performance ref [Jan 2024]
  • Sora: Introducing Sora — OpenAI’s text-to-video model [Feb 2024]
  • Oasis: Minecraft clone. Generated by AI in Real-Time. The first playable AI model that generates open-world games. ref git [31 Oct 2024] GitHub Repo stars