This repository is a guide for the MiniCPM series of edge-side models, covering inference, quantization, edge deployment, fine-tuning, applications, and technical reports.
MiniCPM Repository | MiniCPM-V Repository | MiniCPM Series Knowledge Base | Chinese Tutorial (中文教程) | Join our Discord and WeChat Group
The MiniCPM edge-side series is jointly open-sourced by ModelBest and OpenBMB, in collaboration with the Tsinghua NLP Lab. It comprises globally leading lightweight, high-performance AI models, including the MiniCPM foundation model and the MiniCPM-V multimodal model. On the language side, MiniCPM's performance has ushered in the "Edge-Side ChatGPT Era"; on the multimodal side, MiniCPM-V delivers GPT-4V-level MLLMs for single-image, multi-image, and video understanding on your phone. The series is already being deployed on edge devices such as smartphones, computers, cars, wearables, VR headsets, and more. For more detailed information about the MiniCPM series, please visit the OpenBMB page.
- Playing with RAG LangChain on 4GB VRAM
- Controllable Text Generation with RLHF
- Function Call
- Build Your Agent Data
- Building an Agent on AIPC-Windows
- Cross-Modality High-Definition Retrieval
- Text Recognition and Localization
- Getting Started with Agents
- Constructing Long-Chain Agents
- Multimodal Document RAG
- MiniCPM Language Model Technical Report
- MiniCPM-V Multimodal Model Technical Report
- Evolution of Attention Mechanisms in MiniCPM
- Architecture Principles of MiniCPM-V Multimodal Model
- Principles of High-Definition Decoding in MiniCPM-V
- GPU
- CPU
- NPU
- Android
- Mac
- Windows
- iOS
- MiniCPM 2.4B_transformers_cuda
- MiniCPM 2.4B_vllm_cuda
- MiniCPM 2.4B_mlx_mac
- MiniCPM 2.4B_ollama_cuda_cpu_mac
- MiniCPM 2.4B_llamacpp_cuda_cpu
- MiniCPM 2.4B_llamacpp_android
- MiniCPM 3.0_vllm_cuda
- MiniCPM 3.0_transformers_cuda_cpu (a minimal transformers sketch follows this list)
- MiniCPM 3.0_llamacpp_cuda_cpu
- MiniCPM 3.0_sglang_cuda
- MiniCPM-Llama3-V 2.5_vllm_cuda
- MiniCPM-Llama3-V 2.5_LMdeploy_cuda
- MiniCPM-Llama3-V 2.5_llamacpp_cuda_cpu
- MiniCPM-Llama3-V 2.5_ollama_cuda_cpu
- MiniCPM-Llama3-V 2.5_transformers_cuda
- MiniCPM-Llama3-V 2.5_xinference_cuda
- MiniCPM-Llama3-V 2.5_swift_cuda
- MiniCPM-V 2.6_vllm_cuda
- MiniCPM-V 2.6_vllm_api_server_cuda
- MiniCPM-V 2.6_llamacpp_cuda_cpu
- MiniCPM-V 2.6_transformers_cuda
- MiniCPM-V 2.6_swift_cuda
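The entries above follow the pattern model_framework_hardware; each links to a full tutorial. As a taste of what the transformers-based guides cover, here is a minimal inference sketch for MiniCPM 3.0 on CUDA. The Hugging Face model id `openbmb/MiniCPM3-4B` and the generation parameters are assumptions; consult the corresponding tutorial for exact settings.

```python
# Minimal sketch: MiniCPM 3.0 inference with Hugging Face transformers on CUDA.
# The model id "openbmb/MiniCPM3-4B" is an assumption; adjust to the checkpoint you use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/MiniCPM3-4B"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # use float16 if your GPU lacks bf16 support
    device_map="cuda",
    trust_remote_code=True,
)

# Build a chat prompt with the model's own template, then generate.
messages = [{"role": "user", "content": "Introduce the MiniCPM series in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```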
- MiniCPM 2.4B AWQ Quantization
- MiniCPM 2.4B GGUF Quantization
- MiniCPM 2.4B GPTQ Quantization
- MiniCPM 2.4B BNB Quantization
- MiniCPM 3.0 AWQ Quantization
- MiniCPM 3.0 GGUF Quantization
- MiniCPM 3.0 GPTQ Quantization
- MiniCPM 3.0 BNB Quantization (see the sketch after this list)
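All four quantization routes trade a little accuracy for a much smaller memory footprint. BNB (bitsandbytes) is the quickest to try, because it quantizes weights at load time instead of producing a pre-quantized checkpoint the way AWQ, GPTQ, and GGUF do. A minimal sketch, assuming the `openbmb/MiniCPM3-4B` checkpoint and standard transformers + bitsandbytes APIs:

```python
# Minimal sketch: loading MiniCPM 3.0 in 4-bit with bitsandbytes (BNB).
# The model id is an assumption; see the linked tutorials for the other quantization routes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4 bits at load time
    bnb_4bit_quant_type="nf4",              # NormalFloat4, a common default for LLMs
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls at runtime
)

model_id = "openbmb/MiniCPM3-4B"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
print(f"Loaded in 4-bit; weight memory ~ {model.get_memory_footprint() / 1e9:.1f} GB")
```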
- XTuner: The Optimal Choice for Efficient Fine-Tuning of MiniCPM
- LLaMA-Factory: One-Click Fine-Tuning Solution for MiniCPM (see the LoRA sketch after this list)
- ChatLLM Framework: Running MiniCPM on CPU
- Datawhale: Rapidly Deploying Open-Source Large Models in a Linux Environment
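Frameworks such as XTuner and LLaMA-Factory wrap parameter-efficient fine-tuning behind configs and a CLI; under the hood, the usual technique is attaching LoRA adapters. A minimal sketch of that idea with the peft library, where the model id and `target_modules` names are assumptions (check your model's attention layer names before copying):

```python
# Minimal sketch: attaching LoRA adapters to MiniCPM for parameter-efficient fine-tuning.
# Frameworks like LLaMA-Factory automate this; the target_modules below are assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "openbmb/MiniCPM3-4B",  # assumed Hugging Face repo id
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=8,                                  # adapter rank: smaller = fewer trainable params
    lora_alpha=16,                        # scaling factor for the adapter update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
# From here, train with transformers.Trainer or your own loop on a chat dataset.
```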
In the spirit of open source, we encourage contributions to this repository, including but not limited to new MiniCPM tutorials, shared user experiences, ecosystem compatibility reports, and model applications. We look forward to contributions from developers that make this open-source repository even better.