Running large language models on a single GPU for throughput-oriented scenarios.
Updated Oct 28, 2024 - Python
Run Mixtral-8x7B models in Colab or consumer desktops
PyTorch native quantization and sparsity for training and inference
A QoE-Oriented Computation Offloading Algorithm based on Deep Reinforcement Learning (DRL) for Mobile Edge Computing (MEC). The algorithm captures the dynamics of the MEC environment by integrating the Dueling Double Deep Q-Network (D3QN) model with Long Short-Term Memory (LSTM) networks.
A lightweight framework that enables serverless users to reduce their bills by harvesting non-serverless compute resources such as their VMs, on-premise servers, or personal computers.
A framework for IoT devices to offload tasks to the cloud, resulting in efficient computation and decreased cloud costs.
Monero hardware wallet protocol implementation for Trezor, agent
Code for paper "Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI" (MobiCom'22)
Backend.AI Client Library for Python
A Pandas-inspired data analysis project with lazy semantics and query-offloading to SQLite
Tool for creating test environments with Android devices
Offloading Resource-Intensive Tasks to Raspberry Pi (or IoT Devices) Using SSH