Hey there, @mathabes! I'm here to assist you with any bugs, questions, or contributions. Let's tackle this issue together! To address the performance issues with the Groq llama3-70b-8192 model, you can try the following optimizations and configurations:

- Scale horizontally (e.g., run multiple Langflow worker processes so concurrent requests don't queue behind each other).
- Scale vertically (allocate more CPU and RAM to the Langflow service on Railway).
- Use a more robust database system than the default SQLite (e.g., PostgreSQL).
- Optimize model-specific settings (e.g., cap the number of generated tokens and trim the instructions sent with each request).

These configurations will help ensure that Langflow can efficiently handle large models by scaling horizontally and vertically, using a more robust database system, and optimizing model-specific settings [1][2]. See the sketch below for one way these settings could be applied.
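A minimal sketch of applying the first three points, assuming the documented `LANGFLOW_WORKERS` and `LANGFLOW_DATABASE_URL` environment variables (verify the names against your Langflow version; the worker count and connection string are placeholders):

```python
import os
import subprocess

# Horizontal scaling: run several Langflow worker processes.
# LANGFLOW_WORKERS is a documented Langflow setting, but confirm it
# against the version you deploy -- treat the name as an assumption.
os.environ["LANGFLOW_WORKERS"] = "4"

# Use PostgreSQL instead of the default SQLite database.
# The connection string is a placeholder -- point it at your own
# Railway PostgreSQL instance.
os.environ["LANGFLOW_DATABASE_URL"] = (
    "postgresql://user:password@host:5432/langflow"
)

# Launch Langflow with the settings above.
subprocess.run(["langflow", "run", "--host", "0.0.0.0", "--port", "7860"])
```

On Railway you would normally set these as service environment variables in the dashboard rather than in code; the subprocess launch here only keeps the sketch self-contained.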
---
Hi, I'm using the Groq llama3-70b-8192 template component to build a conversational agent. My requests are taking longer and longer, now reaching 30 seconds. This only happens when I use the 70b model; with 8b, requests take at most 4 seconds.
I'm hosting Langflow on Railway with a Pro plan. Is there any way to solve this problem?
I'm sending a large volume of instructions in the conversation. Could this be affecting the 70b model?
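One way to check whether the instruction volume is the issue is to time both models directly against the Groq API, outside Langflow. A minimal sketch using the `groq` Python SDK (the oversized system prompt is a hypothetical stand-in for the real instructions; `GROQ_API_KEY` must be set in the environment):

```python
import time

from groq import Groq  # pip install groq

client = Groq()  # reads GROQ_API_KEY from the environment

# Hypothetical stand-in for a large block of conversation instructions.
instructions = "You are a helpful, detailed conversational agent. " * 200

for model in ("llama3-8b-8192", "llama3-70b-8192"):
    start = time.perf_counter()
    client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": "Summarize the rules you were given."},
        ],
        max_tokens=128,  # cap generation so timing mostly reflects prompt size
    )
    print(f"{model}: {time.perf_counter() - start:.1f}s")
```

If the 70b latency grows sharply with prompt length while 8b stays flat, trimming or consolidating the instructions sent with each request is the first thing to try.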