| 1 | +#### Integration Steps |
| 2 | + |
| 3 | +To integrate the LLM token rate limiting function provided by Sentinel, complete the following steps: |
| 4 | + |
| 5 | +1. Prepare a Redis instance |
| 6 | + |
| 7 | +2. Configure and initialize Sentinel's runtime environment. |
| 8 | +   1. Only initialization from a YAML file is supported. |
| 9 | + |
| 10 | +3. Define resources (instrumentation points) with the fixed resource type `ResourceType=ResTypeCommon` and traffic type `TrafficType=Inbound`. |
| 11 | + |
| 12 | +4. Load rules as described by the configuration file below. The configuration items include the resource name, rate limiting strategy, specific rule items, Redis configuration, error code, and error message. The following example loads a single rule in code; each field is explained in "Configuration File Description" below. |
| 13 | + |
| 14 | +   ```go |
| 15 | +   _, err = llmtokenratelimit.LoadRules([]*llmtokenratelimit.Rule{ |
| 16 | +       { |
| 17 | +           // Match every resource name (regular expression). |
| 18 | +           Resource: ".*", |
| 19 | +           Strategy: llmtokenratelimit.FixedWindow, |
| 20 | +           SpecificItems: []llmtokenratelimit.SpecificItem{ |
| 21 | +               { |
| 22 | +                   Identifier: llmtokenratelimit.Identifier{ |
| 23 | +                       Type:  llmtokenratelimit.Header, |
| 24 | +                       Value: ".*", |
| 25 | +                   }, |
| 26 | +                   KeyItems: []llmtokenratelimit.KeyItem{ |
| 27 | +                       { |
| 28 | +                           Key: ".*", |
| 29 | +                           Token: llmtokenratelimit.Token{ |
| 30 | +                               Number:        1000, |
| 31 | +                               CountStrategy: llmtokenratelimit.TotalTokens, |
| 32 | +                           }, |
| 33 | +                           Time: llmtokenratelimit.Time{ |
| 34 | +                               Unit:  llmtokenratelimit.Second, |
| 35 | +                               Value: 60, // 1000 total tokens per 60-second window |
| 36 | +                           }, |
| 37 | +                       }, |
| 38 | +                   }, |
| 39 | +               }, |
| 40 | +           }, |
| 41 | +       }, |
| 42 | +   }) |
| 43 | +   ``` |
| 44 | + |
| 45 | +5. Optional: Create an LLM instance and embed it into the provided adapter |
| 46 | + |
| 47 | + |
| 48 | +#### Configuration File Description |
| 49 | + |
| 50 | +Overall rule configuration |
| 51 | + |
| 52 | +| Configuration Item | Type | Required | Default Value | Description | |
| 53 | +| :----------------- | :------------------- | :------- | :------------------ | :----------------------------------------------------------- | |
| 54 | +| enabled | bool | No | false | Whether to enable the LLM Token rate limiting function. Values: false (disable), true (enable) | |
| 55 | +| rules | array of rule object | No | nil | Rate limiting rules | |
| 56 | +| redis | object | No | | Redis instance connection information | |
| 57 | +| errorCode | int | No | 429 | Error code. A value of 0 is replaced with the default, 429 | |
| 58 | +| errorMessage | string | No | "Too Many Requests" | Error message | |
| 59 | + |
| 60 | +rule configuration |
| 61 | + |
| 62 | +| Configuration Item | Type | Required | Default Value | Description | |
| 63 | +| :----------------- | :--------------------------- | :------- | :-------------- | :----------------------------------------------------------- | |
| 64 | +| resource | string | No | ".*" | Rule resource name, supporting regular expressions. Values: ".*" (global match), user-defined regular expressions | |
| 65 | +| strategy | string | No | "fixed-window" | Rate limiting strategy. Values: fixed-window, peta (predictive error temporal allocation) | |
| 66 | +| encoding | object | No | | Token encoding method, **exclusively for peta rate limiting strategy** | |
| 67 | +| specificItems | array of specificItem object | Yes | | Specific rule items | |
| 68 | + |
| 69 | +encoding configuration |
| 70 | + |
| 71 | +| Configuration Item | Type | Required | Default Value | Description | |
| 72 | +| :----------------- | :----- | :------- | :------------ | :-------------------- | |
| 73 | +| provider | string | No | "openai" | Model provider | |
| 74 | +| model | string | No | "gpt-4" | Model name | |
| 75 | + |
| 76 | +specificItem configuration |
| 77 | + |
| 78 | +| Configuration Item | Type | Required | Default Value | Description | |
| 79 | +| :----------------- | :---------------------- | :------- | :------------ | :------------------------------------------- | |
| 80 | +| identifier | object | No | | Request identifier | |
| 81 | +| keyItems | array of keyItem object | Yes | | Key-value information for rule matching | |
| 82 | + |
| 83 | +identifier configuration |
| 84 | + |
| 85 | +| Configuration Item | Type | Required | Default Value | Description | |
| 86 | +| :----------------- | :----- | :------- | :------------ | :----------------------------------------------------------- | |
| 87 | +| type | string | No | "all" | Request identifier type. Values: all (global rate limiting), header | |
| 88 | +| value | string | No | ".*" | Request identifier value, supporting regular expressions. Values: ".*" (global match), user-defined regular expressions | |
| 89 | + |
| 90 | +keyItem configuration |
| 91 | + |
| 92 | +| Configuration Item | Type | Required | Default Value | Description | |
| 93 | +| :----------------- | :----- | :------- | :------------ | :----------------------------------------------------------- | |
| 94 | +| key | string | No | ".*" | Specific rule item value, supporting regular expressions. Values: ".*" (global match), user-defined regular expressions | |
| 95 | +| token | object | Yes | | Token quantity and calculation strategy configuration | |
| 96 | +| time | object | Yes | | Time unit and cycle configuration | |
| 97 | + |
| 98 | +token configuration |
| 99 | + |
| 100 | +| Configuration Item | Type | Required | Default Value | Description | |
| 101 | +| :----------------- | :----- | :------- | :-------------- | :----------------------------------------------------------- | |
| 102 | +| number | int | Yes | | Token quantity, greater than or equal to 0 | |
| 103 | +| countStrategy | string | No | "total-tokens" | Token calculation strategy. Values: input-tokens, output-tokens, total-tokens | |
| 104 | + |
| 105 | +time configuration |
| 106 | + |
| 107 | +| Configuration Item | Type | Required | Default Value | Description | |
| 108 | +| :----------------- | :----- | :------- | :------------ | :----------------------------------------------------------- | |
| 109 | +| unit | string | Yes | | Time unit. Values: second, minute, hour, day | |
| 110 | +| value | int | Yes | | Time value, greater than or equal to 0 | |
| 111 | + |
| 112 | +redis configuration |
| 113 | + |
| 114 | +| Configuration Item | Type | Required | Default Value | Description | |
| 115 | +| :----------------- | :------------------- | :------- | :----------------------------------- | :----------------------------------------------------------- | |
| 116 | +| addrs | array of addr object | No | [{name: "127.0.0.1", port: 6379}] | Redis node services, **see notes below** | |
| 117 | +| username | string | No | Empty string | Redis username | |
| 118 | +| password | string | No | Empty string | Redis password | |
| 119 | +| dialTimeout | int | No | 0 | Maximum waiting time for establishing a Redis connection, unit: milliseconds | |
| 120 | +| readTimeout | int | No | 0 | Maximum waiting time for Redis server response, unit: milliseconds | |
| 121 | +| writeTimeout | int | No | 0 | Maximum time for sending command data to the network connection, unit: milliseconds | |
| 122 | +| poolTimeout | int | No | 0 | Maximum waiting time for getting an idle connection from the connection pool, unit: milliseconds | |
| 123 | +| poolSize | int | No | 10 | Number of connections in the connection pool | |
| 124 | +| minIdleConns | int | No | 5 | Minimum number of idle connections in the connection pool | |
| 125 | +| maxRetries | int | No | 3 | Maximum number of retries for failed operations | |
| 126 | + |
| 127 | +addr configuration |
| 128 | + |
| 129 | +| Configuration Item | Type | Required | Default Value | Description | |
| 130 | +| :----------------- | :----- | :------- | :------------- | :----------------------------------------------------------- | |
| 131 | +| name | string | No | "127.0.0.1" | Redis node service name, a complete [FQDN](https://en.wikipedia.org/wiki/Fully_qualified_domain_name) with service type, e.g., my-redis.dns, redis.my-ns.svc.cluster.local | |
| 132 | +| port | int | No | 6379 | Redis node service port | |
| 133 | + |
| 134 | + |
| 135 | +#### Overall Configuration File Example |
| 136 | + |
| 137 | +```YAML |
| 138 | +version: "v1" |
| 139 | +sentinel: |
| 140 | +  app: |
| 141 | +    name: sentinel-go-demo |
| 142 | +  log: |
| 143 | +    metric: |
| 144 | +      maxFileCount: 7 |
| 145 | +  llmTokenRatelimit: |
| 146 | +    enabled: true |
| 147 | +    rules: |
| 148 | +      - resource: ".*" |
| 149 | +        strategy: "fixed-window" |
| 150 | +        specificItems: |
| 151 | +          - identifier: |
| 152 | +              type: "header" |
| 153 | +              value: ".*" |
| 154 | +            keyItems: |
| 155 | +              - key: ".*" |
| 156 | +                token: |
| 157 | +                  number: 1000 |
| 158 | +                  countStrategy: "total-tokens" |
| 159 | +                time: |
| 160 | +                  unit: "second" |
| 161 | +                  value: 60 |
| 162 | + |
| 163 | +    errorCode: 429 |
| 164 | +    errorMessage: "Too Many Requests" |
| 165 | + |
| 166 | +    redis: |
| 167 | +      addrs: |
| 168 | +        - name: "127.0.0.1" |
| 169 | +          port: 6379 |
| 170 | +      username: "redis" |
| 171 | +      password: "redis" |
| 172 | +      dialTimeout: 5000 |
| 173 | +      readTimeout: 5000 |
| 174 | +      writeTimeout: 5000 |
| 175 | +      poolTimeout: 5000 |
| 176 | +      poolSize: 10 |
| 177 | +      minIdleConns: 5 |
| 178 | +      maxRetries: 3 |
| 179 | +``` |
| 180 | + |
| 181 | + |
| 182 | +#### Notes |
| 183 | + |
| 184 | +- PETA uses tiktoken to estimate input token consumption, which requires downloading or preconfiguring the Byte Pair Encoding (BPE) dictionary |
| 185 | +  - Online mode |
| 186 | +    - tiktoken downloads the encoding files over the network on first use |
| 187 | +  - Offline mode |
| 188 | +    - Prepare cached tiktoken encoding files in advance (**files already processed by tiktoken, not the raw downloads**) and specify their directory via the `TIKTOKEN_CACHE_DIR` environment variable |
| 189 | +- Rule deduplication |
| 190 | +  - In keyItems, entries that differ only in `number` are treated as duplicates, and the latest `number` is retained |
| 191 | +  - In specificItems, only the deduplicated keyItems are retained |
| 192 | +  - In resource, only the latest entry for a given resource is retained |
| 193 | +- Redis configuration |
| 194 | +  - **If the connected Redis runs in cluster mode, addrs must contain at least 2 addresses; otherwise the client defaults to Redis standalone mode and rate limiting fails** |