Release 2.0.0 #1444
Merged
Conversation
katyhshi approved these changes on Dec 21, 2023:
wowww
jacobbieker pushed a commit to withmartian/-ARCHIVED--router-evals that referenced this pull request on Jan 9, 2024:
Releases 2.0.0 of evals. This is a major version bump because:

* openai-python is bumped to >1.0.0, which reflects a major breaking change for many uses of the repo.
* We haven't released a version since April, so it seems fair to bump to 2.0.0, since there may be significant breaking changes to the code from the last 8 months.

The release has been successfully pushed to PyPI: https://pypi.org/project/evals/. This updates the repo to reflect the new version. In future work, I'll set up a GitHub Action to publish versions to PyPI whenever the version string is bumped.
etr2460 pushed a commit that referenced this pull request on Feb 23, 2024:
Since [the `openai-python` library update](#1444), eval runs are getting flooded with excessive "HTTP/1.1 200 OK" logs from the openai library:

```
junshern@JunSherns-MacBook-Pro ⚒ oaieval gpt-3.5-turbo 2d_movement
[2024-02-15 12:22:08,549] [registry.py:262] Loading registry from /Users/junshern/projects/oss_evals/evals/evals/registry/evals
[2024-02-15 12:22:08,898] [registry.py:262] Loading registry from /Users/junshern/.evals/evals
[2024-02-15 12:22:08,900] [oaieval.py:211] Run started: 240215042208OCODJ2NY
[2024-02-15 12:22:08,949] [data.py:94] Fetching /Users/junshern/projects/oss_evals/evals/evals/registry/data/2d_movement/samples.jsonl
[2024-02-15 12:22:08,949] [eval.py:36] Evaluating 100 samples
[2024-02-15 12:22:08,955] [eval.py:144] Running in threaded mode with 10 threads!
  0%| | 0/100 [00:00<?, ?it/s][2024-02-15 12:22:10,338] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
  1%|██▎ | 1/100 [00:01<02:17, 1.39s/it][2024-02-15 12:22:10,355] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:10,384] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:10,392] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:10,393] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:10,395] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:10,400] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:10,400] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:10,401] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:10,432] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:10,890] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
 11%|█████████████████████████▌ | 11/100 [00:01<00:12, 7.05it/s][2024-02-15 12:22:10,907] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:11,319] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
 13%|██████████████████████████████▏ | 13/100 [00:02<00:13, 6.36it/s][2024-02-15 12:22:11,421] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
 14%|████████████████████████████████▍ | 14/100 [00:02<00:12, 6.65it/s][2024-02-15 12:22:11,463] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:11,504] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:11,524] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:11,542] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
 18%|█████████████████████████████████████████▊ | 18/100 [00:02<00:08, 10.17it/s][2024-02-15 12:22:11,564] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:11,564] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:11,565] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:11,570] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2024-02-15 12:22:11,829] [_client.py:1027] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
...
```

- This PR adds `logging.getLogger("httpx").setLevel(logging.WARNING)` to suppress logs below the WARNING level from `httpx` (the module within `openai` that generates these logs), which quiets them.
- I chose to make the change within `evals/utils/api_utils.py`, since that's closest to where the logs are being generated.

After the change:

```
junshern@JunSherns-MacBook-Pro ⚒ oaieval gpt-3.5-turbo 2d_movement
[2024-02-15 12:22:20,408] [registry.py:262] Loading registry from /Users/junshern/projects/oss_evals/evals/evals/registry/evals
[2024-02-15 12:22:20,762] [registry.py:262] Loading registry from /Users/junshern/.evals/evals
[2024-02-15 12:22:20,763] [oaieval.py:211] Run started: 240215042220QS3AJAVA
[2024-02-15 12:22:20,812] [data.py:94] Fetching /Users/junshern/projects/oss_evals/evals/evals/registry/data/2d_movement/samples.jsonl
[2024-02-15 12:22:20,812] [eval.py:36] Evaluating 100 samples
[2024-02-15 12:22:20,819] [eval.py:144] Running in threaded mode with 10 threads!
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:08<00:00, 11.96it/s]
[2024-02-15 12:22:29,217] [record.py:371] Final report: {'accuracy': 0.09, 'boostrap_std': 0.029618636025313522}. Logged to /tmp/evallogs/240215042220QS3AJAVA_gpt-3.5-turbo_2d_movement.jsonl
[2024-02-15 12:22:29,217] [oaieval.py:228] Final report:
[2024-02-15 12:22:29,217] [oaieval.py:230] accuracy: 0.09
[2024-02-15 12:22:29,217] [oaieval.py:230] boostrap_std: 0.029618636025313522
[2024-02-15 12:22:29,233] [record.py:360] Logged 200 rows of events to /tmp/evallogs/240215042220QS3AJAVA_gpt-3.5-turbo_2d_movement.jsonl: insert_time=15.670ms
```
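The fix described in that commit is plain standard-library `logging`: raising the threshold of the `httpx` logger filters its INFO-level request lines without affecting any other logger. A minimal, self-contained sketch (the log message text here is illustrative, not taken from httpx itself):

```python
import logging

logging.basicConfig(level=logging.INFO)

# httpx emits its per-request lines at INFO level on the "httpx" logger.
httpx_logger = logging.getLogger("httpx")
httpx_logger.info('HTTP Request: POST ... "HTTP/1.1 200 OK"')  # visible before the fix

# The one-line change from the commit: drop everything below WARNING,
# but only for the "httpx" logger, not globally.
httpx_logger.setLevel(logging.WARNING)

print(httpx_logger.isEnabledFor(logging.INFO))     # → False (request spam filtered)
print(httpx_logger.isEnabledFor(logging.WARNING))  # → True  (real problems still shown)
```

Because the level is set on the named logger rather than the root logger, the eval framework's own INFO-level progress messages keep printing as before.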
Linmj-Judy pushed a commit to TablewareBox/evals that referenced this pull request on Feb 27, 2024.
varad-newtuple pushed a commit to varad-newtuple/openai_eval that referenced this pull request on Oct 4, 2024.