
Refrain from overwriting experiment metadata #1250

Merged 1 commit · Nov 22, 2024

This change makes the end-of-experiment project update non-destructive: the metadata computed when an experiment finishes is merged into the experiment's existing metadata instead of replacing it, and an already-set end_time is preserved rather than overwritten with the current time. The package version is bumped to 0.1.145.
8 changes: 6 additions & 2 deletions python/langsmith/evaluation/_arunner.py
"""V2 Evaluation Interface."""

Check notice on line 1 in python/langsmith/evaluation/_arunner.py

View workflow job for this annotation

GitHub Actions / benchmark

Benchmark results

........... WARNING: the benchmark result may be unstable * the standard deviation (106 ms) is 15% of the mean (709 ms) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. create_5_000_run_trees: Mean +- std dev: 709 ms +- 106 ms ........... WARNING: the benchmark result may be unstable * the standard deviation (153 ms) is 11% of the mean (1.40 sec) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. create_10_000_run_trees: Mean +- std dev: 1.40 sec +- 0.15 sec ........... WARNING: the benchmark result may be unstable * the standard deviation (152 ms) is 11% of the mean (1.41 sec) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. create_20_000_run_trees: Mean +- std dev: 1.41 sec +- 0.15 sec ........... dumps_class_nested_py_branch_and_leaf_200x400: Mean +- std dev: 699 us +- 6 us ........... dumps_class_nested_py_leaf_50x100: Mean +- std dev: 24.9 ms +- 0.2 ms ........... dumps_class_nested_py_leaf_100x200: Mean +- std dev: 106 ms +- 3 ms ........... dumps_dataclass_nested_50x100: Mean +- std dev: 25.5 ms +- 0.2 ms ........... WARNING: the benchmark result may be unstable * the standard deviation (17.1 ms) is 24% of the mean (71.4 ms) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. dumps_pydantic_nested_50x100: Mean +- std dev: 71.4 ms +- 17.1 ms ........... dumps_pydanticv1_nested_50x100: Mean +- std dev: 198 ms +- 3 ms

Check notice on line 1 in python/langsmith/evaluation/_arunner.py — GitHub Actions / benchmark

Comparison against main:

| Benchmark | main | changes |
| --- | --- | --- |
| dumps_pydanticv1_nested_50x100 | 221 ms | 198 ms: 1.12x faster |
| dumps_class_nested_py_leaf_50x100 | 25.3 ms | 24.9 ms: 1.02x faster |
| create_20_000_run_trees | 1.42 sec | 1.41 sec: 1.01x faster |
| create_5_000_run_trees | 714 ms | 709 ms: 1.01x faster |
| create_10_000_run_trees | 1.41 sec | 1.40 sec: 1.01x faster |
| dumps_class_nested_py_branch_and_leaf_200x400 | 700 us | 699 us: 1.00x faster |
| dumps_dataclass_nested_50x100 | 25.3 ms | 25.5 ms: 1.01x slower |
| dumps_class_nested_py_leaf_100x200 | 104 ms | 106 ms: 1.02x slower |
| dumps_pydantic_nested_50x100 | 66.2 ms | 71.4 ms: 1.08x slower |
| Geometric mean | (ref) | 1.01x faster |
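For reference, numbers like the ones above are produced with pyperf. A minimal sketch of how such a benchmark is defined — the workload below is a made-up stand-in for illustration, not the actual create_5_000_run_trees implementation from the benchmark suite:

```python
import pyperf

def create_run_trees() -> None:
    # Stand-in workload; the real benchmark constructs LangSmith run trees.
    _ = [{"id": i, "children": []} for i in range(5_000)]

runner = pyperf.Runner()
runner.bench_func("create_5_000_run_trees", create_run_trees)
```

pyperf runs the workload in multiple worker processes and reports mean +- std dev, which is where the stability warnings come from: when the std dev is a large fraction of the mean, the runs are too noisy to compare confidently.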


```diff
@@ -843,8 +843,12 @@
         project_metadata["dataset_splits"] = await self._get_dataset_splits()
         self.client.update_project(
             experiment.id,
-            end_time=datetime.datetime.now(datetime.timezone.utc),
-            metadata=project_metadata,
+            end_time=experiment.end_time
+            or datetime.datetime.now(datetime.timezone.utc),
+            metadata={
+                **experiment.metadata,
+                **project_metadata,
+            },
         )
```
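The key change is the dict merge: metadata computed at end time is layered over whatever the experiment already carries, instead of replacing it wholesale. A standalone sketch of the semantics — the key/value pairs are hypothetical, except dataset_splits, which appears in the diff above:

```python
# Metadata attached when the experiment was created (hypothetical values).
experiment_metadata = {"git_commit": "abc123", "num_repetitions": 1}
# Metadata computed at end time; dataset_splits is from the diff above,
# its value here is made up.
project_metadata = {"dataset_splits": ["base"], "dataset_version": "2024-11-22"}

# Old behavior: metadata=project_metadata replaced the whole dict, dropping
# git_commit and num_repetitions.
# New behavior: unpacking merges both dicts; on a key conflict, the value
# computed at end time (the right-hand dict) wins.
merged = {**experiment_metadata, **project_metadata}
assert merged == {
    "git_commit": "abc123",
    "num_repetitions": 1,
    "dataset_splits": ["base"],
    "dataset_version": "2024-11-22",
}
```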


8 changes: 6 additions & 2 deletions python/langsmith/evaluation/_runner.py
```diff
@@ -1587,8 +1587,12 @@ def _end(self) -> None:
         project_metadata["dataset_splits"] = self._get_dataset_splits()
         self.client.update_project(
             experiment.id,
-            end_time=datetime.datetime.now(datetime.timezone.utc),
-            metadata=project_metadata,
+            end_time=experiment.end_time
+            or datetime.datetime.now(datetime.timezone.utc),
+            metadata={
+                **experiment.metadata,
+                **project_metadata,
+            },
         )
```
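The sync runner in _runner.py receives the identical fix. The other half of the change is the end_time fallback: since `a or b` returns `b` only when `a` is falsy, and an unset end_time is `None`, an experiment that already has an end time keeps it rather than being re-stamped. A small sketch of that behavior, with resolve_end_time as a hypothetical helper mirroring the expression in the diff:

```python
import datetime
from typing import Optional

def resolve_end_time(existing: Optional[datetime.datetime]) -> datetime.datetime:
    # Mirrors `experiment.end_time or datetime.datetime.now(datetime.timezone.utc)`.
    return existing or datetime.datetime.now(datetime.timezone.utc)

# No end time yet: stamp the current UTC time.
assert resolve_end_time(None).tzinfo is datetime.timezone.utc

# Already ended: the original timestamp is preserved, not overwritten.
ended_at = datetime.datetime(2024, 11, 22, 12, 0, tzinfo=datetime.timezone.utc)
assert resolve_end_time(ended_at) == ended_at
```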


2 changes: 1 addition & 1 deletion python/pyproject.toml
```diff
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "langsmith"
-version = "0.1.144"
+version = "0.1.145"
 description = "Client library to connect to the LangSmith LLM Tracing and Evaluation Platform."
 authors = ["LangChain <support@langchain.dev>"]
 license = "MIT"
```