Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add path as an option for uploading attachments #1331

Merged
merged 35 commits into from
Jan 16, 2025

Conversation

isahers1
Copy link
Contributor

No description provided.

Comment on lines +321 to +323
attachments = {
a: v for a, v in self.attachments.items() if isinstance(v, tuple)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done strictly for typing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is v ever not a tuple?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so might be worth doing in a for loop and raising an error if not. I think would be unexpected. But mypy probably knows better than me, so there might be a case we're not handling here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v is never not a tuple here, this is just because of the union types I believe and it thinks that something could be passed here that is never passed

Comment on lines +395 to +399
class Config:
"""Configuration class for the schema."""

arbitrary_types_allowed = True

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed to allow the AttachmentInfo

Comment on lines 275 to 280
acc_parts.append(
(
f"attachment.{op.id}.{n}",
(
None,
open(file_path, "rb"), # type: ignore[arg-type]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we not need to read this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, I believe the multipart endpoint allows a buffer to be passed (I also added a test that should cover this)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty cve worthy so we should be careful about cases where this is allowed/encouraged

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion would be to require either an environment variable of LANGSMITH_DANGEROUSLY_ALLOW_FILESYSTEM or some client init variable saying this. Otherwise throw an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to explicitly close the files the are opened here, using some try, finally pattern. I would structure the code such that the multipart form is assembled and sent within the same method we are opening the file handles:

try:
    # Prepare the files for the multipart request
    # Make the POST request
finally:
    # Explicitly close the files

@@ -370,7 +368,9 @@ class RunBase(BaseModel):
tags: Optional[List[str]] = None
"""Tags for categorizing or annotating the run."""

attachments: Attachments = Field(default_factory=dict)
attachments: Union[Attachments, Dict[str, AttachmentInfo]] = Field(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this related to this pr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah need this for the changes in read_run

Copy link
Member

@efriis efriis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left comments. I think main issue is the arbitrary file read, which I could be convinced isn't a cve if we feel data exfiltration to langsmith is by design. But it feels moderately easy to construct a payload as a user of an app that would make the server stream /etc/passwd to whichever langsmith url you have, which is unideal.

Copy link
Contributor Author

@isahers1 isahers1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to add the flag for runs (creation and updating)

Comment on lines 275 to 280
acc_parts.append(
(
f"attachment.{op.id}.{n}",
(
None,
open(file_path, "rb"), # type: ignore[arg-type]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to explicitly close the files the are opened here, using some try, finally pattern. I would structure the code such that the multipart form is assembled and sent within the same method we are opening the file handles:

try:
    # Prepare the files for the multipart request
    # Make the POST request
finally:
    # Explicitly close the files

@@ -377,6 +378,41 @@ def test_persist_update_run(langchain_client: Client) -> None:
langchain_client.delete_project(project_name=project_name)


def test_update_run_attachments(langchain_client: Client) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be good to add a test case for traceable as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added one in test_runs because that is where the previous traceable tests were

python/langsmith/schemas.py Show resolved Hide resolved
try:
file.close()
except Exception:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably good to log something here

python/langsmith/_internal/_operations.py Outdated Show resolved Hide resolved
f"trace={op.trace_id},id={op.id}",
else:
file_size = os.path.getsize(data)
file = open(data, "rb")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we check dangerously_use_filesystem here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We check it before hand (in client.py). Everywhere this function gets called, dangerously_use_filesystem has already been checked (you should verify this though, I have double checked and am 99% certain but another pass is a good idea)

python/langsmith/client.py Outdated Show resolved Hide resolved
python/langsmith/client.py Show resolved Hide resolved
obi1kenobi and others added 3 commits January 14, 2025 10:03
From the Rust bindings, import the `_serialize_json` function from
`langsmith._internal._serde`, then use it as the default fallback if
`orjson` serialization can't handle some object. This makes the Rust
serialization code equivalent to the `_orjson.dumps()` call inside
`langsmith._internal._serde.dumps_json`.

I will handle UTF surrogate characters in a subsequent PR.
@isahers1 isahers1 marked this pull request as ready for review January 14, 2025 15:35
@baskaryan baskaryan requested a review from efriis January 15, 2025 15:55
python/langsmith/run_helpers.py Show resolved Hide resolved
Co-authored-by: Erick Friis <erick@langchain.dev>
@isahers1 isahers1 changed the base branch from main to py-version-0.3.0 January 16, 2025 01:25
@isahers1 isahers1 merged commit d073f7c into py-version-0.3.0 Jan 16, 2025
23 checks passed
@isahers1 isahers1 deleted the isaac/attachmentspath branch January 16, 2025 01:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants