feat: Report all fields which fail registry transformation #496

Open
wants to merge 2 commits into
base: main

Conversation

msto
Contributor

@msto msto commented Oct 8, 2024

Hi,

I ran into a bug that was a bit challenging to troubleshoot because the error reporting in to_registry_literal() does not include the name of the failing field, and only sometimes includes the failing value.

The name of each transformed field is only in scope within upsert_record(), so I thought it most sensible to catch the thrown exception there and add the relevant field and value to the error message.

This pattern has the added benefit of validating all of the upserted values before raising, so if multiple fields are malformed, the user sees every malformed field at once instead of one malformed field per execution.
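
A minimal sketch of the collect-then-raise pattern described above, not the actual diff. The exception class is a local stand-in for the SDK's `RegistryTransformerException`, and `to_literal`, `transform_record`, and the `schema` argument are hypothetical names used only for illustration; the real `upsert_record()` and `to_registry_literal()` have their own signatures.

```python
from typing import Any, Dict, List


class RegistryTransformerException(ValueError):
    """Local stand-in for the SDK exception of the same name."""


def to_literal(value: Any, expected: type) -> Any:
    # Hypothetical transformer: rejects values of the wrong Python type.
    if not isinstance(value, expected):
        raise RegistryTransformerException(
            f"cannot convert {type(value).__name__} to {expected.__name__}"
        )
    return value


def transform_record(
    name: str, values: Dict[str, Any], schema: Dict[str, type]
) -> Dict[str, Any]:
    literals: Dict[str, Any] = {}
    errs: List[str] = []
    for field, value in values.items():
        try:
            literals[field] = to_literal(value, schema[field])
        except RegistryTransformerException as e:
            # The field name is only known here, so attach it (and the value).
            errs.append(f"{field}={value!r}: {e}")
    # Raise once, after every field has been checked.
    if errs:
        raise RegistryTransformerException(
            f"Could not upsert record {name}:\n" + "\n".join(errs)
        )
    return literals
```

With this shape, a record with two bad fields produces one exception that names both, rather than failing on the first.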

@msto msto requested a review from ayushkamat as a code owner October 8, 2024 14:02
latch/registry/table.py (outdated)
Comment on lines +502 to +503
    if len(errs) > 0:
        raise RegistryTransformerException(f"Could not upsert record {name}:" + "\n".join(errs))
Contributor

Suggested change
- if len(errs) > 0:
-     raise RegistryTransformerException(f"Could not upsert record {name}:" + "\n".join(errs))
+ if len(errs) > 0:
+     if len(errs) > 9:
+         rest = len(errs) - 9
+         errs = errs[:9] + [f"({rest} error(s) hidden)"]
+     raise RegistryTransformerException(f"Could not upsert record {name}:" + textwrap.indent("\n".join(errs)))
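
For reference, a runnable sketch of the truncation idea in this suggestion. Note that `textwrap.indent` takes a required `prefix` argument, which the suggestion as quoted omits. The function name `format_errors`, the two-space prefix, and the `limit` parameter are assumptions made for illustration, not part of the PR.

```python
import textwrap
from typing import List


def format_errors(name: str, errs: List[str], limit: int = 9) -> str:
    shown = list(errs)  # copy so the caller's list is not modified
    if len(shown) > limit:
        rest = len(shown) - limit
        shown = shown[:limit] + [f"({rest} error(s) hidden)"]
    # textwrap.indent(text, prefix) prepends the prefix to each non-blank line.
    body = textwrap.indent("\n".join(shown), "  ")
    return f"Could not upsert record {name}:\n{body}"
```

Returning the message instead of raising keeps the formatting testable on its own; the caller would wrap the result in the exception.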

Contributor Author

Hey @ayushkamat

Thanks for the feedback. I've incorporated the above suggestion, but I'd really prefer to present an error message that includes all transformation failures.

This exception is raised when uploading records to the Registry. This often occurs in the final stages of a workflow, so if a subset of errors is obscured, the user must re-execute the entire workflow to discover the remaining errors. This can be time-consuming and expensive.

The hope of this PR was to expose all errors at once to avoid this. With that in mind, would you be willing to keep the original implementation?

I do like the suggestion to use textwrap and I'll incorporate that regardless.

Thanks!

Contributor

@ayushkamat ayushkamat Oct 18, 2024

I see your point. My counter-argument is that there are typically very few "structurally unique" errors when doing bulk inserts (e.g. if you have an error like "LatchFile" cannot be assigned to type "str", it's likely that there are 300 more errors with the exact same message). Because of this, I think there is limited value in printing everything every time, and we should still limit the output to avoid spam.

That being said, for now it's fine to print everything out, given that it will take some work to separate out the unique errors.

Feel free to leave it as is and print everything, but please add a TODO to filter this (you can assign me to it).
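
One possible shape for that follow-up filtering, as a rough sketch only: collapse repeated messages and report a count. `summarize_errors` is a hypothetical helper, and it treats two errors as the same only when their text is identical; messages that embed a field name or value would need some normalization first, which is not attempted here.

```python
from collections import Counter
from typing import List


def summarize_errors(errs: List[str]) -> List[str]:
    # Counter is a dict subclass, so keys keep first-seen order (Python 3.7+).
    counts = Counter(errs)
    out: List[str] = []
    for msg, n in counts.items():
        out.append(msg if n == 1 else f"{msg} (x{n})")
    return out
```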

Contributor Author

I see what you're saying.

To be clear - this message is reporting all the errors associated with the insert of a single record, not a batch insert. While it is still unlikely that a single record would yield more than 10 errors (hopefully 🙂 ), I think in this context it would be appropriate to leave the set of errors unfiltered.

I agree that it's sensible to filter error messaging for a bulk insert, but I think that would happen elsewhere upstream.

Co-authored-by: Ayush Kamat <34531970+ayushkamat@users.noreply.github.com>
2 participants