Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: pypi mapping into cached client #3318

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

baszalmstra
Copy link
Contributor

@baszalmstra baszalmstra commented Mar 10, 2025

Refactors the pypi mapping in a more client based approach. The MappingClient has a function that can be called to amend purls. It also serves as the entry point for coalescing of requests and in memory caching.

Also fixes #2615

@ruben-arts
Copy link
Contributor

#3322 This should fix the check-cli-docs job

@baszalmstra baszalmstra marked this pull request as ready for review March 11, 2025 10:40
@baszalmstra baszalmstra added the test:extra_slow Run the extra slow tests label Mar 11, 2025
@wolfv
Copy link
Member

wolfv commented Mar 11, 2025

I asked @mariusvniekerk on Zulip to try this.

Maybe @tobiasfischer or @matthewfeickert could also give it a go to see if that helps with slow PyPI mapping issues that you observed.

@matthewfeickert
Copy link
Contributor

Maybe @tobiasfischer or @matthewfeickert could also give it a go to see if that helps with slow PyPI mapping issues that you observed.

@wolfv I was too slow in testing the original PR @wolfv linked me to on Discord, and the artifacts had expired. You'd like testing of the Updating lock file takes forever Discord questions with the build artifacts from this PR though?

@wolfv
Copy link
Member

wolfv commented Mar 11, 2025

yes, @matthewfeickert - this one is the better implementation. If you could test again with artifacts from this PR here that would be great.

}

/// Given a set of `RepoDataRecord`s, amend the purls for each record.
pub async fn ament_purls(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub async fn ament_purls(
pub async fn amend_purls(

} else if custom_mappings.is_mapping_for_record(record) {
custom_mappings.derive_purls(record).await
} else {
self.derive_purls_from_clients(record).await
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we implement DerivePurls for MappingClinet itself? and inside it will call both hash_mapping or compressed


let derived_purls = if matches!(mapping_source, MappingSource::Disabled) {
Ok(None)
} else if custom_mappings.is_mapping_for_record(record) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a nitpick ( not sure if it was written previously by me 😓 )

we check if custom_mappings = mapping_source, otherwise use default to create it ( meaning that we don't have custom location ) .

here we always will ask is_mapping_for_record, even if the custom mapping is explicitly not present.

can we make it a match on mapping_source, and just act directly based on what type of the mapping we have?

@nichmor
Copy link
Contributor

nichmor commented Mar 12, 2025

I've tested it on a slow mapping example, as reported by @mariusvniekerk .

after profiling with cargo flamegraph

I can't find any samples for ament_purls - this means that we have a huge improvement :) comparing to previous 464 samples for it.

@wolfv
Copy link
Member

wolfv commented Mar 12, 2025

@baszalmstra can we close this PR then? #3079

@baszalmstra
Copy link
Contributor Author

Yep!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test:extra_slow Run the extra slow tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

conda-pypi-map should be relative to pixi.toml?
5 participants