Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix is_highest logic on resyncs #1548

Merged
merged 1 commit into from
Aug 9, 2023
Merged

Conversation

gerrod3
Copy link
Contributor

@gerrod3 gerrod3 commented Aug 7, 2023

fixes: #1547

Copy link
Member

@mdellweg mdellweg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Content in Pulp is meant to be immutable.
is_highest does not even look right to me when a collection version is in two different repositories.

# Ensure that this collection_version doesn't have is_highest improperly set
if collection_version.is_highest:
collection_version.is_highest = False
if save:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this save trying to tell us?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? I explained the bug in the issue. This if save is to just be consistent with how the function can be called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see if <condition> or save : in other places of this function. I'm really confused whether this flag should tell us to always save the unit, or whether it allows us to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That previous condition is for an odd scenario where none of the collection versions for a collection had is_highest set. The reason why save is conditional is for an optimization since this function is called to update a collection version before it is saved in the CollectionSaver stage or the import_collection task, there is no point in saving the collection twice in a row.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this fixes the issue at hand then.
Can you add some of the explanations as comments to the code, please?

Copy link
Member

@mdellweg mdellweg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really not happy, we ignore that content must be immutable.
What happens if we remove the uniqueness constraint on the is_latest and specifically allow it to be a bit fuzzy at times? A mutable field within a uniqueness constraint seems to be additionally painful. Also what is true for a collection in one repository version does not necessarily be as true in another. Could we move that information elsewhere?

# Ensure that this collection_version doesn't have is_highest improperly set
if collection_version.is_highest:
collection_version.is_highest = False
if save:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this fixes the issue at hand then.
Can you add some of the explanations as comments to the code, please?

@gerrod3
Copy link
Contributor Author

gerrod3 commented Aug 8, 2023

I'm really not happy, we ignore that content must be immutable. What happens if we remove the uniqueness constraint on the is_latest and specifically allow it to be a bit fuzzy at times? A mutable field within a uniqueness constraint seems to be additionally painful. Also what is true for a collection in one repository version does not necessarily be as true in another. Could we move that information elsewhere?

This is how it has been for years... Yes we should refactor it to better be align with Pulp philosophy, not sure how though. For AAH it has been fine to have this field calculated on the model since they only cared about one repository, but with the introduction of repo management it might be time to look at this field again. I think we should file an issue for this and put the design discussion there.

@mdellweg mdellweg merged commit 6f7f087 into pulp:main Aug 9, 2023
@gerrod3 gerrod3 deleted the update-highest-fix branch August 9, 2023 15:01
@patchback
Copy link

patchback bot commented Aug 9, 2023

Backport to 0.17: 💚 backport PR created

✅ Backport PR branch: patchback/backports/0.17/6f7f087aca62923870d7d408bdcdd25fa63d39d7/pr-1548

Backported as #1552

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback bot pushed a commit that referenced this pull request Aug 9, 2023
fixes: #1547
(cherry picked from commit 6f7f087)
gerrod3 added a commit that referenced this pull request Aug 9, 2023
fixes: #1547
(cherry picked from commit 6f7f087)

Co-authored-by: Gerrod <gerrodubben@gmail.com>
Copy link

patchback bot commented Sep 19, 2024

Backport to 0.16: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply 6f7f087 on top of patchback/backports/0.16/6f7f087aca62923870d7d408bdcdd25fa63d39d7/pr-1548

Backporting merged PR #1548 into main

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/pulp/pulp_ansible.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/0.16/6f7f087aca62923870d7d408bdcdd25fa63d39d7/pr-1548 upstream/0.16
  4. Now, cherry-pick PR Fix is_highest logic on resyncs #1548 contents into that branch:
    $ git cherry-pick -x 6f7f087aca62923870d7d408bdcdd25fa63d39d7
    If it'll yell at you with something like fatal: Commit 6f7f087aca62923870d7d408bdcdd25fa63d39d7 is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x 6f7f087aca62923870d7d408bdcdd25fa63d39d7
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in to preserve the patch from PR Fix is_highest logic on resyncs #1548 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/0.16/6f7f087aca62923870d7d408bdcdd25fa63d39d7/pr-1548
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

mdellweg pushed a commit that referenced this pull request Sep 25, 2024
fixes: #1547
(cherry picked from commit 6f7f087)
@mdellweg
Copy link
Member

Manual cherry-pick:
#1984

mdellweg pushed a commit that referenced this pull request Sep 25, 2024
fixes: #1547
(cherry picked from commit 6f7f087)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Collection sync update_highest_version calculation can potentially fail on re-sync
3 participants