Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid Post Duplication (View Only) #61

Merged
merged 2 commits into from
Jul 23, 2019
Merged

Conversation

jeffersonrabb
Copy link
Contributor

@jeffersonrabb jeffersonrabb commented Jul 22, 2019

All Submissions:

Changes proposed in this Pull Request:

This PR is an attempt to ensure that multiple articles blocks on the same page will not show the same post more than once. Some questions and caveats:

  • This PR only affects the view (preview/published content). In the editor there will still be duplicate posts. This will be addressed in a separate PR.
  • Currently this logic will be applied to every instance of Newspack Homepage Articles block. Would we need the flexibility to exempt certain blocks from the logic?
  • Blocks will be processed in the order they appear on the page, with higher prioritized stories going to the first ones. It is possible that an unusual layout could be produced where the topmost block is not the visually dominant one, which could lead to unexpected distribution of posts.

This is a partial resolution of #12

How to test the changes in this Pull Request:

  1. Add a few instances of the block to a post or page, with identical query settings.
  2. Observe there are duplicate posts visible in the Editor.
  3. Publish or preview. Observe that there is no duplication.

Other information:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes, as applicable?
  • Have you successfully ran tests with your changes locally?

Copy link
Contributor

@claudiulodro claudiulodro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should work well!

I looked into using offset instead of oversized posts_per_page (for much better performance), but I wasn't able to figure out how to make it work nicely when there are homepage blocks with categories. Ping me if you want to give it a whirl and I'll explain it further, but your current implementation should work well in all situations.

@jeffersonrabb
Copy link
Contributor Author

I looked into using offset instead of oversized posts_per_page (for much better performance), but I wasn't able to figure out how to make it work nicely when there are homepage blocks with categories.

I tried this approach too, but as you say it only really works if all blocks use the exact same query, which they won't. Do you think the oversized posts_per_page will be a significant drag? I was under the impression this won't be too costly, and the most important thing was to avoid using post__not_in, which is the primary rationale for the approach used here.

Copy link
Contributor

@laurelfulford laurelfulford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When testing, I managed to 'trick' the code into screwing up the posts-per-section count -- this is a bit convoluted, so just give me a ping if something doesn't make sense!

Steps to reproduce:

  1. Add at least three new posts; publish date is very important here, so set the second two to have older published dates:
    • the first one should have at least one category (Category A), and set to be published today
    • the second one should have at least two categories (Category A and Category B), and set to be published yesterday
    • the third post should have at least one category (Category B), and be set to have been published the day before yesterday
  2. Set up the homepage to display two blocks. Each block should show only one post.
  3. Set the first block to display Category A, and the second block to display Category B.
  4. Confirm that only two posts are displaying in the editor, one for each block.
  5. Save and publish.
  6. See that the second block is showing two posts -- it's not duplicating one -- it wouldn't without the fix -- but it's miscounting the number to show.

I've put together some screenshots from my test site; I've done the following so it's hopefully easier to tell what the settings are:

  • I've renamed the posts, so they say their assigned categories so it's a bit easier to tell what's happening.
  • I've set each block's section header to read the category that it should show.

In the editor: the block showing 'Category A' (Business & Tech) shows it's newest post; the second block shows 'Category B' (Arts & Culture), but the post there is also assigned to 'Business & Tech'.

Screen Shot 2019-07-22 at 2 17 51 PM

When published, the second block shows a second article, even though it's only set to display one:

Screen Shot 2019-07-22 at 2 17 25 PM

If the second-newest article, the one assigned to both 'Category A' and 'Category B', has the publication date changed so it's the newest article, the correct number of posts will show.

If the first block's post count is increased to two, it will display two posts on the front end, but the second block will still incorrectly show two:

image

I think it has something to do with the query being increased to make up for a post that's not actually being removed/hidden, but I could have that wrong! Just let me know if I can provide more info!

@claudiulodro
Copy link
Contributor

claudiulodro commented Jul 23, 2019

Good sleuthing. I think the issue may be as simple as this being an off-by-one error?

e.g.

$posts_to_show = intval( $attributes['postsToShow'] ); // Let's say postsToShow = 1
$post_counter = 0

// Show posts until $post_counter > $posts_to_show
// Display a post. $post_counter is now 1. Go again since $post_counter > $posts_to_show (1 > 1) is false.
// Display a post. $post_counter is now 2. $post_counter > $posts, so stop showing posts. 2 posts were displayed and we requested only 1.

@jeffersonrabb jeffersonrabb force-pushed the try/deduplicate-view-posts branch from c45305f to 5c5507f Compare July 23, 2019 18:30
@jeffersonrabb
Copy link
Contributor Author

Nice sleuthing both of you! This should fix it: 365d8d1

@laurelfulford
Copy link
Contributor

Looks good to me! I turned my front page into a dumpster fire of multi-category posts; there aren't any dupes, and the post counts are right; parsing through the /wp-admin posts screen, it looks like everything is displaying in order and where expected. 🎉

@jeffersonrabb
Copy link
Contributor Author

OK! Onward to solving the issue in the editor now...

@jeffersonrabb jeffersonrabb merged commit 15a08a8 into master Jul 23, 2019
@jeffersonrabb jeffersonrabb deleted the try/deduplicate-view-posts branch July 23, 2019 19:24
@iamtakashi
Copy link

Onward to solving the issue in the editor now...

Could you let me know where this work is happening? The difference between the editor and the front-end of the site is very confusing at the moment. cc @Automattic/ajax

@jeffersonrabb
Copy link
Contributor Author

Could you let me know where this work is happening?

This editor de-duplication work is occurring in #176 which is a different block. It's conceivable the work in that block could be applied to Homepage Articles, but this isn't being actively explored at the moment. This is the commit where editor de-dupe was first introduced: 672ff19

cc: @georgeh

@matticbot
Copy link
Contributor

🎉 This PR is included in version 1.0.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants