Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pluck(:value).first to avoid loading entire row and using try! #3282

Merged
merged 1 commit into from
Aug 7, 2019

Conversation

JDutil
Copy link
Contributor

@JDutil JDutil commented Jul 22, 2019

Update: The ordering was actually already fixed in #3224 but this continues to improve on the code & query by only loading the value attribute as a String instead of loading the entire database row & initializing a full ActiveRecord object.

Overview

The SQL queries to determine each promotions code or batches of codes
count are currently the slowest queries on the promotions index page.

In our application containing approximately 32 Million codes and
counting they've significantly slowed down the page by approximately
80-100 seconds currently.

This was caused by the SQL's ORDER BY clause trying to sort all the
records as they were being counted which was not necessary. Then when
calling .first to initialize the ActiveRecord object it would add
even further delay to the rendering approximately 5 seconds.

By using pluck(:code) the ORDER BY statement is dropped, and we don't
have to initialize a full ActiveRecord object either.

Checklist:

  • I have followed Pull Request guidelines
  • I have added a detailed description into each commit message
  • I have updated Guides and README accordingly to this change (if needed)
  • I have added tests to cover this change (if needed)

The SQL queries to determine each promotions code or batches of codes
count are currently the slowest queries on the promotions index page.

In our application containing approximately 32 Million codes and
counting they've significantly slowed down the page by approximately
80-100 seconds currently.

This was caused by the SQL's ORDER BY clause trying to sort all the
records as they were being counted which was not necessary. Then when
calling `.first` to initialize the ActiveRecordR object it would add
even further delay to the rendering approximately 5 seconds.

By using `pluck(:code)` the ORDER BY statement is dropped, and we don't
have to initialize a full ActiveRecord object either.
Copy link
Contributor

@mdesantis mdesantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@spaghetticode
Copy link
Member

spaghetticode commented Jul 23, 2019

@JDutil thank you for this PR

Out of curiosity, I'm trying to reproduce the issue, I created ~20K records, and this is what I get:

p = Spree::Promotion.first
p.codes.size
# => 20012 

p.codes.take.try!(:value)
# (15.0ms) SELECT "spree_promotion_codes".* FROM "spree_promotion_codes" WHERE "spree_promotion_codes"."promotion_id" = ? ORDER BY "spree_promotion_codes"."created_at" ASC LIMIT ?  [["promotion_id", 1], ["LIMIT", 1]]

p.codes.pluck(:value).first
# (87.6ms) SELECT "spree_promotion_codes"."value" FROM "spree_promotion_codes" WHERE "spree_promotion_codes"."promotion_id" = ? ORDER BY "spree_promotion_codes"."created_at" ASC  [["promotion_id", 1]]

So, it seems that both ways you get the ORDER clause in the query.

In order to add the ordering clause I had to add a default_scope to Spree::PromotionCode, as standard Solidus does not have ordering for that model.

Can you help me in understanding if my assumptions are wrong, or I'm missing some part? Thank you!

@JDutil
Copy link
Contributor Author

JDutil commented Jul 23, 2019

@spaghetticode that's because the main issue was actually patched by someone else recently for v2.9: #3224

Our application is still running an older version using first instead of take so when I wrote this up I was still thinking Solidus was using first.

I still believe using pluck(:value).first provides a slight performance improvement by not selecting the entire db row & initializing the ActiveRecord object though.

@JDutil
Copy link
Contributor Author

JDutil commented Jul 23, 2019

@spaghetticode ah sorry I didn't read your full comment first I was just assuming you noticed the take vs first change. I'm still running an older version, and using take or pluck doesn't include an order clause. Perhaps the newer versions are adding an order scope that our applications aren't? Also are you using MySQL or Postgres?

@spaghetticode
Copy link
Member

@JDutil I did the test using the standard Solidus sandbox, with SQLite.

@JDutil
Copy link
Contributor Author

JDutil commented Jul 23, 2019

My tests are with Postgres. I'll have to try comparing with sqlite on the latest version, but I'm guessing it's sqlite adding the order by clause to that query. Unless there was something that changed between versions.

@JDutil
Copy link
Contributor Author

JDutil commented Aug 5, 2019

@spaghetticode sorry for the delay getting back to you on this, but I just got around to testing this out on the latest master branch's sandbox for both SQLite & PostgreSQL which both behaved the same for me without an ORDER BY clause:

SQLite:

 >> Spree::Promotion.first.codes.take.try!(:value)
   Spree::Promotion Load (0.2ms)  SELECT  "spree_promotions".* FROM "spree_promotions" ORDER BY "spree_promotions"."id" ASC LIMIT ?  [["LIMIT", 1]]
   Spree::PromotionCode Load (0.2ms)  SELECT  "spree_promotion_codes".* FROM "spree_promotion_codes" WHERE "spree_promotion_codes"."promotion_id" = ? LIMIT ?  [["promotion_id", 1], ["LIMIT", 1]]
 => "testjfghkz"

 >> Spree::Promotion.first.codes.pluck(:value).first
  Spree::Promotion Load (0.1ms)  SELECT  "spree_promotions".* FROM "spree_promotions" ORDER BY "spree_promotions"."id" ASC LIMIT ?  [["LIMIT", 1]]
   (0.2ms)  SELECT "spree_promotion_codes"."value" FROM "spree_promotion_codes" WHERE "spree_promotion_codes"."promotion_id" = ?  [["promotion_id", 1]]
=> "testjfghkz"

PostgresQL:

 >> Spree::Promotion.first.codes.take.try!(:value)
  Spree::Promotion Load (0.7ms)  SELECT  "spree_promotions".* FROM "spree_promotions" ORDER BY "spree_promotions"."id" ASC LIMIT $1  [["LIMIT", 1]]
  Spree::PromotionCode Load (0.6ms)  SELECT  "spree_promotion_codes".* FROM "spree_promotion_codes" WHERE "spree_promotion_codes"."promotion_id" = $1 LIMIT $2  [["promotion_id", 1], ["LIMIT", 1]]
=> "testrxlzrc"
>> Spree::Promotion.first.codes.pluck(:value).first
  Spree::Promotion Load (0.4ms)  SELECT  "spree_promotions".* FROM "spree_promotions" ORDER BY "spree_promotions"."id" ASC LIMIT $1  [["LIMIT", 1]]
   (0.2ms)  SELECT "spree_promotion_codes"."value" FROM "spree_promotion_codes" WHERE "spree_promotion_codes"."promotion_id" = $1  [["promotion_id", 1]]
=> "testrxlzrc"

I'm not sure why you're seeing an ORDER BY clause in your console...

@JDutil
Copy link
Contributor Author

JDutil commented Aug 5, 2019

@spaghetticode actually I think we had another misunderstanding here.

In order to add the ordering clause I had to add a default_scope to Spree::PromotionCode, as standard Solidus does not have ordering for that model.

That's because take and pluck don't add an ordering clause. This code in older versions of solidus uses first which does add an order clause. That looks like this:

>> Spree::Promotion.first.codes.first.try!(:value)
  Spree::Promotion Load (8.4ms)  SELECT  "spree_promotions".* FROM "spree_promotions" ORDER BY "spree_promotions"."id" ASC LIMIT $1  [["LIMIT", 1]]
  Spree::PromotionCode Load (1.2ms)  SELECT  "spree_promotion_codes".* FROM "spree_promotion_codes" WHERE "spree_promotion_codes"."promotion_id" = $1 ORDER BY "spree_promotion_codes"."id" ASC LIMIT $2  [["promotion_id", 1], ["LIMIT", 1]]
=> "testrxlzrc"

When I created this PR it was for the older version using first, which had an order clause. That code however was changed recently to begin using take here:
https://github.com/solidusio/solidus/pull/3224/files

So my PR description was actually misleading because I'm not actually changing that anymore.

What this PR does do though is that it improved the SQL query by only selecting the value attribute from the database record. Which prevents initializing an entire ActiveRecord object, as well as, allows us to drop the try!.

Sorry for the confusion here, but thought I was fixing more than it turned out after rebasing with the #3224 change.

I've updated the PR title & description to better reflect the actual change here.

@JDutil JDutil changed the title Use pluck(:value).first to prevent ORDER BY in SQL query Use pluck(:value).first to avoid loading entire row and using try! Aug 5, 2019
Copy link
Member

@kennyadsl kennyadsl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @JDutil for this change and the clear explanation. 🎉

Copy link
Member

@spaghetticode spaghetticode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JDutil thank you for further clarifying the context, I agree that pluck is an improvement over take 👍

@kennyadsl kennyadsl merged commit 8e8f0c7 into solidusio:master Aug 7, 2019
@JDutil JDutil deleted the use-pluck-first branch August 7, 2019 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants