-
Notifications
You must be signed in to change notification settings - Fork 384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cache pool abstraction to cache stylesheets #3300
Conversation
Approved Hi @schlessera, Once a parsed stylesheet is stored in a transient, this retrieves it from the transient, instead of parsing it again. I set an Xdebug breakpoint and stepped through |
@@ -399,6 +418,7 @@ public function __construct( DOMDocument $dom, array $args = [] ) { | |||
$this->base_url = untrailingslashit( $guessurl ); | |||
$this->content_url = WP_CONTENT_URL; | |||
$this->xpath = new DOMXPath( $dom ); | |||
$this->cache_pool = new AMP_Cache_Pool( self::CACHE_GROUP ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice dependency injection 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not being injected because I don't think the architecture is currently set up well for injecting dependencies.
For now, I wanted to keep it simple and just directly instantiated here. Hopefully, we'll have injection for v2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, OK. I thought this was dependency injection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dependency injection would be if exterior code would inject "some cache" and this code here would not need to know which exact implementation is being used. This would then decouple this code from the cache, and the cache could be changed/replaced without needing to adapt this code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks!
* @param string $group Optional. Group to use. Defaults to an empty string. | ||
* @param int $size Optional. Size of the cache pool. | ||
*/ | ||
public function __construct( $group = '', $size = self::DEFAULT_POOL_SIZE ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like so far, this class is only instantiated with one argument:
$this->cache_pool = new AMP_Cache_Pool( self::CACHE_GROUP );
...though maybe later this class will be instantiated with 2 arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The size is needed, and it felt obvious to allow this to be set through instantiation. This way, you could also create several pools with differing groups that might have different sizes as well.
*/ | ||
class Test_AMP_Cache_Pool extends WP_UnitTestCase { | ||
|
||
public function test_transients_storing_a_value() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, very thorough testing 😄
} else { | ||
$parsed = get_transient( $cache_group . '-' . $cache_key ); | ||
} | ||
$parsed = $this->cache_pool->get( $cache_key ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Left a few comments. This may make sense to also use in the AMP_Image_Dimension_Extractor
since images can also end up filling up the wp_options
table.
$this->group = $group; | ||
$this->size = $size; | ||
|
||
$this->is_object_cache_available = wp_using_ext_object_cache(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If using an external object cache, then essentially the cache pool is not used, is that right? This assumes that that the persistent object cache has implemented some algorithm for garbage collection.
Aside: Shouldn't WordPress transient system implement some cache replacement policy? If there was such a cache pool in core like you've implemented here, then there wouldn't be a need for this class, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the manual pooling/rotation is only used with transients. Proper object caches have built-in mechanisms for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think WordPress should use something like this with a separate table, transients as they are now are a hack with several performance issues.
// The expiration is to ensure transient doesn't stick around forever since no LRU flushing like with external object cache. | ||
set_transient( $cache_group . '-' . $cache_key, $parsed, MONTH_IN_SECONDS ); | ||
} | ||
$this->cache_pool->set( $cache_key, $parsed ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. 👍
self::$pool_indexes[ $this->group ]++; | ||
|
||
if ( self::$pool_indexes[ $this->group ] >= $this->size ) { | ||
self::$pool_indexes[ $this->group ] = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In regards to #2092 (comment), this would be where we could potentially show an error message to the user, warning the that stylesheet cache pool is overflowing, yes? This should issue a PHP warning to the error log?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is only expressing the "distance travelled". Velocity is distance over time, we don't know anything about the time here. If we hit the end of the pool several times a day, we have a very high velocity. If we only hit it every 3 months, it might just be a site with lots of edits.
Co-Authored-By: Weston Ruter <westonruter@google.com>
Yes, it was already built to be usable in multiple locations, with differing groups and sizes, and the tests already make sure this works as expected. |
Co-Authored-By: Weston Ruter <westonruter@google.com>
I wonder whether the default size of 1000 pool entries (per group) is not too high. The main issue I can see is that we serialize and unserialize the entire pool map. For 1000 entries, this might have a non-negligible performance impact. |
It's hard to know. We somewhat need data to help inform that. This provides the count: wp transient list --fields=name --search='amp-parsed-stylesheet-*' --format=count I've asked someone to run that to get a number from a production site that is not running an object cache. |
Note that a the parsed stylesheets being stored here can come from:
A transient will be created for each 👉 unique 👈 instance of the above to contain the parsed stylesheet information. |
On hold for now as we are awaiting an export of production transient data from an actual site to evaluate. |
Here's my initial observation from looking through the export we got from a bigger site:
|
Ah, interesting. Maybe that's because AMP used to not support inline styles. |
That's mostly why, but there are other reasons. See #1313 which suggests the elimination of moving inline |
These are all coming from inline styles (in |
I'm not quite sure I understand what you mean by the above, specifically the suggestion to "parse the style rules at runtime rather than fire up the PHP CSS Parser". Would that mean they are neither validated/sanitized, not tree-shaken anymore? And what would you then actually do with them at runtime? I seem to be missing something here. |
The inline styles don't need to be tree-shaken, because we know they are going to exist in the page. What I mean is that in amp-wp/includes/sanitizers/class-amp-style-sanitizer.php Lines 2437 to 2439 in 72bd441
And then the rule gets processed: amp-wp/includes/sanitizers/class-amp-style-sanitizer.php Lines 2443 to 2449 in 72bd441
This then causes the PHP CSS Parser to be fired up. The And I think that's all that needs to be done for properties in a |
Okay, I understood now, the term "runtime" was confusing me. I'll open a new issue for that. What is needed now to take this PR across the finish line?
Would this be enough to merge this for now? |
Given that a large majority of the transients are for inline styles, if the PHP CSS Parser were bypassed in favor of a more lightweight parser (e.g. using regex, though this is less robust!) then we could avoid needing to store so many transients and there will be less need to increase the size of the pool. |
I'm just concerned that if a site has 5000 transients and the pool size is 1000, but they don't know they have so many, then there will be very frequent cache misses and they wouldn't know it is happening. |
This 👆. |
I wonder if the PHP CSS Parser is actually sufficiently fast for the simple case if CSS properties in a |
Also because there is no need to parse the selector for tree shaking. |
Closing in favor of #4177 |
This PR adds a cache pool abstraction that abstracts away the differences between the object cache and transients when caching stylesheets.
The transient cache implementation provides a fixed-size pool of cache entries that are rotated to enforce a maximum number of transients and avoid filling up the database.
Note: For the architecture rethink, this should be an interface with 2 separate implementations that will get injected, instead of a single class that contains all implementations.
See #2092