Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RewriteAmpUrls transformer #5078

Closed
wants to merge 11 commits into from

Conversation

schlessera
Copy link
Collaborator

@schlessera schlessera commented Jul 21, 2020

Summary

Adds the RewriteAmpUrls transformer.

This is used for switching URLs to an AMP LTS channel or for adding ESM support when loading the runtime and components.

Fixes ampproject/amp-toolbox-php#20
Fixes #4600

Checklist

  • My pull request is addressing an open issue (please create one otherwise).
  • My code is tested and passes existing tests.
  • My code follows the Engineering Guidelines (updates are often made to the guidelines, check it out periodically).

@schlessera schlessera added Optimizer WS:Perf Work stream for Metrics, Performance and Optimizer labels Jul 21, 2020
@schlessera schlessera added this to the v2.1 milestone Jul 21, 2020
@schlessera schlessera self-assigned this Jul 21, 2020
@google-cla google-cla bot added the cla: yes Signed the Google CLA label Jul 21, 2020
@schlessera schlessera marked this pull request as draft July 21, 2020 16:01
@github-actions
Copy link
Contributor

Plugin builds for 3813325 are ready 🛎️!

@schlessera schlessera changed the title Add/rewrite amp urls transformer Add RewriteAmpUrls transformer Jul 21, 2020
@schlessera
Copy link
Collaborator Author

I found out why all the tests succeed, while it doesn't seem to do anything on the actual site:
The scripts to load AMP components are not yet part of either the sanitizer or the optimizer, they are added in AMP_Theme_Support::ensure_required_markup(), which is running immediately after the Optimizer.

The quick fix would be to move it before the Optimizer. However, the proper fix would be to add automatic extension resolution to either the Sanitizer or the Optimizer.

Note: The Node.JS Optimizer has a transformer for this: https://github.com/ampproject/amp-toolbox/blob/master/packages/optimizer/lib/transformers/AutoExtensionImporter.js

@schlessera
Copy link
Collaborator Author

Preliminary testing

Without ESM:

Image 2020-07-24 at 12 41 29 PM

Image 2020-07-24 at 12 42 37 PM

With ESM:

Image 2020-07-24 at 12 41 58 PM

Image 2020-07-24 at 12 39 25 PM

The difference is not huge but impactful nevertheless.

I assume that this will further improve as the ESM tooling for amphtml will be further refined.

The implementation currently causes the following AMP validation issues:

  • The attribute 'nomodule' may not appear in tag 'amphtml engine v0.js script'. (1x for the runtime)
    For each extension:
  • The attribute 'src' in tag '<extension> extension .js script' is set to the invalid value 'https://cdn.ampproject.org/v0/<extension>-0.1.mjs'.
  • The attribute 'type' in tag '<extension> extension .js script' is set to the invalid value 'module'.
  • Custom JavaScript is not allowed. (=> AMP extension script is not recognized with .mjs file extension, so tagged as custom JS)

Note that the plugin's validation does pass successfully, as we have a chicken&egg problem, due to the fact that the sanitizer & validator are the same thing. The Optimizer would conceptually need to run after the sanitizer but before the validator. As it is now, it runs after both of these, as they can't be separated. Therefore, any validation issues that the Optimizer introduces won't be caught if they would be detected by a sanitizer like the AMP_Tag_And_Attribute_Sanitizer.

@westonruter
Copy link
Member

Note that the plugin's validation does pass successfully, as we have a chicken&egg problem, due to the fact that the sanitizer & validator are the same thing. The Optimizer would conceptually need to run after the sanitizer but before the validator. As it is now, it runs after both of these, as they can't be separated. Therefore, any validation issues that the Optimizer introduces won't be caught if they would be detected by a sanitizer like the AMP_Tag_And_Attribute_Sanitizer.

Yes, I think we need to basically promise that the the Optimizer's transformers are not going to cause invalid output. That's the current state of things. Also, the AMP_Tag_And_Attribute_Sanitizer is explicitly only accepting non-transformed markup anyway. The spec importer will omit any rule that relates to transformed. If we rearrange things to run the AMP_Tag_And_Attribute_Sanitizer after the Optimizer, then it would have to learn how to handle transformed input. We should probably do that eventually, but that will be a big effort.

@CLAassistant
Copy link

CLAassistant commented Oct 8, 2020

CLA assistant check
All committers have signed the CLA.

@schlessera
Copy link
Collaborator Author

This PR needs to be transferred over to https://github.com/ampproject/amp-toolbox-php/.

@westonruter westonruter removed this from the v2.1 milestone Feb 12, 2021
@erwinmombay
Copy link
Member

howdy, all. The validator should now allow these transformations to be valid. Is there a way for us to test this out?

@westonruter
Copy link
Member

@erwinmombay This PR is stale and needs to be re-started on the amp-toolbox-php repo, so it's not really in a testable state. We're working on getting AMP plugin v2.1 out the door (which includes the hero image prerendering), and once that's out we'll be able to focus on other Optimizer improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Signed the Google CLA Optimizer WS:Perf Work stream for Metrics, Performance and Optimizer
Projects
None yet
4 participants