This note is for developers of the AMP PHP Library only. You don't need to read it if you are a consumer of the library.
Prerequisites: You should have cloned the Lullabot fork of ampproject/amphtml and installed it for development. Be sure to switch to the php-validator-generated branch. Follow the instructions under "Building a Custom Validator" and run python build.py from inside the validator folder in your console.
What is the validator-generated.php file in this directory and how is it created?
The AMP HTML standard is a complicated set of rules that specifies, for each HTML tag (e.g. <script>), which attributes (and values) it can or must have, what its permissible parent or ancestor tags are, and so forth. This is summarized in the protocol buffer specification.
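To make the kind of rule described above concrete, here is a minimal Python sketch. The TagRule class, its field names, and the example rule for <script> are all hypothetical simplifications made up for illustration; they are not the real specification schema and not the actual AMP rules.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TagRule:
    """Hypothetical, simplified rule for one tag (not the real spec schema)."""
    tag: str
    mandatory_attrs: set = field(default_factory=set)
    allowed_attrs: set = field(default_factory=set)
    mandatory_parent: Optional[str] = None


def check_tag(rule, attrs, parent):
    """Return a list of problems for one occurrence of a tag.

    attrs is a dict of attribute name -> value; parent is the parent tag name.
    """
    problems = []
    present = set(attrs)
    # Attributes the rule requires but the tag does not carry.
    for name in sorted(rule.mandatory_attrs - present):
        problems.append(f"missing mandatory attribute '{name}'")
    # Attributes the tag carries but the rule does not permit.
    permitted = rule.mandatory_attrs | rule.allowed_attrs
    for name in sorted(present - permitted):
        problems.append(f"disallowed attribute '{name}'")
    # Parent constraint, if the rule has one.
    if rule.mandatory_parent and parent != rule.mandatory_parent:
        problems.append(
            f"parent must be '{rule.mandatory_parent}', got '{parent}'"
        )
    return problems


# Made-up example rule, for illustration only.
example_rule = TagRule(
    tag="script",
    mandatory_attrs={"src"},
    allowed_attrs={"async"},
    mandatory_parent="head",
)
for problem in check_tag(example_rule, {"onclick": "x()"}, "body"):
    print(problem)
```

The real specification covers far more than this (attribute values, ancestor constraints, and so on), which is why it is kept in a single machine-readable file rather than hand-coded in each validator.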
Google has written a validator in JavaScript (we will often refer to this validator as the "canonical validator"). The job of this validator is to report problems with AMP HTML standard compliance on HTML documents you feed it. The canonical validator does not understand the protocol buffer specification referred to above, so this .protoascii file must be converted to JavaScript objects to be of any use to the main validator code. This happens when python build.py is executed from inside the validator folder. The code that is eventually executed here is validator_gen.py.
Our AMP PHP library is a ported subset of the canonical validator, written in PHP. It needs to consume a representation of the specification, analogous to how the canonical JavaScript validator does. This is precisely what validator-generated.php is: a representation of the .protoascii AMP HTML specification. Our PHP validator code consumes this PHP code to build an internal set of objects and classes corresponding to the various tags and attributes in HTML (and what is allowed for each of them). These objects and classes are then used by the PHP validator code.
So when you're using the Lullabot fork of ampproject/amphtml and run python build.py (this file is customized in our fork), it in turn runs validator_gen_php.py. This is the Python file that generates the representation in PHP (this file is only available in our fork).
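The general idea of a Python script emitting PHP source can be sketched as follows. This is a toy illustration only, not the code in validator_gen_php.py: the function names php_literal and generate_php are hypothetical, and the sketch emits a plain nested PHP array rather than the objects and classes that the real generated file builds.

```python
def php_literal(value):
    """Render a Python value as PHP source text.

    Supports None, bool, int, str, list/tuple and dict (hypothetical helper).
    """
    if value is None:
        return "null"
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # PHP single-quoted strings only need backslash and quote escaped.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(php_literal(v) for v in value) + "]"
    if isinstance(value, dict):
        pairs = (
            f"{php_literal(str(key))} => {php_literal(val)}"
            for key, val in value.items()
        )
        return "[" + ", ".join(pairs) + "]"
    raise TypeError(f"unsupported type: {type(value).__name__}")


def generate_php(spec):
    """Return the text of a PHP file that returns spec as a nested array."""
    header = "<?php\n// Generated file, do not edit by hand.\n"
    return header + "return " + php_literal(spec) + ";\n"


# Made-up spec fragment, for illustration only.
print(generate_php({"tags": [{"name": "script", "mandatory_parent": "head"}]}))
```

Generating PHP source once, ahead of time, means the PHP side never has to parse the .protoascii file itself; it only has to load a PHP file.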
Specifically, when build.py is run, it runs validator_gen_php.py, which in turn outputs validator-generated.php in the validator/dist folder. We simply copy this validator-generated.php file into the AMP PHP library's src/Spec folder. Once this is done, there is no active dependency on the ampproject/amphtml project to run the AMP PHP library. You would only regenerate this file infrequently, when you want to synchronize our PHP validator with the canonical validator. Note that this method allows our validator to "auto-magically" pick up new rules for tags when it gets a fresh version of validator-generated.php. Of course, this is not perfect for all cases: sometimes there is no alternative to actually changing our validator code if the canonical validator has picked up some more complicated changes.
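The copy step described above could be scripted along these lines. This is a sketch under assumptions: sync_generated_spec is a hypothetical helper that is not part of either project, and the two root paths are whatever your local checkouts happen to be. Only the validator/dist and src/Spec locations come from this note.

```python
import shutil
from pathlib import Path


def sync_generated_spec(amphtml_root, library_root):
    """Copy validator/dist/validator-generated.php into src/Spec.

    amphtml_root: local checkout of the amphtml fork (after build.py has run).
    library_root: local checkout of the AMP PHP library.
    Returns the path of the copied file.
    """
    source = (
        Path(amphtml_root) / "validator" / "dist" / "validator-generated.php"
    )
    target_dir = Path(library_root) / "src" / "Spec"
    if not source.is_file():
        raise FileNotFoundError(
            f"{source} not found; run python build.py in the validator folder first"
        )
    if not target_dir.is_dir():
        raise NotADirectoryError(f"{target_dir} is not a directory")
    # copy2 keeps the file's timestamps, which helps when checking how
    # recently the spec was regenerated.
    return Path(shutil.copy2(source, target_dir))
```

A call would look like sync_generated_spec("path/to/amphtml", "path/to/amp-library"), with both paths replaced by your own checkout locations.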