Cyrus Chan edited this page Jan 4, 2024 · 2 revisions

MetaInfo is a structured object that holds the metadata used to display a rich link preview. Ideally, it is constructed only inside a fetcher, which reads a parsed HTML document and extracts data for at least one supported protocol. This repository ships MetaFetch, which parses an HTML document and extracts its metadata into a MetaInfo using a dedicated parser for each protocol.

This article focuses on creating a custom parser for a specific protocol. For the initial setup of MetaFetch, please refer to the model's README file.

Create new property parser

To extend the metadata protocols supported by MetaFetch, implement a new class based on MetaPropertyParser and register the new parser with MetaFetch before fetching.

Import library and extend class

To create a new parser, first import the buffer parser library from the model package, which bundles the classes for constructing MetaInfo:

import 'package:oghref_model/buffer_parser.dart';

Once the library has been imported, create a new base (or final) class that extends or mixes in MetaPropertyParser. Two members must be overridden: the propertyNamePrefix getter and the resolveMetaTags method:

// Parsers for implemented protocols normally should not be extended further.
final class CustomPropertyParser extends MetaPropertyParser {
    // Good practice: declare the parser as a constant object.
    const CustomPropertyParser();

    @override
    String get propertyNamePrefix => "custom";

    @override
    void resolveMetaTags(MetaInfoAssigner assigner, Iterable<PropertyPair> propertyPair) {
        // Further implementation will be explained below.
    }
}

Parsing metadata procedures

propertyNamePrefix is the prefix of the <meta> property attribute that identifies which protocol applies when extracting content. With the example above, the custom parser recognizes these <meta> tags (the document must be UTF-8 encoded):

<meta property="custom:name" content="John Doe"/>
<meta property="custom:title" content="Sample text"/>
<meta property="custom:url" content="https://example.com"/>

and it filters out <meta> tags whose properties use a different prefix, regardless of the order in which they appear in the HTML file:

<meta property="custom:name" content="John Doe"/>
<!-- These will be ignored -->
<meta property="foo:bar" content="Example"/>
<meta property="doe:var" content="A"/>
<!-- Matching resumes here -->
<meta property="custom:title" content="Sample text"/>
<meta property="custom:url" content="https://example.com"/>

These tags are parsed into an iterable of records, each holding two strings: the property value and the content value. Only records whose property matches the prefix are included. The record type is defined as PropertyPair, and the iterable becomes one of the parameters of resolveMetaTags.

With the <meta> example above, the records look like this when printed to the console:

(custom:name, John Doe)
(custom:title, Sample text)
(custom:url, https://example.com)
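
The prefix filtering described above can be sketched in plain Dart without the library. This is only an illustration: the PropertyPair typedef here is a local stand-in that assumes the library's record has the same (property, content) shape, and filterByPrefix and samplePairs are hypothetical names that are not part of oghref_model.

```dart
// Local stand-in for the library's record type: (property, content).
// Assumption: the real PropertyPair has the same two-string shape.
typedef PropertyPair = (String, String);

/// Keeps only the pairs whose property starts with "<prefix>:",
/// preserving document order. Hypothetical helper for illustration only.
Iterable<PropertyPair> filterByPrefix(
    Iterable<PropertyPair> pairs, String prefix) {
  return pairs.where((pair) => pair.$1.startsWith('$prefix:'));
}

/// The <meta> tags from the mixed-prefix example, reduced to pairs.
const List<PropertyPair> samplePairs = [
  ('custom:name', 'John Doe'),
  ('foo:bar', 'Example'),
  ('doe:var', 'A'),
  ('custom:title', 'Sample text'),
  ('custom:url', 'https://example.com'),
];
```

Calling filterByPrefix(samplePairs, 'custom') keeps the three custom: pairs in their original order and drops foo:bar and doe:var.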

Assign data into MetaInfo

This is the final step in building a new parser: using the extracted content to construct MetaInfo. Because MetaInfo is immutable, its properties cannot be modified once it has been constructed. Therefore, a MetaInfoAssigner is passed to resolveMetaTags to collect the final values, and MetaInfo is constructed from it afterward.

In this example, the data is assigned in a for-each loop, with a switch statement that selects the corresponding property and performs any type conversion:

    // Under `CustomPropertyParser` class.

    @override
    void resolveMetaTags(MetaInfoAssigner assigner, Iterable<PropertyPair> propertyPair) {
        // Iterate over every matched pair
        for (PropertyPair pair in propertyPair) {
            // Destructure the record into two variables
            var (property, content) = pair;

            switch (property) {
                case "custom:title":
                    assigner.title = content;
                    break;
                case "custom:url":
                    assigner.url = Uri.tryParse(content);
                    break;
            }
        }
    }

If the properties describe image, audio, or video data, use ImageInfoParser, AudioInfoParser, and VideoInfoParser respectively. They must be constructed inside the resolveMetaTags method of the extending class.

For example, given this fragment of an HTML file:

<meta property="custom:image" content="https://example.com/1.jpg"/>

ImageInfoParser should be constructed before any content is assigned. In addition, a check should ensure that the multimedia info is only compiled once at least one critical property has been set:

    // Under `CustomPropertyParser` class.

    @override
    void resolveMetaTags(MetaInfoAssigner assigner, Iterable<PropertyPair> propertyPair) {
        // Create ImageInfoParser
        final imgParser = ImageInfoParser();

        // Iterate over every matched pair
        for (PropertyPair pair in propertyPair) {
            // Destructure the record into two variables
            var (property, content) = pair;

            switch (property) {
                // Cases from the previous example are omitted here
                case "custom:image":
                    if (imgParser.isInitalized) {
                        // A previous image exists: add it to the assigner
                        assigner.images.add(imgParser.compile());
                        // Wipe the previously assigned data so the parser can be reused.
                        imgParser.reset();
                    } else {
                        // First image encountered
                        imgParser.markInitalized();
                    }

                    imgParser.url = Uri.tryParse(content);
                    break;
            }
        }

        // Flush any leftover data held by imgParser that has not been added to the assigner yet
        if (imgParser.url != null) {
            assigner.images.add(imgParser.compile());
        }
    }
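
The "flush the previous item, then flush the leftover" pattern used above can be tried in isolation. The following is a minimal sketch in plain Dart that does not use ImageInfoParser or MetaInfoAssigner: it keeps a single pending URL instead, and collectImages and the local PropertyPair typedef are hypothetical names introduced only for illustration.

```dart
// Local stand-in for the library's record type: (property, content).
typedef PropertyPair = (String, String);

/// Collects one Uri per "custom:image" pair, in document order.
/// Hypothetical helper that mirrors the flush logic, not a library API.
List<Uri> collectImages(Iterable<PropertyPair> pairs) {
  final images = <Uri>[];
  Uri? pending; // Image started but not yet added to the result.

  for (final (property, content) in pairs) {
    if (property != 'custom:image') continue;

    // A new image starts here, so flush the previous one first.
    if (pending != null) images.add(pending);
    pending = Uri.tryParse(content);
  }

  // Flush the last image, which no later tag triggered.
  if (pending != null) images.add(pending);
  return images;
}
```

Without the final flush, the last image in the document would be silently dropped, because an image is only appended when the next custom:image tag is encountered.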

Finally, MetaInfo is created from the data currently held by the assigner by calling the parse method.

Register into MetaFetch

MetaPropertyParser defines how <meta> tags with an eligible property name prefix are turned into the immutable MetaInfo object. To fetch metadata from an existing website that adopts the protocol, use MetaFetch with the parser already registered:

void main() {
    MetaFetch.instance
        ..register(const CustomPropertyParser())
        ..primaryPrefix = "custom";
}

The MetaInfo for the specified protocol is then returned by invoking fetchFromHttp.