Parser
MetaInfo is a structured object that holds the metadata needed to display a rich information link. Ideally, it is constructed only inside a fetcher, which reads a parsed HTML document and looks for at least one supported protocol to extract data from. This repository ships MetaFetch, which parses an HTML document and extracts its metadata into a MetaInfo, using a dedicated parser for each protocol.
This article focuses on creating a custom parser for a specific protocol. For the initial setup of MetaFetch, please refer to the model's README file.
To extend the metadata protocols that MetaFetch supports, implement a new class based on MetaPropertyParser and register the new parser with MetaFetch before any fetch operation.
To create a new parser, first import the buffer parser library from the model package, which bundles the classes used to construct MetaInfo:
import 'package:oghref_model/buffer_parser.dart';
Once the library has been imported, create a new base (or final) class that extends or mixes in MetaPropertyParser. Two members must be overridden: the propertyNamePrefix getter and the resolveMetaTags method:
// A protocol implementation normally should not be subclassed further, hence `final`.
final class CustomPropertyParser extends MetaPropertyParser {
  // Good practice: declare the parser as a constant object.
  const CustomPropertyParser();

  @override
  String get propertyNamePrefix => "custom";

  @override
  void resolveMetaTags(MetaInfoAssigner assigner, Iterable<PropertyPair> propertyPair) {
    // Further implementation will be explained below.
  }
}
propertyNamePrefix is the prefix of the <meta> property attribute, and it identifies which protocol applies when content is extracted. With the example above, the custom parser recognizes the following <meta> tags (the document must be UTF-8 encoded):
<meta property="custom:name" content="John Doe"/>
<meta property="custom:title" content="Sample text"/>
<meta property="custom:url" content="https://example.com"/>
Tags whose property uses a different prefix are filtered out, regardless of where they appear in the HTML file:
<meta property="custom:name" content="John Doe"/>
<!-- These two tags are ignored -->
<meta property="foo:bar" content="Example"/>
<meta property="doe:var" content="A"/>
<!-- Matching resumes here -->
<meta property="custom:title" content="Sample text"/>
<meta property="custom:url" content="https://example.com"/>
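The filtering behaviour described above can be pictured with a small, self-contained sketch. It does not use the package's own code: PropertyPair is redeclared locally as a two-string record (an assumption based on how it is used in this article), and filterByPrefix is a hypothetical helper written only for illustration, so the way MetaFetch actually performs the filtering may differ.

```dart
// Local stand-in for the package's record type (assumption: two strings).
typedef PropertyPair = (String, String);

/// Hypothetical helper: keeps only the pairs whose property starts with
/// "<prefix>:", preserving their original order.
Iterable<PropertyPair> filterByPrefix(
    Iterable<PropertyPair> pairs, String prefix) {
  return pairs.where((pair) => pair.$1.startsWith('$prefix:'));
}

void main() {
  final pairs = <PropertyPair>[
    ('custom:name', 'John Doe'),
    ('foo:bar', 'Example'),
    ('doe:var', 'A'),
    ('custom:title', 'Sample text'),
    ('custom:url', 'https://example.com'),
  ];

  // Only the three "custom:" pairs are printed.
  for (final pair in filterByPrefix(pairs, 'custom')) {
    print(pair);
  }
}
```

Because the helper filters on the prefix alone, unrelated tags can appear between matching ones without affecting the result.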
The matching tags are parsed into an iterable of records. Each record holds two strings, the property value and the content value, and only records whose property matches the prefix pattern are included. This record type is defined as PropertyPair, and the iterable is passed as one of the parameters of resolveMetaTags.
Given the <meta> example above, the records look like this when printed to the console:
(custom:name, John Doe)
(custom:title, Sample text)
(custom:url, https://example.com)
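For readers new to Dart records, the sketch below shows how such a pair can be read. It assumes PropertyPair is a positional record of two strings, which matches the usage in this article but is redeclared locally here; describe is a hypothetical helper, not part of the package.

```dart
// Local stand-in for the package's record type (assumption: two strings).
typedef PropertyPair = (String, String);

/// Hypothetical helper: formats a pair as "property => content".
String describe(PropertyPair pair) {
  // Positional destructuring binds both fields in one statement.
  final (property, content) = pair;
  return '$property => $content';
}

void main() {
  final PropertyPair pair = ('custom:name', 'John Doe');

  print(pair); // Default string form of the record.
  print(pair.$1); // First positional field (the property).
  print(pair.$2); // Second positional field (the content).
  print(describe(pair));
}
```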
This is the final step in building a new parser: using the extracted content to construct a MetaInfo. The info class is immutable, so its properties cannot be modified once it has been constructed. For that reason, resolveMetaTags receives a MetaInfoAssigner, which collects the final values and is used to construct the MetaInfo afterward.
In this example, the data is assigned with a for-each loop and a switch statement that picks the matching property and performs any type conversion:
// Under `CustomPropertyParser` class.
@override
void resolveMetaTags(MetaInfoAssigner assigner, Iterable<PropertyPair> propertyPair) {
  // Iterate over every matched record.
  for (PropertyPair pair in propertyPair) {
    // Destructure the record into two variables.
    var (property, content) = pair;

    switch (property) {
      case "custom:title":
        assigner.title = content;
        break;
      case "custom:url":
        assigner.url = Uri.tryParse(content);
        break;
    }
  }
}
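To try the loop-and-switch pattern without installing the package, the following sketch swaps in a hypothetical DemoAssigner that exposes only the title and url fields used above. It is not the package's MetaInfoAssigner, and PropertyPair is again a local two-string record, so treat this purely as an illustration of the control flow. It relies on Dart 3 pattern syntax.

```dart
// Local stand-in for the package's record type (assumption: two strings).
typedef PropertyPair = (String, String);

/// Hypothetical stand-in for MetaInfoAssigner with just two fields.
class DemoAssigner {
  String? title;
  Uri? url;
}

/// Mirrors the structure of resolveMetaTags on the stand-in assigner.
void resolve(DemoAssigner assigner, Iterable<PropertyPair> pairs) {
  // Dart 3 allows destructuring directly in the loop header.
  for (final (property, content) in pairs) {
    switch (property) {
      case 'custom:title':
        assigner.title = content;
      case 'custom:url':
        // tryParse returns null instead of throwing on malformed input.
        assigner.url = Uri.tryParse(content);
    }
  }
}

void main() {
  final assigner = DemoAssigner();
  resolve(assigner, [
    ('custom:name', 'John Doe'), // No matching case, so it is skipped.
    ('custom:title', 'Sample text'),
    ('custom:url', 'https://example.com'),
  ]);

  print(assigner.title);
  print(assigner.url);
}
```

Properties without a matching case, such as custom:name here, are silently skipped, which keeps the parser tolerant of tags it does not know about.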
If the protocol carries image, audio or video data, ImageInfoParser, AudioInfoParser and VideoInfoParser must be used, and they must be constructed inside the resolveMetaTags method of the extending class.
For example, given this fragment of an HTML file:
<meta property="custom:image" content="https://example.com/1.jpg"/>
ImageInfoParser should be constructed before any content is assigned. In addition, a check should ensure that the multimedia info is only compiled when at least one critical property has been defined:
// Under `CustomPropertyParser` class.
@override
void resolveMetaTags(MetaInfoAssigner assigner, Iterable<PropertyPair> propertyPair) {
  // Create the ImageInfoParser before any content is assigned.
  final imgParser = ImageInfoParser();

  // Iterate over every matched record.
  for (PropertyPair pair in propertyPair) {
    // Destructure the record into two variables.
    var (property, content) = pair;

    switch (property) {
      // Cases from the previous example are omitted here.
      case "custom:image":
        if (imgParser.isInitalized) {
          // A previous image is still buffered: add it to the assigner.
          assigner.images.add(imgParser.compile());
          // Clear the buffered data so the parser can be reused.
          imgParser.reset();
        } else {
          // First image encountered.
          imgParser.markInitalized();
        }
        imgParser.url = Uri.tryParse(content);
        break;
    }
  }

  // Add any leftover data held by imgParser that has not been appended yet.
  if (imgParser.url != null) {
    assigner.images.add(imgParser.compile());
  }
}
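The buffer-then-flush idea behind this listing can be isolated in a standalone sketch. Everything below is hypothetical and independent of the package: DemoImage stands in for a compiled image entry, collectImages replaces the parser and assigner pair, and the custom:image:width property is invented here only to show why an entry has to stay buffered until the next image begins.

```dart
// Local stand-in for the package's record type (assumption: two strings).
typedef PropertyPair = (String, String);

/// Hypothetical stand-in for one compiled image entry.
class DemoImage {
  const DemoImage(this.url, this.width);
  final Uri url;
  final int? width;
}

/// Each "custom:image" starts a new entry; the invented
/// "custom:image:width" attaches to the entry currently buffered.
List<DemoImage> collectImages(Iterable<PropertyPair> pairs) {
  final images = <DemoImage>[];
  Uri? url;
  int? width;

  // Appends the buffered entry, if any, then clears the buffer.
  void flush() {
    final current = url;
    if (current != null) images.add(DemoImage(current, width));
    url = null;
    width = null;
  }

  for (final (property, content) in pairs) {
    switch (property) {
      case 'custom:image':
        flush(); // Close the previous entry before starting a new one.
        url = Uri.tryParse(content);
      case 'custom:image:width':
        width = int.tryParse(content);
    }
  }
  flush(); // Append whatever is still buffered after the loop.
  return images;
}

void main() {
  final images = collectImages([
    ('custom:image', 'https://example.com/1.jpg'),
    ('custom:image:width', '640'),
    ('custom:image', 'https://example.com/2.jpg'),
  ]);
  for (final image in images) {
    print('${image.url} width=${image.width}');
  }
}
```

The final flush after the loop plays the same role as the leftover check at the end of the listing above: without it, the last image would never be appended.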
Finally, a MetaInfo is created from the data currently assigned to the assigner by calling the parse method.
MetaPropertyParser defines how <meta> tags with an eligible property name prefix are turned into the immutable MetaInfo object. To fetch metadata from an existing website that adopts the protocol, use MetaFetch, with the parser registered beforehand:
void main() {
  MetaFetch.instance
    ..register(const CustomPropertyParser())
    ..primaryPrefix = "custom";
}
The MetaInfo for the specified protocol is then returned by invoking fetchFromHttp.