This is a library for extracting WebClip icon information from a website. It is written in pure Kotlin, so it is available on both the JVM and Android.
A sample app is available in touch-icon-extractor-sample and is also published on the Play Store.
net.mm2d:touchicon:
Core component. It provides all features, using HttpURLConnection for HTTP access and its own parser for HTML.
net.mm2d:touchicon-http-okhttp:
Adapter to use OkHttp for HTTP access.
net.mm2d:touchicon-html-jsoup:
This module is EOL. Its last version is 0.8.3.
jCenter will close in May. As of 0.9.1, distribution has moved from jCenter to mavenCentral.
Please note that the groupID has changed from net.mm2d to net.mm2d.touchicon.
Download from mavenCentral.
The latest version is:
dependencies {
    implementation("net.mm2d.touchicon:touchicon:$touchIconVersion")
    implementation("net.mm2d.touchicon:touchicon-http-okhttp:$touchIconVersion") // Optional: only if you use OkHttp for HTTP access
}
Versions below 0.9.1 were distributed via jCenter. However, jCenter will close, and the old versions have not been migrated to mavenCentral. If you need an older version, please use the GitHub Pages repository.
repositories {
    maven { url = URI("https://ohmae.github.com/maven") }
}
dependencies {
    implementation("net.mm2d:touchicon:$touchIconVersion")
    implementation("net.mm2d:touchicon-http-okhttp:$touchIconVersion") // Optional: only if you use OkHttp for HTTP access
}
Documentation comments are written in KDoc.
val extractor = TouchIconExtractor() // initialize
extractor.userAgent = "user agent string" // optional: set User-Agent
extractor.headers = mapOf("Cookie" to "hoge=fuga") // optional: set additional HTTP headers
extractor.downloadLimit = 10_000 // optional: set download limit (default 64kB).
                                 // <= 0 means no limit
//...
GlobalScope.launch(Dispatchers.Main) {
    val job = async(Dispatchers.IO) {
        extractor.fromPage(siteUrl, true) // Do not call from the Main thread
    }
    //...
}
With RxJava:
//...
Single.fromCallable { extractor.fromPage(url, true) } // Do not call from the Main thread
    .subscribeOn(Schedulers.io())
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe({
        //...
    }, {})
By default, this library uses HttpURLConnection for HTTP access. If you want to use OkHttp, use the touchicon-http-okhttp module.
val extractor = TouchIconExtractor(
    httpClient = OkHttpAdapterFactory.create(OkHttpClient())
)
You may want this library's HTTP access to share a session with other communication. For example, to share a session with a WebView in an Android application, the WebView and this library's HTTP client need to use the same cookies.
For the default HTTP client, which uses HttpURLConnection, implement CookieHandler.
object WebViewCookieHandler : CookieHandler {
    private val cookieManager = CookieManager.getInstance()
    override fun saveCookie(url: String, value: String) {
        cookieManager.setCookie(url, value)
    }
    override fun loadCookie(url: String): String? = cookieManager.getCookie(url)
}
TouchIconExtractor(
    httpClient = SimpleHttpClientAdapterFactory.create(WebViewCookieHandler)
)
For OkHttp, set a CookieJar on the OkHttpClient.
object WebViewCookieJar : CookieJar {
    private val cookieManager = CookieManager.getInstance()
    override fun saveFromResponse(url: HttpUrl, cookies: List<Cookie>) {
        val urlString = url.toString()
        cookies.forEach {
            cookieManager.setCookie(urlString, it.toString())
        }
    }
    override fun loadForRequest(url: HttpUrl): List<Cookie> =
        cookieManager.getCookie(url.toString()).let { cookie ->
            if (cookie.isNullOrEmpty()) {
                emptyList()
            } else {
                cookie.split(";")
                    .filter { it.isNotBlank() }
                    .mapNotNull { Cookie.parse(url, it) }
            }
        }
}
TouchIconExtractor(
    httpClient = OkHttpAdapterFactory.create(
        OkHttpClient.Builder()
            .cookieJar(WebViewCookieJar)
            .build()
    )
)
There are two ways to specify a WebClip icon. This library supports both.
One way is to declare the icons with link tags such as the following in the HTML head.
<link rel="icon" href="/favicon.ico" type="image/x-icon">
<link rel="shortcut icon" href="/favicon.ico">
<link rel="apple-touch-icon" href="/apple-touch-icon.png" sizes="57x57">
<link rel="apple-touch-icon-precomposed" href="/apple-touch-icon-precomposed.png" sizes="80x80">
To get this information, call the following:
extractor.fromPage(url)
This library attempts to download the HTML file from the specified URL. Since only the head is required, the download stops once it exceeds the download limit (64kB by default).
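The idea of stopping a download at a limit can be sketched as follows. This is only an illustration, not the library's implementation: the function name `readAtMost` is hypothetical, and the "0 or less means no limit" rule simply mirrors the behavior described for `downloadLimit`.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.InputStream

// Hypothetical helper (not part of the library): reads at most `limit` bytes
// from `input`. A limit of 0 or less means "no limit".
fun readAtMost(input: InputStream, limit: Int): ByteArray {
    val output = ByteArrayOutputStream()
    val buffer = ByteArray(1024)
    val unlimited = limit <= 0
    var total = 0
    while (unlimited || total < limit) {
        // Never request more bytes than are still allowed.
        val want = if (unlimited) buffer.size else minOf(buffer.size, limit - total)
        val read = input.read(buffer, 0, want)
        if (read < 0) break // end of stream
        output.write(buffer, 0, read)
        total += read
    }
    return output.toByteArray()
}
```

Because the link tags sit near the top of the document, a truncated body is usually enough for the parsing step.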
The library then analyzes the downloaded HTML, extracts only the link tags whose rel attribute is "icon", "shortcut icon", "apple-touch-icon", or "apple-touch-icon-precomposed", parses each of them into a PageIcon instance, and returns the results.
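A rough sketch of the filtering step is shown below. It is an assumption-laden illustration, not the library's own parser: `LinkTag` and `extractIconLinks` are hypothetical names, and the regular expressions only handle double-quoted attribute values.

```kotlin
// Hypothetical type for illustration; the library returns PageIcon instead.
data class LinkTag(val rel: String, val href: String, val sizes: String?)

private val ICON_RELS = setOf(
    "icon", "shortcut icon", "apple-touch-icon", "apple-touch-icon-precomposed"
)
private val LINK_TAG = Regex("<link\\b[^>]*>", RegexOption.IGNORE_CASE)
// Simplification: matches only name="value" pairs with double quotes.
private val ATTRIBUTE = Regex("([\\w-]+)\\s*=\\s*\"([^\"]*)\"")

// Returns the link tags whose rel attribute names one of the icon kinds.
fun extractIconLinks(html: String): List<LinkTag> =
    LINK_TAG.findAll(html).mapNotNull { match ->
        val attrs = ATTRIBUTE.findAll(match.value)
            .associate { it.groupValues[1].lowercase() to it.groupValues[2] }
        val rel = attrs["rel"]?.lowercase() ?: return@mapNotNull null
        val href = attrs["href"] ?: return@mapNotNull null
        if (rel in ICON_RELS) LinkTag(rel, href, attrs["sizes"]) else null
    }.toList()
```

A real HTML parser has to cope with single quotes, unquoted values, and comments, which is why this sketch should not be used as-is.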
Although it is not strictly a WebClip icon, this library can also get icons declared in a Web App Manifest.
The manifest is a JSON file such as the following.
{
  "short_name": "name",
  "name": "Web App Icon",
  "icons": [
    {
      "src": "icon-1x.png",
      "type": "image/png",
      "sizes": "48x48"
    },
    {
      "src": "icon-2x.png",
      "type": "image/png",
      "sizes": "96x96"
    },
    {
      "src": "icon-4x.png",
      "type": "image/png",
      "sizes": "192x192"
    }
  ],
  "start_url": "index.html"
}
The manifest is referenced from the HTML as follows.
<link rel="manifest" href="/manifest.json">
This information is represented as WebAppIcon.
To get this information, call the following:
extractor.fromPage(url, true)
As you might expect, the WebAppIcon instances are returned together with the PageIcon instances.
The other way is simply to put a file with a fixed name, such as "favicon.ico", in the root of the domain. Whether an icon exists cannot be known until you try an HTTP request.
This is inefficient, but there are websites that are still deployed this way. You should try it only if you cannot get an icon by the method in the previous section. Please be aware that this method can be a nuisance to the website administrator.
To get this information, call the following:
extractor.fromDomain(url)
It checks whether each file exists and returns the information for the first one found.
The files are checked in the following order:
- apple-touch-icon-precomposed.png
- apple-touch-icon.png
- favicon.ico
Once a file is found, the remaining files are not checked.
If you do not need the precomposed icon, call the following:
extractor.fromDomain(url, false)
The files are then checked in the following order:
- apple-touch-icon.png
- favicon.ico
Sometimes the size is included in the file name, as in "apple-touch-icon-120x120.png". In that case, pass the sizes to check:
extractor.fromDomain(url, true, listOf("120x120", "72x72"))
The files are then checked in the following order:
- apple-touch-icon-120x120-precomposed.png
- apple-touch-icon-120x120.png
- apple-touch-icon-72x72-precomposed.png
- apple-touch-icon-72x72.png
- apple-touch-icon-precomposed.png
- apple-touch-icon.png
- favicon.ico
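The three orderings listed above follow one pattern, which can be sketched as below. The function `candidateFileNames` and its parameter names are hypothetical and only reproduce the documented order; they are not taken from the library's source.

```kotlin
// Hypothetical helper: builds the file names in the order they would be checked.
fun candidateFileNames(withPrecomposed: Boolean, sizes: List<String>): List<String> {
    val names = mutableListOf<String>()
    // The trailing empty string stands for the names without size information.
    for (size in sizes + "") {
        val base = if (size.isEmpty()) "apple-touch-icon" else "apple-touch-icon-$size"
        if (withPrecomposed) names += "$base-precomposed.png"
        names += "$base.png"
    }
    // favicon.ico is always the last candidate.
    names += "favicon.ico"
    return names
}
```

Each additional size adds up to two more HTTP requests in the worst case, so it is worth keeping the size list short.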
There is also a method that gathers all of this information (TouchIconExtractor#listFromDomain()).
It is intended for debugging and verification; using it in production is strongly discouraged.
Often you can get more than one icon. Which icon is the most appropriate depends on the application, but this library provides several Comparators.
val icons = extractor.fromDomain(url, true, listOf("120x120", "72x72"))
val bestIcon1 = icons.maxWith(IconComparator.SIZE) // Compare by size (the largest icon is the best)
val bestIcon2 = icons.maxWith(IconComparator.REL_SIZE) // Compare by rel; if equal, compare by size
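To illustrate what a size-based comparison might look like, here is a self-contained sketch. `SimpleIcon`, `area`, `bySize`, and `largest` are hypothetical names invented for this example; the real IconComparator may rank icons differently.

```kotlin
// Hypothetical stand-in for an icon with a "WIDTHxHEIGHT" sizes string.
data class SimpleIcon(val url: String, val sizes: String)

// Parses "WIDTHxHEIGHT" into an area; returns 0 when the string is malformed.
fun area(sizes: String): Int {
    val parts = sizes.lowercase().split("x")
    if (parts.size != 2) return 0
    val width = parts[0].toIntOrNull() ?: return 0
    val height = parts[1].toIntOrNull() ?: return 0
    return width * height
}

// Orders icons by area, so the maximum is the largest icon.
val bySize: Comparator<SimpleIcon> = compareBy<SimpleIcon> { area(it.sizes) }

// Returns the largest icon, or null when the list is empty.
fun largest(icons: List<SimpleIcon>): SimpleIcon? = icons.maxWithOrNull(bySize)
```

Using maxWithOrNull keeps the empty-list case explicit, which is useful because a site may provide no icons at all.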
大前 良介 (OHMAE Ryosuke) http://www.mm2d.net/