feat: Add support for documentation sites powered by mkdocs #9

Merged
merged 5 commits into from
Jun 18, 2022
13 changes: 9 additions & 4 deletions README.md
@@ -1,6 +1,6 @@
# ulauncher-docsearch

> Full text search on Documentation sites, powered by [Algolia](https://www.algolia.com/) Docsearch.
> Search in your favorite Documentation sites directly from Ulauncher.

[![Ulauncher Extension](https://img.shields.io/badge/Ulauncher-Extension-green.svg?style=for-the-badge)](https://ext.ulauncher.io/-/github-brpaz-ulauncher-docsearch)
![License](https://img.shields.io/github/license/brpaz/ulauncher-docsearch.svg?style=for-the-badge)
@@ -15,7 +15,7 @@ This extension aims to make documentation search less painful, allowing you to

## Features

This extension allows to easily search on popular documentation websites, that implements [Algolia DocSearch](https://community.algolia.com/docsearch/).
This extension allows you to easily search popular documentation websites built using [Algolia DocSearch](https://community.algolia.com/docsearch/) or [MkDocs](https://www.mkdocs.org/).

The following documentation sites are included by default in this extension:

@@ -61,6 +61,11 @@ The following documentation sites are included by default in this extension:
- Webpack
- Web.dev

It also includes the following using MkDocs:

- Obsidian Dataview
- Nginx ingress controller
- ArgoCD

## Requirements

@@ -86,9 +91,9 @@ Open Ulauncher and type ```docs```. A list of available documentation sites will

Optionally, some documentation sites also support a custom keyword to trigger them directly, without having to type ```docs``` first.

For example, for Vue documentation, you can just type ```vuedocs <query>```.
For example, for Vue documentation, you can just type ```docs:vue <query>```.

You can see the supported keywords in [manifest.json](manifest.json) file.
You can see the available keywords in [manifest.json](manifest.json) file.


## Development
74 changes: 74 additions & 0 deletions data/docsets.json

Large diffs are not rendered by default.

11 changes: 9 additions & 2 deletions docs/add-new-doc.md
@@ -1,16 +1,23 @@

## Add new Documentation

It´s pretty easy to add new documentation to this extension.
It's pretty easy to add new documentation to this extension.

The only requirement is that the Documentation site must use [Algolia DocSearch](https://community.algolia.com/docsearch/) as their search engine. You can check all availalbe sites [here](https://github.com/algolia/docsearch-configs).
The only requirement is that the documentation site must use [Algolia DocSearch](https://community.algolia.com/docsearch/) or [MkDocs](https://www.mkdocs.org/) as its search engine. For Algolia, you can check all available sites [here](https://github.com/algolia/docsearch-configs).

The first step is to add an entry to the `data/docsets.json` file.

### Algolia

The "algolia_index", "algolia_application_id" and "algolia_api_key" properties can be found by inspecting the search request of the original site. For example, for the VueJS documentation, perform a search and look for the following in the "Network" tab of your browser:

![add docs](add-docs.png)
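To tie these together, a docset entry combines the Algolia properties with the display metadata the extension reads. The sketch below is hypothetical: only the three Algolia properties are documented above, so the remaining field names (`name`, `url`, `icon`, `provider`) are assumptions based on how the extension code reads the docset, and all values are placeholders:

```json
{
  "vue": {
    "name": "Vue.js",
    "url": "https://vuejs.org",
    "icon": "icons/docs/vue.png",
    "provider": "algolia",
    "algolia_index": "YOUR_INDEX_NAME",
    "algolia_application_id": "YOUR_APP_ID",
    "algolia_api_key": "YOUR_SEARCH_ONLY_API_KEY"
  }
}
```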


### MkDocs

For MkDocs-based websites you must provide the "search_index_url" property (the key read by the indexer). You can find it by checking the Network tab in your browser's dev tools. It should look similar to this: `https://www.mkdocs.org/search/search_index.json`
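An MkDocs entry is simpler. In the sketch below, only `search_index_url` is taken from the indexer code; the other field names and the sample URLs are assumptions for illustration:

```json
{
  "argocd": {
    "name": "ArgoCD",
    "url": "https://argo-cd.readthedocs.io/en/stable",
    "icon": "icons/docs/argocd.png",
    "provider": "mkdocs",
    "search_index_url": "https://argo-cd.readthedocs.io/en/stable/search/search_index.json"
  }
}
```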

You also need to add an icon to the **icons/docs** folder, whose path should match the one configured in the "docsets.json" file.

## Add local documentation
42 changes: 27 additions & 15 deletions docsearch/extension.py
@@ -9,9 +9,14 @@
from ulauncher.api.shared.action.SetUserQueryAction import SetUserQueryAction
from docsearch.listeners.query_listener import KeywordQueryEventListener
from docsearch.searcher import Searcher
from threading import Thread, Timer
from docsearch.indexers.mkdocs import MkDocsIndexer
from docsearch.providers.constants import PROVIDER_MKDOCS

logger = logging.getLogger(__name__)

MKDOCS_INDEXER_INTERVAL = 7200


class DocsearchExtension(Extension):
    """ Main Extension Class """
@@ -21,6 +26,28 @@ def __init__(self):
        super(DocsearchExtension, self).__init__()
        self.subscribe(KeywordQueryEvent, KeywordQueryEventListener())
        self.searcher = Searcher()
        self.mkdocs_indexer = MkDocsIndexer()

        th = Thread(target=self.index_mkdocs)
        th.daemon = True
        th.start()

    def index_mkdocs(self):

        docsets_to_index = self.searcher.get_docsets_by_provider(
            PROVIDER_MKDOCS)

        for key, doc in docsets_to_index.items():
            try:
                logger.info("Indexing mkdocs docset for %s", key)
                self.mkdocs_indexer.index(key, doc)
            except Exception as e:
                logger.error("Error indexing mkdocs index for %s. Error: %s",
                             key, e)

        timer = Timer(MKDOCS_INDEXER_INTERVAL, self.index_mkdocs)
        timer.daemon = True
        timer.start()
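The startup flow above (index once on a background thread, then re-arm a daemon `Timer` so re-indexing never blocks extension shutdown) can be condensed into a standalone sketch; `schedule_periodic` is a name invented for this example, not part of the extension:

```python
import threading


def schedule_periodic(task, interval):
    """Run `task` immediately, then re-arm a daemon Timer that will
    call this function again after `interval` seconds."""
    task()
    timer = threading.Timer(interval, schedule_periodic, args=(task, interval))
    timer.daemon = True  # do not keep the process alive for the next run
    timer.start()
    return timer
```

Because each run schedules the next one, cancelling the returned timer stops the cycle.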

    def list_docsets(self, event, query):
        """ Displays a list of available docs """
@@ -95,18 +122,3 @@ def search_in_docset(self, docset, query):
                    on_enter=OpenUrlAction(result['url'])))

        return RenderResultListAction(items)

    def get_docset_from_keyword(self, keyword):
        """ Returns a docset matching the extension keyword or None if no matches found """
        kw_id = None
        for key, value in self.preferences.items():
            if value == keyword:
                kw_id = key
                break

        if kw_id:
            kw_parts = kw_id.split("_")
            if len(kw_parts) == 2 and kw_parts[0] == "kw":
                return kw_parts[1]

        return None
Empty file added docsearch/indexers/__init__.py
Empty file.
62 changes: 62 additions & 0 deletions docsearch/indexers/mkdocs.py
@@ -0,0 +1,62 @@
from ulauncher.config import CACHE_DIR
import requests
import logging
import json
import os

logger = logging.getLogger(__name__)

CACHE_FOLDER_PATH = os.path.join(CACHE_DIR, 'ulauncher-docsearch',
                                 'mkdocs-indexes')


class MkDocsIndexError(Exception):
    pass


class MkDocsIndexer(object):

    def index(self, docset_key, docset):
        url = docset["search_index_url"]

        r = requests.get(url)
        if r.status_code != 200:
            raise MkDocsIndexError(
                "Error downloading mkdocs index for %s. HTTP error: %s" %
                (docset_key, r.status_code))

        data = r.json()

        if "docs" not in data:
            return

        items = []

        for doc in data["docs"]:
            items.append(self.map_item(doc))

        index_file = self.get_index_file_path(docset_key)
        with open(index_file, 'w', encoding='utf-8') as f:
            json.dump(items, f)

    def get_index_file_path(self, docset_key):
        index_filename = "%s.json" % docset_key

        if not os.path.exists(CACHE_FOLDER_PATH):
            os.makedirs(CACHE_FOLDER_PATH)

        return os.path.join(CACHE_FOLDER_PATH, index_filename)

    def map_item(self, data):
        return {
            'title': data["title"],
            'description': data["location"],
            'text': self.trim_string(str(data["text"]), 60)
        }

    def trim_string(self, s: str, limit: int, ellipsis='…') -> str:
        s = s.strip()
        if len(s) > limit:
            return s[:limit - 1].strip() + ellipsis

        return s
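The truncation helper can be exercised in isolation; this is a standalone copy of `trim_string` (detached from the class) with a small usage example:

```python
def trim_string(s: str, limit: int, ellipsis: str = '…') -> str:
    # Trim to at most `limit` characters, replacing the tail with an
    # ellipsis when the (stripped) string is longer than the limit.
    s = s.strip()
    if len(s) > limit:
        return s[:limit - 1].strip() + ellipsis
    return s


print(trim_string("short text", 60))    # prints "short text" (unchanged)
print(len(trim_string("x" * 100, 60)))  # prints 60
```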
18 changes: 17 additions & 1 deletion docsearch/listeners/query_listener.py
@@ -10,7 +10,8 @@ def on_event(self, event, extension):

        query = event.get_argument() or ""

        kw_docset = extension.get_docset_from_keyword(event.get_keyword())
        kw_docset = self.get_docset_from_keyword(extension.preferences,
                                                 event.get_keyword())
        if kw_docset:
            return extension.search_in_docset(kw_docset, query)

@@ -22,3 +23,18 @@ def on_event(self, event, extension):
            return extension.search_in_docset(docset, term)

        return extension.list_docsets(event, query)

    def get_docset_from_keyword(self, preferences, keyword):
        """ Returns a docset matching the extension keyword or None if no matches found """
        kw_id = None
        for key, value in preferences.items():
            if value == keyword:
                kw_id = key
                break

        if kw_id:
            kw_parts = kw_id.split("_")
            if len(kw_parts) == 2 and kw_parts[0] == "kw":
                return kw_parts[1]

        return None
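The lookup above relies on a preference-key convention where per-docset keywords are stored under keys of the form `kw_<docset>`. A standalone demonstration (the preference values below are made up for the example):

```python
def get_docset_from_keyword(preferences, keyword):
    # Same logic as the listener method: find the preference whose value
    # is the typed keyword, then accept it only if its key is "kw_<docset>".
    for key, value in preferences.items():
        if value == keyword:
            parts = key.split("_")
            return parts[1] if len(parts) == 2 and parts[0] == "kw" else None
    return None


prefs = {"main_kw": "docs", "kw_vue": "docs:vue", "kw_react": "docs:react"}
print(get_docset_from_keyword(prefs, "docs:vue"))  # prints "vue"
print(get_docset_from_keyword(prefs, "docs"))      # prints "None" (not a kw_* key)
```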
Empty file added docsearch/providers/__init__.py
Empty file.
77 changes: 77 additions & 0 deletions docsearch/providers/algolia.py
@@ -0,0 +1,77 @@
import logging
from typing import List
from .base import BaseProvider
from .constants import PROVIDER_ALGOLIA_DOCSEARCH
from .base import SearchException
from docsearch.mapper import DefaultMapper, VercelMapper, PrismaMapper, TerraformMapper, WebDevMapper
from algoliasearch.search_client import SearchClient
from algoliasearch.exceptions import AlgoliaException

logger = logging.getLogger(__name__)


class AlgoliaProvider(BaseProvider):

    def __init__(self):
        self.result_mappers: List = [
            VercelMapper(),
            TerraformMapper(),
            PrismaMapper(),
            WebDevMapper()
        ]

    def get_name(self):
        return PROVIDER_ALGOLIA_DOCSEARCH

    def search(self, docset_key, docset, term):
        algolia_client = SearchClient.create(docset['algolia_application_id'],
                                             docset['algolia_api_key'])

        index = algolia_client.init_index(docset['algolia_index'])

        try:
            search_results = index.search(term,
                                          self.build_request_options(docset))

            if not search_results['hits']:
                return []

            return self.map_results(docset_key, docset, search_results["hits"])
        except AlgoliaException as e:
            logger.error("Error fetching documentation from algolia: %s", e)
            raise SearchException(
                "Error fetching documentation from algolia: %s" % e)

    def build_request_options(self, docset):
        """
        Allows specifying custom search options for a specific docset.
        Parameters:
            docset (dict): The docset configuration.
        """
        opts = {}

        if "facet_filters" in docset:
            opts = {"facetFilters": docset["facet_filters"]}

        return opts

    def get_results_mapper(self, docset_key):
        """
        Returns the mapper object that will map the specified docset data into the format required by the extension
        """
        for mapper in self.result_mappers:
            if mapper.get_type() == docset_key:
                return mapper

        return DefaultMapper()

    def map_results(self, docset_key, docset_data, results):
        """ Maps the results returned by Algolia Search """

        mapper = self.get_results_mapper(docset_key)
        items = []
        for hit in results:
            mapped_item = mapper.map(docset_data, hit)
            items.append(mapped_item)

        return items
11 changes: 11 additions & 0 deletions docsearch/providers/base.py
@@ -0,0 +1,11 @@
class BaseProvider(object):

    def get_name(self):
        raise NotImplementedError()

    def search(self, docset_key, docset, term):
        raise NotImplementedError()


class SearchException(Exception):
    pass
2 changes: 2 additions & 0 deletions docsearch/providers/constants.py
@@ -0,0 +1,2 @@
PROVIDER_ALGOLIA_DOCSEARCH = "algolia"
PROVIDER_MKDOCS = "mkdocs"
19 changes: 19 additions & 0 deletions docsearch/providers/factory.py
@@ -0,0 +1,19 @@
from typing import List
from .base import BaseProvider
from .algolia import AlgoliaProvider
from .mkdocs import MkDocsProvider


class ProviderFactory(object):

    def __init__(self):
        self.providers: List[BaseProvider] = [
            MkDocsProvider(), AlgoliaProvider()
        ]

    def get(self, name: str) -> BaseProvider:
        for provider in self.providers:
            if provider.get_name() == name:
                return provider

        raise RuntimeError("Provider with name '%s' was not found" % name)
39 changes: 39 additions & 0 deletions docsearch/providers/mkdocs.py
@@ -0,0 +1,39 @@
import functools
import os
import json
from .base import BaseProvider
from .constants import PROVIDER_MKDOCS

from docsearch.indexers.mkdocs import CACHE_FOLDER_PATH


class MkDocsProvider(BaseProvider):

    def get_name(self):
        return PROVIDER_MKDOCS

    def search(self, docset_key, docset, query):

        data = self.read_mkdocs_index_file(docset_key)

        results = []
        for item in data:
            if query.lower() in item["title"].lower():
                results.append({
                    'url': "{}/{}".format(docset["url"], item["description"]),
                    'title': item["title"],
                    'icon': docset['icon'],
                    'category': item['description'],
                })

        return results

    @functools.lru_cache(maxsize=10)
    def read_mkdocs_index_file(self, docset_key):
        file_path = os.path.join(CACHE_FOLDER_PATH, "%s.json" % docset_key)
        with open(file_path) as f:
            return json.load(f)
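MkDocs results are matched with a plain case-insensitive substring test on the page title. That matching step can be isolated as follows; the function name and sample index entries are invented for this sketch:

```python
def match_titles(items, query):
    # Case-insensitive substring match on "title", mirroring the
    # filter inside MkDocsProvider.search above.
    q = query.lower()
    return [item for item in items if q in item["title"].lower()]


index = [
    {"title": "Getting Started", "description": "getting-started/"},
    {"title": "Plugins", "description": "plugins/"},
]
print(match_titles(index, "start"))  # only the "Getting Started" entry
```

Unlike the Algolia provider, there is no relevance ranking here: every cached page whose title contains the query is returned in index order.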