Features
- Fully customizable schema and configuration
- Indexing of stdClass objects or arrays according to your schemas
- Scoring of your results, with customizable score boost for each field in your schema
- Tokenization of the data, with ready-to-use tokenizers
- Stemming functionality supporting 12 languages (Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish and Swedish). Huge thanks to Wamania for his stemming library
- Faceting of your fields (based on your schema)
- Fuzzy search with a customizable cost
- A basic autocompletion system
- A caching system that speeds up queries that have already been performed (you can still clear it at any time if needed)
- A simple QueryBuilder to help you create queries faster, with more readable code
- Ability to search for exact values on specific fields, or to run a regular search on them (see QueryBuilder and QuerySegment for more information)
- Ability to find documents that share content with your current results via the Connex Search (see below)
The purpose of the Connex Search is to find documents or tokens that share subjects with the best documents found in the current search.
Note: the example below includes the technical details of the feature. You can skip them and read only its first and last sentences if you just want the general idea.
You are searching for "Star Wars". The search engine will give you the movies you searched for. With the connex feature enabled, the engine will collect every token and its associated score from the following results:
- at least the first 3 (configurable) documents
- every document whose score reaches a threshold of 90% (configurable) of the maximum score
- at most 10 (configurable) documents (if more documents than that have a score above the threshold)
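The selection rule above can be sketched as follows. This is only an illustration written in Python, not the library's code: the function name `select_source_documents`, its parameter names, and the `(doc_id, score)` tuple format are all hypothetical, and it assumes the results are already sorted by descending score and that "matches the threshold" means greater than or equal to it.

```python
def select_source_documents(results, min_docs=3, threshold=0.9, max_docs=10):
    """Pick the documents whose tokens feed the connex search.

    results: list of (doc_id, score) tuples, sorted by descending score.
    All names and defaults here are illustrative assumptions.
    """
    if not results:
        return []

    max_score = results[0][1]  # best score, since results are sorted
    selected = []
    for rank, (doc_id, score) in enumerate(results):
        if len(selected) >= max_docs:
            break  # hard cap on the number of source documents
        if rank < min_docs or score >= threshold * max_score:
            selected.append((doc_id, score))
        else:
            break  # sorted input: every later score is lower still
    return selected


if __name__ == "__main__":
    hits = [("a", 10.0), ("b", 9.5), ("c", 5.0), ("d", 4.0)]
    # "c" is kept only because of the minimum of 3 documents.
    print(select_source_documents(hits))
```

With the defaults, a result list whose scores drop quickly still yields 3 documents, while a long list of near-identical scores is capped at 10.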
Each collected score is then multiplied by the accuracy of the document it comes from, and the results are summed per token. This gives more value to tokens that appear across multiple documents, and the accuracy of each document affects how fast a token's score grows. After this, the engine keeps the 20 (configurable) best tokens (excluding the tokens used for the current search) and performs an internal search for each of these tokens, adding the previously calculated score to the documents found. The feature keeps the 10 (configurable) best documents, excluding those already returned by the regular search.
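The two steps described above (token scoring, then the internal search) can be sketched as below. This is a rough Python illustration under stated assumptions, not the engine's actual implementation: every function and parameter name is hypothetical, "accuracy" is assumed to be the document's score divided by the best score, and each document found by a token is assumed to receive that token's aggregated score.

```python
from collections import defaultdict


def rank_connex_tokens(selected_docs, doc_tokens, query_tokens, max_tokens=20):
    """Aggregate token scores over the selected documents.

    selected_docs: list of (doc_id, score) tuples.
    doc_tokens: dict mapping doc_id to a dict of token -> token score.
    query_tokens: set of tokens used by the current search (excluded).
    """
    if not selected_docs:
        return []
    max_score = max(score for _, score in selected_docs)
    totals = defaultdict(float)
    for doc_id, score in selected_docs:
        # Assumption: accuracy is the score relative to the best score.
        accuracy = score / max_score if max_score else 0.0
        for token, token_score in doc_tokens.get(doc_id, {}).items():
            if token not in query_tokens:
                totals[token] += token_score * accuracy
    # Highest score first; ties are broken alphabetically for stability.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:max_tokens]


def rank_connex_documents(ranked_tokens, search, exclude_ids, max_docs=10):
    """Run one internal search per token and score the documents found.

    search: hypothetical callable mapping a token to an iterable of doc ids.
    exclude_ids: ids already returned by the regular search.
    """
    totals = defaultdict(float)
    for token, token_score in ranked_tokens:
        for doc_id in search(token):
            if doc_id not in exclude_ids:
                totals[doc_id] += token_score
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:max_docs]


if __name__ == "__main__":
    selected = [("sw4", 10.0), ("sw5", 5.0)]
    tokens_by_doc = {
        "sw4": {"star": 2.0, "space": 1.0, "jedi": 2.0},
        "sw5": {"wars": 2.0, "space": 4.0},
    }
    tokens = rank_connex_tokens(selected, tokens_by_doc, {"star", "wars"})
    index = {"space": ["sw4", "alien", "trek"], "jedi": ["sw5", "mandalorian"]}
    docs = rank_connex_documents(tokens, lambda t: index.get(t, []), {"sw4", "sw5"})
    print(tokens)
    print(docs)
```

In this toy data, "space" appears in both source documents, so its contributions add up and it outranks "jedi", which appears in only one.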
The search engine will then return 20 tokens (with their respective scores) and 10 movies related to space and sci-fi content.