 ____  _               _            _
/ ___|| |__   ___ _ __| | ___   ___| | __
\___ \| '_ \ / _ \ '__| |/ _ \ / __| |/ /
 ___) | | | |  __/ |  | | (_) | (__|   <
|____/|_| |_|\___|_|  |_|\___/ \___|_|\_\
Sherlock is a no-hassle service that handles all the indexing and query needs of a content-brokering Pebble. For example, consider using Sherlock if you need full-text search over Grove content.
Sherlock is currently used by several pebbles in production, but the API is still under continuous development, so expect both behavior and interfaces to change.
- AMQP. Sherlock subscribes to a message queue. Incoming messages are evaluated, normalized and passed via…
- ...HTTP requests to Elasticsearch for indexing.
- Accepts client HTTP search queries, translates each query, passes it to Elasticsearch, and routes the search result back to the client as an HTTP response.
Sherlock needs two external services to function:
brew install rabbitmq
brew install elasticsearch
Stop Elasticsearch
launchctl unload -wF ~/Library/LaunchAgents/homebrew.mxcl.elasticsearch.plist
Add these lines to your elasticsearch.yml (run 'brew info elasticsearch' to locate the file):
# Turn off automatic index creation
action.auto_create_index: false
Start Elasticsearch
launchctl load -wF ~/Library/LaunchAgents/homebrew.mxcl.elasticsearch.plist
git clone git@github.com:bengler/sherlock.git
cd sherlock
bundle install
Run the integration tests to see if Sherlock is playing well with RabbitMQ and Elasticsearch
rspec spec/integration/
Start the update listener
./bin/update_listener start --daemon
Index something:
curl -XPUT 'http://localhost:9200/an_index_of_books/book/1' -d '{"title":"Island","author":"Aldous Huxley", "year_of_publication":"1962"}'
Also, anything added to or updated in Grove is put on the River, where Sherlock picks it up and sends it to Elasticsearch for indexing.
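Note that with action.auto_create_index set to false as configured above, Elasticsearch will not create an_index_of_books on the first PUT, so the index has to exist beforehand unless something else has already created it. A minimal sketch of creating the index and reading the document back, assuming Elasticsearch listens on localhost:9200; ES_URL and both function names are made up for illustration and are not part of Sherlock:

```shell
# Sketch only: ES_URL and both function names are assumptions for illustration.
ES_URL=${ES_URL:-http://localhost:9200}

# Create an empty index explicitly (needed when auto_create_index is false).
create_index() {
  curl -s -XPUT "$ES_URL/$1"
}

# Fetch a single document by index, type and id to confirm it was stored.
get_document() {
  curl -s -XGET "$ES_URL/$1/$2/$3"
}

# Example: create the index before the PUT above, fetch the document after it:
#   create_index an_index_of_books
#   get_document an_index_of_books book 1
```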
You can now do queries such as:
http://sherlock.dev/api/sherlock/v1/search/my_index/*:*/?q=foo+bar
http://sherlock.dev/api/sherlock/v1/search/my_index/*:*
http://sherlock.dev/api/sherlock/v1/search/development/post.greeting:*?limit=100
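The example URLs above appear to share one shape: an index name, a UID pattern, and an optional query string. A small sketch of assembling such a URL for use with curl follows; the function name sherlock_search_url is hypothetical, and the path layout is inferred from the examples rather than from an API specification:

```shell
# Hypothetical helper (not part of Sherlock): assemble a search URL.
# Arguments: HOST INDEX UID_PATTERN [QUERY_STRING]
sherlock_search_url() {
  url="http://$1/api/sherlock/v1/search/$2/$3"
  # Append the query string only when one was given.
  if [ -n "${4:-}" ]; then
    url="$url?$4"
  fi
  printf '%s\n' "$url"
}

# Example; the single quotes keep the shell from expanding '*' itself:
#   curl -s "$(sherlock_search_url sherlock.dev development 'post.greeting:*' 'limit=100')"
```

Quoting matters here because an unquoted `*` or `?` is a glob character in most shells and may be expanded or rejected before curl ever sees the URL.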
Some use cases require predefined mappings. For example, a UID should not be tokenized as ordinary text: the analyzer would treat each dash as whitespace and split the UID into separate terms, so a query for the full UID would return no hits.
If you need to predefine how Elasticsearch analyzes a particular part of your data, append the mapping to config/predefined_es_mappings.json.
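To see why a tokenized UID is a problem, it can help to compare how two built-in analyzers treat the same text. The sketch below assumes an Elasticsearch version that accepts the analyzer as a query parameter with a raw text body, in the same style as the _analyze call under the utility commands. ES_URL, the function name and the sample UID are made up for illustration:

```shell
# Sketch only: ES_URL, analyze_with and the sample UID are assumptions.
ES_URL=${ES_URL:-http://localhost:9200}

# Ask Elasticsearch to tokenize TEXT with the named built-in analyzer.
analyze_with() {
  curl -s -XGET "$ES_URL/_analyze?analyzer=$1" -d "$2"
}

# Example with a made-up UID containing a dash:
#   analyze_with standard 'post:my-realm.blog$1'
#   analyze_with keyword  'post:my-realm.blog$1'
```

The standard analyzer would be expected to break the UID into several tokens, while the keyword analyzer keeps the whole string as a single token, which is the behavior a predefined mapping for UID fields is meant to achieve.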
Drop all indexes
./bin/sherlock drop_all_indices
Empty all queues of messages
./bin/sherlock empty_all_queues
...or purge a single queue directly with
rabbitmqadmin purge queue=response.report.river
Test how text is analyzed
curl -XGET 'localhost:9200/development_dna/_analyze' -d 'as sly as a fox'
Type mapping of fields in an index
curl -XGET 'http://localhost:9200/sherlock_development_apdm/_mapping'