-
Notifications
You must be signed in to change notification settings - Fork 2
/
README
74 lines (58 loc) · 2.97 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
_______________________________________________________________________________
solr-plugins README
Welcome to solr-plugins, a sandbox for writing a custom SOLR update request
processor chain (currently named "docHandler"). Other custom/plugin stuff might
be added in the future, like:
- SearchComponents
- ParserPlugins
- ...
The "docHandler" chain is defined in the bundled solrconfig.xml as:
-snip-
<updateRequestProcessorChain name="docHandler">
<processor class="sandboxes.solrplugins.DownloadingUpdateProcessorFactory" />
<processor class="solr.DistributedUpdateProcessorFactory" />
<processor class="sandboxes.solrplugins.ExtractingUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
<processor class="solr.LogUpdateProcessorFactory" />
</updateRequestProcessorChain>
-snip-
...and performs the following steps:
1. Download the content of some URI. <- CUSTOM
2. Pick a shard (via the cloud distributed indexing hooks). This might
forward the SolrInputDoc to a different shard if the current one isn't
the leader.
3. Extract the text of the resource plus meta-data if possible <- CUSTOM
4. Update the index.
5. Log the update.
Steps 1 and 3 are custom UpdateRequestProcessors. This project contains the
source code and unit-tests for them. Note the downloading happens prior to the
SolrCloud distribution logic. This is so each document is downloaded just once
by the cluster. If the doc was distributed before downloading, each replica
would download unnecessarily.
solr-plugins is built using Maven 3. It also includes Eclipse .project and
.classpath files as well as some .settings if you're interested. Tested with
a SOLR 4 nightly build.
_______________________________________________________________________________
Before you build
1. Install Maven 3
2. Install ${solr-plugins}/lib/shawty-0.9.4.jar (http://code.google.com/p/shawty/)
into your local maven repo:
$ mvn install:install-file -Dfile=lib/shawty-0.9.4.jar -DgroupId=com.googlecode \
-DartifactId=shawty -Dversion=0.9.4 -Dpackaging=jar
You should be ready to build:
$ mvn clean package
_______________________________________________________________________________
Installing the example
Assuming you've gotten this project to build and produce the target zip,
1. Install a SOLR 4 nightly trunk build (http://wiki.apache.org/solr/NightlyBuilds)
2. Make a copy of the SOLR 4 example:
$ cd /path/to/solr-4
$ cp -r example solr-plugin-example
$ cd solr-plugin-example
$ unzip /path/to/solr-plugins/target/solr-plugins-0.0.1-SNAPSHOT-with-dependencies.zip
3. Start SOLR:
$ java -jar start.jar OPTIONS=All # run Jetty with JSP support
4. Visit http://0.0.0.0:8983/solrplugins/feeder.jsp to index some resources
referenced by an ATOM or RSS feed. You should be able to see documents being
downloaded, extracted, and indexed.
5. Search for some documents http://0.0.0.0:8983/solr/select?fl=title,uri&q=*