This is a Nutch plugin for indexing documents using Elasticsearch 2.x.
- Copy this plugin into your Nutch plugin directory, $NUTCH_HOME/src/plugin.
- Add the plugin to the deploy and clean targets in $NUTCH_HOME/src/plugin/build.xml:
<!-- NUTCH_HOME/src/plugin/build.xml -->
<project name="Nutch" default="deploy-core" basedir=".">
  ...
  <target name="deploy">
    ...
    <ant dir="indexer-elastic2" target="deploy"/>
  </target>
  ...
  <target name="clean">
    ...
    <ant dir="indexer-elastic2" target="clean"/>
  </target>
  ...
</project>
- Make sure the plugin name is included in the plugin.includes property of your Nutch configuration file, $NUTCH_HOME/conf/nutch-site.xml:
<!-- NUTCH_HOME/conf/nutch-site.xml -->
<configuration>
  ...
  <property>
    <name>plugin.includes</name>
    <value>urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic|indexer-elastic2</value>
  </property>
  ...
</configuration>
- Recompile Nutch using ant so that the plugin is built and deployed.
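
The copy and compile steps above can be scripted. The following is a minimal sketch, not an official installer: `install_plugin` is a hypothetical helper name, the first argument is assumed to be the path to your checkout of this plugin, and `ant runtime` is assumed to be the build target you want (it is the usual Nutch build target, but check your own build.xml).

```shell
# Sketch: copy the plugin into a Nutch source tree and rebuild.
# install_plugin is a hypothetical helper, not part of Nutch.
install_plugin() {
  plugin_src="$1"   # path to your checkout of this plugin (assumed)
  nutch_home="$2"   # root of the Nutch source tree
  dest="$nutch_home/src/plugin/indexer-elastic2"

  # Refuse to run against something that does not look like a Nutch tree.
  if [ ! -d "$nutch_home/src/plugin" ]; then
    echo "not a Nutch source tree: $nutch_home" >&2
    return 1
  fi

  # Avoid nesting a second copy inside an existing plugin directory.
  if [ -e "$dest" ]; then
    echo "already present: $dest" >&2
    return 1
  fi

  # Step 1: place the plugin next to the other plugins.
  cp -R "$plugin_src" "$dest" || return 1

  # Last step: rebuild Nutch. The target name is an assumption;
  # adjust it if your tree uses a different one.
  (cd "$nutch_home" && ant runtime)
}
```

Make the build.xml and nutch-site.xml edits by hand first, then call it as `install_plugin ./indexer-elastic2 "$NUTCH_HOME"`. The function only copies and builds; it does not touch either configuration file.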