amir343/grape

Grape

Grape is a collection of document clustering algorithms written in Scala. It uses Apache OpenNLP to extract specific features from each document and builds the vector space that the different clustering approaches operate on. Grape currently contains the following algorithms:

  • K-Means Clustering
  • Hierarchical Agglomerative Clustering
  • Buckshot Clustering
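
To make the idea of a document vector space concrete, here is a minimal sketch in plain Scala. It is not Grape's implementation: Grape relies on Apache OpenNLP for feature extraction, whereas this sketch assumes simple lowercase word tokens as features. The names `VectorSpaceSketch`, `tokenize`, `vectorize` and `cosine` are hypothetical and exist only for illustration.

```scala
object VectorSpaceSketch {
  // Assumption: lowercase word tokens stand in for NLP-selected features.
  def tokenize(text: String): List[String] =
    text.toLowerCase.split("\\W+").filter(_.nonEmpty).toList

  // Build one term-frequency vector per document over a shared, sorted vocabulary.
  def vectorize(docs: List[(String, String)]): (Vector[String], Map[String, Vector[Double]]) = {
    val tokens = docs.map { case (id, content) => id -> tokenize(content) }
    val vocabulary = tokens.flatMap(_._2).distinct.sorted.toVector
    val vectors = tokens.map { case (id, ts) =>
      val counts = ts.groupBy(identity).map { case (t, occ) => t -> occ.size.toDouble }
      id -> vocabulary.map(t => counts.getOrElse(t, 0.0))
    }.toMap
    (vocabulary, vectors)
  }

  // Cosine similarity; defined here as 0.0 when either vector is all zeros.
  def cosine(a: Vector[Double], b: Vector[Double]): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norm = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norm == 0.0) 0.0 else dot / norm
  }
}
```

Each document becomes a vector with one dimension per vocabulary term, so clustering algorithms can compare documents with a measure such as cosine similarity.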

How to use

An example of how to use K-means clustering on your documents:

import com.jayway.textmining.{NLPFeatureSelection, Cluster, KMeanCluster}

// number of clusters
val k = ...

// A document is a pair of (Document ID, Document Content). ID can be anything.
val docs: List[(String, String)] = ...

// NLPFeatureSelection supplies the feature extraction used to build the document vectors.
val kMeanCluster = new KMeanCluster(docs, k) with NLPFeatureSelection
val clusters: List[Cluster] = kMeanCluster.doCluster()
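
For readers unfamiliar with what a k-means step does to the document vectors, the following is a rough sketch of Lloyd's algorithm in plain Scala. It is an illustration under stated assumptions, not the code behind `KMeanCluster`: the object `KMeansSketch`, its `cluster` method, the `maxIterations` parameter and the choice of the first `k` points as initial centroids are all hypothetical.

```scala
object KMeansSketch {
  type Point = Vector[Double]

  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Component-wise mean of a non-empty group of points.
  def mean(points: Vector[Point]): Point =
    points.transpose.map(_.sum / points.size)

  // Assumption: the first k points are used as initial centroids (simple and deterministic).
  def cluster(points: Vector[Point], k: Int, maxIterations: Int = 100): Map[Int, Vector[Point]] = {
    require(k > 0 && k <= points.size, "k must be between 1 and the number of points")

    // Assign every point to the index of its nearest centroid.
    def assign(centroids: Vector[Point]): Map[Int, Vector[Point]] =
      points.groupBy(p => centroids.indices.minBy(i => squaredDistance(p, centroids(i))))

    @annotation.tailrec
    def loop(centroids: Vector[Point], iteration: Int): Vector[Point] = {
      val groups = assign(centroids)
      // Recompute each centroid; keep the old one if its cluster is empty.
      val updated = centroids.indices
        .map(i => groups.get(i).map(mean).getOrElse(centroids(i)))
        .toVector
      if (updated == centroids || iteration >= maxIterations) updated
      else loop(updated, iteration + 1)
    }

    assign(loop(points.take(k), 1))
  }
}
```

The result maps each centroid index to the points assigned to it. Real implementations usually pick initial centroids more carefully, since the outcome of k-means depends on the starting centroids.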

License

Copyright (C) 2012 Amir Moulavi

Distributed under the Apache Software License.
