This repository has been archived by the owner on Oct 30, 2018. It is now read-only.
jiminoc edited this page Aug 21, 2011 · 4 revisions
There will be times when you need to override Goose's defaults, such as where it stores files or finds external tools.
Goose provides a configuration object that you pass to the extractor, so you can set the values that make sense for your environment.
See the example below:
    public class GooseTest {

        @Test
        public void gooseFromJavaTest() {
            // set my configuration options for goose
            Configuration configuration = new Configuration();
            configuration.setMinBytesForImages(4500);
            configuration.setLocalStoragePath("/tmp/goose");
            // i don't care about the image, just want text, this is much faster!
            configuration.setEnableImageFetching(false);
            configuration.setImagemagickConvertPath("/opt/local/bin/convert");

            String url = "http://www.cnn.com/2010/POLITICS/08/13/democrats.social.security/index.html";
            Goose goose = new Goose(configuration);
            Article article = goose.extractContent(url);
            System.out.println(article.cleanedArticleText());
        }
    }