Fix piskvorky#851, Error is raised instead of returning text [WiP] (piskvorky#902)

* Update summarizer.py

Return statement removed and error raised.

* Update test_summarization.py

Removed test for single sentence input.

* Update CHANGELOG.md

* Update summarizer.py

* Update test_wikicorpus.py

* Update test_summarization.py
metalaman authored and harshuljain13 committed Sep 30, 2016
1 parent c0f5896 commit b8ce776
Showing 4 changed files with 8 additions and 9 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -10,7 +10,7 @@ Changes
 - bigram construction can now support multiple bigrams within one sentence
 * Fixed issue #838, RuntimeWarning: overflow encountered in exp (@markroxor, [#895](https://github.com/RaRe-Technologies/gensim/pull/895))
 * Changed some log messages to warnings as suggested in issue #828. (@rhnvrm, [#884](https://github.com/RaRe-Technologies/gensim/pull/884))
-* Fixed issue #851, In summarizer.py, check for single sentence as an input added to avoid ZeroDivionError, added test cases in test/test_summarization.py(@metalaman, #887)
+* Fixed issue #851, In summarizer.py, RunTimeError is raised if single sentence input is provided to avoid ZeroDivionError. (@metalaman, #887)


 0.13.2, 2016-08-19
5 changes: 2 additions & 3 deletions gensim/summarization/summarizer.py
@@ -198,10 +198,9 @@ def summarize(text, ratio=0.2, word_count=None, split=False):
         logger.warning("Input text is empty.")
         return

-    # If only one sentence is present, the function return the input text (Avoids ZeroDivisionError).
+    # If only one sentence is present, the function raises an error (Avoids ZeroDivisionError).
     if len(sentences) == 1:
-        logger.warning("Summarization not performed since the document has only one sentence.")
-        return text
+        raise ValueError("input must have more than one sentence")

     # Warns if the text is too short.
     if len(sentences) < INPUT_MIN_LENGTH:
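
For callers, the practical effect of this hunk is that single-sentence input now surfaces as a `ValueError` (the CHANGELOG entry says `RunTimeError`, but the code raises `ValueError`) instead of coming back unchanged. The following is a minimal sketch of how calling code might keep the old pass-through behaviour. `summarize_stub` and `summarize_or_passthrough` are hypothetical names introduced here for illustration, and the stub only imitates the guards shown in this hunk with a naive sentence split. It is not gensim's implementation.

```python
import re


def summarize_stub(text):
    """Hypothetical stand-in for summarize(); only mimics the input guards."""
    # Naive split on '.', '!' or '?' followed by whitespace. This is an
    # assumption made for the sketch, not gensim's sentence tokenizer.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if len(sentences) == 0:
        return None
    if len(sentences) == 1:
        raise ValueError("input must have more than one sentence")
    # Placeholder "summary": keep the first sentence only.
    return sentences[0]


def summarize_or_passthrough(text, summarizer=summarize_stub):
    """Return a summary, or the input unchanged if the summarizer rejects it."""
    try:
        return summarizer(text)
    except ValueError:
        # Pre-commit behaviour: hand the single sentence back to the caller.
        return text
```

Passing the real function as `summarizer` would presumably work the same way, since the wrapper relies only on the `ValueError` raised by the guard above.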
4 changes: 2 additions & 2 deletions gensim/test/test_summarization.py
@@ -87,7 +87,7 @@ def test_text_summarization_raises_exception_on_short_input_text(self):
         text = "\n".join(text.split('\n')[:8])

         self.assertTrue(summarize(text) is not None)

     def test_text_summarization_returns_input_on_single_input_sentence(self):
         pre_path = os.path.join(os.path.dirname(__file__), 'test_data')

@@ -97,7 +97,7 @@ def test_text_summarization_returns_input_on_single_input_sentence(self):
         # Keeps the first sentence only.
         text = text.split('\n')[0]

-        self.assertEqual(summarize(text),text)
+        self.assertRaises(ValueError,summarize,text)

     def test_corpus_summarization_raises_exception_on_short_input_text(self):
         pre_path = os.path.join(os.path.dirname(__file__), 'test_data')
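
The updated assertion uses the callable form of `assertRaises`. The same check can be written with the context-manager form, which also allows asserting on the error message. The sketch below is self-contained and does not import gensim: `summarize_stub`, `TestSingleSentenceGuard` and `run_tests` are hypothetical names, and the stub assumes one sentence per line, mirroring the `text.split('\n')[0]` step in the test above.

```python
import unittest


def summarize_stub(text):
    """Hypothetical stand-in that only reproduces the single-sentence guard."""
    # Assumption for this sketch: each non-empty line is one sentence.
    sentences = [line for line in text.split("\n") if line.strip()]
    if not sentences:
        return None
    if len(sentences) == 1:
        raise ValueError("input must have more than one sentence")
    # Placeholder "summary": keep the first sentence only.
    return sentences[0]


class TestSingleSentenceGuard(unittest.TestCase):
    def test_single_sentence_raises(self):
        # The context manager captures the exception for further checks.
        with self.assertRaises(ValueError) as ctx:
            summarize_stub("Only one sentence.")
        self.assertIn("more than one sentence", str(ctx.exception))

    def test_two_sentences_do_not_raise(self):
        self.assertEqual(summarize_stub("First.\nSecond."), "First.")


def run_tests():
    """Run the test case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestSingleSentenceGuard)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Note that the test method in the diff keeps the name `test_text_summarization_returns_input_on_single_input_sentence` even though it now asserts that an error is raised. A name along the lines of the sketch's `test_single_sentence_raises` may describe the new behaviour more accurately.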
6 changes: 3 additions & 3 deletions gensim/test/test_wikicorpus.py
@@ -12,25 +12,25 @@
 import os
 import sys
 import types

 import logging
 import unittest

 from gensim.corpora.wikicorpus import WikiCorpus



 module_path = os.path.dirname(__file__) # needed because sample data files are located in the same folder
 datapath = lambda fname: os.path.join(module_path, 'test_data', fname)
 FILENAME = 'enwiki-latest-pages-articles1.xml-p000000010p000030302-shortened.bz2'

 logger = logging.getLogger(__name__)

 class TestWikiCorpus(unittest.TestCase):

     def setUp(self):
         wc = WikiCorpus(datapath(FILENAME))

     def test_get_texts_returns_generator_of_lists(self):

         logger.debug("Current Python Version is "+str(sys.version_info))
         if sys.version_info < (2, 7, 0):
             return
