get_no_of_tokens()
: Number of words in the headline

get_avg_char_count()
: Average number of characters per word in the headline

sd_len()
: Lengths of the syntactic dependencies between governing and dependent words, returned as a vector of counts indexed by dependency length. For example, if a headline has one syntactic dependency of length 2 and two of length 4, the output vector begins [0, 0, 1, 0, 2], i.e. the value at each index is the number of dependencies of that length. The maximum number of tokens in any headline is 19, so a list with 19 elements is created.

sub_count()
: List of all words used as subjects in the headline
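
The four features above could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation. It assumes whitespace tokenization, that a dependency parser has already produced `(head_index, dependent_index)` pairs and one dependency label per token, and that subjects carry labels such as `nsubj` or `nsubjpass`. The parameter names `arcs`, `dep_labels`, and `subject_labels` are hypothetical.

```python
MAX_TOKENS = 19  # longest headline, per the sd_len() description


def get_no_of_tokens(headline):
    """Number of words, assuming simple whitespace tokenization."""
    return len(headline.split())


def get_avg_char_count(headline):
    """Average characters per word; 0.0 for an empty headline."""
    words = headline.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


def sd_len(arcs, size=MAX_TOKENS):
    """Count dependencies by length.

    `arcs` is a hypothetical list of (head_index, dependent_index)
    pairs, as a dependency parser might supply them. The length of a
    dependency is taken to be the distance between the two positions.
    """
    counts = [0] * size
    for head, dependent in arcs:
        length = abs(head - dependent)
        if length < size:
            counts[length] += 1
    return counts


def sub_count(tokens, dep_labels, subject_labels=("nsubj", "nsubjpass")):
    """Words whose dependency label marks them as a subject.

    The default label names are an assumption about the parser's scheme.
    """
    return [tok for tok, dep in zip(tokens, dep_labels) if dep in subject_labels]


# Example inputs (made up for illustration).
headline = "Scientists discover new planet near distant star"
tokens = headline.split()
labels = ["nsubj", "ROOT", "amod", "dobj", "prep", "amod", "pobj"]
arcs = [(1, 3), (1, 5), (6, 2)]  # one dependency of length 2, two of length 4

n_tokens = get_no_of_tokens(headline)
avg_chars = get_avg_char_count(headline)
length_counts = sd_len(arcs)
subjects = sub_count(tokens, labels)
```

With these inputs, `length_counts` has 19 entries and its first five match the `[0, 0, 1, 0, 2]` example from the `sd_len()` description.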