Name

ngrams-freq-filter - filters out ngrams with low counts.

Synopsis

ngrams-freq-filter [-t THRESHOLD]

Description

The ngrams-freq-filter utility reads ngrams produced by the ngrams utility from standard input and filters out those whose counts fall below a user-specified threshold. The output is written to standard output.

Options

-t THRESHOLD
Specifies the count threshold. Ngrams with counts below this number are excluded from the output. Default value: 1.

Examples

Command:

echo -e "this is a test\nthis is yet another test" | \
ngrams -n 2 | ngrams-freq-filter -t 2

Output:

6
2    <s> this
2    test </s>
2    this is
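
The threshold comparison can be approximated with a short awk sketch. This is a hypothetical stand-in for illustration only, not the utility's actual implementation. It assumes every input line has the form COUNT, whitespace, NGRAM, and it does not model the leading line shown in the output above. The function name filter_by_count is made up for this example.

```shell
# Hypothetical stand-in for the filtering step (not the real utility).
# Assumption: each input line is "COUNT<whitespace>NGRAM".
filter_by_count() {
    threshold="${1:-1}"   # mirrors the documented default of 1
    # Keep a line only if its count is at least the threshold,
    # i.e. drop ngrams whose counts are below it.
    awk -v t="$threshold" '$1 + 0 >= t'
}

# Counts equal to the threshold are kept; lower counts are dropped.
printf '2\t<s> this\n1\tis a\n2\tthis is\n' | filter_by_count 2
```

With a threshold of 2, the "is a" line (count 1) is dropped and the two lines with count 2 pass through unchanged.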

Author

Autocorpus was written by Maciej Pacula (maciej.pacula@gmail.com).

The project website is http://mpacula.com/autocorpus

See Also

autocorpus(7), ngrams(1), ngrams(5), ngrams-sort(1), sentences(1), tokenize(1), wiki-articles(1), wiki-textify(1)

