count - UNIX line counting utilities

Homepage: https://github.com/wroberts/count

This project is licensed under the terms of the MIT license (see LICENSE.md).

Overview

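count works similarly to sort fruit | uniq -c (where fruit is the input file): it reports how many times each distinct line occurs. Each output line holds the count and the line text, separated by a tab, and the output is in alphabetical order by line text.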
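As a rough illustration of that behaviour, here is a minimal Python sketch. It is not the project's implementation (the binaries are compiled); the function name count_lines is hypothetical, and sorting by Unicode code point is an assumption that may differ from the collation the real tool uses.

```python
from collections import Counter


def count_lines(lines):
    """Return 'COUNT<TAB>LINE' strings, ordered by line text."""
    # Strip only the trailing newline so other whitespace is preserved.
    counts = Counter(line.rstrip("\n") for line in lines)
    # Assumption: plain code-point ordering stands in for "alphabetical".
    return [f"{n}\t{text}" for text, n in sorted(counts.items())]


if __name__ == "__main__":
    for row in count_lines(["pear\n", "apple\n", "pear\n"]):
        print(row)
```

Like the real tool, the sketch keeps every distinct line in memory until the input is exhausted.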

addcount sums two count files produced by count, assuming that the files are sorted in alphabetical order.
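The requirement that both inputs are sorted suggests a merge-style pass over the two files. The Python sketch below shows one way such a merge could work; whether addcount is written this way is an assumption, and the name add_counts and the (text, count) tuple representation are hypothetical.

```python
def add_counts(rows_a, rows_b):
    """Merge two lists of (text, count) pairs, each sorted by text."""
    merged = []
    i = j = 0
    while i < len(rows_a) and j < len(rows_b):
        (text_a, n_a), (text_b, n_b) = rows_a[i], rows_b[j]
        if text_a == text_b:
            # Same line in both files: add the counts together.
            merged.append((text_a, n_a + n_b))
            i += 1
            j += 1
        elif text_a < text_b:
            merged.append(rows_a[i])
            i += 1
        else:
            merged.append(rows_b[j])
            j += 1
    # At most one of these tails is non-empty.
    merged.extend(rows_a[i:])
    merged.extend(rows_b[j:])
    return merged
```

A merge of this kind needs only one row from each input at a time, which is why it depends on the inputs being sorted.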

sortalph takes count data as produced by count and sorts it alphabetically; it can also be used to sum two (or more) count files together (even if they're not in alphabetical order):

`cat COUNT1 COUNT2 | sortalph`
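The summing behaviour can be sketched in Python as follows. This is an illustration only, not the project's code: the function name sum_counts is hypothetical, and the sketch assumes each input line has the form COUNT, tab, TEXT.

```python
from collections import defaultdict


def sum_counts(lines):
    """Sum counts per line text and return rows ordered by text."""
    totals = defaultdict(int)
    for line in lines:
        # Split on the first tab only, so the text may itself contain tabs.
        count, text = line.rstrip("\n").split("\t", 1)
        totals[text] += int(count)
    # Assumption: code-point ordering stands in for "alphabetical".
    return [f"{n}\t{text}" for text, n in sorted(totals.items())]
```

Because the totals are collected in a dictionary before sorting, the input rows can arrive in any order.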

sortnum is a script that calls sort -nr, which sorts count data numerically by count, largest first.

threshcount reads a count file as produced by count and outputs only those lines whose counts are greater than the given threshold argument.
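The filter is strict: a line whose count equals the threshold is dropped. A minimal Python sketch of that rule, assuming COUNT, tab, TEXT input lines and using the hypothetical name thresh_count:

```python
def thresh_count(lines, threshold):
    """Keep only rows whose count is strictly greater than threshold."""
    kept = []
    for line in lines:
        # The count is the first tab-separated field.
        count = int(line.split("\t", 1)[0])
        if count > threshold:
            kept.append(line.rstrip("\n"))
    return kept
```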

shuffle is a short Python script which reads in a file and outputs its lines in random order. shuf in the GNU Coreutils is faster and more flexible.
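For readers who want the idea without the script, the following Python sketch shows one way to shuffle lines. It is not the script shipped with this project; the function name shuffle_lines and the seed parameter are hypothetical additions for illustration.

```python
import random


def shuffle_lines(lines, seed=None):
    """Return the given lines in random order, leaving the input unchanged."""
    # A dedicated generator keeps the global random state untouched;
    # passing a seed makes the order repeatable.
    rng = random.Random(seed)
    shuffled = list(lines)
    rng.shuffle(shuffled)
    return shuffled
```

This approach reads the whole input into memory before producing any output.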

Install

From tarball:

tar xf count-1.0.tar.gz
cd count-1.0/
./configure
make install

From GitHub:

autoreconf --install
mkdir build
cd build
../configure
make install

Speed Test

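count is faster than sort | uniq -c, but can use much more memory. In the test below, on a file of about 1.65 million lines (roughly 75 MB), count finishes in about 9 seconds of wall-clock time, against about 51 seconds for sort | uniq -c: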

$ cat BIGFILE | wc
 1653677 21751482 75598346

$ time (cat BIGFILE | sort | uniq -c > /dev/null)

real   0m50.933s
user   0m55.267s
sys    0m0.347s

$ time (cat BIGFILE | count > /dev/null)

real   0m9.233s
user   0m9.357s
sys    0m0.453s

Awk Equivalents

Most of the count tools can be replicated with trivial awk scripts. Usually, the compiled binaries are faster.

count is equivalent to, though faster than:

awk '{c[$0]++} END {OFS="\t"; for (x in c) print c[x], x}' | sort -k2

sortalph is equivalent to, though faster than:

awk 'BEGIN{FS=OFS="\t"} {v=$1; $1=""; c[substr($0,2)]+=v} END {for (x in c) print c[x], x}' | sort -k2

threshcount 2 is equivalent to, but slower than:

awk '{if (2 < $1) print $0}'
