wordcount counts and extracts all possible unique sequence words of a specified size in one or more DNA sequences. It writes an output file giving all possible words for that word size with a count of each word in the input sequences. Optionally, only words occurring a specified minimum number of times are reported.
The output file consists of two columns, separated by spaces or TAB characters.
The first column lists the words of size wordsize. The second column gives the count of each word in the input sequences.
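
The counting and the two-column layout described above can be illustrated with a short Python sketch. This is not the wordcount source code. The function names count_words and format_table and the mincount parameter are hypothetical, and the sketch assumes that words are taken from overlapping windows, that only words actually observed are listed, and that rows are sorted by descending count; none of these details is stated in the text above.

```python
from collections import Counter


def count_words(sequences, wordsize, mincount=1):
    """Count every word of length `wordsize` across all sequences.

    Assumption: words are read from overlapping windows, one per start
    position. Words seen fewer than `mincount` times are dropped.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A sequence shorter than wordsize yields an empty range.
        for i in range(len(seq) - wordsize + 1):
            counts[seq[i:i + wordsize]] += 1
    return {word: n for word, n in counts.items() if n >= mincount}


def format_table(counts):
    """Render counts as two TAB-separated columns: word, then count.

    Assumption: rows are ordered by descending count, then alphabetically.
    """
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{word}\t{n}" for word, n in rows)


if __name__ == "__main__":
    print(format_table(count_words(["ACGTACGT"], wordsize=2)))
```

For the single sequence ACGTACGT and a word size of 2, the sketch reports AC, CG and GT with a count of 2 each and TA with a count of 1. Passing mincount=2 would drop TA from the table.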