wordmatch finds all regions of identity (exact matches) of at least a specified minimum size between two input sequences. These regions are reported in a standard EMBOSS alignment file and (optionally) in standard EMBOSS sequence feature files.
The normal 'report' header is output. It contains the details of the program run and the input sequences.
The data lines consist of five columns separated by spaces or TAB characters. Each line describes one identical region. The first column is the length of the match. The second column is the name of the first sequence. The third column is the start and end position of the match in the first sequence. The fourth and fifth columns are the name of the second sequence and the start and end position of the match in that sequence.
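As an illustration of the five-column layout described above, here is a minimal Python sketch of a parser for one data line. It rests on assumptions not stated in this text: that positions are written as `start..end`, and that header lines begin with `#`. The names `Match`, `parse_range` and `parse_match_line` are hypothetical and are not part of EMBOSS.

```python
from typing import NamedTuple, Optional


class Match(NamedTuple):
    """One identical region, as described by a five-column data line."""
    length: int
    name1: str
    start1: int
    end1: int
    name2: str
    start2: int
    end2: int


def parse_range(text: str) -> tuple:
    # Assumption: positions are written as "start..end".
    start, end = text.split("..")
    return int(start), int(end)


def parse_match_line(line: str) -> Optional[Match]:
    """Return a Match for a data line, or None for anything else."""
    stripped = line.strip()
    # Assumption: report header lines start with '#'.
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()  # splits on spaces and TABs
    if len(fields) != 5:
        return None
    try:
        length = int(fields[0])
        start1, end1 = parse_range(fields[2])
        start2, end2 = parse_range(fields[4])
    except ValueError:
        return None
    return Match(length, fields[1], start1, end1, fields[3], start2, end2)
```

Lines that do not fit the assumed layout are returned as `None` rather than raising an error, so the function can be applied to every line of a file and the header is skipped automatically.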
wordmatch will only report identical regions that are at least as long as the specified wordsize.
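The effect of the minimum size can be illustrated with a naive Python sketch that scans every diagonal of the two sequences and keeps only runs of identical characters at least `wordsize` long. This is not claimed to be how the program is implemented; the function name `find_exact_matches` and the tuple layout are hypothetical, and positions are assumed to be 1-based and inclusive.

```python
def find_exact_matches(a: str, b: str, wordsize: int) -> list:
    """Return (length, start_a, end_a, start_b, end_b) for every maximal
    run of identical characters that is at least `wordsize` long.
    Positions are 1-based and inclusive (an assumption of this sketch)."""
    matches = []
    # Each diagonal is the set of index pairs (i, j) with i - j == offset.
    for offset in range(-(len(b) - 1), len(a)):
        i = max(offset, 0)
        j = i - offset
        run = 0  # length of the current run of identical characters
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                run += 1
            else:
                if run >= wordsize:
                    matches.append((run, i - run + 1, i, j - run + 1, j))
                run = 0
            i += 1
            j += 1
        # A run may extend to the end of the diagonal.
        if run >= wordsize:
            matches.append((run, i - run + 1, i, j - run + 1, j))
    return matches
```

For example, `find_exact_matches("ACGTACGT", "TTACGTAA", 4)` finds two regions of length 5, while a `wordsize` of 6 finds none. The scan takes time proportional to the product of the two sequence lengths, so it is suitable only for short illustrative inputs.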