extractalign
Function
Description
extractalign extracts one or more specified regions from a
sequence alignment and joins them to build a new sub-sequence
alignment.
extractalign reads in a sequence alignment and a set of regions
of that alignment, specified as pairs of start and end positions
(either on the command line or in a file). The positions are gapped
alignment coordinates. It writes out the specified regions of the
input sequences in the order in which they were specified. Thus, if
the sequence "AAAGGGTTT" is input and the regions "7-9, 3-4" are
specified, the output sequence will be "TTTAG".
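The example above can be reproduced with a short Python sketch. It is not
the extractalign implementation; the function names extract_regions and
extract_alignment are hypothetical, and the sketch assumes 1-based,
inclusive positions, which is what the "TTTAG" example implies.

```python
def extract_regions(sequence, regions):
    """Join the given regions of one sequence in the order listed.

    Positions are assumed to be 1-based and inclusive.
    """
    pieces = []
    for start, end in regions:
        # Convert a 1-based inclusive range to a Python slice.
        pieces.append(sequence[start - 1:end])
    return "".join(pieces)


def extract_alignment(rows, regions):
    """Apply the same gapped-coordinate regions to every aligned row."""
    return [extract_regions(row, regions) for row in rows]


result = extract_regions("AAAGGGTTT", [(7, 9), (3, 4)])
print(result)  # TTTAG
```

Because the coordinates are gapped alignment positions, the same slices
are applied to every row, so the extracted rows stay aligned with each
other.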
Usage
Command line arguments
Input file format
extractalign reads a normal sequence USA.
You can specify a file of ranges to extract by giving the '-regions'
qualifier the value '@' followed by the name of the file containing the
ranges (e.g. '-regions @myfile').
The format of the range file is:
- Comment lines start with '#' in the first column.
- Comment lines and blank lines are ignored.
- The line may start with white-space.
- Each line contains two positive integers separated by one or more
space or TAB characters.
- The second number must be greater than or equal to the first number.
- There can be optional text after the two numbers to annotate the line.
- White-space before or after the text is removed.
An example range file is:
# this is my set of ranges
12 23
4 5 this is like 12-23, but smaller
67 10348 interesting region
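The rules above can be illustrated with a minimal Python sketch of a range
file reader. This is not the parser used by extractalign; the function
name parse_range_file and its error messages are hypothetical, and the
sketch simply follows the format rules as listed.

```python
def parse_range_file(text):
    """Parse range-file text into (start, end, note) tuples."""
    ranges = []
    for line in text.splitlines():
        # Blank lines and comment lines ('#' in the first column) are ignored.
        if not line.strip() or line.startswith("#"):
            continue
        # Leading white-space is skipped; at most three fields are produced:
        # start, end, and the optional annotation text.
        fields = line.split(None, 2)
        if len(fields) < 2:
            raise ValueError(f"expected two numbers: {line!r}")
        start, end = int(fields[0]), int(fields[1])
        if start < 1 or end < start:
            raise ValueError(f"bad range: {line!r}")
        # White-space around the annotation text is removed.
        note = fields[2].strip() if len(fields) > 2 else ""
        ranges.append((start, end, note))
    return ranges


example = """# this is my set of ranges
12 23
4 5 this is like 12-23, but smaller
67 10348 interesting region
"""
ranges = parse_range_file(example)
for start, end, note in ranges:
    print(start, end, note)
```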
Output file format
The output is a normal sequence file.
If the '-separate' option is used, each specified region is written
to the output file as a separate sequence. The name of each sequence is
created from the name of the original sequence with the start and end
positions of the range appended, separated by underscore characters.
For example, "XYZ region 2 to 34" is written as "XYZ_2_34".
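As a sketch of this naming rule, the Python below builds one named
sequence per region. The function names region_name and separate_regions
are hypothetical, and 1-based inclusive positions are assumed; only the
"NAME_START_END" pattern comes from the description above.

```python
def region_name(name, start, end):
    """Build the output name: original name, start and end, joined by '_'."""
    return f"{name}_{start}_{end}"


def separate_regions(name, sequence, regions):
    """Return one (name, sub-sequence) pair per region, in the order given."""
    return [
        (region_name(name, start, end), sequence[start - 1:end])
        for start, end in regions
    ]


print(region_name("XYZ", 2, 34))  # XYZ_2_34
records = separate_regions("XYZ", "AAAGGGTTT", [(7, 9), (3, 4)])
```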
Data files
None.
Notes
None.
References
None.
Warnings
None.
Diagnostic Error Messages
The following warning messages may be issued for malformed region
specifications:
- Non-digit found in region ...
- Unpaired start of a region found in ...
- The start of a pair of region positions must be smaller than the
end in ...
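The kinds of check behind these messages can be sketched in Python. This
is an illustration only, not the validation code in extractalign: the
function name parse_region_string is hypothetical, the error texts are
paraphrases, and the sketch assumes that positions are separated by
punctuation or white-space and that letters are not allowed. It accepts a
start equal to the end, following the range file rule given earlier.

```python
import re


def parse_region_string(spec):
    """Turn a string such as '7-9, 3-4' into a list of (start, end) pairs."""
    # Assumption: letters are never valid inside a region specification.
    if re.search(r"[A-Za-z]", spec):
        raise ValueError(f"non-digit found in region {spec!r}")
    numbers = [int(n) for n in re.findall(r"\d+", spec)]
    # Every start position needs a matching end position.
    if len(numbers) % 2 != 0:
        raise ValueError(f"unpaired start of a region found in {spec!r}")
    pairs = list(zip(numbers[0::2], numbers[1::2]))
    for start, end in pairs:
        if start > end:
            raise ValueError(f"start is larger than end in {spec!r}")
    return pairs


pairs = parse_region_string("7-9, 3-4")
print(pairs)  # [(7, 9), (3, 4)]
```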
Exit status
It exits with status 0, unless a region is badly constructed.
Known bugs
None noted.
Comments
Author(s)
History
Target users
Comments