splitter splits one or more input sequences into smaller, optionally overlapping, subsequences. The subsequence size and the overlap (if any) may be specified. Feature information can optionally be used.
Each sub-sequence is named after the original sequence, with '_start-end' appended, where 'start' and 'end' are the start and end positions of the sub-sequence. For example, if the sequence U01317 is split at a size of 50000 with no overlap, the sub-sequences are named U01317_1-50000 and U01317_50001-73308.
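The naming scheme for the no-overlap case can be illustrated with a short Python sketch. This is not splitter's own code: the function name subsequence_names and its parameters are hypothetical, and the sequence length of 73308 is taken from the example above. It only shows how 1-based, inclusive start and end positions produce the names.

```python
def subsequence_names(name, length, size):
    """Return names for consecutive, non-overlapping chunks of a sequence.

    Positions are 1-based and inclusive, matching the '_start-end' suffix.
    Hypothetical helper for illustration; it does not model overlap.
    """
    names = []
    start = 1
    while start <= length:
        # The last chunk may be shorter than `size`.
        end = min(start + size - 1, length)
        names.append(f"{name}_{start}-{end}")
        start = end + 1
    return names


print(subsequence_names("U01317", 73308, 50000))
# ['U01317_1-50000', 'U01317_50001-73308']
```

The final chunk is simply whatever remains, which is why the second name ends at 73308 rather than 100000.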
Splitting a large sequence into smaller sub-sequences for analysis might be useful where a particularly memory-intensive or CPU-intensive application will not run quickly enough, or at all, on the full sequence. This should seldom be necessary in EMBOSS.
By default, splitter writes all the sub-sequences to a single file. In some cases, particularly where non-EMBOSS programs are used, it is necessary to have a single sequence per file. To write the sub-sequences into separate files, use the command-line switch -ossingle.