seqretsplit is a variant of seqret, the standard program for reading and writing sequences. It performs exactly the same function, except that when it reads more than one sequence it writes each sequence to an individual file. In all other respects, seqretsplit is the same as seqret. Its main use is therefore to split a file containing multiple sequences into many files, each containing one sequence. EMBOSS has many built-in options for specifying the input and output sequences in detail, for example the sequence type and file format. Optionally, feature information is read and written.
You are still prompted for an output file, but the name you give is not used.
At some point this ought to change so that you are not prompted for the output file at all.
See the documentation for seqret to see the full range of things that you can do when reading and writing sequences.
Some non-EMBOSS programs will accept only single sequences. In such cases seqretsplit is useful for splitting a multiple sequence file into many individual files. Some EMBOSS programs will also read only a single sequence, which may, however, be one of many in a file. You can specify the sequence using the USA filename:sequenceID. Nonetheless, some people feel more comfortable handling one sequence per file, so seqretsplit will be useful to them too.
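As a rough illustration of what selecting one sequence by filename:sequenceID amounts to, here is a minimal Python sketch. It is not how EMBOSS resolves a USA. It handles only the plain filename:sequenceID form on FASTA text, and the helper names parse_usa and select_record, the case-insensitive ID match, and the placeholder sequence data are all assumptions made for this example.

```python
def parse_usa(usa):
    """Split a 'filename:sequenceID' string into (filename, sequence_id).

    Only this one form is handled; the split is made at the last colon.
    """
    filename, sep, seq_id = usa.rpartition(":")
    if not sep or not filename or not seq_id:
        raise ValueError(f"expected filename:sequenceID, got {usa!r}")
    return filename, seq_id


def select_record(fasta_text, seq_id):
    """Return the residues of the FASTA record with the given ID, or None.

    The ID is taken to be the first word after '>' and is compared
    case-insensitively (an assumption of this sketch).
    """
    wanted = seq_id.lower()
    collecting = False
    residues = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if collecting:
                break  # reached the record after the one we wanted
            fields = line[1:].split()
            collecting = bool(fields) and fields[0].lower() == wanted
        elif collecting:
            residues.append(line.strip())
    return "".join(residues) if collecting else None


# Placeholder data, not real database entries.
demo = ">SEQ1 first\nACGT\nACGT\n>SEQ2 second\nTTAA\n"
filename, seq_id = parse_usa("multi.fasta:seq2")
print(filename, seq_id)
print(select_record(demo, seq_id))
```

The same lookup is what makes a one-sequence-per-file layout unnecessary for programs that accept a USA, which is why splitting is mostly a convenience for other tools.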
One file for each input sequence is written. The names of the files it creates are derived from the ID name of the sequence, followed by an extension denoting the format of the sequence. You have no control over the names of the files it writes out. For example, if the files embl:hsfa11* are read in and the output is specified as wibble.seq, then the following files are expected to be created:
hsfa110.fasta hsfa111.fasta hsfa112.fasta hsfa113.fasta hsfa114.fasta
(No file wibble.seq is created.)
This is a side effect of the way sequence output works in EMBOSS. Writing multiple sequences to separate files is what the -ossingle qualifier does, and seqretsplit switches it on automatically.
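The naming behaviour described above (one file per sequence, named from the sequence ID plus a format extension) can be sketched in a few lines of Python. This is only an illustration of the behaviour, not the EMBOSS implementation. The function names parse_fasta and split_fasta are hypothetical, the input is assumed to be FASTA text, lowercasing the ID is an assumption based on the example file names, and the sequence data is placeholder data.

```python
import os
import tempfile


def parse_fasta(text):
    """Return a list of (sequence_id, lines) pairs from FASTA text."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header without an ID")
            # The ID is the first word after '>'.
            records.append((fields[0], [line]))
        elif records:
            records[-1][1].append(line)
    return records


def split_fasta(text, out_dir, extension="fasta"):
    """Write each record to out_dir/<id>.<extension>; return the names.

    No output file name is taken from the caller: names come only from
    the sequence IDs (lowercased, an assumption of this sketch).
    """
    written = []
    for seq_id, lines in parse_fasta(text):
        name = f"{seq_id.lower()}.{extension}"
        with open(os.path.join(out_dir, name), "w") as handle:
            handle.write("\n".join(lines) + "\n")
        written.append(name)
    return written


# Placeholder data, not real database entries.
demo = ">SEQ1 first\nACGT\n>SEQ2 second\nTTAA\n"
with tempfile.TemporaryDirectory() as tmp:
    print(split_fasta(demo, tmp))  # ['seq1.fasta', 'seq2.fasta']
```

Because the names depend only on the IDs, two input sequences with the same ID would overwrite each other in this sketch.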