yank adds the full Uniform Sequence Address (USA) of a specified sequence, or of a region (subsequence) of a sequence, to a list file. By default the USA is appended to the file; optionally, the file is overwritten instead.
A list file contains one or more sequence references (USAs). A reference can be, for example, a database entry, the name of a file containing sequences, or even the name of another list file. In addition to the name of the sequence, yank can write the start and end positions of a region within that sequence and, if the sequence is nucleic, it can specify whether the reverse complement of the sequence should be used.
You will be prompted for the start and end positions you wish to use.
If the sequence is nucleic, you will be prompted whether you wish to use the reverse complement of the sequence.
The output list file can now be read in by a program such as union by specifying the list file as '@cds.list' when union prompts for input.
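The behaviour described above (a name, an optional region, an optional reverse-complement flag, appended to a file) can be illustrated with a short Python sketch. This is not yank's implementation. The function names format_usa and append_to_list_file are hypothetical, and the bracketed [start:end] and [start:end:r] suffixes are an assumption about how a region and a reversed region are written in a USA; check the EMBOSS USA documentation before relying on them.

```python
def format_usa(name, start=None, end=None, reverse=False):
    """Build a USA string, optionally with a region and a reverse flag.

    Assumption: a region is written as name[start:end], and a reversed
    region as name[start:end:r].
    """
    if start is None and end is None:
        if reverse:
            raise ValueError("this sketch only marks an explicit region as reversed")
        return name
    if start is None or end is None:
        raise ValueError("give both start and end, or neither")
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    suffix = ":r" if reverse else ""
    return f"{name}[{start}:{end}{suffix}]"


def append_to_list_file(path, usa, overwrite=False):
    """Write one USA per line; append by default, overwrite only when asked."""
    mode = "w" if overwrite else "a"
    with open(path, mode, encoding="utf-8") as handle:
        handle.write(usa + "\n")
```

Calling append_to_list_file("cds.list", format_usa("sw:opsd_xenla", 10, 50)) several times with different regions would build up a file that could then be passed to another program as '@cds.list'.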
There are many ways of specifying input and output sequences for EMBOSS programs, including wildcarded sequence file names, wildcarded database entry names and list files. List files (files of file names) are the most flexible. Instead of containing the sequences themselves, a list file contains one or more sequence references (USAs). A reference can be, for example, a database entry, the name of a file containing sequences, or even the name of another list file. Here is a valid list file:
opsd_abyko.fasta
sw:opsd_xenla
sw:opsd_c*
@another_list
The file contains:
* opsd_abyko.fasta - this is the name of a sequence file. The file is read in from the current directory.
* sw:opsd_xenla - this is a reference to a specific sequence in the SwissProt database
* sw:opsd_c* - this represents all the sequences in SwissProt whose identifiers start with "opsd_c"
* another_list - this is the name of a second list file
Notice the @ in front of the last entry. This is the way you tell EMBOSS that this file is a list file, not a regular sequence file.
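As a rough illustration of the four kinds of entry described above, the following Python sketch labels each line of a list file. It is a simplified heuristic, not how EMBOSS resolves USAs: the names classify_entry and read_list_file are hypothetical, and the rules (a leading @ means a list file, a colon in the name means a database reference, a * means a wildcard) are assumptions that ignore other USA forms.

```python
def classify_entry(entry):
    """Label one list-file entry by the kind of reference it appears to be."""
    if entry.startswith("@"):
        return "list file"
    # Ignore any bracketed region suffix so its colon is not mistaken
    # for a database separator.
    name = entry.split("[", 1)[0]
    if ":" in name:
        return "database wildcard" if "*" in name else "database entry"
    return "sequence file"


def read_list_file(text):
    """Return (entry, kind) pairs, one per non-blank line."""
    entries = [line.strip() for line in text.splitlines() if line.strip()]
    return [(entry, classify_entry(entry)) for entry in entries]


example = """opsd_abyko.fasta
sw:opsd_xenla
sw:opsd_c*
@another_list
"""

for entry, kind in read_list_file(example):
    print(f"{entry:20} {kind}")
```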
Without the program yank you would need to use a text editor such as pico to create the appropriate list files. yank makes this process easy.
The program extract does not make list files, but creates a sequence from sub-regions of a single other sequence.