infoseq displays basic information about one or more input sequences on screen. This includes the Uniform Sequence Address (USA), name, accession number, type (nucleic or protein), length, percentage C+G and description. Any combination of these fields can be selected or deselected for display. The same information may be written to an output file, which may optionally be formatted as an HTML table.
The first non-blank line is the heading. This is followed by one line per sequence containing the following columns of data separated by one or more space or TAB characters:
If qualifiers to inhibit various columns of information are used, then the remaining columns of information are output in the same order as shown above, so if '-nolength' is used, the order of output is: usa, name, accession, type, description.
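The layout described above (a heading line, then one whitespace-separated line per sequence) can be read with a few lines of Python. This is only a minimal sketch: the sample text is made up for illustration and is not real infoseq output, and the code assumes that the heading holds one token per column and that only the last column (the description) may contain spaces.

```python
def parse_infoseq(text):
    """Parse infoseq-style text into a list of dicts keyed by heading token."""
    # Ignore blank lines; the first non-blank line is the heading.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    records = []
    for line in lines[1:]:
        # Split on runs of spaces/TABs, but keep the final column intact
        # so that a multi-word description is not broken up.
        fields = line.rstrip().split(None, len(header) - 1)
        records.append(dict(zip(header, fields)))
    return records


# Hypothetical sample, NOT real infoseq output: heading names and values
# are invented, and the columns shown are an arbitrary subset.
SAMPLE = (
    "USA\tName\tType\tLength\tDescription\n"
    "db:SEQ1  SEQ1  N  1200  Hypothetical example gene\n"
    "db:SEQ2\tSEQ2\tP\t350\tAnother made-up record\n"
)

records = parse_infoseq(SAMPLE)
for rec in records:
    print(rec["USA"], rec["Length"], rec["Description"])
```

Because the column names are taken from the heading line, the same function keeps working when columns are inhibited, as long as the heading is still present.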
When the -html qualifier is specified, the output is wrapped in HTML table tags, ready for inclusion in a Web page. Note that tags such as <HTML> and <BODY> are not output by this program, as the table is expected to form only part of the contents of a web page; the rest of the web page must be supplied by the user.
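Since the surrounding page has to be supplied by the user, one possible approach is to wrap the table fragment in a small page skeleton. The sketch below is an illustration only: the function name wrap_fragment, the default title and the sample fragment are all hypothetical, and the fragment is not real infoseq output.

```python
import html


def wrap_fragment(table_fragment, title="Sequence information"):
    """Return a complete HTML page containing the given table fragment."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        '<meta charset="utf-8">\n'
        # Escape the title so characters such as & or < stay valid HTML.
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"{table_fragment.strip()}\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical fragment standing in for text produced with -html.
FRAGMENT = (
    "<table>\n"
    "<tr><th>Name</th><th>Length</th></tr>\n"
    "<tr><td>SEQ1</td><td>1200</td></tr>\n"
    "</table>\n"
)

page = wrap_fragment(FRAGMENT)
print(page)
```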
The lines of output information are guaranteed not to have trailing white-space.
There are many qualifiers to control exactly what information on the sequence is output and how it is formatted. If you only want a few fields in the output file, the command line may be shortened by preceding the appropriate qualifier with -only. For example, instead of specifying -nohead -noname -noacc -notype -nopgc -nodesc to get only the length output, you can specify -only -length.
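The two ways of asking for the length alone can be put side by side when building the command from a script. This is a sketch under assumptions: the input name seqs.fasta is hypothetical, passing it as the first argument is assumed rather than taken from the text above, and the commands are only assembled, not run. The qualifiers themselves are the ones quoted in the paragraph above.

```python
import shlex

# Hypothetical input file; giving it as the first argument is an assumption.
SEQUENCE = "seqs.fasta"

# Long form: switch off every unwanted item with the qualifiers from the text.
long_form = ["infoseq", SEQUENCE,
             "-nohead", "-noname", "-noacc", "-notype", "-nopgc", "-nodesc"]

# Short form: -only, followed by the single field that is wanted.
short_form = ["infoseq", SEQUENCE, "-only", "-length"]


def as_command(argv):
    """Join an argument list into a shell-safe command string."""
    return shlex.join(argv)


print(as_command(long_form))
print(as_command(short_form))
```

Keeping the arguments in a list rather than a single string means the same value can be handed to subprocess.run without going through a shell.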
By default, the output file starts each line with the USA of the sequence being described, so the output file is a list file that can be manually edited and read in by any other EMBOSS program that can read in one or more sequences to be analysed.
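The manual editing mentioned above can also be scripted. The following sketch takes the leading USA from each line and keeps only the entries that pass a caller-supplied test. It relies on the default layout described in this section (heading first, USA as the first field on every following line); the sample text, the USAs in it and the function name select_usas are invented for illustration.

```python
def select_usas(infoseq_text, keep=lambda usa: True):
    """Return the USAs from infoseq-style text for which keep(usa) is true."""
    # Drop blank lines; the first remaining line is the heading.
    lines = [ln for ln in infoseq_text.splitlines() if ln.strip()]
    # The USA is the first whitespace-separated field of each data line.
    usas = [ln.split()[0] for ln in lines[1:]]
    return [usa for usa in usas if keep(usa)]


# Hypothetical sample, NOT real infoseq output.
SAMPLE = (
    "USA  Name  Length\n"
    "dba:SEQ1  SEQ1  1200\n"
    "dbb:SEQ2  SEQ2  350\n"
    "dba:SEQ3  SEQ3  87\n"
)

# Keep only entries whose USA starts with the made-up prefix "dba:".
selected = select_usas(SAMPLE, keep=lambda usa: usa.startswith("dba:"))

# One USA per line, ready to be written to a new list file.
list_file_text = "\n".join(selected) + "\n"
print(list_file_text, end="")
```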