einverted finds inverted repeats (stem loops) in nucleotide sequences. It identifies regions of local alignment of the input sequence and its reverse complement that exceed a threshold score. The alignments may include a proportion of mismatches and gaps, which correspond to bulges in the stem loop. One or more sequences are read and a file with the sequence(s) (without gap characters) of the inverted repeat regions is written. It can find multiple inverted repeats in a sequence. Only non-overlapping matches are reported.
einverted uses dynamic programming and thus is guaranteed to find optimal, local alignments between the sequence and its reverse complement. Matched bases contribute positively to the score whereas gaps and mismatches are penalised. The score for a local alignment is the sum of the values of each match, minus penalties for mismatches and gap insertion. Any region whose score exceeds the threshold is reported. The gap penalty, match score and mismatch score, and the threshold score for reporting regions, are all user-specified.
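The scoring scheme described above can be illustrated with a minimal local-alignment sketch. This is not einverted's actual implementation: the function name `best_inverted_repeat_score` is hypothetical, the mismatch score (-4) and gap penalty (12) are assumed values, and only the best score is returned rather than the set of non-overlapping regions. Pairing each base with a base of the reversed sequence by complementarity is equivalent to aligning the sequence against its reverse complement.

```python
# Watson-Crick pairs; any other character (e.g. "n") pairs with nothing.
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def best_inverted_repeat_score(seq, match=3, mismatch=-4, gap=12):
    """Return the best local alignment score of seq against its reverse complement.

    match, mismatch and gap are illustrative values (assumptions), not
    values read from an einverted installation.
    """
    seq = seq.lower()
    rev = seq[::-1]  # reversed only; complementarity is tested per pair below
    n = len(seq)
    prev = [0] * (n + 1)
    best = 0
    for i in range(1, n + 1):
        curr = [0] * (n + 1)
        for j in range(1, n + 1):
            paired = COMPLEMENT.get(seq[i - 1]) == rev[j - 1]
            step = match if paired else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + step,  # extend with a match or mismatch
                prev[j] - gap,       # gap in the reversed strand
                curr[j - 1] - gap,   # gap in the forward strand
            )
            best = max(best, curr[j])
        prev = curr
    return best


# An 11-base stem, a 13-base unpaired loop, then the stem's reverse complement.
hairpin = "aaaactaaggc" + "n" * 13 + "gccttagtttt"
print(best_inverted_repeat_score(hairpin))
```

A region would then be reported only if its score exceeds the chosen threshold. The sketch uses quadratic time and does not stop the two arms of a repeat from overlapping, so it is meant for reading rather than for real sequences.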
The original "inverted" program (from which einverted was derived) was used to annotate the nematode genome. Excluding overlapping repeats saved problems with simple repeat sequences in this genome.
einverted will find optimal alignments but is slower than heuristic methods such as BLAST.
Sometimes the program palindrome finds repeats that einverted does not find with its default parameters. This is not due to a problem with either program. Some of the shortest repeats reported by palindrome with its default parameter values score below einverted's default cutoff score, so you should decrease the 'Minimum score threshold' to see them.
For example, when palindrome is run with 'em:x65921', it finds the repeat:
      64 aaaactaaggc 74
         |||||||||||
      98 ttttgattccg 88
einverted will not report this because its score is 33 (11 bases scoring 3 each, no mismatches or gaps), which is below the default score cutoff of 50.
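The arithmetic behind that score can be checked with a few lines of Python. This is only a sketch: `alignment_score` is a hypothetical helper, and the mismatch score (-4) and gap penalty (12) are assumed values that play no part in this gap-free, mismatch-free example.

```python
def alignment_score(matches, mismatches=0, gaps=0, match=3, mismatch=-4, gap=12):
    """Sum the match scores, then apply the mismatch and gap penalties."""
    return matches * match + mismatches * mismatch - gaps * gap


score = alignment_score(matches=11)  # 11 paired bases, no mismatches, no gaps
print(score)       # 33
print(score > 50)  # False: below the default cutoff of 50
print(score > 30)  # True: above a lowered threshold of 30
```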
If einverted is run as:
% einverted em:x65921 -threshold 30
then it will find it:
Score 33: 11/11 (100%) matches, 0 gaps
      64 aaaactaaggc 74
         |||||||||||
      98 ttttgattccg 88
Anything can be considered to be a repeat if you set the score threshold low enough!
einverted does not report overlapping matches.
This application was modified for inclusion in EMBOSS by