helixturnhelix uses the method of Dodd and Egan to identify helix-turn-helix nucleic acid binding motifs in an input protein sequence. The output is a standard EMBOSS report file describing the location, size and score of any putative motifs.
By default helixturnhelix writes a 'motif' report file.
The data files are stored in the standard EMBOSS data directory. The names are:
The old (1987) data has a motif length of 20 residues, whilst the default data (Ehth.dat) has a motif length of 22 residues.
With care these can be replaced to suit your data sets. If the files are placed in the following directories they will be used in preference to the files in the EMBOSS distribution data directory:
# Amino acid counts for 91 Helix-turn-helix (presumed) protein motifs
# from Dodd IB and Egan JB (1990) Nucl. Acids. Res. 18:5019-5026.
# Sample: 91 aligned sequences
#
# R  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 Total Exp
# - -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- ----- ---
A    2  1  3 14 10 12 75  6 15  9  1  1  4  3  8 15  4  4  4 11  0 10   212 995
C    0  0  1  1  0  0  0  0  0  3  3  1  1  0  0  0  0  0  0  1  0  3    14 106
D    0  1  0  1 14  0  0 14  1  0  5  0  1  2  0  0  0  0  1  1  0  2    43 556
E    4  5  0 11 26  0  0 16  9  3  3  0  3 12 13  0  0  2  0  1 13  6   127 669
F    4  0  4  0  0  4  0  1  0 10  0  0  0  0  1  0  0  1  1  1 22  0    49 358
G    9  7  1  4  0  0  8  0  0  0 50  0  6  0  7  1  0  3  1  1  0  4   102 761
H    4  3  1  1  2  0  0  3  2  0  5  0  3  3  0  2  0  2  4  5  0  2    42 225
I   10  0 13  3  2 15  0  4  9  4  0 17  0  2  0  1 31  1  4  8 16  1   141 583
K    4  4  6 11 12  1  1 14 11  0  5  2  2  7  2  1  0  5  8  4  5 15   120 516
L   16  1 17  0  1 35  0  3 12 31  0 22  0  2  1  1 22  1  1 12 20  0   198 954
M    7  0  2  1  1  1  0  0  5  7  1 10  0  0  2  0  2  0  0  2  0  1    42 275
N    0  8  0  1  0  0  0  2  1  1 14  0  8  1  4  2  0  4  9  0  0 11    66 383
P    1  6  0  1  0  0  0  0  0  0  0  0  3 13  7  0  0  0  0  0  0  3    34 403
Q    2  1 21  9 11  0  0  9  8  0  0  2  1 17  7 12  0  3 12  5  3  9   132 437
R    9 10 14  9  5  0  1 16 10  0  1  0  1 17  8  7  0 17 28  3  0 16   172 609
S    2 17  0  8  4  1  6  1  2  2  3  0 37  1 25  5  0 29  3  0  1  5   152 552
T    6 24  3 12  1  5  0  2  2  4  0  5 20  4  3 39  0  4  1  0  4  3   142 512
V    7  3  1  1  2 16  0  0  2 12  0 29  0  5  3  3 32  0  7  8  7  0   138 724
W    2  0  0  0  0  0  0  0  0  1  0  1  0  0  0  0  0  0  2 21  0  0    27 105
Y    2  0  4  3  0  1  0  0  2  4  0  1  1  2  0  2  0 15  5  7  0  0    49 267
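If you replace the data file with counts from your own data set, it can help to check that each row is internally consistent before use. The following Python sketch is not part of EMBOSS. It parses rows in the layout shown above (residue, 22 position counts, Total, Exp), checks that the position counts sum to the Total column, and derives an illustrative log-odds weight. The names parse_counts and log_odds, the pseudocount, and the reading of Exp as a background frequency (the column sums to 9990 in the table above) are assumptions made for this example; the weighting actually applied by helixturnhelix may differ.

```python
import math

MOTIF_LENGTH = 22   # positions per row in the default table
N_SEQUENCES = 91    # "Sample: 91 aligned sequences"
EXP_SUM = 9990      # sum of the Exp column over all 20 rows of the table above

# Three rows copied from the table above; a real file has one row per residue.
SAMPLE = """\
# comment lines such as this one are skipped
A 2 1 3 14 10 12 75 6 15 9 1 1 4 3 8 15 4 4 4 11 0 10 212 995
G 9 7 1 4 0 0 8 0 0 0 50 0 6 0 7 1 0 3 1 1 0 4 102 761
W 2 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 2 21 0 0 27 105
"""


def parse_counts(text):
    """Return {residue: (position_counts, total, exp)} from table text."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        residue = fields[0]
        values = [int(v) for v in fields[1:]]
        counts, total, exp = values[:-2], values[-2], values[-1]
        if len(counts) != MOTIF_LENGTH:
            raise ValueError(f"{residue}: expected {MOTIF_LENGTH} positions, got {len(counts)}")
        if sum(counts) != total:
            raise ValueError(f"{residue}: counts sum to {sum(counts)}, Total says {total}")
        table[residue] = (counts, total, exp)
    return table


def log_odds(table, residue, position, pseudocount=0.5):
    """Illustrative weight: ln(observed / background) at a 1-based position.

    Assumption for this sketch only, not taken from the EMBOSS source.
    The pseudocount keeps zero counts from producing log(0).
    """
    counts, _total, exp = table[residue]
    observed = (counts[position - 1] + pseudocount) / (N_SEQUENCES + 20 * pseudocount)
    background = exp / EXP_SUM
    return math.log(observed / background)


if __name__ == "__main__":
    table = parse_counts(SAMPLE)
    for residue, position in (("A", 7), ("G", 11), ("A", 21)):
        print(residue, position, round(log_odds(table, residue, position), 2))
```

With these sample rows, strongly conserved positions such as alanine at position 7 (75 of 91 sequences) give a positive weight, while a residue never seen at a position gives a negative one. Raising an error on a row whose counts do not match its Total is a deliberate choice here, so that a mistyped replacement file fails early.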
The helix-turn-helix protein structural motif was originally identified as the DNA-binding domain of phage repressors. It is formed by two alpha-helices joined by a short turn: one alpha-helix lies in the major (wide) groove of the DNA, while the other lies at an angle across the DNA. The motif is of fundamental biological importance and is found in many proteins that regulate gene expression.
Original program "HELIXTURNHELIX" (EGCG 1990) by