chaos creates a chaos game representation (CGR) plot for a nucleotide sequence. A CGR plot represents a nucleotide sequence as a square box with an A, G, C, or T nucleotide at each corner. The box contains dots, each one representing a dinucleotide. All overlapping dinucleotides from the start to the end of the sequence are plotted. Regions which are devoid of dots (or heavily covered with dots) indicate short sequence motifs that are unusually infrequent (or frequent). CGR plots depict both base composition and the order in which bases occur, giving a distinctive visual representation of a sequence that complements more traditional linear representations.
The plot is generated as follows. A box is drawn and an A, G, C, or T is drawn at each corner. Starting from the middle of the box, move halfway to the corner representing the first base in the sequence and draw a dot. Then, for each subsequent base, move halfway from the current dot to the corresponding corner and draw a dot. Finally, the number and percentage of each of the bases A, G, C, and T are displayed. The result is an image of a square sprinkled with dots.
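The halfway rule can be sketched in a few lines of Python. This is an illustration only, not the code that chaos itself runs: the function names cgr_points and base_summary are hypothetical, the corner layout (A bottom-left, C top-left, G top-right, T bottom-right on a unit square) is an assumption because the layout is not stated here, and skipping characters other than A, G, C, and T is likewise an assumption.

```python
from collections import Counter

# Assumed corner layout on the unit square (not specified in the text):
# A bottom-left, C top-left, G top-right, T bottom-right.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence, corners=CORNERS):
    """Return one (x, y) dot per base, applying the halfway rule."""
    x, y = 0.5, 0.5  # start in the middle of the box
    points = []
    for base in sequence.upper():
        if base not in corners:
            continue  # assumption: ignore ambiguity codes such as N
        cx, cy = corners[base]
        # Move halfway from the current position to the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def base_summary(sequence):
    """Return {base: (count, percentage)} for A, G, C and T."""
    counts = Counter(b for b in sequence.upper() if b in "AGCT")
    total = sum(counts.values())
    return {
        b: (counts[b], 100.0 * counts[b] / total if total else 0.0)
        for b in "AGCT"
    }
```

The list returned by cgr_points can be handed to any plotting tool to draw the dots, and base_summary supplies the counts and percentages that accompany the plot.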
Regions which are devoid of dots (or heavily covered with dots) indicate short sequence motifs that are unusually infrequent (or frequent). The sequence of such a motif can be deduced by looking to see which quarter of the square the region is in - the letter that this quarter belongs to is the last base of the motif, because each dot lands in the quarter of the base that was just plotted. The quarter is then quartered again and the appropriate base letters are assigned to the corners of the quarter - the part that the region is in gives the preceding base of the motif.
The process continues until you have identified the 1/16th or 1/64th, etc. of the original square containing the unusual region. Reading the letters in reverse order of discovery then gives the sequence of the motif from its first base to its last.
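The repeated quartering can also be sketched in Python. The names read_quarters and motif_at are hypothetical, and the corner layout (A bottom-left, C top-left, G top-right, T bottom-right on a unit square) is an assumption. Under the halfway rule a dot always lands in the quarter of the base just plotted, so the letters read from the outermost quarter inward run from the end of the motif back towards its start; motif_at reverses them to give the motif in sequence order.

```python
# Assumed corner layout on the unit square (not specified in the text):
# A bottom-left, C top-left, G top-right, T bottom-right.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def read_quarters(x, y, depth, corners=CORNERS):
    """Letters of the nested quarters containing (x, y), outermost first."""
    by_corner = {corner: base for base, corner in corners.items()}
    letters = []
    for _ in range(depth):
        # Decide which quarter of the current square holds the point.
        cx = 1.0 if x >= 0.5 else 0.0
        cy = 1.0 if y >= 0.5 else 0.0
        letters.append(by_corner[(cx, cy)])
        # Zoom into that quarter and rescale it to the unit square.
        x, y = 2 * x - cx, 2 * y - cy
    return "".join(letters)


def motif_at(x, y, depth, corners=CORNERS):
    """Motif in sequence order for the region containing (x, y)."""
    return read_quarters(x, y, depth, corners)[::-1]
```

For example, a region around the point (0.625, 0.625) lies in the G quarter and, within it, in the A sub-quarter, so motif_at(0.625, 0.625, 2) gives the dinucleotide AG. A depth of 2 corresponds to a 1/16th of the square and a depth of 3 to a 1/64th.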