restrict

Function

Description

restrict scans one or more nucleotide sequences for cut sites for a supplied set of restriction enzymes. One or more restriction enzymes can be specified or, alternatively, all the enzymes in the REBASE database can be investigated. The minimum length of a recognition site to be reported must be specified. restrict writes an output file showing the location of the cut sites. There are several options to control exactly which sites are reported and the format of the output file. Optionally, the fragment lengths of the forward sense strand produced by complete restriction by each restriction enzyme on its own, or by using all of the input enzymes together, may be reported. These fragment lengths are added to the tail section of the report.
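As a rough illustration of what scanning a sequence for recognition sites involves, the following Python sketch expands a recognition pattern containing IUPAC ambiguity codes into a regular expression and reports the 1-based start of every match. It is not the code used by restrict: the function name find_sites is hypothetical, only the forward strand is searched, and cut positions are not computed.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(sequence, pattern):
    """Return the 1-based start of every match of pattern in sequence.

    Hypothetical helper: forward strand only. A lookahead is used so
    that overlapping matches are all reported.
    """
    body = "".join(IUPAC[base] for base in pattern.upper())
    regex = re.compile("(?=" + body + ")")
    return [m.start() + 1 for m in regex.finditer(sequence.upper())]


# GAATTC is the EcoRI recognition site.
print(find_sites("TTGAATTCAAGGAATTCGG", "GAATTC"))
```

A pattern containing an N, such as GTNC, is handled in the same way because N expands to any of the four bases.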

Algorithm

Usage

Command line arguments


Input file format

restrict reads one or more DNA sequence USAs.

Output file format

By default restrict writes a 'table' report file.

The output from restrict is a simple text file. It shows the base number, restriction enzyme name, recognition site and cut positions. Note that cuts are always to the right of the residue shown and that 5' cuts are referred to by their associated 3' number sequence.

The program reports enzymes that cut at two or four sites. The program also reports isoschizomers and enzymes having the same recognition sequence but different cut sites.

When the "-fragments" or "-solofragments" qualifiers are given then the sizes of the fragments produced by either all of the specified enzymes cutting, or by each enzyme cutting individually, are given in the 'tail' section at the end of the report file.
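The difference between the two fragment reports can be illustrated with a small Python sketch. It assumes that a cut at position p falls to the right of residue p, as described above, and that the cut positions are already known. The function name fragment_lengths, the circular argument, the enzyme names and the cut positions are all hypothetical, and sorting the sizes largest first is a choice made for this sketch rather than a statement about the report layout.

```python
def fragment_lengths(seq_length, cut_positions, circular=False):
    """Sizes of the fragments left after complete digestion.

    cut_positions are 1-based; a cut at p separates residue p from
    residue p + 1. Hypothetical helper, not part of EMBOSS.
    """
    # On a linear molecule a cut after the last residue changes nothing.
    upper = seq_length if circular else seq_length - 1
    cuts = sorted({p for p in cut_positions if 1 <= p <= upper})
    if not cuts:
        return [seq_length]
    # Distances between neighbouring cuts.
    inner = [b - a for a, b in zip(cuts, cuts[1:])]
    if circular:
        # One extra fragment runs from the last cut round to the first.
        sizes = inner + [seq_length - cuts[-1] + cuts[0]]
    else:
        sizes = [cuts[0]] + inner + [seq_length - cuts[-1]]
    return sorted(sizes, reverse=True)


# Invented cut positions on a 100-base sequence.
cuts_by_enzyme = {"EnzA": [20, 70], "EnzB": [45]}

# Each enzyme cutting on its own.
solo = {name: fragment_lengths(100, cuts)
        for name, cuts in cuts_by_enzyme.items()}

# All enzymes cutting together: pool the cut positions first.
pooled = [p for cuts in cuts_by_enzyme.values() for p in cuts]
together = fragment_lengths(100, pooled)

print(solo)
print(together)
```

Pooling the cut positions before measuring the gaps is what distinguishes the combined digest from the individual ones: the fragment sizes are not simply the union of the individual results.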

Data files

Notes

Several criteria may be set to control what sites are reported: -min, -max and -single (minimum or maximum number of cuts, or single site cuts only); -blunt (enzymes which cut at the same position on the forward and reverse strands); -sticky (enzymes which cut at different positions on the forward and reverse strands, leaving an overhang); -ambiguity (enzymes which have one or more N ambiguity codes in their pattern); -commercial (enzymes with a commercial supplier); and -plasmid (allows searches for restriction enzyme recognition sites and cut positions that span the end of the sequence).
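The way such criteria combine can be sketched in Python as a simple filter over a list of hits. This is only an illustration of the definitions given above (blunt means the forward and reverse cut offsets are equal, sticky means they differ). The EnzymeHit record, the select_hits function and the hit counts are hypothetical and do not reflect how restrict stores its data.

```python
from dataclasses import dataclass


@dataclass
class EnzymeHit:
    """Hypothetical record for one enzyme found in a sequence."""
    name: str
    forward_cut: int  # cut offset on the forward strand
    reverse_cut: int  # cut offset on the reverse strand
    n_cuts: int       # how many times the enzyme cuts the sequence


def is_blunt(hit):
    """Blunt cutters cut both strands at the same position."""
    return hit.forward_cut == hit.reverse_cut


def select_hits(hits, min_cuts=1, max_cuts=None, blunt=True, sticky=True):
    """Keep the hits that satisfy every criterion."""
    kept = []
    for hit in hits:
        if hit.n_cuts < min_cuts:
            continue
        if max_cuts is not None and hit.n_cuts > max_cuts:
            continue
        if is_blunt(hit) and not blunt:
            continue
        if not is_blunt(hit) and not sticky:
            continue
        kept.append(hit)
    return kept


# Cut offsets follow G^AATTC, CCC^GGG and A^AGCTT; hit counts are invented.
hits = [
    EnzymeHit("EcoRI", 1, 5, 2),
    EnzymeHit("SmaI", 3, 3, 1),
    EnzymeHit("HindIII", 1, 5, 7),
]

print([h.name for h in select_hits(hits, max_cuts=2, blunt=False)])
print([h.name for h in select_hits(hits, sticky=False)])
```

Requiring exactly one cut corresponds to setting both min_cuts and max_cuts to 1 in this sketch.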

By default, only one enzyme of any group of isoschizomers (enzymes that have the same recognition site and cut positions) is reported. This behaviour can be changed by specifying -nolimit, in which case all isoschizomers are reported. The default behaviour uses the representative enzyme of an isoschizomer group (the prototype), which is specified in the EMBOSS data file embossre.equ. This file is generated from the REBASE database by running rebaseextract. You may edit this file to set your own preferred prototype if you wish.
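The effect of reporting one enzyme per isoschizomer group can be sketched as follows. The sketch assumes the enzyme-to-prototype relationship is already available as a Python dictionary; it does not read embossre.equ, and the mapping shown is purely illustrative. The function name collapse_isoschizomers and its limit argument are hypothetical.

```python
def collapse_isoschizomers(enzyme_names, prototype_of, limit=True):
    """Report one enzyme per isoschizomer group.

    prototype_of maps an enzyme name to the name of its prototype;
    enzymes missing from the mapping stand for themselves. With
    limit=False every enzyme is kept unchanged.
    """
    if not limit:
        return list(enzyme_names)
    seen = set()
    kept = []
    for name in enzyme_names:
        prototype = prototype_of.get(name, name)
        if prototype not in seen:
            seen.add(prototype)
            kept.append(prototype)
    return kept


# Illustrative mapping only, not read from embossre.equ.
prototype_of = {"MspI": "HpaII"}
found = ["HpaII", "MspI", "EcoRI"]

print(collapse_isoschizomers(found, prototype_of))
print(collapse_isoschizomers(found, prototype_of, limit=False))
```

Changing the preferred prototype then amounts to changing the mapping, which is the role the text above gives to editing the data file.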

Output file size is related to the size of the recognition site and the maximum number of allowed cutting positions. Setting the site length to six and restricting the cuts to two is a common choice of parameters. The size of the output can sometimes be reduced by specifying the -noambiguity switch.

References

  1. Nucleic Acids Research 27: 312-313 (1999).

Warnings

restrict uses the EMBOSS REBASE restriction enzyme data files stored in directory data/REBASE/* under the EMBOSS installation directory. These files must first be set up using the program rebaseextract. Running rebaseextract may be the job of your system manager.

Diagnostic Error Messages

None.

Exit status

It exits with status 0 unless an error is reported.

Known bugs

None.

Author(s)

History

Target users

Comments