biosed

Function

Description

biosed is a simple sequence editing utility that searches for a target subsequence in one or more input sequences and replaces it with an insert subsequence, or optionally, just deletes the target subsequence where found. If the target subsequence occurs more than once, then each instance of the target is replaced or deleted.

The -position option allows a sequence position to be specified as an additional constraint for the match: a replacement or deletion only occurs if the match starts at the specified position.

Algorithm

The target subsequence is a short, literal sequence of characters. biosed cannot interpret any sort of ambiguity pattern, such as a regular expression. A simple string match is done between the target and the input sequences. If there is an exact match, then the replacement or deletion is done. The matching is case insensitive, independent of the case of both the input sequences and the target.
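The matching described above can be illustrated with a minimal Python sketch. It is not the biosed source code: the function name edit_sequence and its parameters are hypothetical, and it assumes that positions are counted from 1, that matches are taken left to right without overlapping, and that the replacement is uppercased along with the sequence so that the result is entirely uppercase.

```python
from typing import Optional


def edit_sequence(
    sequence: str,
    target: str,
    replacement: str = "",
    position: Optional[int] = None,
) -> str:
    """Replace each literal match of target, or delete it if replacement is empty.

    Hypothetical helper for illustration only. Assumptions: position is
    1-based, and None means "no position constraint".
    """
    if not target:
        raise ValueError("target must not be empty")

    # Case-insensitive matching: compare everything in uppercase.
    seq = sequence.upper()
    tgt = target.upper()
    rep = replacement.upper()  # assumption: output is uniformly uppercase

    pieces = []
    i = 0
    while i < len(seq):
        is_match = seq.startswith(tgt, i)  # plain literal comparison, no patterns
        allowed = position is None or position == i + 1
        if is_match and allowed:
            pieces.append(rep)   # an empty replacement deletes the target
            i += len(tgt)        # continue after the match (no overlaps)
        else:
            pieces.append(seq[i])
            i += 1
    return "".join(pieces)


print(edit_sequence("acgtacgt", "CG", "T"))              # ATTATT
print(edit_sequence("ACGTACGT", "cg"))                   # ATAT
print(edit_sequence("ACGTACGT", "CG", "T", position=6))  # ACGTATT
```

In the last call only the second occurrence is edited, because it is the only match that starts at position 6. With no position given, every occurrence is replaced or deleted.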

Usage

Command line arguments


Input file format

biosed reads one or more nucleic acid or protein sequences, specified by a USA (Uniform Sequence Address).

Output file format

The edited sequence is written to the output file.

The output sequence is in uppercase.

Data files

None.

Notes

biosed was inspired by the UNIX utility sed, which searches for a pattern in text and can replace or delete each match.

References

None.

Warnings

No check for the correct type (protein, nucleic, gapped, etc.) is made on the replacement sequence, so you must ensure that it is of the required type. Any text can be used, including characters only used in proteins (e.g. D, E, F, etc.), characters rarely used in proteins (e.g. U, J, O, etc.), digits and punctuation characters.

Diagnostic Error Messages

None.

Exit status

Author(s)

History

Target users

Comments