JAligner is a Java implementation of algorithms for pairwise alignment of biological sequences, e.g. Smith-Waterman and Needleman-Wunsch.
Currently, only the Smith-Waterman algorithm is implemented. It is extended to support affine gap penalties using Gotoh's improvement, which maintains the O(n^2) time complexity of the original Smith-Waterman algorithm.
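To illustrate the idea behind the Gotoh improvement, here is a minimal score-only sketch in Java. It is not JAligner's own code: the class name GotohSketch, the simple match/mismatch scoring (used in place of a real scoring matrix such as BLOSUM62) and the convention that a gap of length k costs open + (k - 1) * extend are all assumptions made for the example. Each cell is computed from a constant number of neighbours, which is what keeps the running time quadratic.

```java
import java.util.Arrays;

class GotohSketch {
    /**
     * Best local alignment score of a and b with affine gaps.
     * Assumed convention: a gap of length k costs open + (k - 1) * extend.
     */
    static float score(String a, String b, float match, float mismatch,
                       float open, float extend) {
        int m = b.length();
        float[] h = new float[m + 1]; // H values of the previous row, updated in place
        float[] e = new float[m + 1]; // best score ending in a vertical gap, per column
        Arrays.fill(e, Float.NEGATIVE_INFINITY);
        float best = 0f;
        for (int i = 1; i <= a.length(); i++) {
            float diag = h[0];                 // H[i-1][0], always 0
            float f = Float.NEGATIVE_INFINITY; // best score ending in a horizontal gap
            for (int j = 1; j <= m; j++) {
                // Extend an existing gap or open a new one from H
                e[j] = Math.max(e[j] - extend, h[j] - open);
                f = Math.max(f - extend, h[j - 1] - open);
                float sub = a.charAt(i - 1) == b.charAt(j - 1) ? match : mismatch;
                // Local alignment: scores never drop below zero
                float cur = Math.max(0f, Math.max(diag + sub, Math.max(e[j], f)));
                diag = h[j]; // keep H[i-1][j] as the next column's diagonal
                h[j] = cur;
                best = Math.max(best, cur);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(score("AAAAACCCCC", "AAAAAGCCCCC", 2f, -1f, 2f, 1f));
    }
}
```

The sketch returns only the score; recovering the alignment itself would need a traceback over the three states, which is omitted here.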
java -classpath jaligner.jar:matrices.jar gnu.bioinformatics.jaligner.SW s1.fasta s2.fasta -m matrix -o open -e extend -a -f jaligner.out
s1.fasta | the location of the 1st sequence in FASTA format |
s2.fasta | the location of the 2nd sequence in FASTA format |
-m matrix | the name of the scoring matrix; optional, the default value is BLOSUM62 |
-o open | the open gap penalty; optional, the default value is 10.0 |
-e extend | the extend gap penalty; optional, the default value is 0.5 |
-a | all the possible alignments with the same maximum score |
-f jaligner.out | the output file name; optional, the default is the standard output (the console) |
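The way the defaults in the table combine with the command-line arguments can be sketched as follows. This is only an illustration, not JAligner's actual argument handling: the class OptionsSketch and the map keys are hypothetical names, and an empty output value is used here to stand for the console.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OptionsSketch {
    /** Parses the arguments from the table, starting from the documented defaults. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("matrix", "BLOSUM62"); // default scoring matrix
        opts.put("open", "10.0");       // default open gap penalty
        opts.put("extend", "0.5");      // default extend gap penalty
        opts.put("all", "false");       // -a not given
        opts.put("output", "");         // empty means standard output (assumption)
        int positional = 0;
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "-m": opts.put("matrix", args[++i]); break;
                case "-o": opts.put("open", args[++i]); break;
                case "-e": opts.put("extend", args[++i]); break;
                case "-f": opts.put("output", args[++i]); break;
                case "-a": opts.put("all", "true"); break;
                default:
                    // The first two bare arguments are the FASTA file locations
                    if (positional == 0) {
                        opts.put("s1", args[i]);
                    } else {
                        opts.put("s2", args[i]);
                    }
                    positional++;
            }
        }
        return opts;
    }
}
```

The sketch does no validation; an option given without its value would fail with an ArrayIndexOutOfBoundsException.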
java -classpath jaligner.jar:matrices.jar gnu.bioinformatics.jaligner.SW s1.fasta s2.fasta -m BLOSUM90 -o 11 -e 1
java -jar jaligner.jar s1.fasta s2.fasta -m /home/ahmed/mymatrix
Currently, FASTA is the only supported input format.
Currently, Pair is the only supported output format.
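For readers unfamiliar with the FASTA input format, the sketch below shows one way to read the first record of a FASTA text in Java: a header line starting with ">" followed by one or more sequence lines. It is an illustration only, not JAligner's parser; the class name FastaSketch and the method parseFirst are hypothetical.

```java
class FastaSketch {
    /** Returns {header, sequence} for the first record in the given FASTA text. */
    static String[] parseFirst(String text) {
        String header = null;
        StringBuilder seq = new StringBuilder();
        for (String raw : text.split("\r?\n")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                if (header != null) {
                    break; // a second record starts here; stop
                }
                header = line.substring(1).trim();
            } else if (header != null) {
                seq.append(line); // sequence may span several lines
            }
        }
        if (header == null) {
            throw new IllegalArgumentException("no FASTA header line found");
        }
        return new String[] { header, seq.toString() };
    }
}
```

For example, parseFirst(">seq1 demo\nACGT\nTTGA\n") yields the header "seq1 demo" and the sequence "ACGTTTGA".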
If -jar is used on the command line (as in the 2nd example), the JVM disregards all user-defined class paths (including matrices.jar). In this case, the path to the scoring matrix has to be given with the -m option.
For large sequences (e.g. 10 KB), the JVM needs to reserve more heap. Pass the -Xms and -Xmx options to request it, for example java -Xms2048m -Xmx2048m
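A rough way to judge whether the heap is large enough is to compare the size of the alignment table with the maximum heap the JVM reports. The sketch below assumes a full (m + 1) x (n + 1) table with 4 bytes per cell; the source does not state JAligner's actual memory layout, so both the class HeapSketch and the bytes-per-cell figure are assumptions for illustration.

```java
class HeapSketch {
    /** Bytes needed for a full (len1 + 1) x (len2 + 1) table (assumed layout). */
    static long estimateBytes(long len1, long len2, int bytesPerCell) {
        return (len1 + 1) * (len2 + 1) * bytesPerCell;
    }

    public static void main(String[] args) {
        long needed = estimateBytes(10_000, 10_000, 4);
        long maxHeap = Runtime.getRuntime().maxMemory(); // upper bound set by -Xmx
        System.out.println("estimated table size: " + needed / (1024 * 1024) + " MB");
        System.out.println("maximum heap:         " + maxHeap / (1024 * 1024) + " MB");
        if (needed > maxHeap) {
            System.out.println("consider raising -Xmx");
        }
    }
}
```

Under these assumptions, two sequences of 10,000 residues each need roughly 400 MB for a single table, which is why a larger heap has to be requested explicitly.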
To use a user-defined scoring matrix (i.e. one not included in matrices.jar), the matrix name in the arguments should be the full path to the user-defined scoring matrix, e.g. /home/ahmed/mymatrix. In that case, it is not required to include matrices.jar in the classpath.