| Warning! Most multiple alignment programs will align either DNA or amino acid sequences. However, unless nucleic acid sequences are very closely related, with few gaps (e.g. tRNA or rRNA genes), a reliable multiple alignment (or, for that matter, even a pairwise alignment) is almost impossible. The reason is that nucleic acids use a 4-letter alphabet, so many equally good alignments can be found for a set of sequences. In contrast, the 20-letter amino acid alphabet drastically decreases the number of plausible alignments, so that an obvious 'correct' alignment can usually be found. |
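The effect of alphabet size can be illustrated with a small simulation. This is only a sketch: it assumes residues are drawn uniformly and independently, which real sequences are not, and the function name `chance_identity` is made up for this example. It estimates how often two unrelated positions match purely by chance.

```python
import random

def chance_identity(alphabet, length=10000, seed=0):
    """Fraction of matching positions between two random, unrelated sequences."""
    rng = random.Random(seed)
    a = [rng.choice(alphabet) for _ in range(length)]
    b = [rng.choice(alphabet) for _ in range(length)]
    return sum(x == y for x, y in zip(a, b)) / length

DNA = "ACGT"
PROTEIN = "ACDEFGHIKLMNPQRSTVWY"

dna_rate = chance_identity(DNA)          # expected to be close to 1/4
protein_rate = chance_identity(PROTEIN)  # expected to be close to 1/20
print(f"DNA: {dna_rate:.3f}  protein: {protein_rate:.3f}")
```

With roughly one position in four matching by chance, many different DNA alignments score about equally well, whereas chance matches between amino acid sequences are much rarer.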
Select all amino acid sequences and open the Alignment menu, which shows programs related to multiple sequence alignment.

Choose TCOFFEE, which brings up the following menu:

TCOFFEE sends the alignment to a new blprotein window.

| Warning! Do not use guide trees generated by CLUSTAL or other multiple alignment programs for any other purpose, e.g. phylogenetic analysis of sequence or species evolution. These trees are based on pairwise alignments, and therefore do not contain the evolutionary information found in the gaps that are present in the completed alignment. Once you have an alignment, you can then go back and construct a phylogenetic tree from it. |
Select all
sequences as above, and choose Alignment
--> DIALIGN-TX. One of the
main points of DIALIGN-TX is that normally, one should not have to set
any parameters. In fact, although a number of parameters can be set in
this program, it is often dangerous to change them. In no case should
you change the parameters without carefully reading the publications
on DIALIGN. So, in most cases, simply click on 'Run' and run the
program.
While the
alignments differ, it is hard to say whether one is "better" than the
other. Both programs align the motifs containing the Cys
residues. While TCOFFEE tends to insert leading gaps which prevent the
N-terminal Met residues from aligning, DIALIGN-TX does not insert
any gaps in the N-terminus. The order of sequences in the final
alignment is different, seeming to imply a disagreement on the presumed
phylogenetic relationships of the sequences. However, Neighbor Joining
trees constructed from these alignments are identical (not shown).
MRTRANS by Bill Pearson aligns DNA sequences using the corresponding
amino acid alignment as a guide. Thus, if you have aligned a set of
amino acid sequences, it is straightforward to generate the
corresponding DNA alignment. MRTRANS requires two files for input, in
Pearson/FASTA format. The first file contains the unaligned DNA
(protein-coding) sequences, and the second file contains the corresponding
amino acid sequences, aligned by a program such as TCOFFEE or CLUSTALX.
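The idea of using the protein alignment as a guide can be sketched in a few lines of Python. This is an illustration of the general technique, not MRTRANS's actual code. The function name `thread_codons` is hypothetical, and the sketch assumes the DNA starts in frame at the first codon and that '-' is the gap character.

```python
def thread_codons(aligned_protein, dna, gap="-"):
    """Insert gaps into `dna` so that each codon lines up with its amino acid."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(codons) < residues:
        raise ValueError("DNA is too short for the protein sequence")
    out = []
    next_codon = 0
    for aa in aligned_protein:
        if aa == gap:
            out.append(gap * 3)             # one protein gap = three DNA gaps
        else:
            out.append(codons[next_codon])  # copy the codon for this residue
            next_codon += 1
    return "".join(out)

aligned = thread_codons("M-KT", "ATGAAAACT")
print(aligned)
```

Because every gap in the protein alignment becomes exactly three gap characters in the DNA, the reading frame is preserved throughout the DNA alignment.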
| Notes: 1) MRTRANS needs the DNA and aligned amino acid sequences to have the same names. Names may be modified when coding sequences are extracted and translated, so it may be necessary to change names in 'File --> Get Info' before you export to a .wrp file. 2) Where two or more copies of a gene are present in a single entry (e.g. CAGTHIOGN:CDS1 and CAGTHIOGN:CDS2), it is necessary to give each a unique name so that MRTRANS can distinguish them. Since the CDS extensions will be removed when MRTRANS is run, one solution is to delete CDS but retain the number (e.g. CAGTHIOGN_1 and CAGTHIOGN_2). The current biolegato implementations of TCOFFEE and DIALIGN-TX usually handle these steps automatically. Nonetheless, if you are having problems with MRTRANS, make sure that the names of sequences are the same for both protein and nucleic acid sequences. |
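If MRTRANS fails, the names in the two FASTA files can be compared outside of biolegato. The following is a minimal sketch. The helper names (`fasta_names`, `unique_cds_name`, `name_mismatches`) are invented for this example, and the renaming rule simply follows the CAGTHIOGN_1 / CAGTHIOGN_2 suggestion in the note above.

```python
import re

def fasta_names(text):
    """Return the first word of every '>' header line in FASTA text."""
    names = []
    for line in text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            names.append(fields[0] if fields else "")
    return names

def unique_cds_name(name):
    """Drop 'CDS' but keep the copy number, e.g. X:CDS2 becomes X_2."""
    return re.sub(r":CDS(\d+)$", r"_\1", name)

def name_mismatches(dna_text, protein_text):
    """Names found only in the DNA file, and names found only in the protein file."""
    dna = set(fasta_names(dna_text))
    protein = set(fasta_names(protein_text))
    return sorted(dna - protein), sorted(protein - dna)

print(unique_cds_name("CAGTHIOGN:CDS1"))
print(name_mismatches(">a\nATG\n>b\nATG\n", ">a\nM\n>c\nM\n"))
```

Two empty lists from `name_mismatches` mean that every sequence has a partner with the same name in the other file.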
Continuing with our earlier example, open up defensin.CDS.gde
in bldna.

| Select all the sequences and choose 'File --> Export Foreign Format'. (Do NOT choose SaveAs.) Save the sequences in FASTA format, to the file defensin.CDS.fsa. |
| Next, go to the blprotein window containing the amino acid alignment produced earlier by TCOFFEE, and choose 'File --> Export Foreign Format' to save it as defensin.tcoffee.fsa. |
| To run MRTRANS, choose Alignment --> MRTRANS, and choose the unaligned DNA file and the aligned protein file. This step is easy to get wrong, so make sure that each filename is entered in the correct field. |
The aligned DNA sequences appear in a new bldna window:

| Hint: One of the most common errors is to switch the names of the Multiply-Aligned Protein File and the Unaligned DNA File, which will cause MRTRANS to fail. |
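One way to catch a swapped pair of files before running MRTRANS is to check which file actually looks like DNA. The sketch below is a rough heuristic, not part of MRTRANS or biolegato. The function name `looks_like_dna` and the 90% threshold are assumptions chosen for illustration.

```python
def looks_like_dna(fasta_text, threshold=0.9):
    """Guess whether FASTA text holds nucleotide sequences (heuristic only)."""
    letters = [
        c
        for line in fasta_text.splitlines()
        if not line.startswith(">")      # skip header lines
        for c in line.upper()
        if c.isalpha()                   # ignore gaps and whitespace
    ]
    if not letters:
        return False
    nucleotides = sum(c in "ACGTUN" for c in letters)
    return nucleotides / len(letters) >= threshold

print(looks_like_dna(">seq1\nATGGCTAAA\n"))        # DNA-like input
print(looks_like_dna(">seq1\nMARSIYFMA--FLV\n"))   # protein-like input
```

The unaligned DNA file should pass this test and the aligned protein file should not. If the results are the other way around, the two filenames have probably been switched.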
Alignments can be edited directly in biolegato instances. By default, only gaps can be edited out, although it is possible to delete amino acids or nucleotides if protections are changed (Edit --> GetInfo).
Sometimes, it is useful to set several sequences to act as a group.
For example, the blprotein window below shows several gap positions
between Thr and Ser (T---S) in
BOAJ5280:CDS1 and BOAJ5281:CDS1.

If we wanted to put the gap after the S, rather than between T and
S, we could do the following. Select
BOAJ5280:CDS1 and BOAJ5281:CDS1, and choose 'Edit --> Group'. The '1'
at the left of these sequences indicates that changes made to any of
these sequences will be made to all. Thus, the gap characters can be
deleted by clicking on the S in either of the selected sequences
and backspacing until the S is next to the T (TS).
Next, place the cursor to the right of the S and insert gaps to return
the downstream portion of these sequences to the original alignment (TS---).
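The net effect of this grouped edit (T---S becomes TS---) can be expressed as a small function applied to every sequence in the group. This is only a sketch of the outcome, not of how biolegato implements grouping. The function name `shift_gap_right` is hypothetical, and the short sequences are made-up stand-ins for the real ones.

```python
def shift_gap_right(seq, col, gap="-"):
    """Move the gap run starting at `col` to just after the residue that follows it."""
    end = col
    while end < len(seq) and seq[end] == gap:
        end += 1
    if end == col or end == len(seq):
        return seq  # no gap at `col`, or no residue to the right of the gap
    # residue first, then the gaps, so the downstream columns stay in place
    return seq[:col] + seq[end] + seq[col:end] + seq[end + 1:]

# Hypothetical fragments standing in for the two grouped sequences
group = {
    "BOAJ5280:CDS1": "GT---SK",
    "BOAJ5281:CDS1": "GT---SK",
}
col = 2  # index of the first gap character after the T
edited = {name: shift_gap_right(seq, col) for name, seq in group.items()}
print(edited)
```

The sequence length does not change, which is why the downstream portion of the alignment is restored once the gaps are re-inserted after the S.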
See $doc/GDE/GDE2.2_manual.pdf
for a more in-depth description of how to edit alignments.
In many cases you need an alignment displayed as editable text, for example to import it into a word processor or HTML editor for further modification, such as coloring or underlining certain characters. Choosing Alignment --> REFORM will print out an alignment in which amino acids matching the consensus are indicated by dots:
                  10        20        30        40        50        60        70
          Maxxxkxxa--xxxlxmxLxxatxxx-----xxxxxCxx--------------xsxxfkglcxsxxxCxx
AF112443_C..rsiyfm...flv.a.t.fv.ygvq.....gkeic.ke..............ltkpv.--.s.dpl.qk
AF128239_C..rsiyfm...flv.avt.fv.ngvq.....gqnni.kt..............t.kh.....fadsk.rk
BOAJ5280_C.kntv.lsligfvm.tvl.lge.via.....qkrkp.ys..............qepd--kt.evn-r.ka
BOAJ5281_C.kntv.lsligfvm.tvl.lge.via.....qkrkp.ys..............qepd--kt.evn-r.ka
CAGTHIOGN_..gfs.vi...tif.m.m.vf..gmv.....aeart.es..............q.hr.....f.ksn.gs
CAGTHIOGN_..gfs.vi...tif.m.m.vf..dmm.....aeaki.ea..............l.gn.....l.srd.gn
CAGT_CDS1 ..gfs.vv...tif.m.l.vf..dmm.....aeaki.ea..............l.gn.....l.srd.gn
GMU12150_C.srsvplvs..ticvlll.lv..emmgptmvaeart.es..............q.hr...p.l.dtn.gs
ZMA133530_.r-ivyma.v.-----.c.vl..mss.....tspsf.qaggcigcprappppsdetcyed.kc.asr.hl
Various features of the output can be changed in the REFORM menu. For example, REFORM can be set to print ALL amino acids at every position instead of dots.
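The two output styles can be imitated with a short Python function. This is a sketch inferred from the sample output, not REFORM's own code. The function name `reform_row` and its `show_all` flag are invented for illustration, and the comparison is assumed to be case-insensitive because the consensus contains capital letters.

```python
def reform_row(consensus, seq, show_all=False):
    """Format one aligned sequence relative to the consensus line."""
    out = []
    for c, s in zip(consensus, seq):
        if show_all:
            out.append(" " if s == "-" else s)   # print every residue, blank for gaps
        elif s.lower() == c.lower():
            out.append(".")                      # same as consensus (including gaps)
        else:
            out.append(s)                        # differs from consensus
    return "".join(out)

consensus = "Maxxxkxxa--xxxlxmxLxxatxxx"
sequence = "marsiyfma--flvlamtlfvaygvq"   # start of the first sequence, gaps as '-'

print(reform_row(consensus, sequence))
print(reform_row(consensus, sequence, show_all=True))
```

Note that a gap in the sequence is shown as a dot only where the consensus also has a gap. Where the consensus has a residue, the gap is still printed as '-'.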
REFORM output with ALL amino acids printed at every position:
                  10        20        30        40        50        60        70
          Maxxxkxxa  xxxlxmxLxxatxxx     xxxxxCxx              xsxxfkglcxsxxxCxx
AF112443_Cmarsiyfma  flvlamtlfvaygvq     gkeiccke              ltkpvk  cssdplcqk
AF128239_Cmarsiyfma  flvlavtlfvangvq     gqnnickt              tskhfkglcfadskcrk
BOAJ5280_Cmkntvklsligfvmltvlllgetvia     qkrkpcys              qepd  ktcevn rcka
BOAJ5281_Cmkntvklsligfvmltvlllgetvia     qkrkpcys              qepd  ktcevn rcka
CAGTHIOGN_magfskvia  tiflmmmlvfatgmv     aeartces              qshrfkglcfsksncgs
CAGTHIOGN_magfskvia  tiflmmmlvfatdmm     aeakicea              lsgnfkglclssrdcgn
CAGT_CDS1 magfskvva  tiflmmllvfatdmm     aeakicea              lsgnfkglclssrdcgn
GMU12150_Cmsrsvplvs  ticvlllllvatemmgptmvaeartces              qshrfkgpclsdtncgs
ZMA133530_mr ivymaav      mclvlatmss     tspsfcqaggcigcprappppsdetcyedlkcsasrchl

b) JALVIEW - Graphic display and alignment
Jalview is a feature-rich sequence alignment viewer. It can be
launched by selecting an alignment in bldna or blprotein, and choosing
'Alignment
--> Jalview'.

The alignment above is shown using one of several color schemes available. The Hydrophobicity color scheme shows hydrophobic residues in red, hydrophilic residues in blue, and residues of intermediate hydrophobicity in varying shades of purple.
The alignment can be written in paginated form, suitable for printing, by saving to a PostScript file. PostScript is an almost universal printer language understood by virtually all laser printers. An example of PostScript output can be seen in defensin.jalview.ps. Clicking on this link should launch ghostview or a similar PostScript viewer, which can print the file. The file could also be saved and printed to any laser printer (eg. lpr defensin.jalview.ps).
If your browser is not configured to launch a PostScript viewer, you can save the file, and convert it to PDF. Most Unix systems have the ps2pdf command:
ps2pdf defensin.jalview.ps
would create a file called defensin.jalview.pdf. Clicking on this link should launch a PDF viewer such as Adobe Acrobat or ggv.
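If several PostScript files need converting, the same ps2pdf call can be scripted. The following is a minimal sketch that assumes ps2pdf (part of Ghostscript) is installed and on the PATH. The helper names `ps2pdf_command` and `convert` are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path

def ps2pdf_command(ps_file):
    """Build the ps2pdf command line, naming the PDF after the PostScript file."""
    ps_path = Path(ps_file)
    return ["ps2pdf", str(ps_path), str(ps_path.with_suffix(".pdf"))]

def convert(ps_file):
    """Run ps2pdf on one file; raises an error if ps2pdf is missing or fails."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf not found on PATH")
    subprocess.run(ps2pdf_command(ps_file), check=True)

print(ps2pdf_command("defensin.jalview.ps"))
```

Calling `convert("defensin.jalview.ps")` would then write defensin.jalview.pdf in the same directory as the PostScript file.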
Jalview can also do complete alignments from unaligned sequences.
For a full description of the capabilities of Jalview, see $doc/jalview/contents.html.
| Note to VNC users: Jalview, like many Java applications, has color usage issues. For example, running vncserver at 16-bit color depth (eg vncserver -depth 16) will cause the alignment window to appear completely black. Running vncserver at the default color depth of 8 seems to work (eg vncserver) |