| Change Begin, End Points |
| -b {i#} , -e {i# | 0*} |
|
These flags select the beginning (-b) or end (-e) of a subsequence to be extracted and analysed from
a larger sequence. -b defaults to 1; -e defaults to the end of the sequence (which can be specified
explicitly as '0').
|
| Interactions: In the Linear Map output, the
upper label indicates numbering from the beginning of the subsequence; the lower label indicates numbering
from the beginning of the entire sequence. This can be further complicated if you use the
--numstart option, which forces the numbering scheme you choose onto the
linear map. |
| Warnings: The SMALLEST SEQUENCE that tacg can handle is 4 bases (10
for the ladder map (-l)). This allows analysis of primers and linkers. |
|
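The relationship between the two numbering schemes can be illustrated with a minimal Python sketch. It is not tacg's code: the function names are hypothetical, and it assumes that -b and -e are 1-based and inclusive and that an end value of 0 means "to the end of the sequence".

```python
def extract_subsequence(seq, begin=1, end=0):
    """Return (subsequence, offset) for 1-based, inclusive begin/end.

    end=0 is taken to mean 'the end of the sequence' (assumption based on
    the -e default described above).
    """
    if end == 0:
        end = len(seq)
    if not (1 <= begin <= end <= len(seq)):
        raise ValueError("begin/end out of range")
    return seq[begin - 1:end], begin - 1


def to_full_coordinate(sub_pos, offset):
    """Map a 1-based subsequence position to its 1-based full-sequence position."""
    return sub_pos + offset


full = "aaaaacccccgggggttttt"  # 20 bases
sub, offset = extract_subsequence(full, begin=6, end=15)
print(sub)                            # the subsequence that would be analysed
print(to_full_coordinate(1, offset))  # same base, numbered from the full sequence
```

Here the upper label of a linear map would correspond to `sub_pos` and the lower label to `to_full_coordinate(sub_pos, offset)`.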
| Set Topology to Linear or Circular |
| -f {0 | 1*} |
| This flag sets the form or topology of the Nucleic Acid.
Linear is assumed unless otherwise specified. |
| Interactions: If circular topology is specified, patterns will be matched
across the origin as long as the pattern isn't longer than BASE_OVERLAP, set in tacg.h (30 as
distributed). The number and size of fragments will be adjusted to account for the topology in both the
Fragments Table and the Gel Map. If either the
--ps or the --pdf flag is used, -f is set to
circular. |
| Warnings: If topology is set to circular, Translation and ORFs will not
be tracked accurately across the origin, so if you suspect that a translation or ORF of interest spans
the origin, change the origin and try again.
|
|
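One common way to match patterns across the origin of a circular sequence is to append a copy of its first few bases before searching. The Python sketch below shows that general technique only; it is not taken from tacg's source. `find_circular` is a hypothetical name, and `BASE_OVERLAP` simply reuses the value quoted above.

```python
BASE_OVERLAP = 30  # value quoted from tacg.h in the text; used here for illustration only


def find_circular(seq, pattern, overlap=BASE_OVERLAP):
    """Return 0-based start positions of a literal pattern in a circular sequence."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    if len(pattern) > overlap:
        raise ValueError("pattern is longer than the overlap")
    # Append the first bases so that matches wrapping past the end become visible.
    wrap = min(len(pattern) - 1, len(seq))
    extended = seq + seq[:wrap]
    hits = []
    pos = extended.find(pattern)
    while pos != -1 and pos < len(seq):
        hits.append(pos)
        pos = extended.find(pattern, pos + 1)
    return hits


# 'gaattc' is split across the origin: the sequence ends in 'ga' and starts with 'attc'.
print(find_circular("attcccccccga", "gaattc"))  # [10]
```

A plain linear search of the same string finds nothing, which is the difference the topology setting makes.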
| Change the output width |
| -w {i# | 1} |
| -w sets the output width in bp. The value must be between 60* and 210 and is truncated
to a number exactly divisible by 15 ('-w 100' will be interpreted as '-w 90'); the actual printed output will be about
20 characters wider because of numbering and other labels. The setting also applies to the output of the linear, ladder and gel
maps, so if you want more detail and your output device can display small fonts, you may
want to use this flag to widen the output. |
| Interactions |
| Warnings: If you want as much output on one line as possible for
external parsing/analysis, specify -w 1, which will print the output in 1 line, so that it might
be easier to search with an external tool such as the grep family. |
|
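The truncation rule can be sketched in a few lines of Python. `effective_width` is a hypothetical helper, and rejecting values outside the stated range is an assumption, since the text does not say what tacg does with them.

```python
def effective_width(requested):
    """Return the width in bp that would be used, or None for single-line output."""
    if requested == 1:
        return None  # '-w 1': everything on one line, as described in the Warnings
    if not 60 <= requested <= 210:
        # Assumption: out-of-range values are rejected; the text only states the range.
        raise ValueError("width must be 1 or between 60 and 210")
    return (requested // 15) * 15  # truncate to a multiple of 15


for w in (60, 100, 210):
    print(w, "->", effective_width(w))
```

With this rule, 100 becomes 90, matching the '-w 100' example above, while 60 and 210 are already multiples of 15 and pass through unchanged.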
| Identify Sequences Only |
| -i --idonly {0|1*|2} |
Reduces output for sequences that have no hits when scanning multiple
sequence files.
- 0 - ID line and normal output printed regardless of hits
- 1 - (default) ID line and normal output are printed ONLY IF there are hits.
- 2 - ONLY ID line is printed if there are hits.
|
| Interactions |
| Warnings |
|
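The three --idonly settings amount to a small decision table. The Python sketch below restates the list above; `what_to_print` is a hypothetical name, not a tacg function.

```python
def what_to_print(idonly, hits):
    """Return (print_id_line, print_normal_output) for a sequence with `hits` matches."""
    if idonly == 0:
        return True, True          # always print both
    if idonly == 1:
        return hits > 0, hits > 0  # default: print both only when there are hits
    if idonly == 2:
        return hits > 0, False     # ID line only, and only when there are hits
    raise ValueError("idonly must be 0, 1 or 2")


for mode in (0, 1, 2):
    print(mode, what_to_print(mode, hits=0), what_to_print(mode, hits=3))
```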
| Force raw file read |
| --raw |
| Tells tacg to consider ALL input as valid sequence (as with version 2)
instead of using SEQIO to parse the input as a standard sequence format. Useful for analyzing file
fragments or editor buffers, which may not be in a valid format. |
| Interactions |
| Warnings: Specifying
this flag tells tacg to treat all headers, comments, etc. as sequence if it
encounters them and if the characters are valid IUPAC codes. ALL IUPAC degeneracies will be
analyzed. |
|
| Set Degeneracy Handling |
| -D {0-4} |
- 0 FORCES exclusion of degens in seq; only 'acgtu' accepted; much like
- 1 [default] cut as NONdegen unless degen's found; then cut as '-D3'
- 2 degen's OK; ignore in KEY, but match outside of KEY
- 3 degen's OK; expand in KEY, find only EXACT matches
- 4 degen's OK; expand in KEY, find ALL POSSIBLE matches
where KEY is the central hexamer under consideration.
|
| Interactions |
| Warnings: Using -D 0 will silently strip all
degeneracies, which may not be what you want.
-D 4 will result in a very large number of hits as it will match all possible
degeneracies with all possible patterns. If there are enough hits in a small region, it may overflow
some formatting buffers, but this should be caught by the program.
|
|
| Extract Sequences around match |
| -X, --extract {b#,e#,[0|1]} |
| eXtracts the sequence around the pattern matched,
from b# bases preceding to e# bases following
the MIDDLE of the pattern (if an IUPAC pattern) or the START of
the pattern (if a regular expression). If the pattern is found in the bottom strand AND the
last field = 1, the extracted sequence is reverse-complemented before it is written out, so that all
patterns are in the same orientation; if the last field = 0, it is NOT reverse-complemented. In either case,
the sequences are FASTA-formatted on output, so they are ready to be fed to a multiple alignment
program such as ClustalX.
|
| Interactions |
| Warnings: Don't forget that IUPAC and regex patterns are extracted
relative to different positions (the middle and the start of the pattern, respectively), so if you mix
them, they won't line up if you then try to recombine them. |
|
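The extraction step can be pictured with the following Python sketch. It illustrates the behaviour described above and is not tacg's implementation: the function names are hypothetical, the anchor is assumed to be a 0-based index already chosen by the caller (middle or start of the match), windows are clipped at the ends of the sequence, and the FASTA header is made up.

```python
# Complement table for lower-case IUPAC nucleotide codes.
_COMPLEMENT = str.maketrans("acgtrymkbdhvswn", "tgcayrkmvhdbswn")


def reverse_complement(seq):
    """Reverse-complement a nucleotide string (IUPAC codes, returned in lower case)."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def extract_window(seq, anchor, before, after, bottom_strand=False, revcomp=1):
    """Return the bases from `before` preceding to `after` following the anchor."""
    start = max(0, anchor - before)      # assumption: clip at the sequence start
    end = min(len(seq), anchor + after)  # assumption: clip at the sequence end
    window = seq[start:end]
    if bottom_strand and revcomp == 1:
        window = reverse_complement(window)  # put all hits in the same orientation
    return window


def as_fasta(name, seq, width=60):
    """Format one record as FASTA with fixed-width sequence lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines)


seq = "ccccgaattcgggg"
print(as_fasta("hit_1", extract_window(seq, anchor=7, before=5, after=5)))
```

Because the anchor differs between IUPAC and regex patterns, the same `before` and `after` values give windows that are shifted relative to each other, which is the misalignment the warning describes.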
| Select enzymes by magnitude of recognition site. The minimum
magnitude is 3 (selects all); 4 selects magnitudes 4, 5, 6...; 5 selects 5, 6, 7, 8...; etc. ACGTU have a magnitude of 1 each, YRWSMK
have a magnitude of 1/2 each, BDHV have a magnitude of 1/4 each, and N has a magnitude of 0
(doesn't count), e.g.: ttca=4, tgyrca=5,
tgcnnngca=6, etc. This flag filters on the
tannnnnnnnnnta=4 |
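The magnitude arithmetic can be checked with a short Python sketch. The weights come straight from the description above; the names `MAGNITUDE`, `magnitude` and `passes_minimum` are hypothetical, and treating the filter as "magnitude at least the given value" is an assumption drawn from the "4 = 4,5,6..." wording.

```python
# Per-base weights as described in the text.
MAGNITUDE = {}
MAGNITUDE.update({base: 1.0 for base in "acgtu"})    # unambiguous bases
MAGNITUDE.update({base: 0.5 for base in "yrwsmk"})   # two-fold degenerate
MAGNITUDE.update({base: 0.25 for base in "bdhv"})    # three-fold degenerate
MAGNITUDE["n"] = 0.0                                 # any base: does not count


def magnitude(site):
    """Sum the per-base weights of a recognition site."""
    return sum(MAGNITUDE[base] for base in site.lower())


def passes_minimum(site, minimum):
    """Assumed filter: keep sites whose magnitude is at least `minimum`."""
    return magnitude(site) >= minimum


for site in ("ttca", "tgyrca", "tgcnnngca", "tannnnnnnnnnta"):
    print(site, magnitude(site))
```

The last site is 14 bases long but has a magnitude of only 4, because its ten n's contribute nothing.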