
NCLEVER(1)                 OGMP SEQUENCE UTILITY                NCLEVER(1)


NAME
    nclever - "Network Command-Line Entrez VERsion"

REVISION
    This documentation refers to nclever version 4.0

SYNOPSIS
    nclever [-b] [-e]

DESCRIPTION
    nclever is a tty-based version of NCBI's Entrez program. It is an
    interactive tool that allows easy browsing of the Entrez database.
    For more information about Entrez, see the Entrez manual, write to
    entrez@ncbi.nih.nlm.gov, or better yet access their web site at

        http://www.ncbi.nlm.nih.gov/Entrez/

    The original Entrez Browser program written by NCBI is a tool that
    uses windows, menus, and a pointing device; since not everyone has
    computers or terminals with graphics capabilities, nclever was
    written to do the same work using only text input/output.
    nclever can do almost everything the browser version can do, and some
    more. See the section called NCLEVER AND THE ENTREZ BROWSER for a more
    complete comparison between these two Entrez database access tools.

    In addition, the nclever program permits BATCH access to the Entrez
    databases. Thus by use of script files, nclever can be made to perform
    queries in batch mode.  In this way nclever can be used as a "search
    engine" for any application which has as its input a set of database
    queries in nclever format and can use as output any of the data in the
    Entrez databases (in any of the various formats supported by nclever).

    nclever's user interface is command-line based (thus its name). The
    user is presented with a prompt at which he/she types a command. Some
    commands perform searches in the Entrez databases, while others set
    options, display records, or save information in files.

THE "-b" SWITCH
    The "-b" (for "batch") command-line switch globally affects the way
    warning messages (and other outputs of some commands) appear; when
    supplied it turns OFF an internal parameter called VerboseMode.
    It can also be toggled on and off with the Option command. See the
    description for this command and the section USING NCLEVER IN BATCH
    MODE. Two very noticeable effects of turning off VerboseMode are that
    the information normally printed at startup is suppressed, and that
    the interactive prompt is completely absent!

THE MAIN LISTS
    The program manages two main lists. The first one is a list of
    terms, supplied by the user. It is the equivalent of the "Query
    Refinement" subwindow in the Entrez Browser. Each term is a search
    string used to query one of Entrez's indexes. They can be grouped
    together to make conjunctive or disjunctive queries. nclever
    computes the other list, called the current documents list, from
    the list of terms, every time the list of terms is changed. When
    doing neighboring or lookups, the current list of documents can
    become quite different from what is specified in the list of terms.
    The two lists have their own "current database" associated
    with them. The term list database is changed with the database
    command, while the current documents database is usually the same
    as the term list database except when doing neighborings, when
    it may change.

COMMANDS
    Commands are described here; they are entered after nclever's prompt,
    which is "NCLEVER> ". Arguments to commands are separated by white spaces
    (that is, blanks or tabs) and sometimes can also be separated by commas.
    All commands can be abbreviated to the minimum number of characters
    necessary to resolve ambiguity with other commands. They are not case-
    sensitive.
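
    The abbreviation rule can be pictured with a short Python sketch.
    This is an illustration only, not nclever's actual parser: the
    helper name resolve_command is hypothetical, the command list is a
    partial one taken from this manual, and letting an exact match win
    over longer names (e.g. "Save" versus "Saveuids") is an assumption.

```python
# Illustrative only: resolve a case-insensitive command abbreviation.
# COMMANDS is a partial list taken from this manual; resolve_command is
# a hypothetical helper, not part of nclever.
COMMANDS = ["About", "Accession", "Article", "Author", "Database",
            "Date", "List", "Search", "Status", "Save", "Saveuids"]


def resolve_command(word, commands=COMMANDS):
    """Return the unique command that `word` abbreviates, else raise ValueError."""
    lowered = word.lower()
    # Assumption: an exact match wins even if it is a prefix of another name.
    for name in commands:
        if name.lower() == lowered:
            return name
    matches = [name for name in commands if name.lower().startswith(lowered)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown command: {word}")
    raise ValueError(f"ambiguous command {word!r}: {', '.join(matches)}")
```

    With this list, resolve_command("AU") yields "Author", while
    resolve_command("a") is rejected as ambiguous.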

    INFORMATION COMMANDS

    About: Synopsis: "About"
        Displays the version and authorship of the program; an abridged
        version of this message is also displayed once at the beginning of
        each invocation of nclever if VerboseMode is true.

    Help,Man,?: Synopsis: "Help [command]", "Man [command]", "? [command]"
        Without arguments, displays a list of all commands with a
        short description of each. If the name of a command is supplied
        as argument, displays more information about that command.

    Status: Synopsis: "Status"
        Reports miscellaneous Entrez database information.

    Info: Synopsis: "Info"
        Displays the list of available Entrez search fields in tabular
        format. Each field has
            1) a text tag (shown in square brackets)
            2) a descriptive name
            3) three flags indicating whether or not this field
               can be used to query the three databases:
                   M = Medline
                   P = Protein
                   N = Nucleotide
        This command is particularly useful in conjunction with the
        "Search" query command, which needs the text tags in its
        search expression. See the description of this command
        below, in the section QUERY COMMANDS.

    CONFIGURATION COMMANDS

    Option: Synopsis: "Option [[no|!]optname] [[no|!]optname...]"
        This command sets/resets options that change the behavior
        of other commands and of the program generally. With no arguments,
        it displays the current values of all options. There are two
        kinds of options: boolean and integer. Boolean options are set
        to TRUE simply by supplying their names as argument to the
        command, and set to FALSE by prefixing them with the letters
        "no" or by a "!". For example,

            Option MultipleMode

        sets the option MultipleMode to TRUE while

            Option noMultipleMode

        sets it to false. There is currently only one integer option,
        and it is called CharsPerLine. It is set to a value by supplying
        a number after the option name, as in:

            Option CharsPerLine 80

        Many options can be set/reset on the same command-line; they are
        not case-sensitive and they can all be abbreviated to the minimum
        number of characters necessary to distinguish between them. All
        options have default values, which can be saved to the user's
        nclever configuration file (see the section THE NCLEVER CONFIGURATION
        FILE).

        Here is a description of all possible options:

            Option [no]MEDAbstract
            Option [no]MEDGenes
            Option [no]MEDMesh
            Option [no]MEDSubstances

        These four options selectively enable/disable displaying parts
        of the MEDLINE records when the REPORT format is chosen (see
        the ARTICLE command).

            Option CharsPerLine <number>

        This option tells nclever to use a display width of <number>
        characters. It affects the display of sequence records in
        "Features" format, and the output of the ABOUT command.

            Option [no]ParentsPersist

        This option tells nclever to always include a copy of the records
        that were selected for neighboring when returning a list of
        their neighbors. When listing the current documents list, a "*"
        is shown beside the parent documents.

            Option [no]MultipleMode

        This option affects all search commands. When set to TRUE, the
        search commands will parse their arguments, separating them at
        white spaces, and making a query for each of them. When set
        to FALSE, everything after the search command is considered
        part of the query, INCLUDING the white spaces. Therefore,
        when set to TRUE, a query of the form

            Author Struhl K

        will try to search for "Struhl" and then for "K", which might
        not be what the user wanted; setting NOMultipleMode will
        look for "Struhl K".

            Option [no]TruncationMode

        This option affects all search commands. When set to TRUE, all
        queries will be made in Truncation Mode, that is, the search
        string will be interpreted as a prefix of what is looked for.
        For example, a query on the word "cox" will effectively match
        "cox1", "cox2", etc. In that case, reported entries are shown
        with a "..." appended to the search string. When set to FALSE,
        an exact match between the search string and the indexed terms
        of Entrez is expected.

        SPECIAL NOTE, January 1999: even though Truncation Mode is
        correctly implemented in nclever, it doesn't work because of
        unfinished upgrades at NCBI's servers. It is recommended NOT
        to use Truncation Mode in day to day business, but to try it
        anyway about once a month. It may suddenly start working late
        in 1999 or in early 2000.

            Option [no]AllowNull

        This option affects all search commands. When set to TRUE,
        all queries that return an empty list of documents will
        still create an entry in the current list of terms. This
        behavior is useful when using NCLEVER in batch mode; when
        using it interactively this option is better set to FALSE.
        The default is FALSE.

            Option [no]VerboseMode

        This option affects many commands. It basically toggles
        the displaying/nondisplaying of warning messages. Usually
        it is set to TRUE for interactive query of the database,
        and set to FALSE on invocation (using the -b command-line option)
        when nclever is used in BATCH mode, doing automatic retrieval
        from the Entrez database under script control. In that case,
        nclever should produce only useful information when the script
        is run, so the display of the prompt and the initial
        welcome message is also disabled. See the section USING
        NCLEVER IN BATCH MODE for more information.

            Option Save

        This is not really an option. This tells nclever to write out
        the current setting of all the options to nclever's configuration
        file. See the section THE NCLEVER CONFIGURATION FILE for more
        information.

        OPTIONS DEFAULTS:

                Medline Report display options:
                    - MedAbstracts   = TRUE
                    - MedMesh        = TRUE
                    - MedGenes       = TRUE
                    - MedSubstances  = TRUE

                Miscellaneous options:
                    - CharsPerLine   = 80
                    - ParentsPersist = FALSE
                    - MultipleMode   = TRUE
                    - TruncationMode = FALSE
                    - AllowNull      = FALSE
                    - VerboseMode    = TRUE
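
        As a rough illustration of the argument rules described for the
        Option command, the following Python sketch interprets an
        argument list. It is not nclever's own code: parse_options and
        _lookup are hypothetical names, and the way abbreviations
        combine with the "no" prefix is an assumption.

```python
# Hypothetical sketch of how an "Option" argument list could be interpreted.
# Option names come from this manual; the parsing details are assumptions.
BOOLEAN_OPTIONS = ["MEDAbstract", "MEDGenes", "MEDMesh", "MEDSubstances",
                   "ParentsPersist", "MultipleMode", "TruncationMode",
                   "AllowNull", "VerboseMode"]
INTEGER_OPTIONS = ["CharsPerLine"]


def _lookup(prefix, names):
    """Return the single name starting with `prefix` (case-insensitive), or None."""
    hits = [n for n in names if n.lower().startswith(prefix.lower())]
    return hits[0] if len(hits) == 1 else None


def parse_options(args):
    """Turn e.g. ["noMultipleMode", "CharsPerLine", "80"] into a settings dict."""
    settings = {}
    i = 0
    while i < len(args):
        word = args[i]
        # Integer option: a name followed by a number.
        int_name = _lookup(word, INTEGER_OPTIONS)
        if int_name and i + 1 < len(args) and args[i + 1].isdigit():
            settings[int_name] = int(args[i + 1])
            i += 2
            continue
        # Boolean option: bare name means TRUE, "no" or "!" prefix means FALSE.
        value = True
        if word.startswith("!"):
            word, value = word[1:], False
        elif (word.lower().startswith("no") and len(word) > 2
              and _lookup(word[2:], BOOLEAN_OPTIONS)):
            word, value = word[2:], False
        name = _lookup(word, BOOLEAN_OPTIONS)
        if name is None:
            raise ValueError(f"unknown or ambiguous option: {args[i]}")
        settings[name] = value
        i += 1
    return settings
```

        For example, parse_options(["noMultipleMode", "CharsPerLine",
        "80"]) gives MultipleMode = False and CharsPerLine = 80.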

    Database: Synopsis: "Database [database name]"
        This command sets the current terms lookup database to
        either "medline", "protein" or "nucleotide". Note that
        changing database implies doing a RESET of the current search
        environment: all searched terms are cleared and the current
        document list and its neighboring history too. The default
        database is Medline. Without arguments, shows the current
        list of terms database.

    Article: Synopsis: "Article [article format]"
        This command sets the format in which medline articles are
        displayed. If no arguments are supplied, it shows what format is
        currently chosen. Possible formats can be shown by supplying
        a question mark to the command (or anything else that is not a
        legal article format).

    Report: Synopsis: "Report [sequence report format]"
        This command sets the format in which sequence records are
        displayed. If no arguments are supplied, it shows what format is
        currently chosen. Possible formats can be shown by supplying
        a question mark to the command (or anything else that is not a
        legal sequence format).

    Class: Synopsis: "Class [sequence level]"
        This command tells nclever what level of complexity of the
        sequence to display. Possible levels are NucProt, SegSet and
        BioSeq, according to NCBI's internal data structures for
        representing sets of sequences. If no arguments are supplied,
        it shows what level is currently chosen. Possible levels can
        be shown by supplying a question mark to the command (or anything
        else that is not one of the three keywords just mentioned).

    QUERY COMMANDS

    Synopsis: "<Command> <term1> [term2] [term3]..." (MultipleMode=TRUE)
              "<Command> <term>"                     (MultipleMode=FALSE)

    All query commands do the same thing: they search for one or more terms
    in the indexes of the Entrez databases and, if the operation is
    successful, add each term to the current list of terms, at which point
    the terms can be grouped together, excluded, etc. There are sixteen
    query commands that each search a single Entrez field (Date and Pdate
    are synonyms), and a more general query command called "Search"
    (described below) which can be used for more sophisticated boolean
    queries.

    There are two options that affect searching (see also the Option
    command). MultipleMode forces the query commands to make a query
    for each space-separated argument given to them; when MultipleMode
    is FALSE, spaces become significant in queries. TruncationMode allows
    queries to be made on prefixes of indexed terms, for example, a query
    on the word "cox" will effectively match "cox1", "cox2", etc. 
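
    The effect of these two options can be sketched in Python. The
    functions below are hypothetical illustrations, not nclever code,
    and the index is a toy list; how the real Entrez indexes compare
    strings (for example with respect to letter case) is not modelled.

```python
def split_terms(argument_text, multiple_mode=True):
    """Split the text following a query command into search terms."""
    if multiple_mode:
        # One query per whitespace-separated word.
        return argument_text.split()
    # Spaces are significant: the whole text is a single query.
    stripped = argument_text.strip()
    return [stripped] if stripped else []


def matching_index_terms(query, index_terms, truncation_mode=False):
    """Return the toy index terms matched by `query`."""
    if truncation_mode:
        # The query is treated as a prefix of the indexed terms.
        return [term for term in index_terms if term.startswith(query)]
    # Otherwise an exact match is required.
    return [term for term in index_terms if term == query]
```

    With MultipleMode TRUE, split_terms("Struhl K") produces two
    queries, "Struhl" and "K"; with multiple_mode=False it produces the
    single query "Struhl K".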

    Some query commands apply to only some of the three Entrez databases.
    The "Info" command can show you which field is available for each
    database.

    The query commands are:

       Accession  - Select documents by accession number
       Author     - Select documents by author
       Date       - Select documents by publication date (same as Pdate)
       Pdate      - Select documents by publication date (same as Date)
       Edate      - Select documents by Entrez date
       Mdate      - Select documents by modification date
       ECnumber   - Select documents by E. C. Number
       Gene       - Select documents by gene name
       Journal    - Select documents by journal title
       Keyword    - Select documents by keyword
       Mesh       - Select documents by MESH terms
       Organism   - Select documents by organism name
       Pname      - Select documents by protein name
       Substance  - Select documents by substance name
       Text       - Select documents by text words (titles + abstracts)
       Title      - Select documents by title words

    The index of the date command contains years like "1968",
    combinations of years and months like "1995/01", and combinations
    of years, months and dates like "1995/01/23".
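
    A script that generates date queries may want to check its terms
    against these three shapes first. The Python sketch below does so;
    parse_date_term is a hypothetical helper, and requiring a four-digit
    year and zero-padded two-digit month and day is an assumption based
    only on the examples above.

```python
import re

# Year, optionally followed by /month, optionally followed by /day,
# as in "1968", "1995/01" and "1995/01/23".
DATE_TERM = re.compile(r"(\d{4})(?:/(\d{2})(?:/(\d{2}))?)?")


def parse_date_term(term):
    """Return (year, month, day) with None for absent parts, or raise ValueError."""
    match = DATE_TERM.fullmatch(term)
    if match is None:
        raise ValueError(f"not a date term: {term!r}")
    year, month, day = (int(g) if g is not None else None for g in match.groups())
    if month is not None and not 1 <= month <= 12:
        raise ValueError(f"month out of range in {term!r}")
    if day is not None and not 1 <= day <= 31:
        raise ValueError(f"day out of range in {term!r}")
    return (year, month, day)
```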

       Search     - Select documents using an Entrez search expression

    The "Search" query command allows the user to enter his or her own
    Entrez query expression explicitly. These expressions are made up
    of query terms, field tags, parentheses and boolean operators.
    Query terms must be surrounded by double quotes; field tags are
    surrounded by square brackets (see also the Info command); the
    available boolean operators are "&" (and), "|" (or) and "-" (butnot).
    This query command is useful for searching Entrez with fields which
    do not have a corresponding search command in the list above. For
    example, to search medline by page numbers the user can query with

        Search "293-295" [PAGE]

    A more complex example using the "&" boolean operator:

        Search  "rioux" [AUTH] & "littlejohn" [AUTH]

    This last query is exactly equivalent to issuing two separate
    queries with the "Author" query command.
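
    When "Search" lines are generated by a program, a few small helpers
    keep the quoting and tagging consistent. The Python sketch below is
    one possible way to do this; the function names are hypothetical,
    and since this manual does not describe operator precedence or how
    to escape a double quote inside a term, the sketch refuses embedded
    quotes and offers explicit parentheses instead.

```python
def quoted_term(text, field):
    """Format one query term, e.g. quoted_term("rioux", "AUTH")."""
    if '"' in text:
        raise ValueError("this sketch does not handle embedded double quotes")
    return f'"{text}" [{field}]'


def combine(operator, *expressions):
    """Join expressions with "&" (and), "|" (or) or "-" (butnot)."""
    if operator not in ("&", "|", "-"):
        raise ValueError(f"unknown boolean operator: {operator}")
    if len(expressions) < 2:
        raise ValueError("need at least two expressions to combine")
    return f" {operator} ".join(expressions)


def group(expression):
    """Wrap an expression in parentheses to make the grouping explicit."""
    return f"({expression})"


def search_command(expression):
    """Return the full command line to send to nclever."""
    return f"Search {expression}"
```

    For instance, search_command(combine("&", quoted_term("rioux",
    "AUTH"), quoted_term("littlejohn", "AUTH"))) rebuilds the two-author
    example shown above.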

    When a term has been added to the list of terms, the current
    document list is updated, and its associated database is
    set to the term list's database.

    NEIGHBORING COMMANDS:

    Synopsis: "<Command> <num1> [num2] [num3]"

    The commands "Neighbors", "Medline", "Protein" and "Nucleic" are
    used to do neighboring and lookup (see below) searches. When supplied
    with a list of numbers corresponding to documents in the current
    documents list, they retrieve the set of "similar" documents
    (precomputed in the Entrez database; see the Entrez documentation for
    how these indexes are built). This set then becomes the new current
    list of documents. "Neighbors" are similar records in the same database
    as the current list. The other commands ("Medline", "Protein" and
    "Nucleic") specify another database (or the same) in a more explicit
    manner (see LOOKUPS below). If the ParentsPersist option is TRUE,
    the documents used for neighboring will be included at the
    top of the new list, and marked with an "*" when listing it. Special
    recognised arguments are ALL for "all documents" and PARENTS for
    "parent documents" (they can be abbreviated to "A" and "P").

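    The argument forms accepted by the neighboring commands (numbers,
    ALL and PARENTS) can be modelled with the Python sketch below. It
    is a hypothetical illustration: select_documents is not an nclever
    function, and accepting the keywords in any letter case is an
    assumption.

```python
def select_documents(args, document_count, parent_positions=()):
    """Expand arguments such as ["1", "3"], ["ALL"] or ["P"] into 1-based positions."""
    selected = []
    for arg in args:
        key = arg.upper()
        if key in ("ALL", "A"):
            # Every document in the current documents list.
            selected.extend(range(1, document_count + 1))
        elif key in ("PARENTS", "P"):
            # Only the documents previously used for neighboring.
            selected.extend(parent_positions)
        elif arg.isdigit() and 1 <= int(arg) <= document_count:
            selected.append(int(arg))
        else:
            raise ValueError(f"bad document selector: {arg}")
    return selected
```
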
    LOOKUPS
    
    Unlike the Entrez Browser, nclever does not have a lookup command.
    Instead, lookups are performed by specifying the database to be
    accessed and the documents from the current list to be looked up
    in that database.  Thus lookups are performed as described above for
    "NEIGHBORING" but apply only when the "Medline", "Protein" and "Nucleic"
    commands are used and when the most recent document list applies to
    a database different from the one specified in the command.  For
    instance, if the Medline database had just been searched and the
    nucleic acid entry for the first entry was desired, the command

        Nucleic 1

    would retrieve that entry.  On the other hand, if the neighbours to
    this document were desired, the commands:

        Medline 1

    or

        Neighbor 1

    would both retrieve the neighbors to the first document on the list.

    
    HISTORY COMMANDS:

    Synopsis: "History"
              "Previous"
              "Next"

    When doing neighboring, the current list changes as the user browses
    lists of documents. nclever keeps a history of the changes, and the user
    is able to go back to previously fetched document lists with the
    "previous" command. The "next" command goes forward in the history
    list. The "history" command shows a summary of that list. Note that
    modifying the term list doesn't automatically update the history list
    until the history list is explicitly accessed with one of the
    history commands.
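
    The behavior of "previous" and "next" resembles a list with a
    cursor, as in the Python sketch below. This is only a toy model
    under stated assumptions: the class and method names are
    hypothetical, and discarding the "forward" entries when a new list
    is recorded after going back is an assumption borrowed from web
    browsers, not something this manual specifies.

```python
class DocumentHistory:
    """Toy model of the history of current document lists."""

    def __init__(self):
        self._lists = []
        self._cursor = -1

    def record(self, documents):
        """Store a newly fetched document list and make it current."""
        # Assumption: entries after the cursor are dropped, browser-style.
        del self._lists[self._cursor + 1:]
        self._lists.append(list(documents))
        self._cursor = len(self._lists) - 1

    def current(self):
        return self._lists[self._cursor] if self._lists else []

    def previous(self):
        """Step back one entry, staying put at the oldest one."""
        if self._cursor > 0:
            self._cursor -= 1
        return self.current()

    def next(self):
        """Step forward one entry, staying put at the newest one."""
        if self._cursor < len(self._lists) - 1:
            self._cursor += 1
        return self.current()

    def summary(self):
        """Return (entry number, document count, is_current) for each entry."""
        return [(i + 1, len(docs), i == self._cursor)
                for i, docs in enumerate(self._lists)]
```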

    TAXONOMY COMMANDS:

    Synopsis: "Taxonomy List"
              "Taxonomy Down <num>"
              "Taxonomy Up [num]"
              "Taxonomy Add [num]"

    These commands allow the user to browse the two taxonomic trees
    available with the two sequence databases. The "List" command
    shows information related to the current node in the tree: its
    lineage and the name of all its children along with the number
    of documents found in the current sequence database. Taxonomy
    starts by default at the "root" of the tree, which is by convention
    at the 'top' and is its 'highest point'. The "Down" command
    allows the user to go to child number <num> (as reported by
    the "List" command) of the current node. The "Up" command does
    the inverse; if a <num> is supplied, the user climbs back up the
    tree to the lineage level with that number. The "Add" command puts
    the list of documents specified by child <num> into the current
    list of terms, in the same way as the search commands. If the current
    node in the tree is already a leaf (and therefore has no children),
    then <num> doesn't need to be specified.

    MISCELLANEOUS COMMANDS:

    List: Synopsis: "List [num]"
        Shows a summary of the current list of documents. Since this
        list can be very long, the default shows only the first 20
        documents. When given a number N as argument, a summary of the
        first N documents is shown. Special arguments are "A" and "P";
        see the NEIGHBORING COMMANDS subsection.

    Pick, Union and Not: Synopsis: "<Pick|Union|Not> [num1] [num2] [num3]..."
        These commands are used to manipulate the terms
        in the list of terms. They take a list of numbers as arguments,
        each number corresponding to one of the terms shown by the
        list command. Since version 3.02, the "all" keyword can also be
        used to specify "all terms in the current term list".

        The "Pick" command used without arguments simply shows the
        current term list. With arguments, it selects and unselects
        individual terms; an unselected term is shown with nothing in
        front of it while a selected one is shown with a "<" sign (unless
        grouped with the "Union" command). A negative number -n means
        to UNpick the term number n. Unselected terms are completely
        ignored for the purpose of building the list of documents.

        The "Union" command groups all the terms whose numbers are supplied
        as arguments. Groups of one term are shown with a "<" sign in front
        of them, like a one-line bracket, while larger groups are shown with
        the characters "/", "|" and "\" which visually appear as larger
        brackets. Picking or unpicking single terms can be used to
        break up a group.

        The "Not" command can only be used on groups of one term. It
        is used to do boolean NOTs of terms. Such terms are shown with
        a "-" sign in front of the single "<". A negative number -n means
        to remove the boolean NOT associated with term number n.

        The "Pick" command interprets "Pick 0" as "unpick all". Note that
        grouping terms together with "Union" can move the terms around in
        the list (this doesn't apply to the "Pick" or "Not" commands).

        Examples:

            Pick 2 -5 3     - Selects terms 2 and 3, unselects term 5
            Union 2 4 6     - Groups terms 2, 4 and 6 (boolean "or")
            Not 1 3 -5      - Subtract documents of terms 1 and 3;
                              documents of term 5 are back to normal.

        Evaluation of the boolean expression built by PICKing, UNIONing
        and NOTing terms is done in the following manner: first all
        groups made with the "Union" command are evaluated as ORs of
        the lists of documents specified by the terms. The groups are
        then ANDed together and finally, the single NOTed groups are
        subtracted from that result. Therefore, a list of 8 terms
        like this:

        -<  1 Term1
         <  2 Term2
         /  3 Term3
         |  4 Term4
         \  5 Term5
        -<  6 Term6
         /  7 Term7
         \  8 Term8

        can be interpreted as the boolean expression

        ((2) AND (3 OR 4 OR 5) AND (7 OR 8)) AND (NOT 1) AND (NOT 6)
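
        The evaluation order just described (OR within groups, AND
        across groups, then subtraction of the NOTed terms) maps
        directly onto set operations. The Python sketch below works
        through the eight-term example with made-up document UIDs; the
        function name is hypothetical, and returning an empty result
        when there are no positive groups is an assumption.

```python
def evaluate_term_list(or_groups, negated_terms):
    """Combine document sets: OR inside each group, AND across groups,
    then subtract every negated term's documents."""
    result = None
    for group in or_groups:
        union = set().union(*group)                 # OR within a group
        result = union if result is None else result & union   # AND across groups
    if result is None:
        result = set()                              # assumption: nothing selected
    for documents in negated_terms:
        result = result - documents                 # NOT: subtract from the result
    return result


# Toy document sets for terms 1 to 8 (the UIDs are invented).
docs = {
    1: {10, 11}, 2: {10, 11, 12, 13, 14}, 3: {10, 12}, 4: {13},
    5: {14, 15}, 6: {13}, 7: {10, 12, 13}, 8: {14},
}
current = evaluate_term_list(
    or_groups=[[docs[2]], [docs[3], docs[4], docs[5]], [docs[7], docs[8]]],
    negated_terms=[docs[1], docs[6]],
)
print(sorted(current))  # prints [12, 14]
```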

        When no argument is supplied, these commands show the current
        list of terms.

        When a change has been made to the list of terms, the current
        document list is updated, and its associated database is
        set to the term list's database.

    Type: Synopsis: "[Type] <num1> [num2] [num3]..."
        This command displays one or more documents from the current
        list of documents. It takes numbers as arguments to specify
        which documents to show. Special arguments are "A" and "P";
        see the NEIGHBORING COMMANDS subsection. The format of the
        displayed documents depends on the settings of the "Article"
        or "Report" commands. The "Type" keyword itself is optional, since
        nclever will recognize a command that starts with a digit
        as an abbreviation for the "Type" command.

    UID: Synopsis: "UID <uid1> [uid2] [uid3]..."
        This command doesn't affect any of the internal lists. It simply
        displays one or more documents from the current term database,
        specified by their UIDs. It therefore assumes that the user
        knows the correct UIDs. As with the "Type" command, the
        format of the displayed documents depends on the settings of the
        "Article" and "Report" commands.

    File: Synopsis: "File [<filename> [modifiers]]"
        This command tells nclever to send the useful output of other
        commands (like "list", "type", etc) to a file. With no argument
        it returns the output to the user's stream. When a file is
        supplied as argument, the output of the following commands
        is sent to the file name specified, and nothing will be displayed
        on the screen. Some one-letter modifiers can be specified after
        the filename. An "A" means Append to the file. A "1" means
        redirect the output for the next command ONLY, not all the
        following commands. It is possible to do a one-command-only
        redirection (with modifier "1") while the general output has
        already been redirected somewhere else. This feature is used
        internally by the "save" and "print" commands. Examples:

             File myabstracts 1     - Next command's output redirected
             List all               - This is what is sent to myabstracts
             List all               - This time it's sent to your console

    Save: Synopsis: "Save <filename> <num1> [num2] [num3]..."
        This command does the same thing as the "type" command,
        but sends its output to "filename". It is the same as doing
        "File <filename> 1" followed by "Type <num1> [num2] [num3]...".

    Print: Synopsis: "Print <filename> <num1> [num2] [num3]..."
        This command does the same thing as "save", but sends its
        output to the printer. The output is saved in a temporary file
        and that file is printed using the PRINT COMMAND configuration
        in the user's nclever configuration file. See the section called
        THE NCLEVER CONFIGURATION FILE. This command is implemented
        only for UNIX systems.

    Reset: Synopsis: "Reset"
        This command discards the current list of documents, its history
        list, and the current list of terms. It leaves all configuration
        settings unchanged (current term database, record formats, etc).

    Saveuids: Synopsis: "Saveuids <filename>"
        This command saves the list of all the uids of the current
        documents list to the file filename.

    Loaduids: Synopsis: "Loaduids <filename>"
        This command reloads a list saved by the "saveuids" command.
        It adds the list to the term list, as if it were a legal
        searched-for term. The loaded list must be a list of uids
        from the same database as the term list's database (see
        the "database" command).

    Exit, Quit, EOF: Synopsis: "Exit", "Quit", "<EOF>".
        This exits from nclever.

USING NCLEVER IN BATCH MODE
    Since nclever receives its commands on the standard input and
    displays records to its standard output, it can be used as a tool
    to query the Entrez database automatically. One simply has to feed
    it the commands on its standard input and gather the results on
    its standard output. The "-b" (for BATCH) command-line switch can be
    used to turn off the internal option VerboseMode; this is advantageous
    in that it tells nclever not to print a prompt for each command, which
    would clutter the output and render it difficult to parse by other
    programs. The -b switch also disables the display of the introductory
    message. Therefore, building a scriptfile like this one:

        Database Medline
        Article ASN
        Option NoMultipleMode
        Author Struhl K
        Type 1

    and feeding it to nclever with "nclever -b <scriptfile" will result
    in displaying the first found article of medline with author "Struhl K"
    in ASN.1 format.
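
    The same script can be generated and fed to nclever from another
    program. The Python sketch below shows one way to do this with the
    standard subprocess module. It assumes that an executable named
    "nclever" is on the PATH; build_author_script and run_batch are
    hypothetical helper names, and nothing is assumed about nclever's
    exit status, which is therefore not checked.

```python
import subprocess


def build_author_script(author, database="Medline", article_format="ASN"):
    """Return the text of a batch script like the one shown above."""
    lines = [
        f"Database {database}",
        f"Article {article_format}",
        "Option NoMultipleMode",   # keep "Struhl K" as a single query
        f"Author {author}",
        "Type 1",
    ]
    return "\n".join(lines) + "\n"


def run_batch(script_text, executable="nclever"):
    """Feed a script to "nclever -b" on standard input and return its output."""
    completed = subprocess.run(
        [executable, "-b"],
        input=script_text,
        capture_output=True,
        text=True,
        check=False,   # the meaning of nclever's exit status is not assumed
    )
    return completed.stdout
```

    Calling run_batch(build_author_script("Struhl K")) would then be the
    programmatic counterpart of "nclever -b <scriptfile".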

    When VerboseMode is FALSE, many warning or information messages are
    not displayed, because it is assumed that the output is to be processed
    by another program. Error messages are still sent to the output stream;
    if this is still a problem, one can use the "file" command to make sure
    that only relevant information is sent to a file, and redirect nclever's
    output to the system's bit-bucket (/dev/null for UNIX).

    For debugging scripts, and making them more human readable, a "-e"
    command line switch exists. This tells nclever to echo back commands
    read from its input stream to its output stream. By supplying "-e"
    and omitting "-b", the output of nclever will look like an interactive
    session: beside each prompt the command will appear, and just below
    that, the result of the command.

THE NCLEVER CONFIGURATION FILE
    When starting up, nclever looks for a configuration file much like
    the Entrez Browser's. If the file exists, it will fetch many configuration
    options from it that will override its internal defaults, such as
    the default initial database and the value of all the options settable
    by the "option" command. Some options are the exact equivalent of some
    of the Entrez Browser's options, and if they can't be found in a nclever
    configuration file proper, they will be looked for in the Browser's
    configuration file. When saving back options, they are always saved to
    nclever's own configuration file (which will be created if needed).

    The nclever configuration file name and location are determined according
    to the NCBI standards and so will vary according to platform (e.g. for
    unix systems, the file will be called .cleverrc and will be located in the
    user's home directory).

NCLEVER AND THE ENTREZ BROWSER
    nclever is based on NCBI's Entrez Browser program. It can do most of
    what the browser can do, and some more. Here are the main differences,
    besides the fact that nclever is a stream- (or terminal-, or character-)
    based tool:

    - nclever remembers what articles were parents of a list of documents
      when doing lots of neighboring and going back in the history list.

    - nclever can do multiple truncation searches.

    - The browser has a "term selection" subwindow, an equivalent function of
      which is not supplied by nclever.

    - nclever can directly access MEDLINE, PROTEIN and NUCLEOTIDE records
      by UIDs, using the UID command. The browser can only do this with MEDLINE
      records.

    - nclever can selectively omit parts of displayed medline records when
      the "report" format is chosen. This is useful for people who don't
      want to see MESH headings, etc...

FILES
    - $HOME/.ncbirc               - NCBI configuration file
    - $HOME/.entrezrc             - Entrez browser's config file
    - $HOME/.cleverrc             - nclever configuration file
    - Some support files in a data/ subdirectory as specified by
      the $HOME/.ncbirc file. See the installation manual.

BUGS
    Probably some, though it's been heavily tested. There's room for
    improvement in the way some warning/information/error messages
    appear, though. And some commands are cryptic at first.

    It has been shown that nclever cannot reload very large lists
    of UIDs (with the Loaduids command). This problem lies in
    the NCBI toolkit's Entrez libraries which were used to build
    nclever.

    This documentation was originally written in 1994; both Entrez
    and nclever have evolved quite a bit since then, so expect
    some information to be inaccurate. If you find any inaccuracies,
    write to us.

AUTHORS
    This software was written by Pierre Rioux of the OGMP (Organelle
    Genome Megasequencing Project), Departement de Biochimie, Universite
    de Montreal (riouxp@bch.umontreal.ca) and William A. Gilbert, of
    the University of New Hampshire (gilbert@unh.edu) under the management of
    Tim Littlejohn, OGMP, Departement de Biochimie, Universite de Montreal
    (tim@bch.umontreal.ca).  Please send any comments/correspondence to
    ogmp@bch.umontreal.ca.

ACKNOWLEDGMENTS
    The development of nclever was supported by a grant from the Canadian
    Genome Analysis and Technology Program (CGAT).
