diff --git a/.bowerrc b/.bowerrc
new file mode 100644
index 00000000..6761dfa1
--- /dev/null
+++ b/.bowerrc
@@ -0,0 +1,3 @@
+{
+ "directory": "_components"
+}
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 00000000..c255ecf6
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+_site
diff --git a/.ruby-version b/.ruby-version
new file mode 100644
index 00000000..4132f25c
--- /dev/null
+++ b/.ruby-version
@@ -0,0 +1 @@
+ruby-2.1.1
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 00000000..a1260c4f
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,11 @@
+## Public domain
+
+The project is in the public domain within the United States, and
+copyright and related rights in the work worldwide are waived through
+the [CC0 1.0 Universal public domain dedication][CC0].
+
+All contributions to this project will be released under the CC0
+dedication. By submitting a pull request, you are agreeing to comply
+with this waiver of copyright interest.
+
+[CC0]: http://creativecommons.org/publicdomain/zero/1.0/
diff --git a/COPYING.txt b/COPYING.txt
new file mode 100644
index 00000000..354f1e04
--- /dev/null
+++ b/COPYING.txt
@@ -0,0 +1,121 @@
+Creative Commons Legal Code
+
+CC0 1.0 Universal
+
+ CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE
+ LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN
+ ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS
+ INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES
+ REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS
+ PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM
+ THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED
+ HEREUNDER.
+
+Statement of Purpose
+
+The laws of most jurisdictions throughout the world automatically confer
+exclusive Copyright and Related Rights (defined below) upon the creator
+and subsequent owner(s) (each and all, an "owner") of an original work of
+authorship and/or a database (each, a "Work").
+
+Certain owners wish to permanently relinquish those rights to a Work for
+the purpose of contributing to a commons of creative, cultural and
+scientific works ("Commons") that the public can reliably and without fear
+of later claims of infringement build upon, modify, incorporate in other
+works, reuse and redistribute as freely as possible in any form whatsoever
+and for any purposes, including without limitation commercial purposes.
+These owners may contribute to the Commons to promote the ideal of a free
+culture and the further production of creative, cultural and scientific
+works, or to gain reputation or greater distribution for their Work in
+part through the use and efforts of others.
+
+For these and/or other purposes and motivations, and without any
+expectation of additional consideration or compensation, the person
+associating CC0 with a Work (the "Affirmer"), to the extent that he or she
+is an owner of Copyright and Related Rights in the Work, voluntarily
+elects to apply CC0 to the Work and publicly distribute the Work under its
+terms, with knowledge of his or her Copyright and Related Rights in the
+Work and the meaning and intended legal effect of CC0 on those rights.
+
+1. Copyright and Related Rights. A Work made available under CC0 may be
+protected by copyright and related or neighboring rights ("Copyright and
+Related Rights"). Copyright and Related Rights include, but are not
+limited to, the following:
+
+ i. the right to reproduce, adapt, distribute, perform, display,
+ communicate, and translate a Work;
+ ii. moral rights retained by the original author(s) and/or performer(s);
+iii. publicity and privacy rights pertaining to a person's image or
+ likeness depicted in a Work;
+ iv. rights protecting against unfair competition in regards to a Work,
+ subject to the limitations in paragraph 4(a), below;
+ v. rights protecting the extraction, dissemination, use and reuse of data
+ in a Work;
+ vi. database rights (such as those arising under Directive 96/9/EC of the
+ European Parliament and of the Council of 11 March 1996 on the legal
+ protection of databases, and under any national implementation
+ thereof, including any amended or successor version of such
+ directive); and
+vii. other similar, equivalent or corresponding rights throughout the
+ world based on applicable law or treaty, and any national
+ implementations thereof.
+
+2. Waiver. To the greatest extent permitted by, but not in contravention
+of, applicable law, Affirmer hereby overtly, fully, permanently,
+irrevocably and unconditionally waives, abandons, and surrenders all of
+Affirmer's Copyright and Related Rights and associated claims and causes
+of action, whether now known or unknown (including existing as well as
+future claims and causes of action), in the Work (i) in all territories
+worldwide, (ii) for the maximum duration provided by applicable law or
+treaty (including future time extensions), (iii) in any current or future
+medium and for any number of copies, and (iv) for any purpose whatsoever,
+including without limitation commercial, advertising or promotional
+purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each
+member of the public at large and to the detriment of Affirmer's heirs and
+successors, fully intending that such Waiver shall not be subject to
+revocation, rescission, cancellation, termination, or any other legal or
+equitable action to disrupt the quiet enjoyment of the Work by the public
+as contemplated by Affirmer's express Statement of Purpose.
+
+3. Public License Fallback. Should any part of the Waiver for any reason
+be judged legally invalid or ineffective under applicable law, then the
+Waiver shall be preserved to the maximum extent permitted taking into
+account Affirmer's express Statement of Purpose. In addition, to the
+extent the Waiver is so judged Affirmer hereby grants to each affected
+person a royalty-free, non transferable, non sublicensable, non exclusive,
+irrevocable and unconditional license to exercise Affirmer's Copyright and
+Related Rights in the Work (i) in all territories worldwide, (ii) for the
+maximum duration provided by applicable law or treaty (including future
+time extensions), (iii) in any current or future medium and for any number
+of copies, and (iv) for any purpose whatsoever, including without
+limitation commercial, advertising or promotional purposes (the
+"License"). The License shall be deemed effective as of the date CC0 was
+applied by Affirmer to the Work. Should any part of the License for any
+reason be judged legally invalid or ineffective under applicable law, such
+partial invalidity or ineffectiveness shall not invalidate the remainder
+of the License, and in such case Affirmer hereby affirms that he or she
+will not (i) exercise any of his or her remaining Copyright and Related
+Rights in the Work or (ii) assert any associated claims and causes of
+action with respect to the Work, in either case contrary to Affirmer's
+express Statement of Purpose.
+
+4. Limitations and Disclaimers.
+
+ a. No trademark or patent rights held by Affirmer are waived, abandoned,
+ surrendered, licensed or otherwise affected by this document.
+ b. Affirmer offers the Work as-is and makes no representations or
+ warranties of any kind concerning the Work, express, implied,
+ statutory or otherwise, including without limitation warranties of
+ title, merchantability, fitness for a particular purpose, non
+ infringement, or the absence of latent or other defects, accuracy, or
+ the present or absence of errors, whether or not discoverable, all to
+ the greatest extent permissible under applicable law.
+ c. Affirmer disclaims responsibility for clearing rights of other persons
+ that may apply to the Work or any use thereof, including without
+ limitation any person's Copyright and Related Rights in the Work.
+ Further, Affirmer disclaims responsibility for obtaining any necessary
+ consents, permissions or other rights required for any use of the
+ Work.
+ d. Affirmer understands and acknowledges that Creative Commons is not a
+ party to this document and has no duty or obligation with respect to
+ this CC0 or use of the Work.
diff --git a/Gemfile b/Gemfile
new file mode 100644
index 00000000..618d8980
--- /dev/null
+++ b/Gemfile
@@ -0,0 +1,2 @@
+source 'https://rubygems.org'
+gem 'github-pages'
diff --git a/Gemfile.lock b/Gemfile.lock
new file mode 100644
index 00000000..b429b94a
--- /dev/null
+++ b/Gemfile.lock
@@ -0,0 +1,129 @@
+GEM
+ remote: https://rubygems.org/
+ specs:
+ RedCloth (4.2.9)
+ activesupport (4.2.1)
+ i18n (~> 0.7)
+ json (~> 1.7, >= 1.7.7)
+ minitest (~> 5.1)
+ thread_safe (~> 0.3, >= 0.3.4)
+ tzinfo (~> 1.1)
+ blankslate (2.1.2.4)
+ celluloid (0.16.0)
+ timers (~> 4.0.0)
+ classifier-reborn (2.0.3)
+ fast-stemmer (~> 1.0)
+ coffee-script (2.4.1)
+ coffee-script-source
+ execjs
+ coffee-script-source (1.9.1.1)
+ colorator (0.1)
+ execjs (2.5.2)
+ fast-stemmer (1.0.2)
+ ffi (1.9.8)
+ gemoji (2.1.0)
+ github-pages (35)
+ RedCloth (= 4.2.9)
+ github-pages-health-check (~> 0.2)
+ jekyll (= 2.4.0)
+ jekyll-coffeescript (= 1.0.1)
+ jekyll-mentions (= 0.2.1)
+ jekyll-redirect-from (= 0.6.2)
+ jekyll-sass-converter (= 1.2.0)
+ jekyll-sitemap (= 0.8.1)
+ jemoji (= 0.4.0)
+ kramdown (= 1.5.0)
+ liquid (= 2.6.2)
+ maruku (= 0.7.0)
+ mercenary (~> 0.3)
+ pygments.rb (= 0.6.1)
+ rdiscount (= 2.1.7)
+ redcarpet (= 3.1.2)
+ terminal-table (~> 1.4)
+ github-pages-health-check (0.3.1)
+ net-dns (~> 0.6)
+ public_suffix (~> 1.4)
+ hitimes (1.2.2)
+ html-pipeline (1.9.0)
+ activesupport (>= 2)
+ nokogiri (~> 1.4)
+ i18n (0.7.0)
+ jekyll (2.4.0)
+ classifier-reborn (~> 2.0)
+ colorator (~> 0.1)
+ jekyll-coffeescript (~> 1.0)
+ jekyll-gist (~> 1.0)
+ jekyll-paginate (~> 1.0)
+ jekyll-sass-converter (~> 1.0)
+ jekyll-watch (~> 1.1)
+ kramdown (~> 1.3)
+ liquid (~> 2.6.1)
+ mercenary (~> 0.3.3)
+ pygments.rb (~> 0.6.0)
+ redcarpet (~> 3.1)
+ safe_yaml (~> 1.0)
+ toml (~> 0.1.0)
+ jekyll-coffeescript (1.0.1)
+ coffee-script (~> 2.2)
+ jekyll-gist (1.2.1)
+ jekyll-mentions (0.2.1)
+ html-pipeline (~> 1.9.0)
+ jekyll (~> 2.0)
+ jekyll-paginate (1.1.0)
+ jekyll-redirect-from (0.6.2)
+ jekyll (~> 2.0)
+ jekyll-sass-converter (1.2.0)
+ sass (~> 3.2)
+ jekyll-sitemap (0.8.1)
+ jekyll-watch (1.2.1)
+ listen (~> 2.7)
+ jemoji (0.4.0)
+ gemoji (~> 2.0)
+ html-pipeline (~> 1.9)
+ jekyll (~> 2.0)
+ json (1.8.2)
+ kramdown (1.5.0)
+ liquid (2.6.2)
+ listen (2.10.0)
+ celluloid (~> 0.16.0)
+ rb-fsevent (>= 0.9.3)
+ rb-inotify (>= 0.9)
+ maruku (0.7.0)
+ mercenary (0.3.5)
+ mini_portile (0.6.2)
+ minitest (5.6.1)
+ net-dns (0.8.0)
+ nokogiri (1.6.6.2)
+ mini_portile (~> 0.6.0)
+ parslet (1.5.0)
+ blankslate (~> 2.0)
+ posix-spawn (0.3.11)
+ public_suffix (1.5.1)
+ pygments.rb (0.6.1)
+ posix-spawn (~> 0.3.6)
+ yajl-ruby (~> 1.2.0)
+ rb-fsevent (0.9.4)
+ rb-inotify (0.9.5)
+ ffi (>= 0.5.0)
+ rdiscount (2.1.7)
+ redcarpet (3.1.2)
+ safe_yaml (1.0.4)
+ sass (3.4.13)
+ terminal-table (1.4.5)
+ thread_safe (0.3.5)
+ timers (4.0.1)
+ hitimes
+ toml (0.1.2)
+ parslet (~> 1.5.0)
+ tzinfo (1.2.2)
+ thread_safe (~> 0.1)
+ yajl-ruby (1.2.1)
+
+PLATFORMS
+ ruby
+
+DEPENDENCIES
+ github-pages
+
+BUNDLED WITH
+ 1.10.2
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 00000000..edeed790
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,24 @@
+This is free and unencumbered software released into the public domain.
+
+Anyone is free to copy, modify, publish, use, compile, sell, or
+distribute this software, either in source code form or as a compiled
+binary, for any purpose, commercial or non-commercial, and by any
+means.
+
+In jurisdictions that recognize copyright laws, the author or authors
+of this software dedicate any and all copyright interest in the
+software to the public domain. We make this dedication for the benefit
+of the public at large and to the detriment of our heirs and
+successors. We intend this dedication to be an overt act of
+relinquishment in perpetuity of all present and future rights to this
+software under copyright law.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+OTHER DEALINGS IN THE SOFTWARE.
+
+For more information, please refer to <http://unlicense.org/>
diff --git a/README.md b/README.md
new file mode 100644
index 00000000..351a984a
--- /dev/null
+++ b/README.md
@@ -0,0 +1 @@
+The book is published here: http://toolkitbook.github.io/book/
diff --git a/_config.yml b/_config.yml
new file mode 100644
index 00000000..45891d3c
--- /dev/null
+++ b/_config.yml
@@ -0,0 +1,18 @@
+name: Developer Hub
+markdown: kramdown
+highlighter: pygments
+safe: true
+baseurl: /book
+exclude: ['.ruby-version', 'node_modules', 'package.json', 'bower.json']
+sass:
+ sass_dir: static/_sass
+
+defaults:
+ -
+ scope:
+ path: "" # an empty string here means all files in the project
+ values:
+ organization-name: "NCBI"
+ organization-url: "http://www.ncbi.nlm.nih.gov"
+ organization-email: pubmedlabs@ncbi.nlm.nih.gov
+ source-code-policy.url:
diff --git a/_includes/footer.html b/_includes/footer.html
new file mode 100644
index 00000000..55cb8af9
--- /dev/null
+++ b/_includes/footer.html
@@ -0,0 +1,105 @@
+
+
+
+
+ Vakatov D, editor. The NCBI C++ Toolkit Book [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2004-. 24, Applications
+ .
+ [Updated: March 17, 2015]
+ .
+
+
+ Available from: http://www.ncbi.nlm.nih.gov/toolkit/doc/book/ch_app
+ for a pseudo-button in the middle that identifies the current
+ page number. We don't have that. */
+div.pagination h3 { font-size: 1em; }
+div.pagination h3 { font-weight: normal; }
+div.pagination h3 { display: inline; }
+
+div.pagination {
+ white-space: nowrap;
+ float: right;
+ margin-top: 2px;
+ margin-bottom: 2px;
+}
+
+div.pagination .page_link {
+ padding: 1px 6px;
+ zoom:1;
+}
+div.pagination span.inactive {
+ color: #ccc;
+}
+div.pagination a.active {
+ border: 1px solid #ddd;
+ color: #336699;
+ padding: 2px 6px;
+ *padding: 0px 4px;
+ white-space: nowrap;
+ text-decoration: none;
+}
+div.pagination a:hover {
+ background-color: #369;
+ color: #fff;
+ border: 1px solid #888;
+}
+div.pagination .prev {
+ margin-right: .6em;
+ margin-left: .2em;
+}
+div.pagination .next {
+ margin-left: .6em;
+ margin-right: .2em;
+}
+
+
+
+
+
+
+
+/********************************************************************/
+/* ComponentID=1778673 /projects/books/components/PBooksPageTocHeaderCSS@1.4 */
+
+.page-toc-head {
+ border-top: 1px solid #C0C0C0;
+}
+
+h2.has-page-navigation, h3.has-page-navigation, h4.has-page-navigation {
+ margin-bottom: 0;
+}
+
+.page-toc-top-link {
+ float: right;
+}
+
+.content .page-toc-head a.page-toc-label,
+.content .page-toc-head a.top-link
+{
+ text-decoration: none;
+ border-bottom: none;
+ color: #808080;
+}
+
+/*****************************************************/
+/* The 'Contents v' label that controls the popper. */
+
+/*
+ This is the same as is in PBooksBookNavCSS.
+ I stole this from Entrez 3.15, Entrez_DisplayBarStyle css v. 1.7.
+ The following was designed for 12px font with 18px line-height
+*/
+.page-toc-label {
+ font-size: 1em;
+ background:transparent url(/portal/portal3rc.fcgi/4048106/img/27532) no-repeat 100% 100%;
+ padding-right: 17px;
+ margin-right: 3px;
+}
+.page-toc-label:active {
+ background:transparent url(/portal/portal3rc.fcgi/4048106/img/27532) no-repeat 100% 58%;
+ padding-right: 17px;
+ margin-right: 3px;
+}
+
+/**********************************/
+/* The page-toc popper area itself. */
+
+.page-toc-popper {
+ background-color: white;
+ -moz-box-shadow: 0.4em 0.4em 0.5em #999999;
+ padding: 0 2em 0 0;
+ max-height: 35em;
+ overflow: auto
+}
+
+.page-toc-popper ul {
+ list-style-type: none;
+}
+
+/********************************************************************/
+/* ComponentID=67324 /projects/entrez/core/Entrez_MessagesStyle@1.11 */
+
+div.messagearea {
+ margin:0;
+ padding: 0;
+ border-bottom: solid 1px #888;
+ clear:both;
+}
+#messagearea.empty {border: none; clear:both;}
+ul.messages {
+ font-family: Arial;
+ margin: 0;
+ padding: 0;
+ list-style-type:none;
+ list-style-image:none;
+}
+ul.messages li {
+ font-size: 1em;
+ margin:0.22em 0 0.22em;
+ padding: 0.25em 0.25em 0.25em 28px;
+ background-position: 0.5em 0.3em;
+ background-repeat: no-repeat;
+ background-color: transparent;
+}
+ul.messages li.success {
+ background-image: url(/portal/portal3rc.fcgi/4048106/img/67325);
+}
+ul.messages li.error {
+ background-image: url(/portal/portal3rc.fcgi/4048106/img/67326);
+}
+ul.messages li.warn {
+ background-image: url(/portal/portal3rc.fcgi/4048106/img/67327);
+}
+ul.messages li.info {
+ background-image: url(/portal/portal3rc.fcgi/4048106/img/67328);
+}
+ul.messages li.suggest {
+ background-image: url(/portal/portal3rc.fcgi/4048106/img/26044);
+}
+ul.messages li.hi_warn {
+ background-image: url(/portal/portal3rc.fcgi/4048106/img/67327);
+ font-weight: bold;
+}
+div#messagearea ul.messages li.hi_warn { margin: 2.5em 0; }
+ul.messages li.hi_warn em.detail{
+ font-weight: normal;
+ font-style: normal;
+ padding-left: 0.5em;
+}
+
+/********************************************************************/
+/* ComponentID=4054478 /projects/cppdocs/cppdocs/CppTkbPageCSS@1.12 */
+
+
+/*named-content*/
+.ncbi-class , .ncbi-type, .ncbi-func {font-weight:bold;font-style:italic; white-space:pre;}
+.ncbi-app , .ncbi-lib , .ncbi-macro , .ncbi-monospace , .ncbi-var, .ncbi-cmd , .ncbi-code, .ncbi-path {font-family: monospace; font-size: 1.2296em; white-space:pre;}
+.nctnt-pre{white-space:pre;}
+.ncbi-cmd {color:#985735; white-space:pre;}
+.ncbi-code{color:#59331F; white-space:pre;}
+.ncbi-path {color:#777777; white-space:pre;}
+.pageobject{font-weight:bold;color:#985735;}
+.highlight{background-color:yellow;font-weight:bold;}
+
+/* Search bar */
+.search_extras label { margin-right:1em;font-size: 1.1em; }
+.search_extras strong { font-size: 150%; color: #888; margin-right: 0.5em}
+
+/* Temporary fix for wide tables (adding scroll bars (CXX-3856)) */
+div.table-scroll {
+ overflow: auto;
+ max-width: none;
+ max-height: 500px;
+}
+/* Added top space for tables to display consecutive tables better (CXX-4144) */
+div.table {
+ padding-top: 0.5em;
+ }
+
+
+.grid{ min-width:800px; max-width:none; width:100%; clear:both; margin:0 auto; text-align:left }
+
+/* remove indentation for definition lists (CXX-3856) */
+.labeled-list{}
+.labeled-list dt{float:left;margin-right:.8em}
+.labeled-list dd{vertical-align:top;display:table-cell;*display:inline-block}
+
+/* change highlight color from brown to yellow (CXX-3856) */
+.highlight{background-color:yellow;font-weight:bold;}
+
+/*Fixing the alignment in Entrez search bar at the top of the pages*/
+div.search {
+ padding-bottom: 15px;
+}
+
+/*labels in headings*/
+.label{margin-right:1em;}
+
+/* style used for pale grey square brackets*/
+.internal-link-marker{color:#BDBDBD}
+
+/*******************************/
+/* PDF link */
+
+div.pdf-link {
+ white-space: nowrap;
+ float: right;
+}
+
+/* Added padding for the sidebar top division */
+#source-branding {
+margin-top: 1.2em;
+}
+
+/* Added style to box-ed pieces of text */
+
+.box{
+font-family: arial,helvetica,clean,sans-serif;
+font-size: 13px;
+font-size-adjust: none;
+font-stretch: normal;
+font-style: normal;
+font-variant: normal;
+font-weight: 400;
+line-height: 18px;
+text-align: left;
+}
+
+/* Added style and scroll bars to preformatted pieces of text */
+pre {
+ background: #eee;
+ border: 1px solid #C0C0C0;
+ overflow: auto;
+ max-width: none;
+ max-height: 500px;
+ font-family: Courier, Monospace;
+}
+
+/* Made top level headers underlined */
+h2 { border-bottom: 1px solid rgb(151, 176, 200);}
+
+/* Remove underline for 'Go To' link and for the items from the drop down list (see CXX-4213) */
+
+a.jig-ncbiinpagenav-goto-heading, a.jig-ncbiinpagenav-goto-heading:hover {
+ border: 0 none;
+ text-decoration: none;
+ }
+ul.ui-ncbibasicmenu li a, ul.ui-ncbibasicmenu li a:hover {
+ border: 0 none;
+ text-decoration: none;
+}
+
+/* Remove H3 element to make HTML valid */
+.portlet_title { margin-right: 2em; font-size: 1.4em; color: #985735; font-weight: bold; display: inline;}
+
+/* Style for scrolling images (CXX-5457) */
+
+a.img_link {
+ border: 0 none;
+}
+
+
+
+
+/********************************************************************/
+/* ComponentID=3871380 /projects/standards/standard_searchbar_css@1.16 */
+
+.header{background:#d5d5d5 url(/portal/portal3rc.fcgi/4048106/img/2375536) repeat-x scroll left bottom; position: relative;margin-bottom: 1.231em;z-index:20 }
+/* .header{background:#d5d5d5 url(/portal/portal3rc.fcgi/4048106/img/2375536) repeat-x scroll left bottom; position: relative;margin-bottom: 1.231em;z-index:20 } */
+
+.header{background:#205081;}
+.header a{text-decoration:none}
+.header a:hover{text-decoration:underline}
+
+.search{margin:0 0 0 13.539em; padding:1.2em 0 .7em}
+.search_form{*zoom:1}
+.search_form select,.search_form .jig-ncbiclearbutton-wrap,.search_form button{margin-right:.2em;font-family:arial,helvetica,sans-serif;}
+.search_form select{font-size: 1.077em; width:8.5em;margin-right:.3em;*vertical-align: middle; position: relative; bottom: 1px;}
+.search_form select optgroup {font-style: normal; color: #555; padding-left: 0.2em;} /*FF is the only one to recognize optgroup and option padding, which is nice because it's the only one that screws it up.*/
+.search_form select optgroup option {color: #000}
+.search_form input{font-size: 1.1543em; width:48%;display:inline-block;_width:100%}
+.search_form div.nowrap {*height: 100%; }
+.search_form button.nowrap {*vertical-align: middle; }
+.search_form div.nowrap div.nowrap {*height: 100%; *vertical-align: middle; }
+.search_form .nowrap{display:inline;*zoom:1}
+.searchlinks{margin:.2em 0 0 9.6em;_zoom:1}
+.searchlinks li{margin-right:1.2em;zoom:1}
+.searchlinks .help{position:absolute;right:1em;margin-right:0;_margin-top:-.1em}
+.searchlinks .hidden{display:none}
+.searchlinks .visible{display:inline}
+.search_form .wrap{position:relative;display:inline;_width:70%}
+#cl{position:absolute;right:8px;top:-3px;top:-12px\9;*top:5px}
+
+.search_form button{border:0 none;cursor:pointer;overflow:visible;width:auto;background-color:#ddd;padding:.2em .4em;*padding:.2em .6em;_padding:.2em .4em;margin:0 .2em;*margin:0 .3em;*height:2em; -moz-border-radius:5px;-webkit-border-radius:5px;border-radius:5px;text-shadow:.1em .1em .1em rgba(0,0,0,.5);-moz-box-shadow:.1em .1em .1em rgba(0,0,0,.5);-webkit-box-shadow:.1em .1em .1em rgba(0,0,0,.5);box-shadow:.1em .1em .1em rgba(0,0,0,.5)}
+.search_form button.button_search{background-color:#47a;font-weight:bold;color:#fff;*margin-left:.5em;font-size:inherit;}
+.search_form button.button_search:active{background-color:#4c96df}
+.search_form button.button_preview{background-color:#A64D48;font-weight:bold;color:#fff}
+.search_form button.button_preview:active{background-color:#F27069}
+
+.search_form input:focus{-moz-box-shadow:0 0 .3em rgba(211,186,44,.8)}
+
+.search_form .jig-ncbiclearbutton-wrap { width: 64%; *display: inline; border:1px solid #999; }
+.search_form .jig-ncbiclearbutton-wrap input { width: 100%; *width: 50%; }
+.search_form .jig-ncbiclearbutton-wrap a.reset { margin-left: 0; top: 50%; margin-top: -7px; }
+
+.rss_icon{position:relative;top:3px;margin-right:.3em}
+
+.rss_menu{z-index:1001;display:none;}
+.rss_menu legend{font-weight:bold;margin:2px 0 0 3px}
+.rss_menu ul{margin:0;padding:0;list-style-type:none;padding-top:5px}
+.rss_menu ul li{margin-bottom:.4em}
+.rss_menu span,.rss_menu label{margin-right:.5em}
+.rss_menu ul input{top:0}
+.rss_menu dd{margin-left:0;margin-bottom:1em}
+.rss_menu #rss_name{width:15em}
+.rss_menu button{margin-top:.5em}
+.rss_menu label{display:block}
+
+.db_logo{background:transparent url() no-repeat scroll left top;display:block;height:36px;width:100px;text-indent:-9999px}
+
+.res_logo{width:25em;left:1.231em;padding-top:.4em;position:absolute}
+.res_logo h1{font-weight:normal;margin:0}
+/*sibling of res_logo*/
+.long{padding:.5em 0}
+.long h1{line-height:1.15}
+
+.res_logo h1 a{color:#333;display:block;padding:.3em 0;text-shadow:1px 1px 1px rgba(240,240,240,.9)}
+.res_tagline{display:none}
+.res_logo h1 a,.res_logo h1 a:hover,.res_logo h1 a:visited{text-decoration:none}
+
+h1.img_logo{margin:0}
+h1.img_logo a{padding:0}
+
+/********************************************************************/
+/* ComponentID=4005757 /projects/PAF/BaseComponents/PAFLocalNavCSS@1.5 */
+
+div.page div.header { margin-bottom: 0;}
+
+
+
+/********************************************************************/
+/* ComponentID=3398175 /projects/PAF/BaseComponents/Support/PAFDebugConsoleCSS@1.3 */
+
+
+div.paf-debug-wrap {
+display: block;
+margin: 1em 2em;
+color: black;
+background-color: white;
+clear:both;
+}
+
+div.paf-debug-content {
+max-height: 300px;
+margin:auto;
+overflow: auto;
+border: solid 1px #ccc;
+font-family: Courier, Monospace;
+font-size: 10pt;
+}
+
+div.paf-debug-wrap h3 {
+background-color: #ccc;
+color: #444;
+font-size: 110%;
+margin: 0;
+padding: 5px;
+}
+
+ul#paf-debug-trace {
+margin:0;
+padding:0;
+padding-left: 1em;
+margin-left: 1em;
+}
+
+ul#paf-debug-trace li {
+list-style-type: disc;
+}
+
+
+dl#paf-debug dt,
+dl#paf-debug dd {
+margin-bottom: 0.5em;
+min-height: 1.8em;
+font-family: courier, fixed, monospace;
+}
+
+dl#paf-debug dd pre { white-space: pre; }
+
+dl#paf-debug dd.empty {
+background-color: transparent;
+}
+
+dl#paf-debug dt {
+float: left;
+width: 14em;
+text-align: right;
+clear: left;
+}
+
+dl#paf-debug dd {
+
+margin-left: 14.5em;
+border-left: solid 1px #ccc;
+padding-left: 0.5em;
+background-color: #eeeee0;
+color: black;
+}
+
+dl#paf-debug pre {
+margin:0;
+}
diff --git a/default_files/4025445.js b/default_files/4025445.js
new file mode 100644
index 00000000..ae802133
--- /dev/null
+++ b/default_files/4025445.js
@@ -0,0 +1,830 @@
+jQuery(function($j) {
+ var formState = {
+ overrideBackends: false,
+ backends: {}
+ };
+
+ // Name of the cookie
+ var cookieName;
+
+ // Mostly just for debugging, store the cookie string value here
+ // rather than in the sub-function scope
+ var cookieStr;
+
+ // An object representation of the cookie. This is converted from the
+ // XML cookie value on init. The form controls will manipulate this,
+ // and when the user clicks "Go", this will be converted back into
+ // XML.
+ var cookieObj;
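+ // For illustration (not in the original source): after init() runs, cookieObj takes a
+ // shape like { besetName: 'test', backendUrls: { tagserver: 'bingo' } }, i.e. the object
+ // form of the XML cookie that cookieXmlToJson() below parses and goButton() serializes.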
+
+ ///////////////////////////////////////////////////////////////////////////////
+ function cbChanged(event) {
+ //console.info("Event caught: " + event);
+ var target = $j(event.target);
+ var id = target.attr("id");
+ var value = target.attr("value");
+ var checked = target.attr("checked");
+ /*console.info("target id: '" + id +
+ "', value: '" + value +
+ "', checked: '" + checked + "'");*/
+
+
+ if (id == "besetsel-cb") {
+ if (checked) {
+ $j("#besetsel-sel").removeAttr("disabled");
+ besetSelFormToObj();
+ }
+ else {
+ $j("#besetsel-sel").attr("disabled", 1);
+ delete cookieObj.besetName;
+ }
+ }
+ else if (id == "besetsel-sel") {
+ besetSelFormToObj();
+ }
+ else {
+ var m;
+ if (m = id.match(/besetsel-be-(.*?)-cb/)) {
+ var backend = m[1];
+ //console.info(">>>backend checkbox: " + backend);
+ if (checked) {
+ $j("#besetsel-be-" + backend + "-text").removeAttr("disabled");
+ beUrlFormToObj(backend);
+ }
+ else {
+ $j("#besetsel-be-" + backend + "-text").attr("disabled", 1);
+ delete cookieObj.backendUrls[backend];
+ }
+ }
+ else if (m = id.match(/besetsel-be-(.*?)-text/)) {
+ backend = m[1];
+ //console.info(">>>backend text: " + backend);
+ beUrlFormToObj(backend);
+ }
+ }
+
+ // PMC-11784 and PMC-11785.
+ // This fixes a nasty IE bug. It causes a slight flash when the user
+ // clicks a checkbox, but it works.
+ if (jQuery.browser.msie){
+ target.hide();
+ window.setTimeout( function(){ target.show();}, 0 );
+ }
+
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////
+ // besetSelFormToObj()
+ // This is called by a couple of event handlers and decodes the
+ // currently selected BESet (in the drop-down form) and sets the
+ // cookieObj.besetName accordingly.
+
+ function besetSelFormToObj()
+ {
+ cookieObj.besetName = $j("#besetsel-sel").val();
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////
+ // beUrlFormToObj(backend)
+ // This is similar, and takes care of reading the text value from the
+ // form and stuffing it into the object
+
+ function beUrlFormToObj(backend) {
+ var value = $j("#besetsel-be-" + backend + "-text").attr("value");
+ if (value) cookieObj.backendUrls[backend] = value;
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////
+ function init() {
+ if ($j("#besetsel-form").length < 1)
+ {
+ return;
+ }
+
+ cookieName = $j("#besetsel-form").attr("cookieName");
+ cookieObj = cookieXmlToJson(cookieName);
+ initFormState();
+
+ // Set event handers
+ $j("#besetsel-form .besetsel-control").change(function(event) {
+ cbChanged(event);
+ });
+ $j("#besetsel-go-button").click(function(event) {
+ goButton(event);
+ });
+ $j("#besetsel-reset-button").click(function(event) {
+ resetButton(event);
+ });
+
+ // This "pullout" might be empty, in the case of the BESet being
+ // selected by path segment instead of cookie. In that case, the
+ // tab acts as a watermark, just to identify the BESet, and we
+ // don't want to allow it to be "pulled out". So we'll set the
+ // width to 0 in that case.
+ var w = $j("#besetsel-go-button").length > 0 ? "400px" : "0px";
+
+ // Put it into the sidecontent pullout
+ $j("#besetsel-form").sidecontent({
+ /*classmodifier: "besetsel",*/
+ attachto: "rightside",
+ width: w,
+ opacity: "0.8",
+ pulloutpadding: "5",
+ textdirection: "vertical",
+ clickawayclose: 0,
+ titlenoupper: 1
+ });
+
+ var pulloutColor = $j("#besetsel-form").attr("pulloutColor");
+ //alert("color is " + pulloutColor);
+ $j("#besetsel-form").data("pullout").css("background-color", pulloutColor || '#663854');
+
+ if ($j("#besetsel-go-button").size() > 0) {
+ $j("#besetsel-form").data("pullout").css({
+ "border-top": "ridge gray 5px",
+ "border-bottom": "ridge gray 5px",
+ "border-left": "ridge gray 5px"
+ });
+ }
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////
+ // goButton(event)
+ // Handle the user-click of the "Go!" button.
+
+ function goButton(event) {
+ // Convert the object into XML
+ var cookieXml = "<BESet" + (cookieObj.besetName ? (" name=\"" + cookieObj.besetName + "\">") : ">" );
+ for (var backend in cookieObj.backendUrls) {
+ //console.info("+++ backend " + backend);
+ cookieXml +=
+ "<backend name=\"" + backend + "\">" + xmlEscape(cookieObj.backendUrls[backend]) + "</backend>";
+ }
+ cookieXml += "</BESet>";
+ //console.info(cookieXml);
+
+ // Set the cookie
+ document.cookie = cookieName + "=" + encodeURIComponent(cookieXml) +
+ "; max-age=604800" +
+ "; path=/" +
+ "; domain=nih.gov";
+ // Reload the page
+ window.location.reload();
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////
+ // resetButton(event)
+ // Handle the user-click of the "Reset" button.
+ // Does the same thing as "Go!", but sets the cookie to the empty string.
+
+ function resetButton(event) {
+ // Clear the cookie
+ document.cookie = cookieName + "=" +
+ "; max-age=604800" +
+ "; path=/" +
+ "; domain=nih.gov";
+ // Reload the page
+ window.location.reload();
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////
+ function xmlEscape(str) {
+ str = str.replace(/\&/g, '&amp;')
+ .replace(/\</g, '&lt;')
+ .replace(/\>/g, '&gt;')
+ .replace(/\"/g, '&quot;')
+ .replace(/\'/g, '&apos;');
+ return str;
+ }
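+ // Illustrative note (not in the original source): with the entity escaping above,
+ // xmlEscape('a<b & "c"') returns 'a&lt;b &amp; &quot;c&quot;', so backend URLs stay
+ // safe to embed in the XML cookie built in goButton().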
+
+ ///////////////////////////////////////////////////////////////////////////////
+ // This function reads the cookie value and initializes the form state
+ // Don't assume anything about the form state -- redo everything.
+ function initFormState() {
+
+ var besetName = cookieObj.besetName;
+
+ if (!besetName) {
+ $j("#besetsel-cb").removeAttr("checked");
+ $j("#besetsel-sel").attr("disabled", 1);
+ }
+ else {
+ var selBESet = $j("#besetsel-opt-" + besetName);
+ if (selBESet.length != 0) {
+ $j("#besetsel-cb").attr("checked", 1);
+ $j("#besetsel-sel").removeAttr("disabled");
+ selBESet.attr("selected", 1);
+ }
+ else {
+ $j("#besetsel-cb").removeAttr("checked");
+ $j("#besetsel-sel").attr("disabled", 1);
+ }
+ }
+
+ // Foreach backend in the form
+ $j(".besetsel-be-cb").each(function(i) {
+ var id = $j(this).attr("id");
+ var beName = id.match(/besetsel-be-(.*?)-cb/)[1];
+ //console.info("### backend, id is '" + id + "', beName is '" + beName + "'");
+ if (!beName) return;
+
+ // See if there's a corresponding element in the cookie
+ if (!cookieObj.backendUrls ||
+ !cookieObj.backendUrls[beName]) {
+ //console.info("Didn't find " + beName);
+ $j("#besetsel-be-" + beName + "-cb").removeAttr("checked");
+ $j("#besetsel-be-" + beName + "-text").attr("disabled", 1);
+ }
+ else {
+ //console.info("Found " + beName);
+ $j("#besetsel-be-" + beName + "-cb").attr("checked", 1);
+ var textbox = $j("#besetsel-be-" + beName + "-text");
+ textbox.removeAttr("disabled");
+ textbox.attr("value", cookieObj.backendUrls[beName]);
+ }
+ });
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////
+ // This gets the value of the _beset cookie, which is in XML, and turns it
+ // from this:
+ //
+ // <BESet name="test">
+ // <backend name="tagserver">bingo</backend>
+ // ...
+ // </BESet>
+ //
+ // Into this (note that everything is optional):
+ // { besetName: 'test',
+ // backendUrls: {
+ // tagserver: 'bingo', ... }
+ // }
+ // If there is no cookie set or parsing fails, this returns {}.
+
+ function cookieXmlToJson(cookieName) {
+ var cookieObj = {
+ backendUrls: {}
+ };
+
+ cookieStr = getCookie(cookieName);
+ //console.info("cookie value is '" + cookieStr + "'");
+
+ // Parse XML
+ try {
+ var cookieXml = $j(cookieStr);
+ }
+ catch(err) {
+ return cookieObj;
+ }
+
+ var besetElem = cookieXml.find('BESet');
+ if (besetElem.length == 0) {
+ // No valid cookie value found.
+ return cookieObj;
+ }
+
+ var besetName = besetElem.attr("name");
+ if (besetName) {
+ cookieObj.besetName = besetName;
+ }
+
+ var backends = besetElem.find("backend");
+ if (backends.length != 0) {
+ backends.each(function (i) {
+ var e = $j(backends[i]);
+ cookieObj.backendUrls[e.attr("name")] = e.text();
+ //console.info("Setting " + e.attr("backend") + ": " + e.attr("url"));
+ })
+ }
+
+ return cookieObj;
+ }
+
+ ///////////////////////////////////////////////////////////////////////////////
+ function getCookie(name) {
+ var allCookies = document.cookie;
+ //console.info("allCookies = " + allCookies);
+ var pos = allCookies.indexOf(name + "=");
+ if (pos != -1) {
+ var start = pos + (name + "=").length;
+ var end = allCookies.indexOf(";", start);
+ if (end == -1) end = allCookies.length;
+ return decodeURIComponent(allCookies.substring(start, end));
+ }
+ return "";
+ }
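+ // Example (illustrative values only): if document.cookie were "foo=1; mycookie=abc%20xml",
+ // getCookie("mycookie") would return "abc xml", since decodeURIComponent is applied
+ // to the raw cookie substring.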
+
+ init();
+
+});
+
+
+
+;
+(function($)
+{
+ // This script was written by Steve Fenton
+ // http://www.stevefenton.co.uk/Content/Jquery-Side-Content/
+ // Feel free to use this jQuery Plugin
+ // Version: 3.0.2
+
+ var classModifier = "";
+ var sliderCount = 0;
+ var sliderWidth = "400px";
+
+ var attachTo = "rightside";
+
+ var totalPullOutHeight = 0;
+
+ function CloseSliders (thisId) {
+ // Reset previous sliders
+ for (var i = 0; i < sliderCount; i++) {
+ var sliderId = classModifier + "_" + i;
+ var pulloutId = sliderId + "_pullout";
+
+ // Only reset it if it is shown
+ if ($("#" + sliderId).width() > 0) {
+
+ if (sliderId == thisId) {
+ // They have clicked on the open slider, so we'll just close it
+ showSlider = false;
+ }
+
+ // Close the slider
+ $("#" + sliderId).animate({
+ width: "0px"
+ }, 100);
+
+ // Reset the pullout
+ if (attachTo == "leftside") {
+ $("#" + pulloutId).animate({
+ left: "0px"
+ }, 100);
+ } else {
+ $("#" + pulloutId).animate({
+ right: "0px"
+ }, 100);
+ }
+ }
+ }
+ }
+
+ function ToggleSlider () {
+ var rel = $(this).attr("rel");
+
+ var thisId = classModifier + "_" + rel;
+ var thisPulloutId = thisId + "_pullout";
+ var showSlider = true;
+
+ if ($("#" + thisId).width() > 0) {
+ showSlider = false;
+ }
+
+ CloseSliders(thisId);
+
+ if (showSlider) {
+ // Open this slider
+ $("#" + thisId).animate({
+ width: sliderWidth
+ }, 250);
+
+ // Move the pullout
+ if (attachTo == "leftside") {
+ $("#" + thisPulloutId).animate({
+ left: sliderWidth
+ }, 250);
+ } else {
+ $("#" + thisPulloutId).animate({
+ right: sliderWidth
+ }, 250);
+ }
+ }
+
+ return false;
+ };
+
+ $.fn.sidecontent = function (settings) {
+
+ var config = {
+ classmodifier: "sidecontent",
+ attachto: "rightside",
+ width: "300px",
+ opacity: "0.8",
+ pulloutpadding: "5",
+ textdirection: "vertical",
+ clickawayclose: false
+ };
+
+ if (settings) {
+ $.extend(config, settings);
+ }
+
+ return this.each(function () {
+
+ $This = $(this);
+
+ // Hide the content to avoid flickering
+ $This.css({ opacity: 0 });
+
+ classModifier = config.classmodifier;
+ sliderWidth = config.width;
+ attachTo = config.attachto;
+
+ var sliderId = classModifier + "_" + sliderCount;
+ var sliderTitle = config.title;
+
+ // Get the title for the pullout
+ sliderTitle = $This.attr("title");
+
+ // Start the totalPullOutHeight with the configured padding
+ if (totalPullOutHeight == 0) {
+ totalPullOutHeight += parseInt(config.pulloutpadding);
+ }
+
+ if (config.textdirection == "vertical") {
+ var newTitle = "";
+ var character = "";
+ for (var i = 0; i < sliderTitle.length; i++) {
+ character = sliderTitle.charAt(i).toUpperCase();
+ if (character == " ") {
+ character = "&nbsp;";
+ }
+ newTitle = newTitle + "<span>" + character + "</span>";
+ }
+ sliderTitle = newTitle;
+ }
+
+ // Wrap the content in a slider and add a pullout
+ $This.wrap('<div id="' + sliderId + '"></div>').wrap('<div></div>');
+ // NOTE: the markup strings here are an approximate reconstruction (the original HTML
+ // literals were lost); only the ids are certain, since the selectors below rely on them.
+ var pullout = $('<div id="' + sliderId + '_pullout">' + sliderTitle + '</div>').insertBefore($("#" + sliderId));
+
+ // Store reference to the tab element in parent
+ $This.data('pullout', pullout);
+
+ if (config.textdirection == "vertical") {
+ $("#" + sliderId + "_pullout span").css({
+ display: "block",
+ textAlign: "center"
+ });
+ }
+
+ // Hide the slider
+ $("#" + sliderId).css({
+ position: "absolute",
+ overflow: "hidden",
+ top: "0",
+ width: "0px",
+ zIndex: "1",
+ opacity: config.opacity
+ });
+
+ // For left-side attachment
+ if (attachTo == "leftside") {
+ $("#" + sliderId).css({
+ left: "0px"
+ });
+ } else {
+ $("#" + sliderId).css({
+ right: "0px"
+ });
+ }
+
+ // Set up the pullout
+ $("#" + sliderId + "_pullout").css({
+ position: "absolute",
+ top: totalPullOutHeight + "px",
+ zIndex: "1000",
+ cursor: "pointer",
+ opacity: config.opacity
+ })
+
+ $("#" + sliderId + "_pullout").live("click", ToggleSlider);
+
+ var pulloutWidth = $("#" + sliderId + "_pullout").width();
+
+ // For left-side attachment
+ if (attachTo == "leftside") {
+ $("#" + sliderId + "_pullout").css({
+ left: "0px",
+ width: pulloutWidth + "px"
+ });
+ } else {
+ $("#" + sliderId + "_pullout").css({
+ right: "0px",
+ width: pulloutWidth + "px"
+ });
+ }
+
+ totalPullOutHeight += parseInt($("#" + sliderId + "_pullout").height());
+ totalPullOutHeight += parseInt(config.pulloutpadding);
+
+ var suggestedSliderHeight = totalPullOutHeight + 30;
+ if (suggestedSliderHeight > $("#" + sliderId).height()) {
+ $("#" + sliderId).css({
+ height: suggestedSliderHeight + "px"
+ });
+ }
+
+ if (config.clickawayclose) {
+ $("body").click( function () {
+ CloseSliders("");
+ });
+ }
+
+ // Put the content back now it is in position
+ $This.css({ opacity: 1 });
+
+ sliderCount++;
+ });
+
+ return this;
+ };
+})(jQuery);
+;
+/* Override this file with one containing code that belongs on every page of your application */
+
+
+;
+
+
+// Added by Karanjit Siyan 4/3/2004
+// TODO: Rewrite this in jQuery, or (better) handle as a search request.
+function SymbolSearch(bookID)
+{
+
+ var f = document.forms['frmSymbolSearch'];
+ var url;
+ var sel;
+
+ for(i=0;i51&&c<123) { c-=7; }
+ else if(c>44&&c<52) { c+=71; }
+ x+= String.fromCharCode(c);
+ }
+ em = ""+x+"";
+ elements[i].innerHTML = em;
+ }
+}
+
+;
+(function($){
+
+ $(function() {
+
+ var theSearchInput = $("#term");
+ var originalTerm = $.trim(theSearchInput.val());
+ var theForm = jQuery("form").has(theSearchInput);
+ var dbNode = theForm.find("#database");
+ var currDb = dbNode.val();
+ var sbConfig = {};
+ try{
+ sbConfig = eval("({" + theSearchInput.data("sbconfig") + "})");
+ }catch(e){}
+ var defaultSubmit = sbConfig.ds == "yes";
+ var searched = false;
+ var dbChanged = null; //since db.change is triggered as a workaround for JSL-2067
+ var searchModified = false; //this is used to allow searching when something else changed on the page without the term changing
+
+ if(!$.ncbi)
+ $.extend($,{ncbi:{}});
+ if(!$.ncbi.searchbar)
+ $.extend($.ncbi,{searchbar:{}});
+
+ $.extend($.ncbi.searchbar,
+ (function(){
+ //*****************private ******************/
+ function doSearchPing() {
+ try{
+ var cVals = ncbi.sg.getInstance()._cachedVals;
+ var searchDetails = {}
+ searchDetails["jsEvent"] = "search";
+ var app = cVals["ncbi_app"];
+ var db = cVals["ncbi_db"];
+ var pd = cVals["ncbi_pdid"];
+ var pc = cVals["ncbi_pcid"];
+ var sel = dbNode[0];
+ var searchDB = sel.options[sel.selectedIndex].value;
+ var searchText = theSearchInput[0].value;
+ if( app ){ searchDetails["ncbi_app"] = app.value; }
+ if( db ){ searchDetails["ncbi_db"] = db.value; }
+ if( pd ){ searchDetails["ncbi_pdid"] = pd.value; }
+ if( pc ){ searchDetails["ncbi_pcid"] = pc.value; }
+ if( searchDB ){ searchDetails["searchdb"] = searchDB;}
+ if( searchText ){ searchDetails["searchtext"] = searchText;}
+ ncbi.sg.ping( searchDetails );
+ }catch(e){
+ console.log(e);
+ }
+ }
+ function getSearchUrl(term){
+ var url = "";
+ if (typeof(NCBISearchBar_customSearchUrl) == "function")
+ url = NCBISearchBar_customSearchUrl();
+ if (!url) {
+ var searchURI = dbNode.find("option:selected").data("search_uri");
+ url = searchURI ? searchURI.replace('$',term) :
+ "/" + dbNode.val() + "/" + ( term !="" ? "?term=" + term : "");
+
+ url = "//www.ncbi.nlm.nih.gov"+url; }
+ return url;
+ }
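+ // Example (hypothetical values, not from the original source): with no custom URL hook,
+ // no per-option "search_uri", db "pubmed" and term "brca1", getSearchUrl("brca1")
+ // yields "//www.ncbi.nlm.nih.gov/pubmed/?term=brca1".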
+
+ return {
+ //*****************exposed attributes and functions ******************/
+ 'theSearchInput':theSearchInput,
+ 'theForm':theForm,
+ 'dbNode':dbNode,
+ 'searched':searched,
+ 'setSearchModified':function(){searchModified=true;},
+ 'searchModified':function(){return searchModified;},
+ 'doSearch':function(e){
+ e.stopPropagation();
+ e.preventDefault();
+ //checking the searched flag is necessary because the autocomplete control fires on the enter key, and the form submit also fires on the enter key
+ if(searched == false){
+ searched = true;
+ theForm.find('input[type="hidden"][name^="p$"]').attr('disabled', 'disabled');
+ //$("input[name]").not(jQuery(".search_form *")).attr('disabled', 'disabled');
+ if (defaultSubmit)
+ $.ncbi.searchbar.doSearchPing();
+ else {
+ var term = $.trim(theSearchInput.val());
+ if (dbChanged || searchModified || term !== originalTerm){
+ $.ncbi.searchbar.doSearchPing();
+ var searchUrl = $.ncbi.searchbar.getSearchUrl(encodeURIComponent(term).replace(/%20/g,'+'));
+ var doPost = (term.length > 2000) ? true : false;
+ if (doPost){
+ if (e.data.usepjs){
+ Portal.$send('PostFrom',{"theForm":theForm,"term":term,"targetUrl":searchUrl.replace(/\?.*/,'')});
+ }
+ else{
+ theForm.attr('action',searchUrl.replace(/\?.*/,''));
+ theForm.attr('method','post');
+ }
+ }
+ else {
+ window.location = searchUrl;
+ }
+ }
+ else{ //if (term !== originalTerm){
+ searched = false;
+ }
+ }
+ }
+ },
+ 'onDbChange':function(e){
+ if (dbChanged === null)
+ dbChanged = false;
+ else
+ dbChanged = true;
+ var optionSel = $(e.target).find("option:selected");
+ var dict = optionSel.data("ac_dict");
+ if (dict){
+ theSearchInput.ncbiautocomplete("option","isEnabled",true).ncbiautocomplete("option","dictionary",dict);
+ theSearchInput.attr("title","Search " + optionSel.text() + ". Use up and down arrows to choose an item from the autocomplete.");
+ }
+ else{
+ theSearchInput.ncbiautocomplete("turnOff",true);
+ theSearchInput.attr("title", "Search " + optionSel.text());
+ }
+ if (defaultSubmit)
+ theForm.attr('action','/' + dbNode.val() + '/');
+ },
+ 'doSearchPing':function(){
+ doSearchPing();
+ },
+ 'getSearchUrl':function(term){
+ return getSearchUrl(term);
+ }
+
+ };//end of return
+ })() //end of the self executing anon
+ );//end of $.extend($.ncbi.searchbar
+
+ function initSearchBar(usepjs){
+ //enable the controls for the back button
+ theForm.find('input[type="hidden"][name^="p$"]').removeAttr('disabled');
+ if (usepjs)
+ portalSearchBar();
+ }
+
+
+
+ function portalSearchBar(){
+
+ Portal.Portlet.NcbiSearchBar = Portal.Portlet.extend ({
+ init:function(path,name,notifier){
+ this.base (path, name, notifier);
+ },
+ send:{
+ "Cmd":null,
+ "Term":null
+ },
+ "listen":{
+ "PostFrom":function(sMessage,oData,sSrc){
+ this.postForm(oData.theForm,oData.term,oData.targetUrl);
+ }
+ },
+ "postForm":function(theForm,term,targetUrl){
+ //console.log('targetUrl = ' + targetUrl);
+ theForm.attr('action',targetUrl);
+ theForm.attr('method','post');
+ this.send.Cmd({
+ 'cmd' : 'Go'
+ });
+ this.send.Term({
+ 'term' : term
+ });
+ Portal.requestSubmit();
+ },
+ 'getPortletPath':function(){
+ return this.realpath + '.Entrez_SearchBar';
+ }
+ });
+
+ }//portalSearchBar
+
+
+
+ //portal javascript is required to make a POST when the rest of the app uses portal forms
+ var usepjs = sbConfig.pjs == "yes";
+ //console.log('sbConfig',sbConfig);
+ initSearchBar(usepjs);
+
+ dbNode.on("change",$.ncbi.searchbar.onDbChange);
+
+ theForm.on("submit",{'usepjs':usepjs},$.ncbi.searchbar.doSearch);
+ theSearchInput.on("ncbiautocompleteenter ncbiautocompleteoptionclick", function(){theForm.submit();});
+ //a work around for JSL-2067
+ dbNode.trigger("change");
+ //iOS 8.02 changed behavior on autofocus, should probably check other mobile devices too
+ if (sbConfig.afs == "yes" && !/(iPad|iPhone|iPod)/g.test(navigator.userAgent) ){
+ window.setTimeout(function(){
+ try{
+ var x = window.scrollX, y = window.scrollY;
+ var size= originalTerm.length;
+ if (size == 0 || /\s$/.test(originalTerm))
+ theSearchInput.focus()[0].setSelectionRange(size, size);
+ else
+ theSearchInput.focus().val(originalTerm + " ")[0].setSelectionRange(size+1, size+1);
+ window.scrollTo(x, y);
+ }
+ catch(e){} //setSelectionRange not defined in IE8
+ },1);
+ }
+
+ //set the query changed flag true after a few seconds, still prevents scripted clicking or stuck enter key
+ window.setTimeout(function(){$.ncbi.searchbar.setSearchModified();},2000);
+
+ });//End of DOM Ready
+
+})(jQuery);
+
+/*
+a call back for the 'Turn off' link at the bottom of the auto complete list
+*/
+function NcbiSearchBarAutoComplCtrl(){
+ jQuery("#term").ncbiautocomplete("turnOff",true);
+ if (typeof(NcbiSearchBarSaveAutoCompState) == 'function')
+ NcbiSearchBarSaveAutoCompState();
+ }
+
+
+
+
diff --git a/default_files/CAF-LBSMD.gif b/default_files/CAF-LBSMD.gif
new file mode 100644
index 00000000..e464d553
Binary files /dev/null and b/default_files/CAF-LBSMD.gif differ
diff --git a/default_files/CFEngine.jpg b/default_files/CFEngine.jpg
new file mode 100644
index 00000000..71b9fa7f
Binary files /dev/null and b/default_files/CFEngine.jpg differ
diff --git a/default_files/DISPDAndFWDaemon.jpg b/default_files/DISPDAndFWDaemon.jpg
new file mode 100644
index 00000000..f69c0d6e
Binary files /dev/null and b/default_files/DISPDAndFWDaemon.jpg differ
diff --git a/default_files/FWDaemonCheckPage.gif b/default_files/FWDaemonCheckPage.gif
new file mode 100644
index 00000000..96ac956f
Binary files /dev/null and b/default_files/FWDaemonCheckPage.gif differ
diff --git a/default_files/FWDaemonMonitor.gif b/default_files/FWDaemonMonitor.gif
new file mode 100644
index 00000000..416cb2fb
Binary files /dev/null and b/default_files/FWDaemonMonitor.gif differ
diff --git a/default_files/InstrumentPageStarterJS.js b/default_files/InstrumentPageStarterJS.js
new file mode 100644
index 00000000..d99c6c63
--- /dev/null
+++ b/default_files/InstrumentPageStarterJS.js
@@ -0,0 +1,104 @@
+if (typeof ncbi === "undefined") {
+ ncbi = {};
+}
+
+ncbi.sgAppsWithScrolling = [
+ {"ncbi_app": "entrez",
+ "ncbi_db": "gene",
+ "ncbi_report": "full_report"},
+ {"foo": "bar"}
+];
+
+;
+(function(){function F(a){return ncbi.sg._urls.getAttrFromStr("jsevent",a)}function G(a){for(var b=ncbi.sg.reservedParams,c=0;c100&&this.ignoreLengthRestrictions.indexOf(a)===-1)b=b.substr(0,100);this._cachedVals[a]={sProp:a,value:b}}},getVal:function(a){return typeof this._cachedVals[a]!=="undefined"&&this._cachedVals[a]&&
+typeof this._cachedVals[a].value!=="undefined"?this._cachedVals[a].value:null},removeAllEntries:function(){for(var a={},b=this.cachedNames.length,c=0;c0?":"+this._pathParts.part2:"",c=this._pathParts.part3.length>0?":"+this._pathParts.part3:"",e=this._pathParts.part4.length>
+0?":"+this._pathParts.part4:"";a={pagename:a+b+c,server:window.location.hostname,sitesect2:a+b,subsect3:a+b+c,subsect4:a+b+c+e,heir1:(a+b+c+e).replace(/:/g,"|")};for(var g in a)this.addEntry(g,a[g]);this._sessionIdCheck();this._staticPageCheck();this._prevHitCheck();this._browserConfigurationSettings();this._hashCheck()},_staticPageCheck:function(){this._cachedVals.ncbi_app&&this._cachedVals.ncbi_app.value.length>0||this.addEntry("ncbi_app","static");this._cachedVals.ncbi_pdid&&this._cachedVals.ncbi_pdid.value.length>
+0||this.addEntry("ncbi_pdid",(document.title||"unknown").replace(/\s+/g,""))},_sessionIdCheck:function(){if(!(this._cachedVals.ncbi_sessionid&&this._cachedVals.ncbi_sessionid.value.length>0)){var a="";if(a.length===0){var b=this.getCookie("ncbi_sid");if(b.length>0)a=b}if(a.length===0){b=this.getCookie("WebCubbyUser")||this.getCookie("WebEnv");if(b.length>0){b=unescape(b).split("@");if(b.length>1)a=b[b.length-1]}}if(a.length===0)a="UNK_SESSION";this.addEntry("ncbi_sessionid",a)}},getBrowserWidthHeight:function(){var a=
+this.getViewportWidth(),b=this.getViewportHeight();return{width:a,height:b}},_browserConfigurationSettings:function(){if(ncbi.sg.calcXY){var a=this.getBrowserWidthHeight();this.addEntry("browserwidth",a.width);this.addEntry("browserheight",a.height);this.addEntry("screenwidth",screen.width);this.addEntry("screenheight",screen.height);this.addEntry("screenavailwidth",screen.availWidth);this.addEntry("screenavailheight",screen.availHeight);if(document&&document.body){var b=document.body.scrollWidth,
+c=document.body.scrollHeight,e=c>a.height?"true":"false";this.addEntry("canscroll_x",b>a.width?"true":"false");this.addEntry("canscroll_y",e);this.addEntry("scrollwidth",b);this.addEntry("scrollheight",c)}}if(screen.colorDepth)this.addEntry("colorDepth",screen.colorDepth);else screen.pixelDepth&&this.addEntry("colorDepth",screen.pixelDepth)},_hashCheck:function(){var a=window.location.hash;if(a){a=a.replace("#","");this.addEntry("urlhash",a)}(a=window.location.search.match(/[?&]campaign=([^&]*)/))&&
+this.addEntry("campaign",a[1])},_createPHID:function(){var a=this._cachedVals.ncbi_sessionid.value,b=a.substr(0,15)+"9"+(new Date).getTime().toString(),c=a.length;b+=a.substr(c-(32-b.length),c);a={value:b};this.addEntry("ncbi_phid",b);return a},currentPageHitId:null,_prevHitCheck:function(){var a=this.getCookie("ncbi_prevPHID"),b=this._cachedVals.ncbi_phid;a.length>0&&this.addEntry("prev_phid",a);if(!b||!b.value||b.value.length===0)b=this._createPHID();this.currentPageHitId=b.value;var c=this;ncbi.sg._hasFocus&&
+c.setCookie("ncbi_prevPHID",b.value);var e=window.onfocus;window.onfocus=function(g){c.getCookie("ncbi_prevPHID")!==b.value&&c.setCookie("ncbi_prevPHID",b.value);typeof e==="function"&&e(g)}},_setUpPathParts:function(){var a=this._cachedVals.ncbi_app,b=this._cachedVals.ncbi_db,c=this._cachedVals.ncbi_pdid,e=this._cachedVals.ncbi_pcid;this._pathParts.part1=a!==undefined?a.value:"";this._pathParts.part2=b!==undefined?b.value:"";this._pathParts.part3=c!==undefined?c.value:"";this._pathParts.part4=e!==
+undefined?e.value:""},getPerfStats:function(){var a=window.performance;if(!a)return{};var b=a.timing;if(b)b={dns:b.domainLookupEnd-b.domainLookupStart,connect:b.connectEnd-b.connectStart,ttfb:b.responseStart-b.connectEnd,basePage:b.responseEnd-b.responseStart,frontEnd:b.loadEventStart-b.responseEnd};else return{};if(a=a.navigation){b.navType=a.type;b.redirectCount=a.redirectCount}return b},setPerfStats:function(a,b){var c=this.getPerfStats();for(var e in c){var g=c[e];if(g>=0){var i="jsperf_"+e;if(b)a[i]=
+g;else a.push(i+"="+g)}}},getExtraRenderStats:function(){var a={SELF_URL:encodeURIComponent(window.location.href)};if(typeof document!=="undefined"&&typeof document.referrer!=="undefined")a.HTTP_REFERER=encodeURIComponent(document.referrer);return a},setExtraRenderStats:function(a){var b=this.getExtraRenderStats();for(var c in b)a.push(c+"="+b[c])},_send:function(a,b,c){if(typeof c==="undefined"||c===null)c=true;var e=[];if(a==="init"){e.push("jsevent=render");ncbi.sg.renderTime=new Date;if(typeof ncbi_startTime!==
+"undefined"){e.push("jsrendertime="+(ncbi.sg.renderTime-ncbi_startTime));ncbi.sg.loadTime&&e.push("jsloadtime="+(ncbi.sg.loadTime-ncbi_startTime))}this.setPerfStats(e);this.setExtraRenderStats(e);e.push("cookieenabled="+(ncbi.sg.isCookieEnabled?"true":"false"))}for(var g in this._cachedVals)ncbi.sg.appLogIgnore.indexOf(g)===-1&&e.push(g+"="+encodeURIComponent(this._cachedVals[g].value));this._sendAl(e.join("&"),b,true,c);this._hasInitRun=true;var i=this;setTimeout(function(){i.isProcessRunning=false;
+i.runSGProcess()},300)},send:function(a,b){this._send(a,b,false)},_sendPrev:function(){var a=ncbi.sg.getInstance(),b;b=a.getCookie("prevselfurl")!=document.referrer?"false":"true";var c=a.getCookie("clicknext");if(c){c=ncbi.sg._urls.addAttrToStr("directnav",b,c);ncbi.sg._ping(c);a.setCookie("clicknext","")}if(c=a.getCookie("prevsearch")){c=ncbi.sg._urls.addAttrToStr("directnav",b,c);ncbi.sg._ping(c);a.setCookie("prevsearch","")}if(c=a.getCookie("unloadnext")){c=ncbi.sg._urls.addAttrToStr("directnav",
+b,c);ncbi.sg._ping(c);a.setCookie("unloadnext","")}a.setCookie("prevselfurl","")},_sendAl:function(a,b,c,e){if(typeof e==="undefined"||e===null)e=true;var g=F(a);g=H(g,a);if(a.indexOf("jseventms")===-1)a+="&jseventms="+ncbi.sg.getInstance().getMillisecondsSinceSunday();if(a.match(/jsevent=search/i)){for(var i=ncbi.sg.addTimeonpageToAr([]),p=0;p0)return b}if(document.cookie.length>
+0){b=document.cookie.indexOf(a+"=");if(b!==-1){b=b+a.length+1;a=document.cookie.indexOf(";",b);if(a===-1)a=document.cookie.length;return unescape(document.cookie.substring(b,a))}}return""},getTransport:function(){var a=null;if(window.XMLHttpRequest)try{a=new XMLHttpRequest;this.getTransport=function(){return new XMLHttpRequest}}catch(b){a=null}if(window.ActiveXObject&&a===null)try{a=new ActiveXObject("Msxml2.XMLHTTP");this.getTransport=function(){return new ActiveXObject("Msxml2.XMLHTTP")}}catch(c){try{a=
+new ActiveXObject("Microsoft.XMLHTTP");this.getTransport=function(){return new ActiveXObject("Microsoft.XMLHTTP")}}catch(e){a=false}}if(a===null)this.getTransport=function(){return null};return this.getTransport()},_shouldStorePing:function(a){return a.search(/jsevent=(click|search|unload)next/)!==-1||(a.search("jsevent=render")!==-1||a.search("jsevent=domready")!==-1||a.search("jsevent=jserror")!==-1)&&a.search("sgSource=api")===-1},_pushPingFired:function(a){this._shouldStorePing(a)&&ncbi.sg.pingsFired.push(a)},
+_pushPingSucceeded:function(a){this._shouldStorePing(a)&&ncbi.sg.pingsSucceeded.push(a)},makeAjaxCall:function(a,b,c){var e=this,g=this.getTransport();g._ncbi_skipOverride=true;g.open("GET",a,c);if(c)g.onreadystatechange=function(){if(g.readyState===4){ncbi.sg.outstandingPings-=1;e._pushPingSucceeded(a);b(g)}};ncbi.sg.lastPing=g;ncbi.sg.outstandingPings+=1;g.send(null);return g},makeImgRequest:function(a,b){var c=this,e=document.createElement("img");e.setAttribute("src",a);e.style.display="none";
+e.onload=function(){if(!this.complete||typeof this.naturalWidth==="undefined"||this.naturalWidth==0)console.warn("could not load stat img");else{c._pushPingSucceeded(a);b()}ncbi.sg.outstandingPings-=1;document.body.removeChild(e)};ncbi.sg.outstandingPings+=1;document.body.appendChild(e)},scrollDetails:{maxScroll_x:0,maxScroll_y:0,currScroll_x:0,currScroll_y:0,hasScrolled:false},scrollEventDetails:{xTenths:0,yTenths:0,xMax:0,yMax:0},_visibleHeadings:[],_hiddenHeadings:[],_getScrollXYPx:function(){return[window.pageXOffset||
+document.documentElement.scrollLeft||document.body.scrollLeft||0,window.pageYOffset||document.documentElement.scrollTop||document.body.scrollTop||0]},_getScrollXY:function(){var a=this.getViewportHeight(),b=this.getViewportWidth(),c=document.body.scrollHeight,e=document.body.scrollWidth,g=this._getScrollXYPx(),i=Math.round(g[1]/a*10)/10;return{xRel:Math.round(g[0]/b*10)/10,yRel:i,viewportHeight:a,viewportWidth:b,pageHeight:c,pageWidth:e}},_addOnScrollListeners:function(){var a=window.onscroll,b=this;
+window.onscroll=function(){if(ncbi.sg.isScrollingEnabled){b._setScrollDetails();b.scrollDetails.hasScrolled=true;b._addScrollEvent()}else{b._setScrollDetails();b.scrollDetails.hasScrolled=true}if(typeof a==="function")return a()}},getViewportHeight:function(){return window.innerHeight?window.innerHeight:document.documentElement&&document.documentElement.clientHeight?document.documentElement.clientHeight:document.body!==null?document.body.clientHeight:"NA"},getViewportWidth:function(){return window.innerWidth?
+window.innerWidth:document.documentElement&&document.documentElement.clientWidth?document.documentElement.clientWidth:document.body!==null?document.body.clientWidth:"NA"},_setScrollDetails:function(){if(ncbi.sg.calcXY){this.scrollDetails.currScroll_y=window.pageYOffset||document.documentElement.scrollTop||document.body.scrollTop||0;this.scrollDetails.currScroll_x=window.pageXOffset||document.documentElement.scrollLeft||document.body.scrollLeft||0;this.getViewportWidth();this.getViewportHeight();if(this.scrollDetails.maxScroll_y<
+this.scrollDetails.currScroll_y)this.scrollDetails.maxScroll_y=this.scrollDetails.currScroll_y;if(this.scrollDetails.maxScroll_x0){g-=parseInt(a.scrollTop);c=true}if(a.scrollLeft&&a.scrollLeft>0){e-=parseInt(a.scrollLeft);b=true}if(a.offsetParent){b=this.findElementPos(a.offsetParent,b,c);if(b==-1)return-1;e+=b[0];g+=b[1]}else if(a.ownerDocument){var i=a.ownerDocument.defaultView;if(!i&&a.ownerDocument.parentWindow)i=a.ownerDocument.parentWindow;if(i){var p=i.pageXOffset!==
+undefined?i.pageXOffset:(a.document.documentElement||a.document.body.parentNode||a.document.body).scrollLeft;a=i.pageYOffset!==undefined?i.pageYOffset:(a.document.documentElement||a.document.body.parentNode||a.document.body).scrollTop;if(!c&&a&&a>0)g-=parseInt(a);if(!b&&p&&p>0)e-=parseInt(p)}}return[e,g]},getJoinedData:function(a){var b=[];for(var c in a)b.push(c+"="+encodeURIComponent(a[c]));return b.join("&")},addScrollHeadingData:function(a,b){var c=this.scrollEventDetails.headings;if(c){a["numHeadings."+
+this._scrollOrder+".scrollInfo"]=c.length;for(var e=0;e=0&&p<=w&&m+z>=0&&m<=d)if(this.isVisible(i))n=true;(f.visible=n)?a.push(f):b.push(f)}}this._visibleHeadings=a;this._hiddenHeadings=b},getVisibleHeadings:function(){return this._visibleHeadings},getHiddenHeadings:function(){return this._hiddenHeadings},getVisibleHeadingIDs:function(){for(var a=this.getVisibleHeadings(),b=[],c=0;c1E3){this._scrollOrder=this._scrollOrder!=undefined?this._scrollOrder+1:0;if(ncbi.sg.calcXY)var i="yTenths."+this._scrollOrder+".scrollInfo",p="xTenths."+this._scrollOrder+".scrollInfo",n="maxXTenths."+this._scrollOrder+
+".scrollInfo",m="maxYTenths."+this._scrollOrder+".scrollInfo";b={};b["duration."+this._scrollOrder+".scrollInfo"]=this._lastScroll?c.tstamp-this._lastScroll:new Date-ncbi.sg.loadTime;if(ncbi.sg.calcXY){b[p]=this.scrollEventDetails.xTenths;b[i]=this.scrollEventDetails.yTenths;b[n]=this.scrollEventDetails.xMax;b[m]=this.scrollEventDetails.yMax;b["viewportHeight."+this._scrollOrder+".scrollInfo"]=e.viewportHeight;b["viewportWidth."+this._scrollOrder+".scrollInfo"]=e.viewportWidth;b["maxPossibleScrollTenthsY."+
+this._scrollOrder+".scrollInfo"]=Math.round((e.pageHeight/e.viewportHeight-1)*10);b["maxPossibleScrollTenthsX."+this._scrollOrder+".scrollInfo"]=Math.round((e.pageWidth/e.viewportWidth-1)*10)}g=b=this.addScrollHeadingData(b,a)}this._setBeforeScrollDetails(c.tstamp);return g},getScrollDetailsAr:function(a,b){var c=[];a=this.getScrollDetails(a,b);for(var e in a)c.push(e+"="+encodeURIComponent(a[e]));return c},addScrollDetailsAr:function(a,b,c){b=this.getScrollDetailsAr(b,c);for(c=0;c0)for(;m.length>0;)c(m.pop());
+var d={jsevent:"unload",ncbi_pingaction:"unload"};d=ncbi.sg.addTimeonpageDetails(d);d.eventid=ncbi.sg.getEventId();var f=ncbi.sg.getInstance();f.setPerfStats(d,true);f.addScrollDetails(d,1800-f.getJoinedData(d).length,true);if(!w){ncbi.sg._ping(d);var h="";for(var k in d)h+=k+"="+(k==="jsevent"?"unloadnext":d[k])+"&";h+="ncbi_phid="+f.currentPageHitId;f.setCookie("prevselfurl",window.location.href,null);f._storeNext("unloadnext",h,null)}w=true}function b(d){for(var f=m.length-1;f>=-1;f--)if(m[f]===
+d){m.slice(f,1);break}c(d)}function c(d,f){if(u.indexOf(d.tstamp)===-1){u.push(d.tstamp);z.push(d);e("click",d,f)}}function e(d,f,h,k){if(typeof k==="undefined"||k===null)k=true;var o=d==="click"?"link":"elem",l=f.link,s=f.evt,j=l.id||"",q=l.name||"",v=l.sid||"",y=l.href||"",A=l.innerText||l.textContent||"";if(A.length>50)A=A.substr(0,50);var D=l.getAttribute?l.getAttribute("ref")||l.ref||"":"",E=l.className?l.className.replace(/^\s?/,"").replace(/\s?$/,"").split(/\s/g).join(",")||"":"";f=[];var B=
+[],x=l.parentNode;if(x)for(var t=0;t<6&&x!==null;t++){(parId=x.id)&&f.push(parId);if(parClassName=x.className)B=B.concat(parClassName.split(/\s/));x=x.parentNode}x=ncbi.sg.getInstance();t=x.currentPageHitId||"";var r=[];j.length>0&&r.push(o+"_id="+encodeURIComponent(j));q.length>0&&r.push(o+"_name="+encodeURIComponent(q));v.length>0&&r.push(o+"_sid="+encodeURIComponent(v));y.length>0&&r.push(o+"_href="+encodeURIComponent(y));A.length>0&&r.push(o+"_text="+encodeURIComponent(A));E.length>0&&r.push(o+
+"_class="+encodeURIComponent(E));if(ncbi.sg.calcXY){t=x.getBrowserWidthHeight();t.width!==null&&r.push("browserwidth="+encodeURIComponent(t.width));t.height!==null&&r.push("browserheight="+encodeURIComponent(t.height))}for(var C in s){t=s[C];t!==undefined&&r.push(C.toLowerCase()+"="+t.toString())}d==="click"&&r.push("eventid="+encodeURIComponent(ncbi.sg.getEventId()));r.push("jsevent="+d);D.length>0&&r.push(D);if(typeof jQuery!=="undefined")if(l=jQuery(l).attr("sg")){l=l.split(/\}\s*,\s*\{/);for(t=
+0;t0)for(;h.length>0;)r.push(h.shift());f.length>0&&r.push("ancestorId="+f.join(","));B.length>0&&r.push("ancestorClassName="+B.join(",").replace(/\s+/g," ").replace(/(^\s|\s$)/g,""));x.addScrollDetailsAr(r,1800-r.join("&").length,true);if(d==="click"){r=ncbi.sg.addTimeonpageToAr(r);d=r.join("&").replace("jsevent=click",
+"jsevent=clicknext");t=ncbi.sg.getInstance().currentPageHitId||"";d+="&ncbi_phid="+t;x.setCookie("prevselfurl",window.location.href,null);x._storeNext("clicknext",d,null,k)}ncbi.sg._ping(r,true,null,null,k)}function g(d){var f={};if(ncbi.sg.calcXY&&d){if(d.clientX||d.clientY){var h=ncbi.sg.getInstance()._getScrollXYPx();f.evt_coor_x=d.clientX+h[0];f.evt_coor_y=d.clientY+h[1]}else if(d.pageX||d.pageY){f.evt_coor_x=d.pageX;f.evt_coor_y=d.pageY}f.jseventms=ncbi.sg.getInstance().getMillisecondsSinceSunday()}return f}
+function i(d,f,h,k,o){var l={},s=null,j=null;if(typeof f==="string"){s=f;j=h}else{l=g(f);s=h;j=k}if(j){f=typeof j;if(f==="string")j=[j];else if(f==="object"&&!(j instanceof Array)){f=[];for(var q in j)f.push(q+"="+j[q]);j=f}}e(s,{link:d,evt:l},j,o)}function p(d,f,h){var k=[];if(typeof f==="undefined")f=true;if(typeof d==="object"&&!(d instanceof Array))for(var o in d)k.push(o+"="+encodeURIComponent(d[o]));else if(typeof d==="string")k.push(d);else k=d;d=ncbi.sg.getInstance().currentPageHitId||"";
+o=null;if(typeof ncbi.sg.loadTime!=="undefined")o=new Date-ncbi.sg.loadTime;var l=k.join("&");if(l.indexOf("jsevent=clicknext")!==-1||l.indexOf("jsevent=searchnext")!==-1||l.indexOf("jsevent=unloadnext")!==-1){d.length>0&&k.push("next_phid="+encodeURIComponent(d));o!==null&&k.push("next_ncbi_timesinceload="+o)}else{d.length>0&&k.push("ncbi_phid="+encodeURIComponent(d));o!==null&&k.push("ncbi_timesinceload="+o)}ncbi.sg.getInstance()._sendAl(k.join("&"),null,f,h)}var n=window.onerror;window.onerror=
+function(d,f,h){if(!ncbi.sg.hasNotedErrorEvent){ncbi.sg.getInstance().noteEventData("jserror",{jserror:d,jserrorlocation:f,jserrorline:h,SELF_URL:window.location.href},["ncbi_sessionid","ncbi_phid"]);ncbi.sg.hasNotedErrorEvent=true;if(typeof n==="function")return n(d,f,h)}};ncbi.sg._currentEventId=1;ncbi.sg.getEventId=function(){return ncbi.sg._currentEventId++};var m=[],u=[],z=[],w=false;ncbi.sg.sendElementEvent=function(d,f,h){e(d,f,h,false)};ncbi.sg.clickTimers=[];setClickEvent=function(){function d(){a()}
+var f=function(j){return(j=typeof j.parentNode!=="undefined"?j.parentNode:null)?o(j)?j:f(j):false},h=function(j){var q=j.target||j.srcElement;if(typeof q=="undefined"||q==null)return null;if(q.nodeType==3)q=j.target.parentNode;o(q)||(q=f(q));return q},k=function(j){return ncbi.sg.getInstance().isInLinkObjs(j)},o=function(j){var q=typeof j.tagName!=="undefined"?j.tagName.toLowerCase():null,v=false,y=false;if(typeof jQuery!=="undefined")v=jQuery(j).is("button, input[type=button], input[type=submit], input[type=reset]");
+else if(q==="input"){v=j.type;v=v=="button"||v=="submit"||v=="reset"}else v=q==="button"?true:false;v||(y=q=="a"||q=="area");return y?"link":v?"button":k(j)?"linkObjs":null},l=function(j,q,v,y){if(!(y&&y=="click"&&j.which&&j.which==3))if(!(!q||o(q)==null)){ncbi.sg.getInstance().setCookie("ncbi_prevPHID",ncbi.sg.getInstance().currentPageHitId);j=g(j);j.iscontextmenu=y=="contextmenu"?"true":"false";q={evt:j,link:q,tstamp:(new Date).getTime(),floodTstamp:(new Date).getTime()};b(q);ncbi.sg.clickTimers&&
+window.clearTimeout(ncbi.sg.clickTimers);ncbi.sg.clickTimers=window.setTimeout(function(){ncbi.sg.clickTimers=null},300)}};if(window.addEventListener){window.addEventListener("click",function(j){l(j,h(j),[],"click")});window.addEventListener("contextmenu",function(j){l(j,h(j),[],"contextmenu")},false);window.addEventListener("beforeunload",d)}else if(window.attachEvent){document.attachEvent("onclick",function(j){l(j,h(j),[],"click")});document.attachEvent("oncontextmenu",function(j){l(j,h(j),[],"contextmenu")},
+false);document.attachEvent("onbeforeunload",d)}if(Event.prototype.stopPropagation){var s=Event.prototype.stopPropagation;Event.prototype.stopPropagation=function(){var j=h(this);if(o(j)!=null)if(this.type=="click")l(this,j,[],"click");else this.type=="contextmenu"&&l(this,j,[],"contextmenu");return s.apply(this,arguments)}}};ncbi.sg.isClickingEnabled&&setClickEvent();ncbi.sg.scanLinks=function(d){var f=ncbi.sg.getInstance();if(d){var h=typeof jQuery!=="undefined"&&jQuery?d instanceof jQuery:false;
+if(typeof d==="object"&&!(d instanceof Array)&&!h)d=[d];f.addLinkObjs(d)}};ncbi.sg._ping=function(d,f,h,k,o){if(typeof o==="undefined"||o===null)o=true;typeof d==="undefined"||d===null||(typeof d==="object"&&d.nodeName!==undefined?i(d,f,h,k,o):p(d,f,o))};ncbi.sg.ping=function(d,f,h,k){ncbi.sg._ping(d,f,h,k,false)};ncbi.sg.loadTime=new Date;ncbi.sg.pingsFired=[];ncbi.sg.pingsSucceeded=[];ncbi.sg.prevPingsFired=null;ncbi.sg.prevPingsSucceeded=null;ncbi.sg.outstandingPings=0;ncbi.sg._isGBPage=false;
+ncbi.sg._urls={getDataObj:function(d){if(typeof d==="string"||typeof DOMString!=="undefined"&&d instanceof DOMString||typeof String!=="undefined"&&d instanceof String){var f={};d=d.split("&");for(var h=0;h2)for(var s=2;s1?typeof d[1]!=="undefined"&&d[1]?d[1]:"":null},getMainUrlPart:function(d){d=this.getUrlParts(d);return d.length>0?typeof d[0]!=="undefined"&&d[0]?d[0]:"":""},hasQuestionMark:function(d){return d.search(/\?/)!==-1},getDataStr:function(d){if(typeof d==="object"&&!(d instanceof Array)){var f=[];for(var h in d)f.push(h+"="+d[h]);return f.join("&")}else return false},attrInStr:function(d,
+f){if(typeof f!=="undefined"&&f&&f.length>0)if(typeof this.getDataObj(f)[d]!=="undefined")return true;return false},attrInUrl:function(d,f){return this.attrInStr(this.getQueryString(f))},getAttrFromData:function(d,f){if(typeof f!=="undefined")return f[d]},getAttrFromStr:function(d,f){return this.getAttrFromData(d,this.getDataObj(f))},addAttr:function(d,f,h){h[d]=f;return h},addAttrToStr:function(d,f,h){h=this.getDataObj(h);h=this.addAttr(d,f,h);return this.getDataStr(h)},addDataToStr:function(d,f){f=
+f;for(var h in d)f=this.addAttrToStr(h,d[h],f);return f},addDataToObj:function(d,f){for(var h in f)d[h]=f[h]},addAttrWithIndex:function(d,f,h){var k=ncbi.sg.getInstance();if(typeof d==="undefined"||!d)return f;if(typeof f!=="undefined"&&f){k=k.getVal(d);f[d]=k?k:"unknown";if(typeof h!=="undefined"&&h)f[d]+=".0"+ncbi.sg._ajaxRequestIndex}return f},addAttrWithIndexToUrl:function(d,f,h){var k=this.getMainUrlPart(f);f=this.getQueryString(f);f=f!==null?f:"";f=this.addAttrWithIndexToStr(d,f,h);return k+
+"?"+f},addAttrWithIndexToStr:function(d,f,h){f=this.getDataObj(f);ncbi.sg.getInstance();f=this.addAttrWithIndex(d,f,h);return this.getDataStr(f)}}})();if(!Array.prototype.indexOf)Array.prototype.indexOf=function(a,b){var c=this.length>>>0;b=Number(b)||0;b=b<0?Math.ceil(b):Math.floor(b);if(b<0)b+=c;for(;b1&&typeof arguments[1]!=="undefined"?arguments[1]:null;if(e!==null&&ncbi.sg._isGBPage){e=ncbi.sg._urls.addAttrWithIndexToUrl("ncbi_phid",e,true);e=ncbi.sg._urls.addAttrWithIndexToUrl("ncbi_sessionid",e);arguments[1]=e}e=c.apply(this,arguments);b(this,"ncbi_phid","NCBI-PHID",true);ncbi.sg._ajaxRequestIndex+=1}return e}}})()}})();
+
+;
+// This code creates window.console if it doesn't exist.
+// It also creates stub functions for those functions that are missing in window.console.
+// (Safari implements some but not all of the firebug window.console methods--this implements the rest.)
+(function() {
+ var names = [ "log", "debug", "info", "warn", "error", "assert", "dir", "dirxml", "group",
+ "groupEnd", "time", "timeEnd", "count", "trace", "profile", "profileEnd" ];
+
+ if (typeof(console) === 'undefined' || typeof console === "function" ) {
+ //"typeof function" is needed see PP-769
+ console = {};
+ }
+
+ for (var i = 0; i < names.length; ++i) {
+ if (typeof(console[names[i]]) === 'undefined') {
+ console[names[i]] = function() { return false; };
+ }
+ }
+ ncbi.sg.getInstance().init();
+})();
diff --git a/default_files/LBSMDSearchMain.gif b/default_files/LBSMDSearchMain.gif
new file mode 100644
index 00000000..b694667f
Binary files /dev/null and b/default_files/LBSMDSearchMain.gif differ
diff --git a/default_files/LoadBalancingDispD.jpg b/default_files/LoadBalancingDispD.jpg
new file mode 100644
index 00000000..e90d9084
Binary files /dev/null and b/default_files/LoadBalancingDispD.jpg differ
diff --git a/default_files/LoadBalancingInternetLong.jpg b/default_files/LoadBalancingInternetLong.jpg
new file mode 100644
index 00000000..aee8b8fe
Binary files /dev/null and b/default_files/LoadBalancingInternetLong.jpg differ
diff --git a/default_files/LoadBalancingInternetShort.jpg b/default_files/LoadBalancingInternetShort.jpg
new file mode 100644
index 00000000..34d4108d
Binary files /dev/null and b/default_files/LoadBalancingInternetShort.jpg differ
diff --git a/default_files/LoadBalancingLocal.jpg b/default_files/LoadBalancingLocal.jpg
new file mode 100644
index 00000000..7d7914af
Binary files /dev/null and b/default_files/LoadBalancingLocal.jpg differ
diff --git a/default_files/NetCache_diagramm.gif b/default_files/NetCache_diagramm.gif
new file mode 100644
index 00000000..bc95beaa
Binary files /dev/null and b/default_files/NetCache_diagramm.gif differ
diff --git a/default_files/Penalty.jpg b/default_files/Penalty.jpg
new file mode 100644
index 00000000..e8e6ecd8
Binary files /dev/null and b/default_files/Penalty.jpg differ
diff --git a/default_files/QA.jpg b/default_files/QA.jpg
new file mode 100644
index 00000000..dfe4f437
Binary files /dev/null and b/default_files/QA.jpg differ
diff --git a/default_files/QACookieManager.gif b/default_files/QACookieManager.gif
new file mode 100644
index 00000000..0ee704d8
Binary files /dev/null and b/default_files/QACookieManager.gif differ
diff --git a/default_files/ch_app_lbsmd_cfg_structure.png b/default_files/ch_app_lbsmd_cfg_structure.png
new file mode 100644
index 00000000..de81c0a8
Binary files /dev/null and b/default_files/ch_app_lbsmd_cfg_structure.png differ
diff --git a/default_files/clear.png b/default_files/clear.png
new file mode 100644
index 00000000..c7af78d6
Binary files /dev/null and b/default_files/clear.png differ
diff --git a/default_files/data_types.gif b/default_files/data_types.gif
new file mode 100644
index 00000000..c8f3bfd3
Binary files /dev/null and b/default_files/data_types.gif differ
diff --git a/default_files/hfjs2.js b/default_files/hfjs2.js
new file mode 100644
index 00000000..de4eba17
--- /dev/null
+++ b/default_files/hfjs2.js
@@ -0,0 +1,99 @@
+var signin = document.getElementById("sign_in");
+if(typeof signin != 'undefined' && signin){
+ signin.href = signin.href + "?back_url=" + encodeURIComponent(window.location);
+}
+
+var signout = document.getElementById('sign_out');
+if(typeof signout != 'undefined' && signout){
+ signout.href = signout.href + "?back_url=" + encodeURIComponent(window.location);
+}
+
+function getCookie(cookie_name) {
+ var start_pos = document.cookie.indexOf(cookie_name + "="); //start cookie name
+ if (start_pos != -1) {
+ start_pos = start_pos + cookie_name.length+1; //start cookie value
+ var end_pos = document.cookie.indexOf(";", start_pos);
+ if (end_pos == -1) {
+ end_pos = document.cookie.length;
+ }
+ return decodeURIComponent(document.cookie.substring(start_pos, end_pos));
+ }
+ else {
+ return "";
+ }
+}
+
+var c = getCookie('WebCubbyUser');
+c = decodeURIComponent(decodeURIComponent(c));
+lre = /.*logged-in\=(\w*);.*/;
+ure = /.*my-name\=([\w|\-|\.|\ |\@|\+]*);.*/;
+plus = /\+/gi;
+
+if(c){
+ l = lre.exec( c );
+ if(l && l[1] && l[1] === 'true' ) {
+ u = ure.exec( c );
+ if(u && u[1]){
+ var myncbi_username = document.getElementById("myncbiusername");
+ var uname = document.getElementById('mnu');
+ if (uname) {
+ if (typeof uname != 'undefined') {
+ uname.appendChild(document.createTextNode(u[1].replace(plus, ' ')));
+ myncbi_username.style.display = "inline";
+
+ var signin = document.getElementById("sign_in");
+ signin.style.display = "none";
+
+ var signout = document.getElementById("sign_out");
+ signout.style.display = "inline";
+
+ var myncbi = document.getElementById('myncbi');
+ myncbi.style.display='inline';
+ }
+ }
+ }
+ }
+}
+
+(function( $ ){
+ $( function() {
+ if (typeof $.fn.ncbipopper == "function") {
+ $('#info .external').each( function(){
+ var $this = $( this );
+ var popper = $this;
+ popper.ncbipopper({
+ destSelector: '#external-disclaimer',
+ isDestElementCloseClick: false,
+ openAnimation: 'none',
+ closeAnimation: 'none',
+ isTriggerElementCloseClick: false,
+ triggerPosition: 'bottom center',
+ destPosition: 'top center',
+ hasArrow: true,
+ arrowDirection: 'top'
+ });
+ });
+ }
+ });
+})( jQuery );
+
+if(typeof jQuery !== 'undefined' && jQuery.ui){
+ var version = jQuery.ui.jig.version;
+ var pieces = version.split(".");
+ if(pieces[0] >= 1 && pieces[1] >= 11){
+ if(pieces[1] == 11 && pieces[2] && pieces[3] >= 2){
+ jQuery("#sign_in").click(function(e){
+ if(typeof jQuery.ui.jig.requiresLogin !== 'undefined'){
+ e.preventDefault();
+ jQuery.ui.jig.requiresLogin();
+ }
+ });
+ }
+ }
+}
+// Global Alerts - new
+if (typeof(jQuery) != 'undefined') {
+ jQuery.getScript("/core/alerts/alerts.js", function () {
+ galert(['div.nav_and_browser', 'div.header', '#universal_header', 'body > *:nth-child(1)'])
+ });
+}
\ No newline at end of file
diff --git a/default_files/jig.css b/default_files/jig.css
new file mode 100644
index 00000000..f3dce223
--- /dev/null
+++ b/default_files/jig.css
@@ -0,0 +1 @@
+.ui-helper-hidden{display:none;}.ui-helper-hidden-accessible{position:absolute;left:-99999999px;}.ui-helper-reset{margin:0;padding:0;border:0;outline:0;line-height:1.3;text-decoration:none;font-size:100%;list-style:none;}.ui-helper-clearfix:after{content:".";display:block;height:0;clear:both;visibility:hidden;}.ui-helper-clearfix{display:inline-block;}/* required comment for clearfix to work in Opera \*/ * html .ui-helper-clearfix{height:1%;}.ui-helper-clearfix{display:block;}/* end clearfix */ .ui-helper-zfix{width:100%;height:100%;top:0;left:0;position:absolute;opacity:0;filter:Alpha(Opacity=0);}.ui-state-disabled{cursor:default!important;}.ui-icon{display:block;text-indent:-99999px;overflow:hidden;background-repeat:no-repeat;}.ui-widget-overlay{position:absolute;top:0;left:0;width:100%;height:100%;}.ui-widget{font-size:1.1em;}.ui-widget-content{border:1px solid #aaa;background:#fff url(images/ui-bg_flat_75_ffffff_40x100.png) 50% 50% repeat-x;color:#222;}.ui-widget-content a{color:#222;}.ui-widget-header{border:1px solid #aaa;background:#ccc url(images/ui-bg_highlight-soft_75_cccccc_1x100.png) 50% 50% repeat-x;color:#222;font-weight:bold;}.ui-widget-header a{color:#222;}.ui-state-default,.ui-widget-content .ui-state-default{border:1px solid #d3d3d3;background:#e6e6e6 url(images/ui-bg_glass_75_e6e6e6_1x400.png) 50% 50% repeat-x;font-weight:normal;color:#555;}.ui-state-default a,.ui-state-default a:link,.ui-state-default a:visited{color:#555;text-decoration:none;}.ui-state-hover,.ui-widget-content .ui-state-hover,.ui-state-focus,.ui-widget-content .ui-state-focus{border:1px solid #999;background:#dadada url(images/ui-bg_glass_75_dadada_1x400.png) 50% 50% repeat-x;font-weight:normal;color:#212121;}.ui-state-hover a,.ui-state-hover a:hover{color:#212121;text-decoration:none;}.ui-state-active,.ui-widget-content .ui-state-active{border:1px solid #aaa;background:#fff url(images/ui-bg_glass_65_ffffff_1x400.png) 50% 50% repeat-x;font-weight:normal;color:#212121;}.ui-state-active a,.ui-state-active a:link,.ui-state-active a:visited{color:#212121;text-decoration:none;}.ui-state-highlight,.ui-widget-content .ui-state-highlight{border:1px solid #fcefa1;background:#fbf9ee url(images/ui-bg_glass_55_fbf9ee_1x400.png) 50% 50% repeat-x;color:#363636;}.ui-state-highlight a,.ui-widget-content .ui-state-highlight a{color:#363636;}.ui-state-error,.ui-widget-content .ui-state-error{border:1px solid #cd0a0a;background:#fef1ec url(images/ui-bg_glass_95_fef1ec_1x400.png) 50% 50% repeat-x;color:#cd0a0a;}.ui-state-error a,.ui-widget-content .ui-state-error a{color:#cd0a0a;}.ui-state-error-text,.ui-widget-content .ui-state-error-text{color:#cd0a0a;}.ui-priority-primary,.ui-widget-content .ui-priority-primary{font-weight:bold;}.ui-priority-secondary,.ui-widget-content .ui-priority-secondary{opacity:.7;filter:Alpha(Opacity=70);font-weight:normal;}.ui-state-disabled,.ui-widget-content .ui-state-disabled{opacity:.35;filter:Alpha(Opacity=35);background-image:none;}.ui-icon{width:16px;height:16px;background-image:url(images/ui-icons_222222_256x240.png);}.ui-widget-content .ui-icon{background-image:url(images/ui-icons_222222_256x240.png);}.ui-widget-header .ui-icon{background-image:url(images/ui-icons_222222_256x240.png);}.ui-state-default .ui-icon{background-image:url(images/ui-icons_888888_256x240.png);}.ui-state-hover .ui-icon,.ui-state-focus .ui-icon{background-image:url(images/ui-icons_454545_256x240.png);}.ui-state-active .ui-icon{background-image:url(images/ui-icons_454545_256x240.png);}.ui-state-highlight 
.ui-icon{background-image:url(images/ui-icons_2e83ff_256x240.png);}.ui-state-error .ui-icon,.ui-state-error-text .ui-icon{background-image:url(images/ui-icons_cd0a0a_256x240.png);}.ui-icon-carat-1-n{background-position:0 0;}.ui-icon-carat-1-ne{background-position:-16px 0;}.ui-icon-carat-1-e{background-position:-32px 0;}.ui-icon-carat-1-se{background-position:-48px 0;}.ui-icon-carat-1-s{background-position:-64px 0;}.ui-icon-carat-1-sw{background-position:-80px 0;}.ui-icon-carat-1-w{background-position:-96px 0;}.ui-icon-carat-1-nw{background-position:-112px 0;}.ui-icon-carat-2-n-s{background-position:-128px 0;}.ui-icon-carat-2-e-w{background-position:-144px 0;}.ui-icon-triangle-1-n{background-position:0 -16px;}.ui-icon-triangle-1-ne{background-position:-16px -16px;}.ui-icon-triangle-1-e{background-position:-32px -16px;}.ui-icon-triangle-1-se{background-position:-48px -16px;}.ui-icon-triangle-1-s{background-position:-64px -16px;}.ui-icon-triangle-1-sw{background-position:-80px -16px;}.ui-icon-triangle-1-w{background-position:-96px -16px;}.ui-icon-triangle-1-nw{background-position:-112px -16px;}.ui-icon-triangle-2-n-s{background-position:-128px -16px;}.ui-icon-triangle-2-e-w{background-position:-144px -16px;}.ui-icon-arrow-1-n{background-position:0 -32px;}.ui-icon-arrow-1-ne{background-position:-16px -32px;}.ui-icon-arrow-1-e{background-position:-32px -32px;}.ui-icon-arrow-1-se{background-position:-48px -32px;}.ui-icon-arrow-1-s{background-position:-64px -32px;}.ui-icon-arrow-1-sw{background-position:-80px -32px;}.ui-icon-arrow-1-w{background-position:-96px -32px;}.ui-icon-arrow-1-nw{background-position:-112px -32px;}.ui-icon-arrow-2-n-s{background-position:-128px -32px;}.ui-icon-arrow-2-ne-sw{background-position:-144px -32px;}.ui-icon-arrow-2-e-w{background-position:-160px -32px;}.ui-icon-arrow-2-se-nw{background-position:-176px -32px;}.ui-icon-arrowstop-1-n{background-position:-192px -32px;}.ui-icon-arrowstop-1-e{background-position:-208px -32px;}.ui-icon-arrowstop-1-s{background-position:-224px -32px;}.ui-icon-arrowstop-1-w{background-position:-240px -32px;}.ui-icon-arrowthick-1-n{background-position:0 -48px;}.ui-icon-arrowthick-1-ne{background-position:-16px -48px;}.ui-icon-arrowthick-1-e{background-position:-32px -48px;}.ui-icon-arrowthick-1-se{background-position:-48px -48px;}.ui-icon-arrowthick-1-s{background-position:-64px -48px;}.ui-icon-arrowthick-1-sw{background-position:-80px -48px;}.ui-icon-arrowthick-1-w{background-position:-96px -48px;}.ui-icon-arrowthick-1-nw{background-position:-112px -48px;}.ui-icon-arrowthick-2-n-s{background-position:-128px -48px;}.ui-icon-arrowthick-2-ne-sw{background-position:-144px -48px;}.ui-icon-arrowthick-2-e-w{background-position:-160px -48px;}.ui-icon-arrowthick-2-se-nw{background-position:-176px -48px;}.ui-icon-arrowthickstop-1-n{background-position:-192px -48px;}.ui-icon-arrowthickstop-1-e{background-position:-208px -48px;}.ui-icon-arrowthickstop-1-s{background-position:-224px -48px;}.ui-icon-arrowthickstop-1-w{background-position:-240px -48px;}.ui-icon-arrowreturnthick-1-w{background-position:0 -64px;}.ui-icon-arrowreturnthick-1-n{background-position:-16px -64px;}.ui-icon-arrowreturnthick-1-e{background-position:-32px -64px;}.ui-icon-arrowreturnthick-1-s{background-position:-48px -64px;}.ui-icon-arrowreturn-1-w{background-position:-64px -64px;}.ui-icon-arrowreturn-1-n{background-position:-80px -64px;}.ui-icon-arrowreturn-1-e{background-position:-96px -64px;}.ui-icon-arrowreturn-1-s{background-position:-112px 
-64px;}.ui-icon-arrowrefresh-1-w{background-position:-128px -64px;}.ui-icon-arrowrefresh-1-n{background-position:-144px -64px;}.ui-icon-arrowrefresh-1-e{background-position:-160px -64px;}.ui-icon-arrowrefresh-1-s{background-position:-176px -64px;}.ui-icon-arrow-4{background-position:0 -80px;}.ui-icon-arrow-4-diag{background-position:-16px -80px;}.ui-icon-extlink{background-position:-32px -80px;}.ui-icon-newwin{background-position:-48px -80px;}.ui-icon-refresh{background-position:-64px -80px;}.ui-icon-shuffle{background-position:-80px -80px;}.ui-icon-transfer-e-w{background-position:-96px -80px;}.ui-icon-transferthick-e-w{background-position:-112px -80px;}.ui-icon-folder-collapsed{background-position:0 -96px;}.ui-icon-folder-open{background-position:-16px -96px;}.ui-icon-document{background-position:-32px -96px;}.ui-icon-document-b{background-position:-48px -96px;}.ui-icon-note{background-position:-64px -96px;}.ui-icon-mail-closed{background-position:-80px -96px;}.ui-icon-mail-open{background-position:-96px -96px;}.ui-icon-suitcase{background-position:-112px -96px;}.ui-icon-comment{background-position:-128px -96px;}.ui-icon-person{background-position:-144px -96px;}.ui-icon-print{background-position:-160px -96px;}.ui-icon-trash{background-position:-176px -96px;}.ui-icon-locked{background-position:-192px -96px;}.ui-icon-unlocked{background-position:-208px -96px;}.ui-icon-bookmark{background-position:-224px -96px;}.ui-icon-tag{background-position:-240px -96px;}.ui-icon-home{background-position:0 -112px;}.ui-icon-flag{background-position:-16px -112px;}.ui-icon-calendar{background-position:-32px -112px;}.ui-icon-cart{background-position:-48px -112px;}.ui-icon-pencil{background-position:-64px -112px;}.ui-icon-clock{background-position:-80px -112px;}.ui-icon-disk{background-position:-96px -112px;}.ui-icon-calculator{background-position:-112px -112px;}.ui-icon-zoomin{background-position:-128px -112px;}.ui-icon-zoomout{background-position:-144px -112px;}.ui-icon-search{background-position:-160px -112px;}.ui-icon-wrench{background-position:-176px -112px;}.ui-icon-gear{background-position:-192px -112px;}.ui-icon-heart{background-position:-208px -112px;}.ui-icon-star{background-position:-224px -112px;}.ui-icon-link{background-position:-240px -112px;}.ui-icon-cancel{background-position:0 -128px;}.ui-icon-plus{background-position:-16px -128px;}.ui-icon-plusthick{background-position:-32px -128px;}.ui-icon-minus{background-position:-48px -128px;}.ui-icon-minusthick{background-position:-64px -128px;}.ui-icon-close{background-position:-80px -128px;}.ui-icon-closethick{background-position:-96px -128px;}.ui-icon-key{background-position:-112px -128px;}.ui-icon-lightbulb{background-position:-128px -128px;}.ui-icon-scissors{background-position:-144px -128px;}.ui-icon-clipboard{background-position:-160px -128px;}.ui-icon-copy{background-position:-176px -128px;}.ui-icon-contact{background-position:-192px -128px;}.ui-icon-image{background-position:-208px -128px;}.ui-icon-video{background-position:-224px -128px;}.ui-icon-script{background-position:-240px -128px;}.ui-icon-alert{background-position:0 -144px;}.ui-icon-info{background-position:-16px -144px;}.ui-icon-notice{background-position:-32px -144px;}.ui-icon-help{background-position:-48px -144px;}.ui-icon-check{background-position:-64px -144px;}.ui-icon-bullet{background-position:-80px -144px;}.ui-icon-radio-off{background-position:-96px -144px;}.ui-icon-radio-on{background-position:-112px -144px;}.ui-icon-pin-w{background-position:-128px 
-144px;}.ui-icon-pin-s{background-position:-144px -144px;}.ui-icon-play{background-position:0 -160px;}.ui-icon-pause{background-position:-16px -160px;}.ui-icon-seek-next{background-position:-32px -160px;}.ui-icon-seek-prev{background-position:-48px -160px;}.ui-icon-seek-end{background-position:-64px -160px;}.ui-icon-seek-start{background-position:-80px -160px;}.ui-icon-seek-first{background-position:-80px -160px;}.ui-icon-stop{background-position:-96px -160px;}.ui-icon-eject{background-position:-112px -160px;}.ui-icon-volume-off{background-position:-128px -160px;}.ui-icon-volume-on{background-position:-144px -160px;}.ui-icon-power{background-position:0 -176px;}.ui-icon-signal-diag{background-position:-16px -176px;}.ui-icon-signal{background-position:-32px -176px;}.ui-icon-battery-0{background-position:-48px -176px;}.ui-icon-battery-1{background-position:-64px -176px;}.ui-icon-battery-2{background-position:-80px -176px;}.ui-icon-battery-3{background-position:-96px -176px;}.ui-icon-circle-plus{background-position:0 -192px;}.ui-icon-circle-minus{background-position:-16px -192px;}.ui-icon-circle-close{background-position:-32px -192px;}.ui-icon-circle-triangle-e{background-position:-48px -192px;}.ui-icon-circle-triangle-s{background-position:-64px -192px;}.ui-icon-circle-triangle-w{background-position:-80px -192px;}.ui-icon-circle-triangle-n{background-position:-96px -192px;}.ui-icon-circle-arrow-e{background-position:-112px -192px;}.ui-icon-circle-arrow-s{background-position:-128px -192px;}.ui-icon-circle-arrow-w{background-position:-144px -192px;}.ui-icon-circle-arrow-n{background-position:-160px -192px;}.ui-icon-circle-zoomin{background-position:-176px -192px;}.ui-icon-circle-zoomout{background-position:-192px -192px;}.ui-icon-circle-check{background-position:-208px -192px;}.ui-icon-circlesmall-plus{background-position:0 -208px;}.ui-icon-circlesmall-minus{background-position:-16px -208px;}.ui-icon-circlesmall-close{background-position:-32px -208px;}.ui-icon-squaresmall-plus{background-position:-48px -208px;}.ui-icon-squaresmall-minus{background-position:-64px -208px;}.ui-icon-squaresmall-close{background-position:-80px -208px;}.ui-icon-grip-dotted-vertical{background-position:0 -224px;}.ui-icon-grip-dotted-horizontal{background-position:-16px -224px;}.ui-icon-grip-solid-vertical{background-position:-32px -224px;}.ui-icon-grip-solid-horizontal{background-position:-48px -224px;}.ui-icon-gripsmall-diagonal-se{background-position:-64px -224px;}.ui-icon-grip-diagonal-se{background-position:-80px 
-224px;}.ui-corner-tl{-moz-border-radius-topleft:4px;-webkit-border-top-left-radius:4px;border-top-left-radius:4px;}.ui-corner-tr{-moz-border-radius-topright:4px;-webkit-border-top-right-radius:4px;border-top-right-radius:4px;}.ui-corner-bl{-moz-border-radius-bottomleft:4px;-webkit-border-bottom-left-radius:4px;border-bottom-left-radius:4px;}.ui-corner-br{-moz-border-radius-bottomright:4px;-webkit-border-bottom-right-radius:4px;border-bottom-right-radius:4px;}.ui-corner-top{-moz-border-radius-topleft:4px;-webkit-border-top-left-radius:4px;border-top-left-radius:4px;-moz-border-radius-topright:4px;-webkit-border-top-right-radius:4px;border-top-right-radius:4px;}.ui-corner-bottom{-moz-border-radius-bottomleft:4px;-webkit-border-bottom-left-radius:4px;border-bottom-left-radius:4px;-moz-border-radius-bottomright:4px;-webkit-border-bottom-right-radius:4px;border-bottom-right-radius:4px;}.ui-corner-right{-moz-border-radius-topright:4px;-webkit-border-top-right-radius:4px;border-top-right-radius:4px;-moz-border-radius-bottomright:4px;-webkit-border-bottom-right-radius:4px;border-bottom-right-radius:4px;}.ui-corner-left{-moz-border-radius-topleft:4px;-webkit-border-top-left-radius:4px;border-top-left-radius:4px;-moz-border-radius-bottomleft:4px;-webkit-border-bottom-left-radius:4px;border-bottom-left-radius:4px;}.ui-corner-all{-moz-border-radius:4px;-webkit-border-radius:4px;border-radius:4px;}.ui-widget-overlay{background:#aaa url(images/ui-bg_flat_0_aaaaaa_40x100.png) 50% 50% repeat-x;opacity:.30;filter:Alpha(Opacity=30);}.ui-widget-shadow{margin:-8px 0 0 -8px;padding:8px;background:#aaa url(images/ui-bg_flat_0_aaaaaa_40x100.png) 50% 50% repeat-x;opacity:.30;filter:Alpha(Opacity=30);-moz-border-radius:8px;-webkit-border-radius:8px;border-radius:8px;}.ui-widget{font-family:arial,"sans-serif"!important;font-size:100%;}.ui-helper-reset{font-size:100%!important;}iframe.ui-ncbi-iframe-fix{position:absolute;top:0;left:0;height:200px;z-index:3000;display:block;filter:alpha(opacity=1);}#ui-datepicker-div{display:none;}.ui-widget-content a{color:#2F4A8B;}.ui-helper-hidden-accessible{left:-10000000px!important;}#jig-ncbi_requires_login iframe{border:none;}
\ No newline at end of file
diff --git a/default_files/jig.js b/default_files/jig.js
new file mode 100644
index 00000000..25a5787a
--- /dev/null
+++ b/default_files/jig.js
@@ -0,0 +1,265 @@
+(function(){function La(){var g=null,j=jQuery.cookie("WebCubbyUser");if(j)if(j.indexOf("logged-in=true")>-1)if(j=j.match(/my-name=([^;]+)/i))g=j[1];return g}function W(g){this.name=this.selector="";this.onPage=false;this.dependsOn=[];this.interactions=[];this.overrideDefaults={};this.addCss=function(){document.write('')};this.addJs=function(){document.write('");this.oLoaded.push(sNewSrc);}}},sBase:"",oLoaded:[]},insertInHtml:function(text,obj){if(document.all){obj.innerHTML+=text;}else{var range=document.createRange();range.setStartAfter(obj);var docFrag=range.createContextualFragment(text);obj.appendChild(docFrag);}},replaceInHtml:function(text,obj){if(document.all){obj.innerHTML=text;}else{while(obj.hasChildNodes())obj.removeChild(obj.firstChild);var range=document.createRange();range.setStartAfter(obj);var docFrag=range.createContextualFragment(text);obj.appendChild(docFrag);}},drawText:function(sText,sId,add){if(!sId)sId="debug";var obj=document.getElementById(sId);if(obj){if(add)
+obj.innerHTML=" "+sText;else
+obj.innerHTML+=sText;}},createNewId:function(){var newid=null;while(!newid||document.getElementById(newid)){newid="XID"+Math.round(Math.random()*65536).toString(16);}
+return newid;}};String.prototype.trimSpaces=function(trimMode){var targetString=this;var iPos=0;if(!trimMode)trimMode=0;if(trimMode==0||trimMode==1){if(targetString.charAt(iPos)==" "){while(targetString.charAt(iPos)==" ")iPos++;targetString=targetString.substr(iPos);}}
+iPos=targetString.length-1;if(trimMode==0||trimMode==2){if(targetString.charAt(iPos)==" "){while(targetString.charAt(iPos)==" ")iPos--;targetString=targetString.substr(0,iPos+1);}}
+return targetString;}
+function $(){var elements=new Array();for(var i=0;i""){oElements[oElements.length]=els[i];}}
+return oElements;}
+function $N(name,node){var oElements=[];if(node==null)node=document;var els=node.getElementsByName(name);for(i=0;i""){if(rulehash[thisName]){oThis.addRule(oThis,sActionEvent,thisName,oThis.doDataExchange,rulehash[thisName]);}
+if(activenames[thisName]){oThis.addRule(oThis,sActionEvent,thisName,oThis.doSubmitAttribute,null);}
+oThis.listenForEvents(domCtrl,sActionEvent);}}finally{}};},addRule:function(oThis,sEvent,sName,fFunc,oArg){var rules=this._rules;var ename=sName+"$"+sEvent;var i;if(typeof(rules[ename])!='undefined'){for(i=0;i0)){el[0].value=domTarget.getAttribute("href");}}
+Dispatcher.getInstance().requestSubmit();},handleAction:function(e){var d=Dispatcher.getInstance();var t=this;var i;if(t.tagName&&t.tagName.toLowerCase()=='a'){e.preventDefault();e.stopPropagation();}
+var realname=this.getAttributeNode("realname");realname=realname?realname.value:null;if(this.name||realname){d.setSubmitSource(realname||this.name);}
+d.submitCheckBegin();try{console&&console.info("Executing rule "+t.name+"."+e.type);var rules=d.getRulesFor(t.name,e.type);for(i=0;rules&&(i0)){if(typeof(sourceName)=='undefined'){console&&console.warn("Warning: Can't identify submitter: using p$a=''");}else{el[0].value=sourceName;}}},getSrcDst:function(oRule,oNotifierObj){function x_FindObj(name,sid){var oResult=[];var oControls=$N(name);for(var i=0;i""?", ":"")+oSrcDst.src[i].value;}}},SetValue:function(oListener,oRule,sMessage,oNotifierObj){var dispatcher=oListener;var oSrcDst=dispatcher.getSrcDst(oRule,oNotifierObj);for(var j=0;j=0){dstItems.splice(position,1);}}
+dst.value=dstItems.join(", ");}},PropertyToValue:function(oListener,oRule,sMessage,oNotifierObj){var dispatcher=oListener;var oSrcDst=dispatcher.getSrcDst(oRule,oNotifierObj);for(var j=0;j]+)>/);var m;var d=Dispatcher.getInstance();for(msg in this.listen){m=isEvent.exec(msg);if(m){this.addEvent(m[1],m[2],this.listen[msg],false);}else{this._listen(msg,this.listen[msg],null);}}}
+if(this.send){for(msg in this.send){if(this.send[msg]==null){this.send[msg]=this.makeSender(this,msg);}}}},makeSender:function(sender,msg){return function(obj){return sender._send(msg,obj,null);};},getValue:function(attr){var prop=null;var ix;if(typeof attr==="object"){prop=attr.prop;attr=attr.attr;}
+else if((ix=attr.indexOf(":"))>=0){prop=attr.substring(ix+1);attr=attr.substring(0,ix);}
+var inp=this.getInputs(attr);if(!inp){return null;}
+if(inp.length==1){inp=inp[0];return(prop&&prop.toLowerCase()!=="value")?inp.getAttribute(prop):htmlutils.getValue(inp);}
+else if(inp.length>0&&(!prop||prop.toLowerCase()==="value")&&(inp[0].type||"").toLowerCase()==="radio"){return htmlutils.getValue(inp);}
+var result=[];for(var i=0;i=0){prop=attr.substring(ix+1);attr=attr.substring(0,ix);}
+if(prop){throw"UNIMPLEMENTED: Component.getList: Getting list by property";}
+var inp=this.getInputs(attr);if(!inp){return null;}
+if(inp.length==1){inp=inp[0];return htmlutils.getList(inp);}
+var result=[];for(var i=0;i=0){prop=attr.substring(ix+1);attr=attr.substring(0,ix);}
+var inp=this.getInputs(attr);if(inp.length==1){inp=inp[0];if(prop){if(typeof(inp[prop])!='undefined'){inp[prop]=value;}else{inp.setAttribute(prop,value);}}else{inp.value=value;}}else{throw"UNIMPLEMENTED: Cannot (yet) set vector values from scalar";}},getInputs:function(name){var inp=document.getElementsByName(this.path+"."+name);if(!inp||inp.length===0){inp=null;}
+return inp;},getInput:function(name){var inp=this.getInputs(name);return((inp&&inp.length>0)?inp[0]:null);},focusInitialInput:function(name,overrideEnterWatch){var inp=this.getInput(name);if(inp){Portal.initialElement.focusElement(inp,overrideEnterWatch);}},has:function(attrname){var inp=document.getElementsByName(this.path+"."+attrname);return(inp&&(inp.length>0));},addEvent:function(inputName,eventName,f,flag){var inputs;var oThis=this;if(typeof(inputName)=='string'){inputs=this.getInputs(inputName);}else if(utils.isArray(inputName)){inputs=inputName;}else{inputs=[inputName];}
+if(!inputs){console&&console.warn("Can't find: "+inputName);return;}
+var d=Dispatcher.getInstance();for(i=0;i1?elem[0]:elem;var tagName=em.tagName.toLowerCase();if(tagName==="input"){tagName+="_"+em.type;}
+var v=this.accessors[tagName];return v&&v.getValue?v.getValue(elem):elem.value;},getList:function(elem){var v=this.accessors[elem.tagName.toLowerCase()];return v&&v.getList?v.getList(elem):(elem.value?[elem.value]:[]);}};utils.addEvent(window,"load",function(){Portal.getInstance();},false);(function(){Portal.initialElement={count:0,key13Pressed:false,hasListenTimePassed:false,listenTime:175,_focusQueue:null,focus:function(elem,ignoreEnter){if(document.all){elem.blur();}
+if(elem&&(!Portal.initialElement.key13Pressed||ignoreEnter)){if(Portal.initialElement.hasListenTimePassed||ignoreEnter){if(elem.createTextRange){var text=elem.createTextRange();text.moveStart('character',elem.value.length);text.collapse();text.select();}
+else if(elem.setSelectionRange){elem.focus();var len=elem.value.length;elem.setSelectionRange(len,len);}
+else{elem.focus();elem.value=elem.value;}}
+else{Portal.initialElement._focusQueue=elem;}}},timerEnd:function(){Portal.initialElement.hasListenTimePassed=true;var elem=Portal.initialElement._focusQueue;if(elem!==null)Portal.initialElement.focus(elem);elem=null;}}
+function watchKeyPress(evt){evt=evt||window.event;var keyCode=evt.keyCode||evt.which;if(keyCode==13){Portal.initialElement.key13Pressed=true;Portal.initialElement.count++;}}
+if(document.addEventListener){document.addEventListener('keypress',watchKeyPress,false);document.addEventListener('keydown',watchKeyPress,false);}else if(document.attachEvent){document.attachEvent('onkeypress',watchKeyPress);document.attachEvent('onkeydown',watchKeyPress);}
+window.setTimeout(Portal.initialElement.timerEnd,Portal.initialElement.listenTime);})();
\ No newline at end of file
diff --git a/default_files/specs_asn.gif b/default_files/specs_asn.gif
new file mode 100644
index 00000000..f6d2617d
Binary files /dev/null and b/default_files/specs_asn.gif differ
diff --git a/default_files/specs_dtd.gif b/default_files/specs_dtd.gif
new file mode 100644
index 00000000..b95cabd2
Binary files /dev/null and b/default_files/specs_dtd.gif differ
diff --git a/default_files/th-toolkit-lrg.png b/default_files/th-toolkit-lrg.png
new file mode 100644
index 00000000..a459a6e2
Binary files /dev/null and b/default_files/th-toolkit-lrg.png differ
diff --git a/default_files/type_strings.gif b/default_files/type_strings.gif
new file mode 100644
index 00000000..b4ddc9bb
Binary files /dev/null and b/default_files/type_strings.gif differ
diff --git a/help_page.md b/help_page.md
new file mode 100644
index 00000000..4ea11b4b
--- /dev/null
+++ b/help_page.md
@@ -0,0 +1,8 @@
+---
+layout: default
+title: How to edit the book
+nav: help_page
+---
+
+How to edit this book
+=================================================
diff --git a/index.md b/index.md
new file mode 100644
index 00000000..070010d0
--- /dev/null
+++ b/index.md
@@ -0,0 +1,56 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: index
+---
+
+The NCBI C++ Toolkit Book
+=========================
+
+National Center for Biotechnology Information (US): Bethesda (MD); 2004.
+
+Contents
+--------
+
+- [Book Information](pages/fm)
+- [Part 1. Overview](pages/part1)
+ - [1. Introduction to the C++ Toolkit](pages/ch_intro)
+ - [2. Getting Started](pages/ch_start)
+- [Part 2. Development Framework](pages/part2)
+ - [3. Retrieve the Source Code (FTP and Subversion)](pages/ch_getcode_svn)
+ - [4. Configure, Build, and Use the Toolkit](pages/ch_config)
+ - [5. Working with Makefiles](pages/ch_build)
+ - [6. Project Creation and Management](pages/ch_proj)
+ - [7. Programming Policies and Guidelines](pages/ch_style)
+- [Part 3. C++ Toolkit Library Reference](pages/part3)
+ - [8. Portability, Core Functionality and Application Framework](pages/ch_core)
+ - [9. Networking and IPC](pages/ch_conn)
+ - [10. Database Access Support](pages/ch_dbapi)
+ - [11. CGI and Fast-CGI](pages/ch_cgi)
+ - [12. HTML](pages/ch_html)
+ - [13. Data Serialization (ASN.1, XML)](pages/ch_ser)
+ - [14. Biological Sequence Data Model](pages/ch_datamod)
+ - [15. Biological Object Manager](pages/ch_objmgr)
+ - [16. BLAST API](pages/ch_blast)
+ - [17. Access to NCBI data](pages/ch_dataaccess)
+ - [18. Biological Sequence Alignment](pages/ch_algoalign)
+ - [19. GUI and Graphics](pages/ch_gui)
+ - [20. Using the Boost Unit Test Framework](pages/ch_boost)
+- [Part 4. Wrappers for 3rd-Party Packages](pages/part4)
+ - [21. XmlWrapp (XML parsing and handling, XSLT, XPath)](pages/ch_xmlwrapp)
+- [Part 5. Software](pages/part5)
+ - [22. Debugging, Exceptions, and Error Handling](pages/ch_debug)
+ - [23. Distributed Computing](pages/ch_grid)
+ - [24. Applications](pages/ch_app)
+ - [25. Examples and Demos](pages/ch_demo)
+ - [26. C Toolkit Resources for C++ Toolkit Users](pages/ch_res)
+- [Part 6. Help and Support](pages/part6)
+ - [27. NCBI C++ Toolkit Source Browser](pages/ch_browse)
+ - [28. Software Development Tools](pages/ch_devtools)
+ - [29. FAQs, Useful Documentation Links, and Mailing Lists](pages/ch_faq)
+- [Part 7. Library and Applications Configuration](pages/part7)
+ - [30. Library Configuration](pages/ch_libconfig)
+- [Release Notes](pages/part8)
+- [Appendix - Books and Styles](pages/appendix)
+
+
diff --git a/pages/appendix.md b/pages/appendix.md
new file mode 100644
index 00000000..dda3c43a
--- /dev/null
+++ b/pages/appendix.md
@@ -0,0 +1,28 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: pages/appendix
+---
+
+
+Appendix - Books and Styles
+===========================
+
+
+
+Books and links to C++ and STL manuals
+--------------------------------------
+
+Books
+
+- *On To C++, by Patrick Henry Winston*. If you are looking for a short and concise tutorial, this is as close as you can get. It doesn't cover all of C++, but many of the essential features (except the STL). A decent first book to buy.
+
+- *The C++ Primer, Third Edition*, by Stanley Lippman and Josee Lajoie. A decent book, much expanded from previous editions. Gets carried away with very long examples, which makes it harder to use as a reference. Full coverage of ANSI/ISO C++.
+
+- *The C++ Programming Language, Third Edition* by Bjarne Stroustrup. Often called the best book for C++ written in Danish. Written by the designer of C++, this is a difficult read unless you already know C++. Full coverage of ANSI/ISO C++.
+
+- *Effective C++, Second Edition: 50 Specific Ways to Improve Your Programs and Designs*, by Scott Meyers. A must-have that describes lots of tips, tricks, and pitfalls of C++ programming.
+
+- *More Effective C++: 35 New Ways to Improve Your Programs and Designs*, by Scott Meyers. Same as above. For example, how is the new operator different from operator new? Operator new is called by the new operator to allocate memory for the object being created. This is how you hook your own malloc into C++.
+
+
diff --git a/pages/ch_algoalign.md b/pages/ch_algoalign.md
new file mode 100644
index 00000000..155cabde
--- /dev/null
+++ b/pages/ch_algoalign.md
@@ -0,0 +1,328 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: pages/ch_algoalign
+---
+
+
+18\. Biological Sequence Alignment
+================================================
+
+Last Update: October 18, 2013.
+
+The Global Alignment Library [`xalgoalign`:[include](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/algo/align) \| [src](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/algo/align)]
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
+The overview for this chapter consists of the following topics:
+
+- [Introduction](#ch_algoalign.intro)
+
+- [Chapter Outline](#ch_algoalign.outline)
+
+### Introduction
+
+The library contains C++ classes encapsulating global pairwise alignment algorithms frequently used in computational biology.
+
+- ***CNWAligner*** is the base class for the global alignment algorithm classes. The class provides an implementation of the generic Needleman-Wunsch for computing global alignments of nucleotide and amino acid sequences. The implementation uses an affine scoring scheme. An optional end-space free variant is supported, which is useful in applications where one sequence is expected to align in the interior of the other sequence, or the suffix of one string to align with a prefix of the other. The classical Needleman-Wunsch algorithm is known to have memory and CPU requirements of the order of the sequence lengths' product. If consistent partial alignments are available, the problem is split into smaller subproblems taking fewer operations and less space to complete. ***CNWAligner*** provides a way to specify such partial alignments (ungapped).
+
+- ***CBandAligner*** encapsulates the banded variant of the global alignment algorithm which is applicable when the number of differences in the target alignment is limited ('the band width'). The computational cost of the algorithm is of the order of the band width multiplied by the length of the query sequence.
+
+- ***CMMAligner*** follows Hirschberg's divide-and-conquer approach under which the amount of space required to align two sequences globally becomes a linear function of the sequences' lengths. Although this is achieved at the cost of up to twice the running time, a multithreaded version of the algorithm can run even faster than the classical Needleman-Wunsch algorithm in a multiple-CPU environment.
+
+- ***CSplicedAligner*** is an abstract base for algorithms computing cDNA-to-genome, or spliced alignments. Spliced alignment algorithms specifically account for splice signals in their dynamic programming recurrences resulting in better alignments for these particular but very important types of sequences.
+
+### Chapter Outline
+
+The following is an outline of the chapter topics:
+
+- [Computing pairwise global sequence alignments](#ch_algoalign.generic_global_alignment)
+
+ - [Initialization](#ch_algoalign.initialization)
+
+ - [Parameters of alignment](#ch_algoalign.setup)
+
+ - [Computing](#ch_algoalign.computing)
+
+ - [Alignment transcript](#ch_algoalign.transcript)
+
+- [Computing multiple sequence alignments](#ch_algoalign.Computing_multiple_s)
+
+- [Aligning sequences in linear space](#ch_algoalign.divide_and_conquer)
+
+ - [The idea of the algorithm](#ch_algoalign.idea)
+
+ - [Implementation](#ch_algoalign.mm_implementation)
+
+- [Computing spliced sequences alignments](#ch_algoalign.spliced_alignment)
+
+ - [The problem](#ch_algoalign.uk_formulation)
+
+ - [Implementation](#ch_algoalign.uk_implementation)
+
+- [Formatting computed alignments](#ch_algoalign.formatter)
+
+ - [Formatter object](#ch_algoalign.nw_formatter)
+
+**Demo Cases** [[src/app/nw\_aligner](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/app/nw_aligner)] [[src/app/splign/](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/app/splign/)]
+
+
+
+Computing pairwise global sequence alignments
+---------------------------------------------
+
+Generic **pairwise** global alignment functionality is provided by ***CNWAligner***.
+
+***NOTE:*** ***CNWAligner*** is not a multiple sequence aligner. An example of using ***CNWAligner*** can be seen [here](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/app/nw_aligner).
+
+This functionality is discussed in the following topics:
+
+- [Initialization](#ch_algoalign.initialization)
+
+- [Parameters of alignment](#ch_algoalign.setup)
+
+- [Computing](#ch_algoalign.computing)
+
+- [Alignment transcript](#ch_algoalign.transcript)
+
+
+
+### Initialization
+
+Two constructors are provided to initialize the aligner:
+
+ CNWAligner(const char* seq1, size_t len1,
+ const char* seq2, size_t len2,
+ const SNCBIPackedScoreMatrix* scoremat = 0);
+ CNWAligner(void);
+
+The first constructor allows specification of the sequences and the score matrix at the time of the object's construction. Note that the sequences must be in the proper strands, because the aligners do not build reverse complements. The last parameter must be a pointer to a properly initialized ***SNCBIPackedScoreMatrix*** object or zero. If it is a valid pointer, then the sequences are verified against the alphabet contained in the ***SNCBIPackedScoreMatrix*** object, and its score matrix is further used in dynamic programming recurrences. Otherwise, sequences are verified against the IUPACna alphabet, and match/mismatch scores are used to fill in the score matrix.
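+
+For illustration, a minimal sketch of the first form for amino acid sequences, assuming the packed BLOSUM62 matrix shipped with the Toolkit (**`NCBISM_Blosum62`**, declared in `util/tables/raw_scoremat.h`); `prot1`/`prot2` and their lengths are placeholder buffers:
+
+    #include <util/tables/raw_scoremat.h>   // declares NCBISM_Blosum62 and friends
+
+    // protein aligner: the sequences are verified against the matrix alphabet
+    CNWAligner prot_aligner(prot1, plen1, prot2, plen2, &NCBISM_Blosum62);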
+
+The default constructor is provided to support reuse of an aligner object when many sequence pairs share the same type and alignment parameters. In this case, the following two functions must be called before computing the first alignment to load the score matrix and the sequences:
+
+ void SetScoreMatrix(const SNCBIPackedScoreMatrix* scoremat = 0);
+ void SetSequences(const char* seq1, size_t len1,
+ const char* seq2, size_t len2,
+ bool verify = true);
+
+where the meaning of **`scoremat`** is the same as above.
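+
+A minimal sketch of this reuse pattern, with placeholder sequence buffers and assuming ***Run()*** returns the aligner's **`TScore`** (see [Computing](#ch_algoalign.computing) below):
+
+    CNWAligner aligner;             // default constructor
+    aligner.SetScoreMatrix(0);      // nucleotide mode: IUPACna alphabet, match/mismatch scores
+    aligner.SetSequences(seq1, len1, seq2, len2);
+    CNWAligner::TScore score1 = aligner.Run();
+
+    // reuse the same object and parameters for the next pair
+    aligner.SetSequences(seq3, len3, seq4, len4);
+    CNWAligner::TScore score2 = aligner.Run();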
+
+
+
+### Parameters of alignment
+
+***CNWAligner*** realizes the affine gap penalty model, which means that every gap of length L (with the possible exception of end gaps) contributes Wg+L\*Ws to the total alignment score, where Wg is a cost to open the gap and Ws is a cost to extend the gap by one basepair. These two parameters are always in effect when computing sequence alignments and can be set with:
+
+ void SetWg(TScore value); // set gap opening score
+ void SetWs(TScore value); // set gap extension score
+
+To indicate penalties, both gap opening and gap extension scores are assigned with negative values.
+
+Many applications (such as the shotgun sequence assembly) benefit from a possibility to avoid penalizing end gaps of alignment, because the relevant sequence's ends may not be expected to align. ***CNWAligner*** supports this through a built-in end-space free variant controlled with a single function:
+
+ void SetEndSpaceFree(bool Left1, bool Right1, bool Left2, bool Right2);
+
+The first two arguments control the left and the right ends of the first sequence. The other two control the second sequence's ends. A true value means that end spaces will not be penalized. Although an arbitrary combination of end-space free flags can be specified, judgment should be used to get plausible alignments.
+
+The following two functions are only meaningful when aligning nucleotide sequences:
+
+ void SetWm(TScore value); // set match score
+ void SetWms(TScore value); // set mismatch score
+
+The first function sets a bonus associated with every matching pair of nucleotides. The second function assigns a penalty for every mismatching aligned pair of nucleotides. Note that values set with these two functions only take effect after ***SetScoreMatrix()*** is called (with a zero pointer, which is the default).
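+
+Putting the setters above together, a minimal sketch of a nucleotide parameter setup (the numeric scores are illustrative only, not recommended defaults):
+
+    CNWAligner aligner (seq1, len1, seq2, len2);   // nucleotide sequences
+    aligner.SetWm(1);            // match bonus
+    aligner.SetWms(-2);          // mismatch penalty
+    aligner.SetScoreMatrix(0);   // makes the Wm/Wms values take effect
+    aligner.SetWg(-5);           // gap opening penalty
+    aligner.SetWs(-2);           // gap extension penalty
+    // do not penalize end spaces on either end of the first sequence
+    aligner.SetEndSpaceFree(true, true, false, false);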
+
+One thing that could limit the scope of global alignment applications is that the classical algorithm takes quadratic space and time to evaluate the alignment. One way to deal with it is to use the linear-space algorithm encapsulated in ***CMMAligner***. However, when some pattern of alignment is known or desired, it is worthwhile to explicitly specify "mile posts" through which the alignment should pass. Long high-scoring pairs with 100% identity (no gaps or mismatches) are typically good candidates for them. From the algorithmic point of view, the pattern splits the dynamic programming table into smaller parts, thus alleviating space and CPU requirements. The following function is provided to let the aligner know about such guiding constraints:
+
+    void SetPattern(const vector<size_t>& pattern);
+
+Pattern is a vector of hits specified by their zero-based coordinates, as in the following example:
+
+ // the last parameter omitted to indicate nucl sequences
+ CNWAligner aligner (seq1, len1, seq2, len2);
+ // we want coordinates [99,119] and [129,159] on seq1 be aligned
+ // with [1099,1119] and [10099,10129] on seq2.
+ const size_t hits [] = { 99, 119, 1099, 1119, 129, 159, 10099, 10129 };
+    vector<size_t> pattern ( hits, hits + sizeof(hits)/sizeof(hits[0]) );
+ aligner.SetPattern(pattern);
+
+
+
+### Computing
+
+To start computations, call ***Run()***, which aligns the sequences and returns the overall alignment score. The score is a scalar value associated with the alignment and depends on the parameters of the alignment. The global alignment algorithms align two sequences so that the score is the maximum over all possible alignments.
+
+
+
+### Alignment transcript
+
+The immediate output of the global alignment algorithms is a transcript. The transcript serves as a basic representation of alignments and is simply a string of elementary commands transforming the first sequence into the second one on a per-character basis. These commands (transcript characters) are (M)atch, (R)eplace, (I)nsert, and (D)elete. For example, the alignment
+
+    TTC-ATCTCTAAATCTCTCTCATATATATCG
+    ||| ||||||     |||| || ||| ||||
+    TTCGATCTCT-----TCTC-CAGATAAATCG
+
+has a transcript:
+
+ MMMIMMMMMMDDDDDMMMMDMMRMMMRMMMM
+
+Several functions are available to retrieve and analyze the transcript:
+
+    // raw transcript
+    const vector<ETranscriptSymbol>* GetTranscript(void) const
+    {
+        return &m_Transcript;
+    }
+    // converted transcript vector
+    void GetTranscriptString(vector<char>* out) const;
+    // transcript parsers
+    size_t GetLeftSeg(size_t* q0, size_t* q1,
+                      size_t* s0, size_t* s1,
+                      size_t min_size) const;
+    size_t GetRightSeg(size_t* q0, size_t* q1,
+                       size_t* s0, size_t* s1,
+                       size_t min_size) const;
+    size_t GetLongestSeg(size_t* q0, size_t* q1,
+                         size_t* s0, size_t* s1) const;
+
+The last three functions search for a continuous segment of matching characters and return it in sequence coordinates through **`q0`**, **`q1`**, **`s0`**, **`s1`**.
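+
+For example, a sketch that locates the longest continuous run of matches (it assumes the alignment has already been computed with ***Run()***):
+
+    size_t q0 = 0, q1 = 0, s0 = 0, s1 = 0;
+    aligner.GetLongestSeg(&q0, &q1, &s0, &s1);
+    // [q0, q1] on the first sequence aligns to [s0, s1] on the second
+    // over the longest continuous stretch of matching characters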
+
+The alignment transcript is a simple yet complete representation of alignments that can be used to evaluate virtually every characteristic or detail of any particular alignment. Some of them, such as the percent identity or the number of gaps or mismatches, could be easily restored from the transcript alone, whereas others, such as the scores for protein alignments, would require availability of the original sequences.
+
+
+
+Computing multiple sequence alignments
+--------------------------------------
+
+COBALT (COnstraint Based ALignment Tool) is an experimental multiple alignment algorithm whose basic idea was to leverage resources at NCBI, then build up a set of pairwise constraints, then perform a fairly standard iterative multiple alignment process (with many tweaks driven by various benchmarks).
+
+COBALT is available online at:
+
+
+
+A precompiled binary, with the data files needed to run it, is available at:
+
+[ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/cobalt/](ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/cobalt)
+
+Work is being done on an improved COBALT tool.
+
+The paper reference for this algorithm is:
+
+*J.S. Papadopoulos, R. Agarwala, "COBALT: Constraint-Based Alignment Tool for Multiple Protein Sequences". Bioinformatics, May 2007*
+
+
+
+Aligning sequences in linear space
+----------------------------------
+
+***CMMAligner*** is an interface to a linear space variant of the global alignment algorithm. This functionality is discussed in the following topics:
+
+- [The idea of the algorithm](#ch_algoalign.idea)
+
+- [Implementation](#ch_algoalign.mm_implementation)
+
+
+
+### The idea of the algorithm
+
+That the classical global alignment algorithm requires quadratic space could be a serious restriction in sequence alignment. One way to deal with it is to use alignment patterns. Another approach was first introduced by Hirschberg and became known as a divide-and-conquer strategy. At a coarse level, it suggests computing scores for partial alignments starting from two opposite corners of the dynamic programming matrix while keeping only those located in the middle rows or columns. After analyzing the adjacent scores, it is possible to determine the cells on those lines through which the global alignment's back-trace path will go. This approach reduces space to linear while only doubling the worst-case time bound. For details see, for example, Dan Gusfield's "Algorithms on Strings, Trees and Sequences".
+
+
+
+### Implementation
+
+***CMMAligner*** inherits its public interface from ***CNWAligner***. The only additional method toggles the multi-threaded version of the algorithm.
+
+The divide-and-conquer strategy suggests natural parallelization, where blocks of the dynamic programming matrix are evaluated simultaneously. A theoretical acceleration limit imposed by the current implementation is 0.5. To use the multi-threaded version, call ***EnableMultipleThreads()***. The number of simultaneously running threads will not exceed the number of CPUs installed on your system.
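+
+A minimal sketch (the constructor arguments are assumed to mirror ***CNWAligner***; sequence buffers and lengths are placeholders):
+
+    CMMAligner mm_aligner (seq1, len1, seq2, len2);
+    mm_aligner.EnableMultipleThreads();  // allow parallel evaluation of blocks
+    CNWAligner::TScore score = mm_aligner.Run();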
+
+When comparing alignments produced with the linear-space version with those produced by ***CNWAligner***, be ready to find many of them similar, although not exactly the same. This is normal, because several optimal alignments may exist for each pair of sequences.
+
+
+
+Computing spliced sequences alignments
+--------------------------------------
+
+This functionality is discussed in the following topics:
+
+- [The problem](#ch_algoalign.uk_formulation)
+
+- [Implementation](#ch_algoalign.uk_implementation)
+
+
+
+### The problem
+
+The spliced sequence alignment arises as an attempt to address the problem of eukaryotic gene structure recognition. Tools based on spliced alignments exploit the idea of comparing genomic sequences to their transcribed and spliced products, such as mRNA, cDNA, or EST sequences. The final objective for all splice alignment algorithms is to come up with a combination of segments on the genomic sequence that:
+
+- makes up a sequence very similar to the spliced product, when the segments are concatenated; and
+
+- satisfies certain statistically determined conditions, such as consensus splice sites and lengths of introns.
+
+According to the classical eukaryotic transcription and splicing mechanism, pieces of genomic sequence do not get shuffled. Therefore, one way of revealing the original exons could be to align the spliced product with its parent gene globally. However, because of the specificity of the process in which the spliced product is constructed, the generic global alignment with the affine penalty model may not be enough. To address this accurately, dynamic programming recurrences should specifically account for introns and splice signals.
+
+Algorithms described in this chapter exploit this idea and address a refined splice alignment problem presuming that:
+
+- the genomic sequence contains only one location from which the spliced product could have originated;
+
+- the spliced product and the genomic sequence are in the plus strand; and
+
+- the poly(A) tail and any other chunks of the product not created through the splicing were cut off, although a moderate level of sequencing errors on genomic, spliced, or both sequences is allowed.
+
+In other words, the library classes provide basic splice alignment algorithms to be used in more sophisticated applications. One real-life application, Splign, can be found under demo cases for the library.
+
+
+
+### Implementation
+
+There is a small hierarchy of three classes involved in spliced alignment facilitating a quality/performance trade-off in the case of distorted sequences:
+
+- ***CSplicedAligner*** - abstract base for spliced aligners.
+
+- ***CSplicedAligner16*** - accounts for the three conventional splices (GT/AG, GC/AG, AT/AC) and a generic splice; uses 2 bytes per back-trace matrix cell. Use this class with high-quality genomic sequences.
+
+- ***CSplicedAligner32*** - accounts for the three conventionals and splices that could be produced by damaging bases of any conventional; uses 4 bytes per back-trace matrix cell. Use this class with distorted genomic sequences.
+
+The abstract base class for spliced aligners, ***CSplicedAligner***, inherits an interface from its parent, ***CNWAligner***, adding support for two new parameters: intron penalty and minimal intron size (the default is 50).
+
+All classes assume that the spliced sequence is the first of the two input sequences passed. By default, the classes do not penalize gaps at the ends of the spliced sequence. The default intron penalties are chosen so that the 16-bit version is able to pick out short exons, whereas the 32-bit version is generally more conservative.
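+
+A minimal sketch (the constructor arguments are assumed to mirror ***CNWAligner***; the spliced (cDNA) sequence is passed first, as noted above):
+
+    CSplicedAligner16 spl_aligner (cdna, cdna_len, genomic, genomic_len);
+    spl_aligner.Run();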
+
+As with the generic global alignment, the immediate output of the algorithms is the alignment transcript. For the sake of spliced alignments, the transcript's alphabet is augmented to accommodate introns as a special sequence-editing operation.
+
+
+
+Formatting computed alignments
+------------------------------
+
+This functionality is discussed in the following topics:
+
+- [Formatter object](#ch_algoalign.nw_formatter)
+
+
+
+### Formatter object
+
+***CNWFormatter*** is a single place where all different alignment representations are created. The only argument to its constructor is the aligner object that actually was or will be used to align the sequences.
+
+The alignment must be computed before formatting. If the formatter is unable to find the computed alignment in the aligner that was passed to the constructor, an exception will be thrown.
+
+To format the alignment as a ***CSeq\_align*** structure, call
+
+ void AsSeqAlign(CSeq_align* output) const;
+
+To format it as text, call
+
+ void AsText(string* output, ETextFormatType type, size_t line_width = 100)
+
+Supported text formats and their ***ETextFormatType*** constants follow:
+
+- Type 1 (**`eFormatType1`**): `TTC-ATCTCTAAATCTCTCTCATATATATCG` `TTCGATCTCT-----TCTC-CAGATAAATCG` ` ^ ^ `
+
+- Type 2 (**`eFormatType2`**): `TTC-ATCTCTAAATCTCTCTCATATATATCG` `||| |||||| |||| || ||| ||||` `TTCGATCTCT-----TCTC-CAGATAAATCG`
+
+- Gapped FastA (**`eFormatFastA`**): `>SEQ1` `TTC-ATCTCTAAATCTCTCTCATATATATCG` `>SEQ2` `TTCGATCTCT-----TCTC-CAGATAAATCG`
+
+- Table of exons (**`eFormatExonTable`**) - spliced alignments only. The exons are listed from left to right in tab-separated columns. The columns represent sequence IDs, alignment lengths, percent identity, coordinates on the query (spliced) and the subject sequences, and a short annotation including splice signals.
+
+- Extended table of exons (**`eFormatExonTableEx`**) - spliced alignments only. In addition to the nine columns, the full alignment transcript is listed for every exon.
+
+- ASN.1 (**`eFormatASN`**)
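+
+Putting it together, a minimal sketch that renders a computed alignment as type 2 text (the line width is arbitrary, and the format constant is assumed to be scoped inside ***CNWFormatter***):
+
+    CNWFormatter formatter (aligner);
+    string text;
+    formatter.AsText(&text, CNWFormatter::eFormatType2, 80);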
+
+
diff --git a/pages/ch_app.md b/pages/ch_app.md
new file mode 100644
index 00000000..3c4f43c7
--- /dev/null
+++ b/pages/ch_app.md
@@ -0,0 +1,2557 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: pages/ch_app
+---
+
+
+24\. Applications
+===============================
+
+Created: April 1, 2003; Last Update: March 17, 2015.
+
+Overview
+--------
+
+- [Introduction](#ch_app.Intro)
+
+- [Chapter Outline](#ch_app.Outline)
+
+### Introduction
+
+Most of the applications discussed in this chapter are built on a regular basis, at least once a day from the latest sources, and if you are at NCBI, you can find the latest versions in the directory: `$NCBI/c++/Release/bin/` (or `$NCBI/c++/Debug/bin/`).
+
+### Chapter Outline
+
+The following is an outline of the topics presented in this chapter:
+
+- [DATATOOL: code generation and data serialization utility](#ch_app.datatool)
+
+ - [Invocation](#ch_app.datatool.html_refArgs)
+
+ - [Main arguments](#ch_app.datatool.html_refMainArgs)
+
+ - [Code generation arguments](#ch_app.datatool.html_refCodeGenerationAr)
+
+ - [Data specification conversion](#ch_app.Data_Specification_C)
+
+ - [Scope prefixes](#ch_app.Scope_Prefixes)
+
+ - [Modular DTD and Schemata](#ch_app.Modular_DTD_and_Sche)
+
+ - [Converting XML Schema into ASN.1](#ch_app.Converting_XML_Schem)
+
+ - [Definition file](#ch_app.datatool.html_refDefFile)
+
+ - [Common definitions](#ch_app.datatool.html_refDefCommon)
+
+ - [Definitions that affect specific types](#ch_app.datatool.html_refDefSpecific)
+
+ - [INTEGER, REAL, BOOLEAN, NULL](#ch_app.datatool.html_refDefINT)
+
+ - [ENUMERATED](#ch_app.datatool.html_refDefENUM)
+
+ - [OCTET STRING](#ch_app.datatool.html_refDefOCTETS)
+
+ - [SEQUENCE OF, SET OF](#ch_app.datatool.html_refDefArray)
+
+ - [SEQUENCE, SET](#ch_app.datatool.html_refDefClass)
+
+ - [CHOICE](#ch_app.datatool.html_refDefChoice)
+
+ - [The Special [-] Section](#ch_app.The_Special__Section)
+
+ - [Examples](#ch_app.datatool.html_refDefExample)
+
+ - [Module file](#ch_app.ch_app_datatool_html_refModFile)
+
+ - [Generated code](#ch_app.datatool.html_refCode)
+
+ - [Normalized name](#ch_app.datatool.html_refNormalizedName)
+
+ - [ENUMERATED types](#ch_app.datatool.html_refCodeEnum)
+
+ - [Class diagrams](#ch_app.dt_inside.html)
+
+ - [Specification analysis](#ch_app.dt_inside.html_specs)
+
+ - [ASN.1 specification analysis](#ch_app.dt_inside.html_specs_asn)
+
+ - [DTD specification analysis](#ch_app.dt_inside.html_specs_dtd)
+
+ - [Data types](#ch_app.dt_inside.html_data_types)
+
+ - [Data values](#ch_app.dt_inside.html_data_values)
+
+ - [Code generation](#ch_app.dt_inside.html_code_gen)
+
+- [Load Balancing](#ch_app.Load_Balancing)
+
+ - [Overview](#ch_app._Overview)
+
+ - [Load Balancing Service Mapping Daemon (LBSMD)](#ch_app.Load_Balancing_Servi)
+
+ - [Overview](#ch_app._Overview_1)
+
+ - [Configuration](#ch_app._Configuration)
+
+ - [Check Script Specification](#ch_app.Check_Script_Specification)
+
+ - [Server Descriptor Specification](#ch_app.Server_Descriptor_Specification)
+
+ - [Signals](#ch_app.Signals)
+
+ - [Automatic Configuration Distribution](#ch_app.Automatic_Configurat)
+
+ - [Monitoring and Control](#ch_app.Monitoring_and_Contr)
+
+ - [Service Search](#ch_app.Service_Search)
+
+ - [lbsmc Utility](#ch_app.lbsmc_Utility)
+
+ - [NCBI Intranet Web Utilities](#ch_app.NCBI_Intranet_Web_Ut)
+
+ - [Server Penalizer API and Utility](#ch_app.Server_Penalizer_API)
+
+ - [SVN Repository](#ch_app.SVN_Repository)
+
+ - [Log Files](#ch_app._Log_Files)
+
+ - [Configuration Examples](#ch_app._Configuration_Exampl)
+
+ - [Database Load Balancing](#ch_app.Database_Load_Balancing)
+
+ - [Cookie / Argument Affinity Module (MOD\_CAF)](#ch_app.Cookie___Argument_Af)
+
+ - [Overview](#ch_app._Overview_2)
+
+ - [Configuration](#ch_app._Configuration_1)
+
+ - [Configuration Examples](#ch_app._Configuration_Exampl)
+
+ - [Arguments Matching](#ch_app.Arguments_Matching)
+
+ - [Argument Matching Examples](#ch_app.Argument_Matching_Ex)
+
+ - [Log File](#ch_app.Log_File)
+
+ - [Monitoring](#ch_app._Monitoring)
+
+ - [DISPD Network Dispatcher](#ch_app.DISPD_Network_Dispat)
+
+ - [Overview](#ch_app._Overview_3)
+
+ - [Protocol Description](#ch_app.Protocol_Description)
+
+ - [Client Request to DISPD](#ch_app.Client_Request_to_DI)
+
+ - [DISPD Client Response](#ch_app.DISPD_Client_Respons)
+
+ - [Communication Schemes](#ch_app.Communication_Scheme)
+
+ - [NCBID Server Launcher](#ch_app.NCBID_Server_Launche)
+
+ - [Overview](#ch_app._Overview_4)
+
+ - [Firewall Daemon (FWDaemon)](#ch_app.Firewall_Daemon_FWDa)
+
+ - [Overview](#ch_app._Overview_5)
+
+ - [FWDaemon Behind a "Regular" Firewall](#ch_app.FWDaemon_Behind_a__R)
+
+ - [FWDaemon Behind a "Non-Transparent" Firewall](#ch_app.FWDaemon_Behind_a__N)
+
+ - [Monitoring](#ch_app._Monitoring_1)
+
+ - [Log Files](#ch_app._Log_Files_1)
+
+ - [FWDaemon and NCBID Dispatcher Data Exchange](#ch_app.FWDaemon_and_NCBID_D)
+
+ - [Launcherd Utility](#ch_app.Launcherd_Utility)
+
+ - [Monitoring Tools](#ch_app.Monitoring_Tools)
+
+ - [Quality Assurance Domain](#ch_app.Quality_Assurance_Do)
+
+- [NCBI Genome Workbench](#ch_app.applications1)
+
+ - [Design goals](#ch_app.gbench_dg)
+
+ - [Design](#ch_app.gbench_design)
+
+- [NCBI NetCache Service](#ch_app.ncbi_netcache_service)
+
+ - [What is NetCache?](#ch_app.what_is_netcache)
+
+ - [What can NetCache be used for?](#ch_app.what_it_can_be_used)
+
+ - [How to use NetCache](#ch_app.getting_started)
+
+ - [The basic ideas](#ch_app.The_basic_ideas)
+
+ - [Setting up your program to use NetCache](#ch_app.Set_up_your_program_to_use_NetCac)
+
+ - [Establish the NetCache service name](#ch_app.Establish_the_NetCache_service_na)
+
+ - [Initialize the client API](#ch_app.Initialize_the_client_API)
+
+ - [Store data](#ch_app.Store_data)
+
+ - [Retrieve data](#ch_app.Retrieve_data)
+
+ - [Samples and other resources](#ch_app.Available_samples)
+
+ - [Questions and answers](#ch_app.Questions_and_answers)
+
+
+
+DATATOOL: Code Generation and Data Serialization Utility
+--------------------------------------------------------
+
+**DATATOOL** source code is located at `c++/src/serial/datatool`; this application can perform the following:
+
+- Generate C++ data storage classes based on [ASN.1](http://www.itu.int/ITU-T/studygroups/com17/languages), [DTD](http://www.w3.org/TR/REC-xml) or [XML Schema](http://www.w3.org/XML/Schema) specification to be used with [NCBI data serialization streams](ch_ser.html).
+
+- Convert ASN.1 specification into a DTD or XML Schema specification and vice versa.
+
+- Convert data between ASN.1, XML and JSON formats.
+
+***Note:*** Because ASN.1, XML and JSON are, in general, incompatible, the last two functions are supported only partially.
+
+The following topics are discussed in subsections:
+
+- [Invocation](#ch_app.datatool.html_refArgs)
+
+- [Data specification conversion](#ch_app.Data_Specification_C)
+
+- [Definition file](#ch_app.datatool.html_refDefFile)
+
+- [Module file](#ch_app.ch_app_datatool_html_refModFile)
+
+- [Generated code](#ch_app.datatool.html_refCode)
+
+- [Class diagrams](#ch_app.dt_inside.html)
+
+
+
+### Invocation
+
+The following topics are discussed in this section:
+
+- [Main arguments](#ch_app.datatool.html_refMainArgs)
+
+- [Code generation arguments](#ch_app.datatool.html_refCodeGenerationAr)
+
+
+
+#### Main Arguments
+
+See [Table 1](#ch_app.tools_table1).
+
+
+
+Table 1. Main arguments
+
+| Argument | Effect | Comments |
+|------------------------|-----------------------------------------------------------|----------------------------------------------------------------------------|
+| -h | Display the **DATATOOL** arguments | Ignores other arguments |
+| -m \<file\> | module specification file(s) - ASN.1, DTD, or XSD | Required argument |
+| -M \<file\> | External module file(s) | Is used for IMPORT type resolution |
+| -i | Ignore unresolved types | Is used for IMPORT type resolution |
+| -f \<file\> | Write ASN.1 module file | |
+| -fx \<file\> | Write DTD module file | "-fx m" writes modular DTD file |
+| -fxs \<file\> | Write XML Schema file | |
+| -fd \<file\> | Write specification dump file in datatool internal format | |
+| -ms \<string\> | Suffix of modular DTD or XML Schema file name | |
+| -dn \<name\> | DTD module name in XML header | No extension. If empty, omit DOCTYPE declaration. |
+| -v \<file\> | Read value in ASN.1 text format | |
+| -vx \<file\> | Read value in XML format | |
+| -F | Read value completely into memory | |
+| -p \<file\> | Write value in ASN.1 text format | |
+| -px \<file\> | Write value in XML format | |
+| -pj \<file\> | Write value in JSON format | |
+| -d \<file\> | Read value in ASN.1 binary format | -t argument required |
+| -t \<type\> | Binary value type name | See -d argument |
+| -e \<file\> | Write value in ASN.1 binary format | |
+| -xmlns | XML namespace name | When specified, also makes XML output file reference Schema instead of DTD |
+| -sxo | No scope prefixes in XML output | |
+| -sxi | No scope prefixes in XML input | |
+| -logfile \<file\> | File to which the program log should be redirected | |
+| -conffile \<file\> | Program's configuration (registry) data file | |
+| -version | Print version number | Ignores other arguments |
+
+
+
+
+
+#### Code Generation Arguments
+
+See [Table 2](#ch_app.tools_table2).
+
+
+
+Table 2. Code generation arguments
+
+| Argument | Effect | Comments |
+|-----------------|-----------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| -od \<file\> | C++ code [definition file](#ch_app.datatool.html_refDefFile) | See [Definition file](#ch_app.datatool.html_refDefFile) |
+| -ods | Generate an example definition file (e.g. `MyModuleName._sample_def`) | Must be used with another option that generates code such as -oA. |
+| -odi | Ignore absent code definition file | |
+| -odw | Issue a warning about absent code definition file | |
+| -oA | Generate C++ files for all types | Only types from the main module are used (see [-m](#ch_app.tools_table1) and -mx arguments). |
+| -ot \<types\> | Generate C++ files for listed types | Only types from the main module are used (see [-m](#ch_app.tools_table1) and -mx arguments). |
+| -ox \<types\> | Exclude types from generation | |
+| -oX | Turn off recursive type generation | |
+| -of \<file\> | Write the list of generated C++ files | |
+| -oc \<file\> | Write combining C++ files | |
+| -on \<namespace\> | Default namespace | The value "-" in the [Definition file](#ch_app.datatool.html_refDefFile) means don't use a namespace at all and overrides the -on option specified elsewhere. |
+| -opm \<dir\> | Directory for searching source modules | |
+| -oph \<dir\> | Directory for generated \*.hpp files | |
+| -opc \<dir\> | Directory for generated \*.cpp files | |
+| -or \<prefix\> | Add prefix to generated file names | |
+| -orq | Use quoted syntax form for generated include files | |
+| -ors | Add source file dir to generated file names | |
+| -orm | Add module name to generated file names | |
+| -orA | Combine all -or\* prefixes | |
+| -ocvs | Create ".cvsignore" files | |
+| -oR \<dir\> | Set -op\* and -or\* arguments for NCBI directory tree | |
+| -oDc | Turn ON generation of Doxygen-style comments | The value "-" in the [Definition file](#ch_app.datatool.html_refDefFile) means don't generate Doxygen comments and overrides the -oDc option specified elsewhere. |
+| -odx \<URL\> | URL of documentation root folder | For Doxygen |
+| -lax\_syntax | Allow non-standard ASN.1 syntax accepted by asntool | The value "-" in the [Definition file](#ch_app.datatool.html_refDefFile) means don't allow non-standard syntax and overrides the -lax\_syntax option specified elsewhere. |
+| -pch \<file\> | Name of the precompiled header file to include in all \*.cpp files | |
+| -oex \<export\> | Add storage-class modifier to generated classes | Can be overridden by [[-].\_export](#ch_app.datatool.html_refDefCommon) in the definition file. |
+
+
+
+
+
+### Data Specification Conversion
+
+When parsing a data specification, **DATATOOL** identifies the specification format based on the source file extension - ASN, DTD, or XSD.
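+
+For instance (the file names here are hypothetical), **DATATOOL** could convert an ASN.1 module to an XML Schema, and convert a text ASN.1 value to XML, with commands along these lines:
+
+    datatool -m mymodule.asn -fxs mymodule.xsd
+    datatool -m mymodule.asn -v myvalue.asn -px myvalue.xml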
+
+
+
+#### Scope Prefixes
+
+Initially, **DATATOOL** and the serial library supported serialization in ASN.1 and XML format, and conversion of ASN.1 specification into DTD. Compared to ASN.1, DTD is a very sketchy specification in the sense that there is only one primitive type - string, and all elements are defined globally. The latter feature of DTD led to a decision to use ‘scope prefixes’ in XML output to avoid potential name conflicts. For example, consider the following ASN.1 specification:
+
+ Date ::= CHOICE {
+ str VisibleString,
+ std Date-std
+ }
+ Time ::= CHOICE {
+ str VisibleString,
+ std Time-std
+ }
+
+Here, accidentally, element ***str*** is defined identically both in ***Date*** and ***Time*** productions; while the meaning of element ***std*** depends on the context. To avoid ambiguity, this specification translates into the following DTD:
+
+    <!ELEMENT Date (Date_str | Date_std)>
+    <!ELEMENT Date_str (#PCDATA)>
+    <!ELEMENT Date_std (Date-std)>
+    <!ELEMENT Time (Time_str | Time_std)>
+    <!ELEMENT Time_str (#PCDATA)>
+    <!ELEMENT Time_std (Time-std)>
+
+Accordingly, these scope prefixes made their way into XML output.
+
+Later, DTD parsing was added to **DATATOOL**. Here, scope prefixes were not needed. Also, since these prefixes considerably increase the size of the XML output, they could be omitted when it is known in advance that there can be no ambiguity. So, **DATATOOL** acquired command-line flags to enable omitting them.
+
+With the addition of the XML Schema parser and generator, elements can be declared locally in the Schema when converting an ASN.1 specification, so scope prefixes make almost no sense. Still, they are preserved for compatibility.
+
+
+
+#### Modular DTD and Schemata
+
+Here, ‘module’ means an ASN.1 module. A single ASN.1 specification file may contain several modules. When converting it into DTD or XML Schema, it might be convenient to put each module's definitions into a separate file. To do so, one should specify a special file name in the `-fx` or `-fxs` command line parameter. The names of the output DTD or Schema files will then be chosen automatically - they will be named after the ASN.1 modules defined in the source. ‘Modular’ output does not make much sense when the source specification is DTD or Schema.
+
+You can find a number of DTDs and Schema converted by **DATATOOL** from NCBI public ASN.1 specifications [here](http://www.ncbi.nlm.nih.gov/data_specs).
+
+
+
+#### Converting XML Schema into ASN.1
+
+There are two major problems in converting an XML schema into an ASN.1 specification: how to define XML attributes and how to convert complex content models. The solution was greatly affected by the underlying implementation of the data storage classes (the classes which **DATATOOL** generates based on a specification). So, for example, the following Schema
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+translates into this ASN.1:
+
+ Author ::= SEQUENCE {
+ attlist SET {
+ gender ENUMERATED {
+ male (1),
+ female (2)
+ } OPTIONAL
+ },
+ lastName VisibleString,
+ fF CHOICE {
+ foreName VisibleString,
+ fM SEQUENCE {
+ firstName VisibleString,
+ middleName VisibleString OPTIONAL
+ }
+ } OPTIONAL,
+ initials VisibleString OPTIONAL,
+ suffix VisibleString OPTIONAL
+ }
+
+Each unnamed local element gets a name. When generating C++ data storage classes from Schema, **DATATOOL** marks such data types as anonymous.
+
+It is possible to convert the source Schema into ASN.1 and then use **DATATOOL** to generate C++ classes from the latter. In this case, **DATATOOL** and the serial library guarantee compatible ASN.1 output: if you generate data storage classes from the Schema and use them to write data in ASN.1 format (binary or text), then convert that Schema into ASN.1, generate classes from the result, and write the same data again in ASN.1 format using this new set of classes, the two files will be identical.
+
+
+
+### Definition File
+
+It is possible to tune up the C++ code generation by using a definition file, which could be specified in the [-od](#ch_app.tools_table2) argument. The definition file uses the generic [NCBI configuration](ch_core.html#ch_core.registry_syntax) format also used in the configuration (`*.ini`) files found in NCBI's applications.
+
+**DATATOOL** looks for code generation parameters in several sections of the file in the following order:
+
+1. `[ModuleName.TypeName]`
+
+2. `[TypeName]`
+
+3. `[ModuleName]`
+
+4. `[-]`
+
+Parameter definitions follow a "name = value" format. The "name" part of the definition serves two functions: (1) selecting the specific element to which the definition applies, and (2) selecting the code generation parameter (such as `_class`) that will be fine-tuned for that element.
+
+To modify a top-level element, use a definition line where the name part is simply the desired code generation parameter (such as `_class`). To modify a nested element, use a definition where the code generation parameter is prefixed by a dot-separated "path" of the successive container element names from the data format specification. For path elements of type `SET OF` or `SEQUENCE OF`, use an "`E`" as the element name (which would otherwise be anonymous). ***Note:*** Element names will depend on whether you are using ASN.1, DTD, or Schema.
+
+For example, consider the following ASN.1 specification:
+
+ MyType ::= SEQUENCE {
+ label VisibleString ,
+ points SEQUENCE OF
+ SEQUENCE {
+ x INTEGER ,
+ y INTEGER
+ }
+ }
+
+Code generation for the various elements can be fine-tuned as illustrated by the following sample definition file:
+
+ [MyModule.MyType]
+ ; modify the top-level element (MyType)
+ _class = CMyTypeX
+
+ ; modify a contained element
+ label._class = CTitle
+
+ ; modify a "SEQUENCE OF" container type
+ points._type = vector
+
+ ; modify members of an anonymous SEQUENCE contained in a "SEQUENCE OF"
+ points.E.x._type = double
+ points.E.y._type = double
+
+ ; modify a DATATOOL-assigned class name
+ points.E._class = CPoint
+
+***Note:*** **DATATOOL** assigns arbitrary names to otherwise anonymous containers. In the example above, the `SEQUENCE` containing `x` and `y` has no name in the specification, so **DATATOOL** assigned the name `E`. If you want to change the name of a **DATATOOL**-assigned name, create a definition file and rename the class using the appropriate `_class` entry as shown above. To find out what the **DATATOOL**-assigned name will be, create a sample definition file using the **DATATOOL** `-ods` option. This approach will work regardless of the data specification format (ASN.1, DTD, or XSD).
+
+The following additional topics are discussed in this section:
+
+- [Common definitions](#ch_app.datatool.html_refDefCommon)
+
+- [Definitions that affect specific types](#ch_app.datatool.html_refDefSpecific)
+
+- [The Special [-] Section](#ch_app.The_Special__Section)
+
+- [Examples](#ch_app.datatool.html_refDefExample)
+
+
+
+#### Common Definitions
+
+Some definitions refer to the generated class as a whole.
+
+`_file` Defines the base filename for the generated or referenced C++ class.
+
+For example, the following definitions:
+
+ [ModuleName.TypeName]
+ _file=AnotherName
+
+Or
+
+ [TypeName]
+ _file=AnotherName
+
+would put the class ***CTypeName*** in files with the base name `AnotherName`, whereas these two:
+
+ [ModuleName]
+ _file=AnotherName
+
+Or
+
+ [-]
+ _file=AnotherName
+
+put **all** the generated classes into a single file with the base name `AnotherName`.
+
+`_extra_headers` Specify additional header files to include.
+
+For example, the following definition:
+
+ [-]
+ _extra_headers=name1 name2 \"name3\"
+
+would put the following lines into all generated headers:
+
+    #include <name1>
+    #include <name2>
+    #include "name3"
+
+Note the name3 clause. Putting name3 in quotes instructs **DATATOOL** to use the quoted syntax in generated files. Also, the quotes must be escaped with backslashes.
+
+`_dir` Subdirectory in which the generated C++ files will be stored (in case \_file is not specified) or a subdirectory in which the referenced class from an external module could be found. The subdirectory is added to include directives.
+
+`_class` The name of the generated class (if `_class=-` is specified, then no code is generated for this type).
+
+For example, the following definitions:
+
+ [ModuleName.TypeName]
+ _class=CAnotherName
+
+Or
+
+ [TypeName]
+ _class=CAnotherName
+
+would cause the class generated for the type `TypeName` to be named ***CAnotherName***, whereas these two:
+
+ [ModuleName]
+ _class=CAnotherName
+
+Or
+
+ [-]
+ _class=CAnotherName
+
+would result in **all** the generated classes having the same name ***CAnotherName*** (which is probably not what you want).
+
+`_namespace` The namespace in which the generated class (or classes) will be placed.
+
+`_parent_class` The name of the base class from which the generated C++ class is derived.
+
+`_parent_type` Derive the generated C++ class from the class that corresponds to the specified type (in case \_parent\_class is not specified).
+
+It is also possible to specify a storage-class modifier, which is required on Microsoft Windows to export/import generated classes from/to a DLL. This setting affects all generated classes in a module. An appropriate section of the definition file should look like this:
+
+ [-]
+ _export = EXPORT_SPECIFIER
+
+Because this modifier could also be specified in the [command line](#ch_app.tools_table2), the **DATATOOL** code generator uses the following rules to choose the proper one:
+
+- If no `-oex` flag is given in the command line, no modifier is added at all.
+
+- If `-oex ""` (that is, an empty modifier) is specified in the command line, then the modifier from the definition file will be used.
+
+- The command-line parameter in the form `-oex FOOBAR` will cause the generated classes to have a `FOOBAR` storage-class modifier, unless another one is specified in the definition file. The modifier from the definition file always takes precedence.
+
+
+
+#### Definitions That Affect Specific Types
+
+The following additional topics are discussed in this section:
+
+- [INTEGER, REAL, BOOLEAN, NULL](#ch_app.datatool.html_refDefINT)
+
+- [ENUMERATED](#ch_app.datatool.html_refDefENUM)
+
+- [OCTET STRING](#ch_app.datatool.html_refDefOCTETS)
+
+- [SEQUENCE OF, SET OF](#ch_app.datatool.html_refDefArray)
+
+- [SEQUENCE, SET](#ch_app.datatool.html_refDefClass)
+
+- [CHOICE](#ch_app.datatool.html_refDefChoice)
+
+
+
+##### INTEGER, REAL, BOOLEAN, NULL
+
+`_type` C++ type: int, short, unsigned, long, etc.
+
+
+
+##### ENUMERATED
+
+`_type` C++ type: int, short, unsigned, long, etc.
+
+`_prefix` Prefix for names of enum values. The default is "e".
+
+
+
+##### OCTET STRING
+
+`_char` Vector element type: char, unsigned char, or signed char.
+
+
+
+##### SEQUENCE OF, SET OF
+
+`_type` STL container type: list, vector, set, or multiset.
+
+
+
+##### SEQUENCE, SET
+
+`memberName._delay` Mark the specified member for delayed reading.
+
+
+
+##### CHOICE
+
+`_virtual_choice` If not empty, do not generate a special class for the choice; instead, the choice class is made the parent of all its variants.
+
+`variantName._delay` Mark the specified variant for delayed reading.
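+
+For example, a definition file entry for a CHOICE type might look like the following (the module, type, and variant names, as well as the flag values, are hypothetical):
+
+    [MyModule.MyChoice]
+    ; do not generate a separate choice class
+    _virtual_choice = 1
+    ; mark the "data" variant for delayed reading
+    data._delay = 1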
+
+
+
+#### The Special [-] Section
+
+There is a special section `[-]` allowed in the definition file which can contain definitions related to code generation. This is a good place to define a namespace or identify additional headers. It is a "top level" section, so entries placed here will override entries with the same name in other sections or on the command-line. For example, the following entries set the proper parameters for placing header files alongside source files:
+
+ [-]
+ ; Do not use a namespace at all:
+ -on = -
+
+ ; Use the current directory for generated .cpp files:
+ -opc = .
+
+ ; Use the current directory for generated .hpp files:
+ -oph = .
+
+ ; Do not add a prefix to generated file names:
+ -or = -
+
+ ; Generate #include directives with quotes rather than angle brackets:
+ -orq = 1
+
+Any of the code generation arguments in [Table 2](#ch_app.tools_table2) (except `-od`, `-odi`, and `-odw` which are related to specifying the definition file) can be placed in the `[-]` section.
+
+In some cases, the special value `"-"` causes special processing as noted in [Table 2](#ch_app.tools_table2).
+
+
+
+#### Examples
+
+If we have the following ASN.1 specification (this is not a "real" specification - it is used only for illustration):
+
+ Date ::= CHOICE {
+ str VisibleString,
+ std Date-std
+ }
+ Date-std ::= SEQUENCE {
+ year INTEGER,
+ month INTEGER OPTIONAL
+ }
+ Dates ::= SEQUENCE OF Date
+ Int-fuzz ::= CHOICE {
+ p-m INTEGER,
+ range SEQUENCE {
+ max INTEGER,
+ min INTEGER
+ },
+ pct INTEGER,
+ lim ENUMERATED {
+ unk (0),
+ gt (1),
+ lt (2),
+ tr (3),
+ tl (4),
+ circle (5),
+ other (255)
+ },
+ alt SET OF INTEGER
+ }
+
+Then the following definitions will affect the generation of objects:
+
+
+
+| Definition | Affected Objects |
+|---------------------------------------------------------------------|--------------------------------------------------------------------|
+| `[Date]` `str._type = string` | the `str` member of the `Date` structure |
+| `[Dates]` `E._pointer = true` | elements of the `Dates` container |
+| `[Int-fuzz]` `range.min._type = long` | the `min` member of the `range` member of the `Int-fuzz` structure |
+| `[Int-fuzz]` `alt.E._type = long` | elements of the `alt` member of the `Int-fuzz` structure |
+
+
+
+As another example, suppose you have a ***CatalogEntry*** type comprised of a ***Summary*** element and either a ***RecordA*** element or a ***RecordB*** element, as defined by the following XSD specification:
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+In this specification, the `xs:choice` element in ***CatalogEntryType*** is anonymous, so **DATATOOL** will assign an arbitrary name to it. The assigned name will not be descriptive, but fortunately you can use a definition file to change the assigned name.
+
+First find the **DATATOOL**-assigned name by creating a sample definition file using the `-ods` option:
+
+ datatool -ods -oA -m catalogentry.xsd
+
+The sample definition file (`catalogentry._sample_def`) shows `RR` as the class name:
+
+ [CatalogEntry]
+ RR._class =
+ Summary._class =
+
+Then edit the module definition file (`catalogentry.def`) and change `RR` to a more descriptive class name, for example:
+
+ [CatalogEntry]
+ RR._class=CRecordChoice
+
+The new name will be used the next time the module is built.
+
+
+
+### Module File
+
+Module files are not used directly by **DATATOOL**, but they are read by `new_module.sh` and [project\_tree\_builder](ch_config.html#ch_config._Build_the_Toolkit) and therefore determine what **DATATOOL**'s command line will be when **DATATOOL** is invoked from the NCBI build system.
+
+Module files simply consist of lines of the form "`KEY = VALUE`". Only the key `MODULE_IMPORT` is currently used (and is the only key ever recognized by `project_tree_builder`). Other keys used to be recognized by `module.sh` and still harmlessly remain in some files. The possible keys are:
+
+- `MODULE_IMPORT` These definitions contain a space-delimited list of other modules to import. The paths should be relative to `.../src` and should not include extensions. For example, a valid entry could be: `MODULE_IMPORT = objects/general/general objects/seq/seq`
+
+- `MODULE_ASN`, `MODULE_DTD`, `MODULE_XSD` These definitions explicitly set the specification filename (normally `foo.asn`, `foo.dtd`, or `foo.xsd` for `foo.module`). Almost no module files contain this definition. It is no longer used by the `project_tree_builder` and is therefore not necessary.
+
+- `MODULE_PATH` Specifies the directory containing the current module, again relative to `.../src`. Almost all module files contain this definition; however, it is no longer used by either `new_module.sh` or the `project_tree_builder` and is therefore not necessary.
+
+
+
+### Generated Code
+
+The following additional topics are discussed in this section:
+
+- [Normalized name](#ch_app.datatool.html_refNormalizedName)
+
+- [ENUMERATED types](#ch_app.datatool.html_refCodeEnum)
+
+
+
+#### Normalized Name
+
+By default, DATATOOL generates "normalized" C++ class names from ASN.1 type names using two rules:
+
+1. Convert any hyphens ("***-***") into underscores ("***\_***"), because hyphens are not legal characters in C++ class names.
+
+2. Prepend a 'C' character.
+
+For example, the default normalized C++ class name for the ASN.1 type name "***Seq-data***" is "***CSeq\_data***".
+
+The default C++ class name can be overridden by explicitly specifying in the definition file a name for a given ASN.1 type name. For example:
+
+ [MyModule.Seq-data]
+ _class=CMySeqData
+
+
+
+#### ENUMERATED Types
+
+By default, for every `ENUMERATED` ASN.1 type, **DATATOOL** will produce a C++ enum type with the name ***ENormalizedName***.
+
+
+
+### Class Diagrams
+
+The following topics are discussed in this section:
+
+- [Specification analysis](#ch_app.dt_inside.html_specs)
+
+- [Data types](#ch_app.dt_inside.html_data_types)
+
+- [Data values](#ch_app.dt_inside.html_data_values)
+
+- [Code generation](#ch_app.dt_inside.html_code_gen)
+
+
+
+#### Specification Analysis
+
+The following topics are discussed in this section:
+
+- [ASN.1 specification analysis](#ch_app.dt_inside.html_specs_asn)
+
+- [DTD specification analysis](#ch_app.dt_inside.html_specs_dtd)
+
+
+
+##### ASN.1 Specification Analysis
+
+See [Figure 1](#ch_app.specs_asn).
+
+
+
+[](/book/static/img/specs_asn.gif "Click to see the full-resolution image")
+
+Figure 1. ASN.1 specification analysis.
+
+
+
+##### DTD Specification Analysis
+
+See [Figure 2](#ch_app.specs_dtd).
+
+
+
+[](/book/static/img/specs_dtd.gif "Click to see the full-resolution image")
+
+Figure 2. DTD specification analysis.
+
+
+
+#### Data Types
+
+See [CDataType](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCDataType.html).
+
+
+
+#### Data Values
+
+See [Figure 3](#ch_app.data_values).
+
+
+
+[](/book/static/img/data_types.gif "Click to see the full-resolution image")
+
+Figure 3. Data values.
+
+
+
+#### Code Generation
+
+See [Figure 4](#ch_app.code_gen).
+
+
+
+[](/book/static/img/type_strings.gif "Click to see the full-resolution image")
+
+Figure 4. Code generation.
+
+
+
+Load Balancing
+--------------
+
+- [Overview](#ch_app._Overview)
+
+- [Load Balancing Service Mapping Daemon (LBSMD)](#ch_app.Load_Balancing_Servi)
+
+- [Database Load Balancing](#ch_app.Database_Load_Balancing)
+
+- [Cookie / Argument Affinity Module (MOD\_CAF)](#ch_app.Cookie___Argument_Af)
+
+- [DISPD Network Dispatcher](#ch_app.DISPD_Network_Dispat)
+
+- [NCBID Server Launcher](#ch_app.NCBID_Server_Launche)
+
+- [Firewall Daemon (FWDaemon)](#ch_app.Firewall_Daemon_FWDa)
+
+- [Launcherd Utility](#ch_app.Launcherd_Utility)
+
+- [Monitoring Tools](#ch_app.Monitoring_Tools)
+
+- [Quality Assurance Domain](#ch_app.Quality_Assurance_Do)
+
+***Note:*** For security reasons, not all links in the public version of this document are accessible to users outside NCBI.
+
+This section covers the following topics:
+
+- The purpose of load balancing
+
+- The purpose, internal details, and configuration of each component
+
+- Communications between the components
+
+- Monitoring facilities
+
+
+
+### Overview
+
+The purpose of load balancing is to distribute the load among the service providers available on the NCBI network, based on certain rules. The load is generated by both locally-connected and Internet-connected users. The figures below show the most typical usage scenarios.
+
+[](/book/static/img/LoadBalancingLocal.jpg "Click to see the full-resolution image")
+
+Figure 5. Local Clients
+
+Please note that the figure is slightly simplified to omit unnecessary details.
+
+In the case of local access to NCBI resources, two NCBI-developed components are involved in the interactions: the LBSMD daemon (Load Balancing Service Mapping Daemon) and mod\_caf (Cookie/Argument Affinity module), an Apache web server module.
+
+The LBSMD daemon runs on each host in the NCBI network. The daemon reads its configuration file, which describes all the services available on the host. The LBSMD daemon then broadcasts the available services and the current host load to the adjacent LBSMD daemons on a regular basis. The data received from the other LBSMD daemons are stored in a special table. So, at some point, the LBSMD daemon on each host has a full description of the services available on the network, as well as the current load of the hosts.
+
+The mod\_caf Apache module analyzes special cookies and query-line arguments, and reads data from the table populated by the LBSMD daemon. Based on the best match, it decides where to pass a request.
+
+Suppose for a moment that a local NCBI client runs a web browser, points it to an NCBI web page, and initiates a DB request via the web interface. At this stage mod\_caf analyzes the request line and decides where to pass the request. The request is passed to the ServiceProviderN host, which performs the corresponding database query. The query results are then delivered to the client. The data exchange path is shown in the figure above using solid lines.
+
+Another typical scenario for local NCBI clients is when client code runs on a user workstation. That client code might require a long-term connection to a certain service, for example to a database. The browser cannot provide this kind of connection, so a direct connection is used in this case. The data exchange path is shown in the figure above using dashed lines.
+
+The communication scenarios become more complicated when clients are located outside the NCBI network. The figure below describes the interactions between modules when the user requests a service that does not require a long-term connection.
+
+[](/book/static/img/LoadBalancingInternetShort.jpg "Click to see the full-resolution image")
+
+Figure 6. Internet Clients. Short Term Connection
+
+The clients cannot connect to the front-end Apache web servers directly. The connection is made via a router located in the DMZ (Demilitarized Zone). The router selects one of the available front-end servers and passes the request to that web server. The web server then processes the request much as it processes requests from a local client.
+
+The next figure explains the interactions for the case when an Internet client requests a service that requires a long-term connection.
+
+[](/book/static/img/LoadBalancingInternetLong.jpg "Click to see the full-resolution image")
+
+Figure 7. Internet Clients. Long Term Connection
+
+In contrast to local clients, Internet clients are unable to connect to the required service directly because of the DMZ. This is where DISPD, FWDaemon, and a proxy come in to help resolve the problem.
+
+The data flow in this scenario is as follows. A request from the client reaches a front-end Apache server as discussed above. The front-end server then passes the request to the DISPD dispatcher. The DISPD dispatcher communicates with FWDaemon (Firewall Daemon) to provide the required service facilities. The FWDaemon answers with a special ticket for the requested service. The ticket is sent to the client via the front-end web server and the router. The client then connects to the NAT service in the DMZ, providing the received ticket. The NAT service establishes a connection to the FWDaemon and passes the ticket received earlier. The FWDaemon, in turn, provides the connection to the required service. It is worth mentioning that the FWDaemon runs on the same host as the DISPD dispatcher, and neither DISPD nor FWDaemon can work without the other.
+
+The most complicated scenario comes into the picture when an arbitrary Unix filter program is used as a service provided to users outside NCBI. The figure below shows all the components involved in the scenario.
+
+[](/book/static/img/LoadBalancingDispD.jpg "Click to see the full-resolution image")
+
+Figure 8. NCBID at Work
+
+The data flow in this scenario is as follows. A request from the client reaches a front-end Apache server as discussed above. The front-end server then passes the request to the DISPD dispatcher. The DISPD communicates with both the FWDaemon and the NCBID utility on (possibly) another host and requests that the specified Unix filter program (Service X in the figure) be daemonized. The daemonized service starts listening on a certain port for a network connection. The connection attributes are delivered to the FWDaemon and to the client via the web front end and the router. The client connects to the NAT service, and the NAT service passes the request further to the FWDaemon. The FWDaemon passes the request to the daemonized Service X on the Service Provider K host. From that moment the client is able to exchange data with the service. The described scenario is intended for tasks requiring long-term connections.
+
+Further sections describe all the components in more detail.
+
+
+
+### Load Balancing Service Mapping Daemon (LBSMD)
+
+
+
+#### Overview
+
+As mentioned earlier, the LBSMD daemon runs on almost every host that carries either public or private servers which, in turn, implement NCBI services. The services include CGI programs or standalone servers to access NCBI data.
+
+Each service has a unique name assigned to it; “TaxService” would be an example of such a name. The name not only identifies a service, it also implies the protocol used for data exchange with that service. For example, any client which connects to the “TaxService” service knows how to communicate with that service regardless of the way the service is implemented. In other words, the service could be implemented as a standalone server on host X and as a CGI program on the same host or on another host Y (please note, however, that there are exceptions and for some service types it is forbidden to have more than one service of that type on the same host).
+
+A host can advertise many services. For example, one service (such as “Entrez2”) can operate with binary data only while another one (such as “Entrez2Text”) can operate with text data only. The distinction between those two services could be made by using a content type specifier in the LBSMD daemon configuration file.
+
+The main purpose of the LBSMD daemon is to maintain a table of all services available at NCBI at the moment. In addition the LBSMD daemon keeps track of servers that are found to be dysfunctional (dead servers). The daemon is also responsible for propagating trouble reports, obtained from applications. The application trouble reports are based on their experience with advertised servers (e.g., an advertised server is not technically marked dead but generates some sort of garbage). Further in this document, the latter kind of feedback is called a penalty.
+
+The principle of load balancing is simple: each server which implements a service is assigned a (calculated) rate. The higher the rate, the better the chance for that server to be chosen when a request for the service comes up. Note that load balancing is thus almost never deterministic.
+
+The LBSMD daemon calculates two parameters for the host on which it is running. The parameters are a normal host status and a BLAST host status (based on the instant load of the system). These parameters are then used to calculate the rate of all (non-static) servers on the host. The rates of all other hosts are not calculated but are received and stored in the LBSMD table.
+
+The LBSMD daemon can be restarted from a crontab every few minutes on all the production hosts to ensure that the daemon is always running. This technique is safe because no more than one instance of the daemon is permitted on a given host and any attempt to start more than one is ignored. Normally, though, a running daemon instance is kept afloat by some kind of monitoring software, such as “puppet” or “monit”, which makes the use of crontabs unnecessary.
+
+The main loop of the LBSMD daemon:
+
+- periodically checks the configuration file and reloads the configuration when necessary;
+
+- checks for and processes incoming messages from neighbor LBSMD daemons running on other hosts; and
+
+- generates and broadcasts the messages to the other hosts about the load of the system and configured services.
+
+The LBSMD daemon can also periodically check whether the configured servers are alive: either by trying to establish a connection to them (and then disconnecting immediately, without sending/receiving any data) and / or by using a special plugin script that can do more intelligent, thorough, and server-specific diagnostics, and report the result back to LBSMD via an exit code.
+
+Lastly, LBSMD can pull port load information as posted by the running servers. This is done via a simple API. The information is then used to calculate the final server rates at run time.
+
+Although clients can [redirect services](ch_conn.html#ch_conn.Service_Redirection), LBSMD does not distinguish between direct and redirected services.
+
+
+
+#### Configuration
+
+The LBSMD daemon is configured via command line options and via a configuration file. The full list of command line options can be retrieved by issuing the following command:
+
+`/opt/machine/lbsm/sbin/lbsmd --help`
+
+The local NCBI users can also visit the following link:
+
+
+
+The default name of the LBSMD daemon configuration file is `/etc/lbsmd/servrc.cfg`. Each line can be one of the following:
+
+- an include directive
+
+- site / zone designation
+
+- host authority information
+
+- a monitored port designation
+
+- a part of the host environment
+
+- a service definition
+
+- an empty line (entirely blank or containing a comment only)
+
+Empty lines are ignored in the file. Any single configuration line can be split into several physical lines by inserting backslash symbols (\\) before the line breaks. A comment is introduced by the pound/hash symbol (\#).
+
+A configuration line of the form
+
+ %include filename
+
+causes the contents of the named file **`filename`** to be inserted here. The daemon always assumes that relative file names (those that do not start with the slash character, /) are based on the daemon startup directory. This is true for any level of nesting.
+
+Once started, the daemon first tries to read its configuration from `/etc/lbsmd/servrc.cfg`. If the file is not found (or is not readable) the daemon looks for the configuration file `servrc.cfg` in the directory from which it has been started. This fallback mechanism is not used when the configuration file name is explicitly stated in the command line. The daemon periodically checks the configuration file and all of its descendants and reloads (discards) their contents if some of the files have been either updated, (re-)moved, or added.
+
+The “**`filename`**” can be followed by a pipe character ( \| ) and some text (up to the end of the line or a comment introduced by the hash character). That text is then prepended to every line (except the `%include` directives) read from the included file.
+
+A configuration line of the form
+
+ @zone
+
+specifies the zone to which the entire configuration file applies, where a zone is a subdivision of the existing broadcast domain which does not intermix with other unrelated zones. Only one zone designation is allowed, and it must match the predefined site information (the numeric designation of the entire broadcast domain, which is either “guessed” by LBSMD or preset via a command-line parameter): the zone value must be a binary subset of the site value (which is usually a contiguous set of 1-bits, such as `0xC0` or `0x1E`).
+
+When no zone is specified, the zone is set equal to the entire site (broadcast domain) so that any regular service defined by the configuration is visible to each and every LBSMD running at the same site. Otherwise, only the servers with bitwise-matching zones are visible to each other: if 1, 2, and 3 are the zones of hosts “X”, “Y” and “Z”, respectively (all hosts reside within the same site, say 7), then servers from “X” are visible by “Z”, but not by “Y”; servers from “Y” are visible by “Z” but not by “X”; and finally, all servers from “Z” are visible by both “X” and “Y”. There’s a way to define servers at “X” to be visible by “Y” using an “Inter” server flag (see below).
+
+A configuration line of the form
+
+ [*]user
+
+introduces a user that is added to the host authority. There can be multiple authority lines across the configuration, and they are all aggregated into a list. The list can contain both individual user names and / or group names (denoted by a preceding asterisk). The listed users and / or members of the listed groups will be allowed to operate on all server records that appear in the LBSMD configuration files on this host (individual server entries may designate additional personnel on a per-server basis). Additional authority entries are only allowed from the same branch of the configuration file tree: so if a file “a” includes a file “b”, where the first host authority is defined, then any file that is included (directly or indirectly) from “b” can add entries to the host authority, while no other file that is included later from “a” can.
+
+A configuration line of the form
+
+ :port
+
+designates a local network port for monitoring by LBSMD: the daemon will regularly pull the port information as provided by servers at run time: total port capacity, used capacity and free capacity; and make these values available in the load-balance messages sent to other LBSMDs. The ratio “free” over “total” will be used to calculate the port availability (1.0=fully free, 0.0=fully clogged). Servers may use arbitrary units to express the capacity, but both “used” and “free” may not be greater than “total”, and “used” must correspond to the actual used resource, yet “free” may either be calculated (e.g. algorithmically decreased in anticipation of the mounting load in order to shrink the port availability ratio quicker) or simply amount to “total” – “used”. Note that “free” set to “0” signals the port as currently being unavailable for service (i.e. as if the port was down) – and an automatic connection check, if any, will not be performed by LBSMD on that port.
+
+A configuration line of the form
+
+ name=value
+
+goes into the host environment. The host environment can be accessed by clients when they perform service name resolution. The host environment is designed to help the client know about limitations/options that the host has, and based on this additional information the client can decide whether the server (despite the fact that it implements the service) is suitable for carrying out the client's request. For example, the host environment can give the client an idea about what databases are available on the host. The host environment is not interpreted or used in any way by either the daemon or the load balancing algorithm, except that the name must be a valid identifier. The value may be practically anything, even empty. It is left solely for the client to parse the environment and to look for the information of interest. The host environment can be obtained from the service iterator by a call to `SERV_GetNextInfoEx()`, which is documented in the [service mapping API](ch_conn.html#ch_conn.service_mapping_api).
+
+***Note***: White space characters surrounding the name are not preserved, but they are preserved in the value, i.e. when they appear after the “=” sign.
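+
+A hypothetical host environment fragment (the entries below are purely illustrative) might look like this:
+
+    db=PubMed Taxonomy
+    testing=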
+
+A configuration line of the form
+
+ service_name [check_specifier] server_descriptor [| launcher_info ]
+
+defines a server. The detailed description of the individual fields is given below, followed by a combined example.
+
+- **`service_name`** specifies the service name that the server is part of, for example TaxService. The same **`service_name`** may be used in multiple server definition lines to add more servers implementing that service.
+
+- **`[check_specifier]`** is an optional parameter (if omitted, the surrounding square brackets must not be used). The parameter is a comma separated list and each element in the list can be one of the following.
+
+ - **`[-]N[/M]`** where N and M are integers. This will lead to checking every N seconds with backoff time of M seconds if failed. The “-“ character is used when it is required to check dependencies only, but not the primary connection point. "0", which stands for "no check interval", disables checks for the service.
+
+ - **`[!][host[:port]][+[service]]`** which describes a dependency. The “!” character means negation. The **`service`** is a service name the describing service depends on and runs on **`host:port`**. The pair **`host:port`** is required if no service is specified. The **`host`**, :**`port`**, or both can be missing if **`service`** is specified (in that case the missing parts are read as “any”). The “+” character alone means “this service’s name” (of the one currently being defined). Multiple dependency specifications are allowed.
+
+ - **`[~][DOW[-DOW]][@H[-H]]`** which defines a schedule. The “~” character means negation. The service runs from **`DOW`** to **`DOW`** (**`DOW`** is one of Su, Mo, Tu, We, Th, Fr, Sa, or Hd, which stands for a federal holiday, and cannot be used in weekday ranges) or any if not specified, and between hours **`H`** to **`H`** (9-5 means 9:00am thru 4:59pm, 18-0 means 6pm thru midnight). Single **`DOW`** and / or **`H`** are allowed and mean the exact day of week (or a holiday) and / or one exact hour. Multiple schedule specifications are allowed.
+
+  - **`email@ncbi.nlm.nih.gov`** which makes the LBSMD daemon send an e-mail to the specified address whenever this server changes its status (e.g. from up to down). Multiple e-mail specifications are allowed. The **`ncbi.nlm.nih.gov`** part is fixed and may not be changed.
+
+ - **`user`** or **`*group`** which makes the LBSMD daemon add the specified user or group of users to the list of personnel who are authorized to modify the server (e.g. post a penalty, issue a rerate command etc.). By default these actions are only allowed to the **`root`** and **`lbsmd`** users, as well as users added to the host authority. Multiple specifications are allowed.
+
+ - **`script`** which specifies a path to a local executable which checks whether the server is operational. The LBSMD daemon starts this script periodically as specified by the check time parameter(s) above. Only a single script specification is allowed. See [Check Script Specification](#ch_app.Check_Script_Specification) for more details.
+
+- **`server_descriptor`** specifies the address of the server and supplies additional information. An example of the **`server_descriptor`**: `STANDALONE somehost:1234 R=3000 L=yes S=yes B=-20` See [Server Descriptor Specification](#ch_app.Server_Descriptor_Specification) for more details.
+
+- **`launcher_info`** is basically a command line preceded by a pipe symbol ( \| ), which serves as a delimiter from the **`server_descriptor`**. It is only required for **NCBID** services that are configured on the local host.
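+
+Putting the pieces above together, a purely hypothetical service definition line (all names, paths and values are made up) could look like this: check every 30 seconds with a 60-second backoff on failure, depend on a service named AuxService, run on weekdays from 9:00am thru 4:59pm, e-mail status changes, and use a check script:
+
+    TaxService [30/60, +AuxService, Mo-Fr@9-17, admin@ncbi.nlm.nih.gov, /opt/checks/taxcheck.sh] STANDALONE somehost:1234 R=3000 L=yes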
+
+
+
+##### Check Script Specification
+
+The check script file is configured between square brackets '[' and ']' in the service definition line. For example, the service definition line:
+
+`MYSERVICE [5, /bin/user/directory/script.sh] STANDALONE :2222 ...`
+
+sets the period in seconds between script checks as "`5`" (the default period is 15 seconds) and designates the check script file as "`/bin/user/directory/script.sh`" to be launched every 5 seconds. You can prefix the period with a minus sign (-) to indicate that LBSMD should not check the connection point (:2222 in this example) on its own, but should only run the script. The script must finish before the next check run is due. Otherwise, LBSMD will kill the script and ignore the result. Multiple repetitive failures may result in the check script removal from the check schedule.
+
+The following command-line parameters are always passed to the script upon execution:
+
+- **`argv[0]`** = name of the executable with preceding '\|' character if **`stdin`** / **`stdout`** are open to the server connection (/dev/null otherwise), ***NB***: '\|' is not always readily accessible from within shell scripts, so it's duplicated in **`argv[2]`** for convenience;
+
+- **`argv[1]`** = name of the service being checked;
+
+- **`argv[2]`** = if piped, "\|host:port" of the connection point being checked, otherwise "host:port" of the server as per configuration;
+
+The following additional command-line parameters will be passed to the script if it has been run before:
+
+- **`argv[3]`** = exit code obtained in the last check script run;
+
+- **`argv[4]`** = repetition count for **`argv[3]`** (***NB***: 0 means this is the first occurrence of the exit code given in **`argv[3]`**);
+
+- **`argv[5]`** = seconds elapsed since the last check script run.
+
+Output to **`stderr`** is attached to the LBSMD log file; the CPU limit is set to maximal allowed execution time. Nevertheless, the check must finish before the next invocation is due, per the server configuration.
+
+The check script is expected to produce one of the following exit codes:
+
+
+
+| Code(s) | Meaning |
+|-----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| 0 | The server is fully available, i.e. "running at full throttle". |
+| 1 - 99 | Indicates the approximate percent of base capacity used. |
+| 100 - 110 | Server state is set as RESERVED. RESERVED servers are unavailable to most clients but not considered as officially DOWN. |
+| 111 - 120 | The server is not available and must not be used, i.e. DOWN. |
+| 123 | Retain the previous exit code (as supplied in **`argv[3]`**) and increment the repetition count. Otherwise (if there is no previous code), retain the current server state and log a warning. |
+| 124 (*not* followed by 125) | Retain the current server state. |
+| 124 followed by 125 | Turn the server off, with no more checks. ***Note:*** This only applies when 124 is followed by 125, both without repetitions. |
+| 125 (*not* preceded by 124) | Retain the current server state. |
+| 126 | Script was found but not executable (POSIX, script error). |
+| 127 | Script was not found (POSIX, script error). |
+| 200 - 210 | STANDBY server (set the rate to 0.005). The rate will be rolled back to the previously set "regular" rate the next time the RERATE command comes; or when the check script returns anything other than 123, 124, 125, or the state-retaining ALERTs (211-220). STANDBY servers are those having base rate in the range [0.001..0.009], with higher rates having better chance to get drafted for service. STANDBY servers are only used by clients if there are no usable non-STANDBY counterparts found. |
+| 211 - 220 | ALERT (email contacts and retain the current server state). |
+| 221 - 230 | ALERT (email contacts and base the server rate on the dependency check only). |
+
+
+
+Exit codes 126, 127, and other unlisted codes are treated as if 0 had been returned (i.e. the server rate is based on the dependency check only).
+
+Any exit code other than 123 resets the repetition count, even though the new code may be equal to the previous one. In the absence of a previous code, exit code 123 will not be counted as a repetition, but will cause a warning to be logged.
+
+Any exit code *not* from the table above will cause a warning to be logged, and will be treated as if 0 had been returned. Note that upon the exit code sequence 124, 125 no further script runs will occur, and the server will be taken out of service.
+
+If the check script crashes ungracefully (with or without the coredump) 100+ times in a row, it will be eliminated from further checks, and the server will be considered fully available (i.e. as if 0 had been returned).
+
+Servers are called SUPPRESSED when they are 100% penalized (see server penalties below); while RESERVED is a special state that LBSMD maintains. 100% penalty makes an entry not only unavailable for regular use (same as RESERVED) but also assumes some maintenance work in progress (so that any underlying state changes will not be announced immediately but only when the entry goes out of the 100% penalized state, if any state change still remains). On the other hand and from the client perspective, RESERVED and SUPPRESSED may look identical.
+
+***Note:*** The check script operation is complementary to setting a penalty prior to doing any disruptive changes in production. In other words, the script is only reliable as long as the service is expected to work. If there is any scheduled maintenance, it should be communicated to LBSMD via a penalty rather than by an assumption that the failing script will do the job of bringing the service to the down state and excluding it from LB.
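+
+For illustration only, a minimal check script might look like the sketch below (the probe command `my_probe` is a made-up placeholder; the argument meanings and exit codes are as described above):
+
+    #!/bin/sh
+    # $1 = service name, $2 = [|]host:port being checked,
+    # $3 = previous exit code, $4 = repetition count, $5 = seconds since the last run
+    point=${2#|}                  # strip the leading '|' if stdin/stdout are piped
+    if my_probe "$point"; then    # hypothetical health probe of the connection point
+        exit 0                    # fully available
+    else
+        exit 111                  # report the server as DOWN
+    fi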
+
+
+
+##### Server Descriptor Specification
+
+The **`server_descriptor`**, also detailed in `connect/ncbi_server_info.h`, consists of the following fields (a combined example is given after the list of flags below):
+
+`server_type [host][:port] [arguments] [flags]`
+
+where:
+
+- **`server_type`** is one of the following keywords ([more info](ch_conn.html#ch_conn.service_connector)):
+
+ - ***NCBID*** for servers launched by ncbid.cgi
+
+ - ***STANDALONE*** for standalone servers listening to incoming connections on dedicated ports
+
+  - ***HTTP\_GET*** for servers that are CGI programs accepting only the GET request method
+
+  - ***HTTP\_POST*** for servers that are CGI programs accepting only the POST request method
+
+  - ***HTTP*** for servers that are CGI programs accepting either the GET or POST request method
+
+ - ***DNS*** for introduction of a name (fake service), which can be used later in load-balancing for domain name resolution
+
+ - ***NAMEHOLD*** for declaration of service names that cannot be defined in any other configuration files except for the current configuration file. ***Note:*** The FIREWALL server specification may not be used in a configuration file (i.e., may neither be declared as services nor as service name holders).
+
+- both **`host`** and **`port`** parameters are optional. Defaults are local host and port 80, except for ***STANDALONE*** and ***DNS*** servers, which do not have a default port value. If a host is specified (by any of the following: the keyword localhost, the localhost IP address 127.0.0.1, a real host name, or an IP address) then the described server is not subject to variable load balancing but is a static server. Such a server always has a constant rate, independent of any host load.
+
+- **`arguments`** are required for HTTP\* servers and must specify the local part of the URL of the CGI program and, optionally, parameters such as `/somepath/somecgi.cgi?param1&param2=value2&param3=value3`. If no parameters are to be supplied, then the question mark (?) must be omitted, too. For **NCBID** servers, arguments are parameters to pass to the server and are formed as arguments for CGI programs, i.e., `param1&param2&param3=value`. As a special rule, '' (two single quotes) may be used to denote an empty argument for the **NCBID** server. ***STANDALONE*** and ***DNS*** servers do not take any **`arguments`**.
+
+- **`flags`** can come in any order (but no more than one instance of a flag is allowed) and essentially are the optional modifiers of values used by default. The following flags are recognized (see [ncbi\_server\_info.h](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/connect/ncbi_server_info.h)):
+
+ - load calculation keyword:
+
+  - ***Blast*** to use a special algorithm for rate calculation suitable for BLAST applications. The algorithm uses instant values of the host load and thus is less conservative and more reactive than the ordinary one.
+
+ - ***Regular*** to use an ordinary rate calculation (default, and the only load calculation option allowed for static servers).
+
+  - Either of these keywords may be suffixed with “Inter”, such as to form ***RegularInter***, making the entry cross the current zone boundary and become available outside its zone.
+
+- base rate:
+
+  - R=value sets the base server reachability rate (as a floating point number); the default is 1000. Any negative value makes the server unreachable, and a value 0 is used. The range of the base rate is between 0.001 and 100000. Note that the range [0.001..0.009] is reserved for STANDBY servers, i.e. the ones that are only used by clients if no other usable non-STANDBY counterparts can be found.
+
+- locality markers (Note: If necessary, both L and P markers can be combined in a particular service definition):
+
+ - L={yes\|no} sets (if yes) the server to be local only. The default is no. The [service mapping API](ch_conn.html#ch_conn.service_mapping_api) returns local only servers in the case of mapping with the use of LBSMD running on the same - local - host (direct mapping), or if the dispatching (indirect mapping) occurs within the NCBI Intranet. Otherwise, if the service mapping occurs using a non-local network (certainly indirectly, by exchange with dispd.cgi) then servers that are local only are not seen.
+
+ - P={yes\|no} sets (if yes) the server to be private. The default is no. Private servers are not seen by the outside NCBI users (exactly like local servers), but in addition these servers are not seen from the NCBI Intranet if requested from a host, which is different from one where the private server runs. This flag cannot be used for DNS servers.
+
+- Stateful server:
+
+ - S={yes\|no}. The default is no. Indication of stateful server, which allows only dedicated socket (stateful) connections. This tag is not allowed for HTTP\* and DNS servers.
+
+- Secure server:
+
+ - $={yes\|no}. The default is no. Indication of the server to be used with secure connections only. For STANDALONE servers it means to use SSL, and for the HTTP\* ones – to use the HTTPS protocol.
+
+- Content type indication:
+
+  - C=type/subtype [no default] specification of the Content-Type (including encoding) which the server accepts. The value of this flag gets added automatically to any HTTP packet sent to the server by the SERVICE connector. However, in order to communicate, a client still has to know and generate the data type accepted by the server, i.e. a protocol that the server understands. This flag just helps ensure that HTTP packets all get the proper content type, defined at service configuration. This tag is not allowed in DNS server specifications.
+
+- Bonus coefficient:
+
+  - B=double [0.0 = default] specifies a multiplicative bonus given to a server run locally, when calculating the reachability rate. Special rules apply to negative/zero values: 0.0 means not to use the described rate increase at all (the default rate calculation is used, which only slightly increases the rates of locally run servers). A negative value denotes that the locally run server should be taken in the first place, regardless of its rate, provided that its rate is no less than the percentage (expressed by the absolute value of this coefficient) of the average rate of the other servers for the same service. That is, -5 instructs to ignore the locally run server only if its status is less than 5% of the average status of the remaining servers for the same service.
+
+- Validity period:
+
+ - T=integer [0 = default] specifies the time in seconds this server entry is valid without update. (If equal to 0 then defaulted by the LBSM Daemon to some reasonable value.)
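+
+Combining the fields above, a purely hypothetical server descriptor for an HTTP service (all names, paths and values are made up) might look like this:
+
+    MyService HTTP somehost:8080 /Service/mycgi.cgi C=text/plain L=yes R=500 T=300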
+
+Server descriptors of type ***NAMEHOLD*** are special. As **`arguments`**, they have only a server type keyword. The namehold specification informs the daemon that the service of this name and type is not to be defined later in any configuration file except for the current one. Also, if the host (and/or port) is specified, then this protection works only for the service name on the particular host (and/or port).
+
+***Note:*** it is recommended that a dummy port number (such as :0) always be put in the namehold specifications to avoid ambiguities with treating the server type as a host name. The following example prevents **`TestService`** of type ***DNS*** from being defined in all other configuration files included later, and **`TestService2`** from being defined as an **NCBID** service on host foo:
+
+ TestService NAMEHOLD :0 DNS
+ TestService2 NAMEHOLD foo:0 NCBID
+
+
+
+#### Sites
+
+LBSMD is minimally aware of the NCBI network layout and can generally guess its “site” information from either an IP address or special location-role files located in the /etc/ncbi directory: a BE-MD production and development site, a BE-MD.QA site, a BE-MD.TRY site, and lastly an ST-VA site. When reading zone information from the “@” directive of the configuration, LBSMD can treat special non-numeric values as follows: “@try” as the production zone within BE-MD.TRY, “@qa” as the production zone within BE-MD.QA, “@dev” as a development zone within the current site, and “@\*prod\*” (e.g. @intprod) as a production zone within the current site, where the production zone has a value of “1” and the development zone a value of “2”: so “@2” and “@dev”, as well as “@1” and “@\*prod\*”, are each equivalent. That makes the definition of zones more convenient via the %include directive with the pipe character:
+
+ %include /etc/ncbi/role |@ # define zone via the role file
+
+Suppose that the daemon detected its site as ST-VA and assigned it a value of 0x300; then the above directive assigns the current zone the value of 0x100 if the file reads “prod” or “1”, and zone 0x200 if the file reads “dev” or “2”. Note that if the file reads either “try” or “qa”, or “4”, the implied “@” directive will flag an error because of the mismatch between the resultant zone and the current site values.
+
+Both zone and site (or site alone) can be permanently assigned with the command-line parameters and then may not be overridden from the configuration file(s).
+
+
+
+#### Signals
+
+The table below describes the LBSMD daemon signal processing.
+
+
+
+| Signal  | Reaction |
+|---------|---------------------------------------------------------------------------------------------------------------------------|
+| SIGHUP  | reload the configuration |
+| SIGINT  | quit |
+| SIGTERM | quit |
+| SIGUSR1 | toggle the verbosity level between less verbose (default) and more verbose (when every warning generated is stored) modes |
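+
+For example, the configuration can be reloaded without restarting the daemon by sending SIGHUP to the LBSMD process from a shell (the PID below is taken from the sample lbsmc output shown later in this chapter):
+
+    kill -HUP 17530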
+
+
+
+
+
+#### Automatic Configuration Distribution
+
+The structure of the configuration files is unified for all the hosts in the NCBI network. It is shown in the figure below.
+
+[](/book/static/img/ch_app_lbsmd_cfg_structure.png "Click to see the full-resolution image")
+
+Figure 9. LBSMD Configuration Files Structure
+
+The prefix `/etc/lbsmd`, common to all the configuration files, is omitted in the figure. The arrows on the diagram show how the files are included.
+
+The files `servrc.cfg` and `servrc.cfg.systems` have a fixed structure and should not be changed at all. The file `local/servrc.cfg.systems` is meant to be modified by the systems group, while the file `local/servrc.cfg.ieb` is meant to be modified by the delegated members of the respective groups. To make changes easier, all the `local/servrc.cfg.ieb` files from all the hosts in the NCBI network are stored in a centralized SVN repository. The repository can be checked out by issuing the following command:
+
+`svn co svn+ssh://subvert.be-md.ncbi.nlm.nih.gov/export/home/LBSMD_REPO`
+
+The file names in that repository match the following pattern:
+
+`hostname.{be-md|st-va}[.qa]`
+
+where `be-md` is used for Bethesda, MD site and `st-va` is used for Sterling, VA site. The optional `.qa` suffix is used for quality assurance department hosts.
+
+So, if it is required to change the `/etc/lbsmd/local/servrc.cfg.ieb` file on the sutils1 host in Bethesda, then the `sutils1.be-md` file is to be changed in the repository.
+
+As soon as the modified file is checked in, it will be delivered to the corresponding host with the proper name automatically. The changes will take effect in a few minutes. The process of configuration distribution is illustrated in the figure below.
+
+[](/book/static/img/CFEngine.jpg "Click to see the full-resolution image")
+
+Figure 10. Automatic Configuration Distribution
+
+
+
+#### Monitoring and Control
+
+
+
+##### Service Search
+
+The following web page can be used to search for a service:
+
+
+
+The following screen will appear
+
+[](/book/static/img/LBSMDSearchMain.gif "Click to see the full-resolution image")
+
+Figure 11. NCBI Service Search Page
+
+As an example of usage, a user might enter a partial service name such as "TaxService" and click the “Go” button. The search results will display "TaxService", "TaxService3" and "TaxService3Test" if those services are available.
+
+
+
+##### lbsmc Utility
+
+Another way of monitoring the LBSMD daemon is using the lbsmc utility. The utility periodically dumps onto the screen a table which represents the current content of the LBSMD daemon table. The utility output can be controlled by a number of command line options. The full list of available options and their descriptions can be obtained by issuing the following command:
+
+`lbsmc -h`
+
+The NCBI intranet users can also get the list of options by clicking on this link: .
+
+For example, to print a list of hosts whose names match the pattern “sutil\*” the user can issue the following command:
+
+ >./lbsmc -h sutil* 0
+ LBSMC - Load Balancing Service Mapping Client R100432
+ 03/13/08 16:20:23 ====== widget3.be-md.ncbi.nlm.nih.gov (00:00) ======= [2] V1.2
+ Hostname/IPaddr Task/CPU LoadAv LoadBl Joined Status StatBl
+ sutils1 151/4 0.06 0.03 03/12 13:04 397.62 3973.51
+ sutils2 145/4 0.17 0.03 03/12 13:04 155.95 3972.41
+ sutils3 150/4 0.20 0.03 03/12 13:04 129.03 3973.33
+ --------------------------------------------------------------------------------
+ Service T Type Hostname/IPaddr:Port LFS B.Rate Coef Rating
+ bounce +25 NCBID sutils1:80 L 1000.00 397.62
+ bounce +25 HTTP sutils1:80 1000.00 397.62
+ bounce +25 NCBID sutils2:80 L 1000.00 155.95
+ bounce +25 HTTP sutils2:80 1000.00 155.95
+ bounce +27 NCBID sutils3:80 L 1000.00 129.03
+ bounce +27 HTTP sutils3:80 1000.00 129.03
+ dispatcher_lb 25 DNS sutils1:80 1000.00 397.62
+ dispatcher_lb 25 DNS sutils2:80 1000.00 155.95
+ dispatcher_lb 27 DNS sutils3:80 1000.00 129.03
+ MapViewEntrez 25 STANDALONE sutils1:44616 L S 1000.00 397.62
+ MapViewEntrez 25 STANDALONE sutils2:44616 L S 1000.00 155.95
+ MapViewEntrez 27 STANDALONE sutils3:44616 L S 1000.00 129.03
+ MapViewMeta 25 STANDALONE sutils2:44414 L S 0.00 0.00
+ MapViewMeta 27 STANDALONE sutils3:44414 L S 0.00 0.00
+ MapViewMeta 25 STANDALONE sutils1:44414 L S 0.00 0.00
+ sutils_lb 25 DNS sutils1:80 1000.00 397.62
+ sutils_lb 25 DNS sutils2:80 1000.00 155.95
+ sutils_lb 27 DNS sutils3:80 1000.00 129.03
+ TaxService 25 NCBID sutils1:80 1000.00 397.62
+ TaxService 25 NCBID sutils2:80 1000.00 155.95
+ TaxService 27 NCBID sutils3:80 1000.00 129.03
+ TaxService3 +25 HTTP_POST sutils1:80 1000.00 397.62
+ TaxService3 +25 HTTP_POST sutils2:80 1000.00 155.95
+ TaxService3 +27 HTTP_POST sutils3:80 1000.00 129.03
+ test +25 HTTP sutils1:80 1000.00 397.62
+ test +25 HTTP sutils2:80 1000.00 155.95
+ test +27 HTTP sutils3:80 1000.00 129.03
+ testgenomes_lb 25 DNS sutils1:2441 1000.00 397.62
+ testgenomes_lb 25 DNS sutils2:2441 1000.00 155.95
+ testgenomes_lb 27 DNS sutils3:2441 1000.00 129.03
+ testsutils_lb 25 DNS sutils1:2441 1000.00 397.62
+ testsutils_lb 25 DNS sutils2:2441 1000.00 155.95
+ testsutils_lb 27 DNS sutils3:2441 1000.00 129.03
+ --------------------------------------------------------------------------------
+ * Hosts:4\747, Srvrs:44/1223/23 | Heap:249856, used:237291/249616, free:240 *
+ LBSMD PID: 17530, config: /etc/lbsmd/servrc.cfg
+
+
+
+##### NCBI Intranet Web Utilities
+
+The NCBI intranet users can also visit the following quick reference links:
+
+- Dead servers list:
+
+- Search engine for all available hosts, all services and database affiliation:
+
+If the lbsmc utility is run with the -f option then the output contains two parts:
+
+- The host table. The table is accompanied by raw data which are printed in the order they appear in the LBSMD daemon table.
+
+- The service table
+
+The output is provided in either long or short format. The format depends on whether the -w option was specified in the command line (the option requests the long (wide) output). The wide output occupies about 132 columns, while the short (normal) output occupies only 80, which is the standard terminal width.
+
+If the service name is longer than the allowed number of characters to display, the trailing characters are replaced with “\>”. When there is more information about the host / service to be displayed, the “+” character is put beside the host / service name (this additional information can be retrieved by adding the -i option). When both “+” and “\>” are to be shown, they are replaced with the single character “\*”. In the case of the wide-output format, the “\#” character shown in the service line means that there is no host information available for the service (similar to the static servers). The “!” character in the service line denotes that the service was configured / stored with an error (this character should never actually appear in the listings and should be reported whenever encountered). Wide output for hosts contains the time of bootup and startup. If the startup time is preceded by the “~” character, then the host was gone for a while and then came back while the lbsmc utility was running. The “+” character in the times shows that the date belongs to the past year(s).
+
+
+
+##### Server Penalizer API and Utility
+
+The utility allows reporting problems with accessing a certain server to the LBSMD daemon, in the form of a penalty: a value in the range [0..100] that shows, as a percentage, how badly the server misbehaves. The value 0 means that the server is completely okay, whereas 100 means that the server (is misbehaving and) should **not** be used at all. The penalty is not a constant value: once set, it starts to decrease in time, at first slowly, then faster and faster until it reaches zero. This way, if a server was penalized for some reason and the problem has later been resolved, then the server becomes available gradually as its penalty (not being reset by applications again in the absence of the offending reason) drops to zero. The figure below illustrates how the penalty value behaves.
+
+[](/book/static/img/Penalty.jpg "Click to see the full-resolution image")
+
+Figure 12. Penalty Value Characteristics
+
+Technically, the penalty is maintained by a daemon that has the server configured, i.e. it is received by a certain host, which may be different from the one where the server was put into the configuration file. The penalty first migrates to that host, and then the daemon on that host announces that the server was penalized.
+
+***Note:*** Once a daemon is restarted, the penalty information is lost.
+
+[Service mapping API](ch_conn.html#ch_conn.service_mapping_api) has a call `SERV_Penalize()`, declared in `connect/ncbi_service.h`, which can be used to set the penalty for the last server obtained from the mapping iterator.
+
+For script files (similar to the ones used to start/stop servers), there is a dedicated utility program called `lbsm_feedback`, which sets the penalty from the command line. This command should be used with extreme care because it affects the load-balancing mechanism substantially.
+
+**lbsm\_feedback** is a part of the LBSM set of tools installed on all hosts which run **LBSMD**. As explained above, penalizing makes a server a less favorable choice for the load balancing mechanism. Because the full penalty of 100% makes a server completely unavailable to clients, at the time when the server is about to be shut down (or restarted) it is wise to increase the server penalty to the maximal value, i.e. to exclude the server from the service mapping. (Otherwise, the LBSMD daemon might not immediately notice that the server is down and may continue dispatching to that server.) Usually, the penalty takes at most 5 seconds to propagate to all participating network hosts. Before an actual server shutdown, the following sequence of commands can be used:
+
+ > /opt/machine/lbsm/sbin/lbsm_feedback 'Servicename STANDALONE host 100 120'
+ > sleep 5
+ now you can shutdown the server
+
+The effect of the above is to set the maximal penalty 100 for the service Servicename (of type ***STANDALONE***) running on host **`host`** for at least 120 seconds. After 120 seconds the penalty value will start going down steadily and at some stage the penalty becomes 0. The default hold time equals 0. It takes some time to deliver the penalty value to the other hosts on the network so ‘sleep 5’ is used. Please note the single quotes surrounding the penalty specification: they are required in a command shell because **lbsm\_feedback** takes only one argument which is the entire penalty specification.
+
+As soon as the server is down, the **LBSMD** daemon detects it in a matter of several seconds (if not instructed otherwise by the configuration file) and then does not dispatch to the server until it is back up. In some circumstances, the following command may come in handy:
+
+ > /opt/machine/lbsm/sbin/lbsm_feedback 'Servicename STANDALONE host 0'
+
+The command resets the penalty to 0 (no penalty) and is useful when, as for the previous example, the server is restarted and ready in less than 120 seconds, but the penalty is still held high by the **LBSMD** daemon on the other hosts.
+
+The formal description of the lbsm\_feedback utility parameters is given below.
+
+[](/book/static/img/lbsm_feedback.gif "Click to see the full-resolution image")
+
+Figure 13. lbsm\_feedback Arguments
+
+The `servicename` can be an identifier with ‘\*’ for any symbols and / or ‘?’ for a single character. The `penalty value` is an integer value in the range 0 ... 100. The `port number` and `time` are integers. The `hostname` is an identifier and the `rate value` is a floating point value.
+
+
+
+#### SVN Repository
+
+The SVN repository where the LBSMD daemon source code is located can be retrieved by issuing the following command:
+
+`svn co https://svn.ncbi.nlm.nih.gov/repos/toolkit/trunk/c++`
+
+The daemon code is in this file:
+
+`c++/src/connect/daemons/lbsmd.c`
+
+
+
+#### Log Files
+
+The LBSMD daemon stores its log files at the following location:
+
+`/var/log/lbsmd`
+
+The file is formed locally on the host where the LBSMD daemon is running. The log file size is limited to prevent the disk from being flooded with messages. A standard log rotation is applied to the log file, so you may see the files:
+
+`/var/log/lbsmd.X.gz`
+
+where X is the number of the previous log file.
+
+The log file size can be controlled by the -s command line option. By default, -s 0 is the active flag, which provides a way to create (if necessary) and to append messages to the log file with no limitation on the file size whatsoever. The -s -1 switch instructs indefinite appending to the log file, which must exist. Otherwise, log messages are not stored. -s positive\_number restricts the ability to create (if necessary) and to append to the log file until the file reaches the specified size in kilobytes. After that, message logging is suspended, and subsequent messages are discarded. Note that the limiting file size is only approximate, and sometimes the log file can grow slightly bigger. The daemon keeps track of log files and leaves a final logging message, either when switching from one file to another, in case the file has been moved or removed, or when the file size has reached its limit.
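+
+For instance, a hypothetical invocation such as the following would cap the log file at roughly 1 MB (1024 KB), after which message logging is suspended:
+
+    lbsmd -s 1024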
+
+NCBI intranet users can get a few (no more than 100) recent lines of the log file on an NCBI internal host. It is also possible to visit the following link:
+
+
+
+
+
+#### Configuration Examples
+
+Here is an example of a LBSMD configuration file:
+
+ # $Id$
+ #
+ # This is a configuration file of new NCBI service dispatcher
+ #
+ #
+ # DBLB interface definitions
+ %include /etc/lbsmd/servrc.cfg.db
+ # IEB's services
+ testHTTP /Service/test.cgi?Welcome L=no
+ Entrez2[0] HTTP_POST www.ncbi.nlm.nih.gov /entrez/eutils/entrez2server.fcgi \
+ C=x-ncbi-data/x-asn-binary L=no
+ Entrez2BLAST[0] HTTP_POST www.ncbi.nlm.nih.gov /entrez/eutils/entrez2server.cgi \
+ C=x-ncbi-data/x-asn-binary L=yes
+ CddSearch [0] HTTP_POST www.ncbi.nlm.nih.gov /Structure/cdd/c_wrpsb.cgi \
+ C=application/x-www-form-urlencoded L=no
+ CddSearch2 [0] HTTP_POST www.ncbi.nlm.nih.gov /Structure/cdd/wrpsb.cgi \
+ C=application/x-www-form-urlencoded L=no
+ StrucFetch [0] HTTP_POST www.ncbi.nlm.nih.gov /Structure/mmdb/mmdbsrv.cgi \
+ C=application/x-www-form-urlencoded L=no
+ bounce[60]HTTP /Service/bounce.cgi L=no C=x-ncbi-data/x-unknown
+ # Services of old dispatcher
+ bounce[60]NCBID '' L=yes C=x-ncbi-data/x-unknown | \
+ ..../web/public/htdocs/Service/bounce
+
+NCBI intranet users can also visit the following link to get a sample configuration file:
+
+
+
+
+
+### Database Load Balancing
+
+Database load balancing is an important part of the overall load balancing function. Please see the [Database Load Balancer](ch_dbapi.html#ch_dbapi.Database_loadbalanci) section in the [Database Access](ch_dbapi.html) chapter for more details.
+
+
+
+### Cookie / Argument Affinity Module (MOD\_CAF)
+
+
+
+#### Overview
+
+The cookie / argument affinity module (the CAF module in the further discussion) helps to virtualize and to dispatch a web site by modifying the way Apache resolves host names. This is done by superseding the conventional `gethostbyname*()` API. The CAF module is implemented as an Apache web server module and uses the data collected by the LBSMD daemon to decide how to dispatch a request. The data exchange between the CAF module and the LBSMD daemon is done via a shared memory segment, as shown in the figure below.
+
+[](/book/static/img/CAF-LBSMD.gif "Click to see the full-resolution image")
+
+Figure 14. CAF Module and LBSMD daemon data exchange
+
+The LBSMD daemon stores all the collected data in a shared memory segment and the CAF module is able to read data from that segment.
+
+The CAF module looks for special cookies and query line arguments, and analyses the LBSMD daemon data to resolve special names which can be configured in ProxyPass directives of mod\_proxy.
+
+The CAF module maintains a list of proxy names, cookies, and arguments (either 4 predefined, see below, or set forth via Apache configuration file by CAF directives) associated with cookies. Once a URL is translated to the use of one of the proxies (generally, by ProxyPass of mod\_proxy) then the information from related cookie (if any) and argument (if any) is used to find the best matching real host that corresponds to the proxy. Damaged cookies and arguments, if found in the incoming HTTP request, are ignored.
+
+By a proxy, a special host name is meant: the name contains a label, followed by the string ".lb", followed by an optional domain part. Such names trigger the gethostbyname() substitute, supplied by the module, to consult the load-balancing daemon's table, and to use both the constraints on the arguments and the preferred host information, found in the query string and the cookie, respectively.
+
+For example, the name "pubmed.lb.nlm.nih.gov" is an LB proxy name, which would be resolved by looking for special DNS services ("pubmed\_lb" in this example) provided by the LBSMD daemon. Argument matching (see also a separate section below) is done by searching the host environment of target hosts (corresponding to the LB proxy name) as supplied by the LBSMD daemon. That is, "db=PubMed" (to achieve PubMed database affinity) in the query that transforms into a call to an LB proxy, which in turn is configured to use the argument "DB", instructs to search only those target hosts that declare the proxy and have "db=... PubMed ..." configured in their LBSMD environments (and yet to remember to accommodate, if it is possible, a host preference from the cookie, if any found in the request).
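+
+As a purely illustrative sketch (the URL paths and the backend are made up; the proxy name, cookie name, and argument follow the examples used in this section), the corresponding Apache configuration could combine mod\_proxy and CAF directives like this:
+
+    ProxyPass /entrez/ http://pubmed.lb/entrez/
+    CAFProxyCookie   pubmed.lb MyPubMedCookie
+    CAFProxyArgument pubmed.lb DB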
+
+The CAF module also detects internal requests and allows them to use the entire set of hosts that the LB names are resolved to. For external requests, only hosts whose DNS services are not marked local (L=yes, or implicitly, by lacking the "-i" flag in the LBSMD daemon launch command) will be allowed to serve requests. "HTTP\_CAF\_PROXIED\_HOST" environment is supplied (by means of an HTTP header tag named "`CAF-Proxied-Host`") to contain the address of the actual host that posted the request. Impostor's header tags (if any) of this name are always stripped, so that backends always have correct information about the requesters. Note that all internal requests are trusted, so that an internal resource can make a request to execute on behalf of an outside client by providing its IP in the "`Client-Host`" HTTP header. The "`Client-Host`" tag gets through for internal requests only; to maintain security the tag is dropped for all external requests.
+
+The CAF module has its own status page that can be made available in a look somewhat resembling the Apache status page. The status can be either raw or HTML-formatted, and the latter can also be sorted using the columns of interest. Stats are designed to be fast, but sometimes inaccurate (to avoid interlocking, and thus latencies in request processing, there are no mutexes being used except for the table expansion). Stats are accumulated between server restarts (and for Apache 2.0 can survive graceful restarts, too). When the stat table is full (since it has a fixed size), it is cleaned in a way to make room for 1% of its capacity, yet trying to preserve most of the recent activity as well as the most heavily used stats from the past. There are two cleaning algorithms currently implemented, which can be somewhat tuned by means of the `CAFexDecile`, `CAFexPoints`, and `CAFexSlope` directives described below.
+
+The CAF module can also report the number of slots that the Apache server has configured and used up each time a new request comes in and is being processed. The information resides in a shared memory segment that several Apache servers can use cooperatively on the same machine. Formerly, this functionality has been implemented in a separate SPY module, which is now fully integrated into this module. Using a special compile-time macro it is possible to obtain the former SPY-only functionality (now called LBSMD reporter feature) without any other CAF features. Note that no CAF\* directives will be recognized in Apache configuration, should the reduced functionality build be chosen.
+
+
+
+#### Configuration
+
+The table below describes Apache configuration directives which are taken into account by the CAF module.
+
+
+
+| Directive | Description |
+|-------------------------------------------------------|---------------------------------------------------------------------------|
+| LBSMD { On \\| Off } | It can appear outside any paired section of the configuration file, and enables ["On", default in mod\_spy mode] or disables ["Off", default in full-fledged mod\_caf mode] the LBSMD reporter feature. When the module is built exclusively with the LBSMD reporter feature, this is the only directive, which is available for the use by the module. Please note that the directive is extremely global, and works across configuration files. Once "Off" is found throughout the configuration, it takes the effect. |
+| CAF { On \\| Off } | It can appear outside any paired section of the configuration file, and enables ["On", default] or disables ["Off"] the entire module. Please note that this directive is extremely global, and works across Apache configuration files, that is the setting "Off" anywhere in the configuration causes the module to go out of business completely. |
+| CAFQAMap name path | It can appear outside any paired section of the configuration file but only once in the entire set of the configuration files per "name", and if used, defines a path to the map file, which is to be loaded at the module initialization phase (if the path is relative, it specifies the location with respect to the daemon root prefix as defined at the time of the build, much like other native configuration locations do). The file is a text, line-oriented list (w/o line continuations). The pound symbol (\#) at any position introduces a comment (which is ignored by the parser). Any empty line (whether resulted from cutting off a comment, or just blank by itself) is skipped. Non-empty lines must contain a pair of words, delimited by white space(s) (that is, tab or space character(s)). The first word defines an LB group that is to be replaced with the second word, in the cases when the first word matches the LB group used in proxy passing of an internally-originating request. The matching is done by previewing a cookie named "name" that should contain a space-separated list of tokens, which must comprise a subset of names loaded from the left-hand side column of the QA file. Any unmatched token in the cookie will result the request to fail, so will do any duplicate name. That is, if the QA map file contains a paired rule "tpubmed tpubmedqa", and an internal (i.e. originating from within NCBI) request has the NCBIQA cookie listing "tpubmed", then the request that calls for use of the proxy-pass "tpubmed.lb" will actually use the name "tpubmedqa.lb" as if it appeared in the ProxyPass rule of mod\_proxy. Default is not to load any QA maps, and not to proceed with any substitutions. Note that if the module is disabled (CAF Off), then the map file, even if specified, need not to exist, and won't be loaded. |
+| CAFFailoverIP address | It defines hostname / IP to return on LB proxy names that cannot be resolved. Any external requests and local ones, in which argument affinity has to be taken into account, will fall straight back to use this address whenever the LB name is not known or LBSMD is not operational. All other requests will be given a chance to use regular DNS first, and if that fails, then fall back to use this IP. When the failover IP address is unset, a failed LB proxy name generally causes the Apache server to throw either "Bad gateway" (502) or "Generic server error" (500) to the client. This directive is global across the entire configuration, and the last setting takes the actual effect. |
+| CAFForbiddenIP address | It is similar to CAFFailoverIP described above yet applies only to the cases when the requested LB DNS name exists but cannot be returned as it would cause the name access violation (for example, an external access requires an internal name to be used to proxy the request). Default is to use the failover IP (as set by CAFFailoverIP), if available. |
+| CAFThrottleIP address | It is similar to CAFFailoverIP described above but applies only to abusive requests that should be throttled out. Despite this directive exists, the actual throttling mechanism is not yet in production. Default is to use the failover IP (as set by CAFFailoverIP), if available. |
+| CAFBusyIP address | It is similar to CAFFailoverIP described above but gets returned to clients when it is known that the proxy otherwise serving the request is overloaded. Default is to use the failover IP, if available. |
+| CAFDebug { Off \\| On \\| 2 \\| 3 } | It controls whether to print none ("Off"), some ("On"), more ("2"), or all ("3") debugging information into Apache log file. Per-request logging is automatically on when debugging is enabled by the native LogLevel directive of Apache (LogLevel debug), or with a command line option -e (Apache 2). This directive controls whether mod\_caf produces additional logging when doing maintenance cleaning of its status information (see CAFMaxNStats below). Debug level 1 (On) produces cleanup synopsis and histogram, level 2 produces per-stat eviction messages and the synopsis, and debug level 3 is a combination of the above. Default is "Off". The setting is global, and the last encounter has the actual effect. NOTE: per-stat eviction messages may cause latencies in request processing; so debug levels "2" and "3" should be used carefully, and only when actually needed. |
+| CAFTiming { Off \\| On \\| TOD } | It controls whether the module timing profile is done while processing requests. For this to work, though, CAFMaxNStats must first enable collection of statistics. Module's status page then will show how much time is being spent at certain stages of a request processing. Since proxy requests and non-proxy requests are processed differently they are accounted separately. "On" enables to make the time marks using the gettimeofday(2) syscall (accurate up to 1us) without reset upon each stat cleanup (note that tick count will wrap around rather frequently). Setting "TOD" is same as "On" but extends it so that counts do get reset upon every cleanup. Default is "Off". The setting is global, and the last encounter in the configuration file has the actual effect. |
+| CAFMaxNStats number | The number defines how many statistics slots are allocated for CAF status (aka CAF odometer). Value "0" disables the status page altogether. Value "-1" sets the default number of slots (which currently corresponds to the value of 319). Note that the number only sets a lower bound, and the actual number of allocated slots may be automatically extended to occupy a whole number of pages (so that no "memory waste" occurs). The actual number of stats (and memory pages) is printed to the log file. To access the status page, a special handler must be installed for a designated location, as in the following example: `<Location ...>` `SetHandler CAF-status` `Order deny,allow` `Deny from all` `Allow from 130.14/16` `</Location>` 404 (Document not found) gets returned from the configured location if the status page has been disabled (number=0), or if it malfunctions. This directive is global across the entire configuration, and the last found setting takes the actual effect. CAF stats can survive server restarts [graceful and plain "restart"], but not stop / start triggering sequence. Note: "CAF Off" does not disable the status page if it has been configured before -- it just becomes frozen. So [graceful] restart with "CAF Off" won't prevent gaining access to the status page, although the rest of the module will be rendered inactive. |
+| CAFUrlList url1 url2 ... | By default, CAF status does not distinguish individual CGIs as they are being accessed by clients. This option allows separating statistics on a per-URL basis. Care must be taken to remember of "combinatorial explosion", and thus the appropriate quantity of stats is to be pre-allocated with CAFMaxNStats if this directive is used, or else the statistics may renew too often to be useful. Special value "\*" allows to track every (F)CGI request by creating individual stat entries for unique (F)CGI names (with or without the path part, depending on a setting of CAFStatPath directive, below). Otherwise, only those listed are to be accounted for, leaving all others to accumulate into a nameless stat slot. URL names can have .cgi or .fcgi file name extensions. Alternatively, a URL name can have no extension to denote a CGI, or a trailing period (.) to denote an FCGI. A single dot alone (.) creates a specially named stat for all non-matching CGIs (both .cgi or .fcgi), and collects all other non-CGI requests in a nameless stat entry. (F)CGI names are case sensitive. When path stats are enabled (see CAFStatPath below), a relative path entry in the list matches any (F)CGI that has the trailing part matching the request (that is, "query.fcgi" matches any URL that ends in "query.fcgi", but "/query.fcgi" matches only the top-level ones). There is an internal limit of 1024 URLs that can be explicitly listed. Successive directives add to the list. A URL specified as a minus sign alone ("-") clears the list, so that no urls will be registered in stats. This is the default. This directive is only allowed at the top level, and applies to all virtual hosts. |
+| CAFUrlKeep url1 url2 ... | CAF status uses a fixed-size array of records to store access statistics, so whenever the table gets full, it has to be cleaned up by dropping some entries, which have not been updated too long, have fewer count values, etc. The eviction algorithm can be controlled by CAFexDecile, CAFexPoints, and CAFexSlope directives, described below, but even when finely tuned, can result in some important entries being pre-emptied, especially when per-URL stats are enabled. This directive helps avoid losing the important information, regardless of other empirical characteristics of a candidate-for-removal. The directive, like CAFUrlList above, lists individual URLs which, once recorded, have to be persistently kept in the table. Note that as a side effect, each value (except for "-") specified in this directive implicitly adds an entry as if it were specified with CAFUrlList. Special value "-" clears the keep list, but does not affect the URL list, so specifying "CAFUrlKeep a b -" is same as specifying "CAFUrlList a b" alone, that is, without obligation for CAF status to keep either "a" or "b" permanently. There is an internal limit of 1024 URLs that can be supplied by this directive. Successive uses add to the list. The directive is only allowed at the top level, and applies to all virtual hosts. |
+| CAFexDecile digit | It specifies the top decile(s) of the total number of stat slots, sorted by the hit count and subject for expulsion, which may not be made available for stat's cleanup algorithms should it be necessary to arrange a new slot by removing old/stale entries. Decile is a single digit 0 through 9, or a special value "default" (which currently translates to 1). Note that each decile equals 10%. |
+| CAFexPoints { value \\| percentage% } | The directive specifies how many records, as an absolute value, or as a percentage of total stat slots, are to be freed each time the stat table gets full. Keyword "default" also can be used, which results in eviction of 1% of all records (or just 1 record, whatever is greater). Note that if CAFUrlKeep is in use, the cleanup may not be always possible. The setting is global and the value found last takes the actual effect. |
+| CAFexSlope { value \\| "quad" } | The directive can be used to modify cleanup strategy used to vacate stat records when the stat table gets full. The number of evicted slots can be controlled by CAFexPoints directive. The value, which is given by this directive, is used to plot either circular ("quad") or linear (value \>= 0) plan of removal. The linear plan can be further fine-tuned by specifying a co-tangent value of the cut-off line over a time-count histogram of statistics, as a binary logarithm value, so that 0 corresponds to the co-tangent of 1 (=2^0), 1 (default) corresponds to the co-tangent of 2 (=2^1), 2 - to the co-tangent of 4 (=2^2), 3 - to 8 (=2^3), and so forth, up to a maximal feasible value 31 (since 2^32 overflows an integer, this results in the infinite co-tangent, causing a horizontal cut-off line, which does not take into account times of last updates, but counts only). The default co-tangent (2) prices the count of a stats twice higher than its longevity. The cleanup histogram can be viewed in the log if CAFDebug is set as 2 (or 3). The setting is global and the value found last takes the actual effect. |
+| CAFStatVHost { Off \\| On } | It controls whether VHosts of the requests are to be tracked on the CAF status page. By default, VHost separation is not done. Note that preserving graceful restart of the server may leave some stats VHost-less, when switching from VHost-disabled to VHost-enabled mode, with this directive. The setting is global and the setting found last has the actual effect. |
+| CAFStatPath { On \\| Off } | It controls whether the path part of URLs is to be stored and shown on the CAF status page. By default, the path portion is stripped. Keep in mind the relative path specifications as given in CAFUrlList directive, as well as the number of possible combinations of Url/VHost/Path, that can cause frequent overflows of the status table. When CAFStatPath is "Off", the path elements are stripped from all URLs provided in the CAFUrlList directive (and merging the identical names, if any result). This directive is global, and the setting found last having the actual effect. |
+| CAFOkDnsFallback { On \\| Off } | It controls whether it is okay to fall back to consulting regular DNS on the unresolved names, which are not constrained with locality and/or affinities. Since the shutdown of SERVNSD (which provided the fake .lb DNS from the load balancer), fallback to system DNS looks painfully slow (as it now has, in the absence of the DNS server, to reach the timeout), so the default for this option is "Off". The setting is global, and the value found last takes the actual effect. |
+| CAFNoArgOnGet { On \\| Off } | It can appear outside any paired section of the configuration, "On" sets to ignore argument assignment in GET requests that don't have explicit indication of the argument. POST requests are not affected. Default is "Off", VHost-specific. |
+| CAFArgOnCgiOnly { On \\| Off } | It controls whether argument is taken into account when an FCGI or CGI is being accessed. Default is "Off". The setting is per-VHost specific. |
+| CAFCookies { Cookie \\| Cookie2 \\| Any } | It instructs what cookies to search for: "Cookie" stands for RFC2109 cookies (aka Netscape cookies), this is the default. "Cookie2" stands for new RFC2965 cookies (new format cookies). "Any" allows searching for both types of cookies. This is a per-server option that is not shared across virtual host definitions, and is allowed only outside any paired section (such as a virtual-host or location container). Note that, according to the standard, cookie names are not case-sensitive. |
+| CAFArgument argument | It defines the argument name to look for in the URLs. There is no default. If set, the argument becomes the default for any URL and also for proxies whose arguments are not explicitly set with CAFProxyArgument directives. The argument is special case-sensitive: first, it is looked up "as-is" and, if that fails, then in all uppercase. This directive can appear outside any \ or \ and applies to virtual hosts (if any) independently. |
+| CAFHtmlAmp { On \\| Off } | It can appear outside any paired section of the configuration; setting it to "On" enables recognizing `&amp;` as the ampersand character in request URLs (caution: `&amp;` in URLs is not standard-conforming). Default is "Off", VHost-specific. |
+| CAFProxyCookie proxy cookie | It establishes a correspondence between LB DNS named proxy and a cookie. For example, "CAFProxyCookie pubmed.lb MyPubMedCookie" defines that "MyPubMedCookie" should be searched for preferred host information when "pubmed.lb" is being considered as a target name for proxying the incoming request. This directive can appear anywhere in configuration, but is hierarchy complying. |
+| CAFProxyNoArgOnGet proxy { On \\| Off \\| Default } | The related description can be seen at the CAFNoArgOnGet directive description above. The setting applies only to the specified proxy. "Default" (default) is to use the global setting. |
+| CAFProxyArgOnCgiOnly proxy { On \\| Off \\| Default } | The related description can be seen at the CAFArgOnCgiOnly directive description above. The setting applies only to the specified proxy. "Default" (default) is to use the global setting. |
+| CAFProxyArgument proxy argument | It establishes a correspondence between LB DNS named proxy and a query line argument. This directive overrides any default that might have been set with global "CAFArgument" directive. Please see the list of predefined proxies below. The argument is special case sensitive: first, it is looked up "as-is" and, if that fails, in all uppercase then. The first argument occurrence is taken into consideration. It can appear anywhere in configuration, but is hierarchy complying. |
+| CAFProxyAltArgument proxy altargument | It establishes a correspondence between an LB DNS named proxy and an alternate query line argument. The alternate argument (if defined) is used to search the query string (case-insensitively) for the argument value, but the value is treated as if it had appeared for the argument set forth by the CAFProxyArgument or CAFArgument directives for the location in question. If no alternate argument value is found, the regular argument search is performed. Please see the list of predefined proxies below. It can appear anywhere in the configuration, but is hierarchy complying, and should apply to existing proxies only. Altargument "-" deletes the alternate argument (if any). Note again that, unlike the regular proxy argument (set forth by either the CAFArgument (globally) or CAFProxyArgument (per-proxy) directives), the alternate argument is entirely case-insensitive. |
+| CAFProxyDelimiter proxy delimiter | It sets a one-character delimiter that separates the host[:port] field in the cookie, corresponding to the proxy, from some other following information, which is not pertinent to the cookie affinity business. Default is '\\|'. No separation is performed on a cookie that does not have the delimiter -- it is then treated as if found past the end of the line. It can appear anywhere in the configuration, but is hierarchy complying. |
+| CAFProxyPreference proxy preference | It sets a preference (floating point number from the range [0..100]) that the proxy would have if a host matching the cookie is found. The preference value 0 selects the default value which is currently 95. It can appear anywhere in configuration, but is hierarchy complying. |
+| CAFProxyCryptKey proxy key | It sets a crypt key that should be used to decode the cookie. Default is the key preset when a cookie correspondence is created [via either "CAFProxyCookie" or "CAFProxyArgument"]. To disable cookie decrypting (e.g. if the cookie comes in as a plain text) use "". Can appear anywhere in configuration, but is hierarchy complying. |
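+
+For illustration, a few of the global directives from the table above might appear in the web server configuration as follows (the particular values are made up):
+
+    CAFArgument      db
+    CAFCookies       Any
+    CAFOkDnsFallback Off
+    CAFStatVHost     On
+    CAFDebug         2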
+
+
+
+All hierarchy complying settings are inherited in directories that are deeper in the directory tree, unless overridden there. The new setting then takes effect for that and all descendant directories/locations.
+
+There are 4 predefined proxies that may be used [or operated on] without prior declaration by either "CAFProxyCookie" or "CAFProxyArgument" directives:
+
+
+
+|------------|-----------------|------------|-----------|----------|----------|----------|
+| LB name | CookieName | Preference | Delimiter | Crypted? | Argument | AltArg |
+| tpubmed.lb | LB-Hint-Pubmed | 95 | \\| | yes | db | \ |
+| eutils.lb | LB-Hint-Pubmed | 95 | \\| | yes | db | DBAF |
+| mapview.lb | LB-Hint-MapView | 95 | \\| | yes | \ | \ |
+| blastq.lb | LB-Hint-Blast | 95 | \\| | yes | \ | \ |
+
+
+
+***NOTE***: The same cookie can be used to tie up an affinity for multiple LB proxies. On the other hand, LB proxy names are all unique throughout the configuration file.
+
+***NOTE***: It is very important to keep in mind that arguments and alt-arguments are treated differently, case-wise. Alt-args are case-insensitive, and are screened before the main argument (but appear as if the main argument has been found). On the other hand, main arguments are special case-sensitive, and are checked twice: "as is" first, then in all CAPs. So having both "DB" for the alt-argument and "db" for the main one hides the main argument, and actually makes it case-insensitive. CAF will warn on some occasions when it detects that argument overloading is about to happen (take a look at the logs).
+
+The CAF module is also able to detect if a request comes from a local client. The `/etc/ncbi/local_ips` file describes the rules for making the decision.
+
+The file is line-oriented, i.e., it is supposed to have one IP spec per line. Comments are introduced by either "\#" or "!", no continuation lines are allowed, and empty lines are ignored.
+
+An IP spec is a word (no embedded whitespace characters) and is either:
+
+- a host name or a legitimate IP address
+
+- a network specification in the form "networkIP / networkMask"
+
+- an IP range (explained below).
+
+A networkIP / networkMask specification can contain an IP prefix for the network (with or without all trailing zeroes present), and the networkMask can be either in CIDR notation or in the form of a full IP address (all 4 octets) expressing contiguous high-bit ranges (all the records below are equivalent):
+
+`130.14.29.0/24` `130.14.29/24` `130.14.29/255.255.255.0` `130.14.29.0/255.255.255.0`
+
+An IP range is an incomplete IP address (that is, having less than 4 full octets) followed by exactly one dot and one integer range, e.g.:
+
+`130.14.26.0-63`
+
+denotes a host range from `130.14.26.0` thru `130.14.26.63` (including the ends),
+
+`130.14.8-9`
+
+denotes a host range from `130.14.8.0` thru `130.14.9.255` (including the ends).
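+
+For illustration, a hypothetical `/etc/ncbi/local_ips` file combining the forms described above might look like this (the host name and addresses are made up):
+
+    # Local clients (example entries only)
+    somehost.ncbi.nlm.nih.gov
+    ! a whole subnet, CIDR notation
+    130.14.29.0/24
+    ! an explicit IP range
+    130.14.26.0-63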
+
+***Note*** that `127/8` gets automatically added, whether or not it is explicitly included in the configuration file. The file loader also warns if it encounters any specifications that overlap each other. A nonexistent (or unreadable) file causes internal hardcoded defaults to be used - a warning is issued in this case.
+
+***Note*** that the IP table file is read once per Apache daemon's life cycle (and it is \*not\* reloaded upon graceful restarts). A complete stop / start sequence should be performed to force the IP table to be reloaded.
+
+
+
+#### Configuration Examples
+
+- To define that "WebEnv" cookie has an information about "pubmed.lb" preference in "/Entrez" and all the descendant directories one can use the following:
+
+
+
+
+ CAFProxyCookie pubmed.lb WebEnv
+ CAFPreference pubmed.lb 100
+
+
+The second directive in the above example sets the preference to 100% -- this is a preference, not a requirement, meaning that using the host from the cookie is the most desirable option, but it is not a blind instruction to go there in every possible case.
+
+- To define a new cookie for some new LB name, the following fragment can be used:
+
+
+
+
+ CAFProxyCookie myname.lb My-Cookie
+ CAFProxyCookie other.lb My-Cookie
+
+
+ CAFProxyCookie myname.lb My-Secondary-Cookie
+
+
+The effect of the above is that "My-Cookie" will be used in LB name searches of "myname.lb" in directory "/SomeDir", but in "/SomeDir/SubDir" and all directories of that branch, "My-Secondary-Cookie" will be used instead. If a URL referred to "/SomeDir/AnotherDir", then "My-Cookie" would still be used.
+
+***Note*** that, at the same time, "My-Cookie" is used everywhere under "/SomeDir" if "other.lb" is being resolved there.
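+
+For reference, a sketch of how such fragments are typically embedded in the web server configuration is shown below; the enclosing container sections are not shown in the fragments above, so the use of Directory sections (rather than, e.g., Location sections) and the quoting are assumptions here:
+
+    <Directory "/SomeDir">
+        CAFProxyCookie myname.lb My-Cookie
+        CAFProxyCookie other.lb  My-Cookie
+    </Directory>
+
+    <Directory "/SomeDir/SubDir">
+        CAFProxyCookie myname.lb My-Secondary-Cookie
+    </Directory>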
+
+- The following fragment disables cookie for "tpubmed.lb" [note that no "CAFProxyCookie" is to precede this directive because "tpubmed.lb" is predefined]:
+
+
+
+ CAFProxyPreference tpubmed.lb 0
+
+- The following directive associates proxy "systems.lb" with argument "ticket":
+
+
+
+ CAFProxyArgument systems.lb ticket
+
+The effect of the above is that if an incoming URL resolves to use "systems.lb", then "ticket", if found in the query string, would be considered for lookup of "systems.lb" with the load-balancing daemon.
+
+
+
+#### Arguments Matching
+
+Suppose that DB=A is a query argument (an explicit DB selection, including just "DB" (a standalone argument, treated as a missing value) and "DB=" (a missing value)). That will cause the following order of precedence in selecting the target host:
+
+
+
+|----------------|-------------------------------------------------------------------------------|
+| Match | Description |
+| DB=A | Best. "A" may be "" to match the missing value |
+| DB=\* | Good. "\*" stands for "any other" |
+| DB not defined | Fair |
+| DB=- | Poor. "-" stands for "missing in the request" |
+| DB=B | Mismatch. It is used for fallbacks only as the last resort |
+
+
+
+No host with an explicit DB assignment (DB=B or DB=-) is being selected above if there is an exclamation point "!" [stands for "only"] in the assignment. DB=~A for the host causes the host to be skipped from selection as well. DBs are screened in the order of appearance, the first one is taken, so "DB=~A A" skips all requests having DB=A in their query strings.
+
+Suppose that there is no DB selection in the request. Then the hosts are selected in the following order:
+
+
+
+|----------------|------------------------------------------------------------------------------|
+| Match | Description |
+| DB=- | Best. "-" stands for "missing from the request" |
+| DB not defined | Good |
+| DB=\* | Fair. "\*" stands for "any other" |
+| DB=B | Poor |
+
+
+
+No host with a non-empty DB assignment (DB=B or DB=\*) is being selected in the above scenario if there is an exclamation point "!" [stands for "only"] in the assignment. DB=~- defined for the host causes the host not to be considered.
+
+The next category of hosts is used only if there are no hosts in the best available category. That is, no "good" matches will ever be used if there are "best" matches available. Moreover, if all "best" matches have been used up but are known to exist, the search fails.
+
+"~" may not be used along with "\*": "~\*" combination will be silently ignored entirety, and will not modify the other specified affinities. Note that "~" alone has a meaning of 'anything but empty argument value, ""'. Also note that formally, "~A" is an equivalent to "~A \*" as well as "~-" is an equivalent to "\*".
+
+
+
+##### Argument Matching Examples
+
+Host affinity
+
+DB=A ~B
+
+makes the host serve requests having either DB=A or DB=\ in their query strings. The host may be used as a failover for requests that have DB=C in them (or no DB at all) if there is no better candidate available. Adding "!" to the affinity line would cause the host not to be used for any requests in which the DB argument is missing.
+
+Host affinity
+
+DB=A -
+
+makes the host serve requests that either have an explicit DB=A in their query strings or have no DB argument at all. Failovers from searches not matching the above may occur. Adding "!" to the line disables the failovers.
+
+Host affinity
+
+DB=- \*
+
+makes the host serve requests that don't have any DB argument in their query strings, or whose DB argument failed to literally match the affinity lines of all other hosts. Adding "!" to the line doesn't change the behavior.
+
+
+
+#### Log File
+
+The CAF module writes its messages into the Apache web server log files.
+
+
+
+#### Monitoring
+
+The status of the CAF modules can be seen via a web interface using the following links:
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+### DISPD Network Dispatcher
+
+
+
+#### Overview
+
+The DISPD dispatcher is a CGI/1.0-compliant program (the actual file name is `dispd.cgi`). Its purpose is mapping a requested service name to an actual server location when the client has no direct access to the LBSMD daemon. This mapping is called dispatching. Optionally, the DISPD dispatcher can also pass data between the client, who requested the mapping, and the server, which implements the service, found as a result of dispatching. This combined mode is called a connection. The client may choose any of these modes if there are no special requirements on data transfer (e.g., firewall connection). In some cases, however, the requested connection mode implicitly limits the request to be a dispatching-only request, and the actual data flow between the client and the server occurs separately at a later stage.
+
+
+
+#### Protocol Description
+
+The dispatching protocol is designed as an extension of HTTP/1.0 and is coded in the HTTP header parts of packets. The request (both dispatching and connection) is done by sending an HTTP packet to the DISPD dispatcher with a query line of the form:
+
+    dispd.cgi?service=<name>
+
+which can be followed by parameters (if applicable) to be passed to the service. The `<name>` defines the name of the service to be used. The other parameters take the form of one or more of the following constructs:
+
+    &<parameter>[=<value>]
+
+where square brackets are used to denote an optional value part of the parameter.
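+
+For illustration only, the header of a dispatching-only request for a hypothetical service might look like the following (the path prefix, service name, and tag values are made up; the tags themselves are described below):
+
+    GET /Service/dispd.cgi?service=bounce&arg1=value1 HTTP/1.0
+    Accepted-Server-Types: STANDALONE HTTP
+    Client-Mode: STATELESS_ONLY
+    Dispatch-Mode: INFORMATION_ONLY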
+
+In case of a connection request the request body can contain data to be passed to the first-found server. A connection to this server is automatically initiated by the DISPD dispatcher. On the contrary, in case of a dispatching-only request, the body is completely ignored, that is, the connection is dropped after the header has been read and then the reply is generated without consuming the body data. That process may confuse an unprepared client.
+
+Mapping of a service name into a server address is done by the LBSMD daemon, which is run on the same host where the DISPD dispatcher is run. The DISPD dispatcher never dispatches a non-local client to a server marked as local-only (by means of L=yes in the configuration of the LBSMD daemon). Otherwise, the result of dispatching is exactly what the client would get from the [service mapping API](ch_conn.html#ch_conn.service_mapping_api) if run locally. By specifying capabilities explicitly, the client can narrow the server search, for example, by choosing stateless servers only.
+
+
+
+##### Client Request to DISPD
+
+The following additional HTTP tags are recognized in the client request to the DISPD dispatcher.
+
+
+
+
+
+
+
+
+
+
+
+|------------------------------------|--------------------------------------------------------------------------------------------------|
+| Tag                                | Description                                                                                        |
+| Accepted-Server-Types: \<list\>    | The \<list\> can include one or more of the following keywords separated by spaces: NCBID, STANDALONE, HTTP, HTTP\_GET, HTTP\_POST, FIREWALL. Each keyword names a server type that the client is capable of handling. The default (when the tag is not present in the HTTP header) is any type. In case of a connection request, the dispatcher will accommodate the actually found server with the connection mode that the client requested, by relaying data appropriately and in a way suitable for the server. ***Note:*** FIREWALL indicates that the client chooses a firewall method of communication. ***Note:*** Some server types can be ignored if they are not compatible with the current client mode. |
+| Client-Mode: \<client-mode\>       | The \<client-mode\> can be one of the following: STATELESS\_ONLY - specifies that the client is not capable of doing full-duplex data exchange with the server in a session mode (e.g., in a dedicated connection); STATEFUL\_CAPABLE - should be used by clients that are capable of holding an opened connection to a server, and serves as a hint for the dispatcher to try to open a direct TCP channel between the client and the server, thus reducing the network usage overhead. The default (when the tag is not present at all) is STATELESS\_ONLY, to support Web browsers. |
+| Dispatch-Mode: \<dispatch-mode\>   | The \<dispatch-mode\> can be one of the following: INFORMATION\_ONLY - specifies that the request is a dispatching request, and no data and/or connection establishment with the server is required at this stage, i.e., the DISPD dispatcher returns only a list of available server specifications (if any) corresponding to the requested service and in accordance with the client mode and server acceptance; NO\_INFORMATION - is used to disable sending the above-mentioned dispatching information back to the client (this keyword is reserved solely for internal use by the DISPD dispatcher and should not be used by applications); STATEFUL\_INCLUSIVE - informs the DISPD dispatcher that the current request is a connection request, and because it is going over HTTP it is treated as stateless, so dispatching would supply stateless servers only; this keyword modifies the default behavior so that the dispatching information sent back along with the server reply (resulting from data exchange) includes stateful servers as well, allowing the client to go to a dedicated connection later; OK\_DOWN, OK\_SUPPRESSED, or PROMISCUOUS - defines a dispatch-only request without actual data transfer, for the client to obtain a list of servers that otherwise would not be included, such as currently down servers (OK\_DOWN), servers currently suppressed by having 100% penalty (OK\_SUPPRESSED), or both (PROMISCUOUS). The default (in the absence of this tag) is a connection request, and because it is going over HTTP, it is automatically considered stateless; this is to support calls for NCBI services from Web browsers. |
+| Skip-Info-\<n\>: \<server-info\>   | A number of \<server-info\> strings can be passed to the DISPD dispatcher to exclude the listed servers from being potential mapping targets (in case the client knows that those servers either do not work or are not appropriate). Skip-Info tags are enumerated by consecutive numerical suffixes (\<n\>), starting from 1. These tags are optional and should only be used if the client believes that certain servers do not match the search criteria; otherwise, the client may end up with an unsuccessful mapping. |
+| Client-Host: \<host\>              | The tag is used by the DISPD dispatcher internally to identify the \<host\> where the request comes from, in case relaying is involved. Although the DISPD dispatcher effectively disregards this tag if the request originates from outside NCBI (and thus it cannot be easily fooled by address spoofing), in-house applications should not use this tag when connecting to the DISPD dispatcher, because the tag is trusted and considered within the NCBI Intranet. |
+| Server-Count: {N \\| ALL}          | The tag defines how many server infos to include per response (default N=3; ALL causes everything to be returned at once). N is an integer and ALL is a keyword. |
+
+
+
+
+
+
+
+
+##### DISPD Client Response
+
+The DISPD dispatcher can produce the following HTTP tags in response to the client.
+
+
+
+|-------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Tag | Description |
+| `Relay-Path: ` | The tag shows how the information was passed along by the DISPD dispatcher and the NCBID utility. This is essential for debugging purposes |
+| `Server-Info-: ` | The tag(s) (enumerated increasingly by suffix ``, starting from 1) give a list of servers, where the requested service is available. The list can have up to five entries. However, there is only one entry generated when the service was requested either in firewall mode or by a Web browser. For a non-local client, the returned server descriptors can include ***FIREWALL*** server specifications. Despite preserving information about host, port, type, and other (but not all) parameters of the original servers, ***FIREWALL*** descriptors are not specifications of real servers, but they are created on-the-fly by the DISPD dispatcher to indicate that the connection point of the server cannot be otherwise reached without the use of either firewalling or relaying. |
+| `Connection-Info: ` | The tag is generated in a response to a stateful-capable client and includes a host (in a dotted notation) and a port number (decimal value) of the connection point where the server is listening (if either the server has specifically started or the FWDaemon created that connection point because of the client's request). The ticket value (hexadecimal) represents the 4-byte ticket that must be passed to the server as binary data at the very beginning of the stream. If instead of a host, a port, and ticket information there is a keyword ***TRY\_STATELESS***, then for some reasons (see `Dispatcher-Failures` tag below) the request failed but may succeed if the client would switch into a stateless mode. |
+| `Dispatcher-Failures: ` | The tag value lists all transient failures that the dispatcher might have experienced while processing the request. A fatal error (if any) always appears as the last failure in the list. In this case, the reply body would contain a copy of the message as well. ***Note:*** Fatal dispatching failure is also indicated by an unsuccessful HTTP completion code. |
+| `Used-Server-Info-n: ` | The tag informs the client of server infos that have been used unsuccessfully during the current connection request (so that the client will be able to skip over them if it needs to). `n` is an integral suffix, enumerated from 1. |
+| `Dispatcher-Messages:` | The tag is used to issue a message into the standard error log of a client. The message is intercepted and delivered from within the Toolkit HTTP API. |
+
+
+
+
+
+##### Communication Schemes
+
+After making a dispatching request and using the dispatching information returned, the client can usually connect to the server on its own. Sometimes, however, the client has to connect to the DISPD dispatcher again to proceed with communication with the server. For the DISPD dispatcher this would then be a connection request which can go one of two similar ways, relaying and firewalling.
+
+The figures (Figure 7 and Figure 8) provided at the very beginning of the “Load Balancing” chapter can be used for a better understanding of the communication schemes described below.
+
+- In the relay mode, the DISPD dispatcher passes data from the client to the server and back, playing the role of a middleman. Data relaying occurs when, for instance, a Web browser client wants to communicate with a service governed by the DISPD dispatcher itself.
+
+- In the firewall mode, DISPD sends out only the information about where the client has to connect to communicate with the server. This connection point and a verifiable ticket are specified in the `Connection-Info` tag in the reply header. ***Note:*** firewalling actually pertains only to the stateful-capable clients and servers.
+
+The firewall mode is selected by the presence of the ***FIREWALL*** keyword in the `Accepted-Server-Types` tag set by the client sitting behind a firewall and not being able to connect to an arbitrary port.
+
+These are scenarios of data flow between the client and the server, depending on the “stateness” of the client:
+
+A. Stateless client
+
+1. Client is **not using firewall** mode
+
+    - The client has to connect to the server on its own, using dispatching information obtained earlier; or
+
+ - The client connects to the DISPD dispatcher with a connection request (e.g., the case of Web browsers) and the DISPD dispatcher facilitates data relaying for the client to the server.
+
+2. If the client chooses to use the firewall mode then the only way to communicate with the server is to connect to the DISPD dispatcher (making a connection request) and use the DISPD dispatcher as a relay.
+
+***Note:*** Even if the server is stand-alone (but lacks S=yes in the configuration file of the LBSMD daemon), the DISPD dispatcher initiates a microsession to the server and wraps its output into an HTTP/1.0-compliant reply. Data from both HTTP and NCBID servers are simply relayed one-to-one.
+
+B. Stateful-capable client
+
+1. A client which is **not using the firewall** mode has to connect directly to the server, using the dispatching information obtained earlier (e.g., with the use of ***INFORMATION\_ONLY*** in the `Dispatch-Mode` tag) if local; for external clients, the connection point is provided by the `Connection-Info` tag (port range 4444-4544).
+
+2. If the firewall mode is selected, then the client has to expect `Connection-Info` to come back from the DISPD dispatcher pointing out where to connect to the server. If ***TRY\_STATELESS*** comes out as a value of the former tag, then the client has to switch into a stateless mode (e.g., by setting ***STATELESS\_ONLY*** in the `Client-Mode` tag) for the request to succeed.
+
+***Note:*** ***TRY\_STATELESS*** could be induced by many reasons, mainly because all servers for the service are stateless ones or because the FWDaemon is not available on the host, where the client's request was received.
+
+***Note:*** Outlined scenarios show that no prior dispatching information is required for a stateless client to make a connection request, because the DISPD dispatcher can always be used as a data relay (in this way, Web browsers can access NCBI services). But for a stateful-capable client to establish a dedicated connection an additional step of obtaining dispatching information must precede the actual connection.
+
+To support requests from Web browsers, which are unaware of the HTTP extensions comprising the dispatching protocol, the DISPD dispatcher considers an incoming request that does not contain input dispatching tags as a connection request from a stateless-only client.
+
+The DISPD dispatcher uses simple heuristics in analyzing an HTTP header to determine whether the connection request comes from a Web browser or from an application (a service connector, for instance). In the case of a Web browser, the chosen data path could be more expensive but more robust, including connection retries if required; with an application, on the contrary, the dispatcher could return an error and delegate the retry to the application.
+
+The DISPD dispatcher always preserves original HTTP tags `User-Agent` and `Client-Platform` when doing both relaying and firewalling.
+
+
+
+### NCBID Server Launcher
+
+
+
+#### Overview
+
+The LBSMD daemon supports services of type NCBID, which are really Unix filter programs that read data from the stdin stream and write the output into the stdout stream without having a common protocol. Thus, HTTP/1.0 was chosen as a framed protocol for wrapping both requests and replies, and the NCBID utility CGI program was created to pass a request from the HTTP body to the server and to put the reply from the server into the HTTP body and send it back to the client. The NCBID utility also provides a dedicated connection between the server and the client, if the client supports the stateful way of communication. Former releases of the NCBID utility were implemented as a separate CGI program; however, the latest releases integrate the NCBID utility and the DISPD dispatcher into a single component (`ncbid.cgi` is a hard link to `dispd.cgi`).
+
+The NCBID utility determines the requested service from the query string in the same way as the DISPD dispatcher does, i.e., by looking into the value of the CGI parameter service. An executable file which has to be run is then obtained by searching the configuration file (shared with the LBSMD daemon; the default name is `servrc.cfg`): the path to the executable along with optional command-line parameters is specified after the bar character ("\|") in the line containing a service definition.
+
+The NCBID utility can work in either of two connection modes, stateless and stateful, as determined by reading the following HTTP header tag:
+
+`Connection-Mode: <connection-mode>`
+
+where `<connection-mode>` is one of the following:
+
+- ***STATEFUL***
+
+- ***STATELESS***
+
+The default value (when the tag is missing) is ***STATELESS*** to support calls from Web browsers.
+
+When the DISPD dispatcher relays data to the NCBID utility this tag is set in accordance with the current client mode.
+
+The ***STATELESS*** mode is almost identical to a call of a conventional CGI program with an exception that the HTTP header could hold tags pertaining to the dispatching protocol, and resulting from data relaying (if any) by the DISPD dispatcher.
+
+In the ***STATEFUL*** mode, the NCBID utility starts the program in a trickier way, which is closer to the firewall mode of the DISPD dispatcher: the NCBID utility loads the program with its stdin and stdout bound to a port, which is switched to listening. The program becomes a sort of Internet daemon (the only exception is that only one incoming connection is allowed). Then the client is sent back an HTTP reply containing the `Connection-Info` tag. The client has to use the port, host, and ticket from that tag to connect to the server by creating a dedicated TCP connection.
+
+***Note***: the NCBID utility never generates ***TRY\_STATELESS*** keyword.
+
+For the sake of backward compatibility, the NCBID utility creates the following environment variables (in addition to the CGI/1.0 environment variables created by the HTTP daemon when calling NCBID) before starting the service executables:
+
+
+
+|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Name | Description |
+| NI\_CLIENT\_IPADDR | The variable contains an IP address of the remote host. It could also be an IP address of the firewall daemon if the NCBID utility was started as a result of firewalling. |
+| NI\_CLIENT\_PLATFORM | The variable contains the client platform extracted from the HTTP tag `Client-Platform` provided by the client if any. |
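+
+For illustration, a service executable started by the NCBID utility could pick these variables up in the usual way; the snippet below is a hypothetical filter program skeleton, not part of the toolkit:
+
+    #include <cstdlib>
+    #include <iostream>
+
+    int main()
+    {
+        // Both variables are set by the NCBID utility (they may be absent otherwise).
+        const char* ip       = std::getenv("NI_CLIENT_IPADDR");
+        const char* platform = std::getenv("NI_CLIENT_PLATFORM");
+        std::cerr << "client ip: "       << (ip       ? ip       : "<unknown>") << ", "
+                  << "client platform: " << (platform ? platform : "<unknown>") << std::endl;
+        // ... read the request from stdin and write the reply to stdout ...
+        return 0;
+    }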
+
+
+
+
+
+### Firewall Daemon (FWDaemon)
+
+
+
+#### Overview
+
+The NCBI Firewall Daemon (FWDaemon) is essentially a network multiplexer listening at an advertised network address.
+
+The FWDaemon works in close cooperation with the DISPD dispatcher, which informs FWDaemon how to connect to the “real” NCBI server and then instructs the network client to connect to FWDaemon (instead of the “real” NCBI server). Thus, the FWDaemon serves as a middleman that just pumps the network traffic from the network client to the NCBI server and back.
+
+The FWDaemon allows a network client to establish a persistent TCP/IP connection to any of the publicly advertised NCBI services, provided that the client is allowed to make an outgoing network connection to any of the following FWDaemon addresses (on front-end NCBI machines):
+
+ ports 5860..5870 at both 130.14.29.112 and 165.112.7.12
+
+***Note:*** One FWDaemon can simultaneously serve many client/server pairs.
+
+
+
+##### FWDaemon Behind a "Regular" Firewall
+
+If a network client is behind a regular firewall, then a system administrator should open the above addresses (only!) for outgoing connections and set the client to "firewall" mode. The network client can then use NCBI network services in the usual way (as if there were no firewall at all).
+
+
+
+##### FWDaemon Behind a "Non-Transparent" Firewall
+
+***Note:*** If a firewall is "non-transparent" (this is an extremely rare case), then a system administrator must "map" the corresponding ports on your firewall server to the advertised FWDaemon addresses (shown above). In this case, you will have to specify the address of your firewall server in the client configuration.
+
+The mapping on your non-transparent firewall server should be similar to the following:
+
+ CONN_PROXY_HOST:5860..5870 --> 130.14.29.112:5860..5870
+
+Please note that there is a port range that might not be presently used by any clients and servers, but it is reserved for future extensions. Nevertheless, it is recommended that you have this range configured on firewalls to allow the applications to function seamlessly in the future.
+
+
+
+#### Monitoring
+
+The FWDaemon could be monitored using the following web page:
+
+
+
+Once the page is loaded into a browser, the user will see the following.
+
+[](/book/static/img/FWDaemonMonitor.gif "Click to see the full-resolution image")
+
+Figure 15. FWDaemon Checking Web Page
+
+By clicking the “Check” button a page similar to the following will appear.
+
+[](/book/static/img/FWDaemonCheckPage.gif "Click to see the full-resolution image")
+
+Figure 16. FWDaemon Presence Check
+
+Users outside the NCBI network can check the connection to the NAT service by following the steps below:
+
+- Run the FWDaemon presence check as described above.
+
+- Take connection properties from any line where the status is “OK”. For example 130.14.29.112:5864
+
+- Establish a telnet session using those connection properties. An example of a connection session is given below (showing a case where the connection was successfully established).
+
+
+
+ > telnet 130.14.29.112 5864
+ Trying 130.14.29.112...
+ Connected to 130.14.29.112.
+ Escape character is '^]'.
+ NCBI Firewall Daemon: Invalid ticket. Connection closed.
+ See http://www.ncbi.nlm.nih.gov/cpp/network/firewall.html.
+ Connection closed by foreign host.
+
+
+
+#### Log Files
+
+The FWDaemon stores its log files at the following location:
+
+`/opt/machine/fwdaemon/log/fwdaemon`
+
+which is usually a link to `/var/log/fwdaemon`.
+
+The file is formed locally on a host where FWDaemon is running.
+
+
+
+#### FWDaemon and NCBID Server Data Exchange
+
+One of the key points in the communication between the NCBID server and the FWDaemon is that the DISPD dispatcher instructs the FWDaemon to expect a new client connection. This instruction is issued as a reaction to a remote client request. It is possible that the remote client requested a service but did not use it. To prevent resource leaking and to facilitate usage monitoring, the FWDaemon keeps track of those requested but unused connections in a special file. The NCBID dispatcher is able to read that file before requesting a new connection from the FWDaemon, and if the client was previously marked as one that left connections unused, then the NCBID dispatcher refuses the connection request.
+
+The data exchange is illustrated in the figure below.
+
+[](/book/static/img/DISPDAndFWDaemon.jpg "Click to see the full-resolution image")
+
+Figure 17. DISPD FWDaemon Data Exchange
+
+The location of the `.dispd.msg` file is detected by the DISPD dispatcher as follows. The dispatcher determines the user name that owns the `dispd.cgi` executable and then looks in that user's home directory for the `.dispd.msg` file. The FWDaemon is run under the same user, and the `.dispd.msg` file is saved by the daemon in its home directory.
+
+
+
+### Launcherd Utility
+
+The purpose of the launcherd utility is to replace the NCBID services on hosts where no Apache server is installed but Unix filter programs still need to be daemonized.
+
+The launcherd utility is implemented as a command line utility which is controlled by command line arguments. The list of accepted arguments can be retrieved with the -h option:
+
+    service1:~> /export/home/service/launcherd -h
+    Usage:
+    launcherd [-h] [-q] [-v] [-n] [-d] [-i] [-p #] [-l file] service command [parameters...]
+     -h   = Print usage information only; ignore anything else
+     -q   = Quiet start [and silent exit if already running]
+     -v   = Verbose logging [terse otherwise]
+     -n   = No statistics collection
+     -d   = Debug mode [do not go daemon, stay foreground]
+     -i   = Internal mode [bind to localhost only]
+     -p # = Port # to listen on for incoming connection requests
+     -l   = Set log file name [use `-' or `+' to run w/o logger]
+    Note: Service must be of type STANDALONE to auto-get the port.
+    Note: Logging to `/dev/null' is treated as logging to a file.
+    Signals: HUP, INT, QUIT, TERM to exit
+
+The launcherd utility accepts the name of the service to be daemonized. Using the service name, the utility checks the LBSMD daemon table and retrieves the port on which the service requests should be accepted. As soon as an incoming request is accepted, the launcherd forks and connects the socket with the standard streams of the service executable.
+
+One of the launcherd utility command line arguments is a path to a log file where the protocol messages are stored.
+
+The common practice for the launcherd utility is to be run by the standard Unix cron daemon. Here is an example of a cron schedule which runs the launcherd utility every 3 minutes:
+
+    # DO NOT EDIT THIS FILE - edit the master and reinstall.
+    # (/export/home/service/UPGRADE/crontabs/service1/crontab
+    #  installed on Thu Mar 20 20:48:02 2008)
+    # (Cron version -- $Id: crontab.c,v 2.13 1994/01/17 03:20:37 vixie Exp $)
+    MAILTO=ncbiduse@ncbi
+    */3 * * * * test -x /export/home/service/launcherd && /export/home/service/launcherd -q -l /export/home/service/bounce.log -- Bounce /export/home/service/bounce >/dev/null
+    MAILTO=grid-mon@ncbi,taxhelp@ncbi
+    */3 * * * * test -x /export/home/service/launcherd && /export/home/service/launcherd -q -l /var/log/taxservice -- TaxService /export/home/service/taxservice/taxservice >/dev/null
+
+
+
+### Monitoring Tools
+
+There are various ways to monitor the services available at NCBI, including generic third-party tools and specific NCBI-developed utilities. The specific utilities are described above in the sections related to the corresponding components.
+
+The system availability and performance can be visualized using the Zabbix software. It can be reached at:
+
+
+
+Another web-based tool to monitor server / service statuses is Nagios. It can be reached at:
+
+[http://nagios.ncbi.nlm.nih.gov](http://nagios.ncbi.nlm.nih.gov/)
+
+
+
+### Quality Assurance Domain
+
+The quality assurance (QA) domain uses the same equipment and the same network as the production domain. Not all the services implemented in the production domain are implemented in the QA one. When a certain service is requested for testing purposes, a service from QA should be called if it is implemented there, or a production one otherwise. The dispatching is implemented transparently. It is done by the CAF module running on the production front ends. To implement that, the CAFQAMap directive is put into the Apache web server configuration file as follows:
+
+`CAFQAMap NCBIQA /opt/machine/httpd/public/conf/ncbiqa.mapping`
+
+The directive above defines the NCBIQA cookie, which triggers the name substitutions found in the `/opt/machine/httpd/public/conf/ncbiqa.mapping` file.
+
+To set the cookie the user can visit the following link:
+
+
+
+A screen similar to the following will appear:
+
+[](/book/static/img/QACookieManager.gif "Click to see the full-resolution image")
+
+Figure 18. QA Cookie Manager.
+
+While connecting to a certain service, the cookie is analyzed by the CAF module, and if the QA cookie is detected, then name mapping is triggered. The mapping is actually a process of replacing one name with another. The replacement rules are stored in the `/opt/machine/httpd/public/conf/ncbiqa.mapping` file. The file content could be similar to the following:
+
+`portal portalqa`
+
+`eutils eutilsqa`
+
+`tpubmed tpubmedqa`
+
+which means to replace `portal` with `portalqa` etc.
+
+Further processing of the request is then done using the substituted name. The process is illustrated in the figure below.
+
+[](/book/static/img/QA.jpg "Click to see the full-resolution image")
+
+Figure 19. NCBI QA
+
+
+
+NCBI Genome Workbench
+---------------------
+
+The NCBI Genome Workbench is an integrated sequence visualization and analysis platform. This application runs on Windows, Unix, and Macintosh OS X.
+
+The following topics are discussed in this section:
+
+- [Design goals](#ch_app.gbench_dg)
+
+- [Design](#ch_app.gbench_design)
+
+
+
+### Design Goals
+
+The primary goal of Genome Workbench is to provide a flexible platform for development of new analytic and visualization techniques. To this end, the application must facilitate easy modification and extension. In addition, we place a large emphasis on cross-platform development, and Genome Workbench should function and appear identically on all supported platforms.
+
+
+
+### Design
+
+The basic design of Genome Workbench follows a modified Model-View-Controller (MVC) architecture. The MVC paradigm provides a clean separation between the data being dealt with (the model), the user's perception of this data (provided in views), and the user's interaction with this data (implemented in controllers). For Genome Workbench, as with many other implementations of the MVC architecture, the View and Controller are generally combined.
+
+Central to the framework is the notion of the data being modeled. The model here encompasses the NCBI data model, with particular emphasis on sequences and annotations. The Genome Workbench framework provides a central repository for all managed data through the static class interface in ***CDocManager***. ***CDocManager*** owns the single instance of the C++ Object Manager that is maintained by the application. ***CDocManager*** marshals individual ***CDocument*** classes to deal with data as the user requests. ***CDocument***, at its core, wraps a ***CScope*** class and thus provides a hook to the object manager.
+
+The View/Controller aspect of the architecture is implemented through the abstract class ***CView***. Each ***CView*** class is bound to a single document. Each ***CView*** class, in turn, represents a view of some portion of the data model or a derived object related to the document. This definition is intentionally vague; for example, when viewing a document that represents a sequence alignment, a sequence in that alignment may not be contained in the document itself but is distinctly related to the alignment and can be presented in the context of the document. In general, the views that use the framework will define a top-level FLTK window; however, a view could be defined to be a CGI context such that its graphical component is a Web browser.
+
+To permit maximal extensibility, the framework delegates much of the function of creating and presenting views and analyses to a series of plugins. In fact, most of the basic components of the application itself are implemented as plugins. The Genome Workbench framework defines three classes of plugins: data loaders, views, and algorithms. Technically, a plugin is simply a shared library defining a standard entry point. These libraries are loaded on demand; the entry point returns a list of plugin factories, which are responsible for creating the actual plugin instances.
+
+Cross-platform graphical development presents many challenges to proper encapsulation. To alleviate a lot of the difficulties seen with such development, we use a cross-platform GUI toolkit (FLTK) in combination with OpenGL for graphical development.
+
+
+
+NCBI NetCache Service
+---------------------
+
+- [What is NetCache?](#ch_app.what_is_netcache)
+
+- [What can NetCache be used for?](#ch_app.what_it_can_be_used)
+
+- [How to use NetCache](#ch_app.getting_started)
+
+ - [The basic ideas](#ch_app.The_basic_ideas)
+
+ - [Setting up your program to use NetCache](#ch_app.Set_up_your_program_to_use_NetCac)
+
+ - [Establish the NetCache service name](#ch_app.Establish_the_NetCache_service_na)
+
+ - [Initialize the client API](#ch_app.Initialize_the_client_API)
+
+ - [Store data](#ch_app.Store_data)
+
+ - [Retrieve data](#ch_app.Retrieve_data)
+
+ - [Samples and other resources](#ch_app.Available_samples)
+
+- [Questions and answers](#ch_app.Questions_and_answers)
+
+
+
+### What is NetCache?
+
+**NetCache** is a service that provides to distributed hosts a reliable and uniform means of accessing temporary storage. Using **NetCache**, distributed applications can store data temporarily without having to manage distributed access or handle errors. Applications on different hosts can access the same data simply by using the unique key for the data.
+
+CGI applications badly need this functionality to store session information between successive HTTP requests. Some session information could be embedded into URLs or cookies; however, this is generally not a good idea because:
+
+- Some data should not be transmitted to the client, for security reasons.
+
+- Both URLs and cookies are quite limited in size.
+
+- Passing data via either cookie or URL generally requires additional encoding and decoding steps.
+
+- It makes little sense to pass data to the client only so it can be passed back to the server.
+
+Thus it is better to store this information on the server side. However, this information cannot be stored locally because successive HTTP requests for a given session are often processed on different machines. One possible way to handle this is to create a file in a shared network directory. But this approach can present problems to client applications in any of the standard operations:
+
+- Adding a blob
+
+- Removing a blob
+
+- Updating a blob
+
+- Automatically removing expired blobs
+
+- Automatically recovering after failures
+
+Therefore, it's better to provide a centralized service that provides robust temporary storage, which is exactly what **NetCache** does.
+
+**NetCache** is load-balanced and has high performance and virtually unlimited scalability. Any Linux, Unix or Windows machine can be a **NetCache** host, and any application can use it. For example, the success with which **NetCache** solves the problem of distributed access to temporary storage enables the [NCBI Grid](ch_grid.html) framework to rely on it for passing data between its components.
+
+
+
+### What can NetCache be used for?
+
+Programs can use **NetCache** for data exchange. For example, one application can put a blob into **NetCache** and pass the blob key to another application, which can then access (retrieve, update, remove) the data. Some typical use cases are:
+
+- Store CGI session info
+
+- Store CGI-generated graphics
+
+- Cache results of computations
+
+- Cache results of expensive DBMS or search system queries
+
+- Pass messages between programs
+
+The diagram below illustrates how **NetCache** works.
+
+[](/book/static/img/NetCache_diagramm.gif "Click to see the full-resolution image")
+
+1. Client requests a named service from the Load Balancer.
+
+2. Load Balancer chooses the least loaded server (on this diagram Server 2) corresponding to the requested service.
+
+3. Load Balancer returns the chosen server to the client.
+
+4. Client connects to the selected **NetCache** server and sends the data to store.
+
+5. **NetCache** generates and returns a unique key which can then be used to access the data.
+
+
+
+### How to use NetCache
+
+All new applications developed within NCBI should use **NetCache** together with the NCBI Load Balancer. It is not recommended to use an unbalanced **NetCache** service.
+
+The following topics explain how to use NetCache from an application:
+
+- [The basic ideas](#ch_app.The_basic_ideas)
+
+- [Set up your program to use NetCache](#ch_app.Set_up_your_program_to_use_NetCac)
+
+- [Establish the NetCache service name](#ch_app.Establish_the_NetCache_service_na)
+
+- [Initialize the client API](#ch_app.Initialize_the_client_API)
+
+- [Store data](#ch_app.Store_data)
+
+- [Retrieve data](#ch_app.Retrieve_data)
+
+- [Samples and other resources](#ch_app.Available_samples)
+
+
+
+#### The basic ideas
+
+A typical **NetCache** implementation involves a load-balanced server daemon (the "service") and one or more clients that access the service through a software interface. See [netcached.ini](http://www.ncbi.nlm.nih.gov/viewvc/v1/trunk/c++/src/app/netcache/netcached.ini?view=log) for descriptions of the **NetCache** server daemon configuration parameters.
+
+Two classes provide an interface to **NetCache** - ***CNetCacheAPI*** and ***CNetICacheClient***. These classes share most of the basic ideas of using **NetCache**, but might be best suited for slightly different purposes. ***CNetCacheAPI*** might be a bit better for temporary storage in scenarios where the data is not kept elsewhere, whereas ***CNetICacheClient*** implements the ***ICache*** interface and might be a bit better for scenarios where the data still exists elsewhere but is also cached for performance reasons. ***CNetCacheAPI*** will probably be more commonly used because it automatically generates unique keys for you and it has a slightly simpler interface. ***CNetCacheAPI*** also supports stream insertion and extraction operators.
+
+There are multiple ways to write data to **NetCache** and read it back, but the basic ideas are:
+
+- **NetCache** stores data in blobs. There are no constraints on the format, and the size can be anything from one byte to "big" - that is, the size is specified using ***size\_t*** and the practical size limit is the lesser of available storage and organizational policy.
+
+- Blob identification is usually associated with a unique purpose.
+
+ - With ***CNetCacheAPI***, a blob is uniquely identified by a key that is generated by **NetCache** and returned to the calling code. Thus, the calling code can limit use of the blob to a given purpose. For example, data can be passed from one instance of a CGI to the next by storing the data in a **NetCache** blob and passing the key via cookie.
+
+ - With ***CNetICacheClient***, blobs are identified by the combination { key, version, subkey, cache name }, which isn't guaranteed to be unique. It is possible that two programs could choose the same combination and one program could change or delete the data stored by the other.
+
+- With ***CNetICacheClient***, the cache name can be specified in the registry and is essentially a convenient way of simulating namespaces.
+
+- When new data is written using a key that corresponds to existing data:
+
+ - API calls that use a buffer pointer replace the existing data.
+
+ - API calls that use a stream or writer append to the existing data.
+
+- Data written with a stream or writer won't be accessible from the **NetCache** server until the stream or writer is deleted or until the writer's ***Close()*** method is called.
+
+- A key must be supplied to retrieve data.
+
+- Blobs have a limited "time-to-live" (TTL).
+
+ - Reading a blob won't delete it - it will be removed automatically when its TTL has expired, or it can be removed explicitly.
+
+ - **NetCache** server daemons can specify a default TTL for their blobs using the `blob_ttl` entry in the `[netcache]` section of [netcached.ini](http://www.ncbi.nlm.nih.gov/viewvc/v1/trunk/c++/src/app/netcache/netcached.ini?view=log). There is no direct way to find the server's default TTL, but you can find it indirectly by creating a blob and calling ***GetBlobInfo()*** on the new blob. For an example of this, see [CSampleNetCacheClient::DemoPutRead()](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/netcache/netcache_client_sample.cpp).
+
+ - Blob lifetime can be prolonged.
+
+ - By default, each time a blob is accessed its lifetime will be extended by the server's default `blob_ttl`. The default prolongation can be overridden by passing a TTL when accessing the blob (the passed value will apply only to that access).
+
+ - By default, the total lifetime of a blob, including all prolongations, will be limited to either 10 times the `blob_ttl` or 30 days, whichever is larger. The default maximum lifetime can be overridden with `max_ttl`.
+
+    - Lifetime prolongation can be disabled by setting the `prolong_on_read` entry to `false` in [netcached.ini](http://www.ncbi.nlm.nih.gov/viewvc/v1/trunk/c++/src/app/netcache/netcached.ini?view=log) (see the configuration sketch after this list).
+
+ - ***Note:*** Calling ***GetBlobSize()*** will prolong a blob's lifetime (unless `prolong_on_read` is `false`), but calling ***GetBlobInfo()*** will not.
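+
+For example, a server-side configuration controlling these lifetimes might look like the following sketch. The parameter names are taken from the text above, but the values are made up, and placing `max_ttl` and `prolong_on_read` in the `[netcache]` section is an assumption here; consult the actual `netcached.ini` for the authoritative parameter list:
+
+    [netcache]
+    blob_ttl        = 3600     ; default blob TTL, in seconds (illustrative value)
+    max_ttl         = 2592000  ; cap on a blob's total lifetime (illustrative value)
+    prolong_on_read = false    ; disable lifetime prolongation on read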
+
+
+
+#### Set up your program to use NetCache
+
+To use **NetCache** from your application, you must use the [NCBI application framework](ch_core.html#ch_core.CNcbiApplication) by deriving your application class from ***CNcbiApplication***. If your application is a CGI, you can derive from ***CCgiApplication***.
+
+You will need at least the following libraries in your `Makefile..app`:
+
+ # For CNcbiApplication-derived programs:
+ LIB = xconnserv xthrserv xconnect xutil xncbi
+
+ # For CCgiApplication-derived programs:
+ LIB = xcgi xconnserv xthrserv xconnect xutil xncbi
+
+ # If you're using CNetICacheClient, also add ncbi_xcache_netcache to LIB.
+
+ # All apps need this LIBS line:
+ LIBS = $(NETWORK_LIBS) $(DL_LIBS) $(ORIG_LIBS)
+
+Your source should include:
+
+    #include <corelib/ncbiapp.hpp>                   // for CNcbiApplication-derived programs
+    #include <cgi/cgiapp.hpp>                        // for CCgiApplication-derived programs
+
+    #include <connect/services/netcache_api.hpp>     // if you use CNetCacheAPI
+    #include <connect/services/neticache_client.hpp> // if you use CNetICacheClient
+
+An even easier way to get a new CGI application started is to use the [new\_project](ch_proj.html#ch_proj.new_project_Starting) script:
+
+ new_project mycgi app/netcache
+
+
+
+#### Establish the NetCache service name
+
+All applications using **NetCache** must use a service name. A service name is essentially just an alias for a group of **NetCache** servers from which the load balancer can choose when connecting the **NetCache** client and server. For applications with minimal resource requirements, the selected service may be relatively unimportant, but applications with large resource requirements may need their own dedicated **NetCache** servers. But in all cases, developers should contact nypk4jvylGujip5ust5upo5nv/ and ask what service name to use for new applications.
+
+Service names must match the pattern `[A-Za-z_][A-Za-z0-9_]*`, must not end in `_lb`, and are not case-sensitive. Limiting the length to 18 characters is recommended, but there is no hard limit.
+
+Service names are typically specified on the command line or stored in the application configuration file. For example:
+
+ [netcache_api]
+ service=the_svc_name_here
+
+
+
+#### Initialize the client API
+
+Initializing the **NetCache** API is extremely easy - simply create a ***CNetCacheAPI*** or ***CNetICacheClient*** object, selecting the constructor that automatically configures the API based on the application registry. Then, define the client name in the application registry using the `client` entry in the `[netcache_api]` section. The client name should be unique if the data is application-specific, or it can be shared by two or more applications that need to access the same data. The client name is added to AppLog entries, so it is helpful to indicate the application in this string.
+
+For example, put this in your source code:
+
+ // To configure automatically based on the config file, using CNetCacheAPI:
+ CNetCacheAPI nc_api(GetConfig());
+
+ // To configure automatically based on the config file, using CNetICacheClient:
+ CNetICacheClient ic_client(CNetICacheClient::eAppRegistry);
+
+and put this in your configuration file:
+
+ [netcache_api]
+ client=your_app_name_here
+
+If you are using ***CNetICacheClient***, you either need to use API methods that take a cache name or, to take advantage of automatic configuration based on the registry, specify a cache name in the `[netcache_api]` section, for example:
+
+ [netcache_api]
+ cache_name=your_cache_name_here
+
+For a complete reference of **NetCache** configuration parameters, please see the [NetCache and NetSchedule](ch_libconfig.html#ch_libconfig.NetCache_and_NetSchedule) section in the Library Configuration chapter.
+
+
+
+#### Store data
+
+There are several ways to save data, whether you're using ***CNetCacheAPI*** or ***CNetICacheClient***.
+
+With all the storage methods, you can supply a "time-to-live" parameter, which specifies how long (in seconds) a blob will be accessible. See the [basic ideas](#ch_app.The_basic_ideas) section for more information on time-to-live.
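+
+As an illustration, here is a minimal sketch of passing a time-to-live when storing a blob through ***CNetICacheClient***; it assumes the ICache-style ***Store()*** overload that accepts a time-to-live argument, and the key, version, and subkey values are placeholders:
+
+    CNetICacheClient ic_client(CNetICacheClient::eAppRegistry);
+
+    const unsigned int ttl = 3600;  // keep the blob accessible for one hour
+    ic_client.Store("my_key", 1, "my_subkey",
+                    message.c_str(), message.size(), ttl);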
+
+
+
+##### Storing data using CNetCacheAPI
+
+If you are saving a new blob using ***CNetCacheAPI***, it will create a unique blob key and pass it back to you. Here are several ways to store data using ***CNetCacheAPI*** (see the [class reference](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNetCacheAPI.html) for additional methods):
+
+ CNetCacheAPI nc_api(GetConfig());
+
+ // Write a simple object (and get the new blob key).
+ key = nc_api.PutData(message.c_str(), message.size());
+
+ // Or, overwrite the data by writing to the same key.
+ nc_api.PutData(key, message.c_str(), message.size());
+
+ // Or, create an ostream (and get a key), then insert into the stream.
+    auto_ptr<CNcbiOstream> os(nc_api.CreateOStream(key));
+ *os << "line one\n";
+ *os << "line two\n";
+ // (data written at stream deletion or os.reset())
+
+ // Or, create a writer (and get a key), then write data in chunks.
+    auto_ptr<IEmbeddedStreamWriter> writer(nc_api.PutData(&key));
+    while (...) {
+        writer->Write(chunk_buf, chunk_size);
+    }
+    // (data written at writer deletion or writer->Close())
+
+
+
+##### Storing data using CNetICacheClient
+
+If you are saving a new blob using ***CNetICacheClient***, you must supply a unique { blob key / version / subkey / cache name } combination. Here are two ways (with the cache name coming from the registry) to store data using ***CNetICacheClient*** (see the [class reference](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNetICacheClient.html) for additional methods):
+
+ CNetICacheClient ic_client(CNetICacheClient::eAppRegistry);
+
+ // Write a simple object.
+ ic_client.Store(key, version, subkey, message.c_str(), message.size());
+
+ // Or, create a writer, then write data in chunks.
+    auto_ptr<IEmbeddedStreamWriter>
+        writer(ic_client.GetNetCacheWriter(key, version, subkey));
+    while (...) {
+        writer->Write(chunk_buf, chunk_size);
+    }
+    // (data written at writer deletion or writer->Close())
+
+
+
+#### Retrieve data
+
+Retrieving data is more or less complementary to storing data.
+
+If an attempt is made to retrieve a blob after its time-to-live has expired, an exception will be thrown.
+
+
+
+##### Retrieving data using CNetCacheAPI
+
+The following code snippet demonstrates three ways of retrieving data using ***CNetCacheAPI*** (see the [class reference](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNetCacheAPI.html) for additional methods):
+
+ // Read a simple object.
+ nc_api.ReadData(key, message);
+
+ // Or, extract words from a stream.
+    auto_ptr<CNcbiIstream> is(nc_api.GetIStream(key));
+    while (!is->eof()) {
+        *is >> message; // get one word at a time, ignoring whitespace
+    }
+
+ // Or, retrieve the whole stream buffer.
+ NcbiCout << "Read: '" << is->rdbuf() << "'" << NcbiEndl;
+
+    // Or, read data in chunks (assuming a reader obtained from the API,
+    // e.g. via nc_api.GetReader()).
+    while (...) {
+        ERW_Result rw_res = reader->Read(chunk_buf, chunk_size, &bytes_read);
+        chunk_buf[bytes_read] = '\0';
+        if (rw_res == eRW_Success) {
+            NcbiCout << "Read: '" << chunk_buf << "'" << NcbiEndl;
+        } else {
+            NCBI_USER_THROW("Error while reading BLOB");
+        }
+    }
+
+
+
+##### Retrieving data using CNetICacheClient
+
+The following code snippet demonstrates two ways to retrieve data using ***CNetICacheClient***, with the cache name coming from the registry (see the [class reference](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNetICacheClient.html) for additional methods):
+
+ // Read a simple object.
+ ic_client.Read(key, version, subkey, chunk_buf, kMyBufSize);
+
+ // Or, read data in chunks.
+ size_t remaining(ic_client.GetSize(key, version, subkey));
+    auto_ptr<IReader> reader(ic_client.GetReadStream(key, version, subkey));
+ while (remaining > 0) {
+ size_t bytes_read;
+ ERW_Result rw_res = reader->Read(chunk_buf, chunk_size, &bytes_read);
+ if (rw_res != eRW_Success) {
+ NCBI_USER_THROW("Error while reading BLOB");
+ }
+ // do something with the data
+ ...
+ remaining -= bytes_read;
+ }
+
+
+
+#### Samples and other resources
+
+Here is a sample client application that demonstrates a variety of ways to use **NetCache**:
+
+[src/sample/app/netcache/netcache\_client\_sample.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/netcache/netcache_client_sample.cpp)
+
+Here is a sample application that uses **NetCache** from a CGI application:
+
+[src/sample/app/netcache/netcache\_cgi\_sample.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/netcache/netcache_cgi_sample.cpp)
+
+Here are test applications for ***CNetCacheAPI*** and ***CNetICacheClient***:
+
+[src/connect/services/test/test\_netcache\_api.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/connect/services/test/test_netcache_api.cpp)
+
+[src/connect/services/test/test\_ic\_client.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/connect/services/test/test_ic_client.cpp)
+
+Please see the [NetCache and NetSchedule](ch_libconfig.html#ch_libconfig.NetCache_and_NetSchedule) section of the Library Configuration chapter for documentation on the **NetCache** configuration parameters.
+
+The `grid_cli` command-line tool (available on both Windows and Unix) provides convenient sub-commands for manipulating blobs, getting their status, checking servers, etc.
+
+You can also email nypk4jvylGujip5ust5upo5nv/ if you have questions.
+
+
+
+### Questions and answers
+
+**Q: What exactly is NetCache's architecture? Is it memory-based (like memcached), or does it use a filesystem/SQL/whatever?**
+
+A: It keeps its database on disk, memory-mapped; it also has a (configurable) "write-back buffer" to use when there is a lot of data coming in and a lot of it gets re-written quickly (this helps avoid thrashing the disk with relatively transient blob versions, when the OS's automatic memory swap mechanism may become sub-optimal).
+
+**Q: Is there an NCBI "pool" of NetCache servers that we can simply tie into, or do we have to set up NetCache servers on our group's own machines?**
+
+A: We usually (except for PubMed) administer NC servers, most of which are shared. Depending on your load (hit rate, blob size distribution, blob lifetime, redundancy, etc.), we can point you to the shared NC servers or create a new NC server pool.
+
+**Q: I assume what's in c++/include/connect/services/\*.hpp is the API to use for a client?**
+
+A: Yes; also try the samples under [src/sample/app/netcache](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/netcache/) - for example:
+
+ new_project pc_nc_client app/netcache
+ cd pc_nc_client
+ make
+ ./netcache_client_sample1 -service NC_test
+ ./netcache_client_sample2 NC_test
+ ./netcache_client_sample3 NC_test
+
+**Q: Is there a way to build in some redundancy, e.g. so that if an individual server/host goes down, we don't lose data?**
+
+A: Yes, you can mirror NC servers, master-master style, including between BETH and COLO sites. Many NC users use mirrored instances nowadays, including PubMed.
+
+**Q: Is there a limit to the size of the data blobs that can be stored?**
+
+A: I have seen 400 MB blobs being written and read there a thousand times a day without incident. We can do experiments to see how your load will be handled. As a general rule, you should ask nypk4jvylGujip5ust5upo5nv/ for guidance when changing your NC usage.
+
+**Q: How is the expiration of BLOBs handled by NetCache? My thinking is coming from two directions. First, I wouldn’t want BLOBs deleted out from under me, but also, if the expiration is too long, I don’t want to be littering the NetCache. That is: do I need to work hard to remove all of my BLOBs or can I just trust the automatic clean-up?**
+
+A: You can specify a "time-to-live" when you create a blob. If you don't specify a value, you can find the service's default value by calling ***GetBlobInfo()***. See the [basic ideas](#ch_app.The_basic_ideas) section for more details.
+
+
diff --git a/pages/ch_app_test.md b/pages/ch_app_test.md
new file mode 100644
index 00000000..e1aba8f7
--- /dev/null
+++ b/pages/ch_app_test.md
@@ -0,0 +1,2557 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: pages/ch_app_test
+---
+
+
+24\. Applications
+===============================
+
+Created: April 1, 2003; Last Update: March 17, 2015.
+
+Overview
+--------
+
+- [Introduction](#ch_app.Intro)
+
+- [Chapter Outline](#ch_app.Outline)
+
+### Introduction
+
+Most of the applications discussed in this chapter are built on a regular basis, at least once a day from the latest sources, and if you are in NCBI, then you can find the latest version in the directory: `$NCBI/c++/Release/bin/` (or `$NCBI/c++/Debug/bin/`).
+
+### Chapter Outline
+
+The following is an outline of the topics presented in this chapter:
+
+- [DATATOOL: code generation and data serialization utility](ch_app.html#ch_app.datatool)
+
+ - [Invocation](ch_app.html#ch_app.datatool.html_refArgs)
+
+ - [Main arguments](ch_app.html#ch_app.datatool.html_refMainArgs)
+
+ - [Code generation arguments](ch_app.html#ch_app.datatool.html_refCodeGenerationAr)
+
+ - [Data specification conversion](ch_app.html#ch_app.Data_Specification_C)
+
+ - [Scope prefixes](ch_app.html#ch_app.Scope_Prefixes)
+
+ - [Modular DTD and Schemata](ch_app.html#ch_app.Modular_DTD_and_Sche)
+
+ - [Converting XML Schema into ASN.1](ch_app.html#ch_app.Converting_XML_Schem)
+
+ - [Definition file](ch_app.html#ch_app.datatool.html_refDefFile)
+
+ - [Common definitions](ch_app.html#ch_app.datatool.html_refDefCommon)
+
+ - [Definitions that affect specific types](ch_app.html#ch_app.datatool.html_refDefSpecific)
+
+ - [INTEGER, REAL, BOOLEAN, NULL](ch_app.html#ch_app.datatool.html_refDefINT)
+
+ - [ENUMERATED](ch_app.html#ch_app.datatool.html_refDefENUM)
+
+ - [OCTET STRING](ch_app.html#ch_app.datatool.html_refDefOCTETS)
+
+ - [SEQUENCE OF, SET OF](ch_app.html#ch_app.datatool.html_refDefArray)
+
+ - [SEQUENCE, SET](ch_app.html#ch_app.datatool.html_refDefClass)
+
+ - [CHOICE](ch_app.html#ch_app.datatool.html_refDefChoice)
+
+ - [The Special [-] Section](ch_app.html#ch_app.The_Special__Section)
+
+ - [Examples](ch_app.html#ch_app.datatool.html_refDefExample)
+
+ - [Module file](ch_app.html#ch_app.ch_app_datatool_html_refModFile)
+
+ - [Generated code](ch_app.html#ch_app.datatool.html_refCode)
+
+ - [Normalized name](ch_app.html#ch_app.datatool.html_refNormalizedName)
+
+ - [ENUMERATED types](ch_app.html#ch_app.datatool.html_refCodeEnum)
+
+ - [Class diagrams](ch_app.html#ch_app.dt_inside.html)
+
+ - [Specification analysis](ch_app.html#ch_app.dt_inside.html_specs)
+
+ - [ASN.1 specification analysis](ch_app.html#ch_app.dt_inside.html_specs_asn)
+
+ - [DTD specification analysis](ch_app.html#ch_app.dt_inside.html_specs_dtd)
+
+ - [Data types](ch_app.html#ch_app.dt_inside.html_data_types)
+
+ - [Data values](ch_app.html#ch_app.dt_inside.html_data_values)
+
+ - [Code generation](ch_app.html#ch_app.dt_inside.html_code_gen)
+
+- [Load Balancing](ch_app.html#ch_app.Load_Balancing)
+
+ - [Overview](ch_app.html#ch_app._Overview)
+
+ - [Load Balancing Service Mapping Daemon (LBSMD)](ch_app.html#ch_app.Load_Balancing_Servi)
+
+ - [Overview](ch_app.html#ch_app._Overview_1)
+
+ - [Configuration](ch_app.html#ch_app._Configuration)
+
+ - [Check Script Specification](ch_app.html#ch_app.Check_Script_Specification)
+
+ - [Server Descriptor Specification](ch_app.html#ch_app.Server_Descriptor_Specification)
+
+ - [Signals](ch_app.html#ch_app.Signals)
+
+ - [Automatic Configuration Distribution](ch_app.html#ch_app.Automatic_Configurat)
+
+ - [Monitoring and Control](ch_app.html#ch_app.Monitoring_and_Contr)
+
+ - [Service Search](ch_app.html#ch_app.Service_Search)
+
+ - [lbsmc Utility](ch_app.html#ch_app.lbsmc_Utility)
+
+ - [NCBI Intranet Web Utilities](ch_app.html#ch_app.NCBI_Intranet_Web_Ut)
+
+ - [Server Penalizer API and Utility](ch_app.html#ch_app.Server_Penalizer_API)
+
+ - [SVN Repository](ch_app.html#ch_app.SVN_Repository)
+
+ - [Log Files](ch_app.html#ch_app._Log_Files)
+
+ - [Configuration Examples](ch_app.html#ch_app._Configuration_Exampl)
+
+ - [Database Load Balancing](ch_app.html#ch_app.Database_Load_Balancing)
+
+ - [Cookie / Argument Affinity Module (MOD\_CAF)](ch_app.html#ch_app.Cookie___Argument_Af)
+
+ - [Overview](ch_app.html#ch_app._Overview_2)
+
+ - [Configuration](ch_app.html#ch_app._Configuration_1)
+
+ - [Configuration Examples](ch_app.html#ch_app._Configuration_Exampl)
+
+ - [Arguments Matching](ch_app.html#ch_app.Arguments_Matching)
+
+ - [Argument Matching Examples](ch_app.html#ch_app.Argument_Matching_Ex)
+
+ - [Log File](ch_app.html#ch_app.Log_File)
+
+ - [Monitoring](ch_app.html#ch_app._Monitoring)
+
+ - [DISPD Network Dispatcher](ch_app.html#ch_app.DISPD_Network_Dispat)
+
+ - [Overview](ch_app.html#ch_app._Overview_3)
+
+ - [Protocol Description](ch_app.html#ch_app.Protocol_Description)
+
+ - [Client Request to DISPD](ch_app.html#ch_app.Client_Request_to_DI)
+
+ - [DISPD Client Response](ch_app.html#ch_app.DISPD_Client_Respons)
+
+ - [Communication Schemes](ch_app.html#ch_app.Communication_Scheme)
+
+ - [NCBID Server Launcher](ch_app.html#ch_app.NCBID_Server_Launche)
+
+ - [Overview](ch_app.html#ch_app._Overview_4)
+
+ - [Firewall Daemon (FWDaemon)](ch_app.html#ch_app.Firewall_Daemon_FWDa)
+
+ - [Overview](ch_app.html#ch_app._Overview_5)
+
+ - [FWDaemon Behind a "Regular" Firewall](ch_app.html#ch_app.FWDaemon_Behind_a__R)
+
+ - [FWDaemon Behind a "Non-Transparent" Firewall](ch_app.html#ch_app.FWDaemon_Behind_a__N)
+
+ - [Monitoring](ch_app.html#ch_app._Monitoring_1)
+
+ - [Log Files](ch_app.html#ch_app._Log_Files_1)
+
+ - [FWDaemon and NCBID Dispatcher Data Exchange](ch_app.html#ch_app.FWDaemon_and_NCBID_D)
+
+ - [Launcherd Utility](ch_app.html#ch_app.Launcherd_Utility)
+
+ - [Monitoring Tools](ch_app.html#ch_app.Monitoring_Tools)
+
+ - [Quality Assurance Domain](ch_app.html#ch_app.Quality_Assurance_Do)
+
+- [NCBI Genome Workbench](ch_app.html#ch_app.applications1)
+
+ - [Design goals](ch_app.html#ch_app.gbench_dg)
+
+ - [Design](ch_app.html#ch_app.gbench_design)
+
+- [NCBI NetCache Service](ch_app.html#ch_app.ncbi_netcache_service)
+
+ - [What is NetCache?](ch_app.html#ch_app.what_is_netcache)
+
+ - [What can NetCache be used for?](ch_app.html#ch_app.what_it_can_be_used)
+
+ - [How to use NetCache](ch_app.html#ch_app.getting_started)
+
+ - [The basic ideas](ch_app.html#ch_app.The_basic_ideas)
+
+ - [Setting up your program to use NetCache](ch_app.html#ch_app.Set_up_your_program_to_use_NetCac)
+
+ - [Establish the NetCache service name](ch_app.html#ch_app.Establish_the_NetCache_service_na)
+
+ - [Initialize the client API](ch_app.html#ch_app.Initialize_the_client_API)
+
+ - [Store data](ch_app.html#ch_app.Store_data)
+
+ - [Retrieve data](ch_app.html#ch_app.Retrieve_data)
+
+ - [Samples and other resources](ch_app.html#ch_app.Available_samples)
+
+ - [Questions and answers](ch_app.html#ch_app.Questions_and_answers)
+
+
+
+DATATOOL: Code Generation and Data Serialization Utility
+--------------------------------------------------------
+
+**DATATOOL** source code is located at `c++/src/serial/datatool`; this application can perform the following:
+
+- Generate C++ data storage classes based on [ASN.1](http://www.itu.int/ITU-T/studygroups/com17/languages), [DTD](http://www.w3.org/TR/REC-xml) or [XML Schema](http://www.w3.org/XML/Schema) specification to be used with [NCBI data serialization streams](ch_ser.html).
+
+- Convert ASN.1 specification into a DTD or XML Schema specification and vice versa.
+
+- Convert data between ASN.1, XML and JSON formats.
+
+***Note:*** Because ASN.1, XML and JSON are, in general, incompatible, the last two functions are supported only partially.
+
+The following topics are discussed in subsections:
+
+- [Invocation](ch_app.html#ch_app.datatool.html_refArgs)
+
+- [Data specification conversion](ch_app.html#ch_app.Data_Specification_C)
+
+- [Definition file](ch_app.html#ch_app.datatool.html_refDefFile)
+
+- [Module file](ch_app.html#ch_app.ch_app_datatool_html_refModFile)
+
+- [Generated code](ch_app.html#ch_app.datatool.html_refCode)
+
+- [Class diagrams](ch_app.html#ch_app.dt_inside.html)
+
+
+
+### Invocation
+
+The following topics are discussed in this section:
+
+- [Main arguments](ch_app.html#ch_app.datatool.html_refMainArgs)
+
+- [Code generation arguments](ch_app.html#ch_app.datatool.html_refCodeGenerationAr)
+
+
+
+#### Main Arguments
+
+See [Table 1](ch_app.html#ch_app.tools_table1).
+
+
+
+Table 1. Main arguments
+
+| Argument | Effect | Comments |
+|------------------------|-----------------------------------------------------------|----------------------------------------------------------------------------|
+| -h | Display the **DATATOOL** arguments | Ignores other arguments |
+| -m \<file(s)\>         | module specification file(s) - ASN.1, DTD, or XSD | Required argument |
+| -M \<file(s)\>         | External module file(s) | Is used for IMPORT type resolution |
+| -i                     | Ignore unresolved types | Is used for IMPORT type resolution |
+| -f \<file\>            | Write ASN.1 module file | |
+| -fx \<file\>           | Write DTD module file | "-fx m" writes modular DTD file |
+| -fxs \<file\>          | Write XML Schema file | |
+| -fd \<file\>           | Write specification dump file in datatool internal format | |
+| -ms \<suffix\>         | Suffix of modular DTD or XML Schema file name | |
+| -dn \<name\>           | DTD module name in XML header | No extension. If empty, omit DOCTYPE declaration. |
+| -v \<file\>            | Read value in ASN.1 text format | |
+| -vx \<file\>           | Read value in XML format | |
+| -F                     | Read value completely into memory | |
+| -p \<file\>            | Write value in ASN.1 text format | |
+| -px \<file\>           | Write value in XML format | |
+| -pj \<file\>           | Write value in JSON format | |
+| -d \<file\>            | Read value in ASN.1 binary format | -t argument required |
+| -t \<type\>            | Binary value type name | See -d argument |
+| -e \<file\>            | Write value in ASN.1 binary format | |
+| -xmlns \<name\>        | XML namespace name | When specified, also makes XML output file reference Schema instead of DTD |
+| -sxo                   | No scope prefixes in XML output | |
+| -sxi                   | No scope prefixes in XML input | |
+| -logfile \<file\>      | File to which the program log should be redirected | |
+| -conffile \<file\>     | Program's configuration (registry) data file | |
+| -version               | Print version number | Ignores other arguments |
+| -version | Print version number | Ignores other arguments |
+
+
+
+
+
+#### Code Generation Arguments
+
+See [Table 2](ch_app.html#ch_app.tools_table2).
+
+
+
+Table 2. Code generation arguments
+
+| Argument | Effect | Comments |
+|-----------------|-------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| -od \<file\>           | C++ code [definition file](ch_app.html#ch_app.datatool.html_refDefFile) | See [Definition file](ch_app.html#ch_app.datatool.html_refDefFile) |
+| -ods                   | Generate an example definition file (e.g. `MyModuleName._sample_def`) | Must be used with another option that generates code such as -oA. |
+| -odi                   | Ignore absent code definition file | |
+| -odw                   | Issue a warning about absent code definition file | |
+| -oA                    | Generate C++ files for all types | Only types from the main module are used (see [-m](ch_app.html#ch_app.tools_table1) and -mx arguments). |
+| -ot \<types\>          | Generate C++ files for listed types | Only types from the main module are used (see [-m](ch_app.html#ch_app.tools_table1) and -mx arguments). |
+| -ox \<types\>          | Exclude types from generation | |
+| -oX                    | Turn off recursive type generation | |
+| -of \<file\>           | Write the list of generated C++ files | |
+| -oc \<file\>           | Write combining C++ files | |
+| -on \<namespace\>      | Default namespace | The value "-" in the [Definition file](ch_app.html#ch_app.datatool.html_refDefFile) means don't use a namespace at all and overrides the -on option specified elsewhere. |
+| -opm \<dir\>           | Directory for searching source modules | |
+| -oph \<dir\>           | Directory for generated \*.hpp files | |
+| -opc \<dir\>           | Directory for generated \*.cpp files | |
+| -or \<prefix\>         | Add prefix to generated file names | |
+| -orq                   | Use quoted syntax form for generated include files | |
+| -ors                   | Add source file dir to generated file names | |
+| -orm                   | Add module name to generated file names | |
+| -orA                   | Combine all -or\* prefixes | |
+| -ocvs                  | Create ".cvsignore" files | |
+| -oR \<dir\>            | Set -op\* and -or\* arguments for NCBI directory tree | |
+| -oDc                   | Turn ON generation of Doxygen-style comments | The value "-" in the [Definition file](ch_app.html#ch_app.datatool.html_refDefFile) means don't generate Doxygen comments and overrides the -oDc option specified elsewhere. |
+| -odx \<URL\>           | URL of documentation root folder | For Doxygen |
+| -lax\_syntax           | Allow non-standard ASN.1 syntax accepted by asntool | The value "-" in the [Definition file](ch_app.html#ch_app.datatool.html_refDefFile) means don't allow non-standard syntax and overrides the -lax\_syntax option specified elsewhere. |
+| -pch \<file\>          | Name of the precompiled header file to include in all \*.cpp files | |
+| -oex \<export\>        | Add storage-class modifier to generated classes | Can be overridden by [[-].\_export](ch_app.html#ch_app.datatool.html_refDefCommon) in the definition file. |
+
+
+
+
+
+### Data Specification Conversion
+
+When parsing a data specification, **DATATOOL** identifies the specification format based on the source file extension - ASN, DTD, or XSD.
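+
+For example, hypothetical invocations (the file names are placeholders) that convert a specification from one format to another using the arguments from [Table 1](ch_app.html#ch_app.tools_table1):
+
+    datatool -m MyModule.asn -fxs MyModule.xsd    # ASN.1 to XML Schema
+    datatool -m MyModule.asn -fx  MyModule.dtd    # ASN.1 to DTD
+    datatool -m MyModule.xsd -f   MyModule.asn    # XML Schema to ASN.1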
+
+
+
+#### Scope Prefixes
+
+Initially, **DATATOOL** and the serial library supported serialization in ASN.1 and XML format, and conversion of ASN.1 specification into DTD. Compared to ASN.1, DTD is a very sketchy specification in the sense that there is only one primitive type - string, and all elements are defined globally. The latter feature of DTD led to a decision to use ‘scope prefixes’ in XML output to avoid potential name conflicts. For example, consider the following ASN.1 specification:
+
+ Date ::= CHOICE {
+ str VisibleString,
+ std Date-std
+ }
+ Time ::= CHOICE {
+ str VisibleString,
+ std Time-std
+ }
+
+Here, accidentally, element ***str*** is defined identically both in ***Date*** and ***Time*** productions; while the meaning of element ***std*** depends on the context. To avoid ambiguity, this specification translates into the following DTD:
+
+    <!ELEMENT Date (Date_str | Date_std)>
+    <!ELEMENT Date_str (#PCDATA)>
+    <!ELEMENT Date_std (Date-std)>
+    <!ELEMENT Time (Time_str | Time_std)>
+    <!ELEMENT Time_str (#PCDATA)>
+    <!ELEMENT Time_std (Time-std)>
+
+Accordingly, these scope prefixes made their way into XML output.
+
+Later, DTD parsing was added to **DATATOOL**. Here, scope prefixes were not needed. Also, since these prefixes considerably increase the size of the XML output, they can be omitted when it is known in advance that there can be no ambiguity, so **DATATOOL** has command-line flags that enable this.
+
+With the addition of the XML Schema parser and generator, elements can, if needed, be declared locally in the Schema when converting an ASN.1 specification, so scope prefixes make almost no sense. Still, they are preserved for compatibility.
+
+
+
+#### Modular DTD and Schemata
+
+Here, ‘module’ means an ASN.1 module. A single ASN.1 specification file may contain several modules. When converting it into DTD or XML Schema, it might be convenient to put each module's definitions into a separate file. To do so, specify a special file name in the `-fx` or `-fxs` command-line parameter. The names of the output DTD or Schema files will then be chosen automatically - they will be named after the ASN.1 modules defined in the source. ‘Modular’ output does not make much sense when the source specification is DTD or Schema.
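+
+For example, a hypothetical invocation that writes one Schema file per ASN.1 module (per [Table 1](ch_app.html#ch_app.tools_table1), the special file name "m" requests modular output for `-fx`; the same name is assumed here to work for `-fxs`, and `-ms` supplies the file-name suffix):
+
+    datatool -m all_objects.asn -fxs m -ms _xsd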
+
+You can find a number of DTDs and Schema converted by **DATATOOL** from NCBI public ASN.1 specifications [here](http://www.ncbi.nlm.nih.gov/data_specs).
+
+
+
+#### Converting XML Schema into ASN.1
+
+There are two major problems in converting XML schema into ASN.1 specification: how to define XML attributes and how to convert complex content models. The solution was greatly affected by the underlying implementation of data storage classes (classes which **DATATOOL** generates based on a specification). So, for example the following Schema
+
+    <xs:element name="Author">
+      <xs:complexType>
+        <xs:sequence>
+          <xs:element name="lastName" type="xs:string"/>
+          <xs:choice minOccurs="0">
+            <xs:element name="foreName" type="xs:string"/>
+            <xs:sequence>
+              <xs:element name="firstName" type="xs:string"/>
+              <xs:element name="middleName" type="xs:string" minOccurs="0"/>
+            </xs:sequence>
+          </xs:choice>
+          <xs:element name="initials" type="xs:string" minOccurs="0"/>
+          <xs:element name="suffix" type="xs:string" minOccurs="0"/>
+        </xs:sequence>
+        <xs:attribute name="gender" use="optional">
+          <xs:simpleType>
+            <xs:restriction base="xs:string">
+              <xs:enumeration value="male"/>
+              <xs:enumeration value="female"/>
+            </xs:restriction>
+          </xs:simpleType>
+        </xs:attribute>
+      </xs:complexType>
+    </xs:element>
+
+translates into this ASN.1:
+
+ Author ::= SEQUENCE {
+ attlist SET {
+ gender ENUMERATED {
+ male (1),
+ female (2)
+ } OPTIONAL
+ },
+ lastName VisibleString,
+ fF CHOICE {
+ foreName VisibleString,
+ fM SEQUENCE {
+ firstName VisibleString,
+ middleName VisibleString OPTIONAL
+ }
+ } OPTIONAL,
+ initials VisibleString OPTIONAL,
+ suffix VisibleString OPTIONAL
+ }
+
+Each unnamed local element gets a name. When generating C++ data storage classes from Schema, **DATATOOL** marks such data types as anonymous.
+
+It is possible to convert the source Schema into ASN.1, and then use **DATATOOL** to generate C++ classes from the latter. In this case **DATATOOL** and the serial library provide compatibility of ASN.1 output: if you generate data storage classes from the Schema and use them to write data in ASN.1 format (binary or text), then convert that Schema into ASN.1, generate classes from the converted specification, and write the same data in ASN.1 format using this new set of classes, the two files will be identical.
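+
+A hypothetical command sequence illustrating this round trip (file names are placeholders; the two generated sets of classes would then write identical ASN.1 data):
+
+    datatool -m mymodule.xsd -f mymodule.asn    # convert the Schema into ASN.1
+    datatool -oA -m mymodule.xsd                # generate classes from the Schema
+    datatool -oA -m mymodule.asn                # generate classes from the converted ASN.1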
+
+
+
+### Definition File
+
+It is possible to tune up the C++ code generation by using a definition file, which could be specified in the [-od](ch_app.html#ch_app.tools_table2) argument. The definition file uses the generic [NCBI configuration](ch_core.html#ch_core.registry_syntax) format also used in the configuration (`*.ini`) files found in NCBI's applications.
+
+**DATATOOL** looks for code generation parameters in several sections of the file in the following order:
+
+1. `[ModuleName.TypeName]`
+
+2. `[TypeName]`
+
+3. `[ModuleName]`
+
+4. `[-]`
+
+Parameter definitions follow a "name = value" format. The "name" part of the definition serves two functions: (1) selecting the specific element to which the definition applies, and (2) selecting the code generation parameter (such as `_class`) that will be fine-tuned for that element.
+
+To modify a top-level element, use a definition line where the name part is simply the desired code generation parameter (such as `_class`). To modify a nested element, use a definition where the code generation parameter is prefixed by a dot-separated "path" of the successive container element names from the data format specification. For path elements of type `SET OF` or `SEQUENCE OF`, use an "`E`" as the element name (which would otherwise be anonymous). ***Note:*** Element names will depend on whether you are using ASN.1, DTD, or Schema.
+
+For example, consider the following ASN.1 specification:
+
+ MyType ::= SEQUENCE {
+ label VisibleString ,
+ points SEQUENCE OF
+ SEQUENCE {
+ x INTEGER ,
+ y INTEGER
+ }
+ }
+
+Code generation for the various elements can be fine-tuned as illustrated by the following sample definition file:
+
+ [MyModule.MyType]
+ ; modify the top-level element (MyType)
+ _class = CMyTypeX
+
+ ; modify a contained element
+ label._class = CTitle
+
+ ; modify a "SEQUENCE OF" container type
+ points._type = vector
+
+ ; modify members of an anonymous SEQUENCE contained in a "SEQUENCE OF"
+ points.E.x._type = double
+ points.E.y._type = double
+
+ ; modify a DATATOOL-assigned class name
+ points.E._class = CPoint
+
+***Note:*** **DATATOOL** assigns arbitrary names to otherwise anonymous containers. In the example above, the `SEQUENCE` containing `x` and `y` has no name in the specification, so **DATATOOL** assigned the name `E`. If you want to change the name of a **DATATOOL**-assigned name, create a definition file and rename the class using the appropriate `_class` entry as shown above. To find out what the **DATATOOL**-assigned name will be, create a sample definition file using the **DATATOOL** `-ods` option. This approach will work regardless of the data specification format (ASN.1, DTD, or XSD).
+
+The following additional topics are discussed in this section:
+
+- [Common definitions](ch_app.html#ch_app.datatool.html_refDefCommon)
+
+- [Definitions that affect specific types](ch_app.html#ch_app.datatool.html_refDefSpecific)
+
+- [The Special [-] Section](ch_app.html#ch_app.The_Special__Section)
+
+- [Examples](ch_app.html#ch_app.datatool.html_refDefExample)
+
+
+
+#### Common Definitions
+
+Some definitions refer to the generated class as a whole.
+
+`_file` Defines the base filename for the generated or referenced C++ class.
+
+For example, the following definitions:
+
+ [ModuleName.TypeName]
+ _file=AnotherName
+
+Or
+
+ [TypeName]
+ _file=AnotherName
+
+would put the class ***CTypeName*** in files with the base name `AnotherName`, whereas these two:
+
+ [ModuleName]
+ _file=AnotherName
+
+Or
+
+ [-]
+ _file=AnotherName
+
+put **all** the generated classes into a single file with the base name `AnotherName`.
+
+`_extra_headers` Specify additional header files to include.
+
+For example, the following definition:
+
+ [-]
+ _extra_headers=name1 name2 \"name3\"
+
+would put the following lines into all generated headers:
+
+    #include <name1>
+    #include <name2>
+    #include "name3"
+
+Note the name3 clause. Putting name3 in quotes instructs **DATATOOL** to use the quoted syntax in generated files. Also, the quotes must be escaped with backslashes.
+
+`_dir` Subdirectory in which the generated C++ files will be stored (in case `_file` is not specified) or a subdirectory in which the referenced class from an external module could be found. The subdirectory is added to include directives.
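+
+For example, a hypothetical entry (the subdirectory name is a placeholder):
+
+    [MyModule.MyType]
+    _dir = mymodule/subdir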
+
+`_class` The name of the generated class (if `_class=-` is specified, then no code is generated for this type).
+
+For example, the following definitions:
+
+ [ModuleName.TypeName]
+ _class=CAnotherName
+
+Or
+
+ [TypeName]
+ _class=CAnotherName
+
+would cause the class generated for the type `TypeName` to be named ***CAnotherName***, whereas these two:
+
+ [ModuleName]
+ _class=CAnotherName
+
+Or
+
+ [-]
+ _class=CAnotherName
+
+would result in **all** the generated classes having the same name ***CAnotherName*** (which is probably not what you want).
+
+`_namespace` The namespace in which the generated class (or classes) will be placed.
+
+`_parent_class` The name of the base class from which the generated C++ class is derived.
+
+`_parent_type` Derive the generated C++ class from the class that corresponds to the specified type (in case `_parent_class` is not specified).
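+
+For example, a hypothetical combination of the settings above (all names are placeholders):
+
+    [MyModule.MyType]
+    _namespace = ncbi::objects
+    _parent_class = CMyBaseClass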
+
+It is also possible to specify a storage-class modifier, which is required on Microsoft Windows to export/import generated classes from/to a DLL. This setting affects all generated classes in a module. An appropriate section of the definition file should look like this:
+
+ [-]
+ _export = EXPORT_SPECIFIER
+
+Because this modifier could also be specified in the [command line](ch_app.html#ch_app.tools_table2), the **DATATOOL** code generator uses the following rules to choose the proper one:
+
+- If no `-oex` flag is given in the command line, no modifier is added at all.
+
+- If `-oex ""` (that is, an empty modifier) is specified in the command line, then the modifier from the definition file will be used.
+
+- The command-line parameter in the form `-oex FOOBAR` will cause the generated classes to have a `FOOBAR` storage-class modifier, unless another one is specified in the definition file. The modifier from the definition file always takes precedence.
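+
+For example, a hypothetical setup that lets the definition file supply the modifier (the export macro name is a placeholder):
+
+    datatool -oA -m mymodule.asn -od mymodule.def -oex ""
+
+with the definition file containing:
+
+    [-]
+    _export = NCBI_MYMODULE_EXPORT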
+
+
+
+#### Definitions That Affect Specific Types
+
+The following additional topics are discussed in this section:
+
+- [INTEGER, REAL, BOOLEAN, NULL](ch_app.html#ch_app.datatool.html_refDefINT)
+
+- [ENUMERATED](ch_app.html#ch_app.datatool.html_refDefENUM)
+
+- [OCTET STRING](ch_app.html#ch_app.datatool.html_refDefOCTETS)
+
+- [SEQUENCE OF, SET OF](ch_app.html#ch_app.datatool.html_refDefArray)
+
+- [SEQUENCE, SET](ch_app.html#ch_app.datatool.html_refDefClass)
+
+- [CHOICE](ch_app.html#ch_app.datatool.html_refDefChoice)
+
+
+
+##### INTEGER, REAL, BOOLEAN, NULL
+
+`_type` C++ type: int, short, unsigned, long, etc.
+
+
+
+##### ENUMERATED
+
+`_type` C++ type: int, short, unsigned, long, etc.
+
+`_prefix` Prefix for names of enum values. The default is "e".
+
+
+
+##### OCTET STRING
+
+`_char` Vector element type: char, unsigned char, or signed char.
+
+
+
+##### SEQUENCE OF, SET OF
+
+`_type` STL container type: list, vector, set, or multiset.
+
+
+
+##### SEQUENCE, SET
+
+`memberName._delay` Mark the specified member for delayed reading.
+
+
+
+##### CHOICE
+
+`_virtual_choice` If not empty, do not generate a special class for the choice; rather, make the choice class the parent of all its variants.
+
+`variantName._delay` Mark the specified variant for delayed reading.
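+
+For example, a hypothetical definition applying these settings to the `Date` CHOICE from the Scope Prefixes example:
+
+    [MyModule.Date]
+    _virtual_choice = 1
+    std._delay = 1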
+
+
+
+#### The Special [-] Section
+
+There is a special section `[-]` allowed in the definition file which can contain definitions related to code generation. This is a good place to define a namespace or identify additional headers. It is a "top level" section, so entries placed here will override entries with the same name in other sections or on the command-line. For example, the following entries set the proper parameters for placing header files alongside source files:
+
+ [-]
+ ; Do not use a namespace at all:
+ -on = -
+
+ ; Use the current directory for generated .cpp files:
+ -opc = .
+
+ ; Use the current directory for generated .hpp files:
+ -oph = .
+
+ ; Do not add a prefix to generated file names:
+ -or = -
+
+ ; Generate #include directives with quotes rather than angle brackets:
+ -orq = 1
+
+Any of the code generation arguments in [Table 2](ch_app.html#ch_app.tools_table2) (except `-od`, `-odi`, and `-odw` which are related to specifying the definition file) can be placed in the `[-]` section.
+
+In some cases, the special value `"-"` causes special processing as noted in [Table 2](ch_app.html#ch_app.tools_table2).
+
+
+
+#### Examples
+
+If we have the following ASN.1 specification (this is not a "real" specification - it is only for illustration):
+
+ Date ::= CHOICE {
+ str VisibleString,
+ std Date-std
+ }
+ Date-std ::= SEQUENCE {
+ year INTEGER,
+ month INTEGER OPTIONAL
+ }
+ Dates ::= SEQUENCE OF Date
+ Int-fuzz ::= CHOICE {
+ p-m INTEGER,
+ range SEQUENCE {
+ max INTEGER,
+ min INTEGER
+ },
+ pct INTEGER,
+ lim ENUMERATED {
+ unk (0),
+ gt (1),
+ lt (2),
+ tr (3),
+ tl (4),
+ circle (5),
+ other (255)
+ },
+ alt SET OF INTEGER
+ }
+
+Then the following definitions will affect the generation of objects:
+
+
+
+| Definition | Affected Objects |
+|---------------------------------------------------------------------|--------------------------------------------------------------------|
+| `[Date]` `str._type = string` | the `str` member of the `Date` structure |
+| `[Dates]` `E._pointer = true` | elements of the `Dates` container |
+| `[Int-fuzz]` `range.min._type = long` | the `min` member of the `range` member of the `Int-fuzz` structure |
+| `[Int-fuzz]` `alt.E._type = long` | elements of the `alt` member of the `Int-fuzz` structure |
+
+
+
+As another example, suppose you have a ***CatalogEntry*** type comprised of a ***Summary*** element and either a ***RecordA*** element or a ***RecordB*** element, as defined by the following XSD specification:
+
+    <?xml version="1.0" encoding="UTF-8"?>
+    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
+      <xs:element name="CatalogEntry" type="CatalogEntryType"/>
+      <xs:complexType name="CatalogEntryType">
+        <xs:sequence>
+          <xs:element name="Summary" type="xs:string"/>
+          <xs:choice>
+            <xs:element name="RecordA" type="xs:string"/>
+            <xs:element name="RecordB" type="xs:string"/>
+          </xs:choice>
+        </xs:sequence>
+      </xs:complexType>
+    </xs:schema>
+
+In this specification, the `<xs:choice>` element in ***CatalogEntryType*** is anonymous, so **DATATOOL** will assign an arbitrary name to it. The assigned name will not be descriptive, but fortunately you can use a definition file to change the assigned name.
+
+First find the **DATATOOL**-assigned name by creating a sample definition file using the `-ods` option:
+
+ datatool -ods -oA -m catalogentry.xsd
+
+The sample definition file (`catalogentry._sample_def`) shows `RR` as the class name:
+
+ [CatalogEntry]
+ RR._class =
+ Summary._class =
+
+Then edit the module definition file (`catalogentry.def`) and change `RR` to a more descriptive class name, for example:
+
+ [CatalogEntry]
+ RR._class=CRecordChoice
+
+The new name will be used the next time the module is built.
+
+
+
+### Module File
+
+Module files are not used directly by **DATATOOL**, but they are read by `new_module.sh` and [project\_tree\_builder](ch_config.html#ch_config._Build_the_Toolkit) and therefore determine what **DATATOOL**'s command line will be when **DATATOOL** is invoked from the NCBI build system.
+
+Module files simply consist of lines of the form "`KEY = VALUE`". Only the key `MODULE_IMPORT` is currently used (and is the only key ever recognized by `project_tree_builder`). Other keys used to be recognized by `module.sh` and still harmlessly remain in some files. The possible keys are:
+
+- `MODULE_IMPORT` These definitions contain a space-delimited list of other modules to import. The paths should be relative to `.../src` and should not include extensions. For example, a valid entry could be: `MODULE_IMPORT = objects/general/general objects/seq/seq`
+
+- `MODULE_ASN`, `MODULE_DTD`, `MODULE_XSD` These definitions explicitly set the specification filename (normally `foo.asn`, `foo.dtd`, or `foo.xsd` for `foo.module`). Almost no module files contain this definition. It is no longer used by the `project_tree_builder` and is therefore not necessary.
+
+- `MODULE_PATH` Specifies the directory containing the current module, again relative to `.../src`. Almost all module files contain this definition, however it is no longer used by either `new_module.sh` or the `project_tree_builder` and is therefore not necessary.
+
+
+
+### Generated Code
+
+The following additional topics are discussed in this section:
+
+- [Normalized name](ch_app.html#ch_app.datatool.html_refNormalizedName)
+
+- [ENUMERATED types](ch_app.html#ch_app.datatool.html_refCodeEnum)
+
+
+
+#### Normalized Name
+
+By default, **DATATOOL** generates "normalized" C++ class names from ASN.1 type names using two rules:
+
+1. Convert any hyphens ("***-***") into underscores ("***\_***"), because hyphens are not legal characters in C++ class names.
+
+2. Prepend a 'C' character.
+
+For example, the default normalized C++ class name for the ASN.1 type name "***Seq-data***" is "***CSeq\_data***".
+
+The default C++ class name can be overridden by explicitly specifying in the definition file a name for a given ASN.1 type name. For example:
+
+ [MyModule.Seq-data]
+ _class=CMySeqData
+
+
+
+#### ENUMERATED Types
+
+By default, for every `ENUMERATED` ASN.1 type, **DATATOOL** will produce a C++ enum type with the name ***ENormalizedName***.
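+
+For example, for the `lim ENUMERATED {...}` member of the `Int-fuzz` type shown in the Examples section above, the generated class contains an enum roughly along these lines (illustrative only; the actual generated code also contains typedefs and accessors):
+
+    enum ELim {
+        eLim_unk    = 0,
+        eLim_gt     = 1,
+        eLim_lt     = 2,
+        eLim_tr     = 3,
+        eLim_tl     = 4,
+        eLim_circle = 5,
+        eLim_other  = 255
+    };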
+
+
+
+### Class Diagrams
+
+The following topics are discussed in this section:
+
+- [Specification analysis](ch_app.html#ch_app.dt_inside.html_specs)
+
+- [Data types](ch_app.html#ch_app.dt_inside.html_data_types)
+
+- [Data values](ch_app.html#ch_app.dt_inside.html_data_values)
+
+- [Code generation](ch_app.html#ch_app.dt_inside.html_code_gen)
+
+
+
+#### Specification Analysis
+
+The following topics are discussed in this section:
+
+- [ASN.1 specification analysis](ch_app.html#ch_app.dt_inside.html_specs_asn)
+
+- [DTD specification analysis](ch_app.html#ch_app.dt_inside.html_specs_dtd)
+
+
+
+##### ASN.1 Specification Analysis
+
+See [Figure 1](ch_app.html#ch_app.specs_asn).
+
+
+
+[](/book/static/img/specs_asn.gif "Click to see the full-resolution image")
+
+Figure 1. ASN.1 specification analysis.
+
+
+
+##### DTD Specification Analysis
+
+See [Figure 2](ch_app.html#ch_app.specs_dtd).
+
+
+
+[](/book/static/img/specs_dtd.gif "Click to see the full-resolution image")
+
+Figure 2. DTD specification analysis.
+
+
+
+#### Data Types
+
+See [CDataType](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCDataType.html).
+
+
+
+#### Data Values
+
+See [Figure 3](ch_app.html#ch_app.data_values).
+
+
+
+[](/book/static/img/data_types.gif "Click to see the full-resolution image")
+
+Figure 3. Data values.
+
+
+
+#### Code Generation
+
+See [Figure 4](ch_app.html#ch_app.code_gen).
+
+
+
+[](/book/static/img/type_strings.gif "Click to see the full-resolution image")
+
+Figure 4. Code generation.
+
+
+
+Load Balancing
+--------------
+
+- [Overview](ch_app.html#ch_app._Overview)
+
+- [Load Balancing Service Mapping Daemon (LBSMD)](ch_app.html#ch_app.Load_Balancing_Servi)
+
+- [Database Load Balancing](ch_app.html#ch_app.Database_Load_Balancing)
+
+- [Cookie / Argument Affinity Module (MOD\_CAF)](ch_app.html#ch_app.Cookie___Argument_Af)
+
+- [DISPD Network Dispatcher](ch_app.html#ch_app.DISPD_Network_Dispat)
+
+- [NCBID Server Launcher](ch_app.html#ch_app.NCBID_Server_Launche)
+
+- [Firewall Daemon (FWDaemon)](ch_app.html#ch_app.Firewall_Daemon_FWDa)
+
+- [Launcherd Utility](ch_app.html#ch_app.Launcherd_Utility)
+
+- [Monitoring Tools](ch_app.html#ch_app.Monitoring_Tools)
+
+- [Quality Assurance Domain](ch_app.html#ch_app.Quality_Assurance_Do)
+
+***Note:*** For security reasons, not all links in the public version of this document are accessible to users outside NCBI.
+
+This section covers the following topics:
+
+- The purpose of load balancing
+
+- The purpose, internal details, and configuration of each separate component
+
+- Communications between the components
+
+- Monitoring facilities
+
+
+
+### Overview
+
+The purpose of load balancing is to distribute the load among the service providers available on the NCBI network based on certain rules. The load is generated by both locally-connected and Internet-connected users. The figures below show the most typical usage scenarios.
+
+[](/book/static/img/LoadBalancingLocal.jpg "Click to see the full-resolution image")
+
+Figure 5. Local Clients
+
+Please note that the figure is slightly simplified to remove unnecessary details for the time being.
+
+In the case of local access to NCBI resources, two NCBI-developed components are involved in the interactions: the LBSMD daemon (Load Balancing Service Mapping Daemon) and mod\_caf (Cookie/Argument Affinity module), an Apache web server module.
+
+The LBSMD daemon runs on each host in the NCBI network. The daemon reads its configuration file, which describes all the services available on the host. The LBSMD daemon then broadcasts the available services and the current host load to the adjacent LBSMD daemons on a regular basis. The data received from the other LBSMD daemons are stored in a special table, so at some point the LBSMD daemon on each host has a full description of the services available on the network as well as the current hosts’ load.
+
+The mod\_caf Apache module analyzes special cookies and query-line arguments, and reads data from the table populated by the LBSMD daemon. Based on the best match, it decides where to pass a request.
+
+Suppose for a moment that a local NCBI client runs a web browser, points to an NCBI web page, and initiates a DB request via the web interface. At this stage mod\_caf analyzes the request line and decides where to pass the request. The request is passed to the ServiceProviderN host, which performs the corresponding database query, and the query results are then delivered to the client. The data exchange path is shown in the figure above using solid lines.
+
+Another typical scenario for local NCBI clients is when client code runs on a user workstation. That client code might require a long-term connection to a certain service - to a database, for example. The browser is not able to provide this kind of connection, so a direct connection is used in this case. The data exchange path is shown in the figure above using dashed lines.
+
+The communication scenarios become more complicated when clients are located outside the NCBI network. The figure below describes the interactions between modules when the user requests a service that does not require a long-term connection.
+
+[](/book/static/img/LoadBalancingInternetShort.jpg "Click to see the full-resolution image")
+
+Figure 6. Internet Clients. Short Term Connection
+
+The clients cannot connect to the front-end Apache web servers directly. The connection is made via a router located in the DMZ (Demilitarized Zone). The router selects one of the available front-end servers and passes the request to that web server. The web server then processes the request very much like it processes requests from a local client.
+
+The next figure explains the interactions for the case when an Internet client requests a service that requires a long-term connection.
+
+[](/book/static/img/LoadBalancingInternetLong.jpg "Click to see the full-resolution image")
+
+Figure 7. Internet Clients. Long Term Connection
+
+Unlike local clients, Internet clients are unable to connect to the required service directly because of the DMZ. This is where DISPD, FWDaemon, and a proxy come in to resolve the problem.
+
+The data flow in the scenario is as follows. A request from the client reaches a front-end Apache server as discussed above. The front-end server then passes the request to the DISPD dispatcher. The DISPD dispatcher communicates with FWDaemon (Firewall Daemon) to provide the required service facilities. The FWDaemon answers with a special ticket for the requested service. The ticket is sent to the client via the front-end web server and the router. The client then connects to the NAT service in the DMZ, providing the received ticket. The NAT service establishes a connection to the FWDaemon and passes the ticket received earlier. The FWDaemon, in turn, provides the connection to the required service. It is worth mentioning that the FWDaemon runs on the same host as the DISPD dispatcher, and neither DISPD nor FWDaemon can work without the other.
+
+The most complicated scenario comes into the picture when an arbitrary Unix filter program is used as a service provided for users outside NCBI. The figure below shows all the components involved in the scenario.
+
+[](/book/static/img/LoadBalancingDispD.jpg "Click to see the full-resolution image")
+
+Figure 8. NCBID at Work
+
+The data flow in the scenario is as follows. A request from the client reaches a front-end Apache server as discussed above. The front-end server then passes the request to the DISPD dispatcher. DISPD communicates with both the FWDaemon and the NCBID utility on (possibly) another host and requests that the specified Unix filter program (Service X in the figure) be daemonized. The daemonized service starts listening on a certain port for a network connection. The connection attributes are delivered to the FWDaemon and to the client via the web front end and the router. The client connects to the NAT service, and the NAT service passes the request further to the FWDaemon. The FWDaemon passes the request to the daemonized Service X on the Service Provider K host. From that moment the client is able to exchange data with the service. The described scenario is intended for tasks requiring long-term connections.
+
+Further sections describe all the components in more detail.
+
+
+
+### Load Balancing Service Mapping Daemon (LBSMD)
+
+
+
+#### Overview
+
+As mentioned earlier, the LBSMD daemon runs on almost every host that carries either public or private servers which, in turn, implement NCBI services. The services include CGI programs or standalone servers that access NCBI data.
+
+Each service has a unique name assigned to it; “TaxService” would be an example of such a name. The name not only identifies a service - it also implies a protocol that is used for data exchange with that service. For example, any client that connects to the “TaxService” service knows how to communicate with that service regardless of the way the service is implemented. In other words, the service could be implemented as a standalone server on host X and as a CGI program on the same host or on another host Y (please note, however, that there are exceptions, and for some service types it is forbidden to have more than one such server on the same host).
+
+A host can advertise many services. For example, one service (such as “Entrez2”) can operate with binary data only while another one (such as “Entrez2Text”) can operate with text data only. The distinction between those two services can be made by using a content type specifier in the LBSMD daemon configuration file.
+
+The main purpose of the LBSMD daemon is to maintain a table of all services available at NCBI at the moment. In addition the LBSMD daemon keeps track of servers that are found to be dysfunctional (dead servers). The daemon is also responsible for propagating trouble reports, obtained from applications. The application trouble reports are based on their experience with advertised servers (e.g., an advertised server is not technically marked dead but generates some sort of garbage). Further in this document, the latter kind of feedback is called a penalty.
+
+The principle of load balancing is simple: each server which implements a service is assigned a (calculated) rate. The higher the rate, the better the chance for that server to be chosen when a request for the service comes up. Note that load balancing is thus almost never deterministic.
+
+The LBSMD daemon calculates two parameters for the host on which it is running. The parameters are a normal host status and a BLAST host status (based on the instantaneous load of the system). These parameters are then used to calculate the rate of all (non-static) servers on the host. The rates of all other hosts are not calculated but are received and stored in the LBSMD table.
+
+The LBSMD daemon can be restarted from a crontab every few minutes on all the production hosts to ensure that the daemon is always running. This technique is safe because no more than one instance of the daemon is permitted on a given host and any attempt to start more than one is ignored. Normally, though, a running daemon instance is kept afloat by some kind of monitoring software, such as “puppet” or “monit”, which makes the use of crontabs unnecessary.
+
+The main loop of the LBSMD daemon:
+
+- periodically checks the configuration file and reloads the configuration when necessary;
+
+- checks for and processes incoming messages from neighbor LBSMD daemons running on other hosts; and
+
+- generates and broadcasts the messages to the other hosts about the load of the system and configured services.
+
+The LBSMD daemon can also periodically check whether the configured servers are alive: either by trying to establish a connection to them (and then disconnecting immediately, without sending/receiving any data) and / or by using a special plugin script that can do more intelligent, thorough, and server-specific diagnostics, and report the result back to LBSMD via an exit code.
+
+Lastly, LBSMD can pull port load information as posted by the running servers. This is done via a simple API. The information is then used to calculate the final server rates at run time.
+
+Although clients can [redirect services](ch_conn.html#ch_conn.Service_Redirection), LBSMD does not distinguish between direct and redirected services.
+
+
+
+#### Configuration
+
+The LBSMD daemon is configured via command line options and via a configuration file. The full list of command line options can be retrieved by issuing the following command:
+
+`/opt/machine/lbsm/sbin/lbsmd --help`
+
+Local NCBI users can also visit the following link:
+
+
+
+The default name of the LBSMD daemon configuration file is `/etc/lbsmd/servrc.cfg`. Each line can be one of the following:
+
+- an include directive
+
+- site / zone designation
+
+- host authority information
+
+- a monitored port designation
+
+- a part of the host environment
+
+- a service definition
+
+- an empty line (entirely blank or containing a comment only)
+
+Empty lines are ignored in the file. Any single configuration line can be split into several physical lines by inserting backslash symbols (\\) before the line breaks. A comment is introduced by the pound/hash symbol (\#).
+
+A configuration line of the form
+
+ %include filename
+
+causes the contents of the named file **`filename`** to be inserted here. The daemon always assumes that relative file names (those that do not start with the slash character, /) are based on the daemon startup directory. This is true for any level of nesting.
+
+Once started, the daemon first tries to read its configuration from `/etc/lbsmd/servrc.cfg`. If the file is not found (or is not readable) the daemon looks for the configuration file `servrc.cfg` in the directory from which it has been started. This fallback mechanism is not used when the configuration file name is explicitly stated in the command line. The daemon periodically checks the configuration file and all of its descendants and reloads (discards) their contents if some of the files have been either updated, (re-)moved, or added.
+
+The “**`filename`**” can be followed by a pipe character ( \| ) and some text (up to the end of the line or to a comment introduced by the hash character). That text is then prepended to every line (except the `%include` directives) read from the included file.
+
+A configuration line of the form
+
+ @zone
+
+specifies the zone to which the entire configuration file applies, where a zone is a subdivision of the existing broadcast domain which does not intermix with other unrelated zones. Only one zone designation is allowed, and it must match the predefined site information (the numeric designation of the entire broadcast domain, which is either “guessed” by LBSMD or preset via a command-line parameter): the zone value must be a binary subset of the site value (which is usually a contiguous set of 1-bits, such as `0xC0` or `0x1E`).
+
+When no zone is specified, the zone is set equal to the entire site (broadcast domain) so that any regular service defined by the configuration is visible to each and every LBSMD running at the same site. Otherwise, only the servers with bitwise-matching zones are visible to each other: if 1, 2, and 3 are the zones of hosts “X”, “Y” and “Z”, respectively (all hosts reside within the same site, say 7), then servers from “X” are visible by “Z”, but not by “Y”; servers from “Y” are visible by “Z” but not by “X”; and finally, all servers from “Z” are visible by both “X” and “Y”. There’s a way to define servers at “X” to be visible by “Y” using an “Inter” server flag (see below).
+
+A configuration line of the form
+
+ [*]user
+
+introduces a user that is added to the host authority. There can be multiple authority lines across the configuration, and they are all aggregated into a list. The list can contain both individual user names and / or group names (denoted by a preceding asterisk). The listed users and / or members of the listed groups, will be allowed to operate on all server records that appear in the LBSMD configuration files on this host (individual server entries may designate additional personnel on a per-server basis). Additional authority entries are only allowed from the same branch of the configuration file tree: so if a file “a” includes a file “b”, where the first host authority is defined, then any file that is included (directly or indirectly) from “b” can add entries to the host authority, while no other file that is included later from “a”, can.
+
+A configuration line of the form
+
+ :port
+
+designates a local network port for monitoring by LBSMD: the daemon will regularly pull the port information as provided by servers at run time (total port capacity, used capacity, and free capacity) and make these values available in the load-balance messages sent to other LBSMDs. The ratio “free” over “total” will be used to calculate the port availability (1.0=fully free, 0.0=fully clogged). Servers may use arbitrary units to express the capacity, but neither “used” nor “free” may be greater than “total”, and “used” must correspond to the actual used resource, yet “free” may be either calculated (e.g. algorithmically decreased in anticipation of the mounting load in order to shrink the port availability ratio more quickly) or simply amount to “total” – “used”. Note that “free” set to “0” signals the port as currently being unavailable for service (i.e. as if the port was down) – and an automatic connection check, if any, will not be performed by LBSMD on that port.
+
+A configuration line of the form
+
+ name=value
+
+goes into the host environment. The host environment can be accessed by clients when they perform the service name resolution. The host environment is designed to help the client know about limitations/options that the host has, and based on this additional information the client can decide whether the server (despite the fact that it implements the service) is suitable for carrying out the client's request. For example, the host environment can give the client an idea about what databases are available on the host. The host environment is not interpreted or used in any way by either the daemon or the load balancing algorithm, except that the name must be a valid identifier. The value may be practically anything, even empty. It is left solely to the client to parse the environment and to look for the information of interest. The host environment can be obtained from the service iterator by a call to `SERV_GetNextInfoEx()`, which is documented in the [service mapping API](ch_conn.html#ch_conn.service_mapping_api).
+
+***Note***: White space characters surrounding the name are not preserved, but white space in the value (i.e. anything appearing after the “=” sign) is preserved.
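+
+For example, the following lines (hypothetical values) would publish a list of available databases and an empty flag in the host environment:
+
+    db=PubMed Taxonomy
+    testing=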
+
+A configuration line of the form
+
+ service_name [check_specifier] server_descriptor [| launcher_info ]
+
+defines a server. The detailed description of the individual fields is given below.
+
+- **`service_name`** specifies the service name that the server is part of, for example TaxService. The same **`service_name`** may be used in multiple server definition lines to add more servers implementing that service.
+
+- **`[check_specifier]`** is an optional parameter (if omitted, the surrounding square brackets must not be used). The parameter is a comma separated list and each element in the list can be one of the following.
+
+ - **`[-]N[/M]`** where N and M are integers. This will lead to checking every N seconds with backoff time of M seconds if failed. The “-“ character is used when it is required to check dependencies only, but not the primary connection point. "0", which stands for "no check interval", disables checks for the service.
+
+    - **`[!][host[:port]][+[service]]`** which describes a dependency. The “!” character means negation. The **`service`** is the name of a service that the service being defined depends upon, and which runs on **`host:port`**. The pair **`host:port`** is required if no service is specified. The **`host`**, :**`port`**, or both can be missing if **`service`** is specified (in that case the missing parts are read as “any”). The “+” character alone means “this service’s name” (of the one currently being defined). Multiple dependency specifications are allowed.
+
+ - **`[~][DOW[-DOW]][@H[-H]]`** which defines a schedule. The “~” character means negation. The service runs from **`DOW`** to **`DOW`** (**`DOW`** is one of Su, Mo, Tu, We, Th, Fr, Sa, or Hd, which stands for a federal holiday, and cannot be used in weekday ranges) or any if not specified, and between hours **`H`** to **`H`** (9-5 means 9:00am thru 4:59pm, 18-0 means 6pm thru midnight). Single **`DOW`** and / or **`H`** are allowed and mean the exact day of week (or a holiday) and / or one exact hour. Multiple schedule specifications are allowed.
+
+    - **`email@ncbi.nlm.nih.gov`** which makes the LBSMD daemon send an e-mail to the specified address whenever this server changes its status (e.g. from up to down). Multiple e-mail specifications are allowed. The **`ncbi.nlm.nih.gov`** part is fixed and may not be changed.
+
+ - **`user`** or **`*group`** which makes the LBSMD daemon add the specified user or group of users to the list of personnel who are authorized to modify the server (e.g. post a penalty, issue a rerate command etc.). By default these actions are only allowed to the **`root`** and **`lbsmd`** users, as well as users added to the host authority. Multiple specifications are allowed.
+
+ - **`script`** which specifies a path to a local executable which checks whether the server is operational. The LBSMD daemon starts this script periodically as specified by the check time parameter(s) above. Only a single script specification is allowed. See [Check Script Specification](ch_app.html#ch_app.Check_Script_Specification) for more details.
+
+- **`server_descriptor`** specifies the address of the server and supplies additional information. An example of the **`server_descriptor`**: `STANDALONE somehost:1234 R=3000 L=yes S=yes B=-20`. See [Server Descriptor Specification](ch_app.html#ch_app.Server_Descriptor_Specification) for more details.
+
+- **`launcher_info`** is basically a command line preceded by a pipe symbol ( \| ), which serves as a delimiter from the **`server_descriptor`**. It is only required for **NCBID** servers configured on the local host.
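+
+Putting the pieces together, a hypothetical definition of a ***STANDALONE*** server (the service, host, schedule, and mailbox names are all illustrative) might look like this:
+
+    TaxService [60, Mo-Fr@9-17, watcher@ncbi.nlm.nih.gov] STANDALONE somehost:1234 R=3000 L=yes
+
+Here the server is checked every 60 seconds, is scheduled to run Monday through Friday from 9:00am through 4:59pm, and status changes are e-mailed to watcher@ncbi.nlm.nih.gov.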
+
+
+
+##### Check Script Specification
+
+The check script file is configured between square brackets '[' and ']' in the service definition line. For example, the service definition line:
+
+`MYSERVICE [5, /bin/user/directory/script.sh] STANDALONE :2222 ...`
+
+sets the period in seconds between script checks as "`5`" (the default period is 15 seconds) and designates the check script file as "`/bin/user/directory/script.sh`" to be launched every 5 seconds. You can prefix the period with a minus sign (-) to indicate that LBSMD should not check the connection point (:2222 in this example) on its own, but should only run the script. The script must finish before the next check run is due. Otherwise, LBSMD will kill the script and ignore the result. Multiple repetitive failures may result in the check script removal from the check schedule.
+
+The following command-line parameters are always passed to the script upon execution:
+
+- **`argv[0]`** = name of the executable with preceding '\|' character if **`stdin`** / **`stdout`** are open to the server connection (/dev/null otherwise), ***NB***: '\|' is not always readily accessible from within shell scripts, so it's duplicated in **`argv[2]`** for convenience;
+
+- **`argv[1]`** = name of the service being checked;
+
+- **`argv[2]`** = if piped, "\|host:port" of the connection point being checked, otherwise "host:port" of the server as per configuration;
+
+The following additional command-line parameters will be passed to the script if it has been run before:
+
+- **`argv[3]`** = exit code obtained in the last check script run;
+
+- **`argv[4]`** = repetition count for **`argv[3]`** (***NB***: 0 means this is the first occurrence of the exit code given in **`argv[3]`**);
+
+- **`argv[5]`** = seconds elapsed since the last check script run.
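+
+For instance, continuing the `MYSERVICE` example above (all values are illustrative), a repeat run of the script over a piped connection might receive:
+
+    argv[0] = |/bin/user/directory/script.sh
+    argv[1] = MYSERVICE
+    argv[2] = |somehost:2222
+    argv[3] = 0
+    argv[4] = 1
+    argv[5] = 5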
+
+Output to **`stderr`** is attached to the LBSMD log file; the CPU limit is set to maximal allowed execution time. Nevertheless, the check must finish before the next invocation is due, per the server configuration.
+
+The check script is expected to produce one of the following exit codes:
+
+
+
+| Code(s) | Meaning |
+|-----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| 0 | The server is fully available, i.e. "running at full throttle". |
+| 1 - 99 | Indicates the approximate percent of base capacity used. |
+| 100 - 110 | Server state is set as RESERVED. RESERVED servers are unavailable to most clients but not considered as officially DOWN. |
+| 111 - 120 | The server is not available and must not be used, i.e. DOWN. |
+| 123 | Retain the previous exit code (as supplied in **`argv[3]`**) and increment the repetition count; if there is no previous code, retain the current server state and log a warning. |
+| 124 (*not* followed by 125) | Retain the current server state. |
+| 124 followed by 125 | Turn the server off, with no more checks. ***Note:*** This only applies when 124 is followed by 125, both without repetitions. |
+| 125 (*not* preceded by 124) | Retain the current server state. |
+| 126 | Script was found but not executable (POSIX, script error). |
+| 127 | Script was not found (POSIX, script error). |
+| 200 - 210 | STANDBY server (set the rate to 0.005). The rate will be rolled back to the previously set "regular" rate the next time the RERATE command comes; or when the check script returns anything other than 123, 124, 125, or the state-retaining ALERTs (211-220). STANDBY servers are those having a base rate in the range [0.001..0.009], with higher rates having a better chance to get drafted for service. STANDBY servers are only used by clients if there are no usable non-STANDBY counterparts found. |
+| 211 - 220 | ALERT (email contacts and retain the current server state). |
+| 221 - 230 | ALERT (email contacts and base the server rate on the dependency check only). |
+
+
+
+Exit codes 126, 127, and other unlisted codes are treated as if 0 had been returned (i.e. the server rate is based on the dependency check only).
+
+Any exit code other than 123 resets the repetition count, even though the new code may be equal to the previous one. In the absence of a previous code, exit code 123 will not be counted as a repetition and will cause a warning to be logged.
+
+Any exit code *not* from the table above will cause a warning to be logged, and will be treated as if 0 had been returned. Note that upon the exit code sequence 124, 125 no further script runs will occur, and the server will be taken out of service.
+
+If the check script crashes ungracefully (with or without the coredump) 100+ times in a row, it will be eliminated from further checks, and the server will be considered fully available (i.e. as if 0 had been returned).
+
+Servers are called SUPPRESSED when they are 100% penalized (see server penalties below); while RESERVED is a special state that LBSMD maintains. 100% penalty makes an entry not only unavailable for regular use (same as RESERVED) but also assumes some maintenance work in progress (so that any underlying state changes will not be announced immediately but only when the entry goes out of the 100% penalized state, if any state change still remains). On the other hand and from the client perspective, RESERVED and SUPPRESSED may look identical.
+
+***Note:*** The check script operation is complementary to setting a penalty prior to doing any disruptive changes in production. In other words, the script is only reliable as long as the service is expected to work. If there is any scheduled maintenance, it should be communicated to LBSMD via a penalty rather than by an assumption that the failing script will do the job of bringing the service to the down state and excluding it from LB.
+
+
+
+##### Server Descriptor Specification
+
+The **`server_descriptor`**, also detailed in `connect/ncbi_server_info.h`, consists of the following fields:
+
+`server_type [host][:port] [arguments] [flags]`
+
+where:
+
+- **`server_type`** is one of the following keywords ([more info](ch_conn.html#ch_conn.service_connector)):
+
+ - ***NCBID*** for servers launched by ncbid.cgi
+
+ - ***STANDALONE*** for standalone servers listening to incoming connections on dedicated ports
+
+    - ***HTTP\_GET*** for servers that are CGI programs accepting only the GET request method
+
+    - ***HTTP\_POST*** for servers that are CGI programs accepting only the POST request method
+
+    - ***HTTP*** for servers that are CGI programs accepting either the GET or the POST request method
+
+ - ***DNS*** for introduction of a name (fake service), which can be used later in load-balancing for domain name resolution
+
+ - ***NAMEHOLD*** for declaration of service names that cannot be defined in any other configuration files except for the current configuration file. ***Note:*** The FIREWALL server specification may not be used in a configuration file (i.e., may neither be declared as services nor as service name holders).
+
+- both **`host`** and **`port`** parameters are optional. Defaults are local host and port 80, except for ***STANDALONE*** and ***DNS*** servers, which do not have a default port value. If host is specified (by either of the following: keyword localhost, localhost IP address 127.0.0.1, real host name, or IP address) then the described server is not subject to variable load balancing but is a static server. Such a server always has a constant rate, independent of any host load.
+
+- **`arguments`** are required for HTTP\* servers and must specify the local part of the URL of the CGI program and, optionally, parameters such as `/somepath/somecgi.cgi?param1&param2=value2&param3=value3`. If no parameters are to be supplied, then the question mark (?) must be omitted, too. For **NCBID** servers, arguments are parameters to pass to the server and are formed as arguments for CGI programs, i.e., `param1&param2&param3=value`. As a special rule, '' (two single quotes) may be used to denote an empty argument for the **NCBID** server. ***STANDALONE*** and ***DNS*** servers do not take any **`arguments`**.
+
+- **`flags`** can come in any order (but no more than one instance of a flag is allowed) and essentially are the optional modifiers of values used by default. The following flags are recognized (see [ncbi\_server\_info.h](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/connect/ncbi_server_info.h)):
+
+ - load calculation keyword:
+
+        - ***Blast*** to use a special rate-calculation algorithm acceptable for BLAST applications. The algorithm uses instant values of the host load and thus is less conservative and more reactive than the ordinary one.
+
+ - ***Regular*** to use an ordinary rate calculation (default, and the only load calculation option allowed for static servers).
+
+        - Either of these keywords may be suffixed with “Inter” (e.g. ***RegularInter***), making the entry cross the current zone boundary and become available outside its zone.
+
+- base rate:
+
+    - R=value sets the base server reachability rate (as a floating point number); the default is 1000. Any negative value makes the server unreachable, and a value of 0 causes the default rate to be used. The range of the base rate is between 0.001 and 100000. Note that the range [0.001..0.009] is reserved for STANDBY servers – the ones that are only used by clients if no other usable non-STANDBY counterparts can be found.
+
+- locality markers (Note: If necessary, both L and P markers can be combined in a particular service definition):
+
+    - L={yes\|no} sets (if yes) the server to be local only. The default is no. The [service mapping API](ch_conn.html#ch_conn.service_mapping_api) returns local-only servers when the mapping is done with the use of LBSMD running on the same (local) host (direct mapping), or if the dispatching (indirect mapping) occurs within the NCBI Intranet. Otherwise, if the service mapping occurs using a non-local network (certainly indirectly, by exchange with dispd.cgi), then servers that are local only are not seen.
+
+ - P={yes\|no} sets (if yes) the server to be private. The default is no. Private servers are not seen by the outside NCBI users (exactly like local servers), but in addition these servers are not seen from the NCBI Intranet if requested from a host, which is different from one where the private server runs. This flag cannot be used for DNS servers.
+
+- Stateful server:
+
+ - S={yes\|no}. The default is no. Indication of stateful server, which allows only dedicated socket (stateful) connections. This tag is not allowed for HTTP\* and DNS servers.
+
+- Secure server:
+
+ - $={yes\|no}. The default is no. Indication of the server to be used with secure connections only. For STANDALONE servers it means to use SSL, and for the HTTP\* ones – to use the HTTPS protocol.
+
+- Content type indication:
+
+    - C=type/subtype [no default] specifies the Content-Type (including encoding) which the server accepts. The value of this flag gets added automatically to any HTTP packet sent to the server by the SERVICE connector. However, in order to communicate, a client still has to know and generate the data type accepted by the server, i.e. a protocol which the server understands. This flag just helps ensure that HTTP packets all get the proper content type, as defined in the service configuration. This tag is not allowed in DNS server specifications.
+
+- Bonus coefficient:
+
+    - B=double [0.0 = default] specifies a multiplicative bonus given to a server run locally, when calculating the reachability rate. Special rules apply to negative/zero values: 0.0 means not to use the described rate increase at all (the default rate calculation is used, which only slightly increases the rates of locally run servers). A negative value denotes that the locally run server should be chosen first, regardless of its rate, provided that its rate is larger than the percentage (expressed by the absolute value of this coefficient) of the average rate of the other servers for the same service. That is, -5 instructs to ignore the locally run server only if its status is less than 5% of the average status of the remaining servers for the same service.
+
+- Validity period:
+
+ - T=integer [0 = default] specifies the time in seconds this server entry is valid without update. (If equal to 0 then defaulted by the LBSM Daemon to some reasonable value.)
+
+Server descriptors of type ***NAMEHOLD*** are special. As **`arguments`**, they have only a server type keyword. The namehold specification informs the daemon that the service of this name and type is not to be defined later in any configuration file except for the current one. Also, if the host (and/or port) is specified, then this protection works only for the service name on the particular host (and/or port).
+
+***Note:*** it is recommended that a dummy port number (such as :0) always be put in the namehold specifications to avoid ambiguities with treating the server type as a host name. The following example prevents **`TestService`** of type ***DNS*** from being defined in any other configuration file included later, and likewise protects **`TestService2`** as an **NCBID** service on host foo:
+
+ TestService NAMEHOLD :0 DNS
+ TestService2 NAMEHOLD foo:0 NCBID
+
+
+
+#### Sites
+
+LBSMD is minimally aware of the NCBI network layout and can generally guess its “site” information from either an IP address or special location-role files located in the /etc/ncbi directory: a BE-MD production and development site, a BE-MD.QA site, a BE-MD.TRY site, and lastly an ST-VA site. When reading zone information from the “@” directive of the configuration, LBSMD can treat special non-numeric values as follows: “@try” as the production zone within BE-MD.TRY, “@qa” as the production zone within BE-MD.QA, “@dev” as a development zone within the current site, and “@\*prod\*” (e.g. @intprod) as a production zone within the current site – where the production zone has a value of “1” and the development zone a value of “2”: so “@2” and “@dev”, as well as “@1” and “@\*prod\*”, are each equivalent. That makes it convenient to define zones via the %include directive with the pipe character:
+
+ %include /etc/ncbi/role |@ # define zone via the role file
+
+Suppose that the daemon detected its site as ST-VA and assigned it a value of 0x300; then the above directive assigns the current zone the value of 0x100 if the file reads “prod” or “1”, and zone 0x200 if the file reads “dev” or “2”. Note that if the file reads either “try” or “qa”, or “4”, the implied “@” directive will flag an error because of the mismatch between the resultant zone and the current site values.
+
+Both zone and site (or site alone) can be permanently assigned with the command-line parameters and then may not be overridden from the configuration file(s).
+
+
+
+#### Signals
+
+The table below describes the LBSMD daemon signal processing.
+
+
+
+| Signal | Reaction |
+|---------|---------------------------------------------------------------------------------------------------------------------------|
+| SIGHUP | reload the configuration |
+| SIGINT | quit |
+| SIGTERM | quit |
+| SIGUSR1 | toggle the verbosity level between less verbose (default) and more verbose (when every warning generated is stored) modes |
+
+
+
+
+
+#### Automatic Configuration Distribution
+
+The configuration file structure is unified for all the hosts in the NCBI network. It is shown in the figure below.
+
+[](/book/static/img/ch_app_lbsmd_cfg_structure.png "Click to see the full-resolution image")
+
+Figure 9. LBSMD Configuration Files Structure
+
+The prefix `/etc/lbsmd`, which is common to all the configuration files, is omitted in the figure. The arrows on the diagram show how the files are included.
+
+The files `servrc.cfg` and `servrc.cfg.systems` have a fixed structure and should not be changed at all. The file `local/servrc.cfg.systems` is meant to be modified by the systems group, while the file `local/servrc.cfg.ieb` is meant to be modified by the delegated members of the respective groups. To make changes easier, all the `local/servrc.cfg.ieb` files from all the hosts in the NCBI network are stored in a centralized SVN repository. The repository can be checked out by issuing the following command:
+
+`svn co svn+ssh://subvert.be-md.ncbi.nlm.nih.gov/export/home/LBSMD_REPO`
+
+The file names in that repository match the following pattern:
+
+`hostname.{be-md|st-va}[.qa]`
+
+where `be-md` is used for the Bethesda, MD site and `st-va` is used for the Sterling, VA site. The optional `.qa` suffix is used for quality assurance department hosts.
+
+So, if it is required to change the `/etc/lbsmd/local/servrc.cfg.ieb` file on the sutils1 host in Bethesda, then the `sutils1.be-md` file is to be changed in the repository.
+
+As soon as the modified file is checked in, it will automatically be delivered to the corresponding host under the proper name. The changes will take effect in a few minutes. The process of the configuration distribution is illustrated in the figure below.
+
+[](/book/static/img/CFEngine.jpg "Click to see the full-resolution image")
+
+Figure 10. Automatic Configuration Distribution
+
+
+
+#### Monitoring and Control
+
+
+
+##### Service Search
+
+The following web page can be used to search for a service:
+
+
+
+The following screen will appear:
+
+[](/book/static/img/LBSMDSearchMain.gif "Click to see the full-resolution image")
+
+Figure 11. NCBI Service Search Page
+
+As an example of usage, a user might enter a partial service name such as "TaxService" and click the “Go” button. The search results will display "TaxService", "TaxService3" and "TaxService3Test" if those services are available (see ).
+
+
+
+##### lbsmc Utility
+
+Another way of monitoring the LBSMD daemon is to use the lbsmc utility. The utility periodically dumps onto the screen a table which represents the current contents of the LBSMD daemon table. The utility output can be controlled by a number of command line options. The full list of available options and their descriptions can be obtained by issuing the following command:
+
+`lbsmc -h`
+
+The NCBI intranet users can also get the list of options by clicking on this link: .
+
+For example, to print a list of hosts whose names match the pattern “sutil\*”, the user can issue the following command:
+
+ >./lbsmc -h sutil* 0
+ LBSMC - Load Balancing Service Mapping Client R100432
+ 03/13/08 16:20:23 ====== widget3.be-md.ncbi.nlm.nih.gov (00:00) ======= [2] V1.2
+ Hostname/IPaddr Task/CPU LoadAv LoadBl Joined Status StatBl
+ sutils1 151/4 0.06 0.03 03/12 13:04 397.62 3973.51
+ sutils2 145/4 0.17 0.03 03/12 13:04 155.95 3972.41
+ sutils3 150/4 0.20 0.03 03/12 13:04 129.03 3973.33
+ --------------------------------------------------------------------------------
+ Service T Type Hostname/IPaddr:Port LFS B.Rate Coef Rating
+ bounce +25 NCBID sutils1:80 L 1000.00 397.62
+ bounce +25 HTTP sutils1:80 1000.00 397.62
+ bounce +25 NCBID sutils2:80 L 1000.00 155.95
+ bounce +25 HTTP sutils2:80 1000.00 155.95
+ bounce +27 NCBID sutils3:80 L 1000.00 129.03
+ bounce +27 HTTP sutils3:80 1000.00 129.03
+ dispatcher_lb 25 DNS sutils1:80 1000.00 397.62
+ dispatcher_lb 25 DNS sutils2:80 1000.00 155.95
+ dispatcher_lb 27 DNS sutils3:80 1000.00 129.03
+ MapViewEntrez 25 STANDALONE sutils1:44616 L S 1000.00 397.62
+ MapViewEntrez 25 STANDALONE sutils2:44616 L S 1000.00 155.95
+ MapViewEntrez 27 STANDALONE sutils3:44616 L S 1000.00 129.03
+ MapViewMeta 25 STANDALONE sutils2:44414 L S 0.00 0.00
+ MapViewMeta 27 STANDALONE sutils3:44414 L S 0.00 0.00
+ MapViewMeta 25 STANDALONE sutils1:44414 L S 0.00 0.00
+ sutils_lb 25 DNS sutils1:80 1000.00 397.62
+ sutils_lb 25 DNS sutils2:80 1000.00 155.95
+ sutils_lb 27 DNS sutils3:80 1000.00 129.03
+ TaxService 25 NCBID sutils1:80 1000.00 397.62
+ TaxService 25 NCBID sutils2:80 1000.00 155.95
+ TaxService 27 NCBID sutils3:80 1000.00 129.03
+ TaxService3 +25 HTTP_POST sutils1:80 1000.00 397.62
+ TaxService3 +25 HTTP_POST sutils2:80 1000.00 155.95
+ TaxService3 +27 HTTP_POST sutils3:80 1000.00 129.03
+ test +25 HTTP sutils1:80 1000.00 397.62
+ test +25 HTTP sutils2:80 1000.00 155.95
+ test +27 HTTP sutils3:80 1000.00 129.03
+ testgenomes_lb 25 DNS sutils1:2441 1000.00 397.62
+ testgenomes_lb 25 DNS sutils2:2441 1000.00 155.95
+ testgenomes_lb 27 DNS sutils3:2441 1000.00 129.03
+ testsutils_lb 25 DNS sutils1:2441 1000.00 397.62
+ testsutils_lb 25 DNS sutils2:2441 1000.00 155.95
+ testsutils_lb 27 DNS sutils3:2441 1000.00 129.03
+ --------------------------------------------------------------------------------
+ * Hosts:4\747, Srvrs:44/1223/23 | Heap:249856, used:237291/249616, free:240 *
+ LBSMD PID: 17530, config: /etc/lbsmd/servrc.cfg
+
+
+
+##### NCBI Intranet Web Utilities
+
+The NCBI intranet users can also visit the following quick reference links:
+
+- Dead servers list:
+
+- Search engine for all available hosts, all services and database affiliation:
+
+If the lbsmc utility is run with the -f option then the output contains two parts:
+
+- The host table. The table is accompanied by raw data which are printed in the order they appear in the LBSMD daemon table.
+
+- The service table
+
+The output is provided in either long or short format. The format depends on whether the -w option was specified in the command line (the option requests the long (wide) output). The wide output occupies about 132 columns, while the short (normal) output occupies only 80, which is the standard terminal width.
+
+If the service name is longer than the allowed number of characters to display, the trailing characters are replaced with “\>”. When there is more information about the host / service to be displayed, the “+” character is put beside the host / service name (this additional information can be retrieved by adding the -i option). When both “+” and “\>” are to be shown, they are replaced with the single character “\*”. In the wide-output format, the “\#” character shown in the service line means that there is no host information available for the service (similar to the static servers). The “!” character in the service line denotes that the service was configured / stored with an error (this character should never actually appear in the listings and should be reported whenever encountered). Wide output for hosts contains the time of bootup and startup. If the startup time is preceded by the “~” character, then the host was gone for a while and then came back while the lbsmc utility was running. The “+” character in the times shows that the date belongs to the past year(s).
+
+
+
+##### Server Penalizer API and Utility
+
+The utility allows reporting problems with accessing a certain server to the LBSMD daemon, in the form of a penalty: a value in the range [0..100] that shows, as a percentage, how bad the server is. The value 0 means that the server is completely okay, whereas 100 means that the server (is misbehaving and) should **not** be used at all. The penalty is not a constant value: once set, it starts to decrease in time, at first slowly, then faster and faster until it reaches zero. This way, if a server was penalized for some reason and later the problem has been resolved, then the server becomes available gradually as its penalty (not being reset by applications again in the absence of the offending reason) becomes zero. The figure below illustrates how the value of the penalty behaves.
+
+[](/book/static/img/Penalty.jpg "Click to see the full-resolution image")
+
+Figure 12. Penalty Value Characteristics
+
+Technically, the penalty is maintained by the daemon that has the server configured; the penalty may initially be received by a certain host that is different from the one where the server was put into the configuration file. In that case the penalty first migrates to the configuring host, and then the daemon on that host announces that the server was penalized.
+
+***Note:*** Once a daemon is restarted, the penalty information is lost.
+
+[Service mapping API](ch_conn.html#ch_conn.service_mapping_api) has a call `SERV_Penalize()`, declared in `connect/ncbi_service.h`, which can be used to set the penalty for the last server obtained from the mapping iterator.
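+
+A minimal sketch of how this call might be used (not a verbatim toolkit example: error handling and `SConnNetInfo` setup are omitted, and the service name is illustrative):
+
+    #include <connect/ncbi_service.h>
+
+    /* Fully penalize the server last returned by the iterator for "TaxService" */
+    void PenalizeLastServer(void)
+    {
+        SERV_ITER iter = SERV_Open("TaxService", fSERV_Any,
+                                   0/*no preferred host*/, 0/*no net_info*/);
+        if (iter) {
+            const SSERV_Info* info = SERV_GetNextInfo(iter);
+            if (info) {
+                /* 100.0 == full penalty: exclude this server from mapping */
+                SERV_Penalize(iter, 100.0);
+            }
+            SERV_Close(iter);
+        }
+    }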
+
+For script files (similar to the ones used to start/stop servers), there is a dedicated utility program called `lbsm_feedback`, which sets the penalty from the command line. This command should be used with extreme care because it affects the load-balancing mechanism substantially.
+
+**lbsm\_feedback** is a part of the LBSM set of tools installed on all hosts which run **LBSMD**. As explained above, penalizing makes a server less favorable as a choice of the load balancing mechanism. Because the full penalty of 100% makes a server completely unavailable to clients, at the time when the server is about to shut down (restart), it is wise to increase the server penalty to the maximal value, i.e. to exclude the server from the service mapping. (Otherwise, the LBSMD daemon might not immediately notice that the server is down and may continue dispatching to that server.) Usually, the penalty takes at most 5 seconds to propagate to all participating network hosts. Before an actual server shutdown, the following sequence of commands can be used:
+
+ > /opt/machine/lbsm/sbin/lbsm_feedback 'Servicename STANDALONE host 100 120'
+ > sleep 5
+ now you can shutdown the server
+
+The effect of the above is to set the maximal penalty 100 for the service Servicename (of type ***STANDALONE***) running on host **`host`** for at least 120 seconds. After 120 seconds the penalty value will start going down steadily and at some stage the penalty becomes 0. The default hold time equals 0. It takes some time to deliver the penalty value to the other hosts on the network so ‘sleep 5’ is used. Please note the single quotes surrounding the penalty specification: they are required in a command shell because **lbsm\_feedback** takes only one argument which is the entire penalty specification.
+
+As soon as the server is down, the **LBSMD** daemon detects it in a matter of several seconds (if not instructed otherwise by the configuration file) and then does not dispatch to the server until it is back up. In some circumstances, the following command may come in handy:
+
+ > /opt/machine/lbsm/sbin/lbsm_feedback 'Servicename STANDALONE host 0'
+
+The command resets the penalty to 0 (no penalty) and is useful when, as for the previous example, the server is restarted and ready in less than 120 seconds, but the penalty is still held high by the **LBSMD** daemon on the other hosts.
+
+The formal description of the lbsm\_feedback utility parameters is given below.
+
+[](/book/static/img/lbsm_feedback.gif "Click to see the full-resolution image")
+
+Figure 13. lbsm\_feedback Arguments
+
+The `servicename` can be an identifier with ‘\*’ for any symbols and / or ‘?’ for a single character. The `penalty value` is an integer value in the range 0 ... 100. The `port number` and `time` are integers. The `hostname` is an identifier and the `rate value` is a floating point value.
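+
+For example (hypothetical service and host names, following the argument layout shown above and in the earlier examples), the following command would set a 50% penalty for at least 60 seconds on every ***STANDALONE*** service whose name matches the pattern:
+
+    > /opt/machine/lbsm/sbin/lbsm_feedback 'TaxService* STANDALONE host 50 60'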
+
+
+
+#### SVN Repository
+
+The SVN repository where the LBSMD daemon source code is located can be retrieved by issuing the following command:
+
+`svn co https://svn.ncbi.nlm.nih.gov/repos/toolkit/trunk/c++`
+
+The daemon code is in this file:
+
+`c++/src/connect/daemons/lbsmd.c`
+
+
+
+#### Log Files
+
+The LBSMD daemon stores its log files at the following location:
+
+`/var/log/lbsmd`
+
+The file is formed locally on the host where the LBSMD daemon is running. The log file size is limited to prevent the disk from being flooded with messages. A standard log rotation is applied to the log file, so you may see the files:
+
+`/var/log/lbsmd.X.gz`
+
+where X is the number of the previous log file.
+
+The log file size can be controlled by the -s command line option. By default, -s 0 is the active flag, which provides a way to create (if necessary) and to append messages to the log file with no limitation on the file size whatsoever. The -s -1 switch instructs indefinite appending to the log file, which must exist. Otherwise, log messages are not stored. -s positive\_number restricts the ability to create (if necessary) and to append to the log file until the file reaches the specified size in kilobytes. After that, message logging is suspended, and subsequent messages are discarded. Note that the limiting file size is only approximate, and sometimes the log file can grow slightly bigger. The daemon keeps track of log files and leaves a final logging message, either when switching from one file to another, in case the file has been moved or removed, or when the file size has reached its limit.
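+
+For example (assuming the daemon binary is invoked as `lbsmd`), starting the daemon with
+
+    lbsmd -s 10240
+
+would let the log file grow to approximately 10 MB (10240 KB), after which further messages are discarded.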
+
+NCBI intranet users can get a few (no more than 100) recent lines of the log file on an NCBI internal host. It is also possible to visit the following link:
+
+
+
+
+
+#### Configuration Examples
+
+Here is an example of an LBSMD configuration file:
+
+ # $Id$
+ #
+ # This is a configuration file of new NCBI service dispatcher
+ #
+ #
+ # DBLB interface definitions
+ %include /etc/lbsmd/servrc.cfg.db
+ # IEB's services
+ testHTTP /Service/test.cgi?Welcome L=no
+ Entrez2[0] HTTP_POST www.ncbi.nlm.nih.gov /entrez/eutils/entrez2server.fcgi \
+ C=x-ncbi-data/x-asn-binary L=no
+ Entrez2BLAST[0] HTTP_POST www.ncbi.nlm.nih.gov /entrez/eutils/entrez2server.cgi \
+ C=x-ncbi-data/x-asn-binary L=yes
+ CddSearch [0] HTTP_POST www.ncbi.nlm.nih.gov /Structure/cdd/c_wrpsb.cgi \
+ C=application/x-www-form-urlencoded L=no
+ CddSearch2 [0] HTTP_POST www.ncbi.nlm.nih.gov /Structure/cdd/wrpsb.cgi \
+ C=application/x-www-form-urlencoded L=no
+ StrucFetch [0] HTTP_POST www.ncbi.nlm.nih.gov /Structure/mmdb/mmdbsrv.cgi \
+ C=application/x-www-form-urlencoded L=no
+ bounce[60]HTTP /Service/bounce.cgi L=no C=x-ncbi-data/x-unknown
+ # Services of old dispatcher
+ bounce[60]NCBID '' L=yes C=x-ncbi-data/x-unknown | \
+ ..../web/public/htdocs/Service/bounce
+
+NCBI intranet users can also visit the following link to get a sample configuration file:
+
+
+
+
+
+### Database Load Balancing
+
+Database load balancing is an important part of the overall load balancing function. Please see the [Database Load Balancer](ch_dbapi.html#ch_dbapi.Database_loadbalanci) section in the [Database Access](ch_dbapi.html) chapter for more details.
+
+
+
+### Cookie / Argument Affinity Module (MOD\_CAF)
+
+
+
+#### Overview
+
+The cookie / argument affinity module (CAF module in the further discussion) helps to virtualize and dispatch a web site by modifying the way Apache resolves host names. It is done by superseding the conventional `gethostbyname*()` API. The CAF module is implemented as an Apache web server module and uses the data collected by the LBSMD daemon to decide how to dispatch a request. The data exchange between the CAF module and the LBSMD daemon is done via a shared memory segment, as shown in the figure below.
+
+[](/book/static/img/CAF-LBSMD.gif "Click to see the full-resolution image")
+
+Figure 14. CAF Module and LBSMD daemon data exchange
+
+The LBSMD daemon stores all the collected data in a shared memory segment and the CAF module is able to read data from that segment.
+
+The CAF module looks for special cookies and query line arguments, and analyses the LBSMD daemon data to resolve special names which can be configured in ProxyPass directives of mod\_proxy.
+
+The CAF module maintains a list of proxy names, cookies, and arguments (either 4 predefined, see below, or set forth via the Apache configuration file by CAF directives) associated with cookies. Once a URL is translated to the use of one of the proxies (generally, by ProxyPass of mod\_proxy), the information from the related cookie (if any) and argument (if any) is used to find the best matching real host that corresponds to the proxy. Damaged cookies and arguments, if found in the incoming HTTP request, are ignored.
+
+A proxy here means a special host name consisting of a label followed by the string ".lb", followed by an optional domain part. Such names trigger the gethostbyname() substitute, supplied by the module, to consult the load-balancing daemon's table and to use both the constraints on the arguments and the preferred host information, found in the query string and the cookie, respectively.
+
+For example, the name "pubmed.lb.nlm.nih.gov" is an LB proxy name, which would be resolved by looking for special DNS services ("pubmed\_lb" in this example) provided by the LBSMD daemon. Argument matching (see also a separate section below) is done by searching the host environment of target hosts (corresponding to the LB proxy name) as supplied by the LBSMD daemon. That is, "db=PubMed" (to achieve PubMed database affinity) in the query that transforms into a call to an LB proxy, which in turn is configured to use the argument "DB", instructs to search only those target hosts that declare the proxy and have "db=... PubMed ..." configured in their LBSMD environments (and yet to remember to accommodate, if it is possible, a host preference from the cookie, if any found in the request).
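+
+A hypothetical Apache configuration fragment tying these pieces together (the directive names are taken from the Configuration section below; all proxy, cookie, and path names are illustrative) might look like this:
+
+    ProxyPass         /entrez/   http://pubmed.lb/entrez/
+    CAFProxyCookie    pubmed.lb  MyPubMedCookie
+    CAFProxyArgument  pubmed.lb  DB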
+
+The CAF module also detects internal requests and allows them to use the entire set of hosts that the LB names are resolved to. For external requests, only hosts whose DNS services are not marked local (L=yes, or implicitly, by lacking the "-i" flag in the LBSMD daemon launch command) will be allowed to serve requests. The "HTTP\_CAF\_PROXIED\_HOST" environment is supplied (by means of an HTTP header tag named "`CAF-Proxied-Host`") to contain the address of the actual host that posted the request. Impostor's header tags (if any) of this name are always stripped, so that backends always have correct information about the requesters. Note that all internal requests are trusted, so that an internal resource can make a request to execute on behalf of an outside client by providing its IP in the "`Client-Host`" HTTP header. The "`Client-Host`" tag gets through for internal requests only; to maintain security the tag is dropped for all external requests.
+
+The CAF module has its own status page that can be made available in a look somewhat resembling the Apache status page. The status can be either raw or HTML-formatted, and the latter can also be sorted by the columns of interest. Stats are designed to be fast, but sometimes inaccurate (to avoid interlocking, and thus latencies in request processing, no mutexes are used except for the table expansion). Stats are accumulated between server restarts (and for Apache 2.0 can survive graceful restarts, too). When the stat table is full (since it has a fixed size), it is cleaned in a way that frees room for 1% of its capacity, yet tries to preserve most of the recent activity as well as most of the heavily used stats from the past. There are two cleaning algorithms currently implemented, and they can be tuned to some degree by means of the `CAFexDecile`, `CAFexPoints`, and `CAFexSlope` directives, which are described below.
+
+The CAF module can also report the number of slots that the Apache server has configured and used up each time a new request comes in and is being processed. The information resides in a shared memory segment that several Apache servers can use cooperatively on the same machine. Formerly, this functionality has been implemented in a separate SPY module, which is now fully integrated into this module. Using a special compile-time macro it is possible to obtain the former SPY-only functionality (now called LBSMD reporter feature) without any other CAF features. Note that no CAF\* directives will be recognized in Apache configuration, should the reduced functionality build be chosen.
+
+
+
+#### Configuration
+
+The table below describes Apache configuration directives which are taken into account by the CAF module.
+
+
+
+| Directive | Description |
+|-------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------|
+| LBSMD { On \\| Off } | It can appear outside any paired section of the configuration file, and enables ["On", default in mod\_spy mode] or disables ["Off", default in full-fledged mod\_caf mode] the LBSMD reporter feature. When the module is built exclusively with the LBSMD reporter feature, this is the only directive, which is available for the use by the module. Please note that the directive is extremely global, and works across configuration files. Once "Off" is found throughout the configuration, it takes the effect. |
+| CAF { On \\| Off } | It can appear outside any paired section of the configuration file, and enables ["On", default] or disables ["Off"] the entire module. Please note that this directive is extremely global, and works across Apache configuration files, that is the setting "Off" anywhere in the configuration causes the module to go out of business completely. |
+| CAFQAMap name path | It can appear outside any paired section of the configuration file but only once in the entire set of the configuration files per "name", and if used, defines a path to the map file, which is to be loaded at the module initialization phase (if the path is relative, it specifies the location with respect to the daemon root prefix as defined at the time of the build, much like other native configuration locations do). The file is a text, line-oriented list (w/o line continuations). The pound symbol (\#) at any position introduces a comment (which is ignored by the parser). Any empty line (whether resulted from cutting off a comment, or just blank by itself) is skipped. Non-empty lines must contain a pair of words, delimited by white space(s) (that is, tab or space character(s)). The first word defines an LB group that is to be replaced with the second word, in the cases when the first word matches the LB group used in proxy passing of an internally-originating request. The matching is done by previewing a cookie named "name" that should contain a space-separated list of tokens, which must comprise a subset of names loaded from the left-hand side column of the QA file. Any unmatched token in the cookie will result the request to fail, so will do any duplicate name. That is, if the QA map file contains a paired rule "tpubmed tpubmedqa", and an internal (i.e. originating from within NCBI) request has the NCBIQA cookie listing "tpubmed", then the request that calls for use of the proxy-pass "tpubmed.lb" will actually use the name "tpubmedqa.lb" as if it appeared in the ProxyPass rule of mod\_proxy. Default is not to load any QA maps, and not to proceed with any substitutions. Note that if the module is disabled (CAF Off), then the map file, even if specified, need not to exist, and won't be loaded. |
+| CAFFailoverIP address | It defines hostname / IP to return on LB proxy names that cannot be resolved. Any external requests and local ones, in which argument affinity has to be taken into account, will fall straight back to use this address whenever the LB name is not known or LBSMD is not operational. All other requests will be given a chance to use regular DNS first, and if that fails, then fall back to use this IP. When the failover IP address is unset, a failed LB proxy name generally causes the Apache server to throw either "Bad gateway" (502) or "Generic server error" (500) to the client. This directive is global across the entire configuration, and the last setting takes the actual effect. |
+| CAFForbiddenIP address | It is similar to CAFFailoverIP described above yet applies only to the cases when the requested LB DNS name exists but cannot be returned as it would cause the name access violation (for example, an external access requires an internal name to be used to proxy the request). Default is to use the failover IP (as set by CAFFailoverIP), if available. |
+| CAFThrottleIP address | It is similar to CAFFailoverIP described above but applies only to abusive requests that should be throttled out. Despite this directive exists, the actual throttling mechanism is not yet in production. Default is to use the failover IP (as set by CAFFailoverIP), if available. |
+| CAFBusyIP address | It is similar to CAFFailoverIP described above but gets returned to clients when it is known that the proxy otherwise serving the request is overloaded. Default is to use the failover IP, if available. |
+| CAFDebug { Off \\| On \\| 2 \\| 3 } | It controls whether to print none ("Off"), some ("On"), more ("2"), or all ("3") debugging information into Apache log file. Per-request logging is automatically on when debugging is enabled by the native LogLevel directive of Apache (LogLevel debug), or with a command line option -e (Apache 2). This directive controls whether mod\_caf produces additional logging when doing maintenance cleaning of its status information (see CAFMaxNStats below). Debug level 1 (On) produces cleanup synopsis and histogram, level 2 produces per-stat eviction messages and the synopsis, and debug level 3 is a combination of the above. Default is "Off". The setting is global, and the last encounter has the actual effect. NOTE: per-stat eviction messages may cause latencies in request processing; so debug levels "2" and "3" should be used carefully, and only when actually needed. |
+| CAFTiming { Off \\| On \\| TOD } | It controls whether the module timing profile is done while processing requests. For this to work, though, CAFMaxNStats must first enable collection of statistics. Module's status page then will show how much time is being spent at certain stages of a request processing. Since proxy requests and non-proxy requests are processed differently they are accounted separately. "On" enables to make the time marks using the gettimeofday(2) syscall (accurate up to 1us) without reset upon each stat cleanup (note that tick count will wrap around rather frequently). Setting "TOD" is same as "On" but extends it so that counts do get reset upon every cleanup. Default is "Off". The setting is global, and the last encounter in the configuration file has the actual effect. |
+| CAFMaxNStats number | The number defines how many statistics slots are allocated for CAF status (aka CAF odometer). Value "0" disables the status page at all. Value "-1" sets default number of slots (which currently corresponds to the value of 319). Note that the number only sets a lower bound, and the actual number of allocated slots may be automatically extended to occupy whole number of pages (so that no "memory waste" occurs). The actual number of stats (and memory pages) is printed to the log file. To access the status page, a special handler must be installed for a designated location, as in the following example: `` ` SetHandler CAF-status` ` Order deny,allow` ` Deny from all` ` Allow from 130.14/16` `` 404 (Document not found) gets returned from the configured location if the status page has been disabled (number=0), or if it malfunctions. This directive is global across the entire configuration, and the last found setting takes the actual effect. CAF stats can survive server restarts [graceful and plain "restart"], but not stop / start triggering sequence. Note: "CAF Off" does not disable the status page if it has been configured before -- it just becomes frozen. So [graceful] restart with "CAF Off" won't prevent from gaining access to the status page, although the rest of the module will be rendered inactive. |
+| CAFUrlList url1 url2 ... | By default, CAF status does not distinguish individual CGIs as they are being accessed by clients. This option allows separating statistics on a per-URL basis. Care must be taken to remember of "combinatorial explosion", and thus the appropriate quantity of stats is to be pre-allocated with CAFMaxNStats if this directive is used, or else the statistics may renew too often to be useful. Special value "\*" allows to track every (F)CGI request by creating individual stat entries for unique (F)CGI names (with or without the path part, depending on a setting of CAFStatPath directive, below). Otherwise, only those listed are to be accounted for, leaving all others to accumulate into a nameless stat slot. URL names can have .cgi or .fcgi file name extensions. Alternatively, a URL name can have no extension to denote a CGI, or a trailing period (.) to denote an FCGI. A single dot alone (.) creates a specially named stat for all non-matching CGIs (both .cgi or .fcgi), and collects all other non-CGI requests in a nameless stat entry. (F)CGI names are case sensitive. When path stats are enabled (see CAFStatPath below), a relative path entry in the list matches any (F)CGI that has the trailing part matching the request (that is, "query.fcgi" matches any URL that ends in "query.fcgi", but "/query.fcgi" matches only the top-level ones). There is an internal limit of 1024 URLs that can be explicitly listed. Successive directives add to the list. A URL specified as a minus sign alone ("-") clears the list, so that no urls will be registered in stats. This is the default. This directive is only allowed at the top level, and applies to all virtual hosts. |
+| CAFUrlKeep url1 url2 ... | CAF status uses a fixed-size array of records to store access statistics, so whenever the table gets full, it has to be cleaned up by dropping some entries, which have not been updated too long, have fewer count values, etc. The eviction algorithm can be controlled by CAFexDecile, CAFexPoints, and CAFexSlope directives, described below, but even when finely tuned, can result in some important entries being pre-emptied, especially when per-URL stats are enabled. This directive helps avoid losing the important information, regardless of other empirical characteristics of a candidate-for-removal. The directive, like CAFUrlList above, lists individual URLs which, once recorded, have to be persistently kept in the table. Note that as a side effect, each value (except for "-") specified in this directive implicitly adds an entry as if it were specified with CAFUrlList. Special value "-" clears the keep list, but does not affect the URL list, so specifying "CAFUrlKeep a b -" is same as specifying "CAFUrlList a b" alone, that is, without obligation for CAF status to keep either "a" or "b" permanently. There is an internal limit of 1024 URLs that can be supplied by this directive. Successive uses add to the list. The directive is only allowed at the top level, and applies to all virtual hosts. |
+| CAFexDecile digit | It specifies the top decile(s) of the total number of stat slots, sorted by the hit count and subject for expulsion, which may not be made available for stat's cleanup algorithms should it be necessary to arrange a new slot by removing old/stale entries. Decile is a single digit 0 through 9, or a special value "default" (which currently translates to 1). Note that each decile equals 10%. |
+| CAFexPoints { value \\| percentage% } | The directive specifies how many records, as an absolute value, or as a percentage of total stat slots, are to be freed each time the stat table gets full. Keyword "default" also can be used, which results in eviction of 1% of all records (or just 1 record, whatever is greater). Note that if CAFUrlKeep is in use, the cleanup may not be always possible. The setting is global and the value found last takes the actual effect. |
+| CAFexSlope { value \\| "quad" } | The directive can be used to modify cleanup strategy used to vacate stat records when the stat table gets full. The number of evicted slots can be controlled by CAFexPoints directive. The value, which is given by this directive, is used to plot either circular ("quad") or linear (value \>= 0) plan of removal. The linear plan can be further fine-tuned by specifying a co-tangent value of the cut-off line over a time-count histogram of statistics, as a binary logarithm value, so that 0 corresponds to the co-tangent of 1 (=2^0), 1 (default) corresponds to the co-tangent of 2 (=2^1), 2 - to the co-tangent of 4 (=2^2), 3 - to 8 (=2^3), and so forth, up to a maximal feasible value 31 (since 2^32 overflows an integer, this results in the infinite co-tangent, causing a horizontal cut-off line, which does not take into account times of last updates, but counts only). The default co-tangent (2) prices the count of a stats twice higher than its longevity. The cleanup histogram can be viewed in the log if CAFDebug is set as 2 (or 3). The setting is global and the value found last takes the actual effect. |
+| CAFStatVHost { Off \\| On } | It controls whether VHosts of the requests are to be tracked on the CAF status page. By default, VHost separation is not done. Note that preserving graceful restart of the server may leave some stats VHost-less, when switching from VHost-disabled to VHost-enabled mode, with this directive. The setting is global and the setting found last has the actual effect. |
+| CAFStatPath { On \\| Off } | It controls whether the path part of URLs is to be stored and shown on the CAF status page. By default, the path portion is stripped. Keep in mind the relative path specifications as given in CAFUrlList directive, as well as the number of possible combinations of Url/VHost/Path, that can cause frequent overflows of the status table. When CAFStatPath is "Off", the path elements are stripped from all URLs provided in the CAFUrlList directive (and merging the identical names, if any result). This directive is global, and the setting found last having the actual effect. |
+| CAFOkDnsFallback { On \\| Off } | It controls whether it is okay to fallback for consulting regular DNS on the unresolved names, which are not constrained with locality and/or affinities. Since shutdown of SERVNSD (which provided the fake .lb DNS from the load balancer), fallback to system DNS looks painfully slow (at it has now, in the absence of the DNS server, to reach the timeout), so the default for this option is "Off". The setting is global, and the value found last takes the actual effect. |
+| CAFNoArgOnGet { On \\| Off } | It can appear outside any paired section of the configuration, "On" sets to ignore argument assignment in GET requests that don't have explicit indication of the argument. POST requests are not affected. Default is "Off", VHost-specific. |
+| CAFArgOnCgiOnly { On \\| Off } | It controls whether argument is taken into account when an FCGI or CGI is being accessed. Default is "Off". The setting is per-VHost specific. |
+| CAFCookies { Cookie \\| Cookie2 \\| Any } | It instructs what cookies to search for: "Cookie" stands for RFC2109 cookies (aka Netscape cookies), this is the default. "Cookie2" stands for new RFC2965 cookies (new format cookies). "Any" allows searching for both types of cookies. This is a per-server option that is not shared across virtual host definitions, and allowed only outside any \ or \. Note that, according to the standard, cookie names are not case-sensitive. |
+| CAFArgument argument | It defines the argument name to look for in the URLs. There is no default. If set, the argument becomes the default for any URL and also for proxies whose arguments are not explicitly set with CAFProxyArgument directives. The argument is special case-sensitive: first, it is looked up "as-is" and, if that fails, then in all uppercase. This directive can appear outside any paired section of the configuration and applies to virtual hosts (if any) independently. |
+| CAFHtmlAmp { On \\| Off } | It can appear outside any paired section of the configuration; "On" enables recognizing the HTML-encoded ampersand (`&amp;`) in request URLs (caution: such encoding in URLs is not standard-conforming). Default is "Off". The setting is VHost-specific. |
+| CAFProxyCookie proxy cookie | It establishes a correspondence between LB DNS named proxy and a cookie. For example, "CAFProxyCookie pubmed.lb MyPubMedCookie" defines that "MyPubMedCookie" should be searched for preferred host information when "pubmed.lb" is being considered as a target name for proxying the incoming request. This directive can appear anywhere in configuration, but is hierarchy complying. |
+| CAFProxyNoArgOnGet proxy { On \\| Off \\| Default } | The related description can be seen at the CAFNoArgOnGet directive description above. The setting applies only to the specified proxy. "Default" (default) is to use the global setting. |
+| CAFProxyArgOnCgiOnly proxy { On \\| Off \\| Default } | The related description can be seen at the CAFArgOnCgiOnly directive description above. The setting applies only to the specified proxy. "Default" (default) is to use the global setting. |
+| CAFProxyArgument proxy argument | It establishes a correspondence between LB DNS named proxy and a query line argument. This directive overrides any default that might have been set with global "CAFArgument" directive. Please see the list of predefined proxies below. The argument is special case sensitive: first, it is looked up "as-is" and, if that fails, in all uppercase then. The first argument occurrence is taken into consideration. It can appear anywhere in configuration, but is hierarchy complying. |
+| CAFProxyAltArgument proxy altargument | It establishes a correspondence between LB DNS named proxy and an alternate query line argument. The alternate argument (if defined) is used to search (case-insensitively) query string for the argument value, but treating the value as if it has appeared to argument set forth by CAFProxyArgument or CAFArgument directives for the location in question. If no alternate argument value is found, the regular argument search is performed. Please see the list of predefined proxies below. Can appear anywhere in configuration, but is hierarchy complying, and should apply for existing proxies only. Altargument "-" deletes the alternate argument (if any). Note again that unlike regular proxy argument (set forth by either CAFArgument (globally) or CAFProxyArgument (per-proxy) directives) the alternate argument is entirely case-insensitive. |
+| CAFProxyDelimiter proxy delimiter | It sets a one-character delimiter that separates the host[:port] field in the cookie, corresponding to the proxy, from other following information which is not pertinent to cookie affinity (see the illustration after this table). Default is '\\|'. No separation is performed on a cookie that does not have the delimiter -- the delimiter is then considered to be found past the end of the line. It can appear anywhere in the configuration, but is hierarchy complying. |
+| CAFProxyPreference proxy preference | It sets a preference (floating point number from the range [0..100]) that the proxy would have if a host matching the cookie is found. The preference value 0 selects the default value which is currently 95. It can appear anywhere in configuration, but is hierarchy complying. |
+| CAFProxyCryptKey proxy key | It sets a crypt key that should be used to decode the cookie. Default is the key preset when a cookie correspondence is created [via either "CAFProxyCookie" or "CAFProxyArgument"]. To disable cookie decrypting (e.g. if the cookie comes in as a plain text) use "". Can appear anywhere in configuration, but is hierarchy complying. |
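+
+For illustration, with the default delimiter an affinity cookie value (shown unencrypted; the cookie name, host, and trailing data are hypothetical) might look like this:
+
+    MyPubMedCookie=somehost:2440|extra-session-data
+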
+
+
+
+All hierarchy complying settings are inherited in directories that are deeper in the directory tree, unless overridden there. The new setting then takes effect for that and all descendant directories/locations.
+
+There are 4 predefined proxies that may be used [or operated on] without prior declaration by either "CAFProxyCookie" or "CAFProxyArgument" directives:
+
+
+
+|------------|-----------------|------------|-----------|----------|----------|----------|
+| LB name | CookieName | Preference | Delimiter | Crypted? | Argument | AltArg |
+| tpubmed.lb | LB-Hint-Pubmed | 95 | \\| | yes | db | \ |
+| eutils.lb | LB-Hint-Pubmed | 95 | \\| | yes | db | DBAF |
+| mapview.lb | LB-Hint-MapView | 95 | \\| | yes | \ | \ |
+| blastq.lb | LB-Hint-Blast | 95 | \\| | yes | \ | \ |
+
+
+
+***NOTE***: The same cookie can be used to tie up an affinity for multiple LB proxies. On the other hand, LB proxy names are all unique throughout the configuration file.
+
+***NOTE***: It is very important to keep in mind that arguments and alt-arguments are treated differently, case-wise. Alt-args are case-insensitive and are screened before the main argument (but appear as if the main argument had been found). On the other hand, main arguments are special case-sensitive and are checked twice: "as is" first, then in all CAPs. So having "DB" for the alt-argument and "db" for the main one hides the main argument and effectively makes it case-insensitive. CAF will warn when it detects that argument overloading is about to happen (take a look at the logs).
+
+The CAF module is also able to detect if a request comes from a local client. The `/etc/ncbi/local_ips` file describes the rules for making the decision.
+
+The file is line-oriented, i.e. it is supposed to have one IP spec per line. Comments are introduced by either "\#" or "!"; no continuation lines are allowed, and empty lines are ignored.
+
+An IP spec is a word (no embedded whitespace characters) and is either:
+
+- a host name or a legitimate IP address
+
+- a network specification in the form "networkIP / networkMask"
+
+- an IP range (explained below).
+
+A networkIP / networkMask specification can contain an IP prefix for the network (with or without all trailing zeroes present), and the networkMask can be either in CIDR notation or in the form of a full IP address (all 4 octets) expressing contiguous high-bit ranges (all the records below are equivalent):
+
+`130.14.29.0/24` `130.14.29/24` `130.14.29/255.255.255.0` `130.14.29.0/255.255.255.0`
+
+An IP range is an incomplete IP address (that is, having less than 4 full octets) followed by exactly one dot and one integer range, e.g.:
+
+`130.14.26.0-63`
+
+denotes a host range from `130.14.26.0` thru `130.14.26.63` (including the ends),
+
+`130.14.8-9`
+
+denotes a host range from `130.14.8.0` thru `130.14.9.255` (including the ends).
+
+***Note*** that `127/8` gets automatically added, whether or not it is explicitly included in the configuration file. The file loader also warns if it encounters any specifications that overlap each other. A nonexistent (or unreadable) file causes internal hardcoded defaults to be used; a warning is issued in this case.
+
+***Note*** that the IP table file is read once per Apache daemon's life cycle (and it is \*not\* reloaded upon graceful restarts). A complete stop / start sequence should be performed to force the IP table to be reloaded.
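+
+A minimal, purely illustrative `/etc/ncbi/local_ips` (all names and addresses below are made up for the example) might look like this:
+
+    # hosts and networks considered "local"
+    ! comments may start with "#" or "!"
+    somehost.example.org
+    130.14.29.7
+    # a network: CIDR and full-mask forms are equivalent
+    130.14.29.0/24
+    130.14.28/255.255.255.0
+    # an IP range (both ends inclusive)
+    130.14.26.0-63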
+
+
+
+#### Configuration Examples
+
+- To define that the "WebEnv" cookie carries information about the "pubmed.lb" preference in "/Entrez" and all the descendant directories, one can use the following:
+
+
+
+
+    <Directory "/Entrez">
+        CAFProxyCookie     pubmed.lb WebEnv
+        CAFProxyPreference pubmed.lb 100
+    </Directory>
+
+
+The second directive in the above example sets the preference to 100% -- this is a preference, not a requirement, meaning that using the host from the cookie is the most desirable outcome, but it is not a blind instruction to go to that host in every possible case.
+
+- To define a new cookie for some new LB name, the following fragment can be used:
+
+
+
+    <Directory "/SomeDir">
+        CAFProxyCookie myname.lb My-Cookie
+        CAFProxyCookie other.lb  My-Cookie
+    </Directory>
+
+    <Directory "/SomeDir/SubDir">
+        CAFProxyCookie myname.lb My-Secondary-Cookie
+    </Directory>
+
+
+The effect of the above is that "My-Cookie" will be used in LB name searches of "myname.lb" in directory "/SomeDir", but in "/SomeDir/SubDir" and all directories of that branch, "My-Secondary-Cookie" will be used instead. If a URL referred to "/SomeDir/AnotherDir", then "My-Cookie" would still be used.
+
+***Note*** that, at the same time, "My-Cookie" is used everywhere else under "/SomeDir" when "other.lb" is being resolved there.
+
+- The following fragment disables cookie for "tpubmed.lb" [note that no "CAFProxyCookie" is to precede this directive because "tpubmed.lb" is predefined]:
+
+
+
+ CAFProxyPreference tpubmed.lb 0
+
+- The following directive associates proxy "systems.lb" with argument "ticket":
+
+
+
+ CAFProxyArgument systems.lb ticket
+
+The effect of the above is that if an incoming URL resolves to use "systems.lb", then "ticket", if found in the query string, would be considered for lookup of "systems.lb" with the load-balancing daemon.
+
+
+
+#### Arguments Matching
+
+Suppose that DB=A is given as a query argument (an explicit DB selection; a standalone "DB" or "DB=" is treated as having a missing value). That will cause the following order of precedence in selecting the target host:
+
+
+
+|----------------|-------------------------------------------------------------------------------|
+| Match | Description |
+| DB=A | Best. "A" may be "" to match the missing value |
+| DB=\* | Good. "\*" stands for "any other" |
+| DB not defined | Fair |
+| DB=- | Poor. "-" stands for "missing in the request" |
+| DB=B | Mismatch. It is used for fallbacks only as the last resort |
+
+
+
+No host with an explicit DB assignment (DB=B or DB=-) is being selected above if there is an exclamation point "!" [stands for "only"] in the assignment. DB=~A for the host causes the host to be skipped from selection as well. DBs are screened in the order of appearance, the first one is taken, so "DB=~A A" skips all requests having DB=A in their query strings.
+
+Suppose that there is no DB selection in the request. Then the hosts are selected in the following order:
+
+
+
+|----------------|------------------------------------------------------------------------------|
+| Match | Description |
+| DB=- | Best. "-" stands for "missing from the request" |
+| DB not defined | Good |
+| DB=\* | Fair. "\*" stands for "any other" |
+| DB=B | Poor |
+
+
+
+No host with a non-empty DB assignment (DB=B or DB=\*) is being selected in the above scenario if there is an exclamation point "!" [stands for "only"] in the assignment. DB=~- defined for the host causes the host not to be considered.
+
+The next category of hosts is used only if there are no hosts in the best available category. That is, no "good" matches will ever be used if there are "best" matches available. Moreover, if all "best" matches have been used up but are known to exist, the search fails.
+
+"~" may not be used along with "\*": "~\*" combination will be silently ignored entirety, and will not modify the other specified affinities. Note that "~" alone has a meaning of 'anything but empty argument value, ""'. Also note that formally, "~A" is an equivalent to "~A \*" as well as "~-" is an equivalent to "\*".
+
+
+
+##### Argument Matching Examples
+
+Host affinity
+
+DB=A ~B
+
+makes the host serve requests having either DB=A or a DB value other than "B" in their query strings. The host may be used as a failover for requests that have DB=C in them (or no DB) if there is no better candidate available. Adding "!" to the affinity line would cause the host not to be used for any requests in which the DB argument is missing.
+
+Host affinity
+
+DB=A -
+
+makes the host serve requests that either have an explicit DB=A in their query strings or have no DB argument at all. Failovers from searches not matching the above may occur. Adding "!" to the line disables the failovers.
+
+Host affinity
+
+DB=- \*
+
+makes the host serve requests that don't have any DB argument in their query strings, or whose DB argument failed to literally match the affinity lines of all other hosts. Adding "!" to the line doesn't change the behavior.
+
+
+
+#### Log File
+
+The CAF module writes its messages into the Apache web server log files.
+
+
+
+#### Monitoring
+
+The status of the CAF modules can be seen via a web interface using the following links:
+
+
+### DISPD Network Dispatcher
+
+
+
+#### Overview
+
+The DISPD dispatcher is a CGI/1.0-compliant program (the actual file name is `dispd.cgi`). Its purpose is mapping a requested service name to an actual server location when the client has no direct access to the LBSMD daemon. This mapping is called dispatching. Optionally, the DISPD dispatcher can also pass data between the client, who requested the mapping, and the server, which implements the service, found as a result of dispatching. This combined mode is called a connection. The client may choose any of these modes if there are no special requirements on data transfer (e.g., firewall connection). In some cases, however, the requested connection mode implicitly limits the request to be a dispatching-only request, and the actual data flow between the client and the server occurs separately at a later stage.
+
+
+
+#### Protocol Description
+
+The dispatching protocol is designed as an extension of HTTP/1.0 and is coded in the HTTP header parts of packets. The request (both dispatching and connection) is done by sending an HTTP packet to the DISPD dispatcher with a query line of the form:
+
+    dispd.cgi?service=<name>
+
+which can be followed by parameters (if applicable) to be passed to the service. The `<name>` part defines the name of the service to be used. The other parameters take the form of one or more of the following constructs:
+
+    &<param>[=<value>]
+
+where square brackets are used to denote an optional value part of the parameter.
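+
+For example, a request for a hypothetical service named `bounce`, with one valued and one valueless parameter, could use a query line such as:
+
+    dispd.cgi?service=bounce&format=text&debug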
+
+In case of a connection request, the request body can contain data to be passed to the first-found server. A connection to this server is automatically initiated by the DISPD dispatcher. In contrast, in case of a dispatching-only request the body is completely ignored; that is, the connection is dropped after the header has been read, and the reply is then generated without consuming the body data. That process may confuse an unprepared client.
+
+Mapping of a service name into a server address is done by the LBSMD daemon which is run on the same host where the DISPD dispatcher is run. The DISPD dispatcher never dispatches a non-local client to a server marked as local-only (by means of L=yes in the configuration of the LBSMD daemon). Otherwise, the result of dispatching is exactly what the client would get from the [service mapping API](ch_conn.html#ch_conn.service_mapping_api) if run locally. By specifying capabilities explicitly, the client can narrow the server search, for example, by choosing stateless servers only.
+
+
+
+##### Client Request to DISPD
+
+The following additional HTTP tags are recognized in the client request to the DISPD dispatcher (an illustrative request follows the table).
+
+
+
+
+
+
+
+
+
+
+
+|------------------------------------|--------------------------------------------------------------------------------------------------------------------------|
+| Tag | Description |
+| `Accepted-Server-Types: <list>` | The `<list>` can include one or more of the following keywords separated by spaces: NCBID, STANDALONE, HTTP, HTTP\_GET, HTTP\_POST, FIREWALL. Each keyword describes a server type which the client is capable of handling. The default (when the tag is not present in the HTTP header) is any type; in case of a connection request, the dispatcher will accommodate the actual found server with the connection mode which the client requested, by relaying data appropriately and in a way suitable for the server. ***Note:*** FIREWALL indicates that the client chooses a firewall method of communication. ***Note:*** Some server types can be ignored if not compatible with the current client mode. |
+| `Client-Mode: <client-mode>` | The `<client-mode>` can be one of the following: STATELESS\_ONLY specifies that the client is not capable of doing full-duplex data exchange with the server in a session mode (e.g., in a dedicated connection); STATEFUL\_CAPABLE should be used by clients which are capable of holding an opened connection to a server, and serves as a hint to the dispatcher to try to open a direct TCP channel between the client and the server, thus reducing the network usage overhead. The default (when the tag is not present at all) is STATELESS\_ONLY, to support Web browsers. |
+| `Dispatch-Mode: <dispatch-mode>` | The `<dispatch-mode>` can be one of the following: INFORMATION\_ONLY specifies that the request is a dispatching request, and no data and/or connection establishment with the server is required at this stage, i.e., the DISPD dispatcher returns only a list of available server specifications (if any) corresponding to the requested service and in accordance with the client mode and server acceptance; NO\_INFORMATION is used to disable sending the above-mentioned dispatching information back to the client (this keyword is reserved solely for internal use by the DISPD dispatcher and should not be used by applications); STATEFUL\_INCLUSIVE informs the DISPD dispatcher that the current request is a connection request, and because it is going over HTTP it is treated as stateless, so dispatching would supply stateless servers only -- this keyword modifies the default behavior so that the dispatching information sent back along with the server reply (resulting from data exchange) includes stateful servers as well, allowing the client to go to a dedicated connection later; OK\_DOWN, OK\_SUPPRESSED or PROMISCUOUS defines a dispatch-only request without actual data transfer, for the client to obtain a list of servers which otherwise are not included, such as currently down servers (OK\_DOWN), servers currently suppressed by having a 100% penalty (OK\_SUPPRESSED), or both (PROMISCUOUS). The default (in the absence of this tag) is a connection request, and because it is going over HTTP, it is automatically considered stateless. This is to support calls to NCBI services from Web browsers. |
+| `Skip-Info-<n>: <server-info>` | `<n>` enumerates the `<server-info>` strings that can be passed to the DISPD dispatcher to exclude the listed servers from being potential mapping targets (in case the client knows that those servers either do not work or are not appropriate). Skip-Info tags are enumerated by consecutive numerical suffixes (`<n>`), starting from 1. These tags are optional and should only be used if the client believes that certain servers do not match the search criteria; otherwise the client may end up with an unsuccessful mapping. |
+| `Client-Host: <host>` | The tag is used by the DISPD dispatcher internally to identify the `<host>` where the request comes from, in case relaying is involved. Although the DISPD dispatcher effectively disregards this tag if the request originates from outside NCBI (and thus it cannot easily be fooled by address spoofing), in-house applications should not use this tag when connecting to the DISPD dispatcher, because the tag is trusted and considered within the NCBI Intranet. |
+| `Server-Count: {N\\|ALL}` | The tag defines how many server infos to include per response (default N=3; ALL causes everything to be returned at once). N is an integer and ALL is a keyword. |
+
+
+
+
+
+
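+As an illustration, a dispatching-only request from a stateless client might carry a header like the following (the URL path and service name are hypothetical):
+
+    GET /Service/dispd.cgi?service=bounce HTTP/1.0
+    Accepted-Server-Types: HTTP STANDALONE
+    Client-Mode: STATELESS_ONLY
+    Dispatch-Mode: INFORMATION_ONLY
+    Server-Count: ALL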
+
+
+##### DISPD Client Response
+
+The DISPD dispatcher can produce the following HTTP tags in response to the client (an illustrative reply follows the table).
+
+
+
+|-------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Tag | Description |
+| `Relay-Path: <path>` | The tag shows how the information was passed along by the DISPD dispatcher and the NCBID utility. This is essential for debugging purposes. |
+| `Server-Info-<n>: <server-info>` | The tag(s) (enumerated increasingly by the suffix `<n>`, starting from 1) give a list of servers where the requested service is available. The list can have up to five entries. However, only one entry is generated when the service was requested either in firewall mode or by a Web browser. For a non-local client, the returned server descriptors can include ***FIREWALL*** server specifications. Despite preserving information about host, port, type, and other (but not all) parameters of the original servers, ***FIREWALL*** descriptors are not specifications of real servers; they are created on-the-fly by the DISPD dispatcher to indicate that the connection point of the server cannot otherwise be reached without the use of either firewalling or relaying. |
+| `Connection-Info: <host> <port> <ticket>` | The tag is generated in a response to a stateful-capable client and includes a host (in dotted notation) and a port number (decimal value) of the connection point where the server is listening (if either the server has specifically started or the FWDaemon created that connection point because of the client's request). The ticket value (hexadecimal) represents the 4-byte ticket that must be passed to the server as binary data at the very beginning of the stream. If, instead of host, port, and ticket information, there is the keyword ***TRY\_STATELESS***, then for some reason (see the `Dispatcher-Failures` tag below) the request failed but may succeed if the client switches into a stateless mode. |
+| `Dispatcher-Failures: <failures>` | The tag value lists all transient failures that the dispatcher might have experienced while processing the request. A fatal error (if any) always appears as the last failure in the list. In this case, the reply body would contain a copy of the message as well. ***Note:*** A fatal dispatching failure is also indicated by an unsuccessful HTTP completion code. |
+| `Used-Server-Info-<n>: <server-info>` | The tag informs the client of server infos that have been unsuccessfully used during the current connection request (so that the client can skip over them if it needs to). `<n>` is an integral suffix, enumerating from 1. |
+| `Dispatcher-Messages: <message>` | The tag is used to issue a message into the standard error log of a client. The message is intercepted and delivered from within the Toolkit HTTP API. |
+
+
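+An illustrative reply to a dispatching-only request might look like the following (the status line is generic, and the server-descriptor strings are abbreviated here as placeholders):
+
+    HTTP/1.0 200 OK
+    Server-Info-1: <descriptor of the first candidate server>
+    Server-Info-2: <descriptor of the second candidate server>
+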
+
+
+
+##### Communication Schemes
+
+After making a dispatching request and using the dispatching information returned, the client can usually connect to the server on its own. Sometimes, however, the client has to connect to the DISPD dispatcher again to proceed with communication with the server. For the DISPD dispatcher this would then be a connection request which can go one of two similar ways, relaying and firewalling.
+
+The figures (Figure 7, Figure 8) provided at the very beginning of the “Load Balancing” chapter can be used for better understanding of the communication schemes described below.
+
+- In the relay mode, the DISPD dispatcher passes data from the client to the server and back, playing the role of a middleman. Data relaying occurs when, for instance, a Web browser client wants to communicate with a service governed by the DISPD dispatcher itself.
+
+- In the firewall mode, DISPD sends out only the information about where the client has to connect to communicate with the server. This connection point and a verifiable ticket are specified in the `Connection-Info` tag in the reply header. ***Note:*** firewalling actually pertains only to the stateful-capable clients and servers.
+
+The firewall mode is selected by the presence of the ***FIREWALL*** keyword in the `Accepted-Server-Types` tag set by the client sitting behind a firewall and not being able to connect to an arbitrary port.
+
+These are scenarios of data flow between the client and the server, depending on the “stateness” of the client:
+
+A. Stateless client
+
+1. Client is **not using firewall** mode
+
+    - The client has to connect to the server on its own, using dispatching information obtained earlier; or
+
+ - The client connects to the DISPD dispatcher with a connection request (e.g., the case of Web browsers) and the DISPD dispatcher facilitates data relaying for the client to the server.
+
+2. If the client chooses to use the firewall mode then the only way to communicate with the server is to connect to the DISPD dispatcher (making a connection request) and use the DISPD dispatcher as a relay.
+
+***Note:*** Even if the server is stand-alone (but lacking S=yes in the configuration file of the LBSMD daemon) then the DISPD dispatcher initiates a microsession to the server and wraps its output into an HTTP/1.0-compliant reply. Data from both HTTP and NCBID servers are simply relayed one-to-one.
+
+B. Stateful-capable client
+
+1. A client which is **not using the firewall** mode has to connect directly to the server, using the dispatcher information obtained earlier (e.g., with the use of ***INFORMATION\_ONLY*** in `Dispatch-Mode` tag) if local; for external clients the connection point is provided by the `Connection-Info` tag (port range 4444-4544).
+
+2. If the firewall mode is selected, then the client has to expect `Connection-Info` to come back from the DISPD dispatcher pointing out where to connect to the server. If ***TRY\_STATELESS*** comes out as a value of the former tag, then the client has to switch into a stateless mode (e.g., by setting ***STATELESS\_ONLY*** in the `Client-Mode` tag) for the request to succeed.
+
+***Note:*** ***TRY\_STATELESS*** could be induced by many reasons, mainly because all servers for the service are stateless ones or because the FWDaemon is not available on the host, where the client's request was received.
+
+***Note:*** Outlined scenarios show that no prior dispatching information is required for a stateless client to make a connection request, because the DISPD dispatcher can always be used as a data relay (in this way, Web browsers can access NCBI services). But for a stateful-capable client to establish a dedicated connection an additional step of obtaining dispatching information must precede the actual connection.
+
+To support requests from Web browsers, which are unaware of the HTTP extensions comprising the dispatching protocol, the DISPD dispatcher treats an incoming request that does not contain input dispatching tags as a connection request from a stateless-only client.
+
+The DISPD dispatcher uses simple heuristics in analyzing an HTTP header to determine whether the connection request comes from a Web browser or from an application (a service connector, for instance). In the case of a Web browser, the chosen data path could be more expensive but more robust, including connection retries if required; with an application, on the contrary, the dispatcher could return an error, and the retry is delegated to the application.
+
+The DISPD dispatcher always preserves original HTTP tags `User-Agent` and `Client-Platform` when doing both relaying and firewalling.
+
+
+
+### NCBID Server Launcher
+
+
+
+#### Overview
+
+The LBSMD daemon supports services of type NCBID, which are really Unix filter programs that read data from the stdin stream and write the output into the stdout stream without having a common protocol. Thus, HTTP/1.0 was chosen as a framed protocol for wrapping both requests and replies, and the NCBID utility CGI program was created to pass a request from the HTTP body to the server and to put the reply from the server into the HTTP body and send it back to the client. The NCBID utility also provides a dedicated connection between the server and the client, if the client supports the stateful way of communication. Former releases of the NCBID utility were implemented as a separate CGI program; however, the latest releases integrate the NCBID utility and the DISPD dispatcher into a single component (`ncbid.cgi` is a hard link to `dispd.cgi`).
+
+The NCBID utility determines the requested service from the query string in the same way as the DISPD dispatcher does, i.e., by looking into the value of the CGI parameter service. An executable file which has to be run is then obtained by searching the configuration file (shared with the LBSMD daemon; the default name is `servrc.cfg`): the path to the executable along with optional command-line parameters is specified after the bar character ("\|") in the line containing a service definition.
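+
+Purely as an illustration of where the executable appears (the service name and path below are hypothetical, and the LBSMD part of the service definition is shown only as a placeholder), the program and its parameters follow the bar character:
+
+    TestService <LBSMD server descriptor> | /export/home/service/test_filter -v
+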
+
+The NCBID utility can work in either of two connection modes, stateless and stateful, as determined by reading the following HTTP header tag:
+
+`Connection-Mode: <connection-mode>`
+
+where `<connection-mode>` is one of the following:
+
+- ***STATEFUL***
+
+- ***STATELESS***
+
+The default value (when the tag is missing) is ***STATELESS*** to support calls from Web browsers.
+
+When the DISPD dispatcher relays data to the NCBID utility this tag is set in accordance with the current client mode.
+
+The ***STATELESS*** mode is almost identical to a call of a conventional CGI program with an exception that the HTTP header could hold tags pertaining to the dispatching protocol, and resulting from data relaying (if any) by the DISPD dispatcher.
+
+In the ***STATEFUL*** mode, the NCBID utility starts the program in a more tricky way, which is closer to working in a firewall mode for the DISPD dispatcher, i.e. the NCBID utility loads the program with its stdin and stdout bound to a port, which is switched to listening. The program becomes a sort of an Internet daemon (the only exception is that only one incoming connection is allowed). Then the client is sent back an HTTP reply containing the `Connection-Info` tag. The client has to use port, host, and ticket from that tag to connect to the server by creating a dedicated TCP connection.
+
+***Note***: the NCBID utility never generates ***TRY\_STATELESS*** keyword.
+
+For the sake of backward compatibility, the NCBID utility creates the following environment variables (in addition to the CGI/1.0 environment variables created by the HTTP daemon when calling NCBID) before starting the service executables (a short usage sketch follows the table):
+
+
+
+|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Name | Description |
+| NI\_CLIENT\_IPADDR | The variable contains an IP address of the remote host. It could also be an IP address of the firewall daemon if the NCBID utility was started as a result of firewalling. |
+| NI\_CLIENT\_PLATFORM | The variable contains the client platform extracted from the HTTP tag `Client-Platform` provided by the client if any. |
+
+
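+A minimal sketch of how a service executable launched by the NCBID utility could read these variables (plain `getenv()` is sufficient; nothing here is Toolkit-specific):
+
+    #include <cstdlib>
+    #include <iostream>
+
+    int main()
+    {
+        // Both variables are set by the NCBID utility; either may be absent.
+        const char* ip       = std::getenv("NI_CLIENT_IPADDR");
+        const char* platform = std::getenv("NI_CLIENT_PLATFORM");
+        std::cerr << "Client address:  " << (ip       ? ip       : "<not set>") << '\n';
+        std::cerr << "Client platform: " << (platform ? platform : "<not set>") << '\n';
+        // ... read the request from stdin and write the reply to stdout ...
+        return 0;
+    }
+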
+
+
+
+### Firewall Daemon (FWDaemon)
+
+
+
+#### Overview
+
+The NCBI Firewall Daemon (FWDaemon) is essentially a network multiplexer listening at an advertised network address.
+
+The FWDaemon works in close cooperation with the DISPD dispatcher, which informs the FWDaemon how to connect to the “real” NCBI server and then instructs the network client to connect to the FWDaemon (instead of the “real” NCBI server). Thus, the FWDaemon serves as a middleman that just pumps the network traffic from the network client to the NCBI server and back.
+
+The FWDaemon allows a network client to establish a persistent TCP/IP connection to any of the publicly advertised NCBI services, provided that the client is allowed to make an outgoing network connection to any of the following FWDaemon addresses (on front-end NCBI machines):
+
+ ports 5860..5870 at both 130.14.29.112 and 165.112.7.12
+
+***Note:*** One FWDaemon can simultaneously serve many client/server pairs.
+
+
+
+##### FWDaemon Behind a "Regular" Firewall
+
+If a network client is behind a regular firewall, then a system administrator should open the above addresses (only!) for outgoing connections, and the client should be set to "firewall" mode. The network client can then use NCBI network services in the usual way (as if there were no firewall at all).
+
+
+
+##### FWDaemon Behind a "Non-Transparent" Firewall
+
+***Note:*** If a firewall is "non-transparent" (this is an extremely rare case), then a system administrator must "map" the corresponding ports on your firewall server to the advertised FWDaemon addresses (shown above). In this case, you will have to specify the address of your firewall server in the client configuration.
+
+The mapping on your non-transparent firewall server should be similar to the following:
+
+ CONN_PROXY_HOST:5860..5870 --> 130.14.29.112:5860..5870
+
+Please note that there is a port range that might not be presently used by any clients and servers, but it is reserved for future extensions. Nevertheless, it is recommended that you have this range configured on firewalls to allow the applications to function seamlessly in the future.
+
+
+
+#### Monitoring
+
+The FWDaemon could be monitored using the following web page:
+
+
+
+Having loaded the page into a browser, the user will see the following.
+
+[](/book/static/img/FWDaemonMonitor.gif "Click to see the full-resolution image")
+
+Figure 15. FWDaemon Checking Web Page
+
+By clicking the “Check” button a page similar to the following will appear.
+
+[](/book/static/img/FWDaemonCheckPage.gif "Click to see the full-resolution image")
+
+Figure 16. FWDaemon Presence Check
+
+Users outside the NCBI network can check the connection to the NAT service by following the steps below:
+
+- Run the FWDaemon presence check as described above.
+
+- Take the connection properties from any line where the status is “OK”, for example 130.14.29.112:5864.
+
+- Establish a telnet session using those connection properties. An example session is given below (a case where the connection was successfully established).
+
+
+
+ > telnet 130.14.29.112 5864
+ Trying 130.14.29.112...
+ Connected to 130.14.29.112.
+ Escape character is '^]'.
+ NCBI Firewall Daemon: Invalid ticket. Connection closed.
+ See http://www.ncbi.nlm.nih.gov/cpp/network/firewall.html.
+ Connection closed by foreign host.
+
+
+
+#### Log Files
+
+The FWDaemon stores its log files at the following location:
+
+`/opt/machine/fwdaemon/log/fwdaemon`
+
+which is usually a link to `/var/log/fwdaemon`.
+
+The file is formed locally on a host where FWDaemon is running.
+
+
+
+#### FWDaemon and NCBID Server Data Exchange
+
+One of the key points in the communication between the NCBID server and the FWDaemon is that the DISPD dispatcher instructs the FWDaemon to expect a new client connection. This instruction is issued as a reaction to a remote client request. It is possible that the remote client requested a service but did not use it. To prevent resource leaking and to facilitate usage monitoring, the FWDaemon keeps track of those requested but not used connections in a special file. The NCBID dispatcher is able to read that file before requesting a new connection from the FWDaemon, and if the client was previously marked as one who left connections unused, then the NCBID dispatcher refuses the connection request.
+
+The data exchange is illustrated in the figure below.
+
+[](/book/static/img/DISPDAndFWDaemon.jpg "Click to see the full-resolution image")
+
+Figure 17. DISPD FWDaemon Data Exchange
+
+The location of the `.dispd.msg` file is detected by the DISPD dispatcher as follows. The dispatcher determines the user name that owns the `dispd.cgi` executable and then looks for the `.dispd.msg` file in that user's home directory. The FWDaemon is run under the same user, and the `.dispd.msg` file is saved by the daemon in its home directory.
+
+
+
+### Launcherd Utility
+
+The purpose of the launcherd utility is to replace the NCBID services on hosts where no Apache server is installed and there is a need to daemonize Unix filter programs.
+
+The launcherd utility is implemented as a command line utility which is controlled by command line arguments. The list of accepted arguments can be retrieved with the -h option:
+
+    service1:~> /export/home/service/launcherd -h
+    Usage:
+    launcherd [-h] [-q] [-v] [-n] [-d] [-i] [-p #] [-l file] service command [parameters...]
+     -h   = Print usage information only; ignore anything else
+     -q   = Quiet start [and silent exit if already running]
+     -v   = Verbose logging [terse otherwise]
+     -n   = No statistics collection
+     -d   = Debug mode [do not go daemon, stay foreground]
+     -i   = Internal mode [bind to localhost only]
+     -p # = Port # to listen on for incoming connection requests
+     -l   = Set log file name [use `-' or `+' to run w/o logger]
+    Note: Service must be of type STANDALONE to auto-get the port.
+    Note: Logging to `/dev/null' is treated as logging to a file.
+    Signals: HUP, INT, QUIT, TERM to exit
+
+The launcherd utility accepts the name of the service to be daemonized. Using the service name, the utility checks the LBSMD daemon table and retrieves the port on which the service requests should be accepted. As soon as an incoming request is accepted, the launcherd forks and connects the socket with the standard streams of the service executable.
+
+One of the launcherd utility command line arguments is a path to a log file where the protocol messages are stored.
+
+The common practice for the launcherd utility is to be run by the standard Unix cron daemon. Here is an example of a cron schedule which runs the launcherd utility every 3 minutes:
+
+    # DO NOT EDIT THIS FILE - edit the master and reinstall.
+    # (/export/home/service/UPGRADE/crontabs/service1/crontab
+    # installed on Thu Mar 20 20:48:02 2008)
+    # (Cron version -- $Id: crontab.c,v 2.13 1994/01/17 03:20:37 vixie Exp $)
+    MAILTO=ncbiduse@ncbi
+    */3 * * * * test -x /export/home/service/launcherd && /export/home/service/launcherd -q -l /export/home/service/bounce.log -- Bounce /export/home/service/bounce >/dev/null
+    MAILTO=grid-mon@ncbi,taxhelp@ncbi
+    */3 * * * * test -x /export/home/service/launcherd && /export/home/service/launcherd -q -l /var/log/taxservice -- TaxService /export/home/service/taxservice/taxservice >/dev/null
+
+
+
+### Monitoring Tools
+
+There are various ways to monitor the services available at NCBI, including generic third-party tools and specific NCBI-developed utilities. The specific utilities are described above in the sections related to a certain component.
+
+System availability and performance can be visualized using the Zabbix software. It can be reached at:
+
+
+
+Another web-based tool to monitor server/service status is Nagios. It can be reached at:
+
+[http://nagios.ncbi.nlm.nih.gov](http://nagios.ncbi.nlm.nih.gov/)
+
+
+
+### Quality Assurance Domain
+
+The quality assurance (QA) domain uses the same equipment and the same network as the production domain. Not all the services which are implemented in the production domain are implemented in the QA one. When a certain service is requested for testing purposes, the QA version of the service should be called if it is implemented, or the production one otherwise. The dispatching is implemented transparently. It is done by the CAF module running on the production front ends. To implement that, the CAFQAMap directive is put into the Apache web server configuration file as follows:
+
+`CAFQAMap NCBIQA /opt/machine/httpd/public/conf/ncbiqa.mapping`
+
+The directive above defines the NCBIQA cookie which triggers names substitutions found in the `/opt/machine/httpd/public/conf/ncbiqa.mapping` file.
+
+To set the cookie the user can visit the following link:
+
+
+
+A screen similar to the following will appear:
+
+[](/book/static/img/QACookieManager.gif "Click to see the full-resolution image")
+
+Figure 18. QA Cookie Manager.
+
+While connecting to a certain service the cookie is analyzed by the CAF module and if the QA cookie is detected then name mapping is triggered. The mapping is actually a process of replacing one name with another. The replacement rules are stored in the `/opt/machine/httpd/public/conf/ncbiqa.mapping` file. The file content could be similar to the following:
+
+`portal portalqa`
+
+`eutils eutilsqa`
+
+`tpubmed tpubmedqa`
+
+which means to replace `portal` with `portalqa` etc.
+
+The further processing of the request is then done using the substituted name. The process is illustrated in the figure below.
+
+[](/book/static/img/QA.jpg "Click to see the full-resolution image")
+
+Figure 19. NCBI QA
+
+
+
+NCBI Genome Workbench
+---------------------
+
+The NCBI Genome Workbench is an integrated sequence visualization and analysis platform. This application runs on Windows, Unix, and Macintosh OS X.
+
+The following topics are discussed in this section:
+
+- [Design goals](ch_app.html#ch_app.gbench_dg)
+
+- [Design](ch_app.html#ch_app.gbench_design)
+
+
+
+### Design Goals
+
+The primary goal of Genome Workbench is to provide a flexible platform for development of new analytic and visualization techniques. To this end, the application must facilitate easy modification and extension. In addition, we place a large emphasis on cross-platform development, and Genome Workbench should function and appear identically on all supported platforms.
+
+
+
+### Design
+
+The basic design of Genome Workbench follows a modified Model-View-Controller (MVC) architecture. The MVC paradigm provides a clean separation between the data being dealt with (the model), the user's perception of this data (provided in views), and the user's interaction with this data (implemented in controllers). For Genome Workbench, as with many other implementations of the MVC architecture, the View and Controller are generally combined.
+
+Central to the framework is the notion of the data being modeled. The model here encompasses the NCBI data model, with particular emphasis on sequences and annotations. The Genome Workbench framework provides a central repository for all managed data through the static class interface in ***CDocManager***. ***CDocManager*** owns the single instance of the C++ Object Manager that is maintained by the application. ***CDocManager*** marshals individual ***CDocument*** classes to deal with data as the user requests. ***CDocument***, at its core, wraps a ***CScope*** class and thus provides a hook to the object manager.
+
+The View/Controller aspect of the architecture is implemented through the abstract class ***CView***. Each ***CView*** class is bound to a single document. Each ***CView*** class, in turn, represents a view of some portion of the data model or a derived object related to the document. This definition is intentionally vague; for example, when viewing a document that represents a sequence alignment, a sequence in that alignment may not be contained in the document itself but is distinctly related to the alignment and can be presented in the context of the document. In general, the views that use the framework will define a top-level FLTK window; however, a view could be defined to be a CGI context such that its graphical component is a Web browser.
+
+To permit maximal extensibility, the framework delegates much of the function of creating and presenting views and analyses to a series of plugins. In fact, most of the basic components of the application itself are implemented as plugins. The Genome Workbench framework defines three classes of plugins: data loaders, views, and algorithms. Technically, a plugin is simply a shared library defining a standard entry point. These libraries are loaded on demand; the entry point returns a list of plugin factories, which are responsible for creating the actual plugin instances.
+
+Cross-platform graphical development presents many challenges to proper encapsulation. To alleviate a lot of the difficulties seen with such development, we use a cross-platform GUI toolkit (FLTK) in combination with OpenGL for graphical development.
+
+
+
+NCBI NetCache Service
+---------------------
+
+- [What is NetCache?](ch_app.html#ch_app.what_is_netcache)
+
+- [What can NetCache be used for?](ch_app.html#ch_app.what_it_can_be_used)
+
+- [How to use NetCache](ch_app.html#ch_app.getting_started)
+
+ - [The basic ideas](ch_app.html#ch_app.The_basic_ideas)
+
+ - [Setting up your program to use NetCache](ch_app.html#ch_app.Set_up_your_program_to_use_NetCac)
+
+ - [Establish the NetCache service name](ch_app.html#ch_app.Establish_the_NetCache_service_na)
+
+ - [Initialize the client API](ch_app.html#ch_app.Initialize_the_client_API)
+
+ - [Store data](ch_app.html#ch_app.Store_data)
+
+ - [Retrieve data](ch_app.html#ch_app.Retrieve_data)
+
+ - [Samples and other resources](ch_app.html#ch_app.Available_samples)
+
+- [Questions and answers](ch_app.html#ch_app.Questions_and_answers)
+
+
+
+### What is NetCache?
+
+**NetCache** is a service that provides to distributed hosts a reliable and uniform means of accessing temporary storage. Using **NetCache**, distributed applications can store data temporarily without having to manage distributed access or handle errors. Applications on different hosts can access the same data simply by using the unique key for the data.
+
+CGI applications badly need this functionality to store session information between successive HTTP requests. Some session information could be embedded into URLs or cookies; however, this is generally not a good idea because:
+
+- Some data should not be transmitted to the client, for security reasons.
+
+- Both URLs and cookies are quite limited in size.
+
+- Passing data via either cookie or URL generally requires additional encoding and decoding steps.
+
+- It makes little sense to pass data to the client only so it can be passed back to the server.
+
+Thus it is better to store this information on the server side. However, this information cannot be stored locally because successive HTTP requests for a given session are often processed on different machines. One possible way to handle this is to create a file in a shared network directory. But this approach can present problems to client applications in any of the standard operations:
+
+- Adding a blob
+
+- Removing a blob
+
+- Updating a blob
+
+- Automatically removing expired blobs
+
+- Automatically recovering after failures
+
+Therefore, it's better to provide a centralized service that provides robust temporary storage, which is exactly what **NetCache** does.
+
+**NetCache** is load-balanced and has high performance and virtually unlimited scalability. Any Linux, Unix or Windows machine can be a **NetCache** host, and any application can use it. For example, the success with which **NetCache** solves the problem of distributed access to temporary storage enables the [NCBI Grid](ch_grid.html) framework to rely on it for passing data between its components.
+
+
+
+### What can NetCache be used for?
+
+Programs can use **NetCache** for data exchange. For example, one application can put a blob into **NetCache** and pass the blob key to another application, which can then access (retrieve, update, remove) the data. Some typical use cases are:
+
+- Store CGI session info
+
+- Store CGI-generated graphics
+
+- Cache results of computations
+
+- Cache results of expensive DBMS or search system queries
+
+- Pass messages between programs
+
+The diagram below illustrates how **NetCache** works.
+
+[](/book/static/img/NetCache_diagramm.gif "Click to see the full-resolution image")
+
+1. Client requests a named service from the Load Balancer.
+
+2. Load Balancer chooses the least loaded server (on this diagram Server 2) corresponding to the requested service.
+
+3. Load Balancer returns the chosen server to the client.
+
+4. Client connects to the selected **NetCache** server and sends the data to store.
+
+5. **NetCache** generates and returns a unique key which can then be used to access the data.
+
+
+
+### How to use NetCache
+
+All new applications developed within NCBI should use **NetCache** together with the NCBI Load Balancer. It is not recommended to use an unbalanced **NetCache** service.
+
+The following topics explain how to use NetCache from an application:
+
+- [The basic ideas](ch_app.html#ch_app.The_basic_ideas)
+
+- [Set up your program to use NetCache](ch_app.html#ch_app.Set_up_your_program_to_use_NetCac)
+
+- [Establish the NetCache service name](ch_app.html#ch_app.Set_up_your_program_to_use_NetCac)
+
+- [Initialize the client API](ch_app.html#ch_app.Initialize_the_client_API)
+
+- [Store data](ch_app.html#ch_app.Store_data)
+
+- [Retrieve data](ch_app.html#ch_app.Retrieve_data)
+
+- [Samples and other resources](ch_app.html#ch_app.Available_samples)
+
+
+
+#### The basic ideas
+
+A typical **NetCache** implementation involves a load-balanced server daemon (the "service") and one or more clients that access the service through a software interface. See [netcached.ini](http://www.ncbi.nlm.nih.gov/viewvc/v1/trunk/c++/src/app/netcache/netcached.ini?view=log) for descriptions of the **NetCache** server daemon configuration parameters.
+
+Two classes provide an interface to **NetCache** - ***CNetCacheAPI*** and ***CNetICacheClient***. These classes share most of the basic ideas of using **NetCache**, but might be best suited for slightly different purposes. ***CNetCacheAPI*** might be a bit better for temporary storage in scenarios where the data is not kept elsewhere, whereas ***CNetICacheClient*** implements the ***ICache*** interface and might be a bit better for scenarios where the data still exists elsewhere but is also cached for performance reasons. ***CNetCacheAPI*** will probably be more commonly used because it automatically generates unique keys for you and it has a slightly simpler interface. ***CNetCacheAPI*** also supports stream insertion and extraction operators.
+
+There are multiple ways to write data to **NetCache** and read it back, but the basic ideas are:
+
+- **NetCache** stores data in blobs. There are no constraints on the format, and the size can be anything from one byte to "big" - that is, the size is specified using ***size\_t*** and the practical size limit is the lesser of available storage and organizational policy.
+
+- Blob identification is usually associated with a unique purpose.
+
+ - With ***CNetCacheAPI***, a blob is uniquely identified by a key that is generated by **NetCache** and returned to the calling code. Thus, the calling code can limit use of the blob to a given purpose. For example, data can be passed from one instance of a CGI to the next by storing the data in a **NetCache** blob and passing the key via cookie.
+
+ - With ***CNetICacheClient***, blobs are identified by the combination { key, version, subkey, cache name }, which isn't guaranteed to be unique. It is possible that two programs could choose the same combination and one program could change or delete the data stored by the other.
+
+- With ***CNetICacheClient***, the cache name can be specified in the registry and is essentially a convenient way of simulating namespaces.
+
+- When new data is written using a key that corresponds to existing data:
+
+ - API calls that use a buffer pointer replace the existing data.
+
+ - API calls that use a stream or writer append to the existing data.
+
+- Data written with a stream or writer won't be accessible from the **NetCache** server until the stream or writer is deleted or until the writer's ***Close()*** method is called.
+
+- A key must be supplied to retrieve data.
+
+- Blobs have a limited "time-to-live" (TTL).
+
+ - Reading a blob won't delete it - it will be removed automatically when its TTL has expired, or it can be removed explicitly.
+
+ - **NetCache** server daemons can specify a default TTL for their blobs using the `blob_ttl` entry in the `[netcache]` section of [netcached.ini](http://www.ncbi.nlm.nih.gov/viewvc/v1/trunk/c++/src/app/netcache/netcached.ini?view=log). There is no direct way to find the server's default TTL, but you can find it indirectly by creating a blob and calling ***GetBlobInfo()*** on the new blob. For an example of this, see [CSampleNetCacheClient::DemoPutRead()](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/netcache/netcache_client_sample.cpp).
+
+ - Blob lifetime can be prolonged.
+
+ - By default, each time a blob is accessed its lifetime will be extended by the server's default `blob_ttl`. The default prolongation can be overridden by passing a TTL when accessing the blob (the passed value will apply only to that access).
+
+ - By default, the total lifetime of a blob, including all prolongations, will be limited to either 10 times the `blob_ttl` or 30 days, whichever is larger. The default maximum lifetime can be overridden with `max_ttl`.
+
+ - Lifetime prolongation can be disabled by setting the `prolong_on_read` entry to `false` in [netcached.ini](http://www.ncbi.nlm.nih.gov/viewvc/v1/trunk/c++/src/app/netcache/netcached.ini?view=log).
+
+ - ***Note:*** Calling ***GetBlobSize()*** will prolong a blob's lifetime (unless `prolong_on_read` is `false`), but calling ***GetBlobInfo()*** will not.
+
+
+
+#### Set up your program to use NetCache
+
+To use **NetCache** from your application, you must use the [NCBI application framework](ch_core.html#ch_core.CNcbiApplication) by deriving your application class from ***CNcbiApplication***. If your application is a CGI, you can derive from ***CCgiApplication***.
+
+You will need at least the following libraries in your application's `Makefile.<appname>.app`:
+
+ # For CNcbiApplication-derived programs:
+ LIB = xconnserv xthrserv xconnect xutil xncbi
+
+ # For CCgiApplication-derived programs:
+ LIB = xcgi xconnserv xthrserv xconnect xutil xncbi
+
+ # If you're using CNetICacheClient, also add ncbi_xcache_netcache to LIB.
+
+ # All apps need this LIBS line:
+ LIBS = $(NETWORK_LIBS) $(DL_LIBS) $(ORIG_LIBS)
+
+Your source should include:
+
+    #include <corelib/ncbiapp.hpp>    // for CNcbiApplication-derived programs
+    #include <cgi/cgiapp.hpp>         // for CCgiApplication-derived programs
+
+    #include <connect/services/netcache_api.hpp>      // if you use CNetCacheAPI
+    #include <connect/services/neticache_client.hpp>  // if you use CNetICacheClient
+
+An even easier way to get a new CGI application started is to use the [new\_project](ch_proj.html#ch_proj.new_project_Starting) script:
+
+ new_project mycgi app/netcache
+
+
+
+#### Establish the NetCache service name
+
+All applications using **NetCache** must use a service name. A service name is essentially just an alias for a group of **NetCache** servers from which the load balancer can choose when connecting the **NetCache** client and server. For applications with minimal resource requirements, the selected service may be relatively unimportant, but applications with large resource requirements may need their own dedicated **NetCache** servers. In all cases, developers should contact the NetCache service administrators and ask what service name to use for new applications.
+
+Service names must match the pattern `[A-Za-z_][A-Za-z0-9_]*`, must not end in `_lb`, and are not case-sensitive. Limiting the length to 18 characters is recommended, but there is no hard limit.
+
+Service names are typically specified on the command line or stored in the application configuration file. For example:
+
+ [netcache_api]
+ service=the_svc_name_here
+
+
+
+#### Initialize the client API
+
+Initializing the **NetCache** API is extremely easy - simply create a ***CNetCacheAPI*** or ***CNetICacheClient*** object, selecting the constructor that automatically configures the API based on the application registry. Then, define the client name in the application registry using the `client` entry in the `[netcache_api]` section. The client name should be unique if the data is application-specific, or it can be shared by two or more applications that need to access the same data. The client name is added to AppLog entries, so it is helpful to indicate the application in this string.
+
+For example, put this in your source code:
+
+ // To configure automatically based on the config file, using CNetCacheAPI:
+ CNetCacheAPI nc_api(GetConfig());
+
+ // To configure automatically based on the config file, using CNetICacheClient:
+ CNetICacheClient ic_client(CNetICacheClient::eAppRegistry);
+
+and put this in your configuration file:
+
+ [netcache_api]
+ client=your_app_name_here
+
+If you are using ***CNetICacheClient***, you either need to use API methods that take a cache name or, to take advantage of automatic configuration based on the registry, specify a cache name in the `[netcache_api]` section, for example:
+
+ [netcache_api]
+ cache_name=your_cache_name_here
+
+For a complete reference of **NetCache** configuration parameters, please see the [NetCache and NetSchedule](ch_libconfig.html#ch_libconfig.NetCache_and_NetSchedule) section in the Library Configuration chapter.
+
+
+
+#### Store data
+
+There are multiple ways to save data, whether you're using ***CNetCacheAPI*** or ***CNetICacheClient***.
+
+With all the storage methods, you can supply a "time-to-live" parameter, which specifies how long (in seconds) a blob will be accessible. See the [basic ideas](ch_app.html#ch_app.The_basic_ideas) section for more information on time-to-live.
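+
+For instance, the following sketch passes an explicit one-hour time-to-live when storing through ***CNetICacheClient*** (the trailing argument follows the generic ICache ***Store()*** signature; check the class references for the exact overloads available in your Toolkit version; `ic_client` is the object created in the subsections below):
+
+    const unsigned int ttl = 3600;  // keep the blob accessible for one hour
+    ic_client.Store(key, version, subkey, message.c_str(), message.size(), ttl);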
+
+
+
+##### Storing data using CNetCacheAPI
+
+If you are saving a new blob using ***CNetCacheAPI***, it will create a unique blob key and pass it back to you. Here are several ways to store data using ***CNetCacheAPI*** (see the [class reference](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNetCacheAPI.html) for additional methods):
+
+ CNetCacheAPI nc_api(GetConfig());
+
+ // Write a simple object (and get the new blob key).
+ string key = nc_api.PutData(message.c_str(), message.size());
+
+ // Or, overwrite the data by writing to the same key.
+ nc_api.PutData(key, message.c_str(), message.size());
+
+ // Or, create an ostream (and get a key), then insert into the stream.
+ auto_ptr<CNcbiOstream> os(nc_api.CreateOStream(key));
+ *os << "line one\n";
+ *os << "line two\n";
+ // (data written at stream deletion or os.reset())
+
+ // Or, create a writer (and get a key), then write data in chunks.
+ auto_ptr<IEmbeddedStreamWriter> writer(nc_api.PutData(&key));
+ while (...) {
+     writer->Write(chunk_buf, chunk_size);
+ }
+ // (data written at writer deletion or writer->Close())
+
+
+
+##### Storing data using CNetICacheClient
+
+If you are saving a new blob using ***CNetICacheClient***, you must supply a unique { blob key / version / subkey / cache name } combination. Here are two ways (with the cache name coming from the registry) to store data using ***CNetICacheClient*** (see the [class reference](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNetICacheClient.html) for additional methods):
+
+ CNetICacheClient ic_client(CNetICacheClient::eAppRegistry);
+
+ // Write a simple object.
+ ic_client.Store(key, version, subkey, message.c_str(), message.size());
+
+ // Or, create a writer, then write data in chunks.
+ auto_ptr<IEmbeddedStreamWriter>
+     writer(ic_client.GetNetCacheWriter(key, version, subkey));
+ while (...) {
+     writer->Write(chunk_buf, chunk_size);
+ }
+ // (data written at writer deletion or writer->Close())
+
+
+
+#### Retrieve data
+
+Retrieving data is more or less complementary to storing data.
+
+If an attempt is made to retrieve a blob after its time-to-live has expired, an exception will be thrown.
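+
+If such failures are expected, the read can be guarded; here is a hedged sketch (the exact exception type to catch - assumed here to be ***CNetCacheException*** - should be confirmed against the class reference):
+
+    try {
+        string message;
+        nc_api.ReadData(key, message);
+    } catch (CNetCacheException& e) {
+        // The blob may have expired or may never have existed.
+        ERR_POST("Failed to read blob " << key << ": " << e.what());
+    }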
+
+
+
+##### Retrieving data using CNetCacheAPI
+
+The following code snippet demonstrates three ways of retrieving data using ***CNetCacheAPI*** (see the [class reference](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNetCacheAPI.html) for additional methods):
+
+ // Read a simple object.
+ nc_api.ReadData(key, message);
+
+ // Or, extract words from a stream.
+ auto_ptr<CNcbiIstream> is(nc_api.GetIStream(key));
+ while (!is->eof()) {
+     *is >> message; // get one word at a time, ignoring whitespace
+ }
+
+ // Or, retrieve the whole stream buffer.
+ NcbiCout << "Read: '" << is->rdbuf() << "'" << NcbiEndl;
+
+ // Or, read data in chunks (using an IReader obtained from nc_api;
+ // see the class reference for the GetReader() variants).
+ while (...) {
+     size_t bytes_read;
+     ERW_Result rw_res = reader->Read(chunk_buf, chunk_size, &bytes_read);
+     chunk_buf[bytes_read] = '\0';
+     if (rw_res == eRW_Success) {
+         NcbiCout << "Read: '" << chunk_buf << "'" << NcbiEndl;
+     } else {
+         NCBI_USER_THROW("Error while reading BLOB");
+     }
+ }
+
+
+
+##### Retrieving data using CNetICacheClient
+
+The following code snippet demonstrates two ways to retrieve data using ***CNetICacheClient***, with the cache name coming from the registry (see the [class reference](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNetICacheClient.html) for additional methods):
+
+ // Read a simple object.
+ ic_client.Read(key, version, subkey, chunk_buf, kMyBufSize);
+
+ // Or, read data in chunks.
+ size_t remaining(ic_client.GetSize(key, version, subkey));
+ auto_ptr<IReader> reader(ic_client.GetReadStream(key, version, subkey));
+ while (remaining > 0) {
+ size_t bytes_read;
+ ERW_Result rw_res = reader->Read(chunk_buf, chunk_size, &bytes_read);
+ if (rw_res != eRW_Success) {
+ NCBI_USER_THROW("Error while reading BLOB");
+ }
+ // do something with the data
+ ...
+ remaining -= bytes_read;
+ }
+
+
+
+#### Samples and other resources
+
+Here is a sample client application that demonstrates a variety of ways to use **NetCache**:
+
+[src/sample/app/netcache/netcache\_client\_sample.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/netcache/netcache_client_sample.cpp)
+
+Here is a sample application that uses **NetCache** from a CGI application:
+
+[src/sample/app/netcache/netcache\_cgi\_sample.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/netcache/netcache_cgi_sample.cpp)
+
+Here are test applications for ***CNetCacheAPI*** and ***CNetICacheClient***:
+
+[src/connect/services/test/test\_netcache\_api.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/connect/services/test/test_netcache_api.cpp)
+
+[src/connect/services/test/test\_ic\_client.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/connect/services/test/test_ic_client.cpp)
+
+Please see the [NetCache and NetSchedule](ch_libconfig.html#ch_libconfig.NetCache_and_NetSchedule) section of the Library Configuration chapter for documentation on the **NetCache** configuration parameters.
+
+The `grid_cli` command-line tool (available on both Windows and Unix) provides convenient sub-commands for manipulating blobs, getting their status, checking servers, etc.
+
+You can also email nypk4jvylGujip5ust5upo5nv/ if you have questions.
+
+
+
+### Questions and answers
+
+**Q: What exactly is NetCache's architecture? Is it memory-based (like memcached), or does it use the filesystem/SQL/something else?**
+
+A: It keeps its database on disk, memory-mapped. It also has a (configurable) "write-back buffer" to use when a lot of data is coming in and much of that data gets re-written quickly; this helps avoid thrashing the disk with relatively transient blob versions, a situation in which the OS's automatic memory swap mechanism may become sub-optimal.
+
+**Q: Is there an NCBI "pool" of NetCache servers that we can simply tie into, or do we have to set up NetCache servers on our group's own machines?**
+
+A: We usually (except for PubMed) administer NC servers, most of which are shared. Depending on your load (hit rate, blob size distribution, blob lifetime, redundancy, etc.), we can point you to the shared NC servers or create a new NC server pool.
+
+**Q: I assume what's in c++/include/connect/services/\*hpp is the API to use for a client?**
+
+A: Yes. Also try the samples under [src/sample/app/netcache](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/netcache/), for example:
+
+ new_project pc_nc_client app/netcache
+ cd pc_nc_client
+ make
+ ./netcache_client_sample1 -service NC_test
+ ./netcache_client_sample2 NC_test
+ ./netcache_client_sample3 NC_test
+
+**Q: Is there a way to build in some redundancy, e.g. so that if an individual server/host goes down, we don't lose data?**
+
+A: Yes, you can mirror NC servers, master-master style, including between BETH and COLO sites. Many NC users use mirrored instances nowadays, including PubMed.
+
+**Q: Is there a limit to the size of the data blobs that can be stored?**
+
+A: I have seen 400 MB blobs there being written and read without incident a thousand times a day. We can do experiments to see how your load will be handled. As a general rule, you should ask nypk4jvylGujip5ust5upo5nv/ for guidance when changing your NC usage.
+
+**Q: How is the expiration of BLOBs handled by NetCache? My thinking is coming from two directions. First, I wouldn’t want BLOBs deleted out from under me, but also, if the expiration is too long, I don’t want to be littering the NetCache. That is: do I need to work hard to remove all of my BLOBs or can I just trust the automatic clean-up?**
+
+A: You can specify a "time-to-live" when you create a blob. If you don't specify a value, you can find the service's default value by calling ***GetBlobInfo()***. See the [basic ideas](ch_app.html#ch_app.The_basic_ideas) section for more details.
+
+
diff --git a/pages/ch_blast.md b/pages/ch_blast.md
new file mode 100644
index 00000000..8393c747
--- /dev/null
+++ b/pages/ch_blast.md
@@ -0,0 +1,240 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: pages/ch_blast
+---
+
+
+16\. BLAST API
+============================
+
+Created: August 22, 2006; Last Update: April 13, 2010.
+
+Overview
+--------
+
+The overview for this chapter consists of the following topics:
+
+- [Introduction](#ch_blast.intro)
+
+- [Chapter Outline](#ch_blast.outline)
+
+### Introduction
+
+BLAST (Basic Local Alignment Search Tool) is used to perform sequence similarity searches. Most often this means that BLAST is used to search a sequence (either DNA or protein) against a database of other sequences (either all nucleotide or all protein) in order to identify similar sequences. BLAST has many different flavors and can not only search DNA against DNA or protein against protein but also can translate a nucleotide query and search it against a protein database as well as the other way around. It can also compute a “profile” for the query sequence and use that for further searches as well as search the query against a database of profiles. BLAST is available as a web service at the NCBI, as a stand-alone binary, and is built into other tools. It is an extremely versatile program and probably the most heavily used similarity search program in the world. BLAST runs on a multitude of different platforms that include Windows, MacOS, LINUX, and many flavors of UNIX. It is also under continuing development with new algorithmic innovations. Multiple references to BLAST can be found on the NCBI BLAST web pages.
+
+The version of BLAST in the NCBI C++ Toolkit was rewritten from scratch based upon the version in the C Toolkit that was originally introduced in 1997. A decision was made to break the code for the new version of BLAST into two different categories. There is the “core” code of BLAST that is written in vanilla C and does not use any part of the NCBI C or C++ Toolkits. There is also the “API” code that is written in C++ and takes full advantage of the tools provided by the NCBI C++ Toolkit. The reason to write the core part of the code in vanilla C was so that the same code could be used in the C Toolkit (to replace the 1997 version) as well as to make it possible for researchers interested in algorithmic development to work with the core of BLAST independently of any Toolkit. Even though the core part was written without the benefit of the C++ or C Toolkits an effort was made to conform to the [Programming Policies and Guidelines](ch_style.html) chapter of this book. Doxygen-style comments are used to allow API documentation to be automatically generated (see the BLAST Doxygen documentation). Both the core and API parts of BLAST can be found under `algo/blast` in the C++ Toolkit.
+
+An attempt was made to isolate the user of the BLAST API (as exposed in `algo/blast/api`) from the core of BLAST, so that algorithmic enhancements or refactoring of that code would be transparent to the API programmer as far as that is possible. Since BLAST is continually under development and many of the developments involve new features it is not always possible or desirable to isolate the API programmer from these changes. This chapter will focus on the API for the C++ Toolkit. A few different search classes will be discussed. These include the ***CLocalBlast*** class, typically used for searching a query (or queries) against a BLAST database; ***CRemoteBlast***, used for sending searches to the NCBI servers; as well as ***CBl2Seq***, useful for searching target sequences that have not been formatted as a BLAST database.
+
+### Chapter Outline
+
+[CLocalBlast](#ch_blast.CLocalBlast)
+
+- [Query Sequence](#ch_blast._Query_Sequence)
+
+- [Options](#ch_blast._Options)
+
+- [Target Sequences](#ch_blast._Target_Sequences)
+
+- [Results](#ch_blast._Results)
+
+[CRemoteBlast](#ch_blast.CRemoteBlast)
+
+- [Query Sequence](#ch_blast._Query_Sequence_1)
+
+- [Options](#ch_blast._Options_1)
+
+- [Target Sequences](#ch_blast._Target_Sequences_1)
+
+- [Results](#ch_blast._Results_1)
+
+[The Uniform Interface](#ch_blast.The_Uniform_Interfac)
+
+[CBl2Seq](#ch_blast.CBl2Seq)
+
+- [Query Sequence](#ch_blast._Query_Sequence_2)
+
+- [Options and Program Type](#ch_blast.Options_and_Program_)
+
+- [Target Sequences](#ch_blast._Target_Sequences_2)
+
+- [Results](#ch_blast._Results_2)
+
+[C++ BLAST Options Cookbook](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/blast_opts_cookbook.html)
+
+[Sample Applications](#ch_blast.Sample_Applications)
+
+
+
+CLocalBlast
+-----------
+
+The class [CLocalBlast](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=CLocalBlast&d=C) can be used for searches that run locally on a machine (as opposed to sending the request over the network to use the CPU of another machine) and search a query (or queries) against a preformatted BLAST database, which holds the target sequence data in a format optimal for BLAST searches. The demonstration program `blast_demo.cpp` illustrates the use of ***CLocalBlast***. There are a few different ***CLocalBlast*** constructors, but they always take three arguments reflecting the need for a query sequence, a set of BLAST options, and a set of target sequences (e.g., BLAST database). First we discuss how to construct these arguments and then we discuss how to access the results.
+
+
+
+### Query Sequence
+
+The classes that perform BLAST searches expect to be given query sequences in one of a few formats. Each is a container for one or more query sequences expressed as ***CSeq\_loc*** objects, along with ancillary information. In this document we will only discuss classes that take either an ***SSeqLoc*** or a ***TSeqLocVector***, which is just a collection of ***SSeqLoc***’s.
+
+***CBlastInput*** is a class that converts an abstract source of sequence data into a format suitable for use by the BLAST search classes. This class may produce either a [TSeqLocVector](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=TSeqLocVector&d=T) container or a [CBlastQueryVector](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=CBlastQueryVector&d=C) container to represent query sequences. As mentioned above we limit our discussion to the ***TSeqLocVector*** class here.
+
+***CBlastInput*** can produce a single container that includes all the query sequences, or can output a batch of sequences at a time (the combined length of the sequences within each batch can be specified) until all of the sequences within the data source have been consumed.
+
+Sources of sequence data are represented by a ***CBlastInputSource***, or a class derived from it. ***CBlastInput*** uses these classes to read one sequence at a time from the data source and convert to a container suitable for use by the BLAST search classes.
+
+An example use of ***CBlastInputSource*** is ***CBlastFastaInputSource***, which represents a stream containing fasta-formatted biological sequences. Usually this class represents a collection of sequences residing in a text file. One sequence at a time is read from the file and converted into a BLAST input container.
+
+***CBlastFastaInputSource*** uses ***CBlastInputConfig*** to provide more control over the file reading process. For example, the read process can be limited to a range of each sequence, or sequence letters that appear in lowercase can be scheduled for masking by BLAST. ***CBlastInputConfig*** can be used by other classes to provide the same kind of control, although not all class members will be appropriate for every data source.
+
+
+
+### Options
+
+The BLAST options classes were designed to allow a programmer to easily set the options to values appropriate to common tasks, but then modify individual options as needed. [Table 1](#ch_blast.T18.3) lists the supported tasks.
+
+
+
+Table 1: List of tasks supported by the CBlastOptionsHandle. “Translated nucleotide” means that the input was nucleotide, but the comparison is based upon the protein. PSSM is a “position-specific scoring matrix”. The “EProgram” can be used as an argument to CBlastOptionsFactory::Create
+
+|----------------------|-----------------------|-----------------------|-----------------------|-------------------------------------------------------|
+| **EProgram (enum)** | **Default Word-size** | **Query type** | **Target type** | **Notes** |
+| ***eBlastN*** | 11 | Nucleotide | Nucleotide | |
+| ***eMegablast*** | 28 | Nucleotide | Nucleotide | Optimized for speed and closely related sequences |
+| ***eDiscMegablast*** | 11 | Nucleotide | Nucleotide | Optimized for cross-species matches |
+| ***eBlastp*** | 3 | Protein | Protein | |
+| ***eBlastx*** | 3 | Translated nucleotide | Protein | |
+| ***eTblastn*** | 3 | Protein | Translated nucleotide | |
+| ***eTblastx*** | 3 | Translated nucleotide | Translated nucleotide | |
+| ***eRPSBlast*** | 3 | Protein | PSSM | Can very quickly identify domains |
+| ***eRPSTblastn*** | 3 | Translated nucleotide | PSSM | |
+| ***ePSIBlast*** | 3 | PSSM | Protein | Extremely sensitive method to find distant homologies |
+| ***ePHIBlastp*** | 3 | Protein | Protein | Uses pattern in query to start alignments |
+
+
+
+The ***CBlastOptionsFactory*** class offers a single static method to create [CBlastOptionsHandle](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCBlastOptionsHandle.html) subclasses so that options applicable to all variants of BLAST can be inspected or modified. The actual type of the [CBlastOptionsHandle](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCBlastOptionsHandle.html) returned by the ***Create()*** method is determined by its `EProgram` argument (see [Table 1](#ch_blast.T18.3)). The return value of this function is guaranteed to have reasonable defaults set for the selected task.
+
+The ***CBlastOptionsHandle*** class encapsulates options that are common to all variants of BLAST, from which more specific tasks can inherit the common options. The subclasses of [CBlastOptionsHandle](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCBlastOptionsHandle.html) should present an interface that is more specific, i.e.: only contain options relevant to the task at hand, although it might not be an exhaustive interface for all options available for the task. Please note that the initialization of this class' data members follows the template method design pattern, and this should be followed by subclasses also. Below is an example use of the ***CBlastOptionsHandle*** to create a set of options appropriate to “blastn” and then set the expect value to a non-default value:
+
+ using namespace ncbi::blast;
+
+ CRef<CBlastOptionsHandle>
+ opts_handle(CBlastOptionsFactory::Create(eBlastn));
+ opts_handle->SetEvalueThreshold(1e-10);
+ blast(query, opts_handle, db);
+
+The [CBlastOptionsHandle](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCBlastOptionsHandle.html) classes offer a ***Validate()*** method in their interface, which is called by the BLAST search classes prior to performing the actual search, but users of the C++ BLAST options APIs might also want to invoke this method to ensure that any exceptions thrown by the BLAST search classes do not originate from an incorrect setting of BLAST options. Please note that the ***Validate()*** method throws a [CBlastException](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCBlastException.html) in case of failure.
+
+If the same type of search (e.g., nucleotide query vs. nucleotide database) will always be performed, then it may be preferable to create an instance of the derived classes of the [CBlastOptionsHandle](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCBlastOptionsHandle.html). These classes expose an interface that is relevant to the task at hand, but the popular options can be modified as necessary:
+
+ using namespace ncbi::blast;
+
+ CRef<CBlastNucleotideOptionsHandle> nucl_handle(new CBlastNucleotideOptionsHandle);
+ ...
+ nucl_handle->SetTraditionalBlastnDefaults();
+ nucl_handle->SetStrandOption(objects::eNa_strand_plus);
+ ...
+ CRef<CBlastOptionsHandle> opts(&*nucl_handle);
+ CLocalBlast blast(query_factory, opts, db);
+
+The ***CBlastOptionsHandle*** design arranges the BLAST options in a hierarchy. For example all searches that involve protein-protein comparisons (including proteins translated from a nucleotide sequence) are handled by ***CBlastProteinOptionsHandle*** or a subclass (e.g., ***CBlastxOptionsHandle***). A limitation of this design is that the introduction of new algorithms or new options that only apply to some programs may violate the class hierarchy. To allow advanced users to overcome this limitation the ***GetOptions()*** and ***SetOptions()*** methods of the [CBlastOptionsHandle](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCBlastOptionsHandle.html) hierarchy allow access to the [CBlastOptions](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCBlastOptions.html) class, the lowest level class in the C++ BLAST options API which contains all options available to all variants of the BLAST algorithm. No guarantees about the validity of the options are made if this interface is used, therefore invoking ***Validate()*** is *strongly* recommended.
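+
+For example, here is a hedged sketch of dropping down to the low-level options object and validating afterwards (the specific setter shown is only an illustration):
+
+    CBlastOptions& low_level_opts = opts_handle->SetOptions();
+    low_level_opts.SetWordSize(16);   // tweak an individual low-level option
+    opts_handle->Validate();          // throws CBlastException if the combination is invalid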
+
+
+
+### Target Sequences
+
+One may specify a BLAST database to search with the [CSearchDatabase](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=CSearchDatabase&d=C) class. Normally it is only necessary to provide a string for the database name and state whether it is a nucleotide or protein database. It is also possible to specify an entrez query or a vector of GI’s that will be used to limit the search.
+
+
+
+### Results
+
+The ***Run()*** method of ***CLocalBlast*** returns a [CSearchResultSet](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=CSearchResultSet&d=C) that may be used to obtain results of the search. The ***CSearchResultSet*** class is a random access container of ***CSearchResults*** objects, one for each query submitted in the search. The ***CSearchResults*** class provides access to the alignment (as a ***CSeq\_align\_set***), the query **`CSeq_id`**, warning or error messages that were generated during the run, as well as the filtered query regions (assuming query filtering was set).
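+
+Tying the pieces above together, a condensed sketch of a ***CLocalBlast*** run follows (header paths, the database name `nt`, and the exact constructor overloads are assumptions to be checked against `blast_demo.cpp` and the class references):
+
+    #include <algo/blast/api/local_blast.hpp>
+    #include <algo/blast/api/objmgr_query_data.hpp>
+
+    USING_NCBI_SCOPE;
+    USING_SCOPE(blast);
+
+    // 'queries' is a TSeqLocVector built earlier, e.g. via CBlastFastaInputSource.
+    CRef<IQueryFactory> query_factory(new CObjMgr_QueryFactory(queries));
+    CRef<CBlastOptionsHandle> opts(CBlastOptionsFactory::Create(eBlastn));
+    CSearchDatabase db("nt", CSearchDatabase::eBlastDbIsNucleotide);
+
+    CLocalBlast blaster(query_factory, opts, db);
+    CRef<CSearchResultSet> results = blaster.Run();
+
+    // One CSearchResults object per query.
+    for (size_t i = 0; i < results->GetNumResults(); i++) {
+        CConstRef<objects::CSeq_align_set> alignments = (*results)[i].GetSeqAlign();
+        // ... format or inspect the alignments ...
+    }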
+
+
+
+CRemoteBlast
+------------
+
+The ***CRemoteBlast*** class sends a BLAST request to the SPLITD system at the NCBI. This can be advantageous in many situations. There is no need to download the (possibly) large BLAST databases to the user’s machine; the search may be spread across many machines by the SPLITD system at the NCBI, making it very fast; and the results will be kept on the NCBI server for 36 hours in case the user wishes to retrieve them again the next day. On the other hand, the user must select one of the BLAST databases maintained by the NCBI since it is not possible to upload a custom database for searching. Here we discuss a ***CRemoteBlast*** constructor that takes three arguments, reflecting the need for a query sequence(s), a set of BLAST options, and a BLAST database. Readers are advised to read the ***CLocalBlast*** section before they read this section.
+
+
+
+### Query Sequence
+
+A ***TSeqLocVector*** should be used as input to ***CRemoteBlast***. Please see the section on [CLocalBlast](#ch_blast.CLocalBlast) for details.
+
+
+
+### Options
+
+***CBlastOptionsFactory::Create()*** can again be used to create options for ***CRemoteBlast***. In this case, though, it is necessary to set the second (default) argument of ***Create()*** to **`CBlastOptions::eRemote`**.
+
+
+
+### Target Sequences
+
+One may use the ***CSearchDatabase*** class to specify a BLAST database, similar to the method outlined in the [CLocalBlast](#ch_blast.CLocalBlast) section. In this case, though, it is important to remember that the user must select from the BLAST databases available on the NCBI Web site and not one built locally.
+
+
+
+### Results
+
+After construction of the ***CRemoteBlast*** object the user should call one of the ***SubmitSync()*** methods. After this returns the method ***GetResultSet()*** will return a ***CSearchResultSet*** which the user can interrogate using the same methods as in ***CLocalBlast***. Additionally the user may obtain the request identifier (RID) issued by the SPLITD system with the method ***GetRID()***.
+
+Finally, ***CRemoteBlast*** provides a constructor that takes a string, which it expects to be an RID issued by the SPLITD system. This RID might have been obtained by an earlier run of ***CRemoteBlast*** or it could be one that was obtained from the NCBI SPLITD system via the web page. Note that the SPLITD system will keep results on its server for 36 hours, so the RID cannot be older than that.
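+
+A hedged sketch of the remote workflow described above follows (the method names are as given in this section; the constructor overload and header path are assumptions):
+
+    #include <algo/blast/api/remote_blast.hpp>
+
+    CRef<CBlastOptionsHandle> opts(
+        CBlastOptionsFactory::Create(eBlastp, CBlastOptions::eRemote));
+    CRemoteBlast rmt_blast(query_factory, opts, db);  // db must name an NCBI-hosted database
+    rmt_blast.SubmitSync();                           // blocks until the search completes
+    CRef<CSearchResultSet> results = rmt_blast.GetResultSet();
+    string rid = rmt_blast.GetRID();                  // reusable for about 36 hours
+
+    // Within 36 hours, the same results can also be recovered through the
+    // RID-based constructor described above: CRemoteBlast by_rid(rid);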
+
+
+
+The Uniform Interface
+---------------------
+
+The [ISeqSearch](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=ISeqSearch) class is an abstract interface class. Concrete subclasses can run either local ([CLocalSeqSearch](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=CLocalSeqSearch)) or remote searches ([CRemoteSeqSearch](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=CRemoteSeqSearch)). The concrete classes will only perform an intersection of the tasks that ***CLocalBlast*** and ***CRemoteBlast*** can perform. As an example, there is no method to retrieve a Request identifier (RID) from subclasses of ***ISeqSearch*** as this is supported only for remote searches but not for local searches. The methods supported by the concrete subclasses and the return values are similar to those of ***CLocalBlast*** and ***CRemoteBlast***.
+
+
+
+CBl2Seq
+-------
+
+[CBl2Seq](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=CBl2Seq&d=C) is a class useful for searching a query (or queries) against one or more target sequences that have not been formatted as a BLAST database. These sequences may, for example, come from a user who pasted them into a web page or be fetched from the Entrez or ID1 services at the NCBI. The ***CBl2Seq*** constructors all take three arguments, reflecting the need for a set of query sequences, a set of target sequences, and some information about the BLAST options or program type to use. In this section it is assumed the reader has already read the previous section on ***CLocalBlast***.
+
+The BLAST database holds the target sequence data in a format optimal for BLAST searches, so that if a target sequence is to be searched more than a few times it is best to convert it to a BLAST database and use ***CLocalBlast***.
+
+
+
+### Query Sequence
+
+The query sequence (or sequences) is represented either as an ***SSeqLoc*** (for a single query sequence) or as a ***TSeqLocVector*** (in the case of multiple query sequences). The ***CBlastInput*** class, described in the [CLocalBlast](#ch_blast.CLocalBlast) section, can be used to produce a ***TSeqLocVector***.
+
+
+
+### Options and Program Type
+
+The ***CBl2Seq*** constructor takes either an [EProgram](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=EProgram) enum (see [Table 1](#ch_blast.T18.3)) or [CBlastOptionsHandle](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=CBlastOptionsHandle) (see relevant section under [CLocalBlast](#ch_blast.CLocalBlast)). In the former case the default set of options for the given ***EProgram*** are used. In the latter case it is possible for the user to set options to non-default values.
+
+
+
+### Target Sequences
+
+The target sequence(s) is represented either as an ***SSeqLoc*** or a ***TSeqLocVector***.
+
+
+
+### Results
+
+The ***Run()*** method of the ***CBl2Seq*** class returns a collection of ***CSeq\_align\_set***’s. The method ***GetMessages()*** may be used to obtain any error or warning messages generated during the search.
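+
+For illustration, here is a hedged sketch of a single query/target comparison (the ***SSeqLoc*** construction and the header path are assumptions; see `blast_sample.cpp` for a complete example):
+
+    #include <algo/blast/api/bl2seq.hpp>
+
+    // query_loc/target_loc are CSeq_loc objects, each paired with its CScope.
+    SSeqLoc query(query_loc, query_scope);
+    SSeqLoc target(target_loc, target_scope);
+
+    CBl2Seq blaster(query, target, eBlastn);     // default blastn options
+    TSeqAlignVector alignments = blaster.Run();  // one CSeq_align_set per query/target pair
+    // (warnings or errors, if any, are available via GetMessages())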
+
+
+
+Sample Applications
+-------------------
+
+The following are sample applications that demonstrate the usage of the CBl2Seq and CLocalBlast classes respectively:
+
+- [blast\_sample.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/blast/blast_sample.cpp)
+
+- [blast\_demo.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/blast/blast_demo.cpp)
+
+
diff --git a/pages/ch_boost.md b/pages/ch_boost.md
new file mode 100644
index 00000000..d5a1d94f
--- /dev/null
+++ b/pages/ch_boost.md
@@ -0,0 +1,638 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: pages/ch_boost
+---
+
+
+20\. Using the Boost Unit Test Framework
+======================================================
+
+Last Update: November 13, 2014.
+
+Overview
+--------
+
+The overview for this chapter consists of the following topics:
+
+- Introduction
+
+- Chapter Outline
+
+### Introduction
+
+This chapter discusses the Boost Unit Test Framework and how to use it within NCBI. The NCBI C++ Toolkit has incorporated and extended the open source [Boost.Test Library](http://www.boost.org/doc/libs/1_53_0/libs/test/doc/html/index.html), and provides a simplified way for the developers to create Boost-based C++ unit tests.
+
+The NCBI extensions add the ability to:
+
+- execute the code in a standard (*CNcbiApplication*-like) environment;
+
+- disable test cases or suites, using one of several methods;
+
+- establish dependencies between test cases and suites;
+
+- use NCBI command-line argument processing;
+
+- add initialization and finalization functions; and
+
+- use convenience macros for combining **`NO_THROW`** with other test tools.
+
+While the framework may be of interest to outside organizations, this chapter is intended for NCBI C++ developers. See also the Doxygen documentation for [tests](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/group__Tests.html).
+
+### Chapter Outline
+
+The following is an outline of the topics presented in this chapter:
+
+- [Why Use the Boost Unit Test Framework?](#ch_boost.Why_Use_the_Boost_Un)
+
+- [How to Use the Boost Unit Test Framework](#ch_boost.How_to_Use_the_Boost)
+
+ - [Creating a New Unit Test](#ch_boost.Creating_a_New_Unit_)
+
+ - [Customizing an Existing Unit Test](#ch_boost.Customizing_an_Exist)
+
+ - [Modifying the Makefile](#ch_boost.Modifying_the_Makefi)
+
+ - [Modifying the Source File](#ch_boost.Modifying_the_Source)
+
+ - [Using Testing Tools](#ch_boost.Using_Testing_Tools)
+
+ - [Adding Initialization and/or Finalization](#ch_boost.Adding_Initializatio)
+
+ - [Handling Timeouts](#ch_boost.Handling_Timeouts)
+
+ - [Handling Command-Line Arguments in Test Cases](#ch_boost.Handling_CommandLine)
+
+ - [Creating Test Suites](#ch_boost.Creating_Test_Suites)
+
+ - [Managing Dependencies](#ch_boost.Managing_Dependencie)
+
+ - [Unit Tests with Multiple Files](#ch_boost.Unit_Tests_with_Mult)
+
+ - [Disabling Tests](#ch_boost.Disabling_Tests)
+
+ - [Disabling Tests with Configuration File Entries](#ch_boost._Disabling_Tests_with)
+
+ - [Library-Defined Variables](#ch_boost.LibraryDefined_Variables)
+
+ - [User-Defined Variables](#ch_boost._Disabling_Tests_with_1)
+
+ - [Disabling or Skipping Tests Explicitly in Code](#ch_boost.Disabling_Tests_Expl)
+
+ - [Viewing Unit Tests Results from the Nightly Build](#ch_boost.Viewing_Unit_Tests_R)
+
+ - [Running Unit Tests from a Command-Line](#ch_boost.Running_Unit_Tests_f)
+
+ - [Limitations Of The Boost Unit Test Framework](#ch_boost.Limitations_of_the_B)
+
+
+
+Why Use the Boost Unit Test Framework?
+--------------------------------------
+
+“...*I would like to see a practical plan for every group in Internal Services to move toward standardized testing. Then, in addition to setting an example for the other coding groups, I hope that you will have guidance for them as well about how best to move ahead in this direction. Once you have that, and are adhering to it yourselves, I will start pushing the other coding groups in that direction*.”
+
+- Jim Ostell, April 21, 2008
+
+The value of unit testing is clearly recognized at the highest levels of management at NCBI. Here are some of the ways that using the Boost Unit Test Framework will directly benefit the developer:
+
+- The framework provides a uniform (and well-supported) testing and reporting environment.
+
+- Using the framework simplifies the process of creating and maintaining unit tests:
+
+ - The framework helps keep tests well-structured, straightforward, and easily expandable.
+
+ - You can concentrate on the testing of your functionality, while the framework takes care of all the testing infrastructure.
+
+- The framework fits into the NCBI nightly build system:
+
+ - All tests are run nightly on many platforms.
+
+ - All results are archived and available through a [web interface](http://intranet/ieb/ToolBox/STAT/test_stat/test_stat_ext.cgi).
+
+
+
+How to Use the Boost Unit Test Framework
+----------------------------------------
+
+This chapter assumes you are starting from a working Toolkit source tree. If not, please refer to the chapters on [obtaining the source code](ch_getcode_svn.html), and [configuring and building the Toolkit](ch_config.html).
+
+
+
+### Creating a New Unit Test
+
+On Unix or MS Windows, use the [new\_project](ch_proj.html#ch_proj.new_project_Starting) script to create a new unit test project:
+
+ new_project app/unit_test
+
+For example, to create a project named `foo`, type this in a command shell:
+
+ new_project foo app/unit_test
+
+This creates a directory named foo and then creates two projects within the new directory. One project will be the one named on the command-line (e.g. `foo`) and will contain a sample unit test using all the basic features of the Boost library. The other project will be named `unit_test_alt_sample` and will contain samples of advanced techniques not required in most unit tests.
+
+You can build and run these projects immediately to see how they work:
+
+ cd foo
+ make
+ make check
+
+Once your unit test is created, you must [customize](#ch_boost.Customizing_an_Exist) it to meet your testing requirements. This involves editing these files:
+
+
+
+|-------------------------------------|-------------------------------------------------------------------------------------------------|
+| **File** | **Purpose** |
+| `Makefile` | Main makefile for this directory - builds both the `foo` and `unit_test_alt_sample` unit tests. |
+| `Makefile.builddir` | Contains the path to a pre-built C++ Toolkit. |
+| `Makefile.foo_app` | Makefile for the `foo` unit test. |
+| `Makefile.in` | |
+| `Makefile.unit_test_alt_sample_app` | Makefile for the `unit_test_alt_sample` unit test. |
+| `foo.cpp` | Source code for the `foo` unit test. |
+| `unit_test_alt_sample.cpp` | Source code for the `unit_test_alt_sample` unit test. |
+| `unit_test_alt_sample.ini` | Configuration file for the `unit_test_alt_sample` unit test. |
+
+
+
+
+
+### Customizing an Existing Unit Test
+
+This section contains the following topics:
+
+- [Modifying the Makefile](#ch_boost.Modifying_the_Makefi)
+
+- [Modifying the Source File](#ch_boost.Modifying_the_Source)
+
+ - [Using Testing Tools](#ch_boost.Using_Testing_Tools)
+
+ - [Adding Initialization and/or Finalization](#ch_boost.Adding_Initializatio)
+
+ - [Handling Timeouts](#ch_boost.Handling_Timeouts)
+
+ - [Handling Command-Line Arguments in Test Cases](#ch_boost.Handling_CommandLine)
+
+ - [Creating Test Suites](#ch_boost.Creating_Test_Suites)
+
+ - [Managing Dependencies](#ch_boost.Managing_Dependencie)
+
+ - [Unit Tests with Multiple Files](#ch_boost.Unit_Tests_with_Mult)
+
+- [Disabling Tests](#ch_boost.Disabling_Tests)
+
+ - [Disabling Tests with Configuration File Entries](#ch_boost._Disabling_Tests_with)
+
+ - [Library-Defined Variables](#ch_boost.LibraryDefined_Variables)
+
+ - [User-Defined Variables](#ch_boost._Disabling_Tests_with_1)
+
+ - [Disabling or Skipping Tests Explicitly in Code](#ch_boost.Disabling_Tests_Expl)
+
+
+
+#### Modifying the Makefile
+
+The [new\_project](ch_proj.html#ch_proj.new_project_Starting) script generates a new unit test project that includes everything needed to use the Boost Unit Test Framework, but it won’t include anything specifically needed to build the library or application you are testing.
+
+Therefore, edit the unit test makefile (e.g. `Makefile.foo.app`) and add the appropriate paths and libraries needed by your library or application. Note that although the `new_project` script creates five makefiles, you will generally need to edit only one. If you are using Windows, please see the FAQ on [adding libraries to Visual C++ projects](ch_faq.html#ch_faq.How_do_I_add_a_library_to_a_Visua).
+
+Because the unit tests are based on the Boost Unit Test Framework, the makefiles must specify:
+
+ REQUIRES = Boost.Test.Included
+
+If you are using the `new_project` script (recommended), this setting is included automatically. Otherwise, make sure that `Boost.Test.Included` is listed in `REQUIRES`.
+
+***Note:*** Please also see the "[Defining and running tests](ch_proj.html#ch_proj.inside_tests)" section for unit test makefile information that isn't specific to Boost.
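+
+For orientation, a hypothetical `Makefile.foo.app` might look roughly like this (the library list is only a sketch; add whatever your code under test requires):
+
+    APP = foo
+    SRC = foo
+    LIB = test_boost xncbi
+    LIBS = $(ORIG_LIBS)
+
+    REQUIRES = Boost.Test.Included
+
+    CHECK_CMD =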
+
+
+
+#### Modifying the Source File
+
+A unit test is simply a test of a unit of code, such as a class. Because each unit has many requirements, each unit test has many test cases. Your unit test code should therefore consist of a test case for each testable requirement. Each test case should be as small and independent of other test cases as possible. For information on how to handle dependencies between test cases, see the section on [managing dependencies](#ch_boost.Managing_Dependencie).
+
+Starting with an existing unit test source file, simply add, change, or remove test cases as appropriate for your unit test. Test cases are defined by the **`BOOST_AUTO_TEST_CASE`** macro, which looks similar to a function. The macro has a single argument (the test case name) and a block of code that implements the test. Test case names must be unique at each level of the test suite hierarchy (see [managing dependencies](#ch_boost.Managing_Dependencie)). Test cases should contain code that will succeed if the requirement under test is correctly implemented, and fail otherwise. Determination of success is made using Boost [testing tools](#ch_boost.Using_Testing_Tools) such as **`BOOST_REQUIRE`** and **`BOOST_CHECK`**.
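+
+For example, a minimal test case along these lines might look like the following sketch (the header path follows the Toolkit convention for the NCBI-extended Boost.Test; the checks themselves are placeholders):
+
+    #include <corelib/test_boost.hpp>
+
+    USING_NCBI_SCOPE;
+
+    BOOST_AUTO_TEST_CASE(TestStringBasics)
+    {
+        string s = "hello";
+        BOOST_REQUIRE( !s.empty() );          // fatal on failure: abort this test case
+        BOOST_CHECK_EQUAL( s.size(), 5U );    // non-fatal: later checks still run
+        BOOST_CHECK( s + s == "hellohello" );
+    }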
+
+The following sections discuss modifying the source file in more detail:
+
+- [Using Testing Tools](#ch_boost.Using_Testing_Tools)
+
+- [Adding Initialization and/or Finalization](#ch_boost.Adding_Initializatio)
+
+- [Handling Timeouts](#ch_boost.Handling_Timeouts)
+
+- [Handling Command-Line Arguments in Test Cases](#ch_boost.Handling_CommandLine)
+
+- [Creating Test Suites](#ch_boost.Creating_Test_Suites)
+
+- [Managing Dependencies](#ch_boost.Managing_Dependencie)
+
+- [Unit Tests with Multiple Files](#ch_boost.Unit_Tests_with_Mult)
+
+
+
+##### Using Testing Tools
+
+Testing tools are macros that are used to detect errors and determine whether a given test case passes or fails.
+
+While at a basic level test cases can pass or fail, it is useful to distinguish between those failures that make subsequent testing pointless or impossible and those that don’t. Therefore, there are two levels of testing: **`CHECK`** (which upon failure generates an error but allows subsequent testing to continue), and **`REQUIRE`** (which upon failure generates a fatal error and aborts the current test case). In addition, there is a warning level, **`WARN`**, that can report something of interest without generating an error, although by default you will have to [set a command-line argument](#ch_boost.Running_Unit_Tests_f) to see warning messages.
+
+If the failure of one test case should result in skipping another then you should [add a dependency](#ch_boost.Managing_Dependencie) between them.
+
+Many Boost testing tools have variants for each error level. The most common Boost testing tools are:
+
+
+
+|--------------------------------------------------|-------------------------------------------------------------------------------------------------------------|
+| **Testing Tool** | **Purpose** |
+| **`BOOST_<level>(predicate)`**                   | Fails if the Boolean predicate (any logical expression) is false.                                            |
+| **`BOOST_<level>_EQUAL(left, right)`**           | Fails if the two values are not equal.                                                                       |
+| **`BOOST_<level>_THROW(expression, exception)`** | Fails if execution of the expression doesn’t throw an exception of the given type (or one derived from it).  |
+| **`BOOST_<level>_NO_THROW(expression)`**         | Fails if execution of the expression throws any exception.                                                   |
+
+
+
+In the names above, **`<level>`** is one of **`CHECK`**, **`REQUIRE`**, or **`WARN`**. Note that **`BOOST_<level>_EQUAL(var1,var2)`** is equivalent to **`BOOST_<level>(var1==var2)`**, but in the case of failure it prints the value of each variable, which can be helpful. Also, it is not a good idea to compare floating point values directly - instead, use **`BOOST_<level>_CLOSE(var1,var2,tolerance)`**.
+
+See the Boost testing tools [reference page](http://www.boost.org/doc/libs/1_53_0/libs/test/doc/html/utf/testing-tools/reference.html) for documentation on these and other testing tools.
+
+The NCBI extensions to the Boost library add a number of convenience testing tools that enclose the similarly-named Boost testing tools in a **`NO_THROW`** test:
+
+
+
+|----------------------------------------|-------------------------------------------|
+| **Boost Testing Tool**                   | **NCBI "NO\_THROW" Extension**              |
+| **`BOOST_<level>(predicate)`**           | **`NCBITEST_<level>(predicate)`**           |
+| **`BOOST_<level>_EQUAL(left, right)`**   | **`NCBITEST_<level>_EQUAL(left, right)`**   |
+| **`BOOST_<level>_NE(left, right)`**      | **`NCBITEST_<level>_NE(left, right)`**      |
+| **`BOOST_<level>_MESSAGE(pred, msg)`**   | **`NCBITEST_<level>_MESSAGE(pred, msg)`**   |
+
+
+
+***Note:*** Testing tools are only supported within the context of test cases. That is, within functions defined by the **`BOOST_AUTO_TEST_CASE`** macro and within functions called by a test case. They are not supported in functions defined by the **`NCBITEST_*`** macros.
+
+
+
+##### Adding Initialization and/or Finalization
+
+If your unit test requires initialization prior to executing test cases, or if finalization / clean-up is necessary, use these functions:
+
+ NCBITEST_AUTO_INIT()
+ {
+ // Your initialization code here...
+ }
+
+ NCBITEST_AUTO_FINI()
+ {
+ // Your finalization code here...
+ }
+
+
+
+##### Handling Timeouts
+
+If exceeding a maximum execution time constitutes a failure for your test case, use this:
+
+ // change the second parameter to the duration of your timeout in seconds
+ BOOST_AUTO_TEST_CASE_TIMEOUT(TestTimeout, 3);
+ BOOST_AUTO_TEST_CASE(TestTimeout)
+ {
+ // Your test code here...
+ }
+
+
+
+##### Handling Command-Line Arguments in Test Cases
+
+It is possible to retrieve command-line arguments from your test cases using the standard C++ Toolkit [argument handling API](ch_core.html#ch_core.cmd_line_args). The first step is to initialize the unit test to expect the arguments. Add code like the following to your source file:
+
+ NCBITEST_INIT_CMDLINE(descrs)
+ {
+ // Add calls like this for each command-line argument to be used.
+ descrs->AddOptionalPositional("some_arg",
+ "Sample command-line argument.",
+ CArgDescriptions::eString);
+ }
+
+For more examples of argument processing, see [test\_ncbiargs\_sample.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/corelib/test/test_ncbiargs_sample.cpp).
+
+Next, add code like the following to access the argument from within a test case:
+
+ BOOST_AUTO_TEST_CASE(TestCaseName)
+ {
+ const CArgs& args = CNcbiApplication::Instance()->GetArgs();
+ string arg_value = args["some_arg"].AsString();
+ // do something with arg_value ...
+ }
+
+Adding your own command-line arguments will not affect the application’s ability to process other command-line arguments such as `-help` or `-dryrun`.
+
+
+
+##### Creating Test Suites
+
+Test suites are simply groups of test cases. The test cases included in a test suite are those that appear between the beginning and ending test suite declarations:
+
+ BOOST_AUTO_TEST_SUITE(TestSuiteName)
+
+ BOOST_AUTO_TEST_CASE(TestCase1)
+ {
+ //...
+ }
+
+ BOOST_AUTO_TEST_CASE(TestCase2)
+ {
+ //...
+ }
+
+ BOOST_AUTO_TEST_SUITE_END();
+
+Note that the beginning test suite declaration defines the test suite name and does not include a semicolon.
+
+
+
+##### Managing Dependencies
+
+Test cases and suites can be dependent on other test cases or suites. This is useful when it doesn’t make sense to run a test after some other test fails:
+
+ NCBITEST_INIT_TREE()
+ {
+ // define individual dependencies
+ NCBITEST_DEPENDS_ON(test_case_dep, test_case_indep);
+ NCBITEST_DEPENDS_ON(test_case_dep, test_suite_indep);
+ NCBITEST_DEPENDS_ON(test_suite_dep, test_case_indep);
+ NCBITEST_DEPENDS_ON(test_suite_dep, test_suite_indep);
+
+ // define multiple dependencies
+ NCBITEST_DEPENDS_ON_N(item_dep, 2, (item_indep1, item_indep2));
+ }
+
+When an independent test item (case or suite) fails, all of the test items that depend on it will be skipped.
+
+
+
+##### Unit Tests with Multiple Files
+
+The [new\_project](ch_proj.html#ch_proj.new_project_Starting) script is designed to create single-file unit tests by default, but you can add as many files as necessary to implement your unit test. Use of the **`BOOST_AUTO_TEST_MAIN`** macro is now deprecated.
+
+
+
+#### Disabling Tests
+
+The Boost Unit Test Framework was extended by NCBI to provide several ways to disable test cases and suites. Test cases and suites are disabled based on logical expressions in the application configuration file or, less commonly, by explicitly disabling or skipping them. The logical expressions are based on unit test variables which are defined either by the library or by the user. All such variables are essentially Boolean in that they are either defined (**`true`**) or not defined (**`false`**). ***Note:*** these methods of disabling tests don't apply if specific tests are [run from the command-line](#ch_boost.Running_Unit_Tests_f).
+
+- [Disabling Tests with Configuration File Entries](#ch_boost._Disabling_Tests_with)
+
+- [Library-Defined Variables](#ch_boost.LibraryDefined_Variables)
+
+- [User-Defined Variables](#ch_boost._Disabling_Tests_with_1)
+
+- [Disabling or Skipping Tests Explicitly in Code](#ch_boost.Disabling_Tests_Expl)
+
+
+
+##### Disabling Tests with Configuration File Entries
+
+The **`[UNITTESTS_DISABLE]`** section of the application configuration file can be customized to disable test cases or suites. Entries in this section should specify a test case or suite name and a logical expression for disabling it (expressions that evaluate to **`true`** disable the test). The logical expression can be formed from the logical constants **`true`** and **`false`**, numeric constants, [library-defined](#ch_boost.LibraryDefined_Variables) or [user-defined](#ch_boost._Disabling_Tests_with_1) unit test variables, logical operators ('`!`', '`&&`', and '`||`'), and parentheses.
+
+To disable specific tests, use commands like:
+
+ [UNITTESTS_DISABLE]
+ SomeTestCaseName = OS_Windows && PLATFORM_BigEndian
+ SomeTestSuiteName = (OS_Linux || OS_Solaris) && COMPILER_GCC
+
+There is a special entry `GLOBAL` that can be used to disable all tests. For example, to disable all tests under Cygwin, use:
+
+ [UNITTESTS_DISABLE]
+ GLOBAL = OS_Cygwin
+
+***Note***: If the configuration file contains either a test name or a variable name that has not been defined (e.g. due to a typo) then the test program will exit immediately with an error, without executing any tests.
+
+
+
+##### Library-Defined Variables
+
+When the NCBI-extended Boost Test library is built, it defines a set of unit test variables based on the build, compiler, operating system, and platform. See [Table 1](#ch_boost.IT1) for a list of related variables ([test\_boost.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/corelib/test_boost.cpp) has the latest [list of variables](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=x_InitCommonParserVars&d=)).
+
+
+
+Table 1. Build Generated Predefined Variables
+
+| `Builds` | `Compilers` | `Operating Systems` | `Platforms` |
+|----------------------|----------------------|---------------------|-------------------------|
+| `BUILD_Debug` | `COMPILER_Compaq` | `OS_AIX` | `PLATFORM_BigEndian` |
+| `BUILD_Dll` | `COMPILER_GCC` | `OS_BSD` | `PLATFORM_Bits32` |
+| `BUILD_Release` | `COMPILER_ICC` | `OS_Cygwin` | `PLATFORM_Bits64` |
+| `BUILD_Static` | `COMPILER_KCC` | `OS_Irix` | `PLATFORM_LittleEndian` |
+| | `COMPILER_MipsPro` | `OS_Linux` | |
+| | `COMPILER_MSVC` | `OS_MacOS` | |
+| | `COMPILER_VisualAge` | `OS_MacOSX` | |
+| | `COMPILER_WorkShop` | `OS_Solaris` | |
+| | | `OS_Tru64` | |
+| | | `OS_Unix` | |
+| | | `OS_Windows` | |
+
+
+
+At run-time, the library also checks the `FEATURES` environment variable and creates unit test variables based on the current set of features. See [Table 2](#ch_boost.IT2) for a list of feature, package, and project related variables ([test\_boost.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/corelib/test_boost.cpp) has the latest [list of features](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=s_NcbiFeatures&d=)).
+
+
+
+Table 2. Check Script Generated Predefined Variables
+
+| `Features` | `Packages` | `Projects` |
+|----------------------|-----------------------------------------------------------------------|----------------------|
+| `AIX` | `BerkeleyDB` | `algo` |
+| `BSD` | `BerkeleyDB__` `(use for BerkeleyDB++)` | `app` |
+| `CompaqCompiler` | `Boost_Regex` | `bdb` |
+| `Cygwin` | `Boost_Spirit` | `cgi` |
+| `CygwinMT` | `Boost_Test` | `connext` |
+| `DLL` | `Boost_Test_Included` | `ctools` |
+| `DLL_BUILD` | `Boost_Threads` | `dbapi` |
+| `Darwin` | `BZ2` | `gbench` |
+| `GCC` | `C_ncbi` | `gui` |
+| `ICC` | `C_Toolkit` | `local_bsm` |
+| `in_house_resources` | `CPPUNIT` | `ncbi_crypt` |
+| `IRIX` | `EXPAT` | `objects` |
+| `KCC` | `Fast_CGI` | `serial` |
+| `Linux` | `FLTK` | |
+| `MIPSpro` | `FreeTDS` | |
+| `MSVC` | `FreeType` | |
+| `MSWin` | `FUSE` | |
+| `MT` | `GIF` | |
+| `MacOS` | `GLUT` | |
+| `Ncbi_JNI` | `GNUTLS` | |
+| `OSF` | `HDF5` | |
+| `PubSeqOS` | `ICU` | |
+| `SRAT_internal` | `JPEG` | |
+| `Solaris` | `LIBXML` | |
+| `unix` | `LIBXSLT` | |
+| `VisualAge` | `LocalBZ2` | |
+| `WinMain` | `LocalMSGMAIL2` | |
+| `WorkShop` | `LocalNCBILS` | |
+| `XCODE` | `LocalPCRE` | |
+| | `LocalSSS` | |
+| | `LocalZ` | |
+| | `LZO` | |
+| | `MAGIC` | |
+| | `MESA` | |
+| | `MUPARSER` | |
+| | `MySQL` | |
+| | `NCBILS2` | |
+| | `ODBC` | |
+| | `OECHEM` | |
+| | `OpenGL` | |
+| | `OPENSSL` | |
+| | `ORBacus` | |
+| | `PCRE` | |
+| | `PNG` | |
+| | `PYTHON` | |
+| | `PYTHON23` | |
+| | `PYTHON24` | |
+| | `PYTHON25` | |
+| | `SABLOT` | |
+| | `SGE` | |
+| | `SP` | |
+| | `SQLITE` | |
+| | `SQLITE3` | |
+| | `SQLITE3ASYNC` | |
+| | `SSSDB` | |
+| | `SSSUTILS` | |
+| | `Sybase` | |
+| | `SybaseCTLIB` | |
+| | `TIFF` | |
+| | `UNGIF` | |
+| | `UUID` | |
+| | `Xalan` | |
+| | `Xerces` | |
+| | `XPM` | |
+| | `Z` | |
+| | `wx2_8` | |
+| | `wxWidgets` | |
+| | `wxWindows` | |
+
+
+
+The automated nightly test suite defines the `FEATURES` environment variable before launching the unit test applications. In this way, unit test applications can also use run-time detected features to exclude specific tests from the test suite.
+
+***Note:*** The names of the features are modified slightly when creating unit test variables from names in the `FEATURES` environment variable. Specifically, each feature is prefixed by `FEATURE_` and all non-alphanumeric characters are changed to underscores. For example, to require the feature `in-house-resources` for a test (i.e. to disable the test if the feature is not present), use:
+
+ [UNITTESTS_DISABLE]
+ SomeTestCaseName = !FEATURE_in_house_resources
+
+
+
+##### User-Defined Variables
+
+You can define your own variables to provide finer control on disabling tests. First, define a variable in your source file:
+
+ NCBITEST_INIT_VARIABLES(parser)
+ {
+ parser->AddSymbol("my_ini_var", <bool expression>);
+ }
+
+Then add a line to the configuration file to disable a test based on the value of the new variable:
+
+ [UNITTESTS_DISABLE]
+ MyTestName = my_ini_var
+
+User-defined variables can be used in conjunction with [command-line arguments](#ch_boost.Handling_CommandLine):
+
+ NCBITEST_INIT_VARIABLES(parser)
+ {
+ const CArgs& args = CNcbiApplication::Instance()->GetArgs();
+ parser->AddSymbol("my_ini_var", args["my_arg"].HasValue());
+ }
+
+Then, passing the argument on the command-line controls the disabling of the test case:
+
+ ./foo my_arg # test is disabled
+ ./foo # test is not disabled (at least via command-line / config file)
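+
+For the command-line case above to work, `my_arg` must also be declared, following the same pattern shown in [Handling Command-Line Arguments in Test Cases](#ch_boost.Handling_CommandLine); a sketch:
+
+    NCBITEST_INIT_CMDLINE(descrs)
+    {
+        descrs->AddOptionalPositional("my_arg",
+                                      "If present, disables MyTestName.",
+                                      CArgDescriptions::eString);
+    }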
+
+
+
+##### Disabling or Skipping Tests Explicitly in Code
+
+The NCBI extensions include a macro, **`NCBITEST_DISABLE`**, to unconditionally disable a test case or suite. This macro must be placed in the **`NCBITEST_INIT_TREE`** function:
+
+ NCBITEST_INIT_TREE()
+ {
+ NCBITEST_DISABLE(test_case_name);
+ NCBITEST_DISABLE(test_suite_name);
+ }
+
+The extensions also include two functions for globally disabling or skipping all tests. These functions should be called only from within the **`NCBITEST_AUTO_INIT`** or **`NCBITEST_INIT_TREE`** functions:
+
+ NCBITEST_INIT_TREE()
+ {
+ NcbiTestSetGlobalDisabled(); // A given unit test might include one
+ NcbiTestSetGlobalSkipped(); // or the other of these, not both.
+ // Most unit tests won’t use either.
+ }
+
+The difference between these functions is that globally disabled unit tests will report the status **`DIS`** to check scripts while skipped tests will report the status **`SKP`**.
+
+
+
+### Viewing Unit Tests Results from the Nightly Build
+
+The Boost Unit Test Framework provides more than just command-line testing. Each unit test built with the framework becomes incorporated into nightly testing and is tested on multiple platforms and under numerous configurations. All such results are archived in the database and available through a [web interface](http://intranet/ieb/ToolBox/STAT/test_stat/test_stat_ext.cgi).
+
+The main page (see [Figure 1](#ch_boost.F20.1)) provides many ways to narrow down the vast quantity of statistics available. The top part of the page allows you to select test date, test result, build configuration (branch, compiler, operating system, etc), debug/release, and more. The page also has a column for selecting tests, and a column for configurations. For best results, refine the selection as much as possible, and then click on the “See test statistics” button.
+
+
+
+[](/book/static/img/TestInterface.png "Click to see the full-resolution image")
+
+Figure 1. Test Interface
+
+The “See test statistics” button retrieves the desired statistics in a second page (see [Figure 2](#ch_boost.F20.2)). The results are presented in tables: one for each selected date, with unit tests down the left side and configurations across the top. Further refinements of the displayed results can be made by removing rows, columns, or dates; and by selecting whether all columns, all cells, or only selected cells are displayed.
+
+
+
+[](/book/static/img/TestMatrix.png "Click to see the full-resolution image")
+
+Figure 2. Test Matrix
+
+Each cell in the results tables represents a specific unit test performed on a specific date under a specific configuration. Clicking on a cell retrieves a third page (see [Figure 3](#ch_boost.F20.3)) that shows information about that test and its output.
+
+
+
+[](/book/static/img/TestResult.png "Click to see the full-resolution image")
+
+Figure 3. Test Result
+
+
+
+### Running Unit Tests from a Command-Line
+
+To run one or more selected test cases from a command-line, use this:
+
+ ./foo --run_test=TestCaseName1,TestCaseName2
+
+Multiple test cases can be selected by using a comma-separated list of names.
+
+To see all test cases in a unit test, use this:
+
+ ./foo -dryrun
+
+To see exactly which test cases passed and failed, use this:
+
+ ./foo --report_level=detailed
+
+To see warning messages, use this:
+
+ ./foo --log_level=warning
+
+Additional runtime parameters can be set. For a complete list, see the online [documentation](http://www.boost.org/doc/libs/1_53_0/libs/test/doc/html/utf/user-guide/runtime-config/reference.html).
+
+
+
+### Limitations of the Boost Unit Test Framework
+
+The currently known limitations are:
+
+- It is not suitable for most multi-threaded tests.
+
+- It is not suitable for "one-piece" applications (such as server or CGI). Such applications should be tested via their clients (which would preferably be unit test based).
+
+
diff --git a/pages/ch_browse.md b/pages/ch_browse.md
new file mode 100644
index 00000000..0e4338a6
--- /dev/null
+++ b/pages/ch_browse.md
@@ -0,0 +1,58 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: pages/ch_browse
+---
+
+
+27\. NCBI C++ Toolkit Source Browser
+==================================================
+
+Source Browsers
+---------------
+
+The overview for this chapter consists of the following topics:
+
+- Introduction
+
+- Chapter Outline
+
+### Introduction
+
+The NCBI C++ Toolkit source code is highly browseable and can be searched in a variety of useful ways. To that end we provide two source browsers, one based on the [LXR Engine](#ch_browse.lxr) and another based on [Doxygen](#ch_browse.doxygen). These are complementary approaches that allow the Toolkit source to be searched and navigated according to its file hierarchy and present an alphabetical list of all classes, macros, variables, typedefs, etc. named in the Toolkit, as well as a summary of the parent-child relationships among the classes.
+
+### Chapter Outline
+
+The following is an outline of the topics presented in this chapter:
+
+- [LXR](#ch_browse.lxr)
+
+- [Doxygen Browser](#ch_browse.doxygen)
+
+
+
+LXR
+---
+
+The [LXR Engine](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/blurb.html) enables search-driven browsing together with a more conventional [navigation](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source) of the Toolkit's source. In [source](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source) mode, LXR provides navigation of the source tree through a Web-based front end. The LXR search modes [ident](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident), [find](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/find) and [search](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/search) will generate a list to identify all occurrences in the Toolkit where an identifier, file name, or specified free text, respectively, is found.
+
+An `identifier` in an LXR search is the name of a class, function, variable, macro, typedef, or other named entity within the Toolkit source. This search can be especially handy when attempting to determine, for example, which header has been left out when a symbol reference cannot be found.
+
+Some hints for using LXR:
+
+- For free-text LXR searches, patterns, wildcards, and regular expression syntax are allowed. See the [Search Help page](http://tidy.sourceforge.net/lxr_search_help.html) for details.
+
+- The identifier ("ident") and file ("find") LXR search modes attempt an **exact** and **case-sensitive** match to your query.
+
+- LXR indexes files from a root of `$NCBI/c++`; matches will be found not only in `src` and `include` but also in any resident build tree and the `compilers` and `scripts` directories.
+
+- ***Note***: The documentation itself is not searched by LXR.
+
+
+
+Doxygen Browser
+---------------
+
+The Doxygen tool has been used to generate a [Toolkit source code browser](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/index.html) from the source code files. The documentation is extracted directly from the sources, which makes it much easier to keep it consistent with the source code. Doxygen has also been configured to extract the code structure directly from the source files, a feature that quickly helps you find your way around large source distributions. You can also visualize the relations between the various elements by means of dependency graphs, inheritance diagrams, and collaboration diagrams, which are all generated automatically.
+
+
diff --git a/pages/ch_build.md b/pages/ch_build.md
new file mode 100644
index 00000000..cfa1c2cc
--- /dev/null
+++ b/pages/ch_build.md
@@ -0,0 +1,404 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: pages/ch_build
+---
+
+
+5\. Working with Makefiles
+========================================
+
+Last Update: September 26, 2014.
+
+Overview
+--------
+
+The overview for this chapter consists of the following topics:
+
+- Introduction
+
+- Chapter Outline
+
+### Introduction
+
+Building executables and libraries for a large, integrated set of software tools such as the C++ Toolkit, and doing so consistently on different platforms and architectures, is a daunting task. Therefore, the Toolkit developers have expended considerable effort to design a build system based upon the **make** utility as controlled by `makefiles`. Although it is, of course, possible to write one's own Toolkit `makefile` from scratch, it is seldom desirable. To take advantage of the experience, wisdom, and alchemy invested in the Toolkit and to help avoid often inscrutable compilation issues:
+
+
+
+> **We strongly advise users to work with the Toolkit's make system.**
+
+With minimal manual editing (and after invoking the [configure](ch_config.html) script in your build tree), the build system adapts to your environment, compiler options, defines all relevant `makefile` macros and targets, allows for recursive builds of the entire Toolkit and targeted builds of single modules, and handles many other details that can confound manual builds.
+
+### Chapter Outline
+
+The following is an outline of the topics presented in this chapter:
+
+- [Major Makefiles](#ch_build.major_makefiles)
+
+- [Makefile Hierarchy](#ch_build.makefiles_hierarch)
+
+- [Meta-Makefiles](#ch_build.makefiles_meta)
+
+ - [Makefile.in Meta Files](#ch_build.makefile_in)
+
+ - [Expendable Projects](#ch_build.expendable_proj)
+
+- [Project Makefiles](#ch_build.build_proj_makefiles)
+
+ - [List of Optional Packages, Features, and Projects](#ch_build.packages_opt)
+
+- [Standard Build Targets](#ch_build.std_build_targets)
+
+ - [Meta-Makefile Targets](#ch_build.build_meta_makefiles)
+
+ - [Makefile Targets](#ch_build.build_make_proj_target)
+
+- [Makefile Macros and Makefile.mk](#ch_build.build_make_macros)
+
+- [Example Makefiles](#ch_build.build_make_examples)
+
+For information on configuring and building that isn't specific to makefiles, see the chapters on [configuring](ch_config.html) and [managing projects](ch_proj.html).
+
+
+
+Major Makefiles
+---------------
+
+The C++ Toolkit build system was based on the "GNU build system", which employs a configure script and makefiles. Before describing the C++ Toolkit build system in detail, we list the major types of makefiles used by the Toolkit:
+
+- **Meta-makefiles**. These files exist for each project and tie the project into the Toolkit hierarchy; defining a project's applications and libraries in its meta-makefile is necessary for building them (possibly recursively).
+
+- **Generic makefile templates** (`Makefile*.in`). The **configure** script processes these files from the `src` hierarchy to substitute for the special tags **`"@some_name@"`** and make other specializations required for a given project (see the short before/after sketch following this list). Note that meta-makefiles are typically derived from such templates.
+
+- **Customized makefiles** (`Makefile.*.[lib|app]`). For each library or application, this file gives specific targets, compiler flags, and other project-specific build instructions. These files appear in the `src` hierarchy.
+
+- **Configured makefiles** (`Makefile` or `Makefile.*_[lib|app]`). A "configured makefile" is a makefile generated by **configure** for each project and sub-project, placed in the appropriate location in the build tree and ready for use. Note that meta-makefiles in the build tree may also be considered “configured”.
+
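+For example, a template fragment and its configured counterpart might look like the following (the concrete paths are hypothetical and depend on where your source and build trees live):
+
+    # c++/src/myProj/Makefile.in  (template in the source tree)
+    srcdir = @srcdir@
+    include @builddir@/Makefile.meta
+
+    # .../build/myProj/Makefile  (configured makefile in the build tree)
+    srcdir = /home/user/c++/src/myProj
+    include /home/user/c++/GCC-Debug/build/Makefile.meta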
+
+
+Makefile Hierarchy
+------------------
+
+All Toolkit `makefiles` reside either in the `src` directory, as templates or customized files, or in the appropriate configured form in each of your build tree hierarchies, as illustrated in [Figure 1](#ch_build.F1).
+
+
+
+[](/book/static/img/make.gif "Click to see the full-resolution image")
+
+Figure 1. Makefile hierarchy.
+
+Most of the files listed in [Figure 1](#ch_build.F1) are templates from the `src` directory, with each corresponding `configured makefile` at the top of the build tree. Of these, `/Makefile` can be considered the master `makefile` in that it can recursively build the entire Toolkit. The role of each top-level `makefile` template is summarized as follows:
+
+- `Makefile.in` - makefile to perform a recursive build in all project subdirectories.
+
+- `Makefile.meta.in` - included by all makefiles that provide both local and recursive builds.
+
+- `Makefile.mk.in` - included by all makefiles; sets a lot of configuration variables.
+
+- `Makefile.lib.in` - included by all makefiles that perform a "standard" library build, when building only static libraries.
+
+- `Makefile.dll.in` - included by all makefiles that perform a "standard" library build, when building only shared libraries.
+
+- `Makefile.both.in` - included by all makefiles that perform a "standard" library build, when building both static and shared libraries.
+
+- `Makefile.lib.tmpl.in` - serves as a template for the project `customized makefiles` (`Makefile.*.lib[.in]`) that perform a "standard" library build.
+
+- `Makefile.app.in` - included by all makefiles that perform a "standard" application build.
+
+- `Makefile.app.tmpl.in` - serves as a template for the project `customized makefiles` (`Makefile.*.app[.in]`) that perform a "standard" application build.
+
+- `Makefile.rules.in, Makefile.rules_with_autodep.in` - instructions for building object files; included by most other makefiles.
+
+The project-specific portion of the `makefile` hierarchy is represented in the figure by the `meta-makefile` template `c++/src/myProj/Makefile.in`, the customized makefile `c++/src/myProj/Makefile.myProj.[app|lib]` (not shown), and the configured makefile `c++/myBuild/build/myProj/Makefile`. In fact, every project and sub-project in the Toolkit has analogous files specialized to its project; in most circumstances, every new or user project should emulate this file structure to be compatible with the make system.
+
+
+
+Meta-Makefiles
+--------------
+
+A typical `meta-makefile` template (e.g. `Makefile.in` in your `foo/c++/src/bar_proj/` dir) looks like this:
+
+ # Supply Makefile.bar_u1, Makefile.bar_u2 ...
+ #
+ USR_PROJ = bar_u1 bar_u2 ...
+
+ # Supply Makefile.bar_l1.lib, Makefile.bar_l2.lib ...
+ #
+ LIB_PROJ = bar_l1 bar_l2 ...
+
+ # Supply Makefile.bar_a1.app, Makefile.bar_a2.app ...
+ #
+ APP_PROJ = bar_a1 bar_a2 ...
+
+ # Subprojects
+ #
+ SUB_PROJ = app sub_proj1 sub_proj2
+
+ srcdir = @srcdir@
+ include @builddir@/Makefile.meta
+
+This template separately specifies instructions for user, library and application projects, along with a set of three sub-projects that can be made. The mandatory final two lines `"srcdir = @srcdir@; include @builddir@/Makefile.meta"` define the [standard build targets](#ch_build.std_build_targets).
+
+
+
+### Makefile.in Meta Files
+
+The `Makefile.in` meta-make file in the project's source directory defines a kind of road map that will be used by the **configure** script to generate a makefile (`Makefile`) in the corresponding directory of the `build tree`. `Makefile.in` does **not** participate in the actual execution of **make**, but rather, defines what will happen at that time by directing the **configure** script in the creation of the `Makefile` that will be executed (see also the description of [standard build targets](#ch_build.std_build_targets) below).
+
+The meta-makefile `myProj/Makefile.in` should define at least one of the following macros:
+
+- **`USR_PROJ`** (optional) - a list of names for user-defined makefiles. This macro is provided for use with ordinary stand-alone makefiles that do not rely on the make commands contained in additional makefiles in the top-level `build` directory. Each `p_i` listed in `USR_PROJ = p_1 ... p_N` must have a corresponding `Makefile.p_i` in the project's source directory. When **make** is executed, the **make** directives contained in these files will be executed directly to build the targets as specified.
+
+- **`LIB_PROJ`** (optional) - a list of names for library makefiles. For each library `l_i` listed in `LIB_PROJ = l_1 ... l_N`, you must have created a corresponding project makefile named Makefile.l\_i.lib in the project's source directory. When **make** is executed, these library project makefiles will be used along with `Makefile.lib` and `Makefile.lib.tmpl` (located in the top-level of the `build tree`) to build the specified libraries.
+
+- **`ASN_PROJ`** (optional) is like **`LIB_PROJ`**, with one additional feature: Any projects listed there will be interpreted as the names of ASN.1 module specifications to be processed by [datatool](ch_app.html#ch_app.datatool).
+
+- **`APP_PROJ`** (optional) - a list of names for application makefiles. Similarly, each application (`p1, p2, ..., pN`) listed under **`APP_PROJ`** must have a corresponding project makefile named `Makefile.p*.app` in the project's source directory. When **make** is executed, these application project makefiles will be used along with `Makefile.app` and `Makefile.app.tmpl` to build the specified executables.
+
+- **`SUB_PROJ`** (optional) - a list of names for subproject directories (used on recursive makes). The **`SUB_PROJ`** macro is used to recursively define **make** targets; items listed here define the subdirectories rooted in the project's source directory where **make** should also be executed.
+
+Some additional `meta-makefile` macros (listed in [Table 1](#ch_build.T1)) exist to specify various directory paths that **make** needs to know. The "@"-delimited tokens are substituted during configuration based on your environment and any command-line options passed to **configure**.
+
+
+
+Table 1. Path Specification Makefile Macros
+
+| Macro | Source | Synopsis |
+|----------------------|------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
+| **`bindir`** | **`@bindir@`**, `--bindir` | Executables built inside the build tree |
+| **`build_root`** | **`@build_root@`** | Path to the whole build tree |
+| **`builddir`** | **`@builddir@`** | Top build directory inside the build tree |
+| **`datadir`** | `--datadir` | Read-only architecture-independent data |
+| **`incdir`** | **`@incdir@`** | Top include directory inside the build tree |
+| **`includedir`** | **`@includedir@`**, `--includedir` | Top include directory in the source tree |
+| **`infodir`** | `--infodir` | Info documentation |
+| **`libdir`** | **`@libdir@`**, `--libdir` | Libraries built inside the build tree |
+| **`libexecdir`** | `--libexecdir` | Program executables |
+| **`localstatedir`** | `--localstatedir` | Modifiable single-machine data |
+| **`mandir`** | `--mandir` | Man documentation |
+| **`oldincludedir`** | `--oldincludedir` | C header files for non-gcc |
+| **`sbindir`** | `--sbindir` | System admin executables |
+| **`sharedstatedir`** | `--sharedstatedir` | Modifiable architecture-independent data |
+| **`srcdir`** | **`@srcdir@`**, `--srcdir` | Directory in the source tree that corresponds to the directory (`./`) in the build tree where the build is currently going on |
+| **`status_dir`** | **`@status_dir@`** | Configuration status files |
+| **`sysconfdir`** | `--sysconfdir` | Read-only single-machine data (default) |
+| **`top_srcdir`** | **`@top_srcdir@`** | Path to the whole NCBI C++ package |
+
+
+
+
+
+### Expendable Projects
+
+By default, failure of any project will cause make to exit immediately. Although this behavior can save a lot of time, it is not always desirable. One way to avoid it is to run `make -k` rather than `make`, but then major problems affecting a large portion of the build will still waste a lot of time.
+
+Consequently, the toolkit's build system supports an alternative approach: [meta-makefiles](#ch_build.makefiles_meta) can define `expendable` projects which should be built if possible but are allowed to fail without interrupting the build. The way to do this is to list such projects in **`EXPENDABLE_*_PROJ`** rather than **`*_PROJ`**.
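+
+For example, a meta-makefile could mark one application as expendable while keeping another mandatory (the project names are hypothetical):
+
+    APP_PROJ            = important_app
+    EXPENDABLE_APP_PROJ = nice_to_have_app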
+
+
+
+Project Makefiles
+-----------------
+
+When beginning a new project, the [new\_project](ch_proj.html#ch_proj.new_project_Starting) shell script will generate an initial configured makefile, `Makefile._app`, that you can modify as needed. In addition, a working sample application can also be checked out to experiment with or as an alternate template.
+
+The [import\_project](ch_getcode_svn.html#ch_getcode_svn.import_project_sh) script is useful for working on existing Toolkit projects without needing to build the whole Toolkit. In this case things are particularly straightforward as the project will be retrieved complete with its `makefile` already configured as `Makefile._[app|lib]`. (Note that there is an underscore in the name, not a period as in the similarly-named `customizable makefile` from which the configured file is derived.)
+
+**If you are working outside of the source tree:** In this scenario you are only linking to the Toolkit libraries and will not need to run the **configure** script, so a `Makefile.in` template `meta-makefile` is not required. Some of the typical edits required for the `customized makefile` are shown in the section on [working in a separate directory](ch_proj.html#ch_proj.outside_tree_makefile).
+
+**If you are working within the source tree or subtree:** Project subdirectories that do not contain any `*.in` files are ignored by the **configure** script. Therefore, you will now also need to create a `meta-makefile` for the newly created project before configuring your `build` directory to include the new project.
+
+Several examples are detailed in the “[Starting New Projects](ch_proj.html#ch_proj.make_proj_lib)” section.
+
+
+
+### List of Optional Packages, Features, and Projects
+
+[Table 2](#ch_build.T2) displays the keywords you can list in **`REQUIRES`** in a customized [application](ch_proj.html#ch_proj.make_proj_app) or [library](ch_proj.html#ch_proj.make_proj_lib) makefile, along with the corresponding [configure options](ch_config.html#ch_config.ch_configprohibit_sy):
+
+
+
+Table 2. Optional Packages, Features, and Projects
+
+| Keyword | Optional ... | Configure options |
+|----------------------|----------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------|
+| | | |
+| | **... packages** | |
+| `C-Toolkit` | NCBI C Toolkit | `--without-ncbi-c` |
+| `Fast-CGI` | Fast-CGI library | `--without-fastcgi` |
+| `FLTK` | the Fast Light ToolKit | `--without-fltk,` `--with-fltk=DIR` |
+| `FreeTDS` | FreeTDS libraries | `--without-ftds,` `--with-ftds=DIR` |
+| `GEO` | NCBI GEO libraries | `--without-geo` |
+| `ORBacus` | ORBacus CORBA | `--without-orbacus,` `--with-orbacus=DIR` |
+| `PubMed` | NCBI PubMed libraries | `--without-pubmed` |
+| `SP` | SP libraries | `--without-sp` |
+| `SSSDB` | NCBI SSS DB library | `--without-sssdb,` `--without-sss` |
+| `SSSUTILS` | NCBI SSS UTILS library | `--without-sssutils,` `--without-sss` |
+| `Sybase` | Sybase libraries | `--without-sybase,` `--with-sybase-local(=DIR),` `--with-sybase-new` |
+| `wxWindows` | wxWindows | `--without-wxwin,` `--with-wxwin=DIR` |
+| | | |
+| | **... features** | |
+| `MT` | multithreading is available | `--with-mt` |
+| | | |
+| | **... projects** | |
+| `app` | standalone applications like ID1\_FETCH | `--with-app` |
+| `ctools` | projects based on the NCBI C toolkit | `--without-ctools` |
+| `gui` | projects that use the wxWindows GUI package | `--without-gui` |
+| `internal` | all internal projects | `--with-internal` |
+| `objects` | libraries to serialize ASN.1/XML objects | `--with-objects` |
+| `serial` | ASN.1/XML serialization library and datatool | `--without-serial` |
+| `local_lbsm` | IPC with locally running LBSMD | `--without-local-lbsm` |
+
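+For example, a customized application makefile that needs multithreading and the Fast-CGI package (see [Table 2](#ch_build.T2) above) could declare its requirements as follows (a minimal fragment; the project name is hypothetical and the rest of the makefile is omitted):
+
+    # Makefile.myProj.app (fragment)
+    REQUIRES = MT Fast-CGI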
+
+
+
+
+Standard Build Targets
+----------------------
+
+The following topics are discussed in this section:
+
+- [Meta-Makefile Targets](#ch_build.build_meta_makefiles)
+
+- [Makefile Targets](#ch_build.build_make_proj_target)
+
+
+
+### Meta-Makefile Targets
+
+The mandatory lines from the [meta-makefile example above](#ch_build.makefiles_meta),
+
+ srcdir = @srcdir@
+ include @builddir@/Makefile.meta
+
+provide the build rules for the following standard meta-makefile targets:
+
+- `all`:
+
+ - Run `"make -f {Makefile.*} all"` for the makefiles with the suffixes listed in macro **`USR_PROJ`**: `make -f Makefile.bar_u1 all` `make -f Makefile.bar_u2 all` `...`
+
+ - Build libraries using attributes defined in the customized makefiles `Makefile.*.lib` with the suffixes listed in macro **`LIB_PROJ`**.
+
+ - Build application(s) using attributes defined in the customized makefiles `Makefile.*.app` with the suffixes listed in macro **`APP_PROJ`**.
+
+- `all_r`:
+
+ - First make target `all`, then run `"make all_r"` in all subdirectories enlisted in **`$(SUB_PROJ)`**: `cd bar_test` `make -f Makefile all_r` `cd bar_sub_proj1` `make -f Makefile all_r` `...`
+
+- `clean`, `clean_r`:
+
+ - Run the same makefiles but with targets `clean` and `clean_r` (rather than `all` and `all_r`), respectively.
+
+- `purge`, `purge_r`:
+
+ - Run the same makefiles but with targets `purge` and `purge_r`.
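+
+Putting these together, a typical recursive build-and-clean cycle run from a project directory of the build tree might look like this (the build-tree path is hypothetical):
+
+    cd ~/c++/GCC-Debug/build/bar_proj
+    make all_r       # build this project, then recurse into $(SUB_PROJ)
+    make clean_r     # remove everything that all_r built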
+
+
+
+### Makefile Targets
+
+The standard build targets for Toolkit makefiles are `all, clean` and `purge`. Recall that recursive versions of these targets exist for meta-makefiles.
+
+- `all`:
+
+ - Compile the object modules specified in the **`"$(OBJ)"`** macro, and use them to build the library **`"$(LIB)"`** or the application **`"$(APP)"`**; then copy the resultant [`lib|app`] to the [[libdir\|bindir](#ch_build.T1)] directory, respectively.
+
+- `clean`:
+
+ - Remove all object modules and libs/apps that have been built by `all`.
+
+- `purge`:
+
+ - Do `clean`, and then remove the copy of the [`libs|apps`] from the [[libdir\|bindir](#ch_build.T1)] directory.
+
+The customized makefiles do not distinguish between recursive (`all_r, clean_r, purge_r`) and non-recursive (`all, clean, purge`) targets, because the recursion and multi-project builds are handled entirely by the [meta-makefiles](#ch_build.makefiles_meta).
+
+
+
+Makefile Macros and `Makefile.mk`
+---------------------------------
+
+There is a wide assortment of configured tools, flags, third party packages and [paths (see above)](#ch_build.T1). They can be specified for the whole build tree with the appropriate entry in `Makefile.mk`, which is silently included at the very beginning of the customized makefiles used to build [libraries](ch_proj.html#ch_proj.make_proj_lib) and [applications](ch_proj.html#ch_proj.make_proj_app).
+
+Many makefile macros are supplied with defaults **`ORIG_*`** in `Makefile.mk`. See the list of **`ORIG_*`** macros, and all others currently defined, in the [Makefile.mk.in](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/build-system/Makefile.mk.in) template for details. In normal use you should not override these defaults; instead, add your own flags to them as needed in the corresponding working macro, e.g. set `CXX = $(ORIG_CXX) -DFOO_BAR`.
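+
+A minimal sketch of this convention in a customized makefile (the macro names follow the **`ORIG_*`** pattern described above; the `-DFOO_BAR` define and the extra library path are hypothetical):
+
+    # Extend, rather than override, the configured defaults:
+    CXXFLAGS = $(ORIG_CXXFLAGS) -DFOO_BAR
+    LDFLAGS  = $(ORIG_LDFLAGS)  -L$(HOME)/mylibs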
+
+`Makefile.mk` defines makefile macros obtained during the configuration process for flags (see [Table 3](#ch_build.T3)), system and third-party packages (see [Table 4](#ch_build.T4)) and development tools (see [Table 5](#ch_build.T5)).
+
+
+
+Table 3. Flags
+
+| Macro | [Source](ch_config.html#ch_config.ref_TableToolsAndFlags) | [Synopsis](ch_config.html#ch_config.ch_configconfig_flag) |
+|---------------------|-----------------------------------------------------------|--------------------------------------------------------------------------------------------|
+| **`APP_LDFLAGS`**   | compiler test                                             | Compiler-dependent variation on **`LDFLAGS`**                                                |
+| **`CFLAGS`** | **`$CFLAGS`** | C compiler flags |
+| **`CPPFLAGS`** | **`$CPPFLAGS`** | C/C++ preprocessor flags |
+| **`CXXFLAGS`** | **`$CXXFLAGS`** | C++ compiler flags |
+| **`DEPFLAGS`** | **`$DEPFLAGS`** | Flags for file dependency lists |
+| **`DEPFLAGS_POST`** | compiler test | Related to VisualAge (retained for historical reasons) |
+| **`DLL_LDFLAGS`**   | compiler test                                             | Compiler-dependent variation on **`LDFLAGS`**                                                |
+| **`FAST_CFLAGS`** | **`$FAST_CFLAGS`** | [(\*)](#ch_build.build_make_macros) C compiler flags to generate faster code |
+| **`FAST_CXXFLAGS`** | **`$FAST_CXXFLAGS`** | [(\*)](#ch_build.build_make_macros) C++ compiler flags to generate faster code |
+| **`LDFLAGS`** | **`$LDFLAGS`** | Linker flags |
+| **`LIB_OR_DLL`** | **`@LIB_OR_DLL@`** | Specify whether to build a library as static or dynamic |
+| **`STATIC`** | **`@STATIC@`** | Library suffix to force static linkage (see [example](ch_proj.html#ch_proj.make_proj_app)) |
+
+
+
+
+
+Table 4. System and third-party packages
+
+| Macro | Source | Synopsis |
+|------------------------|-------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| **`FASTCGI_INCLUDE`** | **`$FASTCGI_INCLUDE`** | Fast-CGI headers |
+| **`FASTCGI_LIBS`** | **`$FASTCGI_LIBS`** | Fast-CGI libraries |
+| **`KSTAT_LIBS`** | **`$KSTAT_LIBS`** | KSTAT library (system) |
+| **`LIBS`** | **`$LIBS`** | Default libraries to link with |
+| **`MATH_LIBS`** | **`$MATH_LIBS`** | Math library (system) |
+| **`NETWORK_LIBS`** | **`$NETWORK_LIBS`** | Network library (system) |
+| **`NCBI_C_INCLUDE`** | **`$NCBI_C_INCLUDE`** | NCBI C toolkit headers |
+| **`NCBI_C_LIBPATH`** | **`$NCBI_C_LIBPATH`** | Path to the NCBI C Toolkit libraries |
+| **`NCBI_C_ncbi`** | **`$NCBI_C_ncbi`** | NCBI C CoreLib |
+| **`NCBI_PM_PATH`** | **`$NCBI_PM_PATH`** | Path to the PubMed package |
+| **`NCBI_SSS_INCLUDE`** | **`$NCBI_SSS_INCLUDE`** | NCBI SSS headers |
+| **`NCBI_SSS_LIBPATH`** | **`$NCBI_SSS_LIBPATH`** | Path to NCBI SSS libraries |
+| **`ORBACUS_INCLUDE`** | **`$ORBACUS_LIBPATH`** | Path to the ORBacus CORBA headers |
+| **`ORBACUS_LIBPATH`** | **`$ORBACUS_LIBPATH`** | Path to the ORBacus CORBA libraries |
+| **`PRE_LIBS`**         | **`$PRE_LIBS`**         | Use **`PRE_LIBS`** to place specific libraries or library directories earlier in the link command line than the standard libraries or directories (i.e. to precede **`$LIBS`**). For example, if you wanted to link with your custom library `mylib/libmylib.a` and also use a locally modified version of an NCBI library saved in a directory called `ncbilibs`, you could use a **`PRE_LIBS`** macro similar to: `PRE_LIBS = -Lmylib -lmylib -Lncbilibs` |
+| **`RPCSVC_LIBS`** | **`$RPCSVC_LIBS`** | RPCSVC library (system) |
+| **`SYBASE_INCLUDE`** | **`$SYBASE_INCLUDE`** | SYBASE headers |
+| **`SYBASE_LIBS`** | **`$SYBASE_LIBS`** | SYBASE libraries |
+| **`THREAD_LIBS`** | **`$THREAD_LIBS`** | Thread library (system) |
+
+
+
+***Note:*** The values of the user-specified environment variables **`$FAST_CFLAGS`** and **`$FAST_CXXFLAGS`** will replace the regular optimization flag `-O` (or `-O2`, etc.). For example, if in the environment: **`$FAST_CXXFLAGS`**=`-fast -speedy` and **`$CXXFLAGS`**=`-warn -O3 -std`, then in the makefile: **`$(FAST_CXXFLAGS)`**=`-warn -fast -speedy -std`.
+
+
+
+Example Makefiles
+-----------------
+
+Below are links to examples of typical `makefiles`, complete with descriptions of their content.
+
+- Inside the Tree
+
+ - [An example meta-makefile and its associated project makefiles](ch_proj.html#ch_proj.inside_example)
+
+ - [Library project makefile: Makefile.myProj.lib](ch_proj.html#ch_proj.inside_lib_make)
+
+ - [Application project makefile: Makefile.myProj.app](ch_proj.html#ch_proj.inside_app_make)
+
+ - [Custom project makefile: Makefile.myProj](ch_proj.html#ch_proj.inside_cust_make)
+
+- New Projects and Outside the Tree
+
+ - [Use Shell Scripts to Create Makefiles](ch_proj.html#ch_proj.new_project_Starting)
+
+ - [Customized makefile to build a library](ch_proj.html#ch_proj.make_proj_lib)
+
+ - [Customized makefile to build an application](ch_proj.html#ch_proj.make_proj_app)
+
+ - [User-defined makefile to build... whatever](ch_proj.html#ch_proj.usr_def_makefile)
+
+
diff --git a/pages/ch_cgi.md b/pages/ch_cgi.md
new file mode 100644
index 00000000..09e05504
--- /dev/null
+++ b/pages/ch_cgi.md
@@ -0,0 +1,1478 @@
+---
+layout: default
+title: C++ Toolkit test
+nav: pages/ch_cgi
+---
+
+
+11\. CGI and Fast-CGI
+===================================
+
+Created: January 1, 2005; Last Update: February 2, 2015.
+
+Overview
+--------
+
+The overview for this chapter consists of the following topics:
+
+- Introduction
+
+- Chapter Outline
+
+### Introduction
+
+**CGI and Fast-CGI** [Libraries `xcgi` and `xfcgi`: [include](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/cgi) \| [src](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi)]
+
+These library classes represent an [integrated framework](#ch_cgi.cgi_class_overview) with which to write CGI applications and are designed to help retrieve and parse an HTTP request and then to compose and deliver an HTTP response. (See also this additional [class reference documentation](#ch_cgi.)). `xfcgi` is a FastCGI version of `xcgi`.
+
+***Hint:*** Requires the target executable to be linked with a third-party FastCGI library, as in:
+
+[LIBS](ch_proj.html#ch_proj.make_proj_app)` = $(FASTCGI_LIBS) $(ORIG_LIBS)`.
+
+***Hint:*** On non-FastCGI capable platforms (or if run as a plain CGI on a FastCGI-capable platform), it works the same as a plain CGI.
+
+CGI Interface
+
+- [Basic CGI Application Class](#ch_cgi.cgi_app_class) (includes [CGI Diagnostic Handling](#ch_cgi.cgi_diag.html)) cgiapp[[.hpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/cgi/cgiapp.hpp) \| [.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/cgiapp.cpp)]
+
+- [CGI Application Context Classes](#ch_cgi.cgi_app_context) cgictx[[.hpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/cgi/cgictx.hpp) \| [.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/cgictx.cpp)]
+
+- [HTTP Request Parser](#ch_cgi.cgi_http_req) ncbicgi[[.hpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/cgi/ncbicgi.hpp) \| [.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/ncbicgi.cpp)]
+
+- [HTTP Cookies](#ch_cgi.cgi_http_cookies) ncbicgi[[.hpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/cgi/ncbicgi.hpp) \| [.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/ncbicgi.cpp)]
+
+- [HTTP Response Generator](#ch_cgi.cgi_http_resp) ncbicgir[[.hpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/cgi/ncbicgir.hpp) \| [.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/ncbicgir.cpp)]
+
+- [Basic CGI Resource Class](#ch_cgi.cgi_res_class) ncbires[[.hpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/include/cgi/ncbires.hpp) \| [.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/ncbires.cpp)]
+
+***FastCGI*** CGI Interface
+
+- Adapter Between C++ and FastCGI Streams fcgibuf[[.hpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/fcgibuf.hpp) \| [.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/fcgibuf.cpp)]
+
+- Fast-CGI Loop Function fcgi\_run[[.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/fcgi_run.cpp)]
+
+- Plain CGI Stub for the Fast-CGI Loop Function cgi\_run[[.cpp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/cgi_run.cpp)]
+
+**Demo Cases** [[src/cgi/demo](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/demo) \| [C++/src/sample/app/cgi/](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/sample/app/cgi/)]
+
+**Test Cases** [[src/cgi/test](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/cgi/test)]
+
+
+### Chapter Outline
+
+The following is an outline of the topics presented in this chapter:
+
+[Developing CGI applications](#ch_cgi.cg_develop_apps)
+
+- [Overview of the CGI classes](#ch_cgi.cgi_class_overview)
+
+- [The CCgiApplication class](#ch_cgi.cgi_app_class)
+
+- [The CNcbiResource and CNcbiCommand classes](#ch_cgi.cgi_res_class)
+
+- [The CCgiRequest class](#ch_cgi.cgi_http_req)
+
+- [The CCgiResponse class](#ch_cgi.cgi_http_resp)
+
+- [The CCgiCookie class](#ch_cgi.cgi_http_cookies)
+
+- [The CCgiCookies class](#ch_cgi.cgi_cookies_class)
+
+- [The CCgiContext class](#ch_cgi.cgi_app_context)
+
+- [The CCgiUserAgent class](#ch_cgi.The_CCgiUserAgent_cl)
+
+- [Example code using the CGI classes](#ch_cgi.cgi_examples)
+
+- [CGI Registry configuration](#ch_cgi.cgi_reg_config)
+
+- [Supplementary Information](#ch_cgi.appendix)
+
+[CGI Diagnostic Handling](#ch_cgi.cgi_diag.html)
+
+- [diag-destination](#ch_cgi.cgi_diag.html_ref_destination)
+
+- [diag-threshold](#ch_cgi.cgi_diag.html_ref_threshold)
+
+- [diag-format](#ch_cgi.cgi_diag.html_ref_format)
+
+[NCBI C++ CGI Classes](#ch_cgi.)
+
+- [CCgiRequest](#ch_cgi.prog_man_cgi_1_14)
+
+- [CCgiResponse](#ch_cgi.prog_man_cgi_1_15)
+
+- [CCgiCookie](#ch_cgi.prog_man_cgi_1_16)
+
+- [CCgiCookies](#ch_cgi.prog_man_cgi_1_17)
+
+[An example web-based CGI application](#ch_cgi.html)
+
+- [Introduction](#ch_cgi.intro)
+
+- [Program description](#ch_cgi.descrip)
+
+- [Program design: Distributing the work](#ch_cgi.design)
+
+[CGI Status Codes](#ch_cgi.cgi_response_codes)
+
+[FCGI Redirection and Debugging C++ Toolkit CGI Programs](#ch_cgi.FCGI_Redirection_and_Debugging_C)
+
+
+
+Developing CGI applications
+---------------------------
+
+- [Overview of the CGI classes](#ch_cgi.cgi_class_overview)
+
+- [The CCgiApplication class](#ch_cgi.cgi_app_class)
+
+- [The CNcbiResource and CNcbiCommand classes](#ch_cgi.cgi_res_class)
+
+- [The CCgiRequest class](#ch_cgi.cgi_http_req)
+
+- [The CCgiResponse class](#ch_cgi.cgi_http_resp)
+
+- [The CCgiCookie class](#ch_cgi.cgi_http_cookies)
+
+- [The CCgiCookies class](#ch_cgi.cgi_cookies_class)
+
+- [The CCgiContext class](#ch_cgi.cgi_app_context)
+
+- [The CCgiUserAgent class](#ch_cgi.The_CCgiUserAgent_cl)
+
+- [Example code using the CGI classes](#ch_cgi.cgi_examples)
+
+- [CGI Registry configuration](#ch_cgi.cgi_reg_config)
+
+- [Supplementary Information](#ch_cgi.appendix)
+
+Although CGI programs are generally run as web applications with HTML interfaces, this section of the Programming Manual places emphasis on the CGI side of things, omitting HTML details of the implementation where possible. Similarly, the section on [Generating web pages](ch_html.html#ch_html.webpgs.html) focuses largely on the usage of HTML components independent of CGI details. The two branches of the NCBI C++ Toolkit hierarchy are all but independent of one another - with but one explicit hook between them: the constructors for HTML [page](ch_html.html#ch_html.page_classes) components accept a ***CCgiApplication*** as an optional argument. This ***CCgiApplication*** argument provides the HTML page component with access to all of the CGI objects used in the application.
+
+Further discussion of combining a CGI application with the HTML classes can be found in the section on [An example web-based CGI application](#ch_cgi.html). The focus in this chapter is on the CGI classes only. For additional information about the CGI classes, the reader is also referred to the discussion of [NCBI C++ CGI Classes](#ch_cgi.) in the Reference Manual.
+
+
+
+### Overview of the CGI Classes
+
+[Figure 1](#ch_cgi.F1) illustrates the layered design of the CGI classes.
+
+
+
+[](/book/static/img/cgi.gif "Click to see the full-resolution image")
+
+Figure 1. Layered design of the CGI classes
+
+This design is best described by starting with a consideration of the capabilities one might need to implement a CGI program, including:
+
+- A way to retrieve and store the current values of environment variables
+
+- A means of retrieving and interpreting the client's query request string
+
+- Mechanisms to service and respond to the requested query
+
+- Methods and data structures to obtain, store, modify, and send cookies
+
+- A way to set/reset the context of the application (for Fast-CGI)
+
+The ***CCgiContext*** class unifies these diverse capabilities under one aggregate structure. As their names suggest, the ***CCgiRequest*** class receives and parses the request, and the ***CCgiResponse*** class outputs the response on an output stream. All incoming ***CCgiCookie***s are also parsed and stored by the ***CCgiRequest*** object, and the outgoing cookies are sent along with the response by the ***CCgiResponse*** object. The request is actually processed by the application's ***CNcbiResource***. The list of ***CNcbiCommand***s stored with that resource object are scanned to find a matching command, which is then executed.
+
+The ***CCgiContext*** object, which is a `friend` to the ***CCgiApplication*** class, orchestrates this sequence of events in coordination with the application object. The same application may be run in many different contexts, but the `resource` and defined set of `commands` are invariant. What changes with each context is the request and its associated response.
+
+The ***CCgiApplication*** class is a specialization of ***CNcbiApplication***. [Figure 2](#ch_cgi.F2) illustrates the adaptation of the ***Init()*** and ***Run()*** member functions inherited from the ***CNcbiApplication*** class to the requirements of CGI programming. Although the application is `contained` in the context, it is the application which creates and initializes each context in which it participates. The program arguments and environmental variables are passed along to the context, where they will be stored, thus freeing the application to be restarted in a new context, as in Fast-CGI.
+
+
+
+[](/book/static/img/cgirun.gif "Click to see the full-resolution image")
+
+Figure 2. Adapting the init() and run() methods inherited from CNcbiApplication
+
+The application's ***ProcessRequest*** member function is an abstract function that must be implemented for each application project. In most cases, this function will access the query and the environment variables via the ***CCgiContext***, using ***ctx.GetRequest()*** and ***ctx.GetConfig()***. The application may then service the request using its resource's ***HandleRequest()*** method. The context's response object can then be used to send an appropriate response.
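+
+A minimal sketch of such a ***ProcessRequest()*** implementation follows (the application class name and the output text are hypothetical; only the accessors described in this chapter are used):
+
+    int CMySampleCgiApp::ProcessRequest(CCgiContext& ctx)
+    {
+        const CCgiRequest& request  = ctx.GetRequest();
+        CCgiResponse&      response = ctx.GetResponse();
+
+        // Echo the request method (a property cached from the environment).
+        const string& method = request.GetProperty(eCgi_RequestMethod);
+
+        response.WriteHeader();
+        response.out() << "<html><body>Request method: " << method
+                       << "</body></html>";
+        return 0;
+    }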
+
+These classes are described in more detail below, along with abbreviated synopses of the class definitions. These are included here to provide a conceptual framework and are not intended as reference materials. For example, constructor and destructor declarations that operate on void arguments, and `const` methods that duplicate non-const declarations are generally not included here. Certain virtual functions and data members that have no meaning outside of a web application are also omitted. For complete definitions, refer to the header files via the source browsers.
+
+
+
+### The CCgiApplication Class ([\*](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCCgiApplication.html))
+
+As mentioned, the ***CCgiApplication*** class implements its own version of [Init()](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCCgiApplication.html#a0a910deea4387498e472b209967569f0), where it instantiates a [CNcbiResource](#ch_cgi.cgi_res_class) object using ***LoadResource()***. [Run()](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCCgiApplication.html#a9c4be90774829c6a66320a2391e7fcbb) is no longer a pure virtual function in this subclass, and its implementation now calls ***CreateContext(), ProcessRequest()***, and ***CCgiContext::GetResponse()***. The ***CCgiApplication*** class does **not** have a ***CCgiContext*** data member, because the application object can participate in multiple ***CCgiContext***s. Instead, a local variable in each ***Run()*** invocation stores a pointer to the context created there. The ***LoadServerContext()*** member function is used in Web applications, where it is necessary to store more complex run-time data with the context object. The ***CCgiServerContext*** object returned by this function is stored as a data member of a ***CCgiContext*** and is application specific.
+
+ class CCgiApplication : public CNcbiApplication
+ {
+ friend class CCgiContext;
+
+ public:
+ void Init(void);
+ void Exit(void);
+ int Run(void);
+
+ CNcbiResource& GetResource(void);
+ virtual int ProcessRequest(CCgiContext&) = 0;
+ CNcbiResource* LoadResource(void);
+ virtual CCgiServerContext* LoadServerContext(CCgiContext& context);
+
+ bool IsFastCGI(void) const;
+
+ protected:
+ CCgiContext* CreateContext(CNcbiArguments*, CNcbiEnvironment*,
+ CNcbiIstream*, CNcbiOstream*);
+
+    private:
+        auto_ptr<CNcbiResource> m_resource;
+ };
+
+If the program was **not** compiled as a FastCGI application (or the environment does not support FastCGI), then [IsFastCGI()](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=IsFastCGI) will return `false`. Otherwise, a "FastCGI loop" will be iterated over **`def_iter`** times, with the initialization methods and ***ProcessRequest()*** function being executed on each iteration. The value returned by ***IsFastCGI()*** in this case is `true`. ***Run()*** first calls ***IsFastCGI()***, and if that returns `false`, the application is run as a plain CGI program.
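+
+For reference, a program's entry point typically just hands control to ***AppMain()*** (inherited from ***CNcbiApplication***), which in turn drives ***Init()*** and ***Run()*** as described above (the application class name is hypothetical):
+
+    int main(int argc, const char* argv[])
+    {
+        return CMySampleCgiApp().AppMain(argc, argv);
+    }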
+
+
+
+### The CNcbiResource ([\*](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNcbiResource.html)) and CNcbiCommand ([\*](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNcbiCommand.html)) Classes
+
+The resource class is at the heart of the application, and it is here that the program's functionality is defined. The single argument to the resource class's constructor is a [CNcbiRegistry](ch_core.html#ch_core.CNcbiRegistry) object, which defines data paths, resources, and possibly environmental variables for the application. This information is stored in the resource class's data member, **`m_config`**. The only other data member is a [TCmdList](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=TCmdList) (a list of ***CNcbiCommand***s) called **`m_cmd`**.
+
+ class CNcbiResource
+ {
+ public:
+
+ CNcbiResource(CNcbiRegistry& config);
+
+ CNcbiRegistry& GetConfig(void);
+ const TCmdList& GetCmdList(void) const;
+ virtual CNcbiCommand* GetDefaultCommand(void) const = 0;
+ virtual const CNcbiResPresentation* GetPresentation(void) const;
+
+ void AddCommand(CNcbiCommand* command);
+ virtual void HandleRequest(CCgiContext& ctx);
+
+ protected:
+
+ CNcbiRegistry& m_config;
+ TCmdList m_cmd;
+ };
+
+The ***AddCommand()*** method is used when a resource is being initialized, to add commands to the command list. Given a ***CCgiRequest*** object defined in a particular context **`ctx`**, [HandleRequest(ctx)](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=HandleRequest) compares entries in the context's request to commands in **`m_cmd`**. The first command in **`m_cmd`** that matches an entry in the request is then executed (see below), and the request is considered "handled". If desired, a default command can be installed that will execute when no matching command is found. The default command is defined by implementing the pure virtual function ***GetDefaultCommand()***. The [CNcbiResPresentation](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCNcbiResPresentation.html) class is an abstract base class, and the member function, ***GetPresentation()***, returns 0. It is provided as a hook for implementing interfaces between information resources (e.g., databases) and CGI applications.
+
+ class CNcbiCommand
+ {
+ public:
+ CNcbiCommand(CNcbiResource& resource);
+
+ virtual CNcbiCommand* Clone(void) const = 0;
+ virtual string GetName() const = 0;
+ virtual void Execute(CCgiContext& ctx) = 0;
+ virtual bool IsRequested(const CCgiContext& ctx) const;
+
+ protected:
+ virtual string GetEntry() const = 0;
+ CNcbiResource& GetResource() const { return m_resource; }
+
+ private:
+ CNcbiResource& m_resource;
+ };
+
+***CNcbiCommand*** is an abstract base class; its only data member is a reference to the resource it belongs to, and most of its methods - with the exception of ***GetResource()*** and ***IsRequested()*** - are pure virtual functions. ***IsRequested()*** examines the `key=value` entries stored with the context's request object. When an entry is found where `key==GetEntry()` and `value==GetName()`, ***IsRequested()*** returns `true`.
+
+The resource's ***HandleRequest()*** method iterates over its command list, calling ***CNcbiCommand::IsRequested()*** until the first match between a command and a request entry is found. When ***IsRequested()*** returns `true`, the command is `cloned`, and the cloned command is then `executed`. Both the ***Execute()*** and ***Clone()*** methods are pure virtual functions that must be implemented by the user.
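+
+For example, a concrete command that responds to a query such as `cmd=search` might look like the following (a minimal sketch; the class name, entry name, and output are hypothetical):
+
+    class CSearchCommand : public CNcbiCommand
+    {
+    public:
+        CSearchCommand(CNcbiResource& resource) : CNcbiCommand(resource) {}
+
+        virtual CNcbiCommand* Clone(void) const
+            { return new CSearchCommand(GetResource()); }
+        virtual string GetName(void) const { return "search"; }
+
+        virtual void Execute(CCgiContext& ctx)
+        {
+            ctx.GetResponse().WriteHeader();
+            ctx.GetResponse().out() << "<html><body>searching...</body></html>";
+        }
+
+    protected:
+        // IsRequested() matches request entries where key == GetEntry()
+        // and value == GetName(), i.e. "cmd=search".
+        virtual string GetEntry(void) const { return "cmd"; }
+    };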
+
+
+
+### The CCgiRequest Class ([\*](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCCgiRequest.html))
+
+The ***CCgiRequest*** class serves as an interface between the user's query and the CGI program. Arguments to the constructor include a ***CNcbiArguments*** object, a ***CNcbiEnvironment*** object, and a ***CNcbiIstream*** object. The class constructors do little other than invoke ***CCgiRequest::x\_Init()***, where the actual initialization takes place.
+
+***x\_Init()*** begins by examining the environment argument, and if it is `NULL`, **`m_OwnEnv`** (an auto\_ptr) is reset to a dummy environment. Otherwise, **`m_OwnEnv`** is reset to the passed environment, making the request object the effective owner of that environment. The environment is then used to cache network information as "gettable" properties. Cached properties include:
+
+- server properties, such as the server name, gateway interface, and server port
+
+- client properties (the remote host and remote address)
+
+- client data properties (content type and content length of the request)
+
+- request properties, including the request method, query string, and path information
+
+- authentication information, such as the remote user and remote identity
+
+- standard HTTP properties (from the HTTP header)
+
+These properties are keyed to an enumeration named [ECgiProp](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/group__CGIReqRes.html#a8) and can be retrieved using the request object's ***GetProperty()*** member function. For example, `GetProperty(eCgi_HttpCookie)` is used to access cookies from the HTTP Header, and `GetProperty(eCgi_RequestMethod)` is used to determine from where the query string should be read.
+
+***NOTE:*** Setting **`$QUERY_STRING`** without also setting **`$REQUEST_METHOD`** will result in a failure by ***x\_Init()*** to read the input query. ***x\_Init()*** first looks for the definition of **`$REQUEST_METHOD`**, and depending on whether it is ***GET*** or ***POST***, reads the query from the environment or the input stream, respectively. If the environment does not define **`$REQUEST_METHOD`**, then ***x\_Init()*** will try to read the query string from the command line only.
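+
+For example, when testing a CGI executable from a shell you can emulate a ***GET*** request by defining both variables (the binary name and the query are hypothetical):
+
+    env REQUEST_METHOD=GET QUERY_STRING='cmd=search&term=foo' ./mycgi.cgi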
+
+ class CCgiRequest {
+ public:
+ CCgiRequest(const CNcbiArguments*, const CNcbiEnvironment*,
+ CNcbiIstream*, TFlags);
+
+ static const string& GetPropertyName(ECgiProp prop);
+ const string& GetProperty(ECgiProp prop) const;
+ size_t GetContentLength(void) const;
+ const CCgiCookies& GetCookies(void) const;
+ const TCgiEntries& GetEntries(void) const;
+ static SIZE_TYPE ParseEntries(const string& str, TCgiEntries& entries);
+ private:
+ void x_Init(const CNcbiArguments*, const CNcbiEnvironment*,
+ CNcbiIstream*, TFlags);
+
+ const CNcbiEnvironment* m_Env;
+        auto_ptr<const CNcbiEnvironment> m_OwnEnv;
+ TCgiEntries m_Entries;
+ CCgiCookies m_Cookies;
+ };
+
+This abbreviated definition of the ***CCgiRequest*** class highlights its primary functions:
+
+- To parse and store the `key=value` pairs contained in the query string (stored in **`m_Entries`**).
+
+- To parse and store the cookies contained in the HTTP header (stored in **`m_Cookies`**).
+
+As implied by the "T" prefix, [TCgiEntries](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/ident?i=TCgiEntries) is a type definition, and defines **`m_Entries`** to be an STL multimap of `key=value` pairs. The ***CCgiCookies*** class (described [below](#ch_cgi.cgi_cookies_class)) contains an STL set of [CCgiCookie](#ch_cgi.cgi_http_cookies) and implements an interface to this set.
+
+
+
+### The CCgiResponse Class ([\*](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/doxyhtml/classCCgiResponse.html))
+
+The ***CCgiResponse*** class provides an interface to the program's output stream (usually **`cout`**), which is the sole argument to the constructor for ***CCgiResponse***. The output stream can be accessed by the program using ***CCgiResponse::GetOutput()***, which returns a pointer to the output stream, or, by using ***CCgiResponse::out()***, which returns a reference to that stream.
+
+In addition to implementing controlled access to the output stream, the primary function of the response class is to generate appropriate HTTP headers that will precede the rest of the response. For example, a typical sequence in the implementation of a particular command's execute function might be:
+
+ MyCommand::Execute(CCgiContext& ctx)
+ {
+ // ... generate the output and store it in MyOutput
+
+ ctx.GetResponse().WriteHeader();
+ ctx.GetResponse().out() << MyOutput;
+        ctx.GetResponse().out() << "