[med-svn] [solvate] 05/05: New upstream version 2.0.0b11.r4076

Thu Dec 28 20:17:35 UTC 2017

This is an automated email from the git hooks/post-receive script.

tille pushed a commit to annotated tag upstream/2.0.0b11.r4076
in repository solvate.

commit 18c166c00e3a723ea0d59fb1d6e7089ad13577a0
Author: Andreas Tille <tille at debian.org>
Date:   Thu Dec 28 21:13:05 2017 +0100

    New upstream version 2.0.0b11.r4076
---
 Acknowledgements                                |   53 +
 Code_Overview                                   |  135 +
 Makefile                                        |  104 +
 README                                          |   56 +
 ReleaseNotes                                    |   92 +
 debian/changelog                                |   22 -
 debian/compat                                   |    1 -
 debian/control                                  |   17 -
 debian/copyright                                |   19 -
 debian/doc-base                                 |    0
 debian/get-orig-source                          |   32 -
 debian/patches/fix_installing_i_directories     |   24 -
 debian/patches/fix_warning_in_make_pdf          |   11 -
 debian/patches/series                           |    2 -
 debian/rules                                    |   27 -
 debian/source/format                            |    1 -
 debian/watch                                    |    2 -
 emboss.txt                                      |   18 +
 gkb547_gml.pdf                                  |  Bin 0 -> 299716 bytes
 i/nav_brief.gif                                 |  Bin 0 -> 508 bytes
 i/nav_down.gif                                  |  Bin 0 -> 578 bytes
 i/nav_first.gif                                 |  Bin 0 -> 593 bytes
 i/nav_full.gif                                  |  Bin 0 -> 503 bytes
 i/nav_home.gif                                  |  Bin 0 -> 644 bytes
 i/nav_last.gif                                  |  Bin 0 -> 590 bytes
 i/nav_next.gif                                  |  Bin 0 -> 564 bytes
 i/nav_prev.gif                                  |  Bin 0 -> 581 bytes
 i/nav_top.gif                                   |  Bin 0 -> 646 bytes
 i/nav_up.gif                                    |  Bin 0 -> 587 bytes
 index.html                                      |   48 +
 manual/2nd_highest_confidence.png               |  Bin 0 -> 5985 bytes
 manual/Makefile                                 |  233 ++
 manual/NBase_clip.png                           |  Bin 0 -> 3378 bytes
 manual/README                                   |  192 +
 manual/assembly-t.texi                          |  656 ++++
 manual/assembly.CAP3.png                        |  Bin 0 -> 6361 bytes
 manual/assembly.cap2.png                        |  Bin 0 -> 2598 bytes
 manual/assembly.directed.png                    |  Bin 0 -> 4638 bytes
 manual/assembly.fak2.png                        |  Bin 0 -> 4294 bytes
 manual/assembly.new.png                         |  Bin 0 -> 2215 bytes
 manual/assembly.one.png                         |  Bin 0 -> 2081 bytes
 manual/assembly.screen.png                      |  Bin 0 -> 5513 bytes
 manual/assembly.shot.png                        |  Bin 0 -> 5411 bytes
 manual/assembly.single.png                      |  Bin 0 -> 4318 bytes
 manual/break_contig.png                         |  Bin 0 -> 1539 bytes
 manual/c_order_lb.png                           |  Bin 0 -> 2647 bytes
 manual/c_order_t1.png                           |  Bin 0 -> 11379 bytes
 manual/c_order_t1.small.png                     |  Bin 0 -> 7843 bytes
 manual/c_order_t2.png                           |  Bin 0 -> 10980 bytes
 manual/c_order_t2.small.png                     |  Bin 0 -> 7925 bytes
 manual/calc_consensus-t.texi                    |  858 ++++
 manual/calc_consensus.extended.png              |  Bin 0 -> 5238 bytes
 manual/calc_consensus.normal.png                |  Bin 0 -> 6852 bytes
 manual/calc_consensus.quality.png               |  Bin 0 -> 3321 bytes
 manual/calc_consensus.unfinished.png            |  Bin 0 -> 3719 bytes
 manual/cap2-t.texi                              |  380 ++
 manual/cap3-t.texi                              |  501 +++
 manual/check_ass.png                            |  Bin 0 -> 4308 bytes
 manual/check_db-t.texi                          |  178 +
 manual/clip-t.texi                              |  141 +
 manual/comparator-t.texi                        |  185 +
 manual/comparator.png                           |  Bin 0 -> 5969 bytes
 manual/comparator.small.png                     |  Bin 0 -> 3926 bytes
 manual/complement-t.texi                        |  123 +
 manual/conf_values_p.png                        |  Bin 0 -> 4552 bytes
 manual/conf_values_p.small.png                  |  Bin 0 -> 3399 bytes
 manual/configure-t.texi                         |  583 +++
 manual/configure.colour.png                     |  Bin 0 -> 2031 bytes
 manual/consistency_display-t.texi               |  176 +
 manual/consistency_p.png                        |  Bin 0 -> 11465 bytes
 manual/consistency_p.small.png                  |  Bin 0 -> 7575 bytes
 manual/contig_editor-t.texi                     | 2970 ++++++++++++++
 manual/contig_editor.join.png                   |  Bin 0 -> 10644 bytes
 manual/contig_editor.join.small.png             |  Bin 0 -> 4313 bytes
 manual/contig_editor.screen.png                 |  Bin 0 -> 17811 bytes
 manual/contig_editor.screen.small.png           |  Bin 0 -> 12284 bytes
 manual/contig_editor.search.png                 |  Bin 0 -> 3337 bytes
 manual/contig_editor.taged.png                  |  Bin 0 -> 3403 bytes
 manual/contig_editor.tagmacro.png               |  Bin 0 -> 3433 bytes
 manual/contig_editor.tagsel.png                 |  Bin 0 -> 21141 bytes
 manual/contig_editor.traces.compact.png         |  Bin 0 -> 64934 bytes
 manual/contig_editor.traces.compact.small.png   |  Bin 0 -> 39878 bytes
 manual/contig_editor.traces.png                 |  Bin 0 -> 58183 bytes
 manual/contig_editor.traces.small.png           |  Bin 0 -> 39286 bytes
 manual/contig_editor_grey_scale.png             |  Bin 0 -> 14136 bytes
 manual/contig_editor_grey_scale.small.png       |  Bin 0 -> 9919 bytes
 manual/contig_editor_sets.png                   |  Bin 0 -> 45621 bytes
 manual/contig_editor_sets.small.png             |  Bin 0 -> 31249 bytes
 manual/contig_list_box.png                      |  Bin 0 -> 7054 bytes
 manual/contig_navigation-t.texi                 |   51 +
 manual/contig_navigation_browse.png             |  Bin 0 -> 10240 bytes
 manual/contig_navigation_table.png              |  Bin 0 -> 43195 bytes
 manual/contig_ordering-t.texi                   |  139 +
 manual/contig_selector-t.texi                   |  173 +
 manual/contig_selector.png                      |  Bin 0 -> 3168 bytes
 manual/convert-t.texi                           |   81 +
 manual/convert_trace.1.texi                     |  140 +
 manual/copy_db.1.texi                           |   68 +
 manual/copy_reads-t.texi                        |  139 +
 manual/copy_reads.1.texi                        |  153 +
 manual/copy_reads.dialogue.png                  |  Bin 0 -> 8281 bytes
 manual/copyright.texi                           |   63 +
 manual/dependencies                             | 1024 +++++
 manual/difference_clip.png                      |  Bin 0 -> 3001 bytes
 manual/disassembly-t.texi                       |  247 ++
 manual/disassembly.png                          |  Bin 0 -> 4052 bytes
 manual/discrepancy_graph.png                    |  Bin 0 -> 5488 bytes
 manual/doctor_db-t.texi                         |  324 ++
 manual/doctor_db.main.png                       |  Bin 0 -> 941 bytes
 manual/doctor_db.structures.png                 |  Bin 0 -> 3303 bytes
 manual/eba.1.texi                               |   54 +
 manual/exp-t.texi                               |  902 +++++
 manual/exp_suggest-t.texi                       |  456 +++
 manual/exp_suggest.comp.png                     |  Bin 0 -> 3048 bytes
 manual/exp_suggest.double.png                   |  Bin 0 -> 3240 bytes
 manual/exp_suggest.long.png                     |  Bin 0 -> 3079 bytes
 manual/exp_suggest.primers.png                  |  Bin 0 -> 4781 bytes
 manual/extract-t.texi                           |   35 +
 manual/extract.png                              |  Bin 0 -> 3119 bytes
 manual/extract_fastq.1.texi                     |   44 +
 manual/extract_seq.1.texi                       |   62 +
 manual/fak2-t.texi                              |  308 ++
 manual/fij-t.texi                               |  186 +
 manual/fij.dialogue.png                         |  Bin 0 -> 8756 bytes
 manual/filebrowser-t.texi                       |  106 +
 manual/filebrowser.png                          |  Bin 0 -> 3536 bytes
 manual/filebrowser.texi                         |   55 +
 manual/find_oligo-t.texi                        |   70 +
 manual/find_oligo_pic.png                       |  Bin 0 -> 3893 bytes
 manual/find_renz.1.texi                         |   38 +
 manual/formats-t.texi                           |   96 +
 manual/formats.texi                             |   41 +
 manual/gap4-t.texi                              |  294 ++
 manual/gap4.texi                                |   44 +
 manual/gap4_intro-t.texi                        |  300 ++
 manual/gap4_mini-t.texi                         |  795 ++++
 manual/gap4_org-t.texi                          |   70 +
 manual/gap5-t.texi                              |  252 ++
 manual/gap5.texi                                |   44 +
 manual/gap5_assembly-t.texi                     |  249 ++
 manual/gap5_break_contig.png                    |  Bin 0 -> 6803 bytes
 manual/gap5_check_ass.png                       |  Bin 0 -> 9814 bytes
 manual/gap5_check_database.png                  |  Bin 0 -> 9030 bytes
 manual/gap5_check_db-t.texi                     |  159 +
 manual/gap5_comparator.png                      |  Bin 0 -> 9599 bytes
 manual/gap5_contig_editor-t.texi                | 1512 +++++++
 manual/gap5_contig_editor.454trace.png          |  Bin 0 -> 9760 bytes
 manual/gap5_contig_editor.join.png              |  Bin 0 -> 21392 bytes
 manual/gap5_contig_editor.names1.png            |  Bin 0 -> 2427 bytes
 manual/gap5_contig_editor.names2.png            |  Bin 0 -> 4859 bytes
 manual/gap5_contig_editor.primer_dialogue.png   |  Bin 0 -> 8035 bytes
 manual/gap5_contig_editor.primers.png           |  Bin 0 -> 14867 bytes
 manual/gap5_contig_editor.screen.png            |  Bin 0 -> 22023 bytes
 manual/gap5_contig_editor.search.png            |  Bin 0 -> 8868 bytes
 manual/gap5_contig_editor.traces.png            |  Bin 0 -> 15878 bytes
 manual/gap5_contig_selector.png                 |  Bin 0 -> 5748 bytes
 manual/gap5_delete_contigs.png                  |  Bin 0 -> 8283 bytes
 manual/gap5_disassembly-t.texi                  |  223 ++
 manual/gap5_disassembly.png                     |  Bin 0 -> 8401 bytes
 manual/gap5_export-t.texi                       |   85 +
 manual/gap5_export_sequences.png                |  Bin 0 -> 11890 bytes
 manual/gap5_export_tags.png                     |  Bin 0 -> 7782 bytes
 manual/gap5_fij-t.texi                          |  187 +
 manual/gap5_fij.dialogue.png                    |  Bin 0 -> 19384 bytes
 manual/gap5_find_read_pairs.png                 |  Bin 0 -> 6334 bytes
 manual/gap5_list_libraries.png                  |  Bin 0 -> 14904 bytes
 manual/gap5_org-t.texi                          |   70 +
 manual/gap5_read_pairs-t.texi                   |   59 +
 manual/gap5_remove_contig_holes.png             |  Bin 0 -> 9252 bytes
 manual/gap5_remove_pad_columns.png              |  Bin 0 -> 8853 bytes
 manual/gap5_repeats-t.texi                      |   75 +
 manual/gap5_rp_comparator.png                   |  Bin 0 -> 8914 bytes
 manual/gap5_shuffle-t.texi                      |   91 +
 manual/gap5_shuffle_pads.png                    |  Bin 0 -> 7662 bytes
 manual/gap5_template-t.texi                     |  271 ++
 manual/gap5_template_by_mapping.png             |  Bin 0 -> 23049 bytes
 manual/gap5_template_by_size.png                |  Bin 0 -> 30511 bytes
 manual/gap5_template_by_stacking.png            |  Bin 0 -> 19527 bytes
 manual/gap5_template_filter.png                 |  Bin 0 -> 4792 bytes
 manual/gap5_template_spread0.png                |  Bin 0 -> 23517 bytes
 manual/gap5_template_spread50.png               |  Bin 0 -> 51742 bytes
 manual/gap5_template_template.png               |  Bin 0 -> 6059 bytes
 manual/gap_database-t.texi                      |  237 ++
 manual/getABIfield.1.texi                       |  105 +
 manual/get_comment.1.texi                       |   35 +
 manual/get_scf_field.1.texi                     |   39 +
 manual/hash_exp.1.texi                          |   33 +
 manual/hash_extract.1.texi                      |   30 +
 manual/hash_list.1.texi                         |   29 +
 manual/hash_tar.1.texi                          |  121 +
 manual/header.m4                                |  260 ++
 manual/hidden-t.texi                            |   30 +
 manual/i/nav_brief.gif                          |  Bin 0 -> 508 bytes
 manual/i/nav_down.gif                           |  Bin 0 -> 578 bytes
 manual/i/nav_first.gif                          |  Bin 0 -> 593 bytes
 manual/i/nav_full.gif                           |  Bin 0 -> 503 bytes
 manual/i/nav_home.gif                           |  Bin 0 -> 644 bytes
 manual/i/nav_last.gif                           |  Bin 0 -> 590 bytes
 manual/i/nav_next.gif                           |  Bin 0 -> 564 bytes
 manual/i/nav_prev.gif                           |  Bin 0 -> 581 bytes
 manual/i/nav_top.gif                            |  Bin 0 -> 646 bytes
 manual/i/nav_up.gif                             |  Bin 0 -> 587 bytes
 manual/init_exp.1.texi                          |   56 +
 manual/interface-t.texi                         |  469 +++
 manual/interface.buttons.png                    |  Bin 0 -> 1594 bytes
 manual/interface.colour.png                     |  Bin 0 -> 2030 bytes
 manual/interface.entry.png                      |  Bin 0 -> 1070 bytes
 manual/interface.fonts.png                      |  Bin 0 -> 2583 bytes
 manual/interface.menus.png                      |  Bin 0 -> 1417 bytes
 manual/interface.output.png                     |  Bin 0 -> 7319 bytes
 manual/interface.output.small.png               |  Bin 0 -> 5146 bytes
 manual/interface.tag.png                        |  Bin 0 -> 4428 bytes
 manual/interface.texi                           |   42 +
 manual/list_libraries-t.texi                    |   39 +
 manual/lists-t.texi                             |  254 ++
 manual/makeSCF.1.texi                           |   94 +
 manual/make_weights.1.texi                      |  217 ++
 manual/man/man1/convert_trace.1                 |  160 +
 manual/man/man1/copy_db.1                       |   79 +
 manual/man/man1/eba.1                           |   64 +
 manual/man/man1/extract_seq.1                   |   70 +
 manual/man/man1/find_renz.1                     |   41 +
 manual/man/man1/get_comment.1                   |   39 +
 manual/man/man1/get_scf_field.1                 |   44 +
 manual/man/man1/init_exp.1                      |   63 +
 manual/man/man1/makeSCF.1                       |  110 +
 manual/man/man4/ExperimentFile.4                |  779 ++++
 manual/man/man4/scf.4                           |  421 ++
 manual/man/man4/ztr.4                           |  750 ++++
 manual/manpages-t.texi                          |  120 +
 manual/manpages.texi                            |   41 +
 manual/manual.texi                              |  133 +
 manual/mini_manual.texi                         |   78 +
 manual/mut_contig_editor5.png                   |  Bin 0 -> 21753 bytes
 manual/mut_contig_editor5.small.png             |  Bin 0 -> 8569 bytes
 manual/mut_contig_editor_dis5.png               |  Bin 0 -> 14973 bytes
 manual/mut_contig_editor_dis5.small.png         |  Bin 0 -> 8304 bytes
 manual/mut_mutscan_adaptive_noise_threshold.png |  Bin 0 -> 9617 bytes
 manual/mut_mutscan_peak_alignment_threshold.png |  Bin 0 -> 6465 bytes
 manual/mut_mutscan_peak_drop_threshold.png      |  Bin 0 -> 13182 bytes
 manual/mut_pregap4.png                          |  Bin 0 -> 11302 bytes
 manual/mut_template_all.png                     |  Bin 0 -> 5756 bytes
 manual/mut_template_all.small.png               |  Bin 0 -> 5481 bytes
 manual/mut_template_reads.png                   |  Bin 0 -> 7336 bytes
 manual/mut_template_reads.small.png             |  Bin 0 -> 5266 bytes
 manual/mut_template_reads_single.png            |  Bin 0 -> 6227 bytes
 manual/mut_template_reads_single.small.png      |  Bin 0 -> 5920 bytes
 manual/mut_traces_het.png                       |  Bin 0 -> 18724 bytes
 manual/mut_traces_het.small.png                 |  Bin 0 -> 12144 bytes
 manual/mut_traces_point.png                     |  Bin 0 -> 18869 bytes
 manual/mut_traces_point.small.png               |  Bin 0 -> 12098 bytes
 manual/mut_traces_positive.png                  |  Bin 0 -> 18198 bytes
 manual/mut_traces_positive.small.png            |  Bin 0 -> 10765 bytes
 manual/mutations-t.texi                         |  686 ++++
 manual/mutations.texi                           |   36 +
 manual/notes-t.texi                             |  151 +
 manual/notes.editor.png                         |  Bin 0 -> 5429 bytes
 manual/notes.selector.png                       |  Bin 0 -> 3682 bytes
 manual/phrap-t.texi                             |  195 +
 manual/phrap.assembly.png                       |  Bin 0 -> 3436 bytes
 manual/polyA_clip.1.texi                        |   53 +
 manual/preface-t.texi                           |   84 +
 manual/pregap4-t.texi                           | 4764 +++++++++++++++++++++++
 manual/pregap4.texi                             |   56 +
 manual/pregap4_compact.png                      |  Bin 0 -> 11430 bytes
 manual/pregap4_component.png                    |  Bin 0 -> 2001 bytes
 manual/pregap4_config.png                       |  Bin 0 -> 8989 bytes
 manual/pregap4_edit_exp.png                     |  Bin 0 -> 6697 bytes
 manual/pregap4_files.png                        |  Bin 0 -> 8018 bytes
 manual/pregap4_mini-t.texi                      |  423 ++
 manual/pregap4_org-t.texi                       |   76 +
 manual/pregap4_overview.png                     |  Bin 0 -> 4154 bytes
 manual/pregap4_overview2.png                    |  Bin 0 -> 7170 bytes
 manual/pregap4_select.png                       |  Bin 0 -> 11281 bytes
 manual/pregap4_separate.png                     |  Bin 0 -> 8962 bytes
 manual/pregap4_simpledb.png                     |  Bin 0 -> 3134 bytes
 manual/pregap4_textwin.png                      |  Bin 0 -> 7571 bytes
 manual/primer_pos_plot.png                      |  Bin 0 -> 3201 bytes
 manual/primer_pos_plot.small.png                |  Bin 0 -> 2300 bytes
 manual/primer_pos_seq_display.png               |  Bin 0 -> 3784 bytes
 manual/primer_pos_seq_display.small.png         |  Bin 0 -> 1708 bytes
 manual/primer_pos_text.png                      |  Bin 0 -> 6262 bytes
 manual/qclip.1.texi                             |  134 +
 manual/quality_clip.png                         |  Bin 0 -> 3064 bytes
 manual/quality_clip_ends.png                    |  Bin 0 -> 3431 bytes
 manual/quality_plot-t.texi                      |  114 +
 manual/read_clipping-t.texi                     |   20 +
 manual/read_clipping.texi                       |   41 +
 manual/read_coverage_d.png                      |  Bin 0 -> 3611 bytes
 manual/read_coverage_p.png                      |  Bin 0 -> 5941 bytes
 manual/read_coverage_p.small.png                |  Bin 0 -> 4436 bytes
 manual/read_pairs-t.texi                        |  216 +
 manual/read_pairs.png                           |  Bin 0 -> 1710 bytes
 manual/readpair_coverage_p.png                  |  Bin 0 -> 3786 bytes
 manual/readpair_coverage_p.small.png            |  Bin 0 -> 2590 bytes
 manual/references-t.texi                        |  122 +
 manual/references.texi                          |   41 +
 manual/renzymes-t.texi                          |   57 +
 manual/repeats-t.texi                           |   75 +
 manual/repeats.png                              |  Bin 0 -> 4973 bytes
 manual/restrict_enzymes-t.texi                  |  160 +
 manual/restrict_enzymes.png                     |  Bin 0 -> 4474 bytes
 manual/restrict_enzymes.small.png               |  Bin 0 -> 3225 bytes
 manual/results-t.texi                           |   50 +
 manual/results.1.png                            |  Bin 0 -> 3439 bytes
 manual/scf-t.texi                               |  437 +++
 manual/screen_seq.1.texi                        |  185 +
 manual/set_genetic_code.png                     |  Bin 0 -> 3956 bytes
 manual/show_rel-t.texi                          |   77 +
 manual/show_rel.png                             |  Bin 0 -> 3242 bytes
 manual/snp_candidates1.png                      |  Bin 0 -> 30323 bytes
 manual/snp_candidates1.small.png                |  Bin 0 -> 10434 bytes
 manual/snp_candidates2.png                      |  Bin 0 -> 35573 bytes
 manual/snp_candidates2.small.png                |  Bin 0 -> 24132 bytes
 manual/spin-t.texi                              | 3133 +++++++++++++++
 manual/spin.texi                                |   41 +
 manual/spin_align_p.png                         |  Bin 0 -> 5523 bytes
 manual/spin_align_p.small.png                   |  Bin 0 -> 3369 bytes
 manual/spin_align_seq.png                       |  Bin 0 -> 8142 bytes
 manual/spin_alignment_symbols.png               |  Bin 0 -> 1808 bytes
 manual/spin_author_d.png                        |  Bin 0 -> 4766 bytes
 manual/spin_author_p.png                        |  Bin 0 -> 15498 bytes
 manual/spin_author_p.small.png                  |  Bin 0 -> 12062 bytes
 manual/spin_base_bias_d.png                     |  Bin 0 -> 4166 bytes
 manual/spin_base_bias_p.png                     |  Bin 0 -> 7027 bytes
 manual/spin_base_bias_p.small.png               |  Bin 0 -> 5194 bytes
 manual/spin_codon_usage.png                     |  Bin 0 -> 15808 bytes
 manual/spin_codon_usage.small.png               |  Bin 0 -> 12171 bytes
 manual/spin_codon_usage_aaonly.png              |  Bin 0 -> 15381 bytes
 manual/spin_codon_usage_aaonly.small.png        |  Bin 0 -> 24469 bytes
 manual/spin_codon_usage_dial.png                |  Bin 0 -> 7610 bytes
 manual/spin_count_codons_d.png                  |  Bin 0 -> 9717 bytes
 manual/spin_count_codons_t.png                  |  Bin 0 -> 9679 bytes
 manual/spin_count_codons_t.small.png            |  Bin 0 -> 6193 bytes
 manual/spin_diagonals.png                       |  Bin 0 -> 8259 bytes
 manual/spin_dot_plot.png                        |  Bin 0 -> 10207 bytes
 manual/spin_dot_plot.small.png                  |  Bin 0 -> 6491 bytes
 manual/spin_find_orf_d.png                      |  Bin 0 -> 6162 bytes
 manual/spin_local_align.png                     |  Bin 0 -> 5310 bytes
 manual/spin_local_p1.png                        |  Bin 0 -> 5279 bytes
 manual/spin_local_p1.small.png                  |  Bin 0 -> 3606 bytes
 manual/spin_local_p2.png                        |  Bin 0 -> 5418 bytes
 manual/spin_local_p2.small.png                  |  Bin 0 -> 3337 bytes
 manual/spin_match_words.png                     |  Bin 0 -> 6457 bytes
 manual/spin_mini-t.texi                         |  364 ++
 manual/spin_org-t.texi                          |   28 +
 manual/spin_personal_search.png                 |  Bin 0 -> 2656 bytes
 manual/spin_plot.png                            |  Bin 0 -> 19725 bytes
 manual/spin_plot.small.png                      |  Bin 0 -> 36138 bytes
 manual/spin_plot_base_comp_d.png                |  Bin 0 -> 4921 bytes
 manual/spin_plot_base_comp_p.png                |  Bin 0 -> 6184 bytes
 manual/spin_plot_base_comp_p.small.png          |  Bin 0 -> 4724 bytes
 manual/spin_plot_drag1.png                      |  Bin 0 -> 16337 bytes
 manual/spin_plot_drag1.small.png                |  Bin 0 -> 10602 bytes
 manual/spin_plot_drag2.png                      |  Bin 0 -> 16437 bytes
 manual/spin_plot_drag2.small.png                |  Bin 0 -> 10986 bytes
 manual/spin_plot_drag3.png                      |  Bin 0 -> 16875 bytes
 manual/spin_plot_drag3.small.png                |  Bin 0 -> 10718 bytes
 manual/spin_plot_p.png                          |  Bin 0 -> 15677 bytes
 manual/spin_plot_p.small.png                    |  Bin 0 -> 10853 bytes
 manual/spin_restrict_enzymes-t.texi             |  191 +
 manual/spin_restrict_enzymes_d.png              |  Bin 0 -> 4625 bytes
 manual/spin_restrict_enzymes_p.png              |  Bin 0 -> 8839 bytes
 manual/spin_restrict_enzymes_p.small.png        |  Bin 0 -> 6033 bytes
 manual/spin_restrict_enzymes_p1.png             |  Bin 0 -> 8920 bytes
 manual/spin_restrict_enzymes_p1.small.png       |  Bin 0 -> 6032 bytes
 manual/spin_results_manager_d.png               |  Bin 0 -> 3283 bytes
 manual/spin_results_manager_d2.png              |  Bin 0 -> 16166 bytes
 manual/spin_results_manager_d2.small.png        |  Bin 0 -> 11663 bytes
 manual/spin_save_sequence_d.png                 |  Bin 0 -> 5063 bytes
 manual/spin_seq_display.png                     |  Bin 0 -> 5591 bytes
 manual/spin_seq_display.small.png               |  Bin 0 -> 3588 bytes
 manual/spin_seq_manager.png                     |  Bin 0 -> 5863 bytes
 manual/spin_sequence_display_d.png              |  Bin 0 -> 2513 bytes
 manual/spin_sequence_display_save_d.png         |  Bin 0 -> 1924 bytes
 manual/spin_sequence_display_t.png              |  Bin 0 -> 10734 bytes
 manual/spin_sequence_display_t.small.png        |  Bin 0 -> 6713 bytes
 manual/spin_similar_spans.png                   |  Bin 0 -> 7124 bytes
 manual/spin_simple_search.png                   |  Bin 0 -> 1951 bytes
 manual/spin_splice.png                          |  Bin 0 -> 6769 bytes
 manual/spin_splice.small.png                    |  Bin 0 -> 4853 bytes
 manual/spin_start_d.png                         |  Bin 0 -> 3737 bytes
 manual/spin_start_p.png                         |  Bin 0 -> 4902 bytes
 manual/spin_start_p.small.png                   |  Bin 0 -> 2895 bytes
 manual/spin_stops_d.png                         |  Bin 0 -> 4384 bytes
 manual/spin_stops_p.png                         |  Bin 0 -> 4957 bytes
 manual/spin_stops_p.small.png                   |  Bin 0 -> 3496 bytes
 manual/spin_stops_p2.png                        |  Bin 0 -> 15432 bytes
 manual/spin_stops_p2.small.png                  |  Bin 0 -> 10527 bytes
 manual/spin_string_search_d.png                 |  Bin 0 -> 3531 bytes
 manual/spin_string_search_p.png                 |  Bin 0 -> 3406 bytes
 manual/spin_string_search_p.small.png           |  Bin 0 -> 2359 bytes
 manual/spin_translate_d.png                     |  Bin 0 -> 3546 bytes
 manual/spin_translate_t.png                     |  Bin 0 -> 3727 bytes
 manual/spin_translate_t.small.png               |  Bin 0 -> 3673 bytes
 manual/spin_trna_p.png                          |  Bin 0 -> 3329 bytes
 manual/spin_trna_p.small.png                    |  Bin 0 -> 2330 bytes
 manual/spin_trna_t.png                          |  Bin 0 -> 7168 bytes
 manual/spin_trna_t.small.png                    |  Bin 0 -> 4915 bytes
 manual/spin_weight_matrix.png                   |  Bin 0 -> 3298 bytes
 manual/spin_weight_matrix.small.png             |  Bin 0 -> 2552 bytes
 manual/spin_weight_matrix_dial.png              |  Bin 0 -> 4408 bytes
 manual/stops-t.texi                             |   52 +
 manual/stops.png                                |  Bin 0 -> 4320 bytes
 manual/stops.small.png                          |  Bin 0 -> 1706 bytes
 manual/strand_coverage_d.png                    |  Bin 0 -> 3921 bytes
 manual/strand_coverage_p1.png                   |  Bin 0 -> 2937 bytes
 manual/strand_coverage_p1.small.png             |  Bin 0 -> 2404 bytes
 manual/strand_coverage_p2.png                   |  Bin 0 -> 2997 bytes
 manual/strand_coverage_p2.small.png             |  Bin 0 -> 2439 bytes
 manual/suggest_probes.main.png                  |  Bin 0 -> 3991 bytes
 manual/suggest_probes.select.png                |  Bin 0 -> 7126 bytes
 manual/tags-t.texi                              |  120 +
 manual/template-t.texi                          |  750 ++++
 manual/template.dialogue.png                    |  Bin 0 -> 3855 bytes
 manual/template.display.png                     |  Bin 0 -> 16986 bytes
 manual/template.display.small.png               |  Bin 0 -> 35576 bytes
 manual/template.quality.png                     |  Bin 0 -> 3032 bytes
 manual/template.quality.small.png               |  Bin 0 -> 2601 bytes
 manual/template.restriction.png                 |  Bin 0 -> 591 bytes
 manual/template.restriction.small.png           |  Bin 0 -> 384 bytes
 manual/template_status.png                      |  Bin 0 -> 3158 bytes
 manual/test.texi                                |  119 +
 manual/tools/docmake                            |   50 +
 manual/tools/edit_mini_contents.pl              |   25 +
 manual/tools/html_index.pl                      |   42 +
 manual/tools/list_nodes                         |    2 +
 manual/tools/lowersection                       |   21 +
 manual/tools/make_dependencies                  |   22 +
 manual/tools/make_eps                           |   81 +
 manual/tools/make_gif_html                      |   17 +
 manual/tools/make_pdf                           |  131 +
 manual/tools/make_png_html                      |   17 +
 manual/tools/make_ps                            |   82 +
 manual/tools/merge_indexes.pl                   |  258 ++
 manual/tools/pkfix.pl                           |  634 +++
 manual/tools/remove_xrefs.pl                    |    8 +
 manual/tools/reorder.tcl                        |  129 +
 manual/tools/texi2html                          | 1910 +++++++++
 manual/tools/texi2man.pl                        |  173 +
 manual/tools/texi2text                          |   59 +
 manual/tools/text2texi                          |    4 +
 manual/tools/update-nodes                       |    2 +
 manual/tools/update-nodes.el                    |   11 +
 manual/tools/xref_update.pl                     |   81 +
 manual/trace_dump.1.texi                        |   30 +
 manual/trace_print_menu.png                     |  Bin 0 -> 7863 bytes
 manual/trace_print_menu.small.png               |  Bin 0 -> 5561 bytes
 manual/trace_print_page_dialogue.png            |  Bin 0 -> 2799 bytes
 manual/trace_print_trace1.png                   |  Bin 0 -> 7598 bytes
 manual/trace_print_trace_dialogue.png           |  Bin 0 -> 3419 bytes
 manual/tracediff.1.texi                         |  125 +
 manual/trev-t.texi                              |  408 ++
 manual/trev.texi                                |   53 +
 manual/trev_conf_trace.png                      |  Bin 0 -> 9881 bytes
 manual/trev_conf_trace.small.png                |  Bin 0 -> 6853 bytes
 manual/trev_mini-t.texi                         |   60 +
 manual/trev_pic.png                             |  Bin 0 -> 8681 bytes
 manual/trev_pyro_trace.png                      |  Bin 0 -> 36558 bytes
 manual/vector_clip-t.texi                       |  955 +++++
 manual/vector_clip.1.texi                       |  241 ++
 manual/vector_clip.texi                         |   41 +
 manual/vector_primer-t.texi                     |   36 +
 manual/ztr-t.texi                               |  772 ++++
 overview.html.template                          |  305 ++
 parse_template                                  |   84 +
 scripting_manual/Makefile                       |   91 +
 scripting_manual/appendix-t.texi                |  331 ++
 scripting_manual/dependencies                   |   21 +
 scripting_manual/extension-t.texi               |  732 ++++
 scripting_manual/gap4-canno-t.texi              |  521 +++
 scripting_manual/gap4-cedit-t.texi              |  382 ++
 scripting_manual/gap4-cio-IO.h-t.texi           |  211 +
 scripting_manual/gap4-cio-basic-t.texi          |  805 ++++
 scripting_manual/gap4-cio-compile-t.texi        |   78 +
 scripting_manual/gap4-cio-database-t.texi       |  615 +++
 scripting_manual/gap4-cio-gapio-t.texi          |  172 +
 scripting_manual/gap4-cio-high-t.texi           |  716 ++++
 scripting_manual/gap4-cio-intro-t.texi          |  211 +
 scripting_manual/gap4-cio-mid-t.texi            |   75 +
 scripting_manual/gap4-cio-other-t.texi          |  229 ++
 scripting_manual/gap4-cio-t.texi                |   49 +
 scripting_manual/gap4-editor-t.texi             |  814 ++++
 scripting_manual/gap4-registration-t.texi       | 1773 +++++++++
 scripting_manual/gap4-scripting-comm-t.texi     | 1687 ++++++++
 scripting_manual/gap4-scripting-intro-t.texi    |   69 +
 scripting_manual/gap4-scripting-io-t.texi       |  502 +++
 scripting_manual/gap4-scripting-util-t.texi     |  316 ++
 scripting_manual/gap4-t.texi                    |   46 +
 scripting_manual/header.m4                      |  292 ++
 scripting_manual/i/nav_brief.gif                |  Bin 0 -> 508 bytes
 scripting_manual/i/nav_down.gif                 |  Bin 0 -> 578 bytes
 scripting_manual/i/nav_first.gif                |  Bin 0 -> 593 bytes
 scripting_manual/i/nav_full.gif                 |  Bin 0 -> 503 bytes
 scripting_manual/i/nav_home.gif                 |  Bin 0 -> 644 bytes
 scripting_manual/i/nav_last.gif                 |  Bin 0 -> 590 bytes
 scripting_manual/i/nav_next.gif                 |  Bin 0 -> 564 bytes
 scripting_manual/i/nav_prev.gif                 |  Bin 0 -> 581 bytes
 scripting_manual/i/nav_top.gif                  |  Bin 0 -> 646 bytes
 scripting_manual/i/nav_up.gif                   |  Bin 0 -> 587 bytes
 scripting_manual/preface-t.texi                 |   48 +
 scripting_manual/scripting.texi                 |  102 +
 scripting_manual/tkutils-t.texi                 | 1253 ++++++
 503 files changed, 54660 insertions(+), 158 deletions(-)

diff --git a/Acknowledgements b/Acknowledgements
new file mode 100644
index 0000000..717da39
--- /dev/null
+++ b/Acknowledgements
@@ -0,0 +1,53 @@
+This file contains acknowledgements to others for their input to the package.
+We apologise to those we have missed.
+
+------------------------------------------------------------------------------
+The OSP code used for oligo selection in bap, gap and gap4 is from (C) LaDeana
+Hillier and Philip Green. See the src/{gap4,bap}/osp-bits subdirectories.
+
+People wanting to obtain the program OSP should contact:
+
+    LaDeana Hillier (lfw at elegans.wustl.edu)
+    Department of Genetics
+    Washington University School of Medicine
+    4566 Scott Avenue, Box 8232
+    St. Louis, MO 63110
+    USA
+
+Reference:
+
+Hillier, L., and Green, P. (1991) PCR Methods and Applications, 1:124-128.
+OSP: an oligonucleotide selection program. 
+
+
+------------------------------------------------------------------------------
+The alignment algorithm used within the gap4 Directed Assembly mode is
+(c) 1992 Xiaoqiu Huang. See the src/gap4/align_ss2.c file.
+
+This algorithm is based upon one by Gene Myers and Webb Miller, which we also
+use.
+
+References:
+
+Huang, X. CABIOS Vol 10 no 3. 227-235 (1994).
+On global sequence alignment.
+
+Myers, E.W.,  and Miller, W. (1988) Optimal alignments in linear space.
+Comput. Applic. Biosci. 4 11-17.
+
+------------------------------------------------------------------------------
+The Tcl language and the Tk toolkit were written by John Ousterhout. These
+have been used primarily for the interface in gap4.
+
+References:
+
+Ousterhout, J.K., (1990) ``TCL: An Embeddable Command Language'',
+in the Proceedings of the 1990 Winter USENIX Conference, pp 133-146.
+
+Ousterhout, J.K., (1991) ``An X11 Toolkit Based on the TCL Language'',
+in the Proceedings of the 1991 Winter USENIX Conference, pp 105-115.
+
+These papers can be obtained by ftp from:
+ftp://ftp.cs.berkeley.edu/ucb/tcl/tkUsenix91.ps
+ftp://ftp.cs.berkeley.edu/ucb/tcl/tkF10.ps
+
diff --git a/Code_Overview b/Code_Overview
new file mode 100644
index 0000000..844a6bb
--- /dev/null
+++ b/Code_Overview
@@ -0,0 +1,135 @@
+Organisation of the code 
+======================== 
+ 
+The package consists of the following directories, which I organise here into
+libraries and applications. In some cases it's a little bit complicated as the
+same directory houses both a C library, Tcl functions to communicate with that 
+library and/or provide additional GUI code, and an application. 
+ 
+For example Gap4 consists of lots of C (with a tiny bit of Fortran). These are
+compiled together to produce (for example) libgap.so. The same directory
+contains lots of .tcl files and an associated tclIndex file for use with the
+stash "load_package" command. Finally it also contains gap.tcl, the main
+startup file for gap4. ('Gap4' is effectively just "stash .../gap/gap.tcl").
+ 
+Pure C libraries 
+---------------- 
+ 
+Misc 
+io_lib 
+text_utils 
+g 
+seq_utils 
+mutlib (C++) 
+ 
+Pure Tcl libraries 
+------------------ 
+cap2 
+cap3 
+phrap 
+spin_emboss 
+ 
+C and Tcl libraries 
+------------------- 
+ 
+tk_utils 
+prefinish
+gap4 (also app startup code) 
+spin (also app startup code) 
+spin2 (also app startup code) 
+seqed (also app startup code) 
+ 
+Applications 
+------------ 
+ 
+abi 
+alf 
+convert 
+eba 
+expGetSeq 
+get_scf_field 
+hetins 
+init_exp 
+make_weights 
+qclip 
+pregap4 (pure Tcl) 
+screen_seq 
+tracealign 
+tracediff 
+traceview 
+trev (pure Tcl) 
+vector_clip 
+ 
+External libraries used 
+----------------------- 
+png 
+zlib 
+primer3 (modified to act as a library instead of an application) 
+tcl 
+tk 
+IncrTcl 
+tkdnd 
+iwidgets (pure itcl/itk) 
+tablelist (pure Tcl) 
+
+
+Library hierarchy
+-----------------
+
+A dependency tree is hard to draw if we include all the applications, as gap4
+and spin both include lots of dependencies which would lead to many
+crossing arrows. So I'll break down just the basic libraries.
+
+  	 +-----------------------------------+------------+
+	 |             tk_utils              | text_utils |
+         |  +------+----+--------+-----+-----+------------+
+         |  | Itcl | Tk | io_lib | png |                  |
+         +--+------+----+--------+-----+      Misc        |
+         |     Tcl      |    zlib      |                  |
+         +--------------+--------------+------------------+
+
+So io_lib does not (yet) depend on Misc, but it does contain quite a bit of
+duplicated code. The reason for this is that historically we've distributed
+io_lib as a separate Open Source package and so we copied the necessary bits
+from Misc into io_lib. Ideally we should revert back to having io_lib depend
+on Misc and remove this duplication.
+
+Misc is where the OS dependent bits belong, such as byte-order handling
+functions and implementations of functions missing on some platforms, so that
+all the OSes come to a common standard. In addition to this it has useful
+data-type handling code such as dynamic arrays and dynamic strings.
+
+Text_utils and tk_utils both contain some of the same functions: vmessage,
+verror, vfuncheader (and maybe more). In the tk_utils world these use Tcl/Tk
+to add text to the main text output windows. In text_utils these just print
+to stderr and stdout.
+
+A complication arises in that some algorithms in seq_utils will use vmessage
+or verror. seq_utils itself may be called (for example) from both Gap4 (which
+has a tk window and uses tk_utils) and vector_clip (which is a non-GUI tool
+and uses text_utils). So when linking Gap4 we want to use seq_utils and
+tk_utils and when linking vector_clip we want to use seq_utils and
+text_utils. The original plan (back in the unix-only days) was that when
+linking seq_utils we would not specify tk_utils or text_utils and so build a
+library with unresolved externals. These will be resolved only when linking
+the final application.
+
+Unfortunately exploiting such lazy linking techniques will not work on Windows 
+as they are not supported. All symbols have to be resolved when linking a
+library, meaning we have to explicitly state whether seq_utils links against
+tk_utils or text_utils. This then causes crashes if vector_clip links against
+seq_utils and text_utils, meaning that text_utils becomes less
+useful. Ultimately the solution is that the strict algorithm libraries (such
+as seq_utils) want rewriting so as not to print up messages at all; they
+should simply return error strings which are dealt with by the application in
+an application specific manner.
+
+Gap4 and spin depend on pretty much all of the above. Gap4 also depends on "g" 
+and "mutlib", both of which depend on a variety of other libraries (g: Misc,
+mutlib: tk_utils, io_lib, seq_utils, Misc).
+
+Finally, "prefinish" depends on Gap4 and most of the same libraries it depends 
+on.
+
+==============================================================================
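
The remedy proposed in Code_Overview above (algorithm libraries return error
strings and leave reporting to the application) can be sketched as follows.
This is only an illustration in shell, not code from the package:
check_sequence, cli_app, gui_app and ERRMSG are hypothetical names, and the
"output window" is simulated with a prefix on stdout.

```shell
#!/bin/sh
# Hypothetical "library" routine: it never prints. Failure is reported via
# the exit status plus an error string left in ERRMSG.
check_sequence() {
    ERRMSG=
    case $1 in
        '')         ERRMSG="empty sequence"; return 1 ;;
        *[!ACGTN]*) ERRMSG="unexpected character in sequence"; return 1 ;;
    esac
    return 0
}

# A non-GUI tool (in the role of vector_clip) sends the string to stderr.
cli_app() {
    check_sequence "$1" || { echo "error: $ERRMSG" >&2; return 1; }
    echo "ok"
}

# A GUI tool (in the role of Gap4) would route the same string to its own
# text output window; a prefixed line stands in for that here.
gui_app() {
    check_sequence "$1" || { echo "[output window] $ERRMSG"; return 1; }
    echo "ok"
}

cli_app "ACGT"
gui_app "ACGX" || true
```

Because the library routine has no dependency on either reporting layer, it
would not need to be linked against tk_utils or text_utils at all.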
+
diff --git a/Makefile b/Makefile
new file mode 100644
index 0000000..8d0a2c9
--- /dev/null
+++ b/Makefile
@@ -0,0 +1,104 @@
+VERSION=2.0.0b11
+prefix=/usr
+
+# GNU standard dir names
+datarootdir = ${prefix}/share
+docdir      = ${datarootdir}/doc/staden
+mandir      = ${datarootdir}/man
+man1dir     = ${mandir}/man1
+man4dir     = ${mandir}/man4
+htmldir     = ${docdir}
+
+# Dir names used below, incorporating DESTDIR
+DOCDIR     = ${DESTDIR}${docdir}
+MANDIR     = ${DESTDIR}${mandir}
+MAN1DIR    = ${DESTDIR}${man1dir}
+MAN4DIR    = ${DESTDIR}${man4dir}
+HTMLDIR    = ${DESTDIR}${htmldir}
+
+all:
+	@echo
+	@echo Please rerun make specifying either target \"unix\" or \"windows\".
+	@echo
+
+unix: SYSTEM=unix
+unix: common
+
+windows: SYSTEM=windows
+windows: common
+
+common:
+	cd manual; $(MAKE) $(SUBFLAGS) $(SYSTEM)
+	cd scripting_manual; $(MAKE) $(SUBFLAGS)
+	./parse_template $(SYSTEM) < overview.html.template > overview.html
+
+install:
+	# Man pages
+	-mkdir -p            $(MAN1DIR)
+	cp manual/man/man1/* $(MAN1DIR)
+
+	-mkdir -p            $(MAN4DIR)
+	cp manual/man/man4/* $(MAN4DIR)
+
+	# Main PDF docs
+	-mkdir -p                            $(DOCDIR)
+	cp manual/manual.pdf manual/mini.pdf $(DOCDIR)
+
+	# HTML pages
+	cp *.html $(HTMLDIR)
+
+	-mkdir            $(HTMLDIR)/manual
+	cp manual/*.html  $(HTMLDIR)/manual
+	cp manual/*.png   $(HTMLDIR)/manual
+	cp manual/*.index $(HTMLDIR)/manual
+	-mkdir            $(HTMLDIR)/manual/i
+	cp i/*            $(HTMLDIR)/manual/i
+
+	-mkdir                      $(HTMLDIR)/scripting_manual
+	-cp scripting_manual/*.html $(HTMLDIR)/scripting_manual
+	-cp scripting_manual/*.pdf  $(HTMLDIR)/scripting_manual
+	-mkdir                      $(HTMLDIR)/scripting_manual/i
+	cp i/*                      $(HTMLDIR)/scripting_manual/i
+
+	# Other bits and pieces
+	cp Acknowledgements *.txt *.pdf $(DOCDIR)
+	-mkdir -p $(DOCDIR)/i
+	cp i/*    $(DOCDIR)/i
+
+
+DISTSRC=staden_doc-$(VERSION)-src
+distsrc:
+	-mkdir -p $(DISTSRC)/manual
+	-mkdir -p $(DISTSRC)/scripting_manual
+	-cp -R i $(DISTSRC)/i
+	-cp -R manual/Makefile \
+	       manual/*.texi \
+	       manual/*.png \
+	       manual/*.m4 \
+	       manual/README \
+	       manual/dependencies \
+	       manual/tools \
+	       manual/man \
+	       $(DISTSRC)/manual
+	-cp -R scripting_manual/*.texi \
+	       scripting_manual/*.m4 \
+	       scripting_manual/Makefile \
+	       scripting_manual/dependencies \
+	       scripting_manual/i \
+	       scripting_manual/tools \
+	       $(DISTSRC)/scripting_manual
+	-cp Acknowledgements $(DISTSRC)
+	-cp README $(DISTSRC)
+	-cp *.pdf $(DISTSRC)
+	-cp emboss.txt $(DISTSRC)
+	-cp *.gif $(DISTSRC)
+	-cp Makefile $(DISTSRC)
+	-cp *template index.html $(DISTSRC)
+	-find $(DISTSRC) -name .svn -exec rm -rf {} \;
+	tar cfz $(DISTSRC).tar.gz $(DISTSRC)
+
+clean:
+	cd manual && make spotless
+
+install:
+
diff --git a/README b/README
new file mode 100644
index 0000000..1008a03
--- /dev/null
+++ b/README
@@ -0,0 +1,56 @@
+Rebuilding documentation
+========================
+
+For this you'll need multiple dependencies, including:
+
+    bourne shell
+    sed/awk/grep
+    perl
+    tcl
+    texinfo
+    TeX (eg "tetex-bin" package)
+    emacs (for texinfo mode)
+    m4
+    imagemagick
+
+There are two versions of the manual, one for unix and one for
+windows, although they are very similar. You will need to either type
+"make unix" or "make windows" depending on which set of documentation
+you wish to build.
+
+You may also wish to redefine PAPER in the makefile. By default we
+build A4, but PAPER=us will use US letter format instead. Eg:
+
+      make unix PAPER=us
+
+Note that rebuilding the main manual subdirectory only works well from
+a clean directory. This is a long-standing bug, but it can be worked
+around by typing "(cd manual; make spotless)".
+
+
+Installing documentation
+========================
+
+Use "make install prefix=<dir>" to install the package documentation
+somewhere. This is entirely platform independent, so the documentation
+will be copied to ${prefix}/share/man/ and ${prefix}/share/doc/staden/.
+
+This copies the manuals previously built in the first step.
+
+Without redefining the prefix variable, make will attempt to install the
+documentation into the system "/usr" directory. The prefix defined here
+should be the same --prefix used when configuring and building the
+main Staden Package source directory.
+
+Additionally, the DESTDIR variable can be set allowing for staged
+installs. See http://www.gnu.org/prep/standards/html_node/DESTDIR.html
+for further details on this.
+
+
+Building a source distribution
+==============================
+
+If you've edited the manual and wish to rebuild a new documentation
+"source" package, the command is "make distsrc".
+
+This will create a staden_doc-<version>-src.tar.gz file.
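
The way the README's prefix and DESTDIR variables combine in the top-level
Makefile can be sketched in plain shell. The variable layering is copied from
that Makefile; the two values chosen for prefix and DESTDIR are illustrative
assumptions only.

```shell
#!/bin/sh
# Mirror the variable layering used by the top-level Makefile.
# These two values are illustrative assumptions, not package defaults.
prefix=/usr/local
DESTDIR=/tmp/staden-stage

# GNU standard directory names, derived from prefix.
datarootdir="${prefix}/share"
docdir="${datarootdir}/doc/staden"
man1dir="${datarootdir}/man/man1"

# DESTDIR is prepended only when copying files; it is not part of the
# path the documentation will finally live under.
DOCDIR="${DESTDIR}${docdir}"
MAN1DIR="${DESTDIR}${man1dir}"

echo "$DOCDIR"
echo "$MAN1DIR"
```

The matching invocation would be
"make install prefix=/usr/local DESTDIR=/tmp/staden-stage", after which the
staged tree under /tmp/staden-stage can be packaged or copied into place.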
diff --git a/ReleaseNotes b/ReleaseNotes
new file mode 100644
index 0000000..d104ee2
--- /dev/null
+++ b/ReleaseNotes
@@ -0,0 +1,92 @@
+			Staden Package v1.7.0
+			=====================
+
+This is the first release to provide prebuilt binaries for linux on
+x86-64 (eg AMD Opterons). We do not have a windows system on this
+architecture, although the 32-bit version should still work.
+
+On the Gap4 front there have been several minor joining related
+improvements in how it scores joins in Find Internal Joins and the
+functionality of the align button in the Join Editor.
+
+Also of consequence to gap4 are the various changes to io_lib (the
+library for all I/O to various trace file formats). With the addition
+of a new hash_exp program it now allows for experiment files to be
+concatenated together and indexed. The list of experiment file names
+should still be supplied to gap4, but provided the EXP_PATH environment
+variable has been set correctly gap4 will be able to fetch individual
+sequences out of the concatenated experiment file
+archive. Improvements to the user interface for this still need to be
+made.
+
+454 SFF archives are now better supported. The default 454 indices
+now work, although the hash_sff program can be used to provide an
+alternative indexing strategy (possibly faster in some cases).
+
+Gap4 and trev also have another trace display style for traces that
+have 1 sample scan (x4 channels) per base call. In this case it can
+also draw 4 confidence values instead of 1 per base. These abilities
+will allow better integration of Solexa traces when more readily
+available.
+
+Plus of course the usual mix of bug fixes and minor tweaks. See the
+full change log for details (via the SourceForge site).
+
+
+Other notable changes
+=====================
+
+Gap4
+----
+
+* Various SNP Candidates improvements. The "correlation offset" is now
+  adjustable (this controls the average correlation score needed
+  before groups are considered for automatic merging). SNP base calls
+  now work by generating a consensus rather than requiring 100%
+  identity. It now skips sequences containing a REFS note. Merging can
+  be forced until the number of groups is less than or equal to a
+  predetermined amount (NB: not usually ideal).
+
+* Shuffle pads now has a "band size" parameter for the alignments.
+  Various bugs have also been fixed.
+
+* The old editor shuffle command has been replaced by strip pads. It
+  now only removes entire columns of pads and does no pad movement at
+  all.
+
+* The join editor align button will now cope better with handling long
+  alignments in repeated data, hopefully avoiding the "too long for
+  practical use of dynamic programming" message in such cases.
+
+  It also now has "<" and ">" buttons either side of the "Align"
+  button. These anchor one end of the alignment to the current overlap
+  position and then only align from that point leftwards or
+  rightwards. This helps to force an alignment to anchor at a specific
+  location which is useful when aligning data consisting of multiple
+  repeat elements.
+
+* Alignments found by Find Internal Joins now take into account the
+  alignment score in addition to the percentage identity. This means
+  that no longer will a 100bp overlap at 100% identity be considered
+  as a better overlap than a 2kb overlap at 99% identity.
+
+* The Contig Navigation window now has Page Up and Page Down
+  keybindings for previous and next match. It also has the ability to
+  automatically display traces at the appropriate regions using the
+  contig editor's "Auto-display traces" functionality.
+
+* The "View List" window now has a Save button.
+
+Io_lib
+------
+
+* New programs: hash_exp, hash_sff, append_sff, extract_fastq.
+
+* Added TRACE_PATH and EXP_PATH environment variables to use in
+  preference to RAWDATA (when defined).
+
+* Now uses libcurl instead of wget for much faster web based trace
+  fetching.
+
+
+
diff --git a/debian/changelog b/debian/changelog
deleted file mode 100644
index 2e5301d..0000000
--- a/debian/changelog
+++ /dev/null
@@ -1,22 +0,0 @@
-staden-doc (2.0.0b11.r4076-1) UNRELEASED; urgency=medium
-
-  * Team upload.
-  * Initial Upload to Debian
-
- -- Andreas Tille <tille at debian.org>  Thu, 28 Dec 2017 21:00:12 +0100
-
-staden-doc (2.0.0b9-0biolinux2) precise; urgency=low
-
-  * Make clean target actually clean up
-  * Remove rogue file '*.*.png.html'
-  * Fix help menus:
-    * Allow context menu to open copyright page
-    * Avoid compressing .index files
-
- -- Tim Booth <tbooth at ceh.ac.uk>  Wed, 12 Jun 2013 16:54:05 +0100
-
-staden-doc (2.0.0b9-0biolinux1) precise; urgency=low
-
-  * New package for Bio-Linux
-
- -- Tim Booth <tbooth at ceh.ac.uk>  Thu, 06 Jun 2013 12:10:19 +0100
diff --git a/debian/compat b/debian/compat
deleted file mode 100644
index 45a4fb7..0000000
--- a/debian/compat
+++ /dev/null
@@ -1 +0,0 @@
-8
diff --git a/debian/control b/debian/control
deleted file mode 100644
index 8de4d86..0000000
--- a/debian/control
+++ /dev/null
@@ -1,17 +0,0 @@
-Source: staden-doc
-Section: doc
-Priority: extra
-Maintainer: Tim Booth <tbooth at ceh.ac.uk>
-Build-Depends:
- debhelper (>= 8), tcl, emacs, imagemagick, m4,
- texinfo, texlive-base, texlive-latex-base, texlive-generic-recommended, texlive-latex-extra
-Standards-Version: 3.9.3
-Homepage: https://staden.sf.net
-
-Package: staden-doc
-Architecture: all
-Depends: ${misc:Depends}, ${shlibs:Depends}
-Recommends: staden-common, staden
-Description: documentation for Staden
- Documentation for Staden, including manpages for commands and full manual in
- PDF format.
diff --git a/debian/copyright b/debian/copyright
deleted file mode 100644
index 37924e9..0000000
--- a/debian/copyright
+++ /dev/null
@@ -1,19 +0,0 @@
-Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
-Upstream-Name: Staden
-Upstream-Contact: James Bonfield
-Source: https://sourceforge.net/p/staden/code/HEAD/tree/
-
-Files: *
-Copyright: 2005-2013, James Bonfield
-           2005-2013, Andrew Whitwham
-	   Rodger Staden
-           Kathryn Beal
-           Mark Jordan
-           Yaping Cheng
-           Simon Dear
-           Matthew Betts
-License: Modified BSD
-
-Files: debian/*
-Copyright: 2013, Tim Booth <tbooth at ceh.ac.uk>
-License: Simplified BSD
diff --git a/debian/doc-base b/debian/doc-base
deleted file mode 100644
index e69de29..0000000
diff --git a/debian/get-orig-source b/debian/get-orig-source
deleted file mode 100755
index 645ce1e..0000000
--- a/debian/get-orig-source
+++ /dev/null
@@ -1,32 +0,0 @@
-#!/bin/sh
-# if you need to repack for whatever reason you can
-# use this script via uscan or directly
-#
-# FIXME: currently the code does not conform to Debian Policy
-#        http://www.debian.org/doc/debian-policy/ch-source.html
-#        "get-orig-source (optional)"
-#        This target may be invoked in any directory, ...
-# --> currently it is assumed the script is called in the
-#     source directory featuring the debian/ dir
-
-COMPRESS=xz
-
-set -e
-NAME=`dpkg-parsechangelog | awk '/^Source/ { print $2 }'`
-VERSION=`dpkg-parsechangelog | awk '/^Version:/ { print $2 }' | sed 's/\([0-9\.br]\+\)-[0-9]\+$/\1/'`
-
-## NO tags no branches
-SVNURI="https://svn.code.sf.net/p/staden/code/staden/trunk/doc"
-revision=`LANG=C svn info ${SVNURI} | grep "^Last Changed Rev:" | sed 's/Last Changed Rev: *//'`
-VERSION=`echo ${VERSION}| sed "s/.r[0-9]\+$//"`.r${revision}
-
-TARDIR=${NAME}-${VERSION}
-
-mkdir -p ../tarballs
-cd ../tarballs
-# svn export conserves time stamps of the files, checkout does not
-#set -x
-LC_ALL=C svn export ${SVNURI} ${TARDIR} >/dev/null 2>/dev/null || true
-
-GZIP="--best --no-name" tar --owner=root --group=root --mode=a+rX -caf "$NAME"_"$VERSION".orig.tar.${COMPRESS} "${TARDIR}"
-rm -rf ${TARDIR}
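
The version handling in the removed debian/get-orig-source script can be
traced step by step. The two sed expressions below are the ones from the
script; the input values are samples taken from the changelog entry in this
commit rather than read from dpkg-parsechangelog or svn info. The \+
quantifier is a GNU sed extension, so GNU sed is assumed.

```shell
#!/bin/sh
# Sample inputs: the changelog version from this commit and its SVN
# revision. The real script obtains these from dpkg-parsechangelog and
# "svn info" respectively.
full_version="2.0.0b11.r4076-1"
revision=4076

# Step 1: drop the trailing Debian revision ("-1").
upstream=$(echo "$full_version" | sed 's/\([0-9\.br]\+\)-[0-9]\+$/\1/')

# Step 2: drop any existing ".rNNNN" suffix, then append the current
# revision so the tarball name always reflects the exported revision.
base=$(echo "$upstream" | sed 's/.r[0-9]\+$//')
VERSION="${base}.r${revision}"

echo "$upstream"
echo "$base"
echo "$VERSION"
```

With these inputs the intermediate base is 2.0.0b11 and the final VERSION is
2.0.0b11.r4076, matching the upstream version named in the commit subject.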
diff --git a/debian/patches/fix_installing_i_directories b/debian/patches/fix_installing_i_directories
deleted file mode 100644
index f6f9dcb..0000000
--- a/debian/patches/fix_installing_i_directories
+++ /dev/null
@@ -1,24 +0,0 @@
---- a/Makefile
-+++ b/Makefile
-@@ -51,18 +51,15 @@
- 	cp manual/*.html  $(HTMLDIR)/manual
- 	cp manual/*.png   $(HTMLDIR)/manual
- 	cp manual/*.index $(HTMLDIR)/manual
--	-mkdir            $(HTMLDIR)/manual/i
--	cp i/*            $(HTMLDIR)/manual/i
-+	cp -r i           $(HTMLDIR)/manual
- 
- 	-mkdir                      $(HTMLDIR)/scripting_manual
- 	-cp scripting_manual/*.html $(HTMLDIR)/scripting_manual
--	-mkdir                      $(HTMLDIR)/scripting_manual/i
--	cp i/*                      $(HTMLDIR)/scripting_manual/i
-+	cp -r i                     $(HTMLDIR)/scripting_manual/i
- 
- 	# Other bits and pieces
- 	cp Acknowledgements *.txt *.pdf $(DOCDIR)
--	-mkdir -p $(DOCDIR)/i
--	cp i/*    $(DOCDIR)/i
-+	cp -r i/* $(DOCDIR)
- 
- 
- DISTSRC=staden_doc-$(VERSION)-src
diff --git a/debian/patches/fix_warning_in_make_pdf b/debian/patches/fix_warning_in_make_pdf
deleted file mode 100644
index e58dbcc..0000000
--- a/debian/patches/fix_warning_in_make_pdf
+++ /dev/null
@@ -1,11 +0,0 @@
---- a/manual/tools/make_pdf
-+++ b/manual/tools/make_pdf
-@@ -109,7 +109,7 @@
-     binmode(FILE,":raw");
-     while (<FILE>) {
- 	if (/^\/CropBox/) {
--	    ($a,$b)=/^\/CropBox \[(\d+) \d+ (\d+) \d+\]/;
-+	    ($a,$b)=/^\/CropBox \[([0-9.]+) [0-9.]+ ([0-9.]+) [0-9.]+\]/;
- 	    $width = $b-$a;
- 	}
-     }
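
The effect of the removed fix_warning_in_make_pdf patch can be checked
outside Perl. The sketch below restates both patterns as POSIX extended
regular expressions (Perl's \d written as [0-9]) and applies them to a
hypothetical CropBox line with fractional coordinates; the sample line is an
assumption, not taken from a real PDF in the package.

```shell
#!/bin/sh
# Hypothetical CropBox line with fractional coordinates.
line='/CropBox [0 0 595.276 841.89]'

# Pattern before the patch (integers only) and after it (digits and dots).
old='^/CropBox \[[0-9]+ [0-9]+ [0-9]+ [0-9]+\]'
new='^/CropBox \[([0-9.]+) [0-9.]+ ([0-9.]+) [0-9.]+\]'

if echo "$line" | grep -Eq "$old"; then old_result=match; else old_result=no-match; fi
if echo "$line" | grep -Eq "$new"; then new_result=match; else new_result=no-match; fi

# Width is the third number minus the first, as in make_pdf ($b - $a).
# "|" is used as the sed delimiter because the pattern contains "/".
width=$(echo "$line" | sed -E "s|${new}.*|\2 \1|" | awk '{ print $1 - $2 }')

echo "old pattern: $old_result"
echo "new pattern: $new_result"
echo "width: $width"
```

With the integer-only pattern nothing is captured for such a line, which is
consistent with the warning the patch name refers to; the relaxed pattern
captures both coordinates.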
diff --git a/debian/patches/series b/debian/patches/series
deleted file mode 100644
index cc1768b..0000000
--- a/debian/patches/series
+++ /dev/null
@@ -1,2 +0,0 @@
-fix_warning_in_make_pdf
-fix_installing_i_directories
diff --git a/debian/rules b/debian/rules
deleted file mode 100755
index e179ecc..0000000
--- a/debian/rules
+++ /dev/null
@@ -1,27 +0,0 @@
-#!/usr/bin/make -f
-%:
-	dh $@
-
-override_dh_auto_build:
-	make unix PAPER=a4
-	for x in gap4 pregap4 gap5 trev ; do \
-		echo "{Copyright} $${x}_1.html" >> manual/$${x}.index ; \
-	done
-
-override_dh_auto_install:
-	dh_auto_install
-	rm -f debian/*/usr/share/doc/staden/manual/'*.*.png.html'
-
-override_dh_auto_clean:
-	dh_auto_clean
-	rm -f config.log config.status
-	for ext in index html pdf tp pg ky vr aux cp cps fn fns log pgs toc vr vrs texinfo htmlinfo ; do \
-		find manual scripting_manual -name "*.$$ext" -delete ; \
-	done
-	rm -f overview.html
-
-override_dh_compress:
-	dh_compress -Xmanual
-
-get-orig-source:
-	. debian/get-orig-source
\ No newline at end of file
diff --git a/debian/source/format b/debian/source/format
deleted file mode 100644
index 163aaf8..0000000
--- a/debian/source/format
+++ /dev/null
@@ -1 +0,0 @@
-3.0 (quilt)
diff --git a/debian/watch b/debian/watch
deleted file mode 100644
index a729f9a..0000000
--- a/debian/watch
+++ /dev/null
@@ -1,2 +0,0 @@
-version=3
-http://sf.net/staden/staden_doc-([0-9ab.]*).tar.(?:bz2|gz)
diff --git a/emboss.txt b/emboss.txt
new file mode 100644
index 0000000..01e7485
--- /dev/null
+++ b/emboss.txt
@@ -0,0 +1,18 @@
+Setting up EMBOSS for use with Spin
+===================================
+
+The create_emboss_files program attempts to find the location for your
+installed EMBOSS release. From this it iterates through all of the acd files
+and produces tcl/tk GUIs for each program. These are placed in the
+$STADENROOT/lib/spin_emboss/acdtcl directory. An Emboss menu is added to Spin, 
+with the menu specification being in $STADENROOT/tables/emboss_menu.
+
+An example EMBOSS setup is distributed with the Staden Package. This was built 
+from the EMBOSS-2.0.0 release, but most of it is likely to work with newer
+versions, except obviously where new programs are added.
+
+Note that the dialogues created may be very large (as is the case with showseq 
+for example). However the ACD files we used were modified by adding
+appropriate section: and endsection: keywords. This means that some of the
+EMBOSS dialogues in Spin have multiple "tabs". (In time these will make their
+way into the official EMBOSS releases.)
diff --git a/gkb547_gml.pdf b/gkb547_gml.pdf
new file mode 100644
index 0000000..d29143f
Binary files /dev/null and b/gkb547_gml.pdf differ
diff --git a/i/nav_brief.gif b/i/nav_brief.gif
new file mode 100644
index 0000000..b26bdbd
Binary files /dev/null and b/i/nav_brief.gif differ
diff --git a/i/nav_down.gif b/i/nav_down.gif
new file mode 100644
index 0000000..bf5ccf0
Binary files /dev/null and b/i/nav_down.gif differ
diff --git a/i/nav_first.gif b/i/nav_first.gif
new file mode 100644
index 0000000..75d3439
Binary files /dev/null and b/i/nav_first.gif differ
diff --git a/i/nav_full.gif b/i/nav_full.gif
new file mode 100644
index 0000000..65c4753
Binary files /dev/null and b/i/nav_full.gif differ
diff --git a/i/nav_home.gif b/i/nav_home.gif
new file mode 100644
index 0000000..5e1293c
Binary files /dev/null and b/i/nav_home.gif differ
diff --git a/i/nav_last.gif b/i/nav_last.gif
new file mode 100644
index 0000000..95a8a39
Binary files /dev/null and b/i/nav_last.gif differ
diff --git a/i/nav_next.gif b/i/nav_next.gif
new file mode 100644
index 0000000..7fa6ebe
Binary files /dev/null and b/i/nav_next.gif differ
diff --git a/i/nav_prev.gif b/i/nav_prev.gif
new file mode 100644
index 0000000..31176c4
Binary files /dev/null and b/i/nav_prev.gif differ
diff --git a/i/nav_top.gif b/i/nav_top.gif
new file mode 100644
index 0000000..cb77483
Binary files /dev/null and b/i/nav_top.gif differ
diff --git a/i/nav_up.gif b/i/nav_up.gif
new file mode 100644
index 0000000..434a6d6
Binary files /dev/null and b/i/nav_up.gif differ
diff --git a/index.html b/index.html
new file mode 100644
index 0000000..2774c41
--- /dev/null
+++ b/index.html
@@ -0,0 +1,48 @@
+<html>
+<head>
+<title>Staden Package</title>
+</head>
+
+<body bgcolor="#ffffff">
+<center><h1>Staden Package - Introduction</h1></center>
+<p>
+The <i>Staden Package</i> is a set of tools covering sequence assembly,
+editing and analysis. New releases can be found at the <a
+href="http://staden.sourceforge.net/">SourceForge</a> package site.
+</p>
+
+<br>
+
+<h2>Local documentation:</h2>
+<p>
+<table border=0 cellpadding=10 cellspacing=10 width=95%>
+<tr align=left>
+<td><a href="overview.html">Program Summary</a></td>
+<td>Lists the program names along with a brief description and a link
+to the full documentation, where available.</td>
+</tr>
+
+
+<tr align=left>
+<td><a href="manual/master_brief.html">Package Manual</a></td>
+<td>A copy of the full online manual for the full package. This is
+also available in <a href="manual.pdf">PDF</a> format.</td>
+</tr>
+
+<tr align=left>
+<td><a href="manual/mini_toc.html">Introductory manual</a></td>
+<td>This is a much shortened copy of the full manual, consisting
+primarily of introductions to the tools available. Also available in
+<a href="mini.pdf">PDF</a> format.</td>
+</tr>
+
+<tr align=left>
+<td><a href="scripting_manual/scripting_toc.html">Scripting manual</a></td>
+<td>This describes the scripting language built into Gap4. Based on
+Tcl, this manual describes the Gap4 extensions to the language rather
+than the Tcl syntax itself. Also available in <a
+href="scripting.pdf">PDF</a> format.</td>
+</tr>
+</table>
+</p>
+</body>
\ No newline at end of file
diff --git a/manual/2nd_highest_confidence.png b/manual/2nd_highest_confidence.png
new file mode 100644
index 0000000..2e9d8f0
Binary files /dev/null and b/manual/2nd_highest_confidence.png differ
diff --git a/manual/Makefile b/manual/Makefile
new file mode 100644
index 0000000..4428512
--- /dev/null
+++ b/manual/Makefile
@@ -0,0 +1,233 @@
+all: 
+	@echo
+	@echo Please rerun make specifying either target \"unix\" or \"windows\".
+	@echo
+
+#
+# Sorry if this Makefile doesn't work correctly regarding dependencies. GNU
+# make causes all sorts of headaches with its inbuilt rules (which I seem
+# unable to remove, even when using -d) and it has some quirky ideas as to
+# which files are created temporarily (and thus should be removed). It's
+# best to usually do 'gmake spotless all' or some such.
+#
+# However you should try "gmake depend" to keep the dependencies file up to
+# date as this does solve many (if not all) dependency problems.
+#
+# The input files are always .texi
+#
+# .texinfo files are expanded .texi files. They have the macros replaced
+# and have been processed by m4 to include Unix or Windows specific components.
+#
+
+# M4 preprocessor. Various buggy versions of this have caused problems in the
+# past, so you may need to redefine this. On Digital Unix 4.0E the system m4
+# does not work with our files. Certain versions (which?) of GNU m4 also fail,
+# but this has now been patched.
+M4=m4
+
+#-----------------------------------------------------------------------------
+# General rules
+
+#
+# Unix vs Windows rules.
+# The Unix and Windows manuals stem from the same text, but using m4 as a
+# preprocessor to generate different .texinfo files. We also generate different
+# documents designed for passing into TeX or texi2html. The changes are
+# minimal and relate to cross-referencing and page-splitting.
+#
+%.texinfo:	%.texi header.m4
+	$(M4) $(M4OPT) -D_tex < $< > $@
+	./tools/update-nodes $@
+
+%.htmlinfo:	%.texi header.m4
+	$(M4) $(M4OPT) -D_html < $< > $@
+	./tools/update-nodes $@
+
+# Remove implicit rules
+%.dvi:  %.texi
+%.dvi:  %.texinfo
+%.pdf:	%.texinfo
+
+%.eps_done:
+
+# How to build .dvi files from our m4-expanded .texinfo files
+%.dvi:	%.texinfo %.eps_done
+	texi2dvi $<
+
+# A4 or US Letter PostScript from DVI
+%.ps:	%.dvi
+	dvips -t $(PAPER) -Ppdf -o $@ $<
+
+# PDF generation. Directly from the texinfo.
+# Note that as the texinfo is reordered, this produces out of order
+# data too (contents page at the end).
+# One solution is (eg):
+#   pdftk manual.pdf cat 1-2 469-end 3-468 output new.pdf
+# however this loses the bookmarks.
+%.pdf:	%.texinfo
+	texi2pdf $<
+
+# HTML files - built from an expanded .texinfo file with the -D_html m4 macro
+# defined. We need the *_toc.html and the index files.
+# For ease of browsing we create a separate html document for each of the main
+# programs. The htmlinfo version is identical to texinfo except with a few
+# tweaks to the cross-references (to allow cross-referencing between top-level
+# documents) and the addition of an _split() command to request splitting an
+# html page at a specific point).
+%_toc.html:	%.htmlinfo
+	./tools/texi2html -menu -verbose -split_chapter -index_chars $<
+
+# Man pages - taken from the .texi files directly (NB: may need m4 expansion,
+# but this is not yet applied.)
+man/man1/%.1:	%.1.texi
+	./tools/texi2man.pl $< > $@
+
+man/man4/ExperimentFile.4: exp-t.texi
+	./tools/texi2man.pl $< > $@
+
+man/man4/scf.4: scf-t.texi
+	./tools/texi2man.pl $< > $@
+
+man/man4/ztr.4: ztr-t.texi
+	./tools/texi2man.pl $< > $@
+
+# For any large pictures (_lpicture() macro) we create an html page containing
+# the full-size picture to link from the small picture embedded in the page.
+gifs:
+	./tools/make_gif_html
+
+#pngs:
+#	./tools/make_png_html
+
+# PostScript versions of the gif images.
+%.ps: %.gif
+	./tools/make_ps $<
+
+%.eps: %.png
+	./tools/make_eps $<
+
+# %.pdf: %.png
+# 	./tools/make_pdf $<
+
+
+# The mini manuals need to have the xrefs removed. Internal references are
+# kept "as is".
+mini.texinfo: mini_manual.texinfo
+	./tools/remove_xrefs.pl < $< > $@
+mini.htmlinfo: mini_manual.texinfo
+	./tools/remove_xrefs.pl < $< > $@
+
+# Help within the programs is HTML based. Given a topic the appropriate URL
+# is obtained by looking it up in the .index file.
+%.index:	%_toc.html
+	./tools/html_index.pl $<
+
+# Resolves cross references between separate html texinfo documents by
+# searching for <!-- XREF:name --> comments in the html code and matching
+# these up with the node names.
+
+xref: \
+	formats.index \
+	pregap4.index \
+	read_clipping.index \
+	vector_clip.index \
+	interface.index \
+	trev.index \
+	gap4.index \
+	gap5.index \
+	manpages.index \
+	spin.index \
+	filebrowser.index 
+
+	./tools/xref_update.pl *_*.html
+
+# Master full and brief contents page generation. Basically this combines
+# the separate _toc.html pages into global ones.
+contents:
+	./tools/merge_indexes.pl \
+		gap4_toc.html \
+		gap5_toc.html \
+		mutations_toc.html \
+		pregap4_toc.html \
+		read_clipping_toc.html \
+		vector_clip_toc.html \
+		trev_toc.html \
+		spin_toc.html \
+		interface_toc.html \
+		formats_toc.html \
+		manpages_toc.html \
+		references_toc.html
+
+# Backup only the 'source' files and generating scripts
+backup::
+	tar cvf - Makefile README README.system *.template *.tcl tclIndex \
+	          docmake list_nodes lowersection html_index update-nodes \
+		  update-nodes.el *.gif *.texi *.pl texinfo.tex \
+	  | gzip > backup/`date +"%d_%m_%y"`.tar.gz
+
+#-----------------------------------------------------------------------------
+# The main make targets.
+
+unix: M4OPT+=-Uunix -D_unix
+unix: common
+
+windows: M4OPT+=-Uunix -D_windows
+windows: common
+
+ifeq ($(PAPER),us)
+M4OPT += -Dafourpaper=c
+PAPER=letter
+else
+M4OPT += -Dafourpaper=c
+PAPER=a4
+endif
+
+common: manual xref contents mini man
+
+
+mini.pdf_done:
+
+mini: mini.pdf mini_toc.html
+	./tools/edit_mini_contents.pl < mini_toc.html > tmp.html
+	mv tmp.html mini_toc.html
+
+manual:	 manual.pdf manual_html
+manual_html: \
+	interface_toc.html \
+	gap4_toc.html \
+	gap5_toc.html \
+	formats_toc.html \
+	vector_clip_toc.html \
+	trev_toc.html \
+	manpages_toc.html \
+	read_clipping_toc.html \
+	references_toc.html \
+	spin_toc.html \
+	pregap4_toc.html \
+	mutations_toc.html
+
+man:	man/man1/makeSCF.1 man/man1/eba.1 \
+	man/man1/get_scf_field.1 man/man1/init_exp.1 man/man4/scf.4 \
+	man/man4/ExperimentFile.4 \
+	man/man1/extract_seq.1 man/man1/copy_db.1 \
+	man/man4/ztr.4 man/man1/convert_trace.1 \
+	man/man1/find_renz.1 man/man1/get_comment.1 \
+	man/man1/getABIfield.1 man/man1/trace_dump.1
+
+
+clean:
+	-rm -f *.aux *.cp *.fn *.ky *.log *.pg *.toc *.tp *.vr *.cps *.fns *.pgs *.vrs
+	-rm -f core _tmp.texi _tmp.texi~ *.texinfo *.texinfo.tmp *.texinfo~
+	-rm -f *.htmlinfo *.htmlinfo~
+
+spotless:	clean
+	-rm -f *.dvi *.html *.info *.info-[0-9] *.index *.topic
+	-rm -f master_index.html master_contents.html master_brief.html
+	-rm -f manual.ps mini.ps
+	-rm -f man/man1/* man/man4/*
+	-rm -f *.ps *.pdf *.eps
+
+depend:
+	./tools/make_dependencies > dependencies
+
+include dependencies
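
As a reading aid, the chain of pattern rules in the Makefile above can be sketched for one printed document. This is only an illustration of how the rules compose, assuming the M4OPT values set by the "unix" and "windows" targets and the PAPER value passed to dvips; the function name printed_copy_commands is hypothetical, the paper-size m4 define is left out, and nothing here is part of the package.

```python
def printed_copy_commands(stem, platform="unix", paper="a4"):
    """Return the shell commands the pattern rules would run, in order,
    to turn <stem>.texi into <stem>.ps (illustrative sketch only)."""
    if platform not in ("unix", "windows"):
        raise ValueError("platform must be 'unix' or 'windows'")
    # Target-specific M4OPT, as set by the 'unix:' and 'windows:' targets.
    # The extra -Dafourpaper define from the ifeq block is omitted here.
    m4opt = f"-Uunix -D_{platform}"
    return [
        # %.texinfo: %.texi header.m4
        f"m4 {m4opt} -D_tex < {stem}.texi > {stem}.texinfo",
        f"./tools/update-nodes {stem}.texinfo",
        # %.dvi: %.texinfo %.eps_done
        f"texi2dvi {stem}.texinfo",
        # %.ps: %.dvi
        f"dvips -t {paper} -Ppdf -o {stem}.ps {stem}.dvi",
    ]


if __name__ == "__main__":
    for cmd in printed_copy_commands("gap4"):
        print(cmd)
```

Listing the commands this way makes it easier to see which intermediate file each tool consumes when a build stops part-way.
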
diff --git a/manual/NBase_clip.png b/manual/NBase_clip.png
new file mode 100644
index 0000000..131890d
Binary files /dev/null and b/manual/NBase_clip.png differ
diff --git a/manual/README b/manual/README
new file mode 100644
index 0000000..b5994c6
--- /dev/null
+++ b/manual/README
@@ -0,0 +1,192 @@
+Random notes on the documentation system
+----------------------------------------
+
+The base source type for the documentation is GNU Texinfo. This is easy
+to learn, but we have our own local changes to address a few problems.
+
+1. We want to have a common source for Unix and Windows documentation, 
+which means some sort of 'if' syntax. For flexibility we decided to go 
+with M4. Note though that not all M4 preprocessors are reliable enough 
+to cope. (For example earlier GNU releases and the Digital Unix m4 crash.)
+
+2. In the past Texinfo did not support images. For this reason we
+added our own image commands and you should not attempt to use the
+builtin image formatting of newer texinfo releases. Again this has
+been implemented using m4.
+
+3. We want to produce nice HTML output. Again this wasn't part of the
+texinfo package until recently, so we use a third party texi2html
+script (again with local modifications).
+
+4. We need to split html pages where we want and not just at each
+chapter or section heading. This is dealt with via the _split m4
+macro, which does nothing when producing postscript or pdf.
+
+PDF output is still not working ideally.  When producing the
+PostScript we use "dvips -Ppdf". Amongst other things this disables
+downloading of bitmap fonts which improves PDF quality. However there
+are still issues to do with resampling of image files. Generally using 
+an alternative postscript to pdf conversion tool may be best.
+
+
+How it works
+------------
+
+To understand what happens to the documentation I'll explain the
+various inputs, outputs and programs used.
+
+Printed copies
+..............
+
+manual.texi -> (m4) -> manual_unix.texinfo
+manual.texi -> (m4) -> manual_windows.texinfo
+
+The master document is a stub doc containing lots of _include commands 
+for each chapter. We use m4 to generate one large document. The m4
+step also replaces cross-reference macros, deals with the unix vs
+windows sections, and also produces two output versions (.texinfo or
+.htmlinfo) depending on whether this output is to be used for passing
+into tex or into texi2html.
+                     
+manual_unix.texinfo -> (tools/update-nodes) -> manual_unix.texinfo
+
+Update-nodes is a tiny script to invoke the emacs
+'texinfo-update-node' command on the document. This sets the @node
+parent, next and previous fields. I'm not sure if this step is still required.
+
+manual_unix.texinfo -> (texi2dvi) -> manual_unix.dvi
+
+Texi2dvi is a script to run tex (and texindex, etc) on the texinfo
+file.
+
+manual_unix.dvi -> (dvips) -> manual_unix.ps
+
+Converts the DVI file into a postscript file. We use dvips -t a4 -Ppdf 
+options.
+
+manual_unix.ps -> (ps2pdf) -> manual_unix.pdf
+
+Ps2pdf converts the PostScript into PDF. There are now more direct
+routes to go from texinfo to pdf, but again this route reflects the
+long history of using texinfo.
+
+
+HTML copies
+...........
+
+gap4.texi -> (m4) -> gap4_unix.htmlinfo
+gap4.texi -> (m4) -> gap4_windows.htmlinfo
+
+The html version is generated directly from the original manual.texi
+again, but using different m4 processing options (-D_html instead of
+-D_tex). This allows for the _split command to control html page
+breaks and for the cross referencing to work between documents.
+
+Note that we produce html copies of each main program in turn as
+separate 'documents', instead of htmlising the master manual.texi
+document. The original reason for this was that it makes linking
+easier as we can direct a link to gap4_unix_toc.html instead of
+manual_unix_172.html (for example). Changes in the html 'template'
+code (see the main $STADENROOT/doc/templates directory) make this
+unnecessary, but it's still nice to be able to get a full table of
+contents and index for one application without having it polluted by
+the other applications.
+
+gap4_unix.htmlinfo -> (tools/texi2html) -> gap4_unix_toc.html (+ others)
+
+Texi2html is the main conversion tool. I believe that there's now an
+html output mode of makeinfo, but again the reason we do not use this
+is partly historical. Also makeinfo does not work on our files due to
+the addition of the _split command.
+
+gap4_unix_toc.html -> (tools/html_index.pl) -> gap4_unix.index
+
+The .index file is how the Tcl/Tk programs get the context-sensitive
+help to work. It's a mapping of node name (ie 'topic') to a URL. Hence 
+in gap4 Tcl we have "show_help gap4 {FIJ-Dialogue}" which is looked up 
+in gap4_unix.index to give the line "{FIJ-Dialogue} gap4_unix_99.html",
+and so the web browser is then pointed in the appropriate
+location. This also means that adding/removing pages does not require
+the code to change at all.
+
+*.html -> (tools/xref_update.pl) -> *.html
+
+One issue of building separate html documents for gap4, pregap4, spin, 
+etc is that cross-references are only resolved by texi2html internal
+to that document. A reference in gap4 from the pregap4 documentation
+could not be resolved. The original m4 step adds html comments "<!--
+XREF:nodename -->" to the generated html. Texi2html was modified to
+also add comments before each node name. Then the xref_update.pl
+program simply matches these comments together to modify the html to
+resolve external cross-references.
+
+*_toc.html -> (tools/merge_indexes.pl) -> master_unix_brief.html
+					  master_unix_contents.html
+					  master_unix_index.html
+
+This merges the various separate table of contents produced by texi2html
+into verbose and brief combined copies. It also searches the contents
+page for the index and then merges the indexes together to produce a
+master index.
+
+
+Other bits and bobs
+...................
+
+There are also a few other specific tools for converting texinfo
+documents.
+
+The Unix manual pages ("man pages") are written in texinfo and
+converted using tools/texi2man.pl. This attempts to convert a basic
+and stylised texinfo document into a "nroff -man" format manual
+page. Cross references to other manual pages get converted into the
+(for example) "scf(4)" format we expect to see in manual pages. It's
+not foolproof and the formatting often goes astray, but it's easier
+than having two separate formats of text and realistically we wouldn't
+bother with man pages at all if we did not have a similar technique.
+
+The image format we use (currently) is gif. We use the ImageMagick
+'convert' tool to generate the encapsulated postscript required by
+tex. However letting it just do the job by itself gives poor results
+as we do not want it to expand up all images to fit the page. For
+example this would lead to very zoomed images for simple dialogues.
+
+Instead we apply an image specific -density option to control the .ps
+generation. The tools/make_ps script handles this. It has a list of
+all the image names and the density, when not the default of 120dpi,
+for images that need special consideration.
+
+Some of the images are rather large and for fast web viewing we would
+like a half-sized version to be used instead. The distinction between
+the two is based on whether we use _picture or _lpicture (large
+picture) in the texi document. For large images the html code produced
+references image_name.small.gif, which in turn links to a page
+named image_name.gif.html which references the full sized copy named
+image_name.gif. The generation of these tiny image_name.gif.html pages
+is via tools/make_gif_html. It takes no arguments as it simply
+iterates through all *.small.gif files.
+
+
+Using the Makefile
+------------------
+
+So now we get to the crucial bit - how do we rebuild the
+documentation?
+
+Unfortunately due to the complexity and dependencies I cannot be sure that all 
+dependencies are resolved. To be sure you may wish to do a "make spotless"
+before rebuilding. There is a crude dependencies generation system "make
+depend" to search for m4 _include, _picture and _lpicture commands. If you
+wish to add more dependency searching methods to this edit the
+tools/make_dependencies script.
+
+There's also a "unix" and "windows" target for building the
+documentation just for one platform type. This helps speed things up
+when iteratively making edits and checking the results.
+
+Finally, try just using "make gap4_unix.dvi" when testing the
+formatting and then use "xdvi gap4_unix.dvi" to view the results. This
+will only build the docs for one application (gap4 in this example)
+and avoids going all the way to PostScript. Xdvi is an excellent
+viewer and I find it faster and easier than viewing postscript
+documents.
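
The topic lookup described under "HTML copies" in the README above can be sketched as follows. This is a minimal Python illustration, not the Tcl code used by the programs; it assumes every line of a .index file has the form "{node name} file.html", as in the README's single example, and the names parse_index and lookup_topic are hypothetical.

```python
import re

# One "{topic} url" pair per line, e.g. "{FIJ-Dialogue} gap4_unix_99.html".
# Assumed format, inferred from the one example given in the README.
_INDEX_LINE = re.compile(r"^\{(?P<topic>[^}]*)\}\s+(?P<url>\S+)\s*$")


def parse_index(text):
    """Build a topic -> URL mapping from the contents of a .index file."""
    mapping = {}
    for line in text.splitlines():
        match = _INDEX_LINE.match(line)
        if match:  # blank or unrecognised lines are skipped
            mapping[match.group("topic")] = match.group("url")
    return mapping


def lookup_topic(mapping, topic):
    """Return the URL for a topic, or None if the topic is unknown."""
    return mapping.get(topic)


if __name__ == "__main__":
    # The first line is the README's example; the second is made-up sample data.
    sample = "{FIJ-Dialogue} gap4_unix_99.html\n{Conf-Set Maxseq} gap4_unix_12.html\n"
    index = parse_index(sample)
    print(lookup_topic(index, "FIJ-Dialogue"))
```

Keeping the topic inside braces lets node names that contain spaces be looked up unambiguously, which is presumably why the .index file uses that form.
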
diff --git a/manual/assembly-t.texi b/manual/assembly-t.texi
new file mode 100644
index 0000000..a61941d
--- /dev/null
+++ b/manual/assembly-t.texi
@@ -0,0 +1,656 @@
+@cindex Assembly
+@cindex Entering readings
+
+Assembly is performed by selecting one of the functions from the
+Assembly menu. The options available are:
+
+@menu
+* Assembly-Shot::       Normal shotgun assembly
+* Assembly-Ind::        Assemble independently
+* Assembly-Single::     Assemble into single stranded regions
+* Assembly-One::        Stack readings
+* Assembly-New::        Put all readings in separate contigs
+* Assembly-Directed::   Directed assembly
+* Assembly-Screen::     Screen only
+_ifdef([[_unix]],[[* Assembly-CAP2::       CAP2 assembly
+* Assembly-CAP3::       CAP3 assembly
+* Assembly-FAKII::      FAKII assembly
+* Assembly-Phrap::      Phrap assembly
+]])* Assembly-Tips::       Tips on entering readings
+* Assembly-Codes::      Assembly Failure Codes
+@end menu
+
+@ifset tex
+@itemize @bullet
+@item
+Normal shotgun assembly
+@item
+Assemble independently
+@item
+Assembly into single stranded regions
+@item
+Stack readings
+@item
+Put all readings in separate contigs
+@item
+Directed assembly
+@item
+Enter pre-assembled data
+@item
+Screen only
+_ifdef([[_unix]],[[@item
+CAP2 assembly
+@item
+CAP3 assembly
+@item
+FAKII assembly
+@item
+Phrap assembly]])
+@end itemize
+@end ifset
+
+The data for a project is stored in an assembly database 
+(_fxref(GapDB, Gap Database Files, gap4))
+All modes of assembly except CAP2, CAP3 and FAKII can either assemble all the
+readings for a project in a single operation or can add batches of
+new data as they are produced. CAP2, CAP3 and FAKII can only be used to 
+assemble all the data for a project as a single operation.
+
+For all modes the names of the readings to assemble are read from a
+list or file of file names, and the names of readings that fail to be
+entered are written to a list or a file of file names. If only a single read
+is to be assembled the "single" button may be pressed and the filename entered
+instead of the file of filenames.
+
+Now that a sufficient number of readings to get close to contiguity can be
+obtained quite quickly, and that more repetitive genomes are being sequenced,
+it is sensible to use a "global" algorithm for assembly, such as Cap2, Cap3, 
+FakII or Phrap. These algorithms compare each reading against all of the 
+others to work out their most likely left to right order and so have a better 
+chance of correctly assembling repetitive elements than an algorithm that only
+compares readings to the ones already assembled.
+
+There is no limit to the length of the individual readings which can be
+assembled.  Hence reference sequences for use in mutation studies or for
+use as guide sequences can be assembled.
+
+
+@cindex Assembly: limits
+@cindex Assembly: resetting limits
+@cindex Assembly: maxseq
+@cindex Assembly: maxdb
+@cindex Assembly: large projects
+@cindex maxseq: gap4 assembly
+@cindex maxdb: gap4 assembly
+@cindex gap4 assembly limits
+@cindex gap4: resetting assembly limits
+@cindex gap4 database: resetting sizes
+
+Note that 
+Normal shotgun assembly (_fpref(Assembly-Shot, Normal Shotgun Assembly, assembly)),
+Assemble independently (_fpref(Assembly-Ind, Assemble Independently, assembly)),
+Assembly into single stranded regions (_fpref(Assembly-Single, Assembly Single, assembly)),
+Screen only (_fpref(Assembly-Screen, Screen Only, assembly)),
+Put all readings in separate contigs (_fpref(Assembly-New, Assembly new, assembly)),
+may require the parameters maxseq and maxdb to be set beforehand
+(_fpref(Conf-Set Maxseq, Set Maxseq, configure)). The maxseq parameter defines the maximum
+length of consensus that can be created, and the maxdb parameter the maximum number of readings
+and contigs that the database can hold (i.e. number of readings + number of contigs).
+
+
+
+_split()
+@node Assembly-Shot
+@section Normal Shotgun Assembly
+@cindex Assembly: shotgun
+@cindex Shotgun assembly
+
+In the absence of any of the external assembly engines, which are in
+general superior, particularly for repetitive data, 
+this is the  mode that  most users will  employ  for all assembly.  It
+takes one reading at a time and compares it  with all the data already
+assembled in the database. If a reading matches  it is aligned. If the
+alignment is good enough the reading  is entered into the database. If
+a reading aligns  well with  two contigs it  is entered  into one of
+them, then the  two contigs are compared. If  they align well they are
+joined. If the  reading does not match it  starts  a new contig. If  a
+reading matches but does not align well it can  either be entered as a
+new contig or rejected.
+
+A submode allows  tagged  regions of contigs  to  be masked and  hence
+restricts the areas into which data is entered. Users select the types
+of tags to be used  as masks. As  outlined above readings are compared
+in two stages: first  the  program looks   for exact matches  of  some
+minimum  length, and  then for  each possible overlap  it  performs an
+alignment. If the masking mode is  selected the masked regions are not
+used during the  search for exact  matches,  but they are  used during
+alignment. The  effect  of this is   that new readings  that would lie
+entirely  inside masked regions will  not produce exact matches and so
+will not be entered.   However   readings that have  sufficient   data
+outside of masked areas can produce hits and will be correctly aligned
+even if they  overlap  the masked data.  For  this mode the names   of
+readings that do  not produce matches are  written  to the error  file
+with code 5. Note that new readings that carry tags of the types being
+used for masking will be masked only after they have been entered.
+
+_picture(assembly.shot,2.95in)
+
+As explained above the user can select to "Apply masking", and if so,
+the "Select tags" button will be activated and if it is clicked will
+bring up a dialogue to allow tag types to be selected.
+_fxref(Conf-Tag, Tag Selector, configure)
+
+The "display mode" dialogue allows the type of output produced to be
+set.  "Hide all alignments" means that only the briefest amount of
+output will be produced. "Show passed alignments" means that only
+alignments that fall inside the entry criteria will be displayed. "Show
+all alignments" means that all alignments, including those that fail the
+entry criteria, are displayed. "Show only failed alignments" displays
+alignments only for the readings that fail the entry criteria. Adding text to
+the text output window will increase the processing time.
+
+When comparing each reading the program looks first
+for a "Minimum initial match", and for each such matching region found
+it will produce an alignment. If the "Maximum pads per read" and the
+"Maximum percent mismatch" are not exceeded the reading will be
+entered. The maximum pads can be inserted in both the reading
+and the consensus. If users agree we would prefer to swap the maximum
+pads criterion for a minimum overlap, i.e. only overlaps of some
+minimum length would be accepted.
+
+Assembly usually works on sets of reading names and they can be read from
+either a "file" or a "list" and an appropriate browser is available to enable
+users to choose the name of the file or list. If just a single reading is to
+be assembled choose "single" and enter the filename instead of the file or list
+of filenames.
+
+The routine writes the names of all the readings that are not entered to a
+"file" or a "list" and an appropriate browser is available to enable users to
+choose the name of the file or list. Occasionally it might be convenient to
+forbid joins between contigs to be made if a new reading overlaps them both,
+but the default is to "Permit joins".
+
+If a reading is found to match but does not align within the alignment
+criteria it can be entered as a new contig or rejected. These two
+choices are described as "Enter all readings" or "Reject failures".
+Pressing the "OK" button will start the assembly process.
+
+Note that this option may require the parameter maxseq to be set beforehand
+(_fpref(Conf-Set Maxseq, Set Maxseq, configure)). This parameter defines the maximum
+length of consensus that can be created.
+
+Typical output would be:
+
+@example
+(Output removed to save space)
+
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+Processing     51 in batch
+Reading name xb61h12.s1
+Reading length    104
+Total matches found     2
+Contig     9 position   590 matches strand -1 at position     1
+Contig    36 position    92 matches strand -1 at position     1
+Trying to align with contig      9
+Percent mismatch  2.1, pads in contig  0, pads in gel  1
+ Percentage mismatch   2.1
+              590       600       610       620       630       640
+     Consensus  TTGAAAAATTAAAAACTTTTTTTGAAAATAAAAAAGAGTGAAAGTAAAGTAAAAGACAAG
+                ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
+       Reading  TTGAAAAATTAAAAACTTTTTTTGAAAATAAAAAAGAGTGAAAGTAAAGTAAAAGACAAG
+                1        11        21        31        41        51
+
+              650       660       670       680
+     Consensus  TAGCATGTAAATCAACTAAAAATAACTAATATTTT
+                ::::::::::::::::::::::::: :::::::: 
+       Reading  TAGCATGTAAATCAACTAAAAATAA,TAATATTT-
+               61        71        81        91
+
+Trying to align with contig     36
+Percent mismatch  0.0, pads in contig  0, pads in gel  0
+ Percentage mismatch   0.0
+               92       102
+     Consensus  TTGAAAAATTAAAAACTTTT
+                ::::::::::::::::::::
+       Reading  TTGAAAAATTAAAAACTTTT
+                1        11
+
+Overlap between contigs    36 and     9
+Length of overlap between the contigs   111
+Entering the new reading into contig     9
+This gel reading has been given the number     47
+Complementing contig    36
+Complementing contig     9
+Trying to align the two contigs
+Percent mismatch  4.4, pads in contig  0, pads in gel  3
+ Percentage mismatch   5.3
+               86        96       106       116       126       136
+     Consensus  AAAAGTTTTTAATTTTTCAATTGTTTGGGTGTTCCTTTGACTATTAGAAAAACACCCCCC
+                ::::::::::::::::::::::::::::::::::::::::::::::::::: :: :::::
+     Consensus  AAAAGTTTTTAATTTTTCAATTGTTTGGGTGTTCCTTTGACTATTAGAAAA,CA,CCCCC
+                1        11        21        31        41        51
+
+              146       156       166       176       186       196
+     Consensus  TTGCTCCTGTTGTGCAATTTTTGTTTTAAGTTTTCAATC*TTT*TATTTTAATA
+                ::::::::::::::::::::::::::::::::::: ::: ::: :::::: :::
+     Consensus  TTGCTCCTGTTGTGCAATTTTTGTTTTAAGTTTTC-ATC,TTTTTATTTT-ATA
+               61        71        81        91       101       111
+
+Editing contig    36
+Completing the join between contigs    47 and    36
+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
+
+(Output removed to save space)
+
+Batch finished
+   100 sequences processed
+    96 sequences entered into database
+    11 joins made
+     9 joins failed
+@end example
+
+_split()
+@node Assembly-Ind
+@subsection Assemble Independently
+@cindex Assemble: independently i.e. ignoring previous data
+
+This mode works in exactly the same way as normal shotgun assembly
+(_fpref(Assembly-Shot, Normal Shotgun Assembly, assembly))
+with all its options and settings, except that the new batch of data is
+assembled independently of all the data already in the database. This
+means that the only overlaps found will be between the readings in the
+current batch. One role for this mode would be to assemble a
+batch of data that was known from the way it was produced (say a set of
+nested clones covering some problem region such as a repeat) to
+overlap. Use of Assemble Independently will ensure that the batch of
+readings will only be overlapped with one another, and will not be
+aligned with other similar regions of the consensus. Once assembled in
+this way they can be joined to other contigs using Find Internal Joins.
+_fxref(FIJ, Find Internal Joins, fij)
+
+_split()
+@node Assembly-Single
+@subsection Assemble Into Single Stranded Regions
+@cindex Assembly: single stranded regions
+@cindex Single stranded regions: assembling into
+
+This mode works like normal assembly (_fpref(Assembly-Shot, Normal Shotgun
+Assembly, assembly)) with masking, except that the masking is done for regions
+that already have sufficient data on both strands of the sequence. This means
+that new readings will only be assembled into regions that are single stranded
+or which border, and overlap, such segments. Note that this means that
+readings that do not match are not entered, therefore those that would
+actually lie between contigs are rejected.
+
+_picture(assembly.single,2.95in)
+
+The "display mode" dialogue allows the type of output produced to be
+set.  "Hide all alignments" means that only the briefest amount of
+output will be produced. "Show passed alignments" means that only
+alignments that fall inside the entry criteria will be displayed. "Show
+all alignments" means that all alignments, including those that fail the
+entry criteria, are displayed. "Show only failed alignments" displays
+alignments only for the readings that fail the entry criteria.
+
+When comparing each reading the program looks first for a "Minimum
+initial match", and for each such matching region found it will produce
+an alignment. If the "Maximum pads per read" and the "Maximum percent
+mismatch" are not exceeded the reading will be entered. The maximum pads
+can be inserted in both the reading and the consensus. If users agree we
+would prefer to swap the maximum pads criterion for a minimum overlap,
+i.e. only overlaps of some minimum length would be accepted.
+
+Assembly usually works on sets of reading names and they can be read from
+either a "file" or a "list" and an appropriate browser is available to enable
+users to choose the name of the file or list. If just a single reading is to
+be assembled choose "single" and enter the filename instead of the file or list
+of filenames.
+
+The routine writes the names of all the readings that are not entered to a
+"file" or a "list" and an appropriate browser is available to enable users to
+choose the name of the file or list.  Occasionally it might be convenient to
+forbid joins between contigs to be made if a new reading overlaps them both,
+but the default is to "Permit joins".
+
+Pressing the "OK" button will start the assembly process.
+
+Note that this option may require the parameter maxseq to be set beforehand
+(_fpref(Conf-Set Maxseq, Set Maxseq, configure)). This parameter defines the maximum
+length of consensus that can be created.
+
+
+_split()
+@node Assembly-One
+@subsection Stack Readings
+@cindex Assembly: into one contig
+@cindex Assembly: stack readings
+
+This assembly mode assumes that all the readings are already aligned 
+and simply stacks
+them on top of one another in a new contig. 
+
+_picture(assembly.one,2.95in)
+
+Assembly usually works on sets of reading names and they can be read from
+either a "file" or a "list" and an appropriate browser is available to enable
+users to choose the name of the file or list. If just a single reading is to
+be assembled choose "single" and enter the filename instead of the file or list
+of filenames.
+
+The routine writes the names of all the readings that are not entered to a
+"file" or a "list" and an appropriate browser is available to enable users to
+choose the name of the file or list.
+
+_split()
+@node Assembly-New
+@subsection Put All Readings In Separate Contigs
+@cindex Assembly: into new contigs
+@cindex Assembly: into separate contigs
+
+This algorithm 
+simply loads the readings into the database without comparing them, each
+starting a new contig. This can be of use to those employing the
+database for storage rather than assembly.
+
+_picture(assembly.new,2.95in)
+
+Assembly usually works on sets of reading names and they can be read from
+either a "file" or a "list" and an appropriate browser is available to enable
+users to choose the name of the file or list. If just a single reading is to
+be assembled choose "single" and enter the filename instead of the file or list
+of filenames.
+
+The routine writes the names of all the readings that are not entered to a
+"file" or a "list" and an appropriate browser is available to enable users to
+choose the name of the file or list.
+
+_split()
+@node Assembly-Directed
+@section Directed Assembly
+@cindex Assembly: directed
+@cindex Directed assembly
+
+This assembly method  assumes that a preprocessing
+program, such as an external assembly engine, 
+has been used to map the relative positions of the readings to
+within a reasonable level of accuracy or tolerance. 
+The assembly is "directed" by use of special "Assembly Position" or AP
+records included in each reading's experiment file. It is expected that
+these AP records will be added to the experiment files by the
+preprocessing program, or by a program which parses the output from such
+a program, and so the details given below are not of interest to the
+average user.
+
+The experiment file for each reading must
+contain a special "Assembly Position" or AP line that defines the
+position at which to assemble the reading. The position is not defined
+absolutely, but relative to any other reading (the "anchor reading")
+that has already been assembled. The definition includes the name of
+the anchor reading, the sense of the new reading, its offset relative
+to the anchor reading and the tolerance. i.e.:
+
+@example
+AP   anchor_reading sense offset tolerance
+@end example
+
+The sense is defined using + or - symbols.
+
+The offset can be of any size and can be positive or negative. Offset
+positions are defined from 0. i.e. the first base in a contig or a
+reading is base number 0.
+
+For normal use tolerance is a
+non-negative value, and the first base of the new reading must be
+aligned within plus or minus "tolerance" bases of "offset".  If tolerance
+is zero, after alignment the position must be exactly "offset"
+relative to the anchor reading.  If tolerance is negative then
+alignment is not performed and the reading is simply entered at
+position "offset" relative to the anchor reading.  
+
+To start a new contig the reading must include an AP line containing
+the anchor_reading *new* and the sense.
+
+
+Example AP line:
+
+@example
+AP   fred.021 + 1002 40
+@end example
+
+Example AP line to start a new contig:
+
+@example
+AP   *new* +
+@end example
+
+The algorithm is as follows. Get the next reading name, read the AP
+line, find the anchor reading in the database, get the consensus for
+the region defined by anchor_reading + offset +/- tolerance. Perform
+an alignment with the new reading, check the position and the
+percentage mismatch. If OK enter the reading.
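+
+The final check in this algorithm can be sketched as follows. This is a
+simplified illustration, not the program's code: the function name and
+its arguments are assumptions, and the alignment itself is taken as
+already done.
+
```python
def placement_ok(aligned_pos, offset, tolerance, mismatch_pct, max_mismatch_pct):
    """Decide whether a reading may be entered (simplified, hypothetical)."""
    if tolerance < 0:
        # Negative tolerance: no alignment is performed and the reading
        # is entered at "offset" as given.
        return True
    # The first base must land within +/- tolerance of the requested offset;
    # a tolerance of zero therefore demands an exact position.
    in_window = abs(aligned_pos - offset) <= tolerance
    return in_window and mismatch_pct <= max_mismatch_pct


print(placement_ok(1030, 1002, 40, 2.5, 5.0))
print(placement_ok(1050, 1002, 40, 2.5, 5.0))
```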
+
+Obviously the way the positions of readings are specified is very
+flexible but one example of use would be to employ a file of file names
+containing a left-to-right ordered list of reading names, with each
+reading using the one to its left as its anchor reading. In this way
+whole contigs can be entered.
+
+Although not specifically designed for the purpose this mode of
+assembly can be used for "assembly onto template".
+
+_picture(assembly.directed,3.075in)
+
+If required, the alignments can be shown in the Output window by
+selecting "Display alignments". Only readings for which the "Maximum
+percent mismatch" after alignment is not exceeded will be entered into
+the database, unless the "enter all readings" box is checked. In that case
+a reading that does not match well enough will be placed in a new contig.
+Specifying a "Maximum percent mismatch" of zero has a special meaning: it
+implies that there should be no mismatches, so no alignments need to be
+performed and the consensus does not need to be computed either. For
+data that has already been padded and aligned using an external tool (such as
+an external assembly program), setting "Maximum percent mismatch" to zero can
+significantly improve the speed of Directed Assembly.
+
+The ``Ignore svec (SL/SR) clips'' option controls whether sequencing
+vector clip points should be considered when setting the hidden data
+sections for the sequence. With this option enabled only the quality
+clip (QL/QR) experiment file records will be used.
+
+Assembly usually works on sets of reading names and they can be read from
+either a "file" or a "list" and an appropriate browser is available to enable
+users to choose the name of the file or list. If just a single reading is to
+be assembled choose "single" and enter the filename instead of the file or list
+of filenames.
+
+The routine writes the names of all the readings that are not entered to a
+"file" or a "list" and an appropriate browser is available to enable users to
+choose the name of the file or list.
+
+It is important to note that the algorithm assumes that readings are
+entered in the correct order, i.e. a reading can only be entered at its
+defined AP position after
+the reading relative to which its position is defined. The order of the
+readings is defined by the order in the list or file of file names, and
+hence should be set by the external assembly engine. If
+the browser is used to select a batch of sequences, they are unlikely
+to be in the correct order by chance, so care must be taken in its use.
+If reading X specifies an anchor reading that has not been entered, the
+algorithm will start a new contig starting with X.
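+
+A batch can be checked for this ordering problem before assembly. The
+sketch below is an assumption about how such a check might look (the
+function and its input layout are hypothetical); it also assumes that no
+anchor readings are already present in the database.
+
```python
def check_order(readings):
    """readings: list of (name, anchor_name) in the order they will be entered.

    Returns the (name, anchor_name) pairs whose anchor has not yet been
    entered at the point the reading is processed.
    """
    entered = set()
    problems = []
    for name, anchor in readings:
        if anchor != "*new*" and anchor not in entered:
            problems.append((name, anchor))
        # Either way the reading ends up in the database (possibly as the
        # start of a new contig), so later readings may anchor to it.
        entered.add(name)
    return problems


batch = [("a", "*new*"), ("b", "a"), ("c", "d"), ("d", "c")]
print(check_order(batch))
```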
+
+_split()
+@node Assembly-Screen
+@section Screen Only
+@cindex Assembly: screen only
+@cindex Screen only: assembly
+
+This function is used to compare a batch of readings against the data in
+an assembly database without entering them. 
+It performs "normal shotgun assembly" and records the
+percentage mismatch for each matching reading in a file.  If required,
+this file
+could then be sorted on percentage mismatch and used as a file of file
+names for "normal shotgun assembly", in which case the best matches
+would be entered first. The readings in the
+batch are only compared to the current contents of the assembly database,
+and are not compared against the other readings in the batch.
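+
+Sorting the results file on percentage mismatch could be done along
+these lines. The layout of the file is not specified here, so the sketch
+assumes one reading per line with the name followed by its percentage
+mismatch, separated by white space; adjust the parsing to the actual
+file.
+
```python
def sort_by_mismatch(lines):
    """Return reading names ordered from best (lowest mismatch) to worst.

    Assumes each line looks like "<reading name> <percent mismatch>".
    """
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        records.append((float(fields[1]), fields[0]))
    records.sort()
    return [name for _, name in records]


print(sort_by_mismatch(["r1 4.5", "r2 0.0", "r3 2.1"]))
```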
+
+
+_picture(assembly.screen,3.38333in)
+
+As explained in normal assembly
+(_fpref(Assembly-Shot, Normal Shotgun Assembly, assembly))
+the user can select to "Apply masking", and if so, the "Select tags"
+button will be activated and if it is clicked will bring up a dialogue
+to allow tag types to be selected. _fxref(Conf-Tag, Tag Selector, configure)
+
+The "display mode" dialogue allows the type of output produced to be
+set.  "Hide all alignments" means that only the briefest amount of
+output will be produced. "Show passed alignments" means that only
+alignments that fall inside the entry criteria will be displayed. "Show
+all alignments" means that all alignments, including those that fail the
+entry criteria, are displayed. "Show only failed alignments" displays
+alignments only for the readings that fail the entry criteria.
+
+When comparing each reading the program looks first for a "Minimum
+initial match", and for each such matching region found it will produce
+an alignment. If the "Maximum pads per read" and the "Maximum percent
+mismatch" are not exceeded the reading passes the entry criteria (in
+Screen Only it is not actually entered). Pads, up to the maximum,
+can be inserted in both the reading and the consensus. If users agree, we
+would prefer to replace the maximum pads criterion with a minimum overlap,
+i.e. only overlaps of some minimum length would be accepted.
+
+Screening usually works on sets of reading names and they can be read from
+either a "file" or a "list" and an appropriate browser is available to enable
+users to choose the name of the file or list. If just a single reading is to
+be screened choose "single" and enter the filename instead of the file or list
+of filenames.
+
+The routine writes the names of all the readings and their alignment scores
+expressed as percentage mismatches to a "file" or a "list" and an appropriate
+browser is available to enable users to choose the name of the file or list.
+
+Previous versions of the package also had the ability to search for matches in 
+the "hidden" poor quality data at the ends of contigs. This feature is no
+longer available.
+
+Note that this option may require the parameter maxseq to be set beforehand
+(_fpref(Conf-Set Maxseq, Set Maxseq, configure)). This parameter defines the maximum
+length of consensus that can be created.
+
+
+_ifdef([[_unix]],[[
+_split()
+@node Assembly-CAP2
+@section Assembly CAP2
+_include(cap2-t.texi)
+
+_split()
+@node Assembly-CAP3
+@section Assembly CAP3
+_include(cap3-t.texi)
+
+_split()
+@node Assembly-FAKII
+@section Assembly FAKII
+_include(fak2-t.texi)
+
+_split()
+@node Assembly-Phrap
+@section Assembly Phrap
+_include(phrap-t.texi)
+
+]])
+
+_split()
+@node Assembly-Tips
+@section General Comments and Tips on Assembly
+@cindex Tips on assembly
+@cindex Assembly: tips
+
+The program has several methods for assembly and it may not be obvious
+which is most appropriate for a given problem. The following notes may
+help. They also contain information on methods for checking the
+correctness of an assembly.
+
+If you have access to an
+external program that can generate the order and approximate positions
+of readings then Directed Assembly can be used. The same is true if the
+experimental method used generates an ordered set of readings
+(_fpref(Assembly-Directed, Directed Assembly, assembly)).
+
+If you have access to an external global assembly program that can
+produce an assembly and write out correct experiment files then Directed
+Assembly can still be used by specifying a "tolerance" of -1 (in the
+experiment file AP lines).
+
+For routine shotgun assembly of whole data-sets or incremental data-sets
+Normal Shotgun Assembly can be used. Through the idea of "Masked
+assembly" this option can also restrict the assembly to particular
+regions of the consensus
+(_fpref(Assembly-Shot, Normal shotgun assembly, assembly)).
+
+Note that 
+Normal shotgun assembly (_fpref(Assembly-Shot, Normal Shotgun Assembly, assembly)),
+Assemble independently (_fpref(Assembly-Ind, Assemble Independently, assembly)),
+Assembly into single stranded regions (_fpref(Assembly-Single, Assembly Single, assembly)),
+Screen only (_fpref(Assembly-Screen, Screen Only, assembly)),
+Put all readings in separate contigs (_fpref(Assembly-New, Assembly new, assembly)),
+may require the parameter maxseq to be set beforehand
+(_fpref(Conf-Set Maxseq, Set Maxseq, configure)). This parameter defines the maximum
+length of consensus that can be created. If you find that the assembly process
+is only entering the first few hundred of a batch of readings, try increasing maxseq.
+
+If you have a batch of readings that are known to overlap one another,
+but which, due to repeats, may also match other places in the consensus,
+then it can be helpful to use Assemble Independently. This will ensure
+that the batch of readings are compared only to one another, and hence
+will not be assembled into the wrong places
+(_fpref(Assembly-Ind, Assemble independently, assembly)).
+
+Almost all readings are assembled automatically in their first pass
+through the assembly routine. Those that are not can be dealt with in
+two ways. Either they can be put through assembly again with less
+stringent parameters, or entered using the "Put all readings in new
+contigs" routine and then joined to the contig they overlap using Find
+Internal Joins _fxref(FIJ, Find Internal Joins, fij).
+If it is found that readings are not being
+assembled in their first pass through the assembler, then it is likely
+that the contigs require some editing to improve the consensus. Also it
+may be that poor quality data is being used, possibly by users
+over-interpreting films or traces. In the long term it can be more
+efficient to stop reading early and save time on editing. For those
+using fluorescent sequencing machines the unused data can be
+incorporated after assembly using the Contig Editor and Double Strand.
+
+An independent and important check on assembly is obtained by
+sequencing both ends of templates. Provided the correct information is
+given in the experiment files, gap can check the positions and
+orientations of readings from the same
+template (_fpref(Read Pairs, Find read pairs, read_pairs)).
+Any inconsistencies are
+shown both textually and graphically. In addition this information can
+be used to find possible joins between contigs.
+
+
+_split()
+@node Assembly-Codes
+@section Assembly Failure Codes
+@cindex Assembly: failure codes
+
+@table @var
+@item 0
+The reading file was not found or is of invalid format
+@item 1
+The reading file was too short (less than the minimum match length)
+@item 2
+The reading appeared to match somewhere but failed to align
+sufficiently well (too many padding characters or too high a percentage
+mismatch)
+@item 3
+A reading of the same name was already present in the database
+@item 4
+This error number is no longer used
+@item 5
+During a masked assembly, no sequence match with this reading was found.
+@end table
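+
+For scripts that post-process assembly results, the codes above can be
+held in a simple lookup. This is only a convenience sketch; the names
+ASSEMBLY_FAILURE_CODES and describe_failure are hypothetical and how the
+code number is obtained from the program output is left to the caller.
+
```python
ASSEMBLY_FAILURE_CODES = {
    0: "reading file not found or of invalid format",
    1: "reading file too short (less than the minimum match length)",
    2: "matched but failed to align well enough (too many pads or too high a mismatch)",
    3: "a reading of the same name is already in the database",
    4: "no longer used",
    5: "masked assembly: no sequence match with this reading was found",
}


def describe_failure(code):
    """Return a short description for an assembly failure code."""
    return ASSEMBLY_FAILURE_CODES.get(code, "unknown failure code %d" % code)


print(describe_failure(2))
```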
diff --git a/manual/assembly.CAP3.png b/manual/assembly.CAP3.png
new file mode 100644
index 0000000..6f9f041
Binary files /dev/null and b/manual/assembly.CAP3.png differ
diff --git a/manual/assembly.cap2.png b/manual/assembly.cap2.png
new file mode 100644
index 0000000..3c1cb06
Binary files /dev/null and b/manual/assembly.cap2.png differ
diff --git a/manual/assembly.directed.png b/manual/assembly.directed.png
new file mode 100644
index 0000000..d91c7b0
Binary files /dev/null and b/manual/assembly.directed.png differ
diff --git a/manual/assembly.fak2.png b/manual/assembly.fak2.png
new file mode 100644
index 0000000..d842932
Binary files /dev/null and b/manual/assembly.fak2.png differ
diff --git a/manual/assembly.new.png b/manual/assembly.new.png
new file mode 100644
index 0000000..0df55e1
Binary files /dev/null and b/manual/assembly.new.png differ
diff --git a/manual/assembly.one.png b/manual/assembly.one.png
new file mode 100644
index 0000000..dab96da
Binary files /dev/null and b/manual/assembly.one.png differ
diff --git a/manual/assembly.screen.png b/manual/assembly.screen.png
new file mode 100644
index 0000000..fe27c45
Binary files /dev/null and b/manual/assembly.screen.png differ
diff --git a/manual/assembly.shot.png b/manual/assembly.shot.png
new file mode 100644
index 0000000..a884007
Binary files /dev/null and b/manual/assembly.shot.png differ
diff --git a/manual/assembly.single.png b/manual/assembly.single.png
new file mode 100644
index 0000000..bebc9f2
Binary files /dev/null and b/manual/assembly.single.png differ
diff --git a/manual/break_contig.png b/manual/break_contig.png
new file mode 100644
index 0000000..80dc120
Binary files /dev/null and b/manual/break_contig.png differ
diff --git a/manual/c_order_lb.png b/manual/c_order_lb.png
new file mode 100644
index 0000000..9f11eac
Binary files /dev/null and b/manual/c_order_lb.png differ
diff --git a/manual/c_order_t1.png b/manual/c_order_t1.png
new file mode 100644
index 0000000..aec2c02
Binary files /dev/null and b/manual/c_order_t1.png differ
diff --git a/manual/c_order_t1.small.png b/manual/c_order_t1.small.png
new file mode 100644
index 0000000..26c942d
Binary files /dev/null and b/manual/c_order_t1.small.png differ
diff --git a/manual/c_order_t2.png b/manual/c_order_t2.png
new file mode 100644
index 0000000..331140c
Binary files /dev/null and b/manual/c_order_t2.png differ
diff --git a/manual/c_order_t2.small.png b/manual/c_order_t2.small.png
new file mode 100644
index 0000000..f6922dd
Binary files /dev/null and b/manual/c_order_t2.small.png differ
diff --git a/manual/calc_consensus-t.texi b/manual/calc_consensus-t.texi
new file mode 100644
index 0000000..6d1705b
--- /dev/null
+++ b/manual/calc_consensus-t.texi
@@ -0,0 +1,858 @@
+_ifdef([[_gap4]],[[
+@menu
+* Con-Normal::                  Normal Consensus Output
+* Con-Extended::                Extended Consensus Output
+* Con-Unfinished::              Unfinished Consensus Output
+* Con-Quality::                 Quality output
+* Con-Calculation::             Consensus Algorithms
+* Qual-Cal::                    The Quality Calculation
+* Con-Evaluation::              List Consensus Confidence
+* Con-ListBaseConf::            List Base Confidence
+@end menu
+]],[[
+@menu
+* Con-Normal::                  Normal Consensus Output
+* Con-Calculation::             Consensus Algorithms
+* Qual-Cal::                    The Quality Calculation
+* Con-Evaluation::              List Consensus Confidence
+* Con-ListBaseConf::            List Base Confidence
+@end menu
+]])
+
+@cindex Consensus: outputting
+@cindex Calculate consensus
+@cindex consensus IUB codes
+@cindex IUB codes: consensus
+
+In this section we describe the types of consensus which gap4 can
+produce, the formats they can be written in, and the algorithms that can
+be used. The algorithms are not only used to produce consensus sequence
+files, but in many other places throughout gap4 where an analysis of the
+current quality of the data is required. One important place is inside
+the Contig Editor
+(_fpref(Editor, Editing in gap4, contig_editor))
+where they are used to produce an "on-the-fly" consensus, responding to
+every edit made by the user.
+
+The currently active consensus algorithm is selected from the
+"Consensus algorithm" dialogue in the main gap4 Options menu
+(_fpref(Conf-Consensus Algorithm, Consensus Algorithm, t)).
+
+There are four main types of consensus sequence file that can be
+produced by the program: Normal, Extended, Unfinished, and Quality. They
+are all invoked from the File menu.
+
+"Normal" is the type of consensus file that would be expected: a
+consensus from the non-hidden parts of a contig. "Extended" is the same
+as "Normal" but the consensus is extended by inclusion of
+the hidden, non-vector sequence, from the ends of the
+contig. 
+
+"Unfinished" is the same as "Normal" except that any position where
+the consensus does not have good data for both strands 
+is written using A,C,G,T characters,
+and the rest (which has good data for both strands) is written
+using a different set of symbols. This sequence can be used
+for screening against new readings: 
+only the regions needing more readings will produce
+matches. By screening readings in this way, prior to assembly, users can
+avoid entering readings which will not help finish the project, and
+which may require further editing work to be performed.
+
+"Quality" produces a sequence of characters of the same length
+as the consensus, but they instead encode the reliability of the
+consensus at each point.
+
+Consensus sequence files can also encode the positions of the currently
+active tag types by changing the case of the tagged characters (marking) 
+or writing them in a different character set (masking)
+(_fpref(Anno-Act, Active tags and masking, t)).
+
+The consensus algorithms are usually configured to produce only the
+characters A,C,G,T and "-", but it is possible to set them to produce
+the complete set of IUB codes. This mode is useful for some types of
+work and allows the range of observed base types at any position to be
+coded in the consensus. How the IUB codes are chosen
+is described in the introduction to the consensus algorithms
+(_fpref(Con-Calculation, The Consensus Algorithms, t)).
+
+Depending on the type of consensus produced, the consensus sequence
+files can be written in three different formats:
+Experiment files
+(_fpref(Formats-Exp, Experiment File, formats)), 
+FASTA (@cite{Pearson,W.R. Using the FASTA program to search protein
+and DNA sequence databases. Methods in Molecular Biology. 25, 365-389 (1994)})
+or staden formats.  If experiment file format is selected a further menu
+appears that allows users to select for the inclusion of tag data in the
+output file.
+For FASTA format the sequence headers include the contig identifier as the
+sequence name and the project database name, version number and the number of
+the leftmost reading in the contig as comments, e.g.
+">xyzzy.s1 B0334.0.274" is database B0334, copy 0, and the leftmost reading
+for the contig is number 274, which has a name of xyzzy.s1.
+For staden format the headers include the project database name
+and the number of the leftmost reading in the contig, e.g.
+"<B0334.00274------->" is database B0334 and the leftmost reading for
+the contig is number 274. Staden format is maintained only for
+historical reasons, i.e. there may still be a few unfortunate people using it.
+Obviously Experiment file format can contain much more information, and
+can serve as the basis of a submission to the sequence library.
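+
+As a sketch, the FASTA header layout described above can be taken apart
+as follows. The function name is hypothetical, and the code assumes the
+comment always has the form database.version.reading-number, as in the
+example given.
+
```python
def parse_consensus_header(header):
    """Split a gap4-style FASTA consensus header into its parts."""
    if not header.startswith(">"):
        raise ValueError("not a FASTA header: %r" % header)
    name, comment = header[1:].split(None, 1)
    # Split from the right so that a database name containing dots survives.
    database, version, reading_number = comment.strip().rsplit(".", 2)
    return {
        "contig": name,
        "database": database,
        "version": version,
        "leftmost_reading": int(reading_number),
    }


print(parse_consensus_header(">xyzzy.s1 B0334.0.274"))
```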
+
+_split()
+@node Con-Normal
+@section Normal Consensus Output
+@cindex Calculate consensus: normal consensus
+@cindex Normal consensus
+@cindex Fasta output from Gap
+
+This is the usual consensus type that will be calculated
+(and is available from the gap4 File menu).
+The currently active consensus algorithm is selected from the
+"Consensus algorithm" dialogue in the main gap4 Options menu
+(_fpref(Conf-Consensus Algorithm, Consensus Algorithm, t)).
+
+Contigs can be
+selected from a file of file names or a list.  In addition, tagged regions can
+be masked or marked (_fpref(Anno-Act, Active tags and masking, tags)), and
+output can be in Experiment file, fasta 
+or staden formats.  If experiment file format is selected a further menu
+appears that allows users to select for the inclusion of tag data in the
+output file.
+
+_picture(calc_consensus.normal,3.35833in)
+
+The contigs for which to calculate a consensus can be a particular
+"single" contig, "all contigs", or a subset of contigs whose names are
+stored in a "file" or a "list". If a file or list is selected the
+browse button will be activated, and if it is clicked, an appropriate
+browser will be invoked. If the user selects "single" then the
+dialogue for choosing the contig, and the section to process, becomes
+active.
+
+If the user selects either "mask active tags" or "mark active tags"
+the "Select tags" button is activated, and if it is clicked, a dialogue
+panel appears to enable the user to select which tag types should be
+used in these processes. If "mask" is selected all segments covered by
+the tag types chosen will not be written as ACGT but as defi
+symbols. If "mark" is selected the tagged segments will be written in
+lowercase characters. Masking is useful for producing a sequence to
+screen against other sequences: only the unmasked segments will
+produce hits.
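+
+The effect of masking and marking on a consensus string can be
+illustrated as below. This is a sketch, not the program's code: the tag
+ranges are assumed to be 0-based and half-open, and the mapping of
+A,C,G,T to d,e,f,i is assumed to be the same one given for the
+Unfinished consensus (d=a, e=c, f=g, i=t).
+
```python
MASK_TABLE = str.maketrans("ACGTacgt", "defidefi")


def apply_tags(consensus, tagged, mode):
    """Mask or mark the tagged segments of a consensus.

    tagged: list of (start, end) ranges, 0-based and half-open (assumed).
    mode:   "mask" writes d,e,f,i in place of A,C,G,T;
            "mark" writes the tagged bases in lowercase.
    """
    if mode not in ("mask", "mark"):
        raise ValueError("mode must be 'mask' or 'mark'")
    out = list(consensus)
    for start, end in tagged:
        for i in range(start, min(end, len(out))):
            if mode == "mask":
                out[i] = out[i].translate(MASK_TABLE)
            else:
                out[i] = out[i].lower()
    return "".join(out)


print(apply_tags("ACGTACGT", [(2, 5)], "mask"))
print(apply_tags("ACGTACGT", [(2, 5)], "mark"))
```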
+
+The "strip pads" option will remove pads ("*"s) from the consensus sequence.
+In the case of experiment files this will also automatically adjust the
+position and length of the annotations to ensure that they still mark the
+correct segment of sequence.
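+
+The adjustment of annotations when pads are stripped can be sketched as
+follows. The function and the (start, length) annotation layout are
+assumptions made for the illustration; positions are counted from 0.
+
```python
def strip_pads(consensus, annotations):
    """Remove "*" pads and shift annotations to match.

    annotations: list of (start, length) in padded coordinates (assumed layout).
    Returns (stripped_consensus, adjusted_annotations).
    """
    # unpadded_before[i] = number of non-pad characters before padded index i
    unpadded_before = [0]
    for ch in consensus:
        unpadded_before.append(unpadded_before[-1] + (1 if ch != "*" else 0))
    stripped = consensus.replace("*", "")
    adjusted = []
    for start, length in annotations:
        new_start = unpadded_before[start]
        new_end = unpadded_before[start + length]
        adjusted.append((new_start, new_end - new_start))
    return stripped, adjusted


print(strip_pads("AC*GT*A", [(1, 4)]))
```
+
+In this example the annotation covers "C*GT" before stripping and "CGT"
+afterwards, so its length shrinks by one while its start is unchanged.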
+
+Normally the consensus sequences are named after the left-most reading
+in each contig. For the purposes of single-template based sequencing
+projects (e.g. cDNA assemblies) the option exists to ``Name consensus by
+left-most template'' instead of by left-most reading.
+
+The routine can write its consensus sequence (plus extra data for
+experiment files) in "experiment file", "fasta" and "staden"
+formats. The output file can be chosen with the aid of a file
+browser. If experiment file format is selected the user can choose
+whether or not to have "all annotations", "annotations except in
+hidden", or "no annotations" written out with the sequence. If the
+user elects to include annotations the "select tags" button will become
+active, and if it is clicked, a dialogue for selecting the types to include
+will appear. 
+
+
+_ifdef([[_gap4]],[[
+_split()
+@node Con-Extended
+@section Extended Consensus Output
+@cindex Calculate consensus: extended consensus
+@cindex Extended consensus
+
+This consensus type 
+(which is available from the gap4 File menu)
+is useful for those who are too impatient to
+complete their sequence and want to compare it, in its fullest extent,
+to other data.  The sequence produced therefore includes hidden data
+from the ends of the contigs.  
+
+The currently active consensus algorithm is selected from the
+"Consensus algorithm" dialogue in the main gap4 Options menu
+(_fpref(Conf-Consensus Algorithm, Consensus Algorithm, t)).
+
+Contigs can be selected from a file of
+file names or a list.  In addition tagged regions can be masked or
+marked (_fpref(Anno-Act, Active tags and masking, tags)), and output can
+be in fasta or staden formats.
+
+_picture(calc_consensus.extended,3.38333in)
+
+The contigs for which to calculate a consensus can be a particular
+"single" contig, "all contigs", or a subset of contigs whose names are
+stored in a "file" or a "list". If a file or list is selected the
+browse button will be activated, and if it is clicked, an appropriate
+browser will be invoked. If the user selects "single" then the
+dialogue for choosing the contig and the section to process becomes
+active.
+
+Where possible
+the contigs are extended using the poor quality data from the readings
+near their ends. To ensure that this additional data is not too poor
+the program uses the following
+algorithm. It slides a window of size "Window size for good data scan"
+along the hidden data for each reading and stops if it finds a window
+that contains more than "Max dashes in scan window" non-ACGT
+characters. The data that extends the contig the furthest is added to
+its consensus sequence. 
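+
+The window scan can be sketched as follows. This is an interpretation
+rather than the program's code: the hidden data is assumed to be given
+with the base nearest the contig first, and the scan is assumed to cut
+the extension at the start of the first window that fails the test.
+
```python
def usable_hidden_length(hidden, window_size, max_dashes):
    """Return how many leading bases of the hidden data pass the scan.

    hidden:      hidden sequence, base nearest the contig first (assumed)
    window_size: "Window size for good data scan"
    max_dashes:  "Max dashes in scan window"
    """
    last_start = max(len(hidden) - window_size + 1, 1)
    for start in range(last_start):
        window = hidden[start:start + window_size]
        bad = sum(1 for ch in window if ch.upper() not in "ACGT")
        if bad > max_dashes:
            return start  # stop at the first window with too many non-ACGT
    return len(hidden)


print(usable_hidden_length("ACGTACGT--A-C--", 5, 2))
```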
+
+If the user selects either "mask active tags" or "mark active tags"
+the "Select tags" button is activated, and if it is clicked, a dialogue
+panel appears to enable the user to select which tag types should be
+used in these processes. If "mask" is selected all segments covered by
+the tag types chosen will not be written as ACGT but as defi
+symbols. If "mark" is selected the tagged segments will be written in
+lowercase characters. Masking is useful for producing a sequence to
+screen against other sequences: only the unmasked segments will
+produce hits.
+
+The "strip pads" option will remove pads ("*"s) from the consensus sequence.
+
+The routine can write its consensus sequence in "fasta" and "staden"
+formats. The output file can be chosen with the aid of a file browser. 
+
+_split()
+@node Con-Unfinished
+@section Unfinished Consensus Output
+@cindex Calculate consensus: unfinished consensus
+@cindex Unfinished consensus
+
+This option is available from the gap4 File menu.
+An "Unfinished" consensus is one in which any position where
+the consensus does not have good data for both strands 
+is written using A,C,G,T characters,
+and the rest (which has good data for both strands) is written
+using a different set of symbols (d,e,f,i). This sequence can be used
+for screening against new readings: 
+only the regions needing more readings will produce
+matches. By screening readings in this way, prior to assembly, users can
+avoid entering readings which will not help finish the project, and
+which may require further editing to be performed.
+This type of consensus,
+when written in staden format, consists of
+A,C,G,T for single stranded regions and d,e,f,i for finished sequence
+(d=a,e=c,f=g,i=t).
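+
+The encoding can be illustrated with a short sketch. The function name
+and the per-position list of flags are assumptions; deciding whether a
+position has good data on both strands is the job of the consensus
+algorithm and is not shown.
+
```python
FINISHED = str.maketrans("ACGT", "defi")  # d=a, e=c, f=g, i=t


def unfinished_consensus(consensus, both_strands_good):
    """Write finished positions as d,e,f,i and the rest as A,C,G,T.

    both_strands_good: one boolean per consensus position (assumed input).
    """
    return "".join(
        base.translate(FINISHED) if good else base
        for base, good in zip(consensus, both_strands_good)
    )


flags = [True, True, False, False, True, False]
print(unfinished_consensus("ACGTAC", flags))
```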
+
+
+The currently active consensus algorithm is selected from the
+"Consensus algorithm" dialogue in the main gap4 Options menu
+(_fpref(Conf-Consensus Algorithm, Consensus Algorithm, t)).
+
+Contigs can be selected from a
+file of file names or a list, and output can be in fasta or staden
+formats.
+
+_picture(calc_consensus.unfinished,3.175in)
+
+The contigs for which to calculate a consensus can be a particular
+"single" contig, "all contigs", or a subset of contigs whose names are
+stored in a "file" or a "list". If a file or list is selected the
+browse button will be activated, and if it is clicked, an appropriate
+browser will be invoked. If the user selects "single" then the
+dialogue for choosing the contig and the section to process becomes
+active.
+
+The "strip pads" option will remove pads ("*"s) from the consensus sequence.
+
+The routine can write its consensus sequence in "fasta" and "staden"
+formats. The output file can be chosen with the aid of a file browser. 
+
+_split()
+@node Con-Quality
+@section Quality Consensus Output
+@cindex Calculate consensus: quality
+@cindex Quality: output for consensus
+@cindex Quality codes
+
+
+The Quality Consensus Output option described here 
+(which is available from the gap4 File menu)
+applies either of the two simple
+consensus calculations
+(_fpref(Con-Calculation-1, Consensus Calculation Using Base Frequencies,
+t)) and 
+(_fpref(Con-Calculation-2, Consensus Calculation Using Weighted Base Frequencies, t))
+to the data for each strand of the DNA separately. 
+The currently active consensus algorithm is selected from the
+"Consensus algorithm" dialogue in the main gap4 Options menu
+(_fpref(Conf-Consensus Algorithm, Consensus Algorithm, t)).
+
+It produces, not a consensus sequence, but an encoding of the "quality"
+of the data which defines whether it has been determined on both
+strands, and whether the strands agree.
+The categories of data
+and the codes produced are shown in the table. For example 'c' means
+bad data on one strand is aligned with good data on the other.
+
+@table @var
+@item a
+@kbd{Good Good (in agreement)}
+@item b
+@kbd{Good Bad}
+@item c
+@kbd{Bad  Good}
+@item d
+@kbd{Good None}
+@item e
+@kbd{None Good}
+@item f
+@kbd{Bad  Bad}
+@item g
+@kbd{Bad  None}
+@item h
+@kbd{None Bad}
+@item i
+@kbd{Good Good (disagree)}
+@item j
+@kbd{None None}
+@end table
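+
+The table can be expressed as a lookup, shown here as a sketch. The
+function name and the state strings are hypothetical; the two arguments
+correspond to the first and second columns of the table, and which
+strand each column refers to is not stated here.
+
```python
_CODES = {
    ("good", "bad"): "b",
    ("bad", "good"): "c",
    ("good", "none"): "d",
    ("none", "good"): "e",
    ("bad", "bad"): "f",
    ("bad", "none"): "g",
    ("none", "bad"): "h",
    ("none", "none"): "j",
}


def quality_code(first, second, agree=True):
    """Return the quality code for the states of the two table columns.

    first, second: "good", "bad" or "none"
    agree:         only consulted when both states are "good"
    """
    if first == "good" and second == "good":
        return "a" if agree else "i"
    return _CODES[(first, second)]


print(quality_code("bad", "good"))
print(quality_code("good", "good", agree=False))
```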
+
+_picture(calc_consensus.quality,3.175in)
+
+The contigs for which to calculate a consensus can be a particular
+"single" contig, "all contigs", or a subset of contigs whose names are
+stored in a "file" or a "list". If a file or list is selected the
+browse button will be activated, and if it is clicked, an appropriate
+browser will be invoked. If the user selects "single" then the
+dialogue for choosing the contig and the section to process becomes
+active.
+
+The routine can only write its consensus sequence in "staden"
+format. The output file can be chosen with the aid of a file browser. 
+]])
+
+_split()
+@node Con-Calculation
+@section The Consensus Algorithms
+@cindex Calculate consensus: algorithm
+@cindex Consensus calculation method
+@cindex consensus IUB codes
+@cindex IUB codes: consensus
+
+@menu
+* Con-Calculation-1::     Consensus Calculation Using Base Frequencies
+* Con-Calculation-2::     Consensus Calculation Using Weighted Base Frequencies
+* Con-Calculation-3::     Consensus Calculation Using Confidence Values
+* Qual-Cal::              The Quality Calculation
+* Con-Evaluation::        List Consensus Confidence
+@end menu
+
+The consensus calculation is a very important component of gap4. It is
+used to produce an "on-the-fly" consensus, responding to every
+individual change in the Contig Editor
+(_fpref(Editor, Editing in gap4, contig_editor))
+and is used to produce the final sequence for submission to the sequence
+libraries. Some years ago
+(@i{Bonfield, J.K. and Staden, R. The application of numerical estimates of
+base calling accuracy to DNA sequencing projects. Nucleic Acids Res. 23,
+1406-1410 (1995)}) we put forward the idea of using base call
+accuracy estimates in sequencing projects, and this has been partially
+realised with the values from the Phred program
+(@i{Ewing, B. and Green, P.
+Base-Calling of Automated Sequencer Traces Using Phred. II. Error
+Probabilities. Genome Research. Vol 8 no 3. 186-194 (1998)}).
+These values are widely used and have defined a decibel-type
+scale for base call confidence values, and gap4 is currently set to use
+confidence values defined on this scale.
+An overview of our use of confidence values is contained in the
+introductory sections of the manual
+(_fpref(Intro-Base-Acc, The use of numerical estimates of base
+calling accuracy, t)).
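+
+For readers unfamiliar with the Phred scale, a confidence value Q and
+the estimated probability P that the base call is wrong are related by
+Q = -10 * log10(P). The following sketch shows the conversion in both
+directions; the function names are ours and are not part of gap4.
+
```python
import math


def error_probability(confidence):
    """Estimated probability that a base call with this confidence is wrong."""
    return 10 ** (-confidence / 10.0)


def confidence_from_error(probability):
    """Phred-scale confidence for a given error probability."""
    return -10 * math.log10(probability)


print(error_probability(20))
print(confidence_from_error(0.001))
```
+
+On this scale a confidence of 20 corresponds to about one error in 100
+calls and a confidence of 30 to about one in 1000.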
+
+As is described elsewhere
+(_fpref(Con-Evaluation, List Consensus Confidence, calc_consensus))
+being able to calculate the confidence for each base in the consensus
+sequence makes it possible to estimate the number of errors it contains,
+and hence the number of errors that will be removed if particular bases
+are checked and, if necessary, edited. 
+
+Gap4 caters for base calls
+with and without confidence values and hence provides a choice of
+algorithms. 
+There are currently three consensus algorithms that may be used. The
+choice of the best algorithm will depend on the data that you have available
+and the purpose for which you are using gap4.
+
+The currently active consensus algorithm is selected from the
+"Consensus algorithm" dialogue in the main gap4 Options menu
+(_fpref(Conf-Consensus Algorithm, Consensus Algorithm, t)).
+
+The only way to produce a consensus sequence for which the reliability
+of each base is known is to use reading data with base call confidence
+values. Their use, in combination with the Confidence Value
+algorithm
+(_fpref(Con-Calculation-3, Consensus Calculation Using Confidence Values, t)),
+is strongly recommended.
+
+For base calls without confidence values use the Base Frequencies algorithm
+(_fpref(Con-Calculation-1, Consensus Calculation Using Base Frequencies, t)).
+This is also a fast algorithm so
+it may be appropriate for very high depth assemblies such as those
+for mutation studies.
+
+For data with simple base call accuracy estimates rather than those on
+the decibel scale, the Weighted Base Frequencies algorithm should be used
+(_fpref(Con-Calculation-2, Consensus Calculation Using Weighted 
+Base Frequencies, t)).
+
+All confidence values lie in the range 0 to 100.
+When readings are entered into a database, gap4 assigns a confidence of
+99 to all bases 
+without confidence values. 
+For all three algorithms, a base with confidence of 100 is
+used to force the consensus base to that base type and to have a
+confidence of 100. However, if two or more base types at any position
+have confidence 100, the consensus will be set to "unknown", i.e. "-",
+and will have a confidence of 0.
+Note that dash ("-") is our preferred symbol for "unknown" as, within a
+sequence, it is more easily distinguished from A,C,G,T than "N". 
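+
+The special handling of confidence 100 can be sketched as below. The
+function name and the layout of a column as (base, confidence) pairs are
+assumptions made for the illustration; when no base has confidence 100
+the normal algorithm applies, which is not shown.
+
```python
def forced_consensus(column):
    """Apply the confidence-100 rule to one column of aligned base calls.

    column: list of (base, confidence) pairs (assumed layout).
    Returns (base, confidence), or None if no call has confidence 100.
    """
    forced = {base for base, confidence in column if confidence == 100}
    if not forced:
        return None            # fall through to the normal calculation
    if len(forced) == 1:
        return (forced.pop(), 100)
    return ("-", 0)            # conflicting forced bases: "unknown"


print(forced_consensus([("A", 100), ("C", 40), ("A", 30)]))
print(forced_consensus([("A", 100), ("C", 100)]))
```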
+
+The consensus sequence is also assigned a confidence, even when base
+call confidence values
+are not used to calculate it. 
+The scale and meaning of the consensus confidence changes
+between consensus algorithms. However the consensus cutoff parameter always
+has the same meaning. A consensus base with a confidence 'X' will be called as
+a dash when 'X' is lower than the consensus cutoff, otherwise it is the
+determined base type.
+
+Both the consensus cutoff and quality cutoff values can be set by using
+the "Configure cutoffs" command in the
+"Consensus algorithm" dialogue in the main gap4 Options menu
+(_fpref(Conf-Consensus Algorithm, Consensus Algorithm, t)).
+Within
+the Contig Editor (_fpref(Editor, Editing in gap4, contig_editor)) these
+values can be adjusted by clicking on the "<" and ">" symbols adjacent
+to the "C:" (consensus cutoff) and "Q:" (quality cutoff) displays in the
+top left corner of the editor. These buttons are repeating buttons - the
+values will adjust for as long as the left mouse button is held down.
+Changing these values lasts only as long as that invocation of the
+contig editor.
+
+The consensus algorithms are usually configured to produce only the
+characters A,C,G,T,* and "-", but it is possible to set them to produce
+the complete set of IUB codes. This mode is useful for some types of
+work and allows the range of observed base types at any position to be
+coded in the consensus. The IUB code at any position is determined in
+the following way.
+
+We assume that the user wants to know which base types have occurred at
+any point, but may want some control over the quality and relative
+frequency of those that are used to calculate the "consensus".
+For the simplest consensus algorithm there is no control
+over the quality of the base calls that are included, but the Consensus
+Cutoff can be used to control how the relative frequency affects the
+chosen IUB code. All base types whose computed "confidence" exceeds the
+Consensus Cutoff will be included in the selection of the IUB code. For
+example if only base type T reaches the Consensus Cutoff the IUB code
+will be T; if both T and C reach the cutoff the code will be Y; if A, C
+and T each reach the cutoff the code will be H; if A, C, G and T all
+reach the cutoff the code will be "N". For the Confidence Value
+algorithm the Quality Cutoff can be used to exclude base calls of low
+quality, so that all those that do not reach the Quality Cutoff are
+excluded from the IUB code calculation. Otherwise the logic of the code
+selection is the same as for the two simpler algorithms.
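+
+The selection rule above can be sketched as follows. This is an
+illustration only, not gap4's code: the function name @code{iub_code} and
+the use of "greater than or equal to" for reaching the cutoff are
+assumptions. The code table is the standard IUB/IUPAC nucleotide set.
+
```python
# Illustrative sketch only; the names here are hypothetical, not gap4's.
# Keys list the base types in alphabetical order.
IUB_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V",
    "ACGT": "N",
}


def iub_code(confidences, consensus_cutoff):
    """Return the IUB code for the base types that reach the cutoff.

    confidences maps a base type ('A', 'C', 'G' or 'T') to its computed
    confidence. Assumption: a base type "reaches" the cutoff when its
    confidence is greater than or equal to it.
    """
    reached = "".join(sorted(
        base for base, conf in confidences.items()
        if base in "ACGT" and conf >= consensus_cutoff
    ))
    # No base type reaching the cutoff is reported as "unknown".
    return IUB_CODES.get(reached, "-")
```
+
+For instance, @code{iub_code(@{"T": 60, "C": 55@}, 50)} selects Y, matching
+the second case in the text.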
+
+The algorithms are explained below.
+
+_split()
+ at node Con-Calculation-1
+ at subsection Consensus Calculation Using Base Frequencies
+
+This algorithm can be used for any data, with or without confidence values.
+Each standard base type is given the same weight. The consensus
+will be the most frequent base type in a given column provided that the
+consensus cutoff parameter is low enough. All unrecognised base types,
+including IUB codes, are treated as dashes.
+Dashes are given a
+weight of 1/10th that of recognised base types. Pads are given a weight
+which is the average of their neighbouring bases.
+
+The confidence of a consensus base for this method is expressed as a
+percentage. 
+So for example a column of bases of A, A, A and T will give a consensus base
+of A and a confidence of 75. Therefore a consensus cutoff of 76 or higher will
+give a consensus base of "-".
+
+In the event that more than one base type is calculated to have the same
+confidence, and this
+exceeds the consensus cutoff, the bases are assigned in descending order of
+precedence: A, C, G and T.
+
+The quality cutoff parameter (Q in the Contig Editor) 
+has no effect on this algorithm.
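+
+A minimal sketch of this calculation is given below. It is a
+simplification under stated assumptions: only the four standard base
+types are counted, so the special weights for dashes and pads described
+above are left out, and the function name is hypothetical.
+
```python
def base_frequency_consensus(column, consensus_cutoff):
    """Return (consensus_base, confidence) for a column of base calls.

    Simplified sketch: only A, C, G and T are counted. The 1/10 weight
    for dashes and the neighbour-averaged weight for pads are omitted.
    """
    counts = {base: 0 for base in "ACGT"}
    for call in column:
        if call in counts:
            counts[call] += 1
    total = sum(counts.values())
    if total == 0:
        return "-", 0.0
    # max() keeps the first maximal item, so iterating in the order
    # A, C, G, T applies the stated precedence to ties.
    best = max("ACGT", key=lambda base: counts[base])
    confidence = 100.0 * counts[best] / total
    if confidence < consensus_cutoff:
        return "-", confidence
    return best, confidence
```
+
+With the column A, A, A, T this gives A with confidence 75 for a cutoff
+of 75, and a dash once the cutoff is raised to 76.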
+
+_split()
+ at node Con-Calculation-2
+ at subsection Consensus Calculation Using Weighted Base Frequencies
+
+This method can be used when simple, unquantified, base call quality
+values are available. Instead of simply counting base type frequencies
+it sums the quality values.
+Hence a column of 4 bases A,
+A, A and T with confidence values 10, 10, 10 and 50 would give combined totals
+of 30/80 for A and 50/80 for T (compared to 3/4 for A and 1/4 for 
+T when using frequencies). As
+with the unweighted frequency method this sets the confidence value of the
+consensus base to be the fraction of the chosen base type weights over the
+total weights (62.5 in the above example).
+
+The quality cutoff parameter controls which bases are used in the calculation.
+Only bases with quality values greater than or equal to the quality cutoff are
+used, otherwise they are completely ignored and have no effect on either the
+base type chosen for the consensus or the consensus confidence value. In the
+above example setting the quality cutoff to 20 would give a T with
+confidence 100 (100 * 50/50).
+
+In the event that more than one base type is calculated to have the same
+weight, and this
+exceeds the consensus cutoff, the bases are assigned in descending order of
+precedence: A, C, G and T.
+
+This is Rule IV of @cite{Bonfield,J.K. and Staden,R. The application of
+numerical estimates of base calling accuracy to DNA sequencing projects.
+Nucleic Acids Research 23, 1406-1410 (1995).}
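+
+The weighted calculation can be sketched in the same style. The function
+name and argument layout are assumptions made for illustration; as in
+the text, only calls with quality at or above the quality cutoff
+contribute.
+
```python
def weighted_frequency_consensus(calls, quality_cutoff, consensus_cutoff):
    """Return (consensus_base, confidence) from (base, quality) pairs.

    Sketch only: quality values are summed per base type, ignoring any
    call whose quality is below quality_cutoff.
    """
    weights = {base: 0.0 for base in "ACGT"}
    for base, quality in calls:
        if base in weights and quality >= quality_cutoff:
            weights[base] += quality
    total = sum(weights.values())
    if total == 0:
        return "-", 0.0
    # Ties are resolved in the order A, C, G, T.
    best = max("ACGT", key=lambda base: weights[base])
    confidence = 100.0 * weights[best] / total
    if confidence < consensus_cutoff:
        return "-", confidence
    return best, confidence
```
+
+For the column used above (A, A, A and T with values 10, 10, 10 and 50)
+a quality cutoff of 0 gives T with confidence 62.5, and a quality cutoff
+of 20 gives T with confidence 100.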
+
+_split()
+ at node Con-Calculation-3
+ at subsection Consensus Calculation Using Confidence Values
+
+This is the preferred consensus algorithm for reading data with Phred
+decibel scale confidence values. As will become clear from the following
+description, it is more complicated than the other algorithms, but
+produces a much more useful result.
+
+A difficulty in designing an algorithm to calculate the confidence for
+a consensus derived from several readings, possibly using different
+chemistries, and hopefully from both strands of the DNA, is knowing
+the level of
+independence of the results from different experiments - namely the readings.
+Given that sequencing traces are sequence dependent, we do not regard
+readings as wholly independent, but at the same time,
+repeated readings which confirm base calls may give us more confidence
+in their accuracy. In addition, if we get a particularly good sequencing
+run, with consequently high base call confidence values, we are 
+more likely to believe its base call and confidence value assignments.
+The final point in this preamble
+is that the Phred confidence values 
+refer only to the probability for the called base, and
+they tell us nothing about the relative likelihood of each of the other
+3 base types appearing at the same position.
+These difficulties are taken into account by our algorithm, which
+is described below.
+
+In what follows, a particular position in an alignment of readings is
+referred to as a "column".
+The base calls in a column are classified by their chemistry
+and strand. We currently group them into "top strand dye primer", "top strand
+dye terminator", "bottom strand dye primer" and "bottom strand dye terminator"
+classes.
+
+Within each class there may be zero or many base calls. For each
+class we check for multiple occurrences of the same base type. 
+For each base type we find the highest confidence value, and then
+increase it by an amount dependent on the number of confirming reads.
+Then Bayes' formula is used to derive the probabilities and hence the
+confidence values for each base type.
+
+To further describe the method it is easiest to work through an example.
+Suppose we have 5 readings with the
+following characteristics covering a particular column.
+
+ at example
+Dye primer, top strand,        'A', confidence 20
+Dye primer, top strand,        'A', confidence 10
+Dye primer, top strand,        'T', confidence 20
+Dye terminator, top strand,    'T', confidence 10
+Dye primer, bottom strand,     'A', confidence 5
+ at end example
+
+Hence there are three possible classes.
+
+Examining the "dye primer top strand" class we
+see there are three readings (A, A and T). The highest A is 20. We add to
+this a fixed quantity to indicate one other occurrence of an A in this set. For
+this example we add 5. Now we have an adjusted confidence of
+25 for A and 20 for T. This is equivalent to a .997 
+probability of A being correct and .99 probability of T being correct.
+To use Bayes we split the remaining probabilities evenly.
+A has a probability of .997 and so the remaining .003 is spread amongst the
+other base types. Similarly for the .01 of the T. The result is shown in
+the table below.
+
+ at example
+  |   A     C     G     T
+--+-----------------------
+A | .997  .001  .001  .001
+T | .0033 .0033 .0033 .990
+ at end example
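+
+The construction of this per-class table can be sketched as below. This
+is a hedged illustration, not gap4's code: the function name is
+hypothetical, and a fixed bonus of 5 for every additional confirming
+read is an assumption taken from this one example (the text only says the
+increase depends on the number of confirming reads). The later Bayesian
+combination is not reproduced here.
+
```python
def class_probability_table(calls, bonus=5, alphabet="ACGT"):
    """Build the probability rows for one chemistry/strand class.

    calls is a list of (base, confidence) pairs from a single class.
    Returns (adjusted, table): adjusted maps each called base type to
    its adjusted confidence, and table maps it to a row of
    probabilities over the alphabet.

    Assumption: each extra confirming read adds a fixed bonus.
    """
    highest = {}
    counts = {}
    for base, conf in calls:
        highest[base] = max(highest.get(base, 0), conf)
        counts[base] = counts.get(base, 0) + 1

    adjusted = {}
    table = {}
    for base in highest:
        adjusted[base] = highest[base] + bonus * (counts[base] - 1)
        # Phred scale: confidence = -10 * log10(probability of error).
        p_correct = 1.0 - 10 ** (-adjusted[base] / 10.0)
        # Spread the remaining probability evenly over the other types.
        p_other = (1.0 - p_correct) / (len(alphabet) - 1)
        table[base] = {
            b: (p_correct if b == base else p_other) for b in alphabet
        }
    return adjusted, table
```
+
+For the "dye primer top strand" class, the calls (A, 20), (A, 10) and
+(T, 20) give adjusted confidences of 25 and 20, and rows close to those
+tabulated above. Passing an alphabet of five symbols divides the
+remainder by 4 instead of 3.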
+
+Bayesian calculations on
+this table then give us probabilities of approximately .766 for A,
+.00154 for C, .00154 for G and .231 for T.
+
+The other classes give probabilities of .033 for A, C, G and .9 for T, and
+.316 for A, and .228 for C, G and T.
+
+To combine the values for each class we produce a table for a further Bayesian
+calculation. Once again we fill in the probabilities and spread the remainder
+evenly amongst the other base types.
+
+ at example
+           |   A      C      G     T
+-----------+--------------------------
+Primer Top | .766  .00154 .00154 .231
+Term   Top | .0333 .0333  .0333  .9
+Primer Bot | .316  .228   .228   .228
+ at end example
+
+From this Bayes gives the 
+final probabilities of .135 for A, .0002 for C, .0002 for
+G and .854 for T.
+This is what would be expected intuitively: the T signal was present in
+both dye primer and dye terminator experiments with 1/100 and 1/10 error
+rates whilst the A signal was present on both strands with 1/100 and 1/3 error
+rates. 
+Hence the consensus base is T with confidence 8.4 (-10*log10(1-.854)).
+
+If a padding character is present in a
+column we consider the pad as a separate base type and then evenly divide the
+remaining probabilities by 4 instead of 3.
+
+_split()
+ at node Qual-Cal
+ at subsection The Quality Calculation
+ at cindex Quality calculation algorithm
+
+The Quality Calculation described here 
+(which is available from the gap4 File menu)
+applies either of the two simple
+consensus calculations
+(_fpref(Con-Calculation-1, Consensus Calculation Using Base Frequencies,
+t)) and 
+(_fpref(Con-Calculation-2, Consensus Calculation Using Weighted Base Frequencies, t))
+to the data for each strand of the DNA separately. 
+It produces, not a consensus sequence, but an encoding of the "quality"
+of the data which defines whether it has been determined on both
+strands, and whether the strands agree.
+This quality is used as
+the basis for problem searches, such as find next problem, and the Quality
+Display within the Template Display (_fpref(Template-Quality, Quality Plot,
+template)).
+
+The categories of data
+and the codes produced are shown in the table. For example, 'c' means that
+bad data on the + strand is aligned with good data on the - strand.
+
+ at table @var
+ at item
+ at r{+Strand -Strand}
+ at item a
+ at r{Good    Good} (in agreement)
+ at item b
+ at r{Good    Bad}
+ at item c
+ at r{Bad     Good}
+ at item d
+ at r{Good    None}
+ at item e
+ at r{None    Good}
+ at item f
+ at r{Bad     Bad}
+ at item g
+ at r{Bad     None}
+ at item h
+ at r{None    Bad}
+ at item i
+ at r{Good    Good} (disagree)
+ at item j
+ at r{None    None}
+ at end table
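+
+The table amounts to a lookup on the state of each strand. A minimal
+sketch follows; the state names "good", "bad" and "none" and the function
+name are assumptions made for illustration.
+
```python
# Keys are (plus strand state, minus strand state).
QUALITY_CODES = {
    ("good", "good"): "a",  # in agreement; disagreement is handled below
    ("good", "bad"): "b",
    ("bad", "good"): "c",
    ("good", "none"): "d",
    ("none", "good"): "e",
    ("bad", "bad"): "f",
    ("bad", "none"): "g",
    ("none", "bad"): "h",
    ("none", "none"): "j",
}


def quality_code(plus, minus, strands_agree=True):
    """Return the quality code letter for one consensus position."""
    if plus == "good" and minus == "good" and not strands_agree:
        return "i"
    return QUALITY_CODES[(plus, minus)]
```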
+
+In the "Consensus algorithm" dialogue in the main gap4 Options menu
+(_fpref(Conf-Consensus Algorithm, Consensus Algorithm, t)),
+setting the configuration to treat readings flagged using the
+"Special Chemistry" Experiment File line (CH field) 
+(_fpref(Formats-Exp, Experiment File, formats))
+affects this
+calculation. When set, such a reading counts as data for both strands
+in the Consensus and Quality
+Calculations, and hence is equivalent to having data on both
+strands. 
+
+
+_split()
+ at node Con-Evaluation
+ at section List Consensus Confidence
+ at cindex Calculate consensus: reliability
+ at cindex Calculate consensus: confidence
+ at cindex Consensus calculation confidence
+ at cindex Confidence of consensus
+ at cindex List confidence
+
+The Confidence Value consensus algorithm 
+(_fpref(Con-Calculation-3, Consensus Calculation Using Confidence Values, t))
+produces a consensus
+sequence for which the expected error rate for each base is known.
+The option described here 
+(which is available from the gap4 View menu)
+uses this information to calculate 
+the expected number of errors in a particular consensus sequence and
+to tabulate them.
+
+The decibel type scale introduced in the Phred program uses the formula
+-10xlog10(error_rate) to produce confidence values for the base calls. A
+confidence value of 10 corresponds to an error rate of 1/10; 20 to
+1/100; 30 to 1/1000; etc.
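+
+The conversion in both directions is short enough to state as code. The
+function names are hypothetical; the formula is the one given above.
+
```python
import math


def error_rate_from_confidence(confidence):
    """Phred scale: error_rate = 10 ** (-confidence / 10)."""
    return 10 ** (-confidence / 10.0)


def confidence_from_error_rate(error_rate):
    """Inverse conversion: confidence = -10 * log10(error_rate)."""
    return -10.0 * math.log10(error_rate)
```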
+
+So for example, if 50 bases in the consensus had confidence
+10, we would expect those 50 bases (with an error rate of 1/10) to
+contain 5 errors; and if 200 bases had confidence 20, we would expect
+them to contain 2 errors. If these 50 bases with confidence 10, and 200
+bases with confidence 20 were the least accurate parts of the consensus,
+they are the bases which we should check and edit first. In so doing we
+would be dealing with the places most likely to be wrong, and would
+raise the confidence of the whole consensus. The output produced by List
+Confidence shows the effect of working through all the lowest quality
+bases first, until the desired level of accuracy is reached. To do this
+it shows the cumulative number of errors that would be fixed by checking
+every consensus base with a confidence value less than a
+particular threshold.
+
+The List Confidence option is available from within the Commands menu of
+the Contig Editor and the main gap4 View menu. From the main menu
+the dialogue simply allows selection of one or more contigs. Pressing OK then
+produces a table similar to the following:
+
+ at example
+Sequence length = 164068 bases.
+Expected errors =  168.80 bases (1/971 error rate).
+
+Value   Frequencies     Expected  Cumulative    Cumulative      Cumulative
+                        errors    frequencies   errors          error rate
+--------------------------------------------------------------------------
+  0          0             0.00         0          0.00         1/971
+  1          1             0.79         1          0.79         1/976
+  2          0             0.00         1          0.79         1/976
+  3          3             1.50         4          2.30         1/985
+  4         30            11.94        34         14.24         1/1061
+  5          2             0.63        36         14.87         1/1065
+  6        263            66.06       299         80.94         1/1867
+  7        151            30.13       450        111.06         1/2841
+  8        164            25.99       614        137.06         1/5168
+  9         96            12.09       710        149.14         1/8344
+ 10         80             8.00       790        157.14         1/14069
+ at end example
+
+The output above states that there are 164068 bases in the consensus sequence
+with an expected 169 errors (giving an average error rate of one in 971).
+Next it lists each confidence value along with its frequency of occurrence and
+the expected number of errors (as explained above, frequency x
+error_rate).  For any particular confidence value the
+cumulative columns state: how many bases in the sequence have the same or
+lower confidence, how many errors are expected in those bases, and the
+new error rate if all these bases were checked and all the errors fixed.
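+
+The columns of this table can be recomputed from a histogram of
+consensus confidence values. The sketch below is an illustration under
+assumptions (the function name and the tuple layout are hypothetical); it
+is not the code used by List Confidence.
+
```python
import math


def list_confidence(frequencies):
    """Tabulate expected errors per consensus confidence value.

    frequencies maps a confidence value to the number of consensus
    bases with that value. Returns (length, total_errors, rows), where
    each row is (value, frequency, expected_errors,
    cumulative_frequency, cumulative_errors, one_in_n). one_in_n is the
    N of the "1/N" error rate left after fixing every error at or
    below that value.
    """
    length = sum(frequencies.values())
    total_errors = sum(
        count * 10 ** (-value / 10.0)
        for value, count in frequencies.items()
    )
    rows = []
    cumulative_frequency = 0
    cumulative_errors = 0.0
    for value in sorted(frequencies):
        count = frequencies[value]
        expected = count * 10 ** (-value / 10.0)
        cumulative_frequency += count
        cumulative_errors += expected
        remaining = total_errors - cumulative_errors
        # Guard against floating point noise when nothing remains.
        one_in_n = length / remaining if remaining > 1e-9 else math.inf
        rows.append((value, count, expected, cumulative_frequency,
                     cumulative_errors, one_in_n))
    return length, total_errors, rows
```
+
+Using the earlier illustration of 50 bases with confidence 10 and 200
+bases with confidence 20, checking the 50 low-confidence bases would
+remove about 5 of the 7 expected errors and leave an error rate of
+about 1 in 125.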
+
+Above it states that there are 790 bases with confidence values of
+10 or less, and estimates there to be 157 errors in those 790 bases. 
+As we expect there to be about 169 errors in the whole consensus 
+this implies that manually checking
+those 790 bases would leave only 12 undetected errors. Given that the sequence
+length is 164068 bases this means an average error rate of 1 in 14069. 
+It is important to note that by using this editing strategy, this error
+rate would be achieved by checking only 0.48% of the total number of
+consensus bases. This strategy is realised by use of the consensus
+quality search in the gap4 Contig Editor
+(_fpref(Editor-Search-ConsQual, Search by Consensus Quality, t)).
+
+_split()
+ at node Con-ListBaseConf
+ at section List Base Confidence
+ at cindex Confidence of base calls
+ at cindex List base confidence
+
+The various base-callers may produce a confidence value for each base
+call. Previous sections describe how this may be used to produce a
+consensus sequence along with a consensus confidence.
+
+This function tabulates the frequency of each base confidence value
+along with a count of how many times it matches or mismatches the
+consensus. Given that the standard scale for confidence values follows
+the @i{-10log10(probability of error)} formula we can determine what
+the expected frequency of mismatches should be for any particular
+confidence value. By comparing this with our observed frequencies we
+then have a powerful summary of the amount of misassembled data.
+
+ at example
+Total bases considered : 45270
+Problem score          : 1.337130
+
+Conf.        Match        Mismatch           Expected      Over-
+value         freq            freq               freq  representation
+---------------------------------------------------------------------
+  0              0               0               0.00      0.00
+  1              0               0               0.00      0.00
+  2              0               0               0.00      0.00
+  3              0               0               0.00      0.00
+  4             37              22              23.49      0.94
+  5              0               0               0.00      0.00
+  6             89              46              33.91      1.36
+  7            119              26              28.93      0.90
+  8            256              37              46.44      0.80
+  9            368              30              50.11      0.60
+ 10            669              31              70.00      0.44
+...
+ at end example
+
+In the above example we see that there are 59 sequence bases with
+confidence 4, of which 37 match the consensus and 22 do not. If we
+work on the assumption that the consensus is correct then we would
+expect approximately 40% of these to be incorrect, but we have
+measured 37% to be incorrect (22/59) giving 0.94 fraction of the
+expected amount.
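+
+The "Expected freq" and "Over-representation" columns follow directly
+from the Phred formula. A minimal sketch, with a hypothetical function
+name:
+
```python
def overrepresentation(confidence, matches, mismatches):
    """Return (expected_mismatches, ratio) for one confidence value.

    expected_mismatches is the number of bases with this confidence
    multiplied by the error rate implied by the Phred scale; ratio is
    the observed mismatch count divided by that expectation.
    """
    total = matches + mismatches
    expected = total * 10 ** (-confidence / 10.0)
    ratio = mismatches / expected if expected > 0 else 0.0
    return expected, ratio
```
+
+For confidence 4 with 37 matches and 22 mismatches this gives an
+expected frequency of about 23.49 and a ratio of about 0.94, as in the
+table.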
+
+For a more problematic assembly, we may see a section of output like
+this:
+
+ at example
+Total bases considered : 1617511
+Problem score          : 311.591358
+
+Conf.        Match        Mismatch           Expected      Over-
+value         freq            freq               freq  representation
+---------------------------------------------------------------------
+...
+ 20          13432             384             138.16      2.78
+ 21          23384             851             192.51      4.42
+ 22          18763             487             121.46      4.01
+ 23          13712             300              70.23      4.27
+ 24          21182             363              85.77      4.23
+ 25          20466             218              65.41      3.33
+ 26           9752             123              24.80      4.96
+ 27          23071             282              46.60      6.05
+ 28          13816             158              22.15      7.13
+ 29          27514             166              34.85      4.76
+ 30          15664             140              15.80      8.86
+...
+ at end example
+
+We can see here that the observed mismatch frequency is much greater
+than the expected number. This indicates the number of misassemblies
+(or SNPs in the case of mixed samples) within this project and is
+reflected by the combined ``Problem score''. This score is simply the
+sum of the final column (or 1 over that column for values less than
+1.0).
diff --git a/manual/calc_consensus.extended.png b/manual/calc_consensus.extended.png
new file mode 100644
index 0000000..d6a4082
Binary files /dev/null and b/manual/calc_consensus.extended.png differ
diff --git a/manual/calc_consensus.normal.png b/manual/calc_consensus.normal.png
new file mode 100644
index 0000000..747178b
Binary files /dev/null and b/manual/calc_consensus.normal.png differ
diff --git a/manual/calc_consensus.quality.png b/manual/calc_consensus.quality.png
new file mode 100644
index 0000000..189dc6a
Binary files /dev/null and b/manual/calc_consensus.quality.png differ
diff --git a/manual/calc_consensus.unfinished.png b/manual/calc_consensus.unfinished.png
new file mode 100644
index 0000000..ebe23ef
Binary files /dev/null and b/manual/calc_consensus.unfinished.png differ
diff --git a/manual/cap2-t.texi b/manual/cap2-t.texi
new file mode 100644
index 0000000..aba1060
--- /dev/null
+++ b/manual/cap2-t.texi
@@ -0,0 +1,380 @@
+ at cindex Assembly: CAP2
+ at cindex CAP2 Assembly
+ at cindex Huang: Assembly (CAP2)
+ at cindex Assembly: Huang
+
+This mode of assembly uses the global assembly program CAP2, developed
+by Xiaoqiu Huang.
+ at cite{Huang, X. An improved sequence assembly program. Genomics 33, 21-31 
+(1996)}. 
+
+The CAP2 program can be accessed via the Gap4 interface through the "Assembly"
+menu or as a stand alone program.
+
+The CAP2 program, for use with Gap4, must be obtained via ftp from the
+author, Xiaoqiu Huang.
+
+Email Xiaoqiu Huang (huang@@cs.mtu.edu) stating that you want CAP2 for
+use with gap4 and the operating system for which you need the program
+(one of: SunOS 4.1.1; Solaris 2.4; DEC OSF/1 V3.0 and Digital Unix; Irix
+5.3).  He will then contact you to arrange for the retrieval of the
+binary file.  The binary file is called cap2_s. Make this executable (e.g.
+chmod a+x cap2_s) and move it to the directory
+ at code{$STADENROOT/$MACHINE-bin}. The CAP2 options on the "Assembly" menu
+should now be available.
+
+ at menu
+* Assembly-Perform CAP2 assembly:: Perform CAP2 assembly
+* Assembly-Import CAP2 assembly:: Import CAP2 assembly data
+* Assembly-Perform and import CAP2 assembly:: Perform and import CAP2 assembly
+* Assembly-Stand alone CAP2 assembly:: Stand alone CAP2 assembly
+ at end menu
+
+ at node Assembly-Perform CAP2 assembly
+ at subsection Perform CAP2 assembly
+ at cindex Assembly: perform CAP2 
+ at cindex CAP2 assembly: perform
+
+_picture(assembly.cap2,2.59167in)
+
+The assembly works on either a file or list of reading names in experiment
+file format (_fpref(Formats-Exp, Experiment File, formats)). 
+CAP2 assembles the readings and the alignments
+are written to the output window. Irrespective of the original file
+format, new reading files are written in the destination directory in
+experiment file format. If the destination directory does not already
+exist, then it is created. These new files contain the additional
+information required to recreate the same assembly within Gap4. This is
+done by the addition of an AP line. _oxref(Assembly-Directed, Directed
+Assembly). 
+
+It is also possible to tell the program to identify chimeric
+fragments, report repeat structures and resolve them by setting the
+"Find repeats/chimerics" radiobutton to "Yes". If this is set to "No",
+these tasks are not performed. At the present time, CAP2 can only
+resolve direct repeats and not reverse repeats.
+
+
+ at node Assembly-Import CAP2 assembly
+ at subsection Import CAP2 assembly
+ at cindex Assembly: import CAP2 
+ at cindex CAP2 assembly: import
+
+This mode imports the aligned sequences produced after CAP2 assembly into
+Gap4 and maintains the same alignment. Importing 
+the files requires the directory containing the newly aligned readings, i.e. 
+the destination directory used in "Perform CAP2 assembly". Readings which are
+not entered are written to a "list" or "file" specified in the "Save failures"
+entry box. This mode is functionally equivalent to "Directed assembly".
+_oxref(Assembly-Directed, Directed Assembly). 
+
+ at node Assembly-Perform and import CAP2 assembly
+ at subsection Perform and import CAP2 assembly
+ at cindex Assembly: perform and import CAP2 
+ at cindex CAP2 assembly: perform and import
+
+This mode performs both the assembly _oref(Assembly-Perform CAP2
+assembly, Perform CAP2 assembly) and the import _oref(Assembly-Import
+CAP2 assembly, Import CAP2 assembly) together. The assembled readings
+are written to the destination directory and then are automatically
+imported from this directory into Gap4.
+
+ at node Assembly-Stand alone CAP2 assembly
+ at subsection Stand alone CAP2 assembly
+ at cindex Assembly: stand alone CAP2 
+ at cindex CAP2 assembly: stand alone
+
+Alternatively, the program can be run as a stand alone program with the 
+following command line arguments:
+
+cap2_s -@{format@} file_of_filenames [-r] [-out destination_directory]
+
+@{format@} is the format of the input, which is either experiment file
+format or fasta format. Legal values are exp, EXP, fasta or
+FASTA.
+
+file_of_filenames is the name of the file containing the reading names to be
+assembled for experiment files or a single file of readings in fasta format.
+
+destination_directory is the name of the directory to which the new
+experiment files are written. The default directory is "assemble".
+
+-r is optional and is equivalent to the "Find repeats/chimerics" option above.
+
+
+ at subheading Further details about CAP2
+The comments provided with CAP2 by Huang are detailed below.
+
+ at display
+   copyright (c) 1995-96 Xiaoqiu Huang and Michigan Technological University
+   No part of this program may be distributed without prior written
+   permission of the author.
+
+        Xiaoqiu Huang
+        Department of Computer Science
+        Michigan Technological University
+        Houghton, MI 49931
+        E-mail: huang@@cs.mtu.edu
+
+        Proper attribution of the author as the source of the software would
+        be appreciated:
+             Huang, X. (1996)
+             An Improved Sequence Assembly Program
+             Genomics, 33:21-31.
+
+   The CAP2 program assembles short DNA fragments into long sequences.
+   CAP2 contains a number of improvements to the original version
+   described in Genomics 14, pages 18-25, 1992. These improvements are:
+
+   o  Use of a more efficient filter for quickly detecting pairs of
+      fragments that could not overlap.
+   
+   o  Accurate evaluation of overlap strengths through the use
+      of internally generated fragment-specific confidence vectors.
+
+   o  Identification of fragments from repetitive sequences and
+      resolution of ambiguities in assembly of those fragments.
+
+   o  Identification of chimeric fragments.
+
+   o  Automated refinement of poorly aligned regions of fragment
+      alignments
+
+   A chimeric fragment is made of two short pieces from non-adjacent
+   regions of the DNA molecule. CAP2 may report a repeat structure like:
+ at end display
+ at example
+        F1      5' flanking
+        F2      5' flanking
+        I1      Internal
+        I2      Internal
+        I3      Internal
+        T1      3' flanking
+        T2      3' flanking
+ at end example
+ at display
+   where F1, F2, I1, I2, I3, T1 and T2 are fragment names. The
+   structure means that I1 ,I2 and I3 are from two copies of
+   a repetitive element, F1 and F2 flank the two copies at their
+   5' end, T1 and T2 flank them at their 3' end.
+   CAP2 produces the two copies in the final sequence by
+   resolving the ambiguities in the repeat structure.
+
+   CAP2 is efficient in computer memory: a large number of DNA 
+   fragments can be assembled. The time requirement is acceptable;
+   for example, CAP2 took 1.5 hours to assemble 829 fragments of a total
+   of 393 kb nucleotides into a single contig on a Sun SPARC 5.
+   The program is written in C and runs on Sun workstations.
+
+   The CAP2 program can be run with the -r option. If this option
+   is specified, then the program identifies chimeric fragments,
+   reports repeat structures and resolves them.
+   Otherwise, these tasks are not performed.
+
+   Large integer values should be used for MATCH, MISMAT, EXTEND.
+
+   The comments given above are for CAP2. Written on Feb. 11, 95. 
+
+   Acknowledgements
+     
+      Kathryn Beal found a bug in the Filter procedure.
+      The array elen was not always initialized.
+
+   Below is a description of the parameters in the #define section of CAP.
+   Two specially chosen sets of substitution scores and indel penalties
+   are used by the dynamic programming algorithm: heavy set for regions
+   of low sequencing error rates and light set for fragment ends of high
+   sequencing error rates. (Use integers only.)
+ at end display
+ at example
+        Heavy set:                       Light set:
+
+        MATCH     =  2                   MATCH     =  2
+        MISMAT    = -6                   LTMISM    = -3
+        EXTEND    =  4                   LTEXTEN   =  2
+ at end example
+ at display
+    In the initial assembly, any overlap must be of length at least OVERLEN,
+    and any overlap/containment must be of identity percentage at least
+    PERCENT. After the initial assembly, the program attempts to join
+    contigs together using weak overlaps. Two contigs are merged if the
+    score of the overlapping alignment is at least CUTOFF. The value for
+    CUTOFF is chosen according to the value for MATCH.
+
+    POS5 and POS3 are fragment positions such that the 5' end between base 1
+    and base POS5, and the 3' end after base POS3 are of high sequencing
+    error rates, say more than 5%. For mismatches and indels occurring in
+    the two ends, light penalties are used.
+
+    Acknowledgments
+     The function diff() of Gene Myers is modified and used here.
+
+    A file of input fragments looks like:
+ at end display
+ at example
+>G019uabh
+ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAA
+GTCTTGCTTGAATTAAAGACTTGTTTAAACACAAAAATTTAGAGTTTTAC
+TCAACAAAAGTGATTGATTGATTGATTGATTGATTGATGGTTTACAGTAG
+GACTTCATTCTAGTCATTATAGCTGCTGGCAGTATAACTGGCCAGCCTTT
+AATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTTGGTATGATTT
+ATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCCATCAGTCTTAATC
+AGTCTTGTTACGTTATGACTAATCTTTGGGGATTGTGCAGAATGTTATTT
+TAGATAAGCAAAACGAGCAAAATGGGGAGTTACTTATATTTCTTTAAAGC
+>G028uaah
+CATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTGAATTAAAGACTTGTT
+TAAACACAAAATTTAGACTTTTACTCAACAAAAGTGATTGATTGATTGAT
+TGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGC
+TGGCAGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGC
+ATGTACTTAGAGTTGGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCT
+TCCCCATCCCATCAGTCT
+>G022uabh
+TATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTG
+TAGGTGATTGGGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCC
+CATTAAAACCCTTTATGCCCATACATCATAACACTACTTCCTACCCATAA
+GCTCCTTTTAACTTGTTAAAGTCTTGCTTGAATTAAAGACTTGTTTAAAC
+ACAAAATTTAGACTTTTACTCAACAAAAGTGATTGATTGATTGATTGATT
+GATTGAT
+>G023uabh
+AATAAATACCAAAAAAATAGTATATCTACATAGAATTTCACATAAAATAA
+ACTGTTTTCTATGTGAAAATTAACCTAAAAATATGCTTTGCTTATGTTTA
+AGATGTCATGCTTTTTATCAGTTGAGGAGTTCAGCTTAATAATCCTCTAC
+GATCTTAAACAAATAGGAAAAAAACTAAAAGTAGAAAATGGAAATAAAAT
+GTCAAAGCATTTCTACCACTCAGAATTGATCTTATAACATGAAATGCTTT
+TTAAAAGAAAATATTAAAGTTAAACTCCCCTATTTTGCTCGTTTTTGCTT
+ATCTAAAATACATTCTGCACAATCCCCAAAGATTGATCATACGTTAC
+>G006uaah
+ACATAAAATAAACTGTTTTCTATGTGAAAATTAACCTANNATATGCTTTG
+CTTATGTTTAAGATGTCATGCTTTTTATCAGTTGAGGAGTTCAGCTTAAT
+AATCCTCTAAGATCTTAAACAAATAGGAAAAAAACTAAAAGTAGAAAATG
+GAAATAAAATGTCAAAGCATTTCTACCACTCAGAATTGATCTTATAACAT
+GAAATGCTTTTTAAAAGAAAATATTAAAGTTAAACTCCCC
+ at end example
+ at display
+   A string after ">" is the name of the following fragment.
+   Only the five upper-case letters A, C, G, T and N are allowed
+   to appear in fragment data. No other characters are allowed.
+   A common mistake is the use of lower case letters in a fragment.
+
+   To run the program, type a command of the form
+
+        cap2 file_of_filenames [-r]
+
+   The output goes to the terminal screen, so redirect it to a file
+   to keep it. The output consists of three parts:
+   overview of contigs at fragment level, detailed display of contigs
+   at nucleotide level, and consensus sequences.
+   The output of CAP on the sample input data looks like:
+
+'+' = direct orientation; '-' = reverse complement
+ at end display
+ at example
+OVERLAPS            CONTAINMENTS
+
+******************* Contig 1 ********************
+G022uabh+
+G019uabh+
+                    G028uaah+ is in G019uabh+
+G023uabh-
+                    G006uaah- is in G023uabh-
+
+DETAILED DISPLAY OF CONTIGS
+******************* Contig 1 ********************
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   TATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTGTAGGTGATTG
+            ____________________________________________________________
+consensus   TATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTGTAGGTGATTG
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   GGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCCCATTAAAACCCTTTATGCCC
+            ____________________________________________________________
+consensus   GGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCCCATTAAAACCCTTTATGCCC
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+G019uabh+   ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+G028uaah+                            CATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+            ____________________________________________________________
+consensus   ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   AATTAAAGACTTGTTTAAACACAAAA-TTTAGACTTTTACTCAACAAAAGTGATTGATTG
+G019uabh+   AATTAAAGACTTGTTTAAACACAAAAATTTAGAGTTTTACTCAACAAAAGTGATTGATTG
+G028uaah+   AATTAAAGACTTGTTTAAACACAAAA-TTTAGACTTTTACTCAACAAAAGTGATTGATTG
+            ____________________________________________________________
+consensus   AATTAAAGACTTGTTTAAACACAAAA-TTTAGACTTTTACTCAACAAAAGTGATTGATTG
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   ATTGATTGATTGATTGAT                                          
+G019uabh+   ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
+G028uaah+   ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
+            ____________________________________________________________
+consensus   ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G019uabh+   AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
+G028uaah+   AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
+            ____________________________________________________________
+consensus   AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G019uabh+   GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCCATCAGTCTTAATC
+G028uaah+   GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCC-ATCAGTCT     
+            ____________________________________________________________
+consensus   GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCCATCAGTCTTAATC
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G019uabh+   AGTCTTGTTACGTTATGACT-AATCTTTGGGGATTGTGCAGAATGTTATTTTAGATAAGC
+G023uabh-         GTAACGT-ATGA-TCAATCTTTGGGGATTGTGCAGAATGT-ATTTTAGATAAGC
+            ____________________________________________________________
+consensus   AGTCTTGTAACGTTATGACTCAATCTTTGGGGATTGTGCAGAATGTTATTTTAGATAAGC
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G019uabh+   AAAA-CGAGCAAAAT-GGGGAGTT-A-CTT-A-TATTT-CTTT-AAA--GC         
+G023uabh-   AAAAACGAGCAAAATAGGGGAGTTTAACTTTAATATTTTCTTTTAAAAAGCATTTCATGT
+G006uaah-                   GGGGAGTTTAACTTTAATATTTTCTTTTAAAAAGCATTTCATGT
+            ____________________________________________________________
+consensus   AAAAACGAGCAAAATAGGGGAGTTTAACTTTAATATTTTCTTTTAAAAAGCATTTCATGT
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G023uabh-   TATAAGATCAATTCTGAGTGGTAGAAATGCTTTGACATTTTATTTCCATTTTCTACTTTT
+G006uaah-   TATAAGATCAATTCTGAGTGGTAGAAATGCTTTGACATTTTATTTCCATTTTCTACTTTT
+            ____________________________________________________________
+consensus   TATAAGATCAATTCTGAGTGGTAGAAATGCTTTGACATTTTATTTCCATTTTCTACTTTT
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G023uabh-   AGTTTTTTTCCTATTTGTTTAAGATCGTAGAGGATTATTAAGCTGAACTCCTCAACTGAT
+G006uaah-   AGTTTTTTTCCTATTTGTTTAAGATCTTAGAGGATTATTAAGCTGAACTCCTCAACTGAT
+            ____________________________________________________________
+consensus   AGTTTTTTTCCTATTTGTTTAAGATCGTAGAGGATTATTAAGCTGAACTCCTCAACTGAT
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G023uabh-   AAAAAGCATGACATCTTAAACATAAGCAAAGCATATTTTTAGGTTAATTTTCACATAGAA
+G006uaah-   AAAAAGCATGACATCTTAAACATAAGCAAAGCATATNNT-AGGTTAATTTTCACATAGAA
+            ____________________________________________________________
+consensus   AAAAAGCATGACATCTTAAACATAAGCAAAGCATATTTTTAGGTTAATTTTCACATAGAA
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G023uabh-   AACAGTTTATTTTATGTGAAATTCTATGTAGATATACTATTTTTTTGGTATTTATT
+G006uaah-   AACAGTTTATTTTATGT                                       
+            ____________________________________________________________
+consensus   AACAGTTTATTTTATGTGAAATTCTATGTAGATATACTATTTTTTTGGTATTTATT
+
+
+CONSENSUS SEQUENCES
+>Contig 1
+TATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTGTAGGTGATTG
+GGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCCCATTAAAACCCTTTATGCCC
+ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+AATTAAAGACTTGTTTAAACACAAAATTTAGACTTTTACTCAACAAAAGTGATTGATTG
+ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
+AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
+GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCCATCAGTCTTAATC
+AGTCTTGTAACGTTATGACTCAATCTTTGGGGATTGTGCAGAATGTTATTTTAGATAAGC
+AAAAACGAGCAAAATAGGGGAGTTTAACTTTAATATTTTCTTTTAAAAAGCATTTCATGT
+TATAAGATCAATTCTGAGTGGTAGAAATGCTTTGACATTTTATTTCCATTTTCTACTTTT
+AGTTTTTTTCCTATTTGTTTAAGATCGTAGAGGATTATTAAGCTGAACTCCTCAACTGAT
+AAAAAGCATGACATCTTAAACATAAGCAAAGCATATTTTTAGGTTAATTTTCACATAGAA
+AACAGTTTATTTTATGTGAAATTCTATGTAGATATACTATTTTTTTGGTATTTATT
+*/
+ at end example
diff --git a/manual/cap3-t.texi b/manual/cap3-t.texi
new file mode 100644
index 0000000..53b7c47
--- /dev/null
+++ b/manual/cap3-t.texi
@@ -0,0 +1,501 @@
+ at cindex Assembly: CAP3
+ at cindex CAP3 Assembly
+ at cindex Huang: Assembly (CAP3)
+ at cindex Assembly: Huang
+
+This mode of assembly uses the global assembly program CAP3, developed by Xiaoqiu Huang.
+ at cite{Huang, X. DNA Sequence Assembly under Forward-Reverse Constraints. In 
+preparation. (1998)}. 
+
+The CAP3 program can be accessed via the Gap4 interface through the "Assembly"
+menu or as a stand alone program.
+
+The CAP3 files for use with Gap4 must be obtained via ftp from the
+author, Xiaoqiu Huang.
+
+Email Xiaoqiu Huang (huang@@mtu.edu) stating that you want CAP3 for
+use with Gap4 and naming the operating system for which you need the program
+(one of: Solaris 2; Digital Unix; SGI Irix; Linux x86). He will then contact
+you to arrange retrieval of the binary files.  The binary files are called
+cap3_s and cap3_create_exp_constraints. Make these executable
+(e.g. chmod a+x cap3_s) and move them to the directory
+ at code{$STADENROOT/$MACHINE-bin}. The CAP3 options on the "Assembly" menu
+should now be available.
+
+ at menu
+* Assembly-Perform CAP3 assembly:: Perform CAP3 assembly
+* Assembly-Import CAP3 assembly:: Import CAP3 assembly data
+* Assembly-Perform and import CAP3 assembly:: Perform and import CAP3 assembly
+* Assembly-Stand alone CAP3 assembly:: Stand alone CAP3 assembly
+* Assembly-Further details about CAP3:: Further details about CAP3
+ at end menu
+
+ at node Assembly-Perform CAP3 assembly
+ at subsection Perform CAP3 assembly
+ at cindex Assembly: perform CAP3 
+ at cindex CAP3 assembly: perform
+
+_picture(assembly.CAP3,2.86667in)
+
+The assembly works on either a file or list of reading names in experiment
+file format (_fpref(Formats-Exp, Experiment File, formats)). 
+CAP3 assembles the readings and the alignments
+are written to the output window. New reading files are written in the 
+destination directory in experiment file format. If the destination directory 
+does not already exist, then it is created. These new files contain the 
+additional information required to recreate the same assembly within Gap4. 
+This is done by the addition of an AP line. 
+_oxref(Assembly-Directed, Directed Assembly). 
+
+CAP3 uses forward-reverse constraints to correct errors in assembly of reads.
+The constraints file is generated automatically from the information in the
+experiment files when the "Use constraint file" radiobutton is set to "Yes".
+The constraints file is named after the input file with ".con" appended, i.e.
+if the input file is called fofn, the constraint file is called fofn.con. Note
+that if "Use constraint file" is set to "No", then any files of the form
+input_file.con will be deleted from the current directory. For further details,
+_fpref(Assembly-Further details about CAP3, Further details about CAP3).
+
+CAP3 can also use quality values to determine the consensus sequence. If the 
+quality values are present in the experiment files, then they are automatically
+used. For further details,
+_fpref(Assembly-Further details about CAP3, Further details about CAP3).
+
+ at node Assembly-Import CAP3 assembly
+ at subsection Import CAP3 assembly
+ at cindex Assembly: import CAP3 
+ at cindex CAP3 assembly: import
+
+This mode imports the aligned sequences produced after CAP3 assembly into
+Gap4 and maintains the same alignment. Importing 
+the files requires the directory containing the newly aligned readings, i.e. 
+the destination directory used in "Perform CAP3 assembly". Readings which are
+not entered are written to a "list" or "file" specified in the "Save failures"
+entry box. This mode is functionally equivalent to "Directed assembly".
+_oxref(Assembly-Directed, Directed Assembly). 
+
+ at node Assembly-Perform and import CAP3 assembly
+ at subsection Perform and import CAP3 assembly
+ at cindex Assembly: perform and import CAP3 
+ at cindex CAP3 assembly: perform and import
+
+This mode performs both the assembly, 
+_fpref(Assembly-Perform CAP3 assembly, Perform CAP3 assembly) and the import, 
+_fpref(Assembly-Import CAP3 assembly, Import CAP3 assembly) together. The 
+assembled readings
+are written to the destination directory and then are automatically
+imported from this directory into Gap4.
+
+ at node Assembly-Stand alone CAP3 assembly
+ at subsection Stand alone CAP3 assembly
+ at cindex Assembly: stand alone CAP3 
+ at cindex CAP3 assembly: stand alone
+
+The program can alternatively be run as a stand alone program with the 
+following command line arguments:
+
+cap3_s -@i{format} file_of_filenames [-out destination_directory]
+
+@i{format} is the file format of the file of filenames, either experiment
+file format or fasta format. Legal values are exp, EXP, fasta or
+FASTA.
+
+file_of_filenames is the name of the file containing the reading names to be
+assembled for experiment files or a single file of readings in fasta format.
+
+destination_directory is the name of a directory to which the new
+experiment files are written. The default directory is "assemble".
+
+To use forward-reverse reading constraints, an appropriate 
+file_of_filenames.con file must exist in the current directory. This file 
+can be created from experiment files using the program:
+
+cap3_create_exp_constraints file_of_filenames
+
+where file_of_filenames is the same file as used for cap3_s. For fasta files,
+the constraint file is created using the program:
+
+formcon File_of_Reads Min_Distance Max_Distance
+
+See below for more information.
+
+If quality values are present in the experiment files, then these will be used
+automatically. For fasta files, the quality values must be in a separate file 
+of the type file_of_filenames.qual. See below for more information.
+
+ at node Assembly-Further details about CAP3
+ at subsection Further details about CAP3
+ at cindex Assembly: CAP3 information
+ at cindex CAP3 assembly: information
+
+The comments provided with CAP3 by Huang are detailed below.
+
+ at b{CONTIG ASSEMBLY PROGRAM Version 3 (CAP3)}
+
+copyright (c) 1998 Michigan Technological University
+No part of this program may be distributed without prior written
+permission of the author.
+
+ at display
+     Xiaoqiu Huang
+     Department of Computer Science
+     Michigan Technological University
+     Houghton, MI 49931
+     E-mail: huang@@cs.mtu.edu
+ at end display
+
+Proper attribution of the author as the source of the software would
+be appreciated:
+ at display
+     Huang, X. (1998)
+     DNA Sequence Assembly under Forward-Reverse Constraints.
+     In preparation.
+ at end display
+
+CAP3 uses forward-reverse constraints to correct errors in assembly of reads.
+CAP3 works better when more constraints are supplied.  If the file of sequence
+reads in FASTA format is named "xyz", then the file of forward-reverse
+constraints must be named "xyz.con".  Each line of the constraint file
+specifies one forward-reverse constraint of the form:
+
+ at display
+ReadA   ReadB    MinimumDistance    MaximumDistance
+ at end display
+
+where ReadA and ReadB are names of two reads, and MinimumDistance and
+MaximumDistance are distances (integers) in base pairs.  The constraint is
+satisfied if ReadA in forward orientation occurs in a contig before ReadB in
+reverse orientation, or ReadB in forward orientation occurs in a contig before
+ReadA in reverse orientation, and their distance is between MinimumDistance
+and MaximumDistance. We have a separate program to generate a constraint file
+from the sequence file.
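The satisfaction rule described above can be sketched in Python as follows. This is a minimal illustration and not CAP3's own code: the `Constraint` and `Placement` types, the function names, and in particular the way the distance is measured (from the start of the forward read to the end of the reverse read) are assumptions made for the example, since the text does not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    read_a: str
    read_b: str
    min_distance: int
    max_distance: int


@dataclass(frozen=True)
class Placement:
    """Hypothetical description of where a read landed in the assembly."""
    contig: int
    start: int      # leftmost contig position covered by the read
    end: int        # rightmost contig position covered by the read
    forward: bool   # True for '+' orientation, False for '-'


def parse_constraint(line):
    """Parse one 'ReadA ReadB MinimumDistance MaximumDistance' line."""
    read_a, read_b, lo, hi = line.split()
    return Constraint(read_a, read_b, int(lo), int(hi))


def is_satisfied(constraint, placements):
    """Apply the rule from the text to a dict of read name -> Placement."""
    a = placements.get(constraint.read_a)
    b = placements.get(constraint.read_b)
    if a is None or b is None or a.contig != b.contig:
        return False
    # Either read may be the forward one, as long as it comes first.
    for fwd, rev in ((a, b), (b, a)):
        if fwd.forward and not rev.forward and fwd.start < rev.start:
            # Assumption: distance spans from the forward read's start
            # to the reverse read's end.
            distance = rev.end - fwd.start
            if constraint.min_distance <= distance <= constraint.max_distance:
                return True
    return False
```

With this distance convention, a forward read starting at 100 and a reverse read ending at 3310 in the same contig are 3210 bases apart, which falls inside a 500 to 6000 range.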
+
+The program reports whether each constraint is satisfied or not. The report is
+in file @file{xyz.con.results}.  A sample report file is given here:
+
+ at example
+CPBKY55F  CPBKY55R  500  6000  3210  satisfied
+CPBKY92F  CPBKY92R  500  6000  497   unsatisfied in distance
+CPBKY28F  CPBKY28R  500  6000   unsatisfied
+CPBKY56F  CPBKY56R  500  6000   10th link between CPBKI23F+ and CPBKT37R-
+ at end example
+
+The first four columns are simply taken from the constraint file.
+
+Line 1 indicates that the constraint is satisfied; the actual distance
+between the two reads is given in the fifth column.
+
+Line 2 indicates that the constraint is not satisfied in distance, that is,
+the two reads in opposite orientation occur in the same contig, but their
+distance (given in the fifth column) is outside the given range.
+
+Line 3 indicates that the constraint is not satisfied.
+
+Line 4 indicates that this constraint is the 10th one that links two contigs,
+where the 3' read of one contig is @code{CPBKI23F} in plus orientation and the
+5' read of the other is @code{CPBKT37R} in minus orientation. The information
+suggests that the two contigs should go together in the gap closure phase.
+Information about corrections made using constraints is reported in a file named
+ at file{.info}.
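A report of this shape can be read back with a few lines of Python. The sketch below is only an illustration based on the four sample lines shown above; the function name and the returned dictionary keys are hypothetical, and it assumes that the fifth column, when present, is a plain non-negative integer.

```python
def parse_result_line(line):
    """Split one report line into its constraint, distance and status parts."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"unexpected report line: {line!r}")
    rest = fields[4:]
    distance = None
    # The distance column is only present when both reads share a contig.
    if rest[0].isdigit():
        distance = int(rest[0])
        rest = rest[1:]
    return {
        "read_a": fields[0],
        "read_b": fields[1],
        "min_distance": int(fields[2]),
        "max_distance": int(fields[3]),
        "distance": distance,
        "status": " ".join(rest),
    }
```

For example, the second sample line yields a distance of 497 with the status "unsatisfied in distance", while the fourth yields no distance and keeps the whole link description as its status.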
+
+A feature to use quality values in determination of consensus sequences has
+been added. The file of quality values must be named @file{xyz.qual}, where
+ at file{xyz} is the name of the sequence file.  Only the sequence file is given
+as an argument to the program.  All the other input files must be in the same
+directory.  CAP3 uses the same format of a quality file as Phrap.  The quality
+values of contig consensuses are given in file @file{xyz.contigs.qual}. The
+results of CAP3 go to the standard output.
+
+CAP3 also uses a more effective filter to speed up overlap computation.
+
+CAP3 assumes that the low-quality ends of sequence reads have been trimmed.
+Otherwise, CAP3 may not work well. We have a separate program to trim
+low-quality ends and to produce a corresponding Phred quality file.  If you
+need this program, please let us know.  We plan to remove this assumption in
+the future.
+
+The CAP3 program consists of two C source files: @file{cap3.c} and
+ at file{filter.c}. To produce the executable code named cap3, use the command:
+
+ at example
+cc -O  cap3.c filter.c -o cap3
+ at end example
+
+The usage is:
+
+ at example
+cap3  File_of_Reads  >  output
+ at end example
+
+The file @file{output} contains the output of CAP3.
+
+The features given above are new in CAP3. The text below describes CAP2.
+
+The CAP2 program assembles short DNA fragments into long sequences.
+CAP2 contains a number of improvements to the original version
+described in Genomics 14, pages 18-25, 1992. These improvements are:
+
+ at itemize @bullet
+ at item
+   Use of a more efficient filter for quickly detecting pairs of
+   fragments that could not overlap.
+
+ at item
+   Accurate evaluation of overlap strengths through the use
+   of internally generated fragment-specific confidence vectors.
+
+ at item
+   Identification of fragments from repetitive sequences and
+   resolution of ambiguities in assembly of those fragments.
+
+ at item
+   Identification of chimeric fragments.
+
+ at item
+   Automated refinement of poorly aligned regions of fragment
+   alignments.
+ at end itemize
+
+A chimeric fragment is made of two short pieces from non-adjacent
+regions of the DNA molecule. CAP2 may report a repeat structure like:
+
+ at example
+F1	5' flanking
+F2	5' flanking
+I1	Internal
+I2	Internal
+I3	Internal
+T1	3' flanking
+T2	3' flanking
+ at end example
+
+where F1, F2, I1, I2, I3, T1 and T2 are fragment names. The structure means
+that I1, I2 and I3 are from two copies of a repetitive element, F1 and F2
+flank the two copies at their 5' end, and T1 and T2 flank them at their 3' end.
+CAP2 produces the two copies in the final sequence by resolving the
+ambiguities in the repeat structure.
+
+CAP2 is memory-efficient, so a large number of DNA fragments can be
+assembled. The time requirement is acceptable; for example, CAP2 took 1.5
+hours to assemble 829 fragments totalling 393 kb into a single
+contig on a Sun SPARC 5.  The program is written in C and runs on Sun
+workstations.
+
+The CAP2 program can be run with the -r option. If this option is specified,
+then the program identifies chimeric fragments, reports repeat structures and
+resolves them.  Otherwise, these tasks are not performed.
+
+Large integer values should be used for MATCH, MISMAT, EXTEND.
+
+The comments given above are for CAP2. Written on Feb. 11, 95.
+
+ at display
+Acknowledgements
+  
+   I thank Gene Spier for finding a problem with quality values for
+   reverse complements.
+ at end display
+
+Below is a description of the parameters in the #define section of CAP.
+Two specially chosen sets of substitution scores and indel penalties
+are used by the dynamic programming algorithm: a heavy set for regions
+with low sequencing error rates and a light set for fragment ends with high
+sequencing error rates. (Use integers only.)
+
+ at example
+	Heavy set:			 Light set:
+
+	MATCH     =  2			 MATCH     =  2
+	MISMAT    = -6			 LTMISM    = -3
+	EXTEND    =  4			 LTEXTEN   =  2
+ at end example
+
+In the initial assembly, any overlap must be of length at least OVERLEN,
+and any overlap/containment must be of identity percentage at least
+PERCENT. After the initial assembly, the program attempts to join
+contigs together using weak overlaps. Two contigs are merged if the
+score of the overlapping alignment is at least CUTOFF. The value for
+CUTOFF is chosen according to the value for MATCH.
+
+POS5 and POS3 are fragment positions such that the 5' end between base 1
+and base POS5, and the 3' end after base POS3, have high sequencing
+error rates, say more than 5%. For mismatches and indels occurring in
+the two ends, light penalties are used.
+
+ at display
+Acknowledgments
+   The function diff() of Gene Myers is modified and used here.
+ at end display
+
+A file of input fragments looks like:
+
+ at example
+>G019uabh
+ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAA
+GTCTTGCTTGAATTAAAGACTTGTTTAAACACAAAAATTTAGAGTTTTAC
+TCAACAAAAGTGATTGATTGATTGATTGATTGATTGATGGTTTACAGTAG
+GACTTCATTCTAGTCATTATAGCTGCTGGCAGTATAACTGGCCAGCCTTT
+AATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTTGGTATGATTT
+ATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCCATCAGTCTTAATC
+AGTCTTGTTACGTTATGACTAATCTTTGGGGATTGTGCAGAATGTTATTT
+TAGATAAGCAAAACGAGCAAAATGGGGAGTTACTTATATTTCTTTAAAGC
+>G028uaah
+CATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTGAATTAAAGACTTGTT
+TAAACACAAAATTTAGACTTTTACTCAACAAAAGTGATTGATTGATTGAT
+TGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGC
+TGGCAGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGC
+ATGTACTTAGAGTTGGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCT
+TCCCCATCCCATCAGTCT
+>G022uabh
+TATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTG
+TAGGTGATTGGGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCC
+CATTAAAACCCTTTATGCCCATACATCATAACACTACTTCCTACCCATAA
+GCTCCTTTTAACTTGTTAAAGTCTTGCTTGAATTAAAGACTTGTTTAAAC
+ACAAAATTTAGACTTTTACTCAACAAAAGTGATTGATTGATTGATTGATT
+GATTGAT
+>G023uabh
+AATAAATACCAAAAAAATAGTATATCTACATAGAATTTCACATAAAATAA
+ACTGTTTTCTATGTGAAAATTAACCTAAAAATATGCTTTGCTTATGTTTA
+AGATGTCATGCTTTTTATCAGTTGAGGAGTTCAGCTTAATAATCCTCTAC
+GATCTTAAACAAATAGGAAAAAAACTAAAAGTAGAAAATGGAAATAAAAT
+GTCAAAGCATTTCTACCACTCAGAATTGATCTTATAACATGAAATGCTTT
+TTAAAAGAAAATATTAAAGTTAAACTCCCCTATTTTGCTCGTTTTTGCTT
+ATCTAAAATACATTCTGCACAATCCCCAAAGATTGATCATACGTTAC
+>G006uaah
+ACATAAAATAAACTGTTTTCTATGTGAAAATTAACCTANNATATGCTTTG
+CTTATGTTTAAGATGTCATGCTTTTTATCAGTTGAGGAGTTCAGCTTAAT
+AATCCTCTAAGATCTTAAACAAATAGGAAAAAAACTAAAAGTAGAAAATG
+GAAATAAAATGTCAAAGCATTTCTACCACTCAGAATTGATCTTATAACAT
+GAAATGCTTTTTAAAAGAAAATATTAAAGTTAAACTCCCC
+ at end example
+
+A string after ">" is the name of the following fragment.
+Only the five upper-case letters A, C, G, T and N are allowed
+to appear in fragment data. No other characters are allowed.
+A common mistake is the use of lower case letters in a fragment.
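Because lower-case letters are such a common mistake, it can help to check a fragment file before running the assembler. The following Python sketch is not part of CAP; the function name and the shape of the returned tuples are hypothetical. It reports every sequence line that contains a character other than A, C, G, T or N.

```python
ALLOWED = set("ACGTN")


def find_invalid_fragments(lines):
    """Return (fragment_name, line_number, bad_characters) for each bad line.

    `lines` is any iterable of text lines, for example an open file.
    """
    problems = []
    name = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith(">"):
            # A header line: remember the fragment name for later reports.
            name = line[1:].strip()
            continue
        bad = sorted(set(line) - ALLOWED)
        if bad:
            problems.append((name, number, "".join(bad)))
    return problems
```

Running it over an open file, as in `find_invalid_fragments(open("fragments"))` with a file name of your choosing, returns an empty list when every fragment is clean.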
+
+To run the program, type a command of the form
+
+ at example
+cap  file_of_fragments  
+ at end example
+
+The output goes to the terminal screen, so redirect it to a file
+to keep it. The output consists of three parts:
+overview of contigs at fragment level, detailed display of contigs
+at nucleotide level, and consensus sequences.
+The output of CAP on the sample input data looks like:
+
+'+' = direct orientation; '-' = reverse complement
+
+ at example
+OVERLAPS            CONTAINMENTS
+
+******************* Contig 1 ********************
+G022uabh+
+G019uabh+
+                    G028uaah+ is in G019uabh+
+G023uabh-
+                    G006uaah- is in G023uabh-
+
+DETAILED DISPLAY OF CONTIGS
+******************* Contig 1 ********************
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   TATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTGTAGGTGATTG
+            ____________________________________________________________
+consensus   TATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTGTAGGTGATTG
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   GGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCCCATTAAAACCCTTTATGCCC
+            ____________________________________________________________
+consensus   GGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCCCATTAAAACCCTTTATGCCC
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+G019uabh+   ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+G028uaah+                            CATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+            ____________________________________________________________
+consensus   ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   AATTAAAGACTTGTTTAAACACAAAA-TTTAGACTTTTACTCAACAAAAGTGATTGATTG
+G019uabh+   AATTAAAGACTTGTTTAAACACAAAAATTTAGAGTTTTACTCAACAAAAGTGATTGATTG
+G028uaah+   AATTAAAGACTTGTTTAAACACAAAA-TTTAGACTTTTACTCAACAAAAGTGATTGATTG
+            ____________________________________________________________
+consensus   AATTAAAGACTTGTTTAAACACAAAA-TTTAGACTTTTACTCAACAAAAGTGATTGATTG
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G022uabh+   ATTGATTGATTGATTGAT                                          
+G019uabh+   ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
+G028uaah+   ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
+            ____________________________________________________________
+consensus   ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G019uabh+   AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
+G028uaah+   AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
+            ____________________________________________________________
+consensus   AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G019uabh+   GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCCATCAGTCTTAATC
+G028uaah+   GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCC-ATCAGTCT     
+            ____________________________________________________________
+consensus   GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCCATCAGTCTTAATC
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G019uabh+   AGTCTTGTTACGTTATGACT-AATCTTTGGGGATTGTGCAGAATGTTATTTTAGATAAGC
+G023uabh-         GTAACGT-ATGA-TCAATCTTTGGGGATTGTGCAGAATGT-ATTTTAGATAAGC
+            ____________________________________________________________
+consensus   AGTCTTGTAACGTTATGACTCAATCTTTGGGGATTGTGCAGAATGTTATTTTAGATAAGC
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G019uabh+   AAAA-CGAGCAAAAT-GGGGAGTT-A-CTT-A-TATTT-CTTT-AAA--GC         
+G023uabh-   AAAAACGAGCAAAATAGGGGAGTTTAACTTTAATATTTTCTTTTAAAAAGCATTTCATGT
+G006uaah-                   GGGGAGTTTAACTTTAATATTTTCTTTTAAAAAGCATTTCATGT
+            ____________________________________________________________
+consensus   AAAAACGAGCAAAATAGGGGAGTTTAACTTTAATATTTTCTTTTAAAAAGCATTTCATGT
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G023uabh-   TATAAGATCAATTCTGAGTGGTAGAAATGCTTTGACATTTTATTTCCATTTTCTACTTTT
+G006uaah-   TATAAGATCAATTCTGAGTGGTAGAAATGCTTTGACATTTTATTTCCATTTTCTACTTTT
+            ____________________________________________________________
+consensus   TATAAGATCAATTCTGAGTGGTAGAAATGCTTTGACATTTTATTTCCATTTTCTACTTTT
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G023uabh-   AGTTTTTTTCCTATTTGTTTAAGATCGTAGAGGATTATTAAGCTGAACTCCTCAACTGAT
+G006uaah-   AGTTTTTTTCCTATTTGTTTAAGATCTTAGAGGATTATTAAGCTGAACTCCTCAACTGAT
+            ____________________________________________________________
+consensus   AGTTTTTTTCCTATTTGTTTAAGATCGTAGAGGATTATTAAGCTGAACTCCTCAACTGAT
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G023uabh-   AAAAAGCATGACATCTTAAACATAAGCAAAGCATATTTTTAGGTTAATTTTCACATAGAA
+G006uaah-   AAAAAGCATGACATCTTAAACATAAGCAAAGCATATNNT-AGGTTAATTTTCACATAGAA
+            ____________________________________________________________
+consensus   AAAAAGCATGACATCTTAAACATAAGCAAAGCATATTTTTAGGTTAATTTTCACATAGAA
+
+                .    :    .    :    .    :    .    :    .    :    .    :
+G023uabh-   AACAGTTTATTTTATGTGAAATTCTATGTAGATATACTATTTTTTTGGTATTTATT
+G006uaah-   AACAGTTTATTTTATGT                                       
+            ____________________________________________________________
+consensus   AACAGTTTATTTTATGTGAAATTCTATGTAGATATACTATTTTTTTGGTATTTATT
+
+
+CONSENSUS SEQUENCES
+>Contig 1
+TATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTGTAGGTGATTG
+GGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCCCATTAAAACCCTTTATGCCC
+ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
+AATTAAAGACTTGTTTAAACACAAAATTTAGACTTTTACTCAACAAAAGTGATTGATTG
+ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
+AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
+GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCCATCAGTCTTAATC
+AGTCTTGTAACGTTATGACTCAATCTTTGGGGATTGTGCAGAATGTTATTTTAGATAAGC
+AAAAACGAGCAAAATAGGGGAGTTTAACTTTAATATTTTCTTTTAAAAAGCATTTCATGT
+TATAAGATCAATTCTGAGTGGTAGAAATGCTTTGACATTTTATTTCCATTTTCTACTTTT
+AGTTTTTTTCCTATTTGTTTAAGATCGTAGAGGATTATTAAGCTGAACTCCTCAACTGAT
+AAAAAGCATGACATCTTAAACATAAGCAAAGCATATTTTTAGGTTAATTTTCACATAGAA
+AACAGTTTATTTTATGTGAAATTCTATGTAGATATACTATTTTTTTGGTATTTATT
+ at end example
diff --git a/manual/check_ass.png b/manual/check_ass.png
new file mode 100644
index 0000000..b20ebcf
Binary files /dev/null and b/manual/check_ass.png differ
diff --git a/manual/check_db-t.texi b/manual/check_db-t.texi
new file mode 100644
index 0000000..469173e
--- /dev/null
+++ b/manual/check_db-t.texi
@@ -0,0 +1,178 @@
+ at cindex Check database
+
+ at menu
+* Check-Database::              Database checks
+* Check-Contig::                Contig checks
+* Check-Reading::               Reading checks
+* Check-Anno::                  Anno checks
+* Check-Note::                  Note checks
+* Check-Template::              Template checks
+* Check-Vector::                Vector checks
+* Check-Clone::                 Clone checks
+ at end menu
+
+This function 
+(which is available from the gap4 File menu)
+is used to perform a check on the logical consistency of
+the database.  No user intervention is required. If the checks are passed,
+the message "Database is logically consistent" is written to the Output
+Window. If the database is found to be inconsistent, diagnostic
+messages appear in the Output Window and Doctor Database from the
+Edit menu should be used to correct the problem. _fxref(Doctor Database,
+Doctor Database, gap4)
+
+Several options, such as assembly, automatically run Check Database
+before executing. If the database is found to be inconsistent, the option
+will not continue. However, some checks are considered "non fatal" and will
+not block such operations. Currently the only non fatal checks are the
+positional checks for annotations and for readings that are never used. To fix
+the database, use the Doctor Database "ignore check database" setting to
+disable the inconsistency checking. _fxref(Doctor-IgnoreCheck, Ignoring Check
+Database, doctor_db)
+
+The following sections define the checks and the order in which they are
+performed.
+
+_split()
+ at node Check-Database
+ at section Database Checks
+ at cindex Check database: database checks
+
+ at itemize @bullet
+ at item Number of contigs used is <= number allocated
+ at item Disk and memory values for "number of contigs" are consistent
+ at item Number of readings used is <= number allocated
+ at item Disk and memory values for "number of readings" are consistent
+ at item Disk and memory values for "actual database size" are consistent
+ at item Actual database size <= maximum size
+ at item Data_class is either DNA(0) or protein(1).
+ at item Number of free annotations >= 0 and <= number allocated
+ at item Contig order is consistent
+ at item Number of free notes >= 0 and <= number allocated
+ at item First note has prev_type as GT_Database
+ at item Detect note loops
+ at end itemize
+
+ at node Check-Contig
+ at section Contig Checks
+ at cindex Check database: contig checks
+
+ at itemize @bullet
+ at item Has a left reading number
+ at item Has a right reading number
+ at item The left reading has no left neighbour
+ at item The right reading has no right neighbour
+ at item Chain right to
+ at itemize @minus
+ at item check loops
+ at item check holes
+ at item flag a reading as used
+ at end itemize
+ at item When finished chaining
+ at itemize @minus
+ at item check length is correct
+ at item check right reading number is correct
+ at end itemize
+ at item Reference only valid reading numbers
+ at item Chain left to
+ at itemize @minus
+ at item check loops
+ at item flag readings as used, if not already done in right chaining
+ at end itemize
+ at item When finished chaining, check left reading number is correct
+ at item Chain along annotation list to
+ at itemize @minus
+ at item flag as used
+ at item detect annotation loops
+ at item annotation is within the contig
+ at item annotation is rightwards of previous
+ at end itemize
+ at item First note has prev_type as GT_Contigs
+ at item Detect note loops
+ at end itemize
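+
+The right-chaining checks in the list above can be sketched as follows.
+This is a minimal illustration in Python, not gap4's own code: the
+function name, the dictionary arguments and the use of 0 to mean "no
+neighbour" are assumptions made for the example, and positions are
+taken to be 1-based.
+
```python
def check_contig_chain(left, rnbr, position, length):
    """Walk rightwards from reading `left`.

    rnbr, position and length are hypothetical dicts keyed by reading
    number. Returns (problems, readings visited, rightmost base covered).
    """
    problems, used = [], []
    seen = set()
    reading, covered_to = left, 0
    while reading != 0:
        if reading in seen:
            # Following right neighbours brought us back: a loop.
            problems.append(f"loop at reading {reading}")
            break
        seen.add(reading)
        used.append(reading)  # "flag a reading as used"
        if position[reading] > covered_to + 1:
            # Nothing seen so far reaches this reading: a hole.
            problems.append(f"hole before reading {reading}")
        covered_to = max(covered_to, position[reading] + length[reading] - 1)
        reading = rnbr.get(reading, 0)
    return problems, used, covered_to
```
+
+Once the walk finishes, the returned rightmost coordinate and the last
+reading visited can be compared with the stored contig length and right
+reading number.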
+
+ at node Check-Reading
+ at section Reading Checks
+ at cindex Check database: reading checks
+
+ at itemize @bullet
+ at item Memory and disk values tally for
+ at itemize @minus
+ at item left neighbour
+ at item right neighbour
+ at item relative position
+ at item length + sense
+ at end itemize
+ at item Left neighbour is a valid reading number
+ at item Right neighbour is a valid reading number
+ at item Reading is not used zero times
+ at item Reading is not used more than once
+ at item Hand holding: (lnbr[rnbr[reading]] == reading)
+ at item Relative position of reading >= position of left neighbour
+ at item Length != 0
+ at item Used sequence length == "right clip position" - "left clip position"
+ at item Has valid strand (0 or 1)
+ at item Has valid primer
+ at item Has valid sense (0 or 1)
+ at item Chain along annotation list to 
+ at itemize @minus
+ at item flag as used
+ at item detect annotation loops
+ at item annotation is rightwards of previous
+ at end itemize
+ at item First note has prev_type as GT_Readings
+ at item Detect note loops
+ at end itemize
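+
+The "hand holding" check in the list above simply confirms that
+neighbour links are symmetrical. A small Python sketch, assuming
+(hypothetically) that the links are held in two dictionaries and that 0
+means "no neighbour":
+
```python
def hand_holding_errors(lnbr, rnbr):
    """Return readings whose right neighbour does not point back at them.

    lnbr and rnbr are hypothetical dicts mapping a reading number to its
    left and right neighbour; 0 stands for "no neighbour".
    """
    bad = []
    for reading, right in rnbr.items():
        # Equivalent to testing lnbr[rnbr[reading]] == reading.
        if right != 0 and lnbr.get(right) != reading:
            bad.append(reading)
    return bad
```
+
+The same idea applies to the note check (note->next->prev == note).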
+
+ at node Check-Anno
+ at section Annotation Checks
+ at cindex Check database: annotation checks
+
+ at itemize @bullet
+ at item No loops in free annotation list
+ at item Is neither used nor is on the free list
+ at item Annotation is not used more than once
+ at item Is used, yet is still on the free list
+ at item Length >= 0
+ at item Has valid strand (0 or 1)
+ at end itemize
+
+ at node Check-Note
+ at section Note Checks
+ at cindex Check database: note checks
+
+ at itemize @bullet
+ at item No loops in free note list
+ at item Is neither used nor is on the free list
+ at item Hand holding: (note->next->prev == note)
+ at item Note is not used more than once
+ at item Is used, yet is still on the free list
+ at end itemize
+
+ at node Check-Template
+ at section Template Checks
+ at cindex Check database: template checks
+
+ at itemize @bullet
+ at item Minimum insert length <= maximum insert length
+ at item Has valid vector
+ at item Has valid clone
+ at item Has valid strand
+ at end itemize
+
+ at node Check-Vector
+ at section Vector Checks
+ at cindex Check database: vector checks
+
+ at itemize @bullet
+ at item Level > 0
+ at item Level <= MAX_LEVEL (MAX_LEVEL currently is 10; a "feasibility" check)
+ at end itemize
+
+ at node Check-Clone
+ at section Clone Checks
+ at cindex Check database: clone checks
+
+ at itemize @bullet
+ at item Has valid vector
+ at end itemize
diff --git a/manual/clip-t.texi b/manual/clip-t.texi
new file mode 100644
index 0000000..41b6cb2
--- /dev/null
+++ b/manual/clip-t.texi
@@ -0,0 +1,141 @@
+ at menu
+* Clip-Difference::             Difference clipping
+* Clip-Quality::                Quality clipping
+* Clip-QClipEnds::              Quality clip ends
+* Clip-NBases::                 N-Base clipping
+ at end menu
+
+ at cindex Clipping within Gap4
+
+Our consensus calculation algorithms use the data for all the unclipped
+bases covering each position in a contig. However, some assembly
+engines may leave the ends of readings unaligned, and these unaligned
+bases could lead to the production of an incorrect consensus. The
+clipping methods described here (which are available from the gap4 Edit
+menu) are designed to overcome this potential problem.
+
+In addition to improving the reliability of the consensus
+calculation, clipping in this way tidies up the alignments,
+so helping the user to concentrate on the better data. 
+It is important to note that in no case
+is the clipped sequence thrown away. The contig editor can show this hidden
+data, and the clip points may be manually adjusted to reveal any clipped
+sequence.
+
+_split()
+ at node Clip-Difference
+ at subsection Difference Clipping
+ at cindex Clipping by differences
+ at cindex Difference clipping
+
+_picture(difference_clip,3.175in)
+
+The difference clipping method 
+(which is available from the gap4 Edit menu)
+works in stages. First it calculates the 
+most likely consensus
+sequence. Then it compares each reading with that consensus sequence and
+identifies areas at the ends of the reading where there are enough
+differences to indicate the possibility of badly aligned bases. The clip
+points are adjusted accordingly.
+
+To identify the clip points for each reading the algorithm first finds
+a good matching segment near the middle of the reading. It then steps,
+base by base, from this point to the left, accumulating a score of +1
+for each match and -2 for each mismatch.
+It sets the left clip point at the position of the highest score.
+The right clip point is set in an equivalent way.
+These new clip points are used only if they are more severe than the
+existing ones. The portions of readings which have
+been clipped are then tagged using a @code{DIFF} tag type. To see
+which segments have been clipped use the contig editor search tool.
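+
+The leftward scan described above can be sketched in Python. This is
+only an illustration under stated assumptions: the reading and the
+consensus are taken to be already aligned strings in the same
+coordinates, the starting index of the good segment is supplied by the
+caller, and the function name is invented for the example.
+
```python
def left_clip_point(reading, consensus, start):
    """Scan leftwards from index `start` and return the index at which
    the running score (+1 match, -2 mismatch) is highest.

    Bases to the left of the returned index would be clipped.
    """
    score = best_score = 0
    best_pos = start
    for i in range(start - 1, -1, -1):
        score += 1 if reading[i] == consensus[i] else -2
        if score > best_score:
            best_score, best_pos = score, i
    return best_pos
```
+
+The right clip point would be found with the mirror-image scan. As the
+text says, a new clip point is used only if it is more severe than the
+existing one.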
+
+After clipping the algorithm then identifies any holes (breaks in the contigs)
+that may have been created and fills them up again by extending the
+sequence(s) with the fewest number(s) of expected errors.
+
+_split()
+ at node Clip-Quality
+ at subsection Quality Clipping
+ at cindex Clipping by quality
+ at cindex Quality clipping
+
+_picture(quality_clip,3.175in)
+
+The quality clipping function 
+(which is available from the gap4 Edit menu)
+clips the ends of readings when the average
+(over 31 bases) confidence value is lower than a user defined threshold.  As
+with the difference clipping method the clips are only adjusted when the newly
+calculated clip points are more stringent than the originals.
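+
+One way to picture a windowed-average clip is shown below. The manual
+states only that a 31 base average is compared with a threshold; how
+gap4 derives the clip point from those averages is not described here,
+so the rule used in this Python sketch (clip up to the first window
+whose mean reaches the threshold) and the function name are assumptions.
+
```python
def quality_left_clip(conf, threshold, window=31):
    """Return the start index of the first `window`-base stretch whose
    mean confidence is at least `threshold`.

    Returns len(conf) if no such stretch exists (everything is clipped).
    """
    if len(conf) < window:
        return len(conf)
    total = sum(conf[:window])
    for start in range(len(conf) - window + 1):
        if start > 0:
            # Slide the window one base to the right.
            total += conf[start + window - 1] - conf[start - 1]
        if total / window >= threshold:
            return start
    return len(conf)
```
+
+The right end could be treated by applying the same function to the
+reversed list of confidence values.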
+
+After clipping Gap4 then identifies any holes (breaks in the contigs) that may
+have been created and fills them up again by extending the sequence(s) with
+the fewest number of expected errors.
+
+An example output follows.
+ at example
+Hole from 32652 to 32725: extend #1378 and #1385 with 3.157324 expected errors
+ at end example
+
+We have observed that when using confidence values expressed as
+-10*log10(err_rate),
+it is sometimes better not to clip using the confidence values, but to
+use the difference clipping method 
+(_fpref(Clip-Difference, Difference Clipping, clip)).
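+
+For reference, the relationship between a confidence value and an
+error rate can be written out in Python. The sketch assumes the
+logarithm is base 10 (the Phred convention), and it assumes that the
+"expected errors" reported when filling holes is the sum of per-base
+error probabilities; the function names are invented for illustration.
+
```python
import math

def confidence_value(err_rate):
    """Confidence for a given error rate: -10 * log10(err_rate)."""
    return -10 * math.log10(err_rate)

def error_probability(confidence):
    """Inverse of confidence_value()."""
    return 10 ** (-confidence / 10)

def expected_errors(confidences):
    """Sum of error probabilities over a run of bases (an assumed
    definition of 'expected errors')."""
    return sum(error_probability(c) for c in confidences)
```
+
+Under these assumptions a confidence of 20 corresponds to an error
+rate of 1 in 100, and a confidence of 30 to 1 in 1000.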
+
+_split()
+ at node Clip-QClipEnds
+ at subsection Quality Clip Ends
+ at cindex Clipping by quality, ends only
+ at cindex Quality clip ends
+
+_picture(quality_clip_ends,3.30833in)
+
+This function performs a similar analysis to Quality Clipping, but
+trims only the ends of contigs. This can be useful as Phrap
+automatically clips where sequences disagree, but the ends of contigs
+will not be trimmed in such a manner. After trimming such poor quality
+data from the ends, Find Internal Joins may find some problematic matches.
+
+_split()
+ at node Clip-NBases
+ at subsection N-Base Clipping
+ at cindex Clipping by N bases
+ at cindex N-base clipping
+
+_picture(NBase_clip,3.30833in)
+
+The purpose of this function is to remove runs of @code{N}s or @code{-}s
+from the ends of sequences. Other bases may be interspersed in a run of
+dashes and the run will still be clipped, provided there are a
+sufficient number of non-A/C/G/T base calls. The exact algorithm for
+determining where a 'run' will stop is as follows:
+
+ at enumerate
+ at item
+Set score to zero
+
+ at item
+For each base call add 1 for @code{N} or @code{-}, -1 for @code{A},
+ at code{C}, @code{G} or @code{T}, zero for anything else.
+
+ at item
+Terminate when the score < -10.
+
+ at item
+Set the clip point at the highest score observed.
+ at end enumerate
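+
+The four steps above translate almost directly into code. The Python
+sketch below handles the left end of a sequence only; the function
+name is invented, and taking the first position at which the highest
+score occurs is an assumption, as the text does not say how ties are
+resolved.
+
```python
def n_base_left_clip(seq):
    """Return the number of leading base calls to clip from `seq`."""
    score = best_score = 0          # step 1: set score to zero
    clip = 0
    for i, base in enumerate(seq.upper()):
        if base in "N-":            # step 2: +1 for N or -
            score += 1
        elif base in "ACGT":        #         -1 for A, C, G or T
            score -= 1
        if score < -10:             # step 3: terminate
            break
        if score > best_score:      # step 4: remember the highest score
            best_score, clip = score, i + 1
    return clip
```
+
+The right end can be handled by passing the reversed sequence. On a
+sequence that starts with good data the score never rises above zero,
+so nothing is clipped.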
+
+Generally this will have no effect (when used on good data). It can never
+'grow' a sequence (by extending the cutoffs into the good data). It will
+never form a hole in a contig by clipping all sequences in a region (as
+it will extend the data from both ends of the hole to join it back
+together again).
+
diff --git a/manual/comparator-t.texi b/manual/comparator-t.texi
new file mode 100644
index 0000000..fff7769
--- /dev/null
+++ b/manual/comparator-t.texi
@@ -0,0 +1,185 @@
+ at menu
+* Compar-Examining::            Examining Results
+* Compar-AutoNavigation::       Automatic Match Navigation
+ at end menu
+
+ at cindex Comparator window
+ at cindex Contig Comparator
+
+_ifdef([[_gap4]],[[__Prog__ commands such as Find Internal Joins
+(_fpref(FIJ, Find Internal Joins, fij)), Find Repeats (_fpref(Repeats,
+Find Repeats, repeats)), Check Assembly (_fpref(Check Assembly, Check
+Assembly, check_ass)), and Find Read Pairs (_fpref(Read Pairs, Find
+Read Pairs, read_pairs))]],[[__Prog__ commands such as Find Internal Joins
+(_fpref(FIJ, Find Internal Joins, fij)) and Find Repeats (_fpref(Repeats,
+Find Repeats, repeats))]])
+automatically transform the Contig Selector (_fpref(Contig Selector,
+Contig Selector, contig_selector)) to produce the Contig Comparator.  To
+produce this transformation a copy of the Contig Selector is added at
+right angles to the original window to create a two dimensional
+rectangular surface on which to display the results of comparing or
+checking contigs. Each of the functions plots its results as diagonal
+lines of different colours.  If the plotted points are close to the main
+diagonal they represent results from pairs of contigs that are in the
+correct relative order.  Lines parallel to the main diagonal represent
+contigs that are in the correct relative orientation to one another.
+Those perpendicular to the main diagonal show results for which one
+contig would need to be reversed before the pair could be joined.  The
+manual contig dragging procedure can be used to change the relative
+positions of contigs.  _fxref(Contig-Selector-Order, Changing the Contig
+Order, contig_selector) As the contigs are dragged the plotted results
+will be automatically moved to their corresponding new positions.  This
+means that if users drag the contigs to move their plotted results close
+to the main diagonal they will be simultaneously putting their contigs
+into the correct relative positions.
+
+_ifdef([[_gap4]],[[Because this plot can simultaneously show the results of independent
+types of search, users can see if different analyses produce
+corroborating evidence for the ordering of contigs.  Also if, for
+example, a result from Check Assembly lies on the same horizontal or
+vertical projection as a result from Find Repeats, users can see the
+alternative position to place the doubtful reading. That is, this is an
+indication that a reading may have been assembled in an incorrect
+position.]])
+
+By use of popup menus the plotted results can be used to invoke a subset
+of commands.  For example if the user clicks the right mouse button over
+a result from Find Internal Joins a menu containing Invoke Join Editor
+(_fpref(Editor-Joining, The Join Editor, contig_editor)) and Invoke
+Contig Editors (_fpref(Editor, Editing in __prog__, contig_editor))
+will pop up. If the user selects Invoke Join Editor the Join Editor will
+be started with the two contigs aligned at the match position contained
+in the result. If required one of the contigs will be complemented to
+allow their alignment.
+
+_ifdef([[_gap4]],[[A typical display from the Contig Comparator is
+shown below. It includes results for Find Internal Joins in black,
+Find Repeats in red, Check Assembly in green, and Find Read Pairs in
+blue. Notice that there are several Find Internal Joins, Find Read
+Pairs and Find Repeats results close to the main diagonal near the top
+left of the display, indicating that the contigs represented in that
+area are likely to be in the correct relative positions to one
+another.  In the middle of the bottom right quadrant there is a blue
+diagonal line perpendicular to the main diagonal which indicates a
+pair of contigs that are in the wrong relative orientation.]],
+[[A typical display from the Contig Comparator is shown below. It
+includes results for Find Internal Joins in black, Find Repeats in
+red and Sequence Search in green. The currently highlighted item is
+shown in pink with a summary at the bottom of the screen. The
+orientation of this is from top-left to bottom-right indicating that
+the match is in the same orientation within both contigs (we can see
+some in the opposite orientation indicating that we need to reverse
+complement either of the two contigs before attempting any joins,
+although this will happen automatically).]])
+The crosshairs show the positions for a pair of contigs. The vertical
+line continues into the Contig Selector part of the display, and the
+position represented by the horizontal line is also duplicated there.
+
+_ifdef([[_gap4]],[[
+_lpicture(comparator,5.325in)
+]],[[
+_lpicture(gap5_comparator,5.25833in)
+]])
+
+_split()
+ at node Compar-Examining
+ at section Examining Results and Using Them to Select Commands
+ at cindex Contig Comparator: manipulating results
+
+Moving the cursor over plotted results highlights them, and the
+information line
+gives a brief description of the currently highlighted match. This is in
+the form:
+
+ at var{match name}: @var{contig1_number}@@@var{position_in_contig1},
+with @var{contig2_number}@@@var{position_in_contig2},
+ at var{length_of_the_match}
+
+For Find Internal Joins the percentage mismatch is also displayed.
+
+Several operations can be performed on each match. Pressing the right
+mouse button over a match invokes a popup menu.  This menu will contain
+a set of options which depends on the type of result to which the match
+corresponds. The following is a complete list, but not all will appear
+for each type of result.
+
+ at table @var
+ at item Information
+ at cindex Information, in Contig Comparator
+Sends a textual description of the match to the Output Window.
+
+ at cindex Hide, in Contig Comparator
+ at cindex Invoke contig editors, in Contig Comparator
+ at cindex Invoke contig join editors, in Contig Comparator
+_ifdef([[_gap4]],[[@cindex Invoke template display, in Contig Comparator]])
+
+ at item Hide
+Removes the match from the Contig Comparator. The match can be revealed
+again by using "Reveal all" within the Results Manager.
+
+ at item Invoke contig editors
+ at itemx Invoke join editors
+_ifdef([[_gap4]],[[@itemx Invoke template display]])
+
+When invoked these options bring up their respective
+displays to show the match in greater detail. 
+
+ at item Remove
+ at cindex Remove, in Contig Comparator
+Removes the match from the Contig Comparator. The match
+cannot be  revealed again by using "Reveal all" within the
+Results Manager.
+ at end table
+
+One of the items in the popup menu may have an asterisk next to it. This is
+the default operation which can also be performed by double clicking the left
+mouse button on the match.
+For Repeat or Find Internal Joins matches this will normally be the Join
+Editor, or two Contig Editors when the match is between two points in
+the same contig. _ifdef([[_gap4]],[[For Read Pairs two Template Displays are shown.]])
+
+The crosshairs can be toggled on and off and a diagonal line going from
+top left to bottom right of the plot can also be displayed if required.
+This is useful as a guide for moving the contigs such that their matches
+lie upon the diagonal line.
+
+The "Results" menu on the contig selector window provides a similar mechanism
+of accessing results, but at the level of all matches in a particular search.
+This is simply a menu driven interface to the Results Manager window
+(_fpref(Results, Results Manager, __prog__)), but containing only the results
+relevant to the contig comparator window.
+
+_split()
+ at node Compar-AutoNavigation
+ at section Automatic Match Navigation
+ at cindex Contig comparator: auto navigation
+ at cindex Contig comparator: next button
+ at cindex Next button, in Contig comparator
+ at cindex Sort Matches
+
+The "Next" button of the contig comparator window automatically invokes the
+default operation on the next match from the current active result. This
+provides a mechanism to step through each match in turn ensuring that no
+matches have been missed.
+
+With a single result (set of matches) plotted, the "Next" button simply steps
+through each match in turn until all have been seen. Moving the mouse above
+the "Next" button, without pressing it, highlights the next match and
+displays brief information about it in the status line at the bottom of the
+window. To step through the matches in "best first" order, select the "Sort
+Matches" option from the relevant name in the Results menu. The exact order is
+dependent on the result in question, but is generally arranged to be the most
+interesting ones first. _ifdef([[_gap4]],[[For example, Find Internal
+Joins shows the lowest mismatch first whilst Check Assembly shows the
+highest mismatches first.]])
+
+Bringing up another result now directs "Next" to step through each of the new
+matches. To change the result that "Next" operates on, use the Result menu to
+select the "Use for 'Next'" option in the desired result. Alternatively,
+double clicking on a match also causes "Next" to process the list starting
+from the selected result.
+
+The "Next" scheme remembers any matches that have been previously examined
+either by itself or by manually double clicking, and will skip these. To clear
+this 'visited' information select "Reset 'Next'" in the Results Manager.
+
diff --git a/manual/comparator.png b/manual/comparator.png
new file mode 100644
index 0000000..d918b3c
Binary files /dev/null and b/manual/comparator.png differ
diff --git a/manual/comparator.small.png b/manual/comparator.small.png
new file mode 100644
index 0000000..cf5c416
Binary files /dev/null and b/manual/comparator.small.png differ
diff --git a/manual/complement-t.texi b/manual/complement-t.texi
new file mode 100644
index 0000000..a9794cc
--- /dev/null
+++ b/manual/complement-t.texi
@@ -0,0 +1,123 @@
+_split()
+ at node Complement
+ at section Complement a Contig
+
+This function 
+(which is available from the gap4 Edit menu)
+is used to complement a contig, which means that it will
+complement and reverse all its readings and reorder them to produce a
+contig with the opposite orientation. It operates on a single contig
+selected via a dialogue box.
+
+_split()
+ at node Enter Tags
+ at section Enter Tags
+ at cindex tags: entering from a file
+ at cindex entering tags from file
+ at cindex annotations: entering from a file
+ at cindex entering annotations from file
+
+
+This routine 
+(which is available from the gap4 Edit menu)
+is used to add a set of tags (_fpref(Intro-Anno, Annotation
+readings and contigs, gap4)) stored in a file, to the database. The file
+format (see below) is identical to the output produced by the "save tags to
+file" option of "Find Repeats". _fxref(Repeats, Find Repeats, repeats) The
+format is a subset of the experiment file format. _fxref(Formats-Exp,
+Experiment Files, formats) The two are close enough for Enter tags to use an
+experiment file as input. The only input required is the name of the file to
+read and a file browser can be used to aid its selection.
+
+Note that "Enter tags" will remove any results plotted in the Contig 
+Comparator.
+
+The start of a typical file is shown below.
+
+ at example
+CC   Repeat number 0, end 1
+ID   zf48g3.s1
+TC   REPT b 1031..1072
+TC        Repeats with contig zf48g3.s1, offset 957
+CC   Repeat number 0, end 2
+ID   zf48g3.s1
+TC   REPT b 957..998
+TC        Repeats with contig zf48g3.s1, offset 1031
+CC   
+CC   Repeat number 1, end 1
+ID   zf48g3.s1
+TC   REPT b 1102..1130
+TC        Repeats with contig zf48g3.s1, offset 953
+CC   Repeat number 1, end 2
+ID   zf48g3.s1
+TC   REPT b 953..981
+TC        Repeats with contig zf48g3.s1, offset 1102
+ at end example
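+
+To make the layout of such a file concrete, here is a small Python
+sketch that reads the records shown above. It is not part of the
+package: the function name and the dictionary keys are invented, the
+single letter after the tag type is assumed to be a strand code, and
+only the CC, ID and TC lines seen in this example are handled.
+
```python
import re

# "TC   TYPE S start..end" starts a tag; other TC lines continue its comment.
TAG_LINE = re.compile(r"^TC\s+(\S+)\s+(\S)\s+(\d+)\.\.(\d+)\s*$")

def parse_tag_file(text):
    """Return a list of tag dicts parsed from experiment-file style text."""
    tags, current_id = [], None
    for line in text.splitlines():
        if line.startswith("ID"):
            current_id = line[2:].strip()
        elif line.startswith("TC"):
            m = TAG_LINE.match(line)
            if m:
                tags.append({
                    "id": current_id,
                    "type": m.group(1),
                    "strand": m.group(2),   # assumed meaning of the letter
                    "start": int(m.group(3)),
                    "end": int(m.group(4)),
                    "comment": "",
                })
            elif tags:
                extra = line[2:].strip()
                tags[-1]["comment"] = (tags[-1]["comment"] + " " + extra).strip()
        # CC lines are comments and are ignored.
    return tags
```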
+
+_split()
+ at node Shuffle Pads
+ at section Shuffle Pads
+ at cindex pads: realigning
+ at cindex realigning sequences
+ at cindex shuffle pads
+
+This function realigns all of the sequences within a contig to improve
+pad placement. This can be considered as the replacement for the old
+Shuffle Pads command within the contig editor. (Being outside of the
+editor allows this to be automatically scripted.) The contigs to
+realign are specified as a single contig, as all contigs, or as a
+set of contig names read from a file or a gap4 list. Currently the entire
+contig will be shuffled, which can take some time on large contigs. In
+future we plan to allow regions to be specified.
+
+Padding (gapping) problems originate in many sequence assembly
+algorithms, including gap4's, where sequences are aligned against a
+consensus rather than a profile. As an example let us consider
+aligning @code{TCAAGAC} (Sequence4) to the following contig:
+
+ at example
+Sequence1:    GATTCAAAGAC
+Sequence2:      TTCAA*GACGG
+Sequence3:        CAAAGACGGATC
+
+Consensus:    GATTCAAAGACGGATC
+ at end example
+
+The consensus contains a triple A because that is the most likely
+sequence, however we have three possible ways to align a sequence
+containing double A:
+
+ at example
+alignment1:      TCAA*GAC
+alignment2:      TCA*AGAC
+alignment3:      TC*AAGAC
+Consensus:    GATTCAAAGACGGATC
+ at end example
+
+All of these have identical alignment scores because the cost of
+inserting a gap into the sequence is identical at all
+points. Alignment algorithms typically always pick the same end to
+place pads (i.e. left end or right end), but after contigs get
+complemented and more data inserted this often yields pads at both ends,
+as follows:
+
+ at example
+Sequence1:    GATTCAAAGAC
+Sequence2:      TTCAA*GACGG
+Sequence3:        CAAAGACGGATC
+Sequence4:       TC*AAGAC
+Consensus:    GATTCAAAGACGGATC
+ at end example
+
+The new Shuffle Pads algorithm implements the same ideas put forward
+by Anson and Myers in ReAligner. It aligns each sequence against a
+consensus vector where the entire column of bases in the consensus is
+used to compute match, mismatch and indel scores. The result is that
+pads generally get shuffled to the same end (not necessarily always
+left or always right) and the total number of disagreements to the
+consensus reduces.
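+
+The difference between scoring against a plain consensus and scoring
+against whole columns can be shown with the example contig used above.
+The Python sketch below uses a deliberately simplified cost (the
+fraction of bases in each column that disagree with the candidate); it
+is not the scoring scheme used by gap4 or by ReAligner, and all names
+in it are invented for illustration.
+
```python
from collections import Counter

def column_counts(rows):
    """Build per-column base counts from (offset, sequence) rows."""
    width = max(off + len(seq) for off, seq in rows)
    cols = [Counter() for _ in range(width)]
    for off, seq in rows:
        for i, base in enumerate(seq):
            cols[off + i][base] += 1
    return cols

def consensus_mismatches(cand, offset, consensus):
    """Count positions where the candidate differs from the consensus."""
    return sum(1 for i, b in enumerate(cand) if consensus[offset + i] != b)

def profile_cost(cand, offset, cols):
    """Sum, over columns, the fraction of bases disagreeing with cand."""
    cost = 0.0
    for i, b in enumerate(cand):
        col = cols[offset + i]
        total = sum(col.values())
        cost += (total - col[b]) / total
    return cost

rows = [(0, "GATTCAAAGAC"), (2, "TTCAA*GACGG"), (4, "CAAAGACGGATC")]
consensus = "GATTCAAAGACGGATC"
candidates = ["TCAA*GAC", "TCA*AGAC", "TC*AAGAC"]

cols = column_counts(rows)
plain = [consensus_mismatches(c, 3, consensus) for c in candidates]
profiled = [profile_cost(c, 3, cols) for c in candidates]
best = candidates[profiled.index(min(profiled))]
```
+
+Against the plain consensus all three placements differ at exactly one
+position, so they cannot be told apart. With the column counts, the
+placement that puts the pad in the same column as the pad in Sequence2
+has the lowest cost, which is the behaviour described above.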
+
+For speed we acknowledge that the new alignment will only deviate
+slightly from the old one and so a narrow ``band size'' is used. This
+parameter may be adjusted if required, but at the expense of speed.
+
diff --git a/manual/conf_values_p.png b/manual/conf_values_p.png
new file mode 100644
index 0000000..d66cddd
Binary files /dev/null and b/manual/conf_values_p.png differ
diff --git a/manual/conf_values_p.small.png b/manual/conf_values_p.small.png
new file mode 100644
index 0000000..5a5c19d
Binary files /dev/null and b/manual/conf_values_p.small.png differ
diff --git a/manual/configure-t.texi b/manual/configure-t.texi
new file mode 100644
index 0000000..2834e44
--- /dev/null
+++ b/manual/configure-t.texi
@@ -0,0 +1,583 @@
+ at c ----------------------------------------------------------------------
+
+ at node Conf-Introduction
+ at section Introduction
+ at cindex .gaprc
+ at cindex gaprc
+ at cindex .tk_utilsrc
+ at cindex tk_utilsrc
+ at cindex Colour blindness
+
+ at menu
+* Conf-Consensus Algorithm::    Consensus Algorithm
+* Conf-Set Maxseq::             Set Maxseq
+* Conf-Fonts::                  Set Fonts
+_ifdef([[_unix]],[[* Conf-Colour::                 The Colour Configuration Window]])
+* Conf-Configure Menus::        Configuring Menus
+* Conf-Set Genetic Code::       Set Genetic Code
+* Conf-Alignment Scores::       Alignment Scores
+* Conf-Trace File Location::    Trace File Location
+* Conf-Tag::                    The Tag Selector
+* Conf-GTAGDB::                 Tag GTAGDB File
+* Conf-Template Status::        Template Status
+ at end menu
+
+The Options menu allows selection of the Consensus algorithm and the
+genetic code to use, and adjustment of various parameters
+used throughout gap4. It also provides a way of setting
+more trivial things such as fonts and colours.
+
+Most of these options have "OK Permanent" buttons in addition to the normal
+"OK" button. The "OK Permanent" button will save the current settings to the
+ at file{.gaprc} file in the user's home directory.
+_ifdef([[_windows]],[[On Windows 95 this may be @code{C:\}.]])
+
+In general users will not need to be aware of this method as the most important
+configuration options are all available from within the graphical user
+interface. However there are many additional configurable parameters which may
+be referred to throughout the manual. These too are stored in the
+ at file{.gaprc} files.
+
+When gap4 starts up it will first load the complete set of configurations from
+the @file{$STADENROOT/tables/gaprc} file. Next it loads @file{.gaprc} from the
+user's home directory, and finally @file{.gaprc} from the user's current
+project directory.  This means that the settings stored in the @file{.gaprc}
+file in the user's project directory will have priority over those found in
+the home directory, which, in turn, have priority over those found in the Staden
+Package installation directories.
+
+Note that searching for the @file{.gaprc} files only applies when starting
+gap4 and not when opening new or different databases.
+_ifdef([[_windows]],[[Hence if the user double clicks on a database
+ at file{.aux} file then gap4 will read
+the @file{.gaprc} file found in the same directory as the database. If users
+start up gap4 from the Start menu and then open the project, the
+ at file{.gaprc} file in the project directory will not be read.
+]]) _ifdef([[_unix]],[[Hence if the user changes directory to their project
+directory and starts gap4, then gap4 will read the @file{.gaprc} file found in
+that directory. If the user starts up gap4 from another directory and then
+uses the filebrowser to open a database, the @file{.gaprc} file in the project
+directory will not be read.]])
+
+The format of commands in the @file{.gaprc} file is:
+
+ at quotation
+"#" followed by anything is a comment.
+
+"set_def VARIABLE value" sets the parameter "VARIABLE" to the value
+"value".  Note that value must be enclosed in double quotes if it
+contains spaces.
+
+"set_defx temp VARIABLE value" sets a parameter in a temporary list
+named "temp". This has no effect unless it is then used within a
+set_def command.  In this case we use "$temp" as the "value" parameter
+of a set_def command.
+ at end quotation
+
+An example follows:
+
+ at example
+ at group
+set_def FIJ.MAXMIS.VALUE		30.00
+
+set_def	TEMPLATE.PRIMER_REVERSE_COLOUR	"green"
+
+set_def CONTIG_EDITOR.DISAGREE_MODE	2
+set_def CONTIG_EDITOR.DISAGREE_CASE	0
+set_def CONTIG_EDITOR.MAX_HEIGHT	25
+ at end group
+ at end example
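+
+The load order described earlier (installation file, then home
+directory, then project directory) means that later files override
+earlier ones. The Python sketch below illustrates that layering for
+simple lines such as those in the example above. It is an
+approximation only: the real files are read by the package itself, the
+function names are invented, and set_defx is not handled.
+
```python
import shlex

def parse_gaprc(text):
    """Collect 'set_def NAME value' lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        # shlex honours double quotes and drops '#' comments.
        words = shlex.split(line, comments=True)
        if len(words) == 3 and words[0] == "set_def":
            settings[words[1]] = words[2]
    return settings

def load_layers(*texts):
    """Merge file contents in load order; later files take priority."""
    merged = {}
    for text in texts:
        merged.update(parse_gaprc(text))
    return merged
```
+
+Calling load_layers with the installation, home and project file
+contents, in that order, gives the project settings the final say.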
+
+Note that some adjustments will affect more than just gap4. For example,
+the colours of traces are stored in the
+ at file{.tk_utilsrc} file, and this file is used by both gap4 and trev.
+For colour blind users it can be useful to change these particular
+settings. For example the following is a @file{.tk_utilsrc} file
+to change the colours for the trace displays.
+
+ at example
+set_def TRACE.COLOUR_A			white
+set_def TRACE.COLOUR_C			blue
+set_def TRACE.COLOUR_G			black
+set_def TRACE.COLOUR_T			"#ff8000"
+set_def TRACE.LINE_WIDTH		2
+ at end example
+
+ at c ----------------------------------------------------------------------
+_split()
+ at node Conf-Consensus Algorithm
+ at section Consensus Algorithm
+
+Gap4 currently contains 3 consensus algorithms
+(_fpref(Con-Calculation, The Consensus Algorithms, t)). 
+This option 
+(which is available from the gap4 Options menu)
+allows the
+algorithm to be selected. 
+
+Note the consensus algorithm is used in
+several places throughout gap4: 
+Assembly
+(_fpref(Assembly-Shot, Normal Shotgun Assembly, assembly)),
+producing a consensus sequence file
+(_fpref(Con-Calculation, The Consensus Algorithms, t)),
+in the Contig Editor
+(_fpref(Editor, Editor introduction, contig_editor)),
+for Experiment Suggestion
+(_fpref(Experiments, Finishing Experiments, experiments)), and in the
+plot of the confidence values
+(_fpref(Consistency-Display, Consistency Display,t)).
+
+
+ at c ----------------------------------------------------------------------
+_split()
+ at node Conf-Set Maxseq
+ at section Set Maxseq/Maxdb
+ at cindex maxseq
+ at cindex maximum sequence length
+ at cindex sequence length, maximum
+
+The "Set maxseq/maxdb" option 
+(which is available from the gap4 Options menu)
+may be used to adjust the maximum size of the total
+consensus sequence contained within gap4. This includes concatenations
+of consensus sequences (with extra space for text headers) and the cutoff data at
+either end of each contig.
+
+When opening an already assembled project, maxseq is automatically increased
+accordingly (if required), so "Set maxseq" only needs to be used when adding
+in more data, such as when using the sequence assembly algorithms.
+
+The maxdb option controls the maximum combined number of readings and contigs
+allowed. Note that changing this does not take effect on the currently opened
+database so be sure to set it before opening your database.
+
+Both these values can also be adjusted by using the @code{-maxseq} and
+ at code{-maxdb} command line arguments.
+_fxref(Gap4-Cline, Command Line Arguments, gap4)
+
+ at c ----------------------------------------------------------------------
+_split()
+ at node Conf-Fonts
+ at section Set Fonts
+ at cindex fonts, adjusting
+
+"Set fonts" 
+(which is available from the gap4 Options menu)
+controls the fonts used for the various components of gap4's
+windows. Note that for the correct operation of some displays, careful
+font selection is necessary. For example it is not wise to choose a
+proportional font for the Contig Editor, which displays fixed width sequence
+alignments. For more complete documentation, see
+_fref(UI-Fonts, Font Selection, interface).
+
+ at c ----------------------------------------------------------------------
+_ifdef([[_unix]],[[
+_split()
+ at node Conf-Colour
+ at section Colour Configuration Window
+ at cindex Colour configuration window
+ at cindex Line thickness configuration
+
+Many gap4 displays make extensive use of colour.
+It is useful to be able to control the colours used for particular
+plots and the Colour Configuration window is used for this purpose. As the
+Colour Configuration window can be used from several different options, for
+convenience of documentation we refer to the window invoking the
+configuration window as the 'parent' window.
+
+One use for this dialogue is to edit the colours for individual
+restriction enzyme types when they are displayed as a single line within the
+Template Display. By default all types are drawn in black, 
+but the Colour Configuration dialogue enables each to be given its own
+colour.
+Another application is to adjust the colours used for displaying
+matches plotted within the Contig Comparator.
+
+Below is an example of using the Configure Window for a Find Read Pairs
+result. It was brought up using the configure command within the result
+manager. _fxref(Results, Result Manager, results) The window
+contains controls for adjusting both the line thickness and colour.
+Not all Colour Configuration
+dialogues (for example, when used with Restriction Enzyme Map) will include
+the line width section.
+
+_picture(configure.colour,2.35in)
+
+The colour is adjusted by dragging the three
+sliders until the coloured box at the bottom of the window shows
+the desired colour. Colours edited here will affect the displays
+within the parent window. Pressing OK will shut down the
+configuration window and keep these colours. Pressing cancel will
+remove the window and will set the colours in the parent window
+back to their original colours.
+]])
+
+ at c ----------------------------------------------------------------------
+_split()
+ at node Conf-Configure Menus
+ at section Configuring Menus
+ at cindex Configure menus
+ at cindex Menus, configuring
+ at cindex User levels
+
+When used for the first time
+gap4 will start up in beginner mode. What this means is that some of
+the less widely used options will not appear in the menus. The "Configure
+menus" command in the Options menu may be used to change between "beginner" and
+"expert" mode. In expert mode all the menu items will be displayed.
+
+To permanently set the menu level users select the appropriate level
+and press
+the "OK Permanent" button. This will save the menu level information to the
+@file{.gaprc} file in their home directory.
+
+If desired, other menu levels may be created by the package
+administrator. This is achieved by editing the
+@file{$STADENROOT/tables/gaprc_menu_full} file, changing the @code{MENU_LEVELS}
+definition and adding the appropriate labels to the end of each command. Each
+command specified in the menu file ends in a list of menu levels in which it
+is active. To make a command active for several levels, enclose the level
+identifiers in a Tcl list, such as @code{@{m e@}}. If this is missing,
+the command will be active at all menu levels.
+
+ at c ----------------------------------------------------------------------
+_split()
+ at node Conf-Set Genetic Code
+ at section Set Genetic Code
+ at cindex Set genetic code
+ at cindex Genetic code
+
+This function allows the user to change the genetic code used in all the
+options. The codes are defined as a set of codon tables stored in the
+directory tables/gcodes distributed with the package. The current list
+of codes and their codon table file names is shown at the end of this
+section.
+
+The user interface consists of the dialogue shown below. The user selects
+the required code by clicking on it, and then clicking "OK" or "OK
+permanent". The former choice selects the code for immediate use, and
+the latter also selects it for future uses of the program.
+
+_picture(set_genetic_code,2.39167in)
+
+When the dialogue is left the codon table selected will be displayed, as
+below, in the Output Window.
+
+ at example
+      ===============================================
+      F ttt       S tct       Y tat       C tgt      
+      F ttc       S tcc       Y tac       C tgc      
+      L tta       S tca       * taa       W tga      
+      L ttg       S tcg       * tag       W tgg      
+      ===============================================
+      L ctt       P cct       H cat       R cgt      
+      L ctc       P ccc       H cac       R cgc      
+      L cta       P cca       Q caa       R cga      
+      L ctg       P ccg       Q cag       R cgg      
+      ===============================================
+      I att       T act       N aat       S agt      
+      I atc       T acc       N aac       S agc      
+      M ata       T aca       K aaa       G aga      
+      M atg       T acg       K aag       G agg      
+      ===============================================
+      V gtt       A gct       D gat       G ggt      
+      V gtc       A gcc       D gac       G ggc      
+      V gta       A gca       E gaa       G gga      
+      V gtg       A gcg       E gag       G ggg      
+      ===============================================
+ at end example
+
+The following table shows the list of available genetic codes and the
+files in which they are stored for use by the package. They were created
+from genetic code files obtained from the NCBI.
+
+ at example
+code_1  Standard
+code_2  Vertebrate Mitochondrial
+code_3  Yeast Mitochondrial
+code_4  Coelenterate  Mitochondrial
+code_4  Mold Mitochondrial
+code_4  Protozoan Mitochondrial
+code_4  Mycoplasma
+code_4  Spiroplasma
+code_5  Invertebrate Mitochondrial
+code_6  Ciliate Nuclear
+code_6  Dasycladacean Nuclear
+code_6  Hexamita Nuclear
+code_9  Echinoderm Mitochondrial
+code_10 Euplotid Nuclear
+code_11 Bacterial
+code_12 Alternative Yeast Nuclear
+code_13 Ascidian Mitochondrial
+code_14 Flatworm Mitochondrial
+code_15 Blepharisma Macronuclear
+ at end example
+
+ at c ----------------------------------------------------------------------
+_split()
+ at node Conf-Alignment Scores
+ at section Alignment Scores
+ at cindex Alignment scores
+ at cindex Alignment matrix file
+ at cindex Open penalty for alignments
+ at cindex Extension penalty for alignments
+ at cindex Gap penalties for alignments
+ at cindex Matrix for alignments
+
+The Alignment Scores command 
+(which is available from the gap4 Options menu)
+may be used to adjust the gap open and gap
+extension penalties for some of the alignment algorithms used within gap4. At
+present this will affect all alignments except the Find Internal Joins
+function and most of the assembly algorithms.
+
+For dealing with sequences where the alignment differences have been caused by
+real evolutionary events, these parameters will probably need changing from
+the defaults. The default values are set up with the assumption that any
+alignment differences are due to base calling errors, and hence the gap
+extension penalty will be high.
+
+The alignment matrix may also be adjusted, but this is not listed in the
+dialogue. To do this take a copy of @file{$STADENROOT/tables/nuc_matrix},
+edit the copy, and set the @code{ALIGNMENT.MATRIX_FILE} parameter in your
+@file{.gaprc} file.
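+
+As a sketch of the last step, the entry in @file{.gaprc} might look like
+the line below. This assumes that @file{.gaprc} uses the same
+@code{set_def} syntax as the distributed @file{$STADENROOT/tables/gaprc}
+file, and the path to the edited copy is purely illustrative; check the
+distributed file for the exact form before relying on it.
+
+@example
+# Hypothetical location of the edited copy of nuc_matrix
+set_def ALIGNMENT.MATRIX_FILE /home/user/tables/my_nuc_matrix
+@end example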
+
+ at c ----------------------------------------------------------------------
+_split()
+ at node Conf-Trace File Location
+ at section Trace File Location
+ at cindex Trace file location
+ at cindex RAWDATA
+
+Gap4 does not store the trace data within the gap4 database. Instead it stores
+the filename of the trace file. Usually the trace files are kept within the
+same directory as the gap4 database. If this is not the case
+gap4 needs to know where they are.
+
+To make sure that gap4 can still display the traces we need to specify any
+alternative locations where traces may be found. The "Trace File Location"
+command (which is available from the gap4 Options menu) performs this task. It
+brings up a dialogue asking for the directory names. If there is just one
+directory to specify, its name should be typed in. If there are several
+directories to search through, they must all be typed in, separated by the
+colon character (":"). To include a directory name that contains a colon, use
+a double colon.
+
+For example, on Windows, to specify two directories use
+"@code{F::\tfiles1:G::\tfiles2}".
+
+In addition to specifying directories, RAWDATA may also be used to indicate
+that the trace files come from a variety of other sources using the
+general format SOURCETYPE=path. These can be combined with directories
+if desired. For example ``@code{.:/trace_cache:TAR=/traces/archived.tar}''.
+
+ at table @code
+ at cindex TAR= RAWDATA accessor
+ at item TAR=filename.tar
+Searches for the trace name in the Unix tar archive named
+@i{filename.tar}.
+
+Scanning sequentially through a tar file is slow, so a command line
+utility named @code{index_tar} may be used to speed up access to the
+traces within it. This produces a text index containing the filenames
+held within the tar file and their offsets within it. If
+@i{filename.tar.index} exists and is of the format created by
+@code{index_tar} then the trace name will be looked up in the index
+instead of sequentially scanning through the tar file. The syntax for
+@code{index_tar} is: @code{index_tar} @i{tar_filename} @code{>}
+@i{tar_filename}@code{.index}. (For example "@code{index_tar
+traces.tar > traces.tar.index}".)
+
+ at cindex SFF= RAWDATA accessor
+ at item SFF=filename.sff
+Searches for the trace name in a 454 SFF archive named @i{filename.sff}. 
+SFF files have their own binary-sorted index which allows for random
+access.
+
+ at cindex HASH= RAWDATA accessor
+ at item HASH=archive.hash
+
+This method supersedes the TAR= accessor. Tar files may be ``hashed''
+using the @code{hash_tar} tool. Similarly 454 SFF archives may be
+hashed using @code{hash_sff}. In theory any type of archive may be
+indexed as a ``.hash'' provided that the traces are stored
+uncompressed (or compressed only using their own methods, such as with
+ZTR) so that random access is possible within the archive.
+
+The Hash file contains a precomputed binary index of all the traces
+contained within it stored in such a way that random access is very
+fast.
+
+ at cindex URL= RAWDATA accessor
+ at item URL=url
+
+This uses the external @code{wget} tool (@strong{not} supplied as part of the
+Staden Package) to fetch a given url. Anywhere that @code{%s} occurs
+within the specified @i{url} will be replaced by the trace
+name. Hence, for example,
+@code{URL=http://trace.server.org/cgi-bin/lookup.pl?trace=%s} could be
+used to fetch named traces from a remote site. There are plans for
+such URL access to be made available via the Ensembl TraceArchive.
+ at end table
+
+
+If the gap4 database has been opened with write-access this directory
+location will be stored as a database @code{RAWD} note
+(_fpref(Notes-Special, Special Note Types, notes)), which is read by gap4 when
+it opens the database. The demonstration data supplied with the package
+includes an example database (named DEMO.0) that has a RAWD note to specify
+that traces are fetched from a tar file within the same directory.
+
+An alternative way of specifying the trace file location is by setting the
+@code{RAWDATA} environment variable. On Unix and Windows NT this is
+straightforward (although system and shell specific). However on Windows 95
+this may prove difficult (and at least require a reboot), so manually setting
+the environment variable is no longer recommended.
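+
+For instance, in a Bourne-style Unix shell the search path from the
+earlier example could be set as follows before starting gap4. This is a
+sketch only: the directory and archive names are illustrative, and other
+shells (such as csh) use a different syntax.
+
+@example
+RAWDATA=.:/trace_cache:TAR=/traces/archived.tar
+export RAWDATA
+@end example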
+
+ at c ----------------------------------------------------------------------
+ at page
+_split()
+ at node Conf-Tag
+ at section The Tag Selector
+ at cindex Tag Selector
+ at cindex Annotation Selector
+
+Each command using tags (for example to mask tagged sequence segments) can
+utilise the Tag Selector to determine which tag types are to be used. As each
+command has its own particular use for tags, the default tags are
+command specific.
+
+_picture(interface.tag,3.39167in)
+
+The Tag Selector dialogue 
+(which is available from the relevant gap4 options)
+consists of a set of checkbuttons plus commands to
+select all tags or to deselect all tags. The "OK" button quits the display and
+accepts the selected list as the current list of active tags. The "Cancel"
+button quits the display without making any changes. The "As default" button
+marks the current selected tags as the defaults to be used for all future uses
+of this command. These selections are not saved to disk and will be lost when
+the program quits. To permanently set the default tag types, users must 
+modify their
+@file{.gaprc} file. Brief instructions on how to edit this file follow.
+They are also contained within the copy of the file distributed with
+the package: @file{$STADENROOT/tables/gaprc}. Search for "@code{Tag
+type lists}".
+
+ at c ----------------------------------------------------------------------
+_split()
+ at node Conf-GTAGDB
+ at section The GTAGDB File
+ at cindex GTAGDB
+ at cindex Tag database
+
+To plot tags, gap4 uses a file describing the available tag types and
+their colours. It is possible for users to edit their own local copies
+of this file to create new tag types.
+
+The environment variable @code{GTAGDB} is used to specify the location
+of tag type databases. The @code{GTAGDB} variable consists of one or
+more file pathnames separated by colons. The first file read defines a
+set of tags and colours. Subsequent files can define additional
+tags and also override the earlier tag definitions. To achieve this gap4
+loads each file from the @code{GTAGDB} variable in the order of rightmost
+first to leftmost last. Thus, as with the Unix shell
+@code{PATH} variable, the leftmost pathnames have highest precedence for
+the resultant tag definitions. The default @code{GTAGDB} specified in the
+staden login and profile scripts is:
+
+ at example
+GTAGDB:$HOME/GTAGDB:$STADTABL/GTAGDB
+ at end example
+
+Hence the @file{$STADTABL/GTAGDB} file is read and the
+@file{$HOME/GTAGDB} and @file{GTAGDB} (a file in the current directory)
+files are merged if present. To add a new tag type only to the
+database local to the current directory, create a @file{GTAGDB} file in
+the current directory.
+
+The BNF grammar for the tag database is as follows:
+
+ at example
+<tag_db>       ::= <tag> <tag_db> | <empty>
+<tag>          ::= <tag_long_name> ':' <element_list> '\n'
+<element_list> ::= <element> | <element> ':' <element_list> | <empty>
+<element>      ::= <option_name> '=' <string>
+<option_name>  ::= 'id' | 'bg' | 'dt'
+ at end example
+
+Quoting strings is optional for single words, but necessary when writing
+a string containing spaces. In plain English, this means that to define
+the compression tag (@code{COMP}) to be displayed in red, with no
+default annotation string we write:
+
+ at example
+compression: id="COMP": bg=red
+ at end example
+
+Any lines starting with hash (@samp{#}) are considered as comments. Lines
+ending in backslash (@samp{\}) are joined with the next line. Hence the
+above definition can be written in a clearer form using:
+
+ at example
+# For marking compressions
+compression: \
+        id="COMP": \
+        bg=red:
+ at end example
+
+An example including a default annotation string of "default string" follows:
+
+ at example
+# For general comments
+comment: \
+        id="COMM": \
+        bg=MediumBlue: \
+        dt="default string"
+ at end example
+
+Allowed names for colours are those recognised by the windowing system.
+_ifdef([[_unix]],[[These include colour names defined in the @file{rgb.txt}
+file (probably @file{/usr/lib/X11/rgb.txt} or @file{/usr/openwin/lib/rgb.txt})
+and the exact colour specifications using the @code{"#rrggbb"} notation.]])
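+
+As an illustration of the precedence rules, a @file{GTAGDB} file in the
+current directory containing only the entry below would be expected to
+override the colour of the compression tag for databases used from that
+directory, while other tag types continue to come from
+@file{$HOME/GTAGDB} and @file{$STADTABL/GTAGDB}. This is a sketch that
+assumes the default @code{GTAGDB} search path; the colour chosen is
+arbitrary.
+
+@example
+# Local override: draw compression tags in orange
+compression: \
+        id="COMP": \
+        bg=orange
+@end example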
+
+ at c ----------------------------------------------------------------------
+_split()
+ at node Conf-Template Status
+ at section Template Status
+ at cindex Template Status
+ at cindex Template size tolerance
+ at cindex Primer types, ignoring
+
+This option allows control over computation of the template
+status. The validity of a template is computed by checking the size
+(based on the locations of assembled readings and position of vector
+tags) and the orientation of sequences (based on their ``primer type''
+values).
+
+_picture(template_status,2.925in)
+
+The most likely item to need changing is the ``size limit scale
+factor''. The expected range of template sizes for a ligation is
+specified in each template record as a minimum-to-maximum
+range. Gap4 takes a very simple approach: anything within this range
+is valid and anything outside it is invalid. The scale factor is
+applied such that the maximum becomes ``max * scale'' and the
+minimum becomes ``min / scale''. So a scale factor of 2 would
+adjust a range from 1.0-1.4Kb to 0.5-2.8Kb.
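+
+As a worked illustration (the observed template sizes here are invented
+for the example), a template record with an expected range of
+1.0-1.4Kb would be treated as follows:
+
+@example
+scale = 1:  valid range 1.0-1.4Kb   a 2.5Kb template is invalid
+scale = 2:  valid range 0.5-2.8Kb   a 2.5Kb template is valid
+scale = 2:  valid range 0.5-2.8Kb   a 3.0Kb template is invalid
+@end example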
+
+The ``minimum valid vector tag length'' is designed to work around
+problems where some assemblies end up with SVEC tags only 1 or 2 bases
+long (which are common when converting from phrap for some
+reason). The start and end of a template may be derived from observing
+a single reading with sequencing vector at both ends, so the presence
+of very short falsely added SVEC tags will mark many templates as
+inconsistent.
+
+The ``Ignore all primer-type values'' and ``Ignore custom primer-type
+values'' options disable Gap4's trust in the primer type
+information for each sequence. Normally this will be one of
+universal-forward, universal-reverse, custom-forward (e.g. from a
+primer-walk) or custom-reverse.
diff --git a/manual/configure.colour.png b/manual/configure.colour.png
new file mode 100644
index 0000000..8609f39
Binary files /dev/null and b/manual/configure.colour.png differ
diff --git a/manual/consistency_display-t.texi b/manual/consistency_display-t.texi
new file mode 100644
index 0000000..c6b8220
--- /dev/null
+++ b/manual/consistency_display-t.texi
@@ -0,0 +1,176 @@
+_split()
+ at node Consistency-Display
+ at section Consistency Display
+ at cindex Consistency display
+
+ at menu
+* Consistency-Display::     Consistency Display
+* Consistency-Confidence::  Confidence Values Graph
+* Consistency-ReadingCov::  Reading Coverage Histogram
+* Consistency-ReadPairCov:: Read-Pair Coverage Histogram
+* Consistency-StrandCov::   Strand Coverage
+* Consistency-2ndHighest::  2nd-Highest Confidence
+* Consistency-Diploid::     Diploid Graph
+ at end menu
+
+The Consistency Display provides plots designed to highlight 
+potential problems in contigs. It
+is invoked from the main gap4 View menu by selecting any of its plots. Once
+a plot has been displayed, any of the other types of consistency plot can
+be displayed within the same frame from the View menu of the Consistency
+Display. 
+
+An example showing the Confidence Values Graph and the corresponding Reading
+Coverage Histogram, Read-Pair Coverage Histogram and Strand Coverage is 
+shown below.
+
+_lpicture(consistency_p,6in)
+
+One or more contigs can be displayed and are drawn in the same order
+as the input contig list (which need not necessarily be in the same order as
+the contig selector). If more than one contig is displayed, the contigs are
+drawn immediately after one another but are staggered in the y direction.
+
+The ruler ticks can be turned on or off from the View menu of the consistency
+display. 
+
+The plots can be enlarged or reduced using the standard zooming mechanism.
+_fxref(UI-Graphics-Zoom, Zooming, interface)
+
+The crosshair toggle button controls whether the crosshair is visible. This is
+shown as a black vertical and horizontal line. The position of the crosshair is
+shown in the 3 boxes to the right of the 
+crosshair toggle. The first box indicates the cursor position in the current
+contig. The second box indicates the overall position of the cursor in the 
+consensus. The last box shows the y position of the crosshair. 
+
+
+_split()
+ at node Consistency-Confidence
+ at subsection Confidence Values Graph
+ at cindex Confidence values graph
+
+This option can be invoked from the main gap4 View menu, in which case
+it appears as a single plot, or from the View menu of the Consistency Display
+in which case it will appear as part of the Consistency Display.
+
+The confidence values are determined from the current consensus algorithm
+(_fpref(Con-Calculation, The Consensus Algorithms, t)). 
+
+_lpicture(conf_values_p,6in)
+
+Please note that this plot can be very slow for long contigs. This is
+caused by the large number of points (not the calculation) and we hope
+to speed it up in a future release.
+
+_split()
+ at node Consistency-ReadingCov
+ at subsection Reading Coverage Histogram
+ at cindex Reading coverage
+
+This option can be invoked from the main gap4 View menu, in which case
+it appears as a single plot, or from the View menu of the Consistency Display
+in which case it will appear as part of the Consistency Display.
+
+The number of readings which cover each base position along the contig
+is plotted as a histogram.
+
+_lpicture(read_coverage_p,6in)
+
+As can be seen in the dialogue below, the user can select the contig(s)
+to display, and whether to plot: Forward strand only, Reverse strand
+only, Both strands or the Summation of both strands. In the example
+shown above both strands have been plotted: forward in red and reverse
+in black.
+
+_picture(read_coverage_d,3.175in)
+
+
+_split()
+ at node Consistency-ReadPairCov
+ at subsection Read-Pair Coverage Histogram
+ at cindex Read-pair coverage
+
+This option can be invoked from the main gap4 View menu, in which case
+it appears as a single plot, or from the View menu of the Consistency Display
+in which case it will appear as part of the Consistency Display.
+
+The number of read-pairs which cover each base position along the contig
+is plotted as a histogram.
+
+_lpicture(readpair_coverage_p,6in)
+
+_split()
+ at node Consistency-StrandCov
+ at subsection Strand Coverage
+ at cindex strand coverage
+
+This option can be invoked from the main gap4 View menu, in which case
+it appears as a single plot, or from the View menu of the Consistency Display
+in which case it will appear as part of the Consistency Display.
+
+The display is used to show which regions of the data are covered by
+readings from each of the two strands of the DNA. 
+A separate line is drawn for each strand: forward in red and reverse
+in black.
+The function works in two complementary modes: it can plot the positions
+which are covered, or the positions which are not. The latter is probably
+the most useful as it directs users to the places requiring further data.
+
+The figure below shows the covered positions, and the figure below that
+shows the uncovered positions for the same contig.
+
+_lpicture(strand_coverage_p1,5.95in)
+_lpicture(strand_coverage_p2,5.95in)
+
+The plot can be regarded as a coarse version of the Quality Plot
+(_fpref(Template-Quality, Quality Plot, template)),
+in that it shows the strand coverage using the Quality Calculation
+(_fpref(Qual-Cal, The Quality Calculation, calc_consensus)),
+but does not reveal problems with individual base positions.
+
+
+_picture(strand_coverage_d,3.175in)
+
+The dialogue allows the user to select the contig(s) and strands to analyse
+and whether to plot Coverage or Problems.
+
+_split()
+ at node Consistency-2ndHighest
+ at subsection 2nd-Highest Confidence
+ at cindex 2nd-Highest Confidence
+ at cindex Second highest confidence graph
+
+The traditional way to compute the consensus confidence values is to
+take into account both the matching and mismatching bases within each
+individual column. If instead we work on the hypothesis that a contig
+may have more than one sequence present then we can instead compute
+five consensus confidence values at every point (four bases plus pad)
+by only totalling up the bases that agree and ignoring those that
+mismatch.
+
+_picture(2nd_highest_confidence,6in)
+
+In the case of zero conflicts the highest confidence value will be the
+same as the standard consensus confidence. When a conflict occurs, the
+second highest confidence value can be used as a measure of how strong
+the conflict could be. It is this value that is plotted.
+
+_split()
+ at node Consistency-Diploid
+ at subsection Diploid Graph
+ at cindex Diploid Graph
+
+At present this is a rather specialist function written for a
+particular in-house purpose. This plot relates very closely to the
+2nd-Highest Confidence plot (_fpref(Consistency-2ndHighest,
+2nd-Highest Confidence, consistency_display)), but it also takes into
+account depth information.
+
+_picture(discrepancy_graph,6in)
+
+Specifically an assumption is made that a contig may consist of two
+alleles with approximately 50/50 ratio. Any discrepancies visible by
+looking at the second highest confidence value should therefore also
+be backed up by a 50/50 split in sequence depth.
+
diff --git a/manual/consistency_p.png b/manual/consistency_p.png
new file mode 100644
index 0000000..ba28b3d
Binary files /dev/null and b/manual/consistency_p.png differ
diff --git a/manual/consistency_p.small.png b/manual/consistency_p.small.png
new file mode 100644
index 0000000..891c930
Binary files /dev/null and b/manual/consistency_p.small.png differ
diff --git a/manual/contig_editor-t.texi b/manual/contig_editor-t.texi
new file mode 100644
index 0000000..8217415
--- /dev/null
+++ b/manual/contig_editor-t.texi
@@ -0,0 +1,2970 @@
+ at menu
+* Editor-Movement::            Moving around the editor
+* Editor-Names::               The sequence names display
+* Editor-Editing::             Commands for editing data
+* Editor-Selections::          Cut and paste control
+* Editor-Annotations::         Creating, editing and deleting tags
+* Editor-Searching::           Searching
+* Editor-Commands::            The ``commands'' menu
+* Editor-Settings::            The ``settings'' menu
+* Editor-Remove Readings::     Removing Readings
+* Editor-Primer Selection::    Searching for primers
+* Editor-Traces::              Displaying the raw trace data
+* Editor-Info::                The editor information line
+* Editor-Joining::             The join editor
+* Editor-Multiple Editors::    Using several editors at once
+* Editor-Quitting::            Quitting the editor
+* Editor-Techniques::          Editing techniques
+* Editor-Summary::             Summary of key bindings
+ at end menu
+
+The gap4 Contig Editor is designed to allow rapid checking and editing of
+characters in assembled readings. Very large savings in time can be achieved
+by its sophisticated problem finding procedures which automatically direct the
+user only to the bases that require attention.  The following is a selection of
+screenshots to give an overview of its use.
+
+_lpicture(contig_editor.screen,6in)
+
+The figure above shows a screendump from the Contig Editor
+which contains segments of aligned
+readings, their consensus and a six phase translation. The Commands menu
+is also shown.  The main components are: the controls at
+the top; reading names on the left; sequences to their right; and status lines
+at the bottom. Some of the reading names are written in light grey which
+indicates that their traces/chromatograms are being displayed (in
+another window, see below).
+
+One reading name is written with inverse colours, which indicates that it
+has been selected by the user. To the left of each reading name is the reading
+number, which is negative for readings which have been reversed and complemented.
+The first of the status lines, labelled ``Strands'', is showing a
+summary of strand coverage. The left half of the segment of sequence
+being displayed is covered
+only by readings from one strand of the DNA, but the right half contains data
+from both strands.
+
+Along the top of the editor window is a row of command buttons
+and menus. The rightmost pair of buttons provide help
+and exit.  To their left are two menus, one of which is currently in use.  To
+the left of this is a button which initially displays a search dialogue
+and, when pressed again, performs the selected search.
+Further left is the undo button:
+each time the user clicks on this box the program reverses the previous edit
+command.  The next button, labelled ``Cutoffs'' is used to toggle between
+showing or hiding the reading data that is of poor quality or is vector
+sequence. In this figure it has been activated, revealing the poor quality
+data in light grey. Within this, sequencing vector is displayed in
+lilac. The next button to the left is the Edit Modes menu
+which allows users to select which editing commands are enabled. The
+next command toggles between insert and replace and so governs the effect of
+typing in the edit window. The 2 entryboxes on the left hand side labelled
+C and Q set the consensus and quality cutoff values 
+(_fpref(Editor-Techniques-Cutoffs, Consensus and Quality Cutoffs, contig_editor)).
+
+One of the readings contains a yellow tag, and elsewhere some bases are
+coloured red, which indicates they are of poor quality.  The Information Line
+at the bottom of the window can show 
+information about readings, annotations and
+base calls. In this case it is showing information about the reliability of
+the base beneath the editing cursor.
+
+_lpicture(contig_editor_grey_scale,6in)
+
+A better way of displaying the accuracy of bases is to shade their
+surroundings so that the lighter the background the better the data.
+In the figure above, this grey scale encoding of the base accuracy or
+confidence has been activated for bases in the readings and the
+consensus. This
+screenshot also shows the Contig Editor displaying disagreements and edits.
+Disagreements between the consensus and individual base calls are shown
+in dark green. Notice that these disagreements are in poor
+quality base calls. Edits (here they are all pads) are shown with a
+light green background. When they are present, replacements/insertions
+are shown in pink, deletions in red and confidence value changes in purple.
+The consensus confidence takes into account several factors, including
+individual base confidences, sequencing chemistry, and strand coverage.
+It can be seen that the consensus for 
+the section covered by data from only one strand has been calculated to
+be of lower confidence than the rest. The Status Line includes two
+positions marked with exclamation marks (!) which means that the
+sequence is covered by data from both strands, but that the consensus
+for each of the two strands is different.
+The Information Line at the bottom of the window is showing
+information about the reading under the cursor: its name, number,
+clipped length, full length, sequencing vector and BAC clone name.
+
+The Contig Editor can rapidly display the traces for any reading or set
+of readings. The number of rows and columns of traces 
+displayed can be set by the user. The traces scroll in register with one
+another, and with the cursor in the Contig Editor. Conversely, the
+Contig Editor cursor can be scrolled by the trace cursor. 
+A typical view is shown below.
+
+_lpicture(contig_editor.traces,6in)
+
+This figure is an example of the Trace Display showing three traces
+from readings in the previous two Contig Editor screendumps.
+These are the best two traces from each strand plus a trace from a
+reading which contains a disagreement with the consensus. The program
+can be configured to automatically 
+bring up this combination of traces for each
+problem located by the ``Next search'' option.
+The histogram or vertical bars plotted top down show the confidence
+value for each base call. The reading number, together with the direction of
+the reading (+ or -) and the chemistry by which it was determined, is given at
+the top left of each sub window.  There are three buttons ('Info', 'Diff', and
+'Quit') arranged vertically with X and Y scale bars to their right. The Info
+button produces a window like the one shown in the bottom right hand
+corner. The Diff button is mostly used for mutation detection, and causes a
+pair of traces to be subtracted from one another and the result plotted, hence
+revealing their differences.  (_fpref(Editor-Traces, Traces, contig_editor)).
+
+
+_split()
+ at node Editor-Movement
+ at section Moving the visible segment of the contig
+ at cindex Contig Editor: cursor movement
+
+The contig editor displays only one segment of the entire contig, although
+several contig editors can be in use at once.  Above the sequence display
+is a ``scrollbar''. This line represents the entire contig, with a greyed
+section representing the currently displayed segment. To change the
+displayed segment put the mouse cursor in the scrollbar and use the mouse
+buttons. The available controls are:
+
+ at example
+ at group
+Middle Mouse Button      Set displayed section
+Alt Left Mouse Button    Set displayed section
+Left Mouse Button        Scroll left or right one screenful
+ at end group
+ at end example
+
+On the far right side of the contig editor is a vertically oriented scrollbar.
+Typically the editor will be showing all available data, in which case the
+vertical scrollbar cannot be scrolled. In regions of exceptionally deep
+coverage, the editor makes sure that the controls, the consensus, and any
+status lines are visible. The remaining space is taken up with however many
+sequences fit. The vertical scrollbar can then be used, using the mouse buttons
+listed above, to scroll through the sequences.
+
+In addition to the scrollbars there are four buttons on the
+left hand side for scrolling by fixed amounts.
+
+ at example
+ at group
+<<              Scroll left half a screenful
+<               Scroll left one base
+>               Scroll right one base
+>>              Scroll right half a screenful
+ at end group
+ at end example
+
+Within the editor window itself two more key combinations can be used
+for scrolling forwards and backwards an entire screenful. These, and
+several others, are modelled after the @code{Emacs} key bindings.
+
+ at example
+ at group
+Control v       Scroll right one screenful
+Meta v          Scroll left one screenful
+ at end group
+ at end example
+
+Finally, moving the editing cursor will always adjust the displayed
+section so that the editing cursor is visible. Hence this can also be
+used to scroll the editor both horizontally and vertically.
+
+
+
+
+_split()
+ at node Editor-Names
+ at section Names
+ at cindex Contig Editor: names display
+ at cindex Contig Editor: highlighting readings
+ at cindex Highlighting readings in the editor
+ at cindex names in the editor
+ at cindex reading names in the editor
+
+At the left side of the editor window is a display containing the
+reading names and numbers. Each line consists of its orientation
+(``+'' or ``-''), reading number, a coloured template consistency
+status and its name. The bottom line is always @code{CONSENSUS}. Also
+on the bottom line is the current edit status. This is modelled on
+ at code{Emacs}, and consists of one of @code{----}, @code{-%%-} and
+ at code{-**-}, to symbolise ``No unsaved edits made'', ``No edits made -
+editor is in read only mode'', and ``Unsaved edits made''.
+
+The maximum length of a reading name is 40 characters. Additionally there
+are 7 characters taken up with the direction and number of a reading. By
+default the names display only shows 23 characters (enough to show 16 letters
+of a reading name). A horizontal scrollbar just above the reading names can be
+used to scroll the reading names. Note that the numbers and orientation are
+always visible. To change the width of the editor names display set the
+ at code{CONTIG_EDITOR.NAMES_WIDTH} setting in your @file{.gaprc}. For example:
+
+ at example
+set_def CONTIG_EDITOR.NAMES_WIDTH	23
+ at end example
+
+The foreground colour for the text reveals whether the trace for this reading
+is shown - a grey foreground indicates that the trace is visible.  The
+background colour represents a user highlight and the disassembly mode. The
+default background colour is light grey (the same colour as the general editor
+background).  Clicking the left mouse button on a reading name toggles the
+background of the name component of the number-name pair to black.  This is
+particularly useful for keeping track of an individual reading whilst
+scrolling the editor. As the editor scrolls an individual reading will move up
+and down the editor display. By highlighting this reading it becomes easy to
+track. The number component of the number-name pair is used to highlight
+readings that are to be disassembled; in this case the background is dark
+grey. _fxref(Disassemble, Disassemble Readings, disassembly)
+
+If the template display is in use, highlighting a reading name in the
+editor will select this reading in the template display (by marking
+it as bold). Similarly selecting a reading in the template display (left
+mouse button) will highlight the reading in the contig editor.
+Additionally the contig editor cursor is visible within the template
+display, allowing the position of the editor to be controlled from the
+template display and connected plots (such as the quality plot). 
+_fxref(Template-Display, Template Display, template)
+
+The readings contained within the ``readings'' list are automatically
+highlighted when the editor starts. Toggling the highlighted names in the
+editor updates the ``readings'' list accordingly.
+_fxref(List-Special, Special List Names, lists)
+
+Once an output list for the editor has been set, pressing the middle
+mouse button, or Alt left mouse button, on the names display has the same
+effect as using the left button, except that it adds (and never removes) the
+reading name to the
+specified list. _oxref(Editor-Output List, Set Output List). This is similar
+to using the left mouse button to add names to the ``readings'' list, except
+that it allows for multiple lists to be built up.
+
+Pressing the right mouse button on a name will pop up a menu containing
+a variety of operations to perform for that specific reading.
+
+ at table @strong
+ at item Goto...
+This is a cascading menu containing all other readings on the same
+template, including ones on other contigs. Selecting the appropriate
+read name will move the editor to the left-most base in that
+sequence. If the sequence is in another contig then the editor for that
+contig will be moved (and created if needed).
+
+ at item Join to...
+This is only shown when a template has more than one reading in it and
+the readings are within separate contigs. When this is the case a
+cascading menu presents the list of readings in other
+contigs. Selecting one of these will bring up the join editor with
+both sequences visible (so that you will need to manually scroll to
+approximately the correct position in order to find the join).
+
+ at item Select this reading
+ at itemx Select this reading and all to right
+ at itemx Deselect this reading
+ at itemx Deselect this reading and all to right
+ at itemx Select readings on this template
+ at itemx Deselect readings on this template
+
+These commands (de)select one or several readings. ``Select this
+reading'' is the simplest method and acts in the same way as
+simply left clicking on a sequence name.
+
+The other modes allow the (de)selection of sequences on this template
+(regardless of which contig they are in) or ranges. The ``and all to
+right'' modes are designed with disassemble readings in
+mind. Disassembling all readings from a specific point onwards using
+the ``Move readings to new contigs'' mode is analogous to using break
+contig. Selecting all readings within a range may be achieved by a
+combination of ``select this reading and all to right'' and a
+subsequent ``deselect this reading and all to right'' further along
+the contig.
+
+ at item List notes
+
+This invokes the note selector with this reading already listed
+(_fpref(Notes-Selector, Selecting Notes, notes)).
+
+ at item Set as reference sequence
+
+This marks the sequence as the Reference sequence
+(_fpref(Editor-Reference-Sequences, Reference sequences, contig_editor)).
+
+ at item Set as reference trace
+
+This marks the trace as a Reference trace for use with trace
+differencing
+(_fpref(Editor-Reference-Traces, Reference traces, contig_editor)).
+
+ at item Remove reading (this only)
+ at itemx Remove reading and all to right
+
+This marks one or more readings as ready for removal by disassemble
+readings
+(_fpref(Editor-Remove Readings, Removing readings from the contig,
+contig_editor)).
+You will then be prompted when you exit the editor whether you wish to
+disassemble the chosen readings.
+
+ at item Clear selection
+
+This clears the current reading selection.
+ at end table
+
+_split()
+ at node Editor-Editing
+ at section Editing
+ at cindex Editing: contig editor
+ at cindex Contig Editor: editing features
+
+ at menu
+* Editor-Cursor::              Moving the editing cursor
+* Editor-Modes::               Editing modes
+* Editor-Quality Values::      Adjusting the quality values
+* Editor-Cutoffs::             Adjusting the cutoff data
+* Editor-Editing Summary::     Summary of editing commands
+ at end menu
+
+Editing can take up a significant portion of the time taken to finish a
+sequencing project. Gap4 has a selection of searches (_fpref(Editor-Searching,
+Searching, contig_editor)) designed to speed up this process.
+The problems that require most attention are conflicts between good
+bases. Where base 
+confidence values are present it should be unnecessary to edit all
+conflicting bases as, in general,
+this will amount to adjusting poor quality data to agree with good quality
+data, in which case the consensus sequence should be correct anyway.
+
+Pads in the consensus should not be considered a problem
+requiring edits because it is possible to
+output the consensus sequence (from the main Gap4 File menu) with pads
+stripped out. Obviously, poorly defined pads (a mixture of several pads and
+real bases) require checking in the same manner as other poorly
+defined consensus bases.
+
+If you wish to check all base conflicts set the consensus algorithm to
+Frequency (_fpref(Con-Calculation, The Consensus Algorithms, calc_consensus))
+and the consensus cutoff to 100. The consensus will then be a dash in all
+places where there is not a 100% agreement in the sequences. The ``Next
+Problem'' editor button will then step one at a time through each conflict.
+
+_split()
+ at node Editor-Cursor
+ at subsection Moving the editing cursor
+ at cindex Cursor: contig editor
+ at cindex Contig Editor: cursor
+
+Nearly all editing operations happen at the location of the editing cursor.
+This cursor appears as a solid block. The simplest way to move the
+cursor is to click with the left mouse button. Alternatively the following keys
+can be used.
+
+ at example
+ at group
+ Left arrow or Control b        Move left one base
+ Right arrow or Control f       Move right one base
+ Up arrow or Control p          Move up one line
+ Down arrow or Control n        Move down one line
+ Control a                      Move editing cursor to start of used data
+ Control e                      Move editing cursor to end of used data
+ Meta a                         Move editing cursor to start of cutoff
+ Meta e                         Move editing cursor to end of cutoff
+ Meta <                         Move editing cursor to start of contig
+ Meta >                         Move editing cursor to end of contig
+ at end group
+ at end example
+
+The difference between the Control and Meta variants of the ``a'' and ``e''
+key combinations depends on whether ``Cutoffs'' is set. If it is, then
+``Control a''
+will move to the start of the used data for this reading and ``Meta a''
+will move to the start of the cutoff data for this reading. Otherwise
+they both move to the same point (the used data start). Similarly for
+``Control e'' and ``Meta e''. The action of these four key presses in the
+consensus line is simply to move to the start or end of the entire
+consensus sequence.
+
+The cursor can be placed on any sequence data shown in the editor.
+
+_split()
+ at node Editor-Modes
+ at subsection Editing Modes
+ at cindex Edit modes: contig editor
+ at cindex Contig Editor: edit modes
+ at cindex Superedit: contig editor
+
+The editor operates in two main edit modes - Replace and Insert.  Replace
+allows a character to be replaced by another and Insert allows characters
+to be inserted.  Replace is the default mode. The mode can be changed by
+pressing the button marked ``Insert''. The checkbox next to the button will
+be set (filled by a dark colour) when the mode is ``Insert''. By default
+these modes are restricted until the Edit Modes menu
+is used to change them.
+
+The Edit Modes menu consists of a series of checkboxes and radiobuttons which
+control which editing options are enabled.
+
+ at table @strong
+ at cindex Allow insert in read: contig editor
+ at cindex Allow del in read: contig editor
+ at cindex Contig Editor: allow insert in read
+ at cindex Contig Editor: Allow del in read
+ at item Allow insert in read
+ at itemx Allow del in read
+        Insertion or deletion within a reading will shift the sequence 
+characters and so will alter their alignment.
+This is acceptable provided action is
+        taken to correct it, by either shifting the reading or by
+        inserting or deleting a base elsewhere. This functionality is
+        disabled by default and is enabled by checking the appropriate
+        checkbox.  Note though that insertion and deletion of bases within
+        the cutoff data will shuffle the cutoff data rather than the
+        reading itself and hence will not break alignment. However this
+        operation still requires the edit mode to be enabled.
+
+ at sp 1
+ at cindex Allow insert any in cons: contig editor
+ at cindex Allow del dash in cons: contig editor
+ at cindex Allow del any in cons: contig editor
+ at cindex Contig Editor: allow insert any in cons
+ at cindex Contig Editor: allow del dash in cons
+ at cindex Contig Editor: allow del any in cons
+ at item Allow insert any in cons
+ at itemx Allow del dash in cons
+ at itemx Allow del any in cons
+        These operations control the editing actions allowed for the
+        consensus. By default the only operations allowed are insertion
+        and deletion of pads. This is because consensus editing is
+        typically used for removing columns of pads where a single reading
+        has been overcalled.
+
+        When the consensus cutoff is set to 100%, such columns will be dashes
+        in the consensus, so ``Allow del dash in cons'' enables deletion of
+        both dashes and pads.
+
+        ``Allow insert any in cons'' and ``Allow del any in cons'' allow any
+        column to be completely inserted or deleted.  These are potentially
+        dangerous actions; however, the ``Evidence for edits'' options can
+        detect such edits.
+
+ at sp 1
+ at cindex Allow replace in cons: contig editor
+ at cindex Contig Editor: allow replace in cons
+ at item Allow replace in cons
+        Replacing a base in the consensus changes all of the bases in
+        readings at this point that disagree with the typed base. The
+        actual edit performed depends upon the ``Edit by base type'' and
+        ``Edit by confidence'' radiobuttons.
+        
+ at sp 1
+ at cindex Allow reading shift: contig editor
+ at cindex Contig Editor: allow reading shift
+ at item Allow reading shift
+        To shift a reading place the cursor at the far left end of the
+        reading. If cutoffs is set this should be the far left end of
+        the cutoff data. Then typing space or delete will move the reading
+        right or left respectively by one position. This operation is
+        disabled by default.
+
+ at sp 1
+ at cindex Allow transpose any: contig editor
+ at cindex Contig Editor: allow transpose any
+ at item Allow transpose any
+        Moving pads within a reading is often a useful procedure, and the
+        'movement' of a pad alone will not break the alignment.
+        For this reason it is possible to move pads around without using
+        insert/delete. Placing the cursor over a pad in a reading and
+        pressing ``Control l'' or ``Control r'' will move that pad left or
+        right one base. This operation will not work with the cursor on the
+        consensus. Pad movement is allowed at all times. The selection of
+        ``Allow transpose any'' allows any pair of adjacent characters to be
+swapped.
+ at sp 1
+ at cindex Allow uppercase: contig editor
+ at cindex Contig Editor: allow uppercase
+ at item Allow uppercase
+        A rule often followed by users is to type all modifications in
+        lower case which makes edited characters easier to see.
+        The ``Allow uppercase'' checkbox controls whether this rule is
+        enforced or not. By default ``Allow uppercase'' is checked which
+        means that the rule is not enforced.
+
+ at sp 1
+ at cindex Edit by base type: contig editor
+ at cindex Edit by base confidence: contig editor
+ at cindex Contig Editor: edit by base type
+ at cindex Contig Editor: edit by base confidence
+ at item Edit by base type
+ at itemx Edit by confidence
+
+These two selections are radiobuttons, and are mutually exclusive.
+They control the outcome when replacing bases
+in the consensus. When editing the consensus 
+``Edit by base type'' changes bases that disagree
+with the consensus to the base typed. ``Edit by confidence''
+changes the confidence of disagreeing bases to 0. If the consensus
+quality cutoff value is greater than or equal to zero, characters with an
+accuracy value of 0 are ignored in the consensus calculation. That is, although
+the characters still appear in the reading, they are not used to calculate the
+consensus. In this way it is possible to maintain the original base calls
+for visual inspection, but get the correct consensus.
+
+Note that ``Edit by confidence'' will not work if the
+``frequency'' consensus algorithm is in use (_fpref(Con-Calculation, The
+Consensus Algorithms, calc_consensus)). If you wish to use ``Edit by
+confidence'', make sure that the quality cutoff is zero or higher, otherwise
+the frequency consensus algorithm will be used instead.
+
+ at sp 1
+ at cindex Allow F12 for fast tag deletion: contig editor
+ at cindex Contig Editor: Allow F12 for fast tag deletion
+ at item Allow F12 for fast tag deletion
+        F12 and Shift-F12 may be used to delete the tag underneath the
+        contig editor cursor (F12) or the mouse pointer
+        (Shift-F12). Initially these are disabled to prevent accidental
+        deletions.
+
+ at sp 1
+ at cindex Edit mode sets: contig editor
+ at cindex Mode sets: contig editor
+ at cindex Contig Editor: mode sets
+ at cindex Contig Editor: edit mode sets
+ at item Mode set 1
+ at itemx Mode set 2
+
+To make it easier to set the editing modes two user definable
+sets are available. By default these are as follows.
+
+Mode set 1:
+ at itemize @minus
+ at item   Disallow insert in read
+ at item   Disallow del in read
+ at item   Disallow insert any in cons
+ at item   Allow del dash in cons
+ at item   Disallow del any in cons
+ at item   Disallow replace in cons
+ at item   Disallow reading shift
+ at item   Disallow transpose any
+ at item   Allow uppercase
+ at item   Edit by confidence
+ at end itemize
+
+Mode set 2:
+ at itemize @minus
+ at item   Allow insert in read
+ at item   Allow del in read
+ at item   Allow insert any in cons
+ at item   Allow del dash in cons
+ at item   Disallow del any in cons
+ at item   Allow replace in cons
+ at item   Allow reading shift
+ at item   Disallow transpose any
+ at item   Allow uppercase
+ at item   Edit by confidence
+ at end itemize
+
+Currently the only way of redefining these sets is to add lines to
+your @file{.gaprc} file.
+_fxref(Conf-Introduction, Options Menu, configure)
+The method is to define a list of 1s and 0s to specify the states
+in the order listed above. The two default sets are defined as
+follows.
+
+ at example
+set_def CONTIG_EDITOR.SE_SET.1    @{0 0 0 1 0 0 0 0 1 1@}
+set_def CONTIG_EDITOR.SE_SET.2    @{1 1 1 1 0 1 1 0 1 1@}
+ at end example
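+
+As a sketch of how a set might be redefined, the following line (an
+illustrative choice, not a recommended setting) would change Mode set 1 so
+that it also allows reading shifts, by setting the seventh value to 1 and
+leaving the other defaults unchanged. It assumes the ordering of values
+described above.
+
+ at example
+set_def CONTIG_EDITOR.SE_SET.1    @{0 0 0 1 0 0 1 0 1 1@}
+ at end example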
+ at end table
+
+_split()
+ at node Editor-Quality Values
+ at subsection Adjusting the Quality Values
+ at cindex Quality values: contig editor, use within
+ at cindex Cutoff values: contig editor
+ at cindex Contig Editor: quality values
+ at cindex Contig Editor: cutoff values
+
+Each base has its own quality value. Assembly will allow only
+values between 1 and 99 inclusive. A quality value of 0 means that this base
+should be ignored. A quality value of 100 means that this base is definitely
+correct and the consensus will be forced to be the same base type and will be
+given a consensus confidence of 100. If two conflicting bases both have a
+quality of 100 the consensus will be a dash with a confidence of 0.
+
+Newly added bases or replaced bases are assigned their own quality values. By
+default these are both 100. The ``Set Default Confidence'' option in the
+settings menu allows these values to be changed.
+
+Several keyboard commands are available to edit the quality value of an
+individual base. The '[' and ']' keys set the quality to 0 and 100
+respectively. To increment or decrement the confidence of a base by 1 use
+Shift plus the Up and Down arrow keys. To increment or decrement by 10 use
+Control plus the Up and Down arrow keys. The editor will beep if you reach
+quality 0 or 100. Finally note that quality values can also be made visible by
+the use of grey scales for the sequence background colour. _oxref(Editor-Show
+Quality, Show Quality).
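+
+As a worked example of these keys, consider a base whose quality is 35 (a
+hypothetical starting value). Each line below shows the key pressed and
+the resulting quality, following the rules described above.
+
+ at example
+ at group
+Control Up      35 -> 45
+Shift Down      45 -> 44
+]               44 -> 100
+[               100 -> 0
+ at end group
+ at end example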
+
+ at node Editor-Cutoffs
+ at subsection Adjusting the Cutoff Data
+ at cindex Cutoff data: contig editor
+ at cindex Hidden data: contig editor
+ at cindex Contig Editor: cutoff data
+
+The cutoff data is displayed by pressing the ``Cutoffs''
+toggle at the top of the editor. The cutoff sequence will be
+displayed in grey. We call the boundary between the cutoff data and
+the used data the cutoff position. These positions can be shifted
+left or right for each end of the reading using the Meta Left-arrow
+and Meta Right-arrow keys respectively. As keyboards may not have a
+meta key, Control Left-arrow and Control Right-arrow also have the
+same effect. These key combinations adjust the cutoff positions by a
+single base at a time. They only work when the cursor is on the very
+first or very last ``used'' base, depending on which cutoff you wish
+to adjust.
+
+If large changes are required the cutoffs can be ``zapped'' to
+new positions using the ``<'' and ``>'' keys. To use these, place the
+editing cursor at the position required (which may be within the
+cutoff data or the used data) and press the ``<'' key to set the left
+cutoff to the point between the cursor and the base leftwards of the
+cursor. Similarly ``>'' sets the right cutoff to the point between the
+cursor and the base leftwards of the cursor. Note that many
+keyboards have ``<'' and ``>'' above the ``,'' and ``.'' keys. In this case
+you will need to press Shift in conjunction with ``,'' and ``.'' to
+perform the operations.
+
+_split()
+ at node Editor-Editing Summary
+ at subsection Summary of Editing Commands
+ at cindex Summary of editing commands: contig editor
+ at cindex Contig Editor: editing keys
+
+A brief summary of these editing operations and which (if any) edit modes are
+required can be seen below:
+
+ at example
+Key          Location    Ins/Rep  Edit Mode           Action
+--------------------------------------------------------------------------
+base/*       Reading     Replace  any                 Change base
+base/*       Reading     Insert   Insert in read      Insert base
+delete       Reading     both     Delete in read      Del base left & move
+Ctrl delete  Reading     both     Delete in read      Delete base to left
+Ctrl d       Reading     both     Delete in read      Delete under cursor
+delete       Read start  both     Reading shift       Shift left
+space        Read start  both     Reading shift       Shift right
+Ctrl l       Reading     both     any                 Move pad left
+Ctrl r       Reading     both     any                 Move pad right
+Ctrl l       Reading     both     Transpose any       Move base left
+Ctrl r       Reading     both     Transpose any       Move base right
+[            Reading     both     any                 Set quality to 0
+]            Reading     both     any                 Set quality to 100
+Shift Up     Reading     both     any                 Incr. quality by 1
+Shift Down   Reading     both     any                 Decr. quality by 1
+Ctrl Up      Reading     both     any                 Incr. quality by 10
+Ctrl Down    Reading     both     any                 Decr. quality by 10
+<            Reading     both     any                 Set left cutoff
+>            Reading     both     any                 Set right cutoff
+Meta left    Reading     both     any                 Adjust left cutoff
+Meta right   Reading     both     any                 Adjust right cutoff
+*            Consensus   both     any                 Insert pad column
+base         Consensus   Insert   Insert any in cons  Insert column
+base         Consensus   Replace  any                 Replace column
+delete *     Consensus   both     any                 Delete column
+delete -     Consensus   both     Del dash in cons    Delete column
+delete any   Consensus   both     Del any in cons     Delete column
+Ctrl d       Consensus   both     Del dash/any        Delete column
+Shift F1-10  Read/Cons   both     any                 Create tag macro
+F1 to F10    Read/Cons   both     any                 Use tag macro
+F11          Read/Cons   both     any                 Edit tag under cursor
+Shift F11    Read/Cons   both     any                 Edit tag under pointer
+F12          Read/Cons   both     Fast tag deletion   Delete tag under cursor
+Shift F12    Read/Cons   both     Fast tag deletion   Del. tag under pointer
+
+ at end example
+
+_split()
+ at node Editor-Selections
+ at section Selections
+ at cindex Selections: contig editor
+ at cindex Contig Editor: selections
+
+It is possible to highlight an area of a reading or the
+consensus sequence in preparation for performing some further action
+upon it. Examples of such actions are creating annotations and
+aligning sequences. We call these highlighted areas ``selections''.
+They will be displayed as an underlined region.
+
+The simplest way to make a selection is using the left mouse
+button. Pressing the mouse button marks the base beneath the cursor 
+as the start of the selection. Then, without releasing the button,
+moving the mouse cursor adjusts the end of the selection. Finally
+releasing the button will allow normal use of the mouse again.
+
+Sometimes we may wish to make a selection longer than is visible on the
+screen, or to extend our current selection. This can be done by using shift
+left mouse button to adjust the end of the selection. Hence we can mark the
+start of the selection using the left button, scroll along the contig to
+the desired position, and set the end using the shift left button.
+
+The selection is stored in a ``cut buffer''. This allows for
+the usual ``cut and paste'' operations between applications, although
+the contig editor only supports this in one direction (as it is not
+possible to ``paste'' into the window). The mechanism employed for this
+follows the usual X Windows standard of using the middle mouse button
+(or Alt left mouse button).
+For example, to send a piece of sequence to a text editor (eg
+ at code{Emacs}) mark the desired region using the left mouse button in
+the editor window and then press the middle button, or Alt left mouse
+button, whilst the mouse
+cursor is in the text editor window. The sequence will then be
+inserted into the text editor.
+
+A quick summary of the mouse commands follows.
+
+ at example
+Left button                       Position editing cursor to mouse cursor
+Left button (drag)                Mark start and end of selection
+Shift left button                 Adjust end of selection
+Middle button (another window)    Copy selected sequence
+Alt left button (another window)  Copy selected sequence
+ at end example
+
+_split()
+ at node Editor-Annotations
+ at section Annotations
+ at cindex Tags: contig editor
+ at cindex Annotations: contig editor
+ at cindex Contig Editor: annotations
+ at cindex Contig Editor: tags
+
+Annotations (or tags) can be placed at any position on readings or on
+the consensus.
+They are usually used to record
+positions of primers for walking, or to mark sites, such as
+repeats or compressions, that have caused problems during sequencing. 
+They can also be used to contain feature table data as read from an EMBL format
+sequence file
+(_fpref(Mutation-Detection-Reference-Sequences, Reference sequences, t)).
+Each
+annotation has a type such as ``primer'', a position, a length, a strand
+(forward, reverse or both) and an optional comment. Each type and strand
+has an associated colour that will be shown on the display. For
+information on searching for annotations see _oref(Editor-Search-Type,
+Searching by Tag Type), and _oref(Editor-Search-Anno, Searching by
+Annotation Comments).
+
+_picture(contig_editor.taged,4.16667in)
+
+To create an annotation, make a selection and then select
+``Create Tag'' from the contig editor commands menu.
+_oxref(Editor-Commands, The Commands Menu). This will bring up a further
+window; the ``tag editor'' (shown above). The ``Type:'' button at the top of
+the editor invokes a selectable list from which tag types can be chosen. 
+See below.
+
+_picture(contig_editor.tagsel,3.5in)
+
+Use this to select the desired type of
+annotation.  
+
+Next the strand of the annotation can be selected. This
+will be displayed as one of ``<----->'', ``<-----'' and ``----->''. The
+comment (the box beneath the buttons) can be edited using the usual
+combination of keyboard input and arrow keys. The ``Save'' button will
+exit the tag editor and create the annotation. To abandon editing
+without creating the annotation use the ``Cancel'' button.
+
+To edit an existing annotation, position the editing cursor
+within an annotation and select ``Edit Tag'' from the commands menu. This
+will be a cascading menu, typically showing one tag. If multiple tags
+coincide at the same sequence position you will be able to choose which
+tag to edit. Once again the tag editor will be invoked and operates as
+before. The @b{F11} key is also a shortcut for editing the top-most
+tag underneath the editor cursor.
+When editing, the ``Save'' button will save the changes and ``Cancel''
+will abandon them.
+
+Removing an annotation involves positioning the editing cursor within
+an annotation and selecting ``Delete Tag'' from the commands menu. As with
+``Edit Tag'' this is a cascading menu to allow you to choose which tag at a
+specific point to delete.
+
+Within a tag editor two buttons ``Move'' and ``Copy'' may be used to
+reposition existing tags. When editing a tag, the current location of
+the tag is underlined within the editor. If a new region is
+highlighted (on the consensus, a different reading, or even in a
+different contig) and either of these buttons is pressed the tag will
+be saved to the new location and removed from the previous location
+if ``Move'' was used. This can be used as an easy way to adjust the
+extents of an existing tag or as a way to annotate multiple
+locations with the same tag contents.
+
+As usual, ``undo'' can be used to undo any of these annotation creations,
+edits and removals.
+
+Some tags may contain graphical controls instead of the usual text
+panel. These are encoded with the master gap4 tag database
+(@i{GTAGDB}) by specifying the default tag text to be a piece of
+``ACD'' code. A full description of the (modified for gap4) ACD syntax
+is not currently available, but it is strongly modelled on the
+EMBOSS ACD syntax which has documentation at
+_uref(http://www.emboss.org/Acd/index.html).
+
+It is possible to add your own tag types by modifying either the
+system @i{GTAGDB} file or creating your own @i{GTAGDB} file in your
+home directory (for all your databases) or the current directory (for
+just those in that directory).
+
+_picture(contig_editor.tagmacro,4.50833in)
+
+For rapid annotation a series of 10 macros may be programmed. Press
+Shift and a function key between F1 and F10 to bring up the macro
+editor. This looks much like the normal tag editor except that @b{Save}
+is replaced with @b{Save Macro} and saving does not actually create a
+tag on the sequence. To use the macro, highlight the bases you wish and
+press the function key corresponding to that macro - F1 to F10. For a
+single base pair tag you do not need to underline a region as the tag
+will automatically cover the base underneath the editing cursor. To
+keep these macros permanently use the ``Save Macros'' option in the
+``Settings'' menu.
+
+You may find that some function keys are already programmed to do other
+things (such as raise or lower windows), depending on the windowing
+environment in use. If this is the case either modify the configuration
+of your windowing system or simply use another macro key.
+
+For rapid editing and deleting the F11 and F12 keys may be used. These
+edit and delete the top-most tag underneath the editing cursor. If you
+wish to edit or delete the tag underneath the mouse cursor instead (and
+hence save a mouse click) use Shift F11 and Shift F12 for edit and delete.
+
+The Control-Q key sequence may be used to toggle the displaying of tags.
+Pressing it once will prevent all tags from being displayed in the editor.
+This is sometimes useful to see any colouring information underneath the tag.
+Pressing Control-Q once more will redisplay them.
+
+_split()
+ at node Editor-Searching
+ at section Searching
+ at cindex Searching: contig editor
+ at cindex Contig Editor: searching
+
+ at menu
+* Editor-Search-Pos::           Searching by position
+* Editor-Search-Prob::          Searching by problem
+* Editor-Search-Anno::          Searching by annotation comments
+* Editor-Search-Seq::           Searching by sequence
+* Editor-Search-Qual::          Searching by quality
+* Editor-Search-ConsQual::      Searching by consensus quality
+* Editor-Search-File::          Searching by file
+* Editor-Search-Name::          Searching by reading name
+* Editor-Search-Edit::          Searching by edits
+* Editor-Search-VerifyEdit1::   Searching by evidence for edit (1)
+* Editor-Search-VerifyEdit2::   Searching by evidence for edit (2)
+* Editor-Search-Type::          Searching by tag type
+* Editor-Search-Discrepancies:: Searching by discrepancies
+* Editor-Search-ConsDiscreps::  Searching by consensus discrepancies
+ at end menu
+
+The contig editor's searching ability and its links to the consensus
+calculation algorithm are crucial in determining the efficiency with which
+contigs can be checked and corrected. The consensus is calculated ``on the
+fly'' and changes in response to edits. For editing, the most important
+search functions are those which reveal problems in the consensus
+whilst ignoring all bases that are adequately well determined.
+The default search type is therefore by consensus quality. By default this
+is done in the forward direction and for a quality value of 30, although
+this is configurable by changing the following lines in the gaprc file.
+
+ at example
+set_def CONTIG_EDITOR.SEARCH.DEFAULT_TYPE       consquality
+set_def CONTIG_EDITOR.SEARCH.DEFAULT_DIRECTION  forward
+set_def CONTIG_EDITOR.SEARCH.CONSQUALITY_DEF    30
+ at end example
+
+Selecting ``Next Search'' brings up a window which can remain present
+during normal editor operation. The window allows the user to select
+the direction of search, the type of search, and a value to search
+on. The value is entered into a value text box, then pressing the
+``search'' button performs the search. If successful, the cursor is
+positioned accordingly. An audible tone indicates failure. Pressing
+the ``Cancel'' button removes the search window. The search window is
+automatically removed when the contig editor is exited.
+
+_picture(contig_editor.search,2.63333in)
+
+The ``Cutoffs''
+button can be used to select whether or not searching should find
+matches within the cutoff data.
+
+The Control-s key binding in the editor is equivalent to searching forward for
+the next match. The Escape Control-s key sequence performs a reverse search.
+Both key bindings will bring up the search window if it is not currently
+displayed.
+
+As is described below, there are fourteen different search modes.
+
+ at node Editor-Search-Pos
+ at subsection Search by Position
+ at cindex Searching by position: contig editor
+
+The presence of padding characters in the consensus can greatly alter the
+length of the sequence, and the positions of the bases along it. Positions
+can therefore be defined in two ways: those which include pads and those
+which do not. This option
+(termed a search!) moves the cursor to a specified position.
+The numeric position is specified in
+the value text box. For example, a value of ``1234'' causes the cursor to be
+placed at base number 1234 in the contig.
+Positioning within a
+reading is achieved by prefixing the number with the ``@@'' character;
+for example, ``@@123'' positions the cursor at base 123 of the sequence in which
+the cursor lies. Relative positions can be specified by prefixing
+the number with a plus or minus character. For example, ``+1234'' will advance
+the cursor 1234 bases. If possible, the cursor is positioned within
+the same sequence. The direction buttons have no effect on this
+operation.
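+
+The three value forms described above can be told apart mechanically. The
+Python sketch below is illustrative only and is not Gap4 code; the function
+name and the returned labels are assumptions made for this example.
+
```python
import re

# Hypothetical helper, not part of Gap4: classify a "search by position" value.
_VALUE_RE = re.compile(r"^(@|[+-])?(\d+)$")


def parse_position_value(value):
    """Return (kind, number) for a position search value.

    kind is "contig" for a plain number, "reading" for an @-prefixed
    number, and "relative" for a number with a leading + or -.
    """
    match = _VALUE_RE.match(value.strip())
    if match is None:
        raise ValueError("not a position value: %r" % value)
    prefix, digits = match.groups()
    if prefix is None:
        return ("contig", int(digits))
    if prefix == "@":
        return ("reading", int(digits))
    # A relative offset keeps its sign so the caller can add it to the cursor.
    sign = -1 if prefix == "-" else 1
    return ("relative", sign * int(digits))
```
+
+A caller would add a ``relative'' result to the current cursor position and
+use the other two results as absolute positions in the contig or reading.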
+
+ at node Editor-Search-Prob
+ at subsection Search by Problem
+ at cindex Searching by problem: contig editor
+
+This positions the cursor at the next place in the consensus
+sequence which is ``*'', ``-'' or ``N''. The search can be
+performed either forwards or backwards from the current cursor
+position. Obviously the characters
+appearing in the consensus depend on the selected consensus calculation
+algorithm and the thresholds set.
+
+ at node Editor-Search-Anno
+ at subsection Search by Annotation Comments
+ at cindex Searching by annotation comments: contig editor
+
+This positions the cursor at the start of the next tag which
+has a comment containing the string specified in the value box.
+Only currently active tag types are searched.
+The search performed is a regular expression search, and
+certain characters have special meaning. Be careful when your
+string contains ``.'', ``*'', ``['', ``]'', ``\'', ``^'' or ``$''. The search can be
+performed either forwards or backwards from the current cursor
+position. Searching with an empty value will find all tags.
+
+ at node Editor-Search-Type
+ at subsection Search by Tag Type
+ at cindex Searching by tag type: contig editor
+
+This positions the cursor at the start of the next tag of the specified
+type. If the tag type is not active, the tag will be found and
+underlined but will remain invisible.
+To change
+the type, select from the menu that pops up when the mouse is clicked
+on the button labeled ``Type:''. The search can be performed either
+forwards or backwards from the current cursor position. To find all
+tags, use ``Search by Annotation Comments'', with an empty text box.
+
+ at node Editor-Search-Seq
+ at subsection Search by Sequence
+ at cindex Searching by sequence: contig editor
+
+This positions the cursor at the start of the next segment of
+sequence that matches the value specified in the text box.
+The search is case insensitive, ignores pads, and can allow a specified
+number of mismatches. It may be performed on sequence only, consensus
+only or both. It also operates either forwards or backwards from the
+current editing cursor position.
+
+ at node Editor-Search-Qual
+ at subsection Search by Quality
+ at cindex Searching by quality: contig editor
+
+This positions the cursor at the next place in the consensus
+sequence where the consensus sequences for the two strands disagree.
+Where there is only data for one strand the search will stop
+at every base. The search can be performed either forwards or
+backwards from the current cursor position.
+
+ at node Editor-Search-ConsQual
+ at subsection Search by Consensus Quality
+ at cindex Searching by consensus quality: contig editor
+
+This positions the cursor on the consensus at the next
+position where the quality of
+the consensus is below a given threshold. The quality of the consensus is
+calculated by the consensus algorithm. For this search
+the quality threshold should be entered into the
+value box and should be within the range of 0 to 100 inclusive.
+
+ at node Editor-Search-File
+ at subsection Search by File
+ at cindex Searching by file: contig editor
+
+This steps the cursor through a set of positions specified in a file.
+The format for the positions in the file is one per
+line with each line consisting of a reading name, a position within that
+reading, and an optional comment. If a position is relative to the
+start of the contig rather than the start of any particular reading, then
+simply use the first reading in the contig. Positions that are beyond the
+ends of the reading are still valid, although the editing cursor is moved
+onto the consensus sequence.
+
+The comment can consist of any string. Multiline comments are possible, but
+they must be written using @code{\n} in the comment string rather than an
+actual newline character (which would signify the start of the next
+record). 
+The
+comment for the current position is displayed at the bottom of the editor search
+window in a text panel which is visible only when in the ``search by file''
+mode.
+
+Any record containing a reading name that is not in the current contig is silently
+ignored. This allows for a search file to have positions for all contigs.
+However at present there is no mechanism for stepping through an entire search
+file bringing up editors for each contig as required. 
+This will be implemented in the future.
+
+An example file follows.
+
+ at example
+xb63c7.s2 102
+xb63c7.s2 30 A multi-\nline comment.
+xb32a2.s1 56 Oligo, of length 12
+xa17b1.r1 5714 Repeat from 5714 to 5780
+ at end example
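+
+As a rough illustration of the record layout described above, the following
+Python sketch reads such a file into a list of records. It is not Gap4 code:
+the function name is hypothetical, and the handling of blank lines and
+malformed records is an assumption.
+
```python
def parse_search_file(text):
    """Parse 'search by file' records into (reading, position, comment) tuples."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines are skipped
        # Reading name, position, then the rest of the line as the comment.
        fields = line.split(None, 2)
        if len(fields) < 2:
            raise ValueError("record needs a reading name and a position: %r" % line)
        name, position = fields[0], int(fields[1])
        comment = fields[2] if len(fields) > 2 else ""
        # A literal backslash-n in the file stands for a line break.
        records.append((name, position, comment.replace("\\n", "\n")))
    return records
```
+
+Records naming readings outside the current contig would then be filtered
+out by the caller, matching the behaviour described above.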
+
+ at node Editor-Search-Name
+ at subsection Search by Reading Name
+ at cindex Searching by reading name: contig editor
+
+This positions the cursor at the left end of the reading specified
+in the value text box. If the value is prefixed with a hash sign it
+is assumed to be a reading number. Otherwise it is assumed to be a
+reading name. For example, ``#123'' positions the cursor at the left end of
+reading number 123, and ``a16a12.s1'' positions it at the start of reading
+a16a12.s1. If the value is ``a16'' the cursor is positioned at the
+first reading whose name starts with ``a16''.
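+
+The lookup rules above can be sketched in Python as follows. This is only an
+illustration, not the Gap4 implementation: the function name and the list of
+(number, name) pairs are hypothetical, and trying an exact name match before
+a prefix match is an assumption.
+
```python
def find_reading(value, readings):
    """Return the index of the reading selected by a search value, or None.

    readings is a list of (number, name) pairs in editor order; this data
    layout is an assumption made for the example.
    """
    if value.startswith("#"):
        # A leading hash sign means the rest is a reading number.
        wanted = int(value[1:])
        for index, (number, _name) in enumerate(readings):
            if number == wanted:
                return index
        return None
    # Assumption: an exact name match wins over an earlier prefix match.
    for index, (_number, name) in enumerate(readings):
        if name == value:
            return index
    for index, (_number, name) in enumerate(readings):
        if name.startswith(value):
            return index
    return None
```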
+
+ at node Editor-Search-Edit
+ at subsection Search by Edit
+ at cindex Searching by edits: contig editor
+
+This positions the cursor at the next place in the contig where an edit has
+been made. Edits include base insertions, deletions, replacements and
+confidence value changes. The search can be performed either forwards or
+backwards from the current cursor position.
+
+ at node Editor-Search-VerifyEdit1
+ at subsection Search by Evidence for Edit (1)
+ at cindex Searching by Verify AND: contig editor
+ at cindex Verify AND: contig editor
+ at cindex Searching by Evidence for Edit1: contig editor
+ at cindex Evidence for Edit1: contig editor
+
+The Evidence for Edit (1) option checks edited bases to find bases in the
+consensus for which there is no evidence in the original readings. The
+definition of evidence is that at least one reading had this original base
+call. 
+Currently this search operates only in the forward direction.
+
+ at node Editor-Search-VerifyEdit2
+ at subsection Search by Evidence for Edit (2)
+ at cindex Searching by Verify OR: contig editor
+ at cindex Verify OR: contig editor
+ at cindex Searching by Evidence for Edit2: contig editor
+ at cindex Evidence for Edit2: contig editor
+
+The Evidence for Edit (2) option checks edited bases to find bases in the
+consensus for which there is no evidence in the original readings. The
+definition of evidence is that at least one reading from each strand
+had this original base call.
+Currently this searches only in the forward direction.
+
+ at node Editor-Search-Discrepancies
+ at subsection Search by Discrepancies
+ at cindex Searching by discrepancies: contig editor
+ at cindex Discrepancies: searching for in contig editor
+
+This finds positions where two or more bases are above a particular
+quality level, but in disagreement. The quality threshold is given in the
+value box and should be within the range of 0 to 100 inclusive.
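+
+The discrepancy test can be pictured with the short Python sketch below. It
+is not Gap4 code: the function name and the column layout are hypothetical,
+and treating ``above'' as ``at or above the threshold'' is an assumption.
+
```python
def find_discrepancies(columns, threshold):
    """Return indices of columns holding two or more differing good bases.

    columns is a list of columns, each a list of (base, quality) pairs.
    A base counts as good when quality >= threshold (an assumption about
    how "above" is interpreted).
    """
    hits = []
    for index, column in enumerate(columns):
        # Collect the distinct base types that pass the quality threshold.
        good = {base for base, quality in column if quality >= threshold}
        if len(good) >= 2:
            hits.append(index)
    return hits
```
+
+A column where the only disagreeing base is of low quality is not reported,
+which is what distinguishes this search from a plain mismatch scan.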
+
+ at node Editor-Search-ConsDiscreps
+ at subsection Search by Consensus Discrepancies
+ at cindex Searching by consensus discrepancies: contig editor
+ at cindex Consensus discrepancies: searching for in contig editor
+
+This finds positions where there is a significant disagreement in a
+particular consensus base. Unlike ``by Discrepancies'' this does not
+look for individual base confidence values, but rather it combines
+multiple bases together for each base type and searches for the second
+highest confidence at any point. This is the same method used in the
+2nd-highest confidence graph
+(_fpref(Consistency-2ndHighest, 2nd-Highest Confidence, consistency_display)). 
+
+
+_split()
+ at node Editor-Commands
+ at section The Commands Menu
+ at cindex Commands menu: contig editor
+ at cindex Contig Editor: commands menu
+
+The Commands menu is available by either pressing the Commands button
+at the top of the contig editor window, or by pressing the Control key
+and the left mouse button, or by pressing the right mouse button with the
+mouse cursor anywhere within the sequence display section of the
+contig editor. A menu will be revealed containing the following
+options (which are described in greater detail below).
+
+ at menu
+* Editor-Searching::            Searching
+* Editor-Annotations::          Create Tag
+* Editor-Annotations::          Edit Tag
+* Editor-Annotations::          Delete Tag
+* Editor-Comm-Save::            Save Contig
+* Editor-Comm-Dump::            Dump Contig to File
+* Editor-Comm-Consensus Trace:: Save Consensus Trace
+* Editor-Comm-List Confidence:: List Confidence
+* Editor-Comm-Report-Mutations:: Report Mutations
+* Editor-Comm-Primer Selection:: Select Primer
+* Editor-Comm-Align::           Align
+* Editor-Comm-Remove Reading::  Remove Reading
+* Editor-Comm-Break Contig::    Break Contig
+ at end menu
+
+ at subsection Search
+
+This Contig Editor Commands menu function
+performs a search. _oxref(Editor-Searching, Searching).
+
+ at subsection Create Tag
+
+This Contig Editor Commands menu function
+creates an annotation. _oxref(Editor-Annotations, Annotations).
+
+ at subsection Edit Tag
+
+This Contig Editor Commands menu function
+edits an annotation. _oxref(Editor-Annotations, Annotations).
+
+ at subsection Delete Tag
+
+This Contig Editor Commands menu function
+removes an annotation. _oxref(Editor-Annotations, Annotations).
+
+ at node Editor-Comm-Save
+ at subsection Save Contig
+ at cindex Contig Editor: saving
+ at cindex Saving: contig editor
+
+This Contig Editor Commands menu function
+writes any edited data to disk. The undo history is
+cleared and it is no longer possible to quit and abandon these saved
+changes. The Control-x Control-s key sequence will also save the contig
+in the same manner as the Save Contig command.
+
+ at node Editor-Comm-Dump
+ at subsection Dump Contig to File
+ at cindex Dump Contig: contig editor
+ at cindex Contig Editor: Dump Contig
+ at cindex Contigs: printing
+ at cindex Printing contigs
+ at cindex Contigs: saving to file
+ at cindex saving contigs to file
+ at cindex Contig Editor: saving to file
+ at cindex aligned readings: saving to file
+ at cindex aligned readings: printing
+ at cindex printing: aligned readings
+
+This Contig Editor Commands menu function
+outputs the current contig, as currently shown (e.g. with status
+lines) to a file. The user can select the region to dump, the length of each
+line, and the file name to use. The sequence names can be up to 40 characters, 
+but often projects do not use the full length. To avoid wasted space in the
+output the number of columns to use for sequence names can be adjusted.
+
+ at node Editor-Comm-Consensus Trace
+ at subsection Save Consensus Trace
+ at cindex Save Consensus Trace: contig editor
+ at cindex Contig Editor: Save Consensus Trace
+ at cindex Consensus Trace
+ at cindex Trace: Consensus Trace
+
+This Contig Editor Commands menu function
+produces a trace file for the consensus sequence by averaging the
+traces of the readings. The command brings up a
+dialogue containing controls to specify the filename, the consensus start and
+end positions, the strand, and whether to use matching reads.
+
+As the trace of a reading is dependent on the direction it was read, the
+consensus trace can be computed from all the reads in either the forward or
+reverse directions, but not both at once. When the ``Use only matching reads''
+toggle is set to ``Yes'' only the readings of the correct strand that have the
+same base call as the consensus sequence are used. The option is useful for
+producing wild-type trace files for a mutation analysis project.
+
+ at node Editor-Comm-List Confidence
+ at subsection List Confidence
+ at cindex List confidence: contig editor
+ at cindex Contig Editor: List Confidence
+
+This Contig Editor Commands menu function
+operates in a very similar manner to the main Gap4 List Confidence
+command (_fpref(Con-Evaluation, List Confidence, calc_consensus)), except that
+it only operates on the current contig, and it
+uses the current editor consensus confidences rather than the ones saved to
+disk. It displays a dialogue requesting a range within the contig and a
+question asking if only a summary of the results is required.
+
+Pressing OK or Apply will add to the editor information line a count of the
+expected number of errors and the error rate. If the ``Only update information
+line'' question was answered ``No'' then the full frequency table will also be
+output. It will appear in the main text output window in the same format
+as the ``List Confidence''
+command in the main Gap4 View menu. The Apply button can be used to calculate
+the number of errors without removing the dialogue.
+
+It is often the very ends of contigs (which are
+generally low coverage and bad quality) that have most of the errors,
+and so
+it is sometimes useful to set a range which
+includes all of the contig except for around 1000 bases from each end.
+
+
+
+ at node Editor-Comm-Report-Mutations
+ at subsection Report Mutations
+ at cindex Report mutations: contig editor
+ at cindex Mutation reporting: contig editor
+ at cindex Contig Editor: mutation reporting
+This Contig Editor Commands menu function is used to produce a list of all
+the bases annotated with mutation tags (or those bases which differ from
+the consensus/reference sequence). If the tags or differences are within
+segments of sequence which are also annotated with EMBL feature table CDS
+records, the report will include data describing its effect.
+The report, which can be sorted by sequence or position, 
+includes the reading names, mutation positions relative to the 
+reference
+sequence, the actual change, its effect, and the evidence. An example is shown
+below.
+
+ at example
+ at group
+ at cartouche
+
+001321_11aF 33885T>Y (silent F) (strand - only)
+001321_11aF 34407G>K (expressed E>[ED]) (strand - only)
+001321_11cF 35512T>Y (silent L) (double stranded)
+001321_11cF 35813C>Y (expressed P>[PL]) (double stranded)
+001321_11dF 36314A>R (expressed E>[EG]) (double stranded)
+001321_11eF 36749A>R (expressed K>[KR]) (double stranded)
+001321_11eF 37313T>K (noncoding) (strand - only)
+000256_11eF 36749A>G (expressed K>R) (double stranded)
+
+ at end cartouche
+ at end group
+ at end example
+
+Here the first record is for reading 001321_11aF, position 33885, T changed
+to T and C (i.e. is heterozygous) to produce no amino acid change, with evidence coming only from
+the complementary strand. The last record is for reading 000256_11eF, position
+36749, A changed to G, producing an amino acid change K to R, with evidence
+from both strands of the sequence. The penultimate record denotes a 
+heterozygote in a noncoding region.
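+
+For readers who want to post-process such a report, the Python sketch below
+splits one record into its fields. It is based solely on the example lines
+shown above, so the regular expression, the function name and the field names
+are assumptions rather than a specification of the report format. The table
+of two-base ambiguity codes follows the standard IUPAC nucleotide codes.
+
```python
import re

# Layout inferred from the example report: name, position, change, effect, evidence.
_RECORD_RE = re.compile(r"^(\S+) (\d+)([A-Z])>([A-Z]) \((.*?)\) \((.*)\)$")

# Standard IUPAC two-base ambiguity codes.
IUPAC_PAIRS = {"R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT"}


def parse_mutation_record(line):
    """Split one report line into a dictionary of its fields."""
    match = _RECORD_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised report line: %r" % line)
    reading, position, reference, observed, effect, evidence = match.groups()
    return {
        "reading": reading,
        "position": int(position),
        "reference": reference,
        "observed": observed,
        # Expand an ambiguity code into the bases it stands for.
        "alleles": IUPAC_PAIRS.get(observed, observed),
        "effect": effect,
        "evidence": evidence,
    }
```
+
+Applied to the first example record, the ``alleles'' field holds both C and
+T, reflecting the heterozygous call described above.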
+
+
+ at node Editor-Comm-Primer Selection
+ at subsection Select Primer
+ at cindex Primer selection: contig editor
+ at cindex Oligo selection: contig editor
+ at cindex Contig Editor: oligo selection
+ at cindex Contig Editor: primer selection
+
+This Contig Editor Commands menu function
+allows the user to employ the primer selection algorithm OSP to
+find primers for sequencing experiments. _oxref(Editor-Primer Selection,
+Searching for Primers).
+
+ at node Editor-Comm-Align
+ at subsection Align
+ at cindex Contig Editor: align
+ at cindex Align: contig editor
+
+This Contig Editor Commands menu function
+performs a sequence alignment between the currently selected segment of a
+reading and the consensus sequence. It provides a simple way of extending the
+visible part of a reading to use its hidden data, which is often useful to
+double strand a short section of consensus without the need to perform further
+experiments.  On a sequence, highlight the cutoff data to align along with a
+small section of the good quality non-cutoff data. Then select the align
+command and adjust the cutoff point as desired.  Pads are inserted in the
+consensus and readings as necessary, although pads will not be inserted in the
+cutoff data of other sequences.
+
+ at node Editor-Comm-Remove Reading
+ at subsection Remove Reading
+
+This Contig Editor Commands menu function
+marks a reading for subsequent removal. _oxref(Editor-Remove Readings,
+Removing readings from the contig).
+
+ at node Editor-Comm-Break Contig
+ at subsection Break Contig
+ at cindex Contig editor: break contig
+ at cindex Break contig: contig editor
+ at cindex Contig breaking
+
+This Contig Editor Commands menu function
+breaks the contig so that the reading underneath the editing cursor is
+the left end of a new contig. In order to perform this operation all edits
+are saved automatically first. Once saved these edits cannot be undone. This
+operation is identical to the Break Contig command in the main menu.
+_fxref(Break Contig, Break Contig, disassembly)
+
+
+_split()
+ at node Editor-Settings
+ at section The Settings Menu
+ at cindex Settings menu: contig editor
+ at cindex Contig Editor: settings menu
+ at cindex Consensus: contig editor
+ at cindex configure: contig editor
+ at cindex Settings: saving in contig editor
+ at cindex Contig Editor: saving settings
+ at cindex Contig Editor: saving configuration
+
+The purpose of this menu is to configure the operation of the contig
+editor, including the consensus calculation, the active tags and the
+status lines. Settings can be saved using the ``Save settings'' button,
+but this does not save any tag macros. These may be saved using the
+``Save Macros'' option. Settings for the following options can be changed.
+
+ at ifset tex
+ at itemize @bullet
+ at item
+Status Line
+ at item
+Trace Display
+ at item
+Consensus algorithm
+ at item
+Highlight Disagreements
+ at item
+Compare Strands
+ at item
+Toggle auto-save
+ at item
+3 Character Amino Acids
+ at item
+Show reading quality
+ at item
+Show consensus quality
+ at item
+Show edits
+ at item
+Show unpadded positions
+ at item
+Show template names
+ at item
+Set Active Tags
+ at item
+Set Output List
+ at item
+Set Default Confidences
+ at item Store Undo
+Set or unset saving of undo
+ at end itemize
+ at end ifset
+
+
+ at menu
+* Editor-Status::               Status Line
+* Editor-Trace Display::        Trace Display
+* Editor-Consensus algorithm::  Consensus Algorithm
+* Editor-Group Readings::       Group Readings
+* Editor-Disagree::             Highlight Disagreements
+* Editor-Compare::              Compare Strands
+* Editor-auto-save::            Toggle auto-save
+* Editor-3Char::                3 Character Amino Acids
+* Editor-Show Quality::         Show reading/consensus quality
+* Editor-Show edits::           Show edits
+* Editor-Show unpadded pos::    Show Unpadded Positions
+* Editor-Show template names::  Show Template Names
+* Editor-Active Tags::          Set Active Tags
+* Editor-Output List::          Set Output List
+* Editor-Default Confidence::   Set Default Confidences
+* Editor-Store Undo::           Set or unset saving of undo
+ at end menu
+
+ at node Editor-Status
+ at subsection Status Line
+ at cindex Contig Editor: status line
+ at cindex Status line: contig editor
+
+The contig editor can display several additional text lines underneath the
+consensus sequence. This ``status'' data is of textual form and can
+provide additional
+information about the data displayed above. Currently, there are two forms
+of status line available. These are ``Strands'' and ``Translate Frame''. Both
+status line types update automatically as edits are made that change the
+consensus.
+
+The status line menu is accessed by cascading off the settings menu.
+It contains the following.
+
+ at itemize @bullet
+ at item Show Strands
+ at item Translate using feature tables
+
+ at item Translate frame 1+
+ at item Translate frame 2+
+ at item Translate frame 3+
+ at item Translate frame 1-
+ at item Translate frame 2-
+ at item Translate frame 3-
+
+ at item Translate + frames
+ at item Translate - frames
+ at item Translate all frames
+
+ at item Remove all
+ at end itemize
+
+ at cindex Contig Editor: Show Strands
+ at cindex Show Strands: contig editor
+
+``Show Strands'' creates a single line consisting of the +, -, = and !
+characters.  These indicate: positive strand only, negative strand
+only, both strands (in agreement) and both strands (in disagreement)
+respectively.
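+
+The meaning of the four characters can be summarised with a small Python
+sketch. It is illustrative only and not taken from Gap4: the function name
+and input layout are hypothetical, and showing a space where neither strand
+has data is an assumption.
+
```python
def strands_status(plus_calls, minus_calls):
    """Build a Show Strands style line from per-column strand calls.

    Each argument holds one base character per column, with None where
    that strand has no data. Columns with no data at all are shown as a
    space, which is an assumption.
    """
    symbols = []
    for plus, minus in zip(plus_calls, minus_calls):
        if plus is None and minus is None:
            symbols.append(" ")
        elif minus is None:
            symbols.append("+")  # positive strand only
        elif plus is None:
            symbols.append("-")  # negative strand only
        elif plus == minus:
            symbols.append("=")  # both strands, in agreement
        else:
            symbols.append("!")  # both strands, in disagreement
    return "".join(symbols)
```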
+
+ at cindex Contig Editor: translations
+ at cindex Contig Editor: translations using feature tables
+ at cindex Feature tables: Translation in Contig Editor
+ at cindex Translations: contig editor
+The frame translation status lines provide translations in each of the
+six available reading
+frames. Alternatively, using the ``Translate using feature tables'' option, only
+segments described in CDS records will be translated. The CDS records
+are those contained in the reference sequence. Translations can be
+displayed in either the single character or the three character amino
+acid codes.
+
+Pressing the right mouse button on the ``name'' segment of the status
+line (on the left hand side) pops up a menu. The commands available
+may depend on the type of the status line chosen; currently, however, the
+menu contains only the ``Remove'' command. This, as expected,
+removes the status line from the display. To remove all status lines
+use the ``Remove all'' command from the ``Status Line'' cascading menu.
+
+Note that
+the data in the status line cannot be cut and pasted, modified or
+searched; it is not possible to move the cursor into these lines.
+
+ at node Editor-Trace Display
+ at subsection Trace Display
+ at cindex Contig Editor: Trace Display menu
+ at cindex Trace Display menu: contig editor
+
+This is a cascading menu containing various options for configuring
+the trace views within the editor.
+
+ at subsubsection Auto-display Traces
+ at cindex Auto-display Traces: contig editor
+ at cindex Contig Editor: auto-display traces
+
+When switched on, auto-display traces will direct certain searches to
+automatically display relevant traces to aid in solving problems.  This
+works in conjunction with most appropriate searches. The traces chosen to
+solve the ``problem'' will, by default, be the best trace from each strand which
+agrees with the consensus (which is calculated at a low 
+consensus cutoff) and the
+best trace from each strand which disagrees with the consensus. This selection
+of traces may be adjusted by modifying the
+ at code{CONTIG_EDITOR.AUTO_DISPLAY_TRACES_CONF} configuration variable. The
+default setting of this is ``@code{+ - +d -d}''. Each of the space separated
+elements in this string corresponds to a trace file to choose. If one cannot
+be found, then it is ignored. The order listed here is the order in which they
+will be displayed in the trace window. The complete list of available trace
+specifiers is:
+
+ at table @code
+ at item +
+Best +ve strand trace agreeing with consensus
+ at item +p
+Best +ve strand dye-primer trace agreeing with consensus
+ at item +t
+Best +ve strand dye-terminator trace agreeing with consensus
+ at item -
+Best -ve strand trace agreeing with consensus
+ at item -p
+Best -ve strand dye-primer trace agreeing with consensus
+ at item -t
+Best -ve strand dye-terminator trace agreeing with consensus
+ at item d
+Best trace disagreeing with consensus
+ at item +d
+Best +ve strand trace disagreeing with consensus
+ at item -d
+Best -ve strand trace disagreeing with consensus
+ at item +2
+Second best +ve strand trace agreeing with consensus
+ at item +2p
+Second best +ve strand dye-primer trace agreeing with consensus
+ at item +2t
+Second best +ve strand dye-terminator trace agreeing with consensus
+ at item -2
+Second best -ve strand trace agreeing with consensus
+ at item -2p
+Second best -ve strand dye-primer trace agreeing with consensus
+ at item -2t
+Second best -ve strand dye-terminator trace agreeing with consensus
+ at item 2d
+Second best trace disagreeing with consensus
+ at item +2d
+Second best +ve strand trace disagreeing with consensus
+ at item -2d
+Second best -ve strand trace disagreeing with consensus
+ at end table
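+
+The specifiers in the table follow a regular pattern: an optional strand, an
+optional ``2'' for second best, and an optional suffix. The Python sketch
+below decodes a specifier string on that basis. It is not the Gap4 parser;
+the function name and the tuple layout are assumptions made for illustration,
+and only the specifiers listed in the table are accepted.
+
```python
import re

# Optional strand, optional "2" (second best), optional p/t/d suffix.
_SPEC_RE = re.compile(r"^([+-]?)(2?)([ptd]?)$")
_CHEMISTRY = {"p": "dye-primer", "t": "dye-terminator"}


def parse_trace_specs(conf):
    """Decode a specifier string into (strand, rank, chemistry, agrees) tuples."""
    specs = []
    for token in conf.split():
        match = _SPEC_RE.match(token)
        # Only the disagreeing forms ("d", "2d") may omit the strand.
        if match is None or (match.group(1) == "" and match.group(3) != "d"):
            raise ValueError("unknown trace specifier: %r" % token)
        strand, second, suffix = match.groups()
        specs.append((
            strand or None,          # "+", "-" or None for either strand
            2 if second else 1,      # best or second best
            _CHEMISTRY.get(suffix),  # None when no chemistry is requested
            suffix != "d",           # True when the trace agrees with consensus
        ))
    return specs
```
+
+The list keeps the order of the string, since that is the order in which the
+traces are displayed.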
+
+ at subsubsection Show Read-pair Traces
+ at cindex Show read-pair Traces
+ at cindex Read-pairs, trace display
+
+When double-clicking on a sequence to view a trace this option will
+automatically identify traces on both strands of this template. Both
+the forward strand and reverse strand traces will then be shown, in
+that order.
+
+ at subsubsection Auto-diff Traces
+ at cindex Trace differences: contig editor
+ at cindex Auto-diff traces: contig editor
+ at cindex Contig Editor: Auto-diff traces
+
+Once this is
+activated, whenever the user double clicks on a base in the editor
+sequence display, not only is the reading's trace displayed, but also
+its designated reference trace plus the difference between them. If its
+complementary reading is available, its trace and reference trace and
+their differences are also displayed.
+
+If no traces have been specified to be the reference traces then Gap4
+will attempt to automatically pick two. It chooses the highest quality
+pair of traces that come from the same template and disagree with
+either the forward or reverse strand of the trace initially
+double-clicked upon.
+
+_lpicture(mut_traces_het,6in)
+
+Trace differences display
+
+As is shown in the figure below, it is also possible to set the trace
+difference display to use positive and negative references.
+
+_lpicture(mut_traces_positive,6in)
+
+For further information about mutation detection, see
+_fref(Mutation-Detection-Introduction, Search for Mutations, mutations)
+
+ at subsubsection Y scale differences
+ at cindex Y scale differences
+ at cindex Trace differences: Y scaling
+
+When performing trace alignments and differencing (using Auto-diff
+traces or via the manual ``difference'' option in the trace display)
+this option controls whether to perform a trace peak-height
+normalisation on both traces prior to alignment and subtraction.
+
+ at node Editor-Consensus algorithm
+ at subsection Consensus Algorithm
+ at cindex Consensus algorithm in contig editor
+
+This allows selection of the consensus algorithm to use within the Contig
+Editor. Like the consensus and quality cutoff parameters, it is local to the
+specific editor being used. The main Consensus algorithm option should be used
+to globally change the algorithm being used.
+_fxref(Conf-Consensus Algorithm, Consensus Algorithm, configure)
+
+ at node Editor-Group Readings
+ at subsection Group Readings
+ at cindex Templates: grouping readings in contig editor
+ at cindex Contig Editor: group readings
+ at cindex Group Readings: contig editor
+
+This is a cascading menu allowing the readings viewed in the editor to
+be sorted and grouped by different criteria. By default the order (in
+Y) that readings are listed in the editor is sorted by the position of
+the left-most used base. This option provides a choice of sorting by
+position, strand (plus first, then minus), name, number, template and
+clone.
+
+Where appropriate an automatic sub-ordering is applied. For example
+sorting by strand will group the readings primarily into ``+'' and
+``-'' groups, but within the group the readings are still sorted by
+position. When grouping by template the sub-grouping is by strand.
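+
+The sub-ordering can be thought of as a compound sort key. The Python
+sketch below is only an illustration of that idea, not the editor's code;
+the Reading record and its field names are hypothetical, and only the
+three modes whose sub-ordering is described above are covered.
+
```python
from collections import namedtuple

# Hypothetical record; gap4's own reading structures are not shown here.
Reading = namedtuple("Reading", "name position strand template")


def group_key(reading, mode):
    """Build the compound sort key for one reading."""
    strand_rank = 0 if reading.strand == "+" else 1  # plus first, then minus
    if mode == "position":
        return (reading.position,)
    if mode == "strand":
        return (strand_rank, reading.position)  # position within each strand
    if mode == "template":
        return (reading.template, strand_rank)  # strand within each template
    raise ValueError("unsupported mode: %s" % mode)


def group_readings(readings, mode="position"):
    """Return the readings in the order they would be listed."""
    return sorted(readings, key=lambda r: group_key(r, mode))
```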
+
+ at node Editor-Disagree
+ at subsection Highlight Disagreements
+ at cindex Highlight Disagreements: contig editor
+ at cindex Contig Editor: Highlight Disagreements
+ at cindex Dots: contig editor highlight disagreements
+ at cindex Colour: contig editor highlight disagreements
+
+This toggles between the normal sequence display (showing the current base
+assignments) and one in which those assignments that differ from the consensus
+are highlighted. It makes scanning for problems by eye much easier.
+
+Several modes of highlighting are available: ``By dots'' will only display the
+bases that differ from the consensus, displaying all other bases as full
+stops if they match or colons if they mismatch but are poor
+quality. The definition of poor quality here can be adjusted using the
+``Set quality threshold'' option of the Settings menu. The base
+colours are as normal (ie reflecting tags and quality).
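+
+As a rough illustration of the ``By dots'' rule, the Python sketch below
+renders one reading against the consensus. It assumes that ``poor
+quality'' means a confidence below the threshold and that bases are
+compared case-sensitively; the function name is illustrative.
+
```python
def highlight_by_dots(read, consensus, quality, threshold):
    """Render a reading with '.' for matches and ':' for poor mismatches."""
    out = []
    for base, cons, qual in zip(read, consensus, quality):
        if base == cons:
            out.append(".")   # agrees with the consensus
        elif qual < threshold:
            out.append(":")   # disagrees, but is poor quality
        else:
            out.append(base)  # a disagreement worth looking at
    return "".join(out)
```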
+
+Highlight disagreements ``By foreground colour'' and ``By background
+colour'' displays all base characters, but colours those that differ
+from the consensus. Bases which differ but are below the
+difference quality threshold are not coloured. This allows easier
+visual scanning of the context that a difference occurs in, but it may
+be wise to disable the displaying of tags (hint: control-Q toggles
+tags on and off).
+
+Finally the ``Case sensitive'' toggle controls whether upper and lower
+case bases of the same base type should be considered as differences.
+
+ at node Editor-Compare
+ at subsection Compare Strands
+ at cindex Contig Editor: Compare Strands
+ at cindex Compare Strands: contig editor
+
+This toggles the consensus calculation routine between
+treating both strands together or independently. In the independent
+case any difference between the two strands is shown in the
+consensus as a '-'. Hence these clashes are found as problems by the
+``Search by problem'' option.
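+
+The independent case can be sketched in a few lines of Python. This
+assumes a consensus has already been computed for each strand and that
+both cover the same positions; how gap4 treats positions covered by one
+strand only is not modelled here.
+
```python
def compare_strands(plus_consensus, minus_consensus):
    """Merge two per-strand consensus strings, marking clashes with '-'."""
    if len(plus_consensus) != len(minus_consensus):
        raise ValueError("strand consensus strings must be the same length")
    return "".join(
        p if p == m else "-"
        for p, m in zip(plus_consensus, minus_consensus)
    )
```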
+
+ at node Editor-auto-save
+ at subsection Toggle auto-save
+ at cindex Contig Editor: toggle auto-save
+ at cindex Contig Editor: auto-save
+ at cindex Auto-save: contig editor
+ at cindex Toggle auto-save: contig editor
+
+Selecting auto-save toggles the auto save feature. Initially this is
+turned off each time the contig editor is invoked. Once toggled the
+adjacent checkbox will be set to indicate the feature is enabled and
+the contig will be saved. From that point onwards the contig editor
+will write its data to disk every 50 edits. Each time an auto save is
+performed it is announced in the output window. Saving more frequently
+can still be performed manually by using ``Save Contig''.
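+
+The behaviour described above amounts to a simple edit counter. The
+following Python sketch is an assumption-laden model of it, not the
+editor's implementation: the AutoSaver class and its save callback are
+hypothetical names.
+
```python
class AutoSaver:
    """Model of an auto-save toggle that saves every `interval` edits."""

    def __init__(self, save, interval=50):
        self.save = save          # callback that writes the contig to disk
        self.interval = interval
        self.enabled = False      # off each time the editor is invoked
        self.pending = 0

    def toggle(self):
        """Flip the feature; enabling it saves the contig straight away."""
        self.enabled = not self.enabled
        if self.enabled:
            self.save()
            self.pending = 0

    def record_edit(self):
        """Count one edit and save once the interval is reached."""
        if not self.enabled:
            return
        self.pending += 1
        if self.pending >= self.interval:
            self.save()
            self.pending = 0
```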
+
+Unlike with saves made using the manual ``Save Contig'' command,
+the ``Undo'' button will allow the user to undo edits regardless
+of when the last auto save occurred.
+
+ at node Editor-3Char
+ at subsection 3 Character Amino Acids
+ at cindex 3 Character Amino Acids: contig editor
+ at cindex Contig Editor: 3 Character Amino Acids
+
+By default, the codon translation within the status line displays
+single character amino acid codes. Selecting ``3 Character
+Amino Acids'' will toggle the status line to display three
+character amino acid codes.
+
+ at node Editor-Show Quality
+ at subsection Show Reading and Consensus Quality
+ at cindex Show reading quality: contig editor
+ at cindex Show consensus quality: contig editor
+ at cindex Contig Editor: show reading quality
+ at cindex Contig Editor: show consensus quality
+ at cindex Quality values: contig editor, displayed
+
+When the quality cutoff value is 0 or higher and either of the ``show
+reading quality'' or ``show consensus quality'' 
+toggles is set, the background for
+bases is shaded in a grey level dependent on their quality.  There are ten
+levels of shading with the darkest representing poor data and the lightest
+representing good data. So with the quality cutoff set to 50, all bases with a
+quality of less than fifty are shown with a red foreground and a dark grey
+background, bases with quality just above 50 will have the darkest grey
+background, and bases with a quality of 100 will have the lightest background.
+When tags are present the background colour is that of the tag rather than the
+quality.
+
+The colours used are adjustable by modifying your @file{.gaprc} file. The
+defaults are shown below.
+
+ at example
+set_def CONTIG_EDITOR.QUAL0_COLOUR     "#494949"
+set_def CONTIG_EDITOR.QUAL1_COLOUR     "#696969"
+set_def CONTIG_EDITOR.QUAL2_COLOUR     "#898989"
+set_def CONTIG_EDITOR.QUAL3_COLOUR     "#a9a9a9"
+set_def CONTIG_EDITOR.QUAL4_COLOUR     "#b9b9b9"
+set_def CONTIG_EDITOR.QUAL5_COLOUR     "#c9c9c9"
+set_def CONTIG_EDITOR.QUAL6_COLOUR     "#d9d9d9"
+set_def CONTIG_EDITOR.QUAL7_COLOUR     "#e0e0e0"
+set_def CONTIG_EDITOR.QUAL8_COLOUR     "#e8e8e8"
+set_def CONTIG_EDITOR.QUAL9_COLOUR     "#f0f0f0"
+set_def CONTIG_EDITOR.QUAL_IGNORE      "#ff5050"
+ at end example
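+
+How a quality value might be mapped onto these ten levels is sketched
+below in Python. The colour list is copied from the defaults above, but
+the linear mapping from the range cutoff to 100 onto levels 0 to 9 is an
+assumption made for illustration; gap4 may divide the range differently.
+
```python
QUAL_COLOURS = [
    "#494949", "#696969", "#898989", "#a9a9a9", "#b9b9b9",
    "#c9c9c9", "#d9d9d9", "#e0e0e0", "#e8e8e8", "#f0f0f0",
]


def shade_level(quality, cutoff):
    """Return a level 0 (darkest) to 9 (lightest), or None below the cutoff."""
    if quality < cutoff:
        return None  # such bases are drawn with the QUAL_IGNORE colour
    quality = min(quality, 100)
    if cutoff >= 100:
        return 9
    # Assumed linear spread of [cutoff, 100] over the ten levels.
    return (quality - cutoff) * 9 // (100 - cutoff)


def background_colour(quality, cutoff):
    """Look up the default background colour for a base."""
    level = shade_level(quality, cutoff)
    return None if level is None else QUAL_COLOURS[level]
```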
+
+
+ at node Editor-Show edits
+ at subsection Show edits
+ at cindex Show Edits: contig editor
+ at cindex Contig Editor: show edits
+
+When set, any change between the bases displayed and the original sequence
+held in the trace files is shown by changing the background colour of the
+changed base. The detection of these edits depends on the quality values
+and the ``original position'' data. Hence the traces do not need to be present
+in order to detect edits. The colour of the bases reflects the type of change
+found. The colours are adjustable by editing the @file{.gaprc} file. The
+following table lists the colour, gaprc variable name and the meaning.
+
+ at table @i
+ at item red
+ at code{CONTIG_EDITOR.EDIT_DEL_COLOUR } --- Deletion
+ at item pink
+ at code{CONTIG_EDITOR.EDIT_BASE_COLOUR} --- Base change or insertion
+ at item green
+ at code{CONTIG_EDITOR.EDIT_PAD_COLOUR } --- Padding character
+ at item purple
+ at code{CONTIG_EDITOR.EDIT_CONF_COLOUR} --- Confidence value
+ at end table
+
+ at node Editor-Show unpadded pos
+ at subsection Show Unpadded Positions
+ at cindex Show unpadded positions
+ at cindex Unpadded positions in editor
+
+The ruler at the top of the contig editor displays every tenth base number in
+the consensus sequence. Without ``show unpadded positions'' enabled any
+character in the consensus is counted, including padding characters. If ``show
+unpadded positions'' is enabled the ruler will only count non pad (``*'')
+characters. Please note that this may considerably slow down the editor on
+large databases as the full consensus needs to be calculated in order to plot
+the ruler. If you just need to obtain the occasional unpadded position it is
+better to press the Enter key or to use the ``unpadded position'' search.
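+
+Converting a padded position to an unpadded one is a matter of counting
+the non-pad characters up to that point, as in this small Python sketch.
+Positions are assumed to be 1-based, and a position that falls on a pad
+is assumed to take the number of the preceding base.
+
```python
def unpadded_position(consensus, padded_pos, pad="*"):
    """Count the non-pad characters up to and including padded_pos."""
    if not 1 <= padded_pos <= len(consensus):
        raise ValueError("position outside the consensus")
    return sum(1 for base in consensus[:padded_pos] if base != pad)
```
+
+Because every earlier character must be inspected, the cost grows with
+the length of the consensus, which is why the ruler is slower in this
+mode.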
+
+ at node Editor-Show template names
+ at subsection Show Template Names
+ at cindex Show template names
+ at cindex Contig Editor: template names
+
+The names panel on the left hand side of the editor normally shows the
+reading names. This option may be used to toggle this display to show
+the template names instead. When enabled the trace display also
+switches from showing reading names to template names.
+
+ at node Editor-Active Tags
+ at subsection Set Active Tags
+ at cindex Set Active Tags: contig editor
+ at cindex Contig Editor: set active tags
+
+``Set Active Tags'' allows configuration of which tag types should be displayed
+within the editor. Note that searches for tag annotations will only examine
+active tags, but searching for a specific tag type will find tags even when
+tags of this type are not visible. In this situation the tag will still be
+invisible, but as usual the tag location will be underlined. This option is
+particularly useful for exploring cases where a section of sequence has many
+overlapping tags. An alternative to using this dialogue is using the Control-Q
+key, which toggles the display of active tags.
+
+
+ at node Editor-Output List
+ at subsection Set Output List
+ at cindex Set Output List: contig editor
+ at cindex Contig Editor: set output list
+
+``Set output list'' pops up a dialogue asking for a list name to be used
+when outputting reading names (_fpref(Lists, Lists, lists)). 
+Once an output list has been specified,
+pressing the middle button, or Alt left mouse button,
+on a reading name will add the name to the
+end of the list.  Note that selecting the same name more than once will add
+the name to the list more than once. The list is never cleared by the editor.
+This allows multiple editors to append to the same list. If required, use the
+list menu to clear the list.
+
+ at node Editor-Default Confidence
+ at subsection Set Default Confidences
+ at cindex Set Default Confidences: contig editor
+ at cindex Contig Editor: set default confidences
+ at cindex Confidence in contig editor
+ at cindex Quality in contig editor
+
+Replacing bases or inserting new bases in the editor can assign new confidence
+values to those bases. The default setting is to set these confidence values to 100
+which has the effect of forcing the consensus to be that base. The ``Set
+Default Confidences'' dialogue allows these default values to be changed.
+The allowable range of confidence values for a base is from 0 to 100
+inclusive. The dialogue also allows selection of confidence -1. This tells the
+editor to not change the confidence value. When replacing a base this keeps
+the same confidence value of the base that is being replaced. When inserting a
+base this uses the average of the confidence value of the two surrounding
+bases.
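+
+The meaning of the -1 setting can be summarised by the Python sketch
+below. The function names are illustrative, and rounding the average
+down to a whole number is an assumption.
+
```python
def replacement_confidence(default, old_confidence):
    """Confidence given to a base that replaces an existing one."""
    if not -1 <= default <= 100:
        raise ValueError("default confidence must be between -1 and 100")
    return old_confidence if default == -1 else default


def insertion_confidence(default, left_confidence, right_confidence):
    """Confidence given to a base inserted between two neighbours."""
    if not -1 <= default <= 100:
        raise ValueError("default confidence must be between -1 and 100")
    if default == -1:
        return (left_confidence + right_confidence) // 2  # assumed rounding
    return default
```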
+
+
+ at node Editor-Store Undo
+ at subsection Set or unset saving of undo
+ at cindex Set or unset saving of undo: contig editor
+ at cindex Contig Editor: Set or unset saving of undo
+ at cindex Undo: contig editor
+ at cindex Undo toggle: contig editor
+
+Storing the undo information takes up a great deal of computer
+memory and slows down the alignment algorithm. Particularly when using
+the Join Editor for very large overlaps (e.g. after copying batches of
+readings from one database to another), it can be useful to turn off
+the saving of undo information.
+For this reason the settings menu
+contains an option to turn off (or on) the saving of undo information.
+
+
+_split()
+ at node Editor-Remove Readings
+ at section Removing Readings
+ at cindex Contig editor: disassemble readings
+ at cindex Contig editor: remove reading
+ at cindex Remove reading: contig editor
+ at cindex Disassembly: contig editor
+
+It is often desirable to completely remove a reading from a contig. When not
+using the editor this is typically performed using the Disassemble Readings
+function. _fxref(Disassemble, Disassemble Readings, disassembly)
+When using the editor, the ``Remove Reading'' option on the editor commands
+menu performs a similar task.
+
+The command marks the reading underneath the editing cursor to be removed once
+the editor is quitted. Until then, the reading number in the names section of
+the display is shown with a dark grey background. The reading will also not be
+used in the calculation of the consensus. Thus, if all readings at a
+particular section of consensus are marked for removal the consensus sequence
+will be shown as dashes. Selecting the ``Remove Reading'' command again with
+the editing cursor on a reading already marked for removal will cancel the
+removal request. The keyboard command of Control-H may also be used as a
+shortcut to the ``Remove Reading'' command.
+
+Once the editor has been quitted you will be asked whether you wish to
+disassemble the marked readings. Answering ``No'' will simply quit the editor as
+normal without removing any readings. Answering ``Yes'' will bring up the
+usual ``Disassemble Readings'' dialogue. The options here allow removal of all
+readings from this contig, or non-crucial only. A crucial reading is one that
+will cause this contig to be broken into two or more segments. A choice is
+also given as to whether the readings should be completely removed from this
+database, or for each reading to be placed in its own contig. Pressing ``OK''
+now will remove the readings from the contig, breaking the contig if
+necessary, and will quit the editor. Pressing ``Cancel'' will close the
+``Disassemble Readings'' dialogue without making any changes and will not quit
+the editor.
+
+At any time, quitting the editor and not disassembling the readings will leave
+a List (_fpref(Lists, Lists, lists)) named ``disassemble'' containing the
+readings marked for removal. These may then be disassembled at a later stage
+if necessary. However the list will only be available until the next editor is
+quit (at which stage that editor will create its own, possibly blank,
+disassemble list), so make a copy if necessary.
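+
+Whether a reading is crucial can be tested by checking that the
+remaining readings still form one unbroken run. The Python sketch below
+illustrates this under two assumptions: readings are given as inclusive
+start and end positions, and sharing at least one base is enough to hold
+the contig together.
+
```python
def is_crucial(readings, name):
    """Return True if removing `name` would split the contig.

    `readings` maps reading names to (start, end) positions in the contig.
    """
    remaining = sorted(span for n, span in readings.items() if n != name)
    if not remaining:
        return False
    reach = remaining[0][1]
    for start, end in remaining[1:]:
        if start > reach:      # gap: nothing covers the bases in between
            return True
        reach = max(reach, end)
    return False
```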
+
+_split()
+ at node Editor-Primer Selection
+ at section Primer Selection
+ at cindex Primer Selection: contig editor
+ at cindex Contig Editor: Primer selection
+ at cindex Oligo selection: contig editor
+
+The oligo selection engine is the one used in the program OSP.
+It is described in 
+ at cite{Hillier, L., and Green, P. (1991). ``OSP: an oligonucleotide
+selection program,'' PCR Methods and Applications, 1:124-128}.
+Oligo selection is a complex operation. The normal mode of use is
+outlined below:
+
+ at enumerate
+ at item
+Open the oligo selection window, by selecting ``Select Primer'' from the
+contig editor commands menu.
+
+ at item
+Position the cursor to where you want the oligo to be chosen. While the
+oligo selection window is visible, you will still have complete control over
+positioning and editing within the contig editor.
+
+ at item
+Indicate the strand for which you require an oligo. This is
+done by toggling the direction arrows (``----->'' or ``<------'').
+
+
+ at item
+Press the ``Find Oligos'' button to find all suitable oligos
+(see the ``Parameters'' subsection below for further information on
+controlling this procedure). Information for the closest suitable oligo to
+the cursor position is given in the output text window and at the bottom
+of the editor in the information line. In the
+contig editor the position of the oligo is marked by a temporary
+tag on the consensus. The window is recentered if the oligo is
+off the screen.
+
+ at item
+If this oligo is not suitable (it may have been used before, and failed)
+the next closest oligo can be viewed by pressing ``Next''.
+
+ at item
+Suitable templates are automatically identified for the
+currently displayed oligo (see the ``Template selection''
+subsection below). By default, the template is that closest to
+the oligo site. If the choice is not suitable (it may be known to
+be a poor quality template, say) another can be chosen from the
+``Choose from'' pull-down menu. Templates that do not
+appear on the menu can be specified by simply typing their name
+in the ``Template name'' entry box. However, the template must be on the correct
+strand and be upstream of the oligo.
+
+ at item
+A tag can be created for the current oligo by pressing the
+button ``Accept''. The annotation for this tag
+holds the name of the template and the oligo primer sequence.
+There are fields to allow the user to specify their own primer
+name (``serial#'') and comments (``flags'') for this tag. An example
+of oligo tag annotation:
+
+ at example
+ at group
+ serial#=
+ template=a16a9.s1
+ sequence=CGTTATGACCTATATTTTGTATG
+ flags=
+ at end group
+ at end example
+
+ at item
+The oligo selection window is closed when ``Accept'' or ``Quit'' is selected.
+ at end enumerate
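+
+The oligo tag annotation shown in the steps above is a list of key=value
+lines, so it is straightforward to read back in a script. The following
+Python sketch assumes exactly that layout; the function name is
+illustrative.
+
```python
def parse_oligo_annotation(text):
    """Split an oligo tag annotation into a dictionary of its fields."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("not a key=value line: %r" % line)
        fields[key] = value
    return fields
```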
+
+ at subsection Parameters
+
+The parameters controlling the selection of oligos can be
+changed by pressing the ``Edit parameters'' button. This invokes a
+dialogue box which allows the specification of further parameters.
+
+By default, the oligos are selected from a window that extends
+40 bases either side of the cursor. The size and location of this
+window relative to the cursor position can be changed in the
+``Edit parameters'' window.
+
+Primer constraints can be specified by melting temperature, length
+and G+C content.
+
+In gap4 oligos are ranked according to their overall score, where the
+best oligos have lower scores.
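+
+The search window and the ranking rule can be illustrated with the
+Python sketch below. It does not reproduce OSP: only a G+C constraint is
+applied, the limits min_gc and max_gc are made-up example values, oligos
+are assumed to lie wholly inside the window, and scores are taken as
+given rather than computed.
+
```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_oligos(consensus, cursor, length, window=40,
                     min_gc=0.3, max_gc=0.7):
    """List (start, sequence) for oligos lying wholly inside the window."""
    lo = max(0, cursor - window)
    hi = min(len(consensus), cursor + window)
    found = []
    for start in range(lo, hi - length + 1):
        seq = consensus[start:start + length]
        if min_gc <= gc_fraction(seq) <= max_gc:
            found.append((start, seq))
    return found


def best_first(scored):
    """Order (oligo, score) pairs so that the lowest score comes first."""
    return sorted(scored, key=lambda pair: pair[1])
```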
+
+ at subsection Template selection
+
+For simplicity, each reading is considered to represent a
+template. In practice, many readings can be made off the same
+template. Suitable templates that are identified are those that
+satisfy all of the following conditions:
+
+ at enumerate
+ at item
+are in the appropriate sense,
+
+ at item
+have 5' ends that start upstream of the oligo,
+
+ at item
+are sufficiently close to the oligo to be useful.
+ at end enumerate
+
+This last criterion relates to the insert size for the
+templates used for sequencing and the average reading length. A
+template is considered useful if a full reading can be made from it,
+taking into account both of these factors. The default insert size
+is 1000 bases (although the size range should be included in the
+experiment file for each reading, and hence the default would not be
+required), and the default average reading length is 400 bases.
+These values can be changed in the ``Edit parameters'' window.
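+
+The three conditions can be expressed as a simple filter, sketched here
+in Python. This is an interpretation rather than gap4's code: positions
+are assumed to be measured along the strand of the oligo, and ``close
+enough'' is taken to mean that a reading of average length started at
+the oligo still ends inside the insert.
+
```python
def suitable_templates(readings, oligo_pos, sense,
                       insert_size=1000, read_length=400):
    """Return names of usable templates, closest to the oligo first.

    `readings` is an iterable of (name, sense, five_prime_pos) tuples.
    """
    usable = []
    for name, read_sense, five_prime in readings:
        if read_sense != sense:
            continue  # wrong sense
        if five_prime >= oligo_pos:
            continue  # 5' end is not upstream of the oligo
        if oligo_pos + read_length > five_prime + insert_size:
            continue  # a full reading would run off the insert
        usable.append((oligo_pos - five_prime, name))
    return [name for _, name in sorted(usable)]
```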
+
+_split()
+ at node Editor-Traces
+ at section Traces
+ at cindex Trace displays: contig editor
+ at cindex Contig Editor: trace display
+
+The original trace data from which the readings were derived can be displayed
+by double clicking (two quick clicks) with the left or middle mouse button on
+the area of interest. Control t has the same effect.  The trace will be
+displayed centred around the base clicked upon and the name of the reading in
+the contig editor will be highlighted.  Double clicking on the consensus
+displays all the readings covering that position.  Double clicking on a
+reading which already has its trace displayed will cause the corresponding
+trace to be surrounded by a red border.
+
+Moving the mouse pointer over a base causes the display of an information
+line at the bottom of the window. This gives the base type, its position
+in the sequence, and its confidence value.
+
+There are two forms of trace display which are selected using the ``Compact''
+button at the top of the Trace display. The compact form differs by not
+showing the Info, Diff, Comp. and Cancel buttons at the left of each trace.
+
+Note that gap4 does not store the trace files in the project database:
+it stores only their names and reads them when required. However it does
+not know which directory they are stored in, unless this is specified using the ``Trace File Location'' option
+(_fpref(Conf-Trace File Location, Trace File Location, configure)). 
+
+_lpicture(contig_editor.traces,6in)
+
+The picture shows an example of three displayed traces. The reading number,
+together with the direction of the reading (+ or -) and the chemistry by
+which it was determined, is given at the top left of each sub window.
+The chemistry information is found from comments in the experiment file.
+'uf' and 'ur' indicate universal forward and universal reverse, 'cf' and 'cr'
+indicate custom forward and custom reverse, and 'p' and 't' indicate primer
+and terminator. There are four buttons ('Info', 'Diff', 'Comp.' and 'Cancel')
+below this information, and X and Y scale bars to the right.
+
+The ``Info'' button will display a window like the one shown at the bottom right
+of the picture. This contains the comments from the relevant SCF file.
+
+The ``Diff'' buttons are used to produce a new trace showing the differences
+between two existing traces. To use this, press ``Diff'' in any window. The
+mouse cursor then changes to a cross symbol. Pressing the left mouse button
+anywhere on another trace that has a ``Diff'' button will create the difference
+trace. Any other button cancels the operation. The algorithm used for
+computing the difference trace is adjustable by parameters in the settings
+menu (_fpref(Editor-Trace Display, Trace Display Settings, contig_editor)).
+The trace differencing was originally designed for visual inspection of
+suspected mutations
+ at cite{Bonfield, J.K., Rada, C. and Staden, R. Automated detection of point
+mutations using fluorescent sequence trace subtraction. Nucleic Acids Res. 26,
+3404-3409 (1998)}.
+
+The ``Comp.'' button complements the displayed trace. If the sequence in the
+editor has been complemented then the trace will automatically be shown in the 
+complementary sense. This button may be used to toggle the complementarity.
+
+The ``Cancel'' button will remove the trace.
+
+The X and Y scale bars zoom the trace in the appropriate direction. The default
+Y scale is to fit the highest peak on the screen without clipping. When the
+``Show confidence'' check-button is selected, the confidence value for each base
+call will be displayed as a histogram, overlaid on the trace displays. The
+base confidence values are not computed by gap4, but rather are read from the
+SCF file which is assumed to have been generated by one of the programs that
+compute confidence values (such as phred, ATQA or eba). When ABI files are in
+use, confidence values may not be shown.
+
+The trace is displayed on the right with a scrollbar directly below it and
+with the reading name in the top left corner. The vertical line seen in these
+three traces shows the location of the editing cursor in the contig editor window.
+The lock button on the trace displays ties the editing cursor movement to the 
+scrolling of the trace windows and vice versa.
+
+The trace display supports the display of up to four columns of traces,
+and can display any number of rows. The number of columns and rows can
+be configured and saved using the buttons at the top of the window. A
+scrollbar is provided if there are more traces to display than can be
+viewed with the current settings. 
+
+To modify the number of traces that are shown at any one time, and the heights
+of these, add (and edit) the following lines to your @file{$HOME/.gaprc} file.
+
+ at example
+set_def TRACE_DISPLAY.ROWS      	5
+set_def TRACE_DISPLAY.COLUMNS      	2
+set_def TRACE_DISPLAY.TRACE_HEIGHT	150
+ at end example
+
+New traces are always added to the bottom right of the window.
+
+Resizing the width of the trace window, moving the trace window and
+adjusting the X magnification are all remembered and used when bringing
+up new trace displays.
+
+The ``Close'' button at the top right of the Trace Display removes the
+Trace Display.
+
+An example of the ``Compact'' form of the trace display is shown below.
+
+_lpicture(contig_editor.traces.compact,6in)
+
+_split()
+ at node Editor-Reference-Data
+ at section Reference Sequence and Traces
+ at cindex Reference sequence: contig editor
+ at cindex Reference traces: contig editor
+ at cindex Contig Editor: Reference sequence
+ at cindex Contig Editor: Reference traces
+
+ at menu
+* Editor-Reference-Sequences::   Reference sequences
+* Editor-Reference-Traces::     Reference traces
+ at end menu
+
+Reference sequences can be used to provide standard base numbering for contigs.
+If they have feature table tags which contain CDS records the Contig Editor
+can use them to translate only the known coding segments, and in the correct
+reading frame. The primary use for reference sequences is in mutation detection.
+
+Reference Traces provide standards, both positive and negative for mutation
+detection by trace comparison.
+
+_split()
+ at node Editor-Reference-Sequences
+ at subsection Reference sequences
+
+In order to put readings and their mutations in context we use a
+reference sequence and feature table. This enables mutations to be
+reported using positions defined by the reference sequence, and also
+allows the effect of the mutations to be noted. To facilitate this gap4
+is able to store entries from the EMBL sequence library complete with
+their feature tables. These feature tables are converted to gap4
+database annotations (tags), which means that they can be selectively
+displayed in the template display and editor, and used to translate only
+the exons (in the correct reading frame).
+The reference sequence can be designated (or reassigned) 
+by right clicking on its name. Once set it should
+appear labelled ``S'' at the left edge of the editor.
+
+_split()
+ at node Editor-Reference-Traces
+ at subsection Reference traces
+From the ``settings'' menu of the editor
+the trace display can be set to ``Auto-Diff traces''. Once this is
+activated, whenever the user double clicks on a base in the editor
+sequence display, not only is the reading's trace displayed, but also
+its designated reference trace plus the difference between them. If its
+complementary reading is available, its trace and reference trace and
+their differences are also displayed.
+
+The preferred way of assigning reference traces to readings is by use of
+``naming conventions''; that is to have a simple set of rules which
+control the names given to the trace files. It can be seen in the
+figures showing the editor that forward and reverse readings from the
+same patient have names with a common root but which end either F or
+R. This both ties the two together (so the software knows which is the
+corresponding 
+complementary trace when the user double clicks on a reading) and also
+enables the association of readings and their reference traces. Once a
+convention has been adopted the rules can be defined for pregap4 by
+loading them via the ``Load Naming Scheme'' option in its File menu
+(_fpref(Pregap4-Naming, Pregap4 Naming Schemes, pregap4)). For
+any batch of readings the reference traces are defined within pregap4's
+``Reference Traces'' module.
+
+Within the Contig Editor reference traces can be set by right clicking
+on their names in the editor. When this is done a menu will popup. This
+allows the user to select whether the trace is to be used as a 
+positive or negative control.
+
+
+_split()
+ at node Editor-Template Status
+ at section Template Status Codes
+ at cindex Template Status Codes
+ at cindex Contig Editor: template status
+
+Adjacent to the reading name is a coloured block indicating the
+reliability of the template.
+
+ at table @strong
+ at item Red
+Strand conflict (e.g. two forward readings are assembled on opposing strands)
+ at item Blue
+Position conflict (e.g. the start of this template can be derived at
+multiple positions due to more than one universal primer sequence, but
+at positions > 100 base pairs apart).
+ at item Pink
+One end is not present in this contig, but is in another contig.
+ at item Light grey
+One template end sequence is not present in this database (ie not a read-pair)
+ at item Medium grey
+The measured template size is too large or too small
+ at item Dark grey
+Multiple problems
+ at end table
+
+
+The ``go to'' and ``select all readings from this template'' commands
+(obtained by right clicking on the reading name) are particularly
+useful when dealing with inconsistent templates.
+
+The colour codes map to the (larger) set of single-letter identifiers
+used in the information line
+(_fpref(Editor-Info, The Editor Information Line, contig_editor)).
+The letter codes are:
+
+ at table @strong
+ at item D
+Distance (negative in size)
+ at item d
+Distance (too large/small)
+ at item P
+Primer position
+ at item S
+Strand
+ at item E
+Guessed start or end position of template
+ at item I
+Spans contigs and contig-end distance is large
+ at item O
+Spans contigs, but contig-end distance is small
+ at item ok
+No problems
+ at item ?
+Unknown problem
+ at end table
+
+For templates with read-pairs spanning two contigs the distance from
+the end of each contig (in the direction that the template 'reads' in)
+is summed together to compute whether a contig join is viable. This in
+turn yields the ``O'' and ``I'' codes.
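+
+A minimal Python sketch of that decision follows. The comparison against
+the template's maximum insert size is an assumption; the text above only
+states that the two distances are summed and judged large or small.
+
```python
def spanning_code(dist_to_end_a, dist_to_end_b, max_insert_size):
    """Classify a read-pair that spans two contigs as 'O' or 'I'."""
    total = dist_to_end_a + dist_to_end_b
    # Small summed distance: the contigs could plausibly be joined.
    return "O" if total <= max_insert_size else "I"
```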
+
+
+_split()
+ at node Editor-Info
+ at section The Editor Information Line
+ at cindex Information line: contig editor
+ at cindex Status line: contig editor
+ at cindex Contig Editor: information line
+ at cindex Unpadded base positions
+
+The very bottom line of the editor display is a text line used by the editor to
+display pieces of useful information. Currently this gives information on
+individual bases, readings, the contig, and tags, as the mouse is moved over
+the appropriate object. For bases (in both readings and the consensus) this
+information is only displayed when a mouse button is pressed. The left mouse
+button displays with format @code{BASE_BRIEF_FORMAT1} and format
+ at code{BASE_BRIEF_FORMAT2} is displayed when pressing 'Enter'. By default the
+only difference between the two is that 'Enter' will display the
+``unpadded position'' of a base in the consensus - ie its position in the
+consensus after pads have been removed.  The contents and format of the
+information displayed is completely configurable by adding the relevant
+definitions to your @file{.gaprc} file.  The defaults are as follows.
+
+ at example
+set_def READ_BRIEF_FORMAT  \
+	@{%n(#%Rn) Clone:%Cn Vector:%Tv Type:%P;%a Tmpl:%Tc %c@}
+
+set_def CONTIG_BRIEF_FORMAT  \
+        @{Contig:%n(#%Rn)   Length:%l   %c@}
+
+set_def TAG_BRIEF_FORMAT  \
+        @{Tag type:%t   Direction:%d   Comment:"%.100c"@}
+
+set_def BASE_BRIEF_FORMAT1  \
+	@{Base confidence:%c  (Probability %p)   Position %P@}
+
+set_def BASE_BRIEF_FORMAT2  \
+	@{Base confidence:%c  (Probability %p)   Position %P   \
+          Unpadded position %U@}
+
+ at end example
+
+Tag information is shown when the mouse is moved over an annotation. Read
+information is shown when the mouse is moved over the reading name in the
+names section of the display. Contig information is displayed when the mouse
+is moved over the ``Consensus'' line in the names display. If you wish to leave
+the contig editor window without changing the information line contents as the
+mouse moves over other information press and hold the Shift key whilst moving
+the mouse. This disables the automatic highlighting. The same mechanism also
+works for other windows (such as the template display).
+
+The general style of the formats is the string to display with particular
+strings substituting % characters. For instance in the reading format %n is
+substituted by the reading name. The general format of a % expansion is:
+
+ at itemize @bullet
+ at item
+        A percent sign.
+ at item
+        An optional minus sign to request left alignment of the information.
+        When displaying information in a specific field where that data
+        does not fill the entire space allowed the information will, by
+        default, be right justified. Adding a minus character here requests
+        left justification.
+ at item
+        An optional minimum field width. This is a decimal number indicating
+        how much space to leave for this information.
+ at item
+        An optional precision for numbers or maximum field width for strings.
+        This is given as a full stop followed by a decimal number.
+ at item
+        An optional 'R' to specify Raw mode. This changes the meaning of many
+        (but not all) of the expansion requests to give a numerical
+        representation of the data. For example %n is a reading name
+        and %Rn is a reading number.
+ at item
+        The expansion type itself. This is either one or two letters. See below
+        for full details of their meanings.
+ at end itemize
+
+To programmers this syntax may seem very similar to @code{printf}. This is
+intentional, but do not assume it is the same. Specifically the print syntax
+of @code{%#}, @code{%+} and @code{%0} will not work.
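+
+The expansion syntax listed above can be sketched in a few lines of Python.
+This is an illustration only, not the editor's own implementation: the
+function name, the dictionaries of values and the rule that two-letter
+expansions begin with T or C (as they do in the reading format) are all
+assumptions made for this example.
+
```python
import re

# Percent sign, optional '-', optional minimum width, optional '.precision',
# optional 'R' for raw mode, then a one- or two-letter expansion type.
# Assumption: two-letter types start with 'T' or 'C'.
_EXPANSION = re.compile(r"%(-)?(\d+)?(?:\.(\d+))?(R)?(%|[TC].|.)", re.S)


def expand(fmt, values, raw_values=None):
    """Expand a brief-format string using hypothetical lookup tables.

    values     maps expansion types (e.g. "n", "Tn") to normal-mode data.
    raw_values maps expansion types to raw-mode data, used after an 'R'.
    """
    raw_values = raw_values or {}

    def replace(match):
        left, width, precision, raw, kind = match.groups()
        if kind == "%":
            return "%"
        # Raw mode only changes types that have a raw form.
        table = raw_values if raw and kind in raw_values else values
        value = table.get(kind, "")
        if isinstance(value, float):
            # Precision for numbers.
            text = f"{value:.{int(precision)}f}" if precision else f"{value:f}"
        else:
            # Maximum field width for strings.
            text = str(value)
            if precision:
                text = text[: int(precision)]
        if width:
            # Right justified by default; '-' requests left justification.
            text = text.ljust(int(width)) if left else text.rjust(int(width))
        return text

    return _EXPANSION.sub(replace, fmt)


if __name__ == "__main__":
    reading = {"n": "xc04a1.s1", "l": 295, "L": 474}
    print(expand("Reading:%n(#%Rn)   Length:%l(%L)", reading, {"n": 74}))
```
+
+Here %Rn picks up the raw (numeric) form of the reading name, while %n, %l
+and %L use the normal values.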
+
+ at subsection Reading Information
+ at cindex Information line: readings in contig editor
+ at cindex READ_BRIEF_FORMAT
+
+Example output is @b{Reading:xc04a1.s1(#74)   Length:295(474)   Vector:m13mp18
+Clone:test   Chemistry:primer   Primer:forward universal}.
+
+ at table @strong
+ at item %%
+        A single % sign
+ at item %n
+        Reading name. Raw mode: number
+ at item %#
+        Reading number
+ at item %t
+        Trace name
+ at item %p
+        Position
+ at item %l
+        Clipped length
+ at item %L
+        Total length
+ at item %s
+        Start of clip
+ at item %e
+        End of clip
+ at item %S
+        Sense (whether complemented) - ``+'' or ``-''. Raw mode: 0/1
+ at item %a
+        Chemistry (eg ``BigDyeV3''). Raw mode: integer version
+ at item %d
+        Strand - ``+'' or ``-''. Raw mode: 0/1
+ at item %P
+        Primer - ``unknown'', ``forward universal'', ``reverse universal'',
+        ``forward custom'' or ``reverse custom''.  Raw mode: 0/1/2/3/4
+ at item %Tn
+        Template name. Raw mode: template number
+ at item %T#
+        Template number
+ at item %Tv
+        Template vector. Raw mode: template vector number
+ at item %Ti
+        Template insert size
+ at item %Tc
+        Template consistency (a mix of ``DdPSEO?'' or ``ok''). Raw
+        mode: as a number
+ at item %Cn
+        Clone name. Raw mode: clone number
+ at item %C#
+        Clone number
+ at item %Cv
+        Clone vector. Raw mode: clone vector number
+ at item %t
+        Trace filename.
+ at item %c
+	User defined text, taken from the first note of type INFO.
+ at end table
+
+ at subsection Contig Information
+ at cindex Information line: contig in contig editor
+ at cindex CONTIG_BRIEF_FORMAT
+
+Example output is @b{Contig:xc04a1.s1(#74)   Length:1316}.
+
+ at table @strong
+ at item %%
+        Single % sign
+ at item %n
+        Left most reading name. Raw mode: reading number
+ at item %s
+        (As %n)
+ at item %e
+        Right most reading name. Raw mode: reading number
+ at item %#
+        Contig number
+ at item %l
+        Contig length
+ at item %E
+        Expected number of errors (can be slow on large contigs)
+ at item %c
+	User defined text, taken from the first note of type INFO.
+ at end table
+
+ at subsection Tag Information
+ at cindex Information line: tags in contig editor
+ at cindex TAG_BRIEF_FORMAT
+
+Example output is @b{Tag type:OLIG   Direction:-   Comment:''template=xc04a1
+sequence=
+ at br CGATTGCAGAATAAGACG''}.
+
+ at table @strong
+ at item %%
+        Single % sign
+ at item %p
+        Tag position
+ at item %d
+        Tag direction - ``+'', ``-'' or ``=''. Raw mode: 0/1/2
+ at item %D
+        Tag direction - ``----->'', ``<-----'' or ``<---->''. Raw mode: 0/1/2
+ at item %t
+        Tag type (always 4 characters)
+ at item %l
+        Tag length
+ at item %#
+        Tag number (0 if unknown)
+ at item %c
+        Tag comment
+ at end table
+
+ at subsection Base Information
+ at cindex Information line: bases in contig editor
+ at cindex BASE_BRIEF_FORMAT1
+ at cindex BASE_BRIEF_FORMAT2
+
+Example output is @b{Base confidence:13  (Probability 0.954020)   Position 3805
+  Unpadded position 3678}.
+
+ at table @strong
+ at item %%
+        Single % sign
+ at item %c
+        Confidence value (phred style)
+ at item %p
+        Confidence value (as probability)
+ at item %P
+        Padded consensus base position
+ at item %U
+        Unpadded consensus base position
+ at end table
+
+
+_split()
+ at node Editor-Joining
+ at section The Join Editor
+ at cindex Join Editor
+ at cindex Contig Editor: joining
+
+Contigs are joined interactively using the Join Editor.
+This is simply a pair
+of contig editor displays stacked above one another with a ``differences''
+line in between. 
+Note that it is essential
+to align the contigs over the full length of their overlap. It is much more
+difficult to achieve this after a join has been made, and until the
+alignment is correct, the consensus sequence will be nonsense.
+
+The few differences between the Join Editor and the Contig Editor can be seen
+in the figure below. Otherwise all the commands and operations are the
+same as those for the Contig Editor.
+
+_lpicture(contig_editor.join,6in)
+
+One difference is the Lock button. When set (as it is in the
+illustration) scrolling either contig, by using the scrollbar or the
+four movement buttons, will also scroll the other contig.
+
+The Align button aligns the overlapping consensus sequences and adds
+pads. The alignment routine assumes that the two contigs are already in
+approximately the right relative position (as they are immediately
+after the Join Editor has been invoked from Find Internal Joins, or
+Find Repeats). If they are not they must be positioned manually before
+using the Align button.
+
+The ``<'' and ``>'' buttons either side of the ``Align'' button
+perform the alignment only from the editing cursor to the start of the
+contig, or only from the cursor to the end of the contig,
+respectively. Alignment end-gaps are penalised at the cursor position but not
+at the alignment end at the contig start/end position. These buttons
+are useful when multiple alignment positions may be valid, such as
+with an overlap consisting entirely of a short tandem repeat (STR).
+
+It should be noted
+that each of the pair of editors comprising the Join Editor
+maintains its own undo history, and using Align
+is likely to add to both undo histories. 
+Hence, to undo the results of the Align command
+the Undo button in both editors must be used.
+
+Note also that storing the undo information takes up a great deal of computer
+memory and slows down the alignment process. For this reason the settings menu
+contains an option to turn off (or on) the saving of undo information. When
+aligning very long overlaps it is advisable to turn off the undo saving.
+
+When ``Join/Quit'' is pressed a dialogue box is displayed containing
+the percentage mismatch of the overlap, and asking if the join should be made.
+For joins above a certain
+level of mismatch (20 percent by default) a second confirmation is required.
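+
+As a rough illustration of the check made by ``Join/Quit'', the following
+Python sketch computes a percentage mismatch for two aligned overlap
+sequences and decides how many confirmations to ask for. The function names
+are hypothetical, and counting every differing column (pads included) as a
+mismatch is an assumption; the editor's own calculation may differ.
+
```python
def percent_mismatch(upper, lower):
    """Percentage of columns that differ between two aligned overlaps.

    Assumption: both strings cover the same overlap and have equal length.
    """
    if len(upper) != len(lower):
        raise ValueError("overlap sequences must be aligned to equal length")
    if not upper:
        return 0.0
    mismatches = sum(1 for a, b in zip(upper, lower) if a != b)
    return 100.0 * mismatches / len(upper)


def confirmations_needed(mismatch, threshold=20.0):
    """One confirmation normally; a second above the mismatch threshold."""
    return 2 if mismatch > threshold else 1


if __name__ == "__main__":
    mismatch = percent_mismatch("ACGTACGTAC", "ACGTACGAAC")
    print(mismatch, confirmations_needed(mismatch))
```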
+
+_split()
+ at node Editor-Multiple Editors
+ at section Using Several Editors at Once
+ at cindex Contig Editor: multiple editors
+
+Several editors can be used simultaneously, even on the same contig.
+In the latter
+case, it is useful to understand the difference between the data and
+the view of the data.
+
+Each operating Contig Editor is a view of the data for
+a particular contig. With two editors
+viewing the same contig, making changes in either will affect the data
+that both are viewing, hence the change will be visible in both
+editors. Similarly, using Undo in either will undo the changes to both.
+
+When quitting and saving
+changes, other editors for the same contig will act as if a ``Save
+Contig'' request has been made by using the ``Commands'' menu (ie changes
+are written to disk and the undo information will be reset). Answering
+``no'' to the ``Save changes'' query,
+simply shuts down the editor without saving. If there is no other
+editor for this contig then the changes will be lost, otherwise the
+changes will be retained until the last editor for the contig is
+exited.
+
+Interaction between Contig Editors and Join Editors is more
+complicated and generally isn't advised. However such interactions
+work consistently with the notion of views of contigs. For example,
+suppose there are two Contig Editors open on two separate contigs, and in
+addition to these a Join Editor displaying both contigs. Making the
+join in the Join Editor will update the two stand-alone Contig Editors
+so that they are each 
+viewing the correct positions in the new contig, even though
+they're both now viewing the same contig.
+
+_split()
+ at node Editor-Quitting
+ at section Quitting the Editor
+ at cindex Quitting: contig editor
+ at cindex Contig Editor: quitting
+
+The ``Quit'' button quits the editor. If changes have been made since
+the last save (either a ``Save Contig'' or an auto-save) you will be
+asked whether you wish to save these changes.  Answering ``Cancel''
+abandons the quit process and provides control of the editor again,
+otherwise the appropriate action will be taken and the editor quitted.
+
+Within a join editor, the ``Quit'' button is changed to ``Join/Quit''. Pressing it
+will prompt for making the join. You will be told the percentage mismatch of
+the overlapping consensus sequences. The join can either be accepted,
+rejected, or cancelled (in which case the editor is not quitted and the join
+is not made).
+
+_split()
+ at node Editor-Techniques
+ at section Editing Techniques
+ at cindex Techniques of editing
+ at cindex Editing techniques
+ at cindex Contig Editor: editing techniques
+ at cindex Contig Editor: techniques
+
+ at menu
+* Editor-Techniques-Cutoffs::   Consensus and Quality Cutoffs
+* Editor-Techniques-EType::     Editing by Base Change or Confidence
+* Editor-Techniques-Overcall::  Base overcalls
+* Editor-Techniques-Undercall:: Base undercalls
+* Editor-Techniques-Disagree::  Multiple Base Disagreements
+* Editor-Techniques-Quality::   Poor Quality
+* Editor-Techniques-Check::     Checking for Errors
+ at end menu
+
+The editor documentation describes the available controls, but not how these
+should be used most efficiently. Some editing is performed in a local style or
+is personal preference, but a great deal of the common editing tasks are best
+dealt with in specific ways. This section aims to give example methods of
+resolving the common problems.  Typically, problems will be found using one of
+the editor searches (such as ``consensus quality'' or ``problem''). Used in
+conjunction with ``Auto-display traces'' (_fpref(Editor-Trace Display, Trace
+Display Settings, contig_editor)) this will automatically bring up a set of
+traces that are likely to be of assistance in resolving the problem. Prior to
+working on a contig it can be helpful to use ``Shuffle Pads'' to try to align
+padding characters. _fxref(Shuffle Pads, Shuffle Pads, shuffle_pads).
+
+
+_split()
+ at node Editor-Techniques-Cutoffs
+ at subsection Consensus and Quality Cutoffs
+
+The most rapid editing technique 
+(_fpref(Intro-Base-Acc, The use of numerical estimates of base calling
+accuracy, gap4)) is only available if base call
+confidence values have been assigned to the reading data using
+a scale proportional to -log(error_rate).
+Using the ``confidence'' consensus method will make use of confidence values to
+give the most probable consensus sequence and a probability of each base being
+correct. Using the editor ``consensus quality'' search then provides an
+extremely quick way of identifying the lowest quality consensus bases. The
+List Confidence command will give information on the expected number of
+errors that can be fixed by examining all consensus bases with a quality less
+than a particular amount. This gives a good indication of the
+threshold to use in the consensus quality search. Additionally you will also be
+told the expected error rates. With this system it is possible to stop editing
+once a particular average quality has been achieved.
+
+Care should be taken in considering your desired error rate. An average error
+rate of 1 in 10,000 may be easily achievable. However there could still be
+consensus bases with very low confidence. Hence it is perhaps best to choose
+both an average error rate and a minimum consensus confidence for your
+finishing criteria. The consensus confidence values are scaled such that a
+confidence of 20 is a 1 in 100 error rate, 30 is 1 in 1000, 40 is 1 in 10000
+and so on.
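+
+The scaling described above corresponds to a confidence of -10 times the
+base-10 logarithm of the error probability. The short Python sketch below
+converts between the two and sums the expected number of errors over a set
+of consensus confidences. The function names are chosen for this example
+and are not part of the package.
+
```python
import math


def error_probability(confidence):
    """Error probability for a confidence value (20 -> 1 in 100)."""
    return 10.0 ** (-confidence / 10.0)


def confidence_from_error(probability):
    """Confidence value for an error probability (0.001 -> 30)."""
    return -10.0 * math.log10(probability)


def expected_errors(confidences, below=None):
    """Expected number of errors among the given consensus confidences.

    If 'below' is given, only bases with a confidence lower than that
    value are counted, i.e. the errors that could be fixed by checking them.
    """
    return sum(
        error_probability(c)
        for c in confidences
        if below is None or c < below
    )


if __name__ == "__main__":
    consensus = [20, 30, 40]
    print(expected_errors(consensus))            # all bases
    print(expected_errors(consensus, below=30))  # only bases below 30
```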
+
+The rest of this section describes methods to use when the
+aforementioned confidence values are not available.
+
+The Consensus and Quality cutoff values used whilst editing are personal
+preference. Rather than state suggested values, we discuss the merits of
+using example values.
+
+The meaning of the consensus and quality cutoff values changes slightly
+depending on the consensus algorithm in use. For more information on the
+algorithms and these values see _fref(Con-Calculation, The Consensus
+Calculation, calc_consensus).
+
+With the ``Base type frequencies'' and ``Quality weighted base type'' methods, a
+consensus cutoff value of 100 means that @strong{every} sequence disagreement
+will yield a dash in the consensus. Hence the ``Next Search'' button when in
+``problem'' search mode can be used to verify every potential problem. This is a
+lot of work, but if you wish to make sure that all disagreements are
+checked this is the easiest way.
+
+With a quality cutoff of -1, lowering the consensus cutoff value to (eg) 90
+means that a base in the consensus will only be a dash when over 10% of the
+bases disagree with the majority at that point. So a base covered by 11
+sequences, 10 of which state @code{A} and one of which states @code{C} would
+not be considered a problem and would not be found by the problem search. Note
+that this is regardless of the strand information. So if the @code{A}s are on
+the positive strand and the single @code{C} is on the negative strand then this
+is still not considered a problem. However, see below.
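+
+The frequency rule in this example can be written as a small Python sketch.
+It is a simplified model of a single column with a quality cutoff of -1,
+not the package's consensus code, and the function name is hypothetical.
+
```python
from collections import Counter


def frequency_consensus(column, cutoff_percent):
    """Consensus for one column from simple base counts.

    Returns the majority base, or '-' when the majority makes up less than
    cutoff_percent of the bases in the column. Strand is ignored.
    """
    if not column:
        raise ValueError("column must contain at least one base")
    base, count = Counter(column).most_common(1)[0]
    # Integer arithmetic avoids rounding problems at the boundary.
    return base if count * 100 >= cutoff_percent * len(column) else "-"


if __name__ == "__main__":
    column = "A" * 10 + "C"
    print(frequency_consensus(column, 90))   # 10 of 11 is above 90%
    print(frequency_consensus(column, 100))  # any disagreement gives a dash
```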
+
+Still working with the ``Base type frequencies'' and ``Quality weighted base
+type'' consensus methods, changing the quality cutoff to be 0 or more means
+that the consensus base is derived from the relative quality of bases instead
+of simple frequency counts.  A quality cutoff of 0 and a consensus cutoff of
+90 means that the base will be a dash only when the sum of the quality values
+for the most common base type (defined by the highest quality sum) is less
+than 90% of the total. In comparison with a quality cutoff of -1, this means
+that the above example of 10 @code{A} bases and 1 @code{C} base would be
+considered a problem if the @code{C} base had a sufficiently high quality.
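+
+A minimal Python sketch of the quality-weighted rule follows. It assumes
+each base call is available as a (base, quality) pair and ignores details
+such as pads and strand; the function name is hypothetical and the real
+algorithm may differ.
+
```python
def quality_weighted_consensus(calls, cutoff_percent):
    """Consensus for one column from summed base qualities.

    calls is a list of (base, quality) pairs. The result is the base with
    the highest quality sum, or '-' when that sum is less than
    cutoff_percent of the total quality in the column.
    """
    sums = {}
    for base, quality in calls:
        sums[base] = sums.get(base, 0) + quality
    total = sum(sums.values())
    if total == 0:
        return "-"
    best = max(sums, key=sums.get)
    return best if sums[best] * 100 >= cutoff_percent * total else "-"


if __name__ == "__main__":
    weak_c = [("A", 5)] * 10 + [("C", 2)]
    strong_c = [("A", 5)] * 10 + [("C", 40)]
    print(quality_weighted_consensus(weak_c, 90))    # poor C is outvoted
    print(quality_weighted_consensus(strong_c, 90))  # good C forces a dash
```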
+
+If you have confidence values for each base available you may consider it
+unnecessary to check disagreements caused by poor quality data disagreeing
+with good quality data, although disagreements between good data and good data
+should always be checked. However it should be obvious from this that with a
+quality cutoff of 0 and a consensus cutoff of 100% every sequence conflict is
+still considered a potential problem. A specific change in the consensus
+cutoff (eg from 100% to 90%) will typically find fewer problems when the
+quality cutoff is 0 than when it is -1. This is entirely due to differences
+between good quality data and poor quality data being excluded.
+
+Finally, the ``Compare Strands'' editor setting calculates two independent
+consensus sequences; one for each strand. The consensus shown is then the base
+calculated in each of the two consensus sequences if they agree, or dash if
+they do not. The ``confidence'' consensus algorithm already takes into account
+strand and chemistry when calculating the consensus base type and confidence,
+but will only lower the confidence value for strand disagreements, rather than
+setting the consensus base to be a dash.  For all consensus methods enabling
+``Compare Strands'' will force you to check all consensus bases where the
+evidence from each strand is conflicting.
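+
+The combination step of ``Compare Strands'' can be sketched in Python as
+below. The sketch assumes that the two per-strand consensus sequences have
+already been calculated and are of equal length; how positions with data on
+only one strand are treated is not modelled here.
+
```python
def compare_strands(forward_consensus, reverse_consensus):
    """Combine two per-strand consensus sequences.

    Each position keeps the base when both strands agree and becomes a
    dash when they do not.
    """
    if len(forward_consensus) != len(reverse_consensus):
        raise ValueError("strand consensus sequences must have equal length")
    return "".join(
        f if f == r else "-"
        for f, r in zip(forward_consensus, reverse_consensus)
    )


if __name__ == "__main__":
    print(compare_strands("ACCGAG", "ACTGAG"))
```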
+
+_split()
+ at node Editor-Techniques-EType
+ at subsection Editing by Base Change or Confidence
+ at cindex Editing techniques: confidence values
+ at cindex Confidence values: editing techniques
+ at cindex Contig editor: confidence values
+
+Once a location has been found where an edit needs to be made there are two
+possible methods of resolving the problem. Assuming that the edit is a base
+replacement, the first way is to simply replace the differing base with the
+corrected base. This adds the new base at 100% quality.  A second solution is
+to set the confidence of the differing base to 0.  Assuming that we have the
+quality cutoff set to zero or more, this will remove the differing base
+from the consensus calculation, thus enabling the consensus to be 100%
+identical.
+
+Both of these methods may be used when replacing bases in the consensus and
+are selectable using the ``Edit Modes'' menu. Fixing a problem by adjusting its
+confidence leaves the original, conflicting, base visible on the screen.
+However if the changed reading is the only one on a strand then adjusting the
+confidence means that the point only has good data on one strand.
+
+_split()
+ at node Editor-Techniques-Overcall
+ at subsection Base Overcalls
+ at cindex Editing techniques: overcalls
+ at cindex Overcalls: editing techniques
+
+A common problem is that of base overcalls that will result in perhaps one
+reading having an extra base, and all the others being padded by the alignment
+routines:
+
+ at example
+Read1      ACC*AG
+Read2      ACC*AG
+Read3      ACC*AG
+Read4      ACCCAG
+Read5      ACC*AG
+Consensus  ACC*AG
+ at end example
+
+In this first case we see that @code{Read4} has an extra @code{C}, probably
+due to an overcall. Check that the trace for @code{Read4} shows an overcall.
+It is a good idea to check good quality traces for both strands as well as the
+trace with the apparent problem. Also note that enabling ``Show reading
+quality'' (Settings menu) will show the reading quality as grey scales.
+
+We now need to remove the column. It would appear that this could be done by
+removing the @code{*} from each of Readings 1, 2, 3 and 5, and removing the
+ at code{C} from Reading 4. However this will only make edits to those five
+readings. As we're trying to remove an entire column from the contig, we need
+to shift to the left by a single base the position of any readings to the
+right. Naturally this is not the ideal method.
+
+By placing the editing cursor in the consensus (on the second A) we can
+press Delete to remove the entire column. This automatically makes sure that
+everything is consistent. If we are editing at 100% consensus cutoff then this
+consensus base will be a '-' instead of a '*'. For this to work we need to
+make sure that we have ``Allow del dash in cons'' enabled in the Edit Modes
+menu. _oxref(Editor-Modes, Editing Modes).
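+
+To see why deleting in the consensus is preferable to editing each reading,
+consider this Python sketch of a whole-column deletion. The data layout (a
+dictionary of reading name to start position and sequence) is invented for
+the example and is not how the database stores readings.
+
```python
def delete_column(reads, column):
    """Remove one contig column from a set of aligned readings.

    reads maps a reading name to (start, sequence), where start is the
    contig position of the first base. Readings covering the column lose
    that base; readings starting to the right of it shift left by one.
    """
    result = {}
    for name, (start, sequence) in reads.items():
        end = start + len(sequence)
        if column < start:
            # Entirely to the right: keep the bases, shift the position.
            result[name] = (start - 1, sequence)
        elif column < end:
            # Covers the column: drop the base (or pad) at that position.
            offset = column - start
            result[name] = (start, sequence[:offset] + sequence[offset + 1:])
        else:
            # Entirely to the left: unchanged.
            result[name] = (start, sequence)
    return result


if __name__ == "__main__":
    reads = {
        "Read1": (0, "ACC*AG"),
        "Read4": (0, "ACCCAG"),
        "Read6": (5, "GTT"),  # hypothetical reading further to the right
    }
    print(delete_column(reads, 3))
```
+
+Editing only the five readings shown would miss the shift applied to
+readings such as the hypothetical Read6.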
+
+_split()
+ at node Editor-Techniques-Undercall
+ at subsection Base Undercalls
+
+ at example
+Read1      ACCCAG
+Read2      ACCCAG
+Read3      ACCCAG
+Read4      ACC*AG
+Read5      ACCCAG
+Consensus  ACCCAG
+ at end example
+
+In the above case we see that @code{Read4} has a @code{C} missing. Once again
+we must check the traces to be sure that we wish to edit the reading. If so,
+then we can either make an edit specifically to @code{Read4} itself (in
+``replace'' mode) or type @code{C} in the consensus at this column. The latter
+will change either the base type of the @code{*} in @code{Read4} to @code{c},
+or will change its confidence value to 0. This depends upon the value of the
+``Edit by base type''/''Edit by confidence'' setting in the Edit Modes menu.
+
+When replacing base types, it is preferable to use lowercase letters. This
+makes the modified base stand out. However even when using uppercase letters
+it is always possible to search for edits at a later stage, although they
+won't be as obvious to the human eye. Finally, note that the ``Allow replace in
+cons'' mode must be set to enable this solution.
+
+_split()
+ at node Editor-Techniques-Disagree
+ at subsection Multiple Base Disagreements
+
+ at example
+Read1      ACCGAG
+Read2      ACCGAG
+Read3      AC*GAG
+Read4      ACCGAG
+Read5      AC*GAG
+Consensus  AC-GAG
+ at end example
+
+Now we have a more complex case. Two disagreements out of five readings. Care
+should be taken to check these traces. Also note the strand of each reading.
+If the database is highly repetitive, and @code{Read1}, @code{Read2}, and
+ at code{Read4} are all from one strand, with @code{Read3} and @code{Read5} from
+the opposite strand then there is a chance that a misassembly has occurred or
+that the problem is a strand dependent sequencing artifact.
+
+Typically this is clear when using the ``Highlight disagreements''
+mode. 
+_oxref(Editor-Disagree, Highlight Disagreements). 
+By selecting this mode
+and by also highlighting the reading names (_fpref(Editor-Names, The sequence
+names display, contig_editor)) scanning along the contig will quickly show
+whether there are other disagreements in common with these two readings versus
+the other three (which would support the evidence of misassembly).
+
+If any misassembled readings have been spotted then mark them for disassembly
+(_fpref(Editor-Remove Readings, Remove Readings, contig_editor)) and they'll
+no longer cause conflicts in the consensus. If the problem is a simple case of
+needing to edit, then making the edit in the consensus will require only one
+key stroke instead of the two needed to edit the individual readings.
+
+_split()
+ at node Editor-Techniques-Quality
+ at subsection Poor Quality
+
+ at example
+Read1      ACC*AGT*CGTA
+Read2      ACC*AGT*CGTA
+Read3      ACC*AGT*CGTA
+Read4      ACCCAGTCCG
+Read5      ACC*AGT*CGTA
+Consensus  ACC*AGT*CGTA
+ at end example
+
+This is identical to the first case, except we have two edits within
+ at code{Read4} in close proximity. This is usually due to a poor quality
+reading, which can be checked by examining the trace and confidence values.
+Whilst we could continue to make edits in the normal fashion it may be wiser
+to take another approach.
+
+One technique is to adjust the cutoff data for @code{Read4}. By marking the
+data as hidden, this portion of the reading will no longer be used for
+producing the consensus. However we can only extend the cutoff data at one end
+or the other; it is not possible to have ``hidden'' data part way through a
+reading except by modifying its confidence. Note though that adjusting the
+cutoff data may mean that we have no data for one strand, which should be
+solved by extra experiments.
+
+If the reading is poor quality along its entire length, then disassembly is
+also a viable option. Using Highlight Disagreements
+(_fpref(Editor-Disagree, Highlight Disagreements)) or
+Check Assembly (_fpref(Check Assembly, Check Assembly, check_ass)) is a good
+way of finding such readings. Note that disassembling readings may have other
+implications. It could cause a hole in the contig (in which case it will be
+broken in two) or it could cause a single stranded segment. If this is the
+case, the user needs to weigh up the work involved with making many edits along
+the length of this reading against performing another experiment to obtain better
+quality data.
+
+_split()
+ at node Editor-Techniques-Check
+ at subsection Checking for Errors
+
+It is important to check the final sequence for any errors introduced by
+incorrect editing. We strongly advise this when making use of the more
+dangerous options in ``Edit Modes'' as it is possible to accidentally make
+changes. The editor provides several methods for checking the edits performed
+on the data.
+
+The search menu contains two search types
+(_fpref(Editor-Search-VerifyEdit1, Search for evidence for edits(1), search))
+and
+(_fpref(Editor-Search-VerifyEdit2, Search for evidence for edits(2), search)).
+``Evidence for edits (1)'' searches for the next
+edited place where none of the original readings
+agree with the consensus. This helps to spot cases where entire
+columns have been inserted or deleted. ``Evidence for edits (2)'' performs the
+same checks as ``Evidence for edits (1)'', but on each strand independently. So
+it will find all edited places where there is not evidence from both
+strands.
+
+Further checks may be performed outside the editor. Using the Find Read Pairs
+command all templates containing both forward and reverse readings will be
+checked to make sure that the relative orientation and distance of the
+sequences is correct. _fxref(Read Pairs, Find Read Pairs, read_pairs).
+
+Finally, the Check Assembly command can either check the hidden data for each
+reading to check that it does not diverge from the consensus sequence, or the
+visible data can be examined to locate segments with a high proportion of
+disagreements with the consensus.  _fxref(Check Assembly, Check Assembly,
+check_ass). Such cases arise from unnoticed sections of vector sequence,
+chimeric reads, or due to a reading being in the wrong copy of a repeated
+element.
+
+_split()
+ at node Editor-Summary
+ at section Summary
+ at cindex Summary: contig editor
+ at cindex Contig Editor: summary
+ at cindex Keyboard summary (contig editor)
+
+ at node Editor-Summary-Keys
+ at subsection Keyboard summary for editing window
+
+(``Left'', ``Right'', ``Up'', ``Down'' refer to the appropriate arrow keys.)
+
+ at example
+Escape/Control v                Scroll right one screenful
+Meta/Alt v                      Scroll left one screenful
+
+Left or Control b               Move editing cursor left one base
+Right or Control f              Move editing cursor right one base
+Up or Control p                 Move editing cursor up one base
+Down or Control n               Move editing cursor down one base
+Control a                       Move editing cursor to start of used
+Control e                       Move editing cursor to end of used
+Meta/Alt/Escape a               Move editing cursor to start of cutoff
+Meta/Alt/Escape e               Move editing cursor to end of cutoff
+Meta/Alt/Escape comma           Move editing cursor to start of contig
+Meta/Alt/Escape fullstop        Move editing cursor to end of contig
+
+Meta/Control/Alt Left           Extend left cutoff data
+Meta/Control/Alt Right          Extend right cutoff data
+Control l                       Move pad left
+Control r                       Move pad right
+<                               Zap left cutoff data
+>                               Zap right cutoff data
+
+[                               Set confidence to 0
+]                               Set confidence to 100
+Shift Up                        Increase confidence of base by 1
+Shift Down                      Decrease confidence of base by 1
+Control Up                      Increase confidence of base by 10
+Control Down                    Decrease confidence of base by 10
+Delete                          Delete base or Shift reading left
+Backspace                       Delete base or Shift reading left
+Control Delete                  Delete base from left and move left
+Control d                       Delete base from right; do not move
+
+Space                           Shift reading right
+Undo key or Control underscore  Perform an undo
+Control i                       Toggle insert mode
+Control h                       Toggle a sequence for removal
+Control t                       Display trace
+Control s                       Search forward
+Escape Control s                Search backwards
+Control x Control s             Save editor
+Control q                       Toggle tag display
+Control c or Control Insert     Copy underlined region to paste buffer
+Insert                          Insert a padding character (*)
+any ACGT1234DVBHKLMNRY5678*-    Insert or change base (both cases allowed)
+ at end example
+
+ at node Editor-Summary-Mouse
+ at subsection Mouse summary for editing window
+
+ at example
+Left button                     Position editing cursor to mouse cursor
+                                Update editor information line
+Left button (drag)              Mark start and end of selection
+Shift left button               Adjust end of selection
+Enter key                       Update editor information line (unpadded pos.)
+Return key                      As Enter key, but also moves editing cursor
+Left button (double click)      Display trace
+Middle button (double click)    Display trace
+Control left button             Display commands menu
+Right button                    Display commands menu
+Mouse-wheel                     Vertically scroll the editor
+Shift mouse-wheel               Vertically scroll the editor, slow
+Control mouse-wheel             Vertically scroll the editor, fast
+ at end example
+
+ at node Editor-Summary-MouseNames
+ at subsection Mouse summary for names window
+
+ at example
+ at group
+Left button                     Toggle user highlight (not in status line)
+Middle button/Alt left button   Add name to output list (if set)
+Right button                    Display popup menu
+ at end group
+ at end example
+
+ at node Editor-Summary-Scroll
+ at subsection Mouse summary for scrollbar
+
+ at example
+ at group
+Middle button                   Set scrollbar position
+Alt left button                 Set scrollbar position
+Left button                     Scroll left or right one screenful
+ at end group
+ at end example
+
+In addition to the scrollbar manipulation, the ``<<'', ``<'', ``>'',
+``>>'' buttons also scroll the editor left or right by half a
+screenful or one base.
diff --git a/manual/contig_editor.join.png b/manual/contig_editor.join.png
new file mode 100644
index 0000000..51e034b
Binary files /dev/null and b/manual/contig_editor.join.png differ
diff --git a/manual/contig_editor.join.small.png b/manual/contig_editor.join.small.png
new file mode 100644
index 0000000..948965f
Binary files /dev/null and b/manual/contig_editor.join.small.png differ
diff --git a/manual/contig_editor.screen.png b/manual/contig_editor.screen.png
new file mode 100644
index 0000000..787cd66
Binary files /dev/null and b/manual/contig_editor.screen.png differ
diff --git a/manual/contig_editor.screen.small.png b/manual/contig_editor.screen.small.png
new file mode 100644
index 0000000..5326d23
Binary files /dev/null and b/manual/contig_editor.screen.small.png differ
diff --git a/manual/contig_editor.search.png b/manual/contig_editor.search.png
new file mode 100644
index 0000000..7730216
Binary files /dev/null and b/manual/contig_editor.search.png differ
diff --git a/manual/contig_editor.taged.png b/manual/contig_editor.taged.png
new file mode 100644
index 0000000..de930bc
Binary files /dev/null and b/manual/contig_editor.taged.png differ
diff --git a/manual/contig_editor.tagmacro.png b/manual/contig_editor.tagmacro.png
new file mode 100644
index 0000000..0c4037d
Binary files /dev/null and b/manual/contig_editor.tagmacro.png differ
diff --git a/manual/contig_editor.tagsel.png b/manual/contig_editor.tagsel.png
new file mode 100644
index 0000000..e4e1b34
Binary files /dev/null and b/manual/contig_editor.tagsel.png differ
diff --git a/manual/contig_editor.traces.compact.png b/manual/contig_editor.traces.compact.png
new file mode 100644
index 0000000..79535a2
Binary files /dev/null and b/manual/contig_editor.traces.compact.png differ
diff --git a/manual/contig_editor.traces.compact.small.png b/manual/contig_editor.traces.compact.small.png
new file mode 100644
index 0000000..3427810
Binary files /dev/null and b/manual/contig_editor.traces.compact.small.png differ
diff --git a/manual/contig_editor.traces.png b/manual/contig_editor.traces.png
new file mode 100644
index 0000000..bcd152c
Binary files /dev/null and b/manual/contig_editor.traces.png differ
diff --git a/manual/contig_editor.traces.small.png b/manual/contig_editor.traces.small.png
new file mode 100644
index 0000000..955161e
Binary files /dev/null and b/manual/contig_editor.traces.small.png differ
diff --git a/manual/contig_editor_grey_scale.png b/manual/contig_editor_grey_scale.png
new file mode 100644
index 0000000..dd98124
Binary files /dev/null and b/manual/contig_editor_grey_scale.png differ
diff --git a/manual/contig_editor_grey_scale.small.png b/manual/contig_editor_grey_scale.small.png
new file mode 100644
index 0000000..d34006a
Binary files /dev/null and b/manual/contig_editor_grey_scale.small.png differ
diff --git a/manual/contig_editor_sets.png b/manual/contig_editor_sets.png
new file mode 100644
index 0000000..d991710
Binary files /dev/null and b/manual/contig_editor_sets.png differ
diff --git a/manual/contig_editor_sets.small.png b/manual/contig_editor_sets.small.png
new file mode 100644
index 0000000..4c44388
Binary files /dev/null and b/manual/contig_editor_sets.small.png differ
diff --git a/manual/contig_list_box.png b/manual/contig_list_box.png
new file mode 100644
index 0000000..1cc756b
Binary files /dev/null and b/manual/contig_list_box.png differ
diff --git a/manual/contig_navigation-t.texi b/manual/contig_navigation-t.texi
new file mode 100644
index 0000000..b82eac2
--- /dev/null
+++ b/manual/contig_navigation-t.texi
@@ -0,0 +1,51 @@
+ at cindex Search from file
+ at cindex Contig navigation
+ at cindex Contig region
+
+
+This function, which can be found under the view menu,
+allows the user to navigate to areas of interest within 
+contigs.
+When Contig navigation is selected a dialog box is raised
+asking for a filename containing the regions. The format is
+the same as that used by the search by file function.
+_fxref(Editor-Search-File,Search by file,contig_editor)
+
+_picture(contig_navigation_browse,2.69167in)
+
+The user can either enter the name of the file or browse
+for it using the browse button. Once ok is hit, the file is 
+loaded into a table for viewing.
+
+_picture(contig_navigation_table,6in)
+
+The table has three fixed headers, contigID, Position and Problem
+Type. Clicking on any of these causes the whole table to be sorted on
+that column.  The regions can be viewed in any order by double
+clicking on a row, by selecting a row and using the next (->>) and
+previous (<<-) buttons at the bottom, or by pressing the Page Up and
+Page Down keys.  The corresponding contig editor will be opened and
+moved to the position indicated.  Once a row has been clicked on, its
+background will be changed to highlight that it has been visited.
+
+The reset button will clear the table and re-read the data from 
+file.
+Auto-close editors is set on by default. It closes any un-needed 
+editors when the user selects a region on a different contig. 
+The Show Traces mode will automatically display some traces based on
+the same mechanisms used in the editor's 'Auto-display Traces' option
+(_fpref(Editor-Trace Display, Trace Display Settings, contig_editor)).
+Save will save the table list, including all rows previously marked
+as selected, back to the file. If this file is re-read at a later stage
+then the table will have the same sort order and tagging as when saved. 
+
+The format of the input file is as follows:
+
+ at i{contig_identifier} @i{position} @i{comment}
+
+If the comment contains ``@code{To:}'' and a number then the region
+indicator at the bottom of the navigator window updates to show the
+size of the element, otherwise it just has a line showing the position
+of the start. Finally the comment may end in the 'nul' character to
+indicate that it has already been visited. (This is utilised by the
+Save command.)
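
The input file format described above for contig_navigation-t.texi can be sketched as a small parser. This is only an illustration, not gap4's own code: the names Region and parse_region_line are hypothetical, and it assumes that fields are separated by whitespace, that the position is an integer, and that the "To:" number directly follows the "To:" text.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    contig: str              # contig_identifier field
    position: int            # position field (assumed to be an integer)
    comment: str             # remainder of the line, without the trailing nul
    to_value: Optional[int]  # number following "To:" in the comment, if any
    visited: bool            # True when the comment ended in a nul character


# Assumption: the number follows "To:" with optional whitespace in between.
_TO_RE = re.compile(r"To:\s*(\d+)")


def parse_region_line(line: str) -> Region:
    """Parse one 'contig_identifier position comment' line."""
    line = line.rstrip("\r\n")
    # A trailing nul character marks a row that has already been visited.
    visited = line.endswith("\0")
    if visited:
        line = line[:-1]
    parts = line.split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected at least a contig identifier and a position")
    comment = parts[2].strip() if len(parts) > 2 else ""
    match = _TO_RE.search(comment)
    to_value = int(match.group(1)) if match else None
    return Region(parts[0], int(parts[1]), comment, to_value, visited)
```

The sketch only records the "To:" number; how the navigator turns it into the size shown in the region indicator is not specified here and is left out.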
diff --git a/manual/contig_navigation_browse.png b/manual/contig_navigation_browse.png
new file mode 100644
index 0000000..5657c60
Binary files /dev/null and b/manual/contig_navigation_browse.png differ
diff --git a/manual/contig_navigation_table.png b/manual/contig_navigation_table.png
new file mode 100644
index 0000000..bad780c
Binary files /dev/null and b/manual/contig_navigation_table.png differ
diff --git a/manual/contig_ordering-t.texi b/manual/contig_ordering-t.texi
new file mode 100644
index 0000000..baeb392
--- /dev/null
+++ b/manual/contig_ordering-t.texi
@@ -0,0 +1,139 @@
+ at menu
+* Order-Contigs::            Order Contigs
+* Read Pairs::               Find Read Pairs
+* FIJ::                      Find Internal Joins
+* Repeats::                  Find Repeats
+ at end menu
+
+After the initial rounds of assembly it is likely that the data for a
+sequencing project will still not be contiguous. In order to minimise
+the number of experiments required to finish the project it is useful 
+to be able to get as much from the existing data as possible. The
+functions described in this section can help to get the current set of
+contigs into a consistent left to right order, can discover joins
+between contigs which were missed or overlooked by the assembly
+engines, and can help in the analysis of repeats which may cause
+problems for assembly. It is one of the strengths of gap4 that the
+results from several of these independent types of analysis can be
+combined in a single display
+(_fpref(Contig Comparator, Contig Comparator, comparator)),
+and where they are seen to reinforce one another, users can feel more
+confident in their decisions.
+
+_lpicture(comparator,5.325in)
+
+A typical Contig Comparator display is shown in the figure above. It is
+showing results from other functions, as well as the ones described
+in this section.
+
+The first function
+(_fpref(Order-Contigs, Order Contigs, contig_ordering))
+automatically orders contigs based on read-pair data. The orderings
+found can be examined in the Template Display
+(_fpref(Template-Display, Template Display, template))
+
+The next function
+(_fpref(Read Pairs, Find read pairs, read_pairs))
+also examines read-pair data, but instead of automatically ordering the
+contigs, plots out their relationships in the Contig Comparator, from
+where the user can invoke the Template Display to check them, and use
+the Contig Selector
+to reorder them.
+
+Sometimes assembly engines will miss or regard some weak joins as too
+uncertain to be made. The Find Internal Joins function
+(_fpref(FIJ, Find Internal Joins, fij)),
+compares contigs, including their hidden data, to find matches between
+the ends of contigs. 
+Again results are presented in the Contig
+Comparator, and users can invoke the Contig Joining Editor
+(_fpref(Editor-Joining, The Join Editor, contig_editor))
+to examine and make joins.
+
+Whereas Find Internal Joins makes sure that alignments between contigs
+continue right to their ends, another search, Find Repeats
+(_fpref(Repeats, Find Repeats, repeats))
+finds any identical segments of sequence, wherever they lie in the
+consensus. This has several uses. It gives another way of finding
+potential joins, and it provides a way of annotating (tagging) repeats so
+that their positions are obvious to users, and can be taken into account
+by other search procedures.
+Again results are presented in the Contig
+Comparator, and users can invoke the Contig Joining Editor
+(_fpref(Editor-Joining, The Join Editor, contig_editor))
+to examine and make joins.
+
+ at page
+_split()
+ at node Order-Contigs
+ at section Order contigs
+
+ at cindex Ordering contigs:gap4
+ at cindex Read pair data and contig ordering
+ at cindex Template display and contig ordering
+ at cindex Update contig order
+ at cindex Listbox
+ at cindex Complement contig
+ at cindex Contig complementing
+ at cindex Super contigs
+
+This routine uses read-pair information to try to work out the left to right
+order of sets of contigs. 
+It is invoked from the gap4 Edit menu.
+At present it attempts to order all the contigs in
+the database, and when finished it produces a listbox window containing
+one or more sets (one set per line) of contigs listed by the names of their
+leftmost readings.  By clicking on their names in the listbox the user can
+request that these "super contigs" should be shown in the standard Template
+display window 
+(_fpref(Template-Display, Template Display, template)).
+
+Using the
+tools available within this window the user can manually move or complement
+any contigs which appear to have been misplaced. The combination
+of automatic ordering and the facility to view the results by eye and manually
+correct any errors make this a powerful tool.  The new contig order can
+be saved to the database by selecting the "Update contig order" command from
+the "Edit" menu of the Template display.  Note, however, that unlike the
+editing operations in the Contig editor, which are only committed to the disk
+copy of the database at the user's request, all the complementing operations
+in gap4 are always performed both in memory and on the disk.  This means that
+any complementing done as part of the contig ordering process will be
+immediately committed to disk.
+
+An example of the "Super contig" listbox is shown here.
+
+_picture(c_order_lb,5.04167in)
+
+ at page
+The example seen in the figures shows a Template display before and
+after the application of the algorithm.
+
+_lpicture(c_order_t1,6in)
+ at exdent @i{Before ordering}
+ at page
+
+_lpicture(c_order_t2,6in)
+ at exdent @i{After ordering}
+
+Notice how the operation has reduced the large number of dark yellow (inconsistent) templates by ordering and complementing the contigs so that they are now
+consistent and show in bright yellow. The few remaining dark yellow templates
+represent problems, possibly with misassembly or with misnaming of
+readings. The reliability of these dark yellow templates is also
+questionable, given that one or the other of the readings is
+typically within the middle of large contigs, and hence are not likely
+to be spanning contigs. The gaps between the contigs, shown in the ruler
+at the bottom of the template display, are real estimates of the size of the
+missing data, based on the expected lengths of the templates.
+
+The algorithm is based on ideas used to build cosmid contigs using
+hybridisation data @cite{Zhang,P, Schon,EA, Fischer,SG, Cayanis,E,
+Weiss,J, Kistler,S and Bourne,P, (1994) "An algorithm based on graph
+theory for the assembly of contigs in physical mapping of DNA", CABIOS
+10, 309-317}. A difficulty for algorithms of this type is dealing with
+errors in the data, i.e. pairs of readings that have been incorrectly
+assigned to the same template (often by simple typing errors made prior
+to the creation of the experiment files). Our algorithm uses several
+simple heuristics to deal with such problems but one known problem is that
+it does not correctly deal with cases where templates span non-adjacent
+contigs, or where such contigs interleave.
diff --git a/manual/contig_selector-t.texi b/manual/contig_selector-t.texi
new file mode 100644
index 0000000..6d58eaa
--- /dev/null
+++ b/manual/contig_selector-t.texi
@@ -0,0 +1,173 @@
+ at menu
+* Contig-Selector-Contigs::             Selecting contigs
+* Contig-Selector-Order::               Changing the contig order
+* Contig-Selector-Menus::               The menus
+ at end menu
+
+The __prog__ Contig Selector is used to display, select and reorder contigs.
+It can be invoked from the __prog__ View menu, but will automatically appear when
+a database is opened.  In the Contig Selector all contigs are shown as
+colinear horizontal lines separated by short vertical lines.  The length of
+the horizontal lines is proportional to the length of the contigs and their
+left to right order represents the current ordering of the contigs. This
+Contig Order is stored in the gap database and users can change it by
+dragging the lines representing the contigs in the display.  The Contig
+Selector can also be used to select contigs for processing.
+
+_ifdef([[_gap4]],[[Tags
+(_fpref(Intro-Anno, Annotating and masking readings and contigs, __prog__)) can
+also be displayed in the Contig Selector window.  As the mouse is moved over a
+contig, it is highlighted and the contig name (leftmost reading name) and
+length are displayed in the status line. The number in brackets is the contig
+number.]]) _ifdef([[_gap5]],[[Unlike gap4, gap5 does not display
+annotations within the Contig Selector window.]])
+
+ at cindex Contig Selector: Contig order
+ at cindex Contig order: Contig Selector
+
+_ifdef([[_gap4]],[[
+_picture(contig_selector,4.63333in)
+]],[[
+_picture(gap5_contig_selector,5.34167in)
+]])
+
+The figure shows a typical display from the Contig Selector. At the top are
+the File, View and Results menus.  Below that are buttons for zooming
+and for displaying the crosshair. The four boxes to the right
+are used to display
+the X and Y coordinates of the crosshair. The rightmost two display the Y
+coordinates when the contig selector is transformed into the contig comparator
+(_fpref(Contig Comparator, Contig Comparator, comparator)).
+The two leftmost boxes display the X coordinates: the
+leftmost is the position in the contig and the other is the position
+in the overall consensus.  The crosshair is the vertical line spanning the
+panel below. 
+
+This panel shows the lines that represent the contigs and the
+currently active tags. Those tags shown above the contig lines are on readings
+and those below are on the consensus.  Right clicking on a tag gives a
+menu containing ``information'' (to see the tag contents) and ``Edit
+contig at tag'' which invokes the contig editor centred on the
+selected tag.
+
+The information line is showing data for
+the contig that is currently under the crosshair.
+
+_split()
+ at node Contig-Selector-Contigs
+ at section Selecting Contigs
+ at cindex Contig Selector: selecting contigs
+ at cindex selecting contigs: Contig Selector
+ at cindex naming contigs
+ at cindex contig naming
+ at cindex contigs - identifying
+ at cindex identifying contigs
+
+Contigs can be selected by either clicking with the left mouse button
+on the line representing the required contig in the contig selector window
+or alternatively by choosing the "List contigs" option from the "View" menu. 
+This option invokes a "Contig List" list box where the contig names and 
+numbers are listed in the same order as they appear in the contig selector 
+window. 
+
+_picture(contig_list_box,5.06667in)
+
+Within this list box the contig names can be sorted 
+alphabetically on contig name or numerically on contig number. This is done 
+by selecting the corresponding item from the sort 
+menu at the top of the list box. Clicking on a name within the list box is 
+equivalent to clicking on the corresponding contig in the contig selector.
+More than one contig can be selected by dragging out a region with the left
+mouse button. Dragging the mouse off the bottom of the list will scroll it to
+allow selection of a range larger than the displayed section of the
+list.  When the left button is pressed any existing selection is
+cleared. To select several disjoint entries in the list press control
+and the left mouse button.  The ``Copy'' button copies the current
+selection to the paste buffer.
+
+_ifdef([[_gap4]],[[Most commands require a contig identifier (which can be the name or
+number of any reading on the contig) and __prog__ contains several
+mechanisms for obtaining this information from users.  The names or
+numbers can be typed or cut and pasted into dialogue boxes (note that a
+reading number must be preceded by a # character, e.g.  "#102" means
+reading number 102 but "102" means the reading with name
+102).]],[[Most commands require a contig identifier, which can be the
+contig name itself or the name/number of any reading within that
+contig. __Prog__ always knows reading record numbers, but depending on
+the options used in tg_index when creating the assembly database the
+reading names may not be indexed. To specify a reading by record
+number, precede it by a # character, e.g. ``#10000'' means
+reading record number 10000, but ``10000'' means the contig or reading
+with name 10000.]])
+
+Also any
+currently active dialogue boxes that require a contig to be selected can
+be updated simply by clicking on a contig in the contig selector or clicking
+on an entry in the "Contig Names" list box.  For
+example, if the Edit contig command is selected from the Edit menu it
+will bring up a dialogue requesting the identity of the contig to edit.
+If the user clicks the left mouse button on a contig in the contig
+selector window, the contig editor dialogue will automatically change to
+contain the name of the selected contig.  Some commands, such as the
+Contig Editor, can be selected from a popup menu that is activated by
+clicking the right mouse button on the contig line in the Contig
+Selector or clicking the right mouse button on the corresponding name within
+the "Contig List" list box. This simultaneously defines the contig to 
+operate on and so the command starts up without dialogue.
+
+Several contigs can be selected at once by either clicking on each
+contig with the left mouse button or dragging out a selection rectangle
+by holding the left mouse button down. Contigs which are entirely
+enclosed within the rectangle will be selected. Alternatively, selecting
+several contigs from the "Contig Names" list box will also result in each
+contig being selected. Selected contigs are highlighted in bold. Selecting
+the same contig again will unselect it.
+
+The currently selected contigs are also kept in a 'list' named contigs.
+
+_split()
+ at node Contig-Selector-Order
+ at section Changing the Contig Order
+ at cindex Contig Selector: changing the contig order
+ at cindex Contig Selector: saving the contig order
+
+The order of contigs is shown by the order of the lines representing
+them within the Contig Selector. The order of contigs can be changed by
+moving these lines using the middle mouse button, or Alt left mouse
+button.  Several contigs may
+be moved at once by selecting several contigs using the above method.
+After selection, move the contigs with the middle mouse button, or Alt
+left mouse button, and
+position the mouse cursor where you want the selection to be moved to.
+Upon release of the mouse button the contigs will be shuffled to reflect
+their new order. The separator line at the point the contig was moved
+from increases in height.
+
+The contig order is saved automatically whenever a contig is created or
+removed (e.g. auto assemble), including operations like disassemble which
+temporarily create contigs. The order can be saved manually using the
+Save Contig Order option on the File menu.
+
+_split()
+ at node Contig-Selector-Menus
+ at section The Contig Selector Menus
+ at cindex Contig Selector: menus
+ at cindex File menu: Contig Selector
+ at cindex View menu: Contig Selector
+ at cindex Results menu: Contig Selector
+
+The File menu contains only one command; "Exit". This simply quits the contig
+selector display.
+
+The View menu gives access to the Results Manager (_fpref(Results,
+Results Manager, results)), allows contigs to be selected using a list box
+containing the contig names 
+(_oxref(Contig-Selector-Contigs, Selecting Contigs)),
+_ifdef([[_gap4]],[[allows active tags (_fpref(Conf-Tag, TagSelector, configure)) to be selected, ]])and the list of selected contigs to be cleared. 
+
+The Results menu is updated on the fly to contain cascading menus for each of
+the plots shown when the contig selector is in its 2D 
+Contig Comparator mode
+(_fpref(Contig Comparator, Contig Comparator, comparator)).
+The contents of these cascading menus are identical to
+the pulldown menus available from within the Results Manager.
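
The identifier convention described under "Selecting Contigs" in contig_selector-t.texi (a leading # means a reading number, anything else is treated as a name) can be illustrated with a short sketch. The function name classify_identifier is hypothetical and this is not how __prog__ itself is implemented; it only shows why "#102" and "102" are different requests.

```python
def classify_identifier(text):
    """Return ("number", n) for '#n' and ("name", text) for anything else."""
    text = text.strip()
    # A '#' followed only by digits selects a reading by number.
    if text.startswith("#") and text[1:].isdigit():
        return ("number", int(text[1:]))
    # Without the '#', even an all-digit string is a name.
    return ("name", text)
```

For example, classify_identifier("#102") yields a reading number, while classify_identifier("102") yields the name "102".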
diff --git a/manual/contig_selector.png b/manual/contig_selector.png
new file mode 100644
index 0000000..f7b04c8
Binary files /dev/null and b/manual/contig_selector.png differ
diff --git a/manual/convert-t.texi b/manual/convert-t.texi
new file mode 100644
index 0000000..9e25180
--- /dev/null
+++ b/manual/convert-t.texi
@@ -0,0 +1,81 @@
+ at cindex Convert program
+ at cindex Bap databases: conversion to gap4
+ at cindex Dap databases: conversion to bap or gap4
+
+ at menu
+* Conv-Program::        The conversion program
+* Conv-Example::        Example
+ at end menu
+
+gap4 is the current program in a rather long line of sequence
+assembly programs that have been distributed as part of the "Staden"
+Package. Each of these earlier programs used different types of file to
+store assembly data. These old files are incompatible with gap4, but the
+package contains a program (convert) to convert them to gap4 databases.
+It is possible to convert from:
+
+ at itemize @bullet
+ at item plain text file (created by convert)
+ at item dap database
+ at item bap database
+ at end itemize
+
+to any of the following formats
+ at itemize @bullet
+ at item plain text file (created by convert)
+ at item bap database
+ at item gap4 database
+ at end itemize
+
+ at node Conv-Program
+ at section The Conversion Program
+
+The program takes no command line arguments and has a scrolling text
+style of interface. Users are prompted
+for the format, name and version of the database to convert.
+If the source is an xdap or xbap database, ensure that the name and
+version are in uppercase. 
+If the source is a text file, the version is
+requested but ignored! Next users are prompted for the
+format, name and version of the database to create. Ensure
+that names and versions are in the appropriate case and that the files
+do not already exist.
+
+Then the program converts the database (which may take some
+time) and writes out a message to signify that
+the conversion has successfully completed.
+
+ at node Conv-Example
+ at section Example
+ at cindex Convert program example
+
+Here is a log of a typical conversion session. User input is shown in
+bold.
+
+ at example
+Covert Project Database
+Version 1.3, 4th December 1995
+Please enter database to convert:
+
+Available types are:
+0. Flat file - created with this program
+1. xdap database
+2. xbap database
+
+Database type? @b{2}
+Database name? @b{ZK643}
+Database version? @b{0}
+
+Please enter database to create:
+
+Available types are:
+0. Flat file - created with this program
+1. xbap database
+2. xgap database
+
+Database type? @b{2}
+Database name? @b{ZK643}
+Database version? @b{1}
+
+Conversion completed
+ at end example
diff --git a/manual/convert_trace.1.texi b/manual/convert_trace.1.texi
new file mode 100644
index 0000000..bc0fcf5
--- /dev/null
+++ b/manual/convert_trace.1.texi
@@ -0,0 +1,140 @@
+ at cindex convert_trace: man page
+ at unnumberedsec NAME
+
+convert_trace --- Converts trace file formats
+
+ at unnumberedsec SYNOPSIS
+
+ at code{convert_trace}
+[@code{-in_format} @i{format}]
+[@code{-out_format} @i{format}]
+[@code{-fofn} @i{file_of_filenames}]
+[@code{-passed} @i{fofn}]
+[@code{-failed} @i{fofn}]
+[@code{-name} @i{id}]
+[@code{-subtract_background}]
+[@code{-normalise}]
+[@code{-scale} @i{range}]
+[@code{-compress} @i{mode}]
+[@code{-abi_data} @i{counts}]
+[@i{informat} @i{outformat}]
+
+ at unnumberedsec DESCRIPTION
+
+ at code{convert_trace} converts between the various DNA sequence chromatogram
+formats, optionally performing trace processing actions too. It can read ABI
+(raw or processed), ALF, CTF, SCF and ZTR formats. It can write CTF, EXP, PLN, 
+SCF and ZTR formats. (Note that EXP (Experiment File) and PLN formats are
+text sequences rather than a binary trace.)
+
+There are two main modes of operation; either with a file of filenames
+specified using the @code{-fofn} @i{filename} option, or acting as a filter
+to process one single file. In this case the input and output file format may
+be specified as the last two options on the command line.
+
+ at unnumberedsec OPTIONS
+ at table @asis
+ at item @code{-abi_data} @i{counts}
+    Only of use when processing ABI files. This indicates which ABI
+    @code{DATA} channel numbers to use. For sequencing files this defaults to
+    "9,10,11,12" which corresponds to the processed data. To read the raw data 
+    use "1,2,3,4".
+
+ at item @code{-compress} @i{mode}
+    Specifies the name of a program to use to compress the trace data prior to 
+    writing. Due to limitations in the current implementation this option does 
+    not work when @code{convert_trace} is operating as a filter (and so
+    requires use of the @code{-fofn} option). Valid values for @i{mode} are
+    compress, bzip, bzip2, gzip, pack and szip. Note that for ZTR, ZTR2 and
+    ZTR3 format files specifying compression modes will not reduce the file
+    size as this format already contains internal compression algorithms. The
+    ZTR1 format does not internally compress and so @code{-compress} will have 
+    an effect.
+
+ at item @code{-failed} @i{fofn}
+    Produces a file listing the filenames which have failed to be
+    converted. This only makes sense when also using @code{-fofn}.
+
+ at item @code{-fofn} @i{file_of_filenames}
+    Processes several files instead of one, with the filenames to read from and
+    write to being listed in @i{file_of_filenames} with one pair (input and
+    output filenames) being listed per line, separated by spaces. If the
+    filenames contain spaces then these may be "escaped" using
+    backslashes. Similarly backslashes should be escaped using a double
+    backslash. For example to convert "file a.scf" and "fileb.scf" to "file
+    a.ztr" and "fileb.ztr" respectively we would use a @i{file_of_filenames}
+    containing:
+
+ at example
+file\ a.scf    file\ a.ztr
+fileb.scf      fileb.ztr
+ at end example
+
+ at item @code{-in_format} @i{format}
+    Specifies the format for the input data. Typically the input format is
+    automatically determined so this may not be required. @i{format} should be 
+    one of ABI, ALF, CTF, EXP, PLN, SCF, ZTR, ZTR1, ZTR2 or ZTR3. The ZTR
+    formats all conform to the ZTR specification, but this indicates the
+    compression level to be used.
+
+ at item @code{-name} @i{id}
+    When producing an Experiment File this specifies the value of the
+    @code{ID} line. Without this option the default Experiment File ID line is the 
+    output filename, or if this is stdout it is the input filename.
+
+ at item @code{-normalise}
+    Attempts to normalise the trace amplitudes to produce more even height
+    peaks. This may be useful to compensate for large spikes at either the
+    start or end of the trace.
+
+ at item @code{-out_format} @i{format}
+    Specifies the output format for all files, whether read from a file of
+    filenames or via a filter.  @i{format} should be 
+    one of ABI, ALF, CTF, EXP, PLN, SCF, ZTR, ZTR1, ZTR2 or ZTR3. The ZTR
+    formats all conform to the ZTR specification, but this indicates the
+    compression level to be used.
+
+ at item @code{-passed} @i{fofn}
+    Produces a file listing the filenames which have been successfully
+    converted. This only makes sense when also using @code{-fofn}.
+
+ at item @code{-scale} @i{range}
+    Scales all trace amplitudes so that they fit within the range of 0 to 
+    @i{range} inclusive. Any integer value of @i{range} may be used between 1
+    and 65535, but this option is designed for down-scaling traces in order to 
+    reduce file size.
+
+ at item @code{-subtract_background}
+    Attempts to remove background trace levels by analysing each trace channel 
+    independently to determine the baseline. This option is mainly used when
+    processing raw data.
+ at end table
+
+ at unnumberedsec EXAMPLES
+
+To convert several files to ZTR format using the same example file of
+filenames listed in the @code{-fofn} option above:
+
+ at example
+convert_trace -out_format ZTR -fofn filename
+ at end example
+
+To subtract the background from a raw ABI file and save this as an SCF file:
+
+ at example
+convert_trace -abi_data 1,2,3,4 -subtract_background ABI SCF < a.abi > a.scf
+ at end example
+
+ at unnumberedsec NOTES
+
+If ABI files are manually edited before input to convert_trace then the
+internal formats of these files may differ from the format expected by
+convert_trace.
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Scf, scf(4), formats)
+_fxref(Formats-Ztr, ztr(4), formats)
+_fxref(Man-makeSCF, makeSCF(1), makeSCF.1)
+
+
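
The escaping rules given for the -fofn option in convert_trace.1.texi (spaces escaped with a backslash, backslashes doubled) can be sketched as follows. The helper names escape_fofn_name and fofn_lines are hypothetical and are not part of convert_trace; the sketch only generates the text of a file of filenames.

```python
def escape_fofn_name(name):
    """Escape one filename for use in a convert_trace file of filenames."""
    # Double the backslashes first so the ones added for spaces are not doubled.
    return name.replace("\\", "\\\\").replace(" ", "\\ ")


def fofn_lines(pairs):
    """Build one 'input output' line per (input, output) filename pair."""
    return [
        f"{escape_fofn_name(src)} {escape_fofn_name(dst)}"
        for src, dst in pairs
    ]
```

Applied to the pairs ("file a.scf", "file a.ztr") and ("fileb.scf", "fileb.ztr"), this produces the same two entries as the manual's example, apart from the column alignment.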
diff --git a/manual/copy_db.1.texi b/manual/copy_db.1.texi
new file mode 100644
index 0000000..f069573
--- /dev/null
+++ b/manual/copy_db.1.texi
@@ -0,0 +1,68 @@
+ at cindex Copy_db: man page
+ at unnumberedsec NAME
+
+copy_db --- a garbage collecting gap4 database copier and merger
+
+ at unnumberedsec SYNOPSIS
+
+ at code{copy_db} [@code{-v}] [@code{-f}] [@code{-b} @i{32/64}]
+[@code{-T}] @i{from.vers} ... @i{to.vers}
+
+ at unnumberedsec DESCRIPTION
+
+ at code{Copy_db} copies one or more gap4 databases to a new name by
+physically extracting the information from the first databases and
+writing it to the last database listed on the command line. This
+operation can be considered analogous to copying files into a directory.
+This is slower than a direct @code{cp} command, but has the advantage
+of merging several databases together and the resulting database will
+have been  garbage collected. That is, any fragmentation in the original
+databases is removed (as much as is possible).
+
+NOTE: Care should be taken when merging databases. @strong{No checks} are
+performed to make sure that the databases do not already contain the
+same readings. Thus attempting to copy the same database several times will
+cause problems later on. No merging of vector, clone or template
+information is performed either.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-v}
+     Enable verbose output. This gives a running summary of the current piece
+     of information being copied.
+
+ at item @code{-f}
+     Attempts to spot and fix various database corruptions. A
+     corrupted gap4 database may not be corruption free after this,
+     but there's more chance of being able to recover data.
+
+ at item -T
+     Removes annotation tags while copying. (Of limited use.)
+
+ at item -b @i{bitsize}
+     Generates the new database using a given bitsize, where
+     @i{bitsize} is either @code{32} or @code{64}.
+ at end table
+
+ at unnumberedsec EXAMPLES
+
+To merge database X with database Y to give a new database Z use:
+
+ at example
+copy_db X.0 Y.0 Z.0
+ at end example
+
+ at unnumberedsec NOTES
+
+To copy a database quickly without garbage collecting the UNIX @code{cp}
+command can be used as follows. This copies version F of database DB to
+version T of database XYZZY.
+
+ at example
+cp DB.F XYZZY.T; cp DB.F.aux XYZZY.T.aux
+ at end example
+
+Care must be taken to check for the busy file (@file{DB.F.BUSY}) before making
+the copy. If the database is written to during the operation of the copy
+command then the new database may be corrupted.
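
The quick copy described in the NOTES of copy_db.1.texi, including the check for the busy file, might be scripted along these lines. This is a minimal sketch rather than a supported tool: the function name quick_copy_db and its parameters are hypothetical, and it assumes the database file, its .aux file and the .BUSY file all live in the same directory.

```python
import os
import shutil


def quick_copy_db(src_db, src_vers, dst_db, dst_vers, directory="."):
    """Copy DB.V and DB.V.aux without garbage collection.

    Refuses to copy when the source busy file (DB.V.BUSY) exists.
    """
    src = os.path.join(directory, f"{src_db}.{src_vers}")
    dst = os.path.join(directory, f"{dst_db}.{dst_vers}")
    if os.path.exists(src + ".BUSY"):
        raise RuntimeError(f"{src} appears to be in use (busy file present)")
    # Equivalent to: cp DB.F XYZZY.T; cp DB.F.aux XYZZY.T.aux
    shutil.copyfile(src, dst)
    shutil.copyfile(src + ".aux", dst + ".aux")
    return dst
```

The check only looks at the busy file once, before copying. It does not stop another process from opening the database for writing while the copy is in progress, so the warning in the NOTES still applies.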
diff --git a/manual/copy_reads-t.texi b/manual/copy_reads-t.texi
new file mode 100644
index 0000000..e41a703
--- /dev/null
+++ b/manual/copy_reads-t.texi
@@ -0,0 +1,139 @@
+ at node Copy Reads
+ at section Introduction
+
+ at cindex copy reads
+ at cindex reads: copying to other databases
+ at cindex readings: copying to other databases
+ at cindex read raid
+
+During large scale sequencing projects where the genome is cloned into e.g.
+BACs prior to being subcloned into sequencing vectors it is generally 
+the case that the ends of the DNA from one BAC will overlap that of two other
+BACs. Unless it is being used for quality control, it is a waste of time to
+sequence the overlapping regions twice, and so most labs transfer the relevant
+data between the adjacent gap4 databases. This is the function of copy_reads
+which copies readings from a "source" database to a "destination" database.
+
+The consensus sequences for
+user selected contigs in each of the two databases are compared in both
+orientations. If an overlapping region is found, readings of sufficient
+quality are automatically assembled into the destination database. In 
+the source database readings which have been added to the destination
+database will be tagged with a "LENT" tag and the equivalent readings in
+the destination database will be tagged with a "BORO" (borrowed) tag.
+
+_split()
+ at node Copy Reads-Dialogue
+ at subsection Copy Reads Dialogue
+ at cindex Copy reads: dialogue
+
+_picture(copy_reads.dialogue,3.56667in)
+
+The Copy reads function is available from either the File menu of gap4 or from
+the command line.
+
+The program must be able to write to both databases.
+It is recommended that you use "Copy database" to create backups of
+both databases before commencing.
+_fxref(GapDB-CopyDatabase, Making Backups of Databases,
+gap_database)
+
+From within gap4:
+The source database must be entered into the "Open source database" entry
+box at the top of the dialogue box. The adjacent Browse buttons will list only gap4
+databases, that is, files ending in .aux. Either select from the browser
+by double clicking on the name or type in the database name. Any .aux
+ending is ignored.
+The destination database is always the database which is currently open in gap4.
+
+The location of the traces of the source database can either be
+determined from the rawdata note (_fpref(Conf-Trace File Location, Trace File Location, configure)) held within the database ("read from
+database") or can be entered via the "directory" option. The program
+will add the location of the source traces into the
+rawdata note of the destination database. If the environment variable
+RAWDATA is set, this will be taken to be the location of the destination
+database traces and will also be added to the rawdata note
+of the destination database. If there are no traces for the source
+database, no rawdata note will be created.
+
+One or more contigs from the source database can be compared. These are
+selected either by clicking on "all contigs" or providing a file
+containing a list of contig names (any reading name from within that
+contig, typically the first reading name). Only contigs over a user defined
+length will be used. A minimum reading quality
+can be set so that only readings with an average quality over the specified
+amount will be entered into the destination database.
+
+Contigs from the destination database can be chosen by either
+selecting "all contigs" or providing a file of contig names.
+
+The consensus sequence is determined for each contig in both databases
+using either the standard consensus algorithm or "Mask active tags". The
+latter option will activate the "Select tags" button. Clicking on this
+button will bring up a check box dialogue to enable the user to select
+the tag types they wish to activate. Masking the active tags means that
+all segments covered by tags that are "active" will not be used by the
+matching algorithms. A typical use of this mode is to avoid finding
+matches in segments covered by tags of type ALUS (i.e. segments thought to
+be Alu sequence) or REPT (i.e. segments that are known to be repeated
+elsewhere in the data) (_fpref(Anno-Types, Tag types, tags)).
+
+The consensus searching parameters are equivalent to those found in the
+find internal joins algorithm (_fpref(FIJ, Find Internal Joins, fij)). 
+The search algorithm first finds matching words of length "Word
+length", and only considers overlaps of length at least "Minimum
+overlap". Only alignments better than "Maximum percent mismatch" will
+be reported. Find internal joins has the option of either a quick or
+sensitive algorithm. Here, it is only necessary to use the quick
+algorithm. The quick algorithm can find overlaps and align 100,000 base
+sequences in a few seconds by considering, in its initial phase, only
+matching segments of length "Minimum initial match length". However, it
+does a dynamic programming alignment of all the chunks between the
+matching segments, and so produces an optimal alignment. A banded
+dynamic algorithm can be selected, but as this only applies to the
+chunks between matching segments, which for good alignments will be very
+short, it should make little difference to the speed. The alignments
+between the consensus sequences can be displayed in the text output
+window by selecting "Display consensus alignments".
+
+If a match between two consensus sequences is found, the
+readings in that overlap are assembled into the destination database
+using the "directed assembly" function (_fpref(Assembly-Directed,
+Directed Assembly, assembly)). Only readings for which the "Maximum
+percent mismatch" is not exceeded, and which have an average
+reading quality higher than the specified minimum, will be entered into the
+database. Again, the alignments can be shown in the Output window by
+selecting "Display sequence alignments".
+
+From the command line:
+
+    copy_reads [-win] 
+               [-source_trace_dir ("")]
+               [-contigs_from <file> (all contigs)] 
+               [-min_contig_len (2000)] 
+               [-min_average_qual (30.0)] 
+               [-contigs_to <file> (all contigs)] 
+               [-mask <none mask> (none)] 
+               [-tag_types <list> ("")] 
+               [-word_length (8)] 
+               [-min_overlap (20)] 
+               [-max_pmismatch (30.0)] 
+               [-min_match (20)] 
+               [-band (1)] 
+               [-display_cons] 
+               [-align_max_mism (10.0)] 
+               [-display_seq] 
+               [source database] 
+               [destination database]
+
+The values in brackets () are the default values. The only mandatory values are
+the source and destination databases. Details on these values are given in
+the copy_reads man page (_fpref(Man-copy_reads, Copy reads, manpages)).
+
+The -win option will bring up a separate program which presently has only one function (copy reads). This
+is accessed from its "File" menu and brings up the same dialogue as that
+from within gap4, except for an extra entry box to select the destination
+database.
+
+
+
diff --git a/manual/copy_reads.1.texi b/manual/copy_reads.1.texi
new file mode 100644
index 0000000..37e7354
--- /dev/null
+++ b/manual/copy_reads.1.texi
@@ -0,0 +1,153 @@
+@cindex Copy_reads: man_page
+@unnumberedsec NAME
+
+copy_reads --- copies overlapping reads from a source database to a destination database
+
+@unnumberedsec SYNOPSIS
+Usage:
+
+@code{copy_reads} [@code{-win}] [@code{-source_trace_dir} @i{directory of source traces}]
+               [@code{-contigs_from} @i{file of contigs in source database}] 
+               [@code{-min_contig_len} @i{minimum contig length}] 
+               [@code{-min_average_qual} @i{minimum average read quality}] 
+               [@code{-contigs_to} @i{file of contigs in destination database}] 
+               [@code{-mask} @i{masking mode}] 
+               [@code{-tag_types} @i{list of tag types}] 
+               [@code{-word_length} @i{word length}] 
+               [@code{-min_overlap} @i{minimum overlap}] 
+               [@code{-max_pmismatch} @i{maximum percentage mismatch}] 
+               [@code{-min_match} @i{minimum match}] 
+               [@code{-band} @i{use banding algorithm}] 
+               [@code{-display_cons} @i{display consensus alignments}] 
+               [@code{-align_max_mism} @i{maximum percent mismatch}] 
+               [@code{-display_seq} @i{display reading alignments}] 
+               @i{source database}
+               @i{destination database}
+
+@unnumberedsec DESCRIPTION
+
+During large scale sequencing projects where the genome is cloned into e.g.
+BACs prior to being subcloned into sequencing vectors, it is generally
+the case that the ends of the DNA from one BAC will overlap those of two other
+BACs. Unless it is being used for quality control, it is a waste of time to
+sequence the overlapping regions twice, and so most labs transfer the relevant
+data between the adjacent gap4 databases. This is the function of @code{copy_reads},
+which copies readings from a "source" database to a "destination" database.
+
+The consensus sequences for
+user selected contigs in each of the two databases are compared in both
+orientations. If an overlapping region is found, readings of sufficient
+quality are automatically assembled into the destination database. In
+the source database, readings which have been added to the destination
+database will be tagged with a "LENT" tag and the equivalent readings in
+the destination database will be tagged with a "BORO" (borrowed) tag.
+
+@unnumberedsec OPTIONS
+
+@table @asis
+@item @code{-win}
+     Bring up a dialogue window
+
+@item @code{-source_trace_dir} @i{directory of source traces}
+     The location of the traces of the source database can either be
+     specified by giving the directory name or, if this is not specified,
+     determined from the rawdata note (_fpref(Conf-Trace File Location, 
+     Trace File Location, configure)) held within the database. The program
+     will add the location of the source traces into the
+     rawdata note of the destination database. If the environment variable
+     RAWDATA is set, this will be taken to be the location of the destination
+     database traces and will also be added to the rawdata note
+     of the destination database. If there are no traces for the source
+     database, no rawdata note will be created.
+
+@item @code{-contigs_from} @i{file of contigs in source database}
+     One or more contigs from the source database can be compared. These are
+     selected by providing a file containing a list of contig names
+     (any reading name from within that contig, typically the first reading 
+     name). If no file is specified, all contigs will be compared.
+
+@item @code{-min_contig_len} @i{minimum contig length}
+     Only contigs in the source database over a user defined length will be 
+     used. The default is 2000 bases.
+
+@item @code{-min_average_qual} @i{minimum average read quality}
+     A minimum reading quality can be set so that only readings with an
+     average quality over the specified amount will be entered into the
+     destination database. The default is 30.0.
+
+@item @code{-contigs_to} @i{file of contigs in destination database}
+     One or more contigs from the destination database can be compared. These are
+     selected by providing a file containing a list of contig names
+     (any reading name from within that contig, typically the first reading 
+     name). If no file is specified, all contigs will be compared.
+
+@item @code{-mask} @i{masking mode}
+     The consensus sequence is determined for each contig in both databases
+     using either the standard consensus algorithm (none) or "Mask active tags" (mask).
+     Masking the active tags means that
+     all segments covered by tags that are "active" will not be used by the
+     matching algorithms. A typical use of this mode is to avoid finding
+     matches in segments covered by tags of type ALUS (i.e. segments thought to
+     be Alu sequence) or REPT (i.e. segments that are known to be repeated
+     elsewhere in the data) (_fpref(Anno-Types, Tag types, tags)). The default
+     is none.
+
+@item @code{-tag_types} @i{list of tag types}
+     A list of tag types to be used when the -mask option (above) is specified
+     to be in "mask" mode. The list is delimited by "".
+
+@item @code{-word_length} @i{word length}
+     The consensus searching parameters are equivalent to those found in the
+     find internal joins algorithm (_fpref(FIJ, Find Internal Joins, fij)). 
+     The search algorithm first finds matching words of length @i{Word
+     length}. Possible values are 4 or 8. The default is 8. 
+
+@item @code{-min_overlap} @i{minimum overlap}
+     The search algorithm only considers overlaps of length at least 
+     @i{Minimum overlap}. The default is 20.
+
+@item @code{-max_pmismatch} @i{maximum percentage mismatch}
+     Only alignments better than @i{Maximum percent mismatch} will be reported.
+     The default is 30.0.
+
+@item @code{-min_match} @i{minimum match}
+     The algorithm considers in its initial phase only matching segments of
+     length @i{Minimum initial match length}. However, it
+     does a dynamic programming alignment of all the chunks between the
+     matching segments, and so produces an optimal alignment. The default is
+     15.
+
+@item @code{-band} @i{use banding algorithm}
+     A banded dynamic algorithm can be selected, but as this only applies to
+     the chunks between matching segments, which for good alignments will be
+     very short, it should make little difference to the speed. Possible
+     values are 0 (no) or 1 (yes). The default is 1. 
+
+@item @code{-display_cons} @i{display consensus alignments}
+     This allows the alignments between the consensus sequences to be 
+     displayed.
+
+@item @code{-align_max_mism} @i{maximum percent mismatch}
+     If a match between two consensus sequences is found, the
+     readings in that overlap are assembled into the destination database
+     using the "directed assembly" function (_fpref(Assembly-Directed,
+     Directed Assembly, assembly)). Only readings for which the @i{maximum
+     percent mismatch} is not exceeded, and which have an average
+     reading quality higher than the specified minimum, will be entered into 
+     the database. The default value is 10.0.
+
+@item @code{-display_seq} @i{display reading alignments}
+     This allows the alignments between the source database readings and the 
+     destination consensus to be displayed.
+
+@end table
+
+@unnumberedsec EXAMPLE
+
+To copy readings from @file{source_db} to @file{destination_db} and display
+the consensus alignments:
+
+@example
+copy_reads -display_cons source_db destination_db
+@end example
+
diff --git a/manual/copy_reads.dialogue.png b/manual/copy_reads.dialogue.png
new file mode 100644
index 0000000..8ae56ab
Binary files /dev/null and b/manual/copy_reads.dialogue.png differ
diff --git a/manual/copyright.texi b/manual/copyright.texi
new file mode 100644
index 0000000..cddd252
--- /dev/null
+++ b/manual/copyright.texi
@@ -0,0 +1,63 @@
+@ifset html
+@chapter Copyright
+@end ifset
+Copyright @copyright{} 1999-2002, Medical Research Council, Laboratory of
+Molecular Biology.
+Made available under the standard BSD licence.
+
+@vskip4pt @hrule height 0.2pt width @hsize @vskip4pt
+
+Copyright @copyright{} 2002-2006, Genome Research Limited (GRL).
+Made available under the standard BSD licence.
+
+@vskip4pt @hrule height 0.2pt width @hsize @vskip4pt
+
+Portions of this code are derived from a modified Primer3
+library. This bears the following copyright notice:
+
+Copyright @copyright{} 1996,1997,1998 Whitehead Institute for Biomedical
+Research. All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+1. Redistributions must reproduce the above copyright notice, this
+list of conditions and the following disclaimer in the  documentation
+and/or other materials provided with the distribution.  Redistributions of
+source code must also reproduce this information in the source code itself.
+
+2. If the program is modified, redistributions must include a notice
+(in the same places as above) indicating that the redistributed program is
+not identical to the version distributed by Whitehead Institute.
+
+3. All advertising materials mentioning features or use of this
+software  must display the following acknowledgment:
+This product includes software developed by the
+Whitehead Institute for Biomedical Research.
+
+4. The name of the Whitehead Institute may not be used to endorse or
+promote products derived from this software without specific prior written
+permission.
+
+We also request that use of this software be cited in publications as 
+
+Steve Rozen, Helen J. Skaletsky (1996,1997,1998)
+Primer3. Code available at
+http://www-genome.wi.mit.edu/genome_software/other/primer3.html
+
+THIS SOFTWARE IS PROVIDED BY THE WHITEHEAD INSTITUTE ``AS IS'' AND  ANY
+EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE  IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE  ARE
+DISCLAIMED. IN NO EVENT SHALL THE WHITEHEAD INSTITUTE BE LIABLE  FOR ANY
+DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL  DAMAGES
+(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS  OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)  HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+SUCH DAMAGE.
+
+@vskip4pt @hrule height 0.2pt width @hsize @vskip4pt
+
+Permission is given to duplicate this manual in both paper and electronic
+forms.
diff --git a/manual/dependencies b/manual/dependencies
new file mode 100644
index 0000000..ffa1b26
--- /dev/null
+++ b/manual/dependencies
@@ -0,0 +1,1024 @@
+assembly-t.pdf_done:
+assembly-t.pdf_done: cap2-t.pdf_done
+assembly-t.pdf_done: cap3-t.pdf_done
+assembly-t.pdf_done: fak2-t.pdf_done
+assembly-t.pdf_done: phrap-t.pdf_done
+assembly-t.eps_done:
+assembly-t.eps_done: cap2-t.eps_done
+assembly-t.eps_done: cap3-t.eps_done
+assembly-t.eps_done: fak2-t.eps_done
+assembly-t.eps_done: phrap-t.eps_done
+assembly-t.pdf_done: assembly.shot.pdf
+assembly-t.pdf_done: assembly.single.pdf
+assembly-t.pdf_done: assembly.one.pdf
+assembly-t.pdf_done: assembly.new.pdf
+assembly-t.pdf_done: assembly.directed.pdf
+assembly-t.pdf_done: assembly.screen.pdf
+assembly-t.eps_done: assembly.shot.eps
+assembly-t.eps_done: assembly.single.eps
+assembly-t.eps_done: assembly.one.eps
+assembly-t.eps_done: assembly.new.eps
+assembly-t.eps_done: assembly.directed.eps
+assembly-t.eps_done: assembly.screen.eps
+calc_consensus-t.pdf_done:
+calc_consensus-t.eps_done:
+calc_consensus-t.pdf_done: calc_consensus.normal.pdf
+calc_consensus-t.pdf_done: calc_consensus.extended.pdf
+calc_consensus-t.pdf_done: calc_consensus.unfinished.pdf
+calc_consensus-t.pdf_done: calc_consensus.quality.pdf
+calc_consensus-t.eps_done: calc_consensus.normal.eps
+calc_consensus-t.eps_done: calc_consensus.extended.eps
+calc_consensus-t.eps_done: calc_consensus.unfinished.eps
+calc_consensus-t.eps_done: calc_consensus.quality.eps
+cap2-t.pdf_done:
+cap2-t.eps_done:
+cap2-t.pdf_done: assembly.cap2.pdf
+cap2-t.eps_done: assembly.cap2.eps
+cap3-t.pdf_done:
+cap3-t.eps_done:
+cap3-t.pdf_done: assembly.CAP3.pdf
+cap3-t.eps_done: assembly.CAP3.eps
+check_db-t.pdf_done:
+check_db-t.eps_done:
+clip-t.pdf_done:
+clip-t.eps_done:
+clip-t.pdf_done: difference_clip.pdf
+clip-t.pdf_done: quality_clip.pdf
+clip-t.pdf_done: quality_clip_ends.pdf
+clip-t.pdf_done: NBase_clip.pdf
+clip-t.eps_done: difference_clip.eps
+clip-t.eps_done: quality_clip.eps
+clip-t.eps_done: quality_clip_ends.eps
+clip-t.eps_done: NBase_clip.eps
+comparator-t.pdf_done:
+comparator-t.eps_done:
+comparator-t.pdf_done: comparator.pdf
+comparator-t.pdf_done: gap5_comparator.pdf
+comparator-t.eps_done: comparator.eps
+comparator-t.eps_done: gap5_comparator.eps
+complement-t.pdf_done:
+complement-t.eps_done:
+configure-t.pdf_done:
+configure-t.eps_done:
+configure-t.pdf_done: configure.colour.pdf
+configure-t.pdf_done: set_genetic_code.pdf
+configure-t.pdf_done: interface.tag.pdf
+configure-t.pdf_done: template_status.pdf
+configure-t.eps_done: configure.colour.eps
+configure-t.eps_done: set_genetic_code.eps
+configure-t.eps_done: interface.tag.eps
+configure-t.eps_done: template_status.eps
+consistency_display-t.pdf_done:
+consistency_display-t.eps_done:
+consistency_display-t.pdf_done: read_coverage_d.pdf
+consistency_display-t.pdf_done: strand_coverage_d.pdf
+consistency_display-t.pdf_done: 2nd_highest_confidence.pdf
+consistency_display-t.pdf_done: discrepancy_graph.pdf 
+consistency_display-t.pdf_done: consistency_p.pdf
+consistency_display-t.pdf_done: conf_values_p.pdf
+consistency_display-t.pdf_done: read_coverage_p.pdf
+consistency_display-t.pdf_done: readpair_coverage_p.pdf
+consistency_display-t.pdf_done: strand_coverage_p1.pdf
+consistency_display-t.pdf_done: strand_coverage_p2.pdf
+consistency_display-t.eps_done: read_coverage_d.eps
+consistency_display-t.eps_done: strand_coverage_d.eps
+consistency_display-t.eps_done: 2nd_highest_confidence.eps
+consistency_display-t.eps_done: discrepancy_graph.eps 
+consistency_display-t.eps_done: consistency_p.eps
+consistency_display-t.eps_done: conf_values_p.eps
+consistency_display-t.eps_done: read_coverage_p.eps
+consistency_display-t.eps_done: readpair_coverage_p.eps
+consistency_display-t.eps_done: strand_coverage_p1.eps
+consistency_display-t.eps_done: strand_coverage_p2.eps
+contig_editor-t.pdf_done:
+contig_editor-t.eps_done:
+contig_editor-t.pdf_done: contig_editor.taged.pdf
+contig_editor-t.pdf_done: contig_editor.tagsel.pdf
+contig_editor-t.pdf_done: contig_editor.tagmacro.pdf
+contig_editor-t.pdf_done: contig_editor.search.pdf
+contig_editor-t.pdf_done: contig_editor.screen.pdf
+contig_editor-t.pdf_done: contig_editor_grey_scale.pdf
+contig_editor-t.pdf_done: contig_editor.traces.pdf
+contig_editor-t.pdf_done: mut_traces_het.pdf
+contig_editor-t.pdf_done: mut_traces_positive.pdf
+contig_editor-t.pdf_done: contig_editor.traces.pdf
+contig_editor-t.pdf_done: contig_editor.traces.compact.pdf
+contig_editor-t.pdf_done: contig_editor.join.pdf
+contig_editor-t.eps_done: contig_editor.taged.eps
+contig_editor-t.eps_done: contig_editor.tagsel.eps
+contig_editor-t.eps_done: contig_editor.tagmacro.eps
+contig_editor-t.eps_done: contig_editor.search.eps
+contig_editor-t.eps_done: contig_editor.screen.eps
+contig_editor-t.eps_done: contig_editor_grey_scale.eps
+contig_editor-t.eps_done: contig_editor.traces.eps
+contig_editor-t.eps_done: mut_traces_het.eps
+contig_editor-t.eps_done: mut_traces_positive.eps
+contig_editor-t.eps_done: contig_editor.traces.eps
+contig_editor-t.eps_done: contig_editor.traces.compact.eps
+contig_editor-t.eps_done: contig_editor.join.eps
+contig_navigation-t.pdf_done:
+contig_navigation-t.eps_done:
+contig_navigation-t.pdf_done: contig_navigation_browse.pdf
+contig_navigation-t.pdf_done: contig_navigation_table.pdf
+contig_navigation-t.eps_done: contig_navigation_browse.eps
+contig_navigation-t.eps_done: contig_navigation_table.eps
+contig_ordering-t.pdf_done:
+contig_ordering-t.eps_done:
+contig_ordering-t.pdf_done: c_order_lb.pdf
+contig_ordering-t.pdf_done: comparator.pdf
+contig_ordering-t.pdf_done: c_order_t1.pdf
+contig_ordering-t.pdf_done: c_order_t2.pdf
+contig_ordering-t.eps_done: c_order_lb.eps
+contig_ordering-t.eps_done: comparator.eps
+contig_ordering-t.eps_done: c_order_t1.eps
+contig_ordering-t.eps_done: c_order_t2.eps
+contig_selector-t.pdf_done:
+contig_selector-t.eps_done:
+contig_selector-t.pdf_done: contig_selector.pdf
+contig_selector-t.pdf_done: gap5_contig_selector.pdf
+contig_selector-t.pdf_done: contig_list_box.pdf
+contig_selector-t.eps_done: contig_selector.eps
+contig_selector-t.eps_done: gap5_contig_selector.eps
+contig_selector-t.eps_done: contig_list_box.eps
+convert-t.pdf_done:
+convert-t.eps_done:
+convert_trace.1.pdf_done:
+convert_trace.1.eps_done:
+copy_db.1.pdf_done:
+copy_db.1.eps_done:
+copy_reads-t.pdf_done:
+copy_reads-t.eps_done:
+copy_reads-t.pdf_done: copy_reads.dialogue.pdf
+copy_reads-t.eps_done: copy_reads.dialogue.eps
+copy_reads.1.pdf_done:
+copy_reads.1.eps_done:
+copyright.pdf_done:
+copyright.eps_done:
+disassembly-t.pdf_done:
+disassembly-t.eps_done:
+disassembly-t.pdf_done: check_ass.pdf
+disassembly-t.pdf_done: break_contig.pdf
+disassembly-t.pdf_done: disassembly.pdf
+disassembly-t.eps_done: check_ass.eps
+disassembly-t.eps_done: break_contig.eps
+disassembly-t.eps_done: disassembly.eps
+doctor_db-t.pdf_done:
+doctor_db-t.eps_done:
+doctor_db-t.pdf_done: doctor_db.main.pdf
+doctor_db-t.pdf_done: doctor_db.structures.pdf
+doctor_db-t.eps_done: doctor_db.main.eps
+doctor_db-t.eps_done: doctor_db.structures.eps
+eba.1.pdf_done:
+eba.1.eps_done:
+exp-t.pdf_done:
+exp-t.eps_done:
+exp_suggest-t.pdf_done:
+exp_suggest-t.eps_done:
+exp_suggest-t.pdf_done: exp_suggest.double.pdf
+exp_suggest-t.pdf_done: exp_suggest.primers.pdf
+exp_suggest-t.pdf_done: exp_suggest.long.pdf
+exp_suggest-t.pdf_done: exp_suggest.comp.pdf
+exp_suggest-t.pdf_done: suggest_probes.main.pdf
+exp_suggest-t.pdf_done: suggest_probes.select.pdf
+exp_suggest-t.eps_done: exp_suggest.double.eps
+exp_suggest-t.eps_done: exp_suggest.primers.eps
+exp_suggest-t.eps_done: exp_suggest.long.eps
+exp_suggest-t.eps_done: exp_suggest.comp.eps
+exp_suggest-t.eps_done: suggest_probes.main.eps
+exp_suggest-t.eps_done: suggest_probes.select.eps
+extract-t.pdf_done:
+extract-t.eps_done:
+extract-t.pdf_done: extract.pdf
+extract-t.eps_done: extract.eps
+extract_fastq.1.pdf_done:
+extract_fastq.1.eps_done:
+extract_seq.1.pdf_done:
+extract_seq.1.eps_done:
+fak2-t.pdf_done:
+fak2-t.eps_done:
+fak2-t.pdf_done: assembly.fak2.pdf
+fak2-t.eps_done: assembly.fak2.eps
+fij-t.pdf_done:
+fij-t.eps_done:
+fij-t.pdf_done: fij.dialogue.pdf
+fij-t.pdf_done: comparator.pdf
+fij-t.eps_done: fij.dialogue.eps
+fij-t.eps_done: comparator.eps
+filebrowser-t.pdf_done:
+filebrowser-t.eps_done:
+filebrowser-t.pdf_done: filebrowser.pdf
+filebrowser-t.eps_done: filebrowser.eps
+filebrowser.pdf_done:
+filebrowser.pdf_done: copyright.pdf_done
+filebrowser.pdf_done: filebrowser-t.pdf_done
+filebrowser.eps_done:
+filebrowser.eps_done: copyright.eps_done
+filebrowser.eps_done: filebrowser-t.eps_done
+find_oligo-t.pdf_done:
+find_oligo-t.eps_done:
+find_oligo-t.pdf_done: find_oligo_pic.pdf
+find_oligo-t.eps_done: find_oligo_pic.eps
+find_renz.1.pdf_done:
+find_renz.1.eps_done:
+formats-t.pdf_done:
+formats-t.pdf_done: scf-t.pdf_done
+formats-t.pdf_done: ztr-t.pdf_done
+formats-t.pdf_done: exp-t.pdf_done
+formats-t.pdf_done: renzymes-t.pdf_done
+formats-t.pdf_done: vector_primer-t.pdf_done
+formats-t.eps_done:
+formats-t.eps_done: scf-t.eps_done
+formats-t.eps_done: ztr-t.eps_done
+formats-t.eps_done: exp-t.eps_done
+formats-t.eps_done: renzymes-t.eps_done
+formats-t.eps_done: vector_primer-t.eps_done
+formats.pdf_done:
+formats.pdf_done: copyright.pdf_done
+formats.pdf_done: formats-t.pdf_done
+formats.eps_done:
+formats.eps_done: copyright.eps_done
+formats.eps_done: formats-t.eps_done
+gap4-t.pdf_done:
+gap4-t.pdf_done: gap4_org-t.pdf_done
+gap4-t.pdf_done: gap4_mini-t.pdf_done
+gap4-t.pdf_done: gap4_intro-t.pdf_done
+gap4-t.pdf_done: contig_selector-t.pdf_done
+gap4-t.pdf_done: comparator-t.pdf_done
+gap4-t.pdf_done: template-t.pdf_done
+gap4-t.pdf_done: quality_plot-t.pdf_done
+gap4-t.pdf_done: stops-t.pdf_done
+gap4-t.pdf_done: restrict_enzymes-t.pdf_done
+gap4-t.pdf_done: contig_editor-t.pdf_done
+gap4-t.pdf_done: assembly-t.pdf_done
+gap4-t.pdf_done: contig_ordering-t.pdf_done
+gap4-t.pdf_done: read_pairs-t.pdf_done
+gap4-t.pdf_done: fij-t.pdf_done
+gap4-t.pdf_done: repeats-t.pdf_done
+gap4-t.pdf_done: disassembly-t.pdf_done
+gap4-t.pdf_done: exp_suggest-t.pdf_done
+gap4-t.pdf_done: calc_consensus-t.pdf_done
+gap4-t.pdf_done: complement-t.pdf_done
+gap4-t.pdf_done: show_rel-t.pdf_done
+gap4-t.pdf_done: contig_navigation-t.pdf_done
+gap4-t.pdf_done: find_oligo-t.pdf_done
+gap4-t.pdf_done: extract-t.pdf_done
+gap4-t.pdf_done: clip-t.pdf_done
+gap4-t.pdf_done: results-t.pdf_done
+gap4-t.pdf_done: lists-t.pdf_done
+gap4-t.pdf_done: notes-t.pdf_done
+gap4-t.pdf_done: gap_database-t.pdf_done
+gap4-t.pdf_done: copy_reads-t.pdf_done
+gap4-t.pdf_done: check_db-t.pdf_done
+gap4-t.pdf_done: doctor_db-t.pdf_done
+gap4-t.pdf_done: configure-t.pdf_done
+gap4-t.pdf_done: convert-t.pdf_done
+gap4-t.eps_done:
+gap4-t.eps_done: gap4_org-t.eps_done
+gap4-t.eps_done: gap4_mini-t.eps_done
+gap4-t.eps_done: gap4_intro-t.eps_done
+gap4-t.eps_done: contig_selector-t.eps_done
+gap4-t.eps_done: comparator-t.eps_done
+gap4-t.eps_done: template-t.eps_done
+gap4-t.eps_done: quality_plot-t.eps_done
+gap4-t.eps_done: stops-t.eps_done
+gap4-t.eps_done: restrict_enzymes-t.eps_done
+gap4-t.eps_done: contig_editor-t.eps_done
+gap4-t.eps_done: assembly-t.eps_done
+gap4-t.eps_done: contig_ordering-t.eps_done
+gap4-t.eps_done: read_pairs-t.eps_done
+gap4-t.eps_done: fij-t.eps_done
+gap4-t.eps_done: repeats-t.eps_done
+gap4-t.eps_done: disassembly-t.eps_done
+gap4-t.eps_done: exp_suggest-t.eps_done
+gap4-t.eps_done: calc_consensus-t.eps_done
+gap4-t.eps_done: complement-t.eps_done
+gap4-t.eps_done: show_rel-t.eps_done
+gap4-t.eps_done: contig_navigation-t.eps_done
+gap4-t.eps_done: find_oligo-t.eps_done
+gap4-t.eps_done: extract-t.eps_done
+gap4-t.eps_done: clip-t.eps_done
+gap4-t.eps_done: results-t.eps_done
+gap4-t.eps_done: lists-t.eps_done
+gap4-t.eps_done: notes-t.eps_done
+gap4-t.eps_done: gap_database-t.eps_done
+gap4-t.eps_done: copy_reads-t.eps_done
+gap4-t.eps_done: check_db-t.eps_done
+gap4-t.eps_done: doctor_db-t.eps_done
+gap4-t.eps_done: configure-t.eps_done
+gap4-t.eps_done: convert-t.eps_done
+gap4.pdf_done:
+gap4.pdf_done: copyright.pdf_done
+gap4.pdf_done: gap4-t.pdf_done
+gap4.eps_done:
+gap4.eps_done: copyright.eps_done
+gap4.eps_done: gap4-t.eps_done
+gap4_intro-t.pdf_done:
+gap4_intro-t.pdf_done: hidden-t.pdf_done
+gap4_intro-t.pdf_done: tags-t.pdf_done
+gap4_intro-t.eps_done:
+gap4_intro-t.eps_done: hidden-t.eps_done
+gap4_intro-t.eps_done: tags-t.eps_done
+gap4_mini-t.pdf_done:
+gap4_mini-t.eps_done:
+gap4_mini-t.pdf_done: contig_selector.pdf
+gap4_mini-t.pdf_done: interface.output.pdf
+gap4_mini-t.pdf_done: comparator.pdf
+gap4_mini-t.pdf_done: template.display.pdf
+gap4_mini-t.pdf_done: consistency_p.pdf
+gap4_mini-t.pdf_done: restrict_enzymes.pdf
+gap4_mini-t.pdf_done: stops.pdf
+gap4_mini-t.pdf_done: contig_editor.screen.pdf
+gap4_mini-t.pdf_done: contig_editor_grey_scale.pdf
+gap4_mini-t.pdf_done: contig_editor.traces.pdf
+gap4_mini-t.pdf_done: contig_editor.join.pdf
+gap4_mini-t.eps_done: contig_selector.eps
+gap4_mini-t.eps_done: interface.output.eps
+gap4_mini-t.eps_done: comparator.eps
+gap4_mini-t.eps_done: template.display.eps
+gap4_mini-t.eps_done: consistency_p.eps
+gap4_mini-t.eps_done: restrict_enzymes.eps
+gap4_mini-t.eps_done: stops.eps
+gap4_mini-t.eps_done: contig_editor.screen.eps
+gap4_mini-t.eps_done: contig_editor_grey_scale.eps
+gap4_mini-t.eps_done: contig_editor.traces.eps
+gap4_mini-t.eps_done: contig_editor.join.eps
+gap4_org-t.pdf_done:
+gap4_org-t.eps_done:
+gap5-t.pdf_done:
+gap5-t.pdf_done: gap5_check_db-t.pdf_done
+gap5-t.pdf_done: contig_selector-t.pdf_done
+gap5-t.pdf_done: comparator-t.pdf_done
+gap5-t.pdf_done: gap5_template-t.pdf_done
+gap5-t.pdf_done: gap5_contig_editor-t.pdf_done
+gap5-t.pdf_done: restrict_enzymes-t.pdf_done
+gap5-t.pdf_done: gap5_assembly-t.pdf_done
+gap5-t.pdf_done: gap5_export-t.pdf_done
+gap5-t.pdf_done: gap5_fij-t.pdf_done
+gap5-t.pdf_done: gap5_repeats-t.pdf_done
+gap5-t.pdf_done: gap5_read_pairs-t.pdf_done
+gap5-t.pdf_done: find_oligo-t.pdf_done
+gap5-t.pdf_done: gap5_disassembly-t.pdf_done
+gap5-t.pdf_done: gap5_shuffle-t.pdf_done
+gap5-t.pdf_done: calc_consensus-t.pdf_done
+gap5-t.pdf_done: list_libraries-t.pdf_done
+gap5-t.pdf_done: results-t.pdf_done
+gap5-t.pdf_done: lists-t.pdf_done
+gap5-t.eps_done:
+gap5-t.eps_done: gap5_check_db-t.eps_done
+gap5-t.eps_done: contig_selector-t.eps_done
+gap5-t.eps_done: comparator-t.eps_done
+gap5-t.eps_done: gap5_template-t.eps_done
+gap5-t.eps_done: gap5_contig_editor-t.eps_done
+gap5-t.eps_done: restrict_enzymes-t.eps_done
+gap5-t.eps_done: gap5_assembly-t.eps_done
+gap5-t.eps_done: gap5_export-t.eps_done
+gap5-t.eps_done: gap5_fij-t.eps_done
+gap5-t.eps_done: gap5_repeats-t.eps_done
+gap5-t.eps_done: gap5_read_pairs-t.eps_done
+gap5-t.eps_done: find_oligo-t.eps_done
+gap5-t.eps_done: gap5_disassembly-t.eps_done
+gap5-t.eps_done: gap5_shuffle-t.eps_done
+gap5-t.eps_done: calc_consensus-t.eps_done
+gap5-t.eps_done: list_libraries-t.eps_done
+gap5-t.eps_done: results-t.eps_done
+gap5-t.eps_done: lists-t.eps_done
+gap5.pdf_done:
+gap5.pdf_done: copyright.pdf_done
+gap5.pdf_done: gap5-t.pdf_done
+gap5.eps_done:
+gap5.eps_done: copyright.eps_done
+gap5.eps_done: gap5-t.eps_done
+gap5_assembly-t.pdf_done:
+gap5_assembly-t.eps_done:
+gap5_check_db-t.pdf_done:
+gap5_check_db-t.eps_done:
+gap5_check_db-t.pdf_done: gap5_check_database.pdf
+gap5_check_db-t.eps_done: gap5_check_database.eps
+gap5_contig_editor-t.pdf_done:
+gap5_contig_editor-t.eps_done:
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.screen.pdf
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.traces.pdf
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.names1.pdf
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.names2.pdf
+gap5_contig_editor-t.pdf_done: contig_editor.taged.pdf
+gap5_contig_editor-t.pdf_done: contig_editor.tagsel.pdf
+gap5_contig_editor-t.pdf_done: contig_editor.tagmacro.pdf
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.search.pdf
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.primer_dialogue.pdf
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.primers.pdf
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.traces.pdf
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.454trace.pdf
+gap5_contig_editor-t.pdf_done: gap5_contig_editor.join.pdf
+gap5_contig_editor-t.eps_done: gap5_contig_editor.screen.eps
+gap5_contig_editor-t.eps_done: gap5_contig_editor.traces.eps
+gap5_contig_editor-t.eps_done: gap5_contig_editor.names1.eps
+gap5_contig_editor-t.eps_done: gap5_contig_editor.names2.eps
+gap5_contig_editor-t.eps_done: contig_editor.taged.eps
+gap5_contig_editor-t.eps_done: contig_editor.tagsel.eps
+gap5_contig_editor-t.eps_done: contig_editor.tagmacro.eps
+gap5_contig_editor-t.eps_done: gap5_contig_editor.search.eps
+gap5_contig_editor-t.eps_done: gap5_contig_editor.primer_dialogue.eps
+gap5_contig_editor-t.eps_done: gap5_contig_editor.primers.eps
+gap5_contig_editor-t.eps_done: gap5_contig_editor.traces.eps
+gap5_contig_editor-t.eps_done: gap5_contig_editor.454trace.eps
+gap5_contig_editor-t.eps_done: gap5_contig_editor.join.eps
+gap5_disassembly-t.pdf_done:
+gap5_disassembly-t.eps_done:
+gap5_disassembly-t.pdf_done: gap5_check_ass.pdf
+gap5_disassembly-t.pdf_done: gap5_break_contig.pdf
+gap5_disassembly-t.pdf_done: gap5_disassembly.pdf
+gap5_disassembly-t.pdf_done: gap5_delete_contigs.pdf
+gap5_disassembly-t.eps_done: gap5_check_ass.eps
+gap5_disassembly-t.eps_done: gap5_break_contig.eps
+gap5_disassembly-t.eps_done: gap5_disassembly.eps
+gap5_disassembly-t.eps_done: gap5_delete_contigs.eps
+gap5_export-t.pdf_done:
+gap5_export-t.eps_done:
+gap5_export-t.pdf_done: gap5_export_tags.pdf
+gap5_export-t.pdf_done: gap5_export_sequences.pdf
+gap5_export-t.eps_done: gap5_export_tags.eps
+gap5_export-t.eps_done: gap5_export_sequences.eps
+gap5_fij-t.pdf_done:
+gap5_fij-t.eps_done:
+gap5_fij-t.pdf_done: gap5_comparator.pdf
+gap5_fij-t.pdf_done: gap5_fij.dialogue.pdf
+gap5_fij-t.eps_done: gap5_comparator.eps
+gap5_fij-t.eps_done: gap5_fij.dialogue.eps
+gap5_org-t.pdf_done:
+gap5_org-t.eps_done:
+gap5_read_pairs-t.pdf_done:
+gap5_read_pairs-t.eps_done:
+gap5_read_pairs-t.pdf_done: gap5_find_read_pairs.pdf
+gap5_read_pairs-t.pdf_done: gap5_rp_comparator.pdf
+gap5_read_pairs-t.eps_done: gap5_find_read_pairs.eps
+gap5_read_pairs-t.eps_done: gap5_rp_comparator.eps
+gap5_repeats-t.pdf_done:
+gap5_repeats-t.eps_done:
+gap5_repeats-t.pdf_done: repeats.pdf
+gap5_repeats-t.eps_done: repeats.eps
+gap5_shuffle-t.pdf_done:
+gap5_shuffle-t.eps_done:
+gap5_shuffle-t.pdf_done: gap5_shuffle_pads.pdf
+gap5_shuffle-t.pdf_done: gap5_remove_pad_columns.pdf
+gap5_shuffle-t.pdf_done: gap5_remove_contig_holes.pdf
+gap5_shuffle-t.eps_done: gap5_shuffle_pads.eps
+gap5_shuffle-t.eps_done: gap5_remove_pad_columns.eps
+gap5_shuffle-t.eps_done: gap5_remove_contig_holes.eps
+gap5_template-t.pdf_done:
+gap5_template-t.eps_done:
+gap5_template-t.pdf_done: gap5_template_by_size.pdf
+gap5_template-t.pdf_done: gap5_template_filter.pdf
+gap5_template-t.pdf_done: gap5_template_spread0.pdf
+gap5_template-t.pdf_done: gap5_template_spread50.pdf
+gap5_template-t.pdf_done: gap5_template_template.pdf
+gap5_template-t.pdf_done: gap5_template_by_size.pdf
+gap5_template-t.pdf_done: gap5_template_by_stacking.pdf
+gap5_template-t.pdf_done: gap5_template_by_mapping.pdf
+gap5_template-t.eps_done: gap5_template_by_size.eps
+gap5_template-t.eps_done: gap5_template_filter.eps
+gap5_template-t.eps_done: gap5_template_spread0.eps
+gap5_template-t.eps_done: gap5_template_spread50.eps
+gap5_template-t.eps_done: gap5_template_template.eps
+gap5_template-t.eps_done: gap5_template_by_size.eps
+gap5_template-t.eps_done: gap5_template_by_stacking.eps
+gap5_template-t.eps_done: gap5_template_by_mapping.eps
+gap_database-t.pdf_done:
+gap_database-t.eps_done:
+getABIfield.1.pdf_done:
+getABIfield.1.eps_done:
+get_comment.1.pdf_done:
+get_comment.1.eps_done:
+get_scf_field.1.pdf_done:
+get_scf_field.1.eps_done:
+hash_exp.1.pdf_done:
+hash_exp.1.eps_done:
+hash_extract.1.pdf_done:
+hash_extract.1.eps_done:
+hash_list.1.pdf_done:
+hash_list.1.eps_done:
+hash_tar.1.pdf_done:
+hash_tar.1.eps_done:
+hidden-t.pdf_done:
+hidden-t.eps_done:
+init_exp.1.pdf_done:
+init_exp.1.eps_done:
+interface-t.pdf_done:
+interface-t.pdf_done: filebrowser-t.pdf_done
+interface-t.eps_done:
+interface-t.eps_done: filebrowser-t.eps_done
+interface-t.pdf_done: interface.buttons.pdf
+interface-t.pdf_done: interface.menus.pdf
+interface-t.pdf_done: interface.entry.pdf
+interface-t.pdf_done: interface.colour.pdf
+interface-t.pdf_done: interface.fonts.pdf
+interface-t.pdf_done: interface.output.pdf
+interface-t.eps_done: interface.buttons.eps
+interface-t.eps_done: interface.menus.eps
+interface-t.eps_done: interface.entry.eps
+interface-t.eps_done: interface.colour.eps
+interface-t.eps_done: interface.fonts.eps
+interface-t.eps_done: interface.output.eps
+interface.pdf_done:
+interface.pdf_done: copyright.pdf_done
+interface.pdf_done: interface-t.pdf_done
+interface.eps_done:
+interface.eps_done: copyright.eps_done
+interface.eps_done: interface-t.eps_done
+list_libraries-t.pdf_done:
+list_libraries-t.eps_done:
+list_libraries-t.pdf_done: gap5_list_libraries.pdf
+list_libraries-t.eps_done: gap5_list_libraries.eps
+lists-t.pdf_done:
+lists-t.eps_done:
+makeSCF.1.pdf_done:
+makeSCF.1.eps_done:
+make_weights.1.pdf_done:
+make_weights.1.eps_done:
+manpages-t.pdf_done:
+manpages-t.pdf_done: convert_trace.1.pdf_done
+manpages-t.pdf_done: copy_db.1.pdf_done
+manpages-t.pdf_done: copy_reads.1.pdf_done
+manpages-t.pdf_done: eba.1.pdf_done
+manpages-t.pdf_done: extract_seq.1.pdf_done
+manpages-t.pdf_done: extract_fastq.1.pdf_done
+manpages-t.pdf_done: find_renz.1.pdf_done
+manpages-t.pdf_done: getABIfield.1.pdf_done
+manpages-t.pdf_done: get_comment.1.pdf_done
+manpages-t.pdf_done: get_scf_field.1.pdf_done
+manpages-t.pdf_done: hash_exp.1.pdf_done
+manpages-t.pdf_done: hash_extract.1.pdf_done
+manpages-t.pdf_done: hash_list.1.pdf_done
+manpages-t.pdf_done: hash_tar.1.pdf_done
+manpages-t.pdf_done: init_exp.1.pdf_done
+manpages-t.pdf_done: makeSCF.1.pdf_done
+manpages-t.pdf_done: make_weights.1.pdf_done
+manpages-t.pdf_done: polyA_clip.1.pdf_done
+manpages-t.pdf_done: qclip.1.pdf_done
+manpages-t.pdf_done: screen_seq.1.pdf_done
+manpages-t.pdf_done: tracediff.1.pdf_done
+manpages-t.pdf_done: trace_dump.1.pdf_done
+manpages-t.pdf_done: vector_clip.1.pdf_done
+manpages-t.eps_done:
+manpages-t.eps_done: convert_trace.1.eps_done
+manpages-t.eps_done: copy_db.1.eps_done
+manpages-t.eps_done: copy_reads.1.eps_done
+manpages-t.eps_done: eba.1.eps_done
+manpages-t.eps_done: extract_seq.1.eps_done
+manpages-t.eps_done: extract_fastq.1.eps_done
+manpages-t.eps_done: find_renz.1.eps_done
+manpages-t.eps_done: getABIfield.1.eps_done
+manpages-t.eps_done: get_comment.1.eps_done
+manpages-t.eps_done: get_scf_field.1.eps_done
+manpages-t.eps_done: hash_exp.1.eps_done
+manpages-t.eps_done: hash_extract.1.eps_done
+manpages-t.eps_done: hash_list.1.eps_done
+manpages-t.eps_done: hash_tar.1.eps_done
+manpages-t.eps_done: init_exp.1.eps_done
+manpages-t.eps_done: makeSCF.1.eps_done
+manpages-t.eps_done: make_weights.1.eps_done
+manpages-t.eps_done: polyA_clip.1.eps_done
+manpages-t.eps_done: qclip.1.eps_done
+manpages-t.eps_done: screen_seq.1.eps_done
+manpages-t.eps_done: tracediff.1.eps_done
+manpages-t.eps_done: trace_dump.1.eps_done
+manpages-t.eps_done: vector_clip.1.eps_done
+manpages.pdf_done:
+manpages.pdf_done: copyright.pdf_done
+manpages.pdf_done: manpages-t.pdf_done
+manpages.eps_done:
+manpages.eps_done: copyright.eps_done
+manpages.eps_done: manpages-t.eps_done
+manual.pdf_done:
+manual.pdf_done: copyright.pdf_done
+manual.pdf_done: preface-t.pdf_done
+manual.pdf_done: gap5-t.pdf_done
+manual.pdf_done: gap4-t.pdf_done
+manual.pdf_done: mutations-t.pdf_done
+manual.pdf_done: pregap4-t.pdf_done
+manual.pdf_done: read_clipping-t.pdf_done
+manual.pdf_done: vector_clip-t.pdf_done
+manual.pdf_done: trev-t.pdf_done
+manual.pdf_done: spin-t.pdf_done
+manual.pdf_done: interface-t.pdf_done
+manual.pdf_done: formats-t.pdf_done
+manual.pdf_done: manpages-t.pdf_done
+manual.pdf_done: references-t.pdf_done
+manual.eps_done:
+manual.eps_done: copyright.eps_done
+manual.eps_done: preface-t.eps_done
+manual.eps_done: gap5-t.eps_done
+manual.eps_done: gap4-t.eps_done
+manual.eps_done: mutations-t.eps_done
+manual.eps_done: pregap4-t.eps_done
+manual.eps_done: read_clipping-t.eps_done
+manual.eps_done: vector_clip-t.eps_done
+manual.eps_done: trev-t.eps_done
+manual.eps_done: spin-t.eps_done
+manual.eps_done: interface-t.eps_done
+manual.eps_done: formats-t.eps_done
+manual.eps_done: manpages-t.eps_done
+manual.eps_done: references-t.eps_done
+mini_manual.pdf_done:
+mini_manual.pdf_done: copyright.pdf_done
+mini_manual.pdf_done: preface-t.pdf_done
+mini_manual.pdf_done: gap4_mini-t.pdf_done
+mini_manual.pdf_done: mutations-t.pdf_done
+mini_manual.pdf_done: pregap4_mini-t.pdf_done
+mini_manual.pdf_done: trev_mini-t.pdf_done
+mini_manual.pdf_done: spin_mini-t.pdf_done
+mini_manual.eps_done:
+mini_manual.eps_done: copyright.eps_done
+mini_manual.eps_done: preface-t.eps_done
+mini_manual.eps_done: gap4_mini-t.eps_done
+mini_manual.eps_done: mutations-t.eps_done
+mini_manual.eps_done: pregap4_mini-t.eps_done
+mini_manual.eps_done: trev_mini-t.eps_done
+mini_manual.eps_done: spin_mini-t.eps_done
+mutations-t.pdf_done:
+mutations-t.eps_done:
+mutations-t.pdf_done: mut_pregap4.pdf
+mutations-t.pdf_done: mut_traces_point.pdf
+mutations-t.pdf_done: mut_traces_het.pdf
+mutations-t.pdf_done: mut_contig_editor5.pdf
+mutations-t.pdf_done: mut_contig_editor_dis5.pdf
+mutations-t.pdf_done: mut_traces_positive.pdf
+mutations-t.pdf_done: mut_traces_het.pdf
+mutations-t.pdf_done: mut_template_all.pdf
+mutations-t.pdf_done: mut_template_reads.pdf
+mutations-t.pdf_done: mut_template_reads_single.pdf
+mutations-t.eps_done: mut_pregap4.eps
+mutations-t.eps_done: mut_traces_point.eps
+mutations-t.eps_done: mut_traces_het.eps
+mutations-t.eps_done: mut_contig_editor5.eps
+mutations-t.eps_done: mut_contig_editor_dis5.eps
+mutations-t.eps_done: mut_traces_positive.eps
+mutations-t.eps_done: mut_traces_het.eps
+mutations-t.eps_done: mut_template_all.eps
+mutations-t.eps_done: mut_template_reads.eps
+mutations-t.eps_done: mut_template_reads_single.eps
+mutations.pdf_done:
+mutations.pdf_done: copyright.pdf_done
+mutations.pdf_done: mutations-t.pdf_done
+mutations.eps_done:
+mutations.eps_done: copyright.eps_done
+mutations.eps_done: mutations-t.eps_done
+notes-t.pdf_done:
+notes-t.eps_done:
+notes-t.pdf_done: notes.selector.pdf
+notes-t.pdf_done: notes.editor.pdf
+notes-t.eps_done: notes.selector.eps
+notes-t.eps_done: notes.editor.eps
+phrap-t.pdf_done:
+phrap-t.eps_done:
+phrap-t.pdf_done: phrap.assembly.pdf
+phrap-t.eps_done: phrap.assembly.eps
+polyA_clip.1.pdf_done:
+polyA_clip.1.eps_done:
+preface-t.pdf_done:
+preface-t.eps_done:
+pregap4-t.pdf_done:
+pregap4-t.pdf_done: pregap4_org-t.pdf_done
+pregap4-t.pdf_done: pregap4_mini-t.pdf_done
+pregap4-t.eps_done:
+pregap4-t.eps_done: pregap4_org-t.eps_done
+pregap4-t.eps_done: pregap4_mini-t.eps_done
+pregap4-t.pdf_done: pregap4_files.pdf
+pregap4-t.pdf_done: pregap4_textwin.pdf
+pregap4-t.pdf_done: pregap4_separate.pdf
+pregap4-t.pdf_done: pregap4_compact.pdf
+pregap4-t.pdf_done: pregap4_config.pdf
+pregap4-t.pdf_done: mut_mutscan_adaptive_noise_threshold.pdf
+pregap4-t.pdf_done: mut_mutscan_peak_drop_threshold.pdf
+pregap4-t.pdf_done: mut_mutscan_peak_alignment_threshold.pdf
+pregap4-t.pdf_done: pregap4_component.pdf
+pregap4-t.pdf_done: pregap4_simpledb.pdf
+pregap4-t.pdf_done: pregap4_edit_exp.pdf
+pregap4-t.pdf_done: pregap4_select.pdf
+pregap4-t.eps_done: pregap4_files.eps
+pregap4-t.eps_done: pregap4_textwin.eps
+pregap4-t.eps_done: pregap4_separate.eps
+pregap4-t.eps_done: pregap4_compact.eps
+pregap4-t.eps_done: pregap4_config.eps
+pregap4-t.eps_done: mut_mutscan_adaptive_noise_threshold.eps
+pregap4-t.eps_done: mut_mutscan_peak_drop_threshold.eps
+pregap4-t.eps_done: mut_mutscan_peak_alignment_threshold.eps
+pregap4-t.eps_done: pregap4_component.eps
+pregap4-t.eps_done: pregap4_simpledb.eps
+pregap4-t.eps_done: pregap4_edit_exp.eps
+pregap4-t.eps_done: pregap4_select.eps
+pregap4.pdf_done:
+pregap4.pdf_done: copyright.pdf_done
+pregap4.pdf_done: pregap4-t.pdf_done
+pregap4.eps_done:
+pregap4.eps_done: copyright.eps_done
+pregap4.eps_done: pregap4-t.eps_done
+pregap4_mini-t.pdf_done:
+pregap4_mini-t.eps_done:
+pregap4_mini-t.pdf_done: pregap4_overview.pdf
+pregap4_mini-t.pdf_done: pregap4_overview2.pdf
+pregap4_mini-t.pdf_done: pregap4_separate.pdf
+pregap4_mini-t.pdf_done: pregap4_compact.pdf
+pregap4_mini-t.pdf_done: pregap4_files.pdf
+pregap4_mini-t.pdf_done: pregap4_config.pdf
+pregap4_mini-t.pdf_done: pregap4_textwin.pdf
+pregap4_mini-t.eps_done: pregap4_overview.eps
+pregap4_mini-t.eps_done: pregap4_overview2.eps
+pregap4_mini-t.eps_done: pregap4_separate.eps
+pregap4_mini-t.eps_done: pregap4_compact.eps
+pregap4_mini-t.eps_done: pregap4_files.eps
+pregap4_mini-t.eps_done: pregap4_config.eps
+pregap4_mini-t.eps_done: pregap4_textwin.eps
+pregap4_org-t.pdf_done:
+pregap4_org-t.eps_done:
+qclip.1.pdf_done:
+qclip.1.eps_done:
+quality_plot-t.pdf_done:
+quality_plot-t.eps_done:
+quality_plot-t.pdf_done: template.quality.pdf
+quality_plot-t.eps_done: template.quality.eps
+read_clipping-t.pdf_done:
+read_clipping-t.eps_done:
+read_clipping.pdf_done:
+read_clipping.pdf_done: copyright.pdf_done
+read_clipping.pdf_done: read_clipping-t.pdf_done
+read_clipping.eps_done:
+read_clipping.eps_done: copyright.eps_done
+read_clipping.eps_done: read_clipping-t.eps_done
+read_pairs-t.pdf_done:
+read_pairs-t.eps_done:
+read_pairs-t.pdf_done: read_pairs.pdf
+read_pairs-t.pdf_done: comparator.pdf
+read_pairs-t.eps_done: read_pairs.eps
+read_pairs-t.eps_done: comparator.eps
+references-t.pdf_done:
+references-t.eps_done:
+references.pdf_done:
+references.pdf_done: copyright.pdf_done
+references.pdf_done: references-t.pdf_done
+references.eps_done:
+references.eps_done: copyright.eps_done
+references.eps_done: references-t.eps_done
+renzymes-t.pdf_done:
+renzymes-t.eps_done:
+repeats-t.pdf_done:
+repeats-t.eps_done:
+repeats-t.pdf_done: repeats.pdf
+repeats-t.eps_done: repeats.eps
+restrict_enzymes-t.pdf_done:
+restrict_enzymes-t.eps_done:
+restrict_enzymes-t.pdf_done: restrict_enzymes.pdf
+restrict_enzymes-t.eps_done: restrict_enzymes.eps
+results-t.pdf_done:
+results-t.eps_done:
+results-t.pdf_done: results.1.pdf
+results-t.eps_done: results.1.eps
+scf-t.pdf_done:
+scf-t.eps_done:
+screen_seq.1.pdf_done:
+screen_seq.1.eps_done:
+show_rel-t.pdf_done:
+show_rel-t.eps_done:
+show_rel-t.pdf_done: show_rel.pdf
+show_rel-t.eps_done: show_rel.eps
+spin-t.pdf_done:
+spin-t.pdf_done: spin_org-t.pdf_done
+spin-t.pdf_done: spin_mini-t.pdf_done
+spin-t.pdf_done: spin_restrict_enzymes-t.pdf_done
+spin-t.eps_done:
+spin-t.eps_done: spin_org-t.eps_done
+spin-t.eps_done: spin_mini-t.eps_done
+spin-t.eps_done: spin_restrict_enzymes-t.eps_done
+spin-t.pdf_done: spin_plot_base_comp_d.pdf
+spin-t.pdf_done: spin_count_codons_d.pdf
+spin-t.pdf_done: set_genetic_code.pdf
+spin-t.pdf_done: spin_translate_t.pdf
+spin-t.pdf_done: spin_translate_d.pdf
+spin-t.pdf_done: spin_find_orf_d.pdf
+spin-t.pdf_done: spin_string_search_d.pdf
+spin-t.pdf_done: spin_weight_matrix_dial.pdf
+spin-t.pdf_done: spin_start_d.pdf
+spin-t.pdf_done: spin_stops_d.pdf
+spin-t.pdf_done: spin_codon_usage_dial.pdf
+spin-t.pdf_done: spin_author_d.pdf
+spin-t.pdf_done: spin_base_bias_d.pdf
+spin-t.pdf_done: spin_similar_spans.pdf
+spin-t.pdf_done: spin_match_words.pdf
+spin-t.pdf_done: spin_diagonals.pdf
+spin-t.pdf_done: spin_align_seq.pdf
+spin-t.pdf_done: spin_local_align.pdf
+spin-t.pdf_done: spin_alignment_symbols.pdf
+spin-t.pdf_done: spin_sequence_display_d.pdf
+spin-t.pdf_done: spin_sequence_display_save_d.pdf
+spin-t.pdf_done: spin_results_manager_d.pdf
+spin-t.pdf_done: spin_simple_search.pdf
+spin-t.pdf_done: spin_personal_search.pdf
+spin-t.pdf_done: spin_seq_manager.pdf
+spin-t.pdf_done: spin_save_sequence_d.pdf
+spin-t.pdf_done: spin_plot_base_comp_p.pdf
+spin-t.pdf_done: spin_count_codons_t.pdf
+spin-t.pdf_done: spin_string_search_p.pdf
+spin-t.pdf_done: spin_weight_matrix.pdf
+spin-t.pdf_done: spin_start_p.pdf
+spin-t.pdf_done: spin_stops_p.pdf
+spin-t.pdf_done: spin_stops_p2.pdf
+spin-t.pdf_done: spin_codon_usage.pdf
+spin-t.pdf_done: spin_codon_usage_aaonly.pdf
+spin-t.pdf_done: spin_author_p.pdf
+spin-t.pdf_done: spin_base_bias_p.pdf
+spin-t.pdf_done: spin_splice.pdf
+spin-t.pdf_done: spin_trna_p.pdf
+spin-t.pdf_done: spin_trna_t.pdf
+spin-t.pdf_done: spin_align_p.pdf
+spin-t.pdf_done: spin_local_p1.pdf
+spin-t.pdf_done: spin_local_p2.pdf
+spin-t.pdf_done: spin_translate_t.pdf
+spin-t.pdf_done: spin_plot_p.pdf
+spin-t.pdf_done: spin_results_manager_d2.pdf
+spin-t.pdf_done: spin_plot_drag1.pdf
+spin-t.pdf_done: spin_plot_drag2.pdf
+spin-t.pdf_done: spin_plot_drag3.pdf
+spin-t.pdf_done: spin_sequence_display_t.pdf
+spin-t.pdf_done: spin_dot_plot.pdf
+spin-t.pdf_done: spin_seq_display.pdf
+spin-t.pdf_done: spin_results_manager_d2.pdf
+spin-t.eps_done: spin_plot_base_comp_d.eps
+spin-t.eps_done: spin_count_codons_d.eps
+spin-t.eps_done: set_genetic_code.eps
+spin-t.eps_done: spin_translate_t.eps
+spin-t.eps_done: spin_translate_d.eps
+spin-t.eps_done: spin_find_orf_d.eps
+spin-t.eps_done: spin_string_search_d.eps
+spin-t.eps_done: spin_weight_matrix_dial.eps
+spin-t.eps_done: spin_start_d.eps
+spin-t.eps_done: spin_stops_d.eps
+spin-t.eps_done: spin_codon_usage_dial.eps
+spin-t.eps_done: spin_author_d.eps
+spin-t.eps_done: spin_base_bias_d.eps
+spin-t.eps_done: spin_similar_spans.eps
+spin-t.eps_done: spin_match_words.eps
+spin-t.eps_done: spin_diagonals.eps
+spin-t.eps_done: spin_align_seq.eps
+spin-t.eps_done: spin_local_align.eps
+spin-t.eps_done: spin_alignment_symbols.eps
+spin-t.eps_done: spin_sequence_display_d.eps
+spin-t.eps_done: spin_sequence_display_save_d.eps
+spin-t.eps_done: spin_results_manager_d.eps
+spin-t.eps_done: spin_simple_search.eps
+spin-t.eps_done: spin_personal_search.eps
+spin-t.eps_done: spin_seq_manager.eps
+spin-t.eps_done: spin_save_sequence_d.eps
+spin-t.eps_done: spin_plot_base_comp_p.eps
+spin-t.eps_done: spin_count_codons_t.eps
+spin-t.eps_done: spin_string_search_p.eps
+spin-t.eps_done: spin_weight_matrix.eps
+spin-t.eps_done: spin_start_p.eps
+spin-t.eps_done: spin_stops_p.eps
+spin-t.eps_done: spin_stops_p2.eps
+spin-t.eps_done: spin_codon_usage.eps
+spin-t.eps_done: spin_codon_usage_aaonly.eps
+spin-t.eps_done: spin_author_p.eps
+spin-t.eps_done: spin_base_bias_p.eps
+spin-t.eps_done: spin_splice.eps
+spin-t.eps_done: spin_trna_p.eps
+spin-t.eps_done: spin_trna_t.eps
+spin-t.eps_done: spin_align_p.eps
+spin-t.eps_done: spin_local_p1.eps
+spin-t.eps_done: spin_local_p2.eps
+spin-t.eps_done: spin_translate_t.eps
+spin-t.eps_done: spin_plot_p.eps
+spin-t.eps_done: spin_results_manager_d2.eps
+spin-t.eps_done: spin_plot_drag1.eps
+spin-t.eps_done: spin_plot_drag2.eps
+spin-t.eps_done: spin_plot_drag3.eps
+spin-t.eps_done: spin_sequence_display_t.eps
+spin-t.eps_done: spin_dot_plot.eps
+spin-t.eps_done: spin_seq_display.eps
+spin-t.eps_done: spin_results_manager_d2.eps
+spin.pdf_done:
+spin.pdf_done: copyright.pdf_done
+spin.pdf_done: spin-t.pdf_done
+spin.eps_done:
+spin.eps_done: copyright.eps_done
+spin.eps_done: spin-t.eps_done
+spin_mini-t.pdf_done:
+spin_mini-t.eps_done:
+spin_mini-t.pdf_done: spin_translate_t.pdf
+spin_mini-t.pdf_done: spin_plot_p.pdf
+spin_mini-t.pdf_done: spin_restrict_enzymes_p.pdf
+spin_mini-t.pdf_done: spin_plot_base_comp_p.pdf
+spin_mini-t.pdf_done: spin_weight_matrix.pdf
+spin_mini-t.pdf_done: spin_splice.pdf
+spin_mini-t.pdf_done: spin_base_bias_p.pdf
+spin_mini-t.pdf_done: spin_trna_t.pdf
+spin_mini-t.pdf_done: spin_sequence_display_t.pdf
+spin_mini-t.pdf_done: spin_dot_plot.pdf
+spin_mini-t.pdf_done: spin_plot.pdf
+spin_mini-t.pdf_done: spin_local_p1.pdf
+spin_mini-t.pdf_done: spin_align_p.pdf
+spin_mini-t.pdf_done: spin_seq_display.pdf
+spin_mini-t.eps_done: spin_translate_t.eps
+spin_mini-t.eps_done: spin_plot_p.eps
+spin_mini-t.eps_done: spin_restrict_enzymes_p.eps
+spin_mini-t.eps_done: spin_plot_base_comp_p.eps
+spin_mini-t.eps_done: spin_weight_matrix.eps
+spin_mini-t.eps_done: spin_splice.eps
+spin_mini-t.eps_done: spin_base_bias_p.eps
+spin_mini-t.eps_done: spin_trna_t.eps
+spin_mini-t.eps_done: spin_sequence_display_t.eps
+spin_mini-t.eps_done: spin_dot_plot.eps
+spin_mini-t.eps_done: spin_plot.eps
+spin_mini-t.eps_done: spin_local_p1.eps
+spin_mini-t.eps_done: spin_align_p.eps
+spin_mini-t.eps_done: spin_seq_display.eps
+spin_org-t.pdf_done:
+spin_org-t.eps_done:
+spin_restrict_enzymes-t.pdf_done:
+spin_restrict_enzymes-t.eps_done:
+spin_restrict_enzymes-t.pdf_done: spin_restrict_enzymes_d.pdf
+spin_restrict_enzymes-t.pdf_done: spin_restrict_enzymes_p.pdf
+spin_restrict_enzymes-t.pdf_done: spin_restrict_enzymes_p1.pdf
+spin_restrict_enzymes-t.eps_done: spin_restrict_enzymes_d.eps
+spin_restrict_enzymes-t.eps_done: spin_restrict_enzymes_p.eps
+spin_restrict_enzymes-t.eps_done: spin_restrict_enzymes_p1.eps
+stops-t.pdf_done:
+stops-t.eps_done:
+stops-t.pdf_done: stops.pdf
+stops-t.eps_done: stops.eps
+tags-t.pdf_done:
+tags-t.eps_done:
+template-t.pdf_done:
+template-t.pdf_done: consistency_display-t.pdf_done
+template-t.eps_done:
+template-t.eps_done: consistency_display-t.eps_done
+template-t.pdf_done: template.dialogue.pdf
+template-t.pdf_done: template.display.pdf
+template-t.pdf_done: template.display.pdf
+template-t.pdf_done: template.quality.pdf
+template-t.pdf_done: template.restriction.pdf
+template-t.pdf_done: snp_candidates1.pdf
+template-t.pdf_done: snp_candidates2.pdf
+template-t.pdf_done: contig_editor_sets.pdf
+template-t.eps_done: template.dialogue.eps
+template-t.eps_done: template.display.eps
+template-t.eps_done: template.display.eps
+template-t.eps_done: template.quality.eps
+template-t.eps_done: template.restriction.eps
+template-t.eps_done: snp_candidates1.eps
+template-t.eps_done: snp_candidates2.eps
+template-t.eps_done: contig_editor_sets.eps
+test.pdf_done:
+test.eps_done:
+trace_dump.1.pdf_done:
+trace_dump.1.eps_done:
+tracediff.1.pdf_done:
+tracediff.1.eps_done:
+trev-t.pdf_done:
+trev-t.pdf_done: trev_mini-t.pdf_done
+trev-t.eps_done:
+trev-t.eps_done: trev_mini-t.eps_done
+trev-t.pdf_done: trace_print_menu.pdf
+trev-t.pdf_done: trace_print_page_dialogue.pdf
+trev-t.pdf_done: trace_print_trace_dialogue.pdf
+trev-t.pdf_done: trace_print_trace1.pdf
+trev-t.pdf_done: trev_conf_trace.pdf
+trev-t.eps_done: trace_print_menu.eps
+trev-t.eps_done: trace_print_page_dialogue.eps
+trev-t.eps_done: trace_print_trace_dialogue.eps
+trev-t.eps_done: trace_print_trace1.eps
+trev-t.eps_done: trev_conf_trace.eps
+trev.pdf_done:
+trev.pdf_done: copyright.pdf_done
+trev.pdf_done: trev-t.pdf_done
+trev.eps_done:
+trev.eps_done: copyright.eps_done
+trev.eps_done: trev-t.eps_done
+trev_mini-t.pdf_done:
+trev_mini-t.eps_done:
+trev_mini-t.pdf_done: trev_pic.pdf
+trev_mini-t.pdf_done: trace_print_trace1.pdf
+trev_mini-t.pdf_done: trev_conf_trace.pdf
+trev_mini-t.pdf_done: trev_pyro_trace.pdf
+trev_mini-t.eps_done: trev_pic.eps
+trev_mini-t.eps_done: trace_print_trace1.eps
+trev_mini-t.eps_done: trev_conf_trace.eps
+trev_mini-t.eps_done: trev_pyro_trace.eps
+vector_clip-t.pdf_done:
+vector_clip-t.eps_done:
+vector_clip-t.pdf_done: primer_pos_text.pdf
+vector_clip-t.pdf_done: primer_pos_plot.pdf
+vector_clip-t.pdf_done: primer_pos_seq_display.pdf
+vector_clip-t.eps_done: primer_pos_text.eps
+vector_clip-t.eps_done: primer_pos_plot.eps
+vector_clip-t.eps_done: primer_pos_seq_display.eps
+vector_clip.1.pdf_done:
+vector_clip.1.eps_done:
+vector_clip.pdf_done:
+vector_clip.pdf_done: copyright.pdf_done
+vector_clip.pdf_done: vector_clip-t.pdf_done
+vector_clip.eps_done:
+vector_clip.eps_done: copyright.eps_done
+vector_clip.eps_done: vector_clip-t.eps_done
+vector_primer-t.pdf_done:
+vector_primer-t.eps_done:
+ztr-t.pdf_done:
+ztr-t.eps_done:
diff --git a/manual/difference_clip.png b/manual/difference_clip.png
new file mode 100644
index 0000000..18f154e
Binary files /dev/null and b/manual/difference_clip.png differ
diff --git a/manual/disassembly-t.texi b/manual/disassembly-t.texi
new file mode 100644
index 0000000..fe7aa49
--- /dev/null
+++ b/manual/disassembly-t.texi
@@ -0,0 +1,247 @@
+@node Contig-Checking-and-Breaking
+@chapter Checking Assemblies and Removing Readings
+@menu
+
+* Check Assembly::                    Checking Assemblies
+* Removing Readings::                 Removing Readings and Breaking Contigs
+* Break Contig::                      Breaking Contigs
+* Disassemble::                       Disassembling Readings
+@end menu
+
+@cindex assembly problems: breaking contigs
+@cindex assembly problems: removing readings
+@cindex assembly problems: disassembling readings
+
+After assembly, and prior to editing, it can be useful to examine the
+quality of the alignments between individual readings and the
+sections of the consensus which they overlap. This may
+reveal doubtful joins between sections of contigs, poorly aligned
+readings, or readings that have been misplaced. By using this analysis
+in combination with other gap4
+functions 
+such as Find internal joins (_fpref(FIJ, Find Internal
+Joins, fij)) and Find repeats (_fpref(Repeats, Find Repeats,
+repeats)), 
+it is also possible to discover if 
+readings have been positioned in the
+wrong copies of repeat elements. 
+The functions for checking the alignment of readings in contigs are
+described below.
+_fxref(Check Assembly, Checking Assemblies, check_ass)
+
+If readings are found to be misplaced
+or need removing for other reasons, gap4 has functions
+for breaking contigs
+(_fpref(Break Contig, Breaking Contigs, disassembly)),
+and removing readings
+(_fpref(Disassemble, Disassembling Readings, disassembly)).
+These functions can be accessed through the main gap4 Edit menu or from
+within the Contig Editor.
+
+If readings are removed from contigs to start new contigs of one
+reading, these contigs can then be processed by Find internal joins 
+(_fpref(FIJ, Find Internal
+Joins, fij)) 
+and the Join editor
+(_fpref(Editor-Joining, The Join Editor, contig_editor)), which should
+reveal all the other positions at which the reading matches.
+
+@page
+_split()
+@node Check Assembly
+@subsection Checking Assemblies
+@cindex Check assembly
+
+The Check Assembly routine (which is invoked from the gap4 View menu)
+is used to check contigs for potentially misassembled readings
+by comparing them against the segment of the consensus which
+they overlap.  It has two modes of use: the first simply counts the
+percentage mismatch between each reading and the consensus it overlaps,
+and the second performs an alignment between the hidden data for a
+reading and the consensus it overlaps.  If the percentage is above a
+user-defined maximum, a result is produced.  That is, one mode compares
+the "visible" part of the readings, and the other aligns and compares
+the hidden data. Results are displayed in
+the Output Window and plotted on the main diagonal in the Contig
+Comparator. _fxref(Contig Comparator, Contig Comparator, comparator)
+
+From the Contig Comparator the user can invoke the Contig Editor to
+examine the alignment of any problem reading. _fxref(Editor, Editing in
+gap4, contig_editor) If the reading appears to be correctly positioned
+the user can either edit it, or in the case of poor alignment of the
+hidden data, place a tag, so that it does not produce a result if the
+search is done again.  Note, however, that such data will then also be ignored
+by the automatic double stranding routine. _fxref(Double Strand, Double Stranding, exp_suggest)
+A typical textual output from the analysis of hidden data is shown below.
+
+@example
+@group
+Reading 802(fred.s1) has percentage mismatch of 25.86
+
+              375       385       395       405       415       425
+        Reading *CCTGTTTTAAATTG-TGG-C-CCCG*-TTAACCGGGGT*CAAC**CTGGGTTGCTTA
+                 : ::::: :::::: ::  : :::::  ::: ::: ::::::  ::::: ::::: :
+      Consensus ACATGTTT*AAATTGATGAACACCCG*AATAAACGGTGT*CAAAA*CTGGATTGCTAA
+             2929      2939      2949      2959      2969      2979
+@end group
+@end example
+
+_picture(check_ass,3.38333in)
+
+Users choose to search either a single contig ("single"), all contigs
+("all contigs"), or a subset of contigs contained in a "file" or a
+"list". If "file" or "list" is selected the "browse" button will be
+activated and clicking on it will invoke a file or list browser. If a
+single contig is selected the "Contig identifier" dialogue will be
+activated and users should enter a contig name.
+
+Selecting between analysing the visible or hidden data is done by
+clicking on "yes" or "no" in the "Use cutoff data" dialogue. All
+alignments that are worse than "Maximum percentage of mismatches" will
+produce a result in the Output Window and the Contig Comparator.  If
+"Use cutoff data" is selected then a dialogue enabling the user to
+restrict the quality and length of the hidden data that the program
+aligns is activated.  First, to avoid finding very short
+mismatching regions (where percentage mismatch figures could be very
+high) users can set a "Minimum length of alignment" figure. Secondly, to
+ensure that the hidden data is not so bad that alignments will
+necessarily be poor, the program uses the following algorithm. It slides
+a window of size "Window size for good data scan" along the hidden data
+for each reading and stops if it finds a window that contains more than
+"Max dashes in scan window" non-ACGT characters.
+
+To check the used data for each reading 
+("Use cutoff data" is set to "No") the program
+compares all segments of size 'window' against the consensus sequence 
+that they lie above (obviously no alignment is required).
+If the percentage mismatch within any segment is above the
+specified amount, then the entire 'alignment' of the reading and consensus
+is displayed. Note that in the output the program will first give the percentage
+mismatch over the window length, and then the percentage over the whole reading. 
+To check the overall percentage mismatch of readings, 
+simply set the "Window size for used data" to be longer than the
+reading lengths. To check for divergence of segments within readings
+set the window size accordingly.
+
+@cindex reading percent mismatch
+@cindex readings: sorted on alignment score
+@cindex aligned readings: sorted on alignment score
+
+The "Information" window produced by selecting "Information" from the
+Contig Comparator "Results" menu produces a summary of the results
+sorted in order of percentage mismatch.
+
+
+By clicking with the right mouse button
+on results plotted in the Contig Comparator a pop-up menu is revealed
+which can be used to invoke the Contig Editor
+(_fpref(Editor, Editing in gap4, contig_editor)). The editor will start
+up with the cursor positioned on the problem reading. If the reading is
+found to be misplaced it can be marked for removal from within the Editor
+(_fpref(Editor-Comm-Remove Reading, Remove Reading, contig_editor)).
+However, prior to this it may be beneficial to use some of the other
+analyses such as Find internal joins (_fpref(FIJ, Find Internal
+Joins, fij)) and Find repeats (_fpref(Repeats, Find Repeats,
+repeats)), which may help to find its correct location. Both of these
+functions produce results plotted in the Contig Comparator
+(_fpref(Contig Comparator, Contig Comparator, comparator)) and any
+alternative locations will give matches on the same vertical or
+horizontal projection as the problem reading.
+
+ at page
+_split()
+ at node Removing Readings
+ at section Removing Readings and Breaking Contigs
+
+Occasionally contigs require more drastic changes than simple basecall
+edits. Sometimes it is necessary
+to remove readings that have been put in the wrong
+place, or to break contigs that should not have been joined. Gap4
+contains functions to help with these problems, and two
+types of interface. 
+
+If a contig
+needs to be broken cleanly into two new contigs, with all the readings,
+other than the two at the incorrect join, still linked together, then
+Break Contig 
+(_fpref(Break Contig, Breaking Contigs, disassembly)), or
+(_fpref(Editor-Comm-Break Contig, Break Contig, contig_editor))
+should be used. The former interface is available via the main gap4 Edit
+menu, and the latter as an option in the Contig Editor.
+
+If one or more readings need removing from contig(s), even if their
+removal will break the contiguity of a contig, then
+(_fpref(Disassemble, Disassemble Readings, disassembly)), or
+(_fpref(Editor-Comm-Remove Reading, Remove Reading, contig_editor))
+should be used. The former interface is available via the main gap4 Edit
+menu, and the latter as an option in the Contig Editor. Readings can be
+removed from the database completely, or moved to start individual new
+contigs, one for each reading.
+
+
+ at page
+_split()
+ at node Break Contig
+ at subsection Breaking Contigs
+ at cindex Break contig
+
+The Break Contig function (which is available from the gap4 Edit menu)
+enables contigs to be broken by removing the
+link between two adjacent readings. The
+user defines the name or number of the reading that, after the break,
+will be at the left end of the new contig. That is, the break is made
+between the named reading and the reading to its left.
+
+_picture(break_contig,2.925in)
+
+It is also possible to interactively select places to break the contig when
+using the Contig Editor.
+_fxref(Editor-Comm-Break Contig, Break Contig, contig_editor)
+
+ at page
+_split()
+ at node Disassemble
+ at subsection Disassembling Readings
+ at cindex Disassemble readings
+ at cindex Removing readings
+
+This function is used to remove readings from a database or move
+readings to new contigs. 
+There are two interfaces which allow sets of readings to be
+disassembled. One is to identify the readings interactively when using
+the
+Contig Editor
+(_fpref(Editor-Remove Readings, Remove Readings, contig_editor)),
+and the other, described below, is available as a separate option from
+the main gap4 Edit menu.
+
+_picture(disassembly,3.39167in)
+
+If readings are removed from the database all reference to them is
+deleted. If a reading is moved to a ``single-read contig'' a new
+contig will be created containing this one single reading, which may
+then be re-processed by Find Internal Joins
+(_fpref(FIJ, Find Internal
+Joins, fij)) 
+and the Join editor
+(_fpref(Editor-Joining, The Join Editor, contig_editor)), which should
+reveal all the other positions at which the reading matches.
+
+More useful is the general ``Move readings to new contigs''. This will
+keep any assembly relationships intact between the set of readings to
+be disassembled. For example if three readings overlap then when
+disassembled all three will end up in a single new contig. This
+function is particularly useful for pulling apart false joins or
+repeats.
+
+The set of readings to be processed can be read from a ``file'' or a ``list'' and
+clicking on the ``browse'' button will invoke an appropriate browser. If just a
+single reading is to be disassembled choose ``single'' and enter the
+reading name instead of the file or list of filenames.
+
+Removal via a ``list'' is a particularly powerful option when
+controlled via the list generation functions within the contig
+editor. For example break contig could be viewed as disassembling a
+list of readings selected using ``Select this reading and all to
+right''.
+
diff --git a/manual/disassembly.png b/manual/disassembly.png
new file mode 100644
index 0000000..7a396b5
Binary files /dev/null and b/manual/disassembly.png differ
diff --git a/manual/discrepancy_graph.png b/manual/discrepancy_graph.png
new file mode 100644
index 0000000..4e776ac
Binary files /dev/null and b/manual/discrepancy_graph.png differ
diff --git a/manual/doctor_db-t.texi b/manual/doctor_db-t.texi
new file mode 100644
index 0000000..8befe98
--- /dev/null
+++ b/manual/doctor_db-t.texi
@@ -0,0 +1,324 @@
+ at cindex Doctor Database
+
+ at menu
+* Doctor-Structures::           Structures menu
+* Doctor-IgnoreCheck::          Ignoring check database
+* Doctor-Extend::               Creating new structures
+* Doctor-Anno::                 Listing and removing annotations
+* Doctor-Shift::                Shifting readings
+* Doctor-Delete::               Delete contigs
+* Doctor-Contig Order::         Resetting the contig order
+ at end menu
+
+Doctor Database 
+(which is available from the gap4 Edit menu)
+is used to make arbitrary changes to the database. It
+should be extremely unlikely that its use will be required, and if so, it is for
+experts only. Very
+few checks are performed on the user's input and there are few
+limitations on what can be done.  
+Consequently this option should never be used without
+first making a backup using "Copy database". _fxref(GapDB-CopyDatabase, Making
+Backups of Databases, gap_database) It is very easy to create
+inconsistencies within the database. Do not feel that values (such as
+the maximum gel reading length) can be safely
+changed simply because they are shown in
+a dialogue. 
+
+_picture(doctor_db.main,2.09167in)
+
+The main window consists of a menubar containing "File", "Structures" and
+"Commands" menus. The menus contain:
+
+ at itemize @bullet
+ at item File
+ at itemize @minus
+ at item New
+ at item Quit
+ at end itemize
+ at sp 1
+ at item Structures
+ at itemize @minus
+ at item Database
+ at item Reading
+ at item Contig
+ at item Annotation
+ at item Template
+ at item Original clone
+ at item Vector
+ at item Note
+ at end itemize
+ at sp 1
+ at item Commands
+ at itemize @minus
+ at item Check
+ at item Ignore check database
+ at item Extend structures
+ at itemize @minus
+ at item Reading
+ at item Annotation
+ at item Template
+ at item Clone
+ at item Vector
+ at end itemize
+ at item Delete contig
+ at item Shift readings
+ at item Reset contig order
+ at item Output annotations to file
+ at item Delete annotations
+ at end itemize
+ at end itemize
+
+The New command in the File menu brings up another Doctor Database window
+complete with its own menubar. This is useful for comparing structures.
+Whilst Doctor Database is running all other program dialogues, including the
+main gap4 menubar, are blocked. Control is reenabled once the last Doctor
+Database window is removed. Remember to perform a Check Database
+(Commands menu) before quitting to double check for database consistency.
+
+
+_split()
+ at node Doctor-Structures
+ at section Structures Menu
+
+ at ifset html
+ at menu
+* Doctor-Database::             Database structure
+* Doctor-Reading::              Reading structure
+* Doctor-Contig::               Contig structure
+* Doctor-Annotation::           Annotation structure
+* Doctor-Template::             Template structure
+* Doctor-Clone::                Original clone structure
+* Doctor-Note::                 Note structure
+ at end menu
+ at end ifset
+
+The gap4 database consists of records of several
+predefined types. The types correspond to the commands available within the
+Structures menu. All of these, except for the "Database" command, insert a
+dialogue between the menubar and whatever is underneath it. In the picture
+below we have selected "Annotations" from the menu which has prompted for
+"Which annotation (1-380)" (the 1-380 is the valid range of inputs available).
+
+_picture(doctor_db.structures,2.68333in)
+
+Beneath the "Which annotation" question is a panel detailing
+an annotation structure. In general the structure type and number are
+shown at the top of the panel (in this case annotation number 100). Beneath
+this are the structure fields on the left followed by the values for these
+fields on the right. Sometimes gap4 may store a value as numeric, but
+display the structure as both a numeric and a string describing this value.
+For instance here the annotation strand is "1" which is gap4's way of storing
+"reverse".
+
+Some values have an arrow next to them, such as with the "next" field in the
+illustration. Clicking on this arrow will display the structure referenced by
+this value. Here it is another annotation (annotation 357). It is
+stated 
+that the annotation is part of Contig number 6. Clicking on the arrow next to
+this will reveal that contig structure.
+
+Selected notes on editing the structures follow.
+
+ at node Doctor-Database
+ at subsection Database Structure
+ at cindex Database structure: doctor database
+ at cindex Doctor database: database structure
+
+There is only a single Database structure. A description of its 
+more important fields follows.
+
+ at table @strong
+ at item num_contigs
+The number of currently @i{used} contigs
+ at sp 1
+ at item num_readings
+The number of currently @i{used} readings
+ at sp 1
+ at item Ncontigs
+The number of currently @i{allocated} contigs
+ at sp 1
+ at item Nreadings
+The number of currently @i{allocated} readings
+ at sp 1
+ at item contigs
+ at itemx readings
+ at itemx annotations
+ at itemx templates
+ at itemx clones
+ at itemx vectors
+ at itemx notes
+Record numbers of arrays holding the record numbers of each item
+ at sp 1
+ at item free_annotations
+A linked list of unused annotations
+ at item free_notes
+A linked list of unused notes
+ at end table
+
+ at node Doctor-Reading
+ at subsection Reading Structure
+ at cindex Reading structure: doctor database
+ at cindex Doctor database: reading structure
+
+Some Reading Structure fields reference the record number in the gap4
+database of a string. Where this string is short, such as the reading name,
+both the record number and the contents of the string can be edited. To edit a
+single name the string should be changed. To swap two reading names around
+either edit both strings or swap the two name record numbers.
+
+The @strong{annotations} value references an annotation number. If this is
+zero then this reading has no annotations.
+
+The @strong{length} is the complete length of sequence, including hidden data.
+The @strong{sequence_length} is the length of only the used sequence. The
+location of the hidden data is specified by the @strong{start} and
+ at strong{end} values. Note that @strong{sequence_length=end-start-1}.
+
+A @strong{left} or @strong{right} value of zero means that this reading has no
+left or right neighbour.
+
+ at node Doctor-Contig
+ at subsection Contig Structure
+ at cindex Contig structure: doctor database
+ at cindex Doctor database: contig structure
+
+A Contig Structure is defined as a list of readings. 
+The @strong{left} and @strong{right} values
+specify the first and last reading numbers in the doubly linked list
+representing the contig.
+
+ at node Doctor-Annotation
+ at subsection Annotation Structure
+ at cindex Annotation structure: doctor database
+ at cindex Doctor database: annotation structure
+
+Annotations are stored as linked lists. Each reading and each contig has a
+(possibly blank) list. All other unused annotations are held on the free list.
+The @strong{next} value is used to reference the next annotation number. A
+value of zero represents the end of the list.
+
+ at node Doctor-Template
+ at subsection Template Structure
+ at cindex Template structure: doctor database
+ at cindex Doctor database: template structure
+
+The Template name field can be edited as both a string and the record number
+pointing to that string. The Template Structure display has links to a vector
+number and a clone.
+
+ at node Doctor-Clone
+ at subsection Original Clone Structure
+ at cindex Clone structure: doctor database
+ at cindex Doctor database: clone structure
+ at cindex Doctor database: original clone structure
+
+The original clone name is often the name of the database. The use of original
+clones is primarily for large scale sequencing. When breaking down a sequence
+into cosmids and then into sequencing templates, we say that each cosmid is a
+clone.
+
+ at node Doctor-Note
+ at subsection Note Structure
+ at cindex Note structure: doctor database
+ at cindex Doctor database: note structure
+
+A Note may be considered as a positionless annotation (without the position,
+length or strand fields). Notes store both their creation and
+last-modification dates. Notes may be attached, in a linked-list fashion, to
+readings, contigs, or the database structure.
+
+_split()
+ at node Doctor-IgnoreCheck
+ at section Ignoring Check Database
+ at cindex Check database: ignoring
+ at cindex Ignore check database
+
+Many functions use the Check Database function to determine whether the
+database is consistent. Often editing an inconsistent database can yield
+more and more inconsistencies. However it is sometimes useful to use such an
+editing function in the process of fixing the database. In such cases, the
+"Ignore check database" toggle should be set.
+
+An example of the use is for the Break Contig function. 
+If we find that a database is
+inconsistent due to there being a gap in the contig, the obvious solution is to
+fix this using Break Contig. But Break Contig checks for consistency, and
+refuses to work if the database is inconsistent.
+
+_split()
+ at node Doctor-Extend
+ at section Extending Structures
+ at cindex Extending structures: doctor database
+ at cindex Doctor database: extending structures
+
+Sometimes it is required to allocate new structures. The "Extend structures"
+item on the Commands menu reveals a cascading menu containing the different
+structure types. Once a type has been selected a dialogue appears asking how
+many extra structures to create.
+
+The new structures created can then be modified using the Structures menu.
+Expect strange behaviour if these structures are not initialised correctly.
+
+_split()
+ at node Doctor-Anno
+ at section Listing and Removing Annotations
+ at cindex Output annotations to file
+ at cindex Delete annotations
+ at cindex Annotations: deleting (Doctor Database)
+ at cindex Annotations: outputting to file (Doctor Database)
+
+The Commands menu contains two commands for manipulating lists of annotations.
+ at code{Output annotations to file} saves a list of annotations to file. The
+dialogue requests a filename to save the annotations to and an annotation
+type. Only one type can be specified.
+
+The format of the file is @code{"Annotation_number Type Position Length
+Strand"}.
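+
+A file in this format could be read with a few lines of Python. The sketch
+below is only illustrative: it assumes the five fields are separated by
+white space and that the number, position and length are integers, neither
+of which is stated above, and the names used are hypothetical. The strand
+is kept as text because its encoding in the file is not described here.
+

```python
from collections import namedtuple

# Field names follow the order given in the manual text.
Annotation = namedtuple("Annotation", "number type position length strand")


def parse_annotation_line(line):
    """Split one line into its five fields; raise ValueError otherwise."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    number, anno_type, position, length, strand = fields
    return Annotation(int(number), anno_type, int(position), int(length), strand)
```

+
+For instance, the made-up line @code{100 COMM 1520 12 1} would yield
+annotation number 100 of type COMM at position 1520 with length 12.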
+
+The "Delete annotations" command requests a file of annotations in this
+format. The function then removes these annotations from readings and contigs
+and adds them to the free annotation list.
+
+_split()
+ at node Doctor-Shift
+ at section Shift Readings
+ at cindex Shift readings: doctor database
+ at cindex Doctor database: shift readings
+
+The Shift Readings 
+option allows the user to change the relative positions of a set of
+neighbouring readings starting at a selected reading. Hence it can
+be used to change the alignment of readings within a contig.  It prompts for
+the number of the first reading to shift and then the relative
+distance to move by. A negative shift will move the readings leftwards.
+
+The reading and all its rightward neighbours are moved by the requested
+distance. Tags on the readings and the consensus are moved accordingly.
+The command also automatically updates the length of the contig.
+
+_split()
+ at node Doctor-Delete
+ at section Delete Contig
+ at cindex Delete contig: doctor database
+ at cindex Doctor database: delete contig
+ at cindex Contig, deletion of: doctor database
+
+The Delete Contig 
+function removes a contig and all its readings.
+Annotations on the removed readings and contig are added to the free
+annotations list.
+
+_split()
+ at node Doctor-Contig Order
+ at section Reset Contig Order
+ at cindex Contig order, reset: doctor database
+ at cindex Doctor database: contig order
+ at cindex Doctor database: reset contig order
+
+The contig order information contains a list of contig numbers. If a contig
+number does not appear within this list, or if it appears more than once,
+then the contig order is inconsistent and windows such as the Contig
+Selector may not work. The Reset Contig Order 
+function resets the contig order to a
+consistent state, but will lose the existing contig order information.
diff --git a/manual/doctor_db.main.png b/manual/doctor_db.main.png
new file mode 100644
index 0000000..f751eed
Binary files /dev/null and b/manual/doctor_db.main.png differ
diff --git a/manual/doctor_db.structures.png b/manual/doctor_db.structures.png
new file mode 100644
index 0000000..70feee6
Binary files /dev/null and b/manual/doctor_db.structures.png differ
diff --git a/manual/eba.1.texi b/manual/eba.1.texi
new file mode 100644
index 0000000..d838d3f
--- /dev/null
+++ b/manual/eba.1.texi
@@ -0,0 +1,54 @@
+ at cindex eba: man page
+ at unnumberedsec NAME
+
+eba --- Estimates Base Accuracy in an SCF or ZTR file
+
+ at unnumberedsec SYNOPSIS
+
+ at code{eba} [@i{trace_file}]
+
+ at unnumberedsec DESCRIPTION
+
+ at code{Eba} will calculate numerical estimates of base accuracy for each
+base in an SCF or ZTR file. The figures calculated should not be considered as
+reliable and better values can be obtained from phred or ATQA.
+
+The method employed by eba to estimate the base accuracies performs the
+following calculation for each base. Calculate the area under the peaks
+for each base type. Divide the area under the called base by the largest
+area under the other three bases. From the 2002 release these values are
+normalised to the phred scale (this was achieved by comparing the
+original eba values and phred values for 4.6 million base calls of
+Sanger Centre data).
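+
+The ratio at the heart of this calculation can be written out as a short
+Python sketch. It is not eba's source code: the function name is
+hypothetical, the four peak areas are assumed to have been measured already,
+and the later normalisation to the phred scale is left out because its
+details are not given here. Returning infinity when the other three areas
+are all zero is likewise an assumption.
+

```python
def area_ratio(areas, called):
    """Area under the called base divided by the largest other area.

    areas maps each of the four base types to the area under its peak.
    """
    others = [area for base, area in areas.items() if base != called]
    largest_other = max(others)
    if largest_other == 0:
        # Assumed behaviour: no competing signal at all.
        return float("inf")
    return areas[called] / largest_other
```

+
+A called A with area 120 against competing areas of 10, 30 and 5 gives a
+ratio of 120/30, that is 4.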
+
+With no filename as an argument eba reads from standard input and writes
+to standard output. This enables eba to be used as a filter, or to
+estimate base accuracies for unwritable files. If a file is specified on
+the command line then the accuracy figures will be written to this file.
+
+ at unnumberedsec EXAMPLES
+
+To write base accuracy figures to an SCF file named @code{e04f10.s1SCF}.
+
+ at example
+ at code{eba e04f10.s1SCF}
+ at end example
+
+To write base accuracy figures on the original eba scale to an SCF file 
+named @code{e04f10.s1SCF}.
+
+ at example
+ at code{eba -old_scale e04f10.s1SCF}
+ at end example
+
+To write base accuracy figures to a ZTR file named @code{e04f10.s1.ztr}
+in another user's directory, and to store the updated file in the
+current directory:
+
+ at example
+ at code{eba < ~user/e04f10.s1.ztr > e04f10.s1.ztr}
+ at end example
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Scf, scf(4), formats)
diff --git a/manual/exp-t.texi b/manual/exp-t.texi
new file mode 100644
index 0000000..1876768
--- /dev/null
+++ b/manual/exp-t.texi
@@ -0,0 +1,902 @@
+ at ignore
+ at c MANSECTION=4
+ at unnumberedsec NAME
+
+ExperimentFile --- Experiment File Format
+ at end ignore
+
+ at node Formats-Exp
+ at section Experiment File
+ at cindex Experiment files
+
+Experiment files contain gel readings plus information about them, and are
+used during the processing of the sequence. They are used to carry data
+between programs: they provide input to the programs and programs may in
+turn add to or modify them. When the experiment file for a reading reaches
+the assembly program it should be carrying all the data needed for its
+subsequent processing. The assembly program will copy what it needs into
+the assembly database. The file format is based on that of EMBL sequence
+entries and, if required, can be read as such by programs like spin.
+
+ at menu
+* Exp-Records::                 Records
+* Exp-Explain::                 Explanation of Records
+* Exp-Example::                 Example Experiment File
+* Exp-Unsupported::             Unsupported Additions
+ at end menu
+
+_split()
+ at node Exp-Records
+ at subsection Records
+ at cindex Experiment files: record types
+ at cindex Records in experiment files
+
+It is important to note that the assembly program gap4
+(_fpref(Gap4-Introduction, Gap4 introduction, gap4))
+ will not operate to
+its full effect if it is not given all the necessary data. For example
+gap4 contains many functions that can analyse the positions and relative
+orientations of readings from the same template in order to check the
+correctness of the assembly and determine the contig order. However if
+the records that name templates and their estimated lengths, and define
+the primers used to obtain readings from them are missing, none of these
+valuable analyses can be performed reliably. One way to ensure that all
+the necessary fields are present is to use the program pregap4
+(_fpref(Pregap4-Introduction, Pregap4 introduction, pregap4)).
+
+
+In the descriptions below records containing * are those read into the
+database during normal assembly; those with ** are extra items required when
+entering pre-assembled data; those with *** are read from SCF files
+(_fpref(Formats-Scf, SCF introduction, scf))
+after the experiment file has been read to obtain the SCF file name;
+the record marked **** is an extra item required for Directed Assembly.
+
+The order of records in the file is not important. They are listed
+here in alphabetical order with, where possible, reasons for the 
+origin of their names. Several are redundant and no group is likely
+to make use of them all. Obviously others can be added in the future.
+Initially they might be of local use but if their use becomes wider they
+can be added to the standard set. Standard EMBL records such as FT are
+assumed to be included.
+
+
+ at c TABLE_MODE=1
+ at table @var
+ at item AC
+ACcession number
+ at item AP
+Assembly Position ****
+ at item AQ
+Average Quality for bases 100..200
+ at item AV
+Accuracy values for externally assembled data **, ***
+ at item BC
+Base Calling software
+ at item CC
+Comment line
+ at item CF
+Cloning vector sequence File
+ at item CH
+Special CHemistry
+ at item CL
+Cloning vector Left end
+ at item CN
+Clone Name
+ at item CR
+Cloning vector Right end
+ at item CS
+Cloning vector Sequence present in sequence *
+ at item CV
+Cloning Vector type
+ at item DR
+Direction of Read
+ at item DT
+DaTe of experiment
+ at item EN
+Entry Name
+ at item EX
+EXperimental notes
+ at item FM
+sequencing vector Fragmentation Method
+ at item ID
+IDentifier *
+ at item LE
+was Library Entry, but now identifies a well in a micro titre dish
+ at item LI
+was subclone LIbrary but now identifies a micro titre dish
+ at item LN
+Local format trace file Name *
+ at item LT
+Local format trace file Type *
+ at item MC
+MaChine on which experiment ran
+ at item MN
+Machine generated trace file Name
+ at item MT
+Machine generated trace file Type
+ at item ON
+Original base Numbers (positions) **
+ at item OP
+OPerator
+ at item PC
+Position in Contig **
+ at item PD
+Primer data (the sequence of a primer)
+ at item PN
+Primer Name
+ at item PR
+PRimer type *
+ at item PS
+Processing Status
+ at item QL
+poor Quality sequence present at Left (5') end *
+ at item QR
+poor Quality sequence present at Right (3') end *
+ at item RS
+Reference Sequence for numbering and mutation detection
+ at item SC
+Sequencing vector Cloning site
+ at item SE
+SEnse (ie whether complemented) **
+ at item SF
+Sequencing vector sequence File
+ at item SI
+Sequencing vector Insertion length *
+ at item SL
+Sequencing vector sequence present at Left (5') end *
+ at item SP
+Sequencing vector Primer site (relative to cloning site)
+ at item SQ
+SeQuence *
+ at item SR
+Sequencing vector sequence present at Right (3') end *
+ at item SS
+Screening Sequence
+ at item ST
+STrands *
+ at item SV
+Sequencing Vector type *
+ at item TG
+Gel reading Tag *
+ at item TC
+Contig Tag *
+ at item TN
+Template Name *
+ at item WT
+Wild type trace
+ at end table
+ at c TABLE_MODE=0
+
+_split()
+ at node Exp-Explain
+ at subsection Explanation of Records
+ at cindex Experiment file: explanation of records
+
+ at c TABLE_MODE=2
+ at cindex AC: experiment file line type
+ at table @code
+ at item Record
+AC, ACcession line
+ at item Format
+AC   string
+ at item Explanation
+A unique identifier for the reading.
+ at end table
+ at sp 2
+ at cindex AP: experiment file line type
+ at table @code
+ at item Record
+AP, Assembly Position
+ at item Format
+AP   Name_of_anchor_reading sense offset tolerance
+ at item Explanation
+For readings whose position has been mapped by an external program, these
+records tell the "directed assembly" algorithm where to assemble the data.
+Positions are defined as offsets from an "anchor reading" which is the name of
+any reading already in the database, an orientation (sense, + or -), and a
+tolerance. Readings are aligned at relative position offset + or - tolerance.
+ at end table
+ at sp 2
+ at cindex AQ: experiment file line type
+ at table @code
+ at item Record
+AQ, Average Quality of the reading.
+ at item Format
+AQ   Numeric value in range 1 - 99.
+ at item Explanation
+The average value of the "numerical estimate of base calling accuracy" as
+calculated by program eba. The value is useful for monitoring data quality and
+could also be used for deciding on an order of assembly - for example assemble
+the highest quality readings first.
+ at end table
+ at sp 2
+ at cindex AV: experiment file line type
+ at table @code
+ at item Record
+AV, Accuracy Values
+ at item Format
+AV   q1 q2 q3 @dots{} or a1,c1,g1,t1 a2,c2,g2,t2 @dots{}
+ at item Explanation
+The accuracy values lie in the range 1-99. There is either 1 per base (eg 89 50 @dots{})
+or 4 per base (eg 0,89,5,2 50,3,7,10). @cite{Bonfield,J.K and Staden,R.
+The application of numerical estimates of base calling accuracy to DNA
+sequencing projects. Nucleic Acids Res. 23 1406-1410, (1995)}.
+ at end table
+ at sp 2
+ at cindex BC: experiment file line type
+ at table @code
+ at item Record
+BC, Base Calling software
+ at end table
+ at sp 2
+ at cindex CC: experiment file line type
+ at table @code
+ at item Record
+CC, Comment line
+ at item Format
+CC   string
+ at item Explanation
+Any comments can be added on any number of lines.
+ at end table
+ at sp 2
+ at cindex CF: experiment file line type
+ at table @code
+ at item Record
+CF, Cloning vector sequence File
+ at item Format
+CF   string
+ at item Explanation
+The name of the file containing the sequence of the cloning vector, to be used
+by vector_clip (_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+
+ at end table
+ at sp 2
+ at cindex CH: experiment file line type
+ at table @code
+ at item Record
+CH, Special CHemistry
+ at item Format
+CH   number
+ at item Explanation
+Used to flag readings as having been sequenced using a "special chemistry". The
+number is a bit pattern with a bit for each chemistry type, thus allowing
+combinations of chemistries to be listed. Currently bit 0 is used to
+distinguish between dye-primer (0) and dye-terminator (1) chemistries. Bits 1
+to 4 inclusive indicate the type of chemistry: unknown (0, 0000), ABI
+Rhodamine (1, 0001), ABI dRhodamine (2, 0010), BigDye (3, 0011), Energy
+Transfer (4, 0100) and LiCor (5, 0101). So for example a BigDye Terminator has 
+bits 00111 set which is 7 in decimal.
+ at end table
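+
+The bit layout described for the CH record can be decoded as in the
+following Python sketch. The function and table names are hypothetical and
+the code is only an illustration of the layout given above; reporting
+unlisted type codes as ``unrecognised'' is an assumption.
+

```python
# Chemistry type codes held in bits 1 to 4, as listed in the CH explanation.
CHEMISTRY_TYPES = {
    0: "unknown",
    1: "ABI Rhodamine",
    2: "ABI dRhodamine",
    3: "BigDye",
    4: "Energy Transfer",
    5: "LiCor",
}


def decode_ch(value):
    """Return (is_terminator, chemistry name) for a CH number."""
    is_terminator = bool(value & 1)       # bit 0: 0 = primer, 1 = terminator
    type_code = (value >> 1) & 0b1111     # bits 1 to 4 inclusive
    name = CHEMISTRY_TYPES.get(type_code, "unrecognised")
    return is_terminator, name
```

+
+Decoding 7 gives a dye-terminator BigDye reading, in agreement with the
+example in the explanation.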
+ at sp 2
+ at cindex CL: experiment file line type
+ at table @code
+ at item Record
+CL, Cloning vector Left end
+ at item Format
+CL   number
+ at item Explanation
+The base position in the sequence that contains the last base in the cloning
+vector. Currently gap4 only uses the CS line.
+ at end table
+ at sp 2
+ at cindex CN: experiment file line type
+ at table @code
+ at item Record
+CN, Clone Name
+ at item Format
+CN   string
+ at item Explanation
+The name of the segment of DNA that the reading has been
+derived from. Typically the name of a physical map clone. 
+ at end table
+ at sp 2
+ at cindex CR: experiment file line type
+ at table @code
+ at item Record
+CR, Cloning vector Right end
+ at item Format
+CR   number
+ at item Explanation
+The base position in the sequence that contains the first base in the cloning
+vector. Currently gap4 only uses the CS line.
+ at end table
+ at sp 2
+ at cindex CS: experiment file line type
+ at table @code
+ at item Record
+CS, Cloning vector Sequence present in sequence
+ at item Format
+CS   range
+ at item Explanation
+Regions of sequence found by vector_clip 
+(_fpref(Vector_Clip, Screening Against Vector Sequences,
+vector_clip)) to be cloning vector. Used in assembly to
+exclude unwanted sequence.
+ at end table
+ at sp 2
+ at cindex CV: experiment file line type
+ at table @code
+ at item Record
+CV, Cloning Vector type
+ at item Format
+CV   string
+ at item Explanation
+The type of the cloning vector used.
+ at end table
+ at sp 2
+ at cindex DR: experiment file line type
+ at table @code
+ at item Record
+DR, Direction of Read
+ at item Format
+DR   direction
+ at item Explanation
+Whether forward or reverse primers were used. Allows
+mapping of forward and reverse reads off the same template. NOTE however
+that we do not encourage the use of this method as the terms
+direction, sense and strand can be confusing. Instead we encourage the
+use of the PRimer line.
+ at end table
+ at sp 2
+ at cindex DT: experiment file line type
+ at table @code
+ at item Record
+DT, DaTe of experiment
+ at item Format
+DT   dd-mon-yyyy
+ at item Explanation
+Any date information.
+ at end table
+ at sp 2
+ at cindex EN: experiment file line type
+ at table @code
+ at item Record
+EN, Entry Name
+ at item Format
+EN   string
+ at item Explanation
+The name given to the reading
+ at end table
+ at sp 2
+ at cindex EX: experiment file line type
+ at table @code
+ at item Record
+EX, EXperimental notes
+ at item Format
+EX   string
+ at item Explanation
+Another type of comment line for additional information.
+ at end table
+ at sp 2
+ at cindex FM: experiment file line type
+ at table @code
+ at item Record
+FM, sequencing vector Fragmentation Method
+ at item Format
+FM   string
+ at item Explanation
+Fragmentation method used to create sequencing library.
+ at end table
+ at sp 2
+ at cindex ID: experiment file line type
+ at table @code
+ at item Record
+ID, IDentifier
+ at item Format
+ID   string
+ at item Explanation
+This is the name given to the reading inside the assembly database
+and is equivalent to the ID line of an EMBL entry.
+ at end table
+ at sp 2
+ at cindex LE: experiment file line type
+ at table @code
+ at item Record
+LE, Can be used to identify the location of materials
+ at item Format
+LE   string
+ at item Explanation
+Originally a micro titre dish well number. Used in
+combination with LI.
+ at end table
+ at sp 2
+ at cindex LI: experiment file line type
+ at table @code
+ at item Record
+LI, Can be used to identify the location of materials
+ at item Format
+LI   string
+ at item Explanation
+Originally a micro titre dish identifier. Used in
+combination with LE.
+ at end table
+ at sp 2
+ at cindex LN: experiment file line type
+ at table @code
+ at item Record
+LN, Local format trace file Name
+ at item Format
+LN   string
+ at item Explanation
+The name of the local format trace file. This information is passed
+onto gap4, and allows for local formats to be used.
+ at end table
+ at sp 2
+ at cindex LT: experiment file line type
+ at table @code
+ at item Record
+LT, Local format trace file Type
+ at item Format
+LT   string
+ at item Explanation
+The type of the local format trace file (usually SCF).
+ at end table
+ at sp 2
+ at cindex MC: experiment file line type
+ at table @code
+ at item Record
+MC, MaChine on which sequencing experiment was run
+ at item Format
+MC   string
+ at item Explanation
+The lab's name for the sequencing machine used to create the data.
+Used for logging the performance of individual machines.
+ at end table
+ at sp 2
+ at cindex MN: experiment file line type
+ at table @code
+ at item Record
+MN, Machine generated trace file Name
+ at item Format
+MN   string
+ at item Explanation
+The name of the trace file generated by the sequencing machine MC.
+ at end table
+ at sp 2
+ at cindex MT: experiment file line type
+ at table @code
+ at item Record
+MT, Machine generated trace file Type
+ at item Format
+MT   string
+ at item Explanation
+The type of machine generated trace file.
+ at end table
+ at sp 2
+ at cindex ON: experiment file line type
+ at table @code
+ at item Record
+ON, Original base Numbers (positions)
+ at item Format
+ON   (eg) 1..43 0 45..63 65..74 0 75..536
+ at item Explanation
+The A..B notation means values A to B inclusive, so this example says
+that bases 1 to 43 are unchanged, there is a change at 44, etc.
+ at end table
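+
+The A..B notation above can be expanded mechanically. The following is a
+minimal Python sketch, not part of gap4; the function name is
+hypothetical, and treating a lone 0 as "no original base at this
+position" is an assumption made for illustration.

```python
def parse_on_ranges(value):
    """Expand an ON record value into one original position per base.

    "A..B" stands for original positions A to B inclusive. A lone number
    is kept as-is (0 is assumed here to mean "no original base").
    """
    positions = []
    for token in value.split():
        if ".." in token:
            start, end = (int(part) for part in token.split(".."))
            positions.extend(range(start, end + 1))
        else:
            positions.append(int(token))
    return positions


original = parse_on_ranges("1..43 0 45..63 65..74 0 75..536")
# Show the entries around the first change in the example.
print(original[42:46])
```
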
+ at sp 2
+ at cindex OP: experiment file line type
+ at table @code
+ at item Record
+OP, OPerator
+ at item Format
+OP   string
+ at item Explanation
+Someone's name, possibly the person who ran the
+sequencing machine. Useful, with expansion of the string field for
+monitoring the performance of individuals!
+ at end table
+ at sp 2
+ at cindex PC: experiment file line type
+ at table @code
+ at item Record
+PC,  Position in Contig
+ at item Format
+PC    number
+ at item Explanation
+For preassembled data, the position to put the left end of the reading.
+ at end table
+ at sp 2
+ at cindex PD: primer data - the sequence of a primer
+ at table @code
+ at item Record
+PD,  Primer Data
+ at item Format
+PD    sequence
+ at item Explanation
+The primer sequence.
+ at end table
+ at sp 2
+ at cindex PN: experiment file line type
+ at table @code
+ at item Record
+PN, Primer Name
+ at item Format
+PN   string
+ at item Explanation
+Name of primer used, using local naming convention. Could be a
+universal primer. 
+ at end table
+ at sp 2
+ at cindex PR: experiment file line type
+ at table @code
+ at item Record
+PR, PRimer type
+ at item Format
+PR   number
+ at item Explanation
+This record shows the direction of the reading and distinguishes between
+primers from the ends of the insert and those that are internal. It is
+important for the analysis of the relative orientations and positions of
+readings on templates. When the positions of readings on templates are
+analysed (_fpref(Read Pairs, Find read pairs, read_pairs)) primer types
+1,2,3 and 4 are represented using the symbols F,R,f and r respectively.
+
+ at c TABLE_MODE=1
+ at table @var
+ at item 0
+Unknown
+ at item 1
+Forward from beginning of insert
+ at item 2
+Reverse from end of insert
+ at item 3
+Custom forward i.e. a forward primer other than type 1.
+ at item 4
+Custom reverse i.e. a reverse primer other than type 2.
+ at end table
+ at c TABLE_MODE=2
+ at end table
+ at sp 2
+ at cindex PS: experiment file line type
+ at table @code
+ at item Record
+PS, Processing Status
+ at item Format
+PS   explanation
+ at item Explanation
+Indication of processing status. 
+ at end table
+ at sp 2
+ at cindex QL: experiment file line type
+ at table @code
+ at item Record
+QL, poor Quality sequence present at Left (5') end
+ at item Format
+QL   position
+ at item Explanation
+The sequence up to and including the base at the marked position is
+considered to be of too poor quality to be used. 
+It may overlap with other marked
+sequences - CS, SL or SR. Used in assembly to exclude unwanted sequence.
+ at end table
+ at sp 2
+ at cindex QR: experiment file line type
+ at table @code
+ at item Record
+QR, poor Quality sequence present at Right (3') end
+ at item Format
+QR   position
+ at item Explanation
+The sequence from and including the base at the marked position to the
+end is considered to be of too poor quality to be used. It may overlap with
+other marked sequences - CS, SL or SR. Used in assembly to exclude
+unwanted sequence.
+ at end table
+ at sp 2
+ at cindex RS: experiment file line type
+ at table @code
+ at item Record
+RS, Reference Sequence
+ at item Format
+RS   string
+ at item Explanation
+The name of a sequence, usually in EMBL format, used to define the target
+sequence, base numbering 
+and feature table data for a project. Used to define the numbering and
+changes produced by mutations in individual sequence readings
+(_fpref(Mutation-Detection-Introduction, Introduction to mutation detection,t)).
+ at end table
+ at sp 2
+ at cindex SC: experiment file line type
+ at table @code
+ at item Record
+SC, Sequencing vector Cloning site
+ at item Format
+SC   position
+ at item Explanation
+The cloning site of the sequence vector. Used by vector_clip 
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+ at end table
+ at sp 2
+ at cindex SE: experiment file line type
+ at table @code
+ at item Record
+SE, SEnse (ie whether complemented)
+ at item Format
+SE   number
+ at item Explanation
+For preassembled data, the sense of the reading (0 for forward, 1 for
+reverse).
+ at end table
+ at sp 2
+ at cindex SF: experiment file line type
+ at table @code
+ at item Record
+SF, Sequencing vector sequence File
+ at item Format
+SF   string
+ at item Explanation
+The name of the file containing the sequence of the 
+sequencing vector, to be used by vector_clip 
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+ at end table
+ at sp 2
+ at cindex SI: experiment file line type
+ at table @code
+ at item Record
+SI, Sequencing vector Insertion length
+ at item Format
+SI   range
+ at item Explanation
+Expected insertion length of sequence in sequencing
+vector. Useful for selecting templates for further experiments.
+ at end table
+ at sp 2
+ at cindex SL: experiment file line type
+ at table @code
+ at item Record
+SL, Sequencing vector sequence present at Left (5') end
+ at item Format
+SL   position
+ at item Explanation
+The sequence up to and including the base at the marked 
+position is considered to be sequencing vector. Written by vector_clip
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+ at end table
+ at sp 2
+ at cindex SP: experiment file line type
+ at table @code
+ at item Record
+SP, Sequencing vector Primer site (relative to cloning site)
+ at item Format
+SP   position
+ at item Explanation
+Location of the primer used for sequencing, relative to the cloning site.
+Used by vector_clip 
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+ at end table
+ at sp 2
+ at cindex SQ: experiment file line type
+ at table @code
+ at item Record
+SQ, SeQuence
+ at item Format
+SQ   \nsequence blocks at dots{}\n//\n
+ at item Explanation
+Complete sequence, as determined by the sequencing machine. The sequence is
+broken into blocks of 10 bases with 6 blocks per line separated by a space
+(see the example below).
+ at end table
+ at sp 2
+ at cindex SR: experiment file line type
+ at table @code
+ at item Record
+SR, Sequencing vector sequence present at Right (3') end
+ at item Format
+SR   position
+ at item Explanation
+The sequence from and including the base at the marked 
+position to the end is considered to be sequencing vector. Written by
+vector_clip 
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+ at end table
+ at sp 2
+ at cindex SS: experiment file line type
+ at table @code
+ at item Record
+SS, Screening Sequence
+ at item Format
+SS   string
+ at item Explanation
+Note that in earlier versions of this documentation this field was explained
+incorrectly. Due to this the field is not currently being used by any of our
+programs. The original meaning was to specify a sequence to screen against.
+Any number of SS lines could be present to denote any number of screening
+sequences. In the future we may change the meaning of this field to be a
+single SS line containing a file of filenames of screening sequences. If this
+causes problems for people then we will choose a new line type, so please
+inform us now. Also note that contrary to previous documentation, vector_clip does
+not use this field (it uses the SF field instead).
+ at end table
+ at sp 2
+ at cindex ST: experiment file line type
+ at table @code
+ at item Record
+ST, STrands
+ at item Format
+ST   number
+ at item Explanation
+Denotes whether this is a single or double stranded template. This
+is useful for deducing suitable templates for later experiments.
+ at end table
+ at sp 2
+ at cindex SV: experiment file line type
+ at table @code
+ at item Record
+SV, Sequencing Vector type
+ at item Format
+SV   string
+ at item Explanation
+Type of sequencing vector used. Can be used for choosing
+templates for custom primer experiments.
+ at end table
+ at sp 2
+ at cindex TC: experiment file line type
+ at table @code
+ at item Record
+TC, Tag to be placed on the Consensus.
+ at item Format
+TC   TYPE S position..length
+ at item Explanation
+These lines instruct gap4 to place tags on the consensus.
+The format defines the tag type (a 4 character identifier which
+should start at column position 5), its strand  ( "+", "-" or
+"=" which means both strands), its start position followed by the
+position of its end. These two values are separated by "..". Following
+lines starting TC with space characters up to column 10 are written
+into the comment field of the tag. For example the next three lines
+define a tag of type comment that is to be on both strands over the
+range 100 to 110 and the comment field will contain "This comment
+contains several lines".
+ at example
+TC   COMM = 100..110
+TC        This comment contains
+TC          several lines
+ at end example
+ at end table
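+
+As an illustration of the layout described above, the following Python
+sketch splits a tag record into its parts. It is not gap4 code: the
+function name and the returned dictionary keys are hypothetical, and
+joining the continuation lines with single spaces is an assumption.

```python
def parse_tag_lines(lines):
    """Parse one tag: a header line plus optional comment continuation lines."""
    # Drop the two-letter line type, then split "TYPE S start..end".
    tag_type, strand, span = lines[0][2:].split()[:3]
    start, end = (int(number) for number in span.split(".."))
    # Continuation lines carry only comment text after the line type.
    comment = " ".join(line[2:].strip() for line in lines[1:])
    return {"type": tag_type, "strand": strand,
            "start": start, "end": end, "comment": comment}


tag = parse_tag_lines([
    "TC   COMM = 100..110",
    "TC        This comment contains",
    "TC          several lines",
])
print(tag)
```
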
+ at sp 2
+ at cindex TG: experiment file line type
+ at table @code
+ at item Record
+TG, Tag to be placed on the reading.
+ at item Format
+TG   TYPE S position..length
+ at item Explanation
+These lines instruct gap4 to place tags on the reading.
+See TC for further information.
+ at end table
+ at sp 2
+ at cindex TN: experiment file line type
+ at table @code
+ at item Record
+TN, Template Name
+ at item Format
+TN   string
+ at item Explanation
+The name of the template used in the experiment.
+ at end table
+ at sp 2
+ at cindex WT: wild type trace file
+ at table @code
+ at item Record
+WT, Wild Type trace file
+ at item Format
+WT   string
+ at item Explanation
+The filename of the wild type trace file. Used for mutation studies.
+ at end table
+ at c TABLE_MODE=0
+
+_split()
+ at node Exp-Example
+ at subsection Example
+ at cindex Experiment file: example
+ at cindex Example experiment file
+
+ at example
+ID   h4a01h6.s1
+EN   h4a01h6.s1
+TN   h4a01h6
+EX   lane 18, run time 10 hrs
+MN   Sample 18
+MC   A
+MT   ABI
+LN   h4a01h6.s1SCF
+LT   SCF
+DT   08-Jan-1993
+OP   ak
+TN   h4a01h6
+SV   M13mp18
+SF   /pubseq/seqlibs/vectors/m13mp18.seq
+SI   1000..2000
+SC   6249
+PN   -21
+PR   1
+DR   +
+SP   41
+ST   1
+CN   3G9
+CV   sCos-1
+CF   /pubseq/seqlibs/vectors/sCos-1.seq
+SS   /pubseq/seqlibs/vectors/m13mp18.seq
+SQ
+     GCTTGCATGC CTGCAGGTCG ACTCTAGAGG ATCCCCAACC AGTAAGGCAA CCCCGCCAGC
+     CTAGCCGGGT CCTCAACGAC AGGAGCACGA TCATGCGCAC CCGTCAGATC CAGACATGAT
+     AAGATACATT GATGAGTTTG GACAAACCAC AACTAGAATG CAGT-AAAAA AATGCTTTAT
+     TTGTGAAATT TGTGATGCTA TTGCTTTATT TGTAACCATT ATAAGCTGCA ATAAACAAGT
+     TAACAACAAC AATTGCATTC ATTTTATGTT TCAGGTTCAG GGGGAGGTGT GGGAGGTTTT
+     TTAAAGCAAG TAAAACCTCT ACAAATGTGG TATGGCTGAT TATGATCTCT AGTCAAGGCA
+     CTATACATCA AATATT-CCT TATTAACCCC CTTTACAAAT TTAAAAGGCT -AAAGGGTCC
+     ACAATTTTTG -GCCTAGGTA TTAATAGCCG GCACTTCTT- TGCCTGTTTT GG-GTAGGG-
+     AAAACCGGTA TGTTT-TGGT T-TTC
+//
+QL   0
+QR   281
+SL   36
+SR   506
+CS   37..280
+PS   Completely cloning vector
+ at end example
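+
+A file laid out like the example above can be read with very little
+code. The following Python sketch is an illustration only and is not the
+io-lib reader; the function name is hypothetical. It assumes that every
+record starts with a two-letter line type and that the SQ block ends at
+a line starting with "//".

```python
from collections import defaultdict


def parse_experiment_file(text):
    """Collect record values by line type; SQ blocks are joined into one string."""
    records = defaultdict(list)
    in_sequence = False
    bases = []
    for line in text.splitlines():
        if in_sequence:
            if line.startswith("//"):
                in_sequence = False
                records["SQ"].append("".join(bases))
            else:
                # Remove the spaces that separate the blocks of 10 bases.
                bases.append(line.replace(" ", ""))
            continue
        if not line.strip():
            continue
        line_type, value = line[:2], line[2:].strip()
        if line_type == "SQ":
            in_sequence = True
            bases = []
        else:
            records[line_type].append(value)
    return records


sample = """ID   h4a01h6.s1
SQ
     GCTTGCATGC CTGCAGGTCG
     AAAACCGGTA TGTTT-TGGT T-TTC
//
QL   0
QR   281
"""
parsed = parse_experiment_file(sample)
print(parsed["ID"], parsed["QR"], len(parsed["SQ"][0]))
```

+Values are kept in lists because a line type may occur more than once in
+a file, as TN does in the example.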
+
+_split()
+ at node Exp-Unsupported
+ at subsection Unsupported Additions (From LaDeana Hillier)
+ at cindex Experiment file: unsupported additions
+
+Note the clash on AP, which the io-lib uses for "Assembly Position",
+and PC, which is used for "Position in Contig".
+
+ at c INDENT=0.1i
+ at example
+People to track:
+TP Template Prep person
+QP Sequencer Person, person who does sequencing reactions
+LP Loader Person
+AL Agar Loader person (when they run a gel to determine SI)
+AP Agar reaction Person   (person who does the reactions to prepare
+                        the template to be run on a gel)
+
+Gel specific information
+GN Gel Name
+GL Gel Lane
+GP Gel Pourer person
+AG Agar Gel name (sizing gel)
+AF Agar Fate, no insert, no bands, what else?
+
+Name of library
+LB  Library name, probably not critical to assembly even though
+        one CN may have more than one library.  But it is important
+        to the cDNA project although I could put it in CN, since
+        the cDNA project wouldn't have a CN otherwise.
+
+Processing information
+PC processing comment (a comment about PS)
+        I think PS should just hold pass or fail and PC should hold
+        additional information about why things passed.
+
+Trace information gotten from the ABI machine (from info field in SCF file):
+TS   Trace Spacing
+DP   Dye Primer
+HA   signal strengtH A
+HG   signal strengtH G
+HC   signal strengtH C
+HT   signal strengtH T
+
+(NOTE rs suggested these should go in a single record)
+
+PP   Primer Position  (position at which primer peak was detected in trace)
+
+Stuff most likely specific to the cDNA project:
+MP Map Position 
+TT Tissue Type of the library
+EI dbEst Id  
+ER dbEst Remark
+OE Other Est's which are similar
+NI NCBI ID
+GB GenBank accession number
+SD Submission Date (when est was submitted)
+UD Update date (when it was last updated)
+CI citation associated with this cDNA
+ at end example
+ at c INDENT=0.5i
diff --git a/manual/exp_suggest-t.texi b/manual/exp_suggest-t.texi
new file mode 100644
index 0000000..b976c00
--- /dev/null
+++ b/manual/exp_suggest-t.texi
@@ -0,0 +1,456 @@
+_split()
+ at node Experiments
+ at chapter Finishing Experiments
+
+ at menu
+* Double Strand::         Double Stranding
+* Suggest Primers::       Suggest Primers
+* Suggest Long::          Suggest Long Readings
+* Compressions::          Resequence Compressions
+* Suggest Probes::        Suggest Probes
+ at end menu
+
+Gap4 contains several functions for helping to select experiments to
+finish an assembly project. These functions
+(which are all available from the gap4 Experiments menu)
+automatically analyse the contigs to find the regions which need
+attention, and suggest appropriate experiments.
+
+Prior to performing any experiments it can
+be worthwhile to try to make the most of the existing data by moving the
+boundary between the hidden and visible data of 
+readings to cover single stranded regions
+(_fpref(Double Strand, Double Strand, exp_suggest)).
+
+The following "Experiment Suggestion" functions analyse the contigs to
+find problems, and then suggest the best templates to use for further
+experiments. 
+
+Primers and templates for primer walking experiments can be suggested
+(_fpref(Suggest Primers, Suggest Primers, exp_suggest)).
+Sometimes resequencing on a long gel machine will help to fill a single
+stranded region or join a pair of contigs
+(_fpref(Suggest Long, Suggest Long Readings, exp_suggest)).
+Compressions and stops can be solved by resequencing using a different
+chemistry
+(_fpref(Compressions, Compressions and Stops, exp_suggest)).
+In order to select oligos to use as probes for clones near the ends of
+contigs a further function is available
+(_fpref(Suggest Probes, Suggest Probes, exp_suggest)).
+
+_split()
+ at node Double Strand
+ at section Double Stranding
+ at cindex Double strand
+
+The purpose of this function 
+(which is available from the gap4 Edits menu)
+is to use hidden data
+to fill regions of contigs that have
+data on only one strand
+(_fpref(Intro-Hidden, Use of the "hidden" poor quality data, gap4)).
+First the routine finds a region that has data for
+only one strand. Then it examines the nearby readings on the other
+strand to see if they have hidden data that covers the single stranded
+region.  If so it
+finds the best alignment between this hidden data and the consensus over
+the region. If this alignment is good enough the data is converted from
+hidden to visible.  This process is continued over all the selected
+contigs. The function can be run on a subsection of a single contig, on all
+contigs, or on a subset of contigs that are named in a file or a list.
+
+Significant portions of the sequence can be covered by this
+operation, hence saving a great deal of experimental work, and it
+can be used as a standard part of cleaning up a sequencing project.
+However it must be noted that an increased number of edits may be
+required after its application. The amount of cutoff
+data used depends on the number of mismatches and the percentage
+mismatch in the alignment. That is, it depends on the quality of the
+alignment, not the quality of the data: if it aligns it is assumed to
+be correct!
+
+The program reports its progress in the Output window as shown in the
+following example.
+
+ at example
+Wed 03:52:46 PM: double strand
+------------------------------------------------------------
+Double stranding contig xf48g3.s1 between 1 and 6189
+Double stranded zf23b2.s1       by 121 bases at offset 3752
+Double stranded zf18g11.s1      by 194 bases at offset 5652
+Positive strand :
+	Double stranded 315 bases with 2 inserts into consensus
+	Filled 0 holes
+Complementing contig   358
+Double stranded zg29a11.s1      by 42 bases at offset 5265 - Filled
+Double stranded zf38c7.s1       by 131 bases at offset 5015 - Filled
+Negative strand :
+	Double stranded 174 bases with 1 insert into consensus
+	Filled 2 holes
+ at end example
+
+_picture(exp_suggest.double,3.25in)
+
+The contigs to process can be a particular
+"single" contig, "all contigs", or a subset of contigs whose names are
+stored in a "file" or a "list". If a file or list is selected the
+browse button will be activated, and if it is clicked, an appropriate
+browser will be invoked. If the user selects "single" then the
+dialogue for choosing the contig and the section to process becomes
+active.
+
+Only alignments with not more than "Maximum number of mismatches" and
+"Maximum percentage of mismatches" will be accepted.
+
+ at page
+_split()
+ at node Suggest Primers
+ at section Suggest Primers
+ at cindex Suggest primers
+ at cindex Primers: suggestion of
+
+The purpose of this function 
+(which is available from the gap4 Experiments menu)
+is to suggest custom primer experiments to
+extend and "double strand" contigs.  First the routine finds regions of
+contigs with data on only one strand. Then it selects templates and
+primers, which if used in sequencing experiments, would produce data to
+cover these single stranded regions.  This information is written to a
+file or a list and also appears in the Output window.  For each primer
+suggested a tag is automatically created containing the template name
+and the sequence.  See also _oref(Suggest Long, Suggest Long), and
+_oref(Double Strand, Double Strand).
+
+The following example shows how the results appear in the Output
+window.
+
+ at example
+Wed 04:53:08 PM: Suggest Primers
+------------------------------------------------------------
+Selecting oligos for contig xf23a3.s1 between 1 and 12379
+At  3873 - template zf23b2, primer GAAACTGGATAATACGAC, number 1
+At  5847 - template zf18g11, primer CCTCCAATAGCGTGAAG, number 2
+At  7924 - template zf22d11, primer GTAAAGTGTAATTCAAGGAAG, number 3
+At  9033 - template zf97c10, primer ATGATAGAAATCTCGTGG, number 4
+At  9972 - template zf98b5, primer GCGGAAAGTTGAAAGAG, number 5
+At 10506 - template zg09a9, primer ACACATCATTTCGGAGG, number 6
+At 10958 - template zf24c1, primer CAGTTTACGAGAAAGTCC, number 7
+At 11529 - template zg29a12, primer ACCTTCCCAAAAGTTCC, number 8
+At 11897 - template zf97d7, primer AACCCGATTTTCGTAATG, number 9
+Complementing contig   358
+At 11400 - template zf38b1, primer CGAAGACCCAAAGAAAG, number 11
+At  9902 - template zf98a4, primer CTTTTCTCTTTCAACTTTCC, number 12
+At  7104 - template zf22h10, primer GTTGTCACGAAAATCGC, number 13
+At  6564 - template zf21e6, primer CGGATCAAATATGGATGG, number 14
+At  1499 - template zf98a11, primer CGTGATTTTTACACTATTTCC, number 15
+At   774 - template zf19c4, primer TCCAATTTTGATTCAGGC, number 16
+Complementing contig    46
+ at end example
+
+The following shows the contents of the corresponding file. The fields are
+ at i{template name}, @i{reading name}, @i{primer name}, @i{primer sequence},
+ at i{position} and @i{direction}.
+
+ at example
+zf23b2 zf23b2.s1 B0334.1 GAAACTGGATAATACGAC 3818 +
+zf18g11 zf18g11.s1 B0334.2 CCTCCAATAGCGTGAAG 5789 +
+zf22d11 zf22d11.s1 B0334.3 GTAAAGTGTAATTCAAGGAAG 7883 +
+zf97c10 zf97c10.s1 B0334.4 ATGATAGAAATCTCGTGG 8984 +
+zf98b5 zf98b5.s1 B0334.5 GCGGAAAGTTGAAAGAG 9932 +
+zg09a9 zg09a9.s1 B0334.6 ACACATCATTTCGGAGG 10460 +
+zf24c1 zf24c1.s1 B0334.7 CAGTTTACGAGAAAGTCC 10902 +
+zg29a12 zg29a12.r1 B0334.8 ACCTTCCCAAAAGTTCC 11487 +
+zf97d7 zf97d7.s1 B0334.9 AACCCGATTTTCGTAATG 11855 +
+zf23a3 zf23a3.s1 B0334.10 CAAAGCAATGTCCCCAG 12339 +
+zf38b1 zf38b1.s1 B0334.11 CGAAGACCCAAAGAAAG 930 -
+zf98a4 zf98a4.s1 B0334.12 CTTTTCTCTTTCAACTTTCC 2427 -
+zf22h10 zf22h10.s1 B0334.13 GTTGTCACGAAAATCGC 5220 -
+zf21e6 zf21e6.s1 B0334.14 CGGATCAAATATGGATGG 5771 -
+zf98a11 zf98a11.s1 B0334.15 CGTGATTTTTACACTATTTCC 10833 -
+zf19c4 zf19c4.s1 B0334.16 TCCAATTTTGATTCAGGC 11565 -
+ at end example
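+
+Since the fields in this file are separated by white space, each line
+can be split directly. The following Python sketch is illustrative only;
+the class and function names are hypothetical and the field order is
+taken from the description above.

```python
from typing import NamedTuple


class PrimerSuggestion(NamedTuple):
    template: str
    reading: str
    primer_name: str
    sequence: str
    position: int
    direction: str


def parse_primer_line(line):
    """Split one line of the Suggest Primers output file into named fields."""
    template, reading, name, sequence, position, direction = line.split()
    return PrimerSuggestion(template, reading, name, sequence,
                            int(position), direction)


suggestion = parse_primer_line(
    "zf23b2 zf23b2.s1 B0334.1 GAAACTGGATAATACGAC 3818 +")
print(suggestion.primer_name, suggestion.position, suggestion.direction)
```
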
+
+_picture(exp_suggest.primers,3.175in)
+
+The contigs to process can be a particular
+"single" contig, "all contigs", or a subset of contigs whose names are
+stored in a "file" or a "list". If a file or list is selected the
+browse button will be activated and, if it is clicked, an appropriate
+browser will be invoked. If the user selects "single", then the
+dialogue for choosing the contig and the section to process becomes
+active.
+
+The primer sequences, their template names and their reading names can
+be written to a file or a list and an appropriate browser can be used to
+aid its selection.
+
+For each single stranded region located, the program will search for a
+primer on its 5' side in the region "search start position", to
+"search end position". That is, it will try to locate a primer starting at
+"search start position" and then will look increasingly further away
+until it reaches "search end position".
+
+If required, by employing the "number of primers per match" entry box,
+the user can request that the program tries to suggest more than one
+primer per problem. The "primer start number" is an attempt to
+generate a unique name for each primer suggested. If the number was
+set to, say 11, and the database was named B0334, then the first primer
+would be named B0334.11, the next B0334.12, etc in the output file.
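+
+The naming scheme just described amounts to appending a running number
+to the database name. A minimal Python sketch (the function name is
+hypothetical and this is not the gap4 implementation):

```python
def primer_names(database, start_number, count):
    """Return `count` primer names numbered upwards from `start_number`."""
    return [f"{database}.{start_number + offset}" for offset in range(count)]


print(primer_names("B0334", 11, 3))
```
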
+
+The "Edit parameters" button invokes a dialogue box which allows the 
+specification of further parameters. Primer constraints can be specified 
+by melting temperature, length and G+C content.
+
+ at page
+_split()
+ at node Suggest Long
+ at section Suggest Long Readings
+ at cindex Suggest long readings
+ at cindex Long readings: suggestion of
+
+This routine 
+(which is available from the gap4 Experiments menu)
+suggests which templates could be resequenced on a long gel
+machine to fill in single stranded regions or extend contigs. The "Estimated
+long reading length" tells the routine the expected length of reading that
+will be produced by the sequencing machine. The routine finds all single
+stranded regions, and where possible suggests solutions. Solutions will not be
+suggested using readings from templates that have inconsistent read-pair
+information.
+
+The example output below shows a list of problem segments
+followed by suggested templates.
+
+ at example
+  Prob 1..1:            Extend contig start for joining.
+      Long       c91d3.s1@   367. T_pos=366, T_size=1000..1500 (1250), cov 189
+      Long      c99e12.s1@   340. T_pos=191, T_size=1000..1500 (1250), cov 216
+
+  Prob 1..456:          No +ve strand data.
+      No solution.
+
+  Prob 1597..1736:      No +ve strand data.
+      Long       c53c6.s1@  1074. T_pos=341, T_size=1000..1500 (1250), cov 32
+      Long      e04c11.s1@  1076. T_pos=376, T_size=1000..1500 (1250), cov 34
+      Long       e05h9.s1@  1081. T_pos=377, T_size=1000..1500 (1250), cov 39
+      Long       e05a1.s1@  1198. T_pos=329, T_size=1000..1500 (1250), cov 156*
+      Long      c53b11.s1@  1382. T_pos=216, T_size=1000..1500 (1250), cov 340*
+
+  Prob 2530..2532:      No +ve strand data.
+      Long       e03a8.s1@  2283. T_pos=199, T_size=1000..1500 (1250), cov 308*
+      Long      e05b10.s1@  2331. T_pos=200, T_size=1000..1500 (1250), cov 356*
+
+  Prob 3974..4067:      No -ve strand data.
+      No solution.
+
+  Prob 4067..4067:      Extend contig end for joining.
+   D  Long       e06a3.s1@  3588. T_pos=366, T_size=1000..1500 (1582), cov 76
+      Long       c53b1.s1@  3709. T_pos=360, T_size=1000..1500 (1250), cov 197
+ at end example
+
+Some brief notes on the above output, looking at the suggested
+rerun of reading e05a1.s1:
+
+ at table @code
+ at item Prob 1597..1736:        No +ve strand data.
+A single stranded region has been identified in this contig at bases
+1597 to 1736 inclusive.
+
+ at item "?D Long"
+The optional two letters before the word "Long" are used to flag possibly
+inconsistent templates (templates that are definitely inconsistent are
+ignored). "?" means that no primer information is available
+for the template that the reading is from. "D" means that the template size is not
+within the expected minimum and maximum. In this case the observed size is
+displayed (see below).
+
+ at item "Long       e05a1.s1@  1198."
+A possible solution; rerun reading e05a1.s1 as a long gel. The first
+used base at the 5' end of this reading is at position 1198 in the
+contig. Typically this roughly corresponds to the primer position for
+this reading in the contig.
+
+ at item T_pos=329
+The last used base at the 3' end of the reading is estimated to be the
+329th base of the template.  Together with the template lengths this
+gives us an estimate of how much template there is available for a long
+gel or for walking.
+
+ at item T_size=1000..1500 (1250)
+The estimated size for this template is 1250 bases.  Gap4 is supplied a
+minimum and maximum size when a reading is assembled.  In this case the
+minimum is 1000 bases, and the maximum 1500.  When both forward and
+reverse reads from a template are assembled into the same contig, the
+real length can be estimated reasonably accurately. Otherwise (as can be
+seen here), the estimated length is simply the average of the supplied
+minimum and maximum lengths.
+
+ at item cov 156*
+We would expect a long gel to cover our "hole" by 156 bases. This
+estimate is based purely on the position of the start of the reading in
+relation to the start of the hole, and the estimated length of a long
+gel.  The asterisk here marks that this coverage is more than enough to
+completely solve the problem by plugging the positive strand hole.
+ at end table
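+
+The T_size and cov figures can be reproduced with simple arithmetic. The
+following Python sketch is an illustration under stated assumptions, not
+the gap4 code: the function names are hypothetical, only holes in the
+positive strand are handled, and the long reading length of 555 bases is
+inferred from the example output because the text does not state it.

```python
def estimated_template_size(minimum, maximum):
    """Average of the supplied limits, used when no read pair is available."""
    return (minimum + maximum) // 2


def long_gel_coverage(read_start, hole_start, long_length):
    """Bases of the hole reached by a long reading starting at read_start."""
    return max(0, read_start + long_length - hole_start)


def solves_hole(coverage, hole_start, hole_end):
    """True when the coverage spans the whole hole (marked with an asterisk)."""
    return coverage >= hole_end - hole_start + 1


LONG_LENGTH = 555  # assumed value, inferred from the example output

size = estimated_template_size(1000, 1500)
coverage = long_gel_coverage(1198, 1597, LONG_LENGTH)
print(size, coverage, solves_hole(coverage, 1597, 1736))
```
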
+
+For the problem "3974..4067" there is "No solution" listed.  This is due
+to the fact that there are no suitable readings within the estimated
+long gel reading length of this problem.
+
+_picture(exp_suggest.long,3.175in)
+
+ at page
+_split()
+ at node Compressions
+ at section Compressions and Stops
+ at cindex Compressions: suggested experiments
+ at cindex Stops: suggested experiments
+
+This option 
+(which is available from the gap4 Experiments menu)
+searches through a region of a contig looking for stop (STOP) or
+compression (COMP) tags.  These tags could have been added using the Contig
+Editor or by a suitable external program which can analyse traces to detect
+these types of problems. For each such tag found the routine produces a list
+of readings that could be resequenced to try to solve the problem. Obviously
+the types of experiments available will change as the technology
+improves but at present the program produces output that suggests "Taq
+terminator" experiments. We welcome suggestions for other experiment types or
+news of any programs that can automatically assign the tags. The results, in
+the form of suggestions, are written to the Output window.
+
+_picture(exp_suggest.comp,3.175in)
+
+Note that the Taq reading length is used as a guideline for
+deciding which readings are suitable candidates for solving a
+problem. All readings in the correct orientation and with their 5' ends
+within this length are assumed to solve the problem. The actual
+distance is listed in the output; an example of this is shown below.
+
+ at example
+  Prob 1544..1545: COMP tag on strand 0 (forward)
+     Taq for xd26d8.s1        @  1365 179
+
+  Prob 1554..1554: STOP tag on strand 0 (forward)
+     Taq for xd26d8.s1        @  1365 189
+
+  Prob 5276..5288: COMP tag on strand 1 (reverse)
+     Taq for xc34g11.s1       @  5299  23
+     Taq for xc34g11.s1t      @  5298  22
+     Taq for xc34d6.s1        @  5316  40
+     Taq for xc45e1.s1        @  5463 187
+
+  Prob 24042..24046: COMP tag on strand 1 (reverse)
+     Taq for xc50a12.s1       @ 24167 125
+     Taq for xc33d1.s1        @ 24188 146
+     Taq for xc36h4.s1        @ 24208 166
+     Taq for xc51c8.s1        @ 24232 190
+ at end example
+
+The format of the above output is:
+
+ at example
+  Prob <start>..<end>: <type> tag on strand <st>
+      Taq for <read> @@ <pos> <distance>
+      ...
+ at end example      
+
+Where:
+
+ at table @code
+ at item <start>..<end>
+marks the inclusive range for the tag in the contig.
+ at item <type>
+is the type of the current tag.
+ at item <st>
+is the strand of the reading that the tag is placed upon.
+ at item <read>
+is the gel reading name.
+ at item <pos>
+is the position of the 5' end of <read> in the contig.
+ at item <distance>
+is the distance of the 5' end from the tag.
+ at end table
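+
+Output in this format is easy to read back into a program. The following
+Python sketch is illustrative only; the function name, the regular
+expressions and the dictionary keys are assumptions based on the example
+output above and are not part of gap4.

```python
import re

PROB_RE = re.compile(r"Prob (\d+)\.\.(\d+): (\w+) tag on strand (\d+)")
TAQ_RE = re.compile(r"Taq for (\S+)\s+@\s*(\d+)\s+(\d+)")


def parse_suggestions(text):
    """Group each 'Taq for' suggestion under the problem that precedes it."""
    problems = []
    for line in text.splitlines():
        prob = PROB_RE.search(line)
        if prob:
            start, end, tag_type, strand = prob.groups()
            problems.append({"start": int(start), "end": int(end),
                             "type": tag_type, "strand": int(strand),
                             "readings": []})
            continue
        taq = TAQ_RE.search(line)
        if taq and problems:
            name, position, distance = taq.groups()
            problems[-1]["readings"].append(
                (name, int(position), int(distance)))
    return problems


report = """  Prob 1544..1545: COMP tag on strand 0 (forward)
     Taq for xd26d8.s1        @  1365 179

  Prob 5276..5288: COMP tag on strand 1 (reverse)
     Taq for xc34g11.s1       @  5299  23
     Taq for xc34d6.s1        @  5316  40
"""
for problem in parse_suggestions(report):
    print(problem["type"], problem["start"], problem["readings"])
```
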
+
+ at page
+_split()
+ at node Suggest Probes
+ at section Suggest Probes
+ at cindex Suggest probes
+ at cindex Oligos: choosing for probes
+
+The suggest probes function 
+(which is available from the gap4 Experiments menu)
+looks for oligos at the end of each contig
+suitable for use with an @i{oligo probing strategy} invented by
+Jonathan Flint. 
+ at cite{Flint,J., Sims,M., Clark,K., Staden,R. and Thomas,K. An 
+oligo-screening strategy to fill gaps found during shotgun sequencing
+projects. DNA Sequence 8, 241-245}. The probing strategy is used part
+way through a sequencing project to find clones which should help to
+extend contigs. The gap4 function described here is used to select
+oligos from readings that are near the ends of the current
+contigs. These oligos are synthesised and then used to probe a pool of
+sequencing clones. Those which it selects are then sequenced in the hope
+that they will lengthen the contigs.
+
+_picture(suggest_probes.main,3.175in)
+
+The dialogue contains the usual methods of selecting the set of contigs to
+operate on. For each end of the selected contigs, oligos are chosen using the
+OSP @cite{Hillier, L., and Green, P. (1991). "OSP: an oligonucleotide
+selection program," PCR Methods and Applications, 1:124-128}
+selection criteria, which depend on the maximum and minimum size of
+oligos specified. The "search from" and "search to" parameters control the
+area of consensus sequence in which to search for oligos. For example,
+if they are set to 10 and 100 respectively, then the section of consensus
+sequence used is 90 bases long and starts 10 bases from the end of the contig.
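The arithmetic of the "search from" and "search to" example can be sketched as follows. This is an illustration in Python rather than gap4 code: the function name `search_window`, the 0-based half-open coordinates and the clamping to the contig length are all assumptions made for the sketch.

```python
def search_window(contig_length, search_from, search_to, end="start"):
    """Return the (begin, stop) slice of consensus searched for oligos.

    Coordinates are 0-based and half-open (an assumption of this sketch).
    'search_from' and 'search_to' are distances measured inwards from the
    chosen contig end, so the window is search_to - search_from bases long.
    """
    if not 0 <= search_from < search_to:
        raise ValueError("need 0 <= search_from < search_to")
    # Clamp so that short contigs do not yield out-of-range coordinates.
    near = min(search_from, contig_length)
    far = min(search_to, contig_length)
    if end == "start":
        return near, far
    if end == "end":
        return contig_length - far, contig_length - near
    raise ValueError("end must be 'start' or 'end'")
```

With the values from the text (10 and 100) the window is 90 bases wide at either end of the contig; only its position differs.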
+
+Once an oligo is found it is screened against all the existing
+consensus sequence. An oligo is rejected if it matches with a score
+greater than or equal to the "maximum percentage match". If a file of
+vector filenames has been specified then the oligos are also screened
+against the vector sequences.
+
+Typical output for a single contig follows. The output shows all oligos that
+have passed the screening process. The information listed includes the
+distance of this oligo from the end of the contig (@code{Dist ??}), the score
+returned from the OSP selection (@code{primer=??}), the melting temperature
+(@code{Tm=??}), the best percentage match found (@code{match=??%}) and the
+oligo sequence.
+
+ at example
+Contig zf37b5.s1(495): Start
+    Rejected 8 oligos due to non uniqueness
+Contig zf37b5.s1(495): End
+    No oligos found
+Contig zf48g3.s1(315): Start
+    Pos     71, Dist  70, primer=16, Tm=52, match=75%, GCGTTTTACAATAACTTCTC
+    Pos     80, Dist  79, primer=16, Tm=50, match=72%, AATAACTTCTCAGGCAAC
+    Pos     69, Dist  68, primer=16, Tm=52, match=75%, GTGCGTTTTACAATAACTTC
+    Pos     48, Dist  47, primer=20, Tm=50, match=72%, AAAATACCATTGCAGCTC
+    Pos     52, Dist  51, primer=20, Tm=55, match=71%, TACCATTGCAGCTCACC
+    Pos     51, Dist  50, primer=20, Tm=52, match=71%, ATACCATTGCAGCTCAC
+    Pos     63, Dist  62, primer=22, Tm=55, match=76%, CTCACCGTGCGTTTTAC
+    Pos     68, Dist  67, primer=24, Tm=50, match=72%, CGTGCGTTTTACAATAAC
+    Pos     77, Dist  76, primer=24, Tm=50, match=72%, TACAATAACTTCTCAGGC
+    Pos     46, Dist  45, primer=28, Tm=50, match=78%, TCAAAATACCATTGCAGC
+    Rejected 1 oligo due to non uniqueness
+ at end example
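For scripted post-processing, the text shown above can be parsed line by line. The sketch below is in Python and is not part of gap4; the function name `parse_suggest_probes` and the field names are illustrative assumptions. The number in parentheses after the reading name is stored as `number` because its meaning is not stated in this section.

```python
import re

CONTIG_RE = re.compile(r"^Contig\s+(\S+?)\((\d+)\):\s+(Start|End)\s*$")
OLIGO_RE = re.compile(
    r"^\s*Pos\s+(-?\d+),\s*Dist\s+(-?\d+),\s*primer=([\d.]+),"
    r"\s*Tm=([\d.]+),\s*match=([\d.]+)%,\s*([A-Za-z]+)\s*$"
)
REJECT_RE = re.compile(r"^\s*Rejected\s+(\d+)\s+oligos?\b")


def parse_suggest_probes(text):
    """Return one record per contig end, with its oligos and reject count."""
    records = []
    for line in text.splitlines():
        m = CONTIG_RE.match(line)
        if m:
            records.append({
                "contig": m.group(1),
                "number": int(m.group(2)),
                "end": m.group(3),
                "oligos": [],
                "rejected": 0,
            })
            continue
        if not records:
            continue  # ignore anything before the first "Contig" line
        m = OLIGO_RE.match(line)
        if m:
            records[-1]["oligos"].append({
                "pos": int(m.group(1)),
                "dist": int(m.group(2)),
                "primer": float(m.group(3)),
                "tm": float(m.group(4)),
                "match": float(m.group(5)),
                "sequence": m.group(6),
            })
            continue
        m = REJECT_RE.match(line)
        if m:
            records[-1]["rejected"] += int(m.group(1))
    return records
```

A contig end reported as "No oligos found" simply yields a record with an empty `oligos` list.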
+
+This output is sent both to the Output Window and to a separate
+suggest probes output window. This latter window (shown below) allows
+selection of oligos from those available for each contig by clicking the
+left mouse button on a line of the output. The selected oligos are shown in
+blue. By default the first in each set is automatically selected.
+
+_picture(suggest_probes.select,5.04167in)
+
+The selected oligos can then be written to a file by filling in the "output
+filename" and will have OLIG tags created for them when the "Create tags"
+checkbutton is selected. This output window vanishes once OK is pressed, but
+the text in the main Output Window is left intact.
+
+
diff --git a/manual/exp_suggest.comp.png b/manual/exp_suggest.comp.png
new file mode 100644
index 0000000..654c628
Binary files /dev/null and b/manual/exp_suggest.comp.png differ
diff --git a/manual/exp_suggest.double.png b/manual/exp_suggest.double.png
new file mode 100644
index 0000000..55ffd73
Binary files /dev/null and b/manual/exp_suggest.double.png differ
diff --git a/manual/exp_suggest.long.png b/manual/exp_suggest.long.png
new file mode 100644
index 0000000..2a41a0b
Binary files /dev/null and b/manual/exp_suggest.long.png differ
diff --git a/manual/exp_suggest.primers.png b/manual/exp_suggest.primers.png
new file mode 100644
index 0000000..188c5de
Binary files /dev/null and b/manual/exp_suggest.primers.png differ
diff --git a/manual/extract-t.texi b/manual/extract-t.texi
new file mode 100644
index 0000000..5c94f86
--- /dev/null
+++ b/manual/extract-t.texi
@@ -0,0 +1,35 @@
+This function 
+(which is available from the gap4 File menu)
+is used to produce copies of readings stored in the assembly
+database. The readings, and information about them, are written to disk in
+experiment file format (_fpref(Formats-Exp, Experiment file format, exp)) and
+will include any edits made and tags created. They are written in their
+original orientation. No change is made to the copies in the assembly
+database: this process creates copies and should not be confused with
+"Disassemble readings".  _fxref(Disassemble, Disassemble Readings,
+disassembly) The names of the readings to extract can be read from a list or a
+file of file names.  Clicking on the browse button will invoke an appropriate
+browser dialogue. If just a single reading is to be extracted choose "single"
+and enter the filename instead of the file or list of filenames.  The files
+are written into the "Destination directory" with their original file names.
+
+_picture(extract,3.00833in)
+
+If required, the files will include additional information suitable for
+processing by either "Enter pre-assembled data" or "Directed assembly"
+(_fpref(Exp-Records, Experiment file format explained, exp)).  Both contain
+the ON and AV Experiment File records. Pre-assembled data also contains SE and
+PC records whilst Directed assembly contains AP records.  It is recommended
+that Directed Assembly format is always used in preference to the Preassemble
+format.
+
+ at cindex Merging databases
+ at cindex Splitting databases
+ at cindex Database merging
+ at cindex Database splitting
+
+To merge databases use the "Directed assembly" format to output the contigs
+required. Then, within the database into which you wish to merge the data, use the Directed
+Assembly (_fpref(Assembly-Directed, Directed Assembly, assembly)) command. By
+using Directed Assembly with new blank databases it is also possible to create
+database subsets or to split databases.
diff --git a/manual/extract.png b/manual/extract.png
new file mode 100644
index 0000000..b60e2e9
Binary files /dev/null and b/manual/extract.png differ
diff --git a/manual/extract_fastq.1.texi b/manual/extract_fastq.1.texi
new file mode 100644
index 0000000..5244593
--- /dev/null
+++ b/manual/extract_fastq.1.texi
@@ -0,0 +1,44 @@
+ at cindex extract_fastq: man page
+ at unnumberedsec NAME
+
+extract_fastq --- extracts sequence and quality from a trace or experiment file.
+
+ at unnumberedsec SYNOPSIS
+
+ at code{extract_fastq}
+[@code{-}(@code{abi}|@code{alf}|@code{scf}|@code{ztr}|@code{exp}|@code{pln})]
+[@code{-good_only}] [@code{-clip_cosmid}] [@code{-fasta_out}]
+[@code{-output} @i{output_name}] [@i{input_name}] @code{...}
+
+ at unnumberedsec DESCRIPTION
+
+ at code{extract_fastq} extracts the sequence and quality information
+from binary trace files or Experiment files. The input can be read
+either from standard input or read from files listed directly as
+arguments or contained within a ``file of filenames''. Output is
+either sent to standard output or a named file. It contains the
+sequence and confidence stored in single-line fastq format.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-abi}, @code{-alf}, @code{-scf}, @code{-ztr}, @code{-exp}, @code{-pln}
+    Specify an input file format. This is not usually required as
+    @code{extract_fastq} will automatically determine the correct input file
+    type. This option is supplied in case the automatic determination is
+    incorrect (which is possible, but has never been observed).
+
+ at item @code{-output} @i{file}
+    The sequence will be written to @i{file} instead of standard
+    output.
+
+ at item @code{-fofn} @i{file_of_filenames}
+    Read the reading names from @i{file_of_filenames} with one per line.
+ at end table
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp, ExperimentFile(4), formats)
+_fxref(Formats-Scf, scf(4), formats)
+_fxref(Man-extract_seq, extract_seq(1), extract_seq.1)
+ at code{Read}(4)
diff --git a/manual/extract_seq.1.texi b/manual/extract_seq.1.texi
new file mode 100644
index 0000000..0fd6e29
--- /dev/null
+++ b/manual/extract_seq.1.texi
@@ -0,0 +1,62 @@
+ at cindex extract_seq: man page
+ at unnumberedsec NAME
+
+extract_seq --- extracts sequence from a trace or experiment file.
+
+ at unnumberedsec SYNOPSIS
+
+ at code{extract_seq} [@code{-r}]
+[@code{-}(@code{abi}|@code{alf}|@code{scf}|@code{ztr}|@code{exp}|@code{pln})]
+[@code{-good_only}] [@code{-clip_cosmid}] [@code{-fasta_out}]
+[@code{-output} @i{output_name}] [@i{input_name}] @code{...}
+
+ at unnumberedsec DESCRIPTION
+
+ at code{extract_seq} extracts the sequence information from binary trace
+files, Experiment files, or from the old Staden format plain files. The input
+can be read either from files or from standard input, and the output can be
+written to either a file or standard output. Multiple input files can be
+specified. The output contains the sequences split onto lines of at most 60
+characters each.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-r}
+    Directs reading of experiment file to attempt extraction of sequence from
+    the referenced (@code{LN} and @code{LT} line types) trace file. Without
+    this option, or when the trace file cannot be found, the sequence
+    output is that listed in the Experiment File. This option has no effect
+    for other input format types.
+
+ at item @code{-abi}, @code{-alf}, @code{-scf}, @code{-ztr}, @code{-exp}, @code{-pln}
+    Specify an input file format. This is not usually required as
+    @code{extract_seq} will automatically determine the correct input file
+    type. This option is supplied in case the automatic determination is
+    incorrect (which is possible, but has never been observed).
+
+ at item @code{-good_only}
+    When reading an experiment file or SCF file containing clip marks, output
+    only the @i{good} sequence which is contained within the boundaries marked
+    by the @code{QL}, @code{QR}, @code{SL}, @code{SR}, @code{CL}, @code{CR}
+    and @code{CS} line types.
+
+ at item @code{-clip_cosmid}
+    When the @code{-good_only} argument is specified this controls whether the
+    cosmid sequence should be considered good data. Without this argument
+    cosmid sequence is considered good.
+
+ at item @code{-fasta_out}
+    Specifies that the output should be in fasta format.
+
+ at item @code{-output} @i{file}
+    The sequence will be written to @i{file} instead of standard
+    output.
+ at end table
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp, ExperimentFile(4), formats)
+_fxref(Formats-Scf, scf(4), formats)
+_fxref(Man-extract_fastq, extract_fastq(1), extract_fastq.1)
+ at code{Read}(4)
diff --git a/manual/fak2-t.texi b/manual/fak2-t.texi
new file mode 100644
index 0000000..a991a1f
--- /dev/null
+++ b/manual/fak2-t.texi
@@ -0,0 +1,308 @@
+ at cindex Assembly: FAKII
+ at cindex FAKII Assembly
+ at cindex Myers: Assembly (FAKII)
+ at cindex Assembly: Myers
+
+This mode of assembly uses the global assembly program FAKII, developed by 
+Myers @cite{Eugene W. Myers Jr., Mudita Jain and Susan Larson, University of
+Arizona, Department of Computer Science}. 
+
+The FAKII program can be accessed via the Gap4 interface through the "Assembly"
+menu or as a series of stand alone programs. 
+
+The FAKII files for use with Gap4 must be obtained via ftp from the
+authors Eugene W Myers Jr., Mudita Jain, Eric Anson and Susan Larson
+at the University of Arizona, Department of Computer Science.
+
+First email a request for authorisation to Dr Gene Myers
+(gene@@cs.arizona.edu) stating you want FAKII for use with Gap4. He
+will email you a postscript file containing the authorisation
+document.  Print the document.  Read it.  Sign it, fax it back to Gene
+Myers at +1 (520) 621-4246 and also post the original signed copy back
+to Gene.
+
+Then email Susan Larson (susanjo@@cs.arizona.edu) requesting a copy of
+FAKII for use with Gap4 and she will contact you to arrange for the
+transfer. Make sure you tell Susan the operating system for which you
+need the program (one of: SunOS 4.1.1; Solaris 2.5; DEC OSF/1 V3.0 and
+Digital Unix; Irix 5.3).  Make the files Susan sends executable (eg
+chmod a+x *) and move them into the directory
+ at code{$STADENROOT/$MACHINE-bin}.  The environment variable FAKII must
+also be set to @code{$STADENROOT/$MACHINE-bin}, for example, for the
+bash shell, @code{export FAKII=$STADENROOT/$MACHINE-bin}.  You could add
+this to your staden.profile or staden.login files.
+
+Prior to the files being in this directory the FAKII items on the
+assembly menu in Gap4 will have been greyed out; now they should
+appear in normal text and the functions will be selectable.
+
+ at menu
+* Assembly-Perform FAKII assembly:: Perform FAKII assembly
+* Assembly-Import FAKII assembly:: Import FAKII assembly data
+* Assembly-Perform and import FAKII assembly:: Perform and import FAKII assembly
+ at end menu
+
+ at node Assembly-Perform FAKII assembly
+ at subsection Perform FAKII assembly
+ at cindex Assembly: perform FAKII 
+ at cindex FAKII assembly: perform
+
+_picture(assembly.fak2,2.59167in)
+
+Assembly using FAKII can be split into either two or three distinct phases. The
+first phase is that of computing and storing overlaps (graph creation). The 
+second phase is optional and involves the creation of a constraint file. The 
+third phase is the computation and display of the assembly based on the graph
+and the constraint file if one was created.
+
+The assembly works on a file of reading names in experiment file format 
+(_fpref(Formats-Exp, Experiment File, formats)).
+
+The graph creation phase is modulated by three floating point numbers that
+control which overlaps are detected and/or accepted as follows:
+
+The "Error limit". The maximum sequencing error rate for which overlaps will
+be guaranteed to be detected. For example, if this is set to 10%, then the
+program looks for overlaps with 20% or fewer differences in the aligned
+regions. This parameter should never be greater than .2, and we
+suggest .099 as a standard value.
+
+The "Overlap threshold". The overlap score of an
+overlap is the log of the a priori odds that such an overlap would
+occur by chance. Pragmatically, this score is the length of the
+overlap minus a marginally decreasing penalty per difference. A
+typical value is 10, implying that an overlap of at least 10 bases is
+needed and that such an overlap occurring by chance is a one in a
+million (approximately 1 in 4^10) event.
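The "one in a million" figure follows from simple counting. The sketch below, in Python, assumes a uniform random model with four equally likely bases and treats the score as a number of exactly matching bases; FAKII's real score also subtracts a penalty per difference, which is not modelled here. The function name `chance_odds` is illustrative.

```python
def chance_odds(overlap_score, alphabet_size=4):
    """Odds (1 in N) of a perfect overlap of this length arising by chance.

    Assumes independent, equally likely bases; this is a simplification,
    not FAKII's actual scoring function.
    """
    return alphabet_size ** overlap_score


print(chance_odds(10))  # 1048576
```

Each extra point of overlap score therefore makes a chance overlap four times less likely under this model.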
+
+The "Distribution limit". It is further
+required that the distribution of differences along the alignment of
+an overlap not be highly skewed but spread across the alignment. The
+distribution score of an alignment is the minimum over all segments of
+the alignment of the probability that one would see the observed
+number of differences in that segment given an underlying error
+process occurring at rate "Error limit". This probability should not be too
+small, as if it is, it implies there is a segment of the alignment
+that has an unusually large number of differences in it. Note that
+this is quite conservative as we are assuming the error process is
+at the maximum error rate (and not the average error rate). We recommend 
+using a value of .0001 or less.
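One way to read the distribution score is as a minimum of binomial tail probabilities over every segment of the alignment. The Python sketch below follows that reading. It is an assumption made for illustration, not FAKII's implementation: the exact statistic FAKII uses is not given here, and the names `binomial_tail` and `distribution_score` are invented for the sketch.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def distribution_score(diffs, error_limit):
    """Minimum, over all segments, of the chance of seeing at least the
    observed number of differences in that segment.

    'diffs' holds one entry per alignment column: 1 for a difference,
    0 for a match. Brute force over all segments, so only suitable for
    short alignments.
    """
    n = len(diffs)
    prefix = [0]
    for d in diffs:
        prefix.append(prefix[-1] + d)
    score = 1.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            k = prefix[j] - prefix[i]  # differences in columns i..j-1
            score = min(score, binomial_tail(j - i, k, error_limit))
    return score


clustered = [0] * 20 + [1] * 4 + [0] * 20
spread = [0] * 44
for column in (5, 16, 27, 38):
    spread[column] = 1
print(distribution_score(clustered, 0.1), distribution_score(spread, 0.1))
```

Both alignments contain four differences in 44 columns, but the clustered one scores far lower because the run of four adjacent differences is very unlikely at a 10% error rate.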
+
+The "Error limit" and "Distribution limit" parameters control
+the efficiency with which overlaps are detected. The smaller the
+error limit or the higher the distribution limit, the less time
+overlap detection will take. By far the most important of these two
+efficiency parameters in Version 4.1 is the "Error limit". Note that
+both are not "thresholds", but only "limits": the graph creation function 
+guarantees to find all overlaps inside the error limit and distribution
+limit, but may report additional overlaps as well. On the other hand,
+the overlap threshold is a true threshold: any overlap not scoring
+above it, i.e., that is not statistically significant enough, will
+not be entered into the overlap graph. 
+
+One should set these three parameters to the most lenient/inclusive values
+that one thinks
+will ever be needed for proper assembly, moderated by the level of
+efficiency with which the computation can be done. Philosophically, our view 
+is that overlap detection is a one-time
+computation in which one determines all the possible ways that the
+fragments could go together. Later, during assembly, one
+can select a more stringent subset of the overlaps with which to meld
+fragments. With regard to efficiency, it should be noted that there
+are significant changes in performance as "Error limit" crosses the levels
+.05 and .10. Thus our recommendation is to use .099 as a standard setting.
+
+The graph creation routine creates a binary file in the 
+directory specified in the "Destination directory" entrybox. The name of this
+file is defined in the .gaprc file. 
+_fxref(Conf-Introduction, Options Menu, configure)
+The default name is "graph.bin". In addition, any output from this routine is
+written to a file "graph_stderr" which is in the destination directory. 
+This information is also displayed in the text output window. The graph 
+binary file may be used as input to the standalone programs, "show_graph"
+and "assemble".
+
+The FAKII assembly program supports the use of a constraints file. This file
+is generated automatically by setting the "Use constraint file" radiobutton
+to "Yes". A binary and ascii version of this file are written to the 
+destination directory. The names of these files can be specified in the 
+.gaprc file and their default values are "constraint.bin" and 
+"constraint.ascii". The binary version of the constraints file may be used
+with the "assemble" stand alone program via the "-c" option.
+
+Readings which are on the same template are constrained by both distance and 
+orientation. The template name is defined in the experiment
+file by the TN line (_fpref(Formats-Exp, Experiment File, formats)). If this
+does not exist, the EN or alternatively the ID 
+line is used. If none of these have been defined, the template is deemed to 
+be "unknown". The orientation is determined from the primer information (PR). 
+If no PR line is defined, the primer type is guessed from the strand (ST) 
+information. The template length is given as a range in the SI line. Forward 
+and reverse primer readings must lie at the beginning and end of the template 
+respectively and therefore must be separated by the template length. Custom 
+primers may lie anywhere on the template. 
+
+The final phase is that of assembly which is based on the graph and the 
+constraints file, if one was created. Several alternative assemblies may be
+produced from a single set of input parameters. These different assemblies
+may be distinguished by setting the "Assembly number". Setting this to 1 will
+produce the best assembly. Setting it to 2 will produce the 2nd best
+assembly, etc. 
+
+The assembly takes place over a
+subset of the edges in the overlap graph determined by three
+floating point parameters as follows:
+
+The "Error rate". The distribution
+score of each edge in the overlap graph will be computed assuming an
+error process at the specified rate. Edges will then be eliminated
+if their distribution score/probability is below "Distribution threshold".
+
+The "Overlap threshold". Specifies the minimum
+overlap score for edges to be considered in assemblies. Setting this
+parameter to 0.0 guarantees that no edges are eliminated on this
+basis.
+
+The "Distribution threshold". Specifies the minimum error distribution score 
+for edges to be considered in assemblies. Any edge in the overlap graph
+whose distribution score with respect to error rate "Error rate" is less
+than "Distribution threshold" is eliminated from consideration as regards 
+melding fragments. Setting this parameter to 1.0 eliminates all edges, and
+setting it to 0.0 eliminates none.
+
+The destination directory defines where the output files will be written. If
+the directory does not already exist, it is created. 
+
+The assembly routine creates a binary file "assem.bin" in the destination
+directory. In addition, any output from this routine is written to a file
+"assemble_stderr" also in the destination directory. The assembly binary file
+may be used as input to the "show_layout", "show_multi" and "write_exp_file" 
+stand alone programs.
+
+It is possible to view the final assembly in two ways using the "Show layout" 
+and "Show multi-alignment" check buttons. 
+
+Show layout produces a "stick diagram" of an assembly in which the 
+arrangement of fragments in each contig of an assembly is shown by depicting 
+each fragment as a line with an arrowhead at one end or the other to indicate 
+its orientation. (Details as for the show_layout command)
+
+ at example
+ at group
+*** CONTIG 1 (Score = 3480.32): 
+ 
+           0.2K      0.4K      0.6K      0.8K      1.0K 
+              |         |         |         |         | 
+ 1:  --------->      ---------------------> <----------. 
+ 2:   <-----------------  <--------------------+-------. 
+ 3:     <--------------   <--------------------- <-----. 
+ 4:      <----------        ------------------->-------. 
+ 5:                         --------------->      -----. 
+ 6:                               ------------->         
+ 7:                          ---------------->           
+ 8:                                 <----------------    
+ 
+ 1:  xb54f3.s1:   1  xb66a6.s1: 322 xb60c11.s1: 793 
+ 2:  xb66e3.r1:  38  xb60e9.s1: 435 xb63f10.s1: 852 
+ 3: xb57h12.s1:  72  xc04a1.r1: 435  xb66f8.s1: 884 
+ 4:  xb61e3.s1:  85  xb64b3.s1: 470  xb56b6.s1: 874 
+ 5: xb54b12.s1: 463  xb58f4.s1: 919 
+ 6:  xb64a1.s1: 600 
+ 7:  xb66a5.s1: 481 
+ 8:  xb60f4.s1: 622 
+ 
+ at end group
+ at end example
+
+ at example
+ at group
+
+          
+ 1: .----                                                
+ 2: .---->                                               
+ 3: .-----                                               
+ 4: .-------->                                           
+ 5: .---->                                               
+ 6:  ----->                                              
+ 
+ 1: xb60c11.s1: 793 
+ 2: xb63f10.s1: 852 
+ 3:  xb66f8.s1: 884 
+ 4:  xb56b6.s1: 874 
+ 5:  xb58f4.s1: 919 
+ 6: xb62d10.s1:1007 
+ 
+ at end group
+ at end example
+
+
+Show multi-alignment prints a multi-alignment of each contig of an assembly
+along with the consensus sequence. (Details as for the show_multi command).
+
+ at example
+ at group
+
+*** CONTIG 1 (Score = 3480.32): 
+ 
+ xb54f3.s1>: CTNTNAAAAGGCGTTGGATTNGTACGTTTCGACAAAAAAGACGAAGCTGA 
+ xb66e3.r1<:                                      AAGACGAAGCTGA 
+             -------------------------------------------------- 
+             CTnTnAAAAGGCGTTGGATTnGTACGTTTCGACAAAAAAGACGAAGCTGA 
+ 
+ xb54f3.s1>: GTGTTGCAATTAAAACACTAAATGGAAGTATTCCATCAGGATGTTCAGAG 
+ xb66e3.r1<: -TGTTGCAATTAAAACACTAAATGGAAGTATTCCATCAGGATGTTCAGAG 
+xb57h12.s1<:                      ATGGAAGTATTCCATCAGGATGTTCAGAG 
+ xb61e3.s1<:                                   ATCAGGATGTTCAGAG 
+             -------------------------------------------------- 
+             gTGTTGCAATTAAAACACTAAATGGAAGTATTCCATCAGGATGTTCAGAG 
+ 
+ xb54f3.s1>: CAAATCACAGTGAAATTCGCAAATAATCCAGCAAGTAACAATCCGAAAGG 
+ xb66e3.r1<: CAAATCACAGTGAAATTCGCAAATAATCCAGCAAGTAACAATCCGAAAGG 
+xb57h12.s1<: CAAATCACAGTGAAATTCGCAAATAATCCAGCAAGTAACAATCCGAAAGG 
+ xb61e3.s1<: CAAATCACAGTGAAATTCGCAAATAATCCAGCAAGTAACAATCCGAAAGG 
+             -------------------------------------------------- 
+             CAAATCACAGTGAAATTCGCAAATAATCCAGCAAGTAACAATCCGAAAGG 
+ 
+ at end group
+ at end example
+
+ at node Assembly-Import FAKII assembly
+ at subsection Import FAKII assembly
+ at cindex Assembly: import FAKII 
+ at cindex FAKII assembly: import
+
+This mode imports the aligned sequences produced after FAKII assembly into
+Gap4 and maintains the same alignment. It takes data from
+the directory containing the assembly binary file (default
+name "assem.bin"), ie the destination directory used in "Perform FAKII 
+assembly". A single contig may be entered, all the contigs or a file or list
+of contig numbers. Note that the contig numbers are those defined by
+FAKII and not by Gap4. The assembly information for each reading is 
+extracted from the assembly binary file and new experiment files are created
+in the same directory as the assembly binary file (ie that defined in "Directory
+containing assembly"). If the original experiment files are accessible (ie in
+the directory in which the Gap4 program is being run), the new experiment files
+will incorporate information from the original experiment files. If the
+original files are not available, the new experiment files produced will 
+contain only limited information. Once the new experiment files have been
+created, these are read into Gap4 in a manner which is functionally equivalent
+to "Directed assembly". 
+_oxref(Assembly-Directed, Directed Assembly). 
+Readings from the selected contigs which are
+not entered are written to a "list" or "file" specified in the "Save failures"
+entry box.
+
+ at node Assembly-Perform and import FAKII assembly
+ at subsection Perform and import FAKII assembly
+ at cindex Assembly: perform and import FAKII 
+ at cindex FAKII assembly: perform and import
+
+This mode performs both the assembly _oref(Assembly-Perform FAKII
+assembly, Perform FAKII assembly) and the import _oref(Assembly-Import
+FAKII assembly, Import FAKII assembly) routines together. The assembled 
+readings are written to the destination directory and then are automatically
+imported from this directory into the Gap4 database.
+
+
+
diff --git a/manual/fij-t.texi b/manual/fij-t.texi
new file mode 100644
index 0000000..eba783c
--- /dev/null
+++ b/manual/fij-t.texi
@@ -0,0 +1,186 @@
+ at cindex Find internal joins
+ at cindex joining contigs
+ at cindex contig joining
+ at cindex hidden data
+ at cindex overlap finding
+ at cindex finding overlaps
+ at cindex finding joins
+ at cindex masking
+ at cindex marking
+
+The purpose of this function (which is invoked from the gap4 View menu)
+is to use sequences already in the database
+to find possible joins between contigs.  Generally these will be joins
+that were missed or judged to be unsafe during assembly and this
+function allows users to examine the overlaps and decide if they should
+be made. During assembly joins may have been missed because of poor
+data, or not been made because the sequence was repetitive.  Also it may
+be possible to find potential joins by extending the consensus sequences
+with the data from the 3' ends of readings which was considered to be
+too unreliable to align during assembly i.e. we can search in the
+"hidden data".
+
+If it has not already occurred, use of this function will automatically
+transform the Contig Selector into the Contig Comparator.  Each match
+found is plotted as a diagonal line in the Contig Comparator, and is
+written as an alignment in the Output Window. The length of the diagonal
+line is proportional to the length of the aligned region. If the match
+is for two contigs in the same orientation the diagonal will be parallel
+to the main diagonal, if they are not in the same orientation the line
+will be perpendicular to
+the main diagonal. The matches displayed in the Contig Comparator can be
+used to invoke the Join Editor (_fpref(Editor-Joining, The Join Editor,
+contig_editor)) 
+or Contig Editor.  _fxref(Editor,
+Editing in gap4, contig_editor) 
+Alternatively, the "Next" button at the top left of the Contig
+Comparator can be used to select each result in turn, starting with the
+best, and ending with the worst. When this is in use, users can find the 
+match in the Contig Comparator which corresponds to the next result by
+placing the cursor over the Next button. The plotted match and the contigs
+involved will turn white.
+
+_lpicture(comparator,5.325in)
+
+A typical display from the Contig Comparator is shown in the figure
+above. 
+
+To define the match all numbering is relative to base number one in the
+contig: matches to the left (i.e.  in the hidden data) have negative
+positions, matches off the right end of the contig (i.e. in the hidden
+data) have positions greater than that of the contig length.  The
+convention for reporting the positions of overlaps is as follows: if
+neither contig needs to be complemented the positions are as shown.  If
+the program says "contig x in the - sense" then the positions shown
+assume contig x has been complemented. For example, in the results given
+below the positions for the first overlap are as reported, but those for
+the second assume that the contig in the minus sense (i.e. 443) has been
+complemented.
+
+ at example
+Possible join between contig   445 in the + sense and contig   405
+Percentage mismatch after alignment =  4.9
+       412        422        432        442        452        462
+    405  TTTCCCGACT GGAAAGCGGG CAGTGAGCGC AACGCAATTA ATGTGAG,TT AGCTCACTCA
+          ::::::::: : ::::::::  ::::: ::: :::::::::: :::::::::: ::::::::::
+    445  *TTCCCGACT G,AAAGCGGG TAGTGA,CGC AACGCAATTA ATGTGAG*TT AGCTCACTCA
+      -127       -117       -107        -97        -87        -77
+       472        482        492        502        512
+    405  TTAGGCACCC CAGGCTTTAC ACTTTATGCT TCCGGCTCGT AT
+         :::::::::: :::::::::: :::::::::: :::::::::: ::
+    445  TTAGGCACCC CAGGCTTTAC ACTTTATGCT TCCGGCTCGT AT
+       -67        -57        -47        -37        -27
+Possible join between contig   443 in the - sense and contig   423
+Percentage mismatch after alignment = 10.4
+        64         74         84         94        104        114
+    423  ATCGAAGAAA GAAAAGGAGG AGAAGATGAT TTTAAAAATG AAACG*CGAT GTCAGATGGG
+         :::: ::::: :::::::::: :::::::::: ::::::  :: ::::: :::: :::::::::
+    443  ATCG,AGAAA GAAAAGGAGG AGAAGATGAT TTTAAA,,TG AAACGACGAT GTCAGATGG,
+      3610       3620       3630       3640       3650       3660
+       124        134        144        154        164
+    423  TTG*ATGAAG TAGAAGTAGG AG*AGGTGGA AGAGAAGAGA GTGGGA
+         ::: :::::: :::::::::: :: :::::::  ::: ::::: :: ::
+    443  TTGGATGAAG TAGAAGTAGG AGGAGGTGGA ,GAG,AGAGA GTTGG*
+      3670       3680       3690       3700       3710
+ at end example
+
+_split()
+ at node FIJ-Dialogue
+ at subsection Find Internal Joins Dialogue
+ at cindex Find internal joins: dialogue
+
+_picture(fij.dialogue,3.35833in)
+
+The contigs to use in the search can be defined as "all contigs", a list
+of contigs in a file "file", or a list of contigs in a list "list".
+If "file" or "list" is selected the browse button is activated
+and gives access to file or list browsers.
+Two types of search can be selected: one, "Probe all against all"
+compares all the contigs defined against one another; the other "Probe
+with single contig", compares one contig against all the contigs in the
+list. If this option is selected the Contig identifier panel in the
+dialogue box is ungreyed. Both senses of the sequences are compared.
+
+
+If users elect not to "Use standard consensus" they can either "Mark
+active tags" or "Mask active tags", in which cases the "Select tags"
+button will be activated. Clicking on this button will bring up a check
+box dialogue to enable the user to select the tag types they wish to
+activate. Masking the active tags means that all segments covered by
+tags that are "active" will not be used by the matching algorithms.
+A typical
+use of this mode is to avoid finding matches in segments covered by tags
+of type ALUS (ie segments thought to be Alu sequence)
+or REPT (ie segments that are known to be repeated elsewhere in
+the data) (_fpref(Anno-Types, Tag types, tags)). "Marking" is of less use:
+matches will be found in marked
+segments during searching, but in the alignment shown
+in the Output Window, marked segments will be shown in lower case.
+
+Some alignments may be very large. For speed and ease of scrolling
+Gap4 does not display the textual form of the longest alignments,
+although they are still visible within the contig comparator
+window. The maximum length of the alignment to print up is controlled
+by the ``Maximum alignment length to list (bp)'' control.
+
+The default setting for the consensus
+is to "Use hidden data" which means that where possible the
+contigs are extended using the poor quality data from the readings near
+their ends. To ensure that this additional data is not so poor that
+matches will be missed, the program uses algorithms which can be configured
+from the "Edit hidden data parameters" dialogue. Two algorithms are available.
+Both slide a window along the reading until a set criterion is met.
+By default an algorithm which sums confidence values within the window is used.
+It stops when a window whose average confidence is below "Minimum average
+confidence" is found. The other
+algorithm counts the number of uncalled bases in the window and stops when
+the total reaches "Max number of uncalled bases in window".
+The selected algorithm is applied to all the readings near the ends of contigs
+and the data that extends the contig the furthest is added to its consensus
+sequence. 
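+
+The two window tests described above can be sketched as follows. This is
+only an illustrative Python sketch, not the Gap4 implementation: the function
+names, the choice of returning the start of the first failing window, and
+the treatment of "N" and "-" as uncalled bases are all assumptions.
+
```python
def extent_by_confidence(conf, window, min_avg):
    """Return how many hidden bases are kept before the first window whose
    average confidence drops below min_avg (hypothetical stopping rule)."""
    for start in range(len(conf) - window + 1):
        if sum(conf[start:start + window]) / window < min_avg:
            return start
    # No window failed the test, so all of the hidden data is kept.
    return len(conf)


def extent_by_uncalled(bases, window, max_uncalled):
    """Return how many hidden bases are kept before the first window that
    contains max_uncalled or more uncalled bases."""
    for start in range(len(bases) - window + 1):
        # Assumption: "N" and "-" stand for uncalled bases.
        uncalled = sum(1 for b in bases[start:start + window] if b in "Nn-")
        if uncalled >= max_uncalled:
            return start
    return len(bases)
```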
+
+If your total consensus sequence length (including a 20 character header for
+each contig that is used internally by the program) plus any hidden data 
+at the ends of contigs is greater than the current value of a parameter 
+called maxseq, Find Internal Joins may produce an error message advising 
+you to increase maxseq. Maxseq can be set on the command line
+(_fpref(Gap4-Cline, Command line arguments, gap4)) or by using the options
+menu (_fpref(Conf-Set Maxseq, Set Maxseq, configure)).
+
+The search algorithms first find matching words of length "Word length",
+and only considers overlaps of length at least "Minimum overlap". Only
+alignments better than "Maximum percent mismatches" will be reported.
+
+There are two search algorithms: "Sensitive" or "Quick". The quick algorithm
+should be applied first, and then the sensitive one employed
+to find any less obvious
+overlaps. 
+
+The sensitive algorithm sums the lengths of
+the matching words of length "Word length" on each diagonal. It then finds
+the centre of gravity of the most significant diagonals. Significant diagonals
+are those whose probability of occurrence is < "Diagonal threshold". It then
+uses a dynamic programming algorithm to align around the centre of gravity,
+using a band size of "Alignment band size (percent)". For example: if the 
+overlap was 1000 bases long and the percentage set at 5, the aligner would 
+only consider alignments within 50 bases either side of the centre of gravity.
+Obviously the larger the percentage and the overlap, the slower the alignment.
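+
+The band-size arithmetic in the example above can be written out as a short
+Python sketch. The function names are hypothetical, and the cell count is
+only a rough illustration of why a wider band is slower; it is not taken
+from the Gap4 source.
+
```python
def band_half_width(overlap_length, band_percent):
    """Number of bases considered either side of the centre of gravity."""
    return int(overlap_length * band_percent / 100)


def banded_cells(overlap_length, band_percent):
    """Rough count of dynamic programming cells inside the band
    (an assumption made for illustration only)."""
    half = band_half_width(overlap_length, band_percent)
    return overlap_length * (2 * half + 1)


# The example from the text: a 1000 base overlap with the percentage set at 5.
print(band_half_width(1000, 5))
print(banded_cells(1000, 5), "cells in the band versus", 1000 * 1000, "unbanded")
```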
+
+The quick algorithm can find overlaps and align 100,000 base sequences in a
+few seconds by considering, in its initial phase, only matching segments of
+length "Minimum initial match length". However it does a dynamic programming
+alignment of all the chunks between the matching segments, and so produces an
+optimal alignment. Again a banded dynamic algorithm can be selected, but as
+this only applies to the chunks between matching segments, which for good
+alignments will be very short, it should make little difference to the speed.
+
+After the search the results will be sorted so that the best matches
+are at the top of a list where best is defined as a combination of
+alignment length and alignment percent identity (in some earlier Gap4
+releases this was scored purely on percent identity). This list can be
+stepped through, one result at a time using the Contig Joining Editor,
+by clicking on the "Next" button at the top left of the Contig
+Comparator.
+
+ at cindex error messages: find internal joins
+ at cindex error messages: maxseq
+ at cindex maxseq: find internal joins
diff --git a/manual/fij.dialogue.png b/manual/fij.dialogue.png
new file mode 100644
index 0000000..8e53c0a
Binary files /dev/null and b/manual/fij.dialogue.png differ
diff --git a/manual/filebrowser-t.texi b/manual/filebrowser-t.texi
new file mode 100644
index 0000000..be78612
--- /dev/null
+++ b/manual/filebrowser-t.texi
@@ -0,0 +1,106 @@
+ at cindex File browser
+ at menu
+ at ifset html
+* FB-Introduction::             Introduction
+ at end ifset
+* FB-DirFiles::                 Directories and files
+* FB-Filter::                   Filters
+ at ifset standalone
+* Index::			Index
+ at end ifset
+ at end menu
+
+_split()
+ at ifset html
+ at node FB-Introduction
+ at unnumberedsubsec Introduction
+ at end ifset
+ at cindex File browser: introduction
+
+The file browser is a dialogue that allows the user to select 
+files from any directory. It is typically used when choosing a file for a
+particular action, such as opening a database in gap or saving a
+trace file in trev. The precise details of the layout may change
+depending on this context. In some circumstances, such as loading sequences into spin,
+several files may be selected (in which case the dialogue will be titled
+"Open multiple files"), in others only a single file can be selected.
+The illustration below shows the file browser
+as displayed when opening files from within trev. The
+ at samp{Formats} and @samp{Filter} sections here are used to select
+different file types. These dialogue components may not appear in all
+file browsers.
+
+_picture(filebrowser,2.75in)
+
+The @samp{OK}, @samp{Filter} and @samp{Cancel} buttons perform their usual
+tasks; @samp{OK} accepts the file currently shown in the selection component,
+and @samp{Cancel} quits the dialogue.
+
+_split()
+ at node FB-DirFiles
+ at subsection Directories and Files
+ at cindex Directories: file browser
+ at cindex Files: file browser
+ at cindex File browser: directories
+ at cindex File browser: files
+
+The main component of the file browser dialogue consists of two scroll
+lists placed side by side. The left list is labelled "Directory" and
+shows a list of other directories to choose from. The right list is the
+list of files in the currently displayed directory.
+
+Double clicking with the left mouse button on a directory updates the
+file list. If a filter file browser component is visible then the
+current directory will be displayed as the start of the filter. The
+directory named ".." is the parent directory of the current directory.
+
+Single clicking on a file name updates the Selection component. Double
+clicking on a file name chooses this file and removes the dialogue. That
+is, it is equivalent to single clicking on the file to update the
+selection followed by pressing the @samp{OK} button. 
+
+When the dialogue
+allows multiple files to be selected (then titled "Open multiple files")
+holding down the Ctrl key will retain items already chosen.
+
+ at node FB-Filter
+ at subsection Filters
+ at cindex File browser: filters
+ at cindex Filters: file browser
+
+The file browser shown in the introduction has a Filter component
+at the top. This consists of a text entry window
+containing a string of the form @i{directory_name}/@i{file_pattern}.
+
+Some dialogues may not have a Filter component. In these cases the
+ at i{file_pattern} is taken to be "@code{*}". Hence all files will be listed.
+The @i{directory_name} is the directory that the current list of files
+is contained within. The @i{file_pattern} is used to specify which of
+the files within this directory should be listed in the file list. The
+pattern uses the same form as UNIX shell wild card matching. The
+following table summarises this with some simple examples.
+
+ at table @code
+ at item *
+Every filename
+ at item xb*
+Every filename starting with "xb"
+ at item a*b
+Every filename starting with "a" and ending with "b"
+ at item a?b
+Every three-letter filename with "a" as the first letter and "b" as the
+last letter.
+ at item *scf
+Every filename ending in "scf"
+ at item *[sS]cf
+Every filename ending with "scf" or "Scf".
+ at item *@{scf,SCF@}
+Every filename ending with "scf" or "SCF".
+ at end table
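+
+Most of the patterns in the table can be tried out with Python's standard
+fnmatch module, which follows the same shell wild card rules for "*", "?"
+and "[...]". This is only a sketch for experimenting with patterns; the
+file names are invented, and fnmatch does not understand the brace form
+shown in the last table entry, so that pattern is not included.
+
```python
from fnmatch import fnmatchcase


def filter_files(names, pattern):
    """Return the names that match a shell-style wild card pattern
    (case-sensitive, as on UNIX)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical directory listing used only for this illustration.
names = ["xb54a3.scf", "xb54a3.Scf", "ab", "acb", "a.seq", "trace.SCF"]

for pattern in ["*", "xb*", "a*b", "a?b", "*scf", "*[sS]cf"]:
    print(pattern, filter_files(names, pattern))
```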
+
+ at cindex File browser: formats
+ at cindex Formats: file browser
+
+Some file browsers also include a Formats component. This is used to
+select the input or output format of the selected file. Updating the
+format will typically also update the filter.
diff --git a/manual/filebrowser.png b/manual/filebrowser.png
new file mode 100644
index 0000000..936baf7
Binary files /dev/null and b/manual/filebrowser.png differ
diff --git a/manual/filebrowser.texi b/manual/filebrowser.texi
new file mode 100644
index 0000000..847dd0a
--- /dev/null
+++ b/manual/filebrowser.texi
@@ -0,0 +1,55 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename filebrowser.info
+ at settitle File Browser
+ at setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+include(header.m4)
+
+ at c Experiment with smaller amounts of whitespace between chapters
+ at c and sections.
+ at tex
+\global\chapheadingskip = 15pt plus 4pt minus 2pt 
+\global\secheadingskip = 12pt plus 3pt minus 2pt
+\global\subsecheadingskip = 9pt plus 2pt minus 2pt
+ at end tex
+
+ at c Experiment with smaller amounts of whitespace between paragraphs in
+ at c the 8.5 by 11 inch format.
+ at tex
+\global\parskip 6pt plus 1pt
+ at end tex
+
+ at titlepage
+ at title File Browser
+ at subtitle 
+ at author 
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at set standalone
+ at node Top
+ at ifinfo
+ at top top-filebrowser
+ at end ifinfo
+
+ at raisesections
+_include(filebrowser-t.texi)
+
+_split()
+ at node Index
+ at unnumberedsec Index
+ at printindex cp
+ at lowersections
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/find_oligo-t.texi b/manual/find_oligo-t.texi
new file mode 100644
index 0000000..8f26c5f
--- /dev/null
+++ b/manual/find_oligo-t.texi
@@ -0,0 +1,70 @@
+ at cindex Find oligos
+ at cindex Oligo search
+ at cindex Find sequences
+ at cindex Sequence Search
+ at cindex string search
+
+The purpose of this function 
+(which is available from the __prog__ View menu)
+is to find matches between the consensus
+sequence and short segments of sequence defined by the user.
+The segments of sequence (or "strings") can be typed into the dialogue
+provided or can be the sequences covered by consensus tag types 
+(_fpref(Anno-Types, Tag types, tags))
+selected by the user. The latter mode hence provides a way of checking
+to see if a tagged segment of the sequence occurs elsewhere in the
+consensus. The function was previously known as "Find Oligos".
+
+_picture(find_oligo_pic,3.375in)
+
+Users can elect to search against a "single" contig, "all contigs",
+or a subset of contigs defined in a list (_fpref(Lists, Lists, lists))
+or a file. If "file" or
+"list" is selected the browse button is activated and gives access to
+file or list browsers. If they choose to analyse a single contig the
+dialogue concerned with selecting the contig and the region to search
+becomes activated.
+
+Both strands of the consensus are scanned using a very simple algorithm:
+insertions and deletions are not allowed, but mismatches are.
+The "Minimum percent match" defines the smallest percentage match which will
+be reported by the algorithm. A value of 75 means that at least 75% of
+the bases must match the target sequence. 
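+
+The matching rule described above (both strands, mismatches allowed, no
+insertions or deletions) can be sketched in Python as follows. This is a
+minimal illustration and not the Gap4 code: the function names, the
+zero-based positions and the treatment of U as T are assumptions.
+
```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (U is treated as T)."""
    return seq.upper().replace("U", "T").translate(_COMPLEMENT)[::-1]


def find_matches(consensus, query, min_percent_match):
    """Return (position, strand, percent match) for every ungapped placement
    of the query that reaches the minimum percentage match."""
    consensus = consensus.upper()
    forward = query.upper().replace("U", "T")
    hits = []
    if not forward:
        return hits
    for strand, q in (("+", forward), ("-", reverse_complement(forward))):
        for pos in range(len(consensus) - len(q) + 1):
            # Count identical bases position by position; no gaps are allowed.
            same = sum(a == b for a, b in zip(consensus[pos:pos + len(q)], q))
            percent = 100.0 * same / len(q)
            if percent >= min_percent_match:
                hits.append((pos, strand, percent))
    return hits
```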
+
+The user can elect to use tags or to specify their own sequences for
+the search. Selecting "Use tags" will activate the "Select tags" browse
+button. Clicking on this button will bring up a check box dialogue to
+enable the user to select the tag types they wish to activate. 
+Alternatively selecting "Enter sequence" will activate a text entry box
+and the user can enter a string of characters. Only the characters ACGTU
+are allowed and there is no limit to the length of the string.
+
+If it has not already occurred, selection of this function will automatically
+transform the Contig Selector into the Contig Comparator. 
+_fxref(Contig Comparator, Contig Comparator, comparator)
+Each match found is plotted as a diagonal line in the Contig Comparator. 
+The length of the diagonal line is proportional to the length of the
+search string. Self matches from the tag search are not reported. 
+
+If the match between the search string and the contig is in the same 
+orientation, the diagonal match line will be parallel to the main diagonal, 
+otherwise the line will be perpendicular to the main diagonal. Matches found
+between a tag and a contig can be used to invoke the Join Editor 
+(_fpref(Editor-Joining,
+The Join Editor, contig_editor)) or Contig Editors (_fpref(Editor,
+Editing in __prog__, contig_editor)).
+Matches between a specified sequence and a contig will only invoke the
+Contig Editor. All of the matches found are displayed in the Output Window e.g.
+
+ at example
+ at group
+Match found between tag on contig 315 in the + sense and contig 495
+ Percentage mismatch  16.7
+              957       967       977       987       997
+            315 CATAAGGATTTCCAATATTTTATTCCAGTTGGGCATCCTAGT
+                 ::  ::::::::::: :::::::::::::::::: ::::  
+            495 GATTGGGATTTCCAATGTTTTATTCCAGTTGGGCACCCTAAG
+                2        12        22        32        42
+ at end group
+ at end example
+
diff --git a/manual/find_oligo_pic.png b/manual/find_oligo_pic.png
new file mode 100644
index 0000000..0a02516
Binary files /dev/null and b/manual/find_oligo_pic.png differ
diff --git a/manual/find_renz.1.texi b/manual/find_renz.1.texi
new file mode 100644
index 0000000..0e8e562
--- /dev/null
+++ b/manual/find_renz.1.texi
@@ -0,0 +1,38 @@
+ at cindex find_renz: man page
+ at unnumberedsec NAME
+
+find_renz --- Identifies the position of a cut site within a sequence
+
+ at unnumberedsec SYNOPSIS
+
+ at code{find_renz} [@code{-vp}] @i{enzyme} @i{filename} ...
+
+ at unnumberedsec DESCRIPTION
+
+ at code{find_renz} may be used to determine the position at which an enzyme cuts a
+sequence. Its use as a command line utility is primarily designed for
+internal use within @code{pregap4} and as a user utility for producing
+ at i{vector-primer} files for use with @code{vector_clip}. As such it is
+dedicated to finding one and only one such cut site and considers no cut
+sites or multiple cut sites to be an error.
+
+Only one enzyme may be specified, which is given by the enzyme name (upper or
+lower case is not important). One or more filenames may be specified. If an
+enzyme does not cut a sequence the message "Enzyme not found in sequence" will 
+be sent to stderr. If an enzyme cuts a sequence more than once the message
+"Found more than one match" will be sent to stderr. Otherwise output is
+produced to stdout. This means that wildcards may be used (@code{find_renz -vp 
+smai *.seq >> vpfile}) with the output redirected without needing to consider
+whether the enzyme is suitable for all files matching the wildcard pattern.
+
+ at unnumberedsec OPTIONS
+ at table @asis
+ at item @code{-vp}
+    Specifies that the output should be in a format suitable for saving to a
+    vector-primer file (to use with vector_clip). Without this only the cut
+    site position is listed.
+ at end table
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Man-vector_clip, vector_clip(1), vector_clip.1)
diff --git a/manual/formats-t.texi b/manual/formats-t.texi
new file mode 100644
index 0000000..8db2bf2
--- /dev/null
+++ b/manual/formats-t.texi
@@ -0,0 +1,96 @@
+ at menu
+ at ifset html
+* Formats-Introduction::        Introduction
+ at end ifset
+* Formats-Scf::                 SCF
+* Formats-Exp::                 Experiment File
+* Formats-Restriction::         Restriction Enzymes
+* Formats-Vector_Primer::       Vector_primer Files
+* Formats-Vector-Sequences::    Vector Sequence Files
+ at end menu
+
+_split()
+ at ifset html
+ at node Formats-Introduction
+ at unnumberedsec Introduction
+ at end ifset
+
+ at cindex Reading name restrictions
+ at cindex File name restrictions
+ at cindex SCF file name restrictions
+ at cindex Experiment file name restrictions
+ at cindex Sample name restrictions
+ at cindex Restrictions on file names
+ at cindex Restrictions on experiment file names
+ at cindex Restrictions on SCF file names
+ at cindex Restrictions on reading names
+ at cindex Restrictions on sample names
+
+
+This section introduces the various file formats used by the
+package, but first we describe some limitations on the names of files.
+ 
+There are restrictions on the characters used in
+file names and the length of the file names.
+ 
+Characters permitted in file names:
+ 
+QWERTYUIOPASDFGHJKLZXCVBNMqwertyuiopasdfghjklzxcvbnm1234567890._-
+ 
+A reading name or experiment file name used in a sequence assembly project
+must not be longer than 16 characters.
+ 
+These restrictions also apply to SCF files which means, in turn, also to
+the names given to samples obtained from sequencing instruments. For example
+do not give sample names such as 27/OCT/96/r.1 when using an ABI machine:
+the / symbols will be interpreted as directory name separators on UNIX!
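+
+The restrictions above can be checked before naming samples or files. The
+following Python sketch is not part of the package; the function name and
+the wording of the messages are assumptions made for illustration.
+
```python
import string

# The characters listed above: letters, digits, full stop, underscore, hyphen.
ALLOWED = set(string.ascii_letters + string.digits + "._-")
MAX_NAME_LENGTH = 16


def name_problems(name):
    """Return a list of reasons why a reading or experiment file name
    breaks the restrictions; an empty list means the name is acceptable."""
    problems = []
    if not name:
        problems.append("name is empty")
    bad = sorted(set(name) - ALLOWED)
    if bad:
        problems.append("characters not permitted: " + "".join(bad))
    if len(name) > MAX_NAME_LENGTH:
        problems.append("longer than %d characters" % MAX_NAME_LENGTH)
    return problems


print(name_problems("xb54a3.s1"))
print(name_problems("27/OCT/96/r.1"))
```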
+ 
+
+Currently the formats used by the package include the following.
+
+ at menu
+* Formats-Scf::                 SCF
+* Formats-Ztr::                 ZTR
+* Formats-Exp::                 Experiment File
+* Formats-Restriction::         Restriction Enzymes
+* Formats-Vector_Primer::       Vector_primer Files
+* Formats-Vector-Sequences::    Vector Sequence Files
+ at end menu
+
+_split()
+_include(scf-t.texi)
+
+_split()
+_include(ztr-t.texi)
+
+ at page
+_split()
+_include(exp-t.texi)
+
+ at page
+_split()
+_include(renzymes-t.texi)
+
+ at page
+_split()
+_include(vector_primer-t.texi)
+
+ at page
+_split()
+ at node Formats-Vector-Sequences
+ at section Vector Sequence Format
+ at cindex format: vector sequences
+ at cindex vector sequences format
+ at cindex plain text
+
+Sequences such as vectors or E. coli which are compared against readings using
+vector_clip
+(_fpref(Vector_Clip-Introduction, Vector_clip,t))
+and screen_seq
+(_fpref(Screen_seq, Screening for known possible contaminant
+sequences, screening)), usually via pregap4
+(_fpref(Pregap4-Introduction,Pregap4, pregap4)), must be stored as plain text,
+i.e. the files should contain only the sequence data (no header or title)
+on records (lines) of up to 60 characters. Each record should be terminated
+by a newline character. No other characters should appear in the file.
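+
+A file in this layout can be produced from any sequence string with a few
+lines of Python. This is a sketch only; the function name is hypothetical
+and the input is assumed to contain nothing but sequence characters and
+whitespace.
+
```python
def format_vector_sequence(sequence, width=60):
    """Return the sequence as plain text records of at most `width`
    characters, each terminated by a newline, with no header line."""
    seq = "".join(sequence.split())  # drop any existing whitespace
    return "".join(seq[i:i + width] + "\n" for i in range(0, len(seq), width))


# Usage: write the result to a file to be read by vector_clip or screen_seq.
text = format_vector_sequence("ACGT" * 40)
print(text, end="")
```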
+
diff --git a/manual/formats.texi b/manual/formats.texi
new file mode 100644
index 0000000..b87457f
--- /dev/null
+++ b/manual/formats.texi
@@ -0,0 +1,41 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename formats.info
+ at settitle File Formats
+ at setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title File Formats
+ at subtitle 
+ at author 
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-formats
+ at end ifinfo
+
+ at raisesections
+_include(formats-t.texi)
+
+_split()
+ at node Index
+ at unnumberedsec Index
+ at printindex cp
+ at lowersections
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/gap4-t.texi b/manual/gap4-t.texi
new file mode 100644
index 0000000..eeb29f5
--- /dev/null
+++ b/manual/gap4-t.texi
@@ -0,0 +1,294 @@
+_define(_gap4)
+
+_include(gap4_org-t.texi)
+_include(gap4_mini-t.texi)
+_include(gap4_intro-t.texi)
+ at page
+_split()
+ at node Contig Selector
+ at chapter Contig Selector
+_include(contig_selector-t.texi)
+
+ at page
+_split()
+ at node Contig Comparator
+ at chapter Contig Comparator
+_include(comparator-t.texi)
+
+ at page
+_split()
+ at node Contig-Overviews
+ at chapter Contig Overviews
+_include(template-t.texi)
+
+ at page
+_split()
+ at node Quality
+ at section Plotting Consensus Quality
+_include(quality_plot-t.texi)
+
+ at page
+_split()
+ at node Stops
+ at section Plotting Stop Codons
+_include(stops-t.texi)
+
+ at page
+_split()
+ at node Restrict
+ at section Plotting Restriction Enzymes
+_include(restrict_enzymes-t.texi)
+
+ at page
+_split()
+ at node Editor
+ at chapter Editing in Gap4
+_include(contig_editor-t.texi)
+
+ at page
+_split()
+ at node Assembly
+ at chapter Assembling and Adding Readings to a Database
+_include(assembly-t.texi)
+
+
+ at page
+_split()
+ at node Ordering-and-Joining
+ at chapter Ordering and Joining Contigs
+_include(contig_ordering-t.texi)
+
+ at page
+_split()
+ at node Read Pairs
+ at section Find Read Pairs
+_include(read_pairs-t.texi)
+
+ at page
+_split()
+ at node FIJ
+ at section Find Internal Joins
+_include(fij-t.texi)
+
+ at page
+_split()
+ at node Repeats
+ at section Find Repeats
+_include(repeats-t.texi)
+
+ at page
+_split()
+_include(disassembly-t.texi)
+
+ at page
+_split()
+_include(exp_suggest-t.texi)
+
+ at page
+_split()
+ at node Calculate Consensus
+ at chapter Calculating Consensus Sequences
+_include(calc_consensus-t.texi)
+
+ at page
+_split()
+ at node gap4-misc
+ at chapter Miscellaneous functions
+ at menu
+* Complement::    Complement a contig
+* Enter Tags::    Entering Files of Tags into a Database
+* Shuffle Pads:: Shuffle Pads
+* Show Relationships:: List Reading and Contig Information
+* Find Oligos::   Sequence Search
+* Auto Clipping:: Automatic Clipping by Quality and Sequence Similarity
+* Contig Navigation:: Navigate to contig regions from file
+ at end menu
+
+_split()
+_include(complement-t.texi)
+
+ at page
+_split()
+ at node Show Relationships
+ at section Show Relationships
+_include(show_rel-t.texi)
+
+ at page
+_split()
+ at node Contig Navigation
+ at section Contig Navigation
+_include(contig_navigation-t.texi)
+
+ at page
+_split()
+ at node Find Oligos
+ at section Sequence Search
+_include(find_oligo-t.texi)
+
+ at page
+_split()
+ at node Extract Readings
+ at section Extract Readings
+_include(extract-t.texi)
+
+ at page
+_split()
+ at node Auto Clipping
+ at section Automatic Clipping by Quality and Sequence Similarity
+_include(clip-t.texi)
+
+ at page
+_split()
+ at node Results
+ at chapter Results Manager
+_include(results-t.texi)
+
+ at page
+_split()
+ at node Lists
+ at chapter Lists
+_include(lists-t.texi)
+
+ at page
+_split()
+ at node Notes
+ at chapter Notes
+_include(notes-t.texi)
+
+ at page
+_split()
+ at node GapDB
+ at chapter Gap4 Database Files
+_include(gap_database-t.texi)
+
+ at page
+_split()
+ at node Copy Readings
+ at chapter Copy Readings
+_include(copy_reads-t.texi)
+
+ at page
+_split()
+ at node Check Database
+ at chapter Check Database
+_include(check_db-t.texi)
+
+ at page
+_split()
+ at node Doctor Database
+ at chapter Doctor Database
+_include(doctor_db-t.texi)
+
+ at page
+_split()
+ at node Conf
+ at chapter Configuring
+_include(configure-t.texi)
+
+
+_split()
+ at node Gap4-Cline
+ at chapter Command Line Arguments
+ at cindex Command line arguments
+
+ at table @code
+ at cindex -bitsize 
+ at cindex bitsize (command line option)
+ at cindex 64-bit Gap4 databases
+ at item -bitsize
+Specifies whether the database file size is 32-bit or
+64-bit. Because signed numbers are used in
+places, and the number of records in a database is restricted to 32 bits
+(even when using @code{-bitsize 64} for 64-bit file offsets),
+the practical limits are a 2Gb file size for @code{-bitsize 32} and
+around 100 million sequences for @code{-bitsize 64}. 
+
+Gap4 only needs this option for creating new databases. The bit-size
+of existing databases is automatically detected when they are opened.
+
+Databases produced in 64-bit format are not compatible with older
+versions of Gap4, but old and newly created 32-bit databases still work with
+the 64-bit Gap4 (and are maintained in 32-bit format so editing them
+will not invalidate their use by older Gap4s). The @code{copy_db}
+program (_fpref(Man-copy_db, Copy_db, manpages)) can be used to
+convert file formats.
+
+ at sp 1
+ at cindex -maxdb
+ at cindex maxdb (command line option)
+ at item -maxdb
+Specifies the maximum number of readings plus contigs. This value is not
+automatically adjusted whilst the program is running, but is not allowed to be
+set to a value too small for the database to be opened. It controls the size
+of some areas of memory (approximately @code{16*maxdb} bytes) used during
+execution of gap. The default value is @code{8000}.
+ at sp 1
+ at cindex -maxseq
+ at cindex maxseq (command line option)
+ at item -maxseq
+Specifies the maximum number of characters used in the concatenated consensus
+sequences. This parameter is generally not required as the value is normally
+computed and adjusted automatically. However a few functions (such as
+assembly) still need to know a maximum size beforehand. The default is
+ at code{100000} bases.
+ at sp 1
+ at item -ro
+ at itemx -read_only
+ at cindex -read_only
+ at cindex read_only (command line option)
+Opens the database (if specified on the command line) in read only mode. This
+does not apply to databases opened using the file browser.
+ at sp 1
+ at cindex -check
+ at cindex -nocheck
+ at cindex nocheck (command line option)
+ at cindex check (command line option)
+ at item -check
+ at itemx -no_check
+Specifies whether to run the "Check Database" option when opening new
+databases. @code{-check} forces this to always be done and @code{-no_check}
+forces it to never be done. By default Check Database is always performed when
+opening databases in read-write mode and never performed when opening in
+read-only mode.
+ at sp 1
+ at item -exec_notes
+ at itemx -no_exec_notes
+ at cindex -exec_notes
+ at cindex -no_exec_notes
+ at cindex security
+Controls whether to search for and execute any Notes of type
+ at code{OPEN} or @code{CLOS}. This may be an important security measure
+if you are using foreign databases. Gap4 defaults to -no_exec_notes.
+ at sp 1
+ at item -rawdata_note
+ at itemx -no_rawdata_note
+ at cindex -rawdata_note
+ at cindex -no_rawdata_note
+Controls whether to make use of the @code{RAWD} note type for
+specifying the trace file search path. Defaults to -rawdata_note.
+ at sp 1
+ at item -csel
+ at itemx -no_csel
+ at cindex -csel
+ at cindex -no_csel
+Controls whether to automatically start up the contig selector when
+opening a new gap4 database. In some cases (such as when dealing with
+many EST clusters each in their own contig) the contig selector is not
+a practical tool; this simply offers a way of speeding up database
+opening. Defaults to -csel.
+ at sp 1
+ at item --
+Treat this as the last command line option. Only useful if the database name
+is specified and the name starts with a minus character (not
+recommended!).
+ at end table
+
+_ifdef([[_unix]],[[
+ at page
+_split()
+ at node Convert
+ at chapter Converting Old Databases
+_include(convert-t.texi)
+]])
+
+_undefine(_gap4)
diff --git a/manual/gap4.texi b/manual/gap4.texi
new file mode 100644
index 0000000..cd3f4b3
--- /dev/null
+++ b/manual/gap4.texi
@@ -0,0 +1,44 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename gap4.info
+ at setcontentsaftertitlepage
+ at setshortcontentsaftertitlepage
+ at settitle Gap4
+ at setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+define(`__prog__',`gap4')
+define(`__Prog__',`Gap4')
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title Gap4
+ at subtitle 
+ at author 
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-gap4
+ at end ifinfo
+
+_include(gap4-t.texi)
+
+_split()
+ at node Index
+ at unnumbered Index
+ at printindex cp
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/gap4_intro-t.texi b/manual/gap4_intro-t.texi
new file mode 100644
index 0000000..22fa1f8
--- /dev/null
+++ b/manual/gap4_intro-t.texi
@@ -0,0 +1,300 @@
+ at page
+_split()
+ at node Gap-Intro-Menus
+ at section Gap4 Menus
+
+
+The main window for gap4 contains File, Edit, View, Options, Experiments,
+Lists and Assembly menus. 
+
+ at node Gap-Intro-Menus-File
+ at subsection Gap4 File menu
+
+The File menu includes database opening and
+copying functions and consensus calculation options. 
+
+ at itemize @bullet
+ at item Change Directory (_fpref(GapDB-Directories, Directories, Directories))
+ at item Check Database (_fpref(Check Database, Check Database, check_db))
+ at item New (_fpref(GapDB-New, Opening a New Database, newdb))
+ at item Open (_fpref(GapDB-Existing, Opening an Existing Database, exist))
+ at item Copy Database (_fpref(GapDB-CopyDatabase, Making Backups of Databases,db))
+ at item Copy Readings (_fpref(Copy Reads, Copying Readings,copy_reads))
+ at item Save Consensus (_fpref(Con-Calculation, The Consensus Calculation,calc_consensus))
+ at item Extract Readings (_fpref(Extract Readings, Extract Readings, ex))
+ at end itemize
+
+ at node Gap-Intro-Menus-Edit
+ at subsection Gap4 Edit menu
+The Edit menu
+contains options that alter the contents of the database.
+
+ at itemize @bullet
+ at item Edit Contig (_fpref(Editor, Editor introduction, contig_editor))
+ at item Join Contigs (_fpref(Editor-Joining, Editor joining, contig_editor))
+ at item Save Contig Order (_fpref(Order-Contigs, Order Contigs, contig_ordering))
+ at item Break Contig (_fpref(Break Contig, Break Contig, disassembly))
+ at item Complement a Contig (_fpref(Complement, Complement a Contig, c))
+ at item Order Contigs (_fpref(Order-Contigs, Order Contigs, contig_ordering))
+ at item Quality Clip (_fpref(Clip-Quality, Quality Clipping, c))
+ at item Quality Clip Ends (_fpref(Clip-QClipEnds, Quality Clip Ends, c))
+ at item Difference Clip (_fpref(Clip-Difference, Difference Clipping, d))
+ at item N-Base Clip (_fpref(Clip-NBases, N-Base Clipping, c))
+ at item Double Strand (_fpref(Double Strand, Double Strand, exp_suggest))
+ at item Disassemble Readings (_fpref(Break Contig, Break Contig, disassembly))
+ at item Enter Tags (_fpref(Enter Tags, Enter Tags, complement))
+ at item Edit Notebooks (_fpref(Notes, Notes, notes))
+ at item Doctor Database (_fpref(Doctor Database, Doctor database, doctor_db))
+ at end itemize
+
+ at node Gap-Intro-Menus-View
+ at subsection Gap4 View menu
+
+The View menu contains options to look at the data at several levels of
+detail, and analytic functions which present their results graphically.
+
+ at itemize @bullet
+ at item Contig Selector (_fpref(Contig Selector, Contig Selector,contig_selector))
+ at item Results Manager (_fpref(Results, Results Manager, results))
+ at item Find Internal Joins (_fpref(FIJ, Find Internal Joins, fij))
+ at item Find Read Pairs (_fpref(Read Pairs, Find Read Pairs, read_pairs))
+ at item Find Repeats (_fpref(Repeats, Find repeats, repeats))
+ at item Check Assembly (_fpref(Check Assembly, Check Assembly, check_ass))
+ at item Sequence Search (_fpref(Find Oligos, Find Oligos, find_oligo))
+ at item Template Display (_fpref(Template-Display, Template Display, template))
+ at item Show Relationships (_fpref(Show Relationships, Show Relationships, show_rel))
+ at item Restriction Enzyme map
+(_fpref(Restrict, Restriction Enzyme Search, restrict_enzymes))
+ at item Stop Codon Map (_fpref(Stops, Stop Codon Map, stops))
+ at item Quality Plot (_fpref(Template-Quality, Quality Plot, template))
+ at item List Confidence (_fpref(Con-Evaluation, List Confidence, calc_consensus))
+ at item Reading Coverage Histogram (_fpref(Consistency-ReadingCov, Reading
+Coverage Histogram, consistency_display))
+ at item Read-Pair Coverage Histogram (_fpref(Consistency-ReadPairCov,
+Read-Pair Coverage Histogram, consistency_display))
+ at item Strand Coverage (_fpref(Consistency-StrandCov, Strand Coverage, consistency_display))
+ at item Confidence Values Graph (_fpref(Consistency-Confidence, Confidence
+Values Graph, consistency_display))
+ at end itemize
+
+ at node Gap-Intro-Menus-Options
+ at subsection Gap4 Options menu
+The Options menu contains options for configuring gap4.
+
+ at itemize @bullet
+ at item Consensus Algorithm
+(_fpref(Conf-Consensus Algorithm, Consensus Algorithm, configure))
+ at item Set Maxseq
+(_fpref(Conf-Set Maxseq, Set Maxseq, configure))
+ at item Set Fonts
+(_fpref(Conf-Fonts, Set Fonts, configure))
+_ifdef([[_unix]],[[@item Colours
+(_fpref(Conf-Colour, The Colour Configuration Window, configure))]])
+ at item Configure Menus
+(_fpref(Conf-Configure Menus, Configuring Menus, configure))
+ at item Set Genetic Code
+(_fpref(Conf-Set Genetic Code, Set Genetic Code, configure))
+ at item Alignment Scores
+(_fpref(Conf-Alignment Scores, Alignment Scores, configure))
+ at item Trace File Location
+(_fpref(Conf-Trace File Location, Trace File Location, configure))
+ at end itemize
+
+ at node Gap-Intro-Menus-Experiments
+ at subsection Gap4 Experiments menu
+
+The Experiments menu contains options to analyse the contigs and to
+suggest experimental solutions to problems.
+
+ at itemize @bullet
+ at item Suggest Long Readings (_fpref(Suggest Long, Suggest Long Readings, exp_suggest))
+ at item Suggest Primers (_fpref(Suggest Primers, Suggest Primers, exp_suggest))
+ at item Compressions and Stops
+(_fpref(Compressions, Compressions and Stops, exp_suggest))
+ at item Suggest Probes
+(_fpref(Suggest Probes, Suggest Probes, exp_suggest))
+ at end itemize
+
+ at node Gap-Intro-Menus-Lists
+ at subsection Gap4 Lists menu
+
+The Lists menu contains a set of options for creating and editing lists for
+use in various parts of the program.
+
+ at itemize @bullet
+ at item Creation and Editing 
+(_fpref(Lists, Lists Introduction, lists))
+ at item Contigs To Readings
+(_fpref(List-ContigToRead, Contigs To Readings Command, lists))
+ at item Minimal Coverage
+(_fpref(List-MinCoverage, Minimum Coverage, lists))
+ at item Unattached Readings
+(_fpref(List-Unattached, Unattached Readings, lists))
+ at item Highlight Readings List
+(_fpref(List-HighlightReadings, Highlight Readings List, lists))
+ at item Search Sequence Names
+(_fpref(List-SearchSequenceNames, Search Sequence Names, lists))
+ at item Search Template Names
+(_fpref(List-SearchTemplateNames, Search Template Names, lists))
+ at item Search Annotation Contents
+(_fpref(List-SearchAnnotations, Search Annotation Contents, lists))
+ at end itemize
+
+
+ at node Gap-Intro-Menus-Assembly
+ at subsection Gap4 Assembly menu
+The Assembly menu contains various assembly and data entry methods.
+
+ at itemize @bullet
+ at item Normal Shotgun Assembly
+(_fpref(Assembly-Shot, Normal Shotgun Assembly, assembly))
+ at item Directed Assembly (_fpref(Assembly-Directed, Directed Assembly,
+assembly))
+ at item Screen Only (_fpref(Assembly-Screen, Assembly Screen Only, assembly))
+ at item Assembly Independently
+(_fpref(Assembly-Ind, Assembly Independently, assembly))
+_ifdef([[_unix]],[[@item Cap2 Assembly (_fpref(Assembly-CAP2, Assembly CAP2, assembly))
+ at item Cap3 Assembly (_fpref(Assembly-CAP3, Assembly CAP3, assembly))
+ at item FAKII Assembly (_fpref(Assembly-FAKII, Assembly FAKII, assembly))
+ at item Phrap Assembly (_fpref(Assembly-Phrap Assemble, Phrap Assembly, assembly))
+]])@end itemize
+
+ at page
+_split()
+ at node Intro-Base-Acc
+ at section The use of numerical estimates of base calling accuracy
+
+ at cindex Base accuracies - use of
+ at cindex Confidence values - use of
+ at cindex Quality values - use of
+ at cindex Editing and base accuracies
+
+
+In this section we give an overview of our use, when available, of
+base call accuracy estimates or confidence values. We also explain
+the importance of the consensus calculations used by gap4, and their
+role in minimising the work needed to complete sequencing projects.
+
+We first put forward the idea of using numerical estimates of base
+calling accuracy in our paper describing SCF format 
+ at cite{Dear, S. and Staden, R., 1992. A standard file format for data from DNA
+sequencing instruments. DNA Sequence 3, 107-110} and then expanded on
+their use for editing and assembly in 
+ at cite{Bonfield,J.K. and Staden,R. The application of numerical estimates
+of base calling accuracy to DNA sequencing projects. Nucleic Acids
+Res. 23, 1406-1410 (1995)}.
+
+In Bonfield and Staden (1995), we stated 
+"...the most useful outcome of having a sequence reading determined by a
+computer-controlled instrument would be that each base was assigned a
+numerical estimate of its probability of having been called
+correctly... having numerical estimates of base accuracy is the key to
+further automation of data handling for sequencing projects. ... The
+simple procedure we propose in this paper is a method of using the
+numerical estimates of base calling accuracy to obviate much of the
+tedious and time consuming trace checking currently performed during a
+sequencing project. In summary we propose that the numerical estimates
+of base accuracy should be used by software to decide if conflicts
+between readings require human expertise to help adjudicate. We argue
+that if the accuracy estimates are reasonably reliable then the
+majority of conflicts can be ignored... and so the time taken to check
+and edit a contig will be greatly reduced." 
+
+This has been achieved by making the consensus calculations 
+(_fpref(Con-Calculation, The Consensus Calculation, calc_consensus))
+central to gap4, and by providing calculations which 
+make use of base call accuracy estimates to give each
+consensus base a quality measure. 
+The consensus is not stored in
+the gap4 database but is calculated when required by each function
+that needs it, and hence always takes into account the current data. 
+In the Contig Editor the consensus is updated instantly to reflect any
+change made by the user.
+
+In 1998 the first useable probability values became available through
+the program Phred
+(@i{Ewing, B. and Green, P.
+Base-Calling of Automated Sequencer Traces Using Phred. II. Error
+Probabilities. Genome Research. Vol 8 no 3. 186-194 (1998)}).
+Phred produces a confidence value that defines the probability that the
+base call is correct. This was an important step forward and
+these values are widely used and have defined a decibel type
+scale for base call confidence values. Gap4 is currently set to use 
+confidence values defined on this scale.
+
+The confidence value is given by the formula
+ at example
+     C_value = -10*log10(probability of error)
+ at end example
+
+A confidence value of 10 corresponds to an error rate of 1/10; 20 to
+1/100; 30 to 1/1000; and so on. Used with the main
+gap4 consensus algorithm, these values enable the production of a consensus
+sequence for which the expected error rate for each base is known.
+
+As is described elsewhere
+(_fpref(Con-Evaluation, List Consensus Confidence, calc_consensus)),
+being able to calculate the confidence for each base in the consensus
+sequence makes it possible to estimate the number of errors it contains,
+and hence the number of errors that will be removed if particular bases
+are checked and, if necessary, edited. 
+For example, if 1000 bases in the consensus had confidence
+20, we would expect those 1000 bases (with an error rate of 1/100) to
+contain 10 errors.
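+
+The conversion between confidence values and error probabilities, and
+the expected-error estimate described above, can be sketched as follows.
+This is an illustrative sketch only and is not taken from the gap4
+sources; the function names are hypothetical.
+
```python
import math


def confidence_from_error(p_error: float) -> float:
    """Decibel-type confidence value: C = -10 * log10(probability of error)."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be in the range (0, 1]")
    return -10.0 * math.log10(p_error)


def error_from_confidence(c_value: float) -> float:
    """Inverse of the formula above: probability of error for a confidence value."""
    return 10.0 ** (-c_value / 10.0)


def expected_errors(confidences) -> float:
    """Expected number of wrong bases: the sum of the per-base error probabilities."""
    return sum(error_from_confidence(c) for c in confidences)


if __name__ == "__main__":
    # 1000 consensus bases, each with confidence 20 (error rate 1/100).
    print(round(expected_errors([20] * 1000), 6))
```
+
+Summing the per-base error probabilities in this way also covers a
+consensus whose bases have differing confidence values.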
+
+Another program which produces decibel scale confidence values for ABI
+377 data is ATQA 
+ at cite{Daniel H. Wagner, Associates, at http://www.wagner.com/}.
+
+For gap4 the confidence values
+are expected to lie in the range 1 to 99, with 0 and 100
+having special meanings to the program.
+
+The confidence values are stored
+in SCF or Experiment files and copied into gap4 databases during assembly
+or data entry. 
+
+The searches provided by the Contig Editor
+(_fpref(Editor-Searching, Searching, contig_editor))
+are one of gap4's most important time saving features. The user
+selects a search type, for example to find places where the confidence
+for the consensus falls below a given threshold, and the search
+automatically moves the cursor to the next such position in the
+consensus. The Contig Editor locates the next
+problem by applying the consensus calculation 
+to the contig.
+To edit a contig the user selects
+"Search" repeatedly, knowing that it will 
+only move to places where there is a conflict
+between good data or where the data is poor.
+Note that the program is usually configured to automatically
+display the relevant traces for each position located by the search option.
+
+The main result is that far fewer disagreements
+between data are brought to the attention of the user and fewer traces
+have to be inspected by eye, and so the whole process is faster.
+Another consequence of the
+strategy is that, as fewer bases need changing to produce the correct
+consensus, most of what appears on the screen will be the original
+base calls. Indeed we have taken this a step further and suggest
+that if a base needs changing because it has a high accuracy estimate
+but conflicts with other good data, then rather than change the
+character shown on the screen, the user should lower its accuracy
+value. By so doing, more of the original base calls are left unchanged
+and hence are visible to the user. There is a function within the
+contig editor to reset the accuracy value for the current base to
+0. Alternatively the accuracy value for the base that is thought to be
+correct can be set within the contig editor to 100. 
+
+ at page
+_split()
+ at node Intro-Hidden
+ at section Use of the "hidden" poor quality data
+_include(hidden-t.texi)
+
+ at page
+_split()
+ at node Intro-Anno
+ at section Annotating and masking readings and contigs
+_include(tags-t.texi)
+
diff --git a/manual/gap4_mini-t.texi b/manual/gap4_mini-t.texi
new file mode 100644
index 0000000..62e5d90
--- /dev/null
+++ b/manual/gap4_mini-t.texi
@@ -0,0 +1,795 @@
+_split()
+ at node Gap4-Introduction
+ at chapter Introduction
+ at cindex Gap4
+ at menu
+* Gap-Intro-Files::         Summary of the Files used and the Preprocessing Steps
+* Gap-Intro-Funtions::          Summary of Gap4's Functions
+* Gap-Intro-Interface::         Introduction to the Gap4 User Interface
+* Gap-Intro-Interface-CS::      Introduction to the Gap4 Contig Selector
+* Gap-Intro-Interface-CC::      Introduction to the Gap4 Contig Comparator
+* Gap-Intro-Interface-TD::      Introduction to the Gap4 Template Display
+* Gap-Intro-Interface-CD::      Introduction to the Gap4 Consistency Display
+* Gap-Intro-Interface-RE::      Introduction to the Gap4 Restriction Map
+* Gap-Intro-Interface-SC::      Introduction to the Gap4 Stop Codon Map
+* Gap-Intro-Interface-CE::      Introduction to the Gap4 Contig Editor
+* Gap-Intro-Interface-CJ::      Introduction to the Gap4 Contig Joining Editor
+ at end menu
+
+
+Gap4 
+is a Genome Assembly Program.  
+The program contains all the tools that would be expected from an assembly
+program plus many unique features and a very easily used interface.
+The original version was described in
+ at cite{Bonfield,J.K., Smith,K.F. and
+Staden,R. A new DNA sequence assembly program. Nucleic Acids Res. 24,
+4992-4999 (1995)}.
+
+Gap4 is very big and powerful. Everybody employs a subset of options and
+has their favourite way of accessing and using them. Although there is a
+lot of it, users are encouraged to go through the whole of the documentation
+once, just to discover what is possible, and the way that best suits
+their own work. At the very least, the whole of this introductory
+chapter should be read, as in the long run, it will save time.
+
+This chapter serves as a cross reference point, to give an overview of the
+program and to introduce some of the important ideas which it uses. The
+main topics that are introduced are listed in the current section. We
+introduced the use of base call accuracy values for speeding up
+sequencing projects 
+(_fpref(Intro-Base-Acc, The use of numerical estimates of base
+calling accuracy, t)).
+The ability to annotate segments of readings and the consensus can be very
+convenient 
+(_fpref(Intro-Anno, Annotating and masking readings and contigs, t)).
+Generally the 3' ends of readings from sequencing instruments are of too
+low a quality to be used to create reliable consensus, but they can be
+useful, for example, for finding joins between contigs
+(_fpref(Intro-Hidden, Use of the "hidden" poor quality data, t)).
+
+One of the most powerful features of gap4 is its graphical user
+interface which enables the data to be viewed and manipulated at several
+levels of resolution. The displays which provide these different views are 
+introduced, with several screenshots
+(_fpref(Gap-Intro-Interface, Introduction to the gap4 User Interface, t)).
+
+It is important to understand the different files used by our
+sequence assembly software, and how the data is processed before it
+reaches gap4
+(_fpref(Gap-Intro-Files,
+Summary of the Files used and the Preprocessing Steps, t)).
+
+Note that gap4 is a very flexible program, and is designed so that it
+can easily be configured to suit different purposes and ways of
+working. For example it is easy to create a beginner's
+version of gap4 which has
+only a subset of functions. What is described in this manual is the full
+version, and so is likely to contain some perhaps more esoteric options
+that few people will need to use.
+This introductory section also contains a
+complete list of the options in the gap4 main menus
+(_fpref(Gap-Intro-Menus, Gap4 Menus, t)).
+
+In addition to sequence assembly, gap4 can be used for managing mutation
+study data and for helping to discover and check for mutations
+(_fpref(Mutation-Detection-Introduction, Introduction to Searching for Mutations, t)).
+
+Two further useful facilities of gap4 are "Lists" and "Notes".
+For many operations it is convenient to be able to process sets of data 
+together - for example to
+calculate a consensus sequence for a subset of the contigs. To
+facilitate this gap4 uses lists
+(_fpref(Lists, Lists Introduction, lists)).
+A `Note' 
+(_fpref(Notes, Notes, notes))
+is an arbitrary piece of text which can be attached to any
+reading, any contig, or to the
+database in general. 
+
+
+ at page
+_split()
+ at node Gap-Intro-Files
+ at section Summary of the Files used and the Preprocessing Steps
+
+ at cindex BUSY files
+ at cindex database write access
+ at cindex database readonly access
+ at cindex readonly
+ at cindex database limits
+ at cindex gap4 database limits: resetting
+ at cindex gap4 database sizes
+ at cindex gap4 database sizes: resetting
+ at cindex gap4 database: maxdb
+ at cindex gap4 database: maxseq
+ at cindex gap4 database: reading length limits
+ at cindex reading length limits in gap4
+ at cindex database: gap4 maxdb
+ at cindex database: gap4 maxseq
+ at cindex trace files: location
+ at cindex trace files: defining location
+ at cindex directories: trace files
+ at cindex gap4: viewing trace files
+
+
+
+
+
+ at cindex simultaneous database access
+
+Gap4 stores the data for an assembly project in a gap4
+database. Before being entered into the gap4 database the data must be
+passed through several preassembly steps, usually via pregap4
+(_fpref(Pregap4-Introduction, Pregap4 introduction, pregap4)). 
+These steps are outlined below.
+
+The programs can handle data produced by a variety of sequencing
+instruments. 
+They can also
+handle data entered using digitisers or that has been typed in by
+hand. Usually the trace files in proprietary format, such as
+those of ABI, are converted to SCF files (_fpref(Formats-Scf, SCF introduction,
+scf)) or ZTR files.
+As originally put forward in @cite{Bonfield,J.K. and Staden,R. The application of
+numerical estimates of base calling accuracy to DNA sequencing
+projects. Nucleic Acids Research 23, 1406-1410 (1995)}, gap4 makes
+important use of basecall confidence values
+(_fpref(Intro-Base-Acc, The use of numerical estimates of base
+calling accuracy, t)),
+which are normally stored in the reading's SCF file.
+
+One of the first steps in the preprocessing is to copy
+the base calls from the trace files 
+to text files known as Experiment files
+(_fpref(Formats-Exp, Experiment files, exp)). 
+All the subsequent processes operate on the Experiment files.
+Other preassembly steps include quality and vector clipping.
+Each step is performed by a specific program
+controlled by the program pregap4
+(_fpref(Pregap4-Introduction, Pregap4 introduction, pregap4)). 
+
+Experiment file format is similar to that of EMBL sequence entries in
+that each record starts with a two letter identifier, but we have
+invented new records specific to sequencing experiments. One of
+pregap4's tasks is to augment the Experiment files to include data about
+the vectors, primers and templates used in the production of each
+reading, and if necessary it can extract this information from external
+databases. Some of the information is needed by pregap4 and some by
+gap4. (Note that in order to get the most from gap4 it is essential to make
+sure that it is supplied, via the Experiment files, with all the information
+it needs.)
+
+The trace files are not altered, but are kept as archival data so that
+it is always possible to check the original base calls and traces. Any
+changes to the data prior to assembly
+(and we recommend that none are made until readings
+can be viewed aligned with others) are made to the copy of the sequence
+in the Experiment file.
+
+The reading data, in Experiment file format, is entered into the project
+database (_fpref(GapDB, Gap Database Files, gap4)), usually via one of
+the assembly engines. Because Experiment file format was based on EMBL
+file format, EMBL files can also be entered and their feature tables will
+be converted to tags.  There is no limit to the length of readings which
+can be entered.
+
+All the changes to
+the data made by gap4 are made to the copies of the data in the project
+database.  Once the data has been copied into the gap4 database the
+Experiment files are no longer required.
+
+Gap4 uses the trace files to display the traces 
+(_fpref(Editor-Traces, Traces, t)),
+and to compare the edited bases with the original base calls
+(_fpref(Editor-Search-VerifyEdit1, Search by Evidence for Edit (1), t)),
+(_fpref(Editor-Search-VerifyEdit2, Search by Evidence for Edit (2), t)).
+However gap4 databases do not store trace files: they record only the
+names of the trace files 
+(which are copied from the readings' Experiment files).
+This means that
+if the trace files for a project are not in the same directory/folder as
+the gap4 database, gap4 needs to be told where they are, otherwise it
+cannot use them. Ideally, all the trace files for a project should be stored
+in one directory. To tell gap4 where they are the "Trace file location"
+command in the Options menu should be used (_fpref(Conf-Trace File Location,
+Trace File Location,t)).
+
+Gap4 databases have a number of size constraints, some of which can be altered
+by users and others of which are fixed. 
+
+While gap4 is running it often needs to calculate a consensus. The maximum size
+of this sequence is controlled by a variable "maxseq". Most routines are able 
+to automatically increase the value of maxseq while they are running, but some 
+of the older functions, including some of the original assembly engines, are 
+not. This means that it is important for users to set maxseq to a sufficiently 
+high value before running these elderly routines. By default maxseq is 
+currently set to 100000, but users can set it on the command line or from 
+within the Options menu.
+
+Gap4 databases contain one record for each reading and one for each contig. 
+The sum of these two sets of records is the "database_size", and the maximum 
+value that database_size is permitted to reach is "maxdb". When databases are
+initialised maxdb is set, by default, to 8000. Users can alter this value on
+the command line or from within the Options menu of gap4.
+
+Gap4 databases also limit the number of readings and the length of their
+names so that various output routines know how many character positions
+are required: the maximum number of readings imposed in this way is
+99,999,999, and the maximum reading name length is 40 characters.
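+
+The size constraints described above can be summarised in a short
+sketch. It is illustrative only: the constant and function names are
+hypothetical and do not correspond to gap4's own code, and only the
+numeric values come from the text.
+
```python
# Illustrative only: names are hypothetical, values are those quoted in the text.
DEFAULT_MAXDB = 8000            # default maxdb when a database is initialised
MAX_READINGS = 99_999_999       # largest number of readings
MAX_READING_NAME_LENGTH = 40    # longest reading name


def database_size(n_readings: int, n_contigs: int) -> int:
    """One record for each reading plus one for each contig."""
    return n_readings + n_contigs


def check_limits(n_readings, n_contigs, reading_names=(), maxdb=DEFAULT_MAXDB):
    """Return a list of problems; an empty list means every limit is respected."""
    problems = []
    size = database_size(n_readings, n_contigs)
    if size > maxdb:
        problems.append(f"database_size {size} exceeds maxdb {maxdb}")
    if n_readings > MAX_READINGS:
        problems.append(f"{n_readings} readings exceeds {MAX_READINGS}")
    for name in reading_names:
        if len(name) > MAX_READING_NAME_LENGTH:
            problems.append(f"reading name too long: {name}")
    return problems


if __name__ == "__main__":
    # 7900 readings in 200 contigs need maxdb raised above the default.
    print(check_limits(7900, 200))
```
+
+A check of this kind shows when maxdb would need to be raised before
+more readings are entered.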
+
+Currently we have sites with single gap4 databases containing over 200,000
+readings with consensus sequences in excess of 7,000,000 bases.
+
+A gap4 database can be used by several users simultaneously, but only
+one is allowed to change the contents of the database, and the others
+are given "readonly" access. 
+As part of its mechanism to
+prevent more than one person editing a database at once
+gap4 uses a "BUSY" file
+to signify that the database is opened for writing.
+Before opening a database for
+writing, gap4 checks to see if the BUSY file for that database exists. 
+If it does, the database is
+opened only for reading; if not, gap4 creates the file, so that any
+additional attempts to open the
+database for writing will be blocked.
+When the user with write access closes the database, the BUSY file is
+deleted, so the database can once again be opened for writing.
+It is worth remembering that a side effect of this mechanism
+is that in the event of a
+program or system crash the BUSY file will be left on the disk, even
+though 
+the database is
+not being used. In this case users must remove the BUSY file 
+before using the database
+(_fpref(GapDB, Gap4 Database Files, t)).
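+
+The BUSY file mechanism described above can be sketched as follows.
+This is a minimal illustration under stated assumptions and is not
+gap4's implementation: the function names and the file name demo.BUSY
+are hypothetical, and the sketch uses a single atomic "create only if
+absent" call in place of a separate check followed by a create.
+
```python
import os
import tempfile


def open_database(busy_path: str) -> str:
    """Return "write" if the BUSY file was created here, otherwise "readonly"."""
    try:
        # O_EXCL makes creation fail if the file already exists, so the
        # existence check and the creation happen as one step.
        fd = os.open(busy_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return "readonly"
    os.close(fd)
    return "write"


def close_database(busy_path: str, mode: str) -> None:
    """Only the writer removes the BUSY file when it closes the database."""
    if mode == "write":
        os.remove(busy_path)


if __name__ == "__main__":
    busy = os.path.join(tempfile.mkdtemp(), "demo.BUSY")  # hypothetical name
    first = open_database(busy)    # first user obtains write access
    second = open_database(busy)   # later users get read-only access
    close_database(busy, first)    # writer closes, BUSY file is deleted
    print(first, second, open_database(busy))
```
+
+If the writer crashes before calling close_database, the BUSY file
+remains on disk and every later open is read-only until the file is
+removed by hand, which is the side effect noted above.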
+
+The final result from a sequencing project is a consensus sequence 
+(_fpref(Con-Calculation, The Consensus Calculation, calc_consensus))
+and
+gap4 can write it in Experiment file format, FASTA format or Staden
+format. Of course the whole database and all the trace files are also
+useful for future reference as they allow any queries about the accuracy
+of the sequence to be answered.
+
+ at page
+_split()
+ at node Gap-Intro-Funtions
+ at section Summary of Gap4's Functions
+
+The tasks which gap4 can perform can be roughly divided into 
+assembly
+(_fpref(Assembly, Assembly Introduction, assembly)), 
+finishing
+(_fpref(Experiments, Finishing Experiments, experiments)),
+and editing
+(_fpref(Editor, Editor introduction, contig_editor)).
+But gap4 contains many other functions which can help to complete a
+sequencing project with the minimum amount of effort, and some of these
+are listed below.
+
+Readings are entered into the gap4 database using the 
+assembly algorithms (_fpref(Assembly, Assembly Introduction, assembly)). 
+In general these algorithms will build the largest 
+contigs they can by finding overlaps between the readings; however some,
+perhaps more doubtful, 
+joins between contigs may be missed, and these can be discovered, checked 
+and made using 
+Find Internal Joins (_fpref(FIJ, Find Internal Joins, fij)),
+Find Repeats (_fpref(Repeats, Find repeats, repeats)) and
+Join Contigs (_fpref(Editor-Joining, The Join Editor, contig_editor)).
+Find Internal Joins compares the ends of contigs to see if there are
+possible overlaps and then presents the overlap in the Contig Joining
+Editor, from where the user can view the traces, make edits and join the
+contigs. Find Repeats can be used in a similar way, but unlike Find
+Internal Joins it does not require the matches it finds to continue to
+the ends of contigs.
+
+Read-pair data can be used to automatically put contigs into the
+correct order
+(_fpref(Order-Contigs, Ordering Contigs, contig_ordering)),
+and information about contigs which share templates can be plotted out
+(_fpref(Read Pairs, Find Read Pairs, read_pairs)).
+The relationships of readings and templates, within and between contigs
+can also be shown by the Template Display
+(_fpref(Template-Display, Template Display, template))
+which has a wide selection of display modes and uses.
+
+Problems with the assembly can be revealed by use of 
+Check Assembly (_fpref(Check Assembly, Checking Assemblies, check_ass)),
+Find Repeats (_fpref(Repeats, Find repeats, repeats)), and 
+Restriction Enzyme mapping
+(_fpref(Restrict, Plotting Restriction Enzymes, restrict_enzymes)).
+Check Assembly compares every reading with the segment of the consensus
+it overlaps to see how well it aligns. Those that align poorly are
+plotted out in the Contig Comparator. Find Repeats also presents its
+results in the Contig Comparator, so if used in conjunction with Check
+Assembly, it can show cases where readings have been assembled into
+the wrong copy of a repeated element. At the end of a project 
+the Restriction Enzyme map function can be used 
+to compare the consensus sequence with a restriction digest of
+the target sequence.
+Problems can also be found by use of the various Coverage Plots available in
+the Consistency Display
+(_fpref(Consistency-Display, Consistency Display, consistency)). These
+plots will show regions of low or high reading coverage
+(_fpref(Consistency-ReadingCov, Reading Coverage Histogram,
+consistency_display)),
+places with data for only one strand
+(_fpref(Consistency-StrandCov, Strand Coverage, consistency_display)),
+or where there is no read-pair coverage
+(_fpref(Consistency-ReadPairCov, Read-Pair Coverage Histogram, consistency_display)).
+Errors can be corrected by 
+Disassemble Readings (_fpref(Disassemble, Disassembling Readings, disassembly))
+and Break Contig (_fpref(Break Contig, Breaking Contigs, disassembly)) which
+can remove readings from contigs or databases or can break contigs.
+
+The general level of completeness of the consensus sequence
+can be seen diagrammatically using the 
+Quality Plot (_fpref(Template-Quality, Quality Plot, template)), and
+the confidence values for each base in the consensus sequence can be
+plotted (_fpref(Consistency-Confidence, Confidence
+Values Graph, consistency_display)).
+
+The most powerful component of gap4 is its Contig Editor 
+(_fpref(Editor, Editor introduction, contig_editor)),
+which has many
+display modes and search facilities to enable very rapid discovery and
+fixing of base call errors.
+
+If working on a protein coding sequence, the
+consensus can be analysed using the
+Stop Codon Map (_fpref(Stops, Stop Codon Map, stops)), and
+its translation viewed using the Contig Editor
+(_fpref(Editor-Status, Status Line, contig_editor)).
+
+The final result from a sequencing project is a consensus sequence 
+(_fpref(Con-Calculation, The Consensus Calculation, calc_consensus)).
+
+
+ at page
+_split()
+ at node Gap-Intro-Interface
+ at section Introduction to the gap4 User Interface
+
+Gap4 has a main window from which all the main options are selected from
+menus. When a database is open it also has a Contig Selector which will
+transform into a Contig Comparator whenever needed. In addition many of
+the gap4 functions, such as the Contig Editor or the Template Display
+will create their own windows when they are activated. All the graphical
+displays and the Contig Editor can be scrolled in register. The base of the
+graphical display
+windows usually contains an Information Line for showing short textual
+data about results or items touched by the mouse cursor. Gap4 is
+best operated using a three button mouse, but alternative keybindings
+are available. Full details of the user interface
+are described elsewhere
+(_fpref(UI-Introduction, User Interface, t)), and here we give an
+introduction based around a series of screenshots.
+
+The main window (shown below) contains an Output window for
+textual results, an Error window for error messages, and a series of
+menus arranged along the top. The contents of the two text windows can
+be searched, edited and saved. Each set of results is preceded by
+a header containing the time and date when it was generated.
+
+Some of the text will be underlined and shaded differently. These are
+hyperlinks which perform an operation when clicked on with the left mouse
+button, typically invoking a graphical display such as the contig
+editor. Clicking on these with the right mouse button will bring up a menu of
+additional operations. At present only a few commands (Show Relationships and 
+the Search functions) produce hypertext, but if there is sufficient interest
+this may be extended.
+
+_lpicture(interface.output,5.13333in)
+
+ at page
+_split()
+ at node Gap-Intro-Interface-CS
+ at subsection Introduction to the Contig Selector
+
+
+The gap4 Contig Selector is used to display, select and reorder contigs.
+In the Contig Selector all contigs
+are shown as colinear horizontal lines separated by short vertical
+lines. The length of the horizontal
+lines is proportional to the length of the contigs and their left to
+right order represents the current
+ordering of the contigs. Users can change the contig order by
+dragging the lines representing the contigs. This is done by clicking
+and holding the middle mouse button, or Alt plus the left mouse button, 
+on a line and then moving the mouse cursor.
+The Contig Selector can also be used to select
+contigs for processing. For example, clicking with the right mouse
+button on the line representing a
+contig will invoke a menu containing the
+commands which can be performed on that
+contig. 
+There are several alternative ways of specifying which contig an
+operation should be performed on. Contigs are identified by the name or
+number of any reading they contain. When a dialogue is requesting a
+contig name, using the left mouse button to click on the contig in the
+Contig Selector will
+transfer its name to the dialogue box. Other methods are available
+(_fpref(Contig-Selector-Contigs, Selecting Contigs, t)).
+
+As the mouse is moved over a contig, it is highlighted and the contig
+name (leftmost reading name) and length are displayed in the
+Information Line.
+The number in brackets is the contig number (actually the number of its
+leftmost reading).
+Tags or annotations
+(_fpref(Intro-Anno, Annotating and masking readings and contigs, t))
+can also be displayed in the
+Contig Selector window. 
+
+_picture(contig_selector,4.63333in)
+
+The figure shows a typical display from the Contig Selector. At the top
+are the File, View and Results menus. Below that are buttons for
+zooming and for displaying the crosshair. The four boxes to the
+right are used to display the X and Y coordinates of the crosshair. The
+rightmost two display the Y coordinates when the Contig Selector is
+transformed into the Contig Comparator.
+The two leftmost boxes display the X coordinates: the
+leftmost is the position in the contig and the other is the position in
+the overall consensus. The crosshair is the vertical line spanning the
+panel below. Tags are shown as coloured rectangles above and below the
+lines
+(_fpref(Contig Selector, Contig Selector, contig_selector)).
+
+ at page
+_split()
+ at node Gap-Intro-Interface-CC
+ at subsection Introduction to the Contig Comparator
+
+Gap4 commands such as Find Internal Joins (_fpref(FIJ, Find Internal
+Joins, fij)), Find Repeats (_fpref(Repeats, Find Repeats, repeats)),
+Check Assembly (_fpref(Check Assembly, Check Assembly, check_ass)), and
+Find Read Pairs (_fpref(Read Pairs, Find Read Pairs, read_pairs))
+automatically transform the Contig Selector (_fpref(Contig Selector,
+Contig Selector, contig_selector)) to produce the Contig Comparator.
+To
+produce this transformation a copy of the Contig Selector is added at
+right angles to the original window to create a two dimensional
+rectangular surface on which to display the results of comparing or
+checking contigs. 
+
+Each of the functions plots its results as diagonal
+lines of different colours.  In general, 
+if the plotted points are close to the main
+diagonal they represent results from pairs of contigs that are in the
+correct relative order.  Lines parallel to the main diagonal represent
+contigs that are in the correct relative orientation to one another.
+Those perpendicular to the main diagonal show results for which one
+contig would need to be reversed before the pair could be joined.  The
+manual contig dragging procedure can be used to change the relative
+positions of contigs.  _fxref(Contig-Selector-Order, Changing the Contig
+Order, contig_selector) As the contigs are dragged the plotted results
+will automatically be moved to their corresponding new positions.  This
+means that, in general, 
+if users drag the contigs to move their plotted results close
+to the main diagonal they will simultaneously be putting their contigs
+into the correct relative positions.
+
+This plot can simultaneously show the results of independent types of
+search, making it easy for users to see if different analyses produce
+corroborating evidence for the ordering of contigs. Indications that a
+reading may have been assembled in an incorrect position can also be
+seen - if for example a result from Check Assembly lies on the same
+horizontal or vertical projection as a result from Find Repeats, users
+can see the alternative position to place the doubtful reading.
+
+The plotted results can be used to invoke a subset of commands by the
+use of pop-up menus.
+For example if the user clicks the right mouse button over
+a result from Find Internal Joins a menu containing Invoke Join Editor
+(_fpref(Editor-Joining, The Join Editor, contig_editor)) and Invoke
+Contig Editors (_fpref(Editor, Editing in gap4, contig_editor))
+will pop up. If the user selects Invoke Join Editor the Join Editor will
+be started with the two contigs aligned at the match position contained
+in the result. If required one of the contigs will be complemented to
+allow their alignment. 
+
+ at page
+_lpicture(comparator,5.325in)
+
+A typical display from the Contig Comparator is shown above. It includes
+results for Find Internal Joins in black, Find Repeats in red, Check
+Assembly in green, and Find Read Pairs in blue. 
+Notice that there are several internal joins, read pairs and repeats
+close to the main diagonal near the top left of the display. This
+indicates that the contigs represented in that area are 
+likely to be in the correct positions relative to one another. In the
+middle of the bottom right quadrant there is a blue diagonal line
+perpendicular to the main diagonal. This indicates a pair of contigs
+that are in the wrong relative orientation. The crosshairs show the
+positions for a pair of 
+contigs. The vertical line continues into the Contig Selector part of
+the display, and the position represented by the horizontal line is also
+duplicated there
+(_fpref(Contig Comparator, Contig Comparator, comparator)).
+
+
+ at page
+_split()
+ at node Gap-Intro-Interface-TD
+ at subsection Introduction to the Template Display
+
+The Template Display can show schematic plots of readings, templates,
+tags, restriction enzyme sites and the consensus quality. Colour coding
+distinguishes reading, primer and template types. The Template Display
+can also be used to reorder contigs and to invoke the Contig Editor.
+
+An example showing all these information types can be seen in the Figure below.
+
+
+_lpicture(template.display,6in)
+
+The large top section contains lines and arrows representing readings
+and templates. Beneath this are rulers,
+one for each contig, and below those is the quality plot.
+The template and reading section of the display is in two parts. The top
+part contains the templates which have been sequenced from both ends but
+which are in some way inconsistent - for example given the current
+relative positions of their readings, they may have a length that is
+smaller or larger than that expected, or the two readings may, as it
+were, face away from one another. Colour coding is used to distinguish
+between different types of inconsistency, and whether or not the
+inconsistency involves readings within or between contigs. For example,
+most of the problems shown in the screendump above are coloured
+dark yellow, indicating an inconsistency between a pair of contigs.
+The rest of the data (mostly dark blue, indicating templates sequenced
+from only one end) is plotted below the data for the inconsistent
+templates.
+Forward readings are light blue and reverse readings are orange.
+Templates in bright yellow have been sequenced from both ends, are consistent and
+span a pair of contigs (and so indicate the relative orientation and
+separation of the contigs).
+
+At the bottom is the restriction enzyme plot.
+The coloured blocks immediately above and below the ruler are tags.
+Those above the ruler 
+can also be seen on their corresponding readings in the large top
+section. 
+The display can be zoomed. The position of a crosshair
+is shown in the two leftmost boxes in the top right hand corner.
+The leftmost
+shows the distance in bases between the crosshair and the start of the 
+contig
+underneath the crosshair. The middle box shows the distance between the
+crosshair and the start of the first contig. The right box shows the 
+distance
+between two selected cut sites in the restriction enzyme plots
+(_fpref(Template-Display, Template Display, template)).
+
+ at page
+_split()
+ at node Gap-Intro-Interface-CD
+ at subsection Introduction to the Consistency Display
+
+The Consistency Display provides plots designed to highlight 
+potential problems in contigs. It
+is invoked from the main gap4 View menu by selecting any of its plots. Once
+a plot has been displayed, any of the other types of consistency plot can
+be displayed within the same frame from the View menu of the Consistency
+Display. 
+
+An example showing the Confidence Values Graph and the corresponding Reading
+Coverage Histogram, Read-Pair Coverage Histogram and Strand Coverage
+is shown below.
+
+_lpicture(consistency_p,6in)
+
+If more than one contig is displayed, the contigs are
+drawn immediately after one another but are staggered in the y direction.
+
+The ruler ticks can be turned on or off from the View menu of the consistency
+display. 
+The plots can be enlarged or reduced using the standard zooming mechanism.
+_fxref(UI-Graphics-Zoom, Zooming, interface)
+
+The crosshair toggle button controls whether the crosshair is visible. The
+crosshair is drawn as a black vertical line and a black horizontal line. Its
+position is shown in the three boxes to the right of the
+crosshair toggle. The first box indicates the cursor position in the current
+contig. The second box indicates the overall position of the cursor in the
+consensus. The last box shows the y position of the crosshair
+(_fpref(Consistency-Display, Consistency Display, consistency)).
+
+
+ at page
+_split()
+ at node Gap-Intro-Interface-RE
+ at subsection Introduction to the Restriction Enzyme Map
+
+The restriction enzyme map function finds and displays restriction sites
+within a specified region of a contig. Users can select the enzyme
+types to search for and can save the sites found as tags within the
+database.
+
+_lpicture(restrict_enzymes,6in)
+
+This figure shows a typical view of the Restriction Enzyme Map
+in which the results for each enzyme type have been configured by the
+user to be drawn in different colours.  On the left of the display the
+enzyme names are shown adjacent to their rows of plotted results. If no
+result is found for a particular enzyme (e.g. APAI here), the row will
+still be shown so that zero cutters can be identified. Three of the
+enzyme types have been selected and are shown highlighted. The results
+can be scrolled vertically (and horizontally if the plot is zoomed in).
+A ruler is shown along the base and the current cursor position (the 
+vertical black line) is shown in the left hand box near the top right of
+the display.  If the user clicks, in turn, on two restriction sites
+their separation in base pairs will appear in the top right hand box.
+Information about the last site touched is shown in the Information line
+at the bottom of the display. At the top the Edit menu is shown
+and can be used to create tags for highlighted enzyme types
+(_fpref(Restrict, Restriction Enzyme Search, restrict_enzymes)).
+
+
+ at page
+_split()
+ at node Gap-Intro-Interface-SC
+ at subsection Introduction to the Stop Codon Map
+
+The Stop Codon Map plots the positions of all the stop codons on one or
+both strands of a contig consensus sequence.  If the Contig Editor is 
+being used on
+the same contig, the Refresh button will be enabled, and if used, will 
+fetch the
+current consensus from the editor, repeat the search and replot the stop
+codons.
+
+_lpicture(stops,6in)
+
+The figure shows a typical zoomed in view of the Stop Codon Map display.
+The positions for the stop codons in each reading frame (here all six
+frames are
+shown) are displayed in horizontal strips. Along the top are buttons for
+zooming, the crosshair toggle, a refresh
+button and two boxes for showing the crosshair position. The left box shows
+the current position and the right-hand box the separation of the last two
+stop codons selected by the user.  Below the display of stop codons is a
+ruler and a horizontal scrollbar. The information line is showing the data 
+for
+the last stop codon the user has touched with the cursor. Also shown on the
+left is the View menu which is used to select the 
+reading frames to display
+(_fpref(Stops, Stop Codon Map, stops)).
+
+
+ at page
+_split()
+ at node Gap-Intro-Interface-CE
+ at subsection Introduction to the Contig Editor
+
+The gap4 Contig Editor is designed to allow rapid checking and editing of
+characters in assembled readings. Very large savings in time can be achieved
+by its sophisticated problem finding procedures which automatically direct the
+user only to the bases that require attention.  The following is a selection of
+screenshots to give an overview of its use.
+
+_lpicture(contig_editor.screen,6in)
+
+The figure above shows a screendump from the Contig Editor
+which contains segments of aligned
+readings, their consensus and a six phase translation. The Commands menu
+is also shown.  The main components are: the controls at
+the top; reading names on the left; sequences to their right; and status lines
+at the bottom. Some of the reading names are written in light grey which
+indicates that their traces/chromatograms are being displayed (in
+another window, see below).
+
+One reading name is written with inverse colours, which indicates that it
+has been selected by the user. To the left of each reading name is the reading
+number, which is negative for readings which have been reversed and complemented.
+The first of the status lines, labelled "Strands", is showing a
+summary of strand coverage. The left half of the segment of sequence
+being displayed is covered
+only by readings from one strand of the DNA, but the right half contains data
+from both strands.
+
+Along the top of the editor window is a row of command buttons
+and menus. The rightmost pair of buttons provide help
+and exit.  To their left are two menus, one of which is currently in use.  To
+the left of these is a button which initially displays a search dialogue
+and, when pressed again, performs the selected search.
+Further left is the undo button:
+each time the user clicks on this box the program reverses the previous edit
+command.  The next button, labelled "Cutoffs", is used to toggle between
+showing or hiding the reading data that is of poor quality or is vector
+sequence. In this figure it has been activated, showing the poor quality
+data in light grey. Within this, sequencing vector is displayed in
+lilac. The next button to the left is the Edit Modes menu
+which allows users to select which editing commands are enabled. The
+next command toggles between insert and replace and so governs the effect of
+typing in the edit window.
+
+One of the readings contains a yellow tag, and elsewhere some bases are
+coloured red, which indicates they are of poor quality.  The Information Line
+at the bottom of the window can show 
+information about readings, annotations and
+base calls. In this case it is showing information about the reliability of
+the base beneath the editing cursor.
+
+_lpicture(contig_editor_grey_scale,6in)
+
+A better way of displaying the accuracy of bases is to shade their
+surroundings so that the lighter the background the better the data.
+In the figure above, this grey scale encoding of the base accuracy or
+confidence has been activated for bases in the readings and the
+consensus. This
+screenshot also shows the Contig Editor displaying disagreements and edits.
+Disagreements between the consensus and individual base calls are shown
+in dark green. Notice that these disagreements are in poor
+quality base calls. Edits (here they are all pads) are shown with a
+light green background. When they are present, replacements/insertions
+are shown in pink, deletions in red and confidence value changes in purple.
+The consensus confidence takes into account several factors, including
+individual base confidences, sequencing chemistry, and strand coverage.
+It can be seen that the consensus for 
+the section covered by data from only one strand has been calculated to
+be of lower confidence than the rest. The Status Line includes two
+positions marked with exclamation marks (!) which means that the
+sequence is covered by data from both strands, but that the consensus
+for each of the two strands is different.
+The Information Line at the bottom of the window is showing
+information about the reading under the cursor: its name, number,
+clipped length, full length, sequencing vector and BAC clone name.
+
+_lpicture(contig_editor.traces,6in)
+
+The Contig Editor can rapidly display the traces for any reading or set
+of readings. The number of rows and columns of traces 
+displayed can be set by the user. The traces scroll in register with one
+another, and with the cursor in the Contig Editor. Conversely, the
+Contig Editor cursor can be scrolled by the trace cursor. 
+A typical view is shown above.
+
+This figure is an example of the Trace Display showing three traces
+from readings in the previous two Contig Editor screendumps.
+These are the best two traces, one from each strand, plus a trace from a
+reading which contains a disagreement with the consensus. The program
+can be configured to automatically 
+bring up this combination of traces for each
+problem located by the "Next search" option.
+The histogram or vertical bars plotted top down show the confidence
+value for each base call. The reading number, together with the direction of
+the reading (+ or -) and the chemistry by which it was determined, is given at
+the top left of each sub window.  There are three buttons ('Info', 'Diff', and
+'Quit') arranged vertically with X and Y scale bars to their right. The Info
+button produces a window like the one shown in the bottom right hand
+corner. The Diff button is mostly used for mutation detection, and causes a
+pair of traces to be subtracted from one another and the result plotted, hence
+revealing their differences
+(_fpref(Editor-Traces, Traces, contig_editor)).
+
+ at page
+_split()
+ at node Gap-Intro-Interface-CJ
+ at subsection Introduction to the Contig Joining Editor
+
+Contigs are joined interactively using the Join Editor.
+This is simply a pair
+of contig editor displays stacked one above the other with a "differences"
+line in between. The Contig Join Editor is usually invoked by clicking
+on a Find Internal Joins or Find Repeats result in the Contig
+Comparator, in which case the two contigs will appear
+with the match found by these searches displayed.
+
+The few differences between the Join Editor and the Contig Editor can be seen
+in the figure below. Otherwise all the commands and operations are the
+same as those for the Contig Editor.
+
+_lpicture(contig_editor.join,6in)
+
+In this figure the Cutoff or Hidden data is being displayed for the
+right hand contig. One difference between the Contig Editor and the Join
+Editor is the Lock button. When set (as it is in the
+illustration) the two contigs scroll in register, otherwise they can be
+scrolled independently.
+
+The Align button aligns the overlapping consensus sequences
+(_fpref(Editor-Joining, Editor joining, contig_editor)).
+
diff --git a/manual/gap4_org-t.texi b/manual/gap4_org-t.texi
new file mode 100644
index 0000000..d215f8f
--- /dev/null
+++ b/manual/gap4_org-t.texi
@@ -0,0 +1,70 @@
+ at node Gap-Intro-Manual
+ at chapter Organisation of the gap4 Manual
+
+
+The main body of the gap4 manual is divided, where possible, 
+into sections covering related topics. If appropriate, these sections
+commence with an overview of the functions they contain.
+After the Introduction, the manual contains chapters on some important
+components of the user interface: the Contig Selector
+(_fpref(Contig Selector, Contig Selector, contig_selector)),
+the Contig Comparator
+(_fpref(Contig Comparator, Contig Comparator, comparator)),
+and then, in the chapter on Contig Overviews 
+(_fpref(Contig-Overviews, Contig Overviews, c))
+we describe the Template
+Display
+(_fpref(Template-Display, Template Display, template)),
+and its subcomponents
+the Stop Codon Plot
+(_fpref(Stops, Stop Codon Map, stops)), and the
+Restriction Enzyme Plot
+(_fpref(Restrict, Restriction Enzyme Search, restrict_enzymes)).
+
+Then there is a long chapter on the powerful Contig Editor
+(_fpref(Editor, Editor introduction, contig_editor)), followed by a
+chapter describing the many assembly engines and assembly modes which
+gap4 can offer
+(_fpref(Assembly, Assembly Introduction, assembly)).
+
+Gap4 contains functions to use the data in an assembly database to find the
+left to right order of contigs, and to compare their consensus sequences
+to look for joins that may have been missed during assembly.
+A "read-pair" is obtained by sequencing a DNA template (or "insert")
+from both ends: we then know the relative orientations of the two
+readings, and if we know the approximate
+template length, we know how far apart they
+should be after assembly. The next chapter is on the use of read-pair
+data for ordering contigs and checking assemblies and on the use of
+consensus comparisons for finding joins
+(_fpref(Ordering-and-Joining, Ordering and Joining Contigs, t)).
+
+
+The next chapter is on checking assemblies and removing readings
+(_fpref(Contig-Checking-and-Breaking, Checking Assemblies and Removing
+Readings, t)). The following chapter describes gap4's methods for
+suggesting experiments for helping to finish a sequencing project 
+(_fpref(Experiments, Finishing Experiments, experiments)). Then we
+describe the various consensus calculation algorithms, and the options
+for creating consensus sequence files
+(_fpref(Con-Calculation, The Consensus Calculation,
+calc_consensus)). Next is the description of a set of miscellaneous
+functions
+(_fpref(gap4-misc, Miscellaneous functions, t)), followed by chapters on
+the Results Manager
+(_fpref(Results, Results Manager, results)),
+Lists
+(_fpref(Lists, Lists Introduction, lists)),
+Notes
+(_fpref(Notes, Notes, notes)),
+Configuring gap4
+(_fpref(Conf-Introduction, Options Menu, configure)),
+gap4 Database Files
+(_fpref(GapDB, Gap Database Files, gap4)),
+_ifdef([[_unix]],[[Converting Old Databases
+(_fpref(Convert, Converting Old Databases, t)),
+]])Checking Databases for corruptions
+(_fpref(Check Database, Check Database, check_db))
+and Doctoring corrupted databases
+(_fpref(Doctor Database, Doctor database, doctor_db)).
+
diff --git a/manual/gap5-t.texi b/manual/gap5-t.texi
new file mode 100644
index 0000000..866fcc7
--- /dev/null
+++ b/manual/gap5-t.texi
@@ -0,0 +1,252 @@
+_define(_gap5)
+
+ at c __include(gap5_org-t.texi)
+ at c __include(gap5_mini-t.texi)
+ at c __include(gap5_intro-t.texi)
+
+ at page
+_split()
+ at node Gap5_DB
+ at chapter Gap5 Databases
+_include(gap5_check_db-t.texi)
+
+ at page
+_split()
+ at node Contig Selector and Comparator
+ at chapter Contig Selector / Comparator
+ at node Contig Selector
+ at section Contig Selector
+ at lowersections
+_include(contig_selector-t.texi)
+ at raisesections
+
+ at page
+_split()
+ at node Contig Comparator
+ at section Contig Comparator
+ at lowersections
+_include(comparator-t.texi)
+ at raisesections
+
+ at page
+_split()
+ at node Template Display
+ at chapter Template Display
+_include(gap5_template-t.texi)
+
+ at page
+_split()
+ at node Editor
+ at chapter Editing in Gap5
+_include(gap5_contig_editor-t.texi)
+ 
+ at page
+_split()
+ at node Restrict
+ at section Plotting Restriction Enzymes
+_include(restrict_enzymes-t.texi)
+
+ at page
+_split()
+ at chapter Importing and Exporting Data
+ at section Assembly
+ at node Assembly
+ at lowersections
+_include(gap5_assembly-t.texi)
+ at raisesections
+
+ at page
+_split
+_include(gap5_export-t.texi)
+ 
+ at c @page
+ at c __split()
+ at c @node Ordering-and-Joining
+ at c @chapter Ordering and Joining Contigs
+ at c __include(contig_ordering-t.texi)
+ at c 
+
+ at page
+_split()
+ at node Matches
+ at chapter Finding Sequence Matches
+ at node FIJ
+ at section Find Internal Joins
+_include(gap5_fij-t.texi)
+
+ at page
+_split()
+ at node Repeats
+ at section Find Repeats
+_include(gap5_repeats-t.texi)
+
+ at page
+_split()
+ at node Read Pairs
+ at section Find Read Pairs
+_include(gap5_read_pairs-t.texi)
+
+ at page
+_split()
+ at node Find Oligos
+ at section Sequence Search
+_include(find_oligo-t.texi)
+
+ at page
+_split
+ at node Disassembly and Data Deletion
+_include(gap5_disassembly-t.texi)
+
+ at page
+_split
+ at node Tidying up alignments
+_include(gap5_shuffle-t.texi)
+
+ at page
+_split()
+ at node Calculate Consensus
+ at chapter Calculating Consensus Sequences
+_include(calc_consensus-t.texi)
+
+ at page
+_split()
+ at node Misc
+ at chapter Other Miscellany
+ at lowersections
+ at node List Libraries
+ at chapter List Libraries
+_include(list_libraries-t.texi)
+
+ at page
+_split()
+ at node Results
+ at chapter Results Manager
+_include(results-t.texi)
+ 
+ at page
+_split()
+ at node Lists
+ at chapter Lists
+_include(lists-t.texi)
+ at raisesections
+
+ at c 
+ at c @page
+ at c __split()
+ at c @node GapDB
+ at c @chapter Gap5 Database Files
+ at c __include(gap5_database-t.texi)
+ at c 
+ at c @page
+ at c __split()
+ at c @node Conf
+ at c @chapter Configuring
+ at c __include(configure-t.texi)
+ at c 
+ at c 
+ at c __split()
+ at c @node Gap5-Cline
+ at c @chapter Command Line Arguments
+ at c @cindex Command line arguments
+ at c 
+ at c @table @code
+ at c @cindex -bitsize 
+ at c @cindex bitsize (command line option)
+ at c @cindex 64-bit Gap4 databases
+ at c @item -bitsize
+ at c Specifies whether the database file size is 32-bit or
+ at c 64-bit. Practically speaking due to the use of signed numbers in
+ at c places and the restriction of 32-bit for the number of records in a
+ at c database (even when using @code{-bitsize 64} for 64-bit file offsets)
+ at c the practical limits are 2Gb filesize for @code{-bitsize 32} and
+ at c somewhere around about 100-million sequences for @code{-bitsize 64}. 
+ at c 
+ at c Gap4 only needs this option for creating new databases. The bit-size
+ at c of existing databases is automatically detected when they are opened.
+ at c 
+ at c Databases produced in 64-bit format are not compatible with older
+ at c versions of Gap4, but old and newly created 32-bit databases still work with
+ at c the 64-bit Gap4 (and are maintained in 32-bit format so editing them
+ at c will not invalidate their use by older Gap4s). The @code{copy_db}
+ at c program (_f p r e f (Man-copy_db, Copy_db, manpages)) can be used to
+ at c convert file formats.
+ at c 
+ at c @sp 1
+ at c @cindex -maxdb
+ at c @cindex maxdb (command line option)
+ at c @item -maxdb
+ at c Specifies the maximum number of readings plus contigs. This value is not
+ at c automatically adjusted whilst the program is running, but is not allowed to be
+ at c set to a value too small for the database to be opened. It controls the size
+ at c of some areas of memory (approximately @code{16*maxdb} bytes) used during
+ at c execution of gap. The default value is @code{8000}.
+ at c @sp 1
+ at c @cindex -maxseq
+ at c @cindex maxseq (command line option)
+ at c @item -maxseq
+ at c Specifies the maximum number of characters used in the concatenated consensus
+ at c sequences. This parameter is generally not required as the value is normally
+ at c computed and adjusted automatically. However a few functions (such as
+ at c assembly) still need to know a maximum size before hand. The default is
+ at c @code{100000} bases.
+ at c @sp 1
+ at c @item -ro
+ at c @itemx -read_only
+ at c @cindex -read_only
+ at c @cindex read_only (command line option)
+ at c Opens the database (if specified on the command line) in read only mode. This
+ at c does not apply to databases opened using the file browser.
+ at c @sp 1
+ at c @cindex -check
+ at c @cindex -nocheck
+ at c @cindex nocheck (command line option)
+ at c @cindex check (command line option)
+ at c @item -check
+ at c @itemx -no_check
+ at c Specifies whether to run the "Check Database" option when opening new
+ at c databases. @code{-check} forces this to always be done and @code{-nocheck}
+ at c forces it to never be done. By default Check Database is always performed when
+ at c opening databases in read-write mode and never performed when opening in
+ at c read-only mode.
+ at c @sp 1
+ at c @item -exec_notes
+ at c @itemx -no_exec_notes
+ at c @cindex -exec_notes
+ at c @cindex -no_exec_notes
+ at c @cindex security
+ at c Controls whether to search for and execute any Notes of type
+ at c @code{OPEN} or @code{CLOS}. This may be an important security measure
+ at c if you are using foreign databases. Gap4 defaults to -no_exec_notes.
+ at c @sp 1
+ at c @item -rawdata_note
+ at c @itemx -no_rawdata_note
+ at c @cindex -rawdata_note
+ at c @cindex -no_rawdata_note
+ at c Controls whether to make use of the @code{RAWD} note type for
+ at c specifying the trace file search path. Defaults to -rawdata_note.
+ at c @sp 1
+ at c @item -csel
+ at c @itemx -no_csel
+ at c @cindex -csel
+ at c @cindex -no_csel
+ at c Controls whether to automatically start up the contig selector when
+ at c opening a new gap4 database. In some cases (such as when dealing with
+ at c many EST clusters each in their own contig) the contig selector is not
+ at c a practical tool; this simply offers a way of speeding up database
+ at c opening. Defaults to -csel.
+ at c @sp 1
+ at c @item --
+ at c Treat this as the last command line option. Only useful if the database name
+ at c is specified and the name starts with a minus character (not
+ at c recommended!).
+ at c @end table
+ at c 
+ at c __ifdef([[_unix]],[[
+ at c @page
+ at c __split()
+ at c @node Convert
+ at c @chapter Converting Old Databases
+ at c __include(gap5_convert-t.texi)
+ at c ]])
+
+_undefine(_gap5)
diff --git a/manual/gap5.texi b/manual/gap5.texi
new file mode 100644
index 0000000..1187b22
--- /dev/null
+++ b/manual/gap5.texi
@@ -0,0 +1,44 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename gap5.info
+ at setcontentsaftertitlepage
+ at setshortcontentsaftertitlepage
+ at settitle Gap5
+ at setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+define(`__prog__',`gap5')
+define(`__Prog__',`Gap5')
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title Gap5
+ at subtitle 
+ at author James Bonfield, Wellcome Trust Sanger Institute
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-gap5
+ at end ifinfo
+
+_include(gap5-t.texi)
+
+_split()
+ at node Index
+ at unnumbered Index
+ at printindex cp
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/gap5_assembly-t.texi b/manual/gap5_assembly-t.texi
new file mode 100644
index 0000000..49ed00f
--- /dev/null
+++ b/manual/gap5_assembly-t.texi
@@ -0,0 +1,249 @@
+ at cindex Assembly
+ at cindex Entering readings
+
+There are two main types of assembly - denovo and mapped - with the
+latter not really being a true assembly at all.
+
+Denovo assembly assembles DNA fragments, typically without
+knowing any of the target sequence. Hence it compares
+sequence fragments against each other in order to form contigs.
+Mapped assembly makes use of a known reference sequence and compares
+all sequence fragments against the reference, which is a far simpler
+and faster process than denovo assembly.
+
+Gap5, however, has neither denovo nor mapped assembly built in. Instead
+it relies on running standard external command-line tools. At
+present this consists purely of using bwa for a mapped assembly, but
+in future this will be expanded upon.
+
+This means that the Assembly menu currently only contains a ``Map
+Reads'' sub-menu, which in turn has multiple choices for bwa
+usage. You will not be able to join contigs or fill holes in a contig
+directly using these facilities, although this is possible
+by manually following some of the steps outlined below and using an
+alternative step for generating the SAM file.
+
+ at menu
+* Assembly-tg_index::        Importing with tg_index
+* Assembly-Fasta::           Importing fasta/fastq files
+* Assembly-Map-bwa-aln::     Mapped assembly by bwa aln
+* Assembly-Map-bwa-dbwtsw::  Mapped assembly by bwa dbwtsw
+ at end menu
+
+_split()
+ at node Assembly-tg_index
+ at section Importing with tg_index
+ at cindex Assembly: tg_index
+ at cindex tg_index
+
+To enable efficient editing of data, Gap5 needs its own database
+format for storing sequence assemblies. Formats such as BAM are good
+at random access for read-only viewing, but are not at all amenable to
+actions such as reverse complementing a contig and joining it to
+another.
+
+Hence we need a tool that can take existing assembly formats and
+convert them to a form suitable for Gap5. The @code{tg_index} program
+performs this task. It is strictly a command line tool, although in
+some specific cases Gap5 has basic GUI dialogues to wrap it up.
+
+One or more input files may be specified. The general form is:
+
+ at code{tg_index} @i{[options]} @code{-o} @i{gap5_db_name}
+ at i{input_file_name} ...
+
+An example usage is:
+
+ at example
+    tg_index -z 16384 -o test_data.g5 test_data.bam
+    gap5 test_data.g5 &
+ at end example
+
+
+File formats supported are SAM, BAM, ACE, MAQ (both short and long
+variants), CAF, BAF, Fasta and Fastq. The latter two have no assembly
+and/or alignment information so they are simply loaded as single-read
+contigs instead.  Tg_index typically automatically detects the type of
+file, but in rare cases you may need to explicitly state the input
+file type.
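+
+As a hedged illustration, assume a position-sorted SAM file and a
+further position-sorted BAM file with the hypothetical names
+ at code{test_data.sam} and @code{more_reads.bam}. The input format can
+then be stated explicitly, and the second file appended to the same
+database, using the @code{-s}, @code{-b} and @code{-a} options
+described below. This is a sketch of typical usage rather than a
+prescribed workflow:
+
+ at example
+    tg_index -s -o test_data.g5 test_data.sam
+    tg_index -a -b -o test_data.g5 more_reads.bam
+ at end example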
+
+Tg_index options:
+
+ at table @b
+ at item -o @i{filename}
+Creates a gap5 database named @i{filename} and @i{filename}@code{.aux}.
+If not specified the default is ``g_db''.
+
+ at item -a
+Append to an existing database, instead of creating a new one (which
+is the default action).
+
+ at item -n
+When appending, the default behaviour is to add reads to existing
+contigs if contigs with the appropriate names already exist. This
+option always forces creation of new contigs instead.
+
+ at item -g
+When appending to an existing database, assume that the alignment has
+been performed against an ungapped copy of the consensus exported from
+this database. (This is internally used when performing mapped
+assemblies as they consist of exporting the consensus, running the
+external mapped alignment tool, and then importing the newly generated
+alignments.)
+
+ at item -m
+ at itemx -M
+Forces the input to be treated as MAQ, both short (-m) and long (-M)
+formats are supported. By default the file format is automatically
+detected.
+
+ at item -A
+Forces the input to be treated as ACE format.
+
+ at item -B
+Forces the input to be treated as BAF format.
+
+ at item -C
+Forces the input to be treated as CAF format.
+
+ at item -b
+ at itemx -s
+Forces the input to be treated as BAM (-b) or SAM (-s) format. SAM must
+have @@SQ headers present. Both need to be sorted by position.
+
+ at item -z @i{bin_size}
+Modifies the size of the smallest allowable contig bin. Large contigs
+will contain child bins, each of which will contain smaller bins,
+recursing down to a minimum bin size. Sequences are then placed in the
+smallest bin they entirely fit within. The default minimum bin size is
+4096 bytes. For very shallow assemblies increasing this will improve
+performance and decrease the disk space used. Ideally 5,000 to 10,000
+sequences per bin is an approximate figure to aim for.
+
+ at item -u
+Also store unmapped reads (SAM/BAM only).
+
+ at item -x
+Store SAM/BAM auxiliary key:value records too.
+
+ at item -p
+ at itemx -P
+Enable (-p) or disable (-P) read-pairing. By default this is
+enabled. The purpose of this is to link sequences from the same
+template to each other such that gap5 knows the insert size and
+read-pairings. Generally this is desirable, but it adds extra time and
+memory to identify the pairs. Hence for single-ended runs the option
+exists to disable attempts at read-pairing.
+
+ at item -f
+Attempt a faster form of read-pairing. In this mode we link the second
+occurrence of a template to the first occurrence, but not vice
+versa. This is sufficient for the template display graphical views to
+work, but will cause other parts of the program to behave
+inconsistently. For example the contig editor ``goto...'' popup menu
+will sometimes be missing.
+
+ at item -t
+ at itemx -T
+Controls whether to index (-t) or not (-T) the sequence names. By
+default this is disabled. Adding a sequence name index permits us to
+search by sequence name or to use a sequence name in any dialogue that
+requires a contig identifier. However it consumes more disc space to
+store this index and it can be time consuming to construct it.
+
+ at item -r @i{nseq}
+Reserves space for at least @i{nseq} sequences. This generally isn't
+necessary, but if the total number of records extends above 2 million
+(equivalent to 2 billion sequences, or less if we have lots of
+contigs, bins and annotation records to write) then we run out of
+suitable sequence record numbers. This option preallocates the lower
+record numbers and reserves them solely for sequence records.
+
+ at item -c @i{compression_method}
+Specifies an alternate compression method. This defaults to @i{zlib},
+but can be set to either @i{none} for fastest speed or @i{lzma} for
+best compression.
+ at end table
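
As a rough illustration of how the options above combine, the following
Python sketch assembles a tg_index argument list. The helper name
build_tg_index_args and its parameters are hypothetical and not part of
Gap5; only the flags themselves (-o, -a, -z, -c) and the default name
``g_db'' are taken from the table above. Nothing here runs the program.

```python
def build_tg_index_args(inputs, db_name="g_db", append=False,
                        bin_size=None, compression=None):
    """Return an argv list for tg_index (hypothetical helper).

    Only flags described in the options table are used:
    -a (append), -z (minimum bin size), -c (compression), -o (output).
    """
    if not inputs:
        raise ValueError("at least one input file is required")
    if compression not in (None, "none", "zlib", "lzma"):
        raise ValueError("compression must be none, zlib or lzma")

    args = ["tg_index"]
    if append:
        args.append("-a")
    if bin_size is not None:
        args += ["-z", str(bin_size)]
    if compression is not None:
        args += ["-c", compression]
    args += ["-o", db_name]
    args += list(inputs)
    return args


if __name__ == "__main__":
    # Mirrors the example usage given earlier in this section.
    print(" ".join(build_tg_index_args(["test_data.bam"],
                                       db_name="test_data.g5",
                                       bin_size=16384)))
```

The resulting list could be handed to subprocess.run() on a system where
tg_index is installed.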
+
+_split()
+ at node Assembly-Fasta
+ at section Importing fasta/fastq files
+ at cindex Assembly: fasta/fastq
+
+Sometimes we have a few individual sequences we wish to import as
+single-read contigs. That is we won't align them against each other or
+against existing data, but just load them into our gap5 database so we
+can then run tools such as Find Repeats or Find Internal Joins on
+them. (This can be ideal for importing consensus sequences.)
+
+The ``Import Fasta/Fastq as single-read contigs'' function is designed
+for this purpose.  Behind the scenes it is nothing more than running
+ at code{tg_index -a} to add a fasta or fastq file.
+
+_split()
+ at node Assembly-Map-bwa-aln
+ at section Mapped assembly by bwa aln
+ at cindex Assembly: bwa aln
+ at cindex bwa
+
+This function runs the bwa program using the ``aln'' method for
+aligning sequences. It is appropriate for matching most types of
+short-read data.
+
+The GUI is little more than a wrapper around command line tools,
+which can essentially be repeated manually as follows.
+
+ at enumerate
+ at item
+Calculate and save the consensus for all contigs in the database in
+fastq format.
+
+ at item
+Index the consensus sequence using ``bwa index''.
+
+ at item
+Map our input data against the bwa index using ``bwa aln''.
+Repeat for reverse matches too.
+
+ at item
+Generate SAM format from the alignments using ``bwa samse'' or ``bwa
+sampe''.
+
+ at item
+Convert to BAM and sort by position.
+
+ at item
+Import the BAM file, appending to the existing gap5 database
+(equivalent to @code{tg_index -a}).
+ at end enumerate
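
The steps above can be sketched as a list of shell commands. This is a
minimal sketch under several assumptions: the consensus has already been
saved from Gap5 as a fastq file (step 1), the reads are single-ended (so
``bwa samse'' is used), and a samtools 1.x style command line is
available for the BAM conversion and sort. The function name and all
file names are hypothetical; the exact commands run by the Gap5 GUI are
not shown in this manual.

```python
import shlex


def bwa_aln_commands(consensus_fastq, reads_fastq, db_name, prefix="mapped"):
    """Return shell command strings for steps 2 to 6 of the list above."""
    q = shlex.quote
    sai = prefix + ".sai"
    sam = prefix + ".sam"
    bam = prefix + ".bam"
    sorted_bam = prefix + ".sorted.bam"
    return [
        # Step 2: index the exported consensus.
        f"bwa index {q(consensus_fastq)}",
        # Step 3: map the reads against the index.
        f"bwa aln {q(consensus_fastq)} {q(reads_fastq)} > {q(sai)}",
        # Step 4: generate SAM from the alignments (single-ended case).
        f"bwa samse {q(consensus_fastq)} {q(sai)} {q(reads_fastq)} > {q(sam)}",
        # Step 5: convert to BAM and sort by position (assumes samtools 1.x).
        f"samtools view -b -o {q(bam)} {q(sam)}",
        f"samtools sort -o {q(sorted_bam)} {q(bam)}",
        # Step 6: append the sorted BAM to the existing gap5 database.
        f"tg_index -a -o {q(db_name)} {q(sorted_bam)}",
    ]


if __name__ == "__main__":
    for cmd in bwa_aln_commands("consensus.fastq", "reads.fastq",
                                "test_data.g5"):
        print(cmd)
```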
+
+ at node Assembly-Map-bwa-dbwtsw
+ at section Mapped assembly by bwa dbwtsw
+ at cindex Assembly: bwa dbwtsw
+ at cindex bwa
+
+This function runs the bwa program using the ``dbwtsw'' method for
+aligning sequences. This should be used when attempting to align
+longer sequences or data with lots of indels.
+
+The GUI is little more than a wrapper around command line tools,
+which can essentially be repeated manually as follows.
+
+ at enumerate
+ at item
+Calculate and save the consensus for all contigs in the database in
+fastq format.
+
+ at item
+Index the consensus sequence using ``bwa index''.
+
+ at item
+Map our input data against the bwa index using ``bwa dbwtsw''.
+
+ at item
+Convert to BAM and sort by position.
+
+ at item
+Import the BAM file, appending to the existing gap5 database
+(equivalent to @code{tg_index -a}).
+ at end enumerate
diff --git a/manual/gap5_break_contig.png b/manual/gap5_break_contig.png
new file mode 100644
index 0000000..cef165a
Binary files /dev/null and b/manual/gap5_break_contig.png differ
diff --git a/manual/gap5_check_ass.png b/manual/gap5_check_ass.png
new file mode 100644
index 0000000..b86f410
Binary files /dev/null and b/manual/gap5_check_ass.png differ
diff --git a/manual/gap5_check_database.png b/manual/gap5_check_database.png
new file mode 100644
index 0000000..91ddfed
Binary files /dev/null and b/manual/gap5_check_database.png differ
diff --git a/manual/gap5_check_db-t.texi b/manual/gap5_check_db-t.texi
new file mode 100644
index 0000000..a0246d7
--- /dev/null
+++ b/manual/gap5_check_db-t.texi
@@ -0,0 +1,159 @@
+ at node tg_index
+ at section Creating databases
+ at cindex database creation
+ at cindex tg_index
+
+Gap5 cannot directly work on assembly formats in their native format.
+This is a substantial difference from things like BAM file viewers, but
+the reason is simply that the other formats do not have data structured
+in a manner that is suitable for in-place editing. Gap5 is first and
+foremost an assembly editor.
+
+Gap5 databases are currently created external to Gap5 using a
+command-line program named @code{tg_index}.
+
+ at code{tg_index} [options] @i{input_file ...}
+
+The most general usage is simply to specify one or more data files
+(it accepts SAM/BAM, CAF, ACE, BAF, MAQ and in a more limited fashion
+fasta/fastq), optionally specifying the output database with @code{-o}
+ at i{database_name}. This will then create a database suitable for editing
+by Gap5.
+
+Valid options are:
+
+ at table @code
+ at item -m
+Input is MAQ format
+ at item -M
+Input is MAQ-long format
+ at item -A
+Input is ACE format
+ at item -B
+Input is BAF format
+ at item -C
+Input is CAF format
+ at item -f
+Input is FASTA format
+ at item -F
+Input is FASTQ format
+ at item -b
+Input is BAM format
+ at item -s
+Input is SAM format (with @@SQ headers)
+ at br
+ at item -u
+Also store unmapped reads (SAM/BAM only)
+ at item -x
+Also store auxiliary records (SAM/BAM only)
+ at item -r
+Store reference-position data (on) (SAM/BAM only)
+ at item -R
+Don't store reference-position data (SAM/BAM only)
+ at item -D
+Do not remove duplicates (SAM/BAM only)
+ at br
+ at item -p
+Link read-pairs together (default on)
+ at item -P
+Do not link read-pairs together
+ at br
+ at item -q @i{value}
+Number of reads to queue in memory while waiting for pairing.  Use to
+reduce memory  requirements for assemblies with lots of single reads at
+the expense of running time.  0 for all in memory, suggest 1000000 if
+used (default 0).
+ at br
+ at item -a
+Append to existing db
+ at item -n
+New contigs always (relevant if appending)
+ at br
+ at item -g
+When appending to an existing db, assume the alignment was performed
+against an ungapped copy of the existing consensus. Add gaps back in to
+reads and/or consensus as needed.
+
+ at item -t
+Index sequence names (default)
+ at item -T
+Do not index sequence names
+
+ at item -z @i{value}
+Specify minimum bin size (default is '4k')
+
+ at item -f
+Fast mode: read-pair links are unidirectional. Suggested for large
+databases, e.g. n.seq > 100 million.
+
+ at item -d @i{data_types}
+Only copy over certain data types. This is a comma separated list
+containing one or more words from: seq, qual, anno, name, all or none 
+
+ at item -c @i{method}
+Specifies the compression method. This should be one of 'none', 'zlib' or
+'lzma'. Zlib is the default.
+
+ at item -[1-9]
+Use a fixed compression level from 1 to 9
+
+ at item -v @i{version_num}
+Request a specific database format version
+ at end table
+
+To merge existing gap5 databases you will need to export either one or
+both into an intermediate format (we suggest SAM) and then use tg_index
+to import data again.
+
+ at node OpenDatabase
+ at section Opening/closing databases
+ at cindex Open database
+
+The Open menu item is in the main gap5 File menu. It brings up a file
+browser allowing selection of the gap5 database name. Databases consist
+of two files - a main data block (.g5d) and a data index (.g5x).  It
+does not matter which you choose as gap5 will open both.
+
+Alternatively you can specify the database name on the command line when
+launching gap5. Additionally this supports read-only access if you
+specify the @code{-ro} flag. For example to open a database named Egu.0
+(the old Gap4 convention implying version 0) in read-only mode we would
+type:
+
+ at code{gap5 -ro Egu.0 &}
+
+
+ at node GapDB-Directories
+ at section Changing directories
+ at cindex Change directory
+
+By default gap5 changes to the directory containing the database you
+have open. All local output files specified (for example Save Consensus
+or Export Sequences) will be relative to that location unless you use a
+full pathname. The current working directory may be changed by using the
+Change Directory dialogue, found in the main File menu.
+
+ at node CheckDatabase
+ at section Check Database
+ at cindex Check database
+
+This function (which is available from the Gap5 File menu) is used to
+perform a check on the logical consistency of the database.  No user
+intervention is required. If the checks are passed the program will
+report zero errors. Otherwise a report of each error is displayed.
+
+_picture(gap5_check_database,3.34167in)
+
+On a large database these checks can take a considerable amount of
+time. The default is a thorough, but slow, check. However a faster mode
+is available which only performs gross contig and contig-binning level
+checks, omitting the per sequence and per annotation validation.
+
+The dialogue also offers the choice of attempting to fix any problems
+that are found. It is strongly recommended that you back the gap5
+database up prior to performing fixes as depending on the nature of the
+corruption the choices made may not necessarily be an improvement. Note
+that this also may not fix every problem that is found, or the fixes
+themselves may cause other errors to be found, so it is best to run the
+check again.
+
diff --git a/manual/gap5_comparator.png b/manual/gap5_comparator.png
new file mode 100644
index 0000000..55094a2
Binary files /dev/null and b/manual/gap5_comparator.png differ
diff --git a/manual/gap5_contig_editor-t.texi b/manual/gap5_contig_editor-t.texi
new file mode 100644
index 0000000..45785f8
--- /dev/null
+++ b/manual/gap5_contig_editor-t.texi
@@ -0,0 +1,1512 @@
+ at menu
+* Editor-Movement::            Moving around the editor
+* Editor-Names::               The sequence names display
+* Editor-Editing::             Commands for editing data
+* Editor-Select Base::         Cut and paste control of sequence
+* Editor-Select Seq::          Selecting Sequences
+* Editor-Annotations::         Creating, editing and deleting tags
+* Editor-Searching::           Searching
+* Editor-Settings::            The ``settings'' menu
+* Editor-Primer Selection::    Searching for primers
+* Editor-Traces::              Displaying the raw trace data
+* Editor-Info::                The Editor Information Line
+* Editor-Joining::             The join editor
+* Editor-Multiple Editors::    Using several editors at once
+* Editor-Quitting::            Quitting the editor
+* Editor-Summary::             Summary of key and mouse bindings
+ at end menu
+
+The Gap5 Contig Editor is designed to allow rapid checking and editing of
+characters in assembled readings. Very large savings in time can be achieved
+by its sophisticated problem finding procedures which automatically direct the
+user only to the bases that require attention.  The following is a selection of
+screenshots to give an overview of its use.
+
+_picture(gap5_contig_editor.screen,6in)
+
+The figure above shows a screendump from the Contig Editor showing the
+consensus for a small region of a contig and the aligned reads.
+The main components are: the top-most menu bar; common buttons and
+controls beneath this; the main name and sequence panels to the left
+and right; scrollbars and jog-control; a status text line at the bottom.
+
+The names panel on the left can show either reading names or a small
+ASCII diagram representing their position, orientation and mapping
+quality as a grey-scale. The sequences to the right in the screenshot
+have base quality shown in grey (dark being poor, light being good)
+with disagreements to the consensus at the top shown in blue. The
+consensus line also shows base qualities. You may notice we have a
+mixture of long and short sequences, with the longer ones being at the
+top. This screenshot is from a mixed assembly of Illumina short-read
+data and ABI Sanger-method capillary sequences.
+
+One base is drawn in inverse video (a ``G''). This is the current
+location of the editing cursor. We can move this with the arrow keys or
+by clicking with the left mouse button. It behaves much like the editing
+cursor in a word processor and need not be visible in the portion of
+the contig we are viewing.
+
+Also visible is a set of bases coloured yellow. These are an OLIGO
+annotation. Gap5 supports a wide variety of annotation types (often
+also referred to as ``tags''). These are covered later in more detail.
+
+_picture(gap5_contig_editor.traces,6in)
+
+This figure is an example of the Trace Display showing three capillary
+traces and an Illumina trace from readings in the previous Contig
+Editor screendumps. Note that this demonstrates the possibility of
+showing the raw trace data for new short-read sequencing technologies,
+but typically this is not available due to the high storage size.
+
+_split()
+ at node Editor-Movement
+ at section Moving the visible segment of the contig
+ at cindex Contig Editor: scrolling
+
+The contig editor displays only one segment of the entire contig,
+although several contig editors can be in use at once.  Below the
+sequence is a scrollbar and below that a ``jog'' control. The
+scrollbar behaves as expected, allowing rapid positioning anywhere
+within the contig using the middle mouse button or left-clicking and
+dragging the slider. However with extremely long contigs (for example
+100Mb) it can become tricky to move by the desired amount. Each pixel
+on the scrollbar may represent 100Kb worth of data, so dragging the
+scrollbar is only approximate positioning. Equally so clicking in the
+trough to move a screen-full at a time can be too small. This is where
+the jog-control can be of use.
+
+By default this is always centred. Clicking and dragging this left or
+right starts to scroll the editor, at a speed proportional to how far
+away from the centre the jog is dragged. Releasing the mouse button
+stops automatically scrolling and recentres the jog control.
+
+The final, more precise, manner of positioning the editor view is with
+the text entry box in the bottom left corner. Type in any coordinate
+here and press return to jump straight to that location. Note however
+that Gap5's coordinates are currently always in padded form; that is
+to say that a gap in the consensus caused by an insertion in one of
+the aligned sequences is still counted as a base position.
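
To illustrate what a padded coordinate means, here is a small Python
sketch that converts a padded position to the corresponding unpadded
one. It assumes 1-based coordinates and a consensus held as a plain
string with ``*'' as the pad character; the function name is
hypothetical and this is not Gap5's own implementation.

```python
def padded_to_unpadded(consensus, padded_pos, pad="*"):
    """Count the non-pad bases up to and including padded_pos (1-based).

    If padded_pos lands on a pad, the result is the unpadded position
    of the nearest real base to its left (0 if there is none).
    """
    if not 1 <= padded_pos <= len(consensus):
        raise ValueError("padded position outside the consensus")
    return sum(1 for base in consensus[:padded_pos] if base != pad)


if __name__ == "__main__":
    cons = "AC*GT**A"
    # The pad at padded position 3 is counted as a position in padded
    # coordinates, so the G sits at padded 4 but unpadded 3.
    print(padded_to_unpadded(cons, 4))
```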
+
+For particularly deep displays the vertical scrollbar on the right
+edge of the window will also be useful. While scrolling in X, the
+editor attempts to keep the same sequences visible on screen. To do
+this it may automatically adjust the Y scrollbar for you due to
+changing layout of sequences. (By default the top-most sequence is
+always the sequence that starts furthest left and the bottom most is
+the sequence starting furthest right.)
+
+If you have a mouse wheel, this may also be used for small
+scrolling. By itself it scrolls in Y one sequence at a time. With the
+Control key held down it scrolls in larger increments. Using the Shift
+key in conjunction with the mouse wheel scrolls in X instead, with
+Shift+Control to scroll in larger increments.
+
+The displayed portion of the contig is separate from the current
+location of the editing cursor. This is displayed as a black rectangle
+with typically a light coloured letter inside it. Any editing keys
+operate on the base underneath this or to the base immediately
+preceding it for Delete. We cover the topic of editing later
+(_fpref(Editor-Editing, Editing, gap5_contig_editor)), however moving the
+editing cursor is also another way of scrolling the editor.
+
+Finally the Page Up and Page Down keys scroll the editor left or right
+by 90% of the current screen width.  Used with Shift they move in
+increments of 1Kb, with Control in increments of 10Kb and with both
+Shift and Control in increments of 100Kb. The Home and End keys jump
+to the start or end of the current item underneath the editing cursor
+- either a sequence or the consensus.
+
+_split()
+ at node Editor-Names
+ at section Names
+ at cindex Contig Editor: names display
+ at cindex Contig Editor: highlighting readings
+ at cindex Highlighting readings in the editor
+ at cindex names in the editor
+ at cindex reading names in the editor
+
+At the left side of the editor window is the ``names panel''. This
+either displays an ASCII pictorial summary of the sequence layout or
+the actual sequence names themselves depending on the settings in use.
+Between the names panel and the sequences panel is a vertical line,
+visible at the right edge of the above image. This can be dragged left
+and right to adjust the proportion of display dedicated to the names
+and sequence panels.
+
+The default name display looks like this:
+
+_picture(gap5_contig_editor.names1,1.55in)
+
+This plot is a mini diagram of the way the sequences overlap. Here the
+> and < symbols represent the start of sequences, assembled on either
+the forward or reverse strand, with the ... sections reflecting their
+relative lengths. The background shading indicates the mapping quality
+of the sequence (which may not be available in many cases, depending
+on how the assembly was derived). This should indicate the likelihood
+that the sequence has been assembled to the correct point. Sequence
+that appears to map elsewhere, e.g. due to a repeat, will be dark grey
+while unique sequence will be light grey or white. Moving the mouse
+cursor over a sequence will tell you the precise mapping quality along
+with additional information such as the sequence name, the technology
+used (Sanger, Illumina, 454, etc), and whether it is part of a pair of
+sequences.
+
+In the editor Settings menu is a checkbox labelled ``Pack
+Sequences''. When checked we permit multiple sequences to be drawn in
+the same row. Unchecking this reverts to the Gap4 style of display
+where each sequence has its own dedicated row. This also has an effect
+on the names panel, which switches to showing the sequence names, as
+below.
+
+_picture(gap5_contig_editor.names2,1.55in)
+
+This still uses the > and < symbols to reflect strand and grey scales
+for representing the mapping quality. The > and < are now also
+coloured independently.
+
+ at itemize
+ at item light blue
+The read is not paired
+ at item white
+Forms a consistent pair
+ at item grey
+Paired, but the insert size is too large or too small
+ at item red
+Paired, but in an invalid orientation
+ at item orange
+Paired, but the other end is in another contig
+ at end itemize
+
+At the bottom of the names panel is an editable text field containing
+the current display position. Adjacent to this is a small ``P''
+indicating these coordinates are ``padded''. Clicking this will
+alternate with ``R'' to indicate reference coordinates, although these
+may not be available in all situations.  Note that currently, for speed
+reasons, it cannot directly display unpadded coordinates. 
+
+Typing into this position entry-box allows us to direct the editor to a
+specific location. If we end the number with ``u'' it performs an
+unpadded to padded conversion before jumping to this location.
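
The ``u'' suffix handling can be pictured with the following Python
sketch. It is only an illustration under stated assumptions: positions
are 1-based, the consensus is a plain string using ``*'' for pads, and
the names parse_position and unpadded_to_padded are hypothetical rather
than taken from the Gap5 source.

```python
def unpadded_to_padded(consensus, unpadded_pos, pad="*"):
    """Return the padded position (1-based) of the n-th real base."""
    seen = 0
    for padded_pos, base in enumerate(consensus, start=1):
        if base != pad:
            seen += 1
            if seen == unpadded_pos:
                return padded_pos
    raise ValueError("unpadded position outside the consensus")


def parse_position(text, consensus):
    """Interpret the position entry: '123' is padded, '123u' is unpadded."""
    text = text.strip()
    if text.endswith("u"):
        return unpadded_to_padded(consensus, int(text[:-1]))
    return int(text)


if __name__ == "__main__":
    cons = "AC*GT**A"
    print(parse_position("3", cons))   # already padded
    print(parse_position("3u", cons))  # third real base, after one pad
```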
+
+Left clicking on a name will toggle the background between the current
+grey to a shade of blue (with luminosity once again reflecting mapping
+quality). This indicates that the sequence name has been added to the
+``readings'' list. Multiple names may be selected and deselected by
+pressing and holding the left mouse button while moving the mouse
+cursor.
+
+In both display modes, pressing the right mouse button brings up a
+context sensitive menu containing operations relevant to that specific
+sequence. This may contain the following commands.
+
+
+ at table @strong
+ at item Copy name to clipboard
+ at itemx Copy #number to clipboard
+These copy the sequence name or the record number to the clipboard for
+use in a subsequent paste operation. Note that there is no visual cue
+that this has happened. The same function may also be achieved by
+left-clicking and dragging the mouse horizontally, as if attempting to
+highlight a region of text.
+
+These two items are also available when right clicking on the Consensus
+label, but in this case it copies the contig name or number to the
+clipboard instead.
+
+ at item Goto...
+This lists other sequences sharing the same template, such as the
+other end of a read-pair. Selecting this command will jump the editor
+to the left-most base in that sequence. If the sequence is in another
+contig then a new editor will be created, unless one already exists
+for that contig in which case that other editor will be moved
+accordingly.
+
+ at item Join to...
+In the case of read-pairs that span contigs, the join to function will
+bring up the join editor for both contigs involved, automatically
+complementing the other contig if appropriate based on the library pair
+orientation statistics.
+ at end table
+
+Right clicking on the contig name also pops up a menu. In here are
+options to change the contig name or the starting coordinate. These
+options are also available in the editor Commands menu.
+
+_split()
+ at node Editor-Editing
+ at section Editing
+ at cindex Editing: contig editor
+ at cindex Contig Editor: editing features
+
+ at menu
+* Editor-Cursor::              Moving the editing cursor
+* Editor-Quality Values::      Adjusting the quality values
+* Editor-Cutoffs::             Adjusting the cutoff data
+* Editor-Positions::           Adjusting the alignment coordinates
+* Editor-Editing Summary::     Summary of editing commands
+ at end menu
+
+Editing can take up a significant portion of the time taken to finish
+a sequencing project. Gap5 has a selection of searches
+(_fpref(Editor-Searching, Searching, contig_editor)) designed to speed
+up this process.  The problems that require most attention are
+conflicts between good bases. Where base confidence values are present
+it should be unnecessary to edit all conflicting bases as, generally,
+this will amount to adjusting poor quality data to agree with good
+quality data in which case the consensus sequence should be correct
+anyway.
+
+Pads in the consensus should not be considered a problem requiring
+edits because it is possible to output the consensus sequence (from
+the main Gap5 File menu) with pads stripped out. Obviously poorly
+defined pads (a mixture of several alignment padding characters and
+real bases) require checking in the same manner as other poorly
+defined consensus bases.
+
+To change a base simply overtype with a new base call, one of a,c,g or
+t in lowercase. Alternatively a base can be changed to an alignment
+padding character by pressing ``*''. These new bases and pads
+automatically get given a quality value of 100, but see below for how
+to adjust this. The consensus cannot be edited in this manner.
+
+To insert a gap into sequence press ``i'' or the Insert key. At present
+only alignment pads can be inserted, not bases, although the pads can
+subsequently be edited to turn them into bases. The ``i'' and Insert
+keys also permit insertion of gaps into the consensus, which it
+achieves by inserting into every sequence aligned at that position.
+
+Bases may be deleted by pressing the Delete or Backspace key. This
+deletes the base immediately to the left of the current editing
+cursor. Note that if Delete or Backspace is pressed with the editing
+cursor on the consensus this removes an entire column of
+data. Deleting anything other than alignment padding characters
+(either in sequences or the consensus) is a dangerous operation
+needing careful thought. To prevent accidental removal of data
+therefore, to delete anything other than ``*'' you must press Control
+in conjunction with Delete or Backspace.
+
+
+_split()
+ at node Editor-Cursor
+ at subsection Moving the editing cursor
+ at cindex Cursor: contig editor
+ at cindex Contig Editor: cursor
+
+Nearly all editing operations happen at the location of the editing cursor.
+This cursor appears as a black block containing the base in a light
+colour, instead of the usual black base on a light background.
+
+The simplest mechanism of moving the cursor is using the left
+mouse button. Alternatively the following keys can be used.
+
+ at example
+ at group
+ Left arrow or Control b        Move left one base
+ Right arrow or Control f       Move right one base
+ Up arrow or Control p          Move up one base
+ Down arrow or Control n        Move down one base
+ Control a                      Move editing cursor to start of sequence
+ Control e                      Move editing cursor to end of sequence
+ Home                           Move editing cursor to start of sequence
+ End                            Move editing cursor to end of sequence
+ Meta or Alt <                  Move editing cursor to start of contig
+ Meta or Alt >                  Move editing cursor to end of contig
+ at end group
+ at end example
+
+If any of these move the editing cursor outside of the visible region,
+the editor will scroll to accommodate. Control-a and Control-e with
+the editor on the consensus line will also jump to the start and end
+of the contig.
+
+If ``Cutoffs'' are shown (_fpref(Editor-Cutoffs, Adjust the Cutoff
+Data, gap5_contig_editor)) the cursor may be placed in the cutoff data
+too. Note that turning off displaying cutoff data would then leave the
+editor on an invisible base, so it is moved to the consensus line instead.
+
+_split()
+ at node Editor-Quality Values
+ at subsection Adjusting the Quality Values
+ at cindex Quality values: contig editor, use within
+ at cindex Cutoff values: contig editor
+ at cindex Contig Editor: quality values
+ at cindex Contig Editor: cutoff values
+
+Each base has its own quality value. Assembly will allow only
+values between 1 and 99 inclusive. A quality value of 0 means that this base
+should be ignored. A quality value of 100 means that this base is definitely
+correct and the consensus will be forced to be the same base type and will be
+given a consensus confidence of 100. If two conflicting bases both have a
+quality of 100 the consensus will be a dash with a confidence of 0.
+
+Newly added bases or replaced bases are assigned a quality of 100.
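
The special cases for quality 0 and 100 can be sketched in Python as
follows. This covers only the rules stated above; the general consensus
calculation for ordinary quality values is not described in this section
and is deliberately left out. The function name and the representation
of a column as (base, quality) pairs are assumptions for illustration.

```python
def forced_consensus(column):
    """Apply the quality-100 rules to one alignment column.

    column is a list of (base, quality) pairs. Returns a
    (consensus_base, confidence) pair when a quality of 100 decides the
    call, or None when the normal consensus algorithm (not modelled
    here) would apply. Bases of quality 0 never force a call.
    """
    forced = {base for base, quality in column if quality == 100}
    if not forced:
        return None
    if len(forced) == 1:
        # A single base type at quality 100 forces the consensus.
        return (forced.pop(), 100)
    # Conflicting bases at quality 100: dash with a confidence of 0.
    return ("-", 0)


if __name__ == "__main__":
    print(forced_consensus([("a", 100), ("c", 40)]))
    print(forced_consensus([("a", 100), ("c", 100)]))
    print(forced_consensus([("a", 30), ("c", 0)]))
```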
+
+Several keyboard commands are available to edit the quality value of an
+individual base.
+
+ at example
+ at group
+ [                        Set quality to 0 and move cursor right
+ ]                        Set quality to 100 and move cursor right
+ Shift   Up-Arrow         Increment quality by 1
+ Control Up-Arrow         Increment quality by 10
+ Shift   Down-Arrow       Decrement quality by 1
+ Control Down-Arrow       Decrement quality by 10
+ at end group
+ at end example
+
+Finally note that quality values can also be made visible by clicking
+on the ``Quality'' checkbutton at the top of the editor. This shows
+the quality by use of a  grey scale.
+
+ at node Editor-Positions
+ at subsection Adjusting the alignment coordinates
+ at cindex Contig Editor: alignment coordinates
+
+On rare occasions we may need to move an entire sequence a small
+amount to achieve an optimal alignment, rather than simply inserting
+or deleting pads.
+
+This is achieved by using Control plus the left and right arrow keys
+while the editing cursor is anywhere on the sequence.
+
+ at example
+ at group
+ Control Left-Arrow       Shift sequence left
+ Control Right-Arrow      Shift sequence right
+ at end group
+ at end example
+
+
+ at node Editor-Cutoffs
+ at subsection Adjusting the Cutoff Data
+ at cindex Cutoff data: contig editor
+ at cindex Hidden data: contig editor
+ at cindex Contig Editor: cutoff data
+
+Sequences typically consist of a good quality ``used'' portion and
+poor quality ``clipped'' or ``cutoff'' portions at the 5' and 3' ends
+of the sequence, although for short-read sequencing technologies it is
+quite likely we have no cutoff data at all. The reason for clipping is
+that the low quality ends of sequences may have a sufficient number of
+errors that the sequence alignment algorithms are no longer confident
+they have the correct bases aligned, or even that the sequence simply
+disagrees too much.
+
+By default these are not shown, although you may see blank lines in
+the display as room is left for this sequence even when it is not
+visible. The cutoff data may be displayed by pressing the ``Cutoffs''
+check-button at the top of the editor. The cutoff sequence will then
+be displayed in grey. We call the boundary between the cutoff data and
+the used data the cutoff position. These positions can be adjusted by
+pressing the ``<'' (left cutoff) or ``>'' (right cutoff) keys. In both
+cases the cutoff point is between the base with the editing cursor and
+the base to the left of the editing cursor.
+
+Using the ``<'' and ``>'' keys with the editing cursor in the consensus
+performs bulk versions of these edits by clipping every single sequence
+to that point. One small difference here though is that the bulk
+versions only ever shrink cutoff data and do not grow it. 
+
+ at example
+ at group
+ <                        In sequence: set left cutoff position
+ >                        In sequence: set right cutoff position
+
+ <                        In consensus: bulk clip left cutoff
+ >                        In consensus: bulk clip right cutoff
+ at end group
+ at end example
+
+_split()
+ at node Editor-Editing Summary
+ at subsection Summary of Editing Commands
+ at cindex Summary of editing commands: contig editor
+ at cindex Contig Editor: editing keys
+
+A brief summary of these editing operations can be seen below:
+
+ at example
+ Key              Location      Action
+ -----------------------------  --------------------
+ a,c,g,t,*        Reading       Change base
+ i, Insert        Reading       Insert pad
+ Delete           Reading       Delete * to left
+ Ctrl Delete      Reading       Delete any base to left
+
+ Control Left     Reading       Move reading left
+ Control Right    Reading       Move reading right
+
+ [                Reading       Set quality to 0
+ ]                Reading       Set quality to 100
+ Shift Up         Reading       Incr. quality by 1
+ Shift Down       Reading       Decr. quality by 1
+ Ctrl Up          Reading       Incr. quality by 10
+ Ctrl Down        Reading       Decr. quality by 10
+ <                Reading       Set left cutoff
+ >                Reading       Set right cutoff
+
+ i, Insert        Consensus     Insert column of pads
+ Delete           Consensus     Delete * to left
+ Ctrl Delete      Consensus     Delete any base to left
+ <                Consensus     Bulk clip left cutoff
+ >                Consensus     Bulk clip right cutoff
+ at end example
+
+_split()
+ at node Editor-Select Base
+ at section Cut and Paste Control of Sequence
+ at cindex Selections: contig editor
+ at cindex Contig Editor: selections
+
+It is possible to highlight an area of a reading or the
+consensus sequence in preparation for performing some further action
+upon it. Examples of such actions are creating annotations and
+pasting into a new window. We call these highlighted areas
+``selections''. They are displayed as an underlined region.
+
+The simplest way to make a selection is using the left mouse
+button. Pressing the mouse button marks the base beneath the cursor 
+as the start of the selection. Then, without releasing the button,
+moving the mouse cursor adjusts the end of the selection. Finally
+releasing the button will allow normal use of the mouse again. If
+while marking a selection we reach the edge of the window then the
+editor will automatically start scrolling for us.
+
+Sometimes we may wish to make a particularly long selection, or just
+extend an existing selection after we've already released the mouse
+button. This can be done by using shift left mouse button to adjust
+the end of the selection. Hence we can mark the start of the selection
+using the left button, scroll along the contig to the desired
+position, and set the end using the shift left button.
+
+The selection is stored in the ``clipboard''. This allows for
+the usual ``cut and paste'' operations between applications, although
+the contig editor only supports this in one direction (as it is not
+possible to ``paste'' into the window). The mechanism employed for this
+follows the usual X Windows standard of using the middle mouse button.
+
+A quick summary of the mouse selection commands follows.
+
+ at example
+Left button                         Position editing cursor to mouse cursor
+Left button (drag)                  Mark start and end of selection
+Shift left button                   Adjust end of selection
+Middle button (in another window)   Copy selected sequence
+ at end example
+
+_split()
+ at node Editor-Select Seq
+ at section Selecting Sequences
+ at cindex Selecting sequences: contig editor
+ at cindex Contig Editor: selecting sequences
+
+The list named ``readings'' is used for all sequences selected in all
+editors. This is automatically updated whenever a sequence is selected
+or deselected.
+
+Individual sequence names can be (de)selected by clicking on them with
+the left mouse button, or clicking and dragging out a region. This works
+well for a few sequences.
+
+If you need to select all readings overlapping a specific consensus base
+or a region of consensus bases, mark the range of the consensus you wish
+to select over by pressing and dragging the left mouse button (as if you
+were going to create an annotation) and then either right click in the
+consensus or use the Commands menu to choose Select Reads. When using
+the Commands menu you get a dialogue asking for confirmation of the
+start and end positions and the option of whether to select sequences
+that overlap this range or only those which are entirely contained
+within that range. When using the right-click popup on the consensus it
+simply takes the defaults (overlapping sequences).
+
+Deselection follows the same procedure.
+
+_split()
+ at node Editor-Annotations
+ at section Annotations
+ at cindex Tags: contig editor
+ at cindex Annotations: contig editor
+ at cindex Contig Editor: annotations
+ at cindex Contig Editor: tags
+
+Annotations (or tags) can be placed at any position on readings or on
+the consensus. They are usually used to record positions of primers
+for walking, or to mark sites, such as repeats or compressions, that
+have caused problems during sequencing.  Each annotation has a type
+such as ``primer'', a position, a length, a strand (forward, reverse
+or both) and an optional comment. Each type and strand has an
+associated colour that will be shown on the display. For information
+on searching for annotations see _oref(Editor-Search-Type, Searching
+by Tag Type), and _oref(Editor-Search-Anno, Searching by Annotation
+Comments).
+
+_picture(contig_editor.taged,4.16667in)
+
+ at i{FIXME: not all of the tag editor features are supported yet;
+specifically the Move/Copy functionality is currently missing.}
+
+To create an annotation, make a selection and then select ``Create Tag''
+from the contig editor commands menu at the top of the editor or by
+pressing the right mouse button.  _oxref(Editor-Commands, The Commands
+Menu). This will bring up a further window; the ``tag editor'' (shown
+above). The ``Type:'' button at the top of the editor invokes a
+selectable list from which tag types can be chosen.  See below.
+
+_picture(contig_editor.tagsel,3.5in)
+
+Use this to select the desired type of annotation.  
+
+Next the strand of the annotation can be selected. This will be
+displayed as one of ``<---->'', ``<----'', ``---->'' and ``?----?''
+indicating both strands, top strand only, bottom strand only, and
+stranded but unknown strand respectively. These mirror the GFF strand
+definitions.  The comment (the box beneath the buttons) can be edited
+using the usual combination of keyboard input and arrow keys. The
+``Save'' button will exit the tag editor and create the annotation. To
+abandon editing without creating the annotation use the ``Cancel''
+button.
+
+To edit an existing annotation, position the editing cursor
+within an annotation and select ``Edit Tag'' from the commands menu. This
+will be a cascading menu, typically showing one tag. If multiple tags
+coincide at the same sequence position you will be able to choose which
+tag to edit. Once again the tag editor will be invoked and operates as
+before. The @b{F11} key is also a shortcut for editing the top-most
+tag underneath the editor cursor.
+When editing, ``Save'' will save the edited changes and ``Cancel''
+will abandon changes.
+
+Removing an annotation involves positioning the editing cursor within
+an annotation and selecting ``Delete Tag'' from the commands menu. As with
+``Edit Tag'' this is a cascading menu to allow you to choose which tag at a
+specific point to delete. The @b{F12} key is a shortcut to remove the
+top-most tag underneath the editor cursor.
+
+As usual, ``undo'' can be used to undo any of these annotation creations,
+edits and removals.
+
+Some tags may contain graphical controls instead of the usual text
+panel. These are encoded with the master gap4/5 tag database
+(@i{GTAGDB}) by specifying the default tag text to be a piece of
+``ACD'' code. A full description of the (modified for gap4/5) ACD syntax
+is not available currently, but it is strongly modelled on the
+EMBOSS ACD syntax which has documentation at
+_uref(http://www.emboss.org/Acd/index.html).
+
+It is possible to add your own tag types by modifying either the
+system @i{GTAGDB} file or creating your own @i{GTAGDB} file in your
+home directory (for all your databases) or the current directory (for
+just those in that directory).
+
+For rapid editing and deleting the F11 and F12 keys may be used. These
+edit and delete the top-most tag underneath the editing cursor. If you
+wish to edit or delete the tag underneath the mouse cursor instead (and
+hence save a mouse click) use Shift F11 and Shift F12 for edit and delete.
+
+The Control-Q key sequence may be used to toggle the displaying of tags.
+Pressing it once will prevent all tags from being displayed in the editor.
+This is sometimes useful to see any colouring information underneath the tag.
+Pressing Control-Q once more will redisplay them.
+
+ at node Editor-Annotatons-Macro
+ at subsection Annotation Macros
+
+_picture(contig_editor.tagmacro,4.50833in)
+
+For rapid annotating a series of 10 macros may be programmed. Press
+Shift and a function key between F1 and F10 to bring up the macro
+editor. This looks much like the normal tag editor except that @b{Save}
+is replaced with @b{Save Macro} and saving does not actually create a
+tag on the sequence. To use the macro, highlight the bases you wish and
+press the function key corresponding to that macro - F1 to F10. For a
+single base pair tag you do not need to underline a region as the tag
+will automatically cover the base underneath the editing cursor. To
+remember these permanently use the ``Save Tag Macros'' option in the
+``Settings'' menu.
+
+If you have an existing tag you wish to rapidly duplicate to many
+places, use Control plus a function key to copy the tag underneath the
+editing cursor to that numbered tag macro. This is simply a short cut
+for Shift and the function key, but without needing to manually
+replicate the tag type and textual comment.
+
+You may find that some function keys are already programmed to do other
+things (such as raise or lower windows), depending on the windowing
+environment in use. If this is the case either modify the configuration
+of your windowing system or simply use another macro key.
+
+ at example
+ at group
+ Shift   F1-F10           Create a tag macro via a dialogue window
+ Control F1-F10           Create a tag macro from tag at editor cursor
+ F1-F10                   Apply a tag macro (create a real tag)
+ at end group
+ at end example
+
+
+_split()
+ at node Editor-Searching
+ at section Searching
+ at cindex Searching: contig editor
+ at cindex Contig Editor: searching
+
+ at menu
+* Editor-Search-Anno::          Searching by annotation comments
+* Editor-Search-Type::          Searching by tag type
+* Editor-Search-PPosition::     Searching by padded position
+* Editor-Search-UPosition::     Searching by unpadded position
+* Editor-Search-Seq::           Searching by sequence
+* Editor-Search-Name::          Searching by reading name
+* Editor-Search-RefIndel::      Searching by reference indel
+* Editor-Search-ConsQual::      Searching by consensus quality
+* Editor-Search-ConsDiscrep::   Searching by consensus discrepancy
+* Editor-Search-ConsHet::       Searching by consensus heterozygosity
+* Editor-Search-depth-lt::      Searching by low coverage
+* Editor-Search-depth-ht::      Searching by high coverage
+ at end menu
+
+The contig editor's searching ability and its links to the consensus
+calculation algorithm are crucial in determining the efficiency with which
+contigs can be checked and corrected. The consensus is calculated ``on the
+fly'' and changes in response to edits. For editing, the most important
+search functions are those which reveal problems in the consensus
+whilst ignoring all bases that are adequately well determined.
+The standard search type is therefore by consensus quality. By default this
+is done in the forward direction and for a quality value of 30, although
+this is configurable by changing the following lines in the gap5rc file.
+
+ at example
+set_def CONTIG_EDITOR.SEARCH.DEFAULT_TYPE       consquality
+set_def CONTIG_EDITOR.SEARCH.DEFAULT_DIRECTION  forward
+set_def CONTIG_EDITOR.SEARCH.CONSQUALITY_DEF    30
+ at end example
+
+Pressing the ``Search'' button brings up a separate search
+window. This allows the user to select the direction of search, the
+type of search, and a value to search on. The value is entered into a
+value text box, then pressing the ``search'' button performs the
+search. If successful, the cursor is positioned accordingly.
+
+_picture(gap5_contig_editor.search,2.65in)
+
+The Control-s and Control-r key bindings in the editor are equivalent
+to searching for the next or previous match. Both key bindings will
+bring up the search window if it is not currently displayed (and not
+search), otherwise they perform the search currently selected in that
+window. Additionally, with the mouse focus in the search dialogue window,
+the Page Up and Page Down keys perform the previous and next search
+respectively.
+
+As is described below, there are several search modes.
+
+ at node Editor-Search-Anno
+ at subsection Search by Annotation Comments
+
+This positions the cursor at the start of the next tag which
+has a comment containing the string specified in the value box.
+ at c Only currently active tag types are searched.
+The search performed is a regular expression search, and
+certain characters have special meaning. Be careful when your
+string contains ``.'', ``*'', ``['', ``]'', ``\'', ``^'' or ``$''. The search can be
+performed either forwards or backwards from the current cursor
+position. Searching with an empty value will find all tags.
+
+ at node Editor-Search-Type
+ at subsection Search by Tag Type
+
+This positions the cursor at the start of the next tag of the specified
+type. To change the type, click on the currently listed tag type,
+which displays a tag type selection dialogue. The search can be
+performed either forwards or backwards from the current cursor
+position. To find all tags, use ``Search by Annotation Comments'',
+with an empty text box.
+
+ at node Editor-Search-PPosition
+ at subsection Search by Padded Position
+
+This jumps to a padded location in the editor and is directly equivalent
+to typing a number into the position entry box in the bottom left corner
+of the editor followed by ``p''.
+
+It is also possible to do relative searches by prefixing the location
+with + or -. So +100 will skip ahead 100 bases.
+
+ at node Editor-Search-UPosition
+ at subsection Search by Unpadded Position
+
+As per the padded search, but this jumps to an unpadded coordinate -
+essentially the number of non-* bases since the start of the contig,
+regardless of whether the first consensus base is labelled as base 1.
+
+ at node Editor-Search-Seq
+ at subsection Search by Sequence
+
+This positions the cursor at the start of the next segment of
+sequence that matches the value specified in the text box.
+The search is case insensitive, ignores pads, and can allow a specified
+number of mismatches. Unlike Gap4, Gap5's sequence search only looks
+in the consensus sequence. It also operates either forwards or
+backwards from the current editing cursor position.
+
+ at node Editor-Search-Name
+ at subsection Search by Reading Name
+
+This positions the cursor at the left end of the reading specified
+in the value text box. Note that not all reading names may be indexed
+by Gap5 and that the search will not find unindexed names. See
+ at code{tg_index -t} for information on creating Gap5 databases with
+reading name indices.
+
+The reading name has to be an exact match and so currently does not
+find prefix strings. If multiple sequences exist with the same name
+(which should be strongly discouraged) then it is undefined which will
+be found first.
+
+ at node Editor-Search-RefIndel
+ at subsection Search by Reference InDel
+
+Note: this information may not be available in all scenarios. If you
+imported the gap5 database from a SAM or BAM file there is an implicit
+set of reference coordinates used within SAM/BAM. Gap5 can keep track of
+the relationship between gap5's padded coordinate system and the
+reference coordinates. This function uses this data to search for the
+next or previous reference insertion or deletion.
+
+ at node Editor-Search-ConsQual
+ at subsection Search by Consensus Quality
+
+This positions the cursor on the consensus at the next
+position where the quality of the consensus is below a given
+threshold. The quality threshold should be entered into the
+value box and should be within the range of 0 to 100 inclusive.
+
+ at node Editor-Search-ConsDiscrep
+ at subsection Search by Consensus Discrepancy
+
+The consensus algorithm can keep track of the expected number of
+differences to the consensus given sequence depth and sequence quality
+values. This search looks for locations where the actual number of
+differences exceeds the expected amount by more than a specified factor.
+
+ at node Editor-Search-ConsHet
+ at subsection Search by Consensus Heterozygosity
+
+The consensus algorithm has a simple heterozygous calling method. Rather
+than simply weighing up the evidence for the base being A, C, G, T or a
+pad it also considers that it may be a combination of any two of these
+values. The consensus scores for the individual bases as well as the
+highest scoring consensus base can be seen in the editor information
+line when the mouse cursor is moved over a consensus base.
+
+This search is looking for consensus bases where the best heterozygous
+score is greater than or equal to the specified value.
+
+ at node Editor-Search-depth-lt
+ at subsection Search by Low Coverage
+
+This jumps to the next or previous location where the sequence coverage
+drops below a specified value.
+
+ at node Editor-Search-depth-ht
+ at subsection Search by High Coverage
+
+This jumps to the next or previous location where the sequence coverage
+is higher than a specified value. Regions of extreme depth are often
+an indication of misassemblies.
+
+_split()
+ at node Editor-Settings
+ at section The Settings Menu
+ at cindex Settings menu: contig editor
+ at cindex Contig Editor: settings menu
+ at cindex Consensus: contig editor
+ at cindex configure: contig editor
+ at cindex Settings: saving in contig editor
+ at cindex Contig Editor: saving settings
+ at cindex Contig Editor: saving configuration
+
+The purpose of this menu is to configure the operation of the contig
+editor. Settings can be saved using the ``Save settings'' button, which
+also saves preferences for the editor width and height and the location
+of the divider between the names and sequence panels. It does not save
+tag macros though; these may be saved separately using the ``Save tag
+macros'' option. Settings for the following options can be changed.
+
+ at ifset tex
+ at itemize @bullet
+ at item Group Readings
+ at item
+Highlight Disagreements
+ at itemize
+ at item By dots
+ at item By foreground colour
+ at item By background colour
+ at item Case sensitive
+ at end itemize
+ at item
+Set quality threshold
+ at item
+Pack sequences
+ at item
+Hide annotations
+ at item
+Background stripes
+ at item
+Show Mapping Quality
+ at item
+Show Template Status
+ at item
+Padded coordinates
+ at item
+Reference coordinates
+ at item
+Save tag macros
+ at item
+Save settings
+ at end itemize
+ at end ifset
+
+
+ at menu
+* Editor-Group Readings::       Group Readings
+* Editor-Disagree::             Highlight Disagreements
+* Editor-Pack Sequences::       Pack Sequences
+* Editor-Hide Annotations::     Hide Annotations
+ at end menu
+
+ at node Editor-Group Readings
+ at subsection Group Readings
+ at cindex Group Readings: contig editor
+ at cindex Contig Editor: Group Readings
+
+Sequences have an ``X'' location in the editor defined by the location
+within the contig that they align to. The ``Y'' location though is
+determined by the sequence layout algorithm, governed by the Pack
+Sequences setting and Group Readings options.
+
+By default sequences are grouped into distinct technologies, typically
+with longer sequences up the top (capillary) and shorter ones at the
+bottom (Illumina, SOLiD).  Within these technology groups the sequences
+are then sorted by their start location, so the top-most sequences start
+earlier and the bottom most sequences start later.
+
+The Group Readings menu allows user control over these primary and
+secondary collating orders. The sorting methods are defined below.
+
+ at table @asis
+ at item By technology
+Sorted in order of unknown, Sanger (capillary), Illumina, SOLiD, 454.
+
+ at item By clipped start
+Sorted by the visible (non-cutoff) start position.
+
+ at item By start
+Sorted by the start position, regardless of whether the base is in
+cutoff data or not.
+
+ at item By template
+Sorted by template name. In Gap5 this is always defined to be a prefix
+of the sequence name, or optionally the same as the sequence name. The
+sorting method is using a simple ASCII collation order.
+
+ at item By strand
+Sorts data into the top strand first followed by the bottom strand data.
+
+ at item By base
+This sort order is different from all others in that it depends on the
+location of the editor cursor.
+
+Sorts sequences by the base type overlapping the last editor cursor
+location in the consensus. The collation order is A, C, G, T, N and *.
+Sequences that do not overlap that consensus location or those that only
+overlap in the cutoff portion are not sorted by this method. If this is
+used as the primary sort then these other sequences will be sorted using
+the secondary sort. If the secondary sort is By Base then an implicit
+tertiary sort order of By Start is used.
+
+Note that moving the editing cursor around sequences will not update the
+Y order. Only placement of the editing cursor on the consensus will
+update this.
+ at end table
+
+ at node Editor-Disagree
+ at subsection Highlight Disagreements
+ at cindex Highlight Disagreements: contig editor
+ at cindex Contig Editor: Highlight Disagreements
+ at cindex Dots: contig editor highlight disagreements
+ at cindex Colour: contig editor highlight disagreements
+
+This toggles between the normal sequence display (showing the current base
+assignments) and one in which those assignments that differ from the consensus
+are highlighted. It makes scanning for problems by eye much easier.
+
+Several modes of highlighting are available: ``By dots'' will only display the
+bases that differ from the consensus, displaying all other bases as full
+stops if they match or colons if they mismatch but are poor
+quality. The definition of poor quality here can be adjusted using the
+``Set quality threshold'' option of the Settings menu. The base
+colours are as normal (ie reflecting tags and quality).
+
+Highlight disagreements ``By foreground colour'' and ``By background
+colour'' display all base characters, but colour those that differ
+from the consensus. Differing bases whose quality is below the
+quality threshold are shaded in light blue while high
+quality differences are dark blue. This allows easier
+visual scanning of the context that a difference occurs in, but it may
+be wise to disable the displaying of tags (hint: control-Q toggles
+tags on and off).
+
+Finally the ``Case sensitive'' toggle controls whether upper and lower
+case bases of the same base type should be considered as differences.
+
+ at node Editor-Pack Sequences
+ at subsection Pack Sequences
+
+This controls whether the editor allocates one row per sequence or
+whether it is permitted to pack multiple sequences onto a single row,
+assuming they do not overlap.
+
+The latter allows for a more compact plot which is desirable when
+dealing with short sequences, however it has the side effect that the
+reading names can no longer be listed in the names panel to the left.
+
+ at node Editor-Hide Annotations
+ at subsection Hide Annotations
+
+Sometimes we need to see the background shading underneath an
+annotation, for example to see the base quality or if we have
+Highlight Disagreements turned on using the @i{by background colour}
+mode. This option simply hides all annotations from display until it
+is selected again to reveal them once more.
+
+The Control-Q keyboard shortcut has the same effect.
+
+
+_split()
+ at node Editor-Primer Selection
+ at section Primer Selection
+ at cindex Primer Selection: contig editor
+ at cindex Contig Editor: Primer selection
+ at cindex Oligo selection: contig editor
+ at cindex Contig Editor: Oligo selection
+
+The ``Find Primer Walk'' function from the Commands menu is an
+interface to the Primer3 program (built into Gap5 so it does not need
+an external installation). Currently it only allows for selection of a
+single internal oligo suitable for ``walking'' along a
+template. It is designed for manual finishing work and is not
+appropriate for automatic finishing. Future plans are to add PCR support.
+
+The command brings up its own dialogue window.
+
+_picture(gap5_contig_editor.primer_dialogue,2.925in)
+
+The top portion of this window controls where to look for primers. By
+default it will be either side of the editing cursor location. We also
+specify here what strand we wish to run our experiment on.
+
+Below this are a series of Primer3 parameters. Please see the Primer3
+documentation for a full description of these.
+
+Upon hitting OK, and assuming that some primers can be found, a new
+window showing the available choices is presented.
+
+_picture(gap5_contig_editor.primers,5.4in)
+
+The primers shown are sorted by Primer3 score, with lower being
+better. Clicking on any of the other headings in the table allows the
+data to be re-sorted by that column. Clicking the left mouse button on
+any line will show the location of this primer in the main editor
+window as an underlined region. It also updates the bottom half of the
+Oligos window with further details.
+
+At the bottom of the window are two editable selections. The left most
+labelled ``Seq. name to tag'' allows us to pick a sequence we wish to
+place an oligo (@code{OLIG}) annotation on, which defaults to the
+consensus sequence. The right selection box labelled ``Template name''
+is a list of identified templates at this region, however this is not
+necessarily exhaustive as it only includes the sequences at this
+position and may miss some read-pairs that span this region. If you
+have a specific template in mind you can also type its name in
+here.
+
+Pressing the ``Add annotation'' button then creates an oligo
+annotation. The text associated with the annotation will depend on the
+primer chosen, but an example follows.
+
+ at example
+Sequence        AACACATGGTAAAGCAGATG
+Template        zDH64-714h06
+GC              40.0
+Temperature     53.45
+Score           1.54377204143
+Date_picked     Thu Aug 12 17:31:18 BST 2010
+Oligoname       ??
+ at end example
+
+
+_split()
+ at node Editor-Traces
+ at section Traces
+ at cindex Trace displays: contig editor
+ at cindex Contig Editor: trace display
+
+The original trace data from which the readings were derived can be
+displayed by double clicking (two quick clicks) with the left or
+middle mouse button on the area of interest. Control-t has the same
+effect.  The trace will be displayed centred around the base clicked
+upon and the name of the reading in the contig editor will be
+highlighted.  Double clicking on the consensus displays traces for all
+the readings covering that position.
+
+Moving the mouse pointer over a trace base causes the display of an
+information line at the bottom of the window. This gives the base
+type, its position in the sequence, and its confidence value.
+
+There are two forms of trace display which are selected using the ``Compact''
+button at the top of the Trace display. The compact form differs by not
+showing the Info, Diff, Comp. and Cancel buttons at the left of each trace.
+
+Note that Gap5 does not store the trace files in the project database:
+it stores only their names and reads them when required. By default it
+will attempt to look for them in the current working directory (likely
+the same directory as the gap database). However this can be adjusted
+to look in other directories or via URLs using ``Trace file location''
+in the main Gap5 configure menu
+(_fpref(Conf-Trace File Location, Trace File Location, configure)). 
+
+_picture(gap5_contig_editor.traces,6in)
+
+This figure is an example of the Trace Display showing three capillary
+traces and an Illumina trace.  On the top line, the Lock checkbutton
+keeps the trace data in sync with the editor cursor position. The
+layout is controlled by the Columns and Rows selectors at the top of
+the window; 2 columns by up to 3 rows in the above screenshot. Show
+confidence draws coloured bars and a numerical value representing the
+quality of each individual base-call. 
+
+The main trace panels each have the sequence name displayed in the top
+left corner. Below this are X and Y zoom controls on the left and the
+actual trace data on the right. The style of this will depend on the
+type of trace. Sanger chromatograms take multiple samples per base and
+are subsequently analysed (base-called) to identify the peaks and the
+number/type of bases represented by that peak. These are drawn using
+smooth lines, examples of which can be seen in the top row of the
+image above. Illumina GA instruments are ``clocked'' in that each and
+every measurement corresponds to one base. These are drawn using a
+stick plot, as seen in the bottom row of the screen-shot. Note that it
+is quite likely you will not have the processed trace data available
+for Illumina GA sequences due to size constraints, so the above is
+simply an example of what @i{could} be viewed rather than a typical
+example.
+
+454 instruments use pyro-sequencing and so produce a variable number
+of bases per measurement, with each measurement being clocked to a
+specific cycle (flow) on the sequencing instrument. Hence 454 data is
+also drawn using a stick plot, although with potentially multiple
+bases per measurement. An example is visible below.
+
+_picture(gap5_contig_editor.454trace,6in)
+
+The horizontal rulers in this plot correspond to normalised peak
+intensities for 1.0, 2.0 and so on to indicate 1, 2, 3... bases per
+flow. Clearly visible are flows of approximate height 1 (C T A G T on
+the left), 2 (the following AA) and 0 (the G between the left most C
+and T). Above these the confidence bars are visible.
+
+Right clicking on a trace will bring up a popup menu containing the
+following options.
+
+ at table @i
+ at item Information
+Displays some basic textual information about the trace. The
+information available will vary by trace type, but it may include
+details such as the length, instrument and run-date.
+
+ at item Save
+Saves the trace in ZTR format to a local file on disk. This can be
+useful for when you are using a remote service for fetching traces or
+extracting them from an archive such as an .sff or .srf file.
+
+ at item Complement
+Reverse complements the trace display. This does not modify data in
+any way, but simply adjusts how it is drawn.
+
+ at item Quit
+Removes this trace from the trace window. If it is the last displayed
+trace then the window will be removed too.
+ at end table
+
+
+_split()
+ at node Editor-Info
+ at section The Editor Information Line
+ at cindex Information line: contig editor
+ at cindex Status line: contig editor
+ at cindex Contig Editor: information line
+ at cindex Unpadded base positions
+
+The very bottom line of the editor display is a text line used by the editor to
+display pieces of useful information. Currently this gives information on
+individual bases, readings, the contig, and tags, as the mouse is moved over
+the appropriate object. Each type of object we move the mouse pointer
+over (sequence base, consensus base, sequence name panel, annotation)
+has its own list of information to display which can be configured
+using a format string stored in your @i{$HOME/.gap5rc} file.
+
+Typically you will not need to modify these, but if you choose to do
+so the default values to start from are shown below.
+
+
+ at smallexample
+# Mouse-over a sequence in the reading name panel
+set_def READ_BRIEF_FORMAT \
+        {Reading:%n(#%Rn)  Tech:%V  Length:%l(%L)  MappingQ:%m%**/%*m  Pos:%S%p / %*S%*p}
+
+# Mouse-over the "Consensus" label in the name panel
+set_def CONTIG_BRIEF_FORMAT  \
+        {Contig:%n(#%Rn)   Length:%l  Start:%s  End:%e}
+
+# Mouse-over a base in a sequence
+set_def BASE_BRIEF_FORMAT1  \
+        { Base %b confidence:%4.1c (Prob. %Rc, raw %4.1A %4.1C %4.1G %4.1T)   Position %Rp  %n}
+
+# Mouse-over a base in the consensus
+set_def BASE_BRIEF_FORMAT2  \
+        {Base confidence:%4.1c (Prob. %Rc)  A=%4.1A C=%4.1C G=%4.1G T=%4.1T *=%4.1*  Position %p}
+
+# Mouse-over an annotation
+set_def TAG_BRIEF_FORMAT  \
+        {Tag type:%t  Comment:"%.100c"}
+ at end smallexample
+
+The text output is as listed above, but replacing percent-code strings
+with a relevant piece of text. In many cases a capital R indicates raw
+mode to display a numerical value instead of a string. For example
+ at code{%n} in READ_BRIEF_FORMAT will be replaced by the sequence name
+while @code{%Rn} will be replaced by the sequence record number. The
+full syntax of percent expansion is as follows:
+
+ at itemize @bullet
+ at item
+        A percent sign.
+ at item
+        An optional minus sign to request left alignment of the information.
+        When the data displayed in a field does not fill the entire space
+        allowed, it will, by default, be right justified. Adding a minus
+        character here requests left justification.
+ at item
+        An optional minimum field width. This is a decimal number indicating
+        how much space to leave for this information.
+ at item
+        An optional precision for numbers or maximum field width for strings.
+        This is given as a fullstop followed by a decimal number.
+ at item
+        An optional 'R' to specify Raw mode. This changes the meaning of many
+        (but not all) of the expansion requests to give a numerical
+        representation of the data. For example %n is a reading name
+        and %Rn is a reading number.
+ at item
+        The expansion type itself. This is either one or two letters. See below
+        for full details of their meanings.
+ at end itemize
+
+To programmers this syntax may seem very similar to @code{printf}. This is
+intentional, but do not assume it is the same. Specifically the @code{printf}
+flags @code{%#}, @code{%+} and @code{%0} will not work.
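+
+As a sketch of how these pieces combine, the hypothetical override below
+(an illustration, not a shipped default) left-aligns the reading name in a
+20 character field and keeps the raw record number from the default string.
+It uses only the @code{%n}, @code{%Rn}, @code{%l} and @code{%L} expansions
+that already appear in the default READ_BRIEF_FORMAT.
+
+ at smallexample
+# Hypothetical override for $HOME/.gap5rc; adjust the field width to taste
+set_def READ_BRIEF_FORMAT \
+        {Reading:%-20n(#%Rn)  Length:%l(%L)}
+ at end smallexample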
+
+ at subsection Reading Information
+ at cindex Information line: readings in contig editor
+ at cindex READ_BRIEF_FORMAT
+ at cindex BASE_BRIEF_FORMAT1
+
+Used when we move the mouse over a sequence name in the names panel or
+a sequence base-call. Example output is @b{Reading:xc04a1.s1(#74)
+Tech:Sanger  Length:295(474)  MappingQ:50}. Note that not all
+expansions make sense when used in the names panel as no cursor
+X position is available.
+
+ at table @strong
+ at item %%
+        A single % sign
+ at item %n
+        Reading name. Raw mode: record number
+ at item %#
+        Reading record number
+ at item %p
+        Position in sequence. Raw mode: position in contig.
+ at item %l
+        Clipped sequence length
+ at item %L
+        Unclipped sequence length
+ at item %s
+        Start of clip
+ at item %e
+        End of clip
+ at item %S
+        Sense (whether complemented) - ``<<'' or ``>>''. Raw mode: 0/1
+ at item %d
+        Strand - ``+'' or ``-''. Raw mode: 0/1
+ at item %b
+        Base call
+ at item %c
+        Confidence value of called base (phred style). Raw mode: probability
+ at item %A
+ at itemx %C
+ at itemx %G
+ at itemx %T
+        Individual confidence (phred style) of A,C,G,T component in
+        log-odds form. Raw mode: probability value.
+ at item %m
+        Mapping Quality. Raw mode: probability of correctly mapped.
+ at item %V
+        Instrument type - Sanger, Illumina, SOLiD, 454 or Unknown.
+ at end table
+
+ at subsection Contig Information
+ at cindex Information line: contig in contig editor
+ at cindex CONTIG_BRIEF_FORMAT
+ at cindex BASE_BRIEF_FORMAT2
+
+For the CONTIG_BRIEF_FORMAT and BASE_BRIEF_FORMAT2 the following
+expansions apply. These operate on contigs and the consensus
+sequence. 
+
+ at table @strong
+ at item %%
+        Single % sign
+ at item %n
+        Contig name. Raw mode: contig record number.
+ at item %#
+        Contig record number
+ at item %p
+        Position in contig
+ at item %l
+        Length of contig
+ at item %s
+        Contig start coordinate
+ at item %e
+        Contig end coordinate
+ at item %b
+        Called consensus base
+ at item %c
+        Score for called consensus base. Raw mode: probability value
+ at item %A
+ at itemx %C
+ at itemx %G
+ at itemx %T
+ at itemx %*
+        Individual confidence for A,C,G,T,* base types in log-odds
+        form. Raw mode: as a probability value.
+ at end table
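+
+For instance, a hypothetical BASE_BRIEF_FORMAT2 (illustrative only, not a
+shipped default) that also reports the called consensus base, using only
+the expansions listed above, could be written as:
+
+ at smallexample
+# Hypothetical override for $HOME/.gap5rc
+set_def BASE_BRIEF_FORMAT2  \
+        {Base %b  confidence:%4.1c (Prob. %Rc)  Position %p}
+ at end smallexample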
+
+ at subsection Tag Information
+ at cindex Information line: tags in contig editor
+ at cindex TAG_BRIEF_FORMAT
+
+The TAG_BRIEF_FORMAT string is used to display annotation
+summaries. The possible percent encodings are as follows.
+
+ at table @strong
+ at item %%
+        Single % sign
+ at item %p
+        Tag position
+ at item %t
+        Tag type (always 4 characters)
+ at item %l
+        Tag length
+ at item %#
+        Tag number (0 if unknown)
+ at item %c
+        Tag comment
+ at end table
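+
+For instance, a hypothetical TAG_BRIEF_FORMAT (illustrative only) that adds
+the tag position and length to the default string and truncates the comment
+to 40 characters could be written as:
+
+ at smallexample
+# Hypothetical override for $HOME/.gap5rc
+set_def TAG_BRIEF_FORMAT  \
+        {Tag type:%t  Pos:%p  Len:%l  Comment:"%.40c"}
+ at end smallexample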
+
+
+_split()
+ at node Editor-Joining
+ at section The Join Editor
+ at cindex Join Editor
+ at cindex Contig Editor: joining
+
+Contigs are joined interactively using the Join Editor.  This is
+simply a pair of contig editor displays stacked above one another.
+The top editor is flipped in Y so that the consensus appears at the
+bottom. This allows the two consensus sequences to be adjacent to one
+another, separated only by a ``differences'' line.  Note that it is
+essential to align the contigs over the full length of their
+overlap. It is much more difficult to achieve this after a join has
+been made, and until the alignment is correct, the consensus sequence
+will be nonsense.
+
+The few differences between the Join Editor and the Contig Editor can be seen
+in the figure below. Otherwise all the commands and operations are the
+same as those for the Contig Editor.
+
+_picture(gap5_contig_editor.join,6in)
+
+One difference is the Lock button. When set (as it is in the
+illustration) scrolling either contig will also scroll the other contig.
+
+The Align button aligns the overlapping consensus sequences and adds
+pads as necessary. The alignment routine assumes that the two contigs
+are already in approximately the right relative position (as they are
+immediately after the Join Editor has been invoked from Find Internal
+Joins, or Find Repeats). If they are not, you may get better results by
+manually positioning them beforehand.
+
+The ``<'' and ``>'' buttons either side of the ``Align'' button
+restrict the alignment to the region from the editing cursor to the
+start of the contig and from the cursor to the end of the contig
+respectively. Alignment end-gaps are penalised at the cursor position but
+not at the contig start/end position. These buttons
+are useful when multiple alignment positions may be valid, as
+is the case with an overlap consisting entirely of a short tandem repeat.
+
+It should be noted that each of the pair of editors comprising the
+Join Editor maintains its own undo history, and using Align
+is likely to add to both undo histories. There is only one Undo
+button, but it applies to the editor last clicked within. A hint is
+given as to which of the two editors this is by highlighting the
+editor in a red border when the mouse is moved over the Undo button.
+
+Pressing the Join button will display a small dialogue box informing
+you of the length and percentage match of the overlap between the two
+contigs. At this point you can decide to make the join, to not make
+the join (both of which remove the editors from the screen) or to
+cancel, which leaves the Join Editor visible to permit further
+editing.
+
+
+_split()
+ at node Editor-Multiple Editors
+ at section Using Several Editors at Once
+ at cindex Contig Editor: multiple editors
+
+Several editors can be used simultaneously, even on the same contig.
+In the latter case, it is useful to understand the difference between
+the data and the view of the data.
+
+Each operating Contig Editor is a view of the data for
+a particular contig. With two editors
+viewing the same contig, making changes in either will modify the data
+that both are viewing, hence the change will be visible in both
+editors. Similarly, using Undo in either will undo the changes to both.
+
+Interaction between Contig Editors and Join Editors is more
+complicated and generally isn't advised. However such interactions
+work consistently with the notion of views of contigs. For example,
+suppose there are two Contig Editors open on two separate contigs, and in
+addition to these a Join Editor displaying both contigs. Making the
+join in the Join Editor will update the two stand-alone Contig Editors
+so that they are each viewing the correct positions in the new contig,
+even though they're both now viewing the same contig.
+
+_split()
+ at node Editor-Quitting
+ at section Quitting the Editor
+ at cindex Quitting: contig editor
+ at cindex Contig Editor: quitting
+
+The Exit operation in the File menu quits the editor. If changes have
+been made since the last save you will be asked whether you wish to
+save these changes.  Answering ``Cancel'' abandons the exit process
+and returns control to the editor; otherwise the appropriate
+action is taken and the editor quits.
+
+_split()
+ at node Editor-Summary
+ at section Summary
+ at cindex Summary: contig editor
+ at cindex Contig Editor: summary
+ at cindex Keyboard summary (contig editor)
+
+ at node Editor-Summary-Keys
+ at subsection Keyboard summary for editing window
+
+(``Left'', ``Right'', ``Up'', ``Down'' refer to the appropriate arrow keys.)
+
+ at example
+Page Up                         Scroll left by 1Kb
+Shift-Page Up                   Scroll left by 10Kb
+Control-Page Up                 Scroll left by 100Kb
+Shift-Control-Page Up           Scroll left by 1Mb
+
+Page Down                       Scroll right by 1Kb
+Shift-Page Down                 Scroll right by 10Kb
+Control-Page Down               Scroll right by 100Kb
+Shift-Control-Page Down         Scroll right by 1Mb
+
+Left arrow or Control-b         Move editing cursor left one base
+Right arrow or Control-f        Move editing cursor right one base
+Up arrow or Control-p           Move editing cursor up one line
+Down arrow or Control-n         Move editing cursor down one line
+Control-a or Home               Move editing cursor to start of sequence
+Control-e or End                Move editing cursor to end of sequence
+Alt-comma                       Move editing cursor to start of contig
+Alt-fullstop                    Move editing cursor to end of contig
+
+Control-t                       Display trace
+Control-s                       Search forward
+Control-r                       Search backwards
+Control-q                       Toggle tag display
+
+<                               Set left cutoff clip point (in sequence)
+>                               Set right cutoff clip point (in sequence)
+
+<                               Bulk clip left cutoff (in consensus)
+>                               Bulk clip right cutoff (in consensus)
+
+[                               Set confidence to 0
+]                               Set confidence to 100
+Shift Up                        Increase confidence of base by 1
+Shift Down                      Decrease confidence of base by 1
+Control Up                      Increase confidence of base by 10
+Control Down                    Decrease confidence of base by 10
+
+a, c, g, t or *                 Overwrite base with a new call.
+i or Insert                     Insert pad (or column if in consensus)
+Backspace or Delete             Delete padding character
+Ctrl-Backspace or Ctrl-Delete   Delete base (any base type)
+
+Control-right arrow             Move sequence right 1 base-pair
+Control-left arrow              Move sequence left 1 base-pair
+
+F11                             Edit tag under editing cursor
+F12                             Delete tag under editing cursor
+
+Shift F1 to Shift F10           Edit tag macro 1 to 10
+Control F1 to Control F10       Copy tag at editing cursor to macro 1 to 10
+F1 to F10                       Create tag from macro 1 to 10
+
+ at end example
+
+ at node Editor-Summary-Mouse
+ at subsection Mouse summary for editing window
+
+ at example
+Left button                     Position editing cursor to mouse cursor
+Left button (drag)              Mark start and end of selection
+Shift left button               Adjust end of selection
+Left button (double click)      Display trace
+Right button                    Display commands menu
+Mouse-wheel                     Vertically scroll the editor
+Control mouse-wheel             Vertically scroll the editor, fast
+Shift mouse-wheel               Horizontally scroll the editor
+Shift Control mouse-wheel       Horizontally scroll the editor, fast
+ at end example
+
+ at node Editor-Summary-MouseNames
+ at subsection Mouse summary for names window
+
+ at example
+ at group
+Left button + drag              Copy sequence name to clip-board
+Right button                    Display popup menu
+Mouse-wheel                     Vertically scroll the editor
+Control mouse-wheel             Vertically scroll the editor, fast
+ at end group
+ at end example
+
diff --git a/manual/gap5_contig_editor.454trace.png b/manual/gap5_contig_editor.454trace.png
new file mode 100644
index 0000000..2caf6a9
Binary files /dev/null and b/manual/gap5_contig_editor.454trace.png differ
diff --git a/manual/gap5_contig_editor.join.png b/manual/gap5_contig_editor.join.png
new file mode 100644
index 0000000..dcedc56
Binary files /dev/null and b/manual/gap5_contig_editor.join.png differ
diff --git a/manual/gap5_contig_editor.names1.png b/manual/gap5_contig_editor.names1.png
new file mode 100644
index 0000000..19f062c
Binary files /dev/null and b/manual/gap5_contig_editor.names1.png differ
diff --git a/manual/gap5_contig_editor.names2.png b/manual/gap5_contig_editor.names2.png
new file mode 100644
index 0000000..5c47a4d
Binary files /dev/null and b/manual/gap5_contig_editor.names2.png differ
diff --git a/manual/gap5_contig_editor.primer_dialogue.png b/manual/gap5_contig_editor.primer_dialogue.png
new file mode 100644
index 0000000..e93368c
Binary files /dev/null and b/manual/gap5_contig_editor.primer_dialogue.png differ
diff --git a/manual/gap5_contig_editor.primers.png b/manual/gap5_contig_editor.primers.png
new file mode 100644
index 0000000..4ff4243
Binary files /dev/null and b/manual/gap5_contig_editor.primers.png differ
diff --git a/manual/gap5_contig_editor.screen.png b/manual/gap5_contig_editor.screen.png
new file mode 100644
index 0000000..abcbc2a
Binary files /dev/null and b/manual/gap5_contig_editor.screen.png differ
diff --git a/manual/gap5_contig_editor.search.png b/manual/gap5_contig_editor.search.png
new file mode 100644
index 0000000..51bce8d
Binary files /dev/null and b/manual/gap5_contig_editor.search.png differ
diff --git a/manual/gap5_contig_editor.traces.png b/manual/gap5_contig_editor.traces.png
new file mode 100644
index 0000000..09be07b
Binary files /dev/null and b/manual/gap5_contig_editor.traces.png differ
diff --git a/manual/gap5_contig_selector.png b/manual/gap5_contig_selector.png
new file mode 100644
index 0000000..31c6364
Binary files /dev/null and b/manual/gap5_contig_selector.png differ
diff --git a/manual/gap5_delete_contigs.png b/manual/gap5_delete_contigs.png
new file mode 100644
index 0000000..f3298cd
Binary files /dev/null and b/manual/gap5_delete_contigs.png differ
diff --git a/manual/gap5_disassembly-t.texi b/manual/gap5_disassembly-t.texi
new file mode 100644
index 0000000..4779992
--- /dev/null
+++ b/manual/gap5_disassembly-t.texi
@@ -0,0 +1,223 @@
+ at chapter Checking Assemblies and Removing Readings
+ at menu
+* Check Assembly::                    Checking Assemblies
+* Removing Readings::                 Removing Readings and Breaking Contigs
+* Break Contig::                      Breaking Contigs
+* Disassemble::                       Disassembling Readings
+* Delete Contigs::                    Delete Contigs
+ at end menu
+
+ at cindex assembly problems: breaking contigs
+ at cindex assembly problems: removing readings
+ at cindex assembly problems: disassembling readings
+
+After assembly, and prior to editing, it can be useful to examine the
+quality of the alignments between individual readings and the
+sections of the consensus which they overlap. This may
+reveal doubtful joins between sections of contigs, poorly aligned
+readings, or readings that have been misplaced. By using this analysis
+in combination with other gap5
+functions 
+such as Find internal joins (_fpref(FIJ, Find Internal
+Joins, fij)) and Find repeats (_fpref(Repeats, Find Repeats,
+repeats)), 
+it is also possible to discover if 
+readings have been positioned in the
+wrong copies of repeat elements. 
+
+If readings are found to be misplaced
+or need removing for other reasons, gap5 has functions
+for breaking contigs
+(_fpref(Break Contig, Breaking Contigs, disassembly)),
+and removing readings
+(_fpref(Disassemble, Disassembling Readings, disassembly)).
+These functions can be accessed through the main gap5 Edit menu or from
+within the Contig Editor.
+
+If readings are removed from contigs to start new contigs of one
+reading, these contigs can then be processed by Find internal joins 
+(_fpref(FIJ, Find Internal
+Joins, fij)) 
+and the Join editor
+(_fpref(Editor-Joining, The Join Editor, contig_editor)), which should
+reveal all the other positions at which the reading matches.
+
+ at page
+_split()
+ at node Check Assembly
+ at subsection Checking Assemblies
+ at cindex Check assembly
+
+The Check Assembly routine (which is invoked from the gap5 View menu) is
+used to check contigs for potentially misassembled readings by comparing
+them against the segment of the consensus which they overlap.  It simply
+slides a small window along the sequence identifying regions of high
+disagreement between that portion of sequence and the consensus. Results
+are displayed in the Output Window and plotted on the main diagonal in
+the Contig Comparator. _fxref(Contig Comparator, Contig Comparator,
+comparator)
+
+From the Contig Comparator the user can invoke the Contig Editor to
+examine the alignment of any problem reading. _fxref(Editor, Editing in
+gap5, contig_editor) If the reading appears to be correctly positioned
+the user can edit it; otherwise its name can be selected to add it to the
+``readings'' list for subsequent disassembly or removal.
+
+_picture(gap5_check_ass,3.34167in)
+
+Users select either to search only one contig ("single"), all contigs
+("all contigs"), or a subset of contigs contained in a "file" or a
+"list". If "file" or "list" is selected the "browse" button will be
+activated and clicking on it will invoke a file or list browser. If a
+single contig is selected the "Contig identifier" dialogue will be
+activated and users should enter a contig name.
+
+Both the percentage disagreement threshold and the window size are
+configurable parameters. Additionally there is a parameter to control
+whether N bases in the sequence should be considered as disagreements or
+not. The choice will depend on whether you are looking for sequences
+that appear to be in the wrong place (ignore Ns) or simply sequences
+that appear to have a large number of incorrect base calls (keep Ns).
+
+ at cindex reading percent mismatch
+ at cindex readings: sorted on alignment score
+ at cindex aligned readings: sorted on alignment score
+
+The "Information" window, opened by selecting "Information" from the
+Contig Comparator "Results" menu, gives a summary of the results
+sorted in order of percentage mismatch.
+
+
+By clicking with the right mouse button
+on results plotted in the Contig Comparator a pop-up menu is revealed
+which can be used to invoke the Contig Editor
+(_fpref(Editor, Editing in gap5, contig_editor)). The editor will start
+up with the cursor positioned on the problem reading. If the reading is
+found to be misplaced it can be marked for removal from within the Editor
+(_fpref(Editor-Comm-Remove Reading, Remove Reading, contig_editor)).
+However, prior to this it may be beneficial to use some of the other
+analyses such as Find internal joins (_fpref(FIJ, Find Internal
+Joins, fij)) and Find repeats (_fpref(Repeats, Find Repeats,
+repeats)), which may help to find its correct location. Both of these
+functions produce results plotted in the Contig Comparator
+(_fpref(Contig Comparator, Contig Comparator, comparator)) and any
+alternative locations will give matches on the same vertical or
+horizontal projection as the problem reading.
+
+ at page
+_split()
+ at node Removing Readings
+ at section Removing Readings and Breaking Contigs
+
+Occasionally contigs require more drastic changes than simple basecall
+edits. Sometimes it is necessary
+to remove readings that have been put in the wrong
+place, or to break contigs that should not have been joined. Gap5
+contains functions to help with these problems, and two
+types of interface. 
+
+If a contig
+needs to be broken cleanly into two new contigs, with all the readings,
+other than the two at the incorrect join, still linked together, then
+Break Contig 
+(_fpref(Break Contig, Breaking Contigs, disassembly)), or
+(_fpref(Editor-Comm-Break Contig, Break Contig, contig_editor))
+should be used. The former interface is available via the main gap5 Edit
+menu, and the latter as an option in the Contig Editor.
+
+If one or more readings need removing from contig(s), even if their
+removal will break the contiguity of a contig, then Disassemble Readings
+(_fpref(Disassemble, Disassemble Readings, disassembly)), or
+(_fpref(Editor-Comm-Remove Reading, Remove Reading, contig_editor))
+should be used. The former interface is available via the main gap5 Edit
+menu, and the latter as an option in the Contig Editor. Readings can be
+removed from the database completely, or moved to start individual new
+contigs, one for each reading.
+
+
+ at page
+_split()
+ at node Break Contig
+ at subsection Breaking Contigs
+ at cindex Break contig
+
+The Break Contig function (which is available from the gap5 Edit menu)
+enables contigs to be broken by removing the link between two adjacent
+readings. The user defines the contig coordinate to break at. All
+sequences starting to the right of that position will be placed into a
+new contig.
+
+_picture(gap5_break_contig,3.34167in)
+
+Breaking a contig can sometimes cause more holes to be created. The
+``Remove contig holes'' option causes additional breaks to be made at
+these holes, producing more than one additional contig.  If we have
+aligned against a reference and expect regions of zero coverage then
+this option should be disabled.
+
+ at page
+_split()
+ at node Disassemble
+ at subsection Disassembling Readings
+ at cindex Disassemble readings
+ at cindex Removing readings
+
+This function is used to remove readings from a database or move
+readings to new contigs. 
+
+_picture(gap5_disassembly,3.40833in)
+
+If readings are removed from the database all references to them are
+deleted. If a reading is moved to a ``single-read contig'' a new
+contig will be created containing this one single reading, which may
+then be re-processed by Find Internal Joins
+(_fpref(FIJ, Find Internal Joins, fij)) 
+and the Join editor
+(_fpref(Editor-Joining, The Join Editor, contig_editor)), which should
+reveal all the other positions at which the reading matches.
+
+More useful is the general ``Move readings to new contigs''. This will
+keep any assembly relationships intact between the set of readings to
+be disassembled. For example if three readings overlap then when
+disassembled all three will end up in a single new contig. This
+function is particularly useful for pulling apart false joins or
+repeats.
+
+The set of readings to be processed can be read from a ``file'' or a ``list'' and
+clicking on the ``browse'' button will invoke an appropriate browser. If just a
+single reading is to be disassembled choose ``single'' and enter the
+reading name instead of a file or list name.
+
+Removal via a ``list'' is a particularly powerful option when
+controlled via the list generation functions within the contig
+editor. For example break contig could be viewed as disassembling a
+list of readings selected using ``Select this reading and all to
+right''.
+
+Unlike gap4, gap5 can cope with having holes in contigs. (This is
+obviously a requirement when dealing with mapped alignments.)  Hence
+gap5 gives us a choice whether to break contigs into two (or more)
+pieces when removing sequences produces holes in the contigs. By
+default this is enabled.
+
+ at page
+_split()
+ at node Delete Contigs
+ at subsection Delete Contigs
+ at cindex Delete Contigs
+ at cindex Removing contigs
+
+While Disassemble Readings is capable of removing entire contigs, it is
+inefficient for this task as it has a lot of additional house-keeping to
+perform.
+
+_picture(gap5_delete_contigs,3.34167in)
+
+Delete Contigs should be used when we wish to remove entire contigs.
+Be careful not to choose this over Disassemble Readings by accident:
+even when given a single sequence name, this function will interpret it
+as a request to remove all other sequences in that contig too.
+
+There is no Undo feature, so backups are advised beforehand.
+
+
diff --git a/manual/gap5_disassembly.png b/manual/gap5_disassembly.png
new file mode 100644
index 0000000..441c737
Binary files /dev/null and b/manual/gap5_disassembly.png differ
diff --git a/manual/gap5_export-t.texi b/manual/gap5_export-t.texi
new file mode 100644
index 0000000..b447208
--- /dev/null
+++ b/manual/gap5_export-t.texi
@@ -0,0 +1,85 @@
+ at node ImportGFF
+ at section Importing GFF
+ at cindex GFF: importing from
+ at cindex Import GFF Annotations
+
+Annotations within GFF files can be imported to Gap5 as annotations
+(sometimes referred to as tags).  The ``Import GFF Annotations''
+function in the main File menu performs this task. Note that in order
+for this to work the contigs should not have been edited or
+complemented since the GFF file was created, otherwise the coordinates
+in the GFF file will not match.
+
+One caveat to this relates to sequence gaps.  By default consensus
+gaps/padding characters are excluded from the contig consensus
+sequences when counting GFF sequence coordinates.  In some cases we
+may wish to support annotations in a gapped sequence, so the ``GFF
+coordinates are already padded'' checkbox may be used to disable this
+coordinate de-padding process.
+
+
+_split()
+ at node ExportTags
+ at section Export Tags
+ at cindex Export Tags
+ at cindex Export GFF
+ at cindex GFF: exporting
+
+This dialogue allows annotations (``tags'') to be written to disk as
+a GFF version 3 file.
+
+Currently this just uses the GFF ``remark'' type, but it is planned
+to support a wider variety of GFF types in future.
+
+_picture(gap5_export_tags,3.325in)
+
+By default the coordinates generated are de-padded, such that ``*''s
+in the consensus sequence are not counted when identifying the
+coordinate of an annotation. This may be disabled by deselecting the
+``Unpadded coordinates'' checkbox.
+
+The object a tag is attached to is typically the contig it is within,
+with the contig name being used in the first column of the GFF
+file. This applies even for annotations placed on a sequence rather
+than the consensus. This feature may also be disabled by deselecting
+the ``Map sequence tags to consensus'' checkbox.
+
+Example GFF output follows, with ``...'' to denote lines truncated for
+illustrative purposes.
+
+ at example
+Contig6  gap5  remark  4745  4745  .  .  .  type=COMM;Note=Possible SNP?
+Contig2  gap5  remark  3178  3196  .  .  .  type=OLIG;Note=Template%09xb63f10%0AOligoname%09??%0A...
+ at end example
+
+Note that URL style percent encoding is used to avoid GFF format
+metacharacters, as per the GFFv3 specification. In the second line
+above, @code{%09} encodes a tab and @code{%0A} a newline.
+
+
+_split()
+ at node ExportSequences
+ at section Export Sequences
+ at cindex Export Sequences
+
+This function exports sequence and annotation data from a Gap5
+database to a variety of assembly formats.
+
+_picture(gap5_export_sequences,3.48333in)
+
+The fasta and fastq formats are basic, holding sequence only and sequence
+plus quality respectively, with no support for contigs or alignments.  The
+BAF, CAF, ACE and SAM formats all hold assembly data and so are reasonably
+complete representations of the data within Gap5. Note that ACE does not directly
+support quality values and this export function does not create the
+associated phdball file that houses this data.
+
+There is also no direct support for BAM, however command line tools
+like samtools or picard can convert the SAM file into BAM format. The
+SAM file should already be sorted by position.
+
+For SAM only there are additional options: whether to fix mate-pair
+information and whether to use depadded coordinates.  The former
+ensures that the MRNM (Mate Reference Name), MPOS and ISIZE fields are
+filled out. Note that this considerably slows down the speed of
+exporting, so it is disabled by default.
+
diff --git a/manual/gap5_export_sequences.png b/manual/gap5_export_sequences.png
new file mode 100644
index 0000000..6c70a0b
Binary files /dev/null and b/manual/gap5_export_sequences.png differ
diff --git a/manual/gap5_export_tags.png b/manual/gap5_export_tags.png
new file mode 100644
index 0000000..3f4570f
Binary files /dev/null and b/manual/gap5_export_tags.png differ
diff --git a/manual/gap5_fij-t.texi b/manual/gap5_fij-t.texi
new file mode 100644
index 0000000..889b81a
--- /dev/null
+++ b/manual/gap5_fij-t.texi
@@ -0,0 +1,187 @@
+ at cindex Find internal joins
+ at cindex joining contigs
+ at cindex contig joining
+ at cindex hidden data
+ at cindex overlap finding
+ at cindex finding overlaps
+ at cindex finding joins
+ at cindex masking
+ at cindex marking
+
+The purpose of this function (which is invoked from the Gap5 View menu)
+is to use sequences already in the database
+to find possible joins between contigs.  Generally these will be joins
+that were missed or judged to be unsafe during assembly and this
+function allows users to examine the overlaps and decide if they should
+be made. During assembly joins may have been missed because of poor
+data, or not been made because the sequence was repetitive.  Also it may
+be possible to find potential joins by extending the consensus sequences
+with the data from the 3' ends of readings which was considered to be
+too unreliable to align during assembly i.e. we can search in the
+"hidden data".
+
+If it has not already occurred, use of this function will automatically
+transform the Contig Selector into the Contig Comparator.  Each match
+found is plotted as a diagonal line in the Contig Comparator, and is
+written as an alignment in the Output Window. The length of the diagonal
+line is proportional to the length of the aligned region. If the match
+is for two contigs in the same orientation the diagonal will be parallel
+to the main diagonal; if they are not in the same orientation the line
+will be perpendicular to
+the main diagonal. The matches displayed in the Contig Comparator can be
+used to invoke the Join Editor (_fpref(Editor-Joining, The Join Editor,
+contig_editor)) 
+or Contig Editor.  _fxref(Editor,
+Editing in gap5, contig_editor) 
+Alternatively, the "Next" button at the top left of the Contig
+Comparator can be used to select each result in turn, starting with the
+best, and ending with the worst. When this is in use, users can find the 
+match in the Contig Comparator which corresponds to the next result by
+placing the cursor over the Next button. The plotted match and the contigs
+involved will turn white.
+
+_picture(gap5_comparator,5.25833in)
+
+A typical display from the Contig Comparator is shown in the figure
+above. 
+
+To define the match all numbering is relative to base number one in the
+contig: matches to the left (i.e.  in the hidden data) have negative
+positions, matches off the right end of the contig (i.e. in the hidden
+data) have positions greater than that of the contig length.  The
+convention for reporting the positions of overlaps is as follows: if
+neither contig needs to be complemented the positions are as shown.  If
+the program says "contig x in the - sense" then the positions shown
+assume contig x has been complemented. For example, in the results given
+below the positions for the first overlap are as reported, but those for
+the second assume that the contig in the minus sense (i.e. 443) has been
+complemented.
+
+ at example
+Possible join between contig   445 in the + sense and contig   405
+Percentage mismatch after alignment =  4.9
+       412        422        432        442        452        462
+    405  TTTCCCGACT GGAAAGCGGG CAGTGAGCGC AACGCAATTA ATGTGAG,TT AGCTCACTCA
+          ::::::::: : ::::::::  ::::: ::: :::::::::: :::::::::: ::::::::::
+    445  *TTCCCGACT G,AAAGCGGG TAGTGA,CGC AACGCAATTA ATGTGAG*TT AGCTCACTCA
+      -127       -117       -107        -97        -87        -77
+       472        482        492        502        512
+    405  TTAGGCACCC CAGGCTTTAC ACTTTATGCT TCCGGCTCGT AT
+         :::::::::: :::::::::: :::::::::: :::::::::: ::
+    445  TTAGGCACCC CAGGCTTTAC ACTTTATGCT TCCGGCTCGT AT
+       -67        -57        -47        -37        -27
+Possible join between contig   443 in the - sense and contig   423
+Percentage mismatch after alignment = 10.4
+        64         74         84         94        104        114
+    423  ATCGAAGAAA GAAAAGGAGG AGAAGATGAT TTTAAAAATG AAACG*CGAT GTCAGATGGG
+         :::: ::::: :::::::::: :::::::::: ::::::  :: ::::: :::: :::::::::
+    443  ATCG,AGAAA GAAAAGGAGG AGAAGATGAT TTTAAA,,TG AAACGACGAT GTCAGATGG,
+      3610       3620       3630       3640       3650       3660
+       124        134        144        154        164
+    423  TTG*ATGAAG TAGAAGTAGG AG*AGGTGGA AGAGAAGAGA GTGGGA
+         ::: :::::: :::::::::: :: :::::::  ::: ::::: :: ::
+    443  TTGGATGAAG TAGAAGTAGG AGGAGGTGGA ,GAG,AGAGA GTTGG*
+      3670       3680       3690       3700       3710
+ at end example
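The numbering convention described above can be sketched as a small helper. This is only an illustration: the function name and the treatment of position 0 are assumptions, and the contig length used in the usage line is hypothetical.

```python
def classify_position(pos: int, contig_length: int) -> str:
    """Say where a reported match position falls.

    Positions are relative to base number one of the contig. Anything
    below 1 is taken to lie in the hidden data off the left end
    (assumption: position 0 is treated like a negative position), and
    anything above the contig length lies in the hidden data off the
    right end.
    """
    if pos < 1:
        return "left hidden data"
    if pos > contig_length:
        return "right hidden data"
    return "consensus"


# Position -127 of contig 445 in the first alignment, with a made-up length.
print(classify_position(-127, 500))
```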
+
+_split()
+ at node FIJ-Dialogue
+ at subsection Find Internal Joins Dialogue
+ at cindex Find internal joins: dialogue
+
+_picture(gap5_fij.dialogue,3.39167in)
+
+The contigs to use in the search can be defined as "all contigs", a list
+of contigs in a file "file", or a list of contigs in a list "list".
+If "file" or "list" is selected the browse button is activated
+and gives access to file or list browsers.
+Two types of search can be selected: one, "Probe all against all"
+compares all the contigs defined against one another; the other "Probe
+with single contig", compares one contig against all the contigs in the
+list. If this option is selected the Contig identifier panel in the
+dialogue box is ungreyed. Both senses of the sequences are compared.
+
+
+If users elect not to "Use standard consensus" they can either "Mark
+active tags" or "Mask active tags", in which cases the "Select tags"
+button will be activated. Clicking on this button will bring up a check
+box dialogue to enable the user to select the tag types they wish to
+activate. Masking the active tags means that all segments covered by
+tags that are "active" will not be used by the matching algorithms.
+A typical
+use of this mode is to avoid finding matches in segments covered by tags
+of type ALUS (i.e. segments thought to be Alu sequence)
+or REPT (i.e. segments that are known to be repeated elsewhere in
+the data) (_fpref(Anno-Types, Tag types, tags)). "Marking" is of less use:
+matches will be found in marked
+segments during searching, but in the alignment shown
+in the Output Window, marked segments will be shown in lower case.
+
+Some alignments may be very large. For speed and ease of scrolling
+Gap5 does not display the textual form of the longest alignments,
+although they are still visible within the contig comparator
+window. The maximum length of the alignment to print up is controlled
+by the ``Maximum alignment length to list (bp)'' control.
+
+The default setting for the consensus
+is to "Use hidden data" which means that where possible the
+contigs are extended using the poor quality data from the readings near
+their ends. To ensure that this additional data is not so poor that
+matches will be missed, the program uses algorithms which can be configured
+from the "Edit hidden data parameters" dialogue. Two algorithms are available.
+Both slide a window along the reading until a set criterion is met.
+By default an algorithm which sums confidence values within the window is used.
+It stops when a window with an average confidence below "Minimum average confidence" is found. The other
+algorithm counts the number of uncalled bases in the window and stops when
+the total reaches "Max number of uncalled bases in window".
+The selected algorithm is applied to all the readings near the ends of contigs
+and the data that extends the contig the furthest is added to its consensus
+sequence. 
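The two window-based stopping rules described above can be sketched as follows. This is a minimal illustration, not the Gap5 implementation: the function names, the choice to cut at the start of the first failing window, and the characters treated as uncalled bases (`-` and `N`) are all assumptions.

```python
def extend_by_confidence(conf, window, min_avg_conf):
    """Return how many hidden bases to keep using the confidence rule.

    Slides a window along the per-base confidence values and stops at
    the first window whose average is below min_avg_conf. Assumption:
    the extension is cut at the start of that window.
    """
    for start in range(0, len(conf) - window + 1):
        if sum(conf[start:start + window]) / window < min_avg_conf:
            return start
    return len(conf)


def extend_by_uncalled(bases, window, max_uncalled, uncalled_chars="-N"):
    """Return how many hidden bases to keep using the uncalled-base rule.

    Stops at the first window in which the number of uncalled bases
    reaches max_uncalled. The set of uncalled characters is an assumption.
    """
    for start in range(0, len(bases) - window + 1):
        count = sum(1 for b in bases[start:start + window] if b in uncalled_chars)
        if count >= max_uncalled:
            return start
    return len(bases)


print(extend_by_confidence([30, 30, 30, 5, 5, 5], window=3, min_avg_conf=20))
print(extend_by_uncalled("ACGT-N-ACG", window=4, max_uncalled=2))
```

Under either rule, the reading whose accepted extension reaches furthest past the contig end would be the one whose data is appended to the consensus.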
+
+If your total consensus sequence length (including a 20 character header for
+each contig that is used internally by the program) plus any hidden data 
+at the ends of contigs is greater than the current value of a parameter 
+called maxseq, Find Internal Joins may produce an error message advising 
+you to increase maxseq. Maxseq can be set on the command line
+(_fpref(Gap4-Cline, Command line arguments, gap4)) or by using the options
+menu (_fpref(Conf-Set Maxseq, Set Maxseq, configure)).
+
+The search algorithms first find matching words of length "Word length",
+and only considers overlaps of length at least "Minimum overlap". Only
+alignments better than "Maximum percent mismatches" will be reported.
+
+There are three search algorithms: ``Sensitive'', ``Quick'' and
+``Fastest''. The quick or fastest algorithm should be applied first, and
+then the sensitive one employed to find any less obvious overlaps.
+
+The sensitive algorithm sums the lengths of
+the matching words of length "Word length" on each diagonal. It then finds
+the centre of gravity of the most significant diagonals. Significant diagonals
+are those whose probability of occurrence is < "Diagonal threshold". It then
+uses a dynamic programming algorithm to align around the centre of gravity,
+using a band size of "Alignment band size (percent)". For example: if the 
+overlap was 1000 bases long and the percentage set at 5, the aligner would 
+only consider alignments within 50 bases either side of the centre of gravity.
+Obviously the larger the percentage and the overlap, the slower the alignment.
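The band-size arithmetic in the example above can be written out directly. The helper names are hypothetical, and the cell count is only a rough estimate of the work a banded aligner does, intended to show why larger bands and longer overlaps are slower.

```python
def band_half_width(overlap_length: int, band_percent: float) -> int:
    """Number of bases considered either side of the centre of gravity."""
    return int(overlap_length * band_percent / 100)


def approx_banded_cells(overlap_length: int, band_percent: float) -> int:
    """Rough count of dynamic programming cells inside the band.

    Assumption: one row per overlap base, each row spanning the band on
    both sides of the central diagonal.
    """
    half = band_half_width(overlap_length, band_percent)
    return overlap_length * (2 * half + 1)


print(band_half_width(1000, 5))       # the case from the text: 50 bases either side
print(approx_banded_cells(1000, 5))   # compare with 1000 * 1000 for an unbanded alignment
```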
+
+The fastest and quick algorithms can find overlaps and align 100,000
+base sequences in a few seconds by considering, in their initial phase,
+only matching segments of length "Minimum initial match length".  However
+they do a dynamic programming alignment of all the chunks between the
+matching segments, and so produce an optimal alignment. Again a banded
+dynamic programming algorithm can be selected, but as this only applies to the
+chunks between matching segments, which for good alignments will be very
+short, it should make little difference to the speed.  The fastest and
+quick methods only differ in how aggressively they prune potential
+alignments before entering the dynamic programming phase.
+
+After the search the results will be sorted so that the best matches
+are at the top of a list where best is defined as a combination of
+alignment length and alignment percent identity. This list can be
+stepped through, one result at a time using the Contig Joining Editor,
+by clicking on the "Next" button at the top left of the Contig
+Comparator.
+
+ at cindex error messages: find internal joins
+ at cindex error messages: maxseq
+ at cindex maxseq: find internal joins
diff --git a/manual/gap5_fij.dialogue.png b/manual/gap5_fij.dialogue.png
new file mode 100644
index 0000000..d54b5de
Binary files /dev/null and b/manual/gap5_fij.dialogue.png differ
diff --git a/manual/gap5_find_read_pairs.png b/manual/gap5_find_read_pairs.png
new file mode 100644
index 0000000..22734bd
Binary files /dev/null and b/manual/gap5_find_read_pairs.png differ
diff --git a/manual/gap5_list_libraries.png b/manual/gap5_list_libraries.png
new file mode 100644
index 0000000..f3478a4
Binary files /dev/null and b/manual/gap5_list_libraries.png differ
diff --git a/manual/gap5_org-t.texi b/manual/gap5_org-t.texi
new file mode 100644
index 0000000..47d79dc
--- /dev/null
+++ b/manual/gap5_org-t.texi
@@ -0,0 +1,70 @@
+ at node Gap5-Intro-Manual
+ at chapter Organisation of the gap5 Manual
+
+
+The main body of the gap5 manual is divided, where possible, 
+into sections covering related topics. If appropriate, these sections
+commence with an overview of the functions they contain.
+After the Introduction, the manual contains chapters on some important
+components of the user interface: the Contig Selector
+(_fpref(Contig Selector, Contig Selector, contig_selector)),
+the Contig Comparator
+(_fpref(Contig Comparator, Contig Comparator, comparator)),
+and then, in the chapter on Contig Overviews 
+(_fpref(Contig-Overviews, Contig Overviews, c))
+we describe the Template
+Display
+(_fpref(Template-Display, Template Display, template)),
+and its subcomponents
+the Stop Codon Plot
+(_fpref(Stops, Stop Codon Map, stops)), and the
+Restriction Enzyme Plot
+(_fpref(Restrict, Restriction Enzyme Search, restrict_enzymes)).
+
+Then there is a long chapter on the powerful Contig Editor
+(_fpref(Editor, Editor introduction, contig_editor)), followed by a
+chapter describing the many assembly engines and assembly modes which
+gap4 can offer
+(_fpref(Assembly, Assembly Introduction, assembly)).
+
+Gap4 contains functions to use the data in an assembly database to find the
+left to right order of contigs, and to compare their consensus sequences
+to look for joins that may have been missed during assembly.
+A "read-pair" is obtained by sequencing a DNA template (or "insert")
+from both ends: we then know the relative orientations of the two
+readings, and if we know the approximate
+template length, we know how far apart they
+should be after assembly. The next chapter is on the use of read-pair
+data for ordering contigs and checking assemblies and on the use of
+consensus comparisons for finding joins
+(_fpref(Ordering-and-Joining, Ordering and Joining Contigs, t)).
+
+
+The next chapter is on checking assemblies and removing readings
+(_fpref(Contig-Checking-and-Breaking, Checking Assemblies and Removing
+Readings, t)). The following chapter describes gap4's methods for
+suggesting experiments for helping to finish a sequencing project 
+(_fpref(Experiments, Finishing Experiments, experiments)). Then we
+describe the various consensus calculation algorithms, and the options
+for creating consensus sequence files
+(_fpref(Con-Calculation, The Consensus Calculation,
+calc_consensus)). Next is the description of a set of miscellaneous
+functions
+(_fpref(gap4-misc, Miscellaneous functions, t)), followed by chapters on
+the Results Manager
+(_fpref(Results, Results Manager, results)),
+Lists
+(_fpref(Lists, Lists Introduction, lists)),
+Notes
+(_fpref(Notes, Notes, notes)),
+Configuring gap4
+(_fpref(Conf-Introduction, Options Menu, configure)),
+gap4 Database Files
+(_fpref(GapDB, Gap Database Files, gap4)),
+_ifdef([[_unix]],[[Converting Old Databases
+(_fpref(Convert, Converting Old Databases, t)),
+]])Checking Databases for corruptions
+(_fpref(Check Database, Check Database, check_db))
+and Doctoring corrupted databases
+(_fpref(Doctor Database, Doctor database, doctor_db)).
+
diff --git a/manual/gap5_read_pairs-t.texi b/manual/gap5_read_pairs-t.texi
new file mode 100644
index 0000000..d995d75
--- /dev/null
+++ b/manual/gap5_read_pairs-t.texi
@@ -0,0 +1,59 @@
+ at cindex Find read pairs
+ at cindex Read pairs
+
+This function is used to check the positions and orientations of
+readings taken from the same templates. 
+It is invoked from the gap5 View menu.
+
+For each template the relative
+position of its readings and the contigs they are in are examined. This
+analysis can give information about the relative order, separation and
+orientations of contigs and also show possible problems in the data.
+The search can be over the whole database or a subset of contigs named
+in a list (_fpref(Lists, Lists, lists)) 
+or file of file names. The results are written to the Output
+Window and plotted in the Contig Comparator 
+(_fxref(Contig Comparator, Contig Comparator, comparator)).
+Read pair information is also used to colour code the results displayed in the
+Template Display 
+(_fpref(Template-Display, Template Display, template)).
+
+Note that during assembly the template names and lengths are copied from
+the experiment files into the gap database. _fxref(Formats-Exp,
+Experiment Files, exp) The accuracy of the lengths will depend upon some
+size selection being performed during the cloning procedures.
+
+_picture(gap5_find_read_pairs,3.525in)
+
+Users choose to process "all contigs" or a subset selected from a file
+of file names ("file") or a list ("list"). If either of the subset
+options is selected the "browse" button will be activated and can be
+clicked on to call up a file or list browser dialogue.
+
+_split()
+ at node ReadPair-Display
+ at subsection Find Read Pairs Graphical Output
+ at cindex Find read pairs: display
+
+The contig comparator is used to plot all templates with readings that span
+contigs. That is, the lines drawn on the contig comparator are a visual
+representation of the relationship (orientation and overlap) between contigs.
+When a template spans more than two contigs, all the combinations of pairs of
+contigs are plotted. However such cases are uncommon.
+
+_lpicture(gap5_rp_comparator,5.25833in)
+
+The figure above shows a typical Contig Comparator plot which includes
+several types of result in addition to those from Read Pair analysis.
+
+The lines for the read-pairs 
+are, by default, shown in blue. The length of the line is the average
+length of the two readings within the pair. The slope of the line represents
+the relative orientation of the two readings. If they are both the same
+orientation (including both complemented) the line is drawn from top left to
+bottom right, otherwise the line is drawn from top right to bottom left.
+
+Clicking with the right mouse button on a read pair line brings up a
+menu containing, amongst other things, "Invoke join editor"
+(_fpref(Editor-Joining, The Join Editor, contig_editor)).  This will
+bring up the Join Editor with the two contigs shown end to end.
diff --git a/manual/gap5_remove_contig_holes.png b/manual/gap5_remove_contig_holes.png
new file mode 100644
index 0000000..6b088b3
Binary files /dev/null and b/manual/gap5_remove_contig_holes.png differ
diff --git a/manual/gap5_remove_pad_columns.png b/manual/gap5_remove_pad_columns.png
new file mode 100644
index 0000000..9d9991d
Binary files /dev/null and b/manual/gap5_remove_pad_columns.png differ
diff --git a/manual/gap5_repeats-t.texi b/manual/gap5_repeats-t.texi
new file mode 100644
index 0000000..a07e5b1
--- /dev/null
+++ b/manual/gap5_repeats-t.texi
@@ -0,0 +1,75 @@
+ at cindex Find repeats
+ at cindex Repeat search
+
+The purpose of this function (which is invoked from the Gap5 View menu) 
+is to find exact repeats in contig
+consensus sequences. An exact repeat is defined as a run of consecutive
+identical ACGT characters; no mismatches or gaps are permitted.
+
+If it has not already occurred, selection of
+this function will automatically
+transform the Contig Selector into the Contig Comparator.
+_fxref(Contig Comparator, Contig Comparator, comparator)
+Each match found is plotted as a diagonal line in the Contig Comparator.
+The length of the diagonal line is proportional to the length of the
+match.
+
+If the match is for two contigs in the same orientation the diagonal
+will be parallel to the main diagonal; if they are not the line will be
+perpendicular to the main diagonal. The matches displayed in the Contig
+Comparator can be used to invoke the Join Editor (_fpref(Editor-Joining,
+The Join Editor, contig_editor)) or Contig Editors (_fpref(Editor,
+Editing in Gap5, contig_editor)), and an Information button will display
+data about the match in the Output window. For example:
+
+ at example
+ at group
+Repeat match
+    From contig xb54a3.s1(#26) at 78
+    With contig xb62h3.s1(#3) at 1
+    Length 37
+ at end group
+ at end example
+
+This means that position 78 in the contig with xb54a3.s1 (reading number
+26) at its left end matches 37 bases at position 1 in the contig with
+xb62h3.s1 (number 3) at its left end.
+
+_picture(repeats,3.70833in)
+
+Users can elect to search a "single" contig, or compare "all contigs",
+or a subset of contigs defined in a list or a file. If "file" or "list"
+is selected the browse button is activated and gives access to file or
+list browsers.  If they choose to analyse a single contig the dialogue
+concerned with selecting the contig and the region to search becomes
+activated. The "Minimum Repeat" defines the smallest match that the
+algorithm will report.  The algorithm will search only for repeats in
+the forward direction "Find direct repeats", or only those in the
+reverse direction "Find inverted repeats", or both "Find both".
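As an illustration of what an exact repeat search with a minimum length reports, here is a naive sketch that scans every diagonal between two consensus sequences. It is not the algorithm Gap5 uses; the function names are hypothetical, and comparing a contig against itself would also report the trivial main diagonal, which this sketch does not filter out.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement an ACGT string (other characters pass through)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def exact_matches(a: str, b: str, min_repeat: int):
    """Return (pos_a, pos_b, length) for maximal runs of identical ACGT bases.

    Positions are 1-based. Only runs of at least min_repeat bases are
    reported; no mismatches or gaps are allowed within a run.
    """
    results = []
    for offset in range(-(len(b) - 1), len(a)):
        i = max(offset, 0)   # index into a
        j = i - offset       # index into b on the same diagonal
        run = 0
        while i < len(a) and j < len(b):
            if a[i] == b[j] and a[i] in "ACGT":
                run += 1
            else:
                if run >= min_repeat:
                    results.append((i - run + 1, j - run + 1, run))
                run = 0
            i += 1
            j += 1
        if run >= min_repeat:
            results.append((i - run + 1, j - run + 1, run))
    return results


a = "TTTACGTACGGG"
b = "CCACGTACGAA"
print(exact_matches(a, b, 7))                       # direct repeats
print(exact_matches(a, reverse_complement(b), 7))   # inverted repeats
```

For inverted repeats the positions in the second sequence refer to its complemented form, which mirrors the reporting convention used for contigs "in the - sense".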
+
+If "Mask active tags" is selected the "Select tags" button is activated.
+Clicking on this button will bring up a check box dialogue to enable the
+user to select the tag types they wish to activate. Masking the active
+tags means that all segments covered by tags that are "active" will not
+be used in the matching algorithm.
+A typical use of this mode is to avoid finding
+matches in segments covered by tags of type ALUS (i.e. segments thought to
+be Alu sequence) or those already covered by REPT tags. 
+_fxref(Anno-Types, Tag types, tags)
+
+After the search is complete clicking on "Yes" in the "Save tags to
+file" panel will activate the "File name" box and all repeats on the
+list will be written to a file. This file can be used with "Enter tags"
+(_fpref(Enter Tags, Enter Tags, complement)) to create REPT tags for all
+the repeats found.  Note that "Enter tags" will remove all the results 
+plotted in the contig comparator.
+
+Note that the current version of Find Repeats has a limit to the number
+of repeats it can store. The limit depends on the current maximum
+consensus length, so if you want to increase the limit, reset the
+maximum consensus length. This can be done using the "Set maxseq" item
+in the "Options" menu.
+
+
+
+
diff --git a/manual/gap5_rp_comparator.png b/manual/gap5_rp_comparator.png
new file mode 100644
index 0000000..6f91ad0
Binary files /dev/null and b/manual/gap5_rp_comparator.png differ
diff --git a/manual/gap5_shuffle-t.texi b/manual/gap5_shuffle-t.texi
new file mode 100644
index 0000000..286db37
--- /dev/null
+++ b/manual/gap5_shuffle-t.texi
@@ -0,0 +1,91 @@
+ at chapter Tidying up alignments
+ at menu
+* Shuffle Pads::                Shuffle Pads
+* Remove Pad Columns::          Remove Pad Columns
+* Remove Contig Holes::         Remove Contig Holes
+ at end menu
+
+The Shuffle Pads, Remove Pad Columns and Remove Contig Holes functions all share a
+common goal of tidying up sequence alignments, possibly also breaking
+the contig up.
+
+ at node Shuffle Pads
+ at section Shuffle Pads
+ at cindex Shuffle Pads
+
+This function is an implementation of the Anson and Myers ``ReAligner''
+algorithm. It analyses multiple sequence alignments to detect locations
+where the number of disagreements to the consensus could be reduced by
+realignment of sequences, possibly also correcting the consensus in the
+process. For example:
+
+ at example
+Sequence1:    GATTCAAAGAC
+Sequence2:      TTCAA*GACGG
+Sequence3:       TC*AAGAC
+Consensus:    GATTCAAAGACGGATC
+ at end example
+
+The consensus contains @code{AAA}, but the corrected alignment only has
+two As:
+
+ at example
+Sequence1:    GATTCAAAGAC
+Sequence2:      TTC*AAGACGG
+Sequence3:       TC*AAGAC
+Consensus:    GATTC*AAGACGGATC
+ at end example
+
+_picture(gap5_shuffle_pads,3.34167in)
+
+For speed we acknowledge that the new alignment will only deviate
+slightly from the old one and so a narrow ``band size'' is used. This
+parameter may be adjusted if required, but at the expense of speed.
+
+
+ at page
+_split()
+ at node Remove Pad Columns
+ at section Remove Pad Columns
+ at cindex Remove Pad Columns
+
+There are cases where we may have multiple alignments where every single
+sequence has a padding character such that the complete column is
+``*''. This can occur when disassembling data from a falsely made join.
+
+The Shuffle Pads algorithm will remove entire columns of pads when it
+finds them, but it is time consuming and it may also edit alignments
+elsewhere. The Remove Pad Columns function is a faster, more specific
+solution to this problem.
+
+_picture(gap5_remove_pad_columns,3.34167in)
+
+By default the function will only ever delete columns where 100% of the
+sequences have a pad/gap. However with appropriate due care it is
+possible to reduce this and allow removal of columns where a few
+sequences have a real base provided the overall percentage is still
+high. This is achieved by reducing the ``Percentage pad needed''
+parameter.
+
+Reducing from 100% is not recommended though, as it is removal of data
+purely for tidiness' sake, while the consensus algorithm will
+automatically find the correct solution.
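The effect of the ``Percentage pad needed'' threshold can be sketched on a toy alignment. This is an assumption-laden illustration: the function name is hypothetical, every sequence is assumed to cover every column, and `*` is taken as the pad character as in the examples in this chapter.

```python
def remove_pad_columns(rows, percent_pad_needed=100.0):
    """Drop alignment columns whose pad percentage reaches the threshold.

    rows is a list of equal-length aligned strings. With the default of
    100, only columns that are entirely '*' are removed.
    """
    keep = []
    for col in range(len(rows[0])):
        pads = sum(1 for r in rows if r[col] == "*")
        if 100.0 * pads / len(rows) < percent_pad_needed:
            keep.append(col)
    return ["".join(r[c] for c in keep) for r in rows]


all_pads = ["AC*GT", "AC*GT", "AC*GA"]
mostly_pads = ["AC*GT", "ACTGT", "AC*GT"]
print(remove_pad_columns(all_pads))            # the pad-only column is removed
print(remove_pad_columns(mostly_pads))         # unchanged at 100%
print(remove_pad_columns(mostly_pads, 60.0))   # the real base T is discarded
```

The last call shows why lowering the threshold loses data: the single real base in the column is deleted along with the pads.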
+
+ at page
+_split()
+ at node Remove Contig Holes
+ at section Remove Contig Holes
+ at cindex Remove Contig Holes
+
+Unlike Gap4, Gap5 permits contig regions with zero coverage. These can
+naturally occur when using sequence mapping to known references. However
+in a de novo assembly context they are not desirable.
+
+_picture(gap5_remove_contig_holes,3.34167in)
+
+Some algorithms have check boxes querying whether you wish holes to be
+removed by breaking contigs up, but this dialogue offers a choice of
+fixing the holes at a later stage.
+
+It identifies all regions of zero coverage and will break the contig
+into multiple fragments.
diff --git a/manual/gap5_shuffle_pads.png b/manual/gap5_shuffle_pads.png
new file mode 100644
index 0000000..8349047
Binary files /dev/null and b/manual/gap5_shuffle_pads.png differ
diff --git a/manual/gap5_template-t.texi b/manual/gap5_template-t.texi
new file mode 100644
index 0000000..831f521
--- /dev/null
+++ b/manual/gap5_template-t.texi
@@ -0,0 +1,271 @@
+ at menu
+* Template-Filter::                   Filtering
+* Template-Template::                 Template plot
+* Template-Depth::                    Depth / coverage plot
+ at end menu
+
+ at cindex Template Display
+
+The template display is a graphical overview of a single contig. It
+allows us to see how much data we have, how long the fragments are and
+how they relate to each other (whether they are forming valid pairs).
+
+_picture(gap5_template_by_size,6in)
+
+The window consists of one or more tracks, by default showing the
+reading template layout at the top and a sequence / read-pair coverage
+plot at the bottom. The Tracks menu allows us to turn these on and off.
+
+Below the main menu bar is a series of buttons that bring up new
+dialogues for controlling how the data is to be displayed and what is to
+be displayed.
+
+Then comes a graphical plot for each track. A cross-hair automatically tracks
+the cursor, indicating the X and Y coordinates (in appropriate units)
+in the status line at the bottom of the window. The track displays can
+be moved by either using the horizontal and vertical scrollbars at the
+bottom and right hand edges of the window, or by clicking and dragging
+the contents of the window. While dragging the display will not update
+to show newly visible regions of a contig until the left mouse button
+is released. 
+
+Finally the bottom contains a scrollbar and ruler for positioning and
+a series of controls. The X scale simply controls how many
+base-pairs of the contig are covered by the window. The X scale number
+is arbitrary, but is interpreted in an exponential manner so it is
+easy to rapidly zoom in or zoom out.  All other controls in the bottom
+panel do not affect the reading coverage track, so they are covered in
+the template track section below.
+
+
+ at node Template-Filter
+ at section Filtering data
+
+By default all templates are used for drawing the tracks, but there
+are times when we may wish to focus on specific problem data or to
+exclude it from our graphics.
+
+_picture(gap5_template_filter,3.39167in)
+
+The Filter button at the top of the Template Display brings up the
+dialogue shown above. Changes made in this dialogue either take effect
+immediately (when ``Auto update'' is enabled) or
+only when we hit Apply, or OK to dismiss the dialogue.
+
+The Pairs: section allows us to select either reads on all templates,
+reads that are the sole read for that template, or reads that are
+paired on a template.  Note that the definition of a pair here is
+strictly dependent on how many reads for a template are in the 
+gap5 database rather than the library preparation strategy. So a
+paired-end template for which only one read is in the gap5 database
+(perhaps due to failure to map) is classified as ``single''.
+
+The Consistency section can be used to select all, consistent only or
+inconsistent only data. This requires read-paired data (single reads
+cannot be inconsistent and so are considered as consistent). The
+interpretation of inconsistent currently is that the two reads of a
+pair do not point towards one another, but in future releases this is
+planned to check the correct orientation for that library type as for
+some constructions it is normal to have reads pointing in the same
+orientation.
+
+The Spanning section governs whether to display read pairs with one
+read in this contig and the other read in another contig. Handling
+templates with more than two reads is still on-going work, but when
+finished a spanning read-pair will be one with any read not in this
+contig.
+
+Underneath these are two sliders applied in addition to the above
+filters. They allow removal of any read or read-pair (depending on the
+type of data being plotted) with a mapping quality outside the
+selected range.
+
+ at node Template-Template
+ at section Template plot
+
+This is the main body of the template display window. The default plot
+will be showing read-pairs, mainly coloured by mapping quality with
+the insert size governing the Y coordinate. Larger inserts are at the
+bottom of the track while shorter ones are at the top.
+
+The colours used are as follows:
+
+ at table @strong
+ at item blue
+This is a template with only one reading present. It could be either a
+pair with one end not in this assembly, or a true single-ended
+sequencing experiment. The horizontal size of the line is now the
+length of the individual sequence rather than the computed length of
+the insert.
+
+ at item orange
+This is a template with one reading present in another contig. The
+size of the line is derived from the size of the data in this contig
+(typically a single reading).
+
+ at item red
+This template is considered as inconsistent in some manner, typically
+due to the relative position and orientation of the forward and
+reverse sequences being incorrect.
+
+ at item grey (variety of)
+Any consistent read-pair is coloured by the mapping quality, by
+default using the average of the individual sequence mapping
+qualities. Lighter shades represent higher mapping qualities.
+ at end table
+
+
+The row of scale bars at the bottom of the window control how data is
+to be plotted. They are:
+
+ at table @strong
+ at item X Scale
+Controls how many base-pairs in the contig to plot. Higher values
+indicate more base pairs, but with an exponentially growing scale.
+
+ at item Y Magnification
+Governs the amount of vertical space consumed by the template track.
+This has no impact on the depth track.
+
+ at item Y Offset
+Adds a small shift to the Y position of data prior to plotting. This
+is of little use unless Separate Strands has also been selected,
+whereupon this allows the two halves of the plot to be brought closer
+together. (Effectively meaning that a plot can go from -1000 to -100 and
++100 to +1000 instead of -1000 to +1000 with a blank area in the
+middle if our sequences are a minimum of 100 bases long.)
+
+ at item Stacking Y Size
+Only of use in Stacking Y-Position mode. This vertically groups
+together data of similar length, allowing a basic approach of
+separating short-read and long-read technologies. The Y layout is
+performed in steps of ``Stacking Y Size''. To pack reads tightly
+together regardless of length,  set this to the maximum value possible.
+
+ at item Y Spread
+This adds a small perturbation to the computed Y coordinates of lines
+in the template track. When the Y coordinate is derived based on the
+insert size of the read-pair it is not always clear whether a line
+represents a single item or many items stacked perfectly on top of one
+another. The Y spread control compensates for this.
+
+_picture(gap5_template_spread0,4in)
+Template track with Y spread of 0.
+
+_picture(gap5_template_spread50,4in)
+Template track with Y spread of 50.
+ at end table
+
+ at node Template-Y-Type
+ at subsection Controlling The Y Layout.
+
+The layout and type of data in the template track can be controlled
+using the Template button at the top of the main template display
+window.
+
+_picture(gap5_template_template,3.06667in)
+
+The Y Position section controls how the Y coordinates are computed
+when plotting data (with X being tied to the position in the assembly
+or reference). It can be one of three settings.
+
+ at table @strong
+ at item Template size
+
+_picture(gap5_template_by_size,4in)
+
+The default mode. The size of an object is defined to be the number of
+bases it spans. This is normally the size of a read-pair, or if the
+pair spans contigs or if only readings are shown it is the size of a
+single reading instead. Larger objects are at the bottom of the
+window. This Y method very clearly reveals indels in a mapped
+assembly. It sometimes also reveals misassemblies.
+
+Given that items of identical size will stack on top of one another,
+of particular use to this display mode is the Y Spread control in the
+main window.
+
+ at item Stacking
+
+_picture(gap5_template_by_stacking,4in)
+
+A more traditional view - each and every item is allocated its own
+non-overlapping Y coordinate (although low Y magnifications may imply
+these are drawn at the same Y pixel).
+
+It is still possible to partially group items by their insert size
+using the ``Stacking Y Size'' control in the main window.
+
+ at item Mapping Quality
+
+_picture(gap5_template_by_mapping,4in)
+
+Finally we can display data collated by the mapping score. This is
+typically only available for mapped assemblies. This plot sometimes
+helps to reveal regions where all the data present is of poor mapping
+quality, indicating a likely repeat.
+ at end table
+
+Adjacent to the Y Position frame is the Colour frame. This controls
+the colour of the lines drawn in the template display rather than
+their location.
+
+ at table @strong
+ at item Combined mapping quality
+ at itemx Minimum mapping quality
+ at itemx Maximum mapping quality
+For templates with multiple reads visible, we have a variety of
+mapping qualities. Often these individual sequence mapping qualities
+will differ, but we wish to draw a single line for the template with a
+single colour. These three methods control whether we take the
+average, minimum or maximum values from the individual sequences on
+this template.
+
+ at item Reads
+The line typically represents the entire span of the insert, but we
+may not have sequence data for all of the template. This colour mode
+will also draw the portions of the template that we have known
+sequence for, in green for forward strand sequences and magenta for
+reverse strand sequences. Any remaining portion of template between
+the reads is drawn using the combined mapping quality.
+ at end table
+
+At the bottom of this dialogue is a row of check buttons.
+
+``>>Acc'' enables accurate mode, but be warned this can be very
+slow.  When the template display is drawn it fetches all data within
+the visible portion plus a little bit either side. From this, reads from
+the same template are paired up. However when a template spans a
+substantially larger range than is shown we may only have fetched one
+read for this template. We do know that such a template forms a pair,
+but we do not know the exact location of the other end or even whether
+it is in this contig. The assumption is that it is not, and the
+template is drawn in orange. Enabling accurate mode will work out the
+precise location of the other end and if it is present elsewhere
+within this contig then the insert size will be correctly determined
+and the plot adjusted accordingly.
+
+The ``Reads'' checkbutton (not to be confused with the Reads colour
+selector) disables all drawing of read-pairing and template lines,
+drawing lines to represent the known DNA sequence instead.
+
+``Y-log scale'' controls whether we plot our Y values using log or
+linear scales.
+
+``Separate strands'' attempts to classify all templates as coming from
+the top or bottom strand of DNA (based on the orientation of the
+sequences on that template, although sometimes these are
+conflicting). It then splits the plot in two, forming an approximate
+mirror image. This may be of use in some transcriptome sequencing
+experiments.
+
+ at node Template-Depth
+ at section Depth / Coverage Plot
+
+The depth track shows coverage of both individual readings and
+read-pairs, where a read-pair counts as +1 coverage over the entire
+length it spans rather than just the portion directly sequenced.
+
+The filter options for (in)consistent read pairs also apply here,
+giving the option to only show depth of consistent pairs.
+
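The span-based counting described in the Depth / Coverage Plot section above can be illustrated with a small worked example. This is a minimal sketch, not gap5's implementation: the function name span_depth and the "start end" input format are assumptions made purely for illustration.

```shell
# Hypothetical helper, not part of gap5: tally span-based coverage.
# Input (stdin): one "start end" pair per line, 1-based and inclusive,
# giving the full extent of a read-pair (or of a single unpaired reading).
# Output: "position depth" for every position from 1 to the largest end.
span_depth() {
    awk '
        { for (p = $1 + 0; p <= $2 + 0; p++) depth[p]++
          if ($2 + 0 > max + 0) max = $2 + 0 }
        END { for (p = 1; p <= max; p++) print p, depth[p] + 0 }
    '
}

# Two overlapping read-pairs spanning positions 1-3 and 2-5.
printf '1 3\n2 5\n' | span_depth
```

Each read-pair contributes +1 at every position between its two ends, including the unsequenced gap, so positions 2 and 3 in this example report a depth of 2.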
diff --git a/manual/gap5_template_by_mapping.png b/manual/gap5_template_by_mapping.png
new file mode 100644
index 0000000..264e5df
Binary files /dev/null and b/manual/gap5_template_by_mapping.png differ
diff --git a/manual/gap5_template_by_size.png b/manual/gap5_template_by_size.png
new file mode 100644
index 0000000..1dd9ef7
Binary files /dev/null and b/manual/gap5_template_by_size.png differ
diff --git a/manual/gap5_template_by_stacking.png b/manual/gap5_template_by_stacking.png
new file mode 100644
index 0000000..43083e8
Binary files /dev/null and b/manual/gap5_template_by_stacking.png differ
diff --git a/manual/gap5_template_filter.png b/manual/gap5_template_filter.png
new file mode 100644
index 0000000..f71277b
Binary files /dev/null and b/manual/gap5_template_filter.png differ
diff --git a/manual/gap5_template_spread0.png b/manual/gap5_template_spread0.png
new file mode 100644
index 0000000..499c9eb
Binary files /dev/null and b/manual/gap5_template_spread0.png differ
diff --git a/manual/gap5_template_spread50.png b/manual/gap5_template_spread50.png
new file mode 100644
index 0000000..3c79050
Binary files /dev/null and b/manual/gap5_template_spread50.png differ
diff --git a/manual/gap5_template_template.png b/manual/gap5_template_template.png
new file mode 100644
index 0000000..1de6864
Binary files /dev/null and b/manual/gap5_template_template.png differ
diff --git a/manual/gap_database-t.texi b/manual/gap_database-t.texi
new file mode 100644
index 0000000..a19fc15
--- /dev/null
+++ b/manual/gap_database-t.texi
@@ -0,0 +1,237 @@
+ at menu
+* GapDB-Directories::   Directories
+* GapDB-New::           Creating a new database
+* GapDB-Existing::      Opening an existing database
+* GapDB-CopyDatabase::  Making backups of databases
+* GapDB-Names::         Reading and Contig Names and Numbers
+ at end menu
+
+ at cindex Database: gap4 filenames
+ at cindex Entering readings
+ at cindex readings: entering
+
+Gap4 stores the data for each sequencing project (e.g. the data 
+for a single cosmid or BAC) in a gap4 assembly database, so at 
+the start of a sequencing project
+the user should employ gap4 to create the database for the project 
+(_fpref(GapDB-New, Opening a New Database, gap_database)).
+New databases are created with sufficient index space for around
+8000 readings, but this can be extended if required.
+
+Gel reading data
+in experiment file format 
+(_fpref(Formats-Exp, Experiment File Format, exp))
+is entered into the database using the methods
+available from the assembly menu 
+(_fpref(Assembly, Entering Readings into the Database (Assembly),gap4)).
+
+To assemble more data for the project or
+to edit or analyse readings already entered the user should open the
+same project database 
+(_fpref(GapDB-Existing, Opening an Existing Database, gap_database)).
+
+Although the database files are designed to be free of corruption it is
+advisable to make regular backups 
+(_fpref(GapDB-CopyDatabase, Making Backups of Databases, gap-database)).
+
+
+Database names can have from one to 240 letters and must not include a
+full stop or spaces. The database itself consists of two files; a file
+of records and an index file. If the database is called @file{FRED} then
+version 0 of the database comprises the pair of files named
+ at file{FRED.0} and @file{FRED.0.aux}, the latter of these being the index
+file. The "version" is the character after the full stop in these
+filenames. Versions are not limited to numbers alone, but must be single characters.
+
+ at cindex Database: busy file
+ at cindex Busy file
+ at cindex readonly
+ at cindex Database: readonly
+ at cindex Database: locked
+ at cindex locked database
+ at cindex readonly database
+
+When a database is opened for writing a @file{BUSY} file is created. For the
+ at file{FRED} database this will be named @file{FRED.0.BUSY}. When the
+database is closed the file is deleted. The file is
+used by gap4 to signify that the database is opened for writing and is
+part of its mechanism to prevent more than one person editing a
+database at any time. Before opening a database for writing, 
+gap4 checks to see if the BUSY file for that database exists. If it does
+the database is opened only for reading, if not it creates the file, so
+that any additional attempts to open the database for writing will be
+blocked. A side effect of this mechanism is that
+in the event of a program or system crash the BUSY file
+will be left on the disk, even though the database is not being used. In
+this case users must remove the BUSY file 
+(after checking that it really isn't in
+use!) using, on UNIX, the @code{rm} command before opening the database. E.g.
+"@code{rm FRED.0.BUSY}". On Windows use the Recycle Bin.
+
+The gap4 database is robustly designed.  Killing the program whilst
+updating the database should never yield an inconsistent state. A
+"roll-back" mechanism is utilised to undo any partially written updates
+and revert to the last consistent database. Hence quitting abnormally
+may result in the loss of some data. Always quit using the Exit command
+within the File menu.
+
+However it is advised that copies of the database are made
+regularly to safeguard against any software bugs or disk
+corruptions.
+
+_split()
+ at node GapDB-Directories
+ at section Directories
+ at cindex Directories
+
+By default, Gap4 expects files to be in the current directory.
+In dialogues which request filenames, full pathnames can be specified,
+however it is generally tidier to keep files
+specific to a particular project in the same directory as the project database.
+Creating new databases and opening existing databases will change directory to the
+directory containing the opened project.
+
+It is possible to change the current directory by selecting "Change directory"
+from the File menu. Be warned that changing to a directory other than that
+containing the database and the trace files may mean that gap4 can no
+longer find the trace files. 
+The solutions to this problem are discussed elsewhere 
+(_fpref(Conf-Trace File Location, Trace File Location,t)).
+
+_split()
+ at node GapDB-New
+ at section Opening a New Database
+ at cindex Creating a new database
+ at cindex Database: new
+ at cindex Database: creating new
+ at cindex New database creation
+
+To create a new gap4 database select the "New" command from the File menu. This
+brings up a dialogue prompting for the new filename. Type the name of
+the database to create without specifying the version number. To
+create version 0 of a database named @file{FRED}, type @code{FRED}; this
+creates the two database files, @file{FRED.0} and @file{FRED.0.aux}.
+
+If the database already exists you will be asked whether you wish to
+overwrite it. Any database that was already open will be closed before
+the new database is created. The new database is then opened, ready for
+input.
+
+Note that Gap4 database names are case sensitive. 
+
+_split()
+ at node GapDB-Existing
+ at section Opening an Existing Database
+ at cindex Opening databases
+ at cindex Database: opening
+
+To open an existing database select the "Open..." command from the File
+menu.  This brings up a file browser where the database name can be
+selected. The databases will be listed in a @code{NAME.V} notation
+(where @code{V} is the version number). Double clicking on the database
+name will then open this database.
+
+If the program already had a database open it will close it 
+before the new one is
+opened. If the new database is already in use by gap4 a dialogue will
+appear warning you that the database has been opened in read only mode.
+This mode prevents any edits from being made to the database by greying out
+certain options and disabling the editing capabilities in the contig editor.
+
+A database may also be opened by specifying the database name and
+version on the unix command line. To open version 0 of the database
+ at file{FRED} use "@code{gap4 FRED.0}".
+
+_split()
+ at node GapDB-CopyDatabase
+ at section Making Backups of Databases
+ at cindex Backing up databases
+ at cindex Database: backups
+ at cindex Save As
+ at cindex Copy Database
+
+The importance of making regular backups of your data cannot be overstated.
+Using the "Copy database" command from the File menu brings up a dialogue
+asking for a new database version. Type in a single character for the
+new version and press "ok" or return. If the new database already exists
+you will be asked whether you wish to overwrite it. Any subsequent changes you
+make will still be to the database that you originally opened, not to the
+database you have just saved to.
+
+The database file may sometimes become fragmented. An option available when
+saving is to use garbage collection. This creates the new database by only
+copying over the used portions of data (and hence reduces fragmentation).
+However it is quite a lot slower than the standard "Copy database" mechanism,
+so if this causes problems add "@code{set_def COPY_DATABASE.COLLECT 0}" to
+your @file{.gaprc} file to change the default to no garbage collection. It
+should be noted that garbage collection also performs a rigorous database
+consistency check.
+
+Do not always use the same version character for your backups. Instead
+keep several different backups. Otherwise you may find that both your
+current database and the backup have problems. It is also wise to run
+"check database" to verify data integrity. 
+_fxref(Check Database, Check Database, check_db)
+
+It is also possible to backup databases from outside gap4 by using
+standard unix commands to copy @strong{both} the record and index files.
+Care should be taken when doing this to ensure that the database is not
+being modified whilst copying. See your unix or 
+Windows manuals for further details or
+the @code{copy_db} manual page (_fpref(Man-copy_db, Copy_db, manpages)) for
+the external garbage collecting database copy program.
+
+_split()
+ at node GapDB-Names
+ at section Reading and Contig Names and Numbers
+ at cindex Reading names
+ at cindex Reading name restrictions
+ at cindex Reading name length restrictions
+ at cindex File name restrictions
+ at cindex Experiment file name restrictions
+ at cindex Experiment file name length restrictions
+ at cindex SCF file name restrictions
+ at cindex sample name restrictions
+ at cindex Reading numbers
+ at cindex Readings: maximum in a database
+ at cindex Database: maximum size
+ at cindex Contig names
+
+For various reasons there are restrictions on the characters used in 
+file names and the length of the file names.
+
+Characters permitted in file names:
+
+ at c @code{A}.. at code{Za}.. at code{z0}.. at code{9._-}
+ at code{ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._-}
+
+A reading name or experiment file name must not be longer than 40 characters.
+
+These restrictions also apply to SCF files which means, in turn, also to
+the names given to samples obtained from sequencing instruments. For example
+do not give sample names such as 27/OCT/96/r.1 when using an ABI machine:
+the / symbols will be interpreted as directory name separators on UNIX!
+
+As each reading is entered into a project database it is given a unique
+number. The first is numbered 1, the second 2 and so on. Their reading
+names are read from the ID line in the experiment files and copied into
+the database. As new readings are created and existing ones removed the
+reading numbers change in an unpredictable fashion. Hence when taking
+notes on a project always record the reading name instead of the reading
+number.
+
+The maximum number of readings a database can hold is 99,999,999.
+
+Many options ask for a reading or contig identifier. A contig identifier is
+simply any reading name or number within that contig. A reading
+identifier is either the reading name or the hash ("@code{#}") character
+followed by the number. For example, if the reading name is
+ at code{fred.gel} with number 99 users could type "@code{fred.gel}" or
+"@code{#99}" when asked to identify the contig.
+
+Generally when prompting for a contig or reading name a default is
+supplied. This is the last name you used, or if you've only just opened
+the database, the name of the longest contig in the database.
+For more information about selecting contigs within the program see
+_fxref(Contig-Selector-Contigs, Selecting Contigs, t).
+
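The file naming and BUSY-file conventions described in gap_database-t.texi above can be sketched in shell. This is only an illustration of the documented rules (names of 1 to 240 characters without a full stop or spaces, a single-character version, and the .aux and .BUSY suffixes); the function names gap4_db_files and gap4_db_is_busy are hypothetical and are not part of gap4.

```shell
# Hypothetical helpers, not part of gap4.

# Print the file names used by database NAME at single-character VERSION:
# the record file, the index (.aux) file and the BUSY lock file.
gap4_db_files() {
    name=$1 version=$2
    case $name in
        ''|*.*|*' '*) echo "invalid database name: '$name'" >&2; return 1 ;;
    esac
    [ "${#name}" -le 240 ] || { echo "database name too long" >&2; return 1; }
    case $version in
        ?) ;;
        *) echo "version must be a single character" >&2; return 1 ;;
    esac
    printf '%s\n' "$name.$version" "$name.$version.aux" "$name.$version.BUSY"
}

# Succeed (exit status 0) if the BUSY file for NAME.VERSION exists in DIR,
# in which case gap4 would open the database read-only.
gap4_db_is_busy() {
    [ -e "$1/$2.$3.BUSY" ]
}

gap4_db_files FRED 0
```

A stale lock left behind by a crash could then be detected with, for example, "gap4_db_is_busy . FRED 0" before deciding whether to remove FRED.0.BUSY by hand.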
diff --git a/manual/getABIfield.1.texi b/manual/getABIfield.1.texi
new file mode 100644
index 0000000..511ab41
--- /dev/null
+++ b/manual/getABIfield.1.texi
@@ -0,0 +1,105 @@
+ at cindex getABIfield: man page
+ at unnumberedsec NAME
+
+getABIfield --- extract arbitrary components from an ABI file
+
+ at unnumberedsec SYNOPSIS
+
+ at code{getABIfield} [@code{OPTIONS}] @i{filename} [@code{Field-ID} [
+ at code{Count}]] ...
+
+ at unnumberedsec DESCRIPTION
+
+The @code{getABIfield} command extracts specified blocks from an ABI
+file and displays them in a variety of formats. The ABI file may be
+considered as a directory structure with files (data blocks) contained
+within it. Supplying just the ABI filename as an argument will give a
+listing of the blocks.
+
+To extract specific data one or more ``name count'' pairs need to be
+specified.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-a}
+    Dump all blocks.
+
+ at item @code{-D} @i{separator}
+    Sets the output field separator for elements within a date and
+    time format. Dates default to ``yyyy/mm/dd'' format and times default
+    to ``hh:mm:ss.xx''.
+
+ at item @code{-F} @i{separator}
+    Sets the output field separator to be a specified character.
+    This defaults to space.
+
+ at item @code{-f} @i{format}
+    Reformat the data to a specific style. By default the data is
+    listed in the format specified within the ABI file.
+    @i{format} should be one of @code{1}(1-byte integer),
+    @code{4}(2-byte integer), @code{5}(4-byte integer),
+    @code{7}(4-byte real), @code{8}(8-byte real), @code{10}(date),
+    @code{11}(time), @code{18}(Pascal-string), @code{19}(C-string).
+
+ at item @code{-h}
+    Displays data in hex format. By default the output format will be
+    chosen based on the data type (eg string, integer, floating
+    point).
+
+ at item @code{-I} @i{fofn}
+    Instead of reading the single file specified on the argument list
+    this reads a list of filenames from @i{fofn}. If @i{fofn} is ``-''
+    then the file of filenames is read from 'stdin'.
+
+ at item @code{-L} @i{separator}
+    Sets the line separator between multiple blocks listed within a
+    single file. Defaults to newline.
+
+ at item @code{-l}
+    Sets the output field separator to be a newline.
+
+    Query mode. Here no output is displayed, but it simply returns true
+    or false depending on whether any of requested comments were found.
+
+ at item @code{-r}
+    Displays data in raw byte format.
+    
+ at item @code{-t}
+    Enable tagged output format. Each name/count pair is listed on a
+    single line in the format ``filename name count data...''.
+ at end table
+
+ at unnumberedsec EXAMPLES
+
+To extract the run dates in a tagged format for all the ab1 files in
+the current working directory:
+
+ at example
+ls *.ab1 | getABIfield -t -I - RUND
+ at end example
+
+To see the order of the processed data channels (e.g. ``GATC'') on a
+single file:
+
+ at example
+getABIfield 3150.ab1 FWO_
+ at end example
+
+To see the processed trace data for the first channel (e.g. ``G'')
+with one sample point per line:
+
+ at example
+getABIfield -l 3150.ab1 DATA 9
+ at end example
+
+To obtain the version numbers of the various trace processing steps:
+
+ at example
+getABIfield -t 3150.ab1 SVER 1 SVER 2 SVER 3
+ at end example
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Man-get_comment, get_comment(1), get_comment.1)
+_fxref(Formats-Scf, scf(4), formats)
diff --git a/manual/get_comment.1.texi b/manual/get_comment.1.texi
new file mode 100644
index 0000000..816ed88
--- /dev/null
+++ b/manual/get_comment.1.texi
@@ -0,0 +1,35 @@
+ at cindex get_comment: man page
+ at unnumberedsec NAME
+
+get_comment --- extract comments from trace files
+
+ at unnumberedsec SYNOPSIS
+
+ at code{get_comment} [ @code{-c} ] [ @code{Field-ID} ... ]
+
+ at unnumberedsec DESCRIPTION
+
+The @code{get_comment} command extracts text fields from a variety of trace
+formats, read in from stdin. Each comment is of the form
+ at i{Field-ID}=@i{comment}, regardless of the file format. @i{Field-ID} is
+typically a 4 character identifier.
+
+With no @i{Field-ID} arguments specified all comments are listed. Otherwise
+only those specified on the command line are listed.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-h}
+    Display the usage help.
+
+ at item @code{-c}
+    Suppresses the output of the @i{Field-ID}. Only the right hand side of the
+    comment is displayed. The default action is to display the full comment in
+    the form listed above.
+
+ at end table
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Man-get_scf_field, get_scf_field(1), get_scf_field.1)
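Because get_comment emits one Field-ID=comment line per comment, its output is easy to post-process. The following is a minimal sketch under that assumption; comment_value is a hypothetical helper name, and DATE in the usage note is only an illustrative Field-ID.

```shell
# Hypothetical helper: from "Field-ID=comment" lines on stdin, print only
# the comment text belonging to the requested Field-ID.
comment_value() {
    wanted=$1
    while IFS= read -r line; do
        case $line in
            "$wanted="*) printf '%s\n' "${line#*=}" ;;  # strip up to first "="
        esac
    done
}
```

It might be used as "get_comment < trace.scf | comment_value DATE". Only the text before the first "=" is treated as the identifier, so comment text that itself contains "=" is passed through intact.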
diff --git a/manual/get_scf_field.1.texi b/manual/get_scf_field.1.texi
new file mode 100644
index 0000000..a78d755
--- /dev/null
+++ b/manual/get_scf_field.1.texi
@@ -0,0 +1,39 @@
+ at cindex get_scf_field: man page
+ at unnumberedsec NAME
+
+get_scf_field --- extract comments from an SCF file
+
+ at unnumberedsec SYNOPSIS
+
+ at code{get_scf_field} [ @code{-cqs} ] @i{filename} [ @code{Field-ID} ... ]
+
+ at unnumberedsec DESCRIPTION
+
+The @code{get_scf_field} command extracts comments from an SCF file. Each
+comment is of the form @i{Field-ID}=@i{comment}, where @i{Field-ID} is
+a 4 character identifier.
+
+With no @i{Field-ID} arguments specified all comments are listed. Otherwise
+only those specified on the command line are listed.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-c}
+    Suppresses the output of the @i{Field-ID}. Only the right hand side of the
+    comment is displayed. The default action is to display the full comment in
+    the form listed above.
+
+ at item @code{-q}
+    Query mode. Here no output is displayed, but it simply returns true
+    or false depending on whether any of requested comments were found.
+
+ at item @code{-s}
+    Silent mode. No error messages are produced, except for usage messages. It
+    returns true or false for success or failure.
+ at end table
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Man-get_comment, get_comment(1), get_comment.1)
+_fxref(Formats-Scf, scf(4), formats)
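Query mode lends itself to shell conditionals. The sketch below assumes that the "true" and "false" results described for -q correspond to the usual zero and non-zero exit statuses; the wrapper names scf_has_field and scf_missing_field are hypothetical, and NAME in the usage note is only an illustrative Field-ID.

```shell
# Hypothetical wrapper: succeed if FILE holds a comment with Field-ID.
# Uses only the documented -q (query) option and assumes "true" means
# exit status 0.
scf_has_field() {
    get_scf_field -q "$1" "$2" </dev/null
}

# Read trace filenames from stdin (one per line) and print those that
# lack a comment with the given Field-ID.
scf_missing_field() {
    field=$1
    while IFS= read -r trace; do
        scf_has_field "$trace" "$field" || printf '%s\n' "$trace"
    done
}
```

For example, "scf_missing_field NAME < traces.fofn" would list the traces in a file of filenames that carry no NAME comment.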
diff --git a/manual/hash_exp.1.texi b/manual/hash_exp.1.texi
new file mode 100644
index 0000000..d5bf7fe
--- /dev/null
+++ b/manual/hash_exp.1.texi
@@ -0,0 +1,33 @@
+ at cindex hash_exp: man page
+ at unnumberedsec NAME
+
+hash_exp --- produces an index for a file of concatenated experiment files.
+
+ at unnumberedsec SYNOPSIS
+
+ at code{hash_exp} @i{exp_archive}
+
+ at unnumberedsec DESCRIPTION
+
+ at code{hash_exp} adds a hash-table index on to the end of a
+concatenated file of experiment files. Its purpose is simply to
+provide random access to experiment files while also reducing the
+number of separate disk files.
+
+The @code{hash_list} program will list the contents of the hashed
+archive. The entry names stored in the archive are taken from the
+ at code{ID} lines in the experiment files rather than their original
+filenames.
+
+Within Gap4 you can assemble a hashed experiment file archive and it will
+automatically assemble all files within it. For finer grain control
+use @code{hash_list} to produce a file of filenames and then edit this
+accordingly before supplying it as a ``fofn'' to gap4. In this case
+you will also need to configure Gap4 to set the @code{EXP_PATH}
+environment variable to contain @code{HASH=}@i{exp_archive_filename}.
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp, ExperimentFile(4), formats)
+_fxref(Man-hash_list, hash_list(1), hash_list.1)
+ at code{Read}(4)
diff --git a/manual/hash_extract.1.texi b/manual/hash_extract.1.texi
new file mode 100644
index 0000000..69a1cec
--- /dev/null
+++ b/manual/hash_extract.1.texi
@@ -0,0 +1,30 @@
+ at cindex hash_extract: man page
+ at unnumberedsec NAME
+
+hash_extract --- Extracts entries from a hashed archive
+
+ at unnumberedsec SYNOPSIS
+
+ at code{hash_extract} [@code{-I} @i{fofn}] @i{archive} [@i{filename} ...]
+
+ at unnumberedsec DESCRIPTION
+
+ at code{hash_extract} outputs to stdout the specified filenames from a
+hashed archive file (regardless of whether it was a tar file, SFF
+file, exp file or some other original format). If multiple filenames
+are specified they are concatenated together.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-I} @i{fofn}
+    Specifies a file of filenames to extract instead of reading from
+    the argument list.
+ at end table
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp, ExperimentFile(4), formats)
+_fxref(Man-hash_list, hash_list(1), hash_list.1)
+_fxref(Man-hash_tar, hash_tar(1), hash_tar.1)
+ at code{Read}(4)
diff --git a/manual/hash_list.1.texi b/manual/hash_list.1.texi
new file mode 100644
index 0000000..88318b6
--- /dev/null
+++ b/manual/hash_list.1.texi
@@ -0,0 +1,29 @@
+ at cindex hash_list: man page
+ at unnumberedsec NAME
+
+hash_list --- lists the contents of a hashed archive.
+
+ at unnumberedsec SYNOPSIS
+
+ at code{hash_list} [@code{-l}] @i{exp_archive}
+
+ at unnumberedsec DESCRIPTION
+
+ at code{hash_list} lists the contents of a hashed file. It may be used
+to produce a file of filenames to supply to other tools, such as gap4
+or convert_trace.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-l}
+    ``Long'' format: also reports the position and size of each file
+    in the archive.
+ at end table
+
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp, ExperimentFile(4), formats)
+_fxref(Man-hash_extract, hash_extract(1), hash_extract.1)
+ at code{Read}(4)
diff --git a/manual/hash_tar.1.texi b/manual/hash_tar.1.texi
new file mode 100644
index 0000000..9c10948
--- /dev/null
+++ b/manual/hash_tar.1.texi
@@ -0,0 +1,121 @@
+ at cindex hash_tar: man page
+ at unnumberedsec NAME
+
+hash_tar --- Adds a hash table index to a tar file
+
+ at unnumberedsec SYNOPSIS
+
+ at code{hash_tar} [OPTIONS] @i{tarfile} > @i{tarfile}.hash
+ at br
+ at code{hash_tar} -A [OPTIONS] @i{tarfile} >> @i{tarfile}
+
+ at unnumberedsec DESCRIPTION
+
+ at code{hash_tar} adds an index to a tar file so that random access may
+be performed on it. It is a successor to the @code{index_tar}
+program.
+
+The index is a hash table which may be appended, prepended or stored
+in a separate file. Then the @code{hash_list} and @code{hash_extract}
+programs may be used to query the contents and to extract contents
+from the indexed tar archive. Note that it's not possible to add to
+such tar archives without also having to rebuild the index.
+
+Various @i{io_lib} based tools also support transparent reading out of
+tar files when indexed using this tool, so this provides a quick and
+easy way to remove the clutter of thousands of small trace files on
+disk.
+
+In separate file mode the hash index is stored in its own file. It's
+the most flexible method as it means that the tar file can be modified
+and appended to with ease provided that the hash index is
+recomputed. In order for this to work the hash index file also needs
+to store the filename of its associated tar file (see the -a option).
+
+In append mode the hash index is assumed to be appended on the end of
+the tar file itself. As tar files normally end in a blank block this
+does not damage the tar and @code{tar tvf} will still work
+correctly. However appending to the tar file will cause problems.
+
+In prepend mode the hash index comes first and the tar follows. This
+breaks normal tar commands, but is the fastest way to retrieve
+data (it avoids a read and a seek call compared to append mode).
+
+For space saving reasons it's possible to add a header and a footer to
+each entry too. In this case a named entry from the tar file is
+prepended or appended at extraction time.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-a} @i{archive_filename}
+    Use this if reading from stdin and you wish to create a hash index
+    that is to be stored as a separate file.
+
+ at item @code{-A}
+    Append mode. No archive name will be stored in the index and so
+    the extraction tools assume the index is appended to the same file
+    as the archive itself.
+
+ at item @code{-b}
+    Store the ``base name'' of the tar file names. That is if the tar
+    holds file @i{a/b/c} then the item held in the index will be @i{c}.
+
+ at item @code{-d}
+    Index directory names too. (Most likely a useless feature!)
+
+ at item @code{-f} @i{name}
+    Set tar entry 'name' to be a file footer
+
+ at item @code{-h} @i{name}
+    Set tar entry 'name' to be a file header
+
+ at item @code{-O}
+    Prepend mode. It is assumed that all offsets within the archive
+    file start from the end of the index (ie the index is the first
+    bit in the file).
+
+ at item @code{-v}
+    Verbose mode.
+
+ at end table
+
+ at unnumberedsec EXAMPLES
+
+The most common usage is just to append an index to an existing tar
+file. Then extract a file from it.
+
+ at example
+hash_tar -A file.tar >> file.tar
+hash_extract file.tar xyzzy/plugh > plugh
+ at end example
+
+For absolute maximum speed maybe you wish to prepend the hash
+index. This speeds up the ``magic number'' detection and avoids
+unnecessary seeks.
+
+ at example
+hash_tar -O file.tar > file.tar.hash
+cat file.tar.hash file.tar > hashedfile.tar
+ at end example
+
+Finally, if we have a tar file of Experiment Files maybe we wish to
+add a footer indicating a date and comment to each experiment file so
+that upon extraction we get a concatenation of the original experiment
+file and the footer.
+
+ at example
+(echo "CC   Comment";date "+DT   %Y-%m-%d") > exp_foot
+tar rf file.tar exp_foot
+hash_tar -f exp_foot -A file.tar >> file.tar
+# Now test:
+hash_extract file.tar xyzzy.exp > xyzzy.exp
+tail -2 xyzzy.exp
+ at end example
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp, ExperimentFile(4), formats)
+_fxref(Man-hash_list, hash_list(1), hash_list.1)
+_fxref(Man-hash_extract, hash_extract(1), hash_extract.1)
+ at code{Read}(4)
diff --git a/manual/header.m4 b/manual/header.m4
new file mode 100644
index 0000000..f11269b
--- /dev/null
+++ b/manual/header.m4
@@ -0,0 +1,260 @@
+ at c ---------------------------------------------------------------------------
+ at c Experiment with smaller amounts of whitespace between chapters
+ at c and sections.
+ at c ---------------------------------------------------------------------------
+ at tex
+ at set tex
+\global\chapheadingskip = 15pt plus 4pt minus 2pt 
+\global\secheadingskip = 12pt plus 3pt minus 2pt
+\global\subsecheadingskip = 9pt plus 2pt minus 2pt
+ at end tex
+
+ at c ---------------------------------------------------------------------------
+ at c @split{} command
+ at c
+ at c only makes sense for html.
+ at c ---------------------------------------------------------------------------
+ at tex
+\global\def\split{}
+ at end tex
+
+ at c ---------------------------------------------------------------------------
+ at c Experiment with smaller amounts of whitespace between paragraphs in
+ at c the 8.5 by 11 inch `format'.
+ at tex
+\global\parskip 6pt plus 1pt
+ at end tex
+ at c ---------------------------------------------------------------------------
+
+ at c ---------------------------------------------------------------------------
+ at c Magic with comments. m4 can set comment characters to whatever it wants.
+ at c They do not even have to be on one line (but by default the start and end
+ at c characters are "#" and newline).
+ at c
+ at c We `define' new start and end comments: @nm4 and @m4. (Remember as no m4 and
+ at c m4).
+ at c
+ at c m4 will not remove text in comments, it just ignores it. So the comment
+ at c characters themselves need to be harmless to tex. We solve this by creating
+ at c two new tex commands to do nothing.
+ at c ---------------------------------------------------------------------------
+ at tex
+\global\def\m4{}
+\global\def\nm4{}
+ at end tex
+changecom(@nm4,@m4)
+
+ at c ---------------------------------------------------------------------------
+ at c Rename the m4 commands to _commands. This will greatly reduce the chance of
+ at c them occurring in our text by chance.
+ at c ---------------------------------------------------------------------------
+define(`_define',defn(`define'))
+define(`_changecom',defn(`changecom'))
+define(`_changequote',defn(`changequote'))
+define(`_errprint',defn(`errprint'))
+define(`_maketemp',defn(`maketemp'))
+define(`_sinclude',defn(`sinclude'))
+define(`_translit',defn(`translit'))
+define(`_traceoff',defn(`traceoff'))
+define(`_undefine',defn(`undefine'))
+define(`_undivert',defn(`undivert'))
+define(`_decr',defn(`decr'))
+define(`_defn',defn(`defn'))
+define(`_divert',defn(`divert'))
+define(`_divnum',defn(`divnum'))
+define(`_dlen',defn(`dlen'))
+define(`_dumpdef',defn(`dumpdef'))
+define(`_eval',defn(`eval'))
+define(`_m4exit',defn(`m4exit'))
+define(`_ifelse',defn(`ifelse'))
+define(`_ifdef',defn(`ifdef'))
+define(`_include',defn(`include'))
+define(`_incr',defn(`incr'))
+define(`_index',defn(`index'))
+define(`_popdef',defn(`popdef'))
+define(`_pushdef',defn(`pushdef'))
+define(`_shift',defn(`shift'))
+define(`_substr',defn(`substr'))
+define(`_syscmd',defn(`syscmd'))
+define(`_sysval',defn(`sysval'))
+define(`_traceon',defn(`traceon'))
+define(`_m4wrap',defn(`m4wrap'))
+define(`_format',defn(`format'))
+
+_undefine(`define')
+_undefine(`changecom')
+_undefine(`changequote')
+_undefine(`errprint')
+_undefine(`maketemp')
+_undefine(`sinclude')
+_undefine(`translit')
+_undefine(`traceoff')
+_undefine(`undefine')
+_undefine(`undivert')
+_undefine(`unix')
+_undefine(`windows')
+_undefine(`decr')
+_undefine(`defn')
+_undefine(`divert')
+_undefine(`divnum')
+_undefine(`dlen')
+_undefine(`dumpdef')
+_undefine(`eval')
+_undefine(`m4exit')
+_undefine(`ifelse')
+_undefine(`ifdef')
+_undefine(`include')
+_undefine(`incr')
+_undefine(`index')
+_undefine(`popdef')
+_undefine(`pushdef')
+_undefine(`shift')
+_undefine(`substr')
+_undefine(`syscmd')
+_undefine(`sysval')
+_undefine(`traceon')
+_undefine(`m4wrap')
+_undefine(`format')
+
+ at c ---------------------------------------------------------------------------
+ at c Change quotes to [[ and ]]. Otherwise quotes are likely to cause us
+ at c problems. [[ and ]] are not likely to occur by chance in our docs.
+ at c
+ at c If we need to use an m4 keyword in our text, then we may do so with
+ at c (eg) [[_m4command]].
+ at c
+ at c If we wish to use [[ and ]] in our text, enclose it with comments:
+ at c @nm4{}[[@m4{}
+ at c ---------------------------------------------------------------------------
+_changequote([[,]])
+
+ at c ---------------------------------------------------------------------------
+ at c picture macro
+ at c
+ at c Adds a picture to the document. For texi2dvi it uses postscript. For PDF
+ at c it uses a PDF picture. For html it loads a png file.
+ at c
+ at c argument 1: a filename prefix. .ps, .pdf and .png are added to the prefix
+ at c             as required.
+ at c ---------------------------------------------------------------------------
+_define([[_picture]],[[_ifdef([[_tex]],[[@image{[[$*]]}]])
+_ifdef([[_html]],[[
+ at ifhtml
+<p>
+<img src="[[$1]].png" alt="[picture]">
+ at end ifhtml]])]])
+
+ at c ---------------------------------------------------------------------------
+ at c lpicture macro
+ at c
+ at c Adds a large picture to the document. In tex this is the same as the
+ at c picture macro. For html it displays a small png file with a link to the
+ at c full size one.
+ at c
+ at c argument 1: a filename prefix. .pdf, .png, .small.png and .png.html are
+ at c             added to the prefix as required.
+ at c ---------------------------------------------------------------------------
+_define([[_lpicture]],[[_ifdef([[_tex]],[[@image{[[$*]]}]])
+_ifdef([[_html]],[[
+ at ifhtml
+<p>
+<a href="[[$1]].png.html"><img src="[[$1]].small.png" alt="[picture]"></a>
+<br><font size="-1">(Click for full size image)</font><br>
+ at end ifhtml]])]])
+
+ at c ---------------------------------------------------------------------------
+ at c @nm4{}
+ at c
+ at c _ifunix macro
+ at c _ifwindows macro
+ at c
+ at c These two macros may be used to surround text which we wish to appear
+ at c in only one version or the other. They check the _unix and _windows
+ at c defines.
+ at c An example usage is:
+ at c
+ at c     _ifunix([[
+ at c     @split{}
+ at c     @node Assembly-CAP2
+ at c     @section Assembly CAP2
+ at c     _include(cap2-t.texi)
+ at c     ]])
+ at c
+ at c An alternative to this is using _ifdef directly. Eg:
+ at c
+ at c     _ifdef([[_unix]],[[
+ at c     @split{}
+ at c     @node Assembly-CAP2
+ at c     @section Assembly CAP2
+ at c     _include(cap2-t.texi)
+ at c     ]])
+ at c
+ at c @m4{}
+ at c ---------------------------------------------------------------------------
+_define([[_ifunix]],[[_ifdef([[_unix]],[[$*]])]])
+_define([[_ifwindows]],[[_ifdef([[_windows]],[[$*]])]])
+
+ at c ---------------------------------------------------------------------------
+ at c uref macro
+ at c
+ at c This exists in newer texinfo releases, but for now we try to emulate it as
+ at c well as possible (albeit in an m4 instead of texinfo manner).
+ at c
+ at c _uref(url) will just link to that url, with the 'url' as the text.
+ at c _uref(url,text) will link to that url, with 'text' as the text in the
+ at c   html format. For tex format it'll use "text (@code{url})".
+ at c _uref(url,,text) will link to that url, with 'text' as the text in both
+ at c   html and tex formats.
+ at c ---------------------------------------------------------------------------
+_define([[_uref]],[[_ifelse(1,$#,[[_ifdef([[_html]],[[@ifhtml
+<a href="$1">
+ at end ifhtml]])
+$1
+_ifdef([[_html]],[[
+ at ifhtml
+</a>
+ at end ifhtml]])]],[[_ifelse(2,$#,[[_ifdef([[_tex]],[[$2 (@code{$1})]])
+_ifdef([[_html]],[[@ifhtml
+<a href="$1">$2</a>
+ at end ifhtml]])]],[[_ifdef([[_html]],[[
+ at ifhtml
+
+<a href="$1">
+ at end ifhtml
+$3
+ at ifhtml
+</a>
+ at end ifhtml
+]])]])]])]])
+
+ at c normal refs
+_ifdef([[_tex]],[[
+_define([[_fxref]],[[@xref{$1,$1,$2}.]])
+_define([[_fpref]],[[@pxref{$1,$1,$2}]])
+_define([[_fref]],[[@ref{$1,$1,$2}.]])
+_define([[_split]],[[]])
+]])
+
+ at c html refs
+_ifdef([[_html]],[[
+_define([[_fxref]],[[
+ at ifhtml
+<!-- XREF:$1 -->
+ at end ifhtml
+ at xref{$1,$1,$2,$3,$3}.]])
+_define([[_fpref]],[[
+ at ifhtml
+<!-- XREF:$1 -->
+ at end ifhtml
+ at pxref{$1,$1,$2,$3,$3}]])
+_define([[_fref]],[[
+ at ifhtml
+<!-- XREF:$1 -->
+ at end ifhtml
+ at ref{$1,$1,$2,$3,$3}.]])
+_define([[_split]],[[@split]])
+]])
+
+ at c common refs
+_define([[_oxref]],[[@xref{$1,$1,$2}]])
+_define([[_oref]],[[@ref{$1,$1,$2}]])
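Note on header.m4: the renaming and re-quoting steps in this file can be tried in isolation. The fragment below is a minimal sketch and not part of the commit. It assumes GNU m4, renames only a handful of builtins to stay short, and the file name demo.m4 and the two text lines are invented for illustration.

```m4
dnl demo.m4 (hypothetical file): run as  m4 -D_unix demo.m4
dnl Step 1: keep the builtins reachable under an underscore-prefixed name.
define(`_define', defn(`define'))dnl
define(`_ifdef', defn(`ifdef'))dnl
define(`_index', defn(`index'))dnl
define(`_changequote', defn(`changequote'))dnl
define(`_undefine', defn(`undefine'))dnl
dnl Step 2: remove the original names so they cannot fire inside prose.
_undefine(`define')dnl
_undefine(`ifdef')dnl
_undefine(`index')dnl
_undefine(`changequote')dnl
_undefine(`undefine')dnl
dnl Step 3: switch the quote strings to [[ and ]].
_changequote([[,]])dnl
dnl Step 4: a conditional wrapper in the style of the real macro.
_define([[_ifunix]], [[_ifdef([[_unix]], [[$*]])]])dnl
See index(1) for details.
_ifunix([[This sentence is kept only in the UNIX build.]])
```

With -D_unix both text lines should pass through unchanged, including the literal "index(1)", which would otherwise be treated as a call to the index builtin. Without -D_unix the second line collapses to an empty line.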
diff --git a/manual/hidden-t.texi b/manual/hidden-t.texi
new file mode 100644
index 0000000..f0e7db5
--- /dev/null
+++ b/manual/hidden-t.texi
@@ -0,0 +1,30 @@
+ at cindex Hidden data
+ at cindex data hidden
+
+In general sequences obtained from machines contain segments such as
+vector sequence and poor quality data that need either to be removed or
+ignored during assembly and editing. In our package we do not remove
+such segments but instead we mark them so that the programs can deal
+with them appropriately. In gap4 such data is referred to as "hidden".
+The positions to hide are determined initially by preprocessing programs
+such as vector_clip (_fpref(Vector_Clip, Screening Against Vector
+Sequences, vector_clip)) and qclip
+(_fpref(Man-qclip, qclip, manpages)).
+
+
+The hidden data can be revealed in the Contig Editor by toggling the
+Cutoffs button (_fpref(Editor-Cutoffs, Adjusting the Cutoff
+data, contig_editor)), can be used to search for possible joins between
+contigs (_fpref(FIJ, Find Internal Joins, fij)), and can be included in
+the consensus sequence (_fpref(Con-Extended, Extended consensus,
+calc_consensus)) to be used by external screening programs.  For these
+cases the program can distinguish between data that is hidden because it
+is vector and data that is hidden because it is of poor quality: only the
+poor quality data is included.
+
+The position of hidden data can be changed interactively in the Contig
+Editor. In addition the Double Strand function (_fpref(Double Strand,
+Double stranding, exp_suggest)) will reduce the amount of hidden data
+for readings that cover single stranded regions of contigs, if the data
+aligns well with that on the other strand.
+
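For readers unfamiliar with the _fpref calls used in hidden-t.texi: header.m4 defines _fpref, in its tex branch, as @pxref{$1,$1,$2}. A minimal sketch of that expansion follows. It assumes GNU m4 and uses the stock define and changequote names rather than the renamed ones, purely to keep the fragment self-contained.

```m4
dnl Stand-alone approximation of the tex-mode cross-reference macro.
changequote([[,]])dnl
define([[_fpref]], [[@pxref{$1,$1,$2}]])dnl
(_fpref(FIJ, Find Internal Joins, fij))
```

Processing this should yield (@pxref{FIJ,FIJ,Find Internal Joins}). m4 strips the leading blanks from each argument, the node name is reused as the reference name, and the third argument is consumed only by the html variants of the macro.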
diff --git a/manual/i/nav_brief.gif b/manual/i/nav_brief.gif
new file mode 100644
index 0000000..b26bdbd
Binary files /dev/null and b/manual/i/nav_brief.gif differ
diff --git a/manual/i/nav_down.gif b/manual/i/nav_down.gif
new file mode 100644
index 0000000..bf5ccf0
Binary files /dev/null and b/manual/i/nav_down.gif differ
diff --git a/manual/i/nav_first.gif b/manual/i/nav_first.gif
new file mode 100644
index 0000000..75d3439
Binary files /dev/null and b/manual/i/nav_first.gif differ
diff --git a/manual/i/nav_full.gif b/manual/i/nav_full.gif
new file mode 100644
index 0000000..65c4753
Binary files /dev/null and b/manual/i/nav_full.gif differ
diff --git a/manual/i/nav_home.gif b/manual/i/nav_home.gif
new file mode 100644
index 0000000..5e1293c
Binary files /dev/null and b/manual/i/nav_home.gif differ
diff --git a/manual/i/nav_last.gif b/manual/i/nav_last.gif
new file mode 100644
index 0000000..95a8a39
Binary files /dev/null and b/manual/i/nav_last.gif differ
diff --git a/manual/i/nav_next.gif b/manual/i/nav_next.gif
new file mode 100644
index 0000000..7fa6ebe
Binary files /dev/null and b/manual/i/nav_next.gif differ
diff --git a/manual/i/nav_prev.gif b/manual/i/nav_prev.gif
new file mode 100644
index 0000000..31176c4
Binary files /dev/null and b/manual/i/nav_prev.gif differ
diff --git a/manual/i/nav_top.gif b/manual/i/nav_top.gif
new file mode 100644
index 0000000..cb77483
Binary files /dev/null and b/manual/i/nav_top.gif differ
diff --git a/manual/i/nav_up.gif b/manual/i/nav_up.gif
new file mode 100644
index 0000000..434a6d6
Binary files /dev/null and b/manual/i/nav_up.gif differ
diff --git a/manual/init_exp.1.texi b/manual/init_exp.1.texi
new file mode 100644
index 0000000..056aaeb
--- /dev/null
+++ b/manual/init_exp.1.texi
@@ -0,0 +1,56 @@
+ at cindex init_exp: man page
+ at unnumberedsec NAME
+
+init_exp --- create and initialise an Experiment File
+
+ at unnumberedsec SYNOPSIS
+
+ at code{init_exp} [@code{-}(@code{abi}|@code{alf}|@code{scf}|@code{pln})]
+[@code{-output} @i{file}] [@code{-name} @i{entry_name}] [@code{-conf}] @i{file}
+
+ at unnumberedsec DESCRIPTION
+
+ at code{init_exp} creates an Experiment File for a binary trace file or
+a plain sequence file. The Experiment File created
+contains the @code{ID}, @code{EN}, @code{LN}, @code{LT} and @code{SQ}
+lines.
+
+The experiment file is, by default, sent to standard output, unless an
+output file is specified using the @code{-output} option. The default
+entry name for the Experiment File is derived from the filenames used.
+If an output file has been specified, then this is taken as the
+ at code{EN} field. Otherwise the input file name is used. The user can
+override the default by using the @code{-name} option.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-abi}, @code{-alf}, @code{-scf}, @code{-pln}
+    Specify an input file format. This is not usually required as
+    @code{init_exp} will automatically determine the correct input file
+    type. This option is supplied in case the automatic determination is
+    incorrect (which is possible, but has never been observed).
+
+ at item @code{-output} @i{file}
+    The experiment file will be written to @i{file} instead of standard
+    output. Additionally the value of the @code{EN} and @code{ID}
+    fields, assuming @code{-name} has not been specified, will be @i{file}.
+
+ at item @code{-name} @i{name}
+    Sets the @code{ID} and @code{EN} fields to @i{name}, regardless of
+    the output filename used.
+
+ at item @code{-conf}
+    Fills out the @code{AV} field with the quality values found in the SCF
+    file.
+
+ at end table
+
+ at unnumberedsec NOTES
+
+This program was formerly known as @code{expGetSeq}.
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp, ExperimentFile(4), formats)
+ at code{Read}(4)
diff --git a/manual/interface-t.texi b/manual/interface-t.texi
new file mode 100644
index 0000000..f43ac38
--- /dev/null
+++ b/manual/interface-t.texi
@@ -0,0 +1,469 @@
+ at menu
+* UI-Introduction::	        Introduction
+* UI-Basics::                   Basic controls; buttons, menus, entries
+* UI-Mouse::                    Standard Mouse Operations
+* UI-Output::                   The Output and Error Windows
+* UI-Graphics::                 Graphics Window
+* UI-Colour::                   Colour Selector
+* File Browser::                File Browser
+* UI-Fonts::                    Font Selection
+ at ifset standalone
+* Index::                       Index
+ at end ifset
+ at end menu
+
+_split()
+ at node UI-Introduction
+ at unnumberedsec Introduction
+ at cindex User interface
+ at cindex User interface: introduction
+
+This chapter describes the graphical user interface implemented in the
+programs gap4, pregap4, spin and trev.
+
+_split()
+ at node UI-Basics
+ at section Basic Interface Controls
+ at menu
+* UI-Buttons::                  Buttons
+* UI-Menus::                    Menus
+* UI-Text::                     Text windows
+* UI-Entries::                  Entry boxes
+ at end menu
+
+The key components of any graphical interface are menus, buttons and text entry
+boxes. These are usually grouped together into 'panels' or 'dialogues'. For
+the sake of simplifying the rest of the documentation we describe the
+operation and control of the user interface here.
+
+_split()
+ at node UI-Buttons
+ at subsection Buttons
+ at cindex Buttons
+ at cindex User interface: buttons
+
+There are three basic types of button: command buttons, radio buttons and
+check buttons. Buttons are used by moving the mouse pointer over the button
+(it will then be highlighted, as is "Choice 3" below) and pressing the left
+mouse button. The illustration below demonstrates each of the three types.
+
+_picture(interface.buttons,2.64167in)
+
+The buttons labelled "Command 1", "Command 2" and "Command 3" are command
+buttons. These perform a particular action, which is usually determined by the
+text within the button. Typically command buttons have a slightly raised look.
+A typical example is the "Clear" button visible on the main gap4 window.
+See the illustration in _oref(UI-Output, Output and Error Windows).
+
+The buttons underneath, labelled "Choice 1", "Choice 2" and "Choice 3" are
+radio buttons. These have small diamonds to the left of each name. Only one of
+the buttons in each group of radio buttons (it is possible to have several
+distinct groups) may be set at any one time. In this example "Choice 1" has
+been selected. For example, selecting "Choice 3" will now clear the diamond
+next to "Choice 1" and fill the diamond next to "Choice 3".
+
+The bottom row of buttons are check buttons. These each have small boxes to
+the left of each name. They act similarly to radio buttons except that more than
+one can be selected at any one time. Here we have "Check 2" and "Check 3"
+selected, with "Check 1" deselected. Pressing a check button will toggle it;
+so clicking the left mouse button on "Check 2" would deselect it and clear the
+neighbouring box. The "Scroll on output" button is an example of a check
+button. See the illustration in _oref(UI-Output, Output and Error Windows).
+
+_split()
+ at node UI-Menus
+ at subsection Menus
+ at cindex Menus
+ at cindex User interface: menus
+
+When many operations are available it is impractical to arrange them all in
+command buttons. For this reason we have menus. Typically we will use a
+"menubar" consisting of several menus arranged side by side.
+
+_picture(interface.menus,1.36667in)
+
+The menubar is a series of menu buttons arranged side by side. In the above
+picture, the menus are "File" through to "Help". Selecting an item from a menu
+is done by pressing and holding the left mouse button whilst the cursor is
+above the menu button. The available menu choices will then be displayed.
+Whilst still pressing the mouse button, move down to the desired choice and
+then release the mouse button. Releasing the mouse button when the mouse
+cursor is not over a menu item will remove the menu without executing any
+options. Alternatively, it is possible to press and release the left mouse
+button whilst the cursor is above a menu button. The menu options will be
+revealed.  Now move down and press and release the left button once more
+on the selected item.
+
+To see an overview of the menu contents press the left mouse button over a
+menu button and move the mouse cursor over the other menu buttons. As each
+menu button is highlighted the appropriate options for this menu will be
+shown.
+
+Some menu items lead to further menus. These are called cascading menus. Treat
+these exactly as normal menus. 
+
+To tear off a menu pull down the menu using the left mouse button, select
+the dashed perforation line, and release the button. The menu will be
+redrawn with a title bar which can be used to move it to any position on
+the screen. Not all menus support tearing off.
+
+_split()
+ at node UI-Text
+ at subsection Text Windows
+ at cindex Text windows
+ at cindex User interface: text windows
+
+A text window is simply an area of the screen set aside for displaying textual
+information. Typical examples are the Output and Error windows seen on the
+main gap4 screen. See the illustration in _oref(UI-Output, Output and
+Error Windows).
+
+The most basic use of text windows is to display data. If the data is large
+then there will usually be scrollbars on the right and bottom sides of the
+text display. If the data is of an editable nature (such as the comments in a
+tag in gap4) we may perform many editing operations on the text. The
+simplest commands follow.
+
+ at example
+ at group
+Arrow keys              Moves the editing cursor
+Left mouse button       Sets the editing cursor
+Middle mouse button     Panning - controls both scrollbars at once
+Alt left mouse button   Panning - controls both scrollbars at once
+Delete                  Deletes the character to the left of the cursor
+Most other keys         Adds text to the window
+ at end group
+ at end example
+
+In addition to the above, some more advanced features are available,
+mostly following the @code{Emacs} style of key bindings.
+
+ at example
+Delete                  Delete region (when highlighted), otherwise as above
+Control D               Delete character to the right of the cursor
+Control N               Down one line
+Control P               Up one line
+Control B               Move back one character
+Control F               Move forward one character
+Control A               Move to start of line
+Control E               Move to end of line
+Meta b                  Move back one word
+Meta f                  Move forward one word
+Meta <                  Move to start
+Meta >                  Move to end
+Control Up              Move up one paragraph
+Control Down            Move down one paragraph
+Next                    Move down one page
+Prev                    Move up one page
+Control K               Delete to end of line
+Control T               Transpose two characters
+Drag left button        Highlights a region (for cut and paste)
+Control /               Select all (for cut and paste)
+Control \               Deselect all (for cut and paste)
+ at end example
+
+_split()
+ at node UI-Entries
+ at subsection Text Entry Boxes
+ at cindex Entry boxes
+ at cindex User interface: entry boxes
+
+An entry box is basically a small, one line, text window. All of the same
+editing commands exist, although many are redundant for such a small window.
+
+_picture(interface.entry,2.08333in)
+
+A typical entry box can be seen in the gap4 dialogue for opening new
+databases. Here the ringed region to the right of the "Enter new filename"
+text is the entry box. The current content of this entry is "file". The
+vertical black line visible is the text entry point.
+
+_split()
+ at node UI-Mouse
+ at section Standard Mouse Operations
+ at cindex Mouse control: overview
+ at cindex Mouse buttons: overview
+ at cindex Buttons: mouse overview
+ at cindex Left mouse button: overview
+ at cindex Middle mouse button: overview
+ at cindex Alt left mouse button: overview
+ at cindex Right mouse button: overview
+
+The same mouse buttons are used for similar operations throughout the
+programs. A brief description of the mouse control is listed below.
+On UNIX three button mice are used, but on Windows or Linux two
+buttons are more common, and so the alternative of holding Alt while
+using the left mouse button is provided in place of the middle
+button.
+
+ at code{Left button                     Select}
+ at quotation
+In a dialogue this selects an item from a list of items.
+Within a graphical display (eg the template display) this
+"selects" an item. Selected items are shown in bold. Selecting
+an already selected item will deselect it.
+ at end quotation
+
+ at code{Drag left button                Select region}
+ at quotation
+This operates only for the graphical displays. A rectangular box can
+be dragged out between where the left button was pressed (and held
+down) to the current mouse cursor position. Releasing the left button
+will then select all items contained entirely within the rectangle.
+Within the contig editor such selections are displayed by underlining
+the region instead.
+ at end quotation
+
+ at code{Drag middle button              Move}
+ at quotation
+This currently operates only for the contig selector. The selected
+items (or the item under the mouse pointer if none are selected) are
+dragged until the middle button is released.
+ at end quotation
+
+ at code{Drag Alt left button            Move}
+ at quotation
+This currently operates only for the contig selector. The selected
+items (or the item under the mouse pointer if none are selected) are
+dragged until the button is released.
+ at end quotation
+
+ at code{Right button                    Popup menu}
+ at quotation
+Within some displays this will pop up a menu displaying a list of
+commands that can be used on the selected item.
+ at end quotation
+
+_split()
+ at node UI-Output
+ at section The Output and Error Windows
+ at cindex Output window
+ at cindex Error window
+ at cindex Search: in the output window
+ at cindex Scroll on output
+ at cindex Redirect output
+ at cindex Clear: in output window
+
+The main screen has three portions: the menubar, the output window and the
+error window. Of these, the output and error windows are identical except for
+the data that appears within them. Here we describe the general operation of
+the output window only, although the details apply to the error window too.
+
+_lpicture(interface.output,5.13333in)
+
+The output window consists of a text window with a set of labels and buttons
+above it. At the top left is the window name followed by a colon. After
+the colon the name of the current output file will be shown in blue italic
+letters. All new output appearing in this window will also be sent to this
+file. Initially no output file is specified and this label is blank (as can be
+seen in the error window). Using the "Redirect" menu located at the top right
+of the window a new file can be opened, or an existing one closed (in which
+case output is no longer sent to the specified file). The output and error
+windows may both have redirection files.
+
+The "Search" button invokes a dialogue box requesting a string to search
+for and whether to search forwards or backwards from the current position
+of the cursor. The search is case insensitive. Hitting the OK button finds
+the next match. The bell is sounded if no more matches can be found.
+
+The "Scroll on output" check button toggles whether the window should
+automatically scroll when new output appears to ensure that it is
+visible. The default state (as seen in the illustration) is to
+scroll. The "Clear" button removes all output from the window.
+
+Each command, when run, adds a title to the output window. This
+contains the current time together with the command name. Output for
+this command then appears beneath the header. In the illustration the
+output from three commands is visible. Of these the "edit contig"
+command produces no output, but still has a header.
+
+Pressing the right button with the cursor above a piece of output
+(either in its header or the text beneath it) will pop up a menu of
+operations. This operation is not valid for the error window. The
+commands are:
+
+ at code{Show input parameters}
+ at quotation
+Inserts the input parameters of the command, if any, beneath the 
+command header
+ at end quotation
+
+ at code{Remove}
+ at quotation
+Deletes this text from the output window
+ at end quotation
+
+ at code{Output to disk}
+ at quotation
+Sends this text to a specified file
+ at end quotation
+
+ at code{Output to list}
+ at quotation
+Sends this text to a specified list
+ at end quotation
+
+_ifdef([[_unix]],[[
+
+ at code{Output to command}
+ at quotation
+Starts up a specified command and sends this text to the input of the
+command. Any output from the command is added back to the output
+window. Any errors from the command appear in the output window.
+Currently this allows commands to run for up to five seconds, and
+terminates the command if it has taken longer. To start longer running
+applications add an ampersand (&) after the command name.
+ at end quotation
+
+ ]])
+
+The output operations allow the user to specify whether the header, input 
+parameters or text for the command, in any combination, are sent to the
+output.
+
+
+
+The text in the error window has a different format to the output
+window. Instead of large portions of text separated by headers, each
+item in the error window consists of a single line containing the
+date, the name of the function producing the error, and a brief
+description of the error. Many error messages will be displayed in
+their own dialogue boxes (eg not having write access to a file) and
+hence will not appear in the error window.  Each time an error message
+is added the bell is rung.
+
+_split()
+ at node UI-Graphics
+ at section Graphics Window
+ at cindex Graphics windows: user interface
+
+ at menu
+ at ifset html
+* UI-Graphics-Intro::	        Introduction
+ at end ifset
+* UI-Graphics-Zoom::            Zooming
+ at end menu
+
+ at ifset html
+ at node UI-Graphics-Intro
+ at unnumberedsubsec Introduction
+ at end ifset
+
+The graphical displays have several features in common. Commands are
+selected from buttons and menus ranged along the top of the window.
+Menus can be "torn off" and positioned anywhere on the screen. Zooming
+is allowed using the mouse. The "Zoom out" button undoes the previous zoom
+command. Crosshairs and cursors can be toggled on and off, and their
+coordinates in base positions appear in boxes in the top right hand
+corner of the displays.  Items plotted in the graphical displays
+have text attached, and as the cursor passes over an item, it is
+highlighted and its text appears in an Information line at the bottom of
+the display.
+
+ at node UI-Graphics-Zoom
+ at subsection Zooming
+ at cindex Zooming graphics
+ at cindex User interface: Zooming graphics
+
+Plots can be enlarged either by resizing the window or by zooming.
+In some plots zooming is achieved by holding down the control key
+and right mouse button and dragging out a rectangle.  Rectangles
+that are too small are ignored and a warning bell will sound. The
+content of the window is magnified such that the contents of the
+zoom box fill the window. The Zoom out button will restore the
+plot to the previous magnification. In other plots, x and y scale
+boxes achieve similar effects.
+
+_split()
+ at node UI-Colour
+ at section Colour Selector
+ at cindex Colour selector
+ at cindex User interface: colour selector
+
+A common operation is to change the colour of a plot. For this we use the
+colour selector dialogue shown below. The three sliders control the red,
+green, and blue intensities to use in producing the desired colour. The
+shaded box at the bottom illustrates the current colour. In some displays
+moving the sliders will also update the colour in the associated plot
+interactively.
+
+_picture(interface.colour,2.35in)
+
+Pressing the "OK" button will quit the colour selector and update the
+appropriate colours in the plot. Pressing "Cancel" will quit the colour
+selector without making any changes to the plot. Note that some colour
+dialogues may also be combined with extra controls for adjusting other
+graphical styles, such as the line width.
+
+Many programs have a "Colours" command in the Options menu. This displays two
+colour selectors; one for each of the foreground and background colours. This
+can be used to adjust the main colour scheme used for the program. Pressing
+"OK" selects this colour scheme and keeps it in use until the program exits.
+Pressing "OK Permanent" accepts this colour scheme, but also updates the
+ at file{$HOME/.tk_utilsrc} file. This means that the colour scheme will be used
+for all future program uses. To revert to the default colours, manually edit
+the @file{$HOME/.tk_utilsrc} file.
+
+_split()
+ at node File Browser
+ at section File Browser
+_include(filebrowser-t.texi)
+
+_split()
+ at node UI-Fonts
+ at section Font Selection
+ at cindex Fonts
+ at cindex Configuring: fonts
+
+The Options menu of most programs contains a "Set fonts" command. This brings
+up a font selection dialogue consisting of some sample text, three option
+menus to select the font name, family and size, and some check buttons for
+font styles.
+
+_picture(interface.fonts,2.65833in)
+
+In the above picture, @strong{button_font} is the currently selected font
+name. This option menu contains several of the following font types. The exact
+ones available depend on the program being used.
+
+ at table @strong
+ at item button_font
+Used for buttons, labels, checkbuttons and radiobuttons.
+ at item menu_font
+Used for menu buttons and their contents, including pull down menu contents.
+ at item text_font
+Used for textual displays, such as the main output windows. This should be
+chosen to be a fixed width font, such as @code{Courier}.
+ at item sheet_font
+Used for the scrolled text displays such as the contig editor in gap4 and the
+sequence displays in spin. This too needs to be chosen as a fixed
+width font.
+ at item title_font
+Used for headings within text windows such as contig names in the gap4
+suggest probes function.
+ at item menu_title_font
+Used in the title line of popup menus.
+ at item trace_font
+Used for the sequence and number displays in the trace displays for both gap4
+and Trev.
+ at end table
+
+Next to the font name is the font family selector. The contents of this menu
+will depend on the fonts available to your system. Some may be inappropriate,
+or not even in the correct language. Next to the family selector is the size
+menu. This contains a range of sizes in both pixel and point units. If a font
+of a particular size is not available, the nearest font or size will be
+automatically chosen. Specifying fonts to be a fixed number of points states
+that the font should have a specific physical size, regardless of monitor size
+or screen resolution. There are 72.27 points to the inch. Underneath these we
+have Bold, Italic, Overstrike and Underline check buttons.
+
+Whilst choosing the font, the fonts used in the entire program automatically
+update to show you how things will look. Pressing "Cancel" will reset the
+fonts back to their original state. Pressing "OK" will keep these chosen
+fonts, until the program is exited. Pressing "OK Permanent" will keep these
+fonts, but will also add them to the user's @file{$HOME/.tk_utilsrc} file.
+This file is processed when the programs start up, and so your font choice
+will be used permanently. To remove this font choice, manual editing of the
+ at file{$HOME/.tk_utilsrc} file is required.
+
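The _picture(name,width) calls in interface-t.texi rely on the macro from header.m4, which picks an output form according to whether _tex or _html is defined. The following is a simplified sketch, assuming GNU m4. It uses the stock define and ifdef names and a cut-down html branch, so it approximates the real macro rather than reproducing it; picture.m4 is a hypothetical file name.

```m4
dnl Try:  m4 -D_tex picture.m4   and   m4 -D_html picture.m4
changequote([[,]])dnl
define([[_picture]],
[[ifdef([[_tex]], [[@image{$*}]])ifdef([[_html]], [[<img src="$1.png">]])]])dnl
_picture(interface.buttons,2.64167in)
```

With -D_tex this should produce @image{interface.buttons,2.64167in}, passing both arguments through to texinfo. With -D_html only the first argument is used as the filename prefix, giving <img src="interface.buttons.png">.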
diff --git a/manual/interface.buttons.png b/manual/interface.buttons.png
new file mode 100644
index 0000000..d028838
Binary files /dev/null and b/manual/interface.buttons.png differ
diff --git a/manual/interface.colour.png b/manual/interface.colour.png
new file mode 100644
index 0000000..728fe25
Binary files /dev/null and b/manual/interface.colour.png differ
diff --git a/manual/interface.entry.png b/manual/interface.entry.png
new file mode 100644
index 0000000..35d7536
Binary files /dev/null and b/manual/interface.entry.png differ
diff --git a/manual/interface.fonts.png b/manual/interface.fonts.png
new file mode 100644
index 0000000..8dbbe68
Binary files /dev/null and b/manual/interface.fonts.png differ
diff --git a/manual/interface.menus.png b/manual/interface.menus.png
new file mode 100644
index 0000000..05cc65c
Binary files /dev/null and b/manual/interface.menus.png differ
diff --git a/manual/interface.output.png b/manual/interface.output.png
new file mode 100644
index 0000000..e6e4c20
Binary files /dev/null and b/manual/interface.output.png differ
diff --git a/manual/interface.output.small.png b/manual/interface.output.small.png
new file mode 100644
index 0000000..3e433b2
Binary files /dev/null and b/manual/interface.output.small.png differ
diff --git a/manual/interface.tag.png b/manual/interface.tag.png
new file mode 100644
index 0000000..c8306d3
Binary files /dev/null and b/manual/interface.tag.png differ
diff --git a/manual/interface.texi b/manual/interface.texi
new file mode 100644
index 0000000..490deb0
--- /dev/null
+++ b/manual/interface.texi
@@ -0,0 +1,42 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+
+ at c %**start of header
+ at setfilename interface.info
+ at settitle The User Interface
+ at c @setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+include(header.m4)
+
+ at titlepage
+ at title The User Interface
+ at subtitle 
+ at author 
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-interface
+ at end ifinfo
+
+ at raisesections
+ at set standalone
+_include(interface-t.texi)
+
+_split()
+ at node Index
+ at unnumberedsec Index
+ at printindex cp
+ at lowersections
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/list_libraries-t.texi b/manual/list_libraries-t.texi
new file mode 100644
index 0000000..84e6ba6
--- /dev/null
+++ b/manual/list_libraries-t.texi
@@ -0,0 +1,39 @@
+ at cindex List Libraries
+ at cindex Insert Sizes
+ at cindex Read groups: SAM RG tags
+
+The List Libraries window is perhaps misnamed as it handles arbitrary
+groups of reads, possibly due to the use of multiple libraries,
+multiple instrument types or simply multiple lanes on a single
+instrument. For SAM/BAM files this information comes from the
+ at code{@@RG} header lines. For other formats Gap5 typically makes use of
+the input filename to group data together.
+
+_picture(gap5_list_libraries,5.18333in)
+
+The basic plot shows a list of library names and how frequently read
+pairs have been identified as matching to the same contig. This is
+computed at the time of import via tg_index and so will not be updated
+on contig joining or breakage. The @i{Type} field indicates
+the instrument platform type (for example Illumina or 454), although
+this is often absent from the input BAM files.
+
+The @i{Insert size} and standard deviation (@i{s.d.}) are derived from
+the sequence alignments, with assumptions of an approximately Gaussian
+distribution. While not entirely accurate this is typically sufficient
+for most libraries when viewed in a summary table. Finally the
+ at i{Orientation} field indicates the relative orientation in which most
+of the read-pairs have been assembled. This will be one of
+``@code{-> <-}'', ``@code{<- ->}'' or ``@code{-> -> / <- <-}'' to
+indicate the relative orientations of the read-pair. Whether the observed
+orientation is correct will depend on the particular sequencing
+strategy used.
+
+Underneath the list is a histogram of observed insert sizes for the
+currently selected library. The graph is currently very rudimentary
+with no controls, but it will auto-scale to fit the data. The example
+shown above is an Illumina large insert library showing two distinct
+distributions with the smaller being where the biotin enrichment failed
+and short templates were included in the library. (Note in this example
+the sequence orientations have been flipped so the bulk of the data is
+in the orientation expected by other tools.)
diff --git a/manual/lists-t.texi b/manual/lists-t.texi
new file mode 100644
index 0000000..0e07cbe
--- /dev/null
+++ b/manual/lists-t.texi
@@ -0,0 +1,254 @@
+ at menu
+* List-Special::                Special list names
+* List-Commands::               Basic list commands
+* List-ContigToRead::           Contigs To Readings command
+_ifdef([[_gap4]],[[* List-MinCoverage::            Minimal Coverage command
+* List-Unattached::             Unattached Readings command
+* List-HighlightReadings::      Highlight Readings List
+* List-SearchSequenceNames::    Search Sequence Names
+* List-SearchTemplateNames::    Search Template Names
+* List-SearchAnnotations::      Search Annotation Contents]],[[* List-SearchSequenceNames::    Search Sequence Names]])
+ at end menu
+
+ at cindex Lists
+
+For many operations it is convenient to be able to process sets of data together - for example to calculate a consensus sequence for a subset of the contigs. To facilitate this __prog__ uses lists.
+
+Most __prog__ commands dealing with batches of files or sets of readings or contigs
+can use either files of filenames or lists. When selecting list names from
+within dialogues the "browse" button will display a window containing all the
+currently existing lists. To select a list simply double click on the list
+name. Alternatively the name may simply be typed in.
+
+The List menu on the main menubar contains commands to Edit, Create, Delete,
+Copy, Load, and Save lists. Some of these display a list editor. This is
+simply a scrollable text window supporting simple editing facilities 
+(_fpref(UI-Text, Text Windows, interface)).
+
+The "Clear" button clears the list. The "Ok" button removes the list
+editor window. It is not necessary to use "Ok" here before supplying the list
+name for input to another option. 
+
+_split()
+ at node List-Special
+ at section Special List Names
+ at cindex Lists: special names
+ at cindex contigs list
+ at cindex readings list
+ at cindex allcontigs list
+ at cindex allreadings list
+
+Some lists are automatically updated or are generated on-the-fly as needed.
+The lists named "contigs" and "readings" correspond to the currently selected
+contigs in the contig selector window and the currently selected readings in
+the template displays. Note that lists (with any names) can also be created
+from selected items in the contig editor.
+_fxref(Editor-Output List, Set Output List, contig_editor)
+The "allcontigs" and "allreadings" lists are created as needed and always
+contain an identifier for every contig and for every reading, respectively.
+
+Because of the way the lists are implemented, as is outlined below,
+there are some useful "tricks" that can be employed.
+A list name consisting of a contig identifier surrounded by square
+brackets ('[' and ']') will cause the creation of a list containing all of the
+readings within that contig. For example, to use the Extract Readings
+option (_fpref(Extract Readings, Extract Readings, ex))
+to extract all the readings from
+contig 'xb54f8.s1', the list name given in the Extract Readings dialogue
+would be '[xb54f8.s1]'.
+
+A list name surrounded by curly brackets ('@{' and '@}') will cause the
+creation of a list containing all of the readings in the contigs
+named in the
+specified list name. So '@{contigs@}' is equivalent to all the readings
+in the
+contigs contained in the 'contigs' list. Hence the 'allreadings' list is
+identical to '@{allcontigs@}'.
+
+These tricks can be used anywhere where a list name is required except for
+editing and deletion of lists. As a final example,
+to produce a file of filenames for the
+currently selected contigs, save the list named '@{contigs@}' to a file.
+
+_split()
+ at node List-Commands
+ at section Basic List Commands
+ at cindex Lists: commands
+ at cindex Lists: copy
+ at cindex Lists: load
+ at cindex Lists: save
+ at cindex Lists: edit
+ at cindex Lists: create
+ at cindex Lists: delete
+_ifdef([[_unix]],[[@cindex Lists: print]])
+ at cindex Copy list
+ at cindex Load list
+ at cindex Save list
+ at cindex Edit list
+ at cindex Create list
+ at cindex Delete list
+_ifdef([[_unix]],[[@cindex Print list]])
+
+The basic operations that can be performed on lists include copying,
+loading, saving, editing,
+_ifdef([[_unix]],[[printing,]])
+creation and deletion. Joining and splitting can only be performed
+using the list editors and using cut and paste between windows.
+
+The Load and Save commands require a list name and a file name. If
+only the name of the file is given the list is assumed to have the same
+name.  If it is desired to load or
+save a list from/to a file of a different name then both should be
+specified. Creating a list that already exists (or loading a file into
+an already existing list) is allowed, but will produce a warning
+message.
+
+The ``Reading list'' option controls whether the list to be loaded is
+a list of reading names (which is normally the case). This will then
+turn on hyperlinking in any text views of this list. Double-left
+clicking on an underlined reading name will bring up the contig editor
+while right-clicking will bring up a command menu.
+
+_split()
+ at node List-ContigToRead
+ at section Contigs To Readings Command
+ at cindex Lists: Contigs to Readings
+ at cindex Contigs to Readings: lists
+ at cindex File of filenames generation
+
+This command produces a list or file of reading names for a single contig or
+for a set of contigs. The user interface provides a dialogue
+to select the contigs and to select a list name or filename.
+
+_ifdef([[_gap4]],[[
+_split()
+ at node List-MinCoverage
+ at section Minimal Coverage Command
+ at cindex Lists: minimal coverage
+ at cindex minimal coverage: lists
+
+This command produces a minimal list of readings that
+together span the entire length of a contig. The dialogue allows 
+contig names to be defined using a list or a file of filenames. 
+The output produced can be sent to a list or a file of filenames. 
+An example use of
+this function is to determine a minimal set of overlapping readings
+for resequencing.
+
+_split()
+ at node List-Unattached
+ at section Unattached Readings Command
+ at cindex Lists: unattached readings
+ at cindex Unattached readings: lists
+
+This command finds the contigs that consist of single readings. The output
+can be written to a list or a file of filenames. One example
+use of the option is for tidying up projects by removing the trivial
+and unrequired contigs. In this case the list would be used
+as input to disassemble readings
+(_fpref(Disassemble, Disassembling Readings, disassembly)).
+
+ at node List-HighlightReadings
+ at section Highlight Readings List
+ at cindex Lists: highlight readings list
+ at cindex Highlight readings list
+
+This simply loads the ``readings'' list so that the template display
+and contig editor auto-highlight the chosen readings. This function is
+the same as the Highlight Readings List option in the template display.
+
+_split()
+ at node List-SearchSequenceNames
+ at section Search Sequence Names
+ at cindex Lists: search sequence names
+ at cindex Search sequence names: lists
+ at cindex Reading names, searching for
+ at cindex Sequence names, searching for
+
+This command allows searching for sequences matching a given pattern. The
+function produces both a list in the text output window and a __prog__ "list" of
+reading names. The highlighted output is clickable, with the left mouse button 
+invoking the contig editor and the right mouse button displaying a popup-menu
+allowing additional operations (contig editor, template display, reading notes 
+and contig notes).
+
+The text search may be performed as either case-sensitive or
+case-insensitive. Additionally the following pattern search types are available.
+
+ at table @strong
+ at item sub-string
+Matches any reading name where the pattern matches all or part of the name.
+
+ at item wild-cards
+Searches for a pattern using normal filename wild-card matching syntax. So
+ at code{*} matches any sequence of characters, @code{?} matches any single
+character, @code{[}@i{chars}@code{]} matches a set of characters defined by
+ at i{chars}, and @code{\}@i{char} matches the literal character
+ at i{char}. Character sets may use a minus sign to match a range. For example
+ at code{x*.[fr][1-9]} matches any name starting with @code{x} and ending with
+fullstop followed by either @code{f} or @code{r} followed by a single digit
+between 1 and 9 inclusive. To match a substring using wild-cards prepend or
+append the search string with @code{*}.
+
+ at item regular expression
+This uses the Tcl regular expression syntax to perform a match. These patterns
+are naturally sub-strings unless anchored to one or both ends using the
+ at code{^}@i{expression}@code{$} syntax. A full description of regular
+expressions is beyond the scope of this manual.
+ at end table
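
The three search types described above can be approximated outside the program with a short Python sketch. This is only an illustration: the function name `search_names` and its arguments are hypothetical, Python's `re` module stands in for the Tcl regular expression engine (the two syntaxes differ in places), and Python's `fnmatch` does not support the backslash escape described for wild-cards.

```python
import fnmatch
import re


def search_names(names, pattern, mode, ignore_case=False):
    """Return the names matching pattern under one of three search modes.

    mode is "sub-string", "wild-cards" or "regexp" (hypothetical labels
    mirroring the search types in the manual text).
    """
    if mode == "regexp":
        # Unanchored search: the pattern behaves as a sub-string match
        # unless it is anchored with ^ and/or $.
        flags = re.IGNORECASE if ignore_case else 0
        compiled = re.compile(pattern, flags)
        return [n for n in names if compiled.search(n)]

    # For the other modes, fold case on both sides when requested.
    fold = (lambda s: s.lower()) if ignore_case else (lambda s: s)
    folded_pattern = fold(pattern)

    if mode == "sub-string":
        return [n for n in names if folded_pattern in fold(n)]
    if mode == "wild-cards":
        # fnmatchcase must match the whole name, as with filename globbing.
        return [n for n in names if fnmatch.fnmatchcase(fold(n), folded_pattern)]
    raise ValueError("unknown search mode: %s" % mode)


if __name__ == "__main__":
    readings = ["xb54f8.f1", "xb54f8.s1", "xa01c2.r9", "yb54f8.f1"]
    print(search_names(readings, "x*.[fr][1-9]", "wild-cards"))
```

Note that the wild-card mode has to match the complete name, which is why the manual suggests adding a leading or trailing `*` to obtain sub-string behaviour.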
+
+_split()
+ at node List-SearchTemplateNames
+ at section Search Template Names
+ at cindex Lists: Search template names
+ at cindex Search template names: lists
+ at cindex Template names, searching for
+
+This searches for template names matching a given pattern. The list
+produced will contain just the template names, but the information listed in
+the text output window lists the template names and the readings contained
+within each template. The reading names are hyperlinks and so double
+left-clicking on them will bring up the contig editor whilst right-clicking
+brings up a popup menu.
+
+For a description of the types of template search patterns see
+_fref(List-SearchSequenceNames, Search Sequence Names, lists)
+
+_split()
+ at node List-SearchAnnotations
+ at section Search Annotation Contents
+ at cindex Lists: Search annotation contents
+ at cindex Search annotation contents: lists
+ at cindex Annotations, searching for
+ at cindex Tags, searching for
+
+This searches the contents of annotations on both the individual reading
+sequences and the consensus sequences. A gap4 list will be produced containing
+the annotation number, contig and position. In the text output window a more
+complete description is available listing the annotation type and the contents
+of each annotation. Both the list and text-output window will contain a
+highlighted section which is a hyperlink. Double clicking on this with the
+left mouse button will bring up the contig editor at that point. Clicking with
+the right mouse button will display a popup-menu with further options.
+
+For a description of the types of annotation search patterns see
+_fref(List-SearchSequenceNames, Search Sequence Names, lists)
+
+]],[[
+_split()
+ at node List-SearchSequenceNames
+ at section Search Sequence Names
+ at cindex Lists: search sequence names
+ at cindex Search sequence names: lists
+ at cindex Reading names, searching for
+ at cindex Sequence names, searching for
+
+This command allows searching for sequences matching a prefix. The
+function produces both a list in the text output window and a __prog__ "list" of
+reading names. The highlighted output is clickable, with the left mouse button 
+invoking the contig editor and the right mouse button displaying a popup-menu
+allowing additional operations (contig editor, template display, reading notes 
+and contig notes).
+
+All searches are case sensitive and prefix only.
+]])
diff --git a/manual/makeSCF.1.texi b/manual/makeSCF.1.texi
new file mode 100644
index 0000000..8383465
--- /dev/null
+++ b/manual/makeSCF.1.texi
@@ -0,0 +1,94 @@
+ at cindex makeSCF: man page
+ at unnumberedsec NAME
+
+makeSCF --- Converts trace files to SCF files.
+
+ at unnumberedsec SYNOPSIS
+
+ at code{makeSCF} [@code{-8}] [@code{-2}] [@code{-3}]
+-(@code{abi}|@code{alf}|@code{scf}|@code{pln}) @i{input_name}
+[@code{-compress} @i{compression_mode}] [@code{-normalise}]
+[@code{-output} @i{output_name}]
+
+ at unnumberedsec DESCRIPTION
+
+ at code{MakeSCF} converts trace files to the SCF format. It can input ABI 373A,
+Pharmacia A.L.F., or previously created SCF files (although converting from
+SCF to SCF serves no useful purpose!). 
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-8}
+    Force conversion to 8 bit sample data. This shrinks the size of SCF
+    files using 16 bit sample values, but at a loss of resolution. For trace
+    display purposes this accuracy loss is acceptable.
+
+ at item @code{-2}
+    Force the output to be written in SCF version 2. By default the
+    latest version (3) is used.
+
+ at item @code{-3}
+    Force the output to be written in SCF version 3. This is the default.
+
+ at item @code{-s}
+    Silent mode. This prevents the output of the copyright message.
+
+ at item @code{-abi}, @code{-alf}, @code{-scf}, @code{-any}
+    Specify an input file format. A file format of "any" will force
+    @code{makeSCF} to automatically determine the correct input file type.
+
+ at item @code{-compress} @i{compression_mode}
+    Requests the generated SCF file to be passed through a separate compression
+    program before writing to disk. @code{makeSCF} does not contain any
+    compression algorithms itself. It requires the appropriately named tool to
+    be on the system and in the user's @r{PATH}.
+    Valid responses for @i{compression_mode} are (in order of best compression
+    first) @code{bzip}, @code{gzip}, @code{compress} and @code{pack}. Note
+    that @code{bzip} at present is only bzip version 1 and that bzip version 2
+    is incompatible.
+
+ at item @code{-normalise}
+    Performs some very simple trace normalisation. This subtracts the
+    background signal (by defining the background signal to be the lowest of
+    the four traces) and rescales the peak heights, averaging the height over
+    a `window' of 1000 trace sample points. This option may be useful
+    for some unscaled ALF files.
+
+ at item @code{-output} @i{file}
+    Specifies the filename for the SCF file to be produced. If this is not
+    specified the SCF file will be sent to standard output.
+ at end table
+
+ at unnumberedsec EXAMPLES
+
+To convert an ABI 373A trace:
+
+ at example
+ at code{makeSCF -8 -abi trace.abi -output trace.scf}
+ at end example
+
+To convert an ALF archive to individual SCF files (Warning! this 
+will most certainly fail if your clone names contain spaces):
+
+ at example
+ at code{alfsplit trace.alf | awk '/^Clone/ @{print $3 "ALF"@}' > trace.files}
+
+ at code{sh -c 'for i in `cat trace.files`;do makeSCF -alf $i -output}
+ at code{    $i.scf;done}
+ at end example
+
+ at unnumberedsec NOTES
+
+If ABI and A.L.F files are edited before input to makeSCF the contents of
+the resulting SCF files are unpredictable.
+To use Pharmacia A.L.F. files the @code{alfsplit} program should first
+be used. Then @code{makeSCF} should be run on each of the split files.
+See the example above.
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Scf, scf(4), formats)
+_fxref(Man-convert_trace, convert_trace(1), convert_trace.1)
+_fxref(Man-eba, eba(1), eba.1)
+
diff --git a/manual/make_weights.1.texi b/manual/make_weights.1.texi
new file mode 100644
index 0000000..4597116
--- /dev/null
+++ b/manual/make_weights.1.texi
@@ -0,0 +1,217 @@
+ at cindex Make_weights: man page
+ at unnumberedsec NAME
+
+make_weights --- makes weight matrices from sequence alignments
+
+ at unnumberedsec SYNOPSIS
+
+ at code{make_weights} [@code{-v}] [@code{-m} @i{mark position}]
+[@code{-c} @i{minimum score}] [@code{-C} @i{maximum score}]@br
+[@code{-w} @i{input weight matrix file name}]
+[@code{-o} @i{output weight matrix file name}]
+[@i{input aligned sequences file}]
+
+
+ at unnumberedsec DESCRIPTION
+
+ at code{make_weights} 
+is used to create weight matrix files from a file
+of aligned sequences. These weight matrices are for use with spin.
+
+The simplest usage is to read in a file of aligned sequence motifs, and write 
+out a weight matrix file created from their observed character frequencies at 
+each position. The only command line input required is the name of the file
+of aligned sequence and the name for the output weight matrix file.
+In this mode, make_weights reads in the file of aligned motifs, counts the
+character frequencies at each position, calculates weights from these, and then
+applies the weights to all the input sequences, recording the score for each.
+By default the two cutoff scores written to the weight matrix file will be set
+to the minimum and maximum scores obtained from this process. In this mode
+nothing will be written to the output screen.
+
+If no output file is supplied, none is written, but the scores for all the
+input sequences are written to the screen. In this way the user can decide
+whether to override the cutoff scores written to the weight matrix file.
+To set these values they can be supplied on the command line using the -c
+and -C options.
+
+To see the range of scores for a set of aligned sequences and an existing
+weight matrix file, the -w option should be used. In this case the matrix
+file is read, applied to the set of aligned sequences, and the scores are
+listed on the screen.
+
+The -m option is used to set the mark position and the -v option simply 
+lists the current version number of the program.
+
+The screen output produced by make_weights can be used as input to 
+make_weights. An example is shown below.
+
+ at example
+Input to make_weights:
+
+HSTGM1A   acagcggaccgtgtgaccat comments
+HSARAF1G  aagtctaacagtatctatct 
+HSU01337  aagtctaacagtatctatct 
+HSA132695 gccgattgccgtatgtaaaa 
+HSCEL     ctctctgcaggtctcgggat 
+
+Output from make_weights:
+
+HSTGM1A   acagcggaccgtgtgaccat 0 0.049247 comments
+HSARAF1G  aagtctaacagtatctatct 1 0.010509 
+HSU01337  aagtctaacagtatctatct 2 0.010509 
+HSA132695 gccgattgccgtatgtaaaa 3 0.133783 
+HSCEL     ctctctgcaggtctcgggat 4 0.206426 
+ at end example
+
+The output has added two extra columns between the sequences and the comments:
+a motif number and its score. This file could be passed through a sorting
+program to shift the lowest scoring motifs to the bottom of the file,
+and then the records with poor scores investigated, and perhaps removed.
+On UNIX the following creates a file as shown above, called don.s, and
+then sorts it on score to create the ordered file don.ss.
+
+ at example
+make_weights don.mw > don.s
+sort -n -r -k 4 -o don.ss don.s
+ at end example
+
+The weights are calculated in the following way.
+
+The algorithm deals with the problem of zero counts by adding a small amount 
+to every element. For alignments with few sequences the effect will be quite 
+marked, but for large datasets it will be very small.
+
+The score for unknown characters found in sequences is set to the mean for 
+the column.
+
+In calculating the log odds it is assumed that probability of each base type 
+in a random sequence is 0.25.
+
+Let the counts for each position (column) and character type in the 
+alignment be stored in counts, and put the weights in matrix. Both
+are two dimensional arrays. Char_set_size is the character set size
+which is 4 for DNA.
+
+ at example
+ for each column sum the counts to get the total
+     set small to 1 if total = 0 otherwise 1/total
+     set column total to total + small*char_set_size
+     for each character type
+        set matrix to counts + small
+        p = matrix/total
+        p = log ( p / 0.25 )
+        matrix = p
+     set unknown char (matrix) to mean for the column
+ end
+ at end example
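
The per-column calculation above can be sketched in Python as follows. This is a reading of the pseudocode rather than the program's actual source: the function name `column_weights` is hypothetical, the division is assumed to use the adjusted column total (so the probabilities sum to one), the logarithm is assumed to be natural, and the score for unknown characters is assumed to be the mean of the column's weights.

```python
import math


def column_weights(counts, background=0.25):
    """Convert the character counts of one alignment column into weights.

    counts holds one count per character type (4 values for DNA).
    Returns (weights, unknown_weight).
    """
    char_set_size = len(counts)
    total = sum(counts)

    # Pseudocount that avoids log(0): 1 for an empty column, else 1/total.
    small = 1.0 if total == 0 else 1.0 / total

    # Assumption: probabilities are taken over the adjusted column total.
    adjusted_total = total + small * char_set_size

    weights = []
    for count in counts:
        p = (count + small) / adjusted_total
        weights.append(math.log(p / background))  # log odds vs. random

    # Unknown characters score the mean weight of the column.
    unknown_weight = sum(weights) / char_set_size
    return weights, unknown_weight


if __name__ == "__main__":
    # Counts for a, c, g, t in a single column of four sequences.
    print(column_weights([2, 1, 0, 1]))
```

With only four sequences the pseudocount of 1/4 shifts the probabilities noticeably, which illustrates the remark above that the effect is marked for small alignments and negligible for large ones.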
+
+Aligned sequences file format
+
+ at i{name sequence comments}
+
+The file containing the aligned sequences should consist entirely of records
+containing data. Each record should contain a name, followed by the sequence,
+followed by arbitrary comments. Each record must be less than 2048 characters.
+At present make_weights is set to handle up to 10,000 records. Within a record
+fields (other than within the comments section) are separated by spaces. It
+is assumed that the sequences are aligned and do not contain leading spaces. 
+For example, the last-but-one record below is not aligned in the file, but will 
+be aligned after parsing.
+
+ at example
+AB002455  ctgacagaaggtgccagggt 1
+AB002456  ccctggctgggtgagtatct 1
+AB002456  tttgctccaggtagacactg 2
+HSE27     atgtttgagggtgagggccc 1
+AB002460  atccccaaaggtgccacagc 1 unusual
+AB002461    cagggcccaggtaagggcgg 1
+AB003312  aatgctcaaggtacagagac 1
+ at end example
+
+A weight matrix file (as shown below) consists of a single record title
+(here test matrix), a record containing the motif length (here 11), the
+"mark position" (here 5), and the minimum and maximum scores (here 
+0.0 and 10.0). The "mark position" is an offset which is added to
+the position of any matches reported by the search routine in spin.
+The next two records are ignored by the programs. The first gives the
+matrix column positions, and the next the total counts in each column.
+The final records (4 for DNA weight matrices) give the counts for each
+character type at each position in the motif. These counts are converted
+into weights that are used during the searches. Any position in a sequence
+which scores at least as high as the minimum score (here 0.0) is reported 
+as a match, and if the results are plotted they are scaled to fit the range
+defined by the minimum and maximum scores (here 0.0 and 10.0).
+
+ at example
+test matrix
+11 5 0.0 10.0
+P     0     1     2     3     4     5     6     7     8     9    10
+n  8067  8067  8069  8067  8069  8069  8069  8069  8069  8069  8068
+a  2572  4755   700    61    73  3759  5667   542  1236  2082  1624
+c  3137  1109   301    33    89   260   671   518  1282  1803  2379
+g  1515  1146  6343  7897   103  3759  1060  6502  1821  2879  2098
+t   845  1059   725    77  7803   288   670   506  3728  1303  1967
+ at end example
+
+The maximum number of columns in a record is 20. Longer motifs will have weight
+matrix files with sufficient blocks of 20 columns. For example the one shown 
+below has 22 positions and so a second block has been started.
+
+ at example
+title
+22 0 0.0 3.0
+P  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
+n  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4
+a  2  2  2  2  2  2  2  2  2  0  0  0  2  2  2  2  2  2  2  2
+c  1  1  1  1  1  1  1  1  1  0  0  0  1  1  1  1  1  1  1  1
+g  0  0  0  0  0  0  0  0  1  4  4  3  0  0  0  0  0  0  0  0
+t  1  1  1  1  1  1  1  1  0  0  0  1  1  1  1  1  1  1  1  1
+P 20 21
+n  4  4
+a  2  2
+c  1  1
+g  0  0
+t  1  1
+ at end example
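
The weight matrix file layout described above, including the blocks of 20 columns, could be read with a sketch like the one below. It is an illustration based only on the two example files shown here: the function name `parse_weight_matrix` and the returned dictionary keys are hypothetical, and it assumes a one-line title and whitespace-separated fields.

```python
def parse_weight_matrix(text):
    """Parse the text of a weight matrix file into a dictionary.

    The "P" (column positions) and "n" (column totals) records are
    skipped, since the manual states that the programs ignore them.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    title = lines[0].strip()

    # Second record: motif length, mark position, minimum and maximum score.
    length, mark, min_score, max_score = lines[1].split()

    counts = {}
    for line in lines[2:]:
        key, *values = line.split()
        if key in ("P", "n"):
            continue
        # Later blocks of up to 20 columns extend the same character row.
        counts.setdefault(key, []).extend(int(v) for v in values)

    return {
        "title": title,
        "length": int(length),
        "mark": int(mark),
        "min_score": float(min_score),
        "max_score": float(max_score),
        "counts": counts,
    }


if __name__ == "__main__":
    sample = """test matrix
3 1 0.0 10.0
P  0  1  2
n  4  4  4
a  2  0  1
c  1  0  1
g  0  4  1
t  1  0  1
"""
    matrix = parse_weight_matrix(sample)
    print(matrix["length"], matrix["counts"]["g"])
```

A stricter reader would also check that every character row ends up with exactly `length` values.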
+
+ at unnumberedsec OPTIONS
+ at table @asis
+ at item @code{-v}
+     Show the version number of the program.
+
+ at item @code{-m}
+     Set the mark position. When matches are found using the weight matrix
+     offset m is added to the reported match position.
+
+ at item @code{-c}
+     Set the minimum score. When the weight matrix is used to search a new
+     sequence all positions which reach this score are reported as a match.
+
+ at item @code{-C}
+     Set the maximum score. When the weight matrix is used to search a new
+     sequence, matches are plotted using this value as the maximum.
+
+ at item @code{-w}
+     Apply an input weight matrix to the set of aligned motifs. Write the 
+     scores for each motif on the screen, but do not create a new weight 
+     matrix file.
+
+ at item @code{-o}
+     The file name for the weight matrix created.
+ at end table
+
+ at unnumberedsec EXAMPLE
+
+ at example
+make_weights
+Usage: make_weights [options] input_file
+Where options are:
+    [-w input weights filename]      [-o output filename]
+    [-c min score]                   [-C max score]
+    [-m mark position]               [-v version]
+ at end example
+
+ at unnumberedsec SEE ALSO
+
+_fxref(SPIN-Weight-Matrix-Search, Motif search, spin)
diff --git a/manual/man/man1/convert_trace.1 b/manual/man/man1/convert_trace.1
new file mode 100644
index 0000000..5e33cc2
--- /dev/null
+++ b/manual/man/man1/convert_trace.1
@@ -0,0 +1,160 @@
+.TH "convert_trace" 1 "" "" "Staden Package"
+.SH "NAME"
+.PP
+convert_trace \- Converts trace file formats
+
+.SH "SYNOPSIS"
+.PP
+
+\fBconvert_trace\fP
+[\fB-in_format\fP \fIformat\fP]
+[\fB-out_format\fP \fIformat\fP]
+[\fB-fofn\fP \fIfile_of_filenames\fP]
+[\fB-passed\fP \fIfofn\fP]
+[\fB-failed\fP \fIfofn\fP]
+[\fB-name\fP \fIid\fP]
+[\fB-subtract_background\fP]
+[\fB-normalise\fP]
+[\fB-scale\fP \fIrange\fP]
+[\fB-compress\fP \fImode\fP]
+[\fB-abi_data\fP \fIcounts\fP]
+[\fIinformat\fP \fIoutformat\fP]
+
+.SH "DESCRIPTION"
+.PP
+
+\fBconvert_trace\fP converts between the various DNA sequence chromatogram
+formats, optionally performing trace processing actions too. It can read ABI
+(raw or processed), ALF, CTF, SCF and ZTR formats. It can write CTF, EXP, PLN, 
+SCF and ZTR formats. (Note that EXP (Experiment File) and PLN formats are
+text sequences rather than a binary trace.)
+
+There are two main modes of operation; either with a file of filenames
+specified using the \fB-fofn\fP \fIfilename\fP option, or acting as a filter
+to process one single file. In this case the input and output file format may
+be specified as the last two options on the command line.
+
+.SH "OPTIONS"
+.PP
+.TP
+\fB-abi_data\fP \fIcounts\fP
+Only of use when processing ABI files. This indicates which ABI
+\fBDATA\fP channel numbers to use. For sequencing files this defaults to
+"9,10,11,12" which corresponds to the processed data. To read the raw data 
+use "1,2,3,4".
+
+.TP
+\fB-compress\fP \fImode\fP
+Specifies the name of a program to use to compress the trace data prior to 
+writing. Due to limitations in the current implementation this option does 
+not work when \fBconvert_trace\fP is operating as a filter (and so
+requires use of the \fB-fofn\fP option). Valid values for \fImode\fP are
+compress, bzip, bzip2, gzip, pack and szip. Note that for ZTR, ZTR2 and
+ZTR3 format files specifying compression modes will not reduce the file
+size as this format already contains internal compression algorithms. The
+ZTR1 format does not internally compress and so \fB-compress\fP will have 
+an effect.
+
+.TP
+\fB-failed\fP \fIfofn\fP
+Produces a file listing the filenames which have failed to be
+converted. This only makes sense when also using \fB-fofn\fP.
+
+.TP
+\fB-fofn\fP \fIfile_of_filenames\fP
+Processes several files instead of one, with the filenames to read from and
+written to being listed in \fIfile_of_filenames\fP with one pair (input and
+output filenames) being listed per line, separated by spaces. If the
+filenames contain spaces then these may be "escaped" using
+backslashes. Similarly backslashes should be escaped using a double
+backslash. For example to convert "file a.scf" and "fileb.scf" to "file
+a.ztr" and "fileb.ztr" respectively we would use a \fIfile_of_filenames\fP
+containing:
+
+.nf
+.in +0.5i
+file\\ a.scf    file\\ a.ztr
+fileb.scf      fileb.ztr
+.in -0.5i
+.fi
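
The escaping rules for the file of filenames can be made concrete with a small Python sketch. This is not the parser used by convert_trace; the function name `split_fofn_line` is hypothetical and the behaviour for a trailing lone backslash (it is dropped) is an assumption.

```python
def split_fofn_line(line):
    """Split one line of a file of filenames into its fields.

    Fields are separated by unescaped spaces or tabs. A backslash makes
    the next character literal, so "\\ " is a space inside a filename
    and "\\\\" is a single backslash.
    """
    fields = []
    current = []
    escaped = False
    for ch in line.rstrip("\n"):
        if escaped:
            current.append(ch)  # take the escaped character literally
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch in " \t":
            if current:  # runs of separators produce no empty fields
                fields.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        fields.append("".join(current))
    return fields


if __name__ == "__main__":
    # The raw line contains a single backslash before each escaped space.
    print(split_fofn_line("file\\ a.scf    file\\ a.ztr"))
```

Each line of the file of filenames should yield exactly two fields, the input filename and the output filename.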
+
+.TP
+\fB-in_format\fP \fIformat\fP
+Specifies the format for the input data. Typically the input format is
+automatically determined so this may not be required. \fIformat\fP should be 
+one of ABI, ALF, CTF, EXP, PLN, SCF, ZTR, ZTR1, ZTR2 or ZTR3. The ZTR
+formats all conform to the ZTR specification, but this indicates the
+compression level to be used.
+
+.TP
+\fB-name\fP \fIid\fP
+When producing an Experiment File this specifies the value of the
+\fBID\fP line. Without this option the default Experiment File ID line is the 
+output filename, or if this is stdout it is the input filename.
+
+.TP
+\fB-normalise\fP
+Attempts to normalise the trace amplitudes to produce more even height
+peaks. This may be useful to compensate for large spikes at either the
+start or end of the trace.
+
+.TP
+\fB-out_format\fP \fIformat\fP
+Specifies the output format for all files, whether read from a file of
+filenames or via a filter.  \fIformat\fP should be 
+one of ABI, ALF, CTF, EXP, PLN, SCF, ZTR, ZTR1, ZTR2 or ZTR3. The ZTR
+formats all conform to the ZTR specification, but this indicates the
+compression level to be used.
+
+.TP
+\fB-passed\fP \fIfofn\fP
+Produces a file listing the filenames which have been successfully
+converted. This only makes sense when also using \fB-fofn\fP.
+
+.TP
+\fB-scale\fP \fIrange\fP
+Scales all trace amplitudes so that they fit within the range of 0 to 
+\fIrange\fP inclusive. Any integer value of \fIrange\fP may be used between 1
+and 65535, but this option is designed for down-scaling traces in order to 
+reduce file size.
+
+.TP
+\fB-subtract_background\fP
+Attempts to remove background trace levels by analysing each trace channel 
+independently to determine the baseline. This option is mainly used when
+processing raw data.
+.TE
+.SH "EXAMPLES"
+.PP
+
+To convert several files to ZTR format using the same example file of
+filenames listed in the \fB-fofn\fP option above:
+
+.nf
+.in +0.5i
+convert_trace -out_format ZTR -fofn filename
+.in -0.5i
+.fi
+
+To subtract the background from a raw ABI file and save this as an SCF file:
+
+.nf
+.in +0.5i
+convert_trace -abi_data 1,2,3,4 -subtract_background ABI SCF < a.abi > a.scf
+.in -0.5i
+.fi
+
+.SH "NOTES"
+.PP
+
+If ABI files are manually edited before input to convert_trace then the
+internal formats of these files may differ from the format expected by
+convert_trace.
+
+.SH "SEE ALSO"
+.PP
+
+\fBscf\fR(4)
+\fBztr\fR(4)
+\fBmakeSCF\fR(1)
+
diff --git a/manual/man/man1/copy_db.1 b/manual/man/man1/copy_db.1
new file mode 100644
index 0000000..a2fdbe5
--- /dev/null
+++ b/manual/man/man1/copy_db.1
@@ -0,0 +1,79 @@
+.TH "copy_db" 1 "" "" "Staden Package"
+.SH "NAME"
+.PP
+copy_db \- a garbage collecting gap4 database copier and merger
+
+.SH "SYNOPSIS"
+.PP
+
+\fBcopy_db\fP [\fB-v\fP] [\fB-f\fP] [\fB-b\fP \fI32/64\fP] [\fB-T\fP]
+\fIfrom.vers\fP ... \fIto.vers\fP
+
+.SH "DESCRIPTION"
+.PP
+
+\fBCopy_db\fP copies one or more gap4 databases to a new name by
+physically extracting the information from the first databases and
+writing it to the last database listed on the command line. This
+operation can be considered analogous to copying files into a directory.
+This is slower than a direct \fBcp\fP command, but has the advantage
+of merging several databases together and the resulting database will
+have been garbage collected. That is, any fragmentation in the original
+databases is removed (as much as is possible).
+
+NOTE: Care should be taken when merging databases. \fBNo checks\fP are
+performed to make sure that the databases do not already contain the
+same readings. Thus attempting to copy the same database several times will
+cause problems later on. No merging of vector, clone or template
+information is performed either.
+
+.SH "OPTIONS"
+.PP
+
+.TP
+\fB-v\fP
+Enable verbose output. This gives a running summary of the current piece
+of information being copied.
+
+.TP
+\fB-f\fP
+Attempts to spot and fix various database corruptions. A
+corrupted gap4 database may not be corruption free after this,
+but there's more chance of being able to recover data.
+
+.TP
+\fB-T\fP
+Removes annotation tags while copying. (Of limited use.)
+
+.TP
+\fB-b\fP \fIbitsize\fP
+Generates the new database using a given bitsize, where
+\fIbitsize\fP is either \fB32\fP or \fB64\fP.
+.TE
+.SH "EXAMPLES"
+.PP
+
+To merge database X with database Y to give a new database Z use:
+
+.nf
+.in +0.5i
+copy_db X.0 Y.0 Z.0
+.in -0.5i
+.fi
+
+.SH "NOTES"
+.PP
+
+To copy a database quickly without garbage collecting, the UNIX \fBcp\fP
+command can be used as follows. This copies version F of database DB to
+version T of database XYZZY.
+
+.nf
+.in +0.5i
+cp DB.F XYZZY.T; cp DB.F.aux XYZZY.T.aux
+.in -0.5i
+.fi
+
+Care must be taken to check for the busy file (DB.F.BUSY) before making
+the copy. If the database is written to during the operation of the copy
+command then the new database may be corrupted.
diff --git a/manual/man/man1/eba.1 b/manual/man/man1/eba.1
new file mode 100644
index 0000000..3e65be2
--- /dev/null
+++ b/manual/man/man1/eba.1
@@ -0,0 +1,64 @@
+.TH "eba" 1 "" "" "Staden Package"
+.SH "NAME"
+.PP
+eba \- Estimates Base Accuracy in an SCF or ZTR file
+
+.SH "SYNOPSIS"
+.PP
+
+\fBeba\fP [\fItrace_file\fP]
+
+.SH "DESCRIPTION"
+.PP
+
+\fBEba\fP will calculate numerical estimates of base accuracy for each
+base in an SCF or ZTR file. The figures calculated should not be considered as
+reliable and better values can be obtained from phred or ATQA.
+
+The method employed by eba to estimate the base accuracies performs the
+following calculation for each base. Calculate the area under the peaks
+for each base type. Divide the area under the called base by the largest
+area under the other three bases. From the 2002 release these values are
+normalised to the phred scale (this was achieved by comparing the
+original eba values and phred values for 4.6 million base calls of
+Sanger Centre data).
+
+With no filename as an argument eba reads from standard input and writes
+to standard output. This enables eba to be used as a filter, or to
+estimate base accuracies for unwritable files. If a file is specified on
+the command line then the accuracy figures will be written to this file.
+
+.SH "EXAMPLES"
+.PP
+
+To write base accuracy figures to an SCF file named \fBe04f10.s1SCF\fP.
+
+.nf
+.in +0.5i
+\fBeba e04f10.s1SCF\fP
+.in -0.5i
+.fi
+
+To write base accuracy figures on the original eba scale to an SCF file 
+named \fBe04f10.s1SCF\fP.
+
+.nf
+.in +0.5i
+\fBeba -old_scale e04f10.s1SCF\fP
+.in -0.5i
+.fi
+
+To write base accuracy figures to a ZTR file named \fBe04f10.s1.ztr\fP
+in another user's directory, and to store the updated file in the
+current directory:
+
+.nf
+.in +0.5i
+\fBeba < ~user/e04f10.s1.ztr > e04f10.s1.ztr\fP
+.in -0.5i
+.fi
+
+.SH "SEE ALSO"
+.PP
+
+\fBscf\fR(4)
diff --git a/manual/man/man1/extract_seq.1 b/manual/man/man1/extract_seq.1
new file mode 100644
index 0000000..e9d8daf
--- /dev/null
+++ b/manual/man/man1/extract_seq.1
@@ -0,0 +1,70 @@
+.TH "extract_seq" 1 "" "" "Staden Package"
+.SH "NAME"
+.PP
+extract_seq \- extracts sequence from a trace or experiment file.
+
+.SH "SYNOPSIS"
+.PP
+
+\fBextract_seq\fP [\fB-r\fP]
+[\fB-\fP(\fBabi\fP|\fBalf\fP|\fBscf\fP|\fBztr\fP|\fBexp\fP|\fBpln\fP)]
+[\fB-good_only\fP] [\fB-clip_cosmid\fP] [\fB-fasta_out\fP]
+[\fB-output\fP \fIoutput_name\fP] [\fIinput_name\fP] \fB...\fP
+
+.SH "DESCRIPTION"
+.PP
+
+\fBextract_seq\fP extracts the sequence information from binary trace
+files, Experiment files, or from the old Staden format plain files. The input
+can be read either from files or from standard input, and the output can be
+written to either a file or standard output. Multiple input files can be
+specified. The output contains the sequences split onto lines of at most 60
+characters each.
+
+.SH "OPTIONS"
+.PP
+
+.TP
+\fB-r\fP
+Directs reading of experiment file to attempt extraction of sequence from
+the referenced (\fBLN\fP and \fBLT\fP line types) trace file. Without
+this option, or when the trace file cannot be found, the sequence
+output is that listed in the Experiment File. This option has no effect
+for other input format types.
+
+.TP
+\fB-abi\fP, \fB-alf\fP, \fB-scf\fP, \fB-ztr\fP, \fB-exp\fP, \fB-pln\fP
+Specify an input file format. This is not usually required as
+\fBextract_seq\fP will automatically determine the correct input file
+type. This option is supplied in case the automatic determination is
+incorrect (which is possible, but has never been observed).
+
+.TP
+\fB-good_only\fP
+When reading an experiment file or SCF file containing clip marks, output
+only the \fIgood\fP sequence which is contained within the boundaries marked
+by the \fBQL\fP, \fBQR\fP, \fBSL\fP, \fBSR\fP, \fBCL\fP, \fBCR\fP
+and \fBCS\fP line types.
+
+.TP
+\fB-clip_cosmid\fP
+When the \fB-good_only\fP argument is specified this controls whether the
+cosmid sequence should be considered good data. Without this argument
+cosmid sequence is considered good.
+
+.TP
+\fB-fasta_out\fP
+Specifies that the output should be in FASTA format.
+
+.TP
+\fB-output\fP \fIfile\fP
+The sequence will be written to \fIfile\fP instead of standard
+output.
+.TE
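+.SH "EXAMPLES"
+.PP
+
+As an illustrative sketch (the filenames are hypothetical), the options
+described above can be combined to write only the good sequence of a
+trace file, in FASTA format, to a named file:
+
+.nf
+.in +0.5i
+\fBextract_seq -good_only -fasta_out -output e04f10.fasta e04f10.s1SCF\fP
+.in -0.5i
+.fi
+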
+.SH "SEE ALSO"
+.PP
+
+\fBExperimentFile\fR(4)
+\fBscf\fR(4)
+\fBextract_fastq\fR(1)
+\fBRead\fP(4)
diff --git a/manual/man/man1/find_renz.1 b/manual/man/man1/find_renz.1
new file mode 100644
index 0000000..eb70240
--- /dev/null
+++ b/manual/man/man1/find_renz.1
@@ -0,0 +1,41 @@
+.TH "find_renz" 1 "" "" "Staden Package"
+.SH "NAME"
+.PP
+find_renz \- Identifies the position of a cut site within a sequence
+
+.SH "SYNOPSIS"
+.PP
+
+\fBfind_renz\fP [\fB-vp\fP] \fIenzyme\fP \fIfilename\fP ...
+
+.SH "DESCRIPTION"
+.PP
+
+\fBfind_renz\fP may be used to determine the position that an enzyme cuts a
+sequence. Its use as a command line utility is primarily designed for
+internal use within \fBpregap4\fP and as a user utility for producing
+\fIvector-primer\fP files for use with \fBvector_clip\fP. As such it is
+dedicated to finding one and only one such cut site and considers no cut
+sites or multiple cut sites to be an error.
+
+Only one enzyme may be specified, which is given by the enzyme name (upper or
+lower case is not important). One or more filenames may be specified. If an
+enzyme does not cut a sequence the message "Enzyme not found in sequence" will 
+be sent to stderr. If an enzyme cuts a sequence more than once the message
+"Found more than one match" will be sent to stderr. Otherwise output is
+produced to stdout. This means that wildcards may be used (\fBfind_renz -vp
+smai *.seq >> vpfile\fP) with the output redirected without needing to consider
+whether the enzyme is suitable for all files matching the wildcard pattern.
+
+.SH "OPTIONS"
+.PP
+.TP
+\fB-vp\fP
+Specifies that the output should be in a format suitable for saving to a
+vector-primer file (to use with vector_clip). Without this only the cut
+site position is listed.
+.TE
+.SH "SEE ALSO"
+.PP
+
+\fBvector_clip\fR(1)
diff --git a/manual/man/man1/get_comment.1 b/manual/man/man1/get_comment.1
new file mode 100644
index 0000000..cc3b6d9
--- /dev/null
+++ b/manual/man/man1/get_comment.1
@@ -0,0 +1,39 @@
+.TH "get_comment" 1 "" "" "Staden Package"
+.SH "NAME"
+.PP
+get_comment \- extract comments from trace files
+
+.SH "SYNOPSIS"
+.PP
+
+\fBget_comment\fP [ \fB-c\fP ] [ \fBField-ID\fP ... ]
+
+.SH "DESCRIPTION"
+.PP
+
+The \fBget_comment\fP command extracts text fields from a variety of trace
+formats, read in from stdin. Each comment is of the form
+\fIField-ID\fP=\fIcomment\fP, regardless of the file format. \fIField-ID\fP is
+typically a 4 character identifier.
+
+With no \fIField-ID\fP arguments specified all comments are listed. Otherwise
+only those specified on the command line are listed.
+
+.SH "OPTIONS"
+.PP
+
+.TP
+\fB-h\fP
+Display the usage help.
+
+.TP
+\fB-c\fP
+Suppresses the output of the \fIField-ID\fP. Only the right hand side of the
+comment is displayed. The default action is to display the full comment in
+the form listed above.
+
+.TE
+.SH "SEE ALSO"
+.PP
+
+\fBget_scf_field\fR(1)
diff --git a/manual/man/man1/get_scf_field.1 b/manual/man/man1/get_scf_field.1
new file mode 100644
index 0000000..99a3e9d
--- /dev/null
+++ b/manual/man/man1/get_scf_field.1
@@ -0,0 +1,44 @@
+.TH "get_scf_field" 1 "" "" "Staden Package"
+.SH "NAME"
+.PP
+get_scf_field \- extract comments from an SCF file
+
+.SH "SYNOPSIS"
+.PP
+
+\fBget_scf_field\fP [ \fB-cqs\fP ] \fIfilename\fP [ \fBField-ID\fP ... ]
+
+.SH "DESCRIPTION"
+.PP
+
+The \fBget_scf_field\fP command extracts comments from an SCF file. Each
+comment is of the form \fIField-ID\fP=\fIcomment\fP, where \fIField-ID\fP is
+a 4 character identifier.
+
+With no \fIField-ID\fP arguments specified all comments are listed. Otherwise
+only those specified on the command line are listed.
+
+.SH "OPTIONS"
+.PP
+
+.TP
+\fB-c\fP
+Suppresses the output of the \fIField-ID\fP. Only the right hand side of the
+comment is displayed. The default action is to display the full comment in
+the form listed above.
+
+.TP
+\fB-q\fP
+Query mode. Here no output is displayed, but it simply returns true
+or false depending on whether any of the requested comments were found.
+
+.TP
+\fB-s\fP
+Silent mode. No error messages are produced, except for usage messages. It
+returns true or false for success or failure.
+.TE
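+.SH "EXAMPLES"
+.PP
+
+A minimal sketch of query mode in a shell script, assuming that "true"
+corresponds to a zero exit status as is conventional for shell commands.
+The filename and the Field-ID \fBNAME\fP are hypothetical:
+
+.nf
+.in +0.5i
+\fBif get_scf_field -q trace.scf NAME; then echo "NAME found"; fi\fP
+.in -0.5i
+.fi
+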
+.SH "SEE ALSO"
+.PP
+
+\fBget_comment\fR(1)
+\fBscf\fR(4)
diff --git a/manual/man/man1/init_exp.1 b/manual/man/man1/init_exp.1
new file mode 100644
index 0000000..0e3ed3c
--- /dev/null
+++ b/manual/man/man1/init_exp.1
@@ -0,0 +1,63 @@
+.TH "init_exp" 1 "" "" "Staden Package"
+.SH "NAME"
+.PP
+init_exp \- create and initialise an Experiment File
+
+.SH "SYNOPSIS"
+.PP
+
+\fBinit_exp\fP [\fB-\fP(\fBabi\fP|\fBalf\fP|\fBscf\fP|\fBpln\fP)]
+[\fB-output\fP \fIfile\fP] [\fB-name\fP \fIentry_name\fP] [\fB-conf\fP] \fIfile\fP
+
+.SH "DESCRIPTION"
+.PP
+
+\fBinit_exp\fP initiates an Experiment File for a binary trace file or
+a plain sequence file. The Experiment File created
+contains the \fBID\fP, \fBEN\fP, \fBLN\fP, \fBLT\fP and \fBSQ\fP
+lines.
+
+The experiment file is, by default, sent to standard output, unless an
+output file is specified using the \fB-output\fP option. The default
+entry name for the Experiment File is derived from the filenames used.
+If an output file has been specified, then this is taken as the
+\fBEN\fP field. Otherwise the input file name is used. The user can
+override the default by using the \fB-name\fP option.
+
+.SH "OPTIONS"
+.PP
+
+.TP
+\fB-abi\fP, \fB-alf\fP, \fB-scf\fP, \fB-pln\fP
+Specify an input file format. This is not usually required as
+\fBinit_exp\fP will automatically determine the correct input file
+type. This option is supplied in case the automatic determination is
+incorrect (which is possible, but has never been observed).
+
+.TP
+\fB-output\fP \fIfile\fP
+The experiment file will be written to \fIfile\fP instead of standard
+output. Additionally the value of the \fBEN\fP and \fBID\fP
+fields, assuming \fB-name\fP has not been specified, will be \fIfile\fP.
+
+.TP
+\fB-name\fP \fIname\fP
+Sets the \fBID\fP and \fBEN\fP fields to \fIname\fP, regardless of
+the output filename used.
+
+.TP
+\fB-conf\fP
+Fills out the \fBAV\fP field with the quality values found in the SCF
+file.
+
+.TE
+.SH "NOTES"
+.PP
+
+This program was formerly known as \fBexpGetSeq\fP.
+
+.SH "SEE ALSO"
+.PP
+
+\fBExperimentFile\fR(4)
+\fBRead\fP(4)
diff --git a/manual/man/man1/makeSCF.1 b/manual/man/man1/makeSCF.1
new file mode 100644
index 0000000..0211a6c
--- /dev/null
+++ b/manual/man/man1/makeSCF.1
@@ -0,0 +1,110 @@
+.TH "makeSCF" 1 "" "" "Staden Package"
+.SH "NAME"
+.PP
+makeSCF \- Converts trace files to SCF files.
+
+.SH "SYNOPSIS"
+.PP
+
+\fBmakeSCF\fP [\fB-8\fP] [\fB-2\fP] [\fB-3\fP]
+-(\fBabi\fP|\fBalf\fP|\fBscf\fP|\fBpln\fP) \fIinput_name\fP
+[\fB-compress\fP \fIcompression_mode\fP] [\fB-normalise\fP]
+[\fB-output\fP \fIoutput_name\fP]
+
+.SH "DESCRIPTION"
+.PP
+
+\fBMakeSCF\fP converts trace files to the SCF format. It can input ABI 373A,
+Pharmacia A.L.F., or previously created SCF files (although converting from
+SCF to SCF serves no useful purpose!). 
+
+.SH "OPTIONS"
+.PP
+
+.TP
+\fB-8\fP
+Force conversion to 8 bit sample data. This shrinks the size of SCF
+files using 16 bit sample values, but at a loss of resolution. For trace
+display purposes this accuracy loss is acceptable.
+
+.TP
+\fB-2\fP
+Force the output to be written in SCF version 2. By default the
+latest version (3) is used.
+
+.TP
+\fB-3\fP
+Force the output to be written in SCF version 3. This is the default.
+
+.TP
+\fB-s\fP
+Silent mode. This prevents the output of the copyright message.
+
+.TP
+\fB-abi\fP, \fB-alf\fP, \fB-scf\fP, \fB-any\fP
+Specify an input file format. A file format of "any" will force
+\fBmakeSCF\fP to automatically determine the correct input file type.
+
+.TP
+\fB-compress\fP \fIcompression_mode\fP
+Requests the generated SCF file to be passed through a separate compression
+program before writing to disk. \fBmakeSCF\fP does not contain any
+compression algorithms itself. It requires the appropriately named tool to
+be on the system and in the user's PATH.
+Valid responses for \fIcompression_mode\fP are (in order of best compression
+first) \fBbzip\fP, \fBgzip\fP, \fBcompress\fP and \fBpack\fP. Note
+that \fBbzip\fP at present is only bzip version 1 and that bzip version 2
+is incompatible.
+
+.TP
+\fB-normalise\fP
+Performs some very simple trace normalisation. This subtracts the
+background signal (by defining the background signal to be the lowest of
+the four traces) and rescales the peak heights, averaging the height over
+a `window' of 1000 trace sample points. This option may be useful
+for some unscaled ALF files.
+
+.TP
+\fB-output\fP \fIfile\fP
+Specifies the filename for the SCF file to be produced. If this is not
+specified the SCF file will be sent to standard output.
+.TE
+.SH "EXAMPLES"
+.PP
+
+To convert an ABI 373A trace:
+
+.nf
+.in +0.5i
+\fBmakeSCF -8 -abi trace.abi -output trace.scf\fP
+.in -0.5i
+.fi
+
+To convert an ALF archive to individual SCF files (Warning! this 
+will most certainly fail if your clone names contain spaces):
+
+.nf
+.in +0.5i
+\fBalfsplit trace.alf | awk '/^Clone/ {print $3 "ALF"}' > trace.files\fP
+
+\fBsh -c 'for i in `cat trace.files`; do\fP
+\fB    makeSCF -alf $i -output $i.scf; done'\fP
+.in -0.5i
+.fi
+
+.SH "NOTES"
+.PP
+
+If ABI and A.L.F. files are edited before input to makeSCF the contents of
+the resulting SCF files are unpredictable.
+To use Pharmacia A.L.F. files the \fBalfsplit\fP program should first
+be used. Then \fBmakeSCF\fP should be run on each of the split files.
+See the example above.
+
+.SH "SEE ALSO"
+.PP
+
+\fBscf\fR(4)
+\fBconvert_trace\fR(1)
+\fBeba\fR(1)
+
diff --git a/manual/man/man4/ExperimentFile.4 b/manual/man/man4/ExperimentFile.4
new file mode 100644
index 0000000..ef89bc3
--- /dev/null
+++ b/manual/man/man4/ExperimentFile.4
@@ -0,0 +1,779 @@
+.TH "ExperimentFile" 4 "" "" "Staden Package"
+.SH "NAME"
+.PP
+ExperimentFile \- Experiment File Format
+
+.SH "Experiment File"
+.PP
+
+Experiment files contain gel readings plus information about them, and are
+used during the processing of the sequence. They are used to carry data
+between programs: they provide input to the programs and programs may in
+turn add to or modify them. When the experiment file for a reading reaches
+the assembly program it should be carrying all the data needed for its
+subsequent processing. The assembly program will copy what it needs into
+the assembly database. The file format is based on that of EMBL sequence
+entries and, if required, can be read as such by programs like spin.
+
+
+.SS "Records"
+.PP
+
+It is important to note that the assembly program gap4
+(see "Gap4 introduction" in the gap4 manual)
+will not operate to
+its full effect if it is not given all the necessary data. For example
+gap4 contains many functions that can analyse the positions and relative
+orientations of readings from the same template in order to check the
+correctness of the assembly and determine the contig order. However if
+the records that name templates and their estimated lengths, and define
+the primers used to obtain readings from them are missing, none of these
+valuable analyses can be performed reliably. One way to ensure that all
+the necessary fields are present is to use the program pregap4
+(see "Pregap4 introduction" in the pregap4 manual).
+
+In the descriptions below records containing * are those read into the
+database during normal assembly; those with ** are extra items required when
+entering pre-assembled data; those with *** are read from SCF files
+(after the experiment file has been read to obtain the SCF file name;
+see \fBscf\fR(4));
+the record marked **** is an extra item required for Directed Assembly.
+
+The order of records in the file is not important. They are listed
+here in alphabetical order with, where possible, reasons for the 
+origin of their names. Several are redundant and no group is likely
+to make use of them all. Obviously others can be added in the future.
+Initially they might be of local use but if their use becomes wider they
+can be added to the standard set. Standard EMBL records such as FT are
+assumed to be included.
+
+.nf
+.BR AC "  ACcession number"
+.BR AP "  Assembly Position ****"
+.BR AQ "  Average Quality for bases 100..200"
+.BR AV "  Accuracy Values for externally assembled data **, ***"
+.BR BC "  Base Calling software"
+.BR CC "  Comment line"
+.BR CF "  Cloning vector sequence File"
+.BR CH "  Special CHemistry"
+.BR CL "  Cloning vector Left end"
+.BR CN "  Clone Name"
+.BR CR "  Cloning vector Right end"
+.BR CS "  Cloning vector Sequence present in sequence *"
+.BR CV "  Cloning Vector type"
+.BR DR "  Direction of Read"
+.BR DT "  DaTe of experiment"
+.BR EN "  Entry Name"
+.BR EX "  EXperimental notes"
+.BR FM "  sequencing vector Fragmentation Method"
+.BR ID "  IDentifier *"
+.BR LE "  was Library Entry, but now identifies a well in a micro titre dish"
+.BR LI "  was subclone LIbrary but now identifies a micro titre dish"
+.BR LN "  Local format trace file Name *"
+.BR LT "  Local format trace file Type *"
+.BR MC "  MaChine on which experiment ran"
+.BR MN "  Machine generated trace file Name"
+.BR MT "  Machine generated trace file Type"
+.BR ON "  Original base Numbers (positions) **"
+.BR OP "  OPerator"
+.BR PC "  Position in Contig **"
+.BR PD "  Primer data (the sequence of a primer)"
+.BR PN "  Primer Name"
+.BR PR "  PRimer type *"
+.BR PS "  Processing Status"
+.BR QL "  poor Quality sequence present at Left (5') end *"
+.BR QR "  poor Quality sequence present at Right (3') end *"
+.BR RS "  Reference Sequence for numbering and mutation detection"
+.BR SC "  Sequencing vector Cloning site"
+.BR SE "  SEnse (ie whether complemented) **"
+.BR SF "  Sequencing vector sequence File"
+.BR SI "  Sequencing vector Insertion length *"
+.BR SL "  Sequencing vector sequence present at Left (5') end *"
+.BR SP "  Sequencing vector Primer site (relative to cloning site)"
+.BR SQ "  SeQuence *"
+.BR SR "  Sequencing vector sequence present at Right (3') end *"
+.BR SS "  Screening Sequence"
+.BR ST "  STrands *"
+.BR SV "  Sequencing Vector type *"
+.BR TG "  Gel reading Tag *"
+.BR TC "  Contig Tag *"
+.BR TN "  Template Name *"
+.BR WT "  Wild type trace"
+.fi
+
+.SS "Explanation of Records"
+.PP
+
+
+.PD 0
+.IP Record 13
+AC, ACcession line
+.IP Format 13
+AC   string
+.IP Explanation 13
+A unique identifier for the reading.
+.sp
+.PD
+.PD 0
+.IP Record 13
+AP, Assembly Position
+.IP Format 13
+AP   Name_of_anchor_reading sense offset tolerance
+.IP Explanation 13
+For readings whose position has been mapped by an external program, these
+records tell the "directed assembly" algorithm where to assemble the data.
+Positions are defined as offsets from an "anchor reading" which is the name of
+any reading already in the database, an orientation (sense, + or -), and a
+tolerance. Readings are aligned at relative position offset + or - tolerance.
+.sp
+.PD
+.PD 0
+.IP Record 13
+AQ, Average Quality of the reading.
+.IP Format 13
+AQ   Numeric value in range 1 - 99.
+.IP Explanation 13
+The average value of the "numerical estimate of base calling accuracy" as
+calculated by program eba. The value is useful for monitoring data quality and
+could also be used for deciding on an order of assembly - for example assemble
+the highest quality readings first.
+.sp
+.PD
+.PD 0
+.IP Record 13
+AV, Accuracy Values
+.IP Format 13
+AV   q1 q2 q3 ... or a1,c1,g1,t1 a2,c2,g2,t2 ...
+.IP Explanation 13
+The accuracy values lie in the range 1-99. Either 1 per base (eg 89 50 ...)
+or 4 per base (eg 0,89,5,2 50,3,7,10). See Bonfield, J.K. and Staden, R.
+The application of numerical estimates of base calling accuracy to DNA
+sequencing projects. Nucleic Acids Res. 23, 1406-1410 (1995).
+.sp
+.PD
+.PD 0
+.IP Record 13
+BC, Base Calling software
+.sp
+.PD
+.PD 0
+.IP Record 13
+CC, Comment line
+.IP Format 13
+CC   string
+.IP Explanation 13
+Any comments can be added on any number of lines.
+.sp
+.PD
+.PD 0
+.IP Record 13
+CF, Cloning vector sequence File
+.IP Format 13
+CF   string
+.IP Explanation 13
+The name of the file containing the sequence of the cloning vector, to be used
+by vector_clip (see \fBvector_clip\fR(1)).
+
+.sp
+.PD
+.PD 0
+.IP Record 13
+CH, Special CHemistry
+.IP Format 13
+CH   number
+.IP Explanation 13
+Used to flag readings as having been sequenced using a "special chemistry". The
+number is a bit pattern with a bit for each chemistry type, thus allowing
+combinations of chemistries to be listed. Currently bit 0 is used to
+distinguish between dye-primer (0) and dye-terminator (1) chemistries. Bits 1
+to 4 inclusive indicate the type of chemistry: unknown (0, 0000), ABI
+Rhodamine (1, 0001), ABI dRhodamine (2, 0010), BigDye (3, 0011), Energy
+Transfer (4, 0100) and LiCor (5, 0101). So for example a BigDye Terminator has 
+bits 00111 set which is 7 in decimal.
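+
+As a rough sketch (plain shell arithmetic, not a Staden tool; the variable
+name ch is hypothetical), a CH value can be split into its two fields:
+
+.nf
+.in +0.5i
+ch=7    # BigDye Terminator, from the example above
+echo "terminator=$((ch & 1)) chemistry=$(((ch >> 1) & 15))"
+.in -0.5i
+.fi
+
+For ch=7 this prints terminator=1 chemistry=3, that is a dye-terminator
+read using BigDye chemistry.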
+.sp
+.PD
+.PD 0
+.IP Record 13
+CL, Cloning vector Left end
+.IP Format 13
+CL   number
+.IP Explanation 13
+The base position in the sequence that contains the last base in the cloning
+vector. Currently gap4 only uses the CS line.
+.sp
+.PD
+.PD 0
+.IP Record 13
+CN, Clone Name
+.IP Format 13
+CN   string
+.IP Explanation 13
+The name of the segment of DNA that the reading has been
+derived from. Typically the name of a physical map clone. 
+.sp
+.PD
+.PD 0
+.IP Record 13
+CR, Cloning vector Right end
+.IP Format 13
+CR   number
+.IP Explanation 13
+The base position in the sequence that contains the first base in the cloning
+vector. Currently gap4 only uses the CS line.
+.sp
+.PD
+.PD 0
+.IP Record 13
+CS, Cloning vector Sequence present in sequence
+.IP Format 13
+CS   range
+.IP Explanation 13
+Regions of sequence found by vector_clip
+(see \fBvector_clip\fR(1))
+to be cloning vector. Used in assembly to
+exclude unwanted sequence.
+.sp
+.PD
+.PD 0
+.IP Record 13
+CV, Cloning Vector type
+.IP Format 13
+CV   string
+.IP Explanation 13
+The type of the cloning vector used.
+.sp
+.PD
+.PD 0
+.IP Record 13
+DR, Direction of Read
+.IP Format 13
+DR   direction
+.IP Explanation 13
+Whether forward or reverse primers were used. Allows
+mapping of forward and reverse reads off the same template. NOTE however
+that we do not encourage the use of this method as the terms
+direction, sense and strand can be confusing. Instead we encourage the
+use of the PRimer line.
+.sp
+.PD
+.PD 0
+.IP Record 13
+DT, DaTe of experiment
+.IP Format 13
+DT   dd-mon-yyyy
+.IP Explanation 13
+Any date information.
+.sp
+.PD
+.PD 0
+.IP Record 13
+EN, Entry Name
+.IP Format 13
+EN   string
+.IP Explanation 13
+The name given to the reading
+.sp
+.PD
+.PD 0
+.IP Record 13
+EX, EXperimental notes
+.IP Format 13
+EX   string
+.IP Explanation 13
+Another type of comment line for additional information.
+.sp
+.PD
+.PD 0
+.IP Record 13
+FM, sequencing vector Fragmentation Method
+.IP Format 13
+FM   string
+.IP Explanation 13
+Fragmentation method used to create sequencing library.
+.sp
+.PD
+.PD 0
+.IP Record 13
+ID, IDentifier
+.IP Format 13
+ID   string
+.IP Explanation 13
+This is the name given to the reading inside the assembly database
+and is equivalent to the ID line of an EMBL entry.
+.sp
+.PD
+.PD 0
+.IP Record 13
+LE, Can be used to identify the location of materials
+.IP Format 13
+LE   string
+.IP Explanation 13
+Originally a micro titre dish well number. Used in
+combination with LI.
+.sp
+.PD
+.PD 0
+.IP Record 13
+LI, Can be used to identify the location of materials
+.IP Format 13
+LI   string
+.IP Explanation 13
+Originally a micro titre dish identifier. Used in
+combination with LE.
+.sp
+.PD
+.PD 0
+.IP Record 13
+LN, Local format trace file Name
+.IP Format 13
+LN   string
+.IP Explanation 13
+The name of the local format trace file. This information is passed
+onto gap4, and allows for local formats to be used.
+.sp
+.PD
+.PD 0
+.IP Record 13
+LT, Local format trace file Type
+.IP Format 13
+LT   string
+.IP Explanation 13
+The type of the local format trace file (usually SCF).
+.sp
+.PD
+.PD 0
+.IP Record 13
+MC, MaChine on which sequencing experiment was run
+.IP Format 13
+MC   string
+.IP Explanation 13
+The lab's name for the sequencing machine used to create the data.
+Used for logging the performance of individual machines.
+.sp
+.PD
+.PD 0
+.IP Record 13
+MN, Machine generated trace file Name
+.IP Format 13
+MN   string
+.IP Explanation 13
+The name of the trace file generated by the sequencing machine MC.
+.sp
+.PD
+.PD 0
+.IP Record 13
+MT, Machine generated trace file Type
+.IP Format 13
+MT   string
+.IP Explanation 13
+The type of machine generated trace file.
+.sp
+.PD
+.PD 0
+.IP Record 13
+ON, Original base Numbers (positions)
+.IP Format 13
+ON   (eg) 1..43 0 45..63 65..74 0 75..536
+.IP Explanation 13
+The A..B notation means values A to B inclusive, so this example reads
+that bases 1 to 43 are unchanged, there is a change at 44, etc.
+.sp
+.PD
+.PD 0
+.IP Record 13
+OP, OPerator
+.IP Format 13
+OP   string
+.IP Explanation 13
+Someone's name, possibly the person who ran the
+sequencing machine. Useful, with expansion of the string field for
+monitoring the performance of individuals!
+.sp
+.PD
+.PD 0
+.IP Record 13
+PC,  Position in Contig
+.IP Format 13
+PC    number
+.IP Explanation 13
+For preassembled data, the position to put the left end of the reading.
+.sp
+.PD
+.PD 0
+.IP Record 13
+PD,  Primer Data
+.IP Format 13
+PD    sequence
+.IP Explanation 13
+The primer sequence.
+.sp
+.PD
+.PD 0
+.IP Record 13
+PN, Primer Name
+.IP Format 13
+PN   string
+.IP Explanation 13
+Name of primer used, using local naming convention. Could be a
+universal primer. 
+.sp
+.PD
+.PD 0
+.IP Record 13
+PR, PRimer type
+.IP Format 13
+PR   number
+.IP Explanation 13
+This record shows the direction of the reading and distinguishes between
+primers from the ends of the insert and those that are internal. It is
+important for the analysis of the relative orientations and positions of
+readings on templates. When the positions of readings on templates are
+analysed (see "Find read pairs") primer types
+1,2,3 and 4 are represented using the symbols F,R,f and r respectively.
+
+.nf
+.BR 0 "  Unknown"
+.BR 1 "  Forward from beginning of insert"
+.BR 2 "  Reverse from end of insert"
+.BR 3 "  Custom forward i.e. a forward primer other than type 1."
+.BR 4 "  Custom reverse i.e. a reverse primer other than type 2."
+.fi
+.sp
+.PD
+.PD 0
+.IP Record 13
+PS, Processing Status
+.IP Format 13
+PS   explanation
+.IP Explanation 13
+Indication of processing status. 
+.sp
+.PD
+.PD 0
+.IP Record 13
+QL, poor Quality sequence present at Left (5') end
+.IP Format 13
+QL   position
+.IP Explanation 13
+The sequence up to and including the base at the marked position is
+considered to be of too poor quality to be used.
+It may overlap with other marked
+sequences - CS, SL or SR. Used in assembly to exclude unwanted sequence.
+.sp
+.PD
+.PD 0
+.IP Record 13
+QR, poor Quality sequence present at Right (3') end
+.IP Format 13
+QR   position
+.IP Explanation 13
+The sequence from and including the base at the marked position to the
+end is considered to be of too poor quality to be used. It may overlap with
+other marked sequences - CS, SL or SR. Used in assembly to exclude
+unwanted sequence.
+.sp
+.PD
+.PD 0
+.IP Record 13
+RS, Reference Sequence
+.IP Format 13
+RS   string
+.IP Explanation 13
+The name of a sequence, usually in EMBL format, used to define the target
+sequence, base numbering 
+and feature table data for a project. Used to define the numbering and
+changes produced by mutations in individual sequence readings
+(see "Introduction to mutation detection").
+.sp
+.PD
+.PD 0
+.IP Record 13
+SC, Sequencing vector Cloning site
+.IP Format 13
+SC   position
+.IP Explanation 13
+The cloning site of the sequence vector. Used by vector_clip 
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+.sp
+.PD
+.PD 0
+.IP Record 13
+SE, SEnse (ie whether complemented)
+.IP Format 13
+SE   number
+.IP Explanation 13
+For preassembled data, the sense of the reading (0 for forward, 1 for
+reverse).
+.sp
+.PD
+.PD 0
+.IP Record 13
+SF, Sequencing vector sequence File
+.IP Format 13
+SF   string
+.IP Explanation 13
+The name of the file containing the sequence of the 
+sequencing vector, to be used by vector_clip 
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+.sp
+.PD
+.PD 0
+.IP Record 13
+SI, Sequencing vector Insertion length
+.IP Format 13
+SI   range
+.IP Explanation 13
+Expected insertion length of sequence in sequencing
+vector. Useful for selecting templates for further experiments.
+.sp
+.PD
+.PD 0
+.IP Record 13
+SL, Sequencing vector sequence present at Left (5') end
+.IP Format 13
+SL   position
+.IP Explanation 13
+The sequence up to and including the base at the marked
+position is considered to be sequencing vector. Written by vector_clip
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+.sp
+.PD
+.PD 0
+.IP Record 13
+SP, Sequencing vector Primer site (relative to cloning site)
+.IP Format 13
+SP   position
+.IP Explanation 13
+Location of the primer used for sequencing, relative to the cloning site.
+Used by vector_clip 
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+.sp
+.PD
+.PD 0
+.IP Record 13
+SQ, SeQuence
+.IP Format 13
+SQ   \\nsequence blocks at dots{}\\n//\\n
+.IP Explanation 13
+Complete sequence, as determined by the sequencing machine. The sequence is
+broken into blocks of 10 bases with 6 blocks per line separated by a space
+(see the example below).
+.sp
+.PD
+.PD 0
+.IP Record 13
+SR, Sequencing vector sequence present at Right (3') end
+.IP Format 13
+SR   position
+.IP Explanation 13
+The sequence from and including the base at the marked
+position to the end is considered to be sequencing vector. Written by
+vector_clip 
+(_fpref(Vector_Clip, Screening Against Vector Sequences, vector_clip)).
+.sp
+.PD
+.PD 0
+.IP Record 13
+SS, Screening Sequence
+.IP Format 13
+SS   string
+.IP Explanation 13
+Note that in earlier versions of this documentation this field was explained
+incorrectly. Due to this the field is not currently being used by any of our
+programs. The original meaning was to specify a sequence to screen against.
+Any number of SS lines could be present to denote any number of screening
+sequences. In the future we may change the meaning of this field to be a
+single SS line containing a file of filenames of screening sequences. If this
+causes problems for people then we will choose a new line type, so please
+inform us now. Also note that contrary to previous documentation, vector_clip does
+not use this field (it uses the SF field instead).
+.sp
+.PD
+.PD 0
+.IP Record 13
+ST, STrands
+.IP Format 13
+ST   number
+.IP Explanation 13
+Denotes whether this is a single or double stranded template. This
+is useful for deducing suitable templates for later experiments.
+.sp
+.PD
+.PD 0
+.IP Record 13
+SV, Sequencing Vector type
+.IP Format 13
+SV   string
+.IP Explanation 13
+Type of sequencing vector used. Can be used for choosing
+templates for custom primer experiments.
+.sp
+.PD
+.PD 0
+.IP Record 13
+TC, Tag to be placed on the Consensus.
+.IP Format 13
+TC   TYPE S position..length
+.IP Explanation 13
+These lines instruct gap4 to place tags on the consensus.
+The format defines the tag type (which is a 4 character identifier
+and should start at column position 5), its strand ("+", "-" or
+"=" which means both strands), its start position followed by the
+position of its end. These two values are separated by "..". Following
+lines starting TC with space characters up to column 10 are written
+into the comment field of the tag. For example the next three lines
+define a tag of type comment that is to be on both strands over the
+range 100 to 110 and the comment field will contain "This comment
+contains several lines".
+.nf
+.in +0.5i
+TC   COMM = 100..110
+TC        This comment contains
+TC          several lines
+.in -0.5i
+.fi
+.sp
+.PD
+.PD 0
+.IP Record 13
+TG, Tag to be placed on the reading.
+.IP Format 13
+TG   TYPE S position..length
+.IP Explanation 13
+These lines instruct gap4 to place tags on the reading.
+See TC for further information.
+.sp
+.PD
+.PD 0
+.IP Record 13
+TN, Template Name
+.IP Format 13
+TN   string
+.IP Explanation 13
+The name of the template used in the experiment.
+.sp
+.PD
+.PD 0
+.IP Record 13
+WT, Wild Type trace file
+.IP Format 13
+WT   string
+.IP Explanation 13
+The filename of the wild type trace file. Used for mutation studies.
+.sp
+.PD
+_split()
+
+.SS "Example"
+.PP
+
+.nf
+.in +0.5i
+ID   h4a01h6.s1
+EN   h4a01h6.s1
+TN   h4a01h6
+EX   lane 18, run time 10 hrs
+MN   Sample 18
+MC   A
+MT   ABI
+LN   h4a01h6.s1SCF
+LT   SCF
+DT   08-Jan-1993
+OP   ak
+TN   h4a01h6
+SV   M13mp18
+SF   /pubseq/seqlibs/vectors/m13mp18.seq
+SI   1000..2000
+SC   6249
+PN   -21
+PR   1
+DR   +
+SP   41
+ST   1
+CN   3G9
+CV   sCos-1
+CF   /pubseq/seqlibs/vectors/sCos-1.seq
+SS   /pubseq/seqlibs/vectors/m13mp18.seq
+SQ
+     GCTTGCATGC CTGCAGGTCG ACTCTAGAGG ATCCCCAACC AGTAAGGCAA CCCCGCCAGC
+     CTAGCCGGGT CCTCAACGAC AGGAGCACGA TCATGCGCAC CCGTCAGATC CAGACATGAT
+     AAGATACATT GATGAGTTTG GACAAACCAC AACTAGAATG CAGT-AAAAA AATGCTTTAT
+     TTGTGAAATT TGTGATGCTA TTGCTTTATT TGTAACCATT ATAAGCTGCA ATAAACAAGT
+     TAACAACAAC AATTGCATTC ATTTTATGTT TCAGGTTCAG GGGGAGGTGT GGGAGGTTTT
+     TTAAAGCAAG TAAAACCTCT ACAAATGTGG TATGGCTGAT TATGATCTCT AGTCAAGGCA
+     CTATACATCA AATATT-CCT TATTAACCCC CTTTACAAAT TTAAAAGGCT -AAAGGGTCC
+     ACAATTTTTG -GCCTAGGTA TTAATAGCCG GCACTTCTT- TGCCTGTTTT GG-GTAGGG-
+     AAAACCGGTA TGTTT-TGGT T-TTC
+//
+QL   0
+QR   281
+SL   36
+SR   506
+CS   37..280
+PS   Completely cloning vector
+.in -0.5i
+.fi
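+
+As the example shows, each record line starts with a two letter record
+type followed by blanks and then the value. A minimal sketch of splitting
+such a line is given below. The function name parse_exp_line is
+hypothetical and is not part of the io library; sequence lines and the
+"//" terminator are simply reported as "not a record line".

```c
#include <ctype.h>

/* Hypothetical helper, not part of io_lib: split one experiment file
 * line such as "QL   0" into its two-letter record type and its value.
 * Returns 0 on success, -1 if the line does not start with a record
 * type (for example a sequence line or the "//" terminator). */
int parse_exp_line(const char *line, char type[3], const char **value)
{
    const char *p;

    if (!isupper((unsigned char)line[0]) || !isupper((unsigned char)line[1]))
        return -1;
    type[0] = line[0];
    type[1] = line[1];
    type[2] = '\0';

    /* Skip the blanks that separate the type from the value. */
    p = line + 2;
    while (*p == ' ' || *p == '\t')
        p++;
    *value = p;
    return 0;
}
```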
+
+_split()
+
+.SS "Unsupported Additions (From LaDeana Hillier)"
+.PP
+
+Note the clash on AP, which the io-lib uses for "Assembly Position",
+and PC, which is used for "Position in Contig".
+
+.nf
+.in +0.1i
+People to track:
+TP Template Prep person
+QP Sequencer Person, person who does sequencing reactions
+LP Loader Person
+AL Agar Loader person (when they run a gel to determine SI)
+AP Agar reaction Person   (person who does the reactions to prepare
+                        the template to be run on a gel)
+
+Gel specific information
+GN Gel Name
+GL Gel Lane
+GP Gel Pourer person
+AG Agar Gel name (sizing gel)
+AF Agar Fate, no insert, no bands, what else?
+
+Name of library
+LB  Library name, probably not critical to assembly even though
+        one CN may have more than one library.  But it is important
+        to the cDNA project although I could put it in CN, since
+        the cDNA project wouldn't have a CN otherwise.
+
+Processing information
+PC processing comment (a comment about PS)
+        I think PS should just hold pass or fail and PC should hold
+        additional information about why things passed.
+
+Trace information gotten from the ABI machine (from info field in SCF file):
+TS   Trace Spacing
+DP   Dye Primer
+HA   signal strengtH A
+HG   signal strengtH G
+HC   signal strengtH C
+HT   signal strengtH T
+
+(NOTE rs suggested these should go in a single record)
+
+PP   Primer Position  (position at which primer peak was detected in trace)
+
+Stuff most likely specific to the cDNA project:
+MP Map Position 
+TT Tissue Type of the library
+EI dbEst Id  
+ER dbEst Remark
+OE Other Est's which are similar
+NI NCBI ID
+GB GenBank accession number
+SD Submission Date (when est was submitted)
+UD Update date (when it was last updated)
+CI citation associated with this cDNA
+.in -0.1i
+.fi
diff --git a/manual/man/man4/scf.4 b/manual/man/man4/scf.4
new file mode 100644
index 0000000..c156b10
--- /dev/null
+++ b/manual/man/man4/scf.4
@@ -0,0 +1,421 @@
+.TH "scf" 4 "" "" "Staden Package"
+.SH "NAME"
+.PP
+scf \- SCF File Format
+
+.SH "SCF"
+.PP
+
+SCF format files are used to store data from DNA sequencing
+instruments. Each file contains the data for a single reading and
+includes: its trace sample points, its called sequence, the positions
+of the bases relative to the trace sample points, and numerical
+estimates of the accuracy of each base. Comments and "private data"
+can also be stored. The format is machine
+independent and the first version was described in Dear, S and Staden, R. "A
+standard file format for data from DNA sequencing instruments", DNA
+Sequence 3, 107-110, (1992). 
+
+Since then it has undergone several important changes. The first allowed for
+different sample point resolutions. The second, in response to the need to
+reduce file sizes for large projects, involved a major reorganisation of the
+ordering of the data items in the file and also in the way they are
+represented.  Note that despite these changes we have retained the original
+data structures into which the data is read. Also this reorganisation in
+itself has not made the files smaller but it has produced files that are more
+effectively compressed using standard programs such as gzip. The io library
+included in the package contains routines that can read and write all the
+different versions of the format (including reading of compressed files). The
+header record was not affected by this change. This documentation covers both
+the format of scf files and the data structures that are used by the io
+library. Prior to version 3.00 these two things corresponded much more
+closely.
+
+_split()
+
+.SS "Header Record"
+.PP
+
+The file begins with a 128 byte header record that describes the
+location and size of the chromatogram data in the file. Nothing is
+implied about the order in which the components (samples, sequence and
+comments) appear. The version field is a 4 byte character array
+representing the version and revision of the SCF format. The current
+value of this field is "3.00".
+
+.nf
+.in +0.2i
+/*
+ * Basic type definitions
+ */
+typedef unsigned int   uint_4;
+typedef signed   int    int_4;
+typedef unsigned short uint_2;
+typedef signed   short  int_2;
+typedef unsigned char  uint_1;
+typedef signed   char   int_1;
+
+/*
+ * Type definition for the Header structure
+ */
+#define SCF_MAGIC (((((uint_4)'.'<<8)+(uint_4)'s'<<8) \\
+                     +(uint_4)'c'<<8)+(uint_4)'f')
+
+typedef struct {
+    uint_4 magic_number;
+    uint_4 samples;          /* Number of elements in Samples matrix */
+    uint_4 samples_offset;   /* Byte offset from start of file */
+    uint_4 bases;            /* Number of bases in Bases matrix */
+    uint_4 bases_left_clip;  /* OBSOLETE: No. bases in left clip (vector) */
+    uint_4 bases_right_clip; /* OBSOLETE: No. bases in right clip (qual) */
+    uint_4 bases_offset;     /* Byte offset from start of file */
+    uint_4 comments_size;    /* Number of bytes in Comment section */
+    uint_4 comments_offset;  /* Byte offset from start of file */
+    char version[4];         /* "version.revision", eg '3' '.' '0' '0' */
+    uint_4 sample_size;      /* Size of samples in bytes 1=8bits, 2=16bits*/
+    uint_4 code_set;         /* code set used (but ignored!)*/
+    uint_4 private_size;     /* No. of bytes of Private data, 0 if none */
+    uint_4 private_offset;   /* Byte offset from start of file */
+    uint_4 spare[18];        /* Unused */
+} Header;
+.in -0.2i
+.fi
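+
+The header layout above can be decoded from a memory buffer roughly as
+follows. This is only a sketch: the names scf_header_info, be32 and
+scf_parse_header are illustrative and do not come from the io library,
+only a subset of the fields is extracted, and integers are assumed to be
+stored most significant byte first as described in the Notes section.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Subset of the header fields, decoded from the on-disk form. */
typedef struct {
    uint32_t samples, samples_offset;
    uint32_t bases, bases_offset;
    uint32_t comments_size, comments_offset;
    char     version[5];          /* NUL-terminated copy, e.g. "3.00" */
    uint32_t sample_size;
} scf_header_info;                /* hypothetical name */

/* Decode a big-endian 32-bit value. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is shorter than the 128 byte
 * header or does not start with the ".scf" magic number. */
int scf_parse_header(const unsigned char *buf, size_t len, scf_header_info *h)
{
    if (len < 128 || memcmp(buf, ".scf", 4) != 0)
        return -1;
    h->samples         = be32(buf + 4);
    h->samples_offset  = be32(buf + 8);
    h->bases           = be32(buf + 12);
    /* bytes 16..23 hold the two obsolete clip fields */
    h->bases_offset    = be32(buf + 24);
    h->comments_size   = be32(buf + 28);
    h->comments_offset = be32(buf + 32);
    memcpy(h->version, buf + 36, 4);
    h->version[4] = '\0';
    h->sample_size     = be32(buf + 40);
    return 0;
}
```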
+
+For versions of SCF files 2.0 or greater (\fBHeader.version\fP is `greater
+than' "2.00"), the version number, precision of data, and the uncertainty code
+set are specified in the header.  Otherwise, the precision is assumed to be 1
+byte, and the code set to be the default code set.  The following uncertainty
+code sets are recognised (but still ignored by our programs!).
+
+.nf
+.in +0.2i
+0       {A,C,G,T,-}   (default)
+1       Staden
+2       IUPAC (NC-IUB)
+3       Pharmacia A.L.F. (NC-IUB)
+4       {A,C,G,T,N}   (ABI 373A)
+5       IBI/Pustell
+6       DNA*
+7       DNASIS
+8       IG/PC-Gene
+9       MicroGenie
+.in -0.2i
+.fi
+
+_split()
+
+.SS "Sample Points."
+.PP
+
+The trace information is stored at byte offset
+\fBHeader.samples_offset\fP from the start of the file. For each
+sample point there are values for each of the four bases.  
+\fBHeader.sample_size\fP holds the
+precision of the sample values. The precision must be either "1"
+(unsigned byte) or "2" (unsigned short). The sample points need not be
+normalised to any particular value, though it is assumed that they
+represent positive values. That is, they are of unsigned type.
+
+With the introduction of scf version 3.00, in an attempt to produce
+efficiently compressed files, the sample points
+are stored in A,C,G,T order; i.e. all the values for base A, followed by all
+those for C, etc. In addition they are stored, not as their original 
+magnitudes, but in terms of the
+differences between successive values. The C language code used to
+transform the values for precision 2 samples is shown below.
+
+.nf
+.in +0.2i
+void delta_samples2 ( uint_2 samples[], int num_samples, int job) {
+ 
+    /* If job == DELTA_IT:
+     *  change a series of sample points to a series of delta delta values:
+     *  ie change them in two steps:
+     *  first: delta = current_value - previous_value
+     *  then: delta_delta = delta - previous_delta
+     * else
+     *  do the reverse
+     */
+ 
+    int i;
+    uint_2 p_delta, p_sample;
+ 
+    if ( DELTA_IT == job ) {
+        p_delta  = 0;
+        for (i=0;i<num_samples;i++) {
+            p_sample = samples[i];
+            samples[i] = samples[i] - p_delta;
+            p_delta  = p_sample;
+        }
+        p_delta  = 0;
+        for (i=0;i<num_samples;i++) {
+            p_sample = samples[i];
+            samples[i] = samples[i] - p_delta;
+            p_delta  = p_sample;
+        }
+    }
+    else {
+        p_sample = 0;
+        for (i=0;i<num_samples;i++) {
+            samples[i] = samples[i] + p_sample;
+            p_sample = samples[i];
+        }
+        p_sample = 0;
+        for (i=0;i<num_samples;i++) {
+            samples[i] = samples[i] + p_sample;
+            p_sample = samples[i];
+        }
+    }
+}
+.in -0.2i
+.fi
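+
+The listing above covers precision 2 samples only. As an illustration,
+the same two passes of differencing can be written for precision 1
+samples as sketched below. The function names are hypothetical, and it is
+an assumption that 1 byte samples are transformed in exactly the same
+way, using wrap-around 8 bit arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Replace each sample by its second difference (delta of deltas),
 * using wrap-around 8-bit arithmetic. Works in place. */
void delta_delta_encode8(uint8_t *s, size_t n)
{
    for (int pass = 0; pass < 2; pass++) {
        uint8_t prev = 0;
        for (size_t i = 0; i < n; i++) {
            uint8_t cur = s[i];
            s[i] = (uint8_t)(cur - prev);
            prev = cur;
        }
    }
}

/* Undo delta_delta_encode8 by taking the running sum twice. */
void delta_delta_decode8(uint8_t *s, size_t n)
{
    for (int pass = 0; pass < 2; pass++) {
        uint8_t sum = 0;
        for (size_t i = 0; i < n; i++) {
            sum = (uint8_t)(sum + s[i]);
            s[i] = sum;
        }
    }
}
```

+Encoding followed by decoding returns the original samples, because each
+running sum exactly reverses one pass of differencing modulo 256.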
+
+The io library data structure is as follows:
+
+.nf
+.in +0.2i
+/*
+ * Type definition for the Sample data
+ */
+typedef struct {
+        uint_1 sample_A;           /* Sample for A trace */
+        uint_1 sample_C;           /* Sample for C trace */
+        uint_1 sample_G;           /* Sample for G trace */
+        uint_1 sample_T;           /* Sample for T trace */
+} Samples1;
+
+typedef struct {
+        uint_2 sample_A;           /* Sample for A trace */
+        uint_2 sample_C;           /* Sample for C trace */
+        uint_2 sample_G;           /* Sample for G trace */
+        uint_2 sample_T;           /* Sample for T trace */
+} Samples2;
+.in -0.2i
+.fi
+
+_split()
+
+.SS "Sequence Information."
+.PP
+
+Information relating to the base interpretation of the trace is stored
+at byte offset Header.bases_offset from the start of the file. 
+Stored for each base are: its
+character representation and a number (an index into the Samples data
+structure) indicating its position within the trace. The relative
+probabilities of each of the 4 bases occurring at the point where the
+base is called can be stored in \fBprob_A\fP , \fBprob_C\fP ,
+\fBprob_G\fP and \fBprob_T\fP.
+
+From version 3.00 these items are stored in the following order: all
+"peak indexes", i.e. the positions in the sample points to which the
+bases correspond; all the accuracy estimates for base type A, all for
+C,G and T; the called bases; this is followed by 3 sets of empty int1
+data items. These values are read into the following data structure by
+the routines in the io library.
+
+.nf
+.in +0.2i
+/*
+ * Type definition for the sequence data
+ */
+typedef struct {
+    uint_4 peak_index;        /* Index into Samples matrix for base posn */
+    uint_1 prob_A;            /* Probability of it being an A */
+    uint_1 prob_C;            /* Probability of it being a C  */
+    uint_1 prob_G;            /* Probability of it being a G  */
+    uint_1 prob_T;            /* Probability of it being a T  */
+    char   base;              /* Called base character        */
+    uint_1 spare[3];          /* Spare */
+} Base;
+.in -0.2i
+.fi
+
+_split()
+
+.SS "Comments."
+.PP
+
+Comments are stored at offset Header.comments_offset from the start of
+the file. Lines in this section are of the format:
+
+<Field-ID>=<Value>
+
+<Field-ID> can be any string, though several have special meaning and
+their use is encouraged.
+
+.nf
+.in +0.2i
+ID      Field                           Example
+MACH    Sequencing machine model        MACH=Pharmacia A.L.F.
+TPSW    Trace processing software       TPSW=A.L.F. Analysis
+          version                         Program, Version=1.67
+BCSW    Base calling software version   BCSW=A.L.F. Analysis
+                                          Program, Version=1.67
+DATF    Data source format              DATF=AM_Version=2.0
+DATN    Data source name                DATN=a10c.alf
+CONV    Format conversion software      CONV=makeSCF v2.0
+.in -0.2i
+.fi
+
+Other fields might include:
+
+.nf
+.in +0.2i
+ID      Field                           Example
+OPER    Operator                        OPER=sd
+STRT    Time run started                STRT=Aug 05 1991  12:25:01
+STOP    Time run stopped                STOP=Aug 05 1991  16:26:25
+PROC    Time processed                  PROC=Aug 05 1991  18:50:13
+EDIT    Time edited                     EDIT=Aug 05 1991  19:06:18
+NAME    Sample name                     NAME=a21b1.s1
+SIGN    Average signal strength         SIGN=A=56,C=66,G=13,T=18
+SPAC    Average base spacing            SPAC=12.04
+SCAL    Factor used in scaling traces   SCAL=0.5
+ACMP    Compression annotation          ACMP=99,6
+ASTP    Stop annotation                 ASTP=143,12
+.in -0.2i
+.fi
+
+.nf
+.in +0.2i
+
+/*
+ * Type definition for the comments
+ */
+typedef char Comments[];                /* Zero terminated list of
+                                           \\n separated entries */
+
+.in -0.2i
+.fi
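+
+Looking up one field in such a comment block could be sketched as below.
+The function scf_comment_value is a hypothetical helper, not an io
+library routine. It assumes entries are separated by newlines and that
+the field identifier is followed directly by "=".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: look up a field ID in a zero-terminated block of
 * newline-separated "Field-ID=Value" entries. On success returns a
 * pointer to the value and stores its length in *len. The value is not
 * NUL-terminated; it ends at the newline or at the end of the block.
 * Returns NULL if the field is absent. */
const char *scf_comment_value(const char *comments, const char *id,
                              size_t *len)
{
    size_t idlen = strlen(id);
    const char *line = comments;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t linelen = eol ? (size_t)(eol - line) : strlen(line);

        if (linelen > idlen && strncmp(line, id, idlen) == 0 &&
            line[idlen] == '=') {
            *len = linelen - idlen - 1;
            return line + idlen + 1;
        }
        if (eol == NULL)
            break;
        line = eol + 1;
    }
    return NULL;
}
```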
+
+_split()
+
+.SS "Private data."
+.PP
+
+The private data section is provided to store any information required
+that is not supported by the SCF standard. If the field in the header
+is 0 then there is no private data section. We impose no restrictions
+upon the format of this section. However we feel it may be a good idea
+to use the first four bytes as a magic number identifying the used
+format of the private data.
+
+_split()
+
+.SS "File structure."
+.PP
+
+From SCF version 3.0 onwards the in-memory structures and the data on the disk
+are not in the same format. The overview of the data on disk for the different
+versions is summarised below.
+
+.nf
+.in +0.2i
+
+Versions 1 and 2
+
+(Note Samples1 can be replaced by Samples2 as appropriate.)
+
+Length in bytes                        Data
+---------------------------------------------------------------------
+128                                    header
+Number of samples * 4 * sample size    Samples1 or Samples2 structure
+Number of bases * 12                   Base structure
+Comments size                          Comments
+Private data size                      private data
+
+Version 3
+
+Length in bytes                        Data
+---------------------------------------------------------------------------
+128                                    header
+Number of samples * sample size        Samples for A trace
+Number of samples * sample size        Samples for C trace
+Number of samples * sample size        Samples for G trace
+Number of samples * sample size        Samples for T trace
+Number of bases * 4                    Offset into peak index for each base
+Number of bases                        Accuracy estimate bases being 'A'
+Number of bases                        Accuracy estimate bases being 'C'
+Number of bases                        Accuracy estimate bases being 'G'
+Number of bases                        Accuracy estimate bases being 'T'
+Number of bases                        The called bases
+Number of bases * 3                    Reserved for future use
+Comments size                          Comments
+Private data size                      Private data
+---------------------------------------------------------------------------
+.in -0.2i
+.fi
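+
+For a version 3 file the position of each array follows from the table
+above. The sketch below computes these positions. The type scf3_layout
+and the function scf3_compute_layout are illustrative names only. The
+sample and base sections are located through the header offsets rather
+than by assuming that they directly follow one another.

```c
#include <stdint.h>

/* Byte offsets of the per-trace and per-base arrays in a version 3 file. */
typedef struct {
    uint32_t trace[4];      /* samples for A, C, G, T */
    uint32_t peak_index;    /* 4 bytes per base */
    uint32_t prob[4];       /* accuracy estimates for A, C, G, T */
    uint32_t called_bases;  /* 1 byte per base */
    uint32_t reserved;      /* 3 bytes per base */
} scf3_layout;              /* hypothetical name */

scf3_layout scf3_compute_layout(uint32_t samples_offset, uint32_t num_samples,
                                uint32_t sample_size,
                                uint32_t bases_offset, uint32_t num_bases)
{
    scf3_layout l;

    /* Four consecutive traces, each num_samples * sample_size bytes. */
    for (int i = 0; i < 4; i++)
        l.trace[i] = samples_offset + (uint32_t)i * num_samples * sample_size;

    /* Peak indexes (4 bytes each) come first in the base section. */
    l.peak_index = bases_offset;
    for (int i = 0; i < 4; i++)
        l.prob[i] = bases_offset + 4 * num_bases + (uint32_t)i * num_bases;
    l.called_bases = bases_offset + 8 * num_bases;
    l.reserved     = bases_offset + 9 * num_bases;
    return l;
}
```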
+
+_split()
+
+.SS "Notes"
+.PP
+
+"Forward byte and reverse bit" ordering will be used for all integer
+values. This is the same as used in the MC680x0 and SPARC processors,
+but the reverse of the byte ordering used on the Intel 80x86 processors.
+
+.nf
+.in +0.2i
+         Off+0   Off+1  
+       +-------+-------+  
+uint_2 |  MSB  |  LSB  |  
+       +-------+-------+  
+
+         Off+0   Off+1   Off+2   Off+3
+       +-------+-------+-------+-------+
+uint_4 |  MSB  |  ...  |  ...  |  LSB  | 
+       +-------+-------+-------+-------+
+.in -0.2i
+.fi
+
+To read integers on systems with any byte order use something like this:
+
+.nf
+.in +0.2i
+uint_2 read_uint_2(FILE *fp)
+{
+    unsigned char buf[sizeof(uint_2)];
+
+    fread(buf, sizeof(buf), 1, fp);
+    return (uint_2)
+        (((uint_2)buf[1]) +
+         ((uint_2)buf[0]<<8));
+}
+
+uint_4 read_uint_4(FILE *fp)
+{
+    unsigned char buf[sizeof(uint_4)];
+
+    fread(buf, sizeof(buf), 1, fp);
+    return (uint_4)
+        (((uint_4)buf[3]) +
+         ((uint_4)buf[2]<<8) +
+         ((uint_4)buf[1]<<16) +
+         ((uint_4)buf[0]<<24));
+}
+.in -0.2i
+.fi
+
+_split()
+
+The SCF format version 3.00 has been designed with file compression in mind.
+No new information is recorded when compared to the version 2.02 format,
+except the data is stored in a manner conducive to efficient compression.
+
+Experimentation (analysed using a data set of 100 ABI files and their SCF
+equivalents) has shown that 16 bit SCF version 3.00 files can achieve a
+9:1 compression ratio and 8 bit SCF files a 14.5:1 compression ratio. These
+figures are for SCF files without quality values compressed using the
+\fBbzip\fP utility. \fBgzip\fP tends to give files between 20 and 40% larger
+than \fBbzip\fP. Compressed SCF files containing accuracy values tend to be
+around 10% larger than those without accuracy values.
+
+Whilst compression is not a specific part of the SCF standard, the size of
+trace files and the compression ratios attainable suggest that it is wise to
+handle compressed files. The Staden Package utilities, such as gap4 and trev,
+automatically uncompress and compress SCF files as needed.
+
+Note that at present on-the-fly compression, as just described, is not
+implemented for the Windows version of the package.
diff --git a/manual/man/man4/ztr.4 b/manual/man/man4/ztr.4
new file mode 100644
index 0000000..97fb1f0
--- /dev/null
+++ b/manual/man/man4/ztr.4
@@ -0,0 +1,750 @@
+.TH "ztr" 4 "" "" "Staden Package"
+.SH "NAME"
+.PP
+ztr \- ZTR File Format (v1.2)
+
+.SH "ZTR"
+.PP
+
+The ZTR format is used for storing analogue chromatogram data from DNA
+sequencing instruments.
+
+_split()
+
+.SS "Header"
+.PP
+
+The header consists of an 8 byte magic number (see below), followed by a 1-byte
+major version number and 1-byte minor version number.
+
+A change in the minor number should not cause problems for parsers. It
+indicates a change in chunk types (different contents), but the file format
+is the same.
+
+The major number is reserved for any incompatible file format changes (which
+hopefully should be never).
+
+.nf
+.in +0.2i
+/* The header */
+typedef struct {
+    unsigned char  magic[8];	  /* 0xae5a54520d0a1a0a (be) */
+    unsigned char  version_major; /* 1 */
+    unsigned char  version_minor; /* 2 */
+} ztr_header_t;
+
+/* The ZTR magic numbers */
+#define ZTR_MAGIC		"\\256ZTR\\r\\n\\032\\n"
+#define ZTR_VERSION_MAJOR	1
+#define ZTR_VERSION_MINOR	2
+.in -0.2i
+.fi
+
+So the total header will consist of:
+
+.nf
+.in +0.2i
+Byte number   0  1  2  3  4  5  6  7  8  9
+            +--+--+--+--+--+--+--+--+--+--+
+Hex values  |ae 5a 54 52 0d 0a 1a 0a|01 02|
+            +--+--+--+--+--+--+--+--+--+--+
+.in -0.2i
+.fi
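+
+Checking a buffer for this header could look like the sketch below. The
+function name ztr_check_header is illustrative. The magic bytes are those
+given by ZTR_MAGIC above.

```c
#include <stddef.h>
#include <string.h>

/* Returns 0 and fills in the version numbers if buf starts with a ZTR
 * header, -1 otherwise. The function name is illustrative only. */
int ztr_check_header(const unsigned char *buf, size_t len,
                     unsigned *major, unsigned *minor)
{
    /* "\256ZTR\r\n\032\n" written out byte by byte */
    static const unsigned char magic[8] =
        { 0xae, 0x5a, 0x54, 0x52, 0x0d, 0x0a, 0x1a, 0x0a };

    if (len < 10 || memcmp(buf, magic, sizeof magic) != 0)
        return -1;
    *major = buf[8];
    *minor = buf[9];
    return 0;
}
```

+A parser would normally reject an unknown major version but accept any
+minor version, following the compatibility rule described above.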
+
+_split()
+
+.SS "Chunk Format"
+.PP
+
+The basic structure of a ZTR file is (header,chunk*) - i.e. header followed by
+zero or more chunks. Each chunk consists of a type, some meta-data and some
+data, along with the lengths of both the meta-data and data.
+
+.nf
+.in +0.2i
+Byte number   0  1  2  3  4  5  6  7  8  9
+            +--+--+--+--+---+---+---+---+--+--+  -  +--+--+--+--+--+--  -  --+
+Hex values  |   type    |meta-data length  | meta-data |data length| data .. |
+            +--+--+--+--+---+---+---+---+--+--+  -  +--+--+--+--+--+--  -  --+
+.in -0.2i
+.fi
+
+I.e. in C:
+
+.nf
+.in +0.2i
+typedef struct {
+    uint4 type;			/* chunk type (be) */
+    uint4 mdlength;		/* length of meta-data field (be) */
+    char *mdata;		/* meta data */
+    uint4 dlength;		/* length of data field (be) */
+    char *data;			/* a format byte and the data itself */
+} ztr_chunk_t;
+.in -0.2i
+.fi
+
+All 2 and 4-byte integer values are stored in big endian format.
+
+The meta-data is uncompressed (and so it does not start with a format
+byte). The format of the meta-data is chunk specific, and many chunk types
+will have no meta-data. In this case the meta-data length field will be zero
+and this will be followed immediately by the data-length field.
+
+The data length is the length in bytes of the entire 'data' block, including
+the format information held within it.
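+
+Walking the chunks of a file held in memory might be sketched as follows.
+The type ztr_chunk_view and the functions be32 and ztr_parse_chunk are
+hypothetical names. The view points into the caller's buffer and copies
+nothing, so the buffer must outlive the view.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t type;
    uint32_t mdlength;
    const unsigned char *mdata;   /* points into the input buffer */
    uint32_t dlength;
    const unsigned char *data;    /* first byte is the format byte */
} ztr_chunk_view;                 /* hypothetical, non-owning view */

/* Decode a big-endian 32-bit value. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse one chunk starting at buf. Returns the number of bytes the
 * chunk occupies, or 0 if the buffer is too short to hold it. */
size_t ztr_parse_chunk(const unsigned char *buf, size_t len,
                       ztr_chunk_view *c)
{
    size_t pos = 8;

    if (len < 8)
        return 0;
    c->type     = be32(buf);
    c->mdlength = be32(buf + 4);

    /* Need the meta-data plus the 4 byte data length after it. */
    if (len - pos < c->mdlength || len - pos - c->mdlength < 4)
        return 0;
    c->mdata = buf + pos;
    pos += c->mdlength;

    c->dlength = be32(buf + pos);
    pos += 4;
    if (len - pos < c->dlength)
        return 0;
    c->data = buf + pos;
    return pos + c->dlength;
}
```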
+
+The first byte of the data consists of a format byte. The most basic format is
+zero - indicating that the data is "as is"; it's the real thing. Other formats
+exist in order to encode various filtering and compression techniques. The
+information encoded in the next bytes will depend on the format byte.
+
+.nf
+.in +0.2i
+Byte number   0 1  2       N
+            +--+--+--  -  --+
+Hex values  | 0|  raw data  |
+            +--+--+--  -  --+
+.in -0.2i
+.fi
+
+Raw data has no compression or filtering. It just contains the unprocessed
+data. It consists of a one byte header (0) indicating raw format followed by N 
+bytes of data.
+
+.nf
+.in +0.2i
+Byte number   0  1    2     3     4      5     6  7  8               N
+            +--+----+----+-----+-----+-------+--+--+--+--  -  --+--+--+
+Hex values  | 1| Uncompressed length | guard | run length encoded data|
+            +--+----+----+-----+-----+-------+--+--+--+--  -  --+--+--+
+.in -0.2i
+.fi
+
+Run length encoding replaces stretches of N identical bytes (with value V)
+with the guard byte G followed by N and V. All other byte values are stored 
+as normal, except for occurrences of the guard byte, which is stored as G 0.
+For example with a guard value of 8:
+
+Input data:
+.nf
+.in +0.2i
+	20 9 9 9 9 9 10 9 8 7
+.in -0.2i
+.fi
+
+Output data:
+.nf
+.in +0.2i
+	1			(rle format)
+	0 0 0 10		(original length)
+	8			(guard)
+	20 8 5 9 10 9 8 0 7	(rle data)
+.in -0.2i
+.fi
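+
+A decoder for this run length scheme can be sketched as below. The name
+ztr_rle_decode is hypothetical. The uncompressed length is read most
+significant byte first, which is how it appears in the example above, and
+a guard byte followed by 0 is taken to mean a literal guard byte.

```c
#include <stddef.h>

/* Decode a run length encoded data block: format byte (1), 4 byte
 * uncompressed length, guard byte, then the encoded payload.
 * Returns the number of bytes written to out, or -1 if the input is
 * malformed or out is too small. */
long ztr_rle_decode(const unsigned char *in, size_t in_len,
                    unsigned char *out, size_t out_cap)
{
    size_t out_len, i = 6, o = 0;
    unsigned char guard;

    if (in_len < 6 || in[0] != 1)
        return -1;
    out_len = ((size_t)in[1] << 24) | ((size_t)in[2] << 16) |
              ((size_t)in[3] << 8)  |  (size_t)in[4];
    guard = in[5];
    if (out_len > out_cap)
        return -1;

    while (i < in_len && o < out_len) {
        unsigned char b = in[i++];
        unsigned char run, v;

        if (b != guard) {            /* ordinary byte, copied as is */
            out[o++] = b;
            continue;
        }
        if (i >= in_len)
            return -1;
        run = in[i++];
        if (run == 0) {              /* "G 0" is a literal guard byte */
            out[o++] = guard;
            continue;
        }
        if (i >= in_len || run > out_len - o)
            return -1;
        v = in[i++];                 /* "G N V" expands to N copies of V */
        for (unsigned k = 0; k < run; k++)
            out[o++] = v;
    }
    return o == out_len ? (long)o : -1;
}
```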
+
+.nf
+.in +0.2i
+Byte number   0  1    2     3     4    5  6  7         N
+            +--+----+----+-----+-----+--+--+--+--  -  --+
+Hex values  | 2| Uncompressed length | Zlib encoded data|
+            +--+----+----+-----+-----+--+--+--+--  -  --+
+.in -0.2i
+.fi
+
+This uses the zlib code to compress a data stream. The ZLIB data may itself be 
+encoded using a variety of methods (LZ77, Huffman), but zlib will
+automatically determine the format itself. Often using zlib mode
+Z_HUFFMAN_ONLY will provide best compression when combined with other
+filtering techniques.
+
+.nf
+.in +0.2i
+Byte number   0       1        2      N 
+            +--+-------------+--  -  --+
+Hex values  |40| Delta level |   data  |
+            +--+-------------+--  -  --+
+.in -0.2i
+.fi
+
+This technique replaces successive bytes with their differences. The level
+indicates how many rounds of differencing to apply, which should be between 1
+and 3. For determining the first difference we compare against zero. All
+differences are internally performed using unsigned values with an automatic
+wrap-around (taking the bottom 8-bits). Hence 2-1 is 1 and 1-2 is 255.
+
+For example, with level set to 1:
+
+Input data:
+.nf
+.in +0.2i
+      10 20 10 200 190 5
+.in -0.2i
+.fi
+
+Output data:
+.nf
+.in +0.2i
+       64			(delta1 format)
+       1			(level)
+       10 10 246 190 246 71	(delta data)
+.in -0.2i
+.fi
+
+For level set to 2:
+
+Input data:
+.nf
+.in +0.2i
+      10 20 10 200 190 5
+.in -0.2i
+.fi
+
+Output data:
+.nf
+.in +0.2i
+       64			(delta1 format)
+       2			(level)
+       10 0 236 200 56 81	(delta data)
+.in -0.2i
+.fi
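+
+The two worked examples can be reproduced with the sketch below, which
+applies the stated number of rounds of 8 bit differencing. The function
+names are hypothetical. The format byte written is 64 (hex 40), as in the
+layout diagram for this format.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode n bytes with `level` rounds of 8-bit differencing.
 * out must hold n + 2 bytes. Returns the number of bytes written,
 * or -1 if level is outside 1..3. */
long ztr_delta1_encode(const uint8_t *in, size_t n, int level, uint8_t *out)
{
    uint8_t *d = out + 2;

    if (level < 1 || level > 3)
        return -1;
    out[0] = 64;                        /* hex 40: 8-bit delta */
    out[1] = (uint8_t)level;
    for (size_t i = 0; i < n; i++)
        d[i] = in[i];
    for (int round = 0; round < level; round++) {
        uint8_t prev = 0;               /* first value compares against 0 */
        for (size_t i = 0; i < n; i++) {
            uint8_t cur = d[i];
            d[i] = (uint8_t)(cur - prev);
            prev = cur;
        }
    }
    return (long)(n + 2);
}

/* Reverse ztr_delta1_encode. out must hold len - 2 bytes.
 * Returns the number of bytes written, or -1 on a bad header. */
long ztr_delta1_decode(const uint8_t *in, size_t len, uint8_t *out)
{
    size_t n;

    if (len < 2 || in[0] != 64 || in[1] < 1 || in[1] > 3)
        return -1;
    n = len - 2;
    for (size_t i = 0; i < n; i++)
        out[i] = in[i + 2];
    for (int round = 0; round < in[1]; round++) {
        uint8_t sum = 0;                /* running sum undoes one round */
        for (size_t i = 0; i < n; i++) {
            sum = (uint8_t)(sum + out[i]);
            out[i] = sum;
        }
    }
    return (long)n;
}
```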
+
+.nf
+.in +0.2i
+Byte number   0       1        2      N 
+            +--+-------------+--  -  --+
+Hex values  |41| Delta level |   data  |
+            +--+-------------+--  -  --+
+.in -0.2i
+.fi
+
+This format is as data format 64 except that the input data is read in 2-byte
+values, so we take the difference between successive 16-bit numbers. For
+example "0x10 0x20 0x30 0x10" (4 8-bit numbers; 2 16-bit numbers) yields "0x10
+0x20 0x1f 0xf0". All 16-bit input data is assumed to be aligned to the start
+of the buffer and is assumed to be in big-endian format.
+
+.nf
+.in +0.2i
+Byte number   0       1        2  3  4      N 
+            +--+-------------+--+--+--  -  --+
+Hex values  |42| Delta level | 0| 0|   data  |
+            +--+-------------+--+--+--  -  --+
+.in -0.2i
+.fi
+
+This format is as data formats 64 and 65 except that the input data is read in
+4-byte values, so we take the difference between successive 32-bit numbers.
+
+Two padding bytes (2 and 3) should always be set to zero. Their purpose is to
+make sure that the compressed block is still aligned on a 4-byte boundary
+(hence making it easy to pass straight into the 32to8 filter).
+
+At present these are reserved for dynamic differencing where the 'level' field 
+varies - applying the appropriate level for each section of data. Experimental 
+at present...
+
+.nf
+.in +0.2i
+Byte number   0
+            +--+--  -  --+
+Hex values  |46|   data  |
+            +--+--  -  --+
+.in -0.2i
+.fi
+
+This method assumes that the input data is a series of big endian 2-byte
+signed integer values. If the value is in the range of -127 to +127 inclusive
+then it is written as a single signed byte in the output stream, otherwise we
+write out -128 followed by the 2-byte value (in big endian format). This
+method works well following one of the delta techniques as most of the 16-bit
+values are typically then small enough to fit in one byte.
+
+Example input data:
+.nf
+.in +0.2i
+	0 10 0 5 -1 -5 0 200 -4 -32 (bytes)
+	(As 16-bit big-endian values: 10 5 -5 200 -800)
+.in -0.2i
+.fi
+
+Output data:
+.nf
+.in +0.2i
+       70			(16-to-8 format)
+       10 5 -5 -128 0 200 -128 -4 -32
+.in -0.2i
+.fi
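+
+An encoder for this 16 to 8 bit reduction might be written as in the
+sketch below. The name ztr_16to8_encode is hypothetical. The escape value
+-128 is written as the byte 0x80, and the caller is assumed to supply an
+output buffer large enough for the worst case of three output bytes per
+input value.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode big-endian signed 16-bit values in `in` (in_len bytes) using
 * the 16 to 8 bit scheme. out must hold 1 + 3 * (in_len / 2) bytes.
 * Returns the number of bytes written, or -1 if in_len is odd. */
long ztr_16to8_encode(const uint8_t *in, size_t in_len, uint8_t *out)
{
    size_t o = 0;

    if (in_len % 2 != 0)
        return -1;
    out[o++] = 70;                      /* hex 46: 16 to 8 bit format */

    for (size_t i = 0; i < in_len; i += 2) {
        /* Assemble the value and sign-extend it from 16 bits. */
        int v = (in[i] << 8) | in[i + 1];
        if (v >= 0x8000)
            v -= 0x10000;

        if (v >= -127 && v <= 127) {
            out[o++] = (uint8_t)v;      /* fits in one signed byte */
        } else {
            out[o++] = 0x80;            /* -128 marks an escaped value */
            out[o++] = in[i];
            out[o++] = in[i + 1];
        }
    }
    return (long)o;
}
```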
+
+.nf
+.in +0.2i
+Byte number   0
+            +--+--  -  --+
+Hex values  |47|   data  |
+            +--+--  -  --+
+.in -0.2i
+.fi
+
+This format is similar to format 70, but we are reducing 32-bit numbers (big
+endian) to 8-bit numbers.
+
+.nf
+.in +0.2i
+Byte number   0  1     FF 100  101   N
+            +--+--  -  -  - --+-- - --+
+Hex values  |48| follow bytes |  data |
+            +--+--  -  -  - --+-- - --+
+.in -0.2i
+.fi
+
+For each symbol we compute the most frequent symbol following it. This is
+stored in the "follow bytes" block (256 bytes). The first character in the
+data block is stored as-is. Then for each subsequent character we store the
+difference between the predicted character value (obtained by using
+follow[previous_character]) and the real value. This is a very crude, but
+fast, method of removing some residual non-randomness in the input data and so 
+will reduce the data entropy. It is best to use this prior to entropy encoding 
+(such as Huffman encoding).
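+
+A rough Python sketch of this predictor is given below. Several details are
+assumptions made for the example rather than statements from this spec: the
+stored difference is taken as (predicted - real) modulo 256, symbols that
+never occur get a follow value of 0, and ties between equally frequent
+followers are broken by first occurrence. The names follow_encode and
+follow_decode are hypothetical.
+
```python
from collections import Counter


def follow_encode(data: bytes) -> bytes:
    """Return format byte 0x48, the 256 follow bytes, then the residuals."""
    counts = [Counter() for _ in range(256)]
    for prev, cur in zip(data, data[1:]):
        counts[prev][cur] += 1
    # follow[s] is the most frequent successor of symbol s (0 if s is unseen).
    follow = bytes(c.most_common(1)[0][0] if c else 0 for c in counts)
    out = bytearray([0x48]) + follow
    if data:
        out.append(data[0])  # first character is stored as-is
        for prev, cur in zip(data, data[1:]):
            # Assumed sign convention: predicted minus real, modulo 256.
            out.append((follow[prev] - cur) & 0xFF)
    return bytes(out)


def follow_decode(block: bytes) -> bytes:
    """Reverse follow_encode, returning the original data bytes."""
    follow = block[1:257]
    body = block[257:]
    out = bytearray(body[:1])
    for diff in body[1:]:
        out.append((follow[out[-1]] - diff) & 0xFF)
    return bytes(out)
```
+
+On strongly patterned input most residuals become zero, which is what makes
+a later entropy coding stage more effective.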
+
+Version 1.1 only.
+Replaced by format 74 in Version 1.2.
+
+WARNING: This method was experimental and has been replaced with an
+integer equivalent. The floating point method may give system specific
+results.
+
+.nf
+.in +0.2i
+Byte number   0  1  2      N
+            +--+--+--  -  --+
+Hex values  |49| 0|   data  |
+            +--+--+--  -  --+
+.in -0.2i
+.fi
+
+This method takes big-endian 16-bit data and attempts to curve-fit it using
+Chebyshev polynomials. The exact method employed uses the 4 preceding values
+to calculate Chebyshev polynomials with 5 coefficients. Of these 5 coefficients
+only 4 are used to predict the next value. Then we store the difference
+between the predicted value and the real value. This procedure is repeated
+for each 16-bit value in the data. The first four 16-bit values are
+stored with a simple 1-level 16-bit delta function. Reversing the predictor
+follows the same procedure, except that the stored difference is added to
+the predicted value to recover the real value.
+
+Version 1.2 onwards
+This replaces the floating point code in ZTR v1.1.
+
+.nf
+.in +0.2i
+Byte number   0  1  2      N
+            +--+--+--  -  --+
+Hex values  |4A| 0|   data  |
+            +--+--+--  -  --+
+.in -0.2i
+.fi
+
+This method takes big-endian 16-bit data and attempts to curve-fit it using
+Chebyshev polynomials. The exact method employed uses the 4 preceding values
+to calculate Chebyshev polynomials with 5 coefficients. Of these 5 coefficients
+only 4 are used to predict the next value. Then we store the difference
+between the predicted value and the real value. This procedure is repeated
+for each 16-bit value in the data. The first four 16-bit values are
+stored with a simple 1-level 16-bit delta function. Reversing the predictor
+follows the same procedure, except that the stored difference is added to
+the predicted value to recover the real value.
+
+_split()
+
+.SS "Chunk Types"
+.PP
+
+As described above, each chunk has a type. The format of the data contained in 
+the chunk data field (when written in format 0) is described below.
+Note that no chunks are mandatory. It is valid to have no chunks at all.
+However some chunk types may depend on the existence of others. This will be
+indicated below, where applicable.
+
+Each chunk type is stored as a 4-byte value. Bit 5 of the first byte is used
+to indicate whether the chunk type is part of the public ZTR spec (bit 5 of
+first byte == 0) or is a private/custom type (bit 5 of first byte == 1). Bit
+5 of the remaining 3 bytes is reserved - they must always be set to zero.
+
+Practically speaking this means that public chunk types consist entirely of
+upper case letters (eg TEXT) whereas private chunk types start with a
+lowercase letter (eg tEXT). Note that in this example TEXT and tEXT are
+completely independent types and they may have no more relationship with each
+other than (for example) TEXT and BPOS types.
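+
+The bit 5 rule can be checked with a simple mask, as in the following Python
+sketch. It assumes bits are numbered from 0 at the least significant end, so
+that bit 5 has the value 0x20 (the bit that distinguishes ASCII upper case
+from lower case). The function names are invented for this example.
+
```python
def is_private_chunk_type(chunk_type: bytes) -> bool:
    """True if bit 5 of the first byte is set (private/custom type)."""
    if len(chunk_type) != 4:
        raise ValueError("chunk types are exactly 4 bytes long")
    return bool(chunk_type[0] & 0x20)


def reserved_bits_clear(chunk_type: bytes) -> bool:
    """True if bit 5 of each of the remaining three bytes is zero."""
    if len(chunk_type) != 4:
        raise ValueError("chunk types are exactly 4 bytes long")
    return all(byte & 0x20 == 0 for byte in chunk_type[1:])
```
+
+With these helpers TEXT is reported as public and tEXT as private, and both
+have their reserved bits clear.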
+
+It is valid to have multiples of some chunks (eg text chunks), but not for
+others (such as base calls). The order of chunks does not matter unless
+explicitly specified.
+
+A chunk may have meta-data associated with it. This is data about the data
+chunk. For example the data chunk could be a series of 16-bit trace samples,
+while the meta-data could be a label attached to that trace (to distinguish
+trace A from traces C, G and T). Meta-data is typically very small and so it
+need never be compressed in any of the public chunk types (although
+meta-data is specific to each chunk type and so it would be valid to have
+private chunks with compressed meta-data if desirable).
+
+The first byte of each chunk data when uncompressed must be zero, indicating
+raw format. If, having read the chunk data, this is not the case then the
+chunk needs decompressing or reverse filtering until the first byte is
+zero. There may be a few padding bytes between the format byte and the first
+element of real data in the chunk. This is to make file processing simpler
+when the chunk data consists of 16 or 32-bit words; the padding bytes ensure
+that the data is aligned to the appropriate word size. Any padding bytes
+required will be listed in the appropriate chunk definition below.
+
+The following lists the chunk types available in 32-bit big-endian format.
+In all cases the data is presented in the uncompressed form, starting with the 
+raw format byte and any appropriate padding.
+
+.nf
+.in +0.2i
+Meta-data:
+Byte number   0  1  2  3
+            +--+--+--+--+
+Hex values  | data name |
+            +--+--+--+--+
+
+Data:
+Byte number   0  1  2  3  4  5  6  7       N
+            +--+--+--+--+--+--+--+--+-     -+
+Hex values  | 0| 0| data| data| data|   -   |
+            +--+--+--+--+--+--+--+--+-     -+
+.in -0.2i
+.fi
+
+This encodes a series of 16-bit trace samples. The first data byte is the
+format (raw); the second data byte is present for padding purposes only. After 
+that comes a series of 16-bit big-endian values.
+
+The meta-data for this chunk contains a 4-byte name associated with the
+trace. If a name is shorter than 4 bytes then it should be right padded with
+nul characters to 4 bytes. For sequencing traces the four lanes representing A, 
+C, G and T signals have names "A\\0\\0\\0", "C\\0\\0\\0", "G\\0\\0\\0" and "T\\0\\0\\0".
+
+At present other names are not reserved, but it is recommended that (for
+consistency with elsewhere) you label private trace arrays with names starting 
+with a lowercase letter (specifically, bit 5 is 1).
+
+For sequencing traces it is expected that there will be four SAMP chunks,
+although the order is not specified.
+
+.nf
+.in +0.2i
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3  4  5  6  7       N
+            +--+--+--+--+--+--+--+--+-     -+
+Hex values  | 0| 0| data| data| data|   -   |
+            +--+--+--+--+--+--+--+--+-     -+
+.in -0.2i
+.fi
+
+The first byte is 0 (raw format). Next is a single padding byte (also 0).
+Then follows a series of 2-byte big-endian trace samples for the "A" trace,
+followed by a series of 2-byte big-endian trace samples for the "C" trace,
+also followed by the "G" and "T" traces (in that order). The assumption is
+made that there is the same number of data points for all traces and hence the 
+length of each trace is simply the number of data elements divided by four.
+
+This chunk is mutually exclusive with the SAMP chunks. If both sets are
+defined then the last found in the file should be used. Experimentation has
+shown that this gives around 3% saving over 4 separate SAMP chunks.
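+
+Splitting an uncompressed chunk of this layout into its four traces could
+look like the following Python sketch. Treating the samples as unsigned
+16-bit values is an assumption made here (the text above only says 2-byte
+big-endian), and parse_smp4 is a hypothetical name.
+
```python
import struct


def parse_smp4(chunk: bytes) -> dict:
    """Split raw chunk data into equal-length A, C, G and T sample lists."""
    if len(chunk) < 2 or chunk[0] != 0:
        raise ValueError("chunk data is not in raw format")
    payload = chunk[2:]  # skip the format byte and the padding byte
    count = len(payload) // 2
    if len(payload) % 2 or count % 4:
        raise ValueError("payload does not hold four equal 16-bit traces")
    samples = struct.unpack(f">{count}H", payload)  # assumed unsigned
    n = count // 4  # length of each individual trace
    return {base: list(samples[i * n:(i + 1) * n])
            for i, base in enumerate("ACGT")}
```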
+
+.nf
+.in +0.2i
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3      N  
+            +--+--+--+--  -  --+
+Hex values  | 0| base calls    |
+            +--+--+--+--  -  --+
+.in -0.2i
+.fi
+
+The first byte is 0 (raw format). This is followed by the base calls in ASCII
+format (one base per byte). The base call case and encoding set should be IUPAC
+characters [1].
+
+.nf
+.in +0.2i
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3  4  5  6  7       
+            +--+--+--+--+--+--+--+--+-     -+--+--+--+--+
+Hex values  | 0| padding|   data    |   -   |    data   |
+            +--+--+--+--+--+--+--+--+-     -+--+--+--+--+
+.in -0.2i
+.fi
+
+This chunk contains the mapping of base call (BASE) numbers to sample (SAMP)
+numbers; it defines the position of each base call in the trace data. The
+position here is defined as the numbering of the 16-bit positions held in the
+SAMP array, counting zero as the first value.
+
+The format is 0 (raw format) followed by three padding bytes (all 0). Next
+follows a series of 4-byte big-endian numbers specifying the position of each
+base call as an index into the sample arrays (when considered as a 2-byte
+array with the format header stripped off).
+
+Excluding the format and padding bytes, the number of 4-byte elements should
+be identical to the number of base calls. All sample numbers are counted from
+zero. No sample number in BPOS should be beyond the end of the SAMP arrays
+(although it should not be assumed that the SAMP chunks will be before this
+chunk). Note that the BPOS elements may not be totally in sorted order as
+the base calls may be shifted relative to one another due to compressions.
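+
+A reader for this chunk might be sketched in Python as below. The positions
+are read as unsigned 32-bit values, which is an assumption, and the function
+name and its arguments are invented for illustration. The sketch enforces
+the element count and the upper bound described above but deliberately does
+not require the positions to be sorted.
+
```python
import struct


def parse_bpos(chunk: bytes, num_bases: int, num_samples: int) -> list:
    """Return the sample index of each base call from raw BPOS chunk data."""
    if len(chunk) < 4 or chunk[0] != 0:
        raise ValueError("chunk data is not in raw format")
    payload = chunk[4:]  # skip the format byte and three padding bytes
    if len(payload) != 4 * num_bases:
        raise ValueError("expected one 4-byte position per base call")
    positions = list(struct.unpack(f">{num_bases}I", payload))
    if any(pos >= num_samples for pos in positions):
        raise ValueError("position lies beyond the end of the samples")
    return positions
```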
+
+.nf
+.in +0.2i
+Meta-data: none present
+
+Data:
+Byte number   0  1              N              4N
+            +--+--+--   -   --+--+----- -  -----+
+Hex values  | 0| call confidence | A/C/G/T conf |
+            +--+--+--   -   --+--+----- -  -----+
+
+(N == number of bases in BASE chunk)
+.in -0.2i
+.fi
+
+The first byte of this chunk is 0 (raw format). This is then followed by a
+series of confidence values for the called base. Next comes all the remaining
+confidence values for A, C, G and T excluding those that have already been
+written (ie the called base). So for a sequence AGT we would store confidences
+A1 G2 T3 C1 G1 T1 A2 C2 T2 A3 C3 G3.
+
+The purpose of this is to group the (likely) highest confidence values (those
+for the called base) at the start of the chunk followed by the remaining
+values. Hence if phred confidence values are written in a CNF4 chunk the first
+quarter of the chunk will consist of phred confidence values and the last three
+quarters will (assuming no ambiguous base calls) consist entirely of zeros.
+
+For the purposes of storage the confidence value for a base call that is not
+A, C, G or T (in any case) is stored as if the base call was T.
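+
+The storage order can be made concrete with a short Python sketch. The
+helper names cnf4_order and pack_cnf4 are hypothetical, and the sketch
+assumes every confidence value fits in one unsigned byte (0 to 255).
+
```python
def cnf4_order(bases: str) -> list:
    """Return (letter, base_index) pairs in the order CNF4 stores them."""
    # Calls other than A, C, G or T (in either case) are treated as T.
    called = [b.upper() if b.upper() in ("A", "C", "G", "T") else "T"
              for b in bases]
    order = [(letter, i) for i, letter in enumerate(called)]
    for i, letter in enumerate(called):
        order += [(other, i) for other in "ACGT" if other != letter]
    return order


def pack_cnf4(bases: str, conf: list) -> bytes:
    """Build raw chunk data; conf[i] maps 'A', 'C', 'G', 'T' to an int."""
    values = bytes(conf[i][letter] for letter, i in cnf4_order(bases))
    return bytes([0]) + values  # leading 0 is the raw format byte
```
+
+For the sequence AGT, cnf4_order yields the same ordering as the A1 G2 T3
+example given above.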
+
+The confidence values should be computed as "-10 * log10 (1-probability)".
+These values are then rounded to their nearest integral value.
+If a program wishes to store confidence values in a different range then this
+should be stored in a different chunk type.
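+
+As a worked example of the formula, the Python sketch below reads
+"probability" as the estimated probability that the base call is correct.
+That reading matches the usual phred convention but is stated here as an
+assumption, and the function name is invented.
+
```python
import math


def confidence(probability: float) -> int:
    """Return -10 * log10(1 - probability), rounded to the nearest integer."""
    if not 0.0 <= probability < 1.0:
        raise ValueError("probability must be in the range [0, 1)")
    return round(-10.0 * math.log10(1.0 - probability))
```
+
+Under this reading a call that is 99% likely to be correct has a confidence
+value of 20, and one that is 99.9% likely to be correct has a value of 30.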
+
+If this chunk exists it must exist after a BASE chunk.
+
+.nf
+.in +0.2i
+Meta-data: none present
+
+Data:	      0 
+            +--+-  -  -+--+-  -  -+--+-     -+-  -  -+--+-  -  -+--+--+
+Hex values  | 0| ident | 0| value | 0|   -   | ident | 0| value | 0| 0|
+            +--+-  -  -+--+-  -  -+--+-     -+-  -  -+--+-  -  -+--+--+
+.in -0.2i
+.fi
+
+This contains a series of "identifier\\0value\\0" pairs.
+
+The identifiers and values may be any length and may contain any data except
+the nul character. The nul character marks the end of the identifier or the
+end of the value. Multiple identifier-value pairs are allowable, with a double 
+nul character marking the end of the list.
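+
+One way to read such a list is sketched below in Python. The function name
+is hypothetical. Pairs are returned as a list rather than a dictionary so
+that repeated identifiers are preserved, and an empty identifier is taken as
+the end-of-list marker.
+
```python
def parse_text_chunk(chunk: bytes) -> list:
    """Return (identifier, value) byte-string pairs from raw TEXT data."""
    if chunk[:1] != b"\x00":
        raise ValueError("chunk data is not in raw format")
    pairs = []
    pos = 1  # skip the format byte
    # A nul where an identifier should start is the second nul of the
    # double-nul terminator.
    while pos < len(chunk) and chunk[pos] != 0:
        ident_end = chunk.index(b"\x00", pos)
        value_end = chunk.index(b"\x00", ident_end + 1)
        pairs.append((chunk[pos:ident_end], chunk[ident_end + 1:value_end]))
        pos = value_end + 1
    return pairs
```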
+
+Identifiers starting with bit 5 clear (uppercase) are part of the public ZTR
+spec. Any public identifier not listed as part of this spec should be
+considered as reserved. Identifiers starting with bit 5 set (lowercase) are
+for private use and no restriction is placed on these.
+
+See below for the text identifier list.
+
+.nf
+.in +0.2i
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3  4  5  6  7  8
+            +--+--+--+--+--+--+--+--+--+
+Hex values  | 0| left clip | right clip|
+            +--+--+--+--+--+--+--+--+--+
+.in -0.2i
+.fi
+
+This contains suggested quality clip points. These are stored as zero (raw
+data) followed by a 4-byte big endian value for the left clip point and a
+4-byte big endian value for the right clip point. Clip points are defined in
+units of base calls, with a value of 1 clipping the first base (so zero
+indicates no left clip and NumberOfBases+1 indicates no right clip).
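+
+The following Python sketch decodes the two clip points and applies them to
+a string of base calls. It interprets the retained region as the bases whose
+1-based position lies strictly between the left and right clip points. This
+is consistent with the values given above for "no clip" but is an
+interpretation rather than something this text states outright. The
+function names are invented.
+
```python
import struct


def parse_clip(chunk: bytes) -> tuple:
    """Return (left, right) clip points from raw CLIP chunk data."""
    if len(chunk) != 9 or chunk[0] != 0:
        raise ValueError("expected a raw format byte and two 4-byte values")
    _format, left, right = struct.unpack(">BII", chunk)
    return left, right


def apply_clip(bases: str, left: int, right: int) -> str:
    """Keep bases whose 1-based position p satisfies left < p < right."""
    # 1-based positions left+1 .. right-1 are the 0-based slice below.
    return bases[left:max(right - 1, left)]
```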
+
+.nf
+.in +0.2i
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3  4 
+            +--+--+--+--+--+
+Hex values  | 0|   CRC-32  |
+            +--+--+--+--+--+
+.in -0.2i
+.fi
+
+This chunk is always just 4 bytes of data containing a CRC-32 checksum,
+computed according to the widely used ANSI X3.66 standard. If present, the
+checksum will be a check of all of the data since the last CR32 chunk.
+This will include checking the header if this is the first CR32 chunk, and
+including the previous CR32 chunk if it is not. Obviously the checksum will
+not include checks on this CR32 chunk.
+
+.nf
+.in +0.2i
+Meta-data: none present
+
+Data:
+Byte number   0  1        N
+            +--+--   -   --+
+Hex values  | 0| free text |
+            +--+--   -   --+
+.in -0.2i
+.fi
+
+This allows arbitrary textual data to be added. It does not require an
+identifier-value pairing or any nul termination.
+
+_split()
+
+.SS "Text Identifiers"
+.PP
+
+These are for use in the TEXT segments. None are required, but if any of these
+identifiers are present they must conform to the description below. Much
+(currently all) of this list has been taken from the NCBI Trace Archive [2]
+documentation. It is duplicated here as the ZTR spec is not tied to the same
+revision schedules as the NCBI trace archive (although it is intended that any
+suitable updates to the trace archive should be mirrored in this ZTR spec).
+
+The Trace Archive specifies a maximum length of values. The ZTR spec does not
+have length limitations, but for compatibility these sizes should still be
+observed.
+
+The Trace Archive also states some identifiers are mandatory; these are marked
+by asterisks below. These identifiers are not mandatory in the ZTR spec (but
+clearly they need to exist if the data is to be submitted to the NCBI).
+
+Finally, some fields are not appropriate for use in the ZTR spec, such as
+BASE_FILE (the name of a file containing the base calls). Such fields are
+included only for compatibility with the Trace Archive. It is not expected that 
+use of ZTR would allow for the base calls to be read from an external file
+instead of the ZTR BASE chunk.
+
+[ Quoted from TraceArchiveRFC v1.17 ]
+
+.nf
+.in +0.2i
+Identifier      Size       Meaning			 Example value(s)
+----------      -----      ----------------------------  -----------------
+TRACE_NAME *      250      name of the trace             HBBBA1U2211
+                           as used at the center
+                           unique within the center
+                           but not among centers.
+                           
+SUBMISSION_TYPE *   -      type of submission
+                           
+CENTER_NAME *     100      name of center                BCM
+CENTER_PROJECT    200      internal project name         HBBB
+                           used within the center
+                           
+TRACE_FILE *      200      file name of the trace	 ./traces/TRACE001.scf
+                           relative to the top of
+                           the volume.
+                           
+TRACE_FORMAT *     20      format of the tracefile
+                           
+SOURCE_TYPE *       -      source of the read
+                           
+INFO_FILE         200      file name of the info file
+INFO_FILE_FORMAT   20        
+                           
+BASE_FILE         200      file name of the base calls
+QUAL_FILE         200      file name of the base calls
+                           
+                           
+TRACE_DIRECTION     -      direction of the read
+TRACE_END           -      end of the template
+PRIMER            200      primer sequence
+PRIMER_CODE                which primer was used
+                           
+STRATEGY            -      sequencing strategy
+TRACE_TYPE_CODE     -      purpose of trace
+                           
+PROGRAM_ID         100     creator of trace file         phred-0.990722.h
+                           program-version
+                           
+TEMPLATE_ID         20     used for read pairing         HBBBA2211
+                           
+CHEMISTRY_CODE       -     code of the chemistry         (see below)
+ITERATION            -     attempt/redo                  1
+                           (int 1 to 255)
+                           
+CLIP_QUALITY_LEFT          left clip of the read in bp due to quality
+CLIP_QUALITY_RIGHT         right " " " " "
+CLIP_VECTOR_LEFT           left clip of the read in bp due to vector
+CLIP_VECTOR_RIGHT          right " " " " "
+
+                           
+SVECTOR_CODE        40     sequencing vector used        (in table)
+SVECTOR_ACCESSION   40     sequencing vector used        (in table)
+CVECTOR_CODE        40     clone vector used             (in table)
+CVECTOR_ACCESSION   40     clone vector used             (in table)
+                           
+INSERT_SIZE          -     expected size of insert       2000,10000
+                           in base pairs (bp)
+                           (int 1 to 2^32)
+                           
+PLATE_ID            32     plate id at the center          
+WELL_ID                    well                          1-384
+
+
+SPECIES_CODE *       -     code for species
+SUBSPECIES_ID       40     name of the subspecies
+                           Is this the same as strain
+
+CHROMOSOME           8     name of the chromosome        ChrX, Chr01, Chr09
+                           
+                           
+LIBRARY_ID          30     the source library of the clone
+CLONE_ID            30     clone id                      RPCI11-1234 
+ 
+ACCESSION           30     NCBI accession number         AC00001
+                           
+PICK_GROUP_ID       30     an id to group traces picked
+                           at the same time.
+PREP_GROUP_ID       30     an id to group traces prepared
+                           at the same time
+                           
+                           
+RUN_MACHINE_ID      30     id of sequencing machine
+RUN_MACHINE_TYPE    30     type/model of machine
+RUN_LANE            30     lane or capillary of the trace
+RUN_DATE             -     date of run
+RUN_GROUP_ID        30     an identifier to group traces
+                           run on the same machine
+
+[ End of quote from TraceArchiveRFC ]
+
+More detailed information on the format of these values should be obtained
+from the Trace Archive RFC [2].
+.in -0.2i
+.fi
+
+_split()
+
+.SS "References"
+.PP
+
+[1] IUPAC: http://www.chem.qmw.ac.uk/iubmb/misc/naseq.html
+
+[2] http://www.ncbi.nlm.nih.gov/Traces/TraceArchiveRFC.html
+
diff --git a/manual/manpages-t.texi b/manual/manpages-t.texi
new file mode 100644
index 0000000..40ee4d7
--- /dev/null
+++ b/manual/manpages-t.texi
@@ -0,0 +1,120 @@
+@lowersections
+
+@c Put man pages in alphabetical order as there's little other sensible
+@c ordering to these.
+
+@page
+@node Man-convert_trace
+@chapter Convert_trace
+_include(convert_trace.1.texi)
+
+@page
+@node Man-copy_db
+@chapter Copy_db
+_include(copy_db.1.texi)
+
+@page
+@node Man-copy_reads
+@chapter Copy_reads
+_include(copy_reads.1.texi)
+
+@page
+@node Man-eba
+@chapter Eba
+_include(eba.1.texi)
+
+@page
+@node Man-extract_seq
+@chapter Extract_seq
+_include(extract_seq.1.texi)
+
+@page
+@node Man-extract_fastq
+@chapter Extract_fastq
+_include(extract_fastq.1.texi)
+
+@page
+@node Man-find_renz
+@chapter Find_renz
+_include(find_renz.1.texi)
+
+@page
+@node Man-getABIfield
+@chapter GetABIfield
+_include(getABIfield.1.texi)
+
+@page
+@node Man-get_comment
+@chapter Get_comment
+_include(get_comment.1.texi)
+
+@page
+@node Man-get_scf_field
+@chapter Get_scf_field
+_include(get_scf_field.1.texi)
+
+@page
+@node Man-hash_exp
+@chapter Hash_exp
+_include(hash_exp.1.texi)
+
+@page
+@page
+@node Man-hash_extract
+@chapter Hash_extract
+_include(hash_extract.1.texi)
+
+@page
+@node Man-hash_list
+@chapter Hash_list
+_include(hash_list.1.texi)
+
+@node Man-hash_tar
+@chapter Hash_tar
+_include(hash_tar.1.texi)
+
+@page
+@node Man-init_exp
+@chapter Init_exp
+_include(init_exp.1.texi)
+
+@page
+@node Man-makeSCF
+@chapter MakeSCF
+_include(makeSCF.1.texi)
+
+@page
+@node Man-make_weights
+@chapter Make_weights
+_include(make_weights.1.texi)
+
+@page
+@node Man-polyA_clip
+@chapter PolyA_clip
+_include(polyA_clip.1.texi)
+
+@node Man-qclip
+@chapter Qclip
+_include(qclip.1.texi)
+
+@page
+@node Man-screen_seq
+@chapter Screen_seq
+_include(screen_seq.1.texi)
+
+@page
+@node Man-tracediff
+@chapter TraceDiff
+_include(tracediff.1.texi)
+
+@page
+@node Man-trace_dump
+@chapter Trace_dump
+_include(trace_dump.1.texi)
+
+@page
+@node Man-vector_clip
+@chapter Vector_clip
+_include(vector_clip.1.texi)
+
+@raisesections
diff --git a/manual/manpages.texi b/manual/manpages.texi
new file mode 100644
index 0000000..b8ed872
--- /dev/null
+++ b/manual/manpages.texi
@@ -0,0 +1,41 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+@c %**start of header
+@setfilename manpages.info
+@settitle Manual Pages
+@setchapternewpage odd
+@iftex
+@afourpaper
+@end iftex
+@setchapternewpage odd
+@c %**end of header
+
+@set standalone
+include(header.m4)
+
+@titlepage
+@title Manual Pages
+@subtitle 
+@author 
+@page
+@vskip 0pt plus 1filll
+_include(copyright.texi)
+@end titlepage
+
+@node Top
+@ifinfo
+@top top-manpages
+@end ifinfo
+
+@raisesections
+_include(manpages-t.texi)
+
+_split()
+@node Index
+@unnumbered Index
+@printindex cp
+@lowersections
+
+@shortcontents
+@contents
+@bye
diff --git a/manual/manual.texi b/manual/manual.texi
new file mode 100644
index 0000000..e21c029
--- /dev/null
+++ b/manual/manual.texi
@@ -0,0 +1,133 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+
+@c %**start of header
+@setfilename manual.info@
+@setcontentsaftertitlepage
+@setshortcontentsaftertitlepage
+@settitle The Staden Package Manual
+@setchapternewpage odd
+@iftex
+@afourpaper
+@end iftex
+@c %**end of header
+
+include(header.m4)
+
+@finalout
+
+@titlepage
+@title The Staden Package Manual
+@subtitle Last update on @today{}
+@author James Bonfield, Kathryn Beal, Mark Jordan,
+@author Yaping Cheng and Rodger Staden
+@page
+@vskip 0pt plus 1filll
+_include(copyright.texi)
+@end titlepage
+
+@node Top
+@ifinfo
+@top Manual
+@end ifinfo
+
+@menu
+* Gap5::                Next generation assembly editing with Gap5
+* Gap4::                Sequence assembly and finishing using Gap4
+* Mutations::           Searching for mutations using pregap4 and gap4
+* Pregap4::             Preparing readings for assembly using pregap4
+* Read Clipping::       Marking poor quality segments of readings
+* Vector_Clip::         Marking vector segments using vector_clip
+* Trev::                Viewing and editing trace data using trev
+* Spin::                Analysing and comparing sequences using spin
+* Interface::           User Interface
+* Formats::             File Formats
+* Man Pages::           Manual Pages
+* Index::               General Index
+* File Index::          File Index
+* Variable Index::      Variable Index
+* Function Index::      Function Index
+@end menu
+
+@node Manual-preface
+@unnumberedsec Preface
+_include(preface-t.texi)
+
+@node Gap5
+@chapter Next generation assembly editing with Gap5
+@lowersections
+_include(gap5-t.texi)
+@raisesections
+
+@node Gap
+@chapter Sequence assembly and finishing using Gap4
+@lowersections
+_include(gap4-t.texi)
+@raisesections
+
+@node Mutations
+@chapter Searching for point mutations using pregap4 and gap4
+_include(mutations-t.texi)
+
+@c @node Registration
+@c @chapter Contig Registration
+@c [[_include(registration-t.texi)]]
+
+@node Pregap4
+@chapter Preparing readings for assembly using pregap4
+@lowersections
+_include(pregap4-t.texi)
+@raisesections
+
+@node Read Clipping
+@chapter Marking poor quality and vector segments of readings
+_include(read_clipping-t.texi)
+
+_include(vector_clip-t.texi)
+
+@node Trev
+@chapter Viewing and editing trace data using trev
+_include(trev-t.texi)
+
+@node Spin
+@chapter Analysing and comparing sequences using spin
+@lowersections
+_include(spin-t.texi)
+@raisesections
+
+@node Interface
+@chapter User Interface
+_include(interface-t.texi)
+
+@node Formats
+@chapter File Formats
+_include(formats-t.texi)
+
+@node Man Pages
+@chapter Man Pages
+_include(manpages-t.texi)
+
+@node References
+@unnumbered References
+_include(references-t.texi)
+
+@node Index
+@unnumbered General Index
+@printindex cp
+
+@node File Index
+@unnumbered File Index
+@printindex pg
+
+@node Variable Index
+@unnumbered Variable Index
+@printindex vr
+
+@node Function Index
+@unnumbered Function Index
+@printindex fn
+
+@shortcontents
+@contents
+
+@bye
diff --git a/manual/mini_manual.texi b/manual/mini_manual.texi
new file mode 100644
index 0000000..92248ad
--- /dev/null
+++ b/manual/mini_manual.texi
@@ -0,0 +1,78 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+
+@c %**start of header
+@setfilename mini_manual.info
+@setcontentsaftertitlepage
+@setshortcontentsaftertitlepage
+@settitle The Staden Package Mini-Manual
+@setchapternewpage odd
+@iftex
+@afourpaper
+@end iftex
+@c %**end of header
+
+include(header.m4)
+
+@finalout
+
+@titlepage
+@title The Staden Package Mini-Manual
+@subtitle Last update on @today{}
+@author James Bonfield, Kathryn Beal, Mark Jordan,
+@author Yaping Cheng and Rodger Staden
+@page
+@vskip 0pt plus 1filll
+_include(copyright.texi)
+@end titlepage
+
+@node Top
+@ifinfo
+@top Mini-Manual
+@end ifinfo
+
+@menu
+* Mini-Preface::             Preface
+* Mini-Gap4::                Sequence Assembly and Finishing Using gap4
+* Mini-Mutations::           Searching for mutations using pregap4 and gap4
+* Mini-Pregap4::             Preparing Readings for Assembly Using Pregap4
+* Mini-Trev::                Viewing Traces Using Trev
+* Mini-Spin::                Analysing Sequences Using Spin
+@end menu
+
+@node Mini-Preface
+@chapter Preface
+_include(preface-t.texi)
+
+@node Mini-Gap4
+@chapter Sequence Assembly and Finishing Using Gap4
+@lowersections
+_include(gap4_mini-t.texi)
+@raisesections
+
+@node Mutations
+@chapter Searching for point mutations using pregap4 and gap4
+_include(mutations-t.texi)
+
+@node Mini-Pregap4
+@chapter Preparing Readings for Assembly Using Pregap4
+@lowersections
+_include(pregap4_mini-t.texi)
+@raisesections
+
+@node Mini-Trev
+@chapter Viewing Traces Using Trev
+@lowersections
+_include(trev_mini-t.texi)
+@raisesections
+
+@node Mini-Spin
+@chapter Analysing Sequences Using Spin
+@lowersections
+_include(spin_mini-t.texi)
+@raisesections
+
+@c @shortcontents
+@contents
+
+@bye
diff --git a/manual/mut_contig_editor5.png b/manual/mut_contig_editor5.png
new file mode 100644
index 0000000..59d0da8
Binary files /dev/null and b/manual/mut_contig_editor5.png differ
diff --git a/manual/mut_contig_editor5.small.png b/manual/mut_contig_editor5.small.png
new file mode 100644
index 0000000..c82bc41
Binary files /dev/null and b/manual/mut_contig_editor5.small.png differ
diff --git a/manual/mut_contig_editor_dis5.png b/manual/mut_contig_editor_dis5.png
new file mode 100644
index 0000000..aae90ac
Binary files /dev/null and b/manual/mut_contig_editor_dis5.png differ
diff --git a/manual/mut_contig_editor_dis5.small.png b/manual/mut_contig_editor_dis5.small.png
new file mode 100644
index 0000000..e17a42d
Binary files /dev/null and b/manual/mut_contig_editor_dis5.small.png differ
diff --git a/manual/mut_mutscan_adaptive_noise_threshold.png b/manual/mut_mutscan_adaptive_noise_threshold.png
new file mode 100644
index 0000000..1a023d1
Binary files /dev/null and b/manual/mut_mutscan_adaptive_noise_threshold.png differ
diff --git a/manual/mut_mutscan_peak_alignment_threshold.png b/manual/mut_mutscan_peak_alignment_threshold.png
new file mode 100644
index 0000000..0686f3f
Binary files /dev/null and b/manual/mut_mutscan_peak_alignment_threshold.png differ
diff --git a/manual/mut_mutscan_peak_drop_threshold.png b/manual/mut_mutscan_peak_drop_threshold.png
new file mode 100644
index 0000000..2a0d3a2
Binary files /dev/null and b/manual/mut_mutscan_peak_drop_threshold.png differ
diff --git a/manual/mut_pregap4.png b/manual/mut_pregap4.png
new file mode 100644
index 0000000..235b2cd
Binary files /dev/null and b/manual/mut_pregap4.png differ
diff --git a/manual/mut_template_all.png b/manual/mut_template_all.png
new file mode 100644
index 0000000..409fea8
Binary files /dev/null and b/manual/mut_template_all.png differ
diff --git a/manual/mut_template_all.small.png b/manual/mut_template_all.small.png
new file mode 100644
index 0000000..8d3d5c9
Binary files /dev/null and b/manual/mut_template_all.small.png differ
diff --git a/manual/mut_template_reads.png b/manual/mut_template_reads.png
new file mode 100644
index 0000000..d3a93f8
Binary files /dev/null and b/manual/mut_template_reads.png differ
diff --git a/manual/mut_template_reads.small.png b/manual/mut_template_reads.small.png
new file mode 100644
index 0000000..d5878da
Binary files /dev/null and b/manual/mut_template_reads.small.png differ
diff --git a/manual/mut_template_reads_single.png b/manual/mut_template_reads_single.png
new file mode 100644
index 0000000..7a5e70a
Binary files /dev/null and b/manual/mut_template_reads_single.png differ
diff --git a/manual/mut_template_reads_single.small.png b/manual/mut_template_reads_single.small.png
new file mode 100644
index 0000000..3fe7c05
Binary files /dev/null and b/manual/mut_template_reads_single.small.png differ
diff --git a/manual/mut_traces_het.png b/manual/mut_traces_het.png
new file mode 100644
index 0000000..fd2180a
Binary files /dev/null and b/manual/mut_traces_het.png differ
diff --git a/manual/mut_traces_het.small.png b/manual/mut_traces_het.small.png
new file mode 100644
index 0000000..88ad091
Binary files /dev/null and b/manual/mut_traces_het.small.png differ
diff --git a/manual/mut_traces_point.png b/manual/mut_traces_point.png
new file mode 100644
index 0000000..855533d
Binary files /dev/null and b/manual/mut_traces_point.png differ
diff --git a/manual/mut_traces_point.small.png b/manual/mut_traces_point.small.png
new file mode 100644
index 0000000..86e79ad
Binary files /dev/null and b/manual/mut_traces_point.small.png differ
diff --git a/manual/mut_traces_positive.png b/manual/mut_traces_positive.png
new file mode 100644
index 0000000..9319244
Binary files /dev/null and b/manual/mut_traces_positive.png differ
diff --git a/manual/mut_traces_positive.small.png b/manual/mut_traces_positive.small.png
new file mode 100644
index 0000000..6aea930
Binary files /dev/null and b/manual/mut_traces_positive.small.png differ
diff --git a/manual/mutations-t.texi b/manual/mutations-t.texi
new file mode 100644
index 0000000..f02a5bc
--- /dev/null
+++ b/manual/mutations-t.texi
@@ -0,0 +1,686 @@
+ at menu
+* Mutation-Detection-Introduction:: Mutation Detection Introduction
+* Mutation-Detection-Methods:: Mutation Detection Programs
+* Mutation-Detection-Reference-Data:: Mutation Detection Reference Data
+* Mutation-Detection-Reference-Sequences:: Mutation Detection Reference Sequences
+* Mutation-Detection-Reference-Traces:: Mutation Detection Reference Traces
+* Using-The-Template-Display-With-Mutation-Data:: Using The Template Display With Mutation Data
+* Configuring-The-Gap4-Editor-For-Mutation-Data:: Configuring The Gap4 Editor For Mutation Data
+* Using-The-Gap4-Editor-With-Mutation-Data:: Using The Gap4 Editor With Mutation Data
+* Processing-Batches-Of-Mutation-Data-Trace-Files:: Processing Batches Of Mutation Data Trace Files
+* Processing-Batches-Of-Mutation-Data-Trace-Files-Using-Pregap4:: Processing Batches Of Mutation Data Trace Files Using Pregap4
+* Discussion-Of-Mutation-Data-Processing:: Discussion Of Mutation Data Processing
+ at end menu
+
+The original version of these methods was described in 
+@cite{James K Bonfield, Cristina Rada and Rodger Staden,
+"Automated detection of point
+mutations using fluorescent sequence trace subtraction", Nucleic Acids
+Res. 26, 3404-3409, 1998}. The more recent work has been done by Mark
+Jordan and James Bonfield with advice from Graham Taylor, Andrew
+Wallace, Will Wang and others.
+
+_split()
+@node Mutation-Detection-Introduction
+@section Introduction to mutation detection
+@cindex Mutation detection: introduction
+@cindex detection of mutations: introduction
+
+Our methods for detecting mutations are based on the alignment and comparison
+of the fluorescent traces produced by Sanger DNA sequencing. To use clinical
+terminology, samples from patients are compared to standard reference traces.
+Patient and reference traces should be produced using the same primers and 
+sequencing chemistry, ideally from both strands of the DNA. The data shown
+in the examples below is from exon 11 of the BRCA1 gene.
+
+The basic idea is illustrated in the following two figures which are screen
+dumps from our program gap4 (_fpref(Gap4-Introduction, Gap4 introduction, gap4)). The first shows
+a sample containing a point mutation and the second a sample with a heterozygous
+base position. The displays are bisected vertically: at the top left is the
+sample trace
+from one strand of the DNA, below that the reference trace for that 
+strand, and underneath the difference between these traces which is 
+obtained by 
+subtracting one from the other.
+On the right is corresponding data from the other DNA strand (shown 
+complemented).
+
+_lpicture(mut_traces_point,6in)
+
+Figure 1. Top and bottom strand differences for a point mutation.
+
+_lpicture(mut_traces_het,6in)
+
+Figure 2. Top and bottom strand differences for a heterozygous base.
+
+As can be seen, although no vertical scaling is performed the difference trace
+is quite flat or is consistently either above or below the mid-line, except 
+at the sites of mutations. Near these are strong peaks, but notice that only
+for the mutated base are there peaks both above and below the mid-line. The
+context effects caused by the mutation produce peaks only in one direction.
+
+It is perhaps necessary to point out that analysis of the traces is essential
+because base callers make mistakes: they can assign the wrong base types and
+also assign single bases where the DNA is heterozygous. An example of the latter
+can be observed in Figure 2: on one strand the base caller has assigned
+a "-" symbol at position 251, at least indicating uncertainty, but on the
+other strand it has assigned "T". The DNA is clearly heterozygous at this
+position. This means that simply looking for differences between patient
+sequences and reference sequences will cause point mutations and heterozygous
+bases to be missed (of course base calling errors will also create
+false differences).
+
+These trace displays alone are very useful for visual inspection of data 
+and are all
+some users want. However we also have programs which automatically analyse 
+the trace differences and tag the bases which have significant peaks as possible
+sites of mutation.
+
+
+Trace viewing is initiated from within the gap4 editor (_fpref(Editor, Editing in gap4, contig_editor)).
+Each record in the editor shows an individual reading with its number and name
+at the left. Negative numbers denote readings which have been complemented.
+Several sequences have special status. At the top is a sequence labelled with
+a letter S at the left edge. This is the reference sequence, here the EMBL
+entry HSLBRCA1 which covers the entirety of the BRCA1 gene. The numbering
+at the top of the display corresponds to positions in this reference sequence.
+The program has also coloured (green) all exons on the reference sequence.
+The bottom DNA sequence in the editor is labelled "CONSENSUS". For mutation
+detection work this sequence is forced to be identical to the reference.
+Below the CONSENSUS sequence is the amino acid sequence for the reference.
+This is calculated on the fly using the feature table of the reference 
+sequence and so translates only exons and in their correct reading frames.
+Two other sequences (near the top) are labelled R and F. These are the readings
+providing the reverse
+and forward reference traces for this segment of the data.
+
+_lpicture(mut_contig_editor5,6in)
+
+Figure 3. A set of aligned sequence readings displayed in the gap4 editor.
+
+At the very bottom of the editor is an information line which is used to
+display data about items touched by the mouse cursor. Here it is showing
+data about one of the positions tagged as possibly being heterozygous. 
+It includes the
+observed base types (G and A) and the scores achieved by the automated analysis.
+
+The editor can be set to show only differences between readings and the 
+reference; all matching bases appear as dots. For example, Figure 4
+shows the same data as Figure 3, but with the editor set to show differences,
+and the information line showing details about a possible mutation.
+
+
+_lpicture(mut_contig_editor_dis5,6in)
+
+Figure 4. An alternative view of aligned sequence readings in the gap4 editor.
+
+
+
+One column contains several bases tagged in  red, signifying possible
+heterozygotes, and some in orange denoting possible point mutations. 
+During visual inspection the program can be made to move the cursor from 
+one tag to the next and to display the aligned traces as shown
+above in Figures 1 and 2.
+
+It is also possible to have positive controls for displaying the trace 
+differences; i.e. reference traces which contain the mutation. In this case the traces
+appear as shown in Figure 5. Here the forward and reverse positive controls
+are shown to the right of the normal plots. In Figure 5 the positive control
+difference plots are quite flat, which in this case confirms
+the presence of the heterozygous base.
+
+_lpicture(mut_traces_positive,6in)
+
+Figure 5. Top and bottom strand differences and positive control for a heterozygous base.
+
+As mentioned above the package contains programs which can automatically
+compare the traces and their reference sequences. The output from these
+programs is the set of tags shown in the editor. Users can check the traces at
+these positions using the displays shown in Figures 1, 2 and 5; if necessary
+removing or adding tags. Alternatively users can rely entirely on visual
+inspection and create all tags themselves.
+
+Once all the mutations are correctly tagged the program can produce a report
+which includes the reading names, mutation positions relative to the reference
+sequence, the actual change, its effect, and the evidence. An example is shown
+below in Figure 6.
+
+@example
+@group
+@cartouche
+
+001321_11aF 33885T>Y (silent F) (strand - only)
+001321_11aF 34407G>K (expressed E>[ED]) (strand - only)
+001321_11cF 35512T>Y (silent L) (double stranded)
+001321_11cF 35813C>Y (expressed P>[PL]) (double stranded)
+001321_11dF 36314A>R (expressed E>[EG]) (double stranded)
+001321_11eF 36749A>R (expressed K>[KR]) (double stranded)
+001321_11eF 37313T>K (noncoding) (strand - only)
+000256_11eF 36749A>G (expressed K>R) (double stranded)
+
+@end cartouche
+@end group
+@end example
+Figure 6. How gap4 reports mutations.
+@cindex mutation report
+
+
+Here the first record is for reading 001321_11aF, position 33885, T changed
+to T and C (i.e. is heterozygous) to produce no amino acid change, with evidence coming only from
+the complementary strand. The last record is for reading 000256_11eF, position
+36749, A changed to G, producing an amino acid change K to R, with evidence
+from both strands of the sequence. The penultimate record denotes a 
+heterozygote in a noncoding region.
+
+_split()
+@node Mutation-Detection-Methods
+@subsection Mutation Detection Programs
+
+The software handles batches of trace data from sequencing instruments. It 
+performs
+all processing except base calling (although it can employ third party
+programs such as phred for this step). This includes file format
+conversions, quality clipping, scanning for mutations and heterozygotes,
+multiple sequence alignment, easy visual inspection of traces, production of
+reports, and the accumulation and storage of readings and traces. The
+software also handles the initialisation/configuration of standard
+reference files and databases for any project. The two main programs are
+pregap4 and gap4. Pregap4 (_fpref(Pregap4-Introduction, Pregap4 introduction, pregap4))
+prepares data for gap4 by automatically using
+a variety of smaller programs, including those used to search for mutations:
+mutscan (_fpref(Pregap4-Modules-Mutation Scanner, Mutation Scanner, t)).
+Gap4 (_fpref(Gap4-Introduction, Gap4 introduction, gap4))
+is used to store the aligned readings, to view the sequences and
+traces, and to produce a report listing the observed mutations.
+
+Any number of sequences can be processed in a single run, and for each
+individual patient sample the operation is generally
+performed in two steps. First, via pregap4, the traces are aligned and
+compared to the reference traces and any possible mutations or heterozygous
+bases marked. Secondly, the data is transferred into a gap4 database
+from where users can visually check the differences between the
+reference and patient traces.
+
+The program mutscan (_fpref(Pregap4-Modules-Mutation Scanner, Mutation
+Scanner,t)) can automatically compare patient and reference traces to
+find point mutations and heterozygous bases.  Users can set parameters
+which control the sensitivity of the algorithms (and hence
+determine the ratio of false negative to false positive results). Mutscan
+adds tags of type ``mutation'' or ``heterozygous'' to the patient
+files. The tags contain the numerical scores achieved at the site of
+the reported base changes, and they can be viewed via the gap4
+editor (_fpref(Editor, Editing in gap4, contig_editor)).  Mutscan is
+normally run via pregap4 (_fpref(Pregap4-Introduction, Pregap4
+introduction, pregap4)).
+
+The description of the programs given below is presented in reverse order of
+use, i.e. gap4 then pregap4, but first we give further details about the use
+of reference data.
+
+
+_split()
+@node Mutation-Detection-Reference-Data
+@subsection Mutation Detection Reference Data
+
+The mutation detection methods require reference traces and optionally 
+reference sequences. Reference traces are used for automatic mutation
+detection and for visual inspection of trace differences. Reference sequences
+are used in gap4 to provide a base numbering standard, and if required to
+provide feature table entries to control translation and mutation reporting.
+
+@menu
+* Mutation-Detection-Reference-Sequences:: Reference Sequences
+* Mutation-Detection-Reference-Traces:: Reference Traces
+@end menu
+
+
+@node Mutation-Detection-Reference-Sequences
+@subsection Reference Sequences
+@cindex reference sequences
+@cindex mutation detection: reference sequences
+
+Reference sequences are used in gap4 
+(_fpref(Gap4-Introduction, Gap4 introduction, gap4)).
+Here they can be used to define a
+numbering system independent of gaps introduced to produce alignments.
+The numbering can start at any point in the reference sequence. If the
+reference sequence is entered with a feature table the features are
+converted to tags and can be used to control translation of the sequence
+in the contig editor. For mutation detection work the reference sequence
+and feature table
+enable mutations to be
+reported using positions defined by the reference sequence, and also
+allow the effect of the mutations to be noted.
+Gap4 is able to store entries from the EMBL sequence library complete with
+their feature tables. These feature tables are converted to gap4
+database annotations (tags), which means that they can be selectively
+displayed in the template display and editor, and used to translate only
+the exons (in the correct reading frame). Obviously it may be useful to
+augment the feature tables with the sites of known polymorphisms or deleterious
+mutations so that they can be displayed in gap4 as landmarks.
+When it comes to producing a
+report of the observed mutations the feature table is used to work out
+if a mutation is expressed and if so what the amino acid change is.
+Additional tags can be created to specify the positions of the primers
+or restriction sites used to obtain data covering segments of the sequence.
+For any project the reference sequence need only be set up once. Either
+project databases can be started with the reference sequence already 
+configured or the reference can be assembled along with the reading data.
+The reference sequence can be designated (or reassigned) as follows.
+In pregap4 (_fpref(Pregap4-Introduction, Pregap4 introduction, pregap4))
+it can be named in the module "Reference Traces". In the
+gap4 editor it can be set by right clicking on its name. Once set it should
+appear labelled "S" at the left edge of the editor.
+
+@node Mutation-Detection-Reference-Traces
+@subsection Reference Traces
+@cindex reference traces
+@cindex mutation detection: reference traces
+
+Reference traces are used by the automatic mutation detection program
+mutscan (_fpref(Pregap4-Modules-Mutation Scanner, Mutation Scanner,
+t)), and by the trace difference display in the gap4
+editor (_fpref(Editor, Editing in gap4, contig_editor)).
+Ideally forward and reverse reference traces should be
+available and should be obtained using the same primers and sequencing
+chemistry as the patient data.  From the "Settings" menu of the editor
+the trace display can be set to "Auto-Diff Traces". Once this is
+activated, whenever the user double clicks on a base in the editor
+sequence display, not only is the reading's trace displayed, but also
+its designated reference trace plus the difference between them. If its
+complementary reading is available, its trace and reference trace and
+their differences are also displayed.  These
+trace displays and the editing cursor scroll in sync.
+
+_lpicture(mut_traces_het,6in)
+
+Top and bottom strand differences for a heterozygous base.
+
+
+The preferred way of assigning reference traces to readings is by use of
+"naming conventions"; that is to have a simple set of rules which
+control the names given to the trace files. It can be seen in the
+figures showing the editor that forward and reverse readings from the
+same patient have names with a common root but which end in either F or
+R. This both ties the two together (so the software knows which is the
+corresponding 
+complementary trace when the user double clicks on a reading) and also
+enables the association of readings and their reference traces. Once a
+convention has been adopted the rules can be defined for pregap4 by
+loading them via the "Load Naming Scheme" option in its File menu
+(_fpref(Pregap4-Naming, Pregap4 Naming Schemes, pregap4)). For
+any batch of readings the reference traces are defined within pregap4's
+"Reference Traces" module.  Note that this mode of operation, by
+allowing the specification of only one forward and one reverse trace,
+limits each batch of traces processed to those which correspond to a
+given pair of reference traces. The size of the batch is unlimited. 
+
+
+The alternative way of specifying the reference traces is to right click
+on their names in the editor. This also allows positive trace controls to be 
+specified (which is not possible in pregap4).
+
+
+
+_split()
+
+@node Using-The-Template-Display-With-Mutation-Data
+@subsection Using The Template Display With Mutation Data
+
+
+_lpicture(mut_template_all,6in)
+
+Figure 7. The template display showing the whole of the BRCA1 gene (exons in green).
+
+The view obtained from the Template display and shown in Figure 7 is not of 
+practical use but serves here to illustrate the overall
+arrangement of the data for our chosen example, the BRCA1 gene. This figure
+shows the entirety of the EMBL entry HSLBRCA1 with its exons marked
+in green. Only exon 11 has patient trace data stacked above it.
+
+_lpicture(mut_template_reads,6in)
+
+
+Figure 8. A zoomed-in version of the data shown in Figure 7. 
+
+Here we can see all the readings
+covering exon 11. Forward readings are light blue, reverse readings orange, 
+primers are 
+marked in yellow, mutations in red and orange.
+A common mutation appears in the leftmost set of readings and illustrates
+the value of using the template display for visualising the overall pattern 
+of the tagged mutations. 
+
+@node Configuring-The-Gap4-Editor-For-Mutation-Data
+@subsection Configuring The Gap4 Editor For Mutation Data
+
+The current version of the gap4 editor contains very many options that are
+not needed for mutation data. Given sufficient demand a version tailored for
+mutation studies could be produced. For now it might make it easier to understand
+the program if its origin as a genome assembly program is borne in mind.
+Here we outline the options and settings relevant to mutation studies.
+The assignment of reference sequence and traces is described above. From the
+editor they can be set by right clicking on the reading names.
+
+Gap4 enables segments of sequences to be annotated (or tagged). Each tag 
+has a type (eg primer) and each type has an associated colour. Each instance
+of a tag can include editable text. This text can be viewed and edited by right
+clicking on the tag and selecting "Edit tag", after which a text box will appear.
+Gap4 can display annotations/tags as background colour and the user can specify
+which tag types are shown. For mutation studies the following tag types may
+usefully be activated, and all others turned off. Using the "Set Active Tags"
+option in the "Settings" menu first click on "Clear all".  
+Then click on "primer".
+To add further types
+you must hold down the "Ctrl" key on the keyboard while clicking. 
+Now scroll down and click on "Mutation", "Heterozygous" and "FEATURE CDS".
+Add any others required, then click "OK".
+
+The following configurations are performed via the "Settings" menu.
+
+Gap4 has three consensus generation algorithms. When using a reference
+sequence it is convenient if the consensus shown in the editor is forced
+to be the same as the reference. This will be the case if either
+the "Weighted base frequencies" or the "Confidence values" consensus algorithm
+is being used. This selection is made using the "Consensus algorithm" option.
+
+Translations are shown in what gap4 refers to as the "Status" line.
+To enable automatic translation of the exons defined in the reference sequence,
+in the "Status Line" option set "Translate using feature tables".
+
+To enable automatic display of trace differences, in the "Trace Display" option
+set "Auto-Diff Traces".
+
+To show only the base differences between the readings and the consensus/reference, set
+"Highlight Disagreements". These can be shown by dots or colour.
+
+To show base confidence values set "Show reading quality" and also make sure
+that the value in the box labelled "Q" at the top left of the editor is set
+to 0 or greater.
+
+To force forward and reverse reading pairs to be shown in adjacent records in
+the editor set "Group readings by templates" (NB this assumes that an appropriate
+naming scheme has been used).
+
+If a reference sequence is assigned, the numbering at the top of the sequence
+will reflect the base positions in that sequence. Any pads in the reference
+sequence are ignored. If no reference sequence is assigned, the numbering will 
+ignore pads if the "Show unpadded positions" option is activated.
+
+At the bottom of the "Settings" menu is an option to "Save settings". Use of
+this will mean that the current configuration will be set automatically next
+time the editor is used (and hence the steps just described only need to be
+performed once).
+
+@node Using-The-Gap4-Editor-With-Mutation-Data
+@subsection Using The Gap4 Editor With Mutation Data
+
+The current version of the editor has a fixed width and a maximum
+height. If too many sequences are present at any position a vertical
+scrollbar on the right edge can be used to move them up and down. The
+CONSENSUS line will always be visible, but at present, the reference
+sequence is scrolled along with all the other sequences and so may
+disappear. Horizontal scrolling is achieved in the usual ways, plus by use of
+the >, >> and <, << buttons. The reading names can be moved left and right 
+using the scrollbar above them.
+
+Configure the editor as described above.
+
+The traces for readings (and their reverse) can be examined over their full
+length one at a time by simply double clicking on them then scrolling
+along. Any 
+mutations observed can be labelled by right clicking on the base in the editor 
+display and invoking
+the "Create tag" option. This brings up a dialogue box. At the top is a
+button marked "Type:comment"; clicking on this will bring up another dialogue
+with a list of all the tag types; choose the appropriate one ("Heterozygous"
+or "Mutation"). There are obviously many advantages to examining the traces
+like this using gap4. However, if the automated mutation detection methods
+are trusted, or used in a way that makes them trustworthy for the type of
+study being undertaken, then there are quicker ways of examining the data.
+
+The "Next Search" button at the top of the editor gives access to many types
+of search, one of which is "tag type". If this is selected a button appears
+labelled "Tag type COMM(Comment)". Clicking on this will bring up a dialogue
+showing all the available tag types. If the user selects, say "Mutation", 
+each time the "Next Search" button is used the program will position the
+editing cursor on the next
+mutation tag. Double clicking will automatically bring up the appropriate 
+traces as shown in Figures 1, 2 and 5
+(_fpref(Mutation-Detection-Introduction, Introduction to mutation detection,t)).
+The user can view the traces and if necessary alter the tag (eg delete it
+if it is a false positive).
+
+Once all the data has been checked and all mutations and heterozygous bases
+have been tagged a report can be generated using the "Report Mutations"
+option in the editor "Commands" menu. Note that it is also possible to
+simply report all differences between base calls and the reference, but the
+usual procedure is for the program to report all bases tagged as "Mutation"
+or "Heterozygous". Example output is shown above in Figure 6
+(_fpref(Mutation-Detection-Introduction, Introduction to mutation detection,t)).
+The report appears in the gap4 "Output window" which can
+be saved to disk by right clicking on the text and selecting "Output to
+disk".
+
+
+_split()
+@node Processing-Batches-Of-Mutation-Data-Trace-Files
+@subsection Processing Batches Of Mutation Data Trace Files
+
+It is not clear which is the best way of organising the data for the simplest
+and most efficient processing using the current programs, but 
+for now we make the following suggestions.
+
+We assume that the region of the DNA being studied has a standard set of 
+forward and reverse primer pairs covering all segments of interest and that
+a standard reference sequence in EMBL format is available.
+
+We recommend that batches of data from single primer pair combinations
+are processed separately, using separate temporary gap4 databases. 
+For example, exon 11 of BRCA1 can be covered by five
+pairs of forward and reverse primers and we suggest that
+batches of traces obtained from each of these primer pairs should be 
+processed using five gap4 databases.
+
+Each processing run should create a new database and should enter, not 
+only the
+new sets of patient data for that particular
+primer pair, but also the corresponding
+reference sequence and reference traces.
+
+Obviously when several primer pairs are needed to cover a given region of
+the DNA (eg for BRCA1) the same reference sequence would be used for
+all the primer pairs.
+
+An alternative to the above is to create a template database 
+for each primer pair which contains the data for the corresponding 
+forward and reverse 
+reference traces plus the fully annotated reference sequence.
+These template databases are copied to create a
+temporary database for each new batch of data for the given primer pair.
+
+Whichever of these two strategies is adopted
+each batch of new data is processed, analysed and 
+assembled into these temporary databases, inspected
+visually, and a mutation report generated.
+
+The use of separate temporary databases
+simplifies the assignment of reference traces and the use of the report
+generation function.
+
+_lpicture(mut_template_reads_single,6in)
+
+Figure 9. An overview of a database containing data for only one primer pair of BRCA1.
+
+For long term storage and to facilitate larger studies, the content of each
+of these temporary databases is then transferred to archive databases, after
+which the temporary databases are no longer needed. 
+The archive databases could be restricted to individual primer pairs
+or could accommodate data covering the whole of the reference sequence.
+
+@node Processing-Batches-Of-Mutation-Data-Trace-Files-Using-Pregap4
+@subsection Processing Batches Of Mutation Data Trace Files Using Pregap4
+
+All the data processing other than visual inspection of traces and report 
+generation is handled by the program pregap4
+(_fpref(Pregap4-Introduction, Pregap4 introduction, pregap4)). 
+Pregap4 achieves this by
+running a set of individual programs selected by the user. 
+
+_picture(mut_pregap4,6in)
+
+Figure 10. The pregap4 Configure Modules window showing a typical list of mutation data option selections.
+
+The "Configure Modules" window shown in Figure 10
+is used to select which programs
+to apply to a batch of data, and to configure their usage. On the left is a list
+of programs and options, with "x" showing the ones that have been selected.
+If the user clicks on an option name its name is given a blue background and
+its configurable parameters are shown in the right hand panel to enable the
+user to alter them. Here "Reference Traces" has been selected which 
+enables the user to set the reference traces and sequence. 
+
+The other selected options (marked with "x") are typical of the ones used for
+mutation detection studies. Below we describe the use of each plus a few 
+alternatives. All of the options are described in more detail elsewhere in
+our documentation; our intention here is to give an overview of their use
+during mutation studies.
+
+Note that the window labelled "Files to Process" is used to
+tell the program which files to process as a batch.
+
+@subsection Configuration Of Pregap4 For Mutation Data
+
+
+@table @var
+
+@item General Configuration
+
+This option allows the user to select whether the trace names used for
+the samples should be the same as their file names or should be the
+names stored inside the files.
+
+@item Phred
+
+Phred is a base caller which also assigns confidence values to each base.
+Generally the data passed to pregap4 has already been base called. However
+not all base callers assign confidence values and so it can be useful to
+apply phred or ATQA (which does not base call but does assign confidence values).
+Alternatively "Estimate Base Accuracies" can be applied which is a simple
+program for providing numerical values which reflect the signal to noise ratio
+for each base, and which can be used instead of confidence values.
+(Note that if quality clipping is used, its score thresholds depend on
+whether confidence values or eba values are used).
+
+@item Trace Format Conversion
+
+This option can be used to convert bulky files such as those of ABI to a
+compact format such as SCF or ZTR without loss of the data required for
+trace display.
+
+@item Initialise Experiment Files
+
+The input to gap4 and several of the other programs used here is a data
+format known as Experiment file format. This step, which has no
+configurable parameters, is essential for mutation data processing.
+
+@item Augment Experiment Files
+
+The section on Reference Traces outlined the use of "Naming Schemes" for
+associating pairs of forward and reverse readings, and for assigning
+reference traces. The naming scheme must be loaded from pregap4's File
+menu. "Augment Experiment Files" must be activated in order for the
+naming scheme to be applied. No parameters need be set.
+
+@item Quality Clip
+
+The reliability of the base calls varies with position along the sequence.
+Near to both ends the data is less reliable. The "Quality Clip" option
+trims the ends of the sequences by analysing their confidence values or
+accuracy estimates (if present) or the density of unknown bases in the
+sequence. By observing these "clip points" other processing programs
+will work more reliably.
+
+@item Reference Traces
+
+As explained above it is necessary to specify a reference trace (preferably
+one for each strand of the data if processing data from both strands). The
+reference sequence can also be set here.
+Note that
+even if our suggestion to preload the reference traces into the gap4
+database is followed, it is still necessary to specify them here for use
+by the
+mutation detection modules.
+
+@item Trace Difference
+
+This is the program which compares the patient and reference traces to
+search for possible mutations. It adds data to the experiment files
+to mark each predicted mutation, and this data will appear as tags in the gap4
+database. It can also create a new trace file containing the difference
+of the reference and the sample. The numerical parameters control the
+sensitivity of the algorithms, and hence the ratio between the numbers
+of false positive and negative results.
+
+@item Heterozygote Scanner
+
+This is the program which compares the patient and reference traces to
+search for possible heterozygous bases. It adds data to the experiment files
+to mark each predicted heterozygous base,
+and this data will appear as tags in the gap4
+database. The numerical parameters control the
+sensitivity of the algorithms, and hence the ratio between the numbers
+of false positive and negative results.
+
+@item Gap4 shotgun assembly
+
+In order to be able to report the positions of mutations relative to the reference
+sequence, and to be able to compare sets of samples from patients, it is
+necessary to perform multiple sequence alignment on the data. This is termed
+"assembly" and is usually performed by gap4, although other programs can be
+operated via pregap4. If following the suggestion to preload the reference
+sequence to a temporary database for each batch, supply the name of this
+database here. Otherwise a new database should be named and created
+from this option. (If this strategy is adopted make sure that the reference
+sequence and the reference traces are assembled!) The parameters
+that control the assembly process are described elsewhere.
+@end table
+
+Note that pregap4 has the facility to save its configuration and parameter
+settings. 
+This means that the current configuration will be set automatically next
+time the program is used (and hence the steps just described only need to be
+performed once). In addition pregap4 can be run non-interactively
+by typing a single line on the command line.
+Together, these two capabilities mean that only one line need be
+typed in order to process all subsequent batches of data (assuming the
+file names are reused, which is easy to arrange).
+
+
+_split()
+@node Discussion-Of-Mutation-Data-Processing
+@subsection Discussion Of Mutation Data Processing Methods
+
+At present pregap4 and gap4 clearly show their primary usage in the field
+of genome assembly, but versions tailored to mutation studies can be created once
+the requirements are agreed. 
+Ideally all processing should be controlled by a single program which once
+configured for any project should require users to provide only the project
+name - all other file names and parameters could be preset, and all processing,
+including archiving and backup, performed automatically, leaving the data 
+ready for visual inspection. 
+
+The automatic mutation and heterozygote detection
+programs work well on all the test data we have but now they
+require evaluation by external groups. Such analysis would
+enable us to improve the algorithms and to tune their parameters.
+At present we know that sometimes a base will be declared both as a mutation
+and as a heterozygous position when visual inspection shows that it is
+one or the other.
+
+There is still much that can be done overall to improve the methods, 
+but the text above
+summarises their status in July 2002.
+Although currently valuable for real scientific
+and clinical work they should perhaps be viewed as prototypes.
+
diff --git a/manual/mutations.texi b/manual/mutations.texi
new file mode 100644
index 0000000..1cb6eed
--- /dev/null
+++ b/manual/mutations.texi
@@ -0,0 +1,36 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename mutations.info
+ at settitle Mutation Detection
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title Mutation Detection
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-mutations
+ at end ifinfo
+
+ at raisesections
+_include(mutations-t.texi)
+
+_split()
+ at node Index
+ at unnumberedsec Index
+ at printindex cp
+
+ at contents
+ at bye
diff --git a/manual/notes-t.texi b/manual/notes-t.texi
new file mode 100644
index 0000000..baed903
--- /dev/null
+++ b/manual/notes-t.texi
@@ -0,0 +1,151 @@
+ at menu
+* Notes-Selector::              Selecting Notes
+* Notes-Editor::                Editing Notes
+* Notes-Special::               Special Note Types
+ at end menu
+
+ at cindex Notes
+
+A `Note' is an arbitrary piece of text which can be attached to any reading,
+any contig, or to a database as a whole. Each note also contains a note
+type, a creation date and a modification date. Any number of notes can
+be attached to each reading, contig or database.
+They can be considered as positionless tags.
+
+_split()
+ at node Notes-Selector
+ at section Selecting Notes
+ at cindex Notes: selecting
+ at cindex Edit notebooks
+
+The primary interface to creating, viewing and editing notes is the Note
+Selector window. This is accessible from a variety of places, 
+including anywhere a contig or reading name (or line in a graphical plot) is
+displayed, and also by using the "Edit Notebooks" command in the main gap4 Edit
+menu.
+
+_picture(notes.selector,6in)
+
+The Note Selector initially starts up showing the database notes (unless
+selected from a specific contig or reading plot). The picture above shows three
+notes attached to the main gap4 database record. These are of type @code{OPEN}
+and @code{RAWD}, both of which have a specific meaning to gap4, and type
+ at code{COMM}.
+
+The View Menu is used to see a list of notes for readings or contigs.
+If Reading Notes or Contig Notes is selected, the interface will ask for
+a reading or contig
+identifier by adding an extra line to the Note Selector Window, just beneath
+the menus. Typing one in and pressing return will then list the notes for
+that reading or contig. 
+
+To speed up selection, it is possible to use the right
+mouse button on the Contig Selector Window and in the contig rulers at the
+bottom of many plots (such as the Template Display), to select the "List Notes"
+option. This will start the Note Selector if it is not already running, and
+will direct it to display notes for the desired contig. Similarly, the
+right mouse button can be used 
+to pop up a menu from a reading in the Template Display or
+from a reading name in the Contig Editor.
+
+To edit a note, double click anywhere in the Note Selector on the line
+for the note.
+
+To delete a note, single click on the note line to highlight it and then select
+"Delete" from the Note Selector Edit menu. To delete several notes
+at once, first highlight a range by left clicking and dragging the mouse to
+mark a region of notes, and then use Delete. Alternatively notes may be
+deleted by double clicking to bring up the note editor and selecting Delete
+from the Note Editor File menu.
+
+To create a new note use the "New" command from the Edit menu. The note will
+be added to whatever data type is currently shown. To create a
+note for a particular contig, select that contig using the Contig Notes option
+in the View menu, and then use New to create a new note. New notes
+will have type @code{COMM} and the contents can be in any format.
+
+_split()
+ at node Notes-Editor
+ at section Editing Notes
+ at cindex Notes: editing
+
+Double clicking on a note in the Note Selector, or creating a new note, will
+bring up the Note Editor Window. This is a simple text editor,
+allowing use of keyboard arrow keys and the mouse to position and edit text.
+It also has keyboard bindings for many of the simple emacs movement commands.
+
+_picture(notes.editor,5.04167in)
+
+At the top of the Note Editor are three buttons. 
+The leftmost is the File menu
+which contains the "Save", "Delete" and "Exit" options. Next to this is the
+Type selector. This menu name displays the currently selected note type. To
+change the note 
+type select the appropriate type from the Type menu. The final button
+gives access to the online Help.
+
+Listed underneath the menu are the creation and modification dates. The
+creation date is fixed when a note is created.
+The modification date is adjusted every time a note is edited.
+(Simply viewing a note will not update the modification date, but saving
+changes to it will.)
+
+Underneath these is the note text itself. For convenience, the first line of
+each note is shown in the note selector window (so it can be helpful to
+make it identifiable).
+
+_split()
+ at node Notes-Special
+ at section Special Note Types
+ at cindex Notes: special types
+
+Several types of note have special meanings. These include the
+ at code{OPEN}, @code{CLOS} and @code{RAWD} note types.
+
+ at table @code
+ at cindex CLOS note type
+ at cindex OPEN note type
+ at item OPEN
+ at itemx CLOS
+Notes of type OPEN and CLOS
+should contain pure Tcl code. If they exist, they will be
+executed when the database is opened (@code{OPEN}) and closed (@code{CLOS}).
+Take great care in creating and editing a note with these types! The purpose
+is to allow configuration options to be attached to a database, and
+hence allow for different gap4 configurations to be used when a UNIX directory
+contains more than one database. In general use of the @file{.gaprc} file 
+(_fpref(Conf-Introduction, Options Menu, configure)) is probably safer.
+
+If there is a problem with a database containing a malformed @code{OPEN} or
+ at code{CLOS} note, it may be opened using @code{gap4 -no_exec_notes}. This will
+prevent gap4 from executing the @code{OPEN} and @code{CLOS} notes and so allow
+them to be fixed using the Note Editor.
+ at sp 1
+
+ at cindex RAWD note type
+ at cindex RAWDATA
+ at item RAWD
+This note specifies an alternative to the @code{RAWDATA} environment
+variable and should be set to be the full directory name for the
+location of the trace files for the database.
+If both the environment variable and the note exist then
+the note will take priority. The automatic use of this note can be disabled
+by using the @code{-no_rawdata_note} command line option to gap4.
+ at sp 1
+
+ at cindex INFO note type
+ at item INFO
+When created on a reading or a contig, this note may be displayed in the
+contig editor "information line"
+(_fpref(Editor-Info, The Editor Information Line, contig_editor))
+when the user moves the mouse over the editor sequence name list.
+ at end table
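+
+The precedence between the @code{RAWD} note and the @code{RAWDATA}
+environment variable described in the table above can be sketched as
+follows. This is only an illustration in Python, not gap4's own code: the
+function name and its arguments are hypothetical, and it assumes that both
+sources hold a single directory name.

```python
import os


def trace_directory(rawd_note=None, use_rawd_note=True, environ=None):
    """Return the trace directory, or None if neither source is set.

    rawd_note      text of the RAWD note, if the database has one
    use_rawd_note  False mimics starting gap4 with -no_rawdata_note
    environ        mapping to read RAWDATA from (defaults to os.environ)
    """
    env = os.environ if environ is None else environ
    # The note, when present and not disabled, wins over the variable.
    if use_rawd_note and rawd_note and rawd_note.strip():
        return rawd_note.strip()
    # Otherwise fall back to the environment variable (may be unset).
    return env.get("RAWDATA")
```

+With both sources set the note's directory is returned; passing
+@code{use_rawd_note=False} falls back to the environment variable.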
+
+It is possible to create your own types by editing the 
+ at file{$STADENROOT/tables/NOTEDB}
+file. The format is fairly self-explanatory, and is very similar to the
+ at file{GTAGDB} file. Each note type should consist of the long name followed by
+a colon and @code{id=}@i{4_letter_short_name}, optionally followed by
+ at code{dt="}@i{any default text for this note}@code{"}. Lines may be split at
+colons by adding a backslash to the end of the line. See the standard
+ at file{NOTEDB} file for examples.
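+
+As a rough illustration of the format just described, the Python sketch
+below parses entries of the form long name, colon, @code{id=}, and an
+optional @code{dt="..."} field. Several details are assumptions rather than
+facts taken from the standard file: that the @code{dt} field is separated
+from the @code{id} field by a colon, that the short name consists of four
+letters, digits or underscores, and that blank lines are ignored. The sample
+entries are invented for the example.

```python
import re

# Hypothetical entries; the second one is split at colons using backslashes.
SAMPLE = (
    "Comment: id=COMM\n"
    "Example note: \\\n"
    "    id=EXMP: \\\n"
    '    dt="some default text"\n'
)

ENTRY = re.compile(
    r'(?P<name>[^:]+):\s*id=(?P<id>\w{4})\s*'
    r'(?::\s*dt="(?P<dt>[^"]*)")?\s*$'
)


def parse_notedb(text):
    """Map each 4-character id to a (long name, default text or None) pair."""
    # A trailing backslash joins the next line onto the current one.
    joined = re.sub(r"\\\n\s*", "", text)
    entries = {}
    for line in joined.splitlines():
        line = line.strip()
        if not line:
            continue
        match = ENTRY.match(line)
        if match is None:
            raise ValueError(f"unrecognised entry: {line!r}")
        entries[match.group("id")] = (match.group("name").strip(),
                                      match.group("dt"))
    return entries
```

+Calling @code{parse_notedb(SAMPLE)} yields one entry per note type, keyed by
+the short name.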
diff --git a/manual/notes.editor.png b/manual/notes.editor.png
new file mode 100644
index 0000000..f6e4044
Binary files /dev/null and b/manual/notes.editor.png differ
diff --git a/manual/notes.selector.png b/manual/notes.selector.png
new file mode 100644
index 0000000..fa7b212
Binary files /dev/null and b/manual/notes.selector.png differ
diff --git a/manual/phrap-t.texi b/manual/phrap-t.texi
new file mode 100644
index 0000000..0c6a643
--- /dev/null
+++ b/manual/phrap-t.texi
@@ -0,0 +1,195 @@
+ at cindex Assembly: Phrap
+ at cindex Phrap Assembly
+ at cindex Green, Phil (Phrap)
+
+This mode of assembly uses the 
+Phrap program, developed by Phil Green.  For best
+Phrap and Gap4 integration a modified version (gcphrap) is required.  The main
+purpose of the change is to allow Phrap to support the Experiment File format
+for both input and output.  
+For this version please email Phil Green (phg@@u.washington.edu).
+
+A summary of the benefits of using the Gap4 Phrap interface follows.
+
+ at itemize @bullet
+ at item Naming conventions.
+Phrap has its own specific naming conventions. Failure to adhere to these
+will reduce the reliability of phrap. Using the modified Phrap to read
+Experiment Files allows use of any naming scheme as no information needs to be
+encoded in the reading name.  (Instead it is in other fields in the Experiment
+File.)
+
+ at item User interface.
+Phrap can now be used just as easily as Gap4's own assembly options by simply
+making Phrap available on the Gap4 menus.
+
+ at item Pregap4 compatible.
+As Phrap now reads Experiment files this means a common preprocessing program
+can be used for whichever assembly algorithm is chosen.
+ at end itemize
+
+Finally note that if Phred is used for base calling Gap4 will operate best
+with the "confidence" probability mode enabled. _fxref(Con-Calculation, The
+Consensus Calculation, calc_consensus)
+
+ at menu
+* Assembly-Before Phrap::       Before using Phrap
+* Assembly-Phrap Assemble::     Phrap Assembly
+* Assembly-Phrap Reassemble::   Phrap Reassembly
+* Assembly-Phrap CLI::          Phrap on the Command Line
+ at end menu
+
+ at node Assembly-Before Phrap
+ at subsection Before Using Phrap
+
+To get the most out of Phrap (and Gap4) a base caller which generates
+confidence values should be used. The Phred base caller (also written by Phil
+Green) is probably the most widely used example and has been extensively
+tested in conjunction with Phrap.
+
+There are two significant methods of running Phred. The first is to produce a
+ at file{.phd} file containing the new base calls and confidence values. The
+second is to produce a new SCF file. For use with Gap4 we recommend outputting
+SCF files as this will ensure a correct synchronisation between the trace
+displays and the sequence displays. In the following example phred is used to
+reassign the base calls for all traces held in the @file{chromat_dir}
+directory, writing new SCF files into the @file{new_chromat_dir} directory.
+
+ at example
+phred -id chromat_dir -cd new_chromat_dir
+ at end example
+
+These SCF files can then be passed into pregap in the same fashion as normal
+except for one additional @code{.pregaprc} parameter ("@code{do_eba=No}") to
+disable Pregap's own quality value assignment. Add this to your
+ at file{.pregaprc} file using @code{echo "do_eba=No" >> .pregaprc}.  If
+cross_match needs to be used, instead of the vector_clip program used in
+pregap, the Experiment File patch also allows cross_match to read (but
+currently not write) Experiment Files. This means that Pregap can be used with
+vector clipping disabled to generate the Experiment Files. cross_match can
+then be used to output clipping sequence in a fasta file which could be passed
+into Phrap.
+
+ at example
+cross_match fofn.passed vector.seq -minmatch 12 -minscore 20 -screen > scr.out
+ at end example
+
+The above example uses cross_match to analyse the pregap output of files
+listed in @file{fofn.passed}. This will produce a new file named
+ at file{fofn.passed.screen} which will be a Fasta format file rather than a new
+file of filenames. However this filename can be given to the Phrap interface
+in Gap4 instead of the requested file of filenames and Phrap will
+automatically detect that this is a fasta file.
+
+ at node Assembly-Phrap Assemble
+ at subsection Phrap Assembly
+
+The Phrap assemble command takes a file of Experiment File filenames and
+passes these into Phrap for assembly. The resulting assembly from Phrap is
+then automatically entered into the Gap4 database (implemented using the
+Directed Assembly command).
+
+_picture(phrap.assembly,3.1in)
+
+The "Destination directory" in the above dialogue is the location for Phrap to
+output the assembled data in Experiment File format. These files do not need
+to be kept unless further analysis of the assembly outside of Gap4 is
+required. Internally they are used as input to the Directed Assembly option.
+
+If you have specific Phrap parameters add them to the "Other phrap parameters"
+entry box. Please see the documentation that came with Phrap for a list of
+available parameters. If in doubt, just leave this blank.
+
+Next there is the option to perform quality clipping (_fpref(Clip-Quality,
+Quality clipping, clip)) and difference clipping (_fpref(Clip-Difference,
+Difference clipping, clip)). These options are useful for tidying up the Phrap
+assembly. To see the raw Phrap assembly turn both of these off. They may be
+selected from the Gap4 Edit menu at a later stage without the need to rerun
+phrap.
+
+Pressing OK will then start Phrap running. At the end of assembly you should
+be presented with output in the main text window and the Contig
+Selector window. Phrap will also have produced several files named after the
+input file of filenames. These have extensions @file{.contigs},
+ at file{.contigs.qual}, @file{.log} and @file{.singlets}. The Phrap
+documentation explains their contents. The main output of Phrap is also
+written to disk as a file named @file{stdout}, held in the destination
+directory.
+
+ at node Assembly-Phrap Reassemble
+ at subsection Phrap Reassembly
+
+Gap4 also provides a graphical interface for using Phrap to reassemble a set
+of sequences already held within a Gap4 database. It extracts readings from
+the database, reassembles them using Phrap, and enters the newly assembled
+readings back into the database.
+
+The dialogue is identical to that used in the Phrap Assemble command. For
+dialogue help please see _oref(Assembly-Phrap Assemble, Phrap Assembly).
+
+Edits to both sequences and confidence values are preserved. Annotations are
+also preserved although they may have their length changed if the reassembly
+results in adding or removing a pad within the annotated segment.
+
+Although it is not necessary to understand the individual steps taken during
+reassembly it is instructive and may answer some questions.
+
+ at itemize @bullet
+ at item Back up the database to version @code{~}.
+
+ at item "Extract Readings" on the list of readings we wish to reassemble. This
+dumps out the edited sequences, confidences and annotations (and more) to the
+Experiment Files.
+
+ at item "Disassemble Readings" to remove the old copies from the Gap4 database.
+This will break contigs if necessary (such as when reassembling a chunk within
+the middle of a contig).
+
+ at item Run phrap on our Experiment Files created in step 2.
+
+ at item "Directed Assembly" on the phrap output.
+ at end itemize
+
+ at node Assembly-Phrap CLI
+ at subsection Phrap on the Command Line
+
+If you wish to use the new Phrap within your own scripts you will probably
+need to understand how to use Phrap on the command line. The full Phrap
+documentation should come with the Phrap distribution. Here we just give an
+outline of the changes involved in handling Experiment files.
+
+Phrap automatically detects the file type for input sequences. If the contents
+of the file start with a '>' it is assumed to be a Fasta file and processing
+is identical to the previous Phrap version. Otherwise the file is assumed to
+be a file of Experiment File filenames.
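+
+The detection rule above amounts to looking at the first character of the
+input file. A minimal Python sketch of the same test is shown here for use
+in your own scripts; the function name is hypothetical and this is not
+Phrap's own code.

```python
def phrap_input_kind(path):
    """Return "fasta" if the file starts with '>', otherwise "fofn"."""
    with open(path, "rb") as handle:
        first = handle.read(1)
    # Anything that does not begin with '>' (including an empty file) is
    # treated as a file of Experiment File filenames.
    return "fasta" if first == b">" else "fofn"
```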
+
+With Experiment Files, the @code{PR}, @code{TN} and @code{CH} line types are
+used to hold information which Phrap normally requires in the reading name (in
+a Phrap specific format). We produce a new sequence name for phrap consisting
+of @i{phrap_name}@code{//}@i{file_name} where @i{phrap_name} is generated from
+the aforementioned Experiment File lines. This allows for minimal Phrap source
+changes whilst retaining complete user control over naming conventions. Phrap
+also reads the @code{SL} and @code{SR} line types, which specify the vector
+clips. Quality clip information is ignored.
+
+If the @code{-exp} parameter is given to Phrap, Phrap reads the next argument
+as a directory in which to write Experiment Files. Use "@code{-exp .}" to
+overwrite the input files, although this is not usually recommended. Without
+this parameter Phrap will output fasta or ace format files in the normal
+manner.
+
+The filenames of the Experiment Files are the same as the input file names.
+The Phrap reading name is processed to strip off the @i{phrap_name}@code{//}
+to obtain the original Experiment File name. This Experiment File is then read
+and all relevant information copied out to the newly created Experiment File.
+Annotations (@code{TG} lines) have their positions and lengths updated as
+required (due to padding). New quality left (@code{QL}) and quality right
+(@code{QR}) line types are created. Finally an Assembly Position (@code{AP}) line
+is added. This provides the necessary information for the Gap4 Directed
+Assembly option to enter the sequences.
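+
+The naming scheme described above, a Phrap specific name joined to the
+Experiment File name by @code{//}, can be illustrated with the short Python
+sketch below. The function names and the example names are hypothetical, and
+the sketch assumes that the Phrap specific part never itself contains
+@code{//}.

```python
SEPARATOR = "//"


def make_phrap_read_name(phrap_name, file_name):
    """Build the combined name that is handed to Phrap."""
    return f"{phrap_name}{SEPARATOR}{file_name}"


def experiment_file_name(read_name):
    """Strip the leading phrap_name// to recover the Experiment File name."""
    _, sep, file_name = read_name.partition(SEPARATOR)
    if not sep:
        raise ValueError(f"no {SEPARATOR!r} in read name: {read_name!r}")
    return file_name
```

+Round-tripping a name through both functions returns the original
+Experiment File name unchanged.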
+
+One result of this method is that it is possible to use cross_match with a set
+of Experiment files to output a screened fasta file and then to run Phrap on
+the fasta file producing Experiment Files. Despite the fact that Phrap was
+only given a fasta file, the original Experiment File contents are used in
+writing out the aligned Experiment Files.
diff --git a/manual/phrap.assembly.png b/manual/phrap.assembly.png
new file mode 100644
index 0000000..38f29f8
Binary files /dev/null and b/manual/phrap.assembly.png differ
diff --git a/manual/polyA_clip.1.texi b/manual/polyA_clip.1.texi
new file mode 100644
index 0000000..848e456
--- /dev/null
+++ b/manual/polyA_clip.1.texi
@@ -0,0 +1,53 @@
+ at cindex polyA_clip: man page
+ at cindex polyA clipping
+ at cindex polyT clipping
+
+ at unnumberedsec NAME
+
+polyA_clip --- Mark polyA and polyT heads and tails.
+
+ at unnumberedsec SYNOPSIS
+
+ at code{polyA_clip} [@code{-v}] [@code{-t}] [@code{-x} @i{min_length(0)}] 
+[@code{-p} @i{percent_cutoff(95)}] [@code{-w} @i{window_length(50)}] files...
+
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-v}
+     Enable verbose output. This outputs information on which files are
+     currently being clipped.
+
+ at item @code{-t}
+     Test mode. The SL and SR information is written to stdout instead of
+     being appended to the Experiment file.
+
+ at item @code{-x} @i{min_length}
+     Sequences which after clipping are shorter than min_length are reported.
+     
+ at item @code{-w} @i{window_length}
+     The length of the window that is slid along the sequence to analyse the
+     composition.
+
+ at item @code{-p} @i{percentage}
+     Windows containing this percentage of A or T bases are considered as 
+     polyA or polyT.
+ at end table
+
+ at unnumberedsec DESCRIPTION
+
+PolyA_clip searches the 5' and 3' ends of sequence readings for the presence
+of polyA and polyT heads and tails. It marks them using the SL and SR 
+experiment file records, and hence should be applied after quality clipping
+and sequence vector clipping. Any number of files can be processed in a single
+run. The algorithm is as follows. The user supplies window_length and  
+percentage. From MIN(QR,SR) slide the window left until 
+percent_A < percentage and percent_T < percentage.
+Then from the right edge of the window look left until
+a C or G is found. Mark this base SR. Do the equivalent for the 5' end and
+mark SL.
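+
+The 3' half of the algorithm described above can be sketched in Python as
+follows. This is an illustration only and not the polyA_clip source: the
+function name is hypothetical, positions are 0-based with @code{qr} and
+@code{sr} taken as exclusive right ends, the window moves one base at a
+time, and ambiguity codes receive no special treatment. All of these are
+assumptions.

```python
def clip_3prime(seq, qr, sr, window=50, percent=95.0):
    """Return the 0-based index of the base to mark as SR, or None."""
    seq = seq.upper()
    end = min(qr, sr, len(seq))  # start from MIN(QR,SR)
    # Slide the window left until neither A nor T reaches the cutoff.
    while end > 0:
        win = seq[max(0, end - window):end]
        pct_a = 100.0 * win.count("A") / len(win)
        pct_t = 100.0 * win.count("T") / len(win)
        if pct_a < percent and pct_t < percent:
            break
        end -= 1
    # From the right edge of the window look left for the first C or G.
    pos = end - 1
    while pos >= 0 and seq[pos] not in "CG":
        pos -= 1
    return pos if pos >= 0 else None
```

+For a read consisting of @code{ACGTACGTAC} followed by twenty A bases, with
+a window of 10 and a cutoff of 95, the sketch marks the final C of the
+leading segment (index 9).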
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp, ExperimentFile(4), formats)
diff --git a/manual/preface-t.texi b/manual/preface-t.texi
new file mode 100644
index 0000000..e702a03
--- /dev/null
+++ b/manual/preface-t.texi
@@ -0,0 +1,84 @@
+This manual describes the sequence handling and analysis software
+developed at the Medical Research Council Laboratory of Molecular
+Biology, Cambridge, UK, which has come to be known as the Staden
+Package.
+
+The vast bulk of work on the package was done at LMB within Rodger
+Staden's group, which over time has consisted of Tim Gleeson, Simon
+Dear, James Bonfield, Kathryn Beal, Mark Jordan and Yaping
+Cheng. Besides the group members a number of people have made
+important contributions; most notably including David Judge and John
+Taylor for feedback / tutorials and developing the Windows release
+respectively.
+
+Since mid-2003 the group in LMB no longer exists. The package became
+``open source'' and moved onto SourceForge in early 2004. The only
+active maintainer (James Bonfield) now works at the Wellcome Trust
+Sanger Institute. The new package homepage may be found at
+_uref(http://staden.sourceforge.net/) and the SourceForge project page
+is at _uref(https://sourceforge.net/projects/staden/).
+
+The focus of the development since 1990 has been to produce improved
+methods for processing the data for large scale sequencing projects,
+and this is reflected in the scope of the package: the most advanced
+components (trev, prefinish, pregap4 and gap4) are those used in that
+area.  Nevertheless the package also contains a program (spin) for the
+analysis and comparison of finished sequences. The latter also
+provides a graphical user interface to EMBOSS.
+
+Since the LMB group disbanded it has become necessary to reduce the
+scope of further development, so active work is primarily being
+directed to the Gap4 program.
+
+Gap4 performs sequence assembly, contig ordering based on read pair
+data, contig joining based on sequence comparisons, assembly checking,
+repeat searching, experiment suggestion, read pair analysis and contig
+editing. It has graphical views of contigs, templates, readings and
+traces which all scroll in register. Contig editor searches and
+experiment suggestion routines use confidence values to calculate the
+confidence of the consensus sequence and hence identify only places
+requiring visual trace inspection or extra data. The result is
+extremely rapid finishing and a consensus of known accuracy.
+
+Pregap4 provides a graphical user interface to set up the processing
+required to prepare trace data for assembly or analysis. It also
+automates these processes. The possible processes which can be set up
+and automated include trace format conversion, quality analysis,
+vector clipping, contaminant screening, repeat searching and mutation
+detection.
+
+Trev is a rapid and flexible viewer and editor for ABI, ALF, SCF and
+ZTR trace files.
+
+Prefinish analyses partially completed sequence assemblies and
+suggests the most efficient set of experiments to help finish the
+project.
+
+Tracediff and hetscan automatically locate mutations by comparing
+trace data against reference traces. They annotate the mutations found
+ready for viewing in gap4.
+
+
+Spin analyses nucleotide sequences to find genes, restriction sites,
+motifs, etc. It can perform translations, find open reading frames,
+count codons, etc. Many results are presented graphically and a
+sliding sequence window is linked to the graphics cursor.  Spin also
+compares pairs of sequences in many ways.  It has very rapid dot
+matrix analysis, global and local alignment algorithms, plus a sliding
+sequence window linked to the graphical plots. It can compare nucleic
+acid against nucleic acid, protein against protein, and protein
+against nucleic acid.
+
+
+The manual describes, in turn, each of the main programs in the
+package: gap4, and then pregap4 and its associated programs such as
+trev, and then spin.  This is followed by a description of the
+graphical user interface, the ZTR, SCF and Experiment file formats
+used by our software, UNIX manpages for several of the smaller
+programs, and finally a list of papers published about the software.
+The description for each of the programs includes an introductory
+section which is intended to be sufficient to enable people to start
+using them, although in order to get the most from the programs, and
+to find the most efficient ways of using them we recommend that the
+whole manual is read once. The mini-manual is made up from the
+introductory sections for each of the main programs.
diff --git a/manual/pregap4-t.texi b/manual/pregap4-t.texi
new file mode 100644
index 0000000..1722765
--- /dev/null
+++ b/manual/pregap4-t.texi
@@ -0,0 +1,4764 @@
+_include(pregap4_org-t.texi)
+_include(pregap4_mini-t.texi)
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Intro-Menus
+ at section Pregap4 Menus
+
+The main window of pregap4 contains File, Modules, Information source
+and Options menus.
+
+ at c --------------------------------------------------------------------------
+ at node Pregap4-Intro-Menus-File
+ at subsection Pregap4 File menu
+
+The File menu includes functions for setting the files to process,
+loading configuration files and naming schemes, including configuration
+components, starting processing and exiting.
+
+ at itemize @bullet
+ at item Set Files to Process (_fpref(Pregap4-Files, Specifying Files to Process, pregap4))
+ at item Load New Config File (_fpref(Pregap4-Config-Files, Using Config Files, pregap4))
+ at item Load Naming Scheme (_fpref(Pregap4-Naming, Pregap4 Naming Schemes, pregap4))
+ at item Include Config Component (_fpref(Pregap4-Components, Pregap4
+Components, pregap4))
+ at item Save All Parameters (in all modules) (_fpref(Pregap4-Modules, Configuring Modules, pregap4))
+ at item Save All Parameters (in all modules) to: (_fpref(Pregap4-Modules, Configuring Modules, pregap4))
+ at item Save Module List (_fpref(Pregap4-Modules, Configuring Modules, pregap4))
+ at item Exit
+ at end itemize
+
+ at c --------------------------------------------------------------------------
+ at node Pregap4-Intro-Menus-Modules
+ at subsection Pregap4 Modules menu
+
+The pregap4 Modules menu contains options for adding and configuring
+modules, and running pregap4.
+
+ at itemize @bullet
+ at item Add/Remove Modules (_fpref(Pregap4-ModAdd, Adding and Removing
+Modules, pregap4))
+ at item Configure Modules (_fpref(Pregap4-Modules, Configuring Modules, pregap4))
+ at item Select all modules
+ at item Deselect all modules
+ at end itemize
+
+ at c --------------------------------------------------------------------------
+ at node Pregap4-Intro-Menus-Information
+ at subsection Pregap4 Information source menu
+
+The Information source menu contains options for specifying how the
+information required for the experiment files is to be obtained. These
+menu options can also be entered from the "Augment Experiment Files"
+module.
+
+ at itemize @bullet
+ at item Simple Text Database (_fpref(Pregap4-Database-Simple, Simple text
+Database))
+ at item Experiment File Line Types (_fpref(Pregap4-Database-LineTypes,
+Experiment File Line Types))
+ at end itemize
+
+ at c --------------------------------------------------------------------------
+ at node Pregap4-Intro-Menus-Options
+ at subsection Pregap4 Options menu
+
+The Options menu contains options for setting fonts and colours and
+defining the style of the user interface.
+
+ at itemize @bullet
+ at item Set Fonts (_fpref(Pregap4-Config-Fonts Colours, Fonts and Colours, pregap4))
+ at item Set Colours (_fpref(Pregap4-Config-Fonts Colours, Fonts and Colours, pregap4))
+ at item Compact Window Style (_fpref(Pregap4-Config-Window Styles, Window Styles, pregap4))
+ at item Separate Window Style (_fpref(Pregap4-Config-Window Styles, Window Styles, pregap4))
+ at end itemize
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Files
+ at chapter Specifying Files to Process
+ at cindex Files, specifying
+
+Pregap4 needs to be given a list of files to process. These files can be
+binary trace files (in ABI, ALF, SCF, CTF or ZTR format), Experiment Files,
+FASTA, or
+plain text. The files to process do not need to all be in the same format.
+FASTA files will be converted to Experiment files.
+
+_picture(pregap4_files,6in)
+
+Referring to the figure above,
+the "Files to Process" dialogue can be brought up from the File menu, or just
+by pressing the appropriate tab when in @code{compact_win} mode.
+
+On the left hand side we have the current list of files to process. This list
+can be edited simply by clicking with the mouse and typing as normal. This
+only edits Pregap4's temporary copy of this list and does not modify the
+contents of any file of filenames that the list was obtained from.
+
+On the right side of the panel is the pregap4 output filename prefix, the
+output directory name, and several buttons. The filename prefix is used when
+Pregap4 needs to create files for its own use, both for temporary and not so
+temporary files. For example after processing there may be @i{prefix}.passed,
+ at i{prefix}.failed files. The prefix defaults to @file{pregap} until a file of
+filenames is loaded, in which case it switches to the last used file of
+filenames.  All files will be created within the output directory, regardless
+of where the input files reside. The output directory defaults to the current
+directory or to the last used input directory.
+
+The buttons allow selection of the files to process. The "Add files" button
+will bring up a file browser, which will allow one or more files to be
+selected. Pressing Ok on the file browser will then add the selected files to
+the "List of files to process" panel on the left side of the pregap4 window.
+The "Add file of filenames" button may be used to select a list of files whose
+filenames have been written to a `file of filenames'. The list of files to
+process may be edited within pregap4, allowing filenames to be
+added or removed. The "Clear current list" button will
+remove all filenames from the list. Both the "Add files" and "Add file of
+filenames" buttons append their selections to the list of files to process, so
+to replace the current list the "Clear current list" button must first be
+used. Finally the "Save current list to..." button may be used to produce a
+new file of filenames, containing the combined list of files to process.
+
+_ifdef([[_unix]],[[It is possible to specify the files to process on the
+command line at the time of starting up Pregap4. If we have a file named
+ at file{files} containing three filenames: @file{xb54a3.s1SCF},
+ at file{xb54b12.r1LSCF} and @file{xb54b12.r1SCF}, then the first two command
+lines below are equivalent.
+
+ at example
+pregap4 -fofn files
+pregap4 xb54a3.s1SCF xb54b12.r1LSCF xb54b12.r1SCF
+pregap4 *SCF
+ at end example
+
+If the only files ending in @file{SCF} in this directory were the three listed
+above then the last command above would also be equivalent to the other two.
+]])
+
+@c --------------------------------------------------------------------------
+_split()
+@node Pregap4-Running
+@chapter Running Pregap4
+@cindex Run command
+
+When the Run button or Run command (File menu) is used, pregap4 starts
+processing the files using the selected modules and their configurations. If
+the configuration is invalid an error message will be produced. For example
+the following may be written to the error window, and the configure modules
+panel will be selected with the problematic module automatically highlighted.
+
+@example
+Fri 10 Jul 10:04:25 1998 Run: Module sequence_vector_clip needs configuring
+@end example
+
+Assuming that the configuration is correct, the processing will start and
+output will be sent to the output window as progress is made. The progress
+within each module is shown by a full stop (@code{.}) for each correctly
+processed sequence and an exclamation mark (@code{!}) for each failed
+sequence.
+
+_picture(pregap4_textwin,6in)
+
+The text output window above shows the early processing stages of 20
+sequences. When finished pregap4 will produce a report containing information
+from each module and the final list of passed and failed sequences. For
+example:
+
+@example
+- Report Production -
+Passed files:
+    xb54a3.s1.exp (xb54a3.s1SCF.gz) : type EXP
+    xb54b12.r1L.exp (xb54b12.r1LSCF.gz) : type EXP
+    xb54b12.r1.exp (xb54b12.r1SCF.gz) : type EXP
+    xb54b12.s1.exp (xb54b12.s1SCF.gz) : type EXP
+    xb54c3.s1.exp (xb54c3.s1SCF.gz) : type EXP
+
+Failed files:
+    xb54g5.s1.exp (xb54g5.s1SCF.gz) 'screen_vector_clip:  sequence too short'
+
+- Report from 'Augment Experiment Files' -
+xb54a3.s1.exp : added fields SF CF SC SP TN ST PR SI CH.
+xb54b12.r1L.exp : added fields SF CF SC SP TN ST PR SI CH.
+xb54b12.r1.exp : added fields SF CF SC SP TN ST PR SI CH.
+xb54b12.s1.exp : added fields SF CF SC SP TN ST PR SI CH.
+xb54g5.s1.exp : added fields SF CF SC SP TN ST PR SI CH.
+xb54c3.s1.exp : added fields SF CF SC SP TN ST PR SI CH.
+
+- Report from 'Tag Repeats' -
+xb54a3.s1.exp : no repeat found.
+xb54b12.r1L.exp : no repeat found.
+xb54b12.r1.exp : no repeat found.
+xb54b12.s1.exp : no repeat found.
+xb54c3.s1.exp : no repeat found.
+
+
+                       ***   Processing finished   ***
+@end example
+
+@pindex .passed
+@pindex .failed
+@pindex .log
+@pindex .report
+
+The lists of passed and failed files are written to @i{prefix}.passed and
+@i{prefix}.failed, where @i{prefix} is the output filename prefix specified in
+the "Files to Process" panel. The reports are written to @i{prefix}.report.
+The passed and failed files contain the most recent filename associated with
+each sequence. So if a sequence fails early on it could be listed as something
+like @code{xb54a3.s1SCF.gz}, and if it fails later it will be listed like
+@code{xb54a3.s1.exp}. This is because it is the final filename
+which is important for later processing, such as for assembly into gap4.
+
+A @i{prefix}.log file is also created containing a list of passed files,
+failed files, and the filename history for each file (the intermediates will
+still exist). The format of the passed section is "@i{filename}
+@code{(}@i{file_type}@code{) PASSED}". The format of the failed section is
+"@i{filename} @code{(}@i{file_type}@code{) ERROR: }@i{error message}". The
+format of the file history lines is a series of "@i{filename}
+@code{(}@i{file_type}@code{)}" segments separated by "@code{<-}", with the
+original filename listed to the right. Filenames containing Tcl
+meta-characters may be `escaped' using curly braces or backslashes. (The Tcl
+@code{subst} command may be used to generate the original name.) An example of
+a log file follows. This was produced with the command line
+"@code{pregap4  "Sample 671"  WT5.exp  zf89a2.s1.scf  xb56e5.s1.scf}".
+
+@example
+[passed files]
+ha59a6.s1.exp (EXP) PASSED
+WT5.exp (EXP) PASSED
+xb56e5.s1.exp (EXP) PASSED
+
+[failed files]
+zf89a2.s1.exp (UNK) ERROR: screen_vector_clip:  sequence too short
+
+[passed file history]
+ha59a6.s1.exp (EXP) <- ha59a6.s1.scf (SCF) <- @{Sample 671@} (ABI)
+WT5.exp (EXP)
+xb56e5.s1.exp (EXP) <- xb56e5.s1.scf
+
+[failed file history]
+zf89a2.s1.exp (UNK) <- zf89a2.s1.scf
+@end example
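+
+As an illustration only, the log layout described above could be read back
+with a short Python script such as the following. The function names
+@code{parse_segment}, @code{parse_status} and @code{parse_log} are
+hypothetical and are not part of pregap4, and the sketch leaves any Tcl
+escaping (curly braces or backslashes) in the filenames untouched.
+
```python
import re

# Hypothetical helpers, not part of pregap4. A segment is "filename (TYPE)";
# the "(TYPE)" part is optional because some history entries omit it.
SEGMENT = re.compile(r"^(?P<name>.+?)(?: \((?P<type>[A-Z]+)\))?$")


def parse_segment(text):
    """Return (filename, file_type); file_type is None when absent."""
    match = SEGMENT.match(text.strip())
    return match.group("name"), match.group("type")


def parse_status(line):
    """Parse a '[passed files]' or '[failed files]' line.

    Returns (filename, file_type, error_message); the message is None
    for passed files.
    """
    if line.endswith(" PASSED"):
        name, file_type = parse_segment(line[:-len(" PASSED")])
        return name, file_type, None
    head, _, message = line.partition(" ERROR: ")
    name, file_type = parse_segment(head)
    return name, file_type, message.strip()


def parse_log(text):
    """Split the log text into its bracketed sections."""
    sections = dict()
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = []
        elif current is None:
            continue  # ignore anything before the first section header
        elif current.endswith("history"):
            # Segments are separated by "<-", newest name first.
            chain = [parse_segment(part) for part in line.split("<-")]
            sections[current].append(chain)
        else:
            sections[current].append(parse_status(line))
    return sections
```
+
+Each history entry is returned with the most recent filename first and the
+original filename last, matching the order used in the log file.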
+
+Some modules may also keep their own separate records, such as an assembly
+log. Where this is the case, it will be explained in the help specific to that
+module.
+
+After running pregap4 it is time to either assemble the data (if this was not
+done using pregap4) or to edit it. If the data has already been assembled with
+Pregap4 then you will need to start up gap4 and use `Open Database'. Otherwise
+one of the gap4 assembly functions should be used, with the
+@i{filename_prefix}@code{.passed} file. For more information on this see the
+Gap4 manual.
+
+_ifdef([[_unix]],[[
+@c --------------------------------------------------------------------------
+_split()
+@node Pregap4-Batch
+@chapter Non Interactive Processing
+@cindex Batch mode
+@cindex Non-interactive processing
+
+Pregap4 can also be used in a non-interactive environment (in "batch mode").
+For this to work it is necessary for pregap4 to already have a valid
+configuration file with sufficient information for pregap4 to successfully
+complete the required processing steps. The best way to generate this is to take
+a small set of sequences that you wish to work on and to run Pregap4
+interactively on these.  Once pregap4 can run and complete and you know that
+the configuration for that set is correct, use "Save all parameters (in all
+modules)" from the Modules menu to save the current configuration to disk.
+
+It will then be possible to run pregap4 with the @code{-nowin} argument on the
+full set of sequences and on any other set that requires the same processing
+steps. It will be necessary to specify the files to process on the command
+line. Then pregap4 will execute the processing steps sending output normally
+seen in the output window to the standard output (@code{stdout}).
+
+It is possible to have different configuration files for different data sets.
+These can be specified on the command line using @code{-config}.
+
+@example
+pregap4 -nowin -config clip_only.conf -fofn files > files.output
+@end example
+
+The above example runs pregap4 in batch mode on the files listed in
+@file{files}. It will use the previously defined configuration contained in
+@file{clip_only.conf} and will save text output to @file{files.output}. After
+this processing, the files @file{files.passed}, @file{files.failed},
+@file{files.log} and @file{files.report} will also have been produced.
+
+If you wish to manually or automatically (via your own script) generate the
+Pregap4 configuration file instead of using the GUI, please see
+_oref(Pregap4-ManualConfig, Low Level Pregap4 Configuration).
+]])
+_ifdef([[_unix]],[[
+@c --------------------------------------------------------------------------
+_split()
+@node Pregap4-CLI
+@chapter Command Line Arguments
+@cindex Command line arguments
+@cindex Arguments, command line
+
+Typically for interactive use of pregap4 users need to type nothing more than
+@code{pregap4}. For the more inquisitive user the following command line
+options are available.
+
+@table @asis
+@cindex -config
+@pindex pregap4.config
+@item @code{-config} @i{config_file}
+This specifies an alternative configuration file to Pregap4. The default
+configuration file is named @file{pregap4.config}. It is valid to specify a
+filename which does not yet exist. This file will be used for both reading
+and writing configurations.
+@sp 1
+
+@cindex -fofn
+@item @code{-fofn} @i{filename}
+Specifies a file of filenames for processing. Multiple uses of @code{-fofn}
+are allowed and they are additive. The default prefix for pregap4 output files
+is derived from the last specified file of filenames.
+@sp 1
+
+@cindex -nowin
+@cindex -no_win
+@item @code{-nowin}
+@itemx @code{-no_win}
+These are synonyms. They prevent pregap4 from displaying its graphical user
+interface and force it to automatically "Run".  This argument should only be
+used when driving pregap4 as a batch job. It is necessary to create a valid
+configuration file before using this option.
+@sp 1
+
+@cindex -win_compact
+@item @code{-win_compact}
+This uses a compact GUI mode with the main dialogues listed as separate tabs
+in the main window. This can be made the default display style by selecting
+"Compact Window Style" in the Options menu. The command line option overrides
+this default.
+_oxref(Pregap4-Config-Window Styles, Window Styles).
+@sp 1
+
+@cindex -win_separate
+@item @code{-win_separate}
+This uses a GUI mode using separate top level windows in a similar manner to
+Gap4 and Spin. This can be made the default display style by selecting
+"Separate Window Style" in the Options menu. The command line option
+overrides this default.
+_oxref(Pregap4-Config-Window Styles, Window Styles).
+
+@cindex --
+@item @code{--}
+Indicates the end of pregap4 options. This is used in case filenames start with
+a minus sign, to distinguish filenames from possible pregap4 options.
+@end table
+
+Any other arguments on the command line are assumed to be filenames. For
+example the following command executes pregap4 in batch mode using
+configuration file @file{batchX} on all files in the current directory named
+@i{something}@code{.ZTR}.
+
+@example
+pregap4 -config batchX -nowin -- *.ZTR
+@end example
+
+The @code{--} in the above example is to guard against the unlikely case where
+@code{*.ZTR} could match a filename starting with a minus sign.
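+
+For scripted use, a command line of this form could be assembled along the
+following lines. This is only a sketch in Python: the helper names
+@code{pregap4_batch_command} and @code{run_batch} are hypothetical, and only
+the @code{-config}, @code{-nowin}, @code{-fofn} and @code{--} arguments
+documented above are used. The exit status of pregap4 is not described here,
+so the sketch simply passes it back to the caller.
+
```python
import subprocess


def pregap4_batch_command(config, files=(), fofns=()):
    """Build the argument list for a non-interactive pregap4 run."""
    argv = ["pregap4", "-config", config, "-nowin"]
    for fofn in fofns:
        argv += ["-fofn", fofn]  # multiple -fofn arguments are additive
    if files:
        # "--" ends the options, so a filename starting with a minus
        # sign cannot be mistaken for an option.
        argv += ["--"] + list(files)
    return argv


def run_batch(config, files=(), fofns=(), output="pregap4.output"):
    """Run pregap4 and save the text normally shown in the output window."""
    with open(output, "w") as out:
        done = subprocess.run(pregap4_batch_command(config, files, fofns),
                              stdout=out)
    return done.returncode
```
+
+Unlike the shell, Python does not expand patterns such as @code{*.ZTR}, so
+the list of files would need to be built first, for example with
+@code{glob.glob}.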
+]])
+
+@c --------------------------------------------------------------------------
+_split()
+@node Pregap4-Config
+@chapter Configuring the Pregap4 User Interface
+@cindex Configuring pregap4
+
+@menu
+* Pregap4-Config-Fonts Colours::        Fonts and Colours
+* Pregap4-Config-Window Styles::        Window Styles
+@end menu
+
+@node Pregap4-Config-Fonts Colours
+@section Fonts and Colours
+@cindex Fonts
+@cindex Colours
+
+The pregap4 Options menu contains options for modifying the fonts and colours
+used. These options are common to many programs and so are documented
+elsewhere.
+_fxref(UI-Fonts, Font Selection, interface)
+_fxref(UI-Colour, Colour Selector, interface)
+
+@node Pregap4-Config-Window Styles
+@section Window Styles
+@cindex Window styles
+@cindex Styles of windows
+
+Pregap4 supports two styles of windowing. The default method is a compact
+mode, with the alternative being "separate" mode - similar to gap4 and
+spin.
+
+_picture(pregap4_separate,5.54167in)
+
+This is the "separate" window style. Here the main window is always visible,
+with commands in the main window bringing up new windows. In the picture above
+the configure window can be seen on top of the main window.
+
+@page
+The second style is "compact" mode.
+
+_picture(pregap4_compact,6in)
+
+In the compact picture above the most common top level windows are "pages" in
+a tabbed notebook.
+_ifdef([[_unix]],[[This is similar to some window styles in the Microsoft
+Windows desktop.]])
+The benefit is greatly reduced use of screen space and quicker access to
+controls, but the text output window is no longer permanently visible.
+
+To switch styles select either the "Compact Window Style" or the "Separate
+Window Style" command from the Options menu.
+
+@c --------------------------------------------------------------------------
+_split()
+@node Pregap4-Modules
+@chapter Configuring Modules
+@cindex Configuring modules
+@cindex Modules, configuring
+
+@menu
+* Pregap4-Modules-General::              General Configuration
+* Pregap4-Modules-EBA::                  Estimate Base Accuracies
+* Pregap4-Modules-Phred::                Phred
+* Pregap4-Modules-ATQA::                 ATQA
+* Pregap4-Modules-ConvertTrace::         Trace Format Conversion
+_ifdef([[_unix]],[[* Pregap4-Modules-Compress Traces::      Compress Trace Files]])
+* Pregap4-Modules-Initexp::              Initialise Experiment Files
+* Pregap4-Modules-Augment::              Augment Experiment Files
+* Pregap4-Modules-Quality Clip::         Quality Clip
+* Pregap4-Modules-Sequence Vector::      Sequencing Vector Clip
+* Pregap4-Modules-Cross_match::          Cross_match
+* Pregap4-Modules-Cloning Vector::       Cloning Vector Clip
+* Pregap4-Modules-Screen Vector::        Screen for Unclipped Vector
+* Pregap4-Modules-Screen::               Screen Sequences
+* Pregap4-Modules-Blast::                Blast Screen
+* Pregap4-Modules-Interactive Clip::     Interactive Clipping
+* Pregap4-Modules-Extract Seq::          Extract Sequence
+* Pregap4-Modules-RepeatMasker::         RepeatMasker
+* Pregap4-Modules-Repeats::              Tag Repeats
+* Pregap4-Modules-Mutations::            Mutation Detection
+* Pregap4-Modules-Reference Traces::     Reference Traces
+* Pregap4-Modules-Trace Difference::     Trace Difference
+* Pregap4-Modules-Mutation Scanner::     Mutation Scanner
+* Pregap4-Modules-Gap4 Assembly::        Gap4 Shotgun Assembly
+* Pregap4-Modules-Cap2 Assembly::        Cap2 Assembly
+* Pregap4-Modules-Cap3 Assembly::        Cap3 Assembly
+* Pregap4-Modules-FakII Assembly::       FakII Assembly
+* Pregap4-Modules-Phrap Assembly::       Phrap Assembly
+* Pregap4-Modules-Enter Assembly::       Enter Assembly into Gap4
+* Pregap4-Modules-Email::                Email
+* Pregap4-Modules-Old Cloning Vector::   Old Cloning Vector Clip - Obsolete
+* Pregap4-Modules-ABI2SCF::              ALF/ABI to SCF Conversion - Obsolete
+@end menu
+
+The "Configure Modules" dialogue is available from the Modules menu or, when
+using the compact window style, by pressing the Configure Modules tab.
+
+This dialogue contains the main interface through which most of the
+user's interaction with pregap4
+will be performed. The left side of the display contains a list of the
+currently loaded modules. One module in this list will be highlighted.
+The right side of the display shows the configuration panel for this
+highlighted module.
+
+_picture(pregap4_config,6in)
+
+The module list shown on the left consists of a series of module names and
+their status. The @code{[ ]} and @code{[x]} strings at the left of each name
+show the module's "enable status", indicating whether the module is
+enabled; crossed boxes are enabled modules. The highlighting is another
+indication of whether the module is enabled. The "General Configuration"
+module is mandatory and cannot be disabled. The text to the right of the
+module name indicates whether the module has been given all the parameters
+needed for it to process. This will be one of "ok" (all configuration options
+have been filled in), "-" (no configuration options exist for this module),
+"edit" (further configuration is required) or blank (this module is
+disabled).
+
+The "enable status" can be toggled by left clicking on the "@code{[ ]}" to the
+left of the module name. The enable status can be written to the current
+Pregap4 configuration file using the "Save Module List" or "Save All
+Parameters" commands in the Modules menu. Left clicking anywhere on a module
+name in the module list will switch the pane on the right side of the window
+to display any available parameters for this module. Not all modules will have
+parameters to configure.
+
+For modules that do have parameters, the top line of the configuration panel
+will contain a button labelled "Save these parameters". This button will save
+all parameters for this module to the configuration file. Note that this is
+not the same as the "Save all parameters" option in the main Modules menu,
+which saves all parameters in all modules.
+
+@c --------------------
+_split()
+@node Pregap4-Modules-General
+@section General Configuration
+@cindex General configuration module
+@pindex init.p4m
+
+@table @strong
+@item Description
+This is a mandatory module. It is always the first module executed and will
+not appear in the "Add/Remove Modules" list. Its purpose is to set general
+parameters which affect several other modules. At present it contains just two
+items.
+@sp 1
+
+@item Option: Get entry names from trace files
+Many trace formats include storage for a sequence "sample name". This option
+controls whether or not the sample name should be used instead of deriving the
+name from the filename. If "No" is answered to this question then the sequence
+sample name will be generated by removing the filename suffix; for example
+@code{xb55a2.s1.ztr} will become @code{xb55a2.s1}.
+@end table
+
+
+@c --------------------
+_split()
+@node Pregap4-Modules-EBA
+@section Estimate Base Accuracies
+@cindex Estimate base accuracies module
+@pindex eba.p4m
+
+@table @strong
+@item Description
+This module analyses the traces at each base call to estimate a confidence
+value for the called base. It does this by simply looking at the area
+underneath the trace for the called base and dividing this by the highest area
+under the trace for the three uncalled bases. This is a very simplistic
+statistic which should ideally only be used for measuring the average
+reliability of the entire sequence rather than any individual base. If another
+program (eg Phred, or ATQA) is available then this should be used in
+preference. From the 2002 release the eba values are
+normalised to the phred scale (this was achieved by comparing the
+original eba values and phred values for 4.6 million base calls of
+Sanger Centre data).
+
+There are no adjustable parameters for this module.
+@end table
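+
+The ratio described above can be sketched in a few lines of Python. This is
+an illustration under stated assumptions rather than the eba code itself: the
+function name @code{area_ratio} is hypothetical, the four areas are assumed to
+have been measured already, and the later normalisation to the phred scale is
+not described in this section and so is left out.
+
```python
def area_ratio(areas, called):
    """Return the raw area ratio for one base call.

    areas  -- mapping of base letter to the area under that base's trace
              at the called position (assumed to be measured elsewhere)
    called -- the base letter that was called
    """
    others = [area for base, area in areas.items() if base != called]
    highest_other = max(others)
    if highest_other == 0:
        # No signal at all from the uncalled bases.
        return float("inf")
    return areas[called] / highest_other
```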
+
+
+@c --------------------
+_split()
+@node Pregap4-Modules-Phred
+@section Phred
+@cindex Phred module
+@pindex phred.p4m
+
+Phred is not included as part of the Staden Package. It is available from Phil
+Green.
+_uref(http://www.genome.washington.edu/UWGC/analysistools/phred.htm)
+
+@table @strong
+@item Description
+Phred is an ABI base caller. @cite{Ewing, B. and Green, P. 1998. Base-Calling
+of Automated Sequencer Traces Using Phred. II. Error Probabilities. Genome
+Res. 8, 186-194}. It will analyse the chromatogram data to produce new base
+calls. For each base it assigns a confidence value indicating how likely this
+base call is to be correct. These confidence values are significantly more
+reliable than those produced by eba and they are compatible with the Phrap
+assembly program and the gap4 consensus algorithm.
+@sp 1
+Phred can process either ABI or SCF files, but pregap4 will automatically
+convert all input to SCF format first. This means that the phred pregap4
+module will be able to process any supported trace format.
+@sp 1
+There are no adjustable parameters for this module.
+@end table
+
+@c --------------------
+_split()
+@node Pregap4-Modules-ATQA
+@section ATQA
+@cindex ATQA module
+@pindex atqa.p4m
+
+ATQA is not included as part of the Staden Package. It is available from
+its developers, Daniel H. Wagner, Associates, at
+_uref(http://www.wagner.com/).
+
+@table @strong
+@item Description
+The ATQA program estimates confidence values for each called base in a
+lane file. A confidence value corresponds to the probability that the
+associated base call is incorrect by the formula
+
+@example
+score = -10*log10(probability of error).
+@end example
+
+(This is the same log scale used by Phred.) In fact, the ATQA program
+computes four confidence values for each called base. The first three
+values correspond to the probabilities of substitution, insertion, and
+deletion errors, respectively. The fourth value is a combined score
+representing the probability that the called base is an error of any
+sort. Currently, only the combined confidence value is used by Staden
+package software.
+@sp 1
+Unlike Phred, the ATQA program does not produce base calls. Rather, it
+assigns confidence values to each base call in a lane file based on
+features of the trace data. The current version of the ATQA program is
+tuned to base calls made by the ABI base caller and to trace data from
+the ABI 377 sequencer.
+@sp 1
+Although ATQA can read ABI files, it will not create SCF files in such
+circumstances. However pregap4 will always convert any non SCF trace files
+into SCF format before running ATQA, so an explicit conversion is not
+required.
+
+@end table
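+
+The formula above converts between error probabilities and confidence values
+in both directions. The following Python lines are a small worked example,
+not ATQA code, and the two function names are hypothetical.
+
```python
import math


def score_from_error(p_error):
    """Confidence value for a given probability of error (0 < p_error <= 1)."""
    return -10.0 * math.log10(p_error)


def error_from_score(score):
    """Probability of error implied by a confidence value."""
    return 10.0 ** (-score / 10.0)
```
+
+On this scale a confidence value of 20 corresponds to an error probability of
+1 in 100, and a value of 30 to 1 in 1000.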
+
+@c --------------------
+_split()
+@node Pregap4-Modules-ConvertTrace
+@section Trace Format Conversion
+@cindex Trace Format Conversion
+@cindex convert_trace
+@pindex convert_trace.p4m
+
+@table @strong
+@item Description
+This converts files between the various supported trace formats. At present it
+can read ABI, ALF, SCF, CTF and ZTR formats, and can write SCF, CTF and ZTR.
+Of these formats, ZTR typically represents the smallest size and is fast due
+to its own internal compression routines.
+_ifdef([[_unix]],[[For a table of file sizes coupled
+with external compression tools, see
+_oref(Pregap4-Modules-Compress Traces, Compress Trace Files).]])
+
+
+The Trace Format Conversion may also be used to apply some simple editing
+methods to the traces. These include down-scaling (to reduce file size),
+background subtraction, and amplitude normalisation.
+@sp 1
+
+@item Option: Output format
+This selects the format for the output trace files. If the output format is
+the same as the input format then the input files will not be
+overwritten. Instead new files will be produced with names based on the input
+names, generated by replacing (for example) ".scf" with "..scf".
+The available output format choices are ZTR, CTF and SCF.
+
+@item Option: Downscale sample range
+@itemx Option: Range
+These select whether to reduce the scale used to store the amplitudes, and if
+so to what range. ABI files typically range from 0 to 1600 (which is
+approximately 11-bit data). Shrinking this down to 0 to 255 (8-bit) will
+usually be visually comparable as the trace displays in Gap4 and Trev are
+typically smaller than 255 pixels high, although if the Y scale is increased
+differences will still be detectable. The purpose of this is to further reduce
+file size.
+@sp 1
+
+@item Option: Subtract background
+This attempts to eliminate the trace background by a simple technique of
+deducting the lowest of the four amplitudes from all of the four
+amplitudes. This is an overly crude method which should only be used when the
+preprocessing software included on the sequencing manufacturer's instruments
+has not been used.
+
+@item Option: Normalise amplitudes
+This uses a sliding window to compute the average signal strengths. From this
+it scales the data to try and provide, on average, more uniform peak heights
+along the trace. Again this is a very simplistic method and so it is not
+advisable unless there is a problem with the sequencing manufacturer's own
+software.
+
+@item Option: Delete temporary files
+When pregap4 can determine that a trace file is neither the original input nor
+the final output then it is considered to be a temporary file which may be
+suitable for deletion. An example would be using Phred with ABI files and then
+converting to ZTR. Phred produces SCF files and so we have ABI to SCF to ZTR,
+in which the SCF files may be safely deleted.
+@end table
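+
+The background subtraction and down-scaling options described above can be
+pictured with the following Python sketch. It is an assumption-laden
+illustration, not the code used by the conversion program: the function names
+are hypothetical, each trace point is taken to be a tuple of four integer
+amplitudes, and the down-scaling is assumed to be a simple linear rescale
+applied only when the data exceeds the requested range.
+
```python
def subtract_background(samples):
    """Deduct the lowest of the four amplitudes from all four, per point.

    samples -- list of (a, c, g, t) integer amplitude tuples
    """
    return [tuple(value - min(point) for value in point) for point in samples]


def downscale(samples, new_max=255):
    """Linearly rescale amplitudes so that the largest becomes new_max.

    Data that already fits within 0..new_max is returned unchanged.
    """
    peak = max(max(point) for point in samples)
    if peak <= new_max:
        return list(samples)
    return [tuple(value * new_max // peak for value in point)
            for point in samples]
```
+
+With a range of 0 to 255 each amplitude fits in a single byte, which is where
+the saving in file size comes from.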
+
+_ifdef([[_unix]],[[
+@c --------------------
+_split()
+@node Pregap4-Modules-Compress Traces
+@section Compress Trace Files
+@cindex Compress Trace Files module
+@pindex compress_trace.p4m
+
+@table @strong
+@item Description
+All the programs that access trace files can uncompress on-the-fly. This module
+may be used to compress existing trace files. No compression programs are
+supplied with the package, although there are several good public domain
+compression programs available.
+
+Note that using the ZTR trace format will typically yield better compression
+than using any of the supported compression programs on an SCF
+file. Attempting to compress a ZTR file using this module will not decrease
+the size (and may even increase the file size). Also see
+_oref(Pregap4-Modules-ConvertTrace, Trace format conversion).
+@sp 1
+
+@item Option: Compression method
+This selects the algorithm used for compressing trace files. The choices are
+None, Compress (the standard UNIX compression program), Gzip (from GNU) and
+Bzip versions 1 and 2. These are listed in ascending order of compression
+ratios, with Bzip (either version) giving the best compression. Generally Gzip
+is the best supported program and is not too far behind Bzip.
+
+The following table provides comparisons with compression sizes on ABI and SCF
+files. (Note that the ABI file typically holds 12 bit trace data.) The
+sizes listed are the average length, in bytes, of the 96 (originally ABI 3700
+trace) files that the tests were performed on. The SCF and ZTR files were
+recalled using phred, so also contain confidence values. For comparison, the
+ZTR file sizes are also shown.
+
+@sp 1
+@example
+@strong{File type}               @strong{Size in bytes}
+abi                     189150
+compressed abi          104681
+gzipped abi             87789
+bzipped abi             62032
+16-bit scf              82124
+compress 16-bit scf     26574
+gzipped 16-bit scf      25957
+bzipped 16-bit scf      18877
+16-bit ztr              17185
+8-bit scf               45967
+compress 8-bit scf      15000
+gzipped 8-bit scf       14718
+bzipped 8-bit scf       12659
+8-bit ztr               11155
+@end example
+@end table
+]])
+
+@c --------------------
+_split()
+@node Pregap4-Modules-Initexp
+@section Initialise Experiment Files
+@cindex Initialise Experiment files module
+@pindex init_exp.p4m
+
+@table @strong
+@item Description
+This module creates an Experiment File from a trace file (of any format). It
+uses the @code{init_exp} program to write @code{ID}, @code{EN}, @code{LN},
+@code{LT}, @code{AQ} and @code{SQ} Experiment File line types. This module is
+mandatory for many subsequent modules, such as vector clipping/screening and
+assembly.
+
+There are no adjustable parameters for this module.
+@end table
+
+
+@c --------------------
+_split()
+@node Pregap4-Modules-Augment
+@section Augment Experiment Files
+@cindex Augment Experiment files module
+@pindex augment_exp.p4m
+
+@table @strong
+@item Description
+This module adds further data to the Experiment File, with the additional
+information typically obtained from external sources. Such information could
+be the data required by the vector clipping program, or template
+information needed by gap4.
+
+The parameters for this module may be configured by using the "Simple Text
+Database" (_fpref(Pregap4-Database-Simple, Simple text Database)) or
+"Experiment File Line Types" (_fpref(Pregap4-Database-LineTypes, Experiment
+File Line Types)) dialogues. These both allow setting of the Experiment File
+records to be written during the Augment stage.
+@sp 1
+@end table
+
+
+@c --------------------
+_split()
+@node Pregap4-Modules-Quality Clip
+@section Quality Clip
+@cindex Quality clip module
+@pindex quality_clip.p4m
+
+@table @strong
+@item Description
+
+This module determines where the sequence quality is too poor to use for
+reliable assembly. It supersedes the Uncalled Base Clip module. It uses the
+@code{qclip} program, which reads and writes Experiment Files. Its default
+quality evaluation is based on the range of values produced by the Estimate
+Base Accuracies module (quality value 70, averaged over 100 bases). For use
+with phred, try lower values such as quality value 15 averaged over 50 bases.
+When quality values are not available it will use the same method as the
+Uncalled Base Clip module: it analyses the base calls and counts the number of
+undetermined bases within a given window of sequence. Both 5' and 3' ends may
+be quality clipped.
+
+For the confidence mode of clipping the method starts from the point of
+highest average quality, and then steps outwards in both directions until the
+average quality is below a defined threshold.
+
+For the sequence mode of clipping the method starts from a defined position
+and steps outwards in both directions until the number of uncalled bases
+within a given window length exceeds a predefined threshold. For
+more details see the @code{qclip} documentation
+(_fpref(Man-qclip, qclip, manpages)).
+
+Note that the Phrap assembly algorithm works best without quality clipping and
+it can make use of the full length of readings (due to the use of the Phred
+confidence values).
+@sp 1
+
+@item Option: Clip mode
+This may be one of "by sequence" or "by confidence". The "by sequence" mode is
+equivalent to the Uncalled Base Clip module. The "by confidence" mode uses
+Phred-scaled confidence values to determine the quality for clipping. This
+does not work with @code{eba} confidence values.
+@sp 1
+
+@item Option: Minimum extent
+The lowest allowable 5' clip position.
+@sp 1
+
+@item Option: Maximum extent
+The largest allowable 3' clip position.
+@sp 1
+
+@item Option: Minimum length
+If after quality clipping the good portion of a sequence is shorter than the
+specified length, then this file will be rejected with the message "qclip:
+Sequence too short".
+@sp 1
+
+@item Option: Window length
+The window length over which the confidence will be averaged.
+This option is only relevant for the "clip by confidence" mode.
+@sp 1
+
+@item Option: Average confidence
+The minimum average confidence (over `window length' bases) for sequence to be
+accepted as good quality.
+This option is only relevant for the "clip by confidence" mode.
+@sp 1
+
+@item Option: Start offset
+The base number to start the 5' and 3' good quality searches from.
+This option is only relevant for the "clip by sequence" mode.
+@sp 1
+
+@item Option: 3' window length
+The window length in which to count uncalled bases.
+This option is only relevant for the "clip by sequence" mode.
+@sp 1
+
+@item Option: 3' number of uncalled bases
+The maximum allowed count of uncalled bases in a single window length.
+This option is only relevant for the "clip by sequence" mode.
+@sp 1
+
+@item Option: 5' window length
+The window length in which to count uncalled bases.
+This option is only relevant for the "clip by sequence" mode.
+@sp 1
+
+@item Option: 5' number of uncalled bases
+The maximum allowed count of uncalled bases in a single window length.
+This option is only relevant for the "clip by sequence" mode.
+@sp 1
+
+@end table
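+
+The "by confidence" mode described above can be sketched as follows, assuming
+a list of Phred-scaled confidence values with one entry per base. This is a
+simplified illustration and not the @code{qclip} implementation: the function
+name @code{clip_by_confidence} is hypothetical, ties are resolved by taking
+the first best window, and the minimum extent, maximum extent and minimum
+length options are left out.
+
```python
def clip_by_confidence(quals, window=50, min_average=15.0):
    """Return (start, end) of the good region as a half-open range.

    Returns None when the sequence is shorter than one window or when
    even the best window falls below min_average.
    """
    count = len(quals)
    if window < 1 or count < window:
        return None

    # Prefix sums make each window average a constant-time lookup.
    sums = [0]
    for value in quals:
        sums.append(sums[-1] + value)

    def average(start):
        return (sums[start + window] - sums[start]) / window

    last_start = count - window
    # Start from the window with the highest average quality.
    best = max(range(last_start + 1), key=average)
    if average(best) < min_average:
        return None

    # Step outwards in both directions until the average drops too low.
    left = best
    while left > 0 and average(left - 1) >= min_average:
        left -= 1
    right = best
    while right < last_start and average(right + 1) >= min_average:
        right += 1
    return left, right + window
```
+
+Because the test is made on a window average, a few low confidence bases
+remain at each end of the returned range.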
+
+
+@c --------------------
+_split()
+@node Pregap4-Modules-Sequence Vector
+@section Sequencing Vector Clip
+@cindex Sequencing vector clip module
+@pindex sequence_vector_clip.p4m
+
+@table @strong
+@item Description
+This module uses the @code{vector_clip} program to identify and mark the
+sequencing vector (those used to produce templates for sequencing, eg m13mp18
+or puc18). To achieve this task it needs to know information about the vector
+including the cut site position and the position of the primer site relative
+to the cut site.
+_fxref(Vector_Clip-Sites, Defining the Positions of Cloning and Primer Sites for Vector_Clip, vector_clip).
+@sp 1
+
+@item Option: Use Vector-primer file
+Vector_clip may be told to search through a series of vectors and primers held
+within an external file. Alternatively we can request that it looks only at
+one specific, known, vector. This question is to determine which of the two
+mutually exclusive methods to use. In general it is still important for the
+Experiment File to contain primer and template data. The Vector-primer module can
+be used to add the primer and sequencing vector information to the Experiment File
+but not the template name.
+
+ at item Option: Vector-primer filename
+This is only used if the "Use Vector-primer file" question was answered with
+"Yes". Each input sequence will be compared against each vector-primer pair to
+find the best match. This provides a simple way of comparing against multiple
+vectors or comparing against both forward and reverse primers of a single
+vector. For further details on creating this vector-primer file, see
+_fref(Vector_Clip-Vector_Primer-Files, Vector_Primer file format, vector_clip).
+ at sp 1
+
+ at item Option: Select vector-primer subset
+This is used in conjunction with the vector-primer filename to indicate which
+of the vector-primer pairs listed in this file should be used. Initially this
+is set to all vector-primer pairs, but efficiency will be greatly increased if
+just the required subset is selected. (Internally pregap4 will then temporarily
+produce a new vector-primer filename each time @code{vector_clip} requires
+one, containing just the selected items.) To select more than one
+vector-primer pair use the standard listbox mouse bindings: single left click
+to pick an item; click and drag to select a range; and control left click to
+toggle a single item. The selected list will be saved to the pregap4
+configuration file whenever all the parameters for this module are saved.
+ at sp 1
+
+ at item Option: Max primer to cut-site length
+This parameter is only used when a vector-primer file is defined. The sequence
+stored in the vector-primer file may be considerably longer than we expect to
+see at the start of the sequences being analysed. By defining the maximum
+length of sequence we expect to see, @code{vector_clip} may be more sensitive
+and slightly faster.
+ at sp 1
+
+ at item Option: Vector file name
+This, and the following two options, are only used if the "Use Vector-primer
+file" question was answered with "No".  The vector file name should be the
+name of a file containing just the vector bases or white space, in a plain
+text format.
+ at sp 1
+
+ at item Option: Cut site
+The cut site specified as a base count from the start of the vector file.
+ at sp 1
+
+ at item Option: Primer site
+The primer site specified as a base offset from the cut site. For example, for
+m13mp18 forward primers the value is 41. If, instead of the usual single
+value, two values are specified separated by a slash, then these give the
+values for the universal forward and reverse primers (for example
+"@code{41/-24}"). Only use this format if the @code{PR} (primer type)
+experiment file line type is known AND will be specified in the experiment
+file. If the PR record is not specified in the experiment file, the primer
+site position will be set to zero, and the vector clipping is unlikely to
+work correctly. (PR values do not have to be known if they can be derived
+using naming schemes such as those used by the Sanger Centre.) If the primer
+site indicates a custom primer sequence then the primer site is taken to be 0.
+ at sp 1
+
+ at item Option: Percentage minimum 5' match
+ at itemx Option: Percentage minimum 3' match
+Both ends of the sequence are checked using a dynamic programming algorithm to
+find the optimal alignment. An end is marked as vector if the
+percentage match is at least as high as this supplied parameter.
+ at sp 1
+
+ at item Option: Default 5' position
+This specifies the value to use for marking the 5' sequencing vector if none
+is detected. Specifying this as -1 will cause the absolute value given for the
+primer site (which is specified as relative to the cut site) to be used.
+ at sp 1
+
+ at end table
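+
+The handling of the "Primer site" value described above can be sketched in
+Python as follows. This is only an illustration of the documented rules, not
+the pregap4 implementation. The function name is hypothetical, and the PR
+coding used here (1 for the universal forward primer, 2 for the universal
+reverse primer, anything else custom or unknown) is an assumption.
+
```python
def primer_site_for_read(spec, pr=None):
    """Pick the primer site offset for one reading.

    spec: the "Primer site" option, e.g. "41" or "41/-24".
    pr:   the primer type from the PR record, or None if it is absent.
          Assumed coding: 1 = universal forward, 2 = universal reverse;
          other values are treated as custom or unknown primers.
    """
    parts = spec.split("/")
    if len(parts) == 1:
        # Usual case: a single offset relative to the cut site.
        return int(parts[0])
    if len(parts) != 2:
        raise ValueError("expected 'N' or 'FORWARD/REVERSE': %r" % spec)
    forward, reverse = (int(p) for p in parts)
    if pr == 1:
        return forward
    if pr == 2:
        return reverse
    # Missing PR record or custom primer: the site falls back to zero.
    return 0


if __name__ == "__main__":
    print(primer_site_for_read("41/-24", pr=2))
```
+
+The fallback to zero mirrors the warning above: without a usable PR record the
+two-value form gives a primer site of 0, and clipping is unlikely to work well.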
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Cross_match
+ at section Cross_match
+ at cindex Cross_match module
+ at pindex cross_match_svec.p4m
+
+Cross_match is not included as part of the Staden Package. It is available from
+Phil Green.
+_uref(http://www.genome.washington.edu/UWGC/analysistools/swat.htm)
+
+ at table @strong
+ at item Description
+This uses the @code{cross_match} program to search for sequencing vector.
+(Future versions may also check for other cloning vectors.) This allows for
+searching of multiple vector files. However as cross_match does not make use of
+primer and cut site information the vector detection is inherently less
+sensitive than @code{vector_clip}
+(_fpref(Vector_Clip, Screening against Vector Sequences, vector_clip)).
+ at sp 1
+
+ at item Option: FASTA vector file name
+This specifies a fasta format file of one or more sequencing vector sequences.
+ at sp 1
+
+ at item Option: Minimum match length
+Minimum length of matching word for SWAT comparison.
+ at sp 1
+
+ at item Option: Minimum score
+Minimum SWAT score.
+ at sp 1
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Cloning Vector
+ at section Cloning Vector Clip
+ at cindex Cloning vector clip module
+ at pindex cloning_vector_clip.p4m
+
+ at table @strong
+ at item Description
+This module searches for non "sequencing" vectors used in the shotgunning
+process, e.g. for Cosmid or YAC. Any fragment in any orientation of this vector
+could be present so there is no need for the cut sites to be known. The
+ at code{vector_clip} program is used for this task
+(_fpref(Vector_Clip, Screening against Vector Sequences, vector_clip)).
+
+ at sp 1
+
+ at item Option: Vector file name
+The filename containing the vector sequence. At present this should be a file
+containing a single plain text sequence containing just the bases or white
+space.
+ at sp 1
+
+ at item Option: Max probability
+For each match its probability of occurring by chance is calculated. Any match
+with a probability lower than `Max probability' is accepted.
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Screen Vector
+ at section Screen for Unclipped Vector
+ at cindex Screen for unclipped vector module
+ at pindex screen_vector.p4m
+
+ at table @strong
+ at item Description
+This module may be used to identify undetected segments of sequencing vector
+or to detect recombinations. After searching and marking sequencing vector,
+any further strong matches to the sequencing vector indicate a possible
+problem. This module uses the @code{vector_clip} program
+(_fpref(Vector_Clip, Screening against Vector Sequences, vector_clip)).
+
+Note that this module requires the Sequencing Vector Clip module to be used
+before screening, otherwise all sequences containing unclipped vector will be
+falsely rejected.
+ at sp 1
+
+ at item Option: Minimum length of match
+If a match of at least this length is found then the sequence currently being
+processed will be rejected.
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Screen
+ at section Screen Sequences
+ at cindex Screen sequences module
+ at pindex screen_seq.p4m
+
+ at table @strong
+ at item Description
+This module can perform very fast matches between the sequences to process and
+one or more screen sequences. Any sequence containing a significant match is
+rejected. An example of use for this module is to reject sequences prior to
+assembly that appear to be contaminated with E. coli. This uses the
+ at code{screen_seq} program
+(_fpref(Man-screen_seq, Screen_seq, screen_seq)).
+ at sp 1
+
+ at item Option: Screen single sequence
+This is a yes/no question used to determine whether the screen sequence filename
+is the filename of a single sequence or a filename of a file containing a
+series of sequence filenames. To compare just one file select "Yes".
+ at sp 1
+
+ at item Option: Screen sequence file (of filenames)
+This is either the filename of a single sequence or the filename of a file of
+filenames, depending on the answer to the previous question. The sequence
+files must be in plain text format containing just the bases or white space.
+ at sp 1
+
+ at item Option: Maximum screen sequence length
+The maximum length of any individual screen sequence.
+ at sp 1
+
+ at item Option: Minimum match length
+Any fragment containing an exact match longer than
+this length will be rejected.
+ at sp 1
+ at end table
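+
+The "Minimum match length" rule can be illustrated with a small Python sketch.
+It is not how @code{screen_seq} is implemented; the function name is
+hypothetical and, as a simplifying assumption, only the forward strand is
+compared.
+
```python
def has_exact_match(read, screen, min_match):
    """Return True if read and screen share an exact match longer than min_match.

    Any exact match longer than min_match must contain a shared word of
    length min_match + 1, so it is enough to look for one such word.
    """
    k = min_match + 1
    if len(read) < k or len(screen) < k:
        return False
    # Index every word of length k in the screen sequence.
    words = {screen[i:i + k] for i in range(len(screen) - k + 1)}
    return any(read[i:i + k] in words for i in range(len(read) - k + 1))


if __name__ == "__main__":
    # The two sequences share the 7-base word ACGTACG.
    print(has_exact_match("TTTACGTACGGG", "CCACGTACGAA", min_match=6))
```
+
+Indexing the words of the screen sequence once keeps each read comparison
+cheap, which is one plausible way to obtain fast exact matching.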
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Blast
+ at section Blast Screen
+ at cindex Blast screen module
+ at pindex blast.p4m
+
+ at table @strong
+ at item Description
+This module uses the @code{blastall} program to compare all the input
+sequences against a prebuilt blast database of screen sequences. It is not
+possible to compare against a subset of the database; to do this, build a new
+blast database using formatdb. This module is an alternative to the
+Screen Sequences module which uses the @code{screen_seq} program.
+
+Blast may be used for either completely rejecting sequences or for simply
+tagging the matching segments, or for both. If you wish to tag with several
+tag types, then several instances of the Blast screen module need to be used.
+
+Blast is not included as part of the Staden Package. It is available from the
+NCBI.
+ at sp 1
+
+ at item Option: BLAST database
+This is the filename of the BLAST database to screen against, with the
+ at file{.nhr}, @file{.nin} and @file{.nsq} suffixes removed.
+ at sp 1
+
+ at item Option: E value
+This specifies the `E value' used by blast when determining which hits should
+be considered as real.
+
+ at item Option: Match fraction
+This is the total fraction of the sequence which must have a blast match
+somewhere in the BLAST database searched in order to reject this sequence.
+Segments of the input sequence that match multiple components in the BLAST
+database are only counted once when computing this fraction, but the
+locations of the matches in the BLAST database do not need to be consecutive.
+
+If you wish to accept everything, but still want to tag the matches, then set
+the match fraction to greater than @code{1.0}.
+
+ at item Option: Tag type
+The default for this is @code{<none>} which indicates no tagging is
+required. Otherwise this should be a 4 letter tag type (such as @code{REPT})
+known to gap4.
+ at end table
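+
+The "Match fraction" test described above amounts to merging the matched
+segments of the input sequence and comparing the covered fraction with the
+threshold. The Python sketch below shows one way to do this. The function
+names and the half-open (start, end) hit coordinates are assumptions made for
+the example; they are not taken from the module source.
+
```python
def matched_fraction(seq_len, hits):
    """Fraction of the query covered by hits, counting overlaps only once.

    hits: iterable of (start, end) query coordinates, 0-based and half-open.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(hits):
        if cur_end is None or start > cur_end:
            # Disjoint from the current run: bank it and start a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / seq_len


def should_reject(seq_len, hits, match_fraction):
    """Reject when the covered fraction reaches the configured threshold."""
    return matched_fraction(seq_len, hits) >= match_fraction


if __name__ == "__main__":
    hits = [(0, 30), (20, 50), (70, 80)]
    print(matched_fraction(100, hits), should_reject(100, hits, 0.5))
```
+
+Because the covered fraction can never exceed 1.0, a threshold above 1.0
+never rejects, which matches the advice above for tagging without rejecting.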
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Interactive Clip
+ at section Interactive Clipping
+ at cindex Interactive clipping module
+ at pindex interactive_clip.p4m
+
+ at table @strong
+ at item Description
+This module invokes the @code{trev} program to view the raw chromatogram
+files. The user can then adjust the quality and vector clip positions if
+desired. The trev window will contain Next and Previous buttons to skip from
+trace to trace. The Reject button allows a trace to be
+rejected, in which case it is added to the failure file with the message
+"@code{interactive clip: manually rejected}".
+
+There are no adjustable parameters for this module.
+ at sp 1
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Extract Seq
+ at section Extract Sequence
+ at cindex Extract Sequence module
+ at pindex extract_seq.p4m
+
+ at table @strong
+ at item Description
+
+This module uses the @code{extract_seq} program to extract the sequence
+information from binary trace files, Experiment files, or from the old
+Staden format plain files. The output contains the sequences split onto
+lines of at most 60 characters each, in plain or fasta format. The input
+files are passed unchanged onto subsequent modules.
+ at sp 1
+
+ at item Option: Output only the good sequence
+    When reading an experiment file or trace file containing clip marks, output
+    only the good sequence which is contained within the boundaries marked
+    by the @code{QL}, @code{QR}, @code{SL}, @code{SR}, @code{CL}, @code{CR}
+    and @code{CS} line types.
+ at sp 1
+
+ at item Option: Consider cosmid as good sequence
+    When the @code{Output only the good sequence} option is specified
+    this controls whether the cosmid sequence should be considered
+    good.
+ at sp 1
+
+ at item Option: Output in fasta format
+    Specifies that the output should be in fasta format rather than
+    plain text.
+ at sp 1
+
+ at item Option: Output in one file only
+    If this option is selected then the output from every sequence is
+    sent to one file. This is best used with the
+    @code{Output in fasta format} option selected, and is useful for
+    feeding into BLAST searches, for example. The file to write to is
+    specified in the @code{File name} field.
+
+    If this option is unselected then the output is sent to separate
+    files, one per sequence. The output files have the same name as
+    the input files, except with an extra suffix specified in the
+    @code{File name suffix} field.
+ at end table
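+
+The output layout described above (lines of at most 60 characters, in plain
+or fasta format) can be sketched in Python as below. This is an illustration
+only: the function name is hypothetical, and using the bare read name as the
+fasta header is an assumption.
+
```python
def format_sequence(name, seq, fasta=True, width=60):
    """Split seq onto lines of at most `width` characters.

    With fasta=True a ">name" header line is written first; otherwise the
    output is plain text holding only the sequence lines.
    """
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    if fasta:
        lines.insert(0, ">" + name)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Concatenating several entries imitates "Output in one file only".
    reads = {"read1": "ACGT" * 20, "read2": "TTGACA" * 12}
    print("".join(format_sequence(n, s) for n, s in reads.items()), end="")
```
+
+Writing every entry to one fasta file in this way gives a file that can be
+passed directly to a BLAST search.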
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-RepeatMasker
+ at section RepeatMasker
+ at cindex RepeatMasker module
+ at pindex repeat_masker.p4m
+
+RepeatMasker is not included as part of the Staden Package. It is available
+from Arian Smit.
+_uref(http://ftp.genome.washington.edu/RM/RepeatMasker.html)
+
+ at table @strong
+ at item Description
+This module uses the @code{RepeatMasker} program. This is a program which
+searches for a comprehensive set of repeat sequences. Any matches which are
+found will be tagged with a comment indicating the type of repeat. These tags
+will then be visible from within gap4. Full documentation is available from
+the author of RepeatMasker, or from typing @code{RepeatMasker -h}.
+
+ at sp 1
+
+ at item Option: Repeat library
+This specifies the directory containing the library of repeat sequences. Only
+one library directory may be specified. The library "<default>" will let
+RepeatMasker use its own default library.
+ at sp 1
+
+ at item Option: RepeatMasker cutoff
+This specifies the cutoff score for RepeatMasker. The documentation with
+RepeatMasker states that a cutoff of 250 will guarantee no false positives.
+ at sp 1
+
+ at item Option: Gap4 tag type
+When a repeat is found a tag will be added to the Experiment File. This
+specifies the tag type to use. It should be one of the tag types available to
+Gap4, but other tag types may be used if desired (they will be coloured in the
+same way as @code{COMM}ent tags in gap4).
+ at sp 1
+
+ at item Option: Types of repeat to screen against
+The default setting of RepeatMasker is to search for primate repeats, however
+it may be told to search for other repeat families or to restrict its search
+to only ALU primate repeats. The full list of options here is Alu only,
+Rodent only, Simple only, Mammalian excluding primate/rodent, and no low
+complexity. These are as defined in the RepeatMasker documentation. It is not
+known what effect enabling mutually exclusive options will have.
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Repeats
+ at section Tag Repeats
+ at cindex Tag repeats module
+ at pindex tag_repeats.p4m
+
+ at table @strong
+ at item Description
+This module uses the @code{repe} program to identify and mark known repetitive
+elements within the sequences. An example usage is to tag all ALU fragments.
+This information may be used by the gap4 assembly algorithm to improve the
+assembly by initially ignoring matches between two ALU fragments which may
+otherwise produce incorrect assemblies. If available, we recommend using
+RepeatMasker instead of this module.
+ at sp 1
+
+ at item Option: Repeat file name
+This is the filename of a file of filenames, each of which contains a single
+repeat to search for. The format of these individual files is plain text
+consisting of just the nucleotides and white space.
+ at sp 1
+
+ at item Option: Repeat score
+This is the minimum score for classifying a matched segment as a repeat.
+ at sp 1
+
+ at item Option: Tag type
+This is the gap4 tag type to use for identifying this repeat segment. It is
+not possible to choose different tag types for different repeats, although the
+tag comments contain the match score and match filename.
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Mutations
+ at section Mutation Detection
+ at cindex Mutation detection module
+ at pindex trace_diff.p4m
+
+ at table @strong
+ at item Description
+ at strong{Superseded by the newer modules:}
+(_fpref(Pregap4-Modules-Trace Difference, Trace
+Difference))
+and (_fpref(Pregap4-Modules-Mutation Scanner, Mutation Scanner)).
+
+This module compares each sequence chromatogram against a "wild type" or
+reference chromatogram to detect point mutations. The mutations are
+detected by aligning and subtracting each trace from the wild type trace to
+produce a "difference trace". The difference trace is then analysed to
+identify point mutations which are written back to the Experiment File as
+ at code{MUTN} tags. This uses the @code{trace_diff} program
+ at cite{Bonfield, J.K., Rada, C. and Staden, R. Automated detection of point
+mutations using fluorescent sequence trace subtraction. Nucleic Acids Res. 26,
+3404-3409 (1998)}.
+
+The reference trace should be as similar as possible to the traces
+being compared against it. It should be prepared by sequencing the wild type
+from the same primer, and using the same chemistry as the readings being
+screened.  One good way to produce a reference trace is to run the wild type
+sequence on the gel along with the other samples.  It is also possible to get
+gap4 to produce a consensus trace. This requires using pregap4 twice. Firstly
+process the sequences through pregap4 with all the appropriate options except
+with the mutation detection module disabled.  Assemble these sequences into
+gap4. Within gap4, for each contig start up the Contig Editor and select Save
+Consensus Trace from the command menu. This will produce a trace which is the
+average of the traces in that contig. Then delete the gap4 database and
+reprocess the sequences using Pregap4, this time using mutation detection to
+compare against the consensus trace.
+
+ at sp 1
+
+ at item Option: Wild type file (+ve strand)
+ at itemx Option: Wild type file (-ve strand)
+These are the filenames of the chromatogram for the wild type sequence on each
+strand. These may be in any allowable trace format (SCF, ZTR, ABI, CTF or ALF).
+In the augment stage, these are represented in the @code{WT} line type using
+ at i{plus_filename}@code{|}@i{minus_filename} notation.
+ at sp 1
+
+ at item Option: Start position
+ at itemx Option: End position
+These define the range within each sequence in which to identify mutations.
+The algorithm works better on good quality data so including very bad sequence
+may give errors.
+ at sp 1
+
+ at item Option: Score
+This is a threshold used to determine when a peak in the difference trace is
+considered to be a mutation. The higher the value the more stringent the test.
+ at sp 1
+
+ at item Option: Alignment band width
+The trace alignment is performed by firstly doing a sequence alignment on the
+text sequences contained in the two files. This parameter
+specifies the band width for
+this alignment. Smaller values give quicker alignments, but only work if the
+alignment is sufficiently close to the main diagonal.
+ at sp 1
+
+ at item Option: Other arguments
+This allows for any other arguments to be passed to the @code{trace_diff}
+program. See the trace_diff documentation for more details.
+ at sp 1
+ at end table
+ at strong{The module above is superseded by the newer modules:}
+(_fpref(Pregap4-Modules-Trace Difference, Trace
+Difference,t))
+and (_fpref(Pregap4-Modules-Mutation Scanner, Mutation Scanner,t)).
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Reference Traces
+ at section Reference Traces and Reference Sequences
+ at cindex Reference trace module
+ at cindex Reference sequence: pregap4
+ at cindex pregap4: Reference sequence
+ at cindex pregap4: naming schemes
+ at cindex naming schemes: pregap4
+ at cindex naming schemes: mutation detection
+
+ at table @strong
+ at item Description
+This module specifies the reference traces and reference sequences used by the
+two mutation detection modules (_fpref(Pregap4-Modules-Trace Difference, Trace
+Difference) and _fpref(Pregap4-Modules-Mutation Scanner, Mutation
+Scanner)). The left and right clip points for each trace can also be specified.
+
+A reference trace should be as similar as possible to the ones being
+compared against. It should be prepared by sequencing the wild type from
+the same primer and using the same chemistry as the readings being
+screened. One good way to produce a reference trace is to run the wild type
+sequence on the gel along with the other samples.
+
+If the input files have been sequenced from both strands, reference traces from
+each strand may be specified here.
+
+NOTE:
+In order for pregap4 to choose the appropriate wild type trace it needs to know
+the strand for each input sequence. This is specified by the PR record in the
+experiment file which is typically generated using a naming convention
+(_fpref(Pregap4-Naming, Pregap4 Naming Schemes, pregap4)). If pregap4 cannot
+determine the strand, or if only one reference trace is specified, then each
+input sequence will be compared against the +ve strand reference trace.
+
+The reference data supplied in this module, when entered with gap4 shotgun assembly,
+will add REFS and REFT notes (_fpref(Notes, Notes, notes)) to the gap4 database.
+A reference sequence is used to number bases in the Contig Editor
+(_fpref(Editor-Reference-Data, Reference sequences and traces,t)) and in reporting
+the positions of mutations (_fpref(Editor-Comm-Report-Mutations, Report Mutations,t)).
+
+ at sp 1
+
+ at item Option: Reference Trace (+ve strand)
+ at itemx Option: Reference Trace (-ve strand)
+These are the filenames of the chromatogram for the reference trace on each
+strand. These may be in any allowable trace format (ZTR, SCF, ABI, CTF or ALF).
+The filenames are entered into the experiment file as @code{WT} records by the
+"Augment Database" phase of pregap4, so this module must also be enabled.
+
+ at item Option: Clip left
+ at itemx Option: Clip right
+These values determine which region of the reference trace (in bases) is used
+for mutation detection. This can be used to exclude poor quality regions, or
+restrict the range over which mutation detection occurs. Restricting the range
+will also speed up the algorithms. If you specify -1 for any value, mutscan will
+use the clip point QL/QR records within the reference trace experiment file
+(provided they exist). If they don't exist, then the entire reference trace is
+used, i.e. no clipping occurs. If the range specified is too small, the mutation detection
+algorithms may report an error, since there must be a useful overlap between
+the sequences in order to process them.
+
+ at item Option: Reference Sequence
+This specifies the reference sequence, which is typically an annotated EMBL
+entry. This field is optional.
+
+ at item Option: Start base number
+If a reference sequence was specified this indicates which base number it will
+start counting from within Gap4's contig editor. It also defines the positions
+of mutations, as output by the Report mutations function of gap4.
+_fxref(Editor-Comm-Report-Mutations, Report Mutations,t).
+
+ at item Option: Circular
+ at itemx Option: Sequence length
+If the reference sequence is defined to be circular then the length needs to
+be known too. When the base number reaches the sequence length the next base
+in the sequence will be renumbered to base 1. This may be useful if the
+circular reference sequence needs to be chopped to form a linear sequence at a
+different position than the standard numbering. (For example this is typical
+when sequencing the mitochondrial variable loop, which by standard conventions
+contains base number 1.)
+
+ at sp 1
+ at end table
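+
+The base numbering implied by the "Start base number", "Circular" and
+"Sequence length" options can be written as a short Python sketch. The
+function name and the 0-based offset argument are hypothetical. The figures in
+the usage example (a 16569 base circular sequence with numbering starting at
+16024) are illustrative values only.
+
```python
def base_number(offset, start, length=None):
    """Displayed base number for a position in the linear reference.

    offset: 0-based position within the (chopped) linear reference sequence.
    start:  the "Start base number" option.
    length: the "Sequence length" option if the reference is circular,
            otherwise None.
    """
    number = start + offset
    if length is None:
        return number
    # After base `length` the numbering wraps round to base 1.
    return (number - 1) % length + 1


if __name__ == "__main__":
    for off in (0, 545, 546, 547):
        print(off, base_number(off, start=16024, length=16569))
```
+
+With these values the numbering runs 16024 ... 16569 and then continues at 1,
+so a linear sequence cut at a different point keeps the standard numbering.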
+
+Note that it is possible (though no longer recommended)
+to use gap4 to produce a consensus trace. This requires
+using pregap4 twice. Firstly process the sequences through pregap4 with all
+the appropriate options except with the mutation detection modules
+disabled. Assemble these sequences into gap4. Within gap4, for each contig
+start up the Contig Editor and select Save Consensus Trace from the command
+menu (available only in expert mode). This will produce a trace which is the
+average of the traces in that contig. Then delete the gap4 database and
+reprocess the sequences using Pregap4, this time using mutation detection to
+compare against the consensus trace. Best results are usually obtained by
+first deleting pads in the consensus sequence. You should inspect the
+resulting consensus trace carefully to ensure there are no discontinuities
+introduced as a result of the pad deletions.
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Trace Difference
+ at section Trace Difference
+ at cindex Trace difference module
+
+ at table @strong
+ at item Description
+This module compares each sequence chromatogram against a "wild type" or
+reference chromatogram to detect point mutations. The mutations are
+detected by aligning and subtracting each trace from the wild type trace to
+produce a "difference trace". The difference trace is then analysed to
+identify point mutations which are written back to the Experiment File as
+ at code{MUTA} tags. The basic idea is explained in the paper @cite{Bonfield,
+J.K., Rada, C. and Staden, R. Automated detection of point mutations using
+fluorescent sequence trace subtraction. Nucleic Acids Res. 26, 3404-3409 (1998)}.
+
+This implementation is the second version of the algorithm. The previous
+version used basecalls to do trace alignment. This led to problems when
+bases were called in error (often the case around mutations). The new algorithm
+ignores the basecalls completely and aligns the trace signals themselves,
+avoiding such problems. This is much more computationally intensive, but it
+has proved to be fast enough for interactive use.
+
+If the input files have been sequenced from both strands then two wild type
+sequences may be given. In order for pregap4 to choose the appropriate wild
+type trace it needs to know the strand for each input sequence, which is
+typically generated using the naming convention. A simple naming scheme is
+provided with pregap4 (in the lib/pregap4/naming_schemes directory) called
+"mutation_detection.p4t". This can be loaded from the pregap4 file menu. It
+assumes that trace names have an 'f' or 'r' suffix, denoting the forward and
+reverse strands respectively. If you need something more complex, then you'll
+have to create and load your own naming scheme. If pregap4 cannot determine
+the strand, or if only one wild type is specified, then each input sequence
+will be compared against the +ve strand wild type.
+
+The reference or wild type traces for tracediff are specified in the
+_fpref(Pregap4-Modules-Reference Traces, Reference Traces module).
+
+ at sp 1
+
+
+ at item Option: Sensitivity
+This threshold is used to determine when an above/below baseline double
+peak in the difference trace is considered to be a mutation. It is specified
+in standard deviations from the mean over the analysis window. The higher the
+value, the more stringent the test. This value is reduced dynamically
+by the algorithm in the presence of mutations since small mutations near
+larger ones can often be missed with a uniform sensitivity setting. It's
+likely that some experimentation with this parameter will be required for
+optimal mutation detection in your data.
+ at sp 1
+
+
+ at item Option: Noise threshold
+This threshold is used to filter out low level noise during the analysis
+phase. It is specified as a percentage of the maximum peak-to-peak trace
+difference value. A high threshold will lead to fewer false positives but
+you run the additional risk of missing low level mutations.
+ at sp 1
+
+
+ at item Option: Analysis window length
+Analysis of the trace difference is done over a local region to counter
+the effects of non-stationarity in the trace signal. The analysis region is
+defined by a short window whose length is specified in bases. The window is
+asymmetric in that it's located to the left of the base it's positioned on.
+This avoids measurement problems when mutations are encountered. The window
+size is a tradeoff. If it's too big, low level mutations may be missed. If
+it's too small, there may be insufficient data to give unbiased measurements
+leading to many false positives.
+ at sp 1
+
+
+ at item Option: Maximum peak alignment deviation
+The centres of each individual half-peak of a double peak above and below
+the baseline must align reasonably well for them to be considered to be
+real mutations. The amount of half-peak alignment deviation allowable is
+specified in bases by this parameter, usually as a fraction of one base.
+ at sp 1
+
+
+ at item Option: Maximum peak width
+During analysis, the width of each peak is measured to avoid problems caused
+by gel artifacts. These often appear as broad peaks that overlay many bases.
+The maximum peak width is specified in bases. A lower value will lead to
+fewer false positives, but you run the additional risk of missing smeared
+mutations towards the end of a trace.
+ at sp 1
+
+
+ at item Option: Complement bases on reverse strand tags
+After mutation detection and after readings have been assembled into a GAP4
+database, GAP4 displays both forward and reverse readings in a single direction
+in the contig editor. This makes it much easier to compare sequences and traces
+in both directions simultaneously. When the corresponding traces are displayed,
+any reverse strand traces are complemented automatically such that the bases are
+interchanged. In this case, the original mutation tag generated by tracediff will
+then be of the wrong sense, so if checked, this option complements the tag base
+labels to match the complemented trace displayed by GAP4.
+ at sp 1
+
+
+ at item Option: Write difference traces out to disk
+After trace difference analysis, the generated traces are normally discarded and not
+written to disk. Checking this option lets you save the trace difference files to
+the same directory as the original traces. The .ZTR trace format is used for this
+purpose. The original filename is retained and a "_diff.ztr" suffix is appended.
+ at sp 1
+ at end table
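+
+The interaction of the "Sensitivity" and "Analysis window length" options can
+be pictured with the simplified Python sketch below. It is an assumption about
+the general idea only, not the tracediff algorithm: it measures the window in
+samples rather than bases, flags single values rather than double peaks, and
+omits the dynamic reduction of sensitivity near mutations. The function name
+is hypothetical.
+
```python
import statistics


def flag_outliers(diff, window, sensitivity):
    """Flag positions of a difference trace that stand out from local noise.

    For each position the mean and standard deviation are taken over the
    `window` values immediately to its left, so the value being tested does
    not influence its own statistics. A position is flagged when it lies
    more than `sensitivity` standard deviations from that local mean.
    """
    flagged = []
    for i in range(window, len(diff)):
        left = diff[i - window:i]
        mean = statistics.mean(left)
        sd = statistics.pstdev(left)
        if sd > 0 and abs(diff[i] - mean) > sensitivity * sd:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    trace = [1, -1] * 10 + [12] + [1, -1] * 5
    print(flag_outliers(trace, window=10, sensitivity=5))
```
+
+Raising the sensitivity value makes the test more stringent, and a longer
+window gives steadier statistics at the cost of masking small local changes.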
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Mutation Scanner
+ at section Mutation Scanner
+ at cindex Mutation scanner module
+
+ at table @strong
+ at item Description
+This module compares each input sequence chromatogram against a reference
+chromatogram (or trace) to detect mutations. The reference traces are specified
+in the _fpref(Pregap4-Modules-Reference Traces, Reference Traces module). Using
+this method it is possible to detect both base-change mutations and heterozygous
+mutations.
+
+It works by aligning the reference trace with the input trace and then examining
+the peak pairs for each individual base separately. It does not use basecalls as
+these are prone to error and their use generates too many false positives. After
+normalisation, the amplitude ratios of peak pairs which are abnormal are analysed
+more closely. For heterozygotes, a drop in peak height with respect to the reference
+of about 50% is expected. The final set of candidate mutations are validated against
+a difference trace to ensure it contains a double peak at that location, thus
+confirming the mutation to be real. After chromatogram analysis has been completed,
+mutation tags are written back to the Experiment File as @code{HETE} and @code{MUTA}
+tags.
+
+ at sp 1
+
+ at item Option: Adaptive Noise Floor
+Traces are very noisy signals that are difficult to process. To find valid peaks in a trace,
+an adaptive noise threshold based on envelope height is used to eliminate all low
+level noise from consideration. The effect of this parameter can be seen in the
+trace below. By default this parameter is set to 25% of envelope peak height.
+If set lower, too much noise is picked up; if set higher, low level mutations may
+be missed.
+
+_picture(mut_mutscan_adaptive_noise_threshold,3.75in)
+
+ at sp 1
+
+ at item Option: Upper and Lower Peak Drop Thresholds
+For heterozygote mutations, the peak height of the mutant drops by 50% with respect
+to the normalised reference trace as shown in the trace below. For accurate detection,
+we use this information to validate potential mutations. Due to overzealous preprocessing
+done by sequencing machine software, the peak height drops are often not 50%, but
+typically hover between 20% and 70% of reference peak height. Any potential heterozygote
+whose peak height drop with respect to the normalised reference trace lies within
+this range is considered to be a real mutation.
+
+_picture(mut_mutscan_peak_drop_threshold,3.83333in)
+
+ at sp 1
+
+ at item Option: Peak Alignment Search Window Size
+In an ideal world, heterozygote peaks in a trace would be perfectly aligned on top of
+each other. In practice however, they can often be skewed due to gel chemistry problems
+or inaccurate mobility correction as shown in the trace below. When mutscan looks
+for peak pairs, it allows for this skew by looking either side of the current position
+for nearby peaks. This parameter is the distance mutscan looks in bases around each
+candidate position.
+
+_picture(mut_mutscan_peak_alignment_threshold,3.875in)
+
+ at sp 1
+
+ at item Option: Heterozygote SNR Threshold
+For a normal trace containing normal bases, the signal-to-noise ratio (SNR) is the
+ratio of the highest base peak to the second highest trace level. Mutscan computes
+this value in decibels (dB) as 20*log10(S/N). For normal bases, this is usually in
+the region of 20-30dB or higher. However, for heterozygotes, the SNR as defined
+by this measure degrades significantly to around 2-5dB. This is the mechanism
+mutscan uses to accurately determine the mutation tag type. If the candidate
+mutation's SNR is equal to or below this threshold, mutscan designates it to be
+heterozygous, otherwise it's considered to be a normal base-change mutation.
+
+ at sp 1
+
+ at item Option: Trace Alignment Failure Threshold
+Mutscan works by aligning a mutant trace against a reference trace and comparing
+the peaks. However, if the traces are too different, the alignment may fail and
+as a consequence, large numbers of false positive mutation tags are generated.
+Typically, within each trace there are only one or two mutations, so if we find
+15 mutations, then we can confidently predict that things have gone badly wrong!
+This parameter sets a threshold, beyond which an alignment failure error message
+is printed, rather than outputting large numbers of invalid mutation tags.
+
+ at sp 1
+
+ at item Option: Complement Bases on Reverse Strand Tags
+After mutation detection and after readings have been assembled into a GAP4
+database, GAP4 displays both forward and reverse readings in a single direction
+in the contig editor. This makes it much easier to compare sequences and traces
+in both directions simultaneously. When the corresponding traces are displayed,
+any reverse strand traces are complemented automatically such that the bases are
+interchanged. In this case, the original mutation tag generated by mutscan will
+then be of the wrong sense. If checked, this option complements the tag base
+labels to match the complemented trace displayed by GAP4.
+ at sp 1
+ at end table
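+
+The noise floor, peak drop window and SNR rule described above can be sketched
+in a few lines of Python. This is an illustrative reading of the text, not
+mutscan's actual code: the function names are hypothetical, the peak drop is
+interpreted as the fraction of the reference peak height that is lost, and the
+SNR threshold is left as a parameter because no default is stated here.
+

```python
import math


def above_noise_floor(peak, envelope_peak, fraction=0.25):
    """True if a peak clears the adaptive noise floor (25% of envelope by default)."""
    return peak >= fraction * envelope_peak


def peak_drop_in_range(ref_peak, mutant_peak, lower=0.20, upper=0.70):
    """True if the drop relative to the normalised reference lies in [lower, upper].

    Assumption: the drop is measured as the fraction of reference height lost.
    """
    drop = 1.0 - mutant_peak / ref_peak
    return lower <= drop <= upper


def snr_db(highest, second_highest):
    """Signal-to-noise ratio in decibels, computed as 20*log10(S/N)."""
    return 20.0 * math.log10(highest / second_highest)


def classify_tag(highest, second_highest, snr_threshold_db):
    """Return the tag type: HETE at or below the threshold, otherwise MUTA."""
    if snr_db(highest, second_highest) <= snr_threshold_db:
        return "HETE"
    return "MUTA"
```

+With an arbitrary example threshold of 6 dB, @code{classify_tag(1000, 700, 6.0)}
+gives @code{HETE} (about 3.1 dB), while @code{classify_tag(1000, 100, 6.0)}
+gives @code{MUTA} (20 dB).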
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Gap4 Assembly
+ at section Gap4 Shotgun Assembly
+ at cindex Gap4 shotgun assembly module
+ at pindex gap4_assemble.p4m
+
+ at table @strong
+ at item Description
+This module assembles the processed sequences into gap4 using gap4's own
+assembly engine. Note that this is incompatible with use of "Enter assembly
+into Gap4", which should only be used for external (to gap4) assembly engines.
+ at sp 1
+
+ at item Option: Gap4 database name
+ at itemx Option: Gap4 database version
+The name and version of the database to assemble into.
+ at sp 1
+
+ at item Option: Create new database
+This is a toggle to define whether the specified gap4 database should be
+created or appended to. Be warned that at present creating a new database will
+overwrite any existing one of the same name, in the same directory, without any
+warnings.
+ at sp 1
+
+ at item Option: Minimum exact match
+ at itemx Option: Maximum number of pads
+ at itemx Option: Maximum percentage mismatch
+These control the main assembly parameters within gap4. For more details see
+_fref(Assembly-Shot, Normal shotgun assembly, gap4)
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Cap2 Assembly
+ at section Cap2 Assembly
+ at cindex Cap2 assembly module
+ at pindex cap2_assemble.p4m
+
+Cap2 is not included as part of the Staden Package. It is available from
+Xiaoqiu Huang (_uref(mailto:huang at mtu.edu,,huang@@mtu.edu)).
+
+ at table @strong
+ at item Description
+This module uses the @code{cap2} program to perform shotgun assembly. Output
+will be placed in the @i{fofn}.assembly directory, where @i{fofn} is the
+filename prefix listed in the "Files to Process" panel. The output is in a
+format suitable for directed assembly within gap4. This can also be performed
+by using the "Enter Assembly into Gap4" module.
+ at sp 1
+There are no adjustable parameters for this module.
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Cap3 Assembly
+ at section Cap3 Assembly
+ at cindex Cap3 assembly module
+ at pindex cap3_assemble.p4m
+
+Cap3 is not included as part of the Staden Package. It is available from
+Xiaoqiu Huang (_uref(mailto:huang at mtu.edu,,huang@@mtu.edu)).
+
+ at table @strong
+ at item Description
+This module uses the @code{cap3} program to perform shotgun assembly. Output
+will be placed in the @i{fofn}.assembly directory, where @i{fofn} is the
+filename prefix listed in the "Files to Process" panel. The output is in a
+format suitable for directed assembly within gap4. This can also be performed
+by using the "Enter Assembly into Gap4" module. Cap3 differs from Cap2 in that
+it can make use of confidence values (in the range supplied from
+ at code{phred}) and constraints.
+ at sp 1
+
+ at item Option: Auto-generate constraints
+When enabled, this uses the reading direction (forward / reverse primers),
+the template name and the insert size, to produce a file containing data to
+constrain how the readings may be assembled.
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-FakII Assembly
+ at section FakII Assembly
+ at cindex FakII assembly module
+ at pindex fakii_assemble.p4m
+
+FakII is not included as part of the Staden Package. It is available from
+Susan Miller (_uref(mailto:susanjo at cs.arizona.edu,,susanjo@@cs.arizona.edu)).
+
+ at table @strong
+ at item Description
+This module uses the @code{FakII} suite of programs to perform shotgun
+assembly. Output will be placed in the @i{fofn}.assembly directory, where @i{fofn}
+is the filename prefix listed in the "Files to Process" panel. The output is
+in a format suitable for directed assembly within gap4. This can also be
+performed by using the "Enter Assembly into Gap4" module.
+ at sp 1
+
+ at item Option: E limit
+ at itemx Option: D limit
+ at itemx Option: O threshold
+These parameters control the @code{graph} component of FakII, which is used to
+find the initial overlaps between sequences.
+_ifdef([[_unix]],[[For further details, see
+_fref(Assembly-FAKII, Assembly FAKII, assembly)
+]])@sp 1
+
+ at item Option: Auto-generate constraints
+When enabled, this uses the reading direction (forward / reverse primers),
+the template name and the insert size, to produce a file containing data to
+constrain how the readings may be assembled.
+ at sp 1
+
+ at item Option: E rate
+ at itemx Option: O threshold
+ at itemx Option: D threshold
+These parameters control the @code{assemble} component of FakII, which is used
+for determining the best construction of sequences from the overlap graph.
+
+ at item Option: Assembly number
+This allows for non-optimum assemblies to be chosen. The optimum assembly is
+assembly number 1, with the next optimum being number 2, and so on.
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Phrap Assembly
+ at section Phrap Assembly
+ at cindex Phrap assembly module
+ at pindex phrap_assemble.p4m
+
+Phrap is not included as part of the Staden Package. It is available from
+Phil Green.
+_uref(http://www.genome.washington.edu/UWGC/analysistools/phrap.htm)
+
+ at table @strong
+ at item Description
+This module uses the @code{phrap} program to perform shotgun assembly. Output
+will be placed in the @i{fofn}.assembly directory, where @i{fofn} is the
+filename prefix listed in the "Files to Process" panel. The output is in a
+format suitable for directed assembly within gap4. This can also be performed
+by using the "Enter Assembly into Gap4" module. Phrap can make use of the
+confidence value information written by the @code{phred} program to produce
+better assemblies. Phrap also uses the full length of the sequence and will
+ignore any quality clipping. It is still necessary to clip sequencing vector.
+ at sp 1
+
+ at item Option: Minimum exact match
+Minimum length of matching word for SWAT comparison.
+ at sp 1
+
+ at item Option: Minimum SWAT score
+Minimum SWAT score.
+ at sp 1
+
+ at item Option: Other phrap arguments
+Any other phrap command line arguments.
+ at sp 1
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Enter Assembly
+ at section Enter Assembly into Gap4
+ at cindex Enter assembly module
+ at pindex enter_assembly.p4m
+
+ at table @strong
+ at item Description
+This module is used to enter assemblies into gap4 which have been generated
+externally to gap4 (i.e. all assembly engines except "Gap4 shotgun assembly"). This is
+achieved by using the gap4 "Directed Assembly" function. The assembly is read
+from the @i{fofn}.assembly directory, where @i{fofn} is the filename prefix
+listed in the "Files to Process" panel.
+ at sp 1
+
+ at item Option: Gap4 database name
+ at itemx Option: Gap4 database version
+The name and version of the database to assemble into.
+ at sp 1
+
+ at item Option: Create new database
+This is a toggle to define whether the specified gap4 database should be
+created or appended to.  Be warned that at present creating a new database
+will overwrite any existing one of the same name, in the same directory, without
+any warnings.
+ at sp 1
+
+ at item Option: Post-assembly quality clipping
+ at itemx Option: Lowest (average) quality to use
+This can be used to direct gap4 to run the "Quality Clip" function after
+entering the assembly. This performs quality clipping by identifying segments
+where the average quality is below a particular threshold. This should only be
+necessary if quality clipping was not performed earlier (e.g. because Phrap was
+used for assembly), and even then it is usually better to use difference
+clipping instead.
+ at sp 1
+
+ at item Option: Post-assembly difference clipping
+This can be used to direct gap4 to run the "Difference Clip" function after
+entering the assembly. This identifies ends of readings where the alignment
+between readings and consensus is bad and marks these ends as hidden data. This
+is primarily designed for use after the Phrap assembly engine, which sometimes
+leaves poorly aligned end fragments.
+ at end table
+
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Email
+ at section Email
+ at cindex Email module
+ at pindex email.p4m
+
+ at table @strong
+ at item Description
+This module can be used to send an E-mail indicating that the processing of
+Pregap4 has reached a given point. This may be of use when running pregap4 in
+batch mode, where the GUI is not visible. Typically the email module is placed
+at the end of the module list to indicate that pregap4 has (almost) finished,
+however it may be used elsewhere in the module list if desired.
+ at sp 1
+
+ at item Option: Email address
+The email address to send a message to.
+ at sp 1
+
+ at item Option: Email program
+The mail agent used for sending the message.
+ at sp 1
+
+ at item Option: Program arguments
+The arguments (except for the email address) to the mail agent. These could
+include options for setting the email subject.
+ at end table
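+
+As a rough illustration of how the three options fit together, the sketch
+below builds a command line from a mail program, its arguments and the
+address. It assumes the address is appended after the other arguments, which
+the text above suggests but does not state; the function name is hypothetical
+and @code{mailx} with @code{-s} is only an example of a mail agent.
+

```python
import shlex


def build_mail_command(program, arguments, address):
    """Combine the mail agent, its arguments and the recipient into an argv list.

    Assumption: the recipient address is placed last on the command line.
    """
    return [program, *shlex.split(arguments), address]


# Example: a subject line passed through the "Program arguments" option.
command = build_mail_command("mailx", '-s "pregap4 finished"', "user@example.org")
```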
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-Old Cloning Vector
+ at section Old Cloning Vector Clip - Obsolete
+ at cindex Cloning vector clip module (old style)
+ at pindex old_cloning_vector_clip.p4m
+
+ at table @strong
+ at item Description
+This is an older version of the Cloning Vector Clip module. It still uses the
+ at code{vector_clip} program to perform this task, but does not use the newer
+probabilistic model for analysing matches. It is still present as an option
+for people who have tuned the parameters for their data and are happy with
+this. The probability mode is recommended
+(_fpref(Vector_Clip, Screening against Vector Sequences, vector_clip)).
+ at sp 1
+
+ at item Option: Vector file name
+The filename containing the vector sequence. At present this should be a file
+containing a single plain text sequence containing just the bases or white
+space.
+ at sp 1
+
+ at item Option: Word length
+ at itemx Option: Number of diagonals
+ at itemx Option: Diagonal score
+The searching method involves hashing words to quickly identify matches and
+then combining these words along the best and neighbouring diagonals to
+produce an overall score which is compared against the diagonal score to
+determine whether this is vector sequence. The score is normalised from 0 (no
+match) to 1.0 (perfect match). For full details on this see the vector_clip
+manual.
+ at end table
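+
+The general idea of hashing words and scoring diagonals can be sketched as
+follows. This is not the @code{vector_clip} implementation: the function names
+are hypothetical, and the normalisation (matched words on the best diagonal
+and its neighbours, divided by the number of words in the reading) is an
+assumption chosen only to give a score between 0 and 1.0.
+

```python
from collections import Counter


def diagonal_hits(seq, vector, word_length=4):
    """Count matching words on each diagonal (reading offset minus vector offset)."""
    index = {}
    for j in range(len(vector) - word_length + 1):
        index.setdefault(vector[j:j + word_length], []).append(j)
    hits = Counter()
    for i in range(len(seq) - word_length + 1):
        for j in index.get(seq[i:i + word_length], ()):
            hits[i - j] += 1
    return hits


def best_diagonal_score(seq, vector, word_length=4, num_diagonals=1):
    """Combine the best diagonal with its neighbours and normalise to 0..1."""
    hits = diagonal_hits(seq, vector, word_length)
    if not hits:
        return 0.0
    max_words = len(seq) - word_length + 1
    best = max(
        sum(hits.get(d + k, 0) for k in range(-num_diagonals, num_diagonals + 1))
        for d in hits
    )
    return min(1.0, best / max_words)
```

+A reading would then be treated as vector when this score reaches the
+"Diagonal score" threshold.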
+
+ at c --------------------
+_split()
+ at node Pregap4-Modules-ABI2SCF
+ at section ALF/ABI to SCF Conversion - Obsolete
+ at cindex ALF/ABI to SCF conversion module
+
+ at table @strong
+ at item Description
+This module converts ABI and ALF files to SCF format using the @code{makeSCF}
+program. SCF format is not required by programs such as gap4,
+but it is considerably
+smaller and has been designed to give high compression ratios.
+ at sp 1
+
+ at item Option: SCF bit size
+This selects the data size for the chromatogram data. An 8 bit value can store
+256 possible values, which is typically good enough for display purposes. If Y
+scaling is required (for instance because the signal strength diminishes
+significantly along the length of the trace), or further computational analysis
+of the trace is required, a 16 bit data size should be chosen.  As the majority
+of the trace file is the sample data, using 8 bit data typically saves about
+half of the disk space.
+_ifdef([[_unix]],[[Also see _oref(Pregap4-Modules-Compress Traces, Compress Trace Files).]])
+This module may also be used for converting 16-bit SCF files to 8-bit SCF
+files.
+ at end table
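+
+To see why 8 bit data is adequate for display but loses detail, consider the
+following sketch of reducing 16 bit samples to the 0 to 255 range. The linear
+rescaling used here is an assumption for illustration; @code{makeSCF} may
+scale samples differently, and the function name is hypothetical.
+

```python
def to_8bit(samples):
    """Rescale non-negative integer samples so that they fit in 0..255.

    Samples that already fit are returned unchanged.
    """
    peak = max(samples, default=0)
    if peak <= 255:
        return list(samples)
    # Integer scaling: distinct 16 bit values may collapse to the same 8 bit value.
    return [s * 255 // peak for s in samples]
```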
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Config-Files
+ at chapter Using Config Files
+ at cindex Configuration files
+
+Pregap4 uses configuration files to remember the setup for each user or
+project. These files define which modules are activated and what their
+parameter settings are. The files,
+which can obviously save considerable amounts of time, are created
+automatically and can be saved from the Configure Modules Window once
+the configuration is complete.
+
+The "Load New Config File" option, available from the File menu, may be used
+to switch to a new (existing) configuration file. Pregap4 will display a file
+browser window to enable selection of another configuration file. Once chosen,
+Pregap4 will discard the existing configuration and use the new one.
+From this point onwards, any modifications and saving in Pregap4
+will be to the new configuration file.
+
+_ifdef([[_unix]],[[
+This option is equivalent to selecting the configuration file on the
+command line, as in the following example.
+
+ at example
+pregap4 -config new_config_file
+ at end example
+]])
+
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Naming
+ at chapter Pregap4 Naming Schemes
+ at cindex Load naming scheme
+ at cindex Naming schemes
+
+ at menu
+* Pregap4-Naming-Mutation detection:: Mutation Detection Naming Scheme
+* Pregap4-Naming-Sanger_names_old:: Old Sanger Centre Naming Scheme
+* Pregap4-Naming-Sanger_names_new:: New Sanger Centre Naming Scheme
+* Pregap4-Naming-Writing::          Writing your own Naming Schemes
+ at end menu
+
+The "Load Naming Scheme" command is in the File menu. It will bring up a
+dialogue requesting the pathname of a naming scheme file. The browse button
+will automatically bring up the file browser in the pregap4 naming scheme
+directory, however naming schemes can be loaded from elsewhere if desired.
+The "Save to config file" query determines whether the component is also
+copied to the current pregap4 configuration file to make this component the
+default for subsequent pregap4 runs.
+
+The use of naming schemes within pregap4 is specifically for extracting
+information from a reading name in order to supply parameters to other pregap4
+modules or to gap4. For example a naming scheme may be used to indicate where
+both the forward and reverse primers have been used to generate two sequences,
+which gap4 can then use for checking assembly and suggesting possible contig
+joins.
+
+Currently only two naming schemes are supplied with pregap4, both of which are
+from the Sanger Centre. To create your own naming schemes please see
+_oref(Pregap4-Naming-Writing, Writing Your Own Naming Scheme).
+
+_split()
+ at node Pregap4-Naming-Mutation detection
+ at section Mutation Detection Naming Scheme
+ at cindex Mutation detection naming scheme
+ at pindex mutation_detection.p4t
+
+ at table @strong
+ at item Filename
+mutation_detection.p4t
+ at sp 1
+
+ at item Description
+This naming scheme can be used for other purposes too, but its primary goal is
+to provide the simplest scheme possible suitable for handling pairs of
+sequences for the mutation detection module.
+
+Any sequence with a name ending with @code{f} or @code{F} is assumed to be a
+forward reading and any sequence with a name ending with @code{r} or
+ at code{R} is assumed to be a reverse reading. The rest of the name
+(i.e. everything except the last character) is used as the template name and
+so needs to exactly match between the forward and reverse reading pair.
+ at sp 1
+
+ at item Configuration section
+ at code{[naming_scheme]}
+ at sp 1
+
+ at item Configuration elements
+ at code{PR_com}, @code{TN_com}.
+ at end table
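+
+The rule described above is simple enough to express directly. The following
+Python sketch mirrors it for illustration only; pregap4 itself implements the
+scheme in its own component file, and the function name here is hypothetical.
+

```python
def parse_mutation_name(name):
    """Split a reading name into (template_name, direction).

    The last character selects the direction (f/F forward, r/R reverse);
    everything before it is the template name. Returns None otherwise.
    """
    if not name:
        return None
    last = name[-1].lower()
    if last == "f":
        direction = "forward"
    elif last == "r":
        direction = "reverse"
    else:
        return None
    return name[:-1], direction
```

+Two readings form a pair when their template names are identical, for example
+@code{sample12f} and @code{sample12R}.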
+
+
+_split()
+ at node Pregap4-Naming-Sanger_names_old
+ at section Old Sanger Centre Naming Scheme
+ at cindex Sanger Centre naming scheme, old
+ at pindex sanger_names_old.p4t
+
+ at table @strong
+ at item Filename
+sanger_names_old.p4t
+ at sp 1
+
+ at item Description
+This scheme extracts information from sequence names by assuming that they
+adhere to the old-style Sanger Centre naming scheme. The information extracted
+consists of the template name, primer type and chemistry information. The
+format of a reading name is as follows.
+ at sp 1
+<@i{template_name}>@code{.}<@i{strand}><@i{primer}><@i{conditions}><@i{repetition}>
+ at sp 1
+In the above, @i{strand} and @i{primer} are each one character long and are
+defined according to the following tables. @i{Conditions} can be
+0, 1 or 2 characters indicating none, one or two sequencing conditions.
+ at i{Repetitions} is optional and is used purely for creating unique names when
+resequencing a template with the same strand, primer and conditions as a
+previous sequencing reaction.
+
+ at tex
+\global\tableindent=1.2in
+ at end tex
+ at sp 1
+ at table @asis
+ at item @strong{Strand}
+ at strong{Description}
+ at item @code{s}, @code{f}
+Forward, single stranded template.
+ at item @code{r}
+Reverse, single stranded template.
+ at item @code{p}
+Forward, double stranded template.
+ at item @code{q}
+Reverse, double stranded template.
+ at end table
+
+ at sp 1
+ at table @asis
+ at item @strong{Primer}
+ at strong{Description}
+ at item @code{1}
+Universal primer (end of insert).
+ at item @code{2}, @code{3}, etc
+Custom primer.
+ at end table
+
+ at sp 1
+ at table @asis
+ at item @strong{Conditions}
+ at strong{Description}
+ at item @code{t}
+Dye terminator chemistry.
+ at item @code{l}
+Long gel.
+ at end table
+
+ at sp 1
+ at table @asis
+ at item @strong{Repetition}
+ at strong{Description}
+ at item @code{a}, @code{b}, @code{c}, ...
+Any letter except l and t
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+
+ at sp 1
+For example "@code{U16F10.p1t}" is a forward dye-terminator sequence from the
+double stranded template named @code{U16F10} using the universal primer.
+ at sp 1
+
+ at item Configuration section
+ at code{[naming_scheme]}
+ at sp 1
+
+ at item Configuration elements
+ at code{PR_com}, @code{TN_com}, @code{CH_com}.
+ at end table
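+
+The name format above can be checked with a short Python sketch. It is an
+illustration, not the contents of @code{sanger_names_old.p4t}: the regular
+expression and function name are hypothetical, and it assumes that the
+template name contains no full stop and that the repetition is a single
+lower case letter.
+

```python
import re

# <template>.<strand><primer><conditions><repetition>
OLD_SANGER = re.compile(
    r"^(?P<template>[^.]+)\."
    r"(?P<strand>[sfrpq])"
    r"(?P<primer>[1-9])"
    r"(?P<conditions>[tl]{0,2})"
    r"(?P<repetition>[a-km-su-z]?)$"
)

STRANDS = {
    "s": ("forward", "single"),
    "f": ("forward", "single"),
    "r": ("reverse", "single"),
    "p": ("forward", "double"),
    "q": ("reverse", "double"),
}
CONDITIONS = {"t": "dye terminator", "l": "long gel"}


def parse_old_sanger(name):
    """Decode an old-style name into a dict, or return None if it does not fit."""
    m = OLD_SANGER.match(name)
    if m is None:
        return None
    direction, strands = STRANDS[m["strand"]]
    return {
        "template": m["template"],
        "direction": direction,
        "strands": strands,
        "primer": "universal" if m["primer"] == "1" else "custom",
        "conditions": [CONDITIONS[c] for c in m["conditions"]],
        "repetition": m["repetition"],
    }
```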
+
+_split()
+ at node Pregap4-Naming-Sanger_names_new
+ at section New Sanger Centre Naming Scheme
+ at cindex Sanger Centre naming scheme, new
+ at pindex sanger_names_new.p4t
+
+ at table @strong
+ at item Filename
+sanger_names_new.p4t
+ at sp 1
+
+ at item Description
+This scheme extracts information from sequence names by assuming that they
+adhere to the new-style (~1997) Sanger Centre naming scheme. This is explained
+clearly on the Sanger Centre's web pages at
+
+_uref(http://www.sanger.ac.uk/Software/sequencing/,,http://www.sanger.ac.uk/Software/sequencing/)
+ at sp 1
+The information extracted consists of the template name, primer type and
+chemistry information. The format of a reading name is as follows.
+ at sp 1
+<@i{template_name}>@code{.}<@i{strand}><@i{primer}><@i{chemistry}>
+ at sp 1
+The @i{strand}, @i{primer} and @i{chemistry} fields are each one character
+long and are mandatory. They are defined by the following tables.
+
+ at tex
+\global\tableindent=1.2in
+ at end tex
+ at sp 1
+ at table @asis
+ at item @strong{Strand}
+ at strong{Description}
+ at item @code{p}
+Forward, double stranded template.
+ at item @code{q}
+Reverse, double stranded template.
+ at item @code{r}
+Reverse, single stranded template.
+ at item @code{s}
+Forward, single stranded template.
+ at end table
+
+ at sp 1
+ at table @asis
+ at item @strong{Primer}
+ at strong{Description}
+ at item @code{1}
+Universal primer (end of insert).
+ at item @code{2}
+Custom primer.
+ at end table
+
+ at sp 1
+ at table @asis
+ at item @strong{Conditions}
+ at strong{Description}
+ at item @code{t}
+standard (ABI) terminator
+ at item @code{d}
+dRhodamine terminator
+ at item @code{p}
+standard (ABI) primer
+ at item @code{e}
+energy transfer primer
+ at item @code{b}
+big dye primer
+ at item @code{c}
+big dye terminator
+ at item @code{l}
+licor
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+
+ at sp 1
+For example "@code{U16F10.p1t}" is a forward standard (ABI) terminator
+sequence from the double stranded template named @code{U16F10} using the
+universal primer.
+ at sp 1
+
+ at item Configuration section
+ at code{[naming_scheme]}
+ at sp 1
+
+ at item Configuration elements
+ at code{PR_com}, @code{TN_com}, @code{CH_com}.
+ at end table
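+
+Because all three fields are mandatory and one character long, the new-style
+names can be decoded with plain table lookups. The Python sketch below is
+illustrative only; the function name is hypothetical, and it assumes that
+nothing follows the chemistry character.
+

```python
STRAND = {
    "p": ("forward", "double"),
    "q": ("reverse", "double"),
    "r": ("reverse", "single"),
    "s": ("forward", "single"),
}
PRIMER = {"1": "universal", "2": "custom"}
CHEMISTRY = {
    "t": "standard (ABI) terminator",
    "d": "dRhodamine terminator",
    "p": "standard (ABI) primer",
    "e": "energy transfer primer",
    "b": "big dye primer",
    "c": "big dye terminator",
    "l": "licor",
}


def parse_new_sanger(name):
    """Return (template, direction, strands, primer, chemistry) or None."""
    template, dot, suffix = name.rpartition(".")
    if not dot or not template or len(suffix) != 3:
        return None
    strand, primer, chemistry = suffix
    if strand not in STRAND or primer not in PRIMER or chemistry not in CHEMISTRY:
        return None
    direction, strands = STRAND[strand]
    return (template, direction, strands, PRIMER[primer], CHEMISTRY[chemistry])
```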
+
+_split()
+ at node Pregap4-Naming-Writing
+ at section Writing Your Own Naming Schemes
+ at cindex Naming schemes, creating.
+
+The naming schemes are defined in the "component" files. At present two
+examples exist; both are naming schemes taken from the Sanger Centre. It is
+possible to define your own naming scheme, or indeed any other component. A
+component is basically just a file which you want to add (in its entirety) to
+the user's pregap4 configuration file. Typically these files end in the
+extension @file{.p4t}.
+
+The naming schemes are defined by use of three variables: @var{ns_name},
+ at var{ns_regexp} and @var{ns_lt}.
+
+ at var{ns_name} is simply a text name for the naming scheme.
+
+ at var{ns_regexp} is a regular expression which will be matched against each
+sequence identifier. The bracketed segments are assigned to Tcl variables
+which can be referenced as @code{$1}, @code{$2}, @code{$3} etc.
+
+ at var{ns_lt} is an array indexed by Experiment File line types. The contents of
+a particular array element is either a string containing the value for that
+line type or the word @code{subst} followed by a substitution list of the
+following format:
+
+ at code{subst @{}@i{string} @code{@{}@i{pattern} @i{replacement}@code{@}} ...
+ at i{default_replacement}@code{@}}
+
+In addition to this we need a bit of preamble stating that the following
+component is part of the pregap4 naming scheme section. This can be done by
+making sure the first line of the component file is @code{[naming_scheme]}.
+
+A completely new example naming scheme may be, in English, as follows:
+
+The reading identifier will consist of the template name, followed by a full
+stop, followed by two characters to determine the primer type and position,
+a single character to determine the chemistry, and any extra characters needed
+to create a unique name. Forward and reverse readings from the same "insert"
+or "template" will share the same template name. This in turn allows for gap4
+to know the relative positions, orientations and distances of two such
+readings and hence will allow it to point out possible problems.
+
+Putting this more specifically: a template name is any string of
+alpha-numerics (a-z, 0-9 and underscore). The primer type could be defined as:
+
+ at table @code
+ at item uf
+universal forward primer
+ at item ur
+universal reverse primer
+ at item cf
+custom forward primer
+ at item cr
+custom reverse primer
+ at end table
+
+The chemistry can be defined as:
+
+ at table @code
+ at item p
+Dye-Primer
+ at item P
+Big dye-primer
+ at item t
+Dye-Terminator
+ at item T
+Big dye-terminator
+ at end table
+
+For example @code{fred.ufp}, @code{fred.urp} and @code{bert.cfT} are all valid
+names.
+
+The above variable definitions may seem complex so we shall work through the
+example naming scheme. Firstly we need to define the regular expression. To
+new users this can be complex, but is described in great detail in many places
+(try the Unix "grep" manual page). In the shortest form: dot (@code{.})
+matches any character; square brackets delimit a set of characters, any one of
+which is allowed (or if it starts with @code{^} it is the complement set - any
+except those listed). Following a character or set with @code{+} indicates one
+or more copies of the preceding expression, @code{*} is for zero or more
+copies, and @code{?} is for zero or one copy.
+
+So to define our example names we would start our component file with:
+
+ at example
+[naming_scheme]
+set ns_name "Example naming scheme"
+set ns_regexp @{([^.]*)\.(..)(.).*@}
+ at end example
+
+The backslash in the above text is to state that we want to match a real full
+stop character instead of the "any character" that regular expressions usually
+regard full stop as meaning. The @code{ns_regexp} will store the three
+bracketed segments in @code{$1}, @code{$2} and @code{$3}.
+
+The first segment is the template name. To use this we simply add:
+
+ at example
+set ns_lt(TN) @{$1@}
+ at end example
+
+The next segment is the primer type.  The primer type is defined for gap4 as a
+single digit number. 0 is for unknown, 1 is universal forward primer, 2 is
+universal reverse primer, 3 is custom forward primer, and 4 is custom reverse
+primer. So we wish to map @code{uf} to @code{1}, @code{ur} to @code{2},
+ at code{cf} to @code{3}, @code{cr} to @code{4}, and anything else to @code{0}.
+This is done with the following command:
+
+ at example
+set ns_lt(PR) @{subst @{$2 @{uf 1@} @{ur 2@} @{cf 3@} @{cr 4@} 0@}@}
+ at end example
+
+The final segment is the chemistry. At present gap4 only distinguishes between
+dye-primer and dye-terminators, although our naming scheme also "knows about"
+big dyes. So we wish to map both @code{p} and @code{P} to chemistry type
+ at code{0}, and @code{t} and @code{T} to chemistry type @code{1}. Anything else
+we'll also assume is dye-primer. In much the same way that the regular
+expressions work, we can use square brackets in our patterns to say "any of
+these letters". So the command for this is:
+
+ at example
+set ns_lt(CH) @{subst @{$3 @{[pP] 0@} @{[tT] 1@} 0@}@}
+ at end example
+
+The final line to add to the component file is @code{set_name_scheme}. This is
+a pregap4 command which tells it that you have finished defining the naming
+scheme. So the completed component file is simply:
+
+ at example
+[naming_scheme]
+set ns_name "Example naming scheme"
+set ns_regexp @{([^.]*)\.(..)(.).*@}
+set ns_lt(TN) @{$1@}
+set ns_lt(PR) @{subst @{$2 @{uf 1@} @{ur 2@} @{cf 3@} @{cr 4@} 0@}@}
+set ns_lt(CH) @{subst @{$3 @{[pP] 0@} @{[tT] 1@} 0@}@}
+set_name_scheme
+ at end example
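+
+To check a naming scheme before using it, it can help to emulate the regular
+expression and the substitution lists outside pregap4. The Python sketch
+below does this for the example scheme. It is only an approximation: the
+function names are hypothetical, and it assumes that each substitution
+pattern must match the whole captured string, which may differ from how
+pregap4 applies them.
+

```python
import re

NS_REGEXP = re.compile(r"([^.]*)\.(..)(.).*")

# (pattern, replacement) pairs followed by a default, as in the subst lists.
PR_RULES = [("uf", "1"), ("ur", "2"), ("cf", "3"), ("cr", "4")]
CH_RULES = [("[pP]", "0"), ("[tT]", "1")]


def subst(value, rules, default):
    """Return the replacement for the first matching pattern, else the default."""
    for pattern, replacement in rules:
        if re.fullmatch(pattern, value):
            return replacement
    return default


def apply_scheme(name):
    """Map a reading name to its TN, PR and CH values, or None if it does not match."""
    m = NS_REGEXP.fullmatch(name)
    if m is None:
        return None
    template, primer, chemistry = m.groups()
    return {
        "TN": template,
        "PR": subst(primer, PR_RULES, "0"),
        "CH": subst(chemistry, CH_RULES, "0"),
    }
```

+For instance, @code{apply_scheme("bert.cfT")} yields template name
+@code{bert}, primer type @code{3} and chemistry type @code{1}.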
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Components
+ at chapter Pregap4 Components
+ at cindex Components
+ at cindex Include config component
+
+A "Component" in pregap4 is a predefined section of a pregap4 configuration
+file. It will generally be used to add on complex configurations which are not
+easily created using the GUI. Currently there are only two predefined
+components, both of which specify a naming scheme and so are most easily loaded
+using the Load Naming Convention function; see _oref(Pregap4-Naming, Load
+Naming Convention).
+
+_picture(pregap4_component,2.98333in)
+
+The "Include Config Component" command in the File menu is used to load a
+component. The "browse" button will bring up a file browser listing the
+default pregap4 component directory, however components can be loaded from
+elsewhere if desired. The "Save to config file" query determines whether the
+component is also copied to the current pregap4 configuration file to make
+this component the default for subsequent pregap4 runs.
+
+Note that a component may have a configuration section listed within it. If
+this is present the component will replace any configuration with the same
+section name.
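+
+The replacement rule can be pictured with a small Python sketch that treats a
+configuration file as a set of named sections. This is a simplified model, not
+pregap4's own parser: the function names are hypothetical, the
+@code{[other_section]} name in the usage is made up, and it assumes that a
+section starts at a line consisting solely of a name in square brackets.
+

```python
def split_sections(text):
    """Split configuration text into {section_name: [lines]}."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            current = stripped[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def include_component(config_text, component_text):
    """Sections in the component replace same-named sections in the config."""
    merged = split_sections(config_text)
    merged.update(split_sections(component_text))
    return merged


config = "[other_section]\nset a 1\n[naming_scheme]\nset ns_name old\n"
component = "[naming_scheme]\nset ns_name new\n"
merged = include_component(config, component)
```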
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Database
+ at chapter Information Sources
+ at cindex Information sources
+ at cindex Database integration
+
+ at menu
+* Pregap4-Database-Simple::     Simple Text Database
+* Pregap4-Database-LineTypes::  Experiment File Line Types
+ at end menu
+
+The "Information Sources" menu contains options to obtain information from
+external data sources, such as a text database. This does not include
+information encoded inside a reading name convention. At present there are only
+two options - Simple Text Database and Experiment File Line Types - although
+it is hoped that a link up to a robust relational database could also be
+included here.
+
+_split()
+ at node Pregap4-Database-Simple
+ at section Simple Text Database
+ at cindex Simple Text Database
+ at cindex Database, plain text format
+ at cindex Augment, by text database
+
+This option allows interrogation of a very simple format text database with
+one line per sequence. The sequence identifier is the first word of a line,
+followed by one or more additional columns of information about that
+sequence. All lines in the database file must have the same column layout,
+and only one database file may be used at any one time.
+
+For example, we may wish to store the primer type, primer site, template name
+and the number of strands on the template for each sequence. This corresponds
+to the @code{PR}, @code{SP}, @code{TN} and @code{ST} Experiment File line
+types. We could then create a text database looking something like the
+following:
+
+ at example
+# ID            PR      SP      TN        ST
+xb54a3.s1       1        41     xb54a3    1
+xb54b12.s1      1        41     xb54b12   2
+xb54b12.r1      2       -24     xb54b12   2
+xb54b12.r1L     2       -24     xb54b12   2
+ at end example
+
+(The first line, starting with @code{#}, is just a comment. Pregap4 does not
+use this; it is purely so that we know which information is in which column.)
+
+We can then direct pregap4 to extract the information from each of these four
+columns for each reading being processed and to store this information in the
+Experiment File. This information can then be utilised by the vector clipping
+and assembly modules.
+
+_picture(pregap4_simpledb,2.86667in)
+
+The Simple Text Database interface consists of an entry box to specify the
+database file name, add and delete buttons, and a line type selector for each
+column in the database (excluding the reading name column).
+The above picture contains the database setup for extracting the primer type,
+primer site, template name and number of strands as described in the above
+example.
+
+The "Add column" button adds a new line type selector at the bottom of the
+window.  This contains an option menu which can be clicked to choose a new
+Experiment File line type and a label indicating the column number. The
+"Delete column" button removes the bottom-most line type selector.
+
+The "Ok" button will accept this configuration and will also write the details
+to the current pregap4 configuration file. To disable a previously set up
+Simple Text Database configuration, press "Delete column" until there are no
+line types listed and then press "Ok" once more.
+
+The simple text database does not need to include a record for every
+reading and special characters can be used to encode names so that
+readings produced in similar ways can be grouped.
+For example, if the first 6 letters of the name encode a "plate" name,
+and all the sequences on that plate have been sequenced using the same
+vector, then we could create a database file as follows:
+
+ at example
+# ID            SF              SC      SP
+6abz91*         m13mp18.seq     6249    41/-24
+6aca68*         puc18.seq       248     40/-28
+6aca69*         puc18.seq       248     40/-28
+6aca70*         puc18.seq       248     40/-28
+6acb21*         m13mp18.seq     6249    41/-24
+6acd49*         puc18.seq       248     40/-28
+6acd51*         puc18.seq       248     40/-28
+ at end example
+
+The sequence identifier (ID) is searched for using a pattern matching rule (as
+dictated by the Tcl @code{string match} command). The pattern matching uses
+special characters as follows:
+
+ at table @code
+ at item *
+Matches any sequence of characters in the reading identifier, including an
+empty string.
+ at sp 1
+ at item ?
+Matches any single character in the reading identifier.
+ at sp 1
+ at item [@i{chars}]
+Matches any character in the set given by chars. If a sequence of the form
+ at i{r-v} appears in chars, then any character between @i{r} and @i{v},
+inclusive, will match (@code{rstuv}).
+ at sp 1
+ at item \@i{x}
+Matches the single character @i{x}. This provides a way of avoiding the special
+interpretation of the characters @code{*?[]\} in the reading identifier.
+ at end table
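+
+As a rough illustration of the lookup described above, the following Tcl
+sketch matches a reading name against database patterns using
+@code{string match}. The @code{lookup_reading} procedure, the in-memory record
+list and the reading name are hypothetical, and returning the first matching
+record is an assumption; this section does not say how pregap4 treats a name
+that matches several patterns.
+
```tcl
# Hypothetical helper (not part of pregap4): return the first record whose
# ID pattern matches the reading name, or an empty list if none match.
proc lookup_reading {db name} {
    foreach record $db {
        set pattern [lindex $record 0]
        if {[string match $pattern $name]} {
            return $record
        }
    }
    return {}
}

# Records copied from the plate example: ID pattern, SF, SC, SP.
set db {
    {6abz91* m13mp18.seq 6249 41/-24}
    {6aca68* puc18.seq 248 40/-28}
}

# The reading name is made up to fit the second pattern.
set record [lookup_reading $db 6aca68a01.s1]
puts "SF=[lindex $record 1] SC=[lindex $record 2] SP=[lindex $record 3]"
```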
+
+_split()
+ at node Pregap4-Database-LineTypes
+ at section Experiment File Line Types
+ at cindex Experiment File line types
+ at cindex Line types, in experiment file
+ at cindex Augment, by line types
+
+A list of Experiment File line types may be viewed and edited using this
+option (_fpref(Exp-Records, Experiment file format record types, exp)).
+A table of the Experiment File line types is displayed along with
+their current values. A brief description of the line type underneath the
+mouse cursor is displayed in the Information Line
+at the bottom of the window. The table consists of
+three columns. The first is a label identifying the line being edited. The
+second column is an option menu from which either @i{Value} or @i{Command} may
+be selected. The third column is the current value or command.
+
+_picture(pregap4_edit_exp,4.23333in)
+
+Line type values will be used for every sequence. The Augment Experiment Files
+module will add this line type to each Experiment File. This is suitable for
+specifying information which is constant across an entire batch, such as
+insert size (SI) or operator (OP). Line type commands are executed each
+time the Augment Experiment Files module adds that line to the Experiment
+File. Hence the commands are used for information which may change from
+sequence to sequence. This table should be used for editing line type values,
+but we do not recommend that you use it for editing commands (although it is
+useful to know which commands have been set).
+
+In the above example the PR, SP and ST commands were generated using the
+Simple Text Database interface, whilst the SC and SF values come from the
+Sequencing Vector Clip module parameters and the SI value was typed in by
+hand using this table.
+
+The "OK (in memory)" and "OK (and save)" buttons will accept the currently
+displayed values and commands. The "OK (in memory)" button will use these
+settings for the current pregap4 session only. The "OK (and save)" button will
+use them for the current session and all subsequent pregap4 sessions, as it
+saves the information to the pregap4 configuration file.
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-ModAdd
+ at chapter Adding and Removing Modules
+ at cindex Adding modules
+ at cindex Removing modules
+ at cindex Modules, adding and removing
+
+This section details how to select newly written modules in Pregap4 and how to
+change the order of existing modules.
+
+It is for system managers and advanced users only.
+
+Pregap4 has a default set of modules to use. Any module within this list may
+be enabled or disabled. If you only need to screen a set of experiment files
+using @code{blast} or @code{screen_seq} it may be tempting to use the
+Add/Remove Modules screen (from Modules menu) to remove everything else. This
+is not necessary; just disable the unwanted modules. The real purpose of
+Add/Remove Modules is to define the contents and order of the list that
+appears in the Configure Modules screen.  This may be required if you create
+your own modules, or if you wish to never use certain modules. (Removing them
+from the list instead of simply disabling them will speed up starting
+Pregap4.)
+
+It is possible for a module to be used more than once. For example if you wish
+to use blast to screen against several databases then this control may be used
+to add two "Blast screen" items to the Configure Modules screen. Note though,
+that this is not applicable to many modules. For example it is not possible to
+screen against multiple vectors by simply using multiple Sequencing Vector
+Clip modules (rather this should be done using a file of vector-primer
+information). No error checking is performed with the Add/Remove Modules
+screen.
+
+A pregap4 module is a specific piece of Tcl/Tk code that interfaces between
+pregap4 (by providing a @code{run} procedure and an optional GUI for
+configuration) and an external program to do the main work (as Tcl itself is
+generally too slow for anything except the most simple of operations). The
+exact specification of a module can be found elsewhere
+(_oref(Pregap4-WritingMods, Writing New Modules)).
+
+_picture(pregap4_select,4.9in)
+
+ at pindex .pregap4rc
+ at pindex pregap4rc
+ at vindex MODULE_PATH
+
+All modules must end in @file{.p4m}. Pregap4 uses a module search path to
+search for files with this suffix. The module search path is a space separated
+list. By default it will be set to @code{$STADENROOT/lib/pregap4/modules}. It
+may be adjusted temporarily within the program, or permanently by setting the
+MODULE_PATH variable within your @file{.pregap4rc} or run-specific
+configuration files. For example:
+
+ at example
+set MODULE_PATH "$env(STADLIB)/pregap4/modules ."
+ at end example
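+
+The search over the module path can be pictured with the following Tcl
+sketch. It is not pregap4's own code: the @code{find_modules} procedure is
+hypothetical and the directory in the example call is only a placeholder. It
+treats the path as a space separated list and collects every file ending in
+ at file{.p4m}, reporting each module by its file name without the suffix.
+
```tcl
# Hypothetical helper (not part of pregap4): list the module names found on a
# space separated search path, in path order.
proc find_modules {module_path} {
    set names {}
    foreach dir $module_path {
        # -nocomplain makes missing or empty directories yield no files.
        foreach file [lsort [glob -nocomplain -directory $dir -- *.p4m]] {
            lappend names [file rootname [file tail $file]]
        }
    }
    return $names
}

# Example call; the first directory is a placeholder and may not exist.
puts [find_modules "/path/to/staden/lib/pregap4/modules ."]
```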
+
+The two lists shown in the dialogue represent the current modules to use (on
+the left) and the total list of known modules (on the right). Modules may be
+added to the left (to use) list by pressing any mouse button on an entry in the
+right hand list, dragging the mouse cursor to a location within the left list,
+and then releasing the mouse button. To remove a module from the 'to use' list
+simply drag and drop from left to right. This mechanism also allows for
+changing the order of modules within the left list.
+
+The order of modules is vitally important and in the current version of
+Pregap4 the validity of the order is not checked. Common sense should prevent
+most problems. For instance it is pointless to assemble and enter into gap4
+before vector clipping. The best source of information on the possible
+orderings comes from the documentation for each individual module. Some
+modules are directly incompatible with each other as they perform the same or
+mutually exclusive tasks. For example it is only possible to use one of the
+assembly methods.
+
+Once the modules have been selected press "Apply" to reinitialise Pregap4.
+If you wish to make your newly selected list the default for subsequent
+Pregap4 runs use the "Save Module List" command in the Modules menu.
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-ManualConfig
+ at chapter Low Level Pregap4 Configuration
+ at cindex low level pregap4 configuration
+ at cindex Configuration: pregap4 low level
+
+ at menu
+* Pregap4-ManualConfig-Global::         Low Level Global Configuration
+* Pregap4-ManualConfig-Components::     Low Level Component Configuration
+* Pregap4-ManualConfig-Modules::        Low Level Module Configuration
+ at end menu
+
+Most users will never need to configure pregap4 at this level. However by
+understanding the methods that pregap4 uses you will see how to tailor Pregap4
+in a more flexible manner.
+
+The pregap4 configuration file consists of several sections. Each section is
+started with "@code{[}@i{section_name}@code{]}". The main pregap4 sections are
+ at code{[module_list]}, @code{[global_variables]}, and one section per module
+named @code{[::}@i{module_name}@code{]}.
+
+_split()
+ at node Pregap4-ManualConfig-Global
+ at section Low Level Global Configuration
+ at cindex Global variables
+ at vindex MODULE_PATH
+ at vindex MODULES
+
+The @code{[module_list]} section contains definitions of the
+ at code{MODULE_PATH} and @code{MODULES} variables. The @code{MODULE_PATH} is a
+space delimited list of directory names in which to search for the pregap4
+modules (@code{*.p4m} files). The @code{MODULES} variable is the default list
+of modules to list in the configuration window. The order of modules in this
+list determines the order that they will be executed in. The default
+ at code{[module_list]} section is as follows:
+
+ at example
+[module_list]
+set MODULE_PATH "$env(STADLIB)/pregap4/modules ."
+set MODULES @{
+    phred
+    atqa
+    convert_trace
+    eba
+    compress_trace
+    init_exp
+    augment_exp
+    quality_clip
+    trace_clip
+    sequence_vector_clip
+    cross_match_svec
+    cloning_vector_clip
+    screen_vector
+    screen_seq
+    blast
+    interactive_clip
+    repeat_masker
+    tag_repeats
+    trace_diff
+    gap4_assemble
+    cap2_assemble
+    cap3_assemble
+    fakii_assemble
+    phrap_assemble
+    enter_assembly
+    email
+@}
+ at end example
+
+The @code{[global_variables]} section defines the values for each of the
+Experiment File line types. These are currently primarily used by the Augment
+Experiment Files module, but may also be used by the vector clipping and
+mutation detection modules. The default @code{[global_variables]} section is
+blank.
+
+As an example, the following section defines the sequencing vector file for
+each sequence to be @code{m13mp18.vector} with @code{6249} and @code{41} as
+the cut site and primer site. Each sequence has a primer type of 1 (forward
+universal primer) and the template name is derived from the sequence name by
+taking the segment of the string preceding the first full stop.
+
+ at example
+[global_variables]
+set SF m13mp18.vector
+set SC 6249
+set SP 41
+set PR 1
+proc TN_com @{@} @{ global lines; return [lindex [split $lines(ID) .] 0] @}
+ at end example
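+
+To see what the @code{TN_com} procedure returns, it can be tried in a plain
+Tcl shell. In the sketch below the @code{lines} array is filled in by hand
+with a reading name taken from the earlier database example; inside pregap4
+this array is assumed to be supplied by the program itself.
+
```tcl
# Stand-in for the array of Experiment File lines that pregap4 is assumed to
# provide; only the ID entry is filled in here.
array set lines {ID xb54b12.r1L}

# Same procedure as in the [global_variables] example above.
proc TN_com {} { global lines; return [lindex [split $lines(ID) .] 0] }

puts [TN_com]   ;# prints xb54b12
```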
+
+_split()
+ at node Pregap4-ManualConfig-Components
+ at section Low Level Component Configuration
+ at cindex Component configuration
+
+Pregap4 Components may each contain their own configuration section. Where
+several components are mutually exclusive, such as components describing
+naming conventions, it makes sense to give each component the same
+configuration section. This will ensure that loading a new component will
+overwrite the old one. At present the only defined components both create a
+ at code{[naming_scheme]} section.
+
+Components may redefine items which could appear in other configuration
+sections. In this case the last definition of that setting will take priority.
+For instance if a component defines the @code{TN_com} procedure and this is
+also defined in the @code{[global_variables]} section then the component will
+only take priority if it is after the global section in the configuration file.
+
+Components may also be used to define parameters for modules. Once again the
+components need to be listed after the module definitions being overridden. To
+define module parameters in this way, use the @code{module} command. An example
+follows.
+
+ at example
+module tag_repeats @{set repeat_file repeats.list@}
+ at end example
+
+_split()
+ at node Pregap4-ManualConfig-Modules
+ at section Low Level Module Configuration
+ at cindex Modules, configuring
+ at cindex Configuring modules
+
+ at menu
+* Pregap4-ManualConfig-General::              General Configuration
+* Pregap4-ManualConfig-ABI2SCF::              ALF/ABI to SCF Conversion
+* Pregap4-ManualConfig-EBA::                  Estimate Base Accuracies
+* Pregap4-ManualConfig-Phred::                Phred
+* Pregap4-ManualConfig-ATQA::                 ATQA
+_ifdef([[_unix]],[[* Pregap4-ManualConfig-Compress Traces::      Compress Trace Files]])
+* Pregap4-ManualConfig-ConvertTrace::         Trace Format Conversion
+* Pregap4-ManualConfig-Initexp::              Initialise Experiment Files
+* Pregap4-ManualConfig-Augment::              Augment Experiment Files
+* Pregap4-ManualConfig-Uncalled Clip::        Uncalled Base Clip
+* Pregap4-ManualConfig-Quality Clip::         Quality Clip
+* Pregap4-ManualConfig-Sequence Vector::      Sequencing Vector Clip
+* Pregap4-ManualConfig-Cross_match::          Cross_match
+* Pregap4-ManualConfig-Cloning Vector::       Cloning Vector Clip
+* Pregap4-ManualConfig-Old Cloning Vector::   Old Cloning Vector Clip
+* Pregap4-ManualConfig-Screen Vector::        Screen for Unclipped Vector
+* Pregap4-ManualConfig-Screen::               Screen Sequences
+* Pregap4-ManualConfig-Blast::                Blast Screen
+* Pregap4-ManualConfig-Interactive Clip::     Interactive Clipping
+* Pregap4-ManualConfig-Extract Seq::          Extract Sequence
+* Pregap4-ManualConfig-Repeats::              Tag Repeats
+* Pregap4-ManualConfig-RepeatMasker::         Repeat Masker
+* Pregap4-ManualConfig-Mutations::            Mutation Detection
+* Pregap4-ManualConfig-Gap4 Assembly::        Gap4 Shotgun Assembly
+* Pregap4-ManualConfig-Cap2 Assembly::        Cap2 Assembly
+* Pregap4-ManualConfig-Cap3 Assembly::        Cap3 Assembly
+* Pregap4-ManualConfig-FakII Assembly::       FakII Assembly
+* Pregap4-ManualConfig-Phrap Assembly::       Phrap Assembly
+* Pregap4-ManualConfig-Enter Assembly::       Enter Assembly into Gap4
+* Pregap4-ManualConfig-Email::                Email
+* Pregap4-ManualConfig-Shutdown::             Shutdown
+ at end menu
+
+Each module has its own configuration section named after the module name. The
+configuration section will be "@code{[::}@i{module_filename_prefix}@code{]}"
+where @i{module_filename_prefix} is the filename of the module with the
+ at code{.p4m} extension removed. For example, the @code{init_exp.p4m} module has a
+configuration section of @code{[::init_exp]}.
+
+Each module is loaded into its own namespace, also named after the module in
+the same manner. Thus in the above example the Initialise Experiment Files
+module uses the namespace @code{::init_exp}. This means that all local
+variables within that module will be within that namespace and will not clash
+with identically named variables in other modules. When pregap4 reads the
+configuration file, any configuration section name starting with a double colon
+is taken to be a namespace and the configuration that follows is executed in
+that namespace. So the following example enables the Initialise Experiment
+Files module, but disables the Estimate Base Accuracies module.
+
+ at vindex enabled
+
+ at example
+[::init_exp]
+set enabled 1
+
+[::eba]
+set enabled 0
+ at end example
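+
+The way a section body ends up in a namespace can be sketched in a few lines
+of Tcl. This is an illustration of the idea only, not pregap4's parser: the
+ at code{eval_config} and @code{run_section} procedures are hypothetical, and
+evaluating all other sections at global level is an assumption.
+
```tcl
# Hypothetical helper: evaluate one section body, in its namespace if the
# section name starts with a double colon, otherwise at global level.
proc run_section {section body} {
    if {[string match ::* $section]} {
        namespace eval $section $body
    } else {
        uplevel #0 $body
    }
}

# Hypothetical helper: split configuration text into [section] blocks and
# evaluate each block in turn.
proc eval_config {text} {
    set section ""
    set body ""
    foreach line [split $text \n] {
        if {[regexp {^\[(.+)\]$} [string trim $line] -> name]} {
            run_section $section $body
            set section $name
            set body ""
        } else {
            append body $line \n
        }
    }
    run_section $section $body
}

eval_config {
[::init_exp]
set enabled 1

[::eba]
set enabled 0
}

puts "init_exp: $::init_exp::enabled, eba: $::eba::enabled"
```
+
+Because each @code{set enabled} runs inside a different namespace, the two
+variables are kept apart even though they share a name.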
+
+In the following sections the variables, inputs and outputs of each module are
+listed. Every module has an @code{enabled} local variable. This may be either
+ at code{0} for disabled or @code{1} for enabled. Disabled modules are still
+listed in the configuration panel, although they will not be executed.
+
+The tables in each section below list the module filename, the local variables
+and a very brief description of their valid values, the files used or produced
+by this module, the possible sequence specific errors that can be produced
+(which will be written to the failure file as the reason for failure), and the
+format of any @code{SEQ} lines in the module report. Other information may
+also be reported, but the @code{SEQ} lines are easily recognisable to
+facilitate easy parsing of results.
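+
+As an example of such parsing, the following Tcl sketch picks the
+ at code{SEQ} lines out of a report. The @code{parse_seq_lines} procedure and the
+sample report text are hypothetical, and the pattern assumes a single space
+after @code{SEQ} and a colon directly after the sequence identifier, as in the
+report formats listed in the sections below.
+
```tcl
# Hypothetical helper: return a flat list of "seqid message" pairs taken from
# the SEQ lines of a module report; all other lines are ignored.
proc parse_seq_lines {report} {
    set result {}
    foreach line [split $report \n] {
        if {[regexp {^SEQ (\S+): (.*)$} $line -> seqid message]} {
            lappend result $seqid $message
        }
    }
    return $result
}

# Made-up report text for illustration.
set report "Some other report text\nSEQ xb54a3.s1: passed\nSEQ xb54b12.r1: failed (made-up reason)\n"
foreach {seqid message} [parse_seq_lines $report] {
    puts "$seqid -> $message"
}
```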
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-General
+ at subsection General Configuration
+ at cindex General configuration configuration
+ at pindex init.p4m
+ at vindex use_sample_name
+
+ at table @strong
+ at item Filename
+ at code{init.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item use_sample_name
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+(none)
+ at sp 1
+
+ at item Errors
+ at code{init: Unreadable or nonexistent file}
+@*@code{init: Unknown file type}
+ at sp 1
+
+ at item Report
+(none)
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-ABI2SCF
+ at subsection ALF/ABI to SCF Conversion
+ at cindex ALF/ABI to SCF conversion configuration
+ at pindex to_scf.p4m
+ at vindex bit_size
+
+ at table @strong
+ at item Filename
+ at code{to_scf.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item bit_size
+ at code{8}/@code{16}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+Creates the SCF files if required.
+ at sp 1
+
+ at item Errors
+ at code{makeSCF: }@i{makeSCF error message}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: created from} @i{old seqid}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-EBA
+ at subsection Estimate Base Accuracies
+ at cindex Estimate base accuracies configuration
+ at pindex eba.p4m
+
+ at table @strong
+ at item Filename
+ at code{eba.p4m}
+ at sp 1
+
+
+ at item Local variables
+(none)
+ at sp 1
+
+ at item Files
+Modified SCF files with new confidence values.
+ at sp 1
+
+ at item Errors
+ at code{eba: }@i{eba error message}
+ at sp 1
+
+ at item Report
+(none)
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Phred
+ at subsection Phred
+ at cindex Phred configuration
+ at pindex phred.p4m
+
+ at table @strong
+ at item Filename
+ at code{phred.p4m}
+ at sp 1
+
+ at item Local variables
+(none)
+ at sp 1
+
+ at item Files
+ at pindex .scf_dir
+ at pindex phred.log
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.scf_dir} --- temporary directory
+ at br @code{phred.log} --- phred log file
+ at br Creates the SCF files if required.
+ at sp 1
+
+ at item Errors
+ at code{phred: }@i{phred error message}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: created from} @i{old seqid}
+ at br @code{SEQ} @i{seqid}@code{: re-base-called}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-ATQA
+ at subsection ATQA
+ at cindex ATQA configuration
+ at pindex atqa.p4m
+
+ at table @strong
+ at item Filename
+ at code{atqa.p4m}
+ at sp 1
+
+
+ at item Local variables
+(none)
+ at sp 1
+
+ at item Files
+Updates the confidence values within SCF files.
+ at sp 1
+
+ at item Errors
+ at code{ATQA: }@i{ATQA error message}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: confidences recalculated by ATQA.}
+ at end table
+ at sp 3
+
+_ifdef([[_unix]],[[
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Compress Traces
+ at subsection Compress Trace Files
+ at cindex Compress Trace Files configuration
+ at pindex compress_trace.p4m
+ at vindex compression
+ at vindex keep_names
+
+ at table @strong
+ at item Filename
+ at code{compress_trace.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item compression
+ at code{none}/@code{compress}/@code{gzip}/@code{bzip}/@code{bzip2}
+ at item keep_names
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at i{seqid}@code{.tmp} --- temporary
+ at br @i{seqid}@code{.}@i{compression_extension}
+ at sp 1
+
+ at item Errors
+(none)
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: did not compress}
+ at end table
+ at sp 3
+]])
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-ConvertTrace
+ at subsection Trace Format Conversion
+ at cindex Trace Format Conversion configuration
+ at pindex convert_trace.p4m
+
+ at table @strong
+ at item Filename
+ at code{convert_trace.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item output_format
+ at code{SCF}/@code{CTF}/@code{ZTR}
+ at item down_scale
+ at code{0}/@code{1}
+ at item down_scale_range
+ at i{integer}
+ at item subtract_background
+ at code{0}/@code{1}
+ at item normalise
+ at code{0}/@code{1}
+ at item del_temp_files
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+Creates Trace files.
+ at sp 1
+
+ at item Errors
+ at code{convert_trace: }@i{convert_trace error message}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: created from} @i{old seqid}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Initexp
+ at subsection Initialise Experiment Files
+ at cindex Initialise Experiment Files configuration
+ at pindex init_exp.p4m
+
+ at table @strong
+ at item Filename
+ at code{init_exp.p4m}
+ at sp 1
+
+ at item Local variables
+(none)
+ at sp 1
+
+ at item Files
+Creates Experiment files.
+ at sp 1
+
+ at item Errors
+ at code{init_exp: }@i{init_exp error message}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: created from} @i{old seqid}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Augment
+ at subsection Augment Experiment Files
+ at cindex Augment Experiment files configuration
+ at pindex augment_exp.p4m
+ at vindex _com
+
+ at table @strong
+ at item Filename
+ at code{augment_exp.p4m}
+ at sp 1
+
+ at item Local variables
+(none)
+ at sp 1
+
+ at item Global variables
+All the @code{@i{XX}} and @code{@i{XX}_com} variables that define the two
+letter Experiment File line types.
+ at sp 1
+
+ at item Files
+Updates Experiment files.
+ at sp 1
+
+ at item Errors
+(none)
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: added fields} @i{field names}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Uncalled Clip
+ at subsection Uncalled Base Clip
+ at cindex Uncalled base clip configuration
+ at pindex uncalled_clip.p4m
+ at vindex offset
+ at vindex min_extent
+ at vindex max_extent
+ at vindex right_win_length
+ at vindex right_num_uncalled
+ at vindex left_win_length
+ at vindex left_num_uncalled
+
+ at table @strong
+ at item Filename
+ at code{uncalled_clip.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item offset
+ at i{integer}
+ at item min_extent
+ at i{integer}
+ at item max_extent
+ at i{integer}
+ at item right_win_length
+ at i{integer}
+ at item right_num_uncalled
+ at i{integer}
+ at item left_win_length
+ at i{integer}
+ at item left_num_uncalled
+ at i{integer}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+Modifies Experiment files.
+ at sp 1
+
+ at item Errors
+ at code{clip: }@i{clip error message}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: clipped using 'clip'}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Quality Clip
+ at subsection Quality Clip
+ at cindex quality clip configuration
+ at pindex quality_clip.p4m
+ at vindex window_length
+ at vindex conf_val
+ at vindex clip_mode
+ at vindex offset
+ at vindex min_extent
+ at vindex max_extent
+ at vindex min_length
+ at vindex right_win_length
+ at vindex right_num_uncalled
+ at vindex left_win_length
+ at vindex left_num_uncalled
+
+ at table @strong
+ at item Filename
+ at code{quality_clip.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item clip_mode
+ at code{sequence}/@code{confidence}
+ at item window_length
+ at i{integer}
+ at item conf_val
+ at i{integer}
+ at item min_extent
+ at i{integer}
+ at item max_extent
+ at i{integer}
+ at item min_length
+ at i{integer}
+ at item offset
+ at i{integer}
+ at item right_win_length
+ at i{integer}
+ at item right_num_uncalled
+ at i{integer}
+ at item left_win_length
+ at i{integer}
+ at item left_num_uncalled
+ at i{integer}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+Modifies Experiment files.
+ at sp 1
+
+ at item Errors
+ at code{qclip: }@i{qclip error message}
+ at br @code{qclip: }@i{Sequence too short (length=%d)}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: clipped using 'qclip'}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Sequence Vector
+ at subsection Sequencing Vector Clip
+ at cindex Sequencing vector clip configuration
+ at pindex sequence_vector_clip.p4m
+ at vindex def_5_pos
+ at vindex update_exp_file
+ at vindex min_5_match
+ at vindex min_3_match
+ at vindex use_vp_file
+ at vindex vp_file
+ at vindex vector_list
+ at vindex vp_length
+ at vindex SP
+ at vindex SF
+ at vindex SC
+
+ at table @strong
+ at item Filename
+ at code{sequence_vector_clip.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item min_5_match
+ at i{float}
+ at item min_3_match
+ at i{float}
+ at item def_5_pos
+ at i{integer}
+ at item update_exp_file
+ at code{0}/@code{1}
+ at item use_vp_file
+ at code{0}/@code{1}
+ at item vp_file
+ at i{filename}
+ at item vector_list
+ at i{string}
+ at item vp_length
+ at i{integer}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Global variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item SP
+ at itemx SP_com
+ at item SC
+ at itemx SC_com
+ at item SF
+ at itemx SF_com
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .svec_passed
+ at pindex .svec_failed
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.svec_passed}
+ at br @i{fofn}@code{.svec_failed}
+ at sp 1
+
+ at item Errors
+ at code{sequence_vector_clip: No SF, SC or SP information}
+ at br @code{sequence_vector_clip: }@i{vector_clip error message}
+ at br @code{sequence_vector_clip: lost file}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: passed}
+ at br @code{SEQ} @i{seqid}@code{: failed (}@i{reason}@code{)}
+ at br @code{SEQ} @i{seqid}@code{: lost}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Cross_match
+ at subsection Cross_match
+ at cindex Cross_match configuration
+ at pindex cross_match_svec.p4m
+ at vindex minmatch
+ at vindex minscore
+ at vindex vector_file
+ at vindex SF
+
+ at table @strong
+ at item Filename
+ at code{cross_match_svec.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item minmatch
+ at i{integer}
+ at item minscore
+ at i{integer}
+ at item vector_file
+ at i{filename}
+ at item gap_size
+ at i{integer}
+ at item tag_type
+ at i{string}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Global variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item SF
+ at itemx SF_com
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.tmp.log} --- temporary
+ at br @i{fofn}@code{.tmp.screen} --- temporary
+ at sp 1
+
+ at item Errors
+ at code{cross_match: }@i{cross_match error message}
+ at br @code{cross_match: entirely vector}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: masked }@i{start}@code{ to }@i{end}
+ at br @code{SEQ} @i{seqid}@code{: no matches}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Cloning Vector
+ at subsection Cloning Vector Clip
+ at cindex Cloning vector clip configuration
+ at pindex cloning_vector_clip.p4m
+ at vindex word_length
+ at vindex probability
+ at vindex update_exp_file
+ at vindex CF
+
+ at table @strong
+ at item Filename
+ at code{cloning_vector_clip.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item word_length
+ at i{integer}
+ at item probability
+ at i{float}
+ at item update_exp_file
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Global variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item CF
+ at itemx CF_com
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .cvec_passed
+ at pindex .cvec_failed
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.cvec_passed}
+ at br @i{fofn}@code{.cvec_failed}
+ at sp 1
+
+ at item Errors
+ at code{sequence_vector_clip: No CF information}
+ at br @code{sequence_vector_clip: }@i{vector_clip error message}
+ at br @code{sequence_vector_clip: lost file}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: checked}
+ at br @code{SEQ} @i{seqid}@code{: failed (}@i{reason}@code{)}
+ at br @code{SEQ} @i{seqid}@code{: lost}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Old Cloning Vector
+ at subsection Old Cloning Vector Clip
+ at cindex Old cloning vector clip configuration
+ at cindex Cloning vector clip configuration (old style)
+ at pindex old_cloning_vector_clip.p4m
+ at vindex word_length
+ at vindex num_diags
+ at vindex diag_score
+ at vindex update_exp_file
+ at vindex CF
+
+ at table @strong
+ at item Filename
+ at code{old_cloning_vector_clip.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item word_length
+ at i{integer}
+ at item num_diags
+ at i{integer}
+ at item diag_score
+ at i{float}
+ at item update_exp_file
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Global variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item CF
+ at itemx CF_com
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .cvec_passed
+ at pindex .cvec_failed
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.cvec_passed}
+ at br @i{fofn}@code{.cvec_failed}
+ at sp 1
+
+ at item Errors
+ at code{sequence_vector_clip: No CF information}
+ at br @code{sequence_vector_clip: }@i{vector_clip error message}
+ at br @code{sequence_vector_clip: lost file}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: checked}
+ at br @code{SEQ} @i{seqid}@code{: failed (}@i{reason}@code{)}
+ at br @code{SEQ} @i{seqid}@code{: lost}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Screen Vector
+ at subsection Screen for Unclipped Vector
+ at cindex Screen for unclipped vector configuration
+ at pindex screen_vector.p4m
+ at vindex min_match
+ at vindex update_exp_file
+ at vindex SF
+
+ at table @strong
+ at item Filename
+ at code{screen_vector.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item min_match
+ at i{integer}
+ at item update_exp_file
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Global variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item SF
+ at itemx SF_com
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .screenvec_passed
+ at pindex .screenvec_failed
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.screenvec_passed}
+ at br @i{fofn}@code{.screenvec_failed}
+ at sp 1
+
+ at item Errors
+ at code{sequence_vector_clip: No CF information}
+ at br @code{sequence_vector_clip: }@i{vector_clip error message}
+ at br @code{sequence_vector_clip: lost file}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: passed}
+ at br @code{SEQ} @i{seqid}@code{: failed (}@i{reason}@code{)}
+ at br @code{SEQ} @i{seqid}@code{: lost}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Screen
+ at subsection Screen Sequences
+ at cindex Screen sequences configuration
+ at pindex screen_seq.p4m
+ at vindex min_match
+ at vindex max_length
+ at vindex screen_mode
+ at vindex screen_file
+
+ at table @strong
+ at item Filename
+ at code{screen_seq.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item min_match
+ at i{integer}
+ at item max_length
+ at i{integer}
+ at item screen_mode
+ at code{single}/@code{fofn}
+ at item screen_file
+ at i{filename}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .screenseq_passed
+ at pindex .screenseq_failed
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.screenseq_passed}
+ at br @i{fofn}@code{.screenseq_failed}
+ at sp 1
+
+ at item Errors
+ at code{screen_seq: }@i{screen_seq error message}
+ at br @code{screen_seq: lost file}
+ at sp 1
+
+ at item Report
+ at code{SEQ} @i{seqid}@code{: passed}
+ at br @code{SEQ} @i{seqid}@code{: failed (}@i{reason}@code{)}
+ at br @code{SEQ} @i{seqid}@code{: lost}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Blast
+ at subsection Blast Screen
+ at cindex Blast screen configuration
+ at pindex blast.p4m
+ at vindex match_fraction
+
+ at table @strong
+ at item Filename
+ at code{blast.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item database
+ at i{filename_prefix}
+ at item match_fraction
+ at i{float}
+ at item e_value
+ at i{float}
+ at item tag_type
+ at i{string}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .blast
+ at i{fofn}@code{.blast}
+ at sp 1
+
+ at item Errors
+ at code{blast: match fraction }@i{fraction}
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: total match length = }@i{match_length}
+ at code{ (fract=}@i{match_fraction}@code{)}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Interactive Clip
+ at subsection Interactive Clipping
+ at cindex Interactive clipping configuration
+ at pindex interactive_clip.p4m
+
+ at table @strong
+ at item Filename
+ at code{interactive_clip.p4m}
+ at sp 1
+
+ at item Local variables
+(none)
+ at sp 1
+
+ at item Files
+May modify Experiment files.
+ at sp 1
+
+ at item Errors
+ at code{interactive_clip: manually rejected}
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: accepted}
+ at br @code{SEQ }@i{seqid}@code{: rejected}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Extract Seq
+ at subsection Extract Sequence
+ at cindex Extract sequence configuration
+ at pindex extract_seq.p4m
+
+ at table @strong
+ at item Filename
+ at code{extract_seq.p4m}
+ at sp 1
+
+ at item Local variables
+(none)
+ at sp 1
+
+ at item Files
+    User specified
+ at sp 1
+
+ at item Errors
+    @code{extract_seq: }@i{extract_seq error message}
+ at sp 1
+
+ at item Report
+    (none)
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Repeats
+ at subsection Tag Repeats
+ at cindex Tag repeats configuration
+ at pindex tag_repeats.p4m
+ at vindex repeat_file
+ at vindex score
+ at vindex tag_types
+
+ at table @strong
+ at item Filename
+ at code{tag_repeats.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item repeat_file
+ at i{filename}
+ at item score
+ at i{float}
+ at item tag_types
+ at i{string}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .tagrep_free
+ at pindex .tagrep_repeat
+ at pindex .tagrep_log
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.tagrep_free}
+ at br @i{fofn}@code{.tagrep_repeat}
+ at br @i{fofn}@code{.tagrep_log}
+ at sp 1
+
+ at item Errors
+ at code{tag_repeats: lost file}
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: no repeat found}
+ at br @code{SEQ }@i{seqid}@code{: repeat found (}@i{details}@code{)}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-RepeatMasker
+ at subsection RepeatMasker
+ at cindex RepeatMasker configuration
+ at pindex repeat_masker.p4m
+ at vindex tag_type
+ at vindex alu_only
+ at vindex simple_only
+ at vindex no_primate_rodent
+ at vindex no_low_complexity
+ at vindex library
+ at vindex cutoff
+
+ at table @strong
+ at item Filename
+ at code{repeat_masker.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item tag_type
+ at i{string}
+ at item alu_only
+ at code{0}/@code{1}
+ at item simple_only
+ at code{0}/@code{1}
+ at item no_primate_rodent
+ at code{0}/@code{1}
+ at item rodent_only
+ at code{0}/@code{1}
+ at item no_low_complexity
+ at code{0}/@code{1}
+ at item library
+ at i{directory}
+ at item cutoff
+ at i{integer}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .fasta.cat
+ at pindex .fasta.masked
+ at pindex .fasta.masked.log
+ at pindex .fasta.out
+ at pindex .fasta.out.xm
+ at pindex .fasta.tbl
+ at i{fofn}@code{.fasta.cat}
+ at br @i{fofn}@code{.fasta.masked}
+ at br @i{fofn}@code{.fasta.masked.log}
+ at br @i{fofn}@code{.fasta.out}
+ at br @i{fofn}@code{.fasta.out.xm}
+ at br @i{fofn}@code{.fasta.tbl}
+ at sp 1
+
+ at item Errors
+(none)
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: Repeat} @i{repeat name}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Mutations
+ at subsection Mutation Detection
+ at cindex Mutation detection configuration
+ at pindex trace_diff.p4m
+ at vindex score
+ at vindex start
+ at vindex end
+ at vindex band_width
+ at vindex other_args
+ at vindex update_exp_file
+ at vindex WT
+
+ at table @strong
+ at item Filename
+ at code{trace_diff.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item score
+ at i{float}
+ at item start
+ at i{int}
+ at item end
+ at i{int}
+ at item band_width
+ at i{int}
+ at item other_args
+ at i{string}
+ at item update_exp_file
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Global variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item WT
+ at itemx WT_com
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+May modify Experiment files.
+ at sp 1
+
+ at item Errors
+ at code{trace_diff: No wildtype information}
+ at br @code{trace_diff: }@i{trace_diff error message}
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: mutations at }@i{positions}
+ at br @code{SEQ }@i{seqid}@code{: no mutations}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Gap4 Assembly
+ at subsection Gap4 Shotgun Assembly
+ at cindex Gap4 shotgun assembly configuration
+ at pindex gap4_assemble.p4m
+ at vindex database_name
+ at vindex database_version
+ at vindex create
+ at vindex min_match
+ at vindex max_pads
+ at vindex max_pmismatch
+ at vindex enter_all
+
+ at table @strong
+ at item Filename
+ at code{gap4_assemble.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item database_name
+ at i{string}
+ at item database_version
+ at i{character}
+ at item create
+ at code{0}/@code{1}
+ at item min_match
+ at i{integer}
+ at item max_pads
+ at i{integer}
+ at item max_pmismatch
+ at i{float}
+ at item enter_all
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .aux
+ at i{database_name}@code{.}@i{database_version}
+ at br @i{database_name}@code{.}@i{database_version}@code{.aux}
+ at sp 1
+
+ at item Errors
+ at code{gap4_assemble: failed with code }@i{error_code}
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: failed (}@i{assembly error code}@code{)}
+ at br @code{SEQ }@i{seqid}@code{: failed (filename contains spaces)}
+ at br @code{SEQ }@i{seqid}@code{: skipping - not in correct format}
+ at br @code{SEQ }@i{seqid}@code{: assembled}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Cap2 Assembly
+ at subsection Cap2 Assembly
+ at cindex Cap2 assembly configuration
+ at pindex cap2_assemble.p4m
+
+ at table @strong
+ at item Filename
+ at code{cap2_assemble.p4m}
+ at sp 1
+
+ at item Local variables
+(none)
+ at sp 1
+
+ at item Files
+ at pindex .assembly/fofn
+ at pindex .assembly/cap2_stdout
+ at pindex .assembly/cap2_stderr
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.assembly/fofn}
+ at br @i{fofn}@code{.assembly/cap2_stdout}
+ at br @i{fofn}@code{.assembly/cap2_stderr}
+ at sp 1
+
+ at item Errors
+ at code{cap2_s: }@i{cap2_s error message}
+ at br @code{cap2_s: rejected}
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: rejected}
+ at br @code{SEQ }@i{seqid}@code{: assembled}
+ at br @code{SEQ }@i{seqid}@code{: failed (filename contains spaces)}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Cap3 Assembly
+ at subsection Cap3 Assembly
+ at cindex Cap3 assembly configuration
+ at pindex cap3_assemble.p4m
+
+ at table @strong
+ at item Filename
+ at code{cap3_assemble.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item generate_constraints
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .assembly/fofn
+ at pindex .assembly/cap3_stdout
+ at pindex .assembly/cap3_stderr
+ at pindex .con
+ at pindex .con.results
+ at pindex .cap3_info
+ at pindex .contigs.qual
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.assembly/fofn}
+ at br @i{fofn}@code{.assembly/cap3_stdout}
+ at br @i{fofn}@code{.assembly/cap3_stderr}
+ at br @i{fofn}@code{.con}
+ at br @i{fofn}@code{.con.results}
+ at br @i{fofn}@code{.cap3_info}
+ at br @i{fofn}@code{.contigs.qual}
+ at sp 1
+
+ at item Errors
+ at code{cap3_s: }@i{cap3_s error message}
+ at br @code{cap3_s: rejected}
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: rejected}
+ at br @code{SEQ }@i{seqid}@code{: assembled}
+ at br @code{SEQ }@i{seqid}@code{: failed (filename contains spaces)}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-FakII Assembly
+ at subsection FakII Assembly
+ at cindex FakII assembly configuration
+ at pindex fakii_assemble.p4m
+ at vindex graph_e_limit
+ at vindex graph_o_threshold
+ at vindex graph_d_limit
+ at vindex assem_number
+ at vindex assem_e_rate
+ at vindex assem_o_threshold
+ at vindex assem_d_threshold
+ at vindex generate_constraints
+
+ at table @strong
+ at item Filename
+ at code{fakii_assemble.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item graph_e_limit
+ at i{float}
+ at item graph_o_threshold
+ at i{float}
+ at item graph_d_limit
+ at i{float}
+ at item assem_number
+ at i{integer}
+ at item assem_e_rate
+ at i{float}
+ at item assem_o_threshold
+ at i{float}
+ at item assem_d_threshold
+ at i{float}
+ at item generate_constraints
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .assembly/fofn
+ at pindex .assembly/graph_stderr
+ at pindex .assembly/graph.dat
+ at pindex .assembly/constraints_stderr
+ at pindex .assembly/constraints.ascii
+ at pindex .assembly/constraints.dat
+ at pindex .assembly/assemble_stderr
+ at pindex .assembly/assemble.dat
+ at pindex .assembly/write_exp_file_stdout
+ at pindex .assembly/write_exp_file_stderr
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.assembly/fofn}
+ at br @i{fofn}@code{.assembly/graph_stderr}
+ at br @i{fofn}@code{.assembly/graph.dat}
+ at br @i{fofn}@code{.assembly/constraints_stderr}
+ at br @i{fofn}@code{.assembly/constraints.ascii}
+ at br @i{fofn}@code{.assembly/constraints.dat}
+ at br @i{fofn}@code{.assembly/assemble_stderr}
+ at br @i{fofn}@code{.assembly/assemble.dat}
+ at br @i{fofn}@code{.assembly/write_exp_file_stdout}
+ at br @i{fofn}@code{.assembly/write_exp_file_stderr}
+ at sp 1
+
+ at item Errors
+ at code{fakii: rejected}
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: assembled}
+ at br @code{SEQ }@i{seqid}@code{: rejected}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Phrap Assembly
+ at subsection Phrap Assembly
+ at cindex Phrap assembly configuration
+ at pindex phrap_assemble.p4m
+ at vindex minmatch
+ at vindex minscore
+ at vindex other_args
+
+ at table @strong
+ at item Filename
+ at code{phrap_assemble.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item minmatch
+ at i{integer}
+ at item minscore
+ at i{integer}
+ at item other_args
+ at i{string}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .assembly/fofn
+ at pindex .assembly/phrap_stdout
+ at pindex .assembly/phrap_stderr
+ at pindex .contigs
+ at pindex .contigs.qual
+ at pindex .singlets
+ at pindex .phrap_log
+ at i{fofn}@code{.tmp} --- temporary
+ at br @i{fofn}@code{.assembly/fofn}
+ at br @i{fofn}@code{.assembly/phrap_stdout}
+ at br @i{fofn}@code{.assembly/phrap_stderr}
+ at br @i{fofn}@code{.contigs}
+ at br @i{fofn}@code{.contigs.qual}
+ at br @i{fofn}@code{.singlets}
+ at br @i{fofn}@code{.phrap_log}
+ at sp 1
+
+ at item Errors
+ at code{phrap: singlet}
+ at br @code{phrap: failed}
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: assembled}
+ at br @code{SEQ }@i{seqid}@code{: singlet}
+ at br @code{SEQ }@i{seqid}@code{: failed}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Enter Assembly
+ at subsection Enter Assembly into Gap4
+ at cindex Enter assembly configuration
+ at pindex enter_assembly.p4m
+ at vindex database_name
+ at vindex database_version
+ at vindex create
+ at vindex quality_clip
+ at vindex difference_clip
+
+ at table @strong
+ at item Filename
+ at code{enter_assembly.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item database_name
+ at i{string}
+ at item database_version
+ at i{character}
+ at item create
+ at code{0}/@code{1}
+ at item quality_clip
+ at code{0}/@code{1}
+ at item quality
+ at i{integer}
+ at item difference_clip
+ at code{0}/@code{1}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+ at pindex .assembly/fofn
+ at pindex .aux
+ at i{fofn}@code{.assembly/fofn} --- reads
+ at br @i{fofn}@code{.assembly/*} --- reads
+ at br @i{database_name}@code{.}@i{database_version}
+ at br @i{database_name}@code{.}@i{database_version}@code{.aux}
+ at sp 1
+
+ at item Errors
+(none)
+ at sp 1
+
+ at item Report
+ at code{SEQ }@i{seqid}@code{: failed}
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Email
+ at subsection Email
+ at cindex Email configuration
+ at pindex email.p4m
+ at vindex email_program
+ at vindex email_args
+ at vindex email_address
+
+ at table @strong
+ at item Filename
+ at code{email.p4m}
+ at sp 1
+
+ at item Local variables
+ at tex
+\global\tableindent=2in
+ at end tex
+ at table @code
+ at item email_program
+ at i{string}
+ at item email_args
+ at i{string}
+ at item email_address
+ at i{string}
+ at end table
+ at tex
+\global\tableindent=0.8in
+ at end tex
+ at sp 1
+
+ at item Files
+(none)
+ at sp 1
+
+ at item Errors
+(none)
+ at sp 1
+
+ at item Report
+(none)
+ at end table
+ at sp 3
+
+ at c --------------------
+_split()
+ at node Pregap4-ManualConfig-Shutdown
+ at subsection Shutdown
+ at cindex Shutdown configuration
+ at pindex shutdown.p4m
+
+ at table @strong
+ at item Filename
+ at code{shutdown.p4m}
+ at sp 1
+
+ at item Local variables
+(none)
+ at sp 1
+
+ at item Files
+ at pindex .passed
+ at pindex .failed
+ at pindex .log
+ at pindex .report
+ at i{fofn}@code{.passed}
+ at br @i{fofn}@code{.failed}
+ at br @i{fofn}@code{.log}
+ at br @i{fofn}@code{.report}
+ at sp 1
+
+ at item Errors
+(none)
+ at sp 1
+
+ at item Report
+This module collates the reports from all other modules.
+ at end table
+
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-WritingMods
+ at chapter Writing New Modules
+ at cindex Modules, creating
+
+ at menu
+* Pregap4-WritingMods-Overview::           An Overview of a Module
+* Pregap4-WritingMods-Functions::          Functions
+* Pregap4-WritingMods-Module Variables::   Module Variables
+* Pregap4-WritingMods-Global Variables::   Global Variables
+* Pregap4-WritingMods-Builtin::            Builtin Functions
+* Pregap4-WritingMods-Example::            An Example Module
+ at end menu
+
+ at node Pregap4-WritingMods-Overview
+ at section An Overview of a Module
+ at cindex Modules, overview
+
+A pregap4 module is a single file containing a series of functions with
+predefined interfaces, which pregap4 uses to communicate with the module.
+
+This section is for system managers and programmers only.
+
+
+The module itself is written in the Tcl/Tk language. A definition of this
+language is outside the scope of this manual, but several books exist on the
+subject. Each module executes inside its own Tcl "namespace". This means that
+modules may make use of global variables and global function names without
+fear of clashing with other modules. Indeed, the use of specific function names
+and global variables is of considerable importance for designing a new module.
+
+ at node Pregap4-WritingMods-Functions
+ at section Functions
+ at cindex Module functions
+ at cindex Functions, in modules
+
+The basic structure of a module is that it has a series of known functions
+which pregap4 expects to use. Some of these functions are mandatory, whilst
+others will only be called by pregap4 if they have been defined.
+
+ at table @var
+ at item name
+ at findex name
+    Mandatory.
+
+    Arguments: none
+
+    Returns:   The textual name of the module.
+
+    This function is used to query a human readable name for the module
+    (eg "ALF/ABI to SCF Conversion"). This name is used in the module list
+    at the left side of the pregap4 window.
+ at sp 2
+
+ at item init
+ at findex init
+    Optional.
+
+    Arguments: None
+
+    Returns:   None
+
+    This sets up any data structures needed for this module. It can be
+    used for providing defaults for global variables when they are not
+    known (eg they have no settings in the system or user pregap4rc files)
+    and for setting up any other data structures required.
+ at sp 2
+
+ at item run
+ at findex run
+    Optional.
+
+    Arguments: A Tcl list of files to process
+
+    Returns:   A new Tcl list of files for subsequent processing.
+
+    This is the main workhorse. It is optional; however, in all but the
+    most esoteric cases it will be needed.
+
+    The single argument is a Tcl list of sequence names. These are either
+    filenames on disk or identifiers used for fetching data from a
+    database. The module should loop through the sequences which it can
+    process (which may not be all of them, depending on the known
+    information and file types).
+
+    When finished, it needs to return a new list of files. If a file has
+    been rejected by this module (eg it is completely sequencing vector)
+    then this sequence name should be omitted from the returned list.
+    However, do make sure that all failed files have an error string
+    attached to them by setting the file_error(seq name) array element.
+ at sp 2
+
+ at item shutdown
+ at findex shutdown
+    Optional.
+
+    Arguments: A Tcl list of files to process
+
+    Returns:   A new Tcl list of files for subsequent processing.
+
+    Deallocates any data structures that have been set up during the init
+    or run stages. Most modules will not need this function. As with the
+    run function, the returned value should be the list of passed files,
+    which is generally the same as the list passed into this function.
+
+    A special module, which is always included by pregap4, is the
+    shutdown.p4m module. This is always the last module to have shutdown
+    called. It produces the reports for pregap4 and does some general
+    housekeeping.
+ at sp 2
+
+ at item create_dialogue
+ at findex create_dialogue
+    Optional.
+
+    Arguments: A tk pathname
+
+    Returns:   None
+
+    This creates a dialogue controlling the parameters for this module. The
+    tk pathname passed into this function should be the root for all
+    components of this dialogue. (Note though that this is not a toplevel
+    window, but a subwindow of the main pregap4 dialogue.)
+ at sp 2
+
+ at item check_params
+ at findex check_params
+    Optional.
+
+    Arguments: None
+
+    Returns:   A variable name or a blank string.
+
+    This checks that this module has valid answers to all of its
+    mandatory questions. If this is the case a blank string is returned,
+    otherwise the first variable name which needs a value is returned.
+ at sp 2
+
+ at item process_dialogue
+ at findex process_dialogue
+    Optional.
+
+    Arguments: A tk pathname
+
+    Returns:   0 for failure, 1 for success
+
+    This is executed in all modules before the run functions are executed.
+    Its purpose is to extract any information from user editable entries
+    or checkboxes ready for the run function to utilise. It may also be
+    used to check that the data entered is valid.
+
+    The return code is used to indicate whether this module has sufficient
+    data to execute. If 0 is returned pregap4 will beep and make sure that
+    the dialogue 'tab' for this module is displayed. Further processing
+    then stops until the 'Run' button is pressed again.
+
+    For instance if a module needs to know the sequencing vector to screen
+    against, then this should check if the value has been entered or can
+    be obtained via a command. If so it returns 1.
+ at sp 2
+
+ at item configure_dialogue path mode
+ at findex configure_dialogue
+    Optional.
+
+    Arguments: A tk pathname, the configure mode
+
+    Returns:   None
+
+    If this function is present pregap4 will add a button to the top of the
+    module dialogue inviting the user to save the parameters for this
+    module to the configuration file.
+
+    In early releases of pregap4 (2000.0 and before) a "Select parameters
+    to save" button was also available. To maintain compatibility with
+    older modules the "mode" parameter is still used. If you wish the
+    module to be backwards compatible with old pregap4 releases then this
+    needs to be checked to make sure that it contains "save_all". If it
+    does not then no action should be taken. In the 2001 release and newer
+    the "mode" parameter will always contain "save_all" so no check is
+    required.
+
+    To save the dialogue information this function should use the pregap4
+    mod_save and glob_save functions.
+ at end table
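+
+As a rough sketch of how these functions fit together, the following
+hypothetical module implements only @code{name} and @code{run}. The module
+title, the error prefix @code{example_filter} and the rule of rejecting files
+of type @code{UNK} are invented for illustration and do not describe any
+module shipped with pregap4. The sketch also assumes that the global arrays
+described in this chapter can be reached with the Tcl @code{global} command
+from inside the module namespace.
+
+ at example
+# Hypothetical module: rejects files whose type is unknown.
+proc name @{@} @{
+    return "Example Filter"
+@}
+
+proc run @{files@} @{
+    global file_type file_error
+    variable report
+
+    set report ""
+    set passed @{@}
+    foreach f $files @{
+        if @{$file_type($f) == "UNK"@} @{
+            # A rejected file must carry a reason and is left out of
+            # the returned list.
+            set file_error($f) "example_filter: unknown file type"
+            append report "SEQ $f: failed (unknown file type)\n"
+        @} else @{
+            lappend passed $f
+            append report "SEQ $f: passed\n"
+        @}
+    @}
+    return $passed
+@}
+ at end example
+
+Only the files that passed are returned, so later modules never see the
+rejected ones, while the reason for each rejection remains available to the
+shutdown module.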
+
+ at node Pregap4-WritingMods-Module Variables
+ at section Module Variables
+ at cindex Module variables
+ at cindex Variables, in modules
+
+ at table @var
+ at item mandatory
+ at vindex mandatory
+    The existence of this variable (set to anything) states that this
+    module cannot be disabled.
+ at sp 1
+
+ at item hidden
+ at vindex hidden
+
+    The existence of this variable states that the module's name will not
+    appear in the module list (although the module will still be used).
+ at sp 1
+
+ at item report
+ at vindex report
+    The contents of this variable are displayed at the end of the pregap4
+    run by the shutdown.p4m module.
+ at end table
+
+ at node Pregap4-WritingMods-Global Variables
+ at section Global Variables
+ at cindex Global variables
+ at cindex Variables, global to all modules
+
+Several global variables exist which may need to be updated within the
+modules. Pregap4 relies on their contents, so modules must update them
+whenever applicable.
+
+ at table @var
+ at item file_type
+ at vindex file_type
+This is a Tcl array indexed by file name. Each element is initialised by the
+General Configuration module to one of @code{ABI}, @code{ALF}, @code{EXP},
+ at code{PLN}, @code{SCF} or @code{UNK}.
+ at sp 1
+
+ at item file_error
+ at vindex file_error
+This is a global array indexed by the current file name. If a file has been
+rejected by a module (ie not returned from the @code{run} function) then
+the appropriate array element must be filled with a reason. Typically the
+format for this reason will start with the module name followed by a colon.
+For example "makeSCF: unknown file type".
+ at sp 1
+
+ at item file_id
+ at vindex file_id
+This is a global array, indexed by filenames, containing the sequence
+identifiers (which are often different to the sequence filenames). It is
+initialised by the General Configuration module.
+ at sp 1
+
+ at item file_orig_name
+This is a global array holding any original filename for each currently
+processed file. It is initialised by the General Configuration module such
+that each file points to its own filename.
+ at end table
+
+When creating and returning a new file (such as when switching from SCF files
+to Experiment Files in the Initialise Experiment Files module) it is required
+that the arrays are all updated correctly. This involves creating new array
+elements for each of the above four arrays. The @var{file_type} array element,
+indexed by a new name should contain the new file type (eg @code{set
+file_type(seq10.exp) EXP}). The @var{file_error} array element should be set
+to a blank string. The @var{file_id} should inherit the sequence identifier
+from the original file (eg @code{set file_id(seq10.exp) $file_id(seq10.scf)}).
+The @var{file_orig_name} array element should point to the old filename (not
+the original filename pointed to by the old filename). In this way
+ at var{file_orig_name} could be considered as a list of the intermediate files
+generated for each final sequence file.
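+
+The bookkeeping for a newly created file could be gathered into one small
+helper, as in the sketch below. The procedure name
+register_new_file is made up for this example; it is not a pregap4 builtin
+function, and the sketch assumes the four arrays are reachable with the Tcl
+global command.
+
+ at example
+# Hypothetical helper: record a new file derived from an existing one.
+proc register_new_file @{old new type@} @{
+    global file_type file_error file_id file_orig_name
+
+    set file_type($new)      $type
+    set file_error($new)     ""
+    set file_id($new)        $file_id($old)
+    set file_orig_name($new) $old
+@}
+
+# For example, after converting seq10.scf into seq10.exp:
+register_new_file seq10.scf seq10.exp EXP
+ at end example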
+
+ at node Pregap4-WritingMods-Builtin
+ at section Builtin Functions
+ at cindex Functions, builtin
+
+Apologies, but this section of documentation is still unfinished.
+
+The full definition of these functions may be found in the Tcl code for
+Pregap4 itself. It is recommended that you use the Unix @code{grep} utility to
+find the definitions and example uses.
+
+ at node Pregap4-WritingMods-Example
+ at section An Example Module
+ at cindex Example code
+ at cindex Module, example code
+
+The best examples are the existing modules. Try looking at the Compress Trace
+Files module as an example. This may be found in
+ at file{$STADENROOT/lib/pregap4/modules/compress_trace.p4m}.
+
+
diff --git a/manual/pregap4.texi b/manual/pregap4.texi
new file mode 100644
index 0000000..e0bb37d
--- /dev/null
+++ b/manual/pregap4.texi
@@ -0,0 +1,56 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+
+ at c %**start of header
+ at setfilename pregap4.info
+ at settitle Pregap4
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+ at set standalone
+include(header.m4)
+
+ at finalout
+
+ at titlepage
+ at title Pregap4
+ at subtitle
+ at author
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-pregap4
+ at end ifinfo
+
+_include(pregap4-t.texi)
+
+_split()
+ at node General Index
+ at unnumbered General Index
+ at printindex cp
+
+_split()
+ at node File Index
+ at unnumbered File Index
+ at printindex pg
+
+_split()
+ at node Variable Index
+ at unnumbered Variable Index
+ at printindex vr
+
+_split()
+ at node Function Index
+ at unnumbered Function Index
+ at printindex fn
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/pregap4_compact.png b/manual/pregap4_compact.png
new file mode 100644
index 0000000..7802929
Binary files /dev/null and b/manual/pregap4_compact.png differ
diff --git a/manual/pregap4_component.png b/manual/pregap4_component.png
new file mode 100644
index 0000000..56562dd
Binary files /dev/null and b/manual/pregap4_component.png differ
diff --git a/manual/pregap4_config.png b/manual/pregap4_config.png
new file mode 100644
index 0000000..aab2dfc
Binary files /dev/null and b/manual/pregap4_config.png differ
diff --git a/manual/pregap4_edit_exp.png b/manual/pregap4_edit_exp.png
new file mode 100644
index 0000000..ac3fdfe
Binary files /dev/null and b/manual/pregap4_edit_exp.png differ
diff --git a/manual/pregap4_files.png b/manual/pregap4_files.png
new file mode 100644
index 0000000..8074d1c
Binary files /dev/null and b/manual/pregap4_files.png differ
diff --git a/manual/pregap4_mini-t.texi b/manual/pregap4_mini-t.texi
new file mode 100644
index 0000000..7a739a4
--- /dev/null
+++ b/manual/pregap4_mini-t.texi
@@ -0,0 +1,423 @@
+ at c This is a Texinfo file describing Pregap4.
+ at c By itself it is human readable as a text document, although this document
+ at c alone does not contain everything needed to process with texi2dvi.
+
+_split()
+ at node Pregap4-Introduction
+ at chapter Introduction
+ at cindex Introduction
+ at cindex Pregap4
+
+ at menu
+* Pregap4-Intro-Files::         Summary of the Files used and the Processing Steps
+* Pregap4-Intro-Interface::         Introduction to the Pregap4 User Interface
+* Pregap4-Intro-Interface-Files:: Introduction to the Files to Process Window
+* Pregap4-Intro-Interface-Configure:: Introduction to the Configure Modules Window
+* Pregap4-Intro-Interface-Output:: Introduction to the Textual Output Window
+* Pregap4-Intro-Interface-Running:: Introduction to Running Pregap4
+ at end menu
+
+
+Before entry into a gap4 database the raw data from sequencing instruments
+needs to be passed through several processes, such as screening for vectors,
+quality evaluation, and conversion of data formats. 
+Pregap4 is used to pass a batch of
+readings through these steps in an automatic way. It provides an
+interface for setting up and configuring the processing and for
+controlling the passage of the readings through each stage.
+The separate tasks are termed "modules" and each module is typically
+managed by a dedicated program. Pregap4 wraps all of these
+modules into a single easy to use environment, whilst maintaining the
+flexibility to select and extend the processing modules.
+It is an as yet
+unpublished replacement for the program pregap
+ at cite{Bonfield, J.K. and Staden, R. Experiment files and their application
+during large-scale sequencing projects. DNA Sequence 6, 109-117 (1996)}.
+
+
+_split()
+ at node Pregap4-Intro-Files
+ at section Summary of the Files used and the Processing Steps
+
+Gap4 stores the data for an assembly project in a gap4
+database. Before being entered into the gap4 database the data must be
+passed through several steps via pregap4. The range
+of tasks that can be performed using pregap4 is shown schematically in
+the following figure.
+
+ at page
+_picture(pregap4_overview,4.45833in)
+ at page
+
+
+The package can handle data produced by a variety of sequencing
+instruments, and also data entered using digitisers or that has been typed in by
+hand. One of the first steps is to convert trace files, such as those of
+ABI, which are in proprietary format, to SCF files 
+(_fpref(Formats-Scf, SCF introduction, scf)).  
+
+Next, as originally put forward in @cite{Bonfield,J.K. and Staden,R. The application of
+numerical estimates of base calling accuracy to DNA sequencing
+projects. Nucleic Acids Research 23, 1406-1410 (1995)} 
+(_fpref(Intro-Base-Acc, The use of numerical estimates of base
+calling accuracy, t)), if they are not already included in the files,
+base call confidence values are calculated, and are 
+normally stored in the reading's SCF file.
+
+Next the base calls are copied from the trace files 
+to text files known as Experiment files
+(_fpref(Formats-Exp, Experiment files, exp)). 
+
+ at cindex FASTA files: pregap4
+ at cindex pregap4: FASTA files
+
+Note it is also possible to enter sequence readings in the form of FASTA
+files for use at this stage of the processing, in which case they will be
+automatically converted to Experiment file format.
+
+All the subsequent processes operate on the Experiment files.
+
+Experiment file format is similar to that of EMBL sequence entries in
+that each record starts with a two letter identifier, but we have
+invented new records specific to sequencing experiments. 
+Gap4 can make use of information about readings which may not be
+contained within the raw data files, such as sequencing chemistry and whether
+it is a forward or reverse reading. Gap4 will work without this information,
+but at a reduced level. For instance, knowing which forward and reverse
+readings belong together allows gap4 to check the validity of the assembly
+and to order contigs automatically.
+
+One of
+pregap4's next tasks is to augment the Experiment files to include data about
+the chemistry, vectors, primers and templates used in the production of each
+reading, and if necessary it can extract this information from external
+databases (_fpref(Pregap4-Database, Information Sources, pregap4)), or via
+local reading name conventions
+(_fpref(Pregap4-Naming, Pregap4 Naming Schemes, t)).
+Once the Experiment file for a reading contains all the necessary
+information the remaining processing programs can be used in turn to
+analyse the data. 
+
+First the reading is marked at both ends to define the range of
+reasonable quality base calls
+(_fpref(Pregap4-Modules-Quality Clip, Quality Clip, t)).
+
+Then the reading is searched for the
+presence of sequencing vector at the 5' and 3' ends
+(_fpref(Pregap4-Modules-Sequence Vector, Sequencing Vector Clip, t)).
+
+Next the sequence is checked for the presence of "cloning" vector,
+i.e. non-sequencing vectors, such as those of BACs
+(_fpref(Pregap4-Modules-Cloning Vector, Cloning Vector Clip, t)).
+
+The final check of this type is to screen the reading for any vector
+that may have been missed in the previous searches
+(_fpref(Pregap4-Modules-Screen Vector, Screen for Unclipped Vector, t)).
+
+The next check is to screen the reading for any set of
+sequences which it may be contaminated by, such as E. coli
+(_fpref(Pregap4-Modules-Screen, Screen Sequences, t)).
+
+Note that vector sequence files are normally stored in the package
+vectors directory/folder. If a file of vector file names is used the
+vector sequences can also be stored in its directory/folder. Files
+of file names and vector-primer files can also contain environment
+variables to define the location of vector files.
+
+Vector_primer files, vector sequence files and files of file names
+must be stored in plain text files
+(_fpref(Formats-Vector_Primer, Vector_primer Files, vector_primer)),
+(_fpref(Formats-Vector-Sequences, Vector sequence format, vector files)).
+
+Pregap4 is usually used non-interactively once the modules have been
+configured, but some groups prefer (or have the time) to check the data
+by eye using the program trev (_fpref(Trev,Trev,trev)) at this stage.
+
+Another option is to search the readings for families of known repeats
+(_fpref(Pregap4-Modules-Repeats, Tag Repeats, t)). This will tag any
+regions which are found to match known repeats.
+
+Some groups are using the package for mutation studies and 
+the final pregap4 option, prior to assembly, is to use the mutation scanner program
+(_fpref(Mutation-Detection-Introduction, Introduction to mutation
+detection, mutations)) to search the readings for mutations
+(_fpref(Pregap4-Modules-Mutation Scanner, Mutation Scanner, t)).
+
+Pregap4 can also be used to assemble the readings into a gap4 database
+(_fpref(Pregap4-Modules-Gap4 Assembly, Gap4 Shotgun Assembly, t)), or
+to assemble the readings using an external assembly engine such as FAKII
+(_fpref(Pregap4-Modules-FakII Assembly, FakII Assembly, t)),
+and then to enter that assembly into a gap4 database
+(_fpref(Pregap4-Modules-Enter Assembly, Enter Assembly into Gap4, t)).
+
+_ifdef([[_unix]],[[The following figure shows an overview of the range of
+tasks that can be
+performed by pregap4, plus the names of the programs which can be used.
+The program names marked with an asterisk (*) are not included in the
+Staden Package and must be obtained from elsewhere.
+
+ at page
+_picture(pregap4_overview2,4.75833in)
+ at page]])
+
+ at cindex pregap4rc files
+ at cindex configuration files: pregap4
+
+It is unlikely that any particular user will want to employ all of these
+options and one of pregap4's modes of use is to enable users to
+configure the program for their work
+(_fpref(Pregap4-Modules, Configuring Modules, pregap4)).
+Not only can they select which
+tasks should be performed, and which of the alternative programs
+("modules") should be used for them, but also the order in which they are
+applied. Although it is very rarely a problem, this high level of
+flexibility comes at a price in the current version of pregap4: pregap4
+does not include code to check that the configuration
+set by a user is sensible, and will attempt to execute the modules in the order
+given. There are some users who, having read this section, will
+configure pregap4 to perform assembly before creating the Experiment
+files from the trace files. Pregap4 will attempt to do this and 
+no data will be assembled as the files given to the assembly engine
+will be in the wrong format. This is just something to be aware of.
+
+Pregap4 uses configuration files to remember the setup for each user or
+project. These files define which modules are activated and what their
+parameter settings are 
+(_fpref(Pregap4-Config-Files, Using Config Files, pregap4)). These files,
+which can obviously save considerable amounts of time, are created
+automatically and can be saved from the Configure Modules Window once
+the configuration is complete.
+
+
+The trace files are not altered, but are kept as archival data so that
+it is always possible to check the original base calls and traces. The
+trace files are used by gap4 to display traces and to compare the final
+consensus sequence with the original data, therefore they must be kept
+online for the lifetime of the project. To save disk space it is best to use
+SCF files and, if they were derived from a proprietary format such as
+that of ABI, to remove the originals. 
+
+Any changes to the data prior to assembly
+(and we recommend that none are made until readings
+can be viewed aligned with others) are made to the copy of the sequence
+in the Experiment file. For example the results of all the searching
+procedures outlined above are added as new records to each reading's
+Experiment file.
+The reading data, in Experiment file format, is entered into the project
+database
+(_fpref(GapDB, Gap Database Files, gap4)),
+usually via one of the assembly engines. All the changes to
+the data made by gap4 are made to the copies of the data in the project
+database.  Once the data has been copied into the gap4 database the
+Experiment files are no longer required.
+
+ at cindex temporary files
+ at cindex pregap4: temporary files
+
+During processing pregap4 uses temporary files. The number and nature of
+these files depends on the modules used. At the very least pregap4 will
+produce files containing the names of the input files and the result of
+their processing. Those that were processed successfully will be stored
+in a file with a name ending ".passed" and those that failed in one
+ending ".failed". The ".passed" file can be used as a file of input file
+names for assembly into gap4 (assuming that a pregap4 
+assembly module has not already been used). 
+
+While it is running, pregap4 will 
+create files with a file name prefix
+defined by the user, and store them
+in an output directory of the user's choice
+(_fpref(Pregap4-Files, Specifying Files to Process, pregap4)).
+
+When processing has 
+finished pregap4 will produce a report containing information
+from each module and the final list of passed and failed sequences.
+
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Intro-Interface
+ at section Introduction to the Pregap4 User Interface
+
+Pregap4 provides interfaces to define the batch of data files to be
+processed and which modules are to be applied to them, to configure the
+modules, and to start the processing. It also provides mechanisms for
+adding and removing modules, but this facility
+will be used far less often than the others.
+
+Pregap4 supports two styles of windowing. The default method is a compact
+mode, with the alternative being "separate" mode - similar to gap4 and
+spin.
+
+
+_picture(pregap4_separate,5.54167in)
+
+This is the "separate" window style. Here the main window is always visible,
+with commands in the main window bringing up new windows. In the picture above 
+the configure window can be seen on top of the main window.
+
+The second style is "compact" mode.
+
+_picture(pregap4_compact,6in)
+
+In the compact picture above the most common top level windows are "pages" in
+a tabbed notebook. The benefit is that far less screen space is used and the
+controls are quicker to reach, but the text output window is no longer permanently visible.
+The Window Style can be changed using the options menu
+(_fpref(Pregap4-Config-Window Styles, Window Styles, pregap4)).
+
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Intro-Interface-Files
+ at subsection Introduction to the Files to Process Window
+
+Pregap4 operates on batches of files. These files can be
+binary trace files (in ABI, ALF or SCF format), Experiment Files, or plain
+text, and do not need to all be in the same format. The Files to Process
+Window is used to define which files are to be processed.
+The "Files to Process" dialogue (see below) 
+can be brought up from the File menu, or
+by pressing the appropriate tab when in @code{compact_win} mode.
+
+_picture(pregap4_files,6in)
+
+
+On the left hand side of the figure
+is the current list of files to process. This list
+can be edited simply by clicking with the mouse and typing.
+
+On the right side of the panel is the pregap4 output filename prefix, the
+output directory name, and several buttons. The filename prefix is used when
+pregap4 needs to create files.
+For example after processing there may be @i{prefix}.passed,
+ at i{prefix}.failed files. All files will be created within the output directory.
+
+The buttons allow selection of the files to process. The "Add files" button
+will bring up a file browser, which will allow one or more files to be
+selected. Pressing Ok on the file browser will then add the selected files to
+the "List of files to process" panel on the left side of the pregap4 window.
+
+The "Add file of filenames" button may be used to select a list of files whose
+filenames have been written to a `file of filenames'. 
+
+The "Clear current list" button will
+remove all filenames from the list. 
+
+Both the "Add files" and "Add file of
+filenames" buttons append their selections to the list of files to process, so
+to replace the current list the "Clear current list" button must first be
+used. 
+
+The "Save current list to..." button may be used to produce a
+new file of filenames, containing the combined list of files to process.
+
+_ifdef([[_unix]],[[It is also 
+possible to specify the files to process on the command line.
+Three examples:
+
+ at example
+pregap4 -fofn files
+pregap4 xb54a3.s1SCF xb54b12.r1LSCF xb54b12.r1SCF
+pregap4 *SCF
+ at end example
+]])
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Intro-Interface-Configure
+ at subsection Introduction to the Configure Modules Window
+
+The "Configure Modules" dialogue is available from the Modules menu or, when
+using the compact window style, by pressing the Configure Modules tab.
+
+As can be seen in the figure below, 
+the left side of the display contains a list of the
+currently loaded modules. One module in this list will be highlighted. 
+The right side of the display shows the configuration panel for this
+highlighted module and is module specific.
+
+_picture(pregap4_config,6in)
+
+The module list shown on the left consists of a series of module names and
+their status.
+The tick or cross at the left of the name indicates whether this
+module is enabled, and is termed the "enable status". 
+The text to the right of the module name indicates whether the
+module has been given all the parameters needed for it to run. This will
+be one of "ok" (all configuration options have been filled in), "-" (no
+configuration options exist for this module), "edit" (further configuration is
+required) or blank (this module is disabled).
+
+The "enable status" can be toggled by left clicking on the tick/cross to the
+left of the module name. The enable status can be written to the current
+Pregap4 configuration file using the "Save Module List" or "Save All
+Parameters" commands in the Modules menu. Left clicking anywhere on a module
+name in the module list will switch the pane on the right side of the window
+to display any available parameters for this module. Not all modules will have
+parameters to configure.
+
+For modules that do have parameters, the top line of the configuration panel
+will contain two buttons labelled "Select params to save" and "Save these
+parameters". The "Select params to save" button will add check boxes next to
+each parameter. Clicking on these check boxes allows selection of individual
+parameters to save for this module. Once these have been selected, pressing the
+"Save" button will save only those selected to the pregap4 configuration
+file. Pressing the "Save these parameters" button will save all parameters for
+this module to the configuration file. 
+
+The bottom strip of the window is an "Information Line".
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Intro-Interface-Output
+ at subsection Introduction to the Textual Output Window
+
+Pregap4 has a main text output window identical to that of gap4 and spin.
+It is used for showing textual results in the top section and
+error messages in the lower part. Full details of the user interface are
+given elsewhere 
+(_fpref(UI-Introduction, User Interface, t)), but an example of the Text
+Output Window is given below.
+
+_picture(pregap4_textwin,6in)
+
+ at c --------------------------------------------------------------------------
+_split()
+ at node Pregap4-Intro-Interface-Running
+ at subsection Introduction to Running Pregap4
+
+When pregap4 is started the user first needs to
+select the files to process. This is done using the "Files to Process"
+command (from the File menu).
+_ifdef([[_unix]],[[Alternatively the files can be specified on the command
+line at the time of starting up Pregap4.]])
+The "Configure Modules" tab allows for the currently available modules to
+be enabled or disabled, and the module parameters edited accordingly.
+
+Once all modules have been configured (so that none have @code{edit} listed
+next to their name) pregap4 is ready to begin processing. This is
+started by pressing
+"Run" or by selecting "Run" from the File menu. 
+
+When pregap4 has a setup that would be useful in the future
+"Save All Parameters (in all modules)" from the Modules menu can be
+used, and pregap4 will store all the module parameters to a
+configuration file ready for subsequent runs.
+
+_ifdef([[_unix]],[[To run pregap4 in a non-interactive mode use "@code{pregap4
+-nowin}". This will not bring up a graphical interface and will attempt to
+"Run" automatically. Hence it is necessary to also specify the files to
+process on the command line and also to have previously configured pregap4.]])
+
+When processing has 
+finished pregap4 will produce a report containing information
+from each module and the final list of passed and failed sequences.
+
+If for any reason pregap4 fails a particular step in the processing, users
+are strongly recommended to correct whatever has caused the module to fail,
+clean up any files it has created, and then repeat the whole process. That
+is, until users have a good understanding of what happens at each stage of
+processing, it is better to repeat all the steps with the original list of
+files, than to try to guess which step to continue from.
+
diff --git a/manual/pregap4_org-t.texi b/manual/pregap4_org-t.texi
new file mode 100644
index 0000000..8eb5728
--- /dev/null
+++ b/manual/pregap4_org-t.texi
@@ -0,0 +1,76 @@
+ at node Pregap4-Intro-Manual
+ at chapter Organisation of the Pregap4 Manual
+
+Pregap4 is a relatively simple program to use. It is also very flexible
+and extendable, and so much of the manual is taken up by explaining to
+programmers and system managers how it can be configured. The average
+user need not be concerned with these details.
+
+The Introductory section of the manual is meant to give an overview of
+the program: what it is for, the files it uses and functions it
+performs, and how to use it.
+It is very important for all users to have a basic
+understanding of the files used by pregap4 and the processes through
+which it can pass their data
+(_fpref(Pregap4-Intro-Files, Summary of the Files used and the
+Processing Steps, t)).
+The next section of the Introduction
+(_fpref(Pregap4-Intro-Menus, Pregap4 Menus, t))
+tabulates the program's menus. This is followed by an overview of the
+pregap4 user interface
+(_fpref(Pregap4-Intro-Interface, Introduction to the Pregap4 User
+Interface, t)) which should give a clear idea of how to actually use the
+program, and concludes the introductory section.
+
+More detail about how to define the set of files to be
+processed
+(_fpref(Pregap4-Files, Specifying Files to Process, pregap4)) is
+followed by a section showing 
+how to run pregap4 and giving examples of its use
+(_fpref(Pregap4-Running, Running Pregap4, pregap4)).
+_ifdef([[_unix]],[[This is followed by notes on non-interactive processing
+(_fpref(Pregap4-Batch, Non Interactive Processing, t)) and
+details of the command line arguments that can be used
+(_fpref(Pregap4-CLI, Command line arguments, t)).
+]])Next are sections
+on configuring the pregap4 user interface
+(_fpref(Pregap4-Config, Configuring the Pregap4 User Interface, t)).
+
+The next part of the manual
+describes how to use the Configure Modules Window to
+select the modules to apply and to set their parameters
+(_fpref(Pregap4-Modules, Configuring Modules, pregap4)).
+This is one of the longest and most detailed parts of the manual in that
+it describes how to configure all the current possible modules, many of
+which will not be available at all sites, and several of which perform
+identical functions. Obviously, only the entries which describe 
+the functions that are available at a site are of interest.
+
+One of the important tasks of pregap4 is to make sure that each
+reading's Experiment file contains all the information needed by gap4 to
+ensure the accuracy of the final consensus sequence and to make the
+project proceed as efficiently as possible. Pregap4 provides several methods
+for sourcing this information. One of these, as for example employed at
+the Sanger Centre in the UK, is to encode some information about a
+reading in its reading name. Pregap4 contains flexible mechanisms to
+enable a variety of the "Naming schemes" or "Naming conventions" to be
+used as a source of information to augment the Experiment files
+(_fpref(Pregap4-Naming, Pregap4 Naming Schemes, t)).
+Alternatively pregap4 can use simple text databases as an information source
+(_fpref(Pregap4-Database, Information Sources, pregap4)), or the user
+can set up some Experiment file record types for use with a batch of readings
+(_fpref(Exp-Records, Experiment file format record types, exp)).
+
+The rest of the manual deals with increasingly complicated matters, and
+the average user should never need to consult these sections. First
+there is a section on adding and removing modules
+(_fpref(Pregap4-ModAdd, Adding and Removing Modules, t)). This describes
+how to control the list of modules which appear in the Configure Modules
+Window. The package is usually shipped with this list set to contain
+more modules than are likely to be available at any one site and so it
+might be found useful to remove those that are not available.
+
+The next two sections, as their names imply, are for programmers only
+(_fpref(Pregap4-ManualConfig, Low Level Pregap4 Configuration, t))
+and
+(_fpref(Pregap4-WritingMods, Writing New Modules,t)).
diff --git a/manual/pregap4_overview.png b/manual/pregap4_overview.png
new file mode 100644
index 0000000..f6de5c6
Binary files /dev/null and b/manual/pregap4_overview.png differ
diff --git a/manual/pregap4_overview2.png b/manual/pregap4_overview2.png
new file mode 100644
index 0000000..4ed8d35
Binary files /dev/null and b/manual/pregap4_overview2.png differ
diff --git a/manual/pregap4_select.png b/manual/pregap4_select.png
new file mode 100644
index 0000000..095efd7
Binary files /dev/null and b/manual/pregap4_select.png differ
diff --git a/manual/pregap4_separate.png b/manual/pregap4_separate.png
new file mode 100644
index 0000000..816c285
Binary files /dev/null and b/manual/pregap4_separate.png differ
diff --git a/manual/pregap4_simpledb.png b/manual/pregap4_simpledb.png
new file mode 100644
index 0000000..7fd9184
Binary files /dev/null and b/manual/pregap4_simpledb.png differ
diff --git a/manual/pregap4_textwin.png b/manual/pregap4_textwin.png
new file mode 100644
index 0000000..521a301
Binary files /dev/null and b/manual/pregap4_textwin.png differ
diff --git a/manual/primer_pos_plot.png b/manual/primer_pos_plot.png
new file mode 100644
index 0000000..eb87bbf
Binary files /dev/null and b/manual/primer_pos_plot.png differ
diff --git a/manual/primer_pos_plot.small.png b/manual/primer_pos_plot.small.png
new file mode 100644
index 0000000..344d5d8
Binary files /dev/null and b/manual/primer_pos_plot.small.png differ
diff --git a/manual/primer_pos_seq_display.png b/manual/primer_pos_seq_display.png
new file mode 100644
index 0000000..fb83bcc
Binary files /dev/null and b/manual/primer_pos_seq_display.png differ
diff --git a/manual/primer_pos_seq_display.small.png b/manual/primer_pos_seq_display.small.png
new file mode 100644
index 0000000..8004f2f
Binary files /dev/null and b/manual/primer_pos_seq_display.small.png differ
diff --git a/manual/primer_pos_text.png b/manual/primer_pos_text.png
new file mode 100644
index 0000000..5c28a89
Binary files /dev/null and b/manual/primer_pos_text.png differ
diff --git a/manual/qclip.1.texi b/manual/qclip.1.texi
new file mode 100644
index 0000000..1f1ab21
--- /dev/null
+++ b/manual/qclip.1.texi
@@ -0,0 +1,134 @@
+ at cindex Qclip: man page
+ at unnumberedsec NAME
+
+qclip --- an Experiment File sequence clipper
+
+ at unnumberedsec SYNOPSIS
+
+Usage when confidence values are available (default mode):
+
+ at code{qclip} [@code{-c}] [@code{-vt}] [@code{-m} @i{minimum_extent}]
+[@code{-M} @i{maximum_extent}] [@code{-w} @i{window_length}]@br
+[@code{-q} @i{average_quality}]
+
+Usage when confidence values are not available or are to be ignored:
+
+ at code{qclip} [@code{-c}] [@code{-vt}] [@code{-m} @i{minimum_extent}]
+[@code{-M} @i{maximum_extent}] [@code{-s} @i{start_offset}]
+[@code{-R} @i{r_length}]@br
+[@code{-r} @i{r_unknown}]
+[@code{-L} @i{l_length}] [@code{-l} @i{l_unknown}]
+
+ at unnumberedsec DESCRIPTION
+
+ at code{Qclip} is a simple program to decide how much of the 5' and 3' ends of a
+sequence, stored as an Experiment File, should be clipped off,
+i.e. marked to be ignored during assembly.
+
+The decision is made either by analysing the average confidence levels
+stored in the Experiment file (or an associated trace file), or by
+counting the numbers of unknown bases (e.g. @code{-} or @code{N}) found within
+windows slid left to right along the sequence.
+
+Large numbers of files can be processed in a single run and each file
+argument is assumed to be a valid Experiment File. The sequence
+is read from the Experiment File @code{SQ} record and the trace is read
+using the @code{LN} and @code{LT} identifiers; clipping is performed
+and @code{QL} and @code{QR} identifiers are appended to the file.
+
+For the default mode of clipping by confidence levels, the program first
+finds the region of highest average quality. A window is then slid from this
+point both rightwards and leftwards until the average quality over that
+ at i{window length} (specified with the @code{-w} argument) drops below the 
+ at i{average_quality} argument. The exact position of the clip point within that
+window is determined by successively decreasing the window length.
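The confidence-based procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not qclip's actual code: the function name `clip_by_confidence`, the use of 0-based half-open coordinates, and the exact way the window is shrunk to locate the clip point are all choices made for this sketch. The defaults mirror the documented `-w 30 -q 10`. No attempt is made to reproduce how qclip numbers the `QL` and `QR` values.

```python
def clip_by_confidence(conf, window=30, min_avg=10):
    """Return (left, right), a 0-based half-open range of bases to keep.

    conf is a list of per-base confidence values. Illustrative sketch only;
    the shrinking-window refinement is an assumption, not qclip's code.
    """
    n = len(conf)
    w = min(window, n)
    if w == 0:
        return (0, 0)

    # Prefix sums let any window average be computed in constant time.
    prefix = [0]
    for c in conf:
        prefix.append(prefix[-1] + c)

    def avg(start, length):
        return (prefix[start + length] - prefix[start]) / length

    # Step 1: find the window with the highest average quality.
    best = max(range(n - w + 1), key=lambda s: avg(s, w))
    if avg(best, w) < min_avg:
        return (0, 0)  # no window reaches the threshold

    # Step 2: slide rightwards until the average drops below min_avg,
    # then keep shrinking the window to narrow down the clip point.
    right = best
    for length in range(w, 0, -1):
        while right + length <= n and avg(right, length) >= min_avg:
            right += 1

    # Step 3: the same procedure, sliding leftwards from the best window.
    left = best + w
    for length in range(w, 0, -1):
        while left - length >= 0 and avg(left - length, length) >= min_avg:
            left -= 1

    if left >= right:
        return (0, 0)  # the two scans crossed: no usable region
    return (left, right)
```

For a reading with 10 poor bases, 50 good bases and 10 poor bases, the sketch keeps only the good central stretch.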
+
+When confidence values are not available, or when the @code{-n} argument is
+used, only the sequence base calls are analysed. In this
+case the right clip position is calculated by sliding a window of
+length @code{r_length} rightwards along the sequence, starting from base
+ at code{start_offset}, and stopping when a window containing at least
+ at code{r_unknown} unknown bases is found. 
+The left clip position is calculated by
+sliding a window leftwards from base @code{start_offset}. The
+algorithm is identical to that used for the right clip position except that the
+ at code{l_unknown} and @code{l_length} parameters are used.
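The unknown-base mode can be sketched in the same way. Again this is a hedged illustration rather than qclip's implementation: the function name `clip_by_unknowns`, the 0-based half-open coordinates, and the decision to place the clip at the near edge of the first failing window are assumptions. The parameter names follow the synopsis; no default values are assumed because the text does not give any for this mode.

```python
UNKNOWN_BASES = set("-Nn")  # assumption: which characters count as unknown


def clip_by_unknowns(seq, start_offset, r_length, r_unknown,
                     l_length, l_unknown):
    """Return (left, right), a 0-based half-open range of bases to keep.

    Illustrative sketch only; clip placement within the failing window
    is an assumption.
    """
    n = len(seq)
    unknown = [1 if base in UNKNOWN_BASES else 0 for base in seq]

    # Right clip: slide a window of r_length rightwards from start_offset
    # and stop at the first window holding at least r_unknown unknowns.
    right = n
    start = start_offset
    while r_length > 0 and start + r_length <= n:
        if sum(unknown[start:start + r_length]) >= r_unknown:
            right = start
            break
        start += 1

    # Left clip: the same idea, sliding leftwards from start_offset.
    # A zero l_length skips this calculation, as the -L option describes.
    left = 0
    end = start_offset
    while l_length > 0 and end - l_length >= 0:
        if sum(unknown[end - l_length:end]) >= l_unknown:
            left = end
            break
        end -= 1

    return (left, right)
```

With `seq = "NNANN" + "ACGT" * 4 + "NNANN"`, a start offset of 12 and windows of length 5 that tolerate fewer than 2 unknown bases, the sketch returns `(8, 18)`.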
+
+The default arguments are
+"@code{-c -m 0 -M 9999 -w 30 -q 10}".
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-v}
+     Enable verbose output. This outputs information on which files are
+     currently being clipped.
+
+ at item @code{-t}
+     Test mode. The QL and QR information is written to stdout instead of
+     being appended to the Experiment file.
+
+ at item @code{-c}
+     Clip by confidence levels. This is the default mode of operation.
+
+ at item @code{-n}
+     Clip by unknown base calls, even when confidence values are available.
+
+ at item @code{-m} @i{extent}
+     If the clip algorithm returns a @code{QL} clip value of less than
+     @i{extent}, use @i{extent} as the @code{QL}
+     value.
+
+ at item @code{-M} @i{extent}
+     If the clip algorithm returns a @code{QR} clip value of more than
+     @i{extent}, use @i{extent} as the @code{QR}
+     value.
+
+ at item @code{-w}
+     Only used for the confidence level clipping mode.
+     The window length over which to compute the average confidence value.
+
+ at item @code{-q}
+     Only used for the confidence level clipping mode.
+     The minimum average confidence in any given window for this window to
+     be considered as good quality sequence.
+        
+ at item @code{-s} @i{offset}
+     Only used for the unknown base clipping mode.
+     Force the first window to start the calculations from position @i{offset}
+     in the sequence. This can be useful to avoid poor data at the 5'
+     end of a sequence.
+
+ at item @code{-R} @i{length}
+     Only used for the unknown base clipping mode.
+     Set the length for the first rightwards window to @i{length}.
+
+ at item @code{-r} @i{unknown}
+     Only used for the unknown base clipping mode.
+     Stop sliding the first rightwards window when there are at least
+     @i{unknown} unknown bases within the current window.
+
+ at item @code{-L} @i{length}
+     Only used for the unknown base clipping mode.
+     Set the length for the second rightwards window to @i{length}.  Setting
+     this value to zero prevents the second window calculations from being
+     performed.
+
+ at item @code{-l} @i{unknown}
+     Only used for the unknown base clipping mode.
+     Stop sliding the second rightwards window when there are at least
+     @i{unknown} unknown bases within the current window.
+ at end table
+
+ at unnumberedsec EXAMPLE
+
+To clip a batch of sequences listed in the @file{fofn} file with a minimum
+left clip value of 20 bases use:
+
+ at example
+qclip -m 20 `cat fofn`
+ at end example
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp, ExperimentFile(4), formats)
diff --git a/manual/quality_clip.png b/manual/quality_clip.png
new file mode 100644
index 0000000..d4243c2
Binary files /dev/null and b/manual/quality_clip.png differ
diff --git a/manual/quality_clip_ends.png b/manual/quality_clip_ends.png
new file mode 100644
index 0000000..31f8d41
Binary files /dev/null and b/manual/quality_clip_ends.png differ
diff --git a/manual/quality_plot-t.texi b/manual/quality_plot-t.texi
new file mode 100644
index 0000000..9ea6adc
--- /dev/null
+++ b/manual/quality_plot-t.texi
@@ -0,0 +1,114 @@
+ at cindex Quality plot
+
+This option can be invoked from the main gap4 View menu, in which case
+it appears as a single plot, or from the View menu of the Template
+Display, in which case it will appear as part of the Template Display.
+
+For each base in the consensus a "quality" code
+is computed based on the accuracy of
+the data on each strand and whether or not the two strands agree.
+This "quality" is then plotted using colour and
+height to distinguish the quality codes shown below. In a future
+release the plot will be renamed the "Strand Comparison Plot".
+
+ at example
+ at group
+Colour  Height          Meaning
+
+grey    0 to 0          OK on both strands, both agree
+blue    0 to 1          OK on plus strand only
+green  -1 to 0          OK on minus strand only
+red    -1 to 1          Bad on both strands
+black  -2 to 2          OK on both strands but they disagree
+ at end group
+ at end example
+
+_lpicture(template.quality,6in)
+
+For example, in the figure we see that the first four hundred or so
+bases are mostly only well determined on the forward strand.
+
+ at node Quality-Examining
+ at subsection Examining the Quality Plot
+ at cindex Quality plot: examining the plot
+
+Note that when displaying many bases the screen resolution means that the
+quality codes for many bases will appear in the same screen pixel.  However,
+the use of varying heights ensures that all problematic regions will be
+visible, even when the problem is only with a single base position. Hence when
+the quality plot consists of a single grey line all known quality problems
+have been resolved at the current consensus and quality cutoffs.
+
+The quality plot appears as "Calculate quality" in the Results Manager window
+(_fpref(Results, Results Manager, results)).
+
+Within the Results Manager, the commands available from the right
+mouse button include "Information", which writes a summary of the
+distribution of quality types to the output window, and "List",
+which writes the actual quality values for each base to the output
+window. These quality values are written in textual form, using
+a single letter per base. The meaning of each letter is listed
+below.
+
+ at table @var
+ at item
+ at r{+Strand -Strand}
+ at item a
+ at r{Good    Good} (in agreement)
+ at item b
+ at r{Good    Bad}
+ at item c
+ at r{Bad     Good}
+ at item d
+ at r{Good    None}
+ at item e
+ at r{None    Good}
+ at item f
+ at r{Bad     Bad}
+ at item g
+ at r{Bad     None}
+ at item h
+ at r{None    Bad}
+ at item i
+ at r{Good    Good} (disagree)
+ at item j
+ at r{None    None}
+ at end table
+
+An example of the output using "Information" and "List" follows.
+
+ at example
+============================================================
+Wed 02 Apr 12:14:06 1997: quality summary
+------------------------------------------------------------
+Contig xb56b6.s1 (#11)
+ 81.00 OK on both strands and they agree(a)
+  3.94 OK on plus strand only(b,d)
+ 11.98 OK on minus strand only(c,e)
+  1.85 Bad on both strands(f,g,h,j)
+  1.22 OK on both strands but they disagree(i)
+============================================================
+Wed 02 Apr 12:14:09 1997: quality listing
+------------------------------------------------------------
+Contig xb56b6.s1 (#11)
+
+          10         20         30         40         50         60
+  eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeehee eeeeeeeeee eeeeeeeeee
+
+          70         80         90        100        110        120
+  eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee
+
+         130        140        150        160        170        180
+  eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee
+
+         190        200        210        220        230        240
+  eeeeeeeeee eeeeeeeeee heeeeeeeee eeeeeeeici iiaiaciiia aaaaaaaaac
+
+         250        260        270        280        290        300
+  aaaacaaaaa aaaaaaaiia aaaaaaaaaa aaaaaaaaaa aaaabaaaaa aaaaaaaaaa
+
+         310        320        330        340        350        360
+  aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa faaaaaaaaa
+
+[ output removed for brevity ]
+ at end example
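Editorial illustration for quality_plot-t.texi above: a minimal Python sketch of how the single-letter codes written by "List" relate to the five classes reported by "Information". The grouping of letters is copied from the summary output in the manual text; the function name summarise_quality and the input handling are assumptions made for this example and are not gap4 code.

```python
from collections import Counter

# Letter groups exactly as printed in the "quality summary" output above.
SUMMARY_CLASSES = {
    "OK on both strands and they agree": "a",
    "OK on plus strand only": "bd",
    "OK on minus strand only": "ce",
    "Bad on both strands": "fghj",
    "OK on both strands but they disagree": "i",
}


def summarise_quality(listing):
    """Return the percentage of bases falling in each summary class.

    `listing` should contain only rows of quality letters (hypothetical
    input convention); blanks and ruler digits are ignored because only
    the letters a-j are counted.
    """
    counts = Counter(ch for ch in listing if ch in "abcdefghij")
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in SUMMARY_CLASSES}
    return {
        name: 100.0 * sum(counts[c] for c in letters) / total
        for name, letters in SUMMARY_CLASSES.items()
    }


# One row taken from the listing in the manual (positions 190 to 240).
row = "eeeeeeeeee eeeeeeeeee heeeeeeeee eeeeeeeici iiaiaciiia aaaaaaaaac"
for name, pct in summarise_quality(row).items():
    print(f"{pct:6.2f} {name}")
```

Header lines such as "Contig xb56b6.s1 (#11)" contain the letters a-j as well, so they must be stripped before calling this sketch.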
diff --git a/manual/read_clipping-t.texi b/manual/read_clipping-t.texi
new file mode 100644
index 0000000..3e1fbfb
--- /dev/null
+++ b/manual/read_clipping-t.texi
@@ -0,0 +1,20 @@
+
+_split()
+ at node Clipping-Introduction
+ at unnumberedsec Introduction to read clipping
+ at cindex clipping readings
+ at cindex reading clipping
+ at cindex hidden data
+
+For most assembly routines to work well it is necessary to present them with
+data of reasonable quality. Generally sequences produced by machines suffer
+from having poor quality data at one or both ends and so methods are
+needed to define
+where the data is too poor to use. Some base callers include an increased
+number of "N" symbols in the sequence in doubtful regions and so these
+can be searched for. In the ideal situation base accuracy estimates or confidence
+values for each base call will be available, and then these can be
+searched to find where the average confidence value
+becomes too low for reliable assembly.
+The program qclip (_fpref(Man-qclip,qclip, manpages)) performs both types
+of analysis.
diff --git a/manual/read_clipping.texi b/manual/read_clipping.texi
new file mode 100644
index 0000000..068f793
--- /dev/null
+++ b/manual/read_clipping.texi
@@ -0,0 +1,41 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename read_clipping.info
+ at settitle Clipping poor data from the ends of readings
+ at setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title Clipping poor data from the ends of readings
+ at subtitle 
+ at author 
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-read_clipping
+ at end ifinfo
+
+ at raisesections
+_include(read_clipping-t.texi)
+
+_split()
+ at node Index
+ at unnumberedsec Index
+ at printindex cp
+ at lowersections
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/read_coverage_d.png b/manual/read_coverage_d.png
new file mode 100644
index 0000000..4889228
Binary files /dev/null and b/manual/read_coverage_d.png differ
diff --git a/manual/read_coverage_p.png b/manual/read_coverage_p.png
new file mode 100644
index 0000000..74c81be
Binary files /dev/null and b/manual/read_coverage_p.png differ
diff --git a/manual/read_coverage_p.small.png b/manual/read_coverage_p.small.png
new file mode 100644
index 0000000..6bc843a
Binary files /dev/null and b/manual/read_coverage_p.small.png differ
diff --git a/manual/read_pairs-t.texi b/manual/read_pairs-t.texi
new file mode 100644
index 0000000..d743ab0
--- /dev/null
+++ b/manual/read_pairs-t.texi
@@ -0,0 +1,216 @@
+ at menu
+* ReadPair-Display::            Graphical Output
+* ReadPair-Output::             Textual Output
+ at end menu
+
+ at cindex Find read pairs
+ at cindex Read pairs
+
+This function is used to check the positions and orientations of
+readings taken from the same templates. 
+It is invoked from the gap4 View menu.
+
+For each template the relative
+position of its readings and the contigs they are in are examined. This
+analysis can give information about the relative order, separation and
+orientations of contigs and also show possible problems in the data.
+The search can be over the whole database or a subset of contigs named
+in a list (_fpref(Lists, Lists, lists)) 
+or file of file names. The results are written to the Output
+Window and plotted in the Contig Comparator 
+(_fxref(Contig Comparator, Contig Comparator, comparator)).
+Read pair information is also used to colour code the results displayed in the
+Template Display 
+(_fpref(Template-Display, Template Display, template)).
+
+Note that during assembly the template names and lengths are copied from
+the experiment files into the gap database. _fxref(Formats-Exp,
+Experiment Files, exp) The accuracy of the lengths will depend upon some
+size selection being performed during the cloning procedures.
+
+_picture(read_pairs,2.65in)
+
+Users choose to process "all contigs" or a subset selected from a file
+of file names ("file") or a list ("list"). If either of the subset
+options is selected the "browse" button will be activated and can be
+clicked on to call up a file or list browser dialogue.
+
+_split()
+ at node ReadPair-Display
+ at subsection Find Read Pairs Graphical Output
+ at cindex Find read pairs: display
+
+The contig comparator is used to plot all templates with readings that span
+contigs. That is, the lines drawn on the contig comparator are a visual
+representation of the relationship (orientation and overlap) between contigs.
+When a template spans more than two contigs, all the combinations of pairs of
+contigs are plotted. However such cases are uncommon.
+
+_lpicture(comparator,5.325in)
+
+The figure above shows a typical Contig Comparator plot which includes
+several types of result in addition to those from Read Pair analysis.
+
+The lines for the read-pairs 
+are, by default, shown in blue. The length of the line is the average
+length of the two readings within the pair. The slope of the line represents
+the relative orientation of the two readings. If they are both in the same
+orientation (including both complemented) the line is drawn from top left to
+bottom right, otherwise the line is drawn from top right to bottom left.
+
+Clicking with the right mouse button on a read pair line brings up a menu
+containing, amongst other things, "Invoke template display"
+(_fpref(Template-Display, Template Display, template)).
+This creates a template display of the two contigs. The spanning template 
+will be coloured bright yellow if the readings on the template are 
+consistent with one another, or dark yellow if they are not. The ordering of
+the contigs may need to be altered, or one contig may need complementing,
+before the readings on the template become consistent. Using the 
+"Invoke join editor" command 
+(_fpref(Editor-Joining, The Join Editor, contig_editor))
+from the same menu will bring up the Join Editor 
+with the two contigs shown end to end. 
+
+_split()
+ at node ReadPair-Output
+ at subsection Find Read Pairs Text Output
+ at cindex Find read pairs: output
+ at cindex Find read pairs: example
+
+Two types of results are written to the Output Window: those containing
+apparently consistent data about the relative orientations and positions
+of contigs, and those that show inconsistencies in the data. The
+inconsistencies will be due to misassembly or to misnaming of readings and
+templates.
+
+In the Output Window the program writes a line of information for each
+template and a line of information for each reading from that template.
+In order to fit this information on a standard 80 column display
+a few abbreviations are used.  An example with two consistent
+templates and one problematic template is shown below. Templates
+with possible problems are separated from those without. The
+templates shown are sorted by problem, with consistent templates
+at the top followed by increasingly inconsistent templates at the
+bottom.
+
+ at example
+ at group
+Template       zf18c8( 117), length 1400-2000(expected 1700)
+     Reading        zf18a2.s1(   +1F), pos   5620  +91, contig   46
+     Reading        zf18c8.s1( -117F), pos   1084 +288, contig  127
+
+Template       zf98f4( 659), length 1400-2000(computed 7263)
+     Reading        zf98f4.s1( -659F), pos     27 +238, contig  548
+     Reading        zf98f4.r1( +800R), pos   5392 +211, contig   46
+
+*** Possibly problematic templates listed below ***
+Template       zf24g6( 262), length 1400-2000(observed 1365)
+ D   Reading        zf24g6.r1( +808R), pos    463 +206, contig   46
+ D   Reading        zf24g6.s1( -262F), pos   1559 +268, contig   46
+ at end group
+ at end example
+
+ at subsubsection The Template Lines
+ at cindex Find read pairs: template lines
+ at cindex Template: find read pairs
+
+To describe the format of the template line we provide a detailed explanation
+of the lines above for the last Template block.
+
+ at table @code
+ at item "Template       zf24g6( 262)"
+This is the template with name "zf24g6" and number 262.
+
+ at item length 1400-2000
+These are the minimum and maximum lengths specified for this template.
+
+ at item (observed 1365)
+This section has the general format of "(comment distance)", where "comment" is
+one of the following.
+
+ at table @var
+ at item observed
+The template has both forward and reverse readings within this contig. From
+this information the actual size of the template can be seen. In the example
+this is "1365".
+
+ at item expected
+The template length is estimated as the average of the specified minimum and
+maximum size. This will be seen when the template does not span contigs and
+does not have both forward and reverse primers visible.
+
+ at item computed
+The template has forward and reverse readings in different contigs. The length
+is computed by butting the two contigs together, end to end, and finding the
+resultant separation of the template ends. It is not possible to tell whether
+the two contigs overlap, and if so by how much. Hence the "computed" lengths
+should not be considered as absolute.
+ at end table
+ at end table
+
+ at subsubsection The Reading Lines
+ at cindex Find read pairs: reading lines
+
+ at table @code
+ at item "?DPS"
+The first four characters may be either space or one of "?", "D", "P" or "S".
+The meaning of each of these is as follows.
+
+ at table @var
+ at item ?
+No primer information is available for these readings.
+ at item D
+The distance between forward and reverse primers (ie the template length) is
+not as expected.
+ at item P
+The primer information for readings on this template is inconsistent. An
+example of this is where two forward readings exist, both using the universal
+primer, and the readings are not in close proximity to each other.
+ at item S
+The template strand information is inconsistent. This problem can be seen when
+the forward and reverse readings are from the same strand, or two forward
+readings are pointing in opposite directions.
+ at end table
+
+Absence of all of these characters means that the template is consistent.
+
+ at item "Reading        zf24g6.r1"
+The reading name.
+
+ at item "( +808R)"
+The reading number. The "+" or "-" character preceding the number represents
+whether the reading has been complemented ("+" for original, "-" for
+complemented). The letter following the number indicates the primer
+information found for this reading. It may be one of:
+
+ at table @var
+ at item ?
+Unknown
+ at item F
+Forward, universal primer
+ at item f
+Forward, custom primer (eg a walk)
+ at item R
+Reverse, universal primer
+ at item r
+Reverse, custom primer
+ at end table
+
+ at item "pos    463 +206"
+The position and the length of the reading within the contig. In this case the
+reading starts at position 463 and extends for 206 bases. For a complemented
+reading the position marks the 3' end of the reading. For both cases the
+position can be considered as the 'left end' of the reading as displayed
+within the contig.
+
+ at item "contig   46"
+The reading number of the leftmost reading within this contig.
+ at end table
+
+In the above example the template has two readings. It can be seen that the
+template starts at contig position 463 and finishes at position 1827.  The
+observed length is 1365, which is just below the expected minimum length of
+1400. Hence the template is flagged as having an invalid distance. There are
+no other inconsistencies for this template and so it is likely that the
+only "problem" is that the experimental size selection process was not
+as precise as was thought.
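Editorial illustration for read_pairs-t.texi above: a minimal Python sketch that reads a template line from the Output Window, recomputes the "expected" length as the mean of the specified bounds, and tests an "observed" length against those bounds. The regular expression, the function names and the dictionary keys are assumptions made for this example and are not gap4 code. In the manual's own example the "computed" length 7263 lies outside 1400-2000 yet the template is listed as consistent, so this sketch only range-checks "observed" lengths; how gap4 treats the other cases is not stated in the text.

```python
import re

# Pattern for lines such as:
#   Template       zf24g6( 262), length 1400-2000(observed 1365)
TEMPLATE_RE = re.compile(
    r"Template\s+(\S+?)\(\s*(\d+)\),\s+length\s+(\d+)-(\d+)\((\w+)\s+(\d+)\)"
)


def parse_template_line(line):
    """Split a template line into its named fields (hypothetical helper)."""
    m = TEMPLATE_RE.search(line)
    if m is None:
        raise ValueError(f"not a template line: {line!r}")
    name, number, min_len, max_len, comment, distance = m.groups()
    return {
        "name": name,
        "number": int(number),
        "min": int(min_len),
        "max": int(max_len),
        "comment": comment,  # observed, expected or computed
        "distance": int(distance),
    }


def expected_length(min_len, max_len):
    """The "expected" estimate: average of the specified minimum and maximum."""
    return (min_len + max_len) / 2


def distance_flag(template):
    """Return "D" for an observed length outside the range, else a space.

    Assumption: only "observed" lengths are checked, see the note above.
    """
    if template["comment"] != "observed":
        return " "
    in_range = template["min"] <= template["distance"] <= template["max"]
    return " " if in_range else "D"


t = parse_template_line(
    "Template       zf24g6( 262), length 1400-2000(observed 1365)"
)
print(t["name"], distance_flag(t), expected_length(t["min"], t["max"]))
```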
diff --git a/manual/read_pairs.png b/manual/read_pairs.png
new file mode 100644
index 0000000..039280e
Binary files /dev/null and b/manual/read_pairs.png differ
diff --git a/manual/readpair_coverage_p.png b/manual/readpair_coverage_p.png
new file mode 100644
index 0000000..66b6803
Binary files /dev/null and b/manual/readpair_coverage_p.png differ
diff --git a/manual/readpair_coverage_p.small.png b/manual/readpair_coverage_p.small.png
new file mode 100644
index 0000000..db4cbce
Binary files /dev/null and b/manual/readpair_coverage_p.small.png differ
diff --git a/manual/references-t.texi b/manual/references-t.texi
new file mode 100644
index 0000000..f0ace7d
--- /dev/null
+++ b/manual/references-t.texi
@@ -0,0 +1,122 @@
+ at node References
+ at unnumberedsec Publications
+ at cindex references
+
+ at enumerate 1
+ at item Bonfield, James K. and Staden, Rodger.
+ZTR: a new format for DNA sequence trace data. Bioinformatics 18, 3-10, (2002).
+
+ at item Bonfield, James K., Beal, Kathryn F., Betts, Matthew J. and Staden, Rodger.
+Trev: a DNA trace editor and viewer. 
+Bioinformatics 18, 194-195, (2002)
+
+ at item Rodger Staden, David P. Judge and James K. Bonfield
+Sequence assembly and finishing methods
+Bioinformatics. A Practical Guide to the Analysis of Genes and Proteins. 
+Second Edition
+Eds. Andreas D. Baxevanis and B. F. Francis Ouellette. John Wiley & Sons, 
+New York, NY, USA, (2001)
+
+ at item The C. elegans Sequencing Consortium.
+Genome Sequence of the Nematode C. elegans: A Platform for
+Investigating Biology.
+Science 282, 2012-2018 (1998)
+
+ at item Rodger Staden, Kathryn F. Beal and James K. Bonfield
+The Staden Package, 1998. 
+Computer Methods in Molecular Biology
+Eds Stephen Misener and Steve Krawetz. The Humana Press Inc., Totowa, NJ
+07512
+
+ at item Bonfield, J.K., Rada, C. and Staden, R.
+Automated detection of point mutations using fluorescent sequence trace
+subtraction.
+Nucleic Acids Res. 26, 3404-3409 (1998)
+
+ at item Flint, J., Sims, M., Clark, K., Staden, R. and Thomas, K.
+An Oligo-Screening Strategy to Fill Gaps Found During Shotgun
+Sequencing Projects.
+DNA Sequence 8, 241-245 (1998)
+
+ at item Staden, R. The Staden Sequence Analysis Package.
+Molecular Biotechnology 5, 233-241 (1996)
+
+ at item Bonfield, J.K. and Staden, R. Experiment files and their application during 
+large-scale sequencing projects.
+DNA Sequence 6, 109-117 (1996)
+
+ at item Staden, R. Indexing and using sequence databases.
+Methods in Enzymology 266, 105-114 (1996)
+
+ at item Bonfield, J.K., Smith, K.F. and Staden, R. A new DNA sequence assembly program.
+Nucleic Acids Res. 23, 4992-4999 (1995)
+
+ at item Bonfield, J.K. and Staden, R. The application of numerical estimates of 
+base calling accuracy to DNA sequencing projects.
+Nucleic Acids Res. 23, 1406-1410 (1995)
+
+ at item Dear, S. and Staden, R. A standard file format for data from DNA sequencing
+instruments.
+DNA Sequence 3, 107-110 (1992)
+
+ at item Staden, R. and Dear, S. Indexing the sequence libraries: Software providing
+a common indexing system for all the standard sequence libraries. 
+DNA Sequence 3, 99-105 (1992).
+
+ at item Dear, S. and Staden, R. A sequence assembly and editing program for 
+efficient management of large projects.
+Nucleic Acids Res. 19, 3907-3911 (1991).
+
+ at item Staden, R. Screening protein and nucleic acid sequences against libraries of
+patterns.
+DNA Sequence 1, 369-374 (1991).
+
+ at item Staden, R. Searching for patterns in protein and nucleic acid sequences. 
+Methods in Enzymology 183, 193-211. (1990).
+
+ at item Staden, R. Finding protein coding regions in genomic sequences.
+Methods in Enzymology 183, 163-180. (1990).
+
+ at item Staden, R. Methods for discovering novel motifs in nucleic acid sequences.
+CABIOS 5, 293-298 (1989)
+
+ at item Staden, R. Methods for calculating the probabilities of finding
+patterns in sequences.
+CABIOS 5, 89-96 (1989)
+
+ at item Staden, R. Methods to define and locate patterns of motifs in
+sequences.
+CABIOS 4, 53-60 (1988)
+
+ at item Staden, R. Graphic methods to determine the function of nucleic acid sequences.
+Nucleic Acids Res. 12, 521-538 (1984)
+
+ at item Staden, R. Computer methods to locate signals in nucleic acid sequences.
+Nucleic Acids Res. 12, 505-519 (1984)
+
+ at item Staden, R. A computer program to enter DNA gel reading data into a computer.
+Nucleic Acids Res. 12, 499-503 (1984)
+
+ at item Staden, R. Measurements of the effects that coding for a protein
+has on a DNA sequence and their use for finding genes.
+Nucleic Acids Res. 12, 551-567 (1984)
+
+ at item Staden, R. and McLachlan, A.D. Codon preference and its use in identifying protein
+coding regions in long DNA sequences.
+Nucleic Acids Res. 10, 141-156 (1982)
+
+ at item Staden, R. Automation of the computer handling of gel reading data produced
+by the shotgun method of DNA sequencing.
+Nucleic Acids Res. 10, 4731-4751 (1982)
+
+ at item Staden, R. An interactive graphics program for comparing and aligning
+nucleic acid and amino acid sequences.
+Nucleic Acids Res. 10, 2951-2961 (1982)
+
+ at item Staden, R. A new computer method for the storage and manipulation
+of DNA gel reading data.
+Nucleic Acids Res. 8, 3673-3694 (1980)
+
+ at item Staden, R. A computer program to search for tRNA genes.
+Nucleic Acids Res. 8, 817-825 (1980)
+ at end enumerate
diff --git a/manual/references.texi b/manual/references.texi
new file mode 100644
index 0000000..a6592e0
--- /dev/null
+++ b/manual/references.texi
@@ -0,0 +1,41 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename references.info
+ at settitle References
+ at setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title References
+ at subtitle 
+ at author 
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-references
+ at end ifinfo
+
+ at raisesections
+_include(references-t.texi)
+
+_split()
+ at node Index
+ at unnumberedsec Index
+ at printindex cp
+ at lowersections
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/renzymes-t.texi b/manual/renzymes-t.texi
new file mode 100644
index 0000000..44f59ec
--- /dev/null
+++ b/manual/renzymes-t.texi
@@ -0,0 +1,57 @@
+ at node Formats-Restriction
+ at section Restriction Enzyme File
+ at cindex Restriction enzyme files
+
+Restriction enzymes and their recognition sequences used by the package
+must be stored in the format described below. Updates of the files can be
+obtained from the REBASE restriction enzyme database of Dr R
+Roberts. Contact roberts@@neb.com or macelis@@neb.com to join the mailing
+list and state that you want the files sent in "staden" format.
+
+Standard four-cutter, six-cutter and all-enzymes files are supplied with
+the package and
+users can create and use their own "personal" files.  To create your own file
+of enzymes you may need to extract the information from the currently
+defined files. These are stored in the tables directory (folder)
+distributed with the package, and are named:
+
+ at example
+RENZYM.4
+RENZYM.6
+RENZYM.ALL
+ at end example
+
+
+We call the
+recognition sequences "strings". The format is as follows: each
+string or set of strings must be preceded by a name, each string
+must be preceded and terminated by a slash (/), and each set of
+strings must end with two slashes. For example AATII/GACGT'C// defines the
+name AATII, its recognition sequence GACGTC and its cut site with the '
+symbol; ACCI/GT'MKAC// defines the name ACCI and its recognition
+sequence includes IUB symbols for incompletely defined bases in
+nucleic acid sequences; BBVI/GCAGCNNNNNNNN'/'NNNNNNNNNNNNGCTGC//
+defines the name BBVI and this time two recognition sequences and
+cut sites are specified to enable the definition of the cut position
+relative to the recognition sequence. If no cut site is
+included the first base of the recognition sequence is displayed as
+being on the 3' side of the recognition sequence.
+
+A section of a typical file follows:
+
+ at example
+ AATII/GACGT'C//
+ ACCI/GT'MKAC//
+ AFLII/C'TTAAG//
+ AVAII/G'GWCC//
+ AVRII/C'CTAGG//
+ BANI/G'GYRCC//
+ BANII/GRGCY'C//
+ BBVI/GCAGCNNNNNNNN'/'NNNNNNNNNNNNGCTGC//
+ BCLI/T'GATCA//
+ BGLI/GCCNNNN'NGGC//
+ BGLII/A'GATCT//
+ BINI/GGATCNNNN'/'NNNNNGATCC//
+ BSMI/GAATGCN'/NG'CATTC//
+ BSP1286/GDGCH'C//
+ at end example
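Editorial illustration for renzymes-t.texi above: a minimal Python sketch that parses one line of the "staden" restriction enzyme format as it is described in the text (a name, then slash-separated strings, with the set closed by two slashes, and ' marking the cut site). The function name parse_enzyme_line and the returned tuple layout are assumptions made for this example; this is not the package's own reader.

```python
def parse_enzyme_line(line):
    """Parse e.g. "AATII/GACGT'C//" into (name, [(sequence, cut), ...]).

    `cut` is the number of bases to the left of the ' mark within that
    string, or None when the string carries no cut mark.
    """
    text = line.strip()
    if not text.endswith("//"):
        raise ValueError(f"entry must end with two slashes: {line!r}")
    # Drop the closing "//"; the remaining slashes separate name and strings.
    name, *strings = text[:-2].split("/")
    if not name or not strings:
        raise ValueError(f"entry needs a name and at least one string: {line!r}")
    sites = []
    for s in strings:
        cut = s.find("'")
        sites.append((s.replace("'", ""), cut if cut >= 0 else None))
    return name, sites


for entry in (" AATII/GACGT'C//", " BSMI/GAATGCN'/NG'CATTC//"):
    print(parse_enzyme_line(entry))
```

The sketch keeps IUB symbols such as M, K and N untouched; expanding them into base sets for matching would be a separate step.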
diff --git a/manual/repeats-t.texi b/manual/repeats-t.texi
new file mode 100644
index 0000000..d03cb0a
--- /dev/null
+++ b/manual/repeats-t.texi
@@ -0,0 +1,75 @@
+ at cindex Find repeats
+ at cindex Repeat search
+
+The purpose of this function (which is invoked from the gap4 View menu)
+is to find exact repeats in contig consensus sequences. An exact repeat
+is defined as a run of consecutive ACGT characters that is identical
+at two positions; no mismatches or gaps are permitted.
+
+If it has not already occurred, selection of
+this function will automatically
+transform the Contig Selector into the Contig Comparator.
+_fxref(Contig Comparator, Contig Comparator, comparator)
+Each match found is plotted as a diagonal line in the Contig Comparator.
+The length of the diagonal line is proportional to the length of the
+match.
+
+If the match is for two contigs in the same orientation the diagonal
+will be parallel to the main diagonal; if they are not, the line will be
+perpendicular to the main diagonal. The matches displayed in the Contig
+Comparator can be used to invoke the Join Editor (_fpref(Editor-Joining,
+The Join Editor, contig_editor)) or Contig Editors (_fpref(Editor,
+Editing in gap4, contig_editor)), and an Information button will display
+data about the match in the Output window, for example:
+
+ at example
+ at group
+Repeat match
+    From contig xb54a3.s1(#26) at 78
+    With contig xb62h3.s1(#3) at 1
+    Length 37
+ at end group
+ at end example
+
+This means that position 78 in the contig with xb54a3.s1 (reading number
+26) at its left end matches 37 bases at position 1 in the contig with
+xb62h3.s1 (number 3) at its left end.
+
+_picture(repeats,3.70833in)
+
+Users can elect to search a "single" contig, or compare "all contigs",
+or a subset of contigs defined in a list or a file. If "file" or "list"
+is selected the browse button is activated and gives access to file or
+list browsers.  If they choose to analyse a single contig the dialogue
+concerned with selecting the contig and the region to search becomes
+activated. The "Minimum Repeat" defines the smallest match that the
+algorithm will report.  The algorithm will search only for repeats in
+the forward direction ("Find direct repeats"), only for those in the
+reverse direction ("Find inverted repeats"), or for both ("Find both").
+
+If "Mask active tags" is selected the "Select tags" button is activated.
+Clicking on this button will bring up a check box dialogue to enable the
+user to select the tag types they wish to activate. Masking the active
+tags means that all segments covered by tags that are "active" will not
+be used in the matching algorithm.
+A typical use of this mode is to avoid finding
+matches in segments covered by tags of type ALUS (ie segments thought to
+be Alu sequence) or in those already covered by REPT tags.
+_fxref(Anno-Types, Tag types, tags)
+
+After the search is complete clicking on "Yes" in the "Save tags to
+file" panel will activate the "File name" box and all repeats on the
+list will be written to a file. This file can be used with "Enter tags"
+(_fpref(Enter Tags, Enter Tags, complement)) to create REPT tags for all
+the repeats found.  Note that "Enter tags" will remove all the results 
+plotted in the contig comparator.
+
+Note that the current version of Find Repeats has a limit to the number
+of repeats it can store. The limit depends on the current maximum
+consensus length, so if you want to increase the limit, reset the
+maximum consensus length. This can be done using the "Set maxseq" item
+in the "Options" menu.
+
+
+
+
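Editorial illustration for repeats-t.texi above: a deliberately naive Python sketch of what an "exact repeat" between two consensus sequences means (a maximal run of matching A, C, G or T characters of at least "Minimum Repeat" length, with no mismatches or gaps). It is not the algorithm used by gap4, which the text does not describe; the function name and the brute-force search are assumptions made for this example.

```python
def find_exact_matches(seq_a, seq_b, min_len):
    """Return (pos_a, pos_b, length) for maximal direct exact matches.

    Positions are 1-based, as in the "Repeat match" output of the manual.
    Only the characters A, C, G and T may take part in a match.
    """
    matches = []
    for i in range(len(seq_a)):
        for j in range(len(seq_b)):
            # Skip starts that merely continue a match beginning one base earlier.
            if (i > 0 and j > 0 and seq_a[i - 1] == seq_b[j - 1]
                    and seq_a[i - 1] in "ACGT"):
                continue
            n = 0
            while (i + n < len(seq_a) and j + n < len(seq_b)
                   and seq_a[i + n] == seq_b[j + n]
                   and seq_a[i + n] in "ACGT"):
                n += 1
            if n >= min_len:
                matches.append((i + 1, j + 1, n))
    return matches


for pos_a, pos_b, length in find_exact_matches("TTTACGTACGGA", "ACGTACGCC", 5):
    print(f"Repeat match: first at {pos_a}, second at {pos_b}, length {length}")
```

Comparing a sequence with itself this way also reports the trivial full-length match, and inverted repeats would need one sequence to be reverse-complemented first; both are left out to keep the sketch short.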
diff --git a/manual/repeats.png b/manual/repeats.png
new file mode 100644
index 0000000..eee9c12
Binary files /dev/null and b/manual/repeats.png differ
diff --git a/manual/restrict_enzymes-t.texi b/manual/restrict_enzymes-t.texi
new file mode 100644
index 0000000..a432f2d
--- /dev/null
+++ b/manual/restrict_enzymes-t.texi
@@ -0,0 +1,160 @@
+ at menu
+* Restrict-Selecting::          Selecting enzymes
+* Restrict-Examining::          Examining the plot
+* Restrict-Reconfig::           Reconfiguring the plot
+_ifdef([[_gap4]],[* Restrict-Tags::               Creating tags for cut sites
+])* Restrict-Output::             Textual outputs
+ at end menu
+
+ at cindex Restriction enzymes
+
+The restriction enzyme map function finds and displays restriction sites
+within a specified region of a contig. 
+It is invoked from the gap4 View menu.
+Users can select the enzyme
+types to search for and can save the sites found as tags within the
+database.
+
+_lpicture(restrict_enzymes,6in)
+
+This figure shows a typical view of the Restriction Enzyme Map
+in which the results for each enzyme type have been configured by the
+user to be drawn in different colours.  On the left of the display the
+enzyme names are shown adjacent to the lines of plotted results. If no
+result is found for a particular enzyme (eg APAI here), the line will
+still be drawn so that zero cutters can be identified. Three of the
+enzyme types have been selected and are shown highlighted. The results
+can be scrolled vertically (and horizontally if the plot is zoomed in).
+A ruler is shown along the base and the current cursor position (the vertical
+black line) is shown in the left hand box near the top right of
+the display.  If the user clicks, in turn, on two restriction sites
+their separation in base pairs will appear in the top right hand box.
+Information about the last site touched is shown in the Information line
+at the bottom of the display. At the top the edit menu is shown
+torn off and can be used to create tags for highlighted enzyme types.
+
+_split()
+ at node Restrict-Selecting
+ at subsection Selecting Enzymes
+ at cindex Restriction enzymes: selecting enzymes
+
+Restriction enzyme names and their cut sites are stored in disk
+files. For the format of these files and notes about creating new ones see
+_fref(Formats-Restriction, Restriction enzyme files, renzymes)
+
+When the file is read, the list of enzymes is displayed in a scrolling
+window.  To select enzymes press and drag the left mouse button within
+the list.  Dragging the mouse off the bottom of the list will scroll it to
+allow selection of a range larger than the displayed section of the
+list.  When the left button is pressed any existing selection is
+cleared. To select several disjoint entries in the list press control
+and the left mouse button. Once the enzymes have been chosen, pressing
+OK will create the plot.
+
+_split()
+ at node Restrict-Examining
+ at subsection Examining the Plot
+ at cindex Restriction enzymes: examining the plot
+
+Positioning the cursor over a match will cause its name and cut position
+to appear in the information line.  If the right mouse button is pressed
+over a match, a popup menu containing Information and Configure will
+appear. The Information function in this menu will display the data for
+this cut site and enzyme in the Output Window.
+
+It is possible to find the distance between any two cut sites.  Pressing
+the left mouse button on a match will display "Select another cut" at
+the bottom of the window.  Then, pressing the left button on another
+match will display the distance, in bases, between the two sites. This
+is shown in a box located at the top right corner of the window.
+
+_split()
+ at node Restrict-Reconfig
+ at subsection Reconfiguring the Plot
+ at cindex Configure: restriction enzymes
+ at cindex Restriction enzymes: configuring
+
+The plot displays the results for each restriction enzyme on a separate
+line.  Enzymes with no sites are also shown.  The order of these lines
+may be changed by pressing and dragging the middle mouse button or alt + left
+mouse button on one of the displayed names at the left side of the screen.
+
+The results are plotted as black lines but users can select colours for
+each enzyme type by pressing the right button on any of its matches.  A
+menu containing Information and Configure will pop up. Configure will
+display a colour selection dialogue.  Adjusting the colour here will
+adjust the colour for all matches for this restriction enzyme.
+
+_ifdef([[_gap4]],[[
+_split()
+ at node Restrict-Tags
+ at subsection Creating Tags for Cut Sites
+ at cindex Tags: restriction enzymes plot
+ at cindex Restriction enzymes: tags, creation of
+ at cindex Cut sites: restriction enzymes
+ at cindex Restriction enzymes: cut sites
+
+Clicking the left mouse button on an enzyme name at the left of the
+display toggles a highlight.  The Create tags command from the Edit menu
+will add tags to the database for all the matches whose enzyme names are
+highlighted.
+The command displays a dialogue box
+listing the enzyme names on the left, and the tag type to create for
+that enzyme on the right. Tag types must be chosen for all the listed
+restriction enzyme types before the tags can be created. Suitable tag
+types to choose are the ENZ0, ENZ1 (etc) tags.
+]])
+
+_split()
+ at node Restrict-Output
+ at subsection Textual Outputs
+ at cindex Restriction enzymes: textual output
+ at cindex Output enzyme by enzyme: restriction enzymes plot
+ at cindex Output ordered on position: restriction enzymes plot
+
+The Results menu of the plot contains options to list the restriction
+enzyme sites found. One option sorts the results by enzyme name and the
+other by the positions of the matches.
+
+The output below shows the textual output from "Output enzyme by enzyme".
+The Fragment column gives the size of the fragments between each of the cut
+sites. The Lengths column contains the fragment sizes sorted on size.
+
+ at example
+Contig zf98g12.r1 (#801) 
+Number of enzymes = 3
+Number of matches = 7
+  Matches found=     1 
+      Name            Sequence                  Position Fragment lengths
+    1 AATII           GACGT'C                       7130   7129    556 
+                                                            556   7129 
+  Matches found=     5 
+      Name            Sequence                  Position Fragment lengths
+    1 ACCI            GT'CGAC                        414    413    189 
+    2 ACCI            GT'CTAC                       1296    882    413 
+    3 ACCI            GT'CTAC                       3871   2575    882 
+    4 ACCI            GT'CTAC                       5816   1945   1681 
+    5 ACCI            GT'CGAC                       7497   1681   1945 
+                                                            189   2575 
+  Matches found=     1 
+      Name            Sequence                  Position Fragment lengths
+    1 AHAII           GA'CGTC                       7127   7126    559 
+                                                            559   7126 
+ at end example
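+
+The relationship between the Position, Fragment and Lengths columns can
+be sketched in C as follows. This is only an illustration inferred from
+the listing above, not the package's own code: the helper name
+fragment_lengths is hypothetical, and the contig length of 7685 is an
+assumption that happens to reproduce the numbers shown.

```c
#include <stdlib.h>

/* Comparison callback for qsort: ascending integer order. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper (not part of the package).
 *   pos[]    cut positions in ascending order, 1-based
 *   n        number of cut positions
 *   seq_len  contig length (assumed; 7685 fits the listing above)
 *   frag[]   receives n+1 fragment sizes in positional order
 *   sorted[] receives the same n+1 sizes in ascending order
 */
void fragment_lengths(const int *pos, int n, int seq_len,
                      int *frag, int *sorted)
{
    int i, prev = 1;

    for (i = 0; i < n; i++) {
        frag[i] = pos[i] - prev;      /* distance from the previous cut */
        prev = pos[i];
    }
    frag[n] = seq_len - prev + 1;     /* last cut to the end of the contig */

    for (i = 0; i <= n; i++)
        sorted[i] = frag[i];
    qsort(sorted, (size_t)n + 1, sizeof(int), cmp_int);
}
```

+
+With the five ACCI positions from the listing and an assumed length of
+7685, frag[] holds 413, 882, 2575, 1945, 1681, 189 and sorted[] holds
+189, 413, 882, 1681, 1945, 2575, which matches the two columns above.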
+
+The output below shows the textual output from "Output ordered on position".
+
+ at example
+Contig zf98g12.r1 (#801) 
+Number of enzymes = 3
+Number of matches = 7
+      Name            Sequence                  Position Fragment lengths
+    1 ACCI            GT'CGAC                        414    413      3 
+    2 ACCI            GT'CTAC                       1296    882    189 
+    3 ACCI            GT'CTAC                       3871   2575    367 
+    4 ACCI            GT'CTAC                       5816   1945    413 
+    5 AHAII           GA'CGTC                       7127   1311    882 
+    6 AATII           GACGT'C                       7130      3   1311 
+    7 ACCI            GT'CGAC                       7497    367   1945 
+                                                            189   2575 
+ at end example
diff --git a/manual/restrict_enzymes.png b/manual/restrict_enzymes.png
new file mode 100644
index 0000000..bd20752
Binary files /dev/null and b/manual/restrict_enzymes.png differ
diff --git a/manual/restrict_enzymes.small.png b/manual/restrict_enzymes.small.png
new file mode 100644
index 0000000..de323a5
Binary files /dev/null and b/manual/restrict_enzymes.small.png differ
diff --git a/manual/results-t.texi b/manual/results-t.texi
new file mode 100644
index 0000000..30de780
--- /dev/null
+++ b/manual/results-t.texi
@@ -0,0 +1,50 @@
+ at cindex Results manager
+ at cindex Results manager: introduction
+ at cindex Results: removing
+ at cindex Removing results
+
+Some commands within __prog__ produce "results" that are updated
+automatically as data is edited. The Results Manager provides a way to
+list these results, and to interact with them.
+
+A result is an abstract term for any collection of data. Typically
+this data can be displayed and manipulated, and it is usually
+updated automatically when changes are made that affect it. Each set
+of matches from a particular search 
+plotted on the Contig Comparator 
+(_fpref(Contig Comparator, Contig
+Comparator, comparator))
+is a result, as are
+entire displays such as the Template Display.
+
+_picture(results.1,4.03333in)
+
+The "results" window, shown above, can be invoked either from the View
+menu in the main display or from the View menu of the Contig Comparator.
+Each result is listed in the window on a separate line containing the
+time that the result was created (which may not be the same as when it
+was last updated), the name of the function that created the result, and
+the result number. The number is simply a unique identifier to help
+distinguish two results produced by the same function.
+
+Each item in the list is consuming memory on your computer. Running
+functions over and over again without removing the previous results
+will slow down your machine and it will, eventually, run out of
+memory. Removing items from the list solves this.
+
+Pressing the right mouse button over a listed item will display a popup
+menu of operations that can be performed on this result. The operations
+available will always contain "Remove" which will delete this result and
+shut down any associated window, but others listed will depend on the
+result selected. In the illustration above the popup menu for the
+"Repeat search" can be seen. Here the operations relate to a set of
+repeat matches currently being displayed in the Contig Comparator (not
+shown).
+
+The Contig Comparator functions ("Find internal joins", "Find read
+pairs", "Find repeats", "Check assembly" and "Find Sequences") are all
+listed in the Results Manager once per usage of the function. It is
+worth remembering that the only ways to completely remove the plots
+from one of these functions are to use the "Remove" command within the
+Results Manager, or to use the "Clear" button within the Contig
+Comparator, which removes all plots.
diff --git a/manual/results.1.png b/manual/results.1.png
new file mode 100644
index 0000000..f20706c
Binary files /dev/null and b/manual/results.1.png differ
diff --git a/manual/scf-t.texi b/manual/scf-t.texi
new file mode 100644
index 0000000..6ffd2dc
--- /dev/null
+++ b/manual/scf-t.texi
@@ -0,0 +1,437 @@
+ at ignore
+ at c MANSECTION=4
+ at unnumberedsec NAME
+
+scf --- SCF File Format
+ at end ignore
+
+ at node Formats-Scf
+ at section SCF
+ at cindex SCF
+
+SCF format files are used to store data from DNA sequencing
+instruments. Each file contains the data for a single reading and
+includes: its trace sample points, its called sequence, the positions
+of the bases relative to the trace sample points, and numerical
+estimates of the accuracy of each base. Comments and "private data"
+can also be stored. The format is machine
+independent and the first version was described in Dear, S and Staden, R. "A
+standard file format for data from DNA sequencing instruments", DNA
+Sequence 3, 107-110, (1992). 
+
+Since then it has undergone several important changes. The first allowed for
+different sample point resolutions. The second, in response to the need to
+reduce file sizes for large projects, involved a major reorganisation of the
+ordering of the data items in the file and also in the way they are
+represented.  Note that despite these changes we have retained the original
+data structures into which the data is read. Also this reorganisation in
+itself has not made the files smaller but it has produced files that are more
+effectively compressed using standard programs such as gzip. The io library
+included in the package contains routines that can read and write all the
+different versions of the format (including reading of compressed files). The
+header record was not affected by this change. This documentation covers both
+the format of scf files and the data structures that are used by the io
+library. Prior to version 3.00 these two things corresponded much more
+closely.
+
+
+ at menu
+* Scf-Header::          Header record
+* Scf-Sample::          Sample points
+* Scf-Sequence::        Sequence information
+* Scf-Comments::        Comments
+* Scf-Private::         Private data
+* Scf-File-structure::  File structure
+* Scf-Notes::           Notes
+ at end menu
+
+_split()
+ at node Scf-Header
+ at subsection Header Record
+ at cindex Header record: SCF
+ at cindex Header: SCF structure
+ at cindex SCF header record
+ at cindex Magic number: SCF
+ at cindex SCF magic number
+
+The file begins with a 128 byte header record that describes the
+location and size of the chromatogram data in the file. Nothing is
+implied about the order in which the components (samples, sequence and
+comments) appear. The version field is a 4 byte character array
+representing the version and revision of the SCF format. The current
+value of this field is "3.00".
+
+ at c INDENT=0.2i
+ at example
+/*
+ * Basic type definitions
+ */
+typedef unsigned int   uint_4;
+typedef signed   int    int_4;
+typedef unsigned short uint_2;
+typedef signed   short  int_2;
+typedef unsigned char  uint_1;
+typedef signed   char   int_1;
+
+/*
+ * Type definition for the Header structure
+ */
+#define SCF_MAGIC (((((uint_4)'.'<<8)+(uint_4)'s'<<8) \
+                     +(uint_4)'c'<<8)+(uint_4)'f')
+
+typedef struct @{
+    uint_4 magic_number;
+    uint_4 samples;          /* Number of elements in Samples matrix */
+    uint_4 samples_offset;   /* Byte offset from start of file */
+    uint_4 bases;            /* Number of bases in Bases matrix */
+    uint_4 bases_left_clip;  /* OBSOLETE: No. bases in left clip (vector) */
+    uint_4 bases_right_clip; /* OBSOLETE: No. bases in right clip (qual) */
+    uint_4 bases_offset;     /* Byte offset from start of file */
+    uint_4 comments_size;    /* Number of bytes in Comment section */
+    uint_4 comments_offset;  /* Byte offset from start of file */
+    char version[4];         /* "version.revision", eg '3' '.' '0' '0' */
+    uint_4 sample_size;      /* Size of samples in bytes 1=8bits, 2=16bits*/
+    uint_4 code_set;         /* code set used (but ignored!)*/
+    uint_4 private_size;     /* No. of bytes of Private data, 0 if none */
+    uint_4 private_offset;   /* Byte offset from start of file */
+    uint_4 spare[18];        /* Unused */
+@} Header;
+ at end example
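+
+As a rough sketch of how the header might be decoded from a raw 128 byte
+buffer, the following C fragment checks the magic number and extracts a
+few fields. The names ScfSummary, be32 and scf_decode_header are
+hypothetical and are not the io library's API; the byte offsets are
+simply derived from the field order of the Header structure above, and
+the fields are assumed to be stored most significant byte first.

```c
#include <stdint.h>
#include <string.h>

#define SCF_HEADER_SIZE 128
#define SCF_MAGIC_VALUE 0x2e736366u   /* the bytes '.', 's', 'c', 'f' */

/* Hypothetical subset of the header fields, for illustration only. */
typedef struct {
    uint32_t samples, samples_offset;
    uint32_t bases, bases_offset;
    uint32_t comments_size, comments_offset;
    char     version[5];              /* 4 characters plus a terminator */
    uint32_t sample_size;
} ScfSummary;

/* Decode a 4 byte integer stored most significant byte first. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the magic number does not match. */
int scf_decode_header(const unsigned char buf[SCF_HEADER_SIZE], ScfSummary *h)
{
    if (be32(buf) != SCF_MAGIC_VALUE)
        return -1;

    h->samples         = be32(buf + 4);
    h->samples_offset  = be32(buf + 8);
    h->bases           = be32(buf + 12);
    /* bytes 16..23 hold the two obsolete clip fields */
    h->bases_offset    = be32(buf + 24);
    h->comments_size   = be32(buf + 28);
    h->comments_offset = be32(buf + 32);
    memcpy(h->version, buf + 36, 4);
    h->version[4] = '\0';
    h->sample_size     = be32(buf + 40);

    /* Assumption: versions that sort before "2.00" use 1 byte samples. */
    if (strncmp(h->version, "2.00", 4) < 0)
        h->sample_size = 1;
    return 0;
}
```

+
+A caller would read the first 128 bytes of the file into a buffer, pass
+it to scf_decode_header, and then seek to the returned offsets rather
+than assuming any particular order of the sections.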
+
+ at quotation
+For versions of SCF files 2.0 or greater (@strong{Header.version} is `greater
+than' "2.00"), the version number, precision of data, and the uncertainty code
+set are specified in the header.  Otherwise, the precision is assumed to be 1
+byte, and the code set to be the default code set.  The following uncertainty
+code sets are recognised (but still ignored by our programs!).
+ at end quotation
+
+ at example
+0       @{A,C,G,T,-@}   (default)
+1       Staden
+2       IUPAC (NC-IUB)
+3       Pharmacia A.L.F. (NC-IUB)
+4       @{A,C,G,T,N@}   (ABI 373A)
+5       IBI/Pustell
+6       DNA*
+7       DNASIS
+8       IG/PC-Gene
+9       MicroGenie
+ at end example
+
+_split()
+ at node Scf-Sample
+ at subsection Sample Points.
+ at cindex Sample points: SCF
+ at cindex SCF: Sample points
+ at cindex Samples1: SCF structure
+ at cindex Samples2: SCF structure
+
+The trace information is stored at byte offset
+ at strong{Header.samples_offset} from the start of the file. For each
+sample point there are values for each of the four bases.  
+ at strong{Header.sample_size} holds the
+precision of the sample values. The precision must be either "1"
+(unsigned byte) or "2" (unsigned short). The sample points need not be
+normalised to any particular value, though it is assumed that they
+represent positive values. That is, they are of unsigned type.
+
+With the introduction of scf version 3.00, in an attempt to produce
+efficiently compressed files, the sample points
+are stored in A,C,G,T order; i.e. all the values for base A, followed by all
+those for C, etc. In addition they are stored, not as their original 
+magnitudes, but in terms of the
+differences between successive values. The C language code used to
+transform the values for precision 2 samples is shown below.
+
+
+ at example
+void delta_samples2 ( uint_2 samples[], int num_samples, int job) @{
+ 
+    /* If job == DELTA_IT:
+     *  change a series of sample points to a series of delta delta values:
+     *  ie change them in two steps:
+     *  first: delta = current_value - previous_value
+     *  then: delta_delta = delta - previous_delta
+     * else
+     *  do the reverse
+     */
+ 
+    int i;
+    uint_2 p_delta, p_sample;
+ 
+    if ( DELTA_IT == job ) @{
+        p_delta  = 0;
+        for (i=0;i<num_samples;i++) @{
+            p_sample = samples[i];
+            samples[i] = samples[i] - p_delta;
+            p_delta  = p_sample;
+        @}
+        p_delta  = 0;
+        for (i=0;i<num_samples;i++) @{
+            p_sample = samples[i];
+            samples[i] = samples[i] - p_delta;
+            p_delta  = p_sample;
+        @}
+    @}
+    else @{
+        p_sample = 0;
+        for (i=0;i<num_samples;i++) @{
+            samples[i] = samples[i] + p_sample;
+            p_sample = samples[i];
+        @}
+        p_sample = 0;
+        for (i=0;i<num_samples;i++) @{
+            samples[i] = samples[i] + p_sample;
+            p_sample = samples[i];
+        @}
+    @}
+@}
+ at end example
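+
+Only the precision 2 routine is shown above. A minimal sketch of the
+same two-pass transformation for precision 1 samples is given below; it
+assumes the 8 bit case uses identical arithmetic, wrapping modulo 256.
+The function names are hypothetical and are not taken from the io
+library.

```c
#include <stddef.h>
#include <stdint.h>

/* Replace each sample by its second-order difference (two delta passes). */
void delta_delta_encode8(uint8_t *s, size_t n)
{
    int pass;
    size_t i;

    for (pass = 0; pass < 2; pass++) {
        uint8_t prev = 0;
        for (i = 0; i < n; i++) {
            uint8_t cur = s[i];
            s[i] = (uint8_t)(cur - prev);   /* wraps modulo 256 */
            prev = cur;
        }
    }
}

/* Undo delta_delta_encode8 with two running-sum passes. */
void delta_delta_decode8(uint8_t *s, size_t n)
{
    int pass;
    size_t i;

    for (pass = 0; pass < 2; pass++) {
        uint8_t sum = 0;
        for (i = 0; i < n; i++) {
            sum = (uint8_t)(sum + s[i]);    /* wraps modulo 256 */
            s[i] = sum;
        }
    }
}
```

+
+For example, the samples 10, 12, 15, 15, 11 encode to 10, 248, 1, 253,
+252. Negative differences wrap around, but because decoding wraps in
+the same way the original values are recovered exactly.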
+
+The io library data structure is as follows:
+
+ at example
+/*
+ * Type definition for the Sample data
+ */
+typedef struct @{
+        uint_1 sample_A;           /* Sample for A trace */
+        uint_1 sample_C;           /* Sample for C trace */
+        uint_1 sample_G;           /* Sample for G trace */
+        uint_1 sample_T;           /* Sample for T trace */
+@} Samples1;
+
+typedef struct @{
+        uint_2 sample_A;           /* Sample for A trace */
+        uint_2 sample_C;           /* Sample for C trace */
+        uint_2 sample_G;           /* Sample for G trace */
+        uint_2 sample_T;           /* Sample for T trace */
+@} Samples2;
+ at end example
+
+_split()
+ at node Scf-Sequence
+ at subsection Sequence Information.
+ at cindex SCF: sequence
+ at cindex Sequence: SCF
+ at cindex Base: SCF structure
+
+Information relating to the base interpretation of the trace is stored
+at byte offset Header.bases_offset from the start of the file. 
+Stored for each base are: its
+character representation and a number (an index into the Samples data
+structure) indicating its position within the trace. The relative
+probabilities of each of the 4 bases occurring at the point where the
+base is called can be stored in @strong{prob_A} , @strong{prob_C} ,
+ at strong{prob_G} and @strong{prob_T}.
+
+From version 3.00 these items are stored in the following order: all
+"peak indexes", i.e. the positions in the sample points to which the
+bases correspond; all the accuracy estimates for base type A, then all for
+C, G and T; the called bases; and finally 3 sets of empty int1
+data items. These values are read into the following data structure by
+the routines in the io library.
+
+ at example
+/*
+ * Type definition for the sequence data
+ */
+typedef struct @{
+    uint_4 peak_index;        /* Index into Samples matrix for base posn */
+    uint_1 prob_A;            /* Probability of it being an A */
+    uint_1 prob_C;            /* Probability of it being a C */
+    uint_1 prob_G;            /* Probability of it being a G */
+    uint_1 prob_T;            /* Probability of it being a T */
+    char   base;              /* Called base character        */
+    uint_1 spare[3];          /* Spare */
+@} Base;
+ at end example
+
+_split()
+ at node Scf-Comments
+ at subsection Comments.
+ at cindex SCF: comments
+ at cindex Comments: SCF
+
+Comments are stored at offset Header.comments_offset from the start of
+the file. Lines in this section are of the format:
+
+ at quotation
+<Field-ID>=<Value>
+ at end quotation
+
+<Field-ID> can be any string, though several have special meaning and
+their use is encouraged.
+
+ at example
+ID      Field                           Example
+MACH    Sequencing machine model        MACH=Pharmacia A.L.F.
+TPSW    Trace processing software       TPSW=A.L.F. Analysis
+          version                         Program, Version=1.67
+BCSW    Base calling software version   BCSW=A.L.F. Analysis
+                                          Program, Version=1.67
+DATF    Data source format              DATF=AM_Version=2.0
+DATN    Data source name                DATN=a10c.alf
+CONV    Format conversion software      CONV=makeSCF v2.0
+ at end example
+
+Other fields might include:
+
+ at example
+ID      Field                           Example
+OPER    Operator                        OPER=sd
+STRT    Time run started                STRT=Aug 05 1991  12:25:01
+STOP    Time run stopped                STOP=Aug 05 1991  16:26:25
+PROC    Time processed                  PROC=Aug 05 1991  18:50:13
+EDIT    Time edited                     EDIT=Aug 05 1991  19:06:18
+NAME    Sample name                     NAME=a21b1.s1
+SIGN    Average signal strength         SIGN=A=56,C=66,G=13,T=18
+SPAC    Average base spacing            SPAC=12.04
+SCAL    Factor used in scaling traces   SCAL=0.5
+ACMP    Compression annotation          ACMP=99,6
+ASTP    Stop annotation                 ASTP=143,12
+ at end example
+
+ at example
+ at group
+/*
+ * Type definition for the comments
+ */
+typedef char Comments[];                /* Zero terminated list of
+                                           \n separated entries */
+ at end group
+ at end example
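+
+Looking up one field in the comments section could be done along the
+following lines. This is a sketch only: scf_comment_get is a
+hypothetical helper, not an io library routine, and it assumes the
+section has already been read into memory as a zero terminated string
+of newline separated <Field-ID>=<Value> entries, as described above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of field `id` into `out` (at most out_len-1 characters,
 * always zero terminated). Returns 1 if the field was found, 0 otherwise.
 */
int scf_comment_get(const char *comments, const char *id,
                    char *out, size_t out_len)
{
    size_t id_len = strlen(id);
    const char *line = comments;

    if (out_len == 0)
        return 0;

    while (*line) {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        /* A match needs the ID followed directly by '='. */
        if (len > id_len && strncmp(line, id, id_len) == 0 &&
            line[id_len] == '=') {
            size_t vlen = len - id_len - 1;
            if (vlen >= out_len)
                vlen = out_len - 1;       /* truncate to fit */
            memcpy(out, line + id_len + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }
        if (!end)
            break;
        line = end + 1;
    }
    return 0;
}
```

+
+Given the comments "MACH=Pharmacia A.L.F.\nDATN=a10c.alf\n", a lookup
+of DATN yields "a10c.alf", and a lookup of an absent field such as OPER
+returns 0.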
+
+_split()
+ at node Scf-Private
+ at subsection Private data.
+ at cindex SCF: private data
+ at cindex Private data: SCF
+
+The private data section is provided to store any information required
+that is not supported by the SCF standard. If the @strong{Header.private_size}
+field is 0 then there is no private data section. We impose no restrictions
+upon the format of this section. However, we feel it may be a good idea
+to use the first four bytes as a magic number identifying the used
+format of the private data.
+
+_split()
+ at node Scf-File-structure
+ at subsection File structure.
+ at cindex SCF: file structure
+ at cindex File structure: SCF
+
+From SCF version 3.0 onwards the in-memory structures and the data on the disk
+are not in the same format. The overview of the data on disk for the different
+versions is summarised below.
+
+ at example
+
+Versions 1 and 2
+
+(Note Samples1 can be replaced by Samples2 as appropriate.)
+
+Length in bytes                        Data
+---------------------------------------------------------------------
+128                                    header
+Number of samples * 4 * sample size    Samples1 or Samples2 structure
+Number of bases * 12                   Base structure
+Comments size                          Comments
+Private data size                      private data
+
+Version 3
+
+Length in bytes                        Data
+---------------------------------------------------------------------------
+128                                    header
+Number of samples * sample size        Samples for A trace
+Number of samples * sample size        Samples for C trace
+Number of samples * sample size        Samples for G trace
+Number of samples * sample size        Samples for T trace
+Number of bases * 4                    Offset into peak index for each base
+Number of bases                        Accuracy estimate bases being 'A'
+Number of bases                        Accuracy estimate bases being 'C'
+Number of bases                        Accuracy estimate bases being 'G'
+Number of bases                        Accuracy estimate bases being 'T'
+Number of bases                        The called bases
+Number of bases * 3                    Reserved for future use
+Comments size                          Comments
+Private data size                      Private data
+---------------------------------------------------------------------------
+ at end example
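+
+The version 3 table translates directly into byte offsets. The sketch
+below computes where each block starts, given the sample and base
+counts and the two section offsets from the header. The Scf3Layout
+structure and the scf3_layout function are hypothetical names used only
+for this illustration.

```c
#include <stdint.h>

/* Byte offsets (from the start of the file) of each version 3 block. */
typedef struct {
    uint32_t trace[4];       /* A, C, G, T sample blocks */
    uint32_t peak_index;     /* 4 bytes per base */
    uint32_t prob[4];        /* A, C, G, T accuracy blocks, 1 byte per base */
    uint32_t called_bases;   /* 1 byte per base */
    uint32_t reserved;       /* 3 bytes per base */
    uint32_t bases_end;      /* first byte after the bases section */
} Scf3Layout;

void scf3_layout(uint32_t samples_offset, uint32_t n_samples,
                 uint32_t sample_size, uint32_t bases_offset,
                 uint32_t n_bases, Scf3Layout *out)
{
    uint32_t i;

    for (i = 0; i < 4; i++)
        out->trace[i] = samples_offset + i * n_samples * sample_size;

    out->peak_index = bases_offset;
    for (i = 0; i < 4; i++)
        out->prob[i] = bases_offset + 4 * n_bases + i * n_bases;
    out->called_bases = bases_offset + 8 * n_bases;
    out->reserved     = bases_offset + 9 * n_bases;
    out->bases_end    = bases_offset + 12 * n_bases;
}
```

+
+The bases section still occupies 12 bytes per base, as in versions 1
+and 2; only the arrangement within the section differs.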
+
+_split()
+ at node Scf-Notes
+ at subsection Notes
+
+ at node Scf-Notes-Ordering
+ at subsubsection Byte ordering and integer representation.
+ at cindex SCF: byte ordering
+ at cindex Byte ordering: SCF
+
+"Forward byte and reverse bit" ordering will be used for all integer
+values. This is the same as used in the MC680x0 and SPARC processors,
+but the reverse of the byte ordering used on the Intel 80x86 processors.
+
+ at example
+         Off+0   Off+1  
+       +-------+-------+  
+uint_2 |  MSB  |  LSB  |  
+       +-------+-------+  
+
+         Off+0   Off+1   Off+2   Off+3
+       +-------+-------+-------+-------+
+uint_4 |  MSB  |  ...  |  ...  |  LSB  | 
+       +-------+-------+-------+-------+
+ at end example
+
+To read integers on systems with any byte order use something like this:
+
+ at example
+uint_2 read_uint_2(FILE *fp)
+@{
+    unsigned char buf[sizeof(uint_2)];
+
+    fread(buf, sizeof(buf), 1, fp);
+    return (uint_2)
+        (((uint_2)buf[1]) +
+         ((uint_2)buf[0]<<8));
+@}
+
+uint_4 read_uint_4(FILE *fp)
+@{
+    unsigned char buf[sizeof(uint_4)];
+
+    fread(buf, sizeof(buf), 1, fp);
+    return (uint_4)
+        (((uint_4)buf[3]) +
+         ((uint_4)buf[2]<<8) +
+         ((uint_4)buf[1]<<16) +
+         ((uint_4)buf[0]<<24));
+@}
+ at end example
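+
+Writing follows the same pattern in reverse. The following sketch
+stores integers into a byte buffer most significant byte first,
+whatever the byte order of the host. The names put_be16 and put_be32
+are hypothetical and are not part of the io library.

```c
#include <stdint.h>

/* Store a 2 byte integer, most significant byte at buf[0]. */
void put_be16(unsigned char *buf, uint16_t v)
{
    buf[0] = (unsigned char)(v >> 8);
    buf[1] = (unsigned char)(v & 0xff);
}

/* Store a 4 byte integer, most significant byte at buf[0]. */
void put_be32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)((v >> 16) & 0xff);
    buf[2] = (unsigned char)((v >> 8) & 0xff);
    buf[3] = (unsigned char)(v & 0xff);
}
```

+
+Storing the value 0x2e736366 with put_be32 produces the four bytes
+'.', 's', 'c', 'f', which is how the magic number appears at the start
+of a file.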
+
+_split()
+ at node Scf-Notes-Compression
+ at subsubsection Compression of SCF Files
+
+The SCF format version 3.00 has been designed with file compression in mind.
+No new information is recorded when compared to the version 2.02 format,
+except the data is stored in a manner conducive to efficient compression.
+
+Experimentation @footnote{Analysed using a data set of 100 ABI (and their SCF
+equivalent) files} has shown that 16 bit SCF version 3.00 files can achieve a
+9:1 compression ratio and 8 bit SCF files a 14.5:1 compression ratio. These
+figures are for SCF files without quality values compressed using the
+ at code{bzip} utility. @code{gzip} tends to give files between 20 and 40% larger
+than @code{bzip}. Compressed SCF files containing accuracy values tend to be
+around 10% larger than those without accuracy values.
+
+Whilst compression is not a specific part of the SCF standard, the size of
+trace files and the compression ratios attainable suggest that it is wise to
+handle compressed files. The Staden Package utilities, such as gap4 and trev,
+automatically uncompress and compress SCF files as needed.
+
+Note that at present, on-the-fly compression, as just described, is not
+implemented for the Windows version of the package.
diff --git a/manual/screen_seq.1.texi b/manual/screen_seq.1.texi
new file mode 100644
index 0000000..7858632
--- /dev/null
+++ b/manual/screen_seq.1.texi
@@ -0,0 +1,185 @@
+ at cindex screen_seq: man page
+ at unnumberedsec NAME
+
+screen_seq --- filters out sequence readings containing contaminating DNA
+
+ at unnumberedsec SYNOPSIS
+
+ at code{screen_seq} @code{-}[@code{lcwmiIsSpft}]
+[@code{-l} @i{Length of minimum match (25)}]
+[@code{-m} @i{Maximum vector length (100000)}]
+[@code{-i} @i{Input file of reading file names}]
+[@code{-I} @i{Input file of single reading to screen}]
+[@code{-s} @i{Input file of sequence file names}]
+[@code{-S} @i{Input file of single sequence to screen against}]
+[@code{-p} @i{Passed output file of file names}]
+[@code{-f} @i{Failed output file of file names}]
+[@code{-t} @i{Test only mode}]
+
+
+ at unnumberedsec DESCRIPTION
+
+ at code{screen_seq} searches sequence readings to
+filter out those from extraneous DNA
+such as vector or bacterial sequences. We have separated this task
+from that of locating and marking the extents of sequencing vector and
+other cloning vectors. There we require precise identification of the
+junction between the vectors and the target DNA. The filtering process
+described here is designed to spot strong matches between readings and a
+panel of possible contaminating sequences, and it splits readings into
+passes and fails. Readings that fail have a PS line containing the word
+"contaminant" and a tag of type "CONT" added to their experiment file.
+
+Normal usage would be to compare a batch of readings in experiment file
+format against a batch of possible contaminant sequences stored in (at
+present) simple text files. Each batch is presented to the program as a
+file of file names, and the program will write out two new files of file
+names: one containing the names of the files that do not match any of
+the contaminant sequences (the passes), and the other those that do
+match (the
+fails). It is also possible to compare single readings and single
+contaminant files by giving their file names (i.e. it is not necessary
+to use a file of file names for single files).
+
+Given the frequent need to compare against the full E. coli genome the
+algorithm is designed to be fast. The user controls the speed and
+sensitivity by supplying a single parameter, "min_match".
+The program will find the longest exact match of at
+least min_match characters.
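+
+The idea of a longest exact match with a minimum length can be
+illustrated with a deliberately naive C sketch. This is not the
+program's algorithm, which prepares (hashes) the screening sequences
+for speed and compares both strands; the function below is
+hypothetical, compares one strand only, and takes time proportional to
+the product of the two sequence lengths.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the longest exact match between `reading` and
 * `screen`, or 0 if no match reaches min_match characters.
 */
size_t longest_exact_match(const char *reading, const char *screen,
                           size_t min_match)
{
    size_t n = strlen(reading), m = strlen(screen);
    size_t best = 0, i, j;

    for (i = 0; i < n; i++) {
        for (j = 0; j < m; j++) {
            size_t k = 0;
            /* Extend the match from positions i and j while bases agree. */
            while (i + k < n && j + k < m && reading[i + k] == screen[j + k])
                k++;
            if (k > best)
                best = k;
        }
    }
    return best >= min_match ? best : 0;
}
```

+
+For the reading "TTGACGTCAA" and the screening sequence "CCGACGTCGG"
+the longest shared stretch is "GACGTC", so the function returns 6 for a
+min_match of 5 and 0 for a min_match of 7.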
+
+The search is
+conducted only over the clipped portion of the readings. On our Alpha machine
+it takes about 1 second to compare both strands of a reading against the
+4.7 million bases of E. coli.
+
+ at unnumberedsec OPTIONS
+
+ at table @asis
+ at item @code{-l} @i{Length of minimum match (25)}
+        The length of match required to initiate a closer search.
+
+ at item @code{-m} @i{Maximum vector length (100000)}
+        The maximum length of the longest sequence to screen the readings against.
+
+ at item @code{-i} @i{Input file of reading file names}
+
+ at item @code{-I} @i{Input file of single reading to screen}
+
+ at item @code{-s} @i{Input file of sequence file names to screen against}
+
+ at item @code{-S} @i{Input file of single sequence to screen against}
+
+ at item @code{-p} @i{Passed output file of file names}
+
+ at item @code{-f} @i{Failed output file of file names}
+
+ at item @code{-t} @i{Test only mode}
+        In test mode no experiment files are changed and the results are written
+        to stdout. When not in test mode a dot "." is written to stdout for each
+        comparison, and an exclamation mark "!" for each error detected.
+ at end table
+
+ at unnumberedsec EXAMPLES
+
+ at example
+Usage: screen_seq [options and paramters] 
+Where options and parameters are:
+    [-l minimum match (25)]           [-m Max vector length (100000)]
+    [-i readings to screen fofn]      [-I reading to screen]
+    [-s seqs to screen against fofn]  [-S seq to screen against]
+    [-t test only]
+    [-p passed fofn]                  [-f failed fofn]
+ at end example
+
+
+1. Screen the readings whose names are stored in fofn against a batch of
+possible contaminant sequences whose names are stored in vnames. Write
+the names of the readings that pass to file p and those that fail to
+file f. Increase the maximum sequence length to 5,000,000 characters and
+require a minimum match of 20.
+
+
+ at example
+ at code{screen_seq -i fofn -s vnames -p p -f f -l20 -m5000000}
+ at end example
+
+2. Screen the single reading stored in xpg33.g1 against a batch of
+possible contaminant sequences whose names are stored in vnames. If the
+reading does not match write its name to file p, otherwise to
+file f. Increase the maximum sequence length to 5,000,000 characters and
+require a minimum match of 20.
+
+ at example
+ at code{screen_seq -I xpg33.g1 -s vnames -p p -f f -l20 -m5000000}
+ at end example
+
+3. Screen the readings whose names are stored in fofn against a single
+possible contaminant sequence stored in ecoli.seq. Write
+the names of the readings that pass to file pass and those that fail to
+file fails. Increase the maximum sequence length to 5,000,000 characters and
+require a minimum match of 20.
+
+ at example
+ at code{screen_seq -i fofn -S ecoli.seq -p pass -f fails -l20 -m5000000}
+ at end example
+
+
+
+ at unnumberedsec NOTES
+
+Limits
+
+Screen_seq is currently set to be able to process a maximum of 10,000
+readings and 5000 screening sequences in a single run. The maximum
+length of any screening sequence is 100,000 although this can be
+overridden by use of the -m parameter (set it to 5000000 for E. coli).
+At present the sequences to screen against must be stored in simple text
+files containing individual sequences, with no entry names, and <100
+characters per line.
+
+
+The following errors can be reported.
+
+ at cindex Screen_seq: error codes
+ at enumerate 1
+ at item   "Failed to open file of file names to screen against". Fatal failure to
+open the file of file names to screen against.
+ at item   "Failed to open single file to screen against". Fatal failure to
+open the file to screen against.
+ at item   "Failed to open file of file names to screen". Fatal failure to
+open the file of file names to screen.
+ at item   "Failed to open single file to screen". Fatal failure to
+open the file to screen.
+ at item   "Failed to open file of passed file names". Fatal failure to
+open the file of file names for readings that do not match.
+ at item   "Failed to open file of failed file names". Fatal failure to
+open the file of file names for readings that match.
+ at item   "Failed to open single file to screen". Fatal failure to
+open the file to screen.
+ at item   "Error: could not open vector file". An individual sequence file
+could not be opened.
+ at item   "Error: could not read vector file". An individual sequence file
+could not be read.
+ at item   "Error: could not hash vector file". An individual sequence file
+could not be prepared for comparison.
+ at item	"Error: could not open experiment file". The file does not exist
+or is unreadable.
+ at item	"Error: no sequence in experiment file".
+ at item	"Error: sequence too short". The reading is shorter than the
+minimum match length.
+ at item	"Error: could not write to experiment file". The disk is full or
+the file is write protected.
+ at item   "Error: hashing problem". An error occurred in the comparison
+algorithm. Please report to staden-package@@mrc-lmb.cam.ac.uk
+ at end enumerate
+
+Inconsistencies in the selection of options, such as selecting -I and
+-i, should also cause the usage message (shown above) to appear, and 
+the program to terminate. 
+
+A @i{PS} record is added to the experiment file for any reading that matches.
+
+ at unnumberedsec SEE ALSO
+
+_fxref(Formats-Exp,Experiment File, formats)
+_fxref(Vector_Clip, Screening Against Vector Sequences, vector_clip)
diff --git a/manual/set_genetic_code.png b/manual/set_genetic_code.png
new file mode 100644
index 0000000..53dab95
Binary files /dev/null and b/manual/set_genetic_code.png differ
diff --git a/manual/show_rel-t.texi b/manual/show_rel-t.texi
new file mode 100644
index 0000000..b1bb315
--- /dev/null
+++ b/manual/show_rel-t.texi
@@ -0,0 +1,77 @@
+ at cindex Show relationships
+
+This function 
+(which is available from the gap4 View menu)
+is used to show the relationships of the gel readings in
+the database in three ways.
+
+ at enumerate
+ at item
+All contig descriptor lines followed by all gel descriptor lines.
+
+ at item
+All contigs one after the other sorted, i.e. for each contig show its
+contig descriptor line followed by all its gel descriptor lines sorted
+on position from left to right.
+
+ at item
+Selected contigs: show the contig line and, in left to right order, the gel 
+readings. This can be done for a list or a file of contigs. For a single
+contig the output can be restricted to a user-defined region.
+ at end enumerate
+
+_picture(show_rel,3.175in)
+
+In the above illustration, a single contig, all contigs, a file or list of
+contigs can be selected. For a single contig, the contig identifier and
+range selector become enabled. Choosing a file or list enables the "browse"
+button which will invoke either the file or list browser respectively. When 
+"all" contigs is selected a further choice is available: whether to 
+ at samp{Show readings in positional order}. This question determines
+whether to output in method 1 (No) or 2 (Yes) listed above.
+
+The function is particularly useful for creating files or lists of
+reading names. To create a list of reading names run Show Relationships
+to produce the desired output to the Output Window. Then either use cut
+and paste from this window to a list editor, or use the right mouse
+button in the output window to request the "Output to list" option. In
+this latter case the "@code{CONTIG LINES}" and "@code{GEL LINES}" header
+lines should be removed (although most functions will happily ignore,
+with warnings, a list containing unknown reading names).
+
+In the output window the reading names are underlined, indicating that they
+are hyperlinks. Double clicking on a name with the left mouse button will
+bring up the contig editor showing the start of that sequence, or it will move
+an existing contig editor to display that position. (You may wish to turn off
+the "Scroll on output" button if you do not wish the text output window to
+scroll to the bottom as it displays the "Edit contig" title.) Clicking on a
+reading name with the right mouse button will bring up a popup menu containing
+Edit contig, Template display, List reading notes and List contig notes.
+
+Below is an example showing a contig from position 1 to 689.  The left
+gel reading is number 6 and has archive name HINW.010, the rightmost gel
+reading is number 2 and has archive name HINW.004.  On each gel
+descriptor line is shown: the name of the archive version, the gel
+number, the position of the left end of the gel reading relative to the
+left end of the contig, the length of the gel reading (if this is
+negative it means that the gel reading is in the opposite orientation to
+its archive), the number of the gel reading to the left and the number
+of the gel reading to the right.
+
+ at example
+CONTIG LINES
+CONTIG      LINE  LENGTH               ENDS
+                                    LEFT   RIGHT
+              48     689               6       2
+GEL LINES
+NAME      NUMBER POSITION LENGTH     NEIGHBOURS
+                                    LEFT   RIGHT
+HINW.010       6        1   -279       0       3
+HINW.007       3       91   -265       6       5
+HINW.009       5      137   -299       3      17
+HINW.999      17      140    273       5      12
+HINW.017      12      193    265      17      18
+HINW.031      18      385   -245      12       2
+HINW.004       2      401   -289      18       0
+ at end example
+
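The advice above on building a list of reading names from Show Relationships output can be sketched in code. This is a minimal illustration and not part of the Staden package: the function name `reading_names` and the `SAMPLE` text are hypothetical, and it assumes the output was saved as plain text in the layout shown in the example, with six whitespace-separated fields on each gel descriptor line.

```python
SAMPLE = """\
CONTIG LINES
CONTIG      LINE  LENGTH               ENDS
                                    LEFT   RIGHT
              48     689               6       2
GEL LINES
NAME      NUMBER POSITION LENGTH     NEIGHBOURS
                                    LEFT   RIGHT
HINW.010       6        1   -279       0       3
HINW.007       3       91   -265       6       5
"""


def _all_ints(fields):
    """True if every field parses as a (possibly negative) integer."""
    try:
        for field in fields:
            int(field)
        return True
    except ValueError:
        return False


def reading_names(text):
    """Return reading names found in the GEL LINES part of the output."""
    names = []
    in_gel_lines = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "GEL LINES":
            in_gel_lines = True
            continue
        if stripped == "CONTIG LINES":
            in_gel_lines = False
            continue
        fields = stripped.split()
        # Assumed layout: name, number, position, length, left, right.
        # Column headers fail the integer check and are skipped.
        if in_gel_lines and len(fields) == 6 and _all_ints(fields[1:]):
            names.append(fields[0])
    return names


print("\n".join(reading_names(SAMPLE)))
```

Because the header lines are skipped rather than copied, the resulting list needs no further editing before use as a list of reading names.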
diff --git a/manual/show_rel.png b/manual/show_rel.png
new file mode 100644
index 0000000..1f6fb2c
Binary files /dev/null and b/manual/show_rel.png differ
diff --git a/manual/snp_candidates1.png b/manual/snp_candidates1.png
new file mode 100644
index 0000000..2256afe
Binary files /dev/null and b/manual/snp_candidates1.png differ
diff --git a/manual/snp_candidates1.small.png b/manual/snp_candidates1.small.png
new file mode 100644
index 0000000..c1091b0
Binary files /dev/null and b/manual/snp_candidates1.small.png differ
diff --git a/manual/snp_candidates2.png b/manual/snp_candidates2.png
new file mode 100644
index 0000000..356a87e
Binary files /dev/null and b/manual/snp_candidates2.png differ
diff --git a/manual/snp_candidates2.small.png b/manual/snp_candidates2.small.png
new file mode 100644
index 0000000..97a6510
Binary files /dev/null and b/manual/snp_candidates2.small.png differ
diff --git a/manual/spin-t.texi b/manual/spin-t.texi
new file mode 100644
index 0000000..811e1e8
--- /dev/null
+++ b/manual/spin-t.texi
@@ -0,0 +1,3133 @@
+_include(spin_org-t.texi)
+_include(spin_mini-t.texi)
+
+_split()
+ at node SPIN-Intro-Menus
+ at section Spin Menus
+ at cindex EMBOSS
+
+The main window for spin contains File, View, Options, Sequences, Statistics,
+Translation, Search, Comparison and Emboss menus.
+
+ at node SPIN-Intro-Menu-File
+ at subsection Spin File Menu
+
+The File menu includes sequence reading, saving and management options.
+
+ at itemize @bullet
+ at item Load sequences (_fpref(SPIN-Read Sequences, Reading in sequences, t))
+ at item Save (_fpref(SPIN-Save Sequence, Save, t))
+ at item Change directory
+ at item Sequence manager (_fpref(SPIN-Sequence Manager, Sequence manager)) 
+ at item Exit
+ at end itemize
+
+
+ at node SPIN-Intro-Menu-View
+ at subsection Spin View Menu
+
+The View menu contains options to give access to the Results
+Manager, and to the Sequence Display.
+
+ at itemize @bullet
+ at item Results manager (_fpref(SPIN-Result-Manager, Result manager))
+ at item Sequence display (_fpref(SPIN-Sequence-Display, Sequence display))
+ at end itemize
+
+ at node SPIN-Intro-Menu-Options
+ at subsection Spin Options Menu
+
+The Options menu contains options for configuring spin and its functions.
+
+ at itemize @bullet
+ at item Change protein score matrix (_fpref(SPIN-Changing the score matrix, Changing the score matrix))
+ at item Set protein alignment symbols
+(_fpref(SPIN-Set protein alignment symbols, Set protein alignment symbols, t))
+ at item Configure maximum number of matches 
+(_fpref(SPIN-Changing Max Match Number, Changing the maximum number of matches,t))
+ at item Configure default number of matches 
+(_fpref(SPIN-Changing Default Match Number, Changing the default number of
+matches))
+ at item Hide duplicate matches 
+(_fpref(SPIN-Hide duplicate matches, Hide duplicate matches,t))
+ at item Set fonts
+ at item Colours
+
+ at end itemize
+
+ at node SPIN-Intro-Menu-Sequences
+ at subsection Spin Sequences Menu
+
+The Sequences menu contains options for manipulating the sequences
+currently loaded into spin.
+All these operations
+are also obtainable from a pop up menu in the sequence manager 
+(_fpref(SPIN-Sequence Manager, Sequence manager)).
+
+ at itemize @bullet
+ at item Horizontal
+(_fpref(SPIN-Change Active Sequence, Change the active sequence, t))
+ at item Vertical
+(_fpref(SPIN-Change Active Sequence, Change the active sequence, t))
+ at item Set range
+(_fpref(SPIN-Set Range, Set the range,t))
+ at item Copy 
+(_fpref(SPIN-Copy, Copy sequence,t))
+ at item Complement sequence
+(_fpref(SPIN-Complement Sequence, Complement sequence,t))
+ at item Interconvert t and u
+(_fpref(SPIN-Interconvert t and u, Interconvert t and u,t))
+ at item Translate sequence
+(_fpref(SPIN-Translate Sequence, Translate sequence,t))
+ at item Scramble sequence
+(_fpref(SPIN-Scramble Sequence, Scramble sequence,t))
+ at item Sequence type
+(_fpref(SPIN-Sequence Type, Sequence type,t))
+ at item Rotate sequence
+(_fpref(SPIN-Rotate Sequence, Rotate sequence,t))
+ at item Save
+(_fpref(SPIN-Save Sequence, Save sequence,t))
+ at item Delete
+(_fpref(SPIN-Delete Sequence, Delete sequence,t))
+ at end itemize
+
+ at node SPIN-Intro-Menu-Statistics
+ at subsection Spin Statistics Menu
+
+The Statistics menu contains the spin functions for analysing and
+plotting the composition of sequences.
+
+ at itemize @bullet
+ at item Count sequence composition
+(_fpref(SPIN-Base-Composition, Count Sequence Composition))
+ at item Plot base composition
+(_fpref(SPIN-Plot-Base-Composition, Plot Base Composition))
+ at item Count dinucleotide frequencies
+(_fpref(SPIN-Dinucleotide-Freq, Dinucleotide Frequencies))
+ at end itemize
+
+ at node SPIN-Intro-Menu-Translation
+ at subsection Spin Translation Menu
+
+The "Translation" menu contains options to set the genetic code,
+translate to protein, find open reading frames and to calculate 
+codon tables.
+
+ at itemize @bullet
+ at item Set genetic code
+(_fpref(SPIN-Set-Genetic-Code, Set Genetic Code))
+ at item Translate
+(_fpref(SPIN-Translation-General, Translation))
+ at item Find open reading frames
+(_fpref(SPIN-Open-Reading-Frames, Find Open Reading Frames))
+ at item Calculate and write codon table to disk
+(_fpref(SPIN-Codon-Usage-Tables, Calculate codon usage, t))
+ at end itemize
+
+ at node SPIN-Intro-Menu-Search
+ at subsection Spin Search Menu
+
+The "Search" menu contains a variety of different searching and analysis
+techniques.
+
+ at itemize @bullet
+ at item Protein genes: Codon pref
+(_fpref(SPIN-Codon-Usage-Method, Codon Usage Method))
+ at item Protein genes: Author test
+(_fpref(SPIN-Author-Test, Author Test))
+ at item Protein genes: Base bias
+(_fpref(SPIN-Uneven-Positional-Base-Freqs, Uneven Positional base
+Frequencies))
+ at item tRNA genes
+(_fpref(SPIN-TRNA-Search, tRNA Search))
+ at item Search for string (DNA)
+(_fpref(SPIN-String-Search, Subsequence search))
+ at item Restriction enzyme map
+(_fpref(SPIN-Restrict-Introduction, Restriction enzyme search))
+ at item Plot start codons
+(_fpref(SPIN-Start-Codon-Search, Start Codon Search))
+ at item Plot stop codons
+(_fpref(SPIN-Stop-Codon-Search, Stop Codon Search))
+ at item Search for splice junctions
+(_fpref(SPIN-Splice-Site-Search, Splice Site Search))
+ at item Search using weight matrix
+(_fpref(SPIN-Weight-Matrix-Search, Motif Search))
+ at end itemize
+
+
+ at node SPIN-Intro-Menu-Comparison
+ at subsection Spin Comparison Menu
+
+The Comparison menu contains the spin analytical functions for comparing
+and aligning the sequences.
+
+ at itemize @bullet
+ at item Find similar spans
+(_fpref(SPIN-Find similar spans, Finding Similar Spans))
+ at item Find matching words
+(_fpref(SPIN-Find matching words, Finding Matching Words))
+ at item Find best diagonals
+(_fpref(SPIN-Find Best Diagonals, Finding the Best Diagonals))
+ at item Align sequences
+(_fpref(SPIN-Align Sequences, Aligning Sequences Globally))
+ at item Local alignment
+(_fpref(SPIN-Local alignment, Aligning Sequences Locally))
+ at end itemize
+
+_ifdef([[_unix]],[[
+ at node SPIN-Intro-Menu-Emboss
+ at subsection Spin Emboss Menu
+ at cindex EMBOSS
+
+Spin provides a graphical user interface for most of the programs contained in EMBOSS
+_uref(http://www.hgmp.mrc.ac.uk/Software/EMBOSS/). 
+Those that are not provided
+are the ones that deal with multiple sequence alignments and the various rarely
+used tools such as the ones for database indexing. There are a lot of programs
+in EMBOSS so cascading menus are used. As far as the user is concerned the
+EMBOSS programs appear as though part of spin - all results are plotted in the
+same way as for equivalent spin functions, the sequences can be viewed in
+the sequence displays, and textual results appear in the Output Window.
+
+An important feature of EMBOSS is that it provides access to the sequence
+libraries.
+
+Notes on configuring EMBOSS for use via spin are included in the
+package's /doc directory. In summary these notes state the following:
+After installing EMBOSS the main task is to
+create the dialogues and menus for spin. This is entirely automatic.
+The create_emboss_files program attempts to find the location for your
+installed EMBOSS release. From this it iterates through all of the acd
+files
+and produces tcl/tk GUIs for each program. These are placed in the
+$STADENROOT/lib/spin_emboss/acdtcl directory. An Emboss menu is added to 
+Spin,
+with the menu specification being in $STADENROOT/tables/emboss_menu.
+]])
+
+ at node SPIN-Functions
+ at chapter Spin's Analytical Functions
+
+ at menu
+* SPIN-Base-Composition::     Count Sequence composition
+* SPIN-Dinucleotide-Freq::    Dinucleotide frequencies
+* SPIN-Plot-Base-Composition::   Plot base composition
+* SPIN-String-Search::Subsequence search
+* SPIN-Codon-Usage-Tables:: Calculate codon usage
+* SPIN-Set-Genetic-Code::   Set genetic code
+* SPIN-Translation-General::  Translation - general
+* SPIN-Open-Reading-Frames::  Find open reading frames
+* SPIN-Start-Codon-Search::  Start codon search
+* SPIN-Stop-Codon-Search::   Stop codon search
+* SPIN-Codon-Usage-Method::   Codon usage method
+* SPIN-Positional-Base-Prefs::   Positional base preferences
+* SPIN-Author-Test::   Author test
+* SPIN-Uneven-Positional-Base-Freqs::   Uneven positional base frequencies
+* SPIN-Splice-Site-Search::   Splice site search
+* SPIN-Weight-Matrix-Search::   Motif search
+* SPIN-TRNA-Search::   tRNA search
+* SPIN-Find similar spans::   Finding Similar Spans
+* SPIN-Find matching words::  Finding Matching Words
+* SPIN-Local alignment::      Aligning Sequences Locally
+* SPIN-Find Best Diagonals::  Finding the Best Diagonals
+* SPIN-Align Sequences::      Aligning Sequences Globally
+ at end menu
+
+
+Spin contains both simple and sophisticated analytical functions, mostly
+producing graphical results. The following sections describe the
+functions, approximately in order of increasing complexity.
+
+_split()
+ at node SPIN-Base-Composition
+ at section Count Sequence Composition
+ at cindex Sequence composition:spin
+
+When a sequence is read into the program its composition is displayed in
+the Output Window to provide a simple check that the data has been
+read correctly. The values can also be requested from the "Statistics"
+menu, when a dialogue will allow subsections of the sequence to be
+analysed. The results are displayed as shown below.
+
+ at example
+ at group
+
+============================================================
+Wed 12 Nov 17:10:25 1997: sequence composition
+------------------------------------------------------------
+A 1966 (24.17%) C 1996 (24.54%) G 2185 (26.86%) T 1987 (24.43%) - 0 (0.00%)
+
+Or for protein sequences:
+============================================================
+Mon 14 Oct 17:11:04 2002: sequence composition
+------------------------------------------------------------
+Sequence MYSA_DROME: 1 to 2411
+Protein
+AA  A     B     C     D     E     F     G     H     I     K     L     M     N    
+N  201   0     30    150   281   74    127   45    126   233   243   43    121   
+%  8.3   0.0   1.2   6.2   11.7  3.1   5.3   1.9   5.2   9.7   10.1  1.8   5.0   
+M  14287 0     3094  17263 36281 10891 7246  6171  14258 29865 27498 5642  13807 
+
+AA  P     Q     R     S     T     V     W     Y     Z     X     *     -    
+N  55    167   141   96    93    108   14    63    0     0     0     0     
+%  2.3   6.9   5.8   4.0   3.9   4.5   0.6   2.6   0.0   0.0   0.0   0.0   
+M  5341  21398 22022 8360  9403  10706 2607  10280 0     0     0     0     
+ at end group
+ at end example
+
+
+ at node SPIN-Dinucleotide-Freq
+ at section Count Dinucleotide Frequencies
+ at cindex Dinucleotide frequencies:spin
+
+This routine simply counts dinucleotide frequencies for the selected region of
+the sequence. It also calculates an expected distribution based on the base 
+composition. The output looks like:
+
+ at example
+ at group
+        A                C                G                T
+     Obs    Expected  Obs    Expected  Obs    Expected  Obs    Expected
+ A     7.91    5.84     5.64    5.93     5.05    6.49     5.57    5.91
+ C     5.91    5.93     5.14    6.02     7.38    6.59     6.10    5.99
+ G     6.11    6.49     7.56    6.59     6.30    7.22     6.90    6.56
+ T     4.24    5.91     6.18    5.99     8.14    6.56     5.86    5.97
+
+ at end group
+ at end example
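The observed and expected values can be reproduced with a short sketch. It is not spin's own code: the function name `dinucleotide_table` is hypothetical, and the expected value for each pair is assumed to be the product of the two single-base frequencies. That assumption is consistent with the figures shown in this manual (for example 24.17% A gives 0.2417 x 0.2417, about 5.84%), but the source does not state the formula.

```python
from collections import Counter

BASES = "ACGT"


def dinucleotide_table(seq):
    """Return {pair: (observed %, expected %)} for all 16 dinucleotides."""
    seq = seq.upper()
    # Overlapping pairs; pairs containing other symbols are ignored.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pair_counts = Counter(p for p in pairs if p[0] in BASES and p[1] in BASES)
    base_counts = Counter(b for b in seq if b in BASES)
    n_pairs = sum(pair_counts.values())
    n_bases = sum(base_counts.values())
    table = {}
    for first in BASES:
        for second in BASES:
            observed = 100.0 * pair_counts[first + second] / n_pairs if n_pairs else 0.0
            # Assumption: expected = product of the single-base frequencies.
            expected = (
                100.0 * (base_counts[first] / n_bases) * (base_counts[second] / n_bases)
                if n_bases
                else 0.0
            )
            table[first + second] = (observed, expected)
    return table


for pair, (obs, exp) in dinucleotide_table("ACGTACGGTTAA").items():
    print(f"{pair} {obs:7.2f} {exp:7.2f}")
```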
+
+_split()
+ at node SPIN-Plot-Base-Composition
+ at section Plot base composition
+ at cindex Base composition plotting:spin
+ at cindex Plotting base composition:spin
+ at cindex Composition: sequence:spin
+ at cindex Sequence composition:spin
+ at cindex Persistence of results:spin
+ at cindex Memory saving:spin
+ at cindex Memory usage:spin
+
+
+The composition of the sequence can be displayed graphically. A window is
+slid along the sequence one base at a time, and at each point the number
+of occurrences of each selected base type is counted and plotted. Users can
+select which base types are counted, the size of the window used and the
+region of the sequence to analyse. 
+
+_lpicture(spin_plot_base_comp_p,6in)
+
+For
+example, the A and T composition can be plotted by selecting base
+types A and T in the dialogue. As usual the values can also be listed in
+the text Output Window. Note that this "result", i.e. the base
+composition counts for every position along the sequence, will persist
+until the user explicitly removes it. To save memory delete results as
+soon as they are no longer required.
+
+_picture(spin_plot_base_comp_d,3.01667in)
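The sliding window calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than spin's implementation: the function name `window_counts` and its default values are hypothetical, and the text does not say where in each window (start, centre or end) the value is plotted, so the sketch simply returns one count per window start.

```python
def window_counts(seq, bases="AT", window=31):
    """Count the selected base types in every window of the given size."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        return []
    wanted = set(bases.upper())
    # Count the first window directly, then update by one base at a time.
    count = sum(1 for base in seq[:window] if base in wanted)
    counts = [count]
    for i in range(window, len(seq)):
        count += (seq[i] in wanted) - (seq[i - window] in wanted)
        counts.append(count)
    return counts


print(window_counts("AACGTT", bases="AT", window=3))
```

The result holds one value per window position, which is why it is worth deleting once it is no longer needed for a long sequence.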
+
+
+_split()
+ at node SPIN-Codon-Usage-Tables
+ at section Calculate codon usage
+ at cindex Codon usage:spin
+ at cindex Codon frequencies:spin
+ at cindex Codon tables:spin
+ at cindex Codon composition:spin
+
+Codon usage tables can be calculated and written to the Output Window,
+and written to disk. If required the values found can be added to the
+counts in a
+pre-existing codon table, or when written out to disk they can be
+concatenated with an existing codon table file. In the first case the
+existing file will be read and added to the values calculated for the
+region defined by the user. In the latter, the values calculated for the
+region defined by the user will be written immediately after those from
+the existing table, hence producing a pair of tables joined end to end.
+An example of this is shown at the end of this section, and a more
+typical result is shown below.
+
+_lpicture(spin_count_codons_t,5.34167in)
+
+Referring to the figure of the dialogue below, the user can select the
+range and strand over which to
+count. Note that irrespective of the strand being counted, the positions in the
+sequence are always defined from the current 5' end, i.e. to count over
+bases 1 to 100 the user should set the Start position to 1 and the End
+position to 100.
+The values in the table can be expressed as
+observed counts or as percentages of usage for the cognate amino acid.
+
+_picture(spin_count_codons_d,3.05in)
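The two ways of expressing the table values can be sketched as below. This is not spin's code: the function name `codon_usage` is hypothetical, only the forward strand and reading frame 1 of the chosen range are counted, and the standard genetic code is hard-wired rather than read from the package's tables.

```python
from collections import Counter
from itertools import product

BASES = "tcag"
# Standard genetic code, codons ordered ttt, ttc, tta, ttg, tct, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(seq, start=1, end=None, percent=False):
    """Count codons in seq[start..end] (1-based, inclusive).

    With percent=True each value is the percentage of usage among the
    codons for the same (cognate) amino acid.
    """
    seq = seq.lower()
    end = len(seq) if end is None else end
    region = seq[start - 1:end]
    counts = {codon: 0 for codon in CODE}
    for i in range(0, len(region) - 2, 3):
        codon = region[i:i + 3]
        if codon in counts:  # codons with unknown symbols are skipped
            counts[codon] += 1
    if not percent:
        return counts
    totals = Counter()
    for codon, n in counts.items():
        totals[CODE[codon]] += n
    return {
        codon: (100.0 * n / totals[CODE[codon]] if totals[CODE[codon]] else 0.0)
        for codon, n in counts.items()
    }


usage = codon_usage("tttttcttt")
print(usage["ttt"], usage["ttc"])
```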
+
+The table can be output as a single table (as shown above), or as a
+double table (shown below). The user can request that the counts from an
+existing table be read and added to the counts which are about to be
+calculated, in which case the "File name" text window will be activated.
+If the user selects to output a double table, this dialogue will also be
+activated. To save the output in the selected form to a file, the user
+should fill in the "Save table to" text window.
+
+Two of the protein coding search functions 
+(_fpref(SPIN-Codon-Usage-Method, Codon Usage Method))
+and
+(_fpref(SPIN-Author-Test, Author Test))
+work best using a double
+codon table. The top table should contain the codon usage for the coding
+regions and the bottom table the usage for non-coding regions. A typical
+double codon table of this sort is shown below.
+
+ at example
+
+      ===============================================
+      F ttt     4 S tct    30 Y tat     5 C tgt     9
+      F ttc    35 S tcc    21 Y tac    15 C tgc     5
+      L tta     4 S tca     7 * taa     0 * tga     0
+      L ttg    24 S tcg     9 * tag     0 W tgg    15
+      ===============================================
+      L ctt    71 P cct     1 H cat    17 R cgt    37
+      L ctc    39 P ccc     2 H cac    15 R cgc    18
+      L cta     0 P cca    14 Q caa    87 R cga     1
+      L ctg     4 P ccg     0 Q cag    18 R cgg     1
+      ===============================================
+      I att    33 T act    30 N aat    12 S agt     2
+      I atc    53 T acc    20 N aac    59 S agc     5
+      I ata     1 T aca     3 K aaa    23 R aga    38
+      M atg    32 T acg     0 K aag   117 R agg     0
+      ===============================================
+      V gtt    30 A gct    71 D gat    58 G ggt     5
+      V gtc    22 A gcc    54 D gac    32 G ggc     1
+      V gta     7 A gca     6 E gaa    76 G gga    49
+      V gtg     5 A gcg     0 E gag   101 G ggg     1
+      ===============================================
+      ===============================================
+      F ttt    10 S tct     8 Y tat     7 C tgt     4
+      F ttc    12 S tcc     2 Y tac     4 C tgc     3
+      L tta     6 S tca     4 * taa     7 * tga    10
+      L ttg    11 S tcg     3 * tag     4 W tgg     6
+      ===============================================
+      L ctt     5 P cct     3 H cat     4 R cgt     0
+      L ctc     6 P ccc     1 H cac     4 R cgc     0
+      L cta     3 P cca     1 Q caa     9 R cga     5
+      L ctg     6 P ccg     3 Q cag     5 R cgg     2
+      ===============================================
+      I att    13 T act     6 N aat     7 S agt     4
+      I atc     7 T acc     0 N aac     3 S agc     2
+      I ata    12 T aca     5 K aaa     9 R aga    16
+      M atg     7 T acg     3 K aag     4 R agg     4
+      ===============================================
+      V gtt     6 A gct     2 D gat     8 G ggt     4
+      V gtc     2 A gcc     1 D gac     3 G ggc     1
+      V gta     5 A gca     1 E gaa     9 G gga     9
+      V gtg     4 A gcg     0 E gag     3 G ggg     0
+      ===============================================
+ at end example
+
+To calculate such a table using spin the following steps are
+required. First calculate the codon usage for a typical coding
+segment and save the resulting table in table A. Then use the option
+again, but this time select to "Output double table", and type the name
+of table A into the "File name" text box. Next define the start and end
+points of a non-coding region, and save the results to double table
+B. The file containing double table B is now suitable for use by the
+protein gene searching functions.
+
+_split()
+ at node SPIN-Set-Genetic-Code
+ at section Set genetic code
+ at cindex Set genetic code:spin
+ at cindex Genetic code:spin
+
+This function allows the user to change the genetic code used in all the
+options. The codes are defined as a set of codon tables stored in the
+directory tables/gcodes distributed with the package. The current list
+of codes and their codon table file names is shown at the end of this
+section.
+
+The user interface consists of the dialogue shown below. The user selects
+the required code by clicking on it, and then clicking "OK" or "OK
+permanent". The former choice selects the code for immediate use, and
+the latter also selects it for future uses of the program.
+
+_picture(set_genetic_code,2.39167in)
+
+When the dialogue is left the codon table selected will be displayed, as
+below, in the Output Window.
+
+ at example
+      ===============================================
+      F ttt       S tct       Y tat       C tgt      
+      F ttc       S tcc       Y tac       C tgc      
+      L tta       S tca       * taa       W tga      
+      L ttg       S tcg       * tag       W tgg      
+      ===============================================
+      L ctt       P cct       H cat       R cgt      
+      L ctc       P ccc       H cac       R cgc      
+      L cta       P cca       Q caa       R cga      
+      L ctg       P ccg       Q cag       R cgg      
+      ===============================================
+      I att       T act       N aat       S agt      
+      I atc       T acc       N aac       S agc      
+      M ata       T aca       K aaa       G aga      
+      M atg       T acg       K aag       G agg      
+      ===============================================
+      V gtt       A gct       D gat       G ggt      
+      V gtc       A gcc       D gac       G ggc      
+      V gta       A gca       E gaa       G gga      
+      V gtg       A gcg       E gag       G ggg      
+      ===============================================
+ at end example
+
+The following table shows the list of available genetic codes and the
+files in which they are stored for use by the package. They were created
+from genetic code files obtained from the NCBI.
+
+ at example
+code_1  Standard
+code_2  Vertebrate Mitochondrial
+code_3  Yeast Mitochondrial
+code_4  Coelenterate  Mitochondrial
+code_4  Mold Mitochondrial
+code_4  Protozoan Mitochondrial
+code_4  Mycoplasma
+code_4  Spiroplasma
+code_5  Invertebrate Mitochondrial
+code_6  Ciliate Nuclear
+code_6  Dasycladacean Nuclear
+code_6  Hexamita Nuclear
+code_9  Echinoderm Mitochondrial
+code_10 Euplotid Nuclear
+code_11 Bacterial
+code_12 Alternative Yeast Nuclear
+code_13 Ascidian Mitochondrial
+code_14 Flatworm Mitochondrial
+code_15 Blepharisma Macronuclear
+ at end example
+
+_split()
+ at node SPIN-Translation-General
+ at section Translation - general
+
+ at cindex Translation to protein:spin
+ at cindex DNA translation:spin
+ at cindex Protein:spin
+
+Translations of the sequence can be obtained in three ways. The first is
+an option available within the "Sequence manager" or the "Sequences" menu 
+(_fpref(SPIN-Sequence Manager, Sequence manager)).
+
+The second is
+an option in the Sequence display 
+(_fpref(SPIN-Sequence-Display, Spin Sequence Display)) which enables the 
+translation to be shown with the scrolling sequence.
+
+The third, which has two methods of defining the segments to translate,
+is described here. The translations are written to the Output window, from
+where they can be saved to disk.
+A segment of a typical display is shown below.
+
+_picture(spin_translate_t,3.01667in)
+
+Users select either to use a feature table to define the segments to translate
+or can simply enter a start and end position. In the latter case a six phase
+translation over that one segment is written out. If a feature table is used
+(and this assumes the sequence file was an EMBL entry complete with features),
+the CDS records from the table will be listed in the dialogue window and the
+user can select which ones should be used
+(_fpref(SPIN-Feature Tables, Use of feature tables in spin, t)).
+The translations produced will be
+written in FASTA format ready to be saved to disk (which the next release of
+spin will do automatically!). 
+
+The user can also choose
+the line length, and whether one or three letter amino acid symbols are
+produced. 
+
+_picture(spin_translate_d,3.01667in)
+
+_split()
+ at node SPIN-Open-Reading-Frames
+ at section Find open reading frames
+ at cindex Find open reading frames:spin
+ at cindex Open reading frames:spin
+
+This function will find open reading frames greater than a specified length. 
+The results can be output in two ways, either in feature table format
+
+ at example
+ at group
+
+FT   CDS             120..233
+FT   CDS             161..256
+FT   CDS             301..396
+FT   CDS             333..497
+FT   CDS             512..736
+FT   CDS             525..965
+FT   CDS             740..952
+FT   CDS             754..876
+FT   CDS             956..1789
+ at end group
+ at end example
+
+or as a fasta file.
+
+ at example
+ at group
+>120                 120..233
+VISENISLLKIGAKNHHWLLKQLLKMSMGGFCCVNVIY*
+>161                 161..256
+EPSLAVKTVIKNVNGWFLLCKCHLLNRYLFLD*
+>301                 301..396
+ICCARTCAICDLKHALSPVFTRYLQFFMIEQG*
+>333                 333..497
+SEARFITSVYALFTVFHDRTGLAEKSQLYALEKYLNIYSPFGYLLFEITGAHRII*
+>512                 512..736
+CLTLSLKESFIRHAAYLEGSRSEKRDVCVARESKRCSEASARSVTGGDSKWIAVQPQRPL
+LGRLCNKRGPGSLSA*
+>525                 525..965
+ALKKVLYDTRHTSKGAGVKNVMSVSLVSRNVARKLLLVQLLVVIASGLLFSLKDPFWGVS
+AISGGLAVFLPNVLFMIFAWRHQAHTPAKGRVAWTFAFGEAFKVLAMLVLLVVALAVLKA
+VFLPLIVTWVLVLVVQILAPAVINNKG*
+>740                 740..952
+RFVYDICLASPGAYTSERPGGLDIRIWRSFQSSGDVGVTGGGVGGFKGGILAADRYVGFG
+AGGSDTGTGCN*
+>754                 754..876
+YLPGVTRRIHQRKAGWPGHSHLAKLSKFWRCWCYWWWRWRF*
+>956                 956..1789
+QQRVKGIMASENMTPQDYIGHHLNNLQLDLRTFSLVDPQNPPATFWTINIDSMFFSVVLG
+LLFLVLFRSVAKKATSGVPGKFQTAIELVIGFVNGSVKDMYHGKSKLIAPLALTIFVWVF
+LMNLMDLLPIDLLPYIAEHVLGLPALRVVPSADVNVTLSMALGVFILILFYSIKMKGIGG
+FTKELTLQPFNHWAFIPVNLILEGVSLLSKPVSLGLRLFGNMYAGELIFILIAGLLPWWS
+QWILNVPWAIFHILIITLQAFIFMVLTIVYLSMASEEH*
+ at end group
+ at end example
+
+The user can select the start and end points over which to do the search,
+which strand to search (either the forward, reverse or both) and the
+minimum length of the open reading frame in codons. If the output is being
+written in fasta format, the name of the file is also required.
+
+_picture(spin_find_orf_d,3.01667in)
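A simplified open reading frame search can be sketched as follows. The function name `find_orfs` is hypothetical and the sketch makes several assumptions that the text does not confirm: an open reading frame is taken to be any stop-free stretch in one of the three forward frames (no start codon is required, as the example translations suggest), the standard stop codons are used, the reported range excludes the terminating stop codon, and stretches running to the end of the sequence are reported. The reverse strand is not searched.

```python
STOPS = {"taa", "tag", "tga"}  # standard genetic code only


def find_orfs(seq, min_codons=30):
    """Return sorted (start, end) pairs, 1-based and inclusive."""
    seq = seq.lower()
    orfs = []
    for frame in range(3):
        start = frame  # 0-based start of the current stop-free stretch
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOPS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start + 1, pos))
                start = pos + 3
            pos += 3
        # Assumption: a stretch still open at the sequence end is reported.
        if (pos - start) // 3 >= min_codons:
            orfs.append((start + 1, pos))
    return sorted(orfs)


for first, last in find_orfs("atgaaaccctaa", min_codons=3):
    # Same layout as the feature table output shown above.
    print(f"FT   CDS             {first}..{last}")
```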
+
+_include(spin_restrict_enzymes-t.texi)
+
+
+_split()
+ at node SPIN-String-Search
+ at section Subsequence search
+ at cindex String searching:spin
+ at cindex Subsequence searching:spin
+ at cindex Searching for strings:spin
+ at cindex String matching:spin
+ at cindex Matching strings:spin
+ at cindex Finding strings:spin
+ at cindex String finding:spin
+ at cindex Percentage matches:spin
+ at cindex Searching for oligos:spin
+ at cindex Oligo searching:spin
+ at cindex Motif searching: percentage matches:spin
+
+
+Two subsequence or string searches are available. One, selected from the "Search" menu
+on the Output Window, produces both graphical and textual output, and
+the other, selected from the "Search" button in the Sequence display,
+moves the cursor to the position of the next match. Here we document the
+use of the first search, and the other is described in
+_oref(SPIN-Sequence-Display-Search, Sequence display string search).
+
+As shown in the dialogue the user selects the range and strand over
+which the search should be performed, the search algorithm, the minimum percentage match,
+and the subsequence/string for which to search. The search algorithm allows either NC-IUB 
+codes @cite{Cornish-Bowden, A. (1985) Nucl. Acids Res. 13, 3021-3030} or a 
+literal search. The literal search will search for exact matches, e.g.
+inputting a search string of "n" will search for the letter "n". The NC-IUB 
+codes option can use any of the NC-IUB symbols
+shown in the figure below and the search is not case sensitive.
+
+ at example
+ at group
+ at cartouche
+              NC-IUB SYMBOLS
+ 
+        A,C,G,T
+        R        (A,G)        'puRine'
+        Y        (T,C)        'pYrimidine'
+        W        (A,T)        'Weak'
+        S        (C,G)        'Strong'
+        M        (A,C)        'aMino'
+        K        (G,T)        'Keto'
+        H        (A,T,C)      'not G'
+        B        (G,C,T)      'not A'
+        V        (G,A,C)      'not T'
+        D        (G,A,T)      'not C'
+        N        (G,A,C,T)    'aNy'
+
+ at end cartouche
+ at end group
+ at end example
+
+_picture(spin_string_search_d,3.01667in)
+
+The matches are plotted as vertical lines at the match positions with
+the heights of the lines in proportion to their score. The matches are
+also written in the Output Window as shown below.
+
+_lpicture(spin_string_search_p,6in)
+
+
+ at example
+ at group
+============================================================
+Tue 19 Oct 11:52:50 1999: string search
+------------------------------------------------------------
+Position 7837 score 9 percent match 90.000000
+ Percentage mismatch  10.0
+                 1
+          string atrytayrat
+                 ::..::..: 
+      atpase.seq atgctatgag
+              7837
+ at end group
+ at end example
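+
+The NC-IUB matching and percentage score described above can be
+sketched as follows. This is a minimal illustration in Python, not
+spin's own code: the function names are hypothetical, and treating two
+symbols as matching when their base sets overlap is an assumption,
+chosen because it reproduces the score of 9 in the example output.
+

```python
# Symbol table copied from the NC-IUB figure in the text.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "TC", "W": "AT", "S": "CG",
    "M": "AC", "K": "GT", "H": "ATC", "B": "GCT",
    "V": "GAC", "D": "GAT", "N": "GACT",
}


def symbols_match(pattern_char, seq_char):
    """True if the two symbols share at least one base (case-insensitive).

    ASSUMPTION: overlap of base sets counts as a match; unknown
    characters never match.
    """
    p = set(IUB.get(pattern_char.upper(), ""))
    s = set(IUB.get(seq_char.upper(), ""))
    return bool(p & s)


def string_search(seq, pattern, min_percent):
    """Return (position, score, percent) for every window scoring at
    least min_percent. Positions are 1-based, as in the Output Window."""
    hits = []
    n = len(pattern)
    for start in range(len(seq) - n + 1):
        window = seq[start:start + n]
        score = sum(symbols_match(p, s) for p, s in zip(pattern, window))
        percent = 100.0 * score / n
        if percent >= min_percent:
            hits.append((start + 1, score, percent))
    return hits
```

+
+Calling string_search("atgctatgag", "atrytayrat", 90.0) scores the
+ten-base segment from the example output against the search string.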
+
+_split()
+ at node SPIN-Weight-Matrix-Search
+ at section Motif search
+ at cindex Motif searching:spin
+ at cindex Weight matrix:spin
+ at cindex Searching:spin
+ at cindex Searching for motifs:spin
+
+This option is used to search for motifs such as binding sites. The
+motifs are defined using weight matrices which are stored as files 
+that need to be created beforehand. These matrices are usually
+calculated from alignments of trusted examples of the motif. The
+ at code{make_weights} program can be used to create weight matrices from
+sets of aligned sequences
+(_fpref(Man-make_weights, Make_weights, t)).
+We also plan to build up a library of matrix files which we will place
+on our ftp site.
+
+A weight matrix file consists
+of a title record; a record defining the motif size, an offset and the score
+range; two records which need to be present but which are ignored; and four
+records defining the base frequencies calculated from the trusted examples.
+
+An example weight matrix file is shown below. The first line gives the
+title ('Mount acceptors' in this example). The next line gives the motif 
+length (18), the "mark position" (15), and the minimum and maximum
+scores (0.0 and 10.0). The "mark position" is an offset which is added
+to the position of any matches reported by the search routine in
+spin. The next two lines are ignored by the programs. The first of them
+gives the matrix column positions, and the next gives the total counts
+in each column. The final lines (4 for DNA weight matrices) give the
+counts for each character type at each position in the motif. These
+counts are converted into weights that are used during the searches. Any
+position in a sequence which scores at least as high as the minimum
+score is reported as a match, and if the results are plotted they are
+scaled to fit the range defined by the minimum and maximum scores.
+
+ at example
+ at group
+ Mount acceptors
+     18    15   0.0   10.0
+ P -14 -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3
+ N 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113
+ T  58  50  57  59  67  56  58  49  47  66  64  31  34   0   0  11  41  31
+ C  21  28  34  25  29  33  35  32  42  40  33  25  74   0   0  23  28  41
+ A  17  11  11  18   7  17  12  23  15   3  10  29   5 113   0  24  21  21
+ G  17  24  11  11  10   7   8   9   9   4   6  28   0   0 113  55  23  20
+ at end group
+ at end example
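+
+The file layout above can be read, and the counts turned into weights,
+along the following lines. This is a hedged sketch in Python with
+hypothetical function names. The text does not give the formula spin
+uses to convert counts into weights, so the smoothed log-odds against
+a uniform background used here (with a pseudocount of 1) is purely an
+assumption for illustration.
+

```python
import math

# The example file from the text, reproduced verbatim.
EXAMPLE = """\
 Mount acceptors
     18    15   0.0   10.0
 P -14 -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3
 N 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113
 T  58  50  57  59  67  56  58  49  47  66  64  31  34   0   0  11  41  31
 C  21  28  34  25  29  33  35  32  42  40  33  25  74   0   0  23  28  41
 A  17  11  11  18   7  17  12  23  15   3  10  29   5 113   0  24  21  21
 G  17  24  11  11  10   7   8   9   9   4   6  28   0   0 113  55  23  20
"""


def parse_weight_matrix(text):
    """Parse a DNA weight matrix file laid out as described in the text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    length, mark, min_score, max_score = lines[1].split()
    counts = {}
    for ln in lines[4:8]:  # lines[2] and lines[3] (P and N) are ignored
        fields = ln.split()
        counts[fields[0]] = [int(x) for x in fields[1:]]
    return {
        "title": lines[0].strip(),
        "length": int(length),
        "mark": int(mark),
        "min_score": float(min_score),
        "max_score": float(max_score),
        "counts": counts,
    }


def log_odds_weights(counts, pseudocount=1.0, background=0.25):
    """ASSUMED conversion: log(smoothed column frequency / background)."""
    length = len(next(iter(counts.values())))
    weights = {base: [] for base in counts}
    for col in range(length):
        total = sum(counts[b][col] for b in counts) + pseudocount * len(counts)
        for b in counts:
            freq = (counts[b][col] + pseudocount) / total
            weights[b].append(math.log(freq / background))
    return weights


def score_window(weights, segment):
    """Sum the weights of the bases in one motif-length segment."""
    return sum(weights[base][i] for i, base in enumerate(segment.upper()))
```

+
+A search would slide score_window along the sequence and report every
+position whose score is at least the minimum score from the file.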
+
+Search results are plotted as log-odds and appear as shown below.
+
+_lpicture(spin_weight_matrix,6in)
+
+The dialogue for the option is shown below.
+
+_picture(spin_weight_matrix_dial,3.01667in)
+
+
+_split()
+ at node SPIN-Gene-Finding
+ at section Gene finding
+ at cindex Finding genes: Introduction:spin
+ at cindex Gene finding: Introduction:spin
+ at cindex Searching: protein genes:spin
+ at cindex Searching: tRNA genes:spin
+ at cindex Searching: motifs:spin
+ at cindex motifs: spin
+
+ at menu
+* SPIN-Start-Codon-Search::      Start Codon Search
+* SPIN-Stop-Codon-Search::       Stop Codon Search
+* SPIN-Codon-Usage-Method::	Codon Usage Method
+* SPIN-Author-Test::	        Author Test
+* SPIN-Uneven-Positional-Base-Freqs::	Uneven Positional Base Frequencies Method
+* SPIN-Splice-Site-Search::	Splice Site Search
+* SPIN-Weight-Matrix-Search::	Motif Search
+* SPIN-TRNA-Search::		tRNA Gene Search
+ at end menu
+
+
+Many years ago @cite{Staden R. (1984) Graphic methods to determine
+the function of nucleic acid sequences. Nucl. Acids Res. 12, 521-538}
+we separated methods for searching for genes and their
+control regions into two classes: "gene search by signal", and "gene
+search by content"
+ at cite{Staden R. (1985) Computer methods to locate genes and signals in
+nucleic acid sequences, Genetic Engineering: Principles
+and Methods Vol. 7, Edited by J. K. Setlow and A.
+Hollaender, Plenum Publishing Corp.}.
+Signal searches look for short segments of
+sequence such as promoters, ribosome binding sites, splice junctions,
+etc., whereas content searches look for the sequence patterns that are
+characteristic of protein coding regions, or RNA genes. Protein coding
+sequences produce particular amino acid sequences, often using preferred
+codons, and this leaves patterns in the sequence that can be used to
+distinguish them from non-protein-coding DNA. tRNA genes must produce
+stable cloverleaf structures and "standard" tRNAs must contain
+particular (conserved) bases at locations within the cloverleaf. These
+features can be used to locate tRNA genes, and probably other RNA
+genes could be sought in a similar way.
+
+The methods described in the following sections are either "content" or
+"signal" searches and spin's graphical presentation of results can be used to
+see if together they produce a consistent gene prediction. 
+
+_split()
+ at node SPIN-Start-Codon-Search
+ at subsection Start codon search
+ at cindex Searching: start codons:spin
+ at cindex Start codons:spin
+
+This function plots the positions of all start codons using the default
+genetic code in all 3 reading frames. The positions can be listed in the
+Output Window. The start codons are plotted beneath
+the centre of the plot (in contrast to the stop codons which are plotted
+above the centre). If any of the gene search methods are currently being
+displayed, the start codons will automatically be plotted on top of the
+corresponding frame, otherwise they will be plotted in three separate plots.
+These plots can be dragged and dropped in the usual manner.
+
+_picture(spin_start_d,3.01667in)
+
+_lpicture(spin_start_p,6in)
+
+_split()
+ at node SPIN-Stop-Codon-Search
+ at subsection Stop codon search
+ at cindex Searching: stop codons:spin
+ at cindex Searching: protein genes:spin
+ at cindex Open reading frames:spin
+ at cindex Searching: open reading frames:spin
+
+Stop codons can be searched for on either (or both) strands of the
+sequence. The stop codons are displayed graphically, with a different
+colour used for each reading frame, and their positions can also be
+listed in the Output Window. If any of the gene search methods are currently
+being displayed, the stop codons will automatically be plotted on top of the
+corresponding frame. As usual the graphical plots can be dragged and dropped 
+to new locations.
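+
+As a rough illustration of how stop codons fall into the three reading
+frames of one strand, consider the sketch below. It is Python, not
+spin's code; the function name is hypothetical, the standard genetic
+code (TAA, TAG, TGA) is assumed as the default, and frame 1 is taken
+to be the frame of the first base of the sequence.
+

```python
STOPS = {"TAA", "TAG", "TGA"}  # ASSUMPTION: standard genetic code


def stop_codons_by_frame(seq):
    """Return {frame: [1-based positions]} for the forward strand."""
    seq = seq.upper()
    frames = {1: [], 2: [], 3: []}
    for i in range(len(seq) - 2):
        if seq[i:i + 3] in STOPS:
            frames[i % 3 + 1].append(i + 1)
    return frames
```

+
+The other strand could be handled by applying the same function to the
+reverse complement of the sequence.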
+
+_picture(spin_stops_d,3.01667in)
+
+_lpicture(spin_stops_p,6in)
+
+In the example below we show the stop codon search results after they
+have been drawn on a plot containing results from
+a protein gene search method. Here we can see that the open reading
+frames coincide with the highest scoring segments from the protein gene
+prediction plots.
+
+_lpicture(spin_stops_p2,6in)
+
+
+_split()
+ at node SPIN-Codon-Usage-Method
+ at subsection Codon usage method
+
+ at cindex Searching: protein genes:spin
+ at cindex Finding protein genes:spin
+ at cindex Codon usage tables:spin
+ at cindex Normalisation: codon usage tables:spin
+ at cindex Stop codons:spin
+ at cindex Codon usage method:spin
+
+
+This gene finding method is based on @cite{Staden, R. and McLachlan, A.D. (1982)
+Codon preference and its use in identifying protein coding regions in
+long DNA sequences. Nucl. Acid Res. 10, 141-156.}
+
+The current method contains a number of improvements on the original
+one. We are trying to decide if each segment of the sequence is coding or 
+non-coding. Each possibility is represented by a model consisting of a table of
+expected codon usage. The calculation finds the
+odds that each segment of the sequence fits either the coding or 
+non-coding model, and the results are plotted as log odds. 
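+
+One way to picture the log odds calculation is the sketch below. It is
+a simplified Python illustration with hypothetical names, not the
+method as implemented in spin: it assumes that the odds for a window
+are the product, over its codons, of the ratio of the coding to the
+non-coding table value, so that the log odds is a sum of per-codon
+logarithms.
+

```python
import math


def window_log_odds(seq, coding, noncoding, window_codons=67, frame=0):
    """Return (centre position, log odds) for each window in one frame.

    coding and noncoding map lower-case codons to positive values.
    window_codons must be odd. Positions are 1-based and refer to the
    first base of the central codon.
    """
    seq = seq.lower()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # ASSUMPTION: per-codon contribution is log(coding / noncoding).
    per_codon = [math.log(coding[c] / noncoding[c]) for c in codons]
    half = window_codons // 2
    results = []
    for mid in range(half, len(per_codon) - half):
        total = sum(per_codon[mid - half:mid + half + 1])
        results.append((frame + 3 * mid + 1, total))
    return results
```

+
+Running the function with frame set to 0, 1 and 2 gives the three
+curves that would be compared to find the highest scoring frame.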
+
+The results for each reading frame are plotted in the graphics
+window with frame 1 in the top panel, frame 2 the middle and frame 3
+in the bottom panel. Frame 1 is the frame of the first base in the
+active region. At each position along the sequence the program
+also plots a single dot for the reading frame with the highest
+score. These dots appear at the midpoints of the three panels and will
+form a continuous line if one reading frame is consistently the
+highest scoring.
+
+The figure below shows a SPIN Sequence Plot containing the results of
+the codon usage method on a sequence from C. elegans. This sequence has
+strong codon usage bias and so produces clear results for the method.
+Here the results are for the standard codon usage employing only the
+codon usage table shown below and a window length of 67 codons (i.e.
+no table of codon usage for non-coding sequence was supplied, and no
+normalisation was performed on the coding table). Compare the results
+to the other screen dump shown later, which also uses
+a window of 67 codons.
+
+Also visible in the figure are the cross hairs. Their x position is shown
+in sequence base numbers in the left hand box above the plot, and the y
+coordinate, expressed using the score values of the gene search, is
+shown in the right hand box. Each line in the window has its own colour
+and can be dragged and dropped to new locations to reorganise the plot.
+The cursor in the plot can be used to control the position of the
+cursor in the sequence display.
+
+_lpicture(spin_codon_usage,6in)
+
+As can be seen in the dialogue below,
+the user can define the size of the scan window in codons (note that the
+window length must be odd), the name of
+the file containing the codon usage table, and the region of the
+sequence to be analysed. The longer the window the smoother the plots,
+but the more difficult it is to find the ends of the coding
+segments. The stronger the codon preference in the codon table the
+higher the discrimination between coding and non-coding (assuming the
+sequence being analysed has the same preferences as those of the
+table). Note also that the amino acid composition represented in the
+table will also influence the results.
+
+_picture(spin_codon_usage_dial,3.01667in)
+
+The user should supply the name of a file containing two concatenated
+codon usage tables - the first being from coding sequence and the
+second from noncoding sequence. 
+This double codon table can be calculated by
+spin using 
+the Codon Usage function
+(_fpref(SPIN-Codon-Usage-Tables, Calculate codon usage, t)).
+
+If the user gives
+the name of a file that contains only a single codon table the
+algorithm will assume that it is from coding sequence, and will
+generate a noncoding table that consists of the frequencies
+that would be expected if the sequence being analysed was random
+but had the same base composition as the codon table.
+
+If no table is specified the program will generate a codon usage table
+corresponding to an average amino acid composition, and then derive
+a non-coding table from its base composition. This is equivalent to the
+"positional base preferences" method, and hence replaces it. More information
+about this method is given further down
+(_fpref(SPIN-Positional-Base-Prefs, Positional base Preferences)).
+
+In addition the user can choose to set the amino acid composition of the
+coding table to an average amino acid composition, and/or to remove
+codon preference (so that for each amino acid the codon counts are equal,
+i.e.  (TTT = TTC); (TTA = TTG = CTT = CTC = CTA = CTG); ...;
+(GGT = GGC = GGA = GGG)). In the latter case the search uses amino acid
+composition only.
+
+
+The average amino acid
+composition used to normalise the values in the codon table
+is that described by McCaldon and Argos @cite{McCaldon and Argos (1988),
+Proteins 4, 99-122}.
+
+The dialogue also allows the user to control whether or not the positions
+of stop codons are included in the display.
+
+Codon tables are scaled so that the sum of their values is 1000 and then
+any zero entries are set to 1/1000. Stop codons in the coding table are
+made to be neutral by setting them to the mean value for the table.
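+
+The table preparation just described, together with the derivation of
+a non-coding table from base composition, might look like the sketch
+below. It is illustrative Python with hypothetical names. The order of
+the steps, and whether the base composition is measured before or
+after the stop codons are neutralised, are assumptions; the text does
+not say.
+

```python
from itertools import product

BASES = "tcag"
STOPS = {"taa", "tag", "tga"}
CODONS = ["".join(p) for p in product(BASES, repeat=3)]


def prepare_coding_table(counts):
    """Scale to a sum of 1000, set zeros to 1/1000, set stops to the mean."""
    total = sum(counts.get(c, 0) for c in CODONS)
    table = {c: 1000.0 * counts.get(c, 0) / total for c in CODONS}
    for c in CODONS:
        if table[c] == 0:
            table[c] = 1.0 / 1000
    mean = sum(table.values()) / len(table)
    for c in STOPS:
        table[c] = mean
    return table


def noncoding_from_composition(table):
    """Codon values expected for random sequence with the table's base
    composition, scaled to a sum of 1000."""
    comp = {b: 0.0 for b in BASES}
    for codon, value in table.items():
        for b in codon:
            comp[b] += value
    total = sum(comp.values())
    freq = {b: comp[b] / total for b in BASES}
    return {c: 1000.0 * freq[c[0]] * freq[c[1]] * freq[c[2]] for c in CODONS}
```
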
+
+The following example shows the tables employed and calculated for
+an input coding table, no non-coding table, and normalisation to average
+amino acid composition.
+
+ at example
+ at group
+
+Table read in:
+
+      ===============================================
+      F ttt     3 S tct    29 Y tat     5 C tgt     9
+      F ttc    35 S tcc    21 Y tac    15 C tgc     5
+      L tta     2 S tca     6 * taa     0 * tga     0
+      L ttg    23 S tcg     9 * tag     0 W tgg    15
+      ===============================================
+      L ctt    70 P cct     1 H cat    17 R cgt    37
+      L ctc    39 P ccc     2 H cac    15 R cgc    18
+      L cta     0 P cca    14 Q caa    87 R cga     1
+      L ctg     4 P ccg     0 Q cag    17 R cgg     1
+      ===============================================
+      I att    32 T act    30 N aat    11 S agt     1
+      I atc    53 T acc    20 N aac    56 S agc     5
+      I ata     1 T aca     3 K aaa    21 R aga    36
+      M atg    31 T acg     0 K aag   115 R agg     0
+      ===============================================
+      V gtt    28 A gct    69 D gat    57 G ggt     5
+      V gtc    22 A gcc    52 D gac    32 G ggc     1
+      V gta     7 A gca     6 E gaa    76 G gga    48
+      V gtg     4 A gcg     0 E gag    99 G ggg     1
+      ===============================================
+
+ at end group
+ at end example
+
+ at example
+ at group
+
+Program generates non-coding table from the base 
+composition of the coding table:
+
+      ===============================================
+      F ttt    13 S tct    13 Y tat    12 C tgt    18
+      F ttc    13 S tcc    12 Y tac    12 C tgc    17
+      L tta    12 S tca    12 * taa    11 * tga    16
+      L ttg    18 S tcg    17 * tag    16 W tgg    24
+      ===============================================
+      L ctt    13 P cct    12 H cat    12 R cgt    17
+      L ctc    12 P ccc    12 H cac    11 R cgc    17
+      L cta    12 P cca    11 Q caa    11 R cga    16
+      L ctg    17 P ccg    17 Q cag    16 R cgg    23
+      ===============================================
+      I att    12 T act    12 N aat    11 S agt    16
+      I atc    12 T acc    11 N aac    11 S agc    16
+      I ata    11 T aca    11 K aaa    11 R aga    15
+      M atg    16 T acg    16 K aag    15 R agg    22
+      ===============================================
+      V gtt    18 A gct    17 D gat    16 G ggt    24
+      V gtc    17 A gcc    17 D gac    16 G ggc    23
+      V gta    16 A gca    16 E gaa    15 G gga    22
+      V gtg    24 A gcg    23 E gag    22 G ggg    31
+      ===============================================
+
+ at end group
+ at end example
+
+ at example
+ at group
+
+Program generates coding table with average amino acid
+composition and stops set to mean:
+
+      ===============================================
+      F ttt     3 S tct    28 Y tat     8 C tgt    11
+      F ttc    36 S tcc    20 Y tac    24 C tgc     6
+      L tta     1 S tca     6 * taa    16 * tga    16
+      L ttg    15 S tcg     9 * tag    16 W tgg    13
+      ===============================================
+      L ctt    46 P cct     3 H cat    12 R cgt    23
+      L ctc    25 P ccc     6 H cac    10 R cgc    11
+      L cta     0 P cca    42 Q caa    33 R cga     1
+      L ctg     3 P ccg     0 Q cag     7 R cgg     1
+      ===============================================
+      I att    19 T act    33 N aat     7 S agt     1
+      I atc    32 T acc    22 N aac    37 S agc     5
+      I ata     1 T aca     3 K aaa     9 R aga    22
+      M atg    24 T acg     0 K aag    48 R agg     0
+      ===============================================
+      V gtt    30 A gct    45 D gat    34 G ggt     7
+      V gtc    24 A gcc    34 D gac    19 G ggc     1
+      V gta     8 A gca     4 E gaa    27 G gga    63
+      V gtg     4 A gcg     0 E gag    35 G ggg     1
+      ===============================================
+
+ at end group
+ at end example
+
+_split()
+ at node SPIN-Positional-Base-Prefs
+ at subsection Positional base preferences
+
+This method for finding protein coding regions
+is a variant of the codon usage method. Here, instead of measuring the
+closeness to a table of codon frequencies whose main discriminating
+power is due to codon preferences, we look for similarity to the codon
+usage that would be expected from a protein sequence of average amino
+acid composition, but with no codon preference. The method is
+surprisingly effective:
+when tested against all the E. coli
+sequences in the EMBL sequence library it correctly identified the
+coding frame for 91% of window positions. (The E. coli sequences
+were chosen only for technical reasons: we have no reason to think
+the method would work less well on other organisms with roughly even
+base composition.)
+ at cite{Staden R. (1990) Finding protein coding regions in genomic sequences.
+In Doolittle, R.F. (ed), Methods in Enzymology, 183,
+Academic Press, San Diego, CA, 163-180.}
+
+The average amino acid
+composition used to derive the values in the codon table
+is that described by McCaldon and Argos @cite{McCaldon and Argos (1988),
+Proteins 4, 99-122}.
+
+_lpicture(spin_codon_usage_aaonly,6in)
+
+Above is the result of applying this method to the C. elegans sequence
+analysed above with a codon preference table. Note that, as would be expected,
+the main difference
+is that the range of observed scores is very much reduced.
+
+_split()
+ at node SPIN-Author-Test
+ at subsection Author test
+ at cindex Searching: protein genes:spin
+ at cindex Finding protein genes:spin
+ at cindex Codon usage tables:spin
+ at cindex Sequence interpretation: finding protein genes:spin
+ at cindex Normalisation: codon usage tables:spin
+ at cindex Stop codons:spin
+ at cindex Author test:spin
+
+This is an unpublished method for distinguishing between coding and
+noncoding segments of a DNA sequence. It is basically an extension of
+the Codon Usage method in which we compare the sequence to two tables
+of codon usage to see which of the two it is most like. One table
+should contain typical codon usage from a coding sequence and the
+other typical codon usage from a noncoding region. It is based on
+methods used to decide authorship of text - is the usage of words
+(codons) more
+like that of author A (coding) or that of author B (noncoding)?
+
+_lpicture(spin_author_p,6in)
+
+The results for each reading frame are plotted in the graphics
+window with frame 1 in the top panel, frame 2 the middle and frame 3
+in the bottom panel. Frame 1 is the frame of the first base in the
+active region. At each position along the sequence the program
+also plots a single dot for the reading frame with the highest
+score. These dots appear at the midpoints of the three panels and will
+form a continuous line if one reading frame is consistently the
+highest scoring.
+The figure shows a SPIN Sequence Plot containing the results of
+the author test method on a sequence from E. coli.
+Also visible are the cross hairs. Their x position is shown
+in sequence base numbers in the left hand box above the plot, and the y
+coordinate, expressed using the score values of the gene search, is
+shown in the right hand box. Each line in the window has its own colour
+and can be dragged and dropped to new locations to reorganise the plot.
+The cursor in the plot can be used to control the position of the
+cursor in the sequence display.
+
+
+ at example
+ at group
+A typical pair of concatenated codon tables for use by the Author test
+
+      ===============================================
+      F ttt     0 S tct     6 Y tat     2 C tgt     3
+      F ttc     3 S tcc     8 Y tac     6 C tgc     0
+      L tta     0 S tca     0 * taa     0 * tga     0
+      L ttg     1 S tcg     0 * tag     0 W tgg     0
+      ===============================================
+      L ctt     1 P cct     0 H cat     0 R cgt    12
+      L ctc     1 P ccc     0 H cac     4 R cgc     5
+      L cta     1 P cca     2 Q caa     2 R cga     0
+      L ctg    19 P ccg     7 Q cag    12 R cgg     0
+      ===============================================
+      I att     5 T act     3 N aat     2 S agt     2
+      I atc    22 T acc     6 N aac     7 S agc     1
+      I ata     0 T aca     1 K aaa     8 R aga     0
+      M atg     8 T acg     0 K aag     2 R agg     0
+      ===============================================
+      V gtt    14 A gct    12 D gat     7 G ggt    16
+      V gtc     1 A gcc     4 D gac     9 G ggc    11
+      V gta     7 A gca     8 E gaa    14 G gga     0
+      V gtg     4 A gcg     5 E gag     2 G ggg     0
+      ===============================================
+      ===============================================
+      F ttt    16 S tct     8 Y tat     8 C tgt    12
+      F ttc     7 S tcc     0 Y tac     4 C tgc     8
+      L tta     7 S tca     9 * taa    14 * tga     6
+      L ttg     7 S tcg     5 * tag     2 W tgg    17
+      ===============================================
+      L ctt     4 P cct     5 H cat     7 R cgt     4
+      L ctc     1 P ccc     0 H cac     7 R cgc     8
+      L cta     2 P cca     2 Q caa     3 R cga     4
+      L ctg     6 P ccg     1 Q cag     7 R cgg     4
+      ===============================================
+      I att     5 T act     3 N aat     3 S agt     4
+      I atc     2 T acc     5 N aac     1 S agc     1
+      I ata     6 T aca     8 K aaa    13 R aga     7
+      M atg     4 T acg     5 K aag     9 R agg     6
+      ===============================================
+      V gtt     5 A gct     2 D gat     3 G ggt     3
+      V gtc     3 A gcc     4 D gac     3 G ggc     5
+      V gta     2 A gca     4 E gaa     3 G gga     2
+      V gtg     5 A gcg     5 E gag     2 G ggg     5
+      ===============================================
+ at end group
+ at end example
+
+The mathematical treatment of the data is very different from that of
+the codon usage method.
+
+Given the two tables of codon usage the algorithm works out the
+optimal weighting to give each codon to obtain the best discrimination
+between coding and noncoding sequence. 
+The user sets the expected error rate as a percentage and the algorithm
+will choose the corresponding window length to use for the analysis.
+
+_picture(spin_author_d,3.01667in)
+
+The user should supply the name of a file containing two concatenated
+codon usage tables - the first being from coding sequence and the
+second from noncoding sequence. 
+This double codon table can be calculated by
+spin using 
+the Codon Usage function
+(_fpref(SPIN-Codon-Usage-Tables, Calculate codon usage, t)).
+
+If the user gives
+the name of a file that contains only a single codon table the
+algorithm will assume that it is from coding sequence, and will
+generate a noncoding table that consists of the frequencies
+that would be expected if the sequence being analysed was random.
+The region to be analysed can also be set.
+
+
+_split()
+ at node SPIN-Uneven-Positional-Base-Freqs
+ at subsection Uneven positional base preferences
+ at cindex Searching: protein genes:spin
+ at cindex Finding protein genes:spin
+ at cindex Sequence interpretation: finding protein genes:spin
+ at cindex Uneven positional base frequencies:spin
+
+This method is used to find regions of a sequence that
+code for a
+protein. It is based on the method of Fickett @cite{Fickett, J. (1982)
+Nucl. Acid Res. 10}, and unlike the other methods currently in the
+package does not attempt to say either which strand or frame is likely
+to be coding, only which regions of the sequence.
+
+The method looks for sections of the sequence in which the
+frequencies at which each of the four bases occupies the three
+positions in codons are nonrandom. The level of nonrandomness is
+plotted on a scale that shows the probability that the sequence is
+coding. At each position along a sequence the calculation gives the
+same value for all six possible reading frames, so only one value is
+plotted. Seventy-six percent of coding regions score above 0.78 and
+76% of noncoding regions below 0.78.
+No known window in a
+coding region has a value below 0.4, but 14% of windows in noncoding
+sequences score below it. No known window in a noncoding region
+reaches a score of 1.34, but this score is reached by 16% of known coding
+regions. These statements are now very much out of date.
+
+The method was first described in @cite{Staden R. (1984)
+Nucl. Acid Res. 12, 551-567}.
+It looks through the sequence in one fixed phase and counts the
+number of times each base appears in each of the three codon
+positions. For each window position it counts A1,A2,A3 and C1,C2,C3
+and G1,G2,G3 and T1,T2,T3 and calculates AMEAN=(A1+A2+A3)/3, and
+similarly CMEAN, GMEAN and TMEAN. It then calculates
+ADIF=abs(A1-AMEAN)+abs(A2-AMEAN)+abs(A3-AMEAN), and similarly CDIF,
+GDIF and TDIF, to measure the differences between an even base usage
+for all positions in the codons and the observed usage. The routine
+then calculates and plots the sum ADIF+CDIF+GDIF+TDIF.
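+
+The calculation for a single window can be written directly from the
+formulae above. The following Python sketch uses a hypothetical
+function name and works on raw counts; any further scaling that spin
+applies before plotting the value on its probability scale is not
+described here and is not attempted.
+

```python
def positional_bias(window):
    """Return ADIF + CDIF + GDIF + TDIF for one window.

    The window is read in one fixed phase: index 0 is codon
    position 1, index 1 is position 2, index 2 is position 3, and so on.
    """
    window = window.upper()
    total = 0.0
    for base in "ACGT":
        counts = [0, 0, 0]  # e.g. A1, A2, A3
        for i, ch in enumerate(window):
            if ch == base:
                counts[i % 3] += 1
        mean = sum(counts) / 3  # e.g. AMEAN
        total += sum(abs(c - mean) for c in counts)  # e.g. ADIF
    return total
```

+
+A window in which every base is spread evenly over the three codon
+positions gives 0, and the value grows as the usage becomes more
+uneven.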
+
+
+
+In the figure below it can be seen that much of the sequence
+being analysed appears to be coding, and this is indeed the case. Many
+of the troughs between peaks correspond to the ends of genes in this
+E. coli sequence (which was not a good choice to illustrate the
+method!). The horizontal line is at 76%. 76% of coding regions achieve values
+above this line and 76% of noncoding regions achieve scores below the line.
+ 
+_lpicture(spin_base_bias_p,6in)
+
+As can be seen in the dialogue below,
+the user can set the window length in codons (around 67 codons is
+generally suitable) and can restrict the search to a subregion of the
+sequence. Note that the window length must be odd.
+
+
+_picture(spin_base_bias_d,3.01667in)
+
+_split()
+ at node SPIN-Splice-Site-Search
+ at subsection Splice site search
+ at cindex Splice junctions:spin
+ at cindex Intron/exon boundaries:spin
+ at cindex Weight matrix: splice sites:spin
+ at cindex Splice sites:spin
+ at cindex Reading frame:spin
+ at cindex Searching: splice sites:spin
+ at cindex Searching: protein genes:spin
+
+This method is used to search for mRNA splice junctions
+using a weight
+matrix. The default weight matrix is still that derived from the
+paper of @cite{Mount, S.M. (1982) Nucl. Acids Res. 10, 459-472}, but
+we are about to create a whole new set which will be organism specific,
+and will include them in later releases and make them available via ftp.
+
+The results are displayed in three colours, one colour for each
+reading frame. The donors are plotted upwards from the base of the
+panel and the acceptors are plotted downwards from the top of the
+panel. The donors and acceptors with the same colour are compatible;
+e.g. red donors are compatible with red acceptors.
+Of course it is the combination
+of reading frame and splice sites that really matters, so donors
+and acceptors drawn in different colours can be compatible if the
+reading frame changes. By default all the sites are drawn in the same
+plot but in the figure shown below they have been separated by reading
+frame using the program's ability to reorganise the positions of
+graphical results. This layout of the donors and acceptors is designed
+to fit with the gene search methods and stop codon plots.
+The results are plotted as log-odds.
+
+_lpicture(spin_splice,6in)
+
+The frequency table shown
+below is used as a weight matrix, and AG and GT are
+obligatory at the appropriate positions.
+ 
+ at example
+ at group
+ Mount acceptors redone 16-4-91                              
+     18    15   0.0   10.0
+ P -14 -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3
+ N 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113 113
+ T  58  50  57  59  67  56  58  49  47  66  64  31  34   0   0  11  41  31
+ C  21  28  34  25  29  33  35  32  42  40  33  25  74   0   0  23  28  41
+ A  17  11  11  18   7  17  12  23  15   3  10  29   5 113   0  24  21  21
+ G  17  24  11  11  10   7   8   9   9   4   6  28   0   0 113  55  23  20
+ Mount donors redone 16-4-91                                 
+     12     4   0.0    8.0
+ P  -2  -1   0   1   2   3   4   5   6   7   8   9
+ N 136 136 136 136 136 136 136 136 136 136 136 136
+ T  28   8  15  17   0 136   9  16   7  84  30  36
+ C  41  60  16   7   0   0   3  13   3  17  28  39
+ A  40  56  89  12   0   0  83  91  12  23  53  33
+ G  27  12  16 100 136   0  41  16 114  12  25  28
+ at end group
+ at end example
+
+_split()
+ at node SPIN-TRNA-Search
+ at subsection tRNA search
+
+ at cindex Searching: tRNA genes:spin
+ at cindex tRNA gene search:spin
+ at cindex Sequence interpretation: tRNA gene search:spin
+ at cindex Cloverleaf:spin
+ at cindex Conserved bases in tRNA:spin
+ at cindex Intron in tRNA:spin
+ at cindex tRNA introns:spin
+ at cindex Stems and loops:spin
+
+This method is used to find segments of a sequence that might code for tRNAs.
+It looks for potential cloverleaf forming structures and then for the
+presence of the expected conserved bases. It presents results graphically
+and draws out the cloverleafs.
+ 
+The algorithm uses a large number of parameters including some loop
+lengths, scores for each of the four stems, and scores for the conserved
+bases, but we have not yet included an interface for setting these. We
+apologise for this and plan to add the interface in a future release. 
+Using individual base pair scores of A-T = G-C = 2, G-T = 1, 
+in its present form the algorithm is set to search for segments of
+sequence satisfying the following minimum scores:
+
+
+ at itemize @bullet
+ at item aminoacyl stem 12
+ at item tu stem 9
+ at item anticodon stem 8
+ at item du stem 4
+ at item minimum total stem score 36
+ at item minimum number of conserved bases 16
+ at item no introns
+ at end itemize
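+
+The stem scoring and the minimum scores listed above can be
+illustrated as follows. This Python sketch uses hypothetical names and
+only covers the scoring of stems that have already been located;
+finding candidate stems and checking conserved bases are not shown.
+Giving any other pairing a score of 0 is an assumption.
+

```python
# Base pair scores from the text: A-T = G-C = 2, G-T = 1.
PAIR_SCORE = {
    ("A", "T"): 2, ("T", "A"): 2,
    ("G", "C"): 2, ("C", "G"): 2,
    ("G", "T"): 1, ("T", "G"): 1,
}

# Minimum scores from the list above, using the stem names given there.
MIN_STEM_SCORE = {"aminoacyl": 12, "tu": 9, "anticodon": 8, "du": 4}
MIN_TOTAL = 36


def stem_score(strand_5p, strand_3p):
    """Score a stem from its two strands, both written 5' to 3'.

    The second strand is reversed so that paired bases line up.
    ASSUMPTION: pairs not listed in PAIR_SCORE score 0.
    """
    a = strand_5p.upper()
    b = strand_3p.upper()[::-1]
    return sum(PAIR_SCORE.get(pair, 0) for pair in zip(a, b))


def passes_stem_thresholds(scores):
    """scores maps each stem name in MIN_STEM_SCORE to its stem score."""
    each_ok = all(scores[name] >= low for name, low in MIN_STEM_SCORE.items())
    return each_ok and sum(scores.values()) >= MIN_TOTAL
```

+
+Note that the four individual minima add up to 33, so a candidate that
+only just reaches each of them still fails the total stem score of 36.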
+ 
+The algorithm was first described in @cite{Staden,R. (1980)
+A computer program to search for tRNA genes. Nucl. Acid Res. 8,
+817-825}, but has been completely rewritten since then.
+The tRNAs that have been sequenced so far have two
+characteristics that can be used to locate their genes within long
+DNA sequences. Firstly, they have a common secondary structure
+- the cloverleaf - and secondly, particular bases almost always
+appear at certain positions in the cloverleaf. The cloverleaf
+is composed of four base-paired stems and four loops. Three of the
+stems are of fixed length but the fourth, the dhu stem, which
+usually has four base pairs, sometimes has only three. All of the
+loops can vary in size. The following relationships between the
+stems in the cloverleaf are assumed in the program: (a) there are
+no bases between one end of the aminoacyl stem and the
+adjoining tuc stem; (b) there are two bases between the aminoacyl
+stem and the dhu stem; (c) there is one base between the dhu stem
+and the anticodon stem; (d) there are at least three bases between
+the anticodon stem and the tuc stem. The program looks first for
+cloverleaf structure and then for conserved bases.
+ 
+The output shows the position of the possible gene in the
+sequence by a vertical line, the height of which shows the number of
+base pairs made in the stems. Typical graphical output:
+
+_lpicture(spin_trna_p,6in)
+
+The cloverleaf structure is also drawn
+in the text Output Window. Typical text output:
+
+_lpicture(spin_trna_t,5.13333in)
+
+
+_split()
+ at node SPIN-Comparisons
+ at chapter Spin Comparison Functions
+
+ at menu
+* SPIN-Find similar spans::   Finding Similar Spans
+* SPIN-Find matching words::  Finding Matching Words
+* SPIN-Local alignment::      Aligning Sequences Locally
+* SPIN-Find Best Diagonals::  Finding the Best Diagonals
+* SPIN-Align Sequences::      Aligning Sequences Globally
+ at end menu
+
+Spin contains three functions for finding local segments of similarity
+between pairs of sequences 
+(_fpref(SPIN-Find similar spans, Finding Similar Spans)),
+(_fpref(SPIN-Find matching words, Finding Matching Words)) and
+(_fpref(SPIN-Local alignment, Aligning Sequences Locally)),
+and two for finding global
+similarity
+(_fpref(SPIN-Find Best Diagonals, Finding the Best Diagonals)) and
+(_fpref(SPIN-Align Sequences, Aligning Sequences Globally)).
+All functions produce results which are plotted in a dot
+matrix display called a SPIN Sequence Comparison Plot.
+Obviously global similarity consists of many small matching
+segments and so the local similarity searches, when plotted, will
+also reveal any larger scale relationships.
+
+_split()
+ at node SPIN-Find similar spans
+ at section Finding Similar Spans
+ at cindex Find similar spans: spin
+
+This method was first described by
+McLachlan @cite{McLachlan, A.D. Tests for comparing related amino acid sequences.
+J. Mol. Biol. 61, 409-424 (1971)}.
+It involves calculating a score for each position in the plot
+by summing points found when looking forwards and
+backwards along a diagonal line of a given length (window length). The
+algorithm does not simply look for identity but uses a score matrix that
+contains scores for every possible pair of character types. At each point
+where the score reaches the minimum score, a match is saved. Each match
+is plotted as a single point in the SPIN Sequence Comparison Plot, corresponding to the centre of the
+matching span (_fpref(SPIN-SPIN Sequence Comparison Plot, SPIN Sequence Comparison Plot)), although see "Rescan
+matches", below.
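+
+The window scoring described above can be sketched as follows. This is
+only an illustration, not the code used by spin: the function names
+(similar_spans, identity) are hypothetical, an identity score function
+stands in for a real score matrix, and every window is rescored from
+scratch, which a practical implementation would presumably avoid.
+

```python
def similar_spans(seq_h, seq_v, window, min_score, score):
    """Return (h, v, total) for every window whose summed score reaches min_score.

    h and v are 1-based positions of the centre of the window, which is
    where the text says the point is plotted.  `score` is any function
    returning the score for a pair of characters (assumed interface).
    """
    half = window // 2
    matches = []
    for h in range(len(seq_h) - window + 1):
        for v in range(len(seq_v) - window + 1):
            # Sum the pair scores along the diagonal starting at (h, v).
            total = sum(score(seq_h[h + k], seq_v[v + k]) for k in range(window))
            if total >= min_score:
                matches.append((h + half + 1, v + half + 1, total))
    return matches


def identity(a, b):
    """Stand-in score function: 1 for identical characters, otherwise 0."""
    return 1 if a == b else 0


# The first 11-character span from the example listing in this section.
hits = similar_spans("agcctatcaac", "agcctatgagc", window=11, min_score=9, score=identity)
```

+With a window of 11 and a minimum score of 9, the two spans above give a
+single match centred on position 6 of each sequence with a score of 9.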
+
+_picture(spin_similar_spans,3.01667in)
+
+The dialogue box (shown above) requests the horizontal and vertical sequences
+and their ranges (_fpref(SPIN-Selecting a sequence, Selecting a sequence)), 
+the window span length and the minimum score. 
+Only results that reach this minimum score are plotted. The default
+value for the minimum score is one that would produce
+approximately 500 matches between two random sequences of the same
+composition as the two under investigation
+(_fpref(SPIN-Probability Calculations, Probabilities and expected number of matches)). 
+This value of 500 can be changed using the "Configure default number
+of matches" option of the "Options" menu on the main menubar
+(_fpref(SPIN-Changing Default Match Number, Changing the default number of
+matches)). The upper and lower limits of the minimum score are similarly
+determined except that the expected number of matches for the upper limit is
+0 and for the lower limit is "maximum number of matches". The "maximum number of
+matches" value can be altered if more matches are required to be plotted by 
+using the
+"Configure maximum number of matches" option of the "Options" menu
+(_fpref(SPIN-Changing Max Match Number, Changing the maximum number of
+matches)).
+
+Further operations available for find similar spans are:
+
+ at table @var
+
+ at item Information
+
+This command gives a brief description of the sequences used in the comparison,
+the input parameters used and the number of matches found.
+
+ at example
+ at group
+
+horizontal EMBL: hsproperd 
+vertical EMBL: mmproper
+window length 11 min match 9
+number of matches 1772
+
+ at end group
+ at end example
+
+ at item Results
+
+A detailed listing of all the hits found is displayed in the Output Window.
+
+ at example
+ at group
+
+Positions          2 h        630 v and score          9
+
+ Percentage mismatch  18.2
+                2        12
+              H agcctatcaac
+                ::::::: : :
+              V agcctatgagc
+              630       640
+
+Positions          7 h        369 v and score          9
+
+ Percentage mismatch  18.2
+                7        17
+              H atcaacccaga
+                :  ::::::::
+              V aggaacccaga
+              369       379
+
+ at end group
+ at end example
+
+ at item Tabulate Scores
+This option lists scores, probabilities, and their expected and
+observed numbers of matches.
+
+ at example
+ at group
+
+score    9 probability 1.73e-04 expected          365 observed 1772
+score   10 probability 1.17e-05 expected           25 observed 601
+score   11 probability 3.60e-07 expected            1 observed 149
+
+ at end group
+ at end example
+
+ at item Rescan matches
+It is also possible to plot a dot for each residue with a score above a 
+minimum value within each matching span using the "Rescan matches" command.
+This is only a temporary result and will be destroyed if the SPIN Sequence Comparison Plot is
+altered (_fpref(SPIN-Managing-Results, Controlling and Managing Results,t)).
+
+
+ at item Configure
+This option allows the line width and colour of the matches to be altered. 
+_fxref(UI-Colour, Colour Selector, interface)
+A colour browser is displayed from which the desired line width or colour can 
+be configured. Pressing OK will update the SPIN Sequence Comparison Plot.
+
+ at item Display sequences
+Selecting this command invokes the SPIN Sequence Comparison Display 
+(_fpref(SPIN-Sequence-Comparison Display, Sequence comparison display)). 
+Moving the cursor in the sequence display will move the cursors of the
+same sequence in any SPIN Sequence Comparison Plot (_fpref(SPIN-Cursors, Cursors)).
+To force the sequence display to show the nearest match,
+use the "nearest match" button in the sequence display plot. To force
+the sequences to maintain their current register activate the "Lock" button.
+
+ at item Hide
+This option removes the points from the SPIN Sequence Comparison Plot but retains the information
+in memory.
+
+ at item Reveal
+This option will redisplay previously hidden points in the SPIN Sequence Comparison Plot.
+
+ at item Remove
+This command removes all the information regarding this particular
+invocation of Find similar spans, and access to this data is lost.
+
+ at end table
+
+_split()
+ at node SPIN-Find matching words
+ at section Finding Matching Words
+ at cindex Find matching words: spin
+
+The find matching words routine finds runs of identical characters shared
+by the two sequences. Its main value is speed, being hundreds of times faster
+than the find similar spans function. It is of course not very sensitive but is
+useful for long DNA sequences.
+
+_picture(spin_match_words,3.01667in)
+
+The dialogue allows the horizontal and vertical sequences and their ranges
+to be selected (_fpref(SPIN-Selecting a sequence, Selecting a sequence)).
+The word length is the minimum number of consecutive matching characters. All 
+runs of identical characters that are at least as long as the word length will
+produce a line on the SPIN Sequence Comparison Plot of length proportional to the actual word
+length 
+(_fpref(SPIN-SPIN Sequence Comparison Plot, SPIN Sequence Comparison Plot)).
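+
+As a rough illustration of what is being reported, the sketch below walks
+every diagonal and collects the maximal runs of identical characters that
+are at least as long as the word length. The function name matching_words
+is hypothetical, and this naive scan says nothing about how spin achieves
+its speed.
+

```python
def matching_words(seq_h, seq_v, word_length):
    """Return (h, v, length) for each maximal run of identical characters.

    h and v are the 1-based start positions of the run in the horizontal
    and vertical sequences; only runs of at least word_length are kept.
    """
    runs = []
    n, m = len(seq_h), len(seq_v)
    # A diagonal is identified by d = h - v (using 0-based offsets).
    for d in range(-(m - 1), n):
        h, v = max(d, 0), max(-d, 0)
        run = 0
        while h < n and v < m:
            if seq_h[h] == seq_v[v]:
                run += 1
            else:
                if run >= word_length:
                    runs.append((h - run + 1, v - run + 1, run))
                run = 0
            h += 1
            v += 1
        # A run may extend to the end of the diagonal.
        if run >= word_length:
            runs.append((h - run + 1, v - run + 1, run))
    return runs


words = matching_words("ggacgtacgtgg", "acgtacgt", word_length=8)
```

+Here the only run of eight or more identical characters starts at
+position 3 of the horizontal sequence and position 1 of the vertical one.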
+
+Further operations available for find matching words are:
+
+ at table @var
+
+ at item Information
+This command gives a brief description of the sequences used in the comparison, the input parameters used and the number of hits found.
+
+ at example
+ at group
+
+horizontal EMBL: hsproperd
+vertical EMBL: mmproper
+word length 8 
+Number of matches 140
+
+ at end group
+ at end example
+
+ at item Results
+A detailed listing of all the matching words is obtained in the Output
+Window. The horizontal (h) and vertical (v) positions of the beginning of 
+the match are listed along with the length of the match and the match itself.
+
+ at example
+ at group
+
+Positions        162 h          4 v and length         14
+ttcacccagtatga
+Positions        225 h         67 v and length         18
+gaagactgctgtctcaac
+Positions        509 h        118 v and length          8
+ctctgtca
+Positions        276 h        118 v and length          9
+ctctgtcag
+Positions        288 h        130 v and length          8
+tgcaggtc
+Positions        626 h        131 v and length          8
+gcaggtct
+Positions       1208 h        144 v and length          8
+atggtcag
+
+ at end group
+ at end example
+
+ at item Tabulate scores
+This option lists scores, probabilities, and their expected and observed 
+numbers of matches.
+ at example
+ at group
+
+score    8 probability 2.06e-05 expected           43 observed 140
+score    9 probability 5.35e-06 expected           11 observed 67
+score   10 probability 1.39e-06 expected            3 observed 45
+score   11 probability 3.60e-07 expected            1 observed 35
+score   12 probability 9.35e-08 expected            0 observed 22
+score   13 probability 2.43e-08 expected            0 observed 18
+score   14 probability 6.30e-09 expected            0 observed 17
+score   15 probability 1.63e-09 expected            0 observed 11
+score   16 probability 4.24e-10 expected            0 observed 9
+score   17 probability 1.10e-10 expected            0 observed 9
+score   18 probability 2.86e-11 expected            0 observed 8
+score   19 probability 7.42e-12 expected            0 observed 6
+score   20 probability 1.93e-12 expected            0 observed 5
+score   21 probability 5.00e-13 expected            0 observed 3
+score   22 probability 1.30e-13 expected            0 observed 2
+score   23 probability 3.37e-14 expected            0 observed 2
+score   24 probability 8.74e-15 expected            0 observed 2
+
+ at end group
+ at end example
+
+
+ at item Configure
+This option allows the line width and colour of the matches to be altered.
+_fxref(UI-Colour, Colour Selector, interface)
+A colour browser is displayed from which the desired line width or colour can 
+be configured. Pressing OK will update the SPIN Sequence Comparison Plot.
+
+ at item Display sequences
+Selecting this command invokes the sequence display 
+(_fpref(SPIN-Sequence-Comparison Display, Sequence comparison display)). 
+Moving the cursor in the sequence display will move the cursors of the
+same sequence in any SPIN Sequence Comparison Plot (_fpref(SPIN-Cursors, Cursors)).
+To force the sequence display to show the nearest match,
+use the "nearest match" button in the sequence display plot.
+
+ at item Hide
+This option removes the points from the SPIN Sequence Comparison Plot but retains the information
+in memory.
+
+ at item Reveal
+This option will redisplay previously hidden points in the SPIN Sequence Comparison Plot.
+
+ at item Remove
+This command removes all the information regarding this particular
+invocation of Find matching words, and access to this data is lost.
+
+ at end table
+
+_split()
+ at node SPIN-Find Best Diagonals
+ at section Finding the Best Diagonals
+ at cindex Find best diagonals: spin
+
+This option is among the fastest and can be useful for a quick comparison of
+two long DNA sequences. The algorithm is as follows.
+First it finds the positions of runs of identical
+characters ("words") of length word length, as for the find matching words
+algorithm. These words are accumulated in an imaginary SPIN Sequence Comparison Plot and the
+number of hits on each diagonal is summed to produce a histogram.
+The histogram is
+analysed to find its mean and standard deviation. The diagonals that
+lie above some cutoff score (defined in standard deviation units) are
+rescanned using the find similar spans algorithm. Any window
+reaching the minimum score produces a dot which is
+plotted in the usual way.
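+
+The diagonal selection step can be sketched as below. Several details are
+assumptions made for the illustration rather than facts about spin: the
+histogram includes diagonals with no hits, the population standard
+deviation is used, and the function name best_diagonals is hypothetical.
+The final rescan of the selected diagonals with the find similar spans
+algorithm is omitted.
+

```python
from collections import Counter
from statistics import mean, pstdev


def best_diagonals(seq_h, seq_v, word_length, min_sd):
    """Return the diagonals (d = h - v, 0-based) with unusually many word hits."""
    n, m = len(seq_h), len(seq_v)
    # Index every word of the vertical sequence by its start offset.
    index = {}
    for v in range(m - word_length + 1):
        index.setdefault(seq_v[v:v + word_length], []).append(v)
    # Count identical words on each diagonal.
    hits = Counter()
    for h in range(n - word_length + 1):
        for v in index.get(seq_h[h:h + word_length], ()):
            hits[h - v] += 1
    # Histogram over all diagonals, including those with no hits (assumption).
    histogram = [hits.get(d, 0) for d in range(-(m - 1), n)]
    cutoff = mean(histogram) + min_sd * pstdev(histogram)
    return sorted(d for d, count in hits.items() if count > cutoff)


selected = best_diagonals("ggacgtgcatcatt", "acgtgcatca", word_length=4, min_sd=3.0)
```

+In this small case all seven word hits fall on the diagonal offset by two
+characters, which is the only one selected.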
+
+_picture(spin_diagonals,3.01667in)
+
+The dialogue box requests horizontal and vertical sequences and their ranges
+(_fpref(SPIN-Selecting a sequence, Selecting a sequence)),
+the minimum number of identical characters in a run
+"word length", the minimum standard deviation, the window length and the
+minimum score.
+
+The points are plotted to the SPIN Sequence Comparison Plot (_fpref(SPIN-SPIN Sequence Comparison Plot, SPIN Sequence Comparison Plot)).
+
+Further operations available for find best diagonals are:
+
+ at table @var
+
+ at item Information
+This command gives a brief description of the sequences used in the 
+comparison and the input parameters used.
+
+ at example
+ at group
+
+horizontal EMBL: hsproperd
+vertical EMBL: mmproper
+window length 11 minimum score 9 word length 8 minimum sd 3.000000
+
+ at end group
+ at end example
+
+ at item Results
+A listing of all the matches is obtained in the Output Window. The horizontal 
+(h) and vertical (v) positions of the beginning of the match are listed.
+
+ at example
+ at group
+
+Positions       1066 h        905 v 
+Positions       1067 h        906 v 
+Positions       1068 h        907 v 
+Positions       1069 h        908 v 
+Positions       1070 h        909 v 
+Positions       1071 h        910 v 
+Positions       1072 h        911 v 
+Positions       1073 h        912 v 
+Positions       1074 h        913 v 
+
+ at end group
+ at end example
+
+ at item Configure
+This option allows the line width and colour of the matches to be altered.
+_fxref(UI-Colour, Colour Selector, interface)
+A colour browser is displayed from which the desired line width or colour can 
+be configured. Pressing OK will update the SPIN Sequence Comparison Plot.
+
+ at item Display sequences
+Selecting this command invokes the sequence display 
+(_fpref(SPIN-Sequence-Comparison Display, Sequence comparison display)). 
+Moving the cursor in the sequence display will move the cursors of the
+same sequence in any SPIN Sequence Comparison Plot (_fpref(SPIN-Cursors, Cursors)).
+To force the sequence display to show the nearest match,
+use the "nearest match" button in the sequence display plot.
+
+ at item Hide
+This option removes the points from the SPIN Sequence Comparison Plot but retains the information
+in memory.
+
+ at item Reveal
+This option will redisplay previously hidden points in the SPIN Sequence Comparison Plot.
+
+ at item Remove
+This command removes all the information regarding this particular
+invocation of Find best diagonals, and access to this data is lost.
+
+ at end table
+
+
+_split()
+ at node SPIN-Align Sequences
+ at section Aligning Sequences Globally
+ at cindex Align sequences: spin
+
+This function will produce an optimal global alignment of two segments of the
+sequence. The dynamic programming alignment algorithm is based on 
+ at cite{Huang,X On global sequence alignment. CABIOS 10 227-235 (1994)}. 
+There is no limit on the length of the sequences, but the sequences to be
+aligned should be of the same type, i.e. both DNA or both protein.
+
+_picture(spin_align_seq,3.01667in)
+
+A dialogue box (shown above) requests the horizontal and vertical sequences 
+and the ranges over which they are to be aligned 
+(_fpref(SPIN-Selecting a sequence, Selecting a sequence))
+and the gap start penalty 
+and the gap extension penalty. In addition, if the sequence is DNA, 
+the "score for match" and
+"score for mis-match" must be provided. These values are used to generate a
+score matrix. For protein sequences, the score matrix can be changed from the 
+"Options" menu (_fpref(SPIN-Changing the score matrix, Changing the
+score matrix)).
+
+The alignment is displayed in the Output Window along with the 
+percentage mismatch (see below) and on the SPIN Sequence Comparison Plot as a line. The
+line represents the path of the alignment.
+
+The following plot shows a global alignment of two Xenopus laevis
+sequences. The vertical sequence (xlactcag) is genomic DNA, and the
+horizontal sequence (xlacacr) is the corresponding cDNA. The vertical
+sections of the plotted path correspond to introns in the genomic DNA, which
+are obviously absent from the cDNA.
+
+_lpicture(spin_align_p,5.31667in)
+
+
+Below we show a typical alignment (from a different pair of sequences)
+as produced in the Output Window.
+
+ at example
+ at group
+
+ Percentage mismatch  29.6
+                1        11        21        31        41        51
+      hsproperd gagcctatcaacccagataaagcgggacctcctctctggtagaggtgcagggggcagtac
+                                                                            
+       mmproper ************************************************************
+             -157      -147      -137      -127      -117      -107
+
+               61        71        81        91       101       111
+      hsproperd tcaacatgatcacagagggagcgcaggcccctcgattgttgctgccgccgctgctcctgc
+                                                                            
+       mmproper ************************************************************
+              -97       -87       -77       -67       -57       -47
+
+              121       131       141       151       161       171
+      hsproperd tgctcaccctgccagccacaggctcagaccccgtgctctgcttcacccagtatgaagaat
+                                                      :: :::::::::::::: :: :
+       mmproper **************************************tgtttcacccagtatgaggagt
+              -37       -27       -17        -7         3        13
+
+              181       191       201       211       221       231
+      hsproperd cctccggcaagtgcaagggcctcctggggggtggtgtcagcgtggaagactgctgtctca
+                :::: :::: :::::: ::::: :: ::: : :   :::: :: ::::::::::::::::
+       mmproper cctctggcaggtgcaaaggcctacttgggagagacatcagggtagaagactgctgtctca
+               23        33        43        53        63        73
+
+ at end group
+ at end example
+
+The two aligned sequences are automatically saved in memory and can be
+accessed through the sequence manager. They are assigned default filenames
+which are based on the parent with the addition of _a"number" where "number" is
+a unique identifier (see the twelfth and thirteenth entries of the sequence
+manager picture (_fpref(SPIN-Sequence Manager, Sequence manager))).
+
+Further operations available for align sequences are:
+
+ at table @var
+
+ at item Information
+This command gives a brief description of the sequences used in the 
+comparison and the input parameters used.
+
+ at example
+ at group
+
+horizontal PERSONAL: m13mp18.seq from 1 to 7250
+vertical PERSONAL: lawrist7.seq from 1 to 5261
+
+ at end group
+ at end example
+
+ at item Configure
+This option allows the line width and colour of the matches to be altered.
+_fxref(UI-Colour, Colour Selector, interface)
+A colour browser is displayed from which the desired line width or colour can 
+be configured. Pressing OK will update the SPIN Sequence Comparison Plot.
+
+ at item Display sequences
+Selecting this command invokes the Sequence Comparison Display 
+(_fpref(SPIN-Sequence-Comparison Display, Sequence comparison display)). 
+Moving the cursor in the sequence display will move the cursors of the
+same sequence in any SPIN Sequence Comparison Plot (_fpref(SPIN-Cursors, Cursors)).
+To force the sequence display to show the nearest match,
+use the "nearest match" button in the sequence display plot.
+
+ at item Hide
+This option removes the points from the SPIN Sequence Comparison Plot but retains the information
+in memory.
+
+ at item Reveal
+This option will redisplay previously hidden points in the SPIN Sequence Comparison Plot.
+
+ at item Remove
+This command removes all the information regarding this particular
+invocation of Align sequences, and access to this data is lost.
+
+ at end table
+
+_split()
+ at node SPIN-Local alignment
+ at section Aligning Sequences Locally
+ at cindex Local alignment: spin
+ at cindex Alignment local: spin
+ at cindex Sim: spin
+ at cindex Smith-Waterman: spin
+
+The local alignment routine is based around the program SIM by 
+Huang and Miller which is an implementation of the Smith-Waterman algorithm
+ at cite{Huang,X.Q. & Miller, W. A Time-Efficient, Linear-Space Local Similarity Algorithm. Advances in Applied Mathematics 12 337-357 (1991)}.
+
+SIM finds k best non-intersecting alignments between two sequences or
+within a single 
+sequence using dynamic programming techniques. The alignments are
+reported in order of decreasing similarity score and share no aligned pairs.
+SIM requires space proportional to the sum of the input sequence lengths
+and the output alignment lengths, so it accommodates 100,000-base
+sequences on a workstation. Both sequences must be of the same type, i.e. both
+be DNA or both be protein.
+
+_picture(spin_local_align,3.01667in)
+
+A dialogue box (shown above) requests the horizontal and vertical sequences 
+and the ranges over which they are to be aligned 
+(_fpref(SPIN-Selecting a sequence, Selecting a sequence)). Either a specified
+number of alignments can be requested or alternatively, all alignments above
+a certain score. If the sequence
+is DNA, the scores for a matching aligned pair, a transition and a transversion
+must be provided. These values are used to generate a score matrix. For 
+protein sequences, the score matrix can be changed from the 
+"Options" menu (_fpref(SPIN-Changing the score matrix, Changing the
+score matrix)). Both DNA and protein sequences require the penalty for opening
+a gap and the penalty for gap extension.
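+
+To make the role of the three DNA scores concrete, the sketch below builds
+a 4 x 4 score matrix from them. A transition is a purine-purine or
+pyrimidine-pyrimidine substitution (a/g or c/t); every other mismatch is a
+transversion. The function name dna_score_matrix and the dictionary
+representation are assumptions for the illustration, not the data
+structure used by spin.
+

```python
PURINES = {"a", "g"}
PYRIMIDINES = {"c", "t"}


def dna_score_matrix(match, transition, transversion):
    """Return a dict mapping (base, base) pairs to their score."""
    matrix = {}
    for x in "acgt":
        for y in "acgt":
            if x == y:
                value = match
            elif {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
                value = transition      # a<->g or c<->t
            else:
                value = transversion    # purine <-> pyrimidine
            matrix[x, y] = value
    return matrix


# Arbitrary scores chosen so the three classes are easy to tell apart.
matrix = dna_score_matrix(match=2, transition=-1, transversion=-3)
```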
+
+The alignments are displayed in the Output Window along with the 
+percentage mismatch (see below) and on the SPIN Sequence Comparison Plot as a series of lines, each
+line representing the path of a single alignment.
+
+
+The following two plots show local alignments of two Xenopus laevis
+sequences. The vertical sequence (xlactcag) is genomic DNA, and the
+horizontal sequence (xlacacr) is the corresponding cDNA.
+
+The first plot is of a local alignment using a higher than default
+penalty for each residue in the gap (1 as opposed to 0.2). It has also
+been specified that all alignments scoring more than 20 are to be shown.
+The result of this is seven aligned regions, represented by seven
+diagonal lines in the plot. These regions correspond to the exons that
+are present in both sequences, separated by the introns that are only
+present in the genomic sequence.
+
+_lpicture(spin_local_p1,5.31667in)
+
+The second plot shows the result for the same two sequences when the
+default gap penalty is accepted and when only the highest scoring
+alignment is displayed. This best alignment covers five of the seven
+exons identified in the previous plot, with the lower gap penalty
+allowing it to span the introns that separate them.
+
+_lpicture(spin_local_p2,5.31667in)
+
+Below is a typical alignment as written to the Output Window.
+
+ at example
+ at group
+
+ Percentage mismatch  35.7
+               438       448       458       468       478       488
+               h caggcctgtgaggaccagcagtgctgtcctgagatgggcggctggtctggctgggggccc
+                 :::::::::::   :::: ::  ::: ::       :: : :::: :   :::::: :::
+               m caggcctgtgacacccagaagacctgccccacacatggggcctgggcatcctggggcccc
+               451       461       471       481       491       501
+
+               498       508       518
+               h tgggagccttgctctgtcacctgc
+                 :::   ::  :::: :   :::::
+               m tggagcccccgctcaggatcctgc
+               511       521       531
+
+ at end group
+ at end example
+
+Further operations available for local alignments are:
+
+ at table @var
+
+ at item Information
+This command gives a brief description of the sequences used in the 
+comparison and the input parameters used.
+
+ at example
+ at group
+
+horizontal PERSONAL: h from 1 to 1553
+vertical PERSONAL: m from 1 to 1358
+number of alignments 3 
+score for match 1
+score for transition -1
+score for transversion -1
+penalty for starting gap 6
+penalty for each residue in gap 0.2
+
+ at end group
+ at end example
+
+ at item Configure
+This option allows the line width and colour of the matches to be altered.
+_fxref(UI-Colour, Colour Selector, interface)
+A colour browser is displayed from which the desired line width or colour can 
+be configured. Pressing OK will update the SPIN Sequence Comparison Plot.
+
+ at item Display sequences
+Selecting this command invokes the Sequence Comparison Display 
+(_fpref(SPIN-Sequence-Comparison Display, Sequence comparison display)). 
+Moving the cursor in the sequence display will move the cursors of the
+same sequence in any SPIN Sequence Comparison Plot (_fpref(SPIN-Cursors, Cursors)).
+To force the sequence display to show the nearest match,
+use the "nearest match" button in the sequence display plot.
+
+ at item Hide
+This option removes the points from the SPIN Sequence Comparison Plot but retains the information
+in memory.
+
+ at item Reveal
+This option will redisplay previously hidden points in the SPIN Sequence Comparison Plot.
+
+ at item Remove
+This command removes all the information regarding this particular
+invocation of Local alignment, and access to this data is lost.
+
+ at end table
+
+
+_split()
+ at node SPIN-Managing-Results
+ at chapter Controlling and Managing Results
+
+ at menu
+* SPIN-Probability Calculations::  Probabilities and expected numbers of matches
+* SPIN-Changing Max Match Number:: Changing the maximum number of matches
+* SPIN-Changing Default Match Number:: Changing the default number of matches
+* SPIN-Hide duplicate matches:: Hide duplicate matches
+* SPIN-Changing the score matrix:: Changing the score matrix
+* SPIN-Set protein alignment symbols:: Set protein alignment symbols
+* SPIN-Result-Manager:: Result manager
+ at end menu
+
+Spin allows the parameters for each analytical option to be set in
+dialogues immediately prior to their execution, but there are other global
+parameters which can influence the results obtained, and they are
+described here.
+
+This section also covers, in its description of the "Results Manager",
+how results can be manipulated after they have been obtained. As this
+implies, almost all searches conducted by spin produce results that are
+retained until the user explicitly deletes them, and these are termed
+"Permanent results". Any other results are termed "Temporary". At
+present, the only temporary results are those produced by a variant of
+the Similar Spans algorithm
+(_fpref(SPIN-Find similar spans, Finding Similar Spans)),
+in which the plot can be overlaid by marking every
+identical character in each matching span with separate dots. These
+extra dots are not stored as results and any changes to the SPIN Sequence Comparison Plot,
+for example plotting new data, will destroy them.  
+
+
+ at node SPIN-Probability Calculations
+ at section Probabilities and expected numbers of matches
+ at cindex Match probabilities in spin
+ at cindex Probabilities in spin
+ at cindex Significance of matches in spin
+ at cindex Expected number of matches in spin
+
+
+To suggest reasonable ranges of cutoff scores for the 
+Similar Spans
+(_fpref(SPIN-Find similar spans, Finding Similar Spans))
+and Matching Words
+(_fpref(SPIN-Find matching words, Finding Matching Words))
+comparison
+functions, and later to help users assess the significance of the 
+matches found between sequences,
+spin calculates their probabilities and the expected number of
+matches
+ at cite{Staden R, Methods for calculating the probabilities of finding
+patterns in sequences. CABIOS 5 89-96 (1989)}. 
+For both algorithms the probability
+depends on the composition of the two sequences, the cutoff score, and,
+for the similar spans algorithm, the score matrix. The probability is
+the chance of finding the given score in 
+infinitely long random sequences of the same composition as the pair
+being compared. The expected number of
+matches for any score is calculated by multiplying its probability value
+by the product of the lengths of the two sequences.
+Note that no correction is made for the case of comparing a sequence
+against itself.
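+
+The arithmetic of the expected number of matches can be sketched as
+follows. The expected_matches function follows the rule stated above. The
+word_probability function is an assumption: it treats characters as
+independent, so that the chance of a run of identical characters is the
+single-character identity probability raised to the run length. That
+model is at least consistent with the matching words table shown earlier,
+where each extra character multiplies the probability by roughly 0.26,
+but it is not taken from the spin source. All names are hypothetical.
+

```python
def identity_probability(comp_h, comp_v):
    """Chance that one random character from each sequence is identical.

    comp_h and comp_v map each character to its frequency in the sequence.
    """
    return sum(freq * comp_v.get(char, 0.0) for char, freq in comp_h.items())


def word_probability(comp_h, comp_v, length):
    """Chance of `length` consecutive identities, assuming independence."""
    return identity_probability(comp_h, comp_v) ** length


def expected_matches(probability, len_h, len_v):
    """Probability multiplied by the product of the two sequence lengths."""
    return probability * len_h * len_v


# Hypothetical inputs: uniform base composition and made-up lengths.
uniform = {"a": 0.25, "c": 0.25, "g": 0.25, "t": 0.25}
p8 = word_probability(uniform, uniform, 8)
expected = expected_matches(p8, 1500, 1400)
```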
+
+The matches found for these two algorithms can be assessed by selecting
+the Tabulate Scores option, which will produce a list of observed and
+expected results as shown in the example below.
+
+ at example
+ at group
+
+score    9 probability 1.73e-04 expected          365 observed 1772
+score   10 probability 1.17e-05 expected           25 observed 601
+score   11 probability 3.60e-07 expected            1 observed 149
+
+ at end group
+ at end example
+
+In this case there are clearly many more matches at each score level
+than would be expected by chance.
+
+ at node SPIN-Changing Max Match Number
+ at section Changing the maximum number of matches
+ at cindex Changing the maximum number of matches: spin
+
+The maximum number of matches is a guideline
+limit to the number of matches 
+that a comparison function is allowed to produce. In conjunction with the
+probability calculations its value is used to determine the range of
+scores allowed for an option - for example, the lowest "minimum score"
+for the "find similar spans" function 
+(_fpref(SPIN-Find similar spans, Finding Similar Spans)). Altering the
+maximum number of matches value will 
+in turn alter the range of scores available in a function. 
+Note that this maximum only provides a guideline and each function
+will always attempt to calculate all matches. However,
+if the scores are set too low, and the sequences are long, 
+very large numbers of matches may be produced and the functions
+may run slowly as the program gobbles up increasing amounts of memory to
+store them.
+
+The maximum number of matches is altered using the Options menu. 
+
+ at node SPIN-Changing Default Match Number
+ at section Changing the default number of matches
+ at cindex Changing the default number of matches: spin
+
+The default number of matches is used to determine the default score in the
+function dialogue boxes. If the two sequences being analysed were
+scrambled (i.e. had the order of their bases or amino acids changed
+randomly) and then compared using the default score, they should be
+found to contain approximately the default number of matches.
+Hence this number provides users with a crude assessment of 
+the significance of the matches found: if more than the default number
+are found, the sequences are more similar than is likely by chance.
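+
+One way to picture how a default score could follow from the default
+number of matches is sketched below. The selection rule (take the lowest
+score whose expected number of matches does not exceed the target) is an
+assumption, as are the function name default_score and the sequence
+lengths; only the three probabilities are taken from the Tabulate Scores
+example.
+

```python
def default_score(score_probabilities, len_h, len_v, target_matches):
    """Return the lowest score expected to give at most target_matches.

    score_probabilities maps each score to its probability.  If no score
    is strict enough, the highest available score is returned.
    """
    for score in sorted(score_probabilities):
        expected = score_probabilities[score] * len_h * len_v
        if expected <= target_matches:
            return score
    return max(score_probabilities)


# Probabilities from the Tabulate Scores example; lengths are hypothetical.
table = {9: 1.73e-04, 10: 1.17e-05, 11: 3.60e-07}
chosen = default_score(table, len_h=1500, len_v=1400, target_matches=500)
```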
+
+The default number of matches is altered using the Options menu.
+
+ at node SPIN-Hide duplicate matches
+ at section Hide duplicate matches
+ at cindex Hide duplicate matches: spin
+ at cindex Duplicate matches: spin
+
+If the horizontal and vertical sequences are the same,
+the comparison plot is a mirror image about the main diagonal.
+In this case the default is that only the lower half of
+the plot is calculated. If the "Hide duplicate matches" checkbutton is
+not set, the entire plot will be displayed. It is important to note this
+property of the algorithms: if only one half of the plot is displayed,
+the two sequences have been found to be identical along the main diagonal!
+
+ at node SPIN-Changing the score matrix
+ at section Changing the score matrix
+ at cindex Changing the score matrix: spin
+ at cindex Format of protein score matrix
+ at cindex Protein score matrix format
+ at cindex Score matrix format
+
+This option allows users to select their own score matrix for protein
+sequence comparison and is 
+available from the "Options" menu which invokes a dialogue 
+box. Enter the full filename of the matrix in the entry box. Clicking on
+the "browse" button will invoke a file browser. _fxref(File Browser,
+File Browser, interface) 
+
+The recommended
+format for the matrices is that used by
+BLAST @cite{Altschul, Stephen F., Warren Gish, Webb Miller, Eugene W. Myers,
+and David J. Lipman.  Basic local alignment search tool.  J. Mol. Biol.
+215:403-10 (1990)}.
+Note that the NCBI make a whole range of protein score matrices
+available in this format and we include the one shown below
+in the package tables directory in a file named pam250.
+
+ at tex
+\global\let\nonarrowing=\comment
+ at end tex
+ at example
+#
+# This matrix was produced by "pam" Version 1.0.6 [28-Jul-93]
+#
+# PAM 250 substitution matrix, scale = ln(2)/3 = 0.231049
+#
+# Expected score = -0.844, Entropy = 0.354 bits
+#
+# Lowest score = -8, Highest score = 17
+#
+   A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  B  Z  X  *
+A  2 -2  0  0 -2  0  0  1 -1 -1 -2 -1 -1 -3  1  1  1 -6 -3  0  0  0  0 -8
+R -2  6  0 -1 -4  1 -1 -3  2 -2 -3  3  0 -4  0  0 -1  2 -4 -2 -1  0 -1 -8
+N  0  0  2  2 -4  1  1  0  2 -2 -3  1 -2 -3  0  1  0 -4 -2 -2  2  1  0 -8
+D  0 -1  2  4 -5  2  3  1  1 -2 -4  0 -3 -6 -1  0  0 -7 -4 -2  3  3 -1 -8
+C -2 -4 -4 -5 12 -5 -5 -3 -3 -2 -6 -5 -5 -4 -3  0 -2 -8  0 -2 -4 -5 -3 -8
+Q  0  1  1  2 -5  4  2 -1  3 -2 -2  1 -1 -5  0 -1 -1 -5 -4 -2  1  3 -1 -8
+E  0 -1  1  3 -5  2  4  0  1 -2 -3  0 -2 -5 -1  0  0 -7 -4 -2  3  3 -1 -8
+G  1 -3  0  1 -3 -1  0  5 -2 -3 -4 -2 -3 -5  0  1  0 -7 -5 -1  0  0 -1 -8
+H -1  2  2  1 -3  3  1 -2  6 -2 -2  0 -2 -2  0 -1 -1 -3  0 -2  1  2 -1 -8
+I -1 -2 -2 -2 -2 -2 -2 -3 -2  5  2 -2  2  1 -2 -1  0 -5 -1  4 -2 -2 -1 -8
+L -2 -3 -3 -4 -6 -2 -3 -4 -2  2  6 -3  4  2 -3 -3 -2 -2 -1  2 -3 -3 -1 -8
+K -1  3  1  0 -5  1  0 -2  0 -2 -3  5  0 -5 -1  0  0 -3 -4 -2  1  0 -1 -8
+M -1  0 -2 -3 -5 -1 -2 -3 -2  2  4  0  6  0 -2 -2 -1 -4 -2  2 -2 -2 -1 -8
+F -3 -4 -3 -6 -4 -5 -5 -5 -2  1  2 -5  0  9 -5 -3 -3  0  7 -1 -4 -5 -2 -8
+P  1  0  0 -1 -3  0 -1  0  0 -2 -3 -1 -2 -5  6  1  0 -6 -5 -1 -1  0 -1 -8
+S  1  0  1  0  0 -1  0  1 -1 -1 -3  0 -2 -3  1  2  1 -2 -3 -1  0  0  0 -8
+T  1 -1  0  0 -2 -1  0  0 -1  0 -2  0 -1 -3  0  1  3 -5 -3  0  0 -1  0 -8
+W -6  2 -4 -7 -8 -5 -7 -7 -3 -5 -2 -3 -4  0 -6 -2 -5 17  0 -6 -5 -6 -4 -8
+Y -3 -4 -2 -4  0 -4 -4 -5  0 -1 -1 -4 -2  7 -5 -3 -3  0 10 -2 -3 -4 -2 -8
+V  0 -2 -2 -2 -2 -2 -2 -1 -2  4  2 -2  2 -1 -1 -1  0 -6 -2  4 -2 -2 -1 -8
+B  0 -1  2  3 -4  1  3  0  1 -2 -3  1 -2 -4 -1  0  0 -5 -3 -2  3  2 -1 -8
+Z  0  0  1  3 -5  3  3  0  2 -2 -3  0 -2 -5  0  0 -1 -6 -4 -2  2  3 -1 -8
+X  0 -1  0 -1 -3 -1 -1 -1 -1 -1 -1 -1 -1 -2 -1  0  0 -4 -2 -1 -1 -1 -1 -8
+* -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8  1
+ at end example
+
+But for historical reasons the default matrix used by the program
+is the one shown below.
+
+ at example
+   C  S  T  P  A  G  N  D  E  Q  B  Z  H  R  K  M  I  L  V  F  Y  W  -  X  ?  
+C 22 10  8  7  8  7  6  5  5  5  5  5  7  6  5  5  8  4  8  6 10  2 10 10 10 10
+S 10 12 11 11 11 11 11 10 10  9 10 10  9 10 10  8  9  7  9  7  7  8 10 10 10 10
+T  8 11 13 10 11 10 10 10 10  9 10 10  9  9 10  9 10  8 10  7  7  5 10 10 10 10
+P  7 11 10 16 11  9  9  9  9 10  9 10 10 10  9  8  8  7  9  5  5  4 10 10 10 10
+A  8 11 11 11 12 11 10 10 10 10 10 10  9  8  9  9  9  8 10  6  7  4 10 10 10 10
+G  7 11 10  9 11 15 10 11 10  9 10 10  8  7  8  7  7  6  9  5  5  3 10 10 10 10
+N  6 11 10  9 10 10 12 12 11 11 12 11 12 10 11  8  8  7  8  6  8  6 10 10 10 10
+D  5 10 10  9 10 11 12 14 13 12 13 12 11  9 10  7  8  6  8  4  6  3 10 10 10 10
+E  5 10 10  9 10 10 11 13 14 12 12 13 11  9 10  8  8  7  8  5  6  3 10 10 10 10
+Q  5  9  9 10 10  9 11 12 12 14 11 13 13 11 11  9  8  8  8  5  6  5 10 10 10 10
+B  5 10 10  9 10 10 12 13 12 11 13 11 11 10 10  8  8  6  8  5  7  4 10 10 10 10
+Z  5 10 10 10 10 10 11 12 13 13 11 14 12 10 10  8  8  8  8  5  6  4 10 10 10 10
+H  7  9  9 10  9  8 12 11 11 13 11 12 16 12 10  8  8  8  8  8 10  7 10 10 10 10
+R  6 10  9 10  8  7 10  9  9 11 10 10 12 16 13 10  8  7  8  6  6 12 10 10 10 10
+K  5 10 10  9  9  8 11 10 10 11 10 10 10 13 15 10  8  7  8  5  6  7 10 10 10 10
+M  5  8  9  8  9  7  8  7  8  9  8  8  8 10 10 16 12 14 12 10  8  6 10 10 10 10
+I  8  9 10  8  9  7  8  8  8  8  8  8  8  8  8 12 15 12 14 11  9  5 10 10 10 10
+L  4  7  8  7  8  6  7  6  7  8  6  8  8  7  7 14 12 16 12 12  9  8 10 10 10 10
+V  8  9 10  9 10  9  8  8  8  8  8  8  8  8  8 12 14 12 14  9  8  4 10 10 10 10
+F  6  7  7  5  6  5  6  4  5  5  5  5  8  6  5 10 11 12  9 19 17 10 10 10 10 10
+Y 10  7  7  5  7  5  8  6  6  6  7  6 10  6  6  8  9  9  8 17 20 10 10 10 10 10
+W  2  8  5  4  4  3  6  3  3  5  4  4  7 12  7  6  5  8  4 10 10 27 10 10 10 10
+- 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
+X 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
+? 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
+  10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
+ at end example
+ at tex
+\global\let\nonarrowing=\relax
+ at end tex
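
The BLAST-style layout shown for the pam250 file can be read with a few lines of code. The following is a minimal Python sketch, not spin's own reader: the function name `parse_score_matrix` and the dictionary representation are assumptions made for illustration. It relies on whitespace-separated fields, so it applies only to the recommended format; the historical default matrix has a blank column label and would need fixed-width handling.

```python
def parse_score_matrix(text):
    """Parse a BLAST-style score matrix into a dict keyed by (row, column).

    Hypothetical helper for illustration; spin's real parser is not shown
    in this manual.
    """
    columns = None
    scores = {}
    for line in text.splitlines():
        # Skip blank lines and '#' comment lines.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split()
        if columns is None:
            # The first non-comment line holds the column symbols.
            columns = fields
            continue
        row, values = fields[0], fields[1:]
        if len(values) != len(columns):
            raise ValueError(
                f"row {row!r}: expected {len(columns)} scores, got {len(values)}"
            )
        for column, value in zip(columns, values):
            scores[(row, column)] = int(value)
    return scores


# A three-symbol excerpt laid out like the pam250 file.
sample = """\
# excerpt in the same layout as the pam250 file
   A  R  N
A  2 -2  0
R -2  6  0
N  0  0  2
"""

matrix = parse_score_matrix(sample)
print(matrix[("A", "R")])
```

Keying the dictionary by symbol pairs keeps lookups independent of the column order, which differs between the two matrices shown in this section.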
+
+ at node SPIN-Set protein alignment symbols
+ at section Set protein alignment symbols
+ at cindex protein alignment symbols: spin
+
+This option allows users to set their own sequence similarity levels and 
+alignment symbols for protein sequence alignments. The dialogue (shown below)
+provides for setting a symbol for identical characters and for three levels
+of similarity with corresponding symbols.
+
+_picture(spin_alignment_symbols,2.4in)
+
+_split()
+ at node SPIN-USER-Interface
+ at chapter The Spin User Interface
+
+ at menu
+* SPIN-Spin-Plot::                             The SPIN Sequence Plot
+* SPIN-Sequence-Display:: Sequence display
+* SPIN-SPIN Sequence Comparison Plot::         The SPIN Sequence Comparison Plot
+* SPIN-Sequence-Comparison Display::           The SPIN Sequence Comparison Display
+ at end menu
+
+Spin has several displays. The first is
+a top level window from which all the main options are selected and
+which receives textual results. 
+Most analytical functions which operate on single sequences
+add their graphical results to a "SPIN Sequence Plot" that is associated 
+with the sequence being analysed. (An exception is the restriction enzyme 
+search which produces its own separate window.) Most functions which compare
+pairs of sequences add their results to a "SPIN Sequence Comparison Plot".
+The SPIN Sequence Plot and the SPIN Sequence Comparison Plot each have
+associated sequence display windows: the Sequence Display and the 
+Sequence Comparison Display. These allow the text of the sequences to be
+viewed and use cursors to show the corresponding positions in the graphical
+displays.
+
+Spin is
+best operated using a three button mouse, but alternative keybindings
+are available. Full details of the user interface
+are described elsewhere
+(_fpref(UI-Introduction, User Interface, t)).
+
+The main window (shown below) contains an Output Window for
+textual results, an Error window for error messages, and a series of
+menus arranged along the top
+(_fpref(SPIN-Intro-Menus, Spin menus,t)).
+The contents of the two text windows can
+be searched, edited and saved. Each set of results is preceded by
+a header containing the time and date when it was generated.
+
+_lpicture(spin_translate_t,3.01667in)
+
+As can be seen 
+the main menu bar contains File, View, Options, Sequences, Statistics,
+Translation, Comparison, Search and Emboss menus.
+
+_split()
+ at node SPIN-Spin-Plot
+ at section SPIN Sequence Plot
+
+ at cindex SPIN Sequence Plot: spin
+
+ at menu
+* SPIN-CURSORS::         Cursors
+* SPIN-CROSSHAIRS::      Crosshairs
+* SPIN-ZOOM::            Zoom
+* SPIN-DRAG::            Drag and drop
+ at end menu
+
+Graphical results are shown in separate windows. Most functions add their
+graphical results to a SPIN Sequence Plot that is associated with the
+sequence being analysed, but some functions such as the
+restriction enzyme search produce their own separate windows. Each set of
+graphical results has its own particular default drawing method, but
+individual results can be "dragged and dropped" by users, hence allowing
+plots to be rearranged and superimposed. Plots can also be extracted from
+the main graphics window and dropped to create new independent windows.
+The graphical results can be zoomed and scrolled in both x and y
+directions. Zooming is achieved using the X and Y scale bars at the bottom
+of the plot. The individual plots can be scrolled in y
+using the scroll bars attached to their right hand edge. The sequence
+can be scrolled using the scroll bar at the base of the plot.
+
+_lpicture(spin_plot_p,6in)
+
+The figure shows a SPIN Sequence Plot containing the results of
+a gene search method based on codon usage, upon which is superimposed a 
+search for start codons and stop codons. The vertical blue line represents
+the cursor. The x position of the cursor is shown in the left hand box above
+the plot. Each result in the window has its own colour.
+At the right hand side of each panel is a set of square boxes with the
+same colours as the lines drawn in the adjacent plot. These
+icon-like objects represent individual results and allow the user to 
+operate on them. For example, at the right of the middle panel is a
+pop-up menu containing the items: "Information", "List results",
+"Configure", "Hide" and "Remove"
+(_fpref(SPIN-Result-Manager, Result manager)).
+
+The square icons can also be used to move the corresponding results to
+new locations. These operations are explained below (_fpref(SPIN-DRAG, Drag and drop)).
+The cursor in the plot can be used to control the position of the
+cursor in the sequence display.
+
+The SPIN Sequence Plot contains three menus: "File", "View" and
+"Results". The "File" menu contains the "Exit" command which closes down
+the plot; the "View" menu contains the "Results manager" command (_fpref(SPIN-Result-Manager, Result manager));
+and the "Results" menu contains a list of the results, colour coded,
+i.e. the text is written in the same colour as the plot. For each result
+a series of commands can be accessed from a cascading menu
+(_fpref(SPIN-Result-Manager, Result manager)).
+
+
+_lpicture(spin_results_manager_d2,6in)
+
+ at node SPIN-CURSORS
+ at subsection Cursors
+ at cindex Cursor: spin
+
+Each sequence displayed in a SPIN Sequence Plot will have a corresponding cursor
+of a particular colour. The same sequence displayed in several SPIN Sequence Plots
+will have a cursor of the same colour. To move a cursor, click on it
+with the middle mouse button, or Alt left mouse button, 
+held down and drag the mouse.
+The cursor will move all
+other cursors displayed that relate to that sequence, whether these be
+in different SPIN Sequence Plots or within the sequence display.
+
+
+ at node SPIN-CROSSHAIRS
+ at subsection Crosshairs
+
+ at cindex Crosshairs: spin
+
+Crosshairs can be turned on or off using the check button labelled
+"crosshairs". The x and y positions of the crosshairs are indicated in
+the two boxes to the right of the check box respectively. The x value is
+the base position in the sequence and the y value the score for the
+corresponding plot.
+
+
+ at node SPIN-ZOOM
+ at subsection Zoom
+ at cindex Zoom: spin
+
+The graphical results can be zoomed and scrolled in both x and y
+directions. Zooming is achieved using the X and Y scale bars at the bottom
+of the plot. The individual plots can be scrolled in y
+using the scroll bars attached to their right hand edge. The sequence
+can be scrolled using the scroll bar at the base of the plot.
+
+ at node SPIN-DRAG
+ at subsection Drag and drop
+ at cindex Drag and drop graphics: spin
+ at cindex Graphics rearrangement: spin
+
+The square boxes at the right edge of the SPIN Sequence Plot panels have the same
+colours as the individual results in the display. These icons can be used to
+drag and drop the results to which they correspond. This is activated by
+pressing the middle mouse button, or Alt left mouse button, 
+over the box and then moving the cursor
+over the SPIN Sequence Plot to the new location. As the cursor moves over each
+part of the plot rectangular boxes will appear to indicate the position
+that the dragged result will occupy if the mouse button is
+released. Results can be dropped on top of another plot (signified by a
+rectangle drawn over the centre of the plot), above another plot
+(signified by a rectangle drawn in the top third of the plot), or below
+another plot (signified by a rectangle drawn in the bottom third of the
+plot). The figure below shows three panels containing a protein gene
+prediction and three panels showing the positions of stop codons in each
+of the reading frames. 
+
+_lpicture(spin_plot_drag1,6in)
+
+ at page
+The figure below shows the same SPIN Sequence Plot but with the rectangle
+indicating where the user has dragged the frame 1 stop codon results.
+
+_lpicture(spin_plot_drag2,6in)
+
+ at page
+The figure below shows the result of releasing the middle mouse
+button, or Alt left mouse button: 
+the stop codon plot has moved to be superimposed on the gene
+prediction plot.
+
+_lpicture(spin_plot_drag3,6in)
+
+_split()
+ at node SPIN-Sequence-Display
+ at section Sequence display
+ at cindex Sequence display:spin
+ at cindex Sequence viewer:spin
+ at cindex Sequence scrolling:spin
+ at cindex Cursor positioning:spin
+ at cindex Cursor linking:spin
+ at cindex Cursor dragging:spin
+ at cindex Display interaction:spin
+ at cindex Interaction of displays:spin
+
+Each sequence shown in the "Sequence manager" list can have its own
+"Sequence display" window.
+The Sequence display provides a way of viewing and scrolling along the
+characters of the sequence. It can also show translations to protein and
+the positions of restriction enzyme cutting sites. Movement along the
+sequence is controlled by standard mouse and cursor commands and by the
+use of a subsequence/string search routine. The cursor can also be controlled by
+dragging the cursor in the programs' graphical displays.
+
+When restriction enzyme cutting sites are shown in the Sequence display
+window the number of rows of text to be displayed at any position along the
+sequence varies with the density of the sites. This presents a tricky
+problem about how to position the lines of text relative to the top and
+bottom of the window. Should the height of the window grow and shrink
+vertically as the user scrolls? Should the window maintain a fixed
+height, in which case should the top or the bottom be clipped if the
+number of lines of text exceeds the window height? We have programmed it
+so that
+the nucleotide sequence remains at a fixed height to provide a constant
+reference and the user can select this height by use of a vertical
+scroll bar and by growing or shrinking the window. The scroll bar will
+allow vertical movement when the number of lines of text exceeds the
+current window size.
+
+_lpicture(spin_sequence_display_t,6in)
+
+
+_split()
+ at node SPIN-Sequence-Display-Search
+ at subsection Search
+ at cindex String searching:spin
+ at cindex Subsequence searching:spin
+ at cindex Searching for strings:spin
+ at cindex String matching:spin
+ at cindex Matching strings:spin
+ at cindex Finding strings:spin
+ at cindex String finding:spin
+ at cindex Percentage matches:spin
+ at cindex Searching for oligos:spin
+ at cindex Oligo searching:spin
+ at cindex Motif searching: percentage matches:spin
+ at cindex NC-IUB symbols:spin
+ at cindex IUB symbols:spin
+ at cindex DNA character set
+ at cindex Nucleotide symbols
+
+The Sequence display contains a subsequence or string search function which moves the
+cursor to the position of the next match.
+As shown in the dialogue the user selects the direction and strand over
+which the search should be performed, the search algorithm, the minimum percentage match,
+and the subsequence/string for which to search. The search algorithm allows either NC-IUB 
+codes @cite{Cornish-Bowden, A. (1985) Nucl. Acids Res. 13, 3021-3030} or a 
+literal search. The literal search will search for exact matches, e.g.
+inputting a search string of "n" will search for the letter "n". The NC-IUB 
+codes option can use any of the NC-IUB symbols
+shown in the figure below. Once activated the Search dialogue
+will remain visible until the user clicks on the "Cancel" button. 
+The cursor will move to the next
+matching position each time the user clicks on the "Search" button, or
+will "beep" if there is no such match.
+
+_picture(spin_sequence_display_d,2.18333in)
+ 
+ at example
+ at group
+ at cartouche
+              NC-IUB SYMBOLS
+ 
+        A,C,G,T
+        R        (A,G)        'puRine'
+        Y        (T,C)        'pYrimidine'
+        W        (A,T)        'Weak'
+        S        (C,G)        'Strong'
+        M        (A,C)        'aMino'
+        K        (G,T)        'Keto'
+        H        (A,T,C)      'not G'
+        B        (G,C,T)      'not A'
+        V        (G,A,C)      'not T'
+        D        (G,A,T)      'not C'
+        N        (G,A,C,T)    'aNy'
+
+ at end cartouche
+ at end group
+ at end example
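
The search described above can be sketched in Python using the NC-IUB table. This is an illustrative sketch only: the function `find_next`, its parameters, the case-insensitive comparison and the forward-strand-only scan are assumptions, not spin's actual implementation. The minimum percentage is taken to mean the fraction of pattern positions that must match.

```python
# NC-IUB symbols and the bases each one stands for (from the table above).
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "TC", "W": "AT", "S": "CG", "M": "AC", "K": "GT",
    "H": "ATC", "B": "GCT", "V": "GAC", "D": "GAT", "N": "GACT",
}


def find_next(sequence, pattern, start=0, min_percent=100.0, use_iub=True):
    """Return the 0-based index of the next match at or after start, or -1.

    Hypothetical helper: scans the forward strand only and ignores case.
    """
    sequence, pattern = sequence.upper(), pattern.upper()
    if not pattern:
        return -1
    # Number of pattern positions that must match to reach min_percent.
    needed = len(pattern) * min_percent / 100.0
    for pos in range(start, len(sequence) - len(pattern) + 1):
        window = sequence[pos:pos + len(pattern)]
        hits = 0
        for base, symbol in zip(window, pattern):
            if use_iub:
                # A symbol matches any base in its NC-IUB set.
                hits += base in IUB.get(symbol, symbol)
            else:
                # Literal mode: characters must be identical.
                hits += base == symbol
        if hits >= needed:
            return pos
    return -1


print(find_next("TTACGT", "RYG"))                   # NC-IUB search
print(find_next("TTACGT", "RYG", use_iub=False))    # literal search
print(find_next("TTACGT", "ACGA", min_percent=75))  # 3 of 4 positions match
```

A return value of -1 corresponds to the case in which the program would "beep" because no further match exists.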
+
+ at node SPIN-Sequence-Display-Save
+ at subsection Save
+
+ at cindex Restriction enzyme sites:spin
+ at cindex Sites: restriction enzymes:spin
+ at cindex Cutting sites: restriction enzymes:spin
+ at cindex Dumping results to file:spin
+ at cindex Translation to protein:spin
+ at cindex Double stranded sequence listing:spin
+ at cindex Single stranded sequence listing:spin
+
+
+Saving the contents of the sequence display to a file
+
+The Sequence display contents can be dumped to a file by selecting the
+"Save" option from the menu. Whatever options are currently activated
+(i.e. which of restriction enzyme sites, translation, ruler and strands)
+will be written to disk. The user can define the region of the sequence
+for which to dump the results and the name of the file to use.
+
+_picture(spin_sequence_display_save_d,2.15in)
+
+
+_split()
+ at node SPIN-SPIN Sequence Comparison Plot
+ at section SPIN Sequence Comparison Plot
+ at cindex SPIN Sequence Comparison Plot: spin
+ at cindex Dot plot: spin
+
+ at menu
+* SPIN-Cursors::                         Cursors
+* SPIN-Crosshairs::                      Crosshairs
+* SPIN-Zoom::                            Zoom
+* SPIN-Drag::                            Drag and drop
+ at end menu
+
+When a comparison function has been run on a pair of sequences a 
+SPIN Sequence Comparison Plot will appear. The results from each 
+comparison
+can be viewed in a "Sequence Comparison Display" in which the two sequences can
+be scrolled past one another.
+
+The SPIN Sequence Comparison Plot display shows the results of comparison algorithms. Each match
+is represented as either a single dot ("Find similar spans", "Find best diagonals") or
+a line ("Find matching words", "Align sequences", "Local alignment"). 
+Sets of matches from a single invocation of a
+comparison command are termed "a result".  Each result is plotted using
+a single colour which can be configured via the results manager
+(_fpref(SPIN-Result-Manager, Result manager)). The maximum dimensions
+of the SPIN Sequence Comparison Plot are indicated on the rulers at the bottom and left hand side.
+It is possible within spin to compare many different sequences. This means that
+there may be more than one horizontal or vertical sequence shown in the spin
+plot. All the points are scaled to the largest sequence in each direction.
+Plots can also be extracted from or added to each SPIN Sequence Comparison Plot.
+
+_lpicture(spin_dot_plot,5.88333in)
+
+The diagram above shows the results of a "find similar spans" search (olive)
+(_fpref(SPIN-Find similar spans, Finding Similar Spans)), 
+and a "find
+matching words" 
+(red) (_fpref(SPIN-Find matching words, Finding Matching Words)), between human and
+mouse properdin (hsproperd and mmproper). 
+At the right hand side is a set of square boxes with the
+same colours as the dots drawn in the adjacent plot. These
+icon-like objects represent individual results and allow the user to 
+operate on them. For example, in the figure above, the user has clicked
+the right mouse button on the icon to raise a pop-up menu beneath the 
+"matching words" result
+(_fpref(SPIN-Result-Manager, Result manager)).
+
+The square icons can also be used to move the corresponding results to
+new locations. These operations are explained below 
+(_fpref(SPIN-Drag, Drag and drop)).
+
+The SPIN Sequence Comparison Plot has 3 menus, "File", "View" and "Results". 
+
+The "File" menu contains the "Exit" command to quit the SPIN Sequence Comparison Plot. This
+shuts down the SPIN Sequence Comparison Plot display and removes all the results displayed in that
+plot.
+
+The "View" menu contains the "Results manager" command, see
+_oref(SPIN-Result-Manager, Result manager).
+
+The "Results" menu provides a quick method of interfacing with the menu 
+obtainable via the "Results manager" (_fpref(SPIN-Result-Manager, Result manager)).
+
+_split()
+ at node SPIN-Cursors
+ at subsection Cursors
+ at cindex Cursor: spin
+
+Each sequence displayed in a SPIN Sequence Comparison Plot will have a corresponding cursor of a
+particular colour. In the picture above, the sequence on the horizontal
+axis has a vertical blue cursor whereas the sequence on the vertical axis has 
+a horizontal olive green cursor. 
+The same sequence displayed in several SPIN Sequence Comparison Plots will have
+a cursor of the same colour unless the sequence has been plotted on a 
+different axis. The x and y positions of the cursors are indicated in
+the two boxes to the right of the crosshair check box respectively. To move a 
+cursor, click on it with the middle mouse button, or Alt left mouse button,
+held down and drag the mouse.
+The cursor will move all
+other cursors displayed that relate to that sequence, whether they are
+in different SPIN Sequence Comparison Plots or within the sequence display.
+
+ at node SPIN-Crosshairs
+ at subsection Crosshairs
+ at cindex Crosshairs: spin
+
+Crosshairs can be turned on or off using the check button labelled
+"crosshairs". The x and y positions of the crosshairs are indicated in
+the two boxes to the right of the check box. The position
+of the crosshairs can be "frozen" at a particular position by pressing
+the control button and moving the mouse cursor outside the SPIN Sequence Comparison Plot
+window.
+
+ at node SPIN-Zoom
+ at subsection Zoom
+ at cindex Zoom: spin
+
+Plots can be enlarged either by resizing the window or zooming. Zooming
+is achieved in two ways: either using the buttons at the base of the plot; 
+or by holding down the control key and right mouse button and
+dragging out a rectangle around the region to be zoomed. 
+Rectangles that are too small are ignored and
+a warning bell will sound. The Back button will restore the plot to the
+previous magnification. Zooming will increase the magnification of the
+plot so that the contents of the dragged out rectangle fill the display.
+The scrollbars allow the SPIN Sequence Comparison Plot to be scrolled in both directions.
+
+It is not possible to zoom the results from Rescan matches
+(_fpref(SPIN-Find similar spans, Finding Similar Spans)).
+
+
+ at node SPIN-Drag
+ at subsection Drag and drop
+ at cindex Drag and drop graphics: spin
+ at cindex Graphics rearrangement: spin
+
+The square boxes at the right edge of the SPIN Sequence Comparison Plot panels have the same
+colours as the individual results in the display. These icons can be used to
+drag and drop the results to which they correspond. This is activated by
+pressing the middle mouse button, or Alt left mouse button, 
+over the box and then moving the cursor
+over the SPIN Sequence Comparison Plot to the new location or anywhere outside the SPIN Sequence Comparison Plot. 
+As the cursor moves over each
+part of the plot rectangular boxes will appear to indicate the position
+that the dragged result will occupy if the mouse button is
+released. Results can be dropped on top of another plot (signified by a
+rectangle drawn over the centre of the plot), above another plot
+(signified by a rectangle drawn in the top third of the plot), or below
+another plot (signified by a rectangle drawn in the bottom third of the
+plot). Moving the mouse cursor outside the SPIN Sequence Comparison Plot and releasing the mouse
+button will create a new SPIN Sequence Comparison Plot containing that result.
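
The drop rules above amount to classifying the pointer position relative to the target plot. The Python sketch below illustrates that classification; the function `drop_action`, its return labels, and the assumption that the pointer zones coincide exactly with thirds of the plot height (with y increasing downwards, as in screen coordinates) are all illustrative and are not taken from spin's source.

```python
def drop_action(pointer_y, plot_top, plot_height):
    """Classify where a dragged result would land relative to one plot.

    Hypothetical helper: y grows downwards, as in screen coordinates.
    """
    offset = (pointer_y - plot_top) / plot_height
    if not 0.0 <= offset <= 1.0:
        # Released outside the plot: the result gets a window of its own.
        return "new plot"
    if offset < 1.0 / 3.0:
        return "above"        # rectangle drawn in the top third
    if offset > 2.0 / 3.0:
        return "below"        # rectangle drawn in the bottom third
    return "superimpose"      # rectangle drawn over the centre


# A plot occupying screen rows 100 to 400.
for y in (120, 250, 380, 450):
    print(y, drop_action(y, plot_top=100, plot_height=300))
```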
+
+_split()
+ at node SPIN-Sequence-Comparison Display
+ at section Sequence Comparison Display
+ at cindex Sequence display: spin
+
+A sequence display is associated with a single set of results 
+(_fpref(SPIN-Result-Manager, Result manager)). To invoke a Sequence 
+Comparison Display, bring up a pop
+up menu for the required result, either from the Results manager, the Results
+menu in the SPIN Sequence Comparison Plot, or the coloured square icon on the right of the spin
+plot. From the menu, select the "Display sequences" option.
+
+_lpicture(spin_seq_display,6in)
+
+The horizontal sequence is drawn above the vertical sequence. In the
+picture above, "hsproperd" is the horizontal sequence and "mmproper" is
+the vertical sequence. The central panel indicates characters which are
+identical between the horizontal and vertical sequences. The buttons to
+the left of the sequences allow scrolling of the sequences either 1 character
+at a time (< or >) or a screen width (<< or >>). Pressing the Lock
+button "locks" the two sequences together and they can be scrolled as
+one.  Movement of the sequences is also controlled by the scrollbars or by
+moving the corresponding cursor in the SPIN Sequence Comparison Plot 
+(_fpref(SPIN-SPIN Sequence Comparison Plot, SPIN Sequence Comparison Plot)). The black cursors in the sequence display 
+correspond to the position of the cursor in the SPIN Sequence Comparison Plot. The sequences can be
+made to 'jump' to the nearest match in those results by pressing the 
+"Nearest match" or "Nearest dot" buttons. Nearest match means the match
+whose x,y coordinate in sequence character positions is closest, whereas
+Nearest dot means the match which appears closest in screen
+coordinates. If the display edges were proportional to the sequence
+lengths the Nearest dot and Nearest match would be equivalent.
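
The difference between "Nearest match" and "Nearest dot" can be made concrete with a short Python sketch. The function names, the use of squared Euclidean distance, and the tuple-based representation of matches are assumptions for illustration; the manual does not state which distance measure spin uses.

```python
def nearest_match(matches, cursor):
    """Closest match measured in sequence character positions."""
    cx, cy = cursor
    return min(matches, key=lambda m: (m[0] - cx) ** 2 + (m[1] - cy) ** 2)


def nearest_dot(matches, cursor, seq_lengths, plot_size):
    """Closest match measured in screen coordinates.

    Each axis is scaled by plot size / sequence length (hypothetical model).
    """
    x_scale = plot_size[0] / seq_lengths[0]
    y_scale = plot_size[1] / seq_lengths[1]
    cx, cy = cursor
    return min(
        matches,
        key=lambda m: ((m[0] - cx) * x_scale) ** 2 + ((m[1] - cy) * y_scale) ** 2,
    )


# Horizontal sequence of 1000 characters, vertical sequence of 100 characters.
matches = [(130, 50), (100, 60)]
cursor = (100, 50)

print(nearest_match(matches, cursor))
# Square display: the vertical axis is stretched, so the other match is closer.
print(nearest_dot(matches, cursor, (1000, 100), (500, 500)))
# Display edges proportional to the sequence lengths: both agree.
print(nearest_dot(matches, cursor, (1000, 100), (1000, 100)))
```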
+
+_split()
+ at node SPIN-Results
+ at chapter Controlling and Managing Results
+
+ at node SPIN-Result-Manager
+ at section Result manager
+
+ at cindex Results manager: spin
+ at menu
+* SPIN-RM-Information::          Show information
+* SPIN-RM-List::                 List results
+* SPIN-RM-Configure::            Configure results
+* SPIN-RM-Hide::                 Hide results
+* SPIN-RM-Reveal::               Reveal results
+* SPIN-RM-Remove::               Remove results
+ at end menu
+
+
+Many functions within spin produce "results" which are plotted to 
+the SPIN Sequence Plot (_fpref(SPIN-Spin-Plot, SPIN Sequence Plot)) or
+the 
+SPIN Sequence Comparison Plot (_fpref(SPIN-SPIN Sequence Comparison Plot, SPIN Sequence Comparison Plot)).
+The Result Manager provides a
+mechanism to interrogate and operate on these results. 
+
+_picture(spin_results_manager_d,2.45in)
+
+The Result Manager can be accessed via the "Results manager" command in
+the View menu on either the main menu or the menu bar of the SPIN Sequence Plot. 
+Alternatively the results can be accessed as a menu attached to the "Results"
+option on the SPIN Sequence Plot menu bar. In this case the individual results are
+written in the same colour as the plots they refer to:
+
+_lpicture(spin_results_manager_d2,6in)
+
+
+Each entry listed in the window shows the time the result was
+created, the name of the function which created the result and the
+result number. The number is simply a unique identifier to help
+distinguish between multiple results produced by the same function. The
+results are listed in time order, the oldest at the top.
+
+ at cindex Memory saving: spin
+
+Each item in the list is consuming memory on your computer. Running
+functions over and over again without removing the previous results will
+slow down your machine and it will, eventually, run out of memory. Removing 
+items from the list solves this.
+
+Pressing the right mouse button over a listed item will display a popup
+menu of operations to perform on this result. 
+
+ at node SPIN-RM-Information
+ at subsection Information
+
+This option in the pop-up writes data about the parameters used to
+obtain the corresponding result.
+
+ at node SPIN-RM-List
+ at subsection List
+
+This option in the pop-up writes all the numerical values for the result
+to the Output Window. This should be used sparingly as it requires a lot
+of memory.
+
+ at node SPIN-RM-Configure
+ at subsection Configure
+
+This option allows the line width and colour of the matches to be altered
+(_fpref(UI-Colour, Colour Selector, interface)).
+A colour browser is displayed from which the desired line width or colour can 
+be configured. Pressing OK will update the SPIN Sequence Plot.
+
+ at node SPIN-RM-Hide
+ at subsection Hide
+
+This option removes the points from the SPIN Sequence Plot but retains the information
+in memory.
+
+ at node SPIN-RM-Reveal
+ at subsection Reveal
+
+This option will redisplay previously hidden points in the SPIN Sequence Plot.
+
+ at node SPIN-RM-Remove
+ at subsection Remove
+
+This command removes all the information regarding this particular
+result and access to this data is lost.
+
+
+_split()
+ at node Reading and Managing Sequences
+ at chapter Reading and Managing Sequences
+
+Spin manages sequences at two levels. First it provides for reading
+sequences into the program from disk files, and secondly it contains a
+range of facilities for deriving new sequences
+from them. For example it can internally produce
+protein sequences from DNA sequences, or produce the complement of a DNA
+sequence, rotate it about any position, or scramble it. Each of these
+types of internal operation produces a new sequence which can be
+analysed using the comparison functions, or which can be saved to
+disk. In the same way, performing a sequence alignment produces two new
+sequences which can be analysed or saved to disk.
+
+The sections below deal first with reading sequences from disk, and then
+with what can be done to produce new sequences in memory.
+New sequences are obtained from disk using the Load sequences option in
+the File menu, and sequences are managed internally using the Sequences
+menu. 
+
+ at node SPIN-Feature Tables
+ at section Use of feature tables in spin
+ at cindex feature tables
+
+At present spin can only read feature tables from EMBL style files and use them
+to perform translations to protein. Like many components of spin, for us this
+was an exercise in doing the hard part (i.e. parsing and using the table), but
+we still need to apply it to many other tasks. We also need to generalise it
+to read genbank files.
+
+We also limited the types of record we accepted: only those with precisely
+defined endpoints. This means we do not store records which for example
+include <1..2000 or 1001.1005 but would store and use
+those with (1001..2000) or complement(join(2691..4571,4918..5163)).
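
The acceptance rule for feature locations can be illustrated with a small Python check. This is a sketch under stated assumptions, not spin's parser: the function `has_precise_endpoints` and both regular expressions are hypothetical, and the check does not verify that parentheses are balanced.

```python
import re

# A range with two explicit endpoints, e.g. 2691..4571
RANGE = re.compile(r"\d+\.\.\d+")
# Everything allowed around the ranges: operators, parentheses and commas.
SCAFFOLD = re.compile(r"(?:complement|join|[(),])*")


def has_precise_endpoints(location):
    """Return True if every endpoint in the location is precisely defined.

    Hypothetical helper: removes all N..M ranges and then checks that only
    complement/join scaffolding remains.
    """
    compact = "".join(location.split())
    if not RANGE.search(compact):
        return False
    remainder = RANGE.sub("", compact)
    return SCAFFOLD.fullmatch(remainder) is not None


for loc in (
    "<1..2000",
    "1001.1005",
    "(1001..2000)",
    "complement(join(2691..4571,4918..5163))",
):
    print(loc, has_precise_endpoints(loc))
```

Stripping the well-formed ranges first means that any leftover `<`, `>` or single-dot notation causes the location to be rejected.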
+
+
+ at node SPIN-Read Sequences
+ at section Reading in sequences
+ at cindex Read sequence: spin
+ at cindex Entry sequence: spin
+
+This section describes how sequences are obtained from disk files.
+
+ at menu
+* SPIN-Simple search::           Simple
+* SPIN-Personal search::         Extracting a sequence from a personal archive file
+ at end menu
+
+Personal sequence files can be in plain text, "Staden", EMBL, 
+Genbank, PIR, FASTA and GCG formats. If
+supported by the format, personal files can contain multiple entries
+preceded by entry names. As is explained below a browser is available
+for selecting entries from such files. The file format is worked out 
+automatically.
+
+
+New sequences are entered into spin using the "Load sequence" option in
+the File menu. 
+_ifdef([[_unix]],[[This invokes a cascading menu containing 2 modes of
+searching, the simple search and a personal archive
+search (see below).]])
+
+If a sequence is entered which has the
+same name as one already loaded, its name within the program 
+is changed by the addition of
+'#number' where 'number' is a unique identifier. For example, if the 
+sequence "hsproperd" has already been loaded and this sequence is loaded
+again, the name of this second sequence is changed to "hsproperd#0".
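
The renaming scheme can be sketched as follows in Python. The function `unique_name` is hypothetical, and the choice of the lowest unused number for each further duplicate is an assumption; the manual only states that the number is unique and gives "hsproperd#0" as the first example.

```python
def unique_name(name, loaded):
    """Return a name that does not clash with any name in loaded.

    Hypothetical helper: appends '#number' using the lowest unused number.
    """
    if name not in loaded:
        return name
    number = 0
    while f"{name}#{number}" in loaded:
        number += 1
    return f"{name}#{number}"


loaded = {"hsproperd"}
second = unique_name("hsproperd", loaded)
loaded.add(second)
third = unique_name("hsproperd", loaded)
print(second, third)
```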
+
+If only one sequence has been loaded, the comparison functions will
+compare this sequence against itself.
+
+
+ at node SPIN-Simple search
+ at subsection Simple search
+ at cindex Load sequence: spin
+ at cindex Get sequence: spin
+ at cindex Simple search: spin
+ at cindex EMBOSS
+
+_picture(spin_simple_search,3.41667in)
+
+This allows the selection of personal files. The second 
+option button is an "Entry" / "Filename" selection menu and the 
+entry box next to this should be completed with either an entryname or 
+filename accordingly. If a personal file is selected, the filename 
+should be entered in the entrybox. The Browse buttons at the far right of
+the dialogue box either invoke a library browser or a file browser,
+_fpref(File Browser, File Browser, interface) depending on whether a sequence 
+library or personal file has been selected. 
+From the file browser multiple files can be entered by use of the Ctrl key 
+and mouse. Library access is only available
+via EMBOSS.
+
+ at node SPIN-Personal search
+ at subsection Extracting a sequence from a personal archive file
+ at cindex Personal search: spin
+
+_picture(spin_personal_search,2.09167in)
+
+This method invokes an archive browser. Enter the filename of the personal 
+file in the entrybox. The Browse
+button to the right will invoke a file browser, 
+_fpref(File Browser, File Browser, interface). If the file contains 
+multiple entries, these will be displayed in the list box. It is necessary to
+press "Enter" after entering the filename in order for the entries to be 
+displayed. Select the required entryname(s) and press OK. The selected entries
+should now have been loaded into spin.
+
+ at node SPIN-Sequence Manager
+ at section Sequence manager
+ at cindex Sequence manager: spin
+
+
+Spin allows more than two sequences to be available to the user. The sequence
+manager allows the user to perform operations on the sequences which have
+been loaded into spin. The same operations can also be invoked from the 
+"Sequences" menu. The sequence manager is invoked from the "File" menu.
+This command invokes a list box showing all the sequences
+which have been read into spin together with their ranges, lengths and whether
+they are DNA (D) or Protein (P).  The currently active horizontal and
+vertical sequences are marked with "H" and "V" respectively. In the 
+picture below, these are "hsproperd" and "mmproper".
+
+_picture(spin_seq_manager,4.725in)
+
+Clicking on the sequence name in the sequence manager with the right mouse 
+button invokes a pop-up menu containing operations which may be performed on 
+that sequence. The operations available depend on whether the sequence is DNA
+or protein.
+
+These options are described in greater detail below.
+
+ at menu
+* SPIN-Change Active Sequence::  Change the active sequence
+* SPIN-Set Range::               Set the range
+* SPIN-Copy::                    Copy the sequence
+* SPIN-Sequence Type::           Sequence type
+* SPIN-Complement Sequence::     Complement sequence
+* SPIN-Interconvert t and u::    Interconvert t and u
+* SPIN-Translate Sequence::      Translate sequence
+* SPIN-Scramble Sequence::       Scramble sequence
+* SPIN-Rotate Sequence::         Rotate sequence
+* SPIN-Save Sequence::           Save sequence
+* SPIN-Delete Sequence::         Delete sequence
+ at end menu
+
+ at node SPIN-Change Active Sequence
+ at subsection Change the active sequence
+ at cindex Active sequence: spin
+
+To change the currently active horizontal or vertical sequence, either use
+the "Sequences" menu or select the sequence from the Sequence manager.
+Then select 
+"Horizontal" or "Vertical" from the menu. This sequence will now be
+the active "Horizontal" or "Vertical" sequence.
+
+ at node SPIN-Set Range
+ at subsection Set the range
+ at cindex Set the range: spin
+ at cindex Range: spin
+
+If you are only interested in a particular region of a sequence, it is
+possible to specify the start and end positions of this region to create a
+new entry in the Sequence manager. The new sequence will have the same name
+as the parent, with the addition of a "_s" plus a unique number. The third
+entry in the picture above shows the range has been set from 100 to 1000,
+giving a total length of 901 bases for the sequence "hsproperd". 
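+
+The length quoted above follows from counting both endpoints: 1000 - 100 + 1
+= 901. A minimal Python sketch of extracting such a range, assuming 1-based
+inclusive positions (the function name set_range is hypothetical and is not
+part of spin):
+
```python
def set_range(sequence: str, start: int, end: int) -> str:
    """Return the bases from start to end, 1-based and both inclusive."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range must lie within the sequence")
    # Python slices are 0-based and exclude the end index.
    return sequence[start - 1:end]
```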
+
+ at node SPIN-Copy
+ at subsection Copy Sequence
+ at cindex Copy sequence: spin
+ at cindex Spin: copy sequence
+
+This option in the Sequences menu allows a sequence to be duplicated or
+copied.  This simply creates a new entry in the Sequence Manager. The
+user can select which sequence to copy and the start and end points of
+the segment to be copied.
+
+ at node SPIN-Sequence Type
+ at subsection Sequence type
+ at cindex Spin: sequence type (linear or circular)
+ at cindex circular sequences:spin
+
+This option allows the user to change the status of a sequence to become
+either circular or linear.
+
+ at node SPIN-Complement Sequence
+ at subsection Complement sequence
+ at cindex Complement sequence: spin
+
+This function will reverse and complement nucleic acid sequences.
+Select the "Complement" command from either the "Sequences" menu or the 
+sequence manager pop-up menu. A new sequence will be added to the sequence 
+manager list
+box with the same name as the parent but with "_c" appended to the end. The
+fourth entry in the picture above is the complemented sequence of "hsproperd".
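+
+For readers unfamiliar with the operation, reversing and complementing can
+be sketched in Python as follows. This is an illustration, not spin's code;
+the function name is hypothetical and characters other than a, c, g and t
+are assumed to be left unchanged.
+
```python
# Map each base to its Watson-Crick partner, preserving case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse the order of the sequence."""
    return dna.translate(_COMPLEMENT)[::-1]
```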
+
+ at node SPIN-Interconvert t and u
+ at subsection Interconvert t and u
+ at cindex Transcribe sequence: spin
+ at cindex Interconvert t and u: spin
+
+This function interconverts T and U characters, i.e. it converts between DNA
+and RNA. A new sequence is added to the sequence manager list box with the
+same name as the parent but with "_r" appended to the end. The fifth entry in
+the picture above is the transcribed sequence of "hsproperd".
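+
+A minimal Python sketch of the interconversion, assuming that every T
+becomes U and every U becomes T with case preserved (the function name is
+hypothetical):
+
```python
# Swap t with u in both cases; all other characters are untouched.
_T_U_SWAP = str.maketrans("tTuU", "uUtT")


def interconvert_t_and_u(sequence: str) -> str:
    """Convert a DNA sequence to RNA, or an RNA sequence to DNA."""
    return sequence.translate(_T_U_SWAP)
```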
+
+ at node SPIN-Translate Sequence
+ at subsection Translate sequence
+ at cindex Translate sequence: spin
+
+This operation is only available for DNA sequences. 
+Select the "Translate" command from either the "Sequences" menu or the 
+sequence manager pop-up menu. It is possible to 
+translate in any particular frame by selecting the appropriate check box. 
+For each translation, a new sequence will be added to the
+sequence manager list box with the same name as the parent but with the
+addition to the end of either
+"_rf1", "_rf2" or "_rf3" to signify reading frames 1, 2 or 3 respectively.
+
+The "all together" option will produce a single new sequence
+in the sequence manager, with the extension "_rf123", exemplified by the
+ninth entry in the picture above. Although at this point
+the sequence is still DNA, when it is used in a comparison function
+the program will translate it automatically into the three reading frames. The
+important point is that the results from the three reading frames will be
+superimposed in the plot, hence enabling frameshift errors to be spotted.
+
+For example, to compare a DNA sequence in all its reading frames with a protein:
+
+ at enumerate
+
+ at item Convert the DNA sequence using the "all together" command
+ at item Select this sequence as horizontal
+ at item Select the protein sequence as vertical
+ at item Invoke the relevant comparison function
+
+ at end enumerate
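+
+The translation into reading frames 1, 2 and 3 described above can be
+sketched in Python as below. This is an illustration only: it assumes the
+standard genetic code (spin allows the code to be changed), the function
+names are hypothetical, and codons containing unknown characters are shown
+as X.
+
```python
from itertools import product

# Standard genetic code, listed with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(dna: str, frame: int) -> str:
    """Translate one forward reading frame (1, 2 or 3); '*' marks a stop."""
    seq = dna.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame - 1, len(seq) - 2, 3)
    )


def translate_all_frames(name: str, dna: str) -> dict:
    """Return the translations keyed by the parent name plus _rf1 to _rf3."""
    return {f"{name}_rf{f}": translate_frame(dna, f) for f in (1, 2, 3)}
```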
+
+ at node SPIN-Scramble Sequence
+ at subsection Scramble sequence
+ at cindex Scramble sequence: spin
+
+This function produces a version of a given DNA or protein sequence in
+which the characters are randomly reordered, i.e. the new sequence has
+the same length and composition as the original but with the characters in
+a random order. The
+new sequence will be added to the sequence manager list box with the same
+name as the parent except with "_x" plus a unique number appended to the end. 
+The tenth entry
+in the picture above is the scrambled version of "hsproperd". For long
+sequences, scrambling and then comparing should produce numbers of matches
+similar to those predicted by the probability calculations
+(_fpref(SPIN-Probability Calculations, Probabilities and expected number of matches)). 
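+
+Scrambling in this sense is a random permutation of the characters. A
+minimal Python sketch follows; the function name and the optional seed
+argument are assumptions made for the example, and spin's own random number
+generator is not described here.
+
```python
import random


def scramble(sequence: str, seed=None) -> str:
    """Return the characters of the sequence in a random order.

    Length and composition are preserved; only the order changes.
    """
    chars = list(sequence)
    random.Random(seed).shuffle(chars)
    return "".join(chars)
```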
+
+ at node SPIN-Rotate Sequence
+ at subsection Rotate sequence
+ at cindex Rotate sequence: spin
+
+This function allows the user to specify a new origin for a sequence. A new
+sequence is added to the sequence manager list box with the same name as the
+parent except with "_o" plus a unique number appended to the end. The
+eleventh entry in the picture above is a rotated version of "mmproper".
+This operation is not allowed for sub-sequences, i.e. those created using
+"Set range".
+
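+Rotation can be pictured with the short Python sketch below. It assumes
+that the new origin is given as the 1-based position of the base that
+becomes the first base of the rotated sequence; this convention and the
+function name are assumptions for illustration, not taken from spin.
+
```python
def rotate(sequence: str, origin: int) -> str:
    """Return the sequence rewritten to start at the 1-based position origin."""
    if not 1 <= origin <= len(sequence):
        raise ValueError("origin must lie within the sequence")
    # Bases before the new origin are moved to the end.
    return sequence[origin - 1:] + sequence[:origin - 1]
```
+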
+
+ at node SPIN-Save Sequence
+ at subsection Save sequence
+ at cindex Save sequence: spin
+
+To save a sequence to a file, select the "Save" option from either the "File"
+menu, the "Sequences" menu or the sequence manager pop-up
+menu.  This command invokes a file name entry box. The Browse button to
+the right of the dialogue box invokes a file browser.
+_fxref(File Browser, File Browser, interface)
+The sequence is written in either EMBL or FASTA format. If EMBL is selected
+and the sequence has an associated feature table, the feature table will
+also be written out
+(_fpref(SPIN-Feature Tables, Use of feature tables in spin, t)).
+
+_picture(spin_save_sequence_d,3.01667in)
+
+
+ at node SPIN-Delete Sequence
+ at subsection Delete sequence
+ at cindex Delete sequence: spin
+
+To delete a sequence, select the "Delete" option from either the "Sequences"
+menu or the sequence manager pop-up menu. This
+command will remove the sequence from the sequence manager, together with all
+plots and results that were produced from it. 
+
+ at node SPIN-Selecting a sequence
+ at section Selecting a sequence
+ at cindex Selecting a sequence: spin
+ at cindex Seq identifier: spin
+
+All the comparison functions request a horizontal and a vertical sequence
+and the ranges over which the function will operate. It is therefore possible 
+to compare the same sequence over different ranges. The default sequences which
+appear when the dialogue box for the comparison function is brought up are
+the current "active" sequences  
+(_fpref(SPIN-Sequence Manager, Sequence manager)). 
+To select a different
+sequence for this invocation of the function, press the Browse button to
+the right of the "Seq identifier" box. This will invoke a Sequence manager if 
+one is not already displayed. Clicking with the left mouse button on the name 
+of the required sequence in the sequence manager will update the function 
+dialogue box with the sequence name and its currently defined range. To 
+change the range, enter the new start and end positions. These positions will 
+be remembered for future invocations of any of the comparison functions for
+this sequence.
+
+
diff --git a/manual/spin.texi b/manual/spin.texi
new file mode 100644
index 0000000..2cf642c
--- /dev/null
+++ b/manual/spin.texi
@@ -0,0 +1,41 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename spin.info
+ at settitle Spin
+ at setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title Spin
+ at subtitle 
+ at author 
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-spin
+ at end ifinfo
+
+ at raisesections
+_include(spin-t.texi)
+
+_split()
+ at node Index
+ at unnumberedsec Index
+ at printindex cp
+ at lowersections
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/spin_align_p.png b/manual/spin_align_p.png
new file mode 100644
index 0000000..f665ca7
Binary files /dev/null and b/manual/spin_align_p.png differ
diff --git a/manual/spin_align_p.small.png b/manual/spin_align_p.small.png
new file mode 100644
index 0000000..d16c532
Binary files /dev/null and b/manual/spin_align_p.small.png differ
diff --git a/manual/spin_align_seq.png b/manual/spin_align_seq.png
new file mode 100644
index 0000000..4055a24
Binary files /dev/null and b/manual/spin_align_seq.png differ
diff --git a/manual/spin_alignment_symbols.png b/manual/spin_alignment_symbols.png
new file mode 100644
index 0000000..2923495
Binary files /dev/null and b/manual/spin_alignment_symbols.png differ
diff --git a/manual/spin_author_d.png b/manual/spin_author_d.png
new file mode 100644
index 0000000..45377b5
Binary files /dev/null and b/manual/spin_author_d.png differ
diff --git a/manual/spin_author_p.png b/manual/spin_author_p.png
new file mode 100644
index 0000000..a49a5ac
Binary files /dev/null and b/manual/spin_author_p.png differ
diff --git a/manual/spin_author_p.small.png b/manual/spin_author_p.small.png
new file mode 100644
index 0000000..b7fc417
Binary files /dev/null and b/manual/spin_author_p.small.png differ
diff --git a/manual/spin_base_bias_d.png b/manual/spin_base_bias_d.png
new file mode 100644
index 0000000..9adf061
Binary files /dev/null and b/manual/spin_base_bias_d.png differ
diff --git a/manual/spin_base_bias_p.png b/manual/spin_base_bias_p.png
new file mode 100644
index 0000000..5992fdd
Binary files /dev/null and b/manual/spin_base_bias_p.png differ
diff --git a/manual/spin_base_bias_p.small.png b/manual/spin_base_bias_p.small.png
new file mode 100644
index 0000000..2c086e2
Binary files /dev/null and b/manual/spin_base_bias_p.small.png differ
diff --git a/manual/spin_codon_usage.png b/manual/spin_codon_usage.png
new file mode 100644
index 0000000..e724493
Binary files /dev/null and b/manual/spin_codon_usage.png differ
diff --git a/manual/spin_codon_usage.small.png b/manual/spin_codon_usage.small.png
new file mode 100644
index 0000000..b059513
Binary files /dev/null and b/manual/spin_codon_usage.small.png differ
diff --git a/manual/spin_codon_usage_aaonly.png b/manual/spin_codon_usage_aaonly.png
new file mode 100644
index 0000000..63bd04c
Binary files /dev/null and b/manual/spin_codon_usage_aaonly.png differ
diff --git a/manual/spin_codon_usage_aaonly.small.png b/manual/spin_codon_usage_aaonly.small.png
new file mode 100644
index 0000000..e2a5848
Binary files /dev/null and b/manual/spin_codon_usage_aaonly.small.png differ
diff --git a/manual/spin_codon_usage_dial.png b/manual/spin_codon_usage_dial.png
new file mode 100644
index 0000000..29e0da2
Binary files /dev/null and b/manual/spin_codon_usage_dial.png differ
diff --git a/manual/spin_count_codons_d.png b/manual/spin_count_codons_d.png
new file mode 100644
index 0000000..050de9c
Binary files /dev/null and b/manual/spin_count_codons_d.png differ
diff --git a/manual/spin_count_codons_t.png b/manual/spin_count_codons_t.png
new file mode 100644
index 0000000..160e2e8
Binary files /dev/null and b/manual/spin_count_codons_t.png differ
diff --git a/manual/spin_count_codons_t.small.png b/manual/spin_count_codons_t.small.png
new file mode 100644
index 0000000..4d117cc
Binary files /dev/null and b/manual/spin_count_codons_t.small.png differ
diff --git a/manual/spin_diagonals.png b/manual/spin_diagonals.png
new file mode 100644
index 0000000..9006b63
Binary files /dev/null and b/manual/spin_diagonals.png differ
diff --git a/manual/spin_dot_plot.png b/manual/spin_dot_plot.png
new file mode 100644
index 0000000..8b813fc
Binary files /dev/null and b/manual/spin_dot_plot.png differ
diff --git a/manual/spin_dot_plot.small.png b/manual/spin_dot_plot.small.png
new file mode 100644
index 0000000..7ab5f15
Binary files /dev/null and b/manual/spin_dot_plot.small.png differ
diff --git a/manual/spin_find_orf_d.png b/manual/spin_find_orf_d.png
new file mode 100644
index 0000000..f422548
Binary files /dev/null and b/manual/spin_find_orf_d.png differ
diff --git a/manual/spin_local_align.png b/manual/spin_local_align.png
new file mode 100644
index 0000000..d96ccbc
Binary files /dev/null and b/manual/spin_local_align.png differ
diff --git a/manual/spin_local_p1.png b/manual/spin_local_p1.png
new file mode 100644
index 0000000..47e03a6
Binary files /dev/null and b/manual/spin_local_p1.png differ
diff --git a/manual/spin_local_p1.small.png b/manual/spin_local_p1.small.png
new file mode 100644
index 0000000..e15eede
Binary files /dev/null and b/manual/spin_local_p1.small.png differ
diff --git a/manual/spin_local_p2.png b/manual/spin_local_p2.png
new file mode 100644
index 0000000..42e6e5a
Binary files /dev/null and b/manual/spin_local_p2.png differ
diff --git a/manual/spin_local_p2.small.png b/manual/spin_local_p2.small.png
new file mode 100644
index 0000000..f45973d
Binary files /dev/null and b/manual/spin_local_p2.small.png differ
diff --git a/manual/spin_match_words.png b/manual/spin_match_words.png
new file mode 100644
index 0000000..3fd200c
Binary files /dev/null and b/manual/spin_match_words.png differ
diff --git a/manual/spin_mini-t.texi b/manual/spin_mini-t.texi
new file mode 100644
index 0000000..5fe1489
--- /dev/null
+++ b/manual/spin_mini-t.texi
@@ -0,0 +1,364 @@
+ at cindex Spin
+
+_split()
+ at node SPIN-Introduction
+ at chapter Introduction
+
+Spin is an interactive and graphical 
+program for analysing and comparing sequences. It contains functions to
+search for restriction sites, consensus sequences/motifs and protein 
+coding regions, to analyse the composition of the sequence and to
+translate DNA to protein. 
+It also contains functions for locating segments of similarity within and
+between sequences, and for finding global and local alignments between pairs of
+sequences. 
+To help assess the statistical significance of comparisons the program can
+calculate tables of expected and observed score frequencies for each
+score level.
+Most analytical functions which operate on single sequences
+add their graphical results to a "SPIN Sequence Plot" that is associated 
+with the sequence being analysed. (An exception is the restriction enzyme 
+search which produces its own separate window.) Most functions which compare
+pairs of sequences add their results to a "SPIN Sequence Comparison Plot".
+The SPIN Sequence Plot and the SPIN Sequence Comparison Plot each have
+associated sequence display windows: the Sequence Display and the 
+Sequence Comparison Display. These allow the text of the sequences to be
+viewed and use cursors to show the corresponding positions in the graphical
+displays.
+The graphical plots can be zoomed, and cursors or crosshairs can be
+used to locate the positions of the individual results. Plots can be 
+superimposed.
+
+ at node SPIN-Intro-Functions
+ at section Summary of the Spin Single Sequence Functions
+
+Spin's main single sequence analytical functions are accessed via the 
+Statistics,
+Translation and Search menus.
+The "Statistics" menu contains options to count and plot the base composition 
+(_fpref(SPIN-Plot-Base-Composition, Plot Base Composition))
+and also to count the dinucleotide frequencies
+(_fpref(SPIN-Dinucleotide-Freq, Dinucleotide Frequencies)).
+
+The "Translation" menu contains options to set the genetic code
+(_fpref(SPIN-Set-Genetic-Code, Set Genetic Code)), translate to protein
+(_fpref(SPIN-Translation-General, Translation)),
+find open reading frames and write the results in either feature
+table format or as fasta format protein sequence files
+(_fpref(SPIN-Open-Reading-Frames, Find Open Reading Frames)), and to calculate 
+codon tables.
+
+The "Search" menu contains a variety of different searching techniques.
+"Protein genes" has four methods for finding protein genes
+(_fpref(SPIN-Codon-Usage-Method, Codon Usage Method)),
+(_fpref(SPIN-Author-Test, Author Test)),
+(_fpref(SPIN-Positional-Base-Prefs, Positional base Preferences))
+(accessed as a subcomponent of the Codon Usage Method), and
+(_fpref(SPIN-Uneven-Positional-Base-Freqs, Uneven Positional base
+Frequencies)). There is also a method to search for tRNA genes
+(_fpref(SPIN-TRNA-Search, tRNA Search)).
+It is also possible to perform subsequence or string searches 
+(_fpref(SPIN-String-Search, String search)) and restriction enzyme searches
+(_fpref(SPIN-Restrict-Introduction, Restriction enzyme search)). 
+There are searches for start 
+(_fpref(SPIN-Start-Codon-Search, Start Codon Search)) and stop codons 
+(_fpref(SPIN-Stop-Codon-Search, Stop Codon Search)), splice junction
+searches (_fpref(SPIN-Splice-Site-Search, Splice Site Search)), and
+general motif searches using weight matrices (_fpref(SPIN-Weight-Matrix-Search, 
+Motif Search)).
+
+_split()
+ at node SPIN-Intro-Comparison-Functions
+ at section Summary of the Spin Comparison Functions
+
+This section outlines the functions obtained from the Comparison menu.
+All produce graphical and textual output. 
+Using a score matrix, the "Find similar spans"
+function compares every segment of one sequence with all those of the
+other and reports those that reach a user defined score. The
+segments are of a fixed length (span) set by the user
+(_fpref(SPIN-Find similar spans, Finding Similar Spans)).
+To look for short matching segments of any length, and allowing gaps, a
+local dynamic programming routine can be used
+(_fpref(SPIN-Local alignment, Aligning Sequences Locally)).
+The fastest routine for locating segments of similarity (and generally
+only suitable for DNA sequences) finds all identical subsequences (or words)
+(_fpref(SPIN-Find matching words, Finding Matching Words)).
+For a quick global comparison of sequences using a combination of the
+Matching Words and Matching Spans algorithms the "Find best
+diagonals" algorithm can be used
+(_fpref(SPIN-Find Best Diagonals, Finding the Best Diagonals)).
+Global alignments can be produced and plotted using a dynamic programming
+algorithm 
+(_fpref(SPIN-Align Sequences, Aligning Sequences Globally)).
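+
+To make the idea behind "Find matching words" concrete, the following
+Python sketch finds all identical words of a fixed length in two sequences
+by indexing one of them. It is an illustration under stated assumptions, not
+spin's implementation: the function name, the 1-based reporting of positions
+and the dictionary-based index are all choices made for this example.
+
```python
from collections import defaultdict


def find_matching_words(horizontal: str, vertical: str, word_length: int):
    """Return (h_pos, v_pos) pairs, 1-based, where identical words start."""
    # Index every word of the vertical sequence by its start positions.
    positions = defaultdict(list)
    for j in range(len(vertical) - word_length + 1):
        positions[vertical[j:j + word_length]].append(j)

    # Look up each word of the horizontal sequence in the index.
    matches = []
    for i in range(len(horizontal) - word_length + 1):
        for j in positions.get(horizontal[i:i + word_length], ()):
            matches.append((i + 1, j + 1))
    return matches
```
+
+Each pair returned corresponds to one dot in a comparison plot. Building the
+index once avoids comparing every word with every other word, which is one
+reason an approach of this kind can be fast.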
+
+ at page
+_split()
+ at node SPIN-Intro-Interface
+ at section Introduction to the Spin User Interface
+
+ at menu
+* SPIN-Intro-Interface-plot::                          The SPIN Sequence Plot
+* SPIN-Intro-Interface-seq::                      The SPIN Sequence Display
+* SPIN-Intro-Interface-comparison-plot::         The SPIN Sequence Comparison Plot
+* SPIN-Intro-Interface-sequence-comparison-display::           The SPIN Sequence Comparison Display
+ at end menu
+
+Spin has several main displays. The first is
+a top level window from which all the main options are selected and
+which receives textual results. 
+Most analytical functions which operate on single sequences
+add their graphical results to a "SPIN Sequence Plot" that is associated 
+with the sequence being analysed. (An exception is the restriction enzyme 
+search which produces its own separate window.) Most functions which compare
+pairs of sequences add their results to a "SPIN Sequence Comparison Plot".
+The SPIN Sequence Plot and the SPIN Sequence Comparison Plot each have
+associated sequence display windows: the Sequence Display and the 
+Sequence Comparison Display. These allow the text of the sequences to be
+viewed and use cursors to show the corresponding positions in the graphical
+displays.
+
+Spin is
+best operated using a three button mouse, but alternative keybindings
+are available. Full details of the user interface
+are described elsewhere
+(_fpref(UI-Introduction, User Interface, t)), and here we give an
+introduction based around a series of screenshots.
+
+The main window (shown below) contains an Output Window for
+textual results, an Error window for error messages, and a series of
+menus arranged along the top
+(_fpref(SPIN-Intro-Menus, Spin menus,t)).
+The contents of the two text windows can
+be searched, edited and saved. Each set of results is preceded by
+a header containing the time and date when it was generated.
+
+_lpicture(spin_translate_t,3.01667in)
+
+As can be seen 
+the main menu bar contains File, View, Options, Sequences, Statistics,
+Translation, Comparison, Search and Emboss menus.
+In general most functions add their graphical results to a 
+"SPIN Sequence Plot", but those obtained from the Comparison menu add
+their results to a "SPIN Sequence Comparison Plot".
+
+ at page
+_split()
+ at node SPIN-Intro-Interface-plot
+ at subsection Introduction to the Spin Plot
+
+Most of the spin functions display their results in a 
+two-dimensional plot called a "spin plot" (_fpref(SPIN-Spin-Plot, Spin plot)).
+Sets of matches from a single invocation of a
+function are termed "a result".  Each result is plotted using
+a single colour which can be configured via the results manager
+(_fpref(SPIN-Result-Manager, Result manager)). 
+
+The figure below shows a spin plot window containing the results of
+a gene search method based on codon usage, superimposed on a search for
+stop codons
+(_fpref(SPIN-Codon-Usage-Method, Codon Usage Method)). 
+Each plot window contains a cross hair. Its x position is shown
+in sequence base numbers in the left hand box above the plot, and the y
+coordinate, expressed using the score values of the gene search, is
+shown in the right hand box.
+
+_lpicture(spin_plot_p,6in)
+
+At the right hand side of each panel is a set of square boxes with the
+same colours as the lines drawn in the adjacent plot. These
+icon-like objects represent individual results and allow the user to 
+operate on them. For example, at the right of the middle panel is a
+pop-up menu containing the items: "Information", "List results",
+"Configure", "Hide" and "Remove"
+(_fpref(SPIN-Result-Manager, Result manager)).
+
+These icons can also be used to
+drag and drop the results to which they correspond. This is activated by
+pressing the middle mouse button, or Alt left mouse button, 
+over the box and then moving the cursor
+over the spin plot to the new location or anywhere outside the spin plot
+(_fpref(SPIN-DRAG, Drag and drop)).
+
+Each spin plot window also contains a cursor that denotes the position of
+the cursor in the Sequence display window 
+(_fpref(SPIN-Sequence-Display, Spin Sequence Display)).
+The user can move a cursor by clicking and dragging 
+with the middle mouse button, or Alt left mouse button.
+This will move the cursor in the sequence display and all other cursors displayed
+that relate to the sequence.
+
+The graphical results can be zoomed and scrolled in both x and y
+directions. Zooming is achieved using the X and Y scale bars at the top
+left hand corner of the plot. The individual plots can be scrolled in y
+using the scroll bars attached to their right hand edge. The sequence
+can be scrolled using the scroll bar at the base of the plot.
+
+To illustrate further uses of the program we include some more screen
+dumps below.
+
+_lpicture(spin_restrict_enzymes_p,6in)
+
+The figure above shows the results of a search for restriction enzymes
+(_fpref(SPIN-Restrict-Introduction, Restriction enzyme search)).
+
+_lpicture(spin_plot_base_comp_p,6in)
+
+The figure above is a plot of the base composition of a sequence.
+
+_lpicture(spin_weight_matrix,6in)
+
+The figure above shows the way in which the results of
+weight matrix searches for motifs are plotted
+(_fpref(SPIN-Weight-Matrix-Search, Motif Search)).
+
+_lpicture(spin_splice,6in)
+
+
+The figure above shows the way in which the results of
+searches for splice junctions are plotted. The donor and acceptor
+predictions are separated and a different colour is used for each
+reading frame
+(_fpref(SPIN-Splice-Site-Search, Splice Site Search)).
+
+_lpicture(spin_base_bias_p,6in)
+
+The figure above shows a method for finding protein coding regions which
+does not distinguish reading frame or strand
+(_fpref(SPIN-Uneven-Positional-Base-Freqs, Uneven Positional base
+Frequencies)).
+
+
+_lpicture(spin_trna_t,5.13333in)
+
+The figure above shows how results from the tRNA gene search function
+are displayed in the Output window
+(_fpref(SPIN-TRNA-Search, tRNA Search)).
+
+
+_split()
+ at node SPIN-Intro-Interface-seq
+ at subsection Introduction to the Spin Sequence Display
+
+Spin also has a sequence display window in which the user can view the
+sequence in
+textual form. This window allows the user to scroll along the sequence.
+Users can view one or both strands,
+can switch on displays of the encoded amino acids in up to six reading
+frames, can switch on a display of the restriction enzyme sites, and can
+perform other simple subsequence or string searches to locate features in the
+sequence. In the figure shown below the user has switched on a three
+phase translation on the top strand, double stranded sequence, and a
+restriction enzyme search.
+
+_lpicture(spin_sequence_display_t,6in)
+
+The sequence cursor can be under the control of the graphics
+cursor i.e. the cursor in the sequence viewer can be moved by the user
+dragging the cursor in the graphics window. Similarly the cursor in the
+graphics plots can be moved by the sequence viewer cursor.
+
+
+ at page
+_split()
+ at node SPIN-Intro-Interface-comparison-plot
+ at subsection Introduction to the Spin Sequence Comparison Plot
+
+All of the spin comparison 
+functions display their results as points or lines in a 
+two-dimensional plot called a "Spin Sequence Comparison Plot" (_fpref(SPIN-SPIN Sequence Comparison Plot, Spin Sequence Comparison Plot)).
+Sets of matches from a single invocation of a
+comparison command are termed "a result".  Each result is plotted using
+a single colour which can be configured via the results manager
+(_fpref(SPIN-Result-Manager, Result manager)). 
+
+_lpicture(spin_dot_plot,5.88333in)
+
+The diagram above shows the results of a "Find similar spans" search (olive)
+(_fpref(SPIN-Find similar spans, Finding Similar Spans)), and a "Find
+matching words" search
+(red) (_fpref(SPIN-Find matching words, Finding Matching Words)).
+
+At the right hand side is a set of square boxes with the
+same colours as the dots drawn in the adjacent plot. These
+icon-like objects represent individual results and allow the user to 
+operate on them. For example, the pop-up menu shown beneath the "matching
+words" result, brought up by clicking with the right mouse button, contains
+the results menu for this result
+(_fpref(SPIN-Result-Manager, Result manager)).
+These icons can also be used to
+drag and drop the results to which they correspond. This is activated by
+pressing the middle mouse button, or Alt left mouse button, 
+over the box and then moving the cursor
+over the Spin Sequence Comparison Plot to the new location or anywhere outside the Spin Sequence Comparison Plot
+(_fpref(SPIN-Drag, Drag and drop)).
+
+Crosshairs can be turned on or off using the check button labelled
+"crosshairs". The x and y positions of the crosshairs are indicated in
+the two boxes to the right of the check box.
+
+Each sequence displayed in a Spin Sequence Comparison Plot will have a cursor 
+of a particular colour. In the picture above, the sequence on the horizontal
+axis has a vertical blue cursor whereas the sequence on the vertical axis has 
+a horizontal olive green cursor. In general, 
+the same sequence displayed in several Spin Sequence Comparison Plots will have
+a cursor of the same colour.
+The user can move a cursor by clicking and dragging 
+with the middle mouse button, or with Alt left mouse button.
+This will move all other
+cursors displayed that relate to the sequence, whether they are in
+different Spin Sequence Comparison Plots or within the sequence display.
+
+Plots can be enlarged either by resizing the window or zooming. Zooming
+is achieved by holding down the control key and right mouse button and
+dragging out a rectangle. This process can be repeated.
+The Back button will restore the plot to the previous magnification. 
+
+To illustrate further uses of the program we include some more screen
+dumps below.
+
+
+_lpicture(spin_plot,5.74167in)
+
+The picture above shows the results after performing a "Find similar spans"
+comparison between the three reading frames of two DNA sequences, producing 
+nine superimposed sets of results.
+
+ at page
+_lpicture(spin_local_p1,5.31667in)
+
+Local alignment searches join similar segments with lines. The above
+screen dump shows such an analysis in which 
+genomic DNA containing 7 exons is compared to
+its corresponding cDNA.
+
+
+_lpicture(spin_align_p,5.31667in)
+
+The above screen dump shows a global alignment of the same pair of
+sequences.
+
+_split()
+ at node SPIN-Intro-Interface-sequence-comparison-display
+ at subsection Introduction to the Spin Sequence Comparison Display
+
+A sequence comparison display is associated with a single set of results 
+and can be invoked by bringing up a pop-up
+menu for the required result, either from the Result manager
+(_fpref(SPIN-Result-Manager, Result manager)), the Results
+menu in the Spin Sequence Comparison Plot, or the coloured square icon on the 
+right of the plot.
+
+_lpicture(spin_seq_display,6in)
+
+The horizontal sequence is drawn above the vertical sequence and the 
+central panel indicates characters which are
+identical. The buttons (< >) and (<< >>) scroll the sequences.
+Pressing the Lock
+button forces the sequences to scroll together.
+Movement of the sequences is also controlled by the scrollbars or by
+moving the corresponding cursor in the Spin Sequence Comparison Plot.
+The black cursors in the sequence display 
+correspond to the position of the cursor in the Spin Sequence Comparison Plot. The sequences can be
+made to 'jump' to the nearest match in those results by pressing the 
+"Nearest match" or "Nearest dot" buttons.
diff --git a/manual/spin_org-t.texi b/manual/spin_org-t.texi
new file mode 100644
index 0000000..31b6112
--- /dev/null
+++ b/manual/spin_org-t.texi
@@ -0,0 +1,28 @@
+ at node SPIN_ORG
+ at chapter Organisation of the Spin Manual
+
+The Introductory section of the manual gives an overview of the
+functions
+(_fpref(SPIN-Intro-Functions, Summary of the Spin Functions, t)),
+the menus
+(_fpref(SPIN-Intro-Menus, Spin Menus, t))
+and the user interface
+(_fpref(SPIN-Intro-Interface, Introduction to the Spin User Interface,
+t)). The Introduction to the user interface includes a range of screen
+dumps which give an overview of what spin can do, and taken as a whole,
+the introduction should contain sufficient information 
+to enable users to start using the program. 
+
+The next section 
+describes in turn each of the main functions
+(_fpref(SPIN-Functions, The Spin Functions, t)).
+This is followed by a detailed description of the spin user interface
+(_fpref(SPIN-USER-Interface, The Spin User Interface, t)).
+Next is a section describing how users can control the results from
+the functions, and how they can be manipulated once they have
+been obtained
+(_fpref(SPIN-Results, Controlling and Managing Results, t)).
+The final part of the manual describes how to read sequences into spin
+and the kinds of manipulations which can be performed on them to prepare
+them for analysis
+(_fpref(Reading and Managing Sequences, Reading and Managing Sequences, t)).
diff --git a/manual/spin_personal_search.png b/manual/spin_personal_search.png
new file mode 100644
index 0000000..649968b
Binary files /dev/null and b/manual/spin_personal_search.png differ
diff --git a/manual/spin_plot.png b/manual/spin_plot.png
new file mode 100644
index 0000000..3a42e2f
Binary files /dev/null and b/manual/spin_plot.png differ
diff --git a/manual/spin_plot.small.png b/manual/spin_plot.small.png
new file mode 100644
index 0000000..dbd069a
Binary files /dev/null and b/manual/spin_plot.small.png differ
diff --git a/manual/spin_plot_base_comp_d.png b/manual/spin_plot_base_comp_d.png
new file mode 100644
index 0000000..e87d738
Binary files /dev/null and b/manual/spin_plot_base_comp_d.png differ
diff --git a/manual/spin_plot_base_comp_p.png b/manual/spin_plot_base_comp_p.png
new file mode 100644
index 0000000..c832ebf
Binary files /dev/null and b/manual/spin_plot_base_comp_p.png differ
diff --git a/manual/spin_plot_base_comp_p.small.png b/manual/spin_plot_base_comp_p.small.png
new file mode 100644
index 0000000..e6db895
Binary files /dev/null and b/manual/spin_plot_base_comp_p.small.png differ
diff --git a/manual/spin_plot_drag1.png b/manual/spin_plot_drag1.png
new file mode 100644
index 0000000..1c2fa82
Binary files /dev/null and b/manual/spin_plot_drag1.png differ
diff --git a/manual/spin_plot_drag1.small.png b/manual/spin_plot_drag1.small.png
new file mode 100644
index 0000000..b35ea6c
Binary files /dev/null and b/manual/spin_plot_drag1.small.png differ
diff --git a/manual/spin_plot_drag2.png b/manual/spin_plot_drag2.png
new file mode 100644
index 0000000..d7ca0b4
Binary files /dev/null and b/manual/spin_plot_drag2.png differ
diff --git a/manual/spin_plot_drag2.small.png b/manual/spin_plot_drag2.small.png
new file mode 100644
index 0000000..c7bc695
Binary files /dev/null and b/manual/spin_plot_drag2.small.png differ
diff --git a/manual/spin_plot_drag3.png b/manual/spin_plot_drag3.png
new file mode 100644
index 0000000..17e19fc
Binary files /dev/null and b/manual/spin_plot_drag3.png differ
diff --git a/manual/spin_plot_drag3.small.png b/manual/spin_plot_drag3.small.png
new file mode 100644
index 0000000..c101b41
Binary files /dev/null and b/manual/spin_plot_drag3.small.png differ
diff --git a/manual/spin_plot_p.png b/manual/spin_plot_p.png
new file mode 100644
index 0000000..3be3123
Binary files /dev/null and b/manual/spin_plot_p.png differ
diff --git a/manual/spin_plot_p.small.png b/manual/spin_plot_p.small.png
new file mode 100644
index 0000000..3395f21
Binary files /dev/null and b/manual/spin_plot_p.small.png differ
diff --git a/manual/spin_restrict_enzymes-t.texi b/manual/spin_restrict_enzymes-t.texi
new file mode 100644
index 0000000..6b5332d
--- /dev/null
+++ b/manual/spin_restrict_enzymes-t.texi
@@ -0,0 +1,191 @@
+_split()
+ at node SPIN-Restrict-Introduction
+ at section Restriction enzyme search
+ at menu
+* SPIN-Restrict-Selecting::          Selecting enzymes
+* SPIN-Restrict-Examining::          Examining the plot
+* SPIN-Restrict-Reconfig::           Reconfiguring the plot
+ at end menu
+
+_split()
+ at cindex Restriction enzymes: introduction: spin
+
+The restriction enzyme map function finds and displays the restriction sites
+within a specified region of a sequence. Users can select the enzyme
+types to search for.
+
+_lpicture(spin_restrict_enzymes_p,6in)
+
+This figure shows a typical view of the Restriction Enzyme Map function
+in which the results for most enzyme types are shown as black vertical
+lines opposite the enzyme names, but in which some of the enzyme sites
+have been configured by the user to be drawn in different colours.
+If no
+result is found for a particular enzyme (here, for example, AccIII), the line
+is still drawn so that zero cutters can be identified. 
+The results
+can be scrolled vertically (and horizontally if the plot is zoomed in).
+A ruler is shown along the base and the current cursor (the vertical
+black line) position is shown in the left hand box near the top right of
+the display.  If the user clicks, in turn, on two restriction sites
+their separation in base pairs will appear in the top right hand box.
+Information about the last site touched is shown in the Information line
+at the bottom of the display.
+
+_split()
+ at node SPIN-Restrict-Selecting
+ at subsection Selecting Enzymes
+ at cindex Restriction enzymes: selecting enzymes: spin
+
+Restriction enzyme names and their cut sites are stored in disk
+files. For the format of these files see 
+_fref(Formats-Restriction, Restriction enzyme files, restriction_enzymes).
+Standard four-cutter, six-cutter and all-enzymes files are available, and
+users can also use their own "personal" files.  To create your own file
+of enzymes you may need to extract the information from the currently
+defined files. These are pointed to by the @code{RENZYM.4},
+ at code{RENZYM.6} and @code{RENZYM.ALL} environment variables.
+
+_picture(spin_restrict_enzymes_d,3.01667in)
+
+When the file is read the list of enzymes is displayed in a scrolling
+window.  To select enzymes press and drag the left mouse button within
+the list.  Dragging the mouse off the bottom of the list will scroll to
+allow selection of a range larger than the displayed section of the
+list.  When the left button is pressed any existing selection is
+cleared. To select several disjoint entries in the list press control
+and the left mouse button. Once the enzymes have been chosen, pressing
+OK will create the plot.
+
+_split()
+ at node SPIN-Restrict-Examining
+ at subsection Examining the Plot
+ at cindex Restriction enzymes: examining the plot: spin
+
+Positioning the cursor over a match will cause its name and cut position
+to appear in the information line.  If the right mouse button is pressed
+over a match, a popup menu containing Information and Configure will
+appear. The Information function in this menu will display the data for
+this cut site and enzyme in the output window.
+
+It is possible to find the distance between any two cut sites.  Pressing
+the left mouse button on a match will display "Select another cut" at
+the bottom of the window.  Then, pressing the left button on another
+match will display the distance, in bases, between the two sites. This
+is shown in a box located at the top right corner of the window.
+
+_split()
+ at node SPIN-Restrict-Reconfig
+ at subsection Reconfiguring the Plot
+ at cindex Configure: restriction enzymes: spin
+ at cindex Restriction enzymes: configuring: spin
+
+The plot displays the results for each restriction enzyme on a separate
+line.  Enzymes with no sites are also shown.  The order of these lines
+may be changed by pressing and dragging the middle mouse button, or Alt
+left mouse button, on one
+of the displayed names at the left side of the screen. For example the
+figure below shows the results seen above but after the coloured 
+(i.e. non-black) rows of sites have been dragged and dropped to be
+vertically adjacent.
+
+_lpicture(spin_restrict_enzymes_p1,6in)
+
+The results are plotted as black lines but users can select colours for
+each enzyme type by pressing the right button on any of its matches.  A
+menu containing Information and Configure will pop up. Configure will
+display a colour selection dialogue.  Adjusting the colour here will
+adjust the colour for all matches found with this restriction enzyme.
+
+_split()
+ at node SPIN-Restrict-Printing
+ at subsection Printing the sites
+ at cindex Restriction site printing:spin
+ at cindex Restriction site listing:spin
+
+From the Result manager (_fpref(SPIN-Result-Manager, Result manager))
+menu, a pop-up menu for restriction site results can be used to write
+the results in two forms to the Output window, from where the
+results can be saved to a file. The two choices of format are "Output
+enzyme by enzyme" and "Output ordered on position", brief examples of
+which are shown below. The output also appears in an "Information" window.
+Note that these listings are also available from
+gap4. 
+
+The first example shows restriction enzyme results output "enzyme by enzyme".
+The enzyme
+sites are numbered and named and the actual cut site from the sequence
+is written, followed by the position of the cut, the fragment size, and
+finally a sorted list of fragment sizes. 
+A list of zero cutters is written underneath.
+
+ at example
+
+  Matches found=     1 
+      Name            Sequence                 Position Fragment lengths
+    1 ApaLI           G'TGCAC                      3506   3505   3505 
+                                                          4629   4629 
+  Matches found=     8                         
+      Name            Sequence                 Position Fragment lengths
+    1 ApoI            A'AATTC                      1939   1938    184 
+    2 ApoI            G'AATTT                      2632    693    339 
+    3 ApoI            A'AATTT                      2996    364    364 
+    4 ApoI            G'AATTC                      3180    184    419 
+    5 ApoI            A'AATTT                      5283   2103    639 
+    6 ApoI            G'AATTC                      5702    419    693 
+    7 ApoI            A'AATTC                      6341    639   1455 
+    8 ApoI            A'AATTC                      7796   1455   1938 
+                                                           339   2103 
+  Matches found=     2                         
+      Name            Sequence                 Position Fragment lengths
+    1 AseI            AT'TAAT                      1790   1789    435 
+    2 AseI            AT'TAAT                      2225    435   1789 
+                                                          5910   5910 
+
+Zero cutters:
+      Acc65I
+      AccIII
+      AclNI
+      AhdI
+      ApaI
+      AscI
+      Asp700I
+      Asp718I
+      AspEI
+      AsuNHI
+      AvrII
+ at end example
+
+
+The second example shows the results output ordered on position. The enzyme
+sites are numbered and named and the actual cut site from the sequence
+is written, followed by the position of the cut, the fragment size, and
+finally a sorted list of cut sizes.
+
+ at example
+
+============================================================
+Wed 19 Nov 15:42:38 1997: Restriction enzymes result list
+------------------------------------------------------------
+Sequence /nfs/skye/home10/rs/work/doc/spin/atpase.dat
+Number of enzymes = 80
+Number of matches = 597
+      Name            Sequence                 Position Fragment lengths
+    1 AspLEI          GCG'C                         157    156      0 
+    2 AccII           CG'CG                         313    156      0 
+    3 AspLEI          GCG'C                         313      0      0 
+    4 AviII           TGC'GCA                       322      9      0 
+    5 AspLEI          GCG'C                         323      1      0 
+    6 AsuHPI          'CGCTTTATCACC                 342     19      0 
+    7 AflIII          A'CGCGT                       362     20      0 
+    8 AccII           CG'CG                         364      2      0 
+    9 BcgI            'AACAGGGTTAGCAGAAAAGTCG       389     25      0 
+   10 BcgI            GCAGAAAAGTCGCAATTGTATGCA'     423     34      0 
+   11 AsuHPI          'CATTTATTCACC                 440     17      0 
+   12 AspLEI          GCG'C                         486     46      0 
+   13 AciI            C'CGC                         502     16      0 
+   14 AciI            G'CGG                         552     50      0 
+   15 AccII           CG'CG                         552      0      0 
+   16 AclI            AA'CGTT                       614     62      0 
+
+ at end example
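
Commentary on manual/spin_restrict_enzymes-t.texi (placed between file diffs, not part of the patch): the "Fragment lengths" columns in the two listings above can be reproduced with simple arithmetic on the cut positions. The sketch below is an illustration only, not spin's code. It assumes a linear sequence and that a reported position p means the cut falls immediately before base p (1-based). The function names are hypothetical, and the sequence length of 8134 is inferred from the listing (3505 + 4629) rather than stated in the manual.

```python
def fragment_lengths(cut_positions, seq_length):
    """Return fragment lengths, in sequence order, for a linear sequence.

    Assumption: a cut at position p (1-based) falls immediately before
    base p, so the first fragment covers bases 1..p-1.
    """
    cuts = sorted(cut_positions)
    if any(p < 1 or p > seq_length for p in cuts):
        raise ValueError("cut position outside the sequence")
    # Fragment boundaries: start of sequence, each cut, one past the end.
    bounds = [1] + cuts + [seq_length + 1]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def site_separations(positions):
    """Distance from each site to the previous one, as in the
    position-ordered listing (the first entry is measured from base 1)."""
    ordered = sorted(positions)
    previous = [1] + ordered[:-1]
    return [pos - prev for pos, prev in zip(ordered, previous)]


# ApoI cut positions copied from the "enzyme by enzyme" listing.
apoi = [1939, 2632, 2996, 3180, 5283, 5702, 6341, 7796]
in_order = fragment_lengths(apoi, 8134)   # third column of the listing
by_size = sorted(in_order)                # fourth column of the listing
```

Sorting the in-order lengths gives the final column, which is why each enzyme block carries one more fragment line than it has cut sites.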
diff --git a/manual/spin_restrict_enzymes_d.png b/manual/spin_restrict_enzymes_d.png
new file mode 100644
index 0000000..6d26958
Binary files /dev/null and b/manual/spin_restrict_enzymes_d.png differ
diff --git a/manual/spin_restrict_enzymes_p.png b/manual/spin_restrict_enzymes_p.png
new file mode 100644
index 0000000..0c7155a
Binary files /dev/null and b/manual/spin_restrict_enzymes_p.png differ
diff --git a/manual/spin_restrict_enzymes_p.small.png b/manual/spin_restrict_enzymes_p.small.png
new file mode 100644
index 0000000..3d5e86b
Binary files /dev/null and b/manual/spin_restrict_enzymes_p.small.png differ
diff --git a/manual/spin_restrict_enzymes_p1.png b/manual/spin_restrict_enzymes_p1.png
new file mode 100644
index 0000000..529a274
Binary files /dev/null and b/manual/spin_restrict_enzymes_p1.png differ
diff --git a/manual/spin_restrict_enzymes_p1.small.png b/manual/spin_restrict_enzymes_p1.small.png
new file mode 100644
index 0000000..5112f3c
Binary files /dev/null and b/manual/spin_restrict_enzymes_p1.small.png differ
diff --git a/manual/spin_results_manager_d.png b/manual/spin_results_manager_d.png
new file mode 100644
index 0000000..b916476
Binary files /dev/null and b/manual/spin_results_manager_d.png differ
diff --git a/manual/spin_results_manager_d2.png b/manual/spin_results_manager_d2.png
new file mode 100644
index 0000000..6407852
Binary files /dev/null and b/manual/spin_results_manager_d2.png differ
diff --git a/manual/spin_results_manager_d2.small.png b/manual/spin_results_manager_d2.small.png
new file mode 100644
index 0000000..32acdf0
Binary files /dev/null and b/manual/spin_results_manager_d2.small.png differ
diff --git a/manual/spin_save_sequence_d.png b/manual/spin_save_sequence_d.png
new file mode 100644
index 0000000..396d272
Binary files /dev/null and b/manual/spin_save_sequence_d.png differ
diff --git a/manual/spin_seq_display.png b/manual/spin_seq_display.png
new file mode 100644
index 0000000..9bbe58b
Binary files /dev/null and b/manual/spin_seq_display.png differ
diff --git a/manual/spin_seq_display.small.png b/manual/spin_seq_display.small.png
new file mode 100644
index 0000000..3b11954
Binary files /dev/null and b/manual/spin_seq_display.small.png differ
diff --git a/manual/spin_seq_manager.png b/manual/spin_seq_manager.png
new file mode 100644
index 0000000..84d5f73
Binary files /dev/null and b/manual/spin_seq_manager.png differ
diff --git a/manual/spin_sequence_display_d.png b/manual/spin_sequence_display_d.png
new file mode 100644
index 0000000..e5bee48
Binary files /dev/null and b/manual/spin_sequence_display_d.png differ
diff --git a/manual/spin_sequence_display_save_d.png b/manual/spin_sequence_display_save_d.png
new file mode 100644
index 0000000..181b892
Binary files /dev/null and b/manual/spin_sequence_display_save_d.png differ
diff --git a/manual/spin_sequence_display_t.png b/manual/spin_sequence_display_t.png
new file mode 100644
index 0000000..e68b1f2
Binary files /dev/null and b/manual/spin_sequence_display_t.png differ
diff --git a/manual/spin_sequence_display_t.small.png b/manual/spin_sequence_display_t.small.png
new file mode 100644
index 0000000..8bf124c
Binary files /dev/null and b/manual/spin_sequence_display_t.small.png differ
diff --git a/manual/spin_similar_spans.png b/manual/spin_similar_spans.png
new file mode 100644
index 0000000..7b922b2
Binary files /dev/null and b/manual/spin_similar_spans.png differ
diff --git a/manual/spin_simple_search.png b/manual/spin_simple_search.png
new file mode 100644
index 0000000..f4affc3
Binary files /dev/null and b/manual/spin_simple_search.png differ
diff --git a/manual/spin_splice.png b/manual/spin_splice.png
new file mode 100644
index 0000000..ce469eb
Binary files /dev/null and b/manual/spin_splice.png differ
diff --git a/manual/spin_splice.small.png b/manual/spin_splice.small.png
new file mode 100644
index 0000000..7e0466e
Binary files /dev/null and b/manual/spin_splice.small.png differ
diff --git a/manual/spin_start_d.png b/manual/spin_start_d.png
new file mode 100644
index 0000000..4f4632d
Binary files /dev/null and b/manual/spin_start_d.png differ
diff --git a/manual/spin_start_p.png b/manual/spin_start_p.png
new file mode 100644
index 0000000..089dd57
Binary files /dev/null and b/manual/spin_start_p.png differ
diff --git a/manual/spin_start_p.small.png b/manual/spin_start_p.small.png
new file mode 100644
index 0000000..11b5c94
Binary files /dev/null and b/manual/spin_start_p.small.png differ
diff --git a/manual/spin_stops_d.png b/manual/spin_stops_d.png
new file mode 100644
index 0000000..d0d96f9
Binary files /dev/null and b/manual/spin_stops_d.png differ
diff --git a/manual/spin_stops_p.png b/manual/spin_stops_p.png
new file mode 100644
index 0000000..4c9c642
Binary files /dev/null and b/manual/spin_stops_p.png differ
diff --git a/manual/spin_stops_p.small.png b/manual/spin_stops_p.small.png
new file mode 100644
index 0000000..af68c96
Binary files /dev/null and b/manual/spin_stops_p.small.png differ
diff --git a/manual/spin_stops_p2.png b/manual/spin_stops_p2.png
new file mode 100644
index 0000000..322874e
Binary files /dev/null and b/manual/spin_stops_p2.png differ
diff --git a/manual/spin_stops_p2.small.png b/manual/spin_stops_p2.small.png
new file mode 100644
index 0000000..fb1b9a5
Binary files /dev/null and b/manual/spin_stops_p2.small.png differ
diff --git a/manual/spin_string_search_d.png b/manual/spin_string_search_d.png
new file mode 100644
index 0000000..4e54f3a
Binary files /dev/null and b/manual/spin_string_search_d.png differ
diff --git a/manual/spin_string_search_p.png b/manual/spin_string_search_p.png
new file mode 100644
index 0000000..c1e1df8
Binary files /dev/null and b/manual/spin_string_search_p.png differ
diff --git a/manual/spin_string_search_p.small.png b/manual/spin_string_search_p.small.png
new file mode 100644
index 0000000..7ba0747
Binary files /dev/null and b/manual/spin_string_search_p.small.png differ
diff --git a/manual/spin_translate_d.png b/manual/spin_translate_d.png
new file mode 100644
index 0000000..d91a543
Binary files /dev/null and b/manual/spin_translate_d.png differ
diff --git a/manual/spin_translate_t.png b/manual/spin_translate_t.png
new file mode 100644
index 0000000..107e669
Binary files /dev/null and b/manual/spin_translate_t.png differ
diff --git a/manual/spin_translate_t.small.png b/manual/spin_translate_t.small.png
new file mode 100644
index 0000000..64a05e7
Binary files /dev/null and b/manual/spin_translate_t.small.png differ
diff --git a/manual/spin_trna_p.png b/manual/spin_trna_p.png
new file mode 100644
index 0000000..c2e4e75
Binary files /dev/null and b/manual/spin_trna_p.png differ
diff --git a/manual/spin_trna_p.small.png b/manual/spin_trna_p.small.png
new file mode 100644
index 0000000..88a8e55
Binary files /dev/null and b/manual/spin_trna_p.small.png differ
diff --git a/manual/spin_trna_t.png b/manual/spin_trna_t.png
new file mode 100644
index 0000000..5166d5d
Binary files /dev/null and b/manual/spin_trna_t.png differ
diff --git a/manual/spin_trna_t.small.png b/manual/spin_trna_t.small.png
new file mode 100644
index 0000000..c906b49
Binary files /dev/null and b/manual/spin_trna_t.small.png differ
diff --git a/manual/spin_weight_matrix.png b/manual/spin_weight_matrix.png
new file mode 100644
index 0000000..48ddccf
Binary files /dev/null and b/manual/spin_weight_matrix.png differ
diff --git a/manual/spin_weight_matrix.small.png b/manual/spin_weight_matrix.small.png
new file mode 100644
index 0000000..624ac49
Binary files /dev/null and b/manual/spin_weight_matrix.small.png differ
diff --git a/manual/spin_weight_matrix_dial.png b/manual/spin_weight_matrix_dial.png
new file mode 100644
index 0000000..1f4273d
Binary files /dev/null and b/manual/spin_weight_matrix_dial.png differ
diff --git a/manual/stops-t.texi b/manual/stops-t.texi
new file mode 100644
index 0000000..a3fc430
--- /dev/null
+++ b/manual/stops-t.texi
@@ -0,0 +1,52 @@
+ at cindex Stop codons display
+ at cindex Plot stop codons
+
+The Stop Codon Map plots the positions of all the stop codons on one or both
+strands of a contig consensus sequence.  
+It can be invoked from the gap4 View menu.
+If the Contig Editor is being used on
+the same contig, the Refresh button will be enabled and if used will fetch the
+current consensus from the editor, repeat the search and replot the stop
+codons.
+
+_lpicture(stops,6in)
+
+The figure shows a typical zoomed in view of the Stop Codon Map display.  The
+positions for the stop codons in each reading frame (here all six frames are
+shown) are displayed in horizontal strips. Along the top are buttons for zooming, the crosshair toggle, a refresh
+button and two boxes for showing the crosshair position. The left box shows
+the current position and the right-hand box the separation of the last two
+stop codons selected by the user.  Below the display of stop codons is a
+ruler and a horizontal scrollbar. The information line is showing the data for
+the last stop codon the user has touched with the cursor. Also shown on the
+left is a copy of the View menu which is used to select the reading
+frames to display.
+
+
+ at node Stops-Examining
+ at subsection Examining the Plot
+ at cindex Plot stop codons: examining the plot
+ at cindex Stop codons: examining the plot
+
+Positioning the cursor over a plotted point will cause its codon and
+position to appear in the information line.
+
+It is possible to find the distance between any two stop codons.
+Pressing the left mouse button on a plotted point will display "Select
+another codon" at the bottom of the window.  Then, pressing the left
+button on another plotted point will display the distance, in bases,
+between the two sites. This is shown in the box located at the top right
+corner of the window.
+
+ at node Stops-Updating
+ at subsection Updating the Plot
+ at cindex Plot stop codons: updating the plot
+ at cindex Stop codons: updating the plot
+
+If the Contig Editor (_fpref(Editor, Editing in gap4, contig_editor)) is
+currently running on the same contig as is being displayed as a Stop
+Codon Map, the Refresh button will be shown in bold lettering and hence
+be active, otherwise it will be greyed out.  Pressing the button will
+fetch the current consensus from the Contig Editor and replot its stop
+codons.  Hence the plot can be kept current with the changes being made
+in the editor.
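
Commentary on manual/stops-t.texi (placed between file diffs, not part of the patch): the data behind a stop codon map can be sketched as a scan of each reading frame on one or both strands. This is a minimal illustration, not gap4's implementation. It assumes the standard stop codons TAA, TAG and TGA, numbers reverse-strand frames -1 to -3 by their offset on the reverse complement, and reports every hit as the 1-based position of its leftmost base on the forward strand. All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement; characters other than ACGT are left unchanged."""
    return seq.translate(_COMPLEMENT)[::-1]


def stop_codons(seq, both_strands=True):
    """Map frame number -> sorted list of (position, codon) stop codons."""
    seq = seq.upper()
    n = len(seq)
    hits = {}
    for frame in range(3):
        hits[frame + 1] = [
            (i + 1, seq[i:i + 3])
            for i in range(frame, n - 2, 3)
            if seq[i:i + 3] in STOP_CODONS
        ]
    if both_strands:
        rc = reverse_complement(seq)
        for frame in range(3):
            # A codon at rc[i:i+3] covers forward bases n-i-2 .. n-i (1-based).
            hits[-(frame + 1)] = sorted(
                (n - i - 2, rc[i:i + 3])
                for i in range(frame, n - 2, 3)
                if rc[i:i + 3] in STOP_CODONS
            )
    return hits


example = stop_codons("ATGTAAGGCTCA")
# Separation of two selected stops, as shown in the right-hand box.
separation = abs(example[-1][0][0] - example[1][0][0])
```

Re-running such a scan on the current consensus is what the Refresh button amounts to conceptually: the search is repeated and the points replotted.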
diff --git a/manual/stops.png b/manual/stops.png
new file mode 100644
index 0000000..d8a38e2
Binary files /dev/null and b/manual/stops.png differ
diff --git a/manual/stops.small.png b/manual/stops.small.png
new file mode 100644
index 0000000..4f962f1
Binary files /dev/null and b/manual/stops.small.png differ
diff --git a/manual/strand_coverage_d.png b/manual/strand_coverage_d.png
new file mode 100644
index 0000000..1badd16
Binary files /dev/null and b/manual/strand_coverage_d.png differ
diff --git a/manual/strand_coverage_p1.png b/manual/strand_coverage_p1.png
new file mode 100644
index 0000000..e2c118d
Binary files /dev/null and b/manual/strand_coverage_p1.png differ
diff --git a/manual/strand_coverage_p1.small.png b/manual/strand_coverage_p1.small.png
new file mode 100644
index 0000000..f0256c2
Binary files /dev/null and b/manual/strand_coverage_p1.small.png differ
diff --git a/manual/strand_coverage_p2.png b/manual/strand_coverage_p2.png
new file mode 100644
index 0000000..55ff0c3
Binary files /dev/null and b/manual/strand_coverage_p2.png differ
diff --git a/manual/strand_coverage_p2.small.png b/manual/strand_coverage_p2.small.png
new file mode 100644
index 0000000..d21a116
Binary files /dev/null and b/manual/strand_coverage_p2.small.png differ
diff --git a/manual/suggest_probes.main.png b/manual/suggest_probes.main.png
new file mode 100644
index 0000000..bf56a90
Binary files /dev/null and b/manual/suggest_probes.main.png differ
diff --git a/manual/suggest_probes.select.png b/manual/suggest_probes.select.png
new file mode 100644
index 0000000..d68bb30
Binary files /dev/null and b/manual/suggest_probes.select.png differ
diff --git a/manual/tags-t.texi b/manual/tags-t.texi
new file mode 100644
index 0000000..735cfb8
--- /dev/null
+++ b/manual/tags-t.texi
@@ -0,0 +1,120 @@
+ at cindex Annotating readings
+ at cindex Annotating contigs
+ at cindex Labelling readings
+ at cindex Labelling contigs
+ at cindex Tags
+
+Gap4 can label segments of readings and contigs using "tags"
+(_fpref(Editor-Annotations, Create Tag, t)).
+The program
+recognises a set of standard tag types and users can also invent
+their own. Each tag type has a unique four character identifier, a name,
+a direction, a colour and a text string for recording notes. Tags can be
+created, edited and removed by users and by internal routines. Tags can
+also be input along with readings. This is important when reference sequences
+are used during mutation detection
+(_fpref(Mutation-Detection-Reference-Sequences, Reference sequences,
+t)).
+
+ at menu
+* Anno-Types::          Standard tag types
+* Anno-Act::            Active tags and masking
+ at end menu
+
+
+_split()
+ at node Anno-Types
+ at subsection Standard tag types
+
+The standard tag types include those shown below plus the FT records from EMBL
+sequence file entries. Users can also invent their
+own and add them to their personal GTAGDB. This is a file that describes
+the available tag types and their colours
+(_fpref(Conf-GTAGDB, Configure
+the tag database, configure)).
+
+ at sp 2
+ at example
+ at group
+ at strong{Code}    @strong{Function}
+COMM    Comment
+COMP    Compression
+RCMP    Resolved compression
+STOP    Stop
+OLIG    Oligo (primer)
+REPT    Repeat
+ALUS    Alu sequence
+SVEC    Sequencing vector
+CVEC    Cloning vector
+MASK    Mask me
+FNSH    Finished segment
+ENZ0    Restriction enzyme 0
+ENZ9    Restriction enzyme 9
+MUTN    Mutation
+DIFF    Sequence different to consensus
+HETE    Heterozygous mutation
+HET+    Heterozygous mutation False +ve
+HET-    Heterozygous mutation False -ve
+HOM+    Homozygous mutation False +ve
+HOM-    Homozygous mutation False -ve
+FCDS    FEATURE: CDS
+F***    All other (60) EMBL FT record types
+ at end group
+ at end example
+
+_split()
+ at node Anno-Act
+ at subsection Active tags and masking
+
+ at cindex Active tags
+ at cindex Masking contigs
+ at cindex Contigs masking
+ at cindex Marking contigs
+ at cindex Contigs marking
+
+Tags are used for a variety of purposes and for each function in the
+program the user can choose which tag types are currently
+"active". Where they are being used to provide visual clues this will
+determine which tag types appear in the displays, but for other
+functions they can be used to control which parts of the sequence are
+omitted from processing. This mode of tag use is called "masking". For
+example the program contains a routine to search for repeats, and if any
+are found, the user needs to know if such sequence duplications are
+caused by incorrect assembly or are genuine repeats. Once the user has
+checked a duplication reported by the program and found it to be a
+repeat, it can be labelled with a REPT tag. If the repeat routine is run
+in masking mode and with REPT tags active, any segment covered by a REPT
+tag will not be reported as a match. So once the "problem" has been
+dealt with it can be labelled so it is not reported on subsequent
+searches. In addition the tag is available to provide annotation for the
+completed sequence when it is sent to the data libraries.
+
+A more complicated application of masking is available for two of the
+other search procedures in the program: (_fpref(Assembly-Shot, Shotgun
+assembly, assembly)) and (_fpref(FIJ, Find Internal Joins, fij)). The former
+is the general assembly function and the latter is used to find
+potential joins between contigs in the database. Below we describe how
+masking can be used during assembly and similar comments apply to Find
+Internal Joins. 
+
+In the assembly function the user can choose to employ
+masking and then select the types of tags to be used as masks. Readings
+are compared in two stages: first the program looks for exact matches of
+some minimum length and then for each possible overlap it performs an
+alignment. If the masking mode is selected the masked regions are not
+used during the search for exact matches, but they are used during
+alignment. The effect of this is that new readings that would lie
+entirely inside masked regions will not produce exact matches and so
+will not be entered. However readings that have sufficient data outside
+of masked segments can produce matches and will be correctly aligned
+even if they overlap the masked data. A common use for masking during
+assembly or Find Internal Joins is to avoid finding matches that are
+entirely contained in Alu segments.
+
+A further mode related to masking is "marking". Marking is available for
+the consensus calculation (_fpref(Calculate Consensus, Consensus
+calculation, calc_consensus)) and for Find Internal Joins (_fpref(FIJ,
+Find Internal Joins, fij)). Instead of masking the regions covered by
+active tags these routines simply write these sections of the consensus
+sequence in lowercase letters. That is, they make it easy for users to
+see where the tagged segments are. Marking has no other effect.
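
Commentary on manual/tags-t.texi (placed between file diffs, not part of the patch): the difference between "marking" and "masking" described above can be shown with a small sketch. It is not gap4's code. It assumes tags are given as (start, end, type) tuples with 1-based inclusive coordinates, and it models the exact-match stage as a list of fixed-length words. The function names and the word length are hypothetical.

```python
def covered(length, tags, active_types):
    """Per-base flags: True where a tag of an active type covers the base."""
    flags = [False] * length
    for start, end, tag_type in tags:
        if tag_type in active_types:
            for i in range(max(start, 1) - 1, min(end, length)):
                flags[i] = True
    return flags


def mark(consensus, tags, active_types):
    """Marking: write tagged segments in lowercase, change nothing else."""
    flags = covered(len(consensus), tags, active_types)
    return "".join(c.lower() if f else c for c, f in zip(consensus, flags))


def seed_words(consensus, tags, active_types, word_length):
    """Masking: list (position, word) for words free of masked bases.

    Only the exact-match stage skips masked bases; a later alignment
    stage would still see the full, unmodified sequence.
    """
    flags = covered(len(consensus), tags, active_types)
    return [
        (i + 1, consensus[i:i + word_length])
        for i in range(len(consensus) - word_length + 1)
        if not any(flags[i:i + word_length])
    ]


consensus = "ACGTACGTAC"
tags = [(3, 5, "REPT"), (8, 9, "COMM")]
marked = mark(consensus, tags, {"REPT"})
seeds = seed_words(consensus, tags, {"REPT"}, 4)
```

Because only REPT is active here, the COMM tag has no effect on either result, which mirrors the per-function choice of active tag types described in the text.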
diff --git a/manual/template-t.texi b/manual/template-t.texi
new file mode 100644
index 0000000..ef61e7b
--- /dev/null
+++ b/manual/template-t.texi
@@ -0,0 +1,750 @@
+ at menu
+* Template-Display::            Template Display
+* Template-Templates::          Reading and Template Plot
+* Template-Templates-Display::  Reading and Template Plot Display
+* Template-Templates-Options::  Reading and Template Plot Options
+* Template-Templates-Operations:: Reading and Template Plot Operations
+* Template-Quality::            Quality Plot
+* Template-Restriction::        Restriction Enzyme Plot
+* Consistency-Display::         Consistency Display
+* SNP-Candidates::              SNP Candidates
+ at end menu
+
+_split()
+
+Gap4 provides views of the data for an assembly project at 3 levels of
+resolution: the whole project can be seen from the Contig Selector 
+(_fpref(Contig Selector, Contig Selector, Contig Selector)),
+the most detail from the Contig Editor
+(_fpref(Editor, Editing in gap4, contig_editor)), and the Contig
+Overview Displays, described in this section, provide an intermediate
+level of information and data manipulation. 
+They are available from the main gap4 View menu.
+
+
+These middle level resolution displays provide graphical overviews of
+individual contigs or sets of contigs.  The possible
+information shown includes readings, templates, tags, restriction enzyme
+sites, stop codons, plots of the consensus quality, read coverage,
+read-pair coverage, strand coverage and consensus confidence.
+The displays of readings, templates, tags, restriction enzyme
+sites and plots of the consensus quality can be shown in a single
+window called the Template Display 
+(_fpref(Template-Display, Template Display, template)).
+The plots of reading coverage, read-pair coverage, strand coverage and
+consensus confidence can be shown in a single display called the
+Consistency Display
+(_fpref(Consistency-Display, Consistency Display, consistency_display)),
+or as separate plots.
+The Stop Codon Plot 
+(_fpref(Stops, Plotting Stop Codons, stops))
+and a more informative version of the Restriction Enzyme Plot
+(_fpref(Restrict, Plotting Restriction Enzymes, restrict_enzymes))
+can be shown in separate windows.
+
+ at node Template-Display
+ at section Template Display
+ at cindex Template Display
+
+ at menu
+* Template-Templates::          Reading and Template Plot
+* Template-Templates-Display::  Reading and Template Plot Display
+* Template-Templates-Options::  Reading and Template Plot Options
+* Template-Templates-Operations:: Reading and Template Plot Operations
+* Template-Quality::            Quality Plot
+* Template-Restriction::        Restriction Enzyme Plot
+ at end menu
+
+The Template Display can show schematic plots of 
+readings, templates, tags, restriction enzyme
+sites and the consensus quality. It can be used to reorder contigs,
+create tags and invoke the Contig Editor. 
+It is invoked from the main gap4 View menu.
+
+An example showing all these information types can be seen in the Figure below.
+
+
+_lpicture(template.display,6in)
+
+The large top section contains lines and arrows representing readings
+and templates. Beneath this are rulers;
+one for each contig, and below those is the quality plot. 
+The template and reading section of the display is in two parts. The top
+part contains the templates which have been sequenced from both ends but
+which are in some way inconsistent - for example given the current
+relative positions of their readings, they may have a length that is
+larger or smaller than that expected, or the two readings may, as it
+were, face away from one another. Colour coding is used to distinguish
+between different types of inconsistency, and whether or not the
+inconsistency involves readings within or between contigs. For example,
+most of the problems shown in the screendump above are coloured
+dark yellow, indicating an inconsistency between a pair of contigs.
+The rest of the data (mostly dark blue, indicating templates sequenced
+from only one end) is plotted below the data for the inconsistent
+templates.
+Forward readings are blue and reverse readings are orange.
+Templates in bright yellow have been sequenced from both ends, are consistent and
+span a pair of contigs (and so indicate the relative orientation and
+separation of the contigs). 
+
+The coloured blocks immediately above and below the ruler are tags.
+Those above the ruler 
+can also be seen on their corresponding readings in the large top
+section. Zooming is available. The position of a crosshair
+is shown in the two leftmost boxes in the top right hand corner. The leftmost
+shows the distance in bases between the crosshair and the start of the contig
+underneath the crosshair. The middle box shows the distance between the
+crosshair and the start of the first contig. The right box shows the distance
+between two selected cut sites in the restriction enzyme plots.
+
+_picture(template.dialogue,3.325in)
+
+As seen in the dialogue above,
+users can choose to display a single contig, all contigs, or a subset of 
+contigs from a file of filenames ("file") or a list ("list"). If either the
+file or list options are chosen, the "browse" button will be activated and can
+be used to call up a file or list browser dialogue.
+
+The items to be shown in the initial template display can be selected from the
+list of checkboxes. The default is to display all templates and readings.
+However, it is possible to display only templates with more than one reading 
+("Ignore 'single' templates") or templates with both forward and reverse 
+readings ("Show only read pairs"). These latter two options may be beneficial 
+if the database is very large.
+
+In the section below we give details about the individual components of
+the overall Template Display.
+
+_split()
+ at node Template-Templates
+ at subsection Reading and Template Plot
+ at cindex Template Display: reading plot
+ at cindex Template Display: template plot
+ at cindex Template plot: template display
+ at cindex Reading plot: template display
+ at cindex Ignore single templates: template display
+ at cindex Template display: ignore single templates
+ at cindex Show only read pairs: template display
+ at cindex Template display: show only read pairs
+
+ at menu
+* Template-Templates-Display::  Reading and Template Plot Display
+* Template-Templates-Options::  Reading and Template Plot Options
+* Template-Templates-Operations:: Reading and Template Plot Operations
+ at end menu
+
+The Reading and Template Plot shows templates and readings. The
+following sections describe the display, its options, and the operations
+which it can be used to perform.
+It is invoked from the main gap4 View menu.
+
+ at node Template-Templates-Display
+ at subsubsection Reading and Template Plot Display
+
+The Reading and Template Plot shows templates and readings.  Colour is used to
+provide additional information.
+The reading colour is used to convey the primer
+information. The default colours are:
+
+ at table @var
+ at item red
+primer unknown
+ at item green
+forwards primer
+ at item orange
+reverse primer
+ at item dark_cyan
+custom forward primer
+ at item orange-red
+custom reverse primer
+ at end table
+
+Colour is used to distinguish the number and the location
+of the readings derived from each template.
+Templates with readings derived from only one end are drawn in blue. 
+Those with readings from both ends
+are pink when both ends are contained within the same contig.
+Those with readings from both ends are green when
+the readings are in different contigs and one of 
+the contigs is not being plotted.
+
+For each template gap4 stores an expected length, as a range between two
+values. From an assembly it is often possible to work out the actual length of
+a template based upon the positions within a contig of readings sequenced
+using the forward and reverse primers. The forward and reverse readings on
+a single template (called a read pair) are considered to be inconsistent if 
+this observed distance is outside of
+the range of acceptable sizes and then the template is drawn in black. 
+Alternatively it may be possible that both forward and reverse readings are 
+assembled on the same strand (in which case both arrows will point in the same
+direction). This too is a problem and hence the templates are drawn in
+black. 
+
+If more than one contig is displayed then the distance between adjacent 
+contigs is determined from any read pair information. If there are spanning 
+templates between two adjacent contigs and the readings on that template are 
+consistent, i.e. are in the correct orientation, the template is coloured yellow. 
+Templates which span non-adjacent contigs in the display or contain 
+inconsistent readings are coloured dark yellow.
+
+A summary of the default template colours follows.
+
+ at table @var
+ at item blue
+the template contains only readings from one end
+ at item pink
+the template contains both forward and reverse readings in the same contig
+ at item green
+the template contains both forward and reverse readings, but they are in
+separate contigs, and one of the contigs is not being displayed.
+ at item black
+the readings on the template are within the same contig but are in
+contradictory orientations or are an unexpected distance apart
+ at item yellow
+the readings on the template are within different contigs (both of which are being displayed) and are consistent
+ at item dark_yellow
+the readings on the template are within different contigs (both of which are being displayed) and are inconsistent
+ at end table
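
The colour rules summarised above can be read as a small decision procedure. The following is only an illustrative sketch in Python, not gap4's own code: the function name `template_colour` and its arguments are hypothetical, and the `consistent` flag is assumed to already fold in the orientation, length and adjacency checks described in the text.

```python
from typing import Optional, Set


def template_colour(fwd_contig: Optional[int],
                    rev_contig: Optional[int],
                    consistent: bool,
                    displayed: Set[int]) -> str:
    """Return the default colour name for one template (hypothetical helper).

    fwd_contig / rev_contig: contig number holding the forward / reverse
    reading, or None when that end has not been sequenced.
    consistent: assumed result of the orientation and length checks.
    displayed: contig numbers currently being plotted.
    """
    # Readings from only one end of the template.
    if fwd_contig is None or rev_contig is None:
        return "blue"
    # Both ends lie in the same contig.
    if fwd_contig == rev_contig:
        return "pink" if consistent else "black"
    # Both ends sequenced, but one of the contigs is not being plotted.
    if fwd_contig not in displayed or rev_contig not in displayed:
        return "green"
    # Spanning template with both contigs displayed.
    return "yellow" if consistent else "dark_yellow"
```

For instance, `template_colour(1, 2, False, {1, 2})` gives `"dark_yellow"`, matching the last row of the table.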
+
+ at cindex Ruler: template display
+ at cindex Contig: template display
+ at cindex Template display: ruler
+ at cindex Template display: contig
+
+If more than one contig is displayed, the contigs are positioned in the same
+left to right order as the input contig list (which need not necessarily be in
+the same order as the contig selector).
+Overlapping contigs are drawn as staggered lines. 
+If the user selects the "Calculate contig positions" option from the menu
+the horizontal distance between adjacent contigs is
+determined from any available read pair information. 
+Otherwise, or in the absence of any read pair
+information, the second contig is positioned immediately following the first
+contig, but will be drawn staggered in the vertical direction. If the 
+readings on a template spanning two contigs are consistent, the distance 
+between the contigs is determined using the template's mean length.
+If there are several templates spanning a pair of contigs
+an average distance is calculated and used as the final 
+offset between the contigs. 
+Templates which span non-adjacent contigs or contain inconsistent readings 
+are not used in the calculation of the contig offsets. It is possible that 
+data in the database is inconsistent to such an extent that, although spanning 
+templates have consistent readings, the averaging can lead to a display which 
+shows the templates to have inconsistent readings, e.g. the readings are 
+pointing in opposite directions. 
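
The averaging step described above can be sketched as follows. This is a hedged illustration in Python, not the gap4 implementation: the `SpanningTemplate` fields and both function names are assumptions made for the example, and the exact arithmetic gap4 uses may differ. Each consistent spanning template contributes one gap estimate (its expected mean length minus the parts already accounted for inside the two contigs), and the estimates are averaged.

```python
from typing import List, NamedTuple


class SpanningTemplate(NamedTuple):
    """One consistent template spanning two adjacent contigs (hypothetical)."""
    expected_length: float  # mean of the template's expected size range
    left_tail: int          # bases from its reading start to the end of the left contig
    right_head: int         # bases from the start of the right contig to its reading end


def estimated_gap(templates: List[SpanningTemplate]) -> int:
    """Average the per-template gap estimates; negative means an overlap."""
    if not templates:
        raise ValueError("no spanning templates: no read pair information")
    gaps = [t.expected_length - t.left_tail - t.right_head for t in templates]
    return round(sum(gaps) / len(gaps))


def second_contig_offset(left_contig_length: int, gap: int) -> int:
    """Start of the second contig relative to the start of the first."""
    return left_contig_length + gap
```

A negative result, like the gap of -11 in the example output that follows, would mean the two contigs are drawn overlapping and staggered.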
+
+A summary of the templates and readings used to calculate the distance 
+between two contigs is displayed in the output window. An example is given 
+below:
+
+ at example
+============================================================
+Wed 02 Apr 10:35:51 1997: template display
+------------------------------------------------------------
+Contig zf98g12.r1(651) and Contig zf23d2.s1(348) 
+Template       zf22h7( 376) length 1893
+Reading        zf22h7.r1(  +10R), pos   6257 +208, contig  651
+Reading        zf22h7.s1( -376F), pos    145 +331, contig  348
+Template       zf49f5( 536) length 1510
+Reading        zf49f5.r1( +255R), pos   6562 +239, contig  651
+Reading        zf49f5.s1( -536F), pos    227 +135, contig  348
+Gap between contigs = -11
+Offset of contig 348 from the beginning = 7674
+ at end example
+
+The contig names and numbers are given in the top line. Below this, the
+spanning template name, number and length are displayed. Each reading line then
+gives the reading name, whether the reading has been complemented (+: original,
+-: complemented), number, primer information, starting position, length and
+contig number. This is of similar format to that displayed by the read pairs output.
+_fxref(ReadPair-Output, Find Read Pairs, read_pairs) The average gap between
+the contigs is given and finally the distance in bases between the start of
+the second contig and the start of the left most contig in the display.
+
+_split()
+ at node Template-Templates-Options
+ at subsubsection Reading and Template Plot Options
+ at cindex Template display: tags
+ at cindex Template display: select tags
+ at cindex Select tags: template display
+ at cindex Tags: template display
+
+
+_lpicture(template.display,6in)
+
+Within the figure shown above the contents of the View menu are visible. The 
+"Templates", "Readings", "Quality Plot" and "Restriction Enzyme Plot" commands
+control which attributes are displayed. The graphics are always scaled to fit the
+information within the window size, subject to the current zoom level. This
+means that turning off templates, but leaving readings displayed, will improve
+visibility of the reading information.
+
+The "Ruler ticks" checkbox determines whether to draw numerical ticks on the 
+contigs. The number of ticks is defined in the .gaprc
+(_fpref(Conf-Introduction, Options Menu, configure)) file as NUM_TICKS 
+although the actual number of ticks per contig that will be displayed
+also depends on the space available on the screen.
+
+The "ignore 'single' templates" toggle controls whether to display all 
+templates or only those containing more than one reading. The "show only read 
+pairs" toggle controls whether all templates or only those containing both 
+forward and reverse readings are displayed.  Hence when set the templates 
+displayed are those with a known (observed) length. The "Show only spanning
+read pairs" toggle controls whether to display all templates or only those
+containing forward and reverse readings which are in different contigs.
+
+The plot can be enlarged or reduced using the standard zooming mechanism.
+_fxref(UI-Graphics-Zoom, Zooming, interface)
+
+The crosshair toggle button controls whether the cursor is visible. This is
+shown as a black vertical line. The position of the crosshair is displayed
+in the two boxes to the right of the crosshair toggle. The first box indicates
+the cursor position in the current contig. The second box indicates the 
+overall position of the cursor in the consensus. The third box is used to 
+show the distance between restriction enzyme cut sites. 
+_oxref(Template-Restriction, Restriction Enzyme Plot).
+
+Tags that are on the consensus can only be seen on the ruler. These are
+marked beneath the ruler line. Tags on readings can be seen both on the
+ruler (above the line) and on their appropriate readings within the
+template window. To configure the tag types that are shown use the
+"select Tags" command in the View menu. This brings up the usual tag
+selection dialog box. _fxref(Conf-Tag, Tag Selector, configure)
+
+
+_split()
+ at node Template-Templates-Operations
+ at subsubsection Reading and Template Plot Operations
+
+ at cindex Readings list: template display
+ at cindex Template display: readings list
+ at cindex Template display: active readings
+
+
+The contig editor can be invoked by double clicking the middle mouse button,
+or Alt left mouse button, 
+in any of the displays, i.e. template, ruler, quality or restriction enzyme
+plots. The editor will start up with the editing cursor on the base that 
+corresponds to the position clicked on in the Template Display. If more than 
+one contig is currently being displayed the editor decides which contig to show
+using the following rules. If the user clicks on the Quality Plot, the contig
+lines or the Restriction Enzyme Display, the corresponding contigs will
+be shown. If the user clicks on a gap between these displays the nearest contig
+will be selected. If the user clicks on
+the template or reading lines, the editor will show the contig whose left
+end is to the left of and closest to the cursor.
+
+The long blue vertical line seen in the previous 
+figure is the position of the 
+editing cursor within a Contig Editor. Each editor will produce its own cursor
+and each will be visible. Moving the editing 
+cursor within a contig editor automatically moves its cursor within the 
+Template Display. Similarly, clicking and dragging the editor cursor with the 
+middle mouse button, or Alt left mouse button, within the Template
+Display scrolls the associated Contig Editor.
+
+
+The order of the contigs can be changed within the Template Display by
+clicking with the middle mouse button, or Alt left mouse button, 
+on a contig line and dragging the line to
+the new position. The Template Display will update automatically once the 
+mouse button is released. The change of a dark yellow template to bright
+yellow is indicative that the two contigs are now in consistent positions
+and orientations. The order of the contigs in the gap4 database, as 
+displayed in the contig selector, can be updated by selecting the 
+"Update contig order" command in the Edit menu.
+
+By clicking on any of the contig lines in the ruler a popup menu is invoked.
+From this, information on the contig can be obtained, the contig editor can be
+started, the contig can be complemented, and the templates within the
+contig can be highlighted (shown by changing their line width).
+
+A list named @code{readings} always exists. It contains the
+list of readings that are highlighted in all the currently shown
+template displays.  _fxref(Lists, Lists, lists) The highlighting
+mechanism used is to draw the readings as thicker, bolder, lines. The
+"clear Active Readings" command from the View menu clears this list. The
+"highlight reading list" command loads a new set of readings to use for
+the "readings" list and then highlights these.
+
+To interactively add and remove readings from the active list use the
+left mouse button. Clicking on an individual reading will toggle its
+state from active to non active and back again. Pressing and holding the
+left mouse button, and moving the mouse, will drag out a bounding box.
+When the button is released all readings that are contained entirely
+within the bounding box will be toggled.
+
+Activating a reading (using any of the above methods) when an editor is
+running, will also highlight the reading within the editor. Similarly,
+highlighting the reading in the editor activates it within the template
+display and adds it to the active reading list.
+
+
+_split()
+ at node Template-Quality
+ at subsection Quality Plot
+ at cindex Template display: quality plot
+ at cindex Quality plot: template display
+
+This option can be invoked from the main gap4 View menu, in which case
+it appears as a single plot, or from the View menu of the Template
+Display, in which case it will appear as part of the Template Display.
+
+This display provides an overview of the quality of the consensus. The
+Contig Editor can be used to examine the problems revealed. A typical
+plot is displayed below.
+
+_lpicture(template.quality,6in)
+
+For each base in the consensus a quality is
+computed based on the accuracy of the data on each strand. As can be seen in
+the Figure above, this information
+is then plotted using colour and height to distinguish between the
+different quality assignments.
+The colour and height codes are explained below.
+
+ at example
+ at group
+Colour  Height          Meaning
+
+grey    0 to 0          OK on both strands, both agree
+blue    0 to 1          OK on plus strand only
+green  -1 to 0          OK on minus strand only
+red    -1 to 1          Bad on both strands
+black  -2 to 2          OK on both strands but they disagree
+ at end group
+ at end example
+
+For example, in the figure we see that the first four hundred or so
+bases are mostly only well determined on the forward strand.
+
+Note that when a large number of bases are being displayed the limited screen 
+resolution causes the
+quality codes for adjacent bases to be drawn as single pixels. However
+the use of varying heights ensures that all problematic bases will be
+visible. Hence when
+the quality plot consists of a single grey line all known quality problems
+have been resolved, at the current consensus and quality cutoffs.
+
+To check problems the contig editor can be invoked by double clicking on the
+middle mouse button, or Alt left mouse button. 
+It will appear centred on the base corresponding to the
+position on which the mouse was clicked.
+
+The quality plot appears as "Calculate quality" in the Results Manager window
+(_fpref(Results, Results Manager, results)).
+
+Within the Results Manager commands available, using the right mouse
+button, include "Information",
+which lists a summary of
+the distribution of quality types to the output window, and "List" which lists
+the actual quality values for each base to the output window. These quality
+values are written in a textual form of single letters per base and are listed
+below.
+
+ at table @var
+ at item
+ at r{+Strand -Strand}
+ at item a
+ at r{Good    Good} (in agreement)
+ at item b
+ at r{Good    Bad}
+ at item c
+ at r{Bad     Good}
+ at item d
+ at r{Good    None}
+ at item e
+ at r{None    Good}
+ at item f
+ at r{Bad     Bad}
+ at item g
+ at r{Bad     None}
+ at item h
+ at r{None    Bad}
+ at item i
+ at r{Good    Good} (disagree)
+ at item j
+ at r{None    None}
+ at end table
+
+An example of the output using "Information" and "List" follows.
+
+ at example
+============================================================
+Wed 02 Apr 12:14:06 1997: quality summary
+------------------------------------------------------------
+Contig xb56b6.s1 (#11)
+ 81.00 OK on both strands and they agree(a)
+  3.94 OK on plus strand only(b,d)
+ 11.98 OK on minus strand only(c,e)
+  1.85 Bad on both strands(f,g,h,j)
+  1.22 OK on both strands but they disagree(i)
+============================================================
+Wed 02 Apr 12:14:09 1997: quality listing
+------------------------------------------------------------
+Contig xb56b6.s1 (#11)
+
+          10         20         30         40         50         60
+  eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeehee eeeeeeeeee eeeeeeeeee
+
+          70         80         90        100        110        120
+  eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee
+
+         130        140        150        160        170        180
+  eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee eeeeeeeeee
+
+         190        200        210        220        230        240
+  eeeeeeeeee eeeeeeeeee heeeeeeeee eeeeeeeici iiaiaciiia aaaaaaaaac
+
+         250        260        270        280        290        300
+  aaaacaaaaa aaaaaaaiia aaaaaaaaaa aaaaaaaaaa aaaabaaaaa aaaaaaaaaa
+
+         310        320        330        340        350        360
+  aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa faaaaaaaaa
+
+[ output removed for brevity ]
+ at end example
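
The relationship between the per-base letters of the "List" output and the percentages of the "Information" output can be sketched as below. This is an illustrative Python fragment, not part of gap4; the function name `quality_summary` is hypothetical, while the letter groupings are taken from the summary lines shown in the example above.

```python
from collections import Counter
from typing import Dict

# Letter groupings as printed in the "quality summary" example.
GROUPS = {
    "OK on both strands and they agree": "a",
    "OK on plus strand only": "bd",
    "OK on minus strand only": "ce",
    "Bad on both strands": "fghj",
    "OK on both strands but they disagree": "i",
}


def quality_summary(codes: str) -> Dict[str, float]:
    """Return the percentage of bases in each category.

    Whitespace (the gaps between blocks of ten in the listing) is ignored.
    """
    codes = "".join(codes.split())
    if not codes:
        raise ValueError("empty quality listing")
    counts = Counter(codes)
    return {
        label: 100.0 * sum(counts[c] for c in letters) / len(codes)
        for label, letters in GROUPS.items()
    }
```

Feeding it the concatenated rows of a listing should give figures comparable to those printed by "Information" for the same contig.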
+
+_split()
+ at node Template-Restriction
+ at subsection Restriction Enzyme Plot
+ at cindex Template display: restriction enzymes
+ at cindex Restriction enzymes: template display
+
+The restriction enzyme plot within the template display is a reduced version
+of the main Restriction Enzyme Map function. The dialogue used for choosing
+the restriction enzymes is identical and is described with the main function.
+_fxref(Restrict, Plotting Restriction Enzymes, restrict)
+It is invoked from the Template Display View menu.
+An example plot from the template display can be seen below.
+
+_lpicture(template.restriction,5.925in)
+
+Here we see the searches for two restriction enzymes. Each vertical line is
+drawn at the cut position of the matched restriction site. Unlike the main
+restriction enzyme plot here all matches are plotted on a single
+horizontal plot. Initially all sites are drawn in black. To distinguish one
+site from another either touch the site with the mouse cursor and read the 
+template
+display information line, or place the mouse cursor above a site and press
+the right mouse button. This pops up a menu containing "Information" and
+"Configure". The "Configure" option can be used to change the colour of all
+matches found for this enzyme. In the figure above we have changed
+the initial colours for both of the restriction enzymes searched for. The
+"Information" command displays information for all sites found in the text
+output window.
+
+As with the main Restriction Enzyme Map function, clicking the left mouse
+button on two restriction sites in turn displays the distance between the
+chosen sites in the information line. This figure is also displayed in the box
+at the top right hand corner of the template display.
+
+_include(consistency_display-t.texi)
+
+_split
+ at node SNP-Candidates
+ at section SNP Candidates
+ at cindex SNP candidates
+ at cindex Haplotype assignment
+
+The 2nd-Highest Confidence (_fpref(Consistency-2ndHighest, 2nd-Highest
+Confidence, 2nd-Highest Confidence)) and the Diploid Graph
+(_fpref(Consistency-Diploid, Diploid Graph, Diploid Graph)) both plot
+indicators of how likely an alignment column is to be made up of 2 or
+more sequence populations.
+
+By studying these in further detail we should be able to spot
+correlated differences and to start assigning haplotypes. The SNP
+Candidate plot initially brings up a dialogue asking for a single
+contig and range. After selecting this a window is displayed showing
+the likely locations of SNPs as seen below.
+
+_lpicture(snp_candidates1,6in)
+
+The top row of this has controls to define how the 2nd-Highest
+Confidence or Diploid Graph results are analysed in order to pick
+candidate locations for SNPs.
+
+Going from right to left, the ``2 alleles only'' toggle switches
+between the two algorithms; when enabled it uses the additional
+assumption coded into the Diploid Graph of there being only two
+populations in approximately 50:50 ratio. Next the minimum base
+quality may be adjusted. Any difference with a poorer quality than
+this is completely ignored. The minimum discrepancy score is a
+threshold (with high indicating a strong SNP) applied to the results
+of the consistency plot. A spike in this plot needs to be at
+least as high as this score to be accepted. This score is then
+adjusted for immediate proximity to other SNPs (e.g. it forms a run of
+bases) and this adjusted score is compared against the minimum SNP
+score parameter. Typically this can be left low. If any of these
+parameters are modified press the ``Recalculate candidate SNPs''
+button to recompute.
+
+The large central panel contains a vertically scrolled representation
+of the candidate SNPs found. By default the left-most plot contains a
+pictorial view of the sequence depth. Next to this is a vertical ruler
+showing the relative positions of candidate SNPs. Both of these two
+plots are to scale based on the sequence itself. To the right of these
+come a series of text based items with one row per candidate
+SNP. Initially this consists only of a check button (``Use''),
+Position, Score and the frequency of base types observed at that
+consensus column. Double clicking on any row will bring up the contig
+editor at that position showing the potential SNP. You may manually
+curate which ones you consider to be true or not by enabling or
+disabling the ``Use'' checkbox on that row. The score may also
+be manually adjusted allowing certain differences to be forced apart
+by using a very high score.
+
+The second row from the top contains a row of options controlling how
+the correlation between candidate SNPs is used to assign
+haplotypes. For every template in the contig the algorithm produces a
+fake sequence consisting only of the bases considered to be a
+candidate SNP and enabled by having the ``Use'' checkbox set. These
+fake sequences are then clustered to form groups. No re-alignment is
+performed as the existing multiple alignment has already been made
+(although you may wish to run the Shuffle Pads algorithm beforehand
+if the existing sequence alignment is poor).
+
+This is a fairly standard clustering algorithm that starts with each
+sequence being the sole member of a set. All sets are compared with
+each other based on the correlation between sets using an adjusted
+correlation score (achieved by subtracting ``Correlation offset'') and
+then the overlaps are ranked by score. The best scoring 
+sets are then merged together. If Fast Mode is not being used the
+merged set is then compared against everything else once more to
+obtain new scores, otherwise a simple adjustment is guessed
+at. Skipping this step speeds up the algorithm considerably and
+generally gives sufficient results; hence the Fast Mode toggle. This
+process is repeated until no two sets have an overlap score of greater
+than or equal to the ``Minimum merge score''.
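
The merge loop described above can be sketched as a plain agglomerative clustering. The Python below is a minimal illustration under stated assumptions, not the gap4 algorithm itself: the scoring function (matches minus mismatches between set consensuses at shared SNP positions, less the correlation offset) is invented for the example, as are all names. It corresponds to the non-fast mode, since every pair is rescored after each merge.

```python
from itertools import combinations
from typing import Dict, List

Seq = Dict[int, str]  # SNP position -> base seen on one template ("fake sequence")


def consensus(members: List[Seq]) -> Seq:
    """Most common base at each SNP position covered by the set."""
    votes: Dict[int, Dict[str, int]] = {}
    for seq in members:
        for pos, base in seq.items():
            per_pos = votes.setdefault(pos, {})
            per_pos[base] = per_pos.get(base, 0) + 1
    # sorted() makes ties resolve alphabetically, so results are repeatable.
    return {pos: max(sorted(b), key=b.get) for pos, b in votes.items()}


def score(a: List[Seq], b: List[Seq], offset: float) -> float:
    """Assumed score: matches minus mismatches, less the correlation offset."""
    ca, cb = consensus(a), consensus(b)
    shared = ca.keys() & cb.keys()
    matches = sum(1 for p in shared if ca[p] == cb[p])
    return matches - (len(shared) - matches) - offset


def cluster(seqs: List[Seq], offset: float = 0.5,
            min_merge: float = 1.0) -> List[List[Seq]]:
    """Merge the best-scoring pair until no pair reaches min_merge."""
    sets = [[s] for s in seqs]  # each sequence starts as its own set
    while len(sets) > 1:
        best, i, j = max((score(sets[i], sets[j], offset), i, j)
                         for i, j in combinations(range(len(sets)), 2))
        if best < min_merge:
            break
        sets[i] = sets[i] + sets[j]  # i < j, so deleting j keeps i valid
        del sets[j]
    # Largest set first, as in the left to right ordering of the display.
    return sorted(sets, key=len, reverse=True)
```

A fast-mode variant would skip the full rescoring after each merge and instead adjust the existing scores approximately; that shortcut is not shown here.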
+
+The Filter Templates button brings up a new dialogue box containing an
+editable list (initially blank) of template names. Adding a template
+name here will force this template to be ignored by the clustering
+algorithm. You may also enter reading names here too and they will be
+automatically converted to template names, hence filtering out all
+other readings from the same template. If you suspect specific
+templates of being chimeric then this is where they should be listed.
+
+The Cluster by SNPs button starts the clustering process running. It
+cannot be interrupted and may take a few minutes. After completion the
+``Sets'' component (rightmost) of the central plot is updated as seen
+in the screenshot below. Each set is a group of templates clustered
+together based on the candidate SNPs. They are sorted in left to right
+order such that the left-most set contains the largest number of
+templates and the right most set contains the fewest. The consensus
+for members of that set is displayed in each square and the quality of
+the consensus is shown in a similar fashion to the contig editor, with
+white being good quality and dark grey being poor (usually due to
+being low coverage within that set).
+
+The background to the entire row is also shaded to indicate the
+observed quality of that SNP in the context of this clustering. A
+white background indicates that two or more sets exist with high
+quality consensus bases (>= quality 90) that differ. A light grey
+background is used where the consensus bases differ but not with high
+quality bases. A dark grey background is used to indicate that the
+consensus in all sets covering that SNP candidate agree. This
+typically happens when either the clustering has failed or when a
+candidate SNP is not a real indicator of which haplotype a sequence
+belongs to, such as a base calling error or a random fluctuation in
+homopolymer length. If you wish to force this SNP to be used for
+clustering then try increasing its score and re-clustering.
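
The three background shades described above amount to a simple classification of each SNP row. The following Python sketch is illustrative only; the function name `row_background` and its input layout are assumptions, with the quality threshold of 90 taken from the text.

```python
from typing import List, Tuple


def row_background(set_calls: List[Tuple[str, int]],
                   high_quality: int = 90) -> str:
    """Classify one candidate SNP row.

    set_calls holds one (consensus base, consensus quality) pair for each
    set that covers this SNP position.
    """
    all_bases = {base for base, _ in set_calls}
    if len(all_bases) <= 1:
        # Every covering set agrees: probably not a haplotype marker.
        return "dark grey"
    good_bases = {base for base, qual in set_calls if qual >= high_quality}
    # Two or more differing bases, each supported at high quality.
    return "white" if len(good_bases) >= 2 else "light grey"
```

For example, `row_background([("A", 95), ("G", 40)])` returns `"light grey"`, because the sets differ but only one of the differing bases is of high quality.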
+
+_lpicture(snp_candidates2,6in)
+
+Hence in the above example we see two distinct good quality sets made
+from the SNPs between 1503 and 2334 and two more good quality sets
+from 12039 onwards. This indicates that we have no templates where one
+end spans SNPs in the 1503-2334 region and the other end spans SNPs in
+the 12039 onwards region. We also have a series of smaller sets which
+probably arise due to incorrect base calls or more rarely due to
+chimeras.
+
+Now if we double click to get the contig editor up it will display an
+additional window labelled ``Tabs''. NOTE: this does not happen if a
+contig editor for this contig is already being displayed. If so shut
+that one down first. Notice that the sequence names are also
+coloured. This indicates the set the sequence has been assigned
+to. The picture below also has the ``Highlight Disagreements'' mode
+enabled with a difference quality cutoff set to match the
+one used in the SNP Candidates plot. Two clear SNP positions can be
+seen.
+
+_lpicture(contig_editor_sets,6in)
+
+The tabs window lists the set numbers and their size (except for
+``All''). Selecting a set will show just sequences from that set. This
+allows for the set consensus and quality values to be viewed. The
+editor also allows for sequences to be moved from one set to another,
+but for now this purely serves a visual purpose and the movements
+are not passed back to the main SNP candidates window (although this
+is an obvious change to make).
+
+Moving back to the main SNP Candidates window note that we have a
+series of selection buttons at the bottom of the window. These control
+automatic selection of rows (SNPs) based on their quality assigned by
+observing the set consensus sequences. The clustering algorithm only
+works on selected SNPs so this allows for poor quality SNPs to be
+removed from further calculations. Additionally, to simplify the view,
+unselected SNPs may be removed by pressing the ``Remove unselected'' button.
+
+Each set has a checkbutton above it (not visible in the
+screenshot). Initially these are not enabled, but they indicate which
+sets certain operations should be performed on. Pressing the right
+mouse button over a set (or a set checkbox) brings up a menu
+indicating the following operations.
+
+@table @strong
+@item Delete set
+@itemx Merge selected sets
+This removes either the clicked upon set or all enabled sets (those
+that have their checkbox set) from the display.
+@sp 1
+@item Save this consensus
+@itemx Save consensus for selected sets
+This brings up a dialogue box allowing the consensus for a single or
+selected sets to be saved in FASTA format. The set numbers field is a
+space-separated list of numbers representing the sets to save, starting
+with the leftmost set being numbered as 1. Initially this is either
+the one you clicked on or all the selected ones, but it may be edited
+in this dialogue too prior to saving. Strip pads removes padding
+characters ('*') from the consensus.
+
+``Incorporate ungrouped templates'' controls how template sequences
+that were not assigned to at least one set are dealt with. It could be
+considered that sequences covering regions where no SNPs have been
+detected should be included when computing the consensus, and this is
+the default action. However this can be disabled such that only
+sequences that were specifically used for breaking the assembly apart
+into sets form the consensus.
+@sp 1
+@item Produce fofn for this set
+@itemx Produce fofn for selected sets
+These options allow a file or list of reading names to be
+saved. A single fofn is produced but multiple sets may be grouped
+together in one fofn. Here the set number ``0'' is a placeholder for
+all of the sequences that were not assigned to a set.
+@end table
+
+The final set of controls to discuss in the SNP Candidates window
+control the splitting of sets into contigs. This is a one-way action
+which cannot be undone, so make sure you back up the database using
+Copy Database beforehand.
+
+The ``Split sets to contigs'' button moves the readings in each
+selected set to its own contig. In some cases a set may be
+non-contiguous. Remember that templates are assigned to sets, but a
+template may often only have the end sequence known with the middle
+portion being unsequenced. Gap4 does not currently handle scaffolds
+and super-contigs so in order to keep such sets held together in a
+single contig the ``Add fake consensus'' option may be used. This adds
+an additional sequence to the contig that contains the consensus for
+the set (including from readings that were unassigned). This also
+handily means that new contigs produced from multiple sets are already
+aligned and base coordinates are directly comparable. Hence two such sets may
+be viewed in the Join Editor by typing their names into the main Join
+Contigs dialogue. (Find Internal Joins will attempt to realign the
+contigs and often fails if the set contains many regions of unknown
+consensus.)
+
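The row background rule described for the SNP Candidates window amounts to
a three-way test on the per-set consensus calls. The sketch below is not
Gap4 code: the function name row_shade and its BASE:QUAL argument format
are hypothetical, and it assumes that sets which do not cover the SNP are
simply left out of the argument list.

```shell
#!/bin/sh
# row_shade BASE:QUAL ...
# Prints the background shade for one SNP row given the consensus call
# (base and quality) of every set covering that SNP.
row_shade() {
    all=""     # distinct consensus bases seen in any set
    high=""    # distinct consensus bases seen with quality >= 90
    for call in "$@"; do
        base=${call%%:*}
        qual=${call##*:}
        case "$all" in *"$base"*) ;; *) all="$all$base" ;; esac
        if [ "$qual" -ge 90 ]; then
            case "$high" in *"$base"*) ;; *) high="$high$base" ;; esac
        fi
    done
    if [ "${#high}" -ge 2 ]; then
        echo white          # high quality sets disagree
    elif [ "${#all}" -ge 2 ]; then
        echo light-grey     # sets disagree, but not at high quality
    else
        echo dark-grey      # every covering set agrees
    fi
}
```

For example, "row_shade A:95 G:40" reports light-grey because only one of
the two differing bases reaches the quality 90 threshold.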
diff --git a/manual/template.dialogue.png b/manual/template.dialogue.png
new file mode 100644
index 0000000..596dc33
Binary files /dev/null and b/manual/template.dialogue.png differ
diff --git a/manual/template.display.png b/manual/template.display.png
new file mode 100644
index 0000000..e1f5c93
Binary files /dev/null and b/manual/template.display.png differ
diff --git a/manual/template.display.small.png b/manual/template.display.small.png
new file mode 100644
index 0000000..5172563
Binary files /dev/null and b/manual/template.display.small.png differ
diff --git a/manual/template.quality.png b/manual/template.quality.png
new file mode 100644
index 0000000..7770a3c
Binary files /dev/null and b/manual/template.quality.png differ
diff --git a/manual/template.quality.small.png b/manual/template.quality.small.png
new file mode 100644
index 0000000..f3cef19
Binary files /dev/null and b/manual/template.quality.small.png differ
diff --git a/manual/template.restriction.png b/manual/template.restriction.png
new file mode 100644
index 0000000..cdbf889
Binary files /dev/null and b/manual/template.restriction.png differ
diff --git a/manual/template.restriction.small.png b/manual/template.restriction.small.png
new file mode 100644
index 0000000..86376f8
Binary files /dev/null and b/manual/template.restriction.small.png differ
diff --git a/manual/template_status.png b/manual/template_status.png
new file mode 100644
index 0000000..78536b5
Binary files /dev/null and b/manual/template_status.png differ
diff --git a/manual/test.texi b/manual/test.texi
new file mode 100644
index 0000000..b66e879
--- /dev/null
+++ b/manual/test.texi
@@ -0,0 +1,119 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+
+include(header.m4)
+
+@node Top
+@chapter Top
+
+@display
+display
+example
+ with multiple
+  spaced @image{gap5_template_by_stacking,2in}
+    text
+@end display
+
+
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+
+@table @asis
+@item By size: _picture(gap5_template_by_size,2in)
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+
+@item @image{gap5_template_by_stacking,2in}
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+@end table
+
+
+@multitable @columnfractions .2 .4 .4
+@item By template size By template size By template size By template size
+@tab @image{gap5_template_by_size,2in}
+@tab blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+
+@item By stacking By stacking By stacking By stacking
+@tab @image{gap5_template_by_stacking,2in}
+@tab Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+Text text text text
+@end multitable
+
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+blah blah blah blah
+
+@bye
diff --git a/manual/tools/docmake b/manual/tools/docmake
new file mode 100755
index 0000000..eac756f
--- /dev/null
+++ b/manual/tools/docmake
@@ -0,0 +1,50 @@
+#!/bin/sh
+#
+# jkb 06/10/95 - creates stubs for the document system.
+#
+
+if [ "$DOCDIR"x = "x" ]
+then
+    DOCDIR=.
+    export DOCDIR
+fi
+
+text2texi=$DOCDIR/tools/text2texi
+
+# Ask the user for the names.
+echo -n 'Enter document name (no spaces) : '
+read DOCNAME
+echo -n 'Enter document title : '
+read DOCTITLE
+echo -n 'Enter document subtitle (if any) : '
+read DOCSUBTIT
+echo -n 'Enter document author (if any) : '
+read DOCAUTHOR
+echo
+
+# Escape any shell or texi meta characters - yum!
+XDOCNAME=`echo $DOCNAME|sed 's:\\\\:\\\\\\\\:g;s:/:\\\\/:g'|$text2texi`
+XDOCTITLE=`echo $DOCTITLE|sed 's:\\\\:\\\\\\\\:g;s:/:\\\\/:g'|$text2texi`
+XDOCSUBTIT=`echo $DOCSUBTIT|sed 's:\\\\:\\\\\\\\:g;s:/:\\\\/:g'|$text2texi`
+XDOCAUTHOR=`echo $DOCAUTHOR|sed 's:\\\\:\\\\\\\\:g;s:/:\\\\/:g'|$text2texi`
+
+# Create the doc.texi file
+echo Creating $DOCNAME.texi
+sed "s/DOCNAME/$XDOCNAME/;s/DOCTITLE/$XDOCTITLE/;s/DOCSUBTIT/$XDOCSUBTIT/;s/DOCAUTHOR/$XDOCAUTHOR/" < $DOCDIR/doc.template > $DOCNAME.texi
+
+echo Creating $DOCNAME-t.texi
+# Create the doc-t.texi file
+cp $DOCDIR/doc-t.template $DOCNAME-t.texi
+
+# Edit the makefile. (let's hope DOCNAME doesn't start with a fullstop)
+cp Makefile Makefile.bak
+echo 'Editing Makefile (a copy is in Makefile.bak)'
+ed Makefile << _ed_ > /dev/null
+/^all:/s/xref/$DOCNAME xref/
++1
+i
+$DOCNAME:       ${DOCNAME}_toc.html $DOCNAME.index $DOCNAME.dvi
+.
+w
+q
+_ed_
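docmake pastes user input into the replacement side of a sed s command,
so any character that sed treats specially there has to be escaped first.
The helper below is a minimal sketch of that step, not part of docmake:
the name escape_sed_replacement is hypothetical, and besides backslash and
slash it also escapes "&", which sed would otherwise expand to the matched
text. It assumes single-line input.

```shell
#!/bin/sh
# escape_sed_replacement STRING
# Prints STRING with backslash, slash and ampersand each preceded by a
# backslash, so it can be used as the replacement of s/.../.../.
escape_sed_replacement() {
    printf '%s\n' "$1" | sed 's:[\\/&]:\\&:g'
}

# Usage: substitute a title containing both "&" and "/".
title='R&D a/b'
echo 'Title: DOCTITLE' | sed "s/DOCTITLE/$(escape_sed_replacement "$title")/"
```

Using ":" as the delimiter inside the helper avoids having to escape the
slash in its own pattern.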
diff --git a/manual/tools/edit_mini_contents.pl b/manual/tools/edit_mini_contents.pl
new file mode 100755
index 0000000..57b5fe8
--- /dev/null
+++ b/manual/tools/edit_mini_contents.pl
@@ -0,0 +1,25 @@
+#!/usr/bin/perl -w
+
+#
+# 14/03/00 jkb
+#
+# Adds the Home, Up, etc to the mini manual contents pages.
+#
+while (<>) {
+  if (/^<H1>/) {
+    print <<EOH
+<a href="../index.html"><img src="i/nav_home.gif" alt="home"></a>
+<a href="master_contents.html"><img src="i/nav_full.gif" alt="full"></a>
+<hr size=4>
+EOH
+  } elsif (/<H2>Last update on/) {
+    next;
+  } elsif (/^<HR>$/) {
+    print <<EOF
+<hr size=4> 
+<a href="../index.html"><img src="i/nav_home.gif" alt="home"></a> 
+<a href="master_contents.html"><img src="i/nav_full.gif" alt="full"></a> 
+EOF
+  }
+  print;
+}
diff --git a/manual/tools/html_index.pl b/manual/tools/html_index.pl
new file mode 100755
index 0000000..af064d6
--- /dev/null
+++ b/manual/tools/html_index.pl
@@ -0,0 +1,42 @@
+#!/usr/bin/perl -w
+
+# jkb 12/12/95
+# Usage: html_index.pl DOCUMENT_toc.html
+
+# Builds $ARGV.index. Its use is as follows:
+#
+# $ARGV.index contains a mapping of node names to urls. The show_help Tcl
+# command reads this file.
+
+#$last = "";
+#@sub = ();
+
+$name = $ARGV[0];
+$name =~ s/_toc\.html$//;
+open(INDEX, "> $name.index") || die "Couldn't create $name.index";
+
+print INDEX "{Contents} ${name}_toc.html\n";
+
+#$"=":";
+while (<ARGV>) {
+#    if (/<UL>/) {
+#	push(@sub, $last) if ($last ne "");
+#    }
+#
+#    if (/<\/UL>/) {
+#	pop(@sub);
+#    }
+
+    s/<CODE>(.*)<\/CODE>/$1/;
+    if (/NODE:(.*) -->.*SEC.*HREF="(.*)">([^<]*)/) {
+	print INDEX "{$1} $2\n";
+#	if ($#sub>=0) {
+#	    print INDEX "{@sub:$3} $2\n";
+#	} else {
+#	    print INDEX "{$3} $2\n";
+#	}
+#	$last=$3;
+    }
+}
+
+close(INDEX);
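Each line of the generated .index file has the form "{node name} url".
As an illustration of how such a file could be consulted from a shell,
here is a small lookup sketch. The function name lookup_node is
hypothetical (the real consumer is the show_help Tcl command), and it
assumes node names contain no backslashes, since awk -v interprets
escape sequences.

```shell
#!/bin/sh
# lookup_node INDEXFILE NODE
# Prints the URL recorded for NODE, or nothing if NODE is not listed.
lookup_node() {
    awk -v node="{$2}" '
        # A matching line starts with "{NODE} "; the URL is the rest.
        index($0, node " ") == 1 {
            print substr($0, length(node) + 2)
            exit
        }' "$1"
}
```

Matching on the literal "{NODE} " prefix rather than a regular expression
keeps node names containing dots or other regex characters safe.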
diff --git a/manual/tools/list_nodes b/manual/tools/list_nodes
new file mode 100755
index 0000000..89eb1cb
--- /dev/null
+++ b/manual/tools/list_nodes
@@ -0,0 +1,2 @@
+#!/bin/sh
+egrep '^@node' *.texi | sed 's/\(.*\):@node \([^,]*\).*/\1:\2/'
diff --git a/manual/tools/lowersection b/manual/tools/lowersection
new file mode 100755
index 0000000..98ea2c3
--- /dev/null
+++ b/manual/tools/lowersection
@@ -0,0 +1,21 @@
+#!/bin/sed -f
+s/^@subsection/@subsubsection/
+s/^@section/@subsection/
+s/^@chapter/@section/
+
+s/^@unnumberedsubsec/@unnumberedsubsubsec/
+s/^@unnumberedsec/@unnumberedsubsec/
+s/^@unnumbered\([^a-z]\)/@unnumberedsec\1/
+
+s/^@numberedsubsec/@numberedsubsubsec/
+s/^@numberedsec/@numberedsubsec/
+s/^@numbered\([^a-z]\)/@numberedsec\1/
+
+s/^@appendixsubsec/@appendixsubsubsec/
+s/^@appendixsec/@appendixsubsec/
+s/^@appendix\([^a-z]\)/@appendixsec\1/
+
+s/^@subheading/@subsubheading/
+s/^@heading/@subheading/
+s/^@chapheading/@sheading/
+s/^@majorheading/@sheading/
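sed applies every command in a script to each line in order, so a rule
for a short command name can re-match the output of an earlier rule unless
its pattern is terminated. The sketch below shows the terminated form for
the @unnumbered family only; the function name lower_unnumbered is
hypothetical, and it assumes each sectioning command is followed by a
title, so a non-letter always follows the command name.

```shell
#!/bin/sh
# lower_unnumbered
# Filters stdin, lowering each @unnumbered* command by exactly one level.
# The \([^a-z]\) group stops "@unnumbered" matching "@unnumberedsec".
lower_unnumbered() {
    sed -e 's/^@unnumberedsubsec\([^a-z]\)/@unnumberedsubsubsec\1/' \
        -e 's/^@unnumberedsec\([^a-z]\)/@unnumberedsubsec\1/' \
        -e 's/^@unnumbered\([^a-z]\)/@unnumberedsec\1/'
}

# Usage: each input line moves down one level and no further.
printf '%s\n' '@unnumbered Intro' '@unnumberedsec Details' | lower_unnumbered
```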
diff --git a/manual/tools/make_dependencies b/manual/tools/make_dependencies
new file mode 100755
index 0000000..f0a532c
--- /dev/null
+++ b/manual/tools/make_dependencies
@@ -0,0 +1,22 @@
+#!/bin/sh
+# Produces a dependencies file for use with gmake.
+
+for i in *.texi
+do
+    # M4 include commands.
+    b=`echo $i | sed 's/\.texi//'`
+
+    echo "$b.pdf_done:"
+    sed -n "s/^_include(\(.*\)\.texi)/$b.pdf_done: \1.pdf_done/p" $i
+
+    echo "$b.eps_done:"
+    sed -n "s/^_include(\(.*\)\.texi)/$b.eps_done: \1.eps_done/p" $i
+
+    # pdf image files from the images
+    sed -n "s/^_picture(\([^,]*\).*)/$b.pdf_done: \1.pdf/p" $i
+    sed -n "s/^_lpicture(\([^,]*\).*)/$b.pdf_done: \1.pdf/p" $i
+
+    # eps files from the images
+    sed -n "s/^_picture(\([^,]*\).*)/$b.eps_done: \1.eps/p" $i
+    sed -n "s/^_lpicture(\([^,]*\).*)/$b.eps_done: \1.eps/p" $i
+done
diff --git a/manual/tools/make_eps b/manual/tools/make_eps
new file mode 100755
index 0000000..dc80791
--- /dev/null
+++ b/manual/tools/make_eps
@@ -0,0 +1,81 @@
+#!/usr/bin/perl -w
+
+#
+# Converts a gif/png image to postscript with the following rules.
+#
+# 1. The standard conversion is at 120x120 dpi.
+# 2. The maximum width is 16.3 cm.
+#
+# We resolve rule 2 by increasing the dpi.
+#
+
+# Location of ImageMagick conversion program
+#$convert = '/usr/local/bin/convert';
+#$convert = '/usr/X11R6/bin/convert';
+$convert = '/usr/bin/convert';
+
+# Default density
+$default_density = '120';
+
+# For images with a non standard density, we list them here.
+%densities = (
+	      'conf_values_p', '124',
+	      'consistency_p', '123',
+	      'contig_editor.join', '168',
+	      'contig_editor.screen', '168',
+	      'contig_editor.traces', '161',
+	      'contig_editor.traces.compact', '161',
+	      'contig_editor_grey_scale', '168',
+	      'contig_editor_mutations', '168',
+	      'interface.menus', '26',
+	      'primer_pos_plot', '126',
+	      'primer_pos_seq_display', '131',
+	      'read_coverage_p', '123',
+	      'readpair_coverage_p', '125',
+	      'restrict_enzymes', '126',
+	      'spin_plot_base_comp_p', '121',
+	      'spin_plot_drag1', '125',
+	      'spin_plot_drag2', '125',
+	      'spin_plot_drag3', '125',
+	      'spin_restrict_enzymes_p', '125',
+	      'spin_restrict_enzymes_p1', '125',
+	      'spin_seq_display', '151',
+	      'spin_sequence_display_t', '155',
+	      'spin_trna_p', '121',
+	      'stops', '128',
+	      'traces_diff', '163',
+	      'conf_values_p', '121',
+	      'mut_contig_editor5', '167',
+	      'mut_contig_editor_dis5', '167',
+	      'mut_traces_het', '161',
+	      'mut_traces_point', '161',
+	      'mut_traces_positive', '161',
+	      );
+
+while ($#ARGV >= 0) {
+    $_ = shift(@ARGV);
+    next if (/\.small\./);
+    s/\.(gif|png)//;
+    my $old_fmt=$1;
+    $fname = $_;
+    $density = $densities{$_} ? $densities{$_} : $default_density;
+    $density = $density . 'x' . $density;
+
+    # Convert the image
+    system "$convert -density $density $_.$old_fmt $_.eps";
+
+    # Find the size of the postscript image.
+    open(FILE, "$_.eps") || die "Cannot open $_.eps\n";
+    while (<FILE>) {
+	if (s/^%%BoundingBox: 0 0 (.*) .*/$1/) {
+	    $width=$_;
+	}
+    }
+    close(FILE);
+
+    # Check the size. 16.3cm = 462 1/72th inch.
+    if ($width > 462) {
+	$size=int($width/462.0*120+1);
+	print "FIX: Suggested new density: '$fname', '$size',\n";
+    }
+}
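The width check only works if the bounding box width is compared as a
number: compared as strings, "1000" sorts before "462". The sketch below
restates the check and the density suggestion in shell arithmetic. The
function name suggest_density is hypothetical, and it assumes an integer
width in 1/72 inch units, as found in an EPS %%BoundingBox line.

```shell
#!/bin/sh
# suggest_density WIDTH [MAXWIDTH] [BASEDENSITY]
# Prints a suggested density when WIDTH exceeds MAXWIDTH (default 462,
# i.e. 16.3cm), scaling up from BASEDENSITY (default 120). Prints
# nothing when the image already fits.
suggest_density() {
    width=$1
    max=${2:-462}
    base=${3:-120}
    if [ "$width" -gt "$max" ]; then
        # Integer form of int(width / max * base + 1).
        echo $(( width * base / max + 1 ))
    fi
}
```

For a 924 point wide image this prints 241, just over twice the default
density, which brings the converted width back under the limit.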
diff --git a/manual/tools/make_gif_html b/manual/tools/make_gif_html
new file mode 100755
index 0000000..6b25bec
--- /dev/null
+++ b/manual/tools/make_gif_html
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+# For all the gifs that are split into a small and large version, create a
+# .gif.html file which is simply an html page containing the gif. This is
+# needed because the tcl/tk html viewer doesn't support gif files directly.
+
+for i in *.small.*.gif
+do
+    name=`echo $i | sed 's/\.small\.\(.*\)\.gif$/\.\1/'`
+    # name=`echo $i | sed 's/.small.gif$//'`
+    echo "Creating $name.gif.html"
+    cat > $name.gif.html << _eof_
+<html><body bgcolor="#ffffff">
+<img src="$name.gif">
+</body></html>
+_eof_
+done
\ No newline at end of file
diff --git a/manual/tools/make_pdf b/manual/tools/make_pdf
new file mode 100755
index 0000000..95a9bb9
--- /dev/null
+++ b/manual/tools/make_pdf
@@ -0,0 +1,131 @@
+#!/usr/bin/perl -w
+
+#
+# Converts a png image to PDF with the following rules.
+#
+# 1. The standard conversion is at 120x120 dpi.
+# 2. The maximum width is 16.3 cm.
+#
+# We resolve rule 2 by increasing the dpi.
+#
+
+# Location of ImageMagick conversion program
+$convert = 'convert';
+foreach ( '/usr/bin/magick',
+	  '/usr/local/bin/magick',
+	  '/usr/bin/convert',
+	  '/usr/local/bin/convert',
+	  '/usr/X11R6/bin/convert') {
+    if ( -e $_ ) {
+	$convert = $_;
+	last;
+    }
+}
+
+# Default density
+$default_density = '120';
+
+# For images with a non standard density, we list them here.
+%densities = (
+	      'template.display', '124',
+	      'consistency_p', '132',
+	      'restrict_enzymes', '135',
+	      'stops', '137',
+	      'contig_editor.screen', '180',
+	      'contig_editor_grey_scale', '180',
+	      'contig_editor.traces', '173',
+	      'contig_editor.join', '180',
+	      '2nd_highest_confidence', '129',
+	      'discrepancy_graph', '129',
+	      'conf_values_p', '134',
+	      'read_coverage_p', '132',
+	      'readpair_coverage_p', '134',
+	      'template.quality', '122',
+	      'snp_candidates1', '167',
+	      'snp_candidates2', '167',
+	      'contig_editor_sets', '178',
+	      'mut_traces_het', '173',
+	      'mut_traces_positive', '173',
+	      'contig_editor.traces.compact', '173',
+	      'c_order_t1', '126',
+	      'c_order_t2', '126',
+	      'contig_navigation_table', '125',
+	      'notes.selector', '125',
+	      'mut_pregap4', '124',
+	      'mut_traces_point', '173',
+	      'mut_contig_editor5', '178',
+	      'mut_contig_editor_dis5', '178',
+	      'mut_template_all', '124',
+	      'mut_template_reads', '124',
+	      'mut_template_reads_single', '124',
+	      'pregap4_compact', '124',
+	      'pregap4_files', '124',
+	      'pregap4_config', '124',
+	      'pregap4_textwin', '124',
+	      'primer_pos_plot', '135',
+	      'primer_pos_seq_display', '140',
+	      'trev_pic', '128',
+	      'trace_print_trace1', '128',
+	      'trev_conf_trace', '128',
+	      'trev_pyro_trace', '127',
+	      'trace_print_menu', '128',
+	      'spin_plot_p', '128',
+	      'spin_restrict_enzymes_p', '134',
+	      'spin_plot_base_comp_p', '130',
+	      'spin_weight_matrix', '128',
+	      'spin_splice', '127',
+	      'spin_base_bias_p', '124',
+	      'spin_sequence_display_t', '166',
+	      'spin_seq_display', '162',
+	      'spin_restrict_enzymes_p1', '134',
+	      'spin_string_search_p', '125',
+	      'spin_start_p', '127',
+	      'spin_stops_p', '127',
+	      'spin_stops_p2', '127',
+	      'spin_codon_usage', '126',
+	      'spin_codon_usage_aaonly', '126',
+	      'spin_author_p', '126',
+	      'spin_trna_p', '129',
+	      'spin_results_manager_d2', '126',
+	      'spin_plot_drag1', '134',
+	      'spin_plot_drag2', '134',
+	      'spin_plot_drag3', '134',
+	      'gap5_template_spread0', '151',
+	      'gap5_template_spread50', '151',
+	      'gap5_template_by_stacking', '152',
+	      'gap5_template_by_mapping', '152',
+	      'gap5_template_by_size', '152',
+	      'gap5_contig_editor.screen', '151',
+	      'gap5_contig_editor.join', '151',
+	      'gap5_contig_editor.traces', '173',
+	      'gap5_contig_editor.454trace', '173',
+	      );
+
+while ($#ARGV >= 0) {
+    $_ = shift(@ARGV);
+    next if (/\.small\./);
+    s/\.png//;
+    $fname = $_;
+    $density = $densities{$_} ? $densities{$_} : $default_density;
+    $density = $density . 'x' . $density;
+
+    # Convert the image
+    system "$convert -units PixelsPerInch -density $density $_.png $_.pdf";
+
+    # Find the size of the PDF image.
+    open(FILE, "$_.pdf") || die "Cannot open $_.pdf\n";
+    binmode(FILE,":raw");
+    while (<FILE>) {
+	if (/^\/CropBox/) {
+	    ($a,$b)=/^\/CropBox \[(\d+(?:\.\d+)?) \d+(?:\.\d+)? (\d+(?:\.\d+)?) \d+(?:\.\d+)?\]/;
+	    $width = $b-$a;
+	}
+    }
+    close(FILE);
+
+    # Check the size. 6in = 432 1/72th inch.
+    if ($width > 432) {
+	$size=int($width/432.0*120+1);
+	print "FIX: Suggested new density: '$fname', '$size',\n";
+    }
+}
diff --git a/manual/tools/make_png_html b/manual/tools/make_png_html
new file mode 100755
index 0000000..f46105a
--- /dev/null
+++ b/manual/tools/make_png_html
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+# For all the pngs that are split into a small and large version, create a
+# .png.html file which is simply an html page containing the png. This is
+# needed because the tcl/tk html viewer doesn't support png files directly.
+
+for i in *.small.*.png
+do
+    name=`echo $i | sed 's/\.small\.\(.*\)\.png$/\.\1/'`
+    # name=`echo $i | sed 's/.small.png$//'`
+    echo "Creating $name.png.html"
+    cat > $name.png.html << _eof_
+<html><body bgcolor="#ffffff">
+<img src="$name.png">
+</body></html>
+_eof_
+done
\ No newline at end of file
diff --git a/manual/tools/make_ps b/manual/tools/make_ps
new file mode 100755
index 0000000..27e34aa
--- /dev/null
+++ b/manual/tools/make_ps
@@ -0,0 +1,82 @@
+#!/usr/bin/perl -w
+
+#
+# Converts a gif/png image to postscript with the following rules.
+#
+# 1. The standard conversion is at 120x120 dpi.
+# 2. The maximum width is 16.3 cm.
+#
+# We resolve rule 2 by increasing the dpi.
+#
+
+# Location of ImageMagick conversion program
+#$convert = '/usr/local/bin/convert';
+#$convert = '/usr/X11R6/bin/convert';
+$convert = '/usr/bin/convert';
+
+# Default density
+$default_density = '120';
+
+# For images with a non standard density, we list them here.
+%densities = (
+	      'conf_values_p', '124',
+	      'consistency_p', '123',
+	      'contig_editor.join', '168',
+	      'contig_editor.screen', '168',
+	      'contig_editor.traces', '161',
+	      'contig_editor.traces.compact', '161',
+	      'contig_editor_grey_scale', '168',
+	      'contig_editor_mutations', '168',
+	      'interface.menus', '26',
+	      'primer_pos_plot', '126',
+	      'primer_pos_seq_display', '131',
+	      'read_coverage_p', '123',
+	      'readpair_coverage_p', '125',
+	      'restrict_enzymes', '126',
+	      'spin_plot_base_comp_p', '121',
+	      'spin_plot_drag1', '125',
+	      'spin_plot_drag2', '125',
+	      'spin_plot_drag3', '125',
+	      'spin_restrict_enzymes_p', '125',
+	      'spin_restrict_enzymes_p1', '125',
+	      'spin_seq_display', '151',
+	      'spin_sequence_display_t', '155',
+	      'spin_trna_p', '121',
+	      'stops', '128',
+	      'traces_diff', '163',
+	      'conf_values_p', '121',
+	      'mut_contig_editor5', '167',
+	      'mut_contig_editor_dis5', '167',
+	      'mut_traces_het', '161',
+	      'mut_traces_point', '161',
+	      'mut_traces_positive', '161',
+	      );
+
+while ($#ARGV >= 0) {
+    $_ = shift(@ARGV);
+    next if (/\.small\./);
+    s/\.(gif|png)//;
+    my $old_fmt=$1;
+    $fname = $_;
+    $density = $densities{$_} ? $densities{$_} : $default_density;
+    $density = $density . 'x' . $density;
+
+    # Convert the image
+    print "processing $_\n";
+    system "$convert -density $density $_.$old_fmt $_.ps";
+
+    # Find the size of the postscript image.
+    open(FILE, "$_.ps") || die "Cannot open $_.ps\n";
+    while (<FILE>) {
+	if (s/^%%BoundingBox: 0 0 (.*) .*/$1/) {
+	    $width=$_;
+	}
+    }
+    close(FILE);
+
+    # Check the size. 16.3cm = 462 1/72th inch.
+    if ($width > 462) {
+	$size=int($width/462.0*120+1);
+	print "FIX: Suggested new density: '$fname', '$size',\n";
+    }
+}
diff --git a/manual/tools/merge_indexes.pl b/manual/tools/merge_indexes.pl
new file mode 100755
index 0000000..e78cc55
--- /dev/null
+++ b/manual/tools/merge_indexes.pl
@@ -0,0 +1,258 @@
+#!/usr/bin/perl -w
+
+#
+# 13/10/95 jkb
+#
+# Loops around the table of contents html files generated by texi2html
+# to find the index files. When found, we merge these to generate a single
+# master index. Simultaneously we output full and brief contents pages, and
+# update those pages we're accessing to point back to our master pages.
+#
+
+%index = ();
+$silent = 1;
+$doit = 0;
+$curr_file = "";
+$TODAY = &pretty_date;			# like "20 September 1993"
+$http_prefix = "..";
+$http_prefix2 = "$http_prefix/manual";
+$package_version = "version 1.5 (2004)";
+
+open(TOCL, "> master_contents.html")
+	|| die "Couldn't create master_contents.html";
+open(TOCS, "> master_brief.html")
+	|| die "Couldn't create master_brief.html";
+
+# Create full and brief contents page headers
+print TOCL <<EOH;
+<HTML>
+<HEAD>
+<TITLE>Master Table of Contents</TITLE>
+</HEAD>
+<BODY bgcolor="#ffffff">
+<a href="$http_prefix/index.html"><img src="i/nav_home.gif" alt="home"></a>
+<a href="master_brief.html"><img src="i/nav_brief.gif" alt="brief"></a>
+<hr size=4>
+<H1>Master Table of Contents</H1>
+<H3>For $package_version</H3>
+For the most recent version of this documentation see the package
+<a href="http://staden.sourceforge.net/documentation.html">home page</a>.
+EOH
+
+print TOCS <<EOH;
+<HTML>
+<HEAD>
+<TITLE>Master Table of Contents (Brief)</TITLE>
+</HEAD>
+<BODY bgcolor="#ffffff">
+<a href="$http_prefix/index.html"><img src="i/nav_home.gif" alt="home"></a>
+<a href="master_contents.html"><img src="i/nav_full.gif" alt="full"></a>
+<hr size=4>
+<H1>Master Table of Contents (Brief)</H1>
+<H3>For $package_version</H3>
+For the most recent version of this documentation see the package
+<a href="http://staden.sourceforge.net/documentation.html">home page</a>.
+<HR>
+<P>
+EOH
+
+# Scan through contents pages adding to master contents and collating the
+# index information.
+while (<ARGV>) {
+    if ($ARGV ne $curr_file) {
+	if ($curr_file) {
+	    close(NEW_TOC);
+	    rename("_$curr_file", $curr_file);
+	}
+	$curr_file = $ARGV;
+	open(NEW_TOC, "> _$ARGV") || die "Couldn't create _$ARGV";
+    }
+
+    if (/^<BODY.*>$/) { # Header is next line
+	$silent = 0;
+	print NEW_TOC $_;
+	print NEW_TOC <<EOH;
+<a href="$http_prefix/index.html"><img src="i/nav_home.gif" alt="home"></a>
+<a href="master_brief.html"><img src="i/nav_brief.gif" alt="brief"></a>
+<a href="master_contents.html"><img src="i/nav_full.gif" alt="full"></a>
+<hr size=4>
+EOH
+	next;
+    }
+    if (/^<HR>$/) { #Footer of _toc file (hack: assume on a line by itself)
+        print NEW_TOC <<EOH;
+<hr size=4>
+<a href="$http_prefix/index.html"><img src="i/nav_home.gif" alt="home"></a>
+<a href="master_brief.html"><img src="i/nav_brief.gif" alt="brief"></a>
+<a href="master_contents.html"><img src="i/nav_full.gif" alt="full"></a>
+<hr>
+EOH
+	$silent = 1;
+	$doit = 0;
+	next;
+    }
+    print NEW_TOC $_;
+    next if ($silent);
+
+    if (/^<H1>(.*)<\/H1>$/) {
+	print TOCL "<HR>\n<H2><A HREF=\"$ARGV\">$1</A></H2>\n";
+	print TOCS "<H2><A HREF=\"$ARGV\">$1</A></H2>\n";
+	next;
+    }
+    $doit = 1 if (/^<UL>$/);
+    print TOCL $_ if ($doit);
+    if (/<LI><A NAME="SEC.*HREF="([^#]*)[^"]*">(.*[Ii]ndex.*)<\/A>$/) {
+	print "Found '$2' in $1: ";
+	
+	&generate_index($1, $2);
+    }
+}
+if ($curr_file) {
+    close(NEW_TOC);
+    rename("_$curr_file", $curr_file);
+}
+
+# Create full and brief contents page footers.
+print TOCL <<EOF;
+<HR>
+<H2><A HREF="master_index.html">Master Index</A></H2>
+<hr size=4>
+<a href="$http_prefix/index.html"><img src="i/nav_home.gif" alt="home"></a>
+<a href="master_brief.html"><img src="i/nav_brief.gif" alt="brief"></a>
+</BODY>
+</HTML>
+EOF
+close(TOCL);
+
+print TOCS <<EOF;
+<H2><A HREF="master_index.html">Master Index</A></H2>
+<hr size=4>
+<a href="$http_prefix/index.html"><img src="i/nav_home.gif" alt="home"></a>
+<a href="master_contents.html"><img src="i/nav_full.gif" alt="full"></a>
+<HR>
+This document was generated using the <CITE>merge_indexes.pl</CITE> program.
+<p>
+<i>Last generated on $TODAY.
+</i>
+<font size="-1"><br>
+</BODY>
+</HTML>
+EOF
+close(TOCS);
+
+# Generate our master index (also with links to the master contents pages).
+&print_index();
+
+
+#------------------------------------------------------------------------------
+
+sub generate_index {
+    local($file,$key) = @_;
+    $level = 0;
+    $letter = "";
+
+    open(FILE, "< $file") || die "Couldn't read $file";
+
+    while (<FILE>) {
+	# Look for $key as an HREF in the file. This will be the index start
+	if (/^<H([1-6])><A NAME=.*>$key<\/A><\/H[1-6]>$/) {
+	    $level = $1+1;
+	    next;
+	}
+    
+	# Indexes always end in </P>	    
+	if ($level && /^<\/P>$/) {
+	    print "\n";
+	    close(FILE);
+	    return;
+	}
+    
+	# Detect new index letter sections
+#	if ($level && /^<H$level>(.)</) {
+	if ($level && /^<H2>(.)</) {
+	    print $letter;
+	    $letter = $1;
+	    next;
+	}
+    
+	if ($letter && /^<LI>/) {
+            if (!$index{$letter}) {
+	        $index{$letter} = "";
+	    }
+	    $index{$letter} .= $_;
+	}
+   }
+    
+   print "\n";
+   close(FILE);
+}
+
+sub print_index {
+    open(MASTER, "> master_index.html") 
+	|| die "Couldn't open master_index.html";
+
+    print MASTER <<EOH;
+<HTML>
+<HEAD>
+<TITLE>Master Index</TITLE>
+</HEAD>
+<BODY bgcolor="#ffffff">
+<a href="$http_prefix/index.html"><img src="i/nav_home.gif" alt="home"></a>
+<a href="master_brief.html"><img src="i/nav_brief.gif" alt="brief"></a>
+<a href="master_contents.html"><img src="i/nav_full.gif" alt="full"></a>
+<hr size=4>
+<H1>Master Index</H1>
+<H3>For $package_version</H3>
+<P>
+EOH
+
+    foreach $letter (sort keys(%index)) {
+	$uletter = $letter;
+	$letter =~ tr/a-z/A-Z/;
+	print MASTER "<A HREF=\"master_index.html#LET$uletter\">$letter</A>\n";
+    }
+    print MASTER "<P>\n";
+
+    foreach $letter (sort keys(%index)) {
+	print MASTER "<A NAME=\"LET$letter\"></A>";
+	print MASTER "<H2>$letter</H2>\n<DIR>\n";
+	print MASTER sort sort_sub split('\n', $index{$letter});
+#	print MASTER $index{$letter};
+	print MASTER "</DIR>\n";
+    }
+
+    print MASTER <<EOF;
+<hr size=4>
+<a href="$http_prefix/index.html"><img src="i/nav_home.gif" alt="home"></a>
+<a href="master_brief.html"><img src="i/nav_brief.gif" alt="brief"></a>
+<a href="master_contents.html"><img src="i/nav_full.gif" alt="full"></a>
+<HR>
+This document was generated using the <CITE>merge_indexes.pl</CITE> program.
+<p>
+<i>Last generated on $TODAY.
+</i>
+<font size="-1"><br>
+</BODY>
+</HTML>
+EOF
+
+    close(MASTER);
+}
+
+sub sort_sub {
+    $_=$a; s/<LI>[^>]*>//; $A=$_;
+    $_=$b; s/<LI>[^>]*>//; $B=$_;
+    
+    return "\L$A\E" cmp "\L$B\E";
+}
+
+# Taken from Lionel Cons' texi2html convertor
+sub pretty_date {
+    local(@MoY, $sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst);
+
+    @MoY = ('January', 'February', 'March', 'April', 'May', 'June',
+	    'July', 'August', 'September', 'October', 'November', 'December');
+    ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = localtime(time);
+    $year += ($year < 70) ? 2000 : 1900;
+    return("$mday $MoY[$mon] $year");
+}
diff --git a/manual/tools/pkfix.pl b/manual/tools/pkfix.pl
new file mode 100755
index 0000000..fbf1208
--- /dev/null
+++ b/manual/tools/pkfix.pl
@@ -0,0 +1,634 @@
+eval '(exit $?0)' && eval 'exec perl -S $0 ${1+"$@"}' && eval 'exec perl -S $0 $argv:q'
+  if 0;
+use strict;
+$^W=1; # turn warnings on
+#
+# pkfix.pl
+#
+# Copyright (C) 2001 Heiko Oberdiek.
+#
+# This program may be distributed and/or modified under the
+# conditions of the LaTeX Project Public License, either version 1.2
+# of this license or (at your option) any later version.
+# The latest version of this license is in
+#   http://www.latex-project.org/lppl.txt
+# and version 1.2 or later is part of all distributions of LaTeX
+# version 1999/12/01 or later.
+#
+# See file "README" for a list of files that belongs to this project.
+#
+# This file "pkfix.pl" may be renamed to "pkfix"
+# for installation purposes.
+#
+my $file        = "pkfix.pl";
+my $program     = uc($&) if $file =~ /^\w+/;
+my $project     = lc($program);
+my $version     = "0.8";
+my $date        = "2001/04/23";
+my $author      = "Heiko Oberdiek";
+my $copyright   = "Copyright (c) 2001 by $author.";
+#
+# Requirements: Perl5, dvips
+# History:
+#   2001/04/12 v0.1:
+#     * First try.
+#   2001/04/13 v0.2:
+#     * TeX/dvips is called for each font for the case of errors.
+#     * First release.
+#   2001/04/15 v0.3:
+#     * Call of kpsewhich with option --progname.
+#     * Extracting of texps.pro from temporary PostScript file,
+#       if kpsewhich failed.
+#     * Option -G0 for dvips run added.
+#   2001/04/16 v0.4:
+#     * Support for merging PostScript fonts added.
+#     * \special{!...}/@fedspecial detection added.
+#     * Bug fix: I detection.
+#   2001/04/17 v0.5:
+#     * Redirection of stderr (dvips run) if possible.
+#   2001/04/20 v0.6:
+#     * Bug fix: dvips font names can contain numbers.
+#   2001/04/21 v0.7:
+#     * Bug fix: long dvi file name in ps file.
+#   2001/04/23 v0.8:
+#     * Bug fix: post string parsing.
+#
+
+### program identification
+my $title = "$program $version, $date - $copyright\n";
+
+### error strings
+my $Error = "!!! Error:"; # error prefix
+my $Warning = "!!! Warning:"; # warning prefix
+
+### variables
+my $envvar    = uc($project);
+my $infile    = "";
+my $outfile   = "";
+my $texpsfile = "texps.pro";
+my $prefix    = "_${project}_$$";
+# my $prefix    = "_${project}_";
+my $tempfile  = "$prefix";
+my $texfile   = "$tempfile.tex";
+my $dvifile   = "$tempfile.dvi";
+my $logfile   = "$tempfile.log";
+my $psfile    = "$tempfile.ps";
+my $missfile  = "missfont.log";
+my @cleanlist = ($texfile, $dvifile, $logfile, $psfile);
+push(@cleanlist, $missfile) unless -f $missfile;
+
+my $err_redirect = " 2>&1";
+$err_redirect = "" if $^O =~ /dos/i ||
+                      $^O =~ /os2/i ||
+                      $^O =~ /mswin32/i ||
+                      $^O =~ /cygwin/i;
+
+my $x_resolution    = 0;
+my $y_resolution    = 0;
+my $blocks_found    = 0;
+my $fonts_converted = 0;
+my $fonts_merged    = 0;
+my $fonts_misses    = 0;
+
+### option variables
+my @bool = ("false", "true");
+$::opt_tex     = "tex";
+$::opt_dvips   = "dvips";
+$::opt_kpsewhich = "kpsewhich --progname $project";
+$::opt_options = "-Ppdf -G0";
+$::opt_help       = 0;
+$::opt_quiet      = 0;
+$::opt_debug      = 0;
+$::opt_verbose    = 0;
+$::opt_clean      = 1;
+
+my $usage = <<"END_OF_USAGE";
+${title}Syntax:   \L$program\E [options] <inputfile.ps> <outputfile.ps>
+Function: This program tries to replace pk fonts in <inputfile.ps>
+          by the type 1 versions. The result is written in <outputfile.ps>.
+Options:                                                         (defaults:)
+  --help            print usage
+  --(no)quiet       suppress messages                            ($bool[$::opt_quiet])
+  --(no)verbose     verbose printing                             ($bool[$::opt_verbose])
+  --(no)debug       debug information                            ($bool[$::opt_debug])
+  --(no)clean       clear temp files                             ($bool[$::opt_clean])
+  --tex texcmd      tex command name (plain format)              ($::opt_tex)
+  --dvips dvipscmd  dvips command name                           ($::opt_dvips)
+  --options opt     dvips options                                ($::opt_options)
+END_OF_USAGE
+
+### environment variable PKFIX
+if ($ENV{$envvar}) {
+  unshift(@ARGV, split(/\s+/, $ENV{$envvar}));
+}
+
+### process options
+my @OrgArgv = @ARGV;
+use Getopt::Long;
+GetOptions(
+  "help!",
+  "quiet!",
+  "debug!",
+  "verbose!",
+  "clean!",
+  "tex=s",
+  "dvips=s",
+  "options=s"
+) or die $usage;
+!$::opt_help or die $usage;
+@ARGV < 3 or die "$usage$Error Too many files!\n";
+@ARGV == 2 or die "$usage$Error Missing file names!\n";
+
+$::opt_quiet = 0 if $::opt_verbose;
+$::opt_clean = 0 if $::opt_debug;
+
+### get file names
+$infile = $ARGV[0];
+$outfile = $ARGV[1];
+
+print $title unless $::opt_quiet;
+
+print "*** input file: `$infile'\n" if $::opt_verbose;
+print "*** output file: `$outfile'\n" if $::opt_verbose;
+
+if ($::opt_debug) {
+  print <<"END_DEB";
+*** OSNAME: $^O
+*** PERL_VERSION: $]
+*** ARGV: @OrgArgv
+END_DEB
+}
+
+### get texps.pro
+my $texps_data   = 0;
+my $texps_string = get_texps_pro();
+
+### open input and output files
+open(IN, $infile) or die "$Error Cannot open `$infile'!\n";
+open(OUT, ">$outfile") or die "$Error Cannot write `$outfile'!\n";
+
+##################################
+# expected format:
+#   ...
+#   %DVIPSParameters:... dpi=([\dx]+)...
+#   ...
+#   TeXDict begin \d+ \d+ \d+ \d+ \d+ \(\S+\)
+#   @start ...
+#   ...
+#   %DVIPSBitmapFont: (\S+) (\S+) ([\d\.]+) (\d+)
+#   /(\S+) ...
+#   ...
+#   %EndDVIPSBitmapFont
+#   ...
+#   ... end
+#   %%EndProlog
+#
+# or if \special{!...} was used, the lines with TeXDict:
+#   TeXDict begin @defspecial
+#
+#   ...
+#
+#   @fedspecial end TeXDict begin
+#   \d+ \d+ \d+ \d+ \d+ \(\S+\) @start
+#
+# bitmap font:
+# start:
+#   %DVIPSBitmapFont: {dvips font} {font name} {at x pt} {chars}
+#   /{dvips font} {chars} {max. char number + 1} df
+# character, variant a:
+#   <{hex code}>{char number} D
+# character, variant b:
+#   [<{hex code}>{num1} {num2} {num3} {num4} {num5} {char number} D
+# end:
+#   E
+#   %EndDVIPSBitmapFont
+#
+# type 1 font:
+# before TeXDict line:
+#   %%BeginFont: CMR10
+#   ...
+#   %%EndFont
+# after @start:
+#   /Fa ... /CMR10 rf
+###################################
+
+my $x_comment_resolution = 0;
+my $y_comment_resolution = 0;
+my $start_string = "";
+my $post_string = "";
+my $dvips_resolution = "";
+my $texps_found = 0;
+my @font_list = ();
+my %font_txt = ();
+my %font_count = ();
+my %font_entry = ();
+
+sub init {
+  $x_comment_resolution = 0;
+  $y_comment_resolution = 0;
+  $x_resolution = 0;
+  $y_resolution = 0;
+  $start_string = "";
+  $texps_found = 0;
+  @font_list = ();
+  %font_txt = ();
+  %font_count = ();
+  %font_entry = ();
+}
+
+init();
+
+while (<IN>) {
+
+  if (/^%%Creator: (dvips\S*) (\S+)\s/) {
+    print "*** %%Creator: $1 $2\n" if $::opt_debug;
+  }
+
+  if (/^%DVIPSParameters:.*dpi=([\dx]+)/) {
+    print OUT;
+    my $str = $1;
+    $x_comment_resolution = 0;
+    $y_comment_resolution = 0;
+    if ($str =~ /^(\d+)x(\d+)$/) {
+      $x_comment_resolution = $1;
+      $y_comment_resolution = $2;
+    }
+    if ($str =~ /^(\d+)$/) {
+      $x_comment_resolution = $1;
+      $y_comment_resolution = $1;
+    }
+    print "*** %DVIPSParameters: dpi=$str " .
+          "(x=$x_comment_resolution, y=$y_comment_resolution)\n"
+      if $::opt_debug;
+    $x_comment_resolution > 0 && $y_comment_resolution > 0 or
+      die "$Error Wrong resolution value " .
+          "($x_comment_resolution x $y_comment_resolution)!\n";
+    next;
+  }
+
+  if (/^%%BeginProcSet: texps.pro/) {
+    $texps_found = 1;
+    print "*** texps.pro found\n" if $::opt_debug;
+  }
+
+  if (/^TeXDict begin \@defspecial/) {
+    print "*** \@defspecial found.\n" if $::opt_debug;
+    $start_string = $_;
+    while (<IN>) {
+      $start_string .= $_;
+      last if /\@fedspecial end TeXDict begin/;
+    }
+  }
+  elsif (/^TeXDict begin \d+ \d+ \d+ \d+ \d+/) {
+    print "*** TeXDict begin <5 nums> found.\n" if $::opt_debug;
+    $start_string = $_;
+  }
+  if ($start_string ne "") {
+    # look for @start
+    unless (/\@start/) {
+      while (<IN>) {
+        $start_string .= $_;
+        last if /\@start/;
+      }
+    }
+
+    # divide post part
+    $start_string =~ /^([\s\S]*\@start)\s*([\s\S]*)$/ or
+      die "$Error Parse error (\@start)!\n";
+    $start_string = "$1\n";
+    $post_string = $2;
+    $post_string =~ s/\s*$//;
+    $post_string .= "\n" unless $post_string eq "";
+
+    $start_string =~
+      /\d+\s+\d+\s+\d+\s+(\d+)\s+(\d+)\s+\((.*)\)\s+\@start/ or
+      die "$Error Parse error (\@start parameters)!\n";
+
+    $blocks_found++;
+    print "*** dvi file: $3\n" if $::opt_debug;
+
+    # get and check resolution values
+    $x_resolution = $1;
+    $y_resolution = $2;
+    print "*** resolution: $x_resolution x $y_resolution\n"
+      if $::opt_debug;
+    $x_comment_resolution > 0 or
+      die "$Error Missing comment `%DVIPSParameters'!\n";
+    $x_resolution == $x_comment_resolution &&
+    $y_resolution == $y_comment_resolution or
+      die "$Error Resolution values in comment and PostScript " .
+          "do not match!\n";
+    # setting dvips resolution option(s)
+    if ($x_resolution == $y_resolution) {
+      $dvips_resolution = "-D $x_resolution";
+    }
+    else {
+      $dvips_resolution = "-X $x_resolution -Y $y_resolution";
+    }
+
+    while (<IN>) {
+      if (/^%%EndProlog/) {
+        $texps_data > 0 or die "$Error File `texps.pro' not found!\n";
+        print OUT $texps_string unless $texps_found;
+        foreach (@font_list) {
+          my $fontname = $_;
+          print "*** Adding font `$fontname'\n"
+            if $::opt_debug;
+          my ($dummy, $err);
+          if ($font_count{$fontname} > 1) {
+            $fonts_merged++;
+            print "*** Merging font `$fontname' ($font_count{$fontname}).\n"
+              unless $::opt_quiet;
+            ($font_txt{$fontname}, $dummy, $err) =
+              get_font($font_entry{$fontname});
+            $err == 0 or die "$Error Cannot merge font `$fontname'!\n";
+          }
+          print OUT $font_txt{$fontname};
+        }
+        print OUT $start_string,
+                  $post_string,
+                  $_;
+        print "*** %%EndProlog\n" if $::opt_debug;
+        init();
+        last;
+      }
+
+      if (/^%DVIPSBitmapFont: (\S+) (\S+) ([\d.]+) (\d+)/) {
+        my $bitmap_string = $_;
+        my $dvips_fontname = $1;
+        my $fontname = $2;
+        my $entry = "\\Font\{$1\}\{$2\}\{$3\}\{";
+        print "*** Font $1: $2 at $3pt, $4 chars\n" if $::opt_verbose;
+        my $line = "";
+        my $num = -1;
+        my $chars = $4;
+        my $count = 0;
+        while (<IN>) {
+          $bitmap_string .= $_;
+          last if /^%EndDVIPSBitmapFont/;
+          chomp;
+          $line .= " " . $_;
+        }
+        $line =~ s/<[0-9A-F ]*>/ /g;
+
+        print "*** <Font> $line\n" if $::opt_debug;
+
+        while ($line =~ /\s(\d+)\s+D(.*)/) {
+          $num = $1;
+          $count++;
+          $entry .= "$num,";
+          $line = $2;
+          while ($line =~ /^[\s\d\[]*I(.*)/) {
+            $num++;
+            $count++;
+            $entry .= "$num,";
+            $line = $1;
+          }
+        }
+        $chars == $count or
+          die "$Error Parse error, $count chars of $chars found " .
+            "($fontname)!\n";
+
+        $entry =~ s/,$//;
+        $entry .= "\}";
+
+        print "*** Converting font `$fontname'.\n" unless $::opt_quiet;
+        my ($font_part, $start_part, $err) = get_font($entry);
+        if ($err == 0) {
+          if (defined($font_count{$fontname})) {
+            $font_count{$fontname}++;
+            $font_entry{$fontname} .= "\n$entry";
+          }
+          else {
+            push @font_list, $fontname;
+            $font_txt{$fontname} = $font_part;
+            $font_count{$fontname} = 1;
+            $font_entry{$fontname} = $entry;
+          }
+          $start_part =~ s/\/Fa/\/$dvips_fontname/;
+          $start_string .= $start_part;
+          $fonts_converted++;
+        }
+        else {
+          $start_string .= $bitmap_string;
+          $fonts_misses++;
+          print "!!! Font conversion of `$fontname' failed!\n";
+        }
+
+        next;
+      }
+
+      $post_string .= $_;
+    }
+    next;
+  }
+
+  print OUT;
+}
+
+close(IN);
+close(OUT);
+
+if ($::opt_clean) {
+  print "*** clear temp files\n" if $::opt_verbose;
+  foreach (@cleanlist) {
+    unlink;
+  }
+}
+
+if (!$::opt_quiet) {
+  if ($blocks_found > 1) {
+    print "==> $blocks_found blocks found.\n";
+  }
+  if ($fonts_misses) {
+    print "==> $fonts_misses font conversion(s) failed.\n";
+  }
+  if ($fonts_converted) {
+    print "==> $fonts_converted font(s) converted.\n";
+    if ($fonts_merged) {
+      print "==> $fonts_merged font(s) merged.\n";
+    }
+  }
+  else {
+    print "==> no fonts converted\n";
+  }
+}
+
+
+# get type 1 font
+# param:  $entry: font entry as TeX string
+# return: $font:  font file as string
+#         $start: font definition after @start
+#         $err:   error indication
+sub get_font {
+  my $entry = $_[0];
+  my $font = "";
+  my $start = "";
+  my $err = 0;
+  my @err = ("", "", 1);
+  local *OUT;
+  local *IN;
+
+  ### write temp tex file
+  open(OUT, ">$texfile") or die "$Error Cannot write `$texfile'!\n";
+  print OUT <<'TEX_HEADER';
+\nonstopmode
+\nopagenumbers
+\def\Font#1#2#3#4{%
+  \expandafter\font\csname font@#1\endcsname=#2 at #3pt\relax
+  \csname font@#1\endcsname
+  \hbox to 0pt{%
+    \ScanChar#4,\NIL
+    \hss
+  }%
+}
+\def\ScanChar#1,#2\NIL{%
+  \char#1\relax
+  \ifx\\#2\\%
+  \else
+    \ReturnAfterFi{%
+      \ScanChar#2\NIL
+    }%
+  \fi
+}
+\long\def\ReturnAfterFi#1\fi{\fi#1}
+\noindent
+TEX_HEADER
+
+  print OUT "$entry\n\\bye\n";
+  close(OUT);
+
+  ### run tex
+  {
+    print "*** run TeX\n" if $::opt_verbose;
+
+    my $cmd = "$::opt_tex $tempfile";
+    print ">>> $cmd\n" if $::opt_verbose;
+    my @capture = `$cmd`;
+    if (!@capture) {
+      print "$Warning Cannot execute TeX!\n";
+      return @err;
+    }
+    if ($::opt_verbose) {
+      print @capture;
+    }
+    else {
+      foreach (@capture) {
+        print if /^!\s/;
+      }
+    }
+    if ($?) {
+      my $exitvalue = $?;
+      if ($exitvalue > 255) {
+        $exitvalue >>= 8;
+        print "$Warning Closing TeX (exit status: $exitvalue)!\n";
+        return @err;
+      }
+      print "$Warning Closing TeX ($exitvalue)!\n";
+      return @err;
+    }
+  }
+
+  ### run dvips
+  {
+    print "*** run dvips\n" if $::opt_verbose;
+
+    my $cmd = "$::opt_dvips $::opt_options $dvips_resolution $tempfile";
+    print ">>> $cmd\n" if $::opt_verbose;
+    # dvips writes on stderr :-(
+    my @capture = `$cmd$err_redirect`;
+    if ($::opt_verbose) {
+      print @capture;
+    }
+    if ($?) {
+      my $exitvalue = $?;
+      if ($exitvalue > 255) {
+        $exitvalue >>= 8;
+        print "$Warning Closing dvips (exit status: $exitvalue)!\n";
+        return @err;
+      }
+      print "$Warning Closing dvips ($exitvalue)!\n";
+      return @err;
+    }
+  }
+
+  ### get font and start part
+  open(IN, $psfile) or die "$Error Cannot open `$psfile'!\n";
+
+  while (<IN>) {
+    if ($texps_data == 0 && /^%%BeginProcSet: texps.pro/) {
+      $texps_string = $_;
+      while (<IN>) {
+        $texps_string .= $_;
+        last if /^%%EndProcSet/;
+      }
+      $texps_data = 1;
+      print "*** texps.pro extracted.\n" if $::opt_debug;
+      next;
+    }
+    if (/^%%BeginFont:/) {
+      $font .= $_;
+      while (<IN>) {
+        $font .= $_;
+        last if /^%%EndFont/;
+      }
+      next;
+    }
+    if (/^\@start/) {
+      s/^\@start\s*//;
+      $start .= $_;
+      while (<IN>) {
+        last if /^%%EndProlog/;
+        $start .= $_;
+      }
+      if (($start =~ s/\s*end\s*$/\n/) != 1) {
+        $err = 1;
+        print "$Warning Parse error, `end' not found!\n";
+      }
+      print "*** start: $start" if $::opt_debug;
+      last;
+    }
+  }
+  close(IN);
+
+  if ($font eq "") {
+    print "$Warning `%%BeginFont' not found!\n";
+    return @err;
+  }
+  return ($font, $start, $err);
+}
+
+
+# get_texps_pro
+# return: string with content of texps.pro
+sub get_texps_pro {
+  $texps_data = 0;
+  # get file name
+  my $backupWarn = $^W;
+  $^W = 0;
+  my $file = `$::opt_kpsewhich $texpsfile`;
+  $^W = $backupWarn;
+  if (!defined($file) or $file eq "") {
+    print "$Warning Cannot find `$texpsfile' with kpsewhich!\n"
+      if $::opt_debug;
+    return "";
+  }
+  chomp $file;
+  print "*** texps.pro: $file\n" if $::opt_debug;
+
+  # read file
+  local *IN;
+  open(IN, $file) or die "$Error Cannot open `$file'!\n";
+  my @lines = <IN>;
+  @lines > 0 or die "$Error Empty file `$file'!\n";
+  chomp $lines[@lines-1];
+  my $str = "%%BeginProcSet: texps.pro\n";
+  $"="";
+  $str .= "@lines\n";
+  $"=" ";
+  $str .= "%%EndProcSet\n";
+  $texps_data = 1;
+  return $str;
+}
+
+__END__
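
Note on the bitmap font parsing in pkfix.pl: inside a %DVIPSBitmapFont
block the main loop first removes the hex bitmap data, then collects one
character number for every "{number} D" and one more (previous number
plus one) for every following "I". The sketch below isolates that step.
The helper name char_numbers and the sample font body are hypothetical
and only serve as illustration; the regular expressions are the ones
used in the script.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of pkfix.pl): return the character
# numbers defined in the joined body of one bitmap font block.
sub char_numbers {
    my ($line) = @_;
    my @chars;
    $line =~ s/<[0-9A-F ]*>/ /g;    # drop the hex bitmap data
    while ($line =~ /\s(\d+)\s+D(.*)/) {
        my $num = $1;
        push @chars, $num;
        $line = $2;
        # every following "I" stands for the next character number
        while ($line =~ /^[\s\d\[]*I(.*)/) {
            push @chars, ++$num;
            $line = $1;
        }
    }
    return @chars;
}

# Made-up font body: characters 65 and 70 via "D", 66 via "I".
my @chars = char_numbers(' /Fa 3 71 df <00FF>65 D <0F0F>I <AAAA>70 D E');
print join(',', @chars), "\n";
```

For this sample the list is 65,66,70, which matches the count of 3 given
in the "df" line; pkfix.pl dies with a parse error when the two numbers
differ.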
diff --git a/manual/tools/remove_xrefs.pl b/manual/tools/remove_xrefs.pl
new file mode 100755
index 0000000..739d632
--- /dev/null
+++ b/manual/tools/remove_xrefs.pl
@@ -0,0 +1,8 @@
+#!/usr/bin/perl -w
+$_="";
+read(STDIN, $_, 9999999);
+s/\s*\(\@pxref\{[^}]*\}\)//g;
+s/\s*\@xref\{[^}]*\}.//g;
+s/,\././g;
+print;
+
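
Note on remove_xrefs.pl: the script applies three substitutions to the
whole input. It removes parenthesised @pxref{...} references, removes
@xref{...} references together with the one character that follows them,
and finally collapses any ",." left behind into ".". A small sketch of
the same steps on a string is shown here. The function name remove_xrefs
and the sample sentence are hypothetical; the braces are escaped, which
does not change what the patterns match.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the three substitutions.
sub remove_xrefs {
    my ($text) = @_;
    $text =~ s/\s*\(\@pxref\{[^}]*\}\)//g;   # "(@pxref{...})" and leading space
    $text =~ s/\s*\@xref\{[^}]*\}.//g;       # "@xref{...}" plus one character
    $text =~ s/,\././g;                      # tidy up a leftover ",."
    return $text;
}

# Made-up Texinfo fragment.
my $in  = 'Use the editor, (@pxref{Editor}). @xref{Intro}. Done.';
my $out = remove_xrefs($in);
print "$out\n";
```

The sample reduces to "Use the editor. Done."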
diff --git a/manual/tools/reorder.tcl b/manual/tools/reorder.tcl
new file mode 100644
index 0000000..7e00a95
--- /dev/null
+++ b/manual/tools/reorder.tcl
@@ -0,0 +1,129 @@
+#
+# Reads in a PostScript file generated by TeXInfo.
+#
+# The contents pages are initially at the end of the postscript file, so this
+# tool reorders these to put them between the first 2 (cover sheet) and the
+# actual manual start.
+#
+# We make assumptions to do this. The page structure will initially be:
+#
+# Page 1 - title
+# Page 2 - copyright
+# Page 1 - 1st Page
+# Page 2 - 2nd Page
+# ...
+# Page -1 - 1st contents page
+# Page -2 - 2nd contents page
+#
+
+proc process_header {line} {
+    global element header subdoc
+
+    if {[regexp {^%%Page:\s*(-?\d*)\s*(-?\d*)} $line all vpage rpage]} {
+	puts stderr "Start Page $vpage"
+	set subdoc 0
+	set element $vpage
+	return start_page
+    } else {
+	append header $line\n
+    }
+    return header
+}
+
+proc process_start_page {line} {
+    global element start_page subdoc
+    if {[regexp {^%%BeginDocument} $line]} {
+	incr subdoc
+    }
+    if {[regexp {^%%EndDocument} $line]} {
+	incr subdoc -1
+    }
+    if {$subdoc == 0} {
+	if {[regexp {^%%Page:\s*(-?\d*)\s*(-?\d*)} $line all vpage rpage]} {
+	    if {[info exists start_page($vpage)]} {
+		puts stderr "Page $vpage"
+		set element $vpage
+		return page
+	    } else {
+		puts stderr "Start Page $vpage"
+		set element $vpage
+		return start_page
+	    }
+	}
+	if {[regexp {^%%Trailer} $line]} {
+	    puts stderr "Trailer"
+	    return trailer
+	}
+    }
+    append start_page($element) $line\n
+    return start_page
+}
+
+proc process_page {line} {
+    global element page subdoc
+    if {[regexp {^%%BeginDocument} $line]} {
+	incr subdoc
+    }
+    if {[regexp {^%%EndDocument} $line]} {
+	incr subdoc -1
+    }
+    if {$subdoc == 0} {
+	if {[regexp {^%%Page:\s*(-?\d*)\s*(-?\d*)} $line all vpage rpage]} {
+	    puts stderr "Page $vpage"
+	    set element $vpage
+	    return page
+	}
+	if {[regexp {^%%Trailer} $line]} {
+	    puts stderr "Trailer"
+	    return trailer
+	}
+    }
+    append page($element) $line\n
+    return page
+}
+
+proc process_trailer {line} {
+    global trailer; append trailer $line\n
+    return trailer
+}
+
+# Load the document into an array consisting of elements start, title,
+# copyright, pagex, page-x, end
+set state header
+set header ""
+set trailer ""
+
+puts stderr "Header"
+while {[gets stdin line] != -1} {
+    set state [process_$state $line]
+}
+
+# Print up the pages in their logical order
+puts $header
+set count 1
+foreach p [lsort -integer [array names start_page]] {
+    puts "%%Page: $p $count"
+    puts $start_page($p)
+    incr count
+
+}
+set pages [lsort -integer [array names page]]
+set pospage {}
+set negpage {}
+
+foreach p $pages {
+    if {$p < 0} {
+	set negpage "$p $negpage"
+    } else {
+	lappend pospage $p
+    }
+}
+
+set pages "$negpage $pospage"
+foreach p $pages {
+    puts "%%Page: $p $count"
+    puts $page($p)
+    incr count
+}
+puts $trailer
+exit
\ No newline at end of file
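
Note on the page order produced by reorder.tcl: after the title and
copyright pages, the contents pages (negative page numbers) are written
in the order -1, -2, ... and the body pages follow in ascending order.
The sketch below reproduces only that ordering rule, in Perl rather than
Tcl so that it matches the other tools in this directory. The variable
names and the sample page numbers are hypothetical.

```perl
use strict;
use warnings;

# Made-up page labels as they might be collected from %%Page: comments.
my @pages = (3, -2, 1, -1, 2);

# Contents pages: -1 first, then -2, and so on (descending numerically).
my @negative = sort { $b <=> $a } grep { $_ < 0 } @pages;

# Body pages: plain ascending order.
my @positive = sort { $a <=> $b } grep { $_ >= 0 } @pages;

my @order = (@negative, @positive);
print join(' ', @order), "\n";
```

For the sample labels the resulting order is -1 -2 1 2 3.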
diff --git a/manual/tools/texi2html b/manual/tools/texi2html
new file mode 100755
index 0000000..5dcbaf1
--- /dev/null
+++ b/manual/tools/texi2html
@@ -0,0 +1,1910 @@
+#!/usr/bin/perl
+'di';
+'ig00';
+#+##############################################################################
+#                                                                              #
+# File: texi2html                                                              #
+#                                                                              #
+# Description: Program to transform most Texinfo documents to HTML             #
+#                                                                              #
+#-##############################################################################
+
+# @(#)texi2html	1.39 07/27/95	Written (mainly) by Lionel Cons, Lionel.Cons at cern.ch
+
+# The man page for this program is included at the end of this file and can be
+# viewed using the command 'nroff -man texi2html'.
+# Please read the copyright at the end of the man page.
+#
+# 12/10/95 (UK notation) Changes made by James Bonfield to add the @split
+# command.
+#
+
+#+++############################################################################
+#                                                                              #
+# Constants                                                                    #
+#                                                                              #
+#---############################################################################
+
+$DEBUG_TOC   =  1;
+$DEBUG_INDEX =  2;
+$DEBUG_BIB   =  4;
+$DEBUG_GLOSS =  8;
+$DEBUG_DEF   = 16;
+$DEBUG_HTML  = 32;
+
+$BIBRE = '\[[\w\/]+\]';			# RE for a bibliography reference
+$FILERE = '[\/\w.+-]+';			# RE for a file name
+$VARRE = '[^\s\{\}]+';			# RE for a variable name
+$NODERE = '[^@{}:\'`",]+';		# RE for a node name
+$NODESRE = '[^@{}:\'`"]+';		# RE for a list of node names
+$XREFRE = '[^@{}]+';			# RE for a xref (should use NODERE)
+
+$ERROR = "***";			        # prefix for errors and warnings
+$THISPROG = "texi2html 1.39";			# program name and version
+$HOMEPAGE = "http://wwwcn.cern.ch/dci/texi2html/"; # program home page
+$TODAY = &pretty_date;			# like "20 September 1993"
+$SPLITTAG = "<!-- SPLIT HERE -->\n";	# tag to know where to split
+$PROTECTTAG = "_ThisIsProtected_";	# tag to recognize protected sections
+$html2_doctype = '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict Level 2//EN">';
+$http_prefix = "http://staden.sourceforge.net/manual";
+
+#
+# language dependent constants
+#
+#$LDC_SEE = 'see';
+#$LDC_SECTION = 'section';
+#$LDC_IN = 'in';
+#$LDC_TOC = 'Table of Contents';
+#$LDC_GOTO = 'Go to the';
+#$LDC_FOOT = 'Footnotes';
+# TODO: @def* shortcuts
+
+#
+# pre-defined indices
+#
+%predefined_index = (
+		    'cp', 'c',
+		    'fn', 'f',
+		    'vr', 'v',
+		    'ky', 'k',
+		    'pg', 'p',
+		    'tp', 't',
+	            );
+
+#
+# valid indices
+#
+%valid_index = (
+		    'c', 1,
+		    'f', 1,
+		    'v', 1,
+		    'k', 1,
+		    'p', 1,
+		    't', 1,
+		);
+
+#
+# texinfo section names to level
+#
+%sec2level = (
+	      'top', 0,
+	      'chapter', 1,
+	      'unnumbered', 1,
+	      'majorheading', 1,
+	      'chapheading', 1,
+	      'appendix', 1,
+	      'section', 2,
+	      'unnumberedsec', 2,
+	      'heading', 2,
+	      'appendixsec', 2,
+	      'appendixsection', 2,
+	      'subsection', 3,
+	      'unnumberedsubsec', 3,
+	      'subheading', 3,
+	      'appendixsubsec', 3,
+	      'subsubsection', 4,
+	      'unnumberedsubsubsec', 4,
+	      'subsubheading', 4,
+	      'appendixsubsubsec', 4,
+	      );
+
+#
+# accent map, TeX command to ISO name
+#
+%accent_map = (
+	       '"', 'uml',
+	       '~', 'tilde',
+	       '^', 'circ',
+	       );
+
+#
+# texinfo "simple things" (@foo) to HTML ones
+#
+%simple_map = (
+	       # cf. makeinfo.c
+	       "*", "<BR>",		# HTML+
+	       "br", "<P>",		# paragraph break
+	       " ", " ",
+	       "\n", "\n",
+	       "|", "",
+	       # spacing commands
+	       ":", "",
+	       "!", "!",
+	       "?", "?",
+	       ".", ".",
+	       );
+
+#
+# texinfo "things" (@foo{}) to HTML ones
+#
+%things_map = (
+	       'TeX', 'TeX',
+	       'bullet', '*',
+	       'copyright', '(C)',
+	       'dots', '...',
+	       'equiv', '==',
+	       'error', 'error-->',
+	       'expansion', '==>',
+	       'minus', '-',
+	       'point', '-!-',
+	       'print', '-|',
+	       'result', '=>',
+	       'today', $TODAY,
+	       );
+
+#
+# texinfo styles (@foo{bar}) to HTML ones
+#
+%style_map = (
+	      'asis', '',
+	      'b', 'B',
+	      'cite', 'CITE',
+	      'code', 'CODE',
+	      'ctrl', '&do_ctrl',	# special case
+	      'dfn', 'STRONG',		# DFN tag is illegal in the standard
+	      'dmn', '',		# useless
+	      'emph', 'EM',
+	      'file', '"TT',		# will put quotes, cf. &apply_style
+	      'i', 'I',
+	      'kbd', 'KBD',
+	      'key', 'KBD',
+	      'r', '',			# unsupported
+	      'samp', '"SAMP',		# will put quotes, cf. &apply_style
+	      'sc', '&do_sc',		# special case
+	      'strong', 'STRONG',
+	      't', 'TT',
+	      'titlefont', '',		# useless
+	      'var', 'VAR',
+	      'w', '',			# unsupported
+	      );
+
+#
+# texinfo format (@foo/@end foo) to HTML ones
+#
+%format_map = (
+	       'display', 'PRE',
+	       'example', 'PRE',
+	       'format', 'PRE',
+	       'lisp', 'PRE',
+	       'quotation', 'BLOCKQUOTE',
+	       'smallexample', 'PRE',
+	       'smalllisp', 'PRE',
+	       # lists
+	       'itemize', 'UL',
+	       'enumerate', 'OL',
+	       # poorly supported
+	       'flushleft', 'PRE',
+	       'flushright', 'PRE',
+	       );
+
+#
+# texinfo definition shortcuts to real ones
+#
+%def_map = (
+	    # basic commands
+	    'deffn', 0,
+	    'defvr', 0,
+	    'deftypefn', 0,
+	    'deftypevr', 0,
+	    'defcv', 0,
+	    'defop', 0,
+	    'deftp', 0,
+	    # basic x commands
+	    'deffnx', 0,
+	    'defvrx', 0,
+	    'deftypefnx', 0,
+	    'deftypevrx', 0,
+	    'defcvx', 0,
+	    'defopx', 0,
+	    'deftpx', 0,
+	    # shortcuts
+	    'defun', 'deffn Function',
+	    'defmac', 'deffn Macro',
+	    'defspec', 'deffn {Special Form}',
+	    'defvar', 'defvr Variable',
+	    'defopt', 'defvr {User Option}',
+	    'deftypefun', 'deftypefn Function',
+	    'deftypevar', 'deftypevr Variable',
+	    'defivar', 'defcv {Instance Variable}',
+	    'defmethod', 'defop Method',
+	    # x shortcuts
+	    'defunx', 'deffnx Function',
+	    'defmacx', 'deffnx Macro',
+	    'defspecx', 'deffnx {Special Form}',
+	    'defvarx', 'defvrx Variable',
+	    'defoptx', 'defvrx {User Option}',
+	    'deftypefunx', 'deftypefnx Function',
+	    'deftypevarx', 'deftypevrx Variable',
+	    'defivarx', 'defcvx {Instance Variable}',
+	    'defmethodx', 'defopx Method',
+	    );
+
+#
+# things to skip
+#
+%to_skip = (
+	    # comments
+	    'c', 1,
+	    'comment', 1,
+	    # useless
+	    'contents', 1,
+	    'shortcontents', 1,
+	    'summarycontents', 1,
+	    'footnotestyle', 1,
+	    'end ifclear', 1,
+	    'end ifset', 1,
+	    'titlepage', 1,
+	    'end titlepage', 1,
+	    # unsupported commands (formatting)
+	    'afourpaper', 1,
+	    'cropmarks', 1,
+	    'finalout', 1,
+	    'headings', 1,
+	    'need', 1,
+	    'page', 1,
+	    'setchapternewpage', 1,
+	    'everyheading', 1,
+	    'everyfooting', 1,
+	    'evenheading', 1,
+	    'evenfooting', 1,
+	    'oddheading', 1,
+	    'oddfooting', 1,
+	    'smallbook', 1,
+	    'vskip', 1,
+	    # unsupported formats
+	    'cartouche', 1,
+	    'end cartouche', 1,
+	    'group', 1,
+	    'end group', 1,
+	    'raisesections', 1,
+	    'lowersections', 1,
+	    );
+
+#+++############################################################################
+#                                                                              #
+# Argument parsing, initialisation                                             #
+#                                                                              #
+#---############################################################################
+
+$use_bibliography = 1;
+$use_acc = 0;
+$debug = 0;
+$doctype = '';
+$check = 0;
+$expandinfo = 0;
+$use_glossary = 0;
+$invisible_mark = '';
+$use_iso = 0;
+@include_dirs = ();
+$show_menu = 0;
+$split_node = 0;
+$split_chapter = 0;
+$verbose = 0;
+$index_chars = 0;
+$usage = <<EOT;
+This is $THISPROG
+To convert a Texinfo file to HTML: $0 [options] file
+  where options can be:
+    -expandinfo    : use \@ifinfo sections, not \@iftex
+    -glossary      : handle a glossary
+    -invisible name: use 'name' as an invisible anchor
+    -I dir         : search also for files in 'dir'
+    -menu          : handle menus
+    -split_chapter : split on main sections
+    -split_node    : split on nodes
+    -usage         : print usage instructions
+    -verbose       : verbose output
+    -index_chars   : whether to add index shortcuts
+To check converted files: $0 -check [-verbose] files
+EOT
+
+while ($#ARGV >= 0 && $ARGV[0] =~ /^-/) {
+    $_ = shift(@ARGV);
+    if (/^-acc$/)            { $use_acc = 1; next; }
+    if (/^-d(ebug)?(\d+)?$/) { $debug = $2 || shift(@ARGV); next; }
+    if (/^-doctype$/)        { $doctype = shift(@ARGV); next; }
+    if (/^-c(heck)?$/)       { $check = 1; next; }
+    if (/^-e(xpandinfo)?$/)  { $expandinfo = 1; next; }
+    if (/^-g(lossary)?$/)    { $use_glossary = 1; next; }
+    if (/^-i(nvisible)?$/)   { $invisible_mark = shift(@ARGV); next; }
+    if (/^-p(refix)?$/)      { $http_prefix = shift(@ARGV); next; }
+    if (/^-iso$/)            { $use_iso = 1; next; }
+    if (/^-I(.+)?$/)         { push(@include_dirs, $1 || shift(@ARGV)); next; }
+    if (/^-m(enu)?$/)        { $show_menu = 1; next; }
+    if (/^-s(plit)?_?(n(ode)?|c(hapter)?)?$/) {
+	if ($2 =~ /^n/) {
+	    $split_node = 1;
+	} else {
+	    $split_chapter = 1;
+	}
+	next;
+    }
+    if (/^-v(erbose)?$/)     { $verbose = 1; next; }
+    if (/^-index_chars$/)    { $index_chars = 1; next; }
+    die $usage;
+}
+if ($check) {
+    die $usage unless @ARGV > 0;
+    &check;
+    exit;
+}
+
+if ($expandinfo) {
+    $to_skip{'ifinfo'}++;
+    $to_skip{'end ifinfo'}++;
+} else {
+    $to_skip{'iftex'}++;
+    $to_skip{'end iftex'}++;
+}
+$invisible_mark = '<IMG SRC="invisible.xbm">' if $invisible_mark eq 'xbm';
+die $usage unless @ARGV == 1;
+$docu = shift(@ARGV);
+if ($docu =~ /.*\//) {
+    chop($docu_dir = $&);
+    $docu_name = $';
+} else {
+    $docu_dir = '.';
+    $docu_name = $docu;
+}
+unshift(@include_dirs, $docu_dir);
+# jkb 22/04/2003. Added htmlinfo as a valid name
+$docu_name =~ s/\.(te?x|html?)(i|info)?$//;	# basename of the document
+
+$docu_toc = $docu_doc = $docu_foot = $docu_name;
+$docu_toc  .= '_toc.html';		# document's table of contents
+$docu_doc  .= '.html';			# document's contents
+$docu_foot .= '_foot.html';		# document's footnotes
+
+#
+# variables
+#
+%value = ();				# hold texinfo variables
+$value{'html'} = 1;			# predefine html (the output format)
+$value{'texi2html'} = '1.39';		# predefine texi2html (the translator)
+foreach ('author', 'title', 'subtitle', 'filename') { # prevent -w warnings
+    $value{$_} = '';
+}
+%node2sec = ();				# node to section name
+%node2href = ();			# node to HREF
+%bib2href = ();				# bibliography reference to HREF
+%gloss2href = ();			# glossary term to HREF
+@sections = ();				# list of sections
+%tag2pro = ();				# protected sections
+
+#
+# initial indexes
+#
+$bib_num = 0;
+$foot_num = 0;
+$gloss_num = 0;
+$idx_num = 0;
+$sec_num = 0;
+$doc_num = 0;
+$html_num = 0;
+
+#
+# can I use ISO8879 characters? (HTML+)
+#
+if ($use_iso) {
+    $things_map{'bullet'} = "&bull;";
+    $things_map{'copyright'} = "&copy;";
+    $things_map{'dots'} = "&hellip;";
+    $things_map{'equiv'} = "&equiv;";
+    $things_map{'expansion'} = "&rarr;";
+    $things_map{'point'} = "&lowast;";
+    $things_map{'result'} = "&rArr;";
+}
+
+#
+# read texi2html extensions (if any)
+#
+$extensions = 'texi2html.ext'; # extensions in working directory
+if (-f $extensions) {
+    print "# reading extensions from $extensions\n" if $verbose;
+    require($extensions);
+}
+($progdir = $0) =~ s/[^\/]+$//;
+if ($progdir && ($progdir ne './')) {
+    $extensions = "${progdir}texi2html.ext"; # extensions in texi2html directory
+    if (-f $extensions) {
+	print "# reading extensions from $extensions\n" if $verbose;
+	require($extensions);
+    }
+}
+
+print "# reading from $docu\n" if $verbose;
+
+#+++############################################################################
+#                                                                              #
+# Pass 1: read source, handle command, variable, simple substitution           #
+#                                                                              #
+#---############################################################################
+
+@lines = ();				# whole document
+@toc_lines = ();			# table of contents
+$toplevel = 0;			        # top level seen in hierarchy
+$curlevel = 0;				# current level in TOC
+$node = '';				# current node name
+$do_split = 0;				# split at next node
+$split_at_node = 0;			# did we split at this node
+$just_split = 1;			# Last op was a split
+$in_table = 0;				# am I inside a table
+$table_type = '';			# type of table ('', 'f', 'v')
+@tables = ();			        # nested table support
+$in_bibliography = 0;			# am I inside a bibliography
+$in_glossary = 0;			# am I inside a glossary
+$in_top = 0;				# am I inside the top node
+$in_pre = 0;				# am I inside a preformatted section
+$in_list = 0;				# am I inside a list
+$in_html = 0;				# am I inside an HTML section
+$first_line = 1;		        # is it the first line
+$dont_html = 0;				# don't protect HTML on this line
+$split_num = 0;				# split index
+$deferred_ref = '';			# deferred reference for indexes
+@html_stack = ();			# HTML elements stack
+$html_element = '';			# current HTML element
+&html_reset;
+
+# build code for simple substitutions
+# the maps used (%simple_map and %things_map) MUST be aware of this
+# watch out for regexps, / and escaped characters!
+$subst_code = '';
+foreach (keys(%simple_map)) {
+    ($re = $_) =~ s/(\W)/\\$1/g; # protect regexp chars
+    $subst_code .= "s/\\\@$re/$simple_map{$_}/g;\n";
+}
+foreach (keys(%things_map)) {
+    $subst_code .= "s/\\\@$_\\{\\}/$things_map{$_}/g;\n";
+}
+if ($use_acc) {
+    # accentuated characters
+    foreach(keys(%accent_map)) {
+	$subst_code .= "s/\\\@\\$_([aeiou])/&\${1}$accent_map{$_};/g;\n";
+    }
+}
+eval("sub simple_substitutions { $subst_code }");
+
+&init_input;
+while ($_ = &next_line) {
+    #
+    # remove \input on the first lines only
+    #
+    if ($first_line) {
+	next if /^\\input/;
+	$first_line = 0;
+    }
+    #
+    # parse texinfo tags
+    #
+    $tag = '';
+    $end_tag = '';
+    if (/^\@end\s+(\w+)\b/) {
+	$end_tag = $1;
+    } elsif (/^\@(\w+)\b/) {
+	$tag = $1;
+    }
+    #
+    # handle @ifhtml / @end ifhtml
+    #
+    if ($in_html) {
+	if ($end_tag eq 'ifhtml') {
+	    $in_html = 0;
+	} else {
+	    $tag2pro{$in_html} .= $_;
+	}
+	next;
+    } elsif ($tag eq 'ifhtml') {
+	$in_html = $PROTECTTAG . ++$html_num;
+	push(@lines, $in_html);
+	next;
+    }
+    #
+    # try to skip the line
+    #
+    if ($end_tag) {
+	next if $to_skip{"end $end_tag"};
+    } elsif ($tag) {
+	next if $to_skip{$tag};
+	last if $tag eq 'bye';
+    }
+    if ($in_top) {
+	# parsing the top node
+	if ($tag eq 'node' || $tag eq 'include' || $sec2level{$tag}) {
+	    # no more in top
+	    $in_top = 0;
+	} else {
+	    # skip it
+	    next;
+	}
+    }
+    #
+    # try to remove inlined comments
+    # syntax from tex-mode.el comment-start-skip
+    #
+    s/((^|[^\@])(\@\@)*)\@c(omment)? .*/$1/;
+    # non-@ substitutions cf. texinfmt.el
+    s/``/\"/g;
+    s/''/\"/g;
+    s/([\w ])---([\w ])/$1--$2/g;
+    #
+    # analyze the tag
+    #
+    if ($tag) {
+	# skip lines
+	&skip_until($tag), next if $tag eq 'ignore';
+	if ($expandinfo) {
+	    &skip_until($tag), next if $tag eq 'iftex';
+	} else {
+	    &skip_until($tag), next if $tag eq 'ifinfo';
+	}
+	&skip_until($tag), next if $tag eq 'tex';
+	# Split at next node line
+	if ($tag eq 'split') {
+	    $do_split = 1;
+	    next;
+	}
+	# handle special tables
+	if ($tag eq 'table') {
+	    $table_type = '';
+	} elsif ($tag eq 'ftable') {
+	    $tag = 'table';
+	    $table_type = 'f';
+	} elsif ($tag eq 'vtable') {
+	    $tag = 'table';
+	    $table_type = 'v';
+	}
+	# special cases
+	if ($tag eq 'top' || ($tag eq 'node' && /^\@node\s+top\s*,/i)) {
+	    $in_top = 1;
+	    @lines = (); # ignore all lines before top (title page garbage)
+	    next;
+	} elsif ($tag eq 'node') {
+	    $in_top = 0;
+	    &protect_html;	# if node contains '&' for instance
+	    warn "$ERROR Bad node line: $_" unless $_ =~ /^\@node\s$NODESRE$/o;
+	    s/^\@node\s+//;
+	    ($node) = split(/,/);
+	    $node =~ s/\s+/ /g; # normalize
+	    $node =~ s/ $//;
+	    if ($split_node || $do_split) {
+		&next_doc;
+		push(@lines, $SPLITTAG) if $split_num++;
+		push(@lines, "<!-- NODE:$node -->\n");
+		push(@sections, $node);
+		$do_split = 0;
+		$split_at_node = 1;
+		$just_split = 1;	
+	    } else {
+		push(@lines, "<!-- NODE:$node -->\n");
+		$split_at_node = 0;
+	    }
+	    next;
+	} elsif ($tag eq 'include') {
+	    if (/^\@include\s+($FILERE)\s*$/o) {
+		$file = $1;
+		unless (-e $file) {
+		    foreach $dir (@include_dirs) {
+			$file = "$dir/$1";
+			last if -e $file;
+		    }
+		}
+		if (-e $file) {
+		    &open($file);
+		    print "# including $file\n" if $verbose;
+		} else {
+		    warn "$ERROR Can't find $file, skipping";
+		}
+	    } else {
+		warn "$ERROR Bad include line: $_";
+	    }
+	    next;
+	} elsif ($tag eq 'ifclear') {
+	    if (/^\@ifclear\s+($VARRE)\s*$/o) {
+		next unless defined($value{$1});
+		&skip_until($tag);
+	    } else {
+		warn "$ERROR Bad ifclear line: $_";
+	    }
+	    next;
+	} elsif ($tag eq 'ifset') {
+	    if (/^\@ifset\s+($VARRE)\s*$/o) {
+		next if defined($value{$1});
+		&skip_until($tag);
+	    } else {
+		warn "$ERROR Bad ifset line: $_";
+	    }
+	    next;
+	} elsif ($tag eq 'menu') {
+	    unless ($show_menu) {
+		&skip_until($tag);
+		next;
+	    }
+	    &html_push_if($tag);
+	    push(@lines, &html_debug("\n", __LINE__));
+	} elsif ($format_map{$tag}) {
+	    $in_pre = 1 if $format_map{$tag} eq 'PRE';
+	    &html_push_if($format_map{$tag});
+	    push(@lines, &html_debug("\n", __LINE__));
+	    $in_list++ if $format_map{$tag} eq 'UL' || $format_map{$tag} eq 'OL' ;
+	    push(@lines, &debug("<$format_map{$tag}>\n", __LINE__));
+	    next;
+	} elsif ($tag eq 'table') {
+	    if (/^\@[fv]?table\s+\@(\w+)\s*$/) {
+		$in_table = $1;
+		unshift(@tables, join($;, $table_type, $in_table));
+		push(@lines, &debug("<DL COMPACT>\n", __LINE__));
+		&html_push_if('DL');
+		push(@lines, &html_debug("\n", __LINE__));
+	    } else {
+		warn "$ERROR Bad table line: $_";
+	    }
+	    next;
+	} elsif ($tag eq 'synindex' || $tag eq 'syncodeindex') {
+	    if (/^\@$tag\s+(\w)\w\s+(\w)\w\s*$/) {
+		eval("*${1}index = *${2}index");
+	    } else {
+		warn "$ERROR Bad syn*index line: $_";
+	    }
+	    next;
+	} elsif ($tag eq 'sp') {
+	    push(@lines, &debug("<P>\n", __LINE__));
+	    next;
+	} elsif ($tag eq 'defindex' || $tag eq 'defcodeindex') {
+	    if (/^\@$tag\s+(\w\w)\s*$/) {
+		$valid_index{$1} = 1;
+	    } else {
+		warn "$ERROR Bad defindex line: $_";
+	    }
+	    next;
+	} elsif (defined($def_map{$tag})) {
+	    if ($def_map{$tag}) {
+		s/^\@$tag\s+//;
+		$tag = $def_map{$tag};
+		$_ = "\@$tag $_";
+		$tag =~ s/\s.*//;
+	    }
+	}
+	if (defined($def_map{$tag})) {
+	    s/^\@$tag\s+//;
+	    $tag =~ s/x$//;
+	    1 while s/(\{[^\}]*)\s+([^\{]*\})/$1$;9$2/; # protect spaces inside {}
+	    &protect_html;
+	    @args = split(/\s+/, $_);
+	    foreach(@args) {s/$;9/ /g;} # unprotect spaces
+	    $type = shift(@args);
+	    $type =~ s/^\{(.*)\}$/$1/;
+	    print "# def ($tag): {$type} ", join(', ', @args), "\n"
+		if $debug & $DEBUG_DEF;
+	    $type .= ':'; # it's nicer like this
+	    $name = shift(@args);
+	    $name =~ s/^\{(.*)\}$/$1/;
+	    if ($tag eq 'deffn' || $tag eq 'defvr' || $tag eq 'deftp') {
+		$_ = "<U>$type</U> <B>$name</B>";
+		$_ .= " <I>@args</I>" if @args;
+		$_ .= &debug("<P>\n", __LINE__);
+	    } elsif ($tag eq 'deftypefn' || $tag eq 'deftypevr'
+		     || $tag eq 'defcv' || $tag eq 'defop') {
+		$ftype = $name;
+		$name = shift(@args);
+		$name =~ s/^\{(.*)\}$/$1/;
+		$_ = "<U>$type</U> $ftype <B>$name</B>";
+		$_ .= " <I>@args</I>" if @args;
+		$_ .= &debug("<P>\n", __LINE__);
+	    } else {
+		warn "$ERROR Unknown definition type: $tag\n";
+		$_ = "<U>$type</U> <B>$name</B>";
+		$_ .= " <I>@args</I>" if @args;
+		$_ .= &debug("<P>\n", __LINE__);
+	    }
+	    if ($tag eq 'deffn' || $tag eq 'deftypefn' || $tag eq 'defop') {
+		unshift(@input_spool, "\@findex $name\n");
+	    } elsif ($tag eq 'defvr' || $tag eq 'deftypevr' || $tag eq 'defcv') {
+		unshift(@input_spool, "\@vindex $name\n");
+	    } else {
+		unshift(@input_spool, "\@tindex $name\n");
+	    }
+	    $dont_html = 1;
+	}
+    } elsif ($end_tag) {
+	if ($format_map{$end_tag}) {
+	    $in_pre = 0 if $format_map{$end_tag} eq 'PRE';
+	    $in_list-- if $format_map{$end_tag} eq 'UL' || $format_map{$end_tag} eq 'OL' ;
+	    &html_pop_if('LI', 'P');
+	    &html_pop_if();
+	    push(@lines, &debug("</$format_map{$end_tag}>\n", __LINE__));
+	    push(@lines, &html_debug("\n", __LINE__));
+	} elsif ($end_tag eq 'table' ||
+		 $end_tag eq 'ftable' ||
+		 $end_tag eq 'vtable') {
+	    shift(@tables);
+	    if (@tables) {
+		($table_type, $in_table) = split($;, $tables[0]);
+	    } else {
+		$in_table = 0;
+	    }
+	    push(@lines, "</DL>\n");
+	    &html_pop_if('DD');
+	    &html_pop_if();
+	} elsif (defined($def_map{$end_tag})) {
+	    if ($html_element ne 'P') {
+		push(@lines, &debug("<P>\n", __LINE__));
+		&html_push('P');
+		push(@lines, &html_debug("\n", __LINE__));
+	    }
+	} elsif ($end_tag eq 'menu') {
+	    &html_pop_if();
+	    push(@lines, $_); # must keep it for pass 2
+	}
+	next;
+    }
+    #
+    # misc things
+    #
+    # protect texi and HTML things
+    &protect_texi;
+    &protect_html unless $dont_html;
+    $dont_html = 0;
+    # substitution (unsupported things)
+    s/^\@center\s+//g;
+    s/^\@exdent\s+//g;
+    s/\@noindent\s+//g;
+    s/\@refill\s+//g;
+    # other substitutions
+    &simple_substitutions;
+    s/\@value{($VARRE)}/$value{$1}/eg;
+    s/\@footnote\{/\@footnote$docu_doc\{/g; # mark footnotes, cf. pass 4
+    #
+    # analyze the tag again
+    #
+    if ($tag) {
+	if (defined($sec2level{$tag}) && $sec2level{$tag} > 0) {
+	    if (/^\@$tag\s+(.+)$/) {
+		$name = $1;
+		$name =~ s/\s+$//;
+		$level = $sec2level{$tag};
+		if ($tag =~ /heading$/) {
+		    push(@lines, &html_debug("\n", __LINE__));
+		    if ($html_element ne 'body') {
+			# We are in a nice pickle here. We are trying to get a H? heading
+			# even though we are not in the body level. So, we convert it to a
+			# nice, bold, line by itself.
+			$_ = &debug("\n\n<P><STRONG>$name</STRONG></P>\n\n", __LINE__);
+		    } else {
+			$_ = &debug("<H$level>$name</H$level>\n", __LINE__);
+			&html_push_if('body');
+		    }
+		    print "# heading, section $name, level $level\n"
+			if $debug & $DEBUG_TOC;
+		} else {
+		    if ($split_chapter && $split_at_node == 0) {
+			unless ($toplevel) {
+			    # first time we see a "section"
+			    unless ($level == 1) {
+				warn "$ERROR The first section found is not of level 1: $_";
+				warn "$ERROR I'll split on sections of level $level...\n";
+			    }
+			    $toplevel = $level;
+			}
+			if ($level == $toplevel) {
+			    &next_doc;
+			    splice(@lines,
+				   $#lines+($lines[$#lines] !~ /^<!-- NODE/),
+				   0, $SPLITTAG) if $split_num++;
+			    push(@sections, $name);
+			    $just_split = 1;
+			}
+		    }
+		    $id = 'SEC' . ++$sec_num;
+		    # check biblio and glossary
+		    $in_bibliography = ($name =~ /^bibliography$/i);
+		    $in_glossary = ($name =~ /^glossary$/i);
+		    # check node
+		    if ($node) {
+			if ($node2sec{$node}) {
+			    warn "$ERROR Duplicate node found: $node\n";
+			} else {
+			    $node2sec{$node} = $name;
+			    $node2href{$node} = "$docu_doc#$id";
+			    print "# node $node, section $name, level $level\n"
+				if $debug & $DEBUG_TOC;
+			}
+			$node = "<!-- NODE:" . $node . " -->";
+		    } else {
+			print "# no node, section $name, level $level\n"
+			    if $debug & $DEBUG_TOC;
+		    }
+		    # update TOC
+		    while ($level > $curlevel) {
+			$curlevel++;
+			push(@toc_lines, "<UL>\n");
+		    }
+		    while ($level < $curlevel) {
+			$curlevel--;
+			push(@toc_lines, "</UL>\n");
+		    }
+		    if ($just_split) {
+		        $_ = $node . "<LI>" . &anchor($id, "$docu_doc", $name, 1);
+			$just_split = 0;
+		    } else {
+		        $_ = $node . "<LI>" . &anchor($id, "$docu_doc#$id", $name, 1);
+		    }
+		    $node = '';
+		    push(@toc_lines, &substitute_style($_));
+		    # update DOC
+		    push(@lines, &html_debug("\n", __LINE__));
+		    &html_reset;
+		    $_ =  "<H$level>" . &anchor($id, "$docu_toc#$id", $name) . "</H$level>\n";
+		    $_ = &debug($_, __LINE__);
+		    push(@lines, &html_debug("\n", __LINE__));
+		}
+		# update DOC
+		foreach $line (split(/\n+/, $_)) {
+		    push(@lines, "$line\n");
+		}
+		next;
+	    } else {
+		warn "$ERROR Bad section line: $_";
+	    }
+	} else {
+	    # track variables
+	    $value{$1} = $2, next if /^\@set\s+($VARRE)\s+(.*)$/o;
+	    delete $value{$1}, next if /^\@clear\s+($VARRE)\s*$/o;
+	    # store things
+	    $value{'filename'} = $1, next if /^\@setfilename\s+(.*)$/;
+	    $value{'author'} .= "$1\n", next if /^\@author\s+(.*)$/;
+	    $value{'subtitle'} .= "$1\n", next if /^\@subtitle\s+(.*)$/;
+	    $value{'title'} = $1, next if /^\@settitle\s+(.*)$/;
+	    $value{'title'} = $1, next if /^\@title\s+(.*)$/;
+	    # index
+	    if (/^\@(..?)index\s+/) {
+		unless ($valid_index{$1}) {
+		    warn "$ERROR Undefined index command: $_";
+		    next;
+		}
+		$id = 'IDX' . ++$idx_num;
+		$index = $1 . 'index';
+		$what = &substitute_style($');
+		$what =~ s/\s+$//;
+		print "# found $index for '$what' id $id\n"
+		    if $debug & $DEBUG_INDEX;
+		eval("\$$index\{\$what\} = \"$docu_doc#$id\"");
+		#
+		# dirty hack to see if I can put an invisible anchor...
+		#
+		if ($html_element eq 'P' ||
+		    $html_element eq 'LI' ||
+		    $html_element eq 'DT' ||
+		    $html_element eq 'DD' ||
+		    $html_element eq 'ADDRESS' ||
+		    $html_element eq 'B' ||
+		    $html_element eq 'BLOCKQUOTE' ||
+		    $html_element eq 'PRE' ||
+		    $html_element eq 'SAMP') {
+                    push(@lines, &anchor($id, '', $invisible_mark, !$in_pre));
+                } elsif ($html_element eq 'body') {
+		    push(@lines, &debug("<P>\n", __LINE__));
+                    push(@lines, &anchor($id, '', $invisible_mark, !$in_pre));
+		    &html_push('P');
+		} elsif ($html_element eq 'DL' ||
+			 $html_element eq 'UL' ||
+			 $html_element eq 'OL' ) {
+		    $deferred_ref .= &anchor($id, '', $invisible_mark, !$in_pre) . " ";
+		}
+		next;
+	    }
+	    # list item
+	    if (/^\@itemx?\s+/) {
+		$what = $';
+		$what =~ s/\s+$//;
+		if ($in_bibliography && $use_bibliography) {
+		    if ($what =~ /^$BIBRE$/o) {
+			$id = 'BIB' . ++$bib_num;
+			$bib2href{$what} = "$docu_doc#$id";
+			print "# found bibliography for '$what' id $id\n"
+			    if $debug & $DEBUG_BIB;
+			$what = &anchor($id, '', $what);
+		    }
+		} elsif ($in_glossary && $use_glossary) {
+		    $id = 'GLOSS' . ++$gloss_num;
+		    $entry = $what;
+		    $entry =~ tr/A-Z/a-z/ unless $entry =~ /^[A-Z\s]+$/;
+		    $gloss2href{$entry} = "$docu_doc#$id";
+		    print "# found glossary for '$entry' id $id\n"
+			if $debug & $DEBUG_GLOSS;
+		    $what = &anchor($id, '', $what);
+		}
+		&html_pop_if('P');
+		if ($html_element eq 'DL' || $html_element eq 'DD') {
+		    if ($things_map{$in_table} && !$what) {
+			# special case to allow @table @bullet for instance
+			push(@lines, &debug("<DT>$things_map{$in_table}\n", __LINE__));
+		    } else {
+			push(@lines, &debug("<DT>\@$in_table\{$what\}\n", __LINE__));
+		    }
+		    push(@lines, "<DD>");
+		    &html_push('DD') unless $html_element eq 'DD';
+		    if ($table_type) { # add also an index
+			unshift(@input_spool, "\@${table_type}index $what\n");
+		    }
+		} else {
+		    push(@lines, &debug("<LI>$what\n", __LINE__));
+		    &html_push('LI') unless $html_element eq 'LI';
+		}
+		push(@lines, &html_debug("\n", __LINE__));
+		if ($deferred_ref) {
+		    push(@lines, &debug("$deferred_ref\n", __LINE__));
+		    $deferred_ref = '';
+		}
+		next;
+	    }
+	}
+    }
+    # paragraph separator
+    if ($_ eq "\n") {
+	next if $#lines >= 0 && $lines[$#lines] eq "\n";
+	if ($html_element eq 'P') {
+	    push(@lines, "\n");
+	    $_ = &debug("</P>\n", __LINE__);
+	    &html_pop;
+	}
+    } elsif ($html_element eq 'body' || $html_element eq 'BLOCKQUOTE') {
+	push(@lines, "<P>\n");
+	&html_push('P');
+	$_ = &debug($_, __LINE__);
+    }
+    # otherwise
+    push(@lines, $_);
+}
+
+# finish TOC
+$level = 0;
+while ($level < $curlevel) {
+    $curlevel--;
+    push(@toc_lines, "</UL>\n");
+}
+
+print "# end of pass 1\n" if $verbose;
+
+#+++############################################################################
+#                                                                              #
+# Pass 2/3: handle style, menu, index, cross-reference                         #
+#                                                                              #
+#---############################################################################
+
+@lines2 = ();				# whole document (2nd pass)
+@lines3 = ();				# whole document (3rd pass)
+$in_menu = 0;				# am I inside a menu
+
+while (@lines) {
+    $_ = shift(@lines);
+    #
+    # special case (protected sections)
+    #
+    if (/^$PROTECTTAG/o) {
+	push(@lines2, $_);
+	next;
+    }
+    #
+    # menu
+    #
+    $in_menu = 1, push(@lines2, &debug("<UL>\n", __LINE__)), next if /^\@menu\b/;
+    $in_menu = 0, push(@lines2, &debug("</UL>\n", __LINE__)), next if /^\@end\s+menu\b/;
+    if ($in_menu) {
+	if (/^\*\s+($NODERE)::/o) {
+	    $descr = $';
+	    chop($descr);
+	    &menu_entry($1, $1, $descr);
+	} elsif (/^\*\s+(.+):\s+([^\t,\.\n]+)[\t,\.\n]/) {
+	    $descr = $';
+	    chop($descr);
+	    &menu_entry($1, $2, $descr);
+	} elsif (/^\*/) {
+	    warn "$ERROR Bad menu line: $_";
+	} else { # description continued?
+	    push(@lines2, $_);
+	}
+	next;
+    }
+    #
+    # printindex
+    #
+    if (/^\@printindex\s+(\w\w)\b/) {
+	if ($predefined_index{$1}) {
+	    $index = $predefined_index{$1} . 'index';
+	} else {
+	    $index = $1 . 'index';
+	}
+	eval("*ary = *$index");
+	@keys = keys(%ary);
+	foreach$key (@keys) {
+	    $_ = $key;
+	    1 while s/<(\w+)>\`(.*)\'<\/\1>/$2/; # remove HTML tags with quotes
+	    1 while s/<(\w+)>(.*)<\/\1>/$2/; # remove HTML tags
+	    &unprotect_html;
+	    &unprotect_texi;
+	    tr/A-Z/a-z/; # lowercase
+	    $key2alpha{$key} = $_;
+	    print "# index $key sorted as $_\n"
+		if $key ne $_ && $debug & $DEBUG_INDEX;
+	}
+	if ($index_chars) {
+	    $last_letter = undef;
+	    foreach(sort byalpha @keys) {
+	        $letter = substr($key2alpha{$_}, 0, 1);
+	        $letter = substr($key2alpha{$_}, 0, 2) if $letter eq $;;
+		$uletter = $letter;
+		$letter =~ tr/a-z/A-Z/;
+	        if (!defined($last_letter) || $letter ne $last_letter) {
+		    $last_letter = $letter;
+	            push(@lines2, &anchor('', "$docu_doc#LET$uletter",
+			 $letter, 1));
+	        }
+       	    }
+	}
+	$last_letter = undef;
+	foreach(sort byalpha @keys) {
+	    $letter = substr($key2alpha{$_}, 0, 1);
+	    $letter = substr($key2alpha{$_}, 0, 2) if $letter eq $;;
+	    if (!defined($last_letter) || $letter ne $last_letter) {
+		local($_) = $letter;
+		&protect_html;
+		push(@lines2, "</DIR>\n") if defined($last_letter);
+		push(@lines2, "<A NAME=\"LET$letter\"></A>\n");
+		push(@lines2, "<H2>$_</H2>\n");
+		push(@lines2, "<DIR>\n");
+		$last_letter = $letter;
+	    }
+	    push(@lines2, "<LI>" . &anchor('', $ary{$_}, $_, 1));
+	}
+	push(@lines2, "</DIR>\n") if defined($last_letter);
+	next;
+    }
+    #
+    # simple style substitutions
+    #
+    $_ = &substitute_style($_);
+    #
+    # xref
+    #
+    while (/\@(x|px|info|)ref{($XREFRE)(}?)/o) {
+	# note: Texinfo may accept other characters
+	($type, $nodes, $full) = ($1, $2, $3);
+	($before, $after) = ($`, $');
+	if (! $full && $after) {
+	    warn "$ERROR Bad xref (no ending } on line): $_";
+	    $_ = "$before$;0${type}ref\{$nodes$after";
+	    next; # while xref
+	}
+	if ($type eq 'x') {
+	    $type = 'See ';
+	} elsif ($type eq 'px') {
+	    $type = 'see ';
+	} elsif ($type eq 'info') {
+	    $type = 'See Info';
+	} else {
+	    $type = '';
+	}
+	unless ($full) {
+	    $next = shift(@lines);
+	    $next = &substitute_style($next);
+	    chop($nodes); # remove final newline
+	    if ($next =~ /\}/) { # split on 2 lines
+		$nodes .= " $`";
+		$after = $';
+	    } else {
+		$nodes .= " $next";
+		$next = shift(@lines);
+		$next = &substitute_style($next);
+		chop($nodes);
+		if ($next =~ /\}/) { # split on 3 lines
+		    $nodes .= " $`";
+		    $after = $';
+		} else {
+		    warn "$ERROR Bad xref (no ending }): $_";
+		    $_ = "$before$;0xref\{$nodes$after";
+		    unshift(@lines, $next);
+		    next; # while xref
+		}
+	    }
+	}
+	$nodes =~ s/\s+/ /g; # normalize
+	@args = split(/\s*,\s*/, $nodes);
+	$node = $args[0]; # the node is always the first arg
+	$sec = $node2sec{$node};
+	if (@args == 5) { # reference to another manual
+	    $sec = $args[2] || $node;
+	    $man = $args[4] || $args[3];
+	    $_ = "${before}${type}section `$sec' in \@cite{$man}$after";
+	} elsif ($type =~ /Info/) { # inforef
+	    warn "$ERROR Wrong number of arguments: $_" unless @args == 3;
+	    ($nn, $_, $in) = @args;
+	    $_ = "${before}${type} file `$in', node `$nn'$after";
+	} elsif ($sec) {
+	    $href = $node2href{$node};
+	    $_ = "${before}${type}section " . &anchor('', $href, $sec) . $after;
+	} else {
+	    warn "$ERROR Undefined node ($node): $_";
+	    $_ = "$before$;0xref{$nodes}$after";
+	}
+    }
+    #
+    # try to guess bibliography references or glossary terms
+    #
+    unless (/^<H\d><A NAME=\"SEC\d/) {
+	if ($use_bibliography) {
+	    $done = '';
+	    while (/$BIBRE/o) {
+		($pre, $what, $post) = ($`, $&, $');
+		$href = $bib2href{$what};
+		if (defined($href) && $post !~ /^[^<]*<\/A>/) {
+		    $done .= $pre . &anchor('', $href, $what);
+		} else {
+		    $done .= "$pre$what";
+		}
+		$_ = $post;
+	    }
+	    $_ = $done . $_;
+	}
+	if ($use_glossary) {
+	    $done = '';
+	    while (/\b\w+\b/) {
+		($pre, $what, $post) = ($`, $&, $');
+		$entry = $what;
+		$entry =~ tr/A-Z/a-z/ unless $entry =~ /^[A-Z\s]+$/;
+		$href = $gloss2href{$entry};
+		if (defined($href) && $post !~ /^[^<]*<\/A>/) {
+		    $done .= $pre . &anchor('', $href, $what);
+		} else {
+		    $done .= "$pre$what";
+		}
+		$_ = $post;
+	    }
+	    $_ = $done . $_;
+	}
+    }
+    # otherwise
+    push(@lines2, $_);
+}
+print "# end of pass 2\n" if $verbose;
+
+#
+# split style substitutions
+#
+while (@lines2) {
+    $_ = shift(@lines2);
+    #
+    # special case (protected sections)
+    #
+    if (/^$PROTECTTAG/o) {
+	push(@lines3, $_);
+	next;
+    }
+    #
+    # split style substitutions
+    #
+    $old = '';
+    while ($old ne $_) {
+        $old = $_;
+	if (/\@(\w+)\{/) {
+	    ($before, $style, $after) = ($`, $1, $');
+	    if (defined($style_map{$style})) {
+		$_ = $after;
+		$text = '';
+		$after = '';
+		$failed = 1;
+		while (@lines2) {
+		    if (/\}/) {
+			$text .= $`;
+			$after = $';
+			$failed = 0;
+			last;
+		    } else {
+			$text .= $_;
+			$_ = shift(@lines2);
+		    }
+		}
+		if ($failed) {
+		    die "* Bad syntax (\@$style) after: $before\n";
+		} else {
+		    $text = &apply_style($style, $text);
+		    $_ = "$before$text$after";
+		}
+	    }
+	}
+    }
+    # otherwise
+    push(@lines3, $_);
+}
+print "# end of pass 3\n" if $verbose;
+
+#+++############################################################################
+#                                                                              #
+# Pass 4: foot notes, final cleanup                                            #
+#                                                                              #
+#---############################################################################
+
+@foot_lines = ();			# footnotes
+@doc_lines = ();			# final document
+$end_of_para = 0;			# true if last line is <P>
+
+while (@lines3) {
+    $_ = shift(@lines3);
+    #
+    # special case (protected sections)
+    #
+    if (/^$PROTECTTAG/o) {
+	push(@doc_lines, $_);
+	$end_of_para = 0;
+	next;
+    }
+    #
+    # footnotes
+    #
+    while (/\@footnote([^\{\s]+)\{/) {
+	($before, $d, $after) = ($`, $1, $');
+	$_ = $after;
+	$text = '';
+	$after = '';
+	$failed = 1;
+	while (@lines3) {
+	    if (/\}/) {
+		$text .= $`;
+		$after = $';
+		$failed = 0;
+		last;
+	    } else {
+		$text .= $_;
+		$_ = shift(@lines3);
+	    }
+	}
+	if ($failed) {
+	    die "* Bad syntax (\@footnote) after: $before\n";
+	} else {
+	    $id = 'FOOT' . ++$foot_num;
+	    $foot = "($foot_num)";
+	    push(@foot_lines, "<H3>" . &anchor($id, "$d#$id", $foot) . "</H3>\n");
+	    $text = "<P>$text" unless $text =~ /^\s*<P>/;
+	    push(@foot_lines, "$text\n");
+	    $_ = $before . &anchor($id, "$docu_foot#$id", $foot) . $after;
+	}
+    }
+    #
+    # remove unnecessary <P>
+    #
+    if (/^\s*<P>\s*$/) {
+	next if $end_of_para++;
+    } else {
+	$end_of_para = 0;
+    }
+    # otherwise
+    push(@doc_lines, $_);
+}
+print "# end of pass 4\n" if $verbose;
+
+#+++############################################################################
+#                                                                              #
+# Pass 5: print things                                                         #
+#                                                                              #
+#---############################################################################
+
+$header = <<EOT;
+<!-- This HTML file has been created by $THISPROG
+     from $docu on $TODAY -->
+EOT
+
+$title = $value{'title'} || "Untitled Document";
+$_ = &substitute_style($title);
+&unprotect_texi;
+$full_title = $_;
+
+#
+# print TOC
+#
+if (open(FILE, "> $docu_toc")) {
+    print "# creating $docu_toc...\n" if $verbose;
+    &print_header("$title - Table of Contents");
+    print FILE "<H1>$full_title</H1>\n";
+    if ($value{'subtitle'}) {
+	chop($value{'subtitle'}); # rmv last \n
+	foreach (split(/\n/, $value{'subtitle'})) {
+	    $_ = &substitute_style($_);
+	    &unprotect_texi;
+	    print FILE "<H2>$_</H2>\n";
+	}
+    }
+    if ($value{'author'}) {
+	chop($value{'author'}); # rmv last \n
+	foreach (split(/\n/, $value{'author'})) {
+	    $_ = &substitute_style($_);
+	    &unprotect_texi;
+	    print FILE "<ADDRESS>$_</ADDRESS>\n";
+	}
+    }
+    print FILE "<P>\n";
+    &print(*toc_lines, FILE);
+    print FILE <<EOT;
+<HR>
+<P>This document was generated on $TODAY using the
+<A HREF=\"$HOMEPAGE\">texi2html</A>
+translator version 1.39 (with local modifications)</P>
+EOT
+    &print_footer;
+    close(FILE);
+} else {
+    warn "$ERROR Can't write to $docu_toc: $!\n";
+}
+
+#
+# print document
+#
+if ($split_chapter || $split_node) {
+    $doc_num = 0;
+    $last_num = scalar(@sections);
+    $first_doc = &doc_name(1);
+    $last_doc = &doc_name($last_num);
+    while (@sections) {
+	$section = shift(@sections);
+	&next_doc;
+	if (open(FILE, "> $docu_doc")) {
+	    print "# creating $docu_doc...\n" if $verbose;
+	    &print_header("$title - $section");
+	    $prev_doc = ($doc_num == 1 ? undef : &doc_name($doc_num - 1));
+	    $next_doc = ($doc_num == $last_num ? undef : &doc_name($doc_num + 1));
+	    $navigation = &anchor('', $first_doc,
+	"<IMG SRC=\"i/nav_first.gif\" ALT=\"first\">") . "  " if $prev_doc;
+	    $navigation .= &anchor('', $prev_doc,
+	"<IMG SRC=\"i/nav_prev.gif\" ALT=\"previous\">") . "  " if $prev_doc;
+	    $navigation .= &anchor('', $next_doc,
+	"<IMG SRC=\"i/nav_next.gif\" ALT=\"next\">") . "  " if $next_doc;
+	    $navigation .= &anchor('', $last_doc,
+	"<IMG SRC=\"i/nav_last.gif\" ALT=\"last\">") . "  " if $next_doc;
+	    $navigation .= &anchor('', $docu_toc,
+	"<IMG SRC=\"i/nav_top.gif\" ALT=\"contents\">") . "\n";
+	    print FILE "$navigation<HR>\n";
+	    # find corresponding lines
+            @tmp_lines = ();
+            while (@doc_lines) {
+		$_ = shift(@doc_lines);
+		last if ($_ eq $SPLITTAG);
+		push(@tmp_lines, $_);
+	    }
+            &print(*tmp_lines, FILE);
+	    print FILE "<HR>\n$navigation";
+	    &print_footer;
+	    close(FILE);
+	} else {
+	    warn "$ERROR Can't write to $docu_doc: $!\n";
+	}
+    }
+} else {
+    if (open(FILE, "> $docu_doc")) {
+	print "# creating $docu_doc...\n" if $verbose;
+	&print_header($title);
+	print FILE "<H1>$full_title</H1>\n";
+        &print(*doc_lines, FILE);
+	&print_footer;
+	close(FILE);
+    } else {
+	warn "$ERROR Can't write to $docu_doc: $!\n";
+    }
+}
+
+#
+# print footnotes
+#
+if (@foot_lines) {
+    if (open(FILE, "> $docu_foot")) {
+	print "# creating $docu_foot...\n" if $verbose;
+	&print_header("$title - Footnotes");
+	print FILE "<H1>$full_title</H1>\n";
+        &print(*foot_lines, FILE);
+	&print_footer;
+	close(FILE);
+    } else {
+	warn "$ERROR Can't write to $docu_foot: $!\n";
+    }
+}
+
+print "# that's all folks\n" if $verbose;
+
+#+++############################################################################
+#                                                                              #
+# Low level functions                                                          #
+#                                                                              #
+#---############################################################################
+
+sub check {
+    local($_, %seen, %context, $before, $match, $after);
+
+    while (<>) {
+	if (/\@(\*|\.|\:|\@|\{|\})/) {
+	    $seen{$&}++;
+	    $context{$&} .= "> $_" if $verbose;
+	    $_ = "$`XX$'";
+	    redo;
+	}
+	if (/\@(\w+)/) {
+	    ($before, $match, $after) = ($`, $&, $');
+	    if ($before =~ /\b[\w-]+$/ && $after =~ /^[\w-.]*\b/) { # e-mail address
+		$seen{'e-mail address'}++;
+		$context{'e-mail address'} .= "> $_" if $verbose;
+	    } else {
+		$seen{$match}++;
+		$context{$match} .= "> $_" if $verbose;
+	    }
+	    $match =~ s/^\@/X/;
+	    $_ = "$before$match$after";
+	    redo;
+	}
+    }
+    
+    foreach(sort(keys(%seen))) {
+	if ($verbose) {
+	    print "$_\n";
+	    print $context{$_};
+	} else {
+	    print "$_ ($seen{$_})\n";
+	}
+    }
+}
+
+sub open {
+    local($name) = @_;
+
+    ++$fh_name;
+    if (open($fh_name, $name)) {
+	unshift(@fhs, $fh_name);
+    } else {
+	warn "$ERROR Can't read file $name: $!\n";
+    }
+}
+
+sub init_input {
+    @fhs = ();			# hold the file handles to read
+    @input_spool = ();		# spooled lines to read
+    $fh_name = 'FH000';
+    &open($docu);
+}
+
+sub next_line {
+    local($fh, $line);
+
+    if (@input_spool) {
+	$line = shift(@input_spool);
+	return($line);
+    }
+    while (@fhs) {
+	$fh = $fhs[0];
+	$line = <$fh>;
+	return($line) if $line;
+	close($fh);
+	shift(@fhs);
+    }
+    return(undef);
+}
+
+# used in pass 1, use &next_line
+sub skip_until {
+    local($tag) = @_;
+    local($_);
+
+    while ($_ = &next_line) {
+	return if /^\@end\s+$tag\s*$/;
+    }
+    die "* Failed to find '$tag' after: " . $lines[$#lines];
+}
+
+#
+# HTML stacking to have a better HTML output
+#
+
+sub html_reset {
+    @html_stack = ('html');
+    $html_element = 'body';
+}
+
+sub html_push {
+    local($what) = @_;
+    push(@html_stack, $html_element);
+    $html_element = $what;
+}
+
+sub html_push_if {
+    local($what) = @_;
+    push(@html_stack, $html_element)
+	if ($html_element && $html_element ne 'P');
+    $html_element = $what;
+}
+
+sub html_pop {
+    $html_element = pop(@html_stack);
+}
+
+sub html_pop_if {
+    local($elt);
+
+    if (@_) {
+	foreach $elt (@_) {
+	    if ($elt eq $html_element) {
+		$html_element = pop(@html_stack) if @html_stack;
+		last;
+	    }
+	}
+    } else {
+	$html_element = pop(@html_stack) if @html_stack;
+    }
+}
+
+sub html_debug {
+    local($what, $line) = @_;
+    return("<!-- $line @html_stack, $html_element -->$what")
+	if $debug & $DEBUG_HTML;
+    return($what);
+}
+
+# to debug the output...
+sub debug {
+    local($what, $line) = @_;
+    return("<!-- $line -->$what")
+	if $debug & $DEBUG_HTML;
+    return($what);
+}
+
+sub menu_entry {
+    local($entry, $node, $descr) = @_;
+    local($href);
+
+    $href = $node2href{$node};
+    if ($href) {
+	$descr =~ s/^\s+//;
+#	$descr = ": $descr" if $descr;
+# Changed 12/10/95 by jkb
+#	push(@lines2, "<LI>" . &anchor('', $href, $entry) . "$descr\n");
+	push(@lines2, "<LI>" . &anchor('', $href, $descr) . "\n");
+    } else {
+	warn "$ERROR Undefined node ($node): $_";
+    }
+}
+
+sub do_ctrl { "^$_[0]" }
+
+sub do_sc { "\U$_[0]\E" }
+
+sub apply_style {
+    local($texi_style, $text) = @_;
+    local($style);
+
+    $style = $style_map{$texi_style};
+    if (defined($style)) { # known style
+	if ($style =~ /^\"/) { # add quotes
+	    $style = $';
+	    $text = "\`$text\'";
+	}
+	if ($style =~ /^\&/) { # custom
+	    $style = $';
+	    $text = &$style($text);
+	} elsif ($style) { # good style
+	    $text = "<$style>$text</$style>";
+	} else { # no style
+	}
+    } else { # unknown style
+	$text = undef;
+    }
+    return($text);
+}
+
+# remove Texinfo styles
+sub remove_style {
+    local($_) = @_;
+    s/\@\w+{([^\{\}]+)}/$1/g;
+    return($_);
+}
+
+sub substitute_style {
+    local($_) = @_;
+    local($changed, $done, $style, $text);
+
+    $changed = 1;
+    while ($changed) {
+	$changed = 0;
+	$done = '';
+	while (/\@(\w+){([^\{\}]+)}/) {
+	    $text = &apply_style($1, $2);
+	    if ($text) {
+		$_ = "$`$text$'";
+		$changed = 1;
+	    } else {
+		$done .= "$`\@$1";
+		$_ = "{$2}$'";
+	    }
+	}
+        $_ = $done . $_;
+    }
+    return($_);
+}
+
+sub anchor {
+    local($name, $href, $text, $newline) = @_;
+    local($result);
+
+    $result = "<A";
+    $result .= " NAME=\"$name\"" if $name;
+    $result .= " HREF=\"$href\"" if $href;
+    $result .= ">$text</A>";
+    $result .= "\n" if $newline;
+    return($result);
+}
+
+sub pretty_date {
+    local(@MoY, $sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst);
+
+    @MoY = ('January', 'February', 'March', 'April', 'May', 'June',
+	    'July', 'August', 'September', 'October', 'November', 'December');
+    ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = localtime(time);
+    $year += ($year < 70) ? 2000 : 1900;
+    return("$mday $MoY[$mon] $year");
+}
+
+sub doc_name {
+    local($num) = @_;
+
+    return("${docu_name}_$num.html");
+}
+
+sub next_doc {
+    $docu_doc = &doc_name(++$doc_num);
+}
+
+sub print {
+    local(*lines, $fh) = @_;
+    local($_);
+
+    while (@lines) {
+	$_ = shift(@lines);
+	if (/^$PROTECTTAG/o) {
+	    $_ = $tag2pro{$_};
+	} else {
+	    &unprotect_texi;
+	}
+	print $fh $_;
+    }
+}
+
+sub print_header {
+    local($_);
+
+    # clean the title
+    $_ = &remove_style($_[0]);
+    &unprotect_texi;
+    # print the header
+    if ($doctype eq 'html2') {
+	print FILE $html2_doctype;
+    } elsif ($doctype) {
+	print FILE $doctype;
+    }
+    print FILE <<EOT;
+<HTML>
+<HEAD>
+$header
+<TITLE>$_</TITLE>
+</HEAD>
+<BODY bgcolor="#ffffff">
+EOT
+}
+
+sub print_footer {
+    print FILE <<EOT;
+<hr>
+<i>Last generated on $TODAY.</i>
+<font size="-1"><br>
+</font>
+</BODY>
+</HTML>
+EOT
+}
+
+sub protect_texi {
+    # protect @ { } ` '
+    s/\@\@/$;0/go;
+    s/\@\{/$;1/go;
+    s/\@\}/$;2/go;
+    s/\@\`/$;3/go;
+    s/\@\'/$;4/go;
+}
+
+sub protect_html {
+    # protect & < >
+    s/\&/\&\#38;/g;
+    s/\</\&\#60;/g;
+    s/\>/\&\#62;/g;
+    s/\&\#60;\/A\&\#62;/<\/A>/g; # assume </A> is HTML
+    s/\&\#60;A ([^\&]+)\&\#62;/<A $1>/g; # assume <A [^&]+> is HTML
+    s/\&\#60;IMG ([^\&]+)\&\#62;/<IMG $1>/g; # assume <IMG [^&]+> is HTML
+}
+
+sub unprotect_texi {
+    s/$;0/\@/go;
+    s/$;1/\{/go;
+    s/$;2/\}/go;
+    s/$;3/\`/go;
+    s/$;4/\'/go;
+}
+
+sub unprotect_html {
+    s/\&\#38;/\&/g;
+    s/\&\#60;/\</g;
+    s/\&\#62;/\>/g;
+}
+
+sub byalpha {
+    $key2alpha{$a} cmp $key2alpha{$b};
+}
+
+##############################################################################
+
+	# These next few lines are legal in both Perl and nroff.
+
+.00;			# finish .ig
+ 
+'di			\" finish diversion--previous line must be blank
+.nr nl 0-1		\" fake up transition to first page again
+.nr % 0			\" start at page 1
+'; __END__ ############# From here on it's a standard manual page ############
+.TH TEXI2HTML 1 "07/27/95"
+.AT 3
+.SH NAME
+texi2html \- a Texinfo to HTML converter
+.SH SYNOPSIS
+.B texi2html [options] file
+.PP
+.B texi2html -check [-verbose] files
+.SH DESCRIPTION
+.I Texi2html
+converts the given Texinfo file to a set of HTML files. It tries to handle
+most of the Texinfo commands. It creates hypertext links for cross-references,
+footnotes...
+.PP
+It also tries to add links from a reference to its corresponding entry in the
+bibliography (if any). It may also handle a glossary (see the
+.B \-glossary
+option).
+.PP
+.I Texi2html
+creates several files depending on the contents of the Texinfo file and on
+the chosen options (see FILES).
+.PP
+The HTML files created by
+.I texi2html
+are closer to TeX than to Info, which is why
+.I texi2html
+converts @iftex sections and not @ifinfo ones by default. You can reverse
+this with the \-expandinfo option.
+.SH OPTIONS
+.TP 12
+.B \-check
+Check the given file and give the list of all things that may be Texinfo commands.
+This may be used to check the output of
+.I texi2html
+to find the Texinfo commands that have been left in the HTML file.
+.TP
+.B \-expandinfo
+Expand @ifinfo sections, not @iftex ones.
+.TP
+.B \-glossary
+Use the section named 'Glossary' to build a list of terms and put links in the HTML
+document from each term toward its definition.
+.TP
+.B \-invisible \fIname\fP
+Use \fIname\fP to create invisible destination anchors for index links. This is a workaround
+for a known bug of many WWW browsers, including xmosaic.
+.TP
+.B \-I \fIdir\fP
+Look also in \fIdir\fP to find included files.
+.TP
+.B \-menu
+Show the Texinfo menus; by default they are ignored.
+.TP
+.B \-split_chapter
+Split the output into several HTML files (one per main section:
+chapter, appendix...).
+.TP
+.B \-split_node
+Split the output into several HTML files (one per node).
+.TP
+.B \-usage
+Print usage instructions, listing the currently available command-line options.
+.TP
+.B \-verbose
+Give a verbose output. Can be used with the
+.B \-check
+option.
+.PP
+.SH FILES
+By default
+.I texi2html
+creates the following files (foo being the name of the Texinfo file):
+.TP 16
+.B foo_toc.html
+The table of contents.
+.TP
+.B foo.html
+The document's contents.
+.TP
+.B foo_foot.html
+The footnotes (if any).
+.PP
+When used with the
+.B \-split
+option, it creates several files (one per chapter or node), named
+.B foo_n.html
+(n being the index of the chapter or node), instead of the single
+.BR foo.html .
+.PP
+If an
+.B @split
+command is found and a
+.B \-split_chapter
+option is used, then a new file is started at the next node, whether or not
+the
+.B \-split_node
+option is also
+used.
+.SH VARIABLES
+.I texi2html
+predefines the following variables: \fBhtml\fP, \fBtexi2html\fP.
+.SH ADDITIONAL COMMANDS
+.I texi2html
+implements the following non-Texinfo commands:
+.TP 16
+.B @ifhtml
+This indicates the start of an HTML section; this section will be passed through
+without any modification.
+.TP
+.B @end ifhtml
+This indicates the end of an HTML section.
+.SH VERSION
+This is \fItexi2html\fP version 1.39, 07/27/95.
+.PP
+The latest version of \fItexi2html\fP can be found in WWW, cf. URL
+http://wwwcn.cern.ch/dci/texi2html/
+.SH AUTHOR
+The main author is Lionel Cons, CERN CN/DCI/UWS, Lionel.Cons at cern.ch.
+Many other people around the net contributed to this program.
+.SH COPYRIGHT
+This program is the intellectual property of the European
+Laboratory for Particle Physics (known as CERN). No guarantee whatsoever is
+provided by CERN. No liability whatsoever is accepted for any loss or damage
+of any kind resulting from any defect or inaccuracy in this information or
+code.
+.PP
+CERN, 1211 Geneva 23, Switzerland
+.SH "SEE ALSO"
+GNU Texinfo Documentation Format,
+HyperText Markup Language (HTML),
+World Wide Web (WWW).
+.SH BUGS
+This program does not understand all Texinfo commands (yet).
+.PP
+TeX specific commands (normally enclosed in @iftex) will be
+passed unmodified.
+.ex
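Note on protect_html/unprotect_html above: the escaping order matters, because `&` has to be rewritten before `<` and `>` so that the entities just produced are not escaped again. The following is a minimal Python sketch of that logic, not the Perl from the patch. The function names mirror the Perl subs, and everything else is an assumption made for illustration.

```python
import re

def protect_html(text):
    """Escape &, < and > as numeric entities, then restore A and IMG tags."""
    text = text.replace("&", "&#38;")   # must come first
    text = text.replace("<", "&#60;")
    text = text.replace(">", "&#62;")
    # As in the Perl code, assume these three forms are real HTML.
    text = text.replace("&#60;/A&#62;", "</A>")
    text = re.sub(r"&#60;A ([^&]+)&#62;", r"<A \1>", text)
    text = re.sub(r"&#60;IMG ([^&]+)&#62;", r"<IMG \1>", text)
    return text

def unprotect_html(text):
    """Undo the entity escaping, in the same order as the Perl sub."""
    return (text.replace("&#38;", "&")
                .replace("&#60;", "<")
                .replace("&#62;", ">"))
```

The restore step is a heuristic: any text that happens to look like an anchor or image tag is passed through as markup.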
diff --git a/manual/tools/texi2man.pl b/manual/tools/texi2man.pl
new file mode 100755
index 0000000..b4833da
--- /dev/null
+++ b/manual/tools/texi2man.pl
@@ -0,0 +1,173 @@
+#!/usr/bin/perl -w
+
+# jkb 05/12/95
+# Converts our man pages written in TexInfo to troff man format. The texinfo
+# versions need to be pretty similar in layout, with the usual NAME,
+# DESCRIPTION etc headings. (Further support to be added when and if it's
+# required.) Note that it's much easier to convert from texinfo to troff than
+# troff to texinfo.
+#
+
+$newline=1;
+$example=0;
+$see_also=0;
+$table_mode=0;
+$indent="0.5i";
+$mansection="1";
+
+sub convert_line {
+    if (/^\@c MANSECTION=(.*)/) {
+	$mansection="$1";
+    }
+    # First section (NAME) is parsed to find the manual page name. This
+    # is needed for the .TH line (which is output here).
+    if (/^\@(unnumberedsec|section) (.*)/) {
+	s/^\@(unnumberedsec|section) (.*)/.SH "$2"\n.PP/;
+	if ($2 eq "NAME") {
+	    $t="";
+	    do {
+	        s/^@.*//;
+	        s/\@[^ ]*{([^}]*)}/$1/g;
+		($_ ne "\n") && ($t .= $_);
+	        $_ = <>;
+		convert_line();
+	    } while (!/ \\- /);
+	    /([^ ]*) \\-/;
+	    print ".TH \"$1\" $mansection \"\" \"\" \"Staden Package\"\n$t";
+	} elsif ($2 eq "SEE ALSO") {
+	    $see_also=1;
+       	}
+    } else {
+        s/\\/\\\\/g;
+    }
+    s/^\@subsection (.*)/.SS "$1"\n.PP/;
+    s/^\@unnumberedsubsec (.*)/.SS "$1"\n.PP/;
+    s/^\@example/.nf\n.in +$indent/ && ($example=1);
+    s/^\@end example/.in -$indent\n.fi/ && ($example=0);
+
+    s/\@strong{([^}]*)}/\\fB$1\\fP/g;
+    s/\@code{([^}]*)}/\\fB$1\\fP/g;
+    s/\@b{([^}]*)}/\\fB$1\\fP/g;
+    s/\@i{([^}]*)}/\\fI$1\\fP/g;
+    s/\@var{([^}]*)}/\\fI$1\\fP/g;
+
+    # See also commands, typically as cross references.
+    if (/_fxref\(/) {
+	if ($see_also) {
+	    s/_fxref\([^,]*,[ \t\n]*(([^(]*)([^,]*)),[^)]*\)/\\fB$2\\fR$3/g;
+	} else {
+	    s/_fxref\([^,]*,[ \t\n]*([^,]*),[^)]*\)/See Section $1./g;
+	}
+    }
+
+    if ($see_also) {
+	s/\@\*//g;
+    } else {
+        s/\@\*/\n.br\n/g;
+    }
+    s/ --- / \\- /g;
+    s/\@{/{/g;
+    s/\@}/}/g;
+    s/\@\@/\@/g;
+
+    $example==0 && s/^[ ]*//;
+
+    if (/^\@c TABLE_MODE=(.*)/) {
+	$table_mode=$1;
+	$_="";
+    }
+
+    if (/^\@c INDENT=(.*)/) {
+	$indent="$1";
+	$_="";
+    }
+
+    if (/^\@table/) {
+	$_ = <>;
+	if ($table_mode == 1) {
+	    print ".nf\n";
+	} elsif ($table_mode == 2) {
+	    print ".PD 0\n";
+	}
+	do {
+	    convert_line();
+	
+	    if (/^\@item (.*)/) {
+		if ($table_mode == 1) {
+		    print ".BR $1 ";
+		} elsif ($table_mode == 2) {
+		    print ".IP $1 13\n";
+		} else {
+		    print ".TP\n";
+		    print "$1\n";
+		}
+       	    } elsif ($_ eq ".fi") {
+		print "$_\n";
+	    } elsif ($_ ne "") {
+		if ($table_mode == 1) {
+		    s/\n//;
+		    print "\"  $_\"\n";
+	        } else {
+		    print;
+		}
+	    }
+
+	    $_ = <>;
+	} while (!/^\@end table/);
+
+	if ($table_mode == 1) {
+	    $_ = ".fi"
+	} elsif ($table_mode == 2) {
+	    $_ = ".sp\n.PD"
+	} else {
+	    $_ = ".TE";
+	}
+    }
+}
+
+while (<>) {
+    # Skip menus
+    if (/^\@menu/) {
+	while (!/^\@end menu/) {
+	    $_ = <>;
+	}
+    }
+
+    # Skip TeX commands
+    next if (/^\\/);
+
+    # Convert tables
+#    if (/^\@table/) {
+#	print ".TS\ntab(\t);\nl l.\n";
+#	$_ = <>;
+#	do {
+#	    convert_line();
+#	
+#	    if (/^\@item (.*)/) {
+#		print "$1\t";
+#	    } else {
+#		print "$_";
+#	    }
+#
+#	    $_ = <>;
+#	} while (!/^\@end table/);
+#
+#	$_ = ".TE";
+#    }
+
+    # Convert all other line types
+    convert_line();
+
+    # Strip out any remaining texinfo commands
+    s/^@.*//;
+    s/\@[^ ]*{([^}]*)}/$1/g;
+
+    # Output the man commands.
+    if (!$example && $_ eq "\n") {
+ 	$newline++;
+    } else {
+	$newline=0;
+    }
+
+    ($newline < 2) && print;
+}
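Note on the inline style handling in texi2man.pl: the five substitutions map Texinfo commands to troff font escapes, bold for @strong, @code and @b, italic for @i and @var. A minimal Python sketch of the same mapping follows. It assumes non-nested commands, and the names `_FONT`, `_INLINE` and `inline_styles_to_troff` are hypothetical, chosen only for this illustration.

```python
import re

# Font letter for each Texinfo command handled by texi2man.pl.
_FONT = {"strong": "B", "code": "B", "b": "B", "i": "I", "var": "I"}
_INLINE = re.compile(r"@(strong|code|b|i|var)\{([^}]*)\}")

def inline_styles_to_troff(line):
    """Rewrite @cmd{text} as \\fX text \\fP for the commands listed above."""
    def repl(match):
        return "\\f" + _FONT[match.group(1)] + match.group(2) + "\\fP"
    return _INLINE.sub(repl, line)
```

Like the Perl original, this stops at the first closing brace, so nested commands such as @b{@i{x}} are not handled correctly.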
diff --git a/manual/tools/texi2text b/manual/tools/texi2text
new file mode 100755
index 0000000..fbf64f8
--- /dev/null
+++ b/manual/tools/texi2text
@@ -0,0 +1,59 @@
+#!/bin/awk -f
+
+# Skip over menus
+/^@menu/ {
+    in_menu=1;
+}
+
+/^@end menu/ {
+    in_menu = 0;
+    next
+}
+
+# Strip certain texinfo instructions completely
+/^@c([ \t]|$)/ {next}
+/^@cindex/ {next}
+/^@node/ {next}
+/^@picture/ {next}
+/^@example/ {next}
+/^@format/ {next}
+/^@cartouche/ {next}
+/^@group/ {next}
+/^@end/ {next}
+/^@enumerate/ {next}
+/^@itemize/ {next}
+/^@item/ {next}
+
+# Remove heading styles
+/^@section/ {$1="";}
+/^@subsection/ {$1="";}
+/^@subsubsection/ {$1="";}
+/^@chapter/ {$1="";}
+/^@numbered/ {$1="";}
+/^@numberedsec/ {$1="";}
+/^@numberedsubsec/ {$1="";}
+/^@numberedsubsubsec/ {$1="";}
+/^@unnumbered/ {$1="";}
+/^@unnumberedsec/ {$1="";}
+/^@unnumberedsubsec/ {$1="";}
+/^@unnumberedsubsubsec/ {$1="";}
+/^@appendix/ {$1="";}
+/^@appendixsec/ {$1="";}
+/^@appendixsubsec/ {$1="";}
+/^@appendixsubsubsec/ {$1="";}
+/^@chapheading/ {$1="";}
+/^@majorheading/ {$1="";}
+/^@heading/ {$1="";}
+/^@subheading/ {$1="";}
+/^@subsubheading/ {$1="";}
+
+# Remove inline text formatting commands
+/@[a-z][a-z]*\{/ {gsub("@[a-z][a-z]*\\\{","");gsub("\\\}","");}
+/@@/ {gsub("@@","@");}
+
+# Print what we've got left
+{
+    if (in_menu==0) {
+	print;
+    }
+}
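Note on the last two texi2text rules: on any line containing an inline command, the opening `@cmd{` is dropped and then every closing brace on that line is dropped too. After that, `@@` collapses to `@`. A minimal Python sketch of just that step, with the hypothetical name `strip_inline`:

```python
import re

_INLINE_OPEN = re.compile(r"@[a-z]+\{")

def strip_inline(line):
    """Drop @cmd{ ... } wrappers, then collapse @@ to @, as the awk rules do."""
    if _INLINE_OPEN.search(line):
        # Mirrors the awk action: remove the openers, then all closing braces.
        line = _INLINE_OPEN.sub("", line).replace("}", "")
    return line.replace("@@", "@")
```

Braces are only touched on lines that contain an inline command, which matches the awk pattern guarding the gsub calls.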
diff --git a/manual/tools/text2texi b/manual/tools/text2texi
new file mode 100755
index 0000000..1d78f31
--- /dev/null
+++ b/manual/tools/text2texi
@@ -0,0 +1,4 @@
+#!/bin/sed -f
+s/@/@@/g
+s/{/@{/g
+s/}/@}/g
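Note on text2texi: the order of the three sed commands is significant. `@` is doubled first, so the `@` characters added in front of the braces are not doubled afterwards. A minimal Python sketch of the same escaping follows. The inverse, `texi_to_text`, is a hypothetical helper added only to show the round trip and is not part of the patch.

```python
import re

def text_to_texi(text):
    """Escape @, { and } for Texinfo, in the same order as the sed script."""
    return text.replace("@", "@@").replace("{", "@{").replace("}", "@}")

def texi_to_text(text):
    """Hypothetical inverse: unescape @@, @{ and @} in one left-to-right pass."""
    return re.sub(r"@([@{}])", r"\1", text)
```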
diff --git a/manual/tools/update-nodes b/manual/tools/update-nodes
new file mode 100755
index 0000000..9a784f1
--- /dev/null
+++ b/manual/tools/update-nodes
@@ -0,0 +1,2 @@
+#!/bin/sh
+emacs -batch "$1" -l "$0.el"
diff --git a/manual/tools/update-nodes.el b/manual/tools/update-nodes.el
new file mode 100644
index 0000000..6e86d34
--- /dev/null
+++ b/manual/tools/update-nodes.el
@@ -0,0 +1,11 @@
+(load-library "texnfo-upd")
+(set-mark (point))
+
+;; 28/02/00 jkb
+;; The end-of-buffer command crashes xemacs, and the docs now tell us to
+;; use goto-char instead.
+;; (end-of-buffer)
+(goto-char (point-max))
+
+(texinfo-update-node)
+(save-some-buffers t)
diff --git a/manual/tools/xref_update.pl b/manual/tools/xref_update.pl
new file mode 100755
index 0000000..c3d350d
--- /dev/null
+++ b/manual/tools/xref_update.pl
@@ -0,0 +1,81 @@
+#!/usr/bin/perl -w
+
+#
+# 13/10/95 jkb
+#
+# Process a list of html files updating any unresolved cross references.
+# These may not have been made by texi2html as the reference was in a
+# different file.
+#
+# This assumes the reference will be of the form (on a line by itself)
+# "See section `node' in <CITE>section name</CITE>"
+# See macros.texi (fxref, fref) for more details. An assumption has
+# also been made that we're using a modified texi2html that writes
+# <!-- NODE:name --> and <!-- XREF:node --> lines (which ours does, due to
+# mods in texi2html and our macros.texi).
+#
+
+%node_table = ();
+@FILES = @ARGV;
+
+#
+# Loop around all files generating a map of node tables to URLs.
+#
+while (<ARGV>) {
+    if (/^<!-- NODE:/) {
+	s/^<!-- NODE://;
+	s/ -->\n//;
+	$node_name = $_;
+    }
+
+    if (/^<H[1-6]><A NAME=/) {
+	/NAME="([^"]*)/;
+	if ($node_name) {
+	    $node_table{$node_name} = "$ARGV#$1";
+	    $node_name="";
+	}
+    }
+}
+
+#
+# Loop around the files once more updating the cross references
+#
+while (<@FILES>) {
+    print "Processing $_\n";
+    $first_line=0;
+
+    $fname = $_;
+    open(IN, "< $fname") || die "Couldn't open $fname";
+    open(OUT, "> _$fname") || die "Couldn't create _$fname";
+    while (<IN>) {
+        if (/^<!-- XREF:/) {
+	    /XREF:(.*) -->$/;
+	    $node_name=$1;
+	    next;
+	}
+	if (/section `[^']*' in <CITE>/) {
+	    /([^`]*)`([^']*).*<CITE>(.*)<\/CITE>(.*)/;
+	    if (exists $node_table{$node_name}) {
+	        print "Resolved cross-reference \"$2\"\n";
+	        print OUT "$1<A HREF=\"$node_table{$node_name}\">$2</A>$4";
+#			  . " in " .
+#			  "<A HREF=\"$3_toc.html\">$2_toc.html</A>.";
+	        next;
+	    } else {
+                print "Couldn't resolve \"$2\"\n";
+	    }
+	}
+
+	s/\n$//;
+	if ($first_line) {
+	    print OUT "$_";
+       	} else {
+	    print OUT "\n$_";
+	}
+    }
+    print OUT "\n";
+    close(IN);
+    close(OUT);
+
+    rename("_$fname", $fname);
+}
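Note on the first pass of xref_update.pl: each `<!-- NODE:name -->` comment is paired with the next heading anchor, and the pair is recorded as `file#anchor`. A minimal Python sketch of that pass is below. It takes file contents as an in-memory mapping instead of reading ARGV, and the name `build_node_table` is hypothetical.

```python
import re

_HEADING = re.compile(r'<H[1-6]><A NAME="([^"]*)')

def build_node_table(files):
    """files maps a file name to its list of lines; returns node -> 'file#anchor'."""
    node_table = {}
    node_name = ""          # pending node, carried across files as in the Perl
    for fname, lines in files.items():
        for line in lines:
            if line.startswith("<!-- NODE:"):
                node_name = line[len("<!-- NODE:"):].rstrip("\n")
                if node_name.endswith(" -->"):
                    node_name = node_name[:-len(" -->")]
                continue
            match = _HEADING.match(line)
            if match and node_name:
                node_table[node_name] = fname + "#" + match.group(1)
                node_name = ""
    return node_table
```

The second pass then looks up each `<!-- XREF:node -->` name in this table and rewrites the following "See section" line into a link.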
diff --git a/manual/trace_dump.1.texi b/manual/trace_dump.1.texi
new file mode 100644
index 0000000..995a718
--- /dev/null
+++ b/manual/trace_dump.1.texi
@@ -0,0 +1,30 @@
+@cindex trace_dump: man page
+@unnumberedsec NAME
+
+trace_dump --- lists in a textual form the contents of a trace file.
+
+@unnumberedsec SYNOPSIS
+
+@code{trace_dump} @i{file}
+
+@unnumberedsec DESCRIPTION
+
+@code{trace_dump} extracts the contents of a trace file and lists it
+in textual format. It is primarily a debugging tool for use with
+io_lib, but can serve as a useful way to query the contents of a trace
+file outside of graphical programs such as Trev. The @i{file} may be of
+any supported trace format (and so this tool replaces the older
+@code{scf_dump} program).
+
+Each portion of the trace file is listed in its own block. The block
+names output are ``[Trace]'' (containing general information such as
+the number of samples), ``[Bases]'', ``[A_Trace]'', ``[C_Trace]'',
+``[G_Trace]'', ``[T_Trace]'' and ``[Info]'' (containing the free text
+comments).
+
+@unnumberedsec SEE ALSO
+
+_fxref(Formats-Scf, scf(4), formats)
+_fxref(Formats-Ztr, ztr(4), formats)
+@code{Read}(4)
+
diff --git a/manual/trace_print_menu.png b/manual/trace_print_menu.png
new file mode 100644
index 0000000..c1e2858
Binary files /dev/null and b/manual/trace_print_menu.png differ
diff --git a/manual/trace_print_menu.small.png b/manual/trace_print_menu.small.png
new file mode 100644
index 0000000..68f9ed6
Binary files /dev/null and b/manual/trace_print_menu.small.png differ
diff --git a/manual/trace_print_page_dialogue.png b/manual/trace_print_page_dialogue.png
new file mode 100644
index 0000000..279ca0c
Binary files /dev/null and b/manual/trace_print_page_dialogue.png differ
diff --git a/manual/trace_print_trace1.png b/manual/trace_print_trace1.png
new file mode 100644
index 0000000..9be35bf
Binary files /dev/null and b/manual/trace_print_trace1.png differ
diff --git a/manual/trace_print_trace_dialogue.png b/manual/trace_print_trace_dialogue.png
new file mode 100644
index 0000000..cb59aa2
Binary files /dev/null and b/manual/trace_print_trace_dialogue.png differ
diff --git a/manual/tracediff.1.texi b/manual/tracediff.1.texi
new file mode 100644
index 0000000..381036b
--- /dev/null
+++ b/manual/tracediff.1.texi
@@ -0,0 +1,125 @@
+@cindex tracediff: man page
+@unnumberedsec NAME
+
+tracediff --- Compare two trace files for differences to detect mutations.
+
+@unnumberedsec SYNOPSIS
+
+@code{tracediff}
+        [@code{-a} @i{peak-alignment-deviation}]
+        [@code{-c} @i{complement-reverse-strand-tags}]
+        [@code{-d} @i{output-difference-traces}]
+        [@code{-f} @i{file-of-filenames}]
+        [@code{-n} @i{analysis-window-length}]
+        [@code{-q} @i{quiet-mode}]
+        [@code{-s} @i{analysis-sensitivity}]
+        [@code{-t} @i{noise-threshold}]
+        [@code{-w} @i{maximum-peak-width}]
+        @i{experiment_file(s)}
+
+@unnumberedsec DESCRIPTION
+
+@code{tracediff} compares a pair of traces to look for mutations. It aligns
+the traces, and then subtracts one trace from the other to produce a "difference
+trace". This difference trace is analysed to distinguish between mutations and
+incorrect base calls. @cite{Bonfield,JK, Rada,C and Staden,R Automated detection
+of point mutations using fluorescent sequence trace subtraction. Nucl. Acids Res. 26, 
+3404-3409 (1998)}.
+
+For an overview and more details about mutation detection see
+_fref(Mutation-Detection-Introduction, Search for Mutations, mutations).
+
+To detect mutations, compute the mean and standard deviation of the
+difference trace, and then locate bases associated with a significant
+pair of peaks, one positive, the other negative.
+For example a base change from an @code{A} to @code{T} will cause a positive
+@code{A} trace difference and a negative @code{T} trace difference. If both
+the positive and negative differences are more than @i{num_sd} multiples of
+the standard deviation from the mean, then this is flagged as a potential
+mutation. Mutations are written to the experiment file as @code{MUTA} tags.
+
+The @i{experiment_file} contains records specifying the input trace, the reference
+trace and the strand direction. It also contains the clipping points for the input
+trace. A minimal experiment file for tracediff might look like this:
+
+LN   27_17f.ztr
+PR   1
+QL   10
+QR   839
+WT   C:/my_dataset/09_5f
+
+Where the @code{LN} record specifies the name of the input trace, the @code{PR}
+record specifies the strand direction 1=forward, 2=reverse, the @code{QL} and
+@code{QR} records specify the input trace left and right clip points respectively,
+and the @code{WT} record specifies the wildtype trace. You can also optionally
+specify clip points for the wildtype trace as @code{WL} and @code{WR} records.
+Pregap4 generates suitable experiment files automatically, so these would not
+normally be created manually.
+
+@unnumberedsec OPTIONS
+
+@table @asis
+@item @code{-a} @i{peak-alignment-deviation}
+The centres of each individual half-peak of a double peak above and below
+the baseline must align reasonably well for them to be considered to be
+a real mutation. The amount of half-peak alignment deviation allowable is
+specified in bases by this parameter, usually as a fraction of one base.
+
+@item @code{-c} @i{complement-reverse-strand-tags}
+After mutation detection and after readings have been assembled into a GAP4
+database, GAP4 displays both forward and reverse readings in a single direction
+in the contig editor. This makes it much easier to compare sequences and traces
+in both directions simultaneously. When the corresponding traces are displayed,
+any reverse strand traces are complemented automatically such that the bases are
+interchanged. In this case, the original mutation tag generated by tracediff will
+then be of the wrong sense, so if checked, this option complements the tag base
+labels to match the complemented trace displayed by GAP4.
+
+@item @code{-d} @i{output-difference-traces}
+After trace difference analysis, the generated traces are normally discarded and not
+written to disk. Checking this option lets you save the trace difference files to 
+the same directory as the original traces. The .ZTR trace format is used for this
+purpose. The original filename is retained and a "_diff.ztr" suffix is appended.
+
+@item @code{-f} @i{file-of-filenames}
+Specifies the filename of a simple text file containing a list of experiment
+files to be processed by tracediff.
+
+@item @code{-n} @i{analysis-window-length}
+Analysis of the trace difference is done over a local region to counter
+the effects of non-stationarity in the trace signal. The analysis region is
+defined by a short window whose length is specified in bases. The window is
+asymmetric in that it's located to the left of the base it's positioned on.
+This avoids measurement problems when mutations are encountered. The window
+size is a tradeoff. If it's too big, low level mutations may be missed. If
+it's too small, there may be insufficient data to give unbiased measurements
+leading to many false positives.
+
+@item @code{-q} @i{quiet-mode}
+If specified, no information is output to stdout. The mutations will still
+be written to the experiment file as tags.
+
+@item @code{-s} @i{analysis-sensitivity}
+This threshold is used to determine when an above/below baseline double
+peak in the difference trace is considered to be a mutation. It is specified
+in standard deviations from the mean over the analysis window. The higher the
+value, the more stringent the test. This value is reduced dynamically
+by the algorithm in the presence of mutations since small mutations near
+larger ones can often be missed with a uniform sensitivity setting. It's
+likely that some experimentation with this parameter will be required for
+optimal mutation detection in your data.
+
+@item @code{-t} @i{noise-threshold}
+This threshold is used to filter out low level noise during the analysis
+phase. It is specified as a percentage of the maximum peak-to-peak trace
+difference value. A high threshold will lead to fewer false positives but
+you run the additional risk of missing low level mutations.
+
+@item @code{-w} @i{maximum-peak-width}
+During analysis, the width of each peak is measured to avoid problems caused
+by gel artifacts. These often appear as broad peaks that overlay many bases.
+The maximum peak width is specified in bases. A lower value will lead to
+fewer false positives, but you run the additional risk of missing smeared
+mutations towards the end of a trace.
+
+@end table
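Note on the detection rule described in the tracediff man page: a base is flagged when the difference trace shows both a positive and a negative peak that lie more than a chosen number of standard deviations from the mean. The Python sketch below illustrates only that rule and is not tracediff's implementation. It assumes one difference amplitude per channel per base, computes the statistics over the whole trace instead of a sliding window, and uses a hypothetical default of 3.0 for `num_sd`.

```python
from statistics import mean, pstdev

def flag_mutations(diff, num_sd=3.0):
    """diff: one dict per base mapping 'A', 'C', 'G', 'T' to a difference amplitude.

    Returns (position, positive_channel, negative_channel) for each flagged base.
    """
    values = [v for base in diff for v in base.values()]
    mu = mean(values)
    sd = pstdev(values)
    flagged = []
    if sd == 0:
        return flagged
    for pos, base in enumerate(diff):
        pos_ch = max(base, key=base.get)   # channel with the largest difference
        neg_ch = min(base, key=base.get)   # channel with the smallest difference
        if base[pos_ch] - mu > num_sd * sd and mu - base[neg_ch] > num_sd * sd:
            flagged.append((pos, pos_ch, neg_ch))
    return flagged
```

A base with only a positive peak is not flagged, which is the point of requiring the pair. The windowing, noise threshold and peak width checks described under OPTIONS are left out of this sketch.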
diff --git a/manual/trev-t.texi b/manual/trev-t.texi
new file mode 100644
index 0000000..985f6b4
--- /dev/null
+++ b/manual/trev-t.texi
@@ -0,0 +1,408 @@
+@node Trev
+@section Introduction
+@cindex Trev: introduction
+
+@cindex Trev
+@menu
+* Trev-Opening::                Opening trace files
+* Trev-View::                   Viewing the trace
+* Trev-Searching::              Searching
+* Trev-Information::            Information
+* Trev-Editing::                Editing
+* Trev-Save::                   Saving a trace file
+* Trev-Files::                  Processing multiple files
+* Trev-Print::                  Printing a trace
+* Trev-Quit::                   Quitting Trev
+@ifset standalone
+* Index::			Index
+@end ifset
+@end menu
+
+_include(trev_mini-t.texi)
+
+_split()
+@node Trev-Opening
+@section Opening trace files
+@cindex Trev: opening trace files
+@cindex Opening trace files: Trev
+
+Trace files can be opened either on the command line or from within Trev.  
+In both cases it is possible to open several traces at once. In this case trev
+will add Next File, Previous File and Goto File buttons to allow quick
+navigation between traces.
+
+On the command line, this is simply done by specifying several files. With the
+"Open" dialogue from within trev multiple files may be selected by dragging
+with the left mouse button or using shift+left button and control+left button
+to extend regions or to toggle loading of individual files.
+
+@node Trev-Opening-Command
+@subsection Opening a trace file from the command line
+@cindex Command line arguments: Trev
+
+@table @code
+usage: trev [-@{ABI,ALF,EXP,SCF,PLN,Any@}] [-edits @var{value}]
+[-editscf] [-xmag @var{value}] [-ymag @var{value}] [-restrict]
+[@var{tracefilename} ...]
+
+@sp 1
+@item -ABI, -ALF, -EXP, -SCF, -PLN, -Any
+Optional. Defaults to Any. These define the possible input trace formats
+available. Currently these are 'ABI', 'ALF', experiment
+(_fpref(Formats-Exp, Experiment File, formats)), 'SCF'
+(_fpref(Formats-Scf, scf, formats)), plain ASCII text or 'any' in which case
+the program attempts to establish the file format from information
+contained within the trace file. 
+
+@sp 1
+@item -edits @var{value}
+Optional. Defaults to 1. If @var{value} is 1, the trace sequence can be
+edited. If @var{value} is 0, no edit line is displayed in Trev and the
+sequence may not be edited.
+
+@sp 1
+@item -editscf
+Optional. By default writing to SCF is disabled for safety and reasons of
+preference (we feel that all edits should be contained within an associated
+Experiment File thus leaving the original trace file intact). Specifying
+@code{-editscf} allows writing to SCF files.
+
+@sp 1
+@item -pregap_mode
+Optional. Only used by Pregap4. This adds a Reject button to Trev
+and disables certain file operations. This argument should only be used by
+programs that run Trev as subprocesses for processing batches of files.
+
+@sp 1
+@item -restrict
+Optional. Restricts the use of the trace editor to a single file by
+disabling the ability to open another file from within Trev. The main
+use of this option is for calling Trev from within scripts.
+
+@sp 1
+@item -xmag @var{value}
+Optional. Defaults to 150. Specifies the magnification along the X axis
+of the trace. Larger values represent higher magnifications.
+
+@sp 1
+@item -ymag @var{value}
+Optional. Defaults to 10. Specifies the magnification along the Y axis
+of the trace. The value should be between 10 and 100 with 10 showing all
+the trace and 100 being the largest magnification.
+@end table
+
+@node Trev-Opening-Internal
+@subsection Opening a trace file from within Trev
+@cindex Filebrowser: Trev
+
+To open a trace file select the "Open..." command from the File menu.
+This brings up a file browser from where the trace name can be selected.
+_fxref(File Browser, File Browser, filebrowser) The format of the trace
+file should be selected from the row of Format buttons. Currently these
+are 'ABI', 'ALF', Experiment File (_fpref(Formats-Exp, Experiment File,
+formats)), 'SCF' (_fpref(Formats-Scf, SCF File, formats)), plain ASCII
+text or 'any' in which case the program attempts to establish the file format
+from information contained within the trace file. Opening an experiment
+file opens the trace file named within the experiment file. Double
+clicking on the trace name will open this trace file.
+
+If a trace file is already open, it is closed before the new one is
+opened. If the previous trace has been edited, but not saved, a dialogue
+box is displayed, asking if you wish to save the file before loading a
+new file. Selecting "Yes" will automatically save the file to its
+current filename. Selecting "No" will discard any changes that have been
+made.
+
+_split()
+@node Trev-View
+@section Viewing the trace
+@cindex Trev: scaling
+@cindex Scaling: Trev
+@cindex Trev: fonts
+@cindex Fonts, within trev
+
+The trace can be scrolled using the scrollbar directly beneath the
+menubar. The trace can be magnified both in the vertical and horizontal
+directions using the two scales to the left of the trace.
+
+The base numbers, original sequence, edited sequence, confidence values
+and the trace can each be switched on or off
+by using the check buttons in the "Display"
+option of the View menu.
+
+The font for the original and edited sequence can be chosen from three sizes,
+selectable by using the Font submenu of the View menu.
+
+The figure below shows the bases, edited bases, a histogram of the confidence
+values, the traces, and the Information Window which can be switched on
+from the View Menu.
+
+_lpicture(trev_conf_trace,6in)
+
+_split()
+@node Trev-Searching
+@subsection Searching
+@cindex Trev: searching
+@cindex Searching: Trev
+
+        Selecting the "Search..." command in the View menu brings up a
+window into which a text string can be entered. Pressing the "Next"
+button positions the cursor at the start of the next piece of sequence
+that matches the string specified in the text box. Pressing "Previous"
+finds the previous match. The search is case insensitive.
+
+_split()
+ at node Trev-Information
+ at subsection Information
+ at cindex Trev: information
+ at cindex Information: Trev
+
+	The comments from the SCF file of the trace can be displayed
+using the "Information" option in the View menu.
+
+_split()
+ at node Trev-Editing
+ at section Editing
+ at cindex Trev: editing
+ at cindex Editing: Trev
+
+ at node Trev-Cutoffs
+ at subsection Setting the left and right cutoffs
+ at cindex Trev: setting cutoffs
+ at cindex Cutoff data: Trev
+ at cindex Vectors, in Trev
+ at cindex Trev: vector sequence
+
+	Poor data at the left and right ends of the trace can be marked
+using the "Left Quality" and "Right Quality" options in the Edit menu.
+Alternatively, a keyboard shortcut for editing the cutoff is to press
+ at code{Control L} or @code{Control R} to edit the left or right cutoff
+respectively.  To select the left cutoff, choose the "Left Quality" option
+from the menu. Then click the left mouse button at the required position in the 
+trace display.
+The region from the start of the sequence to this position will
+be highlighted in grey. To select the right hand cutoff, choose the
+"Right Quality" option in the Edit menu and click the required position in the trace
+display. The region between
+this position and the end of the sequence will be highlighted. To prevent
+accidentally changing the cutoffs once these have been selected, choose the
+"Sequence" option in the Edit menu.
+	
+If vector sequence has been marked, trev will also display it in a similar
+fashion to the quality cutoffs, except in a peach colour. These cutoffs can be
+changed by selecting "Left Vector" and "Right Vector" in the same fashion as
+editing the quality cutoffs. Where quality and vector cutoffs coincide,
+trev draws the region in alternating peach and grey stripes.
+
+ at node Trev-Sequence
+ at subsection Editing the sequence
+ at cindex Trev: editing the sequence
+ at cindex Editing the sequence: Trev
+
+	If the ability to edit has not been disabled, there will be two
+windows showing the trace sequence. The original sequence is displayed
+in the upper window. The window below this, which contains the blue cursor,
+is the editing window. To edit this sequence, select the "Sequence"
+option in the Edit menu. The editing cursor is positioned by clicking
+with the left mouse button within the display. Bases are deleted to the
+left of the cursor using the delete key of the keyboard. Additional
+bases are inserted to the left of the cursor. Only A, a, C, c, G, g, T,
+and t are allowed. It is recommended that edits are entered in
+lower case to distinguish them from the original bases.
+
+ at node Trev-Undo
+ at subsection Undoing clip edits
+ at cindex Trev: undo
+ at cindex Undo clip edits, trev
+
+It is easy to forget which editing mode you are in and adjust
+a quality or vector clip point by mistake. Trev keeps track of all clip edits,
+so these may be undone by selecting "Undo Clipping" from the Edit
+menu. This will remove the last clip edit. It is not yet possible to undo
+sequence edits.
+
+_split()
+ at node Trev-Save
+ at section Saving a trace file
+ at cindex Trev: saving a trace file
+ at cindex Saving: Trev
+
+To save a trace file to a different file name or format choose the "Save
+As..."  command from the File menu. Select the format the file is to be
+saved in using the Format buttons. The output formats are CTF, SCF, ZTR,
+experiment and plain text. 
+Type a new name into the Selection box or
+select an existing name from the list of file names. Experiment format
+traces can be saved to their existing name using the "Save" option in
+the File menu.
+
+_split()
+ at node Trev-Files
+ at section Processing multiple files
+ at cindex Multiple files in Trev
+ at cindex Previous button, trev
+ at cindex Next button, trev
+ at cindex Reject button, trev
+ at cindex Goto file button, trev
+
+When several trace files are specified on the command line to Trev, it will
+add Previous File, Next File, and Goto File buttons. The Previous File and
+Next File simply step through the specified trace files. The Goto File button
+will bring up a scrollable list of all the trace files specified. Clicking on
+any trace filename in this list will jump to that file.
+
+If Trev was brought up from Pregap4, or the @code{-pregap_mode} command line
+switch was used, Trev will also display a Reject button. This may be used to
+indicate to Pregap4 that the trace file shown is not worthy of any clipping at
+all and should be sent to the Pregap4 "failed" file.
+
+_split()
+ at node Trev-Print
+ at section Printing a trace
+ at cindex Trev: printing a trace
+
+The Print option is available via the File menu, as shown below.
+
+_picture(trace_print_menu,6in)
+
+It produces a PostScript file which you must then send to the printer
+yourself.
+
+All sizes given in the dialogues explained below should be in
+PostScript points (72pt = 1inch).
+
+Defaults and available options are specified in the file
+tk_utilsrc. These can be changed by copying the relevant line from
+tk_utilsrc into a file called .tk_utilsrc in your home or working
+directory, and then altering the settings as desired.
+
+Note that it is not yet possible to include the histogram of confidence values
+in the PostScript output.
+
+ at node Trev-Print-PageOptions
+ at subsection Page options
+ at cindex Trev: page options
+
+_picture(trace_print_page_dialogue,2.34167in)
+
+ at node Trev-Print-PageOptions-Paper
+ at subsubsection Paper options
+ at cindex Trev: paper options
+
+Currently available page sizes:
+ at table @var
+ at item A4
+(842 x 595)
+ at item A3
+(1191 x 842)
+ at item US Letter
+(792 x 612)
+ at end table
+
+Please note that the page size and orientation options do not
+determine the paper format that your printer will use. This must be
+set externally to trev.
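+
+As a rough illustration of the sizes listed above, the following Python
+sketch converts PostScript points to inches and millimetres. It is not part
+of trev: the names PAGE_SIZES_PT, points_to_inches and points_to_mm are
+invented for this example, and only the three page sizes and the rule
+72pt = 1 inch are taken from this manual.

```python
# Hypothetical helper, not part of trev: unit conversions for the page
# sizes quoted in the manual (all values in PostScript points).
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4

PAGE_SIZES_PT = {
    "A4": (842, 595),
    "A3": (1191, 842),
    "US Letter": (792, 612),
}


def points_to_inches(points):
    """Convert PostScript points to inches (72pt = 1 inch)."""
    return points / POINTS_PER_INCH


def points_to_mm(points):
    """Convert PostScript points to millimetres."""
    return points_to_inches(points) * MM_PER_INCH


if __name__ == "__main__":
    for name, (long_side, short_side) in PAGE_SIZES_PT.items():
        print(name,
              round(points_to_mm(long_side)), "x",
              round(points_to_mm(short_side)), "mm")
```

+For example, the 842 point side of A4 comes out at roughly 297 mm, and the
+792 point side of US Letter at exactly 11 inches.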
+
+ at node Trev-Print-PageOptions-Panels
+ at subsubsection Panels
+ at cindex Trev: print panels
+
+Traces are printed width-ways across the page. When the right-hand
+margin of the page is reached, printing continues below the current
+section and from the left-hand side. A 'panel' is one page-width's
+worth of trace (minus margins).
+
+The trace and the sequence and sequence number information are printed
+entirely within the given height of the panel, and the separation
+gives the amount of space that is left between panels. Thus they,
+together with the page height and top and bottom margins, determine
+how many panels will be printed per page.
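+
+The relationship between these settings can be sketched in Python as follows.
+This is an estimate under stated assumptions rather than trev's own
+calculation: the function name panels_per_page is invented here, and the
+sketch assumes that the separation is only needed between panels, not after
+the last one. All values are in PostScript points.

```python
# Hypothetical estimate, not taken from the trev sources.
def panels_per_page(page_height, top_margin, bottom_margin,
                    panel_height, separation):
    """Estimate how many panels fit on one page.

    Assumes the separation is only required between consecutive panels.
    """
    if panel_height <= 0 or separation < 0:
        raise ValueError("panel_height must be > 0 and separation >= 0")
    usable = page_height - top_margin - bottom_margin
    if usable < panel_height:
        return 0
    # The first panel needs panel_height; each further panel needs
    # an extra separation + panel_height.
    return 1 + int((usable - panel_height) // (panel_height + separation))
```

+With an A4 page height of 842, margins of 36 at top and bottom, a panel
+height of 150 and a separation of 10, this sketch gives four panels per page.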
+
+ at node Trev-Print-PageOptions-Fonts
+ at subsubsection Fonts
+ at cindex Trev: print fonts
+
+All fonts listed should be available to most PostScript printers. Most 
+printers will default to Courier if a selected font is not recognised.
+
+ at node Trev-Print-TraceOptions
+ at subsection Trace options
+ at cindex Trev: trace print options
+
+_picture(trace_print_trace_dialogue,4.83333in)
+
+ at node Trev-Print-TraceOptions-Title
+ at subsubsection Title
+ at cindex Trev: trace print title
+
+The title is printed in the top left hand corner of every page. The
+default is the name of the trace file.
+
+ at node Trev-Print-TraceOptions-Colour
+ at subsubsection Line width and colour
+ at cindex Trev: trace print colour and line width
+
+The defaults are those used by the trev display.
+The colours shown in the selection dialogue may not correspond exactly 
+to those printed, depending on the capabilities of your printer.
+Different colours will usually be printed using grey-scales on black
+and white printers.
+
+ at node Trev-Print-TraceOptions-Dash
+ at subsubsection Dash pattern
+ at cindex Trev: trace print dash pattern
+
+Dash pattern is in PostScript dash format:
+
+	dash_1 gap_1... dash_n gap_n offset
+
+'dash_n' and 'gap_n' are the lengths of dashes and the gaps between
+them. The dash pattern starts at dash_1, continues to gap_n, then
+starts again at dash_1, until the whole line has been drawn. If n = 0,
+i.e. no values are given for 'dash' and 'gap', the result is a normal
+unbroken line. Offset must be given, and is the distance into the dash
+pattern at which the pattern should be started. The dash pattern is not
+demonstrated by the example line on the ps_trace_setup dialogue.
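+
+To make the format concrete, here is a small Python sketch that splits such
+a string into its dash and gap lengths and its offset, and renders the
+equivalent call to the PostScript setdash operator. The function names are
+invented for this example, and the check that dashes and gaps come in pairs
+follows the description above rather than anything known about trev's own
+parsing.

```python
# Hypothetical helpers, not part of trev.
def parse_dash_pattern(text):
    """Split 'dash_1 gap_1 ... dash_n gap_n offset' into (pattern, offset)."""
    values = [float(token) for token in text.split()]
    if not values:
        raise ValueError("the offset must always be given")
    *pattern, offset = values
    if len(pattern) % 2 != 0:
        # Assumption based on the manual text: dashes and gaps are paired.
        raise ValueError("dash and gap lengths must come in pairs")
    return pattern, offset


def to_setdash(text):
    """Render the pattern as a PostScript 'array offset setdash' call."""
    pattern, offset = parse_dash_pattern(text)
    body = " ".join(f"{value:g}" for value in pattern)
    return f"[{body}] {offset:g} setdash"
```

+Under these assumptions the string "4 2 0" corresponds to a repeating
+4 point dash and 2 point gap, and a lone "0" gives an unbroken line.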
+
+ at node Trev-Print-TraceOptions-Bases
+ at subsubsection Print bases
+ at cindex Trev: trace print bases
+
+Allows a subsection of the trace to be printed.
+
+The 'Visible' button sets the region to that currently displayed in
+the main trev window. If the display is altered, the print base
+settings will not change unless 'Visible' is pressed again. The whole
+sequence is printed if the start position is greater than the end
+position. The OK button will not work if the start or end positions
+given are outside the range of the sequence.
+
+ at node Trev-Print-TraceOptions-Magnification
+ at subsubsection Print magnification
+ at cindex Trev: trace print magnification
+
+The X and Y scales are taken from the trev display, and cannot
+be set independently for PostScript output.
+
+ at node Trev-Print-Example
+ at subsection Example
+ at cindex Trev: trace print example
+
+The segment of output displayed below indicates the effects
+of the settings given in the example dialogue screendumps shown above.
+NB: the page has been clipped to save space. The section shown is the
+top part of an A4 page.
+
+_picture(trace_print_trace1,6in)
+
+_split()
+ at node Trev-Quit
+ at section Quitting 
+ at cindex Trev: quit
+ at cindex Quit: Trev
+
+To exit Trev, select the "Exit" command from the File menu. If the
+sequence has been edited but not saved, a dialogue box is displayed,
+asking if you wish to save the file before quitting. Selecting "Yes"
+will automatically save the file to its current filename. Selecting
+"No" will discard any changes that have been made.
diff --git a/manual/trev.texi b/manual/trev.texi
new file mode 100644
index 0000000..61621c8
--- /dev/null
+++ b/manual/trev.texi
@@ -0,0 +1,53 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename trev.info
+ at settitle Trev
+ at c @setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage on
+ at c %**end of header
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title Trev
+ at subtitle 
+ at author 
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-trev
+ at end ifinfo
+
+ at c @tex
+ at c \global\pageno=-10
+ at c @end tex
+ at c 
+ at c @unnumbered Preface
+ at c PREFACE TEXT
+ at c 
+ at c @tex
+ at c \vfill \eject
+ at c \global\pageno=1
+ at c @end tex
+
+ at raisesections
+_include(trev-t.texi)
+
+_split()
+ at node Index
+ at unnumberedsec Index
+ at printindex cp
+ at lowersections
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/trev_conf_trace.png b/manual/trev_conf_trace.png
new file mode 100644
index 0000000..dd04e4a
Binary files /dev/null and b/manual/trev_conf_trace.png differ
diff --git a/manual/trev_conf_trace.small.png b/manual/trev_conf_trace.small.png
new file mode 100644
index 0000000..ac3199f
Binary files /dev/null and b/manual/trev_conf_trace.small.png differ
diff --git a/manual/trev_mini-t.texi b/manual/trev_mini-t.texi
new file mode 100644
index 0000000..d771b3f
--- /dev/null
+++ b/manual/trev_mini-t.texi
@@ -0,0 +1,60 @@
+
+For some types of sequencing project it is convenient to view and edit the
+chromatogram data prior to assembly into a gap4 database
+(_fpref(Gap4-Introduction, Gap4 Introduction, gap4)),
+and this is the function of the program trev.
+
+Trev displays the original trace data, its base calls and confidence
+values, and it allows the sequence of the
+trace to be edited and the left and right cutoffs to be defined. 
+Several file formats can be read in addition to our own Experiment Files
+(_fpref(Formats-Exp, Experiment File, formats)),  and 'SCF' files
+(_fpref(Formats-Scf, scf, formats)). 
+Any edits made are normally saved to Experiment files, not to the 
+chromatogram files which we regard as archival data. 
+
+A typical display from trev is shown below. It includes the trace data, the
+original sequence, the edited sequence, the
+menu bar, and the name of the sequence being edited. The left cutoff region 
+is shown shaded. 
+
+_picture(trev_pic,6in)
+
+
+The trace can be scrolled using the scrollbar directly beneath the
+menubar. The trace can be magnified in the vertical and horizontal
+directions using the scale bars to the left of the trace.
+
+The base numbers, original sequence, edited sequence, confidence values
+and the trace can each be switched on or off, and the font for the
+original and edited sequence is selectable.
+
+ at page
+The figure below shows the bases, edited bases, a histogram of the confidence
+values, the traces, and the Information Window which can be switched on
+from the View Menu.
+
+_lpicture(trev_conf_trace,6in)
+
+Trev uses ``io_lib'' for handling the various sequencing instrument
+file formats. This means it has support for ABI, MegaBace (when saved
+in ABI format), SCF (used by LiCor and some other manufacturers), ZTR
+and SFF (454).
+
+The above pictures all come from instruments using the Sanger
+sequencing method. More recently, support has been added for
+pyrosequencing methods (as used by 454 Life Sciences amongst
+others). An example of this is below.
+
+_lpicture(trev_pyro_trace,6in)
+
+Trev can be used to produce PostScript
+files of the traces so that they can be printed. The colours, line
+widths, etc. are configurable. An example is shown in the figure below.
+ 
+_picture(trace_print_trace1,6in)
+
+Note that we strongly 
+recommend that readings are not edited prior to assembly as it is far better
+to edit them when their alignment with other readings can be seen.
+
diff --git a/manual/trev_pic.png b/manual/trev_pic.png
new file mode 100644
index 0000000..9c48c37
Binary files /dev/null and b/manual/trev_pic.png differ
diff --git a/manual/trev_pyro_trace.png b/manual/trev_pyro_trace.png
new file mode 100644
index 0000000..d3a5ac8
Binary files /dev/null and b/manual/trev_pyro_trace.png differ
diff --git a/manual/vector_clip-t.texi b/manual/vector_clip-t.texi
new file mode 100644
index 0000000..e088984
--- /dev/null
+++ b/manual/vector_clip-t.texi
@@ -0,0 +1,955 @@
+_split()
+ at node Vector_Clip
+ at chapter Screening Against Vector Sequences
+ at cindex Vector_Clip
+
+ at menu
+ at ifset html
+* Vector_Clip-Introduction::   Introduction
+ at end ifset
+* Vector_Clip-Algorithms:: Algorithms
+* Vector_Clip-Options::    Command line options
+* Vector_Clip-Parameters::    Command line parameters
+* Vector_Clip-Errors::     Error codes
+* Vector_Clip-Examples::   Examples
+* Vector_Clip-Vector_Primer-Files:: Vector_Primer Files
+* Vector_Clip-Vector_Primer-File-Notes:: Vector_Primer Notes
+* Vector_Clip-Sites::      Defining the cloning and primer sites for vector_clip
+* Vector_Clip-Cloning Site::   Finding the cloning site
+ at end menu
+
+ at ifset html
+_split()
+ at node Vector_Clip-Introduction
+ at section Introduction
+ at end ifset
+ at cindex Screening against vector sequence
+ at cindex Vector sequence: screening
+
+For most assembly engines to work well it is necessary to present them
+with data that is of good quality and contains only the target sequence. One
+pre-assembly task is to locate and mark all segments of readings which contain
+vectors used in their production. In our package this task is performed by
+vector_clip which compares batches of readings against vector sequences.
+Sequence readings are stored in experiment file format 
+(_fpref(Formats-Exp, Experiment File, formats))
+and, for the majority of projects,
+each experiment file should contain the data
+required by vector_clip: the file names of the vectors to screen against,
+and, for the sequencing vector, the position of the cloning and primer sites.
+See 
+_oref(Vector_Clip-Vector_Primer-Files, Vector_Primer files)
+for an alternative and simpler method of defining vector data for vector_clip.
+The program pregap4 
+(_fpref(Pregap4-Introduction,Pregap4, pregap4))
+contains modules for creating experiment files
+from trace files, and for adding data about the vectors used. 
+When
+vector_clip runs it adds records to the reading's experiment file to 
+denote the start and end of any segments which are found to match the vectors. 
+
+For conventional sequencing projects there are two types of vector for which
+readings will need to be screened: the sequencing vector, and, for cases
+where, say, whole cosmids or BACs have been shotgunned, the cloning vector.  
+These two
+screening tasks are different.  When screening for the sequencing vector we
+may expect to find data to exclude, both from the primer region and, when the
+insert is short, from the other side of the cloning site. It is also a wise
+precaution to check for rearrangements of the sequencing vector.  When
+screening out cosmid vector we may find that either the 5' end, or the 3' end,
+or the whole of the sequence is vector. Also for the cloning vector search we
+need to compare both strands of the sequence.
+
+In order to filter out readings that contain the sequences
+of contaminant DNA such as E. coli, a separate program screen_seq should
+be used (_fpref(Screen_seq, Screening for known possible contaminant
+sequences, screening)).
+
+A further type of search is required for a new method that is being
+developed at MRC HGMP, Hinxton, UK.  This new method (M. Starkey,
+personal communication) is an application of a technology described as
+"molecular indexing" @cite{Unrau, P. and Deugau, K.V. (1994) Non-cloning
+amplification of specific DNA fragments from whole genomic DNA digests
+using DNA indexers. Gene 145, 163-169}. It produces sequences with a
+primer at their 3' ends which need to be found and removed.
+
+Some groups are using transposons to produce random start points for
+sequencing reactions, and vector_clip contains an experimental search
+procedure for dealing with the data generated by such methods.
+
+Vector_clip is usually run as part of the pregap4 process
+(_fpref(Pregap4-Introduction,Pregap4, pregap4))
+and will usually be called three times: the
+first to locate and mark the sequencing vector; next to check for vector
+rearrangements; and finally to locate and mark cosmid vector segments.
+
+Vector_clip operates on batches of readings using files of file names:
+one input file and two output files - one for the names of the readings
+that pass and one for those that fail. The program also modifies the
+reading files.
+
+In earlier versions of vector_clip all the information
+needed about the vector (i.e.  its name, location on disk, the cloning
+and primer sites used) for each reading was expected to be stored in the reading's
+experiment file (_fxref(Formats-Exp, Experiment File, formats)) but, as
+is explained in the next paragraph, the 
+newest version employs an alternative method for providing data about
+sequencing vectors.
+For notes
+on defining the cloning and primer sites, see 
+_oref(Vector_Clip-Sites, Defining the Positions of Cloning and Primer Sites for Vector_Clip).
+
+The 1999.0 release of the package contained an experimental new method of
+providing vector_clip with data about the vectors to search for. Using feedback
+from the trial period we have simplified the method and improved the algorithm.
+
+The new method
+uses files containing, not the complete vector sequences,
+but the segments of sequence between the primers and the cloning
+site. These files are termed "vector_primer" files (see
+_oref(Vector_Clip-Vector_Primer-Files, Vector_Primer files)),
+and the vector_primer
+mode of vector_clip uses the data in these files to search for the vectors.
+
+The vector_primer file can contain the data for up to (at present) 100 
+vector and primer combinations, although it would not be efficient to
+compare each reading against an unnecessarily large number of records.
+When vector_clip finds a match to one of the vectors defined in the 
+vector_primer file it not only marks the matching segment in the reading,
+but also adds the name of the file containing the vector sequence, and the
+primer type, to the reading's experiment file. The vector file name can then
+be used by vector_clip in its search for vector rearrangements, and the
+primer type can be used by gap4 in its analysis of read pairs (Note, however,
+that for read pair analysis, 
+gap4 still needs to know which readings came from the same template,
+so that data must be added to the Experiment file in some other way).
+
+A big advantage of the vector_primer file method is that it simplifies the
+task of providing vector_clip with data. In addition, the task of creating the
+vector_primer files is simplified in that the -V option in vector_clip
+removes the necessity for the records in the vector_primer file to
+contain precisely the sequence between the primer and the cloning site
+(see _oref(Vector_Clip-Vector_Primer-File-Notes, Vector_Primer File Notes)).
+
+Vector_primer files are also used by the search for transposon data.
+
+If setting up these programs seems a little
+daunting, it is important to realise that the majority of
+users need not concern themselves
+with the details of vector_clip and the creation of experiment files for
+their readings; or if they do, these configuration operations are
+only performed
+once per project, and are made relatively easy by the use of pregap4.
+
+
+_split()
+ at node Vector_Clip-Algorithms
+ at section Algorithms
+
+
+For locating sequencing vector the program uses a dynamic programming
+algorithm and two percentage matches as cutoffs - one for the 5' end
+and another for the 3' end. Both searches include the poor quality data
+at the ends of the readings. 
+This mode writes the SL and SR records in experiment files.
+
+If the user selects the vector_primer file mode of vector_clip the
+program searches the 5' end of each reading for
+all of the forward and reverse sequence segments in the vector_primer
+file and notes
+the one which matches best. If this one is above the user defined
+threshold the 5' clip point will be set and the
+experiment file will be modified accordingly.
+The program then compares the rest of the reading with all of the
+segments in the vector_primer file to find the one which matches best.
+Again if the user defined threshold is reached the experiment file will
+be modified accordingly. If the best 5' and 3' matches come from different
+records in the vector_primer file a warning message is printed.
+If a 5' match is found it will be used to determine the file name of the
+vector sequence and the primer type. If only a 3' match is found it will
+be used to determine these items. If no match is found no PR record is
+written.
+This mode writes the SL, SR, SF and PR records in experiment files. If the
+vector file name is missing from the vector_primer file record, the SF
+record is not written.
+
+For locating cloning vector two algorithms are available, both of which
+use hashing. 
+The original method needs a "Word
+length" (word_length), the "Number of diagonals to combine" (num_diags) and
+a "Cutoff score" (diagonal_score).  The word length is the minimum number
+of consecutive bases that will count as a match. The algorithm treats the
+problem like a dot matrix comparison. First it finds all matches of length
+word_length; then it locates the diagonal with the highest normalised
+score.  Then it adds the scores for the adjacent diagonals (num_diags).  If
+the combined score is at least "diagonal_score" the experiment file is
+updated to indicate the location of the vector sequence.  The score
+represents the proportion of a diagonal that contains matching words, and
+the maximum score for any diagonal is 1.0.
+This mode writes the CS records in experiment files.
+If the whole reading is cloning vector
+this mode writes a PS record containing "all cloning vector".
+
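+The original, score based method described above can be sketched in Python
+as follows. This is an illustration only and is not taken from the
+vector_clip sources: the function names are invented, only one strand is
+compared, and the way adjacent diagonals are combined (a window of
+num_diags diagonals centred on the best one) is an assumption, since the
+text does not say exactly which neighbours are summed.

```python
# Hypothetical sketch of word-hashing diagonal scoring; not vector_clip code.
from collections import defaultdict


def diagonal_scores(reading, vector, word_length=4):
    """Return {diagonal: score}, where the score is the fraction of word
    positions on that diagonal holding a matching word (maximum 1.0)."""
    # Hash every word of the vector to the positions where it occurs.
    words = defaultdict(list)
    for j in range(len(vector) - word_length + 1):
        words[vector[j:j + word_length]].append(j)

    # Count word hits per diagonal (diagonal = reading pos - vector pos).
    hits = defaultdict(int)
    for i in range(len(reading) - word_length + 1):
        for j in words.get(reading[i:i + word_length], ()):
            hits[i - j] += 1

    # Normalise by the number of word positions on each diagonal.
    scores = {}
    for diagonal, count in hits.items():
        length = (min(len(reading), len(vector) + diagonal)
                  - max(0, diagonal))
        scores[diagonal] = count / (length - word_length + 1)
    return scores


def combined_best_score(scores, num_diags=7):
    """Sum the scores in a window of num_diags diagonals centred on the
    highest scoring diagonal (the window shape is an assumption)."""
    if not scores:
        return 0.0
    best = max(scores, key=scores.get)
    half = num_diags // 2
    return sum(scores.get(d, 0.0) for d in range(best - half, best + half + 1))
```

+In this sketch a reading would be reported as containing cloning vector when
+combined_best_score returns at least the chosen diagonal_score cutoff.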
+
+A newer method also hashes using "word_length" consecutive bases and 
+accumulates the hits for each diagonal, but instead of using a score cutoff,
+it decides if there is
+a match using a probability threshold "P" supplied by the user. 
+For each length of diagonal vector_clip calculates "E" the score that would be
+expected for probability "P", and then compares it with the observed score "O".
+If for any diagonal O>E a match is declared and expressed as 100(O-E)/E. This
+new method is an attempt to overcome the problem that even though the
+scores on diagonals are normalised to lie in the range 0.0 to 1.0 the scores
+are still a function of the diagonal length. The probability P hence allows
+vector_clip to use a different cutoff score for each length of diagonal.
+Tests have shown that the probability based algorithm is very much more 
+reliable than the older one. 
+By default the program still
+uses the old algorithm, the probability based one being switched on by
+the user specifying a probability cutoff (option -P). It is strongly
+recommended that the probability based method is used and for our data we have
+found that a probability of 0.0000000000001 or 1.0e-13 gives good results.
+This mode writes the CS records in experiment files.
+If the whole reading is cloning vector
+this mode writes a PS record containing "all cloning vector".
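+
+The reporting rule for the probability based method can be written out as a
+short Python sketch. The function name percent_over_expected is invented
+here, and the expected score E is taken as an input, because this manual
+does not describe how vector_clip derives E from the probability P and the
+diagonal length.

```python
# Hypothetical illustration of the 100(O-E)/E rule; not vector_clip code.
def percent_over_expected(observed, expected):
    """Return 100(O-E)/E when O > E, or None when no match is declared."""
    if expected <= 0:
        raise ValueError("expected score must be positive")
    if observed <= expected:
        return None
    return 100.0 * (observed - expected) / expected
```

+For instance, an observed score of 3.0 against an expected score of 2.0
+would be reported as 50, while any observed score at or below E gives no
+match.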
+
+The search for "vector rearrangements" uses a simple algorithm which
+looks only for a match of length "minimum match".  All readings that
+contain a string of characters of at least this length that match a segment
+of the vector sequence exactly will be classed as "vector rearrangements"
+and their names will not be written to the file of passed file names.
+This mode writes a PS record containing "vector rearrangement" in experiment 
+files if a match is found. Note that if a reading's Experiment file does not
+contain an SF (i.e. name of sequencing vector file) the vector rearrangements
+search does not fail the reading: its name goes into the pass file.
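+
+The exact match test used for vector rearrangements can be illustrated with
+the Python sketch below. It is not the vector_clip implementation: the
+function name has_exact_match is invented, and the sketch compares a single
+strand with case sensitive matching, which may differ from what the program
+does. It relies on the fact that any exact match of at least the minimum
+length contains an exact match of precisely that length.

```python
# Hypothetical illustration of a minimum-length exact match test.
def has_exact_match(reading, vector, minimum_match=20):
    """Return True if reading and vector share an exact substring of at
    least minimum_match characters."""
    if minimum_match <= 0:
        raise ValueError("minimum_match must be positive")
    if len(reading) < minimum_match or len(vector) < minimum_match:
        return False
    # Every window of the vector with exactly minimum_match characters.
    vector_words = {vector[j:j + minimum_match]
                    for j in range(len(vector) - minimum_match + 1)}
    return any(reading[i:i + minimum_match] in vector_words
               for i in range(len(reading) - minimum_match + 1))
```

+A reading for which this returns True would, in the terms used above, be
+classed as a possible vector rearrangement and left out of the pass file.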
+
+The search for transposon generated data is somewhat complicated, as is 
+explained below.
+
+The transposon ends must be stored in a vector_primer file.
+The vector sequence file should be named in the SF record.
+Numerous scores are required.
+       
+ at enumerate
+ at item
+Get the transposon end sequences from the vector_primer file.
+ at item
+Get the vector sequence and rotate it around the cloning site.
+ at item
+Use dynamic programming to search with both of the transposon end 
+sequences and note the highest score. If it is above score L, reset SL.
+ at item
+Use hashing to search the 20 bases after SL for a match to any part of the 
+vector, on both strands.
+ at item
+If the best match is above score l, use dynamic programming to 
+try to align from the match point to the cloning site. If the 
+alignment score is >= score R, reset SL.
+ at item
+If the previous two steps fail to find a match to vector we assume that
+the transposon inserted into the target DNA and not the vector.
+The reading could hence run into vector at its 3' end, so
+use dynamic programming to
+search from SL onwards for the sequences either side of the
+cloning site (we do not know the orientation of the transposon,
+and hence the read, relative to the vector).
+If a match >= score R is found, reset SR.
+ at end enumerate
+
+_split()
+ at node Vector_Clip-Options
+ at section Options
+
+ at example
+Usage: vector_clip [options] file_of_filenames
+Where options are:
+    [-s mark sequencing vector]      [-c mark cloning vector]
+    [-h hgmp primer]                 [-r vector rearrangements]
+    [-w word_length (4)]             [-n num_diags (7)]
+    [-d diagonal score (0.35)]       [-l minimum match (20)]
+    [-L minimum % 5' match (60)]     [-R minimum % 3' match (80)]
+    [-m default 5' position]         [-t test only]
+    [-M Max vector length (100000)]  [-P max Probability]
+    [-v vector_primer filename]      [-i vector_primer filename]
+    [-V vector_primer length]
+    [-p passed fofn]                 [-f failed fofn]
+ at end example
+
+Options:
+
+ at table @code
+ at item -s
+Mark sequencing vector. Searches for 5' primer, 3' running into vector.
+ at item -c
+Mark cloning vector. Searches both strands for cloning vector.
+ at item -h
+Hgmp primer. Searches 3' end for a primer.
+ at item -i vector_primer filename
+Mark transposon data.
+ at item -r
+Vector rearrangements. Searches for sequencing vector rearrangements.
+ at item -t
+Test only. Does not change the experiment files, displays hits.
+ at end table
+
+ at node Vector_Clip-Parameters
+
+ at section Parameters (defaults in brackets)
+ at table @var
+ at item @code{-L} minimum percentage match 5' end (60)
+sequencing vector searches and transposon search
+ at item @code{-R} minimum percentage match 3' end (80)
+sequencing vector searches and transposon search
+ at item @code{-m} minimum 5' position
+allows a minimum 5' end cutoff to be set if a sufficiently good match is not 
+found (i.e. it is really a default 5' cutoff position). 
+If a value of -1 is used the program will set the cutoff to be the 
+distance between the primer and the cloning site.
+ at item @code{-v} vector-primer-pair filename
+sequencing vector search using vector-primer-pair file
+ at item @code{-V} vector_primer length
+the length of the sequence stored in the vector_primer file to use for
+the 5' search
+ at item @code{-w} word_length (4)
+cloning vector search hash length
+ at item @code{-P} probability
+cloning vector search, (a score less likely than P is a match)
+ at item @code{-n} num_diags (7)
+cloning vector search, old score based algorithm: number of diagonals to combine
+ at item @code{-d} diagonal score (0.35)
+cloning vector search, old score based algorithm
+ at item @code{-l} minimum match (20)
+sequencing vector rearrangements and transposon search minimum match length
+ at item @code{-M} maximum vector length (100000)
+all algorithms, reset for vectors >100000 bases
+ at item @code{-p} passed fofn
+file of file names for passed files
+ at item @code{-f} failed fofn
+file of file names for failed files
+ at item input fofn ...
+input file of file names
+ at end table
+
+_split()
+ at node Vector_Clip-Errors
+ at section Error codes
+ at cindex Vector_Clip: error codes
+ at cindex error codes in vector_clip
+
+The following errors can occur.
+
+ at cindex Vector_Clip: error codes
+ at enumerate 1
+ at item Error: could not open experiment file
+ at item Error: no sequence in experiment file
+ at item Error: sequence too short
+ at item Error: missing vector file name
+ at item Error: missing cloning site
+ at item Error: missing primer site
+ at item Error: could not open vector file
+ at item Error: could not write to experiment file
+ at item Error: could not read vector file
+ at item Error: missing primer sequence
+ at item Error: hashing problem
+ at item Error: alignment problem
+ at item Error: invalid cloning site
+ at item Warning: sequence now too short (no message)
+ at item Warning: sequence entirely cloning vector (no message)
+ at item Warning: possible vector rearrangement (no message)
+ at item Warning: error parsing vector_primer file
+ at item Warning: primer pair mismatch!
+ at item Aborting: more than X entries in vector_primer file
+ at end enumerate
+
+
+_split()
+ at node Vector_Clip-Examples
+ at section Examples
+
+Screen for sequencing vector using 5' cutoff of 70%, a 3' cutoff of 90%
+and default 5' primer position of 30. The files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -s -L70 -R90 -m30 -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen for sequencing vector using 5' cutoff of 60%, a 3' cutoff of 80%
+and default 5' primer position of 30. The batch of files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail. This shows
+that the default search is for sequencing vector.
+
+
+ at example
+ at code{vector_clip -m30 -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen for sequencing vector using 5' cutoff of 60%, a 3' cutoff of 80%
+and a vector-primer-pair file called vpfile. Only the 20 bases closest
+to the cloning site will be used for the 5' search.
+The batch of files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -v vpfile -V20 -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen transposon data using 5' cutoff of 80%, a 3' cutoff of 85%, a match length of 10
+and a vector-primer-pair file called vector_primer_file. 
+The batch of files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -i vector_primer_file -L 80 -R 85 -l 10 -pfiles.pass \}
+ at code{            -f files.fail files.in}
+ at end example
+
+
+Screen for cloning vector using the old algorithm with a word length of 4, 
+summing 7 diagonals and diagonal cutoff score of 0.4. 
+The batch of files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -c -w4 -n7 -d0.4 -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen for cloning vector using the probability based algorithm with a 
+word length of 4 and probability cutoff of 1.0e-13.
+The batch of files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -c -P 1.0e-13 -pfiles.pass -f files.fail files.in}
+ at end example
+
+
+Screen for 3' primer using a cutoff of 75%.
+The batch of files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+ at example
+ at code{vector_clip -h -R75 -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen for sequencing vector rearrangements using a cutoff of 20 bases.
+The batch of files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+ at example
+ at code{vector_clip -r -l20 -pfiles.pass -f files.fail files.in}
+ at end example
+
+
+_split()
+ at node Vector_Clip-Vector_Primer-Files
+ at section Vector_Primer file format
+ at cindex Vector_primer files
+ at cindex format: vector_primer files
+
+The vector_primer files store
+the data for each vector/primer pair combination as a single record
+(line) and up to 100 records can be contained in a file. The items on each
+line must be separated by spaces or tabs (only the file name can contain spaces)
+and a newline character ends the record. 
+It is important to realise that the format has been simplified since the first
+version of the method appeared in release 1999.0 and any files created for the
+1999.0 release will need to be edited!
+
+The items in a record are:
+
+name seq_r seq_f file_name
+
+name is an arbitrary record name.
+seq_r is the sequence between the reverse primer and the cloning site.
+seq_f is the sequence between the forward primer and the cloning site.
+file_name is the name of the file containing the complete vector sequence.
+
+An example file containing two entries
+(for m13mp18, and a vector called f1) is
+shown below. "\" symbols denote wrapped lines, so
+the first record is shown on two lines and the second on one.
+
+ at example
+
+m13mp18 attacgaattcgagctcggtaccc ggggatcctctagagtcgacctgcaggcatgcaagcttggc \
+/pubseq/tables/vectors/m13mp18.seq
+f1 CCGGGAATTCGCGGCCGCGTCGACT CTAGACTCGAGTTATGCATGCA  af_clones_vec
+ at end example
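+
+As an illustration only, the record layout described above can be read with
+a few lines of Python. This is a sketch, not part of vector_clip: the names
+VectorPrimerRecord, parse_record and parse_vector_primer_text are
+hypothetical, and it assumes each record occupies one physical line (the "\"
+wrapping shown above is for display only).
+

```python
from typing import List, NamedTuple

MAX_RECORDS = 100  # the text states that a file may hold up to 100 records


class VectorPrimerRecord(NamedTuple):
    name: str       # arbitrary record name
    seq_r: str      # sequence between the reverse primer and the cloning site
    seq_f: str      # sequence between the forward primer and the cloning site
    file_name: str  # file holding the complete vector sequence


def parse_record(line: str) -> VectorPrimerRecord:
    """Split one record into its four items.

    Only the first three items are split on whitespace, so a file name
    that contains spaces is kept intact.
    """
    fields = line.strip().split(None, 3)
    if len(fields) != 4:
        raise ValueError(f"expected 4 items, found {len(fields)}: {line!r}")
    return VectorPrimerRecord(*fields)


def parse_vector_primer_text(text: str) -> List[VectorPrimerRecord]:
    """Parse every non-blank line of a vector_primer file's contents."""
    records = [parse_record(l) for l in text.splitlines() if l.strip()]
    if len(records) > MAX_RECORDS:
        raise ValueError(f"more than {MAX_RECORDS} records")
    return records
```

+
+Splitting at most three times is the design choice that lets the final item,
+the file name, contain spaces while the other items may not.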
+
+
+Note that the segments of sequence can be longer (or shorter) than the
+sequences between the primer and the cloning site. The -V option of
+vector_clip allows the user to specify that a fixed number of bases
+closest to the cloning site be used for any particular run, and so the
+same record in the vector_primer file could be used for several primers
+as long as the cloning site was the same. If it is necessary to define the
+sequence segments precisely, refer to the figure below. This
+contains an annotated section of the
+m13mp18 vector around the SmaI site, showing how it corresponds to the
+first record in the vector_primer file. The primers shown are the 16mer
+reverse(-21) and the 17mer forward(-20), and the vector_primer
+record is the sequence between the primers with a space at the cloning site,
+followed by a file name.
+
+ at example
+                                                 SmaI 
+                                                 ++++++++10++ 
+                         ---20--------10---------123456789012
+               r(-21)    432109876543210987654321
+         aacagctatgaccatg
+ acacaggaaacagctatgaccatgattacgaattcgagctcggtacccggggatcctcta
+       6210      6220      6230      6240      6250      6260
+
+ ++++++20++++++++30++++++++40+
+ 34567890123456789012345678901       f(-20)
+                              tgaccggcagcaaaatg
+ gagtcgacctgcaggcatgcaagcttggcactggccgtcgttttacaacgtcgtgactgg
+       6270      6280      6290      6300      6310      6320
+
+ at end example
+
+
+
+ at node Vector_Clip-Vector_Primer-File-Notes
+ at section Vector_Primer File Notes
+
+There are several consequences of using vector_primer files
+to specify the sequencing vector details. Please read a description
+of the vector_primer file algorithm in the algorithms section 
+_oref(Vector_Clip-Algorithms, Vector_clip algorithms).
+
+
+Firstly, to get the vector segments of readings marked correctly
+it is not necessary to include the relevant data in their
+experiment files.
+
+Secondly, because vector_clip compares all the vector-primer pairs in the
+vector_primer file it would be inefficient to include very large numbers
+of records in these files. Instead it would be better to have a master
+vector_primer file which contained all the combinations used in the lab
+and then to copy the relevant ones to project specific files.
+
+Thirdly, even though vector_clip can write the PR record (primer type) into
+the experiment file if it finds a match, gap4 still needs the template name
+data in order to do read pair analysis.
+
+Finally note that the -V option for vector_clip means that the
+segments of sequence in the vector_primer file need not be made exactly the
+right length when the files are created: it matters only that the
+cloning site is correctly specified and that there is sufficient length
+of sequence on either side. For example, vector_primer files
+could be created in which all records included 40 bases from either side
+of the cloning sites. The -V option allows the
+alignment to be limited to the segment of sequence closest to the
+cloning site. For example, -V 20 specifies that at most 20 bases around
+the cloning site are used. 
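+
+The effect of limiting the segments can be pictured with the small Python
+sketch below. It is an illustration under stated assumptions, not the
+vector_clip implementation: the function name trim_to_cloning_site is
+hypothetical, and it assumes that the limit is applied to each side of the
+cloning site, with seq_r ending at the site and seq_f starting at it (as in
+the m13mp18 record).
+

```python
from typing import Tuple


def trim_to_cloning_site(seq_r: str, seq_f: str, n: int) -> Tuple[str, str]:
    """Keep at most n bases on each side of the cloning site.

    Assumption: seq_r ends at the cloning site and seq_f starts at it,
    so the closest bases are the tail of seq_r and the head of seq_f.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return seq_r[-n:], seq_f[:n]


# The m13mp18 record with a limit of 20 bases
left, right = trim_to_cloning_site(
    "attacgaattcgagctcggtaccc",
    "ggggatcctctagagtcgacctgcaggcatgcaagcttggc",
    20,
)
```

+
+Segments shorter than the limit are returned unchanged, which matches the
+statement that records need not be made exactly the right length.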
+
+_split()
+ at node Vector_Clip-Sites
+ at section Defining Cloning and Primer Sites for Vector_Clip 
+ at cindex Cloning site, defining
+ at cindex Primer site, defining
+ at cindex Vector_Clip: cloning site, defining
+ at cindex Vector_Clip: primer site, defining
+ at cindex vector file formats
+ at cindex file formats for vectors
+ at cindex formats: vector files
+
+Vector sequences should be stored in simple text files with up to 80
+characters of data per line. Sequencing vectors are those vectors such
+as m13 used to produce templates for sequencing.  All other vectors,
+such as cosmid vectors, that are used to purify and grow the DNA prior
+to it being subcloned into sequencing vectors are termed "cloning
+vectors".  It is important that the files containing cloning vector
+sequences which are used by vector_clip are arranged so that the cloning site
+follows the last base in the file.  For example (where X is the cloning
+site):
+
+ at example
+start of file	
+acatacatacatatata
+acatagatagatacaga
+.
+.
+.
+cagatataX
+end of file
+
+     Cloning Vector File Base Ordering
+ at end example
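+
+Because the sequences are circular, a file that does not already end at the
+cloning site can be rotated so that it does. The Python sketch below shows
+one way to do this; the function names rotate_to_cloning_site and
+format_lines are hypothetical and the code is not part of the package.
+

```python
def rotate_to_cloning_site(seq: str, cut_after: int) -> str:
    """Rotate a circular sequence so the cloning site follows the last base.

    cut_after is the 1-based position of the base immediately before the
    cut; that base becomes the final base of the returned sequence.
    """
    if not 1 <= cut_after <= len(seq):
        raise ValueError("cut_after out of range")
    return seq[cut_after:] + seq[:cut_after]


def format_lines(seq: str, width: int = 80) -> str:
    """Break a sequence into lines of at most `width` characters."""
    return "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
```

+
+For example, rotating aaacccgggttt with the cut after base 6 (ccc'ggg) gives
+gggtttaaaccc, which ends with the bases just before the cut.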
+
+
+In order for vector_clip to search readings for segments of sequencing 
+vectors it either needs to use a vector_primer file or 
+it needs to know the positions of the cloning site and primers.
+If not using a vector_primer file, each reading's experiment file should contain SC and SP
+records, and also a primer type record (PR).
+The following section explains the numbering system used with
+an example for m13mp18, and then describes how to use spin 
+(_fpref(SPIN-Introduction, Introduction, spin))
+to work out the values for other vector, cloning site, and primer combinations.
+
+
+The position of the cloning site depends on the ordering of the bases in
+the particular vector sequence file being used. That is, as the
+sequences are circular, the file may be arranged to start at any base
+and still give the same circular sequence.  Vector_Clip must be told the
+correct position of the cloning site and then, relative to that, the
+position of the first base that will be included in the reading, i.e.
+the relative position of the first base 3' of the primer.
+
+Below we use
+EMBL entry M13MP18 as an example.  The figure includes a double stranded
+listing of 120 characters of m13mp18 around the SmaI site at 6249, and 
+some of the restriction sites. Between the restriction sites and the sequence
+we have added lines to explain the numbering used by vector_clip. The
+numbers below the row of "+" symbols show positive positions (to the right
+of the SmaI cloning site), and the numbers below the "-" symbols show
+negative positions (to the left of the cloning site). Below these lines
+we show the sequences of the 16mer reverse primer "r(-21)" which is at
+relative position -24, and the 17mer forward primer "f(-20)" which is at
+relative position 41.
+
+ at example
+The positions of SmaI site and forward and reverse primers for M13MP18
+
+                               EcoRI                                            
+                               .   TaqI                                         
+                               .   .     SacI                                   
+                               .   .     .     XmaI                             
+                               .   .     .     .HpaII
+                               .   .     .     ..AsuC2I                         
+                               .   .     .     ..SmaI                           
+                               .   .     .     ...  BamHI                       
+                               .   .     .     ...  MboI                        
+                               .   .     .     ...  Sau3AI                      
+                               .   .     .     ...  XhoII                       
+                               .   .     .     ...  . PspN4I          
+                               .   .     .     ...  . .   XbaI        
+                                                 ++++++++10++
+                         ---20--------10---------123456789012
+               r(-21)    432109876543210987654321
+         aacagctatgaccatg
+ acacaggaaacagctatgaccatgattacgaattcgagctcggtacccggggatcctcta
+       6210      6220      6230      6240      6250      6260
+ tgtgtcctttgtcgatactggtactaatgcttaagctcgagccatgggcccctaggagat
+
+                                                                      
+  HinfI                                                               
+  . SalI                                                              
+  . .AccI                                                             
+  . ..        SdaI                                                    
+  . ..        .  BspMI                                                
+  . ..        .  .  BbuI         CfrI                                 
+  . ..        .  .  Hsp92II      . BshI                               
+  . ..        .  .  PaeI         . HaeIII                             
+  . ..        .  .  SphI         . PalI                               
+  . ..        .  .  . Cac8I      . .Bse1I        MaeII                
+  . ..        .  .  . HindIII    . .BseNI        .  TaiI              
+  . ..        .  .  . . AluI     . .BsrI         .  TscI              
+  . ..        .  .  . . . MwoI   . .TspRI        .  .Tsp45I           
+ ++++++20++++++++30++++++++40+
+ 34567890123456789012345678901       f(-20)
+                              tgaccggcagcaaaatg
+ gagtcgacctgcaggcatgcaagcttggcactggccgtcgttttacaacgtcgtgactgg
+       6270      6280      6290      6300      6310      6320
+ ctcagctggacgtccgtacgttcgaaccgtgaccggcagcaaaatgttgcagcactgacc
+
+ at end example
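+
+The numbering in the figure can be expressed as a short calculation. The
+Python sketch below is inferred from the figure rather than taken from the
+program: it assumes the cloning site position (6249) is the first base to
+the right of the cut and that there is no position 0. The function name
+relative_position is hypothetical, and the absolute positions 6225 and 6289
+used in the example were read off the listing above.
+

```python
def relative_position(pos: int, site: int) -> int:
    """Convert an absolute base position to the figure's relative numbering.

    site is the absolute position of the first base to the right of the cut
    (6249 for SmaI in the m13mp18 figure). There is no position 0: the base
    at `site` is +1 and the base before it is -1.
    """
    return pos - site + 1 if pos >= site else pos - site


SMAI = 6249
reverse_sp = relative_position(6225, SMAI)  # first base 3' of r(-21)
forward_sp = relative_position(6289, SMAI)  # first base 3' of f(-20)
```

+
+With these inputs the sketch reproduces the relative positions quoted above,
+-24 for the reverse primer and 41 for the forward primer.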
+
+
+_split()
+ at node Vector_Clip-Cloning Site
+ at section Finding the Cloning and Primer Sites
+ at cindex Cloning site, finding
+ at cindex Primer site, finding
+
+The problem addressed here is how to work out the positions of the cloning
+and primer sites for vector_clip. The numbers can be
+worked out from listings of the vector sequences but once you know how,
+it is far easier to
+use the restriction enzyme search in spin 
+(_fpref(SPIN-Introduction, Introduction, spin))
+to do it, and that
+is what we explain here. Some familiarity with spin will help.
+To use the restriction enzyme search in spin it is necessary to have created
+a file containing the definitions of the sequences to search for
+(_fpref(Formats-Restriction, Restriction enzyme files, renzymes)).
+These files give each enzyme a name and a set of strings (with cut positions
+marked by "'"). The name is terminated by "/" and each string by  "/". An extra
+"/" terminates all the data for each enzyme. For example
+"fred/aaa'ttt/gatc'a//" defines enzyme fred to have two recognition sequences
+aaattt and gatca with cut positions denoted by "'".
+
+For our current purpose we treat the primer sequences as restriction enzymes
+which each have a single recognition sequence; making sure that the sequence
+is the sense of the primer that is present in the sequence being searched, 
+and that the "cut positions" define the 3' ends (i.e. the end where the new
+sequence will start). 
+Again we use the m13mp18 vector, its SmaI cloning site and the 17mer (-20)
+forward and 16mer (-21) reverse primers as an example. The reverse primer
+has the sequence 5'aacagctatgaccatg3' and the forward one is
+5'gtaaaacgacggccagt3', and the SmaI site is ccc'ggg where "'" defines the cut site.
+
+The restriction enzyme file should contain the following:
+
+ at example
+
+f(-20)/'actggccgtcgttttac//
+SmaI/CCC'GGG//
+r(-21)/aacagctatgaccatg'//
+ at end example
+
+This names the 17mer forward primer as f(-20) and defines its recognition
+sequence as 'actggccgtcgttttac. Note that this is the reverse complement of the primer
+(which is what appears in the sequence being searched) and that the "cut position"
+is defined
+by the "'" symbol. SmaI is named and defined in the next record by
+SmaI/CCC'GGG//. The 16mer reverse primer is named r(-21) and defined by
+r(-21)/aacagctatgaccatg'//, and this time we search for the sequence of the
+primer and the "cut position" is again at the 3' end.
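+
+The record syntax and the strand conversion described above can be checked
+with the Python sketch below. It is an illustration only and is not how spin
+reads these files; the function names reverse_complement and
+parse_enzyme_record are hypothetical, and each record is assumed to be
+complete on one line.
+

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_enzyme_record(record: str) -> Tuple[str, List[Tuple[str, int]]]:
    """Parse 'name/str1/str2//' into (name, [(sequence, cut_offset), ...]).

    cut_offset is the number of bases to the left of the "'" mark.
    """
    if not record.endswith("//"):
        raise ValueError("record must end with '//'")
    name, *strings = record[:-2].split("/")
    sites = []
    for s in strings:
        cut = s.find("'")
        if cut < 0:
            raise ValueError(f"no cut position in {s!r}")
        sites.append((s.replace("'", ""), cut))
    return name, sites


# The forward primer as it appears in the strand being searched
forward_in_vector = reverse_complement("gtaaaacgacggccagt")
```

+
+Here forward_in_vector is actggccgtcgttttac, the string used in the f(-20)
+record, and parsing that record gives a cut offset of 0, i.e. the "cut" lies
+immediately to the left of the string.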
+
+Having started spin and read in the m13mp18 sequence, select "Restriction enzyme map"
+from the "search" menu. A dialogue will appear requesting "Select input source" and
+with "6 cutter file" as the default. Select "personal" and give the name of the file
+for m13mp18 (a file containing the definitions shown above should be found in 
+$STADTABL/m13mp18_primers). The names "f(-20), SmaI and r(-21)" should appear in the
+selection box in the dialogue. Select all three and the graphical result should 
+appear.  Magnify the plot by hitting the "+50%" button and then scroll to the region 
+around 6249 which should look as shown below.
+
+_lpicture(primer_pos_plot,6in)
+
+This plot and the functionality of spin are sufficient to work out the numbers
+for vector_clip, but there is also a way of getting the values printed in the
+text output window (this is described later). To obtain the numbers from the plot,
+first touch the line showing the SmaI site; its position (6249)
+will be written in the
+information line at the bottom of the plot. This is the position of the cloning site
+and hence is the value for the SC record in the experiment file. Now click on the
+line for the SmaI site and the information line will display 
+"Select another cut"; click
+on the line for the forward primer; the distance between the SmaI site and the
+forward primer (41) will be displayed in the information line, and in the top
+right hand box of the plot. Being to the right of the cloning site, this gives
+a positive value for the experiment file SP record. Clicking on the SmaI site,
+and then the line for the reverse primer, gives 24 which, being to the left, is 
+a negative value for the SP record.
+
+To get the numbers displayed in the text output window, select the "Output ordered
+on position" item from the "Results" menu of the "Restriction enzyme map" plot.
+For the example given here they will appear as shown below.
+
+_picture(primer_pos_text,5.11667in)
+
+Finally it is also possible to work out the numbering by using the restriction
+enzyme search in the spin "sequence display" which can be selected from the "View"
+menu. It will appear as shown below.
+
+_lpicture(primer_pos_seq_display,6in)
+
+
+_split()
+ at node Screen_seq
+ at chapter Screening Readings for Contaminant Sequences
+ at cindex Screen_seq
+ at cindex screen_seq
+ at cindex Screening readings for contaminant sequences
+ at cindex screening for vectors
+ at cindex screening for bacterial sequences
+ at cindex filtering out extraneous readings
+ at cindex extraneous readings: filtering out
+ at cindex readings: extraneous
+ at cindex removing extraneous readings
+
+
+ at menu
+ at ifset html
+* Screen_Seq-Introduction::   Introduction
+ at end ifset
+* Screen_Seq-Parameters::     Command line options and parameters
+* Screen_Seq-limits::         Limitations
+* Screen_Seq-Errors::         Error codes
+* Screen_Seq-Examples::       Examples
+ at end menu
+
+ at ifset html
+_split()
+ at node Screen_Seq-Introduction
+ at section Introduction
+ at end ifset
+
+
+This section explains how to use the program screen_seq to
+filter out unwanted readings: i.e. how to
+search for and separate readings containing the sequences of extraneous
+DNA, such as vector or bacterial sequences. We have separated this task
+from that of locating and marking the extents of sequencing vector and
+other cloning vectors. There we require precise identification of the
+junction between the vectors and the target DNA. The filtering process
+described here is designed to spot strong matches between readings and a
+panel of possible contaminating sequences, and it splits readings into
+passes and fails. Readings that fail have a PS line containing the word
+"contaminant" and a "CONT" tag added to
+their experiment file.
+
+Normal usage would be to compare a batch of readings in experiment file
+format against a batch of possible contaminant sequences stored in (at
+present) simple text files. Each batch is presented to the program as a
+file of file names, and the program will write out two new files of file
+names: one containing the names of the files that do not match any of
+the contaminant sequences (the passes), and the other those that do
+match (the
+fails). It is also possible to compare single readings and single
+contaminant files by giving their file names (i.e. it is not necessary
+to use a file of file names for single files).
+
+Given the frequent need to compare against the full E. coli genome the
+algorithm is designed to be fast. Only one parameter is required: the minimum
+match length, min_match. All readings which contain a segment of sequence
+of length min_match which exactly matches a possible contaminant sequence
+are filtered out.
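+
+The filtering rule can be sketched in Python as follows. This is a minimal
+illustration of exact matching on words of length min_match, assuming a
+simple in-memory set of words; it is not the hashing scheme used by
+screen_seq, and the names build_index and is_contaminated are hypothetical.
+

```python
from typing import Iterable, Set

_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_index(contaminants: Iterable[str], min_match: int) -> Set[str]:
    """Collect every word of length min_match from the contaminants."""
    index = set()
    for seq in contaminants:
        seq = seq.lower()
        for i in range(len(seq) - min_match + 1):
            index.add(seq[i:i + min_match])
    return index


def is_contaminated(reading: str, index: Set[str], min_match: int) -> bool:
    """True if either strand of the reading shares an exact word of
    length min_match with any indexed contaminant sequence."""
    for strand in (reading.lower(), reverse_complement(reading).lower()):
        for i in range(len(strand) - min_match + 1):
            if strand[i:i + min_match] in index:
                return True
    return False
```

+
+Any exact match of min_match or more bases necessarily contains a matching
+word of exactly min_match bases, so testing fixed-length words is enough. A
+plain set is convenient for a sketch but would use a great deal of memory
+for a whole genome.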
+
+
+The search is
+conducted only over the clipped portion of the readings. On our aging
+Alpha machine
+it takes about 1 second to compare both strands of a reading against the
+4.7 million bases of E. coli.
+
+_split()
+ at node Screen_Seq-Parameters
+ at section Parameters
+ at table @var
+ at item @code{-l} Length of minimum match (25).
+all readings with a match of this length are hits
+ at item @code{-m} Maximum vector length. 
+the length of the longest sequence to screen against (100000).
+ at item @code{-i} Input file of reading file names. 
+the file names of the readings to screen.
+ at item @code{-I} Input file of single reading to screen. 
+the file name of the reading to screen.
+ at item @code{-s} Input file of sequence file names. 
+the file names of the sequences to screen against.
+ at item @code{-S} Input file name of single sequence to screen against.
+ at item @code{-p} Passed output file of file names. 
+for the names of the readings that do not match.
+ at item @code{-f} Failed output file of file names. 
+for the names of the readings that match.
+ at item @code{-t} Test only mode. 
+results are only written to stdout and the experiment files are not altered.
+ at end table
+
+
+_split()
+ at node Screen_Seq-limits
+ at section Limits
+ at cindex screen_seq, limits
+
+
+Screen_seq is currently set to be able to process a maximum of 10,000
+readings and 5,000 screening sequences in a single run. The maximum
+length of any screening sequence is 100,000 although this can be
+overridden by use of the -m parameter (set it to 5000000 for E. coli).
+At present the sequences to screen against must be stored in simple text
+files containing individual sequences, with no entry names, and <100
+characters per line.
+
+_split()
+ at node Screen_Seq-Errors
+ at section Error codes
+ at cindex Screen_seq: error codes
+ at cindex error codes in screen_seq
+
+The following errors can occur.
+
+ at enumerate 1
+ at item   "Failed to open file of file names to screen against". Fatal failure to
+open the file of file names to screen against.
+ at item   "Failed to open single file to screen against". Fatal failure to
+open the file to screen against.
+ at item   "Failed to open file of file names to screen". Fatal failure to
+open the file of file names to screen.
+ at item   "Failed to open single file to screen". Fatal failure to
+open the file to screen.
+ at item   "Failed to open file of passed file names". Fatal failure to
+open the file of file names for readings that do not match.
+ at item   "Failed to open file of failed file names". Fatal failure to
+open the file of file names for readings that match.
+ at item   "Error: could not open vector file". An individual sequence file
+could not be opened.
+ at item   "Error: could not read vector file". An individual sequence file
+could not be read.
+ at item   "Error: could not hash vector file". An individual sequence file
+could not be prepared for comparison.
+ at item	"Error: could not open experiment file". The file does not exist
+or is unreadable.
+ at item	"Error: no sequence in experiment file".
+ at item	"Error: sequence too short". The reading is shorter than the
+minimum match length.
+ at item	"Error: could not write to experiment file". The disk is full or
+the file is write protected.
+ at item   "Error: hashing problem". An error occurred in the comparison
+algorithm. Please report to staden-package@@mrc-lmb.cam.ac.uk
+ at end enumerate
+
+Inconsistencies in the selection of options, such as selecting both -I and
+-i, should also cause the usage message (shown below) to appear, and
+the program to terminate.
+
+ at example
+Usage: screen_seq [options and paramters] 
+Where options and parameters are:
+    [-l minimum match (25)]           [-m Max vector length (100000)]
+    [-i readings to screen fofn]      [-I reading to screen]
+    [-s seqs to screen against fofn]  [-S seq to screen against]
+    [-t test only]
+    [-p passed fofn]                  [-f failed fofn]
+ at end example
+
+_split()
+ at node Screen_Seq-Examples
+ at section Examples
+
+
+Screen the readings whose names are stored in fofn against a batch of
+possible contaminant sequences whose names are stored in vnames. Write
+the names of the readings that pass to file p and those that fail to
+file f. Increase the maximum sequence length to 5,000,000 characters and
+require a minimum match of 20.
+
+ at example
+ at code{screen_seq -i fofn -s vnames -p p -f f -l20 -m5000000 }
+ at end example
+
+Screen the single reading stored in xpg33.g1 against a batch of
+possible contaminant sequences whose names are stored in vnames. If the
+reading does not match write its name to file p, otherwise to
+file f. Increase the maximum sequence length to 5,000,000 characters and
+require a minimum match of 20.
+
+ at example
+ at code{screen_seq -I xpg33.g1 -s vnames -p p -f f -l20 -m5000000 }
+ at end example
+
+Screen the readings whose names are stored in fofn against a single
+possible contaminant sequence stored in ecoli.seq. Write
+the names of the readings that pass to file pass and those that fail to
+file fails. Increase the maximum sequence length to 5,000,000 characters and
+require a minimum match of 20.
+
+ at example
+ at code{screen_seq -i fofn -S ecoli.seq -p pass -f fails -l20 -m5000000 }
+ at end example
+
diff --git a/manual/vector_clip.1.texi b/manual/vector_clip.1.texi
new file mode 100644
index 0000000..8251882
--- /dev/null
+++ b/manual/vector_clip.1.texi
@@ -0,0 +1,241 @@
+
+ at cindex vector_clip: man page
+ at unnumberedsec NAME
+
+vector_clip --- finds and marks vector segments in sequence readings
+
+ at unnumberedsec SYNOPSIS
+
+
+ at code{vector_clip} @code{-}[@code{schr}]
+[@code{-w} @i{word_length (4)}] [@code{-n} @i{num_diags (7)}]
+[@code{-d} @i{diagonal_score (0.35)}] [@code{-l} @i{minimum_match (20/70%)}]
+[@code{-m} @i{minimum_5'_position}] [@code{-t}] [@code{-p}
+ at i{passed_fofn}] [@code{-f} @i{failed_fofn}] @i{input_fofn}
+
+ at unnumberedsec DESCRIPTION
+
+ at code{vector_clip} finds and marks vector segments in sequence readings stored
+in experiment file format. For sequencing vectors it can be used to find the
+5' primer and, for short inserts, the sequence to the 3' side of the cloning
+site. It can also be used to find 3' primer sequences. A further option can do
+a final check for any vector rearrangements that could be missed by the more
+specific searches around the cloning site. For cloning vectors it will search
+both orientations of the sequence and mark any segments found.  The vector
+sequences must be stored as simple text files. For cloning vector
+searches the reading's experiment file must contain the name of the
+cloning vector file. For sequencing vector searches, either the experiment
+file for each reading must contain the information about the vector
+sequence (the file name, cloning site and primer offset) or
+vector-primer files must be used. Vector-primer files contain sets of
+sequences from around cloning sites, and vector_clip can use these to
+find the vector that matches each reading best. If the match is above
+the cutoff score the reading is clipped. Vector-primer files are the
+simplest method of providing vector_clip with the data it needs for
+finding sequencing vectors. More information is available elsewhere
+(_fpref(Vector_Clip, Screening Against Vector Sequences, t)).
+
+
+The program processes batches of readings by the use of file of file names:
+one is used for input and two for output. The input file lists the names of
+all the readings to process, one name per line. One output file contains the
+names of all the readings that pass the screening and the other contains the
+names of those that fail.
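+
+The file of file names convention is simple enough to sketch in Python. The
+code below is an illustration only: read_fofn, write_fofn and split_batch
+are hypothetical names, and the passes argument stands in for whatever
+screening decision is made for each reading.
+

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def read_fofn(path: str) -> List[str]:
    """Read a file of file names: one name per line, blank lines ignored."""
    return [l.strip() for l in Path(path).read_text().splitlines() if l.strip()]


def write_fofn(path: str, names: Iterable[str]) -> None:
    """Write one file name per line."""
    Path(path).write_text("".join(f"{n}\n" for n in names))


def split_batch(input_fofn: str, passed_fofn: str, failed_fofn: str,
                passes: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Sort the names in input_fofn into passed and failed output files."""
    passed, failed = [], []
    for name in read_fofn(input_fofn):
        (passed if passes(name) else failed).append(name)
    write_fofn(passed_fofn, passed)
    write_fofn(failed_fofn, failed)
    return passed, failed
```

+
+Every input name ends up in exactly one of the two output files, which is
+the behaviour described above for the passed and failed lists.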
+
+ at unnumberedsec OPTIONS
+
+ at table @code
+ at item -s
+Mark sequencing vector. Searches for 5' primer, 3' running into vector.
+ at item -c
+Mark cloning vector. Searches both strands for cloning vector.
+ at item -h
+Hgmp primer. Searches 3' end for a primer.
+ at item -i vector_primer filename
+Mark transposon data.
+ at item -r
+Vector rearrangements. Searches for sequencing vector rearrangements.
+ at item -t
+Test only. Does not change the experiment files, displays hits.
+ at end table
+
+ at table @var
+ at item @code{-L} minimum percentage match 5' end (60)
+sequencing vector searches and transposon search
+ at item @code{-R} minimum percentage match 3' end (80)
+sequencing vector searches and transposon search
+ at item @code{-m} minimum 5' position
+allows a minimum 5' end cutoff to be set if a sufficiently good match is not 
+found (i.e. it is really a default 5' cutoff position). 
+If a value of -1 is used the program will set the cutoff to be the 
+distance between the primer and the cloning site.
+ at item @code{-v} vector-primer-pair filename
+sequencing vector search using vector-primer-pair file
+ at item @code{-V} vector_primer length
+the length of the sequence stored in the vector_primer file to use for
+the 5' search
+ at item @code{-w} word_length (4)
+cloning vector search hash length
+ at item @code{-P} probability
+cloning vector search, (a score less likely than P is a match)
+ at item @code{-n} num_diags (7)
+cloning vector search, old score based algorithm: number of diagonals to combine
+ at item @code{-d} diagonal score (0.35)
+cloning vector search, old score based algorithm
+ at item @code{-l} minimum match (20)
+sequencing vector rearrangements and transposon search minimum match length
+ at item @code{-M} maximum vector length (100000)
+all algorithms, reset for vectors >100000 bases
+ at item @code{-p} passed fofn
+file of file names for passed files
+ at item @code{-f} failed fofn
+file of file names for failed files
+ at item input fofn ...
+input file of file names
+ at end table
+
+ at unnumberedsec EXAMPLES
+
+ at example
+Usage: vector_clip [options] file_of_filenames
+Where options are:
+    [-s mark sequencing vector]      [-c mark cloning vector]
+    [-h hgmp primer]                 [-r vector rearrangements]
+    [-w word_length (4)]             [-n num_diags (7)]
+    [-d diagonal score (0.35)]       [-l minimum match (20)]
+    [-L minimum % 5' match (60)]     [-R minimum % 3' match (80)]
+    [-m default 5' position]         [-t test only]
+    [-M Max vector length (100000)]  [-P max Probability]
+    [-v vector_primer filename]      [-i vector_primer filename]
+    [-V vector_primer length]
+    [-p passed fofn]                 [-f failed fofn]
+ at end example
+
+
+Screen for sequencing vector using 5' cutoff of 70%, a 3' cutoff of 90%
+and default 5' primer position of 30. The batch of files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -s -L70 -R90 -m30 -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen for sequencing vector using 5' cutoff of 60%, a 3' cutoff of 80%
+and default 5' primer position of 30. The batch of files to process are
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail. This shows
+that the default search is for sequencing vector.
+
+
+ at example
+ at code{vector_clip -m30 -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen for sequencing vector using a 5' cutoff of 60%, a 3' cutoff of 80%
+and a vector-primer-pair file called vector_primer_file.
+The batch of files to process is
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -v vector_primer_file -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen transposon data using a 5' cutoff of 80%, a 3' cutoff of 85%, a match length of 10
+and a vector-primer-pair file called vector_primer_file.
+The batch of files to process is
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -i vector_primer_file -L 80 -R 85 -l 10 -pfiles.pass \}
+ at code{            -f files.fail files.in}
+ at end example
+
+
+Screen for cloning vector using the old algorithm with a word length of 4,
+summing 7 diagonals and a diagonal cutoff score of 0.4.
+The batch of files to process is
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -c -w4 -n7 -d0.4 -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen for cloning vector using the probability based algorithm with a
+word length of 4 and a probability cutoff of 1.0e-13.
+The batch of files to process is
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+
+ at example
+ at code{vector_clip -c -P 1.0e-13 -pfiles.pass -f files.fail files.in}
+ at end example
+
+
+Screen for 3' primer using a cutoff of 75%.
+The batch of files to process is
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+ at example
+ at code{vector_clip -h -R75 -pfiles.pass -f files.fail files.in}
+ at end example
+
+Screen for sequencing vector rearrangements using a cutoff of 20 bases.
+The batch of files to process is
+named in files.in, the names of the passed files are written to
+files.pass and the names of those that fail to files.fail.
+
+ at example
+ at code{vector_clip -r -l20 -pfiles.pass -f files.fail files.in}
+ at end example
+
+ at unnumberedsec NOTES
+
+The following error messages can be generated.
+
+ at cindex Vector_Clip: error codes
+ at enumerate 1
+ at item Error: could not open experiment file
+ at item Error: no sequence in experiment file
+ at item Error: sequence too short
+ at item Error: missing vector file name
+ at item Error: missing cloning site
+ at item Error: missing primer site
+ at item Error: could not open vector file
+ at item Error: could not write to experiment file
+ at item Error: could not read vector file
+ at item Error: missing primer sequence
+ at item Error: hashing problem
+ at item Error: alignment problem
+ at item Error: invalid cloning site
+ at item Warning: sequence now too short (no message)
+ at item Warning: sequence entirely cloning vector (no message)
+ at item Warning: possible vector rearrangement (no message)
+ at item Warning: error parsing vector_primer file
+ at item Warning: primer pair mismatch!
+ at item Aborting: more than X entries in vector_primer file
+ at end enumerate
+
+ at i{SL}, @i{SR}, @i{CL}, @i{CR}, @i{CS}, @i{PS}, @i{PR} and @i{SF}
+records are written to the experiment files.
+
+ at unnumberedsec SEE ALSO
+
+
+_fxref(Formats-Exp,Experiment File, formats) 
+
+
+For notes on defining the cloning and primer sites,
+_fxref(Vector_Clip-Sites, Defining the Positions of Cloning and Primer Sites for Vector_Clip, vector_clip)
+
+
+_fxref(Formats-Scf, scf(4), formats)
diff --git a/manual/vector_clip.texi b/manual/vector_clip.texi
new file mode 100644
index 0000000..b4711aa
--- /dev/null
+++ b/manual/vector_clip.texi
@@ -0,0 +1,41 @@
+\input epsf     % -*-texinfo-*-
+\input texinfo
+ at c %**start of header
+ at setfilename vector_clip.info
+ at settitle Vector clipping
+ at setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at setchapternewpage odd
+ at c %**end of header
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title Vector clipping
+ at subtitle 
+ at author 
+ at page
+ at vskip 0pt plus 1filll
+_include(copyright.texi)
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top top-vector_clip
+ at end ifinfo
+
+ at raisesections
+_include(vector_clip-t.texi)
+
+_split()
+ at node Index
+ at unnumberedsec Index
+ at printindex cp
+ at lowersections
+
+ at shortcontents
+ at contents
+ at bye
diff --git a/manual/vector_primer-t.texi b/manual/vector_primer-t.texi
new file mode 100644
index 0000000..200c151
--- /dev/null
+++ b/manual/vector_primer-t.texi
@@ -0,0 +1,36 @@
+ at node Formats-Vector_Primer
+ at section Vector_primer File
+ at cindex Vector_Primer files
+ at cindex format: vector_primer files
+
+
+The vector_primer files store
+the data for each vector/primer pair combination as a single record
+(line) and up to 100 records can be contained in a file. The items on each
+line must be separated by spaces or tabs (only the file name can contain spaces)
+and a newline character ends the record. 
+
+The items in a record are:
+
+name seq_r seq_f file_name
+
+name is an arbitrary record name.
+seq_r is the sequence between the reverse primer and the cloning site.
+seq_f is the sequence between the forward primer and the cloning site.
+file_name is the name of the file containing the complete vector sequence.
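+
+The record layout above can be read with a single scan. The following is
+a minimal sketch only: the type @code{vp_record_t}, the function
+ at code{parse_vp_line} and the field sizes are hypothetical and are not
+taken from the vector_clip sources. It assumes one complete record per
+line.
+

```c
#include <stdio.h>

/* Hypothetical record type; the field sizes are arbitrary choices. */
typedef struct {
    char name[64];
    char seq_r[256];
    char seq_f[256];
    char file_name[256];
} vp_record_t;

/*
 * Parse one record line of the form "name seq_r seq_f file_name".
 * The first three items end at white space; the file name runs to the
 * end of the line, so it may contain spaces.
 * Returns 0 on success, -1 if fewer than four items are found.
 */
int parse_vp_line(const char *line, vp_record_t *r)
{
    int n = sscanf(line, "%63s %255s %255s %255[^\n]",
                   r->name, r->seq_r, r->seq_f, r->file_name);
    return n == 4 ? 0 : -1;
}
```

+
+Any trailing spaces before the newline are kept as part of the file name
+in this sketch.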
+
+An example file containing two entries 
+(for m13mp18, and a vector called f1) is 
+shown below. "\" symbols have been used to denote wrapped lines, so
+the first record is shown on two lines and the second on one.
+
+ at example
+
+m13mp18 attacgaattcgagctcggtaccc ggggatcctctagagtcgacctgcaggcatgcaagcttggc \
+/pubseq/tables/vectors/m13mp18.seq
+f1 CCGGGAATTCGCGGCCGCGTCGACT CTAGACTCGAGTTATGCATGCA  af_clones_vec
+ at end example
+See 
+_oref(Vector_Clip-Vector_Primer-Files, Vector_Primer files)
+for information about creating new vector_primer file entries.
+
diff --git a/manual/ztr-t.texi b/manual/ztr-t.texi
new file mode 100644
index 0000000..8046199
--- /dev/null
+++ b/manual/ztr-t.texi
@@ -0,0 +1,772 @@
+ at ignore
+ at c MANSECTION=4
+ at unnumberedsec NAME
+
+ztr --- ZTR File Format (v1.2)
+ at end ignore
+
+ at node Formats-Ztr
+ at section ZTR
+ at cindex ZTR
+
+The ZTR format is used for storing analogue chromatogram data from DNA
+sequencing instruments.
+
+ at menu
+* ZTR-Header::                  Header
+* ZTR-Chunk Format::            Chunk Format
+* ZTR-Chunk Types::             Chunk Types
+* ZTR-Text Identifiers::        Text Identifiers
+* ZTR-References::              References
+ at end menu
+
+
+_split()
+ at node ZTR-Header
+ at subsection Header
+ at cindex ZTR header
+
+The header consists of an 8 byte magic number (see below), followed by a 1-byte
+major version number and 1-byte minor version number.
+
+Changes in minor numbers should not cause problems for parsers. They indicate
+a change in chunk types (different contents), but the file format is the
+same.
+
+The major number is reserved for any incompatible file format changes (which
+hopefully should be never).
+
+ at c INDENT=0.2i
+ at example
+/* The header */
+typedef struct @{
+    unsigned char  magic[8];	  /* 0xae5a54520d0a1a0a (be) */
+    unsigned char  version_major; /* 1 */
+    unsigned char  version_minor; /* 2 */
+@} ztr_header_t;
+
+/* The ZTR magic numbers */
+#define ZTR_MAGIC		"\256ZTR\r\n\032\n"
+#define ZTR_VERSION_MAJOR	1
+#define ZTR_VERSION_MINOR	2
+ at end example
+
+So the total header will consist of:
+
+ at example
+Byte number   0  1  2  3  4  5  6  7  8  9
+            +--+--+--+--+--+--+--+--+--+--+
+Hex values  |ae 5a 54 52 0d 0a 1a 0a|01 02|
+            +--+--+--+--+--+--+--+--+--+--+
+ at end example
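+
+A reader can check these ten bytes before looking at any chunks. The
+following is a minimal sketch; @code{check_ztr_header} is a hypothetical
+helper name, not a function from an existing library, and the magic
+bytes are those given in the ZTR_MAGIC definition above.
+

```c
#include <stddef.h>
#include <string.h>

/* The 8 magic bytes, i.e. "\256ZTR\r\n\032\n". */
static const unsigned char ztr_magic[8] = {
    0xae, 0x5a, 0x54, 0x52, 0x0d, 0x0a, 0x1a, 0x0a
};

/*
 * Hypothetical helper: returns 0 and stores the version numbers if buf
 * starts with the ZTR magic number, or -1 otherwise.
 */
int check_ztr_header(const unsigned char *buf, size_t len,
                     int *major, int *minor)
{
    if (len < 10 || memcmp(buf, ztr_magic, 8) != 0)
        return -1;
    *major = buf[8];   /* 1-byte major version */
    *minor = buf[9];   /* 1-byte minor version */
    return 0;
}
```

+
+A parser would normally reject an unknown major version but accept any
+minor version, in line with the compatibility rules described above.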
+
+_split()
+ at node ZTR-Chunk Format
+ at subsection Chunk Format
+ at cindex Chunks, ZTR
+ at cindex ZTR Chunk format
+
+The basic structure of a ZTR file is (header,chunk*) - i.e. a header followed by
+zero or more chunks. Each chunk consists of a type, some meta-data and some
+data, along with the lengths of both the meta-data and data.
+
+ at example
+Byte number   0  1  2  3  4  5  6  7  8  9
+            +--+--+--+--+---+---+---+---+--+--+  -  +--+--+--+--+--+--  -  --+
+Hex values  |   type    |meta-data length  | meta-data |data length| data .. |
+            +--+--+--+--+---+---+---+---+--+--+  -  +--+--+--+--+--+--  -  --+
+ at end example
+
+
+That is, in C:
+
+ at example
+typedef struct @{
+    uint4 type;			/* chunk type (be) */
+    uint4 mdlength;		/* length of meta-data field (be) */
+    char *mdata;		/* meta data */
+    uint4 dlength;		/* length of data field (be) */
+    char *data;			/* a format byte and the data itself */
+@} ztr_chunk_t;
+ at end example
+
+All 2 and 4-byte integer values are stored in big endian format.
+
+The meta-data is uncompressed (and so it does not start with a format
+byte). The format of the meta-data is chunk specific, and many chunk types
+will have no meta-data. In this case the meta-data length field will be zero
+and this will be followed immediately by the data-length field.
+
+The data length is the length in bytes of the entire 'data' block, including
+the format information held within it.
+
+The first byte of the data consists of a format byte. The most basic format is
+zero - indicating that the data is "as is"; it's the real thing. Other formats
+exist in order to encode various filtering and compression techniques. The
+information encoded in the next bytes will depend on the format byte.
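+
+As an illustration of the layout just described, the sketch below walks
+one chunk held in memory. The names @code{be32}, @code{chunk_view_t} and
+ at code{parse_chunk} are hypothetical, and the sketch assumes the whole
+chunk is already in a buffer; it does no decompression.
+

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a 4-byte big-endian unsigned integer. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical view of one chunk; the pointers refer into the input. */
typedef struct {
    uint32_t type;
    uint32_t mdlength;
    const unsigned char *mdata;
    uint32_t dlength;
    const unsigned char *data;   /* data[0] is the format byte */
} chunk_view_t;

/*
 * Parse one chunk starting at buf. Returns the number of bytes consumed,
 * or 0 if the buffer is too short to hold the whole chunk.
 */
size_t parse_chunk(const unsigned char *buf, size_t len, chunk_view_t *c)
{
    size_t pos;

    if (len < 8) return 0;
    c->type     = be32(buf);
    c->mdlength = be32(buf + 4);
    pos = 8;

    if (len - pos < c->mdlength) return 0;
    c->mdata = buf + pos;          /* may be zero bytes long */
    pos += c->mdlength;

    if (len - pos < 4) return 0;
    c->dlength = be32(buf + pos);
    pos += 4;

    if (len - pos < c->dlength) return 0;
    c->data = buf + pos;
    return pos + c->dlength;
}
```

+
+Calling the function repeatedly, advancing by the returned count each
+time, visits every chunk after the header.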
+
+
+ at subsubsection Data format 0 - Raw
+
+ at example
+Byte number   0 1  2       N
+            +--+--+--  -  --+
+Hex values  | 0|  raw data  |
+            +--+--+--  -  --+
+ at end example
+
+Raw data has no compression or filtering. It just contains the unprocessed
+data. It consists of a one byte header (0) indicating raw format followed by N 
+bytes of data.
+
+
+ at subsubsection Data format 1 - Run Length Encoding
+
+ at example
+Byte number   0  1    2     3     4      5     6  7  8               N
+            +--+----+----+-----+-----+-------+--+--+--+--  -  --+--+--+
+Hex values  | 1| Uncompressed length | guard | run length encoded data|
+            +--+----+----+-----+-----+-------+--+--+--+--  -  --+--+--+
+ at end example
+
+Run length encoding replaces stretches of N identical bytes (with value V)
+with the guard byte G followed by N and V. All other byte values are stored 
+as normal, except for occurrences of the guard byte, which is stored as G 0.
+For example with a guard value of 8:
+
+Input data:
+ at example
+	20 9 9 9 9 9 10 9 8 7
+ at end example
+
+Output data:
+ at example
+	1			(rle format)
+	0 0 0 10		(original length)
+	8			(guard)
+	20 8 5 9 10 9 8 0 7	(rle data)
+ at end example
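+
+Decoding follows directly from the rules above. The sketch below expands
+only the "rle data" part; the caller is assumed to have already read the
+format byte, the original length and the guard byte. The function name
+ at code{rle_expand} is hypothetical.
+

```c
#include <stddef.h>

/*
 * Expand run-length encoded bytes.
 * in/in_len: the rle data that follows the guard byte;
 * guard: the guard byte; out/out_cap: destination buffer.
 * Returns the number of bytes written, or -1 on malformed input or
 * if the output buffer is too small.
 */
long rle_expand(const unsigned char *in, size_t in_len, unsigned char guard,
                unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    while (i < in_len) {
        unsigned char b = in[i++];
        if (b != guard) {                 /* ordinary byte, stored as is */
            if (o >= out_cap) return -1;
            out[o++] = b;
            continue;
        }
        if (i >= in_len) return -1;       /* guard with nothing after it */
        unsigned char n = in[i++];
        if (n == 0) {                     /* "G 0": a literal guard byte */
            if (o >= out_cap) return -1;
            out[o++] = guard;
        } else {                          /* "G N V": N copies of V */
            if (i >= in_len || out_cap - o < n) return -1;
            unsigned char v = in[i++];
            while (n--) out[o++] = v;
        }
    }
    return (long)o;
}
```

+
+Applied to the rle data of the example above with a guard of 8, this
+yields the original ten input bytes.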
+
+
+ at subsubsection Data format 2 - ZLIB
+
+ at example
+Byte number   0  1    2     3     4    5  6  7         N
+            +--+----+----+-----+-----+--+--+--+--  -  --+
+Hex values  | 2| Uncompressed length | Zlib encoded data|
+            +--+----+----+-----+-----+--+--+--+--  -  --+
+ at end example
+
+This uses the zlib code to compress a data stream. The ZLIB data may itself be 
+encoded using a variety of methods (LZ77, Huffman), but zlib will
+automatically determine the format itself. Often using zlib mode
+Z_HUFFMAN_ONLY will provide best compression when combined with other
+filtering techniques.
+
+
+ at subsubsection Data format 64/0x40 - 8-bit delta
+
+ at example
+Byte number   0       1        2      N 
+            +--+-------------+--  -  --+
+Hex values  |40| Delta level |   data  |
+            +--+-------------+--  -  --+
+ at end example
+
+This technique replaces successive bytes with their differences. The level
+indicates how many rounds of differencing to apply, which should be between 1
+and 3. For determining the first difference we compare against zero. All
+differences are internally performed using unsigned values with automatic
+wrap-around (taking the bottom 8-bits). Hence 2-1 is 1 and 1-2 is 255.
+
+For example, with level set to 1:
+
+Input data:
+ at example
+      10 20 10 200 190 5
+ at end example
+
+Output data:
+ at example
+       64			(delta1 format)
+       1			(level)
+       10 10 246 190 246 71	(delta data)
+ at end example
+
+For level set to 2:
+       
+Input data:
+ at example
+      10 20 10 200 190 5
+ at end example
+
+Output data:
+ at example
+       64			(delta1 format)
+       2			(level)
+       10 0 236 200 56 81	(delta data)
+ at end example
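+
+Both examples are consistent with applying one round of differencing
+per level. The sketch below takes that reading as an assumption; the
+function names @code{delta8_encode} and @code{delta8_decode} are
+hypothetical, and the buffers hold only the delta data, without the
+format and level bytes.
+

```c
#include <stddef.h>

/* Apply `level` rounds of 8-bit differencing in place (encoder side). */
void delta8_encode(unsigned char *buf, size_t len, int level)
{
    while (level-- > 0) {
        unsigned char prev = 0;           /* first value is compared to 0 */
        for (size_t i = 0; i < len; i++) {
            unsigned char cur = buf[i];
            buf[i] = (unsigned char)(cur - prev);   /* wraps modulo 256 */
            prev = cur;
        }
    }
}

/* Undo delta8_encode: `level` rounds of running sums modulo 256. */
void delta8_decode(unsigned char *buf, size_t len, int level)
{
    while (level-- > 0) {
        unsigned char sum = 0;
        for (size_t i = 0; i < len; i++) {
            sum = (unsigned char)(sum + buf[i]);
            buf[i] = sum;
        }
    }
}
```

+
+With the input 10 20 10 200 190 5, level 1 gives 10 10 246 190 246 71
+and level 2 gives 10 0 236 200 56 81, matching the examples above.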
+
+
+ at subsubsection Data format 65/0x41 - 16-bit delta
+
+ at example
+Byte number   0       1        2      N 
+            +--+-------------+--  -  --+
+Hex values  |41| Delta level |   data  |
+            +--+-------------+--  -  --+
+ at end example
+
+This format is as data format 64 except that the input data is read in 2-byte
+values, so we take the difference between successive 16-bit numbers. For
+example "0x10 0x20 0x30 0x10" (4 8-bit numbers; 2 16-bit numbers) yields "0x10
+0x20 0x1f 0xf0". All 16-bit input data is assumed to be aligned to the start
+of the buffer and is assumed to be in big-endian format.
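+
+A single round of the 16-bit variant might look as follows. This is a
+sketch with a hypothetical function name, @code{delta16_encode_round};
+it assumes an even buffer length and handles one level only.
+

```c
#include <stddef.h>

/* One round of 16-bit big-endian differencing in place. */
void delta16_encode_round(unsigned char *buf, size_t len)
{
    unsigned int prev = 0;                /* first value is compared to 0 */

    for (size_t i = 0; i + 1 < len; i += 2) {
        unsigned int cur = ((unsigned int)buf[i] << 8) | buf[i + 1];
        unsigned int d = (cur - prev) & 0xffffu;   /* wrap to 16 bits */
        buf[i]     = (unsigned char)(d >> 8);
        buf[i + 1] = (unsigned char)(d & 0xffu);
        prev = cur;
    }
}
```

+
+For the four bytes 0x10 0x20 0x30 0x10 the second value becomes
+0x3010 - 0x1020 = 0x1ff0, giving the output quoted above.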
+
+
+ at subsubsection Data format 66/0x42 - 32-bit delta
+
+ at example
+Byte number   0       1        2  3  4      N 
+            +--+-------------+--+--+--  -  --+
+Hex values  |42| Delta level | 0| 0|   data  |
+            +--+-------------+--+--+--  -  --+
+ at end example
+
+
+This format is as data formats 64 and 65 except that the input data is read in
+4-byte values, so we take the difference between successive 32-bit numbers.
+
+Two padding bytes (2 and 3) should always be set to zero. Their purpose is to
+make sure that the compressed block is still aligned on a 4-byte boundary
+(hence making it easy to pass straight into the 32to8 filter).
+
+
+ at subsubsection Data format 67-69/0x43-0x45 - reserved
+
+At present these are reserved for dynamic differencing where the 'level' field 
+varies - applying the appropriate level for each section of data. Experimental 
+at present...
+
+
+ at subsubsection Data format 70/0x46 - 16 to 8 bit conversion
+
+ at example
+Byte number   0
+            +--+--  -  --+
+Hex values  |46|   data  |
+            +--+--  -  --+
+ at end example
+
+This method assumes that the input data is a series of big endian 2-byte
+signed integer values. If the value is in the range of -127 to +127 inclusive
+then it is written as a single signed byte in the output stream, otherwise we
+write out -128 followed by the 2-byte value (in big endian format). This
+method works well following one of the delta techniques as most of the 16-bit
+values are typically then small enough to fit in one byte.
+
+Example input data:
+ at example
+	0 10 0 5 -1 -5 0 200 -4 -32 (bytes)
+	(As 16-bit big-endian values: 10 5 -5 200 -800)
+ at end example
+
+Output data:
+ at example
+       70			(16-to-8 format)
+       10 5 -5 -128 0 200 -128 -4 -32
+ at end example
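+
+The conversion can be sketched as below. The function name
+ at code{shrink_16to8} is hypothetical; the sketch writes only the
+converted data (not the leading format byte 70) and represents the
+escape value -128 as the byte 0x80.
+

```c
#include <stddef.h>

/*
 * Shrink big-endian signed 16-bit values to bytes.
 * in_len is assumed to be even; out needs room for up to
 * (in_len / 2) * 3 bytes. Returns the number of bytes written.
 */
size_t shrink_16to8(const unsigned char *in, size_t in_len,
                    unsigned char *out)
{
    size_t o = 0;

    for (size_t i = 0; i + 1 < in_len; i += 2) {
        int v = (in[i] << 8) | in[i + 1];
        if (v >= 0x8000) v -= 0x10000;    /* sign-extend to int */

        if (v >= -127 && v <= 127) {
            out[o++] = (unsigned char)v;  /* fits in one signed byte */
        } else {
            out[o++] = 0x80;              /* escape: -128 */
            out[o++] = in[i];             /* then the 2-byte value */
            out[o++] = in[i + 1];
        }
    }
    return o;
}
```

+
+The five 16-bit values of the example above (10 5 -5 200 -800) produce
+nine output bytes, since 200 and -800 both need the escape.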
+
+
+ at subsubsection Data format 71/0x47 - 32 to 8 bit conversion
+
+ at example
+Byte number   0
+            +--+--  -  --+
+Hex values  |47|   data  |
+            +--+--  -  --+
+ at end example
+
+This format is similar to format 70, but we are reducing 32-bit numbers (big
+endian) to 8-bit numbers.
+
+
+ at subsubsection Data format 72/0x48 - "follow" predictor
+
+ at example
+Byte number   0  1     FF 100  101   N
+            +--+--  -  -  - --+-- - --+
+Hex values  |48| follow bytes |  data |
+            +--+--  -  -  - --+-- - --+
+ at end example
+
+For each symbol we compute the most frequent symbol following it. This is
+stored in the "follow bytes" block (256 bytes). The first character in the
+data block is stored as-is. Then for each subsequent character we store the
+difference between the predicted character value (obtained by using
+follow[previous_character]) and the real value. This is a very crude, but
+fast, method of removing some residual non-randomness in the input data and so 
+will reduce the data entropy. It is best to use this prior to entropy encoding
+(such as Huffman encoding).
+
+
+ at subsubsection Data format 73/0x49 - floating point 16-bit chebyshev polynomial predictor
+
+Version 1.1 only.
+Replaced by format 74 in Version 1.2.
+
+WARNING: This method was experimental and has been replaced with an
+integer equivalent. The floating point method may give system specific
+results.
+
+ at example
+Byte number   0  1  2      N
+            +--+--+--  -  --+
+Hex values  |49| 0|   data  |
+            +--+--+--  -  --+
+ at end example
+
+This method takes big-endian 16-bit data and attempts to curve-fit it using
+Chebyshev polynomials. The exact method employed uses the 4 preceding values
+to calculate Chebyshev polynomials with 5 coefficients. Of these 5 coefficients
+only 4 are used to predict the next value. Then we store the difference
+between the predicted value and the real value. This procedure is repeated
+for each 16-bit value in the data. The first four 16-bit values are
+stored with a simple 1-level 16-bit delta function. Reversing the predictor
+follows the same procedure, except now adding the differences between stored
+value and predicted value to get the real value.
+
+
+ at subsubsection Data format 74/0x4A - integer based 16-bit chebyshev polynomial predictor
+
+Version 1.2 onwards.
+This replaces the floating point code in ZTR v1.1.
+
+ at example
+Byte number   0  1  2      N
+            +--+--+--  -  --+
+Hex values  |4A| 0|   data  |
+            +--+--+--  -  --+
+ at end example
+
+This method takes big-endian 16-bit data and attempts to curve-fit it using
+Chebyshev polynomials. The exact method employed uses the 4 preceding values
+to calculate Chebyshev polynomials with 5 coefficients. Of these 5 coefficients
+only 4 are used to predict the next value. Then we store the difference
+between the predicted value and the real value. This procedure is repeated
+for each 16-bit value in the data. The first four 16-bit values are
+stored with a simple 1-level 16-bit delta function. Reversing the predictor
+follows the same procedure, except now adding the differences between stored
+value and predicted value to get the real value.
+
+
+
+_split()
+ at node ZTR-Chunk Types
+ at subsection Chunk Types
+ at cindex ZTR Chunk types
+
+As described above, each chunk has a type. The format of the data contained in 
+the chunk data field (when written in format 0) is described below.
+Note that no chunks are mandatory. It is valid to have no chunks at all.
+However some chunk types may depend on the existence of others. This will be
+indicated below, where applicable.
+
+Each chunk type is stored as a 4-byte value. Bit 5 of the first byte is used
+to indicate whether the chunk type is part of the public ZTR spec (bit 5 of
+first byte == 0) or is a private/custom type (bit 5 of first byte == 1). Bit
+5 of the remaining 3 bytes is reserved - they must always be set to zero.
+
+Practically speaking this means that public chunk types consist entirely of
+upper case letters (eg TEXT) whereas private chunk types start with a
+lowercase letter (eg tEXT). Note that in this example TEXT and tEXT are
+completely independent types and they may have no more relationship with each
+other than (for example) TEXT and BPOS types.
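+
+The bit test can be written directly on the 4-byte type once it has
+been read as a big-endian 32-bit number. The two function names below
+are hypothetical; bit 5 is taken to be the bit with value 0x20, which
+is what distinguishes ASCII upper case from lower case.
+

```c
#include <stdint.h>

/* Bit 5 (value 0x20) of the first type byte marks a private chunk type. */
int chunk_type_is_private(uint32_t type)
{
    return ((type >> 24) & 0x20u) != 0;
}

/* Bit 5 of the remaining three bytes is reserved and must be zero. */
int chunk_type_reserved_ok(uint32_t type)
{
    return (type & 0x00202020u) == 0;
}
```

+
+For instance, TEXT (0x54455854) is public while tEXT (0x74455854) is
+private, and both leave the reserved bits clear.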
+
+It is valid to have multiples of some chunks (eg text chunks), but not for
+others (such as base calls). The order of chunks does not matter unless
+explicitly specified.
+
+A chunk may have meta-data associated with it. This is data about the data
+chunk. For example the data chunk could be a series of 16-bit trace samples,
+while the meta-data could be a label attached to that trace (to distinguish
+trace A from traces C, G and T). Meta-data is typically very small and so it
+need never be compressed in any of the public chunk types (although
+meta-data is specific to each chunk type and so it would be valid to have
+private chunks with compressed meta-data if desirable).
+
+The first byte of each chunk data when uncompressed must be zero, indicating
+raw format. If, having read the chunk data, this is not the case then the
+chunk needs decompressing or reverse filtering until the first byte is
+zero. There may be a few padding bytes between the format byte and the first
+element of real data in the chunk. This is to make file processing simpler
+when the chunk data consists of 16 or 32-bit words; the padding bytes ensure
+that the data is aligned to the appropriate word size. Any padding bytes
+required will be listed in the appropriate chunk definition below.
+
+
+The following lists the chunk types available in 32-bit big-endian format.
+In all cases the data is presented in the uncompressed form, starting with the 
+raw format byte and any appropriate padding.
+
+ at subsubsection SAMP
+
+
+ at example
+Meta-data:
+Byte number   0  1  2  3
+            +--+--+--+--+
+Hex values  | data name |
+            +--+--+--+--+
+
+Data:
+Byte number   0  1  2  3  4  5  6  7       N
+            +--+--+--+--+--+--+--+--+-     -+
+Hex values  | 0| 0| data| data| data|   -   |
+            +--+--+--+--+--+--+--+--+-     -+
+ at end example
+
+This encodes a series of 16-bit trace samples. The first data byte is the
+format (raw); the second data byte is present for padding purposes only. After 
+that comes a series of 16-bit big-endian values.
+
+The meta-data for this chunk contains a 4-byte name associated with the
+trace. If a name is shorter than 4 bytes then it should be right padded with
+nul characters to 4 bytes. For sequencing traces the four lanes representing A,
+C, G and T signals have names "A\0\0\0", "C\0\0\0", "G\0\0\0" and "T\0\0\0".
+
+At present other names are not reserved, but it is recommended that (for
+consistency with elsewhere) you label private trace arrays with names starting
+with a lowercase letter (specifically, bit 5 is 1).
+
+For sequencing traces it is expected that there will be four SAMP chunks,
+although the order is not specified.
+
+
+ at subsubsection SMP4
+
+
+ at example
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3  4  5  6  7       N
+            +--+--+--+--+--+--+--+--+-     -+
+Hex values  | 0| 0| data| data| data|   -   |
+            +--+--+--+--+--+--+--+--+-     -+
+ at end example
+
+
+The first byte is 0 (raw format). Next is a single padding byte (also 0).
+Then follows a series of 2-byte big-endian trace samples for the "A" trace,
+followed by a series of 2-byte big-endian trace samples for the "C" trace,
+also followed by the "G" and "T" traces (in that order). The assumption is
+made that there is the same number of data points for all traces and hence the 
+length of each trace is simply the number of data elements divided by four.
+
+This chunk is mutually exclusive with the SAMP chunks. If both sets are
+defined then the last found in the file should be used. Experimentation has
+shown that this gives around 3% saving over 4 separate SAMP chunks.
+
+ at subsubsection BASE
+
+
+ at example
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3      N  
+            +--+--+--+--  -  --+
+Hex values  | 0| base calls    |
+            +--+--+--+--  -  --+
+ at end example
+
+The first byte is 0 (raw format). This is followed by the base calls in ASCII
+format (one base per byte). The base call case and encoding set should be IUPAC
+characters [1].
+
+ at subsubsection BPOS
+
+
+ at example
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3  4  5  6  7       
+            +--+--+--+--+--+--+--+--+-     -+--+--+--+--+
+Hex values  | 0| padding|   data    |   -   |    data   |
+            +--+--+--+--+--+--+--+--+-     -+--+--+--+--+
+ at end example
+
+This chunk contains the mapping of base call (BASE) numbers to sample (SAMP)
+numbers; it defines the position of each base call in the trace data. The
+position here is defined as the numbering of the 16-bit positions held in the
+SAMP array, counting zero as the first value.
+
+The format is 0 (raw format) followed by three padding bytes (all 0). Next
+follows a series of 4-byte big-endian numbers specifying the position of each
+base call as an index into the sample arrays (when considered as a 2-byte
+array with the format header stripped off).
+
+Excluding the format and padding bytes, the number of 4-byte elements should
+be identical to the number of base calls. All sample numbers are counted from
+zero. No sample number in BPOS should be beyond the end of the SAMP arrays
+(although it should not be assumed that the SAMP chunks will be before this
+chunk). Note that the BPOS elements may not be totally in sorted order as
+the base calls may be shifted relative to one another due to compressions.
+
+ at subsubsection CNF4
+
+
+ at example
+Meta-data: none present
+
+Data:
+Byte number   0  1              N              4N
+            +--+--+--   -   --+--+----- -  -----+
+Hex values  | 0| call confidence | A/C/G/T conf |
+            +--+--+--   -   --+--+----- -  -----+
+
+(N == number of bases in BASE chunk)
+ at end example
+
+The first byte of this chunk is 0 (raw format). This is then followed by a
+series of confidence values for the called base. Next comes all the remaining
+confidence values for A, C, G and T excluding those that have already been
+written (ie the called base). So for a sequence AGT we would store confidences
+A1 G2 T3 C1 G1 T1 A2 C2 T2 A3 C3 G3.
+
+The purpose of this is to group the (likely) highest confidence values (those
+for the called base) at the start of the chunk followed by the remaining
+values. Hence if phred confidence values are written in a CNF4 chunk the first
+quarter of chunk will consist of phred confidence values and the last three
+quarters will (assuming no ambiguous base calls) consist entirely of zeros.
+
+For the purposes of storage the confidence value for a base call that is not
+A, C, G or T (in any case) is stored as if the base call was T.
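+
+The ordering can be undone as sketched below. The function name
+ at code{cnf4_unpack} is hypothetical; the sketch assumes the format byte
+has already been stripped and that the base calls from the BASE chunk
+are available.
+

```c
#include <stddef.h>
#include <ctype.h>

/*
 * Unpack CNF4 data into conf[i][0..3], holding the A, C, G and T
 * confidences of base i. cnf points at the 4*n confidence bytes and
 * bases at the n base calls.
 */
void cnf4_unpack(const unsigned char *cnf, const char *bases, size_t n,
                 unsigned char (*conf)[4])
{
    const unsigned char *rest = cnf + n;   /* the remaining 3*n values */

    for (size_t i = 0; i < n; i++) {
        int called;

        switch (toupper((unsigned char)bases[i])) {
        case 'A': called = 0; break;
        case 'C': called = 1; break;
        case 'G': called = 2; break;
        default:  called = 3; break;       /* T, and any other call */
        }
        /* The called base comes from the first block, the rest in order. */
        for (int b = 0; b < 4; b++)
            conf[i][b] = (b == called) ? cnf[i] : *rest++;
    }
}
```

+
+For the sequence AGT this reads A1 G2 T3 from the first block and then
+C1 G1 T1, A2 C2 T2 and A3 C3 G3 from the remainder.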
+
+The confidence values should be calculated as "-10 * log10 (1-probability)". These
+values are then converted to their nearest integral value.
+If a program wishes to store confidence values in a different range then this
+should be stored in a different chunk type.
+
+If this chunk exists it must exist after a BASE chunk.
+
+ at subsubsection TEXT
+
+
+ at example
+Meta-data: none present
+
+Data:	      0 
+            +--+-  -  -+--+-  -  -+--+-     -+-  -  -+--+-  -  -+--+--+
+Hex values  | 0| ident | 0| value | 0|   -   | ident | 0| value | 0| 0|
+            +--+-  -  -+--+-  -  -+--+-     -+-  -  -+--+-  -  -+--+--+
+ at end example
+
+This contains a series of "identifier\0value\0" pairs.
+
+The identifiers and values may be any length and may contain any data except
+the nul character. The nul character marks the end of the identifier or the
+end of the value. Multiple identifier-value pairs are allowable, with a double 
+nul character marking the end of the list.
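+
+Walking the pairs is a matter of scanning for nul characters. In the
+sketch below the names @code{text_pair_fn} and @code{text_foreach} are
+hypothetical; the data block is assumed to be uncompressed and to
+include its leading format byte.
+

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type, called once per identifier/value pair. */
typedef void (*text_pair_fn)(const char *ident, const char *value, void *arg);

/*
 * Visit each pair in a TEXT data block. fn may be NULL to just count.
 * Returns the number of pairs seen, or -1 if the block is malformed.
 */
int text_foreach(const char *data, size_t len, text_pair_fn fn, void *arg)
{
    size_t pos = 1;                        /* skip the raw format byte */
    int count = 0;

    if (len < 1 || data[0] != 0) return -1;

    /* An empty identifier (a second nul) marks the end of the list. */
    while (pos < len && data[pos] != '\0') {
        const char *ident = data + pos;
        const char *end = memchr(ident, '\0', len - pos);
        if (!end) return -1;
        pos = (size_t)(end - data) + 1;
        if (pos >= len) return -1;         /* identifier without a value */

        const char *value = data + pos;
        end = memchr(value, '\0', len - pos);
        if (!end) return -1;
        pos = (size_t)(end - data) + 1;

        if (fn) fn(ident, value, arg);
        count++;
    }
    return count;
}
```

+
+Because identifiers and values are nul terminated in place, the callback
+can treat both as ordinary C strings without copying.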
+
+Identifiers starting with bit 5 clear (uppercase) are part of the public ZTR
+spec. Any public identifier not listed as part of this spec should be
+considered as reserved. Identifiers that have bit 5 set (lowercase) are for
+private use and no restriction is placed on these.
+
+See below for the text identifier list.
+
+ at subsubsection CLIP
+
+
+ at example
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3  4  5  6  7  8
+            +--+--+--+--+--+--+--+--+--+
+Hex values  | 0| left clip | right clip|
+            +--+--+--+--+--+--+--+--+--+
+ at end example
+
+This contains suggested quality clip points. These are stored as zero (raw
+data) followed by a 4-byte big endian value for the left clip point and a
+4-byte big endian value for the right clip point. Clip points are defined in
+units of base calls, with a value of 1 clipping the first base (so zero
+indicates no left clip and NumberOfBases+1 indicates no right clip).
+
+
+
+ at subsubsection CR32
+
+
+ at example
+Meta-data: none present
+
+Data:
+Byte number   0  1  2  3  4 
+            +--+--+--+--+--+
+Hex values  | 0|   CRC-32  |
+            +--+--+--+--+--+
+ at end example
+
+This chunk is always just 4 bytes of data containing a CRC-32 checksum,
+computed according to the widely used ANSI X3.66 standard. If present, the
+checksum will be a check of all of the data since the last CR32 chunk.
+This will include checking the header if this is the first CR32 chunk, and
+including the previous CR32 chunk if it is not. Obviously the checksum will
+not include checks on this CR32 chunk.
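+
+The sketch below computes a CRC-32 bit by bit. It assumes that the
+checksum meant here is the common CRC-32 also used by zlib and PNG
+(reflected polynomial 0xEDB88320); that equivalence is an assumption
+and should be checked against an existing ZTR writer. The function
+name @code{crc32_update} is hypothetical.
+

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Update a running CRC-32 with len more bytes. Pass 0 as crc for the
 * first call and the previous return value for later calls.
 */
uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            /* Shift right; xor in the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}
```

+
+A writer would feed every byte emitted since the previous CR32 chunk
+(or since the start of the file) through this function and store the
+result as a 4-byte big-endian value.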
+
+
+ at subsubsection COMM
+
+ at example
+Meta-data: none present
+
+Data:
+Byte number   0  1        N
+            +--+--   -   --+
+Hex values  | 0| free text |
+            +--+--   -   --+
+ at end example
+
+This allows arbitrary textual data to be added. It does not require an
+identifier-value pairing or any nul termination.
+
+
+_split()
+ at node ZTR-Text Identifiers
+ at subsection Text Identifiers
+ at cindex ZTR Text Identifiers
+
+These are for use in the TEXT segments. None are required, but if any of these
+identifiers are present they must conform to the description below. Much
+(currently all) of this list has been taken from the NCBI Trace Archive [2]
+documentation. It is duplicated here as the ZTR spec is not tied to the same
+revision schedules as the NCBI trace archive (although it is intended that any
+suitable updates to the trace archive should be mirrored in this ZTR spec).
+
+The Trace Archive specifies a maximum length of values. The ZTR spec does not
+have length limitations, but for compatibility these sizes should still be
+observed.
+
+The Trace Archive also states some identifiers are mandatory; these are marked
+by asterisks below. These identifiers are not mandatory in the ZTR spec (but
+clearly they need to exist if the data is to be submitted to the NCBI).
+
+Finally, some fields are not appropriate for use in the ZTR spec, such as
+BASE_FILE (the name of a file containing the base calls). Such fields are
+included only for compatibility with the Trace Archive. It is not expected that
+use of ZTR would allow for the base calls to be read from an external file
+instead of the ZTR BASE chunk.
+
+[ Quoted from TraceArchiveRFC v1.17 ]
+
+ at example
+Identifier      Size       Meaning			 Example value(s)
+----------      -----      ----------------------------  -----------------
+TRACE_NAME *      250      name of the trace             HBBBA1U2211
+                           as used at the center
+                           unique within the center
+                           but not among centers.
+                           
+SUBMISSION_TYPE *   -      type of submission
+                           
+CENTER_NAME *     100      name of center                BCM
+CENTER_PROJECT    200      internal project name         HBBB
+                           used within the center
+                           
+TRACE_FILE *      200      file name of the trace	 ./traces/TRACE001.scf
+                           relative to the top of
+                           the volume.
+                           
+TRACE_FORMAT *     20      format of the tracefile
+                           
+SOURCE_TYPE *       -      source of the read
+                           
+INFO_FILE         200      file name of the info file
+INFO_FILE_FORMAT   20        
+                           
+BASE_FILE         200      file name of the base calls
+QUAL_FILE         200      file name of the base calls
+                           
+                           
+TRACE_DIRECTION     -      direction of the read
+TRACE_END           -      end of the template
+PRIMER            200      primer sequence
+PRIMER_CODE                which primer was used
+                           
+STRATEGY            -      sequencing strategy
+TRACE_TYPE_CODE     -      purpose of trace
+                           
+PROGRAM_ID         100     creator of trace file         phred-0.990722.h
+                           program-version
+                           
+TEMPLATE_ID         20     used for read pairing         HBBBA2211
+                           
+CHEMISTRY_CODE       -     code of the chemistry         (see below)
+ITERATION            -     attempt/redo                  1
+                           (int 1 to 255)
+                           
+CLIP_QUALITY_LEFT          left clip of the read in bp due to quality
+CLIP_QUALITY_RIGHT         right " " " " "
+CLIP_VECTOR_LEFT           left clip of the read in bp due to vector
+CLIP_VECTOR_RIGHT          right " " " " "
+
+                           
+SVECTOR_CODE        40     sequencing vector used        (in table)
+SVECTOR_ACCESSION   40     sequencing vector used        (in table)
+CVECTOR_CODE        40     clone vector used             (in table)
+CVECTOR_ACCESSION   40     clone vector used             (in table)
+                           
+INSERT_SIZE          -     expected size of insert       2000,10000
+                           in base pairs (bp)
+                           (int 1 to 2^32)
+                           
+PLATE_ID            32     plate id at the center          
+WELL_ID                    well                          1-384
+
+
+SPECIES_CODE *       -     code for species
+SUBSPECIES_ID       40     name of the subspecies
+                           Is this the same as strain
+
+CHROMOSOME           8     name of the chromosome        ChrX, Chr01, Chr09
+                           
+                           
+LIBRARY_ID          30     the source library of the clone
+CLONE_ID            30     clone id                      RPCI11-1234 
+ 
+ACCESSION           30     NCBI accession number         AC00001
+                           
+PICK_GROUP_ID       30     an id to group traces picked
+                           at the same time.
+PREP_GROUP_ID       30     an id to group traces prepared
+                           at the same time
+                           
+                           
+RUN_MACHINE_ID      30     id of sequencing machine
+RUN_MACHINE_TYPE    30     type/model of machine
+RUN_LANE            30     lane or capillary of the trace
+RUN_DATE             -     date of run
+RUN_GROUP_ID        30     an identifier to group traces
+                           run on the same machine
+
+[ End of quote from TraceArchiveRFC ]
+
+More detailed information on the format of these values should be obtained
+from the Trace Archive RFC [2].
+@end example
+
+
+_split()
+@node ZTR-References
+@subsection References
+@cindex ZTR References
+
+[1] IUPAC: http://www.chem.qmw.ac.uk/iubmb/misc/naseq.html
+
+[2] http://www.ncbi.nlm.nih.gov/Traces/TraceArchiveRFC.html
+
diff --git a/overview.html.template b/overview.html.template
new file mode 100644
index 0000000..1cf6003
--- /dev/null
+++ b/overview.html.template
@@ -0,0 +1,305 @@
+<html>
+<head>
+<title>Staden Package Program Summary</title>
+</head>
+<body bgcolor="#ffffff">
+<a href="index.html"><img src="i/nav_home.gif" alt="home"></a>
+<hr size=4>
+<h1 align=center>Staden Package Program Summary</h1>
+
+<h2>Assembly</h2>
+
+<h3>Assembly program</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="manual/gap4_toc.html">gap4</a></i>
+	<td>Performs assembly, contig joining, 
+	assembly checking, repeat searching, experiment suggestion,
+        read pair analysis and contig editing. Has graphical views of
+	contigs, templates, readings and traces which all scroll in register.
+	Contig editor searches
+	and experiment suggestion routines use phred confidence values 
+	to calculate the confidence of the consensus sequence and hence 
+	only identify places requiring visual trace inspection or extra data.
+	The result is extremely rapid finishing and a consensus of known
+	accuracy.<br>
+
+<tr align=left>
+<td><i><a href="manual/gap5_toc.html">gap5</a></i>
+	<td>This is the new development version of Gap4, designed to
+	work with the large volumes of data attainable through the
+	newer sequencing technologies (eg Illumina, SOLiD, 454).
+	At present documentation is absent, but it shares many common
+	features with Gap4.<br>
+
+<tr align=left>
+<td><i>tg_index</i>
+	<td>Generates Gap5 format databases from input assembly
+	formats: SAM, BAM, MAQ, ACE, etc.<br>
+
+</table>
+
+<h3>Preparing sequence trace data for analysis or assembly</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="manual/pregap4_toc.html">pregap4</a></i>
+	<td>Provides a graphical user interface to set up the processing
+	required to prepare trace data for assembly or analysis; and also 
+	gives a method for its automation. The possible processes which
+	can be set up include trace format conversion, quality analysis,
+	vector clipping, contaminant screening and repeat searching.<br>
+
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-qclip%">qclip</a></i>
+	<td>Performs simple quality clipping of Experiment Files based
+	on confidence values or on the sequence composition.<br>
+
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-polyA_clip%">polyA_clip</a></i>
+	<td>Marks polyA and polyT heads and tails.<br>
+
+<tr align=left>
+<td><i>stops</i>
+	<td>Identifies Sanger sequencing "stops" in SCF and ZTR trace files.<br>
+
+</table>
+
+<h3>Sequence screening</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="%INDEX:vector_clip/Vector_Clip%">vector_clip</a></i>
+	<td>Finds and marks (with tags) vector segments of sequence readings
+        stored as Experiment Files. Rapid and sensitive, and usually
+	used via <i>pregap4.</i><br>
+
+<tr align=left>
+<td><i><a href="%INDEX:vector_clip/Screen_seq%">screen_seq</a></i>
+	<td>Searches sequence readings stored as Experiment Files 
+	for matches against sets of possible contaminant
+	sequences. Typically used to look for E. coli or yeast
+	contamination. Very fast, and usually used via <i>pregap4.</i><br>
+
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-find_renz%">find_renz</a></i>
+	<td>Finds and marks (with tags) known repeat sequences (e.g. ALUs)
+        in sequence readings stored as Experiment Files.
+	Usually used via <i>pregap4.</i><br>
+
+</table>
+
+<h3>Trace viewing</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="manual/trev_toc.html">trev</a></i>
+	<td>A rapid and flexible viewer and editor for
+	 ABI, ALF or SCF trace files. Provides good
+	support for interaction with Experiment Files.<br>
+</table>
+
+<h2>Mutation detection</h2>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-tracediff%">tracediff</a></i>
+	<td>Automatically locates point mutations by comparing new traces
+        against those of a reference trace. Handles any number of files
+	in a single run and prepares results which can be viewed in <i>gap4</i>.
+<tr align=left>
+<td><i><a href="%INDEX:pregap4/Pregap4-Modules-Mutation Scanner%">mutscan</a></i>
+	<td>Used in conjunction with <i>tracediff</i> to search for
+	heterozygous positions and, where they coincide, to label
+	<i>tracediff</i> results appropriately.
+<tr align=left>
+<td><i><a href="manual/gap4_toc.html">gap4</a></i>
+	<td>For viewing aligned sequences and traces and checking automatic
+        mutation assignments. Can subtract traces and display their 
+        differences.<br>
+</table>
+
+
+<p><hr size=4>
+<h2>Sequence analysis</h2>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="manual/spin_toc.html">spin</a></i>
+
+	<td>A combination of the older nip4 and sip4 programs. Spin compares
+	pairs of sequences in many ways, often presenting
+        its results graphically. Has very rapid dot matrix analysis, 
+	global and local alignment, plus a sliding sequence window linked to
+	the graphical plots. Can compare nucleic acid against nucleic acid,
+	protein against protein, and protein against nucleic acid.
+	Analyses nucleotide sequences to find genes, restriction sites,
+        motifs, etc. Performs translations, finds open reading frames, counts
+	codons, etc.<br>
+
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-make_weights%">make_weights</a></i>
+	<td>Analyses a multiple alignment to produce a weight matrix for use
+	within spin.<br>
+
+<tr align=left>
+%UNIX%<td><i><a href="%INDEX:spin/SPIN-Intro-Menu-Emboss%">create_emboss_files</a></i>
+%UNIX%	<td>Creates the GUI interface files for Spin from emboss ACD
+%UNIX%  files. Only needs to be run once after a new EMBOSS release.<br>
+</table>
+
+<p><hr size=4>
+<h2>Sequence trace and reading file manipulation</h2>
+
+<h3>Any trace file</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-convert_trace%">convert_trace</a></i>
+	<td>Converts traces from any format to any format. Also handles
+	trace background subtraction and normalisation.<br>
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-get_comment%">get_comment</a></i>
+	<td>Extracts text from the comment fields from any trace
+	format. Replaces the get_scf_field program.<br>
+<tr align=left>
+<td><i>index_tar</i>
+	<td>Produces a text index from a <i>tar</i> file. Used for speeding up 
+	RAWDATA access within gap4.<br>
+</table>
+
+<h3>ABI files</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i>getABIstring</i>
+	<td>Displays arbitrary string fields from an ABI trace file.<br>
+
+<tr align=left>
+<td><i>getABIhex</i>
+	<td>Displays arbitrary fields from an ABI trace file as hex codes.<br>
+
+<tr align=left>
+<td><i>getABIraw</i>
+	<td>Displays arbitrary fields from an ABI trace file in the raw
+	format.<br>
+
+<tr align=left>
+<td><i>getABIcomment</i>
+	<td>Displays the comments from an ABI trace file. Equivalent to
+	<i>getABIstring CMNT</i>.<br>
+
+<tr align=left>
+<td><i>getABISampleName</i>
+	<td>Displays the sample name (reading name) stored in an ABI trace
+	file. Equivalent to <i>getABIstring SMPL</i><br>
+
+<tr align=left>
+<td><i>getABIdate</i>
+	<td>Displays the run date from an ABI trace file.<br>
+</table>
+
+<h3>ALF files</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i>alfsplit</i>
+	<td>Splits the Pharmacia ALF gel file into multiple files. This is
+	necessary before processing by <i>pregap4</i>.<br>
+</table>
+
+
+<h3>SCF files</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-makeSCF%">makeSCF</a></i>
+	<td>Converts existing trace files (whatever format) into SCF
+	files.<br>
+
+<tr align=left>
+<td><i>scf_info</i>
+	<td>Displays details stored in the header of an SCF file.<br>
+
+<tr align=left>
+<td><i>scf_dump</i>
+	<td>Displays the entire SCF file contents in a human readable
+	format.<br>
+
+<tr align=left>
+<td><i>scf_update</i>
+	<td>Converts between SCF file versions (2 to 3 and vice versa).<br>
+
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-get_scf_field%">get_scf_field</a></i>
+	<td>Extracts data from the SCF comment section.<br>
+
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-eba%">eba</a></i>
+	<td>Estimates the base accuracy of each base in an SCF file.
+</table>
+
+<h3>Gap4 database utilities</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+%UNIX%<tr align=left>
+%UNIX%<td><i><a href="%INDEX:gap4/Convert%">convert</a></i>
+%UNIX%	<td>Converts between the various assembly database formats.<br>
+
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-copy_db%">copy_db</a></i>
+	<td>Copies and garbage collects <i>gap4</i> databases.<br>
+
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-copy_reads%">copy_reads</a></i>
+	<td>Aligns two <i>gap4</i> databases and copies overlapping
+	sequences from one to the other.<br>
+
+</table>
+
+<h3>Other sequencing utilities</h3>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-extract_seq%">extract_seq</a></i>
+	<td>Extracts the sequence component from trace files or
+	experiment files.<br>
+
+<tr align=left>
+<td><i><a href="%INDEX:manpages/Man-init_exp%">init_exp</a></i>
+	<td>Extracts the sequence and related information from trace files
+	to output in Experiment File format.
+</table>
+
+
+<p><hr size=4>
+<h2>Scripting utilities</h2>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+<tr align=left>
+<td><i><a href="scripting_manual/scripting_toc.html">stash</a></i>
+	<td>General purpose scripting interface to Gap4 and Spin, which can
+	be used for producing graphical scripts and interfaces.<br>
+
+</table>
+
+<p><hr size=4>
+<h2>Misc</h2>
+
+<table border=3 cellpadding=4 cellspacing=1 width=95%>
+
+<tr align=left>
+<td><i>splitseq_da</i>
+	<td>Splits large sequences into a set of overlapping smaller
+	sequences. Outputs the sequences in Experiment File format
+	with attributes suitable for input using Directed Assembly.<br>
+
+</table>
+
+<p>
+<hr>
+<a href="index.html"><img src="i/nav_home.gif" alt="home"></a>
+</body>
+</html>
+
diff --git a/parse_template b/parse_template
new file mode 100755
index 0000000..9232ee5
--- /dev/null
+++ b/parse_template
@@ -0,0 +1,84 @@
+#!/bin/sh
+#\
+exec tclsh "$0" "$@"
+
+#
+# This program searches for embedded statements in text files and
+# replaces them with the appropriate HTML.
+# The replacement rules are:
+#
+# "%INDEX:"htmlmanual/index_name"%"
+#	expands to the local URL in the same way that "show_help" proc does.
+#
+# "%UNIX%"
+#	this line is for unix only
+#
+# "%WINDOWS%"
+#	this line is for windows only
+#
+#
+# For example, we may have:
+#    <td><i><a href="%INDEX:manpages/Man-makeSCF%">makeSCF</a></i>
+#
+# which will be replaced by:
+#    <td><i><a href="manual/manpages_unix_7.html">makeSCF</a></i>
+#
+
+#
+# Usage: parse_template [system] < a.html.template > a.html
+#
+# system defaults to unix, but may be specified as either "unix" or "windows".
+#
+
+
+proc find_topic {topic {sys unix}} {
+    regexp {%INDEX:(.*)/(.*)%} $topic dummy file topic
+
+    set file manual/${file}.index
+
+    # Convert from topic to section
+    if {[catch {set fd [open $file]} err]} {
+	puts stderr "Error: $err"
+	return ""
+    }
+    while {[gets $fd l] != -1} {
+	if {[string compare [lindex $l 0] $topic] == 0} {
+	    close $fd
+	    return manual/[lindex $l 1]
+	}
+    }
+    close $fd
+    puts stderr "Error: No topic $topic in $file"
+    return ""
+}
+
+
+if {$argc > 0} {
+    set sys [lindex $argv 0]
+} else {
+    set sys unix
+}
+
+set file [read stdin]
+
+#--- Replace %SYS%
+regsub -all {%SYS%} $file $sys file
+
+#--- Delete %UNIX% and %WINDOWS% lines
+if {$sys == "unix"} {
+    regsub -all "%UNIX%" $file {} file
+    regsub -all "\[^\n\]*%WINDOWS%\[^\n\]*\n" $file {} file
+} else {
+    regsub -all "\[^\n\]*%UNIX%\[^\n\]*\n" $file {} file
+    regsub -all "%WINDOWS%" $file {} file
+}
+
+#--- Replace %INDEX.*%
+# Terribly inefficient, but it works.
+while {[regexp {%INDEX:[^%]*%} $file topic] == 1} {
+    regsub {%INDEX:[^%]*%} $file [find_topic $topic $sys] file
+}
+
+puts $file
+
+exit
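
The %UNIX% and %WINDOWS% rules described in the parse_template header
comment can be sketched in C. This is an illustration only: parse_template
itself is a Tcl script, and the names filter_line and strip_marker are
hypothetical, not part of the Staden package. As in the script, any system
name other than "unix" is treated as windows.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: removes every occurrence of marker from line. */
static void strip_marker(char *line, const char *marker) {
    size_t mlen = strlen(marker);
    char *p;

    while ((p = strstr(line, marker)) != NULL)
        memmove(p, p + mlen, strlen(p + mlen) + 1);
}

/*
 * Hypothetical sketch of the platform rule for a single line.
 * Returns 0 if the line belongs to the other system and must be dropped.
 * Otherwise strips this system's marker in place and returns 1.
 */
int filter_line(char *line, const char *sys) {
    int is_unix;
    const char *keep, *drop;

    assert(line != NULL && sys != NULL);

    is_unix = (strcmp(sys, "unix") == 0);
    keep = is_unix ? "%UNIX%" : "%WINDOWS%";
    drop = is_unix ? "%WINDOWS%" : "%UNIX%";

    if (strstr(line, drop) != NULL)
        return 0;               /* line is for the other system */

    strip_marker(line, keep);   /* keep the line, lose the marker */
    return 1;
}
```

For the unix system, filter_line turns "%UNIX%<tr align=left>" into
"<tr align=left>" and rejects any line that contains %WINDOWS%.
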
diff --git a/scripting_manual/Makefile b/scripting_manual/Makefile
new file mode 100644
index 0000000..c7b0752
--- /dev/null
+++ b/scripting_manual/Makefile
@@ -0,0 +1,91 @@
+all: start
+
+#
+# Sorry if this Makefile doesn't work correctly regarding dependencies. GNU
+# make causes all sorts of headaches with its built-in rules (which I seem
+# unable to remove, even when using -d) and it has some quirky ideas as to
+# which files are created temporarily (and thus should be removed). It's
+# usually best to do 'gmake spotless all' or some such.
+#
+# However you should try "gmake depend" to keep the dependencies file up to
+# date as this does solve many (if not all) dependency problems.
+#
+# The input files are always .texi
+#
+# .texinfo files are expanded .texi files. They have the macros replaced
+# and have been processed by m4 to include Unix or Windows specific components.
+#
+# The _us.* files are US-letter format copies, otherwise we assume everything
+# is in A4 format.
+#
+
+# M4 preprocessor. Various buggy versions of this have caused problems in the
+# past, so you may need to redefine this. On Digital Unix 4.0E the system m4
+# does not work with our files. Certain versions (which?) of GNU m4 also fail,
+# but this has now been patched.
+M4=m4
+
+#-----------------------------------------------------------------------------
+# General rules
+
+#
+# M4 processed texinfo
+#
+%.texinfo:	%.texi
+	$(M4) -D_tex < $< > $@
+	../manual/tools/update-nodes $@
+
+%.htmlinfo:	%.texi
+	$(M4) -D_html < $< > $@
+	../manual/tools/update-nodes $@
+
+# How to build .dvi files from our m4-expanded .texinfo
+# files
+%.dvi:	%.texinfo
+	texi2dvi $<
+
+# US-letter PostScript from DVI
+%_us.ps:	%_us.dvi
+	dvips -t letter -o $@ $<
+
+# A4 PostScript from DVI
+%.ps:	%.dvi
+	dvips -t a4 -Ppdf -o $@ $<
+
+# PDF generation - directly from the texinfo file.
+%.pdf:	%.texinfo
+	texi2pdf $<
+
+
+# Converts an A4 formatted .texi file into a US-letter formatted file.
+%_us.texinfo: %.texinfo
+	egrep -v '^@afourpaper' < $< > $@
+
+# HTML files - built from an expanded .texinfo file with the -D_html m4 macro
+# defined. We need the *_toc.html and the index files.
+# For ease of browsing we create a separate html document for each of the main
+# programs. The htmlinfo version is identical to texinfo except with a few
+# tweaks to the cross-references (to allow cross-referencing between top-level
+# documents) and the addition of a _split() command (to request splitting an
+# html page at a specific point).
+%_toc.html:	%.htmlinfo
+	../manual/tools/texi2html -menu -verbose -split_chapter -index_chars $<
+
+#-----------------------------------------------------------------------------
+# The main make targets.
+
+clean:
+	-rm *.aux *.cp *.fn *.ky *.log *.pg *.toc *.tp *.vr *.cps *.fns *.pgs *.vrs
+	-rm core _tmp.texi _tmp.texi~ *.texinfo *.texinfo.tmp *.texinfo~
+	-rm *.htmlinfo *.htmlinfo~
+
+spotless:	clean
+	-rm *.dvi *.html *.info *.info-[0-9] *.index *.topic
+	-rm *.ps
+
+depend:
+	../manual/tools/make_dependencies > dependencies
+
+include dependencies
+
+start: scripting.texinfo scripting.pdf scripting_toc.html
diff --git a/scripting_manual/appendix-t.texi b/scripting_manual/appendix-t.texi
new file mode 100644
index 0000000..a30c90f
--- /dev/null
+++ b/scripting_manual/appendix-t.texi
@@ -0,0 +1,331 @@
+@cindex Composition source code
+
+Here are the main source components for the Gap4 composition extension.
+
+@menu
+* Appendix-Composition-Makefile::        Makefile
+* Appendix-Composition-composition.c::   composition.c
+* Appendix-Composition-composition.tcl:: composition.tcl
+@end menu
+
+@split{}
+@node Appendix-Composition-Makefile
+@appendixsec Makefile
+@cindex Composition Makefile
+@cindex Makefile, composition package
+
+@format
+@example
+# Makefile for the composition 'package' to add to gap4.
+
+LIBS = composition
+PROGS= $(LIBS)
+
+SRCROOT=$(STADENROOT)/src
+include $(SRCROOT)/mk/global.mk
+include $(SRCROOT)/mk/$(MACHINE).mk
+
+INSTALLDIR  = ./install
+
+INCLUDES_E += $(TCL_INC) $(TKUTILS_INC) $(GAP4_INC) $(G_INC)
+CFLAGS     += $(SHLIB_CFLAGS)
+
+TESTBIN     = $(O)
+L           = $(INSTALLDIR)/$(O)
+
+# Objects
+OBJS = \
+        $(TESTBIN)/composition.o
+
+DEPS = \
+        $(G_DEP) \
+        $(TKUTILS_DEP) \
+        $(TCL_DEP)
+
+#
+# Main dependency
+#
+$(LIBS) : $(L)/$(SHLIB_PREFIX)$(LIBS)$(SHLIB_SUFFIX)
+        @@
+
+$(L)/$(SHLIB_PREFIX)$(LIBS)$(SHLIB_SUFFIX): $(OBJS)
+        -mkdir $(INSTALLDIR)
+        -mkdir $(INSTALLDIR)/$(O)
+        $(SHLIB_LD) $(SHLIB_LDFLAGS) $@@ $(OBJS) $(DEPS)
+
+DEPEND_OBJ = $(OBJS)
+
+install: $(LIBS)
+        cp tclIndex composition.tcl compositionrc composition.topic \
+        composition.index composition.html $(INSTALLDIR)
+
+include dependencies
+@end example
+@end format
+
+@split{}
+@node Appendix-Composition-composition.c
+@appendixsec composition.c
+@cindex composition.c
+
+@format
+@example
+#include <tcl.h>
+
+#include "IO.h"                 /* GapIO */
+#include "gap_globals.h"        /* consensus/quality cutoffs */
+#include "qual.h"               /* calc_consensus() */
+#include "cli_arg.h"            /* cli_arg, parse_args() */
+
+static int tcl_composition(ClientData clientData, Tcl_Interp *interp,
+                           int argc, char **argv);
+static char *doit(GapIO *io, int contig, int lreg, int rreg);
+
+/*
+ * This is called when the library is dynamically linked in with the calling
+ * program. Use it to initialise any tables and to register the necessary
+ * commands.
+ */
+int Composition_Init(Tcl_Interp *interp) @{
+    if (NULL == Tcl_CreateCommand(interp,
+                                  "composition",
+                                  tcl_composition,
+                                  (ClientData) NULL,
+                                  (Tcl_CmdDeleteProc *) NULL))
+        return TCL_ERROR;
+
+    return TCL_OK;
+@}
+
+
+/*
+ * The composition itself.
+ * This is called with an argc and argv in much the same way that main()
+ * is. We can either parse them ourselves, or use the gap parse_args
+ * utility routine.
+ */
+static int tcl_composition(ClientData clientData, Tcl_Interp *interp,
+                           int argc, char **argv) @{
+    int num_contigs;
+    contig_list_t *contigs = NULL;
+    char *result;
+    int i;
+    Tcl_DString dstr;
+
+    /* A structure definition to store the arguments in */
+    typedef struct @{
+        GapIO *io;
+        char *ident;
+    @} test_args;
+
+    /* The mapping of the argument strings to our structure above */
+    test_args args;
+    cli_args a[] = @{
+        @{"-io",       ARG_IO,  1, NULL, offsetof(test_args, io)@},
+        @{"-contigs",  ARG_STR, 1, NULL, offsetof(test_args, ident)@},
+        @{NULL,      0,       0, NULL, 0@}
+    @};
+
+    /*
+     * First things first, add a header to the output window. This shows the
+     * date and function name.
+     */
+    vfuncheader("test command");
+
+    /* Parse the arguments */
+    if (-1 == gap_parse_args(a, &args, argc, argv)) @{
+        return TCL_ERROR;
+    @}
+
+    active_list_contigs(args.io, args.ident, &num_contigs, &contigs);
+    if (num_contigs == 0) @{
+        xfree(contigs);
+        return TCL_OK;
+    @}
+
+    /* Do the actual work */
+    Tcl_DStringInit(&dstr);
+    for (i = 0; i < num_contigs; i++) @{
+        result = doit(args.io, contigs[i].contig, contigs[i].start,
+                      contigs[i].end);
+        if (NULL == result) @{
+            Tcl_DStringFree(&dstr);
+            xfree(contigs);
+            return TCL_ERROR;
+        @}
+        Tcl_DStringAppendElement(&dstr, result);
+    @}
+
+    Tcl_DStringResult(interp, &dstr);
+
+    xfree(contigs);
+    return TCL_OK;
+@}
+
+/*
+ * Our main work horse. For something to do as an example we'll output
+ * the sequence composition of the contig in the given range.
+ */
+static char *doit(GapIO *io, int contig, int lreg, int rreg) @{
+    static char result[1024];
+    char *consensus;
+    int i, n[5];
+
+    if (0 == lreg && 0 == rreg) @{
+        rreg = io_clength(io, contig);
+        lreg = 1;
+    @}
+
+    if (NULL == (consensus = (char *)xmalloc(rreg-lreg+1)))
+        return NULL;
+
+    if (-1 == calc_consensus(contig, lreg, rreg, CON_SUM,
+                             consensus, NULL, NULL, NULL,
+                             consensus_cutoff, quality_cutoff,
+                             database_info, (void *)io)) @{
+        xfree(consensus);
+        return NULL;
+    @}
+
+    n[0] = n[1] = n[2] = n[3] = n[4] = 0;
+    for (i = 0; i <= rreg - lreg; i++) @{
+        switch(consensus[i]) @{
+        case 'a':
+        case 'A':
+            n[0]++;
+            break;
+
+        case 'c':
+        case 'C':
+            n[1]++;
+            break;
+
+        case 'g':
+        case 'G':
+            n[2]++;
+            break;
+
+        case 't':
+        case 'T':
+            n[3]++;
+            break;
+
+        default:
+            n[4]++;
+        @}
+    @}
+
+    /* Return the information */
+    sprintf(result, "%d %d %d %d %d %d",
+            rreg - lreg + 1, n[0], n[1], n[2], n[3], n[4]);
+
+    xfree(consensus);
+
+    return result;
+@}
+@end example
+@end format
+
+@split{}
+@node Appendix-Composition-composition.tcl
+@appendixsec composition.tcl
+@cindex composition.tcl
+
+@format
+@example
+# The main command procedure to bring up the dialogue
+proc Composition @{io@} @{
+    global composition_defs
+
+    # Create a dialogue window
+    set t [keylget composition_defs COMPOSITION.WIN]
+    if [winfo exists $t] @{
+        raise $t
+        return
+    @}
+    toplevel $t
+
+    # Add the standard contig selector dialogues
+    contig_id $t.id -io $io
+    lorf_in $t.infile [keylget composition_defs COMPOSITION.INFILE] \
+        "@{contig_id_configure $t.id -state disabled@}
+         @{contig_id_configure $t.id -state disabled@}
+         @{contig_id_configure $t.id -state disabled@}
+         @{contig_id_configure $t.id -state normal@}
+        " -bd 2 -relief groove
+
+    # Add the ok/cancel/help buttons
+    okcancelhelp $t.but \
+        -ok_command "Composition2 $io $t $t.id $t.infile" \
+        -cancel_command "destroy $t" \
+        -help_command "show_help %composition Composition"
+
+    pack $t.infile $t.id $t.but -side top -fill both
+@}
+
+# The actual gubbins. This can be either in straight Tcl, or using Tcl and
+# C. In this example, for efficiency, we'll do most of the work in C.
+proc Composition2 @{io t id infile@} @{
+    # Process the dialogue results:
+    if @{[lorf_in_get $infile] == 4@} @{
+        # Single contig
+        set name [contig_id_gel $id]
+        set lreg [contig_id_lreg $id]
+        set rreg [contig_id_rreg $id]
+        SetContigGlobals $io $name $lreg $rreg
+        set list "@{$name $lreg $rreg@}"
+    @} elseif @{[lorf_in_get $infile] == 3@} @{
+        # All contigs
+        set list [CreateAllContigList $io]
+    @} else @{
+        # List or File of contigs
+        set list [lorf_get_list $infile]
+    @}
+
+    # Remove the dialogue
+    destroy $t
+
+    # Do it!
+    SetBusy
+    set res [composition -io $io -contigs $list]
+    ClearBusy
+
+    # Format the output
+    set count 0
+    set tX 0
+    set tA 0
+    set tC 0
+    set tG 0
+    set tT 0
+    set tN 0
+    foreach i $res @{
+        vmessage "Contig [lindex [lindex $list $count] 0]"
+        incr count
+
+        set X [lindex $i 0]; incr tX $X
+        if @{$X <= 0@} continue;
+
+        set A [lindex $i 1]; incr tA $A
+        set C [lindex $i 2]; incr tC $C
+        set G [lindex $i 3]; incr tG $G
+        set T [lindex $i 4]; incr tT $T
+        set N [lindex $i 5]; incr tN $N
+        vmessage "  Length  [format %6d $X]"
+        vmessage "  No. As  [format @{%6d %5.2f%%@} $A [expr 100*$@{A@}./$X]]"
+        vmessage "  No. Cs  [format @{%6d %5.2f%%@} $C [expr 100*$@{C@}./$X]]"
+        vmessage "  No. Gs  [format @{%6d %5.2f%%@} $G [expr 100*$@{G@}./$X]]"
+        vmessage "  No. Ts  [format @{%6d %5.2f%%@} $T [expr 100*$@{T@}./$X]]"
+        vmessage "  No. Ns  [format @{%6d %5.2f%%@} $N [expr 100*$@{N@}./$X]]\n"
+    @}
+
+    if @{$count > 1@} @{
+        vmessage "Total length [format %6d $tX]"
+        vmessage "Total As     [format @{%6d %5.2f%%@} $tA [expr 100*$@{tA@}./$tX]]"
+        vmessage "Total Cs     [format @{%6d %5.2f%%@} $tC [expr 100*$@{tC@}./$tX]]"
+        vmessage "Total Gs     [format @{%6d %5.2f%%@} $tG [expr 100*$@{tG@}./$tX]]"
+        vmessage "Total Ts     [format @{%6d %5.2f%%@} $tT [expr 100*$@{tT@}./$tX]]"
+        vmessage "Total Ns     [format @{%6d %5.2f%%@} $tN [expr 100*$@{tN@}./$tX]]"
+    @}
+@}
+@end example
+@end format
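
The counting step in doit() does not depend on Gap4 and can be tried on its
own. The following is a minimal standalone sketch, assuming the consensus is
already available as a C string; count_bases and format_composition are
hypothetical names that do not appear in the Staden sources. The output
string uses the same "length A C G T other" layout that composition.tcl
unpacks with lindex.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: tallies A, C, G, T and "other" into n[0..4]. */
void count_bases(const char *seq, size_t len, int n[5]) {
    size_t i;

    n[0] = n[1] = n[2] = n[3] = n[4] = 0;
    for (i = 0; i < len; i++) {
        switch (seq[i]) {
        case 'a': case 'A': n[0]++; break;
        case 'c': case 'C': n[1]++; break;
        case 'g': case 'G': n[2]++; break;
        case 't': case 'T': n[3]++; break;
        default:            n[4]++;     /* pads, Ns, ambiguity codes */
        }
    }
}

/*
 * Hypothetical helper: writes "length A C G T other" into out.
 * Returns the number of characters written, or -1 if out is too small.
 */
int format_composition(const char *seq, char *out, size_t outlen) {
    int n[5];
    int written;
    size_t len;

    assert(seq != NULL && out != NULL);

    len = strlen(seq);
    count_bases(seq, len, n);
    written = snprintf(out, outlen, "%lu %d %d %d %d %d",
                       (unsigned long)len, n[0], n[1], n[2], n[3], n[4]);
    return (written < 0 || (size_t)written >= outlen) ? -1 : written;
}
```

Unlike doit(), this sketch writes into a caller-supplied buffer rather than
a static one, so the result survives a second call.
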
diff --git a/scripting_manual/dependencies b/scripting_manual/dependencies
new file mode 100644
index 0000000..36908a2
--- /dev/null
+++ b/scripting_manual/dependencies
@@ -0,0 +1,21 @@
+gap4-cio-t.texi: gap4-cio-intro-t.texi
+gap4-cio-t.texi: gap4-cio-compile-t.texi
+gap4-cio-t.texi: gap4-cio-database-t.texi
+gap4-cio-t.texi: gap4-cio-gapio-t.texi
+gap4-cio-t.texi: gap4-cio-IO.h-t.texi
+gap4-cio-t.texi: gap4-cio-basic-t.texi
+gap4-cio-t.texi: gap4-cio-other-t.texi
+gap4-t.texi: gap4-scripting-intro-t.texi
+gap4-t.texi: gap4-scripting-io-t.texi
+gap4-t.texi: gap4-scripting-util-t.texi
+gap4-t.texi: gap4-scripting-comm-t.texi
+gap4-t.texi: gap4-editor-t.texi
+scripting.texi: preface-t.texi
+scripting.texi: tkutils-t.texi
+scripting.texi: gap4-t.texi
+scripting.texi: gap4-cio-t.texi
+scripting.texi: gap4-cedit-t.texi
+scripting.texi: gap4-canno-t.texi
+scripting.texi: gap4-registration-t.texi
+scripting.texi: extension-t.texi
+scripting.texi: appendix-t.texi
diff --git a/scripting_manual/extension-t.texi b/scripting_manual/extension-t.texi
new file mode 100644
index 0000000..13bc403
--- /dev/null
+++ b/scripting_manual/extension-t.texi
@@ -0,0 +1,732 @@
+@cindex Extensions, writing
+@cindex Packages, writing
+@cindex Modules, writing
+@cindex Plugins, writing
+@cindex Writing packages
+@cindex Composition, package
+
+@menu
+* Pkg-Command::         Creating a New Tcl Command
+* Pkg-GUI::             Adding a GUI to the Command
+* Pkg-Config::          Creating the Config File
+* Pkg-Help::            Writing the Online Help
+* Pkg-Wrappings::       Wrapping it all up
+@end menu
+
+An important feature of the newer Tcl/Tk based applications is the ability to
+write extensions to directly add new functionality. These typically add new
+commands to the main menus.
+
+The best method of explaining the process of creating an extension is to work
+through an example. Here we supply the full code for a Gap4 extension to count
+base composition. The same techniques will apply to writing extensions for
+other programs and we will point out the Gap4 specific components.
+The example is somewhat simplistic, but hopefully will explain the framework
+needed to write a more complex package.
+
+The complete sources for the composition package can be found in the
+@file{src/composition} directory.
+
+@c ---------------------------------------------------------------------------
+@split{}
+@node Pkg-Command
+@section Creating a New Tcl Command
+
+@menu
+* Pkg-Command-Reg::     Registering the Command
+* Pkg-Command-Parse::   Parsing the Arguments
+* Pkg-Command-Return::  Returning a Result
+* Pkg-Command-Code::    Writing the Code Itself
+@end menu
+
+In general, for speed we wish our main algorithm to be written in C. Tcl is an
+interpreted language and runs very much slower than compiled C. As Tcl
+provides a method to extend the language with our own commands we will create
+a new command, which in this case is to be named "composition".
+
+@c ---------------------------------------------------------------------------
+@split{}
+@node Pkg-Command-Reg
+@subsection Registering the Command
+@findex Tcl_CreateCommand(C)
+@cindex Registering a command
+@cindex Command registration
+@findex Composition_Init(C)
+@cindex Tcl commands, creating
+@cindex Creating Tcl commands
+@cindex Composition, command registration
+
+Firstly we need to tell the Tcl interpreter which Tcl command should call
+which C function. We do this using the @code{Tcl_CreateCommand} function. This
+is typically called within the package initialisation routine. For a package
+named @code{composition} this is the @code{Composition_Init} routine.
+
+@example
+/*
+ * This is called when the library is dynamically linked in with the calling
+ * program. Use it to initialise any tables and to register the necessary
+ * commands.
+ */
+int Composition_Init(Tcl_Interp *interp) @{
+    if (NULL == Tcl_CreateCommand(interp,
+                                  "composition",
+                                  tcl_composition,
+                                  (ClientData) NULL,
+                                  (Tcl_CmdDeleteProc *) NULL)) @{
+        return TCL_ERROR;
+    @}
+
+    return TCL_OK;
+@}
+ at end example
+
+In the above example we are saying that the Tcl command '@code{composition}'
+should call the C function '@code{tcl_composition}'. If we wished to call the
+C function with a specific argument that is known at the time of this
+initialisation then we would specify it in the @code{ClientData} argument
+(@code{NULL} in this example). The full information on using
+ at code{Tcl_CreateCommand} is available in the Tcl manual pages.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Pkg-Command-Parse
+ at subsection Parsing the Arguments
+ at cindex Argument parsing
+ at cindex Option parsing
+ at cindex Parsing arguments
+ at cindex Command line arguments
+ at cindex cli_arg.h
+ at cindex Composition, argument parsing
+ at findex tcl_composition(C)
+
+Our policy is to have a simple function to parse the command line arguments
+passed from Tcl. This should massage the arguments into a format usable by a
+separate (from Tcl) C or Fortran function which does the actual work. This
+clearly separates out the Tcl interface from the algorithms. The parsing will
+be done in the function registered with the Tcl interpreter. In our example
+this is @code{tcl_composition}.
+
+The latest Tcl/Tk release provides functions for easing the parsing of command
+line arguments. In the future we @i{may} switch to using this scheme, but at
+present we use (and document) our own methods. A quick overview of this is
+that we declare a structure to hold the argument results, a structure to
+define the available command line parameters, and then call the
+ at code{parse_args} or @code{gap_parse_args} function. Note that it is entirely
+up to the author of the package code how the arguments should be processed.
+
+Firstly we need to include the @file{cli_arg.h} file. Secondly declare a
+structure containing the argument results. The structure does not need to be
+referenced outside of this file and so need not be in a public header file.
+Next we need a structure of type @code{cli_args[]} to specify the mapping of
+command line argument strings to argument result addresses. The
+ at code{cli_args} structure is defined as follows.
+
+ at vindex cli_args
+ at example
+typedef struct @{
+    char *command;      /* What to recognise, including the '-' symbol */
+    int type;           /* ARG_??? */
+    int value;          /* Set if this argument takes an argument */
+    char *def;          /* NULL if non optional argument */
+    int offset;         /* Offset into the 'result' structure */
+@} cli_args;
+ at end example
+
+ at vindex command, cli_args field
+ at var{Command} is a text string holding the option name, such as "-file".
+The last entry in the argument array should have a @var{command} of
+ at code{NULL}.
+
+ at vindex value, cli_args field
+ at var{Value} is either 0 or 1 to indicate whether an extra argument is
+required after the command line option. A value of 1 indicates that an extra
+argument is needed.
+
+ at vindex type, cli_args field
+ at var{Type} specifies the type of this extra argument. It can be one of
+ at code{ARG_INT}, @code{ARG_STR}, @code{ARG_ARR}, @code{ARG_FLOAT} and (for Gap4
+only) @code{ARG_IO} to represent types of @code{int}, @code{char *},
+ at code{char []}, @code{float} and @code{GapIO *}. An option with no extra
+argument must have the type of @code{ARG_INT} as in this case the stored value
+will be 0 or 1 to indicate whether the option was specified.
+
+Of the above types, @code{ARG_ARR} requires a better description. Options of
+this type are character arrays where the option argument is copied into the
+array. The @var{value} field for this type only specifies the length of the
+array. Finally the @code{offsetofa} macro instead of the @code{offsetof} macro
+(see below) must be used for the @var{offset} structure field. This type will
+possibly be removed in the future in favour of keeping @code{ARG_STR}. For
+ at code{ARG_STR} the result is a character pointer which is set to the option
+argument. This requires no bounds checking and can use the standard
+ at code{offsetof} macro.
+
+ at vindex def, cli_args field
+ at var{Def} specifies the default value for this option. If the option takes no
+extra argument or if it takes an extra argument and no default is suitable,
+then @code{NULL} should be used. Otherwise @code{def} is a text string, even
+in the case of @code{ARG_INT} in which case it will be converted to integer if
+needed.
+
+ at vindex offset, cli_args field
+ at findex offsetof(C)
+ at findex offsetofa(C)
+ at var{Offset} specifies the location within the results structure to store the
+result. The @code{offsetof} macro can be used to find this location. An
+exception to this is the @code{ARG_ARR} type where the @code{offsetofa}
+macro needs to be used instead (with the same syntax).
+
+For our composition package we will have the following two structures.
+
+ at example
+typedef struct @{
+    GapIO *io;
+    char *ident;
+@} test_args;
+
+test_args args;
+cli_args a[] = @{
+    @{"-io",       ARG_IO,  1, NULL, offsetof(test_args, io)@},
+    @{"-contigs",  ARG_STR, 1, NULL, offsetof(test_args, ident)@},
+    @{NULL,        0,       0, NULL, 0@}
+@};
+ at end example
+
+So we have two command line options, -io and -contigs, both of which take
+extra arguments. These are stored in @code{args.io} and @code{args.ident}
+respectively. The last line indicates the end of the argument list.
+
+ at findex parse_args(C)
+ at findex gap_parse_args(C)
+Once we've defined the structures we can actually process the arguments.
+This is done using either @code{parse_args} or
+ at code{gap_parse_args}. The latter of these two is for Gap4 only and is the
+only one which understands the @code{ARG_IO} type. The functions take four
+arguments which are the address of the @code{cli_args[]} array, the address
+of the result structure, and the @code{argc} and @code{argv} variables. The
+functions return -1 for an error and 0 for success.
+
+ at example
+    if (-1 == gap_parse_args(a, &args, argc, argv)) @{
+        return TCL_ERROR;
+    @}
+ at end example
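+
+The offset-driven scheme described above can be sketched without Tcl or Gap4.
+The following is a minimal illustration only and is not the Staden
+implementation: the @code{sk_} names and the @code{SK_ARG_} constants are
+hypothetical, only integer and string options are handled, and options without
+a default are simply left untouched rather than reported as missing.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the ARG_??? constants; not the real values. */
enum { SK_ARG_INT = 1, SK_ARG_STR = 2 };

/* Same field layout as the cli_args structure described in the text. */
typedef struct {
    const char *command;  /* option name including '-'; NULL ends the table */
    int type;             /* SK_ARG_INT or SK_ARG_STR */
    int value;            /* 1 if the option takes an extra argument */
    const char *def;      /* default as text, or NULL if there is none */
    size_t offset;        /* offset into the result structure */
} sk_cli_args;

/* Store one textual value 'offset' bytes into the result structure. */
static void sk_store(void *store, const sk_cli_args *a, const char *text)
{
    char *base = (char *)store + a->offset;

    if (a->type == SK_ARG_INT)
        *(int *)base = atoi(text);
    else
        *(const char **)base = text;
}

/* Returns 0 for success, -1 for an unknown option or a missing argument. */
static int sk_parse_args(const sk_cli_args *args, void *store,
                         int argc, char **argv)
{
    const sk_cli_args *a;
    int i;

    assert(args != NULL && store != NULL);

    /* Apply the textual defaults first. */
    for (a = args; a->command; a++)
        if (a->def)
            sk_store(store, a, a->def);

    /* argv[0] is the command name, so options start at argv[1]. */
    for (i = 1; i < argc; i++) {
        for (a = args; a->command; a++)
            if (strcmp(a->command, argv[i]) == 0)
                break;
        if (!a->command)
            return -1;                /* unknown option */

        if (!a->value) {
            sk_store(store, a, "1");  /* flag: record that it was given */
        } else {
            if (++i >= argc)
                return -1;            /* the extra argument is missing */
            sk_store(store, a, argv[i]);
        }
    }
    return 0;
}

/* A result structure and option table in the style of the example. */
typedef struct {
    int verbose;
    const char *ident;
} sk_result;

static const sk_cli_args sk_table[] = {
    {"-verbose", SK_ARG_INT, 0, "0",  offsetof(sk_result, verbose)},
    {"-contigs", SK_ARG_STR, 1, NULL, offsetof(sk_result, ident)},
    {NULL,       0,          0, NULL, 0}
};
```
+
+Because every destination is described by an offset, the parser itself never
+needs to know the layout of the result structure. Adding an option is a matter
+of adding a field and a table row.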
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Pkg-Command-Return
+ at subsection Returning a Result
+ at cindex Results, returning to Tcl
+ at cindex Tcl results
+ at cindex Returning results to Tcl
+ at cindex Pitfalls, in setting Tcl results
+ at findex Tcl_AppendResult(C)
+ at findex Tcl_SetResult(C)
+ at findex vTcl_SetResult(C)
+ at findex Tcl_ResetResult(C)
+ at findex Tcl_DStringResult(C)
+ at vindex interp->result(C)
+ at cindex Composition, returning result
+
+To return a result to Tcl the @var{interp->result} variable needs to be set.
+This can be done in a variety of ways including setting the result manually or
+using a function such as @code{Tcl_SetResult}, @code{Tcl_AppendResult},
+ at code{Tcl_ResetResult} or @code{Tcl_DStringResult}.
+
+However the choice of which to use is not as obvious as may first appear. A
+cautionary tale will illustrate some of the easy pitfalls. The following
+points are not made sufficiently clear in John Ousterhout's Tcl and Tk book.
+Additionally the problems are real and have been observed in the development
+of Gap4.
+
+Consider the case where we have many commands registered with the interpreter.
+One such example could be:
+
+ at example
+int example(ClientData clientData, Tcl_Interp *interp, int argc, char **argv)
+@{
+    /* ... */
+
+    sprintf(interp->result, "%d", some_c_func());
+    return TCL_OK;
+@}
+ at end example
+
+Now deep within @code{some_c_func} we have a @code{Tcl_Eval} call which
+happens to end with something like the following:
+
+ at example
+proc some_tcl_func @{@} @{
+    # ...
+
+    set fred jim
+@}
+ at end example
+
+Due to the call of @code{Tcl_Eval} in @code{some_c_func} the
+ at var{interp->result} is now set to the last returned result, which is from the
+ at code{set} command. In the above example @var{interp->result} points to 'jim'.
+The @code{sprintf} command in the @code{example} function will overwrite this
+string and hence change the value of the @var{fred} Tcl variable. This causes
+confusion and in some cases may also cause memory corruption where data is
+incorrectly freed.
+
+The moral of this tale is to be extremely wary. As there is no knowledge of
+what @code{some_c_func} does (and remember it may get updated later) we seem
+to be trapped. One possible solution is to rewrite the @code{example} function as
+follows.
+
+ at example
+int example(ClientData clientData, Tcl_Interp *interp, int argc, char **argv)
+@{
+    int ret;
+    /* ... */
+
+    ret = some_c_func();
+    Tcl_ResetResult(interp);
+    sprintf(interp->result, "%d", ret);
+    return TCL_OK;
+@}
+ at end example
+
+This leads to another pitfall. If we have '@code{sprintf(interp->result, "%d",
+some_c_func(interp));}' and @code{some_c_func} calls (possibly indirectly) the
+ at code{Tcl_ResetResult} function then we'll be modifying the
+ at var{interp->result} address. This leads to undefined execution of code. (Is
+ at code{sprintf} passed the original or final @var{interp->result} pointer?)
+
+Therefore I'm inclined to think that we should never use
+ at code{Tcl_ResetResult} except immediately before a modification of
+ at var{interp->result} in a separate C statement. My personal recommendation is
+to never write directly to @var{interp->result}. Additionally never reset
+ at var{interp->result} to a new string unless @var{interp->freeProc} is also
+updated correctly. In preference, use @code{Tcl_SetResult}.
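+
+The ordering rule can be demonstrated without Tcl. The sketch below uses a
+hypothetical @code{fake_interp} structure standing in for the relevant part of
+the interpreter, and hypothetical @code{fake_} functions; it only illustrates
+why the call, the reset and the write belong in separate statements, and is not
+a recommendation to write to the result directly.
+
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the part of the interpreter discussed here. */
typedef struct {
    char *result;         /* where the current result string lives */
    char static_buf[32];  /* default storage for short results */
    char other_buf[32];   /* storage that some other command switched to */
} fake_interp;

/* Mimics a reset: point result back at the default storage and empty it. */
static void fake_reset_result(fake_interp *ip)
{
    assert(ip != NULL);
    ip->result = ip->static_buf;
    ip->static_buf[0] = '\0';
}

/* A worker that, as a side effect, leaves result pointing elsewhere. */
static int fake_some_c_func(fake_interp *ip)
{
    strcpy(ip->other_buf, "jim");
    ip->result = ip->other_buf;
    return 42;
}

/* Safe ordering: finish the call, reset, then write, one statement each. */
static void fake_example(fake_interp *ip)
{
    int ret = fake_some_c_func(ip);

    fake_reset_result(ip);
    snprintf(ip->result, sizeof ip->static_buf, "%d", ret);
}
```
+
+With this ordering the string owned by the other command is left alone and the
+destination pointer is read only after it has settled.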
+
+The @code{Tcl_SetResult} function should always work fine, however it does not
+take @code{printf} style arguments. We have implemented a
+ at code{vTcl_SetResult} which takes an @var{interp} argument and the standard
+ at code{printf} format and additional arguments. For instance we would rewrite
+the example function as follows.
+
+ at example
+int example(ClientData clientData, Tcl_Interp *interp, int argc, char **argv)
+@{
+    int ret;
+    /* ... */
+
+    vTcl_SetResult(interp, "%d", some_c_func());
+    return TCL_OK;
+@}
+ at end example
+
+As a final note on @code{vTcl_SetResult}: the current implementation only
+allows strings up to 8192 bytes. This should be easy to remedy if it causes
+problems for other developers.
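+
+A @code{printf} style wrapper with a fixed size limit can be built on the
+standard @code{vsnprintf} function. The sketch below is not the
+@code{vTcl_SetResult} source; @code{sk_format_result} and
+@code{SK_RESULT_MAX} are hypothetical names, and the buffer is supplied by the
+caller rather than handed to an interpreter.
+
```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define SK_RESULT_MAX 8192   /* the limit quoted in the text */

/*
 * Format into 'dst', which must hold SK_RESULT_MAX bytes.
 * Returns 0 on success, or -1 if formatting failed or the text was truncated.
 */
static int sk_format_result(char *dst, const char *fmt, ...)
{
    va_list ap;
    int len;

    assert(dst != NULL && fmt != NULL);

    va_start(ap, fmt);
    len = vsnprintf(dst, SK_RESULT_MAX, fmt, ap);
    va_end(ap);

    return (len < 0 || len >= SK_RESULT_MAX) ? -1 : 0;
}
```
+
+Reporting truncation to the caller, rather than silently cutting the string,
+is one way of making the size limit visible when it is hit.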
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Pkg-Command-Code
+ at subsection Writing the Code Itself
+ at cindex Composition, algorithm
+
+The final C code itself is obviously completely different for each extension.
+
+In the example composition package we loop through each contig listed in our
+ at code{-contigs} command line argument running a separate function that returns
+a Tcl list containing the total number of characters processed and the number
+of A, C, G, T and unknown nucleotides. Each list in turn is then added as an
+item to another list which is used for the final result.
+
+ at example
+    /* Do the actual work */
+    Tcl_DStringInit(&dstr);
+    for (i = 0; i < num_contigs; i++) @{
+        result = doit(args.io, contigs[i].contig, contigs[i].start,
+                      contigs[i].end);
+        if (NULL == result) @{
+            xfree(contigs);
+            return TCL_ERROR;
+        @}
+
+        Tcl_DStringAppendElement(&dstr, result);
+    @}
+
+    Tcl_DStringResult(interp, &dstr);
+
+    xfree(contigs);
+    return TCL_OK;
+@}
+ at end example
+
+The above is the end of the @code{tcl_composition} function. @code{doit} is
+our main algorithm written in C (which has no knowledge of Tcl). We use the
+Tcl dynamic strings routines to build up the final return value. The complete
+C code for this package can be found in the appendices.
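+
+As an illustration of the kind of Tcl-free worker function meant here, the
+following sketch counts bases in a plain C string and formats the totals as a
+space separated list. It is not the package's @code{doit} function, which reads
+from a Gap4 database; the name @code{sk_composition} and its arguments are
+hypothetical.
+
```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the bases in the NUL-terminated string 'seq' and write
 * "total A C G T unknown" into 'out'.
 * Returns 0 on success, or -1 if 'out' is too small.
 */
static int sk_composition(const char *seq, char *out, size_t out_size)
{
    unsigned long a = 0, c = 0, g = 0, t = 0, unknown = 0;
    size_t len, i;
    int n;

    assert(seq != NULL && out != NULL);
    len = strlen(seq);

    for (i = 0; i < len; i++) {
        switch (toupper((unsigned char)seq[i])) {
        case 'A': a++; break;
        case 'C': c++; break;
        case 'G': g++; break;
        case 'T': t++; break;
        default:  unknown++; break;   /* pads, N, ambiguity codes, ... */
        }
    }

    n = snprintf(out, out_size, "%lu %lu %lu %lu %lu %lu",
                 (unsigned long)len, a, c, g, t, unknown);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```
+
+A string of integers separated by single spaces is already a well formed Tcl
+list, so a result in this form can be appended as one element of an enclosing
+list.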
+
+If a command has persistent data about a contig (such as a plot containing the
+composition) the registration scheme should be used to keep this data up to
+date whenever database edits are made. _oxref(Registration, Gap4 Contig
+Registration Scheme).
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Pkg-GUI
+ at section Adding a GUI to the Command
+ at cindex Composition, GUI
+ at cindex Composition, interface
+ at cindex GUI for commands
+ at cindex Graphical User Interface for commands
+
+ at menu
+* Pkg-GUI-Dialogue::            The Dialogue Creation
+* Pkg-GUI-Callback::            Calling the New Command
+* Pkg-GUI-tclIndex::            The tclIndex file
+ at end menu
+
+Now we've defined a new Tcl command to perform the real guts of our example
+package we need to add Tk dialogues to provide a graphical interface to the
+user. This will typically be split into two main parts; the construction of
+the dialogues and the 'OK' callback procedure.
+
+ at split{}
+ at node Pkg-GUI-Dialogue
+ at subsection The Dialogue Creation
+ at cindex Dialogue creation
+ at findex Composition(T)
+
+Firstly, we need to create the dialogue. This is done using both standard Tk
+commands and extra widgets defined in the tk_utils package. For the
+composition package the dialogue procedure is as follows.
+
+ at example
+proc Composition @{io@} @{
+    global composition_defs
+
+    # Create a dialogue window
+    set t [keylget composition_defs COMPOSITION.WIN]
+    if [winfo exists $t] @{
+        raise $t
+        return
+    @}
+    toplevel $t
+
+    # Add the standard contig selector dialogues
+    contig_id $t.id -io $io
+    lorf_in $t.infile [keylget composition_defs COMPOSITION.INFILE] \
+        "@{contig_id_configure $t.id -state disabled@}
+         @{contig_id_configure $t.id -state disabled@}
+         @{contig_id_configure $t.id -state disabled@}
+         @{contig_id_configure $t.id -state normal@}
+        " -bd 2 -relief groove
+
+    # Add the ok/cancel/help buttons
+    okcancelhelp $t.but \
+        -ok_command "Composition2 $io $t $t.id $t.infile" \
+        -cancel_command "destroy $t" \
+        -help_command "show_help %composition Composition"
+
+    pack $t.infile $t.id $t.but -side top -fill both
+@}
+ at end example
+
+Firstly we define the procedure name. In this case we'll call it
+ at code{Composition}. It takes a single argument which is the IO handle of an
+opened Gap4 database.
+
+Next we need to create a new window. We've stored the Tk pathname of this
+window in the @code{COMPOSITION.WIN} keyed list value in the defaults for this
+package. As our package is called @var{composition} the defaults are
+ at var{composition_defs}. We define them as global and use @code{keylget} to
+fetch the window pathname. It is wise to check that the dialogue window
+doesn't already exist before attempting to create a new one. This could happen
+if the user selects the option from the main menu twice without closing down
+the first dialogue window.
+
+Then the real dialogue components are added. In this case these consist of
+ at code{contig_id}, @code{lorf_in} and @code{okcancelhelp} widgets. These are
+explained (FIXME: will be...) in the tk_utils and gap4 chapters. Note that the
+ at code{okcancelhelp} command requires three Tcl scripts to execute when each of
+the Ok, Cancel and Help buttons are pressed.
+
+For the Ok button we call the @code{Composition2} procedure with the widget
+pathnames containing the user's selections. The Cancel button is easy as we
+simply need to destroy the dialogue window. The Help button will call the
+ at code{show_help} command to display the appropriate documentation. More on
+this later.
+
+ at split{}
+ at node Pkg-GUI-Callback
+ at subsection Calling the New Command
+ at findex lorf_in(T)
+ at findex lorf_in_get(T)
+ at findex lorf_get_list(T)
+ at findex contig_id(T)
+ at findex CreateAllContigList(T)
+ at findex SetContigGlobals(T)
+ at findex Composition2(T)
+
+Once the Ok callback from the @code{okcancelhelp} widget in the main dialogue
+has been executed we need to process any options the user has changed within
+the dialogue and pass these on to the main algorithms.
+
+For the extension widget we set the OK callback to execute a
+ at code{Composition2} procedure. This starts as follows.
+
+ at example
+# The actual gubbins. This can be either in straight Tcl, or using Tcl and
+# C. In this example, for efficiency, we'll do most of the work in C.
+proc Composition2 @{io t id infile@} @{
+    # Process the dialogue results:
+    if @{[lorf_in_get $infile] == 4@} @{
+        # Single contig
+        set name [contig_id_gel $id]
+        set lreg [contig_id_lreg $id]
+        set rreg [contig_id_rreg $id]
+        SetContigGlobals $io $name $lreg $rreg
+        set list "@{$name $lreg $rreg@}"
+    @} elseif @{[lorf_in_get $infile] == 3@} @{
+        # All contigs
+        set list [CreateAllContigList $io]
+    @} else @{
+        # List or File of contigs
+        set list [lorf_get_list $infile]
+    @}
+
+    # Remove the dialogue
+    destroy $t
+
+    # Do it!
+    SetBusy
+    set res [composition -io $io -contigs $list]
+    ClearBusy
+ at end example
+
+For this Gap4 command we have used the @code{lorf_in} widget to let the user
+select operations for a single contig, all contigs, a list of contigs, or a
+file of contigs. We firstly process this to build up the appropriate values to
+send to the @code{-contigs} option of the @code{composition} Tcl command. The
+processes involved here are explained in the @code{lorf_in} widget
+documentation. (FIXME: to write).
+
+Next we remove the dialogue window, enable the busy mode to grey out other
+menu items, and execute the command itself saving its result in the Tcl
+ at var{res} variable.
+
+The procedure then continues by stepping through the @var{res} variable using
+Tcl list and formatting commands to output to the main text window with the
+ at code{vmessage} command. The complete code for this can be found in the
+appendices.
+
+ at split{}
+ at node Pkg-GUI-tclIndex
+ at subsection The tclIndex file
+ at cindex tclIndex file
+
+One final requirement before the Tcl dialogue is complete is to create the
+ at file{tclIndex} file.
+
+Tcl uses a method whereby Tcl files are only loaded and executed when a
+command is first needed. This is done by referencing the @var{auto_index} array in
+the Tcl error handler. This handler requires the @file{tclIndex} files to
+determine the location of each command. Failing to create this file will cause
+Tcl to complain that a command does not exist.
+
+To create a @file{tclIndex} file start up either @code{stash} or @code{tclsh}
+and type '@code{auto_mkindex} @i{dir}' where @i{dir} is the name of the
+directory (often simply ".") containing the Tcl files. For the composition
+package this created the following @file{tclIndex} file.
+
+ at example
+# Tcl autoload index file, version 2.0
+# This file is generated by the "auto_mkindex" command
+# and sourced to set up indexing information for one or
+# more commands.  Typically each line is a command that
+# sets an element in the auto_index array, where the
+# element name is the name of a command and the value is
+# a script that loads the command.
+
+set auto_index(Composition) [list source [file join $dir composition.tcl]]
+set auto_index(Composition2) [list source [file join $dir composition.tcl]]
+ at end example
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Pkg-Config
+ at section Creating the Config File
+ at cindex Config file for packages
+ at cindex Package config file
+
+The package needs to have a config file - an @file{rc} file. For the
+composition package this will be named @file{compositionrc}. The file contains 
+package dependencies, menu commands, and any user adjustable defaults.
+
+For the composition package we do not need any dependencies. The package
+depends on gap4 and tk_utils, but both of these are already loaded. If we did
+need to use an additional package, or simply an additional dynamic library,
+then we could add further @code{load_package} commands to the start of the
+file.
+
+Next we define the menu items. We could add an entirely new menu if the
+package defines many additional commands. In this example we'll simply add an
+extra command onto the standard Gap4 View menu.
+
+ at example
+ at group
+# We want to add to the View menu a new command named "List Composition".
+# This will call our Composition procedure with the contents of the global
+# $io variable (used for accessing the gap database).
+#
+# The command itself should be greyed out when the database is not open or
+# is empty.
+
+add_command     @{View.List Composition@}       8 10 @{Composition \$io@}
+ at end group
+ at end example
+
+This specifies that the @code{Composition $io} command is to be added to the
+View menu as 'List Composition'. It will be enabled only when the database is
+open and has data (8) and is disabled during busy modes and when the database
+has no data or is not open (10).
+
+Next we add any defaults. For the composition package this is simply the
+dialogue values for the composition command.
+
+ at example
+# Now for the default values required by the composition command. Some of
+# these are the sort of things that will be configured by users (eg the
+# default cutoff score in a search routine) by creating their own .rc file
+# (.compositionrc in this case). Others are values used entirely by the
+# package itself. In our case that's all we've got.
+
+set_defx defs_c_in      WHICH.NAME      "Input contigs from"
+set_defx defs_c_in      WHICH.BUTTONS   @{list file @{@{all contigs@}@} single@}
+set_defx defs_c_in      WHICH.VALUE     3
+set_defx defs_c_in      NAME.NAME       "List or file name"
+set_defx defs_c_in      NAME.BROWSE     "browse"
+set_defx defs_c_in      NAME.VALUE      ""
+
+set_def COMPOSITION.WIN .composition
+set_def COMPOSITION.INFILE $defs_c_in
+ at end example
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Pkg-Help
+ at section Writing the Online Help
+ at cindex Help, writing
+ at cindex Writing help
+ at cindex Online help
+ at cindex Texinfo
+ at cindex composition.topic
+ at cindex composition.index
+ at cindex composition.html
+ at cindex HTML help files
+ at cindex Topic help files
+ at cindex Index help files
+
+The online help (including this) and printed manual for our programs are
+written using Texinfo.  However due to the usage of pictures (which aren't
+supported by Texinfo) we've made several modifications to the documentation
+system. We have modified makeinfo and texi2html scripts too. Consequently the
+system we use for documentation is not ready for public usage.
+
+However the final files needed for online usage by the applications can be
+produced by any system capable of creating HTML files and our own @file{.index}
+and @file{.topic} files.
+
+The principal method of bringing up help from a package is to use the
+ at code{show_help} command. For the composition widget we used the following.
+
+ at example
+show_help %composition Composition
+ at end example
+
+The @code{%composition} indicates that the @code{show_help} command should
+read the @file{composition.topic} and @file{composition.index} files. These
+are normally read from the @file{$STADENROOT/manual/} directory, but by
+preceding the name with a percent sign we can direct the @code{show_help}
+command to search for these files in the composition package directory.
+
+The last argument of @code{show_help} is the topic to display. In this case it
+is @code{Composition}. If the topic includes spaces then remember to use the
+Tcl quoting mechanism. The topic file is then scanned to find any line with
+this topic as the first 'word'. The second 'word' contains the index name. The
+index name is then looked up in the index file (as the first word) to find the
+URL (the second word). This two stage lookup is designed to protect against
+renaming section headings in the documentation. The index file can be easily
+created by parsing the html files to generate a mapping of heading names to
+URLs. However if the documentation changes we do not wish to need to change
+the Tcl calls to @code{show_help}.
+
+ at example
+ at i{composition.topic file:}
+
+    @{Composition@} @{Composition@}
+
+
+ at i{composition.index file:}
+
+    @{Composition@} composition.html
+ at end example
+
+For the composition package we have very simple topic and index files. The
+index and topic names are identical, so the topic file is trivial. The index
+file contains a single line mapping the @code{Composition} index entry to the
+ at code{composition.html} file. If a named tag within the html file is needed
+then the URL would be @code{composition.html#tagname}. The html file itself is
+held within the same directory as the topic and index files.
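+
+The two stage lookup can be sketched with in-memory tables. This is a minimal
+illustration of the idea only: the @code{sk_} names are hypothetical, the
+tables are hard coded rather than read from the @file{.topic} and
+@file{.index} files, and it is not the @code{show_help} implementation.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One "first word -> second word" line from a topic or index file. */
typedef struct {
    const char *key;
    const char *value;
} sk_pair;

/* Return the value paired with 'key', or NULL if there is none. */
static const char *sk_lookup(const sk_pair *table, size_t n, const char *key)
{
    size_t i;

    assert(table != NULL && key != NULL);
    for (i = 0; i < n; i++)
        if (strcmp(table[i].key, key) == 0)
            return table[i].value;
    return NULL;
}

/* Topic -> index name -> URL; NULL if either stage fails. */
static const char *sk_topic_to_url(const sk_pair *topics, size_t ntopics,
                                   const sk_pair *index, size_t nindex,
                                   const char *topic)
{
    const char *name = sk_lookup(topics, ntopics, topic);

    return name ? sk_lookup(index, nindex, name) : NULL;
}

/* The contents of the two composition files shown in the text. */
static const sk_pair sk_topics[] = { {"Composition", "Composition"} };
static const sk_pair sk_index[]  = { {"Composition", "composition.html"} };
```
+
+If a section heading is renamed, only the second column of the topic table
+changes. The topic string used by the calling code stays the same.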
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Pkg-Wrappings
+ at section Wrapping it all up
+
+We've now got all the code that we need to build a complete package. If this
+package is to be kept separate from the main Staden Package installation tree
+then we need to build our own directory tree for the package.
+
+For example, we'll create a separate directory for the composition package
+named @file{/home/spackages}. Within this directory we should place
+the rc file (@file{compositionrc}), the documentation
+(@file{composition.topic}, @file{composition.index} and
+ at file{composition.html}) and the Tcl files @file{composition.tcl} and
+ at file{tclIndex}.
+
+Additionally we need to have a dynamic library containing the C command. This
+should be placed in @file{/home/spackages/MACHINE-binaries/} where
+ at file{MACHINE} is the machine type (eg @code{alpha}, @code{solaris},
+ at code{sun}, @code{sgi}, @code{linux} or @code{windows}). The library will
+probably be named something like @file{libcomposition.so}.
+
+The actual compilation of the library is complicated due to each machine type
+having different linker options. The full description of the Makefile system
+is beyond the scope of this documentation, but in brief, the system works by
+having a single @file{Makefile} for the package, a @file{global.mk} file in
+the @file{$STADENROOT/src/mk} directory containing general definitions, and a
+system specific (eg @file{alpha.mk}) file also in @file{$STADENROOT/src/mk}
+defining system architecture specific definitions. These combine to allow
+system independent macros to be used for building dynamic libraries.
+The complete composition package Makefile is in the appendices.
+
+Once the package has been installed correctly an @code{ls -R} on the
+installation directory should look something like the following.
+
+ at example
+alpha-binaries/     composition.index   composition.topic   tclIndex
+composition.html    composition.tcl     compositionrc
+
+./alpha-binaries:
+libcomposition.so   so_locations
+ at end example
+
+Note that packages for multiple architectures may share the same installation
+tree as each architecture will need only its own @file{MACHINE-binaries}
+directory.
+
+The final requirement is to add the package onto gap4. This is done by adding
+the following to the user's @file{.gaprc} file (where
+ at file{/installation/directory/} is the location containing the files listed
+above).
+
+ at example
+load_package /installation/directory/composition
+ at end example
diff --git a/scripting_manual/gap4-canno-t.texi b/scripting_manual/gap4-canno-t.texi
new file mode 100644
index 0000000..dcc4d26
--- /dev/null
+++ b/scripting_manual/gap4-canno-t.texi
@@ -0,0 +1,521 @@
+NOTE: The terms @var{annotation} and @var{tag} are freely interchangeable.
+Their varying use simply reflects the evolution of the code.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node shift_contig_tags
+ at section shift_contig_tags
+ at vindex shift_contig_tags(C)
+ at cindex consensus tags, shifting
+ at cindex tags, shifting in consensus
+ at cindex annotations, shifting in consensus
+
+ at example
+#include <tagUtils.h>
+
+void shift_contig_tags(
+        GapIO  *io,
+        int     contig,
+        int     posn,
+        int     dist);
+ at end example
+
+This function moves tags within a contig with number @var{contig}. All tags
+starting at position @var{posn}, or to the right of @var{posn} are moved to
+the right by @var{dist} bases. @var{dist} should not be a negative value.
+
+The function is used internally by (for example) algorithms to add pads to the
+consensus or for joining contigs.
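+
+The effect on tag positions can be sketched on a plain array. The real function
+works on tag lists stored in the Gap4 database; the @code{sk_tag} structure and
+the @code{sk_shift_tags} function below are hypothetical and only model the
+position arithmetic.
+
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory tag: a start position and a length in bases. */
typedef struct {
    int position;
    int length;
} sk_tag;

/* Move every tag starting at or to the right of posn right by dist bases. */
static void sk_shift_tags(sk_tag *tags, size_t ntags, int posn, int dist)
{
    size_t i;

    assert(dist >= 0);   /* negative distances are not supported */
    for (i = 0; i < ntags; i++)
        if (tags[i].position >= posn)
            tags[i].position += dist;
}
```
+
+Tags that start to the left of @var{posn} keep both their position and their
+length in this sketch, even if they extend across @var{posn}.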
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node merge_contig_tags
+ at section merge_contig_tags
+ at vindex merge_contig_tags(C)
+ at cindex tags, merging in consensus
+ at cindex annotations, merging in consensus
+
+ at example
+#include <tagUtils.h>
+
+void merge_contig_tags(
+        GapIO  *io,
+        int     contig1,
+        int     contig2,
+        int     off);
+ at end example
+
+This function is used to join a tag list from one contig to a tag list from
+another contig. All the tags in contig number @var{contig1} are added to
+contig number @var{contig2}. Each tag is moved by @var{off} bases when it is
+copied. The tag list is correctly maintained as a sorted list. At the end of the
+function, the tags list for contig number @var{contig2} is set to 0.
+
+The main purpose of this function is for use when joining contigs.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node complement_contig_tags
+ at section complement_contig_tags
+ at vindex complement_contig_tags(C)
+ at cindex consensus tags, complementing
+ at cindex tags, complementing
+ at cindex annotations, complementing
+
+ at example
+#include <tagUtils.h>
+
+void complement_contig_tags(
+        GapIO  *io,
+        int     contig);
+ at end example
+
+This function complements the positions and orientations of each tag on the
+consensus sequence for contig number @var{contig}. The tags on the readings
+are not modified as these are always kept in their original orientation.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node split_contig_tags
+ at section split_contig_tags
+ at vindex split_contig_tags(C)
+ at cindex consensus tags, splitting in two
+ at cindex tags, splitting lists
+ at cindex annotations, splitting lists
+
+ at example
+#include <tagUtils.h>
+
+void split_contig_tags(
+        GapIO  *io,
+        int     cont1,
+        int     cont2,
+        int     posl,
+        int     posr);
+ at end example
+
+This function is called by the break contig algorithm and has little, if any,
+other use. When we're splitting a contig in half we need to move the
+annotations too. Annotations that overlap the two contigs are duplicated.
+Annotations that overlap the end of a contig have their lengths and positions
+corrected.
+
+ at var{posl} and @var{posr} hold the overlap region of contigs @var{cont1} and
+ at var{cont2} before splitting. At the time of calling this routine, @var{cont2}
+has just been created (and has no tags). Both contigs have their lengths set
+correctly, but all of the tags are still in @var{cont1}. This function
+corrects these tag locations.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node remove_contig_tags
+ at section remove_contig_tags
+ at vindex remove_contig_tags(C)
+ at cindex consensus tags, removing
+ at cindex tags, removing from consensus
+ at cindex annotations, removing from consensus
+
+ at example
+#include <tagUtils.h>
+
+void remove_contig_tags(
+        GapIO  *io,
+        int     contig,
+        int     posl,
+        int     posr);
+ at end example
+
+This function removes annotations over the region defined as @var{posl} to
+ at var{posr} from the consensus for contig number @var{contig}. Passing
+ at var{posl} and @var{posr} as zero implies the entire consensus. This uses the
+ at code{rmanno} function for the main portion of the work.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node remove_gel_tags
+ at section remove_gel_tags
+ at vindex remove_gel_tags(C)
+ at cindex reading tags, removing
+ at cindex tags, removing from readings
+ at cindex annotations, removing from readings
+
+ at example
+#include <tagUtils.h>
+
+void remove_gel_tags(
+        GapIO  *io,
+        int     gel,
+        int     posl,
+        int     posr);
+ at end example
+
+This function removes annotations over the region defined as @var{posl} to
+ at var{posr} from the reading numbered @var{gel}. Passing @var{posl} and
+ at var{posr} as zero implies the entire reading. This uses the @code{rmanno}
+function for the main portion of the work.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node rmanno
+ at section rmanno
+ at vindex rmanno(C)
+ at cindex tags, removing
+ at cindex annotations, removing
+
+ at example
+#include <tagUtils.h>
+
+int rmanno(
+        GapIO  *io,
+        int     anno,
+        int     lpos,
+        int     rpos);
+ at end example
+
+This function removes annotations in a specified region from an annotation
+list. The annotation list starts at annotation number @var{anno}. The new list
+head (which will change if we delete the first annotation) is returned. The
+region to remove annotations over is between base numbers @var{lpos} and
+ at var{rpos} inclusive. Note that annotations overlapping this region, but not
+contained entirely within it, will have either their position or length
+modified, or may need splitting in two. (Consider the case where a single tag
+spans the entire region to see where splitting is necessary.)
+
+On success the function returns the new annotation number forming the head of
+the annotation list. Otherwise it returns 0.
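+
+The effect of a removal region on a single annotation can be sketched as
+follows. This is a minimal in-memory model under stated assumptions, not the
+Gap4 implementation: the @code{span_t} type and the @code{cut_span} function
+are hypothetical names used for illustration only, and no database records
+are read or written.
+

```c
/* Hypothetical stand-in for an annotation: first base and length. */
typedef struct {
    int position;
    int length;
} span_t;

/*
 * Remove the inclusive region lpos..rpos from the tag 'in'.
 * The surviving pieces are written to out[0..1]. Returns the number of
 * pieces: 0 (tag lies entirely inside the region), 1 (tag trimmed or
 * untouched) or 2 (tag spans the whole region and is split in two).
 */
int cut_span(span_t in, int lpos, int rpos, span_t out[2]) {
    int last = in.position + in.length - 1;   /* last base covered */
    int n = 0;

    if (last < lpos || in.position > rpos) {  /* no overlap: keep as is */
        out[n++] = in;
        return n;
    }
    if (in.position < lpos) {                 /* piece left of the region */
        out[n].position = in.position;
        out[n].length   = lpos - in.position;
        n++;
    }
    if (last > rpos) {                        /* piece right of the region */
        out[n].position = rpos + 1;
        out[n].length   = last - rpos;
        n++;
    }
    return n;
}
```

+A return value of 2 corresponds to the splitting case described above, where
+one tag spans the entire region.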
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node tag2values
+ at section tag2values
+ at vindex tag2values(C)
+ at cindex tags, converting from string to values
+ at cindex annotations, converting from string to values
+ at cindex conversion of tag formats
+
+ at example
+#include <tagUtils.h>
+
+int tag2values(
+        char   *tag,
+        char   *type,
+        int    *start,
+        int    *end,
+        int    *strand,
+        char   *comment);
+ at end example
+
+This function converts a tag in string format to a tag represented by a series
+of separate integer/string values. It performs the opposite task to the
+ at code{values2tag} function.
+
+The tag string format is as used in the experiment file @code{TG} lines. The
+format is "@var{TYPE}<space>@var{S}<space>@var{start}..@var{end}" followed by
+zero or more comment lines, each starting with a newline character. @var{TYPE}
+is the tag type, which must be 4 characters, and @var{S} is the strand; one of
+"@code{+}", "@code{-}" or "@code{b}" (both).
+
+The tag string is passed as the @var{tag} argument. This is then expanded into
+the @var{type}, @var{start}, @var{end}, @var{strand} and @var{comment} values.
+The comment must have been allocated beforehand (@code{strlen(tag)} will
+always be large enough). If no comment was found then @var{comment} is set to
+be an empty string. @var{type} should be allocated to be 5 bytes long.
+
+The function returns 0 for success, -1 for failure.
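+
+The string format described above can be parsed along the following lines.
+This is only a sketch: @code{parse_tag} is a hypothetical name, not the
+library routine, and the mapping of the strand character to the integers 0,
+1 and 2 is an assumption borrowed from the @var{sense} values documented for
+the @code{insert_NEW_tag} function.
+

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical parser for "TYPE S start..end[\ncomment]".
 * 'type' must hold 5 bytes; 'comment' must hold at least strlen(tag) bytes.
 * Returns 0 for success, -1 for failure.
 */
int parse_tag(const char *tag, char *type, int *start, int *end,
              int *strand, char *comment) {
    char s;
    int used = 0;   /* characters consumed by the header part */

    if (sscanf(tag, "%4s %c %d..%d%n", type, &s, start, end, &used) != 4)
        return -1;
    if (strlen(type) != 4)          /* the tag type must be 4 characters */
        return -1;

    switch (s) {                    /* assumed integer encoding of strand */
    case '+': *strand = 0; break;
    case '-': *strand = 1; break;
    case 'b': *strand = 2; break;
    default:  return -1;
    }

    if (tag[used] == '\n')          /* comment lines follow a newline */
        strcpy(comment, tag + used + 1);
    else
        comment[0] = '\0';          /* no comment found */
    return 0;
}
```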
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node values2tag
+ at section values2tag
+ at vindex values2tag(C)
+ at cindex tags, converting from values to string
+ at cindex annotations, converting from values to string
+ at cindex conversion of tag formats
+
+ at example
+#include <tagUtils.h>
+
+int values2tag(
+        char   *tag,
+        char   *type,
+        int     start,
+        int     end,
+        int     strand,
+        char   *comment);
+ at end example
+
+This function converts a tag represented by a series of separate
+integer/string values to a single string of the format used by the experiment
+file @code{TG} line type. It performs the opposite task to the
+ at code{tag2values} function.
+
+For the format of the tag string please see _ref(tag2values, tag2values).
+
+The @var{type}, @var{start}, @var{end}, @var{strand} and @var{comment}
+parameters contain the current tag details. @var{comment} must be specified
+even when no comment exists, but can be specified as a blank string in this
+case. @var{tag} is expected to have been allocated already and no bounds
+checks are performed. A safe size for allocation is @code{strlen(comment)+30}.
+
+The function returns 0 for success, -1 for failure.
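+
+Because no bounds checks are performed, the caller is responsible for sizing
+the output buffer. The sketch below shows one cautious way to do this. The
+name @code{format_tag} is hypothetical, the strand encoding (0, 1, 2 for
+"@code{+}", "@code{-}", "@code{b}") is an assumption, and the use of
+@code{snprintf} is an extra safeguard added here rather than something the
+library is documented to do.
+

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical formatter producing "TYPE S start..end[\ncomment]".
 * Returns a malloc'd string that the caller must free, or NULL on failure.
 */
char *format_tag(const char *type, int start, int end, int strand,
                 const char *comment) {
    static const char strand_ch[] = "+-b";   /* assumed encoding 0, 1, 2 */
    size_t size = strlen(comment) + 30;      /* allocation size suggested above */
    char *tag;

    if (strand < 0 || strand > 2 || strlen(type) != 4)
        return NULL;
    if ((tag = malloc(size)) == NULL)
        return NULL;

    if (*comment)   /* comment lines start with a newline character */
        snprintf(tag, size, "%s %c %d..%d\n%s",
                 type, strand_ch[strand], start, end, comment);
    else
        snprintf(tag, size, "%s %c %d..%d",
                 type, strand_ch[strand], start, end);
    return tag;
}
```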
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node rmanno_list
+ at section rmanno_list
+ at vindex rmanno_list(C)
+ at cindex tags, removing
+ at cindex annotations, removing
+
+ at example
+#include <tagUtils.h>
+
+int rmanno_list(
+        GapIO  *io,
+        int     anno_ac,
+        int    *anno_av);
+ at end example
+
+This function removes a list of annotations from the database. The annotation
+lists for readings and contigs are also updated accordingly. The annotation
+numbers to remove are held in an array named @var{anno_av} with @var{anno_ac}
+elements.
+
+This function returns 0 for success, -1 for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node insert_NEW_tag
+ at section insert_NEW_tag
+ at vindex insert_NEW_tag(C)
+ at cindex tag, creation of
+ at cindex annotations, creation of
+
+ at example
+#include <tagUtils.h>
+
+void insert_NEW_tag(
+        GapIO  *io,
+        int     N,
+        int     pos,
+        int     length,
+        char   *type,
+        char   *comment,
+        int     sense);
+ at end example
+
+This function adds a new tag to the database. If @var{N} is positive, the tag
+is added to reading number @var{N}, otherwise it is added to contig number
+ at var{-N}. The reading and contig annotation lists are updated accordingly.
+
+The @var{pos}, @var{length}, @var{type}, @var{comment} and @var{sense}
+arguments specify the position, length, type (a 4 character string), comment
+and orientation of the tag to create. @var{comment} may be @code{NULL}.
+ at var{sense} should be one of 0 for forward, 1 for reverse and 2 for both.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node create_tag_for_gel
+ at section create_tag_for_gel
+ at vindex create_tag_for_gel(C)
+ at cindex tag, creation of
+ at cindex annotations, creation of
+
+ at example
+#include <tagUtils.h>
+
+void create_tag_for_gel(
+        GapIO  *io,
+        int     gel,
+        int     gellen,
+        char   *tag);
+ at end example
+
+This function is a textual analogue of the @code{insert_NEW_tag} function
+(which it uses). The function creates a new tag for a reading. The
+ at var{gel} argument should contain the reading number and @var{gellen} the
+reading length. The tag to create is passed as the @var{tag} argument which is
+in the same format as taken by the @var{tag2values} function.
+_oxref(tag2values, tag2values).
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node ctagget
+ at section ctagget and vtagget
+ at vindex ctagget(C)
+ at vindex vtagget(C)
+ at cindex tag, searching for
+ at cindex annotations, searching for
+
+ at example
+#include <tagUtils.h>
+
+GAnnotations *ctagget(
+        GapIO  *io,
+        int     gel,
+        char   *type);
+
+GAnnotations *vtagget(
+        GapIO  *io,
+        int     gel,
+        int     num_t,
+        char  **type);
+ at end example
+
+These functions provide a mechanism for iterating around all the available tags
+of particular types on a given reading or contig number. The @code{ctagget}
+function searches for a single tag type, passed in @var{type} as a 4 byte
+string. The @code{vtagget} function searches for a set of tag types, passed as
+an array of @var{num_t} 4 byte strings.
+
+To use the functions, call them with a non zero @var{gel} number and the tag
+type(s). The function will return a pointer to a @var{GAnnotations} structure
+containing the first tag on this reading or contig of this type. If none are
+found, @code{NULL} is returned.
+
+To find the next tag on this reading or contig, of the same type, call the
+function with @var{gel} set to 0. To find all the tags of this type, keep
+repeating this until @code{NULL} is returned.
+
+Returns a @var{GAnnotations} pointer for success, @code{NULL} for "not found",
+and @code{(GAnnotations *)-1} for failure. The annotation pointer returned is
+valid until the next call of the function.
+
+For example, the following function prints information on all vector tags for
+a given reading.
+
+ at example
+void print_tags(GapIO *io, int rnum) @{
+    char *types[] = @{"SVEC", "CVEC"@};
+    GAnnotations *a;
+
+    /* A non zero reading number starts a new search */
+    a = vtagget(io, rnum, sizeof(types)/sizeof(*types), types);
+
+    while (a && a != (GAnnotations *)-1) @{
+        printf("position %d, length %d\n",
+            a->position, a->length);
+
+        a = vtagget(io, 0, sizeof(types)/sizeof(*types), types);
+    @}
+@}
+ at end example
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node tag_shift_for_insert
+ at section tag_shift_for_insert
+ at vindex tag_shift_for_insert(C)
+ at cindex tag, insertion within
+ at cindex annotations, insertion within
+ at cindex inserting into tags
+
+ at example
+#include <tagUtils.h>
+
+void tag_shift_for_insert(
+        GapIO  *io,
+        int     N,
+        int     pos);
+ at end example
+
+This function shifts or extends tags by a single base. The purpose is to
+handle cases where we need to insert into a sequence. An edit at position
+ at var{pos} will mean moving every tag to the right of this one base rightwards.
+A tag that spans position @var{pos} will have its length increased by one.
+If @var{N} is positive it specifies the reading number to operate on,
+otherwise it specifies the contig number (negated).
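+
+The shift-or-extend rule can be illustrated with a small in-memory model.
+This is a sketch only: @code{tag_model_t} and @code{shift_for_insert} are
+hypothetical names, the tags live in a plain array rather than in the
+database, and the treatment of a tag that starts exactly at @var{pos} (it is
+moved rather than extended) is an assumption.
+

```c
/* Hypothetical stand-in for a tag: first base and length. */
typedef struct {
    int position;
    int length;
} tag_model_t;

/* Adjust every tag for a single base inserted at position pos. */
void shift_for_insert(tag_model_t *tags, int ntags, int pos) {
    int i;

    for (i = 0; i < ntags; i++) {
        int last = tags[i].position + tags[i].length - 1;

        if (tags[i].position >= pos)
            tags[i].position++;     /* at or right of pos: move rightwards */
        else if (last >= pos)
            tags[i].length++;       /* spans pos: grows by one base */
        /* tags entirely to the left of pos are left alone */
    }
}
```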
+
+NOTE: This function @strong{does not} work correctly for complemented
+readings. It is planned to fix this problem by creating a new function that
+operates in a more intelligent fashion. To work around this problem, logic
+similar to the following needs to be used.
+
+ at example
+    /*
+     * Adjust tags
+     * NOTE: Must always traverse reading in reverse of original sense
+     */
+    if (complemented) @{
+        for(i=j=0; i < gel_len; i++) @{
+            if (orig_seq[i] != padded_seq[j]) @{
+                tag_shift_for_insert(io, gel_num, length-j);
+            @} else
+                j++;
+        @}
+    @} else @{
+        for(i=j=gel_len-1; i >= 0; i--) @{
+            if (orig_seq[i] != padded_seq[j]) @{
+                tag_shift_for_insert(io, gel_num, j+1);
+            @} else
+                j--;
+        @}
+    @}
+ at end example
+
+In the above example @var{padded_seq} is a padded copy of @var{orig_seq}. The
+function calls @code{tag_shift_for_insert} for each pad. Note that the order
+of the insertions is important and differs depending on whether the reading is
+complemented or not.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node tag_shift_for_delete
+ at section tag_shift_for_delete
+ at vindex tag_shift_for_delete(C)
+ at cindex tag, deletion within
+ at cindex annotations, deletion within
+ at cindex deleting into tags
+
+ at example
+#include <tagUtils.h>
+
+void tag_shift_for_delete(
+        GapIO  *io,
+        int     N,
+        int     pos);
+ at end example
+
+This function shifts or shrinks tags by a single base. The purpose is to
+handle cases where we need to delete a base within a sequence. A deletion at
+position @var{pos} will mean moving every tag to the right of this position
+one base leftwards.  A tag that spans position @var{pos} will have its length
+decreased by one.  If @var{N} is positive it specifies the reading number to
+operate on, otherwise it specifies the contig number (negated).
+
+NOTE: This function @strong{does not} work correctly for complemented
+readings. Also, it does not remove the tag when a deletion shrinks its size
+to 0. It is planned to fix these problems by creating a new function that
+operates in a more intelligent fashion. To work around these problems, use logic
+similar to the example in @code{tag_shift_for_insert}.
+_oxref(tag_shift_for_insert, tag_shift_for_insert).
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node type2str
+ at section type2str and str2type
+ at vindex type2str(C)
+ at vindex str2type(C)
+ at cindex tags, type conversion
+ at cindex annotations, type conversion
+
+ at example
+#include <tagUtils.h>
+
+int str2type(
+        char   *stype);
+
+void type2str(
+        int     itype,
+        char    stype[5]);
+ at end example
+
+Note that these two functions are in fact #defines. The prototypes are listed
+simply to guide their correct usage.
+
+ at code{str2type} converts a 4 character tag type, pointed to by @var{stype}
+into an integer value as used in the @var{GAnnotations.type} field.
+
+ at code{type2str} converts an integer type passed as @var{itype} to a 4
+character (plus 1 nul) string.
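+
+One plausible way to implement such a conversion is to pack the four
+characters into the bytes of an integer. The sketch below assumes this
+packing with the first character in the most significant byte; the real
+macros may order the bytes differently. The names @code{pack_type} and
+@code{unpack_type} are hypothetical.
+

```c
/* Pack a 4 character tag type into an integer (assumed byte order). */
int pack_type(const char *stype) {
    unsigned int v = 0;
    int i;

    for (i = 0; i < 4; i++)
        v = (v << 8) | (unsigned char)stype[i];
    return (int)v;
}

/* Unpack an integer type into a 4 character string plus terminating nul. */
void unpack_type(int itype, char stype[5]) {
    unsigned int v = (unsigned int)itype;
    int i;

    for (i = 3; i >= 0; i--) {
        stype[i] = (char)(v & 0xff);
        v >>= 8;
    }
    stype[4] = '\0';
}
```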
diff --git a/scripting_manual/gap4-cedit-t.texi b/scripting_manual/gap4-cedit-t.texi
new file mode 100644
index 0000000..612a938
--- /dev/null
+++ b/scripting_manual/gap4-cedit-t.texi
@@ -0,0 +1,382 @@
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_complement_seq
+ at section io_complement_seq
+ at vindex io_complement_seq(C)
+ at cindex complementing
+
+ at example
+#include <IO.h>
+
+int io_complement_seq(
+        int2   *length,
+        int2   *start,
+        int2   *end,
+        char   *seq,
+        int1   *conf,
+        int2   *opos);
+ at end example
+
+This function complements a sequence held in memory. No database I/O is
+performed.  A sequence of length @var{*length} is passed in the @var{seq}
+argument with associated confidence values (@var{conf}) and original positions
+(@var{opos}) arrays. The @var{start} and @var{end} arguments contain the left
+and right cutoff points within this sequence.
+
+The function will reverse and complement the sequence, negate the @var{start}
+and @var{end} values, and reverse the @var{conf} and @var{opos} arrays. If
+either of @var{conf} or @var{opos} is passed as @code{NULL}, neither will be
+reversed. @var{length} is not modified, despite the fact that it is passed by
+reference.
+
+The function returns 0 for success.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_insert_seq
+ at section io_insert_seq
+ at vindex io_insert_seq(C)
+
+ at example
+#include <IO.h>
+
+int io_insert_seq(
+        int2    maxgel,
+        int2   *length,
+        int2   *start,
+        int2   *end,
+        char   *seq,
+        int1   *conf,
+        int2   *opos,
+        int2    pos,
+        char   *bases,
+        int1   *newconf,
+        int2   *newopos,
+        int2    Nbases);
+ at end example
+
+ at code{io_insert_seq} inserts one or more bases into the sequence, confidence
+and original positions arrays specified. No database I/O is performed.
+
+The existing sequence, confidence values, and original positions arrays are
+passed as @var{seq}, @var{conf}, and @var{opos} arguments. All are mandatory.
+The length of sequence and hence the number of used elements in these arrays
+is passed as @var{length}, with @var{start} and @var{end} containing the left
+and right cutoff positions.
+
+The new sequence, confidence values, and original positions to insert are
+passed as @var{bases}, @var{newconf} and @var{newopos}. The number of bases to
+insert is @var{Nbases}. Either or both of the @var{newconf} and @var{newopos}
+arguments may be NULL. The inserted confidence values will then default to 100
+for non pad ("@code{*}") bases. For pads, the confidence value defaults to the
+average of the confidence values of the first two neighbouring bases that are
+not pads. The inserted original positions default to 0. These bases are to be
+inserted at the position specified by @var{pos}, counting as position 1 being
+to the left of the first base in the sequence.
+
+As this operation increases the size of the @var{seq}, @var{conf}, and
+ at var{opos} arrays, their allocated size must be passed in @var{maxgel}. If the
+insertion causes data to be shuffled beyond @var{maxgel}, the right end of
+the sequence is clipped to ensure that no more than @var{maxgel} bases are
+present. The @var{start} and @var{end} values may be incremented, depending on
+where the insertion occurs.
+
+This function returns 0 for success.
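+
+The clipping behaviour can be sketched for the sequence array alone. This is
+an illustrative model under stated assumptions, not the library code:
+@code{insert_bases} is a hypothetical name, plain @code{int} replaces
+@code{int2}, the confidence and original position arrays are omitted, and
+the -1 return for out of range arguments is an addition of this sketch.
+

```c
#include <string.h>

/*
 * Insert nbases characters so that the first lands at 1-based position pos.
 * 'seq' holds *length used bases in an array of maxgel elements. Anything
 * pushed beyond maxgel is clipped from the right end.
 */
int insert_bases(int maxgel, int *length, char *seq,
                 int pos, const char *bases, int nbases) {
    int tail;

    if (pos < 1 || pos > *length + 1 || nbases < 0 || *length > maxgel)
        return -1;

    if (pos - 1 + nbases > maxgel)          /* clip the inserted bases */
        nbases = maxgel - (pos - 1);

    tail = *length - (pos - 1);             /* bases from pos to the end */
    if (pos - 1 + nbases + tail > maxgel)   /* clip the shuffled bases */
        tail = maxgel - (pos - 1) - nbases;

    memmove(seq + pos - 1 + nbases, seq + pos - 1, (size_t)tail);
    memcpy(seq + pos - 1, bases, (size_t)nbases);
    *length = pos - 1 + nbases + tail;
    return 0;
}
```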
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_delete_seq
+ at section io_delete_seq
+ at vindex io_delete_seq(C)
+
+ at example
+#include <IO.h>
+
+int io_delete_seq(
+        int2    maxgel,
+        int2   *length,
+        int2   *start,
+        int2   *end,
+        char   *seq,
+        int1   *conf,
+        int2   *opos,
+        int2    pos,
+        int2    Nbases);
+ at end example
+
+ at code{io_delete_seq} removes one or more bases from the sequence, confidence
+and original positions arrays specified. No database I/O is performed.
+
+The existing sequence, confidence values, and original positions arrays are
+passed as @var{seq}, @var{conf}, and @var{opos} arguments. All are mandatory.
+The length of sequence and hence the number of used elements in these arrays
+is passed as @var{length}, with @var{start} and @var{end} containing the left
+and right cutoff positions. The allocated size of these arrays is
+ at var{maxgel}, however it is not required by this function (FIXME).
+
+The @var{pos} and @var{Nbases} arguments specify where and how many bases to
+delete, counting with the first base as base number 1. The @var{length}
+argument is decremented by @var{Nbases} and the @var{seq}, @var{conf} and
+ at var{opos} arrays shuffled accordingly. The @var{start} and @var{end} values
+may be decremented, depending on where the deletion occurs.
+
+The function returns 0 for success.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_replace_seq
+ at section io_replace_seq
+ at vindex io_replace_seq(C)
+
+ at example
+#include <IO.h>
+
+int io_replace_seq(
+        int2    maxgel,
+        int2   *length,
+        int2   *start,
+        int2   *end,
+        char   *seq,
+        int1   *conf,
+        int2   *opos,
+        int2    pos,
+        char   *bases,
+        int1   *newconf,
+        int2   *newopos,
+        int2    Nbases,
+        int     diff_only,
+        int     conf_only);
+ at end example
+
+ at code{io_replace_seq} replaces one or more bases in the sequence, confidence
+and original positions arrays specified. No database I/O is performed.
+
+The existing sequence, confidence values, and original positions arrays are
+passed as @var{seq}, @var{conf}, and @var{opos} arguments. All are mandatory.
+The length of sequence and hence the number of used elements in these arrays
+is passed as @var{length}, with @var{start} and @var{end} containing the left
+and right cutoff positions. The allocated size of these arrays is
+ at var{maxgel}. FIXME: it is used - does it need to be?
+
+The new sequence, confidence values, and original positions to replace are
+passed as @var{bases}, @var{newconf} and @var{newopos}. The number of bases to
+replace is @var{Nbases}. Either or both of the @var{newconf} and @var{newopos}
+arguments may be NULL. The replaced confidence values will then default to 100
+for non pad ("@code{*}") bases. For pads, the confidence value defaults to the
+average of the confidence values of the first two neighbouring bases that are
+not pads. The replaced original positions default to 0. These bases are to be
+placed at the position specified by @var{pos}, counting as position 1 being
+to the left of the first base in the sequence. The @var{length}, @var{start}
+and @var{end} values are left unchanged.
+
+This function returns 0 for success.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node pad_consensus
+ at section pad_consensus
+ at vindex pad_consensus(C)
+
+ at example
+#include <IO.h>
+
+int pad_consensus(
+        GapIO  *io,
+        int     contig,
+        int     pos,
+        int     npads);
+ at end example
+
+This function inserts @var{npads} pads into the consensus for contig number
+ at var{contig} at position @var{pos} by inserting into all of the readings
+creating the consensus at this point.
+
+The function deals with inserting to the appropriate readings including
+adjustment of cutoff positions and annotations, moving of all the readings
+to the right of @var{pos}, and adjustment of the annotations on the consensus
+sequence.
+
+It returns 0 for success.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node calc_consensus
+ at section calc_consensus
+ at vindex calc_consensus(C)
+ at cindex consensus calculation
+
+ at example
+#include <qual.h>
+
+int calc_consensus(
+        int     contig,
+        int     start,
+        int     end,
+        int     mode,
+        char   *con,
+        char   *con2,
+        float  *qual,
+        float  *qual2,
+        float   cons_cutoff,
+        int     qual_cutoff,
+        int    (*info_func)(int          job,
+                            void        *mydata,
+                            info_arg_t  *theirdata),
+        void   *info_data);
+
+int database_info(
+        int          job,
+        void        *mydata,
+        info_arg_t  *theirdata);
+ at end example
+
+This function calculates the consensus sequence for a given segment of a
+contig. It can produce a single consensus sequence using all readings, or
+split it into two sequences; one for each strand. Additionally, it can produce
+either one (combined strands) or two (individual strands) sets of values
+relating to the accuracy of the returned consensus.
+
+The @var{contig}, @var{start} and @var{end} arguments hold the contig and
+range to calculate the consensus for. The ranges are inclusive and start
+counting with the first base as position 1.
+
+ at var{con} and @var{con2} are buffers to store the consensus. These are
+allocated by the caller to be at least of size @var{end-start+1}. If
+ at var{con2} is @code{NULL} both strands are calculated as a single consensus to
+be stored in @var{con}. Otherwise the top strand is stored in @var{con} and
+the bottom strand is stored in @var{con2}.
+
+ at var{mode} should be one of @code{CON_SUM} or @code{CON_WDET}. @code{CON_SUM}
+is the "normal" mode, which indicates that the consensus sequence is simply
+the most likely base or a dash (depending on @var{cons_cutoff}). The
+ at code{CON_WDET} mode is used to return special characters for bases that are
+good quality and identical on both strands. Where one strand has a dash, the
+consensus base for the other strand is used. Where both strands differ, and
+are not dashes, the consensus is returned as dash. Note that despite requiring
+the consensus for each strand independently, this mode requires that
+ at var{con2} is @code{NULL}. To summarise the action of the @code{CON_WDET}
+mode, the final consensus is derived as follows:
+
+ at example
+ Top     Bottom   Resulting
+Strand   Strand     Base
+---------------------------
+   A        A         d
+   C        C         e
+   G        G         f
+   T        T         i
+   -        -         -
+   -        @var{x}         @var{x}
+   @var{x}        -         @var{x}
+   @var{x}        @var{y}         -
+ at end example
+
+[Where @var{x} and @var{y} are one of A, C, G or T, and @var{x} != @var{y}.]
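+
+The table above can be expressed as a small lookup function. This is a
+sketch of the mapping only, not the code used inside @code{calc_consensus};
+the name @code{wdet_base} is hypothetical, and returning a dash for
+characters outside the table is an assumption.
+

```c
/* Combine the per-strand consensus bases following the CON_WDET table. */
char wdet_base(char top, char bottom) {
    if (top == '-')
        return bottom;      /* covers both "- -" and "- x" */
    if (bottom == '-')
        return top;         /* "x -" */
    if (top != bottom)
        return '-';         /* strands disagree */

    switch (top) {          /* good quality and identical on both strands */
    case 'A': return 'd';
    case 'C': return 'e';
    case 'G': return 'f';
    case 'T': return 'i';
    }
    return '-';             /* not in the table (assumed behaviour) */
}
```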
+
+ at var{qual_cutoff} and @var{cons_cutoff} hold the quality and consensus cutoff
+parameters used in the consensus algorithm for determining which bases are of
+sufficient quality to use and by how big a majority this base type must have
+before it is returned as the consensus base (otherwise "-" is used). For a
+complete description of how these parameters operate see the consensus
+algorithm description in the main Gap4 manual. (FIXME: should we duplicate
+this here?)
+
+The @var{qual} and @var{qual2} buffers are allocated by the caller to be the
+same size as the @var{con} and @var{con2} buffers. They are filled with
+floating point values giving the ratio of the score for the consensus base
+type to the score for all base types (where the definition of score depends on
+the @var{qual_cutoff} parameter). This is the value compared against
+ at var{cons_cutoff} to determine whether the consensus base is a dash.
+Either or both of @var{qual} and @var{qual2} can be passed as NULL if no
+accuracy information is required. Note that the accuracy information for
+ at var{qual2} is only available when @var{con2} has also been passed as non NULL.
+
+The algorithm uses @var{info_func} to obtain information about the readings
+from the database. @var{info_data} is passed as the second argument
+(@var{mydata}) to @var{info_func}. @var{info_func} is called each time some
+information is required about a reading or contig. Its purpose is to abstract
+out the algorithm from the data source. There are currently two such
+functions, the most commonly used of which is the @code{database_info} function
+(the other being @code{contEd_info} to fetch data from the contig editor
+structures). The @code{database_info} function obtains the sequence details
+from the database. It requires a @var{GapIO} pointer to be passed as
+ at var{info_data}.
+
+The function returns 0 for success, -1 for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node calc_quality
+ at section calc_quality
+ at vindex calc_quality(C)
+ at cindex quality calculation
+ at cindex accuracy calculation
+
+ at example
+#include <qual.h>
+
+int calc_quality(
+        int     contig,
+        int     start,
+        int     end,
+        char   *qual,
+        float   cons_cutoff,
+        int     qual_cutoff,
+        int   (*info_func)(int          job,
+                           void        *mydata,
+                           info_arg_t  *theirdata),
+        void   *info_data);
+
+int database_info(
+        int          job,
+        void        *mydata,
+        info_arg_t  *theirdata);
+ at end example
+
+This function calculates the quality codes for a given segment of a contig
+consensus sequence. The quality information is stored in the @var{qual}
+buffer, which should be allocated by the caller to be at least
+ at var{end-start+1} bytes long. The contents of this buffer is one byte per
+base, consisting of a letter between 'a' and 'j'. There are #defines in
+ at file{qual.h} assigning meanings to these codes, which should be used in
+preference to hard coding the codes themselves. The defines and meanings are
+as follows.
+
+ at table @code
+ at item a - R_GOOD_GOOD_EQ
+Data is good on both strands and both strands agree on the same consensus base.
+ at item b - R_GOOD_BAD
+Data is good on the top strand, but poor on the bottom strand.
+ at item c - R_BAD_GOOD
+Data is good on the bottom strand, but poor on the top strand.
+ at item d - R_GOOD_NONE
+Data is good on the top strand, but no data is available on the bottom strand.
+ at item e - R_NONE_GOOD
+Data is good on the bottom strand, but no data is available on the top strand.
+ at item f - R_BAD_BAD
+Data is available on both strands, but both strands are poor data.
+ at item g - R_BAD_NONE
+Data is poor on the top strand, with no data on the bottom strand.
+ at item h - R_NONE_BAD
+Data is poor on the bottom strand, with no data on the top strand.
+ at item i - R_GOOD_GOOD_NE
+Data is good on both strands, but the consensus base differs between top and
+bottom strand.
+ at item j - R_NONE_NONE
+No data is available on either strand (this should never occur).
+ at end table
+
+The @var{contig}, @var{start} and @var{end} arguments hold the contig and
+range to calculate the quality for. The ranges are inclusive and start
+counting with the first base as position 1.
+
+ at var{qual_cutoff} and @var{cons_cutoff} hold the quality and consensus cutoff
+parameters. These are used in an identical manner to the @code{calc_consensus}
+function. _oxref(calc_consensus, calc_consensus).
+
+The @var{info_func} and @var{info_data} arguments are also used in the same
+way as @code{calc_consensus}. Generally @var{info_func} should be
+ at code{database_info} and @var{info_data} should be a @var{GapIO} pointer.
+This will then read the sequence data from the Gap4 database.
+
+The function returns 0 for success, -1 for failure.
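+
+Once the @var{qual} buffer has been filled, it can be scanned one byte per
+base. The sketch below counts the positions that have data on both strands.
+It is illustrative only: the @code{SK_} constants are local stand-ins that
+mirror the letters in the table above (real code should use the #defines
+from @file{qual.h}), and @code{count_double_stranded} is a hypothetical
+helper.
+

```c
#include <stddef.h>

/* Local stand-ins for the codes in the table; real code should use qual.h. */
enum {
    SK_GOOD_GOOD_EQ = 'a',
    SK_GOOD_BAD     = 'b',
    SK_BAD_GOOD     = 'c',
    SK_BAD_BAD      = 'f',
    SK_GOOD_GOOD_NE = 'i'
};

/* Count the bases whose quality code indicates data on both strands. */
size_t count_double_stranded(const char *qual, size_t len) {
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        switch (qual[i]) {
        case SK_GOOD_GOOD_EQ:
        case SK_GOOD_BAD:
        case SK_BAD_GOOD:
        case SK_BAD_BAD:
        case SK_GOOD_GOOD_NE:
            n++;            /* both strands contributed some data */
            break;
        default:
            break;          /* single stranded or no data */
        }
    }
    return n;
}
```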
diff --git a/scripting_manual/gap4-cio-IO.h-t.texi b/scripting_manual/gap4-cio-IO.h-t.texi
new file mode 100644
index 0000000..d55bc81
--- /dev/null
+++ b/scripting_manual/gap4-cio-IO.h-t.texi
@@ -0,0 +1,211 @@
+
+There are many C macros defined to interact with the @var{GapIO} structure.
+These both simplify and improve readability of the code and also provide a
+level of future proofing. Where the macros are available it is always
+advisable to use these instead of accessing the @var{GapIO} structure
+directly.
+
+Note that not all of these macros are actually held within the @file{IO.h}
+file, rather some are in files included by @file{IO.h}. However whenever
+wishing to use one of these macros you should still use "@code{#include
+<IO.h>}".
+
+ at table @code
+ at findex io_dbsize(C)
+ at item io_dbsize(@var{io})
+	@var{io}@code{->db.actual_db_size}@br
+	The maximum number of readings plus contigs allowed.
+
+ at findex max_gel_len(C)
+ at cindex maximum reading length
+ at cindex reading length, maximum
+ at cindex gel length, maximum
+ at item max_gel_len(@var{io})
+	@var{(io)}@code{->max_gel_len}@br
+	The maximum reading length.
+
+ at findex NumContigs(C)
+ at item NumContigs(@var{io})
+	@var{(io)}@code{->db.num_contigs}@br
+	The number of used contigs.
+
+ at findex NumReadings(C)
+ at item NumReadings(@var{io})
+	@var{(io)}@code{->db.num_readings}@br
+	The number of used readings.
+
+ at findex Ncontigs(C)
+ at item Ncontigs(@var{io})
+	@var{(io)}@code{->db.Ncontigs}@br
+	The number of allocated contigs.
+
+ at findex Nreadings(C)
+ at item Nreadings(@var{io})
+	@var{(io)}@code{->db.Nreadings}@br
+	The number of allocated readings.
+
+ at findex Nannotations(C)
+ at item Nannotations(@var{io})
+	@var{(io)}@code{->db.Nannotations}@br
+	The number of allocated annotations.
+
+ at findex Ntemplates(C)
+ at item Ntemplates(@var{io})
+	@var{(io)}@code{->db.Ntemplates}@br
+	The number of allocated templates.
+
+ at findex Nclones(C)
+ at item Nclones(@var{io})
+	@var{(io)}@code{->db.Nclones}@br
+	The number of allocated clones.
+
+ at findex Nvectors(C)
+ at item Nvectors(@var{io})
+	@var{(io)}@code{->db.Nvectors}@br
+	The number of allocated vectors.
+
+ at findex io_relpos(C)
+ at item io_relpos(@var{io,g})
+	@var{(io)}@code{->relpos[(}@var{g}@code{)]}@br
+	The position of a reading @var{g}.
+
+ at findex io_length(C)
+ at item io_length(@var{io,g})
+	@var{(io)}@code{->length[(}@var{g}@code{)]}@br
+	The length of a reading @var{g}. If the reading is complemented this
+	value is negative, but still represents the length.
+
+ at findex io_lnbr(C)
+ at item io_lnbr(@var{io,g})
+	@var{(io)}@code{->lnbr[(}@var{g}@code{)]}@br
+	The reading number of the left neighbour of reading @var{g}.
+
+ at findex io_rnbr(C)
+ at item io_rnbr(@var{io,g})
+	@var{(io)}@code{->rnbr[(}@var{g}@code{)]}@br
+	The reading number of the right neighbour of reading @var{g}.
+
+ at findex io_clength(C)
+ at item io_clength(@var{io,c})
+	@var{(io)}@code{->relpos[io_dbsize(}@var{io}@code{)-(}@var{c}@code{)]}@br
+	The length of contig @var{c}.
+
+ at findex io_clnbr(C)
+ at item io_clnbr(@var{io,c})
+	@var{(io)}@code{->lnbr[io_dbsize(}@var{io}@code{)-(}@var{c}@code{)]}@br
+	The leftmost reading number of contig @var{c}.
+
+ at findex io_crnbr(C)
+ at item io_crnbr(@var{io,c})
+	@var{(io)}@code{->rnbr[io_dbsize(}@var{io}@code{)-(}@var{c}@code{)]}@br
+	The rightmost reading number of contig @var{c}.
+
+ at findex io_name(C)
+ at item io_name(@var{io})
+	@var{(io)}@code{->db_name}@br
+	The database name.
+
+ at findex io_rdonly(C)
+ at item io_rdonly(@var{io})
+	This returns 1 when the database has been opened as read-only; 0
+	otherwise.
+
+ at findex io_rname(C)
+ at item io_rname(@var{io,g})
+	This returns the reading name for reading number @var{g}. This is
+	fetched from the in-memory cache.
+
+ at findex io_wname(C)
+ at item io_wname(@var{io,g,n})
+	Sets the in-memory copy of the reading name for reading number @var{g}
+	to be the string @var{n}. This does not write to disk.
+
+ at findex PRIMER_TYPE(C)
+ at item PRIMER_TYPE(@var{r})
+	This returns the type of the primer used for sequencing reading number
+	@var{r}. This information is calculated from the @var{primer} and
+	@var{strand} fields of the @var{GReadings} structure. It returns one
+	of @code{GAP_PRIMER_UNKNOWN}, @code{GAP_PRIMER_FORWARD},
+	@code{GAP_PRIMER_REVERSE}, @code{GAP_PRIMER_CUSTFOR} and
+	@code{GAP_PRIMER_CUSTREV}.
+
+ at findex PRIMER_TYPE_GUESS(C)
+ at item PRIMER_TYPE_GUESS(@var{r})
+	As @code{PRIMER_TYPE} except always choose a sensible guess in place
+	of @code{GAP_PRIMER_UNKNOWN}.
+
+ at findex STRAND(C)
+ at item STRAND(@var{r})
+	Returns the strand (one of @code{GAP_STRAND_FORWARD} or
+	@code{GAP_STRAND_REVERSE}) from the primer information for reading
+	number @var{r}. The reason for these primer and strand macros is that
+	the meaning of the @var{primer} and @var{strand} fields of
+	@var{GReadings} has changed slightly from early code in that we now
+	make a distinction between custom forward primers and custom reverse
+	primers. The @var{strand} field may become completely redundant in
+	future as it can now be derived entirely from the primer.
+
+ at cindex Cache, GReadings
+ at cindex GReadings cache
+ at cindex Reading name cache
+ at findex contig_read(C)
+ at findex gel_read(C)
+ at findex tag_read(C)
+ at findex vector_read(C)
+ at findex clone_read(C)
+ at item  contig_read(@var{io, cn, c})
+ at itemx gel_read(@var{io, gn, g})
+ at itemx tag_read(@var{io, tn, t})
+ at itemx vector_read(@var{io, vn, v})
+ at itemx clone_read(@var{io, cn, c})
+	Reads one of the basic database structures. For contigs,
+	@code{contig_read} reads contig number @var{cn} and stores it in the
+	@var{GContigs} structure named @var{c}. E.g. to read a contig:
+
+ at example
+	GContigs c;
+	contig_read(io, contig_num, c);
+ at end example
+
+	This is functionally equivalent to:
+
+ at example
+	GContigs c;
+	GT_Read(io, arr(GCardinal, io->contigs, contig_num-1),
+		&c, sizeof(c), GT_Contigs);
+ at end example
+
+	The exception to this is @code{gel_read} which reads from a cached
+	copy held in memory.
+
+ at findex contig_write(C)
+ at findex gel_write(C)
+ at findex tag_write(C)
+ at findex vector_write(C)
+ at findex clone_write(C)
+ at item  contig_write(@var{io, cn, c})
+ at itemx gel_write(@var{io, gn, g})
+ at itemx tag_write(@var{io, tn, t})
+ at itemx vector_write(@var{io, vn, v})
+ at itemx clone_write(@var{io, cn, c})
+	Writes one of the basic types in a similar fashion to the read
+	functions. To write to annotation number @var{anno} we should use:
+
+ at example
+	GAnnotations a;
+	/* ... some code to manipulate 'a' ... */
+	tag_write(io, anno, a);	       
+ at end example
+
+	This is functionally equivalent to:
+
+ at example
+	GT_Write(io, arr(GCardinal, io->annotations, anno-1),
+		 &a, sizeof(a), GT_Annotations);
+ at end example
+
+	Note that the @code{gel_write} function @strong{must} be used instead
+	of @code{GT_Write} as @code{gel_write} will also update the reading
+	memory cache.
+
+ at end table
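+
+The index arithmetic behind the contig macros, and the sign convention of
+@code{io_length}, can be illustrated with a minimal sketch. The
+@code{MockIO} type and the @code{mock_} names are hypothetical stand-ins
+introduced only for this example (the real @var{GapIO} structure has many
+more fields); the macro bodies simply mirror the expansions listed above.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the parts of GapIO used here; the real
 * structure is defined by Gap4 and has many more fields. */
typedef struct {
    int  actual_db_size;   /* maximum number of readings plus contigs */
    int *relpos;           /* positions (readings) / lengths (contigs) */
    int *length;           /* reading lengths, negative if complemented */
} MockIO;

/* Mirrors of the documented macro expansions, renamed to avoid clashes. */
#define mock_dbsize(io)      ((io)->actual_db_size)
#define mock_relpos(io, g)   ((io)->relpos[(g)])
#define mock_length(io, g)   ((io)->length[(g)])
#define mock_clength(io, c)  ((io)->relpos[mock_dbsize(io) - (c)])

/* Length of reading g regardless of orientation. */
static int reading_length(const MockIO *io, int g)
{
    return abs(mock_length(io, g));
}

/* 1 if reading g is complemented (stored length is negative). */
static int reading_is_complemented(const MockIO *io, int g)
{
    return mock_length(io, g) < 0;
}
```

+Reading records are indexed upwards from 1 and contig records downwards
+from @code{io_dbsize(io)-1}, so in this sketch both share the same arrays.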
diff --git a/scripting_manual/gap4-cio-basic-t.texi b/scripting_manual/gap4-cio-basic-t.texi
new file mode 100644
index 0000000..bd04b08
--- /dev/null
+++ b/scripting_manual/gap4-cio-basic-t.texi
@@ -0,0 +1,805 @@
+These functions consist of both basic functions for reading, writing and
+creation of database items and simple I/O functions that build upon such
+operations. They are mainly contained within the @file{Gap4/IO.c} file.
+
+The return codes do vary greatly from function to function. Most return 0 for
+success and -1 for failure. However some will return other codes. In general
+it is best to check equality to the success code rather than equality to a
+specific failure code.
+
+and read as an array of @var{GCardinal}s. @var{elements} indicates the number
+of array elements and not the size of the array in bytes.
+
+ at code{BitmapRead} reads records of type @code{GT_Bitmap}. The bitmap is
+allocated by this function. @var{elements} indicates the number of bits and
+not the size of the bitmap in bytes.
+
+ at subsection GT_Write, GT_Write_cached, TextWrite, DataWrite, ArrayWrite, BitmapWrite
+
+ at findex GT_Write(C)
+ at findex GT_Write_cached(C)
+ at findex TextWrite(C)
+ at findex DataWrite(C)
+ at findex ArrayWrite(C)
+ at findex BitmapWrite(C)
+ at example
+#include <IO.h>
+
+int GT_Write(
+        GapIO  *io,
+        int     rec,
+        void   *buf,
+        int     len,
+        GCardinal type);
+
+int GT_Write_cached(
+        GapIO  *io,
+        int     read,
+        GReadings *r);
+
+int TextWrite(
+        GapIO  *io,
+        int     rec,
+        char   *buf,
+        int     len);
+
+int DataWrite(
+        GapIO  *io,
+        int     rec,
+        void   *buf,
+        int     len,
+        int     size);
+
+int ArrayWrite(
+        GapIO  *io,
+        int     rec,
+        int     elements,
+        Array   a);
+
+int BitmapWrite(
+        GapIO  *io,
+        int     rec,
+        Bitmap  b);
+ at end example
+
+These functions write record number @var{rec} with the appropriate data type.
+They return zero for success and an error code for failure.
+
+ at code{GT_Write} writes arbitrary records of type @var{type}. This is usually a
+structure. Do not use this function for writing @var{GReadings} structures.
+For best compatibility, use the @code{contig_write}, @code{gel_write},
+ at code{tag_write}, @code{vector_write} and @code{clone_write} functions.
+
+ at code{GT_Write_cached} is an interface to @code{GT_Write} which also updates
+the in-memory reading cache. For best compatibility, use the
+ at code{gel_write} function.
+
+ at code{TextWrite} writes a record of type @code{GT_Text}. It is used to write
+text-only strings.
+
+ at code{DataWrite} writes a record of type @code{GT_Data}. It is used to write
+binary data such as sequence confidence values.
+
+ at code{ArrayWrite} writes a record of type @code{GT_Array}. The array must be
+an array of @var{GCardinal} values. @var{elements} indicates the number of
+array elements and not the size of the array in bytes.
+
+ at code{BitmapWrite} writes a record of type @code{GT_Bitmap}. @var{elements}
+indicates the number of bits and not the size of the bitmap in bytes.
+
+ at node G4Cio-io_handle
+ at subsection io_handle and handle_io
+ at cindex IO handles
+ at findex io_handle(C)
+ at findex handle_io(C)
+
+ at example
+#include <IO.h>
+
+GapIO *io_handle(
+        f_int *handle);
+
+f_int *handle_io(
+        GapIO *io);
+ at end example
+
+These two routines convert between @var{GapIO} pointers and integer handles.
+Both the Fortran and Tcl code use integer handles because neither language
+supports structures.
+
+ at code{io_handle} takes a pointer to an integer handle and returns the
+associated @var{GapIO} pointer. It returns NULL for failure.
+
+ at code{handle_io} takes a @var{GapIO} pointer and returns a pointer to an
+integer handle. It returns NULL for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-io_read_seq
+ at subsection io_read_seq
+ at cindex Sequence, reading
+ at cindex Reading sequences
+ at findex io_read_seq(C)
+
+ at example
+#include <IO.h>
+
+int io_read_seq(
+        GapIO  *io,
+        int     N,
+        int2   *length,
+        int2   *start,
+        int2   *end,
+        char   *seq,
+        int1   *conf,
+        int2   *opos);
+ at end example
+
+This function loads information on gel readings from memory and disk and
+stores it in the parameters passed over.
+
+The reading number to read should be passed as @var{N}. The integers pointed
+to by the @var{length}, @var{start} and @var{end} pointers are then written
+to with the total length (@var{GReadings.length}), the last base number
+(counting from 1) of the left hand cutoff data, and the first base number of
+the right hand cutoff data.
+
+The sequence, confidence and original position data is then loaded and stored
+in the address pointed to by @var{seq}, @var{conf} and @var{opos} respectively.
+This is expected to be allocated to the correct size by the caller of this
+function. Either or both of @var{conf} and @var{opos} can be NULL, in which
+case the data is not loaded or stored. @var{seq} must always be non NULL.
+
+This function returns 0 for success and non zero for failure.
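+
+Given these definitions, the used (non-cutoff) bases are numbers
+@var{start}+1 to @var{end}-1. The helper below is a minimal sketch of that
+arithmetic only; @code{used_sequence} is a hypothetical name, not a Gap4
+function, and it assumes that a reading with no cutoff data has
+@var{start} equal to 0 and @var{end} equal to @var{length}+1.

```c
#include <string.h>

/*
 * Copy the used (non-cutoff) part of a reading into out.
 * start is the last base of the left cutoff and end the first base of
 * the right cutoff, both counted from 1, as documented for io_read_seq.
 * Returns the number of bases copied, or -1 if the values are
 * inconsistent or out does not have room for them plus a terminator.
 */
static int used_sequence(const char *seq, int length, int start, int end,
                         char *out, int out_size)
{
    int used = end - start - 1;

    if (start < 0 || end > length + 1 || used < 0 || used >= out_size)
        return -1;

    /* Base number start+1 lives at C index start. */
    memcpy(out, seq + start, (size_t)used);
    out[used] = '\0';
    return used;
}
```

+For a 10 base reading with @var{start} 2 and @var{end} 9, this copies the
+six bases numbered 3 to 8.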
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-io_write_seq
+ at subsection io_write_seq
+ at cindex Sequence, writing
+ at cindex Writing sequences
+ at findex io_write_seq(C)
+
+ at example
+#include <IO.h>
+
+
+int io_write_seq(
+        GapIO  *io,
+        int     N,
+        int2   *length,
+        int2   *start,
+        int2   *end,
+        char   *seq,
+        int1   *conf,
+        int2   *opos);
+ at end example
+
+This function updates disk and memory details of reading number @var{N}. If
+this reading does not yet exist, all non-existent readings up to and including
+ at var{N} will be initialised first using the @code{io_init_reading} function.
+
+[FIXME: The current implementation @strong{does not} update the fortran lngth
+(io_length()) array. This needs to be done by the caller. ]
+
+The @var{length} argument is the total length of the sequence, and hence also
+the expected size of the @var{seq}, @var{conf} and @var{opos} arrays.
+ at var{start} and @var{end} contain the last base number of the left cutoff data
+and the first base number of the right cutoff data.
+
+Unlike @code{io_read_seq}, all arguments to this function are mandatory.
+If the records on disk do not already exist then they are allocated first
+using the @code{allocate} function.
+
+This function returns 0 for success and non zero for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-get_read_info
+ at subsection get_read_info, get_vector_info, get_clone_info and get_subclone_info
+ at findex get_read_info(C)
+ at findex get_vector_info(C)
+ at findex get_clone_info(C)
+ at findex get_subclone_info(C)
+
+ at example
+#include <IO.h>
+
+int get_read_info(
+        GapIO  *io,
+        int     N,
+        char   *clone,
+        int     l_clone,
+        char   *cvector,
+        int     l_cvector,
+        char   *subclone,
+        int     l_subclone,
+        char   *scvector,
+        int     l_scvector,
+        int    *length,
+        int    *insert_min,
+        int    *insert_max,
+        int    *direction,
+        int    *strands,
+        int    *primer,
+        int    *clone_id,
+        int    *subclone_id,
+        int    *cvector_id,
+        int    *scvector_id);
+
+int get_vector_info(
+        GapIO  *io,
+        int     vector_id,
+        char   *vector,
+        int l_vector);
+
+int get_clone_info(
+        GapIO  *io,
+        int     clone_id,
+        char   *clone,
+        int     l_clone,
+        char   *cvector,
+        int     l_cvector,
+        int    *cvector_id);
+
+int get_subclone_info(
+        GapIO  *io,
+        int     subclone_id,
+        char   *clone,
+        int     l_clone,
+        char   *cvector,
+        int     l_cvector,
+        char   *subclone,
+        int     l_subclone,
+        char   *scvector,
+        int     l_scvector,
+        int    *insert_min,
+        int    *insert_max,
+        int    *strands,
+        int    *clone_id,
+        int    *cvector_id,
+        int    *scvector_id);
+ at end example
+
+These functions return clone, template and vector information.
+
+ at code{get_vector_info} returns the name of a vector. This is stored in the
+buffer at @var{vector}.
+
+ at code{get_clone_info} returns the name of the clone and the vector number
+(stored at @var{clone} and @var{cvector_id}) and the results of
+ at code{get_vector_info} for this vector.
+
+ at code{get_subclone_info} returns the template information (insert size, number
+of strands, vector and clone numbers stored at @var{insert_min},
+ at var{insert_max}, @var{strands}, @var{scvector_id} and @var{clone_id}) along
+with the results from @code{get_vector_info} and @code{get_clone_info} on the
+appropriate vector and clone numbers.
+
+ at code{get_read_info} returns the reading information including direction,
+primer and template (subclone) number (stored at @var{direction},
+ at var{primer} and @var{subclone_id}), and the results of
+ at code{get_subclone_info} on this template number.
+
+For all four functions, the arguments used to store text fields, such as the
+clone name (@var{clone}), all have corresponding buffer lengths sent as the
+same argument name preceded by @var{l_} (e.g. @var{l_clone}). These buffers
+need to be allocated by the caller of the function.
+
+Any buffer or integer pointer arguments may be passed as @code{NULL} to avoid
+filling in this field. For buffers the same is also true when specifying the
+buffer length as zero.
+
+The @var{clone}, @var{vector} and @var{subclone} buffers are used to store the
+names of the clone, vector or template. If appropriate, the clone or
+template number will also be stored at the @var{clone_id} and
+ at var{subclone_id} addresses.
+
+For functions returning information on more than one vector, these are split
+into two levels. The sequencing vector is the vector used to sequence this
+template. It has arguments named @var{scvector} (name), @var{l_scvector} (name
+length) and @var{scvector_id} (vector number). The clone vector is the vector
+used in the sequencing of the fragment which is later broken down and
+resequenced as templates. This may not be appropriate in many projects. It has
+arguments named @var{cvector} (name), @var{l_cvector} (name length) and
+ at var{cvector_id} (vector number).
+
+All functions return 0 for success and an error code for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_init_reading
+ at subsection io_init_reading, io_init_contig and io_init_annotations
+ at findex io_init_reading(C)
+ at findex io_init_contig(C)
+ at findex io_init_annotations(C)
+
+ at example
+#include <IO.h>
+
+int io_init_reading(
+        GapIO  *io,
+        int     N);
+
+int io_init_contig(
+        GapIO  *io,
+        int     N);
+
+int io_init_annotations(
+        GapIO  *io,
+        int     N);
+ at end example
+
+These functions create new reading, contig and annotations structures. Each
+takes two arguments; the first being the @var{GapIO} pointer, and the second
+being the new reading, contig or annotation number to create. This is not the
+number of new structures, but rather the highest allowed number for this
+structure.
+For instance, if we have 10 readings, "@code{io_init_reading(io, 12)}" will
+create two more, numbered 11 and 12.
+
+For readings, the records are recovered (by increasing the @var{GDatabase}
+ at var{NumReadings} field to @var{NReadings}) if available. The new
+ at var{GReadings} structures are not guaranteed to be clear.
+
+For contigs, the records are recovered if available. The contig_order array is
+also updated with the new contigs being added at the rightmost position. The
+new contigs are added to the registration scheme with blank registration
+lists. The new @var{GContigs} structures are not guaranteed to be clear.
+
+For annotations, new records are always allocated from disk. It is up to the
+caller to first check that there are no free annotations in the
+ at var{free_annotations} list. The new @var{GAnnotations} structures are not
+guaranteed to be clear.
+
+All functions return 0 for success, and -1 for failure.
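+
+The "highest number, not a count" convention for @var{N} can be sketched
+with a small model. @code{MockDB} and @code{mock_init_reading} are
+hypothetical names used only for illustration; the real functions also
+allocate or recover records, and treating an already existing number as a
+successful no-op is an assumption of this sketch.

```c
/* Hypothetical stand-in for the reading counters of a database. */
typedef struct {
    int num_readings;   /* readings currently in use */
    int max_readings;   /* capacity, in the spirit of io_dbsize() */
} MockDB;

/*
 * Grow the model so that reading number n exists. n is the highest
 * reading number wanted, not a count of readings to add.
 * Returns 0 for success and -1 for failure.
 */
static int mock_init_reading(MockDB *db, int n)
{
    if (n > db->max_readings)
        return -1;
    if (n > db->num_readings)
        db->num_readings = n;   /* creates num_readings+1 .. n */
    return 0;
}
```

+Starting from 10 readings, a call with @var{n} set to 12 leaves 12 readings
+in the model, matching the example given above.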
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_read_annotation
+ at subsection io_read_annotation and io_write_annotation
+ at findex io_read_annotation(C)
+ at findex io_write_annotation(C)
+ at cindex tags, reading and writing
+ at cindex annotations, reading and writing
+
+ at example
+#include <IO.h>
+
+int io_read_annotation(
+        GapIO  *io,
+        int     N,
+        int    *anno);
+
+int io_write_annotation(
+        GapIO  *io,
+        int     N,
+        int    *anno);
+ at end example
+
+These functions read and write the first annotation number in the linked lists
+referenced by the reading and contig structures.
+For both functions, @var{N} is a reading number if it is above zero or a
+contig number when below zero (in which case it is negated).
+
+ at code{io_read_annotation} reads the @var{annotations} field of reading @var{N}
+or contig @var{-N} and stores this in @var{*anno}. On failure it sets
+ at var{*anno} to 0 and returns 1. Otherwise it returns 0.
+
+ at code{io_write_annotation} sets the @var{annotations} field of reading @var{N}
+or contig @var{-N} to be @var{*anno}. Despite the fact that it is a pointer,
+the contents of @var{anno} is not modified. It returns 1 for failure and 0 for
+success (but currently always returns 0).
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node allocate
+ at subsection allocate
+ at cindex allocating records
+ at findex allocate(C)
+
+ at example
+#include <IO.h>
+
+int allocate(
+        GapIO    *io,
+        GCardinal type);
+ at end example
+
+These allocate and deallocate records in the g database.
+
+The @code{allocate} function allocates a new record from the g database. It
+finds a free record, or creates a new record, and returns this record number.
+The record will be automatically locked for exclusive read/write access. The
+type of the record is sent in @var{type}.  This must be one of the following:
+
+ at itemize @asis
+ at item @code{GT_Text}
+ at item @code{GT_Data}
+ at item @code{GT_Array}
+ at item @code{GT_Bitmap}
+ at item @code{GT_Database}
+ at item @code{GT_Contigs}
+ at item @code{GT_Readings}
+ at item @code{GT_Vectors}
+ at item @code{GT_Annotations}
+ at item @code{GT_Templates}
+ at item @code{GT_Clones}
+ at end itemize
+
+The function does not initialise or even write the new record to disk. The
+record number is valid, but a @code{GT_Read} call will produce an error. It is
+up to the caller to initialise the structure and perform the first
+ at code{GT_Write} (or equivalent) call.
+
+It returns the record number for success, and terminates the program for
+failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node deallocate
+ at subsection deallocate
+ at findex deallocate(C)
+ at cindex deallocating records
+ at cindex removing records
+
+ at example
+#include <IO.h>
+
+int deallocate(
+        GapIO    *io,
+        int       rec);
+ at end example
+
+The @code{deallocate} function removes record @var{rec} from the g database.
+This uses the @code{g_remove} function, but unlocking is only performed at the
+next database flush.
+
+It returns 0 for success and 1 for failure.
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_deallocate_reading
+ at subsection io_deallocate_reading
+ at findex io_deallocate_reading(C)
+ at cindex readings, deallocating
+ at cindex deallocating readings
+ at cindex removing readings
+
+ at example
+#include <IO.h>
+
+int io_deallocate_reading(
+        GapIO  *io,
+        int     N);
+ at end example
+
+The @code{io_deallocate_reading} function deallocates the records linked to by
+reading number @var{N}. These are the @var{name}, @var{trace_name},
+ at var{trace_type}, @var{sequence}, @var{confidence} and @var{orig_positions}
+fields of the @var{GReadings} structure.
+
+The reading itself is not deallocated. The operation of Gap4 requires that
+reading numbers are sequential with all numbers used. It is up to the caller
+of this routine to make sure that this is still true.
+
+It returns 0 for success and >=1 for failure.
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_read_rd
+ at subsection io_read_rd and io_write_rd
+ at findex io_read_rd(C)
+ at findex io_write_rd(C)
+ at cindex trace data, reading and writing
+
+ at example
+#include <IO.h>
+
+int io_read_rd(
+        GapIO  *io,
+        int     N,
+        char   *file,
+        int     filelen,
+        char   *type,
+        int     typelen);
+
+int io_write_rd(
+        GapIO  *io,
+        int     N,
+        char   *file,
+        int     filelen,
+        char   *type,
+        int     typelen);
+ at end example
+
+These routines read and write the reading 'raw data' parameters. These are the
+file name and file type of the sequence trace file.
+
+For both functions, @var{N} is the reading number; @var{file} is a buffer,
+allocated by the caller, of length @var{filelen}; and @var{type} is a buffer,
+allocated by the caller, of length @var{typelen}.
+
+ at code{io_read_rd} copies the trace filename to @var{file} and its type to
+ at var{type}. If either of these is unknown the corresponding buffer is filled
+with spaces instead. It returns 0 if both name and type are known and 1 if
+either or both are unknown.
+
+ at code{io_write_rd} writes new file name and file type information. If @var{N}
+is an unknown reading number, it is first allocated using
+ at code{io_init_reading}. It returns 0 for success.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node open_db
+ at subsection open_db
+ at findex open_db(C)
+ at cindex database, opening
+ at cindex opening databases
+
+ at example
+#include <IO.h>
+
+GapIO *open_db(
+        char   *project,
+        char   *version,
+        int    *status,
+        int     create,
+        int     read_only);
+ at end example
+
+ at code{open_db} opens existing databases or creates new databases.
+The database to be opened or created has unix filenames of
+"@var{project}. at var{version}" and "@var{project}. at var{version}.aux".
+
+The @var{create} variable should be 0 or 1. A value of 1 indicates that this
+database is to be created. This will not be done if there is a file named
+"@var{project}. at var{version}.BUSY", in which case the @var{status} variable is
+set to contain @code{IO_READ_ONLY}.
+
+The @var{read_only} variable should be 0 or 1. A value of 1 indicates that the
+database should be opened in read only mode, otherwise read/write access is
+desired. If the database is busy then the database may still be opened in read
+only mode instead. In this case the @var{status} variable is set to contain
+ at code{IO_READ_ONLY}.
+
+The @var{GapIO} structure is then initialised and returned. A successful
+return will leave @var{status} containing 0. For failure, the function returns
+NULL.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node close_db
+ at subsection close_db
+ at findex close_db(C)
+ at cindex database, closing
+ at cindex closing databases
+
+ at example
+#include <IO.h>
+
+int close_db(
+        GapIO *io);
+ at end example
+
+This function closes a database. @var{io} is a @var{GapIO} pointer returned
+from a previous call to @code{open_db}. If necessary, the busy file is
+removed, and all allocated memory is freed.
+
+The function returns 0 for success and -1 for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node del_db
+ at subsection del_db
+ at findex del_db(C)
+ at cindex database, deletion of
+ at cindex deleting databases
+
+ at example
+#include <IO.h>
+
+int del_db(
+        char   *project,
+        char   *version);
+ at end example
+
+This removes the database files for a particular @var{version} of a
+ at var{project}. The database should not be open at the time of calling this
+function. On unix, the files removed are named "@var{project}. at var{version}"
+and "@var{project}. at var{version}.aux".
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node flush2t
+ at subsection flush2t
+ at findex flush2t(C)
+ at cindex flushing data
+ at cindex time stamps
+
+ at example
+#include <IO.h>
+
+void flush2t(
+        GapIO *io);
+ at end example
+
+This function checks out all written data by updating the database time
+stamp. If Gap4 crashes, upon restarting any data written since the last time
+stamp is ignored. The purpose of this is to ensure that the data in the
+database is internally consistent. Hence you should only call this function
+when the database writes are consistent.
+
+An example of this is in deleting a reading @var{N} which has left and right
+neighbours of @var{L} and @var{R}. The sequence of writes may be:
+
+ at itemize @minus
+ at item set right neighbour of @var{L} to be @var{R}
+ at item set left neighbour of @var{R} to be @var{L}
+ at item deallocate @var{N}.
+ at end itemize
+
+The database is consistent before these operations, and after these
+operations, but not at any stage in between.
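+
+The deletion example can be sketched as follows. This is a toy in-memory
+model, not Gap4 code: @code{MockChain}, @code{mock_flush} and
+@code{delete_reading} are hypothetical names, a neighbour value of 0 is
+assumed to mean "no neighbour", and the flush is reduced to a counter. The
+point is only where the single flush call sits relative to the writes.

```c
/* Hypothetical in-memory model: neighbour arrays indexed by reading
 * number (0 means "no neighbour") and a count of flushes performed. */
typedef struct {
    int lnbr[8];
    int rnbr[8];
    int flushes;
} MockChain;

/* Stands in for flush2t(): marks the current state as consistent. */
static void mock_flush(MockChain *db)
{
    db->flushes++;
}

/* Unlink reading n, then flush once, after the final write. */
static void delete_reading(MockChain *db, int n)
{
    int l = db->lnbr[n];
    int r = db->rnbr[n];

    if (l)
        db->rnbr[l] = r;             /* set right neighbour of L to be R */
    if (r)
        db->lnbr[r] = l;             /* set left neighbour of R to be L */
    db->lnbr[n] = db->rnbr[n] = 0;   /* stands in for deallocating N */

    mock_flush(db);                  /* state is consistent again here */
}
```

+Flushing between the individual writes would record a state in which the
+neighbour links disagree with each other.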
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node get_gel_num
+ at subsection get_gel_num and get_contig_num
+ at findex get_gel_num(C)
+ at findex get_contig_num(C)
+ at cindex reading names, reading
+ at cindex contig names, reading
+
+ at example
+#include <IO.h>
+
+int get_gel_num(
+        GapIO  *io,
+        char   *gel_name,
+        int     is_name);
+
+int get_contig_num(
+        GapIO  *io,
+        char   *gel_name,
+        int     is_name);
+ at end example
+
+These functions convert reading and contig names into reading and contig
+numbers. (A contig name is defined to be the name of any reading held within
+that contig.)
+
+The @var{is_name} argument is mainly used for backwards compatibility. It
+should be passed as either @code{GGN_ID} or @code{GGN_NAME}. When equal to
+ at code{GGN_ID}, @var{gel_name} is treated as a @var{reading identifier},
+otherwise it is treated as a @var{reading name}. An identifier is defined to
+be either a reading name; a hash sign followed by a reading number; or an
+equals sign followed by a contig number.
+
+Both functions return -1 for failure or the appropriate reading or contig
+number for success.
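+
+The three identifier forms can be told apart with a few lines of C. The
+sketch below is an illustration of the syntax only, not the Gap4
+implementation: @code{classify_identifier} and the @code{id_kind} values
+are hypothetical names, and rejecting a malformed number after the hash or
+equals sign is an assumption (the real functions may treat such strings
+differently).

```c
#include <stdlib.h>

enum id_kind { ID_NAME, ID_READING_NUM, ID_CONTIG_NUM, ID_INVALID };

/*
 * Classify an identifier. For the two numeric forms the number is
 * stored in *num; for a plain name *num is left untouched.
 */
static enum id_kind classify_identifier(const char *id, int *num)
{
    char *end;
    long v;

    if (id == NULL || *id == '\0')
        return ID_INVALID;
    if (*id != '#' && *id != '=')
        return ID_NAME;              /* plain reading name */

    /* Parse the decimal number following the '#' or '=' sign. */
    v = strtol(id + 1, &end, 10);
    if (end == id + 1 || *end != '\0' || v <= 0)
        return ID_INVALID;

    *num = (int)v;
    return *id == '#' ? ID_READING_NUM : ID_CONTIG_NUM;
}
```

+A caller would then look a name up in the reading name cache, or use the
+number directly, depending on the kind returned.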
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node lget_gel_num
+ at subsection lget_gel_num and lget_contig_num
+ at findex lget_gel_num(C)
+ at findex lget_contig_num(C)
+ at cindex reading names, reading
+ at cindex contig names, reading
+ at vindex contig_list_t(C)
+
+ at example
+#include <IO.h>
+
+int lget_gel_num(
+        GapIO  *io,
+        int     listArgc,
+        char  **listArgv,
+        int    *rargc,
+        int   **rargv);
+
+int lget_contig_num(
+        GapIO  *io,
+        int     listArgc,
+        char  **listArgv,
+        int    *rargc,
+        contig_list_t **rargv);
+ at end example
+
+These functions perform the same task as @code{get_gel_num} and
+ at code{get_contig_num} except on lists of identifiers instead of single
+identifiers.
+
+The list of identifiers is passed in @var{listArgv} as an array of
+ at var{listArgc} strings. They return arrays of reading or contig numbers by
+setting @var{*rargv} to point to an array of @var{*rargc} elements. The memory
+is allocated by these functions and should be deallocated by the caller using
+ at code{free}.
+
+For @code{lget_gel_num} the return arrays are arrays of integer values.
+ at code{lget_contig_num} returns arrays of @var{contig_list_t} structures. This
+structure is defined as follows.
+
+ at example
+typedef struct contig_list @{
+    int contig;
+    int start;
+    int end;
+@} contig_list_t;
+ at end example
+
+If any string within the @var{listArgv} argument to @code{lget_contig_num} is
+a list, the second and third elements of this list are used to define the
+ at var{start} and @var{end} offsets within the contig (which is defined by the
+name held in the first element of the list). Otherwise, the @var{start} and
+ at var{end} fields are set to 1 and the length of the contig respectively.
+
+For instance, it is legal to pass "@code{rname}", "@code{rname 100}" and
+"@code{rname 100 200}" as contig identifiers.
+
+Both functions return 0 for success and -1 for failure.  Note that the
+returned @var{rargc} value may not be the same as @var{listArgc} in the case
+where one or more identifiers could not be translated.
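+
+The defaulting of @var{start} and @var{end} can be sketched for a single
+list element. This is a simplified illustration: @code{mock_contig_range}
+and @code{parse_contig_item} are hypothetical names, the element is split
+on white space rather than parsed as a proper list, and the contig length
+is passed in by the caller instead of being looked up in the database.

```c
#include <stdio.h>

/* Hypothetical result type: the name is kept as text because this
 * sketch does not translate names into contig numbers. */
typedef struct {
    char name[64];
    int  start;
    int  end;
} mock_contig_range;

/*
 * Parse "name", "name start" or "name start end". Missing values
 * default to 1 and contig_len. Returns 0 for success, -1 for failure.
 */
static int parse_contig_item(const char *item, int contig_len,
                             mock_contig_range *out)
{
    int n;

    out->start = 1;
    out->end = contig_len;

    /* Fields that are absent leave the defaults set above in place. */
    n = sscanf(item, "%63s %d %d", out->name, &out->start, &out->end);
    return n >= 1 ? 0 : -1;
}
```

+With a contig length of 5000, "rname 100" gives the range 100 to 5000 in
+this sketch.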
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node to_contigs_only
+ at subsection to_contigs_only
+ at findex to_contigs_only(C)
+
+ at example
+#include <IO.h>
+
+int *to_contigs_only(
+        int     num_contigs,
+        contig_list_t *cl);
+ at end example
+
+This function converts an array of @var{contig_list_t} structures to an array
+of integers containing only the contig number information. The @var{cl} and
+ at var{num_contigs} elements correspond to the returned @var{rargv} and
+ at var{rargc} arguments from the @code{lget_contig_num} function.
+
+It returns an array of integers allocated with @code{malloc} for success or
+ at code{NULL} for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node chain_left
+ at subsection chain_left
+ at findex chain_left(C)
+ at cindex left most reading
+
+ at example
+#include <IO.h>
+
+int chain_left(
+        GapIO  *io,
+        int     gel);
+ at end example
+
+This function finds the left most reading number of the contig containing the
+reading numbered @var{gel}. This is done by chaining along the left neighbours
+of each reading in turn until the contig end is reached. The function detects
+possible loops and returns -1 in this case. Otherwise the left most reading
+number is returned.
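+
+The chaining step can be sketched on a plain array. This is not the Gap4
+implementation: @code{mock_chain_left} is a hypothetical name, a left
+neighbour of 0 is assumed to mark the contig end, and the loop check used
+here (a step counter bounded by the number of readings) is one possible
+technique, since the method used by the real function is not described.

```c
/*
 * Follow left neighbours from reading gel until one has no left
 * neighbour (0). lnbr is indexed by reading number; nreadings is the
 * highest valid reading number. Returns the leftmost reading number,
 * or -1 if a loop is detected.
 */
static int mock_chain_left(const int *lnbr, int nreadings, int gel)
{
    int steps = 0;

    while (lnbr[gel] != 0) {
        /* A loop-free chain needs fewer than nreadings moves. */
        if (++steps >= nreadings)
            return -1;
        gel = lnbr[gel];
    }
    return gel;
}
```

+Callers should test for the -1 return before using the result as a
+reading number.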
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node rnumtocnum
+ at subsection rnumtocnum
+ at findex rnumtocnum(C)
+ at cindex contig numbers, from reading numbers
+
+ at example
+#include <IO.h>
+
+int rnumtocnum(
+        GapIO  *io,
+        int     gel);
+ at end example
+
+This function returns the contig number for the contig containing the reading
+numbered @var{gel}. It returns -1 if the contig number cannot be found.
diff --git a/scripting_manual/gap4-cio-compile-t.texi b/scripting_manual/gap4-cio-compile-t.texi
new file mode 100644
index 0000000..6da2783
--- /dev/null
+++ b/scripting_manual/gap4-cio-compile-t.texi
@@ -0,0 +1,78 @@
+ at cindex compiling
+ at cindex linking
+ at vindex GAPDB_EXT_INC
+ at vindex GAPDB_EXT_OBJS
+ at vindex GAPDB_EXT_LIBS
+
+To use the Gap4 I/O functions in a program other than Gap4 itself, you need
+to compile with the Gap4 function prototypes visible and link the Gap4
+functions into your binary. At present, the object files required for
+database access are not packaged as a library.
+
+The compiler include search path needs adjusting to add the
+ at file{$STADENROOT/src/gap4} directory and possibly the
+ at file{$STADENROOT/src/g} directory. Once your own object files are compiled,
+they need to be linked with the following gap4 object files.
+
+ at table @code
+ at item  $STADENROOT/src/gap4/$MACHINE-binaries/actf.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/gap-create.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/gap-dbstruct.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/gap-error.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/gap-if.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/gap-init.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/gap-io.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/gap-local.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/gap-remote.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/IO.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/io_handle.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/io-reg.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/io_utils.o
+ at itemx $STADENROOT/src/gap4/$MACHINE-binaries/text-io-reg.o
+ at end table
+
+Finally, a library search path of @file{$STADENROOT/lib/$MACHINE-binaries}
+should be used to link the @code{-lg -ltext_utils -lmisc} libraries.
+
+All of the above definitions have been added to a single Makefile held in
+ at file{$STADENROOT/src/mk/gap4_defs.mk} as the @code{GAPDB_EXT_INC},
+ at code{GAPDB_EXT_OBJS} and @code{GAPDB_EXT_LIBS} variables. When possible,
+these should be used in preference to hard coding the object filenames, as
+this provides protection against future coding changes.
+So for example, if we have a program held in the file @file{demo.c} we could
+have a simple Makefile as follows.
+
+ at example
+SRCROOT=$(STADENROOT)/src
+include $(SRCROOT)/mk/global.mk
+include $(SRCROOT)/mk/$(MACHINE).mk
+
+OBJS = $(O)/demo.o
+
+LIBS = $(MISC_LIB)
+
+$(O)/demo: $(OBJS)
+        $(CLD) -o $@ $(OBJS) $(LIBS) $(LIBSC)
+ at end example
+
+If we now extend this program so that it requires the Gap4 I/O routines, the
+Makefile should be modified to:
+
+ at example
+SRCROOT=$(STADENROOT)/src
+include $(SRCROOT)/mk/global.mk
+include $(SRCROOT)/mk/$(MACHINE).mk
+include $(SRCROOT)/mk/gap4_defs.mk
+
+INCLUDES_E += $(GAPDB_EXT_INC)
+
+OBJS = $(O)/demo.o $(GAPDB_EXT_OBJS)
+
+LIBS = $(MISC_LIB) $(GAPDB_EXT_LIBS)
+
+$(O)/demo: $(OBJS)
+        $(CLD) -o $@ $(OBJS) $(LIBS) $(LIBSC)
+ at end example
+
+If you require an example of a program that utilises the Gap4 I/O functions,
+see the @code{convert} program in @file{$STADENROOT/src/convert/}.
diff --git a/scripting_manual/gap4-cio-database-t.texi b/scripting_manual/gap4-cio-database-t.texi
new file mode 100644
index 0000000..de2c1e1
--- /dev/null
+++ b/scripting_manual/gap4-cio-database-t.texi
@@ -0,0 +1,615 @@
+Before using any of the functions a firm understanding of the data structures
+is needed. The main objects held within the database are readings, contigs,
+templates, vectors, clones and annotations. These reference additional records
+of other objects or one of the primitive types.
+
+There are five basic types from which the database structures are constructed.
+These are:
+
+ at table @var
+ at item GCardinal
+	A single 4 byte integer.
+
+ at item Text
+	An ASCII string which may end in a null. The null character may, or
+	may not, be present in the actual data stored on the disk.
+
+ at item Array
+	An extendable list of 4 byte integer values.
+
+ at item Bitmap
+	An extendable array of single bits.
+
+ at item Data
+	Any other data. This is handled in a similar manner to the Text type
+	except the null character may be present.
+ at end table
+
+In the C code, the @var{GCardinal} is the basic type used in most database
+structures.  Other structure elements are larger and so are typically stored
+as another @code{GCardinal} containing the record number of the data itself.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-GDatabase
+ at subsection The GDatabase Structure
+
+ at vindex GDatabase(C)
+ at example
+#define GAP_DB_VERSION 2
+#define GAP_DNA		   0
+#define GAP_PROTEIN	   1
+
+typedef struct @{ 
+    GCardinal version;		/* Database version - GAP_DB_VERSION */
+    GCardinal maximum_db_size;	/* MAXDB */
+    GCardinal actual_db_size;	/* */
+    GCardinal max_gel_len;	/* 4096 */
+    GCardinal data_class;	/* GAP_DNA or GAP_PROTEIN */
+
+    /* Used counts */
+    GCardinal num_contigs;	/* number of contigs used */
+    GCardinal num_readings;	/* number of readings used */
+
+    /* Bitmaps */
+    GCardinal Nfreerecs;	/* number of bits */
+    GCardinal freerecs;		/* record no. of freerecs bitmap */
+
+    /* Arrays */
+    GCardinal Ncontigs;		/* elements in array */
+    GCardinal contigs;		/* record no. of array of type GContigs */
+
+    GCardinal Nreadings;	/* elements in array */
+    GCardinal readings;		/* record no. of array of type GReading */
+
+    GCardinal Nannotations;	/* elements in array */
+    GCardinal annotations;	/* record no. of array of type GAnnotation */
+    GCardinal free_annotations; /* head of list of free annotations */
+
+    GCardinal Ntemplates;	/* elements in array */
+    GCardinal templates;	/* record no. of array of type GTemplates */
+
+    GCardinal Nclones;		/* elements in array */
+    GCardinal clones;		/* record no. of array of type GClones */
+
+    GCardinal Nvectors;		/* elements in array */
+    GCardinal vectors;		/* record no. of array of type GVectors */
+
+    GCardinal contig_order;	/* record no. of array of type GCardinal */
+
+    GCardinal Nnotes;		/* elements in array */
+    GCardinal notes_a;		/* records that are GT_Notes */
+    GCardinal notes;		/* Unpositional annotations */
+    GCardinal free_notes;	/* SINGLY linked list of free notes */
+@} GDatabase; 
+
+ at end example
+
+This is always the first record in the database. It contains information about
+the Gap4 database as a whole and can be viewed as the root from which all
+other records are eventually referenced. Care must be taken when dealing
+with counts of contigs and readings as there are two copies; one for the used
+number and one for the allocated number.
+
+The structure contains several database record numbers of arrays. These arrays
+in turn contain record numbers of structures. Most other structures, and
+indeed functions within Gap4, then reference structure numbers (eg a reading
+number) and not their record numbers. The conversion from one to the other is
+done by accessing the arrays listed in the GDatabase structure.
+
+For instance, to read the structure for contig number 5 we could do the
+following.
+
+ at example
+GContigs c;
+GT_Read(io, arr(GCardinal, io->contigs, 5-1), &c, sizeof(c), GT_Contigs);
+ at end example
+
+In the above code, @code{io->contigs} is the array of GCardinals whose record
+number is contained within the @var{contigs} element of the GDatabase
+structure. In practice, this is hidden away by simply calling
+"@code{contig_read(io, 5, c)}" instead.
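+
+The conversion from a structure number to a record number can be illustrated
+without a database. The sketch below is an assumption-laden stand-in and not
+Gap4 code: the name record_for_number is hypothetical, and a plain C array
+replaces the extensible array held in the @var{GapIO} structure. It shows only
+the off-by-one indexing, where structure number N is held in element N-1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * table[] plays the role of the array of record numbers (for example the
 * contigs array); nused is the number of elements in use.
 *
 * Returns the record number for a 1-based structure number, or -1 if the
 * number is out of range.
 */
int32_t record_for_number(const int32_t *table, int nused, int number)
{
    assert(table != NULL);
    if (number < 1 || number > nused)
        return -1;
    return table[number - 1];   /* structure N lives in element N-1 */
}
```

+Given a table holding the record numbers 17, 42 and 99, structure number 1
+maps to record 17 and structure number 3 maps to record 99.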
+
+ at table @var
+ at vindex version, GDatabase. (C)
+ at vindex GAP_DB_VERSION(C)
+ at item version
+	Database record format version control. The current version is held
+	within the @code{GAP_DB_VERSION} macro.
+
+ at vindex maximum_db_size, GDatabase. (C)
+ at vindex actual_db_size, GDatabase. (C)
+ at item maximum_db_size
+ at item actual_db_size
+	These are essentially redundant as Gap4 can support any number of
+	readings up to @var{maximum_db_size}, and @var{maximum_db_size} can be
+	anything the user desires. It can be specified using the @code{-maxdb}
+	command line argument to gap4.
+
+ at vindex max_gel_len, GDatabase. (C)
+ at item max_gel_len
+	This is currently hard coded as 4096 (but is relatively easy to
+	change).
+
+ at vindex data_class, GDatabase. (C)
+ at vindex GAP_DNA(C)
+ at vindex GAP_PROTEIN(C)
+ at item data_class
+	This specifies whether the database contains DNA or protein sequences.
+	In the current implementation only DNA is supported.
+
+ at vindex num_contigs, GDatabase. (C)
+ at vindex num_readings, GDatabase. (C)
+ at item  num_contigs
+ at itemx num_readings
+	These specify the number of @strong{used} contigs and readings. They
+	may be different from the number of records allocated.
+
+ at vindex Nfreerecs, GDatabase. (C)
+ at vindex freerecs, GDatabase. (C)
+ at item  Nfreerecs
+ at itemx freerecs
+	@var{freerecs} is the record number of a bitmap with a single element
+	per record in the database. Each free bit in the bitmap corresponds to
+	a free record.	The @var{Nfreerecs} variable holds the number of bits
+	allocated in the freerecs bitmap.
+
+ at vindex Ncontigs, GDatabase. (C)
+ at vindex contigs, GDatabase. (C)
+ at item  Ncontigs
+ at itemx contigs
+	@var{contigs} is the record number of an array of GCardinals. Each
+	element of the array is the record number of a GContigs structure.
+	@var{Ncontigs} is the number of elements allocated in the
+	@var{contigs} array. Note that this is different from
+	@var{num_contigs}, which is the number of elements used.
+
+ at vindex Nreadings, GDatabase. (C)
+ at vindex readings, GDatabase. (C)
+ at item  Nreadings
+ at itemx readings
+	@var{readings} is the record number of an array of GCardinals. Each
+	element of the array is the record number of a GReadings structure.
+	@var{Nreadings} is the number of elements allocated in the
+	@var{readings} array. Note that this is different from
+	@var{num_readings}, which is the number of elements used.
+
+ at vindex Nannotations, GDatabase. (C)
+ at vindex annotations, GDatabase. (C)
+ at vindex free_annotations, GDatabase. (C)
+ at item  Nannotations
+ at itemx annotations
+ at itemx free_annotations
+	@var{annotations} is the record number of an array of GCardinals. Each
+	element of the array is the record number of a GAnnotations
+	structure.  @var{Nannotations} is the number of elements allocated in
+	the @var{annotations} array. @var{free_annotations} is the record
+	number of the first free annotation, which forms the head of a linked
+	list of free annotations.
+
+ at vindex Ntemplates, GDatabase. (C)
+ at vindex templates, GDatabase. (C)
+ at item  Ntemplates
+ at itemx templates
+	@var{templates} is the record number of an array of GCardinals. Each
+	element of the array is the record number of a GTemplates structure.
+	@var{Ntemplates} is the number of elements allocated in the
+	@var{templates} array.
+
+ at vindex Nclones, GDatabase. (C)
+ at vindex clones, GDatabase. (C)
+ at item  Nclones
+ at itemx clones
+	@var{clones} is the record number of an array of GCardinals. Each
+	element of the array is the record number of a GClones structure.
+	@var{Nclones} is the number of elements allocated in the @var{clones}
+	array.
+
+ at vindex Nvectors, GDatabase. (C)
+ at vindex vectors, GDatabase. (C)
+ at item  Nvectors
+ at itemx vectors
+	@var{vectors} is the record number of an array of GCardinals. Each
+	element of the array is the record number of a GVectors structure.
+	@var{Nvectors} is the number of elements allocated in the
+	@var{vectors} array.
+
+ at vindex contig_order, GDatabase. (C)
+ at item  contig_order
+	This is the record number of an array of GCardinals of size
+	@var{Ncontigs}. Each element of the array is a contig number. The
+	index of the array element indicates the position of this contig.
+	Thus the contigs are displayed in the order that they appear in this
+	array.
+ at end table
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-GReadings
+ at subsection The GReadings Structure
+
+ at vindex GAP_SENSE_ORIGINAL(C)
+ at vindex GAP_SENSE_REVERSE(C)
+ at vindex GAP_STRAND_FORWARD(C)
+ at vindex GAP_STRAND_REVERSE(C)
+ at vindex GAP_PRIMER_UNKNOWN(C)
+ at vindex GAP_PRIMER_FORWARD(C)
+ at vindex GAP_PRIMER_REVERSE(C)
+ at vindex GAP_PRIMER_CUSTFOR(C)
+ at vindex GAP_PRIMER_CUSTREV(C)
+ at vindex GAP_CHEM_TERMINATOR(C)
+
+ at example
+/* GReadings.sense */
+#define GAP_SENSE_ORIGINAL 0
+#define GAP_SENSE_REVERSE  1
+/* GReadings.strand */
+#define GAP_STRAND_FORWARD 0
+#define GAP_STRAND_REVERSE 1
+/* GReadings.primer */
+#define GAP_PRIMER_UNKNOWN 0
+#define GAP_PRIMER_FORWARD 1
+#define GAP_PRIMER_REVERSE 2
+#define GAP_PRIMER_CUSTFOR 3
+#define GAP_PRIMER_CUSTREV 4
+
+/* GReadings.chemistry */
+/*	Bit 0 is 1 for terminator, 0 for primer */
+#define GAP_CHEM_TERMINATOR	(1<<0)
+/*	Bits 1 to 4 inclusive are the type (any one of, not bit pattern) */
+#define GAP_CHEM_TYPE_MASK	(15<<1)
+#define GAP_CHEM_TYPE_UNKNOWN	(0<<1)
+#define GAP_CHEM_TYPE_ABI_RHOD	(1<<1)
+#define GAP_CHEM_TYPE_ABI_DRHOD	(2<<1)
+#define GAP_CHEM_TYPE_BIGDYE	(3<<1)
+#define GAP_CHEM_TYPE_ET	(4<<1)
+#define GAP_CHEM_TYPE_LICOR	(5<<1)
+
+typedef struct @{
+    GCardinal name;
+    GCardinal trace_name;
+    GCardinal trace_type;
+    GCardinal left;		/* left neighbour */
+    GCardinal right;		/* right neighbour */
+    GCardinal position;		/* position in contig */
+    GCardinal length;		/* total length of reading */
+    GCardinal sense;		/* 0 = original, 1 = reverse */
+    GCardinal sequence;
+    GCardinal confidence;
+    GCardinal orig_positions;
+    GCardinal chemistry;	/* see comments above (GAP_CHEM_*) */
+    GCardinal annotations;	/* start of annotation list */
+    GCardinal sequence_length;	/* clipped length */
+    GCardinal start;		/* last base of left cutoff */
+    GCardinal end;		/* first base of right cutoff */
+    GCardinal template;		/* aka subclone */
+    GCardinal strand;		/* 0 = forward, 1 = reverse */
+    GCardinal primer;		/* 0 = unknown, 1 = forwards, */
+				/* 2 = reverse, 3 = custom forward */
+                                /* 4 = custom reverse */
+    GCardinal notes;		/* Unpositional annotations */
+@} GReadings; 
+ at end example
+
+The reading structure contains information related to individual sequence
+fragments. It should be read and written using the @code{gel_read} and
+ at code{gel_write} functions. Whilst it is perfectly possible to use
+ at code{GT_Read} to access this data, using @code{gel_read} will read from an
+in-memory cache and so is much faster. @code{GT_Write} must never be used to
+write a @var{GReadings} structure as it will invalidate the cache.
+
+ at table @var
+ at vindex name, GReadings. (C)
+ at item name
+The record number of the text string containing the reading identifier.
+Care must be taken to use the correct functions to access the reading name.
+Use @code{io_read_reading_name} and @code{io_write_reading_name} instead of
+ at code{io_read_text} or @code{io_write_text}. _oxref(Script-io_rw_reading_name,
+io_read_reading_name and io_write_reading_name).
+
+ at vindex trace_name, GReadings. (C)
+ at item trace_name
+The record number of the text string containing the trace filename.
+
+ at vindex trace_type, GReadings. (C)
+ at item trace_type
+The record number of the text string containing the type of the trace.
+
+ at vindex left, GReadings. (C)
+ at item left
+	The left hand neighbour of this sequence, or 0 if this is the first
+	reading in the contig. Sequences are stored in a doubly linked list
+	which is sorted on positional order. The right hand neighbour of the
+	sequence referenced by this field should be the same as this sequence
+	number. NOTE: this is the reading number, not the record number.
+
+ at vindex right, GReadings. (C)
+ at item right
+	The right hand neighbour of this sequence, or 0 if this is the last
+	reading in the contig. The left hand neighbour of the sequence
+	referenced by this field should be the same as this sequence number.
+	NOTE: this is the reading number, not the record number.
+
+ at vindex position, GReadings. (C)
+ at item position
+	The absolute position of this reading within the contig (starting from
+	1).
+
+ at vindex length, GReadings. (C)
+ at item length
+	The total length of this reading, including cutoff data.
+
+ at vindex sense, GReadings. (C)
+ at item sense
+	The orientation of this reading. 0=original, 1=reversed. The
+	@code{GAP_SENSE_*} macros should be used in preference to integer
+	values.
+
+ at vindex sequence, GReadings. (C)
+ at item sequence
+	The record number of the text string containing the complete sequence.
+
+ at vindex confidence, GReadings. (C)
+ at item confidence
+	The record number of the 1 byte integer array containing the confidence
+	values. This has one value per sequence base and so is the same length
+	as the sequence array.
+
+ at vindex orig_positions, GReadings. (C)
+ at item orig_positions
+	The record number of the 2 byte integer array containing the original
+	positions of each base. This has one 2 byte value per sequence base.
+
+ at vindex chemistry, GReadings. (C)
+ at item chemistry
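+	The chemistry type of this reading. Bit 0, tested with @code{chemistry &
+	GAP_CHEM_TERMINATOR}, holds the terminator reaction information: non
+	zero implies a terminator reaction, which can then optionally be used
+	as double stranded sequence. Bits 1 to 4, extracted with
+	@code{chemistry & GAP_CHEM_TYPE_MASK}, hold one of the
+	@code{GAP_CHEM_TYPE_*} values.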
+
+ at vindex annotations, GReadings. (C)
+ at item annotations
+	The number of the first annotation for this reading. Annotations are
+	stored in a linked list structure. This value is 0 if no annotations
+	are available. NOTE: This is not the same as the record number of the
+	first annotation.
+
+ at vindex sequence_length, GReadings. (C)
+ at item sequence_length
+	The used length of sequence. This should always be the same as
+	@var{end-start-1}.
+
+ at vindex start, GReadings. (C)
+ at item start
+	The position of the last base in the left hand cutoff data (starting
+	from 1).
+
+ at vindex end, GReadings. (C)
+ at item end
+	The position of the first base in the right hand cutoff data (starting
+	from 1).
+
+ at vindex template, GReadings. (C)
+ at item template
+	The template number. Readings sharing a template (ie insert) have the
+	same template number.
+
+ at vindex strand, GReadings. (C)
+ at item strand
+	The strand this sequence was derived from. 0=forward, 1=reverse. The
+	@code{GAP_STRAND_*} macros should be used in preference to integer
+	values.
+
+ at vindex primer, GReadings. (C)
+ at item primer
+	The primer type for this sequence. 0=unknown, 1=forward, 2=reverse,
+	3=custom forward, 4=custom reverse. The @code{GAP_PRIMER_*} macros
+	should be used in preference to integer values.
+ at end table
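+
+The bit layout of the @var{chemistry} field, as given by the macro comments in
+the listing above, can be packed and unpacked as follows. This is a minimal
+sketch: the macro values are copied from that listing, while the helper names
+chem_make, chem_is_terminator and chem_type are hypothetical and are not part
+of the Gap4 interface.

```c
#include <assert.h>

/* Values copied from the GReadings.chemistry macros listed above. */
#define GAP_CHEM_TERMINATOR     (1<<0)
#define GAP_CHEM_TYPE_MASK      (15<<1)
#define GAP_CHEM_TYPE_UNKNOWN   (0<<1)
#define GAP_CHEM_TYPE_BIGDYE    (3<<1)

/* Build a chemistry value from a GAP_CHEM_TYPE_* value and a flag. */
int chem_make(int type, int terminator)
{
    /* The type must fit entirely within bits 1 to 4. */
    assert((type & ~GAP_CHEM_TYPE_MASK) == 0);
    return type | (terminator ? GAP_CHEM_TERMINATOR : 0);
}

/* Non-zero for a terminator reaction, zero for a primer reaction. */
int chem_is_terminator(int chemistry)
{
    return (chemistry & GAP_CHEM_TERMINATOR) != 0;
}

/* The type, directly comparable with the GAP_CHEM_TYPE_* macros. */
int chem_type(int chemistry)
{
    return chemistry & GAP_CHEM_TYPE_MASK;
}
```

+The type is one value out of several and not a set of independent flags, so
+it should be compared for equality after masking and not tested bit by bit.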
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-GContigs
+ at subsection The GContigs Structure
+
+ at example
+typedef struct @{ 
+    GCardinal left;		/* left reading number */
+    GCardinal right;		/* right reading number */
+    GCardinal length;		/* contig sequence length */
+    GCardinal annotations;	/* start of annotation list */
+    GCardinal notes;		/* Unpositional annotations */
+@} GContigs; 
+ at end example
+
+ at table @var
+ at vindex left, GContigs. (C)
+ at item left
+	The number of the leftmost reading in this contig. This is a reading
+	number, not a record number.
+
+ at vindex right, GContigs. (C)
+ at item right
+	The number of the rightmost reading in this contig. This is a reading
+	number, not a record number. Note that the rightmost reading is
+	defined as the reading with its left end furthest to the right and not
+	the reading with the right end furthest to the right.
+
+ at vindex length, GContigs. (C)
+ at item length
+	The total length of this contig.
+
+ at vindex annotations, GContigs. (C)
+ at item annotations
+	The annotation number of the first annotation on the consensus for
+	this contig or 0 if none are available.
+ at end table
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-GAnnotations
+ at subsection The GAnnotations Structure
+
+ at example
+typedef struct @{ 
+    GCardinal type;
+    GCardinal position; 
+    GCardinal length; 
+    GCardinal strand; 
+    GCardinal annotation; 
+    GCardinal next;
+@} GAnnotations; 
+ at end example
+
+The annotations (aka tags) are comments attached to segments of readings or
+contig consensus sequences. The location is stored as position and length in
+the original orientation, so complementing a reading does not require edits to
+the annotations. Consensus sequences are always considered uncomplemented and
+so complementing a contig does require complementing of annotations that are
+stored on the consensus.
+
+The annotations can be linked together to form linked lists, sorted on
+ascending position. The @var{GReadings} and @var{GContigs} structures contain
+an annotations field which holds the annotation number of the leftmost
+(original orientation) annotation.
+
+Unused annotations are kept in an unsorted linked list referenced by the
+ at var{free_annotations} field of the @var{GDatabase} structure.
+
+ at table @var
+ at vindex type, GAnnotations. (C)
+ at item type
+	The type of the annotation; a 4 byte integer which the user sees as a
+	4 character string.
+
+ at vindex position, GAnnotations. (C)
+ at item position
+	The position of the left end of the annotation.
+
+ at vindex length, GAnnotations. (C)
+ at item length
+	The length of the annotation.
+
+ at vindex strand, GAnnotations. (C)
+ at item strand 
+	The annotation strand. 0 for positive, 1 for negative, and 2 for both.
+
+ at vindex annotation, GAnnotations. (C)
+ at item annotation
+	The record number of the text string containing a comment for the
+	annotation. Zero means no comment.
+
+ at vindex next, GAnnotations. (C)
+ at item next
+	The annotation number of the next annotation in the linked list, or
+	zero if this is the last in this linked list.
+ at end table
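+
+Because each list is sorted on ascending position and linked through the
+@var{next} field, a search along it can stop early. The sketch below is an
+illustration under stated assumptions and not Gap4 code: the anno_sketch
+structure keeps only three of the fields, the annotations sit in a plain
+array indexed by annotation number, and the name first_anno_covering is
+hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for GAnnotations, holding only the fields used here. */
typedef struct {
    int position;   /* left end of the annotation */
    int length;     /* length of the annotation */
    int next;       /* number of the next annotation, 0 at end of list */
} anno_sketch;

/*
 * anno[n] describes annotation number n; anno[0] is unused. head is the
 * first annotation number of the list, or 0 for an empty list.
 *
 * Returns the number of the first annotation covering pos, 0 if there is
 * none, or -1 if the list holds a bad number or a loop.
 */
int first_anno_covering(const anno_sketch *anno, int nanno, int head, int pos)
{
    int n, visited = 0;

    assert(anno != NULL);
    for (n = head; n != 0; n = anno[n].next) {
        if (n < 1 || n > nanno || ++visited > nanno)
            return -1;          /* corrupt list */
        if (anno[n].position > pos)
            break;              /* sorted list: later ones start further right */
        if (pos < anno[n].position + anno[n].length)
            return n;
    }
    return 0;
}
```

+With one annotation at position 10 of length 5 followed by one at position 20
+of length 10, position 12 is covered by the first, position 25 by the second,
+and position 17 by neither.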
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-GVectors
+ at subsection The GVectors Structure
+
+ at example
+/* GVectors.level */
+#define GAP_LEVEL_UNKNOWN  0
+#define GAP_LEVEL_CLONE	   1
+#define GAP_LEVEL_SUBCLONE 2
+
+typedef struct @{
+    GCardinal name;		/* vector name */
+    GCardinal level;		/* 1=clone, 2=subclone, etc */
+@} GVectors; 
+ at end example
+
+The vector structure simply contains information on any vectors used in
+cloning and subcloning. The @var{GTemplates} and @var{GClones} structures
+reference this structure.
+
+ at table @var
+ at vindex name, GVectors. (C)
+ at item name
+	The record number of the text string containing the name for this
+	vector.
+
+ at vindex level, GVectors. (C)
+ at item level
+	A numeric value for the level of the vector. Use the
+	@code{GAP_LEVEL_*} macros for this field.
+ at end table
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-GTemplates
+ at subsection The GTemplates Structure
+
+ at example
+typedef struct @{
+    GCardinal name;
+    GCardinal strands;
+    GCardinal vector;
+    GCardinal clone;
+    GCardinal insert_length_min;
+    GCardinal insert_length_max;
+@} GTemplates;
+ at end example
+
+The template structure holds information about the physical insert of a clone.
+A reading lies within a single template, but several readings may share the
+same template.
+
+ at table @var
+ at vindex name, GTemplates. (C)
+ at item name
+	The record number of the text string containing the template name.
+
+ at vindex strands, GTemplates. (C)
+ at item strands
+	The number of strands available. Either 1 or 2.
+
+ at vindex vector, GTemplates. (C)
+ at item vector
+	The vector number of the vector ("sequencing vector") used.
+
+ at vindex clone, GTemplates. (C)
+ at item clone
+	The clone number of the clone that this template came from.
+
+ at vindex insert_length_min, GTemplates. (C)
+ at item insert_length_min
+	The minimum expected size of insert.
+
+ at vindex insert_length_max, GTemplates. (C)
+ at item insert_length_max
+	The maximum expected size of insert.
+ at end table
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-GClones
+ at subsection The GClones Structure
+
+ at example
+typedef struct @{
+    GCardinal name;
+    GCardinal vector;
+@} GClones;
+ at end example
+
+The clone structure holds simple information to identify which original piece
+of material our templates were derived from. Often we have a single clone per
+database and the database name is the same as the clone name.
+
+ at table @var
+ at vindex name, GClones. (C)
+ at item name
+	The record number of the text string containing the clone name.
+
+ at vindex vector, GClones. (C)
+ at item vector
+	The vector number of the vector used. The vector referenced here could
+	be M13 for a very small project, or a cosmid, YAC or BAC for a larger
+	"subcloned" project.
+ at end table
diff --git a/scripting_manual/gap4-cio-gapio-t.texi b/scripting_manual/gap4-cio-gapio-t.texi
new file mode 100644
index 0000000..aa52d85
--- /dev/null
+++ b/scripting_manual/gap4-cio-gapio-t.texi
@@ -0,0 +1,172 @@
+ at vindex GapIO(C)
+ at cindex GapIO structure
+
+The main object passed around between the I/O functions is the @var{GapIO}
+structure. This is returned from the @code{open_db} function and is then
+passed around in much the same manner as a unix file descriptor or @var{FILE}
+pointer is. The structure, held in @file{gap4/IO.h}, is as follows.
+
+ at example
+typedef struct @{
+    GapServer *server;		/* our server */
+    GapClient *client;		/* ourselves */
+
+    int Nviews;			/* number of locked views */
+    Array views;		/* all locked views */
+
+    GDatabase db;		/* main database record */
+    Bitmap freerecs;		/* bitmap of unused */
+    Array contigs;		/* list of contig */
+    Array readings;		/* list of reading records */
+    Array annotations;		/* list of annotation records */
+    Array templates;		/* list of template records */
+    Array clones;		/* list of clone records */
+    Array vectors;		/* list of vector records */
+
+    int4 *relpos;		/* relpg[] */
+    int4 *length;		/* length[] */
+    int4 *lnbr;			/* lnbr[] */
+    int4 *rnbr;			/* rnbr[] */
+
+    char db_name[DB_FILELEN];	/* database "file.version" */
+
+    Array contig_order;		/* order of contigs */
+    Array contig_reg;		/* Registration arrays for each contig */
+
+#ifdef GAP_CACHE
+    Array reading;		/* Array of GReading _structures_ */
+    Array read_names;		/* Array of reading names */
+#endif
+    int freerecs_changed;	/* Whether to flush freerecs bitmap */
+    Bitmap updaterecs;		/* bitmap of updated records */
+    Bitmap tounlock;		/* bitmap of records to unlock at next flush */
+@} GapIO;
+ at end example
+
+Many of the items held within this structure are used internally by the I/O
+functions. However, each is worth describing very briefly.
+
+ at table @var
+ at vindex server, GapIO. (C)
+ at vindex client, GapIO. (C)
+ at item  server
+ at itemx client
+	The @var{server} and @var{client} pointers are used in the low level g
+	library communication. They need not be used by any external code.
+
+ at vindex Nviews, GapIO. (C)
+ at vindex views, GapIO. (C)
+ at item  Nviews
+ at itemx views
+	Each record in the database needs to be locked before it can be
+	accessed. A view is returned for each independent lock of a record.
+	These are used internally by the low level reading and writing
+	functions.
+
+ at vindex db, GapIO. (C)
+ at item  db
+	This is a direct copy of the @var{GDatabase} structure for this
+	database. This needs to be kept up to date with the on disk copy
+	whenever changes are made (eg by adding a new reading).
+
+ at vindex freerecs, GapIO. (C)
+ at item  freerecs
+	This is a copy of the free records bitmap referenced by the
+	@var{io->db.freerecs} field. It is kept up to date internally.
+
+ at vindex contigs, GapIO. (C)
+ at vindex readings, GapIO. (C)
+ at vindex annotations, GapIO. (C)
+ at vindex templates, GapIO. (C)
+ at vindex clones, GapIO. (C)
+ at vindex vectors, GapIO. (C)
+ at item  contigs
+ at itemx readings
+ at itemx annotations
+ at itemx templates
+ at itemx clones
+ at itemx vectors
+	These are lookup arrays to convert structure numbers to record
+	numbers. For instance, all readings are numbered from 1 upwards.
+	Similarly for contigs. However reading number 1 and contig number 1
+	will have their own unique record numbers in the g database.
+
+	The extensible array package is used for storing this information. To
+	translate from reading number @var{N} to the record number use
+	"@code{arr(GCardinal, io->readings, N-1)}".
+
+ at vindex relpos, GapIO. (C)
+ at vindex length, GapIO. (C)
+ at vindex lnbr, GapIO. (C)
+ at vindex rnbr, GapIO. (C)
+ at item  relpos
+ at itemx length
+ at itemx lnbr
+ at itemx rnbr
+	These are arrays of 4-byte integers of size
+	@var{io->db.actual_db_size}. They hold information about both
+	readings and contigs. 
+
+	For readings, the array contents hold copies of the @var{position},
+	@var{sequence_length}, @var{left} and @var{right} fields of the
+	@var{GReadings} structures. Reading number @var{R} has this data
+	stored in array element @var{R} (counting from element 0, which is
+	left blank).
+	
+	For contigs, the array contents hold copies of the @var{length},
+	@var{left} and @var{right} fields of the @var{GContigs} structure. For
+	historical reasons the contig length is held in the @var{relpos}
+	array with the @var{length} array left blank. Contig number @var{C}
+	has this data stored in array element @var{io->db.actual_db_size-C}.
+
+	For ease of use and future compatibility several macros have been
+	defined for accessing this data. _oxref(G4Cio-Macros, IO.h Macros).
+	These should be used instead of direct access.	Thus to find the
+	length of reading @var{R} we use @code{io_length(io,R)} and to find
+	the length of contig @var{C} we use @code{io_clength(io,C)}.
+
+	NOTE: These arrays are not updated automatically. If you modify data
+	using one of the write functions you also need to update the arrays in
+	sync. This is one of the problems that the check database command
+	looks for so mistakes should be obvious.
+
+ at vindex db_name, GapIO. (C)
+ at item  db_name
+	The name of the database in a @i{file.version} syntax. This array is
+	allocated to be @code{DB_FILELEN} bytes long. The @code{io_name} macro
+	should be used for accessing this field.
+
+ at vindex contig_order, GapIO. (C)
+ at item  contig_order
+	An array loaded from @var{io->db.contig_order}. This holds the left to
+	right ordering of contigs. It is automatically updated by the create
+	and delete contig functions.
+
+ at vindex contig_reg, GapIO. (C)
+ at item  contig_reg
+	The contig registration scheme information. There's an entire chapter
+	on this topic. _oxref(Registration, Gap4 Contig Registration Scheme).
+
+ at vindex reading, GapIO. (C)
+ at vindex read_names, GapIO. (C)
+ at item  reading
+ at itemx read_names
+	These are cached copies of the @var{GReadings} structures and the
+	reading names referenced by the @var{GReadings.name} fields. They are
+	updated automatically when using the correct functions
+	(@code{gel_read} and @code{gel_write}). Use of lower level functions
+	is disallowed for accessing this data.
+
+ at vindex freerecs_changed, GapIO. (C)
+ at vindex updaterecs, GapIO. (C)
+ at vindex tounlock, GapIO. (C)
+ at item  freerecs_changed
+ at itemx updaterecs
+ at itemx tounlock
+	These three are used internally for maintaining the update and
+	data flushing scheme. @var{freerecs_changed} is a flag to state
+	whether or not the @var{freerecs} bitmap needs writing to disk.
+	@var{updaterecs} and @var{tounlock} are bitmaps with a bit per record
+	to signify whether the record needs rewriting or unlocking. Their use
+	is not required outside of the low level functions.
+ at end table
diff --git a/scripting_manual/gap4-cio-high-t.texi b/scripting_manual/gap4-cio-high-t.texi
new file mode 100644
index 0000000..d83401b
--- /dev/null
+++ b/scripting_manual/gap4-cio-high-t.texi
@@ -0,0 +1,716 @@
+ at node G4Cio-io_handle
+ at subsection io_handle and handle_io
+ at cindex IO handles
+ at findex io_handle(C)
+ at findex handle_io(C)
+
+ at example
+#include <IO.h>
+
+GapIO *io_handle(
+	f_int *handle);
+
+f_int *handle_io(
+	GapIO *io);
+ at end example
+
+These two routines convert between @var{GapIO} pointers and integer handles.
+Both the Fortran and Tcl code use integer handles because neither supports
+structures.
+
+ at code{io_handle} takes a pointer to an integer handle and returns the
+associated @var{GapIO} pointer. It returns NULL for failure.
+
+ at code{handle_io} takes a @var{GapIO} pointer and returns a pointer to an
+integer handle. It returns NULL for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-io_read_seq
+ at subsection io_read_seq
+ at cindex Sequence, reading
+ at cindex Reading sequences
+ at findex io_read_seq(C)
+
+ at example
+#include <IO.h>
+
+int io_read_seq(
+	GapIO  *io,
+	int	N,
+	int2   *length,
+	int2   *start,
+	int2   *end,
+	char   *seq,
+	int1   *conf,
+	int2   *opos);
+ at end example
+
+This function loads information on gel readings from memory and disk and
+stores it in the parameters passed in.
+
+The reading number to read should be passed as @var{N}. The integers pointed
+to by @var{length}, @var{start} and @var{end} pointers are then written to
+with the total length (@var{GReadings.length}), the last base number (counting
+from 1) of the left hand cutoff data, and the first base number of the right
+hand cutoff data.
+
+The sequence, confidence and original position data is then loaded and stored
+in the address pointed to by @var{seq}, @var{conf} and @var{opos} respectively.
+This is expected to be allocated to the correct size by the caller of this
+function. Either or both of @var{conf} and @var{opos} can be NULL, in which
+case the data is not loaded or stored. @var{seq} must always be non NULL.
+
+This function returns 0 for success and non zero for failure.
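+
+The cutoff positions above can be turned into the used part of a reading with
+a little arithmetic. The following is a minimal sketch, not part of
+@file{IO.h}: the helper name @code{used_sequence} is hypothetical, and it
+assumes that a reading with no cutoff data has @var{start} equal to 0 and
+@var{end} equal to @var{length}+1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of IO.h): copy the used, non-cutoff part
 * of a reading into 'out' and return its length, or -1 on bad input.
 *
 * start = last base (counting from 1) of the left cutoff,
 * end   = first base (counting from 1) of the right cutoff,
 * so the used bases are numbered start+1 .. end-1 inclusive.
 */
int used_sequence(const char *seq, int length, int start, int end,
                  char *out, size_t out_size)
{
    int used;

    assert(seq != NULL && out != NULL);

    if (start < 0 || end > length + 1 || end <= start)
        return -1;

    used = end - start - 1;
    if ((size_t)used + 1 > out_size)
        return -1;                      /* no room for the bases plus NUL */

    /* Base number start+1 lives at C index 'start'. */
    memcpy(out, seq + start, (size_t)used);
    out[used] = '\0';
    return used;
}
```

+With @var{length} 10, @var{start} 2 and @var{end} 9 the helper copies bases 3
+to 8, six bases in all.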
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-io_write_seq
+ at subsection io_write_seq
+ at cindex Sequence, writing
+ at cindex Writing sequences
+ at findex io_write_seq(C)
+
+ at example
+#include <IO.h>
+
+
+int io_write_seq(
+	GapIO  *io,
+	int	N,
+	int2   *length,
+	int2   *start,
+	int2   *end,
+	char   *seq,
+	int1   *conf,
+	int2   *opos);
+ at end example
+
+This function updates disk and memory details of reading number @var{N}. If
+this reading does not yet exist, all nonexistent readings up to and including
+ at var{N} will be initialised first using the @code{io_init_readings} function.
+
+[FIXME: The current implementation @strong{does not} update the Fortran lngth
+(io_length()) array. This needs to be done by the caller. ]
+
+The @var{length} argument is the total length of the sequence, and hence also
+the expected size of the @var{seq}, @var{conf} and @var{opos} arrays.
+ at var{start} and @var{end} contain the last base number of the left cutoff data
+and the first base number of the right cutoff data.
+
+Unlike @var{io_read_seq}, all arguments to this function are mandatory.
+If the records on disk do not already exist then they are allocated first
+using the @code{allocate} function.
+
+This function returns 0 for success and non zero for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-get_read_info
+ at subsection get_read_info, get_vector_info, get_clone_info and get_subclone_info
+ at findex get_read_info(C)
+ at findex get_vector_info(C)
+ at findex get_clone_info(C)
+ at findex get_subclone_info(C)
+
+ at example
+#include <IO.h>
+
+int get_read_info(
+	GapIO  *io,
+	int	N,
+	char   *clone,
+	int	l_clone,
+	char   *cvector,
+	int	l_cvector,
+	char   *subclone,
+	int	l_subclone,
+	char   *scvector,
+	int	l_scvector,
+	int    *length,
+	int    *insert_min,
+	int    *insert_max,
+	int    *direction,
+	int    *strands,
+	int    *primer,
+	int    *clone_id,
+	int    *subclone_id,
+	int    *cvector_id,
+	int    *scvector_id);
+
+int get_vector_info(
+	GapIO  *io,
+	int	vector_id,
+	char   *vector,
+	int l_vector);
+
+int get_clone_info(
+	GapIO  *io,
+	int	clone_id,
+	char   *clone,
+	int	l_clone,
+	char   *cvector,
+	int	l_cvector,
+	int    *cvector_id);
+
+int get_subclone_info(
+	GapIO  *io,
+	int	subclone_id,
+	char   *clone,
+	int	l_clone,
+	char   *cvector,
+	int	l_cvector,
+	char   *subclone,
+	int	l_subclone,
+	char   *scvector,
+	int	l_scvector,
+	int    *insert_min,
+	int    *insert_max,
+	int    *strands,
+	int    *clone_id,
+	int    *cvector_id,
+	int    *scvector_id);
+ at end example
+
+These functions return clone, template and vector information.
+
+ at code{get_vector_info} returns the name of a vector. This is stored in the
+buffer at @var{vector}.
+
+ at code{get_clone_info} returns the name of the clone and the vector
+number (stored at @var{clone} and @var{cvector_id}) and the results of
+ at code{get_vector_info} for this vector.
+
+ at code{get_subclone_info} returns the template information (insert size, number
+of strands, vector and clone numbers stored at @var{insert_min},
+ at var{insert_max}, @var{strands}, @var{scvector_id} and @var{clone_id}) along
+with the results from @code{get_vector_info} and @code{get_clone_info} on the
+appropriate vector and clone numbers.
+
+ at code{get_read_info} returns the reading information including direction,
+primer, template (subclone) number (stored at @var{direction}, @var{strands},
+ at var{primer}, and @var{clone_id}), and the results of the
+ at code{get_subclone_info} on this template number.
+
+For all four functions, the arguments used to store text fields, such as the
+clone name (@var{clone}), all have corresponding buffer lengths sent as the
+same argument name preceded by @var{l_} (e.g. @var{l_clone}). These buffers
+need to be allocated by the caller of the function.
+
+Any buffer or integer pointer arguments may be passed as @code{NULL} to avoid
+filling in this field. For buffers the same is also true when specifying the
+buffer length as zero.
+
+The @var{clone}, @var{vector} and @var{subclone} buffers are used to store the
+names of the clone, vector or template. If appropriate, the clone or
+template number will also be stored at the @var{clone_id} and
+ at var{subclone_id} addresses.
+
+For functions returning information on more than one vector, these are split into
+two levels. The sequencing vector is the vector used to sequence this
+template. It has arguments named @var{scvector} (name), @var{l_scvector} (name
+length) and @var{scvector_id} (vector number). The clone vector is the vector
+used in the sequencing of the fragment which is later broken down and
+resequenced as templates. This may not be appropriate in many projects. It has
+arguments named @var{cvector} (name), @var{l_cvector} (name length) and
+ at var{cvector_id} (vector number).
+
+All functions return 0 for success and an error code for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_init_reading
+ at subsection io_init_reading, io_init_contig and io_init_annotations
+ at findex io_init_reading(C)
+ at findex io_init_contig(C)
+ at findex io_init_annotations(C)
+
+ at example
+#include <IO.h>
+
+int io_init_reading(
+	GapIO  *io,
+	int	N);
+
+int io_init_contig(
+	GapIO  *io,
+	int	N);
+
+int io_init_annotations(
+	GapIO  *io,
+	int	N);
+ at end example
+
+These functions create new reading, contig and annotations structures. Each
+takes two arguments; the first being the @var{GapIO} pointer, and the second
+being the new reading, contig or annotation number to create. This is not the
+number of new structures, but rather the highest allowed number for this
+structure.
+For instance, if we have 10 readings, "@code{io_init_reading(io, 12)}" will
+create two more, numbered 11 and 12.
+
+For readings, the records are recovered (by increasing the @var{GDatabase}
+ at var{NumReadings} field to @var{NReadings}) if available. The new
+ at var{GReadings} structures are not guaranteed to be clear.
+
+For contigs, the records are recovered if available. The contig_order array is
+also updated with the new contigs being added at the rightmost position. The
+new contigs are added to the registration scheme with blank registration
+lists. The new @var{GContigs} structures are not guaranteed to be clear.
+
+For annotations, new records are always allocated from disk. It is up to the
+caller to first check that there are no free annotations in the
+ at var{free_annotations} list. The new @var{GAnnotations} structures are not
+guaranteed to be clear.
+
+All functions return 0 for success, and -1 for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_read_annotation
+ at subsection io_read_annotation and io_write_annotation
+ at findex io_read_annotation(C)
+ at findex io_write_annotation(C)
+ at cindex tags, reading and writing
+ at cindex annotations, reading and writing
+
+ at example
+#include <IO.h>
+
+int io_read_annotation(
+	GapIO  *io,
+	int	N,
+	int    *anno);
+
+int io_write_annotation(
+	GapIO  *io,
+	int	N,
+	int    *anno);
+ at end example
+
+These functions read and write the first annotation number in the linked lists
+referenced by the reading and contig structures.
+For both functions, @var{N} is a reading number if it is above zero or a
+contig number when below zero (in which case it is negated).
+
+ at code{io_read_annotation} reads the @var{annotations} field of reading @var{N}
+or contig @var{-N} and stores this in @var{anno}. It sets @var{anno} to 0 and
+returns 1 for failure. Otherwise it returns 0.
+
+ at code{io_write_annotation} sets the @var{annotations} field of reading @var{N}
+or contig @var{-N} to be @var{*anno}. Despite the fact that it is a pointer,
+the contents of @var{anno} is not modified. It returns 1 for failure and 0 for
+success (but currently always returns 0).
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node allocate
+ at subsection allocate
+ at cindex allocating records
+ at findex allocate(C)
+
+ at example
+#include <IO.h>
+
+int allocate(
+	GapIO	 *io,
+	GCardinal type);
+ at end example
+
+These allocate and deallocate records in the g database.
+
+The @code{allocate} function allocates a new record from the g database. It
+finds a free record, or creates a new record, and returns this record number.
+The record will be automatically locked for exclusive read/write access. The
+type of the record is sent in @var{type}.  This must be one of the following:
+
+ at itemize @asis
+ at item @code{GT_Text}
+ at item @code{GT_Data}
+ at item @code{GT_Array}
+ at item @code{GT_Bitmap}
+ at item @code{GT_Database}
+ at item @code{GT_Contigs}
+ at item @code{GT_Readings}
+ at item @code{GT_Vectors}
+ at item @code{GT_Annotations}
+ at item @code{GT_Templates}
+ at item @code{GT_Clones}
+ at end itemize
+
+The function does not initialise or even write the new record to disk. The
+record number is valid, but a @code{GT_Read} call will produce an error. It is
+up to the caller to initialise the structure and perform the first
+ at code{GT_Write} (or equivalent) call.
+
+It returns the record number for success, and terminates the program for
+failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node deallocate
+ at subsection deallocate
+ at findex deallocate(C)
+ at cindex deallocating records
+ at cindex removing records
+
+ at example
+#include <IO.h>
+
+int deallocate(
+	GapIO	 *io,
+	int	  rec);
+ at end example
+
+The @code{deallocate} function removes record @var{rec} from the g database.
+This uses the @code{g_remove} function, but unlocking is only performed at the
+next database flush.
+
+It returns 0 for success and 1 for failure.
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_deallocate_reading
+ at subsection io_deallocate_reading
+ at findex io_deallocate_reading(C)
+ at cindex readings, deallocating
+ at cindex deallocating readings
+ at cindex removing readings
+
+ at example
+#include <IO.h>
+
+int io_deallocate_reading(
+	GapIO  *io,
+	int	N);
+ at end example
+
+The @code{io_deallocate_reading} function deallocates the records linked to by
+reading number @var{N}. These are the @var{name}, @var{trace_name},
+ at var{trace_type}, @var{sequence}, @var{confidence} and @var{orig_positions}
+fields of the @var{GReadings} structure.
+
+The reading itself is not deallocated. The operation of Gap4 requires that
+reading numbers are sequential with all numbers used. It is up to the caller
+of this routine to make sure that this is still true.
+
+It returns 0 for success and >=1 for failure.
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_read_rd
+ at subsection io_read_rd and io_write_rd
+ at findex io_read_rd(C)
+ at findex io_write_rd(C)
+ at cindex trace data, reading and writing
+
+ at example
+#include <IO.h>
+
+int io_read_rd(
+	GapIO  *io,
+	int	N,
+	char   *file,
+	int	filelen,
+	char   *type,
+	int	typelen);
+
+int io_write_rd(
+	GapIO  *io,
+	int	N,
+	char   *file,
+	int	filelen,
+	char   *type,
+	int	typelen);
+ at end example
+
+These routines read and write the reading 'raw data' paramaters. These are the
+file name and file type of the sequence trace file.
+
+For both functions, @var{N} is the reading number; @var{file} is a buffer,
+allocated by the caller, of length @var{filelen}; and @var{type} is a buffer,
+allocated by the caller, of length @var{typelen}.
+
+ at code{io_read_rd} copies the trace filename to @var{file} and its type to
+ at var{type}. If either of these is unknown the corresponding buffer is filled
+with spaces instead. It returns 0 if both name and type are known and 1 if
+either or both are unknown.
+
+ at code{io_write_rd} writes new file name and file type information. If @var{N}
+is an unknown reading number, it is first allocated using
+ at code{io_init_readings}. It returns 0 for success.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node open_db
+ at subsection open_db
+ at findex open_db(C)
+ at cindex database, opening
+ at cindex opening databases
+
+ at example
+#include <IO.h>
+
+GapIO *open_db(
+	char   *project,
+	char   *version,
+	int    *status,
+	int	create,
+	int	read_only);
+ at end example
+
+ at code{open_db} opens existing databases or creates new databases.
+The database to be opened or created has unix filenames of
+"@var{project}. at var{version}" and "@var{project}. at var{version}.aux".
+
+The @var{create} variable should be 0 or 1. A value of 1 indicates that this
+database is to be created. This will not be done if there is a file named
+"@var{project}. at var{version}.BUSY", in which case the @var{status} variable is
+set to contain @code{IO_READ_ONLY}.
+
+The @var{read_only} variable should be 0 or 1. A value of 1 indicates that the
+database should be opened in read only mode, otherwise read/write access is
+desired. If the database is busy then the database may still be opened in read
+only mode instead. In this case the @var{status} variable is set to contain
+ at code{IO_READ_ONLY}.
+
+The @var{GapIO} structure is then initialised and returned. A successful
+return will leave @var{status} containing 0. For failure, the function returns
+NULL.
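+
+The file naming scheme described above can be sketched as a small helper that
+builds the three names. This is only an illustration; @code{db_file_names} is
+a hypothetical function, not something declared in @file{IO.h}, and it says
+nothing about how @code{open_db} itself builds the names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "project.version", "project.version.aux" and
 * "project.version.BUSY" in three caller supplied buffers of 'size' bytes.
 * Returns 0 for success and -1 if the buffers are too small.
 */
int db_file_names(const char *project, const char *version,
                  char *db, char *aux, char *busy, size_t size)
{
    assert(project != NULL && version != NULL);
    assert(db != NULL && aux != NULL && busy != NULL);

    /* The longest name is "project.version.BUSY": one '.', the five
     * characters of ".BUSY" and the terminating NUL make 7 extra bytes. */
    if (strlen(project) + strlen(version) + 7 > size)
        return -1;

    snprintf(db,   size, "%s.%s",      project, version);
    snprintf(aux,  size, "%s.%s.aux",  project, version);
    snprintf(busy, size, "%s.%s.BUSY", project, version);
    return 0;
}
```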
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node close_db
+ at subsection close_db
+ at findex close_db(C)
+ at cindex database, closing
+ at cindex closing databases
+
+ at example
+#include <IO.h>
+
+int close_db(
+	GapIO *io);
+ at end example
+
+This function closes a database. @var{io} is a @var{GapIO} pointer returned
+from a previous call to @code{open_db}. If necessary, the busy file is
+removed, and all allocated memory is freed.
+
+The function returns 0 for success and -1 for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node del_db
+ at subsection del_db
+ at findex del_db(C)
+ at cindex database, deletion of
+ at cindex deleting databases
+
+ at example
+#include <IO.h>
+
+int del_db(
+	char   *project,
+	char   *version);
+ at end example
+
+This removes the database files for a particular @var{version} of a
+ at var{project}. The database should not be open at the time of calling this
+function. On unix, the files removed are named "@var{project}. at var{version}"
+and "@var{project}. at var{version}.aux".
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node flush2t
+ at subsection flush2t
+ at findex flush2t(C)
+ at cindex flushing data
+ at cindex time stamps
+
+ at example
+#include <IO.h>
+
+void flush2t(
+	GapIO *io);
+ at end example
+
+This function checks out all written data by updating the database time
+stamp. If Gap4 crashes, upon restarting any data written since the last time
+stamp is ignored. The purpose of this is to ensure that the data in the
+database is internally consistent. Hence you should only call this function
+when the database writes are consistent.
+
+An example of this is in deleting a reading @var{N} which has left and right
+neighbours of @var{L} and @var{R}. The operation of writes may be:
+
+ at itemize @minus
+ at item set right neighbour of @var{L} to be @var{R}
+ at item set left neighbour of @var{R} to be @var{L}
+ at item deallocate @var{N}.
+ at end itemize
+
+The database is consistent before these operations, and after these
+operations, but not at any stage in between.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node get_gel_num
+ at subsection get_gel_num and get_contig_num
+ at findex get_gel_num(C)
+ at findex get_contig_num(C)
+ at cindex reading names, reading
+ at cindex contig names, reading
+
+ at example
+#include <IO.h>
+
+int get_gel_num(
+	GapIO  *io,
+	char   *gel_name,
+	int	is_name);
+
+int get_contig_num(
+	GapIO  *io,
+	char   *gel_name,
+	int	is_name);
+ at end example
+
+These functions convert reading and contig names into reading and contig
+numbers. (A contig name is defined to be the name of any reading held within
+that contig.)
+
+The @var{is_name} argument is mainly used for backwards compatibility. It
+should be passed as either @code{GGN_ID} or @code{GGN_NAME}. When equal to
+ at code{GGN_ID}, @var{gel_name} is treated as a @var{reading identifier},
+otherwise it is treated as a @var{reading name}. An identifier is defined to
+be either a reading name; a hash sign followed by a reading number; or an
+equals sign followed by a contig number.
+
+Both functions return -1 for failure or the appropriate reading or contig
+number for success.
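+
+The three identifier forms accepted with @code{GGN_ID} can be told apart by
+their first character. The sketch below is an illustration only;
+@code{classify_identifier} and its enum are hypothetical names, and falling
+back to a plain name when the number after the hash or equals sign is
+malformed is an assumption rather than documented behaviour.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical classification of an identifier string. */
typedef enum { ID_NAME, ID_READING_NUMBER, ID_CONTIG_NUMBER } id_kind;

/*
 * Hypothetical helper: classify an identifier and, for the '#' and '='
 * forms, extract the number that follows. Malformed or non-positive
 * numbers are treated as ordinary names (an assumption).
 */
id_kind classify_identifier(const char *id, long *number)
{
    char *endp;
    long value;

    assert(id != NULL && number != NULL);
    *number = 0;

    if (id[0] != '#' && id[0] != '=')
        return ID_NAME;

    value = strtol(id + 1, &endp, 10);
    if (endp == id + 1 || *endp != '\0' || value <= 0)
        return ID_NAME;

    *number = value;
    return id[0] == '#' ? ID_READING_NUMBER : ID_CONTIG_NUMBER;
}
```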
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node lget_gel_num
+ at subsection lget_gel_num and lget_contig_num
+ at findex lget_gel_num(C)
+ at findex lget_contig_num(C)
+ at cindex reading names, reading
+ at cindex contig names, reading
+ at vindex contig_list_t(C)
+
+ at example
+#include <IO.h>
+
+int lget_gel_num(
+	GapIO  *io,
+	int	listArgc,
+	char  **listArgv,
+	int    *rargc,
+	int   **rargv);
+
+int lget_contig_num(
+	GapIO  *io,
+	int	listArgc,
+	char  **listArgv,
+	int    *rargc,
+	contig_list_t **rargv);
+ at end example
+
+These functions perform the same task as @code{get_gel_num} and
+ at code{get_contig_num} except on lists of identifiers instead of single
+identifiers.
+
+The list of identifiers is passed in @var{listArgv} as an array of
+ at var{listArgc} strings. They return arrays of reading or contig numbers by
+setting @var{*rargv} to point to an array of @var{*rargc} elements. The memory
+is allocated by these functions and should be deallocated by the caller using
+ at code{free}.
+
+For @code{lget_gel_num} the return arrays are arrays of integer values.
+ at code{lget_contig_num} returns arrays of @var{contig_list_t} structures. This
+structure is defined as follows.
+
+ at example
+typedef struct contig_list @{
+    int contig;
+    int start;
+    int end;
+@} contig_list_t;
+ at end example
+
+If any string within the @var{listArgv} argument to @code{lget_contig_num} is
+a list, the second and third elements of this list are used to define the
+ at var{start} and @var{end} offsets within the contig (which is defined by the
+name held in the first element of the list). Otherwise, the @var{start} and
+ at var{end} fields are set to 1 and the length of the contig respectively.
+
+For instance, it is legal to pass "@code{rname}", "@code{rname 100}" and
+"@code{rname 100 200}" as contig identifiers.
+
+Both functions return 0 for success and -1 for failure.	 Note that the
+returned @var{rargc} value may not be the same as @var{listArgc} in the case
+where one or more identifiers could not be translated.
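+
+The way a list element such as "@code{rname 100 200}" maps onto @var{start}
+and @var{end} can be sketched with @code{sscanf}. The structure
+@code{contig_range} and the function @code{parse_contig_item} are
+hypothetical. The sketch only handles plain space separated items, not full
+Tcl list syntax, and it assumes that "@code{rname 100}" leaves @var{end} at
+the contig length.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result of parsing one contig list element. */
typedef struct {
    char name[64];   /* reading name identifying the contig */
    int  start;      /* first base of the range             */
    int  end;        /* last base of the range              */
} contig_range;

/*
 * Hypothetical helper: parse "name", "name start" or "name start end".
 * Missing fields keep their defaults of 1 and contig_length.
 * Returns 0 for success and -1 if no name could be read.
 */
int parse_contig_item(const char *item, int contig_length, contig_range *out)
{
    int fields;

    assert(item != NULL && out != NULL);

    out->start = 1;
    out->end = contig_length;

    /* sscanf leaves start and end untouched when they are absent. */
    fields = sscanf(item, "%63s %d %d", out->name, &out->start, &out->end);
    return fields >= 1 ? 0 : -1;
}
```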
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node to_contigs_only
+ at subsection to_contigs_only
+ at findex to_contigs_only(C)
+
+ at example
+#include <IO.h>
+
+int *to_contigs_only(
+	int	num_contigs,
+	contig_list_t *cl);
+ at end example
+
+This function converts an array of @var{contig_list_t} structures to an array
+of integers containing only the contig number information. The @var{cl} and
+ at var{num_contigs} elements correspond to the returned @var{rargv} and
+ at var{rargc} arguments from the @code{lget_contig_num} function.
+
+It returns an array of integers allocated with @code{malloc} for success or
+ at code{NULL} for failure.
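+
+A minimal sketch of the conversion is given below. It is not the Gap4 source;
+the name @code{contigs_only_sketch} is hypothetical and the real function may
+differ in detail, for example in how it treats an empty list (returning
+@code{NULL} here is an assumption). The structure definition is repeated from
+the @code{lget_contig_num} section so that the example stands alone.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct contig_list {
    int contig;
    int start;
    int end;
} contig_list_t;

/*
 * Sketch only: copy the contig numbers out of an array of contig_list_t.
 * The caller owns the returned array and must free() it.
 */
int *contigs_only_sketch(int num_contigs, const contig_list_t *cl)
{
    int *contigs;
    int i;

    assert(num_contigs <= 0 || cl != NULL);

    if (num_contigs <= 0)
        return NULL;

    contigs = malloc((size_t)num_contigs * sizeof *contigs);
    if (contigs == NULL)
        return NULL;

    for (i = 0; i < num_contigs; i++)
        contigs[i] = cl[i].contig;   /* start and end are dropped */

    return contigs;
}
```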
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node chain_left
+ at subsection chain_left
+ at findex chain_left(C)
+ at cindex left most reading
+
+ at example
+#include <IO.h>
+
+int chain_left(
+	GapIO  *io,
+	int	gel);
+ at end example
+
+This function finds the left most reading number of the contig containing the
+reading numbered @var{gel}. This is done by chaining along the left neighbours
+of each reading in turn until the contig end is reached. The function detects
+possible loops and returns -1 in this case. Otherwise the left most reading
+number is returned.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node rnumtocnum
+ at subsection rnumtocnum
+ at findex rnumtocnum(C)
+ at cindex contig numbers, from reading numbers
+
+ at example
+#include <IO.h>
+
+int rnumtocnum(
+	GapIO  *io,
+	int	gel);
+ at end example
+
+This function returns the contig number for the contig containing the reading
+numbered @var{gel}. It returns -1 if the contig number cannot be found.
diff --git a/scripting_manual/gap4-cio-intro-t.texi b/scripting_manual/gap4-cio-intro-t.texi
new file mode 100644
index 0000000..f953d6e
--- /dev/null
+++ b/scripting_manual/gap4-cio-intro-t.texi
@@ -0,0 +1,211 @@
+ at cindex IO introduction (C)
+ at cindex Overview of Gap4 IO (C)
+
+[General notes to go somewhere: It is better to check success return codes
+rather than failure ones as the failure ones are often variable (-1, 1, >0,
+etc) but most return 0 for success.]
+
+The Gap4 I/O access from within C consists of several layers. These layers
+provide ways of breaking down the tasks into discrete methods, and of hiding
+most of the implementation details. For the programmer willing to extend Gap4,
+only the higher layer levels are of interest. Hence the lowest levels are
+described only briefly.
+
+ at subsection "g" Level - Raw Database Access
+
+At the lowest level of any I/O is the actual code to read and write information
+to the disk. In Gap4 this is handled through a library named "g". This
+contains code for reading, writing, locking and updating of the physical
+database. It does not describe the structures contained in the gap database
+format itself, but rather provides functions to read and write arbitrary
+blocks of data. Don't delve into this unless you're feeling brave!
+
+The code for this library is contained within the @file{src/g} directory.
+No documentation is currently available on these functions.
+
+ at subsection "Communication" Level - Interfaces to the "g" Level
+
+This level of code deals with describing the real Gap4 data structures and
+the interfacing with the g library. Generally this code should not be used.
+
+This code is contained within the @file{src/gap4} directory and breaks down as
+follows:
+
+ at table @file
+ at item  gap-if.c
+ at itemx gap-local.c
+ at itemx gap-remote.c
+	Interface functions with the g library. These are to provide
+	support for a local (ie compiled in) or remote (unimplemented)
+	database server.
+
+ at item  gap-io.c
+	Contains @code{GAP_READ} and @code{GAP_WRITE} functions in byte swap
+	and non byte swap forms (depending on the system architecture). The
+	@code{gap_io_init()} function automatically determines the machine's
+	endianness and sets up function pointers to call the correct functions.
+
+ at item  gap-error.c
+	Definitions of @code{GAP_ERROR} and @code{GAP_ERROR_FATAL} functions.
+
+ at item  gap-dbstruct.c
+ at itemx gap-create.c
+	Functions for creation, initialisation, and copying of database
+	files.
+
+ at item  gap-dbstruct.h
+	@strong{VERY USEFUL!} The definitions of the gap structures that are
+	stored in the database.
+
+ at item  gap-init.c
+	Initialises communication with the "g" database server by use of
+	@code{gap_init()}, @code{gap_open_server()} and
+	@code{gap_shutdown_server()} functions.
+ at end table
+
+No documentation is currently available on these functions.
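+
+The technique described for @file{gap-io.c}, detecting the byte order once
+and then calling through a function pointer, can be sketched as follows. None
+of these names come from the Gap4 source, and the big endian on-disk order
+used here is an assumption made purely for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t (*convert_fn)(uint32_t);

/* Identity conversion for hosts that already match the disk order. */
uint32_t keep32(uint32_t v)
{
    return v;
}

/* Reverse the four bytes of a 32 bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Returns 1 on a little endian host and 0 on a big endian one. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Pick the conversion once; the big endian disk order is an assumption. */
convert_fn select_converter(void)
{
    return host_is_little_endian() ? swap32 : keep32;
}

/* Store a host value as four bytes in the assumed big endian disk order. */
void store_disk32(unsigned char out[4], uint32_t host_value, convert_fn conv)
{
    uint32_t disk;

    assert(out != NULL && conv != NULL);
    disk = conv(host_value);
    memcpy(out, &disk, sizeof disk);
}
```

+Because the choice is made once, the read and write paths contain no
+per-call test of the byte order.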
+
+ at subsection Basic Gap4 I/O
+
+This level contains the basic functions for reading, writing, creation and
+deletion of the Gap4 structures, such as readings and templates as well as
+higher level functions built on top of these. It is this level of code that
+should generally be used by the programmer. The implementation of this level
+has function code and prototypes spread over a variety of files, but the
+programmer should only @code{#include} the @file{IO.h} file.
+
+The primary functions are:
+
+ at table @file
+ at item IO.c
+ at table @code
+ at item  open_db
+ at itemx close_db
+ at itemx del_db
+	Opening/creation, closing and deletion of databases.
+
+ at item  GT_Read, GT_Write, GT_Write_cached
+ at itemx TextRead, TextAllocRead, TextWrite
+ at itemx DataRead, DataWrite
+ at itemx ArrayRead, ArrayWrite
+ at itemx BitmapRead, BitmapWrite
+	The basic IO calls. Note that the GT ones are for handling structures
+	(eg GReadings) and the others for data of the associated type.
+
+ at item  io_init_contig
+ at itemx io_init_annotations
+ at itemx io_init_reading
+	Some functions for initialising new data structures. These in turn
+	call the @code{allocate()} function to create new database records.
+
+ at item  io_read_seq
+ at itemx io_write_seq
+	Reads and writes sequence information.
+
+ at item  io_read_rd
+	Fetches the trace type and name values for a reading.
+
+ at item  io_read_annotation
+ at itemx io_write_annotation
+	Reading and writing of annotations (also known as tags).
+
+ at item  allocate
+ at itemx deallocate
+ at itemx io_deallocate_reading
+        Allocation and deallocation of records.
+
+ at item  flush2t
+        Flushes changes back to disk. The various write commands write the
+        data to disk, but until a flush occurs they will not be committed as
+        the up to date copies.
+ at end table
+
+ at item io_handle.c
+ at table @code
+ at item  io_handle
+ at itemx handle_io
+        Converts between C @var{GapIO} pointer and an integer value which can
+        be passed around in Tcl and Fortran. The integer handle is used in the
+        Tcl scripting language.
+ at end table
+
+ at item io_utils.[ch]
+ at table @code
+ at item  get_gel_num, lget_gel_num
+ at itemx get_contig_num, lget_contig_num
+        Converts single or lists of reading identifiers into reading or contig
+        numbers (with start and end ranges).
+
+ at item  to_contigs_only
+        Converts a list of reading identifiers to contig numbers.
+
+ at item  get_read_name
+ at itemx get_contig_name
+ at itemx get_vector_name
+ at itemx get_template_name
+ at itemx get_clone_name
+        Converts a structure number into its textual name.
+
+ at item  chain_left
+        Finds the left most reading number in a contig from a given reading
+        number.
+
+ at item  rnumtocnum
+        Converts from a reading number into a contig number.
+ at end table
+ at end table
+
+
+ at subsection Other I/O Functions
+
+Still more I/O functions exist that aren't listed under the "Basic Gap4 I/O"
+header. The reason for this is primarily due to code structure rather than any
+particular grouping based on functionality. Specifically, these functions
+cannot be easily linked into "external" applications without a considerable
+amount of effort.
+
+The file break down is as follows.
+
+ at table @file
+ at item IO2.c
+ at table @code
+ at item  io_complement_seq
+	Complements, in memory, a sequence and associated structures.
+
+ at item  io_insert_seq
+ at itemx io_delete_seq
+ at itemx io_replace_seq
+	Modifies in memory sequence details.
+
+ at item  io_insert_base
+ at itemx io_modify_base
+ at itemx io_delete_base
+        Modifies a single base in a sequence on the disk.
+
+ at item  pad_consensus
+        Inserts pads to the consensus sequence and all the readings at that
+        point.
+
+ at item  io_delete_contig
+        Removes a contig structure.
+ at end table
+
+ at item IO3.c
+ at table @code
+ at item  get_read_info
+ at itemx get_vector_info
+ at itemx get_clone_info
+	Fetches miscellaneous information for reads (primers, insert size,
+	etc), vectors and clones.
+
+ at item  io_get_extension
+	Returns the right cutoff of a reading. Found by checking the cut
+	points and any vector tags.
+
+ at item  io_mod_extension
+	Modifies the cutoffs of readings.
+
+ at item  write_rname
+        Updates a reading name in memory and disk.
+ at end table
+ at end table
diff --git a/scripting_manual/gap4-cio-mid-t.texi b/scripting_manual/gap4-cio-mid-t.texi
new file mode 100644
index 0000000..02b1b9b
--- /dev/null
+++ b/scripting_manual/gap4-cio-mid-t.texi
@@ -0,0 +1,75 @@
+The middle level functions consist of basic functions for reading, writing and
+creation of structures, text strings, arrays, bitmaps and other items of raw
+data. They're contained within the @file{Gap4/IO.c} file and @file{IO.h}
+should be #included before usage.
+
+ at subsection GT_Read, TextRead, TextAllocRead, DataRead, ArrayRead and BitmapRead
+ at findex GT_Read(C)
+ at findex TextRead(C)
+ at findex TextAllocRead(C)
+ at findex DataRead(C)
+ at findex ArrayRead(C)
+ at findex BitmapRead(C)
+ at example
+#include <IO.h>
+
+int GT_Read(
+	GapIO  *io,
+	int	rec,
+	void   *buf,
+	int	len,
+	GCardinal type_check);
+
+int TextRead(
+	GapIO  *io,
+	int	rec,
+	char   *buf,
+	int	len);
+
+char *TextAllocRead(
+	GapIO  *io,
+	int	rec);
+
+int DataRead(
+	GapIO  *io,
+	int	rec,
+	void   *buf,
+	int	len,
+	int	size);
+
+Array ArrayRead(
+	GapIO  *io,
+	int	rec,
+	int	elements);
+
+Bitmap BitmapRead(
+	GapIO  *io,
+	int	rec,
+	int	elements);
+ at end example
+
+These functions read record number @var{rec} to the buffer @var{buf} of length
+ at var{len}. Each returns zero for success and an error number for failure.
+If the length of the data on disk is greater than @var{len} then only @var{len}
+bytes are read. If @var{len} is greater than the length of the data on disk
+then the remaining bytes in @var{buf} are undefined.
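+
+The length rule can be modelled in a few lines of C. This is only a sketch,
+assuming that a record longer than @var{buf} is truncated to @var{len} bytes
+and that a shorter record leaves the tail of @var{buf} untouched. The name
+bounded_read is hypothetical and is not part of @file{IO.h}.
+
```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the length rule (not a Gap4 function): copy at
 * most `len` bytes of a stored record into `buf` and return the number
 * of bytes copied. Bytes of `buf` beyond the returned count are left
 * untouched, so the caller must treat them as undefined.
 */
static size_t bounded_read(const void *record, size_t record_len,
                           void *buf, size_t len)
{
    size_t n = record_len < len ? record_len : len;

    memcpy(buf, record, n);
    return n;
}
```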
+
+ at code{GT_Read} reads arbitrary records of type @var{type_check}. This is
+typically a structure; for instance a @var{GContigs} structure with type
+ at code{GT_Contigs}. For best compatibility, use the @code{contig_read},
+ at code{gel_read}, @code{tag_read}, @code{vector_read} and @code{clone_read}
+functions.
+
+ at code{TextRead} reads records of type @code{GT_Text}. This includes sequences
+and other text strings. It is possible to read text data containing NUL
+characters, but this is not advisable.
+
+ at code{TextAllocRead} is identical to @code{TextRead} except that the
+necessary memory is allocated using @code{malloc}. This returns the string for
+success and NULL for failure.
+
+ at code{DataRead} reads records of type @code{GT_Data}. This should be used for
+binary data that is not in one of the other principal formats. This includes
+reading confidence values and original positions.
+
+ at code{ArrayRead} reads records of type @code{GT_Array}. The array is allocated
diff --git a/scripting_manual/gap4-cio-other-t.texi b/scripting_manual/gap4-cio-other-t.texi
new file mode 100644
index 0000000..b2547f1
--- /dev/null
+++ b/scripting_manual/gap4-cio-other-t.texi
@@ -0,0 +1,229 @@
+This section includes all the other I/O functions which don't fit well into
+the other sections. Specifically, these functions cannot be used when
+compiling external programs that utilise the gap4 I/O functions. The reason
+for this is that they require many other portions of the gap4 objects which in
+turn require more.
+
+Whilst it is possible to still link in this manner, it is unwieldy and far
+from ideal. If you have need to use any of these functions in code that is to
+run separate from Gap4 then please mail us. We will then investigate tidying
+up the code further to aid such compilations.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_get_extension
+ at subsection io_get_extension
+ at findex io_get_extension(C)
+ at cindex cutoff data, reading
+ at cindex hidden data, reading
+
+ at example
+#include <IO.h>
+
+int io_get_extension(
+	GapIO  *io,
+	int	N,
+	char   *seq,
+	int	max_seq,
+	int    *length,
+	int    *complement);
+ at end example
+
+ at code{io_get_extension} reads the usable 3' cutoff data for reading number
+ at var{N}. The cutoff data is stored in @var{seq}. The length stored is the
+smaller of @var{max_seq} bytes or the length of the 3' cutoff data.
+The length of data stored in @var{seq} is written to the @var{length} pointer.
+The orientation of the reading is stored in the @var{complement} pointer.
+
+The reading annotations are also read to determine which segments are
+considered usable. The existence of a tag with type @code{IGNS} or
+ at code{IGNC}, anywhere on the reading, indicates that there is no suitable
+cutoff data for this reading. @var{length} is set to 1 and the function
+returns 1.
+
+If a tag of type @code{SVEC} or @code{CVEC} exists within the 3' cutoff
+data the segment returned consists of that between the 3' cutoff point and
+the start of the vector tag.
+
+The function returns 0 for success and 1 for failure.
+
+NOTE: The current implementation looks for any tags with types @code{IGN?} and
+ at code{?VEC} rather than the specific types listed.
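+
+The selection of the usable 3' cutoff segment described above can be sketched
+as plain arithmetic. This is an illustrative model only, not the Gap4
+implementation: the function name, its arguments and the convention that a
+vector tag start of 0 means "no vector tag" are all assumptions, and the
+failure case is reported as -1 rather than through @var{length}.
+
```c
/*
 * Hypothetical model of the segment selection (not the Gap4 code).
 *   seq_len        total number of bases stored for the reading
 *   cut            1-based position of the first base of the 3' cutoff data
 *   vec_start      1-based start of a vector tag, or 0 if there is none
 *   has_ignore_tag non-zero if an "ignore" tag exists anywhere on the reading
 *   max_seq        capacity of the caller's buffer
 * Returns the number of usable cutoff bases, or -1 if the reading is ignored.
 */
static int usable_cutoff_len(int seq_len, int cut, int vec_start,
                             int has_ignore_tag, int max_seq)
{
    int end, avail;

    if (has_ignore_tag)
        return -1;

    /* By default the cutoff data runs to the end of the reading. */
    end = seq_len;

    /* A vector tag inside the cutoff data truncates the usable segment. */
    if (vec_start >= cut && vec_start <= seq_len)
        end = vec_start - 1;

    avail = end - cut + 1;
    if (avail < 0)
        avail = 0;

    /* Never report more than the caller's buffer can hold. */
    return avail < max_seq ? avail : max_seq;
}
```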
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_mod_extension
+ at subsection io_mod_extension
+ at findex io_mod_extension(C)
+ at cindex cutoff data, modifying
+ at cindex hidden data, modifying
+
+ at example
+#include <IO.h>
+
+int io_mod_extension(
+	GapIO  *io,
+	int	N,
+	int	shorten_by);
+ at end example
+
+ at code{io_mod_extension} modifies the position of the 3' cutoff data for
+reading number @var{N}. The 3' cutoff position is defined to be the base
+number, counting from 1, of the first base within the cutoff data.
+
+ at var{shorten_by} is subtracted from either the @var{end} or @var{start} field
+in the @var{GReadings} structure, depending on whether the reading is
+complemented. It is legal to specify a negative amount to increase the
+used portion of the reading.
+
+[FIXME]@br
+NOTE that this implementation does not set the @var{sequence_length} field or
+the @code{io_length(io,N)} data for this reading.
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_insert_base
+ at subsection io_insert_base, io_modify_base and io_delete_base
+ at findex io_insert_base(C)
+ at findex io_modify_base(C)
+ at findex io_delete_base(C)
+
+ at example
+#include <IO.h>
+
+int io_insert_base(
+	GapIO  *io,
+	int	gel,
+	int	pos,
+	char	base);
+
+int io_modify_base(
+	GapIO  *io,
+	int	gel,
+	int	pos,
+	char	base);
+
+int io_delete_base(
+	GapIO  *io,
+	int	gel,
+	int	pos);
+ at end example
+
+These functions modify readings by inserting, changing, or deleting individual
+bases. Where needed, they update any annotations on the reading to ensure that
+all annotations are still covering the same sequence fragments. The confidence
+values and original positions arrays are also updated. Inserted and edited
+bases are given confidence of 100 and original positions of 0.
+
+ at code{io_insert_base} uses the @code{io_insert_seq} function to insert a
+single base with character @var{base} at base position @var{pos}. Positions
+are counted such that inserting at base 1 inserts a base at the start
+of the sequence.
+
+ at code{io_modify_base} uses the @code{io_replace_seq} function to replace a
+single base at position @var{pos} with @var{base}.
+
+ at code{io_delete_base} uses the @code{io_delete_seq} function to delete a
+single base at position @var{pos}.
+
+FIXME:@br
+NOTE that @code{io_insert_base} and @code{io_delete_base} modify the sequence,
+but DO NOT update the @var{GReadings.sequence_length} or
+ at code{io_length()} data.
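+
+The bookkeeping for an insertion can be illustrated with a toy in-memory
+reading. This is a sketch under stated assumptions, not the Gap4 code: the
+toy_reading type and toy_insert_base function are hypothetical, and unlike the
+real functions noted above, the toy keeps its own length field up to date.
+
```c
#include <string.h>

#define MAX_BASES 64            /* arbitrary capacity for this sketch */

/* Hypothetical in-memory reading with three parallel per-base arrays. */
typedef struct {
    char          seq[MAX_BASES];   /* base characters */
    unsigned char conf[MAX_BASES];  /* confidence values */
    short         opos[MAX_BASES];  /* original positions */
    int           length;           /* number of bases in use */
} toy_reading;

/*
 * Insert `base` so that it becomes base number `pos` (counting from 1).
 * The inserted base gets confidence 100 and original position 0, as the
 * text above describes. Returns 0 for success and -1 for a bad position
 * or a full reading.
 */
static int toy_insert_base(toy_reading *r, int pos, char base)
{
    int i = pos - 1;                    /* 0-based array index */
    size_t tail;

    if (pos < 1 || pos > r->length + 1 || r->length >= MAX_BASES)
        return -1;

    /* Shift the tail of all three arrays right by one element. */
    tail = (size_t)(r->length - i);
    memmove(&r->seq[i + 1],  &r->seq[i],  tail * sizeof r->seq[0]);
    memmove(&r->conf[i + 1], &r->conf[i], tail * sizeof r->conf[0]);
    memmove(&r->opos[i + 1], &r->opos[i], tail * sizeof r->opos[0]);

    r->seq[i]  = base;
    r->conf[i] = 100;
    r->opos[i] = 0;
    r->length++;
    return 0;
}
```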
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node io_delete_contig
+ at subsection io_delete_contig
+ at findex io_delete_contig(C)
+ at cindex contig, deletion of
+
+ at example
+#include <IO.h>
+
+int io_delete_contig(
+	GapIO  *io,
+	int	contig_num);
+ at end example
+
+This function deletes a single contig number from the database. It
+ at strong{does not} remove any of the readings on the contig, but all
+annotations on the consensus sequence for this contig are deallocated.
+
+The last contig in the database is renumbered to be @var{contig_num}. This
+updates the @code{io_clength()}, @code{io_clnbr()}, and @code{io_crnbr()}
+arrays in @var{io} and the contig order information.
+
+A @code{REG_DELETE} notification is sent to the deleted contig @strong{after}
+removal, followed by a @code{REG_NUMBER_CHANGE} notification to the renumbered
+contig, followed by updating the contig registry tables.
+
+It returns 0 for success.
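+
+The renumbering step is the usual "move the last element into the hole" way
+of deleting from an array. A minimal sketch, assuming contig numbers count
+from 1 and using a hypothetical array of contig lengths in place of the real
+@var{io} arrays:
+
```c
/*
 * Hypothetical model of the renumbering (not the Gap4 code): remove
 * contig `contig_num` (1-based) from an array of contig lengths by
 * moving the last contig into its slot. Returns 0 for success and -1
 * if `contig_num` is out of range.
 */
static int toy_delete_contig(int *clength, int *ncontigs, int contig_num)
{
    if (contig_num < 1 || contig_num > *ncontigs)
        return -1;

    /* The last contig takes over the deleted contig's number. */
    clength[contig_num - 1] = clength[*ncontigs - 1];
    (*ncontigs)--;
    return 0;
}
```
+
+Any code that has cached the number of the last contig must therefore be
+told about the change, which is the purpose of the
+ at code{REG_NUMBER_CHANGE} notification.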
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node write_rname
+ at subsection write_rname
+ at findex write_rname(C)
+ at cindex reading names, writing
+
+ at example
+#include <IO.h>
+
+int write_rname(
+	GapIO  *io,
+	int	rnum,
+	char   *name);
+ at end example
+
+This writes a new reading name @var{name} for reading number @var{rnum}.
+This updates both the disk and memory copies of the reading structure and the
+reading name cache, using the @code{gel_write} and @code{io_wname} functions.
+If reading @var{rnum} does not exist, it is created first using the
+ at code{io_init_reading} function.
+
+It returns 0 for success and -1 for failure.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node get_read_name
+ at subsection get_read_name, get_contig_name, get_vector_name, get_template_name, and get_clone_name
+ at findex get_read_name(C)
+ at findex get_contig_name(C)
+ at findex get_vector_name(C)
+ at findex get_template_name(C)
+ at findex get_clone_name(C)
+
+ at example
+#include <IO.h>
+
+char *get_read_name(
+	GapIO  *io,
+	int number);
+
+char *get_contig_name(
+	GapIO  *io,
+	int number);
+
+char *get_vector_name(
+	GapIO  *io,
+	int number);
+
+char *get_template_name(
+	GapIO  *io,
+	int number);
+
+char *get_clone_name(
+	GapIO  *io,
+	int number);
+ at end example
+
+These functions convert reading, contig, vector, template and clone numbers
+into reading, contig, vector, template and clone names respectively. Each function
+takes a @var{number} argument and returns a string containing the name. The
+string is held in a static buffer and is valid only until the next call of the
+same function. If the name is unknown, the string "@code{???}" is returned.
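+
+Because the result lives in a static buffer, a caller that needs two names at
+once has to copy the first before asking for the second. The sketch below
+shows the pattern with a stand-in function. Both toy_get_name and copy_name
+are hypothetical, and the "read%d" naming is invented purely for the
+illustration.
+
```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a name lookup that returns a pointer to a
 * static buffer. Numbers below 1 are treated as unknown here.
 */
static char *toy_get_name(int number)
{
    static char buf[32];

    if (number < 1)
        strcpy(buf, "???");
    else
        snprintf(buf, sizeof buf, "read%d", number);
    return buf;
}

/* Copy the name out immediately so a later lookup cannot overwrite it. */
static void copy_name(char *dst, size_t dst_len, int number)
{
    snprintf(dst, dst_len, "%s", toy_get_name(number));
}
```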
+
+
diff --git a/scripting_manual/gap4-cio-t.texi b/scripting_manual/gap4-cio-t.texi
new file mode 100644
index 0000000..f5d714f
--- /dev/null
+++ b/scripting_manual/gap4-cio-t.texi
@@ -0,0 +1,49 @@
+ at menu
+* G4Cio-Introduction::          Introduction and Overview
+* G4Cio-Compiling::             Compiling and Linking with Other Programs
+* G4Cio-Database_Structures::   Database Structures
+* G4Cio-GapIO_Structure::       The GapIO Structure
+* G4Cio-Macros::                IO.h Macros
+* G4Cio-Basic_Level::           Basic Gap4 I/O
+* G4Cio-Other::                 Other I/O Functions
+ at end menu
+
+ at node G4Cio-Introduction
+ at section Introduction and Overview
+_include(gap4-cio-intro-t.texi)
+
+ at c
+ at split{}
+ at section Compiling and Linking with Other Programs
+_include(gap4-cio-compile-t.texi)
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-Database_Structures
+ at section Database Structures
+_include(gap4-cio-database-t.texi)
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-GapIO_Structure
+ at section The GapIO Structure
+_include(gap4-cio-gapio-t.texi)
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-Macros
+ at section IO.h Macros
+_include(gap4-cio-IO.h-t.texi)
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-Basic_Level
+ at section Basic Gap4 I/O
+_include(gap4-cio-basic-t.texi)
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node G4Cio-Other
+ at section Other I/O Functions
+_include(gap4-cio-other-t.texi)
+
diff --git a/scripting_manual/gap4-editor-t.texi b/scripting_manual/gap4-editor-t.texi
new file mode 100644
index 0000000..7edb706
--- /dev/null
+++ b/scripting_manual/gap4-editor-t.texi
@@ -0,0 +1,814 @@
+ at cindex Editor widget
+
+ at menu
+* GEditor-Intro::       Introduction
+* GEditor-Configure::   Configuration Options
+* GEditor-Commands::    Widget Commands
+ at end menu
+
+ at c -------------------------------------------------------------------------
+ at node GEditor-Intro
+ at subsection Introduction
+ at findex editor(C)
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Configure
+ at subsection Configuration Options
+ at cindex Editor widget: configuration
+ at cindex Configuration: editor widget
+
+These options are specified when creating the editor widget to configure its
+look and feel. In addition to the options listed below the editor supports the
+ at code{-width}, @code{-height}, @code{-font}, @code{-borderWidth},
+ at code{-relief}, @code{-foreground}, @code{-background}, @code{-xscrollcommand}
+and @code{-yscrollcommand}. These are described in detail in the Tk
+ at i{options} manual page. Note that the @code{-width} and @code{-height} values
+are measured in characters.
+
+In the descriptions below `Command-Line Name' refers to the switch used
+in class commands and @code{configure} widget commands to set this value.
+`Database Name' refers to the option's name in the option database (e.g.  in
+ at file{.Xdefaults} files).  `Database Class' refers to the option's class value
+in the option database.
+
+ at sp 1
+ at table @asis
+ at item Command-Line Name: @code{-lightcolour}
+ at itemx Database Name: @code{lightColour}
+ at itemx Database Class: @code{Foreground}
+ at vindex -lightcolour: editor widget
+ at vindex lightColour: editor widget
+
+        Specifies the foreground colour to use when displaying the cutoff
+        data.
+
+ at sp 1
+ at item Command-Line Name: @code{-max_height}
+ at itemx Database Name: @code{maxHeight}
+ at itemx Database Class: @code{MaxHeight}
+ at vindex -max_height: editor widget
+ at vindex maxHeight: editor widget
+
+        Specifies the maximum height the editor is allowed to display, in
+        units of characters. The vertical scrollbar will be used when more
+        than this many sequences are displayed.
+
+ at sp 1
+ at item Command-Line Name: @code{-qualcolour}@i{n} (0 <= @i{n} <= 9)
+ at itemx Database Name: @code{qualColour}@i{n}
+ at itemx Database Class: @code{Background}
+ at vindex -qualcolour: editor widget
+ at vindex qualColour: editor widget
+
+        These specify the 10 colours to be used for the background of the
+        bases when @code{show_quality} is enabled. @code{-qualcolour0} should
+        be the darkest (defaults to '@code{#494949}') and @code{-qualcolour9}
+        should be the lightest (defaults to '@code{#d9d9d9}').
+
+ at sp 1
+ at item Command-Line Name: @code{-qual_fg}
+ at itemx Database Name: @code{qualForeground}
+ at itemx Database Class: @code{Foreground}
+ at vindex -qual_fg: editor widget
+ at vindex qualForeground: editor widget
+
+        This specifies the foreground colour of bases with poorer
+        quality than the current quality cutoff. By default this is
+        reddish ('@code{#ff5050}').
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Commands
+ at subsection Widget Commands
+ at cindex Editor commands
+ at cindex Editor widget commands
+
+ at menu
+* GEditor-Units::               Units and Coordinates
+* GEditor-Cursor::              The Editing Cursor
+* GEditor-Select::              The Selection
+* GEditor-Cutoff::              Cutoff Adjustments
+* GEditor-Anno::                Annotations
+* GEditor-Edits::               Editing Commands
+* GEditor-Settings::            Editing Toggles and Settings
+* GEditor-Search::              Searching
+* GEditor-Primer::              Primer Selection
+* GEditor-Status::              The Status Line
+* GEditor-Trace::               The Trace Display
+* GEditor-Misc::                Miscellaneous Commands
+ at end menu
+
+The 'editor' widget is based upon the sheet display widget except with a large
+range of editing commands added. The data for the editor cannot be specified
+from the Tcl level, rather this requires using a C interface to adjust the
+tkEditor structure. Hence the editor widget is very specific for the task at
+hand.
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Units
+ at subsubsection Units and Coordinates
+ at cindex Editor units
+ at cindex Units in editor widget
+
+The contig editor works in base coordinates. Some widget commands take x
+and/or y position arguments. These are by default in base units. However it is
+possible to use '@code{@@pos}' as the position argument to specify
+'@code{pos}' as pixel units.
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Cursor
+ at subsubsection The Editing Cursor
+ at cindex Editor cursor
+ at cindex Cursor in editor widget
+
+ at table @var
+ at findex cursor_left: editor widget
+ at findex cursor_right: editor widget
+ at findex cursor_up: editor widget
+ at findex cursor_down: editor widget
+ at item @code{cursor_left}
+ at itemx @code{cursor_right}
+ at itemx @code{cursor_up}
+ at itemx @code{cursor_down}
+
+        Move the editing cursor in the appropriate direction. The exact
+        allowed movements depends on where the cursor is and whether cutoff
+        data is displayed.
+
+ at findex read_start: editor widget
+ at findex read_end: editor widget
+ at item @code{read_start}
+ at itemx @code{read_end}
+
+        Positions the cursor at the beginning or end of the used data for this
+        sequence.
+
+ at findex read_start2: editor widget
+ at findex read_end2: editor widget
+ at item @code{read_start2}
+ at itemx @code{read_end2}
+
+        Positions the cursor at the beginning or end of the displayed data for
+        this sequence. These differ from @code{read_start} and @code{read_end}
+        when cutoff data is displayed in that they use the ends of the
+        cutoff data.
+
+ at findex contig_start: editor widget
+ at findex contig_end: editor widget
+ at item @code{contig_start}
+ at itemx @code{contig_end}
+
+        Positions the cursor on the consensus line at the start or end of the
+        contig.
+
+ at findex cursor_set: editor widget
+ at item @code{cursor_set} xpos ypos
+
+        Positions the cursor at the correct position and sequence based on an
+        (x,y) coordinate pair from the top-left corner of the screen. Units are
+        in bases unless '@code{@@}@i{xpos} @code{@@}@i{ypos}' is used, in
+        which case they are pixels.
+
+ at findex cursor_consensus: editor widget
+ at item @code{cursor_consensus ?}xpos at code{?}
+        Positions the cursor at an absolute position within the
+        consensus. If no @i{xpos} is given the existing position
+        within the contig is returned.
+
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Select
+ at subsubsection The Selection
+ at cindex Selections: editor widget
+ at cindex Editor widget: selections
+
+The widget supports the standard X selection via the '@code{select}' command.
+The general form of this command is '@code{select} @i{option ?arg?}'. A
+selection here is simply a portion of text. Selections can be made on any
+sequence or consensus sequence and are denoted by being underlined.
+
+ at table @var
+ at findex select clear: editor widget
+ at item @code{select clear}
+
+        Clears and disowns the current selection.
+
+ at findex select from: editor widget
+ at item @code{select from} pos
+
+        Grabs the current selection and sets its start position.
+
+ at findex select to: editor widget
+ at findex select adjust: editor widget
+ at item @code{select to} pos
+ at itemx @code{select adjust} pos
+
+        Currently both these are the same. They set the end position of the
+        selection.
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Cutoff
+ at subsubsection Cutoff Adjustments
+ at cindex Cutoffs: editor widget
+ at cindex Editor widget: cutoffs
+
+The consensus calculation can be tuned by changing the threshold at which a
+particular base type is considered to have the 'majority'; a dash (-) is
+displayed when the majority is not sufficiently high. See the staden package
+manual for precise details on this.
+
+An additional quality cutoff can be applied to each base. This determines the
+contribution that each base makes to the consensus calculation and also the
+colour used when displaying bases on the screen.  Bases with a quality lower
+than the cutoff are displayed in @code{qualColour} and @code{qualForeground}
+colours, as defined in the Configuration Options listed above.
+
+ at table @var
+ at findex set_ccutoff: editor widget
+ at item @code{set_ccutoff ?}value at code{?}
+
+        If @var{value} is specified the consensus cutoff is set to
+        @var{value}.  Otherwise the existing consensus cutoff value is
+        returned without making any changes.
+
+ at findex set_qcutoff: editor widget
+ at item @code{set_qcutoff ?}value at code{?}
+
+        If @var{value} is specified the quality cutoff is set to @var{value}.
+        Otherwise the existing quality cutoff value is returned without making
+        any changes.
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Anno
+ at subsubsection Annotations
+ at cindex Annotations: editor widget
+ at cindex Editor widget: annotations
+
+ at table @var
+ at findex delete_anno: editor widget
+ at item @code{delete_anno}
+
+        Delete the tag underneath the cursor. This also sets the current
+        selection to be the range covered by the tag.
+
+ at findex create_anno: editor widget
+ at item @code{create_anno}
+
+        Brings up a tag editor window to create a new tag. This requires the
+        selection to have been previously set.
+
+ at findex edit_anno: editor widget
+ at item @code{edit_anno}
+
+        Brings up a tag editor window. This also sets the current selection to
+        be the range covered by the tag.
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Edits
+ at subsubsection Editing Commands
+ at cindex Editing commands: editor widget
+ at cindex Editor widget: editing commands
+
+ at table @var
+ at findex transpose_left: editor widget
+ at findex transpose_right: editor widget
+ at item @code{transpose_left}
+ at itemx @code{transpose_right}
+
+        Moves a base in a sequence either left or right one character. Does
+        not work on the consensus sequence. Only pads can be moved unless the
+        appropriate superedit mode is enabled.
+
+ at findex extend_left: editor widget
+ at findex extend_right: editor widget
+ at findex zap_left: editor widget
+ at findex zap_right: editor widget
+ at item @code{extend_left}
+ at itemx @code{extend_right}
+ at itemx @code{zap_left}
+ at itemx @code{zap_right}
+
+        Adjusts the current left or right cutoff for a sequence. The
+        @code{extend_} commands move the cutoff by a single base and require
+        the editing cursor to be at the appropriate end of the used data. The
+        @code{zap_} commands set the appropriate end of the used data to be
+        the current cursor position.
+
+ at findex delete_key: editor widget
+ at findex delete_left_key: editor widget
+ at item @code{delete_key}
+ at itemx @code{delete_left_key}
+
+        Delete comes in two modes. Both delete the base to the left of the
+        editing cursor. @code{delete_key} then moves the sequence to the right
+        of the editing cursor, and the cursor itself, left by one base to
+        fill the gap. @code{delete_left_key} instead moves the sequence to
+        the left of the editing cursor right by one base, and hence changes
+        the sequence start position too. Typically the @kbd{DEL} key is bound to
+        @code{delete_key} and @kbd{CTRL-DEL} is bound to
+        @code{delete_left_key}.
+
+ at findex edit_key: editor widget
+ at item @code{edit_key} character
+
+        Other general key presses. Typically any other key press is bound to
+        this call, which then handles the actual editing or replacing of
+        bases. The key character should be passed over as an argument.
+
+ at findex set_confidence: editor widget
+ at item @code{set_confidence} value
+
+        Sets the confidence value of a base to @i{value}. In the current
+        implementation only values of 0 and 100 are allowed.
+
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Settings
+ at subsubsection Editing Toggles and Settings
+ at cindex Toggles: editor widget
+ at cindex Settings: editor widget
+ at cindex Editor widget: toggles
+ at cindex Editor widget: settings
+
+The editor has a variety of boolean values for determining editing and display
+modes. Most take an optional @var{value} parameter to explicitly set the value
+of the boolean. With no @var{value} parameter specified the boolean is toggled
+instead.
+
+ at table @var
+ at findex set_reveal: editor widget
+ at item @code{set_reveal ?}value at code{?}
+
+        This sets the editor 'cutoffs' mode. A setting of 1
+        indicates that cutoff data is to be displayed in the
+        @code{lightColour} colour. A setting of 0 indicates that no cutoff
+        data is to be displayed.
+
+ at findex set_insert: editor widget
+ at item @code{set_insert ?}value at code{?}
+
+        This command sets the editor insert/replace mode. A @i{value} of 1
+        sets the editor to insert mode. A @i{value} of 0 sets the editor to
+        replace mode.
+
+ at cindex Superedit: editor widget
+ at cindex Editor widget: superedit
+ at findex superedit: editor widget
+ at item @code{superedit} modes
+
+        This command sets which editing actions should be allowed. The
+        @i{modes} argument should be a Tcl list of 10 values, each 0
+        (disabled) or 1 (enabled). The values in order represent insert any to
+        read, delete any from read, insert to consensus, delete dash from
+        consensus, delete any from consensus, replace base in consensus, shift
+        readings, transpose any bases, can use uppercase edits, and
+        replacement mode. The replacement mode is 0 for editing by base type
+        and 1 for edit by confidence value.
+
+ at findex auto_save: editor widget
+ at item @code{auto_save ?}value at code{?}
+
+        This command sets the auto-save mode. A @i{value} of 1 enables
+        auto-saving. A @i{value} of 0 disables it.
+
+ at findex show_differences: editor widget
+ at item @code{show_differences ?}value at code{?}
+
+        This command sets the show differences mode. A @i{value} of 1 will
+        display only those bases that disagree with the consensus. All
+        other bases are displayed as a fullstop. A @i{value} of 0 shows all
+        bases.
+
+ at findex compare_strands: editor widget
+ at item @code{compare_strands ?}value at code{?}
+
+        This command sets the compare strands mode. A @i{value} of 1 will make
+        the editor compute the consensus separately for the positive and
+        negative strands. Strands that disagree are given a final consensus
+        character of '-'. A @i{value} of 0 will use the normal single
+        consensus mode.
+
+ at findex join_lock: editor widget
+ at item @code{join_lock ?}value at code{?}
+
+        This command sets the scroll locking between two editors forming a
+        join editor. A @i{value} of 1 will mean that scrolling (not cursor
+        movement) in one contig will also scroll the other contig.
+
+ at findex show_quality: editor widget
+ at item @code{show_quality} value
+
+        This command sets the quality display mode. With a @i{value} of 1 and
+        a positive quality cutoff value all quality values are displayed as
+        grey scales using the 10 @code{qualColour}@i{n} configuration options.
+
+ at end table
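+
+The ten @code{superedit} flags are positional, so C code that builds or checks
+such a list may find named indices helpful. The following is only a sketch:
+the enum names and the parse_superedit helper are hypothetical and do not
+come from the editor sources, and the list is assumed to be plain
+space-separated 0/1 values.
+
```c
#include <stdlib.h>

/* Hypothetical names for the ten positions, in the order listed above. */
enum {
    SE_INSERT_READ,        /* insert any to read */
    SE_DELETE_READ,        /* delete any from read */
    SE_INSERT_CONS,        /* insert to consensus */
    SE_DELETE_DASH_CONS,   /* delete dash from consensus */
    SE_DELETE_ANY_CONS,    /* delete any from consensus */
    SE_REPLACE_CONS,       /* replace base in consensus */
    SE_SHIFT_READ,         /* shift readings */
    SE_TRANSPOSE_ANY,      /* transpose any bases */
    SE_UPPERCASE,          /* can use uppercase edits */
    SE_REPLACE_MODE,       /* 0 = by base type, 1 = by confidence value */
    SE_COUNT
};

/*
 * Parse a space-separated list of exactly ten 0/1 values into `modes`.
 * Returns 0 for success and -1 if the list is malformed.
 */
static int parse_superedit(const char *list, int modes[SE_COUNT])
{
    char *end;
    int i;

    for (i = 0; i < SE_COUNT; i++) {
        long v = strtol(list, &end, 10);

        if (end == list || (v != 0 && v != 1))
            return -1;              /* missing or non-boolean value */
        modes[i] = (int)v;
        list = end;
    }

    /* Reject anything other than trailing spaces after the tenth value. */
    while (*list == ' ')
        list++;
    return *list == '\0' ? 0 : -1;
}
```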
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Search
+ at subsubsection Searching
+ at cindex Searching in the editor widget
+ at cindex Editor widget: searching
+
+The editor search procedures search for a particular item and move the editing
+cursor and xview position if a search item is found. Each search command takes
+a direction and a search string. @i{Direction} can be either '@code{forward}'
+or '@code{reverse}'.
+
+ at table @var
+ at findex search name: editor widget
+ at item @code{search} direction @code{name} value
+
+        Searches for the reading name starting with @i{value}.
+
+ at findex search anno: editor widget
+ at item @code{search} direction @code{anno ?}value at code{?}
+
+        Searches for the annotation containing a comment matching the
+        @i{value} regular expression. Not specifying @i{value} will match all
+        annotations.
+
+ at findex search sequence: editor widget
+ at item @code{search} direction @code{sequence} value
+
+        Searches for the sequence @i{value} using a case-insensitive exact
+        match.
+        
+ at findex search tag: editor widget
+ at item @code{search} direction @code{tag} value
+
+        Searches for a tag with type @i{value}.
+
+ at findex search position: editor widget
+ at item @code{search} direction @code{position} value
+
+        Moves to a specific position. If @i{value} is an absolute number (eg
+        '@code{30717}') then the editing cursor is moved to that consensus
+        base.  If @i{value} is '@code{@@}' followed by a number (eg
+        '@code{@@100}') then the editing cursor is moved to that base within
+        the current reading.  If @i{value} starts with a plus or minus the
+        editing cursor is moved forwards or backwards by that amount. The
+        @i{direction} parameter here has no effect and is included purely for
+        consistency.
+
+ at findex search problem: editor widget
+ at item @code{search} direction @code{problem}
+
+        Searches for undefined bases or pads.
+
+ at findex search quality: editor widget
+ at item @code{search} direction @code{quality}
+
+        Searches for bases of poor quality (undefined bases, pads, or single
+        stranded data).
+
+ at findex search edit: editor widget
+ at item @code{search} direction @code{edit}
+
+        Searches for sequence edits, including confidence value changes.
+
+ at findex search verifyand: editor widget
+ at findex search verifyor: editor widget
+ at item @code{search} direction @code{verifyand}
+ at itemx @code{search} direction @code{verifyor}
+
+        Searches for consensus bases that have a lack of evidence in the
+        original sequences. @code{verifyand} looks for evidence on both
+        strands together. @code{verifyor} looks for evidence on each strand
+        independently and defines a match to be places where either strand has
+        a lack of evidence. In the current implementation of these two
+        searches only the forward direction is supported.
+ at end table
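+
+The three forms of @i{value} accepted by the @code{position} search (an
+absolute number, '@code{@@}' followed by a number, or a signed offset) can be
+told apart by their first character. The C sketch below illustrates that
+classification. The pos_kind type and parse_position function are
+hypothetical and are not taken from the editor sources.
+
```c
#include <stdlib.h>

/* Hypothetical classification of a position search value. */
typedef enum {
    POS_INVALID,     /* not a recognised form */
    POS_CONSENSUS,   /* "30717": absolute consensus base */
    POS_READING,     /* "@100":  base within the current reading */
    POS_RELATIVE     /* "+25" or "-3": offset from the cursor */
} pos_kind;

/*
 * Classify `value` and store its numeric part in `*n`. For relative
 * positions the sign is kept, so "-3" yields -3.
 */
static pos_kind parse_position(const char *value, long *n)
{
    const char *digits = value;
    char *end;
    pos_kind kind;

    if (value == NULL || *value == '\0')
        return POS_INVALID;

    if (*value == '@') {
        kind = POS_READING;
        digits = value + 1;         /* skip the '@' marker */
    } else if (*value == '+' || *value == '-') {
        kind = POS_RELATIVE;        /* strtol consumes the sign itself */
    } else {
        kind = POS_CONSENSUS;
    }

    *n = strtol(digits, &end, 10);
    if (end == digits || *end != '\0')
        return POS_INVALID;         /* no digits, or trailing junk */
    return kind;
}
```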
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Primer
+ at subsubsection Primer Selection
+ at cindex Primer selection in editor widget
+ at cindex Editor widget: primer selection
+ at findex select_oligos: editor widget
+
+These control the searching for and creation of oligo primers. Together they
+form the Select Primer functionality of the contig editor.  The
+ at code{generate} command must be run first. All other commands have undefined
+behaviour when the generate command has not been run since the last quit
+command.
+
+ at table @var
+ at findex select_oligos generate: editor widget
+ at item @code{select_oligos generate} sense forward backward avg_length
+
+        Generates a list of oligos suitable for use on the @i{sense} strand,
+        within @i{forward} bases rightwards of the cursor and @i{backward}
+        bases leftwards. Returns the number of oligos found, or -1 for error.
+
+ at findex select_oligos next: editor widget
+ at item @code{select_oligos next}
+
+        Picks the next oligo in the list produced by the @code{generate}
+        command (or the first if @code{next} hasn't been called yet). This remembers
+        the current active oligo number and returns the default template
+        followed by the complete list of templates (including the default)
+        suitable for this oligo.
+
+ at findex select_oligos accept: editor widget
+ at item @code{select_oligos accept} template
+
+        Adds the tag to the database for this oligo using the named
+        @i{template} ("" can be specified here if none is required). Returns a
+        status line containing the template name and the oligo sequence. 
+
+ at findex select_oligos quit: editor_widget
+ at item @code{select_oligos quit}
+
+        Frees up memory allocated by the @code{generate} command.
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Status
+ at subsubsection The Status Line
+ at cindex Status line in editor widget
+ at cindex Editor widget: status line
+ at findex status: editor widget
+
+ at table @var
+ at findex status add: editor_widget
+ at item @code{status add} type
+
+        Adds a new status line to the bottom of the editor. The @i{type} may
+        be one of the following.
+
+        @table @var
+        @item 0
+                Strand display
+        @item 1, 2 or 3
+                Amino acid translations in reading frame 1, 2 and 3 for the
+                positive strand.
+
+        @item 4, 5 or 6
+                Amino acid translations in reading frame 1, 2 and 3 for the
+                negative strand.
+        @end table
+
+ at findex status delete: editor_widget
+@item @code{status delete} type
+
+        Delete a status line. The @i{type} is from the same set listed above.
+
+ at findex translation_mode: editor_widget
+ at item @code{translation_mode} mode
+
+        This command sets the style of amino acids displayed. @i{Mode} may be
+        either @code{1} or @code{3} to output 1 character or 3 character
+        abbreviations.
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Trace
+ at subsubsection The Trace Display
+ at cindex Trace display in editor widget
+ at cindex Editor widget: trace display
+
+ at table @var
+ at findex autodisplay_traces: editor widget
+@item @code{autodisplay_traces ?}value@code{?}
+
+        This command sets the automatic trace display mode. A @i{value} of 1
+        will make the editor display relevant traces to solve a problem when
+        the @code{problem} search type is used. A @i{value} of 0 disables
+        this.
+
+ at findex set_trace_lock: editor widget
+@item @code{set_trace_lock ?}value@code{?}
+
+        This command sets the locking mode between the editor cursor and the
+        trace cursor. With a @i{value} of 1 any movement in the editor cursor
+        also moves the connected trace displays. A @i{value} of 0 disables
+        this.
+
+ at findex trace_comparator: editor widget
+@item @code{trace_comparator ?}identifier@code{?}
+
+        This command specifies another reading identifier (within the same
+        contig) to compare all new traces against. The comparator
+        @var{identifier} can either be a reading identifier to compare against
+        that specific reading or @code{0} to compare against a consensus
+        trace.  When @code{invoke_trace} is called the comparator trace, the
+        requested trace, and their differences are displayed. With no
+        @var{identifier} argument the automatic trace comparison is disabled.
+
+ at findex trace_config: editor widget
+@item @code{trace_config ?}match select@code{?}
+
+        This command controls generation of the consensus trace when
+        @code{trace_comparator 0} is used. The consensus trace is calculated
+        as the average trace of readings on the same strand as the trace we
+        wish to compare it against. If @var{match} is non zero, each single
+        base segment of the consensus trace is averaged from only readings in
+        agreement with the consensus sequence. If @var{select} is non zero the
+        trace to compare against is not used in the consensus trace
+        calculation. With no @var{match} or @var{select} arguments the current
+        settings are returned.
+
+ at findex delete_trace: editor_widget
+ at item @code{delete_trace} path
+
+        Removes a trace with the Tk @i{path} from the trace display. Useful
+        for when quitting the editor.
+
+ at findex invoke_trace: editor_widget
+ at item @code{invoke_trace}
+
+        Adds a trace to the trace display.
+
+ at findex diff_trace: editor_widget
+ at item @code{diff_trace} path1 path2
+
+        This brings up a difference trace between two currently displayed
+        traces with the Tk pathnames of @var{path1} and @var{path2}. These
+        pathnames are returned from the initial @code{trace_add} and
+        @code{trace_create} Tcl utility routines and are typically only known
+        internally to the editor.
+
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Misc
+ at subsubsection Miscellaneous Commands
+ at cindex Editor, miscellaneous commands
+
+ at table @var
+ at findex xview: editor_widget
+ at findex yview: editor_widget
+@item @code{xview ?}position@code{?}
+@itemx @code{yview ?}position@code{?}
+
+These commands are used to query and change the horizontal and vertical
+position of the information displayed in the editor's window. Without
+specifying the optional @i{position} argument the current value is returned.
+Specifying @i{position} sets the position and updates the editor display.
+
+ at findex align: editor_widget
+ at item @code{align}
+
+        Aligns the data covered by the selection with the consensus sequence.
+        The sequence is then padded automatically.
+
+ at findex configure: editor_widget
+@item @code{configure ?}option@code{? ?}value option value ...@code{?}
+
+        Reconfigures the editor. NB: not all configuration options allowed at
+        startup operate correctly when reconfiguring. (This is a bug.)
+
+ at findex dump_contig: editor_widget
+ at item @code{dump_contig} filename from to line_length
+
+        Saves the contig display to a file within a specified region. The
+        output consists of the data and settings of the current display.
+
+ at findex edits_made: editor_widget
+ at item @code{edits_made}
+
+        Queries whether edits have been made. Returns 1 if they have, 0 if
+        they have not.
+
+ at findex find_read: editor_widget
+ at item @code{find_read} identifier
+
+        Converts a reading identifier to an internal editor sequence number.
+
+ at findex get_displayed_annos: editor_widget
+ at item @code{get_displayed_annos}
+
+        Returns a list of the displayed annotation types.
+
+ at findex get_extents: editor_widget
+ at item @code{get_extents}
+
+        Returns the start and end of the displayable contig positions. If
+        cutoff data is shown this will also include the cutoff data beyond the
+        normal contig ends.
+
+ at findex get_hidden_reads: editor_widget
+ at item @code{get_hidden_reads}
+
+        This returns the hidden reads as a list of reading name identifiers.
+
+ at findex get_name: editor_widget
+@item @code{get_name ?}gel_number@code{?}
+
+        Returns the gel name from a given internal reading number, or for the
+        reading underneath the editing cursor.
+
+ at findex get_number: editor_widget
+@item @code{get_number ?}xpos ypos@code{?}
+
+        Returns the editor's internal reading number covering the screen
+        coordinate (@i{xpos}, @i{ypos}).  If no @i{xpos} and @i{ypos} are
+        specified then the position of the editing cursor is used.
+
+ at findex get_read_number: editor_widget
+@item @code{get_read_number ?}xpos ypos@code{?}
+
+        Returns the reading number covering the screen coordinate
+        (@i{xpos}, @i{ypos}).  If no @i{xpos} and @i{ypos} are specified then
+        the position of the editing cursor is used.
+
+ at findex hide_read: editor_widget
+ at item @code{hide_read}
+
+        This command toggles the 'hidden' status of a reading. Hidden readings
+        are not used to compute the consensus.
+
+ at findex io: editor_widget
+ at item @code{io}
+
+        This returns the IO handle used for this editor.
+
+ at findex join: editor_widget
+ at item @code{join}
+
+        Performs a join in the join editor.
+
+ at findex join_align: editor_widget
+ at item @code{join_align}
+
+        Performs an alignment (and pads automatically) on the overlapping
+        region in a join editor.
+
+ at findex join_mode: editor_widget
+ at item @code{join_mode}
+
+        Queries whether the editor is part of a join editor. Returns 1 if it
+        is and 0 if it is not.
+
+ at findex join_percentage: editor_widget
+ at item @code{join_percentage}
+
+        Returns the percentage mismatch of the overlap for a join editor.
+
+ at findex save: editor_widget
+ at item @code{save}
+
+        Saves the database, but doesn't quit.
+
+ at findex set_displayed_annos: editor_widget
+@item @code{set_displayed_annos ?}type ...@code{?}
+
+        Sets the displayed annotation types to those specified. All others
+        are turned off.
+
+ at findex shuffle_pads: editor_widget
+ at item @code{shuffle_pads}
+
+        Realigns pads along the total length of the consensus.
+
+ at findex undo: editor_widget
+ at item @code{undo}
+
+        Undoes the last compound operation (from a list of changes).
+
+ at findex write_mode: editor_widget
+ at item @code{write_mode}
+
+        This toggles the editor between read-write and read-only mode.
+
+ at end table
+
+ at c -------------------------------------------------------------------------
+ at split{}
+ at node GEditor-Quit
+ at subsubsection Quitting the Widget
+ at cindex Quitting the editor
+ at cindex Editor, quitting
+
+Destroying the editor widget automatically destroys the associated data
+(edStruct) and deregisters from the contig. However a quit command also
+exists. The difference between using the Tk destroy command and quit is that
+quit also sends acknowledgements of shutdown allowing other displays to tidy
+up (such as deleting a displayed cursor). Hence quit is the preferred method.
+
+ at table @var
+ at findex quit: editor_widget
+ at item @code{quit}
+
+        Destroys the widget.
+ at end table
diff --git a/scripting_manual/gap4-registration-t.texi b/scripting_manual/gap4-registration-t.texi
new file mode 100644
index 0000000..0f0f720
--- /dev/null
+++ b/scripting_manual/gap4-registration-t.texi
@@ -0,0 +1,1773 @@
+ at menu
+ at ifset html
+* Reg-Introduction::    Introduction
+ at end ifset
+* Reg-Structures::          Data Structures
+* Reg-Registrating Data::   Registering a piece of data
+* Reg-Callbacks::           The callback function
+* Reg-Notifications::       The notifications available
+* Reg-Sending::             Sending a notification
+* Reg-Tasks::               Specific notification tasks
+* Reg-Functions::           C Functions available
+* Reg-Locking::             Locking mechanisms
+* Reg-Specific Examples::   Examples of Specific Functions
+* Reg-Tcl::                 Tcl Interfaces
+* Reg-To Do::               Future enhancements
+ at end menu
+
+ at ifset html
+ at split{}
+ at node Reg-Introduction
+ at section Introduction
+ at end ifset
+ at cindex Registration introduction
+
+Each function wishing to access a contig on a long-term basis needs to
+register itself before accessing the data. For example, the template display
+and contig editors should register, but show relationships produces a report
+taken from a single snapshot of the data and so does not need to register.
+
+The idea of registering is to allow communication between views of the same
+(or derived) data. This ensures that views can be kept up to date
+automatically when modifications are made, and provides a mechanism to prevent
+multiple, incompatible edits of the same data. An example can be seen in the
+template display. Suppose we have the display showing contig 4. A join contig
+operation links this to contig 7 and produces a new contig --- number 10.
+Other contig numbers may have shuffled too: if we had 9 contigs then contig 9
+may well be renumbered to contig 4.
+
+Therefore we need to notify any functions displaying contig 9 of a contig
+number change. We also need to notify displays of contig 4 that the number is
+now 10, and both this and contig 7 that the contents and length have changed.
+
+Central to the scheme is the results manager. This displays a list of which
+data is registered and provides a further method for the user to interrogate
+specific results.
+
+Notifications may also be for requesting data as well as informing changes.
+All registered items must respond to certain notifications, such as for
+determining the name of the function, so that it can be listed in the results
+manager.
+
+ at split{}
+ at node Reg-Structures
+ at section Data Structures
+ at cindex Registration structures
+
+For each contig we maintain a list of displays of this data. We register by
+supplying a function (of a specific type) to our registration scheme, along
+with any data of our own (called our @i{client_data}) that we wish to be
+passed back. When an operation is performed on this contig the function that
+we specified is called along with our own client_data and a description of the
+operation made. A function often does not need to be told of all changes, so
+when registering it's possible to list only those operations that should be
+responded to.
+
+In addition to maintaining the above information, each registration contains
+an identifier, a time stamp, a type, and an "id" value.
+
+The identifier is simply a number that is used to specify a single piece of
+registered data, or a group of registered data. An example of its use is
+within the contig selector; the selector is registered on all contigs, but
+each registration has the same identifier. A new identifier is returned by
+calling the @code{register_id} function.
+
+The time stamp is allocated automatically when the @code{contig_register}
+function is called. It is displayed within the results manager.
+
+ at cindex Types, registration scheme
+ at cindex Registration types
+ at vindex REG_TYPE_UNKNOWN
+ at vindex REG_TYPE_EDITOR
+ at vindex REG_TYPE_FIJ
+ at vindex REG_TYPE_READPAIR
+ at vindex REG_TYPE_REPEAT
+ at vindex REG_TYPE_QUALITY
+ at vindex REG_TYPE_TEMPLATE
+ at vindex REG_TYPE_RESTRICTION
+ at vindex REG_TYPE_STOPCODON
+ at vindex REG_TYPE_CONTIGSEL
+ at vindex REG_TYPE_CHECKASS
+ at vindex REG_TYPE_OLIGO
+The type is used to flag a piece of registered data as belonging to a specific
+function. This is useful when we wish to send a notification to all
+instances of a particular display, or to query whether the contig editor is
+running (as the stop codon display does). The current types known
+are:
+
+ at example
+ at group
+REG_TYPE_UNKNOWN
+REG_TYPE_EDITOR
+REG_TYPE_FIJ
+REG_TYPE_READPAIR
+REG_TYPE_REPEAT
+REG_TYPE_QUALITY
+REG_TYPE_TEMPLATE
+REG_TYPE_RESTRICTION
+REG_TYPE_STOPCODON
+REG_TYPE_CONTIGSEL
+REG_TYPE_CHECKASS
+REG_TYPE_OLIGO
+ at end group
+ at end example
+
+The id value is used to distinguish which pieces of data are connected. Each
+"result" has a single id value, but may consist of multiple pieces of
+registered data, all sharing the same id.
+
+ at cindex contig_reg_t structure
+ at vindex contig_reg_t
+So the registration consists of the following structure:
+
+ at example
+ at group
+typedef struct @{
+    void  (*func)(
+                  GapIO    *io,
+                  int       contig,
+                  void     *fdata,
+                  reg_data *jdata);
+    void   *fdata;
+    int     id;
+    time_t  time;
+    int     flags;
+    int     type;
+    int     uid; /* A _unique_ identifier for this contig_reg_t */
+@} contig_reg_t;
+ at end group
+ at end example
+
+The @code{func} and @code{fdata} fields are the callback function and
+client_data. @var{uid} is a number unique to each registration, even those
+that have common @var{id} values. You need not be concerned about its use; it
+is internal to the registration system.
+
+Hence the total memory used by the registration system is an array of arrays
+of the above structures: one entry per contig, each holding an array of
+@var{contig_reg_t} structs.
+
+A notification of an action involves creating a @var{reg_data} structure and
+sending this to one of the notification functions (such as
+@code{contig_notify}). The @var{reg_data} structure is in fact a union of many
+structure types, one for each notification type. Common to all these types
+is the job field, which must be filled in with the current notification type.
+_oxref(Reg-Notifications, The Notifications Available).
+
+As @var{reg_data} is a union of structures, its fields must be accessed
+through the relevant union member. For instance, to determine the position of
+the contig editor cursor from a @code{REG_CURSOR_NOTIFY} notification we need
+to write "@code{reg_data->cursor_notify.pos}" rather than simply
+"@code{reg_data->pos}". The complete list of union member names can be found
+in io-reg.h. The current list is summarised below. The types and use of these
+structures will be discussed in further detail later.
+
+ at cindex reg_data structure
+ at vindex reg_data
+ at example
+typedef union _reg_data @{
+    /* MUST be first here and in job data structs */
+    int job;
+    
+    reg_generic         generic;
+    reg_number          number;
+    reg_join            join;
+    reg_order           order;
+    reg_length          length;
+    reg_query_name      name;
+    reg_delete          delete;
+    reg_complement      complement;
+    reg_get_lock        glock;
+    reg_set_lock        slock;
+    reg_quit            quit;
+    reg_get_ops         get_ops;
+    reg_invoke_op       invoke_op;
+    reg_params          params;
+    reg_cursor_notify   cursor_notify;
+    reg_anno            annotations;
+    reg_register        c_register;
+    reg_deregister      c_deregister;
+    reg_highlight_read  highlight;
+    reg_buffer_start    buffer_start;
+    reg_buffer_end      buffer_end;
+@} reg_data;
+ at end example
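+
+The effect of placing @code{job} first can be seen in a cut-down model of the
+union. This is a minimal sketch, not the contents of io-reg.h: the numeric job
+values are invented for illustration, and the @code{reg_cursor_notify} layout
+is assumed to hold only the @code{job} and @code{pos} fields mentioned in the
+text. Because the union members are structures rather than pointers, their
+fields are selected with @code{.} after the single @code{->} on the
+@code{reg_data} pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative job codes only; the real values are defined in io-reg.h. */
#define REG_LENGTH        (1 << 4)
#define REG_CURSOR_NOTIFY (1 << 12)

typedef struct {
    int job;     /* REG_LENGTH */
    int length;  /* New length */
} reg_length;

/* Assumed layout: only 'job' and 'pos' are named in the text. */
typedef struct {
    int job;     /* REG_CURSOR_NOTIFY */
    int pos;     /* Cursor position */
} reg_cursor_notify;

typedef union {
    int               job;  /* MUST be first here and in job data structs */
    reg_length        length;
    reg_cursor_notify cursor_notify;
} reg_data;

/*
 * Return the cursor position carried by a notification, or -1 if the
 * notification is of some other kind. Every member starts with 'job',
 * so jdata->job is valid whichever member was filled in.
 */
int cursor_pos(const reg_data *jdata)
{
    assert(jdata != NULL);
    if (jdata->job != REG_CURSOR_NOTIFY)
        return -1;
    return jdata->cursor_notify.pos;
}
```

+A sender fills in one member (including its @code{job} field) and the receiver
+inspects @code{jdata->job} before touching any other member.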
+
+ at split{}
+ at node Reg-Registrating Data
+ at section Registering a Piece of Data
+ at cindex Registration of data
+
+To register data several things need to be known: the contig number, the
+callback function, the client_data (typically the address of the data to
+register), the list of notifications to respond to, an identifier, and the
+"type" of this data (one of the @code{REG_TYPE_} macros).
+
+If the data needs updating when more than one specific contig changes, then
+the data should be registered with more than one contig.
+
+Use the @code{contig_register} function to register an item. The prototype
+is:
+
+ at example
+#include <io-reg.h>
+
+int contig_register(
+        GapIO  *io,
+        int     contig,
+        void  (*func)(
+                      GapIO     *io,
+                      int        contig,
+                      void      *fdata,
+                      reg_data  *jdata),
+        void   *fdata,
+        int     id,
+        int     flags,
+        int     type);
+ at end example
+
+ at var{contig} is a contig number in the C sense (@code{1} to
+ at code{NumContigs(io)}), not a gel reading number.
+
+The @var{fdata} (the client_data mentioned before) can be anything you wish.
+It will be passed back to the callback function @var{func} when a notification
+is made.  Typically it's best to simply pass the address of your data that you
+wish to keep up to date. If your data is not a single pointer then turn it
+into one by creating a structure containing all the relevant pointers.
+
+The id number is usually unique for each time an option is run, but common to
+all registrations of this particular piece of data. This is not a hard and
+fast rule --- it depends on how you wish to interact with this data. For
+instance, the contig selector window registers with all contigs so that it can
+be notified when any contig changes. The same @var{id} is used for each of
+these registrations as it is the collection of registrations as a whole which
+is required for the display.
+
+"Flags" is used to request which notifications should be sent to this callback
+function. Each notification has a name which is actually a #define for a
+number. These names can be ORed together to generate a bit field of
+acknowledged requests. There are some predefined bitfields (for shortening the
+function call) that can themselves be ORed together. _oxref(Reg-Notifications,
+The Notifications Available).  Finally, one special flag can be ORed on to
+request that this function does not appear in the results manager window. This
+flag is @code{REG_FLAG_INVIS}: see the contig selector code for an example.
+
+An example of using @code{contig_register} can be seen in the stop codon plot.
+Our stop codon results are all held within a structure of type
+ at var{mobj_stop}. The general outline of our stop codon code is as follows:
+
+ at example
+mobj_stop *s;
+int id;
+
+if (NULL == (s = (mobj_stop *)xmalloc(sizeof(mobj_stop)))) @{
+    return 0;
+@}
+
+[ Fill in our 's' structure with our results ]
+
+DrawStopCodons(s);
+id = register_id();
+contig_register(io, contig_number, stop_codon_callback, (void *)s, id,
+                REG_REQUIRED | REG_DATA_CHANGE | REG_OPS | REG_GENERIC
+                | REG_NUMBER_CHANGE | REG_REGISTERS | REG_CURSOR_NOTIFY,
+                REG_TYPE_STOPCODON);
+ at end example
+
+Here we've requested that the result @var{s}, of type
+ at code{REG_TYPE_STOPCODON}, should be passed to the @code{stop_codon_callback}
+function whenever a notification of type @code{REG_REQUIRED},
+ at code{REG_DATA_CHANGE}, @code{REG_OPS}, @code{REG_GENERIC},
+ at code{REG_NUMBER_CHANGE}, @code{REG_REGISTERS} or @code{REG_CURSOR_NOTIFY}
+occurs. These notification types are actually combinations of types, but more
+on this later.
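+
+The bookkeeping behind registration can be pictured with a toy model. This is
+only a sketch under stated assumptions, not the Gap4 implementation: the names
+@code{toy_register}, @code{toy_register_id}, @code{toy_count_id} and
+@code{noop_callback} are hypothetical, the callback signature is simplified,
+and the fixed array sizes and the 0 / -1 return values are choices made for
+the example. It shows one array of registrations per contig, and one
+@var{id} shared by several registrations of the same result.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONTIGS 8   /* toy limits, not Gap4 values */
#define MAX_REGS    4

/* Simplified callback type; the real one also takes GapIO * and reg_data *. */
typedef void (*toy_func)(int contig, void *fdata, int job);

typedef struct {
    toy_func func;
    void    *fdata;
    int      id;
    int      flags;
    int      type;
} toy_reg_t;

/* One array of registrations per contig (contigs are numbered from 1). */
static toy_reg_t regs[MAX_CONTIGS + 1][MAX_REGS];
static int       nregs[MAX_CONTIGS + 1];
static int       last_id;

/* Hand out a fresh identifier. */
int toy_register_id(void)
{
    return ++last_id;
}

/* Append a registration to one contig's list; 0 on success, -1 on failure. */
int toy_register(int contig, toy_func func, void *fdata,
                 int id, int flags, int type)
{
    toy_reg_t *r;

    assert(func != NULL);
    if (contig < 1 || contig > MAX_CONTIGS || nregs[contig] >= MAX_REGS)
        return -1;

    r = &regs[contig][nregs[contig]++];
    r->func  = func;
    r->fdata = fdata;
    r->id    = id;
    r->flags = flags;
    r->type  = type;
    return 0;
}

/* Count the registrations, over all contigs, that share one id. */
int toy_count_id(int id)
{
    int contig, i, n = 0;

    for (contig = 1; contig <= MAX_CONTIGS; contig++)
        for (i = 0; i < nregs[contig]; i++)
            if (regs[contig][i].id == id)
                n++;
    return n;
}

/* Callback that ignores everything; enough to exercise the table. */
void noop_callback(int contig, void *fdata, int job)
{
    (void)contig;
    (void)fdata;
    (void)job;
}
```

+Registering the same callback on contigs 1 and 2 with a single @var{id} gives
+two table entries that together form one result, which is the pattern the
+contig selector is described as using.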
+
+ at split{}
+ at node Reg-Callbacks
+ at section The Callback Function
+ at cindex Callbacks, registration
+ at cindex Registration callbacks
+
+The callback function must be of the following prototype:
+
+ at example
+void function(
+        GapIO     *io,
+        int        contig,
+        void      *fdata,
+        reg_data  *jdata);
+ at end example
+
+Here @var{fdata} will be the client_data specified when registering. The first
+task within our callback function will be to cast this to a useful type. As
+the type of this @var{fdata} will change depending on what piece of data is
+registered this is a required, but tedious, action.
+
+The next task at hand is to see exactly why the callback function was called.
+This is listed in the @var{reg_data} parameter. Specifically
+ at code{jdata->job} will be one of the many notification types. The suggested
+coding method is to perform a switch on this field as follows:
+
+ at example
+static void some_callback(GapIO *io, int contig, void *fdata, reg_data *jdata)
+@{
+    some_type_t *s = (some_type_t *)fdata;
+
+    switch(jdata->job) @{
+    case REG_QUERY_NAME:
+        sprintf(jdata->name.line, "Some name");
+        break;
+
+    case REG_QUIT:
+    case REG_DELETE:
+        ShutDownSomeDisplay(s);
+        xfree(s);
+        break;
+    @}
+@}
+ at end example
+
+ at code{REG_QUERY_NAME}, @code{REG_QUIT}, @code{REG_DELETE} and
+ at code{REG_PARAMS} are required to be accepted by all registered items.
+
+In general the callback function will also be interested in changes to the
+contig that the data is registered with. These involve the @code{REG_JOIN_TO},
+ at code{REG_COMPLEMENT}, @code{REG_LENGTH}, @code{REG_NUMBER_CHANGE} and
+ at code{REG_ANNO} requests.
+
+For precise details on handling the various notifications, please see
+the following section.
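+
+A self-contained variant of the callback pattern is sketched here so that it
+can be compiled outside Gap4. The assumptions are: @code{GapIO} is replaced by
+an opaque stand-in, the job values are invented, @code{plot_t} is a
+hypothetical piece of client data, and only three notifications are handled.
+The @code{reg_length} and @code{reg_query_name} layouts follow the structures
+given in this chapter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative job codes only; the real values are defined in io-reg.h. */
#define REG_LENGTH      (1 << 4)
#define REG_QUERY_NAME  (1 << 5)
#define REG_QUIT        (1 << 11)

typedef struct { int job; int length; } reg_length;
typedef struct { int job; char *line; } reg_query_name;  /* line: char[80] */

typedef union {
    int            job;
    reg_length     length;
    reg_query_name name;
} reg_data;

typedef struct GapIO GapIO;   /* opaque stand-in for the real type */

/* Hypothetical client data for some plot. */
typedef struct {
    int contig_length;
    int open;
} plot_t;

void plot_callback(GapIO *io, int contig, void *fdata, reg_data *jdata)
{
    /* First task: cast the client data back to its real type. */
    plot_t *p = (plot_t *)fdata;

    (void)io;
    assert(p != NULL && jdata != NULL);

    /* Second task: find out why we were called. */
    switch (jdata->job) {
    case REG_QUERY_NAME:
        /* The caller owns the 80-byte buffer; never write past it. */
        snprintf(jdata->name.line, 80, "Plot of contig %d", contig);
        break;

    case REG_LENGTH:
        p->contig_length = jdata->length.length;
        break;

    case REG_QUIT:
        p->open = 0;
        break;
    }
}
```

+Using @code{snprintf} with the documented buffer size is a defensive choice
+made for this sketch; the example above uses @code{sprintf}.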
+
+ at split{}
+ at node Reg-Notifications
+ at section The Notifications Available
+ at cindex Notifications, registration
+ at cindex Registration notifications
+
+ at menu
+* Reg-REG_GENERIC::                         REG_GENERIC
+* Reg-REG_NUMBER_CHANGE::                   REG_NUMBER_CHANGE
+* Reg-REG_JOIN_TO::                         REG_JOIN_TO
+* Reg-REG_ORDER::                           REG_ORDER
+* Reg-REG_LENGTH::                          REG_LENGTH
+* Reg-REG_QUERY_NAME::                      REG_QUERY_NAME
+* Reg-REG_DELETE::                          REG_DELETE
+* Reg-REG_GET_LOCK and REG_SET_LOCK::       REG_GET_LOCK and REG_SET_LOCK
+* Reg-REG_COMPLEMENT::                      REG_COMPLEMENT
+* Reg-REG_PARAMS::                          REG_PARAMS
+* Reg-REG_QUIT::                            REG_QUIT
+* Reg-REG_CURSOR_NOTIFY::                   REG_CURSOR_NOTIFY
+* Reg-REG_GET_OPS::                         REG_GET_OPS
+* Reg-REG_INVOKE_OP::                       REG_INVOKE_OP
+* Reg-REG_ANNO::                            REG_ANNO
+* Reg-REG_REGISTER and REG_DEREGISTER::     REG_REGISTER and REG_DEREGISTER
+* Reg-REG_HIGHLIGHT_READ::                  REG_HIGHLIGHT_READ
+* REG-REG_BUFFER_START and REG_BUFFER_END:: REG_BUFFER_START and REG_BUFFER_END
+ at end menu
+
+In order to shorten code, especially when requesting which notifications
+should be accepted using the @code{contig_register} call, the following
+macros may be of use. They are used to group the various notifications.
+
+ at example
+ at group
+#define REG_REQUIRED    (REG_QUERY_NAME | REG_DELETE | REG_QUIT | REG_PARAMS)
+#define REG_DATA_CHANGE (REG_JOIN_TO | REG_LENGTH | REG_COMPLEMENT)
+#define REG_OPS         (REG_GET_OPS | REG_INVOKE_OP)
+#define REG_LOCKS       (REG_GET_LOCK | REG_SET_LOCK)
+#define REG_REGISTERS   (REG_REGISTER | REG_DEREGISTER)
+#define REG_BUFFER      (REG_BUFFER_START | REG_BUFFER_END)
+#define REG_ALL         (REG_REQUIRED | REG_DATA_CHANGE | REG_OPS | REG_LOCKS\
+                         | REG_ORDER | REG_CURSOR_NOTIFY | REG_NUMBER_CHANGE \
+                         | REG_ANNO | REG_REGISTERS | REG_HIGHLIGHT_READ \
+                         | REG_BUFFER)
+ at end group
+ at end example
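+
+How such a bit field is built and tested can be sketched as follows. The bit
+positions are hypothetical (the real values come from io-reg.h) and
+@code{reg_wants} is a helper invented for this example; only the two group
+macros are copied from the list above.

```c
#include <assert.h>

/* Hypothetical bit assignments, for illustration only. */
#define REG_NUMBER_CHANGE  (1 << 1)
#define REG_JOIN_TO        (1 << 2)
#define REG_ORDER          (1 << 3)
#define REG_LENGTH         (1 << 4)
#define REG_QUERY_NAME     (1 << 5)
#define REG_DELETE         (1 << 6)
#define REG_COMPLEMENT     (1 << 9)
#define REG_PARAMS         (1 << 10)
#define REG_QUIT           (1 << 11)

/* Group macros as listed above. */
#define REG_REQUIRED    (REG_QUERY_NAME | REG_DELETE | REG_QUIT | REG_PARAMS)
#define REG_DATA_CHANGE (REG_JOIN_TO | REG_LENGTH | REG_COMPLEMENT)

/* Nonzero if a registration made with 'flags' asked for notification 'job'. */
int reg_wants(int flags, int job)
{
    assert(job != 0);
    return (flags & job) != 0;
}
```

+With @code{flags} set to @code{REG_REQUIRED | REG_DATA_CHANGE |
+REG_NUMBER_CHANGE}, a @code{REG_QUIT} or @code{REG_COMPLEMENT} notification
+passes the test and a @code{REG_ORDER} notification does not.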
+
+In the following descriptions, we outline the different notifications in the
+format of name followed by the name within the @var{reg_data} structure, the
+structure itself, and the description.
+
+ at split{}
+ at node Reg-REG_GENERIC
+ at subsection REG_GENERIC
+ at cindex REG_GENERIC
+ at vindex REG_GENERIC
+ at example
+ at group
+reg_generic         generic;
+
+typedef struct @{
+    int    job;        /* REG_GENERIC */
+    int    task;       /* Some specific task */
+    void  *data;     /* And data associated with the task */
+@} reg_generic;
+ at end group
+ at end example
+
+This is used for sending specific requests to specific data or data types.
+The task is a macro named after the type the task deals with, e.g.
+@code{TASK_EDITOR_SETCURSOR}. @code{REG_GENERIC} is usually used in conjunction
+with a @code{result_notify} or @code{type_contig_notify} function call.
+_oxref(Reg-Tasks, Specific Notification Tasks).
+
+ at node Reg-REG_NUMBER_CHANGE
+ at subsection REG_NUMBER_CHANGE
+ at cindex REG_NUMBER_CHANGE
+ at vindex REG_NUMBER_CHANGE
+ at example
+ at group
+reg_number          number;
+
+typedef struct @{
+    int    job;        /* REG_NUMBER_CHANGE */
+    int    number;     /* New contig number */
+@} reg_number;
+ at end group
+ at end example
+
+Sent whenever a contig number changes, but not when a reading number
+changes. This is currently only sent when renumbering contigs during a
+contig delete operation.
+    
+ at node Reg-REG_JOIN_TO
+ at subsection REG_JOIN_TO
+ at cindex REG_JOIN_TO
+ at vindex REG_JOIN_TO
+ at example
+ at group
+reg_join            join;
+
+typedef struct @{
+    int    job;        /* REG_JOIN_TO */
+    int    contig;     /* New contig number */
+    int    offset;     /* Offset of old contig into new contig */
+@} reg_join;
+ at end group
+ at end example
+
+Used to notify data that this contig has just been joined to another contig,
+at a specified offset. @var{contig} is the number of the contig that this
+contig has been joined to (and hence its new number). @var{offset} is the
+offset within the new contig at which the old contig has been joined. This
+request is always sent to the rightmost of the contig pair to join. The
+leftmost contig receives a
+ at code{REG_LENGTH} notification. _oxref(Reg-Joining two contigs, Joining Two
+Contigs).
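+
+A display that stores positions within the joined contig has to renumber and
+shift them when this notification arrives. The following is a minimal sketch:
+@code{view_t} and @code{view_handle_join} are hypothetical, the job value is
+invented, and @var{offset} is assumed to be a plain displacement that is added
+to old positions (whether the real offset is 0-based or 1-based should be
+checked against the Gap4 sources).

```c
#include <assert.h>
#include <stddef.h>

#define REG_JOIN_TO (1 << 2)   /* illustrative value only */

typedef struct {
    int job;     /* REG_JOIN_TO */
    int contig;  /* New contig number */
    int offset;  /* Offset of old contig into new contig */
} reg_join;

/* Hypothetical client data: a view of one contig with one marked position. */
typedef struct {
    int contig;
    int mark_pos;
} view_t;

/* Follow the data into the new contig and shift the stored position. */
void view_handle_join(view_t *v, const reg_join *j)
{
    assert(v != NULL && j != NULL);
    assert(j->job == REG_JOIN_TO);

    v->contig    = j->contig;   /* the old contig now lives here */
    v->mark_pos += j->offset;   /* assumed: offset is a displacement */
}
```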
+
+ at node Reg-REG_ORDER
+ at subsection REG_ORDER
+ at cindex REG_ORDER
+ at vindex REG_ORDER
+ at example
+ at group
+reg_order           order;
+
+typedef struct @{
+    int    job;        /* REG_ORDER */
+    int    pos;        /* New order */
+@} reg_order;
+ at end group
+ at end example
+
+The purpose is to inform when the contig order changes. @var{pos} is the new
+position of this contig. To be consistent, there will be further
+@code{REG_ORDER} requests indicating the new position of the contig that was
+previously at this position. Typically this is simply handled by sending a
+notification for each contig. To handle these efficiently it is probably best
+to use the @code{REG_BUFFER_START} and @code{REG_BUFFER_END} notifications.
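+
+One way to use the buffering notifications is to postpone redrawing until the
+batch of @code{REG_ORDER} requests is complete. This sketch assumes that
+@code{REG_BUFFER_START} and @code{REG_BUFFER_END} bracket such a batch; the
+job values, @code{display_t} and @code{display_notify} are all hypothetical,
+and the callback is reduced to taking the job number alone.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative job codes only. */
#define REG_ORDER        (1 << 3)
#define REG_BUFFER_START (1 << 19)
#define REG_BUFFER_END   (1 << 20)

/* Hypothetical display state. */
typedef struct {
    int buffering;  /* depth of nested BUFFER_START requests */
    int dirty;      /* a redraw is pending */
    int redraws;    /* number of redraws performed so far */
} display_t;

static void display_redraw(display_t *d)
{
    d->redraws++;
    d->dirty = 0;
}

void display_notify(display_t *d, int job)
{
    assert(d != NULL);

    switch (job) {
    case REG_BUFFER_START:
        d->buffering++;
        break;

    case REG_BUFFER_END:
        if (d->buffering > 0)
            d->buffering--;
        /* Flush once, however many changes were queued. */
        if (d->buffering == 0 && d->dirty)
            display_redraw(d);
        break;

    case REG_ORDER:
        if (d->buffering > 0)
            d->dirty = 1;        /* remember, but do not draw yet */
        else
            display_redraw(d);   /* unbuffered: draw immediately */
        break;
    }
}
```

+Three @code{REG_ORDER} requests sent inside one start/end pair then cost a
+single redraw rather than three.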
+
+ at split{}
+ at node Reg-REG_LENGTH
+ at subsection REG_LENGTH
+ at cindex REG_LENGTH
+ at vindex REG_LENGTH
+ at example
+ at group
+reg_length          length;
+
+typedef struct @{
+    int    job;        /* REG_LENGTH, implies data change too */
+    int    length;     /* New length */
+@} reg_length;
+ at end group
+ at end example
+
+Sent whenever the length of a contig, or the data within it, changes. In this respect
+ at code{REG_LENGTH} is a bit of a misnomer; replacing a single base within the
+contig editor and then saving (which does not change the length of that
+contig) will still send a @code{REG_LENGTH} request to inform data that the
+contig has changed. This is one of the most frequently sent and acknowledged
+requests.
+
+ at node Reg-REG_QUERY_NAME
+ at subsection REG_QUERY_NAME
+ at cindex REG_QUERY_NAME
+ at vindex REG_QUERY_NAME
+ at example
+ at group
+reg_query_name      name;
+
+typedef struct @{
+    int    job;        /* REG_QUERY_NAME */
+    char  *line;     /* char[80] */
+@} reg_query_name;
+ at end group
+ at end example
+
+Sent by the @code{result_names} routine to obtain a brief one line (less than
+80 characters) name of this registered item. Callback procedures should write
+into the @var{line} field themselves with no need for memory allocation.  The
+name returned here will be used as a component of the line within the Results
+Manager window. Registered data is required to handle this request, unless it
+is invisible (has the @code{REG_FLAG_INVIS} bit set).
+
+ at node Reg-REG_DELETE
+ at subsection REG_DELETE
+ at cindex REG_DELETE
+ at vindex REG_DELETE
+ at example
+ at group
+reg_delete          delete;
+
+typedef struct @{
+    int    job;        /* REG_DELETE */
+@} reg_delete;
+ at end group
+ at end example
+
+The registered data should be removed and any associated displays should be
+shut down. This is in response to a contig being deleted (by the
+ at code{io_delete_contig} function), or a programmed shutdown to force
+associated displays to quit (such as when forcing the quality display to quit
+when the user quits the template display). Registered data is required to
+handle this request.
+
+ at node Reg-REG_GET_LOCK and REG_SET_LOCK
+ at subsection REG_GET_LOCK and REG_SET_LOCK
+@cindex REG_GET_LOCK
+@cindex REG_SET_LOCK
+@vindex REG_GET_LOCK
+@vindex REG_SET_LOCK
+ at example
+ at group
+#define REG_LOCK_READ   1
+#define REG_LOCK_WRITE  2
+
+reg_get_lock        glock;
+reg_set_lock        slock;
+
+typedef struct @{
+    int    job;        /* REG_GET_LOCK or REG_SET_LOCK */
+    int    lock;       /* Sends lock requirements, returns locks allowed */
+@} reg_get_lock, reg_set_lock;
+ at end group
+ at end example
+
+Both these notifications share the same structure. The pair are used in
+conjunction to determine whether exclusive write access is allowed on this
+contig, and if so to set this access. This is all managed by the
+ at code{contig_lock_write} function. _oxref(Reg-Locking, Locking
+Mechanisms). Functions wishing to modify data, such as complement, should
+use locking.
+
+ at split{}
+ at node Reg-REG_COMPLEMENT
+ at subsection REG_COMPLEMENT
+ at cindex REG_COMPLEMENT
+ at vindex REG_COMPLEMENT
+ at example
+ at group
+reg_complement      complement;
+
+typedef struct @{
+    int    job;        /* REG_COMPLEMENT */
+@} reg_complement;
+ at end group
+ at end example
+
+Notifies that the contig has just been complemented. It may be simplest to
+handle this in the same way as other data change notifications. However, for
+slow functions it may be worth handling complement notifications separately,
+as complementing existing result data can be quicker than recalculating it.
+
+ at node Reg-REG_PARAMS
+ at subsection REG_PARAMS
+ at cindex REG_PARAMS
+ at vindex REG_PARAMS
+ at example
+ at group
+reg_params          params;
+
+typedef struct @{
+    int    job;        /* REG_PARAMS */
+    char  *string;     /* Pointer to params string */
+@} reg_params;
+ at end group
+ at end example
+
+Sent as a request for the parameters used to generate this data. Note that,
+in contrast to @code{REG_NAME}, the @var{string} field here is not already
+allocated. The function acknowledging this request should point
+ at var{string} to a static buffer of its own. Currently, although implemented,
+this request is not used.
+
+ at node Reg-REG_QUIT
+ at subsection REG_QUIT
+ at cindex REG_QUIT
+ at vindex REG_QUIT
+ at example
+ at group
+reg_quit            quit;
+
+typedef struct @{
+    int    job;        /* REG_QUIT */
+    int    lock;       /* Sends lock requirements, returns locks allowed */
+@} reg_quit;
+ at end group
+ at end example
+
+Sent to request a shutdown for this display. This differs from
+ at code{REG_DELETE}, where the data is told that it must shut down because the
+contig has already been deleted. If a display cannot shut down (for example it
+is a contig editor that has unsaved data) the lock should be cleared and the
+calling function should check this to determine whether the shutdown
+succeeded. This is handled internally by the @code{tcl_quit_displays}
+function.
+
+ at node Reg-REG_CURSOR_NOTIFY
+ at subsection REG_CURSOR_NOTIFY
+ at cindex REG_CURSOR_NOTIFY
+ at vindex REG_CURSOR_NOTIFY
+ at example
+ at group
+reg_cursor_notify   cursor_notify;
+
+typedef struct @{
+    int    job;           /* REG_CURSOR_NOTIFY */
+    int    editor_id;     /* Which contig editor */
+    int    seq;           /* Gel reading number (0 == consensus) */
+    int    pos;           /* Position in gel reading */
+@} reg_cursor_notify;
+ at end group
+ at end example
+
+Sent by the contig editor at startup and whenever the editing cursor moves.
+The @var{editor_id} is a number unique to each contig editor, so it is
+possible to distinguish different editors. @var{seq} is either 0 for the
+consensus, or a gel reading number. @var{pos} is the offset within that gel
+reading, rather than the total offset into the consensus (unless @var{seq} is
+0).
+
+ at split{}
+ at node Reg-REG_GET_OPS
+ at subsection REG_GET_OPS
+ at cindex REG_GET_OPS
+ at vindex REG_GET_OPS
+ at example
+ at group
+reg_get_ops         get_ops;
+
+typedef struct @{
+    int    job;      /* REG_GET_OPS */
+    char  *ops;      /* Somewhere to place ops in, unalloced to start with */
+@} reg_get_ops;
+ at end group
+ at end example
+
+Within the Results Manager a popup menu is available for choosing from a list
+of tasks to be performed on this data. These can include anything, but
+typically include deleting the data and listing textual information.  The
+ at var{ops} field will initially point to @code{NULL} when the callback
+function is called. The callback function should then assign @var{ops} to a
+static string listing the items to appear on the popup menu, each terminated
+by a NUL character (@code{\0}), so that the list ends in a double NUL. If an
+item in this string is "@code{SEPARATOR}", a
+separator line on the menu will appear. If an item is "@code{PLACEHOLDER}",
+then nothing for this item will appear in the menu, but the numbering used for
+ at code{REG_INVOKE_OP} will count "@code{PLACEHOLDER}" as an option. An example
+of the acknowledging code follows:
+
+ at example
+case REG_GET_OPS:
+    if (r->all_hidden)
+        jdata->get_ops.ops = "Information\0PLACEHOLDER\0"
+            "Hide all\0Reveal all\0SEPARATOR\0Remove\0";
+    else
+        jdata->get_ops.ops = "Information\0Configure\0"
+            "Hide all\0Reveal all\0SEPARATOR\0Remove\0";
+    break;
+ at end example
+
+Here we have a menu containing "Information", "Configure", "Hide all",
+"Reveal all" and "Remove". In this example, if @code{r->all_hidden} is set
+then the "Configure" option does not appear, but the later options (e.g.
+"Remove") are always given the same number (4 in this case).
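+
+As a rough sketch of how a caller might map an @var{op} number back to an
+item in the @var{ops} string, the following self-contained C function walks
+the list. The name @code{ops_label} is hypothetical and is not part of the
+gap4 sources. The rule that @code{SEPARATOR} takes no number is inferred from
+the example above, not from the library code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the menu item that op number `op` refers to, or NULL if `op` is
 * out of range. Assumption: SEPARATOR entries are skipped when numbering,
 * while PLACEHOLDER entries are counted like any other item.
 */
const char *ops_label(const char *ops, int op)
{
    int n = 0;

    /* Each item is NUL-terminated; an empty item marks the end of the list. */
    for (; *ops != '\0'; ops += strlen(ops) + 1) {
        if (strcmp(ops, "SEPARATOR") == 0)
            continue;           /* drawn as a line, takes no number */
        if (n++ == op)
            return ops;
    }
    return NULL;
}
```

+With either string from the example above, @code{ops_label(ops, 4)} yields
+"Remove", because "PLACEHOLDER" occupies the number that "Configure" would
+otherwise take.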
+
+ at node Reg-REG_INVOKE_OP
+ at subsection REG_INVOKE_OP
+ at cindex REG_INVOKE_OP
+ at vindex REG_INVOKE_OP
+ at example
+ at group
+reg_invoke_op       invoke_op;
+
+typedef struct @{
+    int    job;        /* REG_INVOKE_OP */
+    int    op;         /* Operation to perform */
+@} reg_invoke_op;
+ at end group
+ at end example
+
+When the user has chosen an option from the Results Manager popup window (from
+the list returned by @code{REG_GET_OPS}), @code{REG_INVOKE_OP} is called with
+an integer value (held in the @var{op} field) detailing which operation was
+chosen. @var{op} starts counting from zero for the first item returned from
+ at code{REG_GET_OPS}, and increases by one for each operation or
+ at code{PLACEHOLDER} listed; @code{SEPARATOR} entries are not counted, as the
+numbering in the example shows. An example of an acknowledgement for
+ at code{REG_INVOKE_OP} to complement the example given in @code{REG_GET_OPS}
+follows:
+
+ at example
+case REG_INVOKE_OP:
+    switch (jdata->invoke_op.op) @{
+    case 0: /* Information */
+        csmatch_info((mobj_repeat *)r, "Find Repeats");
+        break;
+    case 1: /* Configure */
+        csmatch_configure(io, cs->window, (mobj_repeat *)r);
+        break;
+    case 2: /* Hide all */
+        csmatch_hide(our_interp, cs->window, (mobj_repeat *)r, csplot_hash);
+        break;
+    case 3: /* Reveal all */
+        csmatch_reveal(our_interp, cs->window, (mobj_repeat *)r, csplot_hash);
+        break;
+    case 4: /* Remove */
+        csmatch_remove(io, cs->window, (mobj_repeat *)r, csplot_hash);
+        break;
+    @}
+    break;
+ at end example
+
+ at split{}
+ at node Reg-REG_ANNO
+ at subsection REG_ANNO
+ at cindex REG_ANNO
+ at vindex REG_ANNO
+ at example
+ at group
+reg_anno            annotations;
+
+typedef struct @{
+    int    job;        /* REG_ANNO */
+@} reg_anno;
+ at end group
+ at end example
+
+Sent when only the annotations (tags) for a contig have been updated. It is
+sometimes simplest for clients to handle @code{REG_ANNO} in the same manner as
+ at code{REG_LENGTH}. However, in some cases it can be much more efficient to
+handle it separately, as it may be cheaper to redisplay the annotations than
+to redisplay everything.
+
+ at node Reg-REG_REGISTER and REG_DEREGISTER
+ at subsection REG_REGISTER and REG_DEREGISTER
+ at cindex REG_REGISTER
+ at vindex REG_REGISTER
+ at cindex REG_DEREGISTER
+ at vindex REG_DEREGISTER
+ at example
+ at group
+reg_register        c_register;
+reg_deregister      c_deregister;
+
+typedef struct @{
+    int    job;        /* REG_REGISTER, REG_DEREGISTER */
+    int    id;         /* Registration id */
+    int    type;       /* Registration type */
+    int    contig;     /* Contig number */
+@} reg_register, reg_deregister;
+ at end group
+ at end example
+
+Both of these notifications share the same structure. They are sent whenever a
+registration or deregistration of another piece of data is performed for this
+contig. An example of the use of this is within the stop codon display which
+enables use of the "Refresh" button when a contig editor is running. The
+ at var{id}, @var{type} and @var{contig} fields here are the same as the fields
+with the same name from the @var{contig_reg_t} structure.
+
+ at node Reg-REG_HIGHLIGHT_READ
+ at subsection REG_HIGHLIGHT_READ
+ at cindex REG_HIGHLIGHT_READ
+ at vindex REG_HIGHLIGHT_READ
+ at example
+ at group
+reg_highlight_read  highlight;
+
+typedef struct @{
+    int    job;       /* REG_HIGHLIGHT_READ */
+    int    seq;       /* Gel reading number (-ve == contig consensus) */
+    int    val;       /* 1==highlight, 0==dehighlight */
+@} reg_highlight_read;
+ at end group
+ at end example
+
+This is used for notifying that an individual reading has been highlighted.
+Its purpose is to allow displays to synchronise highlighting of data. For
+instance, both the contig editor and template display send and acknowledge
+this notification. Thus when a name in the editor is highlighted the template
+display will highlight the appropriate reading, and vice versa.
+
+When @var{seq} is positive it is the number of the reading to highlight;
+otherwise it is the negated contig number (not the leftmost reading number).
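+
+The sign convention for @var{seq} can be captured in a few helper functions.
+This is only a sketch: the @code{demo_} names below are hypothetical and do
+not appear in the gap4 sources.

```c
/* Hypothetical helpers: seq > 0 names a reading, otherwise -seq is a contig. */
int demo_seq_from_reading(int reading) { return reading; }
int demo_seq_from_contig(int contig)   { return -contig; }
int demo_seq_is_reading(int seq)       { return seq > 0; }
int demo_seq_to_contig(int seq)        { return -seq; }
```

+A sender highlighting the consensus of contig 3 would therefore place -3 in
+the @var{seq} field.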
+
+ at node Reg-REG_BUFFER_START and REG_BUFFER_END
+ at subsection REG_BUFFER_START and REG_BUFFER_END
+ at cindex REG_BUFFER_START
+ at vindex REG_BUFFER_START
+ at cindex REG_BUFFER_END
+ at vindex REG_BUFFER_END
+ at example
+ at group
+reg_buffer_start    buffer_start;
+reg_buffer_end      buffer_end;
+
+typedef struct @{
+    int    job;
+@} reg_buffer_start, reg_buffer_end;
+ at end group
+ at end example
+
+These two notifications share the same structure, which holds no information.
+The purpose of @code{REG_BUFFER_START} is simply as a signal that many
+notifications will be arriving in quick succession, until a
+ at code{REG_BUFFER_END} request arrives. The purpose is to speed up redisplay of
+functions registered with many contigs.
+
+As an example consider the enter tags function. This adds tags to many,
+potentially all, contigs. We can keep track of which contigs we need to send
+ at code{REG_ANNO} requests to, and send them with code similar to the following:
+
+ at example
+/* Notify of the start of the flurry of updates */
+rs.job = REG_BUFFER_START;
+for (i = 0; i < NumContigs(args.io); i++) @{
+    if (contigs[i]&1) @{
+        contig_notify(args.io, i+1, (reg_data *)&rs);
+    @}
+@}
+
+/* Now notify all the contigs that we've added tags to */
+ra.job = REG_ANNO;
+for (i = 0; i < NumContigs(args.io); i++) @{
+    if (contigs[i]&1) @{
+        contig_notify(args.io, i+1, (reg_data *)&ra);
+    @}
+@}
+
+/* Notify of the end of the flurry of updates */
+re.job = REG_BUFFER_END;
+for (i = 0; i < NumContigs(args.io); i++) @{
+    if (contigs[i]&1) @{
+        contig_notify(args.io, i+1, (reg_data *)&re);
+    @}
+@}
+ at end example
+
+Consider the action of the contig selector. This needs to refresh the display
+whenever any modifications are made, including annotations. The enter tags
+function needs to send notifications to many contigs, thus the contig selector
+will receive many requests. It is obviously more efficient for the contig
+selector to redisplay only once. The addition of @code{REG_BUFFER_START} and
+ at code{REG_BUFFER_END} solves this. As we don't know exactly which functions
+will be registered with which contigs, the enter tags code has to notify every
+contig. Hence the contig selector code must keep a count of the buffer starts
+and ends so that it only needs to redisplay on the last buffer end. This code
+is as follows (tidied up and much shortened):
+
+ at example
+switch(jdata->job) @{
+case REG_BUFFER_START:
+    @{
+        cs->buffer_count++;
+        cs->do_update = REG_BUFFER_START;
+        return;
+    @}
+
+case REG_BUFFER_END:
+    @{
+        cs->buffer_count--;
+        if (cs->buffer_count <= 0) @{
+            cs->buffer_count = 0;
+            if (cs->do_update & REG_LENGTH) @{
+                [ Redisplay Contigs ]
+            @} else if (cs->do_update & REG_ANNO) @{
+                [ Redisplay Tags ]
+            @} else if (cs->do_update & REG_ORDER) @{
+                [ Shuffle Order]
+            @}
+            cs->do_update = 0;
+        @}
+        return;
+    @}
+
+case REG_ANNO:
+    @{
+        if (!cs->do_update) @{
+            [ Redisplay Tags ]
+        @} else @{
+            cs->do_update |= REG_ANNO;
+        @}
+        return;
+    @}
+/* etc */
+@}
+ at end example
+
+For further examples of handling buffering see the template display code.
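+
+The counting scheme can also be tried outside gap4. The sketch below is a
+simplified, self-contained model and not the contig selector code: the
+ at code{demo_} names and job values are stand-ins for those in
+ at file{io-reg.h}, and it keeps the buffer count and the mask of deferred
+updates in separate fields.

```c
/* Stand-in job codes; the real values are defined in io-reg.h. */
enum { DEMO_BUFFER_START = 1, DEMO_BUFFER_END = 2, DEMO_ANNO = 4 };

typedef struct {
    int buffer_count;   /* number of buffer starts not yet matched by an end */
    int pending;        /* bitmask of updates deferred while buffering */
    int redraws;        /* how many redisplays have been performed */
} demo_display;

void demo_callback(demo_display *d, int job)
{
    switch (job) {
    case DEMO_BUFFER_START:
        d->buffer_count++;
        break;

    case DEMO_BUFFER_END:
        if (d->buffer_count > 0)
            d->buffer_count--;
        if (d->buffer_count == 0 && d->pending) {
            d->redraws++;               /* one redisplay for the whole batch */
            d->pending = 0;
        }
        break;

    case DEMO_ANNO:
        if (d->buffer_count > 0)
            d->pending |= DEMO_ANNO;    /* defer until the last buffer end */
        else
            d->redraws++;               /* not buffering: redisplay now */
        break;
    }
}
```

+A display registered with three contigs that receives three starts, three
+annotation updates and three ends redisplays once, on the final end.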
+
+ at split{}
+ at node Reg-Sending
+ at section Sending a Notification
+ at cindex Notification, sending
+ at cindex Sending a notification
+
+When a function modifies data it is the responsibility of this function to
+inform others, via the contig registration scheme, of this change. At the time
+of notification the data on disk and in memory should be consistent (i.e.
+check_database should not fail). To illustrate this, when joining two contigs
+we should not start sending notifications until we've recomputed the lengths
+and left/right neighbours of the joined contig.
+
+To send a request, one of the notification functions should be used. The
+simplest of these is @code{contig_notify}. This function takes a @var{GapIO}
+pointer, a contig number, and a @var{reg_data} pointer as arguments. The
+ at var{reg_data} is the union of notification types outlined in the above
+sections. The separate steps for notifying are:
+
+ at enumerate
+ at item
+Create a variable of the appropriate structure type (e.g. @code{reg_length}).
+ at item
+Fill the job field of this structure with the correct definition (e.g.
+ at code{REG_LENGTH}).
+ at item
+Fill in any structure-dependent fields (e.g. @var{length} in the case of
+ at code{reg_length}).
+ at item
+Call @code{contig_notify} with the @var{GapIO}, contig number and notification
+structure.  The notification structure should be cast to a pointer to the
+ at var{reg_data} union type.
+ at end enumerate
+
+An example illustrating the above steps would be:
+
+ at example
+ at group
+reg_length jl;
+
+[...]
+
+jl.job = REG_LENGTH;
+jl.length = some_length;
+contig_notify(io, contig_number, (reg_data *)&jl);
+ at end group
+ at end example
+
+The available notification functions are @code{contig_notify},
+ at code{result_notify}, @code{type_notify} and @code{type_contig_notify}.
+_oxref(Reg-Functions, C Functions Available).
+
+ at split{}
+ at node Reg-Tasks
+ at section Specific Notification Tasks 
+ at cindex Tasks, notification
+ at cindex Notification tasks
+
+Some registered items may support forms of communication beyond the listed
+notifications. In this case, we use the @code{REG_GENERIC} notification
+together with a task number and some task-specific data to send a specific
+task to a specific registered item. This provides a way for individual
+displays to add new communication methods to the registration scheme.
+
+To send a @code{REG_GENERIC} task, the @var{reg_generic} structure must first
+be completed by setting @var{job}, @var{task} and @var{data}. @var{data}
+points to another structure, which is specific to the type of task. The
+task data structure must then be initialised and sent to the appropriate
+client contig, id or type.
+
+The @var{task} number needs to be unique across all the types of generic tasks
+likely to be sent to the client. For instance, a contig editor can receive
+ at code{TASK_EDITOR_SETCURSOR} and @code{TASK_EDITOR_GETCON} tasks. Obviously
+the @code{#define}s for these tasks need to be different. However they may
+safely coincide with @code{TASK_TEMPLATE_REDRAW}, which is used by the
+template display, as we know that the editor will never receive this task
+(and vice versa). The assignment of task numbers is at present something which
+requires further investigation. However the use of defines everywhere means
+that they are trivial to change.
+
+ at subsection TASK_EDITOR_GETCON
+ at cindex TASK_EDITOR_GETCON
+ at vindex TASK_EDITOR_GETCON
+ at example
+ at group
+typedef struct @{
+    char  *con;         /* Allocated by the contig editor */
+    int    lreg;        /* Set lreg and rreg to 0 for all consensus */
+    int    rreg;
+    int    con_cut;
+    int    qual_cut;
+@} task_editor_getcon;
+ at end group
+ at end example
+Allocates and calculates a consensus (stored in @var{con}) between @var{lreg}
+and @var{rreg}. If @var{lreg} and @var{rreg} are both zero, then the entire
+consensus is computed. The calling function is expected to free @var{con} when
+finished. An example of use can be seen in the stop codon code:
+
+ at example
+reg_generic gen;
+task_editor_getcon tc;
+
+gen.job = REG_GENERIC;
+gen.task = TASK_EDITOR_GETCON;
+gen.data = (void *)&tc;
+
+tc.lreg = 0;
+tc.rreg = 0;
+tc.con_cut = consensus_cutoff;
+tc.qual_cut = quality_cutoff;
+
+if (type_contig_notify(args.io, args.contig, REG_TYPE_EDITOR,
+                       (reg_data *)&gen, 0) == -1)
+    return TCL_OK;
+
+[...]
+
+xfree(tc.con);
+ at end example
+    
+ at subsection TASK_CANVAS_SCROLLX
+ at cindex TASK_CANVAS_SCROLLX
+ at vindex TASK_CANVAS_SCROLLX
+
+ at subsection TASK_CANVAS_SCROLLY
+ at cindex TASK_CANVAS_SCROLLY
+ at vindex TASK_CANVAS_SCROLLY
+
+ at subsection TASK_CANVAS_ZOOMBACK
+ at cindex TASK_CANVAS_ZOOMBACK
+ at vindex TASK_CANVAS_ZOOMBACK
+
+ at subsection TASK_CANVAS_ZOOM
+ at cindex TASK_CANVAS_ZOOM
+ at vindex TASK_CANVAS_ZOOM
+
+ at subsection TASK_CANVAS_CURSOR_X
+ at cindex TASK_CANVAS_CURSOR_X
+ at vindex TASK_CANVAS_CURSOR_X
+
+ at subsection TASK_CANVAS_CURSOR_Y
+ at cindex TASK_CANVAS_CURSOR_Y
+ at vindex TASK_CANVAS_CURSOR_Y
+
+ at subsection TASK_CANVAS_CURSOR_DELETE
+ at cindex TASK_CANVAS_CURSOR_DELETE
+ at vindex TASK_CANVAS_CURSOR_DELETE
+
+ at subsection TASK_CANVAS_RESIZE
+ at cindex TASK_CANVAS_RESIZE
+ at vindex TASK_CANVAS_RESIZE
+
+ at subsection TASK_CANVAS_REDRAW
+ at cindex TASK_CANVAS_REDRAW
+ at vindex TASK_CANVAS_REDRAW
+
+ at subsection TASK_CANVAS_WORLD
+ at cindex TASK_CANVAS_WORLD
+ at vindex TASK_CANVAS_WORLD
+
+ at subsection TASK_WINDOW_ADD
+ at cindex TASK_WINDOW_ADD
+ at vindex TASK_WINDOW_ADD
+
+ at subsection TASK_WINDOW_DELETE
+ at cindex TASK_WINDOW_DELETE
+ at vindex TASK_WINDOW_DELETE
+
+ at subsection TASK_CS_REDRAW
+ at cindex TASK_CS_REDRAW
+ at vindex TASK_CS_REDRAW
+
+ at subsection TASK_RENZ_INFO
+ at cindex TASK_RENZ_INFO
+ at vindex TASK_RENZ_INFO
+
+ at subsection TASK_TEMPLATE_REDRAW
+ at cindex TASK_TEMPLATE_REDRAW
+ at vindex TASK_TEMPLATE_REDRAW
+
+ at subsection TASK_DISPLAY_RULER
+ at cindex TASK_DISPLAY_RULER
+ at vindex TASK_DISPLAY_RULER
+
+ at subsection TASK_DISPLAY_TICKS
+ at cindex TASK_DISPLAY_TICKS
+ at vindex TASK_DISPLAY_TICKS
+
+ at split{}
+ at node Reg-Functions
+ at section C Functions Available
+
+ at menu
+* Reg-contig_register_init::        contig_register_init
+* Reg-register_id::                 register_id
+* Reg-contig_register::             contig_register
+* Reg-contig_deregister::           contig_deregister
+* Reg-contig_notify::               contig_notify
+* Reg-contig_register_join::        contig_register_join
+* Reg-result_to_regs::              result_to_regs
+* Reg-result_names::                result_names
+* Reg-result_time::                 result_time
+* Reg-result_notify::               result_notify
+* Reg-result_data::                 result_data
+* Reg-type_to_result::              type_to_result
+* Reg-type_notify::                 type_notify
+* Reg-type_contig_notify::          type_contig_notify
+ at end menu
+
+The prototypes for all of these functions can be found in @file{io-reg.h}. The
+code for these functions is held in @file{io-reg.c}.
+
+ at split{}
+ at node Reg-contig_register_init
+ at subsection contig_register_init
+ at findex contig_register_init(C)
+ at example
+ at group
+#include <io-reg.h>
+
+int contig_register_init(GapIO  *io);
+ at end group
+ at end example
+
+Initialises the contig register lists. This is only performed once,
+upon opening of a new database. The registration lists are
+automatically extended when new contigs are created.
+
+The function returns 0 for success, -1 for error.
+
+ at node Reg-register_id
+ at subsection register_id
+ at findex register_id(C)
+ at example
+ at group
+int register_id();
+
+Returns: the id (always a non zero value).
+ at end group
+ at end example
+Returns a new id number for use as the id field to be sent to a
+ at code{contig_register} call. Each time this function is called a new number
+is returned.
+
+ at node Reg-contig_register
+ at subsection contig_register
+ at findex contig_register(C)
+ at example
+ at group
+int contig_register(GapIO *io, int contig,
+                    void (*func)(GapIO *io, int contig, void *fdata,
+                                 reg_data *jdata),
+                    void *fdata,
+                    int id, int flags, int type);
+Returns:  0 for success
+         -1 for error.
+ at end group
+ at end example
+Registers "func(io, contig, fdata, jdata)" with the specified contig.
+This doesn't check whether the (func,fdata) pair already exists for
+this contig.
+
+ at node Reg-contig_deregister
+ at subsection contig_deregister
+ at findex contig_deregister(C)
+ at example
+ at group
+int contig_deregister(GapIO *io, int contig,
+                      void (*func)(GapIO *io, int contig, void *fdata,
+                                   reg_data *jdata),
+                      void *fdata);
+
+Returns:  0 for success
+         -1 for error.
+ at end group
+ at end example
+Deregisters "func(io, contig, fdata, jdata)" from the specified
+contig. The (func,fdata) pair must match exactly to deregister.
+
+ at node Reg-contig_notify
+ at subsection contig_notify
+ at findex contig_notify(C)
+ at example
+ at group
+void contig_notify(GapIO *io, int contig, reg_data *jdata);
+ at end group
+ at end example
+Sends a notification request to all items registered with the
+specified contig.
+
+ at node Reg-contig_register_join
+ at subsection contig_register_join
+ at findex contig_register_join(C)
+ at example
+ at group
+int contig_register_join(GapIO *io, int cfrom, int cto);
+
+Returns:  0 for success
+         -1 for error.
+ at end group
+ at end example
+Joins two registration lists. This adds all items listed on the
+registration list for contig 'cfrom' to the registration list for
+contig 'cto'. Entries that are registered on both lists are not
+duplicated. The 'cfrom' registration list is left intact.
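+
+The merge described above can be sketched in isolation as follows. This is
+not the gap4 implementation: @code{demo_entry} and @code{demo_register_join}
+are hypothetical names, and treating two entries as the same registration
+when their (func,fdata) pairs match is an assumption borrowed from the
+matching rule of @code{contig_deregister}.

```c
#include <stddef.h>

typedef struct {
    void (*func)(void);  /* stand-in for the registered callback */
    void *fdata;         /* client data passed to the callback */
} demo_entry;

/*
 * Append to `to` every entry of `from` that it does not already hold.
 * Returns the new length of `to`, or (size_t)-1 if `to` (capacity `cap`)
 * is too small. `from` is left intact.
 */
size_t demo_register_join(const demo_entry *from, size_t nfrom,
                          demo_entry *to, size_t nto, size_t cap)
{
    size_t i, j;

    for (i = 0; i < nfrom; i++) {
        for (j = 0; j < nto; j++) {
            if (to[j].func == from[i].func && to[j].fdata == from[i].fdata)
                break;          /* already registered with 'to' */
        }
        if (j < nto)
            continue;           /* skip the duplicate */
        if (nto == cap)
            return (size_t)-1;
        to[nto++] = from[i];
    }
    return nto;
}
```

+Leaving the source list untouched matters when joining contigs, since items
+on that list still need to receive the later delete request.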
+
+ at split{}
+ at node Reg-result_to_regs
+ at subsection result_to_regs
+ at findex result_to_regs(C)
+ at example
+ at group
+contig_reg_t **result_to_regs(GapIO *io, int id);
+
+Returns:  An allocated list of contig_reg_t pointers upon success.
+          NULL for failure.
+ at end group
+ at end example
+Converts an id number to an array of @var{contig_reg_t} pointers. The
+ at var{contig_reg_t} structures pointed to are considered the property of the
+registration scheme and should not be modified. The caller is expected
+to deallocate the returned list by calling the @code{xfree} function.
+
+ at node Reg-result_names
+ at subsection result_names
+ at findex result_names(C)
+ at example
+ at group
+char *result_names(GapIO *io, int *contig, int *reg, int *id, int first);
+
+Returns: The next name upon success.
+         NULL for failure.
+ at end group
+ at end example
+Generates a description of functions registered with a particular contig.
+If contig 0 is specified then all are listed.
+'contig' is modified to return the contig number this result was from
+(useful when sending contig 0), as is 'reg' to return the index into
+the registration array for this contig. This (contig,reg) pair
+specifies a particular result without the need for remembering
+pointers. 'id' contains a unique id number for this result.
+
+ at node Reg-result_time
+ at subsection result_time
+ at findex result_time(C)
+ at example
+ at group
+char *result_time(GapIO *io, int contig, int id);
+
+Returns: The time for success.
+         "unknown" for failure.
+ at end group
+ at end example
+Given a specific contig and id number, returns a string describing the
+time a specific id was registered. This assumes that all registered
+items with this id were registered at the same time. The string is
+statically allocated and should not be freed.
+
+ at node Reg-result_notify
+ at subsection result_notify
+ at findex result_notify(C)
+ at example
+ at group
+void result_notify(GapIO *io, int id, reg_data *jdata, int all);
+ at end group
+ at end example
+Sends a notification request to registered data with the specified id.
+If 'all' is non zero then all registered data with this id will be
+notified, otherwise only the first instance of this id found will be
+notified.
+
+ at node Reg-result_data
+ at subsection result_data
+ at findex result_data(C)
+ at example
+ at group
+void *result_data(GapIO *io, int id, int contig);
+
+Returns:  contig_reg_t->data for id upon success
+          NULL upon failure.
+ at end group
+ at end example
+Returns the data component of a @var{contig_reg_t} structure for a specific
+id. If id represents more than one piece of data, the first found
+(the search order is undefined) is returned. If the contig is
+specified then id will be searched for only within this contig's
+registration list, otherwise (when contig is zero) all contigs are
+scanned.
+
+ at split{}
+ at node Reg-type_to_result
+ at subsection type_to_result
+ at findex type_to_result(C)
+ at example
+ at group
+int type_to_result(GapIO *io, int type, int contig);
+
+Returns:  id value for success.
+          0 for failure.
+ at end group
+ at end example
+Returns the first id value found for a given type. If contig is specified
+as a non zero value we search for this type only within this contig.
+Otherwise all contigs are scanned.
+
+ at node Reg-type_notify
+ at subsection type_notify
+ at findex type_notify(C)
+ at example
+ at group
+int type_notify(GapIO *io, int type, reg_data *jdata, int all);
+        
+Returns:  0 for success
+         -1 when none of this type were found.
+ at end group
+ at end example
+Sends a notification request to registered data with the specified
+type. If 'all' is non zero then all registered data with this type
+will be notified, otherwise only the first instance of this type found
+will be notified.
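+
+The effect of the 'all' argument and the return value can be modelled with a
+small stand-alone function. The names @code{demo_reg} and
+ at code{demo_type_notify} are hypothetical, and a flat array stands in for
+the registration lists, so this is an illustration of the documented
+behaviour only.

```c
#include <stddef.h>

typedef struct {
    int type;       /* registration type */
    int notified;   /* number of notifications this item has received */
} demo_reg;

/* Returns 0 if at least one item was notified, -1 if no item matched. */
int demo_type_notify(demo_reg *regs, size_t nregs, int type, int all)
{
    int status = -1;
    size_t i;

    for (i = 0; i < nregs; i++) {
        if (regs[i].type != type)
            continue;
        regs[i].notified++;     /* stands in for calling the callback */
        status = 0;
        if (!all)
            break;              /* only the first match is wanted */
    }
    return status;
}
```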
+
+ at node Reg-type_contig_notify
+ at subsection type_contig_notify
+ at findex type_contig_notify(C)
+ at example
+ at group
+int type_contig_notify(GapIO *io, int contig, int type,
+                       reg_data *jdata, int all);
+
+Returns:  0 for success
+         -1 when none of this type were found.
+ at end group
+ at end example
+Sends a notification request to registered data of a given type only
+within the specified contig. If 'all' is non zero then all registered
+data with this type in this contig will be notified, otherwise only
+the first instance of this type found will be notified. 
+
+ at split{}
+ at node Reg-Locking
+ at section Locking Mechanisms
+ at cindex Locking
+
+When preparing to update data it is essential that a function checks whether
+other displays are currently accessing this data, and if so whether these
+displays are allowing the data to be modified.
+
+This is implemented with use of the REG_GET_LOCK and REG_SET_LOCK
+notifications. These notifications both include a lock field within their
+structures. This is initially set to the mode of access desired (currently
+REG_LOCK_WRITE is the only one we support). The @code{contig_notify} call is
+then used to send this notification to all appropriate data callbacks. If a
+callback wishes to block the request to write it should clear this lock flag.
+
+The calling code then checks the returned status of the lock flag. If the
+REG_LOCK_WRITE bit is still set then it knows locking is allowed. In this case
+notification of the acceptance of this lock is sent around using the
+REG_SET_LOCK request. An example of the communication follows. To send the
+lock request we do:
+
+ at example
+    reg_get_lock lg;
+
+    lg.job = REG_GET_LOCK;
+    lg.lock = REG_LOCK_WRITE;
+
+    contig_notify(io, contig, (reg_data *)&lg);
+ at end example
+
+The default action of ignoring the REG_GET_LOCK request will allow the write
+operation to take place. The contig editor does not support updates of the
+contig that it is editing other than those made by itself, so it needs to
+block such locks. The callback procedure of the contig editor contains:
+
+ at example
+    case REG_GET_LOCK:
+        /*
+         * We need exclusive access, so clear any write lock
+         */
+        if (jdata->glock.lock & REG_LOCK_WRITE)
+            jdata->glock.lock &= ~REG_LOCK_WRITE;
+
+        break;
+ at end example
+
+The calling code should now check the status of the lock and send a
+REG_SET_LOCK request if the lock was not blocked:
+
+ at example
+    if (lg.lock & REG_LOCK_WRITE) @{
+        reg_set_lock ls;
+
+        ls.job = REG_SET_LOCK;
+        ls.lock = REG_LOCK_WRITE;
+
+        contig_notify(io, contig, (reg_data *)&ls);
+
+        [ ... ]
+    @}
+ at end example
+
+To simplify this procedure, the @code{contig_lock_write} function performs
+the above lock request and acknowledge protocol.
+
+ at example
+int contig_lock_write(GapIO *io, int contig);
+
+Returns:  0 for success (write granted)
+         -1 for failure (write blocked)
+
+ at end example
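+
+The request and acknowledge protocol can be modelled without the rest of
+gap4. In this sketch the @code{demo_} names are hypothetical, an array of
+callbacks stands in for the registration list of one contig, and the
+function is not the real @code{contig_lock_write}.

```c
#include <stddef.h>

#define DEMO_LOCK_WRITE 2

/* A callback inspects the requested lock and may clear bits to veto it. */
typedef void (*demo_lock_cb)(int *lock);

/* A client that tolerates writes: the default action is to do nothing. */
void demo_viewer(int *lock) { (void)lock; }

/* A client that needs exclusive access, like the contig editor. */
void demo_editor(int *lock) { *lock &= ~DEMO_LOCK_WRITE; }

/* Returns 0 if every client allowed the write, -1 if any client blocked it. */
int demo_lock_write(const demo_lock_cb *clients, size_t nclients)
{
    int lock = DEMO_LOCK_WRITE;
    size_t i;

    for (i = 0; i < nclients; i++)
        clients[i](&lock);              /* the "get lock" round */

    if (!(lock & DEMO_LOCK_WRITE))
        return -1;                      /* somebody cleared the write bit */

    /* A "set lock" round would follow here to announce the granted lock. */
    return 0;
}
```

+A single vetoing client is enough to block the write, however many other
+clients ignore the request.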
+
+In some cases, where large amounts of data are modified in unpredictable
+fashion, it is easier to simply shut down all displays viewing the database
+before proceeding. This is especially true of functions such as assembly where
+all contigs may be modified. In this case we use the locking mechanism once
+more, except with a REG_QUIT call instead of REG_GET_LOCK. The same procedure
+of checking and clearing (if necessary) the lock flag is used. Once again, an
+example from the contig editor callback illustrates the procedure.
+
+ at example
+    case REG_QUIT:
+        /*
+         * We are being asked to quit. We can only allow this if we
+         * haven't made changes.
+         */
+        if (_editsMade(db)) @{
+            jdata->glock.lock &= ~REG_LOCK_WRITE;
+        @} else @{
+            DBI_callback(db, DBCALL_QUIT, 0, 0, NULL);
+        @}
+
+        break;
+ at end example
+
+The code above checks whether the editor has made any edits. If not the editor
+is shut down, otherwise the REG_LOCK_WRITE flag is cleared.
+
+The @code{tcl_quit_displays} function can be used to perform the REG_QUIT
+locking procedure. Currently this is an interface to Tcl and no C interface,
+other than calling @code{contig_notify} with REG_QUIT, exists.
+_oxref(Reg-Tcl, Tcl Interfaces).
+
+ at split{}
+ at node Reg-Specific Examples
+ at section Examples of Specific Functions
+
+ at menu
+* Reg-Deleting a contig::           Deleting a contig
+* Reg-Joining two contigs::         Joining two contigs
+ at end menu
+
+Here we describe in detail how certain operations interact with the contig
+registration. They are described here because the notifications generated may
+not be immediately obvious.
+
+ at split{}
+ at node Reg-Deleting a contig
+ at subsection Deleting a contig
+ at cindex Deleting a contig
+
+As contig numbers must always be from 1 to N, where N is the number of
+contigs, if we remove a particular contig, we need to ensure we still have
+contigs 1 to N-1. In this case, deleting contig x, where x != N, will mean
+that we have a hole (at x) which can be filled by moving N down to x.
+
+To illustrate, the algorithm is as follows, given N contigs and a request to
+delete contig x:
+
+ at enumerate
+ at item
+Delete contig x. This is a null operation as far as the
+ at code{io_delete_contig} operation goes, as we're already assuming the data on
+this contig has gone elsewhere.
+ at item
+Move contig N to contig x (if x != N). This includes updating the disk
+images as well as the fortran arrays and the contig order, but not the
+registration lists --- yet.
+ at item
+Decrement the number of contigs. (N--)
+ at item
+Notify contig x of the delete using REG_DELETE.
+ at item
+Notify contig N of the renumber to contig x using REG_NUMBER_CHANGE.
+(if appropriate)
+ at item
+Update registration list information.
+ at end enumerate
+
+Hence it is important to remember that after an @code{io_delete_contig} the
+contig numbers may not be the same as before the call.
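+
+The six steps above can be sketched as a small, self-contained C simulation.
+This is an illustration only, not the Gap4 implementation: the @code{toy_db},
+ at code{toy_log}, @code{toy_notify} and @code{toy_delete_contig} names are
+hypothetical, and each contig is assumed to have a single registered client.
+It shows that both notifications are sent while the registration lists still
+hold their old contents, and that the lists are only updated afterwards.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_CONTIGS 8
#define TOY_MAX_EVENTS  8
#define TOY_NAME_LEN    16

/* Toy database (hypothetical): index 0 is unused so contigs run from 1 to N. */
typedef struct {
    int  ncontigs;
    char data[TOY_MAX_CONTIGS + 1][TOY_NAME_LEN];   /* contig contents */
    char client[TOY_MAX_CONTIGS + 1][TOY_NAME_LEN]; /* one registered client per contig */
} toy_db;

/* One recorded notification, so the order can be inspected afterwards. */
typedef struct {
    const char *request;              /* "REG_DELETE" or "REG_NUMBER_CHANGE" */
    int         contig;               /* contig whose registration list was used */
    int         new_number;           /* new contig number, or 0 if unused */
    char        client[TOY_NAME_LEN]; /* client that received the request */
} toy_event;

typedef struct {
    toy_event ev[TOY_MAX_EVENTS];
    int       count;
} toy_log;

/* Record a request sent to whoever is currently registered with 'contig'. */
void toy_notify(const toy_db *db, toy_log *log, const char *request,
                int contig, int new_number)
{
    toy_event *e;

    assert(log->count < TOY_MAX_EVENTS);
    e = &log->ev[log->count++];
    e->request = request;
    e->contig = contig;
    e->new_number = new_number;
    strcpy(e->client, db->client[contig]);
}

void toy_delete_contig(toy_db *db, toy_log *log, int x)
{
    int n = db->ncontigs;

    assert(x >= 1 && x <= n);

    /* 1. Delete contig x: nothing to do, its data is assumed to be gone. */

    /* 2. Move contig N into the hole at x; registration lists untouched. */
    if (x != n)
        strcpy(db->data[x], db->data[n]);
    db->data[n][0] = '\0';

    /* 3. Decrement the number of contigs. */
    db->ncontigs = n - 1;

    /* 4. Tell the clients of x that it has been deleted. */
    toy_notify(db, log, "REG_DELETE", x, 0);

    /* 5. Tell the clients of N that it is now numbered x. */
    if (x != n)
        toy_notify(db, log, "REG_NUMBER_CHANGE", n, x);

    /* 6. Only now update the registration lists. */
    if (x != n)
        strcpy(db->client[x], db->client[n]);
    db->client[n][0] = '\0';
}
```

+In this sketch, deleting contig 2 of 4 sends REG_DELETE to the client of
+contig 2, then REG_NUMBER_CHANGE to the client of contig 4, and leaves three
+contigs with the old contig 4 stored at position 2.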
+
+ at split{}
+ at node Reg-Joining two contigs
+ at subsection Joining two contigs
+ at cindex Joining two contigs
+
+The order of events within the joining is crucial. In the past several bugs
+have arisen due to this order being incorrect. We need to notify both the left
+and right contigs of the change, to join the two registration lists, and to
+delete the contig. Deleting the contig must be the last operation as this may
+renumber one of our contigs.
+
+The order used is as follows, assuming we are joining two contigs together.
+We join 'left' to 'right', giving a new contig 'left'.
+
+ at enumerate
+ at item
+Perform the actual join of the data. This involves updating everything, but
+without sending notifications and without modifying the registration
+lists.
+ at item
+Send a REG_JOIN_TO request to 'right', informing it that the new contig
+number is 'left'. This also includes the offset of 'right' within 'left'.
+ at item
+Merge the registration lists using @code{contig_register_join}. We copy
+'right' to 'left', leaving 'right' unchanged. It is required to leave
+'right' unchanged so that the delete request is acknowledged.
+ at item
+Notify 'left' of a change of length using REG_LENGTH. Note that this
+now also includes notifying items previously registered with 'right'.
+ at item
+Delete contig 'right'. As shown above, this will generate REG_DELETE
+and possibly REG_NUMBER_CHANGE requests.
+ at end enumerate
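+
+The ordering above can be illustrated with a toy C simulation. This is a
+sketch under stated assumptions rather than the Gap4 code: @code{toy_db},
+ at code{toy_notify} and @code{toy_join} are hypothetical names, each registered
+client is represented by a single letter, and notifications are simply
+appended to a log string. The return value is the final number of the joined
+contig, which differs from 'left' when 'left' was contig N and the delete
+renumbers it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_CONTIGS 8
#define TOY_LIST_LEN    32
#define TOY_LOG_LEN     256

/* Toy database (hypothetical): index 0 is unused so contigs run from 1 to N. */
typedef struct {
    int  ncontigs;
    int  length[TOY_MAX_CONTIGS + 1];
    char reg[TOY_MAX_CONTIGS + 1][TOY_LIST_LEN]; /* one letter per client */
    char log[TOY_LOG_LEN];                       /* notifications, in order */
} toy_db;

/* Append "REQUEST(contig)->clients;" using the list as it stands right now. */
void toy_notify(toy_db *db, const char *request, int contig)
{
    size_t used = strlen(db->log);

    snprintf(db->log + used, sizeof db->log - used, "%s(%d)->%s;",
             request, contig, db->reg[contig]);
}

/* Join 'right' onto 'left' at 'offset'; returns the joined contig's number. */
int toy_join(toy_db *db, int left, int right, int offset)
{
    int n = db->ncontigs;
    int end;

    assert(left >= 1 && left <= n && right >= 1 && right <= n);
    assert(left != right);

    /* 1. Join the data only: no notifications, no registration changes. */
    end = offset + db->length[right];
    if (end > db->length[left])
        db->length[left] = end;

    /* 2. Tell the clients of 'right' that they now belong to 'left'. */
    toy_notify(db, "REG_JOIN_TO", right);

    /* 3. Copy the list of 'right' onto 'left', leaving 'right' unchanged. */
    assert(strlen(db->reg[left]) + strlen(db->reg[right]) < TOY_LIST_LEN);
    strcat(db->reg[left], db->reg[right]);

    /* 4. The length change now reaches the former clients of 'right' too. */
    toy_notify(db, "REG_LENGTH", left);

    /* 5. Delete 'right' last: contig N may be renumbered into its slot. */
    if (right != n)
        db->length[right] = db->length[n];
    db->length[n] = 0;
    db->ncontigs = n - 1;
    toy_notify(db, "REG_DELETE", right);
    if (right != n) {
        toy_notify(db, "REG_NUMBER_CHANGE", n);
        strcpy(db->reg[right], db->reg[n]);
    }
    db->reg[n][0] = '\0';

    return left == n ? right : left;
}
```

+Because the list of 'right' is left intact in step 3, the REG_DELETE request
+in step 5 still reaches the clients that were registered with 'right'.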
+
+ at split{}
+ at node Reg-Tcl
+ at section Tcl Interfaces
+
+ at menu
+* Reg-tcl_clear_cp::             clear_cp
+* Reg-tcl_clear_template::       clear_template
+* Reg-tcl_register_id::          register_id
+* Reg-tcl_result_names::         result_names
+* Reg-tcl_result_time::          result_time
+* Reg-tcl_result_delete::        result_delete
+* Reg-tcl_result_quit::          result_quit
+* Reg-tcl_reg_get_ops::          reg_get_ops
+* Reg-tcl_reg_invoke_op::        reg_invoke_op
+* Reg-tcl_reg_notify_highlight:: reg_notify_highlight
+* Reg-tcl_reg_notify_update::    reg_notify_update
+* Reg-tcl_quit_displays::        quit_displays
+ at end menu
+
+Some of the contig registration scheme needs to be visible at the Tcl/Tk
+level. This includes, amongst other things, anything to do with the Results
+Manager window. The complete list of Tcl callable functions can be found in
+tk-io-reg.h. The functions are described below.
+
+ at split{}
+ at node Reg-clear_cp
+ at subsection clear_cp
+ at findex clear_cp(T)
+ at example
+ at group
+clear_cp -io handle -id number
+
+Returns: nothing
+ at end group
+ at end example
+
+This command removes (sends a @code{REG_QUIT} request) all registered items
+that have displays on the contig comparator window. Currently this list is
+hard coded to include the following types: @code{REG_TYPE_FIJ},
+ at code{REG_TYPE_READPAIR}, @code{REG_TYPE_REPEAT}, @code{REG_TYPE_CHECKASS},
+ at code{REG_TYPE_OLIGO}.
+
+The contig comparator is then turned back into the 1D contig selector window.
+The @var{id} of the contig comparator is needed for this.
+
+ at split{}
+ at node Reg-clear_template
+ at subsection clear_template
+ at findex clear_template(T)
+ at example
+ at group
+clear_template -io handle -id number
+
+Returns: nothing
+ at end group
+ at end example
+
+This command deletes all items on the template display with an id of
+ at var{number}. It loops through all windows contained within this template
+display, sending a @code{REG_QUIT} request to them.
+
+FIXME: This doesn't appear to remove either the template display itself or the
+ruler. Is it meant to?
+
+ at split{}
+ at node Reg-tcl_register_id
+ at subsection register_id
+ at findex register_id(T)
+ at example
+ at group
+register_id
+        
+Returns: the id.
+ at end group
+ at end example
+A Tcl interface to the @code{register_id} function.
+
+ at node Reg-tcl_result_names
+ at subsection result_names
+ at findex result_names(T)
+ at example
+ at group
+result_names -io handle
+
+Returns: a list describing all results.
+ at end group
+ at end example
+A Tcl interface to the @code{result_names} function. This produces a single
+string describing the complete list of results. The format is
+"@{contig regnum id string@} ?@{contig regnum id string@}? ..." and so can
+be accessed as a Tcl list.
+
+ at node Reg-tcl_result_time
+ at subsection result_time
+ at findex result_time(T)
+ at example
+ at group
+result_time -io handle -contig contig_number -id id_number
+
+Returns: the time in string format.
+ at end group
+ at end example
+A Tcl interface to the @code{result_time} function.
+
+ at node Reg-tcl_result_delete
+ at subsection result_delete
+ at findex result_delete(T)
+ at example
+ at group
+result_delete -io handle -id id_number
+
+Returns: nothing
+ at end group
+ at end example
+Sends a REG_DELETE request to a specific id.
+
+ at node Reg-tcl_result_quit
+ at subsection result_quit
+ at findex result_quit(T)
+ at example
+ at group
+result_quit -io handle -id id_number
+
+Returns: nothing
+ at end group
+ at end example
+Sends a REG_QUIT request to a specific id.
+
+ at split{}
+ at node Reg-tcl_reg_get_ops
+ at subsection reg_get_ops
+ at findex reg_get_ops(T)
+ at example
+ at group
+reg_get_ops -io handle -id id_number
+
+Returns: a Tcl list of available operations.
+ at end group
+ at end example
+A Tcl interface to the REG_GET_OPS notification.
+
+ at node Reg-tcl_reg_invoke_op
+ at subsection reg_invoke_op
+ at findex reg_invoke_op(T)
+ at example
+ at group
+reg_invoke_op -io handle -id id_number -option option_number
+
+Returns: nothing
+ at end group
+ at end example
+A Tcl interface to the REG_INVOKE_OP notification.
+
+ at node Reg-tcl_reg_notify_update
+ at subsection reg_notify_update
+ at findex reg_notify_update(T)
+ at example
+ at group
+reg_notify_update -io handle -contig contig_number
+
+Returns: nothing
+ at end group
+ at end example
+Sends a REG_LENGTH request to a specific contig, or to all contigs if
+contig_number is specified as 0.
+        
+ at node Reg-tcl_reg_notify_highlight
+ at subsection reg_notify_highlight
+ at findex reg_notify_highlight(T)
+ at example
+ at group
+reg_notify_highlight -io handle -reading identifier -highlight value
+
+Returns: nothing
+ at end group
+ at end example
+Sends a REG_HIGHLIGHT request to a specific contig, indicating that the
+highlight value of the specified reading is @var{value}. The
+reading is specified as an @var{identifier} consisting of the name,
+#reading_number or =contig_number.
+        
+ at node Reg-tcl_quit_displays
+ at subsection quit_displays
+ at findex quit_displays(T)
+ at example
+ at group
+quit_displays io_handle function_name
+
+Returns:  0 for success
+         -1 for failure
+ at end group
+ at end example
+Sends a REG_QUIT request to all registered data. If an error occurs,
+a database busy message is sent to the error window with the
+"function_name" listed.
+
+ at split{}
+ at node Reg-To Do
+ at section Future Enhancements
+
+ at enumerate
+ at item
+Rationalise naming and arguments to functions:
+ at enumerate
+ at item
+Some Tcl interfaces don't take "-io handle" notation (@code{quit_displays})
+ at item
+We refer to "result" in function names, but "id" in arguments. They're the
+same.
+ at end enumerate
+ at item
+Add more type conversion routines. Also rationalise the existing routines.
+We should have a completely orthogonal set of interrogation functions so
+that manipulating contigs, types and ids is the same.
+ at item
+Document usage of registration scheme within the contig comparator (it's
+not straightforward or immediately obvious).
+ at end enumerate
+
diff --git a/scripting_manual/gap4-scripting-comm-t.texi b/scripting_manual/gap4-scripting-comm-t.texi
new file mode 100644
index 0000000..9f8b047
--- /dev/null
+++ b/scripting_manual/gap4-scripting-comm-t.texi
@@ -0,0 +1,1687 @@
+ at cindex Gap4 main commands
+
+ at menu
+* G4Comm-assemble_direct::        assemble_direct
+* G4Comm-assemble_misc::          assemble... commands
+* G4Comm-break_contig::           break_contig
+* G4Comm-calc_quality::           calc_quality
+* G4Comm-check_assembly::         check_assembly
+* G4Comm-check_database::         check_database
+* G4Comm-complement_contig::      complement_contig
+* G4Comm-delete_contig::          delete_contig
+* G4Comm-disassemble_readings::   disassemble_readings
+* G4Comm-double_strand::          double_strand
+* G4Comm-edit_contig::            edit_contig
+* G4Comm-enter_tags::             enter_tags
+* G4Comm-extract_readings::       extract_readings
+* G4Comm-find_long_gels::         find_long_gels
+* G4Comm-find_oligo::             find_oligo
+* G4Comm-find_primers::           find_primers
+* G4Comm-find_probes::            find_probes
+* G4Comm-find_read_pairs::        find_read_pairs
+* G4Comm-find_repeats::           find_repeats
+* G4Comm-find_taq_terminator::    find_taq_terminator
+* G4Comm-find_internal_joins::    find_internal_joins
+* G4Comm-get_consensus::          get_consensus
+* G4Comm-join_contig::            join_contig
+* G4Comm-minimal_coverage::       minimal_coverage
+* G4Comm-pre_assemble::           pre_assemble
+* G4Comm-shift_readings::         shift_readings
+* G4Comm-show_relationships::     show_relationships
+* G4Comm-unattached_readings::    unattached_readings
+ at end menu
+
+ at split{}
+ at node G4Comm-assemble_direct
+ at unnumberedsec assemble_direct
+ at findex assemble_direct(T)
+ at cindex Directed assembly
+ at cindex Assembly, directed
+
+ at example
+ at group
+ at exdent @code{assemble_direct}
+ -io            @i{io_handle:integer}
+ -files         @i{filenames:strings}
+?-output_mode   @i{mode:integer(0)}?
+?-max_pmismatch @i{percentage:float(-1)}?
+ at end group
+ at end example
+
+This performs the gap4 directed assembly function. It takes a list of
+Experiment File filenames and processes these according to their content. The
+Experiment Files should contain AP lines to govern their positions in the
+assembly.
+
+The function returns a list of failed files separated by newlines.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-files} filenames
+
+A Tcl list of Experiment File filenames.
+
+ at sp 1
+ at item @code{-output_mode} mode
+
+Whether to display alignments when assembling sequences containing a
+tolerance of zero or more. A @i{mode} of non-zero displays alignments, 0 does
+not. This is an optional argument with the default as 0.
+
+ at sp 1
+ at item @code{-max_pmismatch} percentage
+
+When aligning sequences (tolerance >= 0) the aligned sequences must match the
+consensus sequence with <= @i{percentage} mismatch. A @i{percentage} of -1
+implies no check should be made. The default for this option is -1.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-assemble_misc
+ at unnumberedsec assemble... commands
+ at cindex Independent assembly
+ at cindex Assembly, independent
+ at findex assemble_independent(T)
+
+ at example
+ at group
+ at exdent @code{assemble_independent}
+ -io             @i{io_handle:integer}
+ -files          @i{filenames:strings}
+?-output_mode    @i{mode:integer(1)}?
+?-min_match      @i{length:integer(20)}?
+?-min_overlap    @i{length:integer(0)}?
+?-max_pads       @i{count:integer(25)}?
+?-max_pmismatch  @i{percentage:float(5.0)}?
+?-joins          @i{to_join:integer(1)}?
+?-enter_failures @i{to_enter:integer(0)}?
+?-tag_types      @i{types:strings()}?
+ at end group
+ at end example
+
+ at cindex New contigs assembly
+ at cindex Assembly, new contigs
+ at findex assemble_new_contigs(T)
+ at example
+ at group
+ at exdent @code{assemble_new_contigs}
+ -io             @i{io_handle:integer}
+ -files          @i{filenames:strings}
+ at end group
+ at end example
+
+ at cindex One contig assembly
+ at cindex Assembly, one contig
+ at findex assemble_one_contig(T)
+ at example
+ at group
+ at exdent @code{assemble_one_contig}
+ -io             @i{io_handle:integer}
+ -files          @i{filenames:strings}
+ at end group
+ at end example
+
+ at cindex Screen-only assembly
+ at cindex Assembly, screen only
+ at findex assemble_screen(T)
+ at example
+ at group
+ at exdent @code{assemble_screen}
+ -io             @i{io_handle:integer}
+ -files          @i{filenames:strings}
+?-output_mode    @i{mode:integer(1)}?
+?-min_match      @i{length:integer(20)}?
+?-min_overlap    @i{length:integer(0)}?
+?-max_pads       @i{count:integer(25)}?
+?-max_pmismatch  @i{percentage:float(5.0)}?
+?-save_align     @i{to_save:integer(0)}?
+?-win_size       @i{length:integer(0)}?
+?-max_dashes     @i{count:integer(0)}?
+?-tag_types      @i{types:strings()}?
+ at end group
+ at end example
+
+ at cindex Shotgun assembly
+ at cindex Assembly, shotgun
+ at findex assemble_shotgun(T)
+ at example
+ at group
+ at exdent @code{assemble_shotgun}
+ -io             @i{io_handle:integer}
+ -files          @i{filenames:strings}
+?-output_mode    @i{mode:integer(1)}?
+?-min_match      @i{length:integer(20)}?
+?-min_overlap    @i{length:integer(0)}?
+?-max_pads       @i{count:integer(25)}?
+?-max_pmismatch  @i{percentage:float(5.0)}?
+?-joins          @i{to_join:integer(1)}?
+?-enter_failures @i{to_enter:integer(0)}?
+?-tag_types      @i{types:strings()}?
+ at end group
+ at end example
+
+ at cindex Single stranded assembly
+ at cindex Assembly, single stranded
+ at findex assemble_single_strand(T)
+ at example
+ at group
+ at exdent @code{assemble_single_strand}
+ -io             @i{io_handle:integer}
+ -files          @i{filenames:strings}
+?-output_mode    @i{mode:integer(1)}?
+?-min_match      @i{length:integer(20)}?
+?-min_overlap    @i{length:integer(0)}?
+?-max_pads       @i{count:integer(25)}?
+?-max_pmismatch  @i{percentage:float(5.0)}?
+?-joins          @i{to_join:integer(1)}?
+?-enter_failures @i{to_enter:integer(0)}?
+ at end group
+ at end example
+
+The assembly functions listed above all take similar arguments, but perform
+varying modes of assembly. The complete list of available arguments is listed
+below, but note that not all arguments apply to each function. Most functions
+return the failed readings and error codes with newlines between each
+reading and error code pair. @code{assemble_screen} may return (when
+ at i{save_align} is enabled) the reading alignment scores in a similar fashion.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-files} filenames
+
+ at i{Filenames} must contain a Tcl list of files to assemble.
+
+ at sp 1
+ at vindex -output_mode
+ at item @code{-output_mode} mode
+
+Specifies the level of verbosity of the output. The default is 1. @i{Mode}
+must be one of the following.
+
+ at table @asis
+ at item 1
+Display no alignments
+ at item 2
+Display only passed alignments
+ at item 3
+Display all alignments
+ at item 4
+Display only failed alignments
+ at end table
+
+ at sp 1
+ at vindex -min_match
+ at item @code{-min_match} length
+Specifies the minimum length of exact match used during the hashing stage of
+assembly. The minimum allowed value for this is 8. The default is 20.
+
+ at sp 1
+ at vindex -min_overlap
+ at item @code{-min_overlap} length
+
+This specifies the minimum length of an overlap between a reading and
+a consensus sequence. The default is 0 which implies no overlap is too short.
+Note that @code{-min_match} is still used so all overlaps have to be larger
+than that parameter in order to be found.
+
+ at sp 1
+ at vindex -max_pads
+ at item @code{-max_pads} count
+
+After alignments the number of pads required in each of the two sequences
+(consensus and reading, or two consensuses) must be less than or equal to
+ at i{count}. The default is 25.
+
+ at sp 1
+ at vindex -max_pmismatch
+ at item @code{-max_pmismatch} percentage
+
+After alignments the percentage of bases that do not match must be less than
+or equal to @i{percentage}. This is a floating point value. The default is
+5.0.
+
+ at sp 1
+ at item @code{-save_align} to_save
+
+This controls whether alignment scores are to be returned as the function
+result. A non zero value returns the scores. The default is 0.
+
+ at sp 1
+ at vindex -win_size
+ at vindex -max_dashes
+ at item @code{-win_size} length
+ at itemx @code{-max_dashes} count
+
+During a screen-only assembly the cutoff data may be searched for matches. The
+portion of cutoff sequence aligned is that in which no more than @i{count}
+unknown ("-") bases occur within a region of @i{length} bases. Setting both
+these parameters to 0 means that cutoff data will not be searched. These are
+the defaults.
+
+ at sp 1
+ at item @code{-joins} to_join
+
+This controls whether joins between contigs should be allowed. A non zero
+value allows joins. The default is 1.
+
+ at sp 1
+ at item @code{-enter_failures} to_enter
+
+This controls whether failed readings should still be entered into the
+databases as single reading contigs. A non zero value will enable this. The
+default is 0.
+
+ at sp 1
+ at vindex -tag_types
+ at item @code{-tag_types} types
+
+The assembly algorithm can mask segments of sequence covered by tags so
+that they are not used during the hashing step and hence do not initiate
+overlaps.  If @i{types} is a non blank list of tag types then masking will
+be applied to hide sequence covered by tags of these types from the initial
+hashing stage of assembly. The default is a blank list, which means no
+masking will be performed.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-break_contig
+ at unnumberedsec break_contig
+ at cindex Contig breaking
+ at cindex Break contig
+ at findex break_contig(T)
+
+ at example
+ at group
+ at exdent @code{break_contig}
+ -io            @i{io_handle:integer}
+ -readings      @i{identifiers:strings}
+ at end group
+ at end example
+
+This command breaks contigs into two or more pieces at given points. The
+function returns no value but will generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-readings} identifiers
+
+This specifies the list of readings. For each reading the contig will be
+broken such that the reading forms the left end of a new contig.
+ at end table
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-calc_quality
+ at unnumberedsec TODO: calc_quality
+ at findex calc_quality(T)
+ at cindex Quality calculation
+ at cindex Calculating quality
+
+ at example
+ at group
+ at exdent @code{calc_quality}
+ -io            @i{io_handle:integer}
+ -contig        @i{contig}
+ -window        @i{window}
+ at end group
+ at end example
+
+This command will have the interface updated in the future to conform to the
+style used by other commands. Use at your own risk.
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-check_assembly
+ at unnumberedsec check_assembly
+ at findex check_assembly(T)
+ at cindex Assembly, checking
+ at cindex Check assembly
+
+ at example
+ at group
+ at exdent @code{check_assembly}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+?-cutoff        @i{use_cutoffs:integer(1)}?
+?-min_len       @i{length:integer(10)}?
+?-win_size      @i{length:integer(29)}?
+?-max_dashes    @i{count:integer(3)}?
+?-max_pmismatch @i{percentage:float(15.0)}?
+ at end group
+ at end example
+
+The command performs the Gap4 Check Assembly command. It compares either the
+used data or the cutoff data against the consensus sequence to find readings
+with poor match. The function is not currently well suited for use in a
+script as it plots directly to the Contig Selector display. The function
+returns no value but will generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to check. Only the contig identifier is
+currently used, although the syntax specifying start and end ranges is valid.
+
+ at sp 1
+ at item @code{-cutoff} use_cutoffs
+
+Controls whether the cutoff data is to be analysed. If @i{use_cutoffs} is a
+non zero value the cutoff data will be aligned and compared against the
+consensus. Otherwise the already aligned used data will be compared against
+the consensus. The default is 1.
+
+ at sp 1
+ at item @code{-min_len} min_length
+ at itemx @code{-win_size} window_length
+ at itemx @code{-max_dashes} count
+
+These parameters are only used when @i{-cutoff} is enabled. The portion of
+cutoff sequence aligned is that in which no more than @i{count} unknown ("-")
+bases occur within a region of @i{window_length} bases. This sequence is
+then only used if it is at least @i{min_length} bases long. The defaults are
+10 for @code{-min_len}, 29 for @code{-win_size} and 3 for @code{-max_dashes}.
+
+ at sp 1
+ at item @code{-max_pmismatch} percentage
+
+Only matches with greater than the specified percentage mismatch are displayed
+as problems. The default is 15.0.
+ at end table
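+
+The window criterion described for @code{-min_len}, @code{-win_size} and
+ at code{-max_dashes} can be read in more than one way. The C sketch below shows
+one plausible interpretation and is not taken from the Gap4 source: it scans
+the cutoff data from the end nearest the used data and stops at the first
+base where the last @i{window_length} bases contain more than @i{count}
+dashes. The function name @code{toy_usable_cutoff} and its parameters are
+hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return how many leading bases of 'cutoff' would be aligned under this
 * reading of the rule, or 0 if the selected portion is shorter than
 * 'min_len' or if 'win_size' is 0 (searching disabled).
 */
size_t toy_usable_cutoff(const char *cutoff, size_t win_size,
                         int max_dashes, size_t min_len)
{
    size_t len, i;
    int dashes = 0; /* dashes among the last win_size bases ending at i */

    assert(cutoff != NULL);
    if (win_size == 0)
        return 0;

    len = strlen(cutoff);
    for (i = 0; i < len; i++) {
        if (cutoff[i] == '-')
            dashes++;
        /* Drop the base that has just slid out of the window. */
        if (i >= win_size && cutoff[i - win_size] == '-')
            dashes--;
        if (dashes > max_dashes)
            break; /* base i would make this window too poor */
    }

    /* i is the number of bases accepted before the window failed. */
    return i >= min_len ? i : 0;
}
```

+With the documented defaults (a window of 29, at most 3 dashes and a minimum
+length of 10), a clean 12 base cutoff is accepted whole, while a clean 4 base
+cutoff is rejected for being shorter than the minimum length.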
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-check_database
+ at unnumberedsec check_database
+ at findex check_database(T)
+ at cindex Database checking
+ at cindex Check database
+
+ at example
+ at group
+ at exdent @code{check_database}
+ -io            @i{io_handle:integer}
+ at end group
+ at end example
+
+This function performs the gap4 check database function. It returns the number
+of serious database corruptions detected.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-complement_contig
+ at unnumberedsec complement_contig
+ at findex complement_contig(T)
+ at cindex Contig, complementing
+ at cindex Complement contig
+
+ at example
+ at group
+ at exdent @code{complement_contig}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+ at end group
+ at end example
+
+This command complements one or more contigs and writes back the modified data
+to the database. It returns 0 for success and 1 for failure.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to complement. Only the contig identifier is
+currently used, although the syntax specifying start and end ranges is valid.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-delete_contig
+ at unnumberedsec delete_contig
+ at findex delete_contig(T)
+ at cindex Contig, deletion
+ at cindex Delete contig
+
+ at example
+ at group
+ at exdent @code{delete_contig}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+ at end group
+ at end example
+
+This command deletes one or more contigs from the database, including readings
+and associated information. The function returns no value but will generate a
+Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to delete. Only the contig identifier is
+currently used, although the syntax specifying start and end ranges is valid.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-disassemble_readings
+ at unnumberedsec disassemble_readings
+ at findex disassemble_readings(T)
+ at cindex Reading, disassembly
+ at cindex Disassemble readings
+
+ at example
+ at group
+ at exdent @code{disassemble_readings}
+ -io            @i{io_handle:integer}
+ -readings      @i{identifiers:strings}
+?-all           @i{for_all:integer(1)}?
+?-remove        @i{to_remove:integer(1)}?
+ at end group
+ at end example
+
+This command disassembles readings by either removing them from the database
+or moving them to their own contigs.  The function returns no value but will
+generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-readings} identifiers
+
+Specifies the list of readings to disassemble.
+
+ at sp 1
+ at item @code{-all} for_all
+
+Controls whether to disassemble all readings or only those that are not
+"crucial" (those that would cause a contig to break into fragments). A
+non-zero value will disassemble all. The default is 1.
+
+ at sp 1
+ at item @code{-remove} to_remove
+
+Controls whether to completely remove the readings from the database or to
+move them to their own contigs. A non-zero value will remove them, otherwise
+they are moved. The default is 1.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-double_strand
+ at unnumberedsec double_strand
+ at findex double_strand(T)
+ at cindex Double stranding
+ at cindex Strands
+
+ at example
+ at group
+ at exdent @code{double_strand}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+?-max_nmismatch @i{count:integer(8)}?
+?-max_pmismatch @i{percentage:float(8.0)}?
+ at end group
+ at end example
+
+This command searches for single stranded regions and attempts to make them
+double stranded data by finding neighbouring readings with hidden data that
+is good enough to reveal. The function returns no value but will generate a
+Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to double strand. The @i{@{contig start
+end@}} syntax may be used for an identifier to double strand only a region of
+the contig, otherwise all of it is double stranded.
+
+ at sp 1
+ at vindex -max_nmismatch
+ at item @code{-max_nmismatch} count
+
+This specifies the maximum number of mismatches allowed in the extended data
+between the reading and the consensus. The default is 8.
+
+ at sp 1
+ at vindex -max_pmismatch
+ at item @code{-max_pmismatch} percentage
+
+This specifies the maximum percentage mismatch allowed in the extended data
+between the reading and the consensus. The default is 8.0.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-edit_contig
+ at unnumberedsec edit_contig
+ at findex edit_contig(T)
+ at cindex Contig, editing
+ at cindex Edit contig
+
+ at example
+ at group
+ at exdent @code{edit_contig}
+ -io            @i{io_handle:integer}
+ -contig        @i{identifier:string}
+?-reading       @i{identifier:string()}?
+?-pos           @i{position:integer(1)}?
+ at end group
+ at end example
+
+This command brings up a contig editor display. The function returns no value
+but will generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contig} identifier
+
+This specifies the contig to edit.
+
+ at sp 1
+ at item @code{-reading} identifier
+ at itemx @code{-pos} position
+
+By default the editor starts with the display and cursor at the left end of
+the consensus. Use these options to specify a different reading and position.
+The position is relative to the start of the specified reading. To start the
+editor at a specific position in the consensus sequence use only @code{-pos}.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-enter_tags
+ at unnumberedsec enter_tags
+ at findex enter_tags(T)
+ at cindex Tags, adding
+ at cindex Adding tags
+
+ at example
+ at group
+ at exdent @code{enter_tags}
+ -io            @i{io_handle:integer}
+ -file          @i{filename:string}
+ at end group
+ at end example
+
+This command reads a file containing tags (annotations) and enters them into
+the database. The function returns no value but will generate a Tcl error if
+an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-file} filename
+
+This specifies the file containing the tag data.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-extract_readings
+ at unnumberedsec extract_readings
+ at findex extract_readings(T)
+ at cindex Readings, extracting
+ at cindex Extract readings
+
+ at example
+ at group
+ at exdent @code{extract_readings}
+ -io            @i{io_handle:integer}
+ -readings      @i{identifiers:strings}
+?-directory     @i{directory:string(extracts)}?
+?-quality       @i{add_quality:integer(1)}?
+ at end group
+ at end example
+
+This command copies the edited sequences from the database to Experiment
+Files on disk. The database is not altered. The function returns no value
+but will generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-readings} identifiers
+
+Specifies the list of readings to extract.
+
+ at sp 1
+ at item @code{-directory} directory
+
+The files created are all placed in a subdirectory (created by this command).
+This option specifies the directory to be used. The default is
+ at file{extracts}.
+
+ at sp 1
+ at item @code{-quality} add_quality
+
+This controls whether quality, original positions, and pre-assembly format
+data is to be included in the file. A non-zero value will output the extra
+data. The default is 1.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-find_long_gels
+ at unnumberedsec find_long_gels
+ at findex find_long_gels(T)
+ at cindex Readings, long runs
+ at cindex Find long gels
+
+ at example
+ at group
+ at exdent @code{find_long_gels}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+?-avg_len       @i{length:integer(500)}?
+ at end group
+ at end example
+
+This command searches for places where rerunning a reading as a long gel will
+solve a problem. The function returns no value but will generate a Tcl error
+if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to search. The @i{@{contig start end@}}
+syntax may be used for an identifier to search only a region of the
+contig, otherwise all of it is searched.
+
+ at sp 1
+ at vindex -avg_len
+ at item @code{-avg_len} length
+
+This specifies the length expected for a long reading. This is used to
+determine which readings are suitable for rerunning and the amount of
+coverage available. The default is 500 base pairs.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-find_oligo
+ at unnumberedsec find_oligo
+ at findex find_oligo(T)
+ at cindex Oligos, finding
+ at cindex Primers, finding
+ at cindex Find oligo
+
+ at example
+ at group
+ at exdent @code{find_oligo}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+?-min_pmatch    @i{percentage:float(75.0)}?
+?-seq           @i{sequence:string()}?
+?-tag_types     @i{types:string()}?
+ at end group
+ at end example
+
+This command searches for short sequence segments and plots them in the Contig
+Selector. It will fail when not running in a graphical environment containing
+the Contig Selector. The function returns no value but will generate a Tcl
+error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to search. The @i{@{contig start end@}}
+syntax may be used for an identifier to search only a region of the
+contig, otherwise all of it is searched.
+
+ at sp 1
+ at item @code{-min_pmatch} percentage
+
+Only matches with this level of similarity or better will be displayed. The
+default is 75%.
+
+ at sp 1
+ at item @code{-seq} sequence
+
+The command will search for the @i{sequence} in each of the specified
+contigs, plotting matches above (or equal to) the minimum percentage match.
+This option takes precedence over the @code{-tag_types} option. The default
+is a blank string, which implies no searching.
+
+ at sp 1
+ at item @code{-tag_types} types
+
+If a list of tag types is specified the algorithm first obtains the sequence
+underneath each tag of these types. For each sequence the search is
+independently performed with that sequence as the search string. If
+ at code{-seq} has also been specified this option is invalid. The default is a
+blank list of tag types, which implies no tags will be searched for.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-find_primers
+ at unnumberedsec find_primers
+ at findex find_primers(T)
+ at cindex Primers, suggesting
+ at cindex Find primers
+
+This command performs the Gap4 "Suggest Primers" function. It searches for
+locations where choosing an oligo primer for "walking" off another reading
+will solve a problem. The command returns a list of primer
+information in the form "@i{template_name reading_name primer_identifier
+sequence position direction}", separated by newlines.
+
+ at example
+ at group
+ at exdent @code{find_primers}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings()}
+?-search_from   @i{position:integer(20)}?
+?-search_to     @i{position:integer(60)}?
+?-num_primers   @i{count:integer(1)}?
+?-primer_start  @i{count:integer(1)}?
+?-params        @i{OSP_params:string}?
+ at end group
+ at end example
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to search. The @i{@{contig start end@}}
+syntax may be used for an identifier to search only a region of the
+contig, otherwise all of it is searched.
+
+ at sp 1
+ at item @code{-search_from} position
+ at itemx @code{-search_to} position
+
+These two options control the region, relative to the problem, in which to
+look for suitable oligos. The defaults are @i{from} 20 @i{to} 60, which
+means that to cover an area starting at position 1000 in the forward strand
+the command will pick oligos from the sequence at positions 940 to 980.
+
+ at sp 1
+ at item @code{-num_primers} count
+
+This controls how many oligos to pick to solve each problem. The default is 1.
+
+ at sp 1
+ at item @code{-primer_start} count
+
+Each oligo is given a primer name consisting of the database name followed by
+the primer number. The numbers start at @i{count} and increment for each new
+primer. The default is 1.
+
+ at sp 1
+ at vindex osp_defs
+ at vindex gap_defs, OSP
+ at item @code{-params} OSP_params
+
+This specifies the parameters to the OSP algorithm as a keyed list. The
+defaults are undefined unless the gaprc file has been parsed. In this case the
+defaults are as used by Gap4. Not all of the OSP parameters listed below are
+needed or used, but we don't have further details. The defaults listed in the
+gaprc file are:
+
+ at example
+#----------------------------------------------
+# The OSP Prm defaults
+#----------------------------------------------
+set_def OSP.prod_len_low                0
+set_def OSP.prod_len_high               200
+set_def OSP.prod_gc_low                 0.40
+set_def OSP.prod_gc_high                0.55
+set_def OSP.prod_tm_low                 70.0
+set_def OSP.prod_tm_high                90.0
+
+set_def OSP.min_prim_len                17
+set_def OSP.max_prim_len                23
+set_def OSP.prim_gc_low                 0.30
+set_def OSP.prim_gc_high                0.70
+set_def OSP.prim_tm_low                 50
+set_def OSP.prim_tm_high                55
+
+set_def OSP.self3_hmlg_cut              8
+set_def OSP.selfI_hmlg_cut              14
+set_def OSP.pp3_hmlg_cut                8
+set_def OSP.ppI_hmlg_cut                14
+set_def OSP.primprod3_hmlg_cut          0
+set_def OSP.primprodI_hmlg_cut          0
+set_def OSP.primother3_hmlg_cut         0.0
+set_def OSP.primotherI_hmlg_cut         0.0
+set_def OSP.delta_tm_cut                2.0
+set_def OSP.end_nucs                    S
+
+set_def OSP.wt_prod_len                 0
+set_def OSP.wt_prod_gc                  0
+set_def OSP.wt_prod_tm                  0
+set_def OSP.wt_prim_s_len               0
+set_def OSP.wt_prim_a_len               0
+set_def OSP.wt_prim_s_gc                0
+set_def OSP.wt_prim_a_gc                0
+set_def OSP.wt_prim_s_tm                0
+set_def OSP.wt_prim_a_tm                0
+set_def OSP.wt_self3_hmlg_cut           2
+set_def OSP.wt_selfI_hmlg_cut           1
+set_def OSP.wt_pp3_hmlg_cut             2
+set_def OSP.wt_ppI_hmlg_cut             1
+set_def OSP.wt_primprod3_hmlg_cut       0
+set_def OSP.wt_primprodI_hmlg_cut       0
+set_def OSP.wt_primother3_hmlg_cut      0
+set_def OSP.wt_primotherI_hmlg_cut      0
+set_def OSP.wt_delta_tm_cut             0
+set_def OSP.AT_score                    2
+set_def OSP.CG_score                    4
+set_def OSP.wt_ambig                    avg
+ at end example
+
+To change a default you need to specify the full OSP parameters with modified
+values. For instance:
+
+ at example
+global gap_defs
+
+set osp_defs [keylget gap_defs OSP]
+keylset osp_defs min_prim_len 18
+
+find_primers \
+        -params $osp_defs \
+        @i{(etc)}
+ at end example
+ at end table
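+
+As an illustration of the result format and of the default search window
+described above, the following is a minimal sketch in plain Tcl (version 8.5
+or later is assumed for @code{dict}). The procedure names
+"primer_window" and "parse_primers" are hypothetical helpers, not Gap4
+commands, and the sample text (including the form of the primer names and
+the direction field) is invented for illustration; only the order of the six
+fields and the forward strand arithmetic are taken from the description.

```tcl
# Forward-strand search window for a problem at position $pos, using the
# documented defaults of -search_from 20 and -search_to 60.
proc primer_window {pos {from 20} {to 60}} {
    return [list [expr {$pos - $to}] [expr {$pos - $from}]]
}

# Split a find_primers style result into one dict per primer. Each line is
# assumed to hold exactly the six whitespace-separated fields named in the
# command description.
proc parse_primers {result} {
    set fields {template reading primer sequence position direction}
    set primers {}
    foreach line [split $result "\n"] {
        set line [string trim $line]
        if {$line eq ""} continue
        if {[llength $line] != [llength $fields]} {
            error "unexpected primer line: $line"
        }
        set p [dict create]
        foreach name $fields value $line {
            dict set p $name $value
        }
        lappend primers $p
    }
    return $primers
}

# Hypothetical output, made up for illustration only.
set sample "tmpl1 fred.s1 mydb.1 ACGTACGTACGTACGTAC 950 +\ntmpl2 fred.s7 mydb.2 TTGACCGTAGGCTAACGT 1410 -"

foreach p [parse_primers $sample] {
    puts "[dict get $p primer]: [dict get $p sequence] at [dict get $p position]"
}

# The worked example from the text: a problem at position 1000.
puts [primer_window 1000]
```

+In a real script the string passed to "parse_primers" would be the return
+value of @code{find_primers} rather than the sample text.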
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-find_probes
+ at unnumberedsec find_probes
+ at findex find_probes(T)
+ at cindex Probes, finding
+ at cindex Find probes
+
+ at example
+ at group
+ at exdent @code{find_probes}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+?-min_size      @i{length:integer(15)}?
+?-max_size      @i{length:integer(19)}?
+?-max_pmatch    @i{fraction:float(90.0)}?
+?-from          @i{position:integer(10)}?
+?-to            @i{position:integer(100)}?
+?-vectors       @i{filename:string()}?
+ at end group
+ at end example
+
+This command performs the Gap4 "Suggest Probes" function. It searches for
+unique sequences at the ends of contigs suitable for probing clone
+libraries to pick overlapping sequences and hence to extend contigs.  The
+command returns a newline separated list of probes in the form
+"@code{Contig} @i{ident} @code{position} @i{int} @code{Tm} @i{int}
+ at code{sequence} @i{string}".
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to use. Only the contig identifier is
+currently used, although the syntax specifying start and end ranges is valid.
+
+ at sp 1
+ at item @code{-min_size} length
+ at itemx @code{-max_size} length
+
+These specify an inclusive range of the allowed lengths of probes chosen.
+The defaults are @i{min_size} of 15 and @i{max_size} of 19.
+
+ at sp 1
+ at item @code{-max_pmatch} fraction
+
+Each potential probe sequence is compared against all contig sequences
+and, optionally, several vector sequences. This option specifies the
+maximum percentage match between the probe and the comparison sequences.
+Only sequences with no matches above this percentage match are considered
+unique. The default is 90.0.
+
+ at sp 1
+ at item @code{-from} position
+ at itemx @code{-to} position
+
+These specify the area in which to look for probes as an offset from the
+ends of the contigs. The defaults are @i{from} 10 @i{to} 100.
+
+ at sp 1
+ at item @code{-vectors} filename
+
+This specifies a file of vector filenames. NB: This will possibly be changed
+to a Tcl list of vector filenames. The uniqueness search will then also check
+the vector files for matches. The vector files can be in any format readable
+by the @i{seq_utils} library (currently Staden, EMBL, CODATA, GENBANK and
+FASTA). The default is blank, which implies no vectors to check.
+ at end table
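+
+Because each line of the result alternates keywords and values, it can be
+read directly as a Tcl dictionary. The sketch below (plain Tcl, 8.5 or later
+assumed) shows one way to do this and to keep only probes at or above a
+chosen melting temperature. The procedure names "parse_probes" and
+"probes_with_min_tm" are hypothetical, and the sample lines are invented.

```tcl
# Turn a find_probes style result into a list of dicts, one per probe.
proc parse_probes {result} {
    set probes {}
    foreach line [split $result "\n"] {
        set line [string trim $line]
        if {$line eq ""} continue
        # Each line alternates keyword and value, so it is already a dict;
        # check that the four documented keywords are present.
        foreach key {Contig position Tm sequence} {
            if {![dict exists $line $key]} {
                error "missing '$key' in probe line: $line"
            }
        }
        lappend probes $line
    }
    return $probes
}

# Keep only the probes whose Tm is at least $min_tm.
proc probes_with_min_tm {probes min_tm} {
    set keep {}
    foreach p $probes {
        if {[dict get $p Tm] >= $min_tm} {
            lappend keep $p
        }
    }
    return $keep
}

# Hypothetical output, made up for illustration only.
set sample "Contig fred.s1 position 12 Tm 52 sequence ACGGTCATGCATGGCTA\nContig xb54a3.s1 position 40 Tm 46 sequence TTAGCATTAGCAATCG"

foreach p [probes_with_min_tm [parse_probes $sample] 50] {
    puts "[dict get $p Contig] [dict get $p sequence]"
}
```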
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-find_read_pairs
+ at unnumberedsec find_read_pairs
+ at findex find_read_pairs(T)
+ at cindex Reading pairs, finding
+ at cindex Find read pairs
+
+ at example
+ at group
+ at exdent @code{find_read_pairs}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+ at end group
+ at end example
+
+This command searches for templates containing both forward and reverse
+readings where the forward and reverse readings are in differing contigs. This
+information is plotted on the Contig Selector. The command will not work if
+the Contig Selector is not displayed. The function returns no value but will
+generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to use. Only the contig identifier is
+currently used, although the syntax specifying start and end ranges is valid.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-find_repeats
+ at unnumberedsec find_repeats
+ at findex find_repeats(T)
+ at cindex Repeats, finding
+ at cindex Find repeats
+
+ at example
+ at group
+ at exdent @code{find_repeats}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+?-direction     @i{direction:integer(3)}?
+?-min_match     @i{length:integer(25)}?
+?-outfile       @i{filename:string()}?
+?-tag_types     @i{types:string()}?
+ at end group
+ at end example
+
+The command searches for perfect matches between two or more fragments in the
+consensus sequences. This information is plotted on the Contig Selector. The
+command will not work if the Contig Selector is not displayed. The function
+returns no value but will generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to search. The @i{@{contig start end@}}
+syntax may be used for an identifier to search only a region of the
+contig, otherwise all of it is searched.
+
+ at sp 1
+ at item @code{-direction} direction
+
+This specifies whether forward repeats (1), reverse repeats (2), or both (3)
+are to be found. The default is 3 (both).
+
+ at sp 1
+ at item @code{-min_match} length
+
+This specifies the minimum length of a repeat to be searched for. The default
+is 25. The minimum allowed is 8.
+
+ at sp 1
+ at item @code{-outfile} filename
+
+This specifies a file in which to save the tag hits. The results are
+written in the form of REPT annotations which are suitable for passing onto
+the @code{enter_tags} command.  The default is a blank string, which
+implies no file should be created.
+
+ at sp 1
+ at item @code{-tag_types} types
+
+If @i{types} is a non blank list of tag types then masking will be applied to
+remove sequence covered by tags of these types from the repeat searching. The
+default is a blank list, which means no masking will be performed.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-find_taq_terminator
+ at unnumberedsec find_taq_terminator
+ at findex find_taq_terminator(T)
+ at cindex Taq terminators, suggesting
+ at cindex Terminator reactions, suggesting
+ at cindex Find taq terminators
+
+ at example
+ at group
+ at exdent @code{find_taq_terminator}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+?-avg_len       @i{length:integer(350)}?
+ at end group
+ at end example
+
+This command searches for places where rerunning a reading as a dye terminator
+reaction will solve a problem. Currently these places are identified by the
+presence of a COMP (compression) or STOP annotation. The function returns no
+value but will generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to search. The @i{@{contig start end@}}
+syntax may be used for an identifier to search only a region of the
+contig, otherwise all of it is searched.
+
+ at sp 1
+ at vindex -avg_len
+ at item @code{-avg_len} length
+
+This specifies the expected length achieved by a terminator reading. This is
+used to determine which readings are suitable for rerunning and for the amount
+of coverage available. The default is 350 base pairs.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-find_internal_joins
+ at unnumberedsec find_internal_joins
+ at findex find_internal_joins(T)
+ at cindex Joins, finding
+ at cindex Find internal joins
+
+ at example
+ at group
+ at exdent @code{find_internal_joins}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+?-mode          @i{mode:string(end_all)}?
+?-segment       @i{identifier:string()}?
+?-min_match     @i{length:integer(15)}?
+?-max_pads      @i{count:integer(25)}?
+?-max_pmismatch @i{percentage:float(30.0)}?
+?-win_size      @i{length:integer(0)}?
+?-max_dashes    @i{count:integer(0)}?
+?-probe_length  @i{length:integer(100)}?
+?-mask          @i{mask:string(none)}?
+?-tag_types     @i{types:string()}?
+ at end group
+ at end example
+
+This command searches for potential joins between contigs by comparing the
+sequence data in each contig. This information is plotted on the Contig
+Selector. The command will not work if the Contig Selector is not displayed.
+The function returns no value but will generate a Tcl error if an error
+occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to search. The @i{@{contig start end@}}
+syntax may be used for an identifier to search only a region of the
+contig, otherwise all of it is searched.
+
+ at sp 1
+ at item @code{-mode} mode
+
+This specifies the segments of contigs in which to search for joins.
+Valid @i{mode}s are:
+
+ at table @code
+ at item end_end
+Compares only the ends of each contig with the ends of other contigs.
+ at item end_all
+Compares the ends of each contig with the entirety of other contigs.
+ at item all_all
+Compares all of each contig with all of the other contigs.
+ at item segment
+Compares a segment of a particular contig with all of the other contigs.
+ at end table
+
+The default mode is "@code{end_all}".
+
+ at sp 1
+ at item @code{-segment} identifier
+
+When @i{mode} is "@code{segment}" this specifies the region of the contig
+identifier to compare. The default is blank.
+
+ at sp 1
+ at item @code{-min_match} length
+
+Specifies the minimum length of exact match used during the hashing stage of
+find internal joins. The minimum allowed value for this is 14. The default is
+15.
+
+ at sp 1
+ at item @code{-max_pads} count
+
+After alignments the number of pads required in each of the two consensus
+sequences must be less than or equal to @i{count}. The default is 25.
+
+ at sp 1
+ at item @code{-max_pmismatch} percentage
+
+After alignments the percentage of bases that do not match must be less than
+or equal to @i{percentage}. This is a floating point value. The default is
+30.0.
+
+ at sp 1
+ at item @code{-win_size} length
+ at itemx @code{-max_dashes} count
+
+If these are both set to non-zero values the cutoff data will be searched for
+matches. The amount of cutoff sequence to align is the portion in which no
+more than @i{count} unknown ("-") bases occur within a region of
+ at i{length} bases. The defaults are 0 for both, which implies only used
+data should be searched.
+
+ at sp 1
+ at item @code{-mask} mask
+ at itemx @code{-tag_types} types
+
+If @i{types} is a non-blank list of tag types then masking or marking will
+be applied to the sequence covered by tags of these types. When
+ at i{mask} is "@code{mark}" the sequence is converted to an alternative
+character set so that matches will be found, but are clearly visible in the
+output as being matches between marked fragments. When @i{mask} is
+"@code{mask}" the sequence is removed so that no matches will be initiated
+between this sequence and another fragment. The defaults are
+"@code{none}" for @i{mask} and a blank string for the tag types, which
+disables masking and marking.
+ at end table
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-get_consensus
+ at unnumberedsec get_consensus
+ at findex get_consensus(T)
+ at cindex Consensus calculation
+
+ at example
+ at group
+ at exdent @code{get_consensus}
+ -io            @i{io_handle}
+ -contigs       @i{identifiers:strings}
+ -outfile       @i{filename:string}
+?-type          @i{type:string(normal)}?
+?-mask          @i{mask:string(none)}?
+?-tag_types     @i{types:string()}?
+?-win_size      @i{length:integer(0)}?
+?-max_dashes    @i{count:integer(0)}?
+?-format        @i{format:integer(3)}?
+?-annotations   @i{annotations:integer(0)}?
+?-truncate      @i{truncate:integer(0)}?
+ at end group
+ at end example
+
+This command calculates the consensus sequence for one or more contigs and
+saves it to a file. The function returns no value but will generate a Tcl
+error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to search. The @i{@{contig start end@}}
+syntax may be used for an identifier to search only a region of the
+contig, otherwise all of it is searched.
+
+ at sp 1
+ at item @code{-outfile} filename
+
+Specifies the filename to write the consensus sequence to. This has no
+default value.
+
+ at sp 1
+ at item @code{-type} type
+
+This specifies the final output type for the consensus algorithm. Valid
+ at i{type}s are:
+
+ at table @code
+ at item normal
+The standard consensus sequence consisting of A, C, G, T, - and *.
+
+ at item extended
+As per @code{normal}, except the cutoff data at the ends of contigs is used to
+provide consensus sequence beyond the well defined contig ends.
+
+ at item unfinished
+The consensus sequence in single stranded regions is output as a, c, g and
+t whilst the consensus for finished regions is listed as d, e, f and i (for
+a, c, g and t respectively).  The quality of each base is output instead
+of the consensus base. The base quality is listed as a single letter from
+the following table showing the quality of each strand independently.
+
+ at cindex Quality codes
+ at table @var
+ at item a
+ at kbd{Good Good (in agreement)}
+ at item b
+ at kbd{Good Bad}
+ at item c
+ at kbd{Bad  Good}
+ at item d
+ at kbd{Good None}
+ at item e
+ at kbd{None Good}
+ at item f
+ at kbd{Bad  Bad}
+ at item g
+ at kbd{Bad  None}
+ at item h
+ at kbd{None Bad}
+ at item i
+ at kbd{Good Good (disagree)}
+ at item j
+ at kbd{None None}
+ at end table
+ at end table
+
+ at sp 1
+ at item @code{-win_size} length
+ at itemx @code{-max_dashes} count
+
+These are only of use during the @i{extended} consensus type.  The amount
+of cutoff sequence to output is the portion in which no more than
+ at i{count} unknown ("-") bases are found within a region of @i{length}
+bases. The defaults are 0 for both, which implies that only used data
+should be output.
+
+ at sp 1
+ at item @code{-format} format
+
+Specifies the output format of the file to be created. All formats can be
+written for all consensus types, but some may not be legal (eg Fasta files
+containing quality codes instead of sequence). The available formats are:
+
+ at cindex Format codes
+ at table @code
+ at item 1
+Staden format
+ at item 2
+Fasta format
+ at item 3
+Experiment File format
+ at end table
+
+The default is 3.
+
+ at sp 1
+ at item @code{-annotations} annotations
+
+This controls whether to output annotations. This is only of use in the
+Experiment File output format. Note that with the @i{extended} consensus type
+the annotation positions are still for the @i{normal} consensus; this is a bug
+which will only be fixed if it is considered useful. A non-zero value will
+output annotations. The default is 0, which is to not output annotations.
+
+ at sp 1
+ at item @code{-truncate} truncate
+
+This controls whether annotations within or overlapping the cutoff data will
+be output. A non-zero value will not output annotations within the cutoff
+data. The default is 0.
+
+ at sp 1
+ at item @code{-mask} mask
+ at itemx @code{-tag_types} types
+
+If @i{types} is a non-blank list of tag types then masking or marking will
+be applied to the sequence covered by tags of these types. When
+ at i{mask} is "@code{mask}" the sequence is converted to an alternative
+character set (@var{d}, @var{e}, @var{f} and @var{i} for Experiment Files
+and Staden format and @var{n}s for Fasta format). When @i{mask} is
+"@code{mark}" the sequence is in lowercase. The defaults are
+"@code{none}" for @i{mask} and a blank string for the tag types, which
+disables masking and marking. Masking and marking are only used in the
+ at i{normal} and @i{extended} consensus types.
+ at end table
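+
+When a file of quality codes is read back into a script, the single-letter
+codes tabulated above can be decoded with a simple lookup. The following is
+a minimal sketch in plain Tcl (8.5 or later assumed). The names
+"quality_codes", "decode_quality" and "problem_positions" are hypothetical,
+the sample string is invented, and the two columns are simply called first
+and second because the table does not name the strands.

```tcl
# Strand-quality codes as tabulated above: code -> {first second note}.
set quality_codes {
    a {Good Good agree}
    b {Good Bad  {}}
    c {Bad  Good {}}
    d {Good None {}}
    e {None Good {}}
    f {Bad  Bad  {}}
    g {Bad  None {}}
    h {None Bad  {}}
    i {Good Good disagree}
    j {None None {}}
}

# Decode a string of codes into a list of {first second} quality pairs.
proc decode_quality {codes} {
    global quality_codes
    set out {}
    foreach ch [split $codes ""] {
        if {![dict exists $quality_codes $ch]} {
            error "unknown quality code '$ch'"
        }
        lappend out [lrange [dict get $quality_codes $ch] 0 1]
    }
    return $out
}

# 1-based positions at which anything other than code "a" was reported.
proc problem_positions {codes} {
    set pos {}
    set i 1
    foreach ch [split $codes ""] {
        if {$ch ne "a"} {
            lappend pos $i
        }
        incr i
    }
    return $pos
}

# Hypothetical quality string, made up for illustration only.
set sample "aadbja"
puts [decode_quality $sample]
puts [problem_positions $sample]
```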
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-join_contig
+ at unnumberedsec join_contig
+ at findex join_contig(T)
+ at cindex Contig, joining
+ at cindex Join contigs
+
+ at example
+ at group
+ at exdent @code{join_contig}
+ -io            @i{io_handle:integer}
+ -contig1       @i{identifier:string}
+?-reading1      @i{identifier:string()}?
+?-pos1          @i{position:integer(1)}?
+ -contig2       @i{identifier:string}
+?-reading2      @i{identifier:string()}?
+?-pos2          @i{position:integer(1)}?
+ at end group
+ at end example
+
+This command brings up a join editor display. The display consists of two
+contig editors, one above the other. The function returns no value but will
+generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contig1} identifier
+ at itemx @code{-contig2} identifier
+
+These specify the contigs to join.
+
+ at sp 1
+ at item @code{-reading1} identifier
+ at itemx @code{-reading2} identifier
+ at itemx @code{-pos1} position
+ at itemx @code{-pos2} position
+
+By default the editors start with the display and cursor at the left end of
+the consensus. Use these options to specify a different reading and position.
+The position is relative to the start of the specified reading. To start the
+editors at specific positions in the consensus sequence use only
+ at code{-pos}@i{n}.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-minimal_coverage
+ at unnumberedsec minimal_coverage
+ at findex minimal_coverage(T)
+ at cindex Contig coverage
+ at cindex Minimal coverage
+
+ at example
+ at group
+ at exdent @code{minimal_coverage}
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+ at end group
+ at end example
+
+This command produces a list of readings that, between them, cover the full
+length of the contig. The algorithm may not produce the optimum set of
+readings, but the result is at least close to optimum. The command returns the
+minimal list of readings.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+This specifies the list of contigs to use. Only the contig identifier is
+currently used, although the syntax specifying start and end ranges is valid.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-pre_assemble
+ at unnumberedsec pre_assemble
+ at findex pre_assemble(T)
+ at cindex Assembly, preassembled data
+
+ at example
+ at group
+ at exdent @code{pre_assemble}
+ -io            @i{io_handle:integer}
+ -files         @i{filenames:strings}
+ at end group
+ at end example
+
+This command performs the Gap4 "Enter Preassembled Data" function. It
+assembles data that contains the PC, SE, ON and AV Experiment File line types
+to specify exactly the position data. This is superseded by the
+assemble_direct command and should no longer be used. The function returns no
+value but will generate a Tcl error if an error occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-files} filenames
+
+ at i{Filenames} must contain a Tcl list of files to assemble.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-shift_readings
+ at unnumberedsec shift_readings
+ at findex shift_readings(T)
+ at cindex Readings, shifting
+ at cindex Shift readings
+
+ at example
+ at group
+ at exdent @code{shift_readings}
+ -io            @i{io_handle:integer}
+ -readings      @i{identifiers:integers}
+ -distances     @i{distances:integers}
+ at end group
+ at end example
+
+This command shifts all readings to the right of (and including) a specified
+reading by a particular amount to the left or right. It is mainly for manually
+manipulating the database structures to join contigs. Use is not recommended.
+The function returns no value but will generate a Tcl error if an error
+occurs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at item @code{-readings} readings
+ at itemx @code{-distances} distances
+
+For each reading listed in the @i{readings} argument, that reading and all
+those to the right of it are shifted by the corresponding element in the
+ at i{distances} list. Positive distances shift right; negative distances shift
+left.
+ at end table
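+
+The pairing of the two lists can be pictured with a small model in plain Tcl.
+This is only a sketch of the semantics described above, not the Gap4
+implementation: it does not touch a database, the layout is a hypothetical
+flat list of reading names and positions ordered from left to right (the
+real command takes reading numbers), and it assumes that the shifts are
+applied one after another in list order.

```tcl
# Shift $reading and every reading to its right by $distance.
# $layout is a flat list: name position name position ...
proc shift_model {layout reading distance} {
    set out {}
    set shifting 0
    foreach {name pos} $layout {
        if {$name eq $reading} {
            set shifting 1
        }
        if {$shifting} {
            set pos [expr {$pos + $distance}]
        }
        lappend out $name $pos
    }
    if {!$shifting} {
        error "reading '$reading' not in layout"
    }
    return $out
}

# Apply one shift per (reading, distance) pair, in order.
proc shift_many {layout readings distances} {
    if {[llength $readings] != [llength $distances]} {
        error "readings and distances must have the same length"
    }
    foreach r $readings d $distances {
        set layout [shift_model $layout $r $d]
    }
    return $layout
}

# Hypothetical layout, made up for illustration only.
set layout {r1 1 r2 120 r3 300 r4 450}
puts [shift_many $layout {r2 r4} {10 -5}]
```

+Here the positive distance moves r2, r3 and r4 to the right by 10, and the
+negative distance then moves r4 back to the left by 5.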
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-show_relationships
+ at unnumberedsec show_relationships
+ at findex show_relationships(T)
+ at cindex Contig, relationship lists
+ at cindex Show relationships
+
+ at example
+ at group
+ at exdent @code{show_relationships}
+ -io            @i{io_handle:integer}
+?-contigs       @i{identifiers:strings()}?
+?-order         @i{order:integer(1)}?
+ at end group
+ at end example
+
+This command performs the Gap4 Show Relationships function. The function
+returns no value but will generate a Tcl error if an error occurs.
+
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-contigs} identifiers
+
+Specifies single segments of contigs for which to display the relationships
+data. In the current implementation only the first contig (and start and
+end position) in the identifier list is processed. Not specifying any
+contigs (which is the default) will make show_relationships process all
+contigs.
+
+ at sp 1
+ at item @code{-order} order
+
+Controls whether the output should be sorted on positional order or reading
+number order. This has no effect when @code{-contigs} is used. An @i{order}
+of 1 specifies that the output will list each contig in turn together with
+the readings within that contig listed in positional order. An @i{order} of
+0 lists all contig records first followed by all readings in contig and
+reading number order. The default is 1.
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-unattached_readings
+ at unnumberedsec unattached_readings
+ at findex unattached_readings(T)
+ at cindex Readings, finding unattached
+ at cindex Unattached readings
+
+ at example
+ at group
+ at exdent @code{unattached_readings}
+ -io            @i{io_handle:integer}
+ at end group
+ at end example
+
+This command produces a list of the contigs which are single readings. The
+command returns a Tcl list of the reading identifiers for these contigs.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+ at end table
diff --git a/scripting_manual/gap4-scripting-intro-t.texi b/scripting_manual/gap4-scripting-intro-t.texi
new file mode 100644
index 0000000..4a1c558
--- /dev/null
+++ b/scripting_manual/gap4-scripting-intro-t.texi
@@ -0,0 +1,69 @@
+ at cindex Gap4
+
+This chapter describes the gap4 scripting language. The language is an
+extension of the Tcl and Tk languages. This manual does not contain
+information on using Tcl and Tk themselves - only our extensions.
+
+For the purpose of consistency, many gap4 commands take identical arguments.
+To simplify the documentation and to remove redundancy these arguments are
+only briefly discussed with each command description. However first we need to
+describe the terminology used throughout this manual.
+
+ at table @i
+ at cindex Reading identifier
+ at vindex identifier
+ at item Reading identifier
+        Used to specify a reading. It can consist of the reading's unique
+        name, a hash followed by its reading number, or if it is at the start
+        of a contig, an equals followed by the contig number.
+        
+        Eg @code{fred.s1}, @code{#12}, or @code{=2}.
+
+ at cindex Contig identifier
+ at item Contig identifier
+        A contig is identified by any reading within it, so all reading
+        identifiers are contig identifiers. However when a contig
+        identifier is displayed by a command it typically chooses the left
+        most reading name. If a contig number is known, simply use
+        @code{=}@i{number} as a contig identifier.
+ at end table
+
+ at noindent Common arguments:
+ at table @asis
+ at vindex -contigs
+ at item @code{-contigs} @i{contig_list}
+        @i{Contig_list} is a Tcl list of contig identifiers. If an item in the
+        list is itself a list, then the first element of the list is the
+        identifier and the second and third elements specify a range
+        within that contig.
+
+        Eg @code{-contigs @{read1 @{read5 1000 2000@} read6@}}
+
+ at vindex -readings
+ at item @code{-readings} @i{reading_list}
+        @i{Reading_list} is a Tcl list of reading identifiers.
+
+        Eg @code{-readings @{read1 read2@}}
+
+ at vindex -contig
+ at item @code{-contig} @i{contig_identifier}
+        Specifies a single contig by an identifier.
+
+ at vindex -reading
+ at item @code{-reading} @i{reading_identifier}
+        Specifies a single reading by an identifier.
+
+ at vindex -cnum
+ at item @code{-cnum} @i{contig_number}
+        Specifies a contig by its number (NB: this is not the same as a reading
+        number within that contig).
+
+ at vindex -rnum
+ at item @code{-rnum} @i{reading_number}
+        Specifies a reading by its number.
+
+ at vindex -io
+ at item @code{-io} @i{io_handle}
+        Specifies an IO handle by a numerical value as returned from a
+        previous @code{open_db} command.
+ at end table
diff --git a/scripting_manual/gap4-scripting-io-t.texi b/scripting_manual/gap4-scripting-io-t.texi
new file mode 100644
index 0000000..d1ecb5b
--- /dev/null
+++ b/scripting_manual/gap4-scripting-io-t.texi
@@ -0,0 +1,502 @@
+ at cindex Gap4 IO
+ at cindex Gap4 database access
+ at cindex IO, gap4 database access
+ at cindex Database access to gap4
+
+ at menu
+* Script-Intro::        Introduction
+* Script-IO Basics::    IO Primitives
+* Script-IO Commands::  Gap-level IO Commands
+ at end menu
+
+ at c @split{}
+ at node Script-Intro
+ at section Introduction
+
+FIXME: Add intro here
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-IO Basics
+ at section IO Primitives
+ at cindex IO Primitives
+
+ at menu
+* Script-io_rw_text::         io_read_text and io_write_text
+* Script-io_rw_data::         io_read_data and io_write_data
+* Script-flush::              Flushing data
+ at end menu
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_text
+ at subsection io_read_text and io_write_text
+ at findex io_read_text(T)
+ at findex io_write_text(T)
+ at cindex text reading
+ at cindex text writing
+ at cindex reading text records
+ at cindex writing text records
+
+The database structures typically contain record numbers of text strings
+rather than copies of the strings themselves, as this easily allows resizing
+of the strings.
+
+ at table @asis
+ at item @code{io_read_text} @i{io} @i{record_number}
+Reads the text from @i{record_number} and returns it. Results in a Tcl error
+if it fails.
+ at sp 1
+ at item @code{io_write_text} @i{io} @i{record_number} @i{text}
+Writes @i{text} to the requested @i{record_number}. Returns 0 for success, -1
+for failure.
+ at end table
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_data
+ at subsection io_read_data and io_write_data
+ at findex io_read_data(T)
+ at findex io_write_data(T)
+ at cindex data reading
+ at cindex data writing
+ at cindex reading data records
+ at cindex writing data records
+
+These functions are for reading and writing the binary data in the database.
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-flush
+ at subsection Flushing data
+ at findex io_flush(T)
+ at findex flush2t(C)
+ at vindex gap_auto_flush(T)
+ at cindex flushing data
+
+When updating the database information it is often necessary to perform
+several edits. Initially we assume that the database is consistent and
+correct. After updating the database we also wish for the database to be
+consistent and correct. However during update this may not be true.
+
+Consider the case of adding a new reading to the end of a contig. We need to
+write the new reading with its left neighbour set to the original last
+reading in the contig; then we need to update the original last reading's
+right neighbour to reference our new last reading; and finally we need to
+update the contig information. During this operation the database is
+inconsistent, so should the program or system terminate unexpectedly we wish
+to revert to the earlier consistent state.
+
+This is performed by use of controlled flushing. The database internally
+maintains a time stamp of the last flushed state. When we open a database that
+contains data written after the last flush we ignore the new data and use the
+data written at the last flush.
+
+ at table @asis
+ at item @code{io_flush} @i{io}
+A Tcl function to flush the data stored in the database referenced by @i{io}.
+Always returns success.
+ at sp 1
+ at item @code{void flush2t(}@i{GapIO *io}@code{)}
+A C function to flush the database. Void return type.
+ at sp 1
+ at item @code{gap_auto_flush}
+A variable to control whether the Tcl level io write commands (e.g.
+ at code{io_write_reading}, @code{io_write_reading_name}, @code{io_add_reading}
+and @code{io_allocate}) automatically flush after performing the write. Note
+the consequence of enabling this: a flush may then occur between related
+writes, while the database is still inconsistent. By default this is set to 0,
+which disables automatic flush. A non-zero value enables automatic flush.
+ at end table
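+
+As a rough illustration of controlled flushing, the sketch below renames two
+readings and flushes once, so that an unexpected exit before the flush should
+leave neither change visible when the database is next opened. It assumes
+that @code{$io} is a read-write handle from @code{open_db}, that
+ at code{gap_auto_flush} is a global Tcl variable, and that readings 1 and 2
+exist; the new names are made up.
+
+ at example
+set gap_auto_flush 0                      ;# the default: no flush per write
+io_write_reading_name $io 1 new_name.s1
+io_write_reading_name $io 2 new_name.s2
+io_flush $io                              ;# commit both changes together
+ at end example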
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-IO Commands
+ at section Low-level IO Commands
+
+ at menu
+* Script-open_close::         Opening, Closing and Copying Databases
+
+* Script-io_rw_database::     io_read_database and io_write_database
+* Script-io_rw_reading::      io_read_reading and io_write_reading
+* Script-io_rw_contig::       io_read_contig and io_write_contig
+* Script-io_rw_annotation::   io_read_annotation and io_write_annotation
+* Script-io_rw_vector::       io_read_vector and io_write_vector
+* Script-io_rw_template::     io_read_template and io_write_template
+* Script-io_rw_clone::        io_read_clone and io_write_clone
+
+* Script-io_rw_reading_name:: io_read_reading_name and io_write_reading_name
+* Script-io_add::             io_add_* commands and io_allocate
+ at end menu
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-open_close
+ at subsection Opening, Closing and Copying Databases
+
+Before any database accessing can take place the gap4 database must be opened.
+This is done using the @code{open_db} call. This returns an @i{io} handle
+which should be passed to all other functions accessing the database.
+
+ at sp 1
+ at findex open_db(T)
+ at cindex opening gap4 databases from Tcl
+ at example
+ at group
+ at exdent @code{open_db}
+ -name          @i{database_name}
+?-version       @i{version}?
+?-create        @i{boolean}?
+?-access        @i{access_mode}?
+ at end group
+ at end example
+
+This opens a database named @i{database_name}. The actual files used will be
+ at i{database_name.version}. The routine is used for both creating a new
+database and opening an existing database. The value returned is the io handle
+of the opened database. More than one database may be opened at one time.
+
+ at table @asis
+ at item -name @i{database_name}
+Specifies the database name. The name is the start component of the two
+filenames used for storing the database and so is the section up to, but not
+including, the full stop. This is not an optional argument.
+ at sp 1
+ at item -version @i{version}
+This optional parameter specifies the database version. The version is
+the single character after the full stop in the UNIX database filenames. It is
+expected to be a single character. The default value (as used for newly
+created databases) is "0".
+ at sp 1
+ at item -create @i{boolean}
+Whether to open an existing database (-create 0) or a new database (-create
+1). The default is 0, which opens an existing database.
+ at sp 1
+ at item -access @i{access_mode}
+ at vindex read_only(T)
+The @i{access_mode} specifies whether the database is to be opened in
+read-only mode or read-write mode. Valid arguments are "r", "READONLY", "rw"
+and "WRITE". If a database is opened in "rw" or "WRITE" mode a BUSY file will
+be created. If the BUSY file already exists then the database is opened in
+read-only mode instead. Either way, the @code{read_only} Tcl variable is set
+to 0 for read-write and 1 for read-only mode.
+ at end table
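+
+A minimal usage sketch follows. It assumes that a database stored as
+ at code{TEST.0} exists in the current directory (the name is hypothetical) and
+that @code{read_only} is visible as a global variable at the point of use.
+
+ at example
+set io [open_db -name TEST -version 0 -access rw]
+if @{$read_only@} @{
+    puts "TEST.0 is busy; it was opened read-only instead"
+@}
+ at end example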
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+_rule
+ at findex close_db(T)
+ at cindex closing gap4 databases from Tcl
+ at example
+ at group
+ at exdent @code{close_db}
+ -io            @i{io_handle}
+ at end group
+ at end example
+
+This closes a previously opened database. Returns nothing, but produces a Tcl
+error for failure.
+
+ at table @asis
+ at item -io @i{io}
+Specifies which database to close. The @i{io} is the io handle returned from a
+previous @code{open_db} call. Attempting to close databases that have not been
+opened will lead to undefined results.
+ at end table
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+_rule
+ at findex copy_db(T)
+ at cindex copying gap4 databases from Tcl
+ at example
+ at group
+ at exdent @code{copy_db}
+ -io            @i{io_handle}
+ -version       @i{version}
+?-collect       @i{boolean}?
+ at end group
+ at end example
+
+This command copies a currently open database to a new version number. The
+currently opened database is not modified. After copying, @i{io_handle} still
+refers to the original database.
+
+ at table @asis
+ at item -io @i{io}
+Specifies which database to copy. The @i{io} is the io handle returned from a
+previous @code{open_db} call. Attempting to copy databases that have not been
+opened will lead to undefined results.
+ at sp 1
+ at item -version @i{version}
+This parameter specifies the version of the new database that receives the copy.
+ at sp 1
+ at item -collect @i{boolean}
+This optional parameter specifies whether to perform garbage collection when
+copying the file. A value of 0 means no garbage collection: the two database
+files are simply copied byte by byte. A non-zero value will read and write
+each reading, contig, etc. in turn to the new database, thus resolving any
+database fragmentation. The default value is "0".
+ at end table
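+
+For example, a defragmented copy of an open database could be written out
+before closing it. This is only a sketch: it assumes that @code{$io} came from
+ at code{open_db}, and the choice of version @code{1} for the copy is arbitrary.
+
+ at example
+copy_db -io $io -version 1 -collect 1   ;# $io still refers to the original
+close_db -io $io
+ at end example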
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_database
+ at subsection io_read_database
+ at findex io_read_database(T)
+ at cindex database structure, Tcl io
+
+The database structure holds information that is relevant to the entire
+project rather than on a per reading, per contig or per 'whatever' basis.
+Among other things it keeps track of the amount of information stored.
+
+ at table @asis
+ at item @code{io_read_database} @i{io}
+Reads the database structure from a specified @i{io} number and stores it in a
+keyed list. Returns the structure as a keyed list when successful, or a blank
+string for failure.
+ at sp 1
+ at item @code{io_write_database} @i{io} @i{keyed_list_contents}
+Writes the database structure stored in the @i{keyed_list} to a specified
+ at i{io} number. Returns 0 for success, -1 for failure.
+ at end table
+
+For a description of the database structure, see (FIXME) "The GDatabase
+Structure".
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_reading
+ at subsection io_read_reading
+ at findex io_read_reading(T)
+ at cindex reading structure, Tcl access
+
+The reading structure holds the primary information stored for each sequence.
+It references several other structures by their numbers into their own
+structure index. The reading structures also contain references to other
+reading structures. This is done by use of a doubly linked list ("left" and
+"right" fields), sorted on ascending position within the contig.
+
+ at table @asis
+ at item @code{io_read_reading} @i{io} @i{reading_number}
+Reads a reading structure from a specified @i{io} number and stores it in
+a keyed list.
+ at sp 1
+ at item @code{io_write_reading} @i{io} @i{reading_number} @i{keyed_list_contents}
+Writes a reading structure stored in the @i{keyed_list} to a specified @i{io}
+number. Returns 0 for success, -1 for failure.
+ at end table
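+
+The linked list can be walked from Tcl. The sketch below prints the names of
+all readings in a contig from left to right. It rests on several assumptions:
+that the keyed lists are TclX-style (so that @code{keylget} is available),
+that a @code{right} value of 0 marks the end of the chain, and that
+ at code{fred.s1} (a made-up name) is a reading in the contig of interest.
+
+ at example
+set rnum [db_info chain_left $io fred.s1]   ;# -1 if the reading is unknown
+while @{$rnum > 0@} @{
+    puts [io_read_reading_name $io $rnum]
+    set r [io_read_reading $io $rnum]
+    set rnum [keylget r right]              ;# assumed: 0 at the right end
+@}
+ at end example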
+
+For a description of the reading structure, see (FIXME) "The GReadings
+Structure".
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_contig
+ at subsection io_read_contig
+ at findex io_read_contig(T)
+ at cindex contig structure, Tcl access
+
+The contig structure holds simple information about each contiguous stretch of
+sequence. The actual contents and sequence of the contig is held within the
+readings structures, including the relative positioning of each sequence.
+
+ at table @asis
+ at item @code{io_read_contig} @i{io} @i{contig_number}
+Reads a contig structure from a specified @i{io} number and stores it in
+a keyed list.
+ at sp 1
+ at item @code{io_write_contig} @i{io} @i{contig_number} @i{keyed_list_contents}
+Writes a contig structure stored in the @i{keyed_list} to a specified @i{io}
+number. Returns 0 for success, -1 for failure.
+ at end table
+
+For a description of the contig structure, see (FIXME) "The GContigs
+Structure".
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_annotation
+ at subsection io_read_annotation
+ at findex io_read_annotation(T)
+ at cindex annotations structure, Tcl access
+ at cindex tag structure, Tcl access
+
+Annotations, also known as tags, are general comments attached to segments of
+sequences (either real readings or the consensus). They form a singly linked
+list by use of the "next" field. The annotations must be sorted in ascending
+order.
+
+ at table @asis
+ at item @code{io_read_annotation} @i{io} @i{annotation_number}
+Reads an annotation structure from a specified @i{io} number and stores it in
+a keyed list.
+ at sp 1
+ at item @code{io_write_annotation} @i{io} @i{annotation_number}
+ at i{keyed_list_contents}
+Writes an annotation structure stored in the @i{keyed_list} to a specified
+ at i{io} number. Returns 0 for success, -1 for failure.
+ at end table
+
+For a description of the annotation structure, see (FIXME) "The GAnnotations
+Structure".
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_vector
+ at subsection io_read_vector
+ at findex io_read_vector(T)
+ at cindex vector structure, Tcl access
+
+The vector structure holds information on the vectors (one structure per
+vector) used at all stages of cloning and subcloning, for example both the
+m13mp18 and pYAC4 vectors.
+
+ at table @asis
+ at item @code{io_read_vector} @i{io} @i{vector_number}
+Reads a vector structure from a specified @i{io} number and stores it in
+a keyed list.
+ at sp 1
+ at item @code{io_write_vector} @i{io} @i{vector_number} @i{keyed_list_contents}
+Writes a vector structure stored in the @i{keyed_list} to a specified @i{io}
+number. Returns 0 for success, -1 for failure.
+ at end table
+
+For a description of the vector structure, see (FIXME) "The GVectors
+Structure".
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_template
+ at subsection io_read_template
+ at findex io_read_template(T)
+ at cindex template structure, Tcl access
+
+The template is the final piece of material used for the readings. So if we
+sequenced the insert from both ends then we would expect to have two reading
+structures referencing this template structure.
+
+ at table @asis
+ at item @code{io_read_template} @i{io} @i{template_number}
+Reads a template structure from a specified @i{io} number and stores it in
+a keyed list.
+ at sp 1
+ at item @code{io_write_template} @i{io} @i{template_number} @i{keyed_list_contents}
+Writes a template structure stored in the @i{keyed_list} to a specified @i{io}
+number. Returns 0 for success, -1 for failure.
+ at end table
+
+For a description of the template structure, see (FIXME) "The GTemplates
+Structure".
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_clone
+ at subsection io_read_clone
+ at findex io_read_clone(T)
+ at cindex clone structure, Tcl access
+
+The clone is the material that our templates were derived from.
+Typically the clone name is used as the database name. Example vectors are
+cosmid, YAC or BAC vectors.
+
+ at table @asis
+ at item @code{io_read_clone} @i{io} @i{clone_number}
+Reads a clone structure from a specified @i{io} number and stores it in
+a keyed list.
+ at sp 1
+ at item @code{io_write_clone} @i{io} @i{clone_number} @i{keyed_list_contents}
+Writes a clone structure stored in the @i{keyed_list} to a specified @i{io}
+number. Returns 0 for success, -1 for failure.
+ at end table
+
+For a description of the clone structure, see (FIXME) "The GClones
+Structure".
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_rw_reading_name
+ at subsection io_read_reading_name and io_write_reading_name
+ at findex io_read_reading_name(T)
+ at findex io_write_reading_name(T)
+ at cindex reading name, Tcl access
+
+When accessing the reading name record referenced by the reading structure,
+special purpose functions must be used. The reading names are cached in
+memory once a database is opened. This speeds up accesses, but requires
+different IO functions. Note that using @code{io_write_text} to update a
+reading name will invalidate the cache and cause bugs.
+
+ at table @asis
+ at item @code{io_read_reading_name} @i{io} @i{reading_number}
+Returns the reading name for reading @i{reading_number}.
+ at sp 1
+ at item @code{io_write_reading_name} @i{io} @i{reading_number} @i{name}
+Writes the new reading @i{name} for reading @i{reading_number}. Assuming
+correct syntax, this always returns success.
+ at end table
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node Script-io_add
+ at subsection io_add_* commands and io_allocate
+ at cindex allocation of structures from Tcl
+ at cindex structure allocation from Tcl
+ at cindex reading allocation, Tcl
+ at cindex contig allocation, Tcl
+ at cindex annotation allocation, Tcl
+ at cindex template allocation, Tcl
+ at cindex vector allocation, Tcl
+ at cindex clone allocation, Tcl
+
+A set of Tcl functions exists for allocating new gap4 database structures.
+Each function allocates the next sequentially numbered structure. In the case
+of annotations it is preferable to reuse items stored on the annotation free
+list before allocating new structures.
+
+ at table @asis
+ at findex io_add_reading(T)
+ at item @code{io_add_reading} @i{io}
+Creates a new reading numbered @code{NumReadings(io)+1}. The name, trace_name
+and trace_type fields are all allocated and written as "uninitialised". No
+other items are allocated and all other fields are set to 0. The database
+num_readings and Nreadings fields are also updated.  Returns the new reading
+number.
+ at sp 1
+ at findex io_add_contig(T)
+ at item @code{io_add_contig} @i{io}
+Creates a new contig numbered @code{NumContigs(io)+1}. The contig structure
+fields are all set to 0. The contig_order array is updated with the new contig
+as the last (right most) one. The database num_contigs and Ncontigs fields are
+also updated. Returns the new contig number.
+ at sp 1
+ at findex io_add_annotation(T)
+ at item @code{io_add_annotation} @i{io}
+Creates a new annotation. The structure fields are initialised to 0. The database
+Nannotations field is also updated. Returns the new annotation number.
+ at sp 1
+ at findex io_add_template(T)
+ at item @code{io_add_template} @i{io}
+Creates a new template. The template name is allocated and set to
+"uninitialised"; strand is set to 1; vector is set to the "unknown" vector (1)
+if present, otherwise a new blank vector is created and referenced; and the
+clone, insert_size_min and insert_size_max fields are set to 0. The database
+Ntemplates field is also updated. Returns the new template number.
+ at sp 1
+ at findex io_add_vector(T)
+ at item @code{io_add_vector} @i{io}
+Creates a new vector. The vector name is allocated and set to "uninitialised".
+The level is set to 0. The database Nvectors field is also updated. Returns
+the new vector number.
+ at sp 1
+ at findex io_add_clone(T)
+ at item @code{io_add_clone} @i{io}
+Creates a new clone. The clone name is allocated and set to "uninitialised".
+The vector is set to the "unknown" vector (1) if present, otherwise a new
+blank vector is created and referenced. The database Nclones field is also
+updated. Returns the new clone number.
+ at sp 1
+ at findex io_allocate(T)
+ at item @code{io_allocate} @i{io} @i{type}
+Allocates a new record of the specified @i{type}. Currently only the
+ at code{text} type is supported. The new record number is returned.
+ at end table
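+
+For instance, a new text record might be allocated and filled in as follows.
+This is a sketch only: it assumes that @code{$io} is a read-write handle from
+ at code{open_db}, and the text written is arbitrary.
+
+ at example
+set rec [io_allocate $io text]
+if @{[io_write_text $io $rec "some free text"] == -1@} @{
+    error "could not write text record $rec"
+@}
+io_flush $io
+puts [io_read_text $io $rec]
+ at end example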
diff --git a/scripting_manual/gap4-scripting-util-t.texi b/scripting_manual/gap4-scripting-util-t.texi
new file mode 100644
index 0000000..53319d9
--- /dev/null
+++ b/scripting_manual/gap4-scripting-util-t.texi
@@ -0,0 +1,316 @@
+ at cindex Gap4 utility commands
+
+ at menu
+* G4Comm-db_info::              db_info
+* G4Comm-edid_to_editor::       edid_to_editor
+* G4Comm-add_tags::             add_tags
+* G4Comm-get_read_names::       get_read_names
+ at end menu
+
+ at split{}
+ at node G4Comm-db_info
+ at unnumberedsec db_info
+ at findex db_info(T)
+ at cindex database information from Tcl
+ at cindex num_readings, db_info command
+ at cindex num_contigs, db_info command
+ at cindex t_contig_length, db_info command
+ at cindex contig, total length
+ at cindex get_read_num, db_info command
+ at cindex get_contig_num, db_info command
+ at cindex chain_left, db_info command
+ at cindex longest_contig, db_info
+ at cindex contig, finding the longest
+ at cindex db_name, db_info command
+ at cindex database name, db_info command
+
+ at noindent
+ at code{db_info} @code{num_readings} @i{io}
+
+This command returns the number of readings in the database.
+
+ at sp 1
+ at noindent
+ at code{db_info} @code{num_contigs} @i{io}
+
+This command returns the number of contigs in the database.
+
+ at sp 1
+ at noindent
+ at code{db_info} @code{t_contig_length} @i{io}
+
+This command returns the total number of characters in the consensus for all
+contigs.
+
+ at sp 1
+ at noindent
+ at code{db_info} @code{t_read_length} @i{io}
+
+This command returns the total number of bases used in all the readings.
+
+ at sp 1
+ at noindent
+ at code{db_info} @code{get_read_num} @i{io} @i{reading_identifier}
+
+This command returns the reading number (between 1 and @i{num_readings}) for a
+specific reading. For instance, to convert the reading name
+ at code{xb64a10.s1} to its reading number we use:
+
+ at example
+set rnum [db_info get_read_num $io xb64a10.s1]
+ at end example
+
+If the reading name is not found, -1 is returned.
+
+ at sp 1
+ at noindent
+ at code{db_info} @code{get_contig_num} @i{io} @i{reading_identifier}
+
+This command returns the contig number for a specific reading. The
+number returned is the number of the contig structure, not the number of the
+left most reading within this contig. It returns -1 for failure.
+
+ at sp 1
+ at noindent
+ at code{db_info} @code{chain_left} @i{io} @i{reading_identifier}
+
+This command returns the left most reading number within a contig specified by
+the @i{reading_identifier}. It returns -1 for failure.
+
+ at sp 1
+ at noindent
+ at code{db_info} @code{longest_contig} @i{io}
+
+This command returns the contig number (not the left most reading number) of
+the longest contig in the database.
+
+ at sp 1
+ at noindent
+ at code{db_info} @code{db_name} @i{io}
+
+This command returns the name of the database opened with the specified @i{io}
+handle. The name returned includes the version number, so an example result
+would be @code{TEST.0}.
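+
+Several of these queries can be combined into a short summary of an open
+database. The sketch below assumes only that @code{$io} is a handle returned
+by @code{open_db}; the wording of the output is arbitrary.
+
+ at example
+set nc [db_info num_contigs $io]
+set nr [db_info num_readings $io]
+puts "[db_info db_name $io]: $nr readings in $nc contigs"
+puts "total consensus length: [db_info t_contig_length $io]"
+puts "longest contig: =[db_info longest_contig $io]"
+ at end example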
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-edid_to_editor
+ at unnumberedsec edid_to_editor
+ at cindex Editor identifier
+ at findex edid_to_editor(T)
+ at vindex REG_CURSOR_NOTIFY
+
+ at noindent
+ at code{edid_to_editor} @i{editor_id}
+
+This command converts the contig editor identifier number to the Tk pathname
+of the associated Editor widget. The contig editor identifier can be obtained
+from acknowledging (within C) a REG_CURSOR_NOTIFY event.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-add_tags
+ at unnumberedsec add_tags
+ at findex add_tags(T)
+ at cindex Tags, adding
+ at cindex Adding tags
+
+ at example
+ at group
+ at exdent @code{add_tags}
+ -io            @i{io_handle:integer}
+ -tags          @i{tag_list:strings}
+ at end group
+ at end example
+
+This command adds a series of annotations to readings and contigs within the
+database.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at sp 1
+ at item @code{-tags} tag_list
+
+This specifies the list of annotations to add. The format of @i{tag_list} is
+as a Tcl list of tag items, each of the format:
+
+ at i{reading_number tag_type direction start}@code{..}@i{end comment_lines}
+
+If the @i{reading_number} is negative the tag is added to the consensus of the
+contig numbered - at i{reading_number}. The @i{tag_type} should be the four
+character tag type. The @i{direction} should be one of "+", "-" or "=" (both).
+The @i{start} and @i{end} specify the inclusive range of bases the annotation
+covers. These count from 1 in the original orientation of the sequence. The
+ at i{comment_lines} hold the text for the annotation.  Several lines may be
+included.
+ at end table
+
+The following example adds two tags. The first is to reading #12 from
+position 10 to 20 inclusive. The second is to contig #1.
+
+ at example
+set t "@{12 COMM + 10..20 comment@} @{-1 REPT = 22..23 multi-line\ncomments@}"
+add_tags -io $io -tags $t
+ at end example
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-get_read_names
+ at unnumberedsec get_read_names
+ at findex get_read_names(T)
+ at cindex Reading names, getting
+ at cindex Reading identifiers, getting
+
+ at example
+ at group
+ at exdent @code{get_read_names}
+ -io            @i{io_handle:integer}
+?@i{identifier} ...?
+ at end group
+ at end example
+
+This command converts a list of reading identifiers to reading names. The
+identifiers can be either "#number" or the actual read name itself, although
+the command is obviously only useful for the first syntax. The names are
+returned as a Tcl list.
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-contig_order_to_number
+ at unnumberedsec contig_order_to_number
+ at findex contig_order_to_number(T)
+
+ at example
+ at group
+ at exdent contig_order_to_number
+ -io            @i{io_handle:integer}
+ -order         @i{position:integer}
+ at end group
+ at end example
+
+This command converts a contig position number to a contig number. That is, we
+can ask "which is the second contig from the left". The function returns the
+contig number.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at item @code{-order} position
+
+The position of the contig. "1" is the left most contig.
+
+ at end table
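+
+As an illustration, the contigs can be listed in their left to right order by
+combining this command with @code{db_info}. The sketch assumes that
+ at code{$io} is a handle returned by @code{open_db}.
+
+ at example
+set nc [db_info num_contigs $io]
+for @{set pos 1@} @{$pos <= $nc@} @{incr pos@} @{
+    set cnum [contig_order_to_number -io $io -order $pos]
+    puts "position $pos: contig $cnum"
+@}
+ at end example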
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-reset_contig_order
+ at unnumberedsec reset_contig_order
+ at findex reset_contig_order(T)
+
+ at example
+ at group
+ at exdent reset_contig_order
+ -io            @i{io_handle:integer}
+ at end group
+ at end example
+
+This command resets the contig order so that the lowest numbered contig is at
+the left and the highest numbered contig at the right. The new contig order is
+written to disk and the database is flushed.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at end table
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-flush_contig_order
+ at unnumberedsec flush_contig_order
+ at findex flush_contig_order(T)
+
+ at example
+ at group
+ at exdent flush_contig_order
+ -io            @i{io_handle:integer}
+ at end group
+ at end example
+
+This command writes the contig order information to disk and then runs the
+ at code{io_flush} command.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-remove_contig_duplicates
+ at unnumberedsec remove_contig_duplicates
+ at findex remove_contig_duplicates(T)
+
+ at example
+ at group
+ at exdent remove_contig_duplicates
+ -io            @i{io_handle:integer}
+ -contigs       @i{identifiers:strings}
+ at end group
+ at end example
+
+This function removes duplicate contig identifiers from a given list. The
+function takes a list of identifiers (in the usual name or #number fashion)
+and returns a list of the left most reading names in the contigs. If two
+different identifiers for the same contig are given, only one identifier
+is returned.
+
+ at table @var
+ at item @code{-io} io_handle
+
+The database IO handle returned from a previous @code{open_db} call.
+
+ at item @code{-contigs} identifiers
+
+The list of contig identifiers.
+
+ at end table
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node G4Comm-get_tag_array
+ at unnumberedsec get_tag_array
+ at findex get_tag_array(T)
+
+ at example
+ at group
+ at exdent get_tag_array
+ at end group
+ at end example
+
+This function parses the tag databases and returns a Tcl list containing the
+tag information. Each element of the returned list consists of the tag name,
+its type, and the default comment.
+
+For instance, the standard installation returns a list starting with
+"@code{@{comment COMM ?@} @{oligo OLIG @{@}@} @{compression COMP @{@}@} @{stop
+STOP @{@}@}}".
diff --git a/scripting_manual/gap4-t.texi b/scripting_manual/gap4-t.texi
new file mode 100644
index 0000000..3d0a999
--- /dev/null
+++ b/scripting_manual/gap4-t.texi
@@ -0,0 +1,46 @@
+ at menu
+* Gap4s-Introduction::          Introduction
+* Gap4s-IO::                    Low-level IO Access
+* Gap4s-Util::                  Utility Commands
+* Gap4s-Main::                  Main Commands
+* Gap4s-Editor::                The Editor Widget
+* Gap4s-EdNames::               The EdNames Widget
+ at end menu
+
+ at node Gap4s-Introduction
+ at section Introduction
+_include(gap4-scripting-intro-t.texi)
+
+ at page
+ at node Gap4s-IO
+ at section Low-level IO Access
+ at lowersections
+_include(gap4-scripting-io-t.texi)
+ at raisesections
+
+ at page
+ at node Gap4s-Util
+ at section Utility Commands
+ at lowersections
+_include(gap4-scripting-util-t.texi)
+ at raisesections
+
+ at page
+ at node Gap4s-Main
+ at section Main Commands
+ at lowersections
+_include(gap4-scripting-comm-t.texi)
+ at raisesections
+
+ at page
+ at node Gap4s-Editor
+ at section The Editor Widget
+_include(gap4-editor-t.texi)
+
+
+ at page
+ at node Gap4s-EdNames
+ at section The EdNames Widget
+
+This widget is currently undocumented. See the @file{src/gap4/tkEdNames.c}
+file.
diff --git a/scripting_manual/header.m4 b/scripting_manual/header.m4
new file mode 100644
index 0000000..ae3443a
--- /dev/null
+++ b/scripting_manual/header.m4
@@ -0,0 +1,292 @@
+ at c ---------------------------------------------------------------------------
+ at c Experiment with smaller amounts of whitespace between chapters
+ at c and sections.
+ at c ---------------------------------------------------------------------------
+ at tex
+ at set tex
+\global\chapheadingskip = 15pt plus 4pt minus 2pt 
+\global\secheadingskip = 12pt plus 3pt minus 2pt
+\global\subsecheadingskip = 9pt plus 2pt minus 2pt
+ at end tex
+
+ at c ---------------------------------------------------------------------------
+ at c @split{} command
+ at c
+ at c only makes sense for html.
+ at c ---------------------------------------------------------------------------
+ at tex
+\global\def\split{}
+ at end tex
+
+ at c ---------------------------------------------------------------------------
+ at c Experiment with smaller amounts of whitespace between paragraphs in
+ at c the 8.5 by 11 inch `format'.
+ at tex
+\global\parskip 6pt plus 1pt
+ at end tex
+ at c ---------------------------------------------------------------------------
+
+ at c ---------------------------------------------------------------------------
+ at c Magic with comments. m4 can set comment characters to whatever it wants.
+ at c They do not even have to be on one line (but by default the start and end
+ at c characters are "#" and newline).
+ at c
+ at c We `define' new start and end comments: @nm4 and @m4. (Remember as no m4 and
+ at c m4).
+ at c
+ at c m4 will not remove text in comments, it just ignores it. So the comment
+ at c characters themselves need to be harmless to tex. We solve this by creating
+ at c two new tex commands to do nothing.
+ at c ---------------------------------------------------------------------------
+ at tex
+\global\def\m4{}
+\global\def\nm4{}
+ at end tex
+changecom(@nm4,@m4)
+
+ at c ---------------------------------------------------------------------------
+ at c Rename the m4 commands to _commands. This will greatly reduce the chance of
+ at c them occurring in our text by chance.
+ at c ---------------------------------------------------------------------------
+define(`_define',defn(`define'))
+define(`_changecom',defn(`changecom'))
+define(`_changequote',defn(`changequote'))
+define(`_errprint',defn(`errprint'))
+define(`_maketemp',defn(`maketemp'))
+define(`_sinclude',defn(`sinclude'))
+define(`_translit',defn(`translit'))
+define(`_traceoff',defn(`traceoff'))
+define(`_undefine',defn(`undefine'))
+define(`_undivert',defn(`undivert'))
+define(`_decr',defn(`decr'))
+define(`_defn',defn(`defn'))
+define(`_divert',defn(`divert'))
+define(`_divnum',defn(`divnum'))
+define(`_dlen',defn(`dlen'))
+define(`_dumpdef',defn(`dumpdef'))
+define(`_eval',defn(`eval'))
+define(`_m4exit',defn(`m4exit'))
+define(`_ifelse',defn(`ifelse'))
+define(`_ifdef',defn(`ifdef'))
+define(`_include',defn(`include'))
+define(`_incr',defn(`incr'))
+define(`_index',defn(`index'))
+define(`_popdef',defn(`popdef'))
+define(`_pushdef',defn(`pushdef'))
+define(`_shift',defn(`shift'))
+define(`_substr',defn(`substr'))
+define(`_syscmd',defn(`syscmd'))
+define(`_sysval',defn(`sysval'))
+define(`_traceon',defn(`traceon'))
+define(`_m4wrap',defn(`m4wrap'))
+define(`_format',defn(`format'))
+
+_undefine(`define')
+_undefine(`changecom')
+_undefine(`changequote')
+_undefine(`errprint')
+_undefine(`maketemp')
+_undefine(`sinclude')
+_undefine(`translit')
+_undefine(`traceoff')
+_undefine(`undefine')
+_undefine(`undivert')
+_undefine(`unix')
+_undefine(`windows')
+_undefine(`decr')
+_undefine(`defn')
+_undefine(`divert')
+_undefine(`divnum')
+_undefine(`dlen')
+_undefine(`dumpdef')
+_undefine(`eval')
+_undefine(`m4exit')
+_undefine(`ifelse')
+_undefine(`ifdef')
+_undefine(`include')
+_undefine(`incr')
+_undefine(`index')
+_undefine(`popdef')
+_undefine(`pushdef')
+_undefine(`shift')
+_undefine(`substr')
+_undefine(`syscmd')
+_undefine(`sysval')
+_undefine(`traceon')
+_undefine(`m4wrap')
+_undefine(`format')
+
+ at c ---------------------------------------------------------------------------
+ at c Change quotes to [[ and ]]. Otherwise quotes are likely to cause us
+ at c problems. [[ and ]] are not likely to occur by chance in our docs.
+ at c
+ at c If we need to use an m4 keyword in our text, then we may do so with
+ at c (eg) [[_m4command]].
+ at c
+ at c If we wish to use [[ and ]] in our text, enclose it with comments:
+ at c @nm4{}[[@m4{}
+ at c ---------------------------------------------------------------------------
+_changequote([[,]])
+
+ at c ---------------------------------------------------------------------------
+ at c picture macro
+ at c
+ at c Adds a picture to the document. For tex it loads a PostScript file. For
+ at c html it loads a gif file.
+ at c
+ at c argument 1: a filename prefix. .ps and .gif are added to the prefix
+ at c             as required.
+ at c ---------------------------------------------------------------------------
+_define([[_picture]],[[_ifdef([[_unix]],[[_ifdef([[_tex]],[[@tex
+ at sp 1
+ at epsfbox{[[$*]].unix.ps}
+ at end tex]])
+_ifdef([[_html]],[[
+ at ifhtml
+<p>
+<img src="[[$*]].unix.gif" alt="[picture]">
+ at end ifhtml]])]],[[_ifdef([[_tex]],[[@tex
+ at sp 1
+ at epsfbox{[[$*]].unix.ps}
+ at end tex]])
+_ifdef([[_html]],[[
+ at ifhtml
+<p>
+<img src="[[$*]].unix.gif" alt="[picture]">
+ at end ifhtml]])]])]])
+
+ at c ---------------------------------------------------------------------------
+ at c lpicture macro
+ at c
+ at c Adds a large picture to the document. In tex this is the same as the
+ at c picture macro. For html it displays a small gif file with a link to the
+ at c full size one.
+ at c
+ at c argument 1: a filename prefix. .ps, .gif, .small.gif and .gif.html are
+ at c             added to the prefix as required.
+ at c ---------------------------------------------------------------------------
+_define([[_lpicture]],[[_ifdef([[_unix]],[[_ifdef([[_tex]],[[@tex
+ at sp 1
+ at epsfbox{[[$*]].unix.ps}
+ at end tex]])
+_ifdef([[_html]],[[
+ at ifhtml
+<p>
+<a href="[[$*]].unix.gif.html"><img src="[[$*]].small.unix.gif" alt="[picture]"></a>
+<br><font size="-1">(Click for full size image)<font size="+0"><br>
+ at end ifhtml]])]],[[_ifdef([[_tex]],[[@tex
+ at sp 1
+ at epsfbox{[[$*]].unix.ps}
+ at end tex]])
+_ifdef([[_html]],[[
+ at ifhtml
+<p>
+<a href="[[$*]].unix.gif.html"><img src="[[$*]].small.unix.gif" alt="[picture]"></a>
+<br><font size="-1">(Click for full size image)<font size="+0"><br>
+ at end ifhtml]])]])]])
+
+ at c ---------------------------------------------------------------------------
+ at c @nm4{}
+ at c
+ at c _ifunix macro
+ at c _ifwindows macro
+ at c
+ at c These two macros may be used to surround text which we wish to only
+ at c appear in one version or another. They check the _unix and _windows
+ at c defines.
+ at c An example usage is:
+ at c
+ at c     _ifunix([[
+ at c     @split{}
+ at c     @node Assembly-CAP2
+ at c     @section Assembly CAP2
+ at c     _include(cap2-t.texi)
+ at c     ]])(
+ at c
+ at c An alternative to this is using _ifdef directly. Eg:
+ at c
+ at c     _ifdef([[_unix]],[[
+ at c     @split{}
+ at c     @node Assembly-CAP2
+ at c     @section Assembly CAP2
+ at c     _include(cap2-t.texi)
+ at c     ]])(
+ at c
+ at c @m4{}
+ at c ---------------------------------------------------------------------------
+_define([[_ifunix]],[[_ifdef([[_unix]],[[$*]])]])
+_define([[_ifwindows]],[[_ifdef([[_windows]],[[$*]])]])
+
+ at c ---------------------------------------------------------------------------
+ at c uref macro
+ at c
+ at c This exists in newer texinfo releases, but for now we try to emulate it as
+ at c well as possible (albeit in an m4 instead of texinfo manner).
+ at c
+ at c _uref(url) will just link to that url, with the 'url' as the text.
+ at c _uref(url,text) will link to that url, with 'text' as the text in the
+ at c   html format. For tex format it'll use "text (@code{url})".
+ at c _uref(url,,text) will link to that url, with 'text' as the text in both
+ at c   html and tex formats.
+ at c ---------------------------------------------------------------------------
+_define([[_uref]],[[_ifelse(1,$#,[[_ifdef([[_html]],[[@ifhtml
+<a href="$1">
+ at end ifhtml]])
+$1
+_ifdef([[_html]],[[
+ at ifhtml
+</a>
+ at end ifhtml]])]],[[_ifelse(2,$#,[[_ifdef([[_tex]],[[$2 (@code{$1})]])
+_ifdef([[_html]],[[@ifhtml
+<a href="$1">$2</a>
+ at end ifhtml]])]],[[_ifdef([[_html]],[[
+ at ifhtml
+
+<a href="$1">
+ at end ifhtml
+$3
+ at ifhtml
+</a>
+ at end ifhtml
+]])]])]])]])
+
+ at c normal refs
+_ifdef([[_tex]],[[
+_define([[_fxref]],[[@xref{$1,$1,$2}.]])
+_define([[_fpref]],[[@pxref{$1,$1,$2}]])
+_define([[_fref]],[[@ref{$1,$1,$2}.]])
+_define([[_split]],[[]])
+]])
+
+ at c html refs
+_ifdef([[_html]],[[
+_define([[_fxref]],[[
+ at ifhtml
+<!-- XREF:$1 -->
+ at end ifhtml
+ at xref{$1,$1,$2,$3,$3}.]])
+_define([[_fpref]],[[
+ at ifhtml
+<!-- XREF:$1 -->
+ at end ifhtml
+ at pxref{$1,$1,$2,$3,$3}]])
+_define([[_fref]],[[
+ at ifhtml
+<!-- XREF:$1 -->
+ at end ifhtml
+ at ref{$1,$1,$2,$3,$3}.]])
+_define([[_split]],[[@split]])
+]])
+
+ at c common refs
+_define([[_oxref]],[[@xref{$1,$1,$2}]])
+_define([[_oref]],[[@ref{$1,$1,$2}]])
+
+ at c A horizontal ruler, using TeX or HTML
+_define([[_rule]],[[@sp 1
+ at tex
+\hrule height 0.5pt width \hsize
+ at end tex
+ at ifhtml
+<hr>
+ at end ifhtml]])
diff --git a/scripting_manual/i/nav_brief.gif b/scripting_manual/i/nav_brief.gif
new file mode 100644
index 0000000..b26bdbd
Binary files /dev/null and b/scripting_manual/i/nav_brief.gif differ
diff --git a/scripting_manual/i/nav_down.gif b/scripting_manual/i/nav_down.gif
new file mode 100644
index 0000000..bf5ccf0
Binary files /dev/null and b/scripting_manual/i/nav_down.gif differ
diff --git a/scripting_manual/i/nav_first.gif b/scripting_manual/i/nav_first.gif
new file mode 100644
index 0000000..75d3439
Binary files /dev/null and b/scripting_manual/i/nav_first.gif differ
diff --git a/scripting_manual/i/nav_full.gif b/scripting_manual/i/nav_full.gif
new file mode 100644
index 0000000..65c4753
Binary files /dev/null and b/scripting_manual/i/nav_full.gif differ
diff --git a/scripting_manual/i/nav_home.gif b/scripting_manual/i/nav_home.gif
new file mode 100644
index 0000000..5e1293c
Binary files /dev/null and b/scripting_manual/i/nav_home.gif differ
diff --git a/scripting_manual/i/nav_last.gif b/scripting_manual/i/nav_last.gif
new file mode 100644
index 0000000..95a8a39
Binary files /dev/null and b/scripting_manual/i/nav_last.gif differ
diff --git a/scripting_manual/i/nav_next.gif b/scripting_manual/i/nav_next.gif
new file mode 100644
index 0000000..7fa6ebe
Binary files /dev/null and b/scripting_manual/i/nav_next.gif differ
diff --git a/scripting_manual/i/nav_prev.gif b/scripting_manual/i/nav_prev.gif
new file mode 100644
index 0000000..31176c4
Binary files /dev/null and b/scripting_manual/i/nav_prev.gif differ
diff --git a/scripting_manual/i/nav_top.gif b/scripting_manual/i/nav_top.gif
new file mode 100644
index 0000000..cb77483
Binary files /dev/null and b/scripting_manual/i/nav_top.gif differ
diff --git a/scripting_manual/i/nav_up.gif b/scripting_manual/i/nav_up.gif
new file mode 100644
index 0000000..434a6d6
Binary files /dev/null and b/scripting_manual/i/nav_up.gif differ
diff --git a/scripting_manual/preface-t.texi b/scripting_manual/preface-t.texi
new file mode 100644
index 0000000..be0c6bf
--- /dev/null
+++ b/scripting_manual/preface-t.texi
@@ -0,0 +1,48 @@
+This manual is a guide to programming with the newer Tcl/Tk based Staden
+Package programs. It covers both using the programs in a scripting environment
+and writing modules to extend their functionality. The main content
+currently covers the Tcl interfaces, with very few of the C functions
+documented. The reader should also be familiar with the Tcl
+language.
+
+
+ at unnumberedsec Conventions Used in This Manual
+
+ at exdent @i{Italic} is used for:
+ at itemize @bullet
+ at item variable names
+ at item command line values
+ at item structure fields
+ at end itemize
+
+ at exdent @code{Fixed width bold} is used for:
+ at itemize @bullet
+ at item Code examples
+ at item Command line arguments
+ at item Typed in commands
+ at item Program output
+ at end itemize
+
+ at sp 1
+The general format of the syntax for the more complex Tcl commands is to list
+the command name in bold followed by one or more command line arguments in
+bold with command line values in italic. The command line values have a
+brief description of the use of the value followed by the type and a default
+value. The Tcl convention of surrounding optional values in question marks is
+used. For instance the @code{edit_contig} command has the following syntax.
+
+ at example
+ at group
+ at exdent @code{edit_contig}
+ -io            @i{io_handle:integer}
+ -contig        @i{identifier:string}
+?-reading       @i{identifier:string()}?
+?-pos           @i{position:integer(1)}?
+ at end group
+ at end example
+
+ at sp 1
+ at code{-io} and @code{-pos} both take integer values. @code{-pos} is
+optional, and has a default value of 1. @code{-contig} and @code{-reading}
+both require string values. @code{-reading} is optional, and has a default
+value of a blank string.
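+
+ at sp 1
+As an illustration only, a call that follows this syntax might look like the
+sketch below. The reading name @code{xb54a3.s1} is hypothetical, and the
+variable @code{io} is assumed to hold a handle returned by an earlier
+ at code{open_db} call.
+
+ at example
+# -reading is omitted, so it takes its default value (a blank string).
+edit_contig -io $io -contig xb54a3.s1 -pos 100
+ at end example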
diff --git a/scripting_manual/scripting.texi b/scripting_manual/scripting.texi
new file mode 100644
index 0000000..68124b1
--- /dev/null
+++ b/scripting_manual/scripting.texi
@@ -0,0 +1,102 @@
+\input texinfo
+ at c %**start of header
+ at setfilename scripting.info
+ at settitle Programming with Gap4
+ at setchapternewpage odd
+ at iftex
+ at afourpaper
+ at end iftex
+ at c %**end of header
+
+ at set standalone
+include(header.m4)
+
+ at titlepage
+ at title Programming with Gap4
+ at subtitle Version 0.99.2 (October 1997)
+ at subtitle
+ at author James Bonfield
+ at page
+ at vskip 0pt plus 1filll
+Copyright @copyright{} 1995, 1996, Medical Research Council, Laboratory of Molecular Biology.
+ at end titlepage
+
+ at node Top
+ at ifinfo
+ at top Programming Gap4
+ at end ifinfo
+
+ at menu
+* Preface::             Preface
+* Tk_utils::            Tk_utils Library
+* Tcl-scripts::         Tcl Scripting of Gap4
+* C-IO::                Database I/O in C
+* C-Editing::           Sequence Editing Functions in C
+* C-Anno::              Annotation Functions in C
+* Registration::        Contig Registration Scheme
+* Packages::            Writing Packages
+
+* Appendix-Composition:: Appendix A - Composition Package
+ at end menu
+
+ at node Preface
+ at unnumbered Preface
+_include(preface-t.texi)
+
+ at node Tk_utils
+ at chapter Tk_utils Library
+_include(tkutils-t.texi)
+
+ at node Tcl-scripts
+ at chapter Tcl Scripting of Gap4
+_include(gap4-t.texi)
+
+ at node C-IO
+ at chapter Database I/O in C
+_include(gap4-cio-t.texi)
+
+ at node C-Editing
+ at chapter Sequence Editing Functions in C
+_include(gap4-cedit-t.texi)
+
+ at node C-Anno
+ at chapter Annotation Functions in C
+_include(gap4-canno-t.texi)
+
+ at node Registration
+ at chapter Contig Registration Scheme
+_include(gap4-registration-t.texi)
+
+ at node Packages
+ at chapter Writing Packages
+_include(extension-t.texi)
+
+ at node Appendix-Composition
+ at appendix Composition Package
+_include(appendix-t.texi)
+
+ at split{}
+ at node Function Index
+ at unnumbered Function Index
+This index contains lists of the C and Tcl function calls available. Entry
+items listed with a @i{(T)} suffix are callable from Tcl. Entry items listed
+with a @i{(C)} suffix are callable from within C.
+ at printindex fn
+
+ at split{}
+ at node Variable Index
+ at unnumbered Variable and Type Index
+This index contains lists of C and Tcl variables and types. Entry items listed
+with a @i{(T)} suffix are Tcl variables. Entry items listed with a @i{(C)}
+suffix are C variables. All types are C.
+ at printindex vr
+
+ at split{}
+ at node Concept Index
+ at unnumbered Concept Index
+ at printindex cp
+
+ at summarycontents
+ at contents
+
+ at bye
diff --git a/scripting_manual/tkutils-t.texi b/scripting_manual/tkutils-t.texi
new file mode 100644
index 0000000..93698e3
--- /dev/null
+++ b/scripting_manual/tkutils-t.texi
@@ -0,0 +1,1253 @@
+ at cindex Tk_utils library
+ at cindex Tkutils library
+ at cindex stash
+ at vindex TKUTILS_LIBRARY
+ at vindex LD_LIBRARY_PATH
+
+ at menu
+* TkU-Keyed Lists::             Keyed Lists
+* TkU-Dynamic::                 Runtime Loading of Libraries.
+* TkU-Defaults::                Default Files
+* TkU-Menus::                   Specifying Menu Configurations
+* TkU-Menu Control::            Controlling Menu Behaviour
+* TkU-Dialogues::               Common Dialogue Components
+* TkU-Output::                  Text Output and Errors
+* TkU-Other::                   Other Utility Commands
+ at end menu
+
+The @i{tk_utils} library provides basic Tcl and Tk extensions suitable for all
+applications. The common components of the programs, such as the text output
+display, keyed lists, and the configuration file handling are contained within
+this library.
+
+The @code{stash} executable is a modified version of @code{wish} that contains
+these commands. When not using @code{stash} the Tcl load command must be used
+to dynamically link the library. From wish it is necessary to use the
+following startup code:
+
+ at example
+lappend auto_path $env(TKUTILS_LIBRARY)
+catch @{load libmisc.so@}
+catch @{load libread.so@}
+load libtk_utils.so
+ at end example
+
+The above assumes that the @code{TKUTILS_LIBRARY} and @code{LD_LIBRARY_PATH}
+environment variables have been set correctly. This is done automatically if
+the package initialisation files (@file{staden.profile} or
+ at file{staden.login}) are sourced.
+
+Once either @code{stash} or a boot-strapped @code{wish} is running the
+tk_utils library code is available.
+
+ at split{}
+ at node TkU-Keyed Lists
+ at section Keyed Lists
+ at cindex Keyed Lists
+ at cindex TclX
+
+Many functions make use of the TclX Keyed List extension. Keyed Lists can be
+compared to C structures. The following description has been taken from the
+TclX distribution @footnote{The TclX copyright states the following.
+
+ at i{Copyright 1992-1996 Karl Lehenbauer and Mark Diekhans.
+Permission to use, copy, modify, and distribute this software and its
+documentation for any purpose and without fee is hereby granted, provided
+that the above copyright notice appear in all copies.  Karl Lehenbauer and
+Mark Diekhans make no representations about the suitability of this
+software for any purpose.  It is provided "as is" without express or
+implied warranty.}}.
+
+ at quotation
+  <start of quotation>
+
+  A keyed list is a list in which each element contains a key and value pair.
+  These  element  pairs  are stored as lists themselves, where the key is the
+  first element of the list, and the value  is  the  second.   The  key-value
+  pairs are referred to as fields.  This is an example of a keyed list:
+
+ at example
+@{@{NAME @{Frank Zappa@}@} @{JOB @{musician and composer@}@}@}
+ at end example
+
+  If the variable @var{person} contained the above list, then @code{keylget
+  person NAME} would return @code{@{Frank Zappa@}}.  Executing the command:
+
+ at example
+keylset person ID 106
+ at end example
+
+  would make person contain
+
+ at example
+@{@{ID 106@} @{NAME @{Frank Zappa@}@} @{JOB @{musician and composer@}@}@}
+ at end example
+
+  Fields may contain subfields; `.' is the  separator  character.   Subfields
+  are  actually  fields where the value is another keyed list.  Thus the
+  following list has the top level fields @code{ID} and @code{NAME}, and
+  subfields @code{NAME.FIRST} and @code{NAME.LAST}:
+
+ at example
+@{ID 106@} @{NAME @{@{FIRST Frank@} @{LAST Zappa@}@}@}
+ at end example
+
+  There is no limit to the recursive depth  of  subfields,  allowing  one  to
+  build complex data structures.
+
+  Keyed lists are constructed and accessed via a  number  of  commands.   All
+  keyed list management commands take the name of the variable containing the
+  keyed list as an argument (i.e. passed by reference), rather  than  passing
+  the list directly.
+
+ at table @asis
+ at findex keyldel(C)
+ at item @code{keyldel} @i{listvar key}
+       Delete the field specified by key from the keyed list in the  variable
+       @var{listvar}.  This removes both the key and the value from the keyed
+       list.
+
+ at sp 1
+ at findex keylget(C)
+ at item @code{keylget} @i{listvar ?key? ?retvar | @{@}?}
+       Return the value associated with key from the keyed list in the
+       variable @var{listvar}.  If @var{retvar} is not specified, then the
+       value will be returned as the result of the command. In this case, if
+       key is not found in the list, an error will result.
+
+       If @var{retvar} is specified and key is in the list, then the value is
+       returned in the variable retvar and the command returns 1 if the key
+       was present within the list.  If key isn't in the list, the command
+       will return 0, and retvar will be left unchanged.  If @code{@{@}} is
+       specified for retvar, the value is not returned, allowing the Tcl
+       programmer to determine if a key is present in a keyed list without
+       setting a variable as a side-effect.
+
+       If key is omitted, then a list of all the keys in the  keyed  list  is
+       returned.
+
+ at sp 1
+ at findex keylkeys(C)
+ at item @code{keylkeys} @i{listvar ?key?}
+       Return a list of the keys in the keyed list in the variable
+       @var{listvar}.  If @var{key} is specified, then it is the name of a key
+       field whose subfield keys are to be retrieved.
+
+ at sp 1
+ at findex keylset(C)
+ at item @code{keylset} @i{listvar key value ?key2 value2 ...?}
+       Set the value associated with key, in the keyed list contained in the
+       variable @var{listvar}, to value.  If listvar does not exist, it is
+       created.  If @var{key} is not currently in the list, it will be added.
+       If it already exists, @var{value} replaces the existing value.
+       Multiple keywords and values may be specified, if desired.
+ at end table
+
+  <end of quotation>
+ at end quotation
+
+An example best illustrates their usage. In this case we're using Gap4 to
+extract some @i{template} information for readings within an assembly
+database.
+
+ at example
+% set io [open_db -name TEST -version 1 -access rw]
+% set r [io_read_reading $io 1]
+% puts $r
+@{name 34@} @{trace_name 39@} @{trace_type 40@} @{left 25@} @{right 33@} @{position 90@}
+@{length 545@} @{sense 1@} @{sequence 36@} @{confidence 37@} @{orig_positions 38@}
+@{chemistry 0@} @{annotations 1@} @{sequence_length 440@} @{start 71@} @{end 512@}
+@{template 1@} @{strand 0@} @{primer 1@}
+% set t [io_read_template $io [keylget r template]]
+% puts $t
+@{name 45@} @{strands 1@} @{vector 1@} @{clone 1@} @{insert_length_min 1400@}
+@{insert_length_max 2000@}
+% keylset t insert_length_max 2500
+% puts $t
+@{name 45@} @{strands 1@} @{vector 1@} @{clone 1@} @{insert_length_min 1400@}
+@{insert_length_max 2500@}
+% io_write_template $io [keylget r template] $t
+% close_db -io $io
+ at end example
+
+The above is an interactive session. It starts by opening database
+ at code{TEST}, version @code{1}. Then the first reading is loaded from the
+database and listed. Next the template for this reading is loaded and also
+listed. Finally, the maximum insert length for this template is changed to
+2500 and written back to the database, and the database is closed.
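+
+The subfield and @var{retvar} forms described in the quotation can be sketched
+in the same way. This is an illustration only, assuming a @code{stash} (or
+other TclX-enabled) interpreter; the @code{person} list is the one used in the
+quotation and is not part of any database.
+
+ at example
+set person @{@{ID 106@} @{NAME @{@{FIRST Frank@} @{LAST Zappa@}@}@}@}
+
+# Subfields are addressed with `.' as the separator.
+puts [keylget person NAME.FIRST]
+
+# With a retvar argument keylget returns 1 or 0 instead of raising an error.
+if @{[keylget person JOB job]@} @{
+    puts "JOB is $job"
+@} else @{
+    puts "no JOB field"
+@}
+
+# Passing an empty retvar only tests whether the key is present.
+puts [keylget person ID @{@}]
+ at end example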
+
+ at split{}
+ at node TkU-Dynamic
+ at section Runtime Loading of Libraries
+ at cindex Runtime libraries
+ at cindex Dynamic libraries
+ at cindex Libraries, loading of
+
+The main command for loading dynamic libraries is the @code{load_package}
+command. This adds on a new directory to the Tcl search path and dynamically
+loads up a new C library. For programmers, the procedure of creating these
+libraries is initially fairly complex. Once done, all the user requires is a
+single @code{load_package} command added to the application @file{rc} file to
+extend the application's functionality.
+
+The existing Tcl package system allows for the dynamic loading to be delayed
+until a command is needed. However this system does not satisfactorily deal
+with the case where libraries contain only C commands. Hence the package
+system utilised by the Staden Package dynamically links in libraries to the
+running executable at the time of the load_package call. This is typically
+done in the startup phase of programs.
+
+_rule
+ at split{}
+ at node TkU-load_package
+ at unnumberedsubsec load_package
+ at findex load_package(C), short version
+ at vindex tk_utils_defs
+ at vindex _defs
+ at example
+ at exdent @strong{load_package} @i{name}
+ at end example
+
+This loads the dynamic library named (eg) lib@i{name}.so. The "lib" and
+".so" components of this library name are system-dependent strings. The system
+will automatically use the correct local terminology depending on the system
+type.
+
+Firstly the @code{$STADLIB/}@i{name} directory is appended to the Tcl
+auto_path variable. Next the @code{$STADTABL/}@i{name}@code{rc} file is used
+to specify the package menus and defaults (which are saved as a keyed list in
+the global Tcl variable @i{name}_defs). The @code{.}@i{name}@code{rc} file is
+also loaded from the caller's HOME directory and from the current directory,
+if they exist, in this order.  This means that a user can override defaults
+specified in the @code{STADTABL} directory by creating an rc file in their
+home directory, and then override these specifications further in a
+project-by-project fashion by adding configurations to the current directory.
+
+Next the library itself is dynamically loaded. The file to be loaded is held
+within the @code{$STADLIB/$MACHINE-binaries} directory. If the library does
+not exist within this directory then it is not loaded and no error is
+produced.
+
+Finally, if it exists, the package initialisation function in C is called
+with a Tcl interpreter as its sole argument; it returns an integer
+(TCL_OK or TCL_ERROR). It is this function which performs the registering of
+new commands to the Tcl language. The C function name must be the package name
+with the first character as upper case, the following characters as lowercase,
+and suffixed by @code{_Init}. See the Tcl load manual page for full details.
+
+So for the tk_utils library the @code{$STADLIB/tk_utils} directory is added to
+the auto_path variable, the @code{$STADTABL/tk_utilsrc} file is processed, and
+the C function @code{Tk_utils_Init()} is executed.
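+
+The naming rule for the initialisation function can be expressed as a short
+Tcl procedure. This is only a sketch of the rule as described above;
+ at code{init_func_name} is a hypothetical helper and not a command provided by
+the package.
+
+ at example
+proc init_func_name @{name@} @{
+    # First character upper case, remaining characters lower case.
+    set result [string toupper [string index $name 0]]
+    append result [string tolower [string range $name 1 end]] _Init
+    return $result
+@}
+
+puts [init_func_name tk_utils]
+ at end example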
+
+_rule{}
+ at split{}
+ at findex load_package(C), long version
+ at example
+ at exdent @strong{load_package} @i{tcldir libdir name ?init?}
+ at end example
+
+This is the more versatile form of the load_package command. The procedures
+performed are the same, however the location of the files is no longer
+controlled solely by environment variables.
+
+ at i{Tcldir} specifies the directory to add to the Tcl auto_path variable and is
+used for the search path of the @i{name}@code{rc} file. As with the simpler
+form of load_package the @code{STADTABL}, HOME, and current directory versions
+of the rc file are also loaded, with each file overriding values specified in
+the earlier copies.
+
+The @i{libdir} argument specifies the location to find the dynamic library
+file to load. Specifying this as a single @code{-} (minus sign) requests that
+no dynamic library is to be loaded. In this way libraries consisting solely of
+Tcl files may be used. Specifying @i{libdir} as a blank string (either "" or
+@{@}) indicates that the library is to be searched for in the user's
+ at code{LD_LIBRARY_PATH} instead.
+
+Both the @i{tcldir} and @i{libdir} variables allow a few substitutions to
+expand up to common locations.
+
+ at table @var
+ at item %L
+Expands to @code{$STADLIB}
+ at item %S
+Expands to @code{$STADENROOT/src}
+ at item %%
+Expands to a single percent sign
+ at end table
+
+The @i{init} argument is used to indicate whether the dynamic library loaded
+has an initialisation routine. It should be set to 0 or 1. The current
+implementation always attempts to execute the initialisation routine, but
+when @i{init} is 0 errors will be ignored.
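+
+As a sketch, the long form might be used as follows. The package name
+ at code{mypkg} and its directory layout are hypothetical; only the argument
+order, the @code{%L} and @code{%S} substitutions, and the special @code{-} and
+empty values of @i{libdir} are taken from the description above.
+
+ at example
+# Tcl files under $STADLIB/mypkg, library found via LD_LIBRARY_PATH,
+# initialisation routine expected.
+load_package %L/mypkg @{@} mypkg 1
+
+# A Tcl-only package kept in the source tree: no dynamic library is loaded.
+load_package %S/mypkg - mypkg 0
+ at end example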
+
+ at split{}
+ at node TkU-Defaults
+ at section Default Files
+ at cindex Default files
+ at cindex rc files
+
+The application @i{rc} files contain all the configuration details required by
+the application. Typically an @i{rc} file starts by loading up more packages
+using more @code{load_package} commands. This allows for hierarchical
+dependencies of packages and simplifies the loading of any single package. For
+instance, the @file{siprc} file contains a @code{load_package} call for
+seqlib. The @file{seqlibrc} file in turn has a @code{load_package} call for
+ at code{tk_utils}.
+
+Next we may define the menu data. Defining menus here allows for extensions to
+be written that add new commands directly onto the main menu. This obviously
+provides the ability to have third party extensions without sacrificing
+usability for the user. _oxref(TkU-Menus, Specifying Menu Configurations).
+
+The rest of the @i{rc} file contains the default values for the application.
+These may vary from the configuration parameters to the colours of plots to
+the text used in a particular dialogue. The available parameters to set are a
+function of the application itself, but the commands used to set these are
+universal.
+
+_rule
+ at split{}
+ at node TkU-set_def
+ at unnumberedsubsec set_def
+ at findex set_def(T)
+ at example
+ at exdent @strong{set_def} @i{parameter} @i{value}
+ at end example
+
+This sets the application parameter @i{parameter} to @i{value}. @i{Parameter}
+is a Keyed List field within the application defs variable. If @i{value}
+is more than one word, the Tcl quoting mechanisms must be used. Valid examples
+are:
+
+ at example
+set_def CONSENSUS_CUTOFF                0.01
+set_def STOP_CODON.RULER_COLOUR         black
+set_def TRACE_DISPLAY.BACKGROUND        $normal_bg
+set_def CONTIG_EDITOR.SE_SET.1          @{0 0 0 1 0 0 0 0 1 1@}
+set_def CONTIG_EDITOR.SE_SET.1          "0 0 0 1 0 0 0 0 1 1"
+ at end example
+
+The last two of these are different ways of achieving the same result.
+
+_rule
+ at split{}
+ at node TkU-set_defx
+ at unnumberedsubsec set_defx
+ at findex set_defx(T)
+ at example
+ at exdent @strong{set_defx} @i{variable} @i{parameter} @i{value}
+ at end example
+
+When we have common values to set for many parameters we may use the
+ at code{set_defx} command. For example take the following settings:
+
+ at example
+set_def FIJ.HIDDEN.NAME         "Window size for good data scan"
+set_def FIJ.HIDDEN.MIN          1
+set_def FIJ.HIDDEN.MAX          200
+set_def FIJ.HIDDEN.VALUE        100
+
+set_def ASSEMBLE.HIDDEN.NAME    "Window size for good data scan"
+set_def ASSEMBLE.HIDDEN.MIN     1
+set_def ASSEMBLE.HIDDEN.MAX     200
+set_def ASSEMBLE.HIDDEN.VALUE   100
+ at end example
+
+The repetition here of common elements is tedious. Using @code{set_defx} the
+equivalent becomes:
+
+ at example
+set_defx defs_hidden    NAME    "Window size for good data scan"
+set_defx defs_hidden    MIN     1
+set_defx defs_hidden    MAX     200
+set_defx defs_hidden    VALUE   100
+
+set_def  FIJ.HIDDEN             $defs_hidden
+set_def  ASSEMBLE.HIDDEN        $defs_hidden
+ at end example
+
+ at split{}
+ at node TkU-Menus
+ at section Specifying Menu Configurations
+ at cindex Configuring menus
+ at cindex Menu configuration
+
+By specifying menu configurations within the application rc file we provide
+the ability for extensions to include their own menu additions. When combined
+with the dynamic linking ability this means that new C functions can be
+written complete with GUI and menu items. These can then be "wrapped up" into
+a package suitable for distribution to other users.
+
+Not all menus within our programs are specified within the configuration file,
+but typically the main menu is. Theoretically other menus (such as the gap4
+contig editor ones) could be defined in this manner too.
+
+An important concept in the menu code is menu states. At any time a menu item
+or a menu button can be either enabled or disabled (greyed out). Certain
+actions require a subset of the menu items to be enabled or disabled. Actions
+can be split into menu state changes that enable menu items and those that
+disable them. If an action needs to both enable and disable then two menu
+state changes should be applied. Menu states are specified as bit patterns
+with one bit per action.
+
+For example in gap4 we have several enable states and several disable states.
+
+ at sp 1
+ at table @asis
+ at item @strong{bit}
+ at strong{Enable description}
+ at item 0
+Startup settings
+ at item 2
+A new database has been opened
+ at item 3
+The database has data
+ at end table
+
+ at sp 1
+ at table @asis
+ at item @strong{bit}
+ at strong{Disable description}
+ at item 1
+Busy mode has been set
+ at item 2
+The database has been closed
+ at item 3
+The database has no data
+ at item 4
+Read-only mode is enabled
+ at end table
+
+Note that not every bit is used in both the enable and disable sets. This is
+purely to simplify the numbering for the user: bits 2 and 3, for example,
+describe the same conditions (database open, database contains data) in both
+the enable set and the disable set.
+
+Bit 0 is always the startup setting. If a menu item does not have this bit set
+then it is disabled, otherwise it is enabled.
+
+Bit 1 is always used by the busy mode. Busy mode disables items that have bit
+1 set in the disable settings. When busy mode is turned off the menu settings
+revert to their initial state (prior to busy mode being enabled) and so no
+enable bit is necessary.
+
+The other bits defined are application dependent. In this case bits 2 and 3
+define whether the database is open and whether it contains data.
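+
+As a rough illustration of how these bit numbers turn into the numeric masks
+written in the rc file, here is a minimal C sketch. The enumerator names and
+the @code{state_mask} helper are hypothetical and are not part of tk_utils;
+only the bit numbers are taken from the tables above.
+

```c
#include <assert.h>

/* Hypothetical names for the gap4 state bits tabulated above. */
enum menu_state_bit {
    BIT_STARTUP   = 0,  /* enable set: startup settings */
    BIT_BUSY      = 1,  /* disable set: busy mode */
    BIT_DB_OPEN   = 2,  /* enable: database opened; disable: database closed */
    BIT_DB_DATA   = 3,  /* enable: database has data; disable: has no data */
    BIT_READ_ONLY = 4   /* disable set: read-only mode */
};

/* Convert a bit number into the mask value used for onval and offval. */
unsigned state_mask(int bit)
{
    assert(bit >= 0 && bit < 31);
    return 1u << bit;
}
```

+
+With these definitions @code{state_mask(BIT_DB_DATA)} is 8, and combining the
+busy and data bits gives 2 + 8 = 10. An item enabled at startup and disabled
+in busy mode would use the masks 1 and 2.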
+
+_rule
+ at split{}
+ at node TkU-set_menu
+ at unnumberedsubsec set_menu
+ at findex set_menu(T)
+ at example
+ at exdent @strong{set_menu} @i{name}
+ at end example
+
+The first menu command to be used is @code{set_menu}. This states that all
+further menu commands, until the next set_menu, will store their data in the
+Tcl variable @i{name}.
+
+_rule
+ at split{}
+ at node TkU-add_menu
+ at unnumberedsubsec add_menu
+ at findex add_menu(T)
+ at example
+ at exdent @strong{add_menu} @i{name} @i{onval} @i{offval} @i{pos}
+ at end example
+
+Before adding commands to menus we need to create the menus themselves. The
+ at code{add_menu} command does this. The menu @i{name} is the text to appear for
+the menu button. If this includes spaces it must be enclosed in quotes or
+curly brackets.
+
+ at i{Onval} and @i{offval} define the state masks for the enable and disable
+sets. Menus are always enabled whenever any of the items within them are
+enabled, even if the @i{offval} set defines otherwise. Menus are usually
+enabled at startup (@i{onval} == 1) and disabled during busy mode (@i{offval}
+== 2).
+
+The @i{pos} argument may be either @code{left} or @code{right}. This requests
+the position to place the menu. Each leftwards positioned menu is packed to
+the right of the currently shown left menus. Hence the order in which menus
+are defined controls the order in which they will appear. Similarly for
+rightwards positioned menus.
+
+So for example, the Gap4 main menus are defined as follows.
+
+ at example
+add_menu File           1 2 left
+add_menu Edit           1 2 left
+add_menu View           1 2 left
+add_menu Options        1 2 left
+add_menu Experiments    1 2 left
+add_menu Lists          1 2 left
+add_menu Assembly       1 2 left
+add_menu Help           1 0 right
+ at end example
+
+If more than one add_menu command is present for the same menu name the latter
+of the two takes priority.
+
+_rule
+ at split{}
+ at node TkU-add_cascade
+ at unnumberedsubsec add_cascade
+ at findex add_cascade(T)
+ at example
+ at exdent @strong{add_cascade} @i{name} @i{onval} @i{offval}
+ at end example
+
+This adds a cascading menu item within an existing menu. The @i{name} should
+be the menu name followed by a full stop followed by the cascading menu name.
+So to add a @code{Save To} cascading menu to the @code{File} menu the @i{name}
+should be set to "@code{@{File.Save To@}}".
+
+ at i{Onval} and @i{offval} operate in the same fashion as the @code{add_menu}
+command.
+
+_rule
+ at split{}
+ at node TkU-add_command
+ at unnumberedsubsec add_command
+ at findex add_command(T)
+ at example
+ at exdent @strong{add_command} @i{name} @i{onval} @i{offval} @i{command}
+ at end example
+
+This adds a new command to an application. The @i{name} should be the menu
+pathname followed by a full stop followed by the name of the command to appear
+in the menu. So if the command is within a cascading menu the @i{name} will
+have several components separated by full stops, ending in the command name
+itself.
+
+The @i{onval} and @i{offval} arguments specify the states in which the command
+is enabled and disabled respectively.
+
+The @i{command} argument is the command to execute when this menu item is
+selected. This is a single argument so Tcl quoting rules must be obeyed for
+multi-word commands. This command is evaluated (using the Tcl @code{eval}
+command) at the time of selecting the menu item. If the command is to contain
+references to variables, it is important to distinguish between variables
+expanded at the time of creating the menu item and the time of executing the
+menu item by backslashing the latter.
+
+For example, the Gap4 "Quality" mode of the consensus output has the following
+specification.
+
+ at example
+add_command  @{File.Calculate a consensus.quality@}  8 10  @{QualityDialog \$io@}
+ at end example
+
+Here the "quality" command is within the "Calculate a consensus" cascading
+menu which is within the "File" menu. It is enabled by bit 3 (a database
+containing data has been opened) and is disabled by bits 1 and 3 (the database
+has no data or busy mode is enabled). The command to run is
+ at code{QualityDialog $io}. If we did not backslash the @code{$io} in this
+command the @i{io} variable would be expanded at the time of creating the
+menus, say to "0". Then when the menu item is selected we would always execute
+ at code{QualityDialog 0} which is not the desired effect.
+
+_rule
+ at split{}
+ at node TkU-add_separator
+ at unnumberedsubsec add_separator
+ at findex add_separator(T)
+ at example
+ at exdent @strong{add_separator} @i{name}
+ at end example
+
+This simply adds a separator to the menu. The @i{name} specifies both the menu
+containing the separator and a name for the separator itself. Separator names
+do not appear in the menu, but are still required.
+
+_rule
+ at split{}
+ at node TkU-add_radio
+ at unnumberedsubsec add_radio
+ at findex add_radio(T)
+ at example
+ at exdent @strong{add_radio} @i{name} @i{onval} @i{offval} @i{variable} @i{value} @i{command}
+ at end example
+
+Multiple radio buttons are grouped together to form a set in which only one
+button can be active at any one time. The @code{add_radio} command adds
+commands to menus in a similar fashion to the @code{add_command} command, but
+has two additional arguments: @i{variable} and @i{value}.
+
+Each radio button within a group uses the same @i{variable} with differing
+ at i{values}. When a radio button is selected the global Tcl @i{variable} is set
+with the associated @i{value} and the @i{command} is executed. A useful tip is
+that the contents of the @i{variable} may be passed to the @i{command} as an
+argument using @code{\$}@i{variable}.
+
+As each group can specify its own variable, multiple radio button groups are
+possible.
+
+_rule
+ at split{}
+ at node TkU-add_check
+ at unnumberedsubsec add_check
+ at findex add_check(T)
+ at example
+ at exdent @strong{add_check} @i{name} @i{onval} @i{offval} @i{variable} @i{command}
+ at end example
+
+A check button command is identical to a normal command created by
+ at code{add_command} except that the menu item also has a box showing the
+current toggled state. Unlike radio buttons each check button operates
+independently of every other check button.
+
+The @i{variable} argument specifies the global Tcl variable to hold the state
+for this check button. It will contain 1 for enabled and 0 for disabled.
+Whenever the item is selected the variable will be toggled and the command
+executed.
+
+ at split{}
+ at node TkU-Menu Control
+ at section Controlling Menu Behaviour
+ at cindex Controlling menu behaviour
+ at cindex Menu control
+
+The creation and control of menus within applications is governed by further
+menu commands. These do not appear within the configuration files but rather
+the Tcl code for the applications themselves.
+
+_rule
+ at split{}
+ at node TkU-create_menus
+ at unnumberedsubsec create_menus
+ at cindex Menu creation
+ at cindex Creating menus
+ at findex create_menus(T)
+ at example
+ at exdent @strong{create_menus} @i{menu_specs} ?@i{pathname}?
+ at end example
+
+Uses the menu specifications passed over in the @i{menu_specs} variable to
+create the main menubar. @i{Pathname} specifies the root Tk window pathname in
+which to create the menus. If this is not specified the Tk root (.) is used
+instead.
+
+The menu specifications are created from processing the application rc file.
+The @code{set_menu} command is used to specify a Tcl variable to store these
+specifications in. The contents of this variable should be used as the
+ at i{menu_specs} argument.
+
+_rule
+ at split{}
+ at node TkU-menu_state_on
+ at unnumberedsubsec menu_state_on
+ at cindex Enable menu states
+ at cindex Menu enabling
+ at findex menu_state_on(T)
+ at example
+ at exdent @strong{menu_state_on} @i{menu_specs} @i{mask} ?@i{pathname}?
+ at end example
+
+Enables menu items by applying menu state @i{mask} to the menus described in
+the @i{menu_specs} data. @i{Menu_specs} is the contents of the variable
+created by the @code{set_menu} command and written to by subsequent
+ at code{add_}* commands.
+
+The @i{mask} is applied to each item in the menu specs. If the menu item
+enable set ANDed with the @i{mask} is non-zero the menu item is enabled.
+Otherwise it is left unchanged (it is not disabled). It is possible to combine
+multiple enable bits together in a single call. Hence the following two
+examples are equivalent.
+
+ at example
+menu_state_on $gap_menu 4 .mainwin.menus
+menu_state_on $gap_menu 8 .mainwin.menus
+
+menu_state_on $gap_menu 12 .mainwin.menus
+ at end example
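+
+The masking rule can be sketched in C as follows. This is only an
+illustration of the rule described above, not the tk_utils implementation;
+the @code{menu_item} structure and the @code{apply_state_on} function are
+hypothetical names introduced for the example.
+

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of one menu item; not the tk_utils layout. */
struct menu_item {
    unsigned onval;   /* enable set */
    unsigned offval;  /* disable set */
    int enabled;      /* current state: 1 enabled, 0 disabled */
};

/*
 * Enable every item whose enable set shares at least one bit with mask.
 * Items that do not match are left untouched; they are never disabled here.
 */
void apply_state_on(struct menu_item *items, size_t n, unsigned mask)
{
    size_t i;

    assert(items != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (items[i].onval & mask)
            items[i].enabled = 1;
    }
}
```

+
+Because each item is tested independently against the mask, applying masks 4
+and 8 in two calls leaves the items in the same state as one call with mask
+12.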
+
+ at i{Pathname} specifies the root location of the menu widgets as given to a
+previous @code{create_menus} command.
+
+_rule
+ at split{}
+ at node TkU-menu_state_off
+ at unnumberedsubsec menu_state_off
+ at cindex Disable menu states
+ at cindex Menu disabling
+ at findex menu_state_off(T)
+ at example
+ at exdent @strong{menu_state_off} @i{menu_specs} @i{mask} ?@i{pathname}?
+ at end example
+
+This command is the same as the @code{menu_state_on} command except that menu
+items whose disable set ANDed with the @i{mask} is non-zero are disabled.
+Otherwise the menu item is left in its current state.
+
+_rule
+ at split{}
+ at node TkU-menu_state_set
+ at unnumberedsubsec menu_state_set
+ at findex menu_state_set(T)
+ at example
+ at exdent @strong{menu_state_set} @i{menu_spec_variable} @i{mask} ?@i{pathname}?
+ at end example
+
+This command provides a combined interface to the @code{menu_state_on} and
+ at code{menu_state_off} functions. The name of the global variable containing
+the menu specifications is passed over in the @i{menu_spec_variable} argument.
+This must have been set by using the @code{set_menu} command.
+
+If the @i{mask} value is positive the @code{menu_state_on} command is called
+with this mask, otherwise the @code{menu_state_off} command is called with the
+absolute value of the @i{mask}.
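+
+A minimal C sketch of this sign-based dispatch is given below. The structure
+and function names are hypothetical; the sketch only mirrors the rule stated
+above and makes no claim about how tk_utils stores its menu data.
+

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical menu item record used only for this sketch. */
struct menu_item {
    unsigned onval;   /* enable set */
    unsigned offval;  /* disable set */
    int enabled;      /* current state: 1 enabled, 0 disabled */
};

/*
 * Apply a signed mask: a positive mask enables items through their enable
 * sets; any other mask disables items through their disable sets, using the
 * absolute value of the mask.
 */
void apply_state_set(struct menu_item *items, size_t n, int mask)
{
    size_t i;
    /* Absolute value computed in unsigned arithmetic to avoid overflow. */
    unsigned m = mask < 0 ? 0u - (unsigned)mask : (unsigned)mask;

    assert(items != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (mask > 0) {
            if (items[i].onval & m)
                items[i].enabled = 1;
        } else {
            if (items[i].offval & m)
                items[i].enabled = 0;
        }
    }
}
```

+
+For example, a mask of 8 enables items with bit 3 in their enable set, and a
+mask of -2 disables items with bit 1 in their disable set. An item defined
+with an @i{offval} of 0, such as the Help menu above, is never disabled.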
+
+ at i{Pathname} specifies the root location of the menu widgets as given to a
+previous @code{create_menus} command.
+
+_rule
+ at split{}
+ at node TkU-menu_state_save
+ at unnumberedsubsec menu_state_save
+ at findex menu_state_save(T)
+ at cindex Menu, saving states
+ at cindex Saving menu states
+ at example
+ at exdent @strong{menu_state_save} @i{pathname}
+ at end example
+
+This command queries the current states of the menus created as children of
+ at i{pathname} and returns them as a string suitable for passing to a later
+ at code{menu_state_restore} command. The principal use of this command is
+within the @code{SetBusy} command.
+
+_rule
+ at split{}
+ at node TkU-menu_state_restore
+ at unnumberedsubsec menu_state_restore
+ at findex menu_state_restore(T)
+ at cindex Menu, restoring states
+ at cindex Restoring menu states
+ at cindex Menu, loading states
+ at cindex Loading menu states
+ at example
+ at exdent @strong{menu_state_restore} @i{pathname} @i{states}
+ at end example
+
+This command sets the current states of the menus created as children of
+ at i{pathname}. The @i{states} argument contains the menu state information as
+returned from an earlier @code{menu_state_save} command. The principal use of
+this command is within the @code{ClearBusy} command.
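+
+The save and restore pattern used around busy mode can be sketched in C as
+below. This is an assumption-laden illustration: the real commands exchange
+the states as a Tcl string, whereas the sketch copies them into a fixed-size
+array, and every name in it (@code{saved_states}, @code{set_busy},
+@code{clear_busy} and so on) is hypothetical.
+

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 64  /* arbitrary limit chosen for this sketch */

/* Hypothetical menu item record used only for this sketch. */
struct menu_item {
    unsigned onval;   /* enable set */
    unsigned offval;  /* disable set */
    int enabled;      /* current state: 1 enabled, 0 disabled */
};

/* Snapshot of the enabled flags of up to MAX_ITEMS items. */
struct saved_states {
    int enabled[MAX_ITEMS];
    size_t n;
};

/* Record the current state of every item. */
void states_save(const struct menu_item *items, size_t n,
                 struct saved_states *out)
{
    size_t i;

    assert(out != NULL && n <= MAX_ITEMS);
    out->n = n;
    for (i = 0; i < n; i++)
        out->enabled[i] = items[i].enabled;
}

/* Put every item back into its recorded state. */
void states_restore(struct menu_item *items, size_t n,
                    const struct saved_states *in)
{
    size_t i;

    assert(in != NULL && n == in->n);
    for (i = 0; i < n; i++)
        items[i].enabled = in->enabled[i];
}

/* Enter busy mode: save, then disable items with bit 1 in the disable set. */
void set_busy(struct menu_item *items, size_t n, struct saved_states *saved)
{
    size_t i;

    states_save(items, n, saved);
    for (i = 0; i < n; i++) {
        if (items[i].offval & 2u)
            items[i].enabled = 0;
    }
}

/* Leave busy mode: revert to the states recorded by set_busy. */
void clear_busy(struct menu_item *items, size_t n,
                const struct saved_states *saved)
{
    states_restore(items, n, saved);
}
```

+
+Restoring a snapshot rather than re-enabling items is what makes a separate
+enable bit for busy mode unnecessary: items that were already disabled before
+busy mode stay disabled afterwards.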
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node TkU-Dialogues
+ at section Common Dialogue Components
+ at cindex Dialogue components
+ at findex radiolist(T)
+ at findex entrybox(T)
+ at findex checklist(T)
+ at findex okcancelhelp(T)
+ at findex ColourBox(T)
+ at findex repeater(T)
+ at findex scalebox(T)
+ at findex yes_no(T)
+
+This section has yet to be written. I need to outline the basic tk_utils
+widget-like commands: radiolist, entrybox, checklist, okcancelhelp, ColourBox,
+repeater, scalebox and yes_no. The interfaces will probably change to tidy
+things up before this section is written.
+
+okcancelhelp
+
+checklist
+
+entrybox
+
+messagebox
+
+radiolist
+
+renzbox
+
+repeater
+
+scalebox
+
+scale_range
+
+yes_no
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node TkU-Output
+ at section Text Output and Errors
+ at cindex Output of text messages
+ at cindex Error messages, outputting
+
+A selection of C and Tcl functions exist for outputting text to either the
+stdout or stderr streams. For entirely text based applications these messages
+simply appear on their usual streams. For graphical applications the messages
+can appear in the main window of the application. The programmer is free to
+use the usual C output routines, such as @code{printf}, but doing so will not
+send output to the main window.
+
+To utilise the text based version of the routines no initialisation is
+required. For the windowing version the following startup code should be used
+from within @code{stash}.
+
+ at example
+tkinit
+pack [frame .output -relief raised -bd 2] -fill both -expand 1
+load_package tk_utils
+tout_create_wins .output
+ at end example
+
+In the above example @code{.output} can be replaced by any window name you
+choose. The @code{load_package tk_utils} command is required to load the
+ at var{tk_utils_defs} variable. The @code{tout_create_wins} command does the
+actual work of creating the necessary output and error windows. From then on,
+the text output routines will send data to the windows instead of stdout and
+stderr.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-tout_init
+ at unnumberedsubsec tout_init
+ at findex tout_init(T)
+
+ at example
+ at exdent @code{tout_init} @i{output_path} @i{error_path}
+ at end example
+
+This command initialises the redirection of the text output commands. The two
+required arguments specify the Tk pathnames of text widgets for the output and
+errors to be sent to. The function returns nothing.
+
+The following example illustrates the usage. In practice the
+ at code{tout_create_wins} command should be used instead to provide a common
+style interface.
+
+ at example
+pack [text .output -height 5] [text .error -height 5] -side top
+tout_init .output .error
+vmessage This appears in the output window
+verror ERR_WARN This appears in the error window
+ at end example
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-tout_create_wins
+ at unnumberedsubsec tout_create_wins
+ at findex tout_create_wins(T)
+
+ at example
+ at exdent @code{tout_create_wins} @i{frame}
+ at end example
+
+This creates output and error windows within the specified @var{frame}.
+ at var{frame} may be @code{@{@}} to add these directly to the top level.
+The function returns nothing. The windows created also contain functional
+search, scroll on output, clear, and redirect buttons.
+
+This function also calls the @code{tout_init} command to initialise
+redirection of the text output functions.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-tout_set_scroll
+ at unnumberedsubsec tout_set_scroll
+ at findex tout_set_scroll(T)
+ at cindex Scrolling on output
+
+ at example
+ at exdent @code{tout_set_scroll} @i{stream} @i{to_scroll}
+ at end example
+
+This command controls whether outputting text should automatically scroll the
+relevant output window so that the new text is visible. @var{stream} should be
+one of @code{stdout} or @code{stderr}. If @var{to_scroll} is 0, scrolling is
+not automatically performed, otherwise scrolling is performed.
+
+This control is connected to the "scroll on output" button created by the
+ at code{tout_create_wins} command.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-tout_set_redir
+ at unnumberedsubsec tout_set_redir
+ at findex tout_set_redir(T)
+ at cindex Redirecting output
+
+ at example
+ at exdent @code{tout_set_redir} @i{stream} @i{filename}
+ at end example
+
+This command can be used to enable redirection of any output or error to a
+file. Output also still appears in the appropriate window. @var{stream} should
+be one of @code{stdout} or @code{stderr}. @var{filename} specifies which file
+to save output to. Any previously redirected filename for this stream is
+automatically closed. A blank @var{filename} can be used to close the current
+redirection for this stream without opening a new file. The command returns 1
+for success, 0 for failure.
+
+This control is connected to the "redirect" menu created by the
+ at code{tout_create_wins} command.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-tout_pipe
+ at unnumberedsubsec tout_pipe
+ at findex tout_pipe(T)
+ at cindex Piping output to commands
+
+ at example
+ at exdent @code{tout_pipe} @i{command} @i{input} @i{forever}
+ at end example
+
+This command executes the unix shell @var{command} with @var{input}. If
+ at var{forever} is 0, the command is terminated if it takes more than a specific
+amount of time (currently 5 seconds). A value of @var{forever} other than 0
+causes the @code{tout_pipe} command to wait until @var{command} has finished.
+The stdout and stderr streams from @var{command} appear in the appropriate
+output window. The command returns 0 for success, -1 for failure.
+
+NOTE: This command may not be implemented on all platforms.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-error_bell
+ at unnumberedsubsec error_bell
+ at findex error_bell(T)
+ at cindex Bell, upon errors
+ at cindex Error bell
+
+ at example
+ at exdent @code{error_bell} @i{status}
+ at end example
+
+This command controls whether a bell should be emitted for each error
+displayed. (Currently bells only ring for the C implementation of
+ at code{verror} and not the Tcl one). If @var{status} is 0, no bell is rung.
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-vmessage
+ at unnumberedsubsec vmessage
+ at cindex Text output
+
+ at findex vmessage(C)
+ at example
+ at exdent @code{#include <text_output.h>}
+ at exdent @code{void vmessage(char *fmt, ...);}
+ at end example
+
+This C function displays text in the text output window or prints to stdout
+when in a non graphical environment. Arguments are passed in the standard
+ at code{printf} syntax. Hence @code{vmessage("output");} and
+ at code{vmessage("value=%d",i);} are both legal uses.
+
+ at sp 1
+ at findex vmessage(T)
+ at example
+ at exdent @code{vmessage} ?@var{text} ...?
+ at end example
+
+This is the Tcl interface to the vmessage C function. Any number of arguments
+can be specified. They are concatenated together with spaces in between them.
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-verror
+ at unnumberedsubsec verror
+ at cindex Error output
+
+ at findex verror(C)
+ at example
+ at exdent @code{#include <text_output.h>}
+ at exdent @code{void verror(int priority, char *name, char *fmt, ...);}
+ at end example
+
+This C function displays text in the error output window or prints to stderr
+when in a non graphical environment. The @var{priority} argument may be one
+of @code{ERR_WARN} or @code{ERR_FATAL}. The @var{name} argument is used as
+part of the error message, along with the time stamp and the error itself.
+ at var{name} should not be any more than 50 characters long, and ideally much
+shorter. The @var{fmt} and subsequent arguments form a standard @code{printf}
+style format string and its values.
+
+An error with priority of @code{ERR_WARN} will be sent only to the error
+window. Priority @code{ERR_FATAL} will print to stderr as well.
+ at code{ERR_FATAL} should be used in conditions where there is a chance that the
+program may subsequently crash, thus removing the error window from the screen
+and preventing users from reporting error messages.
+
+ at sp 1
+ at findex verror(T)
+ at example
+ at exdent @code{verror} @var{priority} @var{text} ?...?
+ at end example
+
+This is the Tcl interface to the verror C function. The @var{priority}
+argument should be one of @code{ERR_WARN} or @code{ERR_FATAL} as described
+above. The @var{text} and subsequent arguments make up the contents of the
+error message itself with each argument concatenated with a single space
+between arguments.  The Tcl (not C) implementation of @code{verror} currently
+has a limit of 8192 bytes of error message per call.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-vfuncheader
+ at unnumberedsubsec vfuncheader
+ at cindex Function header output
+ at cindex Header text output
+
+ at findex vfuncheader(C)
+ at example
+ at exdent @code{#include <text_output.h>}
+ at exdent @code{void vfuncheader(char *fmt, ...);}
+ at end example
+
+This C function displays the name of a function in the output window. The
+function header consists of ruler lines, the date and time, and the formatted
+string specified by the @var{fmt} and subsequent arguments.  These arguments
+should be specified in the standard @code{printf} style. The header, after
+formatting, must be less than 8192 bytes long.
+
+ at sp 1
+ at findex vfuncheader(T)
+ at example
+ at exdent @code{vfuncheader} @var{title}
+ at end example
+
+This is the Tcl interface to the vfuncheader C function. It takes a single
+argument named @var{title} and uses this as the function title. The @var{title}
+must be less than 8192 bytes long.
+
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-vfuncgroup
+ at unnumberedsubsec vfuncgroup
+ at cindex Function group output
+ at cindex Group (function) output
+
+ at findex vfuncgroup(C)
+ at example
+ at exdent @code{#include <text_output.h>}
+ at exdent @code{void vfuncgroup(int group, char *fmt, ...);}
+ at end example
+
+This C function is identical to the @code{vfuncheader} function except that it
+will not output a new header if the last call to @code{vfuncgroup} was with
+the same @var{group} number and there have been no intervening
+ at code{vfuncheader} calls.
+
+The @var{group} argument is an integer value specifying a group number. Each
+option within a program using this function should have its own unique group
+number. However currently there is no allocation system for ensuring that this
+is so. The @var{fmt} and subsequent arguments specify the header in the
+standard @code{printf} style. The header, after formatting, must be less than
+8192 bytes long.
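+
+The suppression rule can be sketched in C as follows. The function names and
+the simplified one-line header are hypothetical (the real header also carries
+ruler lines and the date and time), and the sketch assumes group numbers are
+non-negative so that -1 can mean that no group is remembered.
+

```c
#include <assert.h>
#include <stdio.h>

/* Group of the most recent header, or -1 if it did not come from a group. */
static int last_group = -1;

/* Always emit a header and forget any remembered group. Returns 1. */
int emit_header(FILE *out, const char *title)
{
    assert(out != NULL && title != NULL);
    last_group = -1;
    fprintf(out, "==== %s ====\n", title);
    return 1;
}

/*
 * Emit a header unless the previous header came from the same group.
 * Returns 1 if a header was written and 0 if it was suppressed.
 */
int emit_group_header(FILE *out, int group, const char *title)
{
    assert(out != NULL && title != NULL && group >= 0);
    if (group == last_group)
        return 0;
    fprintf(out, "==== %s ====\n", title);
    last_group = group;
    return 1;
}
```

+
+Repeated calls with the same group therefore produce one header followed by a
+run of output, while a plain header call in between causes the next group
+call to print its header again.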
+
+ at sp 1
+ at findex vfuncgroup(T)
+ at example
+ at exdent @code{vfuncgroup} @var{group_number} @var{title}
+ at end example
+
+This is the Tcl interface to the vfuncgroup C function. The @var{title}
+must be less than 8192 bytes long.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-vfuncparams
+ at unnumberedsubsec vfuncparams
+ at cindex Function parameters
+ at cindex Parameters, text output
+
+ at findex vfuncparams(C)
+ at example
+ at exdent @code{#include <text_output.h>}
+ at exdent @code{void vfuncparams(char *fmt, ...);}
+ at end example
+
+This function sets the parameters used for producing the current output. These
+are added as a tagged text segment to the text underneath the last displayed
+header. The right mouse button in the output window brings up a menu from
+which these parameters can be displayed. By default they are not displayed.
+The parameters can be any length and are specified by @var{fmt} and
+subsequent arguments in the standard @code{printf} style.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-Output-start_message
+ at unnumberedsubsec start_message and end_message
+ at cindex Text buffering, start_message
+ at cindex Output buffering, start_message
+
+ at findex start_message(C)
+ at findex end_message(C)
+ at example
+ at exdent @code{#include <text_output.h>}
+ at exdent @code{void start_message(void);}
+ at exdent @code{void end_message(void);}
+ at end example
+
+Sometimes we wish to bring up a separate window containing simple message
+outputs (eg in gap4 this could be information about a reading that was clicked
+on). The @code{start_message} function clears the current message buffer and
+starts copying all subsequent output to the stdout window to this buffer.
+
+The @code{end_message} function disables this message copying and displays
+the current contents of the message buffer in a separate window.
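+
+The buffering behaviour can be pictured with the following C sketch. It is
+not the text_output implementation: the names, the fixed buffer capacity and
+the choice to truncate when the buffer is full are all assumptions made for
+the example, and the sketch returns the buffered text instead of opening a
+window.
+

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MSG_MAX 8192  /* arbitrary capacity chosen for this sketch */

static char msg_buf[MSG_MAX];
static size_t msg_len = 0;
static int msg_active = 0;

/* Clear the message buffer and start copying subsequent output into it. */
void msg_start(void)
{
    msg_len = 0;
    msg_buf[0] = '\0';
    msg_active = 1;
}

/*
 * Write text to the normal output stream and, while copying is active,
 * append it to the message buffer (truncating once the buffer is full).
 */
void msg_output(FILE *out, const char *text)
{
    size_t len, room;

    assert(out != NULL && text != NULL);
    fputs(text, out);
    if (!msg_active)
        return;
    len = strlen(text);
    room = MSG_MAX - 1 - msg_len;
    if (len > room)
        len = room;
    memcpy(msg_buf + msg_len, text, len);
    msg_len += len;
    msg_buf[msg_len] = '\0';
}

/* Stop copying and hand back the text collected since msg_start. */
const char *msg_end(void)
{
    msg_active = 0;
    return msg_buf;
}
```

+
+Output written before the start call or after the end call still reaches the
+normal stream but does not appear in the collected message.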
+
+At present, there is no Tcl interface to these routines.
+
+
+ at c ---------------------------------------------------------------------------
+ at split{}
+ at node TkU-Other
+ at section Other Utility Commands
+
+ at c -------------------------------------------------------------------------
+ at node TkU-Tkinit
+ at unnumberedsubsec tkinit
+ at findex tkinit(T)
+ at cindex Tk, initialising
+
+ at example
+ at exdent @code{tkinit}
+ at end example
+
+This command calls the @code{Tk_Init} C function. The purpose of this function
+is to allow the @code{stash} program to be used in a non windowing
+environment. To achieve this the initialisation of Tk has been delayed until
+this command is run. Hence one binary can be used for both text work (no
+ at code{tkinit} call) and graphics work (with a @code{tkinit} call).
+
+ at c -------------------------------------------------------------------------
+ at node TkU-Capture
+ at unnumberedsubsec capture
+ at findex capture(T)
+ at cindex Capturing command output
+ at cindex output: saving
+
+ at example
+ at exdent @code{capture} @i{command} ?@i{varname}?
+ at end example
+
+This command executes @i{command} and stores any text written to stdout in the
+Tcl variable named in @i{varname}. If @i{varname} is not specified then the
+output is returned, otherwise the return codes from the @code{Tcl_Eval}
+routine are used (ie @code{TCL_OK} for success).
+
+For example the command "@code{set x [capture @{puts foo@}]}" and
+"@code{capture @{puts foo@} x}" both set @i{x} to contain "@i{foo\n}".
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-expandpath
+ at unnumberedsubsec expandpath
+ at findex expandpath(T)
+ at cindex Path expansion
+ at cindex Tilde expansion
+
+ at example
+ at exdent @code{expandpath} @i{pathname}
+ at end example
+
+This command returns an expanded copy of @i{pathname} with tilde sequences
+and environment variables expanded in a usual shell-like fashion. It is a
+direct interface to the @code{expandpath} C routine, so see this for full
+details.
+
+For example, the command "@code{expandpath @{~/bin/$MACHINE@}}" may return a
+string like "@i{/home5/pubseq/bin/alpha}".
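+
+To give a feel for this kind of expansion, here is a small C sketch. It is
+not the @code{expandpath} C routine, whose exact rules are not described
+here; the function name is hypothetical and the sketch assumes only a leading
+tilde (expanded from the HOME environment variable) and variables written as
+a dollar sign followed by a name. Unset variables expand to nothing.
+

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Append up to n bytes of s to out, never overflowing a buffer of size. */
static void append(char *out, size_t size, size_t *len,
                   const char *s, size_t n)
{
    size_t room = size - 1 - *len;

    if (n > room)
        n = room;
    memcpy(out + *len, s, n);
    *len += n;
    out[*len] = '\0';
}

/* Expand a leading "~" and any $NAME references in path into out. */
void expand_path_sketch(const char *path, char *out, size_t size)
{
    size_t len = 0;
    const char *p = path;

    assert(path != NULL && out != NULL && size > 0);
    out[0] = '\0';

    /* A tilde counts only at the very start, followed by '/' or the end. */
    if (p[0] == '~' && (p[1] == '/' || p[1] == '\0')) {
        const char *home = getenv("HOME");
        if (home != NULL)
            append(out, size, &len, home, strlen(home));
        p++;
    }

    while (*p != '\0') {
        if (*p == '$' && (isalpha((unsigned char)p[1]) || p[1] == '_')) {
            char name[256];
            size_t n = 0;
            const char *val;

            p++;  /* skip the dollar sign, then collect the variable name */
            while ((isalnum((unsigned char)*p) || *p == '_') &&
                   n < sizeof name - 1)
                name[n++] = *p++;
            name[n] = '\0';
            val = getenv(name);
            if (val != NULL)
                append(out, size, &len, val, strlen(val));
        } else {
            append(out, size, &len, p, 1);  /* ordinary character */
            p++;
        }
    }
}
```

+
+Forms such as a tilde followed by a user name are deliberately left out of
+the sketch.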
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-vTcl_SetResult
+ at unnumberedsubsec vTcl_SetResult
+ at findex vTcl_SetResult(C)
+ at cindex Tcl_SetResult; varargs version
+
+ at example
+ at exdent @code{#include <tcl_utils.h>}
+ at exdent @code{void vTcl_SetResult(Tcl_Interp *interp, char *fmt, ...);}
+ at end example
+
+This function is a varargs implementation of the standard @code{Tcl_SetResult}
+function. The Tcl result is set to be the string specified by the @var{fmt}
+and subsequent arguments in the standard @code{sprintf} style.
+
+NOTE: The current implementation has a limit of setting up to 8192 bytes.
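+
+A varargs wrapper with a fixed-size limit of this kind is commonly built on
+the C99 @code{vsnprintf} function, as in the sketch below. This is an
+assumption about the general technique, not a copy of the tcl_utils source;
+the function name is hypothetical, and the sketch chooses to truncate
+over-long results, which the real function may or may not do.
+

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define RESULT_MAX 8192  /* the per-call limit quoted in the note above */

/*
 * Format into buf, which holds size bytes. Returns the number of bytes
 * stored, excluding the terminating NUL. Longer output is truncated.
 */
int format_result(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL && size > 0);
    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (n < 0) {            /* formatting error: return an empty string */
        buf[0] = '\0';
        return 0;
    }
    if ((size_t)n >= size)  /* output was truncated to fit the buffer */
        n = (int)(size - 1);
    return n;
}
```

+
+A caller would typically format into a local buffer of RESULT_MAX bytes and
+then pass that buffer on, for instance to @code{Tcl_SetResult}.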
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-vTcl_DStringAppend
+ at unnumberedsubsec vTcl_DStringAppend
+ at findex vTcl_DStringAppend(C)
+ at cindex Tcl_DStringAppend; varargs version
+
+ at example
+ at exdent @code{#include <tcl_utils.h>}
+ at exdent @code{void vTcl_DStringAppend(Tcl_DString *dsPtr, char *fmt, ...);}
+ at end example
+
+This function is a varargs implementation of the standard
+ at code{Tcl_DStringAppend} function. The string specified by the @var{fmt} and
+subsequent arguments (in the standard @code{sprintf} style) is appended to the
+existing dynamic string.
+
+ at c -------------------------------------------------------------------------
+_rule
+ at split{}
+ at node TkU-w
+ at unnumberedsubsec w and vw
+ at findex w(C)
+ at findex vw(C)
+ at cindex Strings, making writable
+ at cindex Writable strings
+ at cindex Keyed lists, writable strings
+ at cindex Tcl_SetVar, writable strings
+
+ at example
+ at exdent @code{#include <tcl_utils.h>}
+ at exdent @code{char *w(char *str);}
+ at exdent @code{char *vw(char *fmt, ...);}
+ at end example
+
+These functions return strings held in writable memory. Writable strings are
+required in the arguments of many Tcl functions, including @code{Tcl_SetVar}
+and @code{Tcl_GetKeyedListField}. The arguments specify the string to return
+as writable. For @code{w()} this is simply an exact copy of the argument. For
+ at code{vw()} the returned string is a formatted copy of the input, which is
+specified in the standard @code{printf} style.
+
+The return value from the @code{w} function is valid only until the next call
+of @code{w()}. Similarly for the @code{vw} function.
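+
+This lifetime rule is what one would expect from a function that copies its
+argument into a single static buffer. The C sketch below shows that pattern;
+it is an assumption about the approach rather than the tcl_utils source, and
+the name @code{writable_copy} and the truncation of over-long input are
+choices made for the example.
+

```c
#include <assert.h>
#include <string.h>

#define W_MAX 8192  /* per-call limit, as quoted for the real functions */

/*
 * Copy str into a static, writable buffer and return that buffer.
 * The result is overwritten by the next call, so it must be used (or
 * copied elsewhere) before calling this function again.
 */
char *writable_copy(const char *str)
{
    static char buf[W_MAX];
    size_t len;

    assert(str != NULL);
    len = strlen(str);
    if (len >= W_MAX)
        len = W_MAX - 1;  /* truncate input that does not fit */
    memcpy(buf, str, len);
    buf[len] = '\0';
    return buf;
}
```

+
+Two results from such a function cannot be held at the same time, since both
+pointers refer to the same storage.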
+
+Examples of usage are:
+
+ at example
+Tcl_GetKeyedListField(interp, vw("MODE%d", mode_num), gap_defs, &buf);
+
+Tcl_SetVar(interp, w("arr(element)"), "10", TCL_GLOBAL_ONLY);
+ at end example
+
+NOTE: In the current implementations both functions have a limit of handling
+8192 bytes per call.

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/solvate.git