[med-svn] [lambda-align] 02/06: New upstream version 1.9.1
Andreas Tille
tille at debian.org
Sun Dec 18 08:28:40 UTC 2016
This is an automated email from the git hooks/post-receive script.
tille pushed a commit to branch master
in repository lambda-align.
commit 4b25706b720207e75dc4f190d4a6cc2c75a81081
Author: Andreas Tille <tille at debian.org>
Date: Sun Dec 18 09:07:14 2016 +0100
New upstream version 1.9.1
---
CMakeLists.txt | 4 +-
LICENSE-AGPL3.rst | 671 +++++++++++++++++++++++++++++++
LICENSE-GPL3.rst | 704 --------------------------------
LICENSE.rst | 8 +-
src/CMakeLists.txt | 8 +-
src/holders.hpp | 79 +++-
src/lambda.cpp | 103 +++--
src/lambda.hpp | 1039 ++++++++++++++++++++++++++++++++++++++----------
src/lambda_indexer.cpp | 30 +-
src/lambda_indexer.hpp | 263 ++++++------
src/match.hpp | 32 +-
src/misc.hpp | 35 ++
src/options.hpp | 276 ++++++++++---
src/output.hpp | 25 +-
14 files changed, 2061 insertions(+), 1216 deletions(-)
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 58969a9..49bcb36 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -37,5 +37,5 @@ add_subdirectory(src)
# Add Tests
# ----------------------------------------------------------------------------
-message ("\n${ColourBold}Setting up unit tests${ColourReset}")
-add_subdirectory(tests)
+# message ("\n${ColourBold}Setting up unit tests${ColourReset}")
+# add_subdirectory(tests)
diff --git a/LICENSE-AGPL3.rst b/LICENSE-AGPL3.rst
new file mode 100644
index 0000000..980af45
--- /dev/null
+++ b/LICENSE-AGPL3.rst
@@ -0,0 +1,671 @@
+GNU Affero General Public License
+=================================
+
+*Version 3, 19 November 2007*
+*Copyright © 2007 Free Software Foundation, Inc.* <http://fsf.org>
+
+Everyone is permitted to copy and distribute verbatim copies
+of this license document, but changing it is not allowed.
+
+Preamble
+--------
+
+The GNU Affero General Public License is a free, copyleft license for
+software and other kinds of works, specifically designed to ensure
+cooperation with the community in the case of network server software.
+
+The licenses for most software and other practical works are designed
+to take away your freedom to share and change the works. By contrast,
+our General Public Licenses are intended to guarantee your freedom to
+share and change all versions of a program--to make sure it remains free
+software for all its users.
+
+When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+them if you wish), that you receive source code or can get it if you
+want it, that you can change the software or use pieces of it in new
+free programs, and that you know you can do these things.
+
+Developers that use our General Public Licenses protect your rights
+with two steps: **(1)** assert copyright on the software, and **(2)** offer
+you this License which gives you legal permission to copy, distribute
+and/or modify the software.
+
+A secondary benefit of defending all users' freedom is that
+improvements made in alternate versions of the program, if they
+receive widespread use, become available for other developers to
+incorporate. Many developers of free software are heartened and
+encouraged by the resulting cooperation. However, in the case of
+software used on network servers, this result may fail to come about.
+The GNU General Public License permits making a modified version and
+letting the public access it on a server without ever releasing its
+source code to the public.
+
+The GNU Affero General Public License is designed specifically to
+ensure that, in such cases, the modified source code becomes available
+to the community. It requires the operator of a network server to
+provide the source code of the modified version running there to the
+users of that server. Therefore, public use of a modified version, on
+a publicly accessible server, gives the public access to the source
+code of the modified version.
+
+An older license, called the Affero General Public License and
+published by Affero, was designed to accomplish similar goals. This is
+a different license, not a version of the Affero GPL, but Affero has
+released a new version of the Affero GPL which permits relicensing under
+this license.
+
+The precise terms and conditions for copying, distribution and
+modification follow.
+
+TERMS AND CONDITIONS
+--------------------
+
+0. Definitions
+~~~~~~~~~~~~~~
+
+"This License" refers to version 3 of the GNU Affero General Public License.
+
+"Copyright" also means copyright-like laws that apply to other kinds of
+works, such as semiconductor masks.
+
+"The Program" refers to any copyrightable work licensed under this
+License. Each licensee is addressed as "you". "Licensees" and
+"recipients" may be individuals or organizations.
+
+To "modify" a work means to copy from or adapt all or part of the work
+in a fashion requiring copyright permission, other than the making of an
+exact copy. The resulting work is called a "modified version" of the
+earlier work or a work "based on" the earlier work.
+
+A "covered work" means either the unmodified Program or a work based
+on the Program.
+
+To "propagate" a work means to do anything with it that, without
+permission, would make you directly or secondarily liable for
+infringement under applicable copyright law, except executing it on a
+computer or modifying a private copy. Propagation includes copying,
+distribution (with or without modification), making available to the
+public, and in some countries other activities as well.
+
+To "convey" a work means any kind of propagation that enables other
+parties to make or receive copies. Mere interaction with a user through
+a computer network, with no transfer of a copy, is not conveying.
+
+An interactive user interface displays "Appropriate Legal Notices"
+to the extent that it includes a convenient and prominently visible
+feature that **(1)** displays an appropriate copyright notice, and **(2)**
+tells the user that there is no warranty for the work (except to the
+extent that warranties are provided), that licensees may convey the
+work under this License, and how to view a copy of this License. If
+the interface presents a list of user commands or options, such as a
+menu, a prominent item in the list meets this criterion.
+
+1. Source Code
+~~~~~~~~~~~~~~
+
+The "source code" for a work means the preferred form of the work
+for making modifications to it. "Object code" means any non-source
+form of a work.
+
+A "Standard Interface" means an interface that either is an official
+standard defined by a recognized standards body, or, in the case of
+interfaces specified for a particular programming language, one that
+is widely used among developers working in that language.
+
+The "System Libraries" of an executable work include anything, other
+than the work as a whole, that **(a)** is included in the normal form of
+packaging a Major Component, but which is not part of that Major
+Component, and **(b)** serves only to enable use of the work with that
+Major Component, or to implement a Standard Interface for which an
+implementation is available to the public in source code form. A
+"Major Component", in this context, means a major essential component
+(kernel, window system, and so on) of the specific operating system
+(if any) on which the executable work runs, or a compiler used to
+produce the work, or an object code interpreter used to run it.
+
+The "Corresponding Source" for a work in object code form means all
+the source code needed to generate, install, and (for an executable
+work) run the object code and to modify the work, including scripts to
+control those activities. However, it does not include the work's
+System Libraries, or general-purpose tools or generally available free
+programs which are used unmodified in performing those activities but
+which are not part of the work. For example, Corresponding Source
+includes interface definition files associated with source files for
+the work, and the source code for shared libraries and dynamically
+linked subprograms that the work is specifically designed to require,
+such as by intimate data communication or control flow between those
+subprograms and other parts of the work.
+
+The Corresponding Source need not include anything that users
+can regenerate automatically from other parts of the Corresponding
+Source.
+
+The Corresponding Source for a work in source code form is that
+same work.
+
+2. Basic Permissions
+~~~~~~~~~~~~~~~~~~~~
+
+All rights granted under this License are granted for the term of
+copyright on the Program, and are irrevocable provided the stated
+conditions are met. This License explicitly affirms your unlimited
+permission to run the unmodified Program. The output from running a
+covered work is covered by this License only if the output, given its
+content, constitutes a covered work. This License acknowledges your
+rights of fair use or other equivalent, as provided by copyright law.
+
+You may make, run and propagate covered works that you do not
+convey, without conditions so long as your license otherwise remains
+in force. You may convey covered works to others for the sole purpose
+of having them make modifications exclusively for you, or provide you
+with facilities for running those works, provided that you comply with
+the terms of this License in conveying all material for which you do
+not control copyright. Those thus making or running the covered works
+for you must do so exclusively on your behalf, under your direction
+and control, on terms that prohibit them from making any copies of
+your copyrighted material outside their relationship with you.
+
+Conveying under any other circumstances is permitted solely under
+the conditions stated below. Sublicensing is not allowed; section 10
+makes it unnecessary.
+
+3. Protecting Users' Legal Rights From Anti-Circumvention Law
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+No covered work shall be deemed part of an effective technological
+measure under any applicable law fulfilling obligations under article
+11 of the WIPO copyright treaty adopted on 20 December 1996, or
+similar laws prohibiting or restricting circumvention of such
+measures.
+
+When you convey a covered work, you waive any legal power to forbid
+circumvention of technological measures to the extent such circumvention
+is effected by exercising rights under this License with respect to
+the covered work, and you disclaim any intention to limit operation or
+modification of the work as a means of enforcing, against the work's
+users, your or third parties' legal rights to forbid circumvention of
+technological measures.
+
+4. Conveying Verbatim Copies
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You may convey verbatim copies of the Program's source code as you
+receive it, in any medium, provided that you conspicuously and
+appropriately publish on each copy an appropriate copyright notice;
+keep intact all notices stating that this License and any
+non-permissive terms added in accord with section 7 apply to the code;
+keep intact all notices of the absence of any warranty; and give all
+recipients a copy of this License along with the Program.
+
+You may charge any price or no price for each copy that you convey,
+and you may offer support or warranty protection for a fee.
+
+### 5. Conveying Modified Source Versions
+
+You may convey a work based on the Program, or the modifications to
+produce it from the Program, in the form of source code under the
+terms of section 4, provided that you also meet all of these conditions:
+
+* **a)** The work must carry prominent notices stating that you modified
+ it, and giving a relevant date.
+* **b)** The work must carry prominent notices stating that it is
+ released under this License and any conditions added under section 7.
+ This requirement modifies the requirement in section 4 to
+ "keep intact all notices".
+* **c)** You must license the entire work, as a whole, under this
+ License to anyone who comes into possession of a copy. This
+ License will therefore apply, along with any applicable section 7
+ additional terms, to the whole of the work, and all its parts,
+ regardless of how they are packaged. This License gives no
+ permission to license the work in any other way, but it does not
+ invalidate such permission if you have separately received it.
+* **d)** If the work has interactive user interfaces, each must display
+ Appropriate Legal Notices; however, if the Program has interactive
+ interfaces that do not display Appropriate Legal Notices, your
+ work need not make them do so.
+
+A compilation of a covered work with other separate and independent
+works, which are not by their nature extensions of the covered work,
+and which are not combined with it such as to form a larger program,
+in or on a volume of a storage or distribution medium, is called an
+"aggregate" if the compilation and its resulting copyright are not
+used to limit the access or legal rights of the compilation's users
+beyond what the individual works permit. Inclusion of a covered work
+in an aggregate does not cause this License to apply to the other
+parts of the aggregate.
+
+6. Conveying Non-Source Forms
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You may convey a covered work in object code form under the terms
+of sections 4 and 5, provided that you also convey the
+machine-readable Corresponding Source under the terms of this License,
+in one of these ways:
+
+* **a)** Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by the
+ Corresponding Source fixed on a durable physical medium
+ customarily used for software interchange.
+* **b)** Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by a
+ written offer, valid for at least three years and valid for as
+ long as you offer spare parts or customer support for that product
+ model, to give anyone who possesses the object code either **(1)** a
+ copy of the Corresponding Source for all the software in the
+ product that is covered by this License, on a durable physical
+ medium customarily used for software interchange, for a price no
+ more than your reasonable cost of physically performing this
+ conveying of source, or **(2)** access to copy the
+ Corresponding Source from a network server at no charge.
+* **c)** Convey individual copies of the object code with a copy of the
+ written offer to provide the Corresponding Source. This
+ alternative is allowed only occasionally and noncommercially, and
+ only if you received the object code with such an offer, in accord
+ with subsection 6b.
+* **d)** Convey the object code by offering access from a designated
+ place (gratis or for a charge), and offer equivalent access to the
+ Corresponding Source in the same way through the same place at no
+ further charge. You need not require recipients to copy the
+ Corresponding Source along with the object code. If the place to
+ copy the object code is a network server, the Corresponding Source
+ may be on a different server (operated by you or a third party)
+ that supports equivalent copying facilities, provided you maintain
+ clear directions next to the object code saying where to find the
+ Corresponding Source. Regardless of what server hosts the
+ Corresponding Source, you remain obligated to ensure that it is
+ available for as long as needed to satisfy these requirements.
+* **e)** Convey the object code using peer-to-peer transmission, provided
+ you inform other peers where the object code and Corresponding
+ Source of the work are being offered to the general public at no
+ charge under subsection 6d.
+
+A separable portion of the object code, whose source code is excluded
+from the Corresponding Source as a System Library, need not be
+included in conveying the object code work.
+
+A "User Product" is either **(1)** a "consumer product", which means any
+tangible personal property which is normally used for personal, family,
+or household purposes, or **(2)** anything designed or sold for incorporation
+into a dwelling. In determining whether a product is a consumer product,
+doubtful cases shall be resolved in favor of coverage. For a particular
+product received by a particular user, "normally used" refers to a
+typical or common use of that class of product, regardless of the status
+of the particular user or of the way in which the particular user
+actually uses, or expects or is expected to use, the product. A product
+is a consumer product regardless of whether the product has substantial
+commercial, industrial or non-consumer uses, unless such uses represent
+the only significant mode of use of the product.
+
+"Installation Information" for a User Product means any methods,
+procedures, authorization keys, or other information required to install
+and execute modified versions of a covered work in that User Product from
+a modified version of its Corresponding Source. The information must
+suffice to ensure that the continued functioning of the modified object
+code is in no case prevented or interfered with solely because
+modification has been made.
+
+If you convey an object code work under this section in, or with, or
+specifically for use in, a User Product, and the conveying occurs as
+part of a transaction in which the right of possession and use of the
+User Product is transferred to the recipient in perpetuity or for a
+fixed term (regardless of how the transaction is characterized), the
+Corresponding Source conveyed under this section must be accompanied
+by the Installation Information. But this requirement does not apply
+if neither you nor any third party retains the ability to install
+modified object code on the User Product (for example, the work has
+been installed in ROM).
+
+The requirement to provide Installation Information does not include a
+requirement to continue to provide support service, warranty, or updates
+for a work that has been modified or installed by the recipient, or for
+the User Product in which it has been modified or installed. Access to a
+network may be denied when the modification itself materially and
+adversely affects the operation of the network or violates the rules and
+protocols for communication across the network.
+
+Corresponding Source conveyed, and Installation Information provided,
+in accord with this section must be in a format that is publicly
+documented (and with an implementation available to the public in
+source code form), and must require no special password or key for
+unpacking, reading or copying.
+
+7. Additional Terms
+~~~~~~~~~~~~~~~~~~~
+
+"Additional permissions" are terms that supplement the terms of this
+License by making exceptions from one or more of its conditions.
+Additional permissions that are applicable to the entire Program shall
+be treated as though they were included in this License, to the extent
+that they are valid under applicable law. If additional permissions
+apply only to part of the Program, that part may be used separately
+under those permissions, but the entire Program remains governed by
+this License without regard to the additional permissions.
+
+When you convey a copy of a covered work, you may at your option
+remove any additional permissions from that copy, or from any part of
+it. (Additional permissions may be written to require their own
+removal in certain cases when you modify the work.) You may place
+additional permissions on material, added by you to a covered work,
+for which you have or can give appropriate copyright permission.
+
+Notwithstanding any other provision of this License, for material you
+add to a covered work, you may (if authorized by the copyright holders of
+that material) supplement the terms of this License with terms:
+
+* **a)** Disclaiming warranty or limiting liability differently from the
+ terms of sections 15 and 16 of this License; or
+* **b)** Requiring preservation of specified reasonable legal notices or
+ author attributions in that material or in the Appropriate Legal
+ Notices displayed by works containing it; or
+* **c)** Prohibiting misrepresentation of the origin of that material, or
+ requiring that modified versions of such material be marked in
+ reasonable ways as different from the original version; or
+* **d)** Limiting the use for publicity purposes of names of licensors or
+ authors of the material; or
+* **e)** Declining to grant rights under trademark law for use of some
+ trade names, trademarks, or service marks; or
+* **f)** Requiring indemnification of licensors and authors of that
+ material by anyone who conveys the material (or modified versions of
+ it) with contractual assumptions of liability to the recipient, for
+ any liability that these contractual assumptions directly impose on
+ those licensors and authors.
+
+All other non-permissive additional terms are considered "further
+restrictions" within the meaning of section 10. If the Program as you
+received it, or any part of it, contains a notice stating that it is
+governed by this License along with a term that is a further
+restriction, you may remove that term. If a license document contains
+a further restriction but permits relicensing or conveying under this
+License, you may add to a covered work material governed by the terms
+of that license document, provided that the further restriction does
+not survive such relicensing or conveying.
+
+If you add terms to a covered work in accord with this section, you
+must place, in the relevant source files, a statement of the
+additional terms that apply to those files, or a notice indicating
+where to find the applicable terms.
+
+Additional terms, permissive or non-permissive, may be stated in the
+form of a separately written license, or stated as exceptions;
+the above requirements apply either way.
+
+8. Termination
+~~~~~~~~~~~~~~
+
+You may not propagate or modify a covered work except as expressly
+provided under this License. Any attempt otherwise to propagate or
+modify it is void, and will automatically terminate your rights under
+this License (including any patent licenses granted under the third
+paragraph of section 11).
+
+However, if you cease all violation of this License, then your
+license from a particular copyright holder is reinstated **(a)**
+provisionally, unless and until the copyright holder explicitly and
+finally terminates your license, and **(b)** permanently, if the copyright
+holder fails to notify you of the violation by some reasonable means
+prior to 60 days after the cessation.
+
+Moreover, your license from a particular copyright holder is
+reinstated permanently if the copyright holder notifies you of the
+violation by some reasonable means, this is the first time you have
+received notice of violation of this License (for any work) from that
+copyright holder, and you cure the violation prior to 30 days after
+your receipt of the notice.
+
+Termination of your rights under this section does not terminate the
+licenses of parties who have received copies or rights from you under
+this License. If your rights have been terminated and not permanently
+reinstated, you do not qualify to receive new licenses for the same
+material under section 10.
+
+9. Acceptance Not Required for Having Copies
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You are not required to accept this License in order to receive or
+run a copy of the Program. Ancillary propagation of a covered work
+occurring solely as a consequence of using peer-to-peer transmission
+to receive a copy likewise does not require acceptance. However,
+nothing other than this License grants you permission to propagate or
+modify any covered work. These actions infringe copyright if you do
+not accept this License. Therefore, by modifying or propagating a
+covered work, you indicate your acceptance of this License to do so.
+
+10. Automatic Licensing of Downstream Recipients
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each time you convey a covered work, the recipient automatically
+receives a license from the original licensors, to run, modify and
+propagate that work, subject to this License. You are not responsible
+for enforcing compliance by third parties with this License.
+
+An "entity transaction" is a transaction transferring control of an
+organization, or substantially all assets of one, or subdividing an
+organization, or merging organizations. If propagation of a covered
+work results from an entity transaction, each party to that
+transaction who receives a copy of the work also receives whatever
+licenses to the work the party's predecessor in interest had or could
+give under the previous paragraph, plus a right to possession of the
+Corresponding Source of the work from the predecessor in interest, if
+the predecessor has it or can get it with reasonable efforts.
+
+You may not impose any further restrictions on the exercise of the
+rights granted or affirmed under this License. For example, you may
+not impose a license fee, royalty, or other charge for exercise of
+rights granted under this License, and you may not initiate litigation
+(including a cross-claim or counterclaim in a lawsuit) alleging that
+any patent claim is infringed by making, using, selling, offering for
+sale, or importing the Program or any portion of it.
+
+11. Patents
+~~~~~~~~~~~
+
+A "contributor" is a copyright holder who authorizes use under this
+License of the Program or a work on which the Program is based. The
+work thus licensed is called the contributor's "contributor version".
+
+A contributor's "essential patent claims" are all patent claims
+owned or controlled by the contributor, whether already acquired or
+hereafter acquired, that would be infringed by some manner, permitted
+by this License, of making, using, or selling its contributor version,
+but do not include claims that would be infringed only as a
+consequence of further modification of the contributor version. For
+purposes of this definition, "control" includes the right to grant
+patent sublicenses in a manner consistent with the requirements of
+this License.
+
+Each contributor grants you a non-exclusive, worldwide, royalty-free
+patent license under the contributor's essential patent claims, to
+make, use, sell, offer for sale, import and otherwise run, modify and
+propagate the contents of its contributor version.
+
+In the following three paragraphs, a "patent license" is any express
+agreement or commitment, however denominated, not to enforce a patent
+(such as an express permission to practice a patent or covenant not to
+sue for patent infringement). To "grant" such a patent license to a
+party means to make such an agreement or commitment not to enforce a
+patent against the party.
+
+If you convey a covered work, knowingly relying on a patent license,
+and the Corresponding Source of the work is not available for anyone
+to copy, free of charge and under the terms of this License, through a
+publicly available network server or other readily accessible means,
+then you must either **(1)** cause the Corresponding Source to be so
+available, or **(2)** arrange to deprive yourself of the benefit of the
+patent license for this particular work, or **(3)** arrange, in a manner
+consistent with the requirements of this License, to extend the patent
+license to downstream recipients. "Knowingly relying" means you have
+actual knowledge that, but for the patent license, your conveying the
+covered work in a country, or your recipient's use of the covered work
+in a country, would infringe one or more identifiable patents in that
+country that you have reason to believe are valid.
+
+If, pursuant to or in connection with a single transaction or
+arrangement, you convey, or propagate by procuring conveyance of, a
+covered work, and grant a patent license to some of the parties
+receiving the covered work authorizing them to use, propagate, modify
+or convey a specific copy of the covered work, then the patent license
+you grant is automatically extended to all recipients of the covered
+work and works based on it.
+
+A patent license is "discriminatory" if it does not include within
+the scope of its coverage, prohibits the exercise of, or is
+conditioned on the non-exercise of one or more of the rights that are
+specifically granted under this License. You may not convey a covered
+work if you are a party to an arrangement with a third party that is
+in the business of distributing software, under which you make payment
+to the third party based on the extent of your activity of conveying
+the work, and under which the third party grants, to any of the
+parties who would receive the covered work from you, a discriminatory
+patent license **(a)** in connection with copies of the covered work
+conveyed by you (or copies made from those copies), or **(b)** primarily
+for and in connection with specific products or compilations that
+contain the covered work, unless you entered into that arrangement,
+or that patent license was granted, prior to 28 March 2007.
+
+Nothing in this License shall be construed as excluding or limiting
+any implied license or other defenses to infringement that may
+otherwise be available to you under applicable patent law.
+
+12. No Surrender of Others' Freedom
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot convey a
+covered work so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you may
+not convey it at all. For example, if you agree to terms that obligate you
+to collect a royalty for further conveying from those to whom you convey
+the Program, the only way you could satisfy both those terms and this
+License would be to refrain entirely from conveying the Program.
+
+13. Remote Network Interaction; Use with the GNU General Public License
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Notwithstanding any other provision of this License, if you modify the
+Program, your modified version must prominently offer all users
+interacting with it remotely through a computer network (if your version
+supports such interaction) an opportunity to receive the Corresponding
+Source of your version by providing access to the Corresponding Source
+from a network server at no charge, through some standard or customary
+means of facilitating copying of software. This Corresponding Source
+shall include the Corresponding Source for any work covered by version 3
+of the GNU General Public License that is incorporated pursuant to the
+following paragraph.
+
+Notwithstanding any other provision of this License, you have
+permission to link or combine any covered work with a work licensed
+under version 3 of the GNU General Public License into a single
+combined work, and to convey the resulting work. The terms of this
+License will continue to apply to the part which is the covered work,
+but the work with which it is combined will remain governed by version
+3 of the GNU General Public License.
+
+14. Revised Versions of this License
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The Free Software Foundation may publish revised and/or new versions of
+the GNU Affero General Public License from time to time. Such new versions
+will be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the
+Program specifies that a certain numbered version of the GNU Affero General
+Public License "or any later version" applies to it, you have the
+option of following the terms and conditions either of that numbered
+version or of any later version published by the Free Software
+Foundation. If the Program does not specify a version number of the
+GNU Affero General Public License, you may choose any version ever published
+by the Free Software Foundation.
+
+If the Program specifies that a proxy can decide which future
+versions of the GNU Affero General Public License can be used, that proxy's
+public statement of acceptance of a version permanently authorizes you
+to choose that version for the Program.
+
+Later license versions may give you additional or different
+permissions. However, no additional obligations are imposed on any
+author or copyright holder as a result of your choosing to follow a
+later version.
+
+15. Disclaimer of Warranty
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
+OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
+THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
+IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
+ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+16. Limitation of Liability
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
+THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
+GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
+EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
+SUCH DAMAGES.
+
+17. Interpretation of Sections 15 and 16
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the disclaimer of warranty and limitation of liability provided
+above cannot be given local legal effect according to their terms,
+reviewing courts shall apply local law that most closely approximates
+an absolute waiver of all civil liability in connection with the
+Program, unless a warranty or assumption of liability accompanies a
+copy of the Program in return for a fee.
+
+*END OF TERMS AND CONDITIONS*
+
+How to Apply These Terms to Your New Programs
+---------------------------------------------
+
+If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+state the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+| <one line to give the program's name and a brief idea of what it does.>
+| Copyright (C) <year> <name of author>
+|
+| This program is free software: you can redistribute it and/or modify
+| it under the terms of the GNU Affero General Public License as published by
+| the Free Software Foundation, either version 3 of the License, or
+| (at your option) any later version.
+|
+| This program is distributed in the hope that it will be useful,
+| but WITHOUT ANY WARRANTY; without even the implied warranty of
+| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+| GNU Affero General Public License for more details.
+|
+| You should have received a copy of the GNU Affero General Public License
+| along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If your software can interact with users remotely through a computer
+network, you should also make sure that it provides a way for users to
+get its source. For example, if your program is a web application, its
+interface could display a "Source" link that leads users to an archive
+of the code. There are many ways you could offer source, and different
+solutions will be better for different programs; see section 13 for the
+specific requirements.
+
+You should also get your employer (if you work as a programmer) or school,
+if any, to sign a "copyright disclaimer" for the program, if necessary.
+For more information on this, and how to apply and follow the GNU AGPL, see
+<http://www.gnu.org/licenses/>.
diff --git a/LICENSE-GPL3.rst b/LICENSE-GPL3.rst
deleted file mode 100644
index b2cade9..0000000
--- a/LICENSE-GPL3.rst
+++ /dev/null
@@ -1,704 +0,0 @@
-GNU GENERAL PUBLIC LICENSE
-==========================
-
-Version 3, 29 June 2007
-
-Copyright (C) 2007 `Free Software Foundation, Inc. <http://fsf.org/>`_
-
-Everyone is permitted to copy and distribute verbatim copies of this
-license document, but changing it is not allowed.
-
-Preamble
---------
-
-The GNU General Public License is a free, copyleft license for software
-and other kinds of works.
-
-The licenses for most software and other practical works are designed to
-take away your freedom to share and change the works. By contrast, the
-GNU General Public License is intended to guarantee your freedom to
-share and change all versions of a program--to make sure it remains free
-software for all its users. We, the Free Software Foundation, use the
-GNU General Public License for most of our software; it applies also to
-any other work released this way by its authors. You can apply it to
-your programs, too.
-
-When we speak of free software, we are referring to freedom, not price.
-Our General Public Licenses are designed to make sure that you have the
-freedom to distribute copies of free software (and charge for them if
-you wish), that you receive source code or can get it if you want it,
-that you can change the software or use pieces of it in new free
-programs, and that you know you can do these things.
-
-To protect your rights, we need to prevent others from denying you these
-rights or asking you to surrender the rights. Therefore, you have
-certain responsibilities if you distribute copies of the software, or if
-you modify it: responsibilities to respect the freedom of others.
-
-For example, if you distribute copies of such a program, whether gratis
-or for a fee, you must pass on to the recipients the same freedoms that
-you received. You must make sure that they, too, receive or can get the
-source code. And you must show them these terms so they know their
-rights.
-
-Developers that use the GNU GPL protect your rights with two steps:
-
-1. assert copyright on the software, and
-2. offer you this License giving you legal permission to copy,
- distribute and/or modify it.
-
-For the developers' and authors' protection, the GPL clearly explains
-that there is no warranty for this free software. For both users' and
-authors' sake, the GPL requires that modified versions be marked as
-changed, so that their problems will not be attributed erroneously to
-authors of previous versions.
-
-Some devices are designed to deny users access to install or run
-modified versions of the software inside them, although the manufacturer
-can do so. This is fundamentally incompatible with the aim of protecting
-users' freedom to change the software. The systematic pattern of such
-abuse occurs in the area of products for individuals to use, which is
-precisely where it is most unacceptable. Therefore, we have designed
-this version of the GPL to prohibit the practice for those products. If
-such problems arise substantially in other domains, we stand ready to
-extend this provision to those domains in future versions of the GPL, as
-needed to protect the freedom of users.
-
-Finally, every program is threatened constantly by software patents.
-States should not allow patents to restrict development and use of
-software on general-purpose computers, but in those that do, we wish to
-avoid the special danger that patents applied to a free program could
-make it effectively proprietary. To prevent this, the GPL assures that
-patents cannot be used to render the program non-free.
-
-The precise terms and conditions for copying, distribution and
-modification follow.
-
-TERMS AND CONDITIONS
---------------------
-
-0. Definitions.
-~~~~~~~~~~~~~~~
-
-*This License* refers to version 3 of the GNU General Public License.
-
-*Copyright* also means copyright-like laws that apply to other kinds of
-works, such as semiconductor masks.
-
-*The Program* refers to any copyrightable work licensed under this
-License. Each licensee is addressed as *you*. *Licensees* and
-*recipients* may be individuals or organizations.
-
-To *modify* a work means to copy from or adapt all or part of the work
-in a fashion requiring copyright permission, other than the making of an
-exact copy. The resulting work is called a *modified version* of the
-earlier work or a work *based on* the earlier work.
-
-A *covered work* means either the unmodified Program or a work based on
-the Program.
-
-To *propagate* a work means to do anything with it that, without
-permission, would make you directly or secondarily liable for
-infringement under applicable copyright law, except executing it on a
-computer or modifying a private copy. Propagation includes copying,
-distribution (with or without modification), making available to the
-public, and in some countries other activities as well.
-
-To *convey* a work means any kind of propagation that enables other
-parties to make or receive copies. Mere interaction with a user through
-a computer network, with no transfer of a copy, is not conveying.
-
-An interactive user interface displays *Appropriate Legal Notices* to
-the extent that it includes a convenient and prominently visible feature
-that
-
-1. displays an appropriate copyright notice, and
-2. tells the user that there is no warranty for the work (except to the
- extent that warranties are provided), that licensees may convey the
- work under this License, and how to view a copy of this License.
-
-If the interface presents a list of user commands or options, such as a
-menu, a prominent item in the list meets this criterion.
-
-1. Source Code.
-~~~~~~~~~~~~~~~
-
-The *source code* for a work means the preferred form of the work for
-making modifications to it. *Object code* means any non-source form of a
-work.
-
-A *Standard Interface* means an interface that either is an official
-standard defined by a recognized standards body, or, in the case of
-interfaces specified for a particular programming language, one that is
-widely used among developers working in that language.
-
-The *System Libraries* of an executable work include anything, other
-than the work as a whole, that (a) is included in the normal form of
-packaging a Major Component, but which is not part of that Major
-Component, and (b) serves only to enable use of the work with that Major
-Component, or to implement a Standard Interface for which an
-implementation is available to the public in source code form. A *Major
-Component*, in this context, means a major essential component (kernel,
-window system, and so on) of the specific operating system (if any) on
-which the executable work runs, or a compiler used to produce the work,
-or an object code interpreter used to run it.
-
-The *Corresponding Source* for a work in object code form means all the
-source code needed to generate, install, and (for an executable work)
-run the object code and to modify the work, including scripts to control
-those activities. However, it does not include the work's System
-Libraries, or general-purpose tools or generally available free programs
-which are used unmodified in performing those activities but which are
-not part of the work. For example, Corresponding Source includes
-interface definition files associated with source files for the work,
-and the source code for shared libraries and dynamically linked
-subprograms that the work is specifically designed to require, such as
-by intimate data communication or control flow between those subprograms
-and other parts of the work.
-
-The Corresponding Source need not include anything that users can
-regenerate automatically from other parts of the Corresponding Source.
-
-The Corresponding Source for a work in source code form is that same
-work.
-
-2. Basic Permissions.
-~~~~~~~~~~~~~~~~~~~~~
-
-All rights granted under this License are granted for the term of
-copyright on the Program, and are irrevocable provided the stated
-conditions are met. This License explicitly affirms your unlimited
-permission to run the unmodified Program. The output from running a
-covered work is covered by this License only if the output, given its
-content, constitutes a covered work. This License acknowledges your
-rights of fair use or other equivalent, as provided by copyright law.
-
-You may make, run and propagate covered works that you do not convey,
-without conditions so long as your license otherwise remains in force.
-You may convey covered works to others for the sole purpose of having
-them make modifications exclusively for you, or provide you with
-facilities for running those works, provided that you comply with the
-terms of this License in conveying all material for which you do not
-control copyright. Those thus making or running the covered works for
-you must do so exclusively on your behalf, under your direction and
-control, on terms that prohibit them from making any copies of your
-copyrighted material outside their relationship with you.
-
-Conveying under any other circumstances is permitted solely under the
-conditions stated below. Sublicensing is not allowed; section 10 makes
-it unnecessary.
-
-3. Protecting Users' Legal Rights From Anti-Circumvention Law.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-No covered work shall be deemed part of an effective technological
-measure under any applicable law fulfilling obligations under article 11
-of the WIPO copyright treaty adopted on 20 December 1996, or similar
-laws prohibiting or restricting circumvention of such measures.
-
-When you convey a covered work, you waive any legal power to forbid
-circumvention of technological measures to the extent such circumvention
-is effected by exercising rights under this License with respect to the
-covered work, and you disclaim any intention to limit operation or
-modification of the work as a means of enforcing, against the work's
-users, your or third parties' legal rights to forbid circumvention of
-technological measures.
-
-4. Conveying Verbatim Copies.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-You may convey verbatim copies of the Program's source code as you
-receive it, in any medium, provided that you conspicuously and
-appropriately publish on each copy an appropriate copyright notice; keep
-intact all notices stating that this License and any non-permissive
-terms added in accord with section 7 apply to the code; keep intact all
-notices of the absence of any warranty; and give all recipients a copy
-of this License along with the Program.
-
-You may charge any price or no price for each copy that you convey, and
-you may offer support or warranty protection for a fee.
-
-5. Conveying Modified Source Versions.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-You may convey a work based on the Program, or the modifications to
-produce it from the Program, in the form of source code under the terms
-of section 4, provided that you also meet all of these conditions:
-
-- a) The work must carry prominent notices stating that you modified it,
-and giving a relevant date. - b) The work must carry prominent notices
-stating that it is released under this License and any conditions added
-under section 7. This requirement modifies the requirement in section 4
-to *keep intact all notices*. - c) You must license the entire work, as
-a whole, under this License to anyone who comes into possession of a
-copy. This License will therefore apply, along with any applicable
-section 7 additional terms, to the whole of the work, and all its parts,
-regardless of how they are packaged. This License gives no permission to
-license the work in any other way, but it does not invalidate such
-permission if you have separately received it. - d) If the work has
-interactive user interfaces, each must display Appropriate Legal
-Notices; however, if the Program has interactive interfaces that do not
-display Appropriate Legal Notices, your work need not make them do so.
-
-A compilation of a covered work with other separate and independent
-works, which are not by their nature extensions of the covered work, and
-which are not combined with it such as to form a larger program, in or
-on a volume of a storage or distribution medium, is called an
-*aggregate* if the compilation and its resulting copyright are not used
-to limit the access or legal rights of the compilation's users beyond
-what the individual works permit. Inclusion of a covered work in an
-aggregate does not cause this License to apply to the other parts of the
-aggregate.
-
-6. Conveying Non-Source Forms.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-You may convey a covered work in object code form under the terms of
-sections 4 and 5, provided that you also convey the machine-readable
-Corresponding Source under the terms of this License, in one of these
-ways:
-
-- a) Convey the object code in, or embodied in, a physical product
-(including a physical distribution medium), accompanied by the
-Corresponding Source fixed on a durable physical medium customarily used
-for software interchange. - b) Convey the object code in, or embodied
-in, a physical product (including a physical distribution medium),
-accompanied by a written offer, valid for at least three years and valid
-for as long as you offer spare parts or customer support for that
-product model, to give anyone who possesses the object code either
- 1. a copy of the Corresponding Source for all the software in the
-product that is covered by this License, on a durable physical medium
-customarily used for software interchange, for a price no more than your
-reasonable cost of physically performing this conveying of source, or 2.
-access to copy the Corresponding Source from a network server at no
-charge.
-
-- c) Convey individual copies of the object code with a copy of the
-written offer to provide the Corresponding Source. This alternative is
-allowed only occasionally and noncommercially, and only if you received
-the object code with such an offer, in accord with subsection 6b. - d)
-Convey the object code by offering access from a designated place
-(gratis or for a charge), and offer equivalent access to the
-Corresponding Source in the same way through the same place at no
-further charge. You need not require recipients to copy the
-Corresponding Source along with the object code. If the place to copy
-the object code is a network server, the Corresponding Source may be on
-a different server operated by you or a third party) that supports
-equivalent copying facilities, provided you maintain clear directions
-next to the object code saying where to find the Corresponding Source.
-Regardless of what server hosts the Corresponding Source, you remain
-obligated to ensure that it is available for as long as needed to
-satisfy these requirements. - e) Convey the object code using
-peer-to-peer transmission, provided you inform other peers where the
-object code and Corresponding Source of the work are being offered to
-the general public at no charge under subsection 6d.
-
-A separable portion of the object code, whose source code is excluded
-from the Corresponding Source as a System Library, need not be included
-in conveying the object code work.
-
-A *User Product* is either
-
-1. a *consumer product*, which means any tangible personal property
- which is normally used for personal, family, or household purposes,
- or
-2. anything designed or sold for incorporation into a dwelling.
-
-In determining whether a product is a consumer product, doubtful cases
-shall be resolved in favor of coverage. For a particular product
-received by a particular user, *normally used* refers to a typical or
-common use of that class of product, regardless of the status of the
-particular user or of the way in which the particular user actually
-uses, or expects or is expected to use, the product. A product is a
-consumer product regardless of whether the product has substantial
-commercial, industrial or non-consumer uses, unless such uses represent
-the only significant mode of use of the product.
-
-*Installation Information* for a User Product means any methods,
-procedures, authorization keys, or other information required to install
-and execute modified versions of a covered work in that User Product
-from a modified version of its Corresponding Source. The information
-must suffice to ensure that the continued functioning of the modified
-object code is in no case prevented or interfered with solely because
-modification has been made.
-
-If you convey an object code work under this section in, or with, or
-specifically for use in, a User Product, and the conveying occurs as
-part of a transaction in which the right of possession and use of the
-User Product is transferred to the recipient in perpetuity or for a
-fixed term (regardless of how the transaction is characterized), the
-Corresponding Source conveyed under this section must be accompanied by
-the Installation Information. But this requirement does not apply if
-neither you nor any third party retains the ability to install modified
-object code on the User Product (for example, the work has been
-installed in ROM).
-
-The requirement to provide Installation Information does not include a
-requirement to continue to provide support service, warranty, or updates
-for a work that has been modified or installed by the recipient, or for
-the User Product in which it has been modified or installed. Access to a
-network may be denied when the modification itself materially and
-adversely affects the operation of the network or violates the rules and
-protocols for communication across the network.
-
-Corresponding Source conveyed, and Installation Information provided, in
-accord with this section must be in a format that is publicly documented
-(and with an implementation available to the public in source code
-form), and must require no special password or key for unpacking,
-reading or copying.
-
-7. Additional Terms.
-~~~~~~~~~~~~~~~~~~~~
-
-*Additional permissions* are terms that supplement the terms of this
-License by making exceptions from one or more of its conditions.
-Additional permissions that are applicable to the entire Program shall
-be treated as though they were included in this License, to the extent
-that they are valid under applicable law. If additional permissions
-apply only to part of the Program, that part may be used separately
-under those permissions, but the entire Program remains governed by this
-License without regard to the additional permissions.
-
-When you convey a copy of a covered work, you may at your option remove
-any additional permissions from that copy, or from any part of it.
-(Additional permissions may be written to require their own removal in
-certain cases when you modify the work.) You may place additional
-permissions on material, added by you to a covered work, for which you
-have or can give appropriate copyright permission.
-
-Notwithstanding any other provision of this License, for material you
-add to a covered work, you may (if authorized by the copyright holders
-of that material) supplement the terms of this License with terms:
-
-a. Disclaiming warranty or limiting liability differently from the terms
- of sections 15 and 16 of this License; or
-b. Requiring preservation of specified reasonable legal notices or
- author attributions in that material or in the Appropriate Legal
- Notices displayed by works containing it; or
-c. Prohibiting misrepresentation of the origin of that material, or
- requiring that modified versions of such material be marked in
- reasonable ways as different from the original version; or
-d. Limiting the use for publicity purposes of names of licensors or
- authors of the material; or
-e. Declining to grant rights under trademark law for use of some trade
- names, trademarks, or service marks; or
-f. Requiring indemnification of licensors and authors of that material
- by anyone who conveys the material (or modified versions of it) with
- contractual assumptions of liability to the recipient, for any
- liability that these contractual assumptions directly impose on those
- licensors and authors.
-
-All other non-permissive additional terms are considered *further
-restrictions* within the meaning of section 10. If the Program as you
-received it, or any part of it, contains a notice stating that it is
-governed by this License along with a term that is a further
-restriction, you may remove that term. If a license document contains a
-further restriction but permits relicensing or conveying under this
-License, you may add to a covered work material governed by the terms of
-that license document, provided that the further restriction does not
-survive such relicensing or conveying.
-
-If you add terms to a covered work in accord with this section, you must
-place, in the relevant source files, a statement of the additional terms
-that apply to those files, or a notice indicating where to find the
-applicable terms.
-
-Additional terms, permissive or non-permissive, may be stated in the
-form of a separately written license, or stated as exceptions; the above
-requirements apply either way.
-
-8. Termination.
-~~~~~~~~~~~~~~~
-
-You may not propagate or modify a covered work except as expressly
-provided under this License. Any attempt otherwise to propagate or
-modify it is void, and will automatically terminate your rights under
-this License (including any patent licenses granted under the third
-paragraph of section 11).
-
-However, if you cease all violation of this License, then your license
-from a particular copyright holder is reinstated
-
-a. provisionally, unless and until the copyright holder explicitly and
- finally terminates your license, and
-b. permanently, if the copyright holder fails to notify you of the
- violation by some reasonable means prior to 60 days after the
- cessation.
-
-Moreover, your license from a particular copyright holder is reinstated
-permanently if the copyright holder notifies you of the violation by
-some reasonable means, this is the first time you have received notice
-of violation of this License (for any work) from that copyright holder,
-and you cure the violation prior to 30 days after your receipt of the
-notice.
-
-Termination of your rights under this section does not terminate the
-licenses of parties who have received copies or rights from you under
-this License. If your rights have been terminated and not permanently
-reinstated, you do not qualify to receive new licenses for the same
-material under section 10.
-
-9. Acceptance Not Required for Having Copies.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-You are not required to accept this License in order to receive or run a
-copy of the Program. Ancillary propagation of a covered work occurring
-solely as a consequence of using peer-to-peer transmission to receive a
-copy likewise does not require acceptance. However, nothing other than
-this License grants you permission to propagate or modify any covered
-work. These actions infringe copyright if you do not accept this
-License. Therefore, by modifying or propagating a covered work, you
-indicate your acceptance of this License to do so.
-
-10. Automatic Licensing of Downstream Recipients.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Each time you convey a covered work, the recipient automatically
-receives a license from the original licensors, to run, modify and
-propagate that work, subject to this License. You are not responsible
-for enforcing compliance by third parties with this License.
-
-An *entity transaction* is a transaction transferring control of an
-organization, or substantially all assets of one, or subdividing an
-organization, or merging organizations. If propagation of a covered work
-results from an entity transaction, each party to that transaction who
-receives a copy of the work also receives whatever licenses to the work
-the party's predecessor in interest had or could give under the previous
-paragraph, plus a right to possession of the Corresponding Source of the
-work from the predecessor in interest, if the predecessor has it or can
-get it with reasonable efforts.
-
-You may not impose any further restrictions on the exercise of the
-rights granted or affirmed under this License. For example, you may not
-impose a license fee, royalty, or other charge for exercise of rights
-granted under this License, and you may not initiate litigation
-(including a cross-claim or counterclaim in a lawsuit) alleging that any
-patent claim is infringed by making, using, selling, offering for sale,
-or importing the Program or any portion of it.
-
-11. Patents.
-~~~~~~~~~~~~
-
-A *contributor* is a copyright holder who authorizes use under this
-License of the Program or a work on which the Program is based. The work
-thus licensed is called the contributor's *contributor version*.
-
-A contributor's *essential patent claims* are all patent claims owned or
-controlled by the contributor, whether already acquired or hereafter
-acquired, that would be infringed by some manner, permitted by this
-License, of making, using, or selling its contributor version, but do
-not include claims that would be infringed only as a consequence of
-further modification of the contributor version. For purposes of this
-definition, *control* includes the right to grant patent sublicenses in
-a manner consistent with the requirements of this License.
-
-Each contributor grants you a non-exclusive, worldwide, royalty-free
-patent license under the contributor's essential patent claims, to make,
-use, sell, offer for sale, import and otherwise run, modify and
-propagate the contents of its contributor version.
-
-In the following three paragraphs, a *patent license* is any express
-agreement or commitment, however denominated, not to enforce a patent
-(such as an express permission to practice a patent or covenant not to
-sue for patent infringement). To *grant* such a patent license to a
-party means to make such an agreement or commitment not to enforce a
-patent against the party.
-
-If you convey a covered work, knowingly relying on a patent license, and
-the Corresponding Source of the work is not available for anyone to
-copy, free of charge and under the terms of this License, through a
-publicly available network server or other readily accessible means,
-then you must either
-
-1. cause the Corresponding Source to be so available, or
-2. arrange to deprive yourself of the benefit of the patent license for
- this particular work, or
-3. arrange, in a manner consistent with the requirements of this
- License, to extend the patent license to downstream recipients.
-
-*Knowingly relying* means you have actual knowledge that, but for the
-patent license, your conveying the covered work in a country, or your
-recipient's use of the covered work in a country, would infringe one or
-more identifiable patents in that country that you have reason to
-believe are valid.
-
-If, pursuant to or in connection with a single transaction or
-arrangement, you convey, or propagate by procuring conveyance of, a
-covered work, and grant a patent license to some of the parties
-receiving the covered work authorizing them to use, propagate, modify or
-convey a specific copy of the covered work, then the patent license you
-grant is automatically extended to all recipients of the covered work
-and works based on it.
-
-A patent license is *discriminatory* if it does not include within the
-scope of its coverage, prohibits the exercise of, or is conditioned on
-the non-exercise of one or more of the rights that are specifically
-granted under this License. You may not convey a covered work if you are
-a party to an arrangement with a third party that is in the business of
-distributing software, under which you make payment to the third party
-based on the extent of your activity of conveying the work, and under
-which the third party grants, to any of the parties who would receive
-the covered work from you, a discriminatory patent license
-
-a. in connection with copies of the covered work conveyed by you (or
- copies made from those copies), or
-b. primarily for and in connection with specific products or
- compilations that contain the covered work, unless you entered into
- that arrangement, or that patent license was granted, prior to 28
- March 2007.
-
-Nothing in this License shall be construed as excluding or limiting any
-implied license or other defenses to infringement that may otherwise be
-available to you under applicable patent law.
-
-12. No Surrender of Others' Freedom.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-If conditions are imposed on you (whether by court order, agreement or
-otherwise) that contradict the conditions of this License, they do not
-excuse you from the conditions of this License. If you cannot convey a
-covered work so as to satisfy simultaneously your obligations under this
-License and any other pertinent obligations, then as a consequence you
-may not convey it at all. For example, if you agree to terms that
-obligate you to collect a royalty for further conveying from those to
-whom you convey the Program, the only way you could satisfy both those
-terms and this License would be to refrain entirely from conveying the
-Program.
-
-13. Use with the GNU Affero General Public License.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Notwithstanding any other provision of this License, you have permission
-to link or combine any covered work with a work licensed under version 3
-of the GNU Affero General Public License into a single combined work,
-and to convey the resulting work. The terms of this License will
-continue to apply to the part which is the covered work, but the special
-requirements of the GNU Affero General Public License, section 13,
-concerning interaction through a network will apply to the combination
-as such.
-
-14. Revised Versions of this License.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The Free Software Foundation may publish revised and/or new versions of
-the GNU General Public License from time to time. Such new versions will
-be similar in spirit to the present version, but may differ in detail to
-address new problems or concerns.
-
-Each version is given a distinguishing version number. If the Program
-specifies that a certain numbered version of the GNU General Public
-License *or any later version* applies to it, you have the option of
-following the terms and conditions either of that numbered version or of
-any later version published by the Free Software Foundation. If the
-Program does not specify a version number of the GNU General Public
-License, you may choose any version ever published by the Free Software
-Foundation.
-
-If the Program specifies that a proxy can decide which future versions
-of the GNU General Public License can be used, that proxy's public
-statement of acceptance of a version permanently authorizes you to
-choose that version for the Program.
-
-Later license versions may give you additional or different permissions.
-However, no additional obligations are imposed on any author or
-copyright holder as a result of your choosing to follow a later version.
-
-15. Disclaimer of Warranty.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
-APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
-HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM *AS IS* WITHOUT
-WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT
-LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
-PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF
-THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
-THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
-
-16. Limitation of Liability.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
-WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR
-CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
-INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
-ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT
-NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES
-SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
-WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
-ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
-
-17. Interpretation of Sections 15 and 16.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-If the disclaimer of warranty and limitation of liability provided above
-cannot be given local legal effect according to their terms, reviewing
-courts shall apply local law that most closely approximates an absolute
-waiver of all civil liability in connection with the Program, unless a
-warranty or assumption of liability accompanies a copy of the Program in
-return for a fee.
-
-END OF TERMS AND CONDITIONS
----------------------------
-
-How to Apply These Terms to Your New Programs
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-If you develop a new program, and you want it to be of the greatest
-possible use to the public, the best way to achieve this is to make it
-free software which everyone can redistribute and change under these
-terms.
-
-To do so, attach the following notices to the program. It is safest to
-attach them to the start of each source file to most effectively state
-the exclusion of warranty; and each file should have at least the
-*copyright* line and a pointer to where the full notice is found.
-
-::
-
- <one line to give the program's name and a brief idea of what it does.>
- Copyright (C) <year> <name of author>
-
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with this program. If not, see <http://www.gnu.org/licenses/>.
-
-Also add information on how to contact you by electronic and paper mail.
-
-If the program does terminal interaction, make it output a short notice
-like this when it starts in an interactive mode:
-
-::
-
- <program> Copyright (C) <year> <name of author>
- This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
- This is free software, and you are welcome to redistribute it
- under certain conditions; type `show c' for details.
-
-The hypothetical commands ``show w`` and ``show c`` should show the
-appropriate parts of the General Public License. Of course, your
-program's commands might be different; for a GUI interface, you would
-use an *about box*.
-
-You should also get your employer (if you work as a programmer) or
-school, if any, to sign a *copyright disclaimer* for the program, if
-necessary. For more information on this, and how to apply and follow the
-GNU GPL, see
-`http://www.gnu.org/licenses/ <http://www.gnu.org/licenses/>`_.
-
-The GNU General Public License does not permit incorporating your
-program into proprietary programs. If your program is a subroutine
-library, you may consider it more useful to permit linking proprietary
-applications with the library. If this is what you want to do, use the
-GNU Lesser General Public License instead of this License. But first,
-please read
-`http://www.gnu.org/philosophy/why-not-lgpl.html <http://www.gnu.org/philosophy/why-not-lgpl.html>`_.
diff --git a/LICENSE.rst b/LICENSE.rst
index 6c89f88..2c47a5b 100644
--- a/LICENSE.rst
+++ b/LICENSE.rst
@@ -6,15 +6,15 @@ lambda copyright
All rights reserved.
Lambda is *free software*: you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation, either version 3 of the License, or
-(at your option) any later version.
+it under the terms of the GNU Affero General Public License as
+published by the Free Software Foundation, either version 3 of the
+License, or (at your option) any later version.
Lambda is distributed in the hope that it will be useful,
but **without any warranty**; without even the implied warranty of
**merchantability** or **fitness for a particular purpose**.
-See the file `LICENSE-GPL3.rst <./LICENSE-GPL3.rst>`__ or
+See the file `LICENSE-AGPL3.rst <./LICENSE-AGPL3.rst>`__ or
http://www.gnu.org/licenses/ for a full text of the license and the
rights and obligations implied.
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 8c18c27..846bf9d 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -12,8 +12,8 @@
# change this after every release
set (SEQAN_APP_VERSION_MAJOR "1")
-set (SEQAN_APP_VERSION_MINOR "0")
-set (SEQAN_APP_VERSION_PATCH "0")
+set (SEQAN_APP_VERSION_MINOR "9")
+set (SEQAN_APP_VERSION_PATCH "1")
# don't change the following
set (SEQAN_APP_VERSION "${SEQAN_APP_VERSION_MAJOR}.${SEQAN_APP_VERSION_MINOR}.${SEQAN_APP_VERSION_PATCH}")
@@ -86,7 +86,7 @@ message (STATUS "LAMBDA version is: ${SEQAN_APP_VERSION}")
option (LAMBDA_FASTBUILD "Build only blastp and blastx modes (speeds up build)." OFF)
option (LAMBDA_NATIVE_BUILD "Architecture-specific optimizations, i.e. g++ -march=native." ON)
option (LAMBDA_STATIC_BUILD "Include all libraries in the binaries." OFF)
-option (LAMBDA_MMAPPED_DB "Use mmapped access to the database." ON)
+option (LAMBDA_MMAPPED_DB "Use mmapped access to the database." OFF)
option (LAMBDA_LINGAPS_OPT "Add optimized codepaths for linear gap costs (inc. bin size and compile time)." OFF)
if (LAMBDA_FASTBUILD)
@@ -220,7 +220,7 @@ install (TARGETS lambda lambda_indexer
# Install non-binary files for the package to share/lambda
install (FILES ../LICENSE.rst
../LICENSE-BSD.rst
- ../LICENSE-GPL3.rst
+ ../LICENSE-AGPL3.rst
../README.rst
DESTINATION "share/doc/lambda")
diff --git a/src/holders.hpp b/src/holders.hpp
index d397855..4aa9bf9 100644
--- a/src/holders.hpp
+++ b/src/holders.hpp
@@ -54,6 +54,7 @@ struct StatsHolder
uint64_t hitsMerged;
uint64_t hitsTooShort;
uint64_t hitsMasked;
+ std::vector<uint16_t> seedLengths;
// pre-extension
uint64_t hitsFailedPreExtendTest;
@@ -70,6 +71,12 @@ struct StatsHolder
uint64_t hitsFinal;
uint64_t qrysWithHit;
+// times
+ double timeGenSeeds;
+ double timeSearch;
+ double timeSort;
+ double timeExtend;
+
StatsHolder()
{
clear();
@@ -81,6 +88,7 @@ struct StatsHolder
hitsMerged = 0;
hitsTooShort = 0;
hitsMasked = 0;
+ seedLengths.clear();
hitsFailedPreExtendTest = 0;
hitsPutativeDuplicate = 0;
@@ -93,6 +101,11 @@ struct StatsHolder
hitsFinal = 0;
qrysWithHit = 0;
+
+ timeGenSeeds = 0;
+ timeSearch = 0;
+ timeSort = 0;
+ timeExtend = 0;
}
StatsHolder plus(StatsHolder const & rhs)
@@ -101,6 +114,7 @@ struct StatsHolder
hitsMerged += rhs.hitsMerged;
hitsTooShort += rhs.hitsTooShort;
hitsMasked += rhs.hitsMasked;
+ append(seedLengths, rhs.seedLengths);
hitsFailedPreExtendTest += rhs.hitsFailedPreExtendTest;
hitsPutativeDuplicate += rhs.hitsPutativeDuplicate;
@@ -113,6 +127,12 @@ struct StatsHolder
hitsFinal += rhs.hitsFinal;
qrysWithHit += rhs.qrysWithHit;
+
+ timeGenSeeds += rhs.timeGenSeeds;
+ timeSearch += rhs.timeSearch;
+ timeSort += rhs.timeSort;
+ timeExtend += rhs.timeExtend;
+
return *this;
}
@@ -146,19 +166,24 @@ void printStats(StatsHolder const & stats, LambdaOptions const & options)
std::cout << "Remaining\033[0m"
<< "\n after Seeding "; BLANKS;
std::cout << R << rem;
- std::cout << "\n - masked " << R << stats.hitsMasked
- << RR << (rem -= stats.hitsMasked);
- std::cout << "\n - merged " << R << stats.hitsMerged
- << RR << (rem -= stats.hitsMerged);
- std::cout << "\n - putative duplicates " << R
- << stats.hitsPutativeDuplicate << RR
- << (rem -= stats.hitsPutativeDuplicate);
- std::cout << "\n - putative abundant " << R
- << stats.hitsPutativeAbundant << RR
- << (rem -= stats.hitsPutativeAbundant);
- std::cout << "\n - failed pre-extend test " << R
- << stats.hitsFailedPreExtendTest << RR
- << (rem -= stats.hitsFailedPreExtendTest);
+ if (stats.hitsMasked)
+ std::cout << "\n - masked " << R << stats.hitsMasked
+ << RR << (rem -= stats.hitsMasked);
+ if (options.mergePutativeSiblings)
+ std::cout << "\n - merged " << R << stats.hitsMerged
+ << RR << (rem -= stats.hitsMerged);
+ if (options.filterPutativeDuplicates)
+ std::cout << "\n - putative duplicates " << R
+ << stats.hitsPutativeDuplicate << RR
+ << (rem -= stats.hitsPutativeDuplicate);
+ if (options.filterPutativeAbundant)
+ std::cout << "\n - putative abundant " << R
+ << stats.hitsPutativeAbundant << RR
+ << (rem -= stats.hitsPutativeAbundant);
+ if (options.preScoring)
+ std::cout << "\n - failed pre-extend test " << R
+ << stats.hitsFailedPreExtendTest << RR
+ << (rem -= stats.hitsFailedPreExtendTest);
std::cout << "\n - failed %-identity test " << R
<< stats.hitsFailedExtendPercentIdentTest << RR
<< (rem -= stats.hitsFailedExtendPercentIdentTest);
@@ -175,6 +200,31 @@ void printStats(StatsHolder const & stats, LambdaOptions const & options)
if (rem != stats.hitsFinal)
std::cout << "WARNING: hits dont add up\n";
+
+ std::cout << "Detailed Non-Wall-Clock times:\n"
+ << " genSeeds: " << stats.timeGenSeeds << "\n"
+ << " search: " << stats.timeSearch << "\n"
+ << " sort: " << stats.timeSort << "\n"
+ << " extend: " << stats.timeExtend << "\n\n";
+
+ if (length(stats.seedLengths))
+ {
+ double _seedLengthSum = std::accumulate(stats.seedLengths.begin(), stats.seedLengths.end(), 0.0);
+ double seedLengthMean = _seedLengthSum / stats.seedLengths.size();
+
+ double _seedLengthMeanSqSum = std::inner_product(stats.seedLengths.begin(),
+ stats.seedLengths.end(),
+ stats.seedLengths.begin(),
+ 0.0);
+ double seedLengthStdDev = std::sqrt(_seedLengthMeanSqSum / stats.seedLengths.size() -
+ seedLengthMean * seedLengthMean);
+ uint16_t seedLengthMax = *std::max_element(stats.seedLengths.begin(), stats.seedLengths.end());
+
+ std::cout << "SeedStats:\n"
+ << " avgLength: " << seedLengthMean << "\n"
+ << " stddev: " << seedLengthStdDev << "\n"
+ << " max: " << seedLengthMax << "\n\n";
+ }
}
if (options.verbosity >= 1)
@@ -349,7 +399,7 @@ public:
// ----------------------------------------------------------------------------
template <typename TGlobalHolder_,
- typename TScoreExtension>
+ typename TScoreExtension_>
class LocalDataHolder
{
public:
@@ -358,6 +408,7 @@ public:
using TSeeds = StringSet<typename Infix<TRedQrySeq const>::Type>;
using TSeedIndex = Index<TSeeds, IndexSa<>>;
using TMatch = typename TGlobalHolder::TMatch;
+ using TScoreExtension = TScoreExtension_;
// references to global stuff
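Note on the seed statistics added to printStats() in src/holders.hpp above: the standard deviation is derived from the identity Var(x) = E[x^2] - E[x]^2, using std::accumulate for the sum and std::inner_product for the sum of squares. The following is a minimal standalone sketch of the same calculation, not the upstream code. The names SeedLengthSummary and summariseSeedLengths are hypothetical. It assumes a non-empty input, which the patch guards with length(stats.seedLengths). It squares in double and clamps the variance at zero, so that rounding noise cannot feed a negative value to std::sqrt.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical result type for this sketch; not part of lambda.
struct SeedLengthSummary
{
    double   mean;
    double   stdDev;
    uint16_t max;
};

// Computes mean, population standard deviation and maximum of the seed lengths.
inline SeedLengthSummary
summariseSeedLengths(std::vector<uint16_t> const & lengths)
{
    assert(!lengths.empty()); // caller must check, as printStats() does

    double const n    = static_cast<double>(lengths.size());
    double const sum  = std::accumulate(lengths.begin(), lengths.end(), 0.0);
    double const mean = sum / n;

    // Sum of squares; each value is widened to double before multiplying.
    double const sqSum = std::accumulate(lengths.begin(), lengths.end(), 0.0,
                                         [] (double acc, uint16_t v)
                                         {
                                             return acc + static_cast<double>(v) * v;
                                         });

    // Population variance E[x^2] - E[x]^2, clamped against rounding noise.
    double const variance = std::max(0.0, sqSum / n - mean * mean);

    return { mean, std::sqrt(variance), *std::max_element(lengths.begin(), lengths.end()) };
}
```

For example, summariseSeedLengths({10, 12, 14}) yields a mean of 12, a maximum of 14 and a standard deviation of sqrt(8/3). This is the population form (division by n), which is also what the patch computes.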
diff --git a/src/lambda.cpp b/src/lambda.cpp
index 88288e4..a640b40 100644
--- a/src/lambda.cpp
+++ b/src/lambda.cpp
@@ -21,6 +21,9 @@
#include <iostream>
+//TODO TEMPORARY REMOVE
+#define amd64
+
#include <seqan/basic.h>
#include <seqan/sequence.h>
#include <seqan/arg_parse.h>
@@ -51,12 +54,12 @@ using namespace seqan;
// forwards
inline int
-argConv0(LambdaOptions const & options);
+argConv0(LambdaOptions & options);
//-
template <typename TOutFormat,
BlastTabularSpec h>
inline int
-argConv1(LambdaOptions const & options,
+argConv1(LambdaOptions & options,
TOutFormat const & /**/,
BlastTabularSpecSelector<h> const &);
//-
@@ -64,7 +67,7 @@ template <typename TOutFormat,
BlastTabularSpec h,
BlastProgram p>
inline int
-argConv2(LambdaOptions const & options,
+argConv2(LambdaOptions & options,
TOutFormat const & /**/,
BlastTabularSpecSelector<h> const &,
BlastProgramSelector<p> const &);
@@ -74,7 +77,7 @@ template <typename TOutFormat,
BlastTabularSpec h,
BlastProgram p>
inline int
-argConv3(LambdaOptions const & options,
+argConv3(LambdaOptions & options,
TOutFormat const &,
BlastTabularSpecSelector<h> const &,
BlastProgramSelector<p> const &,
@@ -86,7 +89,7 @@ template <typename TOutFormat,
BlastTabularSpec h,
BlastProgram p>
inline int
-argConv4(LambdaOptions const & options,
+argConv4(LambdaOptions & options,
TOutFormat const & /**/,
BlastTabularSpecSelector<h> const &,
BlastProgramSelector<p> const &,
@@ -100,7 +103,7 @@ template <typename TIndexSpec,
BlastProgram p,
BlastTabularSpec h>
inline int
-realMain(LambdaOptions const & options,
+realMain(LambdaOptions & options,
TOutFormat const & /**/,
BlastTabularSpecSelector<h> const &,
BlastProgramSelector<p> const &,
@@ -133,7 +136,7 @@ int main(int argc, char const ** argv)
// CONVERT Run-time options to compile-time Format-Type
inline int
-argConv0(LambdaOptions const & options)
+argConv0(LambdaOptions & options)
{
CharString output = options.output;
if (endsWith(output, ".gz"))
@@ -157,7 +160,7 @@ argConv0(LambdaOptions const & options)
template <typename TOutFormat,
BlastTabularSpec h>
inline int
-argConv1(LambdaOptions const & options,
+argConv1(LambdaOptions & options,
TOutFormat const & /**/,
BlastTabularSpecSelector<h> const &)
{
@@ -206,7 +209,7 @@ template <typename TOutFormat,
BlastTabularSpec h,
BlastProgram p>
inline int
-argConv2(LambdaOptions const & options,
+argConv2(LambdaOptions & options,
TOutFormat const & /**/,
BlastTabularSpecSelector<h> const &,
BlastProgramSelector<p> const &)
@@ -242,7 +245,7 @@ template <typename TOutFormat,
BlastTabularSpec h,
BlastProgram p>
inline int
-argConv3(LambdaOptions const & options,
+argConv3(LambdaOptions & options,
TOutFormat const &,
BlastTabularSpecSelector<h> const &,
BlastProgramSelector<p> const &,
@@ -277,42 +280,14 @@ template <typename TOutFormat,
BlastTabularSpec h,
BlastProgram p>
inline int
-argConv4(LambdaOptions const & options,
+argConv4(LambdaOptions & options,
TOutFormat const & /**/,
BlastTabularSpecSelector<h> const &,
BlastProgramSelector<p> const &,
TRedAlph const & /**/,
TScoreExtension const & /**/)
{
- int indexType = options.dbIndexType;
-// if (indexType == -1) // autodetect
-// {
-// //TODO FIX THIS WITH NEW EXTENSIONS
-// CharString file = options.dbFile;
-// append(file, ".sa");
-// struct stat buffer;
-// if (stat(toCString(file), &buffer) == 0)
-// {
-// indexType = 0;
-// } else
-// {
-// file = options.dbFile;
-// append(file, ".sa.val"); // FM Index
-// struct stat buffer;
-// if (stat(toCString(file), &buffer) == 0)
-// {
-// indexType = 1;
-// } else
-// {
-// std::cerr << "No Index file could be found, please make sure paths "
-// << "are correct and the files are readable.\n" << std::flush;
-//
-// return -1;
-// }
-// }
-// }
-
- if (indexType == 0)
+ if (options.dbIndexType == DbIndexType::SUFFIX_ARRAY)
return realMain<IndexSa<>>(options,
TOutFormat(),
BlastTabularSpecSelector<h>(),
@@ -341,7 +316,7 @@ template <typename TIndexSpec,
BlastProgram p,
BlastTabularSpec h>
inline int
-realMain(LambdaOptions const & options,
+realMain(LambdaOptions & options,
TOutFormat const & /**/,
BlastTabularSpecSelector<h> const &,
BlastProgramSelector<p> const &,
@@ -355,13 +330,17 @@ realMain(LambdaOptions const & options,
"\n======================================================"
"\nVersion ", SEQAN_APP_VERSION, "\n\n");
+ int ret = validateIndexOptions<TRedAlph, p>(options);
+ if (ret)
+ return ret;
+
if (options.verbosity >= 2)
printOptions<TLocalHolder>(options);
TGlobalHolder globalHolder;
// context(globalHolder.outfile).scoringScheme._internalScheme = matr;
- int ret = prepareScoring(globalHolder, options);
+ ret = prepareScoring(globalHolder, options);
if (ret)
return ret;
@@ -373,9 +352,9 @@ realMain(LambdaOptions const & options,
if (ret)
return ret;
- ret = loadSegintervals(globalHolder, options);
- if (ret)
- return ret;
+// ret = loadSegintervals(globalHolder, options);
+// if (ret)
+// return ret;
ret = loadQuery(globalHolder, options);
if (ret)
@@ -441,9 +420,13 @@ realMain(LambdaOptions const & options,
localHolder.init(t);
// seed
- res = generateSeeds(localHolder);
- if (res)
- continue;
+ double buf = sysTime();
+ if (!options.adaptiveSeeding)
+ {
+ res = generateSeeds(localHolder);
+ if (res)
+ continue;
+ }
if (options.doubleIndexing)
{
@@ -451,19 +434,33 @@ realMain(LambdaOptions const & options,
if (res)
continue;
}
+ localHolder.stats.timeGenSeeds += sysTime() - buf;
// search
- search(localHolder);
+ buf = sysTime();
+ search(localHolder); //TODO seed refining if iterateMatches gives 0 results
+ localHolder.stats.timeSearch += sysTime() - buf;
+
+// // TODO DEBUG
+// for (auto const & m : localHolder.matches)
+// _printMatch(m);
// sort
- sortMatches(localHolder);
+ if (options.filterPutativeAbundant || options.filterPutativeDuplicates || options.mergePutativeSiblings)
+ {
+ buf = sysTime();
+ sortMatches(localHolder);
+ localHolder.stats.timeSort += sysTime() - buf;
+ }
// extend
- res = iterateMatches(localHolder);
+ buf = sysTime();
+ if (length(localHolder.matches) > 0)
+ res = iterateMatches(localHolder);
+ localHolder.stats.timeExtend += sysTime() - buf;
if (res)
continue;
-
if ((!options.doubleIndexing) && (TID == 0) &&
(options.verbosity >= 1))
{
@@ -489,7 +486,7 @@ realMain(LambdaOptions const & options,
if (!options.doubleIndexing)
{
- myPrint(options, 2, "Runtime: ", sysTime() - start, "s.\n\n");
+ myPrint(options, 2, "Runtime total: ", sysTime() - start, "s.\n\n");
}
printStats(globalHolder.stats, options);
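Note on the per-phase timing added to realMain() in src/lambda.cpp above: each phase is bracketed by buf = sysTime() and stats.timeX += sysTime() - buf, and the per-holder totals are later summed through StatsHolder::plus(), which is presumably why printStats() labels them "Non-Wall-Clock". One consequence of the manual bracketing is that an early "continue" skips the bookkeeping for that phase. The sketch below shows an alternative pattern, not what upstream does: an RAII guard that books the elapsed time when the scope ends, whichever way it is left. ScopedTimer, PhaseTimes and timed() are hypothetical names, and std::chrono::steady_clock stands in for SeqAn's sysTime().

```cpp
#include <cassert>
#include <chrono>

// Hypothetical RAII helper (not part of lambda or SeqAn): adds the lifetime
// of the object, in seconds, to the accumulator it was constructed with.
class ScopedTimer
{
public:
    explicit ScopedTimer(double & accumulator) :
        _acc(accumulator), _start(std::chrono::steady_clock::now())
    {}

    ScopedTimer(ScopedTimer const &) = delete;
    ScopedTimer & operator=(ScopedTimer const &) = delete;

    ~ScopedTimer()
    {
        std::chrono::duration<double> const elapsed = std::chrono::steady_clock::now() - _start;
        assert(elapsed.count() >= 0.0); // steady_clock never runs backwards
        _acc += elapsed.count();
    }

private:
    double & _acc;
    std::chrono::steady_clock::time_point _start;
};

// Stand-in for two of the time* fields the patch adds to StatsHolder.
struct PhaseTimes
{
    double timeSearch = 0;
    double timeExtend = 0;
};

// Runs a callable and books its duration on the given accumulator,
// including when the callable returns early.
template <typename TFunc>
inline void timed(double & accumulator, TFunc && func)
{
    ScopedTimer guard{accumulator};
    func();
}
```

A call such as timed(times.timeSearch, [&] { search(localHolder); }) would then replace the explicit buf arithmetic. Whether time spent in a phase that bails out early should be counted is a design decision; the patch as written does not count it.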
diff --git a/src/lambda.hpp b/src/lambda.hpp
index c83ca44..a2b9cec 100644
--- a/src/lambda.hpp
+++ b/src/lambda.hpp
@@ -95,6 +95,81 @@ struct Comp :
// ============================================================================
// --------------------------------------------------------------------------
+// Function readIndexOption()
+// --------------------------------------------------------------------------
+
+inline void
+readIndexOption(std::string & optionString,
+ std::string const & optionIdentifier,
+ LambdaOptions const & options)
+{
+ std::ifstream f{(options.indexDir + "/option:" + optionIdentifier).c_str(),
+ std::ios_base::in | std::ios_base::binary};
+ if (f.is_open())
+ {
+ auto fit = directionIterator(f, Input());
+ readLine(optionString, fit);
+ f.close();
+ }
+ else
+ {
+ throw std::runtime_error("ERROR: Expected option specifier:\n" + options.indexDir + "/option:" +
+ optionIdentifier + "\nYour index seems incompatible, try to recreate it "
+ "and report a bug if the issue persists.");
+ }
+}
+
+// --------------------------------------------------------------------------
+// Function validateIndexOptions()
+// --------------------------------------------------------------------------
+
+template <typename TRedAlph,
+ BlastProgram p>
+inline int
+validateIndexOptions(LambdaOptions const & options)
+{
+ std::string buffer;
+ readIndexOption(buffer, "alph_translated", options);
+ if (buffer != _alphName(TransAlph<p>()))
+ {
+ std::cerr << "ERROR: Your index is of translated alphabet type: " << buffer << "\n But lambda expected: "
+ << _alphName(TransAlph<p>()) << "\n Did you specify the right -p parameter?\n\n";
+ return -1;
+
+ }
+ buffer.clear();
+ readIndexOption(buffer, "alph_reduced", options);
+ if (buffer != _alphName(TRedAlph()))
+ {
+ std::cerr << "ERROR: Your index is of reduced alphabet type: " << buffer << "\n But lambda expected: "
+ << _alphName(TRedAlph()) << "\n Did you specify the right -ar parameter?\n\n";
+ return -1;
+ }
+ buffer.clear();
+ readIndexOption(buffer, "db_index_type", options);
+ unsigned long b = 0;
+ if ((!lexicalCast(b, buffer)) || (b != static_cast<unsigned long>(options.dbIndexType)))
+ {
+ std::cerr << "ERROR: Your index type is: " << _indexName(static_cast<DbIndexType>(std::stoul(buffer)))
+ << "\n But lambda expected: " << _indexName(options.dbIndexType)
+ << "\n Did you specify the right -di parameter?\n\n";
+ return -1;
+ }
+ if (qIsTranslated(p) && sIsTranslated(p))
+ {
+ buffer.clear();
+ readIndexOption(buffer, "genetic_code", options);
+ unsigned long b = 0;
+ if ((!lexicalCast(b, buffer)) || (b != static_cast<unsigned long>(options.geneticCode)))
+ {
+ std::cerr << "WARNING: The codon translation table used during indexing and during search are different. "
+ "This is not a problem per se, but is likely not what you want.\n\n";
+ }
+ }
+ return 0;
+}
+
+// --------------------------------------------------------------------------
// Function prepareScoring()
// --------------------------------------------------------------------------
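Note on validateIndexOptions() as added above: the "db_index_type" error branch is also taken when lexicalCast() fails, and the message is then built with std::stoul(buffer). If the stored value is not numeric at all, std::stoul throws std::invalid_argument, so the intended error message would be replaced by an exception. The sketch below shows one way to parse such a value without throwing. It is an illustration only: parseUnsigned and describeStoredIndexType are hypothetical names, and the strict "digits only" policy is an assumption rather than upstream behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses a string consisting only of decimal digits.
// Returns false (leaving 'out' untouched) instead of throwing.
inline bool
parseUnsigned(unsigned long & out, std::string const & text)
{
    // std::stoul would accept leading whitespace and a sign; reject both here.
    if (text.empty() || text.front() < '0' || text.front() > '9')
        return false;

    try
    {
        std::size_t consumed = 0;
        unsigned long const value = std::stoul(text, &consumed);
        assert(consumed > 0);            // the first character is a digit
        if (consumed != text.size())     // trailing garbage such as "12x"
            return false;
        out = value;
        return true;
    }
    catch (std::logic_error const &)     // std::invalid_argument, std::out_of_range
    {
        return false;
    }
}

// Hypothetical message builder that stays safe for unparsable input.
inline std::string
describeStoredIndexType(std::string const & stored)
{
    unsigned long value = 0;
    if (!parseUnsigned(value, stored))
        return "unparsable value '" + stored + "'";
    return "index type #" + std::to_string(value);
}
```

With a helper of this kind, the error path could print describeStoredIndexType(buffer) rather than converting the buffer a second time.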
@@ -195,9 +270,8 @@ loadSubjects(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHo
strIdent = "Loading Subj Sequences...";
myPrint(options, 1, strIdent);
- _dbSeqs = options.dbFile;
- append(_dbSeqs, ".");
- append(_dbSeqs, _alphName(TransAlph<p>()));
+ _dbSeqs = options.indexDir;
+ append(_dbSeqs, "/translated_seqs");
ret = open(globalHolder.subjSeqs, toCString(_dbSeqs), OPEN_RDONLY);
if (ret != true)
@@ -226,8 +300,8 @@ loadSubjects(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHo
strIdent = "Loading Subj Ids...";
myPrint(options, 1, strIdent);
- _dbSeqs = options.dbFile;
- append(_dbSeqs, ".ids");
+ _dbSeqs = options.indexDir;
+ append(_dbSeqs, "/seq_ids");
ret = open(globalHolder.subjIds, toCString(_dbSeqs), OPEN_RDONLY);
if (ret != true)
{
@@ -239,7 +313,7 @@ loadSubjects(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHo
myPrint(options, 1, " done.\n");
myPrint(options, 2, "Runtime: ", finish, "s \n\n");
- context(globalHolder.outfile).dbName = options.dbFile;
+ context(globalHolder.outfile).dbName = options.indexDir;
// if subjects where translated, we don't have the untranslated seqs at all
// but we still need the data for statistics and position un-translation
@@ -249,8 +323,8 @@ loadSubjects(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHo
std::string strIdent = "Loading Lengths of untranslated Subj sequences...";
myPrint(options, 1, strIdent);
- _dbSeqs = options.dbFile;
- append(_dbSeqs, ".untranslengths");
+ _dbSeqs = options.indexDir;
+ append(_dbSeqs, "/untranslated_seq_lengths");
ret = open(globalHolder.untransSubjSeqLengths, toCString(_dbSeqs), OPEN_RDONLY);
if (ret != true)
{
@@ -279,24 +353,8 @@ loadDbIndexFromDisk(TGlobalHolder & globalHolder,
std::string strIdent = "Loading Database Index...";
myPrint(options, 1, strIdent);
double start = sysTime();
- std::string path = toCString(options.dbFile);
- path += '.' + std::string(_alphName(typename TGlobalHolder::TRedAlph()));
- if (TGlobalHolder::indexIsFM)
- path += ".fm";
- else
- path += ".sa";
-
- // Check if the index is of the old format (pre 0.9.0) by looking for different files
- if ((globalHolder.blastProgram != BlastProgram::BLASTN) && // BLASTN indexes are compatible
- ((TGlobalHolder::alphReduction && fileExists(toCString(path + ".txt.concat"))) ||
- (!TGlobalHolder::alphReduction && TGlobalHolder::indexIsFM && !fileExists(toCString(path + ".lf.drv.wtc.24")))))
- {
- std::cerr << ((options.verbosity == 0) ? strIdent : std::string())
- << " failed.\n"
- << "It appears you tried to open an old index (created before 0.9.0) which "
- << "is not supported. Please remove the old files and create a new index with lambda_indexer!\n";
- return 200;
- }
+ std::string path = toCString(options.indexDir);
+ path += "/index";
int ret = open(globalHolder.dbIndex, path.c_str(), OPEN_RDONLY);
if (ret != true)
@@ -318,7 +376,7 @@ loadDbIndexFromDisk(TGlobalHolder & globalHolder,
length(indexSA(globalHolder.dbIndex)), "\n\n");
// this is actually part of prepareScoring(), but the values are just available now
- if (sIsTranslated(globalHolder.blastProgram ))
+ if (sIsTranslated(TGlobalHolder::blastProgram ))
{
// last value has sum of lengths
context(globalHolder.outfile).dbTotalLength = back(globalHolder.untransSubjSeqLengths);
@@ -336,54 +394,54 @@ loadDbIndexFromDisk(TGlobalHolder & globalHolder,
// Function loadSegintervals()
// --------------------------------------------------------------------------
-template <BlastTabularSpec h,
- BlastProgram p,
- typename TRedAlph,
- typename TIndexSpec,
- typename TOutFormat>
-inline int
-loadSegintervals(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolder,
- LambdaOptions const & options)
-{
-
- double start = sysTime();
- std::string strIdent = "Loading Database Masking file...";
- myPrint(options, 1, strIdent);
-
- CharString segFileS = options.dbFile;
- append(segFileS, ".binseg_s.concat");
- CharString segFileE = options.dbFile;
- append(segFileE, ".binseg_e.concat");
- bool fail = false;
- struct stat buffer;
- // file exists
- if ((stat(toCString(segFileS), &buffer) == 0) &&
- (stat(toCString(segFileE), &buffer) == 0))
- {
- //cut off ".concat" again
- resize(segFileS, length(segFileS) - 7);
- resize(segFileE, length(segFileE) - 7);
-
- fail = !open(globalHolder.segIntStarts, toCString(segFileS), OPEN_RDONLY);
- if (!fail)
- fail = !open(globalHolder.segIntEnds, toCString(segFileE), OPEN_RDONLY);
- } else
- {
- fail = true;
- }
-
- if (fail)
- {
- std::cerr << ((options.verbosity == 0) ? strIdent : std::string())
- << " failed.\n";
- return 1;
- }
-
- double finish = sysTime() - start;
- myPrint(options, 1, " done.\n");
- myPrint(options, 2, "Runtime: ", finish, "s \n\n");
- return 0;
-}
+// template <BlastTabularSpec h,
+// BlastProgram p,
+// typename TRedAlph,
+// typename TIndexSpec,
+// typename TOutFormat>
+// inline int
+// loadSegintervals(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolder,
+// LambdaOptions const & options)
+// {
+//
+// double start = sysTime();
+// std::string strIdent = "Loading Database Masking file...";
+// myPrint(options, 1, strIdent);
+//
+// CharString segFileS = options.dbFile;
+// append(segFileS, ".binseg_s.concat");
+// CharString segFileE = options.dbFile;
+// append(segFileE, ".binseg_e.concat");
+// bool fail = false;
+// struct stat buffer;
+// // file exists
+// if ((stat(toCString(segFileS), &buffer) == 0) &&
+// (stat(toCString(segFileE), &buffer) == 0))
+// {
+// //cut off ".concat" again
+// resize(segFileS, length(segFileS) - 7);
+// resize(segFileE, length(segFileE) - 7);
+//
+// fail = !open(globalHolder.segIntStarts, toCString(segFileS), OPEN_RDONLY);
+// if (!fail)
+// fail = !open(globalHolder.segIntEnds, toCString(segFileE), OPEN_RDONLY);
+// } else
+// {
+// fail = true;
+// }
+//
+// if (fail)
+// {
+// std::cerr << ((options.verbosity == 0) ? strIdent : std::string())
+// << " failed.\n";
+// return 1;
+// }
+//
+// double finish = sysTime() - start;
+// myPrint(options, 1, " done.\n");
+// myPrint(options, 2, "Runtime: ", finish, "s \n\n");
+// return 0;
+// }
// --------------------------------------------------------------------------
// Function loadQuery()
@@ -488,8 +546,8 @@ template <BlastTabularSpec h,
typename TIndexSpec,
typename TOutFormat>
inline int
-loadQuery(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolder,
- LambdaOptions const & options)
+loadQuery(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolder,
+ LambdaOptions & options)
{
using TGH = GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h>;
double start = sysTime();
@@ -521,7 +579,7 @@ loadQuery(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolde
options);
// sam and bam need original sequences if translation happened
- if (qIsTranslated(globalHolder.blastProgram) && (options.outFileFormat > 0) &&
+ if (qIsTranslated(TGH::blastProgram) && (options.outFileFormat > 0) &&
(options.samBamSeq > 0))
std::swap(origSeqs, globalHolder.untranslatedQrySeqs);
@@ -562,6 +620,28 @@ loadQuery(GlobalDataHolder<TRedAlph, TIndexSpec, TOutFormat, p, h> & globalHolde
<< ".\n";
return -1;
}
+
+ if (options.extensionMode == LambdaOptions::ExtensionMode::AUTO)
+ {
+ if (maxLen <= 100)
+ {
+ #if 0 // defined(SEQAN_SIMD_ENABLED) && defined(__AVX2__)
+ options.extensionMode = LambdaOptions::ExtensionMode::FULL_SIMD;
+ options.band = -1;
+ #else
+ options.extensionMode = LambdaOptions::ExtensionMode::FULL_SERIAL;
+ #endif
+ options.xDropOff = -1;
+ options.filterPutativeAbundant = false;
+ options.filterPutativeDuplicates = false;
+ options.mergePutativeSiblings = false;
+ }
+ else
+ {
+ options.extensionMode = LambdaOptions::ExtensionMode::XDROP;
+ }
+ }
+
return 0;
}
@@ -684,13 +764,16 @@ seedLooksPromising(LocalDataHolder<TGlobalHolder, TScoreExtension> const & lH,
int64_t effectiveQBegin = m.qryStart;
int64_t effectiveSBegin = m.subjStart;
- uint64_t effectiveLength = lH.options.seedLength * lH.options.preScoring;
- if (lH.options.preScoring > 1)
+ uint64_t actualLength = m.qryEnd - m.qryStart;
+ uint64_t effectiveLength = std::max(static_cast<uint64_t>(lH.options.seedLength * lH.options.preScoring),
+ actualLength);
+
+ if (effectiveLength > actualLength)
{
effectiveQBegin -= (lH.options.preScoring - 1) *
- lH.options.seedLength / 2;
+ actualLength / 2;
effectiveSBegin -= (lH.options.preScoring - 1) *
- lH.options.seedLength / 2;
+ actualLength / 2;
// std::cout << effectiveQBegin << "\t" << effectiveSBegin << "\n";
int64_t min = std::min(effectiveQBegin, effectiveSBegin);
if (min < 0)
@@ -760,21 +843,23 @@ onFind(LocalDataHolder<TGlobalHolder, TScoreExtension> & lH,
- getSeqOffset(subjOcc)
- lH.options.seedLength);
- TMatch m{static_cast<typename TMatch::TQId>(lH.seedRefs[seedId]),
- static_cast<typename TMatch::TSId>(getSeqNo(subjOcc)),
- static_cast<typename TMatch::TPos>(lH.seedRanks[seedId] * lH.options.seedOffset),
- static_cast<typename TMatch::TPos>(getSeqOffset(subjOcc))};
+ TMatch m {static_cast<typename TMatch::TQId>(lH.seedRefs[seedId]),
+ static_cast<typename TMatch::TSId>(getSeqNo(subjOcc)),
+ static_cast<typename TMatch::TPos>(lH.seedRanks[seedId] * lH.options.seedOffset),
+ static_cast<typename TMatch::TPos>(lH.seedRanks[seedId] * lH.options.seedOffset + lH.options.seedLength),
+ static_cast<typename TMatch::TPos>(getSeqOffset(subjOcc)),
+ static_cast<typename TMatch::TPos>(getSeqOffset(subjOcc) + lH.options.seedLength)};
bool discarded = false;
auto const halfSubjL = lH.options.seedLength / 2;
- if (!sIsTranslated(lH.gH.blastProgram))
+ if (!sIsTranslated(TGlobalHolder::blastProgram))
{
for (unsigned k = 0; k < length(lH.gH.segIntStarts[m.subjId]); ++k)
{
// more than half of the seed falls into masked interval
if (intervalOverlap(m.subjStart,
- m.subjStart + lH.options.seedLength,
+ m.subjEnd,
lH.gH.segIntStarts[m.subjId][k],
lH.gH.segIntEnds[m.subjId][k])
>= halfSubjL)
@@ -796,10 +881,190 @@ onFind(LocalDataHolder<TGlobalHolder, TScoreExtension> & lH,
lH.matches.emplace_back(m);
}
+template <typename TGlobalHolder,
+ typename TScoreExtension,
+ typename TSubjOcc>
+inline void
+onFindVariable(LocalDataHolder<TGlobalHolder, TScoreExtension> & lH,
+ TSubjOcc subjOcc,
+ typename TGlobalHolder::TMatch::TQId const seedId,
+ typename TGlobalHolder::TMatch::TPos const seedBegin,
+ typename TGlobalHolder::TMatch::TPos const seedLength)
+{
+ using TMatch = typename TGlobalHolder::TMatch;
+ if (TGlobalHolder::indexIsFM) // positions are reversed
+ setSeqOffset(subjOcc,
+ length(lH.gH.subjSeqs[getSeqNo(subjOcc)])
+ - getSeqOffset(subjOcc)
+ - seedLength);
+
+ TMatch m {seedId,
+ static_cast<typename TGlobalHolder::TMatch::TSId>(getSeqNo(subjOcc)),
+ seedBegin,
+ static_cast<typename TGlobalHolder::TMatch::TPos>(seedBegin + seedLength),
+ static_cast<typename TGlobalHolder::TMatch::TPos>(getSeqOffset(subjOcc)),
+ static_cast<typename TGlobalHolder::TMatch::TPos>(getSeqOffset(subjOcc) + seedLength)};
+
+ if (!seedLooksPromising(lH, m))
+ ++lH.stats.hitsFailedPreExtendTest;
+ else
+ lH.matches.emplace_back(m);
+}
+
// --------------------------------------------------------------------------
// Function search()
// --------------------------------------------------------------------------
+//TODO experiment with tuned branch prediction
+
+template <typename TIndexIt, typename TNeedleIt, typename TLambda, typename TLambda2>
+inline void
+__goDownNoErrors(TIndexIt const & indexIt,
+ TNeedleIt const & needleIt,
+ TNeedleIt const & needleItEnd,
+ TLambda & continRunnable,
+ TLambda2 & reportRunnable)
+{
+ TIndexIt nextIndexIt(indexIt);
+ if ((needleIt != needleItEnd) &&
+ goDown(nextIndexIt, *needleIt) &&
+ continRunnable(indexIt, nextIndexIt))
+ {
+ __goDownNoErrors(nextIndexIt, needleIt + 1, needleItEnd, continRunnable, reportRunnable);
+ } else
+ {
+ reportRunnable(indexIt);
+ }
+}
+
+template <typename TIndexIt, typename TNeedleIt, typename TLambda, typename TLambda2>
+inline void
+__goDownErrors(TIndexIt const & indexIt,
+ TNeedleIt const & needleIt,
+ TNeedleIt const & needleItEnd,
+ TLambda & continRunnable,
+ TLambda2 & reportRunnable)
+{
+ using TAlph = typename Value<TNeedleIt>::Type;
+
+ unsigned contin = 0;
+
+ if (needleIt != needleItEnd)
+ {
+ for (unsigned i = 0; i < ValueSize<TAlph>::VALUE; ++i)
+ {
+ TIndexIt nextIndexIt(indexIt);
+ if (goDown(nextIndexIt, static_cast<TAlph>(i)) &&
+ continRunnable(indexIt, nextIndexIt))
+ {
+ ++contin;
+ if (ordValue(*needleIt) == i)
+ __goDownErrors(nextIndexIt, needleIt + 1, needleItEnd, continRunnable, reportRunnable);
+ else
+ __goDownNoErrors(nextIndexIt, needleIt + 1, needleItEnd, continRunnable, reportRunnable);
+ }
+ }
+ }
+
+ if (contin == 0)
+ reportRunnable(indexIt);
+}
+
+template <typename TGlobalHolder,
+ typename TScoreExtension>
+inline void
+__serachAdaptive(LocalDataHolder<TGlobalHolder, TScoreExtension> & lH,
+ uint64_t const seedLength)
+{
+ typedef typename Iterator<typename TGlobalHolder::TDbIndex, TopDown<> >::Type TIndexIt;
+
+ // TODO optionize
+ size_t constexpr seedHeurFactor = 10;
+ size_t constexpr minResults = 1;
+
+ size_t needlesSum = lH.gH.redQrySeqs.limits[lH.indexEndQry] - lH.gH.redQrySeqs.limits[lH.indexBeginQry];
+ // BROKEN:lengthSum(infix(lH.gH.redQrySeqs, lH.indexBeginQry, lH.indexEndQry));
+ // the above is faster anyway (but only works on concatdirect sets)
+
+ size_t needlesPos = 0;
+
+ TIndexIt root(lH.gH.dbIndex);
+ TIndexIt indexIt = root;
+
+ for (size_t i = lH.indexBeginQry; i < lH.indexEndQry; ++i)
+ {
+ for (size_t seedBegin = 0; /* below */; seedBegin += lH.options.seedOffset)
+ {
+ // skip protein 'X' or DNA 'N'
+ while ((lH.gH.qrySeqs[i][seedBegin] == unknownValue<TransAlph<TGlobalHolder::blastProgram>>()) &&
+ (seedBegin <= length(lH.gH.redQrySeqs[i]) - seedLength))
+ ++seedBegin;
+
+ // termination criterion
+ if (seedBegin > length(lH.gH.redQrySeqs[i]) - seedLength)
+ break;
+
+ indexIt = root;
+
+ size_t desiredOccs = length(lH.matches) >= lH.options.maxMatches
+ ? minResults
+ : (lH.options.maxMatches - length(lH.matches)) * seedHeurFactor /
+ ((needlesSum - needlesPos - seedBegin) / lH.options.seedOffset);
+
+ if (desiredOccs == 0)
+ desiredOccs = minResults;
+
+ // go down seedOffset number of characters without errors
+ for (size_t k = 0; k < lH.options.seedOffset; ++k)
+ if (!goDown(indexIt, lH.gH.redQrySeqs[i][seedBegin + k]))
+ break;
+ // if unsuccessful, move to next seed
+ if (repLength(indexIt) != lH.options.seedOffset)
+ continue;
+
+ auto continRunnable = [&seedLength, &desiredOccs] (TIndexIt const & prevIndexIt, TIndexIt const & indexIt)
+ {
+ // NON-ADAPTIVE
+// return (repLength(indexIt) <= seedLength);
+ // ADAPTIVE SEEDING:
+
+ // always continue if minimum seed length not reached
+ if (repLength(indexIt) <= seedLength)
+ return true;
+
+ // always continue if it means not losing hits
+ if (countOccurrences(indexIt) == countOccurrences(prevIndexIt))
+ return true;
+
+ // do voodoo heuristics to see if this hit is too frequent
+ if (countOccurrences(indexIt) < desiredOccs)
+ return false;
+
+ return true;
+ };
+
+ auto reportRunnable = [&seedLength, &lH, &i, &seedBegin] (TIndexIt const & indexIt)
+ {
+ if (repLength(indexIt) >= seedLength)
+ {
+ appendValue(lH.stats.seedLengths, repLength(indexIt));
+ lH.stats.hitsAfterSeeding += countOccurrences(indexIt);
+ for (auto const & occ : getOccurrences(indexIt))
+ onFindVariable(lH, occ, i, seedBegin, repLength(indexIt));
+ }
+ };
+
+ __goDownErrors(indexIt,
+ begin(lH.gH.redQrySeqs[i], Standard()) + seedBegin + lH.options.seedOffset,
+ end(lH.gH.redQrySeqs[i], Standard()),
+ continRunnable,
+ reportRunnable);
+ }
+
+ needlesPos += length(lH.gH.redQrySeqs[i]);
+ }
+}
+
template <typename BackSpec, typename TLocalHolder>
inline void
__searchDoubleIndex(TLocalHolder & lH)
@@ -885,23 +1150,19 @@ template <typename TLocalHolder>
inline void
search(TLocalHolder & lH)
{
+ //TODO implement adaptive seeding with 0-n mismatches
if (lH.options.maxSeedDist == 0)
__search<Backtracking<Exact>>(lH);
- else if (lH.options.hammingOnly)
- __search<Backtracking<HammingDistance>>(lH);
+ else if (lH.options.adaptiveSeeding)
+ __serachAdaptive(lH, lH.options.seedLength);
else
-#if 0 // reactivate if edit-distance seeding is readded
- __search<Backtracking<EditDistance>>(lH);
-#else
- return;
-#endif
+ __search<Backtracking<HammingDistance>>(lH);
}
// --------------------------------------------------------------------------
// Function joinAndFilterMatches()
// --------------------------------------------------------------------------
-
template <typename TLocalHolder>
inline void
sortMatches(TLocalHolder & lH)
@@ -941,6 +1202,85 @@ sortMatches(TLocalHolder & lH)
}
}
+// --------------------------------------------------------------------------
+// Function _setFrames()
+// --------------------------------------------------------------------------
+
+template <typename TBlastMatch,
+ typename TLocalHolder>
+inline void
+_setFrames(TBlastMatch & bm,
+ typename TLocalHolder::TMatch const & m,
+ TLocalHolder const & lH)
+{
+ if (qIsTranslated(TLocalHolder::TGlobalHolder::blastProgram))
+ {
+ bm.qFrameShift = (m.qryId % 3) + 1;
+ if (m.qryId % 6 > 2)
+ bm.qFrameShift = -bm.qFrameShift;
+ } else if (qHasRevComp(TLocalHolder::TGlobalHolder::blastProgram))
+ {
+ bm.qFrameShift = 1;
+ if (m.qryId % 2)
+ bm.qFrameShift = -bm.qFrameShift;
+ } else
+ {
+ bm.qFrameShift = 0;
+ }
+
+ if (sIsTranslated(TLocalHolder::TGlobalHolder::blastProgram))
+ {
+ bm.sFrameShift = (m.subjId % 3) + 1;
+ if (m.subjId % 6 > 2)
+ bm.sFrameShift = -bm.sFrameShift;
+ } else if (sHasRevComp(TLocalHolder::TGlobalHolder::blastProgram))
+ {
+ bm.sFrameShift = 1;
+ if (m.subjId % 2)
+ bm.sFrameShift = -bm.sFrameShift;
+ } else
+ {
+ bm.sFrameShift = 0;
+ }
+}
+
+// --------------------------------------------------------------------------
+// Function _writeRecord()
+// --------------------------------------------------------------------------
+
+template <typename TBlastRecord,
+ typename TLocalHolder>
+inline void
+_writeRecord(TBlastRecord & record,
+ TLocalHolder & lH)
+{
+ if (length(record.matches) > 0)
+ {
+ ++lH.stats.qrysWithHit;
+ // sort and remove duplicates -> STL, yeah!
+ auto const before = record.matches.size();
+ record.matches.sort();
+ if (!lH.options.filterPutativeDuplicates)
+ {
+ record.matches.unique();
+ lH.stats.hitsDuplicate += before - record.matches.size();
+ }
+ if (record.matches.size() > lH.options.maxMatches)
+ {
+ lH.stats.hitsAbundant += record.matches.size() -
+ lH.options.maxMatches;
+ record.matches.resize(lH.options.maxMatches);
+ }
+ lH.stats.hitsFinal += record.matches.size();
+
+ myWriteRecord(lH, record);
+ }
+}
+
+// --------------------------------------------------------------------------
+// Function computeBlastMatch()
+// --------------------------------------------------------------------------
+
template <typename TBlastMatch,
typename TLocalHolder>
inline int
@@ -964,10 +1304,10 @@ computeBlastMatch(TBlastMatch & bm,
// bm.sEnd);
// std::cout << "Query Id: " << m.qryId
-// << "\t TrueQryId: " << getTrueQryId(bm.m, lH.options, lH.gH.blastProgram)
+// << "\t TrueQryId: " << getTrueQryId(bm.m, lH.options, TGlobalHolder::blastProgram)
// << "\t length(qryIds): " << length(qryIds)
// << "Subj Id: " << m.subjId
-// << "\t TrueSubjId: " << getTrueSubjId(bm.m, lH.options, lH.gH.blastProgram)
+// << "\t TrueSubjId: " << getTrueSubjId(bm.m, lH.options, TGlobalHolder::blastProgram)
// << "\t length(subjIds): " << length(subjIds) << "\n\n";
assignSource(bm.alignRow0, infix(lH.gH.qrySeqs[m.qryId], bm.qStart, bm.qEnd));
@@ -1269,53 +1609,17 @@ computeBlastMatch(TBlastMatch & bm,
// std::cout << "ALIGN BEFORE STATS:\n" << bm.align << "\n";
computeAlignmentStats(bm, context(lH.gH.outfile));
-
if (bm.alignStats.alignmentIdentity < lH.options.idCutOff)
return PERCENTIDENT;
// const unsigned long qryLength = length(row0);
computeBitScore(bm, context(lH.gH.outfile));
- // the length adjustment cache must no be written to by multiple threads
- SEQAN_OMP_PRAGMA(critical(evalue_length_adj_cache))
- {
- computeEValue(bm, context(lH.gH.outfile));
- }
-
+ computeEValueThreadSafe(bm, context(lH.gH.outfile));
if (bm.eValue > lH.options.eCutOff)
- {
return EVALUE;
- }
-
- if (qIsTranslated(TLocalHolder::TGlobalHolder::blastProgram))
- {
- bm.qFrameShift = (m.qryId % 3) + 1;
- if (m.qryId % 6 > 2)
- bm.qFrameShift = -bm.qFrameShift;
- } else if (qHasRevComp(TLocalHolder::TGlobalHolder::blastProgram))
- {
- bm.qFrameShift = 1;
- if (m.qryId % 2)
- bm.qFrameShift = -bm.qFrameShift;
- } else
- {
- bm.qFrameShift = 0;
- }
- if (sIsTranslated(TLocalHolder::TGlobalHolder::blastProgram))
- {
- bm.sFrameShift = (m.subjId % 3) + 1;
- if (m.subjId % 6 > 2)
- bm.sFrameShift = -bm.sFrameShift;
- } else if (sHasRevComp(TLocalHolder::TGlobalHolder::blastProgram))
- {
- bm.sFrameShift = 1;
- if (m.subjId % 2)
- bm.sFrameShift = -bm.sFrameShift;
- } else
- {
- bm.sFrameShift = 0;
- }
+ _setFrames(bm, m, lH);
return 0;
}
@@ -1323,7 +1627,7 @@ computeBlastMatch(TBlastMatch & bm,
template <typename TLocalHolder>
inline int
-iterateMatches(TLocalHolder & lH)
+iterateMatchesExtend(TLocalHolder & lH)
{
using TGlobalHolder = typename TLocalHolder::TGlobalHolder;
// using TMatch = typename TGlobalHolder::TMatch;
@@ -1339,8 +1643,8 @@ iterateMatches(TLocalHolder & lH)
using TBlastRecord = BlastRecord<TBlastMatch>;
// constexpr TPos TPosMax = std::numeric_limits<TPos>::max();
-// constexpr uint8_t qFactor = qHasRevComp(lH.gH.blastProgram) ? 3 : 1;
-// constexpr uint8_t sFactor = sHasRevComp(lH.gH.blastProgram) ? 3 : 1;
+// constexpr uint8_t qFactor = qHasRevComp(TGlobalHolder::blastProgram) ? 3 : 1;
+// constexpr uint8_t sFactor = sHasRevComp(TGlobalHolder::blastProgram) ? 3 : 1;
double start = sysTime();
if (lH.options.doubleIndexing)
@@ -1354,7 +1658,7 @@ iterateMatches(TLocalHolder & lH)
// std::cout << "Length of matches: " << length(lH.matches);
// for (auto const & m : lH.matches)
// {
-// std::cout << m.qryId << "\t" << getTrueQryId(m,lH.options, lH.gH.blastProgram) << "\n";
+// std::cout << m.qryId << "\t" << getTrueQryId(m,lH.options, TGlobalHolder::blastProgram) << "\n";
// }
// double topMaxMatchesMedianBitScore = 0;
@@ -1367,11 +1671,11 @@ iterateMatches(TLocalHolder & lH)
++it)
{
itN = std::next(it,1);
- auto const trueQryId = it->qryId / qNumFrames(lH.gH.blastProgram);
+ auto const trueQryId = it->qryId / qNumFrames(TGlobalHolder::blastProgram);
TBlastRecord record(lH.gH.qryIds[trueQryId]);
- record.qLength = (qIsTranslated(lH.gH.blastProgram)
+ record.qLength = (qIsTranslated(TGlobalHolder::blastProgram)
? lH.gH.untransQrySeqLengths[trueQryId]
: length(lH.gH.qrySeqs[it->qryId]));
@@ -1380,7 +1684,7 @@ iterateMatches(TLocalHolder & lH)
// inner loop over matches per record
for (; it != itEnd; ++it)
{
- auto const trueSubjId = it->subjId / sNumFrames(lH.gH.blastProgram);
+ auto const trueSubjId = it->subjId / sNumFrames(TGlobalHolder::blastProgram);
itN = std::next(it,1);
// std::cout << "FOO\n" << std::flush;
// std::cout << "QryStart: " << it->qryStart << "\n" << std::flush;
@@ -1432,7 +1736,7 @@ iterateMatches(TLocalHolder & lH)
{
// declare all the rest as putative abundant
while ((it != itEnd) &&
- (trueQryId == it->qryId / qNumFrames(lH.gH.blastProgram)))
+ (trueQryId == it->qryId / qNumFrames(TGlobalHolder::blastProgram)))
{
// not already marked as abundant, duplicate or merged
if (!isSetToSkip(*it))
@@ -1454,57 +1758,56 @@ iterateMatches(TLocalHolder & lH)
auto & bm = back(record.matches);
bm.qStart = it->qryStart;
- bm.qEnd = it->qryStart + lH.options.seedLength;
+ bm.qEnd = it->qryEnd; // it->qryStart + lH.options.seedLength;
bm.sStart = it->subjStart;
- bm.sEnd = it->subjStart + lH.options.seedLength;
+ bm.sEnd = it->subjEnd;//it->subjStart + lH.options.seedLength;
bm.qLength = record.qLength;
- bm.sLength = sIsTranslated(lH.gH.blastProgram)
+ bm.sLength = sIsTranslated(TGlobalHolder::blastProgram)
? lH.gH.untransSubjSeqLengths[trueSubjId]
: length(lH.gH.subjSeqs[it->subjId]);
// MERGE PUTATIVE SIBLINGS INTO THIS MATCH
- for (auto it2 = itN;
- (it2 != itEnd) &&
- (trueQryId == it2->qryId / qNumFrames(lH.gH.blastProgram)) &&
- (trueSubjId == it2->subjId / sNumFrames(lH.gH.blastProgram));
- ++it2)
+ if (lH.options.mergePutativeSiblings)
{
- // same frame
- if ((it->qryId % qNumFrames(lH.gH.blastProgram) == it2->qryId % qNumFrames(lH.gH.blastProgram)) &&
- (it->subjId % sNumFrames(lH.gH.blastProgram) == it2->subjId % sNumFrames(lH.gH.blastProgram)))
+ for (auto it2 = itN;
+ (it2 != itEnd) &&
+ (trueQryId == it2->qryId / qNumFrames(TGlobalHolder::blastProgram)) &&
+ (trueSubjId == it2->subjId / sNumFrames(TGlobalHolder::blastProgram));
+ ++it2)
{
-
-// TPos const qDist = (it2->qryStart >= bm.qEnd)
-// ? it2->qryStart - bm.qEnd // upstream
-// : 0; // overlap
-//
-// TPos sDist = TPosMax; // subj match region downstream of *it
-// if (it2->subjStart >= bm.sEnd) // upstream
-// sDist = it2->subjStart - bm.sEnd;
-// else if (it2->subjStart >= it->subjStart) // overlap
-// sDist = 0;
-
- // due to sorting it2->qryStart never <= it->qStart
- // so subject sequences must have same order
- if (it2->subjStart < it->subjStart)
- continue;
-
- long const qDist = it2->qryStart - bm.qEnd;
- long const sDist = it2->subjStart - bm.sEnd;
-
- if ((qDist == sDist) &&
- (qDist <= (long)lH.options.seedGravity))
+ // same frame
+ if ((it->qryId % qNumFrames(TGlobalHolder::blastProgram) == it2->qryId % qNumFrames(TGlobalHolder::blastProgram)) &&
+ (it->subjId % sNumFrames(TGlobalHolder::blastProgram) == it2->subjId % sNumFrames(TGlobalHolder::blastProgram)))
{
- bm.qEnd = std::max(bm.qEnd,
- static_cast<TBlastPos>(it2->qryStart
- + lH.options.seedLength));
- bm.sEnd = std::max(bm.sEnd,
- static_cast<TBlastPos>(it2->subjStart
- + lH.options.seedLength));
- ++lH.stats.hitsMerged;
-
- setToSkip(*it2);
+
+ // TPos const qDist = (it2->qryStart >= bm.qEnd)
+ // ? it2->qryStart - bm.qEnd // upstream
+ // : 0; // overlap
+ //
+ // TPos sDist = TPosMax; // subj match region downstream of *it
+ // if (it2->subjStart >= bm.sEnd) // upstream
+ // sDist = it2->subjStart - bm.sEnd;
+ // else if (it2->subjStart >= it->subjStart) // overlap
+ // sDist = 0;
+
+ // due to sorting it2->qryStart never <= it->qStart
+ // so subject sequences must have same order
+ if (it2->subjStart < it->subjStart)
+ continue;
+
+ long const qDist = it2->qryStart - bm.qEnd;
+ long const sDist = it2->subjStart - bm.sEnd;
+
+ if ((qDist == sDist) &&
+ (qDist <= (long)lH.options.seedGravity))
+ {
+ bm.qEnd = std::max(bm.qEnd, static_cast<TBlastPos>(it2->qryEnd));
+ bm.sEnd = std::max(bm.sEnd, static_cast<TBlastPos>(it2->subjEnd));
+ ++lH.stats.hitsMerged;
+
+ setToSkip(*it2);
+ }
}
}
}
@@ -1518,8 +1821,8 @@ iterateMatches(TLocalHolder & lH)
// ++lH.stats.goodMatches;
if (lH.options.outFileFormat > 0)
{
- bm._n_qId = it->qryId / qNumFrames(lH.gH.blastProgram);
- bm._n_sId = it->subjId / sNumFrames(lH.gH.blastProgram);
+ bm._n_qId = it->qryId / qNumFrames(TGlobalHolder::blastProgram);
+ bm._n_sId = it->subjId / sNumFrames(TGlobalHolder::blastProgram);
}
break;
case EVALUE:
@@ -1537,16 +1840,20 @@ iterateMatches(TLocalHolder & lH)
<< "subjId: " << it->subjId << "\t"
<< "seed qry: " << infix(lH.gH.redQrySeqs,
it->qryStart,
- it->qryStart + lH.options.seedLength)
+ it->qryEnd)
+// it->qryStart + lH.options.seedLength)
<< "\n subj: " << infix(lH.gH.redSubjSeqs,
it->subjStart,
- it->subjStart + lH.options.seedLength)
+ it->subjEnd)
+// it->subjStart + lH.options.seedLength)
<< "\nunred qry: " << infix(lH.gH.qrySeqs,
it->qryStart,
- it->qryStart + lH.options.seedLength)
+ it->qryEnd)
+// it->qryStart + lH.options.seedLength)
<< "\n subj: " << infix(lH.gH.subjSeqs,
it->subjStart,
- it->subjStart + lH.options.seedLength)
+ it->subjEnd)
+// it->subjStart + lH.options.seedLength)
<< "\nmatch qry: " << infix(lH.gH.qrySeqs,
bm.qStart,
bm.qEnd)
@@ -1567,19 +1874,21 @@ iterateMatches(TLocalHolder & lH)
// PUTATIVE DUBLICATES CHECK
for (auto it2 = itN;
(it2 != itEnd) &&
- (trueQryId == it2->qryId / qNumFrames(lH.gH.blastProgram)) &&
- (trueSubjId == it2->subjId / sNumFrames(lH.gH.blastProgram));
+ (trueQryId == it2->qryId / qNumFrames(TGlobalHolder::blastProgram)) &&
+ (trueSubjId == it2->subjId / sNumFrames(TGlobalHolder::blastProgram));
++it2)
{
// same frame and same range
if ((it->qryId == it2->qryId) &&
(it->subjId == it2->subjId) &&
(intervalOverlap(it2->qryStart,
- it2->qryStart + lH.options.seedLength,
+ it2->qryEnd,
+// it2->qryStart + lH.options.seedLength,
bm.qStart,
bm.qEnd) > 0) &&
(intervalOverlap(it2->subjStart,
- it2->subjStart + lH.options.seedLength,
+ it2->subjEnd,
+// it2->subjStart + lH.options.seedLength,
bm.sStart,
bm.sEnd) > 0))
{
@@ -1604,32 +1913,11 @@ iterateMatches(TLocalHolder & lH)
// last item or new TrueQryId
if ((itN == itEnd) ||
- (trueQryId != itN->qryId / qNumFrames(lH.gH.blastProgram)))
+ (trueQryId != itN->qryId / qNumFrames(TGlobalHolder::blastProgram)))
break;
}
- if (length(record.matches) > 0)
- {
- ++lH.stats.qrysWithHit;
- // sort and remove duplicates -> STL, yeah!
- auto const before = record.matches.size();
- record.matches.sort();
- if (!lH.options.filterPutativeDuplicates)
- {
- record.matches.unique();
- lH.stats.hitsDuplicate += before - record.matches.size();
- }
- if (record.matches.size() > lH.options.maxMatches)
- {
- lH.stats.hitsAbundant += record.matches.size() -
- lH.options.maxMatches;
- record.matches.resize(lH.options.maxMatches);
- }
- lH.stats.hitsFinal += record.matches.size();
-
- myWriteRecord(lH, record);
- }
-
+ _writeRecord(record, lH);
}
if (lH.options.doubleIndexing)
@@ -1644,4 +1932,319 @@ iterateMatches(TLocalHolder & lH)
return 0;
}
+#ifdef SEQAN_SIMD_ENABLED
+template <typename TLocalHolder>
+inline int
+iterateMatchesFullSimd(TLocalHolder & lH)
+{
+ using TGlobalHolder = typename TLocalHolder::TGlobalHolder;
+ using TMatch = typename TGlobalHolder::TMatch;
+ using TPos = typename TMatch::TPos;
+ using TBlastPos = uint32_t; //TODO why can't this be == TPos
+ using TBlastMatch = BlastMatch<
+ typename TLocalHolder::TAlignRow0,
+ typename TLocalHolder::TAlignRow1,
+ TBlastPos,
+ typename Value<typename TGlobalHolder::TQryIds>::Type,// const &,
+ typename Value<typename TGlobalHolder::TSubjIds>::Type// const &,
+ >;
+ using TBlastRecord = BlastRecord<TBlastMatch>;
+
+ typedef FreeEndGaps_<True, True, True, True> TFreeEndGaps;
+ typedef AlignConfig2<LocalAlignment_<>,
+ DPBandConfig<BandOff>,
+ TFreeEndGaps,
+ TracebackOn<TracebackConfig_<CompleteTrace, GapsLeft> > > TAlignConfig;
+
+ typedef int TScoreValue; //TODO don't hardcode
+ typedef typename Size<typename TLocalHolder::TAlignRow0>::Type TSize;
+ typedef TraceSegment_<TPos, TSize> TTraceSegment;
+
+ typedef typename SimdVector<int16_t>::Type TSimdAlign;
+
+ unsigned const numAlignments = length(lH.matches);
+ unsigned const sizeBatch = LENGTH<TSimdAlign>::VALUE;
+ unsigned const fullSize = sizeBatch * ((numAlignments + sizeBatch - 1) / sizeBatch);
+
+ String<TScoreValue> results;
+ resize(results, numAlignments);
+
+ // Create a SIMD scoring scheme.
+ Score<TSimdAlign, ScoreSimdWrapper<typename TGlobalHolder::TScoreScheme> > simdScoringScheme(seqanScheme(context(lH.gH.outfile).scoringScheme));
+
+ // Prepare string sets with sequences.
+ StringSet<typename Source<typename TLocalHolder::TAlignRow0>::Type, Dependent<> > depSetH;
+ StringSet<typename Source<typename TLocalHolder::TAlignRow1>::Type, Dependent<> > depSetV;
+ reserve(depSetH, fullSize);
+ reserve(depSetV, fullSize);
+
+
+ auto const trueQryId = lH.matches[0].qryId / qNumFrames(TGlobalHolder::blastProgram);
+
+ TBlastRecord record(lH.gH.qryIds[trueQryId]);
+ record.qLength = (qIsTranslated(TGlobalHolder::blastProgram)
+ ? lH.gH.untransQrySeqLengths[trueQryId]
+ : length(lH.gH.qrySeqs[lH.matches[0].qryId]));
+
+ size_t maxDist = 0;
+ switch (lH.options.band)
+ {
+ case -3: maxDist = ceil(log2(record.qLength)); break;
+ case -2: maxDist = floor(sqrt(record.qLength)); break;
+ case -1: break;
+ default: maxDist = lH.options.band; break;
+ }
+
+ TAlignConfig config;//(-maxDist, maxDist);
+
+ // create blast matches
+ for (auto it = lH.matches.begin(), itEnd = lH.matches.end(); it != itEnd; ++it)
+ {
+ auto const trueSubjId = it->subjId / sNumFrames(TGlobalHolder::blastProgram);
+
+ // create blastmatch in list without copy or move
+ record.matches.emplace_back(lH.gH.qryIds [trueQryId],
+ lH.gH.subjIds[trueSubjId]);
+
+ auto & bm = back(record.matches);
+ auto & m = *it;
+
+ bm.qLength = record.qLength;
+ bm.sLength = sIsTranslated(TGlobalHolder::blastProgram)
+ ? lH.gH.untransSubjSeqLengths[trueSubjId]
+ : length(lH.gH.subjSeqs[it->subjId]);
+
+ long lenDiff = (long)it->subjStart - (long)it->qryStart;
+
+ TPos sStart;
+ TPos qStart;
+ if (lenDiff >= 0)
+ {
+ sStart = lenDiff;
+ qStart = 0;
+ }
+ else
+ {
+ sStart = 0;
+ qStart = -lenDiff;
+ }
+ TPos sEnd = std::min(sStart + length(lH.gH.qrySeqs[it->qryId]), length(lH.gH.subjSeqs[it->subjId]));
+
+ assignSource(bm.alignRow0, infix(lH.gH.qrySeqs[it->qryId], qStart, length(lH.gH.qrySeqs[it->qryId])));
+ assignSource(bm.alignRow1, infix(lH.gH.subjSeqs[it->subjId], sStart, sEnd));
+
+
+ appendValue(depSetH, source(bm.alignRow0));
+ appendValue(depSetV, source(bm.alignRow1));
+
+ _setFrames(bm, *it, lH);
+
+ bm._n_qId = it->qryId / qNumFrames(TGlobalHolder::blastProgram);
+ bm._n_sId = it->subjId / sNumFrames(TGlobalHolder::blastProgram);
+ }
+
+ // fill up last batch
+ for (size_t i = numAlignments; i < fullSize; ++i)
+ {
+ appendValue(depSetH, source(back(record.matches).alignRow0));
+ appendValue(depSetV, source(back(record.matches).alignRow1));
+ }
+
+ // Run alignments in batches.
+ auto matchIt = record.matches.begin();
+ for (auto pos = 0u; pos < fullSize; pos += sizeBatch)
+ {
+ auto infSetH = infixWithLength(depSetH, pos, sizeBatch);
+ auto infSetV = infixWithLength(depSetV, pos, sizeBatch);
+
+ TSimdAlign resultsBatch;
+
+ StringSet<String<TTraceSegment> > trace;
+ resize(trace, sizeBatch, Exact());
+
+ _prepareAndRunSimdAlignment(resultsBatch, trace, infSetH, infSetV, simdScoringScheme, config, typename TLocalHolder::TScoreExtension());
+
+ // copy results and finish traceback
+ // TODO(rrahn): Could be parallelized!
+ // to for_each call
+ for(auto x = pos; x < pos + sizeBatch && x < numAlignments; ++x)
+ {
+ results[x] = resultsBatch[x - pos];
+ _adaptTraceSegmentsTo(matchIt->alignRow0, matchIt->alignRow1, trace[x - pos]);
+ ++matchIt;
+ }
+ }
+
+ // TODO share this code with above function
+ for (auto it = record.matches.begin(), itEnd = record.matches.end(); it != itEnd; /*below*/)
+ {
+ TBlastMatch & bm = *it;
+
+ bm.sStart = beginPosition(bm.alignRow1);
+ bm.qStart = beginPosition(bm.alignRow0);
+ bm.sEnd = endPosition(bm.alignRow1);
+ bm.qEnd = endPosition(bm.alignRow0);
+
+ computeAlignmentStats(bm, context(lH.gH.outfile));
+
+ if (bm.alignStats.alignmentIdentity < lH.options.idCutOff)
+ {
+ ++lH.stats.hitsFailedExtendPercentIdentTest;
+ it = record.matches.erase(it);
+ continue;
+ }
+
+ computeBitScore(bm, context(lH.gH.outfile));
+
+ computeEValueThreadSafe(bm, context(lH.gH.outfile));
+
+ if (bm.eValue > lH.options.eCutOff)
+ {
+ ++lH.stats.hitsFailedExtendEValueTest;
+ it = record.matches.erase(it);
+ continue;
+ }
+
+ ++it;
+ }
+
+ _writeRecord(record, lH);
+
+ return 0;
+}
+
+#endif // SEQAN_SIMD_ENABLED
+
+template <typename TLocalHolder>
+inline int
+iterateMatchesFullSerial(TLocalHolder & lH)
+{
+ using TGlobalHolder = typename TLocalHolder::TGlobalHolder;
+ using TMatch = typename TGlobalHolder::TMatch;
+ using TPos = typename TMatch::TPos;
+ using TBlastPos = uint32_t; //TODO why can't this be == TPos
+ using TBlastMatch = BlastMatch<
+ typename TLocalHolder::TAlignRow0,
+ typename TLocalHolder::TAlignRow1,
+ TBlastPos,
+ typename Value<typename TGlobalHolder::TQryIds>::Type,// const &,
+ typename Value<typename TGlobalHolder::TSubjIds>::Type// const &,
+ >;
+ using TBlastRecord = BlastRecord<TBlastMatch>;
+
+ auto const trueQryId = lH.matches[0].qryId / qNumFrames(TGlobalHolder::blastProgram);
+
+ TBlastRecord record(lH.gH.qryIds[trueQryId]);
+ record.qLength = (qIsTranslated(TGlobalHolder::blastProgram)
+ ? lH.gH.untransQrySeqLengths[trueQryId]
+ : length(lH.gH.qrySeqs[lH.matches[0].qryId]));
+
+ unsigned maxDist = 0;
+ switch (lH.options.band)
+ {
+ case -3: maxDist = ceil(log2(record.qLength)); break;
+ case -2: maxDist = floor(sqrt(record.qLength)); break;
+ case -1: break;
+ default: maxDist = lH.options.band; break;
+ }
+
+ // create blast matches
+ for (auto it = lH.matches.begin(), itEnd = lH.matches.end(); it != itEnd; ++it)
+ {
+ auto const trueSubjId = it->subjId / sNumFrames(TGlobalHolder::blastProgram);
+
+ // create blastmatch in list without copy or move
+ record.matches.emplace_back(lH.gH.qryIds [trueQryId],
+ lH.gH.subjIds[trueSubjId]);
+
+ auto & bm = back(record.matches);
+ auto & m = *it;
+
+ bm.qLength = record.qLength;
+ bm.sLength = sIsTranslated(TGlobalHolder::blastProgram)
+ ? lH.gH.untransSubjSeqLengths[trueSubjId]
+ : length(lH.gH.subjSeqs[it->subjId]);
+
+ long lenDiff = (long)it->subjStart - (long)it->qryStart;
+
+ TPos sStart;
+ TPos qStart;
+ if (lenDiff >= 0)
+ {
+ sStart = lenDiff;
+ qStart = 0;
+ }
+ else
+ {
+ sStart = 0;
+ qStart = -lenDiff;
+ }
+ TPos sEnd = std::min(sStart + length(lH.gH.qrySeqs[it->qryId]), length(lH.gH.subjSeqs[it->subjId]));
+
+ assignSource(bm.alignRow0, infix(lH.gH.qrySeqs[it->qryId], qStart, length(lH.gH.qrySeqs[it->qryId])));
+ assignSource(bm.alignRow1, infix(lH.gH.subjSeqs[it->subjId], sStart, sEnd));
+
+// localAlignment2(bm.alignRow0,
+// bm.alignRow1,
+// seqanScheme(context(lH.gH.outfile).scoringScheme),
+// -maxDist,
+// maxDist,
+// lH.alignContext);
+ localAlignment(bm.alignRow0,
+ bm.alignRow1,
+ seqanScheme(context(lH.gH.outfile).scoringScheme),
+ -maxDist,
+ maxDist);
+
+ bm.sStart = beginPosition(bm.alignRow1);
+ bm.qStart = beginPosition(bm.alignRow0);
+ bm.sEnd = endPosition(bm.alignRow1);
+ bm.qEnd = endPosition(bm.alignRow0);
+
+ computeAlignmentStats(bm, context(lH.gH.outfile));
+
+ if (bm.alignStats.alignmentIdentity < lH.options.idCutOff)
+ {
+ ++lH.stats.hitsFailedExtendPercentIdentTest;
+ record.matches.pop_back();
+ continue;
+ }
+
+ computeBitScore(bm, context(lH.gH.outfile));
+
+ computeEValueThreadSafe(bm, context(lH.gH.outfile));
+
+ if (bm.eValue > lH.options.eCutOff)
+ {
+ ++lH.stats.hitsFailedExtendEValueTest;
+ record.matches.pop_back();
+ continue;
+ }
+
+ _setFrames(bm, m, lH);
+
+ bm._n_qId = it->qryId / qNumFrames(TGlobalHolder::blastProgram);
+ bm._n_sId = it->subjId / sNumFrames(TGlobalHolder::blastProgram);
+ }
+
+ _writeRecord(record, lH);
+
+ return 0;
+}
+
+template <typename TLocalHolder>
+inline int
+iterateMatches(TLocalHolder & lH)
+{
+#ifdef SEQAN_SIMD_ENABLED
+ if (lH.options.extensionMode == LambdaOptions::ExtensionMode::FULL_SIMD)
+ return iterateMatchesFullSimd(lH);
+ else
+#endif
+ if (lH.options.extensionMode == LambdaOptions::ExtensionMode::FULL_SERIAL)
+ return iterateMatchesFullSerial(lH);
+ else
+ return iterateMatchesExtend(lH);
+}
+
#endif // HEADER GUARD
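Note on the hunk above: iterateMatchesFullSerial places the query on the diagonal of the seed hit and clips the subject window before calling localAlignment. The arithmetic can be read in isolation. The following is only a sketch under assumptions: plain std::size_t positions stand in for SeqAn's TPos, and the names Window and fullExtensionWindow are hypothetical and do not appear in lambda.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the (qStart, sStart, sEnd) values computed in the loop.
struct Window
{
    std::size_t qStart;
    std::size_t sStart;
    std::size_t sEnd;
};

// Put the query on the seed's diagonal, then clip the subject end to the
// shorter of "start + query length" and the subject length.
inline Window
fullExtensionWindow(std::size_t seedQryStart, std::size_t seedSubjStart,
                    std::size_t qryLength, std::size_t subjLength)
{
    long const lenDiff = static_cast<long>(seedSubjStart) - static_cast<long>(seedQryStart);

    Window w{};
    if (lenDiff >= 0)
    {
        // seed lies further into the subject: skip the subject prefix
        w.sStart = static_cast<std::size_t>(lenDiff);
        w.qStart = 0;
    }
    else
    {
        // seed lies further into the query: skip the query prefix
        w.sStart = 0;
        w.qStart = static_cast<std::size_t>(-lenDiff);
    }
    w.sEnd = std::min(w.sStart + qryLength, subjLength);
    return w;
}
```

For a seed at query position 10 and subject position 50, the window starts at subject position 40 and covers the whole query from position 0.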
diff --git a/src/lambda_indexer.cpp b/src/lambda_indexer.cpp
index d53f0df..90eafd9 100644
--- a/src/lambda_indexer.cpp
+++ b/src/lambda_indexer.cpp
@@ -19,8 +19,9 @@
// lambda.cpp: Main File for the main application
// ==========================================================================
-#include <seqan/basic.h>
+#include <initializer_list>
+#include <seqan/basic.h>
#include <seqan/arg_parse.h>
#include <seqan/seq_io.h>
@@ -178,24 +179,24 @@ realMain(LambdaIndexerOptions const & options,
if (sIsTranslated(p))
_saveOriginalSeqLengths(originalSeqs.limits, options);
- // convert the seg file to seqan binary format
- ret = convertMaskingFile(length(originalSeqs), options);
- if (ret)
- return ret;
+// // convert the seg file to seqan binary format
+// ret = convertMaskingFile(length(originalSeqs), options);
+// if (ret)
+// return ret;
// translate or swap depending on program
translateOrSwap(translatedSeqs, originalSeqs, options);
}
// dump translated and unreduced sequences (except where they are included in index)
- if ((options.alphReduction != 0) || (options.dbIndexType != 0))
+ if ((options.alphReduction != 0) || (options.dbIndexType == DbIndexType::FM_INDEX))
dumpTranslatedSeqs(translatedSeqs, options);
// see if final sequence set actually fits into index
if (!checkIndexSize(translatedSeqs))
return -1;
- if (options.dbIndexType == 1)
+ if (options.dbIndexType == DbIndexType::FM_INDEX)
{
using TIndexSpec = TFMIndex<TIndexSpecSpec>;
generateIndexAndDump<TIndexSpec,TIndexSpecSpec>(translatedSeqs,
@@ -211,6 +212,21 @@ realMain(LambdaIndexerOptions const & options,
TRedAlph());
}
+ // dump options
+ for (auto && s : std::initializer_list<std::pair<std::string, std::string>>
+ {
+ { options.indexDir + "/option:db_index_type", std::to_string(static_cast<uint32_t>(options.dbIndexType))},
+ { options.indexDir + "/option:alph_original", std::string(_alphName(OrigSubjAlph<p>())) },
+ { options.indexDir + "/option:alph_translated", std::string(_alphName(TransAlph<p>())) },
+ { options.indexDir + "/option:alph_reduced", std::string(_alphName(TRedAlph())) },
+ { options.indexDir + "/option:genetic_code", std::to_string(options.geneticCode) }
+ })
+ {
+ std::ofstream f{std::get<0>(s).c_str(), std::ios_base::out | std::ios_base::binary};
+ f << std::get<1>(s);
+ f.close();
+ }
+
return 0;
}
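Note on the hunk above: the indexer now records its settings as one small file per option inside the index directory. A minimal sketch of that pattern follows, assuming nothing beyond the standard library. The helper names dumpOptionFiles and readOptionFile are hypothetical, and the read-back function is only an illustration of how a consumer might load such a file, not code taken from lambda.

```cpp
#include <fstream>
#include <initializer_list>
#include <iterator>
#include <string>
#include <utility>

// Write each (path, value) pair to its own file; stop at the first failure.
inline bool
dumpOptionFiles(std::initializer_list<std::pair<std::string, std::string>> entries)
{
    for (auto const & e : entries)
    {
        std::ofstream f{e.first.c_str(), std::ios_base::out | std::ios_base::binary};
        if (!f)
            return false;
        f << e.second;
        if (!f.good())
            return false;
    } // the stream is flushed and closed when f goes out of scope
    return true;
}

// Read the whole file back as a string (empty if it cannot be opened).
inline std::string
readOptionFile(std::string const & path)
{
    std::ifstream f{path.c_str(), std::ios_base::in | std::ios_base::binary};
    return std::string{std::istreambuf_iterator<char>(f), std::istreambuf_iterator<char>()};
}
```

Unlike the loop in the diff, this sketch reports a failed open or write to the caller. Whether the indexer should do the same is a design decision for upstream.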
diff --git a/src/lambda_indexer.hpp b/src/lambda_indexer.hpp
index 54fc759..ce40b9f 100644
--- a/src/lambda_indexer.hpp
+++ b/src/lambda_indexer.hpp
@@ -113,8 +113,8 @@ loadSubjSeqsAndIds(TCDStringSet<String<TOrigAlph>> & originalSeqs,
myPrint(options, 1, "Dumping Subj Ids...");
//TODO save to TMPDIR instead
- CharString _path = options.dbFile;
- append(_path, ".ids");
+ CharString _path = options.indexDir;
+ append(_path, "/seq_ids");
save(ids, toCString(_path));
myPrint(options, 1, " done.\n");
@@ -139,8 +139,8 @@ _saveOriginalSeqLengths(TLimits limits, // we want copy!
myPrint(options, 1, " dumping untranslated subject lengths...");
//TODO save to TMPDIR instead
- CharString _path = options.dbFile;
- append(_path, ".untranslengths");
+ CharString _path = options.indexDir;
+ append(_path, "/untranslated_seq_lengths");
save(limits, toCString(_path));
}
@@ -184,7 +184,7 @@ dumpTranslatedSeqs(TCDStringSet<String<TTransAlph>> const & translatedSeqs,
myPrint(options, 1, "Dumping unreduced Subj Sequences...");
//TODO save to TMPDIR instead
- std::string _path = options.dbFile + '.' + std::string(_alphName(TTransAlph()));
+ std::string _path = options.indexDir + "/translated_seqs";
save(translatedSeqs, _path.c_str());
myPrint(options, 1, " done.\n");
@@ -252,132 +252,132 @@ checkIndexSize(TCDStringSet<String<TRedAlph>> const & seqs)
return true;
}
-// --------------------------------------------------------------------------
-// Function loadSubj()
-// --------------------------------------------------------------------------
-
-inline int
-convertMaskingFile(uint64_t numberOfSeqs,
- LambdaIndexerOptions const & options)
-
-{
- StringSet<String<unsigned>, Owner<ConcatDirect<>>> segIntStarts;
- StringSet<String<unsigned>, Owner<ConcatDirect<>>> segIntEnds;
-// resize(segIntervals, numberOfSeqs, Exact());
-
- if (options.segFile != "")
- {
- myPrint(options, 1, "Constructing binary seqan masking from seg-file...");
-
- std::ifstream stream;
- stream.open(toCString(options.segFile));
- if (!stream.is_open())
- {
- std::cerr << "ERROR: could not open seg file.\n";
- return -1;
- }
-
- auto reader = directionIterator(stream, Input());
-
-// StringSet<String<Tuple<unsigned, 2>>> _segIntervals;
-// auto & _segIntervals = segIntervals;
-// resize(_segIntervals, numberOfSeqs, Exact());
- StringSet<String<unsigned>> _segIntStarts;
- StringSet<String<unsigned>> _segIntEnds;
- resize(_segIntStarts, numberOfSeqs, Exact());
- resize(_segIntEnds, numberOfSeqs, Exact());
- CharString buf;
-// std::tuple<unsigned, unsigned> tup;
-
-// auto curSeq = begin(_segIntervals);
- unsigned curSeq = 0;
- while (value(reader) == '>')
- {
-// if (curSeq == end(_segIntervals))
-// return -7;
- if (curSeq == numberOfSeqs)
- {
- std::cerr << "ERROR: seg file has more entries then database.\n";
- return -7;
- }
- skipLine(reader);
- if (atEnd(reader))
- break;
-
- unsigned curInt = 0;
- while ((!atEnd(reader)) && (value(reader) != '>'))
- {
- resize(_segIntStarts[curSeq], length(_segIntStarts[curSeq])+1);
- resize(_segIntEnds[curSeq], length(_segIntEnds[curSeq])+1);
- clear(buf);
- readUntil(buf, reader, IsWhitespace());
-
-// std::get<0>(tup) = strtoumax(toCString(buf), 0, 10);
- _segIntStarts[curSeq][curInt] = strtoumax(toCString(buf), 0, 10);
- skipUntil(reader, IsDigit());
-
- clear(buf);
- readUntil(buf, reader, IsWhitespace());
-
-// std::get<1>(tup) = strtoumax(toCString(buf), 0, 10);
- _segIntEnds[curSeq][curInt] = strtoumax(toCString(buf), 0, 10);
-
-// appendValue(*curSeq, tup);
-
- skipLine(reader);
- curInt++;
- }
- if (atEnd(reader))
- break;
- else
- curSeq++;
- }
-// if (curSeq != end(_segIntervals))
-// return -9;
- if (curSeq != (numberOfSeqs - 1))
- {
- std::cerr << "ERROR: seg file has less entries (" << curSeq + 1
- << ") than database (" << numberOfSeqs << ").\n";
- return -9;
- }
-
- segIntStarts.concat = concat(_segIntStarts);
- segIntStarts.limits = stringSetLimits(_segIntStarts);
- segIntEnds.concat = concat(_segIntEnds);
- segIntEnds.limits = stringSetLimits(_segIntEnds);
-// segIntEnds = _segIntEnds;
-// segIntervals = _segIntervals; // non-concatdirect to concatdirect
-
- stream.close();
-
- } else
- {
- myPrint(options, 1, "No Seg-File specified, no masking will take place.\n");
-// resize(segIntervals, numberOfSeqs, Exact());
- resize(segIntStarts, numberOfSeqs, Exact());
- resize(segIntEnds, numberOfSeqs, Exact());
- }
-
-// for (unsigned u = 0; u < length(segIntStarts); ++u)
+// // --------------------------------------------------------------------------
+// // Function loadSubj()
+// // --------------------------------------------------------------------------
+//
+// inline int
+// convertMaskingFile(uint64_t numberOfSeqs,
+// LambdaIndexerOptions const & options)
+//
+// {
+// StringSet<String<unsigned>, Owner<ConcatDirect<>>> segIntStarts;
+// StringSet<String<unsigned>, Owner<ConcatDirect<>>> segIntEnds;
+// // resize(segIntervals, numberOfSeqs, Exact());
+//
+// if (options.segFile != "")
// {
-// myPrint(options, 1,u, ": ";
-// for (unsigned v = 0; v < length(segIntStarts[u]); ++v)
+// myPrint(options, 1, "Constructing binary seqan masking from seg-file...");
+//
+// std::ifstream stream;
+// stream.open(toCString(options.segFile));
+// if (!stream.is_open())
+// {
+// std::cerr << "ERROR: could not open seg file.\n";
+// return -1;
+// }
+//
+// auto reader = directionIterator(stream, Input());
+//
+// // StringSet<String<Tuple<unsigned, 2>>> _segIntervals;
+// // auto & _segIntervals = segIntervals;
+// // resize(_segIntervals, numberOfSeqs, Exact());
+// StringSet<String<unsigned>> _segIntStarts;
+// StringSet<String<unsigned>> _segIntEnds;
+// resize(_segIntStarts, numberOfSeqs, Exact());
+// resize(_segIntEnds, numberOfSeqs, Exact());
+// CharString buf;
+// // std::tuple<unsigned, unsigned> tup;
+//
+// // auto curSeq = begin(_segIntervals);
+// unsigned curSeq = 0;
+// while (value(reader) == '>')
+// {
+// // if (curSeq == end(_segIntervals))
+// // return -7;
+// if (curSeq == numberOfSeqs)
+// {
+// std::cerr << "ERROR: seg file has more entries than database.\n";
+// return -7;
+// }
+// skipLine(reader);
+// if (atEnd(reader))
+// break;
+//
+// unsigned curInt = 0;
+// while ((!atEnd(reader)) && (value(reader) != '>'))
+// {
+// resize(_segIntStarts[curSeq], length(_segIntStarts[curSeq])+1);
+// resize(_segIntEnds[curSeq], length(_segIntEnds[curSeq])+1);
+// clear(buf);
+// readUntil(buf, reader, IsWhitespace());
+//
+// // std::get<0>(tup) = strtoumax(toCString(buf), 0, 10);
+// _segIntStarts[curSeq][curInt] = strtoumax(toCString(buf), 0, 10);
+// skipUntil(reader, IsDigit());
+//
+// clear(buf);
+// readUntil(buf, reader, IsWhitespace());
+//
+// // std::get<1>(tup) = strtoumax(toCString(buf), 0, 10);
+// _segIntEnds[curSeq][curInt] = strtoumax(toCString(buf), 0, 10);
+//
+// // appendValue(*curSeq, tup);
+//
+// skipLine(reader);
+// curInt++;
+// }
+// if (atEnd(reader))
+// break;
+// else
+// curSeq++;
+// }
+// // if (curSeq != end(_segIntervals))
+// // return -9;
+// if (curSeq != (numberOfSeqs - 1))
// {
-// myPrint(options, 1,'(', segIntStarts[u][v], ", ", segIntEnds[u][v], ") ";
+// std::cerr << "ERROR: seg file has less entries (" << curSeq + 1
+// << ") than database (" << numberOfSeqs << ").\n";
+// return -9;
// }
-// myPrint(options, 1,'\n';
+//
+// segIntStarts.concat = concat(_segIntStarts);
+// segIntStarts.limits = stringSetLimits(_segIntStarts);
+// segIntEnds.concat = concat(_segIntEnds);
+// segIntEnds.limits = stringSetLimits(_segIntEnds);
+// // segIntEnds = _segIntEnds;
+// // segIntervals = _segIntervals; // non-concatdirect to concatdirect
+//
+// stream.close();
+//
+// } else
+// {
+// myPrint(options, 1, "No Seg-File specified, no masking will take place.\n");
+// // resize(segIntervals, numberOfSeqs, Exact());
+// resize(segIntStarts, numberOfSeqs, Exact());
+// resize(segIntEnds, numberOfSeqs, Exact());
// }
- myPrint(options, 1, "Dumping binary seqan mask file...");
- CharString _path = options.dbFile;
- append(_path, ".binseg_s");
- save(segIntStarts, toCString(_path));
- _path = options.dbFile;
- append(_path, ".binseg_e");
- save(segIntEnds, toCString(_path));
- myPrint(options, 1, " done.\n");
- myPrint(options, 2, "\n");
- return 0;
-}
+//
+// // for (unsigned u = 0; u < length(segIntStarts); ++u)
+// // {
+// // myPrint(options, 1,u, ": ";
+// // for (unsigned v = 0; v < length(segIntStarts[u]); ++v)
+// // {
+// // myPrint(options, 1,'(', segIntStarts[u][v], ", ", segIntEnds[u][v], ") ";
+// // }
+// // myPrint(options, 1,'\n';
+// // }
+// myPrint(options, 1, "Dumping binary seqan mask file...");
+// CharString _path = options.dbFile;
+// append(_path, ".binseg_s");
+// save(segIntStarts, toCString(_path));
+// _path = options.dbFile;
+// append(_path, ".binseg_e");
+// save(segIntEnds, toCString(_path));
+// myPrint(options, 1, " done.\n");
+// myPrint(options, 2, "\n");
+// return 0;
+// }
// --------------------------------------------------------------------------
// Function createSuffixArray()
@@ -566,13 +566,10 @@ generateIndexAndDump(StringSet<TString, TSpec> & seqs,
// Dump Index
myPrint(options, 1, "Writing Index to disk...");
s = sysTime();
- std::string path = toCString(options.dbFile);
- path += '.' + std::string(_alphName(TRedAlph()));
- if (indexIsFM)
- path += ".fm";
- else
- path += ".sa";
+ std::string path = options.indexDir + "/index";
+
save(dbIndex, path.c_str());
+
e = sysTime() - s;
myPrint(options, 1, " done.\n");
myPrint(options, 2, "Runtime: ", e, "s \n");
diff --git a/src/match.hpp b/src/match.hpp
index bc664b5..d44d99a 100644
--- a/src/match.hpp
+++ b/src/match.hpp
@@ -44,10 +44,10 @@ struct Match
TQId qryId;
TSId subjId;
TPos qryStart;
-// TPos qryEnd;
+ TPos qryEnd;
TPos subjStart;
-// TPos subjEnd;
+ TPos subjEnd;
// Match()
// :
@@ -67,13 +67,13 @@ struct Match
inline bool operator== (Match const & m2) const
{
- return std::tie(qryId, subjId, qryStart, subjStart/*, qryEnd, subjEnd*/)
- == std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart/*, m2.qryEnd, m2.subjEnd*/);
+ return std::tie(qryId, subjId, qryStart, subjStart, qryEnd, subjEnd)
+ == std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart, m2.qryEnd, m2.subjEnd);
}
inline bool operator< (Match const & m2) const
{
- return std::tie(qryId, subjId, qryStart, subjStart/*, qryEnd, subjEnd*/)
- < std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart/*, m2.qryEnd, m2.subjEnd*/);
+ return std::tie(qryId, subjId, qryStart, subjStart, qryEnd, subjEnd)
+ < std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart, m2.qryEnd, m2.subjEnd);
}
};
@@ -272,16 +272,16 @@ myHyperSortSingleIndex(std::vector<Match<TAlph>> & matches,
// m1.subjEnd = std::max(m1.subjEnd, m2.subjEnd);
// }
-
-// inline void
-// _printMatch(Match const & m)
-// {
-// std::cout << "MATCH Query " << m.qryId
-// << "(" << m.qryStart << ", " << m.qryEnd
-// << ") on Subject "<< m.subjId
-// << "(" << m.subjStart << ", " << m.subjEnd
-// << ")" << std::endl << std::flush;
-// }
+template <typename TAlph>
+inline void
+_printMatch(Match<TAlph> const & m)
+{
+ std::cout << "MATCH Query " << m.qryId
+ << "(" << m.qryStart << ", " << m.qryEnd
+ << ") on Subject "<< m.subjId
+ << "(" << m.subjStart << ", " << m.subjEnd
+ << ")" << std::endl << std::flush;
+}
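Note on the hunk above: with qryEnd and subjEnd restored, both comparison operators include the end positions, so two matches with identical start positions but different ends no longer compare equal. A small sketch of the same std::tie idiom follows, using a hypothetical non-templated struct SimpleMatch with fixed-width fields in place of Match<TAlph>.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for Match<TAlph>; the field types are assumptions.
struct SimpleMatch
{
    uint32_t qryId;
    uint32_t subjId;
    uint32_t qryStart;
    uint32_t qryEnd;
    uint32_t subjStart;
    uint32_t subjEnd;

    // Lexicographic comparison in the same field order as the diff.
    bool operator==(SimpleMatch const & m2) const
    {
        return std::tie(qryId, subjId, qryStart, subjStart, qryEnd, subjEnd)
            == std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart, m2.qryEnd, m2.subjEnd);
    }

    bool operator<(SimpleMatch const & m2) const
    {
        return std::tie(qryId, subjId, qryStart, subjStart, qryEnd, subjEnd)
             < std::tie(m2.qryId, m2.subjId, m2.qryStart, m2.subjStart, m2.qryEnd, m2.subjEnd);
    }
};
```

Note that the tuple order (starts before ends) differs from the member declaration order, so sorting groups matches by start positions first.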
diff --git a/src/misc.hpp b/src/misc.hpp
index bd6de73..ddd0967 100644
--- a/src/misc.hpp
+++ b/src/misc.hpp
@@ -412,6 +412,41 @@ appendToStatus(std::stringstream & status,
}
// ----------------------------------------------------------------------------
+// Function computeEValueThreadSafe
+// ----------------------------------------------------------------------------
+
+template <typename TBlastMatch,
+ typename TScore,
+ BlastProgram p,
+ BlastTabularSpec h>
+inline double
+computeEValueThreadSafe(TBlastMatch & match,
+ BlastIOContext<TScore, p, h> & context)
+{
+#if defined(__FreeBSD__) && defined(STDLIB_LLVM)
+ // https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192320
+ static std::vector<std::unordered_map<uint64_t, uint64_t>> _cachedLengthAdjustmentsArray(omp_get_max_threads());
+ std::unordered_map<uint64_t, uint64_t> & _cachedLengthAdjustments = _cachedLengthAdjustmentsArray[omp_get_thread_num()];
+#else
+ static thread_local std::unordered_map<uint64_t, uint64_t> _cachedLengthAdjustments;
+#endif
+
+ // convert to 64bit and divide for translated sequences
+ uint64_t ql = match.qLength / (qIsTranslated(context.blastProgram) ? 3 : 1);
+ // length adjustment not yet computed
+ if (_cachedLengthAdjustments.find(ql) == _cachedLengthAdjustments.end())
+ _cachedLengthAdjustments[ql] = _lengthAdjustment(context.dbTotalLength, ql, context.scoringScheme);
+
+ uint64_t adj = _cachedLengthAdjustments[ql];
+
+ match.eValue = _computeEValue(match.alignStats.alignmentScore,
+ ql - adj,
+ context.dbTotalLength - adj,
+ context.scoringScheme);
+ return match.eValue;
+}
+
+// ----------------------------------------------------------------------------
// remove tag type
// ----------------------------------------------------------------------------
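Note on the hunk above: computeEValueThreadSafe caches the length adjustment per query length in a thread_local map, so each thread fills its own cache and needs no locking. A minimal sketch of that memoisation pattern follows. The function expensiveAdjustment is a hypothetical placeholder for SeqAn's _lengthAdjustment and its formula is made up for illustration.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical placeholder for an expensive, pure function of the query length.
inline uint64_t
expensiveAdjustment(uint64_t queryLength)
{
    return queryLength / 10;
}

// Memoise per thread: every thread owns its own map, so no synchronisation is needed.
inline uint64_t
cachedAdjustment(uint64_t queryLength)
{
    static thread_local std::unordered_map<uint64_t, uint64_t> cache;

    auto it = cache.find(queryLength);
    if (it == cache.end()) // not yet computed on this thread
        it = cache.emplace(queryLength, expensiveAdjustment(queryLength)).first;

    return it->second;
}
```

The trade-off is that each thread recomputes a value the first time it sees a given length. That is usually cheaper than a shared map guarded by a lock.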
diff --git a/src/options.hpp b/src/options.hpp
index 7441cf9..b73dc5b 100644
--- a/src/options.hpp
+++ b/src/options.hpp
@@ -104,9 +104,10 @@ struct DefaultIndexStringSpec<StringSet<TString, TSpec>>
template <typename TDirection, typename TStorageSpec>
struct FormattedFileContext<FormattedFile<Bam, TDirection, BlastTabular>, TStorageSpec>
{
- typedef StringSet<Segment<String<char, MMap<> >, InfixSegment> > TNameStore;
- typedef NameStoreCache<TNameStore> TNameStoreCache;
- typedef BamIOContext<TNameStore, TNameStoreCache, TStorageSpec> Type;
+ typedef typename DefaultIndexStringSpec<StringSet<void, void>>::Type TStringSpec; // see above
+ typedef StringSet<Segment<String<char, TStringSpec>, InfixSegment> > TNameStore;
+ typedef NameStoreCache<TNameStore> TNameStoreCache;
+ typedef BamIOContext<TNameStore, TNameStoreCache, TStorageSpec> Type;
};
}
@@ -122,7 +123,8 @@ struct LambdaFMIndexConfig
#else
using TAlloc = Alloc<>;
#endif
- using Bwt = WaveletTree<void, WTRDConfig<LengthSum, TAlloc> >;
+// using Bwt = WaveletTree<void, WTRDConfig<LengthSum, TAlloc> >;
+ using Bwt = Levels<void, LevelsRDConfig<LengthSum, TAlloc, 1, 3> >;
using Sentinels = Levels<void, LevelsRDConfig<LengthSum, TAlloc> >;
static const unsigned SAMPLING = 10;
@@ -187,11 +189,22 @@ bool setEnv(TString const & key, TValue & value)
}
// ==========================================================================
+// Option Enums
+// ==========================================================================
+
+enum class DbIndexType : uint8_t
+{
+ SUFFIX_ARRAY,
+ FM_INDEX,
+ BI_FM_INDEX
+};
+
+// ==========================================================================
// Classes
// ==========================================================================
// --------------------------------------------------------------------------
-// Class LambdaOptions
+// Class SharedOptions
// --------------------------------------------------------------------------
// This struct stores the options from the command line.
@@ -203,13 +216,9 @@ struct SharedOptions
std::string commandLine;
- std::string dbFile;
+ std::string indexDir;
- int dbIndexType = 0;
- // for indexer, the file format of database sequences
- // for main app, the file format of query sequences
- // 0 -- fasta, 1 -- fastq
-// int fileFormat = 0;
+ DbIndexType dbIndexType;
int alphReduction = 0;
@@ -233,6 +242,9 @@ struct SharedOptions
}
};
+// --------------------------------------------------------------------------
+// Class LambdaOptions
+// --------------------------------------------------------------------------
struct LambdaOptions : public SharedOptions
{
@@ -255,6 +267,7 @@ struct LambdaOptions : public SharedOptions
// bool semiGlobal;
bool doubleIndexing = true;
+ bool adaptiveSeeding;
unsigned seedLength = 0;
unsigned maxSeedDist = 1;
@@ -281,8 +294,18 @@ struct LambdaOptions : public SharedOptions
int idCutOff = 0;
unsigned long maxMatches = 500;
+ enum class ExtensionMode : uint8_t
+ {
+ AUTO,
+ XDROP,
+ FULL_SERIAL,
+ FULL_SIMD
+ };
+ ExtensionMode extensionMode;
+
bool filterPutativeDuplicates = true;
bool filterPutativeAbundant = true;
+ bool mergePutativeSiblings = true;
int preScoring = 0; // 0 = off, 1 = seed, 2 = region (
double preScoringThresh = 0.0;
@@ -293,9 +316,14 @@ struct LambdaOptions : public SharedOptions
}
};
+// --------------------------------------------------------------------------
+// Class LambdaIndexerOptions
+// --------------------------------------------------------------------------
+
struct LambdaIndexerOptions : public SharedOptions
{
- std::string segFile = "";
+ std::string dbFile;
+// std::string segFile = "";
std::string algo = "";
bool truncateIDs;
@@ -310,7 +338,7 @@ struct LambdaIndexerOptions : public SharedOptions
// ==========================================================================
// --------------------------------------------------------------------------
-// Function displayCopyright()
+// Function sharedSetup()
// --------------------------------------------------------------------------
void
@@ -321,7 +349,7 @@ sharedSetup(ArgumentParser & parser)
std::string(SEQAN_REVISION) + ")";
setVersion(parser, versionString);
setDate(parser, __DATE__);
- setShortCopyright(parser, "2013-2016 Hannes Hauswedell, released under the GNU GPL v3 (or later); "
+ setShortCopyright(parser, "2013-2016 Hannes Hauswedell, released under the GNU AGPL v3 (or later); "
"2016 Knut Reinert and Freie Universität Berlin, released under the 3-clause-BSDL");
setCitation(parser, "Hauswedell et al (2014); doi: 10.1093/bioinformatics/btu439");
@@ -330,18 +358,18 @@ sharedSetup(ArgumentParser & parser)
" Copyright (c) 2013-2016, Hannes Hauswedell\n"
" All rights reserved.\n"
"\n"
- " Lambda is free software: you can redistribute it and/or modify\n"
- " it under the terms of the GNU General Public License as published by\n"
- " the Free Software Foundation, either version 3 of the License, or\n"
- " (at your option) any later version.\n"
+ " This program is free software: you can redistribute it and/or modify\n"
+ " it under the terms of the GNU Affero General Public License as\n"
+ " published by the Free Software Foundation, either version 3 of the\n"
+ " License, or (at your option) any later version.\n"
"\n"
" Lambda is distributed in the hope that it will be useful,\n"
" but WITHOUT ANY WARRANTY; without even the implied warranty of\n"
" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n"
" GNU General Public License for more details.\n"
"\n"
- " You should have received a copy of the GNU General Public License\n"
- " along with Lambda. If not, see <http://www.gnu.org/licenses/>.\n"
+ " You should have received a copy of the GNU Affero General Public License\n"
+ " along with this program. If not, see <http://www.gnu.org/licenses/>.\n"
"\n"
" Copyright (c) 2016 Knut Reinert and Freie Universität Berlin\n"
" All rights reserved.\n"
@@ -410,7 +438,7 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
// Define usage line and long description.
addUsageLine(parser, "[\\fIOPTIONS\\fP] \\fI-q QUERY.fasta\\fP "
- "\\fI-d DATABASE.fasta\\fP "
+ "\\fI-i INDEX.lambda\\fP "
"[\\fI-o output.m8\\fP]");
sharedSetup(parser);
@@ -423,12 +451,12 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
setValidValues(parser, "query", toCString(concat(getFileExtensions(SeqFileIn()), ' ')));
setRequired(parser, "q");
- addOption(parser, ArgParseOption("d", "database",
- "Path to original database sequences (a precomputed index with .sa or .fm needs to exist!).",
+ addOption(parser, ArgParseOption("i", "index",
+ "The database index (created by the lambda_indexer executable).",
ArgParseArgument::INPUT_FILE,
"IN"));
- setValidValues(parser, "database", toCString(concat(getFileExtensions(SeqFileIn()), ' ')));
- setRequired(parser, "d");
+ setRequired(parser, "index");
+ setValidValues(parser, "index", ".lambda");
addOption(parser, ArgParseOption("di", "db-index-type",
"database index is in this format.",
@@ -627,6 +655,14 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
// ArgParseArgument::INTEGER));
// setDefaultValue(parser, "ungapped-seeds", "1");
+ addOption(parser, ArgParseOption("as", "adaptive-seeding",
+ "SECRET",
+ ArgParseArgument::STRING,
+ "STR"));
+ setValidValues(parser, "adaptive-seeding", "on off");
+ setDefaultValue(parser, "adaptive-seeding", "on");
+ setAdvanced(parser, "adaptive-seeding");
+
addOption(parser, ArgParseOption("sl", "seed-length",
"Length of the seeds (default = 14 for BLASTN).",
ArgParseArgument::INTEGER));
@@ -637,7 +673,7 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
"Offset for seeding (if unset = seed-length, non-overlapping; "
"default = 5 for BLASTN).",
ArgParseArgument::INTEGER));
- setDefaultValue(parser, "seed-offset", "10");
+ setDefaultValue(parser, "seed-offset", "5");
setAdvanced(parser, "seed-offset");
addOption(parser, ArgParseOption("sd", "seed-delta",
@@ -691,6 +727,14 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
setDefaultValue(parser, "filter-putative-abundant", "on");
setAdvanced(parser, "filter-putative-abundant");
+ addOption(parser, ArgParseOption("pm", "merge-putative-siblings",
+ "Merge seeds from one region, "
+ "stop searching if the remaining realm looks infeasible.",
+ ArgParseArgument::STRING));
+ setValidValues(parser, "merge-putative-siblings", "on off");
+ setDefaultValue(parser, "merge-putative-siblings", "on");
+ setAdvanced(parser, "merge-putative-siblings");
+
// addOption(parser, ArgParseOption("se",
// "seedminevalue",
// "after postproc worse seeds are "
@@ -759,6 +803,17 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
setMinValue(parser, "band", "-3");
setAdvanced(parser, "band");
+ addOption(parser, ArgParseOption("em", "extension-mode",
+ "Choice of extension algorithms.",
+ ArgParseArgument::STRING));
+#ifdef SEQAN_SIMD_ENABLED
+ setValidValues(parser, "extension-mode", "auto xdrop fullSerial fullSIMD");
+#else
+ setValidValues(parser, "extension-mode", "auto xdrop fullSerial");
+#endif
+ setDefaultValue(parser, "extension-mode", "auto");
+ setAdvanced(parser, "extension-mode");
+
addTextSection(parser, "Tuning");
addText(parser, "Tuning the seeding parameters and (de)activating alphabet "
"reduction has a strong "
@@ -804,6 +859,9 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
// Extract option values.
getOptionValue(options.queryFile, parser, "query");
+
+ getOptionValue(options.indexDir, parser, "index");
+
// if (endsWith(options.queryFile, ".fastq") ||
// endsWith(options.queryFile, ".fq"))
// options.fileFormat = 1;
@@ -929,6 +987,10 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
options.versionInformationToOutputFile = (buffer == "on");
clear(buffer);
+ getOptionValue(buffer, parser, "adaptive-seeding");
+ options.adaptiveSeeding = (buffer == "on");
+
+ clear(buffer);
getOptionValue(options.seedLength, parser, "seed-length");
if ((!isSet(parser, "seed-length")) &&
(options.blastProgram == BlastProgram::BLASTN))
@@ -937,7 +999,7 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
if (isSet(parser, "seed-offset"))
getOptionValue(options.seedOffset, parser, "seed-offset");
else
- options.seedOffset = options.seedLength;
+ options.seedOffset = options.seedLength / 2;
if (isSet(parser, "seed-gravity"))
getOptionValue(options.seedGravity, parser, "seed-gravity");
@@ -1009,11 +1071,17 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
getOptionValue(buffer, parser, "filter-putative-abundant");
options.filterPutativeAbundant = (buffer == "on");
+ getOptionValue(buffer, parser, "merge-putative-siblings");
+ options.mergePutativeSiblings = (buffer == "on");
+
// TODO always prescore 1
getOptionValue(options.preScoring, parser, "pre-scoring");
if ((!isSet(parser, "pre-scoring")) &&
(options.alphReduction == 0))
options.preScoring = 1;
+ // for adaptive seeding we take the full resized seed (and no surroundings)
+// if (options.adaptiveSeeding)
+// options.preScoring = 1;
getOptionValue(options.preScoringThresh, parser, "pre-scoring-threshold");
// if (options.preScoring == 0)
@@ -1023,6 +1091,33 @@ parseCommandLine(LambdaOptions & options, int argc, char const ** argv)
getOptionValue(numbuf, parser, "num-matches");
options.maxMatches = static_cast<unsigned long>(numbuf);
+ getOptionValue(buffer, parser, "extension-mode");
+ if (buffer == "fullSIMD")
+ {
+ options.extensionMode = LambdaOptions::ExtensionMode::FULL_SIMD;
+ options.filterPutativeAbundant = false;
+ options.filterPutativeDuplicates = false;
+ options.mergePutativeSiblings = false;
+ options.xDropOff = -1;
+ options.band = -1;
+ }
+ else if (buffer == "fullSerial")
+ {
+ options.extensionMode = LambdaOptions::ExtensionMode::FULL_SERIAL;
+ options.filterPutativeAbundant = false;
+ options.filterPutativeDuplicates = false;
+ options.mergePutativeSiblings = false;
+ options.xDropOff = -1;
+ }
+ else if (buffer == "xdrop")
+ {
+ options.extensionMode = LambdaOptions::ExtensionMode::XDROP;
+ }
+ else
+ {
+ options.extensionMode = LambdaOptions::ExtensionMode::AUTO;
+ }
+
return ArgumentParser::PARSE_OK;
}
@@ -1034,7 +1129,7 @@ parseCommandLine(LambdaIndexerOptions & options, int argc, char const ** argv)
ArgumentParser parser("lambda_indexer");
// Define usage line and long description.
- addUsageLine(parser, "[\\fIOPTIONS\\fP] \\-d DATABASE.fasta\\fP");
+ addUsageLine(parser, "[\\fIOPTIONS\\fP] \\-d DATABASE.fasta [-i INDEX.lambda]\\fP");
sharedSetup(parser);
@@ -1048,21 +1143,20 @@ parseCommandLine(LambdaIndexerOptions & options, int argc, char const ** argv)
setRequired(parser, "database");
setValidValues(parser, "database", toCString(concat(getFileExtensions(SeqFileIn()), ' ')));
- addOption(parser, ArgParseOption("s",
- "segfile",
- "SEG intervals for database"
- "(optional).",
- ArgParseArgument::INPUT_FILE));
-
- setValidValues(parser, "segfile", "seg");
+// addOption(parser, ArgParseOption("s",
+// "segfile",
+// "SEG intervals for database"
+// "(optional).",
+// ArgParseArgument::INPUT_FILE));
+// setValidValues(parser, "segfile", "seg");
+// hideOption(parser, "segfile"); // TODO remove completely
addSection(parser, "Output Options");
-// addOption(parser, ArgParseOption("o",
-// "output",
-// "Index of database sequences",
-// ArgParseArgument::OUTPUT_FILE,
-// "OUT"));
-// setValidValues(parser, "output", "sa fm");
+ addOption(parser, ArgParseOption("i", "index",
+ "The output directory for the index files (defaults to \"DATABASE.lambda\").",
+ ArgParseArgument::INPUT_FILE,
+ "OUT"));
+ setValidValues(parser, "index", ".lambda");
addOption(parser, ArgParseOption("di", "db-index-type",
"Suffix array or full-text minute space.",
@@ -1186,7 +1280,7 @@ parseCommandLine(LambdaIndexerOptions & options, int argc, char const ** argv)
return res;
// Extract option values
- getOptionValue(options.segFile, parser, "segfile");
+// getOptionValue(options.segFile, parser, "segfile");
getOptionValue(options.algo, parser, "algorithm");
if ((options.algo == "mergesort") || (options.algo == "quicksort") || (options.algo == "quicksortbuckets"))
{
@@ -1202,6 +1296,29 @@ parseCommandLine(LambdaIndexerOptions & options, int argc, char const ** argv)
getOptionValue(buffer, parser, "truncate-ids");
options.truncateIDs = (buffer == "on");
+
+ getOptionValue(options.dbFile, parser, "database");
+ if (isSet(parser, "index"))
+ getOptionValue(options.indexDir, parser, "index");
+ else
+ options.indexDir = options.dbFile + ".lambda";
+
+
+ if (fileExists(options.indexDir.c_str()))
+ {
+ std::cerr << "ERROR: An output directory already exists at " << options.indexDir << '\n'
+ << "Remove it, or choose a different location.\n";
+ return ArgumentParser::PARSE_ERROR;
+ }
+ else
+ {
+ if (mkdir(options.indexDir.c_str(), S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH))
+ {
+ std::cerr << "ERROR: Cannot create output directory at " << options.indexDir << '\n';
+ return ArgumentParser::PARSE_ERROR;
+ }
+ }
+
return ArgumentParser::PARSE_OK;
}
@@ -1212,13 +1329,13 @@ parseCommandLineShared(SharedOptions & options, ArgumentParser & parser)
int buf = 0;
std::string buffer;
- getOptionValue(options.dbFile, parser, "database");
-
getOptionValue(buffer, parser, "db-index-type");
if (buffer == "sa")
- options.dbIndexType = 0;
- else // if fm
- options.dbIndexType = 1;
+ options.dbIndexType = DbIndexType::SUFFIX_ARRAY;
+ else if (buffer == "bifm")
+ options.dbIndexType = DbIndexType::BI_FM_INDEX;
+ else
+ options.dbIndexType = DbIndexType::FM_INDEX;
getOptionValue(buffer, parser, "program");
if (buffer == "blastn")
@@ -1274,6 +1391,10 @@ parseCommandLineShared(SharedOptions & options, ArgumentParser & parser)
return ArgumentParser::PARSE_OK;
}
+// --------------------------------------------------------------------------
+// Function _alphName()
+// --------------------------------------------------------------------------
+
constexpr const char *
_alphName(AminoAcid const & /**/)
{
@@ -1316,6 +1437,26 @@ _alphName(Dna5 const & /**/)
return "dna5";
}
+// --------------------------------------------------------------------------
+// Function _indexName()
+// --------------------------------------------------------------------------
+
+inline std::string
+_indexName(DbIndexType const t)
+{
+ switch (t)
+ {
+ case DbIndexType::SUFFIX_ARRAY: return "suffix_array";
+ case DbIndexType::FM_INDEX: return "fm_index";
+ case DbIndexType::BI_FM_INDEX: return "bi_fm_index";
+ }
+ return "ERROR_UNKNOWN_INDEX_TYPE";
+}
+
+// --------------------------------------------------------------------------
+// Function printOptions()
+// --------------------------------------------------------------------------
+
template <typename TLH>
inline void
printOptions(LambdaOptions const & options)
@@ -1334,7 +1475,7 @@ printOptions(LambdaOptions const & options)
std::cout << "OPTIONS\n"
<< " INPUT\n"
<< " query file: " << options.queryFile << "\n"
- << " db file: " << options.dbFile << "\n"
+ << " index directory: " << options.indexDir << "\n"
<< " db index type: " << (TGH::indexIsFM
? "FM-Index\n"
: "SA-Index\n")
@@ -1397,6 +1538,7 @@ printOptions(LambdaOptions const & options)
<< " putative-duplicates: " << (options.filterPutativeDuplicates
? std::string("on")
: std::string("off")) << "\n"
+
<< " SCORING\n"
<< " scoring scheme: " << options.scoringMethod << "\n"
<< " score-match: " << (options.scoringMethod
@@ -1407,10 +1549,36 @@ printOptions(LambdaOptions const & options)
: std::to_string(options.misMatch)) << "\n"
<< " score-gap: " << options.gapExtend << "\n"
<< " score-gap-open: " << options.gapOpen << "\n"
- << " EXTENSION\n"
+ << " EXTENSION\n";
+ switch (options.extensionMode)
+ {
+ case LambdaOptions::ExtensionMode::AUTO:
+ std::cout
+ << " extensionMode: auto (depends on query length)\n"
<< " x-drop: " << options.xDropOff << "\n"
<< " band: " << bandStr << "\n"
- << " BUILD OPTIONS:\n"
+ << " [depending on the automatically chosen mode x-drop or band might get disabled.]\n";
+ break;
+ case LambdaOptions::ExtensionMode::XDROP:
+ std::cout
+ << " extensionMode: individual\n"
+ << " x-drop: " << options.xDropOff << "\n"
+ << " band: " << bandStr << "\n";
+ break;
+ case LambdaOptions::ExtensionMode::FULL_SERIAL:
+ std::cout
+ << " extensionMode: batch, but serialized\n"
+ << " x-drop: not used\n"
+ << " band: " << bandStr << "\n";
+ break;
+ case LambdaOptions::ExtensionMode::FULL_SIMD:
+ std::cout
+ << " extensionMode: batch with SIMD\n"
+ << " x-drop: not used\n"
+ << " band: not used\n";
+ break;
+ }
+ std::cout << " BUILD OPTIONS:\n"
<< " cmake_build_type: " << std::string(CMAKE_BUILD_TYPE) << "\n"
<< " fastbuild: "
#if defined(FASTBUILD)
@@ -1442,6 +1610,14 @@ printOptions(LambdaOptions const & options)
#else
<< "off\n"
#endif
+ << " seqan_simd: "
+ #if defined(SEQAN_SIMD_ENABLED) && defined(__AVX2__)
+ << "avx2\n"
+ #elif defined(SEQAN_SIMD_ENABLED) && defined(__SSE4_2__)
+ << "sse4\n"
+ #else
+ << "off\n"
+ #endif
<< "\n";
}
diff --git a/src/output.hpp b/src/output.hpp
index f0d9a2f..3dd199e 100644
--- a/src/output.hpp
+++ b/src/output.hpp
@@ -110,14 +110,15 @@ blastMatchOneCigar(TCigar & cigar,
TLocalHolder const & lH)
{
using TCElem = typename Value<TCigar>::Type;
+ using TGlobalHolder = typename TLocalHolder::TGlobalHolder;
SEQAN_ASSERT_EQ(length(m.alignRow0), length(m.alignRow1));
// translate positions into dna space
- unsigned const transFac = qIsTranslated(lH.gH.blastProgram) ? 3 : 1;
+ unsigned const transFac = qIsTranslated(TGlobalHolder::blastProgram) ? 3 : 1;
// clips resulting from translation / frameshift are always hard clips
unsigned const leftFrameClip = std::abs(m.qFrameShift) - 1;
- unsigned const rightFrameClip = qIsTranslated(lH.gH.blastProgram) ? (m.qLength - leftFrameClip) % 3 : 0;
+ unsigned const rightFrameClip = qIsTranslated(TGlobalHolder::blastProgram) ? (m.qLength - leftFrameClip) % 3 : 0;
// regular clipping from local alignment (regions outside match) can be hard or soft
unsigned const leftClip = m.qStart * transFac;
unsigned const rightClip = (length(source(m.alignRow0)) - m.qEnd) * transFac;
@@ -192,6 +193,7 @@ blastMatchTwoCigar(TCigar & dnaCigar,
TLocalHolder const & lH)
{
using TCElem = typename Value<TCigar>::Type;
+ using TGlobalHolder = typename TLocalHolder::TGlobalHolder;
SEQAN_ASSERT_EQ(length(m.alignRow0), length(m.alignRow1));
@@ -301,7 +303,7 @@ myWriteHeader(TGH & globalHolder, TLambdaOptions const & options)
context(globalHolder.outfile).fields = options.columns;
auto & versionString = context(globalHolder.outfile).versionString;
clear(versionString);
- append(versionString, _programTagToString(globalHolder.blastProgram));
+ append(versionString, _programTagToString(TGH::blastProgram));
append(versionString, " 2.2.26+ [created by LAMBDA");
if (options.versionInformationToOutputFile)
{
@@ -318,7 +320,7 @@ myWriteHeader(TGH & globalHolder, TLambdaOptions const & options)
auto & subjIds = contigNames(context);
// set sequence lengths
- if (sIsTranslated(globalHolder.blastProgram))
+ if (sIsTranslated(TGH::blastProgram))
{
//TODO can we get around a copy?
subjSeqLengths = globalHolder.untransSubjSeqLengths;
@@ -425,6 +427,7 @@ template <typename TLH, typename TRecord>
inline void
myWriteRecord(TLH & lH, TRecord const & record)
{
+ using TGH = typename TLH::TGlobalHolder;
if (lH.options.outFileFormat == 0) // BLAST
{
SEQAN_OMP_PRAGMA(critical(filewrite))
@@ -445,7 +448,7 @@ myWriteRecord(TLH & lH, TRecord const & record)
for (auto & bamR : bamRecords)
{
// untranslate for sIsTranslated
- if (sIsTranslated(lH.gH.blastProgram))
+ if (sIsTranslated(TGH::blastProgram))
{
bamR.beginPos = mIt->sStart * 3 + std::abs(mIt->sFrameShift) - 1;
if (mIt->sFrameShift < 0)
@@ -472,16 +475,16 @@ myWriteRecord(TLH & lH, TRecord const & record)
{
clear(protCigar);
// native protein
- if ((lH.gH.blastProgram == BlastProgram::BLASTP) || (lH.gH.blastProgram == BlastProgram::TBLASTN))
+ if ((TGH::blastProgram == BlastProgram::BLASTP) || (TGH::blastProgram == BlastProgram::TBLASTN))
blastMatchOneCigar(protCigar, *mIt, lH);
- else if (qIsTranslated(lH.gH.blastProgram)) // translated
+ else if (qIsTranslated(TGH::blastProgram)) // translated
blastMatchTwoCigar(bamR.cigar, protCigar, *mIt, lH);
else // BLASTN can't have protein sequence
blastMatchOneCigar(bamR.cigar, *mIt, lH);
}
else
{
- if ((lH.gH.blastProgram != BlastProgram::BLASTP) && (lH.gH.blastProgram != BlastProgram::TBLASTN))
+ if ((TGH::blastProgram != BlastProgram::BLASTP) && (TGH::blastProgram != BlastProgram::TBLASTN))
blastMatchOneCigar(bamR.cigar, *mIt, lH);
}
// we want to include the seq
@@ -498,7 +501,7 @@ myWriteRecord(TLH & lH, TRecord const & record)
(endPosition(mIt->alignRow0) != endPosition(mPrevIt->alignRow0)));
}
- if (lH.gH.blastProgram == BlastProgram::BLASTN)
+ if (TGH::blastProgram == BlastProgram::BLASTN)
{
if (lH.options.samBamHardClip)
{
@@ -512,7 +515,7 @@ myWriteRecord(TLH & lH, TRecord const & record)
bamR.seq = source(mIt->alignRow0);
}
}
- else if (qIsTranslated(lH.gH.blastProgram))
+ else if (qIsTranslated(TGH::blastProgram))
{
if (lH.options.samBamHardClip)
{
@@ -571,7 +574,7 @@ myWriteRecord(TLH & lH, TRecord const & record)
int8_t(mIt->sFrameShift), 'c');
if (lH.options.samBamTags[SamBamExtraTags<>::Q_AA_SEQ])
{
- if ((lH.gH.blastProgram == BlastProgram::BLASTN) || (!writeSeq))
+ if ((TGH::blastProgram == BlastProgram::BLASTN) || (!writeSeq))
appendTagValue(bamR.tags,
std::get<0>(SamBamExtraTags<>::keyDescPairs[SamBamExtraTags<>::Q_AA_SEQ]),
"*", 'Z');
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/lambda-align.git