[Pkg-openssl-changes] r708 - in openssl/branches/wheezy/debian: . patches
Kurt Roeckx
kroeckx at moszumanska.debian.org
Thu Jan 8 20:36:40 UTC 2015
Author: kroeckx
Date: 2015-01-08 20:36:40 +0000 (Thu, 08 Jan 2015)
New Revision: 708
Added:
openssl/branches/wheezy/debian/patches/0094-Fix-various-certificate-fingerprint-issues.patch
openssl/branches/wheezy/debian/patches/0095-Constify-ASN1_TYPE_cmp-add-X509_ALGOR_cmp.patch
openssl/branches/wheezy/debian/patches/0098-ECDH-downgrade-bug-fix.patch
openssl/branches/wheezy/debian/patches/0099-Only-allow-ephemeral-RSA-keys-in-export-ciphersuites.patch
openssl/branches/wheezy/debian/patches/0107-fix-error-discrepancy.patch
openssl/branches/wheezy/debian/patches/0108-Fix-for-CVE-2014-3570.patch
openssl/branches/wheezy/debian/patches/0109-Fix-crash-in-dtls1_get_record-whilst-in-the-listen-s.patch
openssl/branches/wheezy/debian/patches/0110-Follow-on-from-CVE-2014-3571.-This-fixes-the-code-th.patch
openssl/branches/wheezy/debian/patches/0111-Unauthenticated-DH-client-certificate-fix.patch
openssl/branches/wheezy/debian/patches/0112-A-memory-leak-can-occur-in-dtls1_buffer_record-if-ei.patch
Modified:
openssl/branches/wheezy/debian/changelog
openssl/branches/wheezy/debian/patches/series
Log:
Fix several CVEs
Modified: openssl/branches/wheezy/debian/changelog
===================================================================
--- openssl/branches/wheezy/debian/changelog 2015-01-08 19:56:24 UTC (rev 707)
+++ openssl/branches/wheezy/debian/changelog 2015-01-08 20:36:40 UTC (rev 708)
@@ -1,12 +1,13 @@
openssl (1.0.1e-2+deb7u14) wheezy-security; urgency=medium
- * Disable SSLv3 by default. It can be enabled again by calling
- SSL_CTX_clear_options() or SSL_clear_options() with SSL_OP_NO_SSLv3.
- It can also be enabled again by setting OPENSSL_ALLOW_SSLv3 in the
- environment to anything.
- This fixes the POODLE issue (CVE-2014-3566).
- * Fix CVE-2014-3569. We're not affected by it since we don't build with
- the no-ssl3 option (yet).
+ - Fix for CVE-2014-3571
+ - Fix for CVE-2015-0206
+ - Fix for CVE-2014-3569
+ - Fix for CVE-2014-3572
+ - Fix for CVE-2015-0204
+ - Fix for CVE-2015-0205
+ - Fix for CVE-2014-8275
+ - Fix for CVE-2014-3570
-- Kurt Roeckx <kurt at roeckx.be> Wed, 31 Dec 2014 13:45:07 +0100
Added: openssl/branches/wheezy/debian/patches/0094-Fix-various-certificate-fingerprint-issues.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0094-Fix-various-certificate-fingerprint-issues.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0094-Fix-various-certificate-fingerprint-issues.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,211 @@
+From a8565530e27718760220df469f0a071c85b9e731 Mon Sep 17 00:00:00 2001
+From: "Dr. Stephen Henson" <steve at openssl.org>
+Date: Sat, 20 Dec 2014 15:09:50 +0000
+Subject: [PATCH 094/117] Fix various certificate fingerprint issues.
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+By using non-DER or invalid encodings outside the signed portion of a
+certificate, the fingerprint can be changed without breaking the signature.
+Although no details of the signed portion of the certificate can be changed
+this can cause problems with some applications: e.g. those using the
+certificate fingerprint for blacklists.
+
+1. Reject signatures with non-zero unused bits.
+
+If the BIT STRING containing the signature has non-zero unused bits, reject
+the signature. All current signature algorithms require zero unused bits.
+
+2. Check certificate algorithm consistency.
+
+Check the AlgorithmIdentifier inside TBS matches the one in the
+certificate signature. NB: this will result in signature failure
+errors for some broken certificates.
+
+3. Check DSA/ECDSA signatures use DER.
+
+Reencode DSA/ECDSA signatures and compare with the original received
+signature. Return an error if there is a mismatch.
+
+This will reject various cases including garbage after signature
+(thanks to Antti Karjalainen and Tuomo Untinen from the Codenomicon CROSS
+program for discovering this case) and use of BER or invalid ASN.1 INTEGERs
+(negative or with leading zeroes).
+
+CVE-2014-8275
+Reviewed-by: Emilia Käsper <emilia at openssl.org>
+
+(cherry picked from commit 684400ce192dac51df3d3e92b61830a6ef90be3e)
+---
+ CHANGES | 37 +++++++++++++++++++++++++++++++++++++
+ crypto/asn1/a_verify.c | 12 ++++++++++++
+ crypto/dsa/dsa_asn1.c | 14 +++++++++++++-
+ crypto/ecdsa/ecs_vrf.c | 15 ++++++++++++++-
+ crypto/x509/x_all.c | 2 ++
+ 5 files changed, 78 insertions(+), 2 deletions(-)
+
+diff --git a/CHANGES b/CHANGES
+index c3bb940..c91552c 100644
+--- a/CHANGES
++++ b/CHANGES
+@@ -4,6 +4,43 @@
+
+ Changes between 1.0.1j and 1.0.1k [xx XXX xxxx]
+
++ *) Fix various certificate fingerprint issues.
++
++ By using non-DER or invalid encodings outside the signed portion of a
++ certificate, the fingerprint can be changed without breaking the signature.
++ Although no details of the signed portion of the certificate can be changed
++ this can cause problems with some applications: e.g. those using the
++ certificate fingerprint for blacklists.
++
++ 1. Reject signatures with non-zero unused bits.
++
++ If the BIT STRING containing the signature has non-zero unused bits, reject
++ the signature. All current signature algorithms require zero unused bits.
++
++ 2. Check certificate algorithm consistency.
++
++ Check the AlgorithmIdentifier inside TBS matches the one in the
++ certificate signature. NB: this will result in signature failure
++ errors for some broken certificates.
++
++ Thanks to Konrad Kraszewski from Google for reporting this issue.
++
++ 3. Check DSA/ECDSA signatures use DER.
++
++ Reencode DSA/ECDSA signatures and compare with the original received
++ signature. Return an error if there is a mismatch.
++
++ This will reject various cases including garbage after signature
++ (thanks to Antti Karjalainen and Tuomo Untinen from the Codenomicon CROSS
++ program for discovering this case) and use of BER or invalid ASN.1 INTEGERs
++ (negative or with leading zeroes).
++
++ Further analysis was conducted and fixes were developed by Stephen Henson
++ of the OpenSSL core team.
++
++ (CVE-2014-8275)
++ [Steve Henson]
++
+ *) Do not resume sessions on the server if the negotiated protocol
+ version does not match the session's version. Resuming with a different
+ version, while not strictly forbidden by the RFC, is of questionable
+diff --git a/crypto/asn1/a_verify.c b/crypto/asn1/a_verify.c
+index fc84cd3..a571009 100644
+--- a/crypto/asn1/a_verify.c
++++ b/crypto/asn1/a_verify.c
+@@ -90,6 +90,12 @@ int ASN1_verify(i2d_of_void *i2d, X509_ALGOR *a, ASN1_BIT_STRING *signature,
+ ASN1err(ASN1_F_ASN1_VERIFY,ASN1_R_UNKNOWN_MESSAGE_DIGEST_ALGORITHM);
+ goto err;
+ }
++
++ if (signature->type == V_ASN1_BIT_STRING && signature->flags & 0x7)
++ {
++ ASN1err(ASN1_F_ASN1_VERIFY, ASN1_R_INVALID_BIT_STRING_BITS_LEFT);
++ goto err;
++ }
+
+ inl=i2d(data,NULL);
+ buf_in=OPENSSL_malloc((unsigned int)inl);
+@@ -146,6 +152,12 @@ int ASN1_item_verify(const ASN1_ITEM *it, X509_ALGOR *a,
+ return -1;
+ }
+
++ if (signature->type == V_ASN1_BIT_STRING && signature->flags & 0x7)
++ {
++ ASN1err(ASN1_F_ASN1_VERIFY, ASN1_R_INVALID_BIT_STRING_BITS_LEFT);
++ return -1;
++ }
++
+ EVP_MD_CTX_init(&ctx);
+
+ /* Convert signature OID into digest and public key OIDs */
+diff --git a/crypto/dsa/dsa_asn1.c b/crypto/dsa/dsa_asn1.c
+index 6058534..473af87 100644
+--- a/crypto/dsa/dsa_asn1.c
++++ b/crypto/dsa/dsa_asn1.c
+@@ -176,13 +176,25 @@ int DSA_verify(int type, const unsigned char *dgst, int dgst_len,
+ const unsigned char *sigbuf, int siglen, DSA *dsa)
+ {
+ DSA_SIG *s;
++ const unsigned char *p = sigbuf;
++ unsigned char *der = NULL;
++ int derlen = -1;
+ int ret=-1;
+
+ s = DSA_SIG_new();
+ if (s == NULL) return(ret);
+- if (d2i_DSA_SIG(&s,&sigbuf,siglen) == NULL) goto err;
++ if (d2i_DSA_SIG(&s,&p,siglen) == NULL) goto err;
++ /* Ensure signature uses DER and doesn't have trailing garbage */
++ derlen = i2d_DSA_SIG(s, &der);
++ if (derlen != siglen || memcmp(sigbuf, der, derlen))
++ goto err;
+ ret=DSA_do_verify(dgst,dgst_len,s,dsa);
+ err:
++ if (derlen > 0)
++ {
++ OPENSSL_cleanse(der, derlen);
++ OPENSSL_free(der);
++ }
+ DSA_SIG_free(s);
+ return(ret);
+ }
+diff --git a/crypto/ecdsa/ecs_vrf.c b/crypto/ecdsa/ecs_vrf.c
+index ef9acf7..2836efe 100644
+--- a/crypto/ecdsa/ecs_vrf.c
++++ b/crypto/ecdsa/ecs_vrf.c
+@@ -57,6 +57,7 @@
+ */
+
+ #include "ecs_locl.h"
++#include "cryptlib.h"
+ #ifndef OPENSSL_NO_ENGINE
+ #include <openssl/engine.h>
+ #endif
+@@ -84,13 +85,25 @@ int ECDSA_verify(int type, const unsigned char *dgst, int dgst_len,
+ const unsigned char *sigbuf, int sig_len, EC_KEY *eckey)
+ {
+ ECDSA_SIG *s;
++ const unsigned char *p = sigbuf;
++ unsigned char *der = NULL;
++ int derlen = -1;
+ int ret=-1;
+
+ s = ECDSA_SIG_new();
+ if (s == NULL) return(ret);
+- if (d2i_ECDSA_SIG(&s, &sigbuf, sig_len) == NULL) goto err;
++ if (d2i_ECDSA_SIG(&s, &p, sig_len) == NULL) goto err;
++ /* Ensure signature uses DER and doesn't have trailing garbage */
++ derlen = i2d_ECDSA_SIG(s, &der);
++ if (derlen != sig_len || memcmp(sigbuf, der, derlen))
++ goto err;
+ ret=ECDSA_do_verify(dgst, dgst_len, s, eckey);
+ err:
++ if (derlen > 0)
++ {
++ OPENSSL_cleanse(der, derlen);
++ OPENSSL_free(der);
++ }
+ ECDSA_SIG_free(s);
+ return(ret);
+ }
+diff --git a/crypto/x509/x_all.c b/crypto/x509/x_all.c
+index e06602d..fef55f8 100644
+--- a/crypto/x509/x_all.c
++++ b/crypto/x509/x_all.c
+@@ -72,6 +72,8 @@
+
+ int X509_verify(X509 *a, EVP_PKEY *r)
+ {
++ if (X509_ALGOR_cmp(a->sig_alg, a->cert_info->signature))
++ return 0;
+ return(ASN1_item_verify(ASN1_ITEM_rptr(X509_CINF),a->sig_alg,
+ a->signature,a->cert_info,r));
+ }
+--
+2.1.4
+
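Context for patch 0094 (a sketch, not part of the committed patch): blacklist
tools generally fingerprint the complete DER encoding of a certificate, outer
(unsigned) fields included, so a BER or garbage tweak to those fields moves the
fingerprint while the signature stays valid. A minimal fingerprint computation
against the 1.0.1-era public API, with the certificate path taken from argv:

    #include <stdio.h>
    #include <openssl/evp.h>
    #include <openssl/pem.h>
    #include <openssl/x509.h>

    int main(int argc, char **argv)
    {
        FILE *fp;
        X509 *cert;
        unsigned char md[EVP_MAX_MD_SIZE];
        unsigned int n, i;

        if (argc != 2 || (fp = fopen(argv[1], "r")) == NULL)
            return 1;
        cert = PEM_read_X509(fp, NULL, NULL, NULL);
        fclose(fp);
        /* X509_digest() hashes the full DER encoding, i.e. also the outer
         * AlgorithmIdentifier and signature BIT STRING that the signature
         * itself does not cover -- which is why the malleability described
         * above changes the fingerprint without invalidating the cert. */
        if (cert == NULL || !X509_digest(cert, EVP_sha1(), md, &n))
            return 1;
        for (i = 0; i < n; i++)
            printf("%02X%c", md[i], i + 1 == n ? '\n' : ':');
        X509_free(cert);
        return 0;
    }
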
Added: openssl/branches/wheezy/debian/patches/0095-Constify-ASN1_TYPE_cmp-add-X509_ALGOR_cmp.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0095-Constify-ASN1_TYPE_cmp-add-X509_ALGOR_cmp.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0095-Constify-ASN1_TYPE_cmp-add-X509_ALGOR_cmp.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,77 @@
+From 5951cc004b96cd681ffdf39d3fc9238a1ff597ae Mon Sep 17 00:00:00 2001
+From: "Dr. Stephen Henson" <steve at openssl.org>
+Date: Sun, 14 Dec 2014 23:14:15 +0000
+Subject: [PATCH 095/117] Constify ASN1_TYPE_cmp add X509_ALGOR_cmp.
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Reviewed-by: Emilia Käsper <emilia at openssl.org>
+(cherry picked from commit 4c52816d35681c0533c25fdd3abb4b7c6962302d)
+---
+ crypto/asn1/a_type.c | 2 +-
+ crypto/asn1/asn1.h | 2 +-
+ crypto/asn1/x_algor.c | 11 +++++++++++
+ crypto/x509/x509.h | 1 +
+ 4 files changed, 14 insertions(+), 2 deletions(-)
+
+diff --git a/crypto/asn1/a_type.c b/crypto/asn1/a_type.c
+index a45d2f9..5e1bc76 100644
+--- a/crypto/asn1/a_type.c
++++ b/crypto/asn1/a_type.c
+@@ -113,7 +113,7 @@ IMPLEMENT_STACK_OF(ASN1_TYPE)
+ IMPLEMENT_ASN1_SET_OF(ASN1_TYPE)
+
+ /* Returns 0 if they are equal, != 0 otherwise. */
+-int ASN1_TYPE_cmp(ASN1_TYPE *a, ASN1_TYPE *b)
++int ASN1_TYPE_cmp(const ASN1_TYPE *a, const ASN1_TYPE *b)
+ {
+ int result = -1;
+
+diff --git a/crypto/asn1/asn1.h b/crypto/asn1/asn1.h
+index 672c97f..3c45d5d 100644
+--- a/crypto/asn1/asn1.h
++++ b/crypto/asn1/asn1.h
+@@ -776,7 +776,7 @@ DECLARE_ASN1_FUNCTIONS_fname(ASN1_TYPE, ASN1_ANY, ASN1_TYPE)
+ int ASN1_TYPE_get(ASN1_TYPE *a);
+ void ASN1_TYPE_set(ASN1_TYPE *a, int type, void *value);
+ int ASN1_TYPE_set1(ASN1_TYPE *a, int type, const void *value);
+-int ASN1_TYPE_cmp(ASN1_TYPE *a, ASN1_TYPE *b);
++int ASN1_TYPE_cmp(const ASN1_TYPE *a, const ASN1_TYPE *b);
+
+ ASN1_OBJECT * ASN1_OBJECT_new(void );
+ void ASN1_OBJECT_free(ASN1_OBJECT *a);
+diff --git a/crypto/asn1/x_algor.c b/crypto/asn1/x_algor.c
+index 274e456..57cc956 100644
+--- a/crypto/asn1/x_algor.c
++++ b/crypto/asn1/x_algor.c
+@@ -142,3 +142,14 @@ void X509_ALGOR_set_md(X509_ALGOR *alg, const EVP_MD *md)
+ X509_ALGOR_set0(alg, OBJ_nid2obj(EVP_MD_type(md)), param_type, NULL);
+
+ }
++
++int X509_ALGOR_cmp(const X509_ALGOR *a, const X509_ALGOR *b)
++ {
++ int rv;
++ rv = OBJ_cmp(a->algorithm, b->algorithm);
++ if (rv)
++ return rv;
++ if (!a->parameter && !b->parameter)
++ return 0;
++ return ASN1_TYPE_cmp(a->parameter, b->parameter);
++ }
+diff --git a/crypto/x509/x509.h b/crypto/x509/x509.h
+index 092dd74..ed767f8 100644
+--- a/crypto/x509/x509.h
++++ b/crypto/x509/x509.h
+@@ -768,6 +768,7 @@ int X509_ALGOR_set0(X509_ALGOR *alg, ASN1_OBJECT *aobj, int ptype, void *pval);
+ void X509_ALGOR_get0(ASN1_OBJECT **paobj, int *pptype, void **ppval,
+ X509_ALGOR *algor);
+ void X509_ALGOR_set_md(X509_ALGOR *alg, const EVP_MD *md);
++int X509_ALGOR_cmp(const X509_ALGOR *a, const X509_ALGOR *b);
+
+ X509_NAME *X509_NAME_dup(X509_NAME *xn);
+ X509_NAME_ENTRY *X509_NAME_ENTRY_dup(X509_NAME_ENTRY *ne);
+--
+2.1.4
+
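Patch 0094 above is the consumer of this new helper: X509_verify() now calls
X509_ALGOR_cmp() on the two AlgorithmIdentifiers a certificate carries. Since
the 1.0.1 structs are still transparent, the intended use can be sketched
directly (field names as in this tree's headers):

    #include <openssl/x509.h>

    /* Returns 1 when the outer signatureAlgorithm agrees with the copy
     * inside the signed TBSCertificate, 0 when they diverge -- the case
     * X509_verify() now reports as a verification failure. */
    int cert_algs_consistent(X509 *cert)
    {
        return X509_ALGOR_cmp(cert->sig_alg, cert->cert_info->signature) == 0;
    }
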
Added: openssl/branches/wheezy/debian/patches/0098-ECDH-downgrade-bug-fix.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0098-ECDH-downgrade-bug-fix.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0098-ECDH-downgrade-bug-fix.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,91 @@
+From ef28c6d6767a6a30df5add36171894c96628fe98 Mon Sep 17 00:00:00 2001
+From: "Dr. Stephen Henson" <steve at openssl.org>
+Date: Fri, 24 Oct 2014 12:30:33 +0100
+Subject: [PATCH 098/117] ECDH downgrade bug fix.
+
+Fix bug where an OpenSSL client would accept a handshake using an
+ephemeral ECDH ciphersuites with the server key exchange message omitted.
+
+Thanks to Karthikeyan Bhargavan for reporting this issue.
+
+CVE-2014-3572
+Reviewed-by: Matt Caswell <matt at openssl.org>
+
+(cherry picked from commit b15f8769644b00ef7283521593360b7b2135cb63)
+---
+ CHANGES | 7 +++++++
+ ssl/s3_clnt.c | 18 +++++++++++++++---
+ 2 files changed, 22 insertions(+), 3 deletions(-)
+
+diff --git a/CHANGES b/CHANGES
+index bfb75be..8d3e6ff 100644
+--- a/CHANGES
++++ b/CHANGES
+@@ -4,6 +4,13 @@
+
+ Changes between 1.0.1j and 1.0.1k [xx XXX xxxx]
+
++ *) Abort handshake if server key exchange message is omitted for ephemeral
++ ECDH ciphersuites.
++
++ Thanks to Karthikeyan Bhargavan for reporting this issue.
++ (CVE-2014-3572)
++ [Steve Henson]
++
+ *) Ensure that the session ID context of an SSL is updated when its
+ SSL_CTX is updated via SSL_set_SSL_CTX.
+
+diff --git a/ssl/s3_clnt.c b/ssl/s3_clnt.c
+index 7a95d5a..43ffc77 100644
+--- a/ssl/s3_clnt.c
++++ b/ssl/s3_clnt.c
+@@ -1277,6 +1277,8 @@ int ssl3_get_key_exchange(SSL *s)
+ int encoded_pt_len = 0;
+ #endif
+
++ EVP_MD_CTX_init(&md_ctx);
++
+ /* use same message size as in ssl3_get_certificate_request()
+ * as ServerKeyExchange message may be skipped */
+ n=s->method->ssl_get_message(s,
+@@ -1287,14 +1289,26 @@ int ssl3_get_key_exchange(SSL *s)
+ &ok);
+ if (!ok) return((int)n);
+
++ alg_k=s->s3->tmp.new_cipher->algorithm_mkey;
++
+ if (s->s3->tmp.message_type != SSL3_MT_SERVER_KEY_EXCHANGE)
+ {
++ /*
++ * Can't skip server key exchange if this is an ephemeral
++ * ciphersuite.
++ */
++ if (alg_k & (SSL_kEDH|SSL_kEECDH))
++ {
++ SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE, SSL_R_UNEXPECTED_MESSAGE);
++ al = SSL_AD_UNEXPECTED_MESSAGE;
++ goto f_err;
++ }
+ #ifndef OPENSSL_NO_PSK
+ /* In plain PSK ciphersuite, ServerKeyExchange can be
+ omitted if no identity hint is sent. Set
+ session->sess_cert anyway to avoid problems
+ later.*/
+- if (s->s3->tmp.new_cipher->algorithm_mkey & SSL_kPSK)
++ if (alg_k & SSL_kPSK)
+ {
+ s->session->sess_cert=ssl_sess_cert_new();
+ if (s->ctx->psk_identity_hint)
+@@ -1339,9 +1353,7 @@ int ssl3_get_key_exchange(SSL *s)
+ /* Total length of the parameters including the length prefix */
+ param_len=0;
+
+- alg_k=s->s3->tmp.new_cipher->algorithm_mkey;
+ alg_a=s->s3->tmp.new_cipher->algorithm_auth;
+- EVP_MD_CTX_init(&md_ctx);
+
+ al=SSL_AD_DECODE_ERROR;
+
+--
+2.1.4
+
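Pulled out of the handshake state machine, the guard this patch adds is a mask
test: when the negotiated suite uses an ephemeral key exchange, a missing
ServerKeyExchange must abort the handshake instead of quietly falling back to
keys from the server certificate. Illustrative sketch only; SSL_kEDH and
SSL_kEECDH live in the library-internal ssl_locl.h, so the defines below are
stand-in values, not the real masks:

    /* Stand-ins for the internal SSL_kEDH / SSL_kEECDH key-exchange masks
     * (illustrative values, not copied from ssl_locl.h). */
    #define DEMO_kEDH   0x0008UL
    #define DEMO_kEECDH 0x0080UL

    /* 1 if ServerKeyExchange may legitimately be absent for this key
     * exchange class, 0 if its absence must be treated as fatal. */
    static int ske_may_be_skipped(unsigned long alg_k)
    {
        return (alg_k & (DEMO_kEDH | DEMO_kEECDH)) == 0;
    }
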
Added: openssl/branches/wheezy/debian/patches/0099-Only-allow-ephemeral-RSA-keys-in-export-ciphersuites.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0099-Only-allow-ephemeral-RSA-keys-in-export-ciphersuites.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0099-Only-allow-ephemeral-RSA-keys-in-export-ciphersuites.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,214 @@
+From 37580f43b5a39f5f4e920d17273fab9713d3a744 Mon Sep 17 00:00:00 2001
+From: "Dr. Stephen Henson" <steve at openssl.org>
+Date: Thu, 23 Oct 2014 17:09:57 +0100
+Subject: [PATCH 099/117] Only allow ephemeral RSA keys in export ciphersuites.
+
+OpenSSL clients would tolerate temporary RSA keys in non-export
+ciphersuites. It also had an option SSL_OP_EPHEMERAL_RSA which
+enabled this server side. Remove both options as they are a
+protocol violation.
+
+Thanks to Karthikeyan Bhargavan for reporting this issue.
+(CVE-2015-0204)
+Reviewed-by: Matt Caswell <matt at openssl.org>
+Reviewed-by: Tim Hudson <tjh at openssl.org>
+
+(cherry picked from commit 4b4c1fcc88aec8c9e001b0a0077d3cd4de1ed0e6)
+
+Conflicts:
+ doc/ssl/SSL_CTX_set_options.pod
+---
+ CHANGES | 8 ++++++++
+ doc/ssl/SSL_CTX_set_options.pod | 10 +---------
+ doc/ssl/SSL_CTX_set_tmp_rsa_callback.pod | 23 ++++++++---------------
+ ssl/d1_srvr.c | 21 ++++++---------------
+ ssl/s3_clnt.c | 7 +++++++
+ ssl/s3_srvr.c | 21 ++++++---------------
+ ssl/ssl.h | 5 ++---
+ 7 files changed, 38 insertions(+), 57 deletions(-)
+
+diff --git a/CHANGES b/CHANGES
+index 8d3e6ff..594d7c5 100644
+--- a/CHANGES
++++ b/CHANGES
+@@ -11,6 +11,14 @@
+ (CVE-2014-3572)
+ [Steve Henson]
+
++ *) Remove non-export ephemeral RSA code on client and server. This code
++ violated the TLS standard by allowing the use of temporary RSA keys in
++ non-export ciphersuites and could be used by a server to effectively
++ downgrade the RSA key length used to a value smaller than the server
++ certificate. Thanks to Karthikeyan Bhargavan for reporting this issue.
++ (CVE-2015-0204)
++ [Steve Henson]
++
+ *) Ensure that the session ID context of an SSL is updated when its
+ SSL_CTX is updated via SSL_set_SSL_CTX.
+
+diff --git a/doc/ssl/SSL_CTX_set_options.pod b/doc/ssl/SSL_CTX_set_options.pod
+index 6e6b5e6..e80a72c 100644
+--- a/doc/ssl/SSL_CTX_set_options.pod
++++ b/doc/ssl/SSL_CTX_set_options.pod
+@@ -158,15 +158,7 @@ temporary/ephemeral DH parameters are used.
+
+ =item SSL_OP_EPHEMERAL_RSA
+
+-Always use ephemeral (temporary) RSA key when doing RSA operations
+-(see L<SSL_CTX_set_tmp_rsa_callback(3)|SSL_CTX_set_tmp_rsa_callback(3)>).
+-According to the specifications this is only done, when a RSA key
+-can only be used for signature operations (namely under export ciphers
+-with restricted RSA keylength). By setting this option, ephemeral
+-RSA keys are always used. This option breaks compatibility with the
+-SSL/TLS specifications and may lead to interoperability problems with
+-clients and should therefore never be used. Ciphers with EDH (ephemeral
+-Diffie-Hellman) key exchange should be used instead.
++This option is no longer implemented and is treated as a no-op.
+
+ =item SSL_OP_CIPHER_SERVER_PREFERENCE
+
+diff --git a/doc/ssl/SSL_CTX_set_tmp_rsa_callback.pod b/doc/ssl/SSL_CTX_set_tmp_rsa_callback.pod
+index 534643c..8794eb7 100644
+--- a/doc/ssl/SSL_CTX_set_tmp_rsa_callback.pod
++++ b/doc/ssl/SSL_CTX_set_tmp_rsa_callback.pod
+@@ -74,21 +74,14 @@ exchange and use EDH (Ephemeral Diffie-Hellman) key exchange instead
+ in order to achieve forward secrecy (see
+ L<SSL_CTX_set_tmp_dh_callback(3)|SSL_CTX_set_tmp_dh_callback(3)>).
+
+-On OpenSSL servers ephemeral RSA key exchange is therefore disabled by default
+-and must be explicitly enabled using the SSL_OP_EPHEMERAL_RSA option of
+-L<SSL_CTX_set_options(3)|SSL_CTX_set_options(3)>, violating the TLS/SSL
+-standard. When ephemeral RSA key exchange is required for export ciphers,
+-it will automatically be used without this option!
+-
+-An application may either directly specify the key or can supply the key via
+-a callback function. The callback approach has the advantage, that the
+-callback may generate the key only in case it is actually needed. As the
+-generation of a RSA key is however costly, it will lead to a significant
+-delay in the handshake procedure. Another advantage of the callback function
+-is that it can supply keys of different size (e.g. for SSL_OP_EPHEMERAL_RSA
+-usage) while the explicit setting of the key is only useful for key size of
+-512 bits to satisfy the export restricted ciphers and does give away key length
+-if a longer key would be allowed.
++An application may either directly specify the key or can supply the key via a
++callback function. The callback approach has the advantage, that the callback
++may generate the key only in case it is actually needed. As the generation of a
++RSA key is however costly, it will lead to a significant delay in the handshake
++procedure. Another advantage of the callback function is that it can supply
++keys of different size while the explicit setting of the key is only useful for
++key size of 512 bits to satisfy the export restricted ciphers and does give
++away key length if a longer key would be allowed.
+
+ The B<tmp_rsa_callback> is called with the B<keylength> needed and
+ the B<is_export> information. The B<is_export> flag is set, when the
+diff --git a/ssl/d1_srvr.c b/ssl/d1_srvr.c
+index e40701e..da4c21e 100644
+--- a/ssl/d1_srvr.c
++++ b/ssl/d1_srvr.c
+@@ -454,24 +454,15 @@ int dtls1_accept(SSL *s)
+ case SSL3_ST_SW_KEY_EXCH_B:
+ alg_k = s->s3->tmp.new_cipher->algorithm_mkey;
+
+- /* clear this, it may get reset by
+- * send_server_key_exchange */
+- if ((s->options & SSL_OP_EPHEMERAL_RSA)
+-#ifndef OPENSSL_NO_KRB5
+- && !(alg_k & SSL_kKRB5)
+-#endif /* OPENSSL_NO_KRB5 */
+- )
+- /* option SSL_OP_EPHEMERAL_RSA sends temporary RSA key
+- * even when forbidden by protocol specs
+- * (handshake may fail as clients are not required to
+- * be able to handle this) */
+- s->s3->tmp.use_rsa_tmp=1;
+- else
+- s->s3->tmp.use_rsa_tmp=0;
++ /*
++ * clear this, it may get reset by
++ * send_server_key_exchange
++ */
++ s->s3->tmp.use_rsa_tmp=0;
+
+ /* only send if a DH key exchange or
+ * RSA but we have a sign only certificate */
+- if (s->s3->tmp.use_rsa_tmp
++ if (0
+ /* PSK: send ServerKeyExchange if PSK identity
+ * hint if provided */
+ #ifndef OPENSSL_NO_PSK
+diff --git a/ssl/s3_clnt.c b/ssl/s3_clnt.c
+index 43ffc77..023c679 100644
+--- a/ssl/s3_clnt.c
++++ b/ssl/s3_clnt.c
+@@ -1537,6 +1537,13 @@ int ssl3_get_key_exchange(SSL *s)
+ #ifndef OPENSSL_NO_RSA
+ if (alg_k & SSL_kRSA)
+ {
++ /* Temporary RSA keys only allowed in export ciphersuites */
++ if (!SSL_C_IS_EXPORT(s->s3->tmp.new_cipher))
++ {
++ al=SSL_AD_UNEXPECTED_MESSAGE;
++ SSLerr(SSL_F_SSL3_GET_SERVER_CERTIFICATE,SSL_R_UNEXPECTED_MESSAGE);
++ goto f_err;
++ }
+ if ((rsa=RSA_new()) == NULL)
+ {
+ SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,ERR_R_MALLOC_FAILURE);
+diff --git a/ssl/s3_srvr.c b/ssl/s3_srvr.c
+index ac2cc3d..d883f86 100644
+--- a/ssl/s3_srvr.c
++++ b/ssl/s3_srvr.c
+@@ -447,20 +447,11 @@ int ssl3_accept(SSL *s)
+ case SSL3_ST_SW_KEY_EXCH_B:
+ alg_k = s->s3->tmp.new_cipher->algorithm_mkey;
+
+- /* clear this, it may get reset by
+- * send_server_key_exchange */
+- if ((s->options & SSL_OP_EPHEMERAL_RSA)
+-#ifndef OPENSSL_NO_KRB5
+- && !(alg_k & SSL_kKRB5)
+-#endif /* OPENSSL_NO_KRB5 */
+- )
+- /* option SSL_OP_EPHEMERAL_RSA sends temporary RSA key
+- * even when forbidden by protocol specs
+- * (handshake may fail as clients are not required to
+- * be able to handle this) */
+- s->s3->tmp.use_rsa_tmp=1;
+- else
+- s->s3->tmp.use_rsa_tmp=0;
++ /*
++ * clear this, it may get reset by
++ * send_server_key_exchange
++ */
++ s->s3->tmp.use_rsa_tmp=0;
+
+
+ /* only send if a DH key exchange, fortezza or
+@@ -474,7 +465,7 @@ int ssl3_accept(SSL *s)
+ * server certificate contains the server's
+ * public key for key exchange.
+ */
+- if (s->s3->tmp.use_rsa_tmp
++ if (0
+ /* PSK: send ServerKeyExchange if PSK identity
+ * hint if provided */
+ #ifndef OPENSSL_NO_PSK
+diff --git a/ssl/ssl.h b/ssl/ssl.h
+index a6a1c77..2ba5923 100644
+--- a/ssl/ssl.h
++++ b/ssl/ssl.h
+@@ -596,9 +596,8 @@ struct ssl_session_st
+ #define SSL_OP_SINGLE_ECDH_USE 0x00080000L
+ /* If set, always create a new key when using tmp_dh parameters */
+ #define SSL_OP_SINGLE_DH_USE 0x00100000L
+-/* Set to always use the tmp_rsa key when doing RSA operations,
+- * even when this violates protocol specs */
+-#define SSL_OP_EPHEMERAL_RSA 0x00200000L
++/* Does nothing: retained for compatibility */
++#define SSL_OP_EPHEMERAL_RSA 0x0
+ /* Set on servers to choose the cipher according to the server's
+ * preferences */
+ #define SSL_OP_CIPHER_SERVER_PREFERENCE 0x00400000L
+--
+2.1.4
+
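A complementary hardening step applications can take on top of this fix (not a
substitute for it, since the flaw was in what the client would accept): keep
export-grade suites out of the negotiable set, so a conforming peer never has
a legitimate reason to offer a temporary RSA key at all. EXPORT, LOW and aNULL
are standard OpenSSL cipher-string keywords:

    #include <openssl/ssl.h>

    /* Returns 1 on success, 0 if nothing in the list could be enabled. */
    int drop_export_ciphers(SSL_CTX *ctx)
    {
        return SSL_CTX_set_cipher_list(ctx, "DEFAULT:!EXPORT:!LOW:!aNULL");
    }
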
Added: openssl/branches/wheezy/debian/patches/0107-fix-error-discrepancy.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0107-fix-error-discrepancy.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0107-fix-error-discrepancy.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,27 @@
+From ffd14272c4c82f68a07b2e2192538adb560fa684 Mon Sep 17 00:00:00 2001
+From: "Dr. Stephen Henson" <steve at openssl.org>
+Date: Wed, 7 Jan 2015 17:36:17 +0000
+Subject: [PATCH 107/117] fix error discrepancy
+
+Reviewed-by: Matt Caswell <matt at openssl.org>
+(cherry picked from commit 4a4d4158572fd8b3dc641851b8378e791df7972d)
+---
+ ssl/s3_clnt.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/ssl/s3_clnt.c b/ssl/s3_clnt.c
+index 023c679..7692716 100644
+--- a/ssl/s3_clnt.c
++++ b/ssl/s3_clnt.c
+@@ -1541,7 +1541,7 @@ int ssl3_get_key_exchange(SSL *s)
+ if (!SSL_C_IS_EXPORT(s->s3->tmp.new_cipher))
+ {
+ al=SSL_AD_UNEXPECTED_MESSAGE;
+- SSLerr(SSL_F_SSL3_GET_SERVER_CERTIFICATE,SSL_R_UNEXPECTED_MESSAGE);
++ SSLerr(SSL_F_SSL3_GET_KEY_EXCHANGE,SSL_R_UNEXPECTED_MESSAGE);
+ goto f_err;
+ }
+ if ((rsa=RSA_new()) == NULL)
+--
+2.1.4
+
Added: openssl/branches/wheezy/debian/patches/0108-Fix-for-CVE-2014-3570.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0108-Fix-for-CVE-2014-3570.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0108-Fix-for-CVE-2014-3570.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,3155 @@
+From e078642ddea29bbb6ba29788a6a513796387fbbb Mon Sep 17 00:00:00 2001
+From: Andy Polyakov <appro at openssl.org>
+Date: Mon, 5 Jan 2015 14:52:56 +0100
+Subject: [PATCH 108/117] Fix for CVE-2014-3570.
+
+Reviewed-by: Emilia Kasper <emilia at openssl.org>
+(cherry picked from commit e793809ba50c1e90ab592fb640a856168e50f3de)
+(with 1.0.1-specific addendum)
+---
+ crypto/bn/asm/mips.pl | 611 +++---------
+ crypto/bn/asm/mips3.s | 2201 --------------------------------------------
+ crypto/bn/asm/x86_64-gcc.c | 34 +-
+ crypto/bn/bn_asm.c | 16 +-
+ crypto/bn/bntest.c | 102 +-
+ 5 files changed, 234 insertions(+), 2730 deletions(-)
+ delete mode 100644 crypto/bn/asm/mips3.s
+
+diff --git a/crypto/bn/asm/mips.pl b/crypto/bn/asm/mips.pl
+index d2f3ef7..215c9a7 100644
+--- a/crypto/bn/asm/mips.pl
++++ b/crypto/bn/asm/mips.pl
+@@ -1872,6 +1872,41 @@ ___
+
+ ($a_4,$a_5,$a_6,$a_7)=($b_0,$b_1,$b_2,$b_3);
+
++sub add_c2 () {
++my ($hi,$lo,$c0,$c1,$c2,
++ $warm, # !$warm denotes first call with specific sequence of
++ # $c_[XYZ] when there is no Z-carry to accumulate yet;
++ $an,$bn # these two are arguments for multiplication which
++ # result is used in *next* step [which is why it's
++ # commented as "forward multiplication" below];
++ )=@_;
++$code.=<<___;
++ mflo $lo
++ mfhi $hi
++ $ADDU $c0,$lo
++ sltu $at,$c0,$lo
++ $MULTU $an,$bn # forward multiplication
++ $ADDU $c0,$lo
++ $ADDU $at,$hi
++ sltu $lo,$c0,$lo
++ $ADDU $c1,$at
++ $ADDU $hi,$lo
++___
++$code.=<<___ if (!$warm);
++ sltu $c2,$c1,$at
++ $ADDU $c1,$hi
++ sltu $hi,$c1,$hi
++ $ADDU $c2,$hi
++___
++$code.=<<___ if ($warm);
++ sltu $at,$c1,$at
++ $ADDU $c1,$hi
++ $ADDU $c2,$at
++ sltu $hi,$c1,$hi
++ $ADDU $c2,$hi
++___
++}
++
+ $code.=<<___;
+
+ .align 5
+@@ -1920,21 +1955,10 @@ $code.=<<___;
+ sltu $at,$c_2,$t_1
+ $ADDU $c_3,$t_2,$at
+ $ST $c_2,$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_2,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_1,$a_1 # mul_add_c(a[1],b[1],c3,c1,c2);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
++___
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,0,
++ $a_1,$a_1); # mul_add_c(a[1],b[1],c3,c1,c2);
++$code.=<<___;
+ mflo $t_1
+ mfhi $t_2
+ $ADDU $c_3,$t_1
+@@ -1945,67 +1969,19 @@ $code.=<<___;
+ sltu $at,$c_1,$t_2
+ $ADDU $c_2,$at
+ $ST $c_3,2*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_3,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_1,$a_2 # mul_add_c2(a[1],b[2],c1,c2,c3);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_3,$at
+- $MULTU $a_4,$a_0 # mul_add_c2(a[4],b[0],c2,c3,c1);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
++___
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,0,
++ $a_1,$a_2); # mul_add_c2(a[1],b[2],c1,c2,c3);
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,1,
++ $a_4,$a_0); # mul_add_c2(a[4],b[0],c2,c3,c1);
++$code.=<<___;
+ $ST $c_1,3*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_1,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_3,$a_1 # mul_add_c2(a[3],b[1],c2,c3,c1);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_1,$at
+- $MULTU $a_2,$a_2 # mul_add_c(a[2],b[2],c2,c3,c1);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
++___
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,0,
++ $a_3,$a_1); # mul_add_c2(a[3],b[1],c2,c3,c1);
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,1,
++ $a_2,$a_2); # mul_add_c(a[2],b[2],c2,c3,c1);
++$code.=<<___;
+ mflo $t_1
+ mfhi $t_2
+ $ADDU $c_2,$t_1
+@@ -2016,97 +1992,23 @@ $code.=<<___;
+ sltu $at,$c_3,$t_2
+ $ADDU $c_1,$at
+ $ST $c_2,4*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_2,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_1,$a_4 # mul_add_c2(a[1],b[4],c3,c1,c2);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_2,$at
+- $MULTU $a_2,$a_3 # mul_add_c2(a[2],b[3],c3,c1,c2);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $MULTU $a_6,$a_0 # mul_add_c2(a[6],b[0],c1,c2,c3);
+- $ADDU $c_2,$at
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
++___
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,0,
++ $a_1,$a_4); # mul_add_c2(a[1],b[4],c3,c1,c2);
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,1,
++ $a_2,$a_3); # mul_add_c2(a[2],b[3],c3,c1,c2);
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,1,
++ $a_6,$a_0); # mul_add_c2(a[6],b[0],c1,c2,c3);
++$code.=<<___;
+ $ST $c_3,5*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_3,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_5,$a_1 # mul_add_c2(a[5],b[1],c1,c2,c3);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_3,$at
+- $MULTU $a_4,$a_2 # mul_add_c2(a[4],b[2],c1,c2,c3);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_3,$at
+- $MULTU $a_3,$a_3 # mul_add_c(a[3],b[3],c1,c2,c3);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
++___
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,0,
++ $a_5,$a_1); # mul_add_c2(a[5],b[1],c1,c2,c3);
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,1,
++ $a_4,$a_2); # mul_add_c2(a[4],b[2],c1,c2,c3);
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,1,
++ $a_3,$a_3); # mul_add_c(a[3],b[3],c1,c2,c3);
++$code.=<<___;
+ mflo $t_1
+ mfhi $t_2
+ $ADDU $c_1,$t_1
+@@ -2117,112 +2019,25 @@ $code.=<<___;
+ sltu $at,$c_2,$t_2
+ $ADDU $c_3,$at
+ $ST $c_1,6*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_1,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_1,$a_6 # mul_add_c2(a[1],b[6],c2,c3,c1);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_1,$at
+- $MULTU $a_2,$a_5 # mul_add_c2(a[2],b[5],c2,c3,c1);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_1,$at
+- $MULTU $a_3,$a_4 # mul_add_c2(a[3],b[4],c2,c3,c1);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_1,$at
+- $MULTU $a_7,$a_1 # mul_add_c2(a[7],b[1],c3,c1,c2);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
++___
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,0,
++ $a_1,$a_6); # mul_add_c2(a[1],b[6],c2,c3,c1);
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,1,
++ $a_2,$a_5); # mul_add_c2(a[2],b[5],c2,c3,c1);
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,1,
++ $a_3,$a_4); # mul_add_c2(a[3],b[4],c2,c3,c1);
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,1,
++ $a_7,$a_1); # mul_add_c2(a[7],b[1],c3,c1,c2);
++$code.=<<___;
+ $ST $c_2,7*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_2,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_6,$a_2 # mul_add_c2(a[6],b[2],c3,c1,c2);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_2,$at
+- $MULTU $a_5,$a_3 # mul_add_c2(a[5],b[3],c3,c1,c2);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_2,$at
+- $MULTU $a_4,$a_4 # mul_add_c(a[4],b[4],c3,c1,c2);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
++___
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,0,
++ $a_6,$a_2); # mul_add_c2(a[6],b[2],c3,c1,c2);
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,1,
++ $a_5,$a_3); # mul_add_c2(a[5],b[3],c3,c1,c2);
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,1,
++ $a_4,$a_4); # mul_add_c(a[4],b[4],c3,c1,c2);
++$code.=<<___;
+ mflo $t_1
+ mfhi $t_2
+ $ADDU $c_3,$t_1
+@@ -2233,82 +2048,21 @@ $code.=<<___;
+ sltu $at,$c_1,$t_2
+ $ADDU $c_2,$at
+ $ST $c_3,8*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_3,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_3,$a_6 # mul_add_c2(a[3],b[6],c1,c2,c3);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_3,$at
+- $MULTU $a_4,$a_5 # mul_add_c2(a[4],b[5],c1,c2,c3);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_3,$at
+- $MULTU $a_7,$a_3 # mul_add_c2(a[7],b[3],c2,c3,c1);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
++___
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,0,
++ $a_3,$a_6); # mul_add_c2(a[3],b[6],c1,c2,c3);
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,1,
++ $a_4,$a_5); # mul_add_c2(a[4],b[5],c1,c2,c3);
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,1,
++ $a_7,$a_3); # mul_add_c2(a[7],b[3],c2,c3,c1);
++$code.=<<___;
+ $ST $c_1,9*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_1,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_6,$a_4 # mul_add_c2(a[6],b[4],c2,c3,c1);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_1,$at
+- $MULTU $a_5,$a_5 # mul_add_c(a[5],b[5],c2,c3,c1);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
++___
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,0,
++ $a_6,$a_4); # mul_add_c2(a[6],b[4],c2,c3,c1);
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,1,
++ $a_5,$a_5); # mul_add_c(a[5],b[5],c2,c3,c1);
++$code.=<<___;
+ mflo $t_1
+ mfhi $t_2
+ $ADDU $c_2,$t_1
+@@ -2319,52 +2073,17 @@ $code.=<<___;
+ sltu $at,$c_3,$t_2
+ $ADDU $c_1,$at
+ $ST $c_2,10*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_2,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_5,$a_6 # mul_add_c2(a[5],b[6],c3,c1,c2);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_2,$at
+- $MULTU $a_7,$a_5 # mul_add_c2(a[7],b[5],c1,c2,c3);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
++___
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,0,
++ $a_5,$a_6); # mul_add_c2(a[5],b[6],c3,c1,c2);
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,1,
++ $a_7,$a_5); # mul_add_c2(a[7],b[5],c1,c2,c3);
++$code.=<<___;
+ $ST $c_3,11*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_3,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_6,$a_6 # mul_add_c(a[6],b[6],c1,c2,c3);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
++___
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,0,
++ $a_6,$a_6); # mul_add_c(a[6],b[6],c1,c2,c3);
++$code.=<<___;
+ mflo $t_1
+ mfhi $t_2
+ $ADDU $c_1,$t_1
+@@ -2375,21 +2094,10 @@ $code.=<<___;
+ sltu $at,$c_2,$t_2
+ $ADDU $c_3,$at
+ $ST $c_1,12*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_1,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_7,$a_7 # mul_add_c(a[7],b[7],c3,c1,c2);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
++___
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,0,
++ $a_7,$a_7); # mul_add_c(a[7],b[7],c3,c1,c2);
++$code.=<<___;
+ $ST $c_2,13*$BNSZ($a0)
+
+ mflo $t_1
+@@ -2457,21 +2165,10 @@ $code.=<<___;
+ sltu $at,$c_2,$t_1
+ $ADDU $c_3,$t_2,$at
+ $ST $c_2,$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_2,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_1,$a_1 # mul_add_c(a[1],b[1],c3,c1,c2);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
++___
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,0,
++ $a_1,$a_1); # mul_add_c(a[1],b[1],c3,c1,c2);
++$code.=<<___;
+ mflo $t_1
+ mfhi $t_2
+ $ADDU $c_3,$t_1
+@@ -2482,52 +2179,17 @@ $code.=<<___;
+ sltu $at,$c_1,$t_2
+ $ADDU $c_2,$at
+ $ST $c_3,2*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_3,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_1,$a_2 # mul_add_c(a2[1],b[2],c1,c2,c3);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
+- mflo $t_1
+- mfhi $t_2
+- slt $at,$t_2,$zero
+- $ADDU $c_3,$at
+- $MULTU $a_3,$a_1 # mul_add_c2(a[3],b[1],c2,c3,c1);
+- $SLL $t_2,1
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_1,$t_1
+- sltu $at,$c_1,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_2,$t_2
+- sltu $at,$c_2,$t_2
+- $ADDU $c_3,$at
++___
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,0,
++ $a_1,$a_2); # mul_add_c2(a2[1],b[2],c1,c2,c3);
++ &add_c2($t_2,$t_1,$c_1,$c_2,$c_3,1,
++ $a_3,$a_1); # mul_add_c2(a[3],b[1],c2,c3,c1);
++$code.=<<___;
+ $ST $c_1,3*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_1,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_2,$a_2 # mul_add_c(a[2],b[2],c2,c3,c1);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_2,$t_1
+- sltu $at,$c_2,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_3,$t_2
+- sltu $at,$c_3,$t_2
+- $ADDU $c_1,$at
++___
++ &add_c2($t_2,$t_1,$c_2,$c_3,$c_1,0,
++ $a_2,$a_2); # mul_add_c(a[2],b[2],c2,c3,c1);
++$code.=<<___;
+ mflo $t_1
+ mfhi $t_2
+ $ADDU $c_2,$t_1
+@@ -2538,21 +2200,10 @@ $code.=<<___;
+ sltu $at,$c_3,$t_2
+ $ADDU $c_1,$at
+ $ST $c_2,4*$BNSZ($a0)
+-
+- mflo $t_1
+- mfhi $t_2
+- slt $c_2,$t_2,$zero
+- $SLL $t_2,1
+- $MULTU $a_3,$a_3 # mul_add_c(a[3],b[3],c1,c2,c3);
+- slt $a2,$t_1,$zero
+- $ADDU $t_2,$a2
+- $SLL $t_1,1
+- $ADDU $c_3,$t_1
+- sltu $at,$c_3,$t_1
+- $ADDU $t_2,$at
+- $ADDU $c_1,$t_2
+- sltu $at,$c_1,$t_2
+- $ADDU $c_2,$at
++___
++ &add_c2($t_2,$t_1,$c_3,$c_1,$c_2,0,
++ $a_3,$a_3); # mul_add_c(a[3],b[3],c1,c2,c3);
++$code.=<<___;
+ $ST $c_3,5*$BNSZ($a0)
+
+ mflo $t_1
+diff --git a/crypto/bn/asm/mips3.s b/crypto/bn/asm/mips3.s
+deleted file mode 100644
+index dca4105..0000000
+--- a/crypto/bn/asm/mips3.s
++++ /dev/null
+@@ -1,2201 +0,0 @@
+-.rdata
+-.asciiz "mips3.s, Version 1.1"
+-.asciiz "MIPS III/IV ISA artwork by Andy Polyakov <appro at fy.chalmers.se>"
+-
+-/*
+- * ====================================================================
+- * Written by Andy Polyakov <appro at fy.chalmers.se> for the OpenSSL
+- * project.
+- *
+- * Rights for redistribution and usage in source and binary forms are
+- * granted according to the OpenSSL license. Warranty of any kind is
+- * disclaimed.
+- * ====================================================================
+- */
+-
+-/*
+- * This is my modest contributon to the OpenSSL project (see
+- * http://www.openssl.org/ for more information about it) and is
+- * a drop-in MIPS III/IV ISA replacement for crypto/bn/bn_asm.c
+- * module. For updates see http://fy.chalmers.se/~appro/hpe/.
+- *
+- * The module is designed to work with either of the "new" MIPS ABI(5),
+- * namely N32 or N64, offered by IRIX 6.x. It's not ment to work under
+- * IRIX 5.x not only because it doesn't support new ABIs but also
+- * because 5.x kernels put R4x00 CPU into 32-bit mode and all those
+- * 64-bit instructions (daddu, dmultu, etc.) found below gonna only
+- * cause illegal instruction exception:-(
+- *
+- * In addition the code depends on preprocessor flags set up by MIPSpro
+- * compiler driver (either as or cc) and therefore (probably?) can't be
+- * compiled by the GNU assembler. GNU C driver manages fine though...
+- * I mean as long as -mmips-as is specified or is the default option,
+- * because then it simply invokes /usr/bin/as which in turn takes
+- * perfect care of the preprocessor definitions. Another neat feature
+- * offered by the MIPSpro assembler is an optimization pass. This gave
+- * me the opportunity to have the code looking more regular as all those
+- * architecture dependent instruction rescheduling details were left to
+- * the assembler. Cool, huh?
+- *
+- * Performance improvement is astonishing! 'apps/openssl speed rsa dsa'
+- * goes way over 3 times faster!
+- *
+- * <appro at fy.chalmers.se>
+- */
+-#include <asm.h>
+-#include <regdef.h>
+-
+-#if _MIPS_ISA>=4
+-#define MOVNZ(cond,dst,src) \
+- movn dst,src,cond
+-#else
+-#define MOVNZ(cond,dst,src) \
+- .set noreorder; \
+- bnezl cond,.+8; \
+- move dst,src; \
+- .set reorder
+-#endif
+-
+-.text
+-
+-.set noat
+-.set reorder
+-
+-#define MINUS4 v1
+-
+-.align 5
+-LEAF(bn_mul_add_words)
+- .set noreorder
+- bgtzl a2,.L_bn_mul_add_words_proceed
+- ld t0,0(a1)
+- jr ra
+- move v0,zero
+- .set reorder
+-
+-.L_bn_mul_add_words_proceed:
+- li MINUS4,-4
+- and ta0,a2,MINUS4
+- move v0,zero
+- beqz ta0,.L_bn_mul_add_words_tail
+-
+-.L_bn_mul_add_words_loop:
+- dmultu t0,a3
+- ld t1,0(a0)
+- ld t2,8(a1)
+- ld t3,8(a0)
+- ld ta0,16(a1)
+- ld ta1,16(a0)
+- daddu t1,v0
+- sltu v0,t1,v0 /* All manuals say it "compares 32-bit
+- * values", but it seems to work fine
+- * even on 64-bit registers. */
+- mflo AT
+- mfhi t0
+- daddu t1,AT
+- daddu v0,t0
+- sltu AT,t1,AT
+- sd t1,0(a0)
+- daddu v0,AT
+-
+- dmultu t2,a3
+- ld ta2,24(a1)
+- ld ta3,24(a0)
+- daddu t3,v0
+- sltu v0,t3,v0
+- mflo AT
+- mfhi t2
+- daddu t3,AT
+- daddu v0,t2
+- sltu AT,t3,AT
+- sd t3,8(a0)
+- daddu v0,AT
+-
+- dmultu ta0,a3
+- subu a2,4
+- PTR_ADD a0,32
+- PTR_ADD a1,32
+- daddu ta1,v0
+- sltu v0,ta1,v0
+- mflo AT
+- mfhi ta0
+- daddu ta1,AT
+- daddu v0,ta0
+- sltu AT,ta1,AT
+- sd ta1,-16(a0)
+- daddu v0,AT
+-
+-
+- dmultu ta2,a3
+- and ta0,a2,MINUS4
+- daddu ta3,v0
+- sltu v0,ta3,v0
+- mflo AT
+- mfhi ta2
+- daddu ta3,AT
+- daddu v0,ta2
+- sltu AT,ta3,AT
+- sd ta3,-8(a0)
+- daddu v0,AT
+- .set noreorder
+- bgtzl ta0,.L_bn_mul_add_words_loop
+- ld t0,0(a1)
+-
+- bnezl a2,.L_bn_mul_add_words_tail
+- ld t0,0(a1)
+- .set reorder
+-
+-.L_bn_mul_add_words_return:
+- jr ra
+-
+-.L_bn_mul_add_words_tail:
+- dmultu t0,a3
+- ld t1,0(a0)
+- subu a2,1
+- daddu t1,v0
+- sltu v0,t1,v0
+- mflo AT
+- mfhi t0
+- daddu t1,AT
+- daddu v0,t0
+- sltu AT,t1,AT
+- sd t1,0(a0)
+- daddu v0,AT
+- beqz a2,.L_bn_mul_add_words_return
+-
+- ld t0,8(a1)
+- dmultu t0,a3
+- ld t1,8(a0)
+- subu a2,1
+- daddu t1,v0
+- sltu v0,t1,v0
+- mflo AT
+- mfhi t0
+- daddu t1,AT
+- daddu v0,t0
+- sltu AT,t1,AT
+- sd t1,8(a0)
+- daddu v0,AT
+- beqz a2,.L_bn_mul_add_words_return
+-
+- ld t0,16(a1)
+- dmultu t0,a3
+- ld t1,16(a0)
+- daddu t1,v0
+- sltu v0,t1,v0
+- mflo AT
+- mfhi t0
+- daddu t1,AT
+- daddu v0,t0
+- sltu AT,t1,AT
+- sd t1,16(a0)
+- daddu v0,AT
+- jr ra
+-END(bn_mul_add_words)
+-
+-.align 5
+-LEAF(bn_mul_words)
+- .set noreorder
+- bgtzl a2,.L_bn_mul_words_proceed
+- ld t0,0(a1)
+- jr ra
+- move v0,zero
+- .set reorder
+-
+-.L_bn_mul_words_proceed:
+- li MINUS4,-4
+- and ta0,a2,MINUS4
+- move v0,zero
+- beqz ta0,.L_bn_mul_words_tail
+-
+-.L_bn_mul_words_loop:
+- dmultu t0,a3
+- ld t2,8(a1)
+- ld ta0,16(a1)
+- ld ta2,24(a1)
+- mflo AT
+- mfhi t0
+- daddu v0,AT
+- sltu t1,v0,AT
+- sd v0,0(a0)
+- daddu v0,t1,t0
+-
+- dmultu t2,a3
+- subu a2,4
+- PTR_ADD a0,32
+- PTR_ADD a1,32
+- mflo AT
+- mfhi t2
+- daddu v0,AT
+- sltu t3,v0,AT
+- sd v0,-24(a0)
+- daddu v0,t3,t2
+-
+- dmultu ta0,a3
+- mflo AT
+- mfhi ta0
+- daddu v0,AT
+- sltu ta1,v0,AT
+- sd v0,-16(a0)
+- daddu v0,ta1,ta0
+-
+-
+- dmultu ta2,a3
+- and ta0,a2,MINUS4
+- mflo AT
+- mfhi ta2
+- daddu v0,AT
+- sltu ta3,v0,AT
+- sd v0,-8(a0)
+- daddu v0,ta3,ta2
+- .set noreorder
+- bgtzl ta0,.L_bn_mul_words_loop
+- ld t0,0(a1)
+-
+- bnezl a2,.L_bn_mul_words_tail
+- ld t0,0(a1)
+- .set reorder
+-
+-.L_bn_mul_words_return:
+- jr ra
+-
+-.L_bn_mul_words_tail:
+- dmultu t0,a3
+- subu a2,1
+- mflo AT
+- mfhi t0
+- daddu v0,AT
+- sltu t1,v0,AT
+- sd v0,0(a0)
+- daddu v0,t1,t0
+- beqz a2,.L_bn_mul_words_return
+-
+- ld t0,8(a1)
+- dmultu t0,a3
+- subu a2,1
+- mflo AT
+- mfhi t0
+- daddu v0,AT
+- sltu t1,v0,AT
+- sd v0,8(a0)
+- daddu v0,t1,t0
+- beqz a2,.L_bn_mul_words_return
+-
+- ld t0,16(a1)
+- dmultu t0,a3
+- mflo AT
+- mfhi t0
+- daddu v0,AT
+- sltu t1,v0,AT
+- sd v0,16(a0)
+- daddu v0,t1,t0
+- jr ra
+-END(bn_mul_words)
+-
+-.align 5
+-LEAF(bn_sqr_words)
+- .set noreorder
+- bgtzl a2,.L_bn_sqr_words_proceed
+- ld t0,0(a1)
+- jr ra
+- move v0,zero
+- .set reorder
+-
+-.L_bn_sqr_words_proceed:
+- li MINUS4,-4
+- and ta0,a2,MINUS4
+- move v0,zero
+- beqz ta0,.L_bn_sqr_words_tail
+-
+-.L_bn_sqr_words_loop:
+- dmultu t0,t0
+- ld t2,8(a1)
+- ld ta0,16(a1)
+- ld ta2,24(a1)
+- mflo t1
+- mfhi t0
+- sd t1,0(a0)
+- sd t0,8(a0)
+-
+- dmultu t2,t2
+- subu a2,4
+- PTR_ADD a0,64
+- PTR_ADD a1,32
+- mflo t3
+- mfhi t2
+- sd t3,-48(a0)
+- sd t2,-40(a0)
+-
+- dmultu ta0,ta0
+- mflo ta1
+- mfhi ta0
+- sd ta1,-32(a0)
+- sd ta0,-24(a0)
+-
+-
+- dmultu ta2,ta2
+- and ta0,a2,MINUS4
+- mflo ta3
+- mfhi ta2
+- sd ta3,-16(a0)
+- sd ta2,-8(a0)
+-
+- .set noreorder
+- bgtzl ta0,.L_bn_sqr_words_loop
+- ld t0,0(a1)
+-
+- bnezl a2,.L_bn_sqr_words_tail
+- ld t0,0(a1)
+- .set reorder
+-
+-.L_bn_sqr_words_return:
+- move v0,zero
+- jr ra
+-
+-.L_bn_sqr_words_tail:
+- dmultu t0,t0
+- subu a2,1
+- mflo t1
+- mfhi t0
+- sd t1,0(a0)
+- sd t0,8(a0)
+- beqz a2,.L_bn_sqr_words_return
+-
+- ld t0,8(a1)
+- dmultu t0,t0
+- subu a2,1
+- mflo t1
+- mfhi t0
+- sd t1,16(a0)
+- sd t0,24(a0)
+- beqz a2,.L_bn_sqr_words_return
+-
+- ld t0,16(a1)
+- dmultu t0,t0
+- mflo t1
+- mfhi t0
+- sd t1,32(a0)
+- sd t0,40(a0)
+- jr ra
+-END(bn_sqr_words)
+-
+-.align 5
+-LEAF(bn_add_words)
+- .set noreorder
+- bgtzl a3,.L_bn_add_words_proceed
+- ld t0,0(a1)
+- jr ra
+- move v0,zero
+- .set reorder
+-
+-.L_bn_add_words_proceed:
+- li MINUS4,-4
+- and AT,a3,MINUS4
+- move v0,zero
+- beqz AT,.L_bn_add_words_tail
+-
+-.L_bn_add_words_loop:
+- ld ta0,0(a2)
+- subu a3,4
+- ld t1,8(a1)
+- and AT,a3,MINUS4
+- ld t2,16(a1)
+- PTR_ADD a2,32
+- ld t3,24(a1)
+- PTR_ADD a0,32
+- ld ta1,-24(a2)
+- PTR_ADD a1,32
+- ld ta2,-16(a2)
+- ld ta3,-8(a2)
+- daddu ta0,t0
+- sltu t8,ta0,t0
+- daddu t0,ta0,v0
+- sltu v0,t0,ta0
+- sd t0,-32(a0)
+- daddu v0,t8
+-
+- daddu ta1,t1
+- sltu t9,ta1,t1
+- daddu t1,ta1,v0
+- sltu v0,t1,ta1
+- sd t1,-24(a0)
+- daddu v0,t9
+-
+- daddu ta2,t2
+- sltu t8,ta2,t2
+- daddu t2,ta2,v0
+- sltu v0,t2,ta2
+- sd t2,-16(a0)
+- daddu v0,t8
+-
+- daddu ta3,t3
+- sltu t9,ta3,t3
+- daddu t3,ta3,v0
+- sltu v0,t3,ta3
+- sd t3,-8(a0)
+- daddu v0,t9
+-
+- .set noreorder
+- bgtzl AT,.L_bn_add_words_loop
+- ld t0,0(a1)
+-
+- bnezl a3,.L_bn_add_words_tail
+- ld t0,0(a1)
+- .set reorder
+-
+-.L_bn_add_words_return:
+- jr ra
+-
+-.L_bn_add_words_tail:
+- ld ta0,0(a2)
+- daddu ta0,t0
+- subu a3,1
+- sltu t8,ta0,t0
+- daddu t0,ta0,v0
+- sltu v0,t0,ta0
+- sd t0,0(a0)
+- daddu v0,t8
+- beqz a3,.L_bn_add_words_return
+-
+- ld t1,8(a1)
+- ld ta1,8(a2)
+- daddu ta1,t1
+- subu a3,1
+- sltu t9,ta1,t1
+- daddu t1,ta1,v0
+- sltu v0,t1,ta1
+- sd t1,8(a0)
+- daddu v0,t9
+- beqz a3,.L_bn_add_words_return
+-
+- ld t2,16(a1)
+- ld ta2,16(a2)
+- daddu ta2,t2
+- sltu t8,ta2,t2
+- daddu t2,ta2,v0
+- sltu v0,t2,ta2
+- sd t2,16(a0)
+- daddu v0,t8
+- jr ra
+-END(bn_add_words)
+-
+-.align 5
+-LEAF(bn_sub_words)
+- .set noreorder
+- bgtzl a3,.L_bn_sub_words_proceed
+- ld t0,0(a1)
+- jr ra
+- move v0,zero
+- .set reorder
+-
+-.L_bn_sub_words_proceed:
+- li MINUS4,-4
+- and AT,a3,MINUS4
+- move v0,zero
+- beqz AT,.L_bn_sub_words_tail
+-
+-.L_bn_sub_words_loop:
+- ld ta0,0(a2)
+- subu a3,4
+- ld t1,8(a1)
+- and AT,a3,MINUS4
+- ld t2,16(a1)
+- PTR_ADD a2,32
+- ld t3,24(a1)
+- PTR_ADD a0,32
+- ld ta1,-24(a2)
+- PTR_ADD a1,32
+- ld ta2,-16(a2)
+- ld ta3,-8(a2)
+- sltu t8,t0,ta0
+- dsubu t0,ta0
+- dsubu ta0,t0,v0
+- sd ta0,-32(a0)
+- MOVNZ (t0,v0,t8)
+-
+- sltu t9,t1,ta1
+- dsubu t1,ta1
+- dsubu ta1,t1,v0
+- sd ta1,-24(a0)
+- MOVNZ (t1,v0,t9)
+-
+-
+- sltu t8,t2,ta2
+- dsubu t2,ta2
+- dsubu ta2,t2,v0
+- sd ta2,-16(a0)
+- MOVNZ (t2,v0,t8)
+-
+- sltu t9,t3,ta3
+- dsubu t3,ta3
+- dsubu ta3,t3,v0
+- sd ta3,-8(a0)
+- MOVNZ (t3,v0,t9)
+-
+- .set noreorder
+- bgtzl AT,.L_bn_sub_words_loop
+- ld t0,0(a1)
+-
+- bnezl a3,.L_bn_sub_words_tail
+- ld t0,0(a1)
+- .set reorder
+-
+-.L_bn_sub_words_return:
+- jr ra
+-
+-.L_bn_sub_words_tail:
+- ld ta0,0(a2)
+- subu a3,1
+- sltu t8,t0,ta0
+- dsubu t0,ta0
+- dsubu ta0,t0,v0
+- MOVNZ (t0,v0,t8)
+- sd ta0,0(a0)
+- beqz a3,.L_bn_sub_words_return
+-
+- ld t1,8(a1)
+- subu a3,1
+- ld ta1,8(a2)
+- sltu t9,t1,ta1
+- dsubu t1,ta1
+- dsubu ta1,t1,v0
+- MOVNZ (t1,v0,t9)
+- sd ta1,8(a0)
+- beqz a3,.L_bn_sub_words_return
+-
+- ld t2,16(a1)
+- ld ta2,16(a2)
+- sltu t8,t2,ta2
+- dsubu t2,ta2
+- dsubu ta2,t2,v0
+- MOVNZ (t2,v0,t8)
+- sd ta2,16(a0)
+- jr ra
+-END(bn_sub_words)
+-
+-#undef MINUS4
+-
+-.align 5
+-LEAF(bn_div_3_words)
+- .set reorder
+- move a3,a0 /* we know that bn_div_words doesn't
+- * touch a3, ta2, ta3 and preserves a2
+- * so that we can save two arguments
+- * and return address in registers
+- * instead of stack:-)
+- */
+- ld a0,(a3)
+- move ta2,a1
+- ld a1,-8(a3)
+- bne a0,a2,.L_bn_div_3_words_proceed
+- li v0,-1
+- jr ra
+-.L_bn_div_3_words_proceed:
+- move ta3,ra
+- bal bn_div_words
+- move ra,ta3
+- dmultu ta2,v0
+- ld t2,-16(a3)
+- move ta0,zero
+- mfhi t1
+- mflo t0
+- sltu t8,t1,v1
+-.L_bn_div_3_words_inner_loop:
+- bnez t8,.L_bn_div_3_words_inner_loop_done
+- sgeu AT,t2,t0
+- seq t9,t1,v1
+- and AT,t9
+- sltu t3,t0,ta2
+- daddu v1,a2
+- dsubu t1,t3
+- dsubu t0,ta2
+- sltu t8,t1,v1
+- sltu ta0,v1,a2
+- or t8,ta0
+- .set noreorder
+- beqzl AT,.L_bn_div_3_words_inner_loop
+- dsubu v0,1
+- .set reorder
+-.L_bn_div_3_words_inner_loop_done:
+- jr ra
+-END(bn_div_3_words)
+-
+-.align 5
+-LEAF(bn_div_words)
+- .set noreorder
+- bnezl a2,.L_bn_div_words_proceed
+- move v1,zero
+- jr ra
+- li v0,-1 /* I'd rather signal div-by-zero
+- * which can be done with 'break 7' */
+-
+-.L_bn_div_words_proceed:
+- bltz a2,.L_bn_div_words_body
+- move t9,v1
+- dsll a2,1
+- bgtz a2,.-4
+- addu t9,1
+-
+- .set reorder
+- negu t1,t9
+- li t2,-1
+- dsll t2,t1
+- and t2,a0
+- dsrl AT,a1,t1
+- .set noreorder
+- bnezl t2,.+8
+- break 6 /* signal overflow */
+- .set reorder
+- dsll a0,t9
+- dsll a1,t9
+- or a0,AT
+-
+-#define QT ta0
+-#define HH ta1
+-#define DH v1
+-.L_bn_div_words_body:
+- dsrl DH,a2,32
+- sgeu AT,a0,a2
+- .set noreorder
+- bnezl AT,.+8
+- dsubu a0,a2
+- .set reorder
+-
+- li QT,-1
+- dsrl HH,a0,32
+- dsrl QT,32 /* q=0xffffffff */
+- beq DH,HH,.L_bn_div_words_skip_div1
+- ddivu zero,a0,DH
+- mflo QT
+-.L_bn_div_words_skip_div1:
+- dmultu a2,QT
+- dsll t3,a0,32
+- dsrl AT,a1,32
+- or t3,AT
+- mflo t0
+- mfhi t1
+-.L_bn_div_words_inner_loop1:
+- sltu t2,t3,t0
+- seq t8,HH,t1
+- sltu AT,HH,t1
+- and t2,t8
+- sltu v0,t0,a2
+- or AT,t2
+- .set noreorder
+- beqz AT,.L_bn_div_words_inner_loop1_done
+- dsubu t1,v0
+- dsubu t0,a2
+- b .L_bn_div_words_inner_loop1
+- dsubu QT,1
+- .set reorder
+-.L_bn_div_words_inner_loop1_done:
+-
+- dsll a1,32
+- dsubu a0,t3,t0
+- dsll v0,QT,32
+-
+- li QT,-1
+- dsrl HH,a0,32
+- dsrl QT,32 /* q=0xffffffff */
+- beq DH,HH,.L_bn_div_words_skip_div2
+- ddivu zero,a0,DH
+- mflo QT
+-.L_bn_div_words_skip_div2:
+-#undef DH
+- dmultu a2,QT
+- dsll t3,a0,32
+- dsrl AT,a1,32
+- or t3,AT
+- mflo t0
+- mfhi t1
+-.L_bn_div_words_inner_loop2:
+- sltu t2,t3,t0
+- seq t8,HH,t1
+- sltu AT,HH,t1
+- and t2,t8
+- sltu v1,t0,a2
+- or AT,t2
+- .set noreorder
+- beqz AT,.L_bn_div_words_inner_loop2_done
+- dsubu t1,v1
+- dsubu t0,a2
+- b .L_bn_div_words_inner_loop2
+- dsubu QT,1
+- .set reorder
+-.L_bn_div_words_inner_loop2_done:
+-#undef HH
+-
+- dsubu a0,t3,t0
+- or v0,QT
+- dsrl v1,a0,t9 /* v1 contains remainder if anybody wants it */
+- dsrl a2,t9 /* restore a2 */
+- jr ra
+-#undef QT
+-END(bn_div_words)
+-
+-#define a_0 t0
+-#define a_1 t1
+-#define a_2 t2
+-#define a_3 t3
+-#define b_0 ta0
+-#define b_1 ta1
+-#define b_2 ta2
+-#define b_3 ta3
+-
+-#define a_4 s0
+-#define a_5 s2
+-#define a_6 s4
+-#define a_7 a1 /* once we load a[7] we don't need a anymore */
+-#define b_4 s1
+-#define b_5 s3
+-#define b_6 s5
+-#define b_7 a2 /* once we load b[7] we don't need b anymore */
+-
+-#define t_1 t8
+-#define t_2 t9
+-
+-#define c_1 v0
+-#define c_2 v1
+-#define c_3 a3
+-
+-#define FRAME_SIZE 48
+-
+-.align 5
+-LEAF(bn_mul_comba8)
+- .set noreorder
+- PTR_SUB sp,FRAME_SIZE
+- .frame sp,64,ra
+- .set reorder
+- ld a_0,0(a1) /* If compiled with -mips3 option on
+- * R5000 box assembler barks on this
+- * line with "shouldn't have mult/div
+- * as last instruction in bb (R10K
+- * bug)" warning. If anybody out there
+- * has a clue about how to circumvent
+- * this do send me a note.
+- * <appro at fy.chalmers.se>
+- */
+- ld b_0,0(a2)
+- ld a_1,8(a1)
+- ld a_2,16(a1)
+- ld a_3,24(a1)
+- ld b_1,8(a2)
+- ld b_2,16(a2)
+- ld b_3,24(a2)
+- dmultu a_0,b_0 /* mul_add_c(a[0],b[0],c1,c2,c3); */
+- sd s0,0(sp)
+- sd s1,8(sp)
+- sd s2,16(sp)
+- sd s3,24(sp)
+- sd s4,32(sp)
+- sd s5,40(sp)
+- mflo c_1
+- mfhi c_2
+-
+- dmultu a_0,b_1 /* mul_add_c(a[0],b[1],c2,c3,c1); */
+- ld a_4,32(a1)
+- ld a_5,40(a1)
+- ld a_6,48(a1)
+- ld a_7,56(a1)
+- ld b_4,32(a2)
+- ld b_5,40(a2)
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu c_3,t_2,AT
+- dmultu a_1,b_0 /* mul_add_c(a[1],b[0],c2,c3,c1); */
+- ld b_6,48(a2)
+- ld b_7,56(a2)
+- sd c_1,0(a0) /* r[0]=c1; */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu c_1,c_3,t_2
+- sd c_2,8(a0) /* r[1]=c2; */
+-
+- dmultu a_2,b_0 /* mul_add_c(a[2],b[0],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- dmultu a_1,b_1 /* mul_add_c(a[1],b[1],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu c_2,c_1,t_2
+- dmultu a_0,b_2 /* mul_add_c(a[0],b[2],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,16(a0) /* r[2]=c3; */
+-
+- dmultu a_0,b_3 /* mul_add_c(a[0],b[3],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu c_3,c_2,t_2
+- dmultu a_1,b_2 /* mul_add_c(a[1],b[2],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_2,b_1 /* mul_add_c(a[2],b[1],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_3,b_0 /* mul_add_c(a[3],b[0],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,24(a0) /* r[3]=c1; */
+-
+- dmultu a_4,b_0 /* mul_add_c(a[4],b[0],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu c_1,c_3,t_2
+- dmultu a_3,b_1 /* mul_add_c(a[3],b[1],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_2,b_2 /* mul_add_c(a[2],b[2],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_1,b_3 /* mul_add_c(a[1],b[3],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_0,b_4 /* mul_add_c(a[0],b[4],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,32(a0) /* r[4]=c2; */
+-
+- dmultu a_0,b_5 /* mul_add_c(a[0],b[5],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu c_2,c_1,t_2
+- dmultu a_1,b_4 /* mul_add_c(a[1],b[4],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_2,b_3 /* mul_add_c(a[2],b[3],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_3,b_2 /* mul_add_c(a[3],b[2],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_4,b_1 /* mul_add_c(a[4],b[1],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_5,b_0 /* mul_add_c(a[5],b[0],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,40(a0) /* r[5]=c3; */
+-
+- dmultu a_6,b_0 /* mul_add_c(a[6],b[0],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu c_3,c_2,t_2
+- dmultu a_5,b_1 /* mul_add_c(a[5],b[1],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_4,b_2 /* mul_add_c(a[4],b[2],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_3,b_3 /* mul_add_c(a[3],b[3],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_2,b_4 /* mul_add_c(a[2],b[4],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_1,b_5 /* mul_add_c(a[1],b[5],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_0,b_6 /* mul_add_c(a[0],b[6],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,48(a0) /* r[6]=c1; */
+-
+- dmultu a_0,b_7 /* mul_add_c(a[0],b[7],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu c_1,c_3,t_2
+- dmultu a_1,b_6 /* mul_add_c(a[1],b[6],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_2,b_5 /* mul_add_c(a[2],b[5],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_3,b_4 /* mul_add_c(a[3],b[4],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_4,b_3 /* mul_add_c(a[4],b[3],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_5,b_2 /* mul_add_c(a[5],b[2],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_6,b_1 /* mul_add_c(a[6],b[1],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_7,b_0 /* mul_add_c(a[7],b[0],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,56(a0) /* r[7]=c2; */
+-
+- dmultu a_7,b_1 /* mul_add_c(a[7],b[1],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu c_2,c_1,t_2
+- dmultu a_6,b_2 /* mul_add_c(a[6],b[2],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_5,b_3 /* mul_add_c(a[5],b[3],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_4,b_4 /* mul_add_c(a[4],b[4],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_3,b_5 /* mul_add_c(a[3],b[5],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_2,b_6 /* mul_add_c(a[2],b[6],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_1,b_7 /* mul_add_c(a[1],b[7],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,64(a0) /* r[8]=c3; */
+-
+- dmultu a_2,b_7 /* mul_add_c(a[2],b[7],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu c_3,c_2,t_2
+- dmultu a_3,b_6 /* mul_add_c(a[3],b[6],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_4,b_5 /* mul_add_c(a[4],b[5],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_5,b_4 /* mul_add_c(a[5],b[4],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_6,b_3 /* mul_add_c(a[6],b[3],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_7,b_2 /* mul_add_c(a[7],b[2],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,72(a0) /* r[9]=c1; */
+-
+- dmultu a_7,b_3 /* mul_add_c(a[7],b[3],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu c_1,c_3,t_2
+- dmultu a_6,b_4 /* mul_add_c(a[6],b[4],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_5,b_5 /* mul_add_c(a[5],b[5],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_4,b_6 /* mul_add_c(a[4],b[6],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_3,b_7 /* mul_add_c(a[3],b[7],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,80(a0) /* r[10]=c2; */
+-
+- dmultu a_4,b_7 /* mul_add_c(a[4],b[7],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu c_2,c_1,t_2
+- dmultu a_5,b_6 /* mul_add_c(a[5],b[6],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_6,b_5 /* mul_add_c(a[6],b[5],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_7,b_4 /* mul_add_c(a[7],b[4],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,88(a0) /* r[11]=c3; */
+-
+- dmultu a_7,b_5 /* mul_add_c(a[7],b[5],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu c_3,c_2,t_2
+- dmultu a_6,b_6 /* mul_add_c(a[6],b[6],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_5,b_7 /* mul_add_c(a[5],b[7],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,96(a0) /* r[12]=c1; */
+-
+- dmultu a_6,b_7 /* mul_add_c(a[6],b[7],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu c_1,c_3,t_2
+- dmultu a_7,b_6 /* mul_add_c(a[7],b[6],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,104(a0) /* r[13]=c2; */
+-
+- dmultu a_7,b_7 /* mul_add_c(a[7],b[7],c3,c1,c2); */
+- ld s0,0(sp)
+- ld s1,8(sp)
+- ld s2,16(sp)
+- ld s3,24(sp)
+- ld s4,32(sp)
+- ld s5,40(sp)
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sd c_3,112(a0) /* r[14]=c3; */
+- sd c_1,120(a0) /* r[15]=c1; */
+-
+- PTR_ADD sp,FRAME_SIZE
+-
+- jr ra
+-END(bn_mul_comba8)
+-
+-.align 5
+-LEAF(bn_mul_comba4)
+- .set reorder
+- ld a_0,0(a1)
+- ld b_0,0(a2)
+- ld a_1,8(a1)
+- ld a_2,16(a1)
+- dmultu a_0,b_0 /* mul_add_c(a[0],b[0],c1,c2,c3); */
+- ld a_3,24(a1)
+- ld b_1,8(a2)
+- ld b_2,16(a2)
+- ld b_3,24(a2)
+- mflo c_1
+- mfhi c_2
+- sd c_1,0(a0)
+-
+- dmultu a_0,b_1 /* mul_add_c(a[0],b[1],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu c_3,t_2,AT
+- dmultu a_1,b_0 /* mul_add_c(a[1],b[0],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu c_1,c_3,t_2
+- sd c_2,8(a0)
+-
+- dmultu a_2,b_0 /* mul_add_c(a[2],b[0],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- dmultu a_1,b_1 /* mul_add_c(a[1],b[1],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu c_2,c_1,t_2
+- dmultu a_0,b_2 /* mul_add_c(a[0],b[2],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,16(a0)
+-
+- dmultu a_0,b_3 /* mul_add_c(a[0],b[3],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu c_3,c_2,t_2
+- dmultu a_1,b_2 /* mul_add_c(a[1],b[2],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_2,b_1 /* mul_add_c(a[2],b[1],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_3,b_0 /* mul_add_c(a[3],b[0],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,24(a0)
+-
+- dmultu a_3,b_1 /* mul_add_c(a[3],b[1],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu c_1,c_3,t_2
+- dmultu a_2,b_2 /* mul_add_c(a[2],b[2],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_1,b_3 /* mul_add_c(a[1],b[3],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,32(a0)
+-
+- dmultu a_2,b_3 /* mul_add_c(a[2],b[3],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu c_2,c_1,t_2
+- dmultu a_3,b_2 /* mul_add_c(a[3],b[2],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,40(a0)
+-
+- dmultu a_3,b_3 /* mul_add_c(a[3],b[3],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sd c_1,48(a0)
+- sd c_2,56(a0)
+-
+- jr ra
+-END(bn_mul_comba4)
+-
+-#undef a_4
+-#undef a_5
+-#undef a_6
+-#undef a_7
+-#define a_4 b_0
+-#define a_5 b_1
+-#define a_6 b_2
+-#define a_7 b_3
+-
+-.align 5
+-LEAF(bn_sqr_comba8)
+- .set reorder
+- ld a_0,0(a1)
+- ld a_1,8(a1)
+- ld a_2,16(a1)
+- ld a_3,24(a1)
+-
+- dmultu a_0,a_0 /* mul_add_c(a[0],b[0],c1,c2,c3); */
+- ld a_4,32(a1)
+- ld a_5,40(a1)
+- ld a_6,48(a1)
+- ld a_7,56(a1)
+- mflo c_1
+- mfhi c_2
+- sd c_1,0(a0)
+-
+- dmultu a_0,a_1 /* mul_add_c2(a[0],b[1],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt c_1,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu c_3,t_2,AT
+- sd c_2,8(a0)
+-
+- dmultu a_2,a_0 /* mul_add_c2(a[2],b[0],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt c_2,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_1,a_1 /* mul_add_c(a[1],b[1],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,16(a0)
+-
+- dmultu a_0,a_3 /* mul_add_c2(a[0],b[3],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt c_3,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_1,a_2 /* mul_add_c2(a[1],b[2],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_3,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,24(a0)
+-
+- dmultu a_4,a_0 /* mul_add_c2(a[4],b[0],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt c_1,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_3,a_1 /* mul_add_c2(a[3],b[1],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_1,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_2,a_2 /* mul_add_c(a[2],b[2],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,32(a0)
+-
+- dmultu a_0,a_5 /* mul_add_c2(a[0],b[5],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt c_2,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_1,a_4 /* mul_add_c2(a[1],b[4],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_2,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_2,a_3 /* mul_add_c2(a[2],b[3],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_2,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,40(a0)
+-
+- dmultu a_6,a_0 /* mul_add_c2(a[6],b[0],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt c_3,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_5,a_1 /* mul_add_c2(a[5],b[1],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_3,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_4,a_2 /* mul_add_c2(a[4],b[2],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_3,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_3,a_3 /* mul_add_c(a[3],b[3],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,48(a0)
+-
+- dmultu a_0,a_7 /* mul_add_c2(a[0],b[7],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt c_1,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_1,a_6 /* mul_add_c2(a[1],b[6],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_1,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_2,a_5 /* mul_add_c2(a[2],b[5],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_1,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_3,a_4 /* mul_add_c2(a[3],b[4],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_1,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,56(a0)
+-
+- dmultu a_7,a_1 /* mul_add_c2(a[7],b[1],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt c_2,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_6,a_2 /* mul_add_c2(a[6],b[2],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_2,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_5,a_3 /* mul_add_c2(a[5],b[3],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_2,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_4,a_4 /* mul_add_c(a[4],b[4],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,64(a0)
+-
+- dmultu a_2,a_7 /* mul_add_c2(a[2],b[7],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt c_3,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_3,a_6 /* mul_add_c2(a[3],b[6],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_3,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_4,a_5 /* mul_add_c2(a[4],b[5],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_3,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,72(a0)
+-
+- dmultu a_7,a_3 /* mul_add_c2(a[7],b[3],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt c_1,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_6,a_4 /* mul_add_c2(a[6],b[4],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_1,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_5,a_5 /* mul_add_c(a[5],b[5],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,80(a0)
+-
+- dmultu a_4,a_7 /* mul_add_c2(a[4],b[7],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt c_2,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_5,a_6 /* mul_add_c2(a[5],b[6],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_2,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,88(a0)
+-
+- dmultu a_7,a_5 /* mul_add_c2(a[7],b[5],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt c_3,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- dmultu a_6,a_6 /* mul_add_c(a[6],b[6],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,96(a0)
+-
+- dmultu a_6,a_7 /* mul_add_c2(a[6],b[7],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt c_1,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,104(a0)
+-
+- dmultu a_7,a_7 /* mul_add_c(a[7],b[7],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sd c_3,112(a0)
+- sd c_1,120(a0)
+-
+- jr ra
+-END(bn_sqr_comba8)
+-
+-.align 5
+-LEAF(bn_sqr_comba4)
+- .set reorder
+- ld a_0,0(a1)
+- ld a_1,8(a1)
+- ld a_2,16(a1)
+- ld a_3,24(a1)
+- dmultu a_0,a_0 /* mul_add_c(a[0],b[0],c1,c2,c3); */
+- mflo c_1
+- mfhi c_2
+- sd c_1,0(a0)
+-
+- dmultu a_0,a_1 /* mul_add_c2(a[0],b[1],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt c_1,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu c_3,t_2,AT
+- sd c_2,8(a0)
+-
+- dmultu a_2,a_0 /* mul_add_c2(a[2],b[0],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt c_2,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- dmultu a_1,a_1 /* mul_add_c(a[1],b[1],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,16(a0)
+-
+- dmultu a_0,a_3 /* mul_add_c2(a[0],b[3],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt c_3,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+-	dmultu	a_1,a_2		/* mul_add_c2(a[1],b[2],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- slt AT,t_2,zero
+- daddu c_3,AT
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sltu AT,c_2,t_2
+- daddu c_3,AT
+- sd c_1,24(a0)
+-
+- dmultu a_3,a_1 /* mul_add_c2(a[3],b[1],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- slt c_1,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- dmultu a_2,a_2 /* mul_add_c(a[2],b[2],c2,c3,c1); */
+- mflo t_1
+- mfhi t_2
+- daddu c_2,t_1
+- sltu AT,c_2,t_1
+- daddu t_2,AT
+- daddu c_3,t_2
+- sltu AT,c_3,t_2
+- daddu c_1,AT
+- sd c_2,32(a0)
+-
+- dmultu a_2,a_3 /* mul_add_c2(a[2],b[3],c3,c1,c2); */
+- mflo t_1
+- mfhi t_2
+- slt c_2,t_2,zero
+- dsll t_2,1
+- slt a2,t_1,zero
+- daddu t_2,a2
+- dsll t_1,1
+- daddu c_3,t_1
+- sltu AT,c_3,t_1
+- daddu t_2,AT
+- daddu c_1,t_2
+- sltu AT,c_1,t_2
+- daddu c_2,AT
+- sd c_3,40(a0)
+-
+- dmultu a_3,a_3 /* mul_add_c(a[3],b[3],c1,c2,c3); */
+- mflo t_1
+- mfhi t_2
+- daddu c_1,t_1
+- sltu AT,c_1,t_1
+- daddu t_2,AT
+- daddu c_2,t_2
+- sd c_1,48(a0)
+- sd c_2,56(a0)
+-
+- jr ra
+-END(bn_sqr_comba4)
+diff --git a/crypto/bn/asm/x86_64-gcc.c b/crypto/bn/asm/x86_64-gcc.c
+index 31476ab..2d39407 100644
+--- a/crypto/bn/asm/x86_64-gcc.c
++++ b/crypto/bn/asm/x86_64-gcc.c
+@@ -273,6 +273,10 @@ BN_ULONG bn_sub_words(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b, int n)
+ /* sqr_add_c(a,i,c0,c1,c2) -- c+=a[i]^2 for three word number c=(c2,c1,c0) */
+ /* sqr_add_c2(a,i,c0,c1,c2) -- c+=2*a[i]*a[j] for three word number c=(c2,c1,c0) */
+
++/*
++ * Keep in mind that carrying into high part of multiplication result
++ * can not overflow, because it cannot be all-ones.
++ */
+ #if 0
+ /* original macros are kept for reference purposes */
+ #define mul_add_c(a,b,c0,c1,c2) { \
+@@ -287,10 +291,10 @@ BN_ULONG bn_sub_words(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b, int n)
+ BN_ULONG ta=(a),tb=(b),t0; \
+ t1 = BN_UMULT_HIGH(ta,tb); \
+ t0 = ta * tb; \
+- t2 = t1+t1; c2 += (t2<t1)?1:0; \
+- t1 = t0+t0; t2 += (t1<t0)?1:0; \
+- c0 += t1; t2 += (c0<t1)?1:0; \
++ c0 += t0; t2 = t1+((c0<t0)?1:0);\
+ c1 += t2; c2 += (c1<t2)?1:0; \
++ c0 += t0; t1 += (c0<t0)?1:0; \
++ c1 += t1; c2 += (c1<t1)?1:0; \
+ }
+ #else
+ #define mul_add_c(a,b,c0,c1,c2) do { \
+@@ -328,22 +332,14 @@ BN_ULONG bn_sub_words(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b, int n)
+ : "=a"(t1),"=d"(t2) \
+ : "a"(a),"m"(b) \
+ : "cc"); \
+- asm ("addq %0,%0; adcq %2,%1" \
+- : "+d"(t2),"+r"(c2) \
+- : "g"(0) \
+- : "cc"); \
+- asm ("addq %0,%0; adcq %2,%1" \
+- : "+a"(t1),"+d"(t2) \
+- : "g"(0) \
+- : "cc"); \
+- asm ("addq %2,%0; adcq %3,%1" \
+- : "+r"(c0),"+d"(t2) \
+- : "a"(t1),"g"(0) \
+- : "cc"); \
+- asm ("addq %2,%0; adcq %3,%1" \
+- : "+r"(c1),"+r"(c2) \
+- : "d"(t2),"g"(0) \
+- : "cc"); \
++ asm ("addq %3,%0; adcq %4,%1; adcq %5,%2" \
++ : "+r"(c0),"+r"(c1),"+r"(c2) \
++ : "r"(t1),"r"(t2),"g"(0) \
++ : "cc"); \
++ asm ("addq %3,%0; adcq %4,%1; adcq %5,%2" \
++ : "+r"(c0),"+r"(c1),"+r"(c2) \
++ : "r"(t1),"r"(t2),"g"(0) \
++ : "cc"); \
+ } while (0)
+ #endif
+
+diff --git a/crypto/bn/bn_asm.c b/crypto/bn/bn_asm.c
+index c43c91c..a33b634 100644
+--- a/crypto/bn/bn_asm.c
++++ b/crypto/bn/bn_asm.c
+@@ -438,6 +438,10 @@ BN_ULONG bn_sub_words(BN_ULONG *r, const BN_ULONG *a, const BN_ULONG *b, int n)
+ /* sqr_add_c(a,i,c0,c1,c2) -- c+=a[i]^2 for three word number c=(c2,c1,c0) */
+ /* sqr_add_c2(a,i,c0,c1,c2) -- c+=2*a[i]*a[j] for three word number c=(c2,c1,c0) */
+
++/*
++ * Keep in mind that carrying into high part of multiplication result
++ * can not overflow, because it cannot be all-ones.
++ */
+ #ifdef BN_LLONG
+ #define mul_add_c(a,b,c0,c1,c2) \
+ t=(BN_ULLONG)a*b; \
+@@ -478,10 +482,10 @@ BN_ULONG bn_sub_words(BN_ULONG *r, const BN_ULONG *a, const BN_ULONG *b, int n)
+ #define mul_add_c2(a,b,c0,c1,c2) { \
+ BN_ULONG ta=(a),tb=(b),t0; \
+ BN_UMULT_LOHI(t0,t1,ta,tb); \
+- t2 = t1+t1; c2 += (t2<t1)?1:0; \
+- t1 = t0+t0; t2 += (t1<t0)?1:0; \
+- c0 += t1; t2 += (c0<t1)?1:0; \
++ c0 += t0; t2 = t1+((c0<t0)?1:0);\
+ c1 += t2; c2 += (c1<t2)?1:0; \
++ c0 += t0; t1 += (c0<t0)?1:0; \
++ c1 += t1; c2 += (c1<t1)?1:0; \
+ }
+
+ #define sqr_add_c(a,i,c0,c1,c2) { \
+@@ -508,10 +512,10 @@ BN_ULONG bn_sub_words(BN_ULONG *r, const BN_ULONG *a, const BN_ULONG *b, int n)
+ BN_ULONG ta=(a),tb=(b),t0; \
+ t1 = BN_UMULT_HIGH(ta,tb); \
+ t0 = ta * tb; \
+- t2 = t1+t1; c2 += (t2<t1)?1:0; \
+- t1 = t0+t0; t2 += (t1<t0)?1:0; \
+- c0 += t1; t2 += (c0<t1)?1:0; \
++ c0 += t0; t2 = t1+((c0<t0)?1:0);\
+ c1 += t2; c2 += (c1<t2)?1:0; \
++ c0 += t0; t1 += (c0<t0)?1:0; \
++ c1 += t1; c2 += (c1<t1)?1:0; \
+ }
+
+ #define sqr_add_c(a,i,c0,c1,c2) { \
+diff --git a/crypto/bn/bntest.c b/crypto/bn/bntest.c
+index 7771e92..48bc633 100644
+--- a/crypto/bn/bntest.c
++++ b/crypto/bn/bntest.c
+@@ -678,44 +678,98 @@ int test_mul(BIO *bp)
+
+ int test_sqr(BIO *bp, BN_CTX *ctx)
+ {
+- BIGNUM a,c,d,e;
+- int i;
++ BIGNUM *a,*c,*d,*e;
++ int i, ret = 0;
+
+- BN_init(&a);
+- BN_init(&c);
+- BN_init(&d);
+- BN_init(&e);
++ a = BN_new();
++ c = BN_new();
++ d = BN_new();
++ e = BN_new();
++ if (a == NULL || c == NULL || d == NULL || e == NULL)
++ {
++ goto err;
++ }
+
+ for (i=0; i<num0; i++)
+ {
+- BN_bntest_rand(&a,40+i*10,0,0);
+- a.neg=rand_neg();
+- BN_sqr(&c,&a,ctx);
++ BN_bntest_rand(a,40+i*10,0,0);
++ a->neg=rand_neg();
++ BN_sqr(c,a,ctx);
+ if (bp != NULL)
+ {
+ if (!results)
+ {
+- BN_print(bp,&a);
++ BN_print(bp,a);
+ BIO_puts(bp," * ");
+- BN_print(bp,&a);
++ BN_print(bp,a);
+ BIO_puts(bp," - ");
+ }
+- BN_print(bp,&c);
++ BN_print(bp,c);
+ BIO_puts(bp,"\n");
+ }
+- BN_div(&d,&e,&c,&a,ctx);
+- BN_sub(&d,&d,&a);
+- if(!BN_is_zero(&d) || !BN_is_zero(&e))
+- {
+- fprintf(stderr,"Square test failed!\n");
+- return 0;
+- }
++ BN_div(d,e,c,a,ctx);
++ BN_sub(d,d,a);
++ if(!BN_is_zero(d) || !BN_is_zero(e))
++ {
++ fprintf(stderr,"Square test failed!\n");
++ goto err;
++ }
+ }
+- BN_free(&a);
+- BN_free(&c);
+- BN_free(&d);
+- BN_free(&e);
+- return(1);
++
++ /* Regression test for a BN_sqr overflow bug. */
++ BN_hex2bn(&a,
++ "80000000000000008000000000000001FFFFFFFFFFFFFFFE0000000000000000");
++ BN_sqr(c, a, ctx);
++ if (bp != NULL)
++ {
++ if (!results)
++ {
++ BN_print(bp,a);
++ BIO_puts(bp," * ");
++ BN_print(bp,a);
++ BIO_puts(bp," - ");
++ }
++ BN_print(bp,c);
++ BIO_puts(bp,"\n");
++ }
++ BN_mul(d, a, a, ctx);
++ if (BN_cmp(c, d))
++ {
++ fprintf(stderr, "Square test failed: BN_sqr and BN_mul produce "
++ "different results!\n");
++ goto err;
++ }
++
++ /* Regression test for a BN_sqr overflow bug. */
++ BN_hex2bn(&a,
++ "80000000000000000000000080000001FFFFFFFE000000000000000000000000");
++ BN_sqr(c, a, ctx);
++ if (bp != NULL)
++ {
++ if (!results)
++ {
++ BN_print(bp,a);
++ BIO_puts(bp," * ");
++ BN_print(bp,a);
++ BIO_puts(bp," - ");
++ }
++ BN_print(bp,c);
++ BIO_puts(bp,"\n");
++ }
++ BN_mul(d, a, a, ctx);
++ if (BN_cmp(c, d))
++ {
++ fprintf(stderr, "Square test failed: BN_sqr and BN_mul produce "
++ "different results!\n");
++ goto err;
++ }
++ ret = 1;
++err:
++ if (a != NULL) BN_free(a);
++ if (c != NULL) BN_free(c);
++ if (d != NULL) BN_free(d);
++ if (e != NULL) BN_free(e);
++ return ret;
+ }
+
+ int test_mont(BIO *bp, BN_CTX *ctx)
+--
+2.1.4
+
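[The x86_64 and generic C hunks in the patch above all make the same change to
mul_add_c2: instead of doubling t1 and t0 up front, where the doubling itself
can overflow and silently drop a carry (the BN_sqr miscalculation behind
CVE-2014-3570), the fixed macros add the low and high product words into
(c0,c1,c2) twice, relying on the invariant stated in the new comment. A minimal
standalone check of that invariant, illustrative only and not part of the
patch, assuming a compiler that provides unsigned __int128:

/*
 * Sketch, not OpenSSL code: for 64-bit a and b the high half of a*b
 * is at most 2^64 - 2 (it peaks at a = b = 2^64 - 1, since
 * (2^64-1)^2 = 2^128 - 2^65 + 1), so it can never be all-ones and
 * absorbing a carry of one into it can never wrap.
 */
#include <assert.h>
#include <stdint.h>

int main(void)
{
    uint64_t a = UINT64_MAX, b = UINT64_MAX;    /* worst case */
    unsigned __int128 p = (unsigned __int128)a * b;
    uint64_t hi = (uint64_t)(p >> 64);

    assert(hi <= UINT64_MAX - 1);   /* high part is never all-ones */
    return 0;
}

The two regression vectors added to test_sqr in bntest.c exercise exactly the
inputs where the old pre-doubling lost that carry, by checking that BN_sqr and
BN_mul agree.]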
Added: openssl/branches/wheezy/debian/patches/0109-Fix-crash-in-dtls1_get_record-whilst-in-the-listen-s.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0109-Fix-crash-in-dtls1_get_record-whilst-in-the-listen-s.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0109-Fix-crash-in-dtls1_get_record-whilst-in-the-listen-s.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,44 @@
+From 8d7aab986b499f34d9e1bc58fbfd77f05c38116e Mon Sep 17 00:00:00 2001
+From: "Dr. Stephen Henson" <steve at openssl.org>
+Date: Sat, 3 Jan 2015 00:45:13 +0000
+Subject: [PATCH 109/117] Fix crash in dtls1_get_record whilst in the listen
+ state where you get two separate reads performed - one for the header and one
+ for the body of the handshake record.
+
+CVE-2014-3571
+
+Reviewed-by: Matt Caswell <matt at openssl.org>
+---
+ ssl/d1_pkt.c | 2 --
+ ssl/s3_pkt.c | 2 ++
+ 2 files changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/ssl/d1_pkt.c b/ssl/d1_pkt.c
+index edd17df..d717260 100644
+--- a/ssl/d1_pkt.c
++++ b/ssl/d1_pkt.c
+@@ -642,8 +642,6 @@ again:
+ /* now s->packet_length == DTLS1_RT_HEADER_LENGTH */
+ i=rr->length;
+ n=ssl3_read_n(s,i,i,1);
+- if (n <= 0) return(n); /* error or non-blocking io */
+-
+ /* this packet contained a partial record, dump it */
+ if ( n != i)
+ {
+diff --git a/ssl/s3_pkt.c b/ssl/s3_pkt.c
+index d1cd752..1ec9e6e 100644
+--- a/ssl/s3_pkt.c
++++ b/ssl/s3_pkt.c
+@@ -183,6 +183,8 @@ int ssl3_read_n(SSL *s, int n, int max, int extend)
+ * at once (as long as it fits into the buffer). */
+ if (SSL_version(s) == DTLS1_VERSION || SSL_version(s) == DTLS1_BAD_VER)
+ {
++ if (left == 0 && extend)
++ return 0;
+ if (left > 0 && n > left)
+ n = left;
+ }
+--
+2.1.4
+
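[The guard added to ssl3_read_n above closes the window in the DTLS listen
path where the header and body of a handshake record are fetched by two
separate reads: if the second, extending read finds nothing buffered it now
returns cleanly instead of letting dtls1_get_record continue with a record
that was never filled in. A self-contained sketch of that two-phase read,
illustrative only (read_n below is a stand-in, not the real ssl3_read_n):

#include <stdio.h>
#include <stddef.h>

#define DTLS1_RT_HEADER_LENGTH 13

static size_t left;          /* bytes currently buffered */

static int read_n(size_t n, int extend)
{
    if (left == 0 && extend)
        return 0;            /* the guard added for CVE-2014-3571 */
    if (n > left)
        n = left;
    left -= n;
    return (int)n;
}

int main(void)
{
    left = DTLS1_RT_HEADER_LENGTH;     /* only the header arrived */
    printf("header read: %d\n", read_n(DTLS1_RT_HEADER_LENGTH, 0));
    printf("body read:   %d\n", read_n(100, 1));  /* 0, not a crash */
    return 0;
}]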
Added: openssl/branches/wheezy/debian/patches/0110-Follow-on-from-CVE-2014-3571.-This-fixes-the-code-th.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0110-Follow-on-from-CVE-2014-3571.-This-fixes-the-code-th.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0110-Follow-on-from-CVE-2014-3571.-This-fixes-the-code-th.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,31 @@
+From 45fe66b8ba026186aa5d8ef1e0e6010ea74d5c0b Mon Sep 17 00:00:00 2001
+From: Matt Caswell <matt at openssl.org>
+Date: Sat, 3 Jan 2015 00:54:35 +0000
+Subject: [PATCH 110/117] Follow on from CVE-2014-3571. This fixes the code
+ that was the original source of the crash due to p being NULL. Steve's fix
+prevents this situation from occurring - however this is by no means obvious
+ by looking at the code for dtls1_get_record. This fix just makes things look
+ a bit more sane.
+
+Reviewed-by: Dr Steve Henson <steve at openssl.org>
+---
+ ssl/d1_pkt.c | 3 ++-
+ 1 file changed, 2 insertions(+), 1 deletion(-)
+
+diff --git a/ssl/d1_pkt.c b/ssl/d1_pkt.c
+index d717260..73ce488 100644
+--- a/ssl/d1_pkt.c
++++ b/ssl/d1_pkt.c
+@@ -676,7 +676,8 @@ again:
+ * would be dropped unnecessarily.
+ */
+ if (!(s->d1->listen && rr->type == SSL3_RT_HANDSHAKE &&
+- *p == SSL3_MT_CLIENT_HELLO) &&
++ s->packet_length > DTLS1_RT_HEADER_LENGTH &&
++ s->packet[DTLS1_RT_HEADER_LENGTH] == SSL3_MT_CLIENT_HELLO) &&
+ !dtls1_record_replay_check(s, bitmap))
+ {
+ rr->length = 0;
+--
+2.1.4
+
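[The replacement hunk above stops dereferencing p, the first byte of the
record body, before checking that the body was actually read: the replay-check
exemption for buffered ClientHellos now tests packet_length first. A hedged
sketch of the bounds-checked peek, using the same constants
(DTLS1_RT_HEADER_LENGTH is 13 and SSL3_MT_CLIENT_HELLO is 1 in OpenSSL's
headers):

#include <stdio.h>
#include <stddef.h>

#define DTLS1_RT_HEADER_LENGTH 13
#define SSL3_MT_CLIENT_HELLO   1

/* Only peek at the first handshake byte if the body actually arrived. */
static int looks_like_client_hello(const unsigned char *pkt, size_t len)
{
    return len > DTLS1_RT_HEADER_LENGTH
        && pkt[DTLS1_RT_HEADER_LENGTH] == SSL3_MT_CLIENT_HELLO;
}

int main(void)
{
    unsigned char header_only[DTLS1_RT_HEADER_LENGTH] = {0};
    printf("%d\n", looks_like_client_hello(header_only,
                                           sizeof(header_only)));
    return 0;
}]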
Added: openssl/branches/wheezy/debian/patches/0111-Unauthenticated-DH-client-certificate-fix.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0111-Unauthenticated-DH-client-certificate-fix.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0111-Unauthenticated-DH-client-certificate-fix.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,38 @@
+From 98a0f9660d374f58f79ee0efcc8c1672a805e8e8 Mon Sep 17 00:00:00 2001
+From: "Dr. Stephen Henson" <steve at openssl.org>
+Date: Thu, 23 Oct 2014 20:36:17 +0100
+Subject: [PATCH 111/117] Unauthenticated DH client certificate fix.
+
+Fix to prevent use of DH client certificates without sending
+certificate verify message.
+
+If we've used a client certificate to generate the premaster secret
+ssl3_get_client_key_exchange returns 2 and ssl3_get_cert_verify is
+never called.
+
+We can only skip the certificate verify message in
+ssl3_get_cert_verify if the client didn't send a certificate.
+
+Thanks to Karthikeyan Bhargavan for reporting this issue.
+CVE-2015-0205
+Reviewed-by: Matt Caswell <matt at openssl.org>
+---
+ ssl/s3_srvr.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/ssl/s3_srvr.c b/ssl/s3_srvr.c
+index d883f86..fadca74 100644
+--- a/ssl/s3_srvr.c
++++ b/ssl/s3_srvr.c
+@@ -3014,7 +3014,7 @@ int ssl3_get_cert_verify(SSL *s)
+ if (s->s3->tmp.message_type != SSL3_MT_CERTIFICATE_VERIFY)
+ {
+ s->s3->tmp.reuse_message=1;
+- if ((peer != NULL) && (type & EVP_PKT_SIGN))
++ if (peer != NULL)
+ {
+ al=SSL_AD_UNEXPECTED_MESSAGE;
+ SSLerr(SSL_F_SSL3_GET_CERT_VERIFY,SSL_R_MISSING_VERIFY_MESSAGE);
+--
+2.1.4
+
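[The one-line change above is the whole of the CVE-2015-0205 fix: a missing
CertificateVerify used to be fatal only when the client's key was
signing-capable, so a client presenting a fixed-DH certificate was treated as
authenticated without ever proving possession of the private key. An
illustrative predicate, not OpenSSL code, showing the before and after
behaviour:

#include <stdbool.h>
#include <stdio.h>

static bool verify_missing_is_fatal(bool peer_cert_present,
                                    bool key_can_sign, bool fixed)
{
    if (fixed)
        return peer_cert_present;                /* after the fix  */
    return peer_cert_present && key_can_sign;    /* before the fix */
}

int main(void)
{
    /* fixed-DH client certificate: key agreement only, cannot sign */
    printf("pre-fix:  fatal=%d\n",
           verify_missing_is_fatal(true, false, false));   /* 0: skipped */
    printf("post-fix: fatal=%d\n",
           verify_missing_is_fatal(true, false, true));    /* 1: rejected */
    return 0;
}]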
Added: openssl/branches/wheezy/debian/patches/0112-A-memory-leak-can-occur-in-dtls1_buffer_record-if-ei.patch
===================================================================
--- openssl/branches/wheezy/debian/patches/0112-A-memory-leak-can-occur-in-dtls1_buffer_record-if-ei.patch (rev 0)
+++ openssl/branches/wheezy/debian/patches/0112-A-memory-leak-can-occur-in-dtls1_buffer_record-if-ei.patch 2015-01-08 20:36:40 UTC (rev 708)
@@ -0,0 +1,129 @@
+From 04685bc949e90a877656cf5020b6d4f90a9636a6 Mon Sep 17 00:00:00 2001
+From: Matt Caswell <matt at openssl.org>
+Date: Wed, 7 Jan 2015 14:18:13 +0000
+Subject: [PATCH 112/117] A memory leak can occur in dtls1_buffer_record if
+ either of the calls to ssl3_setup_buffers or pqueue_insert fail. The former
+ will fail if there is a malloc failure, whilst the latter will fail if
+ attempting to add a duplicate record to the queue. This should never happen
+ because duplicate records should be detected and dropped before any attempt
+ to add them to the queue. Unfortunately records that arrive that are for the
+ next epoch are not being recorded correctly, and therefore replays are not
+ being detected. Additionally, these "should not happen" failures that can
+ occur in dtls1_buffer_record are not being treated as fatal and therefore an
+ attacker could exploit this by sending repeated replay records for the next
+ epoch, eventually causing a DoS through memory exhaustion.
+
+Thanks to Chris Mueller for reporting this issue and providing initial
+analysis and a patch. Further analysis and the final patch were performed by
+Matt Caswell from the OpenSSL development team.
+
+CVE-2015-0206
+
+Reviewed-by: Dr Stephen Henson <steve at openssl.org>
+---
+ ssl/d1_pkt.c | 30 +++++++++++++++++++++---------
+ 1 file changed, 21 insertions(+), 9 deletions(-)
+
+diff --git a/ssl/d1_pkt.c b/ssl/d1_pkt.c
+index 73ce488..0059fe2 100644
+--- a/ssl/d1_pkt.c
++++ b/ssl/d1_pkt.c
+@@ -212,7 +212,7 @@ dtls1_buffer_record(SSL *s, record_pqueue *queue, unsigned char *priority)
+ /* Limit the size of the queue to prevent DOS attacks */
+ if (pqueue_size(queue->q) >= 100)
+ return 0;
+-
++
+ rdata = OPENSSL_malloc(sizeof(DTLS1_RECORD_DATA));
+ item = pitem_new(priority, rdata);
+ if (rdata == NULL || item == NULL)
+@@ -247,18 +247,22 @@ dtls1_buffer_record(SSL *s, record_pqueue *queue, unsigned char *priority)
+ if (!ssl3_setup_buffers(s))
+ {
+ SSLerr(SSL_F_DTLS1_BUFFER_RECORD, ERR_R_INTERNAL_ERROR);
++ if (rdata->rbuf.buf != NULL)
++ OPENSSL_free(rdata->rbuf.buf);
+ OPENSSL_free(rdata);
+ pitem_free(item);
+- return(0);
++ return(-1);
+ }
+
+ /* insert should not fail, since duplicates are dropped */
+ if (pqueue_insert(queue->q, item) == NULL)
+ {
+ SSLerr(SSL_F_DTLS1_BUFFER_RECORD, ERR_R_INTERNAL_ERROR);
++ if (rdata->rbuf.buf != NULL)
++ OPENSSL_free(rdata->rbuf.buf);
+ OPENSSL_free(rdata);
+ pitem_free(item);
+- return(0);
++ return(-1);
+ }
+
+ return(1);
+@@ -314,8 +318,9 @@ dtls1_process_buffered_records(SSL *s)
+ dtls1_get_unprocessed_record(s);
+ if ( ! dtls1_process_record(s))
+ return(0);
+- dtls1_buffer_record(s, &(s->d1->processed_rcds),
+- s->s3->rrec.seq_num);
++ if(dtls1_buffer_record(s, &(s->d1->processed_rcds),
++ s->s3->rrec.seq_num)<0)
++ return -1;
+ }
+ }
+
+@@ -530,7 +535,6 @@ printf("\n");
+
+ /* we have pulled in a full packet so zero things */
+ s->packet_length=0;
+- dtls1_record_bitmap_update(s, &(s->d1->bitmap));/* Mark receipt of record. */
+ return(1);
+
+ f_err:
+@@ -563,7 +567,8 @@ int dtls1_get_record(SSL *s)
+
+ /* The epoch may have changed. If so, process all the
+ * pending records. This is a non-blocking operation. */
+- dtls1_process_buffered_records(s);
++ if(dtls1_process_buffered_records(s)<0)
++ return -1;
+
+ /* if we're renegotiating, then there may be buffered records */
+ if (dtls1_get_processed_record(s))
+@@ -700,7 +705,9 @@ again:
+ {
+ if ((SSL_in_init(s) || s->in_handshake) && !s->d1->listen)
+ {
+- dtls1_buffer_record(s, &(s->d1->unprocessed_rcds), rr->seq_num);
++ if(dtls1_buffer_record(s, &(s->d1->unprocessed_rcds), rr->seq_num)<0)
++ return -1;
++ dtls1_record_bitmap_update(s, bitmap);/* Mark receipt of record. */
+ }
+ rr->length = 0;
+ s->packet_length = 0;
+@@ -713,6 +720,7 @@ again:
+ s->packet_length = 0; /* dump this record */
+ goto again; /* get another record */
+ }
++ dtls1_record_bitmap_update(s, bitmap);/* Mark receipt of record. */
+
+ return(1);
+
+@@ -864,7 +872,11 @@ start:
+ * buffer the application data for later processing rather
+ * than dropping the connection.
+ */
+- dtls1_buffer_record(s, &(s->d1->buffered_app_data), rr->seq_num);
++ if(dtls1_buffer_record(s, &(s->d1->buffered_app_data), rr->seq_num)<0)
++ {
++ SSLerr(SSL_F_DTLS1_READ_BYTES, ERR_R_INTERNAL_ERROR);
++ return -1;
++ }
+ rr->length = 0;
+ goto start;
+ }
+--
+2.1.4
+
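[The patch changes dtls1_buffer_record's failure paths in two ways: the copied
record buffer (rdata->rbuf.buf) is now freed alongside rdata and item, and the
function returns -1 so callers treat the failure as fatal, rather than letting
an attacker replay next-epoch records until memory is exhausted. A simplified
ownership sketch of the corrected cleanup, illustrative only:

#include <stdio.h>
#include <stdlib.h>

struct rdata { void *buf; };

/* stands in for pqueue_insert() rejecting a duplicate record */
static int insert_fails = 1;

static int buffer_record(struct rdata **out)
{
    struct rdata *rdata = malloc(sizeof(*rdata));
    if (rdata == NULL)
        return -1;
    rdata->buf = malloc(1024);          /* the copied record payload */
    if (rdata->buf == NULL) {
        free(rdata);
        return -1;
    }
    if (insert_fails) {
        free(rdata->buf);               /* the leak the patch plugs */
        free(rdata);
        return -1;                      /* fatal, not "drop and retry" */
    }
    *out = rdata;
    return 1;
}

int main(void)
{
    struct rdata *r = NULL;
    int rc = buffer_record(&r);
    printf("buffer_record: %d\n", rc);
    if (rc > 0) {
        free(r->buf);
        free(r);
    }
    return 0;
}]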
Modified: openssl/branches/wheezy/debian/patches/series
===================================================================
--- openssl/branches/wheezy/debian/patches/series 2015-01-08 19:56:24 UTC (rev 707)
+++ openssl/branches/wheezy/debian/patches/series 2015-01-08 20:36:40 UTC (rev 708)
@@ -72,5 +72,15 @@
Fix-for-SRTP-Memory-Leak.patch
Fix-for-session-tickets-memory-leak.patch
Fix-no-ssl3-configuration-option.patch
-disable_sslv3.patch
+#disable_sslv3.patch
Keep-old-method-in-case-of-an-unsupported-protocol.patch
+0094-Fix-various-certificate-fingerprint-issues.patch
+0095-Constify-ASN1_TYPE_cmp-add-X509_ALGOR_cmp.patch
+0098-ECDH-downgrade-bug-fix.patch
+0099-Only-allow-ephemeral-RSA-keys-in-export-ciphersuites.patch
+0108-Fix-for-CVE-2014-3570.patch
+0109-Fix-crash-in-dtls1_get_record-whilst-in-the-listen-s.patch
+0110-Follow-on-from-CVE-2014-3571.-This-fixes-the-code-th.patch
+0111-Unauthenticated-DH-client-certificate-fix.patch
+0112-A-memory-leak-can-occur-in-dtls1_buffer_record-if-ei.patch
+