[med-svn] [Git][med-team/hnswlib][master] 5 commits: New upstream version 0.6.2

Andreas Tille (@tille) gitlab at salsa.debian.org
Thu Feb 17 19:44:47 GMT 2022



Andreas Tille pushed to branch master at Debian Med / hnswlib


Commits:
e3dd016a by Andreas Tille at 2022-02-17T20:35:16+01:00
New upstream version 0.6.2
- - - - -
5c7ad126 by Andreas Tille at 2022-02-17T20:35:16+01:00
routine-update: New upstream version

- - - - -
fa4dba80 by Andreas Tille at 2022-02-17T20:35:16+01:00
Update upstream source from tag 'upstream/0.6.2'

Update to upstream version '0.6.2'
with Debian dir da65639751c484f332053c92b25099c75c32d9a8
- - - - -
99610be3 by Andreas Tille at 2022-02-17T20:38:35+01:00
Update patches

- - - - -
30264080 by Andreas Tille at 2022-02-17T20:42:47+01:00
Upload to unstable

- - - - -


13 changed files:

- README.md
- debian/changelog
- debian/patches/do-not-use-native-flags.patch
- debian/patches/use-shared-while-linking.patch
- debian/rules
- examples/git_tester.py
- hnswlib/hnswlib.h
- hnswlib/space_ip.h
- hnswlib/space_l2.h
- + python_bindings/LazyIndex.py
- python_bindings/bindings.cpp
- python_bindings/tests/bindings_test_getdata.py
- setup.py


Changes:

=====================================
README.md
=====================================
@@ -3,7 +3,20 @@ Header-only C++ HNSW implementation with python bindings.
 
 **NEWS:**
 
-**version 0.6** 
+
+**version 0.6.2** 
+
+* Fixed a bug in saving large pickles: pickles larger than 4 GB could have been corrupted. Thanks to Kai Wohlfahrt for reporting.
+* Thanks to ([@GuyAv46](https://github.com/GuyAv46)) hnswlib inner product is now more consistent across architectures (SSE, AVX, etc).
+* 
+
+**version 0.6.1** 
+
+* Thanks to ([@tony-kuo](https://github.com/tony-kuo)) hnswlib AVX512 and AVX builds are now backwards-compatible with older SSE and non-AVX512 architectures.
+* Thanks to ([@psobot](https://github.com/psobot)) there is now a sensible message instead of a segfault when passing a scalar to get_items.
+* Thanks to ([@urigoren](https://github.com/urigoren)) hnswlib has a lazy index creation python wrapper.
+
+**version 0.6.0** 
 * Thanks to ([@dyashuni](https://github.com/dyashuni)) hnswlib now uses github actions for CI, there is a search speedup in some scenarios with deletions. `unmark_deleted(label)` is now also a part of the python interface (note now it throws an exception for double deletions). 
 * Thanks to ([@slice4e](https://github.com/slice4e)) we now support AVX512; thanks to ([@LTLA](https://github.com/LTLA)) the cmake interface for the lib is now updated. 
 * Thanks to ([@alonre24](https://github.com/alonre24)) we now have Python bindings for brute-force search (and examples for recall tuning: [TESTING_RECALL.md](TESTING_RECALL.md)).
@@ -228,6 +241,9 @@ or you can install via pip:
 
 
 ### For developers 
+Contributions are highly welcome!
+
+Please make pull requests against the `develop` branch.
 
 When making changes please run tests (and please add a test to `python_bindings/tests` in case there is new functionality):
 ```bash
@@ -252,10 +268,6 @@ https://github.com/dbaranchuk/ivf-hnsw
 * .Net implementation: https://github.com/microsoft/HNSW.Net
 * CUDA implementation: https://github.com/js1010/cuhnsw
 
-### Contributing to the repository
-Contributions are highly welcome!
-
-Please make pull requests against the `develop` branch.
 
 ### 200M SIFT test reproduction 
 To download and extract the bigann dataset (from root directory):


=====================================
debian/changelog
=====================================
@@ -1,3 +1,10 @@
+hnswlib (0.6.2-1) unstable; urgency=medium
+
+  * Team upload.
+  * New upstream version
+
+ -- Andreas Tille <tille at debian.org>  Thu, 17 Feb 2022 20:38:57 +0100
+
 hnswlib (0.6.0-1) unstable; urgency=medium
 
   * Team upload.


=====================================
debian/patches/do-not-use-native-flags.patch
=====================================
@@ -20,14 +20,3 @@ Last-Update: 2020-11-11
      endif()
  
      add_executable(test_updates examples/updates_test.cpp)
---- a/setup.py
-+++ b/setup.py
-@@ -74,7 +74,7 @@ class BuildExt(build_ext):
-     """A custom build extension for adding compiler-specific options."""
-     c_opts = {
-         'msvc': ['/EHsc', '/openmp', '/O2'],
--        'unix': ['-O3', '-march=native'],  # , '-w'
-+        'unix': ['-O3'],  # , '-w'
-     }
-     link_opts = {
-         'unix': [],


=====================================
debian/patches/use-shared-while-linking.patch
=====================================
@@ -3,7 +3,7 @@ Description: Enable "-shared" while linking
 Last-Changed: September 7, 2020
 --- a/setup.py
 +++ b/setup.py
-@@ -86,7 +86,7 @@
+@@ -93,7 +93,7 @@ class BuildExt(build_ext):
          link_opts['unix'] += ['-stdlib=libc++', '-mmacosx-version-min=10.7']
      else:
          c_opts['unix'].append("-fopenmp")


=====================================
debian/rules
=====================================
@@ -5,6 +5,9 @@ export LC_ALL=C.UTF-8
 PYBUILD_NAME=hnswlib
 PYBUILD_SYSTEM=pybuild
 
+# avoid -march=native to respect baseline
+export HNSWLIB_NO_NATIVE=yes
+
 include /usr/share/dpkg/default.mk
 export DEB_BUILD_MAINT_OPTIONS = hardening=+all
 


=====================================
examples/git_tester.py
=====================================
@@ -1,16 +1,34 @@
 from pydriller import Repository
 import os 
 import datetime
-os.system("cp examples/speedtest.py examples/speedtest2.py")
-for commit in Repository('.', from_tag="v0.5.2").traverse_commits():
-    print(commit.hash)
-    print(commit.msg)
+os.system("cp examples/speedtest.py examples/speedtest2.py") # the file has to be outside of git
+for idx, commit in enumerate(Repository('.', from_tag="v0.6.0").traverse_commits()):    
+    name=commit.msg.replace('\n', ' ').replace('\r', ' ')
+    print(idx, commit.hash, name)
+
+
+
+for commit in Repository('.', from_tag="v0.6.0").traverse_commits():
+    
+    name=commit.msg.replace('\n', ' ').replace('\r', ' ')
+    print(commit.hash, name)
     
     os.system(f"git checkout {commit.hash}; rm -rf build; ")
-    os.system("python -m pip install .")
-    os.system(f'python examples/speedtest2.py -n "{commit.msg}" -d 4 -t 1')
-    os.system(f'python examples/speedtest2.py -n "{commit.msg}" -d 64 -t 1')
-    os.system(f'python examples/speedtest2.py -n "{commit.msg}" -d 128 -t 1')
-    os.system(f'python examples/speedtest2.py -n "{commit.msg}" -d 4 -t 24')
-    os.system(f'python examples/speedtest2.py -n "{commit.msg}" -d 128 -t 24')
+    print("\n\n--------------------\n\n")
+    ret=os.system("python -m pip install .")
+    print(ret)
+    
+    if ret != 0:
+        print ("build failed!!!!")
+        print ("build failed!!!!")
+        print ("build failed!!!!")
+        print ("build failed!!!!")
+        continue    
+    
+    os.system(f'python examples/speedtest2.py -n "{name}" -d 4 -t 1')
+    os.system(f'python examples/speedtest2.py -n "{name}" -d 64 -t 1')
+    os.system(f'python examples/speedtest2.py -n "{name}" -d 128 -t 1')
+    os.system(f'python examples/speedtest2.py -n "{name}" -d 4 -t 24')
+    os.system(f'python examples/speedtest2.py -n "{name}" -d 128 -t 24')
+
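A note on the benchmark loop above: it replaces line breaks in the commit message but still interpolates the result into a double-quoted shell argument, so a message containing a double quote or `$` could presumably still break the command. A minimal sketch of one way to harden this, using the standard-library `shlex.quote`; the helper name `speedtest_command` is hypothetical and not part of the upstream script:

```python
import shlex


def speedtest_command(name, dim, threads):
    """Build the speedtest command line with the run name safely quoted.

    Hypothetical helper, not part of examples/git_tester.py.
    """
    # Collapse line breaks the same way the upstream script does.
    flat = name.replace('\n', ' ').replace('\r', ' ')
    # shlex.quote makes the string safe to embed in a POSIX shell command.
    return (
        f"python examples/speedtest2.py -n {shlex.quote(flat)} "
        f"-d {dim} -t {threads}"
    )


if __name__ == "__main__":
    print(speedtest_command('Fix "large" pickles\nsecond line', 128, 24))
```

The returned string could be passed to `os.system` in place of the hand-built f-strings; passing an argument list to `subprocess.run` would avoid the shell entirely.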
 


=====================================
hnswlib/hnswlib.h
=====================================
@@ -15,8 +15,25 @@
 #ifdef _MSC_VER
 #include <intrin.h>
 #include <stdexcept>
+#include "cpu_x86.h"
+void cpu_x86::cpuid(int32_t out[4], int32_t eax, int32_t ecx) {
+    __cpuidex(out, eax, ecx);
+}
+__int64 xgetbv(unsigned int x) {
+    return _xgetbv(x);
+}
 #else
 #include <x86intrin.h>
+#include <cpuid.h>
+#include <stdint.h>
+void cpuid(int32_t cpuInfo[4], int32_t eax, int32_t ecx) {
+    __cpuid_count(eax, ecx, cpuInfo[0], cpuInfo[1], cpuInfo[2], cpuInfo[3]);
+}
+uint64_t xgetbv(unsigned int index) {
+    uint32_t eax, edx;
+    __asm__ __volatile__("xgetbv" : "=a"(eax), "=d"(edx) : "c"(index));
+    return ((uint64_t)edx << 32) | eax;
+}
 #endif
 
 #if defined(USE_AVX512)
@@ -30,6 +47,65 @@
 #define PORTABLE_ALIGN32 __declspec(align(32))
 #define PORTABLE_ALIGN64 __declspec(align(64))
 #endif
+
+// Adapted from https://github.com/Mysticial/FeatureDetector
+#define _XCR_XFEATURE_ENABLED_MASK  0
+
+bool AVXCapable() {
+    int cpuInfo[4];
+
+    // CPU support
+    cpuid(cpuInfo, 0, 0);
+    int nIds = cpuInfo[0];
+
+    bool HW_AVX = false;
+    if (nIds >= 0x00000001) {
+        cpuid(cpuInfo, 0x00000001, 0);
+        HW_AVX = (cpuInfo[2] & ((int)1 << 28)) != 0;
+    }
+
+    // OS support
+    cpuid(cpuInfo, 1, 0);
+
+    bool osUsesXSAVE_XRSTORE = (cpuInfo[2] & (1 << 27)) != 0;
+    bool cpuAVXSuport = (cpuInfo[2] & (1 << 28)) != 0;
+
+    bool avxSupported = false;
+    if (osUsesXSAVE_XRSTORE && cpuAVXSuport) {
+        uint64_t xcrFeatureMask = xgetbv(_XCR_XFEATURE_ENABLED_MASK);
+        avxSupported = (xcrFeatureMask & 0x6) == 0x6;
+    }
+    return HW_AVX && avxSupported;
+}
+
+bool AVX512Capable() {
+    if (!AVXCapable()) return false;
+
+    int cpuInfo[4];
+
+    // CPU support
+    cpuid(cpuInfo, 0, 0);
+    int nIds = cpuInfo[0];
+
+    bool HW_AVX512F = false;
+    if (nIds >= 0x00000007) { //  AVX512 Foundation
+        cpuid(cpuInfo, 0x00000007, 0);
+        HW_AVX512F = (cpuInfo[1] & ((int)1 << 16)) != 0;
+    }
+
+    // OS support
+    cpuid(cpuInfo, 1, 0);
+
+    bool osUsesXSAVE_XRSTORE = (cpuInfo[2] & (1 << 27)) != 0;
+    bool cpuAVXSuport = (cpuInfo[2] & (1 << 28)) != 0;
+
+    bool avx512Supported = false;
+    if (osUsesXSAVE_XRSTORE && cpuAVXSuport) {
+        uint64_t xcrFeatureMask = xgetbv(_XCR_XFEATURE_ENABLED_MASK);
+        avx512Supported = (xcrFeatureMask & 0xe6) == 0xe6;
+    }
+    return HW_AVX512F && avx512Supported;
+}
 #endif
 
 #include <queue>
@@ -108,7 +184,6 @@ namespace hnswlib {
 
         return result;
     }
-
 }
 
 #include "space_l2.h"
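The capability checks added to hnswlib.h combine a CPU feature bit with an operating-system check on the XCR0 register. The bit logic can be sketched in Python as follows, assuming the raw register values are supplied by the caller; the function names are illustrative, and the sketch leaves out the maximum-leaf (`nIds`) guard that the C++ code performs:

```python
# Bit positions mirror the C++ checks in the hunk above.
OSXSAVE_BIT = 1 << 27      # CPUID leaf 1, ECX: OS uses XSAVE/XRSTOR
AVX_BIT = 1 << 28          # CPUID leaf 1, ECX: CPU supports AVX
AVX512F_BIT = 1 << 16      # CPUID leaf 7, EBX: AVX-512 Foundation
XCR0_AVX_MASK = 0x6        # XMM and YMM state enabled by the OS
XCR0_AVX512_MASK = 0xE6    # additionally opmask and ZMM state


def avx_capable(leaf1_ecx, xcr0):
    """True when both the CPU and the OS allow AVX instructions."""
    if not (leaf1_ecx & OSXSAVE_BIT and leaf1_ecx & AVX_BIT):
        return False
    return (xcr0 & XCR0_AVX_MASK) == XCR0_AVX_MASK


def avx512_capable(leaf1_ecx, leaf7_ebx, xcr0):
    """True when AVX is usable and AVX-512F is present and OS-enabled."""
    if not avx_capable(leaf1_ecx, xcr0):
        return False
    if not leaf7_ebx & AVX512F_BIT:
        return False
    return (xcr0 & XCR0_AVX512_MASK) == XCR0_AVX512_MASK
```

Checking XCR0 matters because a CPU may advertise AVX while the OS does not save the wider registers on context switches; in that case executing AVX instructions would fault.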


=====================================
hnswlib/space_ip.h
=====================================
@@ -10,15 +10,20 @@ namespace hnswlib {
         for (unsigned i = 0; i < qty; i++) {
             res += ((float *) pVect1)[i] * ((float *) pVect2)[i];
         }
-        return (1.0f - res);
+        return res;
 
     }
 
+    static float
+    InnerProductDistance(const void *pVect1, const void *pVect2, const void *qty_ptr) {
+        return 1.0f - InnerProduct(pVect1, pVect2, qty_ptr);
+    }
+
 #if defined(USE_AVX)
 
 // Favor using AVX if available.
     static float
-    InnerProductSIMD4Ext(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    InnerProductSIMD4ExtAVX(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         float PORTABLE_ALIGN32 TmpRes[8];
         float *pVect1 = (float *) pVect1v;
         float *pVect2 = (float *) pVect2v;
@@ -61,13 +66,20 @@ namespace hnswlib {
 
         _mm_store_ps(TmpRes, sum_prod);
         float sum = TmpRes[0] + TmpRes[1] + TmpRes[2] + TmpRes[3];;
-        return 1.0f - sum;
-}
+        return sum;
+    }
+    
+    static float
+    InnerProductDistanceSIMD4ExtAVX(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+        return 1.0f - InnerProductSIMD4ExtAVX(pVect1v, pVect2v, qty_ptr);
+    }
 
-#elif defined(USE_SSE)
+#endif
+
+#if defined(USE_SSE)
 
     static float
-    InnerProductSIMD4Ext(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    InnerProductSIMD4ExtSSE(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         float PORTABLE_ALIGN32 TmpRes[8];
         float *pVect1 = (float *) pVect1v;
         float *pVect2 = (float *) pVect2v;
@@ -119,7 +131,12 @@ namespace hnswlib {
         _mm_store_ps(TmpRes, sum_prod);
         float sum = TmpRes[0] + TmpRes[1] + TmpRes[2] + TmpRes[3];
 
-        return 1.0f - sum;
+        return sum;
+    }
+
+    static float
+    InnerProductDistanceSIMD4ExtSSE(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+        return 1.0f - InnerProductSIMD4ExtSSE(pVect1v, pVect2v, qty_ptr);
     }
 
 #endif
@@ -128,7 +145,7 @@ namespace hnswlib {
 #if defined(USE_AVX512)
 
     static float
-    InnerProductSIMD16Ext(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    InnerProductSIMD16ExtAVX512(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         float PORTABLE_ALIGN64 TmpRes[16];
         float *pVect1 = (float *) pVect1v;
         float *pVect2 = (float *) pVect2v;
@@ -154,13 +171,20 @@ namespace hnswlib {
         _mm512_store_ps(TmpRes, sum512);
         float sum = TmpRes[0] + TmpRes[1] + TmpRes[2] + TmpRes[3] + TmpRes[4] + TmpRes[5] + TmpRes[6] + TmpRes[7] + TmpRes[8] + TmpRes[9] + TmpRes[10] + TmpRes[11] + TmpRes[12] + TmpRes[13] + TmpRes[14] + TmpRes[15];
 
-        return 1.0f - sum;
+        return sum;
+    }
+
+    static float
+    InnerProductDistanceSIMD16ExtAVX512(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+        return 1.0f - InnerProductSIMD16ExtAVX512(pVect1v, pVect2v, qty_ptr);
     }
 
-#elif defined(USE_AVX)
+#endif
+
+#if defined(USE_AVX)
 
     static float
-    InnerProductSIMD16Ext(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    InnerProductSIMD16ExtAVX(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         float PORTABLE_ALIGN32 TmpRes[8];
         float *pVect1 = (float *) pVect1v;
         float *pVect2 = (float *) pVect2v;
@@ -192,13 +216,20 @@ namespace hnswlib {
         _mm256_store_ps(TmpRes, sum256);
         float sum = TmpRes[0] + TmpRes[1] + TmpRes[2] + TmpRes[3] + TmpRes[4] + TmpRes[5] + TmpRes[6] + TmpRes[7];
 
-        return 1.0f - sum;
+        return sum;
+    }
+
+    static float
+    InnerProductDistanceSIMD16ExtAVX(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+        return 1.0f - InnerProductSIMD16ExtAVX(pVect1v, pVect2v, qty_ptr);
     }
 
-#elif defined(USE_SSE)
+#endif
+
+#if defined(USE_SSE)
 
-      static float
-      InnerProductSIMD16Ext(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    static float
+    InnerProductSIMD16ExtSSE(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         float PORTABLE_ALIGN32 TmpRes[8];
         float *pVect1 = (float *) pVect1v;
         float *pVect2 = (float *) pVect2v;
@@ -239,14 +270,24 @@ namespace hnswlib {
         _mm_store_ps(TmpRes, sum_prod);
         float sum = TmpRes[0] + TmpRes[1] + TmpRes[2] + TmpRes[3];
 
-        return 1.0f - sum;
+        return sum;
+    }
+
+    static float
+    InnerProductDistanceSIMD16ExtSSE(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+        return 1.0f - InnerProductSIMD16ExtSSE(pVect1v, pVect2v, qty_ptr);
     }
 
 #endif
 
 #if defined(USE_SSE) || defined(USE_AVX) || defined(USE_AVX512)
+    DISTFUNC<float> InnerProductSIMD16Ext = InnerProductSIMD16ExtSSE;
+    DISTFUNC<float> InnerProductSIMD4Ext = InnerProductSIMD4ExtSSE;
+    DISTFUNC<float> InnerProductDistanceSIMD16Ext = InnerProductDistanceSIMD16ExtSSE;
+    DISTFUNC<float> InnerProductDistanceSIMD4Ext = InnerProductDistanceSIMD4ExtSSE;
+
     static float
-    InnerProductSIMD16ExtResiduals(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    InnerProductDistanceSIMD16ExtResiduals(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         size_t qty = *((size_t *) qty_ptr);
         size_t qty16 = qty >> 4 << 4;
         float res = InnerProductSIMD16Ext(pVect1v, pVect2v, &qty16);
@@ -255,11 +296,11 @@ namespace hnswlib {
 
         size_t qty_left = qty - qty16;
         float res_tail = InnerProduct(pVect1, pVect2, &qty_left);
-        return res + res_tail - 1.0f;
+        return 1.0f - (res + res_tail);
     }
 
     static float
-    InnerProductSIMD4ExtResiduals(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    InnerProductDistanceSIMD4ExtResiduals(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         size_t qty = *((size_t *) qty_ptr);
         size_t qty4 = qty >> 2 << 2;
 
@@ -270,7 +311,7 @@ namespace hnswlib {
         float *pVect2 = (float *) pVect2v + qty4;
         float res_tail = InnerProduct(pVect1, pVect2, &qty_left);
 
-        return res + res_tail - 1.0f;
+        return 1.0f - (res + res_tail);
     }
 #endif
 
@@ -281,16 +322,37 @@ namespace hnswlib {
         size_t dim_;
     public:
         InnerProductSpace(size_t dim) {
-            fstdistfunc_ = InnerProduct;
+            fstdistfunc_ = InnerProductDistance;
     #if defined(USE_AVX) || defined(USE_SSE) || defined(USE_AVX512)
+        #if defined(USE_AVX512)
+            if (AVX512Capable()) {
+                InnerProductSIMD16Ext = InnerProductSIMD16ExtAVX512;
+                InnerProductDistanceSIMD16Ext = InnerProductDistanceSIMD16ExtAVX512;
+            } else if (AVXCapable()) {
+                InnerProductSIMD16Ext = InnerProductSIMD16ExtAVX;
+                InnerProductDistanceSIMD16Ext = InnerProductDistanceSIMD16ExtAVX;
+            }
+        #elif defined(USE_AVX)
+            if (AVXCapable()) {
+                InnerProductSIMD16Ext = InnerProductSIMD16ExtAVX;
+                InnerProductDistanceSIMD16Ext = InnerProductDistanceSIMD16ExtAVX;
+            }
+        #endif
+        #if defined(USE_AVX)
+            if (AVXCapable()) {
+                InnerProductSIMD4Ext = InnerProductSIMD4ExtAVX;
+                InnerProductDistanceSIMD4Ext = InnerProductDistanceSIMD4ExtAVX;
+            }
+        #endif
+
             if (dim % 16 == 0)
-                fstdistfunc_ = InnerProductSIMD16Ext;
+                fstdistfunc_ = InnerProductDistanceSIMD16Ext;
             else if (dim % 4 == 0)
-                fstdistfunc_ = InnerProductSIMD4Ext;
+                fstdistfunc_ = InnerProductDistanceSIMD4Ext;
             else if (dim > 16)
-                fstdistfunc_ = InnerProductSIMD16ExtResiduals;
+                fstdistfunc_ = InnerProductDistanceSIMD16ExtResiduals;
             else if (dim > 4)
-                fstdistfunc_ = InnerProductSIMD4ExtResiduals;
+                fstdistfunc_ = InnerProductDistanceSIMD4ExtResiduals;
     #endif
             dim_ = dim;
             data_size_ = dim * sizeof(float);
@@ -311,5 +373,4 @@ namespace hnswlib {
     ~InnerProductSpace() {}
     };
 
-
 }
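The space_ip.h changes separate the raw inner product from the distance (1 minus the inner product) and keep the dimension-based kernel choice in the constructor. A rough Python sketch of that selection and of the residual split, with plain lists standing in for the SIMD kernels; the function names and returned labels are illustrative only:

```python
def select_kernel(dim):
    """Mirror the dimension checks in the InnerProductSpace constructor."""
    if dim % 16 == 0:
        return "SIMD16"
    if dim % 4 == 0:
        return "SIMD4"
    if dim > 16:
        return "SIMD16Residuals"
    if dim > 4:
        return "SIMD4Residuals"
    return "scalar"


def inner_product(a, b):
    """Plain inner product, standing in for any of the kernels."""
    return sum(x * y for x, y in zip(a, b))


def ip_distance_with_residuals(a, b, block=16):
    """Process whole blocks first, then the tail, then form 1 - sum."""
    head = len(a) // block * block          # same as qty >> 4 << 4 for block=16
    res = inner_product(a[:head], b[:head])
    res_tail = inner_product(a[head:], b[head:])
    return 1.0 - (res + res_tail)
```

Under the old helpers each partial result was already `1 - partial sum`, so `res + res_tail - 1.0f` evaluated to `(1 - h) + (1 - t) - 1 = 1 - (h + t)`, the same value the new expression computes from raw inner products.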


=====================================
hnswlib/space_l2.h
=====================================
@@ -23,7 +23,7 @@ namespace hnswlib {
 
     // Favor using AVX512 if available.
     static float
-    L2SqrSIMD16Ext(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    L2SqrSIMD16ExtAVX512(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         float *pVect1 = (float *) pVect1v;
         float *pVect2 = (float *) pVect2v;
         size_t qty = *((size_t *) qty_ptr);
@@ -52,12 +52,13 @@ namespace hnswlib {
 
         return (res);
 }
+#endif
 
-#elif defined(USE_AVX)
+#if defined(USE_AVX)
 
     // Favor using AVX if available.
     static float
-    L2SqrSIMD16Ext(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    L2SqrSIMD16ExtAVX(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         float *pVect1 = (float *) pVect1v;
         float *pVect2 = (float *) pVect2v;
         size_t qty = *((size_t *) qty_ptr);
@@ -89,10 +90,12 @@ namespace hnswlib {
         return TmpRes[0] + TmpRes[1] + TmpRes[2] + TmpRes[3] + TmpRes[4] + TmpRes[5] + TmpRes[6] + TmpRes[7];
     }
 
-#elif defined(USE_SSE)
+#endif
+
+#if defined(USE_SSE)
 
     static float
-    L2SqrSIMD16Ext(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
+    L2SqrSIMD16ExtSSE(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         float *pVect1 = (float *) pVect1v;
         float *pVect2 = (float *) pVect2v;
         size_t qty = *((size_t *) qty_ptr);
@@ -141,6 +144,8 @@ namespace hnswlib {
 #endif
 
 #if defined(USE_SSE) || defined(USE_AVX) || defined(USE_AVX512)
+    DISTFUNC<float> L2SqrSIMD16Ext = L2SqrSIMD16ExtSSE;
+
     static float
     L2SqrSIMD16ExtResiduals(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         size_t qty = *((size_t *) qty_ptr);
@@ -156,7 +161,7 @@ namespace hnswlib {
 #endif
 
 
-#ifdef USE_SSE
+#if defined(USE_SSE)
     static float
     L2SqrSIMD4Ext(const void *pVect1v, const void *pVect2v, const void *qty_ptr) {
         float PORTABLE_ALIGN32 TmpRes[8];
@@ -208,7 +213,17 @@ namespace hnswlib {
     public:
         L2Space(size_t dim) {
             fstdistfunc_ = L2Sqr;
-        #if defined(USE_SSE) || defined(USE_AVX) || defined(USE_AVX512)
+    #if defined(USE_SSE) || defined(USE_AVX) || defined(USE_AVX512)
+        #if defined(USE_AVX512)
+            if (AVX512Capable())
+                L2SqrSIMD16Ext = L2SqrSIMD16ExtAVX512;
+            else if (AVXCapable())
+                L2SqrSIMD16Ext = L2SqrSIMD16ExtAVX;
+        #elif defined(USE_AVX)
+            if (AVXCapable())
+                L2SqrSIMD16Ext = L2SqrSIMD16ExtAVX;
+        #endif
+
             if (dim % 16 == 0)
                 fstdistfunc_ = L2SqrSIMD16Ext;
             else if (dim % 4 == 0)
@@ -217,7 +232,7 @@ namespace hnswlib {
                 fstdistfunc_ = L2SqrSIMD16ExtResiduals;
             else if (dim > 4)
                 fstdistfunc_ = L2SqrSIMD4ExtResiduals;
-        #endif
+    #endif
             dim_ = dim;
             data_size_ = dim * sizeof(float);
         }


=====================================
python_bindings/LazyIndex.py
=====================================
@@ -0,0 +1,44 @@
+import hnswlib
+"""
+    A python wrapper for lazy indexing, preserves the same api as hnswlib.Index but initializes the index only after adding items for the first time with `add_items`.
+"""
+class LazyIndex(hnswlib.Index):
+    def __init__(self, space, dim,max_elements=1024, ef_construction=200, M=16):
+        super().__init__(space, dim)
+        self.init_max_elements=max_elements
+        self.init_ef_construction=ef_construction
+        self.init_M=M
+    def init_index(self, max_elements=0,M=0,ef_construction=0):
+        if max_elements>0:
+            self.init_max_elements=max_elements
+        if ef_construction>0:
+            self.init_ef_construction=ef_construction
+        if M>0:
+            self.init_M=M
+        super().init_index(self.init_max_elements, self.init_M, self.init_ef_construction)
+    def add_items(self, data, ids=None, num_threads=-1):
+        if self.max_elements==0:
+            self.init_index()
+        return super().add_items(data,ids, num_threads)
+    def get_items(self, ids=None):
+        if self.max_elements==0:
+            return []
+        return super().get_items(ids)
+    def knn_query(self, data,k=1, num_threads=-1):
+        if self.max_elements==0:
+            return [], []
+        return super().knn_query(data, k, num_threads)
+    def resize_index(self, size):
+        if self.max_elements==0:
+            return self.init_index(size)
+        else:
+            return super().resize_index(size)
+    def set_ef(self, ef):
+        if self.max_elements==0:
+            self.init_ef_construction=ef
+            return
+        super().set_ef(ef)
+    def get_max_elements(self):
+        return self.max_elements
+    def get_current_count(self):
+        return self.element_count
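The deferral pattern used by LazyIndex can be illustrated without hnswlib installed. The sketch below uses a hypothetical stand-in base class, `EagerStore`, in place of `hnswlib.Index`; only the idea of remembering the parameters and initializing on the first `add_items` call is taken from the file above:

```python
class EagerStore:
    """Hypothetical stand-in for a base class that needs explicit init."""

    def __init__(self):
        self.max_elements = 0
        self.items = []

    def init_index(self, max_elements):
        self.max_elements = max_elements

    def add_items(self, data):
        if len(self.items) + len(data) > self.max_elements:
            raise RuntimeError("capacity exceeded")
        self.items.extend(data)


class LazyStore(EagerStore):
    """Remembers init parameters and applies them on first use."""

    def __init__(self, max_elements=1024):
        super().__init__()
        self.init_max_elements = max_elements

    def init_index(self, max_elements=0):
        # A positive argument overrides the remembered default.
        if max_elements > 0:
            self.init_max_elements = max_elements
        super().init_index(self.init_max_elements)

    def add_items(self, data):
        # max_elements == 0 signals that init_index has not run yet.
        if self.max_elements == 0:
            self.init_index()
        return super().add_items(data)
```

With this shape, `LazyStore(max_elements=4).add_items([1, 2])` initializes the store on the fly instead of raising.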


=====================================
python_bindings/bindings.cpp
=====================================
@@ -260,11 +260,16 @@ public:
         if (!ids_.is_none()) {
             py::array_t < size_t, py::array::c_style | py::array::forcecast > items(ids_);
             auto ids_numpy = items.request();
-            std::vector<size_t> ids1(ids_numpy.shape[0]);
-            for (size_t i = 0; i < ids1.size(); i++) {
-                ids1[i] = items.data()[i];
+
+            if (ids_numpy.ndim == 0) {
+              throw std::invalid_argument("get_items accepts a list of indices and returns a list of vectors");
+            } else {
+              std::vector<size_t> ids1(ids_numpy.shape[0]);
+              for (size_t i = 0; i < ids1.size(); i++) {
+                  ids1[i] = items.data()[i];
+              }
+              ids.swap(ids1);
             }
-            ids.swap(ids1);
         }
 
         std::vector<std::vector<data_t>> data;
@@ -287,12 +292,12 @@ public:
     py::dict getAnnData() const { /* WARNING: Index::getAnnData is not thread-safe with Index::addItems */
       std::unique_lock <std::mutex> templock(appr_alg->global);
 
-      unsigned int level0_npy_size = appr_alg->cur_element_count * appr_alg->size_data_per_element_;
-      unsigned int link_npy_size = 0;
-      std::vector<unsigned int> link_npy_offsets(appr_alg->cur_element_count);
+      size_t level0_npy_size = appr_alg->cur_element_count * appr_alg->size_data_per_element_;
+      size_t link_npy_size = 0;
+      std::vector<size_t> link_npy_offsets(appr_alg->cur_element_count);
 
       for (size_t i = 0; i < appr_alg->cur_element_count; i++){
-        unsigned int linkListSize = appr_alg->element_levels_[i] > 0 ? appr_alg->size_links_per_element_ * appr_alg->element_levels_[i] : 0;
+        size_t linkListSize = appr_alg->element_levels_[i] > 0 ? appr_alg->size_links_per_element_ * appr_alg->element_levels_[i] : 0;
         link_npy_offsets[i]=link_npy_size;
         if (linkListSize)
           link_npy_size += linkListSize;
@@ -321,7 +326,7 @@ public:
       memcpy(element_levels_npy, appr_alg->element_levels_.data(), appr_alg->element_levels_.size() * sizeof(int));
 
       for (size_t i = 0; i < appr_alg->cur_element_count; i++){
-        unsigned int linkListSize = appr_alg->element_levels_[i] > 0 ? appr_alg->size_links_per_element_ * appr_alg->element_levels_[i] : 0;
+        size_t linkListSize = appr_alg->element_levels_[i] > 0 ? appr_alg->size_links_per_element_ * appr_alg->element_levels_[i] : 0;
         if (linkListSize){
           memcpy(link_list_npy+link_npy_offsets[i], appr_alg->linkLists_[i], linkListSize);
         }
@@ -495,11 +500,11 @@ public:
 
       memcpy(appr_alg->element_levels_.data(), element_levels_npy.data(), element_levels_npy.nbytes());
 
-      unsigned int link_npy_size = 0;
-      std::vector<unsigned int> link_npy_offsets(appr_alg->cur_element_count);
+      size_t link_npy_size = 0;
+      std::vector<size_t> link_npy_offsets(appr_alg->cur_element_count);
 
       for (size_t i = 0; i < appr_alg->cur_element_count; i++){
-        unsigned int linkListSize = appr_alg->element_levels_[i] > 0 ? appr_alg->size_links_per_element_ * appr_alg->element_levels_[i] : 0;
+        size_t linkListSize = appr_alg->element_levels_[i] > 0 ? appr_alg->size_links_per_element_ * appr_alg->element_levels_[i] : 0;
         link_npy_offsets[i]=link_npy_size;
         if (linkListSize)
           link_npy_size += linkListSize;
@@ -508,7 +513,7 @@ public:
       memcpy(appr_alg->data_level0_memory_, data_level0_npy.data(), data_level0_npy.nbytes());
 
       for (size_t i = 0; i < appr_alg->max_elements_; i++) {
-          unsigned int linkListSize = appr_alg->element_levels_[i] > 0 ? appr_alg->size_links_per_element_ * appr_alg->element_levels_[i] : 0;
+          size_t linkListSize = appr_alg->element_levels_[i] > 0 ? appr_alg->size_links_per_element_ * appr_alg->element_levels_[i] : 0;
           if (linkListSize == 0) {
               appr_alg->linkLists_[i] = nullptr;
           } else {
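The switch from `unsigned int` to `size_t` in bindings.cpp presumably relates to the large-pickle fix mentioned in the README hunk: a byte count stored in a 32-bit unsigned integer wraps around at 2^32 (4 GiB). A small Python sketch of the arithmetic, assuming a 32-bit `unsigned int`; the helper names and the example sizes are made up for illustration:

```python
UINT32_MODULUS = 1 << 32  # 4294967296, one past the largest 32-bit value


def as_uint32(value):
    """Emulate truncation on assignment to a 32-bit unsigned integer."""
    return value % UINT32_MODULUS


def level0_size(element_count, size_per_element, wrap32=False):
    """Total bytes for level-0 data, optionally truncated to 32 bits."""
    total = element_count * size_per_element
    return as_uint32(total) if wrap32 else total


if __name__ == "__main__":
    # 5 million elements of 1000 bytes each need 5_000_000_000 bytes.
    print(level0_size(5_000_000, 1000))               # 5000000000
    print(level0_size(5_000_000, 1000, wrap32=True))  # 705032704
```

A buffer allocated from the wrapped value would be far smaller than the data copied into or out of it, which is consistent with corruption appearing only above the 4 GB mark.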


=====================================
python_bindings/tests/bindings_test_getdata.py
=====================================
@@ -41,6 +41,9 @@ class RandomSelfTestCase(unittest.TestCase):
         print("Adding all elements (%d)" % (len(data)))
         p.add_items(data, labels)
 
+        # Getting data by label should raise an exception if a scalar is passed:
+        self.assertRaises(ValueError, lambda: p.get_items(labels[0]))
+
         # After adding them, all labels should be retrievable
         returned_items = p.get_items(labels)
         self.assertSequenceEqual(data.tolist(), returned_items)


=====================================
setup.py
=====================================
@@ -1,5 +1,6 @@
 import os
 import sys
+import platform
 
 import numpy as np
 import pybind11
@@ -7,7 +8,7 @@ import setuptools
 from setuptools import Extension, setup
 from setuptools.command.build_ext import build_ext
 
-__version__ = '0.6.0'
+__version__ = '0.6.1'
 
 
 include_dirs = [
@@ -74,14 +75,20 @@ class BuildExt(build_ext):
     """A custom build extension for adding compiler-specific options."""
     c_opts = {
         'msvc': ['/EHsc', '/openmp', '/O2'],
-        'unix': ['-O3', '-march=native'],  # , '-w'
+        #'unix': ['-O3', '-march=native'],  # , '-w'
+        'unix': ['-O3'],  # , '-w'
     }
+    if not os.environ.get("HNSWLIB_NO_NATIVE"):
+        c_opts['unix'].append('-march=native')
+
     link_opts = {
         'unix': [],
         'msvc': [],
     }
 
     if sys.platform == 'darwin':
+        if platform.machine() == 'arm64':
+            c_opts['unix'].remove('-march=native')
         c_opts['unix'] += ['-stdlib=libc++', '-mmacosx-version-min=10.7']
         link_opts['unix'] += ['-stdlib=libc++', '-mmacosx-version-min=10.7']
     else:
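The flag handling in this setup.py hunk, together with `HNSWLIB_NO_NATIVE` exported from debian/rules, can be summarized by the following sketch. The function name `unix_compile_flags` and its parameters are hypothetical and not part of setup.py. Note that, as far as the diff shows, the upstream code calls `list.remove('-march=native')` on arm64 macOS even when the flag was never appended, which raises `ValueError` in Python; the sketch avoids that by never adding the flag in that case:

```python
def unix_compile_flags(environ, darwin_arm64=False):
    """Return the Unix compiler flags for the given environment mapping.

    Hypothetical helper: -march=native is added only when
    HNSWLIB_NO_NATIVE is unset or empty and the target is not arm64 macOS.
    """
    flags = ['-O3']
    if not environ.get("HNSWLIB_NO_NATIVE") and not darwin_arm64:
        flags.append('-march=native')
    return flags


if __name__ == "__main__":
    import os
    print(unix_compile_flags(os.environ))
```

For a distribution build, leaving out `-march=native` keeps the binary within the architecture baseline rather than tying it to the build machine's CPU.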



View it on GitLab: https://salsa.debian.org/med-team/hnswlib/-/compare/b71a8081d4ccba025d0fc69f5f628f3339d9d588...302640805e6dac29e2ece55786f44cc042856c17



More information about the debian-med-commit mailing list