[Debian-med-packaging] Bug#960756: Bug#960756: python-biopython FTBFS on 32bit: test_NCBI_BLAST_tools.BlastDB failures

Étienne Mollier etienne.mollier at mailoo.org
Tue Jun 9 13:25:58 BST 2020


Hi all,

Andreas Tille, on 2020-06-08 16:01:33 +0200:
> any volunteer to follow this hint from upstream?

Having had a look at this issue, here is what I can tell so far.

> > Perhaps makeblastdb itself failed (and our wrapper didn't notice)? Those
> > are the first files looked for after calling makeblastdb, to see if it
> > could make a BLAST database.  Are there any GenBank/NC_005816.fna.n* or
> > GenBank/NC_005816.faa.p* files present?
> > 
> > If it helps, the commands our script was trying to run were:
> > 
> > $ makeblastdb -dbtype nucl -in GenBank/NC_005816.fna \
> > -parse_seqids -hash_index -max_file_sz 20MB  -taxid 10
> > 
> > and:
> > 
> > $ makeblastdb -dbtype prot -in GenBank/NC_005816.faa \
> > -parse_seqids -hash_index -max_file_sz 20MB -taxid 10

On my i686 machine, both of these commands fail with a memory
allocation error:

	$ makeblastdb -dbtype nucl -in GenBank/NC_005816.fna -parse_seqids -hash_index -max_file_sz 20MB  -taxid 10
	
	
	Building a new DB, current time: 06/09/2020 08:28:08
	New DB name:   /tmp/python-biopthon/Tests/GenBank/NC_005816.fna
	New DB title:  GenBank/NC_005816.fna
	Sequence type: Nucleotide
	Deleted existing Nucleotide BLAST database named /tmp/python-biopthon/Tests/GenBank/NC_005816.fna
	Keep MBits: T
	Maximum file size: 20000000B
	Adding sequences from FASTA; added 1 sequences in 0.284663 seconds.
	
	No volumes were created.
	
	BLAST Database creation error: mdb_env_open: Cannot allocate memory
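
As a side note on upstream's question, a wrapper could notice this
kind of failure from the output alone. The following is a minimal
Python sketch, not Biopython's actual wrapper: the function name
find_makeblastdb_error is hypothetical, and it only recognises the
"BLAST Database creation error:" message seen in the output above.

```python
MARKER = "BLAST Database creation error:"


def find_makeblastdb_error(output):
    """Return the error text reported by makeblastdb, or None.

    Assumption: the failure is reported on a line starting with MARKER,
    as in the log above; other failure modes are not covered.
    """
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(MARKER):
            # Keep only the part after the marker, e.g. the LMDB message.
            return line[len(MARKER):].strip()
    return None
```

Fed with the output above, this would return the text following the
marker, and None for a run that prints no such line. Checking the
exit status of the command as well would be a sensible complement.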

Looking at the strace output to see what happens exactly from
the kernel's point of view, the program attempts to map
3647256576 bytes of memory, in which the stub of the database
will be built:

	lstat64("/tmp/python-biopthon/Tests/GenBank/NC_005816.fna.ndb", 0xbfc2e5ac) = -1 ENOENT (No such file or directory)
	openat(AT_FDCWD, "/tmp/python-biopthon/Tests/GenBank/NC_005816.fna.ndb", O_RDWR|O_CREAT, 0664) = 4
	fstatfs(4, {f_type=XFS_SB_MAGIC, f_bsize=4096, f_blocks=73645943, f_bfree=64080178, f_bavail=64080178, f_files=147363840, f_ffree=147171712, f_fsid={val=[65027, 0]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_NOATIME}) = 0
	pread64(4, "", 92, 0)                   = 0
	pwrite64(4, "\0\0\0\0\0\0\10\0\0\0\0\0\336\300\357\276\1\0\0\0\0\0\0\0\0\270d\331\0\20\0\0"..., 8192, 0) = 8192
	mmap2(NULL, 3647256576, PROT_READ, MAP_SHARED, 4, 0) = -1 ENOMEM (Cannot allocate memory)
	            ~~~~~~~~~~
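
To put that number in perspective, here is a quick
back-of-the-envelope check in Python. The 3 GiB figure assumes the
usual 3G/1G user/kernel split of 32-bit x86 Linux kernels; the
names are only illustrative.

```python
GIB = 2 ** 30

requested = 3647256576          # length passed to mmap2() in the strace above
address_space_32bit = 2 ** 32   # everything a 32-bit pointer can address
user_space_32bit = 3 * GIB      # assumption: usual 3G/1G user/kernel split


def fits(size, limit):
    """Return True when a mapping of `size` bytes is no larger than `limit`."""
    return size <= limit


print(f"requested: {requested / GIB:.2f} GiB")
print("fits in 4 GiB:", fits(requested, address_space_32bit))
print("fits in 3 GiB of user space:", fits(requested, user_space_32bit))
```

The request is about 3.40 GiB: it fits in a 32-bit pointer, but
not in the 3 GiB of address space a process can use under that
assumption, whatever the amount of RAM or swap.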

To rule out a few issues that could have caused more or less
artificial memory starvation, I made the following changes to
my configuration:

  - append an additional 4 GiB of swap through a file;

  - move to a PAE-aware kernel, which my original configuration
    did not run since it had no use for physical address
    extension past the 3 GiB limit anyway:
	$ uname -sr
	Linux 4.19.0-9-686-pae
	$ grep PAE /boot/config-`uname -r`
	CONFIG_X86_PAE=y

  - check RLIMIT_AS and RLIMIT_DATA to make sure they were not
    blocking:
	$ prlimit   # filtered
	AS         address space limit unlimited unlimited bytes
	DATA       max data size       unlimited unlimited bytes

  - increase vm.max_map_count by more than an order of magnitude
    compared to the default (65530), just in case:
	$ cat /proc/sys/vm/max_map_count
	1000000

  - enable memory overcommit and allow unreasonably high
    overcommit ratios:
	$ grep . /proc/sys/vm/overcommit_*
	/proc/sys/vm/overcommit_kbytes:0
	/proc/sys/vm/overcommit_memory:1
	/proc/sys/vm/overcommit_ratio:200
    This should not matter, though: a read-only, file-backed
    shared mapping like this one does not need to be committed
    anyway. I only wanted to rule that point out too.
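
For anyone wanting to repeat these checks from a script, here is a
small Python sketch using the standard resource module. The
function names are hypothetical, and the /proc/sys/vm paths are the
ones shown in the list above; the sketch simply returns None where
a setting is not readable.

```python
import resource


def limit_report():
    """Return (soft, hard) limits for the address space and data segment."""
    names = {"AS": resource.RLIMIT_AS, "DATA": resource.RLIMIT_DATA}
    report = {}
    for label, which in names.items():
        soft, hard = resource.getrlimit(which)
        # Show RLIM_INFINITY the way prlimit does.
        report[label] = tuple(
            "unlimited" if value == resource.RLIM_INFINITY else value
            for value in (soft, hard)
        )
    return report


def read_vm_setting(name):
    """Read /proc/sys/vm/<name>; return None when it cannot be read."""
    try:
        with open(f"/proc/sys/vm/{name}") as handle:
            return handle.read().strip()
    except OSError:
        return None


print(limit_report())
for setting in ("max_map_count", "overcommit_memory", "overcommit_ratio"):
    print(setting, read_vm_setting(setting))
```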

For comparison, on 64-bit systems, the size of the mmap is
precisely 300 GB, and the command works fine regardless of how
much physical memory is available on the host.
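
The two sizes turn out to be related by simple arithmetic, as the
Python sketch below shows: 300 GB reduced to its low 32 bits is
exactly the 3647256576 bytes seen in the strace above. This is only
arithmetic. It suggests the same 300 GB request wraps around on a
32-bit size, but it does not confirm where the value comes from in
the makeblastdb or LMDB source.

```python
size_on_64bit = 300 * 10**9   # "precisely 300 GB", as seen on 64-bit systems
size_on_32bit = 3647256576    # length passed to mmap2() on i686


def truncate_to_32_bits(value):
    """Keep only the low 32 bits, as storing the value in 32 bits would."""
    return value % 2**32


print(truncate_to_32_bits(size_on_64bit) == size_on_32bit)  # prints True
```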

My current impression is that makeblastdb cannot work properly
on most 32-bit machines, because the address range the process
needs to map too easily exceeds 32-bit architectural limits.
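
To check this impression on another machine without BLAST
installed, a small probe along these lines could help. It is a
sketch under assumptions: can_map is a hypothetical helper, it
maps a sparse temporary file rather than an LMDB database, and it
extends the file first because Python refuses to map past the end
of a file on Unix, which makeblastdb does not need to do.

```python
import mmap
import os
import tempfile


def can_map(size):
    """Try a read-only shared mapping of `size` bytes over a sparse file.

    Returns True if the mapping succeeds and False if it is refused.
    A False result may also come from the file system refusing the
    sparse file, so it is a hint rather than a proof.
    """
    with tempfile.TemporaryFile() as handle:
        try:
            os.ftruncate(handle.fileno(), size)
            mapping = mmap.mmap(handle.fileno(), size,
                                flags=mmap.MAP_SHARED, prot=mmap.PROT_READ)
        except (OSError, OverflowError, ValueError):
            return False
        mapping.close()
        return True


print("1 MiB:", can_map(2 ** 20))
print("3647256576 bytes:", can_map(3647256576))
```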

Have a nice day,
-- 
Étienne Mollier <etienne.mollier at mailoo.org>
Fingerprint:  5ab1 4edf 63bb ccff 8b54  2fa9 59da 56fe fff3 882d
Help find cures against the Covid-19 !  Give CPU cycles:
  * Rosetta at home: https://boinc.bakerlab.org/rosetta/
  * Folding at home: https://foldingathome.org/