[Debian-med-packaging] Bug#1086156: python-biopython test failure with libcifpp changes (Bug #1086156)

Étienne Mollier emollier at debian.org
Wed Oct 30 19:42:18 GMT 2024


Hi Maarten, Hi Andrius,

I've been in the vicinity for a couple of years now, and yet
neither biology nor chemistry is my field, far from it.  Thank
you for helping me connect the dots on this issue!  :)

Maarten L. Hekkelman, on 2024-10-30:
> Of course, it would be easiest to include the components.cif file in
> libcifpp. However, this file changes weekly and it is huge. Two good reasons
> not to do it.
> 
> What I did for density-fitness e.g. is include a subset of components.cif
> large enough for testing. That way, you can avoid having to download the
> entire CCD file for just some simple tests on basic proteins. Have a look at
> the density-fitness debian/tests files, perhaps this may help you solve your
> problem.
> 
> There are similar ways to provide a dummy components.cif file. See
> https://pdb-redo.github.io/libcifpp/resources.html for more information.

Thank you for the hints.  As a start, I checked whether I could
get somewhere with your mini-ccd.cif from density-fitness.  At
first sight, it seems I may have to populate a few more entries
(SO4 is missing, for instance) before it fulfils the
requirements of the python-biopython test suite:

	test_PDB_DSSP ... /<<PKGBUILDDIR>>/.pybuild/cpython3_3.12/build/Bio/PDB/DSSP.py:199: UserWarning: 
	Configuration error:
	
	The attempt to retrieve compound information for "SO4" failed.
	
	This information is searched for in a CCD file called components.cif or
	components.cif.gz which should be located in one of the following directories:
	
	"/usr/share/libcifpp"
	"/var/cache/libcifpp"
	"/<<PKGBUILDDIR>>/libcifpp-data"
	"/usr/share/libcifpp"
	"/usr/share/libcifpp"
	
	(Note that you can add a directory to the search paths by setting the 
	LIBCIFPP_DATA_DIR environmental variable)
	
	On Linux an optional cron script might have been installed that automatically updates
	components.cif and mmCIF dictionary files. This script only works when the file
	libcifpp.conf contains an uncommented line with the text:
	
	update=true
	
	If you do not have a working cron script, you can manually update the files
	in /var/cache/libcifpp using the following commands:
	
	curl -o /var/cache/libcifpp/components.cif https://files.wwpdb.org/pub/pdb/data/monomers/components.cif
	curl -o /var/cache/libcifpp/mmcif_pdbx.dic https://mmcif.wwpdb.org/dictionaries/ascii/mmcif_pdbx_v50.dic
	curl -o /var/cache/libcifpp/mmcif_ma.dic https://github.com/ihmwg/ModelCIF/raw/master/dist/mmcif_ma.dic
	
	The current order of compound factory objects is:
	
	CCD components.cif resource
	CCD components.cif resource
	Unknown compound: SO4
	Missing compound information for SO4
	
	  warnings.warn(err)
	ok

That being said, despite being an incomplete subset, it lets the
test pass (the message above is only a warning, and the test
ends with "ok"), so the python-biopython build goes through.
I should be able to upload a fix shortly.  :)
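
For illustration, here is a minimal sketch of how a subset file
could be made visible to libcifpp through the LIBCIFPP_DATA_DIR
variable mentioned in the warning above.  It relies only on what
the warning states (a file named components.cif inside a
searched directory); the helper name stage_ccd, the directory
layout and the placeholder file content are assumptions, and
this is not necessarily how the actual fix is implemented.

```python
import os
import shutil
import tempfile
from pathlib import Path


def stage_ccd(subset: Path, data_dir: Path) -> dict:
    """Copy `subset` to data_dir/components.cif and return a copy of the
    environment with LIBCIFPP_DATA_DIR pointing at data_dir.

    stage_ccd is a hypothetical helper, not part of libcifpp or Biopython.
    """
    data_dir.mkdir(parents=True, exist_ok=True)
    # The warning says libcifpp looks for a file called components.cif.
    shutil.copyfile(subset, data_dir / "components.cif")
    env = dict(os.environ)
    env["LIBCIFPP_DATA_DIR"] = str(data_dir)
    return env


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder standing in for mini-ccd.cif; not a real CCD entry.
        subset = Path(tmp) / "mini-ccd.cif"
        subset.write_text("data_SO4\n#\n")
        env = stage_ccd(subset, Path(tmp) / "libcifpp-data")
        print("LIBCIFPP_DATA_DIR =", env["LIBCIFPP_DATA_DIR"])
```

The returned mapping can be passed as env= to subprocess.run()
when launching the test runner, so the variable only affects
that child process.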

> As an alternative, I'm thinking about packaging a subset with libcifpp.
> Dictionaries can be stacked already, so having a subset might be a simple
> way out of this. But the question then is, what to include in the subset. I
> did some counting on components in PDB entries to see if there is a clear
> cut-off, but couldn't find one. And only including the standard amino acids
> and nucleic acids is a bit too limited to be useful.

I understand how problematic it is to determine a useful subset,
and I am afraid I don't have any good idea.  As far as I can
tell, in its present shape, mini-ccd.cif already allows for
basic functional testing.
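
Whatever the selection criteria end up being, cutting a subset
out of the full file is mechanical, since each compound lives in
its own "data_<ID>" block.  A rough sketch, assuming a plain
line-based filter is good enough (the function name filter_ccd
and the sample entries are made up for the example, and a line
starting with "data_" inside a multi-line text field would
confuse it):

```python
from typing import Iterable, Iterator, Set


def filter_ccd(lines: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield only the lines of the data blocks whose ID is in `wanted`.

    Hypothetical helper: it does not parse CIF, it only tracks the
    'data_<ID>' lines that open each block.
    """
    keep = False
    for line in lines:
        if line.startswith("data_"):
            # A new block starts here; decide whether to keep it.
            keep = line[len("data_"):].strip() in wanted
        if keep:
            yield line


if __name__ == "__main__":
    # Tiny stand-in for components.cif; real entries hold far more fields.
    sample = (
        "data_ALA\n_chem_comp.id ALA\n#\n"
        "data_SO4\n_chem_comp.id SO4\n#\n"
        "data_HOH\n_chem_comp.id HOH\n#\n"
    )
    subset = "".join(filter_ccd(sample.splitlines(keepends=True),
                                {"ALA", "SO4"}))
    print(subset, end="")
```

On the real components.cif the same function can be fed an open
file object, which keeps memory use low despite the file size.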

Have a nice day,  :)
-- 
  .''`.  Étienne Mollier <emollier at debian.org>
 : :' :  pgp: 8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
 `. `'   sent from /dev/pts/2, please excuse my verbosity
   `-    on air: Patrick Moraz - Sonata in C (3rd movement Alle…