[Debian-med-packaging] Bug#1004037: Segmentation fault in plink2 (Was: src:plink2: fails to migrate to testing for too long: autopkgtest regression)

Chris Chang chrchang523 at gmail.com
Fri Feb 18 20:53:58 GMT 2022


I have posted an update under the provisional assumption that it's gcc 11's
new ipa-modref pass that is causing this code to fail, since it does seem
to break some similar code.
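
For illustration only, here is a minimal sketch of an allocation helper that keeps the caller's pointer type visible to the compiler. It assumes the failure comes from a helper that stores the malloc result through a pointer type different from the caller's, which alias-based optimization may then treat as not modifying the caller's variable. That assumption is not confirmed in this thread. The names Record, typed_malloc and alloc_records are hypothetical and are not plink2's own; the true-on-failure return convention is also just a choice made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical record type standing in for a struct with a pointer member.
struct Record {
  void* nonmiss;
  int type_code;
};

// Hypothetical helper: allocates `size` bytes and stores the result through
// a T** of the caller's own pointer type, so the store is visible to
// type-based alias analysis.  Returns true on failure.
template <class T>
bool typed_malloc(std::size_t size, T** pp) {
  *pp = static_cast<T*>(std::malloc(size));
  return *pp == nullptr;
}

// Allocates `ct` records and initializes each one.  Returns true on failure.
bool alloc_records(std::size_t ct, Record** out) {
  assert(out != nullptr);
  if (typed_malloc(ct * sizeof(Record), out)) {
    return true;
  }
  for (std::size_t i = 0; i != ct; ++i) {
    (*out)[i].nonmiss = nullptr;  // same kind of store as the crashing line
    (*out)[i].type_code = 0;
  }
  return false;
}
```

A caller would declare `Record* recs = nullptr;`, call `alloc_records(n, &recs)`, and release the memory with `std::free(recs)`. To test the ipa-modref hypothesis without touching the code, rebuilding with gcc's `-fno-ipa-modref` option and checking whether the segfault disappears may also be informative.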

On Fri, Feb 18, 2022 at 11:49 AM Chris Chang <chrchang523 at gmail.com> wrote:

> What compiler version are you using?  This implies that the pgl_malloc
> inline function is not being compiled to the expected code; there is an
> existing non-inlined version that is used for very old gcc versions, but it
> looks like it may also be needed here.
>
> On Fri, Feb 18, 2022 at 11:40 AM Andreas Tille <andreas at an3as.eu> wrote:
>
>> Hi again,
>>
>> I applied this patch and now I get:
>>
>> (gdb) run
>> Starting program: /usr/lib/plink2/plink2-sse2 --debug --pfile tmp_data
>> --export vcf vcf-dosage=DS --out tmp_data2
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> [New Thread 0x7ffff4cc7640 (LWP 4060797)]
>> [New Thread 0x7fffec4c6640 (LWP 4060798)]
>> [New Thread 0x7fffebcc5640 (LWP 4060799)]
>> PLINK v2.00a3 64-bit (29 Jan 2022)
>> www.cog-genomics.org/plink/2.0/
>> (C) 2005-2022 Shaun Purcell, Christopher Chang   GNU General Public
>> License v3
>> Logging to tmp_data2.log.
>> Options in effect:
>>   --debug
>>   --export vcf vcf-dosage=DS
>>   --out tmp_data2
>>   --pfile tmp_data
>>
>> Start time: Fri Feb 18 19:06:45 2022
>> 31998 MiB RAM detected; reserving 15999 MiB for main workspace.
>> Using up to 4 compute threads.
>> [New Thread 0x7ffff7fc5640 (LWP 4060800)]
>> sizeof(PhenoCol): 40  pheno_cols: 0
>> --debug: setting pheno_cols[0].nonmiss. = nullptr
>>
>> Thread 1 "plink2-sse2" received signal SIGSEGV, Segmentation fault.
>> 0x00005555556fb82e in plink2::LoadPsam (psamname=psamname at entry=0x7fffffffbe70
>> "tmp_data.psam", pheno_range_list_ptr=<optimized out>, fam_cols=...,
>> pheno_ct_max=<optimized out>,
>>     missing_pheno=<optimized out>, affection_01=0, max_thread_ct=4,
>> piip=0x7fffffff8880, sample_include_ptr=0x7fffffff8790,
>> founder_info_ptr=0x7fffffff87a8, sex_nm_ptr=0x7fffffff8798,
>>     sex_male_ptr=0x7fffffff87a0, pheno_cols_ptr=0x7fffffff8770,
>> pheno_names_ptr=0x7fffffff8780, raw_sample_ct_ptr=0x7fffffff8728,
>> pheno_ct_ptr=0x7fffffff8720,
>>     max_pheno_name_blen_ptr=0x7fffffff87b0) at ../plink2_psam.cc:615
>> warning: Source file is more recent than executable.
>> 615             pheno_cols[pheno_idx].nonmiss = nullptr;
>>
>> Kind regards
>>
>>       Andreas.
>>
>> Am Fri, Feb 18, 2022 at 08:45:12AM -0800 schrieb Chris Chang:
>> > Ok, I don't know why that particular line would fail, but I've added
>> > another debug-print before it on GitHub.
>> >
>> > On Fri, Feb 18, 2022 at 4:24 AM Andreas Tille <andreas at fam-tille.de> wrote:
>> >
>> > > Hi Chris,
>> > >
>> > > Am Thu, Feb 17, 2022 at 07:13:49PM -0800 schrieb Chris Chang:
>> > > > I was unable to replicate this issue on a Debian EC2 instance.  However,
>> > > > there are very few things that happen between printing "End time:" and
>> > > > program exit, so I have added a bunch of debug-prints (active when the
>> > > > --debug flag is passed in) to the latest GitHub commit that should reveal
>> > > > which of those few things is triggering the segfault; let me know if you
>> > > > are able to run this build.
>> > >
>> > > I think the issue is a bit more complex.  Debian provides a wrapper
>> > > which calls the best / most performant plink2.  The issue seems to
>> > > occur for SFX=avx.  First I do:
>> > >
>> > >
>> > >    /usr/lib/plink2/plink2-avx --debug --dummy 33 65537 0.1 dosage-freq=0.1 --out tmp_data
>> > >
>> > > This works.  In the next step I fire up gdb, which results in
>> > >
>> > >
>> > > (gdb) run
>> > > Starting program: /usr/lib/plink2/plink2-avx --debug --pfile tmp_data
>> > > --export vcf vcf-dosage=DS --out tmp_data2
>> > > [Thread debugging using libthread_db enabled]
>> > > Using host libthread_db library
>> "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> > > [New Thread 0x7ffff4cc7640 (LWP 2931408)]
>> > > [New Thread 0x7ffff44c6640 (LWP 2931409)]
>> > > [New Thread 0x7fffebcc5640 (LWP 2931411)]
>> > > PLINK v2.00a3 SSE4.2 (29 Jan 2022)
>> > > www.cog-genomics.org/plink/2.0/
>> > > (C) 2005-2022 Shaun Purcell, Christopher Chang   GNU General Public
>> > > License v3
>> > > Logging to tmp_data2.log.
>> > > Options in effect:
>> > >   --debug
>> > >   --export vcf vcf-dosage=DS
>> > >   --out tmp_data2
>> > >   --pfile tmp_data
>> > >
>> > > Start time: Fri Feb 18 11:58:49 2022
>> > > 31998 MiB RAM detected; reserving 15999 MiB for main workspace.
>> > > Using up to 4 compute threads.
>> > > [New Thread 0x7ffff7fc5640 (LWP 2931412)]
>> > >
>> > > Thread 1 "plink2-avx" received signal SIGSEGV, Segmentation fault.
>> > > plink2::LoadPsam (psamname=psamname at entry=0x7fffffffbe70
>> "tmp_data.psam",
>> > > pheno_range_list_ptr=<optimized out>, fam_cols=...,
>> pheno_ct_max=<optimized
>> > > out>,
>> > >     missing_pheno=<optimized out>, affection_01=0, max_thread_ct=4,
>> > > piip=0x7fffffff8880, sample_include_ptr=0x7fffffff87a0,
>> > > founder_info_ptr=0x7fffffff87b8, sex_nm_ptr=0x7fffffff87a8,
>> > >     sex_male_ptr=0x7fffffff87b0, pheno_cols_ptr=0x7fffffff8780,
>> > > pheno_names_ptr=0x7fffffff8790, raw_sample_ct_ptr=0x7fffffff8738,
>> > > pheno_ct_ptr=0x7fffffff8730,
>> > >     max_pheno_name_blen_ptr=0x7fffffff87c0) at ../plink2_psam.cc:611
>> > > warning: Source file is more recent than executable.
>> > > 611             pheno_cols[pheno_idx].nonmiss = nullptr;
>> > >
>> > >
>> > > I also added some more debug lines in a patch[1].
>> > >
>> > > That line does seem to be the weak part of the code, since the
>> > > output becomes
>> > >
>> > > ...
>> > > Start time: Fri Feb 18 13:19:13 2022
>> > > 31998 MiB RAM detected; reserving 15999 MiB for main workspace.
>> > > Using up to 4 compute threads.
>> > > [New Thread 0x7ffff7fc5640 (LWP 3957711)]
>> > > --debug: setting pheno_cols[0].nonmiss. = nullptr
>> > >
>> > > Thread 1 "plink2-sse2" received signal SIGSEGV, Segmentation fault.
>> > > 0x00005555556fb6ff in plink2::LoadPsam (psamname=psamname at entry
>> =0x7fffffffbe70
>> > > "tmp_data.psam", pheno_range_list_ptr=<optimized out>, fam_cols=...,
>> > > pheno_ct_max=<optimized out>,
>> > >     missing_pheno=<optimized out>, affection_01=0, max_thread_ct=4,
>> > > piip=0x7fffffff8880, sample_include_ptr=0x7fffffff87a0,
>> > > founder_info_ptr=0x7fffffff87b8, sex_nm_ptr=0x7fffffff87a8,
>> > >     sex_male_ptr=0x7fffffff87b0, pheno_cols_ptr=0x7fffffff8780,
>> > > pheno_names_ptr=0x7fffffff8790, raw_sample_ct_ptr=0x7fffffff8738,
>> > > pheno_ct_ptr=0x7fffffff8730,
>> > >     max_pheno_name_blen_ptr=0x7fffffff87c0) at ../plink2_psam.cc:614
>> > > warning: Source file is more recent than executable.
>> > > 614             pheno_cols[pheno_idx].nonmiss = nullptr;
>> > >
>> > >
>> > > I hope this might help a bit to track down the issue
>> > >
>> > >     Andreas.
>> > >
>> > >
>> > >
>> > > [1] https://salsa.debian.org/med-team/plink2/-/blob/master/debian/patches/debug2.patch
>> > >
>> > > --
>> > > http://fam-tille.de
>> > >
>>
>> --
>> http://fam-tille.de
>>
>