[Debian-med-packaging] Bug#1004037: Segmentation fault in plink2 (Was: src:plink2: fails to migrate to testing for too long: autopkgtest regression)
Andreas Tille
andreas at an3as.eu
Fri Feb 18 21:22:43 GMT 2022
I confirm it's gcc-11. I'll check tomorrow. Thanks a lot for your quick and helpful responses, Andreas.
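
To illustrate the kind of non-inlined allocation fallback discussed in the quoted messages below, here is a minimal sketch. Every name in it (SKETCH_NOINLINE, sketch_malloc, sketch_alloc_array) is hypothetical and not taken from the plink2 source. That gcc 11's inlining or ipa-modref pass is at fault remains a provisional assumption. The idea is only to keep the helper out of line and to write the result through a correctly typed pointer, so the compiler cannot conclude that the caller's pointer is left unmodified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical names throughout; this is NOT plink2's actual code.
// Assumption: the toolchain is GCC-compatible, so the noinline attribute exists.
#if defined(__GNUC__)
#  define SKETCH_NOINLINE __attribute__((noinline))
#else
#  define SKETCH_NOINLINE
#endif

// Allocates size bytes and stores the result in *out.
// Returns true on failure, false on success.
// The destination is a properly typed unsigned char**, and the function is
// kept out of line so the store is not optimized against the caller's state.
SKETCH_NOINLINE bool sketch_malloc(std::size_t size, unsigned char** out) {
  assert(out != nullptr);
  *out = static_cast<unsigned char*>(std::malloc(size));
  return *out == nullptr;
}

// Typed convenience wrapper: allocates room for count objects of type T.
// Rejects requests whose byte size would overflow std::size_t.
template <class T>
bool sketch_alloc_array(std::size_t count, T** out) {
  unsigned char* raw = nullptr;
  if (count > SIZE_MAX / sizeof(T) || sketch_malloc(count * sizeof(T), &raw)) {
    *out = nullptr;
    return true;
  }
  *out = reinterpret_cast<T*>(raw);
  return false;
}
```

A caller would write something like `int* buf; if (sketch_alloc_array<int>(n, &buf)) { /* handle out-of-memory */ }` and release the buffer with `std::free(buf)`. On failure the pointer is left null, so a later store such as the one at plink2_psam.cc:615 would fail in an obvious way rather than write through a stale address.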
On Fri, Feb 18, 2022 at 12:53:58PM -0800, Chris Chang wrote:
> I have posted an update under the provisional assumption that it's gcc 11's
> new ipa-modref pass that is causing this code to fail, since it does seem
> to break some similar code.
>
> On Fri, Feb 18, 2022 at 11:49 AM Chris Chang <chrchang523 at gmail.com> wrote:
>
> > What compiler version are you using? This implies that the pgl_malloc
> > inline function is not being compiled to the expected code; there is an
> > existing non-inlined version that is used for very old gcc versions, but it
> > looks like it may also be needed here.
> >
> > On Fri, Feb 18, 2022 at 11:40 AM Andreas Tille <andreas at an3as.eu> wrote:
> >
> >> Hi again,
> >>
> >> I applied this patch and now I get:
> >>
> >> (gdb) run
> >> Starting program: /usr/lib/plink2/plink2-sse2 --debug --pfile tmp_data
> >> --export vcf vcf-dosage=DS --out tmp_data2
> >> [Thread debugging using libthread_db enabled]
> >> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> >> [New Thread 0x7ffff4cc7640 (LWP 4060797)]
> >> [New Thread 0x7fffec4c6640 (LWP 4060798)]
> >> [New Thread 0x7fffebcc5640 (LWP 4060799)]
> >> PLINK v2.00a3 64-bit (29 Jan 2022)
> >> www.cog-genomics.org/plink/2.0/
> >> (C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public
> >> License v3
> >> Logging to tmp_data2.log.
> >> Options in effect:
> >> --debug
> >> --export vcf vcf-dosage=DS
> >> --out tmp_data2
> >> --pfile tmp_data
> >>
> >> Start time: Fri Feb 18 19:06:45 2022
> >> 31998 MiB RAM detected; reserving 15999 MiB for main workspace.
> >> Using up to 4 compute threads.
> >> [New Thread 0x7ffff7fc5640 (LWP 4060800)]
> >> sizeof(PhenoCol): 40 pheno_cols: 0
> >> --debug: setting pheno_cols[0].nonmiss. = nullptr
> >>
> >> Thread 1 "plink2-sse2" received signal SIGSEGV, Segmentation fault.
> >> 0x00005555556fb82e in plink2::LoadPsam (psamname=psamname@entry=0x7fffffffbe70
> >> "tmp_data.psam", pheno_range_list_ptr=<optimized out>, fam_cols=...,
> >> pheno_ct_max=<optimized out>,
> >> missing_pheno=<optimized out>, affection_01=0, max_thread_ct=4,
> >> piip=0x7fffffff8880, sample_include_ptr=0x7fffffff8790,
> >> founder_info_ptr=0x7fffffff87a8, sex_nm_ptr=0x7fffffff8798,
> >> sex_male_ptr=0x7fffffff87a0, pheno_cols_ptr=0x7fffffff8770,
> >> pheno_names_ptr=0x7fffffff8780, raw_sample_ct_ptr=0x7fffffff8728,
> >> pheno_ct_ptr=0x7fffffff8720,
> >> max_pheno_name_blen_ptr=0x7fffffff87b0) at ../plink2_psam.cc:615
> >> warning: Source file is more recent than executable.
> >> 615 pheno_cols[pheno_idx].nonmiss = nullptr;
> >>
> >> Kind regards
> >>
> >> Andreas.
> >>
> >> On Fri, Feb 18, 2022 at 08:45:12AM -0800, Chris Chang wrote:
> >> > Ok, I don't know why that particular line would fail, but I've added
> >> > another debug-print before it on GitHub.
> >> >
> >> > On Fri, Feb 18, 2022 at 4:24 AM Andreas Tille <andreas at fam-tille.de> wrote:
> >> >
> >> > > Hi Chris,
> >> > >
> >> > > On Thu, Feb 17, 2022 at 07:13:49PM -0800, Chris Chang wrote:
> >> > > > I was unable to replicate this issue on a Debian EC2 instance. However,
> >> > > > there are very few things that happen between printing "End time:" and
> >> > > > program exit, so I have added a bunch of debug-prints (active when the
> >> > > > --debug flag is passed in) to the latest GitHub commit that should reveal
> >> > > > which of those few things is triggering the segfault; let me know if you
> >> > > > are able to run this build.
> >> > >
> >> > > I think the issue is a bit more complex. Debian provides a wrapper
> >> > > which calls the best / most performant plink2. The issue seems to
> >> > > occur for SFX=avx. First I do:
> >> > >
> >> > >
> >> > > /usr/lib/plink2/plink2-avx --debug --dummy 33 65537 0.1 dosage-freq=0.1 --out tmp_data
> >> > >
> >> > > This works. In the next step I fire up gdb, which results in
> >> > >
> >> > >
> >> > > (gdb) run
> >> > > Starting program: /usr/lib/plink2/plink2-avx --debug --pfile tmp_data
> >> > > --export vcf vcf-dosage=DS --out tmp_data2
> >> > > [Thread debugging using libthread_db enabled]
> >> > > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> >> > > [New Thread 0x7ffff4cc7640 (LWP 2931408)]
> >> > > [New Thread 0x7ffff44c6640 (LWP 2931409)]
> >> > > [New Thread 0x7fffebcc5640 (LWP 2931411)]
> >> > > PLINK v2.00a3 SSE4.2 (29 Jan 2022)
> >> > > www.cog-genomics.org/plink/2.0/
> >> > > (C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public
> >> > > License v3
> >> > > Logging to tmp_data2.log.
> >> > > Options in effect:
> >> > > --debug
> >> > > --export vcf vcf-dosage=DS
> >> > > --out tmp_data2
> >> > > --pfile tmp_data
> >> > >
> >> > > Start time: Fri Feb 18 11:58:49 2022
> >> > > 31998 MiB RAM detected; reserving 15999 MiB for main workspace.
> >> > > Using up to 4 compute threads.
> >> > > [New Thread 0x7ffff7fc5640 (LWP 2931412)]
> >> > >
> >> > > Thread 1 "plink2-avx" received signal SIGSEGV, Segmentation fault.
> >> > > plink2::LoadPsam (psamname=psamname@entry=0x7fffffffbe70 "tmp_data.psam",
> >> > > pheno_range_list_ptr=<optimized out>, fam_cols=..., pheno_ct_max=<optimized out>,
> >> > > missing_pheno=<optimized out>, affection_01=0, max_thread_ct=4,
> >> > > piip=0x7fffffff8880, sample_include_ptr=0x7fffffff87a0,
> >> > > founder_info_ptr=0x7fffffff87b8, sex_nm_ptr=0x7fffffff87a8,
> >> > > sex_male_ptr=0x7fffffff87b0, pheno_cols_ptr=0x7fffffff8780,
> >> > > pheno_names_ptr=0x7fffffff8790, raw_sample_ct_ptr=0x7fffffff8738,
> >> > > pheno_ct_ptr=0x7fffffff8730,
> >> > > max_pheno_name_blen_ptr=0x7fffffff87c0) at ../plink2_psam.cc:611
> >> > > warning: Source file is more recent than executable.
> >> > > 611 pheno_cols[pheno_idx].nonmiss = nullptr;
> >> > >
> >> > >
> >> > > I also added some more debug lines in a patch[1].
> >> > >
> >> > > This does seem to be the weak part of the code, since the
> >> > > output becomes
> >> > >
> >> > > ...
> >> > > Start time: Fri Feb 18 13:19:13 2022
> >> > > 31998 MiB RAM detected; reserving 15999 MiB for main workspace.
> >> > > Using up to 4 compute threads.
> >> > > [New Thread 0x7ffff7fc5640 (LWP 3957711)]
> >> > > --debug: setting pheno_cols[0].nonmiss. = nullptr
> >> > >
> >> > > Thread 1 "plink2-sse2" received signal SIGSEGV, Segmentation fault.
> >> > > 0x00005555556fb6ff in plink2::LoadPsam (psamname=psamname@entry=0x7fffffffbe70
> >> > > "tmp_data.psam", pheno_range_list_ptr=<optimized out>, fam_cols=...,
> >> > > pheno_ct_max=<optimized out>,
> >> > > missing_pheno=<optimized out>, affection_01=0, max_thread_ct=4,
> >> > > piip=0x7fffffff8880, sample_include_ptr=0x7fffffff87a0,
> >> > > founder_info_ptr=0x7fffffff87b8, sex_nm_ptr=0x7fffffff87a8,
> >> > > sex_male_ptr=0x7fffffff87b0, pheno_cols_ptr=0x7fffffff8780,
> >> > > pheno_names_ptr=0x7fffffff8790, raw_sample_ct_ptr=0x7fffffff8738,
> >> > > pheno_ct_ptr=0x7fffffff8730,
> >> > > max_pheno_name_blen_ptr=0x7fffffff87c0) at ../plink2_psam.cc:614
> >> > > warning: Source file is more recent than executable.
> >> > > 614 pheno_cols[pheno_idx].nonmiss = nullptr;
> >> > >
> >> > >
> >> > > I hope this helps a bit in tracking down the issue.
> >> > >
> >> > > Andreas.
> >> > >
> >> > >
> >> > >
> >> > > [1] https://salsa.debian.org/med-team/plink2/-/blob/master/debian/patches/debug2.patch
> >> > >
> >> > > --
> >> > > http://fam-tille.de
> >> > >
> >>
> >> --
> >> http://fam-tille.de
> >>
> >
--
http://fam-tille.de