Bug#978292: superlu-dist: FTBFS: dh_auto_test: error: cd obj-x86_64-linux-gnu && make -j1 test ARGS\+=-j1 returned exit code 2

Lucas Nussbaum lucas at debian.org
Sat Dec 26 21:56:19 GMT 2020


Source: superlu-dist
Version: 6.2.0+dfsg1-3
Severity: serious
Justification: FTBFS on amd64
Tags: bullseye sid ftbfs
Usertags: ftbfs-20201226 ftbfs-bullseye

Hi,

During a rebuild of all packages in sid, your package failed to build
on amd64.

Relevant part (hopefully):
> make[2]: Entering directory '/<<PKGBUILDDIR>>/obj-x86_64-linux-gnu'
> Running tests...
> /usr/bin/ctest --force-new-ctest-process -j1
> Test project /<<PKGBUILDDIR>>/obj-x86_64-linux-gnu
>       Start  1: pdtest_1x1_1_2_8_20_SP
>  1/14 Test  #1: pdtest_1x1_1_2_8_20_SP ...........***Failed    0.04 sec
> [ip-172-31-12-139:00795] [[25331,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start  2: pdtest_1x1_3_2_8_20_SP
>  2/14 Test  #2: pdtest_1x1_3_2_8_20_SP ...........***Failed    0.02 sec
> [ip-172-31-12-139:00796] [[25332,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start  3: pdtest_1x2_1_2_8_20_SP
>  3/14 Test  #3: pdtest_1x2_1_2_8_20_SP ...........***Failed    0.02 sec
> [ip-172-31-12-139:00797] [[25333,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start  4: pdtest_1x2_3_2_8_20_SP
>  4/14 Test  #4: pdtest_1x2_3_2_8_20_SP ...........***Failed    0.02 sec
> [ip-172-31-12-139:00798] [[25334,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start  5: pdtest_2x1_1_2_8_20_SP
>  5/14 Test  #5: pdtest_2x1_1_2_8_20_SP ...........***Failed    0.02 sec
> [ip-172-31-12-139:00799] [[25335,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start  6: pdtest_2x1_3_2_8_20_SP
>  6/14 Test  #6: pdtest_2x1_3_2_8_20_SP ...........***Failed    0.02 sec
> [ip-172-31-12-139:00800] [[25288,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start  7: pdtest_2x2_1_2_8_20_SP
>  7/14 Test  #7: pdtest_2x2_1_2_8_20_SP ...........***Failed    0.01 sec
> [ip-172-31-12-139:00801] [[25289,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start  8: pdtest_2x2_3_2_8_20_SP
>  8/14 Test  #8: pdtest_2x2_3_2_8_20_SP ...........***Failed    0.01 sec
> [ip-172-31-12-139:00802] [[25290,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start  9: pddrive1
>  9/14 Test  #9: pddrive1 .........................***Failed    0.02 sec
> [ip-172-31-12-139:00803] [[25291,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start 10: pddrive2
> 10/14 Test #10: pddrive2 .........................***Failed    0.01 sec
> [ip-172-31-12-139:00804] [[25292,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start 11: pddrive3
> 11/14 Test #11: pddrive3 .........................***Failed    0.02 sec
> [ip-172-31-12-139:00805] [[25293,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start 12: pzdrive1
> 12/14 Test #12: pzdrive1 .........................***Failed    0.01 sec
> [ip-172-31-12-139:00806] [[25294,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start 13: pzdrive2
> 13/14 Test #13: pzdrive2 .........................***Failed    0.02 sec
> [ip-172-31-12-139:00807] [[25295,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
>       Start 14: pzdrive3
> 14/14 Test #14: pzdrive3 .........................***Failed    0.02 sec
> [ip-172-31-12-139:00808] [[25280,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 320
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   opal_pmix_base_select failed
>   --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> 
> 
> 0% tests passed, 14 tests failed out of 14
> 
> Total Test time (real) =   0.24 sec
> 
> The following tests FAILED:
> 	  1 - pdtest_1x1_1_2_8_20_SP (Failed)
> 	  2 - pdtest_1x1_3_2_8_20_SP (Failed)
> 	  3 - pdtest_1x2_1_2_8_20_SP (Failed)
> 	  4 - pdtest_1x2_3_2_8_20_SP (Failed)
> 	  5 - pdtest_2x1_1_2_8_20_SP (Failed)
> 	  6 - pdtest_2x1_3_2_8_20_SP (Failed)
> 	  7 - pdtest_2x2_1_2_8_20_SP (Failed)
> 	  8 - pdtest_2x2_3_2_8_20_SP (Failed)
> 	  9 - pddrive1 (Failed)
> 	 10 - pddrive2 (Failed)
> 	 11 - pddrive3 (Failed)
> 	 12 - pzdrive1 (Failed)
> 	 13 - pzdrive2 (Failed)
> 	 14 - pzdrive3 (Failed)
> Errors while running CTest
> make[2]: *** [Makefile:140: test] Error 8
> make[2]: Leaving directory '/<<PKGBUILDDIR>>/obj-x86_64-linux-gnu'
> dh_auto_test: error: cd obj-x86_64-linux-gnu && make -j1 test ARGS\+=-j1 returned exit code 2

The full build log is available from:
   http://qa-logs.debian.net/2020/12/26/superlu-dist_6.2.0+dfsg1-3_unstable.log

A list of current common problems and possible solutions is available at
http://wiki.debian.org/qa.debian.org/FTBFS . You're welcome to contribute!

If you reassign this bug to another package, please mark it as 'affects'-ing
this package. See https://www.debian.org/Bugs/server-control#affects

If you fail to reproduce this, please provide a build log and diff it with mine
so that we can identify if something relevant changed in the meantime.
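
For that comparison, a plain diff of two build logs tends to be noisy, because
the Open MPI lines carry a host name, PID and job id that change on every run.
The following is only a minimal sketch of one way to diff two logs while
masking that prefix. It is not part of the rebuild tooling; the function names,
the regular expression, and the labels "failed.log" and "local.log" are
assumptions made for illustration.

```python
import difflib
import re

# Assumed noise pattern: Open MPI prefixes such as
# "[ip-172-31-12-139:00795] [[25331,0],0]" differ between runs.
_NOISE = re.compile(r"\[[^\]\s]+:\d+\] \[\[\d+,\d+\],\d+\]")


def normalise(line):
    """Replace the run-specific host/PID/job-id prefix with a fixed token."""
    return _NOISE.sub("[HOST:PID] [[JOB]]", line.rstrip("\n"))


def diff_logs(mine, yours):
    """Return unified-diff lines between two logs given as lists of lines."""
    a = [normalise(line) for line in mine]
    b = [normalise(line) for line in yours]
    # "failed.log" and "local.log" are placeholder labels for the diff header.
    return list(
        difflib.unified_diff(a, b, "failed.log", "local.log", lineterm="")
    )
```

To use it, read both logs with readlines(), pass the two lists to diff_logs(),
and print the returned lines. An empty result means the logs match once the
masked prefixes are ignored.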

About the archive rebuild: The rebuild was done on EC2 VM instances from
Amazon Web Services, using a clean, minimal and up-to-date chroot. Every
failed build was retried once to eliminate random failures.


