Compiling / linking with PARPACK

Kyrre Ness Sjøbæk

Hi,

I'm trying to debug what looks like a PARPACK-related bug in an inherited code, but before asking questions here on the mailing list, I wanted to update to arpack-ng (we are currently using a very old PARPACK).

I managed to compile ARPACK and PARPACK, and also link with it. However I get a strange MPI error. I think I am able to reproduce this in
PARPACK/EXAMPLES/MPI/pzndrv1.f,
compiling it manually as the makefile expects an ARmake.inc.

The commands and output are (more questions below):

kyrre ~/PhD/ACE3P/Software/arpack-ng_3.1.1/PARPACK/EXAMPLES/MPI $ mpif90 -O2 -fPIC pzndrv1.f /home/kyrre/PhD/ACE3P/Software/arpack-ng/lib/libparpack.a /home/kyrre/PhD/ACE3P/Software/ARPACK/libarpack.a -lg2c -L/usr/lib/gcc/x86_64-redhat-linux/3.4.6 -o pzndrv1
kyrre ~/PhD/ACE3P/Software/arpack-ng_3.1.1/PARPACK/EXAMPLES/MPI $ ./pzndrv1
[fenris:24694] *** An error occurred in MPI_Allreduce
[fenris:24694] *** on communicator MPI_COMM_WORLD
[fenris:24694] *** MPI_ERR_OP: invalid reduce operation
[fenris:24694] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
kyrre ~/PhD/ACE3P/Software/arpack-ng_3.1.1/PARPACK/EXAMPLES/MPI $ mpirun -n 4 ./pzndrv1
[fenris:24698] *** An error occurred in MPI_Allreduce
[fenris:24698] *** on communicator MPI_COMM_WORLD
[fenris:24698] *** MPI_ERR_OP: invalid reduce operation
[fenris:24698] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 24698 on
node fenris exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[fenris:24696] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[fenris:24696] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Do you know what's going on? I get the same error from the big application.

This is my (P)ARPACK compilation sequence:
export LDFLAGS=-L/usr/lib64/openmpi/lib #Needed as my system (fedora 16) uses "module load" and LD_LIBRARY_PATH to find MPI, which is ignored by ARPACK build
./configure --prefix=/home/kyrre/PhD/ACE3P/Software/arpack-ng/ --enable-static --enable-mpi
make
make install

Hope some of you have time to look at this,
Regards,
Kyrre Sjøbæk,
PhD student @ University of Oslo / CERN (currently on pacific time)

Kyrre Ness Sjøbæk

Re: Compiling / linking with PARPACK [ possible bugfix ]

On 14 June 2012 11:00, Kyrre Ness Sjøbæk wrote:

> [...]

Hi again,
Using my system's MPI wrappers for compilation, I now build arpack-ng with the following command:
$ make clean && MPIF77=mpif77 F77=mpif77 CC=mpicc FFLAGS=-g CFLAGS=-g ./configure --prefix=/home/kyrre/PhD/ACE3P/Software/arpack-ng/ --enable-static --enable-mpi && make -j8 && make install
Note that I'm also replacing f77 with mpif77, which in the end makes linking applications simpler (I no longer need to link with libg2c or specify its library path explicitly).

The examples now compile but fail as before, with the same MPI error message:
  PARPACK/EXAMPLES/MPI $ mpif77 -O2 -fPIC pdsdrv1.f /home/kyrre/PhD/ACE3P/Software/arpack-ng/lib/libparpack.a /home/kyrre/PhD/ACE3P/Software/ARPACK/libarpack.a -o pdsdrv1
  PARPACK/EXAMPLES/MPI $ ./pdsdrv1
[fenris:13409] *** An error occurred in MPI_Allreduce
[fenris:13409] *** on communicator MPI_COMM_WORLD
[fenris:13409] *** MPI_ERR_OP: invalid reduce operation
[fenris:13409] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort

Noting that the distribution includes mpif.h, thus overriding my system's OpenMPI headers, I moved these files out of the way:
  PARPACK/EXAMPLES/MPI $ rm mpif.h
  PARPACK/EXAMPLES/MPI $ rm ../../SRC/MPI/mpif.h
*compile*
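
A minimal sketch of how the bundled copies of mpif.h mentioned above could be located before moving them aside. The helper name find_bundled_mpif is hypothetical and not part of arpack-ng; it only assumes it is pointed at the top of the unpacked source tree.

```shell
# Hypothetical helper: list every mpif.h shipped inside a source tree, so the
# bundled copies can be moved aside and the system MPI headers are used instead.
find_bundled_mpif() {
    # $1: top-level directory of the unpacked sources (defaults to ".")
    find "${1:-.}" -type f -name 'mpif.h' | sort
}
```

Running it on the arpack-ng_3.1.1 directory should print at least the two paths removed above, if the tree matches the one described in this thread.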

Recompiling works, and the examples with real (non-complex) data types now seem to run correctly. However, the complex drivers pcndrv1 and pzndrv1 fail with a segmentation fault:
$ ./pcndrv1
[fenris:01187] *** Process received signal ***
[fenris:01187] Signal: Segmentation fault (11)
[fenris:01187] Signal code: Address not mapped (1)
[fenris:01187] Failing at address: 0xc
[fenris:01187] [ 0] /lib64/libpthread.so.0(+0xf500) [0x7f8999e35500]
[fenris:01187] [ 1] ./pcndrv1() [0x417ea0]
[fenris:01187] [ 2] ./pcndrv1() [0x417aaf]
[fenris:01187] [ 3] ./pcndrv1() [0x40c8ed]
[fenris:01187] [ 4] ./pcndrv1() [0x4122c6]
[fenris:01187] [ 5] ./pcndrv1() [0x40e1ba]
[fenris:01187] [ 6] ./pcndrv1() [0x402ddf]
[fenris:01187] [ 7] ./pcndrv1() [0x4018f7]
[fenris:01187] [ 8] ./pcndrv1() [0x40100f]
[fenris:01187] [ 9] /lib64/libc.so.6(__libc_start_main+0xed) [0x7f8999a8f69d]
[fenris:01187] [10] ./pcndrv1() [0x401041]
[fenris:01187] *** End of error message ***
Segmentation fault (core dumped)

The backtrace looks as follows:
PARPACK/EXAMPLES/MPI $ gdb pzndrv1
GNU gdb (GDB) Fedora (7.3.50.20110722-13.fc16)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/kyrre/PhD/ACE3P/Software/arpack-ng_3.1.1/PARPACK/EXAMPLES/MPI/pzndrv1...done.
(gdb) r
Starting program: /home/kyrre/PhD/ACE3P/Software/arpack-ng_3.1.1/PARPACK/EXAMPLES/MPI/pzndrv1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Detaching after fork from child process 20224.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000417940 in lsame_ ()
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.2-1.fc16.x86_64 krb5-libs-1.9.3-2.fc16.x86_64 libselinux-2.1.6-6.fc16.x86_64 libxml2-2.7.8-6.fc16.x86_64 openssl-1.0.0j-1.fc16.x86_64 pciutils-libs-3.1.7-6.fc16.x86_64
(gdb) bt
#0  0x0000000000417940 in lsame_ ()
#1  0x00000000004205ff in dlamch_ ()
#2  0x000000000040cdad in pdlamch_ ()
#3  0x00000000004129f8 in pznapps (comm=0, n=100, kev=4, np=16, shift=..., v=..., ldv=256, h=..., ldh=20, resid=..., q=..., ldq=20, workl=..., workd=...) at pznapps.f:246
#4  0x000000000040e74d in pznaup2 (comm=0, ido=99, bmat='I', n=100, which='LM', nev=4, np=16, tol=1.1102230246251565e-16, resid=..., mode=1, iupd=1, ishift=1, mxiter=300,
     v=..., ldv=256, h=..., ldh=20, ritz=..., bounds=..., q=..., ldq=20, workl=..., ipntr=..., workd=..., rwork=..., info=0, _bmat=1, _which=2) at pznaup2.f:734
#5  0x0000000000402f8e in pznaupd (comm=0, ido=99, bmat='I', n=100, which='LM', nev=4, tol=1.1102230246251565e-16, resid=..., ncv=20, v=..., ldv=256, iparam=..., ipntr=...,
     workd=..., workl=..., lworkl=1300, rwork=..., info=0, _bmat=1, _which=2) at pznaupd.f:596
#6  0x0000000000401944 in MAIN__ ()
#7  0x000000000040100f in main ()

where frame #3 refers to a line that looks like
          unfl = pdlamch( 'safe minimum' )

Noting that pdlamch actually takes two arguments, a BLACS context handle and a character (I think it ignores anything but the "s"?), while the old PARPACK version uses dlamch, which doesn't take the BLACS handle,
I tried removing the "p" in pdlamch in both pznapps.f and pcnapps.f. After recompiling, the examples now run!
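
A rough sketch of how other call sites with the same problem could be found: it searches Fortran sources for calls whose first argument is a character literal, i.e. calls that omit the context handle. The helper name is hypothetical, and the pattern assumes the calls are written on one line like the one in pznapps.f above; including pslamch (the single-precision ScaLAPACK counterpart) is also an assumption about what the complex single-precision sources call.

```shell
# Hypothetical helper: report calls to pdlamch/pslamch that pass a character
# literal as the FIRST argument, which suggests the context handle is missing.
find_one_arg_plamch() {
    # $1: directory to search (defaults to "."); /dev/null forces grep to
    # print the file name even when only one file is passed to it.
    find "${1:-.}" -type f -name '*.f' \
        -exec grep -niE "p[sd]lamch[[:space:]]*\([[:space:]]*'" /dev/null {} +
}
```

Each output line has the form file:line:source, so the hits can be inspected and patched one by one.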

Finally, note that the non-parallel examples seem to work fine:
  EXAMPLES/SIMPLE $ mpif77 -O2 -fPIC znsimp.f /home/kyrre/PhD/ACE3P/Software/ARPACK/libarpack.a -o znsimp
  $ ./znsimp
***** OUTPUT / LOTS ******

My system seems to have version 3 of LAPACK, is this OK?
$ rpm -qa | grep lapack
lapack-3.3.1-1.fc16.x86_64
lapack-devel-3.3.1-1.fc16.x86_64

I have personally used neither Fortran nor ARPACK before, so this problem is a bit hard for me to work with. I still hope the "patch" is useful, and please tell me if it's somehow completely wrong!

Regards,
Kyrre Sjøbæk

Sylvestre Ledru

Re: Compiling / linking with PARPACK [ possible bugfix ]

Hello,

On 16/06/2012 02:29, Kyrre Ness Sjøbæk wrote:
> Noting that pdlamch actually takes 2 arguments, a BLACS context handle
> and a character (I think it ignores anything but the "s"?), while the
> old PARPACK version uses dlamch which doesn't have the BLACS handle,
> I tried to remove the "p" in pdlamch both in pznapps.f and pcnapps.f.
> After recompiling, the examples now runs!
Right, it was probably the issue!
I committed your fix:
http://forge.scilab.org/index.php/p/arpack-ng/source/commit/171f59a65a7e7b031170f0972fe12eda56466286/

I will release a new version when we have more changes.

Thanks again,
Sylvestre

Kyrre Ness Sjøbæk

Re: Compiling / linking with PARPACK [ possible bugfix ]

On 22 June 2012 22:09, Sylvestre Ledru wrote:

> [...]

Thanks for fixing that bug!

I saw that you changed the pdlamch call to also use the MPI instance, but isn't pdlamch really for BLACS, not MPI? Or does BLACS run on top of MPI? I just don't have that much experience with either high-performance linear algebra software or FORTRAN77...

However, there seems to be something wonky with the build system as well, especially when trying to build a static version. Building on my laptop (fairly standard Fedora 16 / 64 bit installation) works fine, but on hopper.nersc.gov I run into problems.

To build here, I first tried this configure (which didn't work):
$ MPIF77=ftn F77=ftn CC=cc FFLAGS="-O2 -fPIC" CFLAGS="-O2 -fPIC" ./configure --prefix=$HOME/$NERSC_HOST/Software_kiHwan/arpack-ng/ --enable-static --enable-mpi
(lots of output)
checking for Fortran 77 libraries of ftn...  -L/opt/cray/udreg/2.3.1-1.0400.3911.5.13.gem/lib64 -L/opt/cray/ugni/2.3-1.0400.4127.5.20.gem/lib64 -L/opt/cray/pmi/3.0.1-1.0000.8917.33.1.gem/lib64 -L/opt/cray/dmapp/3.2.1-1.0400.3965.10.63.gem/lib64 -L/opt/cray/xpmem/0.1-2.0400.30792.5.6.gem/lib64 -L/opt/cray/mpt/5.4.5/xt/gemini/mpich2-pgi/109/lib -L/opt/cray/mpt/5.4.5/xt/gemini/sma/lib64 -L/opt/xt-libsci/11.0.06/pgi/109/mc12/lib -L/opt/cray/xe-sysroot/4.0.36.securitypatch.20120130/usr/lib64 -L/opt/cray/xe-sysroot/4.0.36.securitypatch.20120130/lib64 -L/opt/cray/xe-sysroot/4.0.36.securitypatch.20120130/usr/lib/alps -L/usr/lib/alps -L/opt/pgi/12.4.0/linux86-64/12.4/libso -L/opt/pgi/12.4.0/linux86-64/12.4/lib -L/usr/lib64 -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/opt/cray/atp/1.4.4/lib/ -lAtpSigHCommData -lAtpSigHandler -lscicpp_pgi -lsci_pgi_mp -lmpichf90_pgi -lmpich_pgi -lmpl -lrt -lsma -lxpmem -ldmapp -lugni -lpmi -lalpslli -lalpsutil -ludreg -lpthread -lzceh -lstdmpz -lCmpz -lpgmp -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lnspgc -lpgc -lm
checking for dummy main to link with Fortran 77 libraries... none
checking for Fortran 77 name-mangling scheme... lower case, underscore, no extra underscore
checking if sgemm_ is being linked in already... yes
configure: error: Cannot find BLAS libraries

... which is a bit strange as sgemm_ is (if I remember correctly) a part of BLAS? It's linked in by default:
http://www.nersc.gov/users/software/programming-libraries/math-libraries/libsci/

Specifying the path to my self-built ATLAS binaries, and stating that I want ONLY static libs, bypasses this problem but produces some new messages:
$ make clean
$ MPIF77=ftn F77=ftn CC=cc FFLAGS="-O2 -fPIC" CFLAGS="-O2 -fPIC" ./configure --prefix=$HOME/$NERSC_HOST/Software_kiHwan/arpack-ng/ --enable-static --enable-mpi --with-blas=$HOME/$NERSC_HOST/Software_kiHwan/atlas/lib/libatlas.a --with-lapack=$HOME/$NERSC_HOST/Software_kiHwan/atlas/lib/liblapack.a --disable-shared
(everything OK)
$ make
(lots of output)
Making all in .
make[1]: Entering directory `/global/u1/k/ksjobak/hopper/Software_kiHwan/arpack-ng_3.1.1'
/bin/sh ./libtool --tag=CC   --mode=link cc  -O2 -fPIC -version-info 2:0  -o libarpack.la -rpath /global/homes/k/ksjobak/hopper/Software_kiHwan/arpack-ng/lib  ./SRC/libarpacksrc.la ./UTIL/libarpackutil.la /global/homes/k/ksjobak/hopper/Software_kiHwan/atlas/lib/libatlas.a /global/homes/k/ksjobak/hopper/Software_kiHwan/atlas/lib/liblapack.a -L/opt/cray/udreg/2.3.1-1.0400.3911.5.13.gem/lib64 -L/opt/cray/ugni/2.3-1.0400.4127.5.20.gem/lib64 -L/opt/cray/pmi/3.0.1-1.0000.8917.33.1.gem/lib64 -L/opt/cray/dmapp/3.2.1-1.0400.3965.10.63.gem/lib64 -L/opt/cray/xpmem/0.1-2.0400.30792.5.6.gem/lib64 -L/opt/cray/mpt/5.4.5/xt/gemini/mpich2-pgi/109/lib -L/opt/cray/mpt/5.4.5/xt/gemini/sma/lib64 -L/opt/xt-libsci/11.0.06/pgi/109/mc12/lib -L/opt/cray/xe-sysroot/4.0.36.securitypatch.20120130/usr/lib64 -L/opt/cray/xe-sysroot/4.0.36.securitypatch.20120130/lib64 -L/opt/cray/xe-sysroot/4.0.36.securitypatch.20120130/usr/lib/alps -L/usr/lib/alps -L/opt/pgi/12.4.0/linux86-64/12.4/libso -L/opt/pgi/12.4.0/linux86-64/12.4/lib -L/usr/lib64 -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/opt/cray/atp/1.4.4/lib/ -lAtpSigHCommData -lAtpSigHandler -lscicpp_pgi -lsci_pgi_mp -lmpichf90_pgi -lmpich_pgi -lmpl -lrt -lsma -lxpmem -ldmapp -lugni -lpmi -lalpslli -lalpsutil -ludreg -lpthread -lzceh -lstdmpz -lCmpz -lpgmp -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lnspgc -lpgc -lm

*** Warning: Linking the shared library libarpack.la against the
*** static library /global/homes/k/ksjobak/hopper/Software_kiHwan/atlas/lib/libatlas.a is not portable!

*** Warning: Linking the shared library libarpack.la against the
*** static library /global/homes/k/ksjobak/hopper/Software_kiHwan/atlas/lib/liblapack.a is not portable!
libtool: link: (cd .libs/libarpack.lax/libarpacksrc.a && ar x "/global/homes/k/ksjobak/hopper/Software_kiHwan/arpack-ng_3.1.1/./SRC/.libs/libarpacksrc.a")
libtool: link: (cd .libs/libarpack.lax/libarpackutil.a && ar x "/global/homes/k/ksjobak/hopper/Software_kiHwan/arpack-ng_3.1.1/./UTIL/.libs/libarpackutil.a")
(compilation progresses...)
make[2]: Entering directory `/global/u1/k/ksjobak/hopper/Software_kiHwan/arpack-ng_3.1.1/PARPACK'
/bin/sh ../libtool --tag=CC   --mode=link cc  -O2 -fPIC -version-info 2:0  -o libparpack.la -rpath /global/homes/k/ksjobak/hopper/Software_kiHwan/arpack-ng/lib  ../SRC/libarpacksrc.la ../UTIL/libarpackutil.la ../PARPACK/SRC/MPI/libparpacksrcmpi.la ../PARPACK/UTIL/MPI/libparpackutilmpi.la /global/homes/k/ksjobak/hopper/Software_kiHwan/atlas/lib/libatlas.a /global/homes/k/ksjobak/hopper/Software_kiHwan/atlas/lib/liblapack.a -L/opt/cray/udreg/2.3.1-1.0400.3911.5.13.gem/lib64 -L/opt/cray/ugni/2.3-1.0400.4127.5.20.gem/lib64 -L/opt/cray/pmi/3.0.1-1.0000.8917.33.1.gem/lib64 -L/opt/cray/dmapp/3.2.1-1.0400.3965.10.63.gem/lib64 -L/opt/cray/xpmem/0.1-2.0400.30792.5.6.gem/lib64 -L/opt/cray/mpt/5.4.5/xt/gemini/mpich2-pgi/109/lib -L/opt/cray/mpt/5.4.5/xt/gemini/sma/lib64 -L/opt/xt-libsci/11.0.06/pgi/109/mc12/lib -L/opt/cray/xe-sysroot/4.0.36.securitypatch.20120130/usr/lib64 -L/opt/cray/xe-sysroot/4.0.36.securitypatch.20120130/lib64 -L/opt/cray/xe-sysroot/4.0.36.securitypatch.20120130/usr/lib/alps -L/usr/lib/alps -L/opt/pgi/12.4.0/linux86-64/12.4/libso -L/opt/pgi/12.4.0/linux86-64/12.4/lib -L/usr/lib64 -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/opt/cray/atp/1.4.4/lib/ -lAtpSigHCommData -lAtpSigHandler -lscicpp_pgi -lsci_pgi_mp -lmpichf90_pgi -lmpich_pgi -lmpl -lrt -lsma -lxpmem -ldmapp -lugni -lpmi -lalpslli -lalpsutil -ludreg -lpthread -lzceh -lstdmpz -lCmpz -lpgmp -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lnspgc -lpgc -lm -lfmpich

*** Warning: Linking the shared library libparpack.la against the
*** static library /global/homes/k/ksjobak/hopper/Software_kiHwan/atlas/lib/libatlas.a is not portable!
 
*** Warning: Linking the shared library libparpack.la against the
*** static library /global/homes/k/ksjobak/hopper/Software_kiHwan/atlas/lib/liblapack.a is not portable!
libtool: link: (cd .libs/libparpack.lax/libarpacksrc.a && ar x "/global/homes/k/ksjobak/hopper/Software_kiHwan/arpack-ng_3.1.1/PARPACK/../SRC/.libs/libarpacksrc.a")
(some more output, seems to finish OK)

Do you understand what's going on? Why is it creating a dynamic library if I only asked for a static one? I don't have too much experience with creating configure scripts etc...

Regards,
Kyrre

Kyrre Ness Sjøbæk

Re: Compiling / linking with PARPACK [ possible bugfix ]

On 25 June 2012 14:27, Kyrre Ness Sjøbæk wrote:

> [...]

Just to follow up on this:

- Isn't it also a good idea to remove mpif.h from the distribution, as this is really a system header, with system-dependent constants?
- There is clearly a bug in the configure script: it detects the built-in BLAS and LAPACK, but then exits on
if test x"$BLAS_LIBS" = x; then
        as_fn_error $? "Cannot find BLAS libraries" "$LINENO" 5
fi

and

if test x"$LAPACK_LIBS" = x; then
        as_fn_error $? "Cannot find LAPACK libraries" "$LINENO" 5
fi

which to me seems wrong (it finds sgemm_ and cheev_, setting ax_blas_ok = yes and ax_lapack_ok = yes). I'm not sure how to fix this, except by disabling the whole check in configure...
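
A minimal sketch of a more tolerant check, assuming (as described above) that the detection step leaves BLAS_LIBS empty but sets ax_blas_ok to yes when the routines are already linked in by the compiler wrapper. The function name blas_is_available is hypothetical and is not taken from the arpack-ng configure script.

```shell
# Sketch: treat BLAS as found if the detection step reported success OR an
# explicit library list was detected. With a wrapper that links BLAS
# implicitly, the library list is legitimately empty.
blas_is_available() {
    # $1: value of ax_blas_ok ("yes" or "no")
    # $2: value of BLAS_LIBS (may be empty)
    [ "$1" = "yes" ] || [ -n "$2" ]
}

# Usage: fail only when neither condition holds.
# blas_is_available "$ax_blas_ok" "$BLAS_LIBS" || echo "Cannot find BLAS libraries" >&2
```

The same shape would apply to the LAPACK check with ax_lapack_ok and LAPACK_LIBS.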

- There is still something strange with the linking:

$ make clean
$ MPIF77=ftn F77=ftn CC=cc CXX=CC ./configure --prefix=$HOME/$NERSC_HOST/Software_kiHwan/arpack-ng/ --disable-static --enable-mpi --enable-shared
$ make
*removed tons of output*
make[1]: Entering directory `/global/u1/k/ksjobak/hopper/Software2/arpack-ng_3.1.1/TESTS'
ftn  -g -O2 -c -o dnsimp.o dnsimp.f
ftn  -g -O2 -c -o mmio.o mmio.f
/bin/sh ../libtool --tag=F77   --mode=link ftn  -g -O2   -o dnsimp dnsimp.o mmio.o ../libarpack.la
libtool: link: ftn -g -O2 -o .libs/dnsimp dnsimp.o mmio.o  ../.libs/libarpack.so -Wl,-rpath -Wl,/global/homes/k/ksjobak/hopper/Software_kiHwan/arpack-ng/lib
/usr/bin/ld: attempted static link of dynamic object `../.libs/libarpack.so'
collect2: ld returned 1 exit status

Removing any references to the TESTS subdirectory "fixed" this error - maybe the makefile there has some bugs? I also sometimes get an error about trying to statically link with libstdc++.so, but couldn't reproduce that now.

- Finally, should I use the web bug report tool rather than the mailing list? I'm hoping to get this package to work, so that I don't have to go back to the old PARPACK version (where there seems to be an MPI-related problem causing deadlocks) just because I can't build the new one...
- Note that in the previous email, I used the PGI compiler, not GCC. My mistake, and GCC doesn't give me these warnings.

Cheers, and have a good weekend,
Kyrre Sjøbæk

Sylvestre Ledru

Re: Compiling / linking with PARPACK [ possible bugfix ]

On 25/06/2012 14:27, Kyrre Ness Sjøbæk wrote:
> I saw that you changed the pdlamch call to also use the MPI instance,
> but isn't pdlamch really for BLACS, not MPI? Or does BLACS run on top
> of MPI? I just don't have that much experience with either
> high-performance linear algebra software or FORTRAN77...
>
I changed all occurrences of this call. It has to take 2 args. If I
missed any, please let me know.

> However, there seems to be something wonky with the build system as
> well, especially when trying to build a static version. Building on my
> laptop (fairly standard Fedora 16 / 64 bit installation) works fine,
> but on hopper.nersc.gov I run into problems.
>
[...]
> Do you understand what's going on? Why is it creating a dynamic
> library if I only asked for a static one? I don't have too much
> experience with creating configure scripts etc...
You have to explicitly disable it.

FYI:
$ ./configure --enable-static --disable-shared --enable-mpi
$ make
$ find . -iname '*.a'|wc -l
7
$ find . -iname '*.so*'|wc -l
0

Sylvestre
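
The check above can be wrapped in a small helper so it is easy to repeat after each reconfigure. This is only a sketch: the function name `count_libs` and the demo directory are made up for illustration, and it only counts file names, it does not inspect what libtool actually linked.

```shell
#!/bin/sh
# count_libs DIR: print how many static (*.a) and shared (*.so*) libraries
# exist below DIR. Hypothetical helper, not part of arpack-ng.
count_libs() {
    dir=${1:-.}
    static=$(find "$dir" -type f -iname '*.a' | wc -l | tr -d ' ')
    shared=$(find "$dir" \( -type f -o -type l \) -iname '*.so*' | wc -l | tr -d ' ')
    echo "static=$static shared=$shared"
}

# Demo on a throwaway tree so the snippet runs anywhere.
demo=$(mktemp -d)
mkdir -p "$demo/.libs"
: > "$demo/.libs/libarpack.a"
: > "$demo/.libs/libparpack.a"
count_libs "$demo"    # prints: static=2 shared=0
rm -rf "$demo"
```

On a real build tree, `count_libs .` after `make` should report `shared=0` if `--disable-shared` took effect.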

Sylvestre Ledru-4

Re: Compiling / linking with PARPACK [ possible bugfix ]

In reply to this post by Kyrre Ness Sjøbæk
On 29/06/2012 17:03, Kyrre Ness Sjøbæk wrote:
>
> Just to follow up on this:
>
> - Isn't it also a good idea to remove mpi.h from the distribution, as
> this is really a system header, with system-dependent constants?
Could you report a bug on this?
>
> - There is clearly a bug in the compile script, as it detects built-in
> BLAS and LAPACK, but configure then exits on
> if test x"$BLAS_LIBS" = x; then
>     as_fn_error $? "Cannot find BLAS libraries" "$LINENO" 5
> fi
What do you mean by "built-in"?

Anyway, if you believe it is a bug, please report one.

Sylvestre
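
For what it's worth, one way "built-in" could be read: on some systems the compiler wrapper links BLAS/LAPACK by default, so the detection succeeds (ax_blas_ok = yes) without anything being added to BLAS_LIBS, and a test on the empty variable alone then fails. A minimal sketch of the difference follows. The function names `strict_check` and `relaxed_check` are invented for illustration, and this is not the actual configure code.

```shell
#!/bin/sh
# Both functions take: $1 = ax_blas_ok (yes/no), $2 = BLAS_LIBS (may be empty).

# Mirrors the test quoted above: fails whenever BLAS_LIBS is empty.
strict_check() {
    if test x"$2" = x; then
        echo "Cannot find BLAS libraries"
        return 1
    fi
    echo "BLAS ok"
}

# Hypothetical alternative: trust the detection result instead of the variable.
relaxed_check() {
    if test x"$1" != xyes; then
        echo "Cannot find BLAS libraries"
        return 1
    fi
    echo "BLAS ok"
}

# Wrapper links BLAS implicitly: detection says yes, BLAS_LIBS stays empty.
strict_check yes "" || true    # prints "Cannot find BLAS libraries"
relaxed_check yes ""           # prints "BLAS ok"
```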

Kyrre Ness Sjøbæk

Re: Compiling / linking with PARPACK [ possible bugfix ]

In reply to this post by Sylvestre Ledru-4
 On Sat, 30 Jun 2012 21:57:54 +0200, Sylvestre Ledru
 <[hidden email]> wrote:

> On 25/06/2012 14:27, Kyrre Ness Sjøbæk wrote:
>> On 22 June 2012 22:09, Sylvestre Ledru wrote:
>>> Hello,
>>>
>> I saw that you changed the pdlamch call to also use the MPI
>> instance,
>> but isn't pdlamch really for BLACS, not MPI? Or does BLACS run on
>> top
>> of MPI? I just don't have that much experience with either
>> high-performance linear algebra software or FORTRAN77...
>>
> I changed all occurrences of this call. It has to take 2 args. If I
> missed any, please let me know.
>

 It's just that in all the documentation I find for pdlamch, the first
 argument is the BLACS context handle, while you used the MPI
 communicator. Are they really the same? I'm no expert on either
 ScaLAPACK or BLACS, but I can't find it stated anywhere that you may use
 an MPI communicator as a BLACS context handle...

>> However, there seems to be something wonky with the build system as
>> well, especially when trying to build a static version. Building on
>> my
>> laptop (fairly standard Fedora 16 / 64 bit installation) works fine,
>> but on hopper.nersc.gov I run into problems.
>>
> [...]
>> Do you understand what's going on? Why is it creating a dynamic
>> library if I only asked for a static one? I don't have too much
>> experience with creating configure scripts etc...
> You have to explicitly disable it.
>
> FYI:
> $ ./configure --enable-static --disable-shared --enable-mpi
> $ make
> $ find . -iname '*.a'|wc -l
> 7
> $ find . -iname '*.so*'|wc -l
> 0
>
> Sylvestre

 I think I tried this, without any difference. It might be something
 strange with the compiler - I'll try to pin it down better, and submit a
 bug report if it breaks the way I think it does, or send you an email if
 it was a PEBKAC.

 --- Kyrre
egunon

Re: Compiling / linking with PARPACK

In reply to this post by Kyrre Ness Sjøbæk
Hello,

I have just downloaded ARPACK & P_ARPACK.

Nevertheless, I get an error when executing the P_ARPACK examples. My laptop has an Intel Core 2 and I am using these libraries in a VirtualBox VM running Scientific Linux 6.5.

I installed the latest version of Open MPI, and although everything compiles well I get this error:
      An error occurred in MPI_Allreduce
      on communicator MPI_COMM_WORLD
      MPI_ERR_OP: invalid reduce operation
      MPI_ERRORS_ARE_FATAL: your MPI job will now abort.

This is the PATH I have: /usr/local/bin:/usr/bin/:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/lib/openmpi/bin/::/usr/lib/openmpi/bin//usr/lib/openmpi/bin/

(I don't know why the last one is repeated three times)
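
A note on that PATH value: the entries look glued together rather than repeated. The last component has no `:` between the copies, and the `::` leaves an empty entry, which the shell treats as the current directory. Below is a small sketch for listing the entries one per line and flagging empty or duplicate ones. The function name `check_path` is made up here.

```shell
#!/bin/sh
# check_path VALUE: print each ':'-separated entry of VALUE on its own line,
# marking empty entries and entries already seen. Illustrative helper only.
check_path() {
    printf '%s\n' "$1" | tr ':' '\n' | awk '
        $0 == ""   { print "(empty)"; next }
        seen[$0]++ { print $0 " (duplicate)"; next }
                   { print }
    '
}

# Example value with one empty entry and one repeated entry.
check_path "/usr/bin:/bin::/usr/lib/openmpi/bin/:/usr/bin"
```

Running `check_path "$PATH"` does the same for the live environment.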

LD_LIBRARY_PATH: /usr/lib/openmpi/lib/

If you need a log file, please specify the location as I am just starting with this library.
Browsing the net, it seems that the installation should be pretty straightforward, so I expect this to be a problem with my configuration script.
Do you know how I could solve this error?

Thank you in advance,

Garazi Gómez de Segura
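
One thing worth ruling out, following the earlier remark in this thread about an MPI header shipped inside the distribution: if the examples are compiled against a bundled header rather than the one belonging to the installed Open MPI, constants such as the reduce operations may not match the library at run time. Whether that is the cause here is only a guess. A sketch for listing any MPI headers that live inside a source tree (the function name `find_mpi_headers` and the demo layout are hypothetical):

```shell
#!/bin/sh
# find_mpi_headers DIR: list mpi.h / mpif.h files found below DIR, sorted.
find_mpi_headers() {
    find "${1:-.}" -type f \( -name 'mpi.h' -o -name 'mpif.h' \) | sort
}

# Demo on a temporary tree; on a real checkout, pass the source directory.
demo=$(mktemp -d)
mkdir -p "$demo/src"
: > "$demo/src/mpif.h"
: > "$demo/src/example.f"
find_mpi_headers "$demo"    # lists only .../src/mpif.h
rm -rf "$demo"
```

With Open MPI, `mpif90 -showme` prints the underlying compiler command, including the include directories the wrapper adds, which can be compared against any headers found this way.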
egunon

Re: Compiling / linking with PARPACK

Hello


2014-07-14 0:16 GMT+02:00 egunon [via Scilab / Xcos - Mailing Lists Archives] <[hidden email]>:
> [...]

