Ticket #3011 (closed defect: fixed)

Opened 5 years ago

Last modified 3 years ago

Compile fail on OS X Lion with PGI

Reported by: jsquyres
Owned by: rhc
Priority: major
Milestone: Open MPI 1.6.6
Version: trunk
Keywords:
Cc: PHHargrove@…


Reported by Paul Hargrove: http://www.open-mpi.org/community/lists/devel/2012/01/10258.php

This was reported on an OMPI 1.5.5 rc, but I'm making the executive RM decision to push this to 1.6.

Short version:

Making all in tools/orte-clean 
source='../../../../orte/tools/orte-clean/orte-clean.c' object='orte-clean.o' libtool=no \ 
DEPDIR=.deps depmode=none /bin/sh ../../../../config/depcomp \ 
pgcc -DHAVE_CONFIG_H -I. -I../../../../orte/tools/orte-clean -I../../../opal/include -I../../../orte/include -I../../../ompi/include -I../../../opal/mca/paffinity/linux/plpa/src/libplpa -I../../../.. -I../../.. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -D_REENTRANT -O -DNDEBUG -c -o orte-clean.o ../../../../orte/tools/orte-clean/orte-clean.c 
/bin/sh ../../../libtool --tag=CC --mode=link pgcc -O -DNDEBUG -export-dynamic -o orte-clean orte-clean.o ../../../orte/libopen-rte.la -lutil
libtool: link: pgcc -O -DNDEBUG -o orte-clean orte-clean.o  ../../../orte/.libs/libopen-rte.a /Users/paul/openmpi-1.4.5rc2/BLD-pgi-11.10/opal/.libs/libopen-pal.a -lutil
Undefined symbols for architecture x86_64: 
  "_orte_odls", referenced from: 
      _orte_errmgr_base_error_abort in libopen-rte.a(errmgr_base_fns.o) 
ld: symbol(s) not found for architecture x86_64 

Change History

comment:1 Changed 5 years ago by phargrov

This was also seen in 1.4.5rc5: http://www.open-mpi.org/community/lists/devel/2012/02/10333.php

I will assume there is no plan to fix it for 1.4.x either.

comment:2 Changed 3 years ago by donb2000

The problem is not specific to Lion; I reproduced it on a Snow Leopard system.

I was able to build 64-bit Open MPI 1.6.5 on Snow Leopard (10.6.8) using PGI compilers after making two edits:

(1) unresolved external orte_odls

  • orte/mca/odls/base/odls_base_open.c, initialize orte_odls:

orte_odls_base_module_t orte_odls = {0};

(2) pgcpp unrecognized option -bind_at_load

  • ompi/contrib/vt/vt/extlib/otf/libtool, set variable wl in several places:

wl="-Wl,"

I used the following configure command:

../openmpi-1.6.5/configure CC=pgcc CXX=pgcpp F77=pgf77 FC=pgf90 CPP="pgcc -E" CXXCPP="pgcpp -E" CFLAGS=-v CXXFLAGS=-v --prefix=/blah/blah/blah 2>&1 | tee config.out

Details: orte_odls appears to be a global variable. When the error occurs, the object that defines it is being linked, but no executable code in that object is referenced, so the global isn't linked in. This differs from Linux, where it is linked in. I created a small test case that reproduces the behavior, but it also fails with gcc, so there must be some configure/libtool magic that changes the behavior for gcc.
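To illustrate edit (1), here is a minimal sketch of the two forms of definition involved. The struct and all demo_* names are hypothetical stand-ins, not the real orte_odls_base_module_t. The assumption, which this ticket does not confirm, is that the uninitialized form is emitted as a "common" symbol by some toolchains, and that the OS X linker does not extract an archive member just to satisfy such a symbol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for orte_odls_base_module_t; the real struct
 * has different members. */
typedef struct {
    int (*launch)(void);
    int (*kill)(void);
} demo_module_t;

/* Tentative definition: no initializer. Assumption: some toolchains
 * emit this as a "common" symbol, which may not cause the enclosing
 * archive member to be pulled into a static link. */
demo_module_t demo_tentative;

/* Definition with an explicit initializer, mirroring edit (1). This is
 * an ordinary external definition rather than a tentative one. */
demo_module_t demo_initialized = {0};

/* Returns 1 when every function pointer in the module is NULL. */
int demo_module_is_unset(const demo_module_t *m)
{
    return m->launch == NULL && m->kill == NULL;
}

/* Both objects have static storage duration, so both start out zeroed;
 * the initializer changes how the symbol is emitted, not its value. */
void demo_check(void)
{
    assert(demo_module_is_unset(&demo_tentative));
    assert(demo_module_is_unset(&demo_initialized));
}
```

Because the run-time values are identical, adding the initializer should not change program behavior. Whether a given compiler treats the first form as a common symbol depends on its defaults and flags, so checking the object file with nm is a reasonable way to confirm.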

comment:3 Changed 3 years ago by jsquyres

  • Status changed from new to closed
  • Resolution set to invalid

I got an off-list email from Don about this ticket:

Jeff, please disregard my question below, further investigation reveals that the lack of shared libs is our problem. Regards, --Don

So I'm closing this ticket.

comment:4 Changed 3 years ago by jsquyres

  • Status changed from closed to reopened
  • Resolution invalid deleted

comment:5 Changed 3 years ago by jsquyres

  • Owner changed from jsquyres to rhc
  • Status changed from reopened to assigned

Ralph -- what do you think of Don's solution to (1)? (i.e., initialize orte_odls)

comment:6 Changed 3 years ago by rhc

I don't have any heartburn over it, though I confess I'm unable to reproduce the problem on my Mac. Has anyone else been able to reproduce it? Or is this a PGI-only issue?

comment:7 Changed 3 years ago by jsquyres

Note that this is v1.6.x.

I'm able to build/run on my Mac, so I assume this is a PGI issue.

comment:8 Changed 3 years ago by rhc

We'll probably hit the same issue in 1.7 and above as well, then, as the referenced code hasn't changed. Guess I'll go ahead and add the initializer, and CMR it back to the releases.

comment:9 Changed 3 years ago by donb2000

As for the other part of the solution, getting libtool to do the right thing wrt '-Wl,' and PGI compilers, is this an upstream issue that I need to take up with the libtool maintainers? Can we patch Open MPI for now until that is resolved?

comment:10 Changed 3 years ago by jsquyres

Yes, this is an upstream Libtool issue.

However, if you can produce a patch for us that we should apply in the autogen/configure process, we can use that in the meantime.

comment:11 Changed 3 years ago by jsquyres

The orte_odls issue is fixed on the trunk in r29129, and fixed in v1.6 branch (although I don't expect a 1.6.6 release) in r29130. Will therefore start showing up in nightly builds tonight.

Will be committed to v1.7 branch soon.

comment:12 Changed 3 years ago by jsquyres

  • Status changed from assigned to closed
  • Resolution set to fixed

This has been fixed on the v1.6 branch.
