Gerhard Niklasch on Sun, 29 Oct 2000 15:26:05 +0100 (MET)



Re: [PARI] about the behavior of install()


In response to:
> Message-ID: <Pine.LNX.4.10.10010290017000.23994-100000@pev.math.univ-montp2.fr>
> Date: Sun, 29 Oct 2000 00:52:15 +0200 (CEST)
> From: Philippe Elbaz-Vincent <pev@pev.math.univ-montp2.fr>
> 
> I am puzzled by the behavior of the function install(), or should I say
> its internal mechanism, and I will appreciate any help on this matter.
[...]
> So my question is; does this illustrate the expected behavior of
> install()?
> If yes, what is the correct way to kill and re-install a loadable
> function ? (because it's quite cumbersome when you're testing/
> debugging a function)

Yes, this is the expected behavior, and as far as I can see there
is currently no way around it.

Installing a function F living in some shared library called L.so
into gp is a 2-stage affair:

(1) gp asks the runtime loader to map L.so into the process's
address space, using the dlopen() library function from libc or
(on some Unices) libdl.  The runtime loader locates the actual
file IN (for inode) referred to by the name L.so, and if all goes
well, gp ends up with an anonymous handle to the dynamic library,
DLH.

(2) gp then uses DLH to perform a dlsym() lookup for something called
F in the mapped-in library IN.  If successful, the result is the
start address of the function.  This address and the other info
from the install() invocation is entered into the gp interpreter's
lookup table.
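
To make the two stages concrete, here is a minimal sketch in C.  It is
not the actual gp source: the helper names stage1_open() and
stage2_lookup() and the typedef abs_fn are made up for illustration.
Passing NULL as the library name asks for a handle on the running
program and its startup libraries, so the sketch can be tried without
building a separate L.so.  On some systems you need to link with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers for illustration only; not taken from PARI/gp. */
typedef int (*abs_fn)(int);

/* Stage (1): ask the runtime loader to map the library and hand back
 * an opaque handle (the DLH of the text).  A NULL name refers to the
 * running program itself plus the libraries loaded at startup. */
static void *stage1_open(const char *libname)
{
    void *dlh = dlopen(libname, RTLD_LAZY);
    if (dlh == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
    }
    return dlh;
}

/* Stage (2): use the handle to look up the start address of a symbol. */
static void *stage2_lookup(void *dlh, const char *symbol)
{
    void *addr;
    dlerror();                      /* clear any stale error state */
    addr = dlsym(dlh, symbol);
    if (addr == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is NULL");
    }
    return addr;
}
```

A caller would do something like
abs_fn f = (abs_fn)stage2_lookup(stage1_open(NULL), "abs");
and then store f wherever it keeps its function table.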

Now killing F from the GP session will undo step (2) (F is deleted
from the interpreter table), but not (1).  It would not be safe to
undo (1) blindly, since other functions from the same shared library
might have been installed in the meantime, and if we removed IN from
the address space, we'd be left with orphaned pointers, courting a
segmentation fault as soon as one of them was used.

Rebuilding the shared library will unlink the file IN on disk from
its former directory entry L.so, and will attach a freshly linked
shared dynamic library IN' to the name L.so.  The running gp process
is now (typically) the only thing referencing the old inode IN, and
IN will be deleted from disk as soon as this gp process stops
referencing it (typically when you quit the gp session).  Again, it
would be suicidal if things were otherwise.  The entry addresses of
functions in the rebuilt IN' will typically be different from what they
were in IN, and the runtime loader has no way of knowing whether and
where the running process may have stashed away a pointer obtained
earlier through dlsym().  If IN were replaced by IN' on the fly, all
such pointers would now point at random garbage.
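
The inode behavior is ordinary Unix file semantics and can be seen
without any shared library at all.  The following sketch (the function
name read_after_unlink() and the temporary file name are invented for
this example) creates a file, removes its directory entry, and still
reads the old contents through the descriptor it holds, much as gp
keeps using the old IN after L.so has been rebuilt.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: write a file, unlink its name, then read the data
 * back through the still-open descriptor.  Returns the number of bytes
 * read (buf is NUL-terminated), or -1 on any failure. */
static long read_after_unlink(char *buf, size_t buflen)
{
    char path[] = "/tmp/inode_demo_XXXXXX";
    const char msg[] = "old code";
    ssize_t n;
    int fd = mkstemp(path);          /* creates and opens a unique file */

    if (fd < 0)
        return -1;
    if (write(fd, msg, sizeof msg - 1) != (ssize_t)(sizeof msg - 1)) {
        close(fd);
        unlink(path);
        return -1;
    }
    if (unlink(path) != 0) {         /* the name is gone from the directory */
        close(fd);
        return -1;
    }
    if (access(path, F_OK) == 0) {   /* the name should no longer resolve */
        close(fd);
        return -1;
    }
    /* The inode lives on because this process still references it. */
    if (lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }
    n = read(fd, buf, buflen - 1);
    close(fd);                       /* last reference: now it is freed */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return (long)n;
}
```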

When you install F a second time into the existing gp session, step
(1) becomes a no-op: the runtime loader is asked for a library which
is already present in the address space, and the newly returned handle
DLH' will be equivalent to the old DLH.  Thus (2) will pick up the
pointer to the same old F in the same old IN.
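
A small experiment along these lines shows the effect.  This is only a
sketch; reopen_gives_same_address() is a name invented here, and the
first handle is deliberately left open during the second dlopen(),
mirroring the fact that killing F in gp never closes it.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical check: open the same library twice without closing it in
 * between and compare what dlsym() returns through each handle.
 * Returns 1 if both lookups give the same address, 0 if they differ,
 * and -1 if the library or the symbol could not be found. */
static int reopen_gives_same_address(const char *libname, const char *symbol)
{
    void *dlh1, *dlh2, *addr1, *addr2;
    int result;

    dlh1 = dlopen(libname, RTLD_LAZY);
    if (dlh1 == NULL)
        return -1;
    addr1 = dlsym(dlh1, symbol);

    /* Second "install": the library is already mapped, so the loader
     * has nothing new to bring in. */
    dlh2 = dlopen(libname, RTLD_LAZY);
    if (dlh2 == NULL) {
        dlclose(dlh1);
        return -1;
    }
    addr2 = dlsym(dlh2, symbol);

    if (addr1 == NULL || addr2 == NULL)
        result = -1;
    else
        result = (addr1 == addr2);

    dlclose(dlh2);
    dlclose(dlh1);
    return result;
}
```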


So much for explaining why things are as they are;  now what could
be done about changing this?  There is a way (1) could be undone --
gp would have to call dlclose(DLH) and thus promise to the runtime
loader that neither DLH nor any pointer obtained from it by dlsym()
will be used any longer.  Then IN _will_ be unmapped, and the next
install and dlopen() would pick up the recompiled IN' under the same
old name L.so, giving the effect you intended.  In order for this to
be safe at the gp level, gp would need to keep (at least) some sort
of reference count: the number of times it had used dlsym() on a
given handle without yet having killed the symbol returned.  Actually
it's going to be more complicated than that, since at present install
doesn't remember the handle.  And I could think of situations where
a reference count would be insufficient -- when the same shared lib
is accessible by more than one name...  But it would probably suffice
for practical purposes.
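
As a rough idea of what such bookkeeping might look like, here is a
sketch.  Nothing in it exists in PARI: struct lib_ref, lib_ref_install()
and lib_ref_kill() are hypothetical names, and the sketch ignores both
the several-names problem and the special handle for the running
process.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical per-library record; not a PARI data structure. */
struct lib_ref {
    void *dlh;    /* handle returned by dlopen() */
    int   nsyms;  /* symbols obtained via dlsym() and not yet killed */
};

/* Call once for every symbol installed through this handle. */
static void lib_ref_install(struct lib_ref *lib)
{
    lib->nsyms++;
}

/* Call once for every symbol killed.  When no installed symbol depends
 * on the handle any more, give it back to the runtime loader.
 * Returns 1 if dlclose() was called, 0 otherwise. */
static int lib_ref_kill(struct lib_ref *lib)
{
    if (lib->nsyms > 0)
        lib->nsyms--;
    if (lib->nsyms == 0 && lib->dlh != NULL) {
        dlclose(lib->dlh);
        lib->dlh = NULL;   /* never reuse a closed handle */
        return 1;
    }
    return 0;
}
```

Note that dlclose() only promises that this handle is no longer used;
the loader unmaps the library once its own count of outstanding
dlopen() calls drops to zero.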

So anybody fancy a little project?  kill0() is in src/language/anal.c,
and install() and its internal sub-functions come from src/gp/highlvl.c.
(Beware, the data structure behind ep will probably need to be changed,
to have a slot in which DLH can be remembered so that we know where to
find the reference count when we kill0() the entree, and there's the
special case of a DLH obtained without using a name and referring to
the main symbol table of the running gp process...)

Hope this helps,
Gerhard