[isabelle-dev] AFP devel broken

Fri Dec 7 23:50:05 CET 2012

On Thu, 6 Dec 2012, Tobias Nipkow wrote:

> Am 06/12/2012 12:57, schrieb Makarius:
>> This does not say anything yet.  We need to collect further details and
>> hypotheses and test them.  I will also try again to reproduce it myself.
>
> One point may (or may not) have got lost. The problem seems to be independent
> from the AFP. Remember that I had the situation where even HOLCF hung. It turned
> out that trying to load any theory based on HOL hung. I eventually forced HOL to
> be recompiled and then things were more normal again. Yes, it sounds strange.

The situation should be under control again.

I've introduced this problem myself in a recent cleanup of the future task 
scheduler, probably around 262c36fd5f26 from 18-Oct-2012.  Since we are 
lagging behind with the testing infrastructure itself, we often get delays 
in exposing issues of the systems being tested.


So this is the explanation for the empirical observations on this thread:

   * AFP uses document_variants=document:outline=/proof,/ML which means > 1
     document preparation jobs in the end; thus it had the future task
     scheduler and potentially some worker threads still running when the
     image was saved.

   * After reloading the image to build another session on top, the old
     tasks were revived by the Poly/ML runtime system, but a fresh instance
     of the scheduler + workers forked by Isabelle/ML as usual.  The old
     threads could still hold locks or access critical variables in a way
     that interferes with the fresh crew.

What is also funny in the persistent thread model is that the thread 
equality changes but physically the threads operate on the same data 
space as before the dump.  This makes it hard to check explicitly against 
such mistakes inside ML.

Anyway, in such a situations of Isabelle/ML threading problems (not 
Poly/ML threading problems nor Scala/JVM threading problems) setting 
Multithreading.trace := 5 gives an idea what is wrong. Since it was all in 
my own machine room it was easy to isolate and address -- problems by 
Oracle or EPFL take much longer.


So here is my changeset to conclude this mail thread (hopefully):

# HG changeset patch
# User wenzelm
# Date 1354918479 -3600
# Node ID 702278df3b57aff4c55fec3bffb0206ace927432
# Parent  f8cd5e53653b98df3433356a94bbb3810042c626
make double-sure that the future scheduler is properly shutdown, otherwise 
its threads are made persistent and will deadlock with the fresh instance 
after reloading the image (NB: Present.finish involves another 
Par_List.map over document_variants and thus might fork again);

diff -r f8cd5e53653b -r 702278df3b57 src/Pure/System/session.ML

--- a/src/Pure/System/session.ML	Fri Dec 07 23:11:01 2012 +0100
+++ b/src/Pure/System/session.ML	Fri Dec 07 23:14:39 2012 +0100
@@ -79,6 +79,7 @@
    Keyword.status ();
    Outer_Syntax.check_syntax ();
    Options.reset_default ();
+  Future.shutdown ();
    session_finished := true);


 	Makarius