NEWS: More uniform and scalable declarations of intro/elim/dest rules

Mon Jul 21 17:45:57 CEST 2025

*** General ***

* Declarations of intro/elim/dest rules for Pure and the Classical
Reasoner (e.g. HOL) are handled more uniformly and efficiently: the
order of rule declarations is maintained accurately, regardless of
intermediate [rule del] declarations. Furthermore, [dest] is now a
proper declaration on its own account, instead of the former expansion
[elim_format, elim]. Consequently, [rule del] no longer deletes the
[elim_format] of the given rule, only the original rule. Very rare
INCOMPATIBILITY: tools like "blast" and "auto" may fail in unusual
situations.

This refers to Isabelle/372273ab6ebb and AFP/eae86974a665.


That is the notable point, where all Isabelle + AFP applications essentially 
work unchanged. Intermediately, in my desparation to get it done, I had 
adjusted a few odd proofs. See this already backed-out changeset:

changeset:   15842:d06954e7388a
user:        wenzelm
date:        Sun Jul 13 04:21:58 2025 +0200
files:       thys/Buchi_Complementation/Complementation_Implement.thy 
thys/Tree_Enumeration/Rooted_Tree.thy
description:
adapted proofs to Isabelle/df2d774bcf21: spurious failures due to unclear reasons;

diff -r ba30fe0ad97c -r d06954e7388a 
thys/Buchi_Complementation/Complementation_Implement.thy

--- a/thys/Buchi_Complementation/Complementation_Implement.thy	Fri Jul 11 
15:02:29 2025 +0200
+++ b/thys/Buchi_Complementation/Complementation_Implement.thy	Sun Jul 13 
04:21:58 2025 +0200
@@ -597,7 +597,9 @@
      fix a f g
      assume "g \<in> expand_map (get_4 A (bounds_3 A a (refresh_1 f)))"
      then show "g \<in> expand_map (get_3 A (bounds_3 A a (refresh_1 f)))"
-      unfolding get_4_def get_3_def items_4_def items_3_def 
expand_map_alt_def by blast
+      unfolding get_4_def get_3_def items_4_def items_3_def 
expand_map_alt_def mem_Collect_eq
+      supply not_None_eq [iff del] by blast
+
    qed
    lemma complement_4_language_2: "language (complement_4 A) \<subseteq> 
language (complement_3 A)"
      using language_mono complement_4_less by (auto dest: monoD)
diff -r ba30fe0ad97c -r d06954e7388a thys/Tree_Enumeration/Rooted_Tree.thy
--- a/thys/Tree_Enumeration/Rooted_Tree.thy	Fri Jul 11 15:02:29 2025 +0200
+++ b/thys/Tree_Enumeration/Rooted_Tree.thy	Sun Jul 13 04:21:58 2025 +0200
@@ -554,7 +554,7 @@
  abbreviation "ltree_stree_subtrees ts \<equiv> SOME xs. fset_of_list xs = 
ltree_stree |`| ts \<and> distinct xs \<and> sorted_wrt (\<lambda>t s. 
tree_ltree t \<le> tree_ltree s) xs"

  lemma fset_of_list_ltree_stree_subtrees[simp]: "fset_of_list 
(ltree_stree_subtrees ts) = ltree_stree |`| ts"
-  using someI_ex[OF distinct_sorted_wrt_list] by fast
+  using someI_ex[OF distinct_sorted_wrt_list] by force

  lemma set_ltree_stree_subtrees[simp]: "set (ltree_stree_subtrees ts) = 
ltree_stree ` fset ts"
    using fset_of_list_ltree_stree_subtrees by (metis (mono_tags, lifting) 
fset.set_map fset_of_list.rep_eq)


A bit later, the hidden failure of smt proof reconstruction prompted me to 
revisit the whole affair once again, comparing netpair content more carefully, 
and reconstructing/eliminating reasons for remaining deviations.

Some of these oddities had been around for approx. 30 years: Delicate 
diversions from the intention written as prose comments wrt. the actual 
implementation in ML. I made the mistake to take the prose text more seriously 
than the formal ML text. Ultimately, real-world applications found in AFP 
determine what is right or wrong, and don't care about good intentions written 
as comments.


Here are some notable points in history, together with my present 
interpretation of the archeological findings. This is going back to the depths 
of time, when Isabelle was still an odd research experiment.

* b3f190995bc9 lcp 28-Apr-1995 "Recoded addSIs, etc., so that nets are built 
incrementally instead of from scratch each time.  The old approach used 
excessive time (>1 sec for adding a rule to ZF_cs) and probably excessive 
space.  Now rep_claset includes all record fields.  joinrules no longer sorts 
the rules on number of subgoals, since it does not see the full set of them; 
instead the number of subgoals modifies the priority."

That is an early attempt to make things scale better. That version is notable 
to assign integer "tags" for sorting rules according to the following 
calculation of the length of intro vs. elim rules:

   fun insert (nI,nE) = insert_tagged_list o (tag_brls (~(2*nI+nE))) o joinrules;

Still a bit inefficient (based on list length) and not quite correct, as 
explained below.

* c06d01f75764 paulson 01-May-1996: "Provides merge_cs to support default 
clasets".

Here the comment says:

(*Merge works by adding all new rules of the 2nd claset into the 1st claset.
   Merging the term nets may look more efficient, but the rather delicate
   treatment of priority might get muddled up.*)

But the ML source says something different: It effectively orders all 
declarations from the RHS of the merge according to their "rule kind" (using 
current terminology). That is no problem for disjoint netpairs, but within a 
single netpair, newly introduced elim rules now take precedence over intro rules.

That odd ordering from 1996 made smt reconstruction fail now, because I 
reimplemented the prose specification instead of the actual ML story.

Further note that we still see quadratic complexity for the merge, because it 
is based on plain lists (which implicitly retains the order of declarations).

* 2425068fe13a wenzelm 14-May-2011: "slightly more efficient claset 
operations, using Item_Net to maintain rules in canonical order".

That was my attempt to address the complexity of extending/merging claset rule 
declarations, but not a fully successful one. Its use of Item_Net.T instead of 
a plain list was fine, because that is a scalable data structure with n*log(n) 
complexity instead of n^2. My oversight in 2011 was the remaining linear 
operation Item_Net.length to imitate the allocation of a "fresh" declaration 
index for each rule.

* ca600cbfd4bf wenzelm 10-Jul-2025 "more accurate "next" counter for each 
insert operation: subtle change of semantics wrt. Item_Net.length, due to 
delete operation; avoid costly Item_Net.length, which is linear in size;"

Approx. 2 weeks ago, I realized the core problem of inefficiency of the 
claset, only after cleaning up the sources and saying clearly in ML what 
really happens, instead of trusting odd comments. Apart from inefficiency of 
length-based allocation of a "fresh" index, it was also wrong due to the 
presence of "rule del" declarations: thus an old index could have been reused 
later on --- it is not really fresh. Luckily, the more correct implementation 
in this changeset did not affect any applications in Isabelle + AFP. So 
ca600cbfd4bf remained the "stable status-quo" in subsequent struggles to 
finish this renovation project.

* 8aa1c98b948b wenzelm 11-Jul-2025 "maintain collective rule declarations via 
type Bires.decls, with netpair operations derived from it;"

This is where I switched over to a quite different background store for the 
claset rules, with subtle differences in ordering of declarations (and a few 
other problems that I introduced on the spot). The idea is plain and simple: 
all rules are maintained in one scalable data structure Bires.decls, and the 
netpairs derived from it can count on precise weight and index information for 
the ordering. The reality turned out rather ugly, though: applications from 
AFP that did not quite work yet.


All these problems are resolved in Isabelle/372273ab6ebb and AFP/eae86974a665, 
as far as I can tell. It recovers all visible applications, without odd 
changes of proofs.

We could still be a bit more ambitious, though, and eliminate the strange 
Bires.decl_merge_ord that is there to imitate the merge ordering from 30 years 
ago. Changing that would mean to take the user-provided declaration order 
seriously, even for newly added rules in a merge.

That further detail would require to go into the smt proof reconstruction, to 
see where its invocations of fast_tac/blast_tac/auto_tac etc. actually fail, 
or rather consume much more time than before --- and thus break (line 6705 of 
"$AFP/Modular_arithmetic_LLL_and_HNF_algorithms/HNF_Mod_Det_Soundness.thy"), 
for example.


	Makarius