[isabelle-dev] Grouping ISABELLE_FULL_TEST?
Jasmin Blanchette
jasmin.blanchette at inria.fr
Wed Feb 3 11:04:48 CET 2016
> On 02.02.2016, at 08:28, Lars Hupel <hupel at in.tum.de> wrote:
>
>> What remains are various benchmarks. We could move these sessions to
>> ~~/src/Benchmarks and omit that directory in ~~/ROOTS. Thus it can be
>> included on demand like this:
>>
>> isabelle build -d '~~/src/Benchmarks' -a
>>
>> or like this:
>>
>> isabelle build -D '~~/src/Benchmarks'
>>
>> This avoids more sophisticated Boolean algebra on session groups. It
>> also makes clear from the source structure, what is normally not tested.
>>
>> I basically do the same with -d '$AFP' all the time.
>
> Sounds reasonable. There's no pressure to do it just yet; I'd be happy
> to wait for post-release mode.
I have a slight worry concerning the characterization and naming as "benchmarks". From what I understand, a "benchmark" is a theory file that tests the performance of the system. Looking at the list
HOL-ex
Sudoku
HOL-Datatype_Examples
Brackin
IsaFoR
Misc_N2M
HOL-Word-SMT_Examples
SMT_Tests
HOL-Quickcheck_Benchmark
Find_Unused_Assms_Examples
Needham_Schroeder_No_Attacker_Example
Needham_Schroeder_Guided_Attacker_Example
Needham_Schroeder_Unguided_Attacker_Example
HOL-Record_Benchmark
Record_Benchmark
I would say that only "Record_Benchmark" is truly benchmarking. The others are merely slow examples or examples that require a special setup.
Special setup:
Sudoku
SMT_Tests (requires Z3)
Slow:
Brackin
IsaFoR
Misc_N2M
Probably also in the slow category (the last three might have a benchmarking component):
Find_Unused_Assms_Examples
Needham_Schroeder_No_Attacker_Example
Needham_Schroeder_Guided_Attacker_Example
Needham_Schroeder_Unguided_Attacker_Example
Slow and benchmarking:
Record_Benchmark
Conversely, some examples that are always processed can be considered benchmarking -- e.g. "Nitpick_Examples/Hotel_Nits.thy".
Jasmin
More information about the isabelle-dev
mailing list