[isabelle-dev] Jenkins maintenance

Mon Oct 3 17:12:30 CEST 2016

I have decided to add my own opinions to this debate. My knowledge of
the old testing system is limited, but I hope my perspective will still
hold some interest.

First of all, a usable testing infrastructure does not grow overnight by
itself. Someone has to put in a lot of time and effort to make this
happen, and that's what Lars did – much of it, as I gather, in his free
time. I'm sure Lars does not object to others adapting this
infrastructure, and he has in fact already implemented a number of
suggestions for improvements.

>From what I have picked up, the new testing infrastructure we have now
is also a lot more adaptable and maintainable than anything we had
before. The words ‘industry standard’ may seem meaningless to you,
Makarius, but it cannot be denied that if /everyone/ does testing in a
certain way these days, then that is at least some evidence that it
might be a good idea to consider. Relying on the vague idea that things
never break because people never make mistakes does not seem like a
sensible approach to me – and once you accept that things /can/ break
because people /do/ make mistakes, prompt notification of /what/ broke
and /why/ is important.

No one is saying that the current state of affairs is optimal. I can
think of a number of possible improvements myself and have privately
communicated them to Lars, who seemed quite grateful for the feedback.

I for one think that a big improvement would be a system where no one
pushes to the repository directly: every push goes to a testing server,
and if the test is successful, the changes can then automatically be
merged into the repository without an additional test. This avoids the
current duplication of tests where one first pushes to the testboard,
waits for everything to run through, and then pushes to the repository
where everything is tested again. It would also be an effective
safeguard against breaking the repository (and possibly even the AFP),
which /does/ happen quite frequently. (cf. the current situation where
things have been broken for days) Another improvement I can think of is
that it would be good to have a little more control over testing:
aborting tests when they are not necessary anymore, testing only the
repository (and not the AFP) etc. All of this we can talk about and find
a solution in time.

It is important here to stress that none of us is working /against/
another – we all want the best possible infrastructure to do our work,
and I for one do not like the path this discussion has been taking. The
reality of it all is that our testing infrastructure was in shambles for
quite some time. Then Lars came along and put in a lot of effort to get
things up and running again. The main response on this mailing list were
vague insinuations that he does not understand how to do tests, or does
not care about doing it properly, which could hardly be farther removed
from the truth.

This email is getting a bit longer than I had planned, so let me
summarise my points once more for additional clarity:
– The current testing infrastructure is good, but not perfect, and we
should work together to make it better.
– The testboard, while also not perfect, has already become an
invaluable tool in my work on both Isabelle and the AFP.
– The tone of the discussion on this mailing list could occasionally do
with a little less passive-aggressiveness and fewer underhanded
insinuations.
– Lars deserves enormous gratitude from all of us for the work he put
into this so far. Of course, this does not mean that his decisions must
not be criticised, but when you work on something (mostly) alone in your
free time and the only public response you get is complaints, that's not
exactly uplifting.

Cheers and a happy Unification Day to all of you,

Manuel