The Kassel Dowsing Test: Part 2

By Robert Konig, Jurgen Moll, and Amardeo Sarma


The premier issue of Swift contained the first part of "The Kassel Dowsing Test," a reprinted article from Skeptiker, about the first original project of the GWUP, the German skeptics' organization.

We left the group of dowsers and GWUP members at the test site in Kassel, having thoroughly agreed on the protocol and confident in it, poised to make history. A wide variety of pendulums, forked sticks, and bobbing and twisted springs were in agitated motion as the claimants eagerly awaited their chance at the DM20,000 prize. TV cameras covered every aspect of the proceedings. Claimants told interviewing reporters that they were astonished at the naivety of the GWUP people who were offering them this easy way to win a substantial prize.

I must admit that at such moments, I have a fleeting feeling of "But what if...?" Dowsers are almost universally honest folks who really believe they can pass such tests, and their guileless exhilaration is infectious. But as we've shown so many times, these folks are merely subject to the "ideomotor effect," whereby they move the dowsing device unconsciously and are innocently unaware of doing so. They are often able to succeed in poorly designed and poorly controlled demonstrations, usually depending upon common sense and careful observation, but they always fail in this sort of strict, double-blind, monitored test. Experience has shown me that even strong contrary evidence rarely sways them, and they persist in their convictions that they have supernatural abilities and that they can easily prove them to doubters. There is no joy in having to tell honest-but-deluded claimants that they have not demonstrated their claims to be true. When we demonstrate that dowsing is a delusion, we shoot fish in a barrel.

Lacking huge grants of money and endless maintenance funding, those of us who design and conduct tests of unusual claims often have to satisfy ourselves with going after less important targets, leaving the more damaging and glamorous pseudoscientific claptrap to proliferate. A dowser bobbing a stick in a field is a sad sight, but not a serious threat like homeopathic "medicine" or "recovered memory" witch-hunts. Too bad Congress didn't see fit to hand us the $30 million that they gave to the promotion of quackery by unqualified "experts" at the National Institutes of Health, where it was promptly squandered.

GWUP deserves high marks for the care they demonstrated at Kassel. I'm happy that I was able to contribute to the design of the protocol, and I feel that the results speak for themselves. Dowsing will continue, of that we're sure. But at least critics of these silly notions will now have a very definitive piece of research to which they can point when confronted with the usual blather on the subject.

Again, we are grateful for the translation skills of Jutta Degener with assistance from Clive Feather and Mark Brader. Ms. Degener also provided us with the official portrait of Pigasus, mascot of the 2000 Club, and designed the popular JREF Web page.

Here is the second and final part of the Skeptiker article. - J. R.

The Test Commences

Of 21 dowsers who applied in writing, 20 came to Kassel to participate in the tests. Nineteen of them took the test involving water running through pipelines, while the last said the whole area was too "contaminated" for him to do the tests. Fourteen participants took part in the box experiment, but only 13 of them were used in determining the results, because one person broke the previously agreed-upon rules; this was the same person who turned down the water experiment. The results from this person are listed separately; the overall results would not be affected if they had been included.

The 19 participants who took the water test made 30 runs each and scored between 11 and 20 (37% to 67%) - see "Water Test" figure below for a chart of the distribution with a total score of 298 out of 570 (52.3%).

Four errors were made while setting the valves; in each case a valve was turned off when it should have been on. In three cases it was noticed during the trials and corrected immediately, while the fourth case was discovered afterwards and confirmed from the videotapes.

For scoring the results, the actual valve setting was used.

Another incident occurred during a changeover of observers: the new person didn't completely cover the receptacle tank during two runs. That resulted in an increased level of noise from the running water. One observer thought that a slight difference between the sounds of the two settings could have been noticed. The mistake was discovered during a routine check of the trial conditions.

Most dowsers felt the box experiments were more difficult, and so expected not to do as well as they did in the water tests. They scored between 0 and 2 hits each out of 10, 1.08 on average, against an expected value of 1 - see "Box Test" figure below for a chart of the distribution.

One candidate was omitted from the results of the box experiments, as mentioned already. This was because the way the test was carried out diverged from the protocol in two respects: first, this person's runs were done outside, possibly compromising the double-blind setup; and second, they made 20 tests rather than the pre-agreed 10. Even so, the contestant failed to make a single hit. Altogether the 13 participants scored 11% (14 hits out of 130); if the omitted results are included, this shrinks to 9% (14 out of 150).

Apart from the actual results, we also gained other interesting insights during the experiments. The dowsers indicated "interfering anomalies" prior to the start of the water test (see diagram below). Not only did the "anomalies" diverge considerably from each other, but the dowsers also traced the disturbances to widely different causes. These ranged from water veins via buried metals to "global lattice networks."

The Results

The overall result of the water trials (52.3%) is very close to the expected rate of 50%. The distribution of rates is within the range that would be expected under the chance hypothesis, which is therefore confirmed. Now, considering the best results from the water trials, we see that two participants achieved 20 hits and a third person scored 19. Taken alone, this might seem remarkable. But in fact, the chance of two or more people scoring at least 20 is about 24%, while the chance of three or more scoring at least 19 is about 30%, both higher than one might have expected. One should remember that such outlying results are of limited value, even if they look unusual, because a large number of such patterns can be "discovered" in any random sequence, depending on a human observer's sensitivity.
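
The two probabilities quoted above can be checked with a short calculation. The following Python sketch assumes that each of the 19 dowsers made 30 independent runs with a 50% chance of a hit on each run, and that "scoring 20" and "scoring 19" mean reaching at least that many hits. The function names are illustrative only and are not part of the original study.

```python
from math import comb

def tail_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more hits in n runs."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def at_least_m_of(m, people, q):
    """Chance that at least m of `people` independent participants
    each reach a result that has individual probability q."""
    return sum(comb(people, j) * q**j * (1 - q)**(people - j)
               for j in range(m, people + 1))

RUNS, DOWSERS = 30, 19  # figures taken from the water test described above

p20 = tail_at_least(20, RUNS)  # one dowser scores 20 or more by chance
p19 = tail_at_least(19, RUNS)  # one dowser scores 19 or more by chance

two_at_20 = at_least_m_of(2, DOWSERS, p20)
three_at_19 = at_least_m_of(3, DOWSERS, p19)

# One-sided chance probability of the pooled result (298 or more of 570).
overall_tail = tail_at_least(298, 570)

print(f"two or more dowsers with >= 20 hits:   {two_at_20:.1%}")
print(f"three or more dowsers with >= 19 hits: {three_at_19:.1%}")
print(f"298 or more hits in 570 runs:          {overall_tail:.1%}")
```

Under these assumptions the script should print values close to the 24% and 30% given in the text. The pooled result of 298 out of 570 also comes out well above the conventional 5% threshold, which is in line with the chance hypothesis.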

Apart from this, we also compared the hit rates with random YES/NO settings. These random drawings scored between 11 and 21, thus managing to generate a better score than even the best dowsers. Singularities in random results are quite likely and don't signify a deeper meaning. Even a single result of 23 or 24 wouldn't be proof of "earth rays" or other "locational influences."

The results from the box experiments are equally clear. 95% of all trials of this type should be expected to score between 5% and 15%. In this case, the actual result (10.8%) is very close to the expected value. The distribution of the results (0 to 2 hits out of 10) also provides no hint of a hidden effect.
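
The 5% to 15% range can be reproduced approximately. The article does not say how the interval was derived; the sketch below assumes a normal approximation to the binomial distribution, a chance hit probability of 10% per trial (the expected value of 1 hit in 10), and the 130 trials of the 13 counted participants. The function name chance_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def chance_interval(n_trials, p_hit, level=0.95):
    """Approximate range of hit rates expected by chance alone,
    using the normal approximation to the binomial (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sqrt(p_hit * (1 - p_hit) / n_trials)
    return p_hit - half_width, p_hit + half_width

low, high = chance_interval(n_trials=130, p_hit=0.10)
observed = 14 / 130  # 14 hits in 130 trials, as reported above

print(f"95% chance range: {low:.1%} to {high:.1%}")
print(f"observed rate:    {observed:.1%}")
```

With these inputs the range comes out at roughly 5% to 15%, and the observed 10.8% sits near its center.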

The overall results fail to verify the claimed abilities of dowsers. Of course, this is not the same as proving that such abilities don't exist, because it is practically impossible to prove such a thing to the satisfaction of believers in dowsing. Someone can always claim that we tested the wrong dowsers, used the wrong hypotheses, or expected too strong an effect. Take, for example, that last objection. If we wanted to test an effect at the 54% level, we would have to make more than the scheduled 570 experiments for the water trials. Testing a hit rate of 53% would require at least 1000 separate runs. Under these circumstances, who could deny a dowser the claim that they were fatigued?
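
The run counts mentioned above can be approximated as follows. The article does not state which criterion was used; this sketch assumes the simplest one, namely that the hit rate to be tested must lie just outside the 95% range expected from pure chance (two-sided, normal approximation). The function name runs_needed is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def runs_needed(hit_rate, chance=0.5, level=0.95):
    """Smallest number of runs for which `hit_rate` falls outside the
    chance range at the given level (normal approximation, an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    sd_one_run = sqrt(chance * (1 - chance))   # 0.5 for a yes/no guess
    return ceil((z * sd_one_run / (hit_rate - chance)) ** 2)

for rate in (0.54, 0.53):
    print(f"hit rate {rate:.0%}: about {runs_needed(rate)} runs")
```

Under this criterion a 54% hit rate needs somewhat more than the 570 runs actually made, and a 53% rate needs more than 1000, matching the figures in the text. A design that also demanded a high probability of actually detecting such an effect would need considerably more runs.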

An even more important point is that, though people occasionally talk about a weak, only statistically significant effect, there is no clear definition of this effect. But such a definition is needed before designing a test for it. Once defined, even a small effect could be tested for in principle. The belated discovery of significant results for not previously defined hypotheses cannot be used as proof. If you look hard enough, something significant can almost always be found. Such results are at most a starting point for new hypotheses and new tests.

Conclusions and Outlook

The trials do not confirm the pre-defined hypothesis. The tested dowsers could not substantiate their claims in either of the two situations; on the contrary, and as predicted by the GWUP, the results were exactly what would have been expected by chance. A closer examination of the results does not hint at any "small effect" either, but it should be admitted that the experiments weren't designed to detect such a thing (even supposing that such an effect had been well defined before the experiments were made).

There have been a few suggestions for improving future trials. To begin with, more people should be involved with controlling the test conditions in order to be able to react immediately to protocol errors. Such errors are a greater danger than accidental, statistical deviations. For the same reason, this test's requirement for the repetition of a result should definitely be retained. Second, deviations from protocol, as in the case of the test person who was disqualified for the box experiment, should be excluded as a matter of principle. Third, it was pointed out that the shack with the valves wasn't completely isolated from the outside world; an accomplice could have gained information from the reactions of the people in the shack and passed it on in some way. Even though this is considered a very minor risk, it should be excluded in the future.

These three examples show how difficult it is to conduct the perfect experiment. Nevertheless, it must be said that no other German trials for the dowsing/earth ray problem have come close to the high standards to which this one aspired. Given the right conditions, the GWUP will continue to test claims of dowsing and other paranormal or extraordinary claims. However, a strict precondition will be that the hypotheses are precisely defined before the tests, that the tests are strictly controlled, and that they can be conducted as double-blind tests. The more extraordinary the claim is, the stronger the security controls must be.


At this time, we'd like to thank the people and institutions involved with the preparation and realization of the dowsing test, without whose help such activities would have been impossible. We are grateful to the Hessische Rundfunk for their generosity. We should also mention that the Kassel Fire Department School provided invaluable assistance both in technical matters and with personnel. Finally, our special thanks to James Randi, who not only significantly influenced the design of the dowsing test, but also helped make it a worthwhile and very pleasant experience for all of us.

The Story That Never Was

The Hessische Rundfunk TV network, which paid the expenses of setting up the dowsing tests, had covered the proceedings assiduously. Their crews were unobtrusively everywhere, taping every aspect of the tests. Such a commitment of personnel and equipment, on top of the cost of the basic water delivery system and security procedures, is quite expensive. They had planned to prepare a TV special, and GWUP had granted them this right in return for their participation. Crews and executives from the network were as eager as all of us to see the final results, but as it became evident that the dowsers had failed spectacularly, interest faded quickly. Crews packed away their equipment, scheduled post-results interviews were canceled, and the TV special never materialized. It was a case of a "non-story" to Hessische Rundfunk, though if the dowsers had been successful, we expect it would have been a celebration of rare dimensions. - J.R.