# forum 9: week of 12 March: Fisher and the design of experiments

Endorsing what Nicole said: let's get really clear about the idea here, as epistemology as well as take-it-or-leave-it scientific method. So let's have a variety of questions, problems, and puzzles. One issue that I think is central is this:

When we do an experiment the results are interesting if they are surprising. Take 'surprising' as 'improbable'. Improbable given what, given which assumptions about probabilities? (You're testing a coin for bias so you do a long series of tosses, and they're mostly heads. This is not surprising if we assume that it is biased, but is if we assume that it is fair. But we don't want to assume either: that's just what we want to find out.) Different attitudes to this separate different philosophies of testing.
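To make the coin example concrete, here is a small sketch in plain Python using the exact binomial distribution. It shows how the very same run of mostly-heads is improbable under the fair-coin assumption but unremarkable under the bias assumption. The specific numbers (16 heads in 20 tosses, a 0.8 heads-probability for the "biased" coin) are illustrative choices of mine, not from the reading:

```python
from math import comb

def tail_prob(n, k, p):
    """P(at least k heads in n tosses of a coin whose heads-probability is p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Suppose we observe 16 heads in 20 tosses.
fair = tail_prob(20, 16, 0.5)    # roughly 0.006: surprising if the coin is fair
biased = tail_prob(20, 16, 0.8)  # roughly 0.63: unsurprising if it favours heads

print(f"P(>=16 heads | fair coin):   {fair:.4f}")
print(f"P(>=16 heads | biased coin): {biased:.4f}")
```

The same data, two different probabilities of occurring: which assumption we compute against is exactly the choice the post above says we don't want to prejudge.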

In response to Dr. Morton's question: from the Fisher reading, along with some rudimentary knowledge of statistics (most of which traces back to Fisher's work anyway), it follows that the results of an experiment are interesting if they are statistically significant, i.e. if it is improbable that they occurred purely by chance. Given this, we are operating under the assumption that we, as experimenters, have considered every possible outcome before the experiment is conducted, and that data extreme enough to reject the null hypothesis would be highly unlikely to occur by accident. This also depends on the sample size used in the experiment, and Fisher stresses its importance: *"The odds could be made much higher by enlarging the experiment, while if the experiment were much smaller, even the greatest possible success would give odds so low that the result might, with considerable probability, be ascribed to chance."*
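Fisher's point about enlarging the experiment can be sketched numerically: the same 60% heads-rate that is quite unremarkable in 10 tosses becomes strong evidence against fairness in 500. The 60% figure and the sample sizes below are my arbitrary choices for illustration, not Fisher's:

```python
from math import comb

def p_value(n, k):
    """One-sided P(at least k heads in n tosses of a fair coin)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

# The same 60% heads-rate at increasing sample sizes: the p-value shrinks,
# just as Fisher says the odds "could be made much higher by enlarging
# the experiment".
for n in (10, 50, 100, 500):
    k = int(0.6 * n)
    print(f"n={n:4d}, heads={k:3d}, p={p_value(n, k):.6f}")
```

With 10 tosses, 6 heads happens by chance well over a third of the time; with 500 tosses, 300 heads is astronomically unlikely under fairness.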
Nicole, I found this reading to be a lot more accessible than last week's (and I wouldn't be surprised if others felt the same way); some basic knowledge of statistics primed my understanding of Fisher this time, so I haven't encountered any difficulties.

What I got from the discussion of experimental design in this reading, along with some previous knowledge from taking research methods, is that there is always a possibility that your results are due to chance or to other factors not accounted for. You can set a more stringent (lower) significance threshold, make your experiment larger, or even increase reliability with test-retest methods, and though the probability that your results are due to chance will get smaller and smaller, you can never know for certain that they indicate an actual effect. Another factor, besides the possibility of chance results, is that no matter how hard you try to control the noise and other third variables that could be contributing to your results, you still could have failed to take something into account without being aware of it, and your results could have been the effect of that. That is why in science we take all these precautions, yet you can never KNOW for certain that your hypothesis is true; that is why you always say "based on the data we can conclude so and so" and can never say that it IS so and so for certain.

I think the experimental method yet again reveals the interesting aspect of knowledge: we can never fully, 100 percent, know anything. We think we know, but we can come to find out later that we fell into that small chance of error, or that there was another variable we didn't account for, or some other factor that distorted our knowledge.
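The point that results can pass our tests purely by chance can be illustrated by simulation: even when the null hypothesis is true by construction (a perfectly fair coin, so there is no real effect at all), a fraction of experiments will still cross the 5% threshold by luck alone. A rough sketch, with the number of experiments and tosses chosen arbitrarily by me:

```python
import random
from math import comb

random.seed(0)

def p_value(n, k):
    """One-sided P(at least k heads in n tosses of a fair coin)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

# Simulate 10,000 experiments of 100 fair-coin tosses each. No effect exists,
# yet roughly 1 in 20 experiments still looks "significant" at the 5% level.
trials, n = 10_000, 100
false_positives = sum(
    p_value(n, sum(random.random() < 0.5 for _ in range(n))) < 0.05
    for _ in range(trials)
)
print(f"'Significant' results under a true null: {false_positives / trials:.1%}")
```

(The observed rate sits a little below 5% because the binomial distribution is discrete, so the threshold can't be hit exactly; the philosophical point stands either way.)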

I have a lot of respect for experimental design and the amount of knowledge we can gain from it. My biggest uncertainty (I realize this would have been a lot more useful had I posted it before Nicole's presentation) is: what made researchers settle on the standard 5% level of significance? And although this is the most common significance level, what circumstances lead researchers to sometimes use a 1% level instead?

Thanks for your question, Andrea. Both the 5% and 1% significance levels are common. Those who think 5% is *too high* tend to go with 1%, on the grounds that a lower significance level is (supposedly) more desirable. Why 5% counts as too high is incredibly subjective, and more often than not it depends on *factors specific to the discipline* under investigation, which I can't speak to here. In any case, that is the most commonly articulated view on significance levels among researchers. The 5% significance level became known as a *standard* partly because of how frequently it has been used, *and* partly because some journals made it a rule for publication. That is, *if* results don't reach the 5% significance level, then those studies are *generally not accepted* in those journals. I am quite skeptical myself about the merits of treating 5% as *the standard*, and I support a view on significance levels that does *not agree* with the most articulated view I mentioned earlier. Thus, I am not quite the right person to ask what circumstances lead researchers to sometimes use a 1% level. Needless to say, the debate on how to *correctly* interpret significance levels is nowhere near resolution, though one of the goals of my term paper for PHIL 440 is to shed light on this topic. Also, I found one article on the internet that tries to address why 5% is a common significance level: http://www.jerrydallal.com/LHSP/p05.htm. I hope this helps. Let me know if you have any more questions on this topic.