Evaluating The No Significant Difference Phenomenon
Using Experimental Methodology To Reach Valid Conclusions
Experiments have long been a central strategy in the scientific study of causation: how some independent condition or factor, applied as one of various “treatments,” produces consequences in some dependent one. In many ways, the experiment is the basic strategy of most sciences, excluding only those where manipulation is not possible (e.g., astronomy, geology and meteorology). For the study of animals, including human beings, the experimental strategy was developed centuries ago in the biological sciences, where it subsequently proved to be one of the most important procedures by which valid knowledge could be obtained concerning the consequences of a host of medications, treatments and conditions. Later, as applied to other fields, such as psychology, valid conclusions could be reached by comparing the consequent performance of groups of animal or human subjects, with similar characteristics, who had been subjected to different “treatments.”
Implementing an experimental design in which valid comparisons can be made about the effects or influences of different “treatments” can be a difficult task, whether in medical research, psychology or any other field. If the subjects under observation in two treatment groups to be compared clearly have different characteristics, then any results that are obtained can be attributed to those distinct characteristics, and not necessarily to the different treatments. That makes the outcome of the experiment essentially invalid and meaningless. That principle holds true whether it is a medical experiment with rats or monkeys that are being used to test the effects of a new drug, or a study of human beings in two learning conditions conducted in a psychological experiment. Indeed, if an experimenter designed a study in which one set of subjects were rats bred in captivity, and the other consisted of similar animals that had been caught in the sewers, no serious scientist would have much faith in any comparisons that were made regarding their performance on the two different treatments.
Similarly, in psychological experiments comparing two groups of human subjects who had experienced distinct conditions, if one group consisted of people of very different characteristics and circumstances (who selected themselves to be in a comparison group), such a comparison would have little meaning. The only way valid knowledge can be obtained from such comparisons is when the members of the two groups being compared have (on average at least) quite similar characteristics. Thus, one could not use one group, consisting of working mothers, and another group consisting of young college students, and then compare their performance when subjected to two different “treatments” or experiences. It would not be possible to tell whether the results were due to the experimental treatment or due to the fact that they were quite different types of human beings.
Geoffrey Keppel, a noted authority on the design and conduct of psychological experiments, puts it this way:
In the design of any experiment, we attempt to hold constant as many factors as we can that may influence the behavior we are studying. We test all animals in the same apparatus, perhaps with the same experimenter, and often under a high degree of control of the experimental environment (temperature, illumination, background noise, and so on). Ideally, we would like to hold physically constant all important variables except the ones under systematic study. Factors which we are unable to control in this fashion, or which are not sufficiently important to control, we allow to vary randomly across the treatment conditions. Thus, randomization of these so-called “nuisance” factors is the major way in which we obtain internal validity, i.e., the elimination of biases which, if present, might invalidate any conclusions drawn concerning the manipulations in the experiment. (Keppel, 1973, p. 314)
Essentially, then, an experiment conducted within the strict rules of science would begin with a pool of relatively similar subjects (animal or human) who are to be subjected to two experimental “conditions.” That pool of subjects need not be composed of individuals who are in every respect totally and exactly alike. Such a situation would be difficult to find in the real world. But assuming that they are basically similar, with some individual variations, they can still serve as subjects in an internally valid experiment.
The key to achieving “internal validity,” as it is called (with findings not negated by conditions of the experiment), is to assign the subjects in the pool randomly to each of the treatment conditions. In that way, any possible factors that could influence the outcome have been distributed, on a chance basis, across the treatment groups. Under such a condition, it is highly unlikely that there would be a concentration of factors in one of the treatment groups that would have a major influence on the outcome.
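Random assignment of this kind can be sketched in a few lines of Python. This is only an illustration, not a procedure taken from Keppel: the pool of 1,000 subjects, the "ability" score and the function name `randomly_assign` are all hypothetical. The sketch shuffles the pool, splits it in half, and then compares the average of a background characteristic across the two groups.

```python
import random
from statistics import mean

def randomly_assign(subjects, seed=None):
    """Shuffle the pool and split it into two treatment groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)      # copy, so the original pool is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical pool: every subject carries a background "ability" score
# (an assumed nuisance factor that the experimenter does not control).
pool_rng = random.Random(0)
pool = [{"id": i, "ability": pool_rng.gauss(100, 15)} for i in range(1000)]

group_a, group_b = randomly_assign(pool, seed=1)

# With chance-based assignment, the two group averages should be close.
mean_a = mean(s["ability"] for s in group_a)
mean_b = mean(s["ability"] for s in group_b)
print(f"group A mean ability: {mean_a:.1f}")
print(f"group B mean ability: {mean_b:.1f}")
```

The same balancing applies to characteristics the experimenter never measured, which is why randomization protects against factors that nobody thought to control.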
Other kinds of comparisons of treatments on different subject groups risk violating these important principles, and consequently risk the validity of their findings. As an example, in many cases, investigators locate two groups (such as two distinct groups of students), each of which has had some different experience in the past. They then study them closely to see if those different experiences appear to have influenced them in different ways. Often the term “natural experiment” is used to describe such comparisons. Unfortunately, an underlying assumption in such comparisons is that only the distinct treatments can account for what has been found, and not variations in the personal characteristics of the compared groups. Given the requirement of random assignment noted above, however, whatever differences or similarities are found under such circumstances cannot validly be attributed to the treatment alone. It is just as plausible that any differences that have been observed have come about because of uncontrolled factors in the groups being compared.
One of the common problems in so-called natural experiments arises when the participants in the groups being compared are “self-selected” (as opposed to being randomly assigned). There are a host of factors that would make one experimental procedure more or less interesting, or more or less desirable, than another to individual participants. To allow the subjects of such a study to use their own preferences or personal circumstances to determine which treatment they will receive would present a truly serious challenge to the internal validity of any comparison and would compromise any results that might be obtained.
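A small simulation can make the self-selection problem concrete. Everything in it is an assumption chosen for illustration: the "motivation" trait, the 80/20 choice probabilities, and the outcome formula are hypothetical. By construction, the treatment has no effect at all, so any gap between the groups comes only from who ended up in them.

```python
import random
from statistics import mean

def outcome_gap(assignment, n=4000, seed=0):
    """Return mean outcome of treatment B minus treatment A.

    The treatment has NO effect in this model: the outcome depends only on
    a subject's motivation plus random noise.
    """
    rng = random.Random(seed)
    outcomes = {"A": [], "B": []}
    for _ in range(n):
        motivation = rng.gauss(0, 1)
        if assignment == "self":
            # Assumption: motivated subjects mostly choose B, others mostly A.
            p_b = 0.8 if motivation > 0 else 0.2
        else:
            # Random assignment: a fair coin flip, unrelated to motivation.
            p_b = 0.5
        group = "B" if rng.random() < p_b else "A"
        outcomes[group].append(70 + 10 * motivation + rng.gauss(0, 5))
    return mean(outcomes["B"]) - mean(outcomes["A"])

self_selected_gap = outcome_gap("self")
randomized_gap = outcome_gap("random")
print(f"gap with self-selection:    {self_selected_gap:.2f}")
print(f"gap with random assignment: {randomized_gap:.2f}")
```

Under these assumptions the self-selected comparison shows a sizeable gap that could easily be mistaken for a treatment effect, while the randomized comparison stays close to zero.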
Today, in the study of the outcomes of Distance Learning (teaching and learning by professors and students who make use of the Internet), there has been a huge accumulation of “experiments” comparing the outcomes of studying via the traditional classroom setting versus via the Internet, with lessons on screens and exchanged via email. Unfortunately, however, virtually all of the “No Significant Difference” studies can be classified as “natural experiments” in the sense discussed above. None of these studies has been based on the classic principles of experiments, in the sense described by Keppel. That is, in an extensive review of the literature devoted to comparing students in distance learning courses with those in traditional settings studying similar subject matter, no such study could be found that started with a pool of similar students and then randomly assigned them to the different treatments (e.g., traditional lectures, etc., vs. lessons transmitted over computer systems) and also held constant all other factors. Essentially, then, such studies are often comparisons of groups of students who have self-selected the “treatment” that they are in. In that sense, comparing such unlike groups involves the same problems of internal validity as would a medical experiment using captured sewer rats in one group and domestically raised animals in the other.
There are also other seriously limiting factors that are present in such comparisons. One is the differential drop-out rate. In the typical classroom setting in a traditional institution, students sign up for a course and they must continue to meet its demands, attending lectures, taking tests and so forth throughout a semester. A few may drop the course, usually early in the term. If they drop later, they receive a “W” on their grade transcript (which does not speak well of their capacities as students). In such a setting, then, the majority of students in the traditional setting (both bright and less so) persist throughout the course. Indeed, some may receive D’s or even F’s for their work. In that way, the class is composed not only of bright and motivated students, but also of those whose abilities are not as high and whose determination to succeed may be less intense.
Among those who sign up to take an online course, it is clear that far more drop out before the end of the term, with little in the way of penalties other than loss of the fee for the course. Those who drop out are very likely to be the less able and less determined students, who have probably been having trouble keeping up, leaving mainly those who have the personal qualities of ability and determination that are needed to finish.
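The effect of differential drop-out can also be illustrated with a short simulation. The numbers are hypothetical: both settings draw students from the same assumed ability distribution, the teaching method is given no effect whatsoever, and drop-out is modeled crudely as every online student below an arbitrary ability threshold leaving the course. Real attrition is far less tidy, so this is a sketch of the mechanism rather than an estimate of its size.

```python
import random
from statistics import mean

def final_scores(n_enrolled, dropout_threshold, seed):
    """Simulate final scores for the students who complete a course.

    Students whose ability falls below `dropout_threshold` drop out and
    leave no final score. Pass None to model a course with no drop-outs.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_enrolled):
        ability = rng.gauss(70, 12)          # same distribution in both settings
        if dropout_threshold is not None and ability < dropout_threshold:
            continue                         # this student never finishes
        scores.append(ability + rng.gauss(0, 5))   # no teaching-method effect
    return scores

traditional = final_scores(2000, dropout_threshold=None, seed=1)
online = final_scores(2000, dropout_threshold=65, seed=2)

print(f"traditional completers: {len(traditional)}, mean score {mean(traditional):.1f}")
print(f"online completers:      {len(online)}, mean score {mean(online):.1f}")
```

Although the two courses are identical by construction, the online completers score higher on average simply because the weaker students are no longer in the sample.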
An individual's learning style may be another factor that affects whether a student succeeds in an online class. There are many examples of personal qualities that allow some to excel in one situation more readily than another. For example, such differences are apparent in (but not limited to) how individuals perform in sporting events. As frustrating as it may be in competition, some have a "natural ability," perhaps gifted with an uncommon physical stamina or a proclivity to learn more in a shorter period of time. The same may be said of any number of disciplines or types of intelligence (Gardner, 1993). Some students, then, may simply prefer online classes because the instruction plays to their abilities.
Under these conditions, if an investigator sets out to conduct a “natural experiment” to see if there are “any significant differences” (in factors like “personal satisfaction,” “quality of learning,” etc.), the result is a comparison of “apples and oranges” (sewer rats vs. domestically raised animals). That is, the comparison is made between the bright and motivated students in the online course who stayed on and finished, and the mixed body of high and low performing students who stayed in the traditional course. One need not be a “rocket scientist” to predict that such a comparison will show that the online students performed well. When they are compared with those in the traditional setting, there does not seem to be a “significant difference” attributable to the teaching method used. Unfortunately, however, virtually all the requirements for conducting an internally valid experiment have been violated in such comparisons, and the outcome cannot be accepted as meaningful.
REFERENCES
- Gardner, Howard (1993). Multiple Intelligences: The Theory in Practice. Basic Books: New York.
- Keppel, Geoffrey (1973). Design and Analysis: A Researcher’s Handbook. Prentice Hall: Englewood Cliffs, N.J.
Designed and developed by Curiositî Design Solutions — Owned and operated by Pilot Media © 2008-2012. All Rights Reserved.
