Complete Entire Week 4
1. Which one of the following types of variables is most difficult to evaluate objectively in a true experiment? Explain why you think that (See instructions below).
a) Dependent variable
b) Independent variable
c) Confounding variable
d) Extraneous variable
e) None of the above
Instructions: Make a selection, provide a concept definition (text), and support your opinion on the selection with an example from research that illustrates the concept. Do so in a maximum of 250 words. Use credible and peer-reviewed sources. Credible sources include course materials, University Library research that is peer reviewed, and Internet sites ending in .edu or .gov, with the one exception of research pulled from the www.apa.org site. If research is pulled from the APA site, use the www.apa.org
1. The variable that is most difficult to evaluate objectively in a true experiment is the extraneous variable. According to Cozby & Bates (2015), “It would be impossible to know whether participants that were participating in an aerobics class or those watching aerobics on video, would have a better mood due to what they were doing” (p. 162). With extraneous variables, many other factors come into play, such as whether either room has more doors, air conditioning, heating, or windows. Those things can change the response of each group, making the data collected unreliable. In an article I found regarding women who are pregnant and using cocaine, the study that was done took place over quite a few years. According to Richardson & Day (1999), “One of the issues that were identified was the failure to control adequately for extraneous variables” (p. 234). The researchers realized that some of the studies were inadequate and that most of the time information was not interpreted correctly to the client or their providers. The lack of communication caused further issues and endangered some of those pregnancies. Because the study on prenatal cocaine exposure was performed over such a lengthy period of time, it is hard to make sure that nothing extraneous had an effect on the study. Without an attempt to eliminate those extraneous variables, the study becomes compromised and the data do not appear to be as relevant as those of other studies.
Cozby, P. C., & Bates, S. C. (2015). Methods in Behavioral Research (12th ed.). New York, NY: McGraw-Hill.
Richardson, G. A., & Day, N. L. (1999). Studies of prenatal cocaine exposure: Assessing the influence of extraneous variables. Journal of Drug Issues, 29(2), 225-236. Retrieved from https://search-proquest-com.contentproxy.phoenix.edu/docview/208833439?accountid=458
2. Independent variables are tested to see if they have an effect on the dependent variable. Because any effect on the dependent variable should come from the independent variable alone, extraneous variables (those not intentionally studied) are undesirable, and they are sometimes difficult for the researcher to control (Cozby & Bates, 2015). Although an extraneous variable is not a variable of interest, it may still influence the outcome of a research study or experiment. According to Losen & Oyinalde (2014), the extraneous variable has its positives, as it can be used to provide alternative explanations for an experiment’s effects, but it must be controlled for and must not take the place of the independent variable, which has to determine the actual effects.
References:
Cozby, P. C., & Bates, S. C. (2015). Methods in Behavioral Research (12th ed.). New York, NY: McGraw-Hill.
Losen, A., & Oyinalde, A. O. (2014). Extraneous Effects of Race, Gender, and Race-Gender Homo- and Heterophily Conditions on Data Quality, 4(1). Directory of Open Access Journals. DOI: 10.1177/2158244014525418
3. The variable that I think is most difficult to evaluate is the confounding variable. Our reading from Chapter 4 discusses the confounding variable in depth, describing it as a third variable that is hard to get a read on. According to Cozby & Bates (2015), a confounding variable is an uncontrolled third variable that operates alongside the variables being studied. When a third variable is operating, it can cause a huge problem because it introduces an alternative explanation, which reduces the overall validity of the study (Cozby & Bates, 2015). If two variables are confounded, they are so intertwined that you cannot determine which of the variables is operating in a situation (Cozby & Bates, 2015). The example they give is that exercise may appear to reduce anxiety, but income could be a third variable that accounts for the relationship (Cozby & Bates, 2015). The third variable can be any variable extraneous to the two variables being studied, and any number of third variables may be responsible for a relationship between two variables (Cozby & Bates, 2015).
Cozby, P. C., & Bates, S. C. (2015). Methods in Behavioral Research (12th ed.). New York, NY: McGraw-Hill.
4. Confounding variables can be difficult for the researcher to control (Cozby & Bates, 2015). In fact, researchers often fail to control them, because human judgment is necessary to eliminate the underlying problems. A confounding variable also makes it difficult to find a link between treatments and outcomes. According to Brodt, Dettori, and Skelly (2012), confounding happens when effects are mixed: confounding factors may make a treatment appear to be associated with an outcome when in reality there is no association. In medical research on exposures and treatment groups, researchers should consider whether an observed effect is truly due to the exposure or has an alternative explanation; appropriate methods must therefore be used to adjust for confounding, and these require human judgment.
Brodt, E., Dettori, J. R., & Skelly, A. C. (2012). Assessing bias: The importance of considering confounding. NCBI. Retrieved from https://www.ncbi.nlm.nih.gov
Cozby, P. C., & Bates, S. C. (2015). Methods in Behavioral Research (12th ed.). New York, NY: McGraw-Hill.
Experimental Design Chapter 8
· Define confounding variable, and describe how confounding variables are related to internal validity.
· Describe the posttest-only design and the pretest-posttest design, including the advantages and disadvantages of each design.
· Contrast an independent groups (between-subjects) design with a repeated measures (within-subjects) design.
· Summarize the advantages and disadvantages of using a repeated measures design.
· Describe a matched pairs design, including reasons to use this design.
IN THE EXPERIMENTAL METHOD, THE RESEARCHER ATTEMPTS TO CONTROL ALL EXTRANEOUS VARIABLES. Suppose you want to test the hypothesis that exercise affects mood. To do this, you might put one group of people through a 1-hour aerobics workout and put another group in a room where they are asked to watch a video of people exercising for an hour. All participants would then complete the same mood assessment. Now suppose that the people in the aerobics class rate themselves as happier than those in the video viewing condition. Can the difference in mood be attributed to the difference in the exercise? Yes, if there is no other difference between the groups. However, what if the aerobics group was given the mood assessment in a room with windows but the video-only group was tested in a room without windows? In that case, it would be impossible to know whether the better mood of the participants in the aerobics group was due to the exercise or to the presence of windows.
CONFOUNDING AND INTERNAL VALIDITY
Recall from Chapter 4 that the experimental method has the advantage of allowing a relatively unambiguous interpretation of results. The researcher manipulates the independent variable to create groups and then compares the groups on the dependent variable. All other variables are kept constant, either through direct experimental control or through randomization. If the groups are different, the researcher can conclude that the independent variable caused the results because the only difference between the groups is the manipulated variable.
Although the task of designing an experiment is logically elegant and exquisitely simple, you should be aware of possible pitfalls. In the hypothetical exercise experiment just described, the variables of exercise and window presence are confounded. The window variable was not kept constant. A confounding variable is a variable that varies along with the independent variable; confounding occurs when the effects of the independent variable and an uncontrolled variable are intertwined so you cannot determine which of the variables is responsible for the observed effect. If the window variable had been held constant, both the exercise and the video condition would have taken place in identical rooms. That way, the effect of windows would not be a factor to consider when interpreting the difference between the groups.
In short, both rooms in the exercise experiment should have had windows or both should have been windowless. Because one room had windows and one room did not, any difference in the dependent variable (mood) cannot be attributed solely to the independent variable (exercise). An alternative explanation can be offered: The difference in mood may have been caused, at least in part, by the window variable.
Good experimental design requires eliminating possible confounding variables that could result in alternative explanations. A researcher can claim that the independent variable caused the results only by eliminating competing, alternative explanations. When the results of an experiment can confidently be attributed to the effect of the independent variable, the experiment is said to have internal validity (remember that internal validity refers to the ability to draw conclusions about causal relationships from our data; see Chapter 4). To achieve good internal validity, the researcher must design and conduct the experiment so that only the independent variable can be the cause of the results (Campbell & Stanley, 1966).
This chapter will focus on true experimental designs, which provide the highest degree of internal validity. In Chapter 11, we will turn to an examination of quasi-experimental designs, which lack the crucial element of random assignment while attempting to infer that an independent variable had an effect on a dependent variable. Internal validity is discussed further in Chapter 11 and external validity, the extent to which findings may be generalized, is the focus of Chapter 14.
The simplest possible experimental design has two variables: the independent variable and the dependent variable. The independent variable has a minimum of two levels, an experimental group and a control group. Researchers must make every effort to ensure that the only difference between the two groups is the manipulated (independent) variable.
Remember, the experimental method involves control over extraneous variables, through either keeping such variables constant (experimental control) or using randomization to make sure that any extraneous variables will affect both groups equally. The basic, simple experimental design can take one of two forms: a posttest-only design or a pretest-posttest design.
A researcher using a posttest-only design must (1) obtain two equivalent groups of participants, (2) introduce the independent variable, and (3) measure the effect of the independent variable on the dependent variable. The design looks like this:
R    Experimental group    Treatment       Measure of dependent variable
R    Control group         No treatment    Measure of dependent variable
Thus, the first step is to choose the participants and assign them to the two groups. The procedures used must achieve equivalent groups to eliminate any potential selection differences: The people selected to be in the conditions cannot differ in any systematic way. For example, you cannot select high-income individuals to participate in one condition and low-income individuals for the other. The groups can be made equivalent by randomly assigning participants to the two conditions or by having the same participants participate in both conditions. Recall from Chapter 4 that random assignment is done in such a way that each participant is assigned to a condition randomly without regard to any personal characteristic of the individual. The R in the diagram means that participants were randomly assigned to the two groups.
Next, the researcher must choose two levels of the independent variable, such as an experimental group that receives a treatment and a control group that does not. Thus, a researcher might study the effect of reward on motivation by offering a reward to one group of children before they play a game and offering no reward to children in the control group. A study testing the effect of a treatment method for reducing smoking could compare a group that receives the treatment with a control group that does not. Another approach would be to use two different amounts of the independent variable—that is, to use more reward in one group than the other or to compare the effects of different amounts of relaxation training designed to help people quit smoking (e.g., 1 hour of training compared with 10 hours). Another approach would be to include two qualitatively different conditions; for example, one group of test-anxious students might write about their anxiety and the other group could participate in a meditation exercise prior to a test. All of these approaches would provide a basis for comparison of the two groups. (Of course, experiments may include more than two groups; for example, we might compare two different smoking cessation treatments along with a no-treatment control group—these types of experimental designs will be described in Chapter 10).
Finally, the effect of the independent variable is measured. The same measurement procedure is used for both groups, so that comparison of the two groups is possible. Because the groups were equivalent prior to the introduction of the independent variable and there were no confounding variables, any difference between the groups on the dependent variable must be attributed to the effect of the independent variable. This elegant experimental design has a high degree of internal validity. That is, we can confidently conclude that the independent variable caused the dependent variable. In actuality, a statistical significance test would be used to assess the difference between the groups. However, we do not need to be concerned with statistics at this point. An experiment must be well designed, and confounding variables must be eliminated before we can draw conclusions from statistical analyses.
The only difference between the posttest-only design and the pretest-posttest design is that in the latter a pretest is given before the experimental manipulation is introduced:
R    Experimental group    Pretest    Treatment       Posttest
R    Control group         Pretest    No treatment    Posttest
This design makes it possible to ascertain that the groups were, in fact, equivalent at the beginning of the experiment. However, this precaution is usually not necessary if participants have been randomly assigned to the two groups. With a sufficiently large sample of participants, random assignment will produce groups that are virtually identical in all respects.
You are probably wondering how many participants are needed in each group to make sure that random assignment has made the groups equivalent. The larger the sample, the less likelihood there is that the groups will differ in any systematic way prior to the manipulation of the independent variable. In addition, as sample size increases, so does the likelihood that any difference between the groups on the dependent variable is due to the effect of the independent variable. There are formal procedures for determining the sample size needed to detect a statistically significant effect, but as a rule of thumb you will probably need a minimum of 20 to 30 participants per condition. In some areas of research, many more participants may be necessary. Further issues in determining the number of participants needed for an experiment are described in Chapter 13.
Comparing Posttest-Only and Pretest-Posttest Designs
Each of these two experimental designs has advantages and disadvantages that influence the decision whether to include or omit a pretest. The first decision factor concerns the equivalence of the groups in the experiment. Although randomization is likely to produce equivalent groups, it is possible that, with small sample sizes, the groups will not be equal. Thus, a pretest enables the researcher to assess whether the groups are in fact equivalent to begin with.
Sometimes, a pretest is necessary to select the participants in the experiment. A researcher might need to give a pretest to find the lowest or highest scorers on a smoking measure, a math anxiety test, or a prejudice measure. Once identified, the participants would be randomly assigned to the experimental and control groups.
The pretest-posttest design immediately makes us focus on the change from pretest to posttest. This emphasis on change is incorporated into the analysis of the group differences. Also, the extent of change in each individual can be examined. If a smoking reduction program appears to be effective for some individuals but not others, attempts can be made to find out why.
A pretest is also necessary whenever there is a possibility that participants will drop out of the experiment; this is most likely to occur in a study that lasts over a long time period. The dropout factor in experiments is called attrition or mortality. People may drop out for reasons unrelated to the experimental manipulation, such as illness; sometimes, however, attrition is related to the experimental manipulation. Even if the groups are equivalent to begin with, different attrition rates can make them nonequivalent. How might mortality affect a treatment program designed to reduce smoking? One possibility is that the heaviest smokers in the experimental group might leave the program. Therefore, when the posttest is given, only the light smokers would remain, so that a comparison of the experimental and control groups would show less smoking in the experimental group even if the program had no effect. In this way, attrition (mortality) becomes an alternative explanation for the results. Use of a pretest enables you to assess the effects of attrition; you can look at the pretest scores of the dropouts and know whether their scores differed from the scores of the individuals completing the study. Thus, with the pretest, it is possible to examine whether attrition is a plausible alternative explanation—an advantage in the experimental design.
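The smoking example can be made concrete with a small sketch. All numbers below are hypothetical (cigarettes per day, invented for illustration), and the dropout rule of 30 or more per day is an assumption. The program is assumed to have no effect at all, so each posttest score simply equals the pretest score.

```python
from statistics import mean

# Hypothetical pretest scores (cigarettes per day); both groups start out identical.
pretest_control = [5, 10, 15, 20, 30, 40]
pretest_experimental = [5, 10, 15, 20, 30, 40]

# Assumption: the heaviest smokers (30 or more per day) leave the experimental group.
HEAVY_SMOKER_CUTOFF = 30
completers = [score for score in pretest_experimental if score < HEAVY_SMOKER_CUTOFF]
dropouts = [score for score in pretest_experimental if score >= HEAVY_SMOKER_CUTOFF]

# Assumption: the program has no effect, so posttest scores equal pretest scores.
posttest_control = list(pretest_control)
posttest_experimental = list(completers)

# The posttest comparison alone suggests the program worked.
print("Posttest mean, control:     ", mean(posttest_control))
print("Posttest mean, experimental:", mean(posttest_experimental))

# The pretest scores reveal that the dropouts were the heaviest smokers.
print("Pretest mean, dropouts:     ", mean(dropouts))
print("Pretest mean, completers:   ", mean(completers))
```

In this sketch the experimental group looks better at posttest only because of who remained, which is exactly the alternative explanation that the pretest scores of the dropouts make visible.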
One disadvantage of a pretest, however, is that it may be time-consuming and awkward to administer in the context of the particular experimental procedures being used. Perhaps most important, a pretest can sensitize participants to what you are studying, enabling them to figure out what is being studied and (potentially) why. They may then react differently to the manipulation than they would have without the pretest. When a pretest affects the way participants react to the manipulation, it is very difficult to generalize the results to people who have not received a pretest. That is, the independent variable may not have an effect in the real world, where pretests are rarely given. We will examine this issue more fully in Chapter 14.
If awareness of the pretest is a problem, the pretest can be disguised. One way to do this is by administering it in a completely different situation with a different experimenter. Another approach is to embed the pretest in a set of irrelevant measures so it is not obvious that the researcher is interested in a particular topic.
It is also possible to assess the impact of the pretest directly with a combination of both the posttest-only and the pretest-posttest design. In this design, half the participants receive only the posttest, and the other half receive both the pretest and the posttest (see Figure 8.1). This is formally called a Solomon four-group design. If there is no impact of the pretest, the posttest scores will be the same in the two control groups (with and without the pretest) and in the two experimental groups. Garvin and Damson (2008) employed a Solomon four-group design to study the effect of viewing female fitness magazine models on a measure of depressed mood. Female college students spent 30 minutes viewing either the fitness magazines or magazines such as National Geographic. Two possible outcomes of this study are shown in Figure 8.2. The top graph illustrates an outcome in which the pretest has no impact: The fitness magazine viewing results in higher depression in both the posttest-only and the pretest-posttest condition. This is what was found in the study. The lower graph shows an outcome in which there is a difference between the treatment and control groups when there is a pretest, but there is no group difference when the pretest is absent.
Figure 8.1: Solomon four-group design
Figure 8.2: Examples of outcomes of Solomon four-group design
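The logic of checking for a pretest effect in a Solomon four-group design can be sketched as follows. The scores are hypothetical depressed-mood ratings invented for illustration; they are not data from Garvin and Damson (2008), and the group labels and the function name are assumptions of this sketch.

```python
from statistics import mean

# Hypothetical posttest depressed-mood scores, invented for illustration only.
posttest_scores = {
    ("fitness magazines", "pretest"): [14, 16, 15],
    ("fitness magazines", "no pretest"): [15, 14, 16],
    ("control magazines", "pretest"): [10, 11, 9],
    ("control magazines", "no pretest"): [9, 10, 11],
}

def treatment_effect(scores, pretest_condition):
    """Treatment mean minus control mean for one pretest condition."""
    treated = mean(scores[("fitness magazines", pretest_condition)])
    control = mean(scores[("control magazines", pretest_condition)])
    return treated - control

effect_with_pretest = treatment_effect(posttest_scores, "pretest")
effect_without_pretest = treatment_effect(posttest_scores, "no pretest")

# If the two effects are about the same, the pretest had little or no impact.
print("Effect with pretest:   ", effect_with_pretest)
print("Effect without pretest:", effect_without_pretest)
```

With these invented scores the two effects are identical, which corresponds to the outcome in which the pretest has no impact. A real analysis would use a statistical test rather than a direct comparison of means.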
ASSIGNING PARTICIPANTS TO EXPERIMENTAL CONDITIONS
Recall that there are two basic ways of assigning participants to experimental conditions. In one procedure, participants are randomly assigned to the various conditions so that each participates in only one group. This is called an independent groups design. It is also known as a between-subjects design because comparisons are made between different groups of participants. In the other procedure, participants are in all conditions. In an experiment with two conditions, for example, each participant is assigned to both levels of the independent variable. This is called a repeated measures design, because each participant is measured after receiving each level of the independent variable. You will also see this called a within-subjects design; in this design, comparisons are made within the same group of participants (subjects). In the next two sections, we will examine each of these designs in detail.
INDEPENDENT GROUPS DESIGN
In an independent groups design, different participants are assigned to each of the conditions using random assignment. This means that the decision to assign an individual to a particular condition is completely random and beyond the control of the researcher. For example, you could ask for the participant’s month of birth; individuals born in odd-numbered months would be assigned to one group and those born in even-numbered months would be assigned to the other group. In practice, researchers use a sequence of random numbers to determine assignment. Such numbers come from a random number generator such as Research Randomizer, available online at http://www.randomizer.org or QuickCalcs at http://www.graphpad.com/quickcalcs/randomize1.cfm; Excel can also generate random numbers. These programs allow you to randomly determine the assignment of each participant to the various groups in your study. Random assignment will prevent any systematic biases, and the groups can be considered equivalent in terms of participant characteristics such as income, intelligence, age, personality, and political attitudes. In this way, participant differences cannot be an explanation for results of the experiment. As we noted in Chapter 4, in an experiment on the effects of exercise on anxiety, lower levels of anxiety in the exercise group than in the no-exercise group cannot be explained by saying that people in the groups are somehow different on characteristics such as income, education, or personality.
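Random assignment with a random number generator can be sketched in a few lines. This is a minimal illustration, not the procedure used by the tools named above: the function name, the participant labels, and the seed are all hypothetical, and the seed is fixed only so that the example is repeatable.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to the conditions in turn.

    Dealing in turn keeps group sizes within one participant of each other.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        groups[condition].append(person)
    return groups

# Hypothetical participant IDs P01 through P40.
participants = [f"P{number:02d}" for number in range(1, 41)]
groups = randomly_assign(participants, ["exercise", "no exercise"], seed=2015)

for condition, members in groups.items():
    print(condition, len(members), members[:5])
```

Because assignment depends only on the shuffle, no personal characteristic of a participant can influence which condition that person ends up in.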
An alternative procedure is to have the same individuals participate in all of the groups. This is called a repeated measures experimental design.
REPEATED MEASURES DESIGN
Consider an experiment investigating the relationship between the meaningfulness of material and the learning of that material. In an independent groups design, one group of participants is given highly meaningful material to learn and another group receives less meaningful material. For example, the meaningful material might include a story relating the material to a real-life event. In a repeated measures design, the same individuals participate in both conditions. Thus, participants might first read low-meaningful material and take a recall test to measure learning; the same participants would then read high-meaningful material and take the recall test. You can see why this is called a repeated measures design; participants are repeatedly measured on the dependent variable after being in each condition of the experiment.
Advantages and Disadvantages of Repeated Measures Design
The repeated measures design has several advantages. An obvious one is that fewer research participants are needed, because each individual participates in all conditions. When participants are scarce or when it is costly to run each individual in the experiment, a repeated measures design may be preferred. In much research on perception, for instance, extensive training of participants is necessary before the actual experiment can begin. Such research often involves only a few individuals who participate in all conditions of the experiment.
An additional advantage of repeated measures designs is that they are extremely sensitive to finding statistically significant differences between groups. This is because we have data from the same people in both conditions. To illustrate why this is important, consider possible data from the recall experiment. Using an independent groups design, the first three participants in the high-meaningful condition had scores of 68, 81, and 92. The first three participants in the low-meaningful condition had scores of 64, 78, and 85. If you calculated an average score for each condition, you would find that the average recall was a bit higher when the material was more meaningful. However, there is a lot of variability in the scores in both groups. You certainly are not finding that everyone in the high-meaningful condition has high recall and everyone in the other condition has low recall. The reason for this variability is that people differ—there are individual differences in recall abilities, so there is a range of scores in both conditions. This is part of “random error” in the scores that we cannot explain.
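However, if the same scores were obtained from the first three participants in a repeated measures design, the conclusions would be much different. Let’s line up the recall scores for the two conditions:
Participant      High meaningful    Low meaningful    Difference
Participant 1    68                 64                +4
Participant 2    81                 78                +3
Participant 3    92                 85                +7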
With a repeated measures design, the individual differences can be seen and explained. It is true that some people score higher than others because of individual differences in recall abilities, but now you can much more clearly see the effect of the independent variable on recall scores. It is much easier to separate the systematic individual differences from the effect of the independent variable: Scores are higher for every participant in the high-meaningful condition. As a result, we are much more likely to detect an effect of the independent variable on the dependent variable.
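The difference in sensitivity can be illustrated with the six recall scores from the text. The sketch below computes a simple t-type ratio (mean difference divided by its standard error) two ways: treating the scores as two separate groups, and treating them as pairs from the same three people. The choice of this statistic is an assumption made for illustration; the chapter itself defers statistical tests to later chapters.

```python
from math import sqrt
from statistics import mean, variance

# Recall scores from the text.
high_meaningful = [68, 81, 92]
low_meaningful = [64, 78, 85]
n = len(high_meaningful)

# Independent groups: individual differences stay in the error term.
mean_difference = mean(high_meaningful) - mean(low_meaningful)
independent_error = sqrt(variance(high_meaningful) / n + variance(low_meaningful) / n)
independent_t = mean_difference / independent_error

# Repeated measures: each person serves as his or her own comparison,
# so only the variability of the difference scores counts as error.
differences = [high - low for high, low in zip(high_meaningful, low_meaningful)]
paired_error = sqrt(variance(differences) / n)
paired_t = mean(differences) / paired_error

print("Difference scores:", differences)
print("Independent groups ratio:", round(independent_t, 2))
print("Repeated measures ratio: ", round(paired_t, 2))
```

The mean difference is the same in both analyses; only the error term changes. With so few scores neither ratio should be read as a real test result, but the contrast shows why removing individual differences makes an effect easier to detect.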
The major problem with a repeated measures design stems from the fact that the different conditions must be presented in a particular sequence. Suppose that there is greater recall in the high-meaningful condition. Although this result could be caused by the manipulation of the meaningfulness variable, the result could also simply be an order effect—the order of presenting the treatments affects the dependent variable. Thus, greater recall in the high-meaningful condition could be attributed to the fact that the high-meaningful task came second in the order of presentation of the conditions. Performance on the second task might improve merely because of the practice gained on the first task. This improvement is in fact called a practice effect, or learning effect. It is also possible that a fatigue effect could result in a deterioration in performance from the first to the second condition as the research participant becomes tired, bored, or distracted.
It is also possible for the effect of the first treatment to carry over to influence the response to the second treatment—this is known as a carryover effect. Suppose the independent variable is severity of a crime. After reading about the less severe crime, the more severe one might seem much worse to participants than it normally would. In addition, reading about the severe crime might subsequently cause participants to view the less severe crime as much milder than they normally would. In both cases, the experience with one condition carried over to affect the response to the second condition. In this example, the carryover effect was a psychological effect of the way that the two situations contrasted with one another.
A carryover effect may also occur when the first condition produces a change that is still influencing the person when the second condition is introduced. Suppose the first condition involves experiencing failure at an important task. This may result in a temporary increase in stress responses. How long does it take before the person returns to a normal state? If the second condition is introduced too soon, the stress may still be affecting the participant.
There are two approaches to dealing with order effects. The first is to employ counterbalancing techniques. The second is to devise a procedure in which the interval between conditions is long enough to minimize the influence of the first condition on the second.
Complete Counterbalancing
In a repeated measures design, it is very important to counterbalance the order of the conditions. With complete counterbalancing, all possible orders of presentation are included in the experiment. In the example of a study on learning high- and low-meaningful material, half of the participants would be randomly assigned to the low-high order, and the other half would be assigned to the high-low order. This design is illustrated as follows:
Order 1 (half of the participants): low-meaningful material, recall test, then high-meaningful material, recall test
Order 2 (half of the participants): high-meaningful material, recall test, then low-meaningful material, recall test
By counterbalancing the order of conditions, it is possible to determine the extent to which order is influencing the results. In the hypothetical memory study, you would know whether the greater recall in the high-meaningful condition is consistent for both orders; you would also know the extent to which a practice effect is responsible for the results.
Counterbalancing principles can be extended to experiments with three or more groups. With three groups, there are 6 possible orders (3! = 3 × 2 × 1 = 6). With four groups, the number of possible orders increases to 24 (4! = 4 × 3 × 2 × 1 = 24); you would need a minimum of 24 participants to represent each order, and you would need 48 participants to have only two participants per order. Imagine the number of orders possible in an experiment by Shepard and Metzler (1971). In their basic experimental paradigm, each participant is shown a three-dimensional object along with the same figure rotated at one of 10 different angles ranging from 0 degrees to 180 degrees (see the sample objects illustrated in Figure 8.3). Each time, the participant presses a button when it is determined that the two figures are the same or different. The dependent variable is reaction time—the amount of time it takes to decide whether the figures are the same or different. The results show that reaction time becomes longer as the angle of rotation increases away from the original. In this experiment with 10 conditions, there are 3,628,800 possible orders! Fortunately, there are alternatives to complete counterbalancing that still allow researchers to draw valid conclusions about the effect of the independent variable without running some 3.6 million tests.
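The arithmetic behind complete counterbalancing can be checked directly. The sketch below lists every possible order for a small set of conditions and counts the orders for larger sets; the function name and condition labels are illustrative assumptions.

```python
from itertools import permutations
from math import factorial

def all_orders(conditions):
    """Return every possible order of presentation for the given conditions."""
    return list(permutations(conditions))

# Two conditions: the low-high and high-low orders from the memory example.
two_condition_orders = all_orders(["low", "high"])
print(two_condition_orders)

# Three conditions: all six orders can still be listed by hand.
for order in all_orders(["A", "B", "C"]):
    print(" -> ".join(order))

# The number of orders is n factorial, which grows very quickly.
order_counts = {n: factorial(n) for n in (2, 3, 4, 10)}
for n, count in order_counts.items():
    print(f"{n} conditions: {count:,} possible orders")
```

Enumerating the orders is practical for two to four conditions; for ten conditions only the count is computed, since listing millions of orders would serve no purpose in an actual study.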
Figure 8.3: Three-dimensional figures used by Shepard and Metzler (1971)
Adapted from “Mental Rotation of Three-Dimensional Objects,” by R. N. Shepard and J. Metzler, 1971, Science, 171, pp. 701–703.
Latin squares A technique to control for order effects without having all possible orders is to construct a Latin square: a limited set of orders constructed to ensure that (1) each condition appears at each ordinal position and (2) each condition precedes and follows each condition one time. Using a Latin square to determine order controls for most order effects without having to include all possible orders. Suppose you replicated the Shepard and Metzler (1971) study using only 4 of the 10 rotations: 0, 60, 120, and 180 degrees. A Latin square for these four conditions is shown in Figure 8.4. Each row in the square is one of the orders of the conditions (the conditions are labeled A, B, C, and D). The number of orders in a Latin square is equal to the number of conditions; thus, if there are four conditions, there are four orders. When you conduct your study using the Latin square to determine order, you need at least one participant per row. Usually, you will have two or more participants per row; the number of participants tested in each order must be equal.
Figure 8.4. A Latin square with four conditions
Note: The four conditions were randomly given letter designations. A = 60 degrees, B = 0 degrees, C = 180 degrees, and D = 120 degrees. Each row represents a different order of running the conditions.
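One way to build a square that satisfies both requirements is sketched below. This is an illustrative construction for an even number of conditions (a standard "balanced" Latin square), not necessarily the same square printed in the original figure; the function name `balanced_latin_square` is hypothetical.

```python
def balanced_latin_square(conditions):
    """Build a balanced Latin square for an even number of conditions.

    Each condition appears once in every ordinal position, and each
    condition immediately precedes every other condition exactly once.
    """
    n = len(conditions)
    if n % 2 != 0:
        raise ValueError("this simple construction needs an even number of conditions")

    # First row follows the index pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    take_low = True
    while len(first) < n:
        if take_low:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
        take_low = not take_low

    # Every later row shifts each entry of the first row by one more condition.
    return [[conditions[(i + shift) % n] for i in first] for shift in range(n)]


# Each printed row is one order of running conditions A, B, C, and D.
for row in balanced_latin_square(["A", "B", "C", "D"]):
    print(" ".join(row))
```

With four conditions this yields four orders, so a study would need participants in multiples of four to keep the number tested in each order equal.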
Time Interval Between Treatments
In addition to counterbalancing the order of treatments, researchers need to carefully determine the time interval between presentation of treatments and possible activities between them. A rest period may counteract a fatigue effect; attending to an unrelated task between treatments may reduce the possibility that participants will contrast the first treatment with the second. If the treatment is the administration of a drug that takes time to wear off, the interval between treatments may have to be a day or more. Lane, Cherek, Tcheremissine, Lieving, and Pietras (2005) used a repeated measures design to study the effect of marijuana on risk taking. The subjects came to the lab in the morning and passed a drug test. They were then given one of three marijuana doses. The dependent variable was a measure of risk taking. Subjects were tested in this way for each dosage. Because of the time necessary for the effects of the drug to wear off, the three conditions were run on separate days at least five days apart. A similarly long time interval would be needed with procedures that produce emotional changes, such as heightened anxiety or anger. You may have noted that introduction of an extended time interval may create a separate problem: Participants will have to commit to the experiment for a longer period of time. This can make it more difficult to recruit volunteers, and if the study extends over two or more days, some participants may drop out of the experiment altogether. And for the record, increased marijuana doses did result in riskier decisions.
Choosing Between Independent Groups and Repeated Measures Designs
Repeated measures designs have two major advantages over independent groups designs: (1) a reduction in the number of participants required to complete the experiment and (2) greater control over participant differences and thus greater ability to detect an effect of the independent variable. As noted previously, in certain areas of research, these advantages are very important. However, the disadvantages of repeated measures designs and the precautions required to deal with them are usually sufficient reasons for researchers to use independent groups designs.
A very different consideration in whether to use a repeated measures design concerns generalization to conditions in the “real world.” Greenwald (1976) has pointed out that in actual everyday situations, we sometimes encounter independent variables in an independent groups fashion: We encounter only one condition without a contrasting comparison. However, some independent variables are most frequently encountered in a repeated measures fashion: Both conditions appear, and our responses occur in the context of exposure to both levels of the independent variable. Thus, for example, if you are interested in how a defendant’s characteristics affect jurors, an independent groups design may be most appropriate because actual jurors focus on a single defendant in a trial. However, if you are interested in the effects of a job applicant’s characteristics on employers, a repeated measures design would be reasonable because employers typically consider several applicants at once. Whether to use an independent groups or repeated measures design may be partially determined by these generalization issues.
Finally, any experimental procedure that produces a relatively permanent change in an individual cannot be used in a repeated measures design. Examples include a psychotherapy treatment or a surgical procedure such as the removal of brain tissue.
MATCHED PAIRS DESIGN
A somewhat more complicated method of assigning participants to conditions in an experiment is called a matched pairs design. Instead of simply randomly assigning participants to groups, the goal is to first match people on a participant variable such as age or a personality trait (see Chapter 4). The matching variable will be either the dependent measure or a variable that is strongly related to the dependent variable. For example, in a learning experiment, participants might be matched on the basis of scores on a cognitive ability measure or even grade point average. If cognitive ability is not related to the dependent measure, however, matching would be a waste of time. The goal is to achieve the same equivalency of groups that is achieved with a repeated measures design without the necessity of having the same participants in both conditions. The design looks like this:
When using a matched pairs design, the first step is to obtain a measure of the matching variable from each individual. The participants are then rank ordered from highest to lowest based on their scores on the matching variable. Now the researcher can form matched pairs that are approximately equal on the characteristic (the highest two participants form the first pair, the next two form the second pair, and so on). Finally, the members of each pair are randomly assigned to the conditions in the experiment. (Note that there are methods of matching pairs of individuals on the basis of scores derived from multiple variables; these methods are described briefly in Chapter 11.)
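The three steps just described (measure, rank and pair, randomly assign within each pair) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `matched_pairs_assignment`, the condition labels, and the example scores are all hypothetical, and a leftover participant in an odd-sized sample is simply left unassigned.

```python
import random


def matched_pairs_assignment(scores, conditions=("condition_1", "condition_2"), rng=None):
    """Pair participants by rank on the matching variable, then randomly
    assign the two members of each pair to the two conditions.

    `scores` maps a participant ID to that person's matching-variable score.
    Returns a list of dicts, one per pair, mapping condition -> participant.
    """
    rng = rng or random.Random()

    # Step 1: rank order participants from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)

    pairs = []
    # Step 2: the top two form the first pair, the next two the second, and so on.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        # Step 3: random assignment within the pair.
        rng.shuffle(pair)
        pairs.append(dict(zip(conditions, pair)))
    return pairs


# Hypothetical cognitive ability scores for six participants.
example_scores = {"p1": 130, "p2": 95, "p3": 121, "p4": 102, "p5": 118, "p6": 99}
for pair in matched_pairs_assignment(example_scores):
    print(pair)
```

Which member of a pair lands in which condition changes from run to run, but the pairs themselves are fixed by the ranking, so the two groups start out closely matched on the matching variable.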
A matched pairs design ensures that the groups are equivalent (on the matching variable) prior to introduction of the independent variable manipulation. This assurance could be particularly important with small sample sizes, because random assignment procedures are more likely to produce equivalent groups as the sample size increases. Matching, then, is most likely to be used when only a few participants are available or when it is very costly to run large numbers of individuals in the experiment—as long as there is a strong relationship between a dependent measure and the matching variable. The result is a greater ability to detect a statistically significant effect of the independent variable because it is possible to account for individual differences in responses to the independent variable, just as we saw with a repeated measures design. (The issues of variability and statistical significance are discussed further in Chapter 13 and Appendix C.)
However useful they are, matching procedures can be costly and time-consuming, because they require measuring participants on the matching variable prior to the experiment. Such efforts are worthwhile only when the matching variable is strongly related to the dependent measure and you know that the relationship exists prior to conducting your study. For these reasons, matched pairs is not a commonly used experimental design. However, we will discuss matching again in Chapter 11 when describing quasi-experimental designs that do not have random assignment to conditions. You now have a fundamental understanding of the design of experiments. In the next chapter, we will consider issues that arise when you decide how to actually conduct an experiment.
ILLUSTRATIVE ARTICLE: EXPERIMENTAL DESIGN
We are constantly connected. We can be reached by cell phone almost anywhere, at any time. Text messages compete for our attention. Email and instant messaging (IM) can interrupt our attention whenever we are using a cell phone or computer. Is this a problem? Most people like to think of themselves as experts at multitasking. Is that true?
A study conducted by Bowman, Levine, Waite, and Gendron (2010) attempted to determine whether IMing during a reading session affected test performance. In this study, participants were randomly assigned to one of three conditions: one where they were asked to IM prior to reading, one in which they were asked to IM during reading, and one in which IMing was not allowed at all. Afterward, all participants completed a brief test on the material presented in the reading.
First, acquire and read the article:
Bowman, L. L., Levine, L. E., Waite, B. M., & Gendron, M. (2010). Can students really multitask? An experimental study of instant messaging while reading. Computers & Education, 54, 927–931. doi:10.1016/j.compedu.2009.09.024
After reading the article, answer the following questions:
1. This experiment used a posttest-only design. How could the researchers have used a pretest-posttest design? What would the advantages and disadvantages be of using a pretest-posttest design?
2. This experiment used an independent groups design.
a. How could they have used a repeated measures design? What would have been the advantages and disadvantages of using a repeated measures design?
b. How could they have used a matched pairs design? What variables do you think would have been worthwhile to match participants on? What would have been the advantages and disadvantages of using a matched pairs design?
3. What potential confounding variables can you think of?
4. In what way does this study reflect—or not reflect—the reality of studying and test taking in college? That is, how would you evaluate the external validity of this study?
5. How good was the internal validity of this experiment?
6. What were the researchers’ key conclusions of this experiment?
7. Would you have predicted the results obtained in this experiment? Why or why not?
Attrition (also mortality)
Between-subjects design (also independent groups design)
Carryover effect
Confounding variable
Counterbalancing
Fatigue effect
Independent groups design (also between-subjects design)
Internal validity
Latin square
Matched pairs design
Mortality (also attrition)
Order effect
Posttest-only design
Practice effect (also learning effect)
Pretest-posttest design
Random assignment
Repeated measures design (also within-subjects design)
Selection differences
Solomon four-group design
Within-subjects design (also repeated measures design)
1. What is confounding of variables?
2. What is meant by the internal validity of an experiment?
3. How do the two true experimental designs eliminate the problem of selection differences?
4. Distinguish between the posttest-only design and the pretest-posttest design. What are the advantages and disadvantages of each?
5. What is a repeated measures design? What are the advantages of using a repeated measures design? What are the disadvantages?
6. What are some of the ways of dealing with the problems of a repeated measures design?
7. When would a researcher decide to use the matched pairs design? What would be the advantages of this design?
8. The procedure used to obtain your sample (i.e., random or nonrandom sampling) is not the same as the procedure for assigning participants to conditions; distinguish between random sampling and random assignment.
1. Design an experiment to test the hypothesis that single-gender math classes are beneficial to adolescent females. Construct operational definitions of both the independent and dependent variables. Your experiment should have two groups and use the matched pairs procedure. Make a good case for your selection of the matching variable. In addition, defend your choice of either a posttest-only design or a pretest-posttest design.
2. Design a repeated measures experiment that investigates the effect of report presentation style on the grade received for the report. Use two levels of the independent variable: a “professional style” presentation (high-quality paper, consistent use of margins and fonts, carefully constructed tables and charts) and a “nonprofessional style” (average-quality paper, frequent changes in the margins and fonts, tables and charts lacking proper labels). Discuss the necessity for using counterbalancing. Create a table illustrating the experimental design.
3. Professor Foley conducted a cola taste test. Each participant in the experiment first tasted 2 ounces of Coca-Cola, then 2 ounces of Pepsi, and finally 2 ounces of Sam’s Choice Cola. A rating of the cola’s flavor was made after each taste. What are the potential problems with this experimental design and the procedures used? Revise the design and procedures to address these problems. You may wish to consider several alternatives and think about the advantages and disadvantages of each.
1. Which one of the following is an example of an expectation that can cause bias in an experiment? Explain why you think that is so (See instructions below).
a) Experimenter behaves inconsistently with participants
b) Participant wants to look good in the eyes of the experimenter
c) Experimenter is unaware of the hypothesis
d) Participant reads the hypothesis in the informed consent form
e) All of the above
Instructions: Make selection, provide a concept definition (text), and support your opinion on the selection with an example from research that illustrates the concept. Do so in a maximum of 250 words. Use credible and peer reviewed sources. Credible sources include course materials, University Library research that is peer reviewed, and Internet sites ending in .edu or .gov with the one exception of research pulled from the www.apa.org site. If research is pulled from the APA site, use the www.apa.org
1. Experimenter bias can undermine the credibility of an experiment’s outcomes. Because experimenters usually know the purpose of the study they are conducting, they tend to form expectations about its results, which creates experimenter bias, also called expectancy effects (Cozby & Bates, 2015). Cozby and Bates give the example of a researcher who is more likely to ask certain questions of participants in particular conditions, which would bias the results (Cozby & Bates, 2015). Personally, I think that experimenter bias is inevitable. The human mind is conditioned to associate certain feelings with certain situations, so it is harder than people think to remove oneself from the research in order to produce objective results. In a study focusing on experimenter bias, a researcher examined how the selection of participants has a direct effect on the bias that experimenters show during the actual study; usually the bias begins in the selection process, with researchers choosing certain people who they feel will produce the greatest results (Forster, 2000), which supports the view that experimenter bias is inevitable.
Forster, K. I. (2000). The potential for experimenter bias effects in word recognition experiments. Memory & Cognition, 28(7), 1109-1115.
Conducting Experiments Chapter 9
· Distinguish between straightforward and staged manipulations of an independent variable.
· Describe the three types of dependent variables: self-report, behavioral, and physiological.
· Discuss sensitivity of a dependent variable, contrasting floor effects and ceiling effects.
· Describe ways to control participant expectations and experimenter expectations.
· List the reasons for conducting pilot studies.
· Describe the advantages of including a manipulation check in an experiment.
THE PREVIOUS CHAPTERS HAVE LAID THE FOUNDATION FOR PLANNING A RESEARCH INVESTIGATION. In this chapter, we will focus on some very practical aspects of conducting research. How do you select the research participants? What should you consider when deciding how to manipulate an independent variable? What should you worry about when you measure a variable? What do you do when the study is completed?
SELECTING RESEARCH PARTICIPANTS
The focus of your study may be children, college students, elderly adults, employees, rats, pigeons, or even cockroaches or flatworms; in all cases, the participants or subjects must somehow be selected. The method used to select participants can have a profound impact on external validity. Remember that external validity is defined as the extent to which results from a study can be generalized to other populations and settings.
Most research projects involve sampling research participants from a population of interest. The population is composed of all of the individuals of interest to the researcher. Samples may be drawn from the population using probability sampling or nonprobability sampling techniques. When it is important to accurately describe the population, you must use probability sampling. This is why probability sampling is so crucial when conducting scientific polls. Much research, on the other hand, is more interested in testing hypotheses about behavior: attempting to detect whether X causes Y rather than describing a population. Here, the two focuses of the study are the relationships between the variables being studied and tests of predictions derived from theories of behavior. In such cases, the participants may be found in the easiest way possible using nonprobability sampling methods, also known as haphazard or “convenience” methods. You may ask students in introductory psychology classes to participate, knock on doors in your dorm to find people to be tested, or choose a class in which to test children simply because you know the teacher. Nothing is wrong with such methods as long as you recognize that they affect the ability to generalize your results to some larger population. In Chapter 14, we examine the issues of generalizing from the rather atypical samples of college students and other conveniently obtained research participants.
You will also need to determine your sample size. How many participants will you need in your study? In general, increasing your sample size increases the likelihood that your results will be statistically significant, because larger samples provide more accurate estimates of population values (see Table 7.2, p. 149). Most researchers take note of the sample sizes in the research area being studied and select a sample size that is typical for studies in the area. A more formal approach to selecting a sample size, called power analysis, is discussed in Chapter 13.
MANIPULATING THE INDEPENDENT VARIABLE
To manipulate an independent variable, you have to construct an operational definition of the variable (see Chapter 4). That is, you must turn a conceptual variable into a set of operations: specific instructions, events, and stimuli to be presented to the research participants. The manipulation of the independent variable, then, is when a researcher changes the conditions to which participants are exposed. In addition, the independent and dependent variables must be introduced within the context of the total experimental setting. This has been called setting the stage (Aronson, Brewer, & Carlsmith, 1985).
Setting the Stage
In setting the stage, you usually have to supply the participants with the information necessary for them to provide their informed consent to participate (informed consent is covered in Chapter 3). This generally includes information about the underlying rationale of the study. Sometimes, the rationale given is completely truthful, although only rarely will you want to tell participants the actual hypothesis. For example, you might say that you are conducting an experiment on memory when, in fact, you are studying a specific aspect of memory (your independent variable). If participants know what you are studying, they may try to confirm (or even deny) the hypothesis, or they may try to look good by behaving in the most socially acceptable way. If you find that deception is necessary, you have a special obligation to address the deception when you debrief the participants at the conclusion of the experiment.
There are no clear-cut rules for setting the stage, except that the experimental setting must seem plausible to the participants, nor are there any clear-cut rules for translating conceptual variables into specific operations. Exactly how the variable is manipulated depends on the variable and the cost, practicality, and ethics of the procedures being considered.
Types of Manipulations
Straightforward manipulations Researchers are usually able to manipulate an independent variable with relative simplicity by presenting written, verbal, or visual material to the participants. Such straightforward manipulations manipulate variables with instructions and stimulus presentations. Stimuli may be presented verbally, in written form, via videotape, or with a computer. Let’s look at a few examples.
Goldstein, Cialdini, and Griskevicius (2008) were interested in the influence of signs that hotels leave in their bathrooms encouraging guests to reuse their towels. In their research, they simply printed signs that were hooked on towel shelves in the rooms of single guests staying at least two nights. In a standard message, the sign read “HELP SAVE THE ENVIRONMENT. You can show your respect of nature and help save the environment by reusing towels during your stay.” In this case, 35% of the guests reused their towels on the second day. Another condition invoked a social norm that other people are reusing towels: “JOIN YOUR FELLOW GUESTS IN HELPING TO SAVE THE ENVIRONMENT. Almost 75% of guests who are asked to participate in our new resource savings program do help by using their towels more than once. You can join your fellow guests in this program to save the environment by reusing your towels during your stay.” This sign resulted in 44% reusing their towels. As you might expect, the researchers have extended this research to study ways that the sign can be even more effective in increasing conservation.
Most memory research relies on straightforward manipulations. For example, Coltheart and Langdon (1998) displayed lists of words to participants and later measured recall. The word lists differed on phonological similarity: Some lists had words that sounded similar, such as cat, map, and pat, and other lists had dissimilar words such as mop, pen, and cow. They found that lists with dissimilar words are recalled more accurately.
Educational programs are most often straightforward. Pawlenko, Safer, Wise, and Holfeld (2013) examined the effectiveness of three training programs designed to improve jurors’ ability to evaluate eyewitness testimony. Subjects viewed one of three 15-minute slide presentations on a computer screen. The Interview-Identification-Eyewitness training focused on three steps to analyze eyewitness evidence: Ask if the eyewitness interviews were done properly, ask if identification methods were proper, and evaluate if the conditions of the crime scene allowed for an accurate identification. A second presentation termed “Biggers training” was a presentation of five eyewitness factors that the Supreme Court determined should be used (developed in a case called Neil v. Biggers). The Jury Duty presentation was a summary of standard information provided to jurors such as the need to be fair and impartial and the importance of hearing all evidence before reaching a verdict. After viewing the presentations, subjects read a trial transcript that included problems with the eyewitness identification procedures. The subjects in the Interview-Identification-Eyewitness conditions were most likely to use these problems in reaching a verdict.
As a final example of a straightforward manipulation, consider a study by Mazer, Murphy, and Simonds (2009) on the effect of college teacher self-disclosure (via Facebook) on perceptions of teacher effectiveness. For this study, students read one of three Facebook profiles that were created for a volunteer teacher, one for each of the high-, medium-, and low-disclosure conditions. Level of disclosure was manipulated by changing the number and nature of photographs, biographical information, favorite movies/books/quotes, campus groups, and posts on “the wall.” After viewing the profile to which they were assigned, participants rated the teacher on several dimensions. Higher disclosure resulted in perceptions of greater caring and trustworthiness; however, disclosure was not related to perceptions of teacher competence.
You will find that most manipulations of independent variables in all areas of research are straightforward. Researchers vary the difficulty of material to be learned, motivation levels, the way questions are asked, characteristics of people to be judged, and a variety of other factors in a straightforward manner.
Staged manipulations Other manipulations are less straightforward. Sometimes, it is necessary to stage events during the experiment in order to manipulate the independent variable successfully. When this occurs, the manipulation is called a staged manipulation or event manipulation.
Staged manipulations are most frequently used for two reasons. First, the researcher may be trying to create some psychological state in the participants, such as frustration, anger, or a temporary lowering of self-esteem. For example, Zitek and her colleagues studied what is termed a sense of entitlement (Zitek, Jordan, Monin, & Leach, 2010). Their hypothesis is that the feeling of being unfairly wronged leads to a sense of entitlement and, as a result, the tendency to be more selfish with others. In their study, all participants played a computer game. The researchers programmed the game so that some participants would lose when the game crashed. This is an unfair outcome, because the participants lost for no good reason. Participants in the other condition also lost, but they thought it was because the game itself was very difficult. The participants experiencing the broken game did in fact behave more selfishly after the game; they later allocated themselves more money than deserved when competing with another participant.
Second, a staged manipulation may be necessary to simulate some situation that occurs in the real world. Recall the Milgram obedience experiment that was described in Chapter 3. In that study, an elaborate procedure, ostensibly to study learning, was constructed to actually study obedience to an authority. Or consider a study on computer multitasking conducted by Bowman, Levine, Waite, and Gendron (2010), wherein students read academic material presented on a computer screen. In one condition, the participants received and responded to instant messages while they were reading. Other participants did not receive any messages. Student performance on a test was equal in the two conditions. However, students in the instant message condition took longer to read the material (after the time spent on the message was subtracted from the total time working on the computer).
Staged manipulations frequently employ a confederate (sometimes termed an “accomplice”). Usually, the confederate appears to be another participant in an experiment but is actually part of the manipulation (we discussed the use of confederates in Chapter 3). A confederate may be useful to create a particular social situation. For example, Hermans, Herman, Larsen, and Engels (2010) studied whether food intake by males is affected by the amount of food consumed by a companion. Participants were recruited for a study on evaluation of movie trailers. The participant and a confederate sat in a comfortable setting in which they viewed and evaluated three trailers. They were then told that they needed a break before viewing the next trailers; snacks were available if they were interested. In one condition, the confederate took a large serving of snacks. A small serving was taken in another condition, and the confederate did not eat in the third condition. The researchers then measured the amount consumed by the actual participants; participants did model the amount consumed by the confederate, but only when they were hungry.
Figure 9.1. Example of the Asch line judgment task
The classic Asch (1956) conformity experiment provides another example of how confederates may be used. Asch gathered people into groups and asked them to respond to a line judgment task such as the one in Figure 9.1. Which of the three test lines matches the standard? Although this appears to be a simple task, Asch made it more interesting by having several confederates announce the same incorrect judgment prior to asking the actual participant; this procedure was repeated over a number of trials with different line judgments. Asch was able to demonstrate how easy it is to produce conformity: participants conformed to the unanimous majority on many of the trials even though the correct answer was clear. Finally, confederates may be used in field experiments as well as laboratory research. As described in Chapter 4, Lee, Schwarz, Taubman, and Hou (2010) studied the impact of public sneezing on the perception of unrelated risks by having an accomplice either sneeze or not sneeze (control condition) while walking by someone in a public area of a university. A researcher then approached those people with a request to complete a questionnaire, which they described as a “class project.” The questionnaire measured participants’ perceptions of average Americans’ risk of contracting a serious disease. The researchers found that, indeed, being around a person who sneezes increases self-reported perception of risk.
As you can see, staged manipulations demand a great deal of ingenuity and even some acting ability. They are used to involve the participants in an ongoing social situation that the individuals perceive not as an experiment but as a real experience. Researchers assume that the result will be natural behavior that truly reflects the feelings and intentions of the participants. However, such procedures allow for a great deal of subtle interpersonal communication that is hard to put into words; this may make it difficult for other researchers to replicate the experiment. Also, a complex manipulation is difficult to interpret. If many things happened during the experiment, what one thing was responsible for the results? In general, it is easier to interpret results when the manipulation is relatively straightforward. However, the nature of the variable you are studying sometimes demands complicated procedures.
Strength of the Manipulation
The simplest experimental design has two levels of the independent variable. In planning the experiment, the researcher has to choose these levels. A general principle to follow is to make the manipulation as strong as possible. A strong manipulation maximizes the differences between the two groups and increases the chances that the independent variable will have a statistically significant effect on the dependent variable.
To illustrate, suppose you think that there is a positive linear relationship between attitude similarity and liking (“birds of a feather flock together”). In conducting the experiment, you could arrange for participants to encounter another person, a confederate. In one group, the confederate and the participant would share similar attitudes; in the other group, the confederate and the participant would be dissimilar. Similarity, then, is the independent variable, and liking is the dependent variable. Now you have to decide on the amount of similarity. Figure 9.2 shows the hypothesized relationship between attitude similarity and liking at 10 different levels of similarity. Level 1 represents the least amount of similarity with no common attitudes, and level 10 the greatest (all attitudes are similar). To achieve the strongest manipulation, the participants in one group would encounter a confederate of level 1 similarity; those in the other group would encounter a confederate of level 10 similarity. This would result in the greatest difference in the liking means—a 9-point difference. A weaker manipulation—using levels 4 and 7, for example—would result in a smaller mean difference.
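The comparison between a strong and a weak manipulation can be worked through numerically. The sketch below assumes the simplest form of the hypothesized relationship, in which liking rises by one point for each level of similarity; that slope is an assumption chosen to be consistent with the 9-point difference mentioned above, and the function names are hypothetical.

```python
def hypothesized_liking(similarity_level):
    """Assumed linear relationship: one point of liking per similarity level."""
    return float(similarity_level)


def expected_mean_difference(low_level, high_level):
    """Expected difference in mean liking between the two groups."""
    return hypothesized_liking(high_level) - hypothesized_liking(low_level)


# Strongest manipulation: similarity levels 1 and 10.
print(expected_mean_difference(1, 10))

# Weaker manipulation: similarity levels 4 and 7.
print(expected_mean_difference(4, 7))
```

Under this assumption the weaker manipulation produces a difference one third the size of the strongest one, which would be harder to detect against the same amount of variability in participants' responses.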
A strong manipulation is particularly important in the early stages of research, when the researcher is most interested in demonstrating that a relationship does, in fact, exist. If the early experiments reveal a relationship between the variables, subsequent research can systematically manipulate the other levels of the independent variable to provide a more detailed picture of the relationship.
[Figure 9.2: Relationship between attitude similarity and liking]
The principle of using the strongest manipulation possible should be tempered by at least two considerations. The first concerns the external validity of a study: The strongest possible manipulation may entail a situation that rarely, if ever, occurs in the real world. For example, an extremely strong crowding manipulation might involve placing so many people in a room that no one could move, a manipulation that might significantly affect a variety of behaviors. However, we would not know if the results were similar to those occurring in more common, less crowded situations, such as many classrooms or offices.
A second consideration is ethics: A manipulation should be as strong as possible within the bounds of ethics. A strong manipulation of fear or anxiety, for example, might not be possible because of the potential physical and psychological harm to participants.
Cost of the Manipulation
Cost is another factor in the decision about how to manipulate the independent variable. Researchers who have limited monetary resources may not be able to afford expensive equipment, salaries for confederates, or payments to participants in long-term experiments. Also, a manipulation in which participants must be run individually requires more of the researcher’s time than a manipulation that allows running many individuals in a single setting. In this respect, a manipulation that uses straightforward presentation of written or verbal material is less costly than a complex, staged experimental manipulation. Some government and private agencies offer grants for research; because much research is costly, continued public support of these agencies is very important.
MEASURING THE DEPENDENT VARIABLE
In previous chapters, we have discussed various aspects of measuring variables, including reliability, validity, and reactivity of measures; observational methods; and the development of self-report measures for questionnaires and interviews. In this section, we will focus on measurement considerations that are particularly relevant to experimental research.
Types of Measures
The dependent variable in most experiments is one of three general types: self-report, behavioral, or physiological.
Self-report measures Self-reports can be used to measure attitudes, liking for someone, judgments about someone’s personality characteristics, intended behaviors, emotional states, attributions about why someone performed well or poorly on a task, confidence in one’s judgments, and many other aspects of human thought and behavior. Rating scales with descriptive anchors (endpoints) are most commonly used. For example, Funk and Todorov (2013) studied the impact of a facial tattoo on impressions of a man accused of assault. The man, Jack, had punched another man in a bar following a dispute over a spilled drink. A description of the incident included a photo of Jack with or without a facial tattoo. After viewing the photo and reading the description, subjects responded to several questions on a 7-point scale that included the following:
How likely is it that Jack is guilty?
Behavioral measures Behavioral measures are direct observations of behaviors. As with self-reports, measurements of an almost endless number of behaviors are possible. Sometimes, the researcher may record whether a given behavior occurs—for example, whether an individual responds to a request for help, makes an error on a test, or chooses to engage in one activity rather than another. Often, the researcher must decide whether to record the number of times a behavior occurs in a given time period—the rate of a behavior; how quickly a response occurs after a stimulus—a reaction time; or how long a behavior lasts—a measure of duration. The decision about which aspect of behavior to measure depends on which is most theoretically relevant for the study of a particular problem or which measure logically follows from the independent variable manipulation.
As an example, consider a study on eating behavior while viewing a food-related or nature television program (Bodenlos & Wormuth, 2013). Participants had access to chocolate-covered candies, cheese curls, and carrots that were weighed before and after the session. More candy was consumed during the food-related program; there were no differences for the other two foods.
Sometimes the behavioral measure is not an actual behavior but a behavioral intention or choice. Recall the study described in Chapter 3 in which subjects decided how much hot sauce another subject would have to consume later in the study (Vasquez, Pederson, Bushman, Kelley, Demeestere, & Miller, 2013). They did not actually pour the hot sauce but they did commit to an action rather than simply indicate their feelings about the other subject.
Physiological measures Physiological measures are recordings of responses of the body. Many such measurements are available; examples include the galvanic skin response (GSR), electromyogram (EMG), and electroencephalogram (EEG). The GSR is a measure of general emotional arousal and anxiety; it measures the electrical conductance of the skin, which changes when sweating occurs. The EMG measures muscle tension and is frequently used as a measure of tension or stress. The EEG is a measure of electrical activity of brain cells. It can be used to record general brain arousal as a response to different situations, such as activity in certain parts of the brain as learning occurs or brain activity during different stages of sleep.
The GSR, EMG, and EEG have long been used as physiological indicators of important psychological variables. Many other physiological measures are available, including temperature, heart rate, and analysis of blood or urine (see Cacioppo & Tassinary, 1990). In recent years, magnetic resonance imaging (MRI) has become an increasingly important tool for researchers in behavioral neuroscience. An MRI provides an image of an individual’s brain structure. It allows scientists to compare the brain structure of individuals with a particular condition (e.g., a cognitive impairment, schizophrenia, or attention deficit hyperactivity disorder) with the brain structure of people without the condition. In addition, a functional MRI (fMRI) allows researchers to scan areas of the brain while a research participant performs a physical or cognitive task. The data provide evidence for what brain processes are involved in these tasks. For example, a researcher can see which areas of the brain are most active when performing different memory tasks. In one study using fMRI, elderly adults with higher levels of education not only performed better on memory tasks than their less educated peers, but they also used areas of their frontal cortex that were not used by other elderly and younger individuals (Springer, McIntosh, Winocur, & Grady, 2005).
Although it is convenient to describe single dependent variables, most studies include more than one dependent measure. One reason to use multiple measures stems from the fact that a variable can be measured in a variety of concrete ways (recall the discussion of operational definitions in Chapter 4). In a study on the effects of an employee wellness program on health, the researchers might measure self-reported fatigue, stress, physical activity, and eating habits along with physical measures of blood pressure, blood sugar, cholesterol, and weight (cf. Clark et al., 2011). If the independent variable has the same effect on several measures of the same dependent variable, our confidence in the results is increased. It is also useful to know whether the same independent variable affects some measures but not others. For example, an independent variable designed to affect liking might have an effect on some measures of liking (e.g., desirability as a person to work with) but not others (e.g., desirability as a dating partner). Researchers may also be interested in studying the effects of an independent variable on several different behaviors. For example, an experiment on the effects of a new classroom management technique might examine academic performance, interaction rates among classmates, and teacher satisfaction.
When you have more than one dependent measure, the question of order arises. Does it matter which measures are made first? Is it possible that the results for a particular measure will be different if the measure comes earlier rather than later? The issue is similar to the order effects that were discussed in Chapter 8 in the context of repeated measures designs. Perhaps responding to the first measures will somehow affect responses on the later measures, or perhaps the participants attend more closely to first measures than to later measures. There are two possible ways of responding to this issue. If it appears that the problem is serious, the order of presenting the measures can be counterbalanced using the techniques described in Chapter 8. Often there are no indications from previous research that order is a serious problem. In this case, the prudent response is to present the most important measures first and the less important ones later. With this approach, order will not be a problem in interpreting the results on the most important dependent variables. Even though order may be a potential problem for some of the measures, the overall impact on the study is minimized.
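One way to counterbalance the order of dependent measures is complete counterbalancing, in which every possible order is used equally often. The sketch below assumes three hypothetical measures and six participants; the function name and measure labels are illustrative and not taken from the text.

```python
from itertools import cycle, permutations


def counterbalance(measures, participant_ids):
    """Cycle through every possible order of the measures across participants."""
    orders = list(permutations(measures))
    return dict(zip(participant_ids, cycle(orders)))


measures = ["liking", "trust", "mood"]  # hypothetical dependent measures
assignment = counterbalance(measures, range(1, 7))
for participant, order in assignment.items():
    print(participant, order)
```

With three measures there are six possible orders, so six participants (or any multiple of six) use each order equally often. With many measures the number of orders grows quickly, which is one reason presenting the most important measures first is often the more practical choice.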
Making multiple measurements in a single experiment is valuable when it is feasible to do so. However, it may be necessary to conduct a separate series of experiments to explore the effects of an independent variable on various behaviors.
Sensitivity of the Dependent Variable
The dependent variable should be sensitive enough to detect differences between groups. A measure of liking that asks, “Do you like this person?” with only a simple “yes” or “no” response alternative is less sensitive than one that asks, “How much do you like this person?” on a 5- or 7-point scale. With the first measure, people may tend to be nice and say yes even if they have some negative feelings about the person. The second measure allows for a gradation of liking; such a scale would make it easier to detect differences in amount of liking.
The issue of sensitivity is particularly important when measuring human performance. Memory can be measured using recall, recognition, or reaction time; cognitive task performance might be measured by examining speed or number of errors during a proofreading task; physical performance can be measured through various motor tasks. Such tasks vary in their difficulty. Sometimes a task is so easy that everyone does well regardless of the conditions that are manipulated by the independent variable. This results in what is called a ceiling effect—the independent variable appears to have no effect on the dependent measure only because participants quickly reach the maximum performance level. The opposite problem occurs when a task is so difficult that hardly anyone can perform well; this is called a floor effect.
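A ceiling effect can be illustrated with a small numerical sketch. The scores below are invented for illustration, and the idea of a "true" score that the task caps at its maximum is a simplifying assumption.

```python
from statistics import mean


def observed_scores(true_scores, max_score):
    """A task cannot register performance above its maximum possible score."""
    return [min(score, max_score) for score in true_scores]


# Hypothetical underlying performance levels in two conditions.
control = [12, 14, 16]
treatment = [15, 17, 19]

# Easy task: the maximum score is 10, so everyone reaches the ceiling.
easy_diff = mean(observed_scores(treatment, 10)) - mean(observed_scores(control, 10))

# More demanding task: the maximum score is 20, so differences remain visible.
hard_diff = mean(observed_scores(treatment, 20)) - mean(observed_scores(control, 20))

print(easy_diff, hard_diff)
```

On the easy task both groups score at the maximum and the difference is zero, even though the underlying levels differ by three points; the more demanding task recovers that difference.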
The need to consider sensitivity of measures is nicely illustrated in the Freedman et al. (1971) study of crowding mentioned in Chapter 4. The study examined the effect of crowding on various measures of cognitive task performance and found that crowding did not impair performance. You could conclude that crowding has no effect on performance; however, it is also possible that the measures were either too easy or too difficult to detect an effect of crowding. In fact, subsequent research showed that the tasks may have been too easy; when subjects perform complex cognitive tasks in laboratory or natural settings, crowding does result in lower performance (Bruins & Barber, 2000; Paulus, Annis, Seta, Schkade, & Matthews, 1976).
Cost of Measures
Another consideration is cost—some measures may be more costly than others. Paper-and-pencil self-report measures are generally inexpensive; measures that require trained observers or elaborate equipment can become quite costly. A researcher studying nonverbal behavior, for example, might have to use a video camera to record each participant’s behaviors in a situation. Two or more observers would then have to view the tapes to code behaviors such as eye contact, smiling, or self-touching (two observers are needed to ensure that the observations are reliable). Thus, there would be expenses for both equipment and personnel. Physiological recording devices are also expensive. Researchers need resources from the university or outside agencies to carry out such research.
The basic experimental design has two groups: in the simplest case, an experimental group that receives the treatment and a control group that does not. Use of a control group makes it possible to eliminate a variety of alternative explanations for the results, thus improving internal validity. Sometimes additional control procedures may be necessary to address other types of alternative explanations. Two general control issues concern expectancies on the part of both the participants in the experiment and the experimenters.
Controlling for Participant Expectations
Demand characteristics We noted previously that experimenters generally do not wish to inform participants about the specific hypotheses being studied or the exact purpose of the research. The reason for this lies in the problem of demand characteristics (Orne, 1962): any features of an experiment that might inform participants of the purpose of the study. The concern is that when participants form expectations about the hypothesis of the study, they will then do whatever is necessary to confirm the hypothesis. For example, if you were studying the relationship between political orientation and homophobia, participants might figure out the hypothesis and behave according to what they think you want, rather than according to their true selves.
One way to control for demand characteristics is to use deception, that is, to make participants think that the experiment is studying one thing when actually it is studying something else. The experimenter may devise elaborate cover stories to explain the purpose of the study and to disguise what is really being studied. The researcher may also attempt to disguise the dependent variable by using an unobtrusive measure or by placing the measure among a set of unrelated filler items on a questionnaire. Another approach is simply to assess whether demand characteristics are a problem by asking participants about their perceptions of the purpose of the research. It may be that participants do not have an accurate view of the purpose of the study; or if some individuals do guess the hypotheses of the study, their data may be analyzed separately.
Demand characteristics may be eliminated when people are not aware that an experiment is taking place or that their behavior is being observed. Thus, experiments conducted in field settings and observational research in which the observer is concealed or unobtrusive measures are used minimize the problem of demand characteristics.
Placebo groups A special kind of participant expectation arises in research on the effects of drugs. Consider an experiment that is investigating whether a drug such as Prozac reduces depression. One group of people diagnosed as depressive receives the drug and the other group receives nothing. Now suppose that the drug group shows an improvement. We do not know whether the improvement was caused by the properties of the drug or by the participants’ expectations about the effect of the drug—what is called a placebo effect. In other words, just administering a pill or an injection may be sufficient to cause an observed improvement in behavior. To control for this possibility, a placebo group can be added. Participants in the placebo group receive a pill or injection containing an inert, harmless substance; they do not receive the drug given to members of the experimental group. If the improvement results from the active properties of the drug, the participants in the experimental group should show greater improvement than those in the placebo group. If the placebo group improves as much as the experimental group, all improvement could be caused by a placebo effect.
Sometimes, participants’ expectations are the primary focus of an investigation. For example, Marlatt and Rohsenow (1980) conducted research to determine which behavioral effects of alcohol are due to alcohol itself as opposed to the psychological impact of believing one is drinking alcohol. The experimental design to examine these effects had four groups: (1) expect no alcohol–receive no alcohol, (2) expect no alcohol–receive alcohol, (3) expect alcohol–receive no alcohol, and (4) expect alcohol–receive alcohol. This design is called a balanced placebo design. Marlatt and Rohsenow’s research suggests that the belief that one has consumed alcohol is a more important determinant of behavior than the alcohol itself. That is, people who believed they had consumed alcohol (Groups 3 and 4) behaved very similarly, although those in Group 3 were not actually given any alcohol.
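The four groups of a balanced placebo design come from fully crossing two factors: what participants expect and what they actually receive. A minimal sketch of that crossing follows; the variable names are illustrative.

```python
from itertools import product

expectations = ["expect no alcohol", "expect alcohol"]
received = ["receive no alcohol", "receive alcohol"]

# Crossing the two factors yields the four cells of the 2 x 2 design.
conditions = list(product(expectations, received))
for group_number, (expected, actual) in enumerate(conditions, start=1):
    print(group_number, expected, "/", actual)
```

The cells come out in the same order as Groups 1 through 4 above. Comparing Groups 3 and 4 isolates the effect of the alcohol itself among people who believe they are drinking, while comparing Groups 1 and 3 isolates the effect of belief alone.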
In some areas of research, the use of placebo control groups has ethical implications. Suppose you are studying a treatment that does have a positive effect on people (for example, by reducing migraine headaches or alleviating symptoms of depression). It is important to use careful experimental procedures to make sure that the treatment does have an impact and that alternative explanations for the effect, including a placebo effect, are eliminated. However, it is also important to help those people who are in the control conditions; this aligns with the concept of beneficence that was covered in Chapter 3. Thus, participants in the control conditions must be given the treatment as soon as they have completed their part in the study in order to maximize the benefits of participation.
Placebo effects are real and must receive serious study in many areas of research. A great deal of current research and debate focuses on the extent to which any beneficial effects of antidepressant medications such as Prozac are due to placebo effects (e.g., Kirsch, 2010; Wampold, Minami, Tierney, Baskin, & Bhati, 2005).
Controlling for Experimenter Expectations
Experimenters are usually aware of the purpose of the study and thus may develop expectations about how participants should respond. These expectations can in turn bias the results. This general problem is called experimenter bias or expectancy effects (Rosenthal, 1966, 1967, 1969).
Expectancy effects may occur whenever the experimenter knows which condition the participants are in. There are two potential sources of experimenter bias. First, the experimenter might unintentionally treat participants differently in the various conditions of the study. For example, certain words might be emphasized when reading instructions to one group but not the other, or the experimenter might smile more when interacting with people in one of the conditions. The second source of bias can occur when experimenters record the behaviors of the participants; there may be subtle differences in the way the experimenter interprets and records the behaviors.
Research on expectancy effects Expectancy effects have been studied in a variety of ways. Perhaps the earliest demonstration of the problem is the case of Clever Hans, a horse with alleged mathematical and other abilities that attracted the attention of Europeans in the early 20th century (Rosenthal, 1967). The owner of the horse posed questions to Hans, who in turn would provide answers by tapping his hoof (e.g., a question of “what is two times five” would be followed by ten taps). Pfungst (1911) later showed that Hans was actually responding to barely detectable cues provided by the person asking the question. The person would look at the hoof as Hans started to tap and then shift to look at Hans as the correct answer was about to be given. Hans was responding to these head and eye movements, which went undetected by observers.
If a clever horse can respond to subtle cues, it is reasonable to suppose that clever humans can too. In fact, research has shown that experimenter expectancies can be communicated to humans by both verbal and nonverbal means (Duncan, Rosenberg, & Finklestein, 1969; Jones & Cooper, 1971). An example of more systematic research on expectancy effects is a study by Rosenthal (1966). In this experiment, graduate students trained rats that were described as coming from either “maze bright” or “maze dull” genetic strains. The animals actually came from the same strain and had been randomly assigned to the bright and dull categories; however, the “bright” rats did perform better than the “dull” rats. Subtle differences in the ways the students treated the rats or recorded their behavior must have caused this result. A generalization of this particular finding is called “teacher expectancy.” Research has shown that telling a teacher that a pupil will bloom intellectually over the next year results in an increase in the pupil’s IQ score (Rosenthal & Jacobson, 1968). In short, teachers’ expectations can influence students’ performance.
The problem of expectations influencing ratings of behavior is nicely illustrated in an experiment by Langer and Abelson (1974). Clinical psychologists were shown a videotape of an interview in which the person interviewed was described as either an applicant for a job or a patient; in reality, all saw the same tape. The psychologists later rated the person as more “disturbed” when they thought the person was a patient than when the person was described as a job applicant.
Solutions to the expectancy problem Clearly, experimenter expectations can influence the outcomes of research investigations. How can this problem be solved? Fortunately, there are a number of ways to minimize expectancy effects. First, experimenters should be well trained and should practice behaving consistently with all participants. The benefit of training was illustrated in the Langer and Abelson study with clinical psychologists. The bias of rating the “patient” as disturbed was much less among behavior-oriented therapists than among traditional ones. Presumably, the training of the behavior-oriented therapists led them to focus more on the actual behavior of the person, so they were less influenced by expectations stemming from the label of “patient.”
Another solution is to run all conditions simultaneously so that the experimenter’s behavior is the same for all participants. This solution is feasible only under certain circumstances, however, such as when the study can be carried out with the use of printed materials or the experimenter’s instructions to participants are the same for everyone.
Expectancy effects are also minimized when the procedures are automated. As noted previously, it may be possible to manipulate independent variables and record responses using computers; with automated procedures, the experimenter’s expectations are less likely to influence the results.
A final solution is to use experimenters who are unaware of the hypothesis being investigated. In these cases, the person conducting the study or making observations is blind regarding what is being studied or which condition the participant is in. This procedure originated in drug research using placebo groups. In a single-blind experiment, the participant is unaware of whether a placebo or the actual drug is being administered; in a double-blind experiment, neither the participant nor the experimenter knows whether the placebo or actual treatment is being given. To use a procedure in which the experimenter or observer is unaware of either the hypothesis or the group the participant is in, you must hire other people to conduct the experiment and make observations.
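One common way to keep the experimenter blind is to give each participant a neutral condition code and to store the key linking codes to conditions separately. The sketch below is a hypothetical illustration of that idea, not a procedure described in the text; the function name, the codes "A" and "B", and the fixed seed are all assumptions.

```python
import random


def blind_assignment(participant_ids, seed=0):
    """Return a coded schedule for the experimenter and a separate unblinding key."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable

    # Randomly decide which neutral code stands for which condition.
    conditions = ["drug", "placebo"]
    rng.shuffle(conditions)
    key = dict(zip(["A", "B"], conditions))

    # Build equal-sized groups, then randomize who receives which code.
    ids = list(participant_ids)
    codes = ["A" if i % 2 == 0 else "B" for i in range(len(ids))]
    rng.shuffle(codes)
    schedule = dict(zip(ids, codes))
    return schedule, key


schedule, key = blind_assignment(range(1, 9))
print(schedule)  # what the experimenter sees: codes only
```

The experimenter works only from `schedule`; a third party holds `key` until data collection is complete. Because participants also see only the coded treatment, the arrangement is double-blind.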
Because researchers are aware of the problem of expectancy effects, solutions such as the ones just described are usually incorporated into the procedures of the study. The procedures used in scientific research must be precisely defined so they can be replicated by others. This allows other researchers to build on previous research. If a study does have a potential problem of expectancy effects, researchers are bound to notice and will attempt to replicate the experiment with procedures that control for them. Replication is also a self-correcting mechanism that ensures that methodological flaws will be discovered. The importance of replication will be discussed further in Chapter 14.
So far, we have discussed several of the factors that a researcher considers when planning a study. Actually conducting the study and analyzing the results is a time-consuming process. Before beginning the research, the investigator wants to be as sure as possible that everything will be done right. And once the study has been designed, there are some additional procedures that will improve it.
After putting considerable thought into planning the study, the researcher writes a research proposal. The proposal will include a literature review that provides a background for the study. The intent is to clearly explain why the research is being done—what questions the research is designed to answer. The details of the procedures that will be used to test the idea are then given. The plans for analysis of the data are also provided. A research proposal is very similar to the introduction and method sections of a journal article. Such proposals must be included in applications for research grants; ethics review committees require some type of proposal as well (see Chapter 3 for more information on Institutional Review Boards).
Preparing a proposal is a good idea in planning any research project because simply putting your thoughts on paper helps organize and systematize ideas. In addition, you can show the proposal to friends, colleagues, professors, and other interested parties who can provide useful feedback about the adequacy of your procedures. They may see problems that you did not recognize, or they may offer ways of improving the study.
When the researcher has finally decided on all the specific aspects of the procedure, it is possible to conduct a pilot study in which the researcher does a trial run with a small number of participants. The pilot study will reveal whether participants understand the instructions, whether the total experimental setting seems plausible, whether any confusing questions are being asked, and so on.
Sometimes participants in the pilot study are questioned in detail about the experience following the experiment. Another method is to use the think aloud protocol (described in Chapter 7) in which the participants in the pilot study are instructed to verbalize their thoughts about everything that is happening during the study. Such procedures provide the researcher with an opportunity to make any necessary changes in the procedure before doing the entire study. Also, a pilot study allows the experimenters who are collecting the data to become comfortable with their roles and to standardize their procedures.
A manipulation check is an attempt to directly measure whether the independent variable manipulation has the intended effect on the participants. Manipulation checks provide evidence for the construct validity of the manipulation (construct validity was discussed in Chapter 4). If you are manipulating anxiety, for example, a manipulation check will tell you whether participants in the high-anxiety group really were more anxious than those in the low-anxiety condition. The manipulation check might involve a self-report of anxiety, a behavioral measure (such as number of arm and hand movements), or a physiological measure. All manipulation checks, then, ask whether the independent variable manipulation was in fact a successful operationalization of the conceptual variable being studied. Consider, for example, a manipulation of physical attractiveness as an independent variable. In an experiment, participants respond to someone who is supposed to be perceived as attractive or unattractive. The manipulation check in this case would determine whether participants do rate the highly attractive person as more physically attractive.
Manipulation checks are particularly useful in the pilot study to decide whether the independent variable manipulation is in fact having the intended effect. If the independent variable is not effective, the procedures can be changed. However, it is also important to conduct a manipulation check in the actual experiment. Because a manipulation check in the actual experiment might distract participants or inform them about the purpose of the experiment, it is usually wise to position the administration of the manipulation check measure near the end of the experiment; in most cases, this would be after measuring the dependent variables and prior to the debriefing session.
A manipulation check has two advantages. First, if the check shows that your manipulation was not effective, you have saved the expense of running the actual experiment. You can turn your attention to changing the manipulation to make it more effective. For instance, if the manipulation check shows that neither the low- nor the high-anxiety group was very anxious, you could change your procedures to increase the anxiety in the high-anxiety condition.
Second, a manipulation check is advantageous if you get nonsignificant results, that is, if the results indicate that no relationship exists between the independent and dependent variables. A manipulation check can identify whether the nonsignificant results are due to a problem in manipulating the independent variable. If your manipulation is not successful, it is only reasonable that you will obtain nonsignificant results. If both groups are equally anxious after you manipulate anxiety, anxiety cannot have any effect on the dependent measure. What if the check shows that the manipulation was successful, but you still get nonsignificant results? Then you know at least that the results were not due to a problem with the manipulation; the reason for not finding a relationship lies elsewhere. Perhaps you had a poor dependent measure, or perhaps there really is no relationship between the variables.
The importance of debriefing was discussed in Chapter 3 in the context of ethical considerations. After all the data are collected, a debriefing session is usually held. This is an opportunity for the researcher to interact with the participants to discuss the ethical and educational implications of the study.
The debriefing session can also provide an opportunity to learn more about what participants were thinking during the experiment. Researchers can ask participants what they believed to be the purpose of the experiment, how they interpreted the independent variable manipulation, and what they were thinking when they responded to the dependent measures. Such information can prove useful in interpreting the results and planning future studies.
Finally, researchers may ask the participants to refrain from discussing the study with others. Such requests are typically made when more people will be participating and they may talk with one another in classes or residence halls. People who have already participated are aware of the general purposes and procedures; it is important that these individuals not provide expectancies about the study to potential future participants.
ANALYZING AND INTERPRETING RESULTS
After the data have been collected, the next step is to analyze them. Statistical analyses of the data are carried out to allow the researcher to examine and interpret the pattern of results obtained in the study. The statistical analysis helps the researcher decide whether there really is a relationship between the independent and dependent variables; the logic underlying the use of statistical tests is discussed in Chapter 13. It is not the purpose of this book to teach statistical methods; however, the calculations involved in several statistical tests are provided in Appendix C.
COMMUNICATING RESEARCH TO OTHERS
The final step is to write a report that details why you conducted the research, how you obtained the participants, what procedures you used, and what you found. A description of how to write such reports is included in Appendix A. After you have written the report, what do you do with it? How do you communicate the findings to others? Research findings are most often submitted as journal articles or as papers to be read at scientific meetings. In either case, the submitted paper is evaluated by two or more knowledgeable reviewers who decide whether the paper is acceptable for publication or presentation at the meeting.
Meetings sponsored by professional associations are important opportunities for researchers to present their findings to other researchers and the public. National and regional professional associations such as the American Psychological Association (APA) and the Association for Psychological Science (APS) hold annual meetings at which psychologists and students present their own research and learn about the latest research being done by their colleagues. Sometimes, verbal presentations are delivered to an audience. However, poster sessions are more common; here, researchers display posters that summarize the research and are available to discuss the research with others.
As we noted in Chapter 2, many journals publish research papers. Nevertheless, the number of journals is small compared to the number of reports written; thus, it is not easy to publish research. When a researcher submits a paper to a journal, two or more reviewers read the paper and recommend acceptance (often with the stipulation that revisions be made) or rejection. This process is called peer review and it is very important in making sure that research has careful external review before it is published. As many as 90% of papers submitted to the more prestigious journals are rejected. Many rejected papers are submitted to other journals and eventually accepted for publication, but much research is never published. This is not necessarily bad; it simply means that selection processes separate high-quality research from that of lesser quality.
Many of the decisions that must be made when planning an experiment were described in this chapter. The discussion focused on experiments that use the simplest experimental design with a single independent variable. In the next chapter, more complex experimental designs are described.
ILLUSTRATIVE ARTICLE: CONDUCTING EXPERIMENTS
Many people behave superstitiously. That is, they may believe that their lucky shirt helps them with an exam, or that washing a uniform after a game removes the “luck,” or that winning the lottery is dependent on playing one’s lucky numbers. Many of us believe that, indeed, these superstitions do not really affect outcomes. Superstition has been studied in psychology for some time. B. F. Skinner (1947) demonstrated that superstitious behavior could be seen in a pigeon! More recently, Damisch, Stoberock, and Mussweiler (2010) decided to see if they could observe any effect that superstitious behaviors had on several different performance measures, including putting in golf, motor dexterity, memory, and performance on a word jumble puzzle.
Over four different experiments, the researchers varied participants’ perceptions of “luck” and then measured performance. In the first experiment, university students were randomly assigned to conditions wherein they were asked to putt using either a “lucky ball” (condition 1) or a “neutral ball” (condition 2). Participants in the “lucky ball” condition were statistically better putters than those in the “neutral ball” condition.
First, acquire and read the article:
Damisch, L., Stoberock, B., & Mussweiler, T. (2010). Keep your fingers crossed! How superstition improves performance. Psychological Science, 21, 1014–1020. doi:10.1177/0956797610372631
Then, after reading the article, consider the following:
1. For each of the four experiments, describe how the manipulation of the independent variable was straightforward or staged.
2. In this chapter, we discuss three types of dependent measures: self-report, behavioral, and physiological. In the experiments presented in this paper, what types of dependent measures were used? Could other types of dependent measures have been used? How so?
3. Was the dependent measure used in Experiment 1 sensitive? How so?
4. Did these researchers use any manipulation checks in their experiments? Design a manipulation check for Experiment 2.
5. This paper includes four experiments. Given that these researchers were interested in superstition, why was using multiple studies a good thing for the internal validity of the study?
6. How good was the internal validity of this series of studies?
7. How would you extend the study?
Behavioral measure ( p. 187 )
Ceiling effect ( p. 189 )
Confederate ( p. 183 )
Demand characteristics ( p. 190 )
Double-blind experiment ( p. 193 )
Electroencephalogram ( p. 187 )
Electromyogram ( p. 187 )
Expectancy effects (experimenter bias) ( p. 192 )
Filler items ( p. 190 )
Floor effect ( p. 189 )
Functional MRI ( p. 188 )
Galvanic skin response ( p. 187 )
Manipulation check ( p. 195 )
Manipulation strength ( p. 185 )
MRI ( p. 188 )
Physiological measure ( p. 187 )
Pilot study ( p. 194 )
Placebo group ( p. 191 )
Self-report ( p. 186 )
Sensitivity ( p. 189 )
Single-blind experiment ( p. 193 )
Staged manipulation ( p. 183 )
Straightforward manipulation ( p. 181 )
1. What is the difference between staged and straightforward manipulations of an independent variable?
2. What are the general types of dependent variables?
3. What is meant by the sensitivity of a dependent measure? What are ceiling and floor effects?
4. What are demand characteristics? Describe ways to minimize demand characteristics.
5. What is the reason for a placebo group?
6. What are experimenter expectancy effects? What are some solutions to the experimenter bias problem?
7. What is a pilot study?
8. What is a manipulation check? How does it help the researcher interpret the results of an experiment?
9. Describe the value of a debriefing following the study.
10. What does a researcher do with the findings after completing a research project?
1. Dr. Turk studied the relationship between age and reading comprehension, specifically predicting that older people will show lower comprehension than younger ones. Turk was particularly interested in comprehension of material that is available in the general press. Groups of participants who were 20, 30, 40, and 50 years of age read a chapter from a book by theoretical physicist Stephen W. Hawking (1988) entitled A Brief History of Time: From the Big Bang to Black Holes. After reading the chapter, participants were given a comprehension measure. The results showed that there was no relationship between age and comprehension scores; all age groups had equally low comprehension scores. Why do you think no relationship was found? Identify at least two possible reasons.
2. Recall the experiment on facilitated communication by children with autism that was described on p. 27 in Chapter 2 (Montee, Miltenberger, & Wittrock, 1995). Interpret the findings of that study in terms of experimenter expectancy effects.
3. Your lab group has been assigned the task of designing an experiment to investigate the effect of time spent studying on a recall task. Thus far, your group has come up with the following plan: “Participants will be randomly assigned to two groups. Individuals in one group will study a list of 5 words for 5 minutes, and those in the other group will study the same list for 7 minutes. Immediately after studying, the participants will read a list of 10 words and circle those that appeared on the original study list.” Improve this experiment, giving specific reasons for any changes.
4. If you were investigating variables that affect helping behavior, would you be more likely to use a straightforward or staged manipulation? Why?
5. Design an experiment using a staged manipulation to test the hypothesis that when people are in a good mood, they are more likely to contribute to a charitable cause. Include a manipulation check in your design.
6. In a pilot study, Professor Mori conducted a manipulation check and found no significant difference between the experimental conditions. Should she continue with the experiment? What should she do next? Explain your recommendations for Professor Mori.
7. Write a debriefing statement that you would read to participants in the Asch line judgment study ( p. 184 ).
8. Test yourself: Read each statement and then circle the appropriate letter: T (true) or F (false).
a. Most manipulations are straightforward. T F
b. Staged manipulations are designed to get participants involved in the situation and to make them think that it is a real experience. T F
c. A staged experiment may be difficult to replicate by other researchers. T F
d. Straightforward manipulations are often difficult to interpret. T F
Scenario: Researchers manipulated both the content of the class and the gender of the instructor within vignettes for 2 classes of students. The content manipulated in the two different classes was either counseling or research methods. The gender of the instructor manipulated in the vignettes was either male or female. In the research results, the main effects of instructor gender and course content were not statistically significant.
Answer each question in a maximum of 250 words excluding citations: Which of the following research designs is the above experimenter using? Why do you say that? What is the strength of the design that you selected from the list below?
a) Inverted U
b) 2 x 2
c) IV x PV
d) None of the above (What alternative design then?)
Instruction: Provide a definition of your selected design from our text, then discuss support for your selection, including an example from research that illustrates your point. Do so with a maximum of 250 words excluding citations.
Complex Experimental Designs
· Define factorial design and discuss reasons a researcher would use this design.
· Describe the information provided by main effects and interaction effects in a factorial design.
· Describe an IV × PV design.
· Discuss the role of simple main effects in interpreting interactions.
· Compare the assignment of participants in an independent groups design, a repeated measures design, and a mixed factorial design.
Thus far we have focused primarily on the simplest experimental design, in which one independent variable is manipulated and one dependent variable is measured. However, researchers often investigate problems that demand more complicated designs. These complex experimental designs are the subject of this chapter.
We begin by discussing the idea of increasing the number of levels of an independent variable in an experiment. Then, we describe experiments that expand the number and types of independent variables. These changes impact the complexity of an experiment.
INCREASING THE NUMBER OF LEVELS OF AN INDEPENDENT VARIABLE
In the simplest experimental design, there are only two levels of the independent variable. However, a researcher might want to design an experiment with three or more levels for several reasons. First, a design with only two levels of one independent variable cannot provide very much information about the exact form of the relationship between the independent and dependent variables. For example, Figure 10.1 is based on the outcome of an experiment on the relationship between amount of “mental practice” and performance on a motor task: dart throwing score (Kremer, Spittle, McNeil, & Shinners, 2009). Mental practice consisted of imagining practice throws prior to an actual dart throwing task. Does mental practice improve dart performance? The solid line describes the results when only two levels were used—no mental practice throws and 100 mental practice throws. Because there are only two levels, the relationship can be described only with a straight line. We do not know what the relationship would be if other practice amounts were included as separate levels of the independent variable. The broken line in Figure 10.1 shows the results when 25, 50, and 75 mental practice throws are also included. This result is a more accurate description of the relationship between amount of mental practice and performance. The amount of practice is very effective in increasing performance up to a point, after which further practice is not helpful. This type of relationship is termed a positive monotonic relationship; there is a positive relationship between the variables, but it is not a strictly positive linear relationship. An experiment with only two levels cannot yield such exact information.
Linear versus positive monotonic functions
Note: Data based on an experiment conducted by Kremer, Spittle, McNeil, and Shinners (2009); that experiment did not include a 75-practice-throws condition.
Note: At least three levels of the independent variable are required to show curvilinear relationships.
Recall from Chapter 4 that variables are sometimes related in a curvilinear or nonmonotonic fashion; that is, the direction of relationship changes. Figure 10.2 shows an example of a curvilinear relationship; this particular form is called an inverted-U because the wide range of levels of the independent variable produces an inverted U shape (recall our discussion of inverted-U relationships in Chapter 4). An experimental design with only two levels of the independent variable cannot detect curvilinear relationships between variables. If a curvilinear relationship is predicted, at least three levels must be used. As Figure 10.2 shows, if only levels 1 and 3 of the independent variable had been used, no relationship between the variables would have been detected. Many such curvilinear relationships exist in psychology. The relationship between fear arousal and attitude change is one example—we can be scared into changing an attitude, but if we think that a message is “over the top,” attitude change does not occur. In other words, increasing the amount of fear aroused by a persuasive message increases attitude change up to a moderate level of fear; further increases in fear arousal actually reduce attitude change.
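As a rough illustration of why two levels cannot reveal an inverted-U relationship, the sketch below uses made-up means for three levels of fear arousal. The numbers and the function name are assumptions chosen for illustration, not data from any study. Comparing only the lowest and highest levels shows no difference, while including the moderate level exposes both the rise and the fall.

```python
# Hypothetical mean attitude change at three levels of fear arousal.
# These values are invented for illustration only.
means = {"low": 2.0, "moderate": 6.0, "high": 2.0}


def two_level_difference(means, first, second):
    """Difference in the dependent measure between two chosen levels."""
    return means[second] - means[first]


# With only the two extreme levels, the variables appear unrelated.
print(two_level_difference(means, "low", "high"))       # 0.0

# Adding the middle level reveals the increase and then the decrease.
print(two_level_difference(means, "low", "moderate"))   # 4.0
print(two_level_difference(means, "moderate", "high"))  # -4.0
```

A design that sampled only the low and high levels would correspond to the first comparison alone.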
Finally, researchers frequently are interested in comparing more than two groups. Suppose you want to know whether playing with an animal has beneficial effects on nursing home residents. You could have two conditions, such as a no-animal control group and a group in which a dog is brought in for play each day. However, you might also be interested in knowing the effect of a cat and a bird, and so you could add these two groups to your study. Or you might be interested in comparing the effect of a large versus a small dog in addition to a no-animal control condition. In an actual study with four groups, Strassberg and Holty (2003) compared responses to women’s Internet personal ads. The researchers first devised a control ad portraying a woman with generally positive attributes, such as liking painting and hiking. The other ads each added a more specific characteristic: (1) slim and attractive, (2) sensual and passionate, or (3) financially independent and ambitious. Contrary to the researchers’ initial expectations, the independent/ambitious woman received many more responses than the other three.
INCREASING THE NUMBER OF INDEPENDENT VARIABLES: FACTORIAL DESIGNS
Researchers often manipulate more than one independent variable in a single experiment. Typically, two or three independent variables are operating simultaneously. This type of experimental design is a closer approximation of real-world conditions, in which independent variables do not exist by themselves. Researchers recognize that in any given situation a number of variables are operating to affect behavior. In Chapter 8, we described a hypothetical experiment in which exercise was the independent variable and mood was the dependent variable. An actual experiment on the relationship between exercise and depression was conducted by Dunn, Trivedi, Kampert, Clark, and Chambliss (2005). The participants were randomly assigned to one of two exercise conditions—a low or high amount, with energy expenditure of either 7.0 or 17.5 kcal per kilogram per week. The dependent variable was the score on a standard depression measure after 12 weeks of exercise. You might be wondering how often the participants exercised each week. Indeed, the researchers did wonder if frequency of exercising would be important, so they scheduled some subjects to exercise 3 days per week and others to exercise 5 days per week. Thus, the researchers designed an experiment with two independent variables—in this case, (1) amount of exercise and (2) frequency of exercise.
Factorial designs are designs with more than one independent variable (or factor). In a factorial design, all levels of each independent variable are combined with all levels of the other independent variables. The simplest factorial design—known as a 2 × 2 (two by two) factorial design—has two independent variables, each having two levels.
An experiment by Hermans, Engels, Larsen, and Herman (2009) illustrates a 2 × 2 factorial design. Hermans et al. studied modeling of food intake when someone is with another person who is eating. What influences whether you will model the other person’s eating? In the experiment, a subject was paired with a same-sex confederate to view and rate movie trailers; they were seated in a comfortable living room environment with a bowl of M&Ms within easy reach on a coffee table. After 10 minutes of viewing, there was a break period. Two independent variables were manipulated: (1) confederate sociability and (2) confederate food intake. The sociable confederate initiated a conversation; the unsociable confederate did not initiate a conversation, responded with only brief answers if the subject said something, and avoided eye contact. The confederate also was first to reach for the M&Ms. One piece was taken in the low food intake condition; a total of six pieces were taken during the break. In the high food intake condition, four pieces were taken immediately; a total of 24 pieces were eaten by the confederate in this condition. During the break period (lasting 15 minutes), the subject could ignore the bowl of M&Ms or eat as many as desired.
2 × 2 factorial design: Setup of food intake modeling experiment
This 2 × 2 design results in four experimental conditions: (1) sociable confederate—low food intake, (2) unsociable confederate—low food intake, (3) sociable confederate—high food intake, (4) unsociable confederate—high food intake. A 2 × 2 design always has four groups. Figure 10.3 shows how these experimental conditions are created.
The general format for describing factorial designs is
Number of levels of first IV × Number of levels of second IV × Number of levels of third IV
and so on. A design with two independent variables, one having two levels and the other having three levels, is a 2 × 3 factorial design; there are six conditions in the experiment. A 3 × 3 design has nine conditions.
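The way a factorial design crosses every level of one variable with every level of the others can be sketched in a few lines. The helper name `conditions` and the placeholder level labels for the 2 × 3 case are assumptions for illustration; the sociability and food intake levels come from the experiment described above.

```python
from itertools import product


def conditions(*factors):
    """Return every combination of levels, one tuple per experimental condition."""
    return list(product(*factors))


sociability = ["sociable", "unsociable"]
food_intake = ["low", "high"]

# A 2 x 2 design yields four conditions.
cells = conditions(sociability, food_intake)
print(len(cells))  # 4
for cell in cells:
    print(cell)

# A 2 x 3 design (placeholder level names) yields six conditions.
print(len(conditions(["a1", "a2"], ["b1", "b2", "b3"])))  # 6
```

The number of conditions is simply the product of the number of levels of each independent variable.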
Interpretation of Factorial Designs
Factorial designs yield two kinds of information. The first is information about the effect of each independent variable taken by itself: the main effect of an independent variable. In a design with two independent variables, there are two main effects—one for each independent variable. The second type of information is called an interaction. If there is an interaction between two independent variables, the effect of one independent variable depends on the particular level of the other variable. In other words, the effect that an independent variable has on the dependent variable depends on the level of the other independent variable. Interactions are a new source of information that cannot be obtained in a simple experimental design in which only one independent variable is manipulated.
TABLE 10.1 2 × 2 factorial design: Results of the food intake modeling experiment
To illustrate main effects and interactions, we can look at the results of the Hermans et al. (2009) study on food intake modeling. Table 10.1 illustrates a common method of presenting outcomes for the various groups in a factorial design. The number in each cell represents the mean number of M&Ms consumed by the subjects in the four conditions.
Main effects A main effect is the effect each variable has by itself. The main effect of independent variable A, confederate sociability, is the overall effect of the variable on the dependent measure. Similarly, the main effect of independent variable B, confederate food intake, is the effect of number of M&Ms that the confederate ate on the number of M&Ms consumed by the subject.
The main effect of each independent variable is the overall relationship between that independent variable and the dependent variable. For independent variable A, is there a relationship between sociability and food intake? We can find out by looking at the overall means in the sociable and unsociable confederate conditions. These overall main effect means are obtained by averaging across all participants in each group, irrespective of confederate food intake (low or high). The main effect means are shown in the rightmost column and bottom row (called the margins of the table) of Table 10.1. The average number of M&Ms consumed by participants in the sociable confederate condition is 6.13, and the number eaten in the unsociable condition is 6.39. Note that the overall mean of 6.13 in the sociable confederate condition is the average of 6.58 in the sociable—low food intake group and 5.68 in the sociable—high food intake group (this calculation assumes equal numbers of participants in each group). You can see that overall, somewhat more M&Ms are eaten when the confederate is unsociable. Statistical tests would enable us to determine whether this is a significant main effect.
The main effect for independent variable B (confederate food intake) is the overall relationship between that independent variable, by itself, and the dependent variable. You can see in Table 10.1 that the average number of candies consumed by subjects in the low food intake condition is 4.36, and the overall number eaten in the high food intake condition is 8.16. Thus, in general, more M&Ms are eaten by subjects when they are with a confederate who has consumed a high number of M&Ms (this is a modeling effect).
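The margin calculation described above can be sketched as follows. The two cell means are the ones quoted in the text for the sociable confederate condition. The group sizes in the weighted version are hypothetical and serve only to show how unequal group sizes would change the result; the function names are likewise illustrative.

```python
def unweighted_mean(cell_means):
    """Average of cell means, appropriate when every group has the same size."""
    return sum(cell_means) / len(cell_means)


def weighted_mean(cell_means, cell_sizes):
    """Average of cell means weighted by the number of participants per cell."""
    total = sum(mean * n for mean, n in zip(cell_means, cell_sizes))
    return total / sum(cell_sizes)


# Sociable confederate: low and high food intake cell means from the text.
sociable_cells = [6.58, 5.68]

# Equal group sizes reproduce the main effect mean given in the text.
print(round(unweighted_mean(sociable_cells), 2))          # 6.13

# Hypothetical unequal group sizes (30 and 20) pull the mean toward the larger group.
print(round(weighted_mean(sociable_cells, [30, 20]), 2))  # 6.22
```

With equal group sizes the two functions return the same value, which is why the text notes the equal-numbers assumption.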
Interactions These main effect means tell us that, overall, subjects eat (1) slightly more M&Ms when the confederate is unsociable and (2) considerably more when the confederate eats a large amount of candy. There is also the possibility that an interaction exists; if so, the main effects of the independent variables must be qualified. This is because an interaction between independent variables indicates that the effect of one independent variable is different at different levels of the other independent variable. That is, an interaction tells us that the effect of one independent variable depends on the particular level of the other.
We can see an interaction in the results of the Hermans et al. (2009) study. The effect of confederate food intake is different depending on whether the confederate is sociable or unsociable. When the confederate is unsociable, subjects consume many more M&Ms when the confederate food intake is high (10.68 in the high food intake condition versus 2.14 in the low food intake condition). However, when the confederate is sociable, confederate food intake has little effect and in fact is the opposite of what would be expected based on modeling (6.58 in the low food intake condition and 5.68 in the high food intake condition). Thus, the relationship between confederate food intake and subject food intake is best understood by considering both independent variables: We must consider the food intake of the confederate and whether the confederate is sociable or unsociable.
Interactions can be seen easily when the means for all conditions are presented in a graph. Figure 10.4 shows a bar graph of the results of the Hermans et al. food intake modeling experiment. Note that all four means have been graphed. Two bars compare low versus high confederate food intake in the sociable confederate condition; the same comparison is shown for the unsociable confederate. You can see that confederate food intake has a small effect on the participants’ modeling of M&Ms consumed when the confederate is sociable; however, when the confederate is unsociable, the participants do model the food intake of the confederate. Hermans et al. (2009) noted that they expected to observe the modeling effect primarily when the confederate is sociable; why do you think someone might actually model the food intake of the unsociable confederate instead?
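One way to put numbers on this interaction is to compute the effect of confederate food intake separately at each level of sociability and then compare the two effects. The sketch below uses the four cell means quoted in the text; the function names are illustrative. The difference it reports is descriptive only, because deciding whether an interaction is statistically significant requires a statistical test.

```python
# Mean number of M&Ms eaten by subjects in each condition, as quoted in the text.
cell_means = {
    ("sociable", "low"): 6.58,
    ("sociable", "high"): 5.68,
    ("unsociable", "low"): 2.14,
    ("unsociable", "high"): 10.68,
}


def simple_effect(cell_means, sociability):
    """Effect of confederate food intake (high minus low) at one sociability level."""
    return cell_means[(sociability, "high")] - cell_means[(sociability, "low")]


def interaction_contrast(cell_means):
    """Difference between the two simple effects; zero would indicate no interaction."""
    return simple_effect(cell_means, "unsociable") - simple_effect(cell_means, "sociable")


print(round(simple_effect(cell_means, "sociable"), 2))    # -0.9
print(round(simple_effect(cell_means, "unsociable"), 2))  # 8.54
print(round(interaction_contrast(cell_means), 2))         # 9.44
```

The small negative effect with a sociable confederate and the large positive effect with an unsociable one are what make the bars in a graph of these means nonparallel.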
Interaction between confederate sociability and food intake
The concept of interaction is a relatively simple one that you probably use all the time. When we say “it depends,” we are usually indicating that some sort of interaction is operating: it depends on some other variable. Suppose, for example, that a friend has asked you if you want to go to a movie. Whether you want to go may reflect an interaction between two variables: (1) Is an exam coming up? and (2) Who stars in the movie? If there is an exam coming up, you will not go under any circumstance. If you do not have an exam to worry about, your decision will depend on whether you like the actors in the movie; that is, you will be much more likely to go if a favorite star is in the movie.
You might try graphing the movie example in the same way we graphed the food intake example in Figure 10.4. The dependent variable (likelihood of going to the movie) is always placed on the vertical axis. One independent variable is placed on the horizontal axis. Bars are then drawn to represent each of the levels of the other independent variable. Graphing the results in this manner is a useful method of visualizing interactions in a factorial design.
Factorial Designs with Manipulated and Nonmanipulated Variables
One common type of factorial design includes both experimental (manipulated) and nonexperimental (measured or nonmanipulated) variables. These designs—sometimes called IV × PV designs (i.e., independent variable by participant variable)—allow researchers to investigate how different types of individuals (i.e., participants) respond to the same manipulated variable. These “participant variables” are personal attributes such as gender, age, ethnic group, personality characteristics, and clinical diagnostic category. You will sometimes see participant variables described as subject variables or attribute variables. This is only a difference of terminology.
The simplest IV × PV design includes one manipulated independent variable that has at least two levels and one participant variable with at least two levels. The two levels of the subject variable might be two different age groups, groups of low and high scorers on a personality measure, or groups of males and females. An example of this design is a study by Furnham, Gunter, and Peterson (1994). Do you ever try to study in the presence of a distraction such as a television program? Furnham et al. showed that the ability to study with such a distraction depends on whether you are more extraverted or introverted. The manipulated variable was distraction. College students read material in silence and within hearing range of a TV drama. Thus, a repeated measures design was used and the order of the conditions was counterbalanced. After they read the material, the students completed a reading comprehension measure. The participant variable was extraversion: Participants completed a measure of extraversion and then were classified as extraverts or introverts. The results are shown in Figure 10.5. There was a main effect of distraction and an interaction.
Interaction in IV × PV design
Overall, students had higher comprehension scores when they studied in silence. In addition, there was an interaction between extraversion and distraction. Without a distraction, the performance of extraverts and introverts was almost the same. However, extraverts performed better than introverts when the TV was on. You can speculate whether similar results would be obtained today when text messages are a potential distraction.
Factorial designs with both manipulated independent variables and participant variables offer a very appealing method for investigating many interesting research questions. Such experiments recognize that full understanding of behavior requires knowledge of both situational variables and the personal attributes of individuals.
Outcomes of a 2 × 2 Factorial Design
A 2 × 2 factorial design has two independent variables, each with two levels. When analyzing the results, researchers deal with several possibilities: (1) There may or may not be a significant main effect for independent variable A, (2) there may or may not be a significant main effect for independent variable B, and (3) there may or may not be a significant interaction between the independent variables.
Figure 10.6 illustrates the eight possible outcomes in a 2 × 2 factorial design. For each outcome, the means are given and then graphed using line graphs. In addition, for each graph in Figure 10.6, the main effect for each variable (A and B) is indicated by a Yes (indicating the presence of a main effect) or No (no main effect). Similarly, the A × B interaction is either present (“Yes” on the figure) or not present (“No” on the figure). The means that are given in the figure are idealized examples; such perfect outcomes rarely occur in actual research. Nevertheless, you should study the graphs to determine for yourself why, in each case, there is or is not a main effect for A, a main effect for B, and an A × B interaction.
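The count of eight follows from three yes/no questions (main effect of A, main effect of B, A × B interaction). The sketch below enumerates those combinations and classifies a few idealized sets of cell means. The cell means and the function name `classify` are invented for illustration, and the exact comparisons assume perfect, error-free means like those in an idealized figure; real data would require statistical tests.

```python
from itertools import product


def classify(a1b1, a1b2, a2b1, a2b2):
    """Classify idealized 2 x 2 cell means as (main effect A, main effect B, interaction)."""
    main_a = (a1b1 + a1b2) / 2 != (a2b1 + a2b2) / 2   # A1 margin vs. A2 margin
    main_b = (a1b1 + a2b1) / 2 != (a1b2 + a2b2) / 2   # B1 margin vs. B2 margin
    interaction = (a2b1 - a1b1) != (a2b2 - a1b2)      # effect of A differs across B
    return main_a, main_b, interaction


# Three yes/no questions give 2 * 2 * 2 possible outcome patterns.
outcomes = list(product([False, True], repeat=3))
print(len(outcomes))  # 8

print(classify(5, 5, 5, 5))  # (False, False, False): no effects at all
print(classify(1, 1, 9, 9))  # (True, False, False): main effect of A only
print(classify(1, 9, 9, 1))  # (False, False, True): interaction only (crossover)
```

Trying other sets of four means is a quick way to check your reading of each panel before graphing it.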
Outcomes of a factorial design with two independent variables
Before you begin studying the graphs, it will help to think of concrete variables to represent the two independent variables and the dependent variable. You might want to think about the example of the effect of amount and frequency of exercise on depression. Suppose that independent variable A is amount of exercise per week (A1 is a low amount of exercise, with fewer calories per week; A2 is a higher amount of exercise, with more calories per week) and independent variable B is frequency of exercise (B1 is 3 times per week and B2 is 5 times per week). The dependent variable (DV) is the score on a depression measure, with higher numbers indicating greater depression.
The top four graphs illustrate outcomes in which there is no A × B interaction, and the bottom four graphs depict outcomes in which there is an interaction. When there is a statistically significant interaction, you need to carefully examine the means to understand why the interaction occurred. In some cases, there is a strong relationship between the first independent variable and the dependent variable at one level of the second independent variable; however, there is no relationship or a weak relationship at the other level of the second independent variable. In other outcomes, the interaction may indicate that one independent variable has opposite effects on the dependent variable, depending on the level of the second independent variable.
The independent and dependent variables in Figure 10.6 do not have concrete variable labels. As an exercise, interpret each of the graphs using actual variables from three different hypothetical experiments, using the scenarios suggested below. This works best if you draw the graphs, including labels for the variables, on a separate sheet of paper for each experiment. You can try depicting the data as either line graphs or bar graphs. The data points in both types of graphs are the same and both have been used in this chapter. In general, line graphs are used when the levels of the independent variable on the horizontal axis (independent variable A) are quantitative—low and high amounts. Bar graphs are more likely to be used when the levels of the independent variable represent different categories, such as one type of therapy compared with another type.
Hypothetical experiment 1: Effect of age of defendant and type of substance use during an offense on months of sentence. A male, age 20 or 50, was found guilty of causing a traffic accident while under the influence of either alcohol or marijuana.
Independent variable A: Type of Offense—Alcohol versus Marijuana
Independent variable B: Age of Defendant—20 versus 50 years of age
Dependent variable: Months of sentence (range from 0 to 10 months)
Hypothetical experiment 2: Effect of gender and violence on recall of advertising. Participants (males and females) viewed a video on a computer screen that was either violent or not violent. They were then asked to read print ads for eight different products over the next 3 minutes. The dependent variable was the number of ads correctly recalled.
Independent variable A: Exposure to Violence—Nonviolent versus Violent Video
Independent variable B: Participant Gender—Male versus Female
Dependent variable: Number of ads recalled (range from 0 to 8)
Hypothetical experiment 3: Devise your own experiment with two independent variables and one dependent variable.
Interactions and Simple Main Effects
A statistical procedure called analysis of variance is used to assess the statistical significance of the main effects and the interaction in a factorial design. When a significant interaction occurs, the researcher must statistically evaluate the individual means. If you take a look at Table 10.1 and Figure 10.4 once again, you see a clear interaction. When there is a significant interaction, the next step is to look at the simple main effects. A simple main effect analysis examines the mean differences for one independent variable at each level of the other independent variable. Recall that the main effect of an independent variable averages across the levels of the other independent variable; with simple main effects, the results are analyzed as if we had separate experiments at each level of the other independent variable.
Simple main effect of confederate food intake. In Figure 10.4, we can look at the simple main effect of confederate food intake. This will tell us whether the difference between the low and high confederate food intake is significant when the confederate is (1) sociable and (2) unsociable. In this case, the simple main effect of confederate food intake is significant when the confederate is unsociable (means of 2.14 versus 10.63), but the simple main effect of confederate food intake is not significant when the confederate is sociable (means of 6.58 and 5.68).
Simple main effect of sociability. We could also examine the simple main effect of confederate sociability; here we would compare the sociable versus unsociable conditions when the food intake is low and then when food intake is high. The simple main effect that you will be most interested in will depend on the predictions that you made when you designed the study. The exact statistical procedures do not concern us; the point here is that the pattern of results with all the means must be examined when there is a significant interaction in a factorial design.
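The comparisons involved in a simple main effect analysis can be sketched in a few lines of Python using the four means quoted above. This is only a descriptive illustration: the function name `simple_main_effects` is hypothetical, and the code assumes that the first mean in each quoted pair belongs to the low food intake condition. Deciding whether each difference is statistically significant still requires the appropriate statistical test.

```python
# Cell means quoted in the text for the food intake modeling study.
# Assumption: in each pair, the first mean is the low-intake condition.
cell_means = {
    ("unsociable", "low"): 2.14,
    ("unsociable", "high"): 10.63,
    ("sociable", "low"): 6.58,
    ("sociable", "high"): 5.68,
}


def simple_main_effects(means):
    """Return mean differences for each variable at each level of the other."""
    # Effect of confederate food intake (high minus low) at each sociability level.
    intake = {
        s: round(means[(s, "high")] - means[(s, "low")], 2)
        for s in ("sociable", "unsociable")
    }
    # Effect of sociability (sociable minus unsociable) at each intake level.
    sociability = {
        i: round(means[("sociable", i)] - means[("unsociable", i)], 2)
        for i in ("low", "high")
    }
    return intake, sociability


intake_effects, sociability_effects = simple_main_effects(cell_means)
print(intake_effects)       # {'sociable': -0.9, 'unsociable': 8.49}
print(sociability_effects)  # {'low': 4.44, 'high': -4.95}
```

The large difference in the unsociable condition and the small difference in the sociable condition mirror the pattern described in the text.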
Assignment Procedures and Factorial Designs
The considerations of assigning participants to conditions that were discussed in Chapter 8 can be generalized to factorial designs. There are two basic ways of assigning participants to conditions: (1) In an independent groups design, different participants are assigned to each of the conditions in the study and (2) in a repeated measures design, the same individuals participate in all conditions in the study. These two types of assignment procedures have implications for the number of participants necessary to complete the experiment. We can illustrate this fact by looking at a 2 × 2 factorial design. The design can be completely independent groups, completely repeated measures, or a mixed factorial design—that is, a combination of the two.
Independent groups (between-subjects) design. In a 2 × 2 factorial design, there are four conditions. If we want a completely independent groups (between-subjects) design, a different group of participants will be assigned to each of the four conditions. The food intake modeling study illustrates a factorial design with different individuals in each of the conditions. Suppose that you have planned a 2 × 2 design and want to have 10 participants in each condition; you will need a total of 40 different participants, as shown in the first table in Figure 10.7.
Repeated measures (within-subjects) design. In a completely repeated measures (within-subjects) design, the same individuals will participate in all conditions. Suppose you have planned a study on the effects of marijuana: One factor is marijuana (marijuana treatment versus placebo control) and the other factor is task difficulty (easy versus difficult). In a 2 × 2 completely repeated measures design, each individual would participate in all of the conditions by completing both easy and difficult tasks under both marijuana treatment conditions. If you wanted 10 participants in each condition, a total of 10 subjects would be needed, as illustrated in the second table in Figure 10.7. This design offers considerable savings in the number of participants required. In deciding whether to use a completely repeated measures assignment procedure, however, the researcher would have to consider the disadvantages of repeated measures designs.
Number of participants (P) required to have 10 observations in each condition
Mixed factorial design using combined assignment. The Furnham, Gunter, and Peterson (1994) study on television distraction and extraversion illustrates the use of both independent groups and repeated measures procedures in a mixed factorial design. The participant variable, extraversion, is an independent groups variable. Distraction is a repeated measures variable; all participants studied with both distraction and silence. The third table in Figure 10.7 shows the number of participants needed to have 10 per condition in a 2 × 2 mixed factorial design. In this table, independent variable A is an independent groups variable. Ten participants are assigned to level 1 of this independent variable, and another 10 participants are assigned to level 2. Independent variable B is a repeated measures variable, however. The 10 participants assigned to A1 receive both levels of independent variable B. Similarly, the other 10 participants assigned to A2 receive both levels of the B variable. Thus, a total of 20 participants are required.
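The participant counts for the three assignment procedures follow one rule: only the independent groups variables multiply the number of separate groups that must be recruited. The short Python sketch below expresses that rule; the function name `participants_needed` and its parameters are hypothetical, introduced only for illustration.

```python
from math import prod


def participants_needed(levels, between, n_per_condition):
    """Total participants needed for a factorial design.

    levels: number of levels of each independent variable, e.g. (2, 2)
    between: parallel sequence of booleans, True when that variable uses
             independent groups and False when it uses repeated measures
    n_per_condition: observations wanted in every condition
    """
    # Repeated measures variables reuse the same people, so only the
    # independent groups variables add separate groups of participants.
    separate_groups = prod(
        k for k, is_between in zip(levels, between) if is_between
    )
    return n_per_condition * separate_groups


# The three 2 x 2 cases described in the text, with 10 per condition.
print(participants_needed((2, 2), (True, True), 10))    # 40, independent groups
print(participants_needed((2, 2), (False, False), 10))  # 10, repeated measures
print(participants_needed((2, 2), (True, False), 10))   # 20, mixed factorial
```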
Increasing the Number of Levels of an Independent Variable
The 2 × 2 is the simplest factorial design. With this basic design, the researcher can arrange experiments that are more and more complex. One way to increase complexity is to increase the number of levels of one or more of the independent variables. A 2 × 3 design, for example, contains two independent variables: Independent variable A has two levels, and independent variable B has three levels. Thus, the 2 × 3 design has six conditions.
Table 10.2 shows a 2 × 3 factorial design with the independent variables of task difficulty (easy, hard) and anxiety level (low, moderate, high). The dependent variable is performance on the task. The numbers in each of the six cells of the design indicate the mean performance score of the group. The overall means in the margins (rightmost column and bottom row) show the main effects of each of the independent variables. The results in Table 10.2 indicate a main effect of task difficulty because the overall performance score in the easy-task group is higher than the hard-task mean. However, there is no main effect of anxiety because the mean performance score is the same in each of the three anxiety groups. Is there an interaction between task difficulty and anxiety? Note that increasing the amount of anxiety has the effect of increasing performance on the easy task but decreasing performance on the hard task. The effect of anxiety is different, depending on whether the task is easy or hard; thus, there is an interaction.
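The reasoning about marginal means can be checked with a small Python sketch. The cell values of Table 10.2 are not reproduced in this text, so the numbers below are hypothetical, chosen only to match the pattern just described: a main effect of task difficulty, no main effect of anxiety, and an interaction.

```python
from statistics import mean

# Hypothetical mean performance scores for a 2 (task difficulty) x 3
# (anxiety level) design; these are illustrative, not the table's values.
performance = {
    "easy": {"low": 4, "moderate": 7, "high": 10},
    "hard": {"low": 7, "moderate": 4, "high": 1},
}
anxiety_levels = ("low", "moderate", "high")

# Row means (rightmost column of the table): main effect of task difficulty.
row_means = {task: mean(cells.values()) for task, cells in performance.items()}

# Column means (bottom row of the table): main effect of anxiety.
col_means = {
    level: mean(performance[task][level] for task in performance)
    for level in anxiety_levels
}

# Change in performance from low to high anxiety, separately for each task.
anxiety_effect = {
    task: cells["high"] - cells["low"] for task, cells in performance.items()
}

print(row_means)       # {'easy': 7, 'hard': 4}
print(col_means)       # {'low': 5.5, 'moderate': 5.5, 'high': 5.5}
print(anxiety_effect)  # {'easy': 6, 'hard': -6}
```

The row means differ, the column means are identical, and the effect of anxiety reverses sign between the easy and hard tasks, which is the signature of an interaction.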
TABLE 10.2 2 × 3 factorial design
This interaction can be easily seen in a graph. Figure 10.8 is a line graph in which one line shows the effect of anxiety for the easy task and a second line represents the effect of anxiety for the difficult task. As noted previously, line graphs are used when the independent variable represented on the horizontal axis is quantitative—that is, the levels of the independent variable are increasing amounts of that variable (not differences in category).
Increasing the Number of Independent Variables in a Factorial Design
We can also increase the number of variables in the design. A 2 × 2 × 2 factorial design contains three variables, each with two levels. Thus, there are eight conditions in this design. In a 2 × 2 × 3 design, there are 12 conditions; in a 2 × 2 × 2 × 2 design, there are 16. The rule for constructing factorial designs remains the same throughout.
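The rule for counting conditions is simply the product of the numbers of levels, and the conditions themselves are every combination of one level from each variable. A brief Python sketch, with hypothetical helper names `count_conditions` and `list_conditions`, illustrates both points.

```python
from itertools import product
from math import prod


def count_conditions(levels):
    """Number of conditions is the product of the levels of each variable."""
    return prod(levels)


def list_conditions(factors):
    """Enumerate every condition as a dict mapping variable name to level."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


# The design sizes mentioned in the text.
print(count_conditions((2, 3)))        # 6
print(count_conditions((2, 2, 2)))     # 8
print(count_conditions((2, 2, 3)))     # 12
print(count_conditions((2, 2, 2, 2)))  # 16

# Spelling out the six conditions of a 2 x 3 design.
design = {
    "task difficulty": ["easy", "hard"],
    "anxiety level": ["low", "moderate", "high"],
}
for condition in list_conditions(design):
    print(condition)
```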
Line graph of data from 3 (anxiety level) × 2 (task difficulty) factorial design
TABLE 10.3 2 × 2 × 2 factorial design
A 2 × 2 × 2 factorial design is constructed in Table 10.3. The independent variables are (1) instruction method (lecture, discussion), (2) class size (10, 40), and (3) student gender (male, female). Note that gender is a nonmanipulated variable and the other two variables are manipulated variables. The dependent variable is performance on a standard test.
Notice that the 2 × 2 × 2 design can be seen as two 2 × 2 designs, one for the males and another for the females. The design yields main effects for each of the three independent variables. For example, the overall mean for the lecture method is obtained by considering all participants who experience the lecture method, irrespective of class size or gender. Similarly, the discussion method mean is derived from all participants in this condition. The two means are then compared to see whether there is a significant main effect: Is one method superior to the other overall?
The design also allows us to look at interactions. In the 2 × 2 × 2 design, we can look at the interaction between (1) method and class size, (2) method and gender, and (3) class size and gender. We can also look at a three-way interaction that involves all three independent variables. Here, we want to determine whether the nature of the interaction between two of the variables differs depending on the particular level of the other variable. Three-way interactions are rather complicated; fortunately, you will not encounter too many of these in your explorations of behavioral science research.
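One way to see how quickly the list of effects grows is to enumerate it. The sketch below, in which the function name `list_effects` is hypothetical, lists every main effect and interaction that a factorial design with the named variables can yield.

```python
from itertools import combinations


def list_effects(factors):
    """Group all main effects and interactions by how many variables they involve."""
    effects = {}
    for order in range(1, len(factors) + 1):
        # order 1 gives main effects, order 2 two-way interactions, and so on.
        effects[order] = [" × ".join(c) for c in combinations(factors, order)]
    return effects


effects = list_effects(["method", "class size", "gender"])
print(effects[1])  # three main effects
print(effects[2])  # three two-way interactions
print(effects[3])  # one three-way interaction
```

With three variables there are seven effects to interpret; a fourth variable would raise the total to fifteen, which is one reason added variables make results harder to understand.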
Sometimes students are tempted to include in a study as many independent variables as they can think of. A problem with this is that the design may become needlessly complex and require enormous numbers of participants. The design previously discussed had 8 groups; a 2 × 2 × 2 × 2 design has 16 groups; adding yet another independent variable with two levels means that 32 groups would be required. Also, when there are more than three or four independent variables, many of the particular conditions that are produced by the combination of so many variables do not make sense or could not occur under natural circumstances.
The designs described thus far all use the same logic for determining whether the independent variable did in fact cause a change on the dependent variable measure. In the next chapter, we will consider alternative designs that use somewhat different procedures for examining the relationship between independent and dependent variables.
ILLUSTRATIVE ARTICLE: COMPLEX EXPERIMENTAL DESIGNS
As the saying goes, “money can’t buy happiness.” Mogilner (2010) put this idea to an empirical test in a series of three experiments that examined the impact of our thinking on how we spend our time.
Participants in the first experiment were given a scrambled-word task that included words that either primed them to think about money (“sheets the change price”), time (“sheets the change clock”), or nothing in particular (“sheets the change socks”). Then participants were given a list of activities and were asked to indicate their own plans for the day as well as the plans of a typical American. The author concluded that participants primed to think about money (based on the scrambled-word task) focused more on plans to work; in contrast, the participants primed to think about time indicated that they were motivated to engage in social connections.
First, acquire and read the article:
Mogilner, C. (2010). The pursuit of happiness: Time, money, and social connection. Psychological Science, 21, 1348–1354. doi:10.1177/0956797610380696
Then, after reading the article, consider the following:
1. Identify each independent variable in Experiment 1a.
2. Identify each dependent variable in Experiment 1a.
3. What type of assignment procedure was used for Experiment 1a?
4. The author attempted to improve the external validity of the study in Experiment 1b and Experiment 2. Do you think that she was successful? Why or why not?
5. Create a graph for the dependent variable of socializing, with the independent variable prime on the x-axis and separate lines for one’s own plans and plans of others. Describe what you see: Do you see a main effect for either of the independent variables? Do you see the interaction?
6. Create a graph for the dependent variable of work, with the independent variable prime on the x-axis. Describe what you see: Do you see a main effect for either of the independent variables? Do you see the interaction?
Study Terms
Factorial design
Independent groups design (Between-subjects design)
Interaction
IV × PV design
Main effect
Mixed factorial design
Repeated measures design (Within-subjects design)
Simple main effect
1. Why would a researcher have more than two levels of the independent variable in an experiment?
2. What is a factorial design? Why would a researcher use a factorial design?
3. What are main effects in a factorial design? What is an interaction?
4. Describe an IV × PV factorial design.
5. Identify the number of conditions in a factorial design on the basis of knowing the number of independent variables and the number of levels of each independent variable.
1. Research participants read an “eating diary” of either a male or female stimulus person. The information in the diary indicated that the person ate either large meals or small meals. After reading this information, participants rated the person’s femininity and masculinity. (Based on a study by Chaiken and Pliner, 1987.)
a. Identify the design of this experiment.
b. How many conditions are in the experiment?
c. Identify the independent variable(s) and dependent variable(s).
d. Is there a participant variable in this experiment? If so, identify it. If not, can you suggest a participant variable that might be included?
2. The mean femininity ratings were (higher numbers indicate greater femininity): male—small meals (2.02), male—large meals (2.05), female—small meals (3.90), and female—large meals (2.82). Assume there are equal numbers of participants in each condition.
a. Are there any main effects?
b. Is there an interaction?
c. Graph the means.
d. Describe the results in a brief paragraph.
3. Assume that you want 15 participants in each condition of your experiment, which uses a 3 × 3 factorial design. How many different participants do you need for (a) a completely independent groups assignment, (b) a completely repeated measures assignment, and (c) a mixed factorial design with both independent groups assignment and repeated measures variables?
4. Practice graphing the results of the experiment on the effect of amount and frequency of exercise on depression. In the actual experiment, there was a main effect of amount of exercise: Participants in the high exercise (17.5 kcal) condition had lower depression scores after 12 weeks than the participants in the low exercise (7.0 kcal) condition. There was no main effect of frequency of exercise: It did not matter whether exercise was scheduled for 3 or 5 times per week. There was no interaction effect. For this activity, higher scores on the depression measure indicate greater depression. Scores on this measure can range from 0 to 16.
5. Read each of the following research scenarios and then fill in the correct answer in each column of the table.
1. Locate at least two surveys (you can use any survey that you find on the internet). Try to find one that is relatively brief (10 questions or fewer). Analyze the questions in the survey. Construct a table and evaluate each survey question on the following points:
· negative wording
· complexity (note: good questions are simple and straightforward)
· grammatically incorrect
Discuss the importance of writing good survey questions. How can poorly written questions bias results? Submit both the table that you constructed and a copy of the survey you analyzed to your instructor.
2. Create two questions for your instructor. It is preferable that all team members participate; however, at least two active members of the team must engage in discussing a topic, showing that they co-created the one to two questions submitted to me, in order to earn participation points.
CONDITIONS: The two questions must relate to this week’s readings and learning objectives/competencies. Ask two questions of your instructor that are directly related to the learning objectives/competencies with which you want more assistance, to strengthen your comprehension of research methods in our psychology program class.
Week Four Homework Exercise (PSYCH/610 Version 2)
Answer the following questions, covering material from Ch 8–10 of Methods in Behavioral Research:
1. What is a confounding variable and why do researchers try to eliminate confounding variables? Provide two examples of confounding variables.
2. What are the advantages and disadvantages of posttest only design and pretest-posttest design?
3. What is meant by sensitivity of a dependent variable?
4. What are the differences between an independent groups design and a repeated measures design?
5. How do an experimenter’s expectations and participants’ expectations affect outcomes?
6. Provide an example of a factorial design. What are the key features of a factorial design? What are the advantages of a factorial design?
7. Describe at least four different dependent variables.
8. What are some ways researchers can manipulate independent variables?
9. What is the difference between main effects and interactions?
10. How do moderator variables impact results? Provide an example.
11. A researcher is interested in studying the effects of story endings on preference ratings. He randomly assigns participants into two groups: predictable ending or surprise ending. He instructs them to read the story and provide preference ratings. The experimenter’s variation of story endings is a __________ (straightforward or staged) manipulation.
12. A researcher was interested in investigating the vocabulary skills of 6th graders in a program for gifted students. She gave a group of participants a test of vocabulary that was aimed at the 7th-grade level. She quickly discovered that there was limited variability in the scores because nearly all the students answered 90% or more of the questions correctly. This outcome is called a _______ effect.
Research Evaluation Worksheet (PSYCH/610 Version 2)
Select a research article of interest to you, preferably related to your Research Proposal, and use the Research Evaluation Worksheet to analyze the article. You can use this information to help you form the literature review section of your research proposal.
Research Evaluation Worksheet
Full Article Reference (APA style):
a. Is the need for the study clearly stated in the introduction? Explain by using information presented in the literature review.
b. What is the research hypothesis or question?
c. What are the variables of interest (independent and dependent variables)?
d. How are the variables operationally defined?
a. Sample Size (Total): ________________ Size Per Group/Cell: _______________
b. Were the methods and procedures described so that the study could be replicated without further information? What information, if any, would you need to replicate or reproduce this study?
a. How were participants selected and recruited?
b. Were subjects randomly selected?
c. Were there any biases in sampling? Explain.
d. Were the samples appropriate for the population to which the researcher wished to generalize?
e. What are the characteristics of the sample populations?
Research Design (check which design applies)
_______ Single group, time series study
_______ Multiple baseline (sequential) design: ______________
_______ Single group, no measurement
_______ Single group with measurement: Pre ______ During _____ Post _____
_______ Two groups classic experimental versus control group, randomly assigned
_______ Two groups (quasi-experimental): experimental versus control group, not randomly assigned
_______ Correlation research, not manipulated, degree of relationship
_______ Descriptive research (qualitative study)
_______ Natural observation
_______ Analytical research
_______ Interview research
_______ Historical study
_______ Survey research
_______ Legal study
_______ Ethnography research
_______ Policy analysis
_______ Fieldwork research
_______ Evaluation study
_______ Grounded theory
_______ Protocol analysis (collection and analysis of verbatim reports)
_______ Case study, no measurement
_______ Case study, with measurement: Pre _________ During _______ Post _________
_______ Developmental research
_______ Longitudinal (same group of subjects over period of time)
_______ Cross-sectional (subjects from different age groups compared)
_______ Cross-sequential (subjects from different age groups, shorter period of time)
_______ Correlation, more than two groups: control, treatment, and other treatment comparisons
_______ Factorial design, two or more groups: other treatment differences, no untreated controls
_______ Two or more dependent variables (MANOVA)
_______ Other design: __________________________________________________________
Consider the Following Questions:
a. Was a control group used? Yes ______ No ______ If yes, complete b, c, and d below.
b. Was the “control” method for the study appropriate?
c. What variable was being controlled for?
d. In the case of an experimental study, were subjects randomly assigned to groups?
a. Describe the Dependent Measure(s)/Instruments used:
b. Describe the Measurement/Instrument Validity Information:
c. Describe the Measurement/Instrument Reliability Information:
Consider the Following Questions:
a. For all measures (measures to classify subjects, dependent variables, etc.), was evidence of reliability and validity provided, either through summarizing the data or by referring the reader to an available source for that information?
b. Do the reliability and validity data justify the use of the measure?
c. Are the measures appropriate (if not, why not)?
d. Are multiple measures used, particularly those that sample the same domains, or constructs but with different methods (e.g., self-report, rating scales, self-monitoring, or direct observation)?
f. If human observers, judges, or raters were involved, was inter-observer or inter-rater agreement (reliability) assessed? Was it obtained for a representative sample of the data? Did the two raters do their ratings independently? Was their reliability satisfactory?
Independent and Dependent Variables
a. What is/are the Independent Variable(s):
b. What is/are the Dependent Variable(s):
Scales of Measurement (check those that apply):
Nominal _______ Ordinal _______ Interval _______ Ratio _______
a. What types of statistical techniques are used?
b. What types of tables and graphs are used?
Consider the Following Questions:
a. Were tests of significance used and reported appropriately (e.g., with sufficient detail to understand what analysis was being conducted)?
b. Do the researchers report means and standard deviations (if relevant) so that the reader can examine whether statistically significant differences are large enough to be meaningful?
c. Other comments on the reported statistical analyses?
Evaluate the Summary and Conclusions of the study (Usefulness):
Describe the Strength(s) and Limitation(s) of the Study:
Describe what you learned from the study:
List any remaining questions you have about the study:
*Adapted from form created by Dr. Randy Buckner, University of Phoenix Instructor