Sampling and Sample Size
As research advisor to Dr. Thompson, who plans to conduct a simple experiment to test the effectiveness of concept maps for his population, I consider it critical to identify the correct target for his study and to design a sound sampling plan, which is presented later in this paper. Dr. Thompson's research idea is to test statistically whether concept mapping helps freshman students raise the scores they obtain after completing the fifth learning activity. Although Dr. Thompson identifies six learning activities, he is specifically interested in whether the fifth activity's remedial scores can be improved by concept mapping. Consequently, only the fifth activity's results will be examined. Dr. Thompson also wishes to use an independent two-sample t-test for his experiment.
Choudhury (2009) reminds the researcher to identify the null and alternative hypotheses first. Dr. Thompson's hypotheses could be represented by:
H0: μ(Group 1: Fifth Activity Scores of Concept Mapping Trained) = μ(Group 2: Fifth Activity Scores of Untrained), meaning that the mean score of the concept mapping trained students after the fifth activity is equal to the mean score of the untrained students after the fifth activity.
H1: μ(Group 1: Fifth Activity Scores of Concept Mapping Trained) ≠ μ(Group 2: Fifth Activity Scores of Untrained), meaning that the two mean scores differ.
As an advisor to Dr. Thompson, I also need to review the information relevant to concept mapping so that understanding and agreement are reached on Dr. Thompson's theory, which is that concept mapping could improve remedial scores. Novak and Cañas (2008) explained that concept mapping enables students to learn structurally as they identify new concepts "in a hierarchical fashion with the most inclusive, most general concepts at the top of the map and the more specific, less general concepts arranged hierarchically below" (para 2). Concept mapping also includes cross-links to other concepts, and reveals a relationship (creative leap) from one unit of knowledge to another. Concept mapping was found to facilitate "meaningful learning and the creation of powerful knowledge frameworks that not only permit utilization of the knowledge in new contexts, but also the retention of the knowledge for long periods of time" (Novak & Cañas, 2008, para 18). Based upon this information, the advisor and Dr. Thompson could agree that concept mapping could improve learning, and that remedial scores may increase.
To support Dr. Thompson's choice of statistical test, a review of Choudhury (2009) verifies the test's applicability. Choudhury (2009) wrote that "the independent two-sample t-test is used to test whether population means are significantly different from each other, using the means from randomly drawn samples" (para 1). Although Choudhury (2009) recommended using the z-test for sample sizes greater than 30 (the sample size stated later is 65 for each group), the independent two-sample t-test remains the test of choice because Dr. Thompson requested it and because, with samples this large, the t-test and the z-test give nearly identical results.
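To make the test concrete, the following minimal sketch computes the pooled-variance (equal-variance) form of the independent two-sample t statistic. The scores are invented purely for illustration and the function name `pooled_t_statistic` is hypothetical; nothing here comes from Dr. Thompson's data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_1, group_2):
    """Return (t, degrees_of_freedom) for an equal-variance two-sample t-test."""
    n1, n2 = len(group_1), len(group_2)
    # The pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group_1)
                  + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group_1) - mean(group_2)) / standard_error
    return t, n1 + n2 - 2


# Hypothetical fifth-activity scores (assumed values, not real data).
trained_scores = [78, 82, 85, 90, 75]
untrained_scores = [70, 74, 80, 72, 69]

t_value, df = pooled_t_statistic(trained_scores, untrained_scores)
print(f"t = {t_value:.3f}, df = {df}")
```

With two groups of 65, the same calculation would have 65 + 65 - 2 = 128 degrees of freedom, and the resulting t value would be compared against the critical value for a two-tailed test at the chosen significance level.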
Sampling Plan
Westat (2000) wrote that "there are two methods for randomly selecting a sample of members…the simple random sampling method selects members from a list of all members. The systematic sampling method useful for sampling historical records selects members based on a sampling rate, such as, every fifth freshman student undergoing the six learning activities" (p. 9). Based on the above information, the plan includes the recommendation to use the simple random sampling method.
Simple Random Probability Sampling Method.
Of the approximately 150 students who have scores at the remedial level, and who are proceeding through the six learning activities, two groups of 65 are required for the test based upon the result of the G*Power analysis (below) and Cohen's (1992) guidelines (below). One hundred thirty members of the population would be randomly selected prior to training. Of these 130 members, 65 would be randomly chosen to be trained in concept mapping. The other 65 members would not be trained.
After the results of the questionnaire (see the moderating variable section below) are tallied, and final sample sizes confirmed, a simple random process would be used. The names of all of the approximately 150 students would be listed, and each would be assigned a number. The researcher would then select at random 65 numbers for Group 1, and 65 numbers for Group 2. Systematic sampling is not preferred because the process, although often used, can violate randomness: not every member has an equal chance of being selected (Salkind, 2009). Furthermore, stratified sampling is not recommended because the experiment does not use subgroups, e.g., based upon age or gender, and, therefore, layers in the population are not being tested (Salkind, 2009).
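The selection procedure described above can be sketched as follows. This is an illustration under stated assumptions: the student numbers 1 to 150 stand in for the real list of names, the function name `draw_and_assign` is hypothetical, and the fixed seed is used only so that the draw can be reproduced.

```python
import random


def draw_and_assign(student_ids, group_size, seed=None):
    """Draw 2 * group_size students at random and split them into two groups."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no student is selected twice,
    # and it returns the students in random selection order.
    selected = rng.sample(student_ids, 2 * group_size)
    # Because the order is already random, the first half can serve as the
    # trained group and the second half as the untrained group.
    return selected[:group_size], selected[group_size:]


# Assumed numbering of the approximately 150 remedial students.
student_ids = list(range(1, 151))
trained, untrained = draw_and_assign(student_ids, group_size=65, seed=2012)

print(len(trained), len(untrained))
```

Every student on the list has the same chance of being drawn, which is the property that distinguishes this approach from systematic sampling.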
Errors in Sampling.
Errors in choosing the samples would be minimized because of the questionnaire's utility in eliminating any interacting variables, which would also minimize diversity within the samples. The samples would be representative of the population. A sample size of 65 per group, as recommended by the G*Power analysis, helps reduce sampling error because of its larger size. In addition, the extraneous variables noted below are not part of the experiment, which greatly minimizes differences between the two samples (Salkind, 2009).
Extraneous Variables.
The extraneous variables that could be tested, and could, therefore, be related to the dependent or independent variables (Salkind, 2009), but are not related to Dr. Thompson's experiment include gender, remedial score results in reading, two other high schools, and age. Although Dr. Thompson noted that he (1) has more data, (2) has made observations about the difference in learning processes between males and females, (3) has found that the gender split of the students is almost exactly the same as the proportion of males and females in the freshman class, and (4) knows that the college has a strong nontraditional student following, these details are not being examined under Dr. Thompson's hypotheses.
Moderating Variables.
Moderating (interacting) variables are related to the dependent or independent variables, and could have an impact on the dependent variables (Salkind, 2009), which here are the fifth activity "attacking word problems" and the fifth activity test scores. The moderating variables to consider include the possibilities that (1) not all trained students obtain the same level or quality of skill in concept mapping (so that scores after the fifth activity are not as good as expected), (2) some of the untrained sample had previous knowledge of concept mapping, (3) some members of both samples had previously undergone the fifth activity and were repeating it, and (4) a large variance in the pre-training remedial scores could offset the test results unless sampling is carefully designed.
Sample Size
Available population is about 150; recommend using 65 for Group A (trained in concept mapping), and 65 for Group B (untrained in concept mapping). Both groups will complete and be tested on Activity 5.
Independent Variables
Concept Mapping Trained: Group A, Untrained: Group B
Dependent Variables
Fifth Activity "attacking word problems", and Test Scores of Fifth Activity
Other Information Needing to be Collected for Sampling Plan Development
The recommendation for further data collection is that the members of the two samples complete questionnaires prior to beginning the fifth activity. (Future testing of the usefulness of concept mapping in improving scores in all five activities would require training a sample before the activities begin.) The questionnaire would serve to factor out the effects of the moderating variables on the experiment's results. For example, students who received concept mapping training prior to college admittance would be eliminated from the experiment because (1) if a student is assigned to the sample of untrained members, and had received concept mapping training, this student's scores would bias the experiment's results, and (2) the concept mapping training given to the experiment's sample may be incongruent with the training students had received in the past, and would bias the experiment's results. The questionnaire would also aid in removing any restricting (control) variables. Once the questionnaire's results are examined and recorded, the sample sizes may need to be lowered because the above restricting variables may cause elimination of some of the sample's members.
Dr. Thompson could also consider that, since the experiment already requires an investment of time, training in concept mapping, and data collection, conducting the experiment for each of the five activities, rather than just one, may help to improve learning in each activity. Dr. Thompson could also consider adding layers to the experiment so that gender differences are examined. For example, Dr. Thompson had noted that the gender proportions of the students were almost the same as those of the freshman class, and examining gender subgroups that mirror the class in this way would provide good testing validity. Considering subgroups arriving from different high schools, and/or subgroups of different ages could also be insightful for Dr. Thompson's experiment. For example, such an experiment could show that students have higher scores when coming from a specific high school, or that older students test higher in the initial remedial tests (or in activities) because older students bring more experiential knowledge to the table. Consequently, before recommending a sample plan to Dr. Thompson, other considerations could be reviewed because a more complex experiment could yield more useful knowledge.
Using the G*power website, calculate the sample size that will provide an appropriate level of power and appropriate effect size for your research. Briefly discuss the outcome of that power analysis.
I initially planned to use 64 in Group A (trained in concept mapping), and 64 in Group B (untrained) according to Cohen's (1992) guidelines, which explained that "to detect a medium difference between two independent sample means (d = .50) at α = .05 requires N = 64 in each group" (p. 158). The available population is about 150; therefore, the recommendation to Dr. Thompson (prior to using G*Power) would be to use 64 for Group A (trained in concept mapping), and 64 for Group B (untrained). This group size was checked after downloading G*Power 3.1.3, and using the a priori power analysis for a t test for the difference between two independent means, which calculated the sample size at 65 for each group. The calculation also reported an Actual power (1-β err prob) = .916, and an Effect Size (ES) d = 0.5. The new recommended size for each group is 65.
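The kind of a priori calculation described above can be approximated by hand. The sketch below uses the standard normal approximation for a two-tailed test, n per group ≈ 2((z for 1-α/2 + z for power)/d)², and is not G*Power's exact procedure, which works with the noncentral t distribution. The function name `n_per_group` is hypothetical, and the default power of .80 is an assumption chosen to match the power level behind Cohen's (1992) figure of 64.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-tailed independent two-sample test."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power
    # Normal approximation; an exact t-based calculation gives a slightly larger n.
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


n_medium = n_per_group(0.5)  # medium effect, alpha = .05, power = .80
print(n_medium)
```

Under these assumptions the approximation yields 63 per group, one fewer than Cohen's exact value of 64, which is the expected direction of error for a normal approximation. Raising the `power` argument raises the required group size.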
As noted above, the calculation reported an actual power (1-β err prob) of .916 for an effect size (ES) of d = 0.5. If the true effect is at least that large, there is roughly a 92% chance that the study will yield a statistically significant result with the sample size obtained, so I would recommend that the study proceed: a false null hypothesis is very likely to be rejected. High power means that the chance of a Type II error (concluding there is no effect when, in fact, there is one) is small, about 8%. The chance of a Type I error (declaring a difference that does not exist) is set separately by the significance level, α = .05, and does not depend on power.
Regarding the effect size (ES) d = 0.5: the larger the number, the stronger the effect of the experimental treatment. Under Cohen's (1992) conventions, an effect size of about .20 indicates a relatively small effect, about .50 a medium effect, and about .80 a large effect. The value of 0.5 is an input to the a priori analysis rather than a result of it, so the analysis indicates that groups of 65 are large enough to detect a medium-sized treatment effect.
Heinrich Heine Universitat Dusseldorf (2012) explained that the "sample size N is computed as a function of the required power level (1-β), the pre-specified significance level α, and the population effect size to be detected with probability (1-β)" (para 1). The a priori analysis offers an efficient procedure for "controlling statistical power before a study is actually conducted, and can be recommended whenever resources such as time and money required for data collection are not critical" (Heinrich Heine Universitat Dusseldorf, 2012, para 2). According to the information provided, which did not disclose any issues with resources, the power analysis would be useful for Dr. Thompson's study, especially because it suggests a sample size for each group similar to Cohen's (1992) recommendation.
References:
Arsham, H. (2012). Statistical thinking for managerial decisions, 9th ed. Retrieved from http://home.ubalt.edu/ntsbarsh/stat-data/Javastat.htm
Choudhury, A. (2009). Independent two sample t-test. Retrieved from http://www.experiment-resources.com/independent-two-sample-t-test.html#ixzz1klACqyoM
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159. Retrieved from http://classes.deonandan.com/hss4303/2010/cohen%201992%20sample%20size.pdf
Heinrich Heine Universitat Dusseldorf. (2012). A priori power analyses. Retrieved from http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/
Novak, J. D., & Cañas, A. J. (2008). The theory underlying concept maps and how to construct and use them. Retrieved from http://cmap.ihmc.us/publications/researchpapers/theorycmaps/theoryunderlyingconceptmaps.htm
Salkind, N. (2009). Exploring research, 7th ed. Upper Saddle River, New Jersey: Pearson Education Ltd.
Westat. (2000). Performance outcomes measures project for the administration on aging. Implementation guide: General sampling plan. Retrieved from http://www.aoapomp.net/pompI/smp_plan.pdf