PUBLICATIONS - RESEARCH DESIGN AND STATS

Statistical Methods

Descriptive Statistics

Distribution

Measures of Central Tendency

Shape

Measures of Variability

Measures of Correlation

Inferential Statistics

Statistical Significance

Effect Size

Methods of Statistical Analysis

t-Test

Analysis of Variance

Multiple Regression

Multivariate Analysis

Statistical Significance vs. Practical Significance


Statistical Methods

Statistical analysis provides researchers with a means of translating data collected from a sample into numerical expressions that represent the characteristics of the sample. Mathematical principles are applied to data in an effort to objectively determine whether change occurred as a result of intervention or treatment. For many rehabilitation professionals, the study of statistics is a “necessary evil” and is generally the least favored coursework of those enrolled in graduate or continuing education programs.

For the purposes of this chapter, an overview of basic statistical methods will help life care planners understand how research conclusions were reached and how findings may, or may not, relate to a specified area of interest. This section will reorient you to the basic statistical methods commonly utilized in rehabilitation and medical research. Two types of statistics are generally performed on research data: descriptive and inferential.

Descriptive Statistics

Descriptive statistics help researchers to organize, summarize, and visualize the data collected from samples. At first, researchers are interested in learning how the response variable(s) are distributed.

Distribution

The set of scores or data collected from subjects is referred to as the “distribution.” The distribution of data may be displayed by creating an ordered list of scores (i.e., from high scores to low scores), a frequency distribution (i.e., a tally of the raw scores), or a histogram (i.e., a bar graph) to visually summarize the data. Once the distribution has been established, additional descriptive analyses may occur.
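
As a rough illustration of the three displays described above, the following Python sketch builds an ordered list, a frequency distribution, and a simple text histogram. The scores are hypothetical values invented for this example, not data from any study.

```python
from collections import Counter

# Hypothetical raw scores from ten subjects (illustrative values only).
scores = [7, 9, 7, 5, 8, 7, 9, 6, 8, 7]

# Ordered list, from high scores to low scores.
ordered = sorted(scores, reverse=True)
print("Ordered:", ordered)

# Frequency distribution: a tally of how often each raw score occurs.
frequency = Counter(scores)

# Text histogram: one row per distinct score, one mark per subject.
for score in sorted(frequency):
    print(f"{score:>2} | {'#' * frequency[score]}")
```

The longest row of marks identifies the most frequent score, which is the mode discussed in the next section.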

Measures of Central Tendency

Measures of central tendency include the mean (i.e., the average performance of all subjects), the median (i.e., the central score, which divides the ordered scores into two equal halves), and the mode (i.e., the most frequently attained score observed in all subjects). Each value provides the researcher with slightly different information, but all describe the “central tendency” of the participants. For example, the mean is affected by extreme scores, so in distributions having many extreme values the median may be a more accurate measure of central tendency (i.e., the median is not affected by extreme scores). If the distribution of scores is perfectly normal (i.e., a “bell curve”), the mean, median, and mode will be identical.
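
The effect of an extreme score on the mean, and its lack of effect on the median, can be seen in a short Python sketch. The scores below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import statistics

# Hypothetical scores from five subjects (illustrative values only).
scores = [4, 5, 5, 6, 7]

# The same scores plus one extreme score.
with_extreme = scores + [45]

mean_a = statistics.mean(scores)          # 27 / 5 = 5.4
median_a = statistics.median(scores)      # middle of the ordered scores: 5
mode_a = statistics.mode(scores)          # most frequent score: 5

mean_b = statistics.mean(with_extreme)      # 72 / 6 = 12
median_b = statistics.median(with_extreme)  # (5 + 6) / 2 = 5.5

print(f"Without extreme score: mean={mean_a}, median={median_a}, mode={mode_a}")
print(f"With extreme score:    mean={mean_b}, median={median_b}")
```

One extreme score more than doubles the mean, while the median moves by only half a point.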

Shape

A normal distribution produces a bell curve when scores are graphically displayed. Life care planners should note that this “normal” distribution is a mathematical model and is rarely reproduced exactly by the data from actual studies. Many descriptive and inferential statistics are based upon this model of normal distribution, but specific correction techniques may be employed to account for a certain degree of departure from normality. In reality, the shape of distributions may be bimodal, asymmetrical, skewed, flat, or otherwise dissimilar from a bell-shaped curve.

Measures of Variability

Range and standard deviation describe the extent to which scores are dispersed across the distribution. The range is the difference between the highest and lowest scores. Because it depends on only two scores, the range is a rough measure, so researchers also calculate the standard deviation (the square root of the variance), which reflects the typical deviation of individual scores from the mean score. In short, the mean and standard deviation of a distribution help the researcher to identify the average score and the average variability of scores around the mean.
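
A minimal Python sketch of these calculations follows. The scores are hypothetical, and the sketch treats them as the entire group of interest (a "population" calculation); when scores come from a sample, researchers usually divide by one less than the number of scores, which `statistics.stdev` does.

```python
import statistics

# Hypothetical scores from eight subjects (illustrative values only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Variance: the average squared deviation of each score from the mean.
variance = statistics.pvariance(scores)

# Standard deviation: the square root of the variance.
standard_deviation = statistics.pstdev(scores)

# Sample version, which divides by (n - 1) instead of n.
sample_standard_deviation = statistics.stdev(scores)

print(f"Mean: {statistics.mean(scores)}")
print(f"Range: {score_range}")
print(f"Variance: {variance}")
print(f"Standard deviation: {standard_deviation}")
```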

Imagine a normal distribution: a bell-shaped curve. Draw a line directly in the center of the “bell” at the highest point of the curve. This line represents the mean, where 50% of the scores fall below and 50% of the scores fall above. Now, continue dividing the “bell” into equal parts: four parts above the mean, and four parts below the mean. Label each of the dividing lines as illustrated below, where the percent row shows the share of scores falling below each line:

SD:        -4     -3     -2     -1      0     +1     +2     +3     +4

Percent:   --    0.1%    2%    16%    50%    84%    98%   99.9%    --

Theoretically speaking (i.e., in a normal distribution), approximately 68% of the scores lie within plus and minus one standard deviation of the mean, and approximately 95% of the scores lie within plus and minus two standard deviations of the mean.
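
These percentages follow directly from the mathematical model of the normal curve. The sketch below uses Python's standard `statistics.NormalDist` to reproduce them; the variable names are chosen for this illustration only.

```python
from statistics import NormalDist

# The standard normal curve: mean of 0, standard deviation of 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Share of scores within one and two standard deviations of the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_two_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

# Percent of scores falling below each standard deviation line.
percent_below = {sd: standard_normal.cdf(sd) * 100 for sd in range(-3, 4)}

print(f"Within 1 SD: {within_one_sd:.1%}")
print(f"Within 2 SD: {within_two_sd:.1%}")
for sd, percent in percent_below.items():
    print(f"Below {sd:+d} SD: {percent:.1f}%")
```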

Note: Keep in mind that the normal distribution is not a fact of nature, but is a mathematical model, only!

Measures of Correlation

Measures of correlation describe the extent to which two variables are related, or co-vary with one another. Correlation statistics indicate the magnitude (i.e., strength) and direction (i.e., positive or negative) of the relationship between two variables. Correlation coefficients range from -1.0 to +1.0. A coefficient of +1.0 indicates a perfect positive relationship (i.e., an increase in one variable is accompanied by a proportional increase in the other variable), -1.0 indicates a perfect negative relationship (i.e., an increase in one variable is accompanied by a proportional decrease in the other), and 0 indicates no linear relationship. Whether the data are graphed on a scatterplot or analyzed statistically (e.g., with contingency tables), the correlation coefficient is an index of the linear relationship between the variables.

A common example of correlation is when height and weight are considered. In most cases, individuals who are taller also weigh more than those who are shorter. Of course there are exceptions, so there is not a +1.0 correlation between these variables. The actual observed relationship is +0.8; less than perfect, but a strong linear relationship (Bellini & Rumrill, 1999).

The correlation coefficient does not tell the researcher whether statistical significance has been achieved; it only serves to quantify the relationship that exists.
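
As a sketch of how a correlation coefficient is calculated, the Python function below computes the Pearson coefficient, the most common index of linear relationship. The function name `pearson_r` and the height and weight values are hypothetical and serve only as an illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # How much the two variables vary together.
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # How much each variable varies on its own.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / math.sqrt(ss_x * ss_y)

# Hypothetical heights (cm) and weights (kg) for six individuals.
heights = [150, 160, 165, 170, 180, 185]
weights = [52, 61, 58, 72, 75, 90]

r = pearson_r(heights, weights)
print(f"Correlation between height and weight: {r:+.2f}")
```

A positive result below +1.0 reflects the pattern described above: taller individuals tend to weigh more, with exceptions.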

Inferential Statistics

Inferential statistics are based on probability theory and are used to calculate the degree to which results derived from the sample can be generalized to the target population. Empirical data are translated into probability statements which are used to infer the relationship between variables within a target population (based upon what was observed in the sample). Put another way, tests of statistical significance estimate how likely it is that findings like those produced by the sample would have arisen by chance alone if no relationship existed within the target population.

A Note About Inferential Statistics:

*Statistical tests do not confirm that the research hypotheses are true.

*Statistical tests do not guarantee that the same results will be obtained if replicated (Cohen, 1990).

Statistical Significance

In the social sciences, a result is typically considered statistically significant when the probability (i.e., the p value) of obtaining it by chance alone is less than 5%, or 0.05. When a p value is reported as being less than or equal to 0.05, the researcher interprets it to mean that there is likely to be a relationship between the variables under investigation within the target population. Conversely, when a p value is found to be more than 0.05, the researcher concludes that the data do not provide sufficient evidence of a relationship between the variables within the target population.

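*In other words, a statistically significant result (at the p<.05 level) means that, if there were truly no relationship between the variables within the target population, results at least as strong as those observed would be expected in fewer than 5% of samples. It does not mean that there is a 95% probability that the findings are true.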

Researchers may set the significance level at any value, but most use 0.05, with 0.01 for a more stringent test or 0.10 for a less stringent test of significance. Determining the statistical significance of the data provides researchers with a stated level of confidence when judging whether the results of the study were due to chance or to the treatment.
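
One way to see what a p value represents is to ask how often chance alone would produce a difference as large as the one observed. The Python sketch below does this with an exact permutation test, a technique chosen here for illustration because it needs no statistical tables; it is not one of the tests described later in this chapter. The scores and group names are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical outcome scores (illustrative values only).
treatment = [24, 27, 29, 31, 33]
control = [20, 22, 25, 26, 28]

observed = mean(treatment) - mean(control)

# If the treatment made no difference, group labels are arbitrary.
# Re-divide the ten scores into every possible pair of groups and
# count how often the difference is at least as large as observed.
pooled = treatment + control
n_treat = len(treatment)
n_control = len(pooled) - n_treat
total = sum(pooled)

n_splits = 0
as_extreme = 0
for indices in combinations(range(len(pooled)), n_treat):
    sum_a = sum(pooled[i] for i in indices)
    difference = sum_a / n_treat - (total - sum_a) / n_control
    n_splits += 1
    # Small tolerance guards against floating-point rounding.
    if abs(difference) >= abs(observed) - 1e-12:
        as_extreme += 1

p_value = as_extreme / n_splits
alpha = 0.05  # significance level chosen by the researcher

print(f"Observed mean difference: {observed:.1f}")
print(f"p value: {p_value:.3f}")
print("Statistically significant" if p_value <= alpha else "Not statistically significant")
```

Changing `alpha` to 0.01 or 0.10 shows how the same data can lead to different conclusions under a more or less stringent test.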

A Note About Inferential Statistics:

*“The only way to know for certain the actual nature of the relationship between these variables in population of interest is to sample every member of the population, an impossible task in nearly every instance of research” (Bellini & Rumrill, 1999).

Effect Size

Sample size (i.e., the number of subjects participating in a study) has an enormous effect on tests of statistical significance. If the sample size is large, statistical tests may find even very small correlations statistically significant, simply because the number of subjects makes it appear as though the results were due not to chance but to the treatment. As may be imagined, this fact has created a great deal of confusion and misinterpretation in the research literature. Life care planners should be aware of this fact and critically review the conclusions drawn from large sample sizes (Cohen, 1999; Hunter & Schmidt, 1990).
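
The influence of sample size can be sketched in Python by testing the same very weak correlation with a small and a large sample. The correlation value, the sample sizes, and the function name are hypothetical. As a simplifying assumption, the sketch uses the normal curve in place of the t distribution, which is rough for small samples and close for large ones.

```python
import math
from statistics import NormalDist

def approx_p_for_correlation(r, n):
    """Approximate two-sided p value for a correlation r from n subjects."""
    # Test statistic for a correlation coefficient.
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    # Assumption: the normal curve stands in for the t distribution.
    return 2 * (1 - NormalDist().cdf(abs(t)))

r = 0.05  # hypothetical, very weak correlation

p_small_sample = approx_p_for_correlation(r, 20)
p_large_sample = approx_p_for_correlation(r, 10_000)

print(f"n = 20:     p is about {p_small_sample:.3f}")
print(f"n = 10,000: p is about {p_large_sample:.7f}")
```

The relationship is equally weak in both cases; only the number of subjects differs.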

Hunter and Schmidt (1990) propose the following exercise:

Imagine that you reviewed all of the research studies regarding a specific counseling technique or therapy and tallied the number of studies that concluded that the intervention “worked” and those that concluded that the intervention did not “work.” After reading these conflicting reports a student may determine that the evidence in favor of the intervention is inconclusive and does not have merit. Is the student correct? Possibly, but upon closer review the student recognizes that the studies used samples of varying sizes and probabilities of varying values. Should this change the student’s mind?

Limitations in the sensitivity of significance tests, and the practice of using them as the only measure of results, have led to the development of alternatives such as “effect size measures.” Effect size refers to the proportion of variance in one variable (or a set of variables) that is accounted for by another variable (or set of variables) (Cohen, 1988). The d statistic is one measure of effect size; it expresses the difference between two group means in standard deviation units. Because it is expressed in standard deviation units, the d statistic allows research findings from studies with different sample sizes and outcome measures to be compared directly. Researchers may report the d statistic of the data to facilitate cross-study comparisons (Bellini & Rumrill, 1999).
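
A minimal Python sketch of the d statistic follows. It assumes the common form of the calculation, in which the mean difference is divided by the pooled standard deviation of the two groups; the function name `cohens_d` and the scores are hypothetical.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Mean difference between two groups in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: a weighted average of the two sample variances.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)

# Hypothetical outcome scores (illustrative values only).
treatment = [2, 4, 6]
control = [1, 3, 5]

d = cohens_d(treatment, control)
print(f"d = {d:.2f}")  # the group means differ by half a standard deviation
```

Unlike a p value, d does not grow simply because more subjects are added; it changes only when the mean difference or the spread of scores changes.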

Methods of Statistical Analysis

There are many statistical techniques by which data are analyzed. Bellini and Rumrill (1999) note, “Methods are tools, and the methods of statistical analysis are meaningful only when they are applied within an appropriately designed study and interpreted within the theoretical context of the research question.”

The following statistical tests are commonly utilized in rehabilitation and the social sciences: the t-test, analysis of variance (ANOVA), multiple regression, and multivariate analysis. Each of these tests may sound complicated, but each is readily understood when its assumptions are known.

t-Test

One of the least complicated statistical analyses to perform is the t-test. This statistic measures the mean difference between two groups, usually the experimental and control groups. The t-test is one method used by researchers to determine whether the mean difference between groups is large enough to be considered “significant” or whether the results were likely due to chance.

Consider the following scenario which is typical of research studies in rehabilitation sciences:

Mary has developed a twelve-week vocational adjustment and training program for adults who have sustained a physical injury requiring them to locate employment outside of their field of expertise. She wants to test her program to ascertain whether the individuals who complete it will exhibit higher levels of psychological adjustment than those who participated in the traditional program.

Mary gains the cooperation and consent of a group of individuals seeking vocational assistance, and randomly assigns them to two groups. One group will participate in Mary’s twelve-week program and the others will receive traditional vocational counseling and guidance. The study is initiated and after twelve weeks, all participants complete a self-report questionnaire.

Mary expects that the mean psychological adjustment scores of the individuals who participated in her vocational program will be higher than those who completed the traditional program.

Mary looks at between-group differences because she believes that the mean scores of these two groups will be unequal due to the benefits of her vocational adjustment program. She also realizes that individuals within each group will be different from one another (this is a fact of most research studies) so she must analyze the data for within-group differences.

The t-test will provide an analysis of the ratio of between-group differences to within-group differences. If the ratio of these differences is large enough, statistical significance is achieved. In other words, the t-test is applied to the data in an effort to determine whether the between-group difference is large enough, relative to the within-group differences, that a researcher is able to attribute it to a treatment or intervention, rather than to chance.

In reality, the t-test identifies significance by analyzing three factors:

*the magnitude of the treatment or intervention effect (between-group difference),

*the degree of variance within each group (within-group difference), and

*the sample size

The best scenario for a researcher hoping for statistical significance is when the effect of the treatment/intervention is large (substantial between-group scores), when there is little variability among individual scores within each group, and when a large sample has been obtained.
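
The three factors can be seen in a short Python sketch of the t statistic. It assumes the classic independent-samples form with a pooled variance; the function name `pooled_t` and the adjustment scores for Mary's two groups are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """t statistic for two independent groups, using the pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    # Between-group difference: the gap between the two group means.
    between = mean(group_a) - mean(group_b)
    # Within-group difference: the pooled variability inside the groups.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    # Sample size enters here: more subjects shrink the standard error.
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return between / standard_error

# Hypothetical psychological adjustment scores (illustrative values only).
program_scores = [72, 75, 78, 80, 85]
traditional_scores = [65, 70, 72, 74, 79]

t = pooled_t(program_scores, traditional_scores)
print(f"t = {t:.2f}")

# The same pattern of scores from four times as many subjects gives a larger t.
t_larger_sample = pooled_t(program_scores * 4, traditional_scores * 4)
print(f"t with four times the subjects = {t_larger_sample:.2f}")
```

The resulting t value is then compared with a critical value, from a table or statistical software, to decide whether it reaches significance.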

Analysis of Variance

An analysis of variance (ANOVA) is very similar to a t-test but is used when more than two groups are involved in the study. Referring back to the previous example, if Mary had developed two different vocational programs that she wanted to test, Group 1 might participate in Program A, Group 2 might participate in Program B, and Group 3 would participate in the traditional program.

Just like the t-test, the ANOVA compares the variability between groups with the variability within groups, in this case across all three (or more) groups. An additional type of test, the post hoc (or “after the fact”) test, is used to obtain more information about the mean differences of the three groups. For example, Mary would likely be interested in knowing how each of her vocational programs compared with the traditional program, and which of the two may have been “better” than the other.

Post hoc tests allow researchers to compare the mean differences of Group 1 to Group 2, Group 2 to Group 3, and Group 1 to Group 3. This way, researchers are better able to determine the relative effectiveness of each group as compared to the others.
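
The overall ANOVA result is summarized by the F statistic, the ratio of between-group variability to within-group variability. The Python sketch below computes it for three groups; the function name `one_way_f` and the scores are hypothetical, and the post hoc comparisons themselves are not shown.

```python
from statistics import mean

def one_way_f(groups):
    """F statistic for a one-way analysis of variance."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Between-group variability: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variability: how far each score is from its own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for Program A, Program B, and the traditional program.
program_a = [3, 4, 5]
program_b = [2, 3, 4]
traditional = [1, 2, 3]

f_statistic = one_way_f([program_a, program_b, traditional])
print(f"F = {f_statistic:.2f}")
```

A large F indicates that the group means differ by more than the spread of scores within the groups would suggest; it does not indicate which groups differ, which is why post hoc tests are needed.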

By performing a factorial analysis of variance, or factorial ANOVA, researchers are able to analyze the separate as well as the interactive effects of two or more categorical (i.e., differing in kind, not amount or degree) variables.

Consider the following scenario which is based upon an actual study conducted by Leierer, et al. (1996):

John is a rehabilitation counselor who has worked with individuals with physical disabilities for many years and has noticed certain patterns in consumer behavior. Upon case closure or termination, consumers are asked to complete an evaluation of their counselor.

Several counselors on staff at his agency have physical disabilities themselves, and John wonders whether consumers prefer to discuss issues involving the challenges related to disability with counselors who also have a disability and may personally identify with some of their difficulties. On the other hand, he wonders whether consumers feel comfortable discussing general concerns unrelated to disability issues with any of the counselors on staff, whether with or without a disability.

He defines the following parameters:

*John knows that one of the most important skills counselors possess is the ability to actively attend to the concerns of consumers, which is a benchmark for professional competence. This will be one of the variables.

*John hypothesizes that there is a relationship between a counselor’s disability status (whether or not the counselor has a disability) and the consumer’s satisfaction ratings (the dependent variable) of that counselor. This will be a second variable.

*In addition, this relationship is influenced by the nature of the issues discussed with the counselor (consumers may prefer discussing disability-related issues with a counselor who has a disability, but have no preference when issues unrelated to the disability are being discussed). This will be a third variable.

By performing an ANOVA, John will be able to parcel out the effect of the status variables (counselor’s disability, professional competence rating) and independent variable (nature of the issues discussed) to identify the main effect. The main effect demonstrates the effect that an independent variable has on a specific dependent variable without being influenced by the other variables under study. The ANOVA will also enable him to analyze the influential effects of the combinations of independent variables, or interactive effects, of all of the elements under investigation.
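
The difference between a main effect and an interactive effect can be sketched with average satisfaction ratings. For simplicity, the Python example below uses only two of John's variables (counselor disability status and nature of the issue) and hypothetical ratings. It describes the pattern of averages only; it does not perform the significance test of a factorial ANOVA.

```python
# Hypothetical average satisfaction ratings for each combination
# of counselor status and type of issue (illustrative values only).
cell_means = {
    ("with disability", "disability-related"): 8.0,
    ("with disability", "unrelated"): 6.0,
    ("without disability", "disability-related"): 5.0,
    ("without disability", "unrelated"): 6.0,
}

def marginal_mean(factor_index, level):
    """Average rating for one level of a factor, across the other factor."""
    values = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)

# Main effects: the difference between the levels of one factor,
# averaged over the levels of the other factor.
counselor_main_effect = (
    marginal_mean(0, "with disability") - marginal_mean(0, "without disability")
)
issue_main_effect = (
    marginal_mean(1, "disability-related") - marginal_mean(1, "unrelated")
)

# Interactive effect: does the counselor gap depend on the type of issue?
gap_for_related = (
    cell_means[("with disability", "disability-related")]
    - cell_means[("without disability", "disability-related")]
)
gap_for_unrelated = (
    cell_means[("with disability", "unrelated")]
    - cell_means[("without disability", "unrelated")]
)
interaction = gap_for_related - gap_for_unrelated

print(f"Main effect of counselor status: {counselor_main_effect}")
print(f"Main effect of issue type: {issue_main_effect}")
print(f"Interaction: {interaction}")
```

In these made-up ratings, the preference for a counselor with a disability appears only when disability-related issues are discussed, which is the kind of interactive effect John hypothesized.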

Multiple Regression

Multiple regression is similar to the factorial ANOVA but may be used to predict outcomes and to explore possible causal explanations, rather than simply to identify relationships. Multiple regression analyzes the relationships between a set of independent variables and one dependent variable. In other words, this statistic is used as a means of predicting an outcome based upon a combination of two or more variables. Multiple regression is more flexible than ANOVA, which is limited to categorical independent variables; multiple regression techniques can analyze continuous, dichotomous, or categorical independent variables.

For example, a researcher may want to identify the variables which best predict return to work success following physical injury. Based upon what is known in the field of vocational rehabilitation, the researcher may select the following variables: age of onset, previous work history, severity of physical impairment, marital status, etc., believed to influence vocational outcome. Multiple regression analysis allows the researcher to identify the combination of factors most likely to predict whether individuals successfully return to work after sustaining physical injuries.
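
A minimal Python sketch of the calculation follows. It assumes ordinary least squares, the most common form of multiple regression, with a numeric outcome. The function name `fit_linear`, the two predictors, and all data values are hypothetical; actual studies would use statistical software that also reports significance tests for each predictor.

```python
def fit_linear(rows, outcomes):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    # Add a constant 1.0 to each row so the model includes an intercept.
    X = [[1.0] + list(row) for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, outcomes))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefficients, row):
    """Predicted outcome for one individual."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))

# Hypothetical data: (age at onset, severity rating) for six individuals,
# with a hypothetical vocational outcome score for each.
predictors = [(25, 2), (32, 4), (41, 1), (47, 3), (53, 5), (38, 2)]
outcome_scores = [78, 60, 85, 66, 48, 74]

coefficients = fit_linear(predictors, outcome_scores)
print("Intercept and weights:", [round(b, 2) for b in coefficients])
print("Predicted score for age 40, severity 3:",
      round(predict(coefficients, (40, 3)), 1))
```

Each weight estimates how much the predicted outcome changes when that variable increases by one unit while the other variable is held constant.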

Multivariate Analysis

According to Bellini and Rumrill (1999), multivariate analysis is less commonly utilized in rehabilitation sciences than multiple regression, but may be useful depending upon the research design of a particular study. Rather than one method, multivariate analysis refers to a group of statistical techniques which analyze the effects of one or a set of variables on a set of continuous variables.

Statistical Significance vs. Practical Significance

Life care planners need to be familiar enough with statistical methods to distinguish between the statistical significance of a study and its practical significance. While statistical significance is the yardstick by which research findings are measured, it may not always be a useful criterion for determining whether the results have any practical importance affecting the welfare of individuals with disabilities.

Knowing the effect of a large sample size on probability (i.e., with a large sample, even trivially small differences may reach statistical significance), life care planners should pay close attention to the reported actual differences among group means and other indicators of the magnitude of relationships.

Bellini and Rumrill (1999) assert, “Evaluating the practical significance of research findings also involves reassessing the status of the theoretical proposition following the empirical test, as well as the heuristic and practical value of the theoretical proposition relative to the goals, activities, and procedures of the particular agency or program.” Recall that the purpose of scientific inquiry is to develop theories which guide a discipline’s philosophy and practice. Research does not “prove” or “disprove” facts, but supports or refutes current theoretical propositions. Bellini and Rumrill (1999) continue, “…it is the theory, confirmed by research findings, that provides rehabilitation practitioners with tools for understanding the relationships among personal values, beliefs about disability, and subsequent adjustment of persons with acquired disabilities.”
