- RESEARCH DESIGN AND STATS
Fundamentals of Measurement
Basic Scientific Research Methodology
Basic scientific research can be applied to achieve the
goals of demonstrating the reliability, validity and relevancy
of life care planning as a tool for case management of the
patient with severe disabilities. In order to do that, we
need an understanding of scientific methodology.
Basic scientific research is driven by the testing of hypotheses.
The hypothesis is our best supposition of what we think is
happening under a given set of circumstances. While the development
of a working hypothesis is applicable to individual client
assessment (Reid, 1997) and is employed in daily clinical
practice, it can also be applicable to a larger field in
general, such as life care planning.
Each scientific study or experiment is designed to ask a
particular question about the hypothesis. The results of
each study or experiment have the potential to either lend
support to the truth of the hypothesis or to disprove and
challenge the hypothesis. As the evidence in support of a
hypothesis accrues, the hypothesis may become a well-accepted
theory on how things work. This does not imply that life
care planning is a theory. It is certainly not a theory,
but rather a very useful tool. So how might we develop hypotheses
about life care planning to test scientifically?
Traditionally, hypotheses develop from careful observations
of a phenomenon or reviews of the published literature in
an area leading to a rational assessment of the field. Once
an idea is intellectually formed of how things might be working,
then a research question can be posed to test whether the
hypothesis is true. A scientific study sets forth specific
aims and objectives to answer the research question. The
specific aims define the response variable that will be recorded
as the outcome of the investigation.
In the instance of research for validation of life care
planning, the body of published literature is only now emerging.
For a comprehensive anthology, see the appendix to the Amicus
Curiae Brief (Countiss, 2002) and The Bibliography
of Life Care Planning and Related Publications (Weed,
Berens, & Deutsch, 2002), as well as Hamilton’s
state-of-the-science paper (1999).
The issues arising from the U.S. Supreme Court ruling in
Daubert v. Merrell Dow Pharmaceuticals (1993) serve as an impetus for scientific
studies to validate the life care planning process. The ruling
has asked three important questions as to whether life care
planning, in all its aspects, is 1) reliable, 2) valid,
and 3) relevant to each specific patient’s case. Therefore,
our hypothesis is that life care plans are indeed 1) reliable,
2) valid, and 3) relevant to each specific patient’s case.
The Research Process
A single hypothesis can generate many research questions.
In a research study, the research question is addressed by
development of specific aims and the research objectives
through which the specific aims are going to be accomplished.
The specific aims identify the response variables to be analyzed.
Next, the study protocols and procedures are developed.
The protocols and procedures detail the methods to be employed
to assure consistent collection of reliable data. After the
data is gathered, it must be statistically analyzed. To be
meaningful, the results must be interpreted in context of
the current state of the profession and its future directions.
The Design of Research Studies
The research process is captured within the overall design
of the proposed study. Whatever the hypothesis, design the
best possible study to disprove it. Results gathered
in this manner have the strongest impact.
A major distinction in design can be made between descriptive
and analytical study designs (Bellini & Rumrill, 1999,
Chap. 6). Descriptive studies are non-experimental or “cohort” studies,
while analytical studies test hypotheses.
Descriptive studies gather data of interest about a certain
population, a “cohort.” A cohort is a sub-population
of patients that share particular characteristics (e.g.,
HIV infection or hemiplegia). The outcome of a descriptive
study might be a determination of the prevalence of disability
within a certain population. After analyzing the results
of a descriptive study statistically and making some inference
about the meaning of the data, a hypothesis may be generated
that can be tested analytically. For example, in a cohort
of insulin-resistant type II diabetics, the prevalence of
hearing loss might be determined.
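
The prevalence estimate that such a descriptive study reports can be sketched in a few lines. This is only an illustration: the counts are hypothetical, the function name is our own, and the confidence interval uses the simple normal approximation, which is an assumption that works poorly for very small cohorts or prevalences near 0% or 100%.

```python
from math import sqrt
from statistics import NormalDist


def prevalence_with_ci(cases, cohort_size, confidence=0.95):
    """Return (prevalence, lower, upper) using the normal approximation."""
    if cohort_size <= 0 or not 0 <= cases <= cohort_size:
        raise ValueError("cases must lie between 0 and a positive cohort size")
    p = cases / cohort_size
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(p * (1 - p) / cohort_size)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts, invented only to exercise the function:
# 45 of 150 cohort members show hearing loss.
prevalence, lower, upper = prevalence_with_ci(cases=45, cohort_size=150)
print(f"prevalence = {prevalence:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

The width of the interval shrinks as the cohort grows, which is one reason the size of the cohort matters even in a purely descriptive study.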
The individual case report and case series are always
descriptive studies, usually of a singular, interesting nature.
These studies can provide a provocative observation justifying
a larger, descriptive cohort study.
Descriptive studies can inform the design process for
analytical studies. Retrospective case reviews group similar
cases as cohorts and collect specific data about them. They
can be either descriptive or analytical. An example of an
analytical case review study would be the comparison of life
care plans that were updated five to seven years after implementation
to determine the predictive validity of the initial life care plan.
Prospective longitudinal studies are more powerful than
retrospective case reviews. They always test hypotheses by
following a particular endpoint over time in a specially
enrolled patient population. If, in the descriptive study,
the prevalence of hearing loss in insulin-resistant type
II diabetics were found to be very high, then some hypothesis
as to why that occurs might be put forward and tested by
experimental intervention in a prospective longitudinal study
(Elwood, 1998, Chap. 2; Piantadosi, 1997, Chap. 4).
Statistical Design and Power Analysis
Importantly, statistical consultation should be a part of
the study design process. Because the data must ultimately
be analyzed statistically to be meaningful, it is extremely
helpful to consult with a statistician in the design stage
of the study. The final methods of analysis should be determined
before data collection begins.
After the data collection has been completed, statistical
analysis will indicate whether the outcome is significant.
However, the statistical parameters for deciding what is
significant must be chosen before data collection begins.
The significance level, the probability threshold below which
a detected difference is accepted as real, must be set very low
to minimize the chance of identifying a false positive effect,
known as a Type I error.
Type I Error
A Type I error occurs when the difference detected in the
study is accepted as a true result when it is not. Conventionally,
the level of significance is set at p < 0.05, so that
the probability of a Type I error is less than 5%. In contrast
to the significance level, the parameter that governs the
Type II error rate, the power of the study, should be set very high.
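
The decision rule behind the conventional 0.05 level can be sketched as follows. This is a minimal illustration, assuming the test statistic is approximately standard normal (a z-test); the function names are our own and the z values are arbitrary inputs, not results from any study.

```python
from statistics import NormalDist

ALPHA = 0.05  # conventional significance level, fixed before data collection


def two_sided_p_value(z):
    """Chance, if no true difference exists, of a z statistic at least this extreme."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def is_significant(z, alpha=ALPHA):
    """Accept the difference as real only when the p-value falls below alpha."""
    return two_sided_p_value(z) < alpha


# Arbitrary example statistics, chosen only to show both outcomes.
for z in (1.0, 2.5):
    print(f"z = {z}: p = {two_sided_p_value(z):.4f}, significant = {is_significant(z)}")
```

Because alpha is chosen in advance, roughly 5% of studies of a truly absent effect would still cross the threshold; that residual risk is the Type I error rate.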
Type II Error
A Type II error occurs when no difference is detected, but
a difference actually does exist; in other words, a false
negative is identified. Power, the probability of detecting a
true difference, is frequently set at 80 - 90%, which corresponds
to a Type II error probability of 10 - 20% (Bellini & Rumrill,
1999, Chap. 3; Friedman, Furberg, & DeMets,
1998, Chap. 7; Piantadosi, 1997, Chap. 4).
Statisticians can also help determine whether the proposed
study is feasible. This is done by power analysis. “Power” refers
to the probability that the study will detect a significant
difference in the response variables, if one truly exists, given
the levels set for the statistical parameters discussed above. Power comes
from the number (N) of participants included in the study
and the magnitude of the effect of interest.
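
How power grows with N can be illustrated with a short calculation. The sketch below assumes a two-group comparison of means with equal group sizes, a known common standard deviation, and a normal (z) approximation; the effect size of 5 units and standard deviation of 10 units are hypothetical values, and the function name is our own.

```python
from math import sqrt
from statistics import NormalDist


def power_two_group(n_per_group, effect, sd, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test for a difference in means."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    standard_error = sd * sqrt(2 / n_per_group)
    # The negligible chance of rejecting in the wrong direction is ignored.
    return std_normal.cdf(abs(effect) / standard_error - z_crit)


# Hypothetical effect size and variation, invented for illustration only.
for n in (10, 30, 64, 100):
    power = power_two_group(n, effect=5.0, sd=10.0)
    print(f"N per group = {n:3d}  power = {power:.2f}  Type II error = {1 - power:.2f}")
```

Under these assumptions power rises steadily with N, and the Type II error probability is simply one minus the power.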
If a sufficient number of cases is not available to power
the study adequately, then it is not feasible to conduct
the study because no meaningful results can be detected.
The N required for the study to detect a difference can be
calculated from the expected effect size and the expected
variation in the data. If the effect size is small, or the
variation large, then the N must be large.
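
The relationship among N, effect size and variation can be made concrete with a commonly used normal-approximation formula for comparing two group means: N per group = 2 x ((z for alpha + z for power) x SD / effect) squared. The sketch below is one such calculation, not the only valid method; the input values are hypothetical, the function name is our own, and a statistician may prefer a t-based or design-specific calculation.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided, two-group comparison."""
    if effect == 0:
        raise ValueError("an effect size of zero can never be detected")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # guards against Type I error
    z_power = std_normal.inv_cdf(power)          # guards against Type II error
    return ceil(2 * ((z_alpha + z_power) * sd / abs(effect)) ** 2)


# Hypothetical scenarios, invented for illustration only.
baseline = n_per_group(effect=5.0, sd=10.0)
smaller_effect = n_per_group(effect=2.5, sd=10.0)
larger_variation = n_per_group(effect=5.0, sd=20.0)
print(baseline, smaller_effect, larger_variation)
```

Halving the expected effect or doubling the standard deviation roughly quadruples the required N, which is why a small effect or a large variation calls for a large study.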
The circular question is “How can the effect size
from a study that has not been completed be determined?” The
answer is that it cannot be determined, only estimated. Published
reports of similar effects or preliminary studies, which
are small studies that were not “powered” and
may not have detected a difference in outcomes, can inform
us about estimating the effect size and the range of variation
in the effect (Bellini & Rumrill, 1999, Chap. 6; Friedman,
Furberg, & DeMets, 1998, Chap. 7; Senn, 1997, Chaps.
4 & 13).
Inclusion/exclusion criteria define the study’s target
population (Bellini & Rumrill, 1999; Piantadosi, 1997,
Chap. 8). The baseline characteristics considered by the
study are described by the inclusion/exclusion criteria,
including any baseline exams the study might deem important
for controlling potential extraneous confounding variables.
A confounding variable, a source of bias, is some factor other than
the one under study that accounts for an effect identified in the
study or masks a true effect.
Some commonly identified confounding variables include baseline
characteristics of the cohort such as gender, age, cultural
background and socioeconomic level. Other confounding variables
could be identified as pre-existing medical conditions with
pathology similar to the pathology of interest in the study,
or with pathology that exacerbates the severity or progression
of the pathology of interest in the study.
Controlling for Confounding Variables
One way to control for confounding variables is to set the
inclusion/exclusion criteria to limit their presence within
the study population. For example, it might be reasonable
in a study on the effects on I.Q. of HIV-Associated
Dementia (HAD) to exclude those patients with a pre-existing
closed head trauma or cerebral stroke. In the same study,
age might be limited to young adults aged 21-35 to control
for the normal age effects on intellect seen in immature
and geriatric populations. The inclusion/exclusion criteria
serve as an assessment of eligibility, or checklist, for
participant enrollment to the study.
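
The eligibility checklist for the HAD example above could be sketched as follows. The field names, the class and the function are hypothetical, and treating a confirmed HAD diagnosis as the inclusion criterion is our assumption; a real protocol would define each criterion far more precisely.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Screening data for one prospective participant (hypothetical fields)."""
    age: int
    has_had_diagnosis: bool          # assumed inclusion criterion
    prior_closed_head_trauma: bool   # exclusion criterion from the example
    prior_stroke: bool               # exclusion criterion from the example


def eligibility_failures(candidate):
    """Return the criteria the candidate fails; an empty list means eligible."""
    failures = []
    if not candidate.has_had_diagnosis:
        failures.append("no HAD diagnosis")
    if not 21 <= candidate.age <= 35:
        failures.append("age outside 21-35")
    if candidate.prior_closed_head_trauma:
        failures.append("pre-existing closed head trauma")
    if candidate.prior_stroke:
        failures.append("pre-existing cerebral stroke")
    return failures


# Two invented candidates, used only to show both outcomes.
eligible = Candidate(age=28, has_had_diagnosis=True,
                     prior_closed_head_trauma=False, prior_stroke=False)
excluded = Candidate(age=52, has_had_diagnosis=True,
                     prior_closed_head_trauma=True, prior_stroke=False)

for person in (eligible, excluded):
    reasons = eligibility_failures(person)
    print("eligible" if not reasons else "excluded: " + "; ".join(reasons))
```

Recording the reason for each exclusion, rather than a bare yes or no, lets the study report how the inclusion/exclusion criteria shaped the final population.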
Another way to control for confounding variables is to include
the confound in the study population, but stratify the study
by the levels of the confounding variable. For example, socioeconomic
effects are commonly stratified by level of education achieved
and earned income. Gender might be an interesting confound
within the same study of HAD effects on intelligence, not
because males and females have essentially different IQs,
but because the HIV disease state underlying the observed
pathology may progress differently in males and females due
to their intrinsically different immune systems. Stratifying
for a confounding variable has the potential to identify
important and sometimes unanticipated effects.
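
A stratified summary of a response variable can be sketched as below. The records and the stratum labels ("level_1", "level_2") are placeholders invented only to exercise the code; they do not represent findings about any confound discussed here, and the function name is our own.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (stratum, score) records, invented for illustration only.
records = [
    ("level_1", 96), ("level_1", 102), ("level_1", 99),
    ("level_2", 101), ("level_2", 97), ("level_2", 105),
]


def summarize_by_stratum(rows):
    """Group scores by level of the confounding variable and summarize each group."""
    strata = defaultdict(list)
    for stratum, score in rows:
        strata[stratum].append(score)
    return {
        name: {"n": len(scores), "mean": mean(scores), "sd": stdev(scores)}
        for name, scores in strata.items()
    }


summary = summarize_by_stratum(records)
for name, stats in sorted(summary.items()):
    print(f"{name}: n = {stats['n']}, mean = {stats['mean']:.1f}, sd = {stats['sd']:.1f}")
```

Reporting each stratum separately keeps the confound visible in the results instead of letting it blend into a single pooled average.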
Stratification and Hypothesis Testing
Stratification can also be used to test hypotheses. Consider
the following analytic retrospective case review study: The
response variable (i.e., the recommended level of nursing care
for a patient with C5 tetraplegia) may be stratified by
some factor of interest to test a hypothesis. In order to
test for intra-planner reliability, members of the cohort
from a single practitioner’s caseload may be grouped
according to the purpose and source of referral. Three groups
may include: (a) development of a life care plan referred
by defense counsel, (b) development of a life care plan referred
by plaintiff counsel, or (c) an independent medical examination.
The mean response variables may then be compared and group
differences identified. In this study the inclusion/exclusion
criteria would be set to limit the population to a single
life care planner’s caseload and to specifically include
patients referred from all three sources.
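
One conventional way to compare the three group means is a one-way analysis of variance; the text does not name a specific test, so this choice is our assumption. The sketch below computes the F statistic by hand using only the standard library. The nursing-care hours are hypothetical values invented for illustration, and the function name is our own.

```python
from statistics import mean

# Hypothetical recommended nursing care (hours per day) by referral source.
recommended_hours = {
    "defense referral": [16, 18, 17, 16],
    "plaintiff referral": [17, 18, 16, 17],
    "independent medical examination": [17, 16, 18, 17],
}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_values = [value for values in groups.values() for value in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values()
    )
    # Variation of individual cases around their own group mean.
    ss_within = sum(
        (value - mean(values)) ** 2 for values in groups.values() for value in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova_f(recommended_hours)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

The F statistic would then be referred to an F table or to statistical software for a p-value. A small F is consistent with, but does not by itself prove, consistent recommendations across referral sources; whether the study could have detected a difference depends on its power.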