Improving the Reliability and Generalizability of Scientific Research
Description
Science is a formalized method for acquiring information about the world. In
recent years, the ability of science to do so has been scrutinized. Attempts to reproduce
findings in diverse fields demonstrate that many results are unreliable and do not
generalize across contexts. In response to these concerns, many proposals for reform have
emerged. Although promising, such reforms have not addressed all aspects of scientific
practice. In the social sciences, two such aspects are the diversity of study participants
and incentive structures. Most efforts to improve scientific practice focus on replicability,
but sidestep issues of generalizability. And while researchers have speculated about the
effects of incentive structures, there is little systematic study of these hypotheses. This
dissertation takes one step towards filling these gaps. Chapter 1 presents a cross-cultural
study of social discounting – the purportedly fundamental human tendency to sacrifice
more for socially close individuals – conducted among three diverse populations (U.S.,
rural Indonesia, rural Bangladesh). This study finds no independent effect of social
distance on generosity among Indonesian and Bangladeshi participants, providing
evidence against the hypothesis that social discounting is universal. It also illustrates the
importance of studying diverse human populations for developing generalizable theories
of human nature. Chapter 2 presents a laboratory experiment with undergraduates to test
the effect of incentive structures on research accuracy, in an instantiation of the scientific
process where the key decision is how much data to collect before submitting one’s
findings. The results demonstrate that rewarding novel findings causes respondents to
make guesses with less information, thereby reducing their accuracy. Chapter 3 presents
an evolutionary agent-based model that tests the effect of competition for novel findings
on the sample size of studies that researchers conduct. This model demonstrates that
competition for novelty causes the cultural evolution of research with smaller sample
sizes and lower statistical power. However, increasing the startup costs of conducting
single studies can reduce the negative effects of competition, as can rewarding
publication of secondary findings. Together, these chapters provide evidence that
aspects of current scientific practice may be detrimental to the reliability and
generalizability of research and point to potential solutions.