Examining the design of statistical studies can help us answer Whitney Houston’s famous question, “How Will I Know?” How will I know if this new vaccine is effective, how will I know if a policy meets its stated goals, how will I know if evidence I gather is significant? I want to answer these questions in settings where data arise in clustered and correlated ways, including in COVID-19 studies. Understanding these data can help us design more efficient, more robust, and more meaningful clinical trials and research studies. And by understanding how statistical evidence is used in regulatory policy and the historical background of statistics, we can generate better evidence and communicate it better to a skeptical public.
Below, you can find brief descriptions and links to some of my research. If you have any questions, or have trouble accessing any of the articles, please reach out to me, at lkennedyshaffer (at) vassar (dot) edu.
The COVID-19 pandemic has spurred an incredible amount of research in a wide variety of fields. To get the best evidence, we have to account for some of the unique features of data that come from infectious disease outbreaks. Accounting for these in the design and analysis of studies will improve the reliability of the answers we get and can even help us create more efficient studies that use these features to our advantage. Improving this evidence base will help policy and public health responses best address and end the pandemic.
- Using quantitative measures of viral load, like PCR Cycle Threshold (Ct) values, we can better estimate where a community is in an outbreak.
- Hay et al., “Estimating epidemiological dynamics from single cross-sectional viral load distributions,” medRxiv 2021.
- We must identify and avoid biases in common study designs when applied to epidemic settings.
- Accorsi et al., “How to detect and reduce potential sources of biases in epidemiologic studies of SARS-CoV-2,” European Journal of Epidemiology 2021.
- Kahn et al., “Potential biases arising from epidemic dynamics in observational seroprotection studies,” American Journal of Epidemiology 2021.
- Kennedy-Shaffer and Lipsitch, “Statistical properties of stepped wedge cluster-randomized trials in infectious disease outbreaks,” American Journal of Epidemiology 2020.
- Ongoing research includes efforts to provide better sample size methods for various study designs in epidemics.
- The clustering of transmission can allow us to design better surveys to gather information about the disease and can improve our control measures to reduce transmission with less burden on individuals.
- Kennedy-Shaffer et al., “Perfect as the enemy of the good: using low-sensitivity tests to mitigate SARS-CoV-2 outbreaks,” Lancet Microbe 2021.
- Kennedy-Shaffer et al., “Snowball sampling study design for serosurveys early in disease outbreaks,” American Journal of Epidemiology 2021.
- Ongoing research includes the effects of this clustering on optimal vaccination strategies.
Theory and Methods for Stepped-Wedge and Parallel-Arm Cluster-Randomized Trials
Research into clustered data and study design did not start with the COVID-19 pandemic. Clustered or correlated data can arise in many settings beyond infectious diseases, including policy evaluations, educational research, health systems research, and survey design. I work to understand the role of correlation and how we can best account for it in designing our studies.
- New methods for analyzing stepped wedge trials can avoid the need for strong assumptions on how data were generated.
- Kennedy-Shaffer et al., “Novel methods for the analysis of stepped wedge cluster randomized trials,” Statistics in Medicine 2020. The code for analysis is available at https://github.com/leekshaffer/SW-CRT-analysis.
- For more on the performance in epidemic settings, see Kennedy-Shaffer and Lipsitch, “Statistical properties of stepped wedge cluster-randomized trials in infectious disease outbreaks,” American Journal of Epidemiology 2020.
- We can design more efficient cluster randomized trials by stratifying on important predictors and accounting for that in sample size estimation and power calculations.
- Kennedy-Shaffer and Hughes, “Sample size estimation for stratified individual and cluster randomized trials with binary outcomes,” Statistics in Medicine 2020. The code for calculating sample sizes is available at https://github.com/leekshaffer/strat-crt-ss or through a user-friendly RShiny app at https://leekshaffer.shinyapps.io/stratcrt/.
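To give a sense of why correlation matters for sample size, here is a minimal sketch of the standard design-effect inflation for an unstratified, parallel-arm cluster randomized trial with equal cluster sizes. It is not the stratified method from the paper above or the code in the linked repository; the function names and the numbers in the usage lines are hypothetical and chosen only for illustration.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Standard design effect 1 + (m - 1) * ICC for equal-sized clusters.

    Assumes a common cluster size m and a single intracluster
    correlation coefficient (ICC); no stratification is modeled.
    """
    return 1.0 + (cluster_size - 1.0) * icc


def clusters_per_arm(n_individual: int, cluster_size: int, icc: float) -> int:
    """Clusters needed per arm, given the per-arm sample size that an
    individually randomized trial would need (n_individual)."""
    inflated = n_individual * design_effect(cluster_size, icc)
    return math.ceil(inflated / cluster_size)


# Illustrative inputs only: 200 participants per arm under individual
# randomization, clusters of 20, and an ICC of 0.05.
print(design_effect(20, 0.05))
print(clusters_per_arm(200, 20, 0.05))
```

Even a small ICC roughly doubles the required sample size in this example, which is why accounting for correlation (and for design features such as stratification) at the planning stage can change a trial's feasibility.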
History of Statistics and the Role of Statistics in Public Policy
Translational research is key to ensuring that statistics are properly used in fields as varied as public policy, business, law, and medicine, among others. I study how statistical methods, like p-values and significance testing, became so ubiquitous in certain areas, and what that means for how we develop, discuss, and disseminate new methods.
- The history of the use of and debate around p-values in statistics can frame the current debate, give us insight into how to avoid the misuse of statistical methods in the future, and improve the teaching of statistics.
- Kennedy-Shaffer, “Before p<0.05 to beyond p<0.05: using history to contextualize p-values and significance testing,” The American Statistician 2019.
- The use of significance testing and p-values in statistics is intimately tied to their acceptance and promotion in drug efficacy studies by the U.S. Food and Drug Administration. The staying power of these methods reflects how well they meet the FDA’s goals.
- Kennedy-Shaffer, “When the alpha is the omega: p-values, ‘substantial evidence,’ and the 0.05 standard at FDA,” Food and Drug Law Journal 2017.