Research Design
November 2019
A deep dive into advanced research methodology, drawing on a mix of qualitative and quantitative approaches.
Senior Data Scientist, Epidemiologist, and Biostatistician.
I integrate rigorous data science with policy-grade strategy and narrative clarity - then ship real, measurable change. Forged at the intersection of macro-systems thinking and quantitative biostatistics, my work bridges the gap between global socio-economic frameworks and actionable decision intelligence. My professional trajectory spans leading frontline epidemiological response teams during the COVID-19 pandemic, scaling enterprise analytics as a senior data scientist for a Fortune 24 organization, and driving agile, cross-functional teams across a Big 4 firm and an SF-based health startup. I don’t just build models; I translate complex statistical outputs into the strategic foresight required for senior leadership and public policy.
Given PhD-level training in Epidemiology, Biostatistics, and Neurocognitive Science, I am deeply technical but relentlessly applied. I am at home working in R, with extensive experience across Python, SQL/Postgres, Julia, SAS, and SPSS, and I often use Rust, C++, and Java as well. My technical wheelhouse includes wrangling mixed data, hypothesis testing, and architecting explainable machine learning pipelines - spanning supervised learning, clustering, and natural language processing. Beyond the algorithms, I build automated workflows, interactive geospatial maps, and cross-platform tools that live in the real world, such as my civic-insights platform, Perception Interval. My graduate training and research in neurocognitive science equip me with perspective and deep curiosity regarding human and artificial cognitive abilities - which I bring to the table in every machine learning and generative AI system that I design, build, deploy, and maintain.
Whether I am brainstorming with colleagues, communicating to expert and broad audiences, engineering insight systems, developing intelligence in a drop of cloud compute, or stepping into leadership roles in times of great challenge, my focus remains constant: extracting the signal from the noise, anticipating the range of probable futures, and building systems that drive human-centric, scalable impact.
A sample of analytical outputs — from network analysis and causal graphs to individualization of effect estimates and semantic dimensionality reduction.
Presented for Statistics Without Borders as part of a panel at the Joint Statistical Meetings.
Introducing R language, data wrangling, and computer vision.
Spearheaded machine learning, evaluation, and probabilistic forecasting. Developed interactive R Shiny web applications updated daily. Managed fail-safe data ingestion pipelines handling 0.5B-1.5B rows.
Led the Outbreak Management SQL Server Team. Produced complex reproducible reports 7 days a week using R, SAS, and SQL. Developed data wrangling pipelines for all Los Angeles County School COVID-19 data.
Leveraged open-source natural language processing libraries to build tabular time-series from unstructured text. Applied AI for predictive models and contributed to the implementation of parameterized models.
Led a distributed team of data scientists on an NLP project concerning COVID-19. Presented at the 2020 Joint Statistical Meetings (JSM) on supporting emergency response clients.
Mediation, conditional process modeling, growth curves, multinomial regression, and mixed effects.
Dimensionality reduction, dealing with missing data, visualization, and time-series data wrangling.
Leveraged supercomputers for natural language processing. Presented on H2O Driverless AI and ggplot2.
Fit generalized low-rank models, Bayesian mixed effects models, and gradient boosting machines for longitudinal responses to breast cancer treatments.
Built cognitive tasks, ran experiments in accordance with human research ethics, and analyzed data via reproducible pipelines for cognitive science.
Served as a resource and community adviser for residents.
Led operational teams to support business objectives.
Co-founded a San Francisco-based health and fitness startup.
Claremont Graduate University
Claremont Graduate University
Claremont Graduate University (Declined)
San Jose State University
Tufts University
San Jose State University
San Jose State University
Neurocognitive Sciences, Biostatistics, and Epidemiology
Claremont Graduate University, Post-Coursework 2020
R Statistician Certificate
DataCamp, 2020
Credential ID 147123
Biostatistics and Epidemiology
Claremont Graduate University, 2019
Kinestry, 2019
R Data Scientist Certificate
DataCamp, 2017
Credential ID 5703
R Data Analyst Certificate
DataCamp, 2017
Credential ID 4526
R Developer Certificate
DataCamp, 2017
Credential ID 4799
Global Studies
San Jose State University, 2010
#1 Most Transformative University
#5 Public University in the U.S.