Encoding/Decoding Complexity

With Agentic AI & Statistics

Senior Data Scientist, Epidemiologist, and Biostatistician.

About Me

I integrate rigorous data science with policy-grade strategy and narrative clarity, then ship real, measurable change. Forged at the intersection of macro-systems thinking and quantitative biostatistics, my work bridges the gap between global socio-economic frameworks and actionable decision intelligence. My professional trajectory spans leading frontline epidemiological response teams during the COVID-19 pandemic, scaling enterprise analytics as a senior data scientist for a Fortune 24 organization, and driving agile, cross-functional teams across a Big 4 firm and an SF-based health startup. I don’t just build models; I translate complex statistical outputs into the strategic foresight required for senior leadership and public policy.

With PhD-level training in Epidemiology, Biostatistics, and Neurocognitive Science, I am deeply technical but relentlessly applied. I am at home working in R, with extensive experience across Python, SQL/Postgres, Julia, SAS, and SPSS, and I often use Rust, C++, and Java as well. My technical wheelhouse includes wrangling mixed data, hypothesis testing, and architecting explainable machine learning pipelines spanning supervised learning, clustering, and natural language processing. Beyond the algorithms, I build automated workflows, interactive geospatial maps, and cross-platform tools that live in the real world, such as my civic-tech platform, Perception Interval. My graduate training and research in neurocognitive science equip me with perspective and deep curiosity regarding human and artificial cognitive abilities, which I bring to every machine learning and generative AI system that I design, build, deploy, and maintain.

Whether I am brainstorming with colleagues, communicating to expert and broad audiences, engineering insight systems, developing intelligence in a drop of cloud compute, or stepping into leadership roles in times of great challenge, my focus remains constant: extracting the signal from the noise, anticipating the range of probable futures, and building systems that drive human-centric, scalable impact.

2 PhD-level courses co-taught in 3 languages
405 lb. Back Squat
100+ Data Science Courses Completed

Professional Experience

Senior Data Scientist (III)

Centene Corporation 2022 - 2025

Spearheaded machine learning, evaluation, and probabilistic forecasting. Developed interactive R Shiny web applications updated daily. Managed fail-safe data ingestion pipelines handling 0.5B-1.5B rows.

R Shiny / Machine Learning / AWS S3 / Posit Connect

Epidemiologist - Data Science Team

Los Angeles County Dept of Public Health 2020 - 2022

Led the Outbreak Management SQL Server Team. Produced complex reproducible reports 7 days a week using R, SAS, and SQL. Developed data wrangling pipelines for all Los Angeles County School COVID-19 data.

SQL Server / SAS / R / Distributed Team Lead

Researcher

CGU, Public Health/Math Dec 2020 - Mid 2021

Leveraged open-source natural language processing libraries to build tabular time-series from unstructured text. Applied AI for predictive models and contributed to the implementation of parameterized models.

NLP / Time-Series / Predictive AI

Statistical Consultant

Statistics Without Borders 2019 - Per Engagement

Led a distributed team of data scientists on an NLP project concerning COVID-19. Presented at the 2020 Joint Statistical Meetings (JSM) on supporting emergency response clients.

NLP / Team Leadership / JSM Presenter

Graduate Teaching Assistant

CGU, School of Community & Global Health May 2020 - Aug 2020

Mediation, conditional process modeling, growth curves, multinomial regression, and mixed-effects models.

Teaching / Advanced Statistics / Regression

Graduate Teaching Assistant

CGU, Transdisciplinary Studies May 2020 - Aug 2020

Dimensionality reduction, dealing with missing data, visualization, and time-series data wrangling.

Data Wrangling / Visualization / Teaching

Support

Pomona High-Performance Computing Unit June 2019 - Oct 2020

Leveraged supercomputers for natural language processing. Presented on H2O Driverless AI and ggplot2.

NLP / H2O.ai / Supercomputing

Researcher

CGU, School of Community & Global Health Jan 2019 - May 2019

Fit generalized low-rank models, Bayesian mixed effects models, and gradient boosting machines for longitudinal responses to breast cancer treatments.

Dimensionality Reduction / Bayesian Statistics / Gradient Boosting

Researcher

Human Working Memory Lab Jan 2018 - Dec 2018

Built cognitive tasks, ran experiments in accordance with human research ethics, and analyzed data via reproducible pipelines for cognitive science.

Cognitive Science / Reproducible Pipelines / Experiment Design

Community Adviser

Claremont Collegiate Apartments June 2016 - Aug 2020

Served as a resource and community adviser for residents.

Leadership / Community Building / Safety

One Team Lead

Deloitte April 2013 - Aug 2013

Led operational teams to support business objectives.

Team Leadership / Operations

Cofounder

Studiomix Nov 2011 - April 2013

Co-founded a San Francisco-based health and fitness startup.

Entrepreneurship / Service Design / Hiring / Functional Exercise / Nutrition

Technical Toolbox

Languages & AI

  • R / Python / Julia / SAS
  • SQL / Arrow
  • Agentic LLMs (Command Line, IDE integrations)
  • Machine Learning / Deep Learning / Forecasting / Statistics

Statistics & Modeling

  • Causal Inference
  • Bayesian Statistics
  • Structural Equation Models
  • Psychometrics

Infrastructure & Tools

  • Shiny / Posit Connect
  • AWS (EC2, S3)
  • Docker / CI/CD
  • Cloud Computing

Honors & Awards

James A. Blaisdell Fellow

2019

Claremont Graduate University

Howard R. Houston Fellow

2017

Claremont Graduate University

Randall Lewis Fellow

2016

Claremont Graduate University (Declined)

Dean's International Scholar

2010

San Jose State University

Leadership Alliance Scholar

2009

Tufts University

Ronald E. McNair Scholar

2009

San Jose State University

S. H. Scaffold Scholar

2008

San Jose State University

Education

Ph.D. Health Promotion Sciences

Neurocognitive Sciences, Biostatistics, and Epidemiology

Claremont Graduate University, Post-Coursework 2020

Statistician

R Statistician Certificate

DataCamp, 2020

Credential ID 147123

Master of Public Health

Biostatistics and Epidemiology

Claremont Graduate University, 2019

Artificial Intelligence Bootcamp

Kinestry, 2019

Associate Data Scientist

R Data Scientist Certificate

DataCamp, 2017

Credential ID 5703

Data Analyst

R Data Analyst Certificate

DataCamp, 2017

Credential ID 4526

Developer

R Developer Certificate

DataCamp, 2017

Credential ID 4799

Bachelor's degree

Global Studies

San Jose State University, 2010

#1 Most Transformative University / #5 Public University in the U.S.
