Stephen loves the challenges of working with data and the collaborations that feed into data discovery. As a bioinformatics data scientist at RTI International, he has built workflows and pipelines that aggregate and analyze genomic, clinical, and public health data.
MSc Bioinformatics, 2013
Johns Hopkins University, Baltimore, MD
BSc Bioinformatics, 2011
Loyola University, Chicago, IL
Programming | High proficiency in R, Python, SQL (MySQL, Postgres, SQL Server), Perl, Shiny, bash, HTML, CSS; working knowledge of Docker, Kubernetes |
Data Warehousing | High proficiency in BigQuery, Cloud SQL, RDS, Snowflake; working knowledge of Redshift, BigTable, Hive, BaseX, MongoDB |
Data Processing | High proficiency in dplyr, tidyr, pandas, Dataprep; working knowledge of GCP Pub/Sub, AWS SNS/SQS, Dataproc (managed Spark/Hadoop service), GCP Cloud Composer (Airflow), BigQuery ML, Dialogflow, AutoML |
Data Visualization | High proficiency in ggplot, matplotlib, Data Studio; working knowledge of Tableau |
Documentation | High proficiency in RMarkdown, knitr, Jupyter |
Other technologies | High proficiency in Git, Subversion, Jira, Confluence, REDCap |
Vaz, M., Hwang, S. Y., ... Baylin, S. B. (2017). Chronic cigarette smoke-induced epigenomic changes precede sensitization of bronchial epithelial cells to single-step transformation by KRAS mutations. Cancer Cell, 32, 360–376. doi:10.1016/j.ccell.2017.08.006
Gern, J. E., Jackson, D. J., Lemanske, R. F., Seroogy, C. M., Tachinardi, U., Craven, M., Hwang, S. Y., ... Bacharier, L. B. (2019). The Children's Respiratory and Environmental Workgroup (CREW) Birth Cohort Consortium: Design, methods, and study population. Respiratory Research, 20(1), 115. doi:10.1186/s12931-019-1088-9