I'm a data scientist and PhD Candidate at the University of Pennsylvania. In my research, I use machine learning, econometrics, and A/B testing to study human behavior and the media. In addition to my PhD, I’m pursuing a Master's in Data Science and Statistics from The Wharton School.
I spent the summer of 2022 as a Machine Learning Intern at DataCamp, where I used transformer-based NLP models to build an internal linking recommendation tool. I now hope to continue to apply these skills and gain additional experience in other data-driven roles.
Contact Download My Resume
We scraped a novel dataset of approximately 18,000 closed captioning transcripts from local television news programs across three cities from 2014 to 2018 and created a series of RoBERTa classifiers to identify news topics in these transcripts over time. Across the seven selected topics, the RoBERTa models achieved an average precision score of 0.85, an average recall score of 0.876, and an average F1 score of 0.859.
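As a minimal sketch of how per-topic averages like these are typically computed: each topic classifier gets its own precision, recall, and F1, and the scores are then averaged across topics. The topic names and counts below are hypothetical, not the project's data. Averaging this way is consistent with the reported F1 of 0.859 differing slightly from the harmonic mean of the averaged precision and recall (about 0.863).

```python
from statistics import mean


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one binary topic classifier."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical (true positive, false positive, false negative) counts per topic.
counts = {"topic_a": (80, 20, 10), "topic_b": (45, 5, 15)}

per_topic = {topic: precision_recall_f1(*c) for topic, c in counts.items()}

# Average each metric across topics (a macro average).
avg_precision = mean(p for p, _, _ in per_topic.values())
avg_recall = mean(r for _, r, _ in per_topic.values())
avg_f1 = mean(f for _, _, f in per_topic.values())

# For comparison: F1 computed from the already-averaged precision and recall.
f1_of_averages = 2 * avg_precision * avg_recall / (avg_precision + avg_recall)

print(f"average precision: {avg_precision:.3f}")
print(f"average recall:    {avg_recall:.3f}")
print(f"average F1:        {avg_f1:.3f}")
print(f"F1 of averages:    {f1_of_averages:.3f}")
```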
View findings through interactive dashboard
Using governmental administrative data and socio-demographic data, I show that LASSO logistic regression and random forest are both effective at predicting individual-level donation behavior. LASSO logistic regression correctly classifies 82.7% of test cases (61.9% of positive cases), and random forest correctly classifies 92.8% of test cases (99.9% of positive cases). Although both accuracy scores are notably higher than the 74.1% no-information rate, random forest proves to be the superior model by far.
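The accuracy figures above can be unpacked with a little arithmetic. The sketch below is a back-of-the-envelope check, not part of the project's code. It assumes that positive cases (donors) are the minority class in the test set, so that their share is one minus the no-information rate, and it then solves for the accuracy each model must have on negative cases. The function name is my own.

```python
def implied_negative_accuracy(accuracy, positive_accuracy, positive_share):
    """Solve accuracy = share * pos_acc + (1 - share) * neg_acc for neg_acc."""
    return (accuracy - positive_share * positive_accuracy) / (1 - positive_share)


NO_INFORMATION_RATE = 0.741  # accuracy of always guessing the majority class

# Assumption: positives are the minority class in the test set.
positive_share = 1 - NO_INFORMATION_RATE

# (overall accuracy, accuracy on positive cases), as reported in the text.
models = {
    "LASSO logistic regression": (0.827, 0.619),
    "random forest": (0.928, 0.999),
}

results = {}
for name, (accuracy, positive_accuracy) in models.items():
    lift = accuracy - NO_INFORMATION_RATE
    negative_accuracy = implied_negative_accuracy(
        accuracy, positive_accuracy, positive_share
    )
    results[name] = (lift, negative_accuracy)
    print(
        f"{name}: {lift:.3f} above the no-information rate, "
        f"implied accuracy on negative cases {negative_accuracy:.3f}"
    )
```

Under that assumption, the two models are about equally accurate on negative cases, and the gap between them comes almost entirely from how well they identify positive cases.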
Link to Full Project
Using multiple identification strategies, we show that political protests generate a significant amount of money for federal political campaigns across the United States. Using a staggered difference-in-differences design with county, week, and year fixed effects, we find that each additional protest causes an increase of about 97 individual donations (or roughly $18,000) to federal political campaigns in a county. Using this estimate, we determine that protests generated over $550,000,000 in donations to federal political campaigns from 2017 to 2021.
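To give a sense of scale, the figures above can be combined in a quick calculation. This is only a rough sketch: it assumes the aggregate total was obtained by multiplying the per-protest estimate by the number of protests, which may not match the exact procedure in the full project, and the variable names are my own.

```python
DONATIONS_PER_PROTEST = 97         # estimated additional donations per protest
DOLLARS_PER_PROTEST = 18_000       # estimated additional dollars per protest
TOTAL_DOLLARS = 550_000_000        # reported lower bound for 2017 to 2021

# Average size of a protest-induced donation implied by the two estimates.
avg_donation = DOLLARS_PER_PROTEST / DONATIONS_PER_PROTEST

# Number of protests the total would imply under the multiplication assumption.
implied_protests = TOTAL_DOLLARS / DOLLARS_PER_PROTEST

# Number of individual donations implied by that protest count.
implied_donations = implied_protests * DONATIONS_PER_PROTEST

print(f"average donation:  ${avg_donation:,.2f}")
print(f"implied protests:  {implied_protests:,.0f}")
print(f"implied donations: {implied_donations:,.0f}")
```

Because the reported total is a lower bound ("over $550,000,000"), the implied counts are lower bounds as well.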
Link to Full Project
This project implements a series of survey experiments investigating how Americans view border crossers’ reasons for migrating and whether Americans are more supportive of migrants escaping violence or migrants escaping poverty. Contrary to prior research, my findings indicate that natives view economic reasons for migration less favorably only when the risks the migrant faces in their home country are not equivalent to the risks associated with violence or persecution.
Link to Full Project
I am interested in the ways that artificial intelligence can and will change the world. My dissertation project, for example, is titled "Transforming Hearts and Minds: Using Artificial Intelligence to Reduce Anti-Immigrant Attitudes." The project features a series of experiments in which research subjects engage in various types of conversations with refugees and asylum seekers who, unbeknownst to the participants, are transformer-based chatbots. The project aims to better understand how different types of meaningful contact with humanitarian migrants can reduce exclusionary attitudes toward immigration.
Outside of data science and my research, I love running, swimming, and biking. In the past year, I’ve completed 2 triathlons (1 Sprint and 1 Olympic) and I hope to compete in many more in the years to come!
3+ years of experience as a quantitative researcher and data scientist.
LinkedIn Download My Resume