Hi, I’m Nick!
Data Scientist

I'm a data scientist and PhD candidate at the University of Pennsylvania. In my research, I use machine learning, econometrics, and A/B testing to study human behavior and the media. In addition to my PhD, I’m pursuing a Master's in Data Science and Statistics from The Wharton School.

I spent the summer of 2022 as a Machine Learning Intern at DataCamp, where I used transformer-based NLP models to build an internal linking recommendation tool. I now hope to continue to apply these skills and gain additional experience in other data-driven roles.

Contact Download My Resume

Data Science Portfolio

“Classifying Local News Television Transcripts Using RoBERTa” (with Sam Wolken and Chloe Ahn)

We scraped a novel dataset of approximately 18,000 closed-captioning transcripts from local television news programs in three cities from 2014 to 2018 and trained a series of RoBERTa classifiers to identify news topics in these transcripts over time. Averaged across the seven selected topics, the RoBERTa models achieved a precision of 0.85, a recall of 0.876, and an F1 score of 0.859.
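As a rough illustration of how per-topic scores can be averaged into figures like these, here is a minimal Python sketch. The topic names and labels are toy values invented for the example, and it assumes one binary classifier per topic with an unweighted (macro) average; the project's actual evaluation code may differ.

```python
from statistics import mean


def binary_scores(y_true, y_pred):
    """Return (precision, recall, f1) for one binary topic classifier."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(results):
    """Unweighted mean of each metric across topics."""
    precisions, recalls, f1s = zip(*results.values())
    return mean(precisions), mean(recalls), mean(f1s)


# Hypothetical topics and toy labels (1 = transcript is about the topic).
toy = {
    "crime": ([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]),
    "weather": ([0, 1, 1, 0, 0], [0, 1, 1, 0, 1]),
}
results = {topic: binary_scores(t, p) for topic, (t, p) in toy.items()}
avg_precision, avg_recall, avg_f1 = macro_average(results)
print(f"precision={avg_precision:.3f} recall={avg_recall:.3f} f1={avg_f1:.3f}")
```

Under this kind of averaging, the mean F1 is the average of the per-topic F1 scores, so it need not equal the harmonic mean of the averaged precision and recall (which would be about 0.863 for the figures above).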

View findings through interactive dashboard

“Who Donates? Using Machine Learning to Predict Federal Donation Behavior”

Using governmental administrative data and socio-demographic data, I show that LASSO logistic regression and random forest are both effective at predicting individual-level donation behavior. LASSO logistic regression correctly classifies 82.7% of test cases (61.9% of positive cases), and random forest correctly classifies 92.8% of test cases (99.9% of positive cases). Although both accuracy scores are notably higher than the 74.1% no-information rate, random forest proves to be the superior model by far.
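For readers unfamiliar with the comparison, the sketch below shows how overall accuracy, the share of positive cases correctly classified, and the no-information rate (the accuracy of always guessing the most common class) relate to one another. The function name and labels are made up for illustration and are not the project's data or code.

```python
def classification_summary(y_true, y_pred):
    """Accuracy, positive-class recall, and no-information rate for binary labels."""
    n = len(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    positives = sum(y_true)
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    majority = max(positives, n - positives)  # size of the most common class
    return {
        "accuracy": correct / n,
        "sensitivity": true_positives / positives if positives else 0.0,
        "no_information_rate": majority / n,
    }


# Toy labels (1 = donor, 0 = non-donor), invented for this example.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
summary = classification_summary(y_true, y_pred)
print(summary)
```

A model is only informative when its accuracy clears the no-information rate, which is why both models above are compared against the 74.1% baseline.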

Link to Full Project

“Protests Increase Donations to Federal Political Campaigns” (with Daniel Gillion)

Using multiple identification strategies, we show that political protests generate a significant amount of money for federal political campaigns across the United States over time. Using a staggered difference-in-differences design with county, week, and year fixed effects, we find that each additional protest causes an increase of about 97 individual donations (or roughly $18,000) to federal political campaigns in a county. Based on this estimate, we determine that protests generated over $550 million in donations to federal political campaigns from 2017 to 2021.
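The figures above can be related with some back-of-envelope arithmetic. This is only a naive sketch that assumes the total is the per-protest dollar estimate scaled linearly; the paper's actual calculation may differ, and the implied counts below are not results reported by the project.

```python
# Figures quoted in the project summary above.
donations_per_protest = 97
dollars_per_protest = 18_000
total_dollars = 550_000_000

# Implied values under a naive linear scaling (an assumption of this sketch).
average_donation = dollars_per_protest / donations_per_protest
implied_protests = total_dollars / dollars_per_protest
implied_donations = implied_protests * donations_per_protest

print(f"average donation: ${average_donation:,.2f}")
print(f"implied protests: {implied_protests:,.0f}")
print(f"implied donations: {implied_donations:,.0f}")
```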

Link to Full Project

“Fleeing For Their Lives: Reconsidering How Americans View Immigrants’ Reasons for Migrating”

This project implements a series of survey experiments investigating how Americans view border crossers’ reasons for migrating and whether Americans are more supportive of migrants escaping violence or migrants escaping poverty. Contrary to prior research, my findings indicate that Americans view economic reasons for migrating less favorably only when the risks the migrant faces in their home country are not comparable to the risks associated with violence or persecution.

Link to Full Project

About Me

I am interested in the ways that artificial intelligence can and will change the world. My dissertation project, for example, is titled "Transforming Hearts and Minds: Using Artificial Intelligence to Reduce Anti-Immigrant Attitudes." The project features a series of experiments in which research participants engage in various types of conversations with refugees and asylum seekers who, unbeknownst to the participants, are transformer-based chatbots. The project aims to better understand how different types of meaningful contact with humanitarian migrants can reduce exclusionary attitudes toward immigration.

Outside of data science and my research, I love running, swimming, and biking. In the past year, I’ve completed two triathlons (one sprint and one Olympic), and I hope to compete in many more in the years to come!

Experience

3+ years of experience as a quantitative researcher and data scientist.

June 2022 - Sep 2022
Machine Learning Intern
DataCamp
Using several transformer-based NLP models, I created an internal linking tool that includes over 80,000 internal connections between DataCamp web pages, a task that would have taken at least 1,300 hours of labor if done manually. This tool has the potential to increase web traffic and, most importantly, enhance the DataCamp user experience.

LinkedIn Download My Resume

Education

2020-2024 (Expected)
University of Pennsylvania
PhD in Political Science
2020-2023 (Expected)
The Wharton School, University of Pennsylvania
Master's in Data Science and Statistics
2019-2020
Columbia University
Master's in Political Science
2013-2017
Boston College
Bachelor's in Political Science