Evaluation of Membership Satisfaction and Predicting Attendance for the Society of Petroleum Engineers
Objective: The Society of Petroleum Engineers Permian Basin (SPE PB) Section wanted to better understand why people attended its monthly lunch and learns and how they could be incentivized to attend again.
Process: Two surveys were deployed over a four-month period. The first survey consisted of 10-12 questions, while the second contained 50 questions. The surveys were built in Google Forms and distributed via LinkedIn, Facebook, email, and in-person events.
Solution: A simple linear regression model was created to understand the strategic areas driving satisfaction and, subsequently, attendance and recommendation of the lunch and learns. From there, a logistic regression focused specifically on purchasers was constructed to better understand the relationship between satisfaction and purchase probability. In addition, it was possible to predict how much one person would spend on SPE PB lunch and learns in a year if their satisfaction were increased by one unit. This implied a profit of $44,820 per year for the Permian Basin Section alone.
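As a minimal sketch of how a logistic regression ties a one-unit satisfaction increase to purchase probability, consider the example below. The coefficients `INTERCEPT` and `SLOPE` and the function name `purchase_probability` are hypothetical placeholders chosen for illustration; they are not the fitted values from the SPE PB model.

```python
import math

# Hypothetical coefficients for illustration only; not the fitted SPE PB values.
INTERCEPT = -2.0
SLOPE = 0.6  # change in log-odds of purchase per one-unit satisfaction increase


def purchase_probability(satisfaction, intercept=INTERCEPT, slope=SLOPE):
    """Logistic model: P(purchase) = 1 / (1 + exp(-(intercept + slope * x)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * satisfaction)))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Compare a respondent at satisfaction 3 with the same respondent at 4.
before = purchase_probability(3)
after = purchase_probability(4)
odds_ratio = odds(after) / odds(before)

print(f"P(purchase | satisfaction=3) = {before:.3f}")
print(f"P(purchase | satisfaction=4) = {after:.3f}")
print(f"Odds ratio per unit = {odds_ratio:.3f}")
```

Under a model of this form, a one-unit increase in satisfaction multiplies the odds of purchase by exp(SLOPE), regardless of the starting satisfaction level.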
Fresh Arts Survey Text and Sentiment Analysis
Objective: Fresh Arts, a non-profit based in Houston, TX, used Survey Monkey to survey artists and better understand their thoughts on the Fresh Arts community. The survey included 3 free-form questions to which respondents could provide detailed responses. Fresh Arts had two questions: Should it continue to pay a subscription fee for Survey Monkey’s free-text analysis? What is the sentiment of the respondents?
Process: The responses to the 3 free-form questions were split into separate files so that each question could be analyzed individually. Personal information was masked to protect privacy. Google and AFINN sentiment analysis, along with text analysis, was run on each question.
Solution: Google sentiment was better at predicting artists’ general feelings about the arts community in Houston, TX. Text analysis in Python provided a much clearer picture than Survey Monkey’s default word cloud. Results were given to Fresh Arts so that it could build out its 5-year business plan based on the artists’ feedback. Our team also recommended asking more specific free-form questions in future surveys.
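A lexicon-based approach such as AFINN scores a response by summing per-word valence values. The sketch below illustrates the idea alongside a basic word count. `TOY_LEXICON`, `STOPWORDS`, the function names, and the two sample responses are all made up for illustration; the values are not copied from the real AFINN list, and the responses are not Fresh Arts survey data.

```python
import re
from collections import Counter

# Illustrative word scores only; the real AFINN list is far larger
# and these values are not copied from it.
TOY_LEXICON = {"love": 3, "great": 3, "supportive": 2, "expensive": -2, "isolated": -2}
STOPWORDS = {"the", "a", "is", "but", "and", "i", "it", "feel", "can"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of all tokens; unknown words count as 0."""
    return sum(lexicon.get(tok, 0) for tok in tokenize(text))


def top_words(responses, n=3, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens across all responses."""
    counts = Counter(
        tok for r in responses for tok in tokenize(r) if tok not in stopwords
    )
    return counts.most_common(n)


# Made-up responses standing in for one free-form question.
responses = [
    "I love the community, it is great and supportive",
    "Studio space is expensive and I feel isolated",
]
scores = [sentiment_score(r) for r in responses]
print(scores)
print(top_words(responses))
```

Positive totals indicate favorable wording and negative totals unfavorable wording. Word-level scoring of this kind ignores negation and context.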
Humana-Mays Healthcare Analytics 2021 Competition
Objective: Humana poses a business question for teams of university students across the United States of America to solve. I joined three fellow graduate students to form a team, The Residuals. Humana asked us to predict vaccination status using a data set of customer information that it provided. Our team signed a non-disclosure agreement not to distribute the data. To learn more: https://mays.tamu.edu/humana-tamu-analytics/information/
Process: Humana provided demographics, hospital visits, income, location, and other features. Initial feature exploration was done to better understand the relationship between these features and vaccination status.
Solution: Random undersampling was used because the data set was imbalanced between vaccinated and non-vaccinated persons. The rebalanced data set was then used to fit a bootstrap random forest, but XGBoost and LightGBM were ultimately found to have the best predictive power. Our team placed #8 overall in the USA out of several universities. Final code can be seen on GitHub.
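Random undersampling can be sketched in a few lines of standard-library Python. The function name `random_undersample`, the toy rows, and the 8-to-2 class split are assumptions for illustration only; the Humana data is under a non-disclosure agreement and this is not the team's competition code.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class has the
    same number of rows as the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Size of the smallest class sets the target for every class.
    target = min(len(members) for members in by_class.values())
    sampled = []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            sampled.append((row, label))

    rng.shuffle(sampled)  # avoid class-ordered output
    out_rows = [row for row, _ in sampled]
    out_labels = [label for _, label in sampled]
    return out_rows, out_labels


# Made-up toy data: 8 rows of one class and 2 rows of the other.
rows = [{"id": i} for i in range(10)]
labels = ["vaccinated"] * 8 + ["not_vaccinated"] * 2
balanced_rows, balanced_labels = random_undersample(rows, labels, seed=42)
print(Counter(balanced_labels))
```

A common design choice is to undersample only the training split, so that evaluation still reflects the original class balance.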