Leveraging the gender-guesser package to predict user gender from Twitter display names
I scraped Twitter users' display names and built a gender predictor that assigns a gender to each user.
The package supports region-specific predictions, because the gender associated with a first name can vary by country.
In addition to "male" and "female" predictions, the model's outputs include "mostly male", "mostly female", and "androgynous".
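As a rough illustration of the lookup described above, here is a minimal sketch of how a display name might be reduced to a first name and passed to the package. The helpers `first_name`, `guess_gender`, and `load_detector` are hypothetical names introduced for this example, and the name-cleaning rule (take the first run of letters) is an assumption, not necessarily the preprocessing used in the project. The `Detector().get_gender(name, country)` call is the package's lookup, with country given as a string such as "great_britain".

```python
import re


def first_name(display_name: str) -> str:
    """Return the first run of letters in a display name, capitalized.

    Assumption: emoji, digits, and punctuation are noise, and the first
    alphabetic token is the user's first name.
    """
    tokens = re.findall(r"[^\W\d_]+", display_name)
    return tokens[0].capitalize() if tokens else ""


def guess_gender(display_name, detector, country=None):
    """Look up a gender label for a display name, optionally per country."""
    name = first_name(display_name)
    if country is None:
        return detector.get_gender(name)
    return detector.get_gender(name, country)


def load_detector():
    """Build a gender-guesser detector (requires `pip install gender-guesser`)."""
    # Imported lazily so the helpers above work without the package installed.
    import gender_guesser.detector as gender
    return gender.Detector()


# Example usage (hypothetical display name):
# detector = load_detector()
# label = guess_gender("Jamie ✨", detector, country="great_britain")
```

Passing the detector in as an argument keeps the cleaning step testable on its own and means the name database is loaded once rather than once per user.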
For simplicity's sake, I folded the "mostly" values into their respective categories, so "mostly male" became "male" and "mostly female" became "female".
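The reclassification can be sketched as a small lookup table. This assumes the raw labels are the strings the package emits ("mostly_male" and "mostly_female", with an underscore); `COLLAPSE` and `collapse_label` are hypothetical names for this example, and any label not in the table is passed through unchanged.

```python
from collections import Counter

# Map the hedged labels onto their base category; everything else is kept.
COLLAPSE = {"mostly_male": "male", "mostly_female": "female"}


def collapse_label(label: str) -> str:
    """Fold 'mostly_*' predictions into 'male' or 'female'."""
    return COLLAPSE.get(label, label)


# Example with made-up predictions, to show the effect on category counts.
raw = ["male", "mostly_male", "female", "mostly_female", "andy"]
counts = Counter(collapse_label(label) for label in raw)
```

Keeping the mapping in one dictionary makes the simplification easy to undo if the finer-grained labels are needed later.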
With each user's gender defined, more advanced demographic segmentation and analysis become possible within the larger project I was conducting on NLP-based brand sentiment analysis. For example, I can compare sentiment toward particular brands across demographic groups.