Reviewing the Reviews
In this part, we will think about the social consequences of deploying machine
learning models. Recall the food reviews dataset from the Week 3 Homework. The
code for this question loads weights from a pretrained logistic regression
model that classifies reviews as positive or negative based on a bag-of-words
representation of each review. It then prints out the weights associated with
certain words: ["yummy", "Indian", "Mexican", "Chinese", "European", "gross"].
Using this Colab
Notebook, please investigate the weights of the words that are printed out.
What do you notice? Try out other words by changing the list of words passed
in. Consider trying words from the list below; do the weights match your
expectations of what they should be? ["disgusting", "favourite", "caffeine", "stinks"]
You can also download the code here.
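If you want to poke at the weights outside of Colab, the snippet below is a minimal sketch of the kind of inspection the notebook performs. The file names, the stored vocabulary format, and the use of NumPy here are assumptions for illustration, not the notebook's actual code.

```python
# Minimal sketch of inspecting per-word weights (file and variable names are
# hypothetical, not taken from the actual Colab notebook).
import numpy as np

# Assume the pretrained model stored a weight vector and a vocabulary that
# maps each word to its column index in the bag-of-words representation.
weights = np.load("review_classifier_weights.npy")      # shape: (vocab_size,)
vocab = np.load("vocab.npy", allow_pickle=True).item()  # dict: word -> index

words = ["yummy", "Indian", "Mexican", "Chinese", "European", "gross"]
for word in words:
    idx = vocab.get(word.lower())
    if idx is None:
        print(f"{word}: not in vocabulary")
    else:
        print(f"{word}: weight = {weights[idx]:+.3f}")
```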
3A) How will a classifier with these weights handle “This is Indian food”
relative to “This is European food”? Why does this happen? Consider the
following statistics about the number of positive and negative reviews
containing each of these words.
| Word | Positive reviews | Negative reviews |
| --- | --- | --- |
| yummy | 96 | 28 |
| Indian | 2 | 1 |
| Mexican | 1 | 1 |
| Chinese | 1 | 2 |
| European | 0 | 1 |
| gross | 20 | 69 |
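As a reminder of the mechanics behind 3A: a bag-of-words logistic regression scores a sentence by summing the weights of the words it contains, adding the bias, and passing the result through a sigmoid. The sketch below illustrates this with made-up placeholder weights, not the actual weights from the pretrained model.

```python
# Illustrative only: how a bag-of-words logistic regression scores the two
# sentences in 3A. The weight values below are placeholders.
import math

weights = {"this": 0.0, "is": 0.0, "food": 0.1, "indian": 0.4, "european": -0.3}
bias = 0.0

def score(sentence):
    # Sum the weights of the words present, add the bias, squash with a sigmoid.
    z = bias + sum(weights.get(w, 0.0) for w in sentence.lower().split())
    return 1.0 / (1.0 + math.exp(-z))

print(score("This is Indian food"))    # higher "positive" probability
print(score("This is European food"))  # lower "positive" probability
```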
3B) When is this behavior desirable? When is it not? What if we're helping
Yelp build a restaurant recommendation system? What if we're doing a research
project in which we are trying to understand how different cuisines are
perceived?
3C) What changes could we make to
- the dataset,
- the training procedure, or
- post-processing
to achieve behavior different from what is happening in (a)?
3D) If we made the changes you came up with in (c), how would this affect
performance on
- the training set?
- the test set?
- a different set of food reviews?
3E) Is it important that we used logistic regression in this problem? Or would
the lessons we learned apply to other linear classifiers?
Food for Thought
In considering the questions above, you can see that machine learning doesn’t
end with training a model and showing that it has high accuracy. We also need
to think about whether the behavior of our models is fair. Here we were looking
at food reviews, and we already started to see evidence of bias. Now what if we
were building a classifier that looks at a resume and decides whether or not to
interview someone? Some companies have tried this:
Hiring Algorithms
“After an audit of the algorithm, the resume screening company found that the
algorithm found two factors to be most indicative of job performance: their name
was Jared, and whether they played high school lacrosse.”
Discussion Guide
- A) A neutral statement about Indian food scores higher than the equally neutral statement about European food. Both statements are entirely neutral, so this is not desirable behavior. It is likely due to how infrequently these terms appear in the training data and to existing correlations between each term and review sentiment.
- B) Yelp would probably not want its recommendations to distinguish between cuisines in this way, whereas a researcher studying how different cuisines are perceived would want to see exactly this signal.
- C)
    - Balance the number of positive and negative training examples containing words relating to ethnicity.
    - Remove all words denoting ethnicity, or any other characteristic on which we want to enforce neutrality, from the bag-of-words encoding (see the sketch after this list).
    - In post-processing, flag classifications of statements containing protected words, ignore the model's output for those cases, and handle them manually.
- D) The model would likely do somewhat worse on both the training set and the test set, because we removed a seemingly important correlation; it could do better on a different held-out set of food reviews where reviewers don’t share the same preferences.
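The second mitigation listed under (C), removing words on which we want to enforce neutrality from the bag-of-words encoding, might look something like the sketch below. The word list and the use of scikit-learn's CountVectorizer are assumptions for illustration, not the notebook's actual preprocessing.

```python
# Sketch: drop words we want the model to be neutral about from the
# bag-of-words vocabulary before training, so the classifier can never put
# weight on them. Word list and vectorizer choice are assumptions.
from sklearn.feature_extraction.text import CountVectorizer

PROTECTED_WORDS = {"indian", "mexican", "chinese", "european"}  # extend as needed

reviews = [
    "This Indian food is yummy",
    "This European food is gross",
]

# Passing the protected terms as stop words removes them during tokenization,
# so their columns never exist in the bag-of-words matrix.
vectorizer = CountVectorizer(stop_words=list(PROTECTED_WORDS))
X = vectorizer.fit_transform(reviews)
print(vectorizer.get_feature_names_out())  # protected words are absent
```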