We present a broad characterization of gender representation in a large, heterogeneous sample of retail products. In particular, we study the textual information of online products, such as titles and descriptions. Our goal is to understand, from a semantic perspective, the differences and similarities in how girls (women) and boys (men) are represented. We perform a comparative analysis of the language used in gendered products (i.e., products that mention exclusively one of these two genders) and additionally compare it to the language of products that are explicitly gender neutral or inclusive. We find that the adjectives, skills, occupations, and values described in gendered products tend to reinforce stereotypes. Some of these stereotypes align with historical findings from research on traditional offline retail stores, while others are new, owing to the up-to-date product dataset on which our research is based. By leveraging additional existing resources, we gain insight into how certain product descriptions reflect stereotypes related to soft skills and hierarchical occupational information. Conversely, we find that a large segment of products present explicitly as gender neutral or inclusive. We explore whether the language used by gender-inclusive products can help mitigate the stereotypes reflected in gendered product text. Specifically, we study its effect on word embedding fairness through debiasing techniques.