Discrimination, whether based on race, ethnicity, gender, sexual orientation or religion, is wrong.
But some forms of discrimination can be incredibly useful.
Life as we know it would be impossible if we could not discriminate between things we like and things we don’t.
Increasingly, we are relying on machines to discriminate on our behalf.
Understanding the limitations of machine-based discrimination is essential to ensuring we do not become its inadvertent victims.
Discrimination in Content Discovery
As the quantity of content available on the internet increases, users are relying on machines to decide what they should consume next.
We’ve all experienced the Netflix movie that got 5 stars but was not so good. The collective wisdom of other viewers doesn’t always reflect our personal taste.
Predicting a single person’s affinity for a particular video is complicated by the fact that users are unwilling to answer a lengthy survey after watching a video or reading a blog post. So content providers must rely on metadata about the user’s experience, such as:
1. Rating – Some systems ask users to rate content, e.g. 1 – 5. But these systems are prone to sample bias (sampling only the users willing to offer their opinion). And the 1 – 5 rating fails to capture subtle aspects of the user’s experience.
2. Watch Time – Whether a video was viewed and watched in its entirety. This can be a relatively good indicator of whether the user found the content interesting. But this metric can’t capture the difference between an OK movie and a movie that changed your life.
3. Sharing – If a user shares a video on social media, the user is essentially vouching for the content. But they could be saying, “check this out, it’s awful” or “I loved this video”.
4. Bounce Rate – The extent to which the user engages with other content on the website or “bounces” away to another website. Content providers assume that users who navigate to the next piece of content are happy with the first experience. But perhaps they did not find what they were looking for and continued to look.
It could also be the case that the user is very happy with the first piece of content and has no further interest. For example, if I am looking for a “how-to” video and the video solves my problem, I don’t keep looking. My need has been satisfied, so I bounce away from the site. That bounce could be misinterpreted as a sign of unhappiness.
Measuring metadata can be useful, but it doesn’t reveal a lot about the quality of the video or the user’s opinion. It tends to accentuate content that triggers certain people to engage in activities, such as rating and sharing.
In effect, metadata analysis amplifies content that’s provocative while burying content that may be equally valid but does not provoke an action.
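To make the amplification effect concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Signals` fields, the weights in `engagement_score`, and the sample numbers are hypothetical and do not describe how any real service scores content. It only shows how a score built from ratings, shares, watch time and bounces can favor a provocative video over a how-to video that quietly solved the viewer’s problem.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-video metadata counts."""
    views: int
    ratings: int      # viewers who bothered to leave a rating
    shares: int       # viewers who shared the video
    completions: int  # viewers who watched to the end
    bounces: int      # viewers who left the site afterwards


def engagement_score(s: Signals) -> float:
    """Illustrative score: rewards visible actions, penalizes bounces.

    The weights are made up for this example.
    """
    if s.views == 0:
        return 0.0
    return (
        2.0 * s.shares / s.views
        + 1.0 * s.ratings / s.views
        + 0.5 * s.completions / s.views
        - 0.5 * s.bounces / s.views
    )


# A how-to video: almost everyone finishes it, then leaves satisfied.
how_to = Signals(views=1000, ratings=20, shares=10, completions=900, bounces=850)

# A provocative video: fewer people finish it, but many rate and share it.
provocative = Signals(views=1000, ratings=200, shares=150, completions=400, bounces=300)

ranked = sorted(
    {"how_to": how_to, "provocative": provocative}.items(),
    key=lambda item: engagement_score(item[1]),
    reverse=True,
)
for name, signals in ranked:
    print(f"{name}: {engagement_score(signals):.3f}")
```

With these made-up numbers the provocative video ranks first, even though nine out of ten viewers of the how-to video watched it to the end. The satisfied viewers who bounced are counted against the video that helped them.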
Some services, such as Netflix and Amazon, attempt to discriminate based on the data collected about all the user’s choices. They try to pigeonhole the user into an avatar, based on the sum of the user’s activities.
The assumption is that if you’ve consumed an assortment of content, you must be like other people who also watched the same videos or read the same books. Algorithms designed to make these choices have many flaws:
- Echo Chamber – The algorithm can’t know what you haven’t yet revealed. So it produces an echo chamber effect and gives you more of the type of content you’ve previously seen.
- Boring – The algorithms often lack serendipity. Some algorithms are designed to throw out the occasional “test balloon”, but these tests are often clumsy and unintuitive.
- Lack of Subtlety – These algorithms also fail to capture a qualitative response. In the same way metadata analysis fails to uncover your true feelings about content, analysis of your viewing patterns is a poor approximation of your feelings about the content.
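The “you must be like other people who watched the same videos” assumption can be sketched in a few lines of Python. This is a toy example of user-based filtering using Jaccard overlap between viewing histories. The function names, the `history` data and the choice of similarity measure are assumptions made for illustration, not the method Netflix, Amazon or any other service actually uses.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two viewing histories, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target: str, history: dict, k: int = 1) -> list:
    """Suggest unseen items taken from the k most similar users."""
    seen = history[target]
    neighbours = sorted(
        (user for user in history if user != target),
        key=lambda user: jaccard(seen, history[user]),
        reverse=True,
    )[:k]

    # Count how many neighbours consumed each item the target has not seen.
    scores = {}
    for user in neighbours:
        for item in history[user] - seen:
            scores[item] = scores.get(item, 0) + 1

    # Most popular among neighbours first, ties broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical viewing histories.
history = {
    "you": {"thriller_a", "thriller_b"},
    "ann": {"thriller_a", "thriller_b", "thriller_c"},
    "bob": {"documentary_a", "comedy_a"},
}

print(recommend("you", history))
```

Because the only input is what you have already watched, the suggestion comes from the user whose history most resembles yours, and it is another thriller. The documentary and the comedy never surface, which is the echo chamber effect in miniature.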
Discrimination by Design
Perhaps the most troubling aspect of content discovery systems is the tendency to bias a user’s experience based upon assumptions made by the system’s designer. In effect, the content discovery algorithm that forms your experience is an extension of the prejudice of its designer.
Deliberate or unwitting decisions made during the design of the content discovery algorithm can build in bias that is reflected in the choices offered to users.
If an algorithm draws conclusions about your politics, it may limit you to only a small segment of the relevant content.
To be clear, users do this to themselves when they consume conventional media. A person who watches Fox News probably has different opinions than a person who watches CBS News. But in this case, the user is probably more aware that they have made an ideological choice when they choose one channel over another.
Users in online environments are being digitally herded into categories. Most are unaware this is taking place.
Here are some ways to avoid digital discrimination:
- Search for new subjects periodically. This will “shake up” the algorithms and prevent you from getting too much of the same content by default.
- Don’t rely on one service for all your content.
- Be aware that your choices are being limited by the machine.
Ultimately, you are the last line of defense against becoming a digital sheep.
Photo Credit: Jame Barwell