Category: Content Moderation
Why We Worked with Parler to Implement Effective Content Moderation
Earlier today, The Washington Post published a feature detailing Hive’s work with social network Parler, and the role our content moderation solutions have played in protecting their community from harmful content and, as a result, earning their app reinstatement in Apple’s App Store.
We are proud of this very public endorsement of the quality of our content moderation solutions, but we also know that a client use case this high-profile may raise questions beyond what the article itself could address, including why we decided to work with Parler and what role we play in their solution. For detailed answers to those questions, please see below.
Why did Hive decide to work with Parler?
We believe that every company should have access to best-in-class content moderation capabilities to create a safe environment for their users. While vendors earlier this year terminated their relationships with Parler in the belief that their services were enabling a toxic environment, we believe our work addresses the core challenge Parler faced and enables a safe community in which Parler’s users can engage.
As outlined in our recent Series D funding announcement, our founders’ precursor to Hive was a consumer app business that itself confronted the challenge of moderating content at scale as the platform quickly grew. The lack of available enterprise-grade, pre-trained AI models to support this content moderation use case (and others) eventually inspired an ambitious repositioning of the company around building a portfolio of cloud-based enterprise AI solutions.
Our founders were not alone. Content moderation has since emerged as a key area of growth in Hive’s business, and we now power automated content moderation solutions for more than 75 platforms globally, including prominent dating services, video chat applications, verification services, and more. A December 2020 WIRED article detailed the impact of our work with iconic random chat platform Chatroulette.
When Parler approached us for help in implementing a content moderation solution for their community, we did not take the decision lightly. However, after discussion, we agreed that we had built this product to democratize access to best-in-class content moderation technology. From our founders’ personal experience, we know it is not feasible for most companies to build effective moderation solutions internally, and we therefore believe we have a responsibility to help any and all companies keep their communities safe from harmful content.
What is Hive’s role in content moderation relative to Parler (or Hive’s other moderation clients)?
Hive provides automated content moderation across video, image, text, and audio, spanning more than 40 classes (i.e., granular definitions of potentially harmful content classifications such as male nudity, gun in hand, or illegal injectables).
Our standard API provides a confidence score for every content submission against all of our 40+ model classes. In the case of Parler, model-flagged instances of hate speech or incitement in text are additionally reviewed by members of Hive’s distributed workforce of more than 2.5 million contributors (additional details below).
Our clients map our responses to their individual content policies. This covers which categories they look to identify, how sensitive content is treated (e.g., blocked or filtered), and the tradeoff between recall (i.e., the percentage of total instances identified by our model) and precision (i.e., the percentage of our model’s identifications that are correct). Hive partners with clients during onboarding as well as on an ongoing basis to provide guidance on setting class-specific thresholds based on client objectives and the desired tradeoffs between recall and precision.
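As a rough illustration of the threshold mapping described above, the sketch below applies client-chosen, class-specific thresholds to a set of confidence scores and computes precision and recall from review counts. The class names, threshold values, and function names are hypothetical and are not taken from Hive’s API.

```python
from typing import Dict, List, Tuple

# Hypothetical per-class thresholds chosen by a client; values are illustrative only.
THRESHOLDS: Dict[str, float] = {
    "male_nudity": 0.90,
    "gun_in_hand": 0.80,
}


def flagged_classes(scores: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the classes whose confidence score meets or exceeds the client's threshold."""
    return sorted(c for c, t in thresholds.items() if scores.get(c, 0.0) >= t)


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Precision: share of flagged items that were truly harmful.
    Recall: share of truly harmful items that were flagged."""
    flagged = true_pos + false_pos
    harmful = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / harmful if harmful else 0.0
    return precision, recall
```

Raising a class threshold generally flags fewer items (higher precision, lower recall); lowering it does the reverse, which is the tradeoff each client tunes to its own policy.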
It is then the responsibility of companies like Apple to determine whether the way our clients choose to implement our technology is sufficient for their apps to be distributed in its app store. In the case of Parler, Apple has now determined that it is.
What percentage of content is moderated, and how fast?
100% of posts on Parler are processed through Hive’s models at the point of upload, with automated responses returned in under 1 second.
Parler uses Hive’s visual moderation model to identify nudity, violence, and gore. Any harmful content identified is immediately placed behind a sensitive content filter at the point of upload (notifying users of sensitive content before they view).
Parler also uses Hive’s text moderation model to identify hate speech and incitement. Any potentially harmful content is routed for manual review. Posts deemed safe by Hive’s models are immediately posted to the site, whereas flagged posts are not displayed until model results are validated by a consensus of human workers. It typically takes 1-3 minutes for a flagged post to be validated. Posts containing incitement are blocked from appearing on the platform; posts containing hate speech are placed behind a sensitive content filter. Human review is completed by thousands of workers drawn from Hive’s distributed workforce of more than 2.5 million registered contributors; these workers have opted into the Parler jobs and are specifically trained and qualified to complete them.
In addition to the automated workflow, any user-reported content is automatically routed to Hive’s distributed workforce for additional review. Parler also independently maintains a separate jury of internal moderators who handle appeals and other reviews.
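The routing rules described above can be summarized in a short sketch. This is an illustration only, not Parler’s or Hive’s actual implementation: the `Action` names, the function names, and the verdict strings are hypothetical, and publishing a post once reviewers judge it safe is an assumption not spelled out above.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    PUBLISH = "publish"
    FILTER = "sensitive_content_filter"
    BLOCK = "block"
    HOLD_FOR_REVIEW = "hold_for_review"


def route_visual_post(model_flagged: bool) -> Action:
    """Visual content flagged for nudity, violence, or gore is filtered at upload."""
    return Action.FILTER if model_flagged else Action.PUBLISH


def route_text_post(model_flagged: bool, human_verdict: Optional[str] = None) -> Action:
    """Text flagged by the model is held until a human consensus verdict arrives."""
    if not model_flagged:
        return Action.PUBLISH          # deemed safe by the model, posted immediately
    if human_verdict is None:
        return Action.HOLD_FOR_REVIEW  # not displayed until reviewers reach consensus
    if human_verdict == "incitement":
        return Action.BLOCK            # blocked from appearing on the platform
    if human_verdict == "hate_speech":
        return Action.FILTER           # placed behind a sensitive content filter
    return Action.PUBLISH              # assumption: reviewers judged the post safe
```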
This process is illustrated in the graphic below.
How effective is Hive’s moderation of content for Parler, and how does that compare to moderation solutions in place on other social networks?
We have run ongoing tests since launch to evaluate the effectiveness of our models specific to Parler’s content. While we believe that these benchmarks demonstrate best-in-class moderation, there will always be some level of false negatives. However, the models continue to learn from their mistakes, which will further improve the accuracy over time.
Within visual moderation, our tests suggest the incidence rate of adult nudity and sexual activity content not placed behind a sensitive content filter is less than 1 in 10,000 posts. In Facebook’s Q4 2020 Transparency Report (which, separately, we think is a great step forward for the industry and something all platforms should publish), it was reported that the prevalence of adult nudity and sexual activity content on Facebook was ~3 to 4 views per 10,000 views. These numbers can be seen as generally comparable on the assumption that posts with sensitive content receive, on average, roughly as many views as all other posts.
Within text moderation, our tests suggest the incidence rate of hate speech (defined as text hateful towards another person or group based on protected attributes, such as religion, nationality, race, sexual orientation, gender, etc.) not placed behind a sensitive content filter is roughly 2 in 10,000 posts. In Q4 2020, Facebook reported the prevalence of hate speech was 7 to 8 views per 10,000 views on their platform.
Our tests suggest the incidence rate of incitement (defined as text that incites or promotes acts of violence) not removed from the platform is roughly 1 in 10,000 posts. Facebook does not report this category, so it cannot be used for benchmarking.
Does Hive’s solution prevent the spread of misinformation?
Hive’s scope of support to Parler does not currently include the identification of misinformation or manipulated media (e.g., deepfakes).
We hope the details above are helpful in further increasing understanding of how we work with social networking sites such as Parler and the role we play in keeping their environment (and others) safe from harmful content.
Learn more at https://thehive.ai/ and follow us on LinkedIn.
Press with additional questions? Please contact press@thehive.ai to request an interview or additional statements.
Note: All data specific to Parler above was shared with explicit permission from Parler.
Hive Adds Hate Model to Fully-Automated Content Moderation Suite
Social media platforms increasingly play a pivotal role in both spreading and combating hate speech and discrimination today. Now integrated into Hive’s content moderation suite, Hive’s hate model enables more proactive and comprehensive visual and textual moderation of hate speech online.
Year over year, our content moderation suite has emerged as the preeminent AI-powered solution both to help platforms keep their environments protected from harmful content and to dramatically reduce the exposure of human moderators to sensitive content. Hive’s content moderation models have consistently and significantly outperformed comparable models, and we are proud to currently work with more than 30 of the world’s largest and fastest-growing social networks and digital video platforms.
Today we are excited to officially integrate our hate model into our content moderation product suite, helping our current and future clients combat racism and hate speech online. We believe that blending our best-in-class models with the significant scale of our clients’ platforms can result in real step-change impact.
Detecting hate speech is a unique challenge that is dynamic and evolving rapidly. Context and subtle nuances vary widely across cultures, languages, and regions. Additionally, hate speech itself isn’t always explicit. Models must be able to recognize subtleties quickly and proactively. Hive is committed to taking on that challenge and, over the past months, we have partnered with several of our clients to ready our hate model for today’s launch.
How We Help
Hate speech can occur both visually and textually, with a large percentage occurring in photos and videos. Hive’s hate model is trained on more than 25 million human judgments, collected through our distributed global workforce of more than 2 million registered contributors, and supports both visual classification models and text moderation models.
Our visual classification models classify entire images into different categories by assigning a confidence score for each class. These models can be multi-headed, where each group of mutually exclusive model classes belongs to a single model head. Within our hate model, some examples of heads include the Nazi and KKK symbols, and other terrorist or white supremacist propaganda. Results from our model are actioned according to platform rules. Many posts are automatically actioned as safe or restricted; others are routed for manual review of edge cases where a symbol may be present but not in a prohibited use. Our visual hate models will typically achieve >98% recall and a <0.1% false positive rate. View our full documentation here.
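To make the multi-headed idea concrete, here is a minimal sketch in which each head holds a group of mutually exclusive classes and its scores are normalized independently of the other heads. Using a softmax per head is an assumption on our part, one common way to build such a classifier, and the head and class names are hypothetical rather than Hive’s actual model outputs.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: outputs are positive and sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_heads(head_logits: Dict[str, Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Convert raw per-class logits into confidence scores, one head at a time.

    Classes within a head compete with each other (mutually exclusive);
    different heads are scored independently.
    """
    scored: Dict[str, Dict[str, float]] = {}
    for head, class_logits in head_logits.items():
        names = list(class_logits)
        probs = softmax([class_logits[name] for name in names])
        scored[head] = dict(zip(names, probs))
    return scored
```

Because each head sums to 1 on its own, an image can score highly on several heads at once (for example, two different symbol heads) while the classes inside any single head remain mutually exclusive.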
Our text content moderation model is a multi-head classifier that will now include hate speech. This model automatically detects “hateful language” – defined, with input from our clients, as any language, expression, writing, or speech that expresses / incites violence against, attacks, degrades, or insults a particular group or an individual in a particular group. These specific groups are based on protected attributes such as race, ethnicity, national origin, gender, sex, sexual orientation, disability, and religion. Hateful language includes but is not limited to hate speech, hateful ideology, racial / ethnic slurs, and racism. View our full documentation here.
We are also breaking ground on solving the particularly challenging problem of multimodal relationships between visual and textual content, and expect to add multimodal capabilities over the coming weeks. Multimodal learning allows our models to understand the relationship between text and visual content in the same setting. This type of learning is important to better understand the meaning of language and the context in which it is used. Accurate multimodal systems can avoid flagging cases where the visual content on its own may be considered hateful, but the presence of counterspeech text (where individuals speak out against the hateful content) negates the hateful signal in the visual content. Similarly, multimodal systems can help flag cases where the visual and textual content are not considered hateful independently but are in fact hateful in the context of one another, such as hateful memes. Over time, we expect this capability to further reduce the need for human reviews of edge cases.
What’s Next?
Today’s release is a milestone we are proud of, but merely the first step in a multi-year commitment to helping platforms filter hate speech from their environments. We will continue to expand and enhance model classification with further input from additional moderation clients and industry groups.
How Hive is helping social platforms and BPOs manage emergent content moderation needs during the COVID-19 pandemic
Social platforms face significant PR and revenue risks during the coronavirus crisis, challenged to maintain safe environments in the face of constrained human content moderation and insufficient in-house AI; Hive is using AI and its distributed workforce of 2 million contributors to help
SAN FRANCISCO, CA (March 23, 2020) – The extraordinary measures taken worldwide to limit the spread of the coronavirus disease have disrupted the global economy, as businesses across industries scramble to adapt to a reality few were prepared for. In many cases, companies have stalled operations – with notable examples including airlines, movie theaters, theme parks, and restaurants among others.
The disruption facing consumer technology companies like Google, Facebook, Twitter, and others is different. Engagement on social media platforms is unaffected, if not boosted, by the outbreak. However, underneath user trends are significant public relations and revenue risks if content moderation cannot keep up with the volume of user-generated content uploads.
Hive, a San Francisco-based AI company, has emerged as a leader in helping platforms navigate the disruption through a combination of data labeling services at scale and production-ready automated content moderation models.
Hive operates the world’s largest distributed workforce of humans labeling data, now more than 2 million contributors from more than 100 countries, and has been able to step in to support emergent content moderation data labeling needs as contract workforces of business process outsourcers (BPOs) have been forced to go on hiatus given their inability to work from home. Further, Hive’s suite of automated content moderation models has consistently and significantly outperformed comparable models from top public clouds, and is being used by more than 15 leading platforms to reduce the volume of content required for human review.
Context for the Disruption
It is no secret that major social platforms employ tens of thousands of human content moderators to police uploaded content. These massive investments are made to maintain a brand-safe environment and protect billions of dollars of ad revenue from marketers who are quick to act when things go wrong.
Most of this moderation is done by contract workers, often secured through outsourced labor from firms like Cognizant and Accenture. Work from home mandates spurred by COVID-19 have disrupted this model, as most of the moderators are not allowed to work from home. Platforms have suggested that they will use automated tools to help fill the gap during the disruption, but they have also acknowledged that this is likely to reduce effectiveness and to result in slower response times than normal.
How Hive is Helping
Hive has emerged in a unique position to meet emergent needs from social media platforms.
As BPOs have been forced to stand down onsite content moderation services, significant demand for data labeling has arisen. Hive has been able to meet these needs on short notice, mobilizing the world’s largest distributed workforce of humans labeling data, now more than 2 million contributors sourced from more than 100 countries. Hive’s workforce is paid to complete data labeling tasks through a consensus-driven workflow that yields high quality ground truth data.
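As a simplified illustration of what a consensus-driven labeling step can look like, the sketch below accepts a label only when a sufficient share of workers agree on it. The function name and the 70% agreement threshold are hypothetical and are not a description of Hive’s actual workflow.

```python
from collections import Counter
from typing import List, Optional


def consensus_label(votes: List[str], min_agreement: float = 0.7) -> Optional[str]:
    """Return the majority label if enough workers agree, otherwise None.

    A None result means no consensus; in practice such a task might be sent
    to additional workers (an assumption, not a documented behavior).
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None
```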
“As more people worldwide stay close to home during the crisis and face unemployment or furloughs, our global workforce has seen significant daily growth and unprecedented capacity,” says Kevin Guo, Co-Founder and CEO of Hive.
Among data labeling service providers, Hive brings differentiated expertise to content moderation use cases. To date, Hive’s workforce has produced more than 80 million human annotations for “not safe for work” (NSFW) content and more than 40 million human annotations for violent content (e.g., guns, knives, blood). Those preexisting job designs and that workforce familiarity have enabled negligible job setup for new clients signed already this week.
Platforms are also relying on Hive to reduce the volume of content required for human review through use of Hive’s automated content moderation product suite. Hive’s models – which span visual, audio, and text solutions – have consistently and significantly outperformed comparable models from top public clouds, and are currently helping to power content moderation solutions for more than fifteen of the top social platforms.
Guo adds, “We have ample capacity for labeling and model deployment and are prepared to support the industry in helping to keep digital environments safe for consumers and brands as we all navigate the disruption caused by COVID-19.”
For press inquiries, contact Kevin Guo, Co-Founder and CEO, at kevin.guo@thehive.ai.