We are excited to announce that Hive is now offering Thorn’s predictive technology through our CSAM detection API! This API now enables customers to identify novel cases of child sexual abuse material (CSAM) in addition to detecting known CSAM using hash-based matching.
Our Commitment to Child Internet Safety
At Hive, making the internet safer is core to our mission. While our content moderation tools help reduce human exposure to harmful content across many categories, addressing CSAM requires specialized expertise and technology.
That’s why we’re expanding our existing partnership with Thorn, an innovative nonprofit that builds technology to defend children from sexual abuse and exploitation in the digital age.
Until now, our integration with Thorn focused on hash-matching technology to detect known CSAM. The new CSAM detection API builds on this foundation by adding advanced machine learning capabilities that can identify previously unidentified CSAM.
By combining Thorn’s industry-leading CSAM detection technology with Hive’s comprehensive content moderation suite, we provide platforms with robust protection against both known and newly created CSAM.
How the Classifier Works
The classifier works by first generating embeddings of the uploaded media. An embedding is a list of computer-generated scores between 0 and 1. After generating the embeddings, Hive permanently deletes all of the original media. We then use the classifier to determine whether the content is CSAM based on the embeddings. This process ensures that we do not retain any CSAM on our servers.
The classifier returns a score between 0 and 1 that predicts whether a video or image is CSAM. The response object will have the same general structure for both image and video inputs. Please note that Hive will return both results together: probability scores from the classifier and any match results from hash matching against the aggregated hash database.
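As a rough illustration of how that combined response can be consumed (the field names below are placeholders rather than the exact schema, which is covered in the documentation referenced below):

```python
# Illustrative only: field names are placeholders, not Hive's exact response schema.
example_response = {
    "csam_classifier": {"score": 0.03},  # probability (0 to 1) that the media is CSAM
    "hash_matches": [],                  # matches against the aggregated hash database
}

score = example_response["csam_classifier"]["score"]
if score >= 0.9 or example_response["hash_matches"]:
    print("Escalate for review and follow your CSAM reporting obligations.")
```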
For a detailed guide on how to use Hive’s CSAM detection API, refer to the documentation.
Building a Safer Internet
Protecting platforms from CSAM demands scalable solutions. The problem is complex, but our integration with Thorn’s advanced technology provides an efficient way to detect and stop CSAM, helping to safeguard children and build a safer internet for all.
If you have any further questions or would like to learn more, please reach out to sales@thehive.ai or contact us here.
Three complementary APIs to understand and protect proprietary content
Hive
We are excited to launch a new product suite that is purpose-built to empower our customers to protect their own IP or proactively monitor digital platforms for the potential misuse of others’ IP.
Hive’s Intellectual Property and Publicity Detection suite consists of three complementary cloud-based APIs:
Media Search API: identifies when copies and variants of content from thousands of movies and TV shows are being used.
Likeness Detection API: identifies the “likeness” of the most popular characters or artworks in images across a wide breadth of IP domains, based on their defining characteristics.
Celebrity Recognition API: detects the presence of well-known figures in images. It’s powered by our face detection and face similarity models and a curated and constantly updated Custom Search Index.
All three of these APIs boast comprehensive indexes that are proactively updated. Each API is seamless to integrate and can be built into any application with just a few lines of code. Importantly, with Hive, our customers can achieve speed at scale, as we serve real-time responses to billions of API calls each month.
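As a sketch of what that integration can look like in practice (the endpoint path shown is an illustrative synchronous endpoint and not necessarily the exact URL for each API; see the documentation for details):

```python
# Minimal sketch of submitting an image to one of Hive's IP detection APIs.
# The endpoint path and response handling are illustrative placeholders.
import requests

resp = requests.post(
    "https://api.thehive.ai/api/v2/task/sync",        # example synchronous endpoint
    headers={"Authorization": "Token YOUR_API_KEY"},   # key provided by your Hive rep
    data={"url": "https://example.com/uploaded_frame.jpg"},
)
print(resp.json())  # match results for the submitted image
```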
Media Search API
Hive’s Media Search API automates human-like visual analysis to catch reposts of movies and TV shows. It is a powerful tool both for digital platforms that want to avoid hosting copyright-protected media and for content providers and streaming sites looking to be alerted to unauthorized reposts of their proprietary content on digital platforms.
Our Media Search API detects not only exact duplicates, but also modified versions, leveraging our Image Similarity Model. This includes manual image manipulations like rotations and text overlays, as well as more subtle augmentations such as introduction of noise, filters, and other pixel-level changes.
Additionally, for each query, the Media Search API response includes valuable metadata such as IMDb ID, content type (movie or TV show), title, relevant timestamps, and season and episode numbers (if applicable). This metadata gives our customers full context around the API’s match results.
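For instance, a single match might carry metadata along these lines (field names are illustrative, not the exact Media Search API schema):

```python
# Illustrative metadata on a Media Search match; field names are placeholders.
example_match = {
    "imdb_id": "tt1234567",
    "content_type": "tv_show",   # or "movie"
    "title": "Example Series",
    "season": 2,                 # populated for TV shows
    "episode": 5,
    "timestamp": 1284.0,         # seconds into the matched episode
}
print(f'{example_match["title"]} S{example_match["season"]}E{example_match["episode"]}')
```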
Finally, Hive’s Media Search API brings to bear a comprehensive search index that is regularly and proactively updated, so our matches are always up-to-date. You can learn more about Hive’s Media Search API on our documentation page.
Likeness Detection API
To complement our Media Search API, we are launching our Likeness Detection API, which identifies a comprehensive set of characters and artworks across the most well-known intellectual property domains.
Hive’s Likeness Detection API is trained on thousands of images per character or artwork across a wide breadth of domains in which that particular subject may have appeared. As a result, our Likeness Detection API is able to identify the “likeness” of well-known characters in any context, based on their defining characteristics. For example, our Likeness Detection API understands that blue costume + red cape + “S” emblem represents the likeness of a certain Kryptonian superhero, whether that subject appears in a live-action film, cartoon, Halloween costume, or AI-generated image.
Like our Media Search API, our Likeness Detection API is a powerful tool for digital platforms to proactively avoid hosting copyright-protected content, as well as for content creators and streaming platforms to monitor for the misuse of their proprietary content.
Beyond content moderation, Hive’s Likeness Detection API also empowers generative AI platforms to proactively filter and remove potentially copyright-protected characters buried in their datasets before training text-to-image models. Of course, the Likeness Detection API is also capable of detecting the likeness of characters within AI-generated images themselves, which may be highly stylized.
Finally, beyond monitoring for the potential misuse of proprietary content, digital platforms can leverage our Likeness Detection API to more deeply understand the content that their users are engaging with. Understanding the popular IP that users are posting and sharing is a valuable tool for contextual ad-targeting and improving content recommendation systems. Visit our documentation page to learn more about Hive’s Likeness Detection API.
Celebrity Recognition API
Rounding out Hive’s IP and Publicity Detection suite is our Celebrity Recognition API, which enables our customers to identify thousands of celebrities, politicians, athletes, and other well-known public figures in images and videos.
Hive’s Celebrity Recognition API automates human-like perceptual comparisons to identify any public figures visible in an image or video. Our Celebrity Recognition API is powered by our face detection and face similarity models and a curated and constantly updated Custom Search Index. Given an input image, Hive detects all faces present and returns a bounding box and a match for each, as well as a confidence score. When the face does not belong to a celebrity, the string returned is “No Match” and no confidence score is returned.
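A sketch of how that response might be handled is below; the field names and values are placeholders rather than the exact schema described in our documentation.

```python
# Illustrative sketch of iterating over a Celebrity Recognition result.
# Field names and values are placeholders, not the exact response schema.
example_response = {
    "faces": [
        {"bounding_box": [0.12, 0.08, 0.34, 0.41], "match": "Example Celebrity", "confidence": 0.97},
        {"bounding_box": [0.55, 0.10, 0.72, 0.44], "match": "No Match"},
    ]
}

for face in example_response["faces"]:
    if face["match"] == "No Match":
        continue  # a face was detected, but it does not belong to a known public figure
    print(face["match"], face.get("confidence"), face["bounding_box"])
```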
Paired with Hive’s AI-Generated Content Classification APIs, social platforms can use our Celebrity Recognition API to prevent the proliferation of political or personal misinformation by filtering content for specific well known figures, as well as screening for deepfakes or AI-generated content.
Additionally, digital platforms can use our Celebrity Recognition API to easily sort and tag large media libraries by automatically detecting which celebrities are present. Similarly, streaming platforms and online media databases can quickly identify which actors appear in any frame of films, TV shows, interviews, and more in order to highlight specific actor details to enrich their user experiences.
Finally, Hive’s Celebrity Recognition API can equip celebrities themselves, or the agencies who represent them, to monitor digital platforms for potential misuse of their likeness, enabling proactive brand protection for well-known public figures. To learn more, check out our documentation page for Celebrity Recognition API.
How You Can Use IP and Publicity Detection Products Today
With our launch of Hive’s IP and Publicity Detection Products, Hive is bringing to market a comprehensive suite of AI models for understanding and protecting content. Don’t just trust us, though; test us: reach out to sales@thehive.ai and our team can share API keys and credentials for your new endpoints.
Hive's Innovative Integration with Thorn's Safer Match
Hive
We are excited to announce that Hive’s partnership with Thorn is now live! Our current and prospective customers can now easily integrate Thorn’s Safer Match, a CSAM (child sexual abuse material) detection solution, using Hive’s APIs.
The Danger of CSAM
The threat of CSAM involves the production, distribution, and possession of explicit images and videos depicting minors. Every platform with an upload button or messaging capabilities is at risk of hosting CSAM. In fact, in 2023 alone, over 104 million files of potential CSAM were reported to the National Center for Missing and Exploited Children.
The current state-of-the-art approach is to use a hashing function to “hash” the content and then “match” it against a database aggregating 57+ million verified CSAM hashes. If the content hash matches against the database, then the content can be flagged as CSAM.
How the Integration Works
When presented with visual content, we first hash it, then match it against known instances of CSAM (a minimal client-side sketch of this flow follows the steps below).
Hashing: We take the submitted image or video, and convert it into one or more hashes.
Deletion: We then immediately delete the submitted content, ensuring nothing stays on Hive’s servers.
Matching: We match the hashes against the CSAM database and return the match results to you.
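Here is that flow from the client’s perspective, as a minimal sketch; the endpoint path and response fields are illustrative placeholders rather than the exact API schema.

```python
# Minimal client-side sketch of the hash-and-match flow.
# Endpoint path and field names are illustrative placeholders.
import requests

with open("user_upload.jpg", "rb") as f:
    resp = requests.post(
        "https://api.thehive.ai/api/v2/task/sync",       # example synchronous endpoint
        headers={"Authorization": "Token YOUR_API_KEY"},
        files={"media": f},
    )

result = resp.json()
if result.get("matches"):  # hypothetical field: matches against the CSAM hash database
    print("Known CSAM hash match; escalate per your reporting workflow.")
```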
Hive’s partnership with Thorn allows our customers to easily incorporate Thorn’s Safer Match into their detection toolset. Safer Match provides programmatic identification of known CSAM with cryptographic and perceptual hash matching for images and for videos, through proprietary scene-sensitive video hashing (SSVH).
How you can use this API today:
First, talk to your Hive sales rep, and get an API key and credentials for your new endpoint.
Image
For an image, simply send the image to us, and we will hash it using the MD5 and Safer hashing algorithms. Once the image is hashed, we return the results in our output JSON.
Video
You can also send videos into the API. We use MD5 hashes and Safer’s proprietary perceptual hashing for videos as well, but the two serve different purposes. MD5 matching returns exact-match videos and only indicates whether the whole video is a known CSAM video.
Additionally, Safer will hash different scenes within the video and will flag those which are known to be violating. Safer scenes are demarcated by a start and end timestamp as shown in the response below.
Note: For the Safer SSVH, videos are sampled at 1FPS.
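The exact output schema is defined in our documentation, but as a rough illustration, a flagged scene might be reported along these lines (field names here are placeholders):

```python
# Illustrative shape of a video response; field names are placeholders,
# not the exact output schema.
example_video_result = {
    "md5_match": False,               # no exact whole-video match
    "ssvh_scene_matches": [
        {"start_time": 42.0, "end_time": 51.0, "matched": True},
    ],
}
```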
For more information, refer to our documentation.
Teaming Up For a Safer Internet
CSAM is one of the most pervasive and harmful issues on the internet today. Legal requirements make this problem even harder to tackle, and previous technical solutions required significant integration efforts. But, together with Thorn’s proactive technology, we can respond to this challenge and help make the internet a safer place for everyone.
We often refer to our models as “industry-leading” or “best-in-class,” but what does this actually mean in practice? How are we better than our competitors, and by how much? It is easy to throw these terms around, but we mean it — and we have the evidence to back it up. In this blog post, we’ll be walking through some of the benchmarks that we have run against similar products to show how our models outperform the competition.
Visual Moderation
First, let’s take a look at one of our oldest and most popular models: visual moderation. To compare our model to its major competitors, we ran a test set of NSFW, suggestive, and clean images through all models.
Visual moderation is a classification task — in other words, the model’s job is to classify each submitted image into one of several categories (in this case, NSFW or Clean). A popular and effective way to measure the performance of classification models is to look at their precision and recall. Precision is the number of true positives (i.e., correctly identified NSFW images) over the number of predicted positives (images predicted to be NSFW). Recall is the number of true positives (correctly identified NSFW images) over the number of ground-truth positives (actual NSFW images).
There is a tradeoff between the two. If you predict all images to be NSFW, you will have perfect recall — you caught all the NSFW images! — but horrible precision because you incorrectly classified many clean images as NSFW. The goal is to have both high recall and high precision, no matter what confidence threshold is used.
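Concretely, at a given confidence threshold the two metrics come directly from the confusion counts; a small illustrative helper:

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Example: 90 NSFW images caught, 10 clean images flagged, 10 NSFW images missed.
print(precision_recall(tp=90, fp=10, fn=10))  # (0.9, 0.9)
```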
With our visual moderation models, we’ve achieved this. We plotted the results of our test as a precision/recall curve, showing that even at high recall we maintain high precision and vice versa while our competitors fall behind us.
The above plot is for NSFW content detection. Our precision at 90% recall is nearly perfect at 99.6%, which makes our error rate a whopping 45 times lower than Public Cloud C. Even Public Clouds A and B, which are closer to us in performance, have error rates 12.5 times higher and 22.5 times higher than ours respectively.
We also benchmarked our model for suggestive content detection, or content that is inappropriate but not as explicit as our NSFW category. Hive’s error rate remains far below the other models, resting at 6 times lower than Public Cloud A and 12 times lower than Public Cloud C. Public Cloud B did not offer a similar category and thus could not be compared.
We limited this comparison to NSFW/explicit imagery because our competitors do not have classes equivalent to ours for other visual moderation categories such as drugs, gore, and terrorism. This makes comparisons difficult, though that in itself speaks to the fact that we offer far more classes than many of our competitors. With more than 90 subclasses, our visual moderation model far exceeds its peers in the granularity of its results — we don’t just have classes for NSFW, but also for nudity, underwear, cleavage, and other smaller categories that offer our customers a more in-depth understanding of their content.
Text Moderation
We used precision/recall curves to compare our text moderation model as well. For this comparison, we charted our performance across eight different classes. Hive outperforms all peer models on every single one.
Hive’s error rate on sexual content is 4 times lower than its closest competitor, Public Cloud B. Our other two competitors for that class both have error rates 6 times higher. The threat class boasts similar metrics, with Hive’s error rate between 2 and 4 times lower than all its peers.
Hive’s model for hateful content detection is on par with our competitors, remaining slightly ahead on all thresholds. Our model for bullying content does the same, with an error rate 2 times lower than all comparable models.
Hive is one of few companies to offer text moderation for drugs and weapons, and our error rates here are also worth noting — our only competitor has an error rate 4 and 8 times higher than ours for drugs and weapons respectively.
Hive also offers the child exploitation class, one that few others provide. With this class, we achieve an error rate 8 times lower than our only other major competitor.
Audio Moderation
For Audio Moderation, we evaluate our model using word error rate (WER), which is the gold-standard metric for a speech recognition system. Word error rate is the number of errors divided by the total number of words transcribed, and a perfect word error rate is 0. As you can see, we achieve the best or near-best performance across a variety of languages.
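For reference, WER is conventionally computed as the word-level edit distance (substitutions, insertions, and deletions) between the transcript and the reference, divided by the number of words in the reference; a minimal sketch:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / words in the reference."""
    ref, hyp = reference.split(), hypothesis.split()
    # Word-level Levenshtein distance via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(word_error_rate("the quick brown fox", "the quick brow fox"))  # 0.25
```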
We excel across the board, with the lowest word error rate on the majority of the languages offered. On Spanish in particular, our word error rate is more than 4 times lower than Public Cloud B.
For German and Italian we are very close behind Public Cloud C and remain better than all other competitors.
Optical Character Recognition (OCR)
To benchmark our OCR model, we calculated the F-score for our model as well as several of our competitors. F-score is the harmonic mean of a model’s precision and recall, combining both of them into one measurement. A perfect F-score is 1. When comparing general F-scores, Hive excels as shown below.
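In code, that is simply:

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; a perfect score is 1.0."""
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

print(round(f_score(0.98, 0.90), 3))  # 0.938
```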
We also achieve best-in-class or near-best performance when comparing by language, as shown in the graphs below. With some languages, we excel by quite a large margin. For Chinese and Korean in particular, Hive’s F-score is more than twice that of all its competitors. We fall slightly behind in Hindi, yet still perform significantly better than Public Cloud A.
Demographics
We evaluated our age prediction model by calculating mean error, or how far off our age predictions were from the truth. Since the test dataset we used is labeled using age ranges and not individual numbers, mean error is defined as the distance in years from the closest end of the correct age range (i.e., guessing 22 for someone in the range 25-30 is an error of 3 years). A perfect mean error is 0.
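In code, that per-prediction error is simply the distance to the nearest end of the labeled range (zero if the prediction falls inside it):

```python
def range_error(predicted_age: float, low: int, high: int) -> float:
    """Distance in years from the closest end of the correct age range."""
    if predicted_age < low:
        return low - predicted_age
    if predicted_age > high:
        return predicted_age - high
    return 0.0

print(range_error(22, 25, 30))  # 3, matching the example above
```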
As you can see from this distribution, Hive has a significantly lower mean error in the three lowest age buckets (0-2, 3-9, and 10-19). In the age range 0-2, our mean error is 11 times lower than Public Cloud A’s. For the ranges 3-9 and 10-19, our mean error is 5 times and 3 times lower respectively — still quite a large margin. Hive also excels notably in the oldest age bucket (70+), where our mean error is nearly 7 times lower than Public Cloud A’s.
For a broader analysis, we compared our overall mean error across all age buckets, as well as the accuracy of our gender predictions.
AutoML
One of the newest additions to our product suite, our AutoML platform allows you to train image classification and text classification models, and to fine-tune large language models, with your own custom datasets. To evaluate the effectiveness of this tool, we used the same training data to build models on our platform and on competitors’ platforms and measured the performance of the resulting models.
For image classification, we used three different classification tasks to account for the fact that different tasks have different levels of inherent difficulty and thus may yield higher or lower performing models. We also used three different dataset sizes for each classification task in order to measure how well the AutoML platform is able to work with limited amounts of examples.
We compared the resulting models using balanced accuracy, which is the arithmetic mean of a model’s true positive rate and true negative rate. A perfect balanced accuracy is 100%.
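That is:

```python
def balanced_accuracy(true_positive_rate: float, true_negative_rate: float) -> float:
    """Arithmetic mean of the true positive rate and true negative rate."""
    return (true_positive_rate + true_negative_rate) / 2

print(round(balanced_accuracy(0.96, 0.90), 2))  # 0.93
```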
As shown in the above tables, Hive achieves best or near-best accuracy across all sets. Our results are quite similar to Public Cloud B’s, pulling ahead on the product dataset. We fell to near-best performance on the smoking dataset, which is the most difficult of the three classification tasks. Even then, we remained within a few percentage points of the winner, Public Cloud B.
For text classification, we trained models for three different categories: sexual content, drugs, and bullying. The results are in the table below. Hive outperforms all competitors on all three categories using all dataset sizes.
Another important consideration when it comes to AutoML is training time. An AutoML tool could build accurate models, but if it takes an entire day to do so it still may not be a great solution. We compared the time it took to train Hive’s text classification tool for the drugs category, and found that our platform was able to train the model 10 times as fast as Private Company A and 32 times as fast as Public Cloud B. And for the smallest dataset size of 100 examples, we trained the model 18 times faster than Private Company A and 268 times faster than Public Cloud B. That’s a pretty significant speedup.
Measuring the performance of large language models fine-tuned on our foundation model is a bit more complicated. Here we evaluate two different tasks: question answering and closed-domain classification.
To measure performance on the question answering task, we used a metric called token accuracy. Token accuracy indicates how many tokens are the same between the model’s response and the expected response from the test set. A perfect token accuracy is 100%. As shown below, our token accuracy is higher than our competitors or around the same for all dataset sizes.
This is also true for the classification task, where we maintained roughly the same performance as Public Cloud A across the various dataset sizes. Below are the full results of our comparison.
Final Thoughts
As illustrated throughout this in-depth look into the performance of our models, we truly earn the title “best-in-class.” We conduct these benchmarks not just to justify that title, but more so as part of our constant effort to make our models the best that they can be. Reviewing these analyses helps us to identify our strengths, yes, but also our weaknesses and where we can improve.
If you have any questions about any of the benchmarks we’ve discussed here or any other questions about our models, please don’t hesitate to reach out to us at sales@thehive.ai.
Hive was thrilled to have our CTO Dmitriy present at the Workshop on Multimodal Content Moderation during CVPR last week, where we provided an overview of a few important considerations when building machine learning models for classification tasks. What are the effects of data quantity and quality on model performance? Can we use synthetic data in the absence of real data? And after model training is done, how do we spot and address bias in the model’s performance?
Read on to learn some of the research that has made our models truly best-in-class.
The Importance of Quality Data
Data is, of course, a crucial component in machine learning. Without data, models would have no examples to learn from. It is widely accepted in the field that the more data you train a machine learning model with, the better. Similarly, the cleaner that data is, the better. This is fairly intuitive — the basic principle is true for human learners, too. The more examples to learn from, the easier it is to learn. And if those examples aren’t very good? Learning becomes more difficult.
But how important is good, clean data to building a good machine learning model? Good data is not always easy to come by. Is it better to use more data at the expense of having more noise?
To investigate this, we trained a binary image classifier to detect NSFW content, varying the amount of data between 10 images and 100k images. We also varied the noise by flipping a percentage of the labels on between 0% and 50% of the data. We then plotted the balanced accuracy of the resulting models using the same test set.
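A minimal sketch of the label-flipping setup used to inject noise (illustrative only; the real training pipeline is more involved):

```python
import random

def flip_labels(labels: list, noise_fraction: float, seed: int = 0) -> list:
    """Flip a given fraction of binary labels to simulate annotation noise."""
    rng = random.Random(seed)
    noisy = labels.copy()
    n_flip = int(noise_fraction * len(labels))
    for idx in rng.sample(range(len(labels)), n_flip):
        noisy[idx] = 1 - noisy[idx]
    return noisy

clean = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
print(flip_labels(clean, noise_fraction=0.2))  # two labels flipped at random
```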
The result? It turns out that data quality is more important than we may think. It was clear that, as expected, accuracy was the best when the data was both as large as possible (100k examples) and as clean as possible (0% noise). From there, however, the table gets more interesting.
As seen above, the model trained with only 10k examples and no noise performs better than the model trained with ten times as much data (100k) at 10% noise. The general trend appears to be similar — clean data matters a great deal, and noise can quickly tank performance even when using the maximum amount of data. In other words, less data is sometimes preferable to more data if it is cleaner.
We wondered how this would change with a more detailed classification problem, so we built a new binary image classifier. This time, we trained the model to detect images of smoking, a task that requires picking up signal from a small part of an image.
The outcome, shown below, echoes the results from the NSFW model — clean data has a great impact on performance even with a very large dataset. But the quantity of data appears to be more important than it was in the NSFW model. While 5000 examples with no noise got around 90% balanced accuracy for the NSFW model, that same amount of noiseless data only got around 77% for the smoking classifier. The increase in performance, while still strongly tied to data quantity, was noticeably slower and only the largest datasets produced well-performing models.
It makes sense that quantity of data would be more important with a more difficult classification task. Data noise also remained a crucial factor for the models trained with more data — the 50k model with 10% noise performed about the same as the 100k model with 10% noise, illustrating once more that more data is not always better if it is still noisy.
Our general takeaways here are that while both data quality and quantity matter quite a bit, clean data is more important beyond a certain quantity threshold. This threshold is where performance increases begin to plateau as the data grows larger, yet noisy data continues to have significant effects on model quality. And as we saw by comparing the NSFW model and the smoking one, this quality threshold also changes depending on the difficulty of the classification task itself.
Training on Synthetic Data: Does it Help or Hurt?
So if having lots of clean data is important, what can be done when good data is hard to find or costly to acquire? With the rise of AI image generation over the past few years, more and more companies have been experimenting with generated images to supplement visual datasets. Can this kind of synthetic data be used to train visual classification models that will eventually classify real data?
To try this out, we trained five different binary classification models to detect smoking. Three of the models were trained exclusively on real data (10k, 20k, and 40k examples respectively), one on a mix of real and synthetic images (10k real and 30k synthetic), and one entirely on synthetic data (40k). Each dataset had an even split of 50% smoking and 50% nonsmoking examples. To evaluate the models, we used two balanced test sets: one with 4k real images and one with 4k synthetic images. All synthetic images were created using Stable Diffusion.
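As a rough illustration of how such synthetic examples can be produced (a minimal sketch using the open-source diffusers library; the checkpoint, prompts, and counts are illustrative and not our exact generation pipeline):

```python
# Minimal sketch of generating synthetic training images with Stable Diffusion.
# Checkpoint, prompts, and counts are illustrative, not our production pipeline.
import os
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompts = {
    "smoking": "a person smoking a cigarette, candid photo",
    "nonsmoking": "a person standing outdoors, candid photo",
}
os.makedirs("synthetic", exist_ok=True)
for label, prompt in prompts.items():
    for i in range(5):  # scale up to tens of thousands for a real training set
        image = pipe(prompt).images[0]
        image.save(f"synthetic/{label}_{i:05d}.png")
```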
Looking at the precision and recall curves for the various models, we made an interesting discovery. Unsurprisingly, the model trained on the largest entirely real dataset (40k) performed the best. More interestingly, the model trained on 10k real images and 30k synthetic images performed significantly better than the one trained only on 10k real images.
These results suggest that while large amounts of real data are best, a mixture of synthetic and real data could in fact boost model performance when little data is available.
Keeping an Eye Out For Bias
After model training is finished, extensive testing must be done to make sure there aren’t any biases in the model results. Some biases exist in the real world and are often ingrained in real-world data, such as racial or gender bias; others arise in the data purely by coincidence.
A great example of how unpredictable certain biases can be came recently during a model training for NSFW detection, where the model started flagging many pictures of computer keyboards as false positives. Upon closer investigation, this occurred because many of the NSFW pictures in our training data were photos of computers whose screens were displaying explicit content. Since the computer screens were the focus of these images, keyboards were also often included, leading to the false association that keyboards are an indicator of NSFW imagery.
Three images that were falsely categorized as NSFW
In order to correct this bias, we added more non-NSFW keyboard examples to the training data. Correcting the bias in this way not only addresses the bias itself, but also boosts general model performance. Of course, addressing bias is even more critical when dealing with data that carries current or historical biases against minority groups, since leaving them unaddressed perpetuates those biases by ingraining them into future technology. The importance of detecting and correcting these biases cannot be overstated, as the risks go far beyond simply calling a keyboard NSFW.
Regardless of the type of bias, it’s important to note that biases aren’t always readily apparent. The original model prior to addressing the bias had a balanced accuracy of 80%, which is high enough that the bias may not have been immediately noticeable since errors weren’t extremely frequent. The takeaway here is thus not just that bias correction matters, but that looking into potential biases is necessary even when you might not think they’re there.
Takeaways
Visual classification models are in many ways the heart of Hive — they were our main launching point into the space of content moderation and AI-powered APIs more broadly. We’re continuously searching for ways to keep improving these models as the research surrounding them grows and evolves. Conclusions like those discussed here — the importance of clean data, particularly when you have lots of it, the possible use of synthetic data when real data is lacking, and the need to find and correct all biases (don’t forget about the unexpected ones!) — greatly inform the way we build and maintain our products.
We’re excited to announce Hive’s new AutoML tool that provides customers with everything they need to train, evaluate, and deploy customized machine learning models.
Our pre-trained models solve a wide range of use cases, but we will always be bounded by the number of models we can build. Now customers who find that their unique needs and moderation guidelines don’t quite match with any of our existing solutions can create their own, custom-built for their platform and easily accessible via API.
AutoML can be used to augment our current offerings or to create new models entirely. Want to flag a particular subject that doesn’t exist as a head in our Text Moderation API, or a certain symbol or action that isn’t part of our Visual Moderation? With AutoML, you can quickly build solutions for these problems that are already integrated with your Hive workflow.
Let’s walk through our AutoML process to illustrate how it works. In this example, we’ll build a text classification model that can determine whether or not a given news headline is satirical.
First, we need to get our data in the proper format. For text classification models, all dataset files must be in CSV format. One column should contain the text data (titled text_data) and all other columns represent model heads (classification categories). The values within each row of any given column represent the classes (possible classifications) within that head. An example of this formatting for our satire model is shown below:
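(The rows below are illustrative; the text_data column holds the headline text, and the single model head, hypothetically named is_satire, holds each row's class.)

```
text_data,is_satire
"Nation's Dog Owners Demand To Know Who's A Good Boy",yes
"City Council Approves New Public Transit Budget",no
"Area Man Unsure Whether He Is Supposed To Clap",yes
"Local Library Extends Weekend Opening Hours",no
```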
The first page you’ll see on Hive’s AutoML platform is a dashboard with all of your organization’s training projects. In the image below, you’ll see how the training and deployment status of old projects are displayed. To create our satire classifier, we’re going to make a new project by hitting the “Create New Project” button in the top right corner.
We’ll then be prompted to provide a name and description for the project as well as training data in the form of a CSV file. For test data, you can either upload a separate CSV file or choose to randomly split your training data into two files, one to be used for training and the other for testing. If you decide to split your data, you will be able to choose the percentage that you would like to split off.
After all of that is entered, we are ready to train! Beginning model training is as easy as hitting a single button. While your model trains, you can easily view its training status on the Training Projects page.
Once training is completed, your project page will show an analysis of the model’s performance. The boxes at the top allow you to decide if you want to look at this analysis for a particular class or overall. If you’re building a multi-headed model, you can choose which head you’d like to evaluate as well. We provide precision, recall, and balanced accuracy for all confidence thresholds as well as a PR curve. We also display a confusion matrix to show how many predictions were correct and incorrect per class.
Once you’re satisfied with your model’s performance, select the “Create Deployment” button to launch the model. Similarly to model training, deployment will take a few moments. After deployment is complete, you can view it in your Hive customer dashboard, where you can access your API key, view current tasks, and access other information just as you would with our pre-trained models.
We’re very excited to be adding AutoML to our offerings. The platform currently supports both text and image classification, and we’re working to add support for large language models next. If you’d like to learn more about our AutoML platform and other solutions we’re building, please feel free to reach out to sales@thehive.ai or contact us here.
Recently, image-based content featuring embedded text – such as memes, captioned images and GIFs, and screenshots of text – has exploded in popularity across many social platforms. These types of content can present unique challenges for automated moderation tools. Not only does embedded text need to be detected and ordered accurately, it also must be analyzed with contextual awareness and attention to semantic nuance.
Emojis have historically been another obstacle for automated moderation. Thanks to native support across many devices and platforms, these characters have evolved into a new online lexicon for accentuating or replacing text. Many emojis have also developed connotations that are well-understood by humans but not directly related to the image itself, which can make it difficult for automated solutions to identify harmful or inappropriate text content.
To help platforms tackle these challenges, Hive offers optical character recognition (OCR)-based moderation as part of our content moderation suite. Our OCR models are optimized for the types of digitally-generated content that commonly appears on social platforms, enabling robust AI moderation on content forms that are widespread yet overlooked by other solutions. Our OCR moderation API combines competitive text detection and transcription capabilities with our best-in-class text moderation model (including emoji support) into a single response, making it easy for platforms to take real-time enforcement actions across these popular content formats.
OCR Model for Text Recognition
Effective OCR moderation starts with training for accurate text detection and extraction. Hive’s OCR model is trained on a large, proprietary set of examples that optimizes for how text commonly appears within user-generated digital content. Hive has the largest distributed workforce for data labeling in the world, and we leaned on this capability to provide tens of millions of human annotations on these examples to build our model’s understanding.
We recently conducted a head-to-head comparison of our OCR model against top public cloud solutions using a custom evaluation dataset sourced from social platforms. We were particularly interested in test examples that featured digitally-generated text – such as memes and captioned images – to capture how content commonly appears on social platforms and selected evaluation data accordingly.
In this evaluation, we looked at end-to-end text recognition, which includes both text detection and text transcription. Here, Hive’s OCR model outperformed or was competitive with other models on both exact transcription and transcription allowing character-level errors. At 90% recall, Hive’s OCR model achieved a precision of 98%, while public cloud models ranged from ~88% to 97%, implying a similar or lower end-to-end error rate.
OCR Moderation: Language Support
We recognize that many platforms’ moderation needs extend beyond English-speaking users. Hive’s OCR model supports text recognition and transcription for many widely spoken languages with comparable performance, many of which are also supported by our text moderation solutions. Here’s an overview of our current language support:
Language | OCR Support? | Text Moderation Support?
English | Yes | Yes (Model)
Spanish | Yes | Yes (Model)
French | Yes | Yes (Model)
German | Yes | Yes (Model)
Mandarin | Yes | Yes (Pattern Match)
Russian | Yes | Yes (Pattern Match)
Portuguese | Yes | Yes (Model)
Arabic | Yes | Yes (Model)
Korean | Yes | Yes (Pattern Match)
Japanese | Yes | Yes (Pattern Match)
Hindi | Yes | Yes (Model)
Italian | Yes | Yes (Pattern Match)
Moderation of Detected Text
Hive’s OCR moderation solution goes beyond producing a transcript – we then apply our best-in-class text moderation model to understand the meaning of that text in context (including any detected emojis). Our backend automatically feeds text detected in an image into our text moderation model, making our model classifications on image-based text accessible with a single API call. Our text model is generally robust to misspellings and character substitutions, enabling high classification accuracy on text extracted via OCR even if errors occur in transcription.
Hive’s text moderation model can classify extracted text across several sensitive or inappropriate categories, including sexuality, threats or descriptions of violence, bullying, and racism.
Another critical use case is moderating spam and doxxing: OCR moderation will quickly and accurately flag images containing emails, phone numbers, addresses, and other personally identifiable information. Finally, our text moderation model can also identify promotions such as soliciting services, asking for shares and follows, soliciting donations, or links to external sites. This gives platforms new tools to curate user experience and remove junk content.
We understand that verbal communication is rarely black and white – context and linguistic nuance can have profound effects on how meaning and intent of words are perceived. To help navigate these gray areas, our text model responses supplement classifications with a score from benign (score = 0) to severe (score = 3), which can be used to adapt any necessary moderation actions to platforms’ individual needs and sensitivities. You can read more about our text models in previous blog posts or in our documentation.
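As a rough illustration (class names, fields, and values below are placeholders rather than the exact response schema), a single OCR moderation result might pair the transcript with per-class severity scores along these lines:

```python
# Illustrative shape of an OCR moderation result; fields and values are placeholders.
example_result = {
    "text": "dm me for free followers",   # text transcribed from the image
    "classifications": {
        "sexual": 0,     # 0 = benign ... 3 = severe
        "bullying": 0,
        "promotion": 2,  # solicitation for shares and follows
    },
}

threshold = 2
flagged = [name for name, score in example_result["classifications"].items() if score >= threshold]
print(flagged)  # ['promotion']
```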
Our currently supported moderation classes in each language are as follows:
Language | Classes
English | Sexual, Hate, Violence, Bullying
Spanish | Sexual, Hate
Portuguese | Sexual, Hate
French | Sexual
German | Sexual
Hindi | Sexual
Arabic | Sexual
Emoji Classification for Text Moderation
Emoji recognition is a unique feature of Hive’s OCR moderation model that opens up new possibilities for identifying harmful or harassing text-based content. Emojis can be particularly useful in moderation contexts because they can subtly (or not-so-subtly) alter how accompanying text is interpreted by the reader. Text that is otherwise innocuous can easily become inappropriate when accompanied by a particular emoji and vice-versa.
Hive OCR is able to detect and classify any emojis supported by Apple, Samsung, or Google devices. Our OCR model currently achieves a weighted accuracy of over 97% when classifying emojis. This enables our text moderation model to account for contextual meaning and connotations of emojis used in input text.
To get a sense of our model’s understanding, let’s take a look at some examples of how use of emojis (or inclusion of text around emojis) changes our model predictions to align with human understanding. Each of these examples is from a real classification task submitted to our latest model release.
Here’s a basic example of how adding an emoji changes our model response from classifying as clean to classifying as sensitive. Our models understand not only the verbal concept represented by the emoji, but what the emoji means semantically based on where it is located in the text. In this case, the bullying connotation of the “garbage” or “trash” emoji would be completely missed by an analysis of the text alone.
Our model is similarly sensitive to changes in semantic meaning caused by substitutions of emojis for text.
In this case, our model catches the sexual connotation added by the eggplant emoji in place of the word “eggplant.” Again, the text alone without an emoji – “lemme see that !” – is completely clean.
In addition to understanding how emojis can alter the meaning of text, our model is also sensitive to how text can change implications of emojis themselves.
Here, adding the phrase “hey hotty” transforms an emoji usually used innocuously into a message with suggestive intent, and our model prediction changes accordingly.
Finally, Hive’s OCR and text moderation models are trained to differentiate between each skin tone option for emojis in the “People” category and understand their implications in the context of accompanying text. We are currently exploring how the ability to differentiate between light and darker skin tones can enable new tools to identify hateful, racist, or exclusionary text content.
OCR Moderation: Final Thoughts
User preferences for online communication are constantly evolving in both medium and content, which can make it challenging for platforms to keep up with abusive users. Hive prides itself on identifying blindspots in existing moderation tools and developing robust AI solutions using high-quality training data tailored to these use-cases. We hope that this post has showcased what’s possible with our OCR moderation capabilities and given some insight into our future directions.
Feel free to contact sales@thehive.ai if you are interested in adding OCR capabilities to your moderation suite, and please stay tuned as we announce new features and updates!