We are excited to announce that Hive is now offering Thorn’s predictive technology through our CSAM detection API! This API now enables customers to identify novel cases of child sexual abuse material (CSAM) in addition to detecting known CSAM using hash-based matching.
Our Commitment to Child Internet Safety
At Hive, making the internet safer is core to our mission. While our content moderation tools help reduce human exposure to harmful content across many categories, addressing CSAM requires specialized expertise and technology.
That’s why we’re expanding our existing partnership with Thorn, an innovative nonprofit that builds technology to defend children from sexual abuse and exploitation in the digital age.
Until now, our integration with Thorn focused on hash-matching technology to detect known CSAM. The new CSAM detection API builds on this foundation by adding advanced machine learning capabilities that can identify previously unidentified CSAM.
By combining Thorn’s industry-leading CSAM detection technology with Hive’s comprehensive content moderation suite, we provide platforms with robust protection against both known and newly created CSAM.
How the Classifier Works
The classifier works by first generating embeddings of the uploaded media. An embedding is a list of computer-generated scores between 0 and 1. After generating the embeddings, Hive permanently deletes all of the original media. We then use the classifier to determine whether the content is CSAM based on the embeddings. This process ensures that we do not retain any CSAM on our servers.
The classifier returns a score between 0 and 1 that predicts whether a video or image is CSAM. The response object will have the same general structure for both image and video inputs. Please note that Hive will return both results together: probability scores from the classifier and any match results from hash matching against the aggregated hash database.
For a detailed guide on how to use Hive’s CSAM detection API, refer to the documentation.
Building a Safer Internet
Protecting platforms from CSAM demands scalable solutions. The problem is complex; but our integration with Thorn’s advanced technology provides an efficient way to detect and stop CSAM, helping to safeguard children and build a safer internet for all.
If you have any further questions or would like to learn more, please reach out to sales@thehive.ai or contact us here.
To the untrained eye, distinguishing human-created art from AI-generated content can be difficult. Hive’s commitment to providing customers with API-accessible solutions for challenging problems led to the creation of our AI-Generated Image and Video Detection API, which classifies images as human-created or AI-generated. Our model was evaluated in an independent study conducted by Anna Yoo Jeong Ha and Josephine Passananti from the University of Chicago, which sought to determine who was more effective at classifying images as AI-generated: humans or automated detectors.
Ha and Passananti’s study addresses a growing problem within the generative AI space: As generative AI models become more advanced, the boundary between human-created art and AI-generated images has become increasingly indistinguishable. With such powerful tools being accessible to the general public, various legal and ethical concerns have been raised regarding the misuse of said technology.
Such concerns are pertinent to address because the misuse of generative AI models negatively impacts both society at large and the AI models themselves. Bad actors have used AI-generated images for harmful purposes, such as spreading misinformation, committing fraud, or scamming individuals and organizations. As only human-created art is eligible for copyright, businesses may attempt to bypass the law by passing off AI-generated images as human-created. Moreover, multiple studies (on both generative image and text models) have shown evidence that AI models will deteriorate if their training data solely consists of AI-generated content—which is where Hive’s classifier comes in handy.
The study’s results show that Hive’s model outperforms both its automated peers and highly-trained human experts in differentiating between human-created art versus AI-generated images across most scenarios. This post examines the study’s methodologies and findings, in addition to highlighting our model’s consistent performance across various inputs.
Structuring the Study
In the experiment, researchers evaluated the performance of five automated detectors (three of which are commercially available, including Hive’s model) and humans against a dataset containing both human-created and AI-generated images across various art styles. Humans were categorized into three subgroups: non-artists, professional artists, and expert artists. Expert artists are the only subgroup with prior experience in identifying AI-generated images.
The dataset consists of four different image groups: human-created art, AI-generated images, “hybrid images” which combine generative AI and human effort, and perturbed versions of human-created art. A perturbation is defined as a minor change to the model input aimed at detecting vulnerabilities in the model’s structure. Four perturbation methods are used in the study: JPEG compression, Gaussian noise, CLIP-based Adversarial Perturbation (which performs perturbations at the pixel level), and Glaze (a tool used to protect human artists from mimicry by introducing imperceptible perturbations on the artwork).
After evaluating the model on unperturbed imagery, the researchers proceeded to more advanced scenarios with perturbed imagery.
Evaluation Methods and Findings
The researchers evaluated the automated detectors on four metrics: overall accuracy (ratio of training data classified correctly to the entire dataset), false positive rate (ratio of human-created art misclassified as AI-generated), false negative rate (ratio of AI-generated images misclassified as human-created), and AI detection success rate (ratio of AI-generated images correctly classified as AI-generated to the total amount of AI-generated images).
Among automated detectors, Hive’s model emerged as the “clear winner” (Ha and Passananti 2024, 6). Not only does it boast a near-perfect 98.03% accuracy rate, but it also has a 0% false positive rate (i.e., it never misclassifies human art) and a low 3.17% false negative rate (i.e., it rarely misclassifies AI-generated images). According to the authors, this could be attributed to Hive’s rich collection of generative AI datasets, with high quantities of diverse training data compared to its competitors.
Additionally, Hive’s model proved to be resistant against most perturbation methods, but faced some challenges classifying AI-generated images processed with Glaze. However, it should be noted that Glaze’s primary purpose is as a protection tool for human artwork. Glazing AI-generated images is a non-traditional use case with minimal training data available as a result. Thus, Hive’s model’s performance with Glazed AI-generated images has little bearing on its overall quality.
Final Thoughts Moving Forward
When it comes to automated detectors and humans alike, Hive’s model is unparalleled. Even compared to human expert artists, Hive’s model classifies images with higher levels of confidence and accuracy.
While the study considers the model’s potential areas for improvement, it is important to note that the study was published in February 2024. In the months following the study’s publication, Hive’s model has vastly improved and continues to expand its capabilities, with 12+ model architectures added since.
If you’d like to learn more about Hive’s AI-Generated Image and Video Detection API, a demo of the service can be accessed here, with additional documentation provided here. However, don’t just trust us, test us: reach out to sales@thehive.ai or contact us here, and our team can share API keys and credentials for your new endpoints.
Hive's Innovative Integration with Thorn's Safer Match
Hive
We are excited to announce that Hive’s Partnership with Thorn is now live! Our current and prospective customers can now easily integrate Thorn’s Safer Match, a CSAM (child sexual abuse material) detection solution, using Hive’s APIs.
The Danger of CSAM
The threat of CSAM involves the production, distribution, and possession of explicit images and videos depicting minors. Every platform with an upload button or messaging capabilities is at risk of hosting child sexual abuse material (CSAM). In fact, in 2023 alone, there were over 104 million reports of potential CSAM reported to the National Center of Missing and Exploited Children.
The current state-of-the-art approach is to use an encrypting function to “hash” the content and then “match” it against a database aggregating 57+ million verified CSAM hashes. If the content hash matches against the database, then the content can be flagged as CSAM.
How the Integration Works
When presented with visual content, we first hash it, then match it against known instances of CSAM.
Hashing: We take the submitted image or video, and convert it into one or more hashes.
Deletion: We then immediately delete the submitted content ensuring nothing stays on Hive’s servers.
Matching: We match the hashes against the CSAM database and return whether the hashes match or not to you.
Hive’s partnership with Thorn allows our customers to easily incorporate Thorn’s Safer Match into their detection toolset. Safer Match provides programmatic identification of known CSAM with cryptographic and perceptual hash matching for images and for videos, through proprietary scene-sensitive video hashing (SSVH).
How you can use this API today:
First, talk to your Hive sales rep, and get an API key and credentials for your new endpoint.
Image
For an image, simply send the image to us, and we will hash it using MD5 and Safer encryption algorithms. Once the image is hashed, we return the results in our output JSON.
Video
You can also send videos into the API. We use MD5 hashes and Safer’s proprietary perceptual hashing for videos as well. However, they have different use cases. MD5 will return exact match videos and will only indicate whether the whole video is a known CSAM video.
Additionally, Safer will hash different scenes within the video and will flag those which are known to be violating. Safer scenes are demarcated by a start and end timestamp as shown in the response below.
Note: For the Safer SSVH, videos are sampled at 1FPS.
For more information, you can reference our documents.
Teaming Up For a Safer Internet
CSAM is one of the most pervasive and harmful issues on the internet today. Legal requirements make this problem even harder to tackle, and previous technical solutions required significant integration efforts. But, together with Thorn’s proactive technology, we can respond to this challenge and help make the internet a safer place for everyone.
Hive’s AutoML platform allows anyone the opportunity to create best-in-class machine learning solutions for the particular issues they face. Our platform can create classification and large language models for an endless range of use cases. If you need a model that bears no resemblance whatsoever to any pre-trained model we offer, no problem! We’ll help you build one yourself.
Hive AutoML uses the same technology behind our industry-leading ML tools to create yours. This way you get the best of both worlds — Hive’s impeccable model performance and a tool custom-built to address your needs.
Hive AutoML for Content Moderation
Today we’ll be focusing on one particular application of our AutoML platform: customizing our moderation models. These models kickstarted our success as a company and are used by many of the largest online platforms in the world. But the moderation guidelines of many sites differ from each other, and sometimes our base moderation models don’t quite fit them.
With AutoML, you can create your own version of our moderation models by fine-tuning our pre-existing heads or adding new heads entirely. We will then train a version of our high-performing base model with your added data to create a tool that best suits your platform’s moderation process.
In this blog post, we’ll walk through both how to add more data to an existing Hive moderation head and how to add a new custom moderation head. We’ll demonstrate the former while building a visual moderation model and the latter on a text moderation model. Audio moderation is not currently supported on AutoML.
Building a Visual Moderation Model
Hive AutoML for Visual Moderation allows you to customize our Visual Moderation base model to fit your specific needs. Using your own data, you can add new model heads or fine-tune any of the existing 45+ subclasses that we provide as part of our Visual Moderation tool. A full list of these classes is available here.
For this walkthrough, we’ll be fine-tuning the tobacco head. Our data will thus include images and labels for this head only. The resulting model will include all Hive visual moderation heads, with the tobacco head re-trained to incorporate this new data.
Uploading Your Dataset
Before you start building your model, you first need to upload any datasets you’ll use to the Dataset section of our AutoML platform. For Visual Moderation model training, we require a CSV file with a column for your image data (as publicly accessible image URLs) and an additional column for each head you wish to train.
For this tutorial, we’re going to train using additional data for the tobacco class. The below CSV includes image URLs and a column of labels for that head.
Dataset formatting, images have either “yes_tobacco” or “no_tobacco” labels
After you’ve selected your dataset file, you’ll be asked to confirm the column mapping. Make sure the columns of your dataset have been interpreted correctly and that you have the correct format (image or text) selected for each column.
The column mapping confirmation page lets you double check that the data has been processed correctly.
Once you’ve confirmed your mapping, you can preview and edit your data. This page opens automatically after any dataset upload. You will be able to check whether all images were uploaded successfully, view the images themselves, and change their respective labels if desired. You can also add or delete any data that you wish to before you proceed onto model training.
The dataset preview page for an image-based dataset.
Creating a Dataset Snapshot
When you’re happy with your dataset, you’ll then need to create a snapshot from it. A snapshot is a point-in-time export of a dataset that validates that dataset for training. Once a snapshot is created, its contents cannot be changed. This means that while you can continue to edit your original dataset, your snapshot will not change along with it — if you make any changes, you’ll need to create a new snapshot after you’re finished with your changes.
The information you’ll be asked to provide when creating a snapshot.
You can create a snapshot from any live dataset. To do so, simply click the “Create Snapshot” button on that dataset’s detail page. You’ll be prompted to provide some information, most notably which columns to use for image input and data labels. After your snapshot is successfully created, you’re ready to start training!
Creating a New Model
To create a training, you can select the “Create Model” button on the snapshot detail page. You’ll once again be asked to provide several pieces of information, including your model’s name, description, base model, and datasets. Make sure to select “Hive Vision Moderation” under the “Base Model” category as opposed to a general image classification model.
When creating your model, make sure you have the correct model type and base model selected.
You can choose to upload a separate test dataset or split off a random section of your training dataset to use instead. If you choose to upload a separate test dataset, this dataset must contain the same heads and classes as your training dataset. After uploading your dataset, you will also need to create a snapshot of that dataset before you begin model training.
If you choose to split off a section of your training dataset, you will be able to choose the percentage of that dataset that you would like to use for testing as you create your training.
Before you begin your training, you are also able to edit some training preferences such as maximum number of training epochs, model selection rule, model selection label, early stopping, and invalid data criteria. If you’re unsure what any of these options are, there is a little information icon next to each that will explain what is meant by that setting.
The training options you’re offered as you create your model include max epochs, model selection rule, and more.
After uploading your training (and, if desired, test) dataset and selecting your desired training options, you’re ready to create your model. After you begin training, your model will be ready within 20 minutes. You will automatically be directed to the model’s detail page, where you can watch its progress as it trains.
Playground and Metrics: Evaluating Your Model
When your model has completed its training, the model’s detail page will display a variety of metrics in order to help you analyze your model’s performance. At the top of the page, you’ll be shown the model’s precision, recall, balanced accuracy, and F1 score. You can toggle whether these metrics are calculated by head overall or by each class within a head.
The model details page displays performance metrics once the model has completed training.
Below these numbers, you’ll also be able to view an interactive precision/recall (PR) curve. This is the gold-standard metric for a classification model and gives you more insight into how your model balances the inherent tradeoff between high precision and high recall.
You’ll then be shown a confusion matrix, which is an exact breakdown of the true positives, false positives, true negatives, and false negatives of the model’s results. This can highlight particular weak spots of your model and potential areas you may want to address with further training. As shown below, our example model has no false positives but several false negatives — images with tobacco that were classified as “no_tobacco.”
This model’s confusion matrix, which shows that there is an issue with false negatives.
The final section of our metrics page is an area called the “playground.” The playground allows you to test your newly created AutoML model by submitting sample queries and viewing the responses. This feature is another great way to explore the way that your model responds to different kinds of prompts and the areas in which it could improve. You are given 500 free sample queries — beyond that you will be prompted to deploy your model with the cost of each submission charged to your organization’s billing account.
To test our tobacco model, we submitted the following sample image. To the right of it you can see the results for each Hive visual moderation class, including tobacco where it is classified correctly with a perfect confidence score or 1.00.
An example image of a man smoking a cigar and the labels assigned to it by our newly trained moderation model.
Deploying Your Model
To begin using your model, you can create a deployment from it. This will open the project on Hive Data, where you will be able to upload tasks, view tasks, and access your API key as you would with any other Hive Data project. An AutoML project can have multiple active deployments at one time.
Building a Text Moderation Model
Just like for Visual Moderation, our AutoML platform allows you to customize our Text Moderation base model to fit your particular use cases by adding or re-training model categories. The full class definitions for all 13 of our currently offered heads are available here. For this section of the walkthrough, we will be creating a new custom head in order to add capabilities to our model that we don’t currently offer: sentiment analysis.
Sentiment analysis is the task of categorizing the emotional tone of a piece of text, typically into two labels: positive or negative. Occasionally there may be a sentiment analysis task that breaks the sentiment down into more specific categories, such as joyful, angry, etc. Adding this kind of information to our existing Hive Text Moderation model could prove useful for platforms that wish to either exclude negative content on sites for children or to put limits on certain comment sections or forums where negative commentary is unwanted.
Sentiment analysis is a complex problem, since it is a language-based task. Understanding the meaning and tone of a sentence is not always easy even for humans. To keep it simple, we’ll just be using the two possible classifications of positive and negative.
Uploading Your Dataset
Similarly to creating a Visual Moderation model, you’ll need to upload your data as a CSV file to the “Data” section of our AutoML platform prior to model training. The format of our sentiment analysis dataset is shown below, though the column names do not need to be anything specific in order to be processed correctly.
The text data and labels for our sentiment analysis model, formatted into two columns.
After uploading your dataset, you’ll be asked to confirm the format of each column as either text, images, or JSONs. If you’d like to disregard that column entirely, that is also an option to “Ignore Column.” After you hit confirm, you can preview and edit your dataset just as you could with your image dataset in the Visual Moderation example. The preview page for text datasets is shown below.
The preview page for a text-based dataset.
Creating a Dataset Snapshot
As described in the Visual Moderation walkthrough, you’ll need to create a snapshot of your dataset in order to validate it prior to model training. When making your snapshot, make sure that you select “Text Classification” as your “Snapshot Type.” This will ensure that your snapshot is sufficient to train a Text Moderation model. You will also need to specify which column contains your text input and which contains the labels for that text input, as shown below for our dataset.
When creating your snapshot, you will be asked to provide some information about the dataset.
In the example above, we’ve selected our “text_data” column as our input and our “sentiment” column as our training labels.
Creating a New Model
After you’ve created your snapshot, you’ll automatically be brought to that snapshot’s detail page. From this page, starting a new model training is as easy — just hit the big “Create New Model” button on the top right. You’ll be asked to name your model and provide a few key details about the training, such as which snapshots you’d like to use as your data and how many times a training will cycle through that data.
You’ll be able to configure your training by choosing a model selection rule, maximum number of epochs, and more.
Make sure you’ve selected “Text Classification” as your model type and “Hive Text Moderation” as your base model. Then you’re ready to start your training! Model training takes up to 20 minutes depending on several factors including the size of your dataset. Most take only several minutes to complete.
Metrics and Model Evaluation
Once your training has completed, you’ll be redirected to the details page for your new moderation model. On this page, you’ll be shown the model’s precision, recall, balanced accuracy, and F1 score. You will also be able to view a precision/recall (P/R) curve and confusion matrix in order to further analyze the performance of your model.
The sentiment analysis model performs fairly well upon first training, with most metrics around 86%.
The overall performance of the model is pretty good for a difficult task such as sentiment analysis. While there is room for improvement, this first round of training indicates that with some additional data we could likely bring all metrics above 90%. The confusion matrix for this model indicates that a specific area of weakness for this model is false negatives, to which a possible solution would be to increase the amount of positive examples in the data and observe if this improves model results.
The confusion matrix for our model, which shows a 19% false negative rate.
We do not currently offer the playground feature for text moderation models, though we are working on this and expect it to be released in the coming months.
Deploying Your Model
The process for deploying your model is identical to the way we deployed our Visual Moderation model in the first example. To deploy any model, simply click “Create Deployment” from that model’s details page. Once deployed, you can access your unique API keys and begin to submit tasks to the model like any other Hive model.
Final Thoughts
We hope this in-depth walkthrough was helpful. If you have any further questions or run into any issues as you build your custom-made AI models, please don’t hesitate to reach out to us at support@thehive.ai and we will be happy to help. To inquire about testing out our AutoML platform, please contact sales@thehive.ai.