
Announcing Hive’s Partnership with the Defense Innovation Unit


Hive is excited to announce that we have been awarded a Department of Defense (DoD) contract for deepfake detection of video, image, and audio content. This groundbreaking partnership marks a significant milestone in protecting our national security from the risks of synthetic media and AI-generated disinformation.

Combating Synthetic Media and Disinformation

Rapid strides in technology have made AI manipulation the weapon of choice for numerous adversarial entities. For the Department of Defense, a digital safeguard is necessary to protect the integrity of vital information systems and to stay vigilant against the spread of misinformation, threats, and conflicts at a national scale.

Hive’s reputation as a frontline defender against AI-generated deception makes us uniquely equipped to handle these threats. Not only do we understand the stakes at hand, but we also remain committed to delivering unmatched detection tools that mitigate these risks with accuracy and speed.

Under our initial two-year contract, Hive will partner with the Defense Innovation Unit (DIU) to support the intelligence community with our state-of-the-art deepfake detection models, deployed in an offline, on-premise environment and capable of detecting AI-generated video, image, and audio content. We are honored to join forces with the Department of Defense in this critical mission.

Our Cutting-Edge Tools

To best empower the U.S. defense forces against potential threats, we have provided five proprietary models that can detect whether an input is AI-generated or a deepfake.

If an input is flagged as AI-generated, it was likely created entirely by a generative AI engine. A deepfake, by contrast, is a real image or video in which one or more of the original faces has been swapped with another person’s face.

The models we’ve provided are as follows:

  1. AI-Generated Detection (Image and Video), which detects if an image or video is AI-generated.
  2. AI-Generated Detection (Audio), which detects if an audio clip is AI-generated.
  3. Deepfake Detection (Image), which detects if an image contains one or more faces that are deepfaked.
  4. Deepfake Detection (Video), which detects if a video contains one or more faces that are deepfaked.
  5. Liveness (Image and Video), which detects whether a face in an image or video is primary (exists in the primary image) or secondary (exists in an image, screen, or painting inside of the primary image).
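As a sketch of how results from detection models like these might be consumed downstream, the snippet below interprets a hypothetical classification response. The field names, score ranges, and threshold are illustrative assumptions, not Hive’s actual API schema.

```python
# Hypothetical response shape: model names mapped to confidence values in
# [0, 1]. The names and the 0.5 threshold are illustrative assumptions,
# not Hive's documented output format.
def interpret_detection(scores: dict, threshold: float = 0.5) -> str:
    """Map hypothetical per-model confidence scores to a readable verdict."""
    flagged = [name for name, score in scores.items() if score >= threshold]
    if not flagged:
        return "no manipulation detected"
    return "flagged: " + ", ".join(sorted(flagged))

# An input that a generative engine likely produced:
print(interpret_detection({"ai_generated": 0.97, "deepfake": 0.08}))
# prints "flagged: ai_generated"

# A face-swapped video:
print(interpret_detection({"ai_generated": 0.12, "deepfake": 0.91}))
# prints "flagged: deepfake"
```

A real integration would read these scores from the deployed model’s response rather than hard-coding them.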

Forging a Path Forward

Even as new threats emerge and escalate, Hive remains steadfast in our commitment to providing the world’s most capable AI models for validating the safety and authenticity of digital content.

For more details, you can find our recent press release here and the DIU’s press release here. If you’re interested in learning more about what we do, please reach out to our sales team (sales@thehive.ai) or contact us here for further questions.


Model Explainability With Text Moderation


Hive is excited to announce that we are releasing a new API: Text Moderation Explanations! This API helps customers understand why our Text Moderation model assigns particular scores to text strings.

The Need For Explainability

Hive’s Text Moderation API scans a text string or message, interprets it, and returns a score from 0 to 3 that maps to a severity level, across a number of top-level classes and dozens of languages. Today, hundreds of customers send billions of text strings each month through this API to protect their online communities.

A top feature request has been explanations of why our model assigns the scores it does, especially for foreign-language text. While some moderation scores may be self-evident, edge cases can leave ambiguity about why a string was scored the way it was.

This is where our new Text Moderation Explanations API comes in, delivering additional context and visibility into moderation results in a scalable way. With Text Moderation Explanations, human moderators can quickly interpret results and use the additional information to take appropriate action.

A Supplement to Our Text Moderation Model

Our Text Moderation classes are ordered by severity, ranging from level 3 (most severe) to level 0 (benign). These classes correspond to the possible scores Text Moderation can assign a text string. For example, if a text string falls under the “sexual” head and contains sexually explicit language, it would be given a score of 3.
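The severity scale above can be sketched as a simple helper that flags strings for moderator review. The threshold and the helper itself are illustrative assumptions for this sketch, not part of Hive’s API; only the 0–3 scale comes from the description here.

```python
# The scale described above runs from 3 (most severe) down to 0 (benign).
# This review-threshold helper is an illustrative sketch, not Hive's API;
# the default cutoff of 2 is an assumption.
def is_actionable(score: int, threshold: int = 2) -> bool:
    """Flag a string for moderator review when its severity score
    meets or exceeds a (hypothetical) review threshold."""
    if score not in (0, 1, 2, 3):
        raise ValueError(f"score must be 0-3, got {score!r}")
    return score >= threshold

# A sexually explicit string scored 3 under the "sexual" head:
print(is_actionable(3))  # True
# A benign string scored 0:
print(is_actionable(0))  # False
```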

The Text Moderation Explanations API takes three inputs: a text string, its class label (“sexual”, “bullying”, “hate”, or “violence”), and the score it was assigned (3, 2, 1, or 0). The output is a text string explaining why the input text was given that score relative to its class. Note that Explanations is currently supported only for select multilevel heads (those corresponding to the class labels listed above).
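Given the three inputs described above, a client might validate and assemble a request as follows. The allowed class labels and scores come from the description here; the payload field names are hypothetical placeholders, not Hive’s documented request schema.

```python
# Allowed classes and scores follow the description above; the payload
# field names ("text", "class", "score") are hypothetical placeholders,
# not Hive's documented schema.
ALLOWED_CLASSES = {"sexual", "bullying", "hate", "violence"}
ALLOWED_SCORES = {0, 1, 2, 3}

def build_explanation_request(text: str, class_label: str, score: int) -> dict:
    """Validate the three Explanations inputs and return a request payload."""
    if class_label not in ALLOWED_CLASSES:
        raise ValueError(f"unsupported class label: {class_label!r}")
    if score not in ALLOWED_SCORES:
        raise ValueError(f"score must be one of {sorted(ALLOWED_SCORES)}")
    return {"text": text, "class": class_label, "score": score}

payload = build_explanation_request("example message", "hate", 2)
print(payload)
# prints {'text': 'example message', 'class': 'hate', 'score': 2}
```

In practice the returned payload would be sent to the Explanations endpoint, and the response would contain the explanatory text string.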

To develop the Explanations model, we used a supervised fine-tuning process: labeled data, annotated internally at Hive by native speakers, was used to fine-tune a base model for this specialized task. This approach allows us to support a number of languages beyond English.

Comprehensive Language Support

We have built our Text Moderation Explanations API with broad initial language support, addressing a crucial need: understanding why a text string in a language a moderator does not speak was scored a certain way.

We currently support eight different languages and four top-level classes for Text Moderation Explanations.

Text Moderation Explanations are now included at no additional cost as part of our Moderation Dashboard product.

Customers can also access the Text Moderation Explanations model through an API (refer to the documentation).

In future releases, we anticipate adding support for more languages and top-level classes. If you’re interested in learning more or gaining test access to the Text Moderation Explanations model, please reach out to our sales team (sales@thehive.ai) or contact us here for further questions.