Introducing Moderation Dashboard: a streamlined interface for content moderation

Over the past few years, Hive’s cloud-based APIs for moderating image, video, text, and audio content have been adopted by hundreds of content platforms, from small communities to the world’s largest and most well-known platforms like Reddit.  

However, not every platform has the resources or the interest to build its own software on top of Hive’s APIs to manage its internal moderation workflows. And since the need for software like this is shared by many platforms, it made sense to build a robust, accessible solution to fill the gap.

Today, we’re announcing the Moderation Dashboard, a no-code interface for your Trust & Safety team to design and execute custom-built moderation workflows on top of Hive’s best-in-class AI models.  For the first time, platforms can access a full-stack, turnkey content moderation solution that’s deployable in hours and accessible via an all-in-one flexible seat-based subscription model.

We’ve spent the last month beta testing the Moderation Dashboard and have received overwhelmingly positive feedback.  Here are a few highlights:

  • “Super simple integration”: customizable actions define how the Moderation Dashboard communicates with your platform
  • “Effortless enforcement”: automating moderation rules in the Moderation Dashboard UI requires zero internal development effort
  • “Streamlined human reviews”: granular policy enforcement settings for borderline content significantly reduced the need for human intervention
  • “Flexible” and “Scalable”: easy to add seat licenses as your content or team needs grow, with a stable monthly fee you can plan for

We’re excited by the Moderation Dashboard’s potential to bring industry-leading moderation to more platforms that need it, and look forward to continuing to improve it with updates and new features based on your feedback.

If you want to learn more, the post below highlights how our favorite features work.  You can also read additional technical documentation here.

Easily Connect Moderation Dashboard to Your Application

Moderation Dashboard connects seamlessly to your application’s APIs, allowing you to create custom enforcement actions that can be triggered on posts or users – either manually by a moderator or automatically if content matches your defined rules.

Custom actions created in the Moderation Dashboard allow our API to interface directly with your platform when moderation rules trigger or when human moderators make decisions. Callback URLs specified in each action allow the Dashboard API to communicate with your callback server to take your defined action on the correct post or user in real time.

You can create actions within the Moderation Dashboard interface specifying callback URLs that tell the Dashboard API how to communicate with your platform.  When an action triggers, the Moderation Dashboard will ping your callback server with the required metadata so that you can successfully execute the action on the correct user or post within your platform.
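As a rough sketch of what the receiving side of such a callback might look like, the Python below routes an incoming payload to a platform-side handler. The payload fields (`action`, `post_id`, `user_id`), the action names, and the in-memory stores are all assumptions made for illustration; the actual request format is defined in the Moderation Dashboard documentation.

```python
from typing import Callable, Dict

# Hypothetical in-memory stand-ins for a platform's own data stores.
removed_posts: set = set()
banned_users: set = set()


def remove_post(payload: dict) -> str:
    """Platform-side handler: take the post down."""
    removed_posts.add(payload["post_id"])
    return f"removed post {payload['post_id']}"


def ban_user(payload: dict) -> str:
    """Platform-side handler: ban the author of the post."""
    banned_users.add(payload["user_id"])
    return f"banned user {payload['user_id']}"


# Map each action name configured in the dashboard (assumed names)
# to the function that carries it out on the platform.
ACTION_HANDLERS: Dict[str, Callable[[dict], str]] = {
    "remove_post": remove_post,
    "ban_user": ban_user,
}


def handle_callback(payload: dict) -> str:
    """Route one callback payload to the matching handler."""
    handler = ACTION_HANDLERS.get(payload.get("action", ""))
    if handler is None:
        raise ValueError(f"unknown action: {payload.get('action')!r}")
    return handler(payload)
```

In a real deployment this function would sit behind whatever web framework serves your callback URL; keeping the dispatch logic separate from the HTTP layer makes it easy to test on its own.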

Implement Custom Content Moderation Rules

Moderation Dashboard allows you to easily define rules to automatically moderate content and users according to Hive model classifications. Set conditions on over 50 supported moderation classes and choose which of your actions to take, all from within the Dashboard interface.

At Hive, we understand that platforms have different content policies and community guidelines. Moderation Dashboard enables you to set up custom rules according to your particular content policies in order to automatically take action on problematic content using Hive model results. 

Moderation Dashboard currently supports access to both our visual moderation model and our text moderation model – you can configure which of over 50 model classes to use for moderation and at what level directly through the dashboard interface. You can easily define sets of classification conditions and specify which of your actions – such as removing a post or banning a user – to take in response, all from within the Moderation Dashboard UI. 

Once configured, Moderation Dashboard can communicate directly with your platform to implement the moderation policy laid out in your rule set. The Dashboard API will automatically trigger the enforcement actions you’ve specified on any submitted content that violates these rules.
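Conceptually, a rule pairs a set of classification conditions with one of your actions. The sketch below models that idea in Python; it is not the Dashboard's implementation. The class names (`nsfw`, `gore`, `weapons`), the "first matching rule wins" ordering, and the `Condition` and `Rule` types are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Condition:
    model_class: str  # illustrative class name, not an official one
    min_score: float  # the condition holds when score >= min_score


@dataclass
class Rule:
    conditions: List[Condition]  # every condition must hold
    action: str                  # name of one of your custom actions


def first_matching_action(scores: Dict[str, float], rules: List[Rule]) -> Optional[str]:
    """Return the action of the first rule whose conditions all hold, else None."""
    for rule in rules:
        if all(scores.get(c.model_class, 0.0) >= c.min_score for c in rule.conditions):
            return rule.action
    return None


rules = [
    Rule([Condition("nsfw", 0.9)], "remove_post"),
    Rule([Condition("gore", 0.8), Condition("weapons", 0.5)], "ban_user"),
]
```

With these example rules, a post scoring 0.95 on `nsfw` maps to `remove_post`, while a post that meets only one of the two conditions in the second rule triggers nothing.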

An example of user moderation rules configurable within the Moderation Dashboard. User rules can account for various aspects of a user's post history, such as account age, number of posts, and percentage of posts that were flagged in various content moderation classes, when determining which action to trigger.

Another feature unique to Moderation Dashboard: we keep track of (anonymized) user identifiers to give you insight into high-risk users. You can design rules that account for a user’s post history to take automatic action on problematic users. For example, platforms can identify and ban users with a certain number of flagged posts in a set time period, or with a certain proportion of flagged posts relative to clean content – all according to rules you set in the interface.
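The two kinds of user rule described above (a count of flagged posts within a time window, and a proportion of flagged posts overall) can be sketched as follows. The `Post` type, the function name, and the parameter names are hypothetical; in practice the thresholds are whatever you configure in the interface.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Post:
    created_at: datetime
    flagged: bool


def should_ban(
    posts: List[Post],
    now: datetime,
    window: timedelta,
    max_recent_flags: int,
    max_flagged_ratio: float,
) -> bool:
    """Ban on too many flagged posts inside the window, or too high a flagged share overall."""
    recent_flags = sum(1 for p in posts if p.flagged and now - p.created_at <= window)
    if recent_flags >= max_recent_flags:
        return True
    if not posts:
        return False  # no history, nothing to act on
    flagged_ratio = sum(1 for p in posts if p.flagged) / len(posts)
    return flagged_ratio >= max_flagged_ratio
```

Checking the windowed count first means a sudden burst of flagged posts triggers the rule even when the user's long-run ratio is still low.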

Intuitive Adjustment of Model Classification Thresholds

Moderation Dashboard allows interface-based configuration of model classification thresholds for both visual and text models. Users can configure thresholds for each text moderation class with a slider element and configure thresholds for each visual model class in a dialog box to tailor content moderation rules to their platform's policies and sensitivities.

Moderation Dashboard allows you to configure model classification thresholds directly within the interface. You can easily set confidence score cutoffs (for visual) and severity score cutoffs (for text) that tell Hive how to classify content according to your sensitivity around precision and recall.
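To make the precision and recall trade-off concrete, here is a small sketch of how a per-class cutoff decides what gets flagged. The class names and cutoff values are invented for the example, and the function is an illustration rather than Hive's classification logic.

```python
from typing import Dict, List, Union

Score = Union[int, float]


def flagged_classes(scores: Dict[str, Score], cutoffs: Dict[str, Score]) -> List[str]:
    """Return, in sorted order, every class whose score meets or exceeds its cutoff."""
    return sorted(
        name for name, cutoff in cutoffs.items() if scores.get(name, 0) >= cutoff
    )


# Visual models: confidence-score cutoffs (illustrative values).
strict_visual = {"nsfw": 0.90}   # flags less content, favoring precision
lenient_visual = {"nsfw": 0.80}  # flags more content, favoring recall

# Text models: severity-score cutoffs (illustrative values).
text_cutoffs = {"hate": 2, "bullying": 3}
```

The same image scoring 0.85 on `nsfw` is left alone under the stricter cutoff and flagged under the more lenient one, which is the adjustment the threshold controls expose.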

Streamline Human Review

Hive’s API solutions were generally designed with an eye towards automated content moderation. Historically, this has required our customers to expend some internal development effort to build tools that also allow for human review. Moderation Dashboard closes this loop by allowing custom rules that route certain content to a Review Feed accessible by your human moderation team.

One workflow we expect to see frequently: automating moderation of content that our models classify as clearly harmful, while sending posts with less confident model results to human review. By limiting human review to borderline content and edge cases, platforms can significantly reduce the burden on moderators while also protecting them from viewing the worst content.

Setting Human Review Thresholds

To do this, Moderation Dashboard administrators can set custom score ranges that trigger human review for both visual and text moderation. Content scoring in these ranges will be automatically diverted to the Review Feed for human confirmation. This way, you can focus review from your moderation team on trickier cases, while leaving content that is clearly allowable or clearly harmful to your automated rules. Here’s an example rule that sends text content scored as “controversial” (severity scores of 1 or 2) to the Review Feed but auto-moderates the most severe cases.

Moderation Dashboard allows you to configure custom score ranges for both visual and text models that route the post to human review feeds. Human review thresholds can be configured separately for each class according to your needs and sensitivities.
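The routing logic behind a human review range can be sketched in a few lines. This example mirrors the text rule described above, where severity scores of 1 or 2 go to human review; the function name and the returned labels are hypothetical, and the sketch assumes that anything scoring above the review range should be auto-moderated.

```python
from typing import Tuple


def route_text_post(severity: int, review_range: Tuple[int, int] = (1, 2)) -> str:
    """Decide where a text post goes based on its severity score."""
    low, high = review_range
    if severity < low:
        return "allow"          # clearly allowable, no action needed
    if severity <= high:
        return "review_feed"    # borderline, send to a human moderator
    return "auto_moderate"      # clearly harmful, enforce automatically
```

Because the range is a parameter, each class can carry its own review band, for example `(2, 2)` for a class where only the middle of the scale is ambiguous on your platform.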

Review Feed Interface for Human Moderators

When your human review rules trigger, Moderation Dashboard will route the post to the Review Feed of one of your moderators, where they can quickly visualize the post and see Hive model predictions to inform a final decision.

Review Feed is hosted directly by the Moderation Dashboard application, giving your human moderation teams an intuitive user interface to review and take action on flagged posts. Moderators visualize posts, access a quick overview of model classifications, and then select from any of the content moderation actions you've defined for immediate enforcement.

For each post, your moderators can select from the moderation actions you’ve set up to implement your content policy. Moderation Dashboard will then ping your callback server with the required information to execute that action, enabling your moderators to take quick action directly within the interface.

Additionally, Moderation Dashboard makes it simple for your Trust & Safety team administrators to onboard and grant review access to additional moderators. Platforms can easily scale their content moderation capabilities to keep up with growth.

Access Clear Intel on Your Content and Users

Interface for User Feed on the Moderation Dashboard application. Moderators can easily access detailed information on user post histories to inform moderation decisions. All flagged posts and actions taken on those posts will be shown, in addition to the percentage of that user's posts that were flagged.

Beyond individual posts, Moderation Dashboard includes a User Feed that allows your moderators to see detailed post histories of each user who has submitted unsafe content.

Here, your moderators can access an overview of each user including their total number of posts and the proportion of those posts that triggered your moderation rules. The User Feed also shows each of that user’s posts along with corresponding moderation categories and any corresponding action taken. 

Similarly, Moderation Dashboard makes quality control easy with a Content Feed that displays all posts moderated automatically or through human review. The Content Feed allows you to see your moderation rules in action, including detailed metrics on how Hive models classified each post. From here, administrators can supervise human moderation teams for simple QA or further refine thresholds for automated moderation rules.

Effortless Moderation of Spam and Promotions

In addition to model classifications, Moderation Dashboard will also filter incoming text for spam entities – including URLs and personal information such as emails and phone numbers. The Spam Manager interface will aggregate all posts containing the same spam text into a single action item that can be allowed or denied with one click.

Spam Manager interface of the Moderation Dashboard application. Spam Manager flags posts containing links and personally identifiable information such as emails and phone numbers. The Spam Manager interface aggregates all instances of the same spam entities into a single action item for easy decision-making and identification of bots and promotional accounts. You can also set custom allow and deny lists that enable auto-moderation rules on spam posts.

With Spam Manager, administrators can also define custom allow lists and deny lists for specific domains and URLs and then set up rules to automatically moderate spam entities in these lists. Finally, Spam Manager provides detailed histories of users that post spam entities for quick identification of bots and promotional accounts, making it easy to keep your platform free of junk content.
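The aggregation idea can be illustrated with a short sketch: group posts by the URL they contain so each URL becomes a single item to decide on, then check the URL's domain against allow and deny lists. The regular expression, function names, and decision labels are assumptions for illustration only, and the sketch covers URLs but not emails or phone numbers.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set
from urllib.parse import urlparse

# Simplified URL matcher for the example; real spam detection is more involved.
URL_PATTERN = re.compile(r"https?://\S+")


def group_posts_by_url(posts: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each distinct URL to the IDs of the posts that contain it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for post_id, text in posts.items():
        # Strip trailing punctuation so "…/deal!" and "…/deal" group together.
        urls = {u.rstrip(".,!?)") for u in URL_PATTERN.findall(text)}
        for url in urls:
            groups[url].append(post_id)
    return dict(groups)


def decide(url: str, allow_domains: Set[str], deny_domains: Set[str]) -> str:
    """Apply deny and allow lists by domain; anything else needs a manual decision."""
    domain = urlparse(url).netloc.lower()
    if domain in deny_domains:
        return "deny"
    if domain in allow_domains:
        return "allow"
    return "needs_review"
```

Checking the deny list before the allow list is a deliberate choice in this sketch, so a domain that accidentally lands on both lists is treated conservatively.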

Final Thoughts: The Future of Content Moderation

We’re optimistic that Moderation Dashboard can help platforms of all sizes meet their obligations to keep online environments safe and inclusive. With Moderation Dashboard as a supplement to (or replacement for) internal moderation infrastructure, it’s never been easier for our customers to leverage our top-performing AI models to automate their content policies and increase efficiency of human review. 

Moderation Dashboard is an exciting shift in how we deliver our AI solutions, and this is just the beginning. We’ll be quickly adding additional features and functionality based on customer feedback, so please stay tuned for future announcements.

If you’d like to learn more about Moderation Dashboard or schedule a personal demo, please feel free to contact