
Search Custom Image Libraries with New Image Similarity Models

Building a Smarter Way to Search

Hive has spent the last two years building powerful AI models served to customers via APIs. At their core, our current models – visual and text classification, logo detection, OCR, speech-to-text, and more – generate metadata that describes unstructured content. Hive customers use these “content tagging” models to unlock value across a variety of use-cases, from brand advertising analytics to automated content moderation.

While these content tagging models are powerful, some content understanding challenges require a more holistic approach. Meeting these challenges requires an AI model that not only understands a piece of content, but also sees how that content relates to a larger set of data.  

Here’s an example: a dating app wants to moderate its user profile images. Hive’s existing content tagging APIs can solve a number of challenges here, including identifying explicit content (visual moderation), verifying age (demographics), and detecting spam (OCR). But what if the app also needs to detect whether a given photo matches (or is very similar to) another user’s profile image? That problem falls outside the scope of the current content tagging models.

To meet these broader content understanding challenges, we’re excited to launch the first of our intelligent search solutions: Custom Search, an image comparison API built on Hive’s visual similarity models. With the Custom Search APIs, platforms can maintain individualized, searchable databases of images and quickly submit query images for model-based comparisons across those sets. 

This customizability opens up a wide variety of use-cases:

  • Detecting spam content: spammers on online platforms often reuse the same content or variants of it. By banning a single piece of content and using Custom Search to catch its variants, platforms can protect their users more extensively.
  • Detecting marketplace scams: identify potentially fraudulent listings based on photos that match or are similar to other listings
  • Detecting impersonation attempts: on social networks and dating apps, detect whether or not the same or similar profile images are being used across different accounts

This post will preview our visual similarity models and explore how to use Hive’s Custom Search APIs.

Image Similarity Models: A Two-Pronged Approach

More than other classification problems, the question of “image similarity” largely depends on definitions: at what point are two images considered similar or identical? To solve this, we used contrastive learning techniques to build two deep learning models with different but complementary ground-truth concepts of image similarity. 

The first model is optimized to identify exact visual matches between images – in other words: would a human decide that two images are identical upon close inspection? This “exact match” model is sensitive to even subtle augmentations or visual differences, where modifications can have a substantial impact on its similarity predictions.

The second model is optimized towards identifying manipulated images, and is more specifically trained on (manual) modifications such as overlay text, cropping, rotations, filters, and juxtapositions. In other words, is the query image a manipulated copy of the original, or are they actually different images?

Why Use Similarity Models for Image Comparison?

Unlike traditional image duplicate detection approaches, Hive’s deep learning approach to image comparison builds in resilience to image modification techniques, including both manual image manipulations via image editing software and adversarial augmentations (e.g., noise, filters, and other pixel-level alterations). By training on these augmentations specifically, our models can pick up modifications that would defeat conventional image hashing checks, even if those modifications don’t result in visible changes to the image.
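
To make the brittleness of exact hashing concrete, here is a minimal sketch. It assumes the simplest kind of duplicate check, a cryptographic hash (SHA-256) over raw pixel bytes, and uses a synthetic byte buffer as a stand-in for a decoded image. It is not a description of Hive's models or of any particular platform's hashing pipeline, and it does not cover perceptual hashes, which behave differently.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte buffer as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for decoded image data: a 64x64 grayscale image, all mid-gray.
# (Synthetic data chosen for illustration only.)
original = bytes([128] * (64 * 64))

# Change a single pixel by one intensity level, a difference no viewer would notice.
edited = bytearray(original)
edited[0] = 129
modified = bytes(edited)

original_digest = sha256_hex(original)
modified_digest = sha256_hex(modified)

# Count how many pixels actually differ, and check whether the digests still match.
differing_pixels = sum(a != b for a, b in zip(original, modified))
digests_match = original_digest == modified_digest

print(f"pixels changed: {differing_pixels} of {len(original)}")
print(f"digests match:  {digests_match}")
```

A one-pixel change yields a different digest, so an exact-hash lookup would treat the edited copy as a brand-new image. A similarity score, by contrast, is meant to degrade gradually as the images diverge.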

Each model quantifies image similarity as a normalized score between 0 and 1. As you might expect, a pair-wise similarity score of 1.0 indicates an exact match between two images, while lower scores correspond to the extent of visual differences or modifications.  

Example Image Comparisons and Model Responses

To illustrate the problem and give a sense of our models’ understanding, here’s how they classify some example image pairs: 

Responses from Hive image similarity models for a near-exact match. The second image in the pair differs only by overlay text (in this case, a subtitle for a video frame). Both the exact visual match model and the broader similarity model return high similarity scores (>0.95).

This example is close to an exact match: both images come from the same video frame. Both models predict very high similarity scores (although neither reports a perfect match score of 1.0). However, the model predictions begin to diverge when we consider manipulated images:

Responses from Hive image similarity models for a first manually manipulated image pair. The query image is mirrored horizontally with a basic color filter. In this case, the exact visual match model returns a similarity score below 0.4, while the broader image similarity model still returns a similarity score above 0.95.
Horizontal flip plus filter adjustments
Responses from Hive image similarity models for a second manipulated image pair of a Hive billboard. The query image is an offset composition of multiple recolored masks. In this case, the exact visual match model returns a similarity score below 0.7, while the broader image similarity model returns a similarity score above 0.9.
Recoloration plus multiple mask overlay
Responses from Hive image similarity models for a third manipulated image pair of a magazine cover. The query image is the same photo with layered overlay text. In this case, the exact visual match model returns a similarity score below 0.75, while the broader image similarity model still returns a similarity score above 0.9.
Layered overlay text

In these examples, the exact match model shows significantly more sensitivity to visual differences, while the broader visual similarity model (correctly) predicts that one image is a manipulated copy of the other. In this way, scores from these models can be used in distinct but complementary ways to identify matching images in your image library. 
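
One way a platform might combine the two scores is sketched below. The thresholds, label names, and the `PairScores` type are assumptions made for illustration. They are not Hive recommendations, and the example scores are made-up values chosen only to fall within the ranges described above.

```python
from dataclasses import dataclass

# Illustrative thresholds only (assumptions, not Hive guidance); tune per use-case.
EXACT_THRESHOLD = 0.95
SIMILAR_THRESHOLD = 0.90


@dataclass(frozen=True)
class PairScores:
    """Hypothetical container for the two scores returned for an image pair."""
    exact_match: float  # score from the exact visual match model
    similarity: float   # score from the broader visual similarity model


def interpret(scores: PairScores,
              exact_threshold: float = EXACT_THRESHOLD,
              similar_threshold: float = SIMILAR_THRESHOLD) -> str:
    """Map a pair of scores to a coarse label."""
    for value in (scores.exact_match, scores.similarity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores are expected to lie between 0 and 1")
    if scores.exact_match >= exact_threshold:
        return "near_exact"
    if scores.similarity >= similar_threshold:
        return "manipulated_copy"
    return "no_match"


# Made-up scores consistent with the ranges in the examples above.
subtitle_pair = interpret(PairScores(exact_match=0.97, similarity=0.98))
flipped_pair = interpret(PairScores(exact_match=0.35, similarity=0.96))
unrelated_pair = interpret(PairScores(exact_match=0.05, similarity=0.20))

print(subtitle_pair, flipped_pair, unrelated_pair)
```

Checking the exact match score first separates near-identical copies from edited ones. A platform that only cares whether an image is a copy at all could skip that branch and threshold the similarity score alone.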

Hive’s Custom Search: API Overview

Custom Search includes three API endpoints: two for adding and removing images from individualized image libraries, and a third to submit query images for model-based comparison. 

For comparison tasks, the query endpoint allows images to be submitted for comparison to the library associated with your project. When a query image is submitted, our models will compare the image to each reference image in your custom index to identify visual matches. 

For any matching images, the Custom Search API will return a similarity score from both the exact visual match model and the visual similarity model, like those shown in the above examples. Each platform can therefore choose which of these scores to use (and at what threshold) based on its desired use-case.
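
The add, remove, and query workflow described above can be sketched with an in-memory stand-in. Everything here is hypothetical: the class and method names, the `min_similarity` parameter, and especially `toy_scorer`, which is a byte-comparison placeholder rather than a similarity model. The real Custom Search endpoints, request formats, and response fields are not shown in this post and are not represented here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A scorer returns (exact_match_score, similarity_score) for two images.
Scorer = Callable[[bytes, bytes], Tuple[float, float]]


@dataclass(frozen=True)
class Match:
    image_id: str
    exact_match_score: float
    similarity_score: float


class InMemoryImageLibrary:
    """Hypothetical local stand-in for a per-project image library."""

    def __init__(self, scorer: Scorer) -> None:
        self._scorer = scorer
        self._images: Dict[str, bytes] = {}

    def add(self, image_id: str, image: bytes) -> None:
        """Add (or overwrite) a reference image."""
        self._images[image_id] = image

    def remove(self, image_id: str) -> bool:
        """Remove a reference image; return True if it was present."""
        return self._images.pop(image_id, None) is not None

    def query(self, image: bytes, min_similarity: float = 0.9) -> List[Match]:
        """Compare a query image to every reference image and return matches."""
        matches = []
        for image_id, reference in self._images.items():
            exact, similar = self._scorer(image, reference)
            if similar >= min_similarity:
                matches.append(Match(image_id, exact, similar))
        # Highest similarity first.
        return sorted(matches, key=lambda m: m.similarity_score, reverse=True)


def toy_scorer(a: bytes, b: bytes) -> Tuple[float, float]:
    """Placeholder scorer: NOT a similarity model, just byte-level agreement."""
    length = max(len(a), len(b))
    if length == 0:
        return 1.0, 1.0
    same = sum(x == y for x, y in zip(a, b))
    return (1.0 if a == b else 0.0), same / length


library = InMemoryImageLibrary(toy_scorer)
library.add("banned-1", bytes([10] * 100))
library.add("other", bytes([200] * 100))

# A query that differs from "banned-1" in 5 of 100 bytes.
query_image = bytes([10] * 95 + [11] * 5)
results = library.query(query_image)
print(results)

removed = library.remove("banned-1")
results_after_removal = library.query(query_image)
print(removed, results_after_removal)
```

Because the scorer is passed in rather than hard-coded, the library logic (which images are indexed, which matches clear the threshold) can be tested separately from whatever produces the scores.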

Final Thoughts

We’re excited about the ways that our new Custom Search APIs will enable customers to unlock useful insights in their search applications. For Hive, this represents the start of a new generation of enterprise AI that just scratches the surface of what is possible in this space.

If you’d like to learn more about the Custom Search APIs or get help designing a solution tailored to your needs, you can reach out to our sales team by email at sales@thehive.ai.