- A New Need for Content Moderation
- Using AI to Identify AI: Building Our Classifier
- How it Works: An Example Input and Response
- Final Thoughts and Future Directions
A New Need for Content Moderation
In the past few months, AI-generated art has grown rapidly in both popularity and accessibility. Engines like DALL-E, Midjourney, and Stable Diffusion have spurred an influx of AI-generated artworks across online platforms, prompting intense debate about their legality, artistic value, and potential to enable the spread of deepfake-like content. As a result, digital platforms such as Getty Images, InkBlot Art, Fur Affinity, and Newgrounds have announced outright bans on AI-generated content, with more likely to follow in the coming weeks and months.
Platforms are enacting these bans for a variety of reasons. Online communities built for artists to share their work, such as Newgrounds, Fur Affinity, and Purpleport, stated that they banned AI artwork in order to keep their sites focused exclusively on human-created art. Other platforms have taken action against AI-generated artwork due to copyright concerns. Image synthesis models often include copyrighted images in their training data, which consist of massive amounts of photos and artwork scraped from across the web, typically without the artists’ consent. It is an open question whether this type of scraping and the resulting AI-generated artwork amount to copyright violations, particularly in the case of commercial use, and platforms like Getty and InkBlot Art don’t want to take that risk.
As part of Hive’s commitment to providing enterprise customers with API-accessible solutions to moderation problems, we have created a classification model made specifically to assist digital platforms in enacting these bans. Our AI-Generated Media Recognition API is built with the same type of robust classification model as our industry-leading visual moderation products, and it enables enterprise customers to moderate AI-generated artwork without relying on users to flag images manually.
This post explains how our model works and introduces the new API that makes this functionality accessible.
Using AI to Identify AI: Building Our Classifier
Hive’s AI-Generated Media Recognition model is optimized for use with the kind of media generated by popular AI generative engines such as DALL-E, Midjourney, and Stable Diffusion. It was trained on a large dataset comprising millions of artificially generated images and human-created images such as photographs, digital and traditional art, and memes sourced from across the web.
The resulting model can identify AI-created images across many different types and styles of artwork, including AI artwork that manual flagging might miss. Our model returns not only whether a given image is AI-generated, but also the engine that most likely generated it. Each classification is accompanied by a confidence score that ranges from 0.0 to 1.0, allowing customers to set a confidence threshold to guide their moderation.
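To make the idea of a confidence threshold concrete, here is a minimal sketch in Python. The function name, the "flag"/"allow" labels, and the default threshold of 0.9 are assumptions chosen for illustration; they are not part of the API, and each platform would pick a threshold that suits its own tolerance for false positives.

```python
def moderation_action(ai_score: float, threshold: float = 0.9) -> str:
    """Map an AI-generated confidence score to a moderation decision.

    Both the threshold default and the returned labels are hypothetical.
    """
    if not 0.0 <= ai_score <= 1.0:
        raise ValueError("confidence scores are expected to lie in [0.0, 1.0]")
    # Flag the image only when the model is at least as confident as the threshold.
    return "flag" if ai_score >= threshold else "allow"
```

A stricter platform could lower the threshold to catch more borderline images, at the cost of flagging more human-created work by mistake.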
How it Works: An Example Input and Response
For each input image, our AI-Generated Media Recognition model returns classifications under two separate heads. The first provides a binary classification of whether or not the image is AI-generated. The second, which is only relevant when the image is classified as AI-generated, identifies the source of the image from among the most popular generation engines currently in use.
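The relationship between the two heads can be sketched as follows. This is an illustration only: the dictionary layout and the class names ("ai_generated", "not_ai_generated", "midjourney", and so on) are hypothetical stand-ins rather than the actual response schema, and the scores passed in would come from the API.

```python
from typing import Dict, Optional, Tuple


def interpret(generation_head: Dict[str, float],
              source_head: Dict[str, float]) -> Tuple[bool, Optional[str]]:
    """Combine the two heads into (is_ai_generated, likely_source).

    Class names are hypothetical; they are not taken from the real API.
    """
    is_ai = generation_head["ai_generated"] >= generation_head["not_ai_generated"]
    if not is_ai:
        # The source head is only meaningful for AI-generated images.
        return False, None
    # Pick the engine with the highest confidence score.
    likely_source = max(source_head, key=source_head.get)
    return True, likely_source
```

Called with made-up scores such as `interpret({"ai_generated": 0.97, "not_ai_generated": 0.03}, {"midjourney": 0.9, "dalle": 0.06, "stable_diffusion": 0.04})`, the function reports the image as AI-generated and names the top-scoring engine.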
To get a sense of the capabilities of our AI-Generated Media Recognition model, here’s a look at an example classification:
This input image was created with the AI model Midjourney, though it is so realistic that manual flagging may miss it. As shown in the response above, our model correctly classifies this image as AI-generated with a high confidence score of 0.968. The model also correctly identifies the source of the image, with a similarly high confidence score. Other sources like DALL-E are also returned along with their respective confidence scores, and the scores within each of the two model heads sum to 1.
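Because the scores within a head sum to 1, a client can run a simple sanity check on each head before acting on it. The sketch below assumes a head is available as a mapping from class name to score; that representation, the helper name, and the tolerance are assumptions made for illustration.

```python
import math
from typing import Dict


def head_is_normalized(head: Dict[str, float], tol: float = 1e-3) -> bool:
    """Return True if the scores in one head sum to 1 within a tolerance."""
    return math.isclose(sum(head.values()), 1.0, abs_tol=tol)
```

For the binary head in the example, a score of 0.968 for the AI-generated class implies a score of about 0.032 for the other class.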
Platforms that host artwork of any kind can integrate this AI-Generated Media Recognition API into their workflows by automatically screening all content as it is being posted. This method of moderating AI artwork works far more quickly than manual flagging and can catch realistic artificial artworks that even human reviewers might miss.
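One way to picture this kind of upload-time screening is sketched below. The `classify` argument stands in for whatever code a platform writes to call the API and return the binary head's scores; that wrapper, the class names, the threshold, and the stub classifier are all hypothetical and exist only to keep the example self-contained.

```python
from typing import Callable, Dict

# A caller-supplied function that sends image bytes for classification and
# returns the binary head's scores (hypothetical shape, not the real schema).
Classifier = Callable[[bytes], Dict[str, float]]


def screen_upload(image: bytes, classify: Classifier, threshold: float = 0.9) -> bool:
    """Return True if the upload may be published, False if it should be held."""
    scores = classify(image)
    return scores["ai_generated"] < threshold


def fake_classifier(image: bytes) -> Dict[str, float]:
    # Stand-in for a real API call; always returns the same fixed scores.
    return {"ai_generated": 0.968, "not_ai_generated": 0.032}
```

With the stub above, `screen_upload(b"...", fake_classifier)` holds the upload at the default threshold, while a platform that set the threshold to 0.99 would let the same image through.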
Final Thoughts and Future Directions
Digital platforms are now being flooded with AI-generated content, and that influx will only increase as these generative models continue to grow and spread. On top of this, the tools for creating this kind of artwork are fast and easy to access online, which enables large quantities of it to be produced quickly. Moderating artificially created artworks is crucial for many sites to uphold their platform’s mission and to protect themselves and their customers from potential legal issues further down the line.
We created our AI-Generated Media Recognition API to solve this problem, but our model will need to continue to evolve along with image generation models as existing ones improve and new ones are released. We plan on adding new generative engines to our sources as well as continually updating our model to keep up with the current capabilities of these models. Since some newer generative models can create video in addition to still images, we are working to add support for video formats within our API in order to best prevent all types of AI-generated artwork from dominating online communities where they are unwelcome.