{"id":205,"date":"2022-04-08T15:56:00","date_gmt":"2022-04-08T15:56:00","guid":{"rendered":"https:\/\/thehive.ai\/blog\/?p=205"},"modified":"2025-03-04T09:52:50","modified_gmt":"2025-03-04T09:52:50","slug":"ocr-moderation-with-hive-new-approaches-to-online-content-moderation","status":"publish","type":"post","link":"https:\/\/thehive.ai\/blog\/ocr-moderation-with-hive-new-approaches-to-online-content-moderation","title":{"rendered":"OCR Moderation with Hive: New Approaches to Online Content Moderation"},"content":{"rendered":"\n<p>Recently, image-based content featuring embedded text \u2013 such as memes, captioned images and GIFs, and screenshots of text \u2013 has exploded in popularity across many social platforms. These types of content can present unique challenges for automated moderation tools. Not only does embedded text need to be detected and ordered accurately, it also must be analyzed with contextual awareness and attention to semantic nuance.&nbsp;<\/p>\n\n\n\n<p>Emojis have historically been another obstacle for automated moderation. Thanks to native support across many devices and platforms, these characters have evolved into a new online lexicon for accentuating or replacing text. Many emojis have also developed connotations that are well understood by humans but not directly related to the image itself, which can make it difficult for automated solutions to identify harmful or inappropriate text content.<\/p>\n\n\n\n<p>To help platforms tackle these challenges, Hive offers optical character recognition (OCR)-based moderation as part of our content moderation suite. Our OCR models are optimized for the types of digitally-generated content that commonly appear on social platforms, enabling robust AI moderation on content forms that are widespread yet overlooked by other solutions. 
Our OCR moderation API combines competitive text detection and transcription capabilities with our best-in-class text moderation model (including emoji support) into a single response, making it easy for platforms to take real-time enforcement actions across these popular content formats.&nbsp;<\/p>\n\n\n\n<h2>OCR Model for Text Recognition<\/h2>\n\n\n\n<p>Effective OCR moderation starts with training for accurate text detection and extraction. Hive\u2019s OCR model is trained on a large, proprietary set of examples that optimizes for how text commonly appears within user-generated digital content. Hive has the largest distributed workforce for data labeling in the world, and we leaned on this capability to provide tens of millions of human annotations on these examples to build our model\u2019s understanding.&nbsp;<\/p>\n\n\n\n<p>We recently conducted a head-to-head comparison of our OCR model against top public cloud solutions using a custom evaluation dataset sourced from social platforms. 
We were particularly interested in test examples that featured digitally-generated text \u2013 such as memes and captioned images \u2013 to capture how content commonly appears on social platforms and selected evaluation data accordingly.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"586\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-6-1024x586.png\" alt=\"\" class=\"wp-image-338\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-6-1024x586.png 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-6-300x172.png 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-6-768x439.png 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-6-1536x878.png 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-6.png 1892w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>In this evaluation, we looked at end-to-end text recognition, which includes both text detection and text transcription. Here, Hive\u2019s OCR model outperformed or was competitive with other models on both exact transcription and transcription allowing character-level errors. At 90% recall, Hive\u2019s OCR model achieved a precision of 98%, while public cloud models ranged from ~88% to 97%, implying a similar or lower end-to-end error rate.<\/p>\n\n\n\n<h2>OCR Moderation: Language Support<\/h2>\n\n\n\n<p>We recognize that many platforms\u2019 moderation needs extend beyond English-speaking users. Hive\u2019s OCR model supports text recognition and transcription for&nbsp;<a href=\"https:\/\/docs.thehive.ai\/docs\/ocr-text-recognition\" target=\"_blank\" rel=\"noreferrer noopener\">many widely spoken languages<\/a>&nbsp;with comparable performance, many of which are also supported by our text moderation solutions. 
Here\u2019s an overview of our current language support:<\/p>\n\n\n\n<figure class=\"wp-block-table is-style-stripes\"><table><tbody><tr><td class=\"has-text-align-center\" data-align=\"center\"><strong>Language<\/strong><\/td><td class=\"has-text-align-center\" data-align=\"center\"><strong>OCR Support?<\/strong><\/td><td class=\"has-text-align-center\" data-align=\"center\"><strong>Text Moderation Support?<\/strong><\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">English<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Model)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Spanish<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Model)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">French<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Model)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">German<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Model)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Mandarin<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Pattern Match)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Russian<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Pattern Match)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Portuguese<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes 
(Model)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Arabic<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Model)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Korean<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Pattern Match)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Japanese<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Pattern Match)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Hindi<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Model)<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Italian<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes<\/td><td class=\"has-text-align-center\" data-align=\"center\">Yes (Pattern Match)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2><strong>Moderation of Detected Text<\/strong><\/h2>\n\n\n\n<p>Hive\u2019s OCR moderation solution goes beyond producing a transcript \u2013 we then apply our best-in-class text moderation model to understand the meaning of that speech in context (including any detected emojis). Our backend will automatically feed text detected in an image as an input to our text moderation model, making our model classifications on image-based text accessible with a single API call. 
Our text model is generally robust to misspellings and character substitutions, enabling high classification accuracy on text extracted via OCR even if errors occur in transcription.&nbsp;<\/p>\n\n\n\n<p>Hive\u2019s text moderation model can classify extracted text across several sensitive or inappropriate categories, including sexuality, threats or descriptions of violence, bullying, and racism.&nbsp;<\/p>\n\n\n\n<p>Another critical use-case is moderating spam and doxxing: OCR moderation will quickly and accurately flag images containing emails, phone numbers, addresses and other personally identifiable information.&nbsp; Finally, our text moderation model can also identify promotions such as soliciting services, asking for shares and follows, soliciting donations, or links to external sites. This gives platforms new tools to curate user experience and remove junk content.&nbsp;<\/p>\n\n\n\n<p>We understand that verbal communication is rarely black and white \u2013 context and linguistic nuance can have profound effects on how the meaning and intent of words are perceived. To help navigate these gray areas, our text model responses supplement classifications with a score from benign (score = 0) to severe (score = 3), which can be used to adapt any necessary moderation actions to platforms\u2019 individual needs and sensitivities. 
You can read more about our text models in&nbsp;<a href=\"https:\/\/thehive.ai\/blog\/hive-hate-model-automated-content-moderation-suite\" target=\"_blank\" rel=\"noreferrer noopener\" title=\"\">previous blog posts<\/a>&nbsp;or&nbsp;<a href=\"https:\/\/docs.thehive.ai\/docs\/classification-text\" target=\"_blank\" rel=\"noreferrer noopener\">in our documentation<\/a>.<\/p>\n\n\n\n<p>Our currently supported moderation classes in each language are as follows:<\/p>\n\n\n\n<figure class=\"wp-block-table is-style-stripes\"><table><tbody><tr><td class=\"has-text-align-center\" data-align=\"center\"><strong>Language<\/strong><\/td><td class=\"has-text-align-center\" data-align=\"center\"><strong>Classes<\/strong><\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">English<\/td><td class=\"has-text-align-center\" data-align=\"center\">Sexual, Hate, Violence, Bullying<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Spanish<\/td><td class=\"has-text-align-center\" data-align=\"center\">Sexual, Hate<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Portuguese<\/td><td class=\"has-text-align-center\" data-align=\"center\">Sexual, Hate<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">French<\/td><td class=\"has-text-align-center\" data-align=\"center\">Sexual<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">German<\/td><td class=\"has-text-align-center\" data-align=\"center\">Sexual<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Hindi<\/td><td class=\"has-text-align-center\" data-align=\"center\">Sexual<\/td><\/tr><tr><td class=\"has-text-align-center\" data-align=\"center\">Arabic<\/td><td class=\"has-text-align-center\" data-align=\"center\">Sexual<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2><strong>Emoji Classification for Text Moderation<\/strong><\/h2>\n\n\n\n<p>Emoji recognition is a unique feature of Hive\u2019s OCR 
moderation model that opens up new possibilities for identifying harmful or harassing text-based content. Emojis can be particularly useful in moderation contexts because they can subtly (or not-so-subtly) alter how accompanying text is interpreted by the reader. Text that is otherwise innocuous can easily become inappropriate when accompanied by a particular emoji and vice versa.<\/p>\n\n\n\n<p>Hive OCR is able to detect and classify any emojis supported by Apple, Samsung, or Google devices. Our OCR model currently achieves a weighted accuracy of over 97% when classifying emojis. This enables our text moderation model to account for contextual meaning and connotations of emojis used in input text.&nbsp;<\/p>\n\n\n\n<p>To get a sense of our model\u2019s understanding, let\u2019s take a look at some examples of how the use of emojis (or inclusion of text around emojis) changes our model predictions to align with human understanding. Each of these examples is from a real classification task submitted to our latest model release.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"342\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/2-4-1024x342.png\" alt=\"\" class=\"wp-image-339\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/2-4-1024x342.png 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/2-4-300x100.png 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/2-4-768x257.png 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/2-4.png 1352w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Here\u2019s a basic example of how adding an emoji changes our model response from classifying as clean to classifying as sensitive.&nbsp; Our models understand not only the verbal concept represented by the emoji, but what the emoji means semantically based on where it is located in the text. 
In this case, the bullying connotation of the \u201cgarbage\u201d or \u201ctrash\u201d emoji would be completely missed by an analysis of the text alone.&nbsp;<\/p>\n\n\n\n<p>Our model is similarly sensitive to changes in semantic meaning caused by substitutions of emojis for text.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"277\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-4-1024x277.png\" alt=\"\" class=\"wp-image-340\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-4-1024x277.png 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-4-300x81.png 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-4-768x207.png 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-4.png 1370w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>In this case, our model catches the sexual connotation added by the eggplant emoji in place of the word \u201ceggplant.\u201d Again, the text alone without an emoji \u2013 \u201clemme see that !\u201d \u2013 is completely clean.<\/p>\n\n\n\n<p>In addition to understanding how emojis can alter the meaning of text, our model is also sensitive to how text can change implications of emojis themselves.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"285\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-2-1024x285.png\" alt=\"\" class=\"wp-image-341\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-2-1024x285.png 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-2-300x83.png 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-2-768x213.png 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-2-1536x427.png 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-2.png 2048w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Here, adding the phrase \u201chey hotty\u201d transforms an emoji 
usually used innocuously into a message with suggestive intent, and our model prediction changes accordingly.&nbsp;&nbsp;<\/p>\n\n\n\n<p>Finally, Hive\u2019s OCR and text moderation models are trained to differentiate between each skin tone option for emojis in the \u201cPeople\u201d category and understand their implications in the context of accompanying text. We are currently exploring how the ability to differentiate between lighter and darker skin tones can enable new tools to identify hateful, racist, or exclusionary text content.<\/p>\n\n\n\n<h2><strong>OCR Moderation: Final Thoughts<\/strong><\/h2>\n\n\n\n<p>User preferences for online communication are constantly evolving in both medium and content, which can make it challenging for platforms to keep up with abusive users. Hive prides itself on identifying blind spots in existing moderation tools and developing robust AI solutions using high-quality training data tailored to these use-cases. We hope that this post has showcased what\u2019s possible with our OCR moderation capabilities and given some insight into our future directions.&nbsp;<\/p>\n\n\n\n<p>Feel free to contact&nbsp;<a href=\"mailto:sales@thehive.ai\">sales@thehive.ai<\/a>&nbsp;if you are interested in adding OCR capabilities to your moderation suite, and please stay tuned as we announce new features and updates!<\/p>\n\n\n\n<p><br><\/p>\n","protected":false},"excerpt":{"rendered":"<p>We showcase some unique features of Hive&#8217;s OCR Moderation API for online content moderation and compare how our OCR model stacks up against 
competitors<\/p>\n","protected":false},"author":1,"featured_media":331,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"kia_subtitle":""},"categories":[8,4,2],"tags":[],"_links":{"self":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/205"}],"collection":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/comments?post=205"}],"version-history":[{"count":6,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/205\/revisions"}],"predecessor-version":[{"id":1965,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/205\/revisions\/1965"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/media\/331"}],"wp:attachment":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/media?parent=205"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/categories?post=205"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/tags?post=205"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}