{"id":1,"date":"2024-01-11T06:33:00","date_gmt":"2024-01-11T06:33:00","guid":{"rendered":"http:\/\/54.151.72.21\/?p=1"},"modified":"2025-03-05T06:19:04","modified_gmt":"2025-03-05T06:19:04","slug":"customizing-hive-moderation-models-with-automl","status":"publish","type":"post","link":"https:\/\/thehive.ai\/blog\/customizing-hive-moderation-models-with-automl","title":{"rendered":"Customizing Hive Moderation Models with AutoML"},"content":{"rendered":"\n<p>Hive\u2019s AutoML platform allows anyone the opportunity to create best-in-class machine learning solutions for the particular issues they face. Our platform can create classification and large language models for an endless range of use cases. If you need a model that bears no resemblance whatsoever to any pre-trained model we offer, no problem! We\u2019ll help you build one yourself.&nbsp;<\/p>\n\n\n\n<p>Hive AutoML uses the same technology behind our industry-leading ML tools to create yours. This way you get the best of both worlds \u2014 Hive\u2019s impeccable model performance and a tool custom-built to address your needs.<\/p>\n\n\n\n<h3>Hive AutoML for Content Moderation<\/h3>\n\n\n\n<p>Today we\u2019ll be focusing on one particular application of our AutoML platform: customizing our moderation models. These models kickstarted our success as a company and are used by many of the largest online platforms in the world. But the moderation guidelines of many sites differ from each other, and sometimes our base moderation models don\u2019t quite fit them.&nbsp;<\/p>\n\n\n\n<p>With AutoML, you can create your own version of our moderation models by fine-tuning our pre-existing heads or adding new heads entirely. 
We will then train a version of our high-performing base model with your added data to create a tool that best suits your platform\u2019s moderation process.&nbsp;<\/p>\n\n\n\n<p>In this blog post, we\u2019ll walk through both how to add more data to an existing Hive moderation head and how to add a new custom moderation head. We\u2019ll demonstrate the former while building a visual moderation model and the latter on a text moderation model. Audio moderation is not currently supported on AutoML.<\/p>\n\n\n\n<h3>Building a Visual Moderation Model<br><\/h3>\n\n\n\n<p>Hive AutoML for Visual Moderation allows you to customize our Visual Moderation base model to fit your specific needs. Using your own data, you can add new model heads or fine-tune any of the existing 45+ subclasses that we provide as part of our Visual Moderation tool. A full list of these classes is available&nbsp;<a href=\"https:\/\/web.archive.org\/web\/20240221053955\/https:\/\/docs.thehive.ai\/docs\/visual-content-moderation#visual-content-moderation\">here<\/a>.<\/p>\n\n\n\n<p>For this walkthrough, we\u2019ll be fine-tuning the tobacco head. Our data will thus include images and labels for this head only. The resulting model will include all Hive visual moderation heads, with the tobacco head re-trained to incorporate this new data.<\/p>\n\n\n\n<h2>Uploading Your Dataset<\/h2>\n\n\n\n<p>Before you start building your model, you first need to upload any datasets you\u2019ll use to the Dataset section of our AutoML platform. For Visual Moderation model training, we require a CSV file with a column for your image data (as publicly accessible image URLs) and an additional column for each head you wish to train.<\/p>\n\n\n\n<p>For this tutorial, we\u2019re going to train using additional data for the tobacco class. 
The below CSV includes image URLs and a column of labels for that head.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"665\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_0-1024x665.jpg\" alt=\"Dataset formatting, images have either \u201cyes_tobacco\u201d or \u201cno_tobacco\u201d labels\" class=\"wp-image-24\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_0-1024x665.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_0-300x195.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_0-768x499.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_0-1536x998.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_0.jpg 1558w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>Dataset formatting, images have either \u201cyes_tobacco\u201d or \u201cno_tobacco\u201d labels<\/figcaption><\/figure>\n\n\n\n<p>After you\u2019ve selected your dataset file, you\u2019ll be asked to confirm the column mapping. 
Make sure the columns of your dataset have been interpreted correctly and that you have the correct format (image or text) selected for each column.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"491\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-1024x491.jpg\" alt=\"The column mapping confirmation page lets you double check that the data has been processed correctly.\" class=\"wp-image-25\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-1024x491.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-300x144.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-768x368.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-1536x737.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The column mapping confirmation page lets you double check that the data has been processed correctly.<\/figcaption><\/figure>\n\n\n\n<p>Once you\u2019ve confirmed your mapping, you can preview and edit your data. This page opens automatically after any dataset upload. You will be able to check whether all images were uploaded successfully, view the images themselves, and change their respective labels if desired. 
You can also add or delete data before you proceed to model training.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"532\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-1024x532.jpg\" alt=\"The dataset preview page for an image-based dataset.\" class=\"wp-image-27\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-1024x532.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-300x156.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-768x399.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-1536x798.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The dataset preview page for an image-based dataset.<\/figcaption><\/figure>\n\n\n\n<h2>Creating a Dataset Snapshot<\/h2>\n\n\n\n<p>When you\u2019re happy with your dataset, you\u2019ll then need to create a snapshot from it. A snapshot is a point-in-time export of a dataset that validates that dataset for training. Once a snapshot is created, its contents cannot be changed. 
This means that while you can continue to edit your original dataset, your snapshot will not change along with it \u2014 if you make any changes, you\u2019ll need to create a new snapshot after you\u2019re finished with your changes.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"536\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_3-1024x536.jpg\" alt=\"The information you\u2019ll be asked to provide when creating a snapshot.\" class=\"wp-image-30\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_3-1024x536.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_3-300x157.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_3-768x402.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_3-1536x804.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_3.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The information you\u2019ll be asked to provide when creating a snapshot.<\/figcaption><\/figure>\n\n\n\n<p>You can create a snapshot from any live dataset. To do so, simply click the \u201cCreate Snapshot\u201d button on that dataset\u2019s detail page. You\u2019ll be prompted to provide some information, most notably which columns to use for image input and data labels. After your snapshot is successfully created, you\u2019re ready to start training!<\/p>\n\n\n\n<h2>Creating a New Model<\/h2>\n\n\n\n<p>To create a training, you can select the \u201cCreate Model\u201d button on the snapshot detail page. You\u2019ll once again be asked to provide several pieces of information, including your model\u2019s name, description, base model, and datasets. 
Make sure to select \u201cHive Vision Moderation\u201d under the \u201cBase Model\u201d category as opposed to a general image classification model.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"641\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_4-1024x641.jpg\" alt=\"When creating your model, make sure you have the correct model type and base model selected.\" class=\"wp-image-31\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_4-1024x641.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_4-300x188.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_4-768x481.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_4-1536x961.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_4.jpg 1598w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>When creating your model, make sure you have the correct model type and base model selected.<\/figcaption><\/figure>\n\n\n\n<p>You can choose to upload a separate test dataset or split off a random section of your training dataset to use instead. If you choose to upload a separate test dataset, this dataset must contain the same heads and classes as your training dataset. After uploading your dataset, you will also need to create a snapshot of that dataset before you begin model training.<\/p>\n\n\n\n<p>If you choose to split off a section of your training dataset, you will be able to choose the percentage of that dataset that you would like to use for testing as you create your training.<\/p>\n\n\n\n<p>Before you begin your training, you are also able to edit some training preferences such as maximum number of training epochs, model selection rule, model selection label, early stopping, and invalid data criteria. 
If you\u2019re unsure what any of these options are, there is a little information icon next to each that will explain what is meant by that setting.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"674\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-1024x674.jpg\" alt=\"The training options you\u2019re offered as you create your model include max epochs, model selection rule, and more.\" class=\"wp-image-26\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-1024x674.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-300x197.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-768x505.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-1536x1010.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5.jpg 1648w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The training options you\u2019re offered as you create your model include max epochs, model selection rule, and more.<\/figcaption><\/figure>\n\n\n\n<p>After uploading your training (and, if desired, test) dataset and selecting your desired training options, you\u2019re ready to create your model. After you begin training, your model will be ready within 20 minutes. You will automatically be directed to the model\u2019s detail page, where you can watch its progress as it trains.<\/p>\n\n\n\n<h2>Playground and Metrics: Evaluating Your Model<\/h2>\n\n\n\n<p>When your model has completed its training, the model\u2019s detail page will display a variety of metrics in order to help you analyze your model\u2019s performance. At the top of the page, you\u2019ll be shown the model\u2019s precision, recall, balanced accuracy, and F1 score. 
You can toggle whether these metrics are calculated by head overall or by each class within a head.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"537\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-1024x537.jpg\" alt=\"The model details page displays performance metrics once the model has completed training.\" class=\"wp-image-28\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-1024x537.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-300x157.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-768x403.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-1536x805.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The model details page displays performance metrics once the model has completed training.<\/figcaption><\/figure>\n\n\n\n<p>Below these numbers, you\u2019ll also be able to view an interactive precision\/recall (PR) curve. This is the gold-standard metric for a classification model and gives you more insight into how your model balances the inherent tradeoff between high precision and high recall.<\/p>\n\n\n\n<p>You\u2019ll then be shown a confusion matrix, which is an exact breakdown of the true positives, false positives, true negatives, and false negatives of the model\u2019s results. This can highlight particular weak spots of your model and potential areas you may want to address with further training. 
As shown below, our example model has no false positives but several false negatives \u2014 images with tobacco that were classified as \u201cno_tobacco.\u201d<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"783\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-1024x783.jpg\" alt=\" This model\u2019s confusion matrix, which shows that there is an issue with false negatives.\" class=\"wp-image-32\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-1024x783.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-300x229.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-768x587.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-1536x1175.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7.jpg 1794w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption> This model\u2019s confusion matrix, which shows that there is an issue with false negatives.<\/figcaption><\/figure>\n\n\n\n<p>The final section of our metrics page is an area called the \u201cplayground.\u201d The playground allows you to test your newly created AutoML model by submitting sample queries and viewing the responses. This feature is another great way to explore the way that your model responds to different kinds of prompts and the areas in which it could improve. You are given 500 free sample queries \u2014 beyond that you will be prompted to deploy your model with the cost of each submission charged to your organization\u2019s billing account.<\/p>\n\n\n\n<p>To test our tobacco model, we submitted the following sample image. 
To the right of it, you can see the results for each Hive visual moderation class, including tobacco, which is classified correctly with a perfect confidence score of 1.00.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"749\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_8-1024x749.jpg\" alt=\"An example image of a man smoking a cigar and the labels assigned to it by our newly trained moderation model.\" class=\"wp-image-29\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_8-1024x749.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_8-300x220.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_8-768x562.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_8-1536x1124.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_8.jpg 1640w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>An example image of a man smoking a cigar and the labels assigned to it by our newly trained moderation model.<\/figcaption><\/figure>\n\n\n\n<h2>Deploying Your Model<\/h2>\n\n\n\n<p>To begin using your model, you can create a deployment from it. This will open the project on Hive Data, where you will be able to upload tasks, view tasks, and access your API key as you would with any other Hive Data project. An AutoML project can have multiple active deployments at one time.<\/p>\n\n\n\n<h3>Building a Text Moderation Model<\/h3>\n\n\n\n<p>Just like for Visual Moderation, our AutoML platform allows you to customize our Text Moderation base model to fit your particular use cases by adding or re-training model categories. The full class definitions for all 13 of our currently offered heads are available<a href=\"https:\/\/docs.thehive.ai\/docs\/classification-text#text-classification-model\" target=\"_blank\" rel=\"noreferrer noopener\">&nbsp;here<\/a>. 
For this section of the walkthrough, we will be creating a new custom head in order to add capabilities to our model that we don\u2019t currently offer: sentiment analysis.<\/p>\n\n\n\n<p>Sentiment analysis is the task of categorizing the emotional tone of a piece of text, typically into two labels: positive or negative. Occasionally there may be a sentiment analysis task that breaks the sentiment down into more specific categories, such as joyful, angry, etc. Adding this kind of information to our existing Hive Text Moderation model could prove useful for platforms that wish to either exclude negative content on sites for children or to put limits on certain comment sections or forums where negative commentary is unwanted.<\/p>\n\n\n\n<p>Sentiment analysis is a complex problem, since it is a language-based task. Understanding the meaning and tone of a sentence is not always easy even for humans. To keep it simple, we\u2019ll just be using the two possible classifications of positive and negative.<\/p>\n\n\n\n<h2>Uploading Your Dataset<\/h2>\n\n\n\n<p>Similarly to creating a Visual Moderation model, you\u2019ll need to upload your data as a CSV file to the \u201cData\u201d section of our AutoML platform prior to model training. 
The format of our sentiment analysis dataset is shown below, though the column names do not need to be anything specific in order to be processed correctly.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"565\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_10-1024x565.jpg\" alt=\"The text data and labels for our sentiment analysis model, formatted into two columns.\n\" class=\"wp-image-58\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_10-1024x565.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_10-300x165.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_10-768x424.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_10-1536x847.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_10.jpg 1842w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The text data and labels for our sentiment analysis model, formatted into two columns.<\/figcaption><\/figure>\n\n\n\n<p>After uploading your dataset, you\u2019ll be asked to confirm the format of each column as text, image, or JSON. If you\u2019d like to disregard a column entirely, there is also an option to \u201cIgnore Column.\u201d After you hit confirm, you can preview and edit your dataset just as you could with your image dataset in the Visual Moderation example. 
The preview page for text datasets is shown below.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"492\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_11-1024x492.jpg\" alt=\"The preview page for a text-based dataset.\" class=\"wp-image-59\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_11-1024x492.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_11-300x144.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_11-768x369.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_11-1536x738.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_11.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The preview page for a text-based dataset.<\/figcaption><\/figure>\n\n\n\n<h2>Creating a Dataset Snapshot<\/h2>\n\n\n\n<p>As described in the Visual Moderation walkthrough, you\u2019ll need to create a snapshot of your dataset in order to validate it prior to model training. When making your snapshot, make sure that you select \u201cText Classification\u201d as your \u201cSnapshot Type.\u201d This will ensure that your snapshot is sufficient to train a Text Moderation model. 
You will also need to specify which column contains your text input and which contains the labels for that text input, as shown below for our dataset.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"394\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_12-1024x394.jpg\" alt=\"When creating your snapshot, you will be asked to provide some information about the dataset.\" class=\"wp-image-60\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_12-1024x394.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_12-300x115.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_12-768x295.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_12-1536x591.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_12.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>When creating your snapshot, you will be asked to provide some information about the dataset.<\/figcaption><\/figure>\n\n\n\n<p>In the example above, we\u2019ve selected our \u201ctext_data\u201d column as our input and our \u201csentiment\u201d column as our training labels.<\/p>\n\n\n\n<h2>Creating a New Model<\/h2>\n\n\n\n<p>After you\u2019ve created your snapshot, you\u2019ll automatically be brought to that snapshot\u2019s detail page. From this page, starting a new model training is easy: just hit the big \u201cCreate New Model\u201d button on the top right. 
You\u2019ll be asked to name your model and provide a few key details about the training, such as which snapshots you\u2019d like to use as your data and how many times a training will cycle through that data.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"697\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_13-1024x697.jpg\" alt=\"You\u2019ll be able to configure your training by choosing a model selection rule, maximum number of epochs, and more.\" class=\"wp-image-62\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_13-1024x697.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_13-300x204.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_13-768x523.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_13-1536x1046.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_13.jpg 1580w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>You\u2019ll be able to configure your training by choosing a model selection rule, maximum number of epochs, and more.<\/figcaption><\/figure>\n\n\n\n<p>Make sure you\u2019ve selected \u201cText Classification\u201d as your model type and \u201cHive Text Moderation\u201d as your base model. Then you\u2019re ready to start your training! Model training takes up to 20 minutes depending on several factors including the size of your dataset. Most take only several minutes to complete.<\/p>\n\n\n\n<h2>Metrics and Model Evaluation<\/h2>\n\n\n\n<p>Once your training has completed, you\u2019ll be redirected to the details page for your new moderation model. On this page, you\u2019ll be shown the model\u2019s precision, recall, balanced accuracy, and F1 score. 
You will also be able to view a precision\/recall (P\/R) curve and confusion matrix in order to further analyze the performance of your model.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"447\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_14-1024x447.jpg\" alt=\"The sentiment analysis model performs fairly well upon first training, with most metrics around 86%.\" class=\"wp-image-63\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_14-1024x447.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_14-300x131.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_14-768x335.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_14-1536x671.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_14.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The sentiment analysis model performs fairly well upon first training, with most metrics around 86%.<\/figcaption><\/figure>\n\n\n\n<p>The overall performance of the model is pretty good for a difficult task such as sentiment analysis. While there is room for improvement, this first round of training indicates that with some additional data we could likely bring all metrics above 90%. 
The confusion matrix indicates that a specific area of weakness for this model is false negatives. A possible solution would be to increase the number of positive examples in the data and observe whether this improves model results.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"691\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_15-1024x691.jpg\" alt=\"The confusion matrix for our model, which shows a 19% false negative rate.\" class=\"wp-image-64\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_15-1024x691.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_15-300x203.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_15-768x519.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_15-1536x1037.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_15.jpg 1632w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The confusion matrix for our model, which shows a 19% false negative rate.<\/figcaption><\/figure>\n\n\n\n<p>We do not currently offer the playground feature for text moderation models, though we are working on this and expect it to be released in the coming months.<\/p>\n\n\n\n<h2>Deploying Your Model<\/h2>\n\n\n\n<p>The process for deploying your model is identical to the way we deployed our Visual Moderation model in the first example. To deploy any model, simply click \u201cCreate Deployment\u201d from that model\u2019s details page. Once deployed, you can access your unique API keys and begin to submit tasks to the model like any other Hive model.<\/p>\n\n\n\n<h3>Final Thoughts<\/h3>\n\n\n\n<p>We hope this in-depth walkthrough was helpful. 
If you have any further questions or run into any issues as you build your custom-made AI models, please don\u2019t hesitate to reach out to us at&nbsp;<a href=\"mailto:support@thehive.ai\">support@thehive.ai<\/a>&nbsp;and we will be happy to help. To inquire about testing out our AutoML platform, please contact&nbsp;<a href=\"mailto:sales@thehive.ai\">sales@thehive.ai<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Seamlessly incorporate your own custom classes into our industry-leading moderation models with Hive AutoML.<\/p>\n","protected":false},"author":1,"featured_media":66,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"kia_subtitle":""},"categories":[8,6,4],"tags":[],"_links":{"self":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/1"}],"collection":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/comments?post=1"}],"version-history":[{"count":10,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/1\/revisions"}],"predecessor-version":[{"id":76,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/1\/revisions\/76"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/media\/66"}],"wp:attachment":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/media?parent=1"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/categories?post=1"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/tags?post=1"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}