{"id":33,"date":"2023-10-26T11:58:00","date_gmt":"2023-10-26T11:58:00","guid":{"rendered":"http:\/\/54.151.72.21\/?p=33"},"modified":"2025-03-14T11:33:41","modified_gmt":"2025-03-14T11:33:41","slug":"how-to-train-models-with-hive-automl","status":"publish","type":"post","link":"https:\/\/thehive.ai\/blog\/how-to-train-models-with-hive-automl","title":{"rendered":"How to Train Models with Hive AutoML"},"content":{"rendered":"\n<h3>What is Hive AutoML?<\/h3>\n\n\n\n<p>Hive\u2019s AutoML platform allows you to quickly train, evaluate, and deploy machine learning models for your own custom use cases. The process is simple \u2014 just select your desired model type, upload your datasets, and you\u2019re ready to begin training!&nbsp;<\/p>\n\n\n\n<p>Since we announced the&nbsp;<a href=\"https:\/\/thehive.ai\/blog\/build-your-own-custom-ml-models-with-hive-automl\" target=\"_blank\" rel=\"noreferrer noopener\">initial release of our AutoML platform<\/a>, we\u2019ve added support for Large Language Model training. Now you can build everything from classification models to chatbots, all in the same intuitive platform. To illustrate how easy the model-building process is, we\u2019ll walk through it step-by-step with each type of model. We\u2019ll also provide a link to the publicly available dataset we used as an example so that you can follow along.<\/p>\n\n\n\n<h3>Training an Image Classification Model<\/h3>\n\n\n\n<p>First we\u2019re going to create an Image Classification model. This type of model is used to identify certain subjects, settings, and other visual attributes in both images and videos. For this example, we\u2019ll be using a&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/Matthijs\/snacks\" target=\"_blank\" rel=\"noreferrer noopener\">snacks dataset<\/a>&nbsp;to identify 20 different kinds of food (strawberries, apples, hot dogs, cupcakes, etc.). 
To follow along with this walkthrough, first&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/Matthijs\/snacks\/blob\/main\/images.zip\" target=\"_blank\" rel=\"noreferrer noopener\">download the images from this dataset<\/a>, which are sorted into separate folders for each label.<\/p>\n\n\n\n<h2>Formatting the Datasets<\/h2>\n\n\n\n<p>After downloading the image data, we\u2019ll need to put it in the correct format for AutoML training. For Image Classification datasets, the platform requires a CSV file that contains one column for image URLs titled \u201cimage_url\u201d and up to 20 other columns for the classification categories you wish to use. This requires creating publicly accessible links for each image in the dataset. For this example, all 20 of our food categories will be part of the same head \u2014 food type. To do this, we formatted our CSV as follows:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"740\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-1-1024x740.jpg\" alt=\"The snacks dataset in the correct format for our AutoML platform\" class=\"wp-image-87\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-1-1024x740.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-1-300x217.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-1-768x555.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-1-1536x1110.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_1-1.jpg 1940w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The snacks dataset in the correct format for our AutoML platform<\/figcaption><\/figure>\n\n\n\n<p>This particular dataset is within the size limitations for Image Classification datasets. When uploading your own dataset, make sure it meets all of the sizing requirements and other specifications, or the dataset upload will fail. 
These requirements can be found in our&nbsp;<a href=\"https:\/\/docs.thehive.ai\/docs\/automl-for-image-classification#dataset-upload\" target=\"_blank\" rel=\"noreferrer noopener\">AutoML documentation<\/a>.<\/p>\n\n\n\n<p>Both test and validation datasets are provided as part of the snacks dataset. When using your own datasets, you can choose to upload a test dataset or to split off a random section of your training data to use instead. If you choose the latter, you will be able to select what percentage of that data you want to use as test data as you create your training.<\/p>\n\n\n\n<h2>Uploading the Datasets<\/h2>\n\n\n\n<p>Before we start building the model, we first need to upload both our training and test datasets to the \u201cDatasets\u201d section of our AutoML platform. This part of our platform validates each dataset before it can be used for training and stores all datasets so they can be easily accessed for future models. We\u2019ll upload the training and test datasets separately, naming them&nbsp;<em>Snacks (Train)<\/em>&nbsp;and&nbsp;<em>Snacks (Test)<\/em>&nbsp;respectively.<\/p>\n\n\n\n<h2>Creating a Training<\/h2>\n\n\n\n<p>To start building our model, we\u2019ll head to our&nbsp;<a href=\"https:\/\/automl.thehive.ai\/trainings\/\" target=\"_blank\" rel=\"noreferrer noopener\">AutoML platform<\/a>&nbsp;and select the \u201cCreate New Model\u201d button. We\u2019ll then be brought to a project setup page where we will be prompted to enter a project name and description. For Model Type, we\u2019ll select \u201cImage Classification.\u201d On the right side of the screen, we can add our datasets by selecting from our dataset library. 
We\u2019ll select the datasets called&nbsp;<em>Snacks (Train)<\/em>&nbsp;and&nbsp;<em>Snacks (Test)<\/em>&nbsp;that we just uploaded.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"497\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-1-1024x497.jpg\" alt=\"The \u201cCreate New Model\u201d page\" class=\"wp-image-89\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-1-1024x497.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-1-300x146.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-1-768x373.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-1-1536x745.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_2-1.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The \u201cCreate New Model\u201d page<\/figcaption><\/figure>\n\n\n\n<p>And just like that, we\u2019re ready to start training our model! To begin the training process, we\u2019ll click the \u201cStart Training Model\u201d button. The model\u2019s status will then shift to \u201cQueued\u201d and then \u201cIn Progress\u201d while the model trains. This will likely take several minutes. When training is complete, the status will display as \u201cCompleted.\u201d<\/p>\n\n\n\n<h2>Evaluating the Model<\/h2>\n\n\n\n<p>After model training is complete, the page for that project will show various performance metrics so that we can evaluate our model. At the top of the page we can select the head and, if desired, the class that we\u2019d like to evaluate. We can also use the slider to control the confidence threshold. 
Once selected, you will see the precision, recall, and balanced accuracy.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1999\" height=\"1020\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-1024x523.jpg\" alt=\"The model\u2019s project page after training has completed\" class=\"wp-image-90\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-1024x523.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-300x153.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-768x392.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3-1536x784.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/3.jpg 1999w\" sizes=\"(max-width: 1999px) 100vw, 1999px\" \/><figcaption>The model\u2019s project page after training has completed<\/figcaption><\/figure>\n\n\n\n<p>Below that, you can view the precision\/recall curve (P\/R curve) as well as a confusion matrix that shows how many predictions were correct and incorrect per class. This gives us a more detailed understanding of what the model misclassified. 
For example, we can see here that two images of cupcakes were incorrectly classified as cookies \u2014 an understandable mistake as the two are both decorated desserts.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1999\" height=\"1062\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-1024x544.jpg\" alt=\"The confusion matrix for our snacks model\" class=\"wp-image-92\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-1024x544.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-300x159.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-768x408.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4-1536x816.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/4.jpg 1999w\" sizes=\"(max-width: 1999px) 100vw, 1999px\" \/><figcaption>The confusion matrix for our snacks model<\/figcaption><\/figure>\n\n\n\n<p>These detailed metrics can help us decide which categories to target if we want to train a better version of the model. If you would like to retrain your model, you can click the \u201cUpdate Model\u201d button to begin the training process again.<\/p>\n\n\n\n<h2>Deploying the Model<\/h2>\n\n\n\n<p>Even after the first time training this model, we\u2019re pretty happy with how it turned out. We\u2019re ready to deploy the model and start using it. To deploy, select the project and click the \u201cCreate Deployment\u201d button in the top right corner. The project\u2019s status will shift to \u201cDeploying.\u201d The deployment may take a few minutes.<\/p>\n\n\n\n<h2>Submitting Tasks via API<\/h2>\n\n\n\n<p>After the deployment is complete, we\u2019re ready to start submitting tasks via API as we would with any pre-trained Hive model. We can click on the name of any individual deployment to open the project on Hive Data, where we can upload tasks, view tasks, and access our API key. 
There is also a button to \u201cUndeploy\u201d the project, if we wish to deactivate it at any point. Undeploying a model is not permanent \u2014 we can redeploy the project if we later choose to.<\/p>\n\n\n\n<p>To see a video of the entire training and deployment process for an Image Classification model, head over to&nbsp;<a href=\"https:\/\/youtu.be\/hc23oGVL1Rw\" target=\"_blank\" rel=\"noreferrer noopener\">our YouTube channel<\/a>.<\/p>\n\n\n\n<h3>Training a Text Classification Model<\/h3>\n\n\n\n<p>We\u2019ll now walk through that same training process to build a Text Classification model, with a few small differences. Text classification models can be used to sort and tag text content by topic, tone, and more. For this example, we\u2019ll use the&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/carblacac\/twitter-sentiment-analysis\" target=\"_blank\" rel=\"noreferrer noopener\">Twitter Sentiment Analysis dataset<\/a>&nbsp;posted by user carblacac on Hugging Face. This dataset consists of a series of short text posts originally published to Twitter, each labeled as having a negative (0) or positive (1) overall sentiment. To follow along with this walkthrough, you can download the dataset&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/carblacac\/twitter-sentiment-analysis\/tree\/main\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<h2>Formatting the Datasets<\/h2>\n\n\n\n<p>For Text Classification datasets, our AutoML platform requires a CSV with the text data in a column titled \u201ctext_data\u201d and up to 20 other columns that each represent classification categories, also called model heads. 
Using the Twitter Sentiment Analysis dataset, we only need to rename the columns like so:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"733\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-1-1024x733.jpg\" alt=\"Our Twitter Sentiment Analysis data formatted correctly for our AutoML platform\n\" class=\"wp-image-93\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-1-1024x733.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-1-300x215.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-1-768x550.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-1-1536x1100.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_5-1.jpg 1980w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>Our Twitter Sentiment Analysis data formatted correctly for our AutoML platform<\/figcaption><\/figure>\n\n\n\n<p>The data consists of two sets: a training set with 150k examples and a test set with 62k examples. Before we upload our datasets, however, we must ensure that they fit our&nbsp;<a href=\"https:\/\/docs.thehive.ai\/docs\/automl-text-classification#dataset-upload\" target=\"_blank\" rel=\"noreferrer noopener\">Text Classification dataset requirements<\/a>. The training set does not fit those requirements \u2014 our AutoML platform only accepts CSV files that have 100,000 rows or fewer, and this one has 150,000. To use this dataset, we\u2019ll have to remove some examples from the set. To keep the number of examples for each class relatively equal, we removed 25,000 negative (0) examples and 25,000 positive (1) ones.<\/p>\n\n\n\n<h2>Uploading the Datasets<\/h2>\n\n\n\n<p>After fixing the size issue, we\u2019re ready to upload our datasets. 
As is the case with all model types, we must upload any datasets we are going to use before we create our training.<\/p>\n\n\n\n<h2>Creating a Training<\/h2>\n\n\n\n<p>After both the training and test datasets have been validated, we\u2019re ready to start building our model. On our&nbsp;<a href=\"https:\/\/automl.thehive.ai\/trainings\/\" target=\"_blank\" rel=\"noreferrer noopener\">AutoML platform<\/a>, we\u2019ll click the \u201cCreate New Model\u201d button and enter a project name and description. For our model type, this time we\u2019ll select \u201cText Classification.\u201d Finally, we\u2019ll add the training and test datasets that we just uploaded.<\/p>\n\n\n\n<p>We\u2019re then ready to start training! This aspect of the training process is identical to the one shown above for an Image Classification model. Just click the \u201cStart Training Model\u201d button in the bottom right corner of the screen. When training is complete, the status will display as \u201cCompleted.\u201d<\/p>\n\n\n\n<h2>Evaluating the Model<\/h2>\n\n\n\n<p>Just like in our Image Classification example, the project page will show various performance metrics after training is complete so that we can evaluate our model. At the top of the page we can select the head and, if desired, the class that we\u2019d like to evaluate. We can also use the slider to control the confidence threshold. 
Once selected, you will see the precision, recall, and balanced accuracy.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"529\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-1-1024x529.jpg\" alt=\"The project page for our Twitter Sentiment Analysis model after it has completed training\" class=\"wp-image-96\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-1-1024x529.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-1-300x155.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-1-768x396.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-1-1536x793.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_6-1.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The project page for our Twitter Sentiment Analysis model after it has completed training<\/figcaption><\/figure>\n\n\n\n<p>Below the precision, recall, and balanced accuracy, you can view the precision\/recall curve (P\/R curve) as well as a confusion matrix that shows how many predictions were correct and incorrect per class. This gives us a more detailed understanding of what the model misclassified. For example, we can see here that while there were a fair amount of mistakes for each class, there were more cases in which a positive example was mistaken for a negative than the other way around.&nbsp;<\/p>\n\n\n\n<p>While the results of this training are not as good as our Image Classification example, this is somewhat expected \u2014 sentiment analysis is a more complex and difficult classification task. While this model could definitely be improved by retraining with slightly different data, we\u2019ll demonstrate how to deploy it. 
To retrain your model, however, all you need to do is click the \u201cUpdate Model\u201d button and begin the training process again.<\/p>\n\n\n\n<h2>Deploying the Model<\/h2>\n\n\n\n<p>Deploying your model follows exactly the same process described above in the Image Classification example. After the deployment is complete, you\u2019ll be able to view the deployment on Hive Data and access the API keys needed to begin using the model.&nbsp;<\/p>\n\n\n\n<p>To see a video of the entire training and deployment process for a Text Classification model, head over to&nbsp;<a href=\"https:\/\/youtu.be\/GvKnanzMAkg\" target=\"_blank\" rel=\"noreferrer noopener\">our YouTube channel<\/a>.<\/p>\n\n\n\n<h3>Training a Large Language Model<\/h3>\n\n\n\n<p>Finally, we\u2019ll walk through the training process for a Large Language Model (LLM). This process is slightly different from the training process for our classification model types, both in terms of dataset formatting and model evaluation.<br>Our AutoML platform supports two different types of LLMs: Text and Chat. Text models are geared towards generating passages of writing or lines of code, whereas chat models are built for interactions with the user, often in the format of asking questions and receiving concise, factual answers. For this example, we\u2019ll be using the&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/GEM\/viggo\" target=\"_blank\" rel=\"noreferrer noopener\">Viggo dataset<\/a>&nbsp;uploaded by GEM to Hugging Face. To follow along with us as we build the model, you can download the training and test sets&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/GEM\/viggo\/tree\/main\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<h2>Formatting the Datasets<\/h2>\n\n\n\n<p>This dataset supports the task of summarizing and restructuring text into a very specific syntax. 
All data is within the video game domain, and all prompts take the form of either questions or statements about various games. The goal of the model is to take these prompts, extract the main idea behind them, and reformat them. For example, the prompt \u201cGuitar Hero: Smash Hits launched in 2009 but plays like a game from 1989, it\u2019s just not good\u201d becomes \u201cgive_opinion(name[Guitar Hero: Smash Hits], release_year[2009], rating[poor]).\u201d<\/p>\n\n\n\n<p>First, we\u2019ll check to make sure this dataset is valid per our&nbsp;<a href=\"https:\/\/docs.thehive.ai\/docs\/large-language-models-text#dataset-upload\" target=\"_blank\" rel=\"noreferrer noopener\">guidelines for AutoML datasets<\/a>. At only around 5,000 rows, it is well under the limit of 50,000 rows. To get the formatting right, we just need to make sure that the prompt is in a column titled \u201cprompt\u201d and the expected completion is in another column titled \u201ccompletion.\u201d All other columns can be removed. 
From this dataset, we will use the column \u201ctarget\u201d as \u201cprompt\u201d and the column \u201cmeaning_representation\u201d as \u201ccompletion.\u201d The final CSV is as shown below:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"775\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-1-1024x775.jpg\" alt=\"The Viggo dataset ready to upload to our AutoML platform\" class=\"wp-image-97\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-1-1024x775.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-1-300x227.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-1-768x581.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-1-1536x1163.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/blog1_7-1.jpg 1974w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The Viggo dataset ready to upload to our AutoML platform<\/figcaption><\/figure>\n\n\n\n<h2>Uploading the Datasets<\/h2>\n\n\n\n<p>Now let\u2019s upload our datasets. We\u2019ll be using both the training and test datasets from the Viggo dataset as provided&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/GEM\/viggo\/tree\/main\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. After both datasets have been validated, we\u2019re ready to train the model.<\/p>\n\n\n\n<h2>Creating a Training<\/h2>\n\n\n\n<p>We\u2019ll head back to our Models page and select \u201cCreate New Model\u201d. This time, the project type should be \u201cLanguage Generative \u2013 Text\u201d. We will then choose our training and test datasets from a list of ones that we\u2019ve already uploaded to the platform. 
Then we\u2019ll start the training!<\/p>\n\n\n\n<h2>Evaluating the Model<\/h2>\n\n\n\n<p>For Large Language Models, the metrics page looks a little different than it does for our classification models.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"527\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/8-1024x527.jpg\" alt=\"\" class=\"wp-image-98\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/8-1024x527.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/8-300x154.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/8-768x395.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/8-1536x790.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/8.jpg 1999w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>The project page for the Viggo model after it has completed training<\/figcaption><\/figure>\n\n\n\n<p>The loss measures how closely the model\u2019s response matches the response from the test data: 0 represents a perfect prediction, and a higher loss means the prediction is further from the actual response sequence. For example, if the response has 10 tokens, we let the model predict each of the 10 tokens in turn, given that all previous tokens are the same as in the actual response, and display the final numerical loss value.<\/p>\n\n\n\n<p>You can also evaluate your model by interacting with it in what we call the playground. Here you can submit prompts directly to your model and view its response, allowing model evaluation through experimentation. The playground is available for 15 days after model training is complete and has a limit of 500 requests. If either the time or request limit is reached, you can instead deploy the model and continue to use the playground feature without a request limit; this usage will be charged to the organization\u2019s billing account.<\/p>\n\n\n\n<p>For our Viggo model, all metrics are looking pretty good. 
We entered a few prompts into the playground to further test it, and the results showed no issues.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1999\" height=\"1111\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/9-1024x569.jpg\" alt=\"\" class=\"wp-image-99\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/9-1024x569.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/9-300x167.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/9-768x427.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/9-1536x854.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/9.jpg 1999w\" sizes=\"(max-width: 1999px) 100vw, 1999px\" \/><figcaption>An example query and response from the playground feature<\/figcaption><\/figure>\n\n\n\n<h2>Deploying the Model<\/h2>\n\n\n\n<p>The process to deploy a Large Language Model is the same as it is for our classification models. Just click \u201cCreate Deployment\u201d and you\u2019ll be ready to submit API requests in a few short minutes.<\/p>\n\n\n\n<p>To see a video of the entire training and deployment process for an LLM, head over to&nbsp;<a href=\"https:\/\/youtu.be\/Dg-x650sTF8\" target=\"_blank\" rel=\"noreferrer noopener\">our YouTube channel<\/a>.<\/p>\n\n\n\n<h3>Final Thoughts<\/h3>\n\n\n\n<p>We hope this in-depth walkthrough of how to build different types of machine learning models with our AutoML platform was helpful. 
Keep an eye out for more AutoML tutorials in the coming weeks, including detailed guides to Retrieval Augmented Generation (RAG), data stream management systems (DSMS), and other exciting features we support.<\/p>\n\n\n\n<p>If you have any further questions or run into any issues as you build your custom-made AI models, please don\u2019t hesitate to reach out to us at&nbsp;<a href=\"mailto:support@thehive.ai\" target=\"_blank\" rel=\"noreferrer noopener\">support@thehive.ai<\/a>&nbsp;and we will be happy to help. To inquire about testing out our AutoML platform, please contact&nbsp;<a href=\"mailto:sales@thehive.ai\" target=\"_blank\" rel=\"noreferrer noopener\">sales@thehive.ai<\/a>.<\/p>\n\n\n\n<h3>Dataset Sources<\/h3>\n\n\n\n<p>All datasets that are linked to as examples in this post are publicly available for a wide range of uses, including commercial use. The snacks dataset and Viggo dataset are both licensed under a&nbsp;<a href=\"https:\/\/creativecommons.org\/licenses\/by-sa\/4.0\/deed.en\" target=\"_blank\" rel=\"noreferrer noopener\">Creative Commons Attribution Share-Alike 4.0 (CC BY-SA 4.0)<\/a>&nbsp;license. They can be found on Hugging Face&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/Matthijs\/snacks\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>&nbsp;and&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/GEM\/viggo\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. The Twitter Sentiment Analysis dataset is licensed under the&nbsp;<a href=\"https:\/\/www.apache.org\/licenses\/LICENSE-2.0\" target=\"_blank\" rel=\"noreferrer noopener\">Apache License, Version 2.0<\/a>. It is available on Hugging Face&nbsp;<a href=\"https:\/\/huggingface.co\/datasets\/carblacac\/twitter-sentiment-analysis\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>. 
None of these datasets may be used except in compliance with their respective license agreements.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Follow along as we provide a step-by-step guide to building a machine learning model with Hive AutoML.<\/p>\n","protected":false},"author":1,"featured_media":80,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"kia_subtitle":""},"categories":[8,6,4],"tags":[],"_links":{"self":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/33"}],"collection":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/comments?post=33"}],"version-history":[{"count":9,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/33\/revisions"}],"predecessor-version":[{"id":1969,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/33\/revisions\/1969"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/media\/80"}],"wp:attachment":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/media?parent=33"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/categories?post=33"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/tags?post=33"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}