{"id":876,"date":"2018-05-31T07:08:00","date_gmt":"2018-05-31T07:08:00","guid":{"rendered":"https:\/\/thehive.ai\/blog\/?p=876"},"modified":"2024-07-05T07:13:29","modified_gmt":"2024-07-05T07:13:29","slug":"multi-label-classification","status":"publish","type":"post","link":"https:\/\/thehive.ai\/blog\/multi-label-classification","title":{"rendered":"Multi-label Classification"},"content":{"rendered":"\n<h2>Classification challenges like Imagenet changed the way we train models. Given enough data, neural networks can distinguish between thousands of classes with remarkable accuracy.<\/h2>\n\n\n\n<p>However, there are some circumstances where basic classification breaks down, and something called multi-label classification is necessary. Here are two examples:<\/p>\n\n\n\n<ul><li>You need to classify a large number of brand logos and what medium they appear on (sign, billboard, soda bottle, etc.)<br><\/li><li>You have plenty of image data on a lot of different animals, but none on the platypus &#8211; which you want to identify in images<\/li><\/ul>\n\n\n\n<p>In the first example, should you train a classifier with one class for each logo and medium combination? The number of such combinations could be enormous, and it might be impossible to get data on some of them. Another option would be to train a classifier for logos and a classifier for medium; however, this doubles the runtime to get your results. In the second example, it seems impossible to train a platypus model without data on it.<\/p>\n\n\n\n<p>Multi-label models step in by doing multiple classifications at once. In the first example, we can train a single model that outputs both a logo classification and a medium classification without increasing runtime. In the second example, we can use common sense to label animal features (fur vs. feathers vs. scales, bill vs. no bill, tail vs. 
no tail) for each of the animals we know about, train a single model that identifies all features for an animal at once, then infer that any animal with fur, a bill, and a tail is a platypus.<\/p>\n\n\n\n<p>A simple way to accomplish this in a neural network is to group a logit layer into multiple softmax predictions:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"655\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-14-1024x655.png\" alt=\"\" class=\"wp-image-1017\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-14-1024x655.png 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-14-300x192.png 300w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-14-768x491.png 768w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-14-1536x982.png 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/1-14.png 1880w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>You can then train such a network by simply adding the cross-entropy loss for each softmax wherever a ground truth label is present.<\/p>\n\n\n\n<p>To compare these approaches, let\u2019s consider a subset of Imagenet classes and two features that distinguish them:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"595\" height=\"311\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/2-10.png\" alt=\"\" class=\"wp-image-1018\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/2-10.png 595w, https:\/\/staticblog.thehive.ai\/uploads\/2024\/07\/2-10-300x157.png 300w\" sizes=\"(max-width: 595px) 100vw, 595px\" \/><\/figure>\n\n\n\n<p>First, I trained two 50-layer ResNet V2 models on this balanced dataset: one on the single-label classification problem, and the other on the multi-label classification problem. 
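As a sketch of the training objective described above, here is a minimal NumPy version of the grouped-softmax loss, where a missing ground truth label (encoded here as -1) simply contributes nothing to the loss for that head. The head sizes and label encoding are illustrative assumptions, not details of an actual production model:

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_label_loss(logits, labels, head_sizes):
    """Sum of cross-entropy losses over each softmax head.

    logits:     (batch, sum(head_sizes)) raw network outputs
    labels:     (batch, num_heads) integer class per head, -1 = missing
    head_sizes: number of classes in each head, e.g. [3, 2] for
                scales/exoskeleton/fur and spots/no spots
    """
    total, offset = 0.0, 0
    for h, size in enumerate(head_sizes):
        probs = softmax(logits[:, offset:offset + size])
        for i, y in enumerate(labels[:, h]):
            if y >= 0:  # add loss only where a ground truth label is present
                total -= np.log(probs[i, y])
        offset += size
    return total
```

With `head_sizes = [3, 2]`, an image labeled only for the first head would carry labels like `[2, -1]`, so only the first softmax contributes to its loss; this is how partially labeled data can still be used for training.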
In this example, every training image has both labels, but real applications may have only a subset of labels available for each image.<\/p>\n\n\n\n<p>The single-label model, trained specifically on the 6-animal classification, performed slightly better at distinguishing all 6 animals:<\/p>\n\n\n\n<ul><li>Single-label model: 90% accuracy<br><\/li><li>Multi-label model: 88% accuracy<\/li><\/ul>\n\n\n\n<p>However, the multi-label model provides finer-grained information. Though it reached only 88% accuracy at distinguishing all 6 animals, it achieved 92% accuracy at distinguishing scales\/exoskeleton\/fur and 95% accuracy at distinguishing spots\/no spots. If we care about only one of these factors, we\u2019re already better off with the multi-label model.<\/p>\n\n\n\n<p>But this toy example hardly touches on the regime where multi-label classification really thrives: large datasets with many possible combinations of independent labels. In this regime, we get the interesting benefit of transfer learning. Imagine if we had categorized hundreds of animals into a dozen binary criteria. Training a separate model for each binary criterion would yield acceptable results, but learning the other features can actually help in some cases by effectively pre-training the network on a larger dataset.<\/p>\n\n\n\n<p>At Hive, we recently deployed a multi-label classification model that replaced 8 separate classification models. For each image, we usually had truth data available for 2 to 5 of the labels. Of the 8 labels, 2 saw better accuracy (think 93% instead of 91%); these were the labels with the least data. This makes sense, since they would benefit most from domain-specific pretraining on the same images. But most importantly for this use case, we were able to run all the models together in 1\/8th the time.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Classification challenges changed the way we train models. 
In circumstances where basic classification breaks down, multi-label classification is necessary.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"kia_subtitle":""},"categories":[8],"tags":[],"_links":{"self":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/876"}],"collection":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/comments?post=876"}],"version-history":[{"count":4,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/876\/revisions"}],"predecessor-version":[{"id":1026,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/876\/revisions\/1026"}],"wp:attachment":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/media?parent=876"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/categories?post=876"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/tags?post=876"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}