{"id":727,"date":"2019-03-08T10:49:00","date_gmt":"2019-03-08T10:49:00","guid":{"rendered":"https:\/\/thehive.ai\/blog\/?p=727"},"modified":"2024-07-05T04:32:02","modified_gmt":"2024-07-05T04:32:02","slug":"setup-mesos-part2","status":"publish","type":"post","link":"https:\/\/thehive.ai\/blog\/setup-mesos-part2","title":{"rendered":"Spark on Mesos Part 2: The Great Disk Leak"},"content":{"rendered":"\n<p>After ramping up our usage of Spark, we found that our Mesos agents were running out of disk space. It was happening rapidly on some of our agents with small disks:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"519\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/1-1024x519.jpg\" alt=\"\" class=\"wp-image-879\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/1-1024x519.jpg 1024w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/1-300x152.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/1-768x389.jpg 768w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/1-1536x779.jpg 1536w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/1.jpg 1830w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>The issue turned out to be that Spark was leaving behind binaries and jars in both driver and executor directories:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"940\" height=\"391\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/2.jpg\" alt=\"\" class=\"wp-image-880\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/2.jpg 940w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/2-300x125.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/2-768x319.jpg 768w\" sizes=\"(max-width: 940px) 100vw, 940px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"940\" height=\"321\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/3.jpg\" alt=\"\" 
class=\"wp-image-881\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/3.jpg 940w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/3-300x102.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/3-768x262.jpg 768w\" sizes=\"(max-width: 940px) 100vw, 940px\" \/><\/figure>\n\n\n\n<p>Each uncompressed Spark binary directory takes up 248MB, so in total:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"940\" height=\"201\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/4.jpg\" alt=\"\" class=\"wp-image-882\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/4.jpg 940w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/4-300x64.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/4-768x164.jpg 768w\" sizes=\"(max-width: 940px) 100vw, 940px\" \/><\/figure>\n\n\n\n<p>For a small pipeline with one driver and one executor, this adds up to 957MB. At our level of usage, this was 100GB of dead weight added every day.<\/p>\n\n\n\n<p>I looked into ways to at least avoid storing the compressed Spark binaries, since Spark only really needs the uncompressed version. It turns out that Spark uses the <a href=\"https:\/\/mesos.apache.org\/documentation\/latest\/fetcher\/\" target=\"_blank\" rel=\"noreferrer noopener\">Mesos fetcher<\/a> to copy and extract files. With caching enabled on the Mesos fetcher, Mesos stores only one cached copy of the compressed Spark binaries and extracts it directly into each sandbox directory. 
According to the Spark documentation, this should be solved by setting the <strong>spark.mesos.fetcherCache.enable<\/strong> option to true:<\/p>\n\n\n\n<p><strong>&#8220;If set to <code>true<\/code>, all URIs (example: <code>spark.executor.uri<\/code>, <code>spark.mesos.uris<\/code>) will be cached by the Mesos Fetcher Cache.&#8221;<\/strong><\/p>\n\n\n\n<p>After adding this to our Spark application confs, we found that the cache option was turned on for the executor, but not for the driver:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"940\" height=\"201\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/5.jpg\" alt=\"\" class=\"wp-image-883\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/5.jpg 940w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/5-300x64.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/5-768x164.jpg 768w\" sizes=\"(max-width: 940px) 100vw, 940px\" \/><\/figure>\n\n\n\n<p>This brought our disk leak down to <strong>740MB<\/strong> per Spark application. Reading through the Spark code, I found that the driver&#8217;s fetch configuration is defined by the MesosClusterScheduler, whereas the executor&#8217;s is defined by the MesosCoarseGrainedSchedulerBackend. There were two oddities about the MesosClusterScheduler:<\/p>\n\n\n\n<ul><li><strong>It reads options from the dispatcher&#8217;s configuration instead of the submitted application&#8217;s configuration<\/strong><br><\/li><li><strong>It uses the spark.mesos.fetchCache.enable option instead of spark.mesos.fetcherCache.enable<\/strong><\/li><\/ul>\n\n\n\n<p>So bizarre! Finding no documentation for either of these issues online, I filed <a href=\"https:\/\/issues.apache.org\/jira\/browse\/SPARK-26192\" target=\"_blank\" rel=\"noreferrer noopener\">two bugs<\/a>. 
By now, my PRs to fix them have been merged in, and should show up in upcoming releases.<\/p>\n\n\n\n<p>In the meantime, I implemented a workaround by adding the spark.mesos.fetchCache.enable=true option to the dispatcher.<\/p>\n\n\n\n<p>Now the driver also used caching, reducing the disk leak to 523MB per Spark application:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"940\" height=\"201\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/6.jpg\" alt=\"\" class=\"wp-image-884\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/6.jpg 940w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/6-300x64.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/6-768x164.jpg 768w\" sizes=\"(max-width: 940px) 100vw, 940px\" \/><\/figure>\n\n\n\n<p>Finally, I registered a Spark listener that acts as a shutdown hook to manually clean up the driver&#8217;s uberjar and uncompressed Spark binaries:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import java.io.File\n\nimport org.apache.commons.io.FileUtils\nimport org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}\n\n\/\/ listener that cleans the driver Spark binaries and uberjar after the application finishes\n\/\/ (only registered when MESOS_SANDBOX is set, i.e. when running inside a Mesos sandbox)\nsys.env.get(\"MESOS_SANDBOX\").foreach((sandboxDirectory) =&gt; {\n sparkSession.sparkContext.addSparkListener(new SparkListener {\n   override def onApplicationEnd(sparkListenerApplicationEnd: SparkListenerApplicationEnd): Unit = {\n     \/\/ listFiles() returns null if the path is not a readable directory\n     val sandboxItems = Option(new File(sandboxDirectory).listFiles()).getOrElse(Array.empty[File])\n     \/\/ triple-quoted strings keep the regex backslashes intact; dots are escaped to match literally\n     val regexes = Array(\n       \"\"\"^spark-\\d+\\.\\d+\\.\\d+-bin\"\"\".r,\n       \"\"\"^hive-spark_.*\\.jar\"\"\".r\n     )\n     sandboxItems\n         .filter((item) =&gt; regexes.exists((regex) =&gt; regex.findFirstIn(item.getName).isDefined))\n         .foreach((item) =&gt; {\n           FileUtils.forceDelete(item)\n         })\n   }\n })\n})<\/code><\/pre>\n\n\n\n<p>This reduced the disk leak to just 248MB per application:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"940\" height=\"201\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/7.jpg\" alt=\"\" 
class=\"wp-image-885\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/7.jpg 940w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/7-300x64.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/7-768x164.jpg 768w\" sizes=\"(max-width: 940px) 100vw, 940px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"940\" height=\"221\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/8.jpg\" alt=\"\" class=\"wp-image-886\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/8.jpg 940w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/8-300x71.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/8-768x181.jpg 768w\" sizes=\"(max-width: 940px) 100vw, 940px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"940\" height=\"271\" src=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/9.jpg\" alt=\"\" class=\"wp-image-887\" srcset=\"https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/9.jpg 940w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/9-300x86.jpg 300w, https:\/\/staticblog.thehive.ai\/uploads\/2019\/03\/9-768x221.jpg 768w\" sizes=\"(max-width: 940px) 100vw, 940px\" \/><\/figure>\n\n\n\n<p>This still isn&#8217;t perfect, but I don&#8217;t think there will be a way to delete the uncompressed Spark binaries from your Mesos executor sandbox directories until Spark adds more complete Mesos functionality. For now, it&#8217;s a 74% reduction in the disk leak (from 957MB to 248MB per application).<\/p>\n\n\n\n<p>Last, and perhaps most importantly, we reduced the time to live for our completed Mesos frameworks and sandboxes from one month to one day. This effectively cut our equilibrium disk usage by 97%. Our Mesos agents&#8217; disk usage now stays at a healthy level.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hive tackled key challenges related to Spark, and why our product is better for it. 
Read more to learn how we go about coding problem solving.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"kia_subtitle":""},"categories":[8],"tags":[],"_links":{"self":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/727"}],"collection":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/comments?post=727"}],"version-history":[{"count":4,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/727\/revisions"}],"predecessor-version":[{"id":899,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/posts\/727\/revisions\/899"}],"wp:attachment":[{"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/media?parent=727"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/categories?post=727"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/thehive.ai\/blog\/wp-json\/wp\/v2\/tags?post=727"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}