Spark on Mesos Part 1: Setting Up

At Hive, we’ve created a data platform including Apache Spark applications that use Mesos as a resource manager. While both Spark and Mesos are popular frameworks used by many top tech companies, using the two together is a relatively new idea with incomplete documentation.

Why choose Mesos as Spark’s resource manager?

Spark needs a resource manager to tell it what machines are available and how much CPU and memory each one has. It uses this information and then requests that the resource manager add tasks for the executors it needs.

There are currently four resource manager options: standalone, YARN, Kubernetes, and Mesos. The next table should make clear why we chose to use Mesos.

We wanted our Spark applications to use the same in-house pool of resources that other, non-Hadoop workloads do, so only Kubernetes and Mesos were options to us. There are great posts out there contrasting the two of these, but for us the deciding factor was that we already use Mesos.

Spark applications can share resources with your other frameworks in Mesos.

Learnings from Spark on Mesos

Spark’s guide on running Spark on Mesos is the best place to start setting this up. However, we ran into a few notable quirks it does not mention.

A word on Spark versions

While Spark has technically provided support for Mesos since version 1.0, it wasn’t very functional until recently. We strongly recommend using Spark 2.4.0 or later. Even in Spark 2.3.2, there were some pretty major bugs:

  • The obsolete MESOS_DIRECTORY environment variable was used instead of MESOS_SANDBOX, which caused an error during sparkSession.stop in certain applications.
  • Spark-submit to mesos would not properly escape your application’s configurations. To run the equivalent of spark-submit --master local[4] --conf1 "a b c" --class package.Main my.jar on mesos, you would need to run spark-submit --master --conf1 "a\\ b\\ c" mesos://url --deploy-mode cluster --class package.Main my.jar. Spark 2.4.0 still has this issue for arguments.
  • Basically everything under the version 2.4.0 Jira page with Mesos in the name.

Still, Spark’s claim that it “does not require any special patches of Mesos” is usually wrong on one count: accessing jars and spark binaries in S3. To access these, your mesos agents will need hadoop libraries. Otherwise, you will only be able to access files stored locally on the agents or accessible by http. In order to use S3 links or HDFS links, one must configure every Mesos agent with a local path to the Hadoop client. This allows the Mesos Fetcher to successfully grab the Spark bin and begin executing the job.

Spark Mesos dispatcher

The Spark dispatcher is a very simple Mesos Framework for running Spark jobs inside a Mesos cluster. The dispatcher actually does not manage the resource allocation nor the application lifecycle of the jobs. Instead, for each new job it receives, it launches a Spark Driver within the cluster. The Driver itself is also a Mesos Framework with its own UI and is given the responsibility of provisioning resources and executing its specific job. The dispatcher is solely responsible for launching and keeping track of Spark Drivers.

How a Spark Driver runs jobs in any clustered configuration.

While setup for the dispatcher is as simple as running the provided startup script, one operational challenge to consider is the placement of your dispatcher. The two pragmatic locations for us were running the dispatcher on a separate instance outside the cluster, or as an application inside the Marathon Mesos framework. Both had their trade offs, but we decided to run the dispatcher on a small dedicated instance as it was an easy way to have a persistent endpoint for the service.

One small concern worth mentioning is the lack of HA for the dispatcher. While Spark Drivers continue to run when the dispatcher is down and state recovery is available with Apache Zookeeper, multiple dispatchers cannot be coordinated together. If HA is an important feature, it may be worthwhile to run the service on Marathon and setting up some form of service discovery so you can have a persistent endpoint for the dispatcher.

Dependency management

There are at least three ways to use manage dependencies for your Spark repo:

  1. Copying dependency jars to the Spark driver yourself and specifying spark.driver.extraClassPath and spark.driver.extraClassPath.
  2. Specifying spark.jars.packages and optionally spark.jars.repositories.
  3. Creating an uberjar that includes both your code and all necessary dependencies’ jars.

Option 1 gives you total control over which jars you use and where they come from, in case there are some items in the dependency tree you know you don’t need. This can save some application startup time, but is very tedious. Option 2 streamlines option 1 by listing the required jars only once and pulling from the list of repositories automatically, but loses the very fine control by pulling the full dependency tree of each dependency. Option 3 gives back that very fine control, and is the most simple, but duplicates the dependencies in every uberjar you make.

Overall, we found option 3 most appealing. Compared to option 2, it saved 5 seconds of startup time on every Spark application and removed the worry that the maven repository would become unavailable. Better automating option 1 might be the most ideal solution of all, but for now, it isn’t worth our effort.

What next?

Together with Spark’s guide to running on Mesos, this should address many of hiccups you’ll encounter. But join us next time as we tackle one more: the great disk leak.