
[SPARK-12133][STREAMING] Streaming dynamic allocation #12154

Closed
wants to merge 7 commits into master from the streaming-dynamic-allocation branch

Conversation

@tdas (Contributor) commented Apr 4, 2016

What changes were proposed in this pull request?

Added a new Executor Allocation Manager for the Streaming scheduler for doing Streaming Dynamic Allocation.

How was this patch tested?

Unit tests and cluster tests.
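For readers landing on this PR, here is a minimal configuration sketch of how the feature would be switched on. The minExecutors/maxExecutors keys are the ones quoted later in this review; the enabled flags and the concrete values are illustrative assumptions, not part of this patch.

```scala
import org.apache.spark.SparkConf

// Hedged sketch: the min/max keys below are quoted in this review; the
// enabled flags and the numeric values are assumptions for illustration.
val conf = new SparkConf()
  .setAppName("StreamingDynamicAllocationExample")
  // Core dynamic allocation is assumed to stay off, since the streaming
  // manager requests and kills executors itself.
  .set("spark.dynamicAllocation.enabled", "false")
  .set("spark.streaming.dynamicAllocation.enabled", "true")
  .set("spark.streaming.dynamicAllocation.minExecutors", "2")
  .set("spark.streaming.dynamicAllocation.maxExecutors", "10")
// Pass `conf` to StreamingContext as usual.
```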

@tdas changed the title from Streaming dynamic allocation to [SPARK-XXX][STREAMING] Streaming dynamic allocation Apr 4, 2016
@tdas (Contributor, Author) commented Apr 4, 2016

@andrewor14 (Contributor) commented

12133

@tdas changed the title from [SPARK-XXX][STREAMING] Streaming dynamic allocation to [SPARK-12133][STREAMING] Streaming dynamic allocation Apr 4, 2016
@@ -43,7 +43,7 @@ import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.StreamingContextState._
import org.apache.spark.streaming.dstream._
import org.apache.spark.streaming.receiver.Receiver
-import org.apache.spark.streaming.scheduler.{JobScheduler, StreamingListener}
+import org.apache.spark.streaming.scheduler.{ExecutorAllocationManager, JobScheduler, StreamingListener}
Contributor

Is the ExecutorAllocationManager import necessary? It doesn't seem to be referenced here.

Contributor Author

I just pushed some more changes. It's now needed.

@tdas force-pushed the streaming-dynamic-allocation branch from e4df62f to 0c6d94b on April 4, 2016 22:43
@SparkQA commented Apr 4, 2016

Test build #54893 has finished for PR 12154 at commit 81ad1dd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
@SparkQA commented Apr 5, 2016

Test build #54905 has finished for PR 12154 at commit 0c6d94b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
@SparkQA commented Apr 5, 2016

Test build #2750 has finished for PR 12154 at commit 0c6d94b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
@@ -1360,6 +1360,16 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
listenerBus.addListener(listener)
}

private[spark] override def getExecutorIds(): Seq[String] = {

A newbie question: if a method has no side effects and just returns a value, does the Spark code style suggest removing the parentheses from the method declaration?

Contributor Author

Changed.
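As background for the question above, here is a tiny hypothetical example (not code from this patch) of the Scala convention: a parameterless method with no side effects drops the parentheses, while one that performs work keeps them.

```scala
// Hypothetical illustration of the parentheses convention, not from this PR.
class ExecutorTracker {
  private var ids: Seq[String] = Seq.empty

  // Pure accessor, no side effects: declared (and called) without parentheses.
  def executorIds: Seq[String] = ids

  // Mutates state, so it keeps the empty parameter list: tracker.refresh()
  def refresh(): Unit = {
    ids = ids.distinct // stand-in for fetching the current executor list
  }
}
```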

@andrewor14 (Contributor)

@tdas Looks great. I think you could add more comments in the code but the rest is pretty good.


val MIN_EXECUTORS_KEY = "spark.streaming.dynamicAllocation.minExecutors"

val MAX_EXECUTORS_KEY = "spark.streaming.dynamicAllocation.maxExecutors"
Contributor

Can we derive these two configurations, minExecutors and maxExecutors, from Spark's ExecutorAllocationManager?

Basically, is there any semantic difference between the min and max executors here and those in Spark's dynamic allocation?

Contributor Author

It is very confusing if configs inside spark.streaming.dynamicAllocation.* depend on configs in spark.dynamicAllocation.*. That is very non-intuitive and defeats the whole purpose of scoping config names with dots.
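To make the scoping argument concrete, a sketch of how each feature would read only keys under its own namespace; the key names are the ones quoted in the diff above, while the default values are assumptions for illustration.

```scala
import org.apache.spark.SparkConf

// Illustrative only: key names come from the snippet above, default values
// are made up for this sketch.
val conf = new SparkConf()
val streamingMin = conf.getInt("spark.streaming.dynamicAllocation.minExecutors", 1)
val streamingMax = conf.getInt("spark.streaming.dynamicAllocation.maxExecutors", Int.MaxValue)
// Core dynamic allocation keeps its own, independently scoped keys:
//   spark.dynamicAllocation.minExecutors / spark.dynamicAllocation.maxExecutors
```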


@tdas Is there any particular reason why initExecutors is not supported in streaming.dynamicAllocation?


@tdas @andrewor14 I also have to ask: is there any reason initExecutors is not supported for streaming with dynamic allocation? I'm having issues with my application because it needs a minimum executor count to start behaving well with the Kinesis stream.

@tdas (Contributor, Author) commented Apr 6, 2016

@andrewor14 Updated. Please take a look.

@SparkQA commented Apr 6, 2016

Test build #2759 has started for PR 12154 at commit 3b501a0.

@tdas force-pushed the streaming-dynamic-allocation branch from 3b501a0 to 0598c85 on April 6, 2016 18:42
@andrewor14 (Contributor)
LGTM

logInfo(s"Requested total $targetTotalExecutors executors")
}

/** Kill a executors that is not running a receiver */
Contributor

grammar
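For context, a hedged sketch of the scale-down step the doc comment above describes: pick an executor that is not hosting a receiver and release it. The helper below is hypothetical, and ExecutorAllocationClient with its getExecutorIds()/killExecutor() methods is a private[spark] internal, so this only mirrors the shape of the logic rather than the actual patch.

```scala
// Hypothetical sketch, not the actual patch: release one executor that is
// not currently running a receiver.
def killNonReceiverExecutor(
    client: org.apache.spark.ExecutorAllocationClient,
    receiverExecutors: Set[String]): Unit = {
  // All known executors minus those hosting receivers are safe candidates.
  val candidates = client.getExecutorIds().filterNot(receiverExecutors.contains)
  candidates.headOption.foreach(client.killExecutor)
}
```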

@tdas force-pushed the streaming-dynamic-allocation branch from 0598c85 to ce36c76 on April 6, 2016 19:31
@SparkQA commented Apr 6, 2016

Test build #55136 has finished for PR 12154 at commit 0598c85.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
@SparkQA commented Apr 6, 2016

Test build #55140 has finished for PR 12154 at commit ce36c76.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
@andrewor14 (Contributor)
Merged into master, thanks.

@asfgit closed this in 9af5423 Apr 6, 2016
@jayv (Contributor) commented Apr 20, 2016

@tdas or @andrewor14, does this depend on any 2.0 APIs? I would like to backport this to 1.5 or 1.6 if possible.

I need to run multiple concurrent streaming jobs on Mesos.

@andrewor14 (Contributor)
@jayv big new features like this are never backported into older branches.

@jayv (Contributor) commented Apr 20, 2016

I understand that, but I want to port this feature to our internal custom 1.6 build, if it's not too much trouble.

@andrewor14 (Contributor)
I see. I don't believe this depends on new APIs. You may have some difficulty backporting big patches into 1.6 in general, however.

jayv pushed a commit to jayv/spark that referenced this pull request Apr 21, 2016
Add missing API to support backport of SPARK-12133

Author:    Tathagata Das <tathagata.das1565@gmail.com>
Author:    Jo Voordeckers <jo.voordeckers@gmail.com>
@sansagara
Is there a way to specify the initial executors?

@sugix commented Apr 21, 2018

@tdas - Why can't we see this in the documentation? I am also not sure whether AWS EMR supports this feature.

Labels: None yet
10 participants