Forcing the driver to run on a specific slave in a Spark standalone cluster running with "--deploy-mode cluster"


I am running a small Spark cluster of 2 EC2 instances (m4.xlarge).

So far I have been running the Spark master on one node and a single Spark slave (4 cores, 16g memory) on the other, and deploying my Spark (Streaming) app in client deploy-mode on the master. A summary of the settings is:

--executor-memory 16g

--executor-cores 4

--driver-memory 8g

--driver-cores 2

--deploy-mode client

This results in a single executor on the single slave, using 4 cores and 16GB of memory. The driver runs "outside" of the cluster on the master node (i.e. it is not allocated resources by the master).
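For reference, the full submit command looks roughly like this (the master URL, main class and jar path are placeholders):

./bin/spark-submit \
  --master spark://<master-host>:7077 \
  --deploy-mode client \
  --executor-memory 16g \
  --executor-cores 4 \
  --driver-memory 8g \
  --driver-cores 2 \
  --class <main-class> \
  <path-to-app-jar>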

Ideally I'd like to use cluster deploy-mode so that I can take advantage of the --supervise option. To that end I have started a second slave on the master node, giving it 2 cores and 8g of memory (smaller allocated resources, to leave space for the master daemon).
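One way to start a worker with explicitly reduced resources is via the standard sbin scripts, something along these lines (the master URL is a placeholder):

./sbin/start-slave.sh spark://<master-host>:7077 --cores 2 --memory 8g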

When I run my Spark job in cluster deploy-mode (using the same settings as above, but with --deploy-mode cluster), around 50% of the time I get the desired deployment: the driver runs on the slave that sits on the master node (which has the right resources of 2 cores & 8GB), which leaves the original slave node free to allocate an executor of 4 cores & 16GB. The other 50% of the time the master runs the driver on the non-master slave node, which means the driver occupies 2 cores & 8GB of memory on that node, leaving no node with sufficient resources to start an executor (which requires 4 cores & 16GB).
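In other words, the submit command is the same as above except for the deploy mode (plus, eventually, the --supervise flag that motivates the change):

  --deploy-mode cluster \
  --supervise \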

Is there a way to force the Spark master to use a specific worker / slave for the driver? Given that Spark knows there are 2 slave nodes, one with 2 cores and the other with 4, and that the driver needs 2 cores and the executor needs 4, it would ideally work out the right optimal placement, but this doesn't seem to be the case.

Any ideas / suggestions gratefully received!

Thanks!

I can see this is an old question, but let me answer it anyway; someone might find it useful.

Add the --driver-java-options="-Dspark.driver.host=<host>" option to the spark-submit script when submitting the application, and Spark should deploy the driver to the specified host.
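As a sketch, applied to the submit command from the question (host, master URL, main class and jar path are placeholders, and the resource flags are the same as above):

./bin/spark-submit \
  --master spark://<master-host>:7077 \
  --deploy-mode cluster \
  --driver-java-options="-Dspark.driver.host=<host>" \
  --class <main-class> \
  <path-to-app-jar>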

