hadoop - Calculating yarn.nodemanager.resource.cpu-vcores for a YARN cluster with multiple Spark clients


If I have 3 Spark applications using the same YARN cluster, how should I set

yarn.nodemanager.resource.cpu-vcores

in each of the 3 yarn-site.xml files?

(Each Spark application is required to have its own yarn-site.xml on the classpath.)

Does the value even matter in the clients' yarn-site.xml files?

If it does:

Let's say the cluster has 16 cores.

Should the value in each yarn-site.xml be 5 (for a total of 15, leaving 1 core for system processes)? Or should I set each one to 15?

(Note: Cloudera indicates that 1 core should be left for system processes here: http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/ However, it does not go into the details of using multiple clients against the same cluster.)

Assume Spark is running with YARN as the master, and is running in cluster mode.

Are you talking about the server-side configuration of each YARN node manager? If so, it would typically be set to a little less than the number of CPU cores (or virtual cores if you have hyperthreading) on each node in the cluster. So if you have 4 nodes with 4 cores each, you could dedicate, for example, 3 per node to the YARN node manager, and your cluster would have a total of 12 virtual CPUs.
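The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `cluster_vcores` and its parameters are made up for this example, and the one-core-per-node reservation is the rule of thumb mentioned in the question, not a YARN requirement.

```python
def cluster_vcores(nodes: int, cores_per_node: int, reserved_per_node: int = 1) -> int:
    """Total vcores the cluster offers if every node manager is configured
    with (cores_per_node - reserved_per_node) as its cpu-vcores value."""
    if not 0 <= reserved_per_node < cores_per_node:
        raise ValueError("reserved_per_node must be >= 0 and less than cores_per_node")
    # Per-node value one would put in yarn.nodemanager.resource.cpu-vcores
    vcores_per_node = cores_per_node - reserved_per_node
    return nodes * vcores_per_node


# The example from the answer: 4 nodes, 4 cores each, 3 given to YARN per node.
total = cluster_vcores(nodes=4, cores_per_node=4, reserved_per_node=1)
print(total)
```

Note that the value is set per node, so the per-node figure (3 here) is what goes into each node manager's configuration, not the cluster-wide total.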

Then you request the desired resources when submitting the Spark job to the cluster (see http://spark.apache.org/docs/latest/submitting-applications.html for an example), and YARN will attempt to fulfill that request. If it can't be fulfilled, your Spark job (or application) will be queued or there will be a timeout.

You can configure different resource pools in YARN to guarantee a specific amount of memory/CPU resources to each such pool, but that's a little bit more advanced.

If you submit your Spark application in cluster mode, you have to consider that the Spark driver will run on a cluster node and not on your local machine (the one that submitted it). Therefore it will require at least 1 virtual CPU more.
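As a rough sketch of that accounting, the snippet below adds up what each application would ask for in cluster mode (executors plus one driver) and compares it with the vcores the cluster offers. The helper names `app_vcores` and `fits` are hypothetical, the numbers are made up, and the check is deliberately simplified: it ignores memory, per-node placement, and scheduler settings, all of which affect what YARN actually grants.

```python
def app_vcores(num_executors: int, executor_cores: int, driver_cores: int = 1) -> int:
    """vcores one Spark application requests in cluster mode:
    all executor cores plus the driver, which also runs on the cluster."""
    return num_executors * executor_cores + driver_cores


def fits(apps: list[tuple[int, int]], total_vcores: int) -> bool:
    """True if all applications could hold their vcores at the same time.
    Each entry in apps is (num_executors, executor_cores)."""
    requested = sum(app_vcores(n, c) for n, c in apps)
    return requested <= total_vcores


# Three applications on the 12-vcore cluster from the example above.
small = [(1, 3), (1, 3), (1, 3)]   # 3 * (1*3 + 1) = 12 vcores
large = [(2, 2), (2, 2), (2, 2)]   # 3 * (2*2 + 1) = 15 vcores

print(fits(small, 12))
print(fits(large, 12))
```

When the total does not fit, the remaining applications would wait in the queue until resources free up, as described above.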

Hope that clarifies things a little for you.

