How do I set the precise maximum number of concurrently running tasks per node in Hadoop 2.4.0 on Elastic MapReduce?

2023-09-11 10:19:24 Author: 本是荒野客

According to http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/, the formula for determining the number of concurrently running tasks per node is:

min(yarn.nodemanager.resource.memory-mb / mapreduce.[map|reduce].memory.mb,
    yarn.nodemanager.resource.cpu-vcores / mapreduce.[map|reduce].cpu.vcores)

However, on setting these parameters to (for a cluster of c3.2xlarges):

yarn.nodemanager.resource.memory-mb = 14336

mapreduce.map.memory.mb = 2048

yarn.nodemanager.resource.cpu-vcores = 8

mapreduce.map.cpu.vcores = 1
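Plugging these values into the Cloudera formula gives the expected per-node concurrency. A minimal sketch (integer division is used here since YARN allocates whole containers):

```python
# Expected concurrent map tasks per node, per the Cloudera formula.
# Values are the c3.2xlarge settings listed above.
yarn_memory_mb = 14336   # yarn.nodemanager.resource.memory-mb
map_memory_mb = 2048     # mapreduce.map.memory.mb
yarn_vcores = 8          # yarn.nodemanager.resource.cpu-vcores
map_vcores = 1           # mapreduce.map.cpu.vcores

# Containers are allocated whole, so take the floor of each ratio.
concurrent_tasks = min(yarn_memory_mb // map_memory_mb,
                       yarn_vcores // map_vcores)
print(concurrent_tasks)  # 7
```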

I find I'm only getting up to 4 tasks running concurrently per node, when the formula says it should be 7. What's the deal?

I'm running Hadoop 2.4.0 on AMI 3.1.0.

Accepted answer

My empirical formula was incorrect. The formula provided by Cloudera is the correct one and appears to give the expected number of concurrently running tasks, at least on AMI 3.3.1.