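Assorted Hadoop Configuration Problems
Note: I failed at this step many times.
First, be sure to open PowerShell as Administrator. Otherwise the permissions are insufficient and you will find that there are no active nodes.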
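To catch the permissions problem before starting any daemons, a small check like the following may help. It is only a sketch: the function name `is_elevated` is made up for illustration, and the non-Windows branch is there only so the script runs anywhere.

```python
import ctypes
import os


def is_elevated():
    """Return True if the current process has administrator/root rights."""
    if os.name == "nt":
        # shell32.IsUserAnAdmin returns nonzero when the process is elevated.
        return ctypes.windll.shell32.IsUserAnAdmin() != 0
    # Fallback for non-Windows systems (not the setup described in this post).
    return os.geteuid() == 0


if __name__ == "__main__":
    if is_elevated():
        print("Elevated shell: OK to start the Hadoop daemons.")
    else:
        print("Not elevated: reopen PowerShell with 'Run as administrator'.")
```

Running it from the same PowerShell window you plan to start Hadoop from shows whether that window is actually elevated.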
Second, this machine of mine is pretty weak, so I am recording my YARN configuration below (note: do not limit the memory to 1024 MB).
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<!-- Scheduler configuration: lets multiple jobs queue and run, and avoids MapReduce and PySpark jobs getting stuck in the ACCEPTED state without ever running -->
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>true</value>
</property>
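<!-- The property below sets the cluster utilization threshold for preemption. The default is 0.8f: preemption kicks in once utilization exceeds 80% of the cluster's resources -->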
<property>
<name>yarn.scheduler.fair.preemption.cluster-utilization-threshold</name>
<value>1.0</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
<value>0.0</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>100.0</value>
</property>
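To double-check a yarn-site.xml after editing it, a short script can list the properties and flag a low memory cap. This is a minimal sketch under assumptions: the note above does not say which property was capped at 1024 MB, so `yarn.nodemanager.resource.memory-mb` is my guess, and the helper names and the `SAMPLE` text are illustrative only. It expects the usual `<configuration>` root element that a full yarn-site.xml has.

```python
import xml.etree.ElementTree as ET

# Assumed property: the post does not name the setting that was capped.
MEMORY_KEY = "yarn.nodemanager.resource.memory-mb"


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def memory_is_capped(props, limit_mb=1024):
    """True if MEMORY_KEY is set and is at or below limit_mb."""
    raw = props.get(MEMORY_KEY)
    return raw is not None and int(raw) <= limit_mb


# Illustrative input only; replace with the contents of your own file.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1024</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    props = load_properties(SAMPLE)
    for name, value in sorted(props.items()):
        print(name, "=", value)
    if memory_is_capped(props):
        print("Warning:", MEMORY_KEY, "is capped at 1024 MB or less")
```

To use it on a real file, read the file's text and pass it to `load_properties` instead of `SAMPLE`.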
