Using the Fair Scheduler in Hadoop

2024-02-05 05:00:01

The conclusion up front: use the Fair Scheduler!

I'll walk through what happened in chronological order.

Stage 1 (default scheduler)

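A few years ago I set up a data warehouse cluster. The data volume was small, and the nightly offline batch finished comfortably overnight. At the time I used the scheduler that Hadoop ships with by default, the Capacity Scheduler, and left its default configuration untouched. So what happens in that case?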

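With the default configuration, the root queue has only one child queue, default. So although it is nominally the Capacity Scheduler, every job lands in the same queue and in practice it behaves like a FIFO scheduler.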

Stage 2 (Capacity Scheduler)

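A few months later, as the number of scheduled jobs grew, I noticed something was wrong: the cluster's resources were not being used. So I added more queues. For example, under root there was a parent queue hive with two child queues, hive1 and hive2, plus a kylin queue, a flink queue, and so on. All of these are configured in $HADOOP_HOME/etc/hadoop/capacity-scheduler.xml. I won't go into detail here; if you are not sure how to configure it, a quick search online will help.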
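For readers who want a starting point, the queue hierarchy described above could look roughly like the following in capacity-scheduler.xml. This is only a sketch: the queue names come from the text, but every capacity percentage is an assumption chosen so that sibling queues add up to 100, and keeping the default queue alongside the new ones is also an assumption.

```xml
<configuration>
    <!-- Child queues directly under root (queue names from the text above) -->
    <property>
        <name>yarn.scheduler.capacity.root.queues</name>
        <value>default,hive,kylin,flink</value>
    </property>

    <!-- Assumed capacities: siblings under root must sum to 100 -->
    <property>
        <name>yarn.scheduler.capacity.root.default.capacity</name>
        <value>10</value>
    </property>
    <property>
        <name>yarn.scheduler.capacity.root.hive.capacity</name>
        <value>50</value>
    </property>
    <property>
        <name>yarn.scheduler.capacity.root.kylin.capacity</name>
        <value>20</value>
    </property>
    <property>
        <name>yarn.scheduler.capacity.root.flink.capacity</name>
        <value>20</value>
    </property>

    <!-- hive is a parent queue with two leaf queues -->
    <property>
        <name>yarn.scheduler.capacity.root.hive.queues</name>
        <value>hive1,hive2</value>
    </property>

    <!-- Assumed split inside hive: children must also sum to 100 -->
    <property>
        <name>yarn.scheduler.capacity.root.hive.hive1.capacity</name>
        <value>60</value>
    </property>
    <property>
        <name>yarn.scheduler.capacity.root.hive.hive2.capacity</name>
        <value>40</value>
    </property>
</configuration>
```

With a layout like this, a job is directed to a leaf queue by its full or short queue name (for example hive1); the actual percentages should be tuned to the workload.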

Stage 3 (Fair Scheduler)

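About a year later, no matter what I optimized (the database, data ingestion, the code, and so on), the results were still unsatisfactory and the resources were still not fully utilized. At that point I concluded that the limitation lay in the Capacity Scheduler itself, so I started trying the Fair Scheduler. With a simple configuration in place, I tested it using hadoop-mapreduce-examples-2.7.2.jar, the examples jar that ships with Hadoop. I launched jobs from several terminal windows at the same time; they ran very fast and finished almost simultaneously. So I switched to the Fair Scheduler.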

Edit yarn-site.xml and add the following inside the <configuration> element:

<property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    <description>Use the Fair Scheduler</description>
</property>
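By default the Fair Scheduler looks for an allocation file named fair-scheduler.xml on the Hadoop configuration classpath, so no extra setting is needed when the file sits in $HADOOP_HOME/etc/hadoop. If you prefer to make the location explicit, the yarn.scheduler.fair.allocation.file property can be set as well. The sketch below assumes a hypothetical install path of /opt/hadoop; substitute your own.

```xml
<property>
    <name>yarn.scheduler.fair.allocation.file</name>
    <!-- Hypothetical absolute path; adjust to where your Hadoop is installed -->
    <value>/opt/hadoop/etc/hadoop/fair-scheduler.xml</value>
    <description>Location of the Fair Scheduler allocation file</description>
</property>
```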

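Then create a new file named fair-scheduler.xml under $HADOOP_HOME/etc/hadoop with the content below, which you can adapt to your own needs. (In my experience, avoid giving the queues identical weights, otherwise jobs may end up blocked.)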

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!--
  This file contains pool and user allocations for the Fair Scheduler.
  Its format is explained in the Fair Scheduler documentation at
  http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html.
  The documentation also includes a sample config file.
-->

<allocations>
    <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>

    <queue name="default">
        <weight>30</weight>
    </queue>

    <queue name="kylin">
        <weight>50</weight>
    </queue>
	
    <queue name="hive1">
        <weight>70</weight>
    </queue>

    <queuePlacementPolicy>
        <rule name="specified" create="false" />
        <rule name="primaryGroup" create="false" />
        <rule name="default" queue="default" />
    </queuePlacementPolicy>
</allocations>
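As a rough illustration of what these weights mean: assuming all three queues have pending demand at the same time, and given that no minResources or maxResources limits are set in the file above, the fair policy divides cluster resources among the queues roughly in proportion to their weights. The symbols $w_q$ (weight of queue $q$) and $\text{share}_q$ are introduced here only for illustration.

```latex
\[
\text{share}_q = \frac{w_q}{\sum_k w_k}, \qquad \sum_k w_k = 30 + 50 + 70 = 150
\]
\[
\text{share}_{\text{default}} = \frac{30}{150} = 20\%, \quad
\text{share}_{\text{kylin}} = \frac{50}{150} \approx 33.3\%, \quad
\text{share}_{\text{hive1}} = \frac{70}{150} \approx 46.7\%
\]
```

These figures only apply while every queue is busy. When a queue has no running or pending applications, its share is handed to the queues that still have demand, so a single active queue can use the whole cluster.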

Restart YARN.

Done. You can now check the queues on the Hadoop web UI (the Scheduler page of the ResourceManager UI).

Original article: https://blog.csdn.net/qq_40209679/article/details/135994841
