Setting Up a Hadoop 2.7.3 Cluster on CentOS 7.0

1. Basic Environment

1.1. Operating System

CentOS 7.0

1.2. Four Virtual Machines

  • 192.168.56.216 apollo.hadoop.com
  • 192.168.56.217 artemis.hadoop.com
  • 192.168.56.218 uranus.hadoop.com
  • 192.168.56.219 ares.hadoop.com

1.3. Software Packages

  • hadoop-2.7.3.tar.gz
  • jdk-8u77-linux-x64.rpm

2. Configure the System Environment

2.1. Configure NTP time synchronization
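
Time on all nodes should stay in sync. A minimal sketch using chrony, the stock time-sync service on CentOS 7 (assuming the hosts can reach public NTP servers); run it on every node:

# Install and enable time synchronization on every host
[root@apollo ~]# yum install -y chrony
[root@apollo ~]# systemctl start chronyd
[root@apollo ~]# systemctl enable chronyd
# Verify that upstream NTP servers are reachable
[root@apollo ~]# chronyc sources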

2.2. Set the hostname

#Host: 192.168.56.216
[root@apollo~]$ echo "apollo.hadoop.com" > /etc/hostname
#Host: 192.168.56.217
[root@artemis~]$ echo "artemis.hadoop.com" > /etc/hostname
#Host: 192.168.56.218
[root@uranus~]$ echo "uranus.hadoop.com" > /etc/hostname
#Host: 192.168.56.219
[root@ares~]$ echo "ares.hadoop.com" > /etc/hostname

2.3. Modify the /etc/hosts file on the master

# Add all four mappings on the master; this file is synced to the slaves in 2.4
[root@apollo~]$ echo "192.168.56.216 apollo.hadoop.com" >> /etc/hosts
[root@apollo~]$ echo "192.168.56.217 artemis.hadoop.com" >> /etc/hosts
[root@apollo~]$ echo "192.168.56.218 uranus.hadoop.com" >> /etc/hosts
[root@apollo~]$ echo "192.168.56.219 ares.hadoop.com" >> /etc/hosts

2.4. Sync the /etc/hosts file to the three slaves

[root@apollo~]$ scp /etc/hosts artemis.hadoop.com:/etc/
[root@apollo~]$ scp /etc/hosts uranus.hadoop.com:/etc/
[root@apollo~]$ scp /etc/hosts ares.hadoop.com:/etc/

2.5. Turn off the firewall on the master and slaves

# Stop the firewall
[root@apollo~]$ systemctl stop firewalld.service
# Disable the firewall at boot
[root@apollo~]$ systemctl disable firewalld.service
# Stop the firewall
[root@artemis~]$ systemctl stop firewalld.service
# Disable the firewall at boot
[root@artemis~]$ systemctl disable firewalld.service
# Stop the firewall
[root@uranus~]$ systemctl stop firewalld.service
# Disable the firewall at boot
[root@uranus~]$ systemctl disable firewalld.service
# Stop the firewall
[root@ares~]$ systemctl stop firewalld.service
# Disable the firewall at boot
[root@ares~]$ systemctl disable firewalld.service
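
A quick check on each host that the firewall is really stopped; the command should report inactive once the service is down:

[root@apollo~]$ systemctl is-active firewalld.service
inactive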

3. Configure the Hadoop Environment

3.1. Install the JDK on the master and slaves

For JDK 1.8 installation and environment variable configuration, refer to "Installing and Configuring JDK 1.8 on CentOS 7.0".
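
A minimal sketch using the RPM listed in section 1.3; the JAVA_HOME path matches the one used in hadoop-env.sh in section 4.2. Run it on every host:

# Install the JDK RPM and set JAVA_HOME system-wide
[root@apollo ~]# rpm -ivh jdk-8u77-linux-x64.rpm
[root@apollo ~]# echo 'export JAVA_HOME=/usr/java/jdk1.8.0_77' >> /etc/profile
[root@apollo ~]# echo 'export PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
[root@apollo ~]# source /etc/profile
# Verify the installation
[root@apollo ~]# java -version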

3.2. Create the hadoop user on the master and slaves

# Create the hadoop group on each host
[root@apollo~]$ groupadd hadoop
[root@artemis~]$ groupadd hadoop
[root@uranus~]$ groupadd hadoop
[root@ares~]$ groupadd hadoop
# Create the hadoop user
[root@apollo~]$ useradd -d /home/hadoop -g hadoop hadoop
[root@artemis~]$ useradd -d /home/hadoop -g hadoop hadoop
[root@uranus~]$ useradd -d /home/hadoop -g hadoop hadoop
[root@ares~]$ useradd -d /home/hadoop -g hadoop hadoop
# Set the hadoop user's password
[root@apollo~]$ passwd hadoop
Changing password for user hadoop.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.
[root@artemis~]$ passwd hadoop
Changing password for user hadoop.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.
[root@uranus~]$ passwd hadoop
Changing password for user hadoop.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.
[root@ares~]$ passwd hadoop
Changing password for user hadoop.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.

3.3. Recommendation

While you are learning, it is recommended to give the hadoop user sudo privileges. A simple way to set this up is as follows:

[root@apollo ~]# visudo 
# Add the following line below root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL

[root@artemis ~]# visudo 
# Add the following line below root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL

[root@uranus ~]# visudo 
# Add the following line below root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL

[root@ares ~]# visudo 
# Add the following line below root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL

3.4. Set up passwordless SSH from the master to the slaves

# On the master, switch to the hadoop user
[root@apollo ~]$ su - hadoop
[hadoop@apollo ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
/home/hadoop/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
9c:88:8a:b4:67:66:6e:d6:e4:a9:05:40:04:f0:1f:a1 hadoop@apollo.hadoop.com
The key's randomart image is:
+--[ RSA 2048]----+
|*o  .            |
|.. . .           |
| .E .            |
|  .. o o .       |
| . .o . S        |
|......           |
|...=+..          |
|  *o.+           |
|  oo.            |
+-----------------+

[hadoop@apollo ~]$ cd /home/hadoop/.ssh/
[hadoop@apollo .ssh]$ cp id_rsa.pub authorized_keys
[hadoop@apollo .ssh]$ chmod go-wx authorized_keys

# Copy the master's (apollo.hadoop.com) authorized_keys to the three slaves artemis.hadoop.com, uranus.hadoop.com and ares.hadoop.com so that the master can log in to all three slaves without a password
[hadoop@apollo .ssh]$ scp authorized_keys artemis.hadoop.com:/home/hadoop/.ssh/
hadoop@artemis.hadoop.com's password: 
authorized_keys                               100%  406     0.4KB/s   00:00    
[hadoop@apollo .ssh]$ scp authorized_keys uranus.hadoop.com:/home/hadoop/.ssh/
hadoop@uranus.hadoop.com's password: 
authorized_keys                               100%  406     0.4KB/s   00:00    
[hadoop@apollo .ssh]$ scp authorized_keys ares.hadoop.com:/home/hadoop/.ssh/
hadoop@ares.hadoop.com's password: 
authorized_keys                               100%  406     0.4KB/s   00:00
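
A quick check that the passwordless login works; each command should return the slave's hostname without prompting for a password:

[hadoop@apollo .ssh]$ ssh artemis.hadoop.com hostname
[hadoop@apollo .ssh]$ ssh uranus.hadoop.com hostname
[hadoop@apollo .ssh]$ ssh ares.hadoop.com hostname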

3.5. Set the Hadoop environment variables on the master and slaves

# Set the HADOOP_HOME environment variable on the master and slaves
[root@apollo ~]# vim /etc/profile
[root@artemis ~]# vim /etc/profile
[root@uranus ~]# vim /etc/profile
[root@ares ~]# vim /etc/profile
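# Content to append to /etc/profile on each host (a sketch; HADOOP_HOME matches the install path used in section 3.7):
export HADOOP_HOME=/home/hadoop/hadoop2.7
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH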
# Apply the changes
[root@apollo ~]# source /etc/profile 
[root@artemis ~]# source /etc/profile 
[root@uranus ~]# source /etc/profile 
[root@ares ~]# source /etc/profile 

3.6. Create the required directories on the master and slaves

# Create the Hadoop data directories
[root@apollo hadoop]# mkdir -p /data/hadoop
[root@apollo ~]$ cd /data/hadoop/
[root@apollo hadoop]$ mkdir tmp # create tmp
[root@apollo hadoop]$ mkdir hdfs # create hdfs
[root@apollo hadoop]$ cd hdfs/
[root@apollo hdfs]$ mkdir data # create the datanode directory
[root@apollo hdfs]$ mkdir name # create the namenode directory
[root@apollo hdfs]$ mkdir namesecondary
[root@apollo hadoop]# chown -R hadoop:hadoop /data/hadoop/

# Create the Hadoop data directories on the three slaves in the same way
[root@artemis hadoop]# mkdir -p /data/hadoop
[root@artemis ~]$ cd /data/hadoop/
[root@artemis hadoop]$ mkdir tmp # create tmp
[root@artemis hadoop]$ mkdir hdfs # create hdfs
[root@artemis hadoop]$ cd hdfs/
[root@artemis hdfs]$ mkdir data # create the datanode directory
[root@artemis hdfs]$ mkdir name # create the namenode directory
[root@artemis hdfs]$ mkdir namesecondary
[root@artemis hadoop]# chown -R hadoop:hadoop /data/hadoop/

[root@uranus hadoop]# mkdir -p /data/hadoop
[root@uranus ~]$ cd /data/hadoop/
[root@uranus hadoop]$ mkdir tmp # create tmp
[root@uranus hadoop]$ mkdir hdfs # create hdfs
[root@uranus hadoop]$ cd hdfs/
[root@uranus hdfs]$ mkdir data # create the datanode directory
[root@uranus hdfs]$ mkdir name # create the namenode directory
[root@uranus hdfs]$ mkdir namesecondary
[root@uranus hadoop]# chown -R hadoop:hadoop /data/hadoop/

[root@ares hadoop]# mkdir -p /data/hadoop
[root@ares ~]$ cd /data/hadoop/
[root@ares hadoop]$ mkdir tmp # create tmp
[root@ares hadoop]$ mkdir hdfs # create hdfs
[root@ares hadoop]$ cd hdfs/
[root@ares hdfs]$ mkdir data # create the datanode directory
[root@ares hdfs]$ mkdir name # create the namenode directory
[root@ares hdfs]$ mkdir namesecondary
[root@ares hadoop]# chown -R hadoop:hadoop /data/hadoop/
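
The same directory tree can also be created with a single command per host using bash brace expansion; a sketch:

[root@apollo ~]# mkdir -p /data/hadoop/{tmp,hdfs/{data,name,namesecondary}}
[root@apollo ~]# chown -R hadoop:hadoop /data/hadoop/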

3.7. Install Hadoop on the master

# Download Hadoop 2.7.3
[root@apollo ~]$ wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
--2017-04-19 04:49:17--  http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
Resolving mirror.bit.edu.cn (mirror.bit.edu.cn)... 202.204.80.77, 2001:da8:204:2001:250:56ff:fea1:22
Connecting to mirror.bit.edu.cn (mirror.bit.edu.cn)|202.204.80.77|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 214092195 (204M) [application/octet-stream]
Saving to: ‘hadoop-2.7.3.tar.gz’

100%[==========================================================================>] 214,092,195 1.13MB/s   in 4m 14s

2017-04-19 04:53:30 (825 KB/s) - ‘hadoop-2.7.3.tar.gz’ saved [214092195/214092195]

# Extract Hadoop
[root@apollo ~]$ tar -zxvf hadoop-2.7.3.tar.gz

# Move the extracted Hadoop to /home/hadoop/
[root@apollo ~]$ mv hadoop-2.7.3 /home/hadoop/hadoop2.7

# Change the owner of the Hadoop directory
[root@apollo ~]$ chown -R hadoop:hadoop /home/hadoop/hadoop2.7

4. Modify the Configuration Files

4.1. For a detailed description of the configuration files, refer to the official documentation.

4.2. Configure hadoop-env.sh

# Change to the directory containing the Hadoop configuration files
[hadoop@apollo ~]$ cd $HADOOP_HOME/etc/hadoop/
[hadoop@apollo hadoop]$ ls -la
total 164
drwxrwxr-x. 2 hadoop hadoop  4096 Apr 19 13:49 .
drwxrwxr-x. 3 hadoop hadoop    19 Aug 17  2016 ..
-rw-rwxr--. 1 hadoop hadoop  4436 Aug 17  2016 capacity-scheduler.xml
-rw-rwxr--. 1 hadoop hadoop  1335 Aug 17  2016 configuration.xsl
-rw-rwxr--. 1 hadoop hadoop   318 Aug 17  2016 container-executor.cfg
-rw-rwxr--. 1 hadoop hadoop  1946 Apr 19 11:47 core-site.xml
-rw-rwxr--. 1 hadoop hadoop  3589 Aug 17  2016 hadoop-env.cmd
-rw-rwxr--. 1 hadoop hadoop  4249 Apr 19 13:48 hadoop-env.sh
-rw-rwxr--. 1 hadoop hadoop  2598 Aug 17  2016 hadoop-metrics2.properties
-rw-rwxr--. 1 hadoop hadoop  2490 Aug 17  2016 hadoop-metrics.properties
-rw-rwxr--. 1 hadoop hadoop  9683 Aug 17  2016 hadoop-policy.xml
-rw-rwxr--. 1 hadoop hadoop  2181 Apr 19 12:06 hdfs-site.xml
-rw-rwxr--. 1 hadoop hadoop  1449 Aug 17  2016 httpfs-env.sh
-rw-rwxr--. 1 hadoop hadoop  1657 Aug 17  2016 httpfs-log4j.properties
-rw-rwxr--. 1 hadoop hadoop    21 Aug 17  2016 httpfs-signature.secret
-rw-rwxr--. 1 hadoop hadoop   620 Aug 17  2016 httpfs-site.xml
-rw-rwxr--. 1 hadoop hadoop  3518 Aug 17  2016 kms-acls.xml
-rw-rwxr--. 1 hadoop hadoop  1527 Aug 17  2016 kms-env.sh
-rw-rwxr--. 1 hadoop hadoop  1631 Aug 17  2016 kms-log4j.properties
-rw-rwxr--. 1 hadoop hadoop  5511 Aug 17  2016 kms-site.xml
-rw-rwxr--. 1 hadoop hadoop 11237 Aug 17  2016 log4j.properties
-rw-rwxr--. 1 hadoop hadoop   931 Aug 17  2016 mapred-env.cmd
-rw-rwxr--. 1 hadoop hadoop  1383 Aug 17  2016 mapred-env.sh
-rw-rwxr--. 1 hadoop hadoop  4113 Aug 17  2016 mapred-queues.xml.template
-rw-rwxr--. 1 hadoop hadoop  1292 Apr 19 12:15 mapred-site.xml
-rw-rwxr--. 1 hadoop hadoop   758 Aug 17  2016 mapred-site.xml.template
-rw-rw-r--. 1 hadoop hadoop    18 Apr 19 13:36 masters
-rw-rwxr--. 1 hadoop hadoop    64 Apr 19 13:34 slaves
-rw-rwxr--. 1 hadoop hadoop  2316 Aug 17  2016 ssl-client.xml.example
-rw-rwxr--. 1 hadoop hadoop  2268 Aug 17  2016 ssl-server.xml.example
-rw-rwxr--. 1 hadoop hadoop  2191 Aug 17  2016 yarn-env.cmd
-rw-rwxr--. 1 hadoop hadoop  4567 Aug 17  2016 yarn-env.sh
-rw-rwxr--. 1 hadoop hadoop  1361 Apr 19 12:37 yarn-site.xml

# Set HADOOP_HEAPSIZE=128 (the default is 1000 MB; here it is reduced to 128 MB)
# Set JAVA_HOME
[hadoop@apollo hadoop]$ vim hadoop-env.sh
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_77 #${JAVA_HOME}

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol. Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=128
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol. This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored. $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

###
# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by 
# the user that will run the hadoop daemons. Otherwise there is the
# potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER

4.3. Configure core-site.xml (global configuration)

[hadoop@apollo hadoop]$ vim core-site.xml # configure global settings
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. -->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://apollo.hadoop.com:9000</value>
                <!-- Address and port of the Hadoop NameNode, given as a hostname -->
        </property>

        <property>
                <name>dfs.namenode.checkpoint.period</name>
                <value>1800</value>
                <!-- Trigger an edit-log checkpoint merge every 30 minutes; the default is 60 minutes -->
        </property>

        <property>
                <name>fs.checkpoint.size</name>
                <value>67108864</value>
        </property>

        <property>
                <name>fs.trash.interval</name>
                <value>1440</value>
                <!-- Retention time, in minutes, for the Hadoop trash; set to 1 day here, the default is 0 (disabled) -->
        </property>

        <property>
                <name>hadoop.tmp.dir</name>
                <value>/data/hadoop/tmp</value>
                <!-- Hadoop's default temporary directory. It is best to set this explicitly: if a DataNode mysteriously fails to start after adding nodes or in other situations, deleting the tmp directory under this path usually fixes it. If you delete this directory on the NameNode machine, however, you will need to re-run the NameNode format command. The path /data/hadoop/tmp does not have to be created in advance; it is generated automatically. -->
        </property>
        <property>
                <name>io.file.buffer.size</name>
                <value>131702</value>
                <!-- Buffer size for stream files -->
        </property>
</configuration>
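
A quick way to confirm that core-site.xml is being picked up (assuming $HADOOP_HOME/bin is on the PATH as configured in section 3.5); the command should print hdfs://apollo.hadoop.com:9000:

[hadoop@apollo hadoop]$ hdfs getconf -confKey fs.defaultFS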

4.4. NameNode and DataNode settings for HDFS (hdfs-site.xml)

[hadoop@apollo hadoop]$ vim hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"). See accompanying LICENSE file. -->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/data/hadoop/hdfs/name</value>
                <!-- Directory for the HDFS NameNode metadata (fsimage) -->
                <description> </description>
        </property>

        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/data/hadoop/hdfs/data</value>
                <!-- Storage path for HDFS DataNode blocks; multiple partitions or disks can be used, separated by commas -->
                <description> </description>
        </property>

        <property>
                <name>dfs.namenode.http-address</name>
                <value>apollo.hadoop.com:50070</value>
                <!-- Host and port of the HDFS web UI -->
        </property>

        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>artemis.hadoop.com:50090</value>
                <!-- Host and port of the SecondaryNameNode web UI -->
        </property>

        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
                <!-- Number of HDFS replicas, usually 3 -->
        </property>

        <property>
                <name>dfs.datanode.du.reserved</name>
                <value>1073741824</value>
                <!-- The DataNode reserves 1 GB of disk space (value in bytes) for other programs instead of filling the disk -->
        </property>

        <property>
                <name>dfs.block.size</name>
                <value>134217728</value>
                <!-- HDFS block size, set here to 128 MB per block -->
        </property>

        <property>
                <name>dfs.permissions.enabled</name>
                <value>false</value>
                <!-- Disable HDFS file permission checks -->
        </property>

</configuration>

4.5. Configure MapReduce to use the YARN framework, and set the JobHistory address and web address (mapred-site.xml)
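
In a fresh Hadoop 2.7.3 extraction only mapred-site.xml.template exists (see the listing in section 4.2), so create mapred-site.xml from the template before editing:

[hadoop@apollo hadoop]$ cp mapred-site.xml.template mapred-site.xml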

[hadoop@apollo hadoop]$ vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"). See accompanying LICENSE file. -->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobtracker.http.address</name>
                <value>apollo.hadoop.com:50030</value>
        </property>
        <property>
                <name>mapred.job.tracker</name>
                <value>http://apollo.hadoop.com:9001</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>apollo.hadoop.com:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>apollo.hadoop.com:19888</value>
        </property>
</configuration>

4.6. Configure YARN (yarn-site.xml)

[hadoop@apollo hadoop]$ vim yarn-site.xml 
<?xml version="1.0"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"). See accompanying LICENSE file. -->
<configuration>

<!-- Site specific YARN configuration properties -->
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>apollo.hadoop.com:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>apollo.hadoop.com:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>apollo.hadoop.com:8031</value>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>apollo.hadoop.com:8033</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>apollo.hadoop.com:8088</value>
        </property>
</configuration>

5. Check Hadoop on the Master

5.1. Test the HDFS NameNode and DataNode
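
If the NameNode has never been formatted it will not start, so format it once first; a sketch (the same command appears again in section 7 -- format only once in total, or clear /data/hadoop/hdfs/data on every node after re-formatting to avoid an "Incompatible clusterIDs" error on the DataNodes):

[hadoop@apollo hadoop]$ $HADOOP_HOME/bin/hdfs namenode -format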

[hadoop@apollo hadoop]$ sh $HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
[hadoop@apollo hadoop]$ chmod go-w /data/hadoop/hdfs/data/
[hadoop@apollo hadoop]$ sh $HADOOP_HOME/sbin/hadoop-daemon.sh start datanode

5.2. Test the ResourceManager

[hadoop@apollo hadoop]$ sh $HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager

5.3. Test the NodeManager

[hadoop@apollo hadoop]$ sh $HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager

5.4. Test the JobHistory server

[hadoop@apollo hadoop]$ sh $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver

5.5. Run jps

* Seeing the following processes means the single-node Hadoop installation succeeded *

[hadoop@apollo sbin]$ jps
15570 Jps
13861 JobHistoryServer
15273 ResourceManager
13997 DataNode
14349 NodeManager
15149 NameNode
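
Before rebuilding the setup as a cluster, the single-node daemons started above can be stopped again; a sketch (stop-all.sh ships in Hadoop 2.7's sbin directory, though it prints a deprecation notice, and the JobHistory server is stopped separately):

[hadoop@apollo sbin]$ sh $HADOOP_HOME/sbin/stop-all.sh
[hadoop@apollo sbin]$ sh $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh stop historyserver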

6. Build the Hadoop Cluster

6.1. Copy the extracted Hadoop from the master to the slaves

# Copy to slave artemis.hadoop.com
[hadoop@apollo sbin]$ scp -r $HADOOP_HOME/ artemis.hadoop.com:/home/hadoop/
# Copy to slave uranus.hadoop.com
[hadoop@apollo sbin]$ scp -r $HADOOP_HOME/ uranus.hadoop.com:/home/hadoop/
# Copy to slave ares.hadoop.com
[hadoop@apollo sbin]$ scp -r $HADOOP_HOME/ ares.hadoop.com:/home/hadoop/

6.2. Configure masters and slaves on the master apollo.hadoop.com

[hadoop@apollo sbin]$ vim $HADOOP_HOME/etc/hadoop/slaves
#1. Remove localhost
#2. Add the three slave hosts
artemis.hadoop.com
uranus.hadoop.com
ares.hadoop.com

[hadoop@apollo sbin]$ vim $HADOOP_HOME/etc/hadoop/masters
#1. Remove localhost
#2. Add the master's hostname
apollo.hadoop.com # the SecondaryNameNode is intended to run on the slave artemis.hadoop.com; in Hadoop 2.x this is actually determined by dfs.namenode.secondary.http-address in hdfs-site.xml

7. Verify the Cluster Setup

# On the master apollo.hadoop.com, format the NameNode and start all the daemons
[hadoop@apollo sbin]$ $HADOOP_HOME/bin/hdfs namenode -format
[hadoop@apollo sbin]$ sh $HADOOP_HOME/sbin/start-all.sh
# jps output on each node:
[hadoop@apollo sbin]$ jps
13861 JobHistoryServer
16567 GetConf
17527 Jps
15273 ResourceManager
13997 DataNode
14349 NodeManager
15149 NameNode

[hadoop@artemis ~]$ jps
13748 NodeManager
13606 DataNode
14598 Jps
13678 SecondaryNameNode

[hadoop@uranus ~]$ jps
13526 NodeManager
13449 DataNode
13916 Jps

[hadoop@ares ~]$ jps
13690 Jps
13355 NodeManager
13196 DataNode

* If you see the processes above, the Hadoop cluster has been set up successfully *

8. Verify via the Web UI
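
The web UIs follow from the addresses configured in section 4; with the cluster running, open them in a browser on a machine that can resolve the cluster hostnames (or substitute the IP addresses):

# NameNode web UI (dfs.namenode.http-address)
http://apollo.hadoop.com:50070
# YARN ResourceManager web UI (yarn.resourcemanager.webapp.address)
http://apollo.hadoop.com:8088
# MapReduce JobHistory web UI (mapreduce.jobhistory.webapp.address)
http://apollo.hadoop.com:19888
# SecondaryNameNode web UI (dfs.namenode.secondary.http-address)
http://artemis.hadoop.com:50090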
