Prerequisites:
A Linux server with root access (this guide uses Debian 10 as the example)
1. Update the system packages
apt update && apt upgrade
2. Install the JDK and Hadoop
Download Hadoop and the JDK
Hadoop website: https://hadoop.apache.org/
Oracle website: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Move each archive to its target directory and extract it
mv jdk-8u231-linux-x64.tar.gz /usr/local/lib
cd /usr/local/lib
tar xzvf jdk-8u231-linux-x64.tar.gz
cd
mv hadoop-2.7.3.tar.gz /usr/local
cd /usr/local
tar xzvf hadoop-2.7.3.tar.gz
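Before going further, it can help to confirm that both archives unpacked where the later steps expect them. The following is a minimal sketch; check_dirs is a hypothetical helper name, and the two paths are the ones this guide uses for JAVA_HOME and HADOOP_HOME.

```shell
# Print "ok" or "missing" for each directory; return non-zero if any is missing.
check_dirs() {
    missing=0
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "ok: $dir"
        else
            echo "missing: $dir"
            missing=1
        fi
    done
    return "$missing"
}

# Paths taken from this guide; adjust them if you extracted elsewhere.
check_dirs /usr/local/lib/jdk1.8.0_231 /usr/local/hadoop-2.7.3 || true
```

A "missing" line usually means the archive was extracted in a different directory or carries a different version number.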
3. Configure environment variables
Open /etc/profile
vi /etc/profile
Add the following lines
export JAVA_HOME=/usr/local/lib/jdk1.8.0_231
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Apply the changes to the current shell
source /etc/profile
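To check that the new PATH entries took effect, one option is to look up the commands the later steps rely on. This is a sketch only; check_cmd is a hypothetical helper, and the list of tools is an assumption based on the commands used in this guide.

```shell
# Report where a command resolves on PATH, or that it cannot be found.
check_cmd() {
    if location=$(command -v "$1"); then
        echo "found: $1 ($location)"
    else
        echo "not on PATH: $1"
        return 1
    fi
}

for tool in java hadoop hdfs; do
    check_cmd "$tool" || true
done
```

If all three are found, running java -version and hadoop version should print the JDK and Hadoop versions installed above.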
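5. Set up an SSH key
Generate a key pair and copy the public key to the local machine, so the Hadoop start scripts can log in to localhost without a password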
ssh-keygen
ssh-copy-id localhost
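ssh-copy-id works by appending the public key to ~/.ssh/authorized_keys. The sketch below checks that this happened. key_is_authorized is a hypothetical helper, and the id_rsa.pub file name assumes ssh-keygen was run with its default RSA settings; adjust it if you chose another key type or path.

```shell
# Succeed if the public key in file $1 appears as a full line in the authorized_keys file $2.
key_is_authorized() {
    [ -f "$1" ] && [ -f "$2" ] && grep -qxF "$(cat "$1")" "$2"
}

# Assumed default locations; not taken from the steps above.
if key_is_authorized "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"; then
    echo "public key is authorized for local login"
else
    echo "public key not found in authorized_keys"
fi
```

A final end-to-end check is to run ssh localhost by hand; it should open a shell without asking for a password.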
6. Edit the configuration files
Change to the configuration directory
cd /usr/local/hadoop-2.7.3/etc/hadoop
Edit hadoop-env.sh and set JAVA_HOME
export JAVA_HOME=/usr/local/lib/jdk1.8.0_231
Edit yarn-env.sh and set JAVA_HOME
export JAVA_HOME=/usr/local/lib/jdk1.8.0_231
Edit core-site.xml and add the following
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-2.7.3/hadoopdata/tmp</value>
</property>
</configuration>
Edit hdfs-site.xml and add the following
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop-2.7.3/hadoopdata/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop-2.7.3/hadoopdata/data</value>
</property>
</configuration>
Edit yarn-site.xml and add the following
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:18030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>localhost:18141</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>localhost:18088</value>
</property>
</configuration>
Copy mapred-site.xml.template to mapred-site.xml, then add the following to mapred-site.xml
cp mapred-site.xml.template mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
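After editing several XML files by hand, a quick lookup of a property value can catch typos early. The sketch below is a rough text match, not an XML parser: get_prop is a hypothetical helper that assumes each <name> and <value> element sits on a single line, as in the snippets above.

```shell
# Print the value of property $2 from Hadoop config file $1 (prints nothing if absent).
get_prop() {
    awk -v key="<name>$2</name>" '
        index($0, key) { found = 1 }
        found && match($0, /<value>.*<\/value>/) {
            # Strip the 7-character <value> and 8-character </value> tags.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }' "$1"
}

conf_dir=/usr/local/hadoop-2.7.3/etc/hadoop
if [ -f "$conf_dir/core-site.xml" ]; then
    get_prop "$conf_dir/core-site.xml" fs.defaultFS
fi
```

For the value Hadoop itself resolves, hdfs getconf -confKey fs.defaultFS is the more reliable check once the environment variables are in place.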
7. Format the file system
hdfs namenode -format
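Formatting is meant to be done once. Running it again on an existing installation generates a new cluster ID, and DataNodes that still hold the old ID will refuse to start. A minimal guard, assuming the NameNode directory configured in hdfs-site.xml above, could look like this; needs_format is a hypothetical helper that relies on the current/VERSION file a formatted NameNode directory contains.

```shell
# Succeed when the NameNode directory $1 has not been formatted yet.
needs_format() {
    [ ! -f "$1/current/VERSION" ]
}

name_dir=/usr/local/hadoop-2.7.3/hadoopdata/name
if needs_format "$name_dir"; then
    echo "no metadata in $name_dir; formatting is safe"
    # hdfs namenode -format
else
    echo "$name_dir is already formatted; skipping"
fi
```

The format command is left commented out so the sketch only reports what it would do.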
8. Start the Hadoop cluster
cd /usr/local/hadoop-2.7.3/sbin
./start-all.sh
Check the cluster status
hdfs dfsadmin -report
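Another way to check the daemons is jps, which ships with the JDK and lists running Java processes. In a pseudo-distributed setup one would expect NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The sketch below reports which of those are absent; missing_daemons is a hypothetical helper that reads jps output on standard input.

```shell
# Read `jps` output ("<pid> <name>" per line) on stdin and print each expected daemon that is absent.
missing_daemons() {
    listing=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        echo "$listing" | awk -v d="$daemon" '$2 == d { hit = 1 } END { exit !hit }' \
            || echo "$daemon"
    done
}

if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
fi
```

Empty output means all five daemons are running; any name printed points to the log file to inspect under the logs directory of the Hadoop installation.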
Hadoop is now installed in single-node, pseudo-distributed mode. To confirm that the installation works, run one of the bundled examples:
cd /usr/local/hadoop-2.7.3/share/hadoop/mapreduce
hadoop jar hadoop-mapreduce-examples-2.7.3.jar pi 10 10
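The two arguments to the pi example are the number of map tasks and the number of samples per map, so the command above draws 10 x 10 = 100 points in total and the estimate will be coarse. The sketch below illustrates the idea locally with plain Monte Carlo sampling in awk. It is not Hadoop's implementation, which uses a quasi-random Halton sequence; estimate_pi and its optional seed argument are hypothetical.

```shell
# estimate_pi MAPS SAMPLES [SEED]: count random points that fall inside the unit quarter circle.
estimate_pi() {
    awk -v maps="$1" -v samples="$2" -v seed="${3:-1}" 'BEGIN {
        srand(seed)
        for (m = 0; m < maps; m++)            # one pass per simulated map task
            for (s = 0; s < samples; s++) {
                x = rand(); y = rand()
                if (x * x + y * y <= 1) inside++
            }
        # Combining the counts plays the role of the reduce step.
        printf "%.4f\n", 4 * inside / (maps * samples)
    }'
}

estimate_pi 10 10
```

Raising either argument, for example pi 10 1000, gives a closer estimate at the cost of a longer job.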