How to Configure HDFS on Ubuntu
Prerequisites
Before configuring HDFS on Ubuntu, ensure your system meets the following requirements:
- Java Installation: Hadoop requires Java (OpenJDK 8 or 11 is recommended). Install it using:
sudo apt update
sudo apt install openjdk-11-jdk
Verify the installation with java -version.
- Hadoop Download: Download the latest stable Hadoop version from the Apache Hadoop website (this guide uses 3.3.4). Extract it to a directory like /usr/local/, which requires sudo:
wget https://downloads.apache.org/hadoop/core/hadoop-3.3.4/hadoop-3.3.4.tar.gz
sudo tar -xzvf hadoop-3.3.4.tar.gz -C /usr/local/
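Because the Java requirement is stated in terms of major versions (8 or 11), a script can check it before going further. The helper below is a minimal sketch; the function name java_major_version is hypothetical, and it assumes the version string follows either the legacy "1.8.0_392" form or the modern "11.0.20" form.

```shell
# Extract the major Java version from a version string.
# Handles the legacy "1.8.0_392" scheme and the modern "11.0.20" scheme.
java_major_version() {
    jv="$1"
    case "$jv" in
        1.*)
            # Legacy scheme: drop the leading "1." so 1.8.0_392 becomes 8.0_392
            jv="${jv#1.}"
            ;;
    esac
    # Keep everything before the first dot: 11.0.20 -> 11, 8.0_392 -> 8
    echo "${jv%%.*}"
}

# Hypothetical usage: java -version prints to stderr, with the version in quotes.
# raw=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
# java_major_version "$raw"
```

The result can then be compared against 8 or 11 before continuing with the Hadoop setup.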
1. Configure Environment Variables
Set up Hadoop environment variables to access commands globally. Edit ~/.bashrc (or /etc/profile for system-wide access) and add:
export HADOOP_HOME=/usr/local/hadoop-3.3.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Apply changes with source ~/.bashrc.
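Hadoop's scripts also expect JAVA_HOME to be set, which the exports above do not cover; it is commonly set in $HADOOP_HOME/etc/hadoop/hadoop-env.sh. The sketch below shows one way to derive a candidate value from the java executable. The function name java_home_from_binary is hypothetical, and it assumes the usual <JAVA_HOME>/bin/java layout and a readlink that supports -f (as on Ubuntu).

```shell
# Derive a JAVA_HOME candidate from the path of a java executable by
# resolving symlinks and stripping the trailing /bin/java.
java_home_from_binary() {
    real_path=$(readlink -f "$1") || return 1
    dirname "$(dirname "$real_path")"
}

# Hypothetical usage (assumes java is on PATH):
# export JAVA_HOME=$(java_home_from_binary "$(command -v java)")
```

On Ubuntu the java command is usually a chain of symlinks managed by update-alternatives, which is why the path is resolved before the two directory levels are stripped.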
2. Core HDFS Configuration Files
Navigate to the Hadoop configuration directory ($HADOOP_HOME/etc/hadoop) and edit the following files:
a. core-site.xml
Defines the default file system and temporary directory. Add:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    <!-- For single-node (pseudo-distributed) mode; use 'hdfs://mycluster' for HA -->
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-3.3.4/tmp</value>
    <!-- Base directory for Hadoop's temporary data -->
  </property>
</configuration>
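If the configuration is provisioned by a script rather than edited by hand, the same file can be generated with a here-document. This is only a sketch: the function name write_core_site and its argument order are hypothetical, and it overwrites any existing core-site.xml in the target directory.

```shell
# Write a minimal core-site.xml into the given configuration directory.
# Arguments: <conf_dir> <default_fs_uri> <tmp_dir>
write_core_site() {
    conf_dir="$1"
    default_fs="$2"
    tmp_dir="$3"
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>$default_fs</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$tmp_dir</value>
  </property>
</configuration>
EOF
}

# Hypothetical usage with the values from this guide:
# write_core_site "$HADOOP_HOME/etc/hadoop" hdfs://localhost:9000 /usr/local/hadoop-3.3.4/tmp
```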
b. hdfs-site.xml
Configures HDFS-specific settings like replication and NameNode/DataNode directories. Add:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <!-- Replication factor (1 for a single node, 3 for production clusters) -->
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop-3.3.4/data/namenode</value>
    <!-- Directory for NameNode metadata -->
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop-3.3.4/data/datanode</value>
    <!-- Directory for DataNode data storage -->
  </property>
</configuration>
3. Create HDFS Data Directories
Create the directories specified in hdfs-site.xml and give the current user ownership of the Hadoop tree, so the daemons can also write to its tmp and logs directories (replace yourusername with your actual username):
sudo mkdir -p /usr/local/hadoop-3.3.4/data/namenode
sudo mkdir -p /usr/local/hadoop-3.3.4/data/datanode
sudo chown -R yourusername:yourusername /usr/local/hadoop-3.3.4
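Ownership mistakes on these directories tend to surface only later, when a daemon fails to start. A quick check like the sketch below can catch them early; the function name check_dirs_writable is hypothetical, and it tests access for whichever user runs it.

```shell
# Report every listed directory that is missing or not writable by the
# current user. Prints one line per problem and returns non-zero if any fail.
check_dirs_writable() {
    rc=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            rc=1
        elif [ ! -w "$dir" ]; then
            echo "not writable: $dir"
            rc=1
        fi
    done
    return "$rc"
}

# Hypothetical usage with the paths from hdfs-site.xml:
# check_dirs_writable /usr/local/hadoop-3.3.4/data/namenode /usr/local/hadoop-3.3.4/data/datanode
```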
4. Format the NameNode
The NameNode must be formatted before first use to initialize its metadata. Run:
hdfs namenode -format
This command creates the directory structure and metadata files under dfs.namenode.name.dir. Run it only once: reformatting an existing NameNode erases the file system metadata.
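Since formatting is meant to happen once, a setup script may want to skip it when the NameNode directory is already initialized. The sketch below relies on the fact that a formatted NameNode directory contains a current/VERSION file; the function name namenode_is_formatted is hypothetical.

```shell
# Return success if the NameNode directory already holds formatted metadata,
# detected by the presence of its current/VERSION file.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Hypothetical guard around the format step:
# if namenode_is_formatted /usr/local/hadoop-3.3.4/data/namenode; then
#     echo "NameNode already formatted; skipping"
# else
#     hdfs namenode -format
# fi
```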
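5. Start HDFS Services
Start the HDFS services (NameNode and DataNode) using:
start-dfs.sh
The script launches the daemons over SSH, so ssh localhost must work without a password prompt even on a single node (see step 7 for key setup).
Verify the services are running by checking for Hadoop processes:
jps
You should see NameNode, DataNode, and SecondaryNameNode listed.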
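The jps check can be automated by scanning its output for the expected daemon names. The following is a sketch under the assumption that jps prints one "<pid> <ClassName>" pair per line; the function name missing_hdfs_daemons is hypothetical.

```shell
# Print the name of each expected HDFS daemon that is absent from jps output.
# Argument: the text produced by jps (lines of "<pid> <ClassName>").
missing_hdfs_daemons() {
    jps_output="$1"
    for daemon in NameNode DataNode SecondaryNameNode; do
        # Compare against the whole second field so that "NameNode"
        # does not match "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Hypothetical usage:
# missing_hdfs_daemons "$(jps)"
```

An empty result means all three daemons are running; otherwise the log files under $HADOOP_HOME/logs are the usual place to look.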
6. Verify HDFS Functionality
- Web Interface: Open a browser and navigate to http://localhost:9870 (for Hadoop 3.x) to view the HDFS web interface.
- Command-Line Operations: Test HDFS commands to ensure functionality:
hdfs dfs -mkdir -p /user/yourusername # Create a directory (-p also creates /user)
hdfs dfs -put ~/testfile.txt /user/yourusername # Upload a file
hdfs dfs -ls /user/yourusername # List directory contents
7. Optional: Configure SSH for Cluster Nodes
If setting up a multi-node cluster, configure passwordless SSH login between nodes so the master node can start daemons on the others. Generate an SSH key on the master node and copy it to all slave nodes:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id slave1
ssh-copy-id slave2
Test the connection with ssh slave1 (replace slave1 with the actual hostname/IP of the slave node).
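With more than a couple of nodes, testing each connection by hand gets tedious. The loop below is a sketch that reports hosts where key-based login does not work. The function name check_passwordless_ssh and the SSH_CMD override are hypothetical (the override exists only so the function can be exercised without real hosts); BatchMode and ConnectTimeout are standard ssh options.

```shell
# Check non-interactive SSH access to each host and print the hosts that fail.
# SSH_CMD is a hypothetical override hook for testing; it defaults to ssh.
check_passwordless_ssh() {
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ! ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
            </dev/null >/dev/null 2>&1; then
            echo "$host"
        fi
    done
}

# Hypothetical usage with the hostnames from this guide:
# check_passwordless_ssh slave1 slave2
```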
8. Optional: High Availability (HA) Configuration
For production environments, configure HDFS HA to ensure fault tolerance. This involves:
- Setting up multiple NameNodes (active/standby).
- Configuring JournalNodes to store edit logs.
- Using ZooKeeper for failover management.
Refer to the Hadoop HA documentation for detailed steps.