What is the HDFS client | A comprehensive tutorial in 2023

HDFS, the Hadoop Distributed File System, is the primary storage layer used by Hadoop applications. In this tutorial, we will learn how to download, configure, and use the HDFS client. Let’s get started.

What is the HDFS client

The HDFS client is the Hadoop interface that allows users to interact with the Hadoop Distributed File System (HDFS). Hadoop ships several clients. The basic one is hdfs dfs, which runs file system commands such as listing, copying, and deleting files.

Another client is hdfs dfsadmin, which is used to perform administrative tasks on the Hadoop file system.
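
To make the difference between the two clients concrete, here is a minimal sketch that wraps one command from each. The function names and the HDFS_BIN variable are hypothetical and exist only for this example; HDFS_BIN is assumed to default to an hdfs launcher on the PATH.

```shell
# HDFS_BIN is a hypothetical override for this sketch; it defaults to the
# hdfs launcher found on the PATH.
HDFS_BIN="${HDFS_BIN:-hdfs}"

# File system client: list the contents of an HDFS directory (default: /).
list_hdfs_dir() {
  "$HDFS_BIN" dfs -ls "${1:-/}"
}

# Administration client: report basic file system information and statistics.
cluster_report() {
  "$HDFS_BIN" dfsadmin -report
}
```

Calling list_hdfs_dir /tmp would run hdfs dfs -ls /tmp, while cluster_report would run hdfs dfsadmin -report.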

HDFS client download

The HDFS client can be downloaded from the repository below:

http://apache.mirrors.pair.com/hadoop/common/

Select the version of the client you need to download. For this demo, I will install hadoop-3.3.1.

HDFS client install

To install the HDFS client, use the wget command to download the tar.gz archive:

wget http://apache.mirrors.pair.com/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
--2022-12-02 10:31:16--  http://apache.mirrors.pair.com/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz
Resolving apache.mirrors.pair.com (apache.mirrors.pair.com)... 216.92.2.131
Connecting to apache.mirrors.pair.com (apache.mirrors.pair.com)|216.92.2.131|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 605187279 (577M) [application/x-gzip]
Saving to: ‘hadoop-3.3.1.tar.gz’

100%[==================================================================================================================================================================>] 605,187,279 13.7MB/s   in 50s    

2022-12-02 10:32:07 (11.5 MB/s) - ‘hadoop-3.3.1.tar.gz’ saved [605187279/605187279]

Once the HDFS client is downloaded, go to the folder that contains the tar.gz archive and type the command below to extract it:

tar -xvzf hadoop-3.3.1.tar.gz
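
Before extracting, it can be worth checking that the download is intact. Apache release directories usually publish a .sha512 file next to each archive. The sketch below assumes you have copied the expected digest from that file by hand. The function name verify_sha512 is hypothetical, and the sha512sum tool from GNU coreutils is assumed to be installed.

```shell
# Compare a file's SHA-512 digest with an expected hex string.
# The expected value is assumed to be copied from the published .sha512 file.
verify_sha512() {
  local file="$1" expected actual
  # Normalize the expected digest to lowercase, since sha512sum prints lowercase.
  expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
  actual="$(sha512sum "$file" | awk '{print $1}')"
  if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
    echo "checksum OK: $file"
  else
    echo "checksum MISMATCH: $file" >&2
    return 1
  fi
}
```

A call would look like verify_sha512 hadoop-3.3.1.tar.gz <digest_from_sha512_file>, and a non-zero exit status means the archive should be downloaded again.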

HDFS client configuration

In this section, we will configure the HDFS client to interact with the file system.

Java configuration for the HDFS client

The user needs to set the JAVA_HOME path correctly to interact with HDFS. Make sure Java is installed on your system. The Java version can be found by typing:

java -version

Type the command below to check the JAVA_HOME path:

echo $JAVA_HOME

If the output is empty or points to the wrong location, set JAVA_HOME to the directory of your Java installation.
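
As a quick sanity check, the sketch below tests whether a directory looks like a usable Java installation, meaning that it contains an executable bin/java. The function name check_java_home is hypothetical and not part of Hadoop or the JDK.

```shell
# Succeeds only if the given directory (default: $JAVA_HOME) contains an
# executable bin/java.
check_java_home() {
  local home="${1:-${JAVA_HOME:-}}"
  if [ -n "$home" ] && [ -x "$home/bin/java" ]; then
    echo "JAVA_HOME looks valid: $home"
  else
    echo "JAVA_HOME is unset or has no bin/java: '$home'" >&2
    return 1
  fi
}
```

On many Linux systems, a candidate path can be found by running readlink -f "$(command -v java)" and dropping the trailing /bin/java from the result.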

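Configure Kerberos to interact with the HDFS client

If the Hadoop cluster is secured with Kerberos, the user needs to install the Kerberos client packages to interact with HDFS. On a yum-based system, type the command below to install them; krb5-workstation provides the kinit and klist commands used later in this tutorial.

yum -y install krb5-workstation krb5-libs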

Configure the ‘/etc/krb5.conf’ file with the realm and KDC details of your cluster.
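
For orientation, a minimal krb5.conf usually names a default realm and the KDC host for that realm. The sketch below writes such a file to a path of your choice. The function name write_krb5_conf_sketch is hypothetical, and the realm and host values in the usage example are placeholders. The real values must come from your cluster administrator, and a production file may need more settings than shown here.

```shell
# Write a minimal krb5.conf to the given path.
# Arguments: output path, realm name, KDC host name (all supplied by the caller).
write_krb5_conf_sketch() {
  local out="$1" realm="$2" kdc="$3"
  cat > "$out" <<EOF
[libdefaults]
  default_realm = $realm

[realms]
  $realm = {
    kdc = $kdc
    admin_server = $kdc
  }
EOF
}
```

For example, write_krb5_conf_sketch /tmp/krb5.conf.sketch EXAMPLE.COM kdc.example.com produces a file you can review before copying it to /etc/krb5.conf.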

Download the HDFS configuration files

The user needs to download the configuration files below from Cloudera Manager:

hadoop-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml

These files contain all the configuration needed to interact with the Hadoop cluster. To download them, follow the steps below:

  • Go to the Cloudera Manager Admin Console page
  • Go to the HDFS or Hive service
  • Select Actions > Download Client Configuration.

Unpack the downloaded archive and copy the configuration files into the directory /opt/hadoop_conf/.
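
The copy step can be sketched as a small helper, assuming the client configuration has already been unpacked into a local directory. The function name install_client_conf is hypothetical. It copies only the five files listed above and reports any that are missing.

```shell
# Copy the client configuration files from an unpacked download (first
# argument) into the target directory (second argument, default /opt/hadoop_conf).
install_client_conf() {
  local src="$1" dest="${2:-/opt/hadoop_conf}" name
  mkdir -p "$dest" || return 1
  for name in hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    if [ -f "$src/$name" ]; then
      cp "$src/$name" "$dest/" && echo "copied $name"
    else
      echo "missing $name in $src" >&2
    fi
  done
}
```

Writing to /opt usually requires root privileges, so the helper may need to be run with sudo or pointed at a directory you own.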

Set up environment variables for the HDFS client

Now the user needs to export the environment variables below before using the HDFS client. Adjust the JAVA_HOME value to match the Java installation on your system.

export KRB5_CONFIG="/etc/krb5.conf"
export HADOOP_CONF_DIR="/opt/hadoop_conf/"
export HADOOP_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf"
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.17.0.8-2.el7_9.x86_64
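
A typo in any of these variables tends to surface later as a confusing client error, so a quick check can save time. The sketch below is a hypothetical helper named check_hdfs_env. It assumes a secured cluster, so it also expects KRB5_CONFIG to point to an existing file.

```shell
# Report every environment variable that does not point where the client
# expects; return non-zero if any check fails.
check_hdfs_env() {
  local status=0
  [ -f "${KRB5_CONFIG:-}" ] ||
    { echo "KRB5_CONFIG does not point to a file" >&2; status=1; }
  [ -d "${HADOOP_CONF_DIR:-}" ] ||
    { echo "HADOOP_CONF_DIR is not a directory" >&2; status=1; }
  [ -f "${HADOOP_CONF_DIR:-}/core-site.xml" ] ||
    { echo "core-site.xml not found in HADOOP_CONF_DIR" >&2; status=1; }
  [ -d "${JAVA_HOME:-}" ] ||
    { echo "JAVA_HOME is not a directory" >&2; status=1; }
  return "$status"
}
```

Running check_hdfs_env after the exports prints nothing when everything is in place.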

Interact with Hadoop using the HDFS client

If the cluster is secured, run kinit to obtain a Kerberos ticket by typing the command below:

kinit -kt <keytab_name.keytab> <principal>

Verify that kinit succeeded by typing:

klist
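
In scripts, the two steps above can be combined so that kinit runs only when no valid ticket exists. The sketch below relies on klist -s, which in MIT Kerberos prints nothing and reports through its exit status whether the credentials cache holds a valid ticket. The function name ensure_ticket and the KLIST_BIN and KINIT_BIN overrides are hypothetical. The overrides are there only so the logic can be tried without a KDC.

```shell
# Obtain a Kerberos ticket from a keytab unless a valid one already exists.
# Arguments: keytab path, principal name.
ensure_ticket() {
  local keytab="$1" principal="$2"
  # klist -s is silent; exit status 0 means a valid ticket is in the cache.
  if "${KLIST_BIN:-klist}" -s; then
    echo "valid ticket found, skipping kinit"
    return 0
  fi
  "${KINIT_BIN:-kinit}" -kt "$keytab" "$principal"
}
```

A call would look like ensure_ticket <keytab_name.keytab> <principal>, with the same values you would pass to kinit directly.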

Now type the command below to list the files in the HDFS root directory:

<path_to_hdfs_client>/hadoop-3.3.1/bin/hdfs --config $HADOOP_CONF_DIR dfs -ls /
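
Typing the full launcher path and the --config option for every command gets repetitive, so a small wrapper can help. The sketch below is only one possible approach. The function names hdfs_client and upload_and_list and the HDFS_BIN variable are hypothetical. HDFS_BIN is assumed to hold the full path to bin/hdfs inside the extracted hadoop-3.3.1 folder, and HADOOP_CONF_DIR is assumed to be exported as described earlier.

```shell
# Run any hdfs subcommand with the client configuration directory applied.
# HDFS_BIN is a hypothetical variable; it falls back to hdfs on the PATH.
hdfs_client() {
  "${HDFS_BIN:-hdfs}" --config "$HADOOP_CONF_DIR" "$@"
}

# Create an HDFS directory, upload a local file into it, then list it.
# Arguments: local file path, HDFS directory.
upload_and_list() {
  local local_file="$1" hdfs_dir="$2"
  hdfs_client dfs -mkdir -p "$hdfs_dir" &&
    hdfs_client dfs -put -f "$local_file" "$hdfs_dir/" &&
    hdfs_client dfs -ls "$hdfs_dir"
}
```

With the wrapper in place, the listing command above becomes hdfs_client dfs -ls /. Note that -put -f overwrites an existing file with the same name in the target directory.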

Conclusion

I hope you liked this short tutorial about the HDFS client. Please let me know in the comment box if you face any issues while following along.
