Sunday, 3 August 2014

HDFS File Commands

Hadoop Distributed File System (HDFS) is the distributed storage component of the Hadoop framework. To work with HDFS, the user needs to know the Hadoop file system shell commands.
Hadoop commands take the form:
hadoop fs -command <args>

Every HDFS command is prefixed with 'hadoop fs', followed by a hyphen and the command name. The prefix tells the system that the user is accessing files in HDFS and not in the local file system (that of the underlying operating system).
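
The prefix rule can be illustrated with a small shell function. This is only a sketch: the name hfs is hypothetical and not part of Hadoop, and it assumes the 'hadoop' launcher is on the PATH of a machine configured for a cluster.

```shell
# hfs: hypothetical convenience wrapper (not shipped with Hadoop).
# It adds the 'hadoop fs' prefix and the hyphen in front of the command name.
# Usage: hfs <command> [args...]
hfs() {
    if [ "$#" -lt 1 ]; then
        echo "usage: hfs <command> [args...]" >&2
        return 1
    fi
    subcommand="$1"
    shift
    # Everything after the command name is passed through unchanged.
    hadoop fs "-$subcommand" "$@"
}
```

With this in place, 'hfs ls /user' would run 'hadoop fs -ls /user', while a plain 'ls /user' still lists the local file system.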

Let us see some commonly used HDFS commands:

·     copyFromLocal OR put: These commands copy files from the local file system (that of the underlying operating system) to HDFS.
Syntax: hadoop fs -copyFromLocal <local file path: source> <HDFS path: target>
Syntax: hadoop fs -put <local file path: source> <HDFS path: target>

·     copyToLocal OR get: These commands copy files from HDFS to the local file system (that of the underlying operating system).
Syntax: hadoop fs -copyToLocal [-ignorecrc] [-crc] <HDFS path: source> <local file path: target>
Syntax: hadoop fs -get [-ignorecrc] [-crc] <HDFS path: source> <local file path: target>

The optional [-ignorecrc] and [-crc] flags matter when copying files out of HDFS. HDFS stores checksum values for the data blocks of every file and, by default, uses them to validate the data as it is copied to the local file system. With -ignorecrc, files that fail the checksum check are copied anyway; with -crc, the checksums are copied to the local file system along with the files.
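
The three ways of handling checksums on download can be sketched as a shell function. The function name fetch_from_hdfs and its mode names (verify, skip, keep) are hypothetical, chosen only for illustration; the flags themselves are the ones listed in the syntax above.

```shell
# fetch_from_hdfs: hypothetical helper showing the checksum flags of -get.
# Usage: fetch_from_hdfs <verify|skip|keep> <HDFS source> <local target>
fetch_from_hdfs() {
    mode="$1"
    src="$2"
    dst="$3"
    case "$mode" in
        # Default behaviour: data is checked against its checksums while copying.
        verify) hadoop fs -get "$src" "$dst" ;;
        # Copy the file even if the checksum check fails.
        skip)   hadoop fs -get -ignorecrc "$src" "$dst" ;;
        # Copy the checksums to the local file system as well.
        keep)   hadoop fs -get -crc "$src" "$dst" ;;
        *)      echo "unknown mode: $mode" >&2
                return 1 ;;
    esac
}
```

For example, 'fetch_from_hdfs skip /user/demo/data.txt ./data.txt' (paths are placeholders) would be one way to salvage a file whose stored checksum no longer matches.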

·     ls: This command lists the files and directories at an HDFS path. For each entry it shows the permissions, replication factor (files only), owner, group, size, modification date and name.
Syntax: hadoop fs -ls <path>

·     lsr: This command lists all files and directories under a path recursively. Each entry shows the same information as ls. In Hadoop 2.x and later, lsr is deprecated in favor of 'hadoop fs -ls -R'.
Syntax: hadoop fs -lsr <path>

·     moveFromLocal: This command moves file(s) from the local file system to HDFS. The source file is deleted from the local file system once it has been copied to HDFS successfully.
Syntax: hadoop fs -moveFromLocal <local file path: source> <HDFS path: target>

·     rm: This command deletes files. To delete a directory together with its contents, use the recursive form (rmr).
Syntax: hadoop fs -rm <path>

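·     rmr: This command works like rm but deletes recursively, removing a directory together with everything under it. In Hadoop 2.x and later, rmr is deprecated in favor of 'hadoop fs -rm -r'.
Syntax: hadoop fs -rmr <path>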
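
As a rough sketch of how the commands above fit together, the following function uploads a local file, lists the target directory, downloads the file again and then removes it from HDFS. The function name hdfs_round_trip is hypothetical, and the sketch assumes that the HDFS directory already exists and that the user may write to it.

```shell
# hdfs_round_trip: hypothetical example combining put, ls, get and rm.
# Usage: hdfs_round_trip <local file> <existing HDFS directory>
hdfs_round_trip() {
    if [ "$#" -ne 2 ]; then
        echo "usage: hdfs_round_trip <local file> <HDFS directory>" >&2
        return 1
    fi
    local_file="$1"
    hdfs_dir="$2"
    name=$(basename "$local_file")

    # Each step runs only if the previous one succeeded.
    hadoop fs -put "$local_file" "$hdfs_dir/$name" &&   # local -> HDFS
    hadoop fs -ls "$hdfs_dir" &&                        # confirm the upload
    hadoop fs -get "$hdfs_dir/$name" "$local_file.copy" &&   # HDFS -> local
    hadoop fs -rm "$hdfs_dir/$name"                     # clean up on HDFS
}
```

A call such as 'hdfs_round_trip ./report.txt /user/demo' (both paths are placeholders) would leave a local file report.txt.copy behind and nothing on HDFS.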
