Additional Hadoop Filesystem Commands


| Command | Description |
| --- | --- |
| `hadoop fs -copyFromLocal <source> <destination>` | Copy from the local filesystem to HDFS |
| `hadoop fs -copyFromLocal file1 data` | e.g. copies `file1` from the local FS to the `data` directory in HDFS |
| `hadoop fs -copyToLocal <source> <destination>` | Copy from HDFS to the local filesystem |
| `hadoop fs -copyToLocal data/file1 /var/tmp` | e.g. copies `file1` from the HDFS `data` directory to `/var/tmp` on the local FS |
| `hadoop fs -put <source> <destination>` | Copy from the local filesystem to HDFS |
| `hadoop fs -get <source> <destination>` | Copy from HDFS to the local filesystem |
| `hadoop distcp hdfs://192.168.0.8:8020/input hdfs://192.168.0.8:8020/output` | Copy data from one cluster to another using the cluster URL |
| `hadoop fs -mv file:///data/datafile /user/hduser/data` | Move the data file from the local directory to HDFS |
| `hadoop fs -setrep -w 3 file1` | Set the replication factor for `file1` to 3 and wait for replication to complete |
| `hadoop fs -getmerge mydir bigfile` | Merge the files in the `mydir` directory and download them as one file, `bigfile` |


  1. hadoop fs -copyFromLocal <source> <destination> This command is used to copy a file from your local file system to HDFS.

    Example:

    ```bash
    hadoop fs -copyFromLocal /localpath/file.txt /user/hadoopuser/hdfsdir/
    ```

    In this example, the file named file.txt from the local file system is copied to the /user/hadoopuser/hdfsdir/ directory in HDFS.
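    By default the copy fails if the destination file already exists; the -f option (supported by both copyFromLocal and put) overwrites it. The paths below are illustrative, and a running HDFS is required:

    ```bash
    # Overwrite the existing HDFS copy of file.txt instead of failing.
    hadoop fs -copyFromLocal -f /localpath/file.txt /user/hadoopuser/hdfsdir/
    ```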

  2. hadoop fs -copyToLocal <source> <destination> This command is used to copy a file from HDFS to your local file system.

    Example:

    ```bash
    hadoop fs -copyToLocal /user/hadoopuser/hdfsfile.txt /localpath/
    ```

    Here, the file named hdfsfile.txt from HDFS is copied to the /localpath/ directory in your local file system.

  3. hadoop fs -put <source> <destination> The put command is used to copy one or more files from the local file system to HDFS.

    Example:

    ```bash
    hadoop fs -put /localpath/localfile.txt /user/hadoopuser/hdfsdir/
    ```

    This command copies localfile.txt from the local file system to the /user/hadoopuser/hdfsdir/ directory in HDFS.
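    The put command can also read from standard input when the source is given as `-`. The destination path below is illustrative, and a running HDFS is required:

    ```bash
    # Stream data from stdin straight into a new HDFS file.
    echo "a line of data" | hadoop fs -put - /user/hadoopuser/hdfsdir/from-stdin.txt
    ```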

  4. hadoop fs -get <source> <destination> The get command is used to copy a file or directory from HDFS to your local file system.

    Example:

    ```bash
    hadoop fs -get /user/hadoopuser/hdfsfile.txt /localpath/
    ```

    This command copies hdfsfile.txt from HDFS to the /localpath/ directory in your local file system.

  5. hadoop distcp <source> <destination> The distcp command is used to copy data between HDFS clusters.

    Example:

    ```bash
    hadoop distcp hdfs://cluster1/user/hadoopuser/data/ hdfs://cluster2/user/hadoopuser/copieddata/
    ```

    This command copies data from the /user/hadoopuser/data/ directory on cluster1 to the /user/hadoopuser/copieddata/ directory on cluster2.
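    A couple of commonly used distcp options, shown as a sketch (the cluster URLs are illustrative, and two live clusters are required to run this):

    ```bash
    # -update copies only files that are missing or differ at the target;
    # -p preserves file attributes such as replication and permissions.
    hadoop distcp -update -p hdfs://cluster1/user/hadoopuser/data/ hdfs://cluster2/user/hadoopuser/copieddata/
    ```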

  6. hadoop fs -mv <source> <destination> The mv command is used to move a file or directory within HDFS.

    Example:

    ```bash
    hadoop fs -mv /user/hadoopuser/sourcefile.txt /user/hadoopuser/destination/
    ```

    This command moves sourcefile.txt from /user/hadoopuser/ to /user/hadoopuser/destination/ within HDFS.

  7. hadoop fs -setrep -w 3 <file> The setrep command is used to set the replication factor for a file in HDFS.

    Example:

    ```bash
    hadoop fs -setrep -w 3 /user/hadoopuser/hdfsfile.txt
    ```

    This command sets the replication factor of hdfsfile.txt to 3 in HDFS; the -w flag makes the command wait until the new replication level has actually been achieved.
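    To check the factor afterward, hadoop fs -stat can report a file's replication via its %r format option (the path below is illustrative, and a running HDFS is required):

    ```bash
    # Print only the replication factor of the file.
    hadoop fs -stat "%r" /user/hadoopuser/hdfsfile.txt
    ```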

  8. hadoop fs -getmerge <srcDir> <destFile> The getmerge command is used to merge files in a source directory within HDFS and download the merged result as a single file to the local file system.

    Example:

    ```bash
    hadoop fs -getmerge /user/hadoopuser/sourcefiles/ /localpath/mergedfile.txt
    ```

    This command merges all files in /user/hadoopuser/sourcefiles/ within HDFS and saves the merged content as mergedfile.txt in the /localpath/ directory.
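    Since getmerge simply concatenates the source directory's files (in name order) into one local file, its effect can be sketched without a cluster; the /tmp paths below are illustrative stand-ins for the HDFS source, with plain cat standing in for the concatenation step:

    ```bash
    # Create two stand-in "part" files like those a MapReduce job writes.
    mkdir -p /tmp/sourcefiles
    printf 'first part\n'  > /tmp/sourcefiles/part-00000
    printf 'second part\n' > /tmp/sourcefiles/part-00001

    # getmerge concatenates the source files into a single local file;
    # cat over the file list in name order reproduces that behavior locally.
    cat /tmp/sourcefiles/part-00000 /tmp/sourcefiles/part-00001 > /tmp/mergedfile.txt
    cat /tmp/mergedfile.txt
    ```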

These examples should help you understand how each command works and how to use it effectively with Hadoop's HDFS.