mycluster is the source HDFS and cluster2 is the target HDFS. I want to copy the files under the tmp directory from mycluster to cluster2:
hadoop distcp hftp://mycluster/tmp hdfs://cluster2/tmp
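This command is run on a NameNode of mycluster. The local Hadoop configuration (defaultFS) lets it resolve mycluster to the corresponding hosts and IPs, but there is no configuration for cluster2. How does it know which IPs or hosts cluster2 corresponds to?
cluster2 should be the NameNode of the other cluster. It is enough to add cluster2 and its IP address to the hosts file on the machine where mycluster runs.
mycluster and cluster2 are both NameNodes, right?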
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>machine1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>machine2.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>machine1.example.com:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>machine2.example.com:50070</value>
</property>
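In the configuration above, mycluster is an HA nameservice ID declared in dfs.nameservices, not a hostname, so a hosts-file entry alone would probably not be enough for a logical name like cluster2. One way to let the client resolve it is to declare cluster2 as a second nameservice in the hdfs-site.xml used by the host that runs distcp. A minimal sketch, assuming cluster2 is also an HA pair with NameNode IDs nn1 and nn2; the hostnames machine3.example.com and machine4.example.com are hypothetical placeholders, not values from this thread:

```xml
<!-- Sketch: client-side hdfs-site.xml additions on the host that runs distcp. -->
<!-- The cluster2 NameNode IDs and hostnames below are assumed placeholders. -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster,cluster2</value>
</property>
<property>
  <name>dfs.ha.namenodes.cluster2</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cluster2.nn1</name>
  <value>machine3.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cluster2.nn2</name>
  <value>machine4.example.com:8020</value>
</property>
<!-- Tells the HDFS client how to find the active NameNode of cluster2. -->
<property>
  <name>dfs.client.failover.proxy.provider.cluster2</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

With both nameservices defined, a command such as hadoop distcp hdfs://mycluster/tmp hdfs://cluster2/tmp should be able to resolve both sides. If the same file is also read by the daemons of mycluster, it may be safer to keep these additions in a separate client configuration directory, or to set dfs.internal.nameservices to mycluster, so that local DataNodes do not try to register with cluster2.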
hadoop distcp hdfs://nn1:8020/src hdfs://nn2:8020/dest