DevTech101

A quick update on HDFS put/get/etc…

The simplest way to upload, download, or delete files in HDFS is WebHDFS, the REST API that HDFS exposes over HTTP.
Samples are below.

—————
Simple Example in one step.
Note: In this example, replace the node name with any of the 6 nodes [n01-n06]. Use straight double quotes around the URL so the shell does not interpret the & characters.
curl -L -i -X PUT -T source_file_name.pdf "http://n03.domain.com:50075/webhdfs/v1/user/usera/dest_file_name.pdf?op=CREATE&user.name=usera&namenoderpcaddress=bda-cluster1-ns&overwrite=false"

# Example
You have a source file: solaris_openstack.pdf
user.name: usera
file dest: /user/usera/solaris_openstack.pdf
—–
curl -L -i -X PUT -T solaris_openstack.pdf "http://n03.domain.com:50075/webhdfs/v1/user/usera/solaris_openstack.pdf?op=CREATE&user.name=usera&namenoderpcaddress=bda-cluster1-ns&overwrite=false"
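
The one-step upload can be wrapped in a small shell helper so the long URL is built in one place. This is only a sketch: the function names webhdfs_create_url and webhdfs_put and the WEBHDFS_NS variable are made up for illustration, and the port (50075) and nameservice (bda-cluster1-ns) are simply the values used in this post, so adjust them for your cluster.

```shell
# Hypothetical helpers, not part of Hadoop or curl.

# Build the CREATE URL for a destination path.
#   $1 = node host, $2 = HDFS user, $3 = absolute HDFS destination path
# WEBHDFS_NS overrides the nameservice; it defaults to the one used in this post.
webhdfs_create_url() {
    printf 'http://%s:50075/webhdfs/v1%s?op=CREATE&user.name=%s&namenoderpcaddress=%s&overwrite=false\n' \
        "$1" "$3" "$2" "${WEBHDFS_NS:-bda-cluster1-ns}"
}

# Upload a local file in one step.
#   $1 = local file, $2 = node host, $3 = HDFS user, $4 = absolute HDFS destination path
webhdfs_put() {
    curl -L -i -X PUT -T "$1" "$(webhdfs_create_url "$2" "$3" "$4")"
}
```

Usage would look like: webhdfs_put solaris_openstack.pdf n03.domain.com usera /user/usera/solaris_openstack.pdf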
===========

Example in two steps (to better understand).
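You first ask the namenode where to write; it answers with a 307 redirect whose Location header points at a datanode. Then you upload the file to that Location.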

# Step one
curl -i -X PUT "http://n01.domain.com:50070/webhdfs/v1/user/usera/solaris_openstack.pdf?user.name=usera&op=CREATE"
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Thu, 31 Dec 2015 21:39:55 GMT
Date: Thu, 31 Dec 2015 21:39:55 GMT
Pragma: no-cache
Expires: Thu, 31 Dec 2015 21:39:55 GMT
Date: Thu, 31 Dec 2015 21:39:55 GMT
Pragma: no-cache
Set-Cookie: hadoop.auth="u=usera&p=usera&t=simple&e=1451633995826&s=37CizOVn2LeT+LAKAHuewMFWctU="; Path=/; Expires=Fri, 01-Jan-2016 07:39:55 GMT; HttpOnly
Location: http://n03.domain.com:50075/webhdfs/v1/user/usera/solaris_openstack.pdf?op=CREATE&user.name=usera&namenoderpcaddress=bda-cluster1-ns&overwrite=false
Content-Type: application/octet-stream
Content-Length: 0
Server: Jetty(6.1.26.cloudera.4)

# Step two: upload the file to the Location returned in step one.
curl -i -T solaris_openstack.pdf "http://n03.domain.com:50075/webhdfs/v1/user/usera/solaris_openstack.pdf?op=CREATE&user.name=usera&namenoderpcaddress=bda-cluster1-ns&overwrite=false"
HTTP/1.1 100 Continue

HTTP/1.1 201 Created
Cache-Control: no-cache
Expires: Thu, 31 Dec 2015 21:40:22 GMT
Date: Thu, 31 Dec 2015 21:40:22 GMT
Pragma: no-cache
Expires: Thu, 31 Dec 2015 21:40:22 GMT
Date: Thu, 31 Dec 2015 21:40:22 GMT
Pragma: no-cache
Location: webhdfs://bda-cluster1-ns/user/usera/solaris_openstack.pdf
Content-Type: application/octet-stream
Content-Length: 0
Server: Jetty(6.1.26.cloudera.4)
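
The two steps above can also be scripted, so the Location header does not have to be copied by hand. A minimal sketch follows, assuming the namenode listens on port 50070 as in this post; the function names extract_location and webhdfs_put_two_step are hypothetical.

```shell
# Hypothetical helpers, not part of Hadoop or curl.

# Read HTTP response headers on stdin and print the value of the first
# Location header. Carriage returns are stripped because HTTP headers
# end in CRLF.
extract_location() {
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Upload in two steps.
#   $1 = local file, $2 = namenode host, $3 = HDFS user, $4 = absolute HDFS destination path
webhdfs_put_two_step() {
    # Step one: ask the namenode where to write (no file data is sent yet).
    location=$(curl -s -i -X PUT \
        "http://$2:50070/webhdfs/v1$4?user.name=$3&op=CREATE" | extract_location)
    if [ -z "$location" ]; then
        echo "no Location header returned by $2" >&2
        return 1
    fi
    # Step two: send the file to the returned Location.
    curl -i -T "$1" "$location"
}
```

Usage would look like: webhdfs_put_two_step solaris_openstack.pdf n01.domain.com usera /user/usera/solaris_openstack.pdf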
————-
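
The intro mentions download and delete as well. WebHDFS covers those with the OPEN, DELETE and LISTSTATUS operations, and a rough sketch in the same style as the upload examples is below. The function names are hypothetical, and the namenode host and port (n01.domain.com:50070) are assumed to match the ones used above.

```shell
# Hypothetical helpers, not part of Hadoop or curl.

# Build a namenode WebHDFS URL.
#   $1 = namenode host, $2 = HDFS user, $3 = absolute HDFS path, $4 = operation
webhdfs_nn_url() {
    printf 'http://%s:50070/webhdfs/v1%s?user.name=%s&op=%s\n' "$1" "$3" "$2" "$4"
}

# Download: OPEN redirects to a datanode, so -L is needed to follow it.
#   $1 = namenode host, $2 = HDFS user, $3 = HDFS path, $4 = local output file
webhdfs_get() {
    curl -L -o "$4" "$(webhdfs_nn_url "$1" "$2" "$3" OPEN)"
}

# Delete a file.
#   $1 = namenode host, $2 = HDFS user, $3 = HDFS path
webhdfs_rm() {
    curl -i -X DELETE "$(webhdfs_nn_url "$1" "$2" "$3" DELETE)"
}

# List a directory (the response is JSON).
#   $1 = namenode host, $2 = HDFS user, $3 = HDFS path
webhdfs_ls() {
    curl -i "$(webhdfs_nn_url "$1" "$2" "$3" LISTSTATUS)"
}
```

Usage would look like: webhdfs_get n01.domain.com usera /user/usera/solaris_openstack.pdf copy.pdf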

# Full docs are available here
https://hadoop.apache.org/docs/r1.0.4/webhdfs.html

# A very helpful site for understanding WebHDFS
Hadoop REST API – WebHDFS

# HDFS / Kerberos SPNEGO
https://my.vertica.com/docs/7.1.x/HTML/Content/Authoring/HadoopIntegrationGuide/HDFSConnector/TestingYourHadoopWebHDFSConfiguration.htm
