IBM Support

Changing permissions of Hive warehouse directory for IOP 4.2.5 - Hadoop Dev

Technical Blog Post


Body

Overview

Hive stores the data for managed tables in the Hive warehouse directory in HDFS. The location of this directory is configured in hive-site.xml with the property hive.metastore.warehouse.dir. In IOP 4.2.5, the predefined location is /apps/hive/warehouse.
By default, the directory is owned by the hive user and its permission is set to 770, which gives the hive user and members of the hadoop group full access to the warehouse directory:

drwxrwx---   - hive hadoop          0 2017-03-22 18:00 /apps/hive/warehouse

Depending on the customer's implementation and authorization policies, the default warehouse directory permission (770) may not be appropriate. For example, an administrator may want to allow all users to read and write Hive data (777), or to allow only the owner and members of the hadoop group to write while all other users can only read Hive data (775).
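To make the three modes mentioned here easier to compare, the following minimal sketch translates an octal mode into the permission string that hadoop fs -ls prints. The function name mode_to_symbolic is hypothetical and is not part of Hadoop or Ambari; the sketch only looks at the owner, group and other digits and ignores any special bits.

```shell
# Hypothetical helper (not part of Hadoop or Ambari): translate an octal
# mode such as 770 or 0777 into the rwx string shown by `hadoop fs -ls`.
mode_to_symbolic() {
  mode=$1
  case "$mode" in
    [0-7][0-7][0-7]) ;;                            # e.g. 770
    [0-7][0-7][0-7][0-7]) mode=${mode#?} ;;        # e.g. 0777: drop the leading digit
    *) echo "invalid mode: $1" >&2; return 1 ;;
  esac
  out=""
  # Consume one digit at a time: owner, then group, then other.
  while [ -n "$mode" ]; do
    rest=${mode#?}
    digit=${mode%"$rest"}
    case "$digit" in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
    esac
    mode=$rest
  done
  printf '%s\n' "$out"
}
```

For example, mode_to_symbolic 775 prints rwxrwxr-x: the owner and the group can read, write and list the directory, while all other users can only read and list it.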

Running a hadoop fs command to modify the permissions of the warehouse directory, such as:

 hadoop fs -chmod 777 /apps/hive/warehouse

will work, but the permissions will no longer be managed by Ambari, which risks breaking the cluster's authorization policies.

The only way to manage these permissions through Ambari was to modify the hardcoded default value in Ambari's Python script, which is not a desirable solution.

In IOP 4.2.5, we have introduced a Hive custom property as a temporary way to control the permission of the Hive warehouse directory while a more robust solution is being worked out.

Changing Hive warehouse directory permissions

To control the /apps/hive/warehouse permissions through Ambari configuration and avoid modifying Python code, follow these steps:

1. From the Ambari UI, browse to the Hive advanced configuration screen.

2. Scroll down to the Custom hive-site section and click Add Property.

3. Add a new custom hive-site property whose key is custom.hive.warehouse.mode and whose value is the permission for the Hive warehouse directory. For example, to allow all users to read and write Hive data, add the following key and value:

custom.hive.warehouse.mode=0777

4. Add the property and save the configuration changes.

5. In IOP 4.2.5 (Ambari 2.4.2), hive.metastore.warehouse.dir is one of the HDFS paths that Ambari treats as immutable. To allow Ambari to make changes to this directory, add it to the managed_hdfs_resource_property_names list in the cluster-env configuration type.

A simple way to modify configurations in the Ambari framework is to use the configs.sh tool:

/var/lib/ambari-server/resources/scripts/configs.sh

Usage: configs.sh [-u userId] [-p password] [-port port] <ACTION> <AMBARI_HOST> <CLUSTER_NAME> <CONFIG_TYPE> [CONFIG_FILENAME | CONFIG_KEY [CONFIG_VALUE]]

For example, in our case, to add hive-site/hive.metastore.warehouse.dir to cluster-env, run the following command:

/var/lib/ambari-server/resources/scripts/configs.sh set localhost IOP43 cluster-env "managed_hdfs_resource_property_names" "hive-site/hive.metastore.warehouse.dir"

Note that if the userId, password and port differ from the Ambari defaults (admin, admin and 8080), they must be passed as parameters in the script call.

6. From the Ambari UI, restart Hive as required to apply the permission change.

7. After the new property has been added and HiveServer2 has been restarted, the permissions of the warehouse directory should show the expected new value:

[hive@ root]$ hadoop fs -ls /apps/hive
drwxrwxrwx   - hive hadoop          0 2017-03-22 18:02 /apps/hive/warehouse
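Steps 5 and 7 can be scripted along the following lines. This is only a sketch under stated assumptions: the helper names build_managed_paths_cmd and ls_line_has_mode are hypothetical, the first helper only prints the configs.sh call (using the -u, -p and -port options from the usage line above) rather than running it, and the second inspects a single line of hadoop fs -ls output that you pass to it.

```shell
# Path of the configs.sh tool, as given in step 5.
CONFIGS_SH=/var/lib/ambari-server/resources/scripts/configs.sh

# Hypothetical helper: print (do not run) the configs.sh call from step 5.
# Arguments: ambari_host cluster_name [user [password [port]]]
# The credentials and port fall back to the Ambari defaults named in the text.
build_managed_paths_cmd() {
  host=$1
  cluster=$2
  user=${3:-admin}
  password=${4:-admin}
  port=${5:-8080}
  printf '%s -u %s -p %s -port %s set %s %s cluster-env "managed_hdfs_resource_property_names" "hive-site/hive.metastore.warehouse.dir"\n' \
    "$CONFIGS_SH" "$user" "$password" "$port" "$host" "$cluster"
}

# Hypothetical helper: succeed if one line of `hadoop fs -ls` output
# carries the expected permission string (for example rwxrwxrwx).
ls_line_has_mode() {
  line=$1
  expected=$2
  field=${line%% *}     # first column, e.g. drwxrwxrwx
  perms=${field#?}      # drop the leading file-type character (d or -)
  perms=${perms%+}      # drop the trailing + that is shown when ACLs are set
  [ "$perms" = "$expected" ]
}
```

Review the printed command before running it, for example build_managed_paths_cmd localhost IOP43. After Hive has been restarted, a check such as ls_line_has_mode "$(hadoop fs -ls -d /apps/hive/warehouse)" rwxrwxrwx returns success only if the directory has the new permissions.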

     
