Hive stores the data for managed tables in the Hive warehouse directory in HDFS, which is configured in hive-site.xml with the property hive.metastore.warehouse.dir. In IOP 4.2.5, the predefined location is /apps/hive/warehouse.
By default, the directory is owned by the hive user and its permissions are set to 770, which gives the hive user and members of the hadoop group full access to the warehouse directory:
drwxrwx---   - hive hadoop          0 2017-03-22 18:00 /apps/hive/warehouse
Depending on the customer's implementation and authorization policies, the default warehouse directory permission (770) may not be appropriate. For example, an administrator may want to allow all users to read and write Hive data (777), or to allow only the owner and members of the hadoop group to write while all other users can read Hive data (775).
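To make the three modes mentioned above easier to compare, the following is a minimal sketch of a POSIX shell helper that converts an octal mode into the rwx string that hadoop fs -ls prints after the leading file-type character. The function name mode_to_symbolic is hypothetical and is not part of Hadoop, Hive, or Ambari.

```shell
#!/bin/sh
# mode_to_symbolic is a hypothetical helper, not a Hadoop or Ambari command.
# It converts an octal mode such as 770 into the rwx string that
# `hadoop fs -ls` shows after the leading file-type character.
mode_to_symbolic() {
  mode="$1"
  out=""
  while [ -n "$mode" ]; do
    rest="${mode#?}"          # everything after the first digit
    digit="${mode%"$rest"}"   # the first digit itself
    case "$digit" in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
      *) echo "invalid octal digit: $digit" >&2; return 1 ;;
    esac
    mode="$rest"
  done
  printf '%s\n' "$out"
}
```

For example, mode_to_symbolic 770 prints rwxrwx---, which matches the default listing shown earlier, and mode_to_symbolic 775 prints rwxrwxr-x.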
Running a hadoop fs command to modify the permissions of the warehouse directory, such as:
hadoop fs -chmod 777 /apps/hive/warehouse
will work, but the permissions will no longer be managed by Ambari, which risks breaking the cluster's authorization policies.
Previously, the only way to manage these permissions through Ambari was to modify the hardcoded default value in Ambari’s Python script, which is not a desirable solution.
In IOP 4.2.5, we have temporarily introduced a Hive custom property to control the permission of the Hive warehouse directory while a robust solution is being worked out.
Changing Hive warehouse directory permissions
To control the /apps/hive/warehouse permissions through Ambari configurations without modifying Python code, follow these steps:
1. From the Ambari UI, browse to the Hive advanced configurations screen.
2. Scroll down to the Custom hive-site section and click on Add Property.
3. Add a new custom hive-site property whose key is custom.hive.warehouse.mode and whose value is the permission for the Hive warehouse directory. For example, to allow all users to read and write Hive data, set the key to custom.hive.warehouse.mode and the value to 777.
4. Add the property and save configuration changes.
5. In IOP 4.2.5 (Ambari 2.4.2), hive.metastore.warehouse.dir is one of the immutable HDFS paths for Ambari, so to allow Ambari to make changes to this directory we need to add it to the managed_hdfs_resource_property_names list in the cluster-env configuration type.
A simple way to modify configurations in the Ambari framework is to use the configs.sh tool:
Usage: configs.sh [-u userId] [-p password] [-port port] <ACTION> <AMBARI_HOST> <CLUSTER_NAME> <CONFIG_TYPE> [CONFIG_FILENAME | CONFIG_KEY [CONFIG_VALUE]]
/var/lib/ambari-server/resources/scripts/configs.sh set localhost IOP43 cluster-env "managed_hdfs_resource_property_names" "hive-site/hive.metastore.warehouse.dir"
Note that if the userId, password, and port differ from the Ambari defaults (admin, admin, and 8080), they must be passed as parameters in the script call.
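As an illustration of passing those parameters, here is a minimal sketch that wraps the configs.sh call from this step in a shell function. The function name set_managed_hdfs_resources, the RUN switch, and the environment variable names are hypothetical; the option names (-u, -p, -port) come from the usage line above, and the default values shown are the ones used in this article, so replace them with your own.

```shell
#!/bin/sh
# Hypothetical defaults taken from this article; override them in the environment.
AMBARI_USER="${AMBARI_USER:-admin}"
AMBARI_PASSWORD="${AMBARI_PASSWORD:-admin}"
AMBARI_PORT="${AMBARI_PORT:-8080}"
AMBARI_HOST="${AMBARI_HOST:-localhost}"
CLUSTER_NAME="${CLUSTER_NAME:-IOP43}"
CONFIGS_SH=/var/lib/ambari-server/resources/scripts/configs.sh

# set_managed_hdfs_resources is a hypothetical wrapper.
# It always prints the configs.sh command line; it only executes it when RUN=1,
# so the command can be reviewed before it changes the cluster configuration.
set_managed_hdfs_resources() {
  set -- "$CONFIGS_SH" -u "$AMBARI_USER" -p "$AMBARI_PASSWORD" -port "$AMBARI_PORT" \
    set "$AMBARI_HOST" "$CLUSTER_NAME" cluster-env \
    managed_hdfs_resource_property_names \
    hive-site/hive.metastore.warehouse.dir
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}
```

Calling set_managed_hdfs_resources on its own only prints the command; calling it with RUN=1 in the environment, on the Ambari server host, also executes it.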
6. From the Ambari UI, restart Hive as required to apply the permissions change.
7. After the new property has been added and HiveServer2 has been restarted, the permissions of the warehouse directory should have changed to the expected new value:
[hive@ root]$ hadoop fs -ls /apps/hive
drwxrwxrwx   - hive hadoop          0 2017-03-22 18:02 /apps/hive/warehouse
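The final check can also be scripted. The sketch below assumes that the hadoop client is on the PATH and that the listing has the same column layout as the output shown above. The function names perms_from_ls_line and check_warehouse_perms are hypothetical and are not part of Hadoop or Ambari.

```shell
#!/bin/sh
# perms_from_ls_line is a hypothetical helper.
# It takes one line of `hadoop fs -ls` output and prints the permission
# string without the leading file-type character (for example rwxrwxrwx).
perms_from_ls_line() {
  first_field="${1%% *}"             # e.g. drwxrwxrwx
  printf '%s\n' "${first_field#?}"   # drop the leading 'd' or '-'
}

# check_warehouse_perms is a hypothetical helper.
# It compares the actual permissions of the warehouse directory with the
# expected rwx string. Assumes the `hadoop` client is installed.
check_warehouse_perms() {
  expected="$1"
  dir="${2:-/apps/hive/warehouse}"
  # -d lists the directory entry itself; tail keeps only the entry line.
  line="$(hadoop fs -ls -d "$dir" | tail -n 1)"
  actual="$(perms_from_ls_line "$line")"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $dir is $actual"
  else
    echo "MISMATCH: $dir is $actual, expected $expected" >&2
    return 1
  fi
}
```

For example, after setting the mode to 777 and restarting Hive, check_warehouse_perms rwxrwxrwx should report OK if Ambari applied the change.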