This blog provides an overview of the Ranger and Ranger KMS services and how they can be used to protect IOP 4.2 clusters.
Ranger is a centralized Hadoop security manager with its own web-based administration console. In IOP 4.2, it manages access control lists (ACLs) and user and group information for multiple Hadoop components, including HDFS, YARN, Hive, HBase, and Knox. Ranger also provides auditing for these components. The audit logs can be written to a database or to HDFS, and users can search them for the events they are interested in. Ranger can be configured to manage ACLs for multiple clusters.
Currently, Ranger 0.5.2 supports granting access to users but not denying it. Administrators can therefore grant additional access on top of the native Hadoop component ACLs. Access can be granted to a group or to an individual user, and the users can come from an external source such as LDAP/AD or a UNIX system, depending on the needs of the organization. Ranger provides access control through “policies”. It can define and store multiple security policies for each Hadoop component, and any policy can be turned on or off as needed. Ranger stores the policies, audit logs, and user information in an external database that is configured during installation.
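The grant-only model described above can be sketched in a few lines of Python. This is a minimal illustration, not Ranger's actual schema or code: the `Policy` class, its fields, the `covers` and `is_granted` helpers, and the sample paths, users, and groups are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical, simplified stand-in for a policy (not Ranger's schema)."""
    name: str
    resource: str                      # e.g. an HDFS directory
    permissions: set
    users: set = field(default_factory=set)
    groups: set = field(default_factory=set)
    enabled: bool = True               # policies can be switched on or off


def covers(policy_resource, requested):
    """True if the requested path is the policy path or lies beneath it."""
    base = policy_resource.rstrip("/")
    return requested == base or requested.startswith(base + "/")


def is_granted(policies, user, user_groups, resource, permission):
    """Return True if any enabled policy grants the permission.

    Policies can only grant. False means "not granted by a policy",
    not "denied": the component's native ACLs may still allow access.
    """
    for p in policies:
        if not p.enabled:
            continue
        if not covers(p.resource, resource):
            continue
        if permission not in p.permissions:
            continue
        if user in p.users or p.groups & set(user_groups):
            return True
    return False


policies = [
    Policy("sales-read", "/data/sales", {"read"}, groups={"analysts"}),
    Policy("etl-write", "/data/sales", {"write"}, users={"etl"}, enabled=False),
]

print(is_granted(policies, "alice", ["analysts"], "/data/sales/2016", "read"))
print(is_granted(policies, "etl", [], "/data/sales", "write"))
```

The second policy is disabled, so it grants nothing until it is turned back on.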
Hadoop already has its own Key Management Server (KMS) implementation. Ranger’s KMS is another implementation of KMS for Hadoop to provide more secure and efficient key storage along with auditing capabilities. The existing Hadoop keys can be imported to Ranger KMS and used seamlessly. Ranger KMS can be accessed through the same Ranger web interface through which a key administrator can create keys and manage them.
Ranger in IOP 4.2 has three components:
- Ranger admin: provides the web interface and manages repositories and policies for individual Hadoop components.
- Ranger usersync: synchronizes users from LDAP/AD or a UNIX system so that they can be used in access controls.
- Ranger KMS: provides more secure, easier-to-manage key storage for the Hadoop ecosystem.
In IOP 4.2, Ranger and Ranger KMS are provided as two separate services that can be installed through Ambari. You can install Ranger alone or install both. However, Ranger must be installed before Ranger KMS can be installed and used, and Ranger KMS needs to be installed on a Kerberized cluster.
Ranger support in the Hadoop components is provided through lightweight Ranger “plug-ins”. A plug-in can be enabled or disabled for each individual component. Once a plug-in is enabled for a component, a default “repository” is created for that component based on the configuration, along with a default policy (if configured). Refer to the Ranger documentation in the Knowledge Center for information about the fields in the Ranger configuration.
Once the repository is created, you can add, update, and delete policies through the Ranger admin web interface. Fine-grained access control gives you flexibility in defining policies:
- on the folder and file level, via HDFS
- on the database, table, and column level, and on UDFs, via Hive
- on the table, column family, and column level, via HBase
- on YARN job ACLs, managed in Ranger
- on Knox service-level authorization, managed in Ranger through Knox topology files
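To make the resource levels in the list above concrete, here is a small Python sketch of matching a request against a multi-level resource. It is an illustration only: the dictionary layout, the `matches` helper, the sample database, table, and column names, and the use of shell-style wildcards are assumptions, not Ranger's policy format.

```python
from fnmatch import fnmatchcase

# Hypothetical resource definitions, one entry per level of the hierarchy.
HIVE_RESOURCE = {"database": "sales", "table": "orders", "column": "*"}
HBASE_RESOURCE = {"table": "events", "column-family": "cf1", "column": "user_*"}


def matches(policy_resource, requested):
    """True if every level of the request matches the policy's pattern."""
    return all(
        fnmatchcase(requested.get(level, ""), pattern)
        for level, pattern in policy_resource.items()
    )


print(matches(HIVE_RESOURCE,
              {"database": "sales", "table": "orders", "column": "amount"}))
print(matches(HIVE_RESOURCE,
              {"database": "sales", "table": "customers", "column": "name"}))
```

The finer the level at which a resource is defined, the narrower the set of requests a single policy covers.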
Access controls can be applied to users from LDAP/AD or a UNIX system. Users can be assigned the “Ranger Admin” role so that they can log in to the Ranger web interface and create, modify, and delete policies. Users and groups can also be created directly in Ranger’s web interface and then given access through policies. However, there is not yet a mechanism to delete users from Ranger once they have been synced, even if they are later deleted or dropped in the external system.
In the Hadoop ecosystem, security policies are enforced by Ranger plug-ins, which run within the same process as the component they protect: the NameNode (HDFS), HiveServer2 (Hive), the HBase server (HBase), and the Knox server (Knox). There is therefore no additional OS-level process to manage. The plug-ins pull policy changes through a REST API at a configured interval (for example, every 30 seconds). A plug-in continues to function even if the policy server (Ranger admin) is temporarily down, enforcing authorization from its latest local copy of the policies. The plug-ins also collect the access request details required for auditing.
The security policies are independent of the native permissions of each component. However, by default Ranger validates access against the native Hadoop file-level permissions when the Ranger policies do not cover the requested access to an HDFS entry.
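The refresh-and-fallback behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the plug-in's real code: `PolicyCache`, `authorize`, `fake_fetch`, the policy dictionary layout, and the cache file name are all hypothetical, and the REST call is replaced by a stand-in function so the example runs on its own.

```python
import json
import os
import tempfile


class PolicyCache:
    """Hypothetical sketch of a plug-in's policy refresh with a local copy."""

    def __init__(self, fetch, cache_path):
        self.fetch = fetch            # callable that returns policies, may raise
        self.cache_path = cache_path  # where the latest copy is kept on disk
        self.policies = []

    def refresh(self):
        """Intended to be called on a timer, e.g. every 30 seconds."""
        try:
            fetched = self.fetch()
        except OSError:
            # Policy server unreachable: keep what we have, or load the
            # last copy saved to disk if nothing is in memory yet.
            if not self.policies and os.path.exists(self.cache_path):
                with open(self.cache_path) as f:
                    self.policies = json.load(f)
            return self.policies
        self.policies = fetched
        with open(self.cache_path, "w") as f:
            json.dump(fetched, f)
        return self.policies


def authorize(policies, native_check, user, path, access):
    """Grant if a policy allows it; otherwise defer to native permissions."""
    for p in policies:
        if (path.startswith(p["path"]) and user in p["users"]
                and access in p["accesses"]):
            return True
    return native_check(user, path, access)


# Stand-in for the REST call to the policy server (an assumption).
state = {"up": True}


def fake_fetch():
    if not state["up"]:
        raise ConnectionError("policy server unreachable")
    return [{"path": "/data/sales", "users": ["alice"], "accesses": ["read"]}]


cache_path = os.path.join(tempfile.mkdtemp(), "policies.json")
cache = PolicyCache(fake_fetch, cache_path)
cache.refresh()                      # server up: policies fetched and saved

state["up"] = False                  # simulate an outage and a restart
restarted = PolicyCache(fake_fetch, cache_path)
restarted.refresh()                  # falls back to the copy on disk
print(authorize(restarted.policies, lambda u, p, a: False,
                "alice", "/data/sales/q1", "read"))
```

Keeping the last good copy on disk is what lets enforcement continue across both a server outage and a restart of the component.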
Ranger provides extensive auditing of user access to the system. Audit records can be written to a database and to HDFS, and capture details such as:
- IP address
- Resource/resource type
- Access granted or denied
The Ranger audit logs can be viewed through the Ranger administration console only if they are written to the database. Audit records written to HDFS are stored in JSON format and can later be processed by other applications, such as Apache Hive, for querying and reporting.
I hope this blog has given you an overview of Ranger. For detailed instructions on how to configure and set up Ranger and its plug-ins, see the IBM BigInsights 4.2 documentation in the Knowledge Center. You can also watch the Ranger installation and configuration for IOP 4.2 in the YouTube video “Ranger install in IOP 4.2”.
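As a rough illustration of processing JSON audit records outside the console, the Python sketch below counts denied requests per user. The sample records and the field names (`reqUser`, `resource`, `resType`, `access`, `result`, `cliIP`) are assumptions made for this example, as is the convention that a `result` of 0 means denied; check the records your cluster actually writes before relying on them.

```python
import json
from collections import Counter

# Hypothetical audit records, one JSON object per line.
SAMPLE = """\
{"reqUser": "alice", "resource": "/data/sales", "resType": "path", "access": "read", "result": 1, "cliIP": "10.0.0.5"}
{"reqUser": "bob", "resource": "/data/hr", "resType": "path", "access": "write", "result": 0, "cliIP": "10.0.0.9"}
{"reqUser": "bob", "resource": "/data/hr", "resType": "path", "access": "read", "result": 0, "cliIP": "10.0.0.9"}
"""


def denied_by_user(lines):
    """Count denied requests per user (assumes result == 0 means denied)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record["result"] == 0:
            counts[record["reqUser"]] += 1
    return counts


denied = denied_by_user(SAMPLE.splitlines())
print(dict(denied))
```

The same kind of aggregation could be expressed as a query in a tool such as Apache Hive once the files are exposed as a table.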