Overview

Hue is a set of web applications that enable users to interact with a Hadoop cluster through a web UI. It provides applications to create Oozie workflows, run Hive queries, access HBase, run Spark programs, access HDFS and Hadoop job information, and more.
This article describes how to install and configure Hue alongside a cluster based on IBM Open Platform with Apache Hadoop (IOP) 4.1. In addition, the article provides a script that automates the configuration of Hue to work with the IOP cluster.

Installation

Prerequisites

Hue needs to be downloaded and installed on a node where the Hadoop and Hive configurations are available.

Download the Hue 3.9 tarball from http://gethue.com/hue-3-9-with-all-its-improvements-is-out/.

The following dependencies are required to install and run Hue. On RHEL/CentOS distributions, they can be installed in one step via yum:

# yum install ant python-devel krb5-devel krb5-libs libxml2 python-lxml \
    libxslt-devel mysql-devel openssl-devel libgsasl-devel sqlite-devel \
    openldap-devel gmp-devel python

Install Hue

After installing all the dependencies, extract the tarball and run make install as the root user to build and install Hue. This creates the Hue home directory under /usr/local/hue.

sudo su -
tar -xzf hue-3.9.0.tgz
cd hue-3.9.0
make install
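
If a different install location is needed, the Hue build honors a PREFIX variable (standard behavior of the Hue Makefile; verify against the README in your tarball). For example:

PREFIX=/opt make install

This would place Hue under /opt/hue; the rest of this article assumes the default /usr/local/hue.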

Create a Hue user and group and change the owner of /usr/local/hue to hue.

groupadd hue
useradd -g hue hue
chown -R hue:hue /usr/local/hue

Configuration Changes Required in Ambari UI

To use some of the services in Hue, configuration changes are required in Ambari.

  1. HDFS
      • Ensure WebHDFS is enabled

    [screenshot: webhdfs]

      • Add properties to custom core-site.xml

    [screenshot: core-site]
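
    For a typical setup where Hue runs as the hue user, the properties added to custom core-site.xml are the standard Hadoop proxy-user keys (the values shown are permissive examples; adjust them to your security policy):

      hadoop.proxyuser.hue.hosts=*
      hadoop.proxyuser.hue.groups=*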

  2. Oozie
      • Add properties to custom oozie-site.xml

    [screenshot: oozie-site]
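
    For a typical setup, the corresponding entries in custom oozie-site.xml are the standard Oozie proxy-user keys (permissive example values shown):

      oozie.service.ProxyUserService.proxyuser.hue.hosts=*
      oozie.service.ProxyUserService.proxyuser.hue.groups=*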

  3. Hive
      • Add properties to custom webhcat-site.xml

    [screenshot: webhcat-site]
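
    For a typical setup, the corresponding entries in custom webhcat-site.xml are the standard WebHCat proxy-user keys (permissive example values shown):

      webhcat.proxyuser.hue.hosts=*
      webhcat.proxyuser.hue.groups=*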

Restart the three services from the Ambari UI.
Without these changes, Hue will not be able to access these services, and the Hue UI will flag potential misconfigurations:
[screenshot: error_NoProxyUsersAddedInAmbari]

Hue Configurations

The Hue configuration needs to be updated with values from the cluster. The configuration is stored in hue.ini, located at /usr/local/hue/desktop/conf/hue.ini. This article explains two ways to configure hue.ini: automatically, via the provided script, or manually.

Automatic Configuration

This article provides a script that retrieves the configurations using the Ambari REST API and updates hue.ini automatically. If the Ambari cluster has Kerberos enabled, follow the steps in Security in Hue before running this script.

Download the archive containing the scripts at configHueIOP.zip. To use this script, run these commands as the root user:

# unzip configHueIOP.zip
# cd configHueIOP
# ./config_hue_iop.sh -ambariuser=admin -ambaripassword=admin -ambariserver=hostname.abc.com -ambariport=8080 -services=ALL

The script takes five parameters:

  • ambariuser: Ambari admin user ID.
  • ambaripassword: Password of Ambari admin user.
  • ambariserver: Hostname of Ambari server.
  • ambariport: Port of Ambari server.
  • services: Comma-separated list of Ambari services that should be configured in Hue. To configure all services, specify ALL.
$ ./config_hue_iop.sh -help
Usage:
config_hue_iop.sh -help
config_hue_iop.sh -ambariuser=admin -ambaripassword=admin -ambariserver=hostname.abc.com -ambariport=8080 -services=HDFS
config_hue_iop.sh -ambariuser=admin -ambaripassword=admin -ambariserver=hostname.abc.com -ambariport=8080 -services=HDFS,YARN,HIVE
config_hue_iop.sh -ambariuser=admin -ambaripassword=admin -ambariserver=hostname.abc.com -ambariport=8080 -services=ALL

The script will prompt for additional user input, for example whether to start the HBase Thrift Server. After the script finishes, check the generated log for errors that may have occurred while starting the services, such as port conflicts.

Example of the output generated by the script:

./config_hue_iop.sh -ambariuser=admin -ambaripassword=admin -ambariserver=hostname.abc.com -ambariport=8081 -services=ALL
Logging output to hue_20151022_1327.log
Use default hue.ini configuration file /usr/local/hue/desktop/conf/hue.ini ? y/n (y)
Creating backup of hue.ini...hue.ini_20151022_1327.bak
We will start the HBASE Thrift Server on this node: (hostname.abc.com), if you would like to start it on a different host, please select (n)o
Do you want to configure and start HBASE Thrift Server now? y/n (y)
Use default Hue Cluster name and port number for HBASE Thrift Server? y/n (y)
nohup: redirecting stderr to stdout
Kerberos is enabled in the cluster, will set security_enabled=true for all services
TODO:: Update hue.ini manually to configure Kerberos keytab and principal for the hue user.
  [[kerberos]]
    hue_keytab=/etc/security/keytabs/hue.service.keytab
    hue_principal=hue/hostname.abc.com
    kinit_path=/path/to/kinit
Do you want to start the Spark Livy Server? y/n (y)
Do you want to start Zookeeper Rest Services? y/n (y) nohup: redirecting stderr to stdout

Configuring hue.ini at /usr/local/hue/desktop/conf/hue.ini
Blacklisted Applications: impala,sqoop
Updated HBASE
 hbase_clusters=(C1|hostname.abc.com:9090)
Started HBASE Thrift Server, pid: 3633270
Please check hue_20151022_1327.log for any port conflict errors
Updated HDFS
 fs_defaultfs=hdfs://hostname.abc.com:8020
 webhdfs_url=http://hostname.abc.com:50070/webhdfs/v1
Updated HIVE
 hive_server_host=hostname.abc.com
 hive_server_port=10000
Updated MAPREDUCE2
 history_server_api_url=hostname.abc.com:19888
Updated OOZIE
 oozie_url=http://hostname.abc.com:11000/oozie
Updated PIG
 local_sample_dir=/usr/iop/current/pig-client/piggybank.jar
Updated SOLR
 solr_url=http://hostname.abc.com:8983/solr/
Updated SPARK
 livy_server_host=hostname.abc.com
Started Hue Spark Livy Server, pid: 3633539
Updated YARN
 resourcemanager_host=hostname.abc.com
 resourcemanager_port=8050
 resourcemanager_api_url=hostname.abc.com:8088
 proxy_api_url=http://hostname.abc.com:8088
Updated ZOOKEEPER
 host_ports=hostname.abc.com:2181
Started Zookeeper Rest Services, pid: 3633717
 rest_url=http://hostname.abc.com:9998/

Manual Configuration

This section explains the manual steps for updating the Hue configuration when not using the provided script.
hue.ini has a section for each configured service; the properties that need to be updated with values from the IOP cluster are shown in the excerpts below.

  1. desktop.secret_key
    [desktop]
    
      # Set this to a random string, the longer the better.
      # This is used for secure hashing in the session store.
      secret_key=
    

    If this key is not configured, Hue will show a misconfiguration in the UI:
    [screenshot: secret_key]
    In the automatic configuration, the secret_key is randomly generated.
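
    A suitable random string can be generated with a shell one-liner, for example (any method that produces a long random string works equally well):

      tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 60; echo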

  2. Hadoop
    [hadoop]
    
          # Configuration for HDFS NameNode
          # ------------------------------------------------------------------------
          [[hdfs_clusters]]
            # HA support by using HttpFs
    
            [[[default]]]
              # Enter the filesystem uri
              fs_defaultfs=hdfs://hostname.abc.com:8020
    
              # Use WebHdfs/HttpFs as the communication mechanism.
              # Domain should be the NameNode or HttpFs host.
              # Default port is 14000 for HttpFs.
              webhdfs_url=http://hostname.abc.com:50070/webhdfs/v1
    
              # Directory of the Hadoop configuration
              ## hadoop_conf_dir=$HADOOP_CONF_DIR when set or '/etc/hadoop/conf'
    
          # Configuration for YARN (MR2)
          # ------------------------------------------------------------------------
          [[yarn_clusters]]
    
            [[[default]]]
              # Enter the host on which you are running the ResourceManager
              resourcemanager_host=hostname.abc.com  
    
              # The port where the ResourceManager IPC listens on
              resourcemanager_port=8050
              ...
              # URL of the ResourceManager API
              resourcemanager_api_url=http://hostname.abc.com:8088
    
              # URL of the ProxyServer API
              proxy_api_url=http://hostname.abc.com:8088
    
              # URL of the HistoryServer API
              history_server_api_url=http://hostname.abc.com:19888
    
  3. Oozie
    [liboozie]
    
      # The URL where the Oozie service runs on. This is required in order for
      # users to submit jobs. Empty value disables the config check.
      oozie_url=http://hostname.abc.com:11000/oozie
    
  4. Hive
    [beeswax]
    
          # Host where HiveServer2 is running.
          # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
          hive_server_host=hostname.abc.com
    
          # Port where HiveServer2 Thrift server runs on.
          hive_server_port=10000
    
          # Hive configuration directory, where hive-site.xml is located
          ## hive_conf_dir=/etc/hive/conf
    
  5. Pig
    [pig]
          # Location of piggybank.jar on local filesystem.
          local_sample_dir=/usr/iop/current/pig-client/piggybank.jar
    
          # Location piggybank.jar will be copied to in HDFS.
          ## remote_data_dir=/user/hue/pig/examples
    
  6. HBase
    To be able to use the HBase application, the HBase Thrift server needs to be started. The HBase Thrift server is not managed by Ambari. To start the server, run the following command as the root user:

    nohup hbase thrift start &
    

    By default, the HBase Thrift server runs on port 9090. To use a different port, pass the new port to the start command:

    nohup hbase thrift start --port <custom_port> &
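
    To confirm the Thrift server is listening, the port can be checked with a generic tool such as netstat, for example:

    netstat -tlnp | grep 9090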
    

    If the HBase Thrift server is not started, Hue shows the following error:

    HBase Thrift 1 server cannot be contacted: Could not connect to hostname.abc.com:9090
    

    Once the HBase Thrift Server is up and running, update hue.ini with the hostname and port number of the thrift server.

    [hbase]
      # Comma-separated list of HBase Thrift servers for clusters in the format of '(name|host:port)'.
      # Use full hostname with security.
      # If using Kerberos we assume GSSAPI SASL, not PLAIN.
      hbase_clusters=(Cluster|hostname.abc.com:9090)
    
  7. Solr
    [search]
          # URL of the Solr Server
          solr_url=http://hostname.abc.com:8983/solr/
    
  8. Zookeeper
    To enable ZNode browsing in the Zookeeper application, the Zookeeper REST service needs to be started. The Zookeeper REST service is not managed by Ambari. To start the service, run the following command as the hue user:
    /usr/jdk64/java-1.8.0-openjdk-1.8.0.45-28.b13.el6_6.x86_64/bin/java \
      -cp /usr/iop/current/zookeeper-server/contrib/rest/*:/usr/iop/current/zookeeper-server/contrib/rest/lib/*:/usr/iop/current/zookeeper-server/zookeeper.jar:/usr/iop/current/zookeeper-server/conf/rest \
      org.apache.zookeeper.server.jersey.RestMain
    

    By default, the Zookeeper REST service runs on port 9998. To use a different port, update the value for rest.port in /usr/iop/current/zookeeper-server/conf/rest/rest.properties.
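    Once started, the service can be sanity-checked with a request such as the following (the /znodes/v1 path is the endpoint exposed by the ZooKeeper REST contrib; adjust if your build differs):

    curl http://hostname.abc.com:9998/znodes/v1/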
    Once the Zookeeper REST service is up and running, update hue.ini with the hostname and port number of the Zookeeper REST service.

    [zookeeper]
    
          [[clusters]]
    
            [[[default]]]
              # Zookeeper ensemble. Comma separated list of Host/Port.
              # e.g. localhost:2181,localhost:2182,localhost:2183
              host_ports=hostname.abc.com:2181
    
              # The URL of the REST contrib service (required for znode browsing).
              rest_url=http://hostname.abc.com:9998
    
  9. Spark
    To be able to use the Spark application, the Spark Livy server needs to be started. Livy is a REST service on top of Spark and is not managed by Ambari. To start the server, run the following commands as the hue user:
    cd /usr/local/hue
    /usr/local/hue/build/env/bin/hue livy_server
    

    By default, the Livy server runs on port 8998.
    The Livy server has to be started from a directory that the hue user has write access to, because the logs are written to logs/livy_server.log relative to that directory. Once the Livy server is up and running, update hue.ini with the hostname and port number of the Livy server.
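    A quick way to confirm Livy is up is to query its REST API; for example, GET /sessions should return an empty session list on a freshly started server:

    curl http://hostname.abc.com:8998/sessions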

    [spark]
          # Host address of the Livy Server.
          livy_server_host=hostname.abc.com
    
          # Port of the Livy Server.
          ## livy_server_port=8998
    

Start Hue

After completing the configuration, start Hue by running the following command as the root user:

/usr/local/hue/build/env/bin/supervisor

By default, Hue runs on port 8888. To change the port number, configure http_port in hue.ini.
To access the Hue web UI, open http://hostname.abc.com:8888 in a browser.
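
To verify that Hue is serving requests before opening the browser, a generic HTTP check can be used, for example:

# curl -I http://hostname.abc.com:8888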

User Management

On the first login, Hue prompts you to create a Hue superuser who will have admin access to the web UI.

[screenshot: initial_login]

As the Hue superuser, create additional users who can log in to Hue. These users may be granted permissions for specific Hue applications, for example allowing a user Bob access only to the File Browser application. Below is an example of how to create a user hive in Hue as a member of the hadoop group.

    • Log in to the Hue UI as the Hue superuser and navigate to the top right corner of the taskbar. Click the down arrow next to the superuser name, then click Manage Users.

[screenshot: hue_user_home]

    • First create a new group named hadoop and assign the permissions for the group.

[screenshot: hue_create_group]

    • Next, create a user, hive, that will be part of the hadoop group.

[screenshot: hue_create_user_step1]

    • On step 2 of creating a user, assign the hive user to the hadoop group.

[screenshot: hue_create_user_step2]

    • On step 3 of creating a user, make sure the user is “active.” The “superuser status” option will give the new user the same superuser status as the first user created when starting Hue for the first time.

[screenshot: hue_create_user_step3]

Security in Hue

To configure Hue with LDAP, follow the instructions from http://gethue.com/making-hadoop-accessible-to-your-employees-with-ldap.
To configure Hue with Kerberos, follow the steps below:

    1. Follow the steps in Setting Up Kerberos for Use with Ambari to set up a KDC and Kerberize the Ambari cluster.
    2. After Kerberizing the Ambari cluster, configure Hue as described below.

To create a keytab and principal for the hue user, run the following commands (the transcript below shows the commands together with their output):

# kadmin
Authenticating as principal root/admin@IBM.COM with password.
Password for root/admin@IBM.COM:
kadmin:  addprinc -randkey hue/hostname.abc.com@IBM.COM

WARNING: no policy specified for hue/hostname.abc.com@IBM.COM; defaulting to no policy

Principal "hue/hostname.abc.com@IBM.COM" created.

kadmin:  xst -k /etc/security/keytabs/hue.service.keytab hue/hostname.abc.com@IBM.COM

Entry for principal hue/hostname.abc.com@IBM.COM with kvno 2, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/etc/security/keytabs/hue.service.keytab.

Entry for principal hue/hostname.abc.com@IBM.COM with kvno 2, encryption type aes128-cts-hmac-sha1-96 added to keytab WRFILE:/etc/security/keytabs/hue.service.keytab.

Entry for principal hue/hostname.abc.com@IBM.COM with kvno 2, encryption type des3-cbc-sha1 added to keytab WRFILE:/etc/security/keytabs/hue.service.keytab.

Entry for principal hue/hostname.abc.com@IBM.COM with kvno 2, encryption type arcfour-hmac added to keytab WRFILE:/etc/security/keytabs/hue.service.keytab.

# kinit -k -t /etc/security/keytabs/hue.service.keytab hue/hostname.abc.com@IBM.COM
# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hue/hostname.abc.com@IBM.COM
Valid starting     Expires            Service principal
09/08/15 19:18:48  09/09/15 19:18:48  krbtgt/IBM.COM@IBM.COM
        renew until 09/08/15 19:18:48
  • Update /usr/local/hue/desktop/conf/hue.ini.

Uncomment the line ## security_enabled=false in each service section of hue.ini and set it to true:

security_enabled=true

Modify the kerberos section to point to the hue keytab and principal created above:

[[kerberos]]

# Path to Hue's Kerberos keytab file
hue_keytab=/etc/security/keytabs/hue.service.keytab
# Kerberos principal name for Hue
hue_principal=hue/hostname.abc.com@IBM.COM
# Path to kinit
kinit_path=/usr/bin/kinit
  • Configure hbase-site.xml in Ambari to add the Kerberos principal name, then restart the HBase service in Ambari.

[screenshot: hbase_kerberos]

  • Restart Hue as root: /usr/local/hue/build/env/bin/supervisor

Potential Kerberos-Related Problems When Starting the Hue Server

    • Permission denied while getting credentials
[09/Sep/2015 10:02:00 -0700] kt_renewer   INFO     Reinitting kerberos from keytab: /usr/bin/kinit -r 3600m -k -t /etc/security/keytabs/hue.service.keytab -c /tmp/hue_krb5_ccache hue/hostname.abc.com@IBM.COM
[09/Sep/2015 10:02:00 -0700] kt_renewer   ERROR    Couldn't reinit from keytab! `kinit' exited with 1.
kinit: Permission denied while getting initial credentials

Solution: Ensure the Hue keytab is readable by the hue user:

# chown hue:hue /etc/security/keytabs/hue.service.keytab
  • Permission denied: ‘/tmp/hue_krb5_ccache’
IOError: [Errno 13] Permission denied: '/tmp/hue_krb5_ccache'.

Solution: Ensure /tmp/hue_krb5_ccache is writable by the hue user:

# chown hue:hue /tmp/hue_krb5_ccache
  • Error: Couldn’t renew Kerberos ticket
[09/Sep/2015 10:22:23 -0700] kt_renewer   ERROR    Couldn't renew kerberos ticket in order to work around Kerberos 1.8.1 issue. Please check that the ticket for 'hue/hostname.abc.com@IBM.COM' is still renewable:

  $ kinit -f -c /tmp/hue_krb5_ccache

If the 'renew until' date is the same as the 'valid starting' date, the ticket cannot be renewed. Please check your KDC configuration, and the ticket renewal policy (maxrenewlife) for the 'hue/hostname.abc.com@IBM.COM' and `krbtgt' principals.

Solution: Modify the krbtgt and Hue principals to allow renewable tickets:

# kadmin.local
Authenticating as principal root/admin@IBM.COM with password.
kadmin.local:  modprinc -maxrenewlife 7day krbtgt/IBM.COM@IBM.COM
Principal "krbtgt/IBM.COM@IBM.COM" modified.
kadmin.local:  modprinc -maxrenewlife 7day +allow_renewable hue/hostname.abc.com@IBM.COM
Principal "hue/hostname.abc.com@IBM.COM" modified.

Limitations and Workarounds

Limitations

  • The Sqoop2 and Impala applications are not supported in Hue when installing over IOP 4.1. Modify /usr/local/hue/desktop/conf/hue.ini to blacklist those applications:
      # Comma separated list of apps to not load at server startup.
      # e.g.: pig,zookeeper
      app_blacklist=impala,sqoop
    
  • The Solr examples do not work with the version of Solr in IOP 4.1.
  • Spark YARN mode is not supported; use the default Spark process mode instead.
  • Knox is not supported; see https://issues.apache.org/jira/browse/KNOX-44.

Workarounds

  • When installing the Pig example, the following error may occur when executing the Pig script:
    ERROR 1070: Could not resolve org.apache.pig.piggybank.evaluation.string.UPPER using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Could not resolve org.apache.pig.piggybank.evaluation.string.UPPER using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    

    Solution: Modify one line in the example Pig script from:

    upper_case = FOREACH data GENERATE org.apache.pig.piggybank.evaluation.string.UPPER(text);

    to:

    upper_case = FOREACH data GENERATE UPPER(text);
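
    Alternatively, registering piggybank.jar at the top of the script should also resolve the error, since the fully-qualified UDF class then becomes available (the jar path is the one used in the Pig configuration earlier in this article):

    REGISTER /usr/iop/current/pig-client/piggybank.jar;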

  • When installing the Hive examples as the Hue admin user, a permission denied error may occur because the Hue admin user does not have access to the Hive warehouse directory in HDFS.
    Solution: Install the Hive examples as a user who has access to the Hive warehouse directory.
  • Quickstart Wizard configuration check fails to find a running HiveServer2 after enabling Kerberos.
    When trying to run any commands from the Hive editor, the following error occurs:

    Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hive/hostname.abc.com@IBM.COM is not allowed to impersonate admin
    

    Solution 1: Update the Hadoop core-site.xml from the Ambari UI to change the proxy users for hive to * instead of users:

    hadoop.proxyuser.hive.hosts:*
    hadoop.proxyuser.hive.groups:*
    

    Solution 2: Add the hive user to the users group on the node. Create the users group first if it does not exist (groupadd users).

    usermod -aG users hive
    
