IBM Support

Deploying and configuring IOP Titan on HDP - Hadoop Dev

Technical Blog Post


Body

An expanded partnership between IBM and Hortonworks has combined Hortonworks Data Platform (HDP) with IBM Big SQL into a new integrated solution. HDP does not cover all of the services that were available on IBM Open Platform with Apache Spark and Apache Hadoop (IOP). This blog post shows you how to deploy and configure IOP Titan, a transactional distributed graph database that can support thousands of concurrent users, on HDP.

  1. Prepare the HDP environment. Ensure that your HDP deployment is ready for IOP Titan. Both Spark and HBase are required for Titan.
  2. Install IOP Titan on HDP.
    1. Download the Titan rpm from our website.
      For RHEL7: http://birepo-build.svl.ibm.com/repos/IOP/RHEL7/x86_64/4.2.5.0/4.2.5.0-GM/titan/noarch/titan_4_2_5_0-1.0.0_IBM-000000.el7.noarch.rpm
      For RHEL6: http://birepo-build.svl.ibm.com/repos/IOP/RHEL6/x86_64/4.2.5.0/4.2.5.0-GM/titan/noarch/titan_4_2_5_0-1.0.0_IBM-000000.el6.noarch.rpm
    2. Install Titan.
      yum install titan_4_2_5_0-1.0.0_IBM-000000.el7.noarch.rpm  

      The path on which Titan will be installed is /usr/iop/4.2.5.0-0000/titan.

    3. Configure the Titan client by completing the following steps:
      1. Modify the /usr/iop/4.2.5.0-0000/titan/conf/titan-hbase-solr.properties file:
        gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
        storage.backend=hbase
        storage.hostname=hbase_hostname1,hbase_hostname2,hbase_hostname3
        storage.hbase.table=titan
        storage.hbase.ext.zookeeper.znode.parent=/hbase-unsecure
        cache.db-cache = true
        cache.db-cache-clean-wait = 20
        cache.db-cache-time = 180000
        cache.db-cache-size = 0.5
      2. Create the ‘titan’ user and add it to the ‘hadoop’ group by running the following command on each node:
        useradd titan -g hadoop
      3. Copy the titan-env.sh file into /usr/iop/4.2.5.0-0000/titan/conf.
      4. Set the owner of the copied file by running the following command:
        chown -R titan:hadoop /usr/iop/4.2.5.0-0000/titan/conf/titan-env.sh
      5. Set the owner of the plugin list and the properties file by running the following commands:
        chown -R titan:hadoop /usr/iop/4.2.5.0-0000/titan/ext/plugins.txt
        chown -R titan:hadoop /usr/iop/4.2.5.0-0000/titan/conf/titan-hbase-solr.properties
      6. Run a test. Under the /usr/iop/4.2.5.0-0000/titan/bin directory, run the following command:
        ./gremlin.distro  

        In the Gremlin console, run the following code:

        graph = TitanFactory.open("/usr/iop/4.2.5.0-0000/titan/conf/titan-hbase-solr.properties")
        mgmt = graph.openManagement()
        pkeyName = mgmt.makePropertyKey("name").dataType(String.class).cardinality(Cardinality.SINGLE).make()
        pkeyAge = mgmt.makePropertyKey("age").dataType(String.class).cardinality(Cardinality.SINGLE).make()
        mgmt.commit()
        // "age" is declared as a String key above, so every age value is passed as a string
        hercules = graph.addVertex("name", "hercules", "age", "30", "type", "demigod");
        alcmene = graph.addVertex("name", "alcmene", "age", "45", "type", "human");
        jupiter = graph.addVertex("name", "jupiter", "age", "5000", "type", "god");
        pluto = graph.addVertex("name", "pluto", "age", "4000", "type", "god");
        neptune = graph.addVertex("name", "neptune", "age", "4500", "type", "god");
        saturn = graph.addVertex("name", "saturn", "age", "10000", "type", "titan");
        saturn1 = graph.addVertex("name", "saturn1", "age", "10200", "type", "titan");
        hercules.addEdge("father", jupiter);
        hercules.addEdge("mother", alcmene);
        jupiter.addEdge("father", saturn);
        jupiter.addEdge("brother", neptune);
        jupiter.addEdge("brother", pluto);
        neptune.addEdge("brother", jupiter);
        neptune.addEdge("brother", pluto);
        pluto.addEdge("brother", jupiter);
        pluto.addEdge("brother", neptune);
        saturn.addEdge("brother", saturn1);
        saturn1.addEdge("brother", saturn);
        graph.tx().commit()
      7. Go to the HBase shell and check whether the table ‘titan’ was created in HBase by running the following command:
        scan 'hbase:meta'  
      8. Check other information that pertains to the table ‘titan’ by running the following command:
        scan 'titan'  

        Your Titan client should now be working well with HBase backend storage.
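
Before you start the Gremlin console, it can help to confirm that the edited properties file still contains the storage settings shown in step 1. The following is a minimal sketch and is not part of the Titan distribution: the function name `missing_titan_keys` and the choice of which keys to treat as required are assumptions made for illustration.

```shell
# Hypothetical helper: list the keys that are missing from a Titan
# properties file. Prints one missing key per line and prints nothing
# when every key is present.
missing_titan_keys() {
  # $1 = path to the properties file
  for mtk_key in gremlin.graph storage.backend storage.hostname storage.hbase.table; do
    # Split each line on "=", trim blanks around the key name, and
    # succeed only if some line defines exactly this key.
    awk -F= -v k="$mtk_key" '
      { gsub(/^[ \t]+|[ \t]+$/, "", $1) }
      $1 == k { found = 1 }
      END { exit !found }
    ' "$1" || printf '%s\n' "$mtk_key"
  done
}
```

For example, `missing_titan_keys /usr/iop/4.2.5.0-0000/titan/conf/titan-hbase-solr.properties` prints nothing if all four keys are defined.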

    4. Configure the Titan server by completing the following steps:
      1. Run the following command:
        chown -R titan:hadoop /usr/iop/4.2.5.0-0000/titan/conf/gremlin-server/gremlin-server.yaml  
      2. Modify the /usr/iop/4.2.5.0-0000/titan/conf/gremlin-server/gremlin-server.yaml file:
        host: titan_server_host
        port: 8182
        channelizer: org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer
        graphs: {
          graph: /usr/iop/4.2.5.0-0000/titan/conf/titan-hbase-solr.properties,
          graphSpark: /usr/iop/4.2.5.0-0000/titan/conf/hadoop-graph/hadoop-gryo.properties}
        plugins:
          - aurelius.titan
          - tinkerpop.hadoop
          - tinkerpop.tinkergraph
      3. Under the /usr/iop/4.2.5.0-0000/titan/bin directory, run the following command:
        ./gremlin-server.sh
      4. Run a test.
        curl -XPOST -Hcontent-type:application/json -d '{"gremlin":"100-1"}' http://titan_server_host:8182 | grep 99

        The Titan server is now running in REST API (HTTP) mode.

      5. Configure the Titan server log. Run the following commands:
        mkdir /var/log/titan/
        chown -R titan:hadoop /var/log/titan/
      6. Configure the Titan server to enable the WebSocket mode. Open /usr/iop/4.2.5.0-0000/titan/conf/gremlin-server/gremlin-server.yaml and change HttpChannelizer to WebSocketChannelizer.
      7. Run the following command:
        chown -R titan:hadoop /usr/iop/4.2.5.0-0000/titan/conf/remote.yaml  
      8. Modify the /usr/iop/4.2.5.0-0000/titan/conf/remote.yaml file:
        hosts: [titan_server_host]   # Gremlin Server host
        port: 8182
        serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
      9. Start the Titan server by running the following command under /usr/iop/4.2.5.0-0000/titan/bin:
        ./gremlin-server.sh
      10. Run a test. Open another SSH client, and under the /usr/iop/4.2.5.0-0000/titan/bin directory, run the following command:
        ./gremlin.distro  

        In the Gremlin console, run the following code:

        :remote connect tinkerpop.server /usr/iop/4.2.5.0-0000/titan/conf/remote.yaml session
        :remote console
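
The REST test in this procedure posts a small JSON document with a single "gremlin" field. The following sketch wraps that request so that scripts containing double quotes are escaped correctly. The function names `gremlin_payload` and `gremlin_eval` are hypothetical, the escaping covers single-line scripts only, and the request works only while the server uses HttpChannelizer without SSL.

```shell
# Hypothetical helper: build the JSON request body for a Gremlin script.
gremlin_payload() {
  # Escape backslashes first, then double quotes, so the script
  # is embedded as a valid JSON string (single-line scripts only).
  gp_escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"gremlin":"%s"}' "$gp_escaped"
}

# Hypothetical helper: POST a script to the server over plain HTTP.
gremlin_eval() {
  # $1 = server host, $2 = port, $3 = Gremlin script
  curl -sS -XPOST -H 'content-type: application/json' \
    -d "$(gremlin_payload "$3")" "http://$1:$2"
}
```

For example, `gremlin_eval titan_server_host 8182 '100-1'` sends the same request as the curl test in the REST step.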
    5. Configure Titan SparkGraphComputer by completing the following steps:
      1. Run the following commands:
        sudo -u spark hadoop fs -mkdir -p /user/spark/share/lib/spark
        sudo -u spark hadoop fs -put -f /usr/hdp/current/spark2-client/jars/* /user/spark/share/lib/spark
        sudo -u spark hadoop fs -rm -r /user/spark/share/lib/spark/guava*.jar
        sudo -u spark hadoop fs -put -f /usr/iop/4.2.5.0-0000/titan/lib/guava*.jar /user/spark/share/lib/spark
        sudo -u hdfs hadoop fs -mkdir /user/titan
        sudo -u hdfs hadoop fs -chown -R titan:hdfs /user/titan
      2. Copy the hadoop-gryo.properties file to the /usr/iop/4.2.5.0-0000/titan/conf/hadoop-graph directory, and then run the following command:
        chown -R titan:hadoop /usr/iop/4.2.5.0-0000/titan/conf/hadoop-graph/hadoop-gryo.properties
      3. In the /usr/iop/4.2.5.0-0000/titan/conf/hadoop-graph/hadoop-gryo.properties file, modify the ‘spark.yarn.jars’ attribute to point to your NameNode host. For example:
        spark.yarn.jars=hdfs://hostname:8020/user/spark/share/lib/spark/*.jar  
      4. Test SparkGraphComputer: Under the /usr/iop/4.2.5.0-0000/titan/bin directory, run the following command:
        ./gremlin.distro  
        gremlin> hdfs.copyFromLocal('/usr/iop/4.2.5.0-0000/titan/data/tinkerpop-modern.kryo','/user/ambari-qa/data/tinkerpop-modern.kryo')
        ==>null
        gremlin> graph = GraphFactory.open('/usr/iop/4.2.5.0-0000/titan/conf/hadoop-graph/hadoop-gryo.properties')
        ==>hadoopgraph[gryoinputformat->gryooutputformat]
        gremlin> g = graph.traversal(computer(SparkGraphComputer))
        ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
        gremlin> g.V().valueMap()
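
The ‘spark.yarn.jars’ edit can also be scripted. The following is a minimal sketch that assumes the HDFS path used in this procedure and a default NameNode port of 8020. The function name `set_spark_yarn_jars` is hypothetical, and a host name that contains `|` or `&` would need extra escaping.

```shell
# Hypothetical helper: set spark.yarn.jars in a properties file.
set_spark_yarn_jars() {
  # $1 = properties file, $2 = NameNode host, $3 = NameNode port (default 8020)
  ssyj_value="hdfs://$2:${3:-8020}/user/spark/share/lib/spark/*.jar"
  if grep -q '^spark\.yarn\.jars=' "$1"; then
    # Rewrite the existing line, then copy the result back over the
    # original so that the file keeps its owner and permissions.
    sed "s|^spark\.yarn\.jars=.*|spark.yarn.jars=$ssyj_value|" "$1" > "$1.tmp" &&
      cat "$1.tmp" > "$1" &&
      rm -f "$1.tmp"
  else
    # The key is not present yet, so append it.
    printf 'spark.yarn.jars=%s\n' "$ssyj_value" >> "$1"
  fi
}
```

For example, `set_spark_yarn_jars /usr/iop/4.2.5.0-0000/titan/conf/hadoop-graph/hadoop-gryo.properties hostname` writes the same value as the example in step 3.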
    6. Enable security for Titan by completing the following steps. SSL provides the standard encryption technology for establishing a secure connection between a Titan client and the Titan server.
      1. Go to the /tmp directory and create a self-signed server certificate.
        openssl req -newkey rsa:2048 -nodes -keyout server.key -x509 -days 365 -out server.crt
      2. Convert the private key to PKCS #8 format. When you are prompted for an encryption password, enter ‘changeit’.
        openssl pkcs8 -topk8 -inform pem -in server.key -outform pem -out server.pk8  
      3. Go to the /usr/iop/4.2.5.0-0000/titan/conf/gremlin-server directory, open the gremlin-server.yaml file, change WebSocketChannelizer to HttpChannelizer, and then append the following configuration code to the end of this file:
        ssl: {   enabled: true, keyFile: /tmp/server.pk8, keyPassword: changeit, keyCertChainFile: /tmp/server.crt}  
      4. Restart the Titan server and run the following command to determine whether SSL is working:
        curl --cacert  /tmp/server.crt -XPOST -Hcontent-type:application/json -d '{"gremlin":"100-1"}' https://hhdp1.fyre.ibm.com:8182 | grep 99  
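
The two openssl commands in steps 1 and 2 prompt for the certificate subject and for the key password. For scripted setups, the following sketch runs them without prompts by using the standard `-subj` and `-passout` options. The function name `make_titan_cert` and its arguments are hypothetical. Note that a password passed with `pass:` is visible in the process list, so this form suits test clusters only.

```shell
# Hypothetical helper: create a self-signed certificate and an
# encrypted PKCS #8 key without interactive prompts.
make_titan_cert() {
  # $1 = output directory
  # $2 = server host name, used as the certificate common name (CN)
  # $3 = password that protects the PKCS #8 key
  openssl req -newkey rsa:2048 -nodes -keyout "$1/server.key" \
    -x509 -days 365 -subj "/CN=$2" -out "$1/server.crt" &&
  openssl pkcs8 -topk8 -inform pem -in "$1/server.key" \
    -outform pem -out "$1/server.pk8" -passout "pass:$3"
}
```

For example, `make_titan_cert /tmp hhdp1.fyre.ibm.com changeit` produces the /tmp/server.crt and /tmp/server.pk8 files that the ssl block in step 3 refers to. The common name should match the host name in the URL that clients use.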

      Knox provides a single point of authentication and access to the Titan server in a cluster. To route Titan requests through Knox, complete the following steps:

      1. From the Knox page in the Ambari user interface, add the following code to the advanced topology. If the Titan server is running without SSL, you can use ‘http’ as the URL.
        <service>
          <role>GREMLIN</role>
          <version>1.0.0</version>
          <url>https://hhdp1.fyre.ibm.com:8182</url>
        </service>
      2. Restart Knox and then start a demo LDAP server.
      3. Add the certificate file to the JRE truststore by running the following command:
        keytool -import -file hive.crt -keystore /usr/jdk64/jdk1.8.0_112/jre/lib/security/cacerts -storepass changeit -alias titan  
      4. Run the following command to check whether Titan works with Knox:
        curl -Hcontent-type:application/json -u guest:guest-password -k https://hhdp1.fyre.ibm.com:8443/gateway/default/gremlin -d '{"gremlin": "100-1"}'  

      For a Kerberized cluster, Kerberos provides strong authentication for Titan client/server applications by using secret-key cryptography. From your Kerberos key distribution center (KDC) server, run the following commands to create a Titan principal and generate a keytab:

      kadmin.local
      addprinc titan/[email protected]
      xst -norandkey -k titan.service.keytab titan/[email protected]
      scp titan.service.keytab hostname:/etc/security/keytabs
      chown -R titan:hadoop /etc/security/keytabs/titan.service.keytab
      sudo -u titan /usr/bin/kinit -kt /etc/security/keytabs/titan.service.keytab titan/[email protected]

      The ‘hostname’ identifies your Titan server.

      For HBase access control, start the HBase shell and then grant the following privileges to the ‘titan’ user:

      grant 'titan', 'RWXCA'  

      Simple Authentication and Security Layer (SASL) and Knox have similar functions and cannot be enabled at the same time. SASL should be disabled when Knox is enabled. To enable SASL, complete the following steps:

      1. Copy the tinkergraph-empty.properties file to the /usr/iop/4.2.5.0-0000/titan/conf directory.
      2. Run the following command:
        chown -R titan:hadoop /usr/iop/4.2.5.0-0000/titan/conf/tinkergraph-empty.properties  
      3. Create an empty file named credentials.kryo under the /usr/iop/4.2.5.0-0000/titan/data directory.
      4. Add the following content to the /usr/iop/4.2.5.0-0000/titan/conf/gremlin-server/gremlin-server.yaml file:
        authentication: {
          className: org.apache.tinkerpop.gremlin.server.auth.SimpleAuthenticator,
          config: {
            credentialsDb: /usr/iop/4.2.5.0-0000/titan/conf/tinkergraph-empty.properties,
            credentialsDbLocation: /usr/iop/4.2.5.0-0000/titan/data/credentials.kryo}}
      5. In the Gremlin console, run the following command to create a SASL user and password:
        gremlin> :plugin use tinkerpop.credentials
        ==>tinkerpop.credentials activated
        gremlin> graph = TinkerGraph.open()
        ==>tinkergraph[vertices:0edges:0]
        gremlin> graph.createIndex("username",Vertex.class)
        gremlin> credentials = credentials(graph)
        ==>CredentialGraph{graph=tinkergraph[vertices:0edges:0]}
        gremlin> credentials.createUser("stephen","password") //to create user
        ==>v[0]
        gremlin> graph.io(IoCore.gryo()).writeGraph("/usr/iop/4.2.5.0-0000/titan/data/credentials.kryo") //to save the credentials database
      6. Restart the Titan server.
      7. To access the Titan server, run curl commands that are similar to the following examples:
        Titan SASL with Titan SSL:
        curl --cacert /tmp/server.crt -X POST -u stephen:password -d "{\"gremlin\":\"100-1\"}" https://hostname:8182
        Titan SASL only (SSL not enabled):
        curl -X POST -u stephen:password -d "{\"gremlin\":\"100-1\"}" http://hostname:8182
