The steps below illustrate how to add a custom service to IBM Open Platform. The service is added to the BigInsights 4.1 stack managed by Ambari. An FTP service is used as the example because its packages are available from the default repositories that are likely to be set up on commonly used Linux systems. Similar steps can be followed for any simple custom service. We add the FTP service to the cluster and then use it to show how conveniently files can be transferred to HDFS from a remote client machine.

The following are different elements illustrated by this example:

    • Define a service with MASTER and CLIENT components.
      • Declare the packages to be installed for the service.
      • Define life cycle command scripts: install, configure, start, stop, status, service-check.
    • Add a custom service using the Ambari web user interface.
    • Use the custom service to interact with one of the core services (HDFS).

Define a service

Create a directory for your service with the structure shown below. The files used for the VSFTPD service have been annotated to explain the different aspects of the service definition: the life cycle commands, packages, and configuration. Together these files define a service for a stack administered by Ambari and are often referred to as the “service configuration”.

Contents of the service directory - VSFTPD

The sample code for the VSFTPD service configuration is available in the attached archive:

VSFTPD.zip

 

VSFTPD/metainfo.xml
[code language="xml"]
<?xml version="1.0"?>
<metainfo>
  <schemaVersion>2.0</schemaVersion>
  <services>
    <service>
      <!-- Unique name for the service -->
      <name>VSFTPD</name>
      <!-- Display name in Ambari UI -->
      <displayName>Vsftpd</displayName>
      <!-- Service description displayed on service selection screen -->
      <comment>Probably the most secure and fastest FTP server for UNIX-like systems.</comment>
      <!-- Version of component -->
      <version>2.2.2</version>
      <components>
        <component>
          <!-- Unique name for the component -->
          <name>VSFTPD_MASTER</name>
          <!-- Display name for component in Ambari UI -->
          <displayName>FTP server</displayName>
          <!-- Category decides minimal set of lifecycle commands -->
          <!-- to be provided: install,configure,status,[start,stop] -->
          <category>MASTER</category>
          <!-- instances of the component cluster-wide -->
          <!-- E.g. 0+, 1, 1+, 1-2, ALL -->
          <cardinality>1</cardinality>
          <!-- Script for component install/stop/start/config -->
          <commandScript>
            <script>scripts/vsftpd-master.py</script>
            <!-- Script type: only PYTHON is currently supported -->
            <scriptType>PYTHON</scriptType>
            <!-- timeout in seconds for action (superseded by agent timeout) -->
            <timeout>600</timeout>
          </commandScript>
        </component>
        <component>
          <name>VSFTPD_CLIENT</name>
          <displayName>FTP Client</displayName>
          <category>CLIENT</category>
          <cardinality>0+</cardinality>
          <commandScript>
            <script>scripts/vsftpd-client.py</script>
            <scriptType>PYTHON</scriptType>
            <timeout>600</timeout>
          </commandScript>
          <configFiles>
            <!-- Files to be included in the "Download Client Configs" archive -->
            <configFile>
              <type>properties</type>
              <fileName>vsftpd-config.properties</fileName>
              <dictionaryName>vsftpd-config</dictionaryName>
            </configFile>
          </configFiles>
        </component>
      </components>

      <!-- what yum packages need to be installed -->
      <osSpecifics>
        <osSpecific>
          <!-- set of supported operating systems -->
          <osFamily>redhat6</osFamily>
          <packages>
            <!-- Package names to be used by yum / zypper install -->
            <package><name>vsftpd</name></package>
            <package><name>ftp</name></package>
          </packages>
        </osSpecific>
      </osSpecifics>

      <!-- script to determine service health -->
      <commandScript>
        <script>scripts/service_check.py</script>
        <scriptType>PYTHON</scriptType>
        <timeout>600</timeout>
      </commandScript>

      <!-- Files under configuration directory used for generating -->
      <!-- actual configuration files using template engine -->
      <configuration-dependencies>
        <config-type>vsftpd-config</config-type>
      </configuration-dependencies>
      <restartRequiredAfterChange>false</restartRequiredAfterChange>
    </service>
  </services>
</metainfo>
[/code]
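Before copying a service definition into the stack, it can help to confirm that the file parses and declares the components you expect. The following is a minimal sketch using only the Python standard library; the helper name `list_components` and the trimmed `SAMPLE_METAINFO` string are illustrative assumptions and are not part of Ambari or of the attached archive.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the metainfo.xml above, kept inline so the sketch is self-contained.
SAMPLE_METAINFO = """<?xml version="1.0"?>
<metainfo>
  <schemaVersion>2.0</schemaVersion>
  <services>
    <service>
      <name>VSFTPD</name>
      <components>
        <component>
          <name>VSFTPD_MASTER</name>
          <category>MASTER</category>
          <cardinality>1</cardinality>
        </component>
        <component>
          <name>VSFTPD_CLIENT</name>
          <category>CLIENT</category>
          <cardinality>0+</cardinality>
        </component>
      </components>
    </service>
  </services>
</metainfo>
"""


def list_components(metainfo_xml):
    """Return (service, component, category, cardinality) tuples in document order."""
    root = ET.fromstring(metainfo_xml)
    rows = []
    for service in root.iter("service"):
        service_name = service.findtext("name")
        for component in service.findall("components/component"):
            rows.append((
                service_name,
                component.findtext("name"),
                component.findtext("category"),
                component.findtext("cardinality"),
            ))
    return rows


if __name__ == "__main__":
    for row in list_components(SAMPLE_METAINFO):
        print(row)
```

To check a real file, read its contents and pass the text to `list_components` in place of the sample string.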

VSFTPD/package/scripts/vsftpd-master.py
[code language="python"]
#!/usr/bin/env python

import sys, os, glob, pwd, signal, time
from resource_management import *
from subprocess import call

class vsftpdMaster(Script):
  # Install MASTER component
  def install(self, env):
 
    # Install packages listed in metainfo.xml
    self.install_packages(env)
    self.configure(env)
    # Update role_command_order.json for stack versions
    self.update_role_command_order()
    
    # add any other install steps that are necessary here

  # To stop the service, use the Linux service stop command and append output to the log file
  def stop(self, env):
    import params
    Execute('service vsftpd stop >>' + params.vsftpd_log)

  # To start the service, use the Linux service start command and append output to the log file
  def start(self, env):
    import params
    self.configure(env)
    Execute('service vsftpd start >>' + params.vsftpd_log)

  # To get status of the service component, use the Linux service status command
  def status(self, env):
    import params

    # No use of log here as this is a heartbeat type of method
    Execute('service vsftpd status')

  # Configure the component
  def configure(self, env):
    import params
 
    env.set_params(params)

    # create vsftpd-config.properties in vsftpd_conf_dir
    self.configFile("vsftpd-config.properties", template_name="vsftpd-config.j2")

  # Generate config file
  def configFile(self, name, template_name=None):
    import params

    File(format("{vsftpd_conf_dir}/{name}"),
         content=Template(template_name),
         owner="root",
         group="root"
    )

  # Add entries into role_command_order.json for stack versions
  def update_role_command_order(self):
    rco_files = glob.glob("/var/lib/ambari-server/resources/stacks/*/*/role_command_order.json")
    for file in rco_files:
      with open(file, "r+") as json:
        readin = json.read()
        if "VSFTPD" not in readin:
          writeout = readin.replace('"_comment" : "dependencies for all cases",', '"_comment" : "dependencies for all cases",\n    "VSFTPD_SERVICE_CHECK-SERVICE_CHECK": ["VSFTPD_MASTER-START"],')
          json.seek(0)
          json.write(writeout)
          json.truncate()
 
if __name__ == "__main__":
  vsftpdMaster().execute()
[/code]
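The `update_role_command_order` method above edits the JSON file as plain text, which depends on the exact spacing of the `"_comment"` line. As an alternative, the sketch below makes the same change on parsed JSON. It assumes that the dependency entries live under a top-level `general_deps` object, which is what the replaced string suggests but which should be verified against the actual `role_command_order.json` of your stack. The helper name `add_service_check_dependency` is hypothetical.

```python
import json


def add_service_check_dependency(rco, check_role, start_role):
    """Return a copy of the role-command-order dict in which check_role waits for start_role."""
    updated = dict(rco)
    general = dict(updated.get("general_deps", {}))  # assumed location of the entries
    deps = list(general.get(check_role, []))
    if start_role not in deps:
        deps.append(start_role)
    general[check_role] = deps
    updated["general_deps"] = general
    return updated


# Minimal stand-in for the contents of role_command_order.json.
sample = {"general_deps": {"_comment": "dependencies for all cases"}}
result = add_service_check_dependency(
    sample, "VSFTPD_SERVICE_CHECK-SERVICE_CHECK", "VSFTPD_MASTER-START"
)

if __name__ == "__main__":
    print(json.dumps(result, indent=2))
```

Because the function only appends the start role when it is missing, running it more than once leaves the result unchanged.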

VSFTPD/package/scripts/vsftpd-client.py
[code language="python"]
#!/usr/bin/env python

import sys, os, pwd, signal, time
from resource_management import *
from subprocess import call

class vsftpdClient(Script):
  # Install CLIENT component
  def install(self, env):
 
    # Install packages listed in metainfo.xml
    self.install_packages(env)
    
    # add any other install steps that are necessary here

if __name__ == "__main__":
  vsftpdClient().execute()
[/code]

VSFTPD/package/scripts/params.py
[code language="python"]
#!/usr/bin/env python
from resource_management import *

# config object that holds the configurations declared in the config xml file
config = Script.get_config()

vsftpd_conf_dir = '/etc/vsftpd'

# store the log file for the service from the 'vsftpd.log' property of the 'vsftpd-config.properties' file
vsftpd_log = config['configurations']['vsftpd-config']['vsftpd.log']

# store the config properties for the templating engine to operate on
if 'vsftpd-config' in config['configurations']:
  vsftpd_config_map = config['configurations']['vsftpd-config']
else:
  vsftpd_config_map = {}
vsftpd_config_map_length = len(vsftpd_config_map)

# vsftpd_config_map = dict(config['configurations']['vsftpd-config'])

# select host or hosts where service checking function should be targeted
# service check itself might be driven from another host
vsftpd_master_hosts = config['clusterHostInfo']['vsftpd_master_hosts']
vsftpd_host = vsftpd_master_hosts[0]
[/code]

VSFTPD/package/scripts/service_check.py
[code language="python"]
#!/usr/bin/env python
from resource_management import *
import subprocess

class VsftpdServiceCheck(Script):
  def service_check(self, env):
    import params

    env.set_params(params)
    target_host = format("{vsftpd_host}")
    print ('service check target is: ' + target_host)
    full_command = [ "ssh", target_host, "/sbin/service", "vsftpd", "status" ]
    proc = subprocess.Popen(full_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (stdout, stderr) = proc.communicate()
    response = stdout

    # response is
    # vsftpd (pid NNNNN) is running...
    # or
    # vsftpd is stopped

    if 'stopped' in response:
      raise ComponentIsNotRunning()

if __name__ == "__main__":
  VsftpdServiceCheck().execute()
[/code]
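The service check above fails only when the word "stopped" appears in the output, so an empty response (for example, when the ssh call itself fails) would pass. A stricter interpretation of the two documented responses could look like the sketch below. The function name `is_vsftpd_running` is a hypothetical helper, and the sample strings are the two responses quoted in the script's comments.

```python
def is_vsftpd_running(status_output):
    """Return True only when the status text positively reports a running daemon."""
    text = status_output.lower()
    return "is running" in text and "stopped" not in text


if __name__ == "__main__":
    for sample in ("vsftpd (pid 12345) is running...", "vsftpd is stopped", ""):
        print(repr(sample), is_vsftpd_running(sample))
```

Requiring a positive "is running" match means that unexpected or empty output is treated as a failed check rather than a healthy one.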

VSFTPD/package/templates/vsftpd-config.j2
[code language="text"]
{% for key, value in vsftpd_config_map.iteritems() -%}
{{key}}={{value}}
{% endfor %}
[/code]
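The template writes one `key=value` line per entry of `vsftpd_config_map`. The plain-Python sketch below approximates that output without the template engine, which can be handy for previewing the generated properties file. The function name `render_properties` is hypothetical, and the sorting of keys is an addition made here for stable output; the template itself iterates in dictionary order.

```python
def render_properties(config_map):
    """Approximate the template output: one key=value line per configuration entry."""
    return "".join(
        "{0}={1}\n".format(key, value) for key, value in sorted(config_map.items())
    )


if __name__ == "__main__":
    print(render_properties({"vsftpd.log": "/var/log/vsftpd.log"}))
```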
VSFTPD/configuration/vsftpd-config.xml
[code language="xml"]
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

  <!-- Define configuration parameters for service: property name, default value and description (shown as help text) -->

  <!-- service log file -->
  <property>
    <name>vsftpd.log</name>
    <value>/var/log/vsftpd.log</value>
    <description>Log file for VSFTPD service</description>
  </property>

</configuration>
[/code]
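The defaults declared in this file are what the scripts later see as the `vsftpd-config` dictionary. As a rough illustration of that mapping, the sketch below reads the property names and default values from a configuration file of this shape. The helper name `default_properties` and the inline `SAMPLE_CONFIG` string are assumptions made for the example; Ambari performs its own loading of these files.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of vsftpd-config.xml, kept inline so the sketch is self-contained.
SAMPLE_CONFIG = """<configuration>
  <property>
    <name>vsftpd.log</name>
    <value>/var/log/vsftpd.log</value>
    <description>Log file for VSFTPD service</description>
  </property>
</configuration>
"""


def default_properties(config_xml):
    """Return a dict mapping each property name to its default value."""
    root = ET.fromstring(config_xml)
    return {
        prop.findtext("name"): prop.findtext("value", default="")
        for prop in root.findall("property")
    }


if __name__ == "__main__":
    print(default_properties(SAMPLE_CONFIG))
```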

Add the service to the BigInsights stack

    1. Copy the service configuration directory VSFTPD under the path /var/lib/ambari-server/resources/stacks/BigInsights/4.1/services.
    2. Restart ambari-server.
      [root@node1]# ambari-server restart
    3. Use the Ambari wizard and click on Add Service.

AddService

    4. Select Vsftpd from the listed services by clicking on the checkbox.

AddServiceVsftpd

Note:

For this service, the MASTER component corresponds to the FTP daemon and the CLIENT component is an FTP client.

You have to add the MASTER component to one node. Choose a cluster node that has the HDFS NFSGateway component so that you can try out the service once it has been added. If NFSGateway is not already part of the cluster, it can also be added after the FTP service has been added.
The cardinality value of 1 for MASTER in metainfo.xml indicates that exactly one node must be designated during installation to host this component. A cardinality value of 0+ for CLIENT in metainfo.xml indicates that it is optional.
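The cardinality strings listed in the metainfo.xml comments (0+, 1, 1+, 1-2, ALL) can be read as constraints on the number of hosts a component is assigned to. The sketch below encodes that reading in a small function. The name `cardinality_allows` and its parameters are hypothetical, and the interpretation of each form is an assumption based on the examples in this post rather than on Ambari's own validation code.

```python
def cardinality_allows(cardinality, host_count, cluster_size):
    """Check whether host_count satisfies a cardinality string such as 0+, 1, 1+, 1-2 or ALL."""
    spec = cardinality.strip().upper()
    if spec == "ALL":
        # Assumed meaning: the component must be placed on every host.
        return host_count == cluster_size
    if spec.endswith("+"):
        # "N+" is read as "at least N hosts".
        return host_count >= int(spec[:-1])
    if "-" in spec:
        # "N-M" is read as an inclusive range.
        low, high = spec.split("-", 1)
        return int(low) <= host_count <= int(high)
    # A bare number is read as "exactly N hosts".
    return host_count == int(spec)


if __name__ == "__main__":
    print(cardinality_allows("1", 1, 3))   # MASTER placed on one host
    print(cardinality_allows("0+", 0, 3))  # CLIENT placed on no hosts
```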

    5. Click through the rest of the screens, making any desired changes to the presented defaults. This will step through the following screens:

Assign Masters

Assign master components to hosts you want to run them on.

AddServiceAssignMastersModified

Assign Slaves and Clients

Assign slave and client components to hosts you want to run them on.

AddServiceAssignSlavesAndClientsModified

Customize Services

Customize the recommended configurations for the selected service.

AddServiceCustomizeServicesModified

Review

Review the configuration before installation.

AddServiceVsftpdReviewModified

    6. Click on the Deploy button.

Wait until the selected service components are installed and started as applicable. All nodes will report “Success” when the install, start and service-check steps are successful.

AddServiceVsftpdDeploySuccessModified2

    7. Navigate to the Services screen to view the details of the newly added service.

AmbariServicesWithVsftpd
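Besides the web interface, the stack definition can also be inspected through the Ambari REST API. The sketch below builds the stack-service URL for the stack name and version used in the directory path earlier and fetches it with basic authentication. The host name, port 8080 and credentials are placeholders that you must replace, the helper names are hypothetical, and the endpoint layout should be checked against the API documentation for your Ambari release. The code targets Python 3 and is meant to run on a workstation, not inside the Ambari agent.

```python
import base64
import json
import urllib.request


def stack_service_url(ambari_host, stack, version, service, port=8080):
    """Build the URL of a service definition inside a stack version."""
    return "http://{0}:{1}/api/v1/stacks/{2}/versions/{3}/services/{4}".format(
        ambari_host, port, stack, version, service
    )


def fetch_stack_service(url, user, password):
    """Fetch the service definition as a dict; requires a reachable Ambari server."""
    request = urllib.request.Request(url)
    token = base64.b64encode("{0}:{1}".format(user, password).encode("utf-8"))
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    request.add_header("X-Requested-By", "ambari")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # "node1" is a placeholder for the Ambari server host.
    print(stack_service_url("node1", "BigInsights", "4.1", "VSFTPD"))
```

Calling `fetch_stack_service` with that URL and valid Ambari credentials should return the registered definition if the service was picked up after the server restart.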

 

Transfer files to HDFS using the FTP service

    1. Ensure that the HDFS NFSGateway component is available on the FTP Server node and that it has been started. If you added the VSFTPD MASTER component (that is, the FTP Server) to a node that does not already have the HDFS NFSGateway component, you will have to add NFSGateway to that node before you can proceed. Once the NFSGateway component is available and has started successfully, continue to the next step.
      NFSGateways

 

    2. Mount HDFS on a local file system mount point on the FTP Server node. Log in as the root user on the FTP Server node to run the required commands. In this case /iophdfs has been chosen as the local file system mount point.
      VSFTPD_iophdfs_mount

 

    3. Start an FTP client on your workstation, or on any node outside the cluster, that holds the file(s) you want to transfer to HDFS. The following is an example:
      FtpClientSessionModified

 

    4. Check that the file(s) were transferred to HDFS: they should be listed both by the HDFS file system shell command and by Linux shell commands operating on the NFS-mounted directory.
      VsftpdTransferredFile
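The mount and transfer steps above can be sketched in Python as follows. The mount options shown (`vers=3,proto=tcp,nolock`) are the ones commonly suggested for the HDFS NFS gateway and are an assumption here, not a record of the commands used in this walkthrough. The function names, host names, user, password and remote directory are all hypothetical placeholders. The upload uses the standard `ftplib` module and targets Python 3.

```python
import ftplib
import os


def nfs_mount_command(gateway_host, mount_point):
    """Build the command (to be run as root) that mounts HDFS through the NFS gateway."""
    return [
        "mount", "-t", "nfs",
        "-o", "vers=3,proto=tcp,nolock",  # assumed options; check your gateway's documentation
        "{0}:/".format(gateway_host),
        mount_point,
    ]


def upload_file(ftp_host, user, password, local_path, remote_dir):
    """Upload one file over FTP into remote_dir, e.g. a directory under the HDFS mount."""
    with ftplib.FTP(ftp_host) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        with open(local_path, "rb") as handle:
            ftp.storbinary("STOR " + os.path.basename(local_path), handle)


if __name__ == "__main__":
    # Print the mount command for a placeholder host and the mount point used in this post.
    print(" ".join(nfs_mount_command("node1", "/iophdfs")))
```

On the client side, a call such as `upload_file("node1", "someuser", "secret", "data.csv", "/iophdfs/tmp")` would place the file in the mounted HDFS directory, provided the FTP user is allowed to change into and write to that path.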

 

Summary

In this blog post, we saw how to define and build a custom service to be managed by Ambari. We stepped through the process of adding that custom service to the BigInsights 4.1 stack, and saw the basic set of life cycle commands (install, start and service-check) execute successfully for its components. We then used the service together with one of the core services of IBM Open Platform, HDFS.

Sample code for the service configuration is made available. It can be used as a template to build other custom services or to try and experience how simple it can be to start adding custom services on top of IBM Open Platform.