We have seen from customer SMF data on z/OS that some of the DNS and SSL requests took a long time. What can cause this?

Looking up IP Hostnames and IP addresses

TCP/IP connectivity uses numeric addresses such as 9.4.10.1. These are hard to remember, so hostnames such as MVSA.IBM.COM can be used instead. To convert MVSA.IBM.COM to 9.4.10.1, a Domain Name Server (DNS) is used. There are a few master servers, but many local DNS servers. You can issue a request, for example with the NSLOOKUP command, to get the host name from an IP address, or the IP address from a host name. Your request will most probably go to a local DNS. If it has the value in its cache, it returns it. If it does not have the value in its cache, it asks another DNS (closer to the master server), and so on.

  1. This means that if the value is in the DNS cache the response is usually very fast.
  2. If the value is not in the DNS cache the request can take a long time. We have seen an instance where the DNS was misconfigured and was passing the request on to a DNS server a long way away, over a slow network, causing long response times (over 1 second).
  3. The observed behaviour is that the first time the DNS sees a value it can take a long time; successive requests should be much faster.
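
The cached versus uncached behaviour can be observed by timing two successive lookups of the same name. The following is a minimal sketch, not a measurement tool: the function names are made up for illustration, and the resolver is passed in as a parameter so a different lookup routine can be substituted. Whether the second lookup is actually faster depends on where caching happens in your environment.

```python
import socket
import time


def timed_lookup(hostname, resolver=socket.gethostbyname, clock=time.perf_counter):
    """Resolve hostname once and return (address, elapsed_seconds)."""
    start = clock()
    address = resolver(hostname)
    return address, clock() - start


def compare_lookups(hostname, resolver=socket.gethostbyname):
    """Time two successive lookups of the same name.

    Returns (first_seconds, second_seconds). If a cache sits between this
    program and the authoritative server, the second value is usually smaller.
    """
    _, first = timed_lookup(hostname, resolver)
    _, second = timed_lookup(hostname, resolver)
    return first, second
```

For example, `compare_lookups("MVSA.IBM.COM")` returns the two elapsed times in seconds, which can be compared with the durations reported for the DNS task.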

A hostname is used in two places.
In channel definitions for connecting to another queue manager. This should be a “well known” set of names which you define, and so the value should be in the DNS cache.

In Channel Auth records. These allow you to define rules such as: if the incoming connection is from LINUX1.IBM.COM, then set the userid to “INQUIRY”. The logic is as follows:

  1. A client with a numeric IP address tries to connect to the chinit.
  2. A DNS request is made to get back the host name.
  3. The hostname is checked against the Channel Auth records.
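
The three steps above can be sketched as follows. This is an illustration of the flow only, not how the chinit is implemented: the `RULES` table, the function names, and the wildcard matching are assumptions made for the example. The reverse lookup in step 2 is the part that can be slow when the address is not in the DNS cache.

```python
import socket
from fnmatch import fnmatchcase

# Hypothetical rule table: host name pattern -> userid to assign.
RULES = {"LINUX1.IBM.COM": "INQUIRY"}


def reverse_lookup(ip_address):
    """Return the host name for ip_address, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip_address)
        return hostname
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def map_userid(ip_address, rules=RULES, resolver=reverse_lookup):
    """Step 2: reverse DNS lookup. Step 3: match the name against the rules."""
    hostname = resolver(ip_address)
    if hostname is None:
        return None
    for pattern, userid in rules.items():
        # Compare in upper case so the match is not case sensitive.
        if fnmatchcase(hostname.upper(), pattern.upper()):
            return userid
    return None
```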

If this is the first time the client has tried to connect to the chinit, the IP address may not be in the DNS cache, and so this could take a long time.
You may have no control over the clients or channels trying to connect to your queue manager (although a firewall may be able to limit which addresses can access your environment), and so these lookups may take a long time.
This request to the DNS is controlled by the queue manager Reverse DNS option, REVDNS(ENABLED). If REVDNS(DISABLED) is specified then there is no DNS lookup. This means you need to specify IP addresses instead of host names in the Channel Auth records.

Specifying hostnames in CHLAUTH is considered risky, as the rule depends on the content of the DNS server reply, which could be manipulated.

SSL/TLS channels.

These channels use digital certificates as part of the protocol. Once a certificate has been created and sent out to the users, it can be revoked to make it no longer valid. A typical case is someone leaving an organization, and so the certificate should no longer be used.
The user of the certificate (such as the chinit) can check the validity of a certificate by asking an LDAP server whether it is on the Certificate Revocation List (CRL). If it is on the CRL then the certificate is not valid and will not be used.
On distributed MQ the certificates can be checked using the Online Certificate Status Protocol (OCSP).

These requests go to an LDAP server.
Once the validity of the certificate has been checked, the information may be cached for a period, for example 1 hour.
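
The effect of caching a validity check for a period can be sketched with a small time-limited cache. This is a generic illustration under stated assumptions, not the mechanism MQ or System SSL uses: the class name, the `check` callback (standing in for the slow LDAP request), and the use of a certificate serial number as the key are all hypothetical.

```python
import time


class ValidityCache:
    """Remember the result of a slow validity check for ttl_seconds."""

    def __init__(self, check, ttl_seconds=3600, clock=time.monotonic):
        self._check = check      # slow call, for example a request to an LDAP server
        self._ttl = ttl_seconds  # 3600 seconds = the 1 hour used in the text
        self._clock = clock
        self._entries = {}       # serial -> (is_valid, expiry_time)

    def is_valid(self, serial):
        now = self._clock()
        cached = self._entries.get(serial)
        if cached is not None and cached[1] > now:
            return cached[0]     # still fresh: no request is made
        result = self._check(serial)
        self._entries[serial] = (result, now + self._ttl)
        return result
```

With a cache like this, only the first check of a certificate in each period pays the cost of the request to the server; later checks within the period return immediately.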

The configuration and location of the LDAP server can affect the response time of the LDAP requests.

What data can help me?

On z/OS you can get the following data from the SMF records.

DNS requests on the DNS TCB

  1. The average request duration
  2. The count of requests
  3. The longest duration request and the date and time when this occurred
  4. How busy the TCB was

SSL requests – the number of TCBs for SSL requests is controlled by ALTER QMGR SSLTASKS()

  1. The average request duration
  2. The count of requests
  3. The longest duration request and the date and time when this occurred
  4. How busy each TCB was
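
To show how these four figures relate to one another, the sketch below derives them from a list of individual request durations. It does not read SMF records: the `Request` type and `summarise` function are hypothetical, and "busy" is approximated here as the time spent in requests divided by the length of the interval, which may differ from how the SMF data calculates it.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Request:
    started: datetime        # when the request was issued
    duration_seconds: float  # how long it took


def summarise(requests, interval_seconds):
    """Return count, average, longest request and busy % for one task."""
    count = len(requests)
    if count == 0:
        return {"count": 0, "average": 0.0, "longest": 0.0,
                "longest_at": None, "busy_percent": 0.0}
    total = sum(r.duration_seconds for r in requests)
    longest = max(requests, key=lambda r: r.duration_seconds)
    return {
        "count": count,
        "average": total / count,
        "longest": longest.duration_seconds,
        "longest_at": longest.started,
        "busy_percent": 100.0 * total / interval_seconds,
    }
```

A low average with a high longest duration suggests an occasional slow request, such as a first-time DNS lookup, rather than a server that is slow all the time.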
