IBM Spectrum Scale object storage combines the benefits of IBM Spectrum Scale with OpenStack Swift to manage data as objects, which can be accessed over the network using RESTful HTTP-based APIs. For more information on IBM Spectrum Scale Object, refer to:
https://www.redbooks.ibm.com/redpapers/pdfs/redp5113.pdf

To obtain object metrics for metering and billing purposes, OpenStack provides Ceilometer, which collects metrics on a per-account basis. While that is one recommended approach, this article shows another: customers can record Spectrum Scale object statistics with the ELK stack (for log analysis) and Kibana (to create customized usage reports) by leveraging the proxy-server logs.

OpenStack Swift produces rich INFO-level logging through its proxy-logging middleware, which can be used for cluster monitoring, utilization calculations, audit records, and more. The proxy-server logs contain a record of every external API request made to the proxy server in raw form. We can leverage these logs, extract the relevant GET/PUT/DELETE/HEAD request information with Elasticsearch, and use Kibana to create reports of per-account/tenant/project statistics.

Here are the steps to configure this setup:

1. Enable the Object protocol:
(For more details on installing and enabling object storage, refer to https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1ins_quickrefobjectstorage.htm)

2. Turn on INFO-level logging in proxy-server.conf with the following command:
mmobj config change --ccrfile proxy-server.conf --section DEFAULT --property log_level --value INFO
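You can confirm that the change took effect by listing the property back (a quick check; the exact mmobj config list syntax may vary slightly between Spectrum Scale releases):

mmobj config list --ccrfile proxy-server.conf --section DEFAULT --property log_level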

Note: Ensure that the proxy-logging middleware is present in the proxy-server.conf pipeline, as below:

pipeline = healthcheck cache formpost tempurl authtoken swift3 s3token keystoneauth container-quotas account-quotas staticweb bulk slo dlo proxy-logging sofConstraints sofDirCr proxy-server

3. Set up an ELK server to monitor and meter usage
Before diving into the ELK server configuration for monitoring proxy-server logs, let's quickly go through the basics of ELK.

a) Key components of the ELK stack
Elasticsearch, Logstash, and Kibana, when used together, are known as an ELK stack.
There are four key components of the ELK stack that we use for this task.
i. Elasticsearch
Elasticsearch is an open-source, distributed search and analytics engine based on Apache Lucene. It stores data in the form of documents and adds a searchable reference to each document in the cluster's index. It is popular for running analytics on large volumes of log data.
ii. Logstash
Logstash is an open-source tool used to parse data and ingest the formatted data into Elasticsearch for further analytics.
iii. Kibana
Kibana is an open-source web interface that can be used to search and visualize content indexed in an Elasticsearch cluster.
iv. Filebeat
Filebeat (one of the Beats components) is a lightweight shipper designed for log files. Filebeat monitors log directories or specific log files, tails them, and forwards the entries to either Elasticsearch or Logstash.

b) ELK component installation
Installing the ELK stack is a straightforward process. Download the packages that match your system architecture and operating system.
ELK package download page: https://www.elastic.co/downloads
For this use case, install the elasticsearch-6.2.4.rpm, kibana-6.2.4.rpm, and logstash-6.2.4.rpm packages on the ELK server, and install the filebeat-6.2.4-x86_64.rpm package on every Spectrum Scale CES node where the Object service is running.
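For example, on an RPM-based distribution the downloaded packages can be installed with rpm (adjust the paths to wherever you saved the files):

On the ELK server:
rpm -ivh elasticsearch-6.2.4.rpm kibana-6.2.4.rpm logstash-6.2.4.rpm

On every CES node:
rpm -ivh filebeat-6.2.4-x86_64.rpm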

Start the Elasticsearch and Kibana services on the ELK server to test the installation.
service elasticsearch start
service kibana start
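A quick way to verify that Elasticsearch is up is to query its REST endpoint; it should return a small JSON document with the cluster name and version (assuming the default port 9200):

curl http://localhost:9200

Kibana should likewise respond on its default port, 5601.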

Here is the logical diagram of the setup.

Note: For test purposes you can use a single-node ELK setup. In a production environment you should use a separate ELK cluster.
You can refer to the diagram below for a multi-cluster setup.

c) Pipelining logs from the Scale cluster to the ELK server
As stated above, the proxy-server logs are collected from the CES nodes in order to track all operations for metering and billing. Once Filebeat is installed on the CES nodes, you only need to make a few changes to its configuration.

Note: On Linux, you can find the configuration file of each component in its respective directory under /etc.

For the Filebeat configuration changes, edit the /etc/filebeat/filebeat.yml file on every CES node:

filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /var/log/swift/proxy-server.log

output.logstash:
  hosts: ["<ELK-server-IP>:5044"]

Note: Comment out the "Elasticsearch output" section, since we are using Logstash as the output for Filebeat. By default, the Elasticsearch output is enabled in filebeat.yml.
Start the Filebeat service.
service filebeat start
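Filebeat can also validate its own configuration and its connection to the configured Logstash output (these subcommands are available in Filebeat 6.x):

filebeat test config
filebeat test output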

The Logstash configuration is done on the ELK server:

Create a configuration file in the conf.d directory under the Logstash directory (/etc/logstash/conf.d) so that the Logstash service listens for Filebeat requests, for example proxyserver.conf:
input {
  beats {
    host => "<ELK-server-IP>"
    port => 5044
  }
}

filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:node} %{NOTSPACE} %{IP:remote_client} %{IP:remote_address} %{NOTSPACE} %{WORD:method} /v1/AUTH_%{USERNAME:user}%{NOTSPACE:request} %{WORD:server_protocol}%{NOTSPACE} %{NUMBER:status} %{NOTSPACE:referral} %{NOTSPACE:agent} %{NOTSPACE:token} %{NOTSPACE:bytes_received} %{NOTSPACE:bytes_sent} %{GREEDYDATA:log}" }
    remove_field => [ "message" ]
  }
  date {
    match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    remove_field => [ "timestamp" ]
  }
  mutate {
    convert => [ "bytes_received", "integer" ]
    convert => [ "bytes_sent", "integer" ]
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    manage_template => false
    index => "proxy_log"
  }
  stdout { codec => rubydebug }
}
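Before starting the service, you can ask Logstash to validate the configuration file; the --config.test_and_exit flag parses the config and reports errors without starting the pipeline (the binary path below assumes the default RPM install location):

/usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/proxyserver.conf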

Start the Logstash service on the ELK server.
service logstash start
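Once Filebeat and Logstash are both running and requests are flowing through the proxy server, documents should start appearing in the proxy_log index. A quick sanity check using the standard Elasticsearch _cat and _count APIs:

curl 'http://localhost:9200/_cat/indices/proxy_log?v'
curl 'http://localhost:9200/proxy_log/_count?pretty'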

Note: On Linux, you can find the logs of each component in its respective directory under /var/log, i.e., /var/log/<component>/<component>.log.

Logstash filters
After going through the configuration changes, you have probably noticed that the Logstash filters do most of the work. For this use case we are using two of the most powerful Logstash filters.

Grok
This filter parses arbitrary text and converts it into structured, queryable data. Logstash ships with more than 120 built-in patterns, and it is more than likely you will find one that meets your needs.

For example, a typical proxy-server log entry looks like:
May 1 05:59:18 a3n1 proxy-server: 10.0.100.41 10.0.100.41 01/May/2018/09/59/18 POST /v1/AUTH_18f14bf4a8a24cbbb06f3d8ed366b38c/new HTTP/1.0 404 – python-swiftclient-3.4.1.dev3 gAAAAABa6Dp1ZkAu… – 70 – tx03ac08511d8743d1adfef-005ae83a75 – 0.0098 – – 1525168758.069506884 1525168758.079281092 –

With the following Grok filter:
%{SYSLOGTIMESTAMP:timestamp} %{WORD:node} %{NOTSPACE} %{IP:remote_client} %{IP:remote_address} %{NOTSPACE} %{WORD:method} /v1/AUTH_%{USERNAME:user}(%{NOTSPACE}) %{WORD:server_protocol}%{NOTSPACE} %{NUMBER:status} %{NOTSPACE:referal} %{NOTSPACE:agent} %{NOTSPACE:token} %{NOTSPACE:bytes_received} %{NOTSPACE:bytes_sent}

The structured output looks like this:

{
  "timestamp": [["May 1 05:59:18"]],
  "MONTH": [["May"]],
  "MONTHDAY": [["1"]],
  "TIME": [["05:59:18"]],
  "HOUR": [["05"]],
  "MINUTE": [["59"]],
  "SECOND": [["18"]],
  "node": [["a3n1"]],
  "NOTSPACE": [["proxy-server:", "01/May/2018/09/59/18", "/new", "/1.0"]],
  "remote_client": [["10.0.100.41"]],
  "IPV6": [[null, null]],
  "IPV4": [["10.0.100.41", "10.0.100.41"]],
  "remote_address": [["10.0.100.41"]],
  "method": [["POST"]],
  "user": [["18f14bf4a8a24cbbb06f3d8ed366b38c"]],
  "server_protocol": [["HTTP"]],
  "status": [["404"]],
  "BASE10NUM": [["404"]],
  "referal": [["-"]],
  "agent": [["python-swiftclient-3.4.1.dev3"]],
  "token": [["gAAAAABa6Dp1ZkAu…"]],
  "bytes_received": [["-"]],
  "bytes_sent": [["70"]]
}

Mutate: Mutate is another filter that performs general transformations on event fields. You can rename, remove, replace, and modify fields in your events.

Example: Here we used mutate to change the field type of "bytes_received" and "bytes_sent" from string to integer (since grok emits all fields as strings). This lets Elasticsearch perform arithmetic operations on them, such as sum, average, min, and max.

Live monitoring with Kibana
Kibana provides a user-friendly dashboard that helps you present the data as tables, charts, and much more.
Kibana URL: http://127.0.0.1:5601
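Kibana visualizations are built on top of ordinary Elasticsearch aggregations, so you can also query the index directly. As a sketch, the query below sums bytes_sent per account and per HTTP method (it assumes the default dynamic mapping, which exposes the string fields through .keyword sub-fields):

curl -s -H 'Content-Type: application/json' 'http://localhost:9200/proxy_log/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "per_account": {
      "terms": { "field": "user.keyword" },
      "aggs": {
        "per_method": {
          "terms": { "field": "method.keyword" },
          "aggs": { "total_bytes_sent": { "sum": { "field": "bytes_sent" } } }
        }
      }
    }
  }
}'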

To give you a quick start, here is the JSON of our dashboard, which you can import into your setup to get started:
https://developer.ibm.com/storage/wp-content/uploads/sites/91/2018/05/export-json.txt

Note: To import the JSON into the Kibana dashboard, go to Management > Saved Objects > Import.

Here is a snapshot of our Kibana dashboard with all the visualizations created in Kibana.


This exercise gives you insight into the number of object operations (GET/PUT/DELETE/HEAD) and the bytes sent/received per account/project/tenant.

If you are looking for the capacity consumed per account or per container, it can be seen with the "swift stat" and "swift stat <container>" commands respectively:

[root@a3n1 ~]# swift stat
Account: AUTH_0800f052b8e44ac4a6e63f308d2be57c
Containers: 5
Objects: 141
Bytes: 25432865
Containers in policy “policy-0”: 1
Objects in policy “policy-0”: 141
Bytes in policy “policy-0”: 25432865
Containers in policy “sof”: 4
Objects in policy “sof”: 0
Bytes in policy “sof”: 0
X-Openstack-Request-Id: txee19e446805f408ebecb0-005ae9a655
X-Timestamp: 1525160198.31163
X-Trans-Id: txee19e446805f408ebecb0-005ae9a655
Content-Type: text/plain; charset=utf-8
Accept-Ranges: bytes

[root@a3n1 ~]# swift stat new
Account: AUTH_0800f052b8e44ac4a6e63f308d2be57c
Container: new
Objects: 141
Bytes: 25432865
Read ACL:
Write ACL:
Sync To:
Sync Key:
Accept-Ranges: bytes
X-Storage-Policy: policy-0
Last-Modified: Wed, 02 May 2018 07:02:21 GMT
X-Timestamp: 1525160198.49078
X-Trans-Id: txa1cb1e69d5f6460d970d3-005ae9a38e
Content-Type: text/plain; charset=utf-8
X-Openstack-Request-Id: txa1cb1e69d5f6460d970d3-005ae9a38e

Note: You can write scripts that periodically generate capacity reports and feed them into ELK, so that ELK can also show capacity per tenant/account in addition to the GET/PUT usage.
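As a minimal sketch of such a script (it assumes OpenStack credentials for each project are available as openrc files on the node, and that the output file is added to the Filebeat paths so it is shipped just like the proxy-server log; /root/openrc-* and /var/log/swift/capacity.log are hypothetical names):

#!/bin/bash
# Dump per-account capacity from 'swift stat' as JSON lines for Filebeat to ship.
for rc in /root/openrc-*; do
    source "$rc"
    swift stat | awk -v ts="$(date -u +%FT%TZ)" '
        /Account:/    { acct  = $2 }
        /^ *Objects:/ { objs  = $2 }
        /^ *Bytes:/   { bytes = $2 }
        END { printf "{\"timestamp\":\"%s\",\"account\":\"%s\",\"objects\":%s,\"bytes\":%s}\n", ts, acct, objs, bytes }
    ' >> /var/log/swift/capacity.log
done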

And if you are a system administrator who needs to monitor health and load, you can use the mmperfmon tool to monitor object metrics. Here is the link for more details:

Monitoring Spectrum Scale Object metrics
