Apache CouchDB is a versioned JSON-encoded object store that supports modern applications with HTTP APIs, distributed/replicated elastic scalability, rich queries, eventual consistency, and many other capabilities. However, CouchDB does not provide built-in, out-of-the-box encryption of data at rest. This tutorial shows you how to implement encryption for data at rest in a clustered server configuration, employed in a permissioned Hyperledger Fabric blockchain application. The approach described here applies to any application that needs to secure data at rest in CouchDB.
The diagram in Figure 1 illustrates the deployed solution.
Figure 1. The deployed solution
If you haven’t already done so, we recommend that you read the previous tutorial in this series, Secure a Hyperledger Fabric sample app with a custom CA and deploy it to a Kubernetes cluster.
While the implementation details presented here are derived from a containerized CouchDB server running in a Hyperledger Fabric blockchain application deployed on Kubernetes, the concepts in this tutorial are very much applicable to securing data on any Linux-based CouchDB database server. You’ll need the following software (or later versions) installed, but the steps in this tutorial walk you through the installations:
- A Docker image of CouchDB with cryptsetup 1.7.4 (if CouchDB is run as a container; a Dockerfile to build an image is provided in the reference material) and LVM on the Linux host
- A Linux host with CouchDB, cryptsetup 1.7.4, and LVM installed
In addition to these step-by-step instructions, you can find all of the materials you need to repeat the steps discussed in this tutorial in this repository. The repo contains a README document, a Dockerfile, a Kubernetes configuration file (offchain-db.yaml), and an init.sh bash shell script that has been validated in an operationally deployed configuration.
Completing this tutorial should take about 1 hour.
Secure deployment of CouchDB consists of engineering and enforcing controls at various levels of the entire solution stack. Figure 2 depicts a typical deployment of CouchDB for a permissioned Hyperledger blockchain application. The diagram applies to solutions deployed on premises, in public or hybrid clouds, or in other data center environments.
Figure 2. Typical CouchDB solution for a Hyperledger blockchain application
Protecting the solution stack consists of providing security across two logical tiers.
Infrastructure layer protection typically involves protecting processors, storage, memory, network components, operating systems, and virtual machines.
Enforcing network security, which is part of the infrastructure layer, consists of tiered separation of computing solution components (three or more typical tiers), internal/external stateful/stateless firewalls, IP tables, filters, protocol/port-based packet filtering, allowlists/blocklists, and more.
Hardening operating system security in physical and virtual machine instances addresses the security aspects of the infrastructure layer.
Providing security at the application and software layers consists of engineering protection of the CouchDB versioned object store and other related application layers.
- Application and database software protection consists of protecting data while at rest in storage and while being used in applications, as well as protecting data while it is in transit on the wire between:
- Clients and servers
- Server nodes
- Application and storage components
- Various stack layers
- Encryption of data at rest implies protecting data while it is being stored on the physical media.
Broadly speaking, protection involves encrypting data in transit on the wire (with secure communication protocols such as HTTPS), at rest on physical media, and in the applications that handle it, along with managing the private/secret keys used in the encryption.
Data on physical media
CouchDB stores data on the physical media in the form of JSON-encoded documents, with document attachments and indexes for the documents. It is important to encrypt data being stored on physical media when using CouchDB in order to safeguard the data from being stolen or accessed without permission. That is the focus of this tutorial: We show you how to take advantage of the established block device encryption method on Linux and employ Linux Unified Key Setup (LUKS) and Logical Volume Management (LVM) technologies.
Data in transit
Protection of data from eavesdropping while in transit between clients and database servers or between different database servers can be achieved by configuring the CouchDB server to communicate only over TLS-secured connections. This is a built-in capability of the server.
Data in use
Encrypting data in CouchDB applications involves encrypting sensitive data while in use in memory prior to sending the data to the CouchDB server for storage on physical media. By employing data encryption in CouchDB application logic, application developers can control the encryption in a fine-grained manner (such as encryption of documents as a whole or in parts, or encryption/non-encryption of attachments). In addition, an app developer or administrator would be responsible for managing key rotation, safekeeping (such as secure key ring or hardware security module), revocation, and selecting encryption algorithms.
While data in transit and data in use are as important as data on physical media, their implementation methods and details are outside the scope of this tutorial. For an in-depth exploration of CouchDB authentication, authorization, and auditing topics, please refer to the links in the Resources section.
Block device encryption in Linux
To protect sensitive and confidential data on system-attached block devices and external backup volume devices, Linux distributions offer dm-crypt block device encryption. Encrypted block devices can be created using one of two supported formats: plain dm-crypt or the extended LUKS format. Each of these approaches has its pros and cons. LUKS provides support for multiple user passphrases, their management, and a standard on-disk format for hard disk encryption that is compatible across a variety of Linux distributions. Since all of the required setup information for disk encryption is stored in the LUKS partition header, the transportation and migration of disk data becomes easy to manage using this disk encryption approach. LUKS features a metadata header that it stores at the beginning of the device as the partition header, and has eight key slots, one per passphrase. LUKS encrypts the data with a single master key; each active key slot holds a copy of that master key, stored in anti-forensic stripes and encrypted with a key derived from the slot’s passphrase, so any valid passphrase can decrypt the master key.
LUKS also offers the following capabilities:
- Ability to add, remove, and change up to 8 passphrases
- Protection from low-quality, low-grade passphrases through salted, iterated PBKDF2 passphrase hashing
- Kernel random number-generated encryption keys
- Enhanced usability with automatic configuration of non-default crypto parameters
One vulnerability of LUKS is that it makes obvious the fact that the block device is encrypted. If the header or key slots are corrupted, it could lead to permanent data loss unless mitigating tasks are undertaken (backing up the metadata partition header for a restore).
On the other hand, plain disk encryption does not offer many of the attractive features of the LUKS format. However, it also stores no metadata on the disk, so there is no header to act as a single point of failure, which can improve resiliency. Additionally, plain disk encryption makes the presence of encryption hard to detect, especially if the disk is overwritten with crypto-grade random data (using /dev/urandom) prior to initial device creation and configuration. To set up plain disk encryption, all the required parameters have to be passed to the cryptsetup command on the command line, and cryptsetup derives a master key from the passphrase that’s passed in. The obvious downside of plain dm-crypt is that it derives the key directly from the chosen passphrase, with no salting or iteration, so it’s important to choose a high-quality passphrase.
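As a rough illustration of what "passing all the required parameters" looks like in plain mode, the sketch below opens a plain dm-crypt mapping with the cipher, key size, and passphrase hash stated explicitly. The device path, the mapping name, and the specific cipher values are assumptions chosen for illustration, not values taken from the tutorial's repository. The script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: open a plain dm-crypt mapping with every parameter stated explicitly.
# Plain mode keeps no header, so the same values must be supplied on every open.
# DEVICE, NAME and the cipher settings are illustrative assumptions.
DEVICE="${DEVICE:-/dev/sdX}"
NAME="${NAME:-plain_drive}"

# Print commands by default; set DRY_RUN=0 to execute them (requires root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

open_plain() {
    # $1 = block device, $2 = mapping name created under /dev/mapper
    run cryptsetup open --type plain \
        --cipher aes-xts-plain64 --key-size 512 --hash sha256 \
        "$1" "$2"
}

open_plain "$DEVICE" "$NAME"
```

Because nothing is recorded on disk, opening the device later with a different cipher, key size, or hash silently produces a different key and unreadable data, so these values need to be documented alongside the passphrase.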
Logical volume management in Linux
Note: The following information on logical volume management (LVM) concepts and management, which is taken from our implemented example involving Red Hat Linux, is equally applicable to other Linux distributions and should not discourage you from replicating the work on other Linux variants.
Volume management in Red Hat Linux enables the creation of a layer of abstraction over physical storage that results in management of logical storage volumes. Managing logical volumes enables greater storage management flexibility than is possible by managing physical storage directly. With logical storage volume management, physical disk size limitations can be overcome, storage configurations can be resized or relocated without software being aware of such relocation, file systems can span across multiple disks, failing disks can be removed from the storage pool, and the addition of newer, faster storage devices, disk striping, mirroring, and snapshot capturing can be supported.
Figure 3 illustrates the concept of LVM and the components of LVM in Red Hat Enterprise Linux.
Figure 3. LVM components
For an in-depth learning roadmap of LVM concepts, please refer to the Red Hat documentation.
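To make the physical volume, volume group, and logical volume layering concrete, here is a minimal sketch using the standard LVM commands. The disk path, the volume group and logical volume names, and the 10G size are hypothetical; the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: carve one logical volume out of one physical disk.
# /dev/sdX, vg_couchdb, lv_couchdb and 10G are hypothetical example values.
DISK="${DISK:-/dev/sdX}"
VG="${VG:-vg_couchdb}"
LV="${LV:-lv_couchdb}"
SIZE="${SIZE:-10G}"

# Print commands by default; set DRY_RUN=0 to execute them (requires root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

create_lv() {
    # $1 = disk, $2 = volume group, $3 = logical volume, $4 = size
    run pvcreate "$1"                  # mark the disk as an LVM physical volume
    run vgcreate "$2" "$1"             # pool it into a volume group
    run lvcreate -L "$4" -n "$3" "$2"  # allocate a logical volume from the pool
}

create_lv "$DISK" "$VG" "$LV" "$SIZE"
# The resulting block device appears as /dev/<volume group>/<logical volume>.
```

The logical volume created this way is an ordinary block device, so it can be handed to cryptsetup in place of a raw disk or partition.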
General setup and configuration of LUKS
Standard Linux kernel modules provide support for ciphers and digest algorithms. The device mapper in the Linux kernel provides a way to encapsulate block devices such as commonly used LVM logical volumes. The device mapper crypt target (dm-crypt) uses the kernel crypto API to provide transparent encryption of block devices in Linux. In Red Hat Enterprise Linux, the cryptsetup configuration tool enables admin users to interact with dm-crypt to set up and operate encrypted block devices. The cryptsetup tool makes use of the underlying device mapper infrastructure to apply the configuration that the admin user specifies.
Red Hat Enterprise Linux version 7 uses and requires a main package called cryptsetup. The cryptsetup package uses cryptsetup-libs, and these packages are included by default in any out-of-box installation.
Configuring a new LUKS-encrypted block device involves specifying four basic and essential parameters:
- Symmetric encryption cipher
- Cipher block mode
- Initialization vector (IV) for the cipher block mode
- Encryption key size
The cryptsetup tool provides sensible default values for these parameters, which usually do not require tweaking. The configurable cipher options are: AES, Twofish, Serpent, cast5, and cast6. However, we recommend that you avoid these custom configurations and instead use the default values.
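Purely to show how the four parameters map onto the command line, the sketch below assembles a cipher specification from its parts and prints (without running) the corresponding luksFormat command. The values aes, xts, plain64, and 256 are assumptions that match what cryptsetup 1.7.x commonly ships as LUKS defaults; the last lines of the cryptsetup --help output report the defaults compiled into your build. The helper names and /dev/sdX are hypothetical.

```shell
#!/bin/sh
# Sketch: map the four LUKS parameters onto cryptsetup options.
# Nothing is executed against a device; the command line is only printed.

cipher_spec() {
    # $1 = cipher, $2 = block mode, $3 = IV generator  ->  cipher-mode-iv
    printf '%s-%s-%s\n' "$1" "$2" "$3"
}

format_cmd() {
    # $1..$3 as above, $4 = key size in bits, $5 = block device
    printf 'cryptsetup luksFormat --cipher %s --key-size %s %s\n' \
        "$(cipher_spec "$1" "$2" "$3")" "$4" "$5"
}

# Assumed example values; /dev/sdX is a placeholder device.
format_cmd aes xts plain64 256 "${DEVICE:-/dev/sdX}"
```

Running luksFormat with no cipher options at all selects the compiled-in defaults, which is the route recommended above.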
To learn how to use cryptsetup, use the cryptsetup --help and man cryptsetup commands.
The following are the basic steps for configuring block device encryption using the cryptsetup tool, which can be used to encrypt and decrypt external drives, flash devices, secondary or primary hard disks, and more:
Install the cryptsetup package:
sudo yum install -y cryptsetup
Format the disk using:
sudo cryptsetup luksFormat /dev/<dev>
Be sure to provide a strong, compliant passphrase.
To set up a file system on a newly formatted but empty encrypted device, the device needs to be opened up first:
sudo cryptsetup luksOpen /dev/<dev> encrypted_drive
At this point, the encrypted drive is mapped and available for creating a file system.
To create a labeled file system (such as ext4) on the encrypted drive, you would typically use:
sudo mkfs.ext4 -L "encrypted_drive" /dev/mapper/encrypted_drive
To mount the encrypted drive onto an existing mount point, you would typically use:
sudo mount /dev/mapper/encrypted_drive /media/encrypted_drive
You can now save the data to this encrypted drive as you would normally, and the rest of the usage should be transparent.
To unmount and close the device, use:
sudo umount /media/encrypted_drive
sudo cryptsetup luksClose /dev/mapper/encrypted_drive
To change the passphrase in a given key slot (slot 0 here), you can use this:
sudo cryptsetup luksChangeKey /dev/<dev> -S 0
For automatic mounting of the encrypted device, use the /etc/crypttab configuration definitions (manual or automatic passphrase entry using a keyfile).
To avoid losing access to the encrypted drive and its data because a single passphrase is lost, define more than one passphrase, up to a maximum of eight.
To see which key slots are in use and which are free, use the cryptsetup luksDump /dev/<dev> command.
To create additional passphrases, use the cryptsetup luksAddKey /dev/<dev> command.
To revoke a passphrase that you still know (for example, one that has been compromised), use the cryptsetup luksRemoveKey /dev/<dev> command.
To remove a forgotten passphrase from a known slot when at least one other passphrase is known, use the cryptsetup luksKillSlot /dev/<dev> <slot number> command.
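When scripting key slot housekeeping, it can help to count the free slots before adding a passphrase. The sketch below does this by filtering luksDump output. It assumes the LUKS1 dump layout, in which each slot is reported on a line of the form "Key Slot N: ENABLED" or "Key Slot N: DISABLED"; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: count free key slots in `cryptsetup luksDump` output read from stdin.
# Assumes the LUKS1 dump layout ("Key Slot N: ENABLED" / "Key Slot N: DISABLED");
# other header versions print a different layout and need a different pattern.
free_slots() {
    # grep -c prints the number of matching lines (0 when there are none)
    grep -c '^Key Slot [0-7]: DISABLED' || true
}

# Typical use, as root:
#   cryptsetup luksDump /dev/<dev> | free_slots
```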
To automatically provide a passphrase at boot time to open and mount an encrypted drive, prepare a keyfile, keep it in a well-protected location, and use it at boot time:
- Define the keyfile: dd if=/dev/random bs=32 count=1 of=/root/lukskey
- Add the keyfile to the device (you are prompted for an existing passphrase): cryptsetup luksAddKey /dev/<dev> /root/lukskey
- Create the crypttab entry: encrypted_drive /dev/<dev> /root/lukskey
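The keyfile steps above can be sketched as a small script. Two details here are assumptions rather than part of the tutorial: the sketch reads from /dev/urandom instead of /dev/random so that it never blocks, and it appends the luks option as the fourth crypttab field. It writes to a temporary path by default so it can be tried without root; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: generate a 32-byte keyfile and print the matching crypttab line.
# KEYFILE defaults to a throwaway temp path so this runs unprivileged;
# the tutorial's real location is /root/lukskey.
KEYFILE="${KEYFILE:-$(mktemp -d)/lukskey}"

make_keyfile() {
    # $1 = destination path. umask 077 keeps the file owner-only from creation.
    ( umask 077 && dd if=/dev/urandom of="$1" bs=32 count=1 2>/dev/null )
    chmod 400 "$1"   # read-only for the owner afterwards
}

crypttab_line() {
    # $1 = mapping name, $2 = device, $3 = keyfile; "luks" is the options field
    printf '%s %s %s luks\n' "$1" "$2" "$3"
}

make_keyfile "$KEYFILE"
crypttab_line encrypted_drive '/dev/<dev>' "$KEYFILE"

# Remaining steps, as root and not run here:
#   cryptsetup luksAddKey /dev/<dev> "$KEYFILE"   # prompts for an existing passphrase
#   append the printed line to /etc/crypttab
```

Keeping the keyfile readable only by root matters because anyone who can read it can unlock the device.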
Back up LUKS headers so that you can restore them if a header is damaged, because an encrypted device cannot be decrypted without a valid header. Perform regular backups of headers using:
cryptsetup luksHeaderBackup /dev/luksdevice --header-backup-file <filepath>/headerbackupfile
Verify the backed-up header file passphrase prior to using it in a restore:
cryptsetup luksOpen /dev/luksdevice encrypted_drive --header <filepath>/headerbackupfile
Then enter the passphrase to verify that the header has been backed up correctly.
To perform a restore operation for a header backup, use:
cryptsetup luksHeaderRestore /dev/luksdevice --header-backup-file <filepath>/headerbackupfile
To minimize exposure, leakage, and vulnerability, it is a good idea to initialize or overwrite a newly created LUKS/dm-crypt partition using one of the following standard commands:
dd_rescue -w /dev/zero /dev/mapper/encrypted_drive
cat /dev/zero > /dev/mapper/encrypted_drive
dd if=/dev/zero of=/dev/mapper/encrypted_drive
Security considerations for this approach
It is important to recognize the limitations inherent in this implemented solution and engineer risk remediation steps as appropriate. This implementation has two basic exposures:
- The root user can access and read encrypted volumes once the encrypted volumes are mounted.
- The LUKS keys and passphrases must themselves be kept secure.
To avoid root or privileged user access to container instances or hosts, it’s a good idea to create an identity and access management (IAM) role with only the required permissions to the resource(s), and map the container or host instance to that role at the time the container or host is created. Then have the application(s) running inside the container or host assume the mapped IAM role and automatically obtain a time-limited temporary access token (typically using vendor-provided SDK support) to perform the application task at hand. This approach removes the need for resource authentication, storing application/access keys in the program logic or configuration files, revocation, rotation, or explicit invalidation of keys.
To avoid losing or compromising private keys, the recommended approaches are:
- Employ Kubernetes secrets
- Store keys in encrypted form in a physically secured security vault
- Store keys securely in a software or hardware security module
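For the Kubernetes secrets option, a minimal sketch of storing a LUKS keyfile as a Secret might look like the following. The secret name, data key, namespace, and keyfile path are all hypothetical, and the script only prints the kubectl command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: store a LUKS keyfile as a Kubernetes Secret.
# Secret name, data key, namespace and keyfile path are hypothetical values.
SECRET_NAME="${SECRET_NAME:-luks-keyfile}"
KEYFILE="${KEYFILE:-/root/lukskey}"
NAMESPACE="${NAMESPACE:-default}"

# Print commands by default; set DRY_RUN=0 to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

create_key_secret() {
    # $1 = secret name, $2 = keyfile path, $3 = namespace
    run kubectl create secret generic "$1" \
        --from-file="lukskey=$2" --namespace "$3"
}

create_key_secret "$SECRET_NAME" "$KEYFILE" "$NAMESPACE"
```

Note that Secret values are base64-encoded rather than encrypted unless the cluster is configured to encrypt them at rest, so access to the Secret should be restricted accordingly.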
In this tutorial, you learned how to set up a clustered Apache CouchDB to encrypt data at rest and deploy it for production use in the context of a permissioned Hyperledger Fabric blockchain application. In addition, you learned about some of the capabilities of Apache CouchDB and how to architect an end-to-end solution featuring encryption of data at rest and data in transit, and protecting data while it is being processed in various components of a multi-tier solution, as well as various encryption approaches that are available in Linux distributions.