With the introduction of privacy laws such as GDPR and CCPA, it has become increasingly important to process users' information in a secure environment. Compliance with the Federal Information Processing Standards (FIPS) is one of the most widely adopted ways to do so. This tutorial describes the relevant considerations and the detailed steps for achieving FIPS compliance when processing big data using Apache Spark. These steps can also help you secure other big data processing platforms.
Since this specification is periodically revised by the government, make sure you are working against the currently applicable compliance specification.
Cloud hardware security modules (HSMs) and Keep Your Own Key (KYOK) take enterprise security to another level. This tutorial briefly covers some of the relevant details about this.
Note: Please review your country-specific export guidance for cryptographic software before proceeding with your FIPS setup.
Note: This tutorial suggests various configurations and addresses common development-related issues, but it does not guarantee FIPS compliance.
Prerequisites
To complete this tutorial, you will need an operating system that has a FIPS compliance mode. Making an OS FIPS compliant manually is more involved and is not the main focus of this tutorial; however, I provide some hints on how to switch to open source libraries that offer a FIPS mode.
Estimated time
Completing this tutorial should take about 30 minutes.
Setting up the environment for enabling FIPS for Spark
Configuring Red Hat Enterprise Linux 7.7 to be FIPS compliant: Red Hat Enterprise Linux (RHEL) can be set up to be FIPS compliant. Instructions and other specific details can be found on the Red Hat website.
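Once the OS is configured, the kernel exposes a flag at /proc/sys/crypto/fips_enabled that you can check. Here is a minimal sketch in Java (the class name is mine, for illustration only):

import java.nio.file.Files;
import java.nio.file.Paths;

// Minimal check: on RHEL, the kernel exposes the FIPS flag at
// /proc/sys/crypto/fips_enabled; "1" means FIPS mode is active.
public class FipsModeCheck {
    public static void main(String[] args) throws Exception {
        String flag = new String(
                Files.readAllBytes(Paths.get("/proc/sys/crypto/fips_enabled"))).trim();
        System.out.println("1".equals(flag)
                ? "Kernel FIPS mode is enabled."
                : "Kernel FIPS mode is NOT enabled.");
    }
}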
Configuring JVM
This step is required if PKCS11 support has to be enabled. By default, this provider is not registered in the JVM on most systems.
Step 1. Installing Mozilla NSS libraries
This step is required because Java does not come with a native implementation of the PKCS11 standard. However, it comes with a wrapper that can be linked against a supported native library.
Currently, Mozilla NSS is the only widely used open source implementation available. This library has a FIPS-enabled mode, but it is not a FIPS-certified library.
Usually, this library comes pre-installed on a RHEL system. You can verify its existence as follows:
$ readelf -d /lib64/libnss3.so
However, if you need the latest version of this library, or want to enable the FIPS module at compile time, you will need to build it from source. Note that the FIPS module for the NSS library is already installed in RHEL 7.8 with FIPS mode enabled, so building from source is not necessary there.
Download the latest version of the Mozilla NSS library from their website and follow the instructions provided.
For RHEL 7, the following set of commands for building the library can be helpful:
$ subscription-manager repos --enable rhel-server-rhscl-7-rpms
$ yum install -y devtoolset-8
$ yum install -y zlib-devel gcc-c++
$ make -C nss nss_build_all USE_64=1 NSS_FORCE_FIPS=1 NSS_ENABLE_WERROR=0
After executing the above commands, the build is generated under the nss_$version/dist/ directory. Locate the lib folder containing the .so libraries. Usually, it is of the form nss_$version/dist/Linux3.10_x86_64_cc_glibc_PTH_64_DBG.OBJ/lib.

$ mkdir ~/nsslibs/
$ cp nss_$version/dist/Linux3.10_x86_64_cc_glibc_PTH_64_DBG.OBJ/lib/* ~/nsslibs/
Step 2. Configuring JVM to use the NSS libs
For this section of the tutorial, I used Oracle JDK 1.8 to test the steps. The same steps might work with OpenJDK, as well.
Create the PKCS11 configuration file for JVM as follows:
name=NSS
# if pre-installed NSS library is used, the following would suffice.
nssLibraryDirectory=/lib64
# if a custom-built NSS library is used, uncomment the following and point it to the correct lib location.
#nssLibraryDirectory=/home/username/nsslibs
nssDbMode=noDb
attributes=compatibility
showInfo=true
nssModule=fips
Save the above file as pkcs11.cfg and note its location.
Edit the JVM security providers to include the PKCS11 module. This is usually the java.security file located in jdk1.8.0_251/jre/lib/security. Locate the section with the list of providers, as follows:

#
# List of providers and their preference orders (see above):
#
security.provider.1=sun.security.provider.Sun
security.provider.2=sun.security.rsa.SunRsaSign
security.provider.3=sun.security.ec.SunEC
security.provider.4=com.sun.net.ssl.internal.ssl.Provider
security.provider.5=com.sun.crypto.provider.SunJCE
security.provider.6=sun.security.jgss.SunProvider
security.provider.7=com.sun.security.sasl.Provider
security.provider.8=org.jcp.xml.dsig.internal.dom.XMLDSigRI
security.provider.9=sun.security.smartcardio.SunPCSC
# Enable by adding following row
security.provider.10=sun.security.pkcs11.SunPKCS11 /path_to/pkcs11.cfg
Add the last row as shown above, and specify the correct path to the pkcs11.cfg file created in the previous step.
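Alternatively, if you just want to experiment without editing java.security, the provider can be registered at runtime. Note that Spark's own JVMs will not run this snippet, so for Spark the java.security entry above is still required. A minimal sketch (the class name is mine; the String constructor is a JDK 8 internal API):

import java.security.Provider;
import java.security.Security;

// Minimal sketch: register SunPKCS11 at runtime instead of editing
// java.security. Uses the JDK 8 internal constructor; on JDK 9+ you would
// use Security.getProvider("SunPKCS11").configure(cfgPath) instead.
public class RegisterPkcs11 {
    public static void main(String[] args) {
        Provider nss = new sun.security.pkcs11.SunPKCS11("/path_to/pkcs11.cfg");
        Security.addProvider(nss);
        System.out.println("Registered provider: " + nss.getName());
    }
}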
Verify that the setup was done properly, as follows:
import java.security.Provider;
import java.security.Security;
import java.util.Enumeration;

public class ListCryptoProviders {
    public static void main(String[] args) {
        try {
            Provider p[] = Security.getProviders();
            for (int i = 0; i < p.length; i++) {
                if (p[i].toString().startsWith("SunPKCS11")) {
                    System.out.println(p[i]);
                    for (Enumeration e = p[i].keys(); e.hasMoreElements(); )
                        System.out.println("\t" + e.nextElement());
                }
            }
        } catch (Exception e) {
            System.out.println("Provider is not properly configured.");
            e.printStackTrace();
        }
    }
}
Save the above file as ListCryptoProviders.java and compile it as:
javac ListCryptoProviders.java
Run the test as:
java ListCryptoProviders
This should produce output containing the following:
SunPKCS11-NSS version 1.8 ...
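Optionally, you can also confirm that the NSS-backed secure random is reachable through the JCA, since the Spark configuration below references the PKCS11 SecureRandom algorithm. A small sketch (the class name is mine):

import java.security.SecureRandom;

// Quick check: the Spark settings used later in this tutorial reference the
// "PKCS11" SecureRandom algorithm, which the SunPKCS11-NSS provider
// registers when the setup above succeeded.
public class Pkcs11RandomCheck {
    public static void main(String[] args) throws Exception {
        SecureRandom sr = SecureRandom.getInstance("PKCS11");
        System.out.println("Provider: " + sr.getProvider().getName());
        byte[] bytes = new byte[16];
        sr.nextBytes(bytes); // draws random bytes through the NSS FIPS module
        System.out.println("Drew " + bytes.length + " random bytes.");
    }
}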
Configuring Spark
Set up the correct JVM:
export JAVA_HOME=<path_to_JDK>
Configure Spark security:
# FIPS compliant spark configuration example.
#general
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.authenticate true
spark.authenticate.secret 1234567890123456
#i/o
spark.io.encryption.enabled true
spark.io.encryption.keySizeBits 256
spark.io.encryption.keygen.algorithm AES
spark.io.encryption.commons.config.secure.random.java.algorithm PKCS11
spark.io.encryption.commons.config.cipher.transformation AES/CTR/PKCS5Padding
#n/w
spark.network.crypto.enabled true
spark.network.crypto.saslFallback false
spark.network.crypto.keyLength 256
spark.network.crypto.keyFactoryAlgorithm PBKDF2WithHmacSHA256
spark.network.crypto.config.secure.random.java.algorithm PKCS11
spark.network.crypto.config.cipher.transformation AES/CTR/PKCS5Padding
#SSL
spark.ssl.enabled true
spark.ssl.keyPassword changeit!
spark.ssl.keyStorePassword changeit!
spark.ssl.keyStore /path/to/keystore
spark.ssl.keyStoreType pkcs12
spark.ssl.enabledAlgorithms TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
spark.ssl.needClientAuth false
spark.ssl.protocol TLSv1.2
In order to complete the SSL setup, a keystore is required. It can be created as follows, if it is not already available:
$ $JAVA_HOME/bin/keytool -genkey -storetype pkcs12 -keyalg RSA -alias spark -keystore /path/to/keystore
Specify the above as $SPARK_HOME/conf/spark-defaults.conf.
Note: You should always use the latest release of Spark, which ensures that you include fixes for all critical CVEs.
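Before moving on, it is worth a quick sanity check that the JVM actually offers the cipher suite named in spark.ssl.enabledAlgorithms. A small sketch (the class name is mine):

import java.util.Arrays;
import javax.net.ssl.SSLContext;

// Sanity check: confirm the JVM supports the cipher suite configured in
// spark.ssl.enabledAlgorithms before starting Spark.
public class CipherSuiteCheck {
    public static void main(String[] args) throws Exception {
        String wanted = "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384";
        String[] suites = SSLContext.getDefault()
                .getSupportedSSLParameters().getCipherSuites();
        System.out.println(wanted + " supported: "
                + Arrays.asList(suites).contains(wanted));
    }
}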
Testing the setup
Test the setup using Spark standalone mode. Follow the instructions at http://spark.apache.org/docs/latest/spark-standalone.html.
Special notes for IBM SDK 8
Be sure to check out the IBM® SDK 8 documentation.
The IBM SDK FIPS compliance page provides detailed documentation of the IBM SDK FIPS Cryptographic Module Validation Program.
Configuring JVM
Set up the correct JVM:
export JAVA_HOME=<path_to_IBM_SDK>
Add the IBMJCEFIPS provider to use IBM SDK support for FIPS:
Edit the section on security providers in the $JAVA_HOME/jre/lib/security/java.security file:
#
# List of providers and their preference orders (see above):
#
security.provider.1=com.ibm.crypto.fips.provider.IBMJCEFIPS
security.provider.2=com.ibm.crypto.plus.provider.IBMJCEPlusFIPS
security.provider.3=com.ibm.jsse2.IBMJSSEProvider2
security.provider.4=com.ibm.crypto.provider.IBMJCE
#security.provider.5=com.ibm.security.jgss.IBMJGSSProvider
#security.provider.6=com.ibm.security.cert.IBMCertPath
#security.provider.6=com.ibm.security.sasl.IBMSASL
Configuring Spark
Edit your Spark configuration to include the following:
#IBM SDK specific options, to use FIPS approved provider.
spark.driver.extraJavaOptions "-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
spark.executor.extraJavaOptions "-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
If you are using standalone mode, all of the Spark daemons need to be started with the above configuration:
export SPARK_MASTER_OPTS="-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
export SPARK_WORKER_OPTS="-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
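To confirm that the provider changes took effect, you can run a quick check with the IBM SDK's java (the class name is mine):

import java.security.Provider;
import java.security.Security;

// Quick check: verify that the IBM FIPS provider listed in java.security
// is registered in this JVM.
public class IbmFipsProviderCheck {
    public static void main(String[] args) {
        Provider p = Security.getProvider("IBMJCEFIPS");
        if (p == null) {
            System.out.println("IBMJCEFIPS is not registered; check java.security.");
        } else {
            System.out.println("Found: " + p.getName() + " " + p.getVersion());
        }
    }
}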
Troubleshooting
Issue: OpenSSL internal error, assertion failed: FATAL FIPS SELFTEST FAILURE

Error snippet:

fips.c(145): OpenSSL internal error, assertion failed: FATAL FIPS SELFTEST FAILURE
JVMDUMP039I Processing dump event "abort", detail "" at 2020/05/18 02:02:13 - please wait.
JVMDUMP032I JVM requested System dump using '/home/user/sparkcore.20200518.020213.28807.0001.dmp' in response to an event
JVMDUMP010I System dump written to /home/user/spark/core.20200518.020213.28807.0001.dmp
JVMDUMP032I JVM requested Java dump using '/home/user/spark/javacore.20200518.020213.28807.0002.txt' in response to an event
On examining the dump, the most interesting thread is this:
3XMTHREADINFO      "netty-rpc-connection-0" J9VMThread:0x00000000029CD600, omrthread_t:0x00007FBC7C2453B8, java/lang/Thread:0x00000000C07724F8, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x34, isDaemon:true)
3XMTHREADINFO1            (native thread ID:0x70E4, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000020)
3XMTHREADINFO2            (native stack address range from:0x00007FBCDCFCE000, to:0x00007FBCDD00E000, size:0x40000)
3XMCPUTIME               CPU usage total: 0.315326548 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=739928 (0xB4A58)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at org/apache/commons/crypto/cipher/OpenSslNative.initIDs(Native Method)
4XESTACKTRACE                at org/apache/commons/crypto/cipher/OpenSsl.<clinit>(OpenSsl.java:95)
4XESTACKTRACE                at org/apache/commons/crypto/cipher/OpenSslCipher.<init>(OpenSslCipher.java:57)
4XESTACKTRACE                at sun/reflect/NativeConstructorAccessorImpl.newInstance0(Native Method)
4XESTACKTRACE                at sun/reflect/NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:83)
4XESTACKTRACE                at sun/reflect/DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:57(Compiled Code))
4XESTACKTRACE                at java/lang/reflect/Constructor.newInstance(Constructor.java:437)
4XESTACKTRACE                at org/apache/commons/crypto/utils/ReflectionUtils.newInstance(ReflectionUtils.java:88)
4XESTACKTRACE                at org/apache/commons/crypto/cipher/CryptoCipherFactory.getCryptoCipher(CryptoCipherFactory.java:160)
4XESTACKTRACE                at org/apache/spark/network/crypto/AuthEngine.initializeForAuth(AuthEngine.java:214)
......
In this case, it is trying to access the native OpenSSL library via a JNI method. My guess is that this native library is not compiled for OpenJ9.
Fortunately, you do not need to compile the native library for OpenJ9, or try to debug the native C code; instead, you should turn off this optional module and use the IBM SDK's support for FIPS. This is also the recommended approach with the IBM SDK, as it makes it possible to use IBM's cryptographic hardware accelerators.
So you can fix this by using the Java mode for cipher suites. Configure Spark with the following:
For networking:
spark.network.crypto.config.secure.random.classes org.apache.commons.crypto.random.JavaCryptoRandom
spark.network.crypto.config.cipher.classes org.apache.commons.crypto.cipher.JceCipher
For disk I/O:
spark.io.encryption.commons.config.cipher.classes org.apache.commons.crypto.cipher.JceCipher
spark.io.encryption.commons.config.secure.random.classes org.apache.commons.crypto.random.JavaCryptoRandom
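If you want to confirm the effect of these settings outside of Spark, the following sketch asks Apache Commons Crypto (which ships with Spark) for a cipher while forcing the JceCipher class; it should print the JCE implementation rather than loading the native OpenSSL binding. The class name and the AES/CTR/NoPadding transformation are mine, chosen to keep the check simple:

import java.util.Properties;
import org.apache.commons.crypto.cipher.CryptoCipher;
import org.apache.commons.crypto.cipher.CryptoCipherFactory;

// Sketch (assumes commons-crypto is on the classpath): confirm that the
// factory resolves to the pure-Java JceCipher instead of the native
// OpenSSL-backed implementation.
public class JceCipherCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty(CryptoCipherFactory.CLASSES_KEY,
                "org.apache.commons.crypto.cipher.JceCipher");
        CryptoCipher cipher =
                CryptoCipherFactory.getCryptoCipher("AES/CTR/NoPadding", props);
        System.out.println("Cipher class: " + cipher.getClass().getName());
    }
}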
Note that the IBM SDK uses its own internal implementation of secure random, which is FIPS compliant. From the IBM documentation:

Caller-provided IBMSecureRandom number generators are ignored. To be FIPS 140-2 compliant during key pair generation, version 1.8 ignores caller-provided random number generators and instead uses the internal FIPS-approved SHA2DRBG generator. This update does not require changes to your application code.

Get more details in the IBM Knowledge Center.
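A quick way to see this on the IBM SDK is to request the SHA2DRBG algorithm directly. I am assuming here that the FIPS providers register it under that JCA name; the class name is mine:

import java.security.SecureRandom;

// Sketch (run on the IBM SDK): confirm that the FIPS-approved SHA2DRBG
// secure random algorithm referenced in the Spark configuration below is
// available. The JCA algorithm name is assumed to be registered by the
// IBMJCEFIPS/IBMJCEPlusFIPS providers.
public class Sha2DrbgCheck {
    public static void main(String[] args) throws Exception {
        SecureRandom sr = SecureRandom.getInstance("SHA2DRBG");
        System.out.println("SHA2DRBG provider: " + sr.getProvider().getName());
    }
}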
Next, let's look at the SSL cipher suites the IBM SDK supports, and the FIPS compliance of the selected cipher suite.
This OpenSSL wiki discusses all the FIPS-compliant cipher suites.
And here is the supported list of cipher suites for the IBM SDK.
So we used SSL_ECDHE_RSA_WITH_AES_256_GCM_SHA384, which is both FIPS approved and supported by the IBM SDK. It is also supported by many browsers, including Firefox and Safari.

Setting up the keystore with the IBM SDK:
$JAVA_HOME/bin/keytool -genkeypair -storetype jks -keyalg RSA -alias spark -keystore `pwd`/keystore4 -storepass changeit! -keypass changeit!
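You can verify the resulting keystore either with keytool -list or programmatically; a small sketch (the class name is mine), using the path and passwords from the command above:

import java.io.FileInputStream;
import java.security.KeyStore;

// Quick check: load the keystore created above and confirm that the
// "spark" key pair is present.
public class KeystoreCheck {
    public static void main(String[] args) throws Exception {
        KeyStore ks = KeyStore.getInstance("JKS");
        try (FileInputStream in = new FileInputStream("keystore4")) {
            ks.load(in, "changeit!".toCharArray());
        }
        System.out.println("Contains alias 'spark': " + ks.containsAlias("spark"));
    }
}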
Putting it all together
Here is the sample Spark configuration for the IBM SDK, which is FIPS compliant:
# FIPS compliant spark configuration example.
#IBM SDK specific options, to use FIPS approved provider.
spark.driver.extraJavaOptions "-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
spark.executor.extraJavaOptions "-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
#general
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.authenticate true
spark.authenticate.secret 1234567890123456
#n/w
spark.network.crypto.enabled true
spark.network.crypto.saslFallback false
spark.network.crypto.keyLength 256
spark.network.crypto.keyFactoryAlgorithm PBKDF2WithHmacSHA256
# This is the FIPS 140-2 compliant secure random implementation that comes with the IBM SDK.
# Setting any value here has no effect at all. When FIPS mode is enabled,
# the IBM SDK chooses the SHA2DRBG algorithm for secure random, regardless of what the user has configured.
# See the secure random discussion in the Troubleshooting section above for more details.
spark.network.crypto.config.secure.random.java.algorithm SHA2DRBG
# It is important to use the Java classes, otherwise the Apache Commons Crypto package will try to use
# the native OpenSSL JNI library. This library does not play well with the IBM SDK. See the Troubleshooting section above.
spark.network.crypto.config.secure.random.classes org.apache.commons.crypto.random.JavaCryptoRandom
spark.network.crypto.config.cipher.classes org.apache.commons.crypto.cipher.JceCipher
spark.network.crypto.config.cipher.transformation AES/CTR/PKCS5Padding
#i/o
spark.io.encryption.enabled true
spark.io.encryption.keySizeBits 256
spark.io.encryption.keygen.algorithm AES
spark.io.encryption.commons.config.secure.random.classes org.apache.commons.crypto.random.JavaCryptoRandom
spark.io.encryption.commons.config.secure.random.java.algorithm SHA2DRBG
spark.io.encryption.commons.config.cipher.classes org.apache.commons.crypto.cipher.JceCipher
spark.io.encryption.commons.config.cipher.transformation AES/CTR/PKCS5Padding
#SSL
spark.ssl.enabled true
spark.ssl.keyPassword changeit!
spark.ssl.keyStorePassword changeit!
spark.ssl.keyStore /path/to/keystore4
spark.ssl.keyStoreType JKS
spark.ssl.enabledAlgorithms SSL_ECDHE_RSA_WITH_AES_256_GCM_SHA384
spark.ssl.needClientAuth false
spark.ssl.protocol TLSv1.2
Summary
So is this enough to be FIPS compliant? Not by itself; a user also needs to be aware of the standard.
From a configuration standpoint, an admin or a user can ensure that they have the right configuration by following the guidelines in this tutorial. You can disable the non-approved cipher suites in the jre/lib/security/java.security file; Oracle has compiled a list of FIPS-approved and non-approved cipher suites.
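Disabling suites is done by editing the jdk.tls.disabledAlgorithms property in that file, or, for a single application, programmatically before any TLS code runs. A sketch (the class name and the algorithm names are illustrative examples, not a vetted FIPS partition):

import java.security.Security;

// Sketch: append to the jdk.tls.disabledAlgorithms security property at
// startup, before any TLS code runs. The same effect is usually achieved
// by editing the property directly in jre/lib/security/java.security.
public class DisableWeakSuites {
    public static void main(String[] args) {
        String current = Security.getProperty("jdk.tls.disabledAlgorithms");
        Security.setProperty("jdk.tls.disabledAlgorithms",
                current + ", RC4, DES, 3DES_EDE_CBC");
        System.out.println(Security.getProperty("jdk.tls.disabledAlgorithms"));
    }
}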
Does the use of the IBM SDK with FIPS mode guarantee the FIPS compliance of the application running on top of it?
As stated in the IBM SDK documentation: "The property does not verify that you are using the correct protocol or cipher suites that are required for FIPS 140-2 compliance."
Now that you’ve completed this tutorial, take a look at the companion article, Common misconceptions about FIPS mode-enabled environment, as well as the related links in the Resources section in the right-hand column.