
Achieve FIPS compliance with your Apache Spark big data processing

With the introduction of more specific privacy laws like GDPR and CCPA, it has become increasingly important to process users’ information in a more secure environment. Federal Information Processing Standards (FIPS) compliance is one of the most widely followed approaches. This tutorial describes some of the aspects and detailed steps for achieving FIPS compliance when processing big data with Apache Spark. These steps can also help you secure other big data processing platforms.

Since this specification is periodically revised by the government, it is important to update this document to meet the currently applicable compliance specification.

Cloud HSM and keep your own key (KYOK) take enterprise security to another level. This tutorial briefly covers some of the relevant details about this.

Note: Please review your country-specific export guidance for cryptographic software before proceeding with your FIPS setup.

Note: This tutorial suggests various configurations and addresses common development-related issues, but it does not guarantee FIPS compliance.

Prerequisites

To complete this tutorial, you will need an operating system that has a FIPS compliance mode. Making an OS FIPS compliant manually is more involved and is not the main focus of this tutorial; however, I provide some hints on how to switch to open source libraries that have a FIPS mode.

Estimated time

Completing this tutorial should take about 30 minutes.

Setting up the environment for enabling FIPS for Spark

Configuring Red Hat Enterprise Linux 7.7 to be FIPS compliant: Red Hat Enterprise Linux (RHEL) can be set up to be FIPS compliant. Instructions and other specific details can be found on the Red Hat website.

Configuring JVM

This step is required if PKCS11 support has to be enabled. By default, this provider is not registered in the JVM on most systems.

Step 1. Installing Mozilla NSS libraries

This step is required because Java does not come with a native implementation of the PKCS11 standard. However, it comes with a wrapper that can be linked with a supported native library.

Currently, Mozilla NSS is the only widely used open source implementation available. This library has a FIPS-enabled mode, but it is not a FIPS-certified library.

  1. Usually, this library comes pre-installed on a RHEL system. You can verify its existence as follows:

    $ readelf -d /lib64/libnss3.so
    
  2. However, if you need the latest version of this library, or want to enable the FIPS module at compile time, you will need to build it from source. Note that the FIPS module for the NSS library is already installed in RHEL 7.8 with FIPS mode enabled, so it is not necessary to build from source there.

    • Download the latest version of the Mozilla NSS library from their website and follow the instructions provided.

    • For RHEL 7, the following set of commands for building the library can be helpful:

       $ subscription-manager repos --enable rhel-server-rhscl-7-rpms
       $ yum install -y devtoolset-8
       $ yum install -y zlib-devel gcc-c++
       $ make -C nss nss_build_all USE_64=1 NSS_FORCE_FIPS=1 NSS_ENABLE_WERROR=0
      

    After executing the above commands, the build is generated under the directory nss_$version/dist/. Locate the lib folder containing .so libraries. Usually, it is of the form nss_$version/dist/Linux3.10_x86_64_cc_glibc_PTH_64_DBG.OBJ/lib.

     $ mkdir ~/nsslibs/
     $ cp nss_$version/dist/Linux3.10_x86_64_cc_glibc_PTH_64_DBG.OBJ/lib/* ~/nsslibs/
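
Before pointing the JVM at a library directory, it can help to confirm that the shared objects are actually there. The following is a minimal sketch, not part of the original setup: the class name NssLibCheck is hypothetical, and the list of required file names (libnss3.so, libsoftokn3.so, libfreebl3.so) is my assumption about what a typical NSS build produces, so adjust it to match your build.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NssLibCheck {
    // Assumed set of NSS shared objects; adjust this list for your own build.
    public static final List<String> REQUIRED =
            Arrays.asList("libnss3.so", "libsoftokn3.so", "libfreebl3.so");

    /** Returns the names from REQUIRED that are not regular files in dir. */
    public static List<String> missingLibraries(Path dir) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!Files.isRegularFile(dir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Defaults to the pre-installed location; pass ~/nsslibs for a custom build.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/lib64");
        List<String> missing = missingLibraries(dir);
        System.out.println(missing.isEmpty()
                ? "All expected NSS libraries found in " + dir
                : "Missing from " + dir + ": " + missing);
    }
}
```

Run it with the directory you plan to use as nssLibraryDirectory, for example `java NssLibCheck /home/username/nsslibs`.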
    

Step 2. Configuring JVM to use the NSS libs

For this section of the tutorial, I used Oracle JDK 1.8 to test the steps. The same steps might work with OpenJDK, as well.

  1. Create the PKCS11 configuration file for JVM as follows:

     name=NSS
     # if pre-installed NSS library is used, following would suffice.
     nssLibraryDirectory=/lib64
     # if custom build nss library is used, uncomment the following and point it to the correct lib location.
     #nssLibraryDirectory=/home/username/nsslibs
     nssDbMode=noDb
     attributes=compatibility
     showInfo=true
     nssModule=fips
    

    Save the above file as pkcs11.cfg and note its location.

  2. Edit the JVM security providers to include the PKCS11 module. This is usually located in jdk1.8.0_251/jre/lib/security as the java.security file. Locate the section with the list of providers, as follows:

         #
         # List of providers and their preference orders (see above):
         #
    
         security.provider.1=sun.security.provider.Sun
         security.provider.2=sun.security.rsa.SunRsaSign
         security.provider.3=sun.security.ec.SunEC
         security.provider.4=com.sun.net.ssl.internal.ssl.Provider
         security.provider.5=com.sun.crypto.provider.SunJCE
         security.provider.6=sun.security.jgss.SunProvider
         security.provider.7=com.sun.security.sasl.Provider
         security.provider.8=org.jcp.xml.dsig.internal.dom.XMLDSigRI
         security.provider.9=sun.security.smartcardio.SunPCSC
         # Enable by adding following row
         security.provider.10=sun.security.pkcs11.SunPKCS11 /path_to/pkcs11.cfg
    

    Add the last row as shown above, and specify the correct path to the pkcs11.cfg file created in the previous step.

  3. Verify that the setup was done properly, as follows:

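    import java.security.Provider;
    import java.security.Security;
    import java.util.Enumeration;
    
    public class ListCryptoProviders {
        public static void main(String[] args) {
            boolean found = false;
            try {
                // Walk the registered providers in preference order.
                for (Provider p : Security.getProviders()) {
                    if (p.getName().startsWith("SunPKCS11")) {
                        found = true;
                        System.out.println(p);
                        // Print every key (service, alias, attribute) the provider registers.
                        for (Enumeration<Object> e = p.keys(); e.hasMoreElements(); )
                            System.out.println("\t" + e.nextElement());
                    }
                }
            } catch (Exception e) {
                System.out.println("Provider is not properly configured.");
                e.printStackTrace();
            }
            // Without this check, a missing provider would produce no output at all.
            if (!found) {
                System.out.println("No SunPKCS11 provider is registered.");
            }
        }
    }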
    
    • Save the above file as ListCryptoProviders.java and compile it as:

       javac ListCryptoProviders.java
      
    • Run the test as:

       java ListCryptoProviders
      

      This should produce output containing the following:

       SunPKCS11-NSS version 1.8
       ...
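
The Spark configuration later in this tutorial asks for a secure random algorithm named PKCS11, so it can be useful to confirm that the JVM can actually hand one out. The following is a minimal sketch under that assumption; the class name SecureRandomCheck is hypothetical and the check only proves that some registered provider offers the algorithm, not that the provider is running in FIPS mode.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class SecureRandomCheck {
    /** Returns true if some registered provider offers the named SecureRandom algorithm. */
    public static boolean isAvailable(String algorithm) {
        try {
            SecureRandom random = SecureRandom.getInstance(algorithm);
            // Ask for a few bytes so that the provider actually has to produce output.
            random.nextBytes(new byte[16]);
            return true;
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String algorithm = args.length > 0 ? args[0] : "PKCS11";
        System.out.println(algorithm + " SecureRandom available: " + isAvailable(algorithm));
    }
}
```

If this prints false for PKCS11, revisit the pkcs11.cfg path and the provider entry in java.security before moving on to Spark.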
      

Configuring Spark

  1. Set up the correct JVM:

    export JAVA_HOME=<path_to_JDK>
    
  2. Configure Spark security:

        # FIPS compliant spark configuration example.
    
     #general    
     spark.serializer    org.apache.spark.serializer.KryoSerializer
     spark.authenticate  true
     spark.authenticate.secret   1234567890123456
    
     #i/o
     spark.io.encryption.enabled true
     spark.io.encryption.keySizeBits 256
     spark.io.encryption.keygen.algorithm AES
     spark.io.encryption.commons.config.secure.random.java.algorithm        PKCS11
     spark.io.encryption.commons.config.cipher.transformation       AES/CTR/PKCS5Padding
    
     #n/w
     spark.network.crypto.enabled   true
     spark.network.crypto.saslFallback     false
     spark.network.crypto.keyLength    256
     spark.network.crypto.keyFactoryAlgorithm        PBKDF2WithHmacSHA256
     spark.network.crypto.config.secure.random.java.algorithm        PKCS11
     spark.network.crypto.config.cipher.transformation       AES/CTR/PKCS5Padding
    
     #SSL
     spark.ssl.enabled true
     spark.ssl.keyPassword changeit!
     spark.ssl.keyStorePassword changeit!
     spark.ssl.keyStore /path/to/keystore
     spark.ssl.keyStoreType pkcs12
     spark.ssl.enabledAlgorithms    TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
     spark.ssl.needClientAuth     false
     spark.ssl.protocol TLSv1.2
    

In order to complete the SSL setup, a keystore is required. It can be created as follows, if it is not already available:

    $ $JAVA_HOME/bin/keytool -genkey -storetype pkcs12 -keyalg RSA -alias spark -keystore /path/to/keystore

Save the Spark configuration shown earlier in this section as $SPARK_HOME/conf/spark-defaults.conf.

Note: Always use the latest release of Spark, so that you pick up the fixes for known critical CVEs.
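
A typo in spark-defaults.conf can silently leave encryption switched off, so a small sanity check can be worthwhile. The following is a minimal sketch: the class name SparkFipsConfCheck is hypothetical, the expected values are copied from the sample configuration above, and it parses the file with java.util.Properties on the assumption that this matches how Spark reads it. Passing this check does not make a deployment FIPS compliant.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class SparkFipsConfCheck {
    // Expected values taken from the sample configuration in this tutorial.
    private static final Map<String, String> EXPECTED = new LinkedHashMap<>();
    static {
        EXPECTED.put("spark.authenticate", "true");
        EXPECTED.put("spark.io.encryption.enabled", "true");
        EXPECTED.put("spark.network.crypto.enabled", "true");
        EXPECTED.put("spark.network.crypto.saslFallback", "false");
        EXPECTED.put("spark.ssl.enabled", "true");
        EXPECTED.put("spark.ssl.protocol", "TLSv1.2");
    }

    /** Returns one message per expected setting that is missing or has a different value. */
    public static List<String> problems(String confText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(confText));
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> e : EXPECTED.entrySet()) {
            String actual = props.getProperty(e.getKey());
            if (actual == null) {
                problems.add(e.getKey() + " is not set");
            } else if (!e.getValue().equals(actual.trim())) {
                problems.add(e.getKey() + " is '" + actual.trim()
                        + "', expected '" + e.getValue() + "'");
            }
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SparkFipsConfCheck <path to spark-defaults.conf>");
            return;
        }
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        List<String> problems = problems(text);
        if (problems.isEmpty()) {
            System.out.println("All expected settings are present.");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```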

Testing the setup

Test the setup using Spark standalone mode. Follow the instructions at http://spark.apache.org/docs/latest/spark-standalone.html.
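
Before starting the cluster, you may want to confirm that the JVM you selected can negotiate the protocol and cipher suite named in the SSL section of the configuration. The following is a minimal sketch; the class name TlsSupportCheck is hypothetical, and it only inspects what the default SSLContext reports as supported, which is not the same as verifying a live connection.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;

public class TlsSupportCheck {
    /** Returns true if the default SSLContext lists both the protocol and the cipher suite as supported. */
    public static boolean supports(String protocol, String cipherSuite)
            throws NoSuchAlgorithmException {
        SSLParameters params = SSLContext.getDefault().getSupportedSSLParameters();
        return Arrays.asList(params.getProtocols()).contains(protocol)
                && Arrays.asList(params.getCipherSuites()).contains(cipherSuite);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Defaults mirror spark.ssl.protocol and spark.ssl.enabledAlgorithms from this tutorial.
        String protocol = args.length > 0 ? args[0] : "TLSv1.2";
        String suite = args.length > 1 ? args[1] : "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384";
        System.out.println(protocol + " with " + suite + " supported: "
                + supports(protocol, suite));
    }
}
```

On the IBM SDK, pass the SSL_-prefixed suite name used later in this tutorial as the second argument.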

Special notes for IBM SDK 8

Be sure to check out the IBM® SDK 8 documentation.

The IBM SDK FIPS compliance page provides detailed documentation of the IBM SDK and the FIPS Cryptographic Module Validation Program.

Configuring JVM

  1. Set up the correct JVM:

     export JAVA_HOME=<path_to_IBM_SDK>
    
  2. Add the IBMJCEFIPS provider to use IBM SDK support for FIPS:

    • Edit the section on security providers in the $JAVA_HOME/jre/lib/security/java.security file:

       #
       # List of providers and their preference orders (see above):
       #
       security.provider.1=com.ibm.crypto.fips.provider.IBMJCEFIPS 
       security.provider.2=com.ibm.crypto.plus.provider.IBMJCEPlusFIPS
       security.provider.3=com.ibm.jsse2.IBMJSSEProvider2
       security.provider.4=com.ibm.crypto.provider.IBMJCE
       #security.provider.5=com.ibm.security.jgss.IBMJGSSProvider
       #security.provider.6=com.ibm.security.cert.IBMCertPath
       #security.provider.7=com.ibm.security.sasl.IBMSASL
      
    • Edit your Spark configuration to include the following:

       #IBM SDK specific options, to use FIPS approved provider.
       spark.driver.extraJavaOptions "-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
       spark.executor.extraJavaOptions "-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
      
    • If you are using standalone mode, all of the Spark daemons need to be started with the above configuration:

       export SPARK_MASTER_OPTS="-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
       export SPARK_WORKER_OPTS="-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
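
Because provider preference order matters here, it can help to print the order the JVM actually ends up with. The following is a minimal sketch; the class name ProviderOrderCheck is hypothetical, and I am assuming that the provider registered as com.ibm.crypto.fips.provider.IBMJCEFIPS reports the name IBMJCEFIPS, which matches the value passed to usefipsProviderName above.

```java
import java.security.Provider;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

public class ProviderOrderCheck {
    /** Returns the names of the registered providers in preference order. */
    public static List<String> providerNames() {
        List<String> names = new ArrayList<>();
        for (Provider p : Security.getProviders()) {
            names.add(p.getName());
        }
        return names;
    }

    /** Returns true if the named provider is first in the preference order. */
    public static boolean isMostPreferred(String name) {
        List<String> names = providerNames();
        return !names.isEmpty() && names.get(0).equals(name);
    }

    public static void main(String[] args) {
        List<String> names = providerNames();
        for (int i = 0; i < names.size(); i++) {
            System.out.println((i + 1) + ". " + names.get(i));
        }
        String expected = args.length > 0 ? args[0] : "IBMJCEFIPS";
        System.out.println(expected + " is most preferred: " + isMostPreferred(expected));
    }
}
```

Run it with the same JAVA_HOME that the Spark daemons use, since a different JVM will read a different java.security file.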
      

Configuring Spark

  1. Troubleshooting OpenSSL internal error, assertion failed: FATAL FIPS SELFTEST FAILURE:

    Error snippet:

     fips.c(145): OpenSSL internal error, assertion failed: FATAL FIPS SELFTEST FAILURE
     JVMDUMP039I Processing dump event "abort", detail "" at 2020/05/18 02:02:13 - please wait.
     JVMDUMP032I JVM requested System dump using '/home/user/sparkcore.20200518.020213.28807.0001.dmp' in response to an event
     JVMDUMP010I System dump written to /home/user/spark/core.20200518.020213.28807.0001.dmp
     JVMDUMP032I JVM requested Java dump using '/home/user/spark/javacore.20200518.020213.28807.0002.txt' in response to an event
    

    On examining the dump, the most interesting thread is this:

        3XMTHREADINFO      "netty-rpc-connection-0" J9VMThread:0x00000000029CD600, omrthread_t:0x00007FBC7C2453B8, java/lang/Thread:0x00000000C07724F8, state:R, prio=5
        3XMJAVALTHREAD            (java/lang/Thread getId:0x34, isDaemon:true)
        3XMTHREADINFO1            (native thread ID:0x70E4, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000020)
        3XMTHREADINFO2            (native stack address range from:0x00007FBCDCFCE000, to:0x00007FBCDD00E000, size:0x40000)
        3XMCPUTIME               CPU usage total: 0.315326548 secs, current category="Application"
        3XMHEAPALLOC             Heap bytes allocated since last GC cycle=739928 (0xB4A58)
        3XMTHREADINFO3           Java callstack:
        4XESTACKTRACE                at org/apache/commons/crypto/cipher/OpenSslNative.initIDs(Native Method)
        4XESTACKTRACE                at org/apache/commons/crypto/cipher/OpenSsl.<clinit>(OpenSsl.java:95)
        4XESTACKTRACE                at org/apache/commons/crypto/cipher/OpenSslCipher.<init>(OpenSslCipher.java:57)
        4XESTACKTRACE                at sun/reflect/NativeConstructorAccessorImpl.newInstance0(Native Method)
        4XESTACKTRACE                at sun/reflect/NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:83)
        4XESTACKTRACE                at sun/reflect/DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:57(Compiled Code))
        4XESTACKTRACE                at java/lang/reflect/Constructor.newInstance(Constructor.java:437)
        4XESTACKTRACE                at org/apache/commons/crypto/utils/ReflectionUtils.newInstance(ReflectionUtils.java:88)
        4XESTACKTRACE                at org/apache/commons/crypto/cipher/CryptoCipherFactory.getCryptoCipher(CryptoCipherFactory.java:160)
        4XESTACKTRACE                at org/apache/spark/network/crypto/AuthEngine.initializeForAuth(AuthEngine.java:214)
    ......
    

    In this case, the thread is calling into the native OpenSSL library through a native method (OpenSslNative.initIDs in the stack trace above). My guess is that the native library bundled with Apache Commons Crypto is not compiled for OpenJ9.

    Fortunately, you do not need to recompile the native library for OpenJ9, or try to debug the native C code; instead, you should turn off this optional module and use the IBM SDK’s support for FIPS. This is also the recommended approach with the IBM SDK, as it makes it possible to use IBM’s cryptographic hardware accelerators.

    You can fix this by switching the cipher and secure random implementations to their pure Java classes. Configure Spark with the following:

    For networking:

        spark.network.crypto.config.secure.random.classes   org.apache.commons.crypto.random.JavaCryptoRandom
        spark.network.crypto.config.cipher.classes  org.apache.commons.crypto.cipher.JceCipher
    

    For disk I/O:

        spark.io.encryption.commons.config.cipher.classes   org.apache.commons.crypto.cipher.JceCipher
        spark.io.encryption.commons.config.secure.random.classes    org.apache.commons.crypto.random.JavaCryptoRandom
    
  2. So why does the IBM SDK use its own internal implementation of secure random, which is FIPS compliant?

    Caller-provided IBMSecureRandom number generators are ignored. To be FIPS 140-2 compliant during key pair generation, version 1.8 ignores caller-provided random number generators and instead uses the internal FIPS-approved SHA2DRBG generator. This update does not require changes to your application code.

    Get more details in the IBM Knowledge Center.

  3. SSL cipher suites supported by the IBM SDK, and the FIPS compliance of the selected cipher suite:

    This OpenSSL wiki discusses all the FIPS-compliant cipher suites.

    And here is the supported list of cipher suites for the IBM SDK.

    So we used SSL_ECDHE_RSA_WITH_AES_256_GCM_SHA384, which is both FIPS approved and supported by the IBM SDK. It is also supported by many browsers, including Firefox and Safari.

  4. Setting up the keystore with the IBM SDK:

     $JAVA_HOME/bin/keytool -genkeypair -storetype jks -keyalg RSA -alias spark -keystore `pwd`/keystore4 -storepass changeit! -keypass changeit!
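
Once the keystore exists, you can confirm that the JVM can open it with the type and password you intend to give Spark. The following is a minimal sketch using the standard java.security.KeyStore API; the class name KeystoreAliases is hypothetical, and taking the password as a command-line argument is for illustration only, since it would be visible in the process list.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.Collections;
import java.util.List;

public class KeystoreAliases {
    /** Loads a keystore of the given type from the stream and returns its aliases. */
    public static List<String> aliases(InputStream in, String type, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore ks = KeyStore.getInstance(type);
        // load() fails with an IOException if the password or the format is wrong.
        ks.load(in, password);
        return Collections.list(ks.aliases());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: java KeystoreAliases <keystore> <type> <password>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println("Aliases: " + aliases(in, args[1], args[2].toCharArray()));
        }
    }
}
```

For the keystore created above, the type is JKS and the output should include the alias spark.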
    

Putting it all together

Here is a sample Spark configuration for the IBM SDK that combines the FIPS-related settings described above:

# FIPS compliant spark configuration example.

#IBM SDK specific options, to use FIPS approved provider.
spark.driver.extraJavaOptions "-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"
spark.executor.extraJavaOptions "-Dcom.ibm.jsse2.usefipsProviderName=IBMJCEFIPS -Dssl.SocketFactory.provider=com.ibm.jsse2.SSLSocketFactoryImpl -Dssl.ServerSocketFactory.provider=com.ibm.jsse2.SSLServerSocketFactoryImpl"

#general    
spark.serializer    org.apache.spark.serializer.KryoSerializer
spark.authenticate  true
spark.authenticate.secret   1234567890123456

#n/w
spark.network.crypto.enabled   true
spark.network.crypto.saslFallback   false
spark.network.crypto.keyLength    256
spark.network.crypto.keyFactoryAlgorithm        PBKDF2WithHmacSHA256

# This is the FIPS 140-2 compliant secure random implementation that comes with the IBM SDK.
# Setting any value here has no effect at all. When FIPS mode is enabled,
# the IBM SDK chooses the SHA2DRBG algorithm for secure random, regardless of what the user has configured.
# See item 2 above for more details.
spark.network.crypto.config.secure.random.java.algorithm  SHA2DRBG

# It is important to use the Java classes; otherwise the Apache Commons Crypto package will try to use
# the native OpenSSL library, which does not play well with the IBM SDK. See item 1 above.
spark.network.crypto.config.secure.random.classes   org.apache.commons.crypto.random.JavaCryptoRandom
spark.network.crypto.config.cipher.classes  org.apache.commons.crypto.cipher.JceCipher

spark.network.crypto.config.cipher.transformation       AES/CTR/PKCS5Padding

#i/o
spark.io.encryption.enabled true
spark.io.encryption.keySizeBits 256
spark.io.encryption.keygen.algorithm AES
spark.io.encryption.commons.config.secure.random.classes    org.apache.commons.crypto.random.JavaCryptoRandom
spark.io.encryption.commons.config.secure.random.java.algorithm       SHA2DRBG
spark.io.encryption.commons.config.cipher.classes   org.apache.commons.crypto.cipher.JceCipher
spark.io.encryption.commons.config.cipher.transformation       AES/CTR/PKCS5Padding

#SSL
spark.ssl.enabled true
spark.ssl.keyPassword changeit!
spark.ssl.keyStorePassword changeit!
spark.ssl.keyStore /path/to/keystore4
spark.ssl.keyStoreType JKS
spark.ssl.enabledAlgorithms   SSL_ECDHE_RSA_WITH_AES_256_GCM_SHA384
spark.ssl.needClientAuth        false
spark.ssl.protocol      TLSv1.2

Summary

So is this enough to be FIPS compliant? No. A user also needs to be aware of the standard itself.

From a configuration standpoint, an admin or a user can ensure that they have the right configuration by following the guidelines in this tutorial. You can disable the non-approved cipher suites in the jre/lib/security/java.security file; Oracle has compiled a list of FIPS-approved and non-approved cipher suites.
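
To see what the JVM currently refuses to negotiate, you can read the relevant security property at run time. The following is a minimal sketch; the class name DisabledTlsAlgorithms is hypothetical, and I am assuming that jdk.tls.disabledAlgorithms is the java.security property you would edit for this purpose, so check your SDK documentation for the property it honors.

```java
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

public class DisabledTlsAlgorithms {
    /** Splits a comma-separated security property value into trimmed, non-empty entries. */
    public static List<String> parse(String propertyValue) {
        List<String> entries = new ArrayList<>();
        if (propertyValue == null) {
            return entries;
        }
        for (String item : propertyValue.split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Assumed property name; it is read from the java.security file of the running JVM.
        String value = Security.getProperty("jdk.tls.disabledAlgorithms");
        for (String entry : parse(value)) {
            System.out.println(entry);
        }
    }
}
```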

Does the use of the IBM SDK with FIPS mode guarantee the FIPS compliance of the application running on top of it?

As stated in the IBM SDK documentation: The property does not verify that you are using the correct protocol or cipher suites that are required for FIPS 140-2 compliance.

Now that you’ve completed this tutorial, take a look at the companion article, Common misconceptions about FIPS mode-enabled environment.