IBM best practices for implementing a CI/CD secure container image pipeline for your Kubernetes-based apps
Ensure a safe and reliable operation of your Kubernetes instance
This article introduces some IBM best practices for implementing a Continuous Integration/Continuous Delivery (CI/CD) secure container image pipeline for your Kubernetes-based applications. This is a mechanism whereby container images are scanned for vulnerabilities and approved before they are used. This helps to protect your instance and any containers that are subsequently instantiated.
Introduction to Kubernetes container images and a CI/CD secure image pipeline
Kubernetes offers you the flexible expansion of an application by using microservices, where individual containers incorporate specific parts of an application. Container numbers can scale rapidly and efficiently as your application expands and requires more resources. Container images can come from a variety of sources — some verifiable, some not.
Therefore, it is crucial for you to establish an appropriate security posture and a reliable method of verifying the authenticity of container images before they are used in production. Failure to do so can result in serious security breaches or the failure of the applications in your Kubernetes instance. Any such verification method should separate out container images that are not fit for purpose, and ensure that only suitable and verifiable container images are used in your production environment.
One such method is the Continuous Integration/Continuous Delivery (CI/CD) secure container image pipeline. The pipeline detects current or historical vulnerabilities or exposures in each container image, which depend on the image contents, and accepts or rejects images for production use on that basis. If your container image OS is not fully updated, the pipeline ensures that the relevant updates are applied.
Only your container images that are fit for production are pushed to the private image registries that are in use by your Kubernetes instance.
This article discusses the current implementation and best-practice use of a Kubernetes CI/CD secure container image pipeline, and is aimed at readers who have some knowledge of the containerization process.
What is the definition of a Kubernetes CI/CD container image pipeline and how does it work?
The pipeline is a continuous evaluation of your container images that are contained in the private registry of your Kubernetes instance. Images in the registry are scanned on an ongoing basis and are evaluated for their suitability for production use. Container images that are considered suitably secure remain in your registry, while images that are considered unsecure are removed from your registry.
Secure images can be propagated in any new container instance. In this manner, only secure images are used at any given time by Kubernetes to instantiate new containers (in SoftLayer, for example). Secure container images can also be automatically rolled out by Kubernetes to update existing container images and their contents on site in your Kubernetes instance.
In terms of the Kubernetes instance need-to-know access control/separation of duty paradigm, there should be a specific or minimal number of people that have privileged access to any application image (e.g., generate images, deploy images, or curate the image library). Furthermore, access control should be configured so that whoever creates a change cannot also approve it. There should be a “technical approver,” who is ideally a peer or technical team lead. These best practices prevent one individual from sabotaging or damaging your entire instance (e.g., creating an image and then deploying it). If your team size is insufficient to have such separate individuals, then your instance should have an authorized approver.
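The separation-of-duty rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ImageChange` record, the `approve` and `can_deploy` helpers, and the user names are hypothetical and are not part of UCD or Kubernetes.

```python
from dataclasses import dataclass, field


@dataclass
class ImageChange:
    """A proposed change to an application image (hypothetical record)."""
    image: str
    author: str
    approvals: set = field(default_factory=set)


def approve(change: ImageChange, approver: str, authorized: set) -> None:
    """Record an approval while enforcing separation of duties."""
    if approver not in authorized:
        raise PermissionError(f"{approver} is not an authorized approver")
    if approver == change.author:
        # Whoever creates a change cannot also approve it.
        raise PermissionError("the author of a change cannot approve it")
    change.approvals.add(approver)


def can_deploy(change: ImageChange) -> bool:
    """A change is deployable only after at least one valid approval."""
    return len(change.approvals) > 0
```

Because `approve` rejects the author, any approval recorded on a change necessarily comes from a second person, which is the property the paragraph above asks for.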
How does the pipeline determine that a container image is secure?
The pipeline detects any outstanding CVEs (Common Vulnerabilities and Exposures) for the container image OS and for the applications that reside in your container image. It also detects whether the container image OS has been sufficiently patched.
To wrap up:
What happens when a container image is considered suitable for production use?
- The image is added to the appropriate private image registry for future use. (Use case 1)
What happens when a container image is considered unsuitable for production use?
- The image is not added to the private image registry or it is removed from the registry. (Use case 2)
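The two outcomes above can be sketched as a simple gate. This is a minimal sketch, assuming that a scanner has already produced a list of unresolved CVE IDs and a patch-status flag; `ScanResult`, `is_secure`, and `apply_gate` are hypothetical names, and the registry is modeled as a plain Python set rather than a real registry API.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class ScanResult:
    """Outcome of scanning one image (hypothetical structure)."""
    image: str
    open_cves: List[str]      # unresolved CVE IDs in the OS or applications
    os_fully_patched: bool    # whether the image OS is sufficiently patched


def is_secure(result: ScanResult) -> bool:
    """An image passes only with no open CVEs and a patched OS."""
    return not result.open_cves and result.os_fully_patched


def apply_gate(result: ScanResult, registry: Set[str]) -> str:
    """Add secure images to the registry; keep out or remove the rest."""
    if is_secure(result):
        registry.add(result.image)       # use case 1
        return "accepted"
    registry.discard(result.image)       # use case 2
    return "rejected"
```

For example, calling `apply_gate(ScanResult("app:1.1", [], True), registry)` adds `app:1.1` to the set, while a result with any entry in `open_cves` leaves the image out or removes it if it was already present.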
CI/CD secure container image pipeline actors
The main CI/CD secure container image pipeline actors are the development, AppOps, and infrastructure teams.
Development team definition and role
The development team develops the code for the containers that are deployed in your Kubernetes instance. The development team’s role is comprised of:
- Creating any new Docker images
- Identifying container deployment requirements:
- Configuration files, keystores, external authentication, etc.
- Any new infrastructure:
- Shared disks
- Proxies, etc.
- Defining container requirements:
- Disk space
- Configuration files, keystores, external authentication, etc.
- Designing procedures for handling crashed or unresponsive containers:
- Discarding and restarting the container
- Automated cleanup
- Identifying data flows for the instance application
- Explicitly identifying dependencies on other services and the services’ data
AppOps team definition and role
The AppOps team is the operational component of container application management. The AppOps team’s role is comprised of:
- Creating Kubernetes manifests based on information including:
- Disk space
- Creating and managing deployment assets:
- Configuration files, keystores, external authentication, etc.
- Communicating with the infrastructure team.
- Updating the default UCD (UrbanCode Deploy: an IBM tool for automating application deployments through your environments) job with the required environment.
- Supplying credentials for automation to access GitHub.
- Creating any ancillary automation that is specific to the application:
- Such as automated version checks of other applications due to any dependencies.
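As an illustration of the manifest-creation step, the following sketch builds a Kubernetes Deployment manifest from a disk-space requirement supplied by the development team. It assumes that the disk requirement maps to an `ephemeral-storage` resource request (a persistent volume claim would be another option); the function name `build_deployment` and the example image name are hypothetical.

```python
def build_deployment(name: str, image: str, disk: str, replicas: int = 1) -> dict:
    """Return a Deployment manifest as a dict (serializable to JSON or YAML)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Assumption: disk space is expressed as an
                        # ephemeral-storage request, for example "2Gi".
                        "resources": {"requests": {"ephemeral-storage": disk}},
                    }]
                },
            },
        },
    }
```

The resulting dict can be written out with `json.dumps` and committed to the AppOps git repository, since `kubectl apply -f` accepts JSON as well as YAML manifests.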
Infrastructure team definition and role
The infrastructure team deals with any infrastructural issues that arise out of your container creation and deployment processes.
The infrastructure team’s role is comprised of:
- Creating the framework automation for Kubernetes, logging/monitoring stack, Ansible and UCD
- Maintaining and scaling the SoftLayer® infrastructure
- Monitoring the SoftLayer infrastructure and common components
- Updating the network to allow requisite communications for all cluster applications
- Assisting the AppOps team with application-specific automation
Use case 1: The container image is tested and considered suitable for use in the Kubernetes Instance Private Registry
Once a container image is considered suitable for use in your Kubernetes environment, the image is deployed into the production environment, using the same deployment methodology that is applied in other environments. Currently, the following application deployment process is in use:
- The development team pushes their application Docker images to Artifactory for consumption. (Artifactory is a repository manager, integrated with CI/CD and DevOps tools, that supports secure, clustered, and high-availability Docker registries. Artifactory then tracks source code artifacts from development through to production.)
- The AppOps team updates their configuration files in their git repositories to reflect the changes in the Kubernetes manifest and the application configuration.
- UCD has environment definitions for SVT, CTE (Customer Test Environment), Production, etc. The AppOps team uses the same deployment process across all of the environments to deploy the containers.
- UCD has built-in controls for access management, deployment scheduling, change management, and maintains an audit trail for all of the deployments made in the environment.
- The AppOps team synchronizes the git configuration to the environment using UCD.
- The AppOps team deploys the images into the QA environment using UCD.
- The generated logs are available as part of the UCD history for each environment/application.
- The UCD environment kicks off a continuous “heartbeat” health check. This is driven by a Jenkins deploy for each product/application that is maintained by the AppOps team that runs against your deployment.
- High priority alerts are generated, via Slack and PagerDuty, in the event that a basic end-to-end user process is no longer functioning in your environment (e.g. the database is unavailable and the customer is impacted).
- This end-to-end process must be configured individually and appropriately for each deployed application.
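The heartbeat check and alerting described in the last few steps can be sketched as follows. This is a minimal sketch, not the Jenkins job itself: `run_heartbeat` and `alert_payload` are hypothetical helpers, each step is any callable that returns `True` on success, and the payload is a generic dict rather than the real Slack or PagerDuty message format.

```python
from typing import Callable, Dict, List


def run_heartbeat(steps: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each end-to-end step and return the names of the steps that failed."""
    failures = []
    for name, check in steps.items():
        try:
            ok = check()
        except Exception:
            # A step that raises (e.g. the database is unavailable) counts as failed.
            ok = False
        if not ok:
            failures.append(name)
    return failures


def alert_payload(app: str, failures: List[str]) -> dict:
    """Build a generic alert; priority is high if any step failed."""
    return {
        "application": app,
        "priority": "high" if failures else "none",
        "failed_steps": failures,
    }
```

Each application supplies its own `steps` dictionary, which reflects the requirement that the end-to-end process be configured individually for each deployed application.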
All of the environments currently have Nessus scanning enabled. Kritis and image scanning within the Docker repository in Artifactory are currently being examined. Continual security and functional testing runs against all your deployed containers in order to ensure the integrity of your Kubernetes environment.
When is the image suitable for production use?
The image is considered suitable when the container code has been scanned, it has been established that all container components are free from known vulnerabilities, and QA has signed off on the container. The container is continuously tested while it is promoted along your CI/CD pipeline, from development to testing to CTE, etc. Testing involves functional validation, regression, performance, and security, as well as other relevant aspects. The criteria for the image to pass the testing, and to move to production, are primarily determined by the development team, in conjunction with the AppOps team.
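The promotion criteria can be sketched as a small gate function. This is an illustration only: the stage names, the set of required test categories, and the rule that QA sign-off is checked before the final move to production are assumptions made for the example, since the actual criteria are defined by the development and AppOps teams.

```python
# Assumed stage order and test categories, taken loosely from the text above.
STAGES = ["development", "testing", "CTE", "production"]
REQUIRED_TESTS = {"functional", "regression", "performance", "security"}


def next_stage(current: str, passed_tests: set, qa_signed_off: bool) -> str:
    """Return the stage an image moves to, or the current stage if it is held back."""
    idx = STAGES.index(current)
    if idx == len(STAGES) - 1:
        return current                      # already in production
    if not REQUIRED_TESTS <= passed_tests:
        return current                      # at least one test category failed
    target = STAGES[idx + 1]
    if target == "production" and not qa_signed_off:
        return current                      # QA sign-off is still missing
    return target
```

An image that passes every test category in CTE but lacks QA sign-off therefore stays in CTE until the sign-off is recorded.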
Use case 2: The container image is tested and considered unsuitable for use in the Kubernetes Private Registry
Currently, an add-only image mechanism is generally practiced: older images are not necessarily removed from the existing registry. Instead, a new image with all the necessary patches and security fixes is added and promoted through the CI/CD pipeline. Eventually, a process and mechanism to prune or delete deprecated images from the registry will be considered, but this does not significantly impact the CI/CD process. Because the Kubernetes manifests are managed in SCM (source control management) and are generally identical across environments, any usage of older or unsecure images is caught very early in your validation process.
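One way to catch older images early is to compare the image references in each environment's manifest against the set of currently approved images. The sketch below assumes Deployment-style manifests that have already been loaded into Python dicts; `images_in` and `find_stale_images` are hypothetical helper names.

```python
from typing import Dict, Set


def images_in(manifest: dict) -> Set[str]:
    """Collect the container image references from a Deployment-style manifest."""
    containers = manifest["spec"]["template"]["spec"]["containers"]
    return {c["image"] for c in containers}


def find_stale_images(manifests: Dict[str, dict], approved: Set[str]) -> Dict[str, Set[str]]:
    """Map each environment to the images it references that are not approved."""
    report = {}
    for env, manifest in manifests.items():
        stale = images_in(manifest) - approved
        if stale:
            report[env] = stale
    return report
```

An empty report means that every environment references only approved images; any entry names the environment and the outdated or unapproved image to replace.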
When is the image unsuitable for production use?
The container image is considered unsuitable for use in your Kubernetes instance if security or application defects are discovered during the ongoing evaluation of the image.
Existing container image-testing products
The below products scan container images that are located in private repositories, to verify that they are free from all known security vulnerabilities or exposures. The products then report the results of their scans to you.
Grafeas: Grafeas is a recent IBM-supported open source project that is intended to provide audit and governance functionality for the Kubernetes and microservices software supply chain. Microservices exist where a Kubernetes cluster is made up of numerous related containers that run individual functional components. Before new code is deployed, Kubernetes can verify all relevant information by using the Grafeas API. If the code is certified and free of vulnerabilities, then it can be deployed to production.
Kritis: Kritis is a Google tool that utilizes the above Grafeas data to determine policy for which containers should and should not be permitted to run on your Kubernetes cluster at deploy time.
Docker Cloud: Docker Cloud can store pre-built images, or link to your source code to build Docker images. Furthermore, you can verify the resulting Docker images before you push them to your Kubernetes instance private image registry.
Docker Hub: Docker Hub cloud-based repositories allow you to share images with users, and links to Docker Cloud. There is also an image verification component associated with this image sharing process.
Overall, the secure container image pipeline ensures the security and integrity of any Kubernetes containers that are instantiated in your Kubernetes instance. The security of your instance can be ensured by determining who can approve each image before submission to the image registry, and who can instantiate the image as a live container. This separation of duties between who approves the image, and who deploys the image, is key to ensuring that unsuitable or malicious containers are never in use.
The development, AppOps, and infrastructure teams work together in a process that ensures that any approved container meets stringent standards at all times. UCD (UrbanCode Deploy) is employed to move each container through each stage in this process, and also assists in configuring the monitoring of your container, so that the appropriate individuals are notified in the event of any container failure.
The combined structure and operation of this Kubernetes CI/CD secure container image pipeline, and the best practices that accompany it, support the safe and reliable operation of your Kubernetes instance. This is critical in any high-availability Kubernetes environment, where containers and their contents may be required to scale rapidly at any time to meet demand.
Current best practices
We recommend that you use a scanning mechanism for all your container images in your Kubernetes Private Registry, such as IBM’s free Image Scanning Service: http://imagescanner.bluemix.net/.
Future best practices
Grafeas and Kritis may need to be deployed to your Kubernetes environments in the future. However, because the CI/CD pipeline uses the TaaS (Testing as a Service) Artifactory, it may be possible to use JFrog Xray and AquaSec to carry out container vulnerability scanning. Even though private registries in your Kubernetes environments may not necessarily have this Artifactory, it is the single point that all container images pass through during the CI/CD process. So, it may be possible to gain the benefits of security scanning without having to install Grafeas and Kritis in every environment.