Always On and Resilient workloads with Chaos Engineering – 002

Abstract

I discuss and demonstrate how Chaos Engineering helps sustain five-nines (99.999%) Service Level Objectives across multiple regions and multiple clouds.

I present architectural methods, patterns, and practices for developers, SREs, and software architects who build and maintain cloud-native applications and services that must provide the highest levels of availability. The methods describe how to achieve practical five-nines (99.999%) availability for end-to-end business services by combining Site Reliability Engineering (SRE), DevOps, Microservices, Chaos Engineering, Cloud-native Architectures, Application Modernization, Multi-Availability Regions, Geo-dispersity, Data Consistency, Performance and Scalability, Content Delivery Networks (CDN), and Software-defined Environments (SDE).
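To make the five-nines target concrete, the sketch below works out how much downtime a 99.999% objective allows per year, and how redundant regions combine into a higher overall availability. It is a minimal illustration, not part of the talk material: the function names are hypothetical, and the multi-region formula assumes that regions fail independently, which real deployments only approximate.

```python
# Average year length in minutes (365.25 days, which averages in leap years).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_budget_minutes(slo: float, period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime allowed by an availability target such as 0.99999."""
    return (1.0 - slo) * period_minutes


def parallel_availability(per_region: float, regions: int) -> float:
    """Availability of N redundant regions.

    Assumption: regions fail independently, so the service is down only
    when every region is down at the same time.
    """
    return 1.0 - (1.0 - per_region) ** regions


if __name__ == "__main__":
    # Five nines leaves roughly 5.26 minutes of downtime per year.
    print(f"{downtime_budget_minutes(0.99999):.2f} minutes per year")
    # Two independent regions at 99.9% each combine to about 99.9999%.
    print(f"{parallel_availability(0.999, 2):.6f}")
```

The small yearly budget is why a single region rarely suffices on its own, and why deliberately injecting failures is useful for checking that the independence assumption holds in practice.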

Speaker Bio

Haytham Elkhoja is the technical leader of the Always On practice at IBM. He works with large customers to re-architect mission-critical applications into resilient cloud-native architectures, aiming for the highest levels of service availability, resiliency, and reliability by combining continuous availability, site reliability engineering, chaos engineering, cloud platforms, and infrastructure automation.