What’s new in Apache Spark 3.0

By Huaxin Gao, Mei Mei Fu | Published June 30, 2020

The Apache Spark 3.0 release is the result of more than 3,400 fixes and improvements from more than 440 contributors worldwide.

Analyze your Spark application using explain

By Sunitha Kambhampati | Published June 30, 2020

Learn how to get the Spark query execution plan using the explain API to debug and analyze your Apache Spark application.

Explore best practices for Spark performance optimization

By Sunitha Kambhampati | Published June 30, 2020

Learn some performance optimization tips to keep in mind when developing your Spark applications.

Build a recommendation engine using Apache Spark and Elasticsearch

By Nick Pentreath | Published June 15, 2020

Learn how to use Apache Spark and new vector scoring functions in Elasticsearch to build and deploy recommender models.

Customize Spark for your deployment

By Sunitha Kambhampati | Published October 2, 2019

Learn about enhancements to Spark's Extension Points API.

Data design for partitioned databases

By Glynn Bird | Published March 25, 2019

Design an application's data to drive faster performance, lower cost, and future scalability.

Open source and AI at IBM

By Vijay Bommireddipalli, Mei Mei Fu, Bradley Holt, Susan Malaika, Animesh Singh, Jim Spohrer, Thomas Truong | Published December 12, 2018

The past, present, and future of open source and AI at IBM.

IBM continues commitment to Apache Spark

By Mei Mei Fu | Published November 29, 2018

Get highlights from the latest Apache Spark v2.4.0 release.