An open-source analytics engine for large-scale data processing.
Before open source was cool, IBM worked to establish open source as technology that's safe (and good!) for the enterprise.
Mar 19, 2019
Develop, train, and deploy a spam filter model on Hortonworks Data Platform using Watson Studio Local
Stewarding open source for the future
Create a Spark service for IBM Watson Studio
Continuous learning with Watson Machine Learning and IBM Db2 Warehouse on Cloud
Feb 08, 2019
In this code pattern, we’ll use IBM Cloud Private for Data and load customer demographic and trading activity data into IBM Db2 Warehouse. From there, we'll analyze the data using a Jupyter notebook with Brunel visualizations.
Feb 04, 2019
Apache Spark, Artificial intelligence
Customize a notebook package to include Anaconda, Watson PowerAI, and sparkmagic, then use it to run a Keras model, connect to a Hadoop cluster, and execute a Spark MLlib model.
Jan 30, 2019
In this code pattern, we’ll use Jupyter notebooks to load IoT sensor data into IBM Db2 Event Store. From there, we'll query and analyze the data using Jupyter notebooks with Spark SQL and Matplotlib. Finally, we'll use Spark Machine Learning Library to create a model that will predict the temperature…
Jan 10, 2019
In this tutorial, we will run an end-to-end application built on top of IBM Db2 Event Store. The application represents a simplified IoT use case in which sensor data is streamed to the Event Store and visualized.
Dec 12, 2018
The past, present, and future of open source and AI at IBM.
Nov 29, 2018
Get highlights on the latest Apache Spark v2.4.0 release.
Nov 08, 2018
Gain a basic understanding of graph-based metadata management in enterprise data governance, with Apache Atlas as a prime example.
Nov 05, 2018
This code pattern demonstrates how data scientists can leverage IBM Watson Studio Local to automate the building and training of a machine learning model to classify wines.
Oct 30, 2018
Develop, train, and deploy a spam filter model on Hortonworks Data Platform using Watson Studio Local.
Oct 24, 2018
Learn how a team added support for informational primary key and foreign key (referential integrity) constraints in Spark to achieve enterprise-level TPC-DS performance.
Oct 23, 2018
Currently, filters are pushed down to the data source layer for better performance, but aggregates are still computed in the Spark layer. Pushing aggregates down to the data source as well can further improve Spark performance.
Oct 22, 2018
Quickly build and prototype models, monitor deployments, and learn over time as more data becomes available.
Oct 12, 2018
Apache Spark, Data Science
Sam Couch analyzes Olympic medal wins with Apache Spark and PixieDust to pull out meaningful data.
Sep 24, 2018
Use Jupyter Notebooks with IBM Watson Studio to build an interactive recommendation engine PixieApp.
Aug 30, 2018
Leverage R4ML and Watson Studio to conduct preprocessing and exploratory analysis with big data.
Aug 22, 2018
Create bar charts, line charts, scatter plots, pie charts, histograms, and maps without any coding.
Aug 06, 2018
Use Watson Studio and the scalable machine learning tool R4ML to load a dataset and perform uniform sampling for visual data exploration.
Jul 16, 2018
Learn how to perform data analysis with Apache Spark on z/OS.
Apr 10, 2018
How to add a Spark service for use in a Jupyter notebook on IBM Watson Studio.
Oct 13, 2017
Learn how to set up and run Apache SystemML in an Apache Spark shell using IBM Analytics Engine and IBM Cloud.
Aug 09, 2017
Drive value by acquiring, curating, cleansing, analyzing, visualizing, and enriching data.