IBM® Analytics for Apache Spark for Bluemix is a managed service built on Apache Spark, an open source cluster computing framework optimized for fast, large-scale data processing. Spark’s Streaming and SQL programming models, backed by MLlib and GraphX, make it easy for developers and data scientists to build apps that exploit machine learning and graph analytics. Because the service is 100% compatible with Apache Spark, you can build your app as you would for any Spark cluster and run it against IBM’s Spark service, leaving operational, maintenance, and hardware concerns to IBM.
How to develop with IBM Analytics for Apache Spark
There are two ways to develop on our Spark service:
- On Bluemix: use spark-submit to work with Spark programmatically. This suits large nightly batch jobs, which can launch unattended from a script triggered by a cron job and require no interaction to complete. Learn more about spark-submit.
- On Data Science Experience (DSx): use Jupyter notebooks to edit and execute Scala, R, or Python analysis code. Our sample notebooks are useful for getting started, uploading data, quickly experimenting with analysis, and bringing in additional libraries for use with your IBM Analytics for Apache Spark instance. Learn more about notebooks.
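As a rough illustration of the unattended batch workflow described for spark-submit, the sketch below assembles a spark-submit command line and runs it with no interactive input. It assumes the standard Apache Spark spark-submit options (--master, --name, --conf); the launcher provided with the Bluemix service may expect different or additional arguments, so treat the option names, the helper functions, and any file names passed to them as hypothetical placeholders rather than the service's documented interface.

```python
import subprocess


def build_spark_submit_command(app_path, master, app_args=(), conf=None, name=None):
    """Return the argv list for a non-interactive spark-submit run.

    Assumes the stock Apache Spark spark-submit options; adjust for the
    launcher your service actually provides.
    """
    cmd = ["spark-submit", "--master", master]
    if name:
        cmd += ["--name", name]
    # Each Spark property is passed as its own "--conf key=value" pair.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    # The application file follows the options; its own arguments come last.
    cmd.append(app_path)
    cmd.extend(str(arg) for arg in app_args)
    return cmd


def run_batch_job(cmd):
    """Run the command with stdin closed off and return its exit code."""
    # DEVNULL as stdin means the job can never block waiting for user input,
    # which is what an unattended, cron-triggered run needs.
    result = subprocess.run(cmd, stdin=subprocess.DEVNULL)
    return result.returncode
```

A launcher script could call run_batch_job(build_spark_submit_command("nightly_report.py", "local[*]")), where the application file and master URL are placeholders, and exit with the returned code. A cron entry that invokes that script each night would then start the job with no interaction, and a nonzero exit code lets cron or a wrapper flag a failed run.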