IBM® Analytics for Apache Spark for Bluemix
Apache® Spark™ is an open source cluster computing framework optimized for fast, large-scale data processing. Spark’s Streaming and SQL programming models, together with the MLlib and GraphX libraries, make it easy for developers and data scientists to build apps that use machine learning and graph analytics. Because the service is 100% compatible with Apache Spark, you can build your app as you normally would and run it against IBM’s Spark service, leaving operational, maintenance, and hardware concerns to IBM.
How to develop with IBM Analytics for Apache Spark
There are two ways to develop on our Spark service:
- spark-submit lets you work with Spark programmatically. Use it to run large nightly batch jobs that launch unattended from a script triggered by a cron job and require no interaction to complete. Read the docs.
- Interactive Notebooks. You’ll find our sample notebooks useful for getting started, uploading data, quickly experimenting with analyses, and bringing in additional libraries for use with your IBM Analytics for Apache Spark instance. Browse them here, download them from GitHub, and get to insights faster with these helpful guides. Get started with Notebooks.
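
The unattended, cron-triggered batch run described in the spark-submit bullet might look roughly like the sketch below. It assumes the standard Apache Spark `spark-submit` command is on the PATH; the submission script and options for the IBM service may differ, so check the service docs. The application file `nightly_job.py`, the job name, and both function names are hypothetical and chosen only for illustration.

```python
import shlex
import subprocess
from datetime import date, timedelta


def build_submit_command(app_path, run_date, master="local[*]"):
    """Build the argument list for a non-interactive spark-submit run.

    Assumes the standard Apache Spark CLI; --master and --name are
    standard spark-submit options. The master URL is a placeholder.
    """
    day = run_date.isoformat()
    return [
        "spark-submit",
        "--master", master,
        "--name", f"nightly-batch-{day}",
        app_path,  # hypothetical Spark application
        day,       # passed to the application as its only argument
    ]


def run_nightly_job(app_path="nightly_job.py"):
    """Submit yesterday's batch and return the exit code for cron to record."""
    run_date = date.today() - timedelta(days=1)
    cmd = build_submit_command(app_path, run_date)
    print("Running:", shlex.join(cmd))
    # No prompts or stdin: the job must complete without interaction.
    result = subprocess.run(cmd, stdin=subprocess.DEVNULL, check=False)
    return result.returncode
```

To run this from cron, a small wrapper script could call `run_nightly_job()` and exit with its return value, so that a failed submission shows up as a nonzero exit status in the cron log.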