The most common R packages used by data scientists are preinstalled, so you only need to load them. These include SparkR, a package that provides a lightweight frontend for using Apache Spark from R. In R notebooks, a SparkContext and a SQLContext are preconfigured for you as sc and sqlContext, respectively. You can list the installed packages, and install and load new ones as required.
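
As a rough illustration of listing, installing, and loading packages, here is a minimal sketch in base R. The helper name load_or_install is hypothetical and used only for this example, and install.packages assumes the notebook can reach a CRAN mirror.

```r
# List the names of all packages installed in the current library paths.
installed <- rownames(installed.packages())

# Check whether SparkR is among them (expected to be TRUE in an R notebook).
has_sparkr <- "SparkR" %in% installed

# Hypothetical helper: install a package only if it is missing, then load it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # assumes a CRAN mirror is reachable
  }
  library(pkg, character.only = TRUE)
}

# "parallel" ships with R, so this call only attaches it.
load_or_install("parallel")
```

The same call with any other package name installs that package first if it is not already present.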

Watch this short video, which walks through a sample notebook that introduces basic Spark concepts and helps you start using Spark for R. In the notebook, you’ll use the publicly available mtcars data set from Motor Trend magazine to learn some basic R. You’ll learn how to load data, create a Spark DataFrame, aggregate data, run mathematical formulas, and run SQL queries against the data. To find the notebook, open the Notebooks section of the Data Science Experience Community from within the IBM Data Science Experience and search for Spark R.
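
The steps covered by the notebook might look roughly like the sketch below. It is not taken from the notebook itself. It assumes an R notebook where sqlContext is already defined, and it uses the Spark 1.x style SparkR calls that take sqlContext as their first argument; newer Spark releases use different entry points. The table name cars and the column alias avg_mpg are chosen only for this example.

```r
library(SparkR)

# Create a Spark DataFrame from the local mtcars data frame.
# sqlContext is assumed to be preconfigured by the notebook.
cars_df <- createDataFrame(sqlContext, mtcars)

# Look at the first few rows.
head(cars_df)

# Aggregate: average miles per gallon for each cylinder count.
by_cyl <- agg(groupBy(cars_df, cars_df$cyl), avg_mpg = avg(cars_df$mpg))
head(by_cyl)

# Register the DataFrame as a temporary table and run the same
# aggregation as a SQL query.
registerTempTable(cars_df, "cars")
result <- sql(sqlContext,
              "SELECT cyl, AVG(mpg) AS avg_mpg FROM cars GROUP BY cyl")

# Bring the (small) result back into a local R data frame.
local_result <- collect(result)
```

Both the agg call and the SQL query compute the same per-group average, so either style can be used depending on which reads more naturally.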

Get started with Spark for R
