Big R provides end-to-end integration of R within IBM InfoSphere BigInsights. This makes it easy to write and execute R programs that operate on data stored in a Hadoop cluster. Using Big R, an R user can explore, transform, and analyze big data hosted in a BigInsights cluster using familiar R syntax and paradigms. All of these capabilities are accessible from a standard R client.
- Enables the use of R as a query language for big data, hiding many of the complexities of the underlying Hadoop/MapReduce framework
- Uses classes like bigr.frame, bigr.list, and bigr.vector, presenting the user with an API that is heavily inspired by R’s foundational API for data.frames, lists, and vectors
- Uses groupApply, rowApply, and tableApply to push down R functions so that they run directly on the data in the cluster
- BigInsights transparently parallelizes the execution of these functions and returns consolidated results to the user
- Almost any R code can be run, including most packages from open source repositories like CRAN
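The pushdown idea behind groupApply can be pictured as a split-apply-combine pattern. The sketch below is a local illustration in base R only: it does not call Big R, and it does not reproduce the actual groupApply signature, which is not given here. The flights data and the summarize_group function are hypothetical names made up for the example.

```r
# Local, base-R illustration of split-apply-combine.
# Assumption: this mimics the pattern only; it runs in the local R session
# and does not use Big R or a BigInsights cluster.

# Hypothetical sample data standing in for a much larger data set.
flights <- data.frame(
  carrier = c("AA", "AA", "UA", "UA", "UA"),
  delay   = c(10, 20, 5, 15, 40)
)

# The function to apply to each group; it receives one group as a data.frame.
summarize_group <- function(df) {
  data.frame(
    carrier    = df$carrier[1],
    mean_delay = mean(df$delay),
    n          = nrow(df)
  )
}

groups  <- split(flights, flights$carrier)   # split the rows by carrier
partial <- lapply(groups, summarize_group)   # apply the function per group
result  <- do.call(rbind, partial)           # combine the partial results

print(result)
```

In this local version every step runs in a single R process. With Big R, the per-group function is what would be shipped to the cluster and run in parallel, with the combined result returned to the R client.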
Learn by doing! Get a firsthand look at Big R with these tutorials.