IBM Data Refinery is a self-service data preparation client for data scientists, data engineers, and business analysts. With it, you can quickly transform large amounts of raw data into consumable, quality information that’s ready for analytics. IBM Data Refinery makes it easy to explore, prepare, and deliver data that people across your organization can trust.
Key features include:
- Ability to access data wherever it resides: in the cloud, on-premises, or on your desktop
- Powerful shaping operations to clean, organize, fix, and validate data
- Scripting support for dplyr, RStudio’s R package for efficient and flexible manipulation of data sets
- Support for single- and multi-column operations and the creation of complex new columns from existing columns
- Ability to undo, redo, and delete steps in a data flow
- Monitoring of data preparation flows
- Interactive data validation and automatic detection of anomalies such as missing values, outliers, and duplicates
- Visualizations that provide insight into large amounts of data
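
To make the anomaly checks named in the list above more concrete, here is a minimal sketch of what detecting missing values, duplicates, and outliers in a single numeric column can look like. It is not IBM Data Refinery's implementation or API: the function name `profile_column` is hypothetical, and the 1.5 × IQR outlier rule is an assumption chosen only for illustration.

```python
from collections import Counter
from statistics import quantiles


def profile_column(values):
    """Flag missing values, duplicates, and outliers in one numeric column.

    Hypothetical helper for illustration only; not part of any IBM product.
    """
    # Treat None and empty strings as missing, and record their positions.
    missing = [i for i, v in enumerate(values) if v is None or v == ""]
    present = [v for v in values if v is not None and v != ""]

    # A value is a duplicate if it appears more than once.
    counts = Counter(present)
    duplicates = sorted(v for v, n in counts.items() if n > 1)

    # Assumed outlier rule: anything beyond 1.5 * IQR from the quartiles.
    outliers = []
    if len(present) >= 4:
        q1, _, q3 = quantiles(present, n=4)
        spread = 1.5 * (q3 - q1)
        outliers = [v for v in present if v < q1 - spread or v > q3 + spread]

    return {"missing": missing, "duplicates": duplicates, "outliers": outliers}


if __name__ == "__main__":
    print(profile_column([10, 12, 11, None, 12, 13, 95, 11]))
```

A real data preparation tool would likely apply checks like these per column across a whole data set and let you choose the outlier rule; the sketch only shows the idea on one list of values.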
A beta release is coming soon, so stay tuned.
If you’re participating in the closed beta, sign in to IBM Data Science Experience to view the getting started materials.