For decades, companies have tried to break down silos by copying data from different operational systems into central data stores for analysis, such as data marts, data warehouses, and data lakes. This copying is often costly and prone to error. Most companies struggle to manage an average of 33 unique data sources, which are diverse in structure and type and are often trapped in silos where the data is hard to find and access.
With data virtualization, you can query data across many systems without having to copy it, which helps reduce costs. It can also simplify your analytics and make them more up to date and accurate because you're querying the latest data at its source.
Watson Query
Watson Query connects multiple data sources across locations and turns all of this data into one virtual data view. This read-only data view makes it easier to get value out of your data. After you create connections to your data sources, you can quickly view all of your organization's data. This virtual data view enables real-time analytics without data movement, duplication, ETL jobs, or additional storage, so processing is much faster.
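The idea of a single virtual view over separate sources can be illustrated with a small analogy. The following sketch uses Python's built-in sqlite3 module, not Watson Query itself: two attached in-memory databases stand in for independent source systems, and a temporary view joins them without copying any rows. The database names (crm, billing), table names, and sample values are hypothetical and exist only for illustration.

```python
import sqlite3


def build_virtual_view() -> sqlite3.Connection:
    """Create two stand-in source databases and one view that spans both."""
    conn = sqlite3.connect(":memory:")

    # Two independent "source systems", each in its own database.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")

    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )

    # The view stores only the query text, never a copy of the rows,
    # so every read goes back to the source tables.
    conn.execute(
        """
        CREATE TEMP VIEW customer_totals AS
        SELECT c.name AS name, SUM(i.amount) AS total
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.id, c.name
        """
    )
    return conn


def totals(conn: sqlite3.Connection) -> list:
    """Read the virtual view; results reflect the sources at query time."""
    return conn.execute(
        "SELECT name, total FROM customer_totals ORDER BY name"
    ).fetchall()
```

Because the view holds no data of its own, a row added to billing.invoices shows up the next time totals() is called, which mirrors the point about querying the latest data at its source.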
Security
Centralized authentication and authorization are enforced for platform users to access data sources in a trusted environment. Various data virtualization roles provide granular access management to the virtualized assets. If you need to use data virtualization functions, you must be assigned specific data virtualization roles based on your job description.
All communication between the environment and the application is encrypted with IBM technology and SSL/TLS, using standard protocols.
Platform support
Watson Query supports queries by using standard SQL through common interfaces such as R, Spark, Python, and Jupyter Notebooks. Queries are also supported by the most common analytics application tools, including IBM Watson Studio and IBM Cognos Analytics.
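As a rough sketch of what a query from Python might look like, the example below assumes that the service accepts Db2 client connections and that the third-party ibm_db driver is installed. Neither assumption is confirmed by this page, so check the connection details in the product documentation. The helper names build_dsn and fetch_rows, and every host, port, database, and credential value, are hypothetical.

```python
def build_dsn(host: str, port: int, database: str, user: str, password: str) -> str:
    """Assemble a Db2-style connection string (assumed format, with TLS on)."""
    parts = {
        "DATABASE": database,
        "HOSTNAME": host,
        "PORT": str(port),
        "PROTOCOL": "TCPIP",
        "UID": user,
        "PWD": password,
        "SECURITY": "SSL",  # request an encrypted connection
    }
    return "".join(f"{key}={value};" for key, value in parts.items())


def fetch_rows(dsn: str, sql: str) -> list:
    """Run one SQL statement and return the rows as a list of dicts."""
    # Imported here so that build_dsn works without the driver installed.
    import ibm_db  # third-party package, assumed to be available

    conn = ibm_db.connect(dsn, "", "")
    try:
        stmt = ibm_db.exec_immediate(conn, sql)
        rows = []
        row = ibm_db.fetch_assoc(stmt)  # returns False when no rows remain
        while row:
            rows.append(row)
            row = ibm_db.fetch_assoc(stmt)
        return rows
    finally:
        ibm_db.close(conn)
```

A call such as fetch_rows(build_dsn(...), "SELECT * FROM MYSCHEMA.MYVIEW FETCH FIRST 10 ROWS ONLY") would then read from a virtual object, where MYSCHEMA.MYVIEW is a placeholder for a schema and view that you have access to.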
After you've initiated the Watson Query service, you can:
Connect to data sources: Watson Query supports many relational and nonrelational data sources that you can add to your data source environment. Watson Query connects to relational data sources by using the Java™ Database Connectivity (JDBC) protocol. To learn more, see Connecting and authenticating to the Watson Query service.
Create virtual objects in Watson Query: You can use the Watson Query service to create virtual objects from various data sources so that you can query and use the data as if it came from a single source. Watson Query supports creating a virtual object from a single table, from multiple tables, or from files. You can also create a join view from multiple virtualized tables. To learn more, see Creating virtual objects in Watson Query.
Manage access to virtual objects in Watson Query: Watson Query administrators and engineers can grant users or groups access to virtual objects in Watson Query. To learn more, see Managing access to virtual objects in Watson Query.
Governing virtual data in Watson Query
Watson Query can integrate with Watson Knowledge Catalog to govern the virtual data that you publish to governed catalogs. Data governance involves applying business context, data policies, and data protection rules to your virtual data. To learn more, see Governing virtual data in Watson Query.
Summary
This section described data virtualization within IBM Cloud Pak for Data. You can view the product documentation to learn more about this topic.