Data and AI applications with Palantir for IBM Cloud Pak for Data
Learn about creating AI-infused apps with Palantir ontology management, and about the underlying integration architecture.
Palantir for IBM Cloud Pak for Data enables customers to build no-/low-code line-of-business applications using data and AI models from IBM Cloud Pak for Data. Ontology managers can define business-oriented data models integrating data services from IBM Cloud Pak for Data. Application builders can use Palantir tools to create line-of-business applications using these data models and can integrate AI models created by data scientists and deployed by ML operators on IBM Cloud Pak for Data. This blog post explains how to create AI-infused apps using Palantir ontology management and app building tools together with the data and AI catalog, projects, and model deployment spaces on IBM Cloud Pak for Data. It also outlines the underlying integration architecture.
IBM Cloud Pak for Data as the data and AI foundation
IBM Cloud Pak for Data together with Palantir provides integrated capabilities to:
- Collect, transform, and integrate data from many sources
- Organize data to be ready for use in projects and applications
- Analyze data to gain insights and create AI models
- Infuse AI insights such as predictions and optimization via APIs where needed
- Build applications using no-/low-code app builders, integrating data and AI on multiple clouds while leveraging Red Hat OpenShift as the underlying platform
Applications that application builders create with Palantir for IBM Cloud Pak for Data using no-/low-code tools can use data, predictions, and optimization from IBM Cloud Pak for Data, helping business users achieve smarter business outcomes.
Data engineers can create data services in IBM Cloud Pak for Data such as Db2, Db2 Warehouse, Postgres, and more to collect data, and can build a catalog of data assets available for data scientists and app builders to use. Where needed, they can use DataStage flows or other tools to transform data from multiple sources and use data virtualization services.
Data scientists can collaborate in Projects, add data sets from the catalog or from other data sources, analyze data, gain insights, train ML models, and define decision optimization models. To train models, they may use Python code in JupyterLab with their favorite ML framework, SPSS Modeler flows, or AutoAI, as shown below.
Models can be saved and deployed to Spaces, as shown below, to make them available for AI infusion into business processes and applications. The deployed model can then be called via the model deployment REST API.
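As a rough illustration of calling a deployed model over REST, the sketch below builds (but does not send) a scoring request in Python. The host name, deployment ID, token, version date, and field names are placeholders, and the `/ml/v4/deployments/{id}/predictions` path with an `input_data` payload is an assumption modeled on the Watson Machine Learning v4 API; check the API reference of your installation for the exact request shape.

```python
import json
import urllib.request


def build_scoring_request(host, deployment_id, token, fields, rows, version="2020-09-01"):
    """Build (without sending) a POST request for a model deployment scoring endpoint.

    The URL layout and payload shape are assumptions, not taken from this post.
    """
    url = f"https://{host}/ml/v4/deployments/{deployment_id}/predictions?version={version}"
    payload = {"input_data": [{"fields": fields, "values": rows}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are hypothetical placeholders.
request = build_scoring_request(
    host="cpd.example.com",
    deployment_id="my-deployment-id",
    token="<access-token>",
    fields=["age", "tenure_months"],
    rows=[[42, 18]],
)
# Sending the request would be: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload before pointing it at a real cluster.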
Building data and AI applications with Palantir for IBM Cloud Pak for Data
Application builders can build rich no-/low-code applications using the Palantir app builder tools available through a new Palantir card on the IBM Cloud Pak for Data page.
From here, ontology managers can navigate to the Palantir UI to define and manage Palantir ontologies, integrating data from IBM Cloud Pak for Data. Application builders can navigate to the Palantir UI to build apps using ontologies and connect ML models from IBM Cloud Pak for Data to integrate predictions into applications. Once in the Palantir UI, they can integrate AI models from IBM Cloud Pak for Data into Palantir apps via Manage models and can integrate data from IBM Cloud Pak for Data into a Palantir ontology via Manage ontology.
To enable Palantir applications, a business-oriented ontology needs to first be defined using Palantir ontology management, which integrates with the data sets from the data and AI catalog in IBM Cloud Pak for Data. From the ontology management UI, users can search the IBM Cloud Pak for Data catalog for data assets to use and then drill down into the columns or object attributes of the data set to map these to business objects defined in the Palantir ontology.
The underlying data behind the data assets is then synchronized from the referenced data source into the Palantir ontology storage to make it queryable and searchable for Palantir apps.
As a result of ontology modeling and data integration, a Digital Twin of business objects for the company or organization is represented in the ontology data model, structured in a way that business users can relate to and that is suitable for building no-/low-code apps intended for them.
Building on the business-oriented ontology data model, application builders can create a range of apps by selecting relevant business objects and creating application views, forms, and dashboards on top. To integrate an ML model from IBM Cloud Pak for Data, the app builder uses the import model function to browse Spaces and model deployments from IBM Cloud Pak for Data and pick the appropriate model to use and connect to.
Application builders can map model parameters and ontology object attributes, so that the model will get the right input, and the output of the model can be stored back to the ontology.
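The mapping step can be pictured as two small translations: ontology attributes to model input fields, and model output fields back to ontology attributes. The following is a minimal Python sketch of that idea, not Palantir's implementation; the object, attribute names, and mapping dictionaries are all hypothetical.

```python
def to_model_input(obj, input_mapping):
    """Translate an ontology object into model input fields and one row of values."""
    fields = list(input_mapping.values())
    values = [obj[attr] for attr in input_mapping]
    return fields, values


def store_model_output(obj, prediction, output_mapping):
    """Return a copy of the object with model outputs written to mapped attributes."""
    updated = dict(obj)
    for model_field, attr in output_mapping.items():
        updated[attr] = prediction[model_field]
    return updated


# Hypothetical ontology object and mappings (ontology attribute <-> model field).
customer = {"customerAge": 42, "monthsAsCustomer": 18}
input_mapping = {"customerAge": "age", "monthsAsCustomer": "tenure_months"}
output_mapping = {"prediction": "churnRisk"}

fields, values = to_model_input(customer, input_mapping)
scored = store_model_output(customer, {"prediction": "high"}, output_mapping)
```

Returning a copy rather than mutating the object keeps the original ontology data unchanged until the result is explicitly stored.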
As a result, the new Model Objective that wraps the AI model from IBM Cloud Pak for Data becomes available for use by Palantir apps. Application builders can now use the Model Objective to integrate predictions from the AI model from IBM Cloud Pak for Data into their apps.
Once an app is finished and goes live, business users can start using it via web browsers or mobile devices, interacting with data and AI provided by IBM Cloud Pak for Data via the line-of-business app created using Palantir. For example, the Palantir app shown below uses data from data services and predictions from AI models from IBM Cloud Pak for Data.
Stay tuned for future integrations with other IBM Cloud Pak for Data services, like Decision Optimization and beyond.
IBM Cloud Pak for Data together with Palantir can be installed on the same underlying Red Hat OpenShift cluster with single sign-on (SSO). Application builders, data scientists, and data engineers can log into IBM Cloud Pak for Data, which shows each user what is relevant to them.
Data scientists and data engineers can create projects or navigate to projects they have been added to. They can set up data transformation pipelines to access and transform data from external sources, add the resulting data assets to the project, and publish data assets intended for use in other projects or by applications to the catalog. This makes these data assets explorable and searchable via the catalog’s REST APIs and UI. Data scientists can analyze data and train models in projects and can make models that are ready for use available for deployment in Spaces, which allows applications to find and invoke them via REST APIs.
App builders will find a new Applications card on the IBM Cloud Pak for Data page to navigate to Palantir ontology management and Palantir applications. In defining an ontology, Palantir ontology management uses IBM Cloud Pak for Data catalog REST APIs to let the user find and pick data assets and map them into the ontology. This enables the translation from technical column names to business-oriented terms in the ontology to achieve a business user-friendly data model. Once mapped, the referenced data is synchronized from the data sources into the Palantir ontology storage. The synchronized data is also indexed into the ontology’s search index to make it available for low-latency queries and searches by apps without burdening source data services with unnecessary application-induced workload. It is important to provision enough file system volume capacity for both the ingested data and the search index that enables search over it.
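To make the two ideas in this paragraph concrete, the sketch below renames technical columns to business terms and then builds a small in-memory inverted index over the resulting objects, so searches are answered without going back to the source. This is only a toy illustration in Python; the column names, business terms, and index structure are assumptions and say nothing about how Palantir ontology storage is actually implemented.

```python
from collections import defaultdict


def to_business_objects(rows, column_mapping):
    """Rename technical column names to business-oriented attribute names."""
    return [
        {column_mapping[col]: value for col, value in row.items() if col in column_mapping}
        for row in rows
    ]


def build_search_index(objects):
    """Build a simple inverted index: lowercase token -> positions of matching objects."""
    index = defaultdict(set)
    for position, obj in enumerate(objects):
        for value in obj.values():
            for token in str(value).lower().split():
                index[token].add(position)
    return index


def search(index, objects, term):
    """Answer a search from the index alone, without touching the source rows."""
    return [objects[i] for i in sorted(index.get(term.lower(), ()))]


# Hypothetical source rows and column-to-term mapping.
source_rows = [
    {"CUST_NM": "Acme Corp", "CUST_CTY": "Berlin"},
    {"CUST_NM": "Globex", "CUST_CTY": "Madrid"},
    {"CUST_NM": "Initech", "CUST_CTY": "Berlin"},
]
column_mapping = {"CUST_NM": "Customer Name", "CUST_CTY": "City"}

customers = to_business_objects(source_rows, column_mapping)
index = build_search_index(customers)
berlin_customers = search(index, customers, "Berlin")
```

The index is held in addition to the synchronized objects, which is why storage has to be sized for both.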
Application builders use the Palantir app builder tools to create and customize app UIs that consist of object type views that can be filtered and searched, object page views, dashboards, and more. They operate on top of the Palantir Ontology REST API. Optionally, apps can include elements of workflow and low code to extend application logic.
The integration of Palantir app builder tools with ML models deployed in IBM Cloud Pak for Data Spaces uses REST APIs to browse Spaces and model deployments and let the application builder pick the model deployments to integrate into their app. Ontology data attributes are mapped to input parameters of the model, and model scoring output values are mapped to result data that the app can include in its UI. At runtime, the live app can display data along with relevant predictions: it uses an application ID to call the model scoring REST API, which in turn invokes the model running on IBM Cloud Pak for Data. For this to work, the ID that the Palantir app uses to invoke models must first be granted access to the space in which the model is deployed.
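The runtime side can be sketched in Python as two steps: check that the application ID has access to the space, then turn the scoring output into records the app can display. The response shape used here (`predictions` blocks with `fields` and `values`) is an assumption modeled on the Watson Machine Learning v4 API, and the ID and membership set are hypothetical; real access control is enforced by IBM Cloud Pak for Data, not by client code like this.

```python
def can_invoke(app_id, space_members):
    """Precondition from the text: the ID must have been granted access to the space."""
    return app_id in space_members


def predictions_to_records(response):
    """Flatten a scoring response into one dict per scored row."""
    records = []
    for block in response.get("predictions", []):
        for row in block["values"]:
            records.append(dict(zip(block["fields"], row)))
    return records


# Hypothetical space membership and scoring response.
space_members = {"palantir-app-id", "data-scientist-1"}
response = {
    "predictions": [
        {
            "fields": ["prediction", "probability"],
            "values": [["churn", [0.2, 0.8]], ["stay", [0.9, 0.1]]],
        }
    ]
}

allowed = can_invoke("palantir-app-id", space_members)
records = predictions_to_records(response) if allowed else []
```

Each record can then be matched to the ontology object that produced the corresponding input row and shown next to it in the app UI.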
Both apps and models can evolve over time. New application versions can be made available to business users via Palantir, and new models can be made available via IBM Cloud Pak for Data model deployments. The latter is possible while keeping the model deployment URL stable, so that a new model can replace the previously used model behind a model deployment endpoint without disrupting applications.
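The idea of a stable deployment URL can be shown with a toy Python model of a deployment: callers hold on to the URL, and the model behind it is swapped without the URL changing. The class, URL, and both model functions are hypothetical stand-ins, not the IBM Cloud Pak for Data deployment API.

```python
class Deployment:
    """Toy deployment endpoint: a fixed URL in front of a replaceable model."""

    def __init__(self, url, model):
        self.url = url
        self._model = model

    def replace_model(self, model):
        """Swap the model behind the endpoint; the URL stays the same."""
        self._model = model

    def score(self, row):
        return self._model(row)


def model_v1(row):
    # Hypothetical first model: short tenure means high churn risk.
    return "high" if row["tenure_months"] < 12 else "low"


def model_v2(row):
    # Hypothetical retrained model with a different threshold.
    return "high" if row["tenure_months"] < 6 else "low"


deployment = Deployment("https://cpd.example.com/ml/v4/deployments/churn/predictions", model_v1)
url_before = deployment.url
before = deployment.score({"tenure_months": 8})

deployment.replace_model(model_v2)
after = deployment.score({"tenure_months": 8})
```

An app that stored only `deployment.url` keeps working after the swap and simply starts receiving predictions from the newer model.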
As a future direction, we intend to enable write-back from Palantir apps, with changes synchronized via the ontology back to data services on IBM Cloud Pak for Data, and to share write-back data from Palantir apps to the IBM Cloud Pak for Data catalog. Data scientists can then add the resulting data assets to projects, analyze data from Palantir apps, and use it to train models.