An introduction to Content Management Interoperability Services (CMIS)

OASIS standard enables interoperability among different content management tools

By

Mayur Jain

In today’s rapidly changing IT landscape, enterprises are rarely limited to a single content management tool; they often end up with a diverse mix of products from different vendors. This allows them to avoid dependency on a specific vendor, and enables them to leverage specialized features to meet their business requirements. This article describes Content Management Interoperability Services (CMIS), a vendor-neutral OASIS standard that enables interoperability among different content management tools.

This standard makes it possible for a developer to write a client application for IBM FileNet, and the same application will seamlessly integrate with Alfresco or Microsoft SharePoint (or any other CMIS-compliant enterprise content management tool) with no code changes. This is similar to USB-C chargers for mobile phones or HDMI/USB ports in computers, as they allow standardized interfaces to connect different vendor products.

The following diagram illustrates how different content management tools like OpenText Documentum, IBM Content Manager, IBM FileNet Content Manager, Microsoft SharePoint, or any other JCR-compliant repository comply with the CMIS standard. A client application can seamlessly connect with all of them at the same time with no tool-specific code changes, regardless of how these tools store the content or metadata under the hood.

CMIS architecture

CMIS enables client applications to connect with multiple content management tools by defining a standard domain model, a standard set of services, and protocol bindings for web services (SOAP), the RESTful Atom Publishing Protocol (AtomPub), and JavaScript Object Notation (JSON).

CMIS adds a layer on top of existing enterprise content management (ECM) tools, exposing only the generic capabilities and not all of the tool's features.

Integration challenges

Integrating content management systems can involve a number of challenges, including:

  • The time and money required to integrate just two systems can be burdensome.
  • Many systems have proprietary features and methods for communicating and accessing the data stored within.
  • Integrations are not reusable.
  • Maintaining integration modules may not be easy and can be costly.
  • Content assets cannot be leveraged across the enterprise.

Why CMIS?

The purpose of CMIS is to define a domain model, comprising a data model and a set of abstract content management capabilities, together with a set of bindings. Applications use these bindings to work with one or more content management repositories, and repository vendors implement them to enable interoperability across repositories.

The CMIS standard exposes core/common ECM repository capabilities in an intentionally generic way. This allows you to construct applications that can work with content that resides in one or more ECM repositories, without having to understand implementation differences between the individual repositories or worry about interface inconsistencies between repositories.
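As a rough illustration of this generic view, the sketch below merges repository descriptions from two servers into one vendor-neutral list. It assumes each server has returned a service document in the JSON shape used by the browser binding (one object keyed by repository ID). The repository IDs, names, vendors, and URLs are made up, and `listRepositories` is a hypothetical helper, not part of any CMIS library.

```javascript
// Sample data shaped like browser-binding service documents: one JSON object
// per server, keyed by repository ID. All IDs, names, and URLs are made up.
const serviceDocuments = [
  {
    "repo-a": {
      repositoryId: "repo-a",
      repositoryName: "Contracts",
      vendorName: "Vendor A",
      rootFolderUrl: "https://ecm-a.example.com/cmis/browser/repo-a/root"
    }
  },
  {
    "repo-b": {
      repositoryId: "repo-b",
      repositoryName: "HR Records",
      vendorName: "Vendor B",
      rootFolderUrl: "https://ecm-b.example.com/cmis/browser/repo-b/root"
    }
  }
];

// Hypothetical helper: flatten every service document into one list that
// exposes the same fields regardless of which vendor produced it.
function listRepositories(docs) {
  return docs.flatMap((doc) =>
    Object.values(doc).map((info) => ({
      id: info.repositoryId,
      name: info.repositoryName,
      vendor: info.vendorName,
      rootFolderUrl: info.rootFolderUrl
    }))
  );
}

const repositories = listRepositories(serviceDocuments);
console.log(repositories.map((r) => `${r.name} (${r.vendor})`).join(", "));
```

The client code only reads standard fields such as `repositoryId` and `rootFolderUrl`, so nothing in it depends on which product sits behind each URL.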

Benefits of CMIS

CMIS offers a number of benefits, including:

  • No need for specialized skills proprietary to each ECM system -- Developers do not need specialized skills to write connectors for each ECM system. A single CMIS connector has the capability to talk to multiple systems at the back end.
  • Reduced development costs and schedules -- A common interface into repositories eliminates much of the integration work that has traditionally been required to create a unified view into disparate systems.
  • Reduced change impact -- A change in one system has less impact on all the other systems, and CMIS requires significantly less development to accommodate that change.
  • Leverage existing investments -- Existing legacy applications can be configured to communicate with middleware using CMIS, thus extending the life of the legacy system until it is migrated to a CMIS-compliant repository.
  • Easy enabling of new applications -- CMIS eases the effort required to achieve seamless integration, freeing business applications from the constraints of content that's locked in information silos or proprietary data formats. End users see optimized processes that combine content from multiple sources into a unified view.
  • Enhance content discovery -- In some environments, content applications can be static with little or no change to the configuration and available data over time. Other content applications are much more dynamic, requiring more frequent scanning of existing resources and rapid integration of new content systems. CMIS shines in these dynamic environments by reducing complexity, cost, and time to add new CMS repositories on the fly, and by making that content immediately available in a unified view.

How CMIS works

The CMIS object model provides a standard, unambiguous method for accessing repositories regardless of their computing platform or environment. The CMIS specification defines six common types of objects that can be addressed and managed in a CMIS-ready repository:

  • Document and folder objects are defined, as their names indicate, to classify objects stored as documents and folders, respectively, similar to the way many file systems and repositories define and manage content.
  • Relationship objects are used to define relationships between document objects (including links), but also support other means of grouping and relating objects in much the same way relationships are defined in relational databases.
  • Policy objects are used to manage policies for accessing and exposing content.
  • Item objects are extension points for repositories that need to expose other object types via CMIS but do not fit the definition for document, folder, relationship, or policy.
  • Secondary types define a set of properties that can be dynamically added to and removed from objects. That is, an object can gain or lose additional properties that are not defined by its primary type during its lifetime.
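The object types above correspond to the base type IDs in CMIS 1.1, which a client can read from an object's `cmis:baseTypeId` property. The following is a minimal sketch of how a client might classify an object. It assumes the properties arrive as a plain map of property ID to value; `describeObject`, the sample object, and the `custom:reviewable` secondary type are hypothetical.

```javascript
// Base type IDs defined by CMIS 1.1, mapped to readable labels.
// Note: cmis:secondary is a base for secondary types only; objects are not
// created with it as their primary type.
const BASE_TYPE_LABELS = {
  "cmis:document": "document",
  "cmis:folder": "folder",
  "cmis:relationship": "relationship",
  "cmis:policy": "policy",
  "cmis:item": "item",
  "cmis:secondary": "secondary type"
};

// Hypothetical helper: summarize an object from its property map.
// Assumes `properties` is a plain { propertyId: value } object.
function describeObject(properties) {
  const kind = BASE_TYPE_LABELS[properties["cmis:baseTypeId"]] || "unknown";
  const secondaryTypes = properties["cmis:secondaryObjectTypeIds"] || [];
  return { name: properties["cmis:name"], kind, secondaryTypes };
}

// Made-up sample object with one made-up secondary type attached.
const sample = {
  "cmis:name": "offer.pdf",
  "cmis:baseTypeId": "cmis:document",
  "cmis:secondaryObjectTypeIds": ["custom:reviewable"]
};

const summary = describeObject(sample);
console.log(`${summary.name} is a ${summary.kind}`);
```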

The CMIS standard defines web services, RESTful AtomPub, and JSON (browser) bindings that applications can use to work with one or more content management repositories.

Three access methods are defined in the current CMIS specification:

  • A web services interface (SOAP, described by WSDL)
  • An alternative Atom Publishing Protocol (AtomPub) method
  • JSON (Added in CMIS 1.1)

The web services, AtomPub, and JSON bindings defined in the CMIS specification are functionally equivalent and consistent; each protocol only defines the syntax for how methods are invoked and how responses are formatted. All of the same CMIS functionality is exposed regardless of which protocol is used.

Bindings and permissions

The CMIS standard has two concepts for permissions:

  • Basic permissions include read, write, and all. A user with read permissions can only view data from a repository. A user with write permissions can contribute to a repository, and a user with all permissions has full control.
  • Repository-specific permissions are defined by and expressed by a repository. These permissions do not have an explicitly defined meaning in the CMIS specification. They can be looked up at runtime using CMIS interfaces.

The CMIS specification maps allowable actions to the access control list (ACL) permissions defined for a repository. A client application can use CMIS to discover how the permissions exposed by the repository affect which actions a user can perform. Other factors can also affect whether a user can perform an action. For example, to check out an object from the repository, a user may need the appropriate permission, and the item itself must be in a state that allows checkout.
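The difference between holding a permission and being allowed to perform an action can be sketched as follows. The example assumes an object that was fetched together with its allowable actions, and an ACL in the JSON shape used by the browser binding. The helper names, the user `jdoe`, and the sample data are hypothetical, and treating `cmis:all` as covering every basic permission is a simplification.

```javascript
// Hypothetical helper: true only if the repository reports the action as
// currently allowed for this object (for example "canCheckOut").
function canPerform(cmisObject, action) {
  const actions = cmisObject.allowableActions || {};
  return actions[action] === true;
}

// Hypothetical helper: checks whether a principal holds a basic permission.
// Simplification: cmis:all is treated as including every basic permission.
function hasBasicPermission(acl, principalId, permission) {
  return (acl.aces || []).some(
    (ace) =>
      ace.principal.principalId === principalId &&
      (ace.permissions.includes(permission) || ace.permissions.includes("cmis:all"))
  );
}

// Made-up sample: jdoe may write, but the document cannot be checked out
// right now (for instance because another user already has it checked out).
const acl = {
  aces: [{ principal: { principalId: "jdoe" }, permissions: ["cmis:write"], isDirect: true }]
};
const documentObject = {
  allowableActions: { canGetProperties: true, canCheckOut: false }
};

const mayWrite = hasBasicPermission(acl, "jdoe", "cmis:write");
const mayCheckOut = canPerform(documentObject, "canCheckOut");
console.log(`write permission: ${mayWrite}, check out allowed: ${mayCheckOut}`);
```

Checking the allowable actions before enabling a button or menu entry lets the client stay independent of how each repository combines permissions with object state.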

Web services interface

The web services method provides a Web Services Definition Language (WSDL) description of the interface to allow developers to connect systems to it. SOAP/WSDL web services have been widely adopted for enterprise applications.

The following diagram illustrates how various enterprises connect to Alfresco, SharePoint, and FileNet repositories using CMIS web services interfaces. Enterprises do not need to know which repository they are connecting to or any other lower-level details about the repositories; rather, they just call standardized web service APIs using SOAP/WSDL for all their content management needs.

Web services interface

CMIS uses web services and Web 2.0 interfaces to enable rich information to be shared across internet protocols in vendor-neutral formats among document systems, publishers, and repositories within a single enterprise or between companies.

If you're new to web services, please refer to the Annexure section at the end of this article for a brief introduction to web services architecture.

Atom Publishing Protocol (AtomPub)

The Atom publishing method relies on the predefined Atom specifications and processes.

With this binding, the client requests the service document using a vendor-provided URI, and then chooses a CMIS collection and accesses the repository by following references in the returned documents. The extensible service document contains service/Atom collections, feeds, and entry documents. To learn more about AtomPub, please refer to the Annexure section.

The following diagram depicts how a client calls a CMIS service point interface (SPI) using the AtomPub binding, which in turn returns the repository-specific service point for further action. For example, if a document's content needs to be accessed, the client calls the CMIS SPI using the AtomPub binding, which then returns a service point (the document’s repository-specific Atom entry; for example, "WCM SP" in the diagram). This service point contains a link to the content. Hence, this is a two-step process using two HTTP calls. The OpenCMIS library can be used to cache these links to reduce repeated calls.

Atom Publishing Protocol
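The benefit of caching links in the two-step process can be sketched in a few lines. This is not how OpenCMIS implements its cache; it is a minimal illustration that assumes a function which performs the first HTTP call and returns a promise for the content link. `createContentLinkResolver`, the fake fetch function, and the URLs are hypothetical.

```javascript
// Hypothetical helper: wraps the "fetch the Atom entry and extract its
// content link" step so that it runs at most once per object ID.
function createContentLinkResolver(fetchEntryLinks) {
  const cache = new Map();
  return function resolveContentLink(objectId) {
    if (!cache.has(objectId)) {
      // Store the promise itself so concurrent callers share one request.
      cache.set(objectId, fetchEntryLinks(objectId));
    }
    return cache.get(objectId);
  };
}

// Stand-in for the first HTTP call; it only counts how often it is invoked.
let entryRequests = 0;
const fakeFetchEntryLinks = (objectId) => {
  entryRequests += 1;
  return Promise.resolve(`https://ecm.example.com/cmis/atom/content?id=${objectId}`);
};

const resolveContentLink = createContentLinkResolver(fakeFetchEntryLinks);
resolveContentLink("doc-1");
resolveContentLink("doc-1"); // served from the cache, no new entry request
resolveContentLink("doc-2");
console.log(entryRequests);
```

With the cache in place, only the second HTTP call (fetching the content itself) is repeated when the same document is opened again.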

JavaScript Object Notation (JSON)

JSON is a language-independent, open standard format that uses human-readable text to transmit data objects. It was derived from JavaScript, but as of 2017 many programming languages include code to generate and parse JSON-formatted data.

This JSON binding, which is specifically designed for browser-based applications, leverages a browser’s built-in capabilities and does not require a JavaScript library. It’s easy for developers to use, since it’s based on existing technologies like HTML, HTML forms, JavaScript, and JSON. It uses GET and POST verbs. Although this binding is primarily used with web browsers, it can be used with other application types as well.
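To make the GET and POST usage concrete, the sketch below builds a read request and a write request for the browser binding. It relies on the binding's `cmisselector` parameter for reads and its `cmisaction` form field for writes. The root folder URL and the helper names are assumptions for illustration, and no request is actually sent.

```javascript
// Hypothetical helper: build a GET URL that selects what to read.
function buildSelectorUrl(baseUrl, selector, params = {}) {
  const url = new URL(baseUrl);
  url.searchParams.set("cmisselector", selector);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Hypothetical helper: build the form fields for creating a folder.
// Properties are sent as indexed propertyId/propertyValue pairs.
function buildCreateFolderForm(name) {
  const form = new URLSearchParams();
  form.set("cmisaction", "createFolder");
  form.set("propertyId[0]", "cmis:objectTypeId");
  form.set("propertyValue[0]", "cmis:folder");
  form.set("propertyId[1]", "cmis:name");
  form.set("propertyValue[1]", name);
  return form;
}

// Made-up root folder URL; a real one comes from the repository's service document.
const rootFolderUrl = "https://ecm.example.com/cmis/browser/repo-a/root";

const listUrl = buildSelectorUrl(rootFolderUrl, "children", { maxItems: 10 });
const createForm = buildCreateFolderForm("Invoices");

console.log(listUrl);
console.log(createForm.get("cmisaction"), createForm.get("propertyValue[1]"));
// To send the write request, something like
// fetch(rootFolderUrl, { method: "POST", body: createForm }) could be used.
```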

To learn more about JSON, please refer to the Annexure section.

Example CMIS implementation using web services

The CMIS connector for Microsoft SharePoint Server 2013 enables SharePoint users to interact with content that is stored in any repository that has implemented the CMIS standard. The connector also makes SharePoint Server 2013 content available to any application that has implemented the CMIS standard. CMIS is designed to support scenarios that enterprises commonly encounter when managing content across multiple content management systems in rich, hybrid environments, including:

  • Data migration to and from content management systems in an enterprise
  • Graphical user interfaces (GUIs) in apps that read data from multiple content repositories
  • A SharePoint web part that uses CMIS to roll up personnel data from multiple legacy content management systems within an enterprise
  • A mobile application that can access documents from any ECM system
  • A photo-editing application that saves files to a CMIS repository with ECM features enabled, such as the ability to check in and check out files
  • A line-of-business (LOB) system that exports report data to an ECM repository
  • A contract-approval app that uses SharePoint user interface (UI) elements to manage a central approval process while still enabling the contract to be published to several different systems

Example CMIS implementation using Atom

IBM CMIS for FileNet Content Manager is the implementation of the OASIS CMIS standard for IBM FileNet Content Manager. It uses the Atom/REST method for exposing the IBM content management repository.

The following diagram shows how IBM CMIS for FileNet Content Manager connects to your IBM FileNet P8 system and client application.

Example CMIS implementation using Atom

The CMIS web application is packaged as a web application archive (WAR) file that can be deployed in WebSphere to support the REST services defined in the CMIS specification. At run time, the CMIS web application translates these services into calls to the IBM FileNet and IBM Content Manager Java APIs, which access the native repository.

It is recommended that you deploy IBM CMIS for FileNet Content Manager to a dedicated application server. In addition, it is recommended that you dedicate a server in your IBM FileNet P8 system to IBM CMIS for FileNet Content Manager. IBM CMIS for FileNet Content Manager communicates with the content engine by using the Content Engine Client Java API. After you install IBM CMIS for FileNet Content Manager, you must install the Content Engine Client on the machine where IBM CMIS for FileNet Content Manager will be deployed and install the Content Engine Client Java API files in the IBM CMIS for FileNet Content Manager installation directory.

Also, IBM Content Navigator 2.0.3 uses CMIS to connect with various CMIS-compliant repositories (such as IBM FileNet P8 or IBM Content Manager On Demand).

IBM Content Navigator 2.0.3 connects with CMIS-compliant repositories

Example CMIS implementation using JSON

Alfresco fully implements both the CMIS 1.0 and 1.1 standards to allow your application to manage content and metadata in an Alfresco repository or the Alfresco cloud.

For existing CMIS 1.0 applications, the Alfresco OpenCMIS Extension extends OpenCMIS to provide support for Alfresco aspects.

CMIS 1.1 introduces a number of new concepts that are supported by Alfresco. You can now use the new browser binding (JSON) to simplify flows for web applications, use Alfresco aspects, and use the append data support to manage large content items.

  • Browser binding -- In addition to the existing XML-based AtomPub and web services bindings, CMIS 1.1 provides a simpler JSON-based binding. The browser binding is designed for web applications and is easy to use with just HTML and JavaScript. It uses just two verbs, GET and POST, and resources are referenced using simple and predictable URLs.
  • Using aspects -- Alfresco aspects are exposed as secondary types in CMIS 1.1. You can dynamically add aspects to an Alfresco object using the API.
  • Appending content -- In some applications, such as journaling or when using very large files, you want to upload a file in chunks. You might have large files that time out during an upload or fail because of a bad connection. You can use the CMIS 1.1 append parameter in these situations.
  • cmis:item support -- You can use cmis:item to query some Alfresco object types and your own custom types that are outside the CMIS definitions of document, folder, relationship, or policy.
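The chunked upload described above can be planned on the client before any request is sent. The following sketch splits an upload into chunks and prepares the form fields for each append call, using the browser binding's `appendContent` action and its `isLastChunk` parameter. It assumes the target document already exists and that every chunk is appended to it; `planChunkUploads` and the sizes are hypothetical.

```javascript
// Hypothetical helper: describe each chunk of an upload and the form fields
// that would accompany it. The actual content bytes and HTTP calls are omitted.
function planChunkUploads(totalBytes, chunkSize) {
  if (!Number.isInteger(totalBytes) || totalBytes <= 0) {
    throw new RangeError("totalBytes must be a positive integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const plan = [];
  for (let offset = 0; offset < totalBytes; offset += chunkSize) {
    const length = Math.min(chunkSize, totalBytes - offset);
    plan.push({
      offset,
      length,
      fields: {
        cmisaction: "appendContent",
        // Only the final chunk tells the repository that the upload is complete.
        isLastChunk: String(offset + length >= totalBytes)
      }
    });
  }
  return plan;
}

const plan = planChunkUploads(2500, 1000);
for (const chunk of plan) {
  console.log(chunk.offset, chunk.length, chunk.fields.isLastChunk);
}
```

If a chunk fails because of a dropped connection, only that chunk needs to be retried rather than the whole file.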

ECM systems / vendors that support CMIS

The following table shows the ECM systems that support CMIS:

Vendor | Product | Comments
Alfresco | Alfresco 3.3+ |
EMC | Documentum 6.7/7.0 |
IBM | Content Manager 8.4.3+ |
IBM | FileNet P8 5.0+ |
KnowledgeTree | KnowledgeTree 3.7+ |
Microsoft | SharePoint Server 2010/2013 (not available in Foundation version) | CMIS 1.0 is supported out-of-the-box in SharePoint Server 2013; it requires installation of the Administration Toolkit in SharePoint Server 2010 (not available in Foundation version)
Nuxeo | Nuxeo DMS 5.5+ |
OpenText | OpenText | Web services for OpenText eDOCS DM/RM is based on CMIS using the CMIS-Adaptor.
Oracle | Oracle WebCenter Content | Content Management REST Service Developer's Guide
SAP | SAP HANA Cloud Document Service | SAP HANA Cloud Platform Documentation

Conclusion

Integrating disparate content management repositories allows content to be accessed as needed, and even blended together in new and interesting ways. This article has offered you an overview of CMIS, why it is needed, its benefits, and example implementations using FileNet, SharePoint, and Alfresco. Although there is a long list of ECM tools that comply with CMIS, this article covered the most commonly used tools to make it easy to understand within the context of real-world situations.

I encourage you to implement new client applications and modernize existing ones to make them CMIS compliant, even if there is no immediate need for integrations with other tools. This can help reduce your organization’s technical debt and make it future ready. It can also make it easier for your organization to change vendor products or try new products quickly in plug-and-play fashion, as and when needed, while also saving the cost of new application development.

Annexure

More about web services

Web services can convert your application into a web application, which can publish its function or message to the rest of the world. The basic web services platform is XML + HTTP. The following diagram describes the web services architecture:

Web services architecture

The web services architecture is defined by the following characteristics:

  • Service processes -- In a web services context, "processes" means how multiple services (part of a single solution or of multiple solutions) collaborate and coordinate with each other. For example, discovery allows you to locate one particular service from among a collection of web services. Aggregation combines the results of multiple services before sending them back to the original requestor. Choreography describes the interactions between multiple services (ways of exchanging messages between microservices whenever something happens).
  • Service description -- One of the most interesting features of web services is that they are self-describing. This means that once you've located a web service, you can ask it to "describe itself" and tell you what operations it supports and how to invoke them. This is handled by WSDL.
  • Service invocation -- Invoking a web service (and, in general, any kind of distributed service such as a CORBA object or an Enterprise Java Bean) involves passing messages between the client and the server. SOAP specifies how you should format requests to the server, and how the server should format its responses. In theory, you could use other service invocation languages (such as XML-RPC, or even some ad hoc XML language), however SOAP is by far the most popular choice for web services.
  • Transport -- Finally, all of these messages must be transmitted somehow between the server and the client. The protocol of choice for this part of the architecture is HyperText Transfer Protocol (HTTP), the same protocol that's used to access conventional web pages on the internet. Again, you can use other protocols, but HTTP is currently the most popular.
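The invocation and transport layers can be illustrated together: the message is a SOAP envelope, and it travels as the body of an HTTP POST. The sketch below builds such a request without sending it. It uses SOAP 1.1 conventions; the endpoint URL and repository ID are made up, `buildSoapRequest` is a hypothetical helper, and the operation name and namespace in the usage lines reflect my understanding of the CMIS web services binding rather than a verified WSDL. Authentication headers are omitted.

```javascript
const SOAP_ENVELOPE_NS = "http://schemas.xmlsoap.org/soap/envelope/";

// Escape characters that are not allowed in XML text or attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap an operation and its parameters in a SOAP 1.1
// envelope (invocation) and describe the HTTP POST that carries it (transport).
function buildSoapRequest(endpoint, namespace, operation, params = {}) {
  const children = Object.entries(params)
    .map(([name, value]) => `<m:${name}>${escapeXml(value)}</m:${name}>`)
    .join("");
  const body =
    `<soapenv:Envelope xmlns:soapenv="${SOAP_ENVELOPE_NS}">` +
    `<soapenv:Body>` +
    `<m:${operation} xmlns:m="${escapeXml(namespace)}">${children}</m:${operation}>` +
    `</soapenv:Body>` +
    `</soapenv:Envelope>`;
  return {
    url: endpoint,
    method: "POST",
    headers: { "Content-Type": "text/xml; charset=utf-8", SOAPAction: '""' },
    body
  };
}

// Usage with assumed values: in practice the operation name and namespace
// come from the service's WSDL (the description layer).
const request = buildSoapRequest(
  "https://ecm.example.com/cmis/services/RepositoryService",
  "http://docs.oasis-open.org/ns/cmis/messaging/200908/",
  "getRepositoryInfo",
  { repositoryId: "repo-a" }
);
console.log(request.method, request.url);
console.log(request.body);
```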

More about AtomPub

Atom is an XML-based document format that describes lists of related information known as feeds. Feeds are composed of a number of items, known as entries, each with an extensible set of attached metadata. For example, each entry has a title. The primary use case that Atom addresses is the syndication of web content such as weblogs and news headlines to web sites, as well as directly to user agents.

AtomPub is an application-level protocol for publishing and editing web resources. The protocol is based on HTTP transfer of Atom-formatted representations. The Atom format is documented in the Atom Syndication Format. The protocol supports the creation of web resources and provides facilities for:

  • Collections -- Sets of resources that can be retrieved in whole or in part
  • Services -- Discovery and description of collections
  • Editing -- Creating, editing, and deleting resources

Central to AtomPub is the concept of collections of editable resources that are represented by Atom 1.0 feed and entry documents. A collection has a unique URI. Issuing an HTTP GET request to that URI returns an Atom feed document. To create new entries in that feed, clients send HTTP POST requests to the collection's URI. Those newly created entries are then assigned their own unique edit URI. To modify those entries, the client simply retrieves the resource from the collection, makes its modifications, and then puts it back. Removing the entry from the feed is a simple matter of issuing an HTTP DELETE request to the appropriate edit URI. All operations are performed using simple HTTP requests and can usually be performed with nothing more than a simple text editor and a command prompt, as illustrated in the following diagram.

Atom Publishing Protocol (AtomPub)
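The mapping between collection operations and HTTP requests described above can be written down as a small lookup. The sketch below only describes the requests and does not send them. The helper name, the URIs, and the sample entry are hypothetical.

```javascript
// Media type for a single Atom entry document.
const ATOM_ENTRY_TYPE = "application/atom+xml;type=entry";

// Hypothetical helper: translate a logical operation into the HTTP request
// that AtomPub uses for it. `uris.collection` is the collection URI and
// `uris.edit` is the edit URI assigned to an existing entry.
function atomPubRequest(operation, uris, entryXml) {
  switch (operation) {
    case "list":
      return { method: "GET", url: uris.collection };
    case "create":
      return {
        method: "POST",
        url: uris.collection,
        headers: { "Content-Type": ATOM_ENTRY_TYPE },
        body: entryXml
      };
    case "update":
      return {
        method: "PUT",
        url: uris.edit,
        headers: { "Content-Type": ATOM_ENTRY_TYPE },
        body: entryXml
      };
    case "delete":
      return { method: "DELETE", url: uris.edit };
    default:
      throw new Error(`Unsupported operation: ${operation}`);
  }
}

// Made-up URIs and a minimal made-up entry.
const uris = {
  collection: "https://ecm.example.com/atom/collection",
  edit: "https://ecm.example.com/atom/collection/entry-1"
};
const entry =
  '<entry xmlns="http://www.w3.org/2005/Atom"><title>Quarterly report</title></entry>';

for (const operation of ["list", "create", "update", "delete"]) {
  const req = atomPubRequest(operation, uris, entry);
  console.log(operation, req.method, req.url);
}
```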

More about JSON

JSON is a lightweight data-interchange format that is self-describing, easy to understand, and language independent. It is easy for humans to read and write, and easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999.

JSON is text, and you can convert any JavaScript object into JSON and send it to the server. You can also convert any JSON received from the server into JavaScript objects. That way, you can work with the data as JavaScript objects with no complicated parsing or translations.

JavaScript has a built-in function for converting a string that's written in JSON format into native JavaScript objects:

JSON.parse()

So, if you receive data from a server in JSON format, you can use it like any other JavaScript object.

Here are some examples:

Sending data:

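// Serialize the object, then URL-encode it before adding it to the query string.
const myObj = { "name": "Mayur", "age": 36, "city": "Gurgaon" };
const myJSON = JSON.stringify(myObj);
window.location = "demo_json.php?x=" + encodeURIComponent(myJSON);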

Receiving data:

// Parse the JSON text received from the server into a JavaScript object.
const myJSON = '{ "name":"Mayur", "age":36, "city":"Gurgaon" }';
const myObj = JSON.parse(myJSON);
document.getElementById("demo").textContent = myObj.name;

Storing data:

// Storing data: localStorage holds strings, so serialize the object first.
const myObj = { "name": "John", "age": 31, "city": "New York" };
const myJSON = JSON.stringify(myObj);
localStorage.setItem("testJSON", myJSON);

// Retrieving data: read the string back and parse it into an object.
const text = localStorage.getItem("testJSON");
const obj = JSON.parse(text);
document.getElementById("demo").textContent = obj.name;