Corporate blog of SCHEMA GmbH

Intelligent information with machine learning and the SCHEMA Content Delivery Server


Intelligent information is currently a major talking point at many companies that want to deliver content to customers or employees efficiently and in a form that fits the situation, e.g. through a content delivery portal such as the SCHEMA Content Delivery Server. However, enriching content with metadata often involves a great deal of manual work. This article shows how that step can be automated and integrated seamlessly with the aid of machine learning and WebHooks.

Intelligent information and efficient content delivery

Generally, “intelligent information” refers to modularised units of text enriched with (classifying) metadata. The metadata opens up new ways of accessing content that are independent of the document it originally belonged to. This matters because, with the growing popularity of smartphones and tablets, content is increasingly presented as compact, self-contained modules rather than as traditional documents. At the same time, users increasingly expect information that is tailored to them and sensitive to their context.

Information is filtered in a content delivery portal (CDP), either automatically on the basis of the user's profile or manually, when the user selects specific facets that narrow down a search request. This approach, which is also used by large online shopping portals, has become an intuitive way to navigate large amounts of content.
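Facet filtering of this kind can be sketched in a few lines of Python. This is only an illustration of the principle, not how any particular CDP implements it; the `Module` class, the facet names and the sample content are all invented for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Module:
    """A self-contained content module with classifying metadata."""
    title: str
    facets: dict = field(default_factory=dict)  # facet name -> set of values


def filter_modules(modules, selected):
    """Return the modules that carry every selected facet value."""
    return [
        m for m in modules
        if all(value in m.facets.get(name, set()) for name, value in selected.items())
    ]


def facet_counts(modules, name):
    """Count how many modules carry each value of one facet."""
    return Counter(value for m in modules for value in m.facets.get(name, set()))


# Hypothetical sample content; facet names and values are invented.
modules = [
    Module("Replace the filter", {"information": {"task"}, "component": {"filter"}}),
    Module("Filter specifications", {"information": {"reference"}, "component": {"filter"}}),
    Module("Start the pump", {"information": {"task"}, "component": {"pump"}}),
]

# Each additional facet narrows the result set further.
hits = filter_modules(modules, {"information": "task", "component": "filter"})
print([m.title for m in hits])
```

The counts returned by `facet_counts` correspond to the numbers that shopping portals typically show next to each filter option.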

[Figure: Filtering of intelligent information (Jan Oevermann / ICMS GmbH)]

This tailored filtering of information is based on classifications, e.g. PI-Class®, a method developed by Prof. Wolfgang Ziegler for classifying modules. PI classifications are defined as taxonomies and can be used independently of any particular system. Intrinsic classifications unambiguously categorise the type of information within the content (information class) and link it to the product components described (product class). Extrinsic classifications extend this approach with the intended use of the content, including reuse, for product models and document types. All of these classification properties can be offered as filter facets in a CDP.
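As a rough sketch, the four classification dimensions named above could be held in a small data structure and turned into facets. The class and field names are assumptions made for this example and are not official PI-Class® notation; the values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PIClassification:
    """Hypothetical container for the four dimensions described in the text."""
    # Intrinsic: what kind of information the module holds and what it describes.
    information_class: str
    product_class: str
    # Extrinsic: where the module is intended to be used (may be several).
    product_models: tuple = ()
    document_types: tuple = ()

    def as_facets(self):
        """Expose every dimension as a facet name with a set of values."""
        return {
            "information_class": {self.information_class},
            "product_class": {self.product_class},
            "product_model": set(self.product_models),
            "document_type": set(self.document_types),
        }


# Invented example values.
example = PIClassification(
    information_class="task",
    product_class="filter",
    product_models=("Model A", "Model B"),
    document_types=("operating manual",),
)
print(example.as_facets())
```

Keeping the intrinsic fields single-valued and the extrinsic fields multi-valued mirrors the distinction in the text: a module has one information type, but it can be reused across several product models and document types.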

Automatic classification through machine learning

Machine learning refers in general to methods that generate new knowledge from experience. Training data is used to identify patterns and regularities, which can then be applied to data the system has not seen before (transfer of learning). If the expected results for the training data are supplied to the system during the learning phase, this is called “supervised learning”; automatic classification falls into this category. Machine learning is a subfield of artificial intelligence (AI).

Since most of the content in technical documentation is still text-based, automatic text classification is of particular interest. Methods tailored to automatically assigning intrinsic PI classifications to modules from the field of technical communication are the subject of current research, but they can already be used today, for example to support editors in preparing documents. One tool suitable for this purpose is the “fastclass” software, which specialises in automatically assigning classifications to content from technical documentation.
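To make the idea of supervised text classification concrete, here is a minimal sketch using a multinomial naive Bayes classifier built from the Python standard library. This is a generic textbook technique chosen for brevity; it is not the method used by fastclass, and the training texts and labels are invented.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()

    def fit(self, texts, labels):
        # Learning phase: the expected label is supplied with each text.
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)
        return self

    def predict(self, text):
        # Apply the learned word statistics to a text not seen in training.
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # words never seen in training carry no evidence
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training modules, labelled with a hypothetical information class.
texts = [
    "Unscrew the cover and remove the filter.",
    "Press the button to start the pump.",
    "The pump weighs 12 kg and requires 230 V.",
    "Technical data: filter diameter 40 mm.",
]
labels = ["task", "task", "reference", "reference"]

classifier = NaiveBayesTextClassifier().fit(texts, labels)
print(classifier.predict("Remove the cover and press the start button."))
```

A real training set would contain many more modules per class; four sentences are only enough to show the mechanics of learning from labelled examples and applying the result to unseen text.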

Seamless integration into the SCHEMA CDS using WebHooks

[Figure: Schematic process of automatically generating intelligent information with a WebHook (Jan Oevermann / ICMS GmbH)]

To make the import process feel like “magic” to the user, WebHooks are used as a form of machine-to-machine communication. In general, server A notifies server B that a specific event has occurred, and server B can then trigger an action. A WebHook of this kind is registered in SCHEMA CDS for the “Package uploaded” event. When the event occurs, a connector is called, which triggers the automatic classification of the uploaded content and controls the process (see Figure). All the user sees is that, after the import, the uploaded content can be filtered by facets.
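The receiving side of such a WebHook, i.e. the connector, could look roughly like the following sketch. It uses only the Python standard library. The JSON payload shape (`event`, `packageId`), the port and the `classify_package` placeholder are assumptions made for illustration; the actual SCHEMA CDS payload and API are not described in this article.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def classify_package(package_id):
    """Placeholder for the classification step (hypothetical)."""
    return {"package": package_id, "status": "classified"}


def handle_event(payload):
    """Decide what to do with one notification; returns (HTTP status, body)."""
    if not isinstance(payload, dict):
        return 400, {"error": "expected a JSON object"}
    # React only to the event the WebHook was registered for.
    if payload.get("event") != "Package uploaded":
        return 200, {"status": "ignored"}
    package_id = payload.get("packageId")  # assumed field name
    if package_id is None:
        return 400, {"error": "missing packageId"}
    return 200, classify_package(package_id)


class WebHookHandler(BaseHTTPRequestHandler):
    """Accepts the POST request that the sending server issues on an event."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            status, body = 400, {"error": "invalid JSON"}
        else:
            status, body = handle_event(payload)
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port=8080):
    """Create (but do not start) the connector's HTTP server."""
    return HTTPServer(("127.0.0.1", port), WebHookHandler)
```

Calling `make_server().serve_forever()` would start the connector. Keeping the decision logic in `handle_event`, separate from the HTTP handling, means it can be tested without opening a network port.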

In this way, machine learning has turned “basic” content into intelligent information.


Jan Oevermann has a degree in Technical Editing as well as Communication and Media Management from the Karlsruhe University of Applied Sciences. He also has a doctorate in machine learning from the University of Bremen. He works as a consultant at ICMS and is a member of the tekom-AG Information 4.0 group.
