Introduction to Streaming Telecommunications Event Data Analytics (TEDA)

Mon August 17, 2020 01:04 PM

Written by Michael Kotowski.

A mobile network consists of hundreds of network elements that generate call detail records or event data for each phone call, text message, internet activity, or even a simple walk from one location to another. The number of records per day varies between millions and billions. In many cases, the network elements are from different vendors. The typical interface to access these records is file-based.

As a telecommunications provider you want to analyze these records to run campaigns, improve user experience, improve your mobile network or call center, or to detect fraud. You need a system that unifies your source data, runs near real-time analytics, and scales with your increasing network traffic.

Many other industries have similar requirements and challenges.

Telecommunications Event Data Analytics accelerates the development of applications in these challenging fields.

In this article, we explain the purpose of Telecommunications Event Data Analytics, which is to build mediation applications that can process mass data and run near real-time analytics. We also provide a quick introduction to the architecture and features. The example we use derives from a project in the Telecommunications industry and is available as a sample application, but Telecommunications Event Data Analytics is not limited to this industry.

What is Mediation? What else can you do?

Telecommunications mediation is a process that converts call data to pre-defined layout that can be imported by a specific billing system or other OSS (Operations Support System) applications.

Telecommunications mediation – Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Telecommunications_mediation

This definition describes mediation in a generic way. Using the example from the Telecommunications industry, mediation can be translated into the following use case and actions in the application:

Use case:

  • As a Telecommunications operator, I want to store information subsets from call detail records and events that I receive from different network elements in different formats and versions, into a data warehouse (or Hadoop system), so business analysts and data scientists can easily work with the unified information.

Actions in the application:

  • Get the call detail records and events from many network elements, from different network equipment vendors, in different versions and formats.
  • Validate the call detail records, unify their formats, and apply your business logic. For example, enrich the call detail records with customer or network-specific information like contract IDs, or address information for the subscriber’s current location based on the network cell.
  • Store the transformed and unified call detail records in a database system, for example, in one or more database tables.
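
The actions above can be sketched in a few lines of Python. This is only an illustration of the idea, not TEDA code (TEDA applications are built from SPL operators). The two vendor layouts, the field names, and the cell-to-address table are all hypothetical, and the input is assumed to be CSV text.

```python
import csv
import io

# Hypothetical per-vendor mappings: common-schema field -> source column name.
VENDOR_MAPPINGS = {
    "vendor_a": {"caller": "a_number", "cell_id": "cell", "duration_s": "dur"},
    "vendor_b": {"caller": "msisdn", "cell_id": "cellId", "duration_s": "seconds"},
}

# Hypothetical enrichment data: network cell -> address of the cell site.
CELL_ADDRESSES = {"C100": "Main St 1", "C200": "Harbor Rd 7"}


def unify(vendor, row):
    """Map one vendor-specific row to the common schema, or return None if invalid."""
    mapping = VENDOR_MAPPINGS[vendor]
    try:
        record = {
            "caller": row[mapping["caller"]],
            "cell_id": row[mapping["cell_id"]],
            "duration_s": int(row[mapping["duration_s"]]),
        }
    except (KeyError, TypeError, ValueError):
        return None  # missing column, short row, or non-numeric duration
    if not record["caller"] or record["duration_s"] < 0:
        return None
    return record


def enrich(record):
    """Add the cell address; unknown cells get an empty address."""
    return {**record, "cell_address": CELL_ADDRESSES.get(record["cell_id"], "")}


def mediate(vendor, csv_text):
    """Return (accepted_records, rejected_count) for one input file given as CSV text."""
    accepted, rejected = [], 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = unify(vendor, row)
        if record is None:
            rejected += 1
        else:
            accepted.append(enrich(record))
    return accepted, rejected
```

The accepted records all share one schema regardless of the vendor, so a later step could write them to output files or database tables without knowing where they came from.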

This use case and these actions are typical mediation topics. As a Telecommunications operator, however, you might also want to run near real-time analytics in addition to the mediation, for example:

  • As a Telecommunications operator, I want to detect dropped calls per cell and per subscriber in near real-time, so I can predict outages, improve my network, and let my customer care center proactively contact customers who experienced dropped calls and make them special offers.
  • As a Telecommunications operator, I want to detect fraud in near real-time to increase revenue.
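
To make the dropped-call use case more concrete, here is a minimal sketch of a sliding-window counter in Python. It is not part of TEDA. The class name, the window length, and the alert threshold are hypothetical, and the sketch assumes that records arrive in timestamp order.

```python
from collections import defaultdict, deque


class DroppedCallMonitor:
    """Counts dropped calls per cell within a sliding time window."""

    def __init__(self, window_s=300, threshold=3):
        self.window_s = window_s      # hypothetical window length in seconds
        self.threshold = threshold    # hypothetical alert threshold
        self._drops = defaultdict(deque)  # cell_id -> timestamps of recent drops

    def observe(self, cell_id, timestamp, dropped):
        """Feed one call record; return True if the cell reaches the threshold."""
        window = self._drops[cell_id]
        if dropped:
            window.append(timestamp)
        # Evict drops that have fallen out of the window.
        while window and window[0] <= timestamp - self.window_s:
            window.popleft()
        return len(window) >= self.threshold
```

The same counter could be keyed by subscriber instead of by cell to cover the per-subscriber part of the use case.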

What is Telecommunications Event Data Analytics and why does it help you?

Telecommunications Event Data Analytics (TEDA) provides an application framework that consists of Streams application templates and supporting tools. You use the application framework to quickly set up mediation applications that can handle billions of call detail records (CDRs) per day and support near real-time analytics. You configure these applications with configuration parameters that are stored in a simple text file, and customize them by adding your business logic or near real-time analytics to already prepared SPL composite operators.

The applications support the following features that solve typical requirements or problems.

  • TEDA applications are decoupled from the external source and sink systems.
    Source systems are the network elements that provide the data to be processed. Sink systems are database systems like DB2 or Hadoop, or other systems that process the validated, enriched, and unified outputs. A source or sink system that is broken or in maintenance mode cannot provide source data or accept output. Typically, you do not want to stop the whole process because a single source system no longer provides data, and you still want to process the source data while the database system is in maintenance mode. Therefore, TEDA applications are decoupled from the source and sink systems: they expect the source data in files that appear in a landing zone in the local or a shared file system, and they write the validated, enriched, and unified data to files in the local or a shared file system, so a sink system can pick them up as needed.
  • TEDA applications can be easily extended to read, decode, and parse many different input formats.
    As mentioned in the Telecommunications example, an operator typically manages many network elements from different vendors in different versions. The network elements provide the same or similar data sets, which must be unified because the downstream systems require a common schema. TEDA supports three common formats out of the box: CSV, ASN.1, and fixed-size structures. You only need to specify the common schema and the mapping between your source formats and the common schema. You can also integrate any other format for which a parser or decoder exists as an SPL operator.
  • TEDA applications support the following mediation-related features: file and record deduplication, validation, filtering, transformation and enrichment, aggregation and correlation.
    Out of the box, TEDA supports configurable file and record deduplication to ensure that input files and data records are processed only once. Validation, filtering, transformation and enrichment, aggregation, and correlation typically need customization. TEDA applications already provide default implementations that you can use as-is or override. The framework also helps with customization: for example, you can use a configurable operator to enrich the data records, or you can use the existing output streams for rejected (invalid) files and records or for statistics.
  • TEDA applications are tuned for performance and scale.
    TEDA is optimized for a good ratio between CPU consumption and throughput, which means a small number of multi-threaded processing elements and little inter-host communication to reduce the communication overhead. The number of files that are processed in parallel is configurable. If an existing TEDA application can no longer handle increasing traffic, you simply reconfigure the degree of parallelism.
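
As an illustration of the deduplication feature in the list above, the following Python sketch shows the general idea of processing each file and each record only once. It is not TEDA's implementation, whose internals are not described here. The class, the choice of key fields, and the unbounded in-memory sets are assumptions made for brevity.

```python
import hashlib


class Deduplicator:
    """Remembers file names and record fingerprints that were already processed."""

    def __init__(self):
        self._seen_files = set()
        self._seen_records = set()

    def is_new_file(self, filename):
        """Return True the first time a file name is seen, False afterward."""
        if filename in self._seen_files:
            return False
        self._seen_files.add(filename)
        return True

    def is_new_record(self, record, key_fields):
        """Fingerprint the chosen key fields; return False for a repeat."""
        raw = "|".join(str(record[f]) for f in key_fields)
        fingerprint = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        if fingerprint in self._seen_records:
            return False
        self._seen_records.add(fingerprint)
        return True
```

Which fields identify a record is a business decision. In this sketch, two records with the same caller and start time would count as duplicates if you pass `["caller", "start"]` as the key fields, even if other fields differ.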

These and more features are available out of the box. When you develop mediation applications with TEDA, you can focus on customization and on implementing your business logic, which significantly reduces your development effort.

Quick introduction to the architecture

Telecommunications Event Data Analytics (TEDA) and its application framework provide two application types: (1) the Ingest Transform Enrich (ITE) application, and (2) the Lookup Manager (LM) application.

The LM application is required only if you use external data, for example, from a CRM system or data warehouse, to enrich your tuples. The LM application loads and updates the enrichment data in an in-memory key-value store and ensures that the data is distributed across all hosts. The application also stops the connected ITE applications before it updates the enrichment data, and resumes them afterward. This procedure ensures that all tuples of an input file are processed with the same enrichment data.
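
The stop, update, and resume procedure can be pictured with a small Python sketch. TEDA coordinates separate Streams applications, so this is only an analogy that uses threads in one process. The class and method names are hypothetical, and a single updater is assumed.

```python
import threading


class EnrichmentStore:
    """Key-value store whose updates wait for in-flight files and block new ones."""

    def __init__(self, data):
        self._data = dict(data)
        self._cond = threading.Condition()
        self._active_files = 0
        self._paused = False

    def begin_file(self):
        """Called by a worker before it starts a file; blocks during an update."""
        with self._cond:
            while self._paused:
                self._cond.wait()
            self._active_files += 1

    def end_file(self):
        """Called by a worker after the last tuple of a file."""
        with self._cond:
            self._active_files -= 1
            self._cond.notify_all()

    def lookup(self, key, default=None):
        """Read one enrichment value from the current data."""
        return self._data.get(key, default)

    def update(self, new_data):
        """Pause workers, wait for in-flight files, swap the data, resume."""
        with self._cond:
            self._paused = True
            while self._active_files > 0:
                self._cond.wait()
            self._data = dict(new_data)
            self._paused = False
            self._cond.notify_all()
```

Because the data is swapped only while no file is in progress, every lookup between `begin_file()` and `end_file()` sees the same enrichment data.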

The ITE application is the worker. It processes the input files, including parsing, validating, transforming, enriching, and correlating records. The output of an ITE application is typically a set of files for each input file. The application does not write to a database directly, to avoid back-pressure or failures when, for example, the database is unavailable because of maintenance or network problems. You can use the database loader application, which is available on GitHub, to transfer the generated ITE output files to a database.
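
When two systems are decoupled through files, the consumer should never see a half-written file. One common way to achieve this, shown here as a Python sketch, is to write to a temporary name and then rename the file into place. Whether TEDA uses this technique is an assumption. The function name and the JSON-lines output format are hypothetical.

```python
import json
import os
import tempfile


def write_output_file(out_dir, name, records):
    """Write records as JSON lines, then publish the file with an atomic rename."""
    os.makedirs(out_dir, exist_ok=True)
    # The temporary file lives in the same directory, so the rename stays
    # within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            for record in records:
                f.write(json.dumps(record) + "\n")
        final_path = os.path.join(out_dir, name)
        os.replace(tmp_path, final_path)  # atomic within one file system
    except BaseException:
        os.unlink(tmp_path)  # remove the partial file before re-raising
        raise
    return final_path
```

A loader that ignores names ending in `.tmp` would then only ever pick up complete files, even if it scans the directory while the writer is still running.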

The ITE and LM applications use a shared folder to exchange status information and to trigger actions in the other application. For more information about the architecture, see the IBM Knowledge Center: Reference > Toolkits > SPL standard and specialized toolkits > com.ibm.streams.teda > Application framework.

Architecture of a Telecommunications Event Data Analytics solution

Conclusion

When you use IBM Streams and its Telecommunications Event Data Analytics (TEDA) application framework to develop your specific solution, you can focus on your core competencies: your data and the business logic to apply to it. All the groundwork of a resilient, scalable, high-performance application is already delivered to you. Therefore, the typical effort to develop mediation applications with TEDA is much lower than the effort to develop a comparable application from scratch.

Get started with the toolkit

Read the article about getting started with the TEDA toolkit to run an application and learn more.

Documentation

See the reference documentation of the com.ibm.streams.teda toolkit for more details.

