In this presentation from the 2016 Swiss Data Conference, Romeo Kienzler introduces DeepLearning4J and explains how it can be parallelized on Apache Spark.

In this video:

Romeo begins by listing the components of DeepLearning4J: a Java library, the DataVec tool, and the ND4J and ND4S libraries. He describes what the latter two support and how they are implemented. Next, he explains the requirements for parallelization.

With the general information out of the way, Romeo makes some preliminary observations regarding his data (which he generated himself; see “Resources” below) and a couple of papers that have inspired him (see “Resources”). Next, he moves on to his demonstration.

Romeo’s demo is of an autoencoder, a neural network that compresses its inputs into a lower-dimensional representation and then reconstructs them. The compression creates a bottleneck in the data, which filters out noise.
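To make the bottleneck idea concrete, here is a minimal sketch in plain Python. It is not Romeo’s code and does not use DeepLearning4J; the toy data, the function names (`make_data`, `train`, and so on), and the hyperparameters are all assumptions chosen for illustration. It trains a linear autoencoder that squeezes noisy 2-D points through a single number and reconstructs them.

```python
import random

def make_data(n, seed=0):
    """Hypothetical toy data: noisy 2-D points scattered around the line y = 2x."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.uniform(-1.0, 1.0)
        data.append((t + rng.gauss(0, 0.05), 2.0 * t + rng.gauss(0, 0.05)))
    return data

def reconstruct(x, enc, dec):
    """Encode a 2-D point into one number (the bottleneck), then decode it."""
    code = enc[0] * x[0] + enc[1] * x[1]
    return (dec[0] * code, dec[1] * code)

def mean_squared_error(data, enc, dec):
    """Average squared distance between each point and its reconstruction."""
    total = 0.0
    for x in data:
        r = reconstruct(x, enc, dec)
        total += (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2
    return total / len(data)

def train(data, epochs=100, lr=0.02, seed=1):
    """Fit encoder and decoder weights with plain stochastic gradient descent."""
    rng = random.Random(seed)
    enc = [rng.uniform(-0.5, 0.5) for _ in range(2)]
    dec = [rng.uniform(-0.5, 0.5) for _ in range(2)]
    for _ in range(epochs):
        for x in data:
            code = enc[0] * x[0] + enc[1] * x[1]
            err = [dec[i] * code - x[i] for i in range(2)]
            # Gradient of the squared error with respect to the code value
            # (the constant factor 2 is folded into the learning rate).
            grad_code = err[0] * dec[0] + err[1] * dec[1]
            for i in range(2):
                dec[i] -= lr * err[i] * code
                enc[i] -= lr * grad_code * x[i]
    return enc, dec

if __name__ == "__main__":
    points = make_data(200)
    enc, dec = train(points)
    print("reconstruction error:", mean_squared_error(points, enc, dec))
```

Because only one number passes through the bottleneck, the network can keep just the dominant direction in the data; the small scatter off that direction is what gets discarded. A real autoencoder, such as the one in the demo, would typically add more layers and nonlinear activations.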

Romeo then walks through his code in detail, explaining many of its features and functions and showing their effects on the results in real time. The presentation finishes with a question-and-answer period.


Discovering Data Science with Romeo Kienzler

Follow Romeo as he tackles the most difficult challenges in data science.
