Text Feature Extraction
By IBM Developer Staff | Updated September 21, 2018 - Published July 12, 2018
Machine learning algorithms usually expect numeric inputs. When a data scientist wants to use text to create a machine learning model, they must first find a way to represent their text as a vector of numbers. These vectors are called word embeddings. The Swivel algorithm is a frequency-based word embedding method that uses a co-occurrence matrix. The idea is that words with similar meanings tend to occur together in a text corpus, so words with similar meanings end up with vector representations that are closer together than those of unrelated words.
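To make the co-occurrence idea concrete, here is a minimal sketch (not the Swivel algorithm itself, which factorizes the matrix): it counts co-occurrences within a small context window over a toy corpus, then compares the resulting row vectors with cosine similarity. The corpus, window size, and helper names are illustrative choices, not part of the model.

```python
import numpy as np

# Toy corpus: "cat" and "dog" appear in similar contexts; "stocks" does not.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "stocks fell on the news",
]

tokens = [sentence.split() for sentence in corpus]
vocab = sorted({word for sent in tokens for word in sent})
idx = {word: i for i, word in enumerate(vocab)}

# Count co-occurrences within a symmetric window of 2 words.
window = 2
cooc = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, word in enumerate(sent):
        lo, hi = max(0, i - window), min(len(sent), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                cooc[idx[word], idx[sent[j]]] += 1

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Rows of the co-occurrence matrix act as crude word vectors:
# "cat" is closer to "dog" than to "stocks".
print(cosine(cooc[idx["cat"]], cooc[idx["dog"]]))
print(cosine(cooc[idx["cat"]], cooc[idx["stocks"]]))
```

Swivel improves on raw counts by factorizing this matrix (using pointwise mutual information and a special loss for unobserved pairs) to produce dense, low-dimensional embeddings, but the intuition is the same: similar contexts yield similar vectors.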
This model enables you to train the Swivel algorithm on a preprocessed Wikipedia text corpus. To generate word embeddings from your own text corpus, see the instructions in the TensorFlow model repository.