By IBM Developer Staff | Updated September 21, 2018 - Published March 20, 2018
Artificial intelligence | Deep learning | Image-to-Text Translation | Natural Language Processing
This model generates captions from a fixed vocabulary that describe the contents of images in the COCO Dataset. It consists of an encoder (a deep convolutional network using the Inception-v3 architecture, trained on ImageNet-2012 data) and a decoder (an LSTM network trained to generate captions conditioned on the encoding produced by the image encoder). The input to the model is an image, and the output is a sentence describing the image content.
The model is based on the Show and Tell Image Caption Generator Model.
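For intuition, here is a minimal sketch of that encoder-decoder pattern using the Keras API. This is not the MAX implementation: the vocabulary size, embedding dimension, maximum caption length, and the choice to condition the LSTM through its initial state are illustrative assumptions (the original Show and Tell model instead feeds the image embedding to the LSTM as its first input).

import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB_SIZE = 10000  # size of the fixed caption vocabulary (assumed)
EMBED_DIM = 512     # shared image/word embedding size (assumed)
MAX_LEN = 20        # maximum caption length (assumed)

# Encoder: Inception-v3 convolutional network reduced to one image vector.
# (The real model uses ImageNet-2012 weights; weights=None keeps the sketch light.)
cnn = tf.keras.applications.InceptionV3(include_top=False, pooling="avg", weights=None)
image_in = layers.Input(shape=(299, 299, 3))
image_embedding = layers.Dense(EMBED_DIM)(cnn(image_in))

# Decoder: an LSTM over word embeddings, conditioned on the image encoding
# (here via the LSTM's initial hidden and cell states).
words_in = layers.Input(shape=(MAX_LEN,))
word_embeddings = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(words_in)
hidden = layers.LSTM(EMBED_DIM, return_sequences=True)(
    word_embeddings, initial_state=[image_embedding, image_embedding])
next_word_probs = layers.Dense(VOCAB_SIZE, activation="softmax")(hidden)

caption_model = Model([image_in, words_in], next_word_probs)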
This model can be deployed using the following mechanisms:

Deploy from Docker Hub:

docker run -it -p 5000:5000 codait/max-image-caption-generator

Deploy on Kubernetes:

kubectl apply -f https://raw.githubusercontent.com/IBM/MAX-Image-Caption-Generator/master/max-image-caption-generator.yaml
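After either deployment, a quick sanity check is to query the metadata endpoint that MAX models expose. A minimal sketch in Python, assuming the default local port mapping shown above:

import requests

# Assumes a local deployment reachable on port 5000.
metadata = requests.get("http://127.0.0.1:5000/model/metadata").json()
print(metadata)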
You can test or use this model in several ways.

Once deployed, you can test the model from the command line. For example, if running locally:
curl -F "image=@assets/surfing.jpg" -X POST http://127.0.0.1:5000/model/predict
"caption": "a man riding a wave on top of a surfboard .",
"caption": "a person riding a surf board on a wave",
"caption": "a man riding a wave on a surfboard in the ocean .",
Complete the node-red-contrib-model-asset-exchange module setup instructions and import the image-caption-generator getting started flow.
Learn how to send an image to the model and how to render the results in CodePen.