One of the questions our clients and partners ask most often about the IBM Watson Personality Insights service is how much text they need to submit to get acceptable results. Before the release of this new model, our guideline for the production service was to provide at least 3,000 words, which is about five pages of text. Obtaining this much text for each individual whose personality is to be inferred is possible in some contexts, but in many cases it has been a challenge for our clients. In this release we address the issue with a new model that provides higher precision with fewer words.

Our Approach

As explained in a previous blog post, our service relies on psychometric survey-based scores. In short, to collect our ground truth data, we survey large populations, and for each user we collect standard psychometric survey responses along with their Twitter posts (more details here). For this release, we tripled the number of users in our ground truth, which gives us much more confidence in our results.

Along with collecting more data, we changed the nature of our model. Previously, we guided the features in our model using Linguistic Inquiry and Word Count (LIWC) dictionary categories. For the newer model, we wanted a more robust approach that keeps up with new and emerging vocabulary on social media. The newer model, presently available only in English, therefore eliminates the LIWC dictionary altogether and instead uses GloVe, a vector representation of words (word embeddings, similar in spirit to Word2Vec) developed at Stanford University and derived from multiple large corpora. GloVe is trained on aggregated global word-word co-occurrence statistics from a very large corpus, and the resulting representations capture semantic similarities and differences between words. Using GloVe word vectors as features allowed us to bridge the performance gap of our model on short texts and even outperform our previous model on long texts.
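To make the idea of word vectors as features concrete, here is a minimal sketch in Python. It is not the service's implementation: the post does not say how the vectors are combined, so averaging them is an assumption (a common baseline), and the three-dimensional `TOY_VECTORS` table and the `text_features` helper are invented for illustration. Real GloVe vectors have far more dimensions and are loaded from published files.

```python
# Toy stand-in for a GloVe lookup table (values are invented, not real GloVe data).
TOY_VECTORS = {
    "happy": [0.8, 0.1, 0.3],
    "glad": [0.7, 0.2, 0.3],
    "tax": [-0.5, 0.9, 0.0],
}


def text_features(text, vectors):
    """Average the vectors of the known words in `text`.

    Averaging is an assumed aggregation, chosen only for illustration.
    Returns None when no word of the text is in the vocabulary.
    """
    # Naive whitespace tokenization; words missing from the table are skipped.
    words = [w for w in text.lower().split() if w in vectors]
    if not words:
        return None
    dim = len(next(iter(vectors.values())))
    totals = [0.0] * dim
    for word in words:
        for i, value in enumerate(vectors[word]):
            totals[i] += value
    return [total / len(words) for total in totals]


features = text_features("Happy glad", TOY_VECTORS)
```

Because the features come from a vector lookup rather than a fixed dictionary of categories, a new word only needs a vector to contribute to the result, which is one way to read the "keeps up with emerging vocabulary" motivation above.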

To study the accuracy of our model, we performed correlation and Mean Absolute Error (MAE) analysis, comparing the trait scores calculated by our Personality Insights machine learning model with the corresponding psychometric measures collected by administering the surveys to all 1,500 subjects. We averaged the correlation and the MAE over all the personality traits and report them as a function of the number of words a user of our service would submit. The graphs below show the correlation and the MAE on the y-axis and the number of words on the x-axis.
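The evaluation described above can be sketched roughly as follows. This is an illustration under assumptions, not our evaluation code: it assumes the correlation is a Pearson correlation, the function names are hypothetical, and the trait names and scores in the example are invented toy values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and survey scores."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def average_over_traits(predicted_by_trait, survey_by_trait):
    """Compute both metrics per trait, then average them across traits."""
    traits = list(survey_by_trait)
    corr = sum(
        pearson(predicted_by_trait[t], survey_by_trait[t]) for t in traits
    ) / len(traits)
    mae = sum(
        mean_absolute_error(predicted_by_trait[t], survey_by_trait[t])
        for t in traits
    ) / len(traits)
    return corr, mae


# Invented toy scores for three subjects and two traits.
predicted = {"openness": [0.6, 0.4, 0.7], "extraversion": [0.3, 0.5, 0.8]}
survey = {"openness": [0.7, 0.3, 0.8], "extraversion": [0.2, 0.6, 0.9]}
avg_corr, avg_mae = average_over_traits(predicted, survey)
```

Repeating this computation on inputs truncated to different word counts would give one point per word count for each of the two curves.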

Those graphs show that our new model performs much better than the previous one for all text lengths. With 3,000 words, the new model provides 40% higher correlation than before. In fact, with just 450 words the new model provides the same correlation as the older one provided with 3,000 words. This translates to an improved model (see Mean Absolute Error and other details here) that requires fewer words to infer personality.

We recommend that you provide at least 1,200 words of input text, which results in a correlation within 90% of the best correlation and an MAE within two percent of the best MAE the service can return. Submitting between 600 and 1,200 words results in an MAE within three percent of the best MAE and a correlation within 80% of the best, which IBM still considers acceptable. Providing at least 3,000 words approaches the service's maximum precision. As before, you must submit at least 100 words; otherwise, the service reports an error. If you submit fewer than 600 words, the service reports a warning but still analyzes the input text.
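As a quick reference, the word-count thresholds above can be written as a small pre-check that a client might run before calling the service. This is only a sketch: `input_guidance` and its labels are hypothetical and are not part of the service's API, and counting words by splitting on whitespace is an assumption that may differ from how the service counts them.

```python
def input_guidance(word_count):
    """Map a word count to the guidance tiers described in this post."""
    if word_count < 100:
        return "error"  # the service rejects the input
    if word_count < 600:
        return "warning"  # analyzed, but the service reports a warning
    if word_count < 1200:
        return "acceptable"  # MAE within three percent of the best
    if word_count < 3000:
        return "recommended"  # MAE within two percent of the best
    return "near-maximum precision"


def guidance_for_text(text):
    # Whitespace splitting is an assumed approximation of the word count.
    return input_guidance(len(text.split()))
```

A client could use a check like this to decide whether to gather more text for a user before submitting it.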


With this new model, the Watson Personality Insights service offers improved performance for users. Further, we have been able to reduce the number of words recommended for acceptable performance by almost a factor of 3. The IBM Watson Personality Insights team is very excited about the capabilities this release will enable and is looking forward to your feedback. We are currently releasing this new model for English. Other languages, namely Spanish, Japanese and Arabic, will follow shortly.

Given that this is a new model, the results it produces differ from those of the previous one. We remind our users that while results may not tally on an individual basis between the new and old models, the overall performance of the new model, as measured by both correlation and MAE, is better than that of the old one. Users should therefore see better results overall with this new model. Users and clients who cache personality results must choose whether to keep the old results as-is or to recompute them with the newer model. Our recommendation is that if you are doing a population analysis, for use cases such as customer segmentation based on personality or adding personality traits to prediction models, the personality of all users in the population should be calculated with the same model. In either case, users should understand that, given the complexity of predicting personality traits from proxies such as text, some amount of error is unavoidable. Even so, we are pleased to note that our new model is better than the old one.

We are continuously improving our models and will keep you abreast of any future updates. The development team responsible for Personality Insights includes: Pierre Arnoux, Rama Akkiraju, Neil Boyette, Jalal Mahmud and Vibha Sinha. Kenneth R Kuo is the offering manager. Steffi Diamond is the release manager.



15 comments on "IBM Watson Personality Insights has a new model: Support of shorter text and precision improvement"

  1. Is this update now generally available in the PI service? We didn’t get a product update info. about this.

    • Pierre H. Arnoux September 01, 2016

      Yes, it was G.A. on August 31st but is only available for English language. Other languages will follow shortly.

  2. Awesome!! As a researcher, the lower word count threshold opens up a ton of new experimental possibilities for me, and will immediately help me justify using Watson on more projects/datasets. Really happy about this update, and look forward to more in the future. Thanks, PI team!

    P.S. – Also thanks for explaining it in this blog post. Now I have another resource I can cite in any journal articles I write to help explain how Watson works, its accuracy, etc.

  3. I’d be very interested in speaking with you. I am the founder of the MindTime project.
    The MindTime framework is a research and peer reviewed a priori theory of human consciousness and individual difference.
    I believe that this framework will provide more powerful and more useful insights than those possible through measurement of individual differences traits such as the Big 5, and I believe it can be achieved with a smaller body of text.
    To be sure, we’ve done the peer reviewed research and MindTime is parsimonious with and often explains the causes (survival value) behind many individual differences traits, including the Big 5.
    For many years we have felt that we could use text analysis to measure individuals’ thinking perspectives/style (MindTime), a result of the blend of the 3 constructs that define the MindTime framework. To date we have used our own survey methodology (we can measure a person’s thinking perspective/style with as few as 9 statements. although we use 45 in research work –
    I am very interested in speaking with you about using Watson’s capabilities to reveal these thinking constructs through text analysis.
    Sorry to communicate this way, your blog prompted my outreach and I do not have an email for you.
    John Furey

    • Sankar . N October 11, 2016

      Hello Mr.John Furey,

      Interesting to know about Mindtime. You might want to try our TEXTIENT ANALYTICS platform, where we have integrated IBM Watson services with various text data sources, such as Facebook, Twitter, YouTube and your own data, in an easy-to-use manner. We also have powerful data refining and management tools that help you get high-quality, deeper people insights. We are an IBM Watson partner. Let us know if you wish to try it.


  4. garyhow01hello January 21, 2017

    Hello, in the article you mentioned

    “To study the accuracy of our model, we performed correlation and Mean Absolute Error (MAE) analysis to compare the trait scores that were calculated by using our Personality Insights machine learning model with the corresponding psychometric measures collected from administering the surveys to all 1,500 subjects.”

    Is the machine learning model a supervised or unsupervised model? I am just wondering if the Personality Insights service results learn from inputs from its users and improves itself automatically over time.

    • Pierre H. Arnoux January 23, 2017

      Hi Gary, the model for Personality Insights is supervised.
      In the current implementation, the model doesn’t leverage users’ input to improve over time.

  5. Pierre H. Arnoux May 30, 2017

    Our latest publication on Personality Insights:

    You can find all the details about the technique we use in this paper.

  6. Hi Pierre,

    Is there any information available about the weighting method that was used in the original LIWC-based Personality Insights? As in, how the coefficients from Yarkoni (2010) were applied as weights? Thanks!

    • Pierre H. Arnoux November 28, 2017

      Hi Ryan,

      for this experiment, we didn’t use the Yarkoni (2010) coefficients directly.
      We reproduced his method based on LIWC and trained a linear model on our data.
      That way, we are comparing apples to apples.
      I hope this helps.

      • Thanks so much for your response. I appreciate that the new Insights model doesn’t use the Yarkoni coefficients directly. I’m actually after some documentation of the previous process that was used. Do you know of any?

        • Pierre H. Arnoux November 30, 2017

          Unfortunately, we don’t maintain the documentation of previous versions of our service.

  7. Gabriel Tincopa Bejar January 30, 2018

    Hello Pierre,
    I see that the article mentions that, for the moment, the new model has been developed only for English. But this post is from August 2016. I would like to know whether the Spanish version is ready now, in January 2018. I look forward to your reply.
