Today, we released a new SPSS Modeler extension that creates word clouds. Even if you are unfamiliar with the term, you have probably seen word clouds in one place or another as a way to visualize the most frequently used words in a text dataset (corpus). Word clouds can be very helpful visualization tools when working with text data, so we wanted to add this functionality to SPSS Modeler with an extension. This extension was written in R, so you will need the appropriate version of R and R Essentials installed (learn what you need here).

This extension can take its text from three sources: data in a Modeler stream, a URL combined with a CSS selector, or a local .txt file. Options in the extension let you change the text pre-processing, display, and save settings.
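If you are curious what sits underneath a word cloud, it is essentially a table of word counts computed after some text pre-processing. The sketch below is only an illustration in Python, not the extension's actual R code: the function name `word_frequencies`, the tiny stop-word list, and the sample sentence are all made up for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only;
# real text-mining tools ship much longer lists.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "for", "in", "is", "it", "my", "i"}


def word_frequencies(text, stop_words=STOP_WORDS, min_length=2):
    """Count words after basic pre-processing.

    Pre-processing here means: lowercase the text, keep only runs of
    letters and apostrophes, then drop short words and stop words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    words = [w.strip("'") for w in words]
    kept = [w for w in words if len(w) >= min_length and w not in stop_words]
    return Counter(kept)


# Made-up sample text, not taken from any real dataset.
sample = "Pizza for my family tonight? A pizza would help the family so much."
freq = word_frequencies(sample)

# The highest counts are the words a cloud would draw largest.
print(freq.most_common(3))
```

A word cloud then simply scales each word's font size by its count, which is why the pre-processing choices (stop words, minimum word length, case folding) have such a visible effect on the final picture.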

We have documentation, multiple examples, and the extension download available at the GitHub repository.

In one example, we look at the Random Acts of Pizza subreddit, where users call on the generosity of strangers to supply them with pizza. Using SPSS Modeler for some pre-processing, we generated two word clouds. The cloud on the left shows users who received a pizza, and the cloud on the right shows users who did not.

Do you see any differences?
[Images: Word Cloud - Got Pizza (left), Word Cloud - No Pizza (right)]

8 comments on "Create Word Clouds with New SPSS Modeler Extension"

  1. Nice work, Greg! 😉

  2. Greg,

    Been looking forward to this… saw it was coming.

    However, with Modeler Text Analytics, what I’m really interested in is Concept Clouds rather than Word Clouds. Can the extension handle multi word phrases and not just individual words? One would have a Text Analytic nugget produce concept words/phrases and then display them in the Word Cloud extension retaining the multi word phrases in addition to individual words.

    Right now I output concept records and generate the Concept/Word Cloud in Tableau.

    Tony

    • Hi Tony,

      Thanks for the suggestion. This extension currently only looks at single words and not phrases. It should be possible to make this enhancement. If you want to add this functionality, or suggest someone else add it, please make the recommendation here: https://github.com/IBMPredictiveAnalytics/Word_Cloud_Visualization/issues

      Hope this extension helps bring some visualization work into Modeler.
      Greg

      • TonyPines May 16, 2016

        Greg,

        Can you check the issue I posted for this extension on GitHub? Not sure if it's the SPSS Extension itself, or a package that the Extension is calling on installation, but it looks like a serious potential problem. Reproduced on 2 virgin OS X installations.

        Tony

        • TonyPines May 16, 2016

          Greg

          Intego has reviewed the file and confirmed it was a “False Positive”… so the openssl is not infected.

          So there is no problem per the openssl component installed during Extension deployment.

          Tony

          > Thank you for your file submission.
          >
          > Our malware team has reviewed the file you sent and they have identified it as a false positive. They expect that it will be corrected with an update that will be released Thursday. For now, please Trust the file by adding it to the Trusted Items. We apologize about the inconvenience.
          >
          > Please let us know if you have any issues.
          >
          > Kind Regards,
          >
          > Babs
          > Intego Support Team

          • Greg Filla May 17, 2016

            Hi Tony,

            Glad this was fixed. Any code in the extension that I did not write myself comes from R packages that are approved by CRAN and have been tested.

            Happy to see this was a false positive.

            Hope the extension is working for you,
            Greg

          • TonyPines May 17, 2016

            Greg,

            In general, would it be “R” or one of the extensions that installed openssl into the R Framework?

            I only saw R_Start make a connection to the internet during spss extension installation, and r_start is in Modeler client at
            /Applications/IBM/SPSS/Modeler/18/SPSSModeler.app/Contents/ext/bin/pasw.rstats/r_start

            Tony

  3. Greg,

    Can you check the issue I posted for this extension on GitHub? Thanks.
