Answers for "Using the Silhouette Procedure to evaluate k-means clustering solutions"
https://developer.ibm.com/answers/questions/381501/using-the-silhouette-procedure-to-evaluate-k-means.html
The latest answers for the question "Using the Silhouette Procedure to evaluate k-means clustering solutions"
Answer by jkpeck
https://developer.ibm.com/answers/answers/500422/view.html
@Jeanne123
You can install the STATS CLUS SIL extension command from Extensions > Extension Hub. It takes the clustering output and produces silhouette measures and charts. It will appear on the Analyze > Classify menu. See the dialog or syntax help for details.
Mon, 08 Apr 2019 16:32:43 GMT
jkpeck
Answer by Jeanne123
https://developer.ibm.com/answers/answers/500358/view.html
I am also using the two-cluster method in SPSS. How can I get SPSS to report the Silhouette Coefficient for each solution?
Mon, 08 Apr 2019 11:33:18 GMT
Jeanne123
Answer by jkpeck
https://developer.ibm.com/answers/answers/432021/view.html
@g.g.g
This command requires the data to be in memory, and its running time is proportional to the square of the number of cases. With larger datasets, you may want to run the analysis on a random sample of the data.
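To illustrate why sampling helps, here is a minimal Python sketch. It is not part of STATS CLUS SIL; the data, the helper names `pairwise_count` and `draw_sample`, and the sample size of 2,000 are all assumptions made for illustration. It counts the case pairs a silhouette computation has to compare and draws a simple random sample that keeps each case's cluster number.

```python
import random

def pairwise_count(n_cases):
    """Number of distinct case pairs among n_cases (grows with the square of n)."""
    return n_cases * (n_cases - 1) // 2

def draw_sample(cases, labels, sample_size, seed=0):
    """Simple random sample without replacement; keeps cases and labels aligned."""
    rng = random.Random(seed)
    idx = sorted(rng.sample(range(len(cases)), min(sample_size, len(cases))))
    return [cases[i] for i in idx], [labels[i] for i in idx]

# Hypothetical data: 100,000 cases, two clustering variables, two clusters.
rng = random.Random(42)
cases = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100_000)]
labels = [1 if x < 0 else 2 for x, _ in cases]

sub_cases, sub_labels = draw_sample(cases, labels, 2_000)

full_pairs = pairwise_count(len(cases))        # pairs in the full dataset
sample_pairs = pairwise_count(len(sub_cases))  # pairs in the sample
print(full_pairs, sample_pairs, full_pairs // sample_pairs)
```

Cutting the number of cases by a factor of 50 cuts the number of pairs by a factor of roughly 2,500, which is where the time saving comes from.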
As for the measure, the choice doesn't affect the time much. If your clustering variables are all continuous, Euclidean is typically the best choice, but if you also have categorical variables, you might try Gower, since that treats continuous and categorical variables differently.
Sat, 17 Feb 2018 17:45:24 GMT
jkpeck
Answer by g.g.g
https://developer.ibm.com/answers/answers/432014/view.html
@jkpeck I used a two-step cluster analysis and I would like to get a silhouette plot.
However, I used the log-likelihood distance in the clustering procedure. Which distance measure should I use to build the silhouette plot?
I tried the Euclidean one, but it has been running for hours. Is that normal?
Sat, 17 Feb 2018 15:57:06 GMT
g.g.g
Answer by jkpeck
https://developer.ibm.com/answers/answers/381533/view.html
@tzm12
Yes, use the cluster number saved from k-means in the silhouette cluster number field. List all the variables used for clustering in the Cluster Variables field. Make sure that the dissimilarity measure matches the one you used in k-means. The procedure displays a table of the mean, minimum, and maximum silhouette statistic by cluster. You might also find the plots useful.
Wed, 14 Jun 2017 17:03:19 GMT
jkpeck
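
For readers who want to see what the mean, minimum, and maximum silhouette table summarizes, the following is a rough Python sketch of the standard silhouette definition with Euclidean distance. It is an illustration only, not the implementation used by STATS CLUS SIL; the function names and the six-case dataset are hypothetical, and it assumes at least two clusters.

```python
import math
from collections import defaultdict

def silhouette_values(cases, labels):
    """Per-case silhouette s = (b - a) / max(a, b), Euclidean distance.

    a: mean distance to the other cases in the same cluster.
    b: smallest mean distance to the cases of any other cluster.
    Assumes at least two distinct cluster numbers in labels.
    """
    members = defaultdict(list)
    for i, lab in enumerate(labels):
        members[lab].append(i)

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in members[lab] if j != i]
        if not own:
            values.append(0.0)  # usual convention for a single-case cluster
            continue
        a = sum(math.dist(cases[i], cases[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(cases[i], cases[j]) for j in idx) / len(idx)
            for other, idx in members.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values

def summarize_by_cluster(values, labels):
    """Mean, minimum, and maximum silhouette for each cluster number."""
    grouped = defaultdict(list)
    for s, lab in zip(values, labels):
        grouped[lab].append(s)
    return {
        lab: {"mean": sum(v) / len(v), "min": min(v), "max": max(v)}
        for lab, v in sorted(grouped.items())
    }

# Hypothetical data: two well-separated clusters of three cases each.
cases = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [1, 1, 1, 2, 2, 2]  # cluster numbers as saved by a clustering run

values = silhouette_values(cases, labels)
summary = summarize_by_cluster(values, labels)
for lab, stats in summary.items():
    print(lab, stats)
```

Values near 1 indicate cases that sit well inside their cluster, values near 0 indicate cases on a boundary, and negative values suggest a case may be closer to another cluster.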