The item distribution performance operator calculates its performance in the sum of squares case very simply. The number of examples in each cluster is divided by the total number of examples in all clusters. This is squared and the values for each cluster are summed.

For example, three clusters containing 1, 2 and 3 items would have an item distribution score of 0.389. This is 1/36 + 4/36 + 9/36 = 14/36 = 0.389.

For a situation where one cluster dominates and the others clusters are very small in comparison, this value will tend to 1. For the opposite situation, where the clusters have equal numbers of examples, the value tends to 1/N where N is the number of clusters.

Subscribe to:
Post Comments (Atom)

## No comments:

## Post a Comment