Wednesday, 12 February 2014

A "feature" of the Performance (Regression) operator

I noticed a feature of the Performance (Regression) operator: it never reports a correlation below 0. When a label and prediction are negatively correlated with one another, the reported correlation is 0 rather than a negative number. Furthermore, the squared correlation is also reported as zero in this case.

This example process shows this.

I'm using the Correlation Matrix operator as a way of checking the answer I get. If you change the sign of the calculation in the Generate Attributes operator, you will find that the correlation should be -0.763 in the negative case and 0.706 in the positive case.

In the negative case, the Performance (Regression) operator outputs 0 for both correlation and squared correlation. This is odd, since other criteria, such as Kendall Tau and Spearman Rho, can take negative values.
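To make the discrepancy concrete outside the tool, here is a minimal Python sketch. It computes the ordinary Pearson correlation by hand and then applies a hypothetical clamp at zero to mimic what the operator appears to report. The function names `pearson` and `clamped`, and the data, are made up for illustration; I don't know how the operator is actually implemented, so the clamp is only a guess at what could produce both zeros.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def clamped(r):
    """Hypothetical: never report a value below zero (an assumption, not the operator's known code)."""
    return max(0.0, r)


# Made-up data: the prediction falls as the label rises.
label = [1.0, 2.0, 3.0, 4.0, 5.0]
prediction = [4.8, 4.1, 2.9, 2.2, 0.9]

r = pearson(label, prediction)
print("true correlation:        ", r)
print("true squared correlation:", r ** 2)
print("clamped correlation:     ", clamped(r))
print("clamped, then squared:   ", clamped(r) ** 2)
```

The true squared correlation is positive even when the correlation itself is negative, so a squared correlation of 0 would be consistent with the value being clamped before it is squared. That is speculation on my part.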

I can't work out if this is a deliberate feature or some other thing ;)

