Friday, February 8, 2013

comparing null distributions: 2 or 4 runs

Continuing the previous example, this post shows the null distributions that result from running the simulation with two or four runs. Because the bias (the difference between the classes) and the number of examples per run per class (10) are kept constant, increasing the number of runs makes the classification easier: there are more training examples.

The null distributions are narrower when four runs are included:

Because the null distributions are wider and the classification accuracy is worse with two runs, the p-values are less significant with two runs than with four (tables below). For example, repetition #9 had an accuracy of 0.72 with two runs, which gave a rank of 10 when permuting the training data only. Repetition #6 with four runs also had an accuracy of 0.72, but its rank was 0: the true-labeled data was more accurate than all of the permutations.
[Tables: results with two runs and with four runs]
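
To make the rank idea concrete, here is a minimal Python sketch of turning a true-labeled accuracy and a set of permutation accuracies into a rank and a p-value. It is not the code used for the tables: the function name, the handling of ties (counted against the true labeling), and the (rank + 1) / (permutations + 1) p-value formula are my assumptions for illustration.

```python
import numpy as np

def permutation_rank_and_p(true_acc, null_accs):
    """Rank the true-labeled accuracy within a permutation null distribution.

    Assumptions (not taken from the post): ties count against the true
    labeling, and the p-value is (rank + 1) / (number of permutations + 1).
    """
    null_accs = np.asarray(null_accs, dtype=float)
    # rank = how many permuted-label accuracies are at least as high as the
    # true one; a rank of 0 means the true labeling beat every permutation
    rank = int(np.sum(null_accs >= true_acc))
    p_value = (rank + 1) / (null_accs.size + 1)
    return rank, p_value

# Made-up numbers, only to show the call
rank, p_value = permutation_rank_and_p(0.72, [0.50, 0.60, 0.75, 0.72])
print(rank, p_value)
```

With this convention the same accuracy of 0.72 can land at very different ranks, depending only on how wide the null distribution is.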

The true-labeled data accuracy (given as "real" in the tables) varies quite a bit more over the ten repetitions with two runs than with four (0.57 to 0.9 with two runs, 0.69 to 0.84 with four). This strikes me as expected: the classification with only two runs is much more difficult (we have much less statistical power), and so is less stable. The permutation distributions should also be wider (have more variance) when we have less power.
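
The narrowing of the null distribution with more runs can be reproduced with a small stand-alone simulation. The sketch below is in Python and is not the simulation code behind the figures in this post: the number of voxels (50), the bias (0.2), the nearest-class-mean classifier, the number of permutations (300), and all function names are assumptions chosen for illustration. It keeps the parts described above: 10 examples per class per run, leave-one-run-out cross-validation, and permuting the training labels only.

```python
import numpy as np

def make_data(n_runs, n_per_class=10, n_voxels=50, bias=0.2, rng=None):
    """Gaussian noise data; class 1 is shifted by `bias` in every voxel (assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.tile(np.repeat([0, 1], n_per_class), n_runs)
    runs = np.repeat(np.arange(n_runs), 2 * n_per_class)
    data = rng.normal(size=(labels.size, n_voxels))
    data[labels == 1] += bias
    return data, labels, runs

def cv_accuracy(data, train_labels, test_labels, runs):
    """Leave-one-run-out accuracy of a nearest-class-mean classifier."""
    correct = 0
    for r in np.unique(runs):
        train, test = runs != r, runs == r
        mean0 = data[train & (train_labels == 0)].mean(axis=0)
        mean1 = data[train & (train_labels == 1)].mean(axis=0)
        dist0 = ((data[test] - mean0) ** 2).sum(axis=1)
        dist1 = ((data[test] - mean1) ** 2).sum(axis=1)
        predicted = (dist1 < dist0).astype(int)
        correct += np.sum(predicted == test_labels[test])
    return correct / test_labels.size

def null_distribution(data, labels, runs, n_perms=300, rng=None):
    """Permute labels within each run; use them for training only."""
    rng = np.random.default_rng() if rng is None else rng
    null = np.empty(n_perms)
    for i in range(n_perms):
        permuted = labels.copy()
        for r in np.unique(runs):
            idx = np.flatnonzero(runs == r)
            permuted[idx] = rng.permutation(labels[idx])
        # train on permuted labels, score against the true test labels
        null[i] = cv_accuracy(data, permuted, labels, runs)
    return null

rng = np.random.default_rng(0)
results = {}
for n_runs in (2, 4):
    data, labels, runs = make_data(n_runs, rng=rng)
    real = cv_accuracy(data, labels, labels, runs)
    null = null_distribution(data, labels, runs, rng=rng)
    results[n_runs] = (real, null)
    print(f"{n_runs} runs: real = {real:.2f}, null sd = {null.std():.3f}")
```

Permuting within each run keeps the classes balanced in every training set, which is one common choice but not the only one. The exact numbers depend on the seed and on the assumed parameters; the point is only that the standard deviation of the null distribution should shrink as runs are added.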
