Wednesday, February 20, 2013

pyMVPA null distributions, different simulated data

Ben Acland recently joined our lab, and knows Python. He's been exploring the pyMVPA code, and was able to run the permutation code from Michael and Yaroslav, but with the datasets generated from two normal distributions instead of with the pyMVPA normal_feature_dataset function.

Here is the command he wrote to generate the dataset, where "mean" is what I previously called the "bias": class A is generated by a normal distribution with mean "mean", class B by a normal distribution with mean (-1 * "mean").
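As a rough sketch of this kind of two-Gaussian simulation (this is not Ben's actual command; the function name, its parameters, the unit standard deviation, and the round-robin chunk assignment are all my assumptions, and it uses plain NumPy rather than pyMVPA):

```python
import numpy as np


def two_gaussian_dataset(n_per_class, n_voxels, mean, n_chunks=2, sd=1.0, seed=None):
    """Hypothetical helper: simulate two classes from mirrored normal distributions.

    Class A is drawn from N(mean, sd) and class B from N(-mean, sd) at every
    voxel, so all voxels are informative (none are "bogus").
    """
    rng = np.random.default_rng(seed)
    class_a = rng.normal(loc=mean, scale=sd, size=(n_per_class, n_voxels))
    class_b = rng.normal(loc=-mean, scale=sd, size=(n_per_class, n_voxels))

    samples = np.vstack([class_a, class_b])
    targets = np.array(["A"] * n_per_class + ["B"] * n_per_class)
    # Assumed layout: examples of each class are spread round-robin over chunks.
    chunks = np.tile(np.arange(n_per_class) % n_chunks, 2)
    return samples, targets, chunks


samples, targets, chunks = two_gaussian_dataset(
    n_per_class=10, n_voxels=5, mean=0.5, n_chunks=2, seed=0
)
```

A larger "mean" pushes the two classes further apart, which is why it plays the same role as the "bias" in the earlier simulations.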


Here are the curves he generated for different numbers of examples, chunks, and means ("bias"). This is using Michael and Yaroslav's code, except for using the above line of code to generate the simulated data (and having to set "mean", and having all the voxels "non-bogus").
The curves obviously change far less than the earlier sets of curves did; they are pretty much the same even though the true-labeled accuracy changes drastically.
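For readers who want to poke at this without pyMVPA, here is a minimal NumPy sketch of building a permutation null distribution on this kind of simulated data. It is not Michael and Yaroslav's code: the nearest-class-mean classifier, the leave-one-chunk-out cross-validation, the choice to permute all labels within each chunk, and every function name below are my assumptions, picked only to keep the example short.

```python
import numpy as np


def make_data(n_per_class, n_voxels, mean, n_chunks, rng):
    # Class 0 ~ N(mean, 1), class 1 ~ N(-mean, 1) at every voxel.
    x = np.vstack([
        rng.normal(mean, 1.0, (n_per_class, n_voxels)),
        rng.normal(-mean, 1.0, (n_per_class, n_voxels)),
    ])
    y = np.repeat([0, 1], n_per_class)
    chunks = np.tile(np.arange(n_per_class) % n_chunks, 2)
    return x, y, chunks


def cv_accuracy(x, y, chunks):
    """Leave-one-chunk-out accuracy of a nearest-class-mean classifier."""
    correct = 0
    for c in np.unique(chunks):
        train, test = chunks != c, chunks == c
        mean0 = x[train & (y == 0)].mean(axis=0)
        mean1 = x[train & (y == 1)].mean(axis=0)
        dist0 = np.linalg.norm(x[test] - mean0, axis=1)
        dist1 = np.linalg.norm(x[test] - mean1, axis=1)
        predicted = (dist1 < dist0).astype(int)
        correct += np.sum(predicted == y[test])
    return correct / len(y)


def permute_within_chunks(y, chunks, rng):
    # Shuffle labels separately inside each chunk, keeping class counts per chunk.
    y_perm = y.copy()
    for c in np.unique(chunks):
        idx = np.flatnonzero(chunks == c)
        y_perm[idx] = rng.permutation(y[idx])
    return y_perm


rng = np.random.default_rng(0)
x, y, chunks = make_data(n_per_class=20, n_voxels=5, mean=0.5, n_chunks=4, rng=rng)

true_acc = cv_accuracy(x, y, chunks)
null_accs = np.array([
    cv_accuracy(x, permute_within_chunks(y, chunks, rng), chunks)
    for _ in range(200)
])
# One-sided p-value: proportion of permuted accuracies at least as high as the true one.
p_value = (np.sum(null_accs >= true_acc) + 1) / (len(null_accs) + 1)
```

Rerunning this with different values of "mean", n_per_class, and n_chunks and plotting a histogram of null_accs each time gives a set of curves to compare against the true-labeled accuracy.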

So, yep, how we simulate the data really makes a difference. But how should we simulate data for MVPA purposes? Do we need a few standards? I doubt any single simulation method will be sufficient.
