From xiaobeiz at usc.edu  Tue Apr 20 18:43:37 2021
From: xiaobeiz at usc.edu (Xiaobei Zhang)
Date: Tue, 20 Apr 2021 10:43:37 -0700
Subject: [pymvpa] How to average group-level permutation test in pyMVPA
Message-ID:

Hi all,

I am trying to do a group-level permutation test in pyMVPA. I have obtained
individual p-values with a permutation test, and I think I should average the
permuted distributions across subjects to get the group-level p-value and mean
accuracy. I am not sure what to do next; here are the current lines I have:

    sub = [2003, 2016, 2077, 2098, 2989, 1989, ...]
    for subject in sub:
        ............
        # basic parameters
        clf = LinearCSVMC()
        permutator = AttributePermutator('targets', count=1000)
        distr_est = MCNullDist(permutator, tail='left',
                               enable_ca=['dist_samples'])
        cvte = CrossValidation(clf, splitter, errorfx=mean_mismatch_error,
                               postproc=mean_sample(),
                               null_dist=distr_est, enable_ca=['stats'])
        err = cvte(dataset)
        cvte.null_dist.append(cvte.null_dist)
        p = cvte.ca.null_prob
        assert(p.shape == (1, 1))
        print 'Corresponding p-value:', np.asscalar(p)

Thanks for your help!
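The averaging the question describes can be sketched with plain NumPy. This is a hypothetical sketch, not the PyMVPA API: it assumes each subject's permutation accuracies are already available as a 1-D array (e.g. a flattened copy of that subject's `ca.dist_samples`), that all subjects used the same permutation count, and the function name is illustrative:

```python
import numpy as np

def group_p_value(null_dists, observed_accs):
    # null_dists: one 1-D array of permutation accuracies per subject;
    # all subjects must have the same number of permutations.
    # observed_accs: the observed (unpermuted) accuracy of each subject.
    null = np.mean(np.vstack(null_dists), axis=0)   # permuted group means
    observed = np.mean(observed_accs)               # observed group mean
    # one-sided p-value with the standard +1 correction
    p = (np.sum(null >= observed) + 1.0) / (null.size + 1.0)
    return observed, p

# Toy example: 3 subjects with 1000 permutations each.
rng = np.random.RandomState(0)
null_dists = [rng.uniform(0.3, 0.7, 1000) for _ in range(3)]
observed, p = group_p_value(null_dists, [0.80, 0.75, 0.90])
```

Averaging the subject-wise distributions permutation-by-permutation like this is only one possible group-level scheme; it treats the k-th permutation of every subject as one permuted "group" sample.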
What I would like to do is classify all 11 classes together (not one versus the
others, nor all pairwise classifications) and then assess whether the
classifier predicts each class above chance level. Because the data are
unbalanced, I cannot assume that the chance level is equal for each class (the
naive chance level would be 1/11 ≈ 0.09). With the code below I was able to
create a null distribution of classification accuracies for the whole model,
but with that alone I cannot assess the statistical significance of each class
by comparing the class accuracies against the whole-model null distribution.

I would like to extract the confusion matrix of the predictions for each
permutation round (event labels shuffled in the training set). From these
confusion matrices I could build separate null distributions of prediction
accuracies for each class and then assess the significance of each class
separately. I assumed this would be fairly simple to do, but I could not find
anything related in the documentation or on this mailing list.
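The per-class computation described above can be sketched with plain NumPy. This is a hypothetical sketch, not PyMVPA output: the "permutation" confusion matrices are simulated here with chance-level multinomial predictions, all names are illustrative, and it assumes true classes on the rows of each confusion matrix (transpose first if your convention is the opposite):

```python
import numpy as np

def class_accuracies(confmat):
    # Per-class accuracy (recall): correct / number of true events per class.
    # Assumes true classes on rows and predicted classes on columns.
    confmat = np.asarray(confmat, dtype=float)
    return np.diag(confmat) / confmat.sum(axis=1)

def per_class_p_values(real_confmat, perm_confmats):
    # One-sided p-value per class: fraction of permutation rounds whose
    # per-class accuracy reaches the real one (+1 correction).
    real_acc = class_accuracies(real_confmat)
    null_acc = np.array([class_accuracies(m) for m in perm_confmats])
    exceed = (null_acc >= real_acc).sum(axis=0)
    return (exceed + 1.0) / (len(perm_confmats) + 1.0)

# Toy example: 3 classes of unequal size (10, 10 and 20 events).
real = np.array([[8, 1, 1],
                 [2, 5, 3],
                 [1, 1, 18]])
rng = np.random.RandomState(0)
perm_confmats = [np.vstack([rng.multinomial(n, [1 / 3.0] * 3)
                            for n in real.sum(axis=1)])
                 for _ in range(1000)]
pvals = per_class_p_values(real, perm_confmats)
```

Each class is then compared against its own null distribution, so the unequal class sizes are handled automatically.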
The relevant sections of the code:

    # Classifier
    clf = MLPClassifier(alpha=1, max_iter=1000)  # neural net classifier
    wrapped_clf = SKLLearnerAdapter(clf)
    stats.chisqprob = lambda chisq, df: stats.chi2.sf(chisq, df)

    # Permutator
    permutator = AttributePermutator('targets', count=1000)
    distr_est = MCNullDist(permutator, tail='right',
                           enable_ca=['dist_samples'])

    # Cross-validation
    cv = CrossValidation(wrapped_clf, NFoldPartitioner(attr='subject'),
                         errorfx=lambda p, t: np.mean(p == t),
                         postproc=mean_sample(),
                         null_dist=distr_est, enable_ca=['stats'])
    bsc_null_results = cv(ds_mni)
    # Null distribution of accuracies for the whole model
    perm_accu = cv.null_dist.ca.dist_samples
    # Real accuracy for the whole model
    accuracy_bs = bsc_null_results.S
    # Confusion matrix for classification using real data in the training set
    confmat_bs = cv.ca.stats.matrix

Is it possible to extract the confusion matrices for each permutation round? If
not, I would be thankful for any advice on how to assess the significance of
the class accuracies separately in this case of unbalanced data.

Sincerely,
Severi Santavirta

From dinga92 at gmail.com  Mon Jun 21 15:15:21 2021
From: dinga92 at gmail.com (Richard Dinga)
Date: Mon, 21 Jun 2021 16:15:21 +0200
Subject: [pymvpa] How to average group-level permutation test in pyMVPA
In-Reply-To:
References:
Message-ID:

You can do it the way you described, but a better way is to create one big
dataset with all subjects and run your analysis so that you get the average
accuracy of your within-subject classifications. Then use AttributePermutator
with limit='subject' or something like that, which will create permuted
datasets where permutations are performed only within subjects. Then you just
repeat your analysis for each permutation to get the null distribution of
average within-subject accuracies. You might also consider strategy='chunks'
to deal with temporal dependencies.
I am sorry I don't have example code that would run this.

On Tue, Apr 20, 2021, 19:43 Xiaobei Zhang wrote:

> Hi all,
> I am trying to do a group-level permutation test in pyMVPA. I have
> obtained individual p-values with permutation test and I think I should
> average the permuted distributions across subjects to get the p-value and
> mean accuracy.
> I am not sure what I should do next and here are the current lines I have:
> sub=[2003,2016,2077,2098,2989,1989......]
> for subject in sub:
> ............ #basic parameters
> clf = LinearCSVMC()
> permutator = AttributePermutator('targets', count=1000)
> distr_est = MCNullDist(permutator, tail='left', enable_ca=['dist_samples'])
> cvte = CrossValidation(clf,splitter,errorfx=mean_mismatch_error,
> postproc=mean_sample(),
> null_dist=distr_est,enable_ca=['stats'])
> err=cvte(dataset)
> cvte.null_dist.append(cvte.null_dist)
> p = cvte.ca.null_prob
> assert(p.shape == (1,1))
> print 'Corresponding p-value:', np.asscalar(p)
>
> Thanks for your help!
> _______________________________________________
> Pkg-ExpPsy-PyMVPA mailing list
> Pkg-ExpPsy-PyMVPA at alioth-lists.debian.net
> https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
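The within-subject permutation scheme suggested in the reply above can be sketched with plain NumPy, independently of the PyMVPA API. The function name and the toy data are illustrative; the sketch only mirrors what AttributePermutator with a per-subject limit is meant to do:

```python
import numpy as np

def permute_within_subjects(targets, subjects, rng):
    # Shuffle labels separately inside each subject, so every subject
    # keeps its own label counts (the within-subject null hypothesis).
    permuted = np.array(targets)
    for s in np.unique(subjects):
        idx = np.where(subjects == s)[0]
        permuted[idx] = rng.permutation(permuted[idx])
    return permuted

# Toy data: 2 subjects with 4 samples each.
targets = np.array([0, 0, 1, 1, 0, 1, 0, 1])
subjects = np.array([1, 1, 1, 1, 2, 2, 2, 2])
rng = np.random.RandomState(0)
perm = permute_within_subjects(targets, subjects, rng)
```

Each permuted labeling would then be fed through the same cross-validation as the real labels, and the resulting average within-subject accuracies form the null distribution.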