[pymvpa] Training and testing on only 1 run (no cross validation)

Yaroslav Halchenko debian at onerussian.com
Fri Oct 6 22:03:20 UTC 2017


On Thu, 05 Oct 2017, Lynda Lin wrote:
>    I've tried it 3 different ways and I'm getting different results for each
>    way so just wanted to know if any of these ways is valid:

sorry... just a quick reply, let's see if it would satisfy ;)

>    1) Using the manual split example from the tutorial and calling the
>    "training_stats" conditional attribute in the classifier
>    In the tutorial we can get the individual accuracies for each run through
>    cv_results.samples but I'm interested in the TPR (True Positive Rate) for
>    Ingroup and Outgroup separately so I'm looking to print the confusion
>    matrix to calculate these numbers

you could access the .stats dictionary of the CrossValidation instance
(cvte in your second snippet), which holds all of those values, e.g.
cvte.ca.stats.stats["TPR"]
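
In case it helps to see what a per-label TPR amounts to, here is a minimal
plain-Python sketch. It is not PyMVPA's own implementation; the function name
per_label_tpr and the toy targets/predictions are made up for illustration.

```python
from collections import Counter


def per_label_tpr(targets, predictions):
    """Return {label: TPR}, where TPR is the fraction of samples of that
    label which were predicted correctly."""
    totals = Counter(targets)
    hits = Counter(t for t, p in zip(targets, predictions) if t == p)
    return {label: hits[label] / float(totals[label]) for label in totals}


# Hypothetical labels and predictions, only to exercise the function
targets = ["ingroup", "ingroup", "ingroup", "outgroup", "outgroup", "outgroup"]
predictions = ["ingroup", "ingroup", "outgroup", "outgroup", "ingroup", "ingroup"]
tpr = per_label_tpr(targets, predictions)
```

Each value is just the diagonal entry of the confusion matrix divided by the
number of samples carrying that target label.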

>    ds_split1 = ds[ds.sa.chunks == 1.]
>    ds_split2 = ds[ds.sa.chunks == 2.]
>    clf = LinearCSVMC(enable_ca=['training_stats'])
>    clf.set_postproc(BinaryFxNode(mean_mismatch_error,'targets'))
>    clf.train(ds_split1)
>    err = clf(ds_split2)
>    clf.ca.training_stats.as_string(description=True)

these are training_stats, so they tell you how well the classifier fits the
training data (you can get them right after .train), not how well it predicts
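
A toy illustration of why the two can differ a lot. This is a sketch using a
1-nearest-neighbour rule on made-up one-dimensional data, not LinearCSVMC and
not PyMVPA code; predict_1nn, accuracy, run1 and run2 are hypothetical names.

```python
def predict_1nn(train, x):
    """Return the label of the training sample closest to x."""
    return min(train, key=lambda sample: abs(sample[0] - x))[1]


def accuracy(train, test):
    """Fraction of test samples whose label is predicted correctly."""
    hits = sum(1 for value, label in test if predict_1nn(train, value) == label)
    return hits / float(len(test))


# Made-up (feature value, label) pairs for two runs
run1 = [(0.1, "ingroup"), (0.4, "ingroup"), (0.6, "outgroup"), (0.9, "outgroup")]
run2 = [(0.2, "ingroup"), (0.55, "ingroup"), (0.45, "outgroup"), (0.8, "outgroup")]

train_acc = accuracy(run1, run1)  # fit: evaluated on the data it was trained on
test_acc = accuracy(run1, run2)   # prediction: evaluated on the held-out run
```

Here the fit on run1 is perfect while the accuracy on run2 is lower, which is
the distinction between training_stats and the stats on the testing data.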

>    2) Using the HalfPartitioner function's "count" argument

>    # The training_stats confusion matrix from this method doesn't match
>    # the one above
>    clf = LinearCSVMC(enable_ca=['training_stats'])
>    hpart = HalfPartitioner(count=1, attr='chunks')
>    cvte = CrossValidation(clf, hpart,
>                           errorfx=lambda p, t: np.mean(p == t),
>                           enable_ca=['stats'])
>    cv_results = cvte(ds)
>    cvte.ca.stats.as_string(description=True)

You could also just use CustomPartitioner([([1], [2])]),
which should generate a single split where the first partition has only
the samples with chunks value 1, and the second only those with value 2.
ca.stats should then be the appropriate thing to look at.
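
To make the idea of such a single split concrete, here is a plain-Python
sketch. It is not how PyMVPA implements partitioners; single_split and the
toy data are hypothetical, and dropping samples that belong to neither list
is an assumption of this sketch.

```python
def single_split(samples, chunks, train_chunks, test_chunks):
    """Split samples into one (train, test) pair by their chunk value."""
    train = [s for s, c in zip(samples, chunks) if c in train_chunks]
    test = [s for s, c in zip(samples, chunks) if c in test_chunks]
    return train, test


# Made-up samples with their chunk (run) labels
samples = ["a", "b", "c", "d", "e"]
chunks = [1, 1, 2, 2, 3]

# Train on chunk 1 only, test on chunk 2 only; chunk 3 is used for neither
train, test = single_split(samples, chunks, [1], [2])
```

Because there is exactly one split, nothing gets averaged across folds and
the confusion matrix describes predictions on chunk 2 alone.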

is your dataset balanced in terms of the number of samples per label?
What is the output of ds.summary()?
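
If you want to check the balance by hand, counting samples per label within
each chunk is enough. A minimal sketch in plain Python, with the helper name
samples_per_label and the toy targets/chunks made up for illustration:

```python
from collections import Counter


def samples_per_label(targets, chunks):
    """Return {chunk: Counter({label: number of samples})}."""
    counts = {}
    for target, chunk in zip(targets, chunks):
        counts.setdefault(chunk, Counter())[target] += 1
    return counts


# Hypothetical per-sample labels and chunk (run) assignments
targets = ["ingroup", "ingroup", "outgroup", "ingroup", "outgroup", "outgroup"]
chunks = [1, 1, 1, 2, 2, 2]
counts = samples_per_label(targets, chunks)
```

Unequal counts within a chunk mean that plain accuracy can be dominated by
the more frequent label, so per-label TPR is the safer number to report.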

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        


