[pymvpa] RFE and dataset splits
Kimberly Zhou
kyqzhou at gmail.com
Wed Jun 22 18:11:02 UTC 2011
Hi All,
Has anyone had experience with RFE in PyMVPA 0.6.x? I am still trying
to figure out RFE, and it seems I must be missing something. Here is
part of the debug output while RFE is in progress:
[RFEC] DBG: Step 0: nfeatures=135168
[RFEC] DBG: Step 0: nfeatures=135168 error=0.5000 best/stop=1/0
[RFEC] DBG: Step 1: nfeatures=67584
[RFEC] DBG: Step 1: nfeatures=67584 error=0.5000 best/stop=0/0
[RFEC] DBG: Step 2: nfeatures=33792
[RFEC] DBG: Step 2: nfeatures=33792 error=0.5000 best/stop=0/0
[RFEC] DBG: Step 3: nfeatures=16896
[RFEC] DBG: Step 3: nfeatures=16896 error=0.5000 best/stop=0/0
[RFEC] DBG: Step 4: nfeatures=8448
[RFEC] DBG: Step 4: nfeatures=8448 error=0.5000 best/stop=0/0
[RFEC] DBG: Step 5: nfeatures=4224
[RFEC] DBG: Step 5: nfeatures=4224 error=0.5000 best/stop=0/0
[RFEC] DBG: Step 6: nfeatures=2112
[RFEC] DBG: Step 6: nfeatures=2112 error=0.5000 best/stop=0/0
[RFEC] DBG: Step 7: nfeatures=1056
[RFEC] DBG: Step 7: nfeatures=1056 error=0.5000 best/stop=0/0
[RFEC] DBG: Step 8: nfeatures=528
[RFEC] DBG: Step 8: nfeatures=528 error=0.5000 best/stop=0/0
[RFEC] DBG: Step 9: nfeatures=264
[RFEC] DBG: Step 9: nfeatures=264 error=0.5000 best/stop=0/0
[RFEC] DBG: Step 10: nfeatures=132
[RFEC] DBG: Step 10: nfeatures=132 error=0.5000 best/stop=0/1
...this repeats 24 times (I have 24 runs/chunks, each with two targets). The
main thing I am confused about is why the error is 0.5 at every step.
Shouldn't it sometimes get both targets right or both wrong (i.e., an error
of 0 or 1)?
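To make that expectation concrete, here is a minimal NumPy sketch (not PyMVPA code) of the mean mismatch error on a test set with one sample per target. The helper name `mismatch_error` and the labels 'A' and 'B' are made up for illustration; the sketch only assumes that the error is the fraction of wrong predictions.

```python
import numpy as np


def mismatch_error(predictions, targets):
    """Fraction of predictions that differ from the targets.

    Hypothetical stand-in for a mean mismatch error; not the PyMVPA function.
    """
    return np.mean(np.asarray(predictions) != np.asarray(targets))


# One held-out chunk with two samples, one per target (labels are invented).
targets = ['A', 'B']

cases = {
    'both right': ['A', 'B'],
    'one right': ['A', 'A'],   # same label predicted for both samples
    'both wrong': ['B', 'A'],
}

errors = dict((name, float(mismatch_error(pred, targets)))
              for name, pred in cases.items())

for name in sorted(errors):
    print("%-10s error=%.4f" % (name, errors[name]))
```

With only two test samples the error can take just three values: 0, 0.5, or 1. Note that predicting the same label for both samples gives exactly 0.5, whichever label is chosen.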
Perhaps the code might help?
rfesvm_split = LinearCSVMC()
debug.active = ['RFEC']
fs = \
    RFE(rfesvm_split.get_sensitivity_analyzer(),
        ProxyMeasure(rfesvm_split,
                     postproc=BinaryFxNode(mean_mismatch_error, 'targets')),
        Splitter('chunks'),
        fselector=FractionTailSelector(
            0.50,
            mode='select', tail='upper'),
        stopping_criterion=NBackHistoryStopCrit(BestDetector(), 10),
        update_sensitivity=True)
clf = FeatureSelectionClassifier(
    LinearCSVMC(),
    # on features selected via RFE
    fs)
# update sensitivity at each step (since we're not using the
# same CLF as sensitivity analyzer)
#cv = SplitClassifier(clf)
cvte = CrossValidation(clf, NFoldPartitioner(),
                       errorfx=lambda p, t: np.mean(p == t),
                       postproc=mean_sample(),
                       enable_ca=['confusion', 'stats'])
cv_results = cvte(avgds)
print np.mean(cv_results)
print cvte.ca.stats.matrix
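For reference, the feature counts and the best/stop flags in the debug output above can be reproduced with a small standalone sketch. This is not PyMVPA code: `halving_schedule` is a hypothetical helper, and it assumes that the selector keeps half of the features at each step and that the stopping rule fires once 10 steps have passed since the best (lowest) error, which is what the log suggests.

```python
def halving_schedule(n_features, fraction=0.5, patience=10, error=0.5):
    """Simulate the elimination steps under the assumptions stated above.

    Hypothetical helper; the error is held constant at `error` to mirror
    the log, where every step reports 0.5000.
    Returns a list of (step, n_features, is_best, stop) tuples.
    """
    steps = []
    best_step, best_error = 0, None
    step = 0
    while n_features >= 1:
        # A step is "best" only if it strictly improves on the best so far.
        is_best = best_error is None or error < best_error
        if is_best:
            best_step, best_error = step, error
        # Assumed stopping rule: `patience` steps without a new best.
        stop = (step - best_step) >= patience
        steps.append((step, n_features, is_best, stop))
        if stop:
            break
        n_features = int(n_features * fraction)
        step += 1
    return steps


steps = halving_schedule(135168)
for step, n, is_best, stop in steps:
    print("Step %d: nfeatures=%d best/stop=%d/%d" % (step, n, is_best, stop))
```

Under these assumptions the run ends at step 10 with 132 features only because the error never improves on step 0, not because a good feature set was found.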
I would greatly appreciate any ideas! Thank you!
Kimberly Zhou