[pymvpa] Zscore creates nan in dataset

Sun Feb 1 14:14:52 UTC 2015

Hello all,
I found the problem. I am using zscore to normalize relative to a baseline
condition. By default, zscore is applied per chunk.

Some of my pre-defined chunks contained no baseline scans, so the
normalization parameters could not be estimated for those chunks and their
samples became nan vectors.
Adding baseline samples to every chunk solved it.
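
For anyone who runs into the same thing, here is a minimal NumPy-only sketch
of the failure mode. It is not PyMVPA's zscore implementation; the helper
names (zscore_by_chunk, chunks_without_baseline) and the toy data are made up
for illustration, and I am assuming the per-chunk parameters come only from
the baseline samples, as with param_est=('targets', ['23']).

```python
import numpy as np


def zscore_by_chunk(samples, targets, chunks, baseline):
    """Z-score each chunk using the mean/std of that chunk's baseline samples.

    Illustrative stand-in only, not PyMVPA's zscore.
    """
    out = np.empty_like(samples, dtype=float)
    for chunk in np.unique(chunks):
        in_chunk = chunks == chunk
        base = samples[in_chunk & (targets == baseline)]
        if len(base) == 0:
            # No baseline samples in this chunk: mean and std are undefined,
            # so every sample of the chunk ends up as nan.
            out[in_chunk] = np.nan
            continue
        out[in_chunk] = (samples[in_chunk] - base.mean(axis=0)) / base.std(axis=0)
    return out


def chunks_without_baseline(targets, chunks, baseline):
    """Return the chunk labels that contain no baseline sample."""
    return [c for c in np.unique(chunks)
            if not np.any(targets[chunks == c] == baseline)]


# Toy data (hypothetical): chunk 0 has baseline ('23') samples, chunk 1 has none.
samples = np.array([[1., 2.], [3., 6.], [5., 4.], [2., 2.], [4., 8.]])
targets = np.array(['23', '23', '221', '211', '221'])
chunks = np.array([0, 0, 0, 1, 1])

z = zscore_by_chunk(samples, targets, chunks, '23')
missing = chunks_without_baseline(targets, chunks, '23')
```

Running a check like chunks_without_baseline on the targets and chunks before
calling zscore would have pointed me to the offending chunks right away.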

Thanks for your support!

Cheers,
Gal Star

On Sun, Feb 1, 2015 at 12:09 PM, gal star <gal.star3051 at gmail.com> wrote:

> Hi,
> This does not work in my case, since the dataset contains samples
> that are vectors filled with nan values (so this function returns
> a dataset with nfeatures == 0, and an exception is thrown later on).
>
> Also, I've noticed that:
> -  In my case there are no nan values or invariant variables in the
>    dataset before zscore is executed.
> -  After zscore, all test samples in the dataset become vectors filled
>    with nan.
>
> What else could cause the test samples to turn into nan vectors after
> zscore?
> Could it have to do with the param_est I'm using?
>
> Thanks,
> Gal Star
>
> On Fri, Jan 30, 2015 at 7:44 PM, Yaroslav Halchenko <debian at onerussian.com> wrote:
>
>>
>> On Fri, 30 Jan 2015, gal star wrote:
>>
>> >    Hello,
>> >    Unfortunately it did not remove the nan vectors.
>> >    I've also tried removing those specific scans from the samples list,
>> >    but it still created nan vectors out of different scans.
>> >    Do you have any idea what I can do to remove/avoid those in the
>> >    dataset?
>>
>> we "recently" added the remove_nonfinite_features function
>> https://github.com/PyMVPA/PyMVPA/blob/HEAD/mvpa2/datasets/miscfx.py#L58
>> which is just that:
>>
>> import numpy as np
>>
>> def remove_nonfinite_features(dataset):
>>     """Returns a new dataset with all non-finite (NaN, Inf) features removed
>>
>>     Removes all features for which not all values are finite
>>
>>     Parameters
>>     ----------
>>     dataset : Dataset
>>         Input dataset
>>
>>     Returns
>>     -------
>>     finite_dataset : Dataset
>>         Dataset based on data from the input, but only the features
>>         for which all samples are finite are kept.
>>     """
>>
>>     # Boolean mask over features: True where every sample is finite
>>     return dataset[:, np.all(np.isfinite(dataset.samples), axis=0)]
>>
>> so just do
>>
>> fds = remove_invariant_features(fds)
>> # shouldn't even be needed unless you had NaNs from the beginning somehow
>> fds = remove_nonfinite_features(fds)
>>
>> >    My code looks like this:
>> >    attr = SampleAttributes(os.path.join(source, map_name))
>> >    fds = fmri_dataset(samples=os.path.join(source, map_name),
>> >                       targets=attr.targets, chunks=attr.chunks)
>> >    mask = numpy.array([l in ['221', '211', '23'] for l in fds.sa.targets])
>> >    fds = fds[mask]
>> >    zscore(fds, param_est=('targets', ['23']), chunks_attr='chunks')
>> --
>> Yaroslav O. Halchenko, Ph.D.
>> http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
>> Research Scientist,            Psychological and Brain Sciences Dept.
>> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
>> Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
>> WWW:   http://www.linkedin.com/in/yarik
>>
>
>