[pymvpa] Zscore creates nan in dataset

Yaroslav Halchenko debian at onerussian.com
Fri Jan 30 17:44:36 UTC 2015

On Fri, 30 Jan 2015, gal star wrote:

>    Hello,
>    Unfortunately it did not remove the nan's vectors.
>    I've also tried to remove those specific scans from the samples list,
>    It still created nan vectors out of different scans.
>    Do you have any idea what can i do to remove/avoid those from the dataset?

We "recently" added a remove_nonfinite_features function,
which does just that:

import numpy as np

def remove_nonfinite_features(dataset):
    """Returns a new dataset with all non-finite (NaN, Inf) features removed

    Removes all features for which not all values are finite

    Parameters
    ----------
    dataset : Dataset
        Input dataset

    Returns
    -------
    finite_dataset : Dataset
        Dataset based on data from the input, but only the features
        for which all samples are finite are kept.
    """
    return dataset[:, np.all(np.isfinite(dataset.samples), axis=0)]
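
To see what that column mask does without a PyMVPA dataset at hand, here is a small sketch on a plain NumPy array. The toy array and the variable names (samples, finite_mask, finite_samples) are made up for illustration; only the np.all(np.isfinite(...), axis=0) expression comes from the function above.

```python
import numpy as np

# Toy samples matrix: 3 samples (rows) x 4 features (columns).
# Column 1 contains a NaN and column 2 contains an Inf.
samples = np.array([
    [1.0, np.nan, 3.0, 4.0],
    [2.0, 5.0, np.inf, 4.5],
    [3.0, 6.0, 7.0, 5.0],
])

# True for every column in which all values are finite.
finite_mask = np.all(np.isfinite(samples), axis=0)

# Boolean indexing on the second axis keeps only those columns.
finite_samples = samples[:, finite_mask]

print(finite_mask)
print(finite_samples.shape)
```

A single non-finite value is enough to drop the whole feature, so it is worth checking how many features survive before going on.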

so just do

fds = remove_invariant_features(fds)
fds = remove_nonfinite_features(fds)  # shouldn't even be needed unless you had NaNs from the beginning somehow
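
As a rough illustration of why the invariant features matter, here is a plain NumPy sketch. It is not PyMVPA's implementation: naive_zscore and drop_invariant_columns are hypothetical helpers written only for this example, and the assumption is simply that a feature with zero standard deviation ends up as 0/0 when z-scored without a guard.

```python
import numpy as np

def naive_zscore(samples):
    """Column-wise z-score with no guard against zero variance."""
    mean = samples.mean(axis=0)
    std = samples.std(axis=0)
    # Silence the divide warnings; 0/0 still evaluates to NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (samples - mean) / std

def drop_invariant_columns(samples):
    """Keep only the columns whose values are not all identical."""
    return samples[:, samples.std(axis=0) > 0]

# The middle feature is constant across all samples.
samples = np.array([
    [1.0, 7.0, 2.0],
    [2.0, 7.0, 4.0],
    [3.0, 7.0, 9.0],
])

z_raw = naive_zscore(samples)                             # middle column is NaN
z_clean = naive_zscore(drop_invariant_columns(samples))   # all values finite
```

Dropping the constant column before z-scoring leaves nothing to divide by zero, which is the reason for the order of the two calls above.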

>    My code looks like so:
>    attr = SampleAttributes(os.path.join(source, map_name))
>    fds = fmri_dataset(samples=os.path.join(source,map_name),
>    targets=attr.targets, chunks=attr.chunks)
>    int = numpy.array([l in ['221','211','23'] for l in fds.sa.targets])
>    fds = fds[int]
>    zscore(fds, param_est=('targets',['23']), chunks_attr='chunks')
