[pymvpa] optimal way of loading the whole-brain data

Dmitry Smirnov dmi.smirnov07 at gmail.com
Wed May 7 09:42:39 UTC 2014


Hi and thanks for the tip, Yaroslav!

Indeed, when I gzipped the images, it took a few minutes to load the whole
thing:

[DS_] DBG{0.000 sec}:      Duplicating samples shaped (350, 91, 109, 91)
[DS_] DBG{0.001 sec}:      Create new dataset instance for copy
[DS_] DBG{0.000 sec}:      Return dataset copy (ID: 150670160) of source (ID: 59680656)
[DS_] DBG{4.083 sec}:   Selecting feature/samples of (350, 902629)
[DS_] DBG{2.064 sec}:   Selected feature/samples (350, 902629)
[DS_] DBG{0.423 sec}:  Selecting feature/samples of (350, 228483)
[DS_] DBG{0.000 sec}:  Selected feature/samples (350, 228483)
[DS_] DBG{59.208 sec}:      Duplicating samples shaped (350, 91, 109, 91)
[DS_] DBG{0.000 sec}:      Create new dataset instance for copy
[DS_] DBG{0.000 sec}:      Return dataset copy (ID: 121150032) of source (ID: 59680656)
[DS_] DBG{4.123 sec}:   Selecting feature/samples of (350, 902629)
[DS_] DBG{1.985 sec}:   Selected feature/samples (350, 902629)
[DS_] DBG{0.309 sec}:  Selecting feature/samples of (350, 228483)
[DS_] DBG{0.000 sec}:  Selected feature/samples (350, 228483)
[DS_] DBG{57.703 sec}:      Duplicating samples shaped (350, 91, 109, 91)
[DS_] DBG{0.000 sec}:      Create new dataset instance for copy
[DS_] DBG{0.000 sec}:      Return dataset copy (ID: 150670160) of source (ID: 121093840)
[DS_] DBG{4.384 sec}:   Selecting feature/samples of (350, 902629)
[DS_] DBG{2.056 sec}:   Selected feature/samples (350, 902629)
[DS_] DBG{0.293 sec}:  Selecting feature/samples of (350, 228483)
[DS_] DBG{0.000 sec}:  Selected feature/samples (350, 228483)
[DS_] DBG{57.575 sec}:      Duplicating samples shaped (350, 91, 109, 91)
[DS_] DBG{0.000 sec}:      Create new dataset instance for copy
[DS_] DBG{0.000 sec}:      Return dataset copy (ID: 150670160) of source (ID: 121093840)
[DS_] DBG{4.094 sec}:   Selecting feature/samples of (350, 902629)
[DS_] DBG{2.273 sec}:   Selected feature/samples (350, 902629)
[DS_] DBG{0.384 sec}:  Selecting feature/samples of (350, 228483)
[DS_] DBG{0.000 sec}:  Selected feature/samples (350, 228483)
[DS_] DBG{62.976 sec}:      Duplicating samples shaped (350, 91, 109, 91)
[DS_] DBG{0.000 sec}:      Create new dataset instance for copy
[DS_] DBG{0.000 sec}:      Return dataset copy (ID: 121150032) of source (ID: 59680656)
[DS_] DBG{4.122 sec}:   Selecting feature/samples of (350, 902629)
[DS_] DBG{2.143 sec}:   Selected feature/samples (350, 902629)
[DS_] DBG{0.353 sec}:  Selecting feature/samples of (350, 228483)
[DS_] DBG{0.000 sec}:  Selected feature/samples (350, 228483)

A follow-up question: if nibabel memory-maps uncompressed images by
default, how can I work around that behavior? I don't want to gzip all my
data, and in my current context I'd like to load the whole dataset into
memory at once. Is there a simple way to do it?

Thank you in advance for any suggestion!

Best, Dima




2014-05-06 18:08 GMT+03:00 Yaroslav Halchenko <debian at onerussian.com>:

>
> On Tue, 06 May 2014, Dmitry Smirnov wrote:
>
> >    Hi Nick,
> >    Thanks for advice!
>
> >    I'm using a server with 256 GB of RAM, while the data in 5 runs would
> >    be about 12.5 GB altogether.
> >    Each run is 2.5 GB, dimensions: 91x109x91x350
> >    The problem is still there: I've adjusted the code after your reply
> >    and ran it immediately, and it is still running.
>
> 1. I wonder if that is an effect of the memory mapping that nibabel does
> when the original files are uncompressed .nii
>
> 2. It might be worth timing such a run in more detail, e.g.:
>
> $> MVPA_DEBUG=DS.* MVPA_DEBUG_METRICS=reltime nosetests -s -v mvpa2/tests/test_niftidataset.py
> [DS_     ] DBG{0.000 sec}:            Binding function save to AttrDataset class
> T: MVPA_SEED=948286987
> T: Skipping testing of all dependencies since verbosity (MVPA_TESTS_VERBOSITY) is too low
> Basic testing of NiftiDataset ... [DS_             ] DBG{0.971 sec}:               Duplicating samples shaped (2, 128, 96, 24)
> [DS_             ] DBG{0.006 sec}:                     Create new dataset instance for copy
> [DS_             ] DBG{0.001 sec}:                     Return dataset copy #70317136 of source #68617232
> [DS_             ] DBG{0.706 sec}:                Duplicating samples shaped (2, 294912)
> [DS_             ] DBG{0.001 sec}:                Create new dataset instance for copy
> [DS_             ] DBG{0.007 sec}:                Return dataset copy #70479632 of source #70317136
> [DS_             ] DBG{0.045 sec}:                     Duplicating samples shaped (2, 128, 96, 24)
> [DS_             ] DBG{0.001 sec}:                     Create new dataset instance for copy
> [DS_             ] DBG{0.001 sec}:                     Return dataset copy #68616784 of source #70355792
> [DS_             ] DBG{1.264 sec}:                  Selecting feature/samples of (2, 294912)
> [DS_             ] DBG{0.006 sec}:                  Selected feature/samples (2, 294912)
> ...
>
> Here reltime is the time elapsed since the previous debug message was
> printed.
>
> In code, you could enable it with:
>
> mvpa2.debug.active += ['DS_']
> mvpa2.debug.metrics += ['reltime']
>
> --
> Yaroslav O. Halchenko, Ph.D.
> http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
> Research Scientist,            Psychological and Brain Sciences Dept.
> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
> Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
> WWW:   http://www.linkedin.com/in/yarik



-- 

Dmitry Smirnov (MSc.)
PhD Candidate, Brain & Mind Laboratory <http://becs.aalto.fi/bml/>
BECS, Aalto University School of Science
00076 AALTO, FINLAND
mobile: +358 50 3015072
email: dmitry.smirnov at aalto.fi