[pymvpa] Q: Time-offset of category labels in Haxby 2001 dataset. Was this ever properly resolved?

Yaroslav Halchenko debian at onerussian.com
Tue Mar 8 17:12:01 UTC 2011


Hi Raj,

sorry -- I cannot recall exactly why there is a mismatch between the 121
physically present volumes per run that I got from Jim some time ago (about
5-6 years) and the 120 that the paper's description suggests (besides the
fact that, for a run length of 300 sec, you would need to acquire 1 more
volume if you wanted to capture activation at t=300 sec, since collection
for the previous volume ends at 299.something).

You can of course try asking Jim as the ultimate source of information ;)

The NIfTI files we distribute are a reconstruction of the sequence from a set
of Analyze file pairs plus text files listing which Analyze volumes belonged
to which category.  In those text files, each condition block had only 7
volumes listed (7*2.5 = 17.5 sec), so my blunt guess was that Jim had
discarded 1 volume from each side of the block.  Those volumes were appended
back to each block while I was reconstructing the stimulus sequence, hoping
to get closer to the original one, which could not be regular in terms of
volumes anyway, since 24 sec (condition) + 12 sec (rest) = 36 sec =
14 volumes * 2.5 sec (TR) + 1 sec.  That is why in labels.txt you see a
varying number of volumes (6 or 5) for the 'rest' condition.
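
To illustrate that irregularity, here is a minimal sketch (not the
reconstruction script I used; the condition order below is arbitrary) that
labels each of 120 volumes of a hypothetical run by whichever condition is
on at the volume's onset time, under the paper's timing.  However you slice
it, the block lengths in volumes come out uneven:

# label volumes of one hypothetical run purely by stimulus timing
# (initial 12 s rest, then 8 x [24 s block + 12 s rest]), TR = 2.5 s
from itertools import groupby

TR = 2.5
conditions = ['scissors', 'face', 'cat', 'shoe',
              'house', 'scrambledpix', 'bottle', 'chair']  # arbitrary order

events = [(0.0, 12.0, 'rest')]              # (onset, duration, label) in sec
t = 12.0
for cond in conditions:
    events.append((t, 24.0, cond)); t += 24.0
    events.append((t, 12.0, 'rest')); t += 12.0

def label_at(time):
    for onset, dur, lab in events:
        if onset <= time < onset + dur:
            return lab
    return 'rest'

labels = [label_at(i * TR) for i in range(120)]
for lab, grp in groupby(labels):            # run-length counts, like uniq -c
    print(len(list(grp)), lab)

The actual labels.txt counts, for comparison: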

$> uniq -c ~pymvpa/pymvpa/datadb/haxby2001/subj1/labels.txt | head -10
      1 labels chunks
      6 rest 0
      9 scissors 0
      6 rest 0
      9 face 0
      5 rest 0
      9 cat 0
      5 rest 0
      9 shoe 0
      5 rest 0

and the first rest block in each chunk having 6 volumes (I think).  So, if we
forget for a moment that we got 121 volumes instead of 120, the initial rest
block should indeed have been allocated 5 volumes (although once again not
exactly, since 12/2.5 = 4.8) instead of 6.  Thus you could probably indeed
consider all labels in the dataset to be shifted by 2 volumes, and I should
not have added 1 volume on both sides of each stimulus block, but rather 2 at
the front, if I was after "precise" timing in terms of volumes.  But then you
would obtain 7 rest volumes at the end of each run (the 121st volume?).  That
would be my speculation at the moment.
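
If one did decide to compensate for such a shift, the mechanics are simple
enough; here is a minimal sketch (shift_labels is just an illustrative
helper, not a PyMVPA function, and the default shift of 2 volumes is only
the speculation above, not a verified value):

# move the labels of each run (chunk) earlier by `nshift` volumes,
# padding the vacated tail of each run with 'rest'
import numpy as np

def shift_labels(labels, chunks, nshift=2, fill='rest'):
    labels = np.asarray(labels, dtype=object).copy()
    chunks = np.asarray(chunks)
    for chunk in np.unique(chunks):
        idx = np.where(chunks == chunk)[0]
        # volume i now gets the label that used to sit at volume i + nshift
        labels[idx] = np.concatenate((labels[idx][nshift:],
                                      np.array([fill] * nshift, dtype=object)))
    return labels

# e.g. with the two columns of labels.txt (skipping the header line):
#   labels, chunks = np.loadtxt('labels.txt', dtype=str, skiprows=1).T
#   shifted = shift_labels(labels, chunks)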

hope this helps

Cheers,
Yarik

On Sun, 06 Mar 2011, Rajeev Raizada wrote:

> Dear PyMVPA folks,

> A while ago on this list there was some discussion
> of the question of whether the category-labels
> attached to each TR in the Haxby 2001 data-set:
> http://dev.pymvpa.org/datadb/haxby2001.html
> correspond to either:
> 1. The times at which the stimuli were presented to the subjects
> or
> 2. The times at which the subjects' HRFs are showing responses to those stimuli.

> Here's the final element in that discussion thread:
> http://lists.alioth.debian.org/pipermail/pkg-exppsy-pymvpa/2009q1/000321.html

> Was this question ever properly resolved?

> I am planning to submit a paper with some new analyses of this dataset,
> so I want to make extra-sure that I am using the right labels.

> Comparing the dataset with the methods-description
> in the 2001 Science paper itself, here's what I can glean so far:

> The data has a TR of 2.5s,
> with 8 stimulus-blocks of 24s each, and 9 rest periods of 12s each
> (one rest period after each block, and another one at the start of each run).

> That makes each run last 300s:
> Blocks = 8*24 = 192s
> Rest = 9*12 = 108s

> 300secs is 120 TRs:
> 300 / 2.5 = 120

> However, in the dataset at
> http://data.pymvpa.org/datasets/haxby2001
> each run has 121 TRs.

> In each of those runs, the first 6 TRs are labeled as rest,
> which makes a 6*2.5 = 15secs long rest period.
> However, as mentioned above, the 2001 paper says that
> each rest period, including the one at the start of each run, was 12s.
> 12 seconds would actually be 4.8 TRs.

> This suggests to me (but it certainly doesn't guarantee it!)
> that the downloadable data in effect have one rest-period TR
> pre-attached at the beginning of each run,
> making, in effect, a time-offset of 2.5s.

> When taking into account the HRF-delay in assigning category-labels to TRs,
> I personally code it such that a category gets listed as occurring when the
> HRF-convolved stimulus-time-series for that category is above its
> run-mean value.

> For a TR of 2.5s, this ends up in effect being an offset of one TR.
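
For concreteness, here is a minimal sketch of that kind of labeling rule
(assuming a simple two-gamma HRF and a single category's boxcar; this is not
Raj's actual code and the HRF parameters are illustrative only):

import numpy as np
from scipy.stats import gamma

TR = 2.5
n_vols = 120

# simple two-gamma HRF sampled at the TR (peak around 5-6 s, small
# undershoot); the shape values here are illustrative, not fitted
hrf_t = np.arange(0, 32, TR)
hrf = gamma.pdf(hrf_t, 6) - 0.35 * gamma.pdf(hrf_t, 16)

t = np.arange(n_vols) * TR
# boxcar for one category, e.g. a single 24 s block starting at t = 12 s
boxcar = ((t >= 12) & (t < 36)).astype(float)

# convolve with the HRF and keep the causal part of the run
convolved = np.convolve(boxcar, hrf)[:n_vols]
active = convolved > convolved.mean()   # TRs labeled as this category
print(np.where(active)[0])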

> So, my current guess is that the fact that the downloadable Haxby data
> appear to have one extra label of rest inserted at the start of each run
> means that, in effect, the labels already have the correct offsets
> for corresponding to evoked BOLD responses, rather than raw stimulus-onsets.

> I am not sure about this, though, and would love to hear other
> people's thoughts.
> In particular, one key question is left unaddressed:
> the 2001 Science paper says that each run had 120 TRs in it,
> but the downloadable data has 121 TRs in each run.
> Where did the extra TR's worth of data come from?
> The best-case scenario would be that the first TR in each run,
> which gets assigned the label "rest" and which in effect offsets all
> the label-times,
> is a dummy-scan volume that was collected 2.5s before the
> moment that counted as t=0 for the stimulus-presentation timings.
> However, I have no idea if that is the case or not.

> Any help greatly appreciated.

> Raj



-- 
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic


