[Debian-med-packaging] [devteam-bioc] Binary (rda) files in source tarball of BioC graph

Hervé Pagès hpages at fhcrc.org
Fri Aug 30 18:09:42 UTC 2013


Hi,

A little bit of background: .rda files are created in R with the save()
function. This function has an 'ascii' argument that is set to FALSE by
default. AFAIK almost nobody uses 'ascii=TRUE' because there is no
benefit in doing that (the .rda file will be bigger and will then take
longer to load with load()). But from a content point-of-view, binary
and ascii representations are equivalent. Furthermore, you can always
switch from one representation to the other with no loss of information:
just load the dataset with load() and re-save it with whatever value of
'ascii' you wish (TRUE or FALSE).
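
For illustration, the load-and-re-save round trip described above could
look roughly like the sketch below. The object name 'toy' and the
temporary file names are made up for the example; nothing here comes
from the graph package itself.

```r
# Hypothetical dataset and file names, for illustration only.
toy <- data.frame(id = 1:3, label = c("a", "b", "c"))

bin_file <- tempfile(fileext = ".rda")
asc_file <- tempfile(fileext = ".rda")

# Default behaviour: binary representation (ascii = FALSE).
save(toy, file = bin_file)

# Load into a separate environment so nothing in the workspace is
# overwritten; load() returns the names of the restored objects.
env <- new.env()
objs <- load(bin_file, envir = env)

# Re-save the same objects, this time in the ascii representation.
save(list = objs, file = asc_file, envir = env, ascii = TRUE)

# Reload the ascii file and compare with the original object.
env2 <- new.env()
load(asc_file, envir = env2)
same <- identical(env$toy, env2$toy)
print(same)
```

Going the other way (ascii to binary) would be the same steps with
ascii = FALSE in the second save() call.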

One last note: the "ascii" representation is maybe a little bit more
user-friendly than the "binary" one but not much. Maybe you'll be able
to open it in an editor but it will probably still be a little bit too
cryptic for people not familiar with the RDA format to be able to read
and understand.
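
To get a feel for what the "ascii" representation looks like, one could
print the first few lines of such a file as plain text. This is only a
sketch: the object 'toy' and the temporary file are hypothetical.

```r
# Hypothetical object and temporary file, for illustration only.
toy <- c(a = 1L, b = 2L)
asc_file <- tempfile(fileext = ".rda")
save(toy, file = asc_file, ascii = TRUE)

# Read the beginning of the file as ordinary text lines and print them.
head_lines <- readLines(asc_file, n = 10)
cat(head_lines, sep = "\n")
```

The output is readable text, but mostly a sequence of type codes and
values that still needs knowledge of the format to interpret.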

HTH,
H.



On 08/29/2013 02:05 PM, Maintainer wrote:
> [Put ftpmaster in CC]
> Hi Martin,
>
> thanks for your quick and helpful response.
>
> On Thu, Aug 29, 2013 at 01:19:52PM -0700, Martin Morgan wrote:
>> Hi Andreas --
>>
>> On 08/29/2013 05:42 AM, Maintainer wrote:
>>> Hello,
>>>
>>> I'm writing you on behalf of the Debian Med team that tries to package
>>> software for biologists and medical care straight into Debian and thus
>>> BioConductor is one of our targets.  While we just have some
>>> BioConductor modules included we need several more and I'm now busy
>>> packaging graph[1].  I noticed that inside the data/ dir some *.rda
>>> files are stored.  Recently on the Debian list some
>>> flamew^Wdiscussion[2] happened whether we should regard this as
>>> acceptable as "human editable source" or just chunks of binary code
>>> without source.
>>
>>> You as R experts might have some opinion on it and I will not question
>>> this.  However, it would make the process of integration into Debian way
>>> more smooth if you could deliver some ASCII representation of these data
>>> as well as some recipe to create the according *.rda files.  I admit I'm
>>> not very educated in R - I just was told this is possible and easy but
>>> my personal R knowledge is just about the fact that it is pretty easy to
>>> create Debian packages and so I'm doing this - sorry for my ignorance.
>>
>> The serialized R objects can be input and manipulated in R by
>> humans, I guess in the same way that png files are read by image
>> viewers.
>>
>> In general, data objects in Bioconductor packages are complicated --
>> not simple tables, but highly coordinated data structures. They have
>> diverse origins, and the binary representation offers benefits to
>> users and to our build and distribution channels; many are used to
>> test or illustrate (in vignettes or man page examples) package
>> functionality. It's not logistically feasible for us to provide
>> ASCII representations of these objects.
>>
>> It's not escaped our notice that binary files are not a good engineering solution!
>>
>> I hope that provides some context,
>
> Yes, it does.  Thanks for the hint.  I tried to document this issue on a
> newly created Wiki page
>
>     https://wiki.debian.org/GNU_R
>
> which hopefully might be helpful for future R packagers.
>
> Ftpmasters, if you need some further information / clarification I
> should put on the Wiki page, please let me know.
>
> Kind regards and thanks again for your insight
>
>      Andreas.
>
>> Martin
>>
>>> So if you would be able to do us a favour and provide something our
>>> ftpmaster could regard as "human editable source" this would really help
>>> our effort to bring BioConductor straight into Debian.
>>>
>>> Kind regards and thanks for providing the free BioConductor suite
>>>
>>>        Andreas.
>>>
>>> [1] http://bioconductor.org/packages/2.12/bioc/html/graph.html
>>> [2] https://lists.debian.org/debian-devel/2013/08/msg00069.html
>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


