[Pkg-puppet-devel] Bug#1070744: /usr/bin/puppet: puts non-regeneratable data in /var/cache

Jérôme Charaoui jerome at riseup.net
Wed Sep 4 15:35:57 BST 2024


Hello,

On Wed, 8 May 2024 11:20:47 +0200 Hendrik Jaeger 
<debian-bugs at henk.geekmail.org> wrote:
> Package: puppet-agent
> Version: 7.23.0-1
> Severity: minor
> File: /usr/bin/puppet
> X-Debbugs-Cc: debian-bugs at henk.geekmail.org
> 
> Dear Maintainer,
> 
>    * What led up to the situation?
> 
> I was trying to build an exclude list for my backups and went through the content of my filesystems.
> 
>    * What was the outcome of this action?
> 
> I noticed that there are reports of puppet runs in /var/cache/puppet/reports.
> 
>    * What outcome did you expect instead?
> 
> I did expect all data in /var/cache and its subdirectories to be regeneratable and not contain any information one might want to back up.
> According to the FHS in https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s05.
> > /var/cache is intended for cached data from applications. Such data is locally generated as a result of time-consuming I/O or calculation. The application must be able to regenerate or restore the data.
> 
> This is not the case for reports:
> Puppet cannot regenerate the report for a specific run.
> Also, "cache" usually refers to data that will be reused, which is not the case for these reports.
> /var/log seems a better fit for those.
> 
> In my concrete case, it seems suboptimal that these reports are in a directory that I would like to exclude from backups because it should not contain anything worth backing up anyway as all data in there is supposed to be regeneratable and these reports clearly are not.
> Under the "Rationale" this use case is even mentioned explicitly:
> > The existence of a separate directory for cached data allows system administrators to set different disk and backup policies from other directories in /var.
> 
> The argument has been made on IRC that reports are usually not stored locally anyway, but it seemed implied that the server would also store the reports in a directory named "cache", albeit outside the FHS in /opt/puppetlabs/puppet/cache/reports in the case of a non-Debian installation. I have no puppetserver installation with Debian on hand, so I don’t know how the Debian package would behave.
> 
> Another argument has been made that the reports are stored in puppetdb and the reports are thus only stored temporarily as files on a disk. IMHO that still wouldn’t make them "cache" data. "temporary" data maybe, so in that case they should probably go to /var/tmp or /tmp.
> Or, as https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s14.html mentions:
> > /var/spool contains data which is awaiting some kind of later processing. Data in /var/spool represents work to be done in the future (by a program, user, or administrator); often data is deleted after it has been processed.
> 
> Both of these arguments are kind of OK for a certain set of circumstances but not everybody is running a puppetdb or even a puppetserver. I am running puppet standalone, i.e. with `puppet apply`, so the reports will not be transferred to the server and will not be consumed into/by puppetdb.
> 
> In any case, treating reports as "cached" data seems quite clearly wrong.
> In the case of standalone puppet (i.e. `puppet apply`) IMHO they are "logs" and should go to /var/log.
> In the case of a puppet-agent (i.e. a puppet client/agent connecting to a puppet server _without_ a puppetdb), they should probably not be saved on the client at all, but if they are, they are also "logs" IMHO and should be treated as mentioned above. On the server, they should also be treated like "logs" but not necessarily go to /var/log like machine-local log data. I don’t think I have a concrete sensible suggestion for this case. Maybe /var/lib.
> In the case of a puppetserver with a puppetdb, they should probably not be saved as files at all on the server. Unless they are sent directly to the puppetdb from the puppetserver, but consumed later, they are probably "spool" data.

I agree that the default of "/var/cache/puppet/reports" perhaps isn't 
ideal. But rather than changing only "reportdir", we might want to 
change "vardir" from "/var/cache/puppet" to something like 
"/var/puppet". I'm not sure that anything puppet puts inside "vardir" 
can really be qualified as "cache".

I think perhaps the only reason it's that way is because of the naming 
choices made by upstream a long time ago.
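
In the meantime, anyone who wants reports out of "/var/cache" can 
probably override "reportdir" locally. A minimal sketch, assuming the 
configuration file is /etc/puppet/puppet.conf and using 
"/var/log/puppet/reports" purely as an example path (it follows the 
/var/log suggestion from the report and is not a packaging decision):

```ini
# /etc/puppet/puppet.conf (assumed location; adjust to your installation)
[main]
# Example path only: store run reports outside of vardir.
reportdir = /var/log/puppet/reports
```

Running `puppet config print reportdir vardir` afterwards should show 
which values are actually in effect.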

-- Jérôme


