<div dir="ltr"><div>One more idea, inspired by messages systemd sometimes gives:</div><div><br></div><div>* Select and copy the timestamp of when e.g. your `nut-server.service` stopped/restarted;</div><div>* As root(!) run `journalctl -xl` for a detailed log with service state changes and their reasons, plus other details, piped into `less` by default;</div><div>* Press `G` to scroll to the end (maybe wait a minute for it to react if, like me, you have months of active journal to sift through);<br></div><div>* Search upwards via `?` for the timestamp you copied earlier;</div><div>* Scroll a few screenfuls of text around it (mostly before) to make a better educated guess about why the restart happened (some failure of the NUT daemon? some dependency change? system restart? sleep/hibernate/throttle? OOM killer - this one would also be seen in `dmesg`? etc.)</div><div><br></div><div>Also check `dmesg` for any USB events around that time (e.g. the UPS getting lost and reconnected). If that does happen, check the polling frequency settings, and maybe set up monitoring like MRTG to see whether the system was too busy at that moment to dedicate enough time to regular polls, so that a timeout was assumed (this happened to me with a weak embedded device meant to monitor a dozen UPSes, or so we hoped).<br></div><div><br></div><div>Good luck,</div><div>Jim</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jan 19, 2024 at 8:35 PM Jim Klimov <<a href="mailto:jimklimov%2Bnut@gmail.com">jimklimov+nut@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>>
1) How do I make the nut-server and nut-monitor find the right pid files? They are there but it seems they can't be opened. Permissions are nut/nut. <br></div><div><br></div><div>Actually, if the previous run of the service ended with a graceful stop, the exiting daemon should have removed its PID files. The newly starting one would then check for them and not find any - as I wrote before - to make sure there is no hung old competitor to kill off as part of the start-up. So it works as normal, just with scary messages (newer versions should be less cryptic about this).<br><br>>
Jan 19 16:14:52 mars nut-monitor[3781]: Init SSL without certificate database<br><br></div><div>This means your NUT build is SSL-capable, but you did not configure it with certificates so it is using plaintext mode.<br></div><div><br>
> Jan 19 16:14:52 mars nut-monitor[3781]: Login on UPS [Eaton@localhost]<span> failed - got [ERR ACCESS-DENIED]<br></span></div><div><br><span></span></div><div><span>Given that the login works in some of the messages posted earlier and is denied in others (soon after upsd startup), this is the most puzzling issue here (other than the service restarts, for which you did not post explanations). I'd guess that it somehow tried the connection too early, when upsd was already listening but had not yet read all of its configuration. I'm not sure this should happen. It might also happen if you have several MONITOR lines for the same device/server and some of them are wrong.<br></span></div><div><br>
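<div>To check the "several MONITOR lines" guess, a minimal sketch like the one below could list the active MONITOR lines and report any system that appears more than once. The function names are made up for illustration, and the default path `/etc/nut/upsmon.conf` is an assumption (adjust it to wherever your packaging keeps the file).</div>
<pre>
```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the default path below
# is an assumption (a common packaged layout); pass another path if needed.

# Print the active (non-commented) MONITOR lines of an upsmon.conf-style file.
list_monitor_lines() {
    conf="${1:-/etc/nut/upsmon.conf}"
    grep -E '^[[:space:]]*MONITOR[[:space:]]' "$conf"
}

# Print every system (second field, e.g. Eaton@localhost) that is named
# on more than one active MONITOR line.
duplicate_monitor_systems() {
    conf="${1:-/etc/nut/upsmon.conf}"
    list_monitor_lines "$conf" | awk '{ print $2 }' | sort | uniq -d
}
```
</pre>
<div>After sourcing this, `duplicate_monitor_systems` with no output would mean each system is monitored only once; any name it prints is worth comparing line by line (user name, password) against the matching section of `upsd.users`.<br></div>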
</div><div>Jim<br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jan 19, 2024 at 6:33 PM Stefan Schumacher via Nut-upsuser <<a href="mailto:nut-upsuser@alioth-lists.debian.net" target="_blank">nut-upsuser@alioth-lists.debian.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I still have two questions:<br>
1) How do I make the nut-server and nut-monitor find the right pid<br>
files? They are there but it seems they can't be opened. Permissions<br>
are nut/nut.<br>
2) What do these error messages mean?<br>
Jan 19 16:14:52 mars nut-monitor[3781]: Init SSL without certificate database<br>
Jan 19 16:14:52 mars nut-monitor[3781]: Login on UPS [Eaton@localhost]<br>
failed - got [ERR ACCESS-DENIED]<br>
<br>
Yours sincerely<br>
Stefan<br>
<br>
On Fri, Jan 19, 2024 at 5:59 PM Matus UHLAR - fantomas<br>
<<a href="mailto:uhlar@fantomas.sk" target="_blank">uhlar@fantomas.sk</a>> wrote:<br>
><br>
> On 19.01.24 17:02, Stefan Schumacher via Nut-upsuser wrote:<br>
> >Jan 19 05:50:13 mars nut-monitor[849]: Signal 15: exiting<br>
><br>
> >Jan 19 05:50:17 mars nut-server[1303]: Signal 15: exiting<br>
><br>
> this looks like someone repeatedly killed the NUT server. This is not a problem with<br>
> the UPS.<br>
> --<br>
> Matus UHLAR - fantomas, <a href="mailto:uhlar@fantomas.sk" target="_blank">uhlar@fantomas.sk</a> ; <a href="http://www.fantomas.sk/" rel="noreferrer" target="_blank">http://www.fantomas.sk/</a><br>
> Warning: I wish NOT to receive e-mail advertising to this address.<br>
> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.<br>
> Atheism is a non-prophet organization.<br>
><br>
> _______________________________________________<br>
> Nut-upsuser mailing list<br>
> <a href="mailto:Nut-upsuser@alioth-lists.debian.net" target="_blank">Nut-upsuser@alioth-lists.debian.net</a><br>
> <a href="https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser" rel="noreferrer" target="_blank">https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser</a><br>
<br>
</blockquote></div>
</blockquote></div>
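<div>As an alternative to paging through the whole journal as described at the top of this message, the search could be narrowed to a time window around the restart. This is only a sketch: the function name and the `JOURNALCTL` override are made up for illustration, and the timestamps in the comments are examples based on the "Signal 15: exiting" lines quoted in this thread.</div>
<pre>
```shell
#!/bin/sh
# Sketch: print everything the journal recorded in a time window, so the
# minutes around a restart can be read without paging through months of logs.
# JOURNALCTL is a hypothetical override (handy for dry runs); by default the
# real journalctl is used. Run as root to see all messages.
journal_window() {
    start="$1"
    stop="$2"
    shift 2
    # -x adds explanatory catalog texts; extra arguments (e.g. -k for
    # kernel messages such as USB events or the OOM killer) are passed through
    "${JOURNALCTL:-journalctl}" -x --no-pager --since "$start" --until "$stop" "$@"
}

# Example calls (adjust the timestamps to your own log):
#   journal_window "2024-01-19 05:45:00" "2024-01-19 05:55:00"
#   journal_window "2024-01-19 05:45:00" "2024-01-19 05:55:00" -k
```
</pre>
<div>Note that `-k` limits the output to kernel messages and, by default, to the current boot, so it will not help if the machine has rebooted since the event.</div>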