[Nut-upsuser] wrong NOTIFYTYPE always ONBATT on upssched TIMER ?
RemiG
mailing-nut at dbmtechnologies.com
Tue Apr 22 13:45:03 BST 2025
Hi Jim
and thank's a lot for this "fun exploration".
i have change all stars by UPS1 at localhost in my /etc/nut/upssched.conf
But nothing better for wrong NOTIFYTYPE.
TIMER intiated by ONLINE process continue to receive "ONBATT" env value
by NOTIFYTYPE var.
Apr 22 08:22:00 brisefer upsmon[1613670]: Onduleur UPS1 at localhost en
mode Secours sur batterie
Apr 22 08:22:00 brisefer upssched[1854124]: Executing command: ONBATT
Apr 22 08:22:00 brisefer upssched-cmd-received-NOTIFYTYPE: ONBATT
Apr 22 08:22:00 brisefer upssched-cmd-read-Upsc_ups_status: OB
Apr 22 08:22:00 brisefer upssched[1854156]: Timer daemon started
Apr 22 08:22:00 brisefer upssched[1854156]: New timer: BATT-STATUS-5
(300 seconds)
Apr 22 08:22:00 brisefer upssched[1854156]: New timer: BATT-STATUS-10
(600 seconds)
Apr 22 08:22:00 brisefer upssched[1854156]: New timer: BATT-STATUS-15
(900 seconds)
Apr 22 08:22:00 brisefer upssched[1854156]: New timer: BATT-STATUS-20
(1200 seconds)
Apr 22 08:22:00 brisefer upssched[1854156]: New timer: BATT-STATUS-25
(1500 seconds)
Apr 22 08:22:00 brisefer upssched[1854156]: New timer: BATT-STATUS-30
(1800 seconds)
Apr 22 08:22:00 brisefer upssched[1854156]: New timer: BATT-STATUS-60
(3600 seconds)
Apr 22 08:22:00 brisefer upssched[1854156]: New timer: BATT-STATUS-120
(7200 seconds)
Apr 22 08:22:55 brisefer upsmon[1613670]: Onduleur UPS1 at localhost en
mode Secteur : recharge batterie en cours
Apr 22 08:22:55 brisefer upssched[1854183]: Executing command: ONLINE
Apr 22 08:22:55 brisefer upssched-cmd-received-NOTIFYTYPE: ONLINE
Apr 22 08:22:55 brisefer upssched-cmd-read-Upsc_ups_status: OL
Apr 22 08:22:55 brisefer upssched[1854156]: Cancelling timer: BATT-STATUS-5
Apr 22 08:22:55 brisefer upssched[1854156]: Cancelling timer:
BATT-STATUS-10
Apr 22 08:22:55 brisefer upssched[1854156]: Cancelling timer:
BATT-STATUS-15
Apr 22 08:22:55 brisefer upssched[1854156]: Cancelling timer:
BATT-STATUS-20
Apr 22 08:22:55 brisefer upssched[1854156]: Cancelling timer:
BATT-STATUS-25
Apr 22 08:22:55 brisefer upssched[1854156]: Cancelling timer:
BATT-STATUS-30
Apr 22 08:22:55 brisefer upssched[1854156]: Cancelling timer:
BATT-STATUS-60
Apr 22 08:22:55 brisefer upssched[1854156]: Cancelling timer:
BATT-STATUS-120
Apr 22 08:22:55 brisefer upssched[1854156]: New timer: LINE-STATUS-5
(300 seconds)
Apr 22 08:22:55 brisefer upssched[1854156]: New timer: LINE-STATUS-60
(3600 seconds)
Apr 22 08:22:55 brisefer upssched[1854156]: New timer: LINE-STATUS-120
(7200 seconds)
Apr 22 08:27:55 brisefer upssched[1854156]: Event: LINE-STATUS-5
Apr 22 08:27:55 brisefer upssched-cmd-received-NOTIFYTYPE: ONBATT
Apr 22 08:27:55 brisefer upssched-cmd-read-Upsc_ups_status: OL
Apr 22 09:22:55 brisefer upssched[1854156]: Event: LINE-STATUS-60
Apr 22 09:22:55 brisefer upssched-cmd-received-NOTIFYTYPE: ONBATT
Apr 22 09:22:55 brisefer upssched-cmd-read-Upsc_ups_status: OL
Apr 22 10:22:55 brisefer upssched[1854156]: Event: LINE-STATUS-120
Apr 22 10:22:55 brisefer upssched-cmd-received-NOTIFYTYPE: ONBATT
Apr 22 10:22:55 brisefer upssched-cmd-read-Upsc_ups_status: OL
Apr 22 10:23:10 brisefer upssched[1854156]: Timer queue empty, exiting
My script upssched-cmd script now don't use NOTIFYTYPE as good
information. It's working and all messages are OK
thank's
Rémi
Le 11/04/2025 à 16:09, Jim Klimov a écrit :
> So, I've done some experiments, also integrating upssched into NIT
> (NUT Integration Testing) suite, to help automate the setup and
> debugging - that aspect can be seen in
> https://github.com/jimklimov/nut/tree/issue-2890 (also investigating
> for https://github.com/networkupstools/nut/issues/2890 that also
> touches on upssched).
>
> My current conclusion is that `setenv("NOTIFYTYPE")` only happens when
> `upsmon` calls whatever it has as the `NOTIFYCMD` handler. It is a new
> process every time it is called, and when the handler is a script or
> other short-lived program, it indeed sees the current unique value of
> NOTIFYTYPE (and UPSNAME, and "other CLI args" that convey the message
> for notification).
>
> However in case of upssched, the first call to create/cancel timers
> launches a daemon (to track timer states and execute `CMDSCRIPT` when
> timers elapse). Subsequent calls to `upssched` as a CLI program talk
> to this daemon via pipe, to inform it of state changes for
> `START-TIMER` and `CANCEL-TIMER` matches, e.g.:
>
> UPS dummy at localhost:12345 on battery
> Fri Apr 11 13:40:09 UTC 2025:
> /home/jim/nut/scripts/misc/notifyme-debug ONBATT
> [dummy at localhost:12345]: args: UPS dummy at localhost:12345 on
> battery (1 arg tokens)
> Defaulting debug verbosity to NUT_DEBUG_LEVEL=3 since none was
> requested by command-line options
> 0.000081 [D2:7688] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.000129 [D1:7688] parse_at: this AT command is for a wildcard:
> matched
> 0.000171 [D1:7688] parse_at: SKIP: 'ONLINE' in AT command did
> not match the 'ONBATT' NOTIFYTYPE we were launched to process
> ...
> 0.001165 [D1:7688] parse_at: this AT command is for a wildcard:
> matched
> 0.001172 [D1:7688] parse_at: processing EXECUTE
> 0.001193 Executing command: ONBATT
> Fri Apr 11 13:40:09 UTC 2025: /home/jim/nut/clients/upssched-cmd: THIS
> IS A SAMPLE SCRIPT, PLEASE TAILOR IT FOR YOUR DEPLOYMENT OF NUT!
> Fri Apr 11 13:40:09 UTC 2025: UPSNAME='dummy at localhost:12345'
> NOTIFYTYPE='ONBATT' args=ONBATT
> 0.009687 [D2:7688] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.009719 [D1:7688] parse_at: this AT command is for a wildcard:
> matched
> 0.009751 [D1:7688] parse_at: processing CANCEL-TIMER
> 25.037157 [D3:7373] new connection on fd 16
> 25.037214 [D3:7373] Ending sock_read() on a good note: try to
> use command:
> 25.037238 [D3:7373] arg 0: CANCEL
> 25.037243 [D3:7373] arg 1: LINE-STATUS-1s
> ...
> 0.019160 [D2:7688] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.019190 [D1:7688] parse_at: this AT command is for a wildcard:
> matched
> 0.019196 [D1:7688] parse_at: processing START-TIMER
> 25.046606 [D3:7373] closing connection on fd 16
> 25.046647 [D3:7373] new connection on fd 16
> 25.046706 [D3:7373] Ending sock_read() on a good note: try to
> use command:
> 25.046729 [D3:7373] arg 0: START
> 25.046734 [D3:7373] arg 1: BATT-STATUS-60
> 25.046757 [D3:7373] arg 2: 3600
> 25.046780 New timer: BATT-STATUS-60 (3600 seconds)
> 0.019597 [D2:7688] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.019622 [D1:7688] parse_at: this AT command is for a wildcard:
> matched
> 0.019627 [D1:7688] parse_at: SKIP: 'REPLBATT' in AT command did
> not match the 'ONBATT' NOTIFYTYPE we were launched to process
>
> Note the 25-sec timestamps of the daemon, and 0.x sec timestamps of
> the client that reacts to a new ONBATT event.
>
> The `upssched` code, its infinite-looping daemon included, never calls
> `setenv()`. So all runs of a `CMDSCRIPT` inherit the environment
> exported for the first run of `upssched` when it created the daemon,
> including the original and static value of `NOTIFYTYPE`.
>
> The `upssched` CLI program directly runs the `CMDSCRIPT` when an `AT
> ... EXECUTE ...` line is matched, however -- including the
> `NOTIFYTYPE` relevant at this moment, as seen in the third report from
> ` upssched-cmd` script below:
>
> 25.047059 [D3:7373] closing connection on fd 16
> 0.019795 [D2:7688] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.019818 [D1:7688] parse_at: this AT command is for a wildcard:
> matched
> 0.019842 [D1:7688] parse_at: SKIP: 'COMMBAD' in AT command did
> not match the 'ONBATT' NOTIFYTYPE we were launched to process
> 0.019858 [D1:7688] Exiting upssched (CLI process)
> 26.048227 Event: BATT-STATUS-1s
> Fri Apr 11 13:40:10 UTC 2025: /home/jim/nut/clients/upssched-cmd: THIS
> IS A SAMPLE SCRIPT, PLEASE TAILOR IT FOR YOUR DEPLOYMENT OF NUT!
> Fri Apr 11 13:40:10 UTC 2025: UPSNAME='dummy at localhost:12345'
> NOTIFYTYPE='ONBATT' args=BATT-STATUS-1s
> 27.061667 Event: BATT-STATUS-2s
> Fri Apr 11 13:40:11 UTC 2025: /home/jim/nut/clients/upssched-cmd: THIS
> IS A SAMPLE SCRIPT, PLEASE TAILOR IT FOR YOUR DEPLOYMENT OF NUT!
> Fri Apr 11 13:40:11 UTC 2025: UPSNAME='dummy at localhost:12345'
> NOTIFYTYPE='ONBATT' args=BATT-STATUS-2s
> UPS dummy at localhost:12345 on line power
> Fri Apr 11 13:40:14 UTC 2025:
> /home/jim/nut/scripts/misc/notifyme-debug ONLINE
> [dummy at localhost:12345]: args: UPS dummy at localhost:12345 on
> line power (1 arg tokens)
> Defaulting debug verbosity to NUT_DEBUG_LEVEL=3 since none was
> requested by command-line options
> 0.000068 [D2:7720] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.000109 [D1:7720] parse_at: this AT command is for a wildcard:
> matched
> 0.000138 [D1:7720] parse_at: processing EXECUTE
> 0.000143 Executing command: ONLINE
> Fri Apr 11 13:40:14 UTC 2025: /home/jim/nut/clients/upssched-cmd: THIS
> IS A SAMPLE SCRIPT, PLEASE TAILOR IT FOR YOUR DEPLOYMENT OF NUT!
> Fri Apr 11 13:40:14 UTC 2025: UPSNAME='dummy at localhost:12345'
> NOTIFYTYPE='ONLINE' args=ONLINE
> 0.011340 [D2:7720] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.011389 [D1:7720] parse_at: this AT command is for a wildcard:
> matched
> 0.011398 [D1:7720] parse_at: processing CANCEL-TIMER
> 30.042455 [D3:7373] new connection on fd 16
> 30.042490 Event: BATT-STATUS-5s
> Fri Apr 11 13:40:14 UTC 2025: /home/jim/nut/clients/upssched-cmd: THIS
> IS A SAMPLE SCRIPT, PLEASE TAILOR IT FOR YOUR DEPLOYMENT OF NUT!
> Fri Apr 11 13:40:14 UTC 2025: UPSNAME='dummy at localhost:12345'
> NOTIFYTYPE='ONBATT' args=BATT-STATUS-5s
> 30.051445 [D3:7373] Ending sock_read() on a good note: try to
> use command:
> 30.051489 [D3:7373] arg 0: CANCEL
> 30.051496 [D3:7373] arg 1: BATT-STATUS-1s
>
> It seems that sometimes timer calls can happen before they get the
> cancel signal (processing the list and calling the commands can take a
> while), note the jump from 27.x to 30.x seconds and call to handle
> BATT-STATUS-5s even though we are already processing cancellation of
> BATT-STATUS-1s and would likely get to cancel 5s in a moment.
>
> ...
> 0.020701 [D2:7720] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.020727 [D1:7720] parse_at: this AT command is for a wildcard:
> matched
> 0.020733 [D1:7720] parse_at: processing CANCEL-TIMER
> 30.051843 [D3:7373] closing connection on fd 16
> 30.051897 [D3:7373] new connection on fd 16
> 30.051963 [D3:7373] Ending sock_read() on a good note: try to
> use command:
> 30.051992 [D3:7373] arg 0: CANCEL
> 30.052021 [D3:7373] arg 1: BATT-STATUS-2s
> 0.021220 [D2:7720] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.021250 [D1:7720] parse_at: this AT command is for a wildcard:
> matched
> 0.021290 [D1:7720] parse_at: processing CANCEL-TIMER
> 30.052299 [D3:7373] closing connection on fd 16
> 30.052340 [D3:7373] new connection on fd 16
> 30.052392 [D3:7373] Ending sock_read() on a good note: try to
> use command:
> 30.052415 [D3:7373] arg 0: CANCEL
> 30.052420 [D3:7373] arg 1: BATT-STATUS-5s
> 0.021601 [D2:7720] parse_at: is '*' in AT command the
> 'dummy at localhost:12345' we were launched to process?
> 0.021626 [D1:7720] parse_at: this AT command is for a wildcard:
> matched
> 0.021631 [D1:7720] parse_at: processing CANCEL-TIMER
> 30.052685 [D3:7373] closing connection on fd 16
> 30.052727 [D3:7373] new connection on fd 16
> 30.052781 [D3:7373] Ending sock_read() on a good note: try to
> use command:
> 30.052804 [D3:7373] arg 0: CANCEL
> 30.052810 [D3:7373] arg 1: BATT-STATUS-15s
> 30.052831 Cancelling timer: BATT-STATUS-15s
> ...
>
> Ultimately, what should matter for the `CMDSCRIPT` are its
> command-line arguments (technically, the first and only one). That
> alone dictates the operation it should run, regardless of the UPSNAME
> or NOTIFYTYPE values. It would probably be correct to unset those
> variables in the loop at least, or in the tool overall (to not imply
> any guarantees of anything related to those strings).
>
> If timers or executions relate to a particular UPS, name it in the
> `AT` line spec (instead of the asterisk) and track as an uniquely
> named timer to start or cancel.
>
> That was a fun exploration, thanks for the nudge!
>
> Jim Klimov
>
>
>
> On Tue, Apr 8, 2025 at 10:59 AM RemiG
> <mailing-nut at dbmtechnologies.com> wrote:
>
> Hi for all
>
> an other one with same wrong NOTIFYTYPE (my supposition) (ups.mfr:
> EATON, ups.model: Ellipse PRO 1600)
>
> upsmon says NOTIFYTYPE = ONBATT
>
> upssched create a time associated to this Event => BATT-STATUS-5
> (but Power return before 5 minutes ;-) so ... )
> upsmon says NOTIFYTYPE = ONLINE
>
> upssched.conf delete this Event => BATT-STATUS-5, BATT-STATUS-10
>
> upssched create a TIMER associated to this Event => LINE-STATUS-5
> and so on
>
> and after all other timers created by ONLINE => LINE-STATUS-5,
> LINE-STATUS-10
>
> sent ENV var NOTIFYTYPE to /bin/upssched-cmd bash script =>
> ONBATT (but upsc UPS at localhost read ups.status : OL CHRG )
>
>
> Isn't it strange ONBATT / OL CHGR
>
> root at greg:~# grep " ups" /var/log/syslog
> Apr 8 10:07:33 greg upsmon[32507]: Onduleur UPS1 at localhost en
> mode Secours sur batterie
> Apr 8 10:07:33 greg upssched[17464]: Executing command: ONBATT
> Apr 8 10:07:34 greg upssched-cmd-received-NOTIFYTYPE: ONBATT
> Apr 8 10:07:34 greg upssched-cmd-read-Upsc_ups_status: OB DISCHRG
> Apr 8 10:07:34 greg upssched[17497]: Timer daemon started
> Apr 8 10:07:34 greg upssched[17497]: New timer: BATT-STATUS-5
> (300 seconds)
> Apr 8 10:07:34 greg upssched[17497]: New timer: BATT-STATUS-10
> (600 seconds)
> Apr 8 10:08:03 greg upsmon[32507]: Onduleur UPS1 at localhost en
> mode Secteur : recharge batterie en cours
> Apr 8 10:08:03 greg upssched[17516]: Executing command: ONLINE
> Apr 8 10:08:03 greg upssched-cmd-received-NOTIFYTYPE: ONLINE
> Apr 8 10:08:03 greg upssched-cmd-read-Upsc_ups_status: OL CHRG
> Apr 8 10:08:04 greg upssched[17497]: Cancelling timer: BATT-STATUS-5
> Apr 8 10:08:04 greg upssched[17497]: Cancelling timer:
> BATT-STATUS-10
> Apr 8 10:08:04 greg upssched[17497]: New timer: LINE-STATUS-5
> (300 seconds)
> Apr 8 10:08:04 greg upssched[17497]: New timer: LINE-STATUS-10
> (600 seconds)
> Apr 8 10:13:04 greg upssched[17497]: Event: LINE-STATUS-5
> Apr 8 10:13:04 greg upssched-cmd-received-NOTIFYTYPE: ONBATT
> Apr 8 10:13:04 greg upssched-cmd-read-Upsc_ups_status: OL CHRG
> Apr 8 10:18:04 greg upssched[17497]: Event: LINE-STATUS-10
> Apr 8 10:18:05 greg upssched-cmd-received-NOTIFYTYPE: ONBATT
> Apr 8 10:18:05 greg upssched-cmd-read-Upsc_ups_status: OL CHRG
> --
>
>> # admin linux only ;-)
>>
>> _______________________________________________
>> Nut-upsuser mailing list
>> Nut-upsuser at alioth-lists.debian.net
>> https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser
>>
> --
> # admin linux only ;-)
>
--
# admin linux only ;-)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/nut-upsuser/attachments/20250422/cbd09a5f/attachment-0001.htm>
More information about the Nut-upsuser
mailing list