[Nut-upsuser] Alert: REPLBATT active after battery replacement and requires reboot to clear
Vyasa
info at dalpha.com
Tue Jul 1 18:55:30 BST 2025
I posted the issue in github here:
https://github.com/networkupstools/nut/issues/2999
I am not sure about labels or what else might be required.
Thank you!
On 7/1/25 03:23, Jim Klimov wrote:
> I think yes, seems like a valid bug.
>
> Also as you mention `upsdrvctl`, systemd and NUT v2.8.x together, take
> a look at
> https://github.com/networkupstools/nut/wiki/nut%E2%80%90driver%E2%80%90enumerator-(NDE) -
> it may be more applicable to use `upsdrvsvcctl` instead nowadays.
>
> Jim
>
>
> On Tue, Jul 1, 2025 at 12:35 AM Vyasa <info at dalpha.com> wrote:
>
> Hi Jim,
>
> Thanks for the prompt response.
>
> The restart I referred to was exactly as you say: I restarted
> the service using: systemctl restart nut-server. This was
> separate from the reboot of the server machine that I mentioned,
> which resolves the issue.
>
> The driver used was:
> Network UPS Tools - UPS driver controller 2.8.0
> Network UPS Tools - BCMXCP UPS driver 0.32 (2.8.0)
>
> I simulated the fault again by putting the UPS in bypass and
> disconnecting the battery, which caused the RB alert again. I
> then reconnected the battery, restored the UPS to its normal
> operating condition, and used upsdrvctl to stop and start the driver.
>
> Generating alert condition for simulating RB:
> Alert type: REPLBATT
> .....................
> ups.status: ALARM OL BYPASS RB
> ups.test.result: Done and error
>
> Alert cleared on UPS, and alert condition with RB persisting on
> NUT-SERVER:
> Alert type: ONLINE
> .................
> ups.status: OL RB
>
> ups.test.result: Done and passed
>
> Restarting the driver with upsdrvctl stop/start clears RB:
> Alert type: COMMOK
> ..................
> ups.status: OL
> ups.test.result: Done and passed
>
> So it seems that your suspicions and mine have been verified:
> bcmxcp seems to "latch" the alarm until a driver restart or a
> server reboot.
>
> I think you are correct that this can cause issues in other
> subsets of real-life cases, for example with automation and
> scripting.
>
> What would you suggest at this point? Can this be submitted as a bug?
>
> Vyasa
>
>
>
> On 6/30/25 14:18, Jim Klimov wrote:
>> Hello,
>>
>> You mention that you've tried restarting the "nut-server" - I
>> suppose you mean literally, the service unit by such name - of
>> the NUT data server. Did you try restarting the unit for the NUT
>> driver (e.g. `systemctl restart nut-driver@upsname` with NUT
>> v2.8.x and newer)?
>>
>> You did not mention the driver used, but I wonder if that
>> driver program "latches" the RB value when it goes bad and never
>> updates it?.. This could make sense when UPS battery replacement
>> means server downtime, but that is just a subset of real-life
>> cases - so in general it may simply be an oversight. For example,
>> `bcmxcp` code seems to only ever set
>> `bcmxcp_status.alarm_replace_battery=1` and never reset it to 0
>> (oddly neither the field nor struct is ever initialized to 0, so
>> it might be garbage on some systems/compilers that do not
>> zero-out aggregate types by default).
>>
>> Jim
>>
>>
>> On Mon, Jun 30, 2025 at 7:53 PM Vyasa via Nut-upsuser
>> <nut-upsuser at alioth-lists.debian.net> wrote:
>>
>> Hello,
>>
>> CONFIGURATION:
>>
>> I am using a Powerware PW9120 3000i, on a network
>> configuration with a server and a couple of slaves.
>>
>> The nut-server OS is /Debian 12 (6.1.0-37-amd64)/. Nut was
>> installed from the Debian repo with version /2.8.0-7 amd64/,
>> and client has the same version.
>>
>> UPS is connected with a standard RS232 serial connection, and
>> works with all standard commands and functionality.
>>
>> Command "/upscmd -l upsname/" provides the following, where I
>> have successfully used /test.battery.start/ and
>> /test.system.start/:
>>
>> beeper.disable - Disable the UPS beeper
>> beeper.enable - Enable the UPS beeper
>> beeper.mute - Temporarily mute the UPS beeper
>> load.on - Turn on the load immediately
>> outlet.1.load.off - Turn off the load on outlet 1 immediately
>> outlet.1.load.on - Turn on the load on outlet 1 immediately
>> outlet.1.shutdown.return - Turn off the outlet 1 and return when power is back
>> outlet.2.load.off - Turn off the load on outlet 2 immediately
>> outlet.2.load.on - Turn on the load on outlet 2 immediately
>> outlet.2.shutdown.return - Turn off the outlet 2 and return when power is back
>> shutdown.return - Turn off the load and return when power is back
>> shutdown.stayoff - Turn off the load and remain off
>> test.battery.start - Start a battery test
>> test.system.start - Start a system test
>>
>> ISSUE:
>>
>> Every couple of years, when I have to replace the batteries in
>> the UPS, I cannot clear the REPLBATT alert until I reboot the
>> server running NUT-SERVER. This might not seem like a big deal,
>> but it becomes a hassle when the batteries haven't quite failed
>> yet and still pass a UPS battery test.
>>
>> The UPS itself reports OK after battery replacement or
>> battery test, and clears alarm on its LCD. But when I poll
>> the UPS data using "upsc upsname" I still see the RB or
>> REPLBATT and this will not clear until I reboot the server.
>> So without reboot the alert will then be generated based on
>> RBWARNTIME in upsmon.conf, which is as per nut design.
>>
>> So without reboot I always get the RB flag with status:
>>
>> Alert type: REPLBATT
>> ............
>> ups.status: OL RB
>> ups.test.result: Done and passed
>>
>> After reboot of server the alert is cleared:
>>
>> Alert type: COMMOK
>> ............
>> ups.status: OL
>> ups.test.result: Done and passed
>>
>> So my question is: why is this reboot required? It doesn't seem
>> to make any sense. I can't understand why the polled data from
>> a UPS would change after a reboot while the UPS LCD is
>> reporting all OK. I tried restarting NUT-SERVER to see if it
>> would make any difference. Also, the command
>> test.battery.start will clear the alarm on the UPS if the
>> battery test passes.
>>
>> The only explanation that I have come up with is that the
>> persistent RB/REPLBATT is latched to this condition and is an
>> artifact of UPS to NUT handshaking.
>>
>> Any feedback would be kindly appreciated, as I have searched
>> and searched.
>>
>> Thank you!
>>
>> Vyasa
>> _______________________________________________
>> Nut-upsuser mailing list
>> Nut-upsuser at alioth-lists.debian.net
>> https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser
>>
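
For readers following the thread, the "latching" behaviour discussed
above can be illustrated with a minimal sketch in C. This is NOT the
actual bcmxcp source: the struct `ups_status_sketch` and the functions
`poll_latching`, `poll_refreshing` and `format_status` are hypothetical
names made up for illustration, and the status string is simplified to
"OL" plus an optional "RB". It only shows why a flag that is set but
never cleared keeps reporting RB until the process restarts, and how
assigning the flag on every poll would avoid that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a driver's status struct (not the real bcmxcp layout). */
struct ups_status_sketch {
    bool alarm_replace_battery;
};

/* Latching variant: the flag is set when the UPS reports the alarm,
 * but nothing ever clears it while the process keeps running. */
static void poll_latching(struct ups_status_sketch *st, bool ups_reports_rb)
{
    assert(st != NULL);
    if (ups_reports_rb) {
        st->alarm_replace_battery = true;
    }
}

/* Refreshing variant: the flag is assigned from every poll result,
 * so it follows the UPS once the alarm goes away. */
static void poll_refreshing(struct ups_status_sketch *st, bool ups_reports_rb)
{
    assert(st != NULL);
    st->alarm_replace_battery = ups_reports_rb;
}

/* Build a simplified status string: "OL" or "OL RB". */
static void format_status(const struct ups_status_sketch *st, char *buf, size_t len)
{
    assert(st != NULL && buf != NULL);
    snprintf(buf, len, "OL%s", st->alarm_replace_battery ? " RB" : "");
}
```

Declaring the struct with an explicit initializer, such as
`struct ups_status_sketch st = {0};`, also avoids the uninitialized-field
concern Jim raised. With the latching variant, one poll with the alarm
followed by one poll without it still formats as "OL RB"; with the
refreshing variant the same sequence formats as "OL".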