[sane-devel] Character encoding used for sane_strstatus() strings

Ralph Little skelband at gmail.com
Sun Jul 24 18:45:30 BST 2022


Hi,

On 2022-07-18 03:19, Povilas Kanapickas wrote:
> Hi John,
>
> On 2022-07-18 05:25, John Scott wrote:
>> The SANE spec says that all strings are encoded in ISO-8859-1 ("Latin-
>> 1"). However, from inspecting the code for sane_strstatus(), it appears
>> that it just returns ordinary string literals, which use whatever
>> encoding the compiler prescribes for narrow string literals and need not
>> be the same.
> Agreed, going by the letter of standards this is indeed a problem.
>
>> So, what character encoding should I be assuming for strings coming from
>> sane_strstatus() as an application writer? Since sane_strstatus()
>> appears to use only ASCII characters in its strings, one solution to
>> this dilemma is to use UTF-8 string literals, like this:
>> 	u8"Hello, world"
> This would bump compiler requirements to C11. I don't think this is bad,
> because we already require C++ for at least one popular backend so it's
> unlikely we have many platforms with only an ancient C compiler available.
>
> I'm CC'ing Ralph for a second opinion of whether we can start requiring C11.
>
> By the way, does the current assumption actually break in practice, that
> is, are there compilers for which ASCII text will not encode to a subset
> of ISO-8859-1?
>
>> If you can affirm that the specification needs to prevail, I can send a
>> merge request to adjust the string literals accordingly.
> Let's wait until Ralph replies and then we can see how to proceed.
>
> Thanks a lot for noticing this.
>
> Regards,
> Povilas

None of the suggestions that we have seen so far seems very portable, yet
the situation is indeed a problem.

Since UTF-8 is pretty much the de facto string representation these
days, would a better solution be to change the SANE spec to specify UTF-8?
If the strings we currently return use only ASCII characters, they are
byte-for-byte identical in UTF-8 and ISO-8859-1, so there should be no
practical fallout from the change.
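
As a rough sketch of how that premise could be checked, something like
the following would do. The helper name is_pure_ascii is made up for
illustration and is not part of SANE, and the check assumes the
compiler's execution character set is ASCII-compatible:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the SANE API.
 * Returns true if every byte of s is in the 7-bit ASCII range
 * (0x00-0x7F). Such a string has the same byte sequence in
 * ISO-8859-1 and in UTF-8. */
static bool is_pure_ascii(const char *s)
{
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p) {
        if (*p > 0x7F)
            return false;   /* byte outside ASCII: encodings would differ */
    }
    return true;
}
```

A test could run this over the result of sane_strstatus() for each
SANE_Status value and fail if any string contains a non-ASCII byte.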

What would the fallout of such a change be?
Would it make frontend support simpler?
Do any of our current frontends actually care?
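
For a sense of what a frontend has to do today if it takes the
ISO-8859-1 requirement literally and displays text as UTF-8, here is a
minimal sketch. latin1_to_utf8 is a made-up helper, not part of the SANE
API, and I have not checked whether any frontend actually does this:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the SANE API.
 * Converts a NUL-terminated ISO-8859-1 string to a newly allocated
 * UTF-8 string. The caller must free() the result.
 * Returns NULL if allocation fails. */
static char *latin1_to_utf8(const char *in)
{
    size_t len = strlen(in);
    /* Each Latin-1 byte becomes at most two UTF-8 bytes. */
    char *out = malloc(2 * len + 1);
    if (out == NULL)
        return NULL;

    char *q = out;
    for (const unsigned char *p = (const unsigned char *)in; *p != '\0'; ++p) {
        if (*p < 0x80) {
            *q++ = (char)*p;                    /* ASCII: copied unchanged */
        } else {
            *q++ = (char)(0xC0 | (*p >> 6));    /* lead byte */
            *q++ = (char)(0x80 | (*p & 0x3F));  /* continuation byte */
        }
    }
    *q = '\0';
    return out;
}
```

If the spec said UTF-8, a UTF-8 frontend could skip this step entirely.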

Cheers,
Ralph


