Bug#738981: Fwd: Bug#738981: Switch to use generic_fpu for ARM
Thomas Orgis
thomas-forum at orgis.org
Fri Feb 21 01:56:43 UTC 2014
I'm adding the mpg123 assembly guru to the CC list, as I imagine he
would be interested in why his ARM NEON code doesn't work on a Cortex
A8 chip here. Needless to say, it worked before (on other systems).
Also, the precision of the arm_nofpu code does not look right. This
topic is now shifting towards mpg123 development, but as long as it's
only on this debian platform that it's not working, I guess it is
on-topic for debian, too.
On Fri, 21 Feb 2014 01:29:40 +0000,
peter green <plugwash at p10link.net> wrote:
> Ok, on a 1GHz freescale IMX53 (cortex A8) in a (probably somewhat out
> of date) debian sid armhf chroot
> Built with ./configure --with-cpu=arm_nofpu
> #mpg123 benchmark (user CPU time in seconds for decoding)
> #decoder t_s16/s t_f32/s
> ARM 30.36 34.26
>
> Built with ./configure --with-cpu=generic_fpu
> #mpg123 benchmark (user CPU time in seconds for decoding)
> #decoder t_s16/s t_f32/s
> generic 148.66 138.49
That seems to prove a point about trying to use the nofpu build. How
does --with-cpu=generic_nofpu stack up for this machine? Also regarding
the compliance test later on ...
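To put a number on that, here is a minimal Python sketch that computes the slowdown from the two benchmark tables quoted above. The dictionary layout and the helper name `slowdown` are made up for illustration; the timings are copied as reported.

```python
# User CPU times in seconds, copied from the benchmark output quoted above.
times = {
    "arm_nofpu": {"s16": 30.36, "f32": 34.26},
    "generic_fpu": {"s16": 148.66, "f32": 138.49},
}


def slowdown(baseline, other, fmt):
    """Return how many times longer `other` takes than `baseline` for one format."""
    return times[other][fmt] / times[baseline][fmt]


for fmt in ("s16", "f32"):
    ratio = slowdown("arm_nofpu", "generic_fpu", fmt)
    print(f"{fmt}: generic_fpu takes {ratio:.1f}x as long as arm_nofpu")
```

On these figures generic_fpu needs roughly 4.9 times as long for 16 bit output and roughly 4.0 times as long for float output.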
> Build with CFLAGS=-mfpu=neon ./configure --with-cpu=neon
> #mpg123 benchmark (user CPU time in seconds for decoding)
> #decoder t_s16/s t_f32/s
> NEON 0.03 0.04
Yeah, those times only mean that the decoder died right away, as we see:
> Illegal instruction
This is most interesting. I'll refer this to Taihei, as I don't have a NEON
setup at hand (I need to get a Debian chroot going on my phone).
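One thing that might be worth ruling out is whether the kernel reports NEON at all on that box. The following is only a sketch: it assumes a 32-bit ARM Linux kernel, where /proc/cpuinfo has a "Features" line that lists `neon` (64-bit kernels report `asimd` instead). The function names are hypothetical helpers, and a positive answer does not by itself explain the illegal instruction.

```python
def cpu_features(cpuinfo_text):
    """Collect the flags from every 'Features' line of /proc/cpuinfo-style text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def has_neon(cpuinfo_text):
    """True if the text lists a 'neon' flag (32-bit ARM kernels only)."""
    return "neon" in cpu_features(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is not mounted in the chroot
    print("neon reported by kernel:", has_neon(text))
```

If this prints False inside the chroot, check that /proc is actually mounted there before drawing conclusions.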
> root@plugwash:/mpg123-test#
> LD_LIBRARY_PATH=/mpg123-20140220132548-arm_nofpu/src/libmpg123/.libs/
> perl compliance.pl /mpg123-20140220132548-arm_nofpu/src/mpg123
>
> ==== Layer 1 ====
> --> 16 bit signed integer output
> fl1.bit: RMS=3.486054e-02 (FAIL) maxdiff=5.002832e-02 (FAIL)
> fl2.bit: RMS=3.485670e-02 (FAIL) maxdiff=5.008233e-02 (FAIL)
That doesn't look pretty to me. Does it _sound_ like (metal) music? In
case there is no audio chip, decode to WAV with -w output.wav; I happily
accept snippets, and you can limit the number of frames via -n 500.
> root@plugwash:/mpg123-test#
> LD_LIBRARY_PATH=/mpg123-20140220132548-generic_fpu/src/libmpg123/.libs/
> perl compliance.pl /mpg123-20140220132548-generic_fpu/src/mpg123
>
> ==== Layer 1 ====
> --> 16 bit signed integer output
> fl1.bit: RMS=8.683659e-06 (PASS) maxdiff=1.525879e-05 (PASS)
> fl2.bit: RMS=8.686681e-06 (PASS) maxdiff=1.525879e-05 (PASS)
> fl3.bit: RMS=8.737660e-06 (PASS) maxdiff=1.525879e-05 (PASS)
Yes, that is better. Can you compare --with-cpu=generic_nofpu to
isolate this to the assembly version for ARM? This is how it looks with
generic_nofpu on my box:
sh$ perl ../test/compliance.pl src/mpg123
==== Layer 1 ====
--> 16 bit signed integer output
fl1.bit: RMS=7.936754e-06 (PASS) maxdiff=2.533197e-05 (PASS)
fl2.bit: RMS=7.837830e-06 (PASS) maxdiff=2.342463e-05 (PASS)
fl3.bit: RMS=7.928321e-06 (PASS) maxdiff=2.485514e-05 (PASS)
fl4.bit: RMS=7.784658e-06 (PASS) maxdiff=2.521276e-05 (PASS)
fl5.bit: RMS=1.677634e-05 (LIMITED) maxdiff=6.681681e-05 (FAIL)
fl6.bit: RMS=1.071518e-05 (LIMITED) maxdiff=4.619360e-05 (PASS)
fl7.bit: RMS=7.469690e-06 (PASS) maxdiff=2.658367e-05 (PASS)
fl8.bit: RMS=7.923985e-06 (PASS) maxdiff=2.604723e-05 (PASS)
--> 32 bit integer output
fl1.bit: RMS=7.936754e-06 (PASS) maxdiff=2.533197e-05 (PASS)
fl2.bit: RMS=7.837830e-06 (PASS) maxdiff=2.342463e-05 (PASS)
fl3.bit: RMS=7.928321e-06 (PASS) maxdiff=2.485514e-05 (PASS)
fl4.bit: RMS=7.784658e-06 (PASS) maxdiff=2.521276e-05 (PASS)
fl5.bit: RMS=1.677634e-05 (LIMITED) maxdiff=6.681681e-05 (FAIL)
fl6.bit: RMS=1.071518e-05 (LIMITED) maxdiff=4.619360e-05 (PASS)
fl7.bit: RMS=7.469690e-06 (PASS) maxdiff=2.658367e-05 (PASS)
fl8.bit: RMS=7.923985e-06 (PASS) maxdiff=2.604723e-05 (PASS)
--> 24 bit integer output
fl1.bit: RMS=7.936754e-06 (PASS) maxdiff=2.533197e-05 (PASS)
fl2.bit: RMS=7.837830e-06 (PASS) maxdiff=2.342463e-05 (PASS)
fl3.bit: RMS=7.928321e-06 (PASS) maxdiff=2.485514e-05 (PASS)
fl4.bit: RMS=7.784658e-06 (PASS) maxdiff=2.521276e-05 (PASS)
fl5.bit: RMS=1.677634e-05 (LIMITED) maxdiff=6.681681e-05 (FAIL)
fl6.bit: RMS=1.071518e-05 (LIMITED) maxdiff=4.619360e-05 (PASS)
fl7.bit: RMS=7.469690e-06 (PASS) maxdiff=2.658367e-05 (PASS)
fl8.bit: RMS=7.923985e-06 (PASS) maxdiff=2.604723e-05 (PASS)
--> 32 bit floating point output
fl1.bit: RMS=7.936754e-06 (PASS) maxdiff=2.533197e-05 (PASS)
fl2.bit: RMS=7.837830e-06 (PASS) maxdiff=2.342463e-05 (PASS)
fl3.bit: RMS=7.928321e-06 (PASS) maxdiff=2.485514e-05 (PASS)
fl4.bit: RMS=7.784658e-06 (PASS) maxdiff=2.521276e-05 (PASS)
fl5.bit: RMS=1.677634e-05 (LIMITED) maxdiff=6.681681e-05 (FAIL)
fl6.bit: RMS=1.071518e-05 (LIMITED) maxdiff=4.619360e-05 (PASS)
fl7.bit: RMS=7.469690e-06 (PASS) maxdiff=2.658367e-05 (PASS)
fl8.bit: RMS=7.923985e-06 (PASS) maxdiff=2.604723e-05 (PASS)
==== Layer 2 ====
--> 16 bit signed integer output
fl10.bit: RMS=7.983482e-06 (PASS) maxdiff=2.837181e-05 (PASS)
fl11.bit: RMS=7.971939e-06 (PASS) maxdiff=3.039837e-05 (PASS)
fl12.bit: RMS=7.947400e-06 (PASS) maxdiff=2.884865e-05 (PASS)
fl13.bit: RMS=7.871138e-06 (PASS) maxdiff=2.616644e-05 (PASS)
fl14.bit: RMS=1.845901e-05 (LIMITED) maxdiff=6.735325e-05 (FAIL)
fl15.bit: RMS=9.506695e-06 (LIMITED) maxdiff=3.713369e-05 (PASS)
fl16.bit: RMS=8.529689e-06 (PASS) maxdiff=4.535913e-05 (PASS)
--> 32 bit integer output
fl10.bit: RMS=7.983482e-06 (PASS) maxdiff=2.837181e-05 (PASS)
fl11.bit: RMS=7.971939e-06 (PASS) maxdiff=3.039837e-05 (PASS)
fl12.bit: RMS=7.947400e-06 (PASS) maxdiff=2.884865e-05 (PASS)
fl13.bit: RMS=7.871138e-06 (PASS) maxdiff=2.616644e-05 (PASS)
fl14.bit: RMS=1.845901e-05 (LIMITED) maxdiff=6.735325e-05 (FAIL)
fl15.bit: RMS=9.506695e-06 (LIMITED) maxdiff=3.713369e-05 (PASS)
fl16.bit: RMS=8.529689e-06 (PASS) maxdiff=4.535913e-05 (PASS)
--> 24 bit integer output
fl10.bit: RMS=7.983482e-06 (PASS) maxdiff=2.837181e-05 (PASS)
fl11.bit: RMS=7.971939e-06 (PASS) maxdiff=3.039837e-05 (PASS)
fl12.bit: RMS=7.947400e-06 (PASS) maxdiff=2.884865e-05 (PASS)
fl13.bit: RMS=7.871138e-06 (PASS) maxdiff=2.616644e-05 (PASS)
fl14.bit: RMS=1.845901e-05 (LIMITED) maxdiff=6.735325e-05 (FAIL)
fl15.bit: RMS=9.506695e-06 (LIMITED) maxdiff=3.713369e-05 (PASS)
fl16.bit: RMS=8.529689e-06 (PASS) maxdiff=4.535913e-05 (PASS)
--> 32 bit floating point output
fl10.bit: RMS=7.983482e-06 (PASS) maxdiff=2.837181e-05 (PASS)
fl11.bit: RMS=7.971939e-06 (PASS) maxdiff=3.039837e-05 (PASS)
fl12.bit: RMS=7.947400e-06 (PASS) maxdiff=2.884865e-05 (PASS)
fl13.bit: RMS=7.871138e-06 (PASS) maxdiff=2.616644e-05 (PASS)
fl14.bit: RMS=1.845901e-05 (LIMITED) maxdiff=6.735325e-05 (FAIL)
fl15.bit: RMS=9.506695e-06 (LIMITED) maxdiff=3.713369e-05 (PASS)
fl16.bit: RMS=8.529689e-06 (PASS) maxdiff=4.535913e-05 (PASS)
==== Layer 3 ====
--> 16 bit signed integer output
compl.bit: RMS=7.927192e-06 (PASS) maxdiff=2.676249e-05 (PASS)
--> 32 bit integer output
compl.bit: RMS=7.927192e-06 (PASS) maxdiff=2.676249e-05 (PASS)
--> 24 bit integer output
compl.bit: RMS=7.927192e-06 (PASS) maxdiff=2.676249e-05 (PASS)
--> 32 bit floating point output
compl.bit: RMS=7.927192e-06 (PASS) maxdiff=2.676249e-05 (PASS)
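For anyone wondering how to read the two columns, here is a rough Python sketch of the RMS and maximum-difference comparison between decoded and reference samples. The thresholds are the ones I understand ISO/IEC 11172-4 to use for samples normalized to [-1, 1); that is an assumption that happens to agree with the labels in the logs above, not something read out of compliance.pl. The function names are hypothetical.

```python
import math

# Assumed limits (ISO/IEC 11172-4 as I understand it), not taken from compliance.pl.
RMS_FULL = 2 ** -15 / math.sqrt(12)     # about 8.81e-06
RMS_LIMITED = 2 ** -11 / math.sqrt(12)  # about 1.41e-04
MAXDIFF_FULL = 2 ** -14                 # about 6.10e-05


def rms_and_maxdiff(decoded, reference):
    """Return (RMS error, largest absolute error) of two equal-length sample lists."""
    if not decoded or len(decoded) != len(reference):
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [d - r for d, r in zip(decoded, reference)]
    rms = math.sqrt(sum(x * x for x in diffs) / len(diffs))
    maxdiff = max(abs(x) for x in diffs)
    return rms, maxdiff


def classify(rms, maxdiff):
    """Label both values the way the log lines above do."""
    if rms < RMS_FULL:
        rms_label = "PASS"
    elif rms < RMS_LIMITED:
        rms_label = "LIMITED"
    else:
        rms_label = "FAIL"
    max_label = "PASS" if maxdiff <= MAXDIFF_FULL else "FAIL"
    return rms_label, max_label


# Values copied from the logs: fl5.bit (generic_nofpu) and fl1.bit (arm_nofpu).
print(classify(1.677634e-05, 6.681681e-05))
print(classify(3.486054e-02, 5.002832e-02))
```

Under these assumed limits the arm_nofpu RMS of about 3.5e-02 is several thousand times above the full-accuracy bound, so this is not a rounding issue.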
Thanks for the time you are taking (and thanks also to the folks being
spammed with this discussion ;-). I'm confident we'll get to a bright
future with mp3 decoding on Debian/ARM soon.
Alrighty then,
Thomas