[Debichem-devel] chemfp: New upstream release 1.1p1 available

Michael Banck mbanck at debian.org
Sun Mar 10 15:20:54 UTC 2013


Hi Andrew,

On Wed, Mar 06, 2013 at 01:13:56AM +0100, Andrew Dalke wrote:
> > OpenMP should be doable.
> > 
> > If the SSSE3 support is not runtime-detected, we will have to configure
> > it out on ia32, as the baseline is currently a regular i686 CPU (without
> > SSE) I believe.
> 
> There are two aspects related to detection.
> 
> The first is, does the compiler understand OpenMP and SSSE3, and if
> so, what are the command-line flags? Chemfp has a set of hard-coded
> flags. For Debian, which has a gcc more recent than 4.1, this will
> work just fine. The options are:
> 
>     # I'm going to presume that everyone is using an Intel-like processor
>     "gcc": OMP("-fopenmp") + SSSE3("-mssse3") + ["-O3"],
> 
> where OMP and SSSE3 differ based on the values of
> 
>    --with-openmp / --without-openmp
>    --with-ssse3 / --without-ssse3
> 
> I do not have experience in compiling this with non-Intel-like
> processors. It may be worthwhile to invest in a Raspberry Pi
> just to have experience with making it portable to ARM. However,
> this is a low-priority thought for me. For ARM support it will
> likely be best to just use the --without-* flags.

Yeah.  ARM has NEON which is similar to SSE*, but I don't know how it
compares to SSSE3.  Seeing how it becomes more and more popular, it
might make sense to support it at some point (contrary to e.g. PowerPC's
Altivec I'd guess, unless you have customers asking for it).
 
> > On amd64 I am less sure, but I think there are some x86-64 CPUs around
> > which do not support it.  I assume the code will result in a SIGILL on
> > those CPUs?  That might also be a problem.
> 
> The second part of the detection is to see if the CPU supports
> the instruction. After all, the compiler might be able to
> generate code that the CPU can't actually use. This should not
> be a problem. Chemfp has a run-time check to see if the chip-specific
> instructions are available. These look like:
> 
> int chemfp_has_ssse3(void) {
> #if defined(GENERATE_SSSE3)
>   return (get_cpuid_flags() & bit_SSSE3);
> #else
>   (void)(get_cpuid_flags); /* suppress compiler warning */
>   return 0;
> #endif
> }
> 
> where 'get_cpuid_flags()' is based on cpuid(info, &eax, &ebx, &ecx, &edx).
> 
> The same is true for POPCNT support on even newer instruction
> sets.
> 
> If there is a problem, let me know and I will gladly work
> to make it fit what Debian needs.
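
For reference, a run-time check along these lines can be sketched with
the <cpuid.h> helper that GCC and Clang ship.  This is only an
illustration, not chemfp's actual get_cpuid_flags(): the function and
macro names below are made up, and the bit positions are the documented
ones for CPUID leaf 1 (ECX bit 0 = SSE3, bit 9 = SSSE3, bit 23 = POPCNT).

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper providing __get_cpuid() */
#endif

/* Hypothetical names for the CPUID leaf 1 ECX feature bits. */
#define FEATURE_SSE3   (1u << 0)
#define FEATURE_SSSE3  (1u << 9)
#define FEATURE_POPCNT (1u << 23)

/* Return the ECX feature word of CPUID leaf 1, or 0 when the leaf is
   unavailable or the target is not an x86 CPU. */
static unsigned int cpuid_leaf1_ecx(void) {
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return ecx;
#endif
    return 0;
}

/* Each helper returns 1 if the CPU reports the feature, else 0. */
int cpu_has_sse3(void)   { return (cpuid_leaf1_ecx() & FEATURE_SSE3) != 0; }
int cpu_has_ssse3(void)  { return (cpuid_leaf1_ecx() & FEATURE_SSSE3) != 0; }
int cpu_has_popcnt(void) { return (cpuid_leaf1_ecx() & FEATURE_POPCNT) != 0; }
```

Unlike the chemfp snippet quoted above, these helpers normalise the
result to 0 or 1 instead of returning the masked flag word.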

I have now tried building it on my i386 Centrino (which does not have
SSSE3), and the test suite aborts with SIGILL in
chemfp_count_tanimoto_arena_single:

|ssssssssssss....................................................ss....s.....s...........
|Program received signal SIGILL, Illegal instruction.
|chemfp_count_tanimoto_arena_single (threshold=0.20000000000000001,
|num_bits=166, query_storage_size=24, query_arena=0x89ae1d0 "",
|query_start=10,
|    query_end=<value optimized out>, target_storage_size=24,
|target_arena=0x89ae1d0 "", target_start=10, target_end=20,
|target_popcount_indices=0x896d734,
|    result_counts=0x88b77a0) at src/search_core.c:140
|140           start_target_popcount = (int)(query_popcount * threshold);

Disassembling that routine shows it uses SSE-family instructions (pshufd,
for example, which is strictly SSE2 but is also enabled by -mssse3),
apparently without a CPUID check guarding them:

|(gdb) disassemble
|Dump of assembler code for function chemfp_count_tanimoto_arena_single:
[...]
|0xb78ccbf6 <chemfp_count_tanimoto_arena_single+422>:    pshufd $0x0,%xmm1,%xmm0
[...]
|End of assembler dump.

I guess this was introduced by GCC on its own, as I cannot see a reason
for it off-hand in the code.  I have attached the full assembler output
in case it is helpful for you.

The question is whether there are lots of other places in the code as
well where SSSE3 instructions have crept in (in case my guess is right).
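
If GCC's auto-vectoriser is indeed the culprit, one possible way to
contain it (an assumption on my side, not something chemfp does as far
as I know) would be to build the file with baseline flags and allow
SSSE3 only in individually tagged functions, selected by a run-time
check.  A minimal sketch, with hypothetical function names, using GCC's
target attribute and __builtin_cpu_supports (the latter needs GCC 4.8 or
newer):

```c
#include <stddef.h>

/* Baseline version: built with whatever flags the file gets (e.g. plain
   i686), so it has to run on every supported CPU. */
static int count_bits_portable(const unsigned char *fp, size_t len) {
    size_t i;
    int count = 0;
    for (i = 0; i < len; i++) {
        unsigned char b = fp[i];
        while (b) {                      /* clear the lowest set bit */
            b &= (unsigned char)(b - 1);
            count++;
        }
    }
    return count;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Same C code, but the compiler may use instructions up to SSSE3 in
   this one function only. */
__attribute__((target("ssse3")))
static int count_bits_ssse3(const unsigned char *fp, size_t len) {
    size_t i;
    int count = 0;
    for (i = 0; i < len; i++) {
        unsigned char b = fp[i];
        while (b) {
            b &= (unsigned char)(b - 1);
            count++;
        }
    }
    return count;
}
#endif

/* Hypothetical dispatcher: use the SSSE3 build only when the CPU
   reports support for it, otherwise fall back to the baseline one. */
int count_bits(const unsigned char *fp, size_t len) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("ssse3"))
        return count_bits_ssse3(fp, len);
#endif
    return count_bits_portable(fp, len);
}
```

Another option would be to keep the SSSE3 code in a separate source file
and pass -mssse3 only when compiling that file.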

chemfp-1.1p1 built and tested fine on my amd64 notebook, but that
machine supports SSSE3.


Cheers,

Michael
-------------- next part --------------
Dump of assembler code for function chemfp_count_tanimoto_arena_single:
0xb78cca50 <chemfp_count_tanimoto_arena_single+0>:	push   %ebp
0xb78cca51 <chemfp_count_tanimoto_arena_single+1>:	mov    %esp,%ebp
0xb78cca53 <chemfp_count_tanimoto_arena_single+3>:	push   %edi
0xb78cca54 <chemfp_count_tanimoto_arena_single+4>:	push   %esi
0xb78cca55 <chemfp_count_tanimoto_arena_single+5>:	push   %ebx
0xb78cca56 <chemfp_count_tanimoto_arena_single+6>:	sub    $0x9c,%esp
0xb78cca5c <chemfp_count_tanimoto_arena_single+12>:	mov    0x1c(%ebp),%edi
0xb78cca5f <chemfp_count_tanimoto_arena_single+15>:	mov    0x20(%ebp),%esi
0xb78cca62 <chemfp_count_tanimoto_arena_single+18>:	call   0xb78ca3f7 <__i686.get_pc_thunk.bx>
0xb78cca67 <chemfp_count_tanimoto_arena_single+23>:	add    $0x25aa5,%ebx
0xb78cca6d <chemfp_count_tanimoto_arena_single+29>:	fldl   0x8(%ebp)
0xb78cca70 <chemfp_count_tanimoto_arena_single+32>:	fstpl  -0x28(%ebp)
0xb78cca73 <chemfp_count_tanimoto_arena_single+35>:	cmp    %esi,%edi
0xb78cca75 <chemfp_count_tanimoto_arena_single+37>:	jge    0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78cca7b <chemfp_count_tanimoto_arena_single+43>:	fldz   
0xb78cca7d <chemfp_count_tanimoto_arena_single+45>:	fldl   -0x28(%ebp)
0xb78cca80 <chemfp_count_tanimoto_arena_single+48>:	fucomip %st(1),%st
0xb78cca82 <chemfp_count_tanimoto_arena_single+50>:	fstp   %st(0)
0xb78cca84 <chemfp_count_tanimoto_arena_single+52>:	jbe    0xb78ccaa9 <chemfp_count_tanimoto_arena_single+89>
0xb78cca86 <chemfp_count_tanimoto_arena_single+54>:	fildl  0x10(%ebp)
0xb78cca89 <chemfp_count_tanimoto_arena_single+57>:	fld    %st(0)
0xb78cca8b <chemfp_count_tanimoto_arena_single+59>:	fdivrs -0x13498(%ebx)
0xb78cca91 <chemfp_count_tanimoto_arena_single+65>:	fldl   -0x28(%ebp)
0xb78cca94 <chemfp_count_tanimoto_arena_single+68>:	fxch   %st(1)
0xb78cca96 <chemfp_count_tanimoto_arena_single+70>:	fucomip %st(1),%st
0xb78cca98 <chemfp_count_tanimoto_arena_single+72>:	fstp   %st(0)
0xb78cca9a <chemfp_count_tanimoto_arena_single+74>:	jbe    0xb78ccaa7 <chemfp_count_tanimoto_arena_single+87>
0xb78cca9c <chemfp_count_tanimoto_arena_single+76>:	fdivrs -0x13494(%ebx)
0xb78ccaa2 <chemfp_count_tanimoto_arena_single+82>:	fstpl  -0x28(%ebp)
0xb78ccaa5 <chemfp_count_tanimoto_arena_single+85>:	jmp    0xb78ccaa9 <chemfp_count_tanimoto_arena_single+89>
0xb78ccaa7 <chemfp_count_tanimoto_arena_single+87>:	fstp   %st(0)
0xb78ccaa9 <chemfp_count_tanimoto_arena_single+89>:	mov    0x30(%ebp),%eax
0xb78ccaac <chemfp_count_tanimoto_arena_single+92>:	cmp    %eax,0x2c(%ebp)
0xb78ccaaf <chemfp_count_tanimoto_arena_single+95>:	jl     0xb78ccb68 <chemfp_count_tanimoto_arena_single+280>
0xb78ccab5 <chemfp_count_tanimoto_arena_single+101>:	sub    %edi,%esi
0xb78ccab7 <chemfp_count_tanimoto_arena_single+103>:	test   %esi,%esi
0xb78ccab9 <chemfp_count_tanimoto_arena_single+105>:	jle    0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78ccabf <chemfp_count_tanimoto_arena_single+111>:	mov    0x38(%ebp),%edx
0xb78ccac2 <chemfp_count_tanimoto_arena_single+114>:	and    $0xf,%edx
0xb78ccac5 <chemfp_count_tanimoto_arena_single+117>:	shr    $0x2,%edx
0xb78ccac8 <chemfp_count_tanimoto_arena_single+120>:	neg    %edx
0xb78ccaca <chemfp_count_tanimoto_arena_single+122>:	and    $0x3,%edx
0xb78ccacd <chemfp_count_tanimoto_arena_single+125>:	cmp    %esi,%edx
0xb78ccacf <chemfp_count_tanimoto_arena_single+127>:	cmova  %esi,%edx
0xb78ccad2 <chemfp_count_tanimoto_arena_single+130>:	test   %edx,%edx
0xb78ccad4 <chemfp_count_tanimoto_arena_single+132>:	je     0xb78ccebe <chemfp_count_tanimoto_arena_single+1134>
0xb78ccada <chemfp_count_tanimoto_arena_single+138>:	mov    0x38(%ebp),%ecx
0xb78ccadd <chemfp_count_tanimoto_arena_single+141>:	xor    %eax,%eax
0xb78ccadf <chemfp_count_tanimoto_arena_single+143>:	nop
0xb78ccae0 <chemfp_count_tanimoto_arena_single+144>:	movl   $0x0,(%ecx,%eax,4)
0xb78ccae7 <chemfp_count_tanimoto_arena_single+151>:	add    $0x1,%eax
0xb78ccaea <chemfp_count_tanimoto_arena_single+154>:	cmp    %eax,%edx
0xb78ccaec <chemfp_count_tanimoto_arena_single+156>:	ja     0xb78ccae0 <chemfp_count_tanimoto_arena_single+144>
0xb78ccaee <chemfp_count_tanimoto_arena_single+158>:	cmp    %edx,%esi
0xb78ccaf0 <chemfp_count_tanimoto_arena_single+160>:	je     0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78ccaf2 <chemfp_count_tanimoto_arena_single+162>:	mov    %esi,%ecx
0xb78ccaf4 <chemfp_count_tanimoto_arena_single+164>:	sub    %edx,%ecx
0xb78ccaf6 <chemfp_count_tanimoto_arena_single+166>:	mov    %ecx,-0x20(%ebp)
0xb78ccaf9 <chemfp_count_tanimoto_arena_single+169>:	shr    $0x2,%ecx
0xb78ccafc <chemfp_count_tanimoto_arena_single+172>:	lea    0x0(,%ecx,4),%edi
0xb78ccb03 <chemfp_count_tanimoto_arena_single+179>:	test   %edi,%edi
0xb78ccb05 <chemfp_count_tanimoto_arena_single+181>:	mov    %edi,-0x28(%ebp)
0xb78ccb08 <chemfp_count_tanimoto_arena_single+184>:	je     0xb78ccb3f <chemfp_count_tanimoto_arena_single+239>
0xb78ccb0a <chemfp_count_tanimoto_arena_single+186>:	mov    0x38(%ebp),%edi
0xb78ccb0d <chemfp_count_tanimoto_arena_single+189>:	pxor   %xmm0,%xmm0
0xb78ccb11 <chemfp_count_tanimoto_arena_single+193>:	mov    %eax,-0x2c(%ebp)
0xb78ccb14 <chemfp_count_tanimoto_arena_single+196>:	lea    (%edi,%edx,4),%edx
0xb78ccb17 <chemfp_count_tanimoto_arena_single+199>:	mov    %edx,-0x38(%ebp)
0xb78ccb1a <chemfp_count_tanimoto_arena_single+202>:	mov    -0x38(%ebp),%edi
0xb78ccb1d <chemfp_count_tanimoto_arena_single+205>:	xor    %edx,%edx
0xb78ccb1f <chemfp_count_tanimoto_arena_single+207>:	nop
0xb78ccb20 <chemfp_count_tanimoto_arena_single+208>:	mov    %edx,%eax
0xb78ccb22 <chemfp_count_tanimoto_arena_single+210>:	add    $0x1,%edx
0xb78ccb25 <chemfp_count_tanimoto_arena_single+213>:	shl    $0x4,%eax
0xb78ccb28 <chemfp_count_tanimoto_arena_single+216>:	cmp    %ecx,%edx
0xb78ccb2a <chemfp_count_tanimoto_arena_single+218>:	movdqa %xmm0,(%edi,%eax,1)
0xb78ccb2f <chemfp_count_tanimoto_arena_single+223>:	jb     0xb78ccb20 <chemfp_count_tanimoto_arena_single+208>
0xb78ccb31 <chemfp_count_tanimoto_arena_single+225>:	mov    -0x2c(%ebp),%eax
0xb78ccb34 <chemfp_count_tanimoto_arena_single+228>:	mov    -0x28(%ebp),%edx
0xb78ccb37 <chemfp_count_tanimoto_arena_single+231>:	add    -0x28(%ebp),%eax
0xb78ccb3a <chemfp_count_tanimoto_arena_single+234>:	cmp    %edx,-0x20(%ebp)
0xb78ccb3d <chemfp_count_tanimoto_arena_single+237>:	je     0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78ccb3f <chemfp_count_tanimoto_arena_single+239>:	mov    0x38(%ebp),%ecx
0xb78ccb42 <chemfp_count_tanimoto_arena_single+242>:	lea    (%ecx,%eax,4),%edx
0xb78ccb45 <chemfp_count_tanimoto_arena_single+245>:	lea    0x0(%esi),%esi
0xb78ccb48 <chemfp_count_tanimoto_arena_single+248>:	add    $0x1,%eax
0xb78ccb4b <chemfp_count_tanimoto_arena_single+251>:	movl   $0x0,(%edx)
0xb78ccb51 <chemfp_count_tanimoto_arena_single+257>:	add    $0x4,%edx
0xb78ccb54 <chemfp_count_tanimoto_arena_single+260>:	cmp    %eax,%esi
0xb78ccb56 <chemfp_count_tanimoto_arena_single+262>:	jg     0xb78ccb48 <chemfp_count_tanimoto_arena_single+248>
0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>:	add    $0x9c,%esp
0xb78ccb5e <chemfp_count_tanimoto_arena_single+270>:	xor    %eax,%eax
0xb78ccb60 <chemfp_count_tanimoto_arena_single+272>:	pop    %ebx
0xb78ccb61 <chemfp_count_tanimoto_arena_single+273>:	pop    %esi
0xb78ccb62 <chemfp_count_tanimoto_arena_single+274>:	pop    %edi
0xb78ccb63 <chemfp_count_tanimoto_arena_single+275>:	pop    %ebp
0xb78ccb64 <chemfp_count_tanimoto_arena_single+276>:	ret    
0xb78ccb65 <chemfp_count_tanimoto_arena_single+277>:	lea    0x0(%esi),%esi
0xb78ccb68 <chemfp_count_tanimoto_arena_single+280>:	fld1   
0xb78ccb6a <chemfp_count_tanimoto_arena_single+282>:	fldl   -0x28(%ebp)
0xb78ccb6d <chemfp_count_tanimoto_arena_single+285>:	fucomip %st(1),%st
0xb78ccb6f <chemfp_count_tanimoto_arena_single+287>:	fstp   %st(0)
0xb78ccb71 <chemfp_count_tanimoto_arena_single+289>:	ja     0xb78ccab5 <chemfp_count_tanimoto_arena_single+101>
0xb78ccb77 <chemfp_count_tanimoto_arena_single+295>:	fldz   
0xb78ccb79 <chemfp_count_tanimoto_arena_single+297>:	fld    %st(0)
0xb78ccb7b <chemfp_count_tanimoto_arena_single+299>:	fldl   -0x28(%ebp)
0xb78ccb7e <chemfp_count_tanimoto_arena_single+302>:	fxch   %st(1)
0xb78ccb80 <chemfp_count_tanimoto_arena_single+304>:	fucomip %st(1),%st
0xb78ccb82 <chemfp_count_tanimoto_arena_single+306>:	fstp   %st(0)
0xb78ccb84 <chemfp_count_tanimoto_arena_single+308>:	jb     0xb78ccc49 <chemfp_count_tanimoto_arena_single+505>
0xb78ccb8a <chemfp_count_tanimoto_arena_single+314>:	fstp   %st(0)
0xb78ccb8c <chemfp_count_tanimoto_arena_single+316>:	sub    %edi,%esi
0xb78ccb8e <chemfp_count_tanimoto_arena_single+318>:	test   %esi,%esi
0xb78ccb90 <chemfp_count_tanimoto_arena_single+320>:	mov    %esi,-0x28(%ebp)
0xb78ccb93 <chemfp_count_tanimoto_arena_single+323>:	jle    0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78ccb95 <chemfp_count_tanimoto_arena_single+325>:	mov    0x38(%ebp),%edx
0xb78ccb98 <chemfp_count_tanimoto_arena_single+328>:	mov    0x30(%ebp),%ecx
0xb78ccb9b <chemfp_count_tanimoto_arena_single+331>:	sub    0x2c(%ebp),%ecx
0xb78ccb9e <chemfp_count_tanimoto_arena_single+334>:	and    $0xf,%edx
0xb78ccba1 <chemfp_count_tanimoto_arena_single+337>:	shr    $0x2,%edx
0xb78ccba4 <chemfp_count_tanimoto_arena_single+340>:	neg    %edx
0xb78ccba6 <chemfp_count_tanimoto_arena_single+342>:	and    $0x3,%edx
0xb78ccba9 <chemfp_count_tanimoto_arena_single+345>:	cmp    %esi,%edx
0xb78ccbab <chemfp_count_tanimoto_arena_single+347>:	cmova  %esi,%edx
0xb78ccbae <chemfp_count_tanimoto_arena_single+350>:	test   %edx,%edx
0xb78ccbb0 <chemfp_count_tanimoto_arena_single+352>:	je     0xb78cceb7 <chemfp_count_tanimoto_arena_single+1127>
0xb78ccbb6 <chemfp_count_tanimoto_arena_single+358>:	mov    0x38(%ebp),%esi
0xb78ccbb9 <chemfp_count_tanimoto_arena_single+361>:	xor    %eax,%eax
0xb78ccbbb <chemfp_count_tanimoto_arena_single+363>:	nop
0xb78ccbbc <chemfp_count_tanimoto_arena_single+364>:	lea    0x0(%esi,%eiz,1),%esi
0xb78ccbc0 <chemfp_count_tanimoto_arena_single+368>:	mov    %ecx,(%esi,%eax,4)
0xb78ccbc3 <chemfp_count_tanimoto_arena_single+371>:	add    $0x1,%eax
0xb78ccbc6 <chemfp_count_tanimoto_arena_single+374>:	cmp    %eax,%edx
0xb78ccbc8 <chemfp_count_tanimoto_arena_single+376>:	ja     0xb78ccbc0 <chemfp_count_tanimoto_arena_single+368>
0xb78ccbca <chemfp_count_tanimoto_arena_single+378>:	cmp    %edx,-0x28(%ebp)
0xb78ccbcd <chemfp_count_tanimoto_arena_single+381>:	je     0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78ccbcf <chemfp_count_tanimoto_arena_single+383>:	mov    -0x28(%ebp),%esi
0xb78ccbd2 <chemfp_count_tanimoto_arena_single+386>:	sub    %edx,%esi
0xb78ccbd4 <chemfp_count_tanimoto_arena_single+388>:	mov    %esi,-0x2c(%ebp)
0xb78ccbd7 <chemfp_count_tanimoto_arena_single+391>:	shr    $0x2,%esi
0xb78ccbda <chemfp_count_tanimoto_arena_single+394>:	lea    0x0(,%esi,4),%edi
0xb78ccbe1 <chemfp_count_tanimoto_arena_single+401>:	test   %edi,%edi
0xb78ccbe3 <chemfp_count_tanimoto_arena_single+403>:	mov    %edi,-0x20(%ebp)
0xb78ccbe6 <chemfp_count_tanimoto_arena_single+406>:	je     0xb78ccc2b <chemfp_count_tanimoto_arena_single+475>
0xb78ccbe8 <chemfp_count_tanimoto_arena_single+408>:	mov    0x38(%ebp),%edi
0xb78ccbeb <chemfp_count_tanimoto_arena_single+411>:	mov    %ecx,-0x7c(%ebp)
0xb78ccbee <chemfp_count_tanimoto_arena_single+414>:	mov    %eax,-0x3c(%ebp)
0xb78ccbf1 <chemfp_count_tanimoto_arena_single+417>:	movd   -0x7c(%ebp),%xmm1
0xb78ccbf6 <chemfp_count_tanimoto_arena_single+422>:	pshufd $0x0,%xmm1,%xmm0
0xb78ccbfb <chemfp_count_tanimoto_arena_single+427>:	lea    (%edi,%edx,4),%edx
0xb78ccbfe <chemfp_count_tanimoto_arena_single+430>:	mov    %edx,-0x38(%ebp)
0xb78ccc01 <chemfp_count_tanimoto_arena_single+433>:	mov    -0x38(%ebp),%edi
0xb78ccc04 <chemfp_count_tanimoto_arena_single+436>:	xor    %edx,%edx
0xb78ccc06 <chemfp_count_tanimoto_arena_single+438>:	xchg   %ax,%ax
0xb78ccc08 <chemfp_count_tanimoto_arena_single+440>:	mov    %edx,%eax
0xb78ccc0a <chemfp_count_tanimoto_arena_single+442>:	add    $0x1,%edx
0xb78ccc0d <chemfp_count_tanimoto_arena_single+445>:	shl    $0x4,%eax
0xb78ccc10 <chemfp_count_tanimoto_arena_single+448>:	cmp    %esi,%edx
0xb78ccc12 <chemfp_count_tanimoto_arena_single+450>:	movdqa %xmm0,(%edi,%eax,1)
0xb78ccc17 <chemfp_count_tanimoto_arena_single+455>:	jb     0xb78ccc08 <chemfp_count_tanimoto_arena_single+440>
0xb78ccc19 <chemfp_count_tanimoto_arena_single+457>:	mov    -0x3c(%ebp),%eax
0xb78ccc1c <chemfp_count_tanimoto_arena_single+460>:	mov    -0x20(%ebp),%edx
0xb78ccc1f <chemfp_count_tanimoto_arena_single+463>:	add    -0x20(%ebp),%eax
0xb78ccc22 <chemfp_count_tanimoto_arena_single+466>:	cmp    %edx,-0x2c(%ebp)
0xb78ccc25 <chemfp_count_tanimoto_arena_single+469>:	je     0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78ccc2b <chemfp_count_tanimoto_arena_single+475>:	mov    0x38(%ebp),%esi
0xb78ccc2e <chemfp_count_tanimoto_arena_single+478>:	lea    (%esi,%eax,4),%edx
0xb78ccc31 <chemfp_count_tanimoto_arena_single+481>:	mov    -0x28(%ebp),%esi
0xb78ccc34 <chemfp_count_tanimoto_arena_single+484>:	lea    0x0(%esi,%eiz,1),%esi
0xb78ccc38 <chemfp_count_tanimoto_arena_single+488>:	add    $0x1,%eax
0xb78ccc3b <chemfp_count_tanimoto_arena_single+491>:	mov    %ecx,(%edx)
0xb78ccc3d <chemfp_count_tanimoto_arena_single+493>:	add    $0x4,%edx
0xb78ccc40 <chemfp_count_tanimoto_arena_single+496>:	cmp    %eax,%esi
0xb78ccc42 <chemfp_count_tanimoto_arena_single+498>:	jg     0xb78ccc38 <chemfp_count_tanimoto_arena_single+488>
0xb78ccc44 <chemfp_count_tanimoto_arena_single+500>:	jmp    0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78ccc49 <chemfp_count_tanimoto_arena_single+505>:	mov    0x10(%ebp),%eax
0xb78ccc4c <chemfp_count_tanimoto_arena_single+508>:	mov    0x10(%ebp),%edx
0xb78ccc4f <chemfp_count_tanimoto_arena_single+511>:	add    $0xe,%eax
0xb78ccc52 <chemfp_count_tanimoto_arena_single+514>:	add    $0x7,%edx
0xb78ccc55 <chemfp_count_tanimoto_arena_single+517>:	cmovs  %eax,%edx
0xb78ccc58 <chemfp_count_tanimoto_arena_single+520>:	mov    0x34(%ebp),%eax
0xb78ccc5b <chemfp_count_tanimoto_arena_single+523>:	sar    $0x3,%edx
0xb78ccc5e <chemfp_count_tanimoto_arena_single+526>:	mov    %edx,-0x3c(%ebp)
0xb78ccc61 <chemfp_count_tanimoto_arena_single+529>:	test   %eax,%eax
0xb78ccc63 <chemfp_count_tanimoto_arena_single+531>:	je     0xb78cce26 <chemfp_count_tanimoto_arena_single+982>
0xb78ccc69 <chemfp_count_tanimoto_arena_single+537>:	mov    0x14(%ebp),%edx
0xb78ccc6c <chemfp_count_tanimoto_arena_single+540>:	fstps  -0x78(%ebp)
0xb78ccc6f <chemfp_count_tanimoto_arena_single+543>:	sub    %edi,%esi
0xb78ccc71 <chemfp_count_tanimoto_arena_single+545>:	mov    0x10(%ebp),%ecx
0xb78ccc74 <chemfp_count_tanimoto_arena_single+548>:	mov    0x18(%ebp),%eax
0xb78ccc77 <chemfp_count_tanimoto_arena_single+551>:	mov    %edx,0x4(%esp)
0xb78ccc7b <chemfp_count_tanimoto_arena_single+555>:	mov    %ecx,(%esp)
0xb78ccc7e <chemfp_count_tanimoto_arena_single+558>:	mov    %eax,0x8(%esp)
0xb78ccc82 <chemfp_count_tanimoto_arena_single+562>:	call   0xb78d5c90 <chemfp_select_popcount>
0xb78ccc87 <chemfp_count_tanimoto_arena_single+567>:	mov    0x24(%ebp),%edx
0xb78ccc8a <chemfp_count_tanimoto_arena_single+570>:	mov    0x18(%ebp),%ecx
0xb78ccc8d <chemfp_count_tanimoto_arena_single+573>:	mov    %edx,0xc(%esp)
0xb78ccc91 <chemfp_count_tanimoto_arena_single+577>:	mov    0x10(%ebp),%edx
0xb78ccc94 <chemfp_count_tanimoto_arena_single+580>:	mov    %ecx,0x8(%esp)
0xb78ccc98 <chemfp_count_tanimoto_arena_single+584>:	mov    %eax,-0x60(%ebp)
0xb78ccc9b <chemfp_count_tanimoto_arena_single+587>:	mov    0x28(%ebp),%eax
0xb78ccc9e <chemfp_count_tanimoto_arena_single+590>:	mov    %edx,(%esp)
0xb78ccca1 <chemfp_count_tanimoto_arena_single+593>:	mov    %eax,0x10(%esp)
0xb78ccca5 <chemfp_count_tanimoto_arena_single+597>:	mov    0x14(%ebp),%eax
0xb78ccca8 <chemfp_count_tanimoto_arena_single+600>:	mov    %eax,0x4(%esp)
0xb78cccac <chemfp_count_tanimoto_arena_single+604>:	call   0xb78d5ac0 <chemfp_select_intersect_popcount>
0xb78cccb1 <chemfp_count_tanimoto_arena_single+609>:	test   %esi,%esi
0xb78cccb3 <chemfp_count_tanimoto_arena_single+611>:	mov    %esi,-0x5c(%ebp)
0xb78cccb6 <chemfp_count_tanimoto_arena_single+614>:	mov    %eax,-0x44(%ebp)
0xb78cccb9 <chemfp_count_tanimoto_arena_single+617>:	jle    0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78cccbf <chemfp_count_tanimoto_arena_single+623>:	mov    0x14(%ebp),%eax
0xb78cccc2 <chemfp_count_tanimoto_arena_single+626>:	mov    0x30(%ebp),%ecx
0xb78cccc5 <chemfp_count_tanimoto_arena_single+629>:	sub    0x2c(%ebp),%ecx
0xb78cccc8 <chemfp_count_tanimoto_arena_single+632>:	movl   $0x0,-0x58(%ebp)
0xb78ccccf <chemfp_count_tanimoto_arena_single+639>:	imul   %edi,%eax
0xb78cccd2 <chemfp_count_tanimoto_arena_single+642>:	add    0x18(%ebp),%eax
0xb78cccd5 <chemfp_count_tanimoto_arena_single+645>:	mov    %ecx,-0x64(%ebp)
0xb78cccd8 <chemfp_count_tanimoto_arena_single+648>:	mov    %eax,-0x40(%ebp)
0xb78cccdb <chemfp_count_tanimoto_arena_single+651>:	jmp    0xb78ccd0c <chemfp_count_tanimoto_arena_single+700>
0xb78cccdd <chemfp_count_tanimoto_arena_single+653>:	fldz   
0xb78cccdf <chemfp_count_tanimoto_arena_single+655>:	fldl   -0x28(%ebp)
0xb78ccce2 <chemfp_count_tanimoto_arena_single+658>:	fucomip %st(1),%st
0xb78ccce4 <chemfp_count_tanimoto_arena_single+660>:	fstp   %st(0)
0xb78ccce6 <chemfp_count_tanimoto_arena_single+662>:	jne    0xb78cccf6 <chemfp_count_tanimoto_arena_single+678>
0xb78ccce8 <chemfp_count_tanimoto_arena_single+664>:	jp     0xb78cccf6 <chemfp_count_tanimoto_arena_single+678>
0xb78cccea <chemfp_count_tanimoto_arena_single+666>:	mov    -0x64(%ebp),%ecx
0xb78ccced <chemfp_count_tanimoto_arena_single+669>:	mov    -0x58(%ebp),%eax
0xb78cccf0 <chemfp_count_tanimoto_arena_single+672>:	mov    0x38(%ebp),%edx
0xb78cccf3 <chemfp_count_tanimoto_arena_single+675>:	mov    %ecx,(%edx,%eax,4)
0xb78cccf6 <chemfp_count_tanimoto_arena_single+678>:	addl   $0x1,-0x58(%ebp)
0xb78cccfa <chemfp_count_tanimoto_arena_single+682>:	mov    0x14(%ebp),%ecx
0xb78cccfd <chemfp_count_tanimoto_arena_single+685>:	mov    -0x5c(%ebp),%esi
0xb78ccd00 <chemfp_count_tanimoto_arena_single+688>:	add    %ecx,-0x40(%ebp)
0xb78ccd03 <chemfp_count_tanimoto_arena_single+691>:	cmp    %esi,-0x58(%ebp)
0xb78ccd06 <chemfp_count_tanimoto_arena_single+694>:	jge    0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78ccd0c <chemfp_count_tanimoto_arena_single+700>:	mov    -0x40(%ebp),%esi
0xb78ccd0f <chemfp_count_tanimoto_arena_single+703>:	mov    -0x3c(%ebp),%edi
0xb78ccd12 <chemfp_count_tanimoto_arena_single+706>:	mov    %esi,0x4(%esp)
0xb78ccd16 <chemfp_count_tanimoto_arena_single+710>:	mov    %edi,(%esp)
0xb78ccd19 <chemfp_count_tanimoto_arena_single+713>:	call   *-0x60(%ebp)
0xb78ccd1c <chemfp_count_tanimoto_arena_single+716>:	test   %eax,%eax
0xb78ccd1e <chemfp_count_tanimoto_arena_single+718>:	je     0xb78cccdd <chemfp_count_tanimoto_arena_single+653>
0xb78ccd20 <chemfp_count_tanimoto_arena_single+720>:	fldz   
0xb78ccd22 <chemfp_count_tanimoto_arena_single+722>:	mov    0x10(%ebp),%edx
0xb78ccd25 <chemfp_count_tanimoto_arena_single+725>:	fldl   -0x28(%ebp)
0xb78ccd28 <chemfp_count_tanimoto_arena_single+728>:	fucomi %st(1),%st
0xb78ccd2a <chemfp_count_tanimoto_arena_single+730>:	fstp   %st(1)
0xb78ccd2c <chemfp_count_tanimoto_arena_single+732>:	mov    %edx,-0x54(%ebp)
0xb78ccd2f <chemfp_count_tanimoto_arena_single+735>:	movl   $0x0,-0x50(%ebp)
0xb78ccd36 <chemfp_count_tanimoto_arena_single+742>:	jp     0xb78ccd3a <chemfp_count_tanimoto_arena_single+746>
0xb78ccd38 <chemfp_count_tanimoto_arena_single+744>:	je     0xb78ccd6c <chemfp_count_tanimoto_arena_single+796>
0xb78ccd3a <chemfp_count_tanimoto_arena_single+746>:	mov    %eax,-0x1c(%ebp)
0xb78ccd3d <chemfp_count_tanimoto_arena_single+749>:	fildl  -0x1c(%ebp)
0xb78ccd40 <chemfp_count_tanimoto_arena_single+752>:	fmul   %st,%st(1)
0xb78ccd42 <chemfp_count_tanimoto_arena_single+754>:	fxch   %st(1)
0xb78ccd44 <chemfp_count_tanimoto_arena_single+756>:	mov    %eax,-0x68(%ebp)
0xb78ccd47 <chemfp_count_tanimoto_arena_single+759>:	fisttpl -0x50(%ebp)
0xb78ccd4a <chemfp_count_tanimoto_arena_single+762>:	fdivl  -0x28(%ebp)
0xb78ccd4d <chemfp_count_tanimoto_arena_single+765>:	fstpl  (%esp)
0xb78ccd50 <chemfp_count_tanimoto_arena_single+768>:	call   0xb78ca088 <ceil@plt>
0xb78ccd55 <chemfp_count_tanimoto_arena_single+773>:	mov    0x10(%ebp),%ecx
0xb78ccd58 <chemfp_count_tanimoto_arena_single+776>:	mov    -0x68(%ebp),%eax
0xb78ccd5b <chemfp_count_tanimoto_arena_single+779>:	fisttpl -0x54(%ebp)
0xb78ccd5e <chemfp_count_tanimoto_arena_single+782>:	mov    -0x54(%ebp),%edx
0xb78ccd61 <chemfp_count_tanimoto_arena_single+785>:	cmp    %edx,0x10(%ebp)
0xb78ccd64 <chemfp_count_tanimoto_arena_single+788>:	cmovge %edx,%ecx
0xb78ccd67 <chemfp_count_tanimoto_arena_single+791>:	mov    %ecx,-0x54(%ebp)
0xb78ccd6a <chemfp_count_tanimoto_arena_single+794>:	jmp    0xb78ccd6e <chemfp_count_tanimoto_arena_single+798>
0xb78ccd6c <chemfp_count_tanimoto_arena_single+796>:	fstp   %st(0)
0xb78ccd6e <chemfp_count_tanimoto_arena_single+798>:	mov    -0x54(%ebp),%esi
0xb78ccd71 <chemfp_count_tanimoto_arena_single+801>:	xor    %edi,%edi
0xb78ccd73 <chemfp_count_tanimoto_arena_single+803>:	cmp    %esi,-0x50(%ebp)
0xb78ccd76 <chemfp_count_tanimoto_arena_single+806>:	jg     0xb78cce18 <chemfp_count_tanimoto_arena_single+968>
0xb78ccd7c <chemfp_count_tanimoto_arena_single+812>:	add    -0x50(%ebp),%eax
0xb78ccd7f <chemfp_count_tanimoto_arena_single+815>:	movl   $0x0,-0x4c(%ebp)
0xb78ccd86 <chemfp_count_tanimoto_arena_single+822>:	mov    %eax,-0x48(%ebp)
0xb78ccd89 <chemfp_count_tanimoto_arena_single+825>:	mov    -0x50(%ebp),%eax
0xb78ccd8c <chemfp_count_tanimoto_arena_single+828>:	lea    0x0(%esi,%eiz,1),%esi
0xb78ccd90 <chemfp_count_tanimoto_arena_single+832>:	mov    0x34(%ebp),%edx
0xb78ccd93 <chemfp_count_tanimoto_arena_single+835>:	fildl  -0x48(%ebp)
0xb78ccd96 <chemfp_count_tanimoto_arena_single+838>:	mov    (%edx,%eax,4),%esi
0xb78ccd99 <chemfp_count_tanimoto_arena_single+841>:	mov    0x4(%edx,%eax,4),%eax
0xb78ccd9d <chemfp_count_tanimoto_arena_single+845>:	fstpl  -0x38(%ebp)
0xb78ccda0 <chemfp_count_tanimoto_arena_single+848>:	cmp    %esi,0x2c(%ebp)
0xb78ccda3 <chemfp_count_tanimoto_arena_single+851>:	cmovge 0x2c(%ebp),%esi
0xb78ccda7 <chemfp_count_tanimoto_arena_single+855>:	cmp    %eax,0x30(%ebp)
0xb78ccdaa <chemfp_count_tanimoto_arena_single+858>:	cmovle 0x30(%ebp),%eax
0xb78ccdae <chemfp_count_tanimoto_arena_single+862>:	cmp    %esi,%eax
0xb78ccdb0 <chemfp_count_tanimoto_arena_single+864>:	mov    %eax,-0x20(%ebp)
0xb78ccdb3 <chemfp_count_tanimoto_arena_single+867>:	jle    0xb78cce01 <chemfp_count_tanimoto_arena_single+945>
0xb78ccdb5 <chemfp_count_tanimoto_arena_single+869>:	mov    0x24(%ebp),%edx
0xb78ccdb8 <chemfp_count_tanimoto_arena_single+872>:	imul   %esi,%edx
0xb78ccdbb <chemfp_count_tanimoto_arena_single+875>:	add    0x28(%ebp),%edx
0xb78ccdbe <chemfp_count_tanimoto_arena_single+878>:	jmp    0xb78ccdc3 <chemfp_count_tanimoto_arena_single+883>
0xb78ccdc0 <chemfp_count_tanimoto_arena_single+880>:	add    0x24(%ebp),%edx
0xb78ccdc3 <chemfp_count_tanimoto_arena_single+883>:	mov    -0x40(%ebp),%eax
0xb78ccdc6 <chemfp_count_tanimoto_arena_single+886>:	mov    -0x3c(%ebp),%ecx
0xb78ccdc9 <chemfp_count_tanimoto_arena_single+889>:	mov    %edx,0x8(%esp)
0xb78ccdcd <chemfp_count_tanimoto_arena_single+893>:	mov    %edx,-0x68(%ebp)
0xb78ccdd0 <chemfp_count_tanimoto_arena_single+896>:	mov    %eax,0x4(%esp)
0xb78ccdd4 <chemfp_count_tanimoto_arena_single+900>:	mov    %ecx,(%esp)
0xb78ccdd7 <chemfp_count_tanimoto_arena_single+903>:	call   *-0x44(%ebp)
0xb78ccdda <chemfp_count_tanimoto_arena_single+906>:	mov    -0x68(%ebp),%edx
0xb78ccddd <chemfp_count_tanimoto_arena_single+909>:	mov    %eax,-0x1c(%ebp)
0xb78ccde0 <chemfp_count_tanimoto_arena_single+912>:	lea    0x1(%edi),%eax
0xb78ccde3 <chemfp_count_tanimoto_arena_single+915>:	fildl  -0x1c(%ebp)
0xb78ccde6 <chemfp_count_tanimoto_arena_single+918>:	fldl   -0x38(%ebp)
0xb78ccde9 <chemfp_count_tanimoto_arena_single+921>:	fsub   %st(1),%st
0xb78ccdeb <chemfp_count_tanimoto_arena_single+923>:	fdivrp %st,%st(1)
0xb78ccded <chemfp_count_tanimoto_arena_single+925>:	fldl   -0x28(%ebp)
0xb78ccdf0 <chemfp_count_tanimoto_arena_single+928>:	fxch   %st(1)
0xb78ccdf2 <chemfp_count_tanimoto_arena_single+930>:	fucomip %st(1),%st
0xb78ccdf4 <chemfp_count_tanimoto_arena_single+932>:	fstp   %st(0)
0xb78ccdf6 <chemfp_count_tanimoto_arena_single+934>:	cmovae %eax,%edi
0xb78ccdf9 <chemfp_count_tanimoto_arena_single+937>:	add    $0x1,%esi
0xb78ccdfc <chemfp_count_tanimoto_arena_single+940>:	cmp    %esi,-0x20(%ebp)
0xb78ccdff <chemfp_count_tanimoto_arena_single+943>:	jg     0xb78ccdc0 <chemfp_count_tanimoto_arena_single+880>
0xb78cce01 <chemfp_count_tanimoto_arena_single+945>:	addl   $0x1,-0x4c(%ebp)
0xb78cce05 <chemfp_count_tanimoto_arena_single+949>:	mov    -0x4c(%ebp),%eax
0xb78cce08 <chemfp_count_tanimoto_arena_single+952>:	addl   $0x1,-0x48(%ebp)
0xb78cce0c <chemfp_count_tanimoto_arena_single+956>:	add    -0x50(%ebp),%eax
0xb78cce0f <chemfp_count_tanimoto_arena_single+959>:	cmp    %eax,-0x54(%ebp)
0xb78cce12 <chemfp_count_tanimoto_arena_single+962>:	jge    0xb78ccd90 <chemfp_count_tanimoto_arena_single+832>
0xb78cce18 <chemfp_count_tanimoto_arena_single+968>:	mov    -0x58(%ebp),%eax
0xb78cce1b <chemfp_count_tanimoto_arena_single+971>:	mov    0x38(%ebp),%edx
0xb78cce1e <chemfp_count_tanimoto_arena_single+974>:	mov    %edi,(%edx,%eax,4)
0xb78cce21 <chemfp_count_tanimoto_arena_single+977>:	jmp    0xb78cccf6 <chemfp_count_tanimoto_arena_single+678>
0xb78cce26 <chemfp_count_tanimoto_arena_single+982>:	fstp   %st(0)
0xb78cce28 <chemfp_count_tanimoto_arena_single+984>:	sub    %edi,%esi
0xb78cce2a <chemfp_count_tanimoto_arena_single+986>:	test   %esi,%esi
0xb78cce2c <chemfp_count_tanimoto_arena_single+988>:	mov    %esi,-0x40(%ebp)
0xb78cce2f <chemfp_count_tanimoto_arena_single+991>:	jle    0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78cce35 <chemfp_count_tanimoto_arena_single+997>:	mov    0x2c(%ebp),%eax
0xb78cce38 <chemfp_count_tanimoto_arena_single+1000>:	imul   0x24(%ebp),%eax
0xb78cce3c <chemfp_count_tanimoto_arena_single+1004>:	add    0x28(%ebp),%eax
0xb78cce3f <chemfp_count_tanimoto_arena_single+1007>:	movl   $0x0,-0x38(%ebp)
0xb78cce46 <chemfp_count_tanimoto_arena_single+1014>:	mov    %eax,-0x44(%ebp)
0xb78cce49 <chemfp_count_tanimoto_arena_single+1017>:	mov    0x14(%ebp),%eax
0xb78cce4c <chemfp_count_tanimoto_arena_single+1020>:	imul   %edi,%eax
0xb78cce4f <chemfp_count_tanimoto_arena_single+1023>:	add    0x18(%ebp),%eax
0xb78cce52 <chemfp_count_tanimoto_arena_single+1026>:	mov    %eax,-0x20(%ebp)
0xb78cce55 <chemfp_count_tanimoto_arena_single+1029>:	mov    -0x44(%ebp),%eax
0xb78cce58 <chemfp_count_tanimoto_arena_single+1032>:	xor    %esi,%esi
0xb78cce5a <chemfp_count_tanimoto_arena_single+1034>:	mov    0x2c(%ebp),%edi
0xb78cce5d <chemfp_count_tanimoto_arena_single+1037>:	jmp    0xb78cce63 <chemfp_count_tanimoto_arena_single+1043>
0xb78cce5f <chemfp_count_tanimoto_arena_single+1039>:	nop
0xb78cce60 <chemfp_count_tanimoto_arena_single+1040>:	add    0x24(%ebp),%eax
0xb78cce63 <chemfp_count_tanimoto_arena_single+1043>:	mov    -0x20(%ebp),%edx
0xb78cce66 <chemfp_count_tanimoto_arena_single+1046>:	mov    -0x3c(%ebp),%ecx
0xb78cce69 <chemfp_count_tanimoto_arena_single+1049>:	mov    %eax,0x8(%esp)
0xb78cce6d <chemfp_count_tanimoto_arena_single+1053>:	mov    %eax,-0x68(%ebp)
0xb78cce70 <chemfp_count_tanimoto_arena_single+1056>:	mov    %edx,0x4(%esp)
0xb78cce74 <chemfp_count_tanimoto_arena_single+1060>:	mov    %ecx,(%esp)
0xb78cce77 <chemfp_count_tanimoto_arena_single+1063>:	call   0xb78ca780 <chemfp_byte_tanimoto>
0xb78cce7c <chemfp_count_tanimoto_arena_single+1068>:	lea    0x1(%esi),%edx
0xb78cce7f <chemfp_count_tanimoto_arena_single+1071>:	mov    -0x68(%ebp),%eax
0xb78cce82 <chemfp_count_tanimoto_arena_single+1074>:	fldl   -0x28(%ebp)
0xb78cce85 <chemfp_count_tanimoto_arena_single+1077>:	fxch   %st(1)
0xb78cce87 <chemfp_count_tanimoto_arena_single+1079>:	fucomip %st(1),%st
0xb78cce89 <chemfp_count_tanimoto_arena_single+1081>:	fstp   %st(0)
0xb78cce8b <chemfp_count_tanimoto_arena_single+1083>:	cmovae %edx,%esi
0xb78cce8e <chemfp_count_tanimoto_arena_single+1086>:	add    $0x1,%edi
0xb78cce91 <chemfp_count_tanimoto_arena_single+1089>:	cmp    %edi,0x30(%ebp)
0xb78cce94 <chemfp_count_tanimoto_arena_single+1092>:	jg     0xb78cce60 <chemfp_count_tanimoto_arena_single+1040>
0xb78cce96 <chemfp_count_tanimoto_arena_single+1094>:	mov    -0x38(%ebp),%eax
0xb78cce99 <chemfp_count_tanimoto_arena_single+1097>:	mov    0x38(%ebp),%edx
0xb78cce9c <chemfp_count_tanimoto_arena_single+1100>:	mov    %esi,(%edx,%eax,4)
0xb78cce9f <chemfp_count_tanimoto_arena_single+1103>:	mov    -0x40(%ebp),%esi
0xb78ccea2 <chemfp_count_tanimoto_arena_single+1106>:	add    $0x1,%eax
0xb78ccea5 <chemfp_count_tanimoto_arena_single+1109>:	mov    0x14(%ebp),%ecx
0xb78ccea8 <chemfp_count_tanimoto_arena_single+1112>:	add    %ecx,-0x20(%ebp)
0xb78cceab <chemfp_count_tanimoto_arena_single+1115>:	cmp    %esi,%eax
0xb78ccead <chemfp_count_tanimoto_arena_single+1117>:	mov    %eax,-0x38(%ebp)
0xb78cceb0 <chemfp_count_tanimoto_arena_single+1120>:	jl     0xb78cce55 <chemfp_count_tanimoto_arena_single+1029>
0xb78cceb2 <chemfp_count_tanimoto_arena_single+1122>:	jmp    0xb78ccb58 <chemfp_count_tanimoto_arena_single+264>
0xb78cceb7 <chemfp_count_tanimoto_arena_single+1127>:	xor    %eax,%eax
0xb78cceb9 <chemfp_count_tanimoto_arena_single+1129>:	jmp    0xb78ccbcf <chemfp_count_tanimoto_arena_single+383>
0xb78ccebe <chemfp_count_tanimoto_arena_single+1134>:	xor    %eax,%eax
0xb78ccec0 <chemfp_count_tanimoto_arena_single+1136>:	jmp    0xb78ccaf2 <chemfp_count_tanimoto_arena_single+162>
End of assembler dump.

