Fancy SSE instructions in gcc

Name: Anonymous 2008-12-01 16:11

Does anyone know if gcc can produce code that uses SSE?

This is the program I'm trying to compile with gcc -msse -msse2 -msse3 sse.c -S -O2 -o sse.S:
#include <xmmintrin.h>

__m128i qq(__m128i a,__m128i b){
    return a&b;
}


And here's what I'm getting
    .file    "sse.c"
    .text
    .p2align 4,,15
.globl _qq
    .def    _qq;    .scl    2;    .type    32;    .endef
_qq:
    pushl    %ebp
    pxor    %xmm2, %xmm2
    movl    %esp, %ebp
    subl    $56, %esp
    movdqa    %xmm0, -24(%ebp)
    movl    -24(%ebp), %eax
    movdqa    %xmm1, -40(%ebp)
    movl    -40(%ebp), %ecx
    movdqa    %xmm2, -56(%ebp)
    movl    -36(%ebp), %edx
    andl    %ecx, %eax
    movl    %eax, -56(%ebp)
    movl    -20(%ebp), %eax
    movl    -32(%ebp), %ecx
    andl    %edx, %eax
    movl    -28(%ebp), %edx
    movl    %eax, -52(%ebp)
    movl    -16(%ebp), %eax
    andl    %ecx, %eax
    movl    %eax, -48(%ebp)
    movl    -12(%ebp), %eax
    andl    %edx, %eax
    movl    %eax, -44(%ebp)
    movdqa    -56(%ebp), %xmm0
    leave
    ret


This is most definitely not what I'm expecting; this simple function can be implemented in five instructions, but gcc decides to ignore SSE and do anding manually.

Name: Anonymous 2008-12-01 16:21

-funroll-loops

Name: Anonymous 2008-12-01 16:22

>>1
manually
Oh you.

Name: Anonymous 2008-12-01 16:28

SSE3 is a superset of SSE2 is a superset of SSE, you fucking dumb Gentoo shit.

Name: Anonymous 2008-12-01 16:36

>>1
you just can't have nice things.

Name: Anonymous 2008-12-01 16:39

$ cat sse.S
    .text
    .align 4,0x90
.globl _qq
_qq:
    pushl    %ebp
    pand    %xmm1, %xmm0
    movl    %esp, %ebp
    subl    $8, %esp
    leave
    ret
    .subsections_via_symbols

$ gcc --version
i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5490)
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Name: Anonymous 2008-12-01 16:43

OH GOD WHAT DOES ALL THIS MEAN!!?

        .file   "sse.c"
        .text
        .p2align 4,,15
.globl qq
        .type   qq, @function
qq:
.LFB518:
        pand    %xmm1, %xmm0
        ret
.LFE518:
        .size   qq, .-qq
        .section        .eh_frame,"a",@progbits
.Lframe1:
        .long   .LECIE1-.LSCIE1
.LSCIE1:
        .long   0x0
        .byte   0x1
        .string "zR"
        .uleb128 0x1
        .sleb128 -8
        .byte   0x10
        .uleb128 0x1
        .byte   0x3
        .byte   0xc
        .uleb128 0x7
        .uleb128 0x8
        .byte   0x90
        .uleb128 0x1
        .align 8
.LECIE1:
.LSFDE1:
        .long   .LEFDE1-.LASFDE1
.LASFDE1:
        .long   .LASFDE1-.Lframe1
        .long   .LFB518
        .long   .LFE518-.LFB518
        .uleb128 0x0
        .align 8
.LEFDE1:
        .ident  "GCC: (GNU) 4.3.2"
        .section        .note.GNU-stack,"",@progbits

Name: Anonymous 2008-12-01 18:22

gcc 4.3.2 on Linux i686 also produces the same output that >>6 posted. 4.3.2 on x86-64 dumps out something remarkably similar to >>7, and I think >>1 runs Debian moldy stable.

Name: Anonymous 2008-12-02 1:26

Ah, true, I have gcc 3.4.4 -- from mingw and not gentoo. >>4-kun, yes, I was aware of that; this thing just kept not working, so I was adding and removing random flags.

Thanks, I'll try with gcc 4
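
Since the whole difference here came down to which gcc was actually being run, a quick check from inside the program can save some guessing. This is only a minimal sketch: describe_compiler and HAVE_SSE2 are hypothetical names made up for illustration, and it assumes a GCC-compatible compiler that defines the usual __GNUC__ version macros and defines __SSE2__ when SSE2 code generation is enabled (for example with -msse2).

```c
#include <stdio.h>

/* 1 when the compiler was told it may emit SSE2 (e.g. via -msse2), else 0. */
#if defined(__SSE2__)
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: writes a one-line summary into buf and returns
 * whatever snprintf returns (the length of the summary). */
int describe_compiler(char *buf, size_t len)
{
#if defined(__GNUC__)
    return snprintf(buf, len, "GCC-compatible %d.%d.%d, SSE2 %s",
                    __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__,
                    HAVE_SSE2 ? "enabled" : "not enabled");
#else
    return snprintf(buf, len, "non-GCC compiler, SSE2 %s",
                    HAVE_SSE2 ? "enabled" : "not enabled");
#endif
}
```

Calling describe_compiler(buf, sizeof buf) and printing buf shows which version the code was really built with. Other compilers that imitate gcc also define __GNUC__, so the number is not always the real gcc version.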

Name: Anonymous 2008-12-02 1:35

>>9
And that is what makes you a fucking dumb Gentoo shit. Read the documentation and understand what the switches do instead of spewing them like diarrhoea in a public toilet.

Name: Anonymous 2008-12-02 2:20

>>10
Whatever, bro. It's not working either way and documentation isn't helping much.

       -msse
       -mno-sse
       -msse2
       -mno-sse2
       -msse3
       -mno-sse3
       -m3dnow
       -mno-3dnow
           These switches enable or disable the use of built-in functions that allow direct access to the
           MMX, SSE, SSE2, SSE3 and 3Dnow extensions of the instruction set.

           To have SSE/SSE2 instructions generated automatically from floating-point code, see -mfpmath=sse.


All I want is to make gcc generate and optimize code for me, so that I don't have to do that in assembler.
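
The last line of that excerpt is about floating-point code, which is a different case from the integer AND in >>1. A minimal sketch of what it covers, assuming an invocation along the lines of gcc -O2 -msse -mfpmath=sse -S on 32-bit x86 (scale_add is a made-up name for illustration):

```c
/* Plain scalar float arithmetic: no intrinsics, no vector types.
 * With -mfpmath=sse the compiler is allowed to do this in SSE registers
 * instead of on the x87 stack. */
float scale_add(float a, float x, float y)
{
    return a * x + y;
}
```

With those flags one would expect scalar SSE instructions such as mulss and addss in the .S file instead of x87 ones, though the exact output depends on the gcc version. Nothing in that option promises anything about __m128i code.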

Name: Anonymous 2008-12-02 2:44

Name: Anonymous 2008-12-02 3:29

>>11
So now you've read the documentation, the next step is to understand it.

Name: Anonymous 2008-12-02 3:39

Sage for people thinking they can get gcc to produce ``optimized'' binaries.

Name: Anonymous 2008-12-02 5:25

Sage for people thinking they can get any compiler to produce ``optimized'' binaries.

Name: Anonymous 2008-12-02 6:35

>>15
The Intel C++ compiler and Microsoft's optimizing cl.exe compiler produce pretty darn well optimized binaries.
Of course, since free software is typically of sub-standard amateur quality, nobody really expects gcc to produce optimized code; well, maybe just the GNU fanboys :P

Name: Anonymous 2008-12-02 7:31

>>16
1/10

Name: Anonymous 2008-12-02 8:17

>>16
Apple made a lot of improvements to gcc.

Name: Anonymous 2008-12-02 8:44

>>13
There's nothing about generating code for 128 bit integers in it, so, yeah, I do understand it completely.

Name: Anonymous 2008-12-02 8:48

>>19
           These switches enable or disable the use of built-in functions that allow direct access to the
           MMX, SSE, SSE2, SSE3 and 3Dnow extensions of the instruction set.

it says they enable or disable those builtin functions (http://gcc.gnu.org/onlinedocs/gcc/X86-Built_002din-Functions.html#X86-Built_002din-Functions), not that they make it automatically rewrite your code to use them.
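
Going through those functions explicitly would look roughly like the sketch below. It uses the _mm_and_si128 intrinsic from <emmintrin.h> instead of the & operator. The names qq_intrin and and128 are hypothetical, and the scalar fallback is only there so the file still builds when __SSE2__ is not defined (no -msse2, or a non-x86 target).

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 integer intrinsics, including _mm_and_si128 */

/* Same job as qq() in >>1, but asking for the AND through the intrinsic. */
__m128i qq_intrin(__m128i a, __m128i b)
{
    return _mm_and_si128(a, b);
}
#endif

/* Hypothetical wrapper: ANDs two groups of four 32-bit words. */
void and128(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
#if defined(__SSE2__)
    /* Unaligned loads and stores, so the arrays need no special alignment. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, qq_intrin(va, vb));
#else
    /* Fallback when SSE2 code generation is not enabled. */
    int i;
    for (i = 0; i < 4; i++)
        out[i] = a[i] & b[i];
#endif
}
```

The intrinsic form should also be the more portable spelling, since it does not depend on the compiler accepting & on a __m128i. Whether an old gcc turns it into a single pand is a separate question that this sketch does not answer.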

Name: Anonymous 2008-12-02 8:57

>>16
cl.exe from msvcs9 doesn't even want to compile the code, ``error C2088: '&' : illegal for union''
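
That error is a hint about why a&b compiles under gcc at all: as far as I know, gcc's headers declare __m128i as a vector type and gcc allows operators like & on vector types as an extension, while cl.exe declares __m128i as a union. A minimal sketch of the same extension without the SSE headers, assuming a compiler that supports the vector_size attribute (v4u32, and_v4 and and_words are made-up names):

```c
#include <stdint.h>
#include <string.h>

/* GCC-style vector extension: four 32-bit lanes in one 16-byte value.
 * This is not standard C. */
typedef uint32_t v4u32 __attribute__((vector_size(16)));

/* Lane-wise AND, written with the ordinary & operator. */
static inline v4u32 and_v4(v4u32 a, v4u32 b)
{
    return a & b;
}

/* Hypothetical wrapper working on plain arrays; memcpy moves the words
 * in and out of the vector type without alignment or aliasing trouble. */
void and_words(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
    v4u32 va, vb, vr;
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    vr = and_v4(va, vb);
    memcpy(out, &vr, sizeof vr);
}
```

Code written this way is tied to compilers that implement the extension, which matches what cl.exe is complaining about.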

Name: Anonymous 2008-12-02 8:58

>>20
Ah, but if you had paid a little attention to the code, you'd have noticed that there are movdqa instructions generated. How are you going to explain this?

Name: Anonymous 2008-12-02 9:01

>>22
it can use them, but the documentation doesn't say that those switches make it use them whenever possible.

Name: Anonymous 2008-12-02 9:04

>>23
This is exactly why I was trying to run gcc with different flags. Because documentation is not always consistent and there may be some undocumented effects. Does that make me a gentoo ricer?

Name: Anonymous 2008-12-02 9:10

>>24
why not just read the source to find out if there are any undocumented effects instead of trying random combinations of flags?

Name: sage 2008-12-02 9:19

>>25
I am a windows user.

Name: Anonymous 2008-12-02 9:21

Name: Anonymous 2008-12-02 9:31

>>23
Using them when appropriate is not the same as using them when possible. This makes you a ricer.

Name: Anonymous 2008-12-02 9:36

>>23
i was saying he's a ricer for expecting it to use them when possible.

Name: Anonymous 2008-12-02 9:56

Are you saying it's not appropriate to use sse in >>1?

Name: Anonymous 2008-12-02 10:06

>>30
no, i'm saying that only a ricer would expect an old as fuck version of gcc to use it in that program.

Name: Anonymous 2008-12-02 12:06

>>31
I am terribly sorry for not being a GNUU GCC expert. This makes me a ricer, I guess. How sad. All I wanted was to have a nice thing done by a program for me. Oh well.

Name: Anonymous 2008-12-02 13:29

>>32
The beauty of anonymous posting is that once you've made a complete fool of yourself, you don't have to keep digging.
There's no reputation to salvage. Just accept that you were an idiot, learn from your mistakes, and move on.

Name: Anonymous 2008-12-02 14:30

>>33
What nonsense. This will haunt him to the day he dies, and so will we.

Name: Anonymous 2008-12-02 15:08

Hey buddy, GCC's nice if you like kool-aids (with emphasis on AIDS). But code generation quality was never a priority, and it shows. Neither was speed of compilation, in case you haven't noticed yet.

Have a look at this little gem:

http://gcc.gnu.org/viewcvs/trunk/gcc/testsuite/gcc.target/i386/pr14552.c?revision=138078&view=markup

What it does is make sure that MMX intrinsics do not use MMX registers.

I think that speaks for itself.

Name: Anonymous 2008-12-02 15:18

Firefox on Windows can be built using either MSVC or GCC. Of course it's built using MSVC because losing over 10% speed is not worth the freedom. Freedom of being fucked in the ass by Richard Stallman, that is.

Name: Anonymous 2008-12-02 15:42

Join us now and share the software.
You'll be free, hackers, you'll be freeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee

Name: Anonymous 2008-12-02 15:46

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee

Name: Anonymous 2008-12-02 15:53

>>37
Gonna be freeeeeeeeeeeeeeeeeeeeEeeeeeeeeeeeEeeeEeeeeee-eeeeeeee
And move among the stars
You know they really aren't so far

Name: Anonymous 2008-12-02 16:13

where's your manchild at

Name: Anonymous 2008-12-02 16:51

>>35
i386 does not use extended i686 instructions, news at 11.

Name: Anonymous 2008-12-03 22:56

ENTERPRISE-ish

Name: Anonymous 2009-03-06 14:06


 print the value   of i yep.

Name: Anonymous 2010-11-03 5:15

Name: Anonymous 2010-12-08 21:31

Name: Anonymous 2010-12-17 1:22

Name: Anonymous 2011-02-04 17:20
