C QUESTION

Name: Anonymous 2013-01-13 23:14

for(int i = ((a > b) ? x : y); i < c; i++){...}
How will this be compiled? Will it hurt performance due to lack of unrolling or something?

Name: Anonymous 2013-01-13 23:18

No, it probably wouldn't. Even if it did it would not be noticeable. This is probably better than the alternative in both readability and code length, which when programming today is far more important than performance, unless you are writing in a horribly backwards and slow language with shit code.
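
For what it's worth, the init clause of a for loop is evaluated exactly once, before the first iteration, so the ternary should cost the same as computing the start value on its own line. A minimal sketch, assuming a made-up loop body (>>1 elided the real one) and hypothetical function names:

```c
/* Hypothetical loop body: just sum the indices visited. */
int sum_ternary_init(int a, int b, int x, int y, int c)
{
    int total = 0;
    /* The ternary runs once, when i is initialised. */
    for (int i = ((a > b) ? x : y); i < c; i++) {
        total += i;
    }
    return total;
}

/* Same loop with the start value computed on its own line. */
int sum_hoisted_init(int a, int b, int x, int y, int c)
{
    int total = 0;
    int start = (a > b) ? x : y;
    for (int i = start; i < c; i++) {
        total += i;
    }
    return total;
}
```

Both functions visit the same indices for any inputs; the only difference is where the start value is written down.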

Name: Anonymous 2013-01-13 23:34

>>1
Read SICP

>>2
Back to /g/

Name: Anonymous 2013-01-13 23:38

Compile it and see?

Name: Anonymous 2013-01-14 0:07

It's dependent on the loop body.

Name: Anonymous 2013-01-14 0:23

>>5
fuck off

Name: Anonymous 2013-01-14 2:29

>>1
A loop can be unrolled if the values of the iteration variable can be predicted at compile time. For instance, in


for(int i = 0; i < 5; i++) { printf("%d,", i); }


it is quite clear that the variable i will take on the values, 0, 1, 2, 3, 4, during the execution of the loop. Thus the loop can be unrolled into:


printf("%d,", 0); printf("%d,", 1); printf("%d,", 2); printf("%d,", 3); printf("%d,", 4);


The key is that the initial value of i and all values in the loop termination condition must be constant. The increment must also depend only on constants and the current value of i. With these conditions, the values of the iteration variable can be calculated at compile time and the loop can be unrolled.
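
To make the constant case concrete, here is a small sketch that writes into a buffer instead of stdout so the two forms can be compared. The function names and the buffer handling are mine, not from the thread, and whether a given compiler really unrolls this depends on the compiler and flags:

```c
#include <stdio.h>
#include <string.h>

/* Rolled form: start, bound and step are all compile-time constants.
   Assumes cap is comfortably larger than the 10 characters written. */
void print_rolled(char *out, size_t cap)
{
    size_t used = 0;
    out[0] = '\0';
    for (int i = 0; i < 5; i++) {
        used += (size_t)snprintf(out + used, cap - used, "%d,", i);
    }
}

/* What full unrolling amounts to: one copy of the body per value of i. */
void print_unrolled(char *out, size_t cap)
{
    size_t used = 0;
    out[0] = '\0';
    used += (size_t)snprintf(out + used, cap - used, "%d,", 0);
    used += (size_t)snprintf(out + used, cap - used, "%d,", 1);
    used += (size_t)snprintf(out + used, cap - used, "%d,", 2);
    used += (size_t)snprintf(out + used, cap - used, "%d,", 3);
    snprintf(out + used, cap - used, "%d,", 4);
}
```

Calling both with a 32-byte buffer leaves the same string in each.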

In the case of your example, the initial value of i is known at compile time if and only if one of the following holds:

* It is known that a > b and the value of x is known at compile time.
* It is known that !(a > b) and the value of y is known at compile time.

The termination condition depends only on constant values and i if and only if the following holds:

* c is a known constant.

So yes, you are right. If the value of the expression (a > b ? x : y) cannot be determined at compile time, then this would prevent total loop unrolling.
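
Total unrolling is only one option, though. When the start value isn't known until run time, a loop can still be unrolled by a fixed factor, with a cleanup loop for the leftovers. A rough hand-written sketch, assuming a made-up body (summing an array) and a made-up factor of 4; this is not a claim about what any particular compiler emits:

```c
/* Sum data[start..c-1], where start comes from the ternary at run time.
   The caller must make sure start >= 0 and c does not exceed the array. */
long sum_unrolled_by_4(const int *data, int a, int b, int x, int y, int c)
{
    long total = 0;
    int i = (a > b) ? x : y;

    /* Main part: four iterations per trip while at least four remain. */
    for (; i + 3 < c; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    /* Cleanup: the zero to three iterations left over. */
    for (; i < c; i++) {
        total += data[i];
    }
    return total;
}
```

The trip count is still unknown at compile time, but the loop overhead is paid once per four elements instead of once per element.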

Name: Anonymous 2013-01-14 3:01

It also depends on whether i is modified within the loop body. All modifications to i need to be inferable at compile time. So >>5-san is right. Sorry for telling you to fuck off, >>5-san.

Name: Anonymous 2013-01-14 6:51

>>7
cudder-katanaganoman unrolls it den jumps into a specific point based on da values of a, b, x, and y, u kno?
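
What >>9 describes sounds like Duff's device: unroll the body, then use a switch to jump into the middle of the unrolled block so the leftover iterations are handled on the first pass. A sketch, assuming a made-up body (summing an array slice), an unroll factor of 4, and a start index that would come from (a > b) ? x : y:

```c
/* Sum data[start..end-1] with a Duff's-device style 4x unrolled loop. */
long sum_duff(const int *data, int start, int end)
{
    long total = 0;
    int count = end - start;
    if (count <= 0) {
        return 0;                       /* nothing to do */
    }
    const int *p = data + start;
    int trips = (count + 3) / 4;        /* passes through the do-while */

    switch (count % 4) {                /* jump into the unrolled body */
    case 0: do { total += *p++;
    case 3:      total += *p++;
    case 2:      total += *p++;
    case 1:      total += *p++;
            } while (--trips > 0);
    }
    return total;
}
```

The case labels fall through on purpose; some compilers will warn about that.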

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2013-01-14 7:46

>>9
Loop unrolling is obsolete on modern processors thanks to better hardware. It was most beneficial on the P4 with its huge pipeline, became somewhat contentious throughout the Core line, and slows things down with i7s and above.

http://www.agner.org/optimize/blog/read.php?i=142
http://x264dev.multimedia.cx/archives/201
tl;dr: The future of optimisation is code and data density. Unaligned accesses and stupid size/speed tradeoffs don't matter anymore, because smaller IS faster. (Sucks for you, RISCtards...)

Name: Anonymous 2013-01-14 10:30

>>10
smaller IS faster
I haven't tried it yet, but does this mean there's no way I can beat instructions like FSIN or FPATAN in speed (while keeping precision) even if I use a lookup table that fills the entire RAM? Are the lookup tables that the FPU uses optimal in size?

Name: Anonymous 2013-01-14 10:33

Do you know what's the difference between a Frenchman and a chimpansee ?  - One of them is hairy, stinky, and scratches his ass all the time. The other is a chimpansee.

Name: Anonymous 2013-01-14 10:52

>>6,8
It's okay we all have our moments.

Name: Anonymous 2013-01-14 11:05

>>10
(Sucks for you, RISCtards...)
Turns out you can fit a square peg in a round hole if you have enough money to throw at it.

Name: Anonymous 2013-01-14 11:39

>>12
connard d'anglo raciste, va donc te goinfrer d'un autre hamburger

Name: Anonymous 2013-01-14 12:26

>>12,15
Statistically, there's a 70% probability that le Yannick is a nigger. Don't say things that could make him feel offended as a black hacker.

Name: Anonymous 2013-01-14 13:10

>>15
autre hamburger
lel

Name: Anonymous 2013-01-14 16:22

>>16
go shit in hell, you carrion

>>17
go fuck yourself, lel-asshole

Name: Anonymous 2013-01-14 16:39

Yannick is le butthurt nigger.

Name: Anonymous 2013-01-14 16:44

*African French

Name: Anonymous 2013-01-14 18:33

>>19
go fuck yourself, asshole

Name: Anonymous 2013-01-14 18:43

Dead Yannick,

we will never forget you, fils de pute.

/prog/

Name: Anonymous 2013-01-14 21:13

>>11
FSIN has a latency of 64-100 cycles and a throughput of 47-100, and requires about 100 micro-ops on the Sandy Bridge architecture, which btw is about the same order of magnitude as a read from main memory on a cache miss.
So if you can get below that you can beat them.

Name: Anonymous 2013-01-14 21:43

>>15
Only a Frechman would think stuffing an ``autistic hamburger'' into someone's anus is a good idea.

Name: Anonymous 2013-01-14 21:43

>>15
Only a Frenchman would think stuffing an ``autistic hamburger'' into someone's anus is a good idea.

Name: Anonymous 2013-01-14 22:40

Ok there's something you don't get, Yannick is not the name of the guy who keeps flaming people, Yannick is the name of the developer who got flamed by the guy who hired him.

Name: Anonymous 2013-01-14 23:00

>>26
Yannick is French, so if >>15,18,21 is not Yannick, he must be Krueger.

Name: Anonymous 2013-01-14 23:28

>>26-27
it's not my real name anyway, go shit in hell, son of a bitch

Name: Anonymous 2013-01-15 1:55

>>28
Cute.

Name: Administrator Krueger 2013-01-15 2:11

>>29
Fuck you, asshole.

Name: Anonymous 2013-01-15 2:18

>>30
KAY YOO GAR

Name: Anonymous 2013-01-15 6:00

>>30
Suck my balls.

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2013-01-15 7:11

>>11
while keeping precision
No. If you have a very specific set of values you need the sin/cos/tan of, then a lookup table may be faster, but if you want the same precision, then doing it in software is going to be MUCH slower. I don't know if anyone here was around to remember when x87s were optional and the immense speedup they provided for FP when installed compared to software emulation.
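
A sketch of the "very specific set of values" case, assuming (my assumption, not anything from >>11) that only multiples of 15 degrees are ever needed. The table and function names are made up for illustration:

```c
/* sin() for multiples of 15 degrees only; k is the angle divided by 15. */
static const double SIN_15_STEP[7] = {
    0.0,                 /* sin(0)  */
    0.25881904510252076, /* sin(15) */
    0.5,                 /* sin(30) */
    0.70710678118654752, /* sin(45) */
    0.86602540378443865, /* sin(60) */
    0.96592582628906829, /* sin(75) */
    1.0                  /* sin(90) */
};

double sin_step15(int k)
{
    int q = ((k % 24) + 24) % 24;              /* wrap to one turn, 0..23 */
    if (q <= 6)  return  SIN_15_STEP[q];       /*   0..90  degrees */
    if (q <= 12) return  SIN_15_STEP[12 - q];  /* 105..180 degrees */
    if (q <= 18) return -SIN_15_STEP[q - 12];  /* 195..270 degrees */
    return -SIN_15_STEP[24 - q];               /* 285..345 degrees */
}
```

Seven doubles cover the whole circle through symmetry, so the table stays tiny. The moment arbitrary angles are needed, this stops being enough.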

Implementing an approximation algorithm in software may also slow down other parts of the program, as tens or hundreds of extra bytes of instructions compete with other pieces for cache. The other point which isn't as obvious is the fact that the x87 operates in parallel with the other EUs, so the CPU can still execute other instructions while one x87 op is executing.
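
For a sense of what a software approximation looks like, here is a deliberately low-precision sketch: a plain Taylor polynomial on a restricted input range. The name sin_poly and the PI_D macro are mine, and this does not come close to x87 precision; its error is below about 1e-5, which is the tradeoff being discussed:

```c
#define PI_D 3.14159265358979323846

/* Odd Taylor polynomial up to x^9; input must be within [-pi, pi]. */
double sin_poly(double x)
{
    /* Fold into [-pi/2, pi/2] using sin(pi - x) == sin(x). */
    if (x > PI_D / 2)  x = PI_D - x;
    if (x < -PI_D / 2) x = -PI_D - x;

    double x2 = x * x;
    /* Horner form of x - x^3/3! + x^5/5! - x^7/7! + x^9/9! */
    return x * (1.0 + x2 * (-1.0 / 6 + x2 * (1.0 / 120
             + x2 * (-1.0 / 5040 + x2 * (1.0 / 362880)))));
}
```

Getting more digits means more terms or a proper minimax polynomial plus careful range reduction, which is where the extra instruction bytes come from.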

I should mention here that doing some operation in software will, if it's not currently slower than the hardware, become so in the future. Case in point: the ARM SoC in the iPhone 5 finally added a divide instruction. The fastest software division routine (32-bit by 32-bit) for ARM took between 10 and 120 cycles, while the hardware divide takes 2-12 cycles. Sandy Bridge takes 11-18, although its clockspeed is several times higher and much of that is pipelining cost. Unfortunately for the ARM, all existing software that uses software division will get no benefit from the added instruction, since they didn't follow the x87 model of reserving opcode space, raising an exception when the instruction isn't supported, and handling it in emulation. They'll continue to use the slower, bloatier routine, while x86 apps compiled with x87 instructions run slower under emulation, but automatically benefit when the FPU is present.
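
To show why a software divide costs that much, here is the textbook shift-and-subtract routine. It is only an illustration with a made-up name, not the fast ARM routine referred to above; it always runs 32 iterations, one per quotient bit:

```c
#include <stdint.h>

/* Restoring (shift-and-subtract) division, one quotient bit per step.
   The caller must ensure den != 0. rem_out may be a null pointer. */
uint32_t udiv32_soft(uint32_t num, uint32_t den, uint32_t *rem_out)
{
    uint32_t quot = 0;
    uint64_t rem = 0;   /* 64-bit so the shift below cannot overflow */

    for (int bit = 31; bit >= 0; bit--) {
        rem = (rem << 1) | ((num >> bit) & 1u);  /* bring down next bit */
        if (rem >= den) {
            rem -= den;
            quot |= (uint32_t)1 << bit;
        }
    }
    if (rem_out) {
        *rem_out = (uint32_t)rem;
    }
    return quot;
}
```

Faster routines skip leading zero bits or process several bits per step, but they are still a call into a block of code rather than a single instruction.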

tl;dr: Functionality implemented in software cannot improve. Functionality implemented in hardware improves with the hardware.

Name: Anonymous 2013-01-15 11:35

>>33
Shalom!

Name: Anonymous 2013-01-15 12:17

Silly could-er, how old are you?

Name: Anonymous 2013-01-15 15:08

>>35
She wants the D.
