/prog/ - RISC — World4ch

Name: Anonymous 2011-09-22 9:29

/prog/, I'm working on a RISC instruction set that is more RISC than any RISC out there. I've devised an instruction format, plus fifteen core instructions that should suffice for any programming out there. The instruction format, instruction forms, and instructions can be found here:

http://jsbin.com/ekuwap

Any comments or suggestions?

Name: Anonymous 2011-09-22 9:41

Looks interesting.

Name: Anonymous 2011-09-22 9:44

Seems fine, although you're wasting 4 bits on the opcode there. They're probably well wasted, though, as it makes tools (assembler/disassembler/...) easier to write.

Name: Anonymous 2011-09-22 9:52

Fucking terrible. Have you even attempted to implement any nontrivial algorithm in it? Have you simulated it? Implementing a synthesizable Verilog/VHDL simulation would be educational for you.

Name: Anonymous 2011-09-22 9:54

>>3
I'm pretty sure that I'll be missing some important instructions, so a few more bits of opcode headroom is a good idea. This is my first venture into instruction set design, after briefly studying the x86 and PowerPC instruction sets. Do you know of any 'holes' in my instruction set? For example, I saw a 'system call' instruction, mnemonic sc, in PowerPC. I don't understand how that works at all, but I can gather that it's fairly important.

Name: Anonymous 2011-09-22 9:55

>>4

Fucking terrible.
As I would expect, as this is my first design, and I know next to nothing about how things really work.

Have you even attempted to implement any nontrivial algorithm in it? Have you simulated it? Implementing a synthesizable Verilog/VHDL simulation would be educational for you.
That is a good idea.

Name: Anonymous 2011-09-22 10:00

>>4
Care to explain why it's ``fucking terrible''? I don't disagree with you, but more detail and elaboration would be appreciated.

Name: Anonymous 2011-09-22 10:10

>>5
Depends on what you plan on using the instruction set for. syscall instructions are usually used for dealing with privilege levels (security) and providing a simple way of calling kernel or hypervisor code.
I would also like an indirect jump (to a register) instruction.

Name: Anonymous 2011-09-22 10:13

>>8
The j instruction is an 'indirect' jump, to a register, as you describe. The address field in that instruction is a register reference, from which the address value is pulled.

Name: Anonymous 2011-09-22 10:30

>>5

Linux on x86 uses interrupt 0x80, I think. It gives user programs a way to invoke operating system services such as opening, reading, and writing files. The user program triggers interrupt 0x80; I think the processor then stops what it is running, saves its state, looks up entry 0x80 in a table of interrupt handler pointers, and jumps to it, which is the system call handler. The system call handler then looks at the values in the registers and performs the requested service as the operating system. When it is done, the interrupt returns and the processor is restored to its previous state. I think return values are passed in registers. I would have to double-check, though. It has been a while.
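A rough sketch of the dispatch step described above, in C: the handler reads a call number from one register, looks it up in a table of function pointers, and puts the return value back in a register. Every name here (`cpu_state`, `sys_table`, `do_syscall`, the two toy services and their numbers) is made up for illustration and is not Linux's real ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical saved register state; a real kernel saves far more. */
typedef struct {
    uint64_t r[4]; /* r[0] = call number in, return value out; r[1..3] = arguments */
} cpu_state;

typedef uint64_t (*sys_handler)(uint64_t a, uint64_t b, uint64_t c);

/* Two toy services standing in for things like read and write. */
static uint64_t sys_add(uint64_t a, uint64_t b, uint64_t c)
{
    (void)c;
    return a + b;
}

static uint64_t sys_max3(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t m = a > b ? a : b;
    return m > c ? m : c;
}

static const sys_handler sys_table[] = { sys_add, sys_max3 };
#define SYS_COUNT (sizeof sys_table / sizeof sys_table[0])
#define SYS_ENOSYS UINT64_MAX /* made-up "no such call" value */

/* What the trap handler would do once the processor state has been saved. */
static void do_syscall(cpu_state *s)
{
    uint64_t nr;
    assert(s != NULL);
    nr = s->r[0];
    if (nr >= SYS_COUNT) {
        s->r[0] = SYS_ENOSYS;
        return;
    }
    s->r[0] = sys_table[nr](s->r[1], s->r[2], s->r[3]);
}
```

The bounds check matters: the call number comes from user code, so the handler cannot trust it.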

Name: Anonymous 2011-09-22 11:56

run ECC on my doubles

Name: Anonymous 2011-09-22 12:16

So I write s 0, 0, 255 to write out 2^255 bytes to memory. The processor takes an exception halfway through. How does it resume execution after the exception handler completes?

Name: Anonymous 2011-09-22 12:34

write out 2^255 bytes to memory. The processor takes an exception halfway through.
Considering how "halfway through" would occur long after the heat death of the universe, I doubt it will make any difference.

Name: Anonymous 2011-09-22 13:22

Anonix quality

Name: Anonymous 2011-09-22 13:27

>>14
Anonix is not bloated unlike this ISA.

Name: Anonymous 2011-09-22 22:09

>>12
In 64-bit mode, only the values 0, 1, 2 and 3 are allowed.

Name: Anonymous 2011-09-22 22:33

hey OP/all

I also have pretty minimal knowledge in this field,
but was wondering if a Variable instruction set would be possible?

Name: Anonymous 2011-09-22 23:25

>>17
What does that even mean?

Name: Anonymous 2011-09-22 23:56

Hmm yeah good question..

Could you compress instructions?
Eg while app X is running, cpu frequently gets instructions A followed by B, C, etc... so instruct 'a' -> A + B + C...?

Name: Anonymous 2011-09-23 0:19

...might make little difference?

Programmable instruction sets then?? ...build your own SSE-n?

Name: Anonymous 2011-09-23 0:47

>>19
isn't that the CPU's job?

Name: Anonymous 2011-09-23 2:42

..instructions relate to specific circuits / components in a CPU..?
eg logic & arithmetic -> ALU... (any others?)

So, Instructions are hard-wired ?? // No such thing as a general-purpose instruction circuit =(

Probably code short circuits n shit anyway i s'pose..

Name: Anonymous 2011-09-23 7:03

>>22
There are CPUs that incorporate a degree of configurability, and there's always FPGAs, but those will always be slower and use more power than their hard-wired counterparts.

Name: Anonymous 2011-09-23 8:01

Hello again, /prog/.

After some rethinking, I've redesigned the instruction set architecture, this time with various changes, including

- instructions are now 16 bits long
- the opcode field is six bits long
- the register reference fields are five bits long
- for simplicity, ops like `A = B op C' are now `A = A op B'
- there is now a status register, currently only used for c/j
- none of that `[size]' bullshit anymore in the load/store/move ops

Overall, a hopefully cleaner and better designed instruction set.

http://jsbin.com/ekuwap/2

Any comments on this one?
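A minimal sketch of packing and unpacking the 16-bit format from the list above, assuming the opcode sits in the top six bits and the two register references follow (the same layout the emulator further down the thread uses). The names `encode`, `decode`, and `decoded` are hypothetical helpers, not part of the posted design.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: one decoded instruction. */
typedef struct {
    unsigned op; /* 6-bit opcode, bits 15..10 */
    unsigned a;  /* 5-bit register reference, bits 9..5 */
    unsigned b;  /* 5-bit register reference or small operand, bits 4..0 */
} decoded;

static uint16_t encode(unsigned op, unsigned a, unsigned b)
{
    assert(op < 64 && a < 32 && b < 32);
    return (uint16_t)((op << 10) | (a << 5) | b);
}

static decoded decode(uint16_t ins)
{
    decoded d;
    d.op = ins >> 10;
    d.a = (ins >> 5) & 0x1f;
    d.b = ins & 0x1f;
    return d;
}
```

For example, `encode(11, 0, 1)` gives `0x2c01`, and `decode(0x2c01)` returns opcode 11 with fields 0 and 1.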

Name: Anonymous 2011-09-23 8:15

Lacks SIMD.

Name: Anonymous 2011-09-23 8:21

>>25
Adding SIMD instructions, let alone any instruction that can be completed equally with a combination of other instructions, will defeat the idea of this architecture being RISC.

Name: Anonymous 2011-09-23 8:44

And now for an (untested) emulator in ~50 lines.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define MEM 67108864
#define BITS 64
#define uw uint64_t
#define sw int64_t

int main() {
    uint8_t *mem = calloc(MEM, 1);
    uint16_t *memi = (uint16_t *) mem, ins, insd, insa, insb;
    uint32_t stat = 0;
    uw *reg = calloc(32, BITS / 8), cia = 0, nia;
    while (1) {
        nia = cia + 1;
        ins = memi[cia];
        /* 6-bit opcode on top, then two 5-bit fields */
        insd = ins & 0x3ff;
        insa = insd >> 5;
        insb = insd & 0x1f;
        switch (ins >> 10) {
            case 0: break;
            case 1: mem[reg[insa]] = mem[reg[insb]]; break;
            case 2: reg[insa] = ((reg[insa] >> 8) << 8) | mem[reg[insb]]; break;
            case 3: mem[reg[insa]] = reg[insb]; break;
            case 4: reg[insa] = (insb >> 1) << ((insb & 1) ? 4 : 0); break;
            case 5: reg[insa] = ~reg[insa]; break;
            case 6: reg[insa] &= reg[insb]; break;
            case 7: reg[insa] |= reg[insb]; break;
            case 8: reg[insa] ^= reg[insb]; break;
            case 9: reg[insa] += reg[insb]; break;
            case 10: reg[insa] -= reg[insb]; break;
            case 11: reg[insa] <<= reg[insb]; break;
            case 12: reg[insa] >>= reg[insb]; break;
            case 13: reg[insa] = (uw) (((sw) (reg[insa])) >> reg[insb]); break;
            case 14:
                /* compare: set exactly one of the three low status bits */
                stat = (stat >> 3) << 3;
                if (reg[insa] == reg[insb])
                    stat |= 1;
                else if (reg[insa] < reg[insb])
                    stat |= 2;
                else
                    stat |= 4;
                break; /* without this, a compare falls through into the jump */
            case 15:
                /* jump if the 3-bit mask matches the status bits */
                if (((insb >> 1) & 7) & (stat & 7))
                    nia = reg[insa] + ((insb & 1) ? cia : 0);
        }
        cia = nia;
        if (cia > MEM / 2 - 1)
            break;
    }
    free(mem);
    free(reg);
    return 0;
}

Name: Anonymous 2011-09-23 8:47

>>4
Fucking terrible.
^* Fucking Terrible!

Name: Anonymous 2011-09-23 8:58

With fixed `i' and op-by-op debugging:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdlib.h>

#define MEM 67108864
#define BITS 64
#define uw uint64_t
#define sw int64_t

int main() {
    uint8_t *mem = calloc(MEM, 1);
    uint16_t *memi = (uint16_t *) mem, ins, insd, insa, insb;
    uint32_t stat = 0;
    uw *reg = calloc(32, BITS / 8), cia = 0, nia;
    memi[0] = (4 << 10) | (0 << 5) | (0x1 << 1) | 1;
    memi[1] = (4 << 10) | (0 << 5) | (0x2 << 1) | 0;
    while (1) {
        nia = cia + 1;
        ins = memi[cia];
        insd = ins & 0x3ff;
        insa = insd >> 5;
        insb = insd & 0x1f;
        switch (ins >> 10) {
            case 0: break;
            case 1: mem[reg[insa]] = mem[reg[insb]]; break;
            case 2: reg[insa] = ((reg[insa] >> 8) << 8) | mem[reg[insb]]; break;
            case 3: mem[reg[insa]] = reg[insb]; break;
            case 4:
                /* low bit of insb picks the high or low nibble of the lowest byte */
                if (insb & 1)
                    reg[insa] = ((reg[insa] >> 8) << 8) |
                        (reg[insa] & 0xf) | ((insb >> 1) << 4);
                else
                    reg[insa] = ((reg[insa] >> 4) << 4) |
                        (insb >> 1);
                break;
            case 5: reg[insa] = ~reg[insa]; break;
            case 6: reg[insa] &= reg[insb]; break;
            case 7: reg[insa] |= reg[insb]; break;
            case 8: reg[insa] ^= reg[insb]; break;
            case 9: reg[insa] += reg[insb]; break;
            case 10: reg[insa] -= reg[insb]; break;
            case 11: reg[insa] <<= reg[insb]; break;
            case 12: reg[insa] >>= reg[insb]; break;
            case 13: reg[insa] = (uw) (((sw) (reg[insa])) >> reg[insb]); break;
            case 14:
                stat = (stat >> 3) << 3;
                if (reg[insa] == reg[insb])
                    stat |= 1;
                else if (reg[insa] < reg[insb])
                    stat |= 2;
                else
                    stat |= 4;
                break; /* without this, a compare falls through into the jump */
            case 15:
                if (((insb >> 1) & 7) & (stat & 7))
                    nia = reg[insa] + ((insb & 1) ? cia : 0);
        }
        /* op-by-op debugging: print the instruction bits, then all registers */
        if (ins) {
            putchar('[');
            putchar(((ins >> 15) & 1) ? '1' : '0');
            putchar(((ins >> 14) & 1) ? '1' : '0');
            putchar(((ins >> 13) & 1) ? '1' : '0');
            putchar(((ins >> 12) & 1) ? '1' : '0');
            putchar(((ins >> 11) & 1) ? '1' : '0');
            putchar(((ins >> 10) & 1) ? '1' : '0');
            putchar(((ins >> 9) & 1) ? '1' : '0');
            putchar(((ins >> 8) & 1) ? '1' : '0');
            putchar(((ins >> 7) & 1) ? '1' : '0');
            putchar(((ins >> 6) & 1) ? '1' : '0');
            putchar(((ins >> 5) & 1) ? '1' : '0');
            putchar(((ins >> 4) & 1) ? '1' : '0');
            putchar(((ins >> 3) & 1) ? '1' : '0');
            putchar(((ins >> 2) & 1) ? '1' : '0');
            putchar(((ins >> 1) & 1) ? '1' : '0');
            putchar((ins & 1) ? '1' : '0');
            putchar(']');
            {
                int i;
                for (i = 0; i < 32; i++)
                    printf(" %" PRIx64, reg[i]);
            }
            putchar('\n');
        }
        cia = nia;
        if (cia > MEM / 2 - 1)
            break;
    }
    free(mem);
    free(reg);
    return 0;
}

Name: Anonymous 2011-09-23 9:06

Fact

It takes 12 instructions to load 0x12345678 into the first register.

Process

0. load the value 0x8 into the second register
1. load the value 0x1 into the high half of the lowest byte of the first register
2. load the value 0x2 into the low half of the lowest byte of the first register
3. shift the first register left by the second register (8)
4. load the value 0x3 into the high half of the lowest byte of the first register
5. load the value 0x4 into the low half of the lowest byte of the first register
6. shift the first register left by the second register (8)
7. load the value 0x5 into the high half of the lowest byte of the first register
8. load the value 0x6 into the low half of the lowest byte of the first register
9. shift the first register left by the second register (8)
10. load the value 0x7 into the high half of the lowest byte of the first register
11. load the value 0x8 into the low half of the lowest byte of the first register

Debugging output

[0001000000110000] 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0001000000000011] 10 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0001000000000100] 12 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0010110000000001] 1200 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0001000000000111] 1230 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0001000000001000] 1234 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0010110000000001] 123400 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0001000000001011] 123450 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0001000000001100] 123456 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0010110000000001] 12345600 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0001000000001111] 12345670 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[0001000000010000] 12345678 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Source code

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdlib.h>

#define MEM 67108864
#define BITS 64
#define uw uint64_t
#define sw int64_t

int main() {
    uint8_t *mem = calloc(MEM, 1);
    uint16_t *memi = (uint16_t *) mem, ins, insd, insa, insb;
    uint32_t stat = 0;
    uw *reg = calloc(32, BITS / 8), cia = 0, nia;
    /* the twelve-instruction program from the list above */
    memi[0] = (4 << 10) | (1 << 5) | (0x8 << 1) | 0;
    memi[1] = (4 << 10) | (0 << 5) | (0x1 << 1) | 1;
    memi[2] = (4 << 10) | (0 << 5) | (0x2 << 1) | 0;
    memi[3] = (11 << 10) | (0 << 5) | 1;
    memi[4] = (4 << 10) | (0 << 5) | (0x3 << 1) | 1;
    memi[5] = (4 << 10) | (0 << 5) | (0x4 << 1) | 0;
    memi[6] = (11 << 10) | (0 << 5) | 1;
    memi[7] = (4 << 10) | (0 << 5) | (0x5 << 1) | 1;
    memi[8] = (4 << 10) | (0 << 5) | (0x6 << 1) | 0;
    memi[9] = (11 << 10) | (0 << 5) | 1;
    memi[10] = (4 << 10) | (0 << 5) | (0x7 << 1) | 1;
    memi[11] = (4 << 10) | (0 << 5) | (0x8 << 1) | 0;
    while (1) {
        nia = cia + 1;
        ins = memi[cia];
        insd = ins & 0x3ff;
        insa = insd >> 5;
        insb = insd & 0x1f;
        switch (ins >> 10) {
            case 0: break;
            case 1: mem[reg[insa]] = mem[reg[insb]]; break;
            case 2: reg[insa] = ((reg[insa] >> 8) << 8) | mem[reg[insb]]; break;
            case 3: mem[reg[insa]] = reg[insb]; break;
            case 4:
                /* low bit of insb picks the high or low nibble of the lowest byte */
                if (insb & 1)
                    reg[insa] = ((reg[insa] >> 8) << 8) |
                        (reg[insa] & 0xf) | ((insb >> 1) << 4);
                else
                    reg[insa] = ((reg[insa] >> 4) << 4) |
                        (insb >> 1);
                break;
            case 5: reg[insa] = ~reg[insa]; break;
            case 6: reg[insa] &= reg[insb]; break;
            case 7: reg[insa] |= reg[insb]; break;
            case 8: reg[insa] ^= reg[insb]; break;
            case 9: reg[insa] += reg[insb]; break;
            case 10: reg[insa] -= reg[insb]; break;
            case 11: reg[insa] <<= reg[insb]; break;
            case 12: reg[insa] >>= reg[insb]; break;
            case 13: reg[insa] = (uw) (((sw) (reg[insa])) >> reg[insb]); break;
            case 14:
                stat = (stat >> 3) << 3;
                if (reg[insa] == reg[insb])
                    stat |= 1;
                else if (reg[insa] < reg[insb])
                    stat |= 2;
                else
                    stat |= 4;
                break; /* without this, a compare falls through into the jump */
            case 15:
                if (((insb >> 1) & 7) & (stat & 7))
                    nia = reg[insa] + ((insb & 1) ? cia : 0);
        }
        /* op-by-op debugging: print the instruction bits, then all registers */
        if (ins) {
            putchar('[');
            putchar(((ins >> 15) & 1) ? '1' : '0');
            putchar(((ins >> 14) & 1) ? '1' : '0');
            putchar(((ins >> 13) & 1) ? '1' : '0');
            putchar(((ins >> 12) & 1) ? '1' : '0');
            putchar(((ins >> 11) & 1) ? '1' : '0');
            putchar(((ins >> 10) & 1) ? '1' : '0');
            putchar(((ins >> 9) & 1) ? '1' : '0');
            putchar(((ins >> 8) & 1) ? '1' : '0');
            putchar(((ins >> 7) & 1) ? '1' : '0');
            putchar(((ins >> 6) & 1) ? '1' : '0');
            putchar(((ins >> 5) & 1) ? '1' : '0');
            putchar(((ins >> 4) & 1) ? '1' : '0');
            putchar(((ins >> 3) & 1) ? '1' : '0');
            putchar(((ins >> 2) & 1) ? '1' : '0');
            putchar(((ins >> 1) & 1) ? '1' : '0');
            putchar((ins & 1) ? '1' : '0');
            putchar(']');
            {
                int i;
                for (i = 0; i < 32; i++)
                    printf(" %" PRIx64, reg[i]);
            }
            putchar('\n');
        }
        cia = nia;
        if (cia > MEM / 2 - 1)
            break;
    }
    free(mem);
    free(reg);
    return 0;
}

Name: Anonymous 2011-09-23 12:59

Reminds me of ARM Thumb.

Name: Anonymous 2011-09-23 13:38

http://en.wikipedia.org/wiki/One_instruction_set_computer

Name: Anonymous 2011-09-23 15:41

>>30
Fucking Terrible!

You can't even copy the value of one register into another. Instead, you have to write the first register out to memory one byte at a time, read the data back into the register one byte at a time and then read the data into the target register one byte at a time.

Also, no way of detecting overflow/underflow in arithmetic operations, or perform bit rotation etc. etc.
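For a sense of what the missing pieces would compute, here is a small sketch in C of carry and signed-overflow detection for a 64-bit add, plus a rotate. The names `add_result`, `add_with_flags`, and `rotl64` are made up for illustration; how an ISA would expose these results (status bits, separate instructions) is left open.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the sum plus the two flags an ALU could report. */
typedef struct {
    uint64_t sum;
    int carry;    /* unsigned result did not fit in 64 bits */
    int overflow; /* signed (two's complement) result did not fit */
} add_result;

static add_result add_with_flags(uint64_t a, uint64_t b)
{
    add_result r;
    r.sum = a + b;      /* unsigned arithmetic wraps modulo 2^64 */
    r.carry = r.sum < a; /* wrapped around if the sum got smaller */
    /* signed overflow: both inputs have the same sign and the sum's sign differs */
    r.overflow = (int)(((a ^ r.sum) & (b ^ r.sum)) >> 63);
    return r;
}

/* Rotate left by n bits, 0 <= n < 64. */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    assert(n < 64);
    return n ? (x << n) | (x >> (64 - n)) : x;
}
```

Without a carry flag, multi-word arithmetic has to recompute the `sum < a` test with extra compare and jump instructions on every word.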

>>31
Except Thumb was designed by people who had a fucking clue.

Name: Anonymous 2011-09-23 21:22

>>33

You can't even copy the value of one register into another. Instead, you have to write the first register out to memory one byte at a time, read the data back into the register one byte at a time and then read the data into the target register one byte at a time.
Oh, shit. That's a gaping implementation hole.

Also, no way of detecting overflow/underflow in arithmetic operations, or perform bit rotation etc. etc.
Those are good ideas. How common is their use? Should they be included?

Name: Anonymous 2011-09-23 22:21

>>34
Go download and read through a bunch of CPU datasheets for as many architectures as you can find. That should give you an idea of what practical instruction sets look like.

Name: Anonymous 2011-09-23 22:30

>>35
Wouldn't I just end up creating a large instruction set like the rest? Even PowerPC's ``RISC'' ISA seems very large.

Name: Anonymous 2011-09-23 22:50

Sigh.

Name: Anonymous 2011-09-23 22:51

>>34
As I said earlier, try implementing some nontrivial algorithms and you'll notice what's missing and what's just badly designed.

Also, the goal of RISC is to have simple instructions that can be executed quickly and without microcode. The idea that RISC means having a small number of instructions is a fallacy.

Have you read your Hennessy & Patterson today?

Name: Anonymous 2011-09-23 23:17

//---------
putchar(((ins >> 2) & 1) ? '1' : '0');
//---------

Vs ?

const char BinString[] = "01";
...
putchar(BinString[(ins & 4) == 4]);

Any faster..?

Name: Anonymous 2011-09-24 0:03

**
const char BinString[2] = {'0', '1'};
**

Comparison op should be faster than a rotate?
and a lookup faster than a branch?

...then putchar probably dulls it all down anyway =)
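A sketch of the lookup idea applied to the whole 16-bit dump, so the sixteen `putchar` lines become one loop. The function name `bits16` and the buffer convention are assumptions for illustration. Whether the lookup beats the ternary is not something this shows: compilers often turn both into branch-free code, and the output call is likely to dominate either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the 16 bits of ins, most significant first, into out as a
   NUL-terminated string. out must have room for 17 bytes. */
static void bits16(uint16_t ins, char out[17])
{
    static const char digit[] = "01"; /* lookup table indexed by the bit value */
    int i;
    assert(out != NULL);
    for (i = 0; i < 16; i++)
        out[i] = digit[(ins >> (15 - i)) & 1];
    out[16] = '\0';
}
```

With a `char buf[17]`, calling `bits16(ins, buf)` and then `printf("[%s]", buf)` would print the same bracketed pattern with a single output call.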

Name: Anonymous 2011-09-24 0:05

>>39
>>40
Fuck off, nenti.

Name: Anonymous 2011-09-24 1:11

In the 32-bit version you could just OR or AND the register with itself into a new register, but since you decided to switch to 16-bit two-operand instructions, a register-register move should be added so you can fit the equivalent to a three-operand operation into 32 bits.

NOP could be removed because you can OR a register with itself for the same effect. On MIPS, NOP is sll $0, $0, 0 (4 bytes). On x86, it's XCHG AX, AX (1 byte). Some other architectures use "branch never" or a "zero register" for this. You'd only need a dedicated NOP if you're using variable-length instructions and it's impossible to make an instruction that does nothing in the minimal instruction length.

NOT could be replaced by a two-operand "move complemented" or a NOR or NAND (NOT is just doing them on a register with itself), or you can make the immediate specify one of 32 possible one-operand ALU functions or be some sort of mask for altering individual bytes in the register. There are all sorts of ideas, but it's pretty inefficient to just leave 5 unused bits.
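The NOR/NAND point in a few lines of C, as a sanity check rather than a design: feeding the same register to both inputs of either operation gives plain NOT. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Two-operand operations an ISA could offer instead of a one-operand NOT. */
static uint64_t nor64(uint64_t a, uint64_t b)  { return ~(a | b); }
static uint64_t nand64(uint64_t a, uint64_t b) { return ~(a & b); }

/* NOT falls out when both operands are the same register. */
static uint64_t not_via_nor(uint64_t a)  { return nor64(a, a); }
static uint64_t not_via_nand(uint64_t a) { return nand64(a, a); }

/* Check the identity for one value. */
static void check_not_identities(uint64_t x)
{
    assert(not_via_nor(x) == ~x);
    assert(not_via_nand(x) == ~x);
}
```

Either replacement uses both register fields, so the five bits that a one-operand NOT leaves idle do useful work.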

Another issue I see is immediate values. Using a 16-bit instruction to load a 4-bit immediate is terribly inefficient. It wouldn't be so bad if you were trying to make an esoteric language that's designed to be hard to program, but for a practical CPU, you need better ways to load immediates.

Since all jumps are based on registers, the system of loading immediates would affect jumps too. You should also have a "jump and link" or "call" that saves the return address in a register before making a jump in order to call subroutines. You could even use the spare bit in j and use a special register for return (like $31 in MIPS). But if you don't care about practicality or position-independent code, you could save the return address manually by loading immediate values into a register.

Name: Anonymous 2011-09-24 1:36

>>42
Thanks for the insightful advice! Do you have any ideas for loading immediates with 16-bit ops?

Name: Anonymous 2011-09-24 1:38

>>39,40
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄ █ ___ █ █ // 7 █ █ (_,_/\ WORSHIP THIS █ █ \ \ YOUR THROBBING GOD █ █ \ \ █ █ _\ \__ █ █ ( \ ) █ █ \___\___/ █ █ █ ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

Name: n3n7i 2011-09-24 1:39

please play with my balls

Name: Anonymous 2011-09-24 1:48

>>44
fuck, why didn't I think of using that ASCII back when tdavis was around,

Name: Anonymous 2011-09-24 1:55

>>43
You could use three opcode bits to make an 8-bit immediate and have the instruction shift the register left by 8 bits before loading the immediate into the lower 8 bits. That way 8 instructions is the maximum needed to load any 64-bit value instead of 25 (3 per byte+1 for shift count) like before.
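A sketch of the shift-and-insert scheme in C, to show where the instruction counts come from. The helper names (`li8`, `bytes_needed_from_zero`, `plan_load`, `run_load`) are hypothetical; the only thing taken from the post is the operation itself: shift the register left by 8 and put an 8-bit immediate in the low byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One "shift left 8, insert imm8" step. */
static uint64_t li8(uint64_t reg, uint8_t imm)
{
    return (reg << 8) | imm;
}

/* Steps needed when the register is known to start at zero (1..8). */
static int bytes_needed_from_zero(uint64_t value)
{
    int n = 1;
    while (value >>= 8)
        n++;
    return n;
}

/* Fill imm[0..n-1] with the low n bytes of value, most significant first. */
static void plan_load(uint64_t value, int n, uint8_t imm[8])
{
    int i;
    assert(imm != NULL && n >= 1 && n <= 8);
    for (i = 0; i < n; i++)
        imm[i] = (uint8_t)(value >> (8 * (n - 1 - i)));
}

/* Apply the planned steps to a register. */
static uint64_t run_load(uint64_t reg, const uint8_t imm[8], int n)
{
    int i;
    for (i = 0; i < n; i++)
        reg = li8(reg, imm[i]);
    return reg;
}
```

Eight steps push every old bit out of a 64-bit register, which is why eight is the worst case regardless of what the register held before; a zeroed register needs only as many steps as the value has significant bytes.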

Name: Anonymous 2011-09-24 1:58

>>47
Sounds good. However, wouldn't that force me to a maximum of eight instructions, as my opcode field is now 3 bits long? Or are you suggesting I use some sort of context-dependent/variable-length opcode field?

Name: Anonymous 2011-09-24 2:00

>>47
(On second thought, I could split the six-bit opcode field into two three-bit opcode fields, and sometimes only use the first opcode field, and sometimes use both fields)

Name: Anonymous 2011-09-24 2:08

>>49
Yes, I meant something like this. For immediate (and whatever others), 3 bits from the opcode field are combined with the other immediate field to form an 8-bit immediate. For the other instructions, the whole 6 bits is used as the opcode.
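A sketch of what decoding such a split opcode could look like. It assumes the top three bits are always examined first and that one made-up primary opcode value (`OP_IMM`, here 1) marks the immediate form; which values take an immediate is an assumption, as are the names `decoded3`, `encode_imm`, and `decode3`.

```c
#include <assert.h>
#include <stdint.h>

#define OP_IMM 1u /* assumed primary opcode for the 8-bit immediate form */

/* Hypothetical decoded form covering both layouts. */
typedef struct {
    unsigned op_hi;  /* bits 15..13, always present */
    unsigned op_lo;  /* bits 12..10, only in the register form */
    unsigned a;      /* bits 9..5, only in the register form */
    unsigned b;      /* bits 4..0, register reference in both forms */
    unsigned imm;    /* bits 12..5, only in the immediate form */
    int has_imm;
} decoded3;

static uint16_t encode_imm(unsigned reg, unsigned imm)
{
    assert(reg < 32 && imm < 256);
    return (uint16_t)((OP_IMM << 13) | (imm << 5) | reg);
}

static decoded3 decode3(uint16_t ins)
{
    decoded3 d = { 0, 0, 0, 0, 0, 0 };
    d.op_hi = ins >> 13;
    d.b = ins & 31;
    if (d.op_hi == OP_IMM) {
        /* the would-be second opcode half is really the top of the immediate */
        d.has_imm = 1;
        d.imm = (ins >> 5) & 255;
    } else {
        d.op_lo = (ins >> 10) & 7;
        d.a = (ins >> 5) & 31;
    }
    return d;
}
```

The cost of this trick is opcode space: each primary value used for an immediate form gives up the eight secondary opcodes it could otherwise have held.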

Name: Anonymous 2011-09-24 2:13

>>50
Thanks a lot!

Name: Anonymous 2011-09-24 2:18

>>42
Tricks like using instructions that happen to have no side effects are not a good idea in the long run. When you're trying to get more performance later on, these things invariably cause problems. For instance, MIPS32 added a dedicated NOP instruction.

>>43
Fixed-width instruction sets usually allow loading only small immediates, and use PC-relative addressing and constant pools for larger values. See SuperH for one 32-bit instruction set using 16-bit fixed-width instructions.
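A sketch of the constant-pool idea in C: the instruction carries only a small displacement, and the full-width constant lives in memory near the code. The address rule used here (current instruction address rounded down to a multiple of 4, plus 4 times the displacement) and the name `load_pc_relative` are made up for illustration and are not copied from any real manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PC-relative load: read a 32-bit little-endian constant from
   (pc rounded down to a multiple of 4) + 4 * disp. */
static uint32_t load_pc_relative(const uint8_t *mem, size_t mem_size,
                                 uint32_t pc, uint8_t disp)
{
    size_t addr = ((size_t)pc & ~(size_t)3) + 4u * (size_t)disp;
    assert(mem != NULL && addr + 4 <= mem_size);
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}
```

With an 8-bit displacement this reaches 256 pool slots past the instruction, so one 16-bit instruction loads any 32-bit constant, at the price of a data access and some care about where the assembler places the pool.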

Name: Anonymous 2011-09-24 2:46

ekuwap/2 has a five bit address field & is using 8 bit numbers, so is limited to 32 Bytes of memory?

Name: Anonymous 2011-09-24 2:48

>>53
The address field is a register reference. There are 32 registers (which is why each register reference takes 5 bits). Each register can hold 32, 64, 128, ... bits depending on the CPU's bit mode.

Name: Anonymous 2011-09-24 3:07

>>52
MIPS manual:
The NOP instruction is actually encoded as an all-zero instruction. MIPS processors special-case this encoding as performing no operation, and optimize execution of the instruction. In addition, SSNOP instruction, takes up one issue cycle on any processor, including super-scalar implementations of the architecture.
ALPHA reference manual:
Implementations are free to optimize these into no action and zero execution cycles.

MIPS's dedicated NOP (SSNOP) is for filling coprocessor or FPU delay slots. Nearly all RISCs and some CISCs use NOP as a synonym for some other do-nothing instruction and then special-case it in the hardware since they know there is no other reason for a programmer to use that instruction.

The all-zero MIPS NOP is actually sll $0, $0, 0. PowerPC (ori r0,r0,0), ALPHA (LDQ_U R31,0(Rx) for "UNOP", BIS R31,R31,R31 for "NOP", and CPYS F31,F31,F31 for "FNOP"), SPARC (sethi 0,%g0), ARM (MOV r0,r0) and S/360 (BC 0, "branch never") are other architectures that do similar things as MIPS and x86 regarding NOPs.

RISC design includes "synthetic instructions" which are practical because of the fixed-length instructions. In something like 68k there is both CLR.L and MOVE.L because instructions are variable length. In RISCs, there's no point in making a separate CLR instruction because it would be the same size and speed as XORing the register with itself or loading immediate 0 and would just complicate decoding and waste opcode space.

With only a maximum of 64 opcodes, explicit compare, and no mention of any delay slots or coprocessors, I don't think a dedicated NOP would be necessary for this particular CPU. Even if there was an FPU with exposed pipeline, you could special-case OR R0, R0 for the no cycle NOP and use OR with any other registers for the one-cycle NOP, so there's still no need to waste opcodes for a dedicated NOP.

Name: Anonymous 2011-09-24 3:14

>>54 Ah

also, do you pad out the 'Immediate operand' with zeros, or does the next instruction directly follow a nop instruction opcode?

Name: Anonymous 2011-09-24 3:15

>>56
Could you elaborate on your question?

Name: Anonymous 2011-09-24 3:23

>>57 actually nevermind
...Fixed width instructions obviously

Being a virtual machine, you could simulate L1/L2 cache & etc?

Name: Anonymous 2011-09-24 3:28

>>58
Yes. Well, it is currently implemented only as a very, very basic emulator with direct, physical memory and 32 registers, but I'm sure it could be improved. It could also, theoretically, be physically manufactured, but only if I improve this ISA significantly.

Name: Anonymous 2011-09-24 3:41

The super-H // 32-bit over a fixed 16-bit instruction set just uses the space of two instructions for some/all instructions, yeah?

So you could call it an 'Aligned Variable-width' instruction set?

Is there much benefit in squeezing more instructions into a bit less space like this?

Name: Anonymous 2011-09-24 3:54

>>55
68k has redundancies because orthogonality was one of the design goals. For XORing a register with itself to be fast it needs to be special-cased in the implementation and handled as a CLR internally, otherwise you're just adding pipeline stalls.

>>60
All SuperH instructions are 16 bits. Small instructions can reduce code size and perform better on a narrow data bus. There's plenty of material analyzing ARM vs. Thumb from different perspectives. Thumb-2 tries to pack the most used instructions into 16 bits.

Name: Anonymous 2011-09-24 4:10

...a 32 bit instruction set where all instructions are 16 bit..?
so just the numbers / registers are 32 bit?

Name: Anonymous 2011-09-24 4:28

>>62
The instructions are always 16 bits long.
The CPU architecture is not fixed to any word length; it can run in 16-bit, 32-bit, 64-bit, 128-bit, ...

Name: Anonymous 2011-09-24 4:50

@ OP

Why don't you try out a no-instruction-set computer (NISC)?

http://www.ics.uci.edu/~jelenat/pubs/TR05-09.pdf

This baby allows you to make the best use of your fukken transistors, if that is, you can write a compiler good enough for it.

Name: Anonymous 2011-09-24 5:11

Presenting the third version of my RISC ISA:

http://jsbin.com/ekuwap/3

Changes:

- the opcode field is split into two halves; instructions that want 8-bit immediates can use a three-bit opcode
- the `i' instruction now shifts a register 8 bits to the left, and loads an 8-bit immediate into it
- `j' has been renamed to `jc' and a new, unconditional `j' is born because there is no way to do an unconditional jump with the old `j'
- a new `r' instruction is added to move among registers
- the comment column is cleaned up with clearer meanings

Name: Anonymous 2011-09-24 5:33

Example: loading 0x12345678 into the first register takes only four instructions now.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdlib.h>
#include <string.h>

#define MEM 134217728
#define BITS 64
#define uw uint64_t
#define sw int64_t
#define DPRINT printf

int main() {
    uint8_t *mem = calloc(MEM, 1);
    uint16_t *memi = (uint16_t *) mem, ins;
    uint32_t stat = 0;
    uw cia = 0, nia, insoa, insob, insi, insa, insb; /* cia must start at 0 */
    uw *reg = calloc(32, BITS / 8);
    memi[0] = (1 << 13) | (0x12 << 5) | (0);
    memi[1] = (1 << 13) | (0x34 << 5) | (0);
    memi[2] = (1 << 13) | (0x56 << 5) | (0);
    memi[3] = (1 << 13) | (0x78 << 5) | (0);
    while (cia < MEM / 2) {
        nia = cia + 1;
        ins = memi[cia];
        insoa = ins >> 13;
        insob = (ins >> 10) & 7;
        insi = (ins >> 5) & 255;
        insa = (ins >> 5) & 31;
        insb = ins & 31;
        /* Prints only when both opcode halves are non-zero. For `i' the second
           half is the top three immediate bits, so the first op (0x12) is skipped. */
        if (insoa && insob)
            DPRINT("0x%016x:", ins);
        switch (insoa) {
            case 0: switch (insob) {
                case 0: break;
                case 1: reg[insa] = ~reg[insa]; break;
                case 2: reg[insa] &= reg[insb]; break;
                case 3: reg[insa] |= reg[insb]; break;
                case 4: reg[insa] ^= reg[insb]; break;
                case 5: reg[insa] <<= reg[insb]; break;
                case 6: reg[insa] >>= reg[insb]; break;
                case 7: reg[insa] = (uw)(((sw)(reg[insa])) >>
                    reg[insb]); break;
            } break; /* keep each group from falling through into the next */
            case 1: reg[insb] = (reg[insb] << 8) | insi; break;
            case 2: switch (insob) {
                case 0: reg[insa] = ((reg[insa] >> 8) << 8) |
                    mem[reg[insb]]; break;
                case 1: mem[reg[insa]] = reg[insb]; break;
                case 2: mem[reg[insa]] = mem[reg[insb]]; break;
                case 3: reg[insa] = reg[insb]; break;
            } break;
            case 3: switch (insob) {
                case 0: reg[insa] += reg[insb]; break;
                case 1: reg[insa] -= reg[insb]; break;
            } break;
            case 4: switch (insob) {
                case 0: stat = ((stat >> 3) << 3) |
                    ((reg[insa] == reg[insb]) ? 1 :
                    (reg[insa] < reg[insb]) ? 2 : 4); break;
                case 1: nia = ((insb & 1) ? cia : 0) + reg[insa]; break;
                case 2: if ((stat & 7) & ((insb >> 1) & 7))
                    nia = ((insb & 1) ? cia : 0) + reg[insa]; break;
            } break;
            case 5: break;
            case 6: break;
            case 7: break;
        }
        if (insoa && insob) {
            int i;
            for (i = 0; i < 31; i++)
                DPRINT(" %" PRIx64, reg[i]);
            DPRINT("\n");
        }
        cia = nia;
    }
    free(mem);
    free(reg);
    return 0;
}

Output:

0x0000000000002680: 1234 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x0000000000002ac0: 123456 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x0000000000002f00: 12345678 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2011-09-24 5:42

>>43
16-bit opcode followed by 8, 16, or 32-bit immediate value. 16 is already too large for most common operations.

RISC architecture was initially designed to allow higher clockspeeds, but we now know that the laws of physics don't let us go much faster than what we have today. Thus we're shifting back to more powerful instructions that can execute multiple operations and enhance parallelism. It's not about single-cycle instructions and boosting clock frequency anymore --- it's about decoding and executing more instructions per clock. RISC design has fared better in embedded systems, where a simple CPU core that can be integrated into a SoC with low cost is more important than absolute execution speed.

>>42,55
If you have register-register move there are already plenty of opportunities for NOPs (one per register). One way to alleviate this redundancy is to disallow moves from a register to itself, either by putting other instructions in those encodings or by splitting the registers into two blocks so that a move must go from one block to the other.

20 years ago, when the RISC fad was just starting, I predicted it would end eventually and architectures would start moving in the other direction again. I also foresaw the use of RISC in embedded applications.

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2011-09-24 5:50

>>65
1280 of your instructions are NOPs (find them all!)

Name: Anonymous 2011-09-24 5:56

>>68
nop (reg0=idgaf, reg1=idgaf) = 1024 instructions
r (reg0=idgaf, reg1=idgaf, reg0=reg1) = 32 instructions
jc (reg0=idgaf, mask=0b000, r=idgaf) = 64 instructions

I've found 1120 of the 1280 instructions.

Name: Anonymous 2011-09-24 5:58

>>68
Care to share the other 160 of them? I can't find any others that work regardless of register state.

Name: Anonymous 2011-09-24 6:09

mnemonic 'i' could use a single 'on' bit as an opcode
// nop could be made to be a half-length instruction
whether that would be useful though...?

Name: Anonymous 2011-09-24 6:10

>>71
Get out, n3n7i.

Name: Anonymous 2011-09-24 6:13

Fixed instruction jump addressing and incorrect fall-throughs on switches:

#include <stdio.h>

#include <stdint.h>

#include <stdlib.h>

#include <string.h>

#define MEM 134217728

#define BITS 64

#define uw uint64_t

#define sw int64_t

#define DPRINT printf

int main() {

    uint8_t *mem = calloc(MEM, 1);

    uint16_t *memi = (uint16_t *) mem, ins;

    uint32_t stat = 0;

    uw cia = 0, nia, inso, insoa, insob, insi, insa, insb;

    uw *reg = calloc(32, BITS / 8);

    while (cia < MEM / 2) {

        nia = cia + 1;

        ins = memi[cia];

        inso = ins >> 10 & 63;

        insoa = ins >> 13;

        insob = (ins >> 10) & 7;

        insi = (ins >> 5) & 255;

        insa = (ins >> 5) & 31;

        insb = ins & 31;

        if (inso) {

            int i = 16;

            DPRINT("%016llx: ", (unsigned long long) cia);

            while (i--)

                DPRINT("%c", ((ins >> i) & 1) ? '1' : '0');

            DPRINT(":");

        }

        switch (insoa) {

            case 0: switch (insob) {

                case 0: break;

                case 1: reg[insa] = ~reg[insa]; break;

                case 2: reg[insa] &= reg[insb]; break;

                case 3: reg[insa] |= reg[insb]; break;

                case 4: reg[insa] ^= reg[insb]; break;

                case 5: reg[insa] <<= reg[insb]; break;

                case 6: reg[insa] >>= reg[insb]; break;

                case 7: reg[insa] = (uw)(((sw)(reg[insa])) >>

                    reg[insb]); break;

            } break;

            case 1: reg[insb] = (reg[insb] << 8) | insi; break;

            case 2: switch (insob) {

                case 0: reg[insa] = ((reg[insa] >> 8) << 8) |

                    mem[reg[insb]]; break;

                case 1: mem[reg[insa]] = reg[insb]; break;

                case 2: mem[reg[insa]] = mem[reg[insb]]; break;

                case 3: reg[insa] = reg[insb]; break;

            } break;

            case 3: switch (insob) {

                case 0: reg[insa] += reg[insb]; break;

                case 1: reg[insa] -= reg[insb]; break;

            } break;

            case 4: switch (insob) {

                case 0: stat = ((stat >> 3) << 3) |

                    ((reg[insa] == reg[insb]) ? 1 :

                    (reg[insa] < reg[insb]) ? 2 : 4); break;

                case 1: nia = (((insb & 1) ? (cia << 1) : 0) + reg[insa]) >> 1; break;

                case 2: if ((stat & 7) & ((insb >> 1) & 7))

                    nia = (((insb & 1) ? (cia << 1) : 0) + reg[insa]) >> 1; break;

            } break;

            case 5: break;

            case 6: break;

            case 7: break;

        }

        if (inso) {

            int i;

            for (i = 0; i < 32; i++)

                DPRINT(" %llx", (unsigned long long) reg[i]);

            DPRINT("\n");

        }

        cia = nia;

    }

    free(mem);

    free(reg);

    return 0;

}
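
The compare/conditional-jump pair in the listing above is easier to follow in isolation. This is a small sketch, assuming the same bit assignments the simulator uses (1 for equal, 2 for less-than, 4 for greater-than, compared as unsigned values). The flag names and function names are invented here for readability.

```c
#include <stdint.h>

enum { FLAG_EQ = 1, FLAG_LT = 2, FLAG_GT = 4 };  /* names are illustrative only */

/* Mirrors the compare case: the low three bits of stat are replaced
 * by exactly one flag, the rest of stat is preserved. */
uint32_t cmp_update(uint32_t stat, uint64_t a, uint64_t b)
{
    uint32_t flag = (a == b) ? FLAG_EQ : (a < b) ? FLAG_LT : FLAG_GT;
    return (stat & ~7u) | flag;
}

/* Mirrors the jc test: bits 3..1 of the second operand field select
 * which flags cause the jump to be taken. */
int jc_taken(uint32_t stat, unsigned insb)
{
    unsigned mask = (insb >> 1) & 7u;
    return (stat & 7u & mask) != 0;
}
```

Since a compare always sets exactly one flag, a mask of 0b000 can never match, which is why jc with an empty mask behaves as a NOP, and a mask of 0b111 would always match once any compare has run.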

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2011-09-24 6:16

>>70
Nevermind, I double-counted some.

nop 1024
and 32
or 32
m 32
r 32
jc 64

Total is 1216.
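
The tally can be checked by brute force over all 65536 encodings. The sketch below assumes the field layout of the simulator posted in this thread, counts m only on the assumption that memory is plain RAM, and assumes the one spare bit in the jc operand field must be zero (the simulator itself ignores that bit). The function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the 16-bit word is one of the NOP forms listed above,
 * using the field layout of the simulator posted in this thread. */
int is_nop_encoding(uint16_t ins)
{
    unsigned oa = ins >> 13;          /* major opcode */
    unsigned ob = (ins >> 10) & 7u;   /* minor opcode */
    unsigned a  = (ins >> 5) & 31u;   /* first register field */
    unsigned b  = ins & 31u;          /* second register field */

    if (oa == 0 && ob == 0) return 1;                     /* nop, operands ignored */
    if (oa == 0 && (ob == 2 || ob == 3)) return a == b;   /* and/or rX, rX */
    if (oa == 2 && (ob == 2 || ob == 3)) return a == b;   /* m/r with identical registers */
    if (oa == 4 && ob == 2)                               /* jc with an empty mask */
        return ((b >> 1) & 7u) == 0 && (b >> 4) == 0;     /* assumption: spare bit is 0 */
    return 0;
}

/* Count the NOP encodings across the whole 16-bit instruction space. */
unsigned count_nops(void)
{
    unsigned n = 0;
    for (uint32_t ins = 0; ins <= 0xFFFFu; ins++)
        n += (unsigned)is_nop_encoding((uint16_t)ins);
    return n;
}
```

Unassigned opcodes, which the simulator also silently ignores, are deliberately left out of the count.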

Name: Anonymous 2011-09-24 6:19

>>74

m would seem to work given that the two register references are the same, and thus their register values are equal, and thus the effective addresses are equal. However, what if the register value represents an effective address that is outside the available memory? It would have undefined behaviour or crash.

and's `src' is a register reference, not an immediate value, so you can't just `and r*, 0b11111'. Same goes for or.

Therefore, it's 1120, with nop, r and jc.

Name: Anonymous 2011-09-24 6:24

Logarithmic step rotations ?

for bits used as step// rot max

1 bit -> (Rot 0?? // nop?) & Rot 1
2 bit -> (Rot 2) & Rot 4
3rd bit -> (Rot 8) & Rot 16
...

Name: Anonymous 2011-09-24 6:26

Bunged that up hey

3rd bit -> Rot 8 / 16 / 32 / 64 ..?

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2011-09-24 6:28

>>75
From the instruction set, your CPU has absolutely no memory protection/paging/etc. so I'm assuming it's a simple "open" type like a Z80 or 6502. If you access memory that doesn't exist you would just read the value from a floating databus (FFs if there's termination/pullups, other values are possible but it doesn't matter here) and try to write it to the same nonexistent location, so nothing actually happened.

and's `src' is a register reference, not an immediate value, so you can't just `and r*, 0b11111'. Same goes for or.
Read up on boolean algebra identities. Specifically idempotence.

Name: Anonymous 2011-09-24 6:34

>>75
Both the src and the dest for AND and OR are registers, right? As I'm sure you know, ANDing or ORing a number with itself produces that same number, which is why so many RISCs use them as NOPs or (three-operand versions) as register-register moves.

Name: Anonymous 2011-09-24 6:36

>>78
Whoops, I had a major blank of the mind there. Sorry about that.

Of course, `a = a & a' has no effect. Neither does `a = a | a'. Therefore, `and r*, r*' and `or r*, r*' are no-ops when the registers are the same.

Name: Anonymous 2011-09-24 6:37

>>80
>>79

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2011-09-24 6:44

>>76
That reminds me of this:



4.1.1 REGISTER/IMMEDIATE OPERAND FIELD



000 8-bit immediate value follows instruction (#xx)

001 16-bit immediate value follows instruction (#xxxx)

010 32-bit immediate value follows instruction (#xxxxxxxx)

1nn memory operand pointed to by D0 through D4 (@D0-@D4)



4.1.2 FLEXIBLE OPERAND FIELD



000nn memory operand pointed to by D0 through D4 (@D0-@D4)

001nn 32-bit registers D0[W1:W0] through D4[W7:W6] (D0-D4)

01nnn 16-bit registers W0[B1:B0] through W7[B15:B14] (W0-W7)

1nnnn 8-bit registers B0 through B15 (B0-B15)

Name: Anonymous 2011-09-24 6:45

from >>77

Plenty of room!
make sl and sr Opcode 101 and 110 respectively, and you can use
opcode#2 as the log-step variable

can do a Rot 127 (if possible) in 7 instructions using just one... and in 3 instructions using both? shiftRight:64 + shiftRight:64 + shiftLeft:1...

Name: Anonymous 2011-09-24 6:45

>>78
If you access memory that doesn't exist you would just read the value from a floating databus (FFs if there's termination/pullups, other values are possible but it doesn't matter here) and try to write it to the same nonexistent location, so nothing actually happened.
This is usually, but not always, true, because of memory-mapped I/O. Sometimes reading from a memory location and writing a value back (even the same value) has side effects. That matters here, since memory-mapped I/O is probably how this CPU will do I/O, given that there are no IN/OUT port instructions. I know that on the SNES there are some memory-mapped I/O ports that are "open bus" on read but valid on write, or that have destructive reads. An m instruction with the same register on those is definitely not a NOP!

Name: Anonymous 2011-09-24 7:10

>>83
If this CPU didn't have multiple-bit shifts, this would be a good idea (the IBM 5110 can do any 8-bit rotate in 3 instructions by doing this), but it can already shift by any number of bits in one instruction, so a hardware implementation would use a barrel shifter. In that case rotate can use the same format as shifts and then it could rotate by any number of bits in one instruction. It wouldn't make sense to use multiple rotate instructions in an architecture that already needs a barrel shifter and can already do them in one cycle.
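
As a rough illustration of why a barrel shifter makes the multi-instruction scheme unnecessary: the log-step idea is what the hardware already does internally, with one fixed power-of-two stage per bit of the rotate amount. This is a software sketch of that structure for 64-bit values, not a description of any particular implementation, and the function names are made up.

```c
#include <stdint.h>

/* Rotate left by a fixed power of two, k in 1..32. */
uint64_t rotl_fixed(uint64_t x, unsigned k)
{
    return (x << k) | (x >> (64u - k));
}

/* Barrel-shifter style rotate: six stages of 1, 2, 4, 8, 16 and 32 bits,
 * each enabled by one bit of the rotate amount (taken modulo 64). */
uint64_t barrel_rotl(uint64_t x, unsigned amount)
{
    for (unsigned stage = 0; stage < 6; stage++) {
        unsigned k = 1u << stage;
        if (amount & k)
            x = rotl_fixed(x, k);
    }
    return x;
}
```

In hardware all six stages sit in one combinational path, so any rotate amount goes through in a single pass. The software loop only mirrors the structure.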

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2011-09-24 7:32

>>84
When I wrote "memory that doesn't exist" I was referring to the truly nonexistent parts of the memory address space, which have no hardware on the bus to respond to.

>>85
To be complete, the whole family of shift/rotates is
* left shift 0-pad
* right shift 0-pad
* left shift 1-pad (not as useful but included for completeness)
* right shift sign-pad (aka "arithmetic"/sign-preserving shift; it copies the top bit rather than always shifting in 1s)
* left rotate
* right rotate
* left rotate through carry
* right rotate through carry
...which is conveniently encoded in 3 bits.
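
A sketch of that family behind a 3-bit selector, written as single-bit steps so each case is easy to check. The selector numbering is an assumption (the list gives no encoding). The fourth entry is taken to be the arithmetic shift, which replicates the top bit. Having the plain shifts also update the carry is a choice made here, not something stated above.

```c
#include <stdint.h>

enum shift_op {            /* numbering is an assumption */
    SHL0, SHR0, SHL1, SAR, ROL, ROR, RCL, RCR
};

/* Apply one single-bit step of the selected operation.
 * *carry holds 0 or 1 and receives the bit shifted out. */
static uint64_t step(enum shift_op op, uint64_t v, unsigned *carry)
{
    uint64_t top = v >> 63, low = v & 1u, in;
    switch (op) {
    case SHL0: *carry = (unsigned)top; return v << 1;
    case SHR0: *carry = (unsigned)low; return v >> 1;
    case SHL1: *carry = (unsigned)top; return (v << 1) | 1u;
    case SAR:  *carry = (unsigned)low; return (v >> 1) | (top << 63);
    case ROL:  *carry = (unsigned)top; return (v << 1) | top;
    case ROR:  *carry = (unsigned)low; return (v >> 1) | (low << 63);
    case RCL:  in = *carry & 1u; *carry = (unsigned)top; return (v << 1) | in;
    case RCR:  in = *carry & 1u; *carry = (unsigned)low; return (v >> 1) | (in << 63);
    }
    return v;
}

/* Repeat the single-bit step count times. */
uint64_t shift_family(enum shift_op op, uint64_t v, unsigned count, unsigned *carry)
{
    while (count--)
        v = step(op, v, carry);
    return v;
}
```

The rotate-through-carry pair treats the carry as a 65th bit, which is what makes it useful for shifting values wider than one register.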

That also raises another point: Your CPU is missing multiple-precision arithmetic (ADC/SBC) instructions.
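
To make the gap concrete, this is roughly what an add-with-carry does and how it chains across words. It is a sketch in C rather than in the thread's ISA, and the names adc64 and add_multiword are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Add with carry: returns a + b + carry_in and writes the carry out (0 or 1). */
uint64_t adc64(uint64_t a, uint64_t b, unsigned carry_in, unsigned *carry_out)
{
    uint64_t partial = a + b;
    uint64_t sum = partial + (carry_in & 1u);
    /* A wrap in either addition means a carry out of bit 63. */
    *carry_out = (unsigned)((partial < a) | (sum < partial));
    return sum;
}

/* Add two little-endian multi-word integers of n words each;
 * returns the carry out of the most significant word. */
unsigned add_multiword(uint64_t *dst, const uint64_t *x, const uint64_t *y, size_t n)
{
    unsigned carry = 0;
    for (size_t i = 0; i < n; i++)
        dst[i] = adc64(x[i], y[i], carry, &carry);
    return carry;
}
```

Without a carry flag and an ADC-style instruction, each word of the loop has to recover the carry with an extra compare and conditional jump, which is the overhead the missing instructions would remove.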

Name: Anonymous 2011-09-24 7:42

>>86
I only quoted part of your post. You also counted m instructions using the same source and destination registers as NOPs (by assuming it's either RAM or nothing). If the register points to an I/O address, it isn't a NOP. That sort of thing is why C has the volatile keyword.
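
A toy model of the difference, assuming a made-up bus with one device register at address 0x10 whose reads are destructive. None of this comes from the posted design; it only shows why "copy a location onto itself" is not automatically harmless.

```c
#include <stdint.h>

/* Hypothetical bus: address 0x10 is a device register with a destructive
 * read (reading returns the pending byte and clears it); the rest is RAM. */
enum { RAM_SIZE = 256, DEV_ADDR = 0x10 };

struct bus {
    uint8_t ram[RAM_SIZE];
    uint8_t dev_pending;   /* byte waiting in the device */
    unsigned dev_writes;   /* number of writes the device has seen */
};

uint8_t bus_read(struct bus *b, uint8_t addr)
{
    if (addr == DEV_ADDR) {
        uint8_t v = b->dev_pending;
        b->dev_pending = 0;        /* side effect of the read */
        return v;
    }
    return b->ram[addr];
}

void bus_write(struct bus *b, uint8_t addr, uint8_t v)
{
    if (addr == DEV_ADDR)
        b->dev_writes++;           /* side effect of the write */
    else
        b->ram[addr] = v;
}

/* The 'm' instruction with both operands naming the same register. */
void m_same_register(struct bus *b, uint8_t addr)
{
    bus_write(b, addr, bus_read(b, addr));
}
```

On a RAM address m_same_register leaves everything as it was. On DEV_ADDR it consumes the pending byte and registers a write, so the machine state has changed even though the instruction looks like a self-copy.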

Name: Anonymous 2011-09-24 7:53

>>82
could use something like this if you ran out of op-codes

i think most of the instructions so far would fit in just five bits of opcode, bar that 'i', so what OP could do is use the very first bit to signal a 16-bit or 24-bit instruction length?

...full new 23 bit space to use//larger address spaces + more opcodes?

Name: Anonymous 2011-09-24 7:54

>>88

variable length instructions
vomitchan.svg

Name: Anonymous 2011-09-24 11:02

What about operations on the instruction stack..?

Name: Anonymous 2011-09-24 15:40

>>89
fixed length instructions
bloatedvomitchan.xml

Name: Anonymous 2011-09-24 19:48

Most of the book is free. I suggest you read it:
http://www1.idc.ac.il/tecs/

Name: Anonymous 2011-09-24 21:07

Signal_code a=0 [Plus eight bit?]
|a|/Opcode a
...|aa|/Opcode b
......./[bbb]/

Could Easily keep expanding like this..?

Signal_code a=1
|a|/Signal_code b=0 [Plus another 8 bit?]
...|b|/Opcode c?
....../[cc]// Opcode D?
.........../[dddd] (one byte of signals//two bytes to spare..?)

mnemonic pushPageVal |1|0|00|0001| source Reg[5 bit] | dest L1[8 bit] ?

mnemonic popPageVal |1|0|00|0001| dest L1[8 bit] | source Reg[5 bit]

Name: Anonymous 2011-09-25 0:08

>>93
Fuck off, n3n7i.

Name: Anonymous 2011-09-25 2:01

...might i ask where is the instruction stack?

Name: Anonymous 2011-09-25 2:44

...it's a single instruction stack? (It only holds the current instruction..?)

Reg Address compressed instructions?
eg using some signal/marker to specify re-using the last used register? / L1 / etc? [special cased repeats? reuse entire last instruct?] /&/ SIMD-like instructions (single op-code with many source/dest's)?

Name: Anonymous 2011-09-25 3:19

...Also, there are no base-level [array / memory in general]-specific instructions..?

stack push / pop..?

...mnemonic i could be a push (stack size is unknown // first in last out access order?) and takes 8 * >>1 left rot's to return a val? // can only push 8-bits at once/ and pop one bit per instruct?

Barrel rotate (?) means speed is probably not an issue..? But you still use 16 bytes of code for a rotate?

Name: Anonymous 2011-09-25 4:32

*16 bytes of code for a pop
cisc-esque but..

...special pre-emptive source/destination range instructions?

the idea being to cut out a bunch of fairly plain source/dest fields in the usual instructs, instead using a bit larger preempt code followed by x micro 8bit-all-opcode(?) instructions..

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2011-09-25 4:55

>>92
There's this too:
http://www.amazon.com/Code-Language-Computer-Hardware-Software/dp/0735611319

Name: Anonymous 2011-09-25 5:08

>>93
You could do that, but your instruction decoder would get too complicated and it'd be slow as fuck.

Name: Anonymous 2011-09-25 6:08

>>100
probably, though perhaps not?

you only have to check the very first bit to see if its a 16bit instruction >> branch to 16bit decoder //etc. etc.

just using a big select case would be pretty slow// there would be binary-tree style decoders already?

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2011-09-25 6:12

>>100
Not really. That's hardly "complicated", and multibyte instructions/multiple single-byte ones could be easily decoded in a single clock if the databus is wide enough.

Look at how the x86 decoder works. It can determine the instruction length in 1 cycle, and do it for multiple instructions at once. i7s are not "slow as fuck" either.

Name: Anonymous 2011-09-25 6:58

>>101
You're not writing software here.

>>102
If you can determine the total number of instruction bytes from the first byte then sure. This was one of the limiting factors of the VAX. Intel also spends a lot of resources hand-optimizing transistors. It's not really babby's first CPU material.
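
The "length from the first byte" case looks like this in software. The rule used here (top bit clear means a 2-byte instruction, set means 3 bytes) is the 16/24-bit split suggested earlier in the thread, taken as a hypothetical. The function names are invented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule: top bit of the first byte clear -> 2-byte instruction,
 * set -> 3-byte instruction. */
size_t insn_length(uint8_t first_byte)
{
    return (first_byte & 0x80u) ? 3 : 2;
}

/* Walk a code buffer and record where each instruction starts.
 * Returns the number of complete instructions found. */
size_t find_boundaries(const uint8_t *code, size_t len, size_t *starts, size_t max_starts)
{
    size_t count = 0, pos = 0;
    while (pos < len && count < max_starts) {
        size_t n = insn_length(code[pos]);
        if (pos + n > len)
            break;                 /* truncated final instruction */
        starts[count++] = pos;
        pos += n;
    }
    return count;
}
```

The loop is inherently serial: each start depends on the length of the instruction before it. A hardware decoder that wants several instructions per cycle has to get around that, for example by computing a candidate length at every byte offset in parallel and then picking the valid chain, which is where the extra complexity comes from.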

Name: Anonymous 2011-09-25 7:59

>>102,103

you could also build, then use, a single instruction to specify the length of the next n instructions? /depending on how much they vary might be as little as one bit per instrct.

Name: Anonymous 2011-09-25 8:26

can always compress the most common/smallest? length(s) with "shannon's entropy"(?) i think its called, right down to one bit regardless of the number of items, as long as you don't mind a bit of expansion..

1, 01, 001, ... // 01, 10, 001, 110, ... <<this

reminds me of an old run-code compression i tried to invent after reading once =) never got that to work..

Name: Anonymous 2011-09-25 8:49

...think i was doing it backwards?

2 bit -> 1 / 01 / 001 / 000 (typically expands... But because it is complete in both directions, can be used in either direction eg [2 bit] <---> [1 / 01 / 001 / 000]

for comparison a block of data can be broken up into 2 bit pieces and represented as [1 / 01 / 001 or 0001] but not the other way round..

same for [01 / 10 / 001 / 110 / 000 / 111] -> 3 bit but 3bit -/-> [01 / 10 / 001 / 110 / 000 / 111]

a good three bit
[3bits] <--> 01 / 10 / 001 / 110 / 0001 / 1110 / 0000 / 1111
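
What these posts describe is a prefix code: no codeword is the start of another, so a bit stream can be split back into symbols without separators. Below is a sketch of the 2-bit case using the codewords 1 / 01 / 001 / 000 from above, with bits kept as '0'/'1' characters for readability. The function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Codewords for the four 2-bit symbols, taken from the post: 1 / 01 / 001 / 000. */
static const char *const CODEWORD[4] = { "1", "01", "001", "000" };

/* Write the codewords for n symbols (each 0..3) to out as '0'/'1' characters.
 * out must have room for 3*n + 1 characters. */
void prefix_encode(const unsigned char *symbols, size_t n, char *out)
{
    out[0] = '\0';
    for (size_t i = 0; i < n; i++)
        strcat(out, CODEWORD[symbols[i] & 3u]);
}

/* Decode a '0'/'1' string back into symbols; returns the number decoded. */
size_t prefix_decode(const char *bits, unsigned char *symbols, size_t max)
{
    size_t count = 0, zeros = 0;
    for (; *bits && count < max; bits++) {
        if (*bits == '1') {            /* a 1 ends the codewords 1, 01 and 001 */
            symbols[count++] = (unsigned char)zeros;
            zeros = 0;
        } else if (++zeros == 3) {     /* three zeros form the codeword 000 */
            symbols[count++] = 3;
            zeros = 0;
        }
    }
    return count;
}
```

This only saves space when the short codewords are the common ones. With four equally likely symbols the average length is (1 + 2 + 3 + 3) / 4 = 2.25 bits, which is worse than the plain 2-bit encoding.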

Name: Anonymous 2011-09-25 8:54

...append a compression type marker and recompress?
especially if you have a few different versions..?

Name: Anonymous 2011-09-25 9:22

>>102
i7 still uses RISC representations internally, right?
To counter an earlier argument you made, the RISC philosophy is still very applicable when it comes to doing more per clock cycle.
Fancy superscalar tricks like out-of-order execution require very efficient decoding, and IIRC Intel's processor does all of that after the conversion stage.

Name: Anonymous 2011-09-25 17:30

>>93
>>105
You just went full iAPX432, nigga. Never go full iAPX432.

Name: Anonymous 2011-09-25 20:38

>>109
will google

Name: Anonymous 2011-09-25 21:29

>>109
One of the very few processors with variable-bit-length instructions AND forced OOP at the hardware level.

Intel was too ambitious and failed with that one.

Name: Cudder !MhMRSATORI!FBeUS42x4uM+kgp 2011-09-26 4:37

>>108
Micro-ops, which are even simpler representations than RISC instructions. The "RISC philosophy" you're referring to is different from what I was referring to; the latter would be the single-cycle-instruction pipelined CPU model that they still teach in CS classes (unfortunately), often along with now-outdated points like "we can increase clock speed if we make instructions simpler". Superscalar and OOE require multiple decodes per clock too, which is facilitated by short/variable-length instructions. [Assuming a 32-bit databus, a fixed-length 32-bit instruction RISC would be able to decode one instruction per fetch, while e.g. x86 could decode 4 1-byte instructions --- and execute them in parallel if they're things like 4 independent register increments.]

>>105
Give these a read:
http://dl.acm.org/ft_gateway.cfm?id=1835424&ftid=62047&dwn=1&CFID=44651829&CFTOKEN=30301632
http://www.iro.umontreal.ca/~latendre/publications/techReport1219.pdf
