
ITT: Make your own instruction set.

Name: Anonymous 2008-11-29 17:22

Make your own instruction set for your own made-up processor. How many bits can it access? What kind of tasks will it be used for? Bonus points for making an implementation in sepples, FIOC, lisp, or other GENERAL PURPOSE SCALABLE COST-EFFECTIVE ENTERPRISE QUALITY PROGRAMMING LANGUAGES (such as java).

Name: Anonymous 2008-12-01 2:02

>>40
Illiad

Name: Anonymous 2008-12-01 2:31

>>41
You are insulting only yourself.

Name: Anonymous 2008-12-01 2:41

a set of 256 stacks, shared between 16 cores.
each stack element is 128 bits, and can be used as the real and imaginary parts of a complex number (64 bits each), a single 128-bit signed integer or floating point value, or a machine instruction. each core pops and executes the instruction on top of its instruction stack, which is stack 0 for core 0, stack 1 for core 1, etc. the instruction set is as follows:
0x00000000 000000xx 00000000 00000000
 clear x
removes all elements from x (stack).
0x00000001 000000xx yyyyyyyy yyyyyyyy
 drop x y
removes the top y (64-bit unsigned integer) elements from x (stack).
0x00000002 00xx00yy zzzzzzzz zzzzzzzz
 cdrop x y z
drops the top z (64-bit unsigned integer) values from x (stack) if the top two values on y (stack) are equal, otherwise does nothing. the top two values on y are popped and discarded.
0x00000003 00xx00yy zzzzzzzz zzzzzzzz
 move x y z
moves z (64-bit unsigned integer) elements from x (stack) to y (stack), removing them from x.
0x00000004 00xx00yy zzzzzzzz zzzzzzzz
 copy x y z
copies z (64-bit unsigned integer) elements from x (stack) to y (stack).
0x00000005 00xx00yy 00000000 00000000
 swap x y
swap x (stack) and y (stack).
0x00000006 000000xx yyyyyyyy yyyyyyyy
 dup x y
duplicates the top y (64-bit unsigned integer) elements on x (stack).
0x00000007 000000xx yyyyyyyy yyyyyyyy
 rot x y
rotates the top y (64-bit unsigned integer) elements on x (stack).
0x00000008 00xx00yy zzzzzzzz zzzzzzzz
 uadd x y z
pops and sums (as unsigned integers) the top z (number) elements from x (stack) and pushes the result onto x. if an overflow occurs, pushes the value 1 onto y (stack), otherwise pushes the value 0 onto y.
0x00000009 00xx00yy zzzzzzzz zzzzzzzz
 usub x y z
like uadd, but with subtraction instead of addition, and underflow instead of overflow.
0x0000000A 00xx00yy zzzzzzzz zzzzzzzz
 umul x y z
like uadd, but with multiplication.
0x0000000B 000000xx yyyyyyyy yyyyyyyy
 udiv x y
like uadd, but with division, and no possibility of underflow or overflow.
0x0000000C 00xx00yy zzzzzzzz zzzzzzzz
 sadd x y z
like uadd, but with signed integers instead of unsigned. if an underflow occurs, -1 is pushed onto y.
0x0000000D 00xx00yy zzzzzzzz zzzzzzzz
 ssub x y z
like usub, but with signed integers. if an overflow occurs, -1 is pushed onto y.
0x0000000E 00xx00yy zzzzzzzz zzzzzzzz
 smul x y z
like sadd, but with multiplication.
0x0000000F 000000xx yyyyyyyy yyyyyyyy
 sdiv x y
like udiv, but with signed integers.
0x00000010 000000xx yyyyyyyy yyyyyyyy
 fadd x y
like sadd, but with floating point numbers instead of signed integers and no possibility of overflow.
0x00000011 000000xx yyyyyyyy yyyyyyyy
 fsub x y
like ssub, but with floating point numbers and no possibility of overflow.
0x00000012 000000xx yyyyyyyy yyyyyyyy
 fmul x y
like smul, but with floating point numbers and no possibility of overflow.
0x00000013 000000xx yyyyyyyy yyyyyyyy
 fdiv x y
like sdiv, but with floating point numbers.
0x00000014 000000xx yyyyyyyy yyyyyyyy
 cadd x y
like fadd, but with complex numbers instead of floating point numbers.
0x00000015 000000xx yyyyyyyy yyyyyyyy
 csub x y
like fsub, but with complex numbers.
0x00000016 000000xx yyyyyyyy yyyyyyyy
 cmul x y
like fmul, but with complex numbers.
0x00000017 000000xx yyyyyyyy yyyyyyyy
 cdiv x y
like fdiv, but with complex numbers.
0x00000018 000000xx yyyyyyyy yyyyyyyy
 and x y
pops the top y (64-bit unsigned integer) values from x (stack), performs a bitwise AND on them, and pushes the result onto x.
0x00000019 000000xx yyyyyyyy yyyyyyyy
 or x y
like and, but with OR instead of AND.
0x0000001A 000000xx yyyyyyyy yyyyyyyy
 xor x y
like and, but with XOR.
0x0000001B 00xx00yy zzzzzzzz zzzzzzzz
 pushm x y z
pops a memory location from x (stack), and pushes z (64-bit unsigned integer) values at that location onto y (stack).
0x0000001C 00xx00yy zzzzzzzz zzzzzzzz
 popm x y z
pops a memory location from x (stack), then pops z (64-bit unsigned integer) values from y (stack) and stores them at that memory location.
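
a rough sketch of how the encoding above could be decoded and stepped, in FIOC. nothing here is from the post beyond the bit layouts and the descriptions of clear, drop and uadd; the names (decode, step, TWO_STACK_OPS, MASK128) are made up for illustration, and the top of each stack is assumed to be the end of a Python list.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
MASK128 = (1 << 128) - 1

# Opcodes whose second 32-bit word is laid out as 00xx00yy (two stack operands).
# All other opcodes in the listing use 000000xx (one stack operand).
TWO_STACK_OPS = {0x02, 0x03, 0x04, 0x05, 0x08, 0x09, 0x0A,
                 0x0C, 0x0D, 0x0E, 0x1B, 0x1C}


def decode(word):
    """Split a 128-bit instruction into (opcode, x, y, n).

    y is None for one-stack instructions; n is the low 64-bit field.
    """
    opcode = (word >> 96) & MASK32
    operands = (word >> 64) & MASK32
    n = word & MASK64
    if opcode in TWO_STACK_OPS:
        return opcode, (operands >> 16) & 0xFF, operands & 0xFF, n
    return opcode, operands & 0xFF, None, n


def uadd(stacks, x, y, n):
    """Pop n elements from x, push their sum mod 2**128 onto x, push the overflow flag onto y."""
    total = sum(stacks[x].pop() for _ in range(n))
    stacks[x].append(total & MASK128)
    stacks[y].append(1 if total > MASK128 else 0)


def step(stacks, core):
    """Pop one instruction from the core's own stack and execute it."""
    opcode, x, y, n = decode(stacks[core].pop())
    if opcode == 0x00:        # clear x
        stacks[x].clear()
    elif opcode == 0x01:      # drop x n
        del stacks[x][max(0, len(stacks[x]) - n):]
    elif opcode == 0x08:      # uadd x y n
        uadd(stacks, x, y, n)
    else:
        raise NotImplementedError(f"opcode {opcode:#x} not covered by this sketch")
```

to try it: build `stacks = [[] for _ in range(256)]`, push some values onto stack 16, push the word `(0x08 << 96) | (0x00100011 << 64) | 3` onto stack 0, and call `step(stacks, 0)`. what happens when a stack has fewer than n elements is not specified in the post, so the sketch just lets Python raise.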

Name: Anonymous 2008-12-01 3:07

>>43
Did you give any thought to synchronization at all? How are you supposed to do process management and content switches when your code's on a stack?

Name: Anonymous 2008-12-01 4:38

>>44
i'm not sure what you mean by "content switches"... and how would having the code on a stack possibly make it any more difficult? are you one of those people who doesn't understand stacks?

Name: Anonymous 2008-12-01 5:17

>>45
*context switches, but never mind. I was thinking the stacks would use memory at a fixed location, which doesn't need to be the case. Though then it'd be harder to cache the top of each stack in core registers, without which it'd run like a slow ass.
And while you're wasting bits on stack element count, literals and offsets seem ignored to such a degree that even basic memory access requires elaborate trickery. Slow, slow, slow.

To further assert my belief that the ISA is far too stacky, I shall now commit the fallacy of guilt by association;
You know who else uses stacks? That's right, Java does! And if he were alive, why, Hitler would as well. Nothing like pushing on a couple of jews with your good friend Qosling and popping them into the ovens, is there, you fucking Nazi‽

Name: Anonymous 2008-12-01 6:40

> Though then it'd be harder to cache the top of each stack in core registers, without which it'd run like a slow ass.
it makes a lot more sense to have a large (maybe about 64MB) cpu cache to hold all of the stacks. 16384 128-bit stack elements per stack should be more than enough for most purposes. and you'd only need 28 128-bit registers to hold all the stack element counts. since we're talking about made up processors, why not have, say, 512 registers? that'd be plenty to cache the top few elements from stacks that are used a lot, and even elements that aren't cached in registers would be pretty fast to access since they're always in the cpu cache. certainly a lot faster than a machine with fewer than 32 registers.
also, read this: http://en.wikipedia.org/wiki/Burroughs_large_systems#Stack_speed_and_performance
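
the numbers in this post can be checked with a few lines of arithmetic. the constant names are made up; the one assumption is that a stack element count is stored as a 14-bit field, which is what makes the 28-register figure come out.

```python
STACKS = 256
ELEMENTS_PER_STACK = 16384
ELEMENT_BITS = 128

# Total cache needed to hold every stack at full depth.
cache_bytes = STACKS * ELEMENTS_PER_STACK * ELEMENT_BITS // 8
cache_mib = cache_bytes // (1024 * 1024)

# Bits needed to index 0..16383 within one stack (assumed count width).
count_bits = (ELEMENTS_PER_STACK - 1).bit_length()

# Number of 128-bit registers needed to hold one count per stack.
count_registers = STACKS * count_bits // ELEMENT_BITS

print(cache_mib, count_bits, count_registers)
```

this gives 64 MiB of cache, 14 bits per count and 28 registers, matching the post. note that 14 bits covers 0..16383, so telling a completely full stack from an empty one would need a 15th bit or some other convention.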

> literals and offsets seem ignored to such a degree that even basic memory access requires elaborate trickery.
yeah, sure, adding numbers is "elaborate trickery". and literals aren't ignored. literals can be handled like so (puts four literal values onto stack 16):
move s0 s16 4
.data ( 0x48000000650000006c0000006c
        0x6f0000002c0000002000000057
        0x6f000000720000006c00000064
        0x21000000000000000000000000 )
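
for the curious, the four literals appear to be text. a small sketch, assuming each 128-bit element packs four 32-bit code points with the most significant first and zero padding at the end (the post doesn't say this, it's just how the values read); the function names are hypothetical.

```python
# Same values as in the .data block, with underscores marking 32-bit fields.
LITERALS = [
    0x48_00000065_0000006c_0000006c,
    0x6f_0000002c_00000020_00000057,
    0x6f_00000072_0000006c_00000064,
    0x21_00000000_00000000_00000000,
]


def element_to_codepoints(element):
    """Split one 128-bit element into four 32-bit fields, most significant first."""
    return [(element >> shift) & 0xFFFFFFFF for shift in (96, 64, 32, 0)]


def decode_text(elements):
    """Concatenate the code points of all elements, skipping zero padding."""
    codepoints = [cp for e in elements for cp in element_to_codepoints(e)]
    return "".join(chr(cp) for cp in codepoints if cp != 0)


print(decode_text(LITERALS))
```

under that assumption the data decodes to "Hello, World!".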

Name: Anonymous 2008-12-01 7:23

>>47
You need to read CA:AQA instead of SICP.

Name: Anonymous 2008-12-01 8:14

Name: Anonymous 2008-12-01 10:19

>>49
All of it. Well maybe you can skip the parts about storage systems.

Name: Anonymous 2008-12-02 2:50

>>50
i have read all of it. i'm wondering which parts you think are relevant.

Name: Anonymous 2008-12-02 6:14

>>51
see >>50

Name: Anonymous 2008-12-03 18:16

one stack.  one instruction.  one big problem.

The Sussman:"It only pushes numbers!?"

Name: Anonymous 2008-12-03 18:42


push 01
push 02
push 03
push 04
push 0A

Name: Anonymous 2008-12-03 19:17

>>21
what is the reference here

Name: Anonymous 2008-12-03 19:41

push a
push b
push c
cdr %eax
car $16

Name: Anonymous 2008-12-03 23:31

>>53
It's Hollywood-ready!

Name: Anonymous 2008-12-04 0:25

DRIVE LIKE JEHU

Name: Anonymous 2009-02-26 22:22

I just thought I would bump this thread. It's more interesting than the ones on /prog/ currently.

Name: Anonymous 2009-02-26 22:45

>>11
Dynamically sized registers are tricky to implement in hardware: each bit in a register corresponds to a digital circuit, so all resources in a CPU are effectively preallocated (once you decide what they are, they become a fixed number of cells when synthesized). In practice you have to settle on a maximum register size such as 32, 64, 128, ... If you want more than that, you need to emulate the operations (for example with some specific microcode and some RAM), which will probably end up incredibly slow, and the additional circuitry would probably lower the speed as well. That said, most of Intel's CPUs (including the Core2 line) are largely custom designs where everything is hand-optimized to reach the needed performance, so it's quite unlikely (if not almost impossible) that you could build something to your specs that is as complex as an x86 CPU and beats Intel on speed using the same manufacturing tech. You could achieve better speeds with a much simplified RISC architecture with many optimizations, a full custom design, and a good compiler to generate fast code.
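
To make "emulate the operations" concrete, here is a minimal sketch of adding values wider than the native register by chaining fixed-width words with a carry. It only illustrates the idea in software; the 64-bit word size, the little-endian word order and the name add_wide are assumptions for the example, not anything a real microcode implementation is known to do.

```python
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1


def add_wide(a_words, b_words):
    """Add two equal-length lists of 64-bit words (least significant word first).

    Returns (sum_words, carry_out). Each loop iteration stands in for one
    native-width add-with-carry, so wider values cost proportionally more steps.
    """
    if len(a_words) != len(b_words):
        raise ValueError("operands must have the same number of words")
    result = []
    carry = 0
    for a, b in zip(a_words, b_words):
        s = a + b + carry
        result.append(s & WORD_MASK)
        carry = s >> WORD_BITS
    return result, carry
```

For example, `add_wide([WORD_MASK, 0], [1, 0])` carries out of the low word and yields `([0, 1], 0)`. One step per word is the cost the post is pointing at.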

Name: Anonymous 2009-02-26 22:49

>>59
Yep.  And my magic CPU would have a functional assembly (as hinted by somebody else above) with hardware garbage collection.

Name: Anonymous 2009-02-26 23:55

>>15
He meant variable bit-length instructions, not variable bit-length registers. Sort of like how offsets/lengths in plain LZ12/4 compression are always 12 and 4 bits, but the actual data is bit-level packed.
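
A small sketch of what bit-level packing of fixed-width fields looks like, using the 12-bit offset and 4-bit length from the post as the example. The field order (first field most significant) and the function names are assumptions for illustration, not the layout of any particular compressor.

```python
def pack_fields(fields):
    """Pack (value, width) pairs into one integer, first field most significant.

    Returns (packed_value, total_bits).
    """
    packed = 0
    total_bits = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        packed = (packed << width) | value
        total_bits += width
    return packed, total_bits


def unpack_fields(packed, widths):
    """Inverse of pack_fields for a known sequence of field widths."""
    values = []
    shift = sum(widths)
    for width in widths:
        shift -= width
        values.append((packed >> shift) & ((1 << width) - 1))
    return values
```

Packing an offset of 0xABC and a length of 5 with `pack_fields([(0xABC, 12), (5, 4)])` gives a single 16-bit token, and `unpack_fields(token, [12, 4])` recovers both fields. The same helpers work for instruction fields of differing widths.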

Name: Anonymous 2009-03-06 11:53


The Desert of Indentation Wars between the   Guidans and THE   peculiarities of each   architecture down to   a grinding halt?

Name: Anonymous 2010-11-28 0:09

Name: Anonymous 2010-12-20 21:47

Name: Anonymous 2011-01-31 21:17

<-- check em dubz
