>>47
I think I've proved my point well enough. ;)
>>48
Protection and address space virtualization are still useful even if you don't want to deal with replacing pages on demand.
Segmentation works fine for that.
Also, it's obvious why the 8051 is still around: Intel no longer controls the ISA.
Nor do they need to, since the whole opcode space has been assigned already (except for one slot at A5, which has become a de facto standard for a debugging breakpoint). The 8051 is mature, finished, and stable. There's no need to change it; it's evolved to perfection for its application.
Parallelism. And a RISC that deals with that will be ahead of any CISC.
Good luck finding enough memory bandwidth... CISC will always have the code density advantage. With cores these days running at several times memory speed, the bottleneck is not instruction decoding or execution, it's at the memory interface. It gets even worse with more cores.
CISC scales with hardware improvement because single instructions perform many operations that, although initially slow, can be sped up by better hardware. Something like
add [ebx+esi*4+34], ax might've taken several dozen clock cycles on older CPUs, but could take less than 1 in the future; and it's still the same 5 bytes. Complexity in the instruction set is design that looks forward, building in anticipation of future improvements. The block move instruction I mentioned above is an excellent example: with a RISC ISA all you can do is code a loop, possibly a bloated unrolled one, and it's very difficult for the CPU to figure out that the series of instructions it's executing is a block move, and thus optimise it in hardware. It's easy to break complex instructions apart; it's much harder to combine them together.
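To make the block-move point concrete, here's a rough sketch (NASM-style syntax; register choices and the byte count are illustrative, and the "RISC-style" version is written in x86 mnemonics as a stand-in for a typical load/store loop):

```
; CISC: the intent is in one instruction, so the CPU
; is free to use whatever fast internal copy path it has
        mov     ecx, count      ; bytes to copy
        mov     esi, src
        mov     edi, dst
        rep movsb               ; the entire block move

; RISC-style: the same copy as an explicit loop; the
; "this is a block move" intent is smeared across six
; instructions, which is hard to recognise in hardware
copy:
        mov     al, [esi]       ; load a byte
        mov     [edi], al       ; store it
        inc     esi
        inc     edi
        dec     ecx
        jnz     copy
```

The first form can get faster with every new microarchitecture without recompiling; the second is frozen at whatever the compiler emitted.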
Here's something interesting:
Nehalem: LOOP 6 uops, 4 clocks
Sandy Bridge: LOOP 7 uops, 5 clocks
Bulldozer: LOOP 1 uop, 1-2 clocks
Odd that AMD's optimisation guide still recommends against it, now that it's become faster than the equivalent dec/jnz pair (1.5-2.5 clocks).
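For reference, the two forms being compared are these (sketch only; the loop bodies and counts are placeholders, and the timings quoted above are for the loop-closing instructions themselves):

```
; form 1: the LOOP instruction -- 2 bytes (E2 rel8)
        mov     ecx, 100
back1:
        ; ... loop body ...
        loop    back1           ; dec ecx, branch if ecx != 0 (flags untouched)

; form 2: the manual equivalent -- 3 bytes in 32-bit mode
        mov     ecx, 100
back2:
        ; ... loop body ...
        dec     ecx
        jnz     back2
```

So on Bulldozer you'd be trading a byte of code size for nothing, which makes the guide's advice look stale.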