>>5
Power/PowerPC uses a finer-grained memory model with bidirectional fencing, so it should scale slightly better than x86. Even on x86, though, you can scale almost linearly to 256 cores or more; it just takes a bit more careful tuning of the code by the developer.
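To make the memory-model point a bit more concrete, here is a minimal sketch using C++11 atomics. The function name publish_and_consume and the value 42 are made up for illustration, and this is only one common pattern (release/acquire message passing), not a full picture of Power's fences. The idea is that the ordering between the data write and the flag write has to be requested explicitly: on x86 these operations typically compile to plain loads and stores, while on Power the compiler has to emit barrier instructions to get the same guarantee.

```cpp
#include <atomic>
#include <thread>

// Hypothetical message-passing demo: returns the value the consumer observed.
int publish_and_consume() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the data
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Release: the write to payload cannot be reordered after this store.
        ready.store(true, std::memory_order_release);
    });

    std::thread consumer([&] {
        // Acquire: the read of payload cannot be reordered before this load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

With relaxed ordering on the flag instead, the consumer would be racing on payload, and weakly ordered hardware is where that kind of bug tends to actually show up.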
Larrabee didn't fail outright. It just wasn't competitive at the time with NVIDIA's and AMD's GPUs, which were pushing more FLOPS for less money, and Intel hadn't yet figured out the software tools needed to make it easy to program. As a result, the scope of Larrabee was scaled down.
It's now been resurrected as Intel's Knights Ferry and Knights Corner. The latest version has up to 64 in-order x86-64 cores on Intel's 22nm 3D tri-gate process and is slated for delivery by Q2/2012. Intel has also adopted OpenCL, recently shipping OpenCL 1.1 drivers for x86/x86-64 CPUs, so along with OpenMP they have the software side figured out now. It's primarily targeted at HPC and graphics professionals as a workstation or supercomputing component; it's not intended as a consumer GPU replacement.
http://www.brightsideofnews.com/news/2011/6/20/intel-larrabee-take-two-knights-corner-in-20122c-exascale-in-2018.aspx
Also, I've got access to a 4x 12-core (48 cores total) AMD Magny-Cours Opteron SMP system at work, and some of our software scales linearly on it.
8-core CPUs will become the de facto entry level for power users, enthusiasts, and gamers by early next year. AMD is launching its 8-core consumer Bulldozer CPU on September 19th, and Intel is shipping 8-core (16 hardware threads) enthusiast Ivy Bridge processors in Q1/2012.
Once you get past 4 cores, taking advantage of the hardware really requires the same techniques that also scale nicely to 256+ cores: task-oriented and data-oriented parallelism. So if you program for 8 cores properly, your code will be ready for larger machines.
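For anyone wondering what data-oriented parallelism looks like in practice, here is a rough sketch in plain C++11 threads. The function name parallel_sum and the chunking scheme are my own for illustration; real code would more likely use OpenMP or a task library with a thread pool. The point is that the work is split by data, each worker touches only its own slice and its own output slot, and the chunk count is a parameter rather than something baked in for a particular core count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sum `data` by splitting it into `num_chunks` contiguous slices,
// one worker thread per slice. Each worker writes only to its own
// slot in `partial`, so no locks or atomics are needed.
long long parallel_sum(const std::vector<int>& data, std::size_t num_chunks) {
    assert(num_chunks > 0);
    std::vector<long long> partial(num_chunks, 0);
    std::vector<std::thread> workers;
    workers.reserve(num_chunks);

    // Slice size, rounded up so every element is covered.
    const std::size_t chunk = (data.size() + num_chunks - 1) / num_chunks;

    for (std::size_t i = 0; i < num_chunks; ++i) {
        const std::size_t begin = std::min(i * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        workers.emplace_back([&data, &partial, i, begin, end] {
            // Accumulate locally, then write the shared slot once.
            partial[i] = std::accumulate(data.data() + begin,
                                         data.data() + end, 0LL);
        });
    }

    for (auto& w : workers) {
        w.join();
    }

    // Cheap serial reduction over the per-chunk results.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Calling parallel_sum(values, 8) on an 8-core box and parallel_sum(values, 48) on a 48-core one runs the same code; only the chunk count changes, which is the property that lets this style carry over to bigger machines.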