

The god has spoken

Name: Anonymous 2011-08-13 23:02

Carmack does hope that Sony avoids the Cell architecture altogether due to the difficulty of development.

Fuck Cell! For Carmack's sake!

Name: Anonymous 2011-08-13 23:18

I thought Cell had already been defeated by Gohan.

Name: Anonymous 2011-08-13 23:42

>>1
He's in luck. The rumors are that Sony is considering a variant of IBM's Power 7 for the Playstation 4 on a 28nm high-performance HKMG process, which means it would probably feature 8 cores. Power 7 has an extremely deep instruction pipeline and features 4-way hyperthreading, which would give it 32 hardware threads. And since all of the cores are homogeneous, you don't have to fuck around with DMA'ing shit and compiling your code for two different ISAs. It would be around 12 times faster than the Cell CPU in the PS3.

http://www.ps3news.com/PlayStation-3-PSN-News/rumor-sony-chooses-ibm-power7-cpu-for-playstation-4-in-2012/

Name: Anonymous 2011-08-14 1:27

>>2
It doesn't make a bit of difference guys. The cores are inert.

Name: Anonymous 2011-08-14 2:08

>>3
SMP with 32 cores? How the fuck is cache coherency going to scale on that? Didn't the Larrabee project fail precisely because of SMP scaling badly to large core counts?

On another note, I don't actually give a fuck about this since my code will probably never run on such a machine (if it ever ends up existing), since Sony is probably going to lock it up like hell using hardware crypto and other such evil goodies. Oh well. I hope they won't try escaping taxes this time by declaring it as a ``general purpose'' computer; that would be adding insult to injury.

>>4
I love you.

Name: Anonymous 2011-08-14 8:53

>>5
Power/PowerPC has a more relaxed memory model with finer-grained explicit fencing, so it should have slightly better scalability than x86. But even with x86, you can easily scale up to 256 cores or more almost linearly; it just takes a little more caressing of the code from the software developer.
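To make the fencing point concrete: the very same C++11 atomics compile to plain loads and stores on x86, but need explicit fence instructions on Power because the hardware reorders more aggressively. A minimal release/acquire message-passing sketch (standard C++11 only; the function name is mine, purely for illustration):

```cpp
#include <atomic>
#include <thread>

// One-shot message passing. The release store publishes 'payload'; any
// thread observing ready == true via the acquire load is guaranteed to
// also see payload == 42. On x86 both atomics compile to ordinary mov
// instructions (the strong hardware model already orders them); on Power
// the compiler has to emit lwsync/isync fences to get the same guarantee.
static int payload = 0;
static std::atomic<bool> ready(false);

int publish_and_consume() {
    std::thread producer([] {
        payload = 42;                                  // plain store
        ready.store(true, std::memory_order_release);  // fence before store on Power
    });
    while (!ready.load(std::memory_order_acquire))     // fence after load on Power
        ;                                              // spin until published
    int seen = payload;                                // ordered after the acquire
    producer.join();
    return seen;
}
```

If you weakened both orderings to memory_order_relaxed, Power would be allowed to let the consumer observe ready == true while still seeing the stale payload; x86 would not.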

Larrabee didn't fail outright, it just wasn't competitive at the time with nVidia's and AMD's GPUs, which were pushing more FLOPs for less cost. Plus Intel hadn't yet figured out the software tools to make it easy to program, so the scope of Larrabee was scaled down.

It's now been resurrected as Intel's Knight's Ferry and Knight's Corner. The latest version has up to 64 in-order x86-64 cores on Intel's 22nm 3D tri-gate process and is slated for delivery by Q2/2012. Plus Intel has adopted OpenCL, recently shipping OpenCL 1.1 drivers for x86/x86-64 CPUs, so along with OpenMP they have the software issue figured out now. It's primarily targeted at HPC and graphics professionals as a workstation or supercomputing component; it's not intended as a consumer GPU replacement.

http://www.brightsideofnews.com/news/2011/6/20/intel-larrabee-take-two-knights-corner-in-20122c-exascale-in-2018.aspx

Also, I've got access to a 4x 12-core (48 cores total) AMD Magny-Cours Opteron SMP system at work, and we have some of our software scaling linearly on it.

8-core CPUs for power-users, enthusiasts, and gamers will become de facto entry-level by early next year. AMD's launching their 8-core consumer Bulldozer CPU on September 19th, and Intel is shipping 8-core (16 hardware threads) enthusiast Ivy Bridge processors in Q1/2012.

Once you get past 4-cores, in order to take advantage of the hardware, you really need to start using the same techniques that also scale nicely up to 256+ cores: task-oriented and data-oriented parallelism. So if you start programming for 8-cores properly, your code would be ready for larger machines.

Name: Anonymous 2011-08-14 8:54

>>6
Bumping. I shouldn't have saged, just realized this thread went off the first page.

Name: TRUE TRUTH EXPERT 2011-08-14 10:32

aIN'T NOTHING A P1 CAN'T DO

Name: Anonymous 2011-08-14 11:57

>>6
I'm more interested in what and how much RAM it'll have.
Also, isn't this more /tech/ or /g/ related than programming related?

Name: Anonymous 2011-08-14 12:32

>>6
Isn't the big problem with multiple cores the whole "only certain tasks can be parallelized" and "raping the ram" thing?

For shit like raw calculations I can see how you can have linear scaling. When we're talking about AI or other state-dependent calculations it's much worse, with the ram throughput needing to be ridiculously high and parallel on the small scale rather than a large one. I know they use threading for shit like audio and input handling, but when we're talking about processors where a single processor can handle everything except the physics, AI, and graphical requirements, it's unimportant to "properly" thread shit like audio.

If they had like a 64MB shared L3 or L2 cache I might see how 8 cores will be efficient, but otherwise, it's going to be basically impossible to program for.

Now, on the other hand, if they instead used the other cores for physics and GPU enhancing when they're not parallelizing a few AIs or something, I can see that being useful, but to efficiently use 8 cores in real tasks is pretty impossible. Also, as we've seen from AMD 4xxx series, deep pipelining is only useful if you customize your compiler and actually can pipeline, and from hyperthreading, we know it gets to be basically shit with more than 2 threads because you eventually DO need to calculate. Faster ram just lessens the usefulness of hyperthreading as you need to wait less and less for data transfers.

Maybe if they could find some way to efficiently pipeline AND use hyperthreading to be pseudo processing, but at the same time, they could instead just use Bulldozer modules and scale much more effectively instead of trying to create a processor which dynamically pipelines tasks.

Name: Anonymous 2011-08-14 12:36

>>10
tl;dr version?

Name: Anonymous 2011-08-14 12:36

>>9
Neither /tech/ nor /g/ would truly appreciate the implications for having access to such hardware.

Knight's Corner will come with either 2GB or 4GB of GDDR5 ECC memory (typical of workstation GPU/compute units), there's an 8MB shared L2 cache, and each core has its own pair of 32KB L1 instruction and 32KB L1 data caches. It will feature 512-bit AVX2 + ROP SIMD instructions with a fused multiply-add pipeline (perfect for 16-way 32-bit floating-point vectors or 4x4 32-bit floating-point matrix transformations), and ROP brings in integer operations on 256-bit and 512-bit integer vectors for performing Raster OPerations.

You will also be able to run multiple Knight's Corner units in tandem on the same shared-memory machine (OpenCL handles working with multiple heterogeneous compute units and gives the programmer fine-grained control over workload distribution). Each unit will be capable of 6 teraflops of single-precision arithmetic, or around 2 teraflops of double-precision.

Name: Anonymous 2011-08-14 12:39

>>11
Pipelining has issues
Hyperthreading has issues
>4 cores has severe issues
It's impractical unless you have a group of knowledgeable programmers custom designing their stuff to your platform.

Name: Anonymous 2011-08-14 12:43

>unless you have a group of knowledgeable programmers custom designing their stuff to your platform

Why wouldn't you have that? Oh right, web development?

Name: Anonymous 2011-08-14 12:46

>>14
The PS3 was already hard enough to design programs for.

Multiple cores are good when you have things like media managers, input handlers, web browsers, shit running in the background.

Look at what happened with the 4xxx series with pipelining.

People already complained about the difficulty of programming for the PS3 when the only drawback was 8 cores. Now throw in pipelining guidelines and hyperthreading guidelines and I could easily see many programmers running to Nintendo or Microsoft, while the more faithful programmers make even worse games because they'll be expected to have like displacement maps on everything, high-res textures, and be expected to program in a horrible style that only works for the PS4.

Name: Anonymous 2011-08-14 13:16

>>10
Yes. That is the big problem. Part of that problem, however, is that a lot of people are stuck in the old school-of-thought when it comes to writing scalable, parallel code. When you ask them to make a bit of code threaded, the first thing they think of is to use a mutex or a reader-writer lock to serialize access to that code and then instantiate a few threads to execute it, and maybe use a semaphore or condition variable to signal events or pass messages between threads--of course that's not going to scale. I get the feeling this is what you're still thinking of.

The reality is that it's actually quite feasible to parallelize a lot of code, or transformations on the underlying data, in ways that don't serialize access to the data. You just need to adapt, think outside of the box and take a look at what other people are doing to solve these problems.

Threading code around the idea of modules or subsystems is old-school, and doesn't scale. Like you mentioned, you'd use an extra core to handle physics or AI or some other subsystem in a game engine. It doesn't work.

The way around it is to use task-oriented and data-oriented parallelism. And part of that involves getting rid of fine-grained object-orientation. Just fucking stop using OOP; it's what's holding back a lot of code from being made to scale. Suddenly you'll find yourself being able to write code that scales up to hundreds of cores, even on x86. And not only that, but you'll also find that the code you write is smaller and simpler.

For example, with my own task-oriented toolkit I've been working on for C++11/C++0x, based off of Intel's TBB, searching for the offset of a particular value within a vector using a map-reduce type pattern looks like:

vector<int> values; // initialized with say a million integers
constexpr int value_to_find = 20415001;

structured_task_group group;
combinable<ptrdiff_t> result(PTRDIFF_MAX);

// search for the offset of the first occurrence
parallel_for(values.begin(), values.end(), group, [&](vector<int>::iterator i) {
    if (*i == value_to_find) {
        ptrdiff_t offset = i - values.begin();
        result.value(offset);
        group.cancel_work_unit();
    }
});

// final reduction step
ptrdiff_t final_offset = result.reduce([&](ptrdiff_t x, ptrdiff_t y) {
    return (x < y) ? x : y;
});


There's no overhead of starting up new threads, as they're already sitting idle waiting for work. The partitioning is automatically done based on the number of hardware cores that are available (it's possible to write your own partitioner for different use cases to better fit the data). The task-scheduler is based off the SLAW adaptive work-stealing scheduler and will automatically load-balance work units across cores. And there's no write-sharing, which causes memory performance issues, thanks to the TLS mechanisms built into the combinable template container.

This is just a simple example, but doing things like parallel quick/intro or radix sorts, parallel tree traversals, parallel updating all of your entities in game engine each frame, asynchronous file/network I/O, etc. isn't that much more code, maybe a few dozen extra lines.
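For instance, here's a task-parallel quicksort sketched with plain C++11 std::async instead of the toolkit above (illustration only: a real version would go through a work-stealing scheduler rather than spawn OS-level tasks, and the 4096 serial cutoff is an arbitrary guess):

```cpp
#include <algorithm>
#include <future>
#include <vector>

// Task-parallel quicksort: each level partitions the range into two
// disjoint halves and sorts the left half as a separate task, so no
// locking is needed anywhere. Below a size/depth cutoff it falls back
// to serial std::sort so task overhead doesn't dominate.
void parallel_qsort(std::vector<int>::iterator lo,
                    std::vector<int>::iterator hi, int depth = 3) {
    if (hi - lo < 2) return;
    if (depth == 0 || hi - lo < 4096) { std::sort(lo, hi); return; }
    int pivot = *(lo + (hi - lo) / 2);
    auto mid = std::partition(lo, hi, [pivot](int x) { return x < pivot; });
    if (mid == lo) { std::sort(lo, hi); return; }  // degenerate pivot, bail out
    // the two halves are disjoint, so both can be sorted concurrently
    auto left = std::async(std::launch::async,
                           [lo, mid, depth] { parallel_qsort(lo, mid, depth - 1); });
    parallel_qsort(mid, hi, depth - 1);
    left.get();
}
```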

This is the kind of stuff game developers are starting to use in their engines, and what Lispers and Haskellers have been raving about all these years.

Name: Anonymous 2011-08-14 14:34

>>16
I read your post

Name: Anonymous 2011-08-14 15:05

>>16
Pretty much this. I find that the majority of enterprise/web developers and university graduates fall into the camp of people who still think multi-threading is too hard, due to their ardent following of object-orientation. The reality is that many of the hard engineering and algorithmic problems of writing parallel software have already been solved. It's just going to take another 10 years until all of the existing common code-monkeys die off and are replaced by a new generation of code-monkeys who are not conditioned into thinking it's too difficult or mysterious or uncharted territory.

Name: Anonymous 2011-08-14 15:35

>>18
gib docs plz thx

Name: Anonymous 2011-08-14 15:48

>>16
Great, now you're raping the cache by loading various parts of that array into the other cores. You're wasting time just to set up the thread pool. Unless this task is a choke point for your entire application, wouldn't it be simply more efficient to schedule *other* things on the other cores?

In other words, I don't get why it's a better idea to have multiple cores caress a single piece of data at a time rather than having each core caress its own piece of data.

Name: Anonymous 2011-08-14 16:13

>>20
>Great, now you're raping the cache by loading various parts of that array into the other cores.
You don't know how a multi-level cache actually works, do you? Reading data from memory scales linearly. You can read the same cache line of data on as many physical cores simultaneously as you like and there won't be a single cache fault. It is ``write-sharing'' that you need to worry about. Writing, or read-modify-writing, to a shared location of memory from multiple cores simultaneously causes that cache-line to be invalidated mid-flight, which will result in a cache fault the next time a different core tries to access it, thus forcing the memory controller to serialize access. However, writing to a cache-line from a single core that other cores currently aren't attempting to read or write also scales fine. So the trick is to remove write-sharing.
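The remove-write-sharing trick can be shown with nothing but std::thread: give each worker its own cache-line-sized slot so no two cores ever write the same line, then reduce single-threaded at the end. A sketch in plain C++11 (the alignas(64) assumes a 64-byte cache line; strict over-aligned allocation inside a vector is only guaranteed from C++17, but the 64-byte spacing alone already prevents write-sharing):

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Each worker accumulates into a private, cache-line-sized slot, so no two
// cores ever write to the same cache line (no write-sharing, no false
// sharing). The only shared data, 'data', is read-only, which scales
// linearly; a single-threaded reduction combines the partial sums at the
// end, the same shape as the combinable<> container described above.
struct alignas(64) PaddedSum { std::uint64_t value = 0; };

std::uint64_t parallel_sum(const std::vector<std::uint32_t>& data,
                           unsigned workers) {
    std::vector<PaddedSum> partial(workers);
    std::vector<std::thread> pool;
    const std::size_t chunk = data.size() / workers;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t lo = w * chunk;
        const std::size_t hi = (w + 1 == workers) ? data.size() : lo + chunk;
        pool.emplace_back([&partial, &data, lo, hi, w] {
            std::uint64_t local = 0;       // stays in a register, never shared
            for (std::size_t i = lo; i < hi; ++i)
                local += data[i];
            partial[w].value = local;      // exactly one write per worker
        });
    }
    for (auto& t : pool) t.join();
    std::uint64_t total = 0;               // final single-threaded reduction
    for (const auto& p : partial) total += p.value;
    return total;
}
```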

>You're wasting time just to set up the thread pool.
Setting up the task scheduler is only done once when the program first starts. That's it. There is no additional overhead whenever you dispatch concurrent tasks. And in fact, if you use cooperative user-mode scheduling with your task scheduler (Windows 7 UMS threads for example), there's not even any kernel syscall overhead or kernel mode context-switching. (Also, it's not a mere thread-pool, as each worker thread multiplexes tasks from local lock-free queues).

>In other words, I don't get why it's a better idea to have multiple cores caress a single piece of data at a time rather than having each core caress its own piece of data.
They do only get their own piece of data. parallel_for uses a linear fixed-size partitioner. If your machine has 8 logical cores, and the vector<int> values container has 2^20 = 1024 * 1024 elements, then the partitioner will schedule and dispatch 8 jobs to the task-scheduler, the first job iterating over elements [0, 2^20 / 8), the second job iterating over elements [2^20 / 8, 2 * 2^20 / 8), and so on. Each element may only be touched once, and in fact, each job will terminate as soon as it finds the value it was searching for, as we're only interested in the first such element in the entire container. Multiple cores aren't touching the same piece of data, they're touching disjoint parts of it.

So even if reading the same memory from multiple cores didn't scale linearly, the parallel_for example above still would scale linearly.
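The arithmetic behind that fixed-size partitioner is trivial to write down; a sketch (the function name is hypothetical, not from any real toolkit):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Split [0, n) into 'workers' contiguous, disjoint half-open ranges, with
// the last range absorbing any remainder. With n = 2^20 and 8 workers this
// yields [0, 131072), [131072, 262144), ..., [917504, 1048576), exactly
// the job ranges described above.
std::vector<std::pair<std::size_t, std::size_t>>
fixed_partition(std::size_t n, std::size_t workers) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    const std::size_t chunk = n / workers;
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t lo = w * chunk;
        const std::size_t hi = (w + 1 == workers) ? n : lo + chunk;
        ranges.emplace_back(lo, hi);
    }
    return ranges;
}
```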

Name: >>20 2011-08-14 16:16

>>21
That was truly enlightening.  Thank you.

Name: Anonymous 2011-08-14 16:43

>>21
Yeah that was awesome, thanks. Now where are the docs?

Name: Anonymous 2011-08-14 16:55

>only certain tasks can be parallelized
This. I'd rather have more single-thread and memory parallelism in hardware so that e.g. the following sequence of instructions could be executed in one clock cycle:


mov eax, [esi+8*edi+12345678h]
mov ebx, [esi+4*edi+23456780h]
mov ecx, [esi+2*edi+34567894h]
mov edx, [esi+edi+4567890Ch]


I believe stuff like that and optimization in hardware (like dead code elimination, so e.g. if you have "mov eax, ebx" followed by another "mov eax ..." the processor doesn't even attempt to execute the first instruction) is what will really boost performance, not THROW MORE CORES AT IT LOL

Name: Anonymous 2011-08-14 17:12

>>23
>Now where are the docs?
http://www.1024cores.net/
http://mc-fastflow.sourceforge.net/
http://threadingbuildingblocks.org/
http://msdn.microsoft.com/en-us/library/dd504870.aspx
http://libdispatch.macosforge.org/
http://openmp.org/wp/openmp-specifications/
http://www.khronos.org/opencl/

>>24
You will never see CPUs running on air-cooling or conventional water cooling clocked at more than 5GHz reliably. You will have to discover new physics that gets around the laws of thermodynamics. I don't even want to hear you mention ``but quantum-computing!'' because that would only show how utterly retarded you are. Graphene looks promising, but it still won't get you over ~8GHz reliably for a general-purpose CPU, where you can run a machine 24/7 on air-cooling without worrying about faults.

However, you will see CPUs with 256-cores on-die each clocked at 1.5GHz - 2.0GHz once we get down to 8-12nm. Personally, I'll take what I can get without being a faggot bitch about it like you.

Also, what are you doing using faggot 32-bit x86 SISD ALU instructions when you could have just used a single 128-bit SSE instruction in 64-bit mode:

movups xmm0, [rsi+8*rdi+12345678h]

Name: Anonymous 2011-08-14 17:24

>>24
>I believe stuff like that and optimization in hardware (like dead code elimination, so e.g. if you have "mov eax, ebx" followed by another "mov eax ..." the processor doesn't even attempt to execute the first instruction) is what will really boost performance, not THROW MORE CORES AT IT LOL
I also forgot to mention that Intel already does this in their most recent processors, they've pretty much maximized what they can do with single-core instruction parallelism. The named registers, eax, ebx, etc. don't even map to physical registers anymore, each core comes with a few hundred general purpose registers and the CPU's micro-op architecture will dynamically allocate internal registers to named registers.

So if you have something like:

mov [esi+8*edi+12345678h], eax
mov eax, 12h
and eax, [esi+8*edi+11521530h]


The CPU will detect that moving the value of 12h into eax has nothing to do with the previous value stored in eax, and so it will use a different internal register and continue along, while the store in the first instruction is executing.

There is so much insane stuff that modern x86 CPUs do that it's not funny. I really don't think Intel and/or AMD can squeeze much more out of it, maybe an extra 5% over the next few generations, but you're getting diminishing returns every time.

Instead, they're focusing on scaling things up with more cores and with wider SIMD instruction pipelines (they plan to have 1024-bit wide AVX instructions by 2018).

Name: Anonymous 2011-08-14 17:45

>>25
That's a goldmine. Words cannot express my gratitude.

Name: Anonymous 2011-08-14 18:04

>>27
You're welcome. Spread the word. It's time people started diving into the deep-end of the concurrency pool.

Name: Anonymous 2011-08-14 18:09

>>25
I'm saying we don't need more cores nor clockspeed if we can get more parallelism going on within a single instruction stream.
>Also, what are you doing using faggot 32-bit x86 SISD ALU instructions when you could have just used a single 128-bit SSE instruction in 64-bit mode:
To illustrate the parallelism. It could execute 8 of those movups all with full addressing modes in the same cycle too, and you can see that this wouldn't need more than compiler (or assembler) level changes (if any) to take advantage of.
>they've pretty much maximized what they can do with single-core instruction parallelism
Not really, if you've read the Intel specs there are still instruction sequences that could be parallelized but the core doesn't have enough execution resources for them (like ALUs in my example above).
>(they plan to have 1024-bit wide AVX instructions by 2018)
That reminds me, we also need parallelism with memory. What good is having ridiculously wide registers and parallel execution capability if the databus is narrow and (more importantly) the memory can only access 1 address at a time and in tiny chunks? Something like 4 ports by 1024 bits each would be ideal for instructions like the examples above. Note that this also nearly eliminates stupid alignment requirements for smaller data pieces since a 32-bit dword is going to need two accesses only if it ends up straddling two 128-byte blocks (3% of the time) and if the RAM is multiport the CPU could decide to parallelize there and read both blocks at the same time too.

THEN you scale it up with multiple cores that can each do all of the above, and you'll have one hell of a machine.

Name: Anonymous 2011-08-14 18:56

>I'm saying we don't need more cores nor clockspeed if we can get more parallelism going on within a single instruction stream.
So you're saying we should all just settle with, say, a 5GHz single-core CPU capable of say 60 gigaflops (a single core in a Sandy Bridge is around 16 gigaflops at 2GHz for comparison), and just forget about the 40+ teraflops we could have had with a 256-core in-order x86-64 with 1024-bit wide SIMD compute unit, like what we will be getting with a future succession of Larrabee/Knight's Corner on an 8nm-12nm process, simply because you find programming for concurrent architectures not to your liking?

For a little bit of discomfort, you can get huge gains of a few orders of magnitude, and more if you go SMP, and even more if you go distributed. You aren't going to see those kinds of gains in single-core land over what we already have.

>Not really, if you've read the Intel specs there are still instruction sequences that could be parallelized.
There's no point in wasting transistors on instructions or instruction sequences that are rarely used and, when they are used, aren't major bottlenecks. Intel, AMD, and the other major CPU architecture designers profile lots of third-party software and look at what will give their customers the best bang for their buck. They don't waste resources on stuff that will only give you marginal gains.

>That reminds me, we also need parallelism with memory.
Good point. And that's exactly what's happening, it's just that it's more expensive, so it's more cost-effective to focus on raw compute power first. Most current DDR3 systems today are dual-channel, but you're probably aware that the Core i7 9xx Nehalem CPUs are triple-channel systems. The upcoming enthusiast 8-core Core i7 Ivy Bridge CPUs on Socket LGA2011, as well as the AMD Opteron 12/16-core Bulldozer CPUs on socket LGA1974 will support quad-channel DDR3.

DDR4 moves beyond this. Hopefully it will be available for the Haswell and Piledriver architectures in early 2013. Instead of blocks of channels, each DIMM has its own channel, so in an 8-slot system you've got 8 channels. They're planning on debuting 3D stacking of memory units, so it's possible that in a future DDR5 iteration for the 2018 release time-frame, each layer in a stack would be assigned its own channel.

I'm not sure what Intel/AMD are going to do about the cache-line size, right now it's 64-bytes (it used to be 128-bytes on the P4, but that was part of the reason why the P4 sucked at the time). But with 512-bit AVX2, that's your 64-bytes right there, and then they plan on going up to 1024-bit, so they must be planning on some major cache architecture overhauls.

Name: Anonymous 2011-08-14 19:10

this is so fucking interesting my scrotum exploded.

must learn to do distributed computing or w/e the fuck it's called

Name: Anonymous 2011-08-14 19:32

>>6
Wait, they want to reach exaflop capabilities by 2018? Isn't that like around enough to simulate an entire human brain? Holy fuck, the singularity really is almost here!

Name: Anonymous 2011-08-14 19:40

>>32
It won't matter, the Jews will keep their AGI's under lock-and-key and while the rest of us toil away and continue to suffer and die from disease and old age while stuck at the bottom of gravity well known as Earth, the Jews will be using their AGI's to unlock the secrets of the Universe, cure cancer and aging for themselves, expand out into the cosmos to live like gods, like the so called ``chosen people'' they brainwash their children into believing.

Name: Anonymous 2011-08-14 19:54

>>32
Yeah, and zettaflop by 2022. Which means simulating thousands of human-level minds, or something that far exceeds a single human intelligence. It's all over after that... by 2030, I bet things are going to be vastly different, unless old politicians and business men lobby against it and whip up the retard population into a frenzy to put a halt to it all. Hopefully by then, things will be moving so fast that the political system won't be able to keep pace.

>>33
If all I thought about were Jews day in and day out, I'd be pessimistic too.

Name: Anonymous 2011-08-14 20:03

>>33
Eliezer might, but I somehow doubt he'll have an AGI before OpenCog (which is opensource) succeeds.

Name: >>35 2011-08-14 20:04

Actually I take that back, he'd make it public and attempt to prevent competing research as to avoid unFriendly AGIs.

Name: Anonymous 2011-08-14 20:18

>>35,36
I doubt OpenCog or other such projects will be the first to the gate. The first ones will probably be actual simulated human brain projects, right down to the electrical and protein/chemical messaging pathways, like the IBM/EPFL Blue Brain project. You don't need to figure out all of the higher level abstractions, you just need to simulate the low-level mechanics. And once you get to that point, you have a much better platform on which to reverse engineer and identify abstractions to improve and extend the system into something that is self-improving. Also, I'd hate to be the intelligent entity that the Blue Brain researchers are essentially performing virtual brain surgery on. I wonder how scientific ethical standards will view all of this.

In fact, Eliezer had better start trying to get hired by a place like IBM or he will find himself and his organization becoming overshadowed.

http://www.youtube.com/watch?v=LS3wMC2BpxU

Name: Anonymous 2011-08-14 20:24

>>37
Actually, here's a better video. The one I was looking for got pulled by the copyright Jews on youtube.

http://vimeo.com/8977365

Name: Anonymous 2011-08-14 20:50

>>32,34 Who is this "they" who wants to reach these computation capabilities?

Name: Anonymous 2011-08-14 20:58

>>39
The computing industry at large, the scientific community at large, governments and government agencies, large corporations that rely on simulations modeling.

In particular, I was originally referring to Intel and perhaps IBM, AMD, nVidia, etc. because they're the ones who make it possible with their hardware.

Name: Anonymous 2011-08-14 21:00

>>39 disregard that i suck cocks

Name: Anonymous 2011-08-14 21:53

>5GHz single core CPU capable of say 60 gigaflops (a single core in a Sandy Bridge is around 16 gigaflops at 2GHz for comparison), and just forget about the 40+ teraflops we could have had with a 256-core in-order x86-64 with 1024-bit wide SIMD compute unit
40TFlops with 256 cores is 156.25GFlops per core. Something doesn't add up (or multiply up) here.

Name: n3n7i 2011-08-14 22:00

>>37

I imagine a simulated brain would be freaking out quite a lot... Where are my legs?! ... Oh shit Where's my heart!?! ... etc (a brain probably has as many inputs as outputs?)

a standard desktop(?) already has enough to simulate a bee's brain, supposedly... though there seem to be issues with getting it to work..

Name: Anonymous 2011-08-14 22:05

>>37
SIM (Substrate Independent Minds) or just "mind uploads" are expected to come in some 15-40 years - currently the technology is good enough for scanning, but preservation technology is still untested and scanning of something like a human brain would take years and a lot of money (not streamlined), and current hardware is too slow (but stuff like HP/DARPA SyNAPSE are likely to increase the speed by many orders of magnitude and make it a practical hardware platform for running biological neural nets).

OpenCog has a roadmap that should deliver results in 10-20 years and their progress has been good and visible (although still a long way to go).

The advantage to OpenCog is basically their architecture and computational cost - they are much better for self-improvement and knowledge-sharing (the damn thing has "telepathy" built-in, you can just merge AGIs' knowledge - imagine reading a book then being able to transfer the knowledge you gained to anyone) and much less costly computationally, however obviously the challenge is much more difficult than just running a human brain (leaving aside the hard challenge of properly converting scanned brain data to abstracted neural nets and also that of getting the chemical content properly marked and identified (just getting synapses+neuron bodies won't give you a complete picture, might not even be functional)).

Obviously human brains are fairly opaque and even if you understand the inner processes, they're still neural networks which make self-improvement fairly hard compared to more high-level approaches.

I think both will be reached, but I also think it's uncertain which will be reached first.

Name: Anonymous 2011-08-14 22:22

>>44

only an over optimistic nut would expect "mind uploads" to be available in 15 years.

OpenCog has been working on AI for at least 14 years with not much to show and according to their roadmap they'll have human level intelligence in 8 years. LOL

Name: Anonymous 2011-08-14 22:55

>>42
I was taking into account the 32-way SIMD instructions that Intel is heading towards with their many-core processors. Current Knights Corner has 6 teraflops with around 50 active cores and so going up by a factor of 4x cores and 2x as wide SIMD and you get 48 teraflops. I was being conservative in my estimate.

Name: Anonymous 2011-08-14 23:34

>>46
Also, you might be wondering: 6 teraflops / 50 cores = 120 gflops, which is still quite a bit more than a Core i7 core. Knight's Corner has AVX2, or 512-bit wide SIMD, whereas Sandy Bridge only has 256-bit wide SIMD. And then, Sandy Bridge is optimized for doing things like branch prediction, speculative loading, copying data around and working with strings, the kind of stuff that's good for running enterprise applications or web servers or web browsers. All of that extra circuitry doesn't contribute much to the overall flop throughput. Knight's Corner is optimized for crunching numbers, which is more useful for things like graphics or simulations.

Name: Anonymous 2011-08-14 23:40

>>45
Technically the technology to do it (very badly) is already available today. 15 years is the minimum amount of time that I'm giving it, it's not the most likely one. A working case may exist in 15 years, but it might not be what you expect it to be (it might not be one single "mind", it could be a messy mish-mash). 40 years would be my guess for a properly streamlined solution.

Name: Anonymous 2011-08-15 2:59

>>48
Citation needed.

Name: Anonymous 2011-08-15 3:12

>>49
http://www.carboncopies.org/

I could look up the individual citations as I knew of them before I found that site, but whatever, it's less work for me to link that.

Name: Anonymous 2011-08-15 7:54

>>3
The rumors are that Sony is considering a variant of IBM's Power 7 for the Playstation 4

So is the Wii-U, so that's hardly a ringing endorsement.

Name: Anonymous 2011-08-15 13:05

>>51
No, the Wii-U is getting a cheap custom 3-core PowerPC processor, similar to the XBox 360. The Wii-U is nothing but cheap components. Its GPU is essentially a custom variant of an AMD HD4550, and it's getting 1GB of DDR2 800MHz RAM. It is an underwhelming piece of hardware.

Name: Anonymous 2011-08-15 13:36

>>52
Well, consoles always were about OMG OPTIMIZED!! drivers, software, and marketing, not hardware.

Name: Anonymous 2011-08-15 13:51

NOT WORTH MY MONEY

Name: Anonymous 2011-08-15 17:48

>>52
>1gb of ram
more than the GPU and main RAM combined of any other current system

>amd 4550
[citation needed]
http://www.joystiq.com/2011/06/14/wii-u-graphics-chip-outed-as-last-gen-radeon-which-is-still-pre/
>chip similar to the r770
>the one the 4870 X2 was based on

Name: Anonymous 2011-08-15 18:28

>>55
Confirmed for being a stinking Nintendo furfag. Back to /vp/ please.

Name: Anonymous 2011-08-16 1:14

>>56
Back to the imageboards, please.

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2013-07-29 0:23

>>24
You're going to love what Intel has got coming.

(No, I'm not referring to Clownlake.)

Name: Anonymous 2013-07-29 0:25

>>58
Next gen of Knight's Corner?

Name: Anonymous 2013-07-29 0:25

Even Cudder's necrobumping!

>>58
Don't keep us in suspense. We've waited two years to hear about this, now put up or shut up.

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2013-07-29 0:35

>>59
Exact opposite... I quoted >>24 for a reason.

Name: Anonymous 2013-07-29 0:38

>>61
Intel has finally realized what a monstrosity their architecture has become and has finally decided to switch to RISC?

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2013-07-29 1:16

>>62
Never going to happen. The future is CISC.

http://en.wikipedia.org/wiki/Tianhe-2
All Intel Inside®.

Name: Anonymous 2013-07-29 3:43

When is it finally going to be possible for a person or two to completely homebrew their own processor design and even fabricate it at home?

Name: Anonymous 2013-07-29 3:56

>>64
reported for subversive terrorist plans. enjoy your v&.

Name: Anonymous 2013-07-29 3:57

>>65
reported for not checking dubz.

Name: Anonymous 2013-07-29 4:12

>>63
ARM is going to destroy Intel and shit on Moore's grave.

Name: Anonymous 2013-07-29 6:24

>─────▄████▀█▄
 >───▄█████████████████▄
 >─▄█████.▼.▼.▼.▼.▼.▼▼▼▼
 >▄███████▄.▲.▲▲▲▲▲▲▲▲
 >███████████████████▀▀
 YOU HAVE BEEN CAUGHT BY THE GATOR OF DOOM! REPOST THIS 5 TIMES OR GET GATORED!!!

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2013-07-29 6:29

>>67
Keep dreaming.

Intel 486DX2         0.8  DMIPS/MHz/core
ARM Cortex-M3        1.25 DMIPS/MHz/core
Intel Pentium        1.88 DMIPS/MHz/core
ARM Cortex-A8        2.0  DMIPS/MHz/core
Intel Atom N270      2.4  DMIPS/MHz/core
ARM Cortex-A57       "up to" 4.76 DMIPS/MHz/core according to ARM, in practice closer to ~4
AMD Athlon XP 2500+  4.1  DMIPS/MHz/core
Intel Core i7 2600K  9.43 DMIPS/MHz/core

Don't even bother asking about FPU performance; ARM absolutely sucks in that area, more than an order of magnitude behind.

That's architecture advantage, not process advantage.

Name: Anonymous 2013-07-29 6:55

>>69
I'd be interested to see a similar table with per watt specs.

Name: Anonymous 2013-07-29 7:13

>>65
v&
stay on /b/, kid

Name: Anonymous 2013-07-29 7:13

Shalom, cudder-kike!
muh intel
muh windows

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2013-07-30 4:47

Name: Anonymous 2013-07-30 6:05

>>73
sponsored by Intel™

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2013-07-30 6:09

>>74
And your point is?

Name: Anonymous 2013-07-30 6:15

>>75
never believe sponsored "research". specially if it's sponsored by jews.

Name: Anonymous 2013-07-30 6:16

>>75
are you really female, Cudder?

Name: Cudder !MhMRSATORI!fR8duoqGZdD/iE5 2013-07-30 6:47

>>76
You can reproduce the results yourself, you idiotic antisemite.

Name: Anonymous 2013-07-30 6:51

>>78
you are a dumb cunt. plz die.

Name: Anonymous 2013-07-30 7:12

>>77
not female at all. females can't into computers.

Name: Anonymous 2013-07-30 7:36

>>78
Shalom, Hymie!

Name: Anonymous 2013-07-30 7:45

>>77
Cudder is an Israeli megalomaniac, affiliated with Intel. His nose is longer than your penis.

Name: Anonymous 2013-07-30 9:18

>>80
dumb normalfag beliefs
