64 core cpu from intel


[2013/11/21] [Post by Geri] [64 core cpu from intel]

Intel is releasing its Xeon Phi architecture as a standalone CPU. (The blog is missing from the site for me, so I have to post this here.) The current Xeon Phi is available as an accelerator; the new version, arriving in 2015, will be a standalone monolithic CPU (yes, the operating system sees all the cores).
[IMG #1 Image]
Rumors say it will be produced with 72 cores instead of 64. They also need to increase IPC and maybe add some superscalar capabilities to keep single-thread performance a bit higher than in the current design. There will be both DDR4 and motherboard-soldered GDDR5 versions of the new architecture to feed the cores. The cores support enhanced HT: every core can execute 4 threads simultaneously. Also, it seems I was right about GPGPUs: the manycore concept is superior.
[2013/11/21] [Post by graphicsMan] [64 core cpu from intel]

It's certainly superior in that it runs x86_64 code natively out of the box with no hoops to jump through. That alone will not get you great performance, though. You need to use the 16-wide SIMD lanes to really get the most out of this chip. This is not free, nor even easy as a programming exercise. It remains to be seen if this has more bang-for-the-buck than a GPU programmed with CUDA -- both will be a pain if you want maximum performance. I find the CUDA paradigm possibly a little less painful than the Intel paradigm.

The other obvious win for the Phi as CPU (if it is true) would be that it may have more memory available to it than these discrete cards.
[2013/11/21] [Post by Dade] [64 core cpu from intel]

>> Geri wrote:
Rumors say it will be produced with 72 cores instead of 64. They also need to increase IPC and maybe add some superscalar capabilities to keep single-thread performance a bit higher than in the current design. There will be both DDR4 and motherboard-soldered GDDR5 versions of the new architecture to feed the cores. The cores support enhanced HT: every core can execute 4 threads simultaneously. Also, it seems I was right about GPGPUs: the manycore concept is superior.
Another important new feature is supposed to be the on-package eDRAM (i.e. high speed RAM). It is a feature that GPUs should gain soon too, and it has the potential to make a huge difference.
[2013/11/21] [Post by hobold] [64 core cpu from intel]

Intel will probably publish quite a few case studies and benchmarks with hand tuned AVX-512 code. But if there should ever be a healthy third party software market, then we'll probably see a familiar phenomenon: features usually trump performance. Only some niche applications, which need the speed to make sense, will receive serious tuning efforts.

I could imagine that OpenCL, despite some unwieldy aspects, could end up as the default programming platform for Knights Landing. That way, you can start development early, and deploy on existing hardware, too. And you will automatically get some benefit when future Knights processors move on to AVX-1024 (which is already casting its long shadow in AVX-512 opcodes). Long term portability across generations might outweigh perfect utilization of the current processor model.
[2013/11/22] [Post by beason] [64 core cpu from intel]

Will this new CPU let regular threaded CPU code run on all the cores? If so that should be a lot faster than using a GPU since CPU code cannot run on a GPU.
[2013/11/22] [Post by hobold] [64 core cpu from intel]

>> beason wrote:Will this new CPU let regular threaded CPU code run on all the cores? If so that should be a lot faster than using a GPU since CPU code cannot run on a GPU.
Some of the Knights Landing models will be able to boot standard PC operating systems, and will be able to serve as the host CPU of the computer. Contrast this with the current Knights Corner, which is more like a "cluster in a PCIe slot", running its own local instance of Linux while the main CPU is running the host OS.

That means there could be machines which have Knights Landing as their sole processor, and the host OS views it as a large SMP (symmetrical multiprocessing) system. The number of virtual processors would further be increased by the mandatory SMT (simultaneous multithreading), so some OSes (and possibly the applications) might require modifications to fully utilize hundreds or thousands of threads.

Of course, there is still the classical challenge of coming up with multithreaded algorithms that scale so high, while producing actual results rather than just excess heat. [SMILEY :-)]
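
To put a rough number on that scaling challenge, here is a minimal sketch using Amdahl's law. The 72 cores and 4 threads per core are only the rumoured figures from this thread, the function name `amdahl_speedup` is made up for illustration, and treating every SMT thread as a full worker is an optimistic assumption.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup under Amdahl's law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Rumoured figures from this thread (assumption): 72 cores, 4 threads each.
hardware_threads = 72 * 4

for p in (0.90, 0.99, 0.999):
    speedup = amdahl_speedup(p, hardware_threads)
    print(f"parallel fraction {p}: ideal speedup {speedup:.1f}x")
```

Even a small serial fraction caps the speedup far below the thread count, which is why the algorithm matters more than the raw number of hardware threads.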
[2013/11/22] [Post by Dade] [64 core cpu from intel]

>> hobold wrote:Of course, there is still the classical challenge of coming up with multithreaded algorithms that scale so high, while producing actual results rather than just excess heat.
Xeon Phi cores are a _lot_ slower than today's average Xeon cores. I assume that the performance will be awful if you don't use (directly or via OpenCL) the new AVX instructions.

I wonder if the slow cores can be a problem when doing some traditional HPC tasks such as high-intensity network or disk I/O, etc.
[2013/11/22] [Post by hobold] [64 core cpu from intel]

>> Dade wrote:Xeon Phi cores are a _lot_ slower than today's average Xeon cores. I assume that the performance will be awful if you don't use (directly or via OpenCL) the new AVX instructions.
Indeed they are slow. Microprocessor experts outside Intel are currently puzzled how the future chip can hit the announced 3 TFLOPS double precision. This is more than double the throughput of current Larrabee chips, so either clock frequency would have to rise from a little over 1 GHz to 2.3 GHz, or vector width would have to double from 512 bits to 1024 bits (with only a modest frequency boost).

Doubled clocks are deemed unlikely, because that would destroy FLOPs/Watt efficiency. Doubled vector width is deemed unlikely as well, because Intel info specifically talks about AVX-512. Another alternative is two independent vector FPUs per core, but that is deemed unlikely because a dual issue pipeline would be woefully inadequate to saturate both (there are memory accesses, address computations, loop overhead, etc. to compete with FPU instructions). The final alternative would be significantly more powerful cores (3-wide out of order execution, etc.), but there simply is no processor core anywhere on Intel's roadmap that would meet the required balance of power consumption, performance, and silicon area.
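
The trade-offs above can be sketched with the usual peak-throughput arithmetic: cores x clock x vector lanes x FPUs per core, with a fused multiply-add counted as two operations. This is only a back-of-the-envelope sketch: the function names are made up, the 72-core count is the rumour from this thread, and the FMA assumption is mine, so the clocks it yields are illustrative rather than a check on the figures quoted above.

```python
def peak_dp_flops(cores, clock_hz, vector_bits=512, fpus_per_core=1, fma=True):
    """Theoretical peak double-precision FLOPS for a simple manycore model."""
    lanes = vector_bits // 64           # 64-bit doubles per vector register
    flops_per_lane = 2 if fma else 1    # assumption: FMA counts as 2 FLOPs
    return cores * clock_hz * lanes * flops_per_lane * fpus_per_core


def required_clock_hz(target_flops, cores, vector_bits=512, fpus_per_core=1, fma=True):
    """Clock frequency needed to reach target_flops under the same model."""
    flops_per_cycle = peak_dp_flops(cores, 1.0, vector_bits, fpus_per_core, fma)
    return target_flops / flops_per_cycle


target = 3e12   # the announced 3 TFLOPS double precision
cores = 72      # rumoured core count (assumption)

options = {
    "512-bit vectors, one FPU": dict(vector_bits=512, fpus_per_core=1),
    "1024-bit vectors, one FPU": dict(vector_bits=1024, fpus_per_core=1),
    "512-bit vectors, two FPUs": dict(vector_bits=512, fpus_per_core=2),
}
for name, kwargs in options.items():
    ghz = required_clock_hz(target, cores, **kwargs) / 1e9
    print(f"{name}: about {ghz:.2f} GHz needed")
```

In this model, doubling the vector width and adding a second vector FPU are interchangeable on paper; the difference is whether the rest of the core can keep them fed.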

The only thing that is known for certain today is that there is more to Knights Landing than meets the eye. Interesting times, as usual. [SMILEY :-)]
[2013/11/22] [Post by Dade] [64 core cpu from intel]

>> hobold wrote:Microprocessor experts outside Intel are currently puzzled how the future chip can hit the announced 3 TFLOPS double precision.
Are we interested in double precision performance? I mean, is there ever going to be a transition from FP32 to FP64 in the rendering field?

I have the feeling it is not going to happen for a long time. FP64 is only for lazy people working in the HPC field, a true man uses FP32 [SMILEY :lol:] The sad truth is that working with FP32 is painful but necessary in our field.
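
One small illustration of that pain, as a sketch only: far from the world origin, a small offset added to a coordinate can vanish entirely in FP32 while surviving in FP64. The coordinate, the offset, and the helper names `to_f32` and `add_f32` are made up for the example; the rounding is emulated with the standard `struct` module.

```python
import struct


def to_f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def add_f32(a, b):
    """Add two values, rounding the inputs and the result to FP32."""
    return to_f32(to_f32(a) + to_f32(b))


origin = 1_000_000.0   # hypothetical coordinate far from the world origin
epsilon = 0.01         # hypothetical self-intersection offset

moved_f32 = add_f32(origin, epsilon)
moved_f64 = origin + epsilon

print("FP32 offset lost:", moved_f32 == origin)
print("FP64 offset lost:", moved_f64 == origin)
```

Near 1,000,000 the spacing between adjacent FP32 values is 0.0625, so anything smaller than half of that is rounded away.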
[2013/11/23] [Post by hobold] [64 core cpu from intel]

>> Dade wrote:FP64 is only for lazy people working in the HPC field, a true man uses FP32
Sorry for derailing further, but that reminds me of a quote attributed to Alan Turing. My memory is hazy, but it went something like this: "Floating point arithmetic is for people who cannot keep track of the position of the point". [SMILEY :-)]
