robertleeplummerjr commented on Jul 14, 2017
I would love to! How would one go about doing this?
robertleeplummerjr commented on Jul 14, 2017
Currently what we'd like to do is implement an accelerator for CPU mode. Right now CPU mode carries overhead in the form of loops and a callback for each item in the arrays. Our current goal is to unroll these loops where possible and put the kernel function body there directly, rather than calling a callback. At the very least this would avoid the looping and callback/closure cost, but there is a limit to the size of the generated functions, and at this scale it can escalate quickly. A "small" 512*512 matrix, for example, has 262,144 kernel calls.
How does WebAssembly deal with this type of problem? Is this the right question to be asking?
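To make the idea above concrete, here is a minimal sketch of the two approaches: one callback per element versus a generated function with the kernel body pasted into a partially unrolled loop. The names `runWithCallback` and `compileUnrolled`, the flat index `i`, and the string form of the kernel are all assumptions made for illustration; this is not GPU.JS's actual CPU mode or API.

```javascript
// Baseline: one callback invocation per output element.
// For a 512*512 input this is 262,144 calls.
function runWithCallback(input, kernel) {
  const out = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) {
    out[i] = kernel(input, i);
  }
  return out;
}

// Hypothetical alternative: build one function whose loop body contains
// `unroll` copies of the kernel expression, so no callback is made per element.
// `bodyExpr` is a JavaScript expression string written in terms of `input` and `i`.
function compileUnrolled(bodyExpr, unroll) {
  let steps = '';
  for (let k = 0; k < unroll; k++) {
    // Each copy gets its own block so `i` can be redeclared.
    steps += `{ const i = base + ${k}; out[i] = ${bodyExpr}; }\n`;
  }
  return new Function('input', `
    const n = input.length;
    const out = new Float32Array(n);
    const main = n - (n % ${unroll});
    let base = 0;
    // Main loop: processes ${unroll} elements per iteration.
    for (; base < main; base += ${unroll}) {
      ${steps}
    }
    // Remainder loop: handles the last n % ${unroll} elements.
    for (let i = base; i < n; i++) {
      out[i] = ${bodyExpr};
    }
    return out;
  `);
}

// Usage: both versions compute the same values.
const sample = new Float32Array([1, 2, 3, 4, 5, 6, 7]);
const viaCallback = runWithCallback(sample, (arr, i) => arr[i] * 2);
const viaUnrolled = compileUnrolled('input[i] * 2', 4)(sample);
```

The unroll factor is the knob for the size limit mentioned above: fully unrolling 262,144 iterations would produce an enormous function body, so a small factor plus a remainder loop keeps the generated source bounded. Whether this actually beats the callback version depends on the JavaScript engine, which may already inline small callbacks, so it would need measuring.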
PicoCreator commented on Jul 14, 2017
@robertleeplummerjr, @tomByrer : Fuzz and I were discussing doing this after v1. The SIMD aspect, to be exact. Though we would probably run it as a separate mode (not CPU mode).
Mainly because it will make for a hilarious tagline: GPU.JS, now transpiling from CPU to CPU!
robertleeplummerjr commented on Jul 14, 2017
I, for one, would be in favor of the "CPU to CPU" tagline. At first it'd be funny, then they'd see the numbers. Their reaction: "Hahaha, what a funny joke {clicks link}... oooOOOooo!"
(But I'll do whatever you leaders feel is important 😛 )
fuzzie360 commented on Jul 14, 2017
Will leave this here so you guys can salivate at the CPU performance gains of SIMD:
Also a working CPU SIMD demo here:
http://peterjensen.github.io/idf2014-simd/idf2014-simd.html
Not forgetting that we are technically already close to SIMD on the GPU at the moment:
Title changed from "perf test examples against WebAssembly" to "Support for SIMD WebAssembly"
ohenepee commented on Aug 5, 2018
Any speed comparisons against WebAssembly?