Support for SIMD WebAssembly #137

@tomByrer

Interesting project you have here!

Are there any speed comparisons against WebAssembly anywhere? E.g. rewrite this for gpu.js:
http://kripken.github.io/Massive/beta/

I used to hand-code SSE ASM for DSP back in the day, so I'm always looking to save a few cycles ;)

robertleeplummerjr (Member) commented on Jul 14, 2017

I would love to! How would one go about doing this?

robertleeplummerjr (Member) commented on Jul 14, 2017

Currently what we'd like to do is implement an accelerator for CPU mode. Right now a CPU kernel carries overhead in the form of loops and a callback for each item in the arrays. Our current goal is to unroll these loops where possible and put the kernel function body there instead of a callback. At the very least this would avoid the looping and callback/closure cost, but there is a limit to how large these functions can get, and at this scale it escalates quickly. A "small" 512*512 matrix, for example, has 262,144 kernel calls.
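
To make the trade-off concrete, here is a rough sketch in plain JavaScript. It is not gpu.js code: the element-wise add kernel and the names `runWithCallback`, `runInlined`, and `buildUnrolled` are all made up for illustration. It shows the three shapes under discussion: a callback per element, the kernel body pasted into the loop, and a fully unrolled function generated as source text.

```javascript
// Hypothetical kernel: element-wise add of two square matrices.
const kernel = (a, b, x, y) => a[y][x] + b[y][x];

// Variant 1: nested loops plus one callback invocation per output element.
function runWithCallback(a, b, size) {
  let calls = 0;
  const out = [];
  for (let y = 0; y < size; y++) {
    const row = new Float32Array(size);
    for (let x = 0; x < size; x++) {
      row[x] = kernel(a, b, x, y);
      calls++; // counts kernel invocations: size * size in total
    }
    out.push(row);
  }
  return { out, calls };
}

// Variant 2: the kernel body is pasted into the loop, so there is no callback.
function runInlined(a, b, size) {
  const out = [];
  for (let y = 0; y < size; y++) {
    const row = new Float32Array(size);
    for (let x = 0; x < size; x++) {
      row[x] = a[y][x] + b[y][x];
    }
    out.push(row);
  }
  return out;
}

// Variant 3: fully unrolled. One generated statement per output element,
// which is why the function size grows so quickly with the matrix size.
function buildUnrolled(size) {
  const lines = [];
  for (let y = 0; y < size; y++) {
    lines.push(`const r${y} = new Float32Array(${size});`);
    for (let x = 0; x < size; x++) {
      lines.push(`r${y}[${x}] = a[${y}][${x}] + b[${y}][${x}];`);
    }
  }
  const rows = Array.from({ length: size }, (_, y) => `r${y}`).join(', ');
  lines.push(`return [${rows}];`);
  const source = lines.join('\n');
  return { source, run: new Function('a', 'b', source) };
}
```

For a 512*512 input, variant 3 would need 262,144 generated assignment statements, so a practical unroller would presumably unroll only the inner loop or a fixed-size chunk of it.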

How does WebAssembly deal with this type of problem? Is this the right question to be asking?

PicoCreator (Contributor) commented on Jul 14, 2017

@robertleeplummerjr, @tomByrer: Fuzz and I were discussing doing this after v1, the SIMD aspect to be exact. Though we would probably run it as a separate mode (not CPU mode).

Mainly because it will make for a hilarious tagline: GPU.JS, now transpiling from CPU to CPU!

robertleeplummerjr (Member) commented on Jul 14, 2017

I, for one, would be in favor of the "CPU to CPU" tagline. It'd be funny at first, then they'd see the numbers. Their reaction: "Hahaha, what a funny joke {clicks link}... oooOOOooo!"

(But I'll do whatever you leaders feel is important 😛 )

fuzzie360 (Member) commented on Jul 14, 2017

Will leave this here so you guys can salivate at the CPU performance gains of SIMD:

image

Also a working CPU SIMD demo here:

http://peterjensen.github.io/idf2014-simd/idf2014-simd.html

Not to forget that we are technically already close to SIMD on the GPU at the moment:

  1. The beginning: 1 GPU thread, 1 output value
  2. Float textures: 1 GPU thread, 4 output values <- we are currently here
  3. Branch-less optimizer: squash if branches, e.g.:

         if (x > 0) {
             y += 5;
         }

         // becomes
         z = x > 0;
         y += 5 * z;

  4. SIMD optimizer:

         result.r = a[0] + b[0];
         result.g = a[1] + b[1];
         result.b = a[2] + b[2];
         result.a = a[3] + b[3];

         // becomes
         result = a + b;
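
As a quick sanity check of the two rewrites in steps 3 and 4, here is a small plain-JavaScript sketch. The function names are invented for this example and nothing here is gpu.js API. JavaScript has no 4-wide add operator, so a mapped `Float32Array` stands in for the single `result = a + b` that a shader language would offer.

```javascript
// Step 3, original form: a conditional update.
function addFiveBranching(x, y) {
  if (x > 0) {
    y += 5;
  }
  return y;
}

// Step 3, branch-less form: the comparison becomes a 0/1 multiplier.
// Number() makes the boolean-to-number conversion explicit.
function addFiveBranchless(x, y) {
  const z = Number(x > 0);
  return y + 5 * z;
}

// Step 4, original form: one scalar add per component.
function addPerComponent(a, b) {
  const result = new Float32Array(4);
  result[0] = a[0] + b[0];
  result[1] = a[1] + b[1];
  result[2] = a[2] + b[2];
  result[3] = a[3] + b[3];
  return result;
}

// Step 4, vector form: a single whole-vector expression. In plain JavaScript
// this is still executed element by element; it only mirrors the shape of
// the rewritten code.
function addVector(a, b) {
  return Float32Array.from(a, (value, i) => value + b[i]);
}
```

Both pairs return the same values for the same inputs, which is the property an optimizer pass like this would need to preserve.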

Title changed from "perf test examples against WebAssembly" to "Support for SIMD WebAssembly" on Jul 16, 2017

ohenepee commented on Aug 5, 2018

Any speed comparisons against WebAssembly?

Support for SIMD WebAssembly · Issue #137 · gpujs/gpu.js