Basic WebGL Backend #672

Merged
merged 48 commits into from
Jan 20, 2018

Contributor

@phisiart phisiart commented Nov 26, 2017

TLDR

Currently the following demo program runs and gets the correct result.

from __future__ import absolute_import, print_function

import tvm
import numpy as np

n = tvm.var("n")
A = tvm.placeholder((n,), name='A')
B = tvm.placeholder((n,), name='B')
C = tvm.compute(A.shape, lambda i: A[i] + B[i], name="C")

s = tvm.create_schedule(C.op)
s[C].opengl()

fadd_gl = tvm.build(s, [A, B, C], "opengl", name="myadd")
print("------opengl code------")
print(fadd_gl.imported_modules[0].get_source(fmt="gl"))

ctx = tvm.opengl(0)
n = 10
a = tvm.nd.array(np.random.uniform(size=(n)).astype(A.dtype), ctx)
b = tvm.nd.array(np.random.uniform(size=(n)).astype(B.dtype), ctx)
c = tvm.nd.array(np.zeros((n), dtype=C.dtype), ctx)
fadd_gl(a, b, c)

np.testing.assert_allclose(c.asnumpy(), a.asnumpy() + b.asnumpy())

The corresponding fragment shader is

#version 330 core
uniform sampler2D A;
uniform sampler2D B;
out float C;
void main() {
  ivec2 threadIdx = ivec2(gl_FragCoord.xy);
  C = (texelFetch(A, ivec2(threadIdx.x, 0), 0).r + texelFetch(B, ivec2(threadIdx.x, 0), 0).r);
}

Current Status

OpenGL Tensor Storage

  • We store tensors in OpenGL textures.
  • No matter what dimensions a tensor has, we always store it in a 2D texture with height=1.
  • The reason we are not using 1D textures is that texelFetch in GLSL only supports 2D textures.
  • We support uint{8,16,32}, int{8,16,32} and float32 textures.

OpenGL Schedule

  • We added an opengl schedule, which basically fuses all dimensions into one and binds that single dimension to threadIdx.x.
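To illustrate what fusing all dimensions into one means for indexing, here is a minimal pure-Python sketch. It is not TVM code: `fuse_index` and `split_index` are hypothetical helper names, and row-major order is an assumption. It maps a multi-dimensional index to the single flat index that one fragment "thread" would see, and back.

```python
from functools import reduce
from operator import mul


def fuse_index(index, shape):
    """Collapse a multi-dimensional index into one flat row-major index."""
    flat = 0
    for i, extent in zip(index, shape):
        flat = flat * extent + i
    return flat


def split_index(flat, shape):
    """Recover the multi-dimensional index from a flat row-major index."""
    index = []
    for extent in reversed(shape):
        index.append(flat % extent)
        flat //= extent
    return tuple(reversed(index))


shape = (2, 3, 4)
total = reduce(mul, shape, 1)  # number of "threads" after fusing
round_trip_ok = all(
    fuse_index(split_index(t, shape), shape) == t for t in range(total)
)
```

With this layout a tensor of any rank needs only one thread index, which is what makes binding everything to threadIdx.x possible.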

Codegen

  • We haven't changed lowering at all. When the IR says Store(buffer, index) and the buffer happens to be the output texture, we check that the index is threadIdx.x and emit code that outputs a pixel.
  • The codegen part has only been started. We will go through all the AST nodes as our next step.
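As a rough illustration of the store check described above, here is a toy emitter in Python. It is not the PR's codegen_opengl.cc: the tuple-based IR and the names `emit_expr` and `emit_store` are hypothetical, invented only to show how a store to the output texture could be turned into a pixel assignment once the index is confirmed to be threadIdx.x.

```python
def emit_expr(expr):
    """Emit GLSL for a tiny hypothetical IR made of nested tuples."""
    kind = expr[0]
    if kind == "load":  # ("load", texture_name, index_expr)
        _, tex, index = expr
        return "texelFetch(%s, ivec2(%s, 0), 0).r" % (tex, emit_expr(index))
    if kind == "add":  # ("add", lhs, rhs)
        return "(%s + %s)" % (emit_expr(expr[1]), emit_expr(expr[2]))
    if kind == "var":  # ("var", name)
        return expr[1]
    raise NotImplementedError(kind)


def emit_store(output, index, value):
    """Emit a store to the output texture.

    A fragment shader can only write the pixel it is running on, so the
    store index must be exactly threadIdx.x.
    """
    if index != ("var", "threadIdx.x"):
        raise ValueError("output index must be threadIdx.x")
    return "%s = %s;" % (output, emit_expr(value))


tid = ("var", "threadIdx.x")
line = emit_store("C", tid, ("add", ("load", "A", tid), ("load", "B", tid)))
print(line)
```

For the vector-add demo, the printed line has the same shape as the assignment to C in the fragment shader shown earlier.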

Task Items

  • Remove the dependency on glfw and glad. Emscripten supports glfw. Glad removed.
  • Investigate emscripten. Runtime runnable.
  • Support other types than float. Done.
  • Use full RGBA channels. Not critical now.
  • Don't let height always be 1. The reason is that OpenGL textures have stupid size limitations. For example, you can have a 2500x2500 2D texture but not a 6250000x1 2D texture. One possible way is to just let height be the maximum supported value. Not critical now.
  • Go through the AST to properly do codegen.
  • Don't fuse reduction. Gemm runnable.
  • Fix argument name generation. Done.
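For the "don't let height always be 1" item, one possible layout can be sketched in Python as follows. This is only an assumption about how it might be done, not what the PR implements: `texture_shape` and `texel_coord` are hypothetical names, and `max_side` stands in for whatever limit the driver reports (for example via GL_MAX_TEXTURE_SIZE).

```python
def texture_shape(num_elems, max_side):
    """Pick a (width, height) that holds num_elems texels within max_side."""
    if num_elems <= 0:
        raise ValueError("num_elems must be positive")
    width = min(num_elems, max_side)
    height = (num_elems + width - 1) // width  # integer ceiling division
    if height > max_side:
        raise ValueError("tensor does not fit in one texture")
    return width, height


def texel_coord(flat_index, width):
    """Map a flat element index to (x, y) inside the texture."""
    return flat_index % width, flat_index // width


# Small tensors still get height 1; large ones wrap onto extra rows.
small = texture_shape(10, 4096)
large = texture_shape(6250000, 4096)
```

The last row of the texture may be only partly used, so the runtime would also need to remember the real element count.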

.gitmodules Outdated
@@ -7,3 +7,7 @@
[submodule "dlpack"]
path = dlpack
url = https://github.com/dmlc/dlpack
[submodule "glad"]
Member

What is glad used for, is it possible to only use OpenGL standard and not rely on new libraries?

Contributor Author

We actually rely on 2 libraries.

  • glad: This is the library to help you use the correct version of OpenGL.
    For example, suppose your OS supports OpenGL 3.3, then the OS is able to give you any OpenGL version below 3.3 on your request. However, if you just #include <GL/gl.h> or whatever header your OS provides, the function prototypes are for OpenGL 1.3 or something, so you can't use OpenGL 3.3 even if your OS supports it. The OS provides a way of querying the function pointers of OpenGL 3.3 APIs. That's what glad does. It is possible that we don't rely on glad, but with a huge cost - we must write our own code to get all the function pointers.

  • glfw: This is the library to help you create an OpenGL context in a cross-platform way.
    You can't use any OpenGL API before creating a context. GLFW wraps the context creation code of different OS'es to provide a unified API. It is possible that we don't rely on glfw, but with a huge cost - we must write context initialization code for different platforms.

I can remove the dependencies on those libraries, but that can only happen after Dec 12 or something. We are focusing on producing a final-report-able version currently.

Member

Sure, I think these are fine as long as we fix this at the final merge.

Contributor

AFAIK GLFW itself wraps up all the functions. Maybe GLAD is not needed since GLFW is already there. If I am misremembering, please let me know.

// AddExtraTVMType which is not in DLPack here
kOpenGL = 11,
Member

put kOpenGL before kExtDev

Contributor Author

Moved.

CHECK(!is_scheduled()) << "Must be a fresh schedule";
StageNode *self = operator->();

auto all_iter_vars = self->all_iter_vars; // curr version of all_iter_vars
Member

be careful that this can include reduction variables, which should not be part of the fusing stage

Contributor Author

Will go back to this later. Added a TODO for now.
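One way to read the concern about reduction variables: only the data-parallel axes should be spread across output pixels, while a reduction axis stays as a loop inside each pixel's computation. The plain-Python sketch below simulates that split for a row sum. The function name `row_sum_per_pixel` is hypothetical and this is not TVM code.

```python
def row_sum_per_pixel(a, i):
    """Work done by the hypothetical 'thread' that owns output pixel i."""
    acc = 0.0
    for k in range(len(a[i])):  # reduction axis: a loop, not a thread index
        acc += a[i][k]
    return acc


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
# Only the data-parallel axis i is spread across output pixels.
b = [row_sum_per_pixel(a, i) for i in range(len(a))]
print(b)
```

If k were fused into the thread index as well, several pixels would each hold a partial sum and no single pixel would hold the final result.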

assert(false);
}

LOG_INFO.stream() << "GLFW says OpenGL version: "
Member

Can directly use LOG(INFO)

@@ -0,0 +1,138 @@
/*!
* Copyright (c) 2017 by Contributors
* \file codegen_opengl.cc
Member

Add a specific comment that we are targeting the subset of OpenGL that works for WebGL2 (no compute shaders).

Contributor Author

Added.

@@ -114,6 +114,9 @@ def __init__(self,
elif target_name in ("metal",):
self.keys += ("gpu",)
self.max_num_threads = 256
elif target_name in ("opengl"):
Member

Use opengl as the key, as the opengl schedule needs to be different from gpu schedules.

@@ -114,6 +114,9 @@ def __init__(self,
elif target_name in ("metal",):
self.keys += ("gpu",)
self.max_num_threads = 256
elif target_name in ("opengl"):
Contributor

Shouldn't it be ("opengl",)? It ought to be searching for target name in a tuple.

Contributor Author

Fixed.
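The difference raised in this thread is easy to reproduce in plain Python. The snippet below is a small standalone demonstration; the value of `target_name` is chosen only for illustration.

```python
target_name = "gl"

# ("opengl") is just the string "opengl", so `in` does a substring search.
matches_string = target_name in ("opengl")

# ("opengl",) is a one-element tuple, so `in` compares whole elements.
matches_tuple = target_name in ("opengl",)

print(matches_string, matches_tuple)  # prints: True False
```

Without the trailing comma, any substring of "opengl" would be accepted as a target name.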

@phisiart
Contributor Author

phisiart commented Dec 3, 2017

I'm currently stuck on other final projects for courses, so I won't be able to work on this PR until mid December. Then I should have roughly a month after the end of this semester.

@phisiart
Contributor Author

Next step: Remove dependency on glfw and glad. I will focus on Ubuntu, since it should be the tricky one. Windows and Mac should have official APIs to create OpenGL contexts.

I will see what Ubuntu Desktop has by default and only depend on that.
On Ubuntu Server, OpenGL might not exist by default (because it doesn't have GUI). If that's the case I will see what minimum packages are needed to have OpenGL, and only depend on them.

@PENGUINLIONG
Contributor

Using the wgl* functions, you can create a context on Windows.

On Linux, if MESA is installed, OpenGL soft-rendering should be available as well as accelerated graphics, if there is a GPU. Compute Shader should be supported as it's a part of the OpenGL 4.3 spec.

FYI, without GLFW it will be your responsibility to query whether a GL function is available. Using GL_ARB_compute_shader, the minimum GL version requirement should be (Core) 4.3. On initialization, GLFW stores function pointers to GL APIs so that the query results can be reused.

@tqchen
Member

tqchen commented Dec 16, 2017

One of the main goals here is to use WebGL, which means:

  • the standard version should be less than OpenGL2;
  • we cannot use compute shaders.

Please try to emscripten the GL runtime to see if we can successfully build it

@PENGUINLIONG
Contributor

I see. But it would be nice if we can utilize the functionalities provided by OpenGL of higher versions - Compute Shader. We have Compute Shader support in OpenGL (Core) 4.3 and in OpenGL ES 3.1 but not yet in WebGL. In the current latest commit, you can see there is a dummy vertex shader that doesn't need to but has to be executed.

Or a strategy can be devised so that both shaders can do some work.

@tqchen
Member

tqchen commented Dec 16, 2017

For most other devices, OpenCL is also available, which is more desirable than OpenGL itself.

@phisiart
Contributor Author

After some experiments, I found that the following are the minimum requirements for find_package(OpenGL REQUIRED) in cmake to succeed.

Dockerfile:

FROM ubuntu

RUN apt-get update --fix-missing

RUN apt-get install -y --no-install-recommends \
  make cmake g++ libgl1-mesa-dev

CMake is looking for libGL.so, and on Ubuntu there are 3 packages that provide this (see here):

  • mesa: This is what comes by default when you install Ubuntu desktop;
  • nvidia: The NVIDIA driver;
  • fglrx: The AMD driver.

Therefore, we should depend on libgl1-mesa-dev.

@phisiart
Contributor Author

Okay, I just tested and found out that on a fresh installation of Ubuntu Desktop you also need the package libgl1-mesa-dev, for find_package(OpenGL REQUIRED) in cmake to succeed (because Ubuntu doesn't come with the 'dev' version of the package).

@PENGUINLIONG
Contributor

Since the goal for the OpenGL port has shifted to NN deployment on web apps, does it still need mesa?

Emscripten compiles LLVM intermediate code to JavaScript. It should require no linking to native libraries. Maybe we can just extern "C"-declare the APIs we need, so that the feature can be implemented with no dependency added.

See this for GLES2 APIs.

@phisiart
Contributor Author

@PENGUINLIONG I think we want to support both running natively and in the browser.

A funny thing is that emscripten has direct support for glfw. If we remove the dependency on glfw then we would be calling native glx functions. I'm not sure whether emscripten has good support for those. I will look into it and report my findings.

@PENGUINLIONG
Contributor

@phisiart It seems Emscripten simply maps the calls. See this. So it is possible that we simply declare them and they will work.

However, it does not seem possible to create a context without dependencies.

@tqchen
Member

tqchen commented Dec 23, 2017

I think the first milestone is to confirm emscripten with simple gl runtime works. You can likely utilize the RPC module here https://github.com/dmlc/tvm/tree/master/web for testing

@phisiart
Contributor Author

phisiart commented Jan 1, 2018

I tried RPC. I have successfully crashed in the middle of OpenGL initialization, after glfwInit(). It seems like incompatibility with glad. Since I'm going to remove the dependency on glad anyway, let me do it and then continue with RPC.

@phisiart
Contributor Author

phisiart commented Jan 1, 2018

Update: Now OpenGL initialization succeeds.

@phisiart
Contributor Author

phisiart commented Jan 2, 2018

Update: Now rendering succeeds.

@tqchen
Member

tqchen commented Jan 2, 2018

Here are some trackable milestone changes that I think could be useful. It would be very helpful to create a series of test functions under tests/web/webgl that rely on the web proxy RPC and can be run manually.

@phisiart phisiart force-pushed the opengl branch 3 times, most recently from 95ccc12 to f3d1724 Compare January 2, 2018 04:04
Makefile Outdated
@@ -30,10 +30,10 @@ CFLAGS = -std=c++11 -Wall -O2 $(INCLUDE_FLAGS) -fPIC
FRAMEWORKS =
OBJCFLAGS = -fno-objc-arc
EMCC_FLAGS= -std=c++11 -DDMLC_LOG_STACK_TRACE=0\
-Oz -s RESERVED_FUNCTION_POINTERS=2 -s MAIN_MODULE=1 -s NO_EXIT_RUNTIME=1\
-O0 -s RESERVED_FUNCTION_POINTERS=2 -s MAIN_MODULE=1 -s NO_EXIT_RUNTIME=1\
Member

is -O0 necessary? or can we use -Oz?

Contributor Author

Oh this is just for debugging purposes. Will be changed back when I cleanup.

Contributor Author

Changed back.

* \param alignment The alignment of the memory.
* \return The allocated device pointer
*/
virtual void* AllocDataSpace(TVMContext ctx, size_t size, size_t alignment) = 0;
virtual void* AllocDataSpace(TVMContext ctx, TVMType type, size_t nbytes,
Member

Let us rename type to type_hint and make it the last parameter. Add a comment saying that type_hint is only needed by a few backends such as GL.

Contributor Author

Changed.

@@ -20,13 +20,14 @@ class CPUDeviceAPI final : public DeviceAPI {
*rv = 1;
}
}
void* AllocDataSpace(TVMContext ctx, size_t size, size_t alignment) final {
void* AllocDataSpace(TVMContext ctx, TVMType type, size_t nbytes,
Member

if we have multiple line breaks, it might be more natural to break each argument into one line

Contributor Author

Changed.

@@ -48,6 +48,7 @@ enum class RPCCode : int {
kModuleFree,
kModuleGetFunc,
kModuleGetSource,
kTestRemoteOpenGL,
Member

this can be safely removed when we upstream?

Contributor Author

Yes, this is just for my debugging purposes. Will remove when I cleanup.

Contributor Author

Deleted.

@@ -213,6 +213,11 @@ class Stage : public NodeRef {
* \return reference to self.
*/
Stage& double_buffer(); // NOLINT(*)
/*!
* \brief Schedule for GpenGL fragment shader.
Member

OpenGL

Contributor Author

Changed.

@tqchen
Member

tqchen commented Jan 6, 2018

for the case when we need 2D texture, we might be able to use the following workaround.

Always allocate memory as i = 2^k * y + x, where x and y are the indices along the two texture axes and k determines the packing length over the x dimension (say k = 10).

so we can easily use bit operation to get y and x from global index i, and reconstruct vice versa
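The bit operations mentioned in this workaround can be sketched in Python as follows. The value K = 10 follows the "say 10" in the comment. The names `to_xy` and `to_flat` are hypothetical, and the sketch assumes x always stays below 2^K.

```python
K = 10               # assumed packing exponent: 2**K texels per row
MASK = (1 << K) - 1  # low K bits select the x coordinate


def to_xy(i):
    """Split a global index i = 2**K * y + x into (x, y)."""
    return i & MASK, i >> K


def to_flat(x, y):
    """Rebuild the global index from (x, y); requires 0 <= x < 2**K."""
    return (y << K) | x


x, y = to_xy(2500)
print(x, y, to_flat(x, y))  # prints: 452 2 2500
```

Because the row length is a power of two, both directions need only a shift and a mask, with no integer division or modulo in the shader.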

The Opengl runtime will need this information for textures.
- fix opengl func param retrieval;
- can save opengl module locally;
- working on loading opengl module in browser.
Known issue: cannot retrieve integer texture data in webgl.
from __future__ import absolute_import, print_function

import tvm
import numpy as np

n = tvm.var("n")
m = tvm.var("m")
A = tvm.placeholder((n, m), name='A')
k = tvm.reduce_axis((0, m), "k")
B = tvm.compute((n,), lambda i: tvm.sum(A[i, k], axis=k), name="B")

s = tvm.create_schedule(B.op)
s[B].opengl()

fadd_gl = tvm.build(s, [A, B], "opengl", name="myadd")
print("------opengl code------")
print(fadd_gl.imported_modules[0].get_source(fmt="gl"))

ctx = tvm.opengl(0)
n = 10
m = 10
a = tvm.nd.array(np.random.uniform(size=(n, m)).astype(A.dtype), ctx)
b = tvm.nd.array(np.random.uniform(size=(n,)).astype(B.dtype), ctx)
fadd_gl(a, b)

np.testing.assert_allclose(b.asnumpy(), np.sum(a.asnumpy(), axis=1))
@phisiart
Contributor Author

Review comments are addressed.

@tqchen
Member

tqchen commented Jan 20, 2018

Thanks, this concludes the basic support of GL, and this PR is merged. Let us start new PRs to make further improvements

@tqchen tqchen changed the title [WIP] WebGL Backend Basic WebGL Backend Jan 20, 2018
@tqchen tqchen merged commit 7009496 into apache:master Jan 20, 2018
@tqchen tqchen mentioned this pull request Jan 20, 2018
5 tasks
tqchen pushed a commit to tqchen/tvm that referenced this pull request Jul 6, 2018
sergei-mironov pushed a commit to sergei-mironov/tvm that referenced this pull request Aug 8, 2018