Define a WebAssembly module wrapper for Webcil assemblies.
Contributes to #80807
### Why
In some settings, serving `application/octet-stream` data or files with unusual extensions will trigger firewalls or AV tools. But let's assume that if you're interested in deploying a .NET WebAssembly app, you're in an environment that can at least serve WebAssembly modules.
### How
Essentially we serve this WebAssembly module:
```wat
(module
  (data "\0f\00\00\00") ;; data segment 0: payload size
  (data "webcil Payload\cc") ;; data segment 1: webcil payload
  (memory (import "webcil" "memory") 1)
  (global (export "webcilVersion") i32 (i32.const 0))
  (func (export "getWebcilSize") (param $destPtr i32) (result)
    local.get $destPtr
    i32.const 0
    i32.const 4
    memory.init 0)
  (func (export "getWebcilPayload") (param $d i32) (param $n i32) (result)
    local.get $d
    i32.const 0
    local.get $n
    memory.init 1))
```
The module exports two WebAssembly functions `getWebcilSize` and `getWebcilPayload` that write some bytes (being the size or payload of the webcil assembly) to the linear memory at a given offset. The module also exports the constant `webcilVersion` to version the wrapper format.
So a runtime or tool that wants to consume the webcil module can do something like:
```js
// (...) stands for the raw bytes of the webcil-in-wasm module
const wasmModule = new WebAssembly.Module (...);
// the wrapper imports its linear memory as "webcil" "memory"
const wasmMemory = new WebAssembly.Memory ({initial: 1});
const wasmInstance =
  new WebAssembly.Instance(wasmModule, {webcil: {memory: wasmMemory}});
const { getWebcilPayload, webcilVersion, getWebcilSize } = wasmInstance.exports;
console.log (`Version ${webcilVersion.value}`);
// write the 4-byte payload size at address 0, then read it back as a u32
getWebcilSize(0);
const size = new Uint32Array (wasmMemory.buffer)[0];
console.log (`Size ${size}`);
console.log (new Uint8Array(wasmMemory.buffer).subarray(0, 20));
// copy `size` bytes of the payload into memory starting at address 4
getWebcilPayload(4, size);
console.log (new Uint8Array(wasmMemory.buffer).subarray(0, 20));
```
### How (Part 2)
But actually, we will define the wrapper to consist of exactly 2 data segments in the WebAssembly data section: segment 0 is 4 bytes and encodes the webcil payload size; and segment 1 is of variable size and contains the webcil payload.
So to load a webcil-in-wasm module, the runtime gets the _raw bytes_ of the WebAssembly module (i.e. without instantiating it), parses them to find the data section, asserts that there are 2 segments, ensures they're both passive, and gets the data directly from segment 1.
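The parsing steps above can be sketched in JavaScript roughly as follows. This is a minimal illustration, not the runtime's actual reader: the names `readU32Leb` and `extractWebcilPayload` are hypothetical, and the final size comparison is an extra sanity check that the description above does not require. The section id (11 for the data section) and the passive-segment flag (`0x01`) come from the WebAssembly binary format.

```js
// Decode an unsigned LEB128 u32 from `bytes` starting at `pos`.
// Returns the value and the position just past the encoded integer.
function readU32Leb(bytes, pos) {
  let value = 0;
  let shift = 0;
  for (;;) {
    if (pos >= bytes.length) throw new Error("unexpected end of module");
    const b = bytes[pos++];
    value += (b & 0x7f) * 2 ** shift; // multiply instead of << to stay unsigned
    shift += 7;
    if ((b & 0x80) === 0) return { value, next: pos };
    if (shift >= 35) throw new Error("LEB128 value too long for a u32");
  }
}

const DATA_SECTION_ID = 11; // from the WebAssembly binary format

// Hypothetical helper: returns the Webcil payload as a view into `bytes`
// without instantiating the module.
function extractWebcilPayload(bytes) {
  const magic = [0x00, 0x61, 0x73, 0x6d]; // "\0asm"
  if (bytes.length < 8 || !magic.every((m, i) => bytes[i] === m)) {
    throw new Error("not a WebAssembly module");
  }
  let pos = 8; // skip the 4-byte magic and the 4-byte format version
  while (pos < bytes.length) {
    const id = bytes[pos++];
    const size = readU32Leb(bytes, pos);
    pos = size.next;
    if (id !== DATA_SECTION_ID) {
      pos += size.value; // skip every section other than the data section
      continue;
    }
    const count = readU32Leb(bytes, pos);
    if (count.value !== 2) throw new Error("expected exactly 2 data segments");
    pos = count.next;
    const segments = [];
    for (let i = 0; i < 2; i++) {
      // flag 0x01 marks a passive segment: just a length followed by bytes
      if (bytes[pos++] !== 0x01) throw new Error(`data segment ${i} is not passive`);
      const len = readU32Leb(bytes, pos);
      segments.push(bytes.subarray(len.next, len.next + len.value));
      pos = len.next + len.value;
    }
    const [sizeSegment, payload] = segments;
    if (sizeSegment.length < 4) throw new Error("data segment 0 is too short");
    // Assumption: cross-check the little-endian size in segment 0 against segment 1.
    const declared = new DataView(
      sizeSegment.buffer, sizeSegment.byteOffset, sizeSegment.byteLength
    ).getUint32(0, true);
    if (declared !== payload.length) throw new Error("payload size mismatch");
    return payload;
  }
  throw new Error("no data section found");
}
```

Given a `Uint8Array` holding the module bytes, `extractWebcilPayload(bytes)` returns a view of segment 1 and throws if the module does not have the expected shape.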
---
* Add option to emit webcil inside a wasm module wrapper
* [mono][loader] implement a webcil-in-wasm reader
* reword WebcilWasmWrapper summary comment
* update the Webcil spec to include the WebAssembly wrapper module
* Adjust RVA map offsets to account for wasm prefix
MonoImage:raw_data is used as a base when applying the RVA map to map virtual addresses to physical offsets in the assembly. With webcil-in-wasm there's an extra wasm prefix before the webcil payload starts, so we need to account for this extra data when creating the mapping.
An alternative is to compute the correct offsets as part of generating the webcil, but that would entangle the wasm module and the webcil payload. The current (somewhat hacky) approach keeps them logically separate.
* Add a note about the rva mapping to the spec
* Serve webcil-in-wasm as .wasm
* remove old .webcil support from Sdk Pack Tasks
* Implement support for webcil in wasm in the managed WebcilReader
* align webcil payload to a 4-byte boundary within the wasm module
Add padding to data segment 0 to ensure that data segment 1's payload (i.e. the webcil content itself) is 4-byte aligned.
* assert that webcil raw data is 4-byte aligned
* add 4-byte alignment requirement to the webcil spec
* Don't modify MonoImageStorage:raw_data
instead just keep track of the webcil offset in the MonoImageStorage.
This introduces a situation where MonoImage:raw_data is different from MonoImageStorage:raw_data. The one to use for accessing IL and metadata is MonoImage:raw_data.
The storage pointer is just used by the image loading machinery
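As a rough picture of the offset bookkeeping described in the RVA-map and alignment notes above, the following JavaScript sketch maps an RVA to an offset within the whole wasm module. It is an illustration under stated assumptions, not the Mono loader's code: `rvaToModuleOffset` and `webcilOffset` are hypothetical names, and of the section header fields only `st_raw_data_ptr` is named in the spec text; the other field names are assumed here, modeled on PE section headers.

```js
// Hypothetical section header shape (only st_raw_data_ptr is named in the spec text):
//   { st_virtual_address, st_virtual_size, st_raw_data_ptr, st_raw_data_size }

// Map an RVA to a byte offset from the start of the wasm module.
// `webcilOffset` is the (assumed) offset of the Webcil payload inside the module,
// i.e. the length of the wasm prefix that precedes data segment 1's content.
function rvaToModuleOffset(rva, sections, webcilOffset) {
  if (webcilOffset % 4 !== 0) {
    throw new Error("webcil payload must be 4-byte aligned within the module");
  }
  for (const s of sections) {
    const start = s.st_virtual_address;
    if (rva >= start && rva < start + s.st_virtual_size) {
      // st_raw_data_ptr is relative to the payload, so add the wasm prefix length
      return webcilOffset + s.st_raw_data_ptr + (rva - start);
    }
  }
  return null; // the RVA is not covered by any section
}
```

Dropping the `webcilOffset` term gives the offset relative to the payload itself, which matches the case where the base pointer already points at the start of the payload rather than at the start of the module.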
---------
Co-authored-by: Larry Ewing <[email protected]>
docs/design/mono/webcil.md: 74 additions & 10 deletions
@@ -2,21 +2,83 @@
 
 ## Version
 
-This is version 0.0 of the Webcil format.
+This is version 0.0 of the Webcil payload format.
+This is version 0 of the WebAssembly module Webcil wrapper.
 
 ## Motivation
 
 When deploying the .NET runtime to the browser using WebAssembly, we have received some reports from
 customers that certain users are unable to use their apps because firewalls and anti-virus software
 may prevent browsers from downloading or caching assemblies with a .DLL extension and PE contents.
 
-This document defines a new container format for ECMA-335 assemblies
-that uses the `.webcil` extension and uses a new WebCIL container
-format.
+This document defines a new container format for ECMA-335 assemblies that uses the `.wasm` extension
+and uses a new WebCIL metadata payload format wrapped in a WebAssembly module.
 
 
 ## Specification
 
+### Webcil WebAssembly module
+
+Webcil consists of a standard [binary WebAssembly version 0 module](https://webassembly.github.io/spec/core/binary/index.html) containing the following WAT module:
+
+```wat
+(module
+  (data "\0f\00\00\00") ;; data segment 0: payload size as a 4 byte LE uint32
+  (data "webcil Payload\cc") ;; data segment 1: webcil payload
+That is, the module imports linear memory 0 and exports:
+* a global `i32` `webcilVersion` encoding the version of the WebAssembly wrapper (currently 0),
+* a function `getWebcilSize : i32 -> ()` that writes the size of the Webcil payload to the specified
+  address in linear memory as a `u32` (that is: 4 LE bytes).
+* a function `getWebcilPayload : i32 i32 -> ()` that writes `$n` bytes of the content of the Webcil
+  payload at the specified address `$d` in linear memory.
+
+The Webcil payload size and payload content are stored in the data section of the WebAssembly module
+as passive data segments 0 and 1, respectively. The module must not contain additional data
+segments. The module must store the payload size in data segment 0, and the payload content in data
+segment 1.
+
+The payload content in data segment 1 must be aligned on a 4-byte boundary within the WebAssembly
+module. Additional trailing padding may be added to the data segment 0 content to correctly align
+data segment 1's content.
+
+(**Rationale**: With this wrapper it is possible to split the WebAssembly module into a *prefix*
+consisting of everything before the data section, the data section, and a *suffix* that consists of
+everything after the data section. The prefix and suffix do not depend on the contents of the
+Webcil payload and a tool that generates Webcil files could simply emit the prefix and suffix from
+constant data. The data section is the only variable content between different Webcil-encoded .NET
+assemblies.)
+
+(**Rationale**: Encoding the payload in the data section in passive data segments with known indices
+allows a runtime that does not include a WebAssembly host or a runtime that does not wish to
+instantiate the WebAssembly module to extract the payload by traversing the WebAssembly module and
+locating the Webcil payload in the data section at segment 1.)
+
+(**Rationale**: The alignment requirement is due to ECMA-335 metadata requiring certain portions of
+the physical layout to be 4-byte aligned, for example ECMA-335 Section II.25.4 and II.25.4.5.
+Aligning the Webcil content within the wasm module allows tools that directly examine the wasm
+module without instantiating it to properly parse the ECMA-335 metadata in the Webcil payload.)
+
+(**Note**: the wrapper may be versioned independently of the payload.)
+
+
+### Webcil payload
+
+The webcil payload contains the ECMA-335 metadata, IL and resources comprising a .NET assembly.
+
 As our starting point we take section II.25.1 "Structure of the
 runtime file format" from ECMA-335 6th Edition.
 
@@ -40,12 +102,12 @@ A Webcil file follows a similar structure
 | CLI Data |
 ||
 
-## Webcil Headers
+### Webcil Headers
 
 The Webcil headers consist of a Webcil header followed by a sequence of section headers.
 (All multi-byte integers are in little endian format).
 
-### Webcil Header
+#### Webcil Header
 
 ```c
 struct WebcilHeader {
@@ -75,11 +137,11 @@ The next pairs of integers are a subset of the PE Header data directory specifyi
 of the CLI header, as well as the directory entry for the PE debug directory.
 
 
-### Section header table
+#### Section header table
 
 Immediately following the Webcil header is a sequence (whose length is given by `coff_sections`
 above) of section headers giving their virtual address and virtual size, as well as the offset in
-the Webcil file and the size in the file. This is a subset of the PE section header that includes
+the Webcil payload and the size in the file. This is a subset of the PE section header that includes
 enough information to correctly interpret the RVAs from the webcil header and from the .NET
 metadata. Other information (such as the section names) are not included.
 
@@ -92,11 +154,13 @@ struct SectionHeader {
 };
 ```
 
-### Sections
+(**Note**: the `st_raw_data_ptr` member is an offset from the beginning of the Webcil payload, not from the beginning of the WebAssembly wrapper module.)
+
+#### Sections
 
 Immediately following the section table are the sections. These are copied verbatim from the PE file.
 
-## Rationale
+### Rationale
 
 The intention is to include only the information necessary for the runtime to locate the metadata
 root, and to resolve the RVA references in the metadata (for locating data declarations and method IL).