Request for clarifications regarding hardware capabilities for shared mappings #7
IMO, some RTOS and bare-metal applications need memory protection, but they don't require full virtual memory support. Therefore, having an MMU is optional. In scenarios without an MMU, a few customers opt to extend the sandbox to cover the entire memory, or in other words, disable the boundary check. Another interesting case is another customer's product, where a Wasm application and its native libraries use virtual memory and DMA to access two different address spaces.
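To make the trade-off concrete, here is a minimal sketch (not taken from any particular runtime; wasm_instance_t, trap_out_of_bounds and the DISABLE_BOUNDS_CHECK macro are hypothetical names) of the per-access software bounds check that "disabling the boundary check" would remove when no MMU or MPU is available to enforce the sandbox for free:

```c
/* Hypothetical sketch: the software bounds check a runtime might emit for
 * each linear-memory access when hardware protection is unavailable.
 * Compiling with -DDISABLE_BOUNDS_CHECK corresponds to "disabling the
 * boundary check", i.e. trusting the module with the whole address space. */
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint8_t *memory_base;  /* start of the Wasm linear memory   */
    uint64_t memory_size;  /* current size of the linear memory */
} wasm_instance_t;

static void trap_out_of_bounds(void)
{
    abort();               /* placeholder for the runtime's trap handling */
}

static inline uint8_t *checked_addr(wasm_instance_t *inst,
                                    uint32_t offset, uint32_t bytes)
{
#ifndef DISABLE_BOUNDS_CHECK
    if ((uint64_t)offset + bytes > inst->memory_size)
        trap_out_of_bounds();
#endif
    return inst->memory_base + offset;
}
```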
Thanks for raising this, you're right @lukewagner, we could do more to clarify this. This is a good issue to also bounce off the E-SIG mailing list, and I'll do that. It's also useful to bring this up on Zulip, just to get additional eyeballs on it.

E-SIG Hardware Platforms and Memory Control Hardware

The E-SIG stated that the devices we should target would come with a minimum of 512 KB of RAM and 1 MB of storage. Finding devices with this much RAM without any form of hardware memory protection was hard. If a device has more than a handful of KB of RAM, it will undoubtedly come with either an MPU or an MMU. Based on the E-SIG's hardware list (see below), of the 5 platforms we've selected, 3 come with an MMU and 2 with an MPU. What does this mean in practice?
I think it is probably not possible to assume the presence of an MMU, and we should account for either an MMU or an MPU. We do not need to account for a world in which neither is present, as those devices are just too limited even for the E-SIG. However, I'd be delighted to open this up to debate and see what the other stakeholders in the E-SIG think - email promotion of this issue is coming ;) There is an argument which says that the growing capabilities of MPUs and the growing prevalence of MMUs mean that the need to support limited MPUs is shrinking - hence why I'd like to open the discussion a little wider. But, without a chip in from others, I'd assume that we should aim to support both an MMU and an MPU.

Picking Hardware for An Application

In my view, @lum1n0us's two examples provide the two extremes of the embedded world. The real truth is that the choice of hardware is application specific: if we've got an application that needs virtual memory, then selecting hardware with an MMU is preferable.

Impact on Wasm - Functional Equivalence, Not Identical Functionality (Similar to Linux's Approach?)

So, if we can't assume that there is something MMU-like on a device, what can we do? Well, looking out at other projects, we can take a cue from Linux, which provides userland source, if not binary, compatibility. Today it is possible to run Linux on devices without an MMU (uClinux, for example), and in those situations Linux can provide the running application functional compatibility, but without the same side effects - for instance, calls like mmap() still work, just implemented differently under the hood.

As @lum1n0us suggests, something similar is needed in the embedded Wasm world. The same code should function as expected, but it is acceptable to encounter performance impacts from manual copies of memory between locations (a small sketch follows below), or perhaps alternatively to allow a weak sandbox where memory protection isn't present. These are trade-offs that can be made based on the application use case and the cost point for the hardware. The important aspect is that the same code will operate successfully. To aid in this discussion I've updated the selected hardware table (see below) with hardware memory-management features.
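For illustration, here is a tiny sketch of the manual-copy fallback mentioned above (the helper names wasm_copy_in/wasm_copy_out are made up for this example): instead of mapping the same pages into the sandbox, the host copies a buffer into the guest's linear memory before the call and copies the results back out afterwards, so the same guest code still works, just with extra copies.

```c
/* Sketch of a "manual copy" fallback when pages cannot be shared with the
 * sandbox: data is staged in and out of the guest's linear memory. */
#include <stdint.h>
#include <string.h>

/* copy host data into the guest's linear memory at guest_offset */
static void wasm_copy_in(uint8_t *linear_mem, uint32_t guest_offset,
                         const void *host_buf, size_t len)
{
    memcpy(linear_mem + guest_offset, host_buf, len);
}

/* copy results back out of the guest's linear memory */
static void wasm_copy_out(void *host_buf, const uint8_t *linear_mem,
                          uint32_t guest_offset, size_t len)
{
    memcpy(host_buf, linear_mem + guest_offset, len);
}
```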
This is an interesting idea. Perhaps Stephen from Aytm would have a view on this? I think Aytm is mainly focusing on the ARM market at the moment.
(Thanks for all the info so far and looking forward to seeing all the other responses.)
Per the "For example, if:" point above:
There really are lots of angles to explore before that, but it certainly remains possible. That issue aside -- because it may be a no-op if there are in fact ways to finesse this as the component model evolves -- the hardware specificity we can tackle together is a great result of this document. Without it, we can't really change the spec as needed to address the issues. I love that the SIG is in fact digging into where things work, where they don't, and what the limits may actually be in the future.
I was thinking more along the lines of what Linux is doing - and using this as a bit of an inspiration. There are two behaviours for the same function call. I'll explain the thought exercise here, but it could equally well live with the discussion over on the structured memory issue @lukewagner pointed to. Considering Linux:

- With an MMU, mmap() installs page-table entries so the caller's virtual addresses resolve to the underlying physical pages; nothing is copied.
- Without an MMU (uClinux-style no-MMU kernels), mmap() generally has to allocate a contiguous buffer and copy the data into it (or, for private read-only file mappings, hand back a pointer straight into the page cache).
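As a concrete illustration of the two behaviours, here is a minimal user-space sketch (the file path is arbitrary, chosen just for the example): the caller-side code is identical on MMU and no-MMU Linux, and only the kernel's implementation of mmap() differs.

```c
/* Same caller-side code on MMU and no-MMU Linux: the caller just gets a
 * pointer to the file's bytes. On an MMU kernel the pages are mapped via
 * page tables; on a no-MMU kernel the data is typically copied into an
 * allocated contiguous buffer instead. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/etc/hostname", O_RDONLY);   /* any readable file */
    if (fd < 0) return 1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) return 1;

    /* MAP_PRIVATE read-only mappings are the case no-MMU kernels handle best. */
    void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) return 1;

    fwrite(p, 1, st.st_size, stdout);            /* use the mapping */

    munmap(p, st.st_size);
    close(fd);
    return 0;
}
```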
The results from the caller's perspective are the same -> they have access to the memory address they require. But the implementation is, of course, different.

Thinking about Wasm - Memory Protection for the Pointer Returned from mmap()
It is also this thinking (2) which pushes us toward the concept of just having a second memory mapped with mmap. Note this is just brainstorming - all thought experiments - and I can see avenues in which support from core Wasm would really help. That aside, I hope this brainstorming helps explain how some folks are approaching the constraints. Certainly, the performance hit is totally acceptable: in the embedded world the hardware is chosen and is application specific; it's not a general compute device. Therefore, if you really need to do a lot of memory sharing, there is an argument that says you should stump up and pay for hardware with an MMU, or accept the limitation.
A couple of comments here. First, regarding @squillace's comment about it breaking the core spec: in my opinion, if the necessary HW is not present it would be acceptable to have a performance impact, provided the same code functions as expected and we maintain compliance with the core spec. I would also add that, optionally, the core spec requirements could be relaxed, but this cannot be the default behavior of the runtime. This would enable someone to make an explicit decision on their platform regarding relaxing such requirements and whether or not that is appropriate for their specific use case. Again, this needs to be an explicit option and not default behavior (a small sketch of what I mean follows below). Second, regarding RISC-V32, we have just started working with the ESP32-C6 series chips. These are dual-core 32-bit RISC-V CPUs and come in a number of configurations. We're currently using the

Links:
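To illustrate the "explicit option, not default" point from the first comment above, here is a purely hypothetical compile-time switch (not a real flag in any existing runtime): the strict, core-spec-compliant behaviour is the default, and the relaxed sandbox only exists if the platform integrator deliberately turns it on.

```c
/* Hypothetical build-time opt-in; the macro name is invented for this sketch.
 * By default the runtime keeps full core-spec bounds checking; only an
 * explicit -DRUNTIME_ALLOW_RELAXED_SANDBOX=1 at build time relaxes it. */
#ifndef RUNTIME_ALLOW_RELAXED_SANDBOX
#define RUNTIME_ALLOW_RELAXED_SANDBOX 0   /* strict by default */
#endif

static inline int bounds_check_required(void)
{
#if RUNTIME_ALLOW_RELAXED_SANDBOX
    return 0;   /* integrator explicitly accepted a weaker sandbox */
#else
    return 1;   /* default: every linear-memory access is checked  */
#endif
}
```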
Agreed. This makes sense.
Ok, great! - I'll get an update to the document!
(Back from holidays and catching up) Thanks for the replies! So, summarizing what I'm seeing above to see if this makes sense to everyone (let me know if not):
A follow-up question: when we say we'll accept "perf overhead" for sandboxing without hardware support, it seems like there are (at least) two potential degrees of perf overhead worth considering:
If we accept significant overhead, that obviously gives us more flexibility in what we consider an acceptable solution, but my worry is that it'll be so slow as to make default (standards-compliant) wasm a nonviable option in practice. So my impression is that we should only accept modest performance overhead when an MMU is not available. This also speaks to the constraints brought up in the last CG presentation where, even outside embedded scenarios where an MMU is not physically present, there are plenty of other wasm execution contexts where an MMU is inaccessible to the wasm engine. Thoughts?
In the target platforms doc, there is a section explaining that the E-SIG feels comfortable assuming an MPU or MMU for efficient implementations. This is very helpful since it gives us a baseline of memory-related hardware capabilities that we can assume when considering implementation strategies for various features.
Just to state some assumptions, to see if I'm understanding these terms correctly, so folks can correct me if I'm wrong:

- An MMU translates virtual addresses to physical addresses at page granularity via page tables, so it can enforce permissions on (and remap) an effectively unbounded number of page-sized ranges.
- An MPU does not translate addresses; it only checks accesses against a small, fixed number of regions (base, size, and permission attributes), so the number of independently protected ranges is bounded by hardware.
Based on these assumptions, one corollary is that any feature that requires fine-grained/page-level control over an unbounded set of memory ranges wouldn't work on an MPU and thus would not work across all the target platforms.
But one thing that's unclear to me is whether an MPU allows mapping a finite set of distinct "virtual" addresses to the same "physical" page of RAM. Hypothetically, it seems like an MPU could do this without blowing its limited hardware budget by including an "offset" field in its per-region configuration state that was added as part of address translation. With this hardware capability, we could, e.g., efficiently support a small finite number of "mmap()s" (or features that want to be implemented in terms of mmap()).

However, scanning the docs of the popular ARM Cortex-M MPU, it seems like at least this one popular MPU can't do such shared mappings. However, I've also heard that various extensions outside the official ARM Cortex-M MPU extension might allow this? Also, I have no idea about the wider world of MPUs.

So, what I'd love to get clarification on (here and, ideally, in the doc) is whether the E-SIG believes that such shared mappings are indeed possible across the target platforms' MPUs, or perhaps whether we want to raise the baseline to assume an MMU.
One discussion where this is concretely relevant is in the WebAssembly CG memory-control proposal (and its collection of sub-proposals), e.g., memory-control/#19, which does seem to assume an MMU.