Disable MADV_DONTNEED zeroing optimisation on mshv

In 1f94fb3, some odd behaviour with MSHV and MAP_PRIVATE mappings was
noticed. It turns out that this was actually due to the hypervisor
pinning the old page structures, which meant that
`madvise(MADV_DONTNEED)` resulted in the userspace view of the memory
in question becoming completely divorced from the hypervisor view.
Switching (back) to MAP_SHARED "fixed" this only because it meant that
zeroing was actually ineffective for both the host and the guest (so
at least host writes were reflected in the guest): `madvise(2)` notes
that shared anonymous mappings will have their contents repopulated on
access after an `MADV_DONTNEED`.
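
The difference between the two mapping types can be observed directly from userspace. The following is a minimal, Linux-only sketch and is not taken from this commit: the helper name `byte_after_dontneed` is hypothetical, the constants are the Linux values for x86_64 and aarch64 declared by hand, and the libc functions are bound manually so that the example needs no external crates.

```rust
use std::ffi::c_void;
use std::os::raw::c_int;

// Assumption: Linux constant values as found on x86_64 and aarch64.
const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_SHARED: c_int = 0x01;
const MAP_PRIVATE: c_int = 0x02;
const MAP_ANONYMOUS: c_int = 0x20;
const MADV_DONTNEED: c_int = 4;

// Hand-written bindings to libc; `off` is `off_t`, assumed to be 64-bit.
unsafe extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int,
            fd: c_int, off: i64) -> *mut c_void;
    fn madvise(addr: *mut c_void, len: usize, advice: c_int) -> c_int;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

/// Hypothetical helper: maps one anonymous region with the given visibility
/// (`MAP_PRIVATE` or `MAP_SHARED`), writes 0xAB to its first byte, calls
/// `madvise(MADV_DONTNEED)`, and returns what the first byte reads as afterwards.
fn byte_after_dontneed(visibility: c_int) -> std::io::Result<u8> {
    const LEN: usize = 4096;
    unsafe {
        let p = mmap(std::ptr::null_mut(), LEN, PROT_READ | PROT_WRITE,
                     visibility | MAP_ANONYMOUS, -1, 0);
        if p as isize == -1 {
            // MAP_FAILED
            return Err(std::io::Error::last_os_error());
        }
        let bytes = p as *mut u8;
        bytes.write_volatile(0xAB);
        if madvise(p, LEN, MADV_DONTNEED) != 0 {
            let err = std::io::Error::last_os_error();
            munmap(p, LEN);
            return Err(err);
        }
        let value = bytes.read_volatile();
        munmap(p, LEN);
        Ok(value)
    }
}
```

Calling this with `MAP_PRIVATE` should read back zero (the page is dropped and zero-filled on the next access), while `MAP_SHARED` should read back `0xAB` (the shared backing still holds the data), which is why the zeroing was ineffective with shared mappings.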
This commit switches back to `MAP_PRIVATE` (so that the optimisation
is correct on KVM, where it works) and disables the optimisation on
MSHV, where the scratch region will always be zeroed by writing zeroes
to it. The original intent of lazily zeroing/populating the memory
will likely only be possible on MSHV with kernel/hypervisor changes
for support.
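
As a rough illustration of the resulting policy, the choice of zeroing method can be sketched as a function of the hypervisor in use. The type and function names here (`Hypervisor`, `ZeroStrategy`, `zero_strategy`, `zero_explicitly`) are hypothetical and are not the identifiers used in the actual change.

```rust
/// Hypothetical enumeration of the Linux hypervisor backends discussed here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Hypervisor {
    Kvm,
    Mshv,
}

/// Hypothetical description of how the scratch region gets zeroed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ZeroStrategy {
    /// Drop the pages with `madvise(MADV_DONTNEED)` and let the kernel
    /// zero-fill them lazily on the next access (MAP_PRIVATE only).
    LazyDontNeed,
    /// Write zeroes over the existing backing pages.
    ExplicitWrite,
}

/// Picks the zeroing method: the lazy optimisation is only kept where the
/// guest view follows the host page tables (KVM); MSHV pins the original
/// pages, so they must be overwritten in place.
fn zero_strategy(hv: Hypervisor) -> ZeroStrategy {
    match hv {
        Hypervisor::Kvm => ZeroStrategy::LazyDontNeed,
        Hypervisor::Mshv => ZeroStrategy::ExplicitWrite,
    }
}

/// Explicit zeroing keeps the same backing pages, so a hypervisor that has
/// pinned them still sees the zeroes. A plain slice stands in for the
/// scratch region in this sketch.
fn zero_explicitly(scratch: &mut [u8]) {
    scratch.fill(0);
}
```

The explicit path costs a full write over the region on every reset, which is the trade-off this commit accepts on MSHV.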
Signed-off-by: Lucy Menon <168595099+syntactically@users.noreply.github.com>
 /// \[5\] P1382R1: `volatile_load<T>` and `volatile_store<T>`. JF Bastien, Paul McKenney, Jeffrey Yasskin, and the indefatigable TBD. <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1382r1.pdf>
 /// \[6\] Documentation for std::sync::atomic::fence. <https://doc.rust-lang.org/std/sync/atomic/fn.fence.html>
+///
+/// # Note \[Keeping mappings in sync between userspace and the guest\]
+///
+/// When using this structure with mshv on Linux, it is necessary to
+/// be a little bit careful: since the hypervisor is not directly
+/// integrated with the host kernel virtual memory subsystem, it is
+/// easy for the memory region in userspace to get out of sync with
+/// the memory region mapped into the guest. Generally speaking, when
+/// the [`SharedMemory`] is mapped into a partition, the MSHV kernel
+/// module will call `pin_user_pages(FOLL_PIN|FOLL_WRITE)` on it,
+/// which will eagerly do any CoW, etc needing to obtain backing pages
+/// pinned in memory, and then map precisely those backing pages into
+/// the virtual machine. After that, the backing pages mapped into the
+/// VM will not change until the region is unmapped or remapped. This
+/// means that code in this module needs to be very careful to avoid
+/// changing the backing pages of the region in the host userspace,
+/// since that would result in hyperlight-host's view of the memory
+/// becoming completely divorced from the view of the VM.