| 1 | +.. SPDX-License-Identifier: GPL-2.0 |
| 2 | + |
| 3 | +======================== |
| 4 | +Generic Radix Page Table |
| 5 | +======================== |
| 6 | + |
| 7 | +.. kernel-doc:: include/linux/generic_pt/common.h |
| 8 | + :doc: Generic Radix Page Table |
| 9 | + |
| 10 | +.. kernel-doc:: drivers/iommu/generic_pt/pt_defs.h |
| 11 | + :doc: Generic Page Table Language |
| 12 | + |
| 13 | +Usage |
| 14 | +===== |
| 15 | + |
| 16 | +Generic PT is structured as a multi-compilation system. Since each format |
| 17 | +provides an API using a common set of names there can be only one format active |
| 18 | +within a compilation unit. This design avoids function pointers around the low |
| 19 | +level API. |
| 20 | + |
| 21 | +Instead the function pointers can end up at the higher level API (i.e. |
| 22 | +map/unmap, etc.) and the per-format code can be directly inlined into the |
| 23 | +per-format compilation unit. For something like IOMMU each format will be |
| 24 | +compiled into a per-format IOMMU operations kernel module. |
| 25 | + |
| 26 | +For this to work the .c file for each compilation unit will include both the |
| 27 | +format headers and the generic code for the implementation. For instance in an |
| 28 | +implementation compilation unit the headers would normally be included as |
| 29 | +follows: |
| 30 | + |
| 31 | +generic_pt/fmt/iommu_amdv1.c:: |
| 32 | + |
| 33 | + #include <linux/generic_pt/common.h> |
| 34 | + #include "defs_amdv1.h" |
| 35 | + #include "../pt_defs.h" |
| 36 | + #include "amdv1.h" |
| 37 | + #include "../pt_common.h" |
| 38 | + #include "../pt_iter.h" |
| 39 | + #include "../iommu_pt.h" /* The IOMMU implementation */ |
| 40 | + |
| 41 | +iommu_pt.h includes definitions that will generate the operations functions for |
| 42 | +map/unmap/etc. using the definitions provided by AMDv1. The resulting module |
| 43 | +will have exported symbols named like pt_iommu_amdv1_init(). |
| 44 | + |
| 45 | +Refer to drivers/iommu/generic_pt/fmt/iommu_template.h for an example of how the |
| 46 | +IOMMU implementation uses multi-compilation to generate per-format ops struct |
| 47 | +pointers. |
| 48 | + |
| 49 | +The format code is written so that the common names arise from #defines to |
| 50 | +distinct format specific names. This is intended to aid debuggability by |
| 51 | +avoiding symbol clashes across all the different formats. |
| 52 | + |
| 53 | +Exported symbols and other global names are mangled using a per-format string |
| 54 | +via the NS() helper macro. |
| 55 | + |
| 56 | +The format uses struct pt_common as the top-level struct for the table, |
| 57 | +and each format will have its own struct pt_xxx which embeds it to store |
| 58 | +format-specific information. |
| 59 | + |
| 60 | +The implementation will further wrap struct pt_common in its own top-level |
| 61 | +struct, such as struct pt_iommu_amdv1. |
| 62 | + |
| 63 | +Format functions at the struct pt_common level |
| 64 | +---------------------------------------------- |
| 65 | + |
| 66 | +.. kernel-doc:: include/linux/generic_pt/common.h |
| 67 | + :identifiers: |
| 68 | +.. kernel-doc:: drivers/iommu/generic_pt/pt_common.h |
| 69 | + |
| 70 | +Iteration Helpers |
| 71 | +----------------- |
| 72 | + |
| 73 | +.. kernel-doc:: drivers/iommu/generic_pt/pt_iter.h |
| 74 | + |
| 75 | +Writing a Format |
| 76 | +---------------- |
| 77 | + |
| 78 | +It is best to start from a simple format that is similar to the target. x86_64 |
| 79 | +is usually a good reference for something simple, and AMDv1 is something fairly |
| 80 | +complete. |
| 81 | + |
| 82 | +The required inline functions need to be implemented in the format header. |
| 83 | +These should all follow the standard pattern of:: |
| 84 | + |
| 85 | + static inline pt_oaddr_t amdv1pt_entry_oa(const struct pt_state *pts) |
| 86 | + { |
| 87 | + [..] |
| 88 | + } |
| 89 | + #define pt_entry_oa amdv1pt_entry_oa |
| 90 | + |
| 91 | +where a uniquely named per-format inline function provides the implementation |
| 92 | +and a define maps it to the generic name. This is intended to make debug symbols |
| 93 | +work better. Inline functions should always be used, because the prototypes in |
| 94 | +pt_common.h cause the compiler to validate the function signature and prevent |
| 95 | +errors. |
| 96 | + |
| 97 | +Review pt_fmt_defaults.h to understand some of the optional inlines. |
| 98 | + |
| 99 | +Once the format compiles, it should be run through the generic page table |
| 100 | +kunit test in kunit_generic_pt.h using kunit. For example:: |
| 101 | + |
| 102 | + $ tools/testing/kunit/kunit.py run --build_dir build_kunit_x86_64 --arch x86_64 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig amdv1_fmt_test.* |
| 103 | + [...] |
| 104 | + [11:15:08] Testing complete. Ran 9 tests: passed: 9 |
| 105 | + [11:15:09] Elapsed time: 3.137s total, 0.001s configuring, 2.368s building, 0.311s running |
| 106 | + |
| 107 | +The generic tests are intended to prove out the format functions and give |
| 108 | +clearer failures to speed up finding the problems. Once those pass then the |
| 109 | +entire kunit suite should be run. |
| 110 | + |
| 111 | +IOMMU Invalidation Features |
| 112 | +--------------------------- |
| 113 | + |
| 114 | +Invalidation is how the page table algorithms synchronize with a HW cache of the |
| 115 | +page table memory, typically called the TLB (or IOTLB for IOMMU cases). |
| 116 | + |
| 117 | +The TLB can store present PTEs, non-present PTEs and table pointers, depending |
| 118 | +on its design. Each HW has its own way of describing what has changed so that |
| 119 | +the changed items can be removed from the TLB. |
| 120 | + |
| 121 | +PT_FEAT_FLUSH_RANGE |
| 122 | +~~~~~~~~~~~~~~~~~~~ |
| 123 | + |
| 124 | +PT_FEAT_FLUSH_RANGE is the easiest scheme to understand. It tries to generate a |
| 125 | +single range invalidation for each operation, over-invalidating if there are |
| 126 | +gaps of VA that don't need invalidation. This trades a larger impacted VA for |
| 127 | +fewer invalidation operations. It does not keep track of what is being invalidated; |
| 128 | +however, if pages have to be freed then page table pointers have to be cleaned |
| 129 | +from the walk cache. The range can start/end at any page boundary. |
| 130 | + |
| 131 | +PT_FEAT_FLUSH_RANGE_NO_GAPS |
| 132 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 133 | + |
| 134 | +PT_FEAT_FLUSH_RANGE_NO_GAPS is similar to PT_FEAT_FLUSH_RANGE; however, it tries |
| 135 | +to minimize the amount of impacted VA by issuing extra flush operations. This is |
| 136 | +useful if the cost of processing VA is very high, for instance because a |
| 137 | +hypervisor is processing the page table with a shadowing algorithm. |
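
The trade-off between PT_FEAT_FLUSH_RANGE and PT_FEAT_FLUSH_RANGE_NO_GAPS described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the kernel implementation: struct va_range, struct flush_cost, flush_single_range() and flush_no_gaps() are hypothetical names invented for this example, and the input is assumed to be a sorted list of non-overlapping, page-aligned ranges that changed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: a half-open VA range [start, end). */
struct va_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical summary of the invalidations a strategy would issue. */
struct flush_cost {
	size_t num_ops;		/* number of invalidation commands */
	uint64_t va_bytes;	/* total VA covered by those commands */
};

/*
 * Models the PT_FEAT_FLUSH_RANGE idea: a single invalidation spanning every
 * changed range, including any untouched gaps between them.
 * The input must be sorted by start and non-overlapping.
 */
static struct flush_cost flush_single_range(const struct va_range *r, size_t n)
{
	struct flush_cost cost = { 0, 0 };

	if (n == 0)
		return cost;
	assert(r[0].start <= r[n - 1].end);
	cost.num_ops = 1;
	cost.va_bytes = r[n - 1].end - r[0].start;
	return cost;
}

/*
 * Models the PT_FEAT_FLUSH_RANGE_NO_GAPS idea: one invalidation per
 * contiguous run, so gaps that did not change are never invalidated.
 */
static struct flush_cost flush_no_gaps(const struct va_range *r, size_t n)
{
	struct flush_cost cost = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		assert(r[i].start <= r[i].end);
		/* Start a new command unless this range abuts the previous one. */
		if (i == 0 || r[i].start != r[i - 1].end)
			cost.num_ops++;
		cost.va_bytes += r[i].end - r[i].start;
	}
	return cost;
}
```

In this model, the ranges [0x1000, 0x2000), [0x2000, 0x3000) and [0x10000, 0x11000) cost one operation covering 0x10000 bytes with the single-range approach, and two operations covering 0x3000 bytes with the no-gaps approach. The second is preferable only when processing each byte of VA is expensive relative to issuing an extra command.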