Commit 5b11888

fscrypt: support crypto data unit size less than filesystem block size

Until now, fscrypt has always used the filesystem block size as the
granularity of file contents encryption. Two scenarios have come up
where a sub-block granularity of contents encryption would be useful:

1. Inline crypto hardware that only supports a crypto data unit size
   that is less than the filesystem block size.

2. Support for direct I/O at a granularity less than the filesystem
   block size, for example at the block device's logical block size in
   order to match the traditional direct I/O alignment requirement.

(1) first came up with older eMMC inline crypto hardware that only
supports a crypto data unit size of 512 bytes. That specific case
ultimately went away because all systems with that hardware continued
using out of tree code and never actually upgraded to the upstream
inline crypto framework. But, now it's coming back in a new way: some
current UFS controllers only support a data unit size of 4096 bytes,
and there is a proposal to increase the filesystem block size to 16K.

(2) was discussed as a "nice to have" feature, though not essential,
when support for direct I/O on encrypted files was being upstreamed.

Still, the fact that this feature has come up several times does
suggest it would be wise to have available. Therefore, this patch
implements it by using one of the reserved bytes in fscrypt_policy_v2
to allow users to select a sub-block data unit size. Supported data
unit sizes are powers of 2 between 512 and the filesystem block size,
inclusively. Support is implemented for both the FS-layer and inline
crypto cases.

This patch focuses on the basic support for sub-block data units. Some
things are out of scope for this patch but may be addressed later:

- Supporting sub-block data units in combination with
  FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64, in most cases. Unfortunately this
  combination usually causes data unit indices to exceed 32 bits, and
  thus fscrypt_supported_policy() correctly disallows it. The users who
  potentially need this combination are using f2fs. To support it, f2fs
  would need to provide an option to slightly reduce its max file size.

- Supporting sub-block data units in combination with
  FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32. This has the same problem
  described above, but also it will need special code to make DUN
  wraparound still happen on a FS block boundary.

- Supporting use case (2) mentioned above. The encrypted direct I/O
  code will need to stop requiring and assuming FS block alignment.
  This won't be hard, but it belongs in a separate patch.

- Supporting this feature on filesystems other than ext4 and f2fs.
  (Filesystems declare support for it via their fscrypt_operations.)
  On UBIFS, sub-block data units don't make sense because UBIFS
  encrypts variable-length blocks as a result of compression. CephFS
  could support it, but a bit more work would be needed to make the
  fscrypt_*_block_inplace functions play nicely with sub-block data
  units. I don't think there's a use case for this on CephFS anyway.

Link: https://lore.kernel.org/r/20230925055451.59499-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
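
The size rule and the index-width problem described in the commit message can be sketched in a few lines of C. This is only an illustration, not kernel code: `data_unit_size_ok()` and `max_du_index_bits()` are hypothetical helpers, and `max_lblk_bits` (how many bits the filesystem's largest logical block number needs) is an assumed input rather than a value taken from ext4 or f2fs.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not a kernel function): is log2_data_unit_size
 * acceptable on a filesystem whose block size is 1 << fs_block_bits?
 * 0 means "use the default", i.e. the filesystem block size.
 */
bool data_unit_size_ok(unsigned int log2_data_unit_size,
		       unsigned int fs_block_bits)
{
	if (log2_data_unit_size == 0)
		return true;
	/* Powers of 2 from 512 bytes (2^9) up to the block size, inclusive. */
	return log2_data_unit_size >= 9 &&
	       log2_data_unit_size <= fs_block_bits;
}

/*
 * Hypothetical helper: bits needed to hold the largest data unit index,
 * given that logical block numbers need max_lblk_bits bits (an assumed
 * input).  Each block holds 2^(fs_block_bits - du_bits) data units, so
 * the index grows by that many bits.
 */
unsigned int max_du_index_bits(unsigned int max_lblk_bits,
			       unsigned int fs_block_bits,
			       unsigned int du_bits)
{
	return max_lblk_bits + (fs_block_bits - du_bits);
}
```

For example, under the assumption that block numbers already use all 32 bits, 4096-byte data units on 16K blocks would need 34-bit indices, which does not fit the 32-bit index field that the IV_INO_LBLK_64 and IV_INO_LBLK_32 formats provide.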
1 parent 7a0263d commit 5b11888

11 files changed

Lines changed: 288 additions & 133 deletions

Documentation/filesystems/fscrypt.rst

Lines changed: 86 additions & 31 deletions
@@ -261,9 +261,9 @@ DIRECT_KEY policies
 
 The Adiantum encryption mode (see `Encryption modes and usage`_) is
 suitable for both contents and filenames encryption, and it accepts
-long IVs --- long enough to hold both an 8-byte logical block number
-and a 16-byte per-file nonce. Also, the overhead of each Adiantum key
-is greater than that of an AES-256-XTS key.
+long IVs --- long enough to hold both an 8-byte data unit index and a
+16-byte per-file nonce. Also, the overhead of each Adiantum key is
+greater than that of an AES-256-XTS key.
 
 Therefore, to improve performance and save memory, for Adiantum a
 "direct key" configuration is supported. When the user has enabled
@@ -300,8 +300,8 @@ IV_INO_LBLK_32 policies
 
 IV_INO_LBLK_32 policies work like IV_INO_LBLK_64, except that for
 IV_INO_LBLK_32, the inode number is hashed with SipHash-2-4 (where the
-SipHash key is derived from the master key) and added to the file
-logical block number mod 2^32 to produce a 32-bit IV.
+SipHash key is derived from the master key) and added to the file data
+unit index mod 2^32 to produce a 32-bit IV.
 
 This format is optimized for use with inline encryption hardware
 compliant with the eMMC v5.2 standard, which supports only 32 IV bits
@@ -451,31 +451,62 @@ acceleration is recommended:
 Contents encryption
 -------------------
 
-For file contents, each filesystem block is encrypted independently.
-Starting from Linux kernel 5.5, encryption of filesystems with block
-size less than system's page size is supported.
-
-Each block's IV is set to the logical block number within the file as
-a little endian number, except that:
-
-- With CBC mode encryption, ESSIV is also used. Specifically, each IV
-  is encrypted with AES-256 where the AES-256 key is the SHA-256 hash
-  of the file's data encryption key.
-
-- With `DIRECT_KEY policies`_, the file's nonce is appended to the IV.
-  Currently this is only allowed with the Adiantum encryption mode.
-
-- With `IV_INO_LBLK_64 policies`_, the logical block number is limited
-  to 32 bits and is placed in bits 0-31 of the IV. The inode number
-  (which is also limited to 32 bits) is placed in bits 32-63.
-
-- With `IV_INO_LBLK_32 policies`_, the logical block number is limited
-  to 32 bits and is placed in bits 0-31 of the IV. The inode number
-  is then hashed and added mod 2^32.
-
-Note that because file logical block numbers are included in the IVs,
-filesystems must enforce that blocks are never shifted around within
-encrypted files, e.g. via "collapse range" or "insert range".
+For contents encryption, each file's contents is divided into "data
+units". Each data unit is encrypted independently. The IV for each
+data unit incorporates the zero-based index of the data unit within
+the file. This ensures that each data unit within a file is encrypted
+differently, which is essential to prevent leaking information.
+
+Note: the encryption depending on the offset into the file means that
+operations like "collapse range" and "insert range" that rearrange the
+extent mapping of files are not supported on encrypted files.
+
+There are two cases for the sizes of the data units:
+
+* Fixed-size data units. This is how all filesystems other than UBIFS
+  work. A file's data units are all the same size; the last data unit
+  is zero-padded if needed. By default, the data unit size is equal
+  to the filesystem block size. On some filesystems, users can select
+  a sub-block data unit size via the ``log2_data_unit_size`` field of
+  the encryption policy; see `FS_IOC_SET_ENCRYPTION_POLICY`_.
+
+* Variable-size data units. This is what UBIFS does. Each "UBIFS
+  data node" is treated as a crypto data unit. Each contains variable
+  length, possibly compressed data, zero-padded to the next 16-byte
+  boundary. Users cannot select a sub-block data unit size on UBIFS.
+
+In the case of compression + encryption, the compressed data is
+encrypted. UBIFS compression works as described above. f2fs
+compression works a bit differently; it compresses a number of
+filesystem blocks into a smaller number of filesystem blocks.
+Therefore a f2fs-compressed file still uses fixed-size data units, and
+it is encrypted in a similar way to a file containing holes.
+
+As mentioned in `Key hierarchy`_, the default encryption setting uses
+per-file keys. In this case, the IV for each data unit is simply the
+index of the data unit in the file. However, users can select an
+encryption setting that does not use per-file keys. For these, some
+kind of file identifier is incorporated into the IVs as follows:
+
+- With `DIRECT_KEY policies`_, the data unit index is placed in bits
+  0-63 of the IV, and the file's nonce is placed in bits 64-191.
+
+- With `IV_INO_LBLK_64 policies`_, the data unit index is placed in
+  bits 0-31 of the IV, and the file's inode number is placed in bits
+  32-63. This setting is only allowed when data unit indices and
+  inode numbers fit in 32 bits.
+
+- With `IV_INO_LBLK_32 policies`_, the file's inode number is hashed
+  and added to the data unit index. The resulting value is truncated
+  to 32 bits and placed in bits 0-31 of the IV. This setting is only
+  allowed when data unit indices and inode numbers fit in 32 bits.
+
+The byte order of the IV is always little endian.
+
+If the user selects FSCRYPT_MODE_AES_128_CBC for the contents mode, an
+ESSIV layer is automatically included. In this case, before the IV is
+passed to AES-128-CBC, it is encrypted with AES-256 where the AES-256
+key is the SHA-256 hash of the file's contents encryption key.
 
 Filenames encryption
 --------------------
@@ -544,7 +575,8 @@ follows::
             __u8 contents_encryption_mode;
             __u8 filenames_encryption_mode;
             __u8 flags;
-            __u8 __reserved[4];
+            __u8 log2_data_unit_size;
+            __u8 __reserved[3];
             __u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE];
     };
 
@@ -586,6 +618,29 @@ This structure must be initialized as follows:
   The DIRECT_KEY, IV_INO_LBLK_64, and IV_INO_LBLK_32 flags are
   mutually exclusive.
 
+- ``log2_data_unit_size`` is the log2 of the data unit size in bytes,
+  or 0 to select the default data unit size. The data unit size is
+  the granularity of file contents encryption. For example, setting
+  ``log2_data_unit_size`` to 12 causes file contents be passed to the
+  underlying encryption algorithm (such as AES-256-XTS) in 4096-byte
+  data units, each with its own IV.
+
+  Not all filesystems support setting ``log2_data_unit_size``. ext4
+  and f2fs support it since Linux v6.7. On filesystems that support
+  it, the supported nonzero values are 9 through the log2 of the
+  filesystem block size, inclusively. The default value of 0 selects
+  the filesystem block size.
+
+  The main use case for ``log2_data_unit_size`` is for selecting a
+  data unit size smaller than the filesystem block size for
+  compatibility with inline encryption hardware that only supports
+  smaller data unit sizes. ``/sys/block/$disk/queue/crypto/`` may be
+  useful for checking which data unit sizes are supported by a
+  particular system's inline encryption hardware.
+
+  Leave this field zeroed unless you are certain you need it. Using
+  an unnecessarily small data unit size reduces performance.
+
 - For v2 encryption policies, ``__reserved`` must be zeroed.
 
 - For v1 encryption policies, ``master_key_descriptor`` specifies how
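
The IV layouts described in the documentation change above can be sketched in userspace C. This is a rough illustration under stated assumptions, not the kernel's implementation: the function names, `put_le64()`, and the 32-byte `IV_SIZE` buffer are all hypothetical, and `hashed_ino` stands in for the SipHash-2-4 result, which is not computed here.

```c
#include <stdint.h>
#include <string.h>

#define IV_SIZE 32	/* assumption: room for a 64-bit index plus a 16-byte nonce */

/* Store a 64-bit value at dst in little endian byte order. */
void put_le64(uint8_t *dst, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (8 * i));
}

/* Default (per-file keys): the IV is just the data unit index. */
void iv_per_file_key(uint8_t iv[IV_SIZE], uint64_t du_index)
{
	memset(iv, 0, IV_SIZE);
	put_le64(iv, du_index);
}

/* DIRECT_KEY: index in bits 0-63, file nonce in bits 64-191. */
void iv_direct_key(uint8_t iv[IV_SIZE], uint64_t du_index,
		   const uint8_t nonce[16])
{
	memset(iv, 0, IV_SIZE);
	put_le64(iv, du_index);
	memcpy(iv + 8, nonce, 16);
}

/* IV_INO_LBLK_64: index in bits 0-31, inode number in bits 32-63. */
void iv_ino_lblk_64(uint8_t iv[IV_SIZE], uint32_t du_index, uint32_t ino)
{
	memset(iv, 0, IV_SIZE);
	put_le64(iv, ((uint64_t)ino << 32) | du_index);
}

/* IV_INO_LBLK_32: (hashed inode number + index) mod 2^32 in bits 0-31. */
void iv_ino_lblk_32(uint8_t iv[IV_SIZE], uint32_t du_index,
		    uint32_t hashed_ino)
{
	memset(iv, 0, IV_SIZE);
	put_le64(iv, (uint32_t)(hashed_ino + du_index));
}
```

Taking the index as `uint32_t` in the last two helpers mirrors the documented restriction that these formats are only allowed when data unit indices and inode numbers fit in 32 bits.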

fs/crypto/bio.c

Lines changed: 22 additions & 17 deletions
@@ -111,10 +111,14 @@ static int fscrypt_zeroout_range_inline_crypt(const struct inode *inode,
 int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
 			  sector_t pblk, unsigned int len)
 {
-	const unsigned int blockbits = inode->i_blkbits;
-	const unsigned int blocksize = 1 << blockbits;
-	const unsigned int blocks_per_page_bits = PAGE_SHIFT - blockbits;
-	const unsigned int blocks_per_page = 1 << blocks_per_page_bits;
+	const struct fscrypt_info *ci = inode->i_crypt_info;
+	const unsigned int du_bits = ci->ci_data_unit_bits;
+	const unsigned int du_size = 1U << du_bits;
+	const unsigned int du_per_page_bits = PAGE_SHIFT - du_bits;
+	const unsigned int du_per_page = 1U << du_per_page_bits;
+	u64 du_index = (u64)lblk << (inode->i_blkbits - du_bits);
+	u64 du_remaining = (u64)len << (inode->i_blkbits - du_bits);
+	sector_t sector = pblk << (inode->i_blkbits - SECTOR_SHIFT);
 	struct page *pages[16]; /* write up to 16 pages at a time */
 	unsigned int nr_pages;
 	unsigned int i;
@@ -130,8 +134,8 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
 							  len);
 
 	BUILD_BUG_ON(ARRAY_SIZE(pages) > BIO_MAX_VECS);
-	nr_pages = min_t(unsigned int, ARRAY_SIZE(pages),
-			 (len + blocks_per_page - 1) >> blocks_per_page_bits);
+	nr_pages = min_t(u64, ARRAY_SIZE(pages),
+			 (du_remaining + du_per_page - 1) >> du_per_page_bits);
 
 	/*
 	 * We need at least one page for ciphertext. Allocate the first one
@@ -154,35 +158,36 @@ int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
 	bio = bio_alloc(inode->i_sb->s_bdev, nr_pages, REQ_OP_WRITE, GFP_NOFS);
 
 	do {
-		bio->bi_iter.bi_sector = pblk << (blockbits - 9);
+		bio->bi_iter.bi_sector = sector;
 
 		i = 0;
 		offset = 0;
 		do {
-			err = fscrypt_crypt_block(inode, FS_ENCRYPT, lblk,
-						  ZERO_PAGE(0), pages[i],
-						  blocksize, offset, GFP_NOFS);
+			err = fscrypt_crypt_data_unit(ci, FS_ENCRYPT, du_index,
+						      ZERO_PAGE(0), pages[i],
+						      du_size, offset,
+						      GFP_NOFS);
 			if (err)
 				goto out;
-			lblk++;
-			pblk++;
-			len--;
-			offset += blocksize;
-			if (offset == PAGE_SIZE || len == 0) {
+			du_index++;
+			sector += 1U << (du_bits - SECTOR_SHIFT);
+			du_remaining--;
+			offset += du_size;
+			if (offset == PAGE_SIZE || du_remaining == 0) {
 				ret = bio_add_page(bio, pages[i++], offset, 0);
 				if (WARN_ON_ONCE(ret != offset)) {
 					err = -EIO;
 					goto out;
 				}
 				offset = 0;
 			}
-		} while (i != nr_pages && len != 0);
+		} while (i != nr_pages && du_remaining != 0);
 
 		err = submit_bio_wait(bio);
 		if (err)
 			goto out;
 		bio_reset(bio, inode->i_sb->s_bdev, REQ_OP_WRITE);
-	} while (len != 0);
+	} while (du_remaining != 0);
 	err = 0;
 out:
 	bio_put(bio);
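
The unit conversions that this hunk introduces in fscrypt_zeroout_range() can be checked in isolation with a small userspace sketch. The arithmetic follows the diff, but `struct zeroout_units`, `to_data_units()`, and `pages_needed()` are hypothetical names invented for illustration, and plain `uint64_t` stands in for the kernel's `u64`, `pgoff_t`, and `sector_t`.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as in the kernel */

/* Hypothetical container for the values the patched function derives. */
struct zeroout_units {
	uint64_t du_index;	 /* index of the first data unit in the file */
	uint64_t du_remaining;	 /* number of data units to zero out */
	uint64_t sector;	 /* starting 512-byte sector on the device */
	uint64_t sectors_per_du; /* sector increment per data unit */
};

/* Convert a range given in filesystem blocks into data units and sectors. */
struct zeroout_units to_data_units(uint64_t lblk, uint64_t pblk,
				   unsigned int len, unsigned int blkbits,
				   unsigned int du_bits)
{
	struct zeroout_units u;

	/* Data units are between 512 bytes and the filesystem block size. */
	assert(du_bits >= SECTOR_SHIFT && du_bits <= blkbits);

	u.du_index = lblk << (blkbits - du_bits);
	u.du_remaining = (uint64_t)len << (blkbits - du_bits);
	u.sector = pblk << (blkbits - SECTOR_SHIFT);
	u.sectors_per_du = 1ULL << (du_bits - SECTOR_SHIFT);
	return u;
}

/* Pages needed to hold du_remaining data units, capped at max_pages. */
unsigned int pages_needed(uint64_t du_remaining, unsigned int page_shift,
			  unsigned int du_bits, unsigned int max_pages)
{
	/* Assumes du_bits <= page_shift, as the shift in the diff does. */
	unsigned int du_per_page_bits = page_shift - du_bits;
	uint64_t du_per_page = 1ULL << du_per_page_bits;
	uint64_t n = (du_remaining + du_per_page - 1) >> du_per_page_bits;

	return n < max_pages ? (unsigned int)n : max_pages;
}
```

With 16K blocks (`blkbits` 14) and 4096-byte data units (`du_bits` 12), zeroing 2 blocks starting at logical block 3 yields 8 data units starting at index 12, and each data unit advances the sector by 8.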
