
Commit 9e35511

Merge tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
 "This contains a mix of VFS cleanups, performance improvements, API
  fixes, documentation, and a deprecation notice.

  Scalability and performance:

   - Rework pid allocation to only take pidmap_lock once instead of
     twice during alloc_pid(), improving thread creation/teardown
     throughput by 10-16% depending on false-sharing luck. Pad the
     namespace refcount to reduce false-sharing

   - Track file lock presence via a flag in ->i_opflags instead of
     reading ->i_flctx, avoiding false-sharing with ->i_readcount on
     open/close hot paths. Measured 4-16% improvement on 24-core
     open-in-a-loop benchmarks

   - Use a consume fence in locks_inode_context() to match the
     store-release/load-consume idiom, eliminating a hardware fence on
     some architectures

   - Annotate cdev_lock with __cacheline_aligned_in_smp to prevent
     false-sharing

   - Remove a redundant DCACHE_MANAGED_DENTRY check in
     __follow_mount_rcu() that never fires since the caller already
     verifies it, eliminating a 100% mispredicted branch

   - Fix a 100% mispredicted likely() in devcgroup_inode_permission()
     that became wrong after a prior code reorder

  Bug fixes and correctness:

   - Make insert_inode_locked() wait for inode destruction instead of
     skipping, fixing a corner case where two matching inodes could
     exist in the hash

   - Move f_mode initialization before file_ref_init() in alloc_file()
     to respect the SLAB_TYPESAFE_BY_RCU ordering contract

   - Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with
     no buffers attached, preventing a null pointer dereference when
     AS_RELEASE_ALWAYS is set but no release_folio op exists

   - Fix select restart_block to store end_time as timespec64, avoiding
     truncation of tv_sec on 32-bit architectures

   - Make dump_inode() use get_kernel_nofault() to safely access inode
     and superblock fields, matching the dump_mapping() pattern

  API modernization:

   - Make posix_acl_to_xattr() allocate the buffer internally since
     every single caller was doing it anyway. Reduces boilerplate and
     unnecessary error checking across ~15 filesystems

   - Replace deprecated simple_strtoul() with kstrtoul() for the
     ihash_entries, dhash_entries, mhash_entries, and mphash_entries
     boot parameters, adding proper error handling

   - Convert chardev code to use guard(mutex) and __free(kfree) cleanup
     patterns

   - Replace min_t() with min() or umin() in VFS code to avoid silently
     truncating unsigned long to unsigned int

   - Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers
     already check the flag

  Deprecation:

   - Begin deprecating legacy BSD process accounting (acct(2)). The
     interface has numerous footguns and better alternatives exist
     (eBPF)

  Documentation:

   - Fix and complete kernel-doc for struct export_operations, removing
     duplicated documentation between ReST and source

   - Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait()

  Testing:

   - Add a kunit test for initramfs cpio handling of entries with
     filesize > PATH_MAX

  Misc:

   - Add missing <linux/init_task.h> include in fs_struct.c"

* tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
  posix_acl: make posix_acl_to_xattr() alloc the buffer
  fs: make insert_inode_locked() wait for inode destruction
  initramfs_test: kunit test for cpio.filesize > PATH_MAX
  fs: improve dump_inode() to safely access inode fields
  fs: add <linux/init_task.h> for 'init_fs'
  docs: exportfs: Use source code struct documentation
  fs: move initializing f_mode before file_ref_init()
  exportfs: Complete kernel-doc for struct export_operations
  exportfs: Mark struct export_operations functions at kernel-doc
  exportfs: Fix kernel-doc output for get_name()
  acct(2): begin the deprecation of legacy BSD process accounting
  device_cgroup: remove branch hint after code refactor
  VFS: fix __start_dirop() kernel-doc warnings
  fs: Describe @isnew parameter in ilookup5_nowait()
  fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu
  fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS
  select: store end_time as timespec64 in restart block
  chardev: Switch to guard(mutex) and __free(kfree)
  namespace: Replace simple_strtoul with kstrtoul to parse boot params
  dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries
  ...
2 parents 3304b3f + 6cbfdf8 commit 9e35511

39 files changed

Lines changed: 352 additions & 294 deletions

Documentation/filesystems/nfs/exporting.rst

Lines changed: 5 additions & 37 deletions
@@ -119,43 +119,11 @@ For a filesystem to be exportable it must:
 
 A file system implementation declares that instances of the filesystem
 are exportable by setting the s_export_op field in the struct
-super_block. This field must point to a "struct export_operations"
-struct which has the following members:
-
-  encode_fh (mandatory)
-    Takes a dentry and creates a filehandle fragment which may later be used
-    to find or create a dentry for the same object.
-
-  fh_to_dentry (mandatory)
-    Given a filehandle fragment, this should find the implied object and
-    create a dentry for it (possibly with d_obtain_alias).
-
-  fh_to_parent (optional but strongly recommended)
-    Given a filehandle fragment, this should find the parent of the
-    implied object and create a dentry for it (possibly with
-    d_obtain_alias). May fail if the filehandle fragment is too small.
-
-  get_parent (optional but strongly recommended)
-    When given a dentry for a directory, this should return a dentry for
-    the parent. Quite possibly the parent dentry will have been allocated
-    by d_alloc_anon. The default get_parent function just returns an error
-    so any filehandle lookup that requires finding a parent will fail.
-    ->lookup("..") is *not* used as a default as it can leave ".." entries
-    in the dcache which are too messy to work with.
-
-  get_name (optional)
-    When given a parent dentry and a child dentry, this should find a name
-    in the directory identified by the parent dentry, which leads to the
-    object identified by the child dentry. If no get_name function is
-    supplied, a default implementation is provided which uses vfs_readdir
-    to find potential names, and matches inode numbers to find the correct
-    match.
-
-  flags
-    Some filesystems may need to be handled differently than others. The
-    export_operations struct also includes a flags field that allows the
-    filesystem to communicate such information to nfsd. See the Export
-    Operations Flags section below for more explanation.
+super_block. This field must point to a struct export_operations
+which has the following members:
+
+.. kernel-doc:: include/linux/exportfs.h
+   :identifiers: struct export_operations
 
 A filehandle fragment consists of an array of 1 or more 4byte words,
 together with a one byte "type".

fs/9p/acl.c

Lines changed: 3 additions & 13 deletions
@@ -167,17 +167,11 @@ int v9fs_iop_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 		if (retval)
 			goto err_out;
 
-		size = posix_acl_xattr_size(acl->a_count);
-
-		value = kzalloc(size, GFP_NOFS);
+		value = posix_acl_to_xattr(&init_user_ns, acl, &size, GFP_NOFS);
 		if (!value) {
 			retval = -ENOMEM;
 			goto err_out;
 		}
-
-		retval = posix_acl_to_xattr(&init_user_ns, acl, value, size);
-		if (retval < 0)
-			goto err_out;
 	}
 
 	/*
@@ -257,13 +251,10 @@ static int v9fs_set_acl(struct p9_fid *fid, int type, struct posix_acl *acl)
 		return 0;
 
 	/* Set a setxattr request to server */
-	size = posix_acl_xattr_size(acl->a_count);
-	buffer = kmalloc(size, GFP_KERNEL);
+	buffer = posix_acl_to_xattr(&init_user_ns, acl, &size, GFP_KERNEL);
 	if (!buffer)
 		return -ENOMEM;
-	retval = posix_acl_to_xattr(&init_user_ns, acl, buffer, size);
-	if (retval < 0)
-		goto err_free_out;
+
 	switch (type) {
 	case ACL_TYPE_ACCESS:
 		name = XATTR_NAME_POSIX_ACL_ACCESS;
@@ -275,7 +266,6 @@ static int v9fs_set_acl(struct p9_fid *fid, int type, struct posix_acl *acl)
 		BUG();
 	}
 	retval = v9fs_fid_xattr_set(fid, name, buffer, size, 0);
-err_free_out:
 	kfree(buffer);
 	return retval;
 }
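
The call sites above move from "compute size, allocate, convert, check" to a single call that returns a freshly allocated buffer and reports its length through an out-parameter. The following is a minimal user-space sketch of that calling convention only. The names `struct demo_acl`, `struct demo_entry`, and `demo_acl_to_blob()` are hypothetical, and the byte layout is invented for illustration; it is not the kernel's xattr format or implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one ACL entry (8 bytes, no padding). */
struct demo_entry {
	uint16_t tag;
	uint16_t perm;
	uint32_t id;
};

/* Hypothetical stand-in for an ACL: a count plus a small entry array. */
struct demo_acl {
	size_t count;
	struct demo_entry entries[4];
};

/*
 * Serialize @acl into a newly allocated buffer.
 * On success the buffer is returned and *@size holds its length.
 * On allocation failure NULL is returned and *@size is left untouched.
 * The caller owns the buffer and must free() it.
 */
void *demo_acl_to_blob(const struct demo_acl *acl, size_t *size)
{
	uint32_t header = 1;	/* made-up format tag, illustration only */
	size_t len = sizeof(header) + acl->count * sizeof(struct demo_entry);
	unsigned char *buf = malloc(len);

	if (!buf)
		return NULL;

	memcpy(buf, &header, sizeof(header));
	memcpy(buf + sizeof(header), acl->entries,
	       acl->count * sizeof(struct demo_entry));
	*size = len;
	return buf;
}
```

With this shape the only failure a caller has to handle is a NULL return, which is consistent with the hunks above dropping the separate `retval < 0` checks.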

fs/btrfs/acl.c

Lines changed: 3 additions & 7 deletions
@@ -57,7 +57,8 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type, bool rcu)
 int __btrfs_set_acl(struct btrfs_trans_handle *trans, struct inode *inode,
 		    struct posix_acl *acl, int type)
 {
-	int ret, size = 0;
+	int ret;
+	size_t size = 0;
 	const char *name;
 	char AUTO_KFREE(value);
 
@@ -77,20 +78,15 @@ int __btrfs_set_acl(struct btrfs_trans_handle *trans, struct inode *inode,
 	if (acl) {
 		unsigned int nofs_flag;
 
-		size = posix_acl_xattr_size(acl->a_count);
 		/*
 		 * We're holding a transaction handle, so use a NOFS memory
 		 * allocation context to avoid deadlock if reclaim happens.
 		 */
 		nofs_flag = memalloc_nofs_save();
-		value = kmalloc(size, GFP_KERNEL);
+		value = posix_acl_to_xattr(&init_user_ns, acl, &size, GFP_KERNEL);
 		memalloc_nofs_restore(nofs_flag);
 		if (!value)
 			return -ENOMEM;
-
-		ret = posix_acl_to_xattr(&init_user_ns, acl, value, size);
-		if (ret < 0)
-			return ret;
 	}
 
 	if (trans)

fs/buffer.c

Lines changed: 5 additions & 1 deletion
@@ -2354,7 +2354,7 @@ bool block_is_partially_uptodate(struct folio *folio, size_t from, size_t count)
 	if (!head)
 		return false;
 	blocksize = head->b_size;
-	to = min_t(unsigned, folio_size(folio) - from, count);
+	to = min(folio_size(folio) - from, count);
 	to = from + to;
 	if (from < blocksize && to > folio_size(folio) - blocksize)
 		return false;
@@ -2948,6 +2948,10 @@ bool try_to_free_buffers(struct folio *folio)
 	if (folio_test_writeback(folio))
 		return false;
 
+	/* Misconfigured folio check */
+	if (WARN_ON_ONCE(!folio_buffers(folio)))
+		return true;
+
 	if (mapping == NULL) { /* can this still happen? */
 		ret = drop_buffers(folio, &buffers_to_free);
 		goto out;
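
The `min_t(unsigned, ...)` to `min(...)` change in the first hunk relates to the truncation hazard named in the commit message: casting both operands to a narrower type before comparing can discard high bits and pick the wrong value. The user-space sketch below illustrates the effect. `MIN_T` is a simplified, hypothetical stand-in for the kernel's `min_t()` (it evaluates its arguments more than once), and the fixed-width types are an assumption chosen to make the behavior portable.

```c
#include <stdint.h>

/* Simplified stand-in for min_t(): cast both operands, then compare. */
#define MIN_T(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Truncating variant: both values are squeezed into 32 bits first. */
uint64_t min_truncating(uint64_t a, uint64_t b)
{
	return MIN_T(uint32_t, a, b);
}

/* Width-preserving variant: compare at the operands' own width. */
uint64_t min_full_width(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}
```

For `a = 0x100000001` and `b = 16`, the truncating variant reduces `a` to 1 and returns 1, while the full-width variant returns 16.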

fs/ceph/acl.c

Lines changed: 22 additions & 28 deletions
@@ -90,7 +90,8 @@ struct posix_acl *ceph_get_acl(struct inode *inode, int type, bool rcu)
 int ceph_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 		 struct posix_acl *acl, int type)
 {
-	int ret = 0, size = 0;
+	int ret = 0;
+	size_t size = 0;
 	const char *name = NULL;
 	char *value = NULL;
 	struct iattr newattrs;
@@ -126,16 +127,11 @@ int ceph_set_acl(struct mnt_idmap *idmap, struct dentry *dentry,
 	}
 
 	if (acl) {
-		size = posix_acl_xattr_size(acl->a_count);
-		value = kmalloc(size, GFP_NOFS);
+		value = posix_acl_to_xattr(&init_user_ns, acl, &size, GFP_NOFS);
 		if (!value) {
 			ret = -ENOMEM;
 			goto out;
 		}
-
-		ret = posix_acl_to_xattr(&init_user_ns, acl, value, size);
-		if (ret < 0)
-			goto out_free;
 	}
 
 	if (new_mode != old_mode) {
@@ -172,7 +168,7 @@ int ceph_pre_init_acls(struct inode *dir, umode_t *mode,
 	struct posix_acl *acl, *default_acl;
 	size_t val_size1 = 0, val_size2 = 0;
 	struct ceph_pagelist *pagelist = NULL;
-	void *tmp_buf = NULL;
+	void *tmp_buf1 = NULL, *tmp_buf2 = NULL;
 	int err;
 
 	err = posix_acl_create(dir, mode, &default_acl, &acl);
@@ -192,15 +188,7 @@ int ceph_pre_init_acls(struct inode *dir, umode_t *mode,
 	if (!default_acl && !acl)
 		return 0;
 
-	if (acl)
-		val_size1 = posix_acl_xattr_size(acl->a_count);
-	if (default_acl)
-		val_size2 = posix_acl_xattr_size(default_acl->a_count);
-
 	err = -ENOMEM;
-	tmp_buf = kmalloc(max(val_size1, val_size2), GFP_KERNEL);
-	if (!tmp_buf)
-		goto out_err;
 	pagelist = ceph_pagelist_alloc(GFP_KERNEL);
 	if (!pagelist)
 		goto out_err;
@@ -213,34 +201,39 @@ int ceph_pre_init_acls(struct inode *dir, umode_t *mode,
 
 	if (acl) {
 		size_t len = strlen(XATTR_NAME_POSIX_ACL_ACCESS);
+
+		err = -ENOMEM;
+		tmp_buf1 = posix_acl_to_xattr(&init_user_ns, acl,
+					      &val_size1, GFP_KERNEL);
+		if (!tmp_buf1)
+			goto out_err;
 		err = ceph_pagelist_reserve(pagelist, len + val_size1 + 8);
 		if (err)
 			goto out_err;
 		ceph_pagelist_encode_string(pagelist, XATTR_NAME_POSIX_ACL_ACCESS,
 					    len);
-		err = posix_acl_to_xattr(&init_user_ns, acl,
-					 tmp_buf, val_size1);
-		if (err < 0)
-			goto out_err;
 		ceph_pagelist_encode_32(pagelist, val_size1);
-		ceph_pagelist_append(pagelist, tmp_buf, val_size1);
+		ceph_pagelist_append(pagelist, tmp_buf1, val_size1);
 	}
 	if (default_acl) {
 		size_t len = strlen(XATTR_NAME_POSIX_ACL_DEFAULT);
+
+		err = -ENOMEM;
+		tmp_buf2 = posix_acl_to_xattr(&init_user_ns, default_acl,
+					      &val_size2, GFP_KERNEL);
+		if (!tmp_buf2)
+			goto out_err;
 		err = ceph_pagelist_reserve(pagelist, len + val_size2 + 8);
 		if (err)
 			goto out_err;
 		ceph_pagelist_encode_string(pagelist,
 					    XATTR_NAME_POSIX_ACL_DEFAULT, len);
-		err = posix_acl_to_xattr(&init_user_ns, default_acl,
-					 tmp_buf, val_size2);
-		if (err < 0)
-			goto out_err;
 		ceph_pagelist_encode_32(pagelist, val_size2);
-		ceph_pagelist_append(pagelist, tmp_buf, val_size2);
+		ceph_pagelist_append(pagelist, tmp_buf2, val_size2);
 	}
 
-	kfree(tmp_buf);
+	kfree(tmp_buf1);
+	kfree(tmp_buf2);
 
 	as_ctx->acl = acl;
 	as_ctx->default_acl = default_acl;
@@ -250,7 +243,8 @@ int ceph_pre_init_acls(struct inode *dir, umode_t *mode,
 out_err:
 	posix_acl_release(acl);
 	posix_acl_release(default_acl);
-	kfree(tmp_buf);
+	kfree(tmp_buf1);
+	kfree(tmp_buf2);
 	if (pagelist)
 		ceph_pagelist_release(pagelist);
 	return err;

fs/char_dev.c

Lines changed: 8 additions & 11 deletions
@@ -10,6 +10,7 @@
 #include <linux/kdev_t.h>
 #include <linux/slab.h>
 #include <linux/string.h>
+#include <linux/cleanup.h>
 
 #include <linux/major.h>
 #include <linux/errno.h>
@@ -97,7 +98,8 @@ static struct char_device_struct *
 __register_chrdev_region(unsigned int major, unsigned int baseminor,
 			 int minorct, const char *name)
 {
-	struct char_device_struct *cd, *curr, *prev = NULL;
+	struct char_device_struct *cd __free(kfree) = NULL;
+	struct char_device_struct *curr, *prev = NULL;
 	int ret;
 	int i;
 
@@ -117,14 +119,14 @@ __register_chrdev_region(unsigned int major, unsigned int baseminor,
 	if (cd == NULL)
 		return ERR_PTR(-ENOMEM);
 
-	mutex_lock(&chrdevs_lock);
+	guard(mutex)(&chrdevs_lock);
 
 	if (major == 0) {
 		ret = find_dynamic_major();
 		if (ret < 0) {
 			pr_err("CHRDEV \"%s\" dynamic allocation region is full\n",
 			       name);
-			goto out;
+			return ERR_PTR(ret);
 		}
 		major = ret;
 	}
@@ -144,7 +146,7 @@ __register_chrdev_region(unsigned int major, unsigned int baseminor,
 		if (curr->baseminor >= baseminor + minorct)
 			break;
 
-		goto out;
+		return ERR_PTR(ret);
 	}
 
 	cd->major = major;
@@ -160,12 +162,7 @@ __register_chrdev_region(unsigned int major, unsigned int baseminor,
 		prev->next = cd;
 	}
 
-	mutex_unlock(&chrdevs_lock);
-	return cd;
-out:
-	mutex_unlock(&chrdevs_lock);
-	kfree(cd);
-	return ERR_PTR(ret);
+	return_ptr(cd);
 }
 
 static struct char_device_struct *
@@ -343,7 +340,7 @@ void __unregister_chrdev(unsigned int major, unsigned int baseminor,
 	kfree(cd);
 }
 
-static DEFINE_SPINLOCK(cdev_lock);
+static __cacheline_aligned_in_smp DEFINE_SPINLOCK(cdev_lock);
 
 static struct kobject *cdev_get(struct cdev *p)
 {
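
The chardev hunks above replace the `goto out` unwind path with scope-based cleanup: `guard(mutex)` drops the lock when the scope ends, `__free(kfree)` frees the allocation on every early return, and `return_ptr()` hands ownership to the caller. The kernel helpers are built on the compiler's `cleanup` variable attribute. The sketch below is a user-space approximation for GCC or Clang; `DEMO_GUARD`, `DEMO_FREE`, `struct demo_lock`, and `register_slot()` are all hypothetical names, and the "lock" is just a depth counter so that acquisition and release can be observed.

```c
#include <stdlib.h>

/* Hypothetical lock: a depth counter is enough to observe lock/unlock. */
struct demo_lock {
	int depth;
};

static struct demo_lock registry_lock;
static int registry_used;	/* slots handed out so far */

static void demo_unlock_cleanup(struct demo_lock **l)
{
	(*l)->depth--;
}

static void demo_free_cleanup(void *p)
{
	free(*(void **)p);
}

/* Take the lock now; release it when the enclosing scope ends. */
#define DEMO_GUARD(lock)						\
	struct demo_lock *demo_guard_					\
	__attribute__((cleanup(demo_unlock_cleanup), unused)) =	\
		((lock)->depth++, (lock))

/* Free the pointer on scope exit unless ownership was handed off. */
#define DEMO_FREE __attribute__((cleanup(demo_free_cleanup)))

int *register_slot(int value, int capacity)
{
	int *slot DEMO_FREE = malloc(sizeof(*slot));
	int *ret;

	if (!slot)
		return NULL;

	DEMO_GUARD(&registry_lock);

	if (registry_used >= capacity)
		return NULL;	/* lock released and slot freed automatically */

	*slot = value;
	registry_used++;

	/* Hand ownership to the caller, in the spirit of return_ptr(). */
	ret = slot;
	slot = NULL;
	return ret;
}
```

Cleanup handlers run in reverse declaration order, so in this sketch the lock is released before the allocation is freed, which mirrors the order of `mutex_unlock()` and `kfree()` in the removed `out:` path.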

fs/dcache.c

Lines changed: 1 addition & 4 deletions
@@ -3237,10 +3237,7 @@ EXPORT_SYMBOL(d_parent_ino);
 static __initdata unsigned long dhash_entries;
 static int __init set_dhash_entries(char *str)
 {
-	if (!str)
-		return 0;
-	dhash_entries = simple_strtoul(str, &str, 0);
-	return 1;
+	return kstrtoul(str, 0, &dhash_entries) == 0;
 }
 __setup("dhash_entries=", set_dhash_entries);
 
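
The commit message describes the `simple_strtoul()` to `kstrtoul()` conversion as adding proper error handling. The practical difference is that a lenient parser accepts a leading number and ignores trailing characters, while a strict parser rejects the whole string. The user-space sketch below contrasts the two using `strtoul()`. Both function names are hypothetical, and the strict variant only approximates `kstrtoul()` semantics (0 on success, negative errno on failure, one trailing newline tolerated).

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Lenient: parse a leading number and ignore whatever follows. */
unsigned long parse_ulong_lenient(const char *s)
{
	return strtoul(s, NULL, 0);
}

/*
 * Strict: the whole string must be a number, optionally followed by a
 * single newline. Returns 0 and stores the value in *out on success,
 * or a negative errno value on failure (*out is left untouched).
 */
int parse_ulong_strict(const char *s, unsigned long *out)
{
	char *end;
	unsigned long val;

	if (*s == '+')
		s++;
	if (!isdigit((unsigned char)*s))
		return -EINVAL;	/* rejects "", "-1" and leading spaces */

	errno = 0;
	val = strtoul(s, &end, 0);
	if (errno == ERANGE)
		return -ERANGE;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;	/* trailing garbage such as "4096abc" */

	*out = val;
	return 0;
}
```

Given the input `4096abc`, the lenient parser returns 4096, whereas the strict parser returns `-EINVAL` and leaves the destination unchanged.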

fs/exec.c

Lines changed: 1 addition & 1 deletion
@@ -555,7 +555,7 @@ int copy_string_kernel(const char *arg, struct linux_binprm *bprm)
 		return -E2BIG;
 
 	while (len > 0) {
-		unsigned int bytes_to_copy = min_t(unsigned int, len,
+		unsigned int bytes_to_copy = min(len,
 				min_not_zero(offset_in_page(pos), PAGE_SIZE));
 		struct page *page;
 
561561

fs/ext4/mballoc.c

Lines changed: 1 addition & 2 deletions
@@ -4276,8 +4276,7 @@ void ext4_mb_mark_bb(struct super_block *sb, ext4_fsblk_t block,
 		 * get the corresponding group metadata to work with.
 		 * For this we have goto again loop.
 		 */
-		thisgrp_len = min_t(unsigned int, (unsigned int)len,
-			EXT4_BLOCKS_PER_GROUP(sb) - EXT4_C2B(sbi, blkoff));
+		thisgrp_len = min(len, EXT4_BLOCKS_PER_GROUP(sb) - EXT4_C2B(sbi, blkoff));
 		clen = EXT4_NUM_B2C(sbi, thisgrp_len);
 
 		if (!ext4_sb_block_valid(sb, NULL, block, thisgrp_len)) {

fs/ext4/resize.c

Lines changed: 1 addition & 1 deletion
@@ -1479,7 +1479,7 @@ static void ext4_update_super(struct super_block *sb,
 
 	/* Update the global fs size fields */
 	sbi->s_groups_count += flex_gd->count;
-	sbi->s_blockfile_groups = min_t(ext4_group_t, sbi->s_groups_count,
+	sbi->s_blockfile_groups = min(sbi->s_groups_count,
 			(EXT4_MAX_BLOCK_FILE_PHYS / EXT4_BLOCKS_PER_GROUP(sb)));
 
 	/* Update the reserved block counts only once the new group is
