Commit f135cea
btrfs: fix partial loss of prealloc extent past i_size after fsync
When we have an inode with a prealloc extent that starts at an offset
lower than the i_size and there is another prealloc extent that starts at
an offset beyond i_size, we can end up losing part of the first prealloc
extent (the part that starts at i_size) and have an implicit hole if we
fsync the file and then have a power failure.
Consider the following example with comments explaining how and why it
happens.
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
# Create our test file with 2 consecutive prealloc extents, each with a
# size of 128Kb, and covering the range from 0 to 256Kb, with a file
# size of 0.
$ xfs_io -f -c "falloc -k 0 128K" /mnt/foo
$ xfs_io -c "falloc -k 128K 128K" /mnt/foo
# Fsync the file to record both extents in the log tree.
$ xfs_io -c "fsync" /mnt/foo
# Now do a redudant extent allocation for the range from 0 to 64Kb.
# This will merely increase the file size from 0 to 64Kb. Instead we
# could also do a truncate to set the file size to 64Kb.
$ xfs_io -c "falloc 0 64K" /mnt/foo
# Fsync the file, so we update the inode item in the log tree with the
# new file size (64Kb). This also ends up setting the number of bytes
# for the first prealloc extent to 64Kb. This is done by the truncation
# at btrfs_log_prealloc_extents().
# This means that if a power failure happens after this, a write into
# the file range 64Kb to 128Kb will not use the prealloc extent and
# will result in allocation of a new extent.
$ xfs_io -c "fsync" /mnt/foo
# Now set the file size to 256K with a truncate and then fsync the file.
# Since no changes happened to the extents, the fsync only updates the
# i_size in the inode item at the log tree. This results in an implicit
# hole for the file range from 64Kb to 128Kb, something which fsck will
# complain when not using the NO_HOLES feature if we replay the log
# after a power failure.
$ xfs_io -c "truncate 256K" -c "fsync" /mnt/foo
So instead of always truncating the log to the inode's current i_size at
btrfs_log_prealloc_extents(), check first if there's a prealloc extent
that starts at an offset lower than the i_size and with a length that
crosses the i_size - if there is one, just make sure we truncate to a
size that corresponds to the end offset of that prealloc extent, so
that we don't lose the part of that extent that starts at i_size if a
power failure happens.
A test case for fstests follows soon.
Fixes: 31d11b8 ("Btrfs: fix duplicate extents after fsync of file with prealloc extents")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>1 parent 1402d17 commit f135cea
1 file changed
Lines changed: 40 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4226 | 4226 | | |
4227 | 4227 | | |
4228 | 4228 | | |
| 4229 | + | |
| 4230 | + | |
| 4231 | + | |
4229 | 4232 | | |
4230 | 4233 | | |
4231 | 4234 | | |
| |||
4240 | 4243 | | |
4241 | 4244 | | |
4242 | 4245 | | |
| 4246 | + | |
| 4247 | + | |
| 4248 | + | |
| 4249 | + | |
| 4250 | + | |
| 4251 | + | |
| 4252 | + | |
| 4253 | + | |
| 4254 | + | |
| 4255 | + | |
| 4256 | + | |
| 4257 | + | |
| 4258 | + | |
| 4259 | + | |
| 4260 | + | |
| 4261 | + | |
| 4262 | + | |
| 4263 | + | |
| 4264 | + | |
| 4265 | + | |
| 4266 | + | |
| 4267 | + | |
| 4268 | + | |
| 4269 | + | |
| 4270 | + | |
| 4271 | + | |
| 4272 | + | |
| 4273 | + | |
| 4274 | + | |
| 4275 | + | |
| 4276 | + | |
| 4277 | + | |
| 4278 | + | |
| 4279 | + | |
4243 | 4280 | | |
4244 | | - | |
4245 | | - | |
| 4281 | + | |
| 4282 | + | |
4246 | 4283 | | |
4247 | 4284 | | |
4248 | 4285 | | |
| |||
4280 | 4317 | | |
4281 | 4318 | | |
4282 | 4319 | | |
4283 | | - | |
| 4320 | + | |
4284 | 4321 | | |
4285 | 4322 | | |
4286 | 4323 | | |
| |||
0 commit comments