Skip to content

Commit a2b9ad4

Browse files
fdmananagregkh
authored andcommitted
btrfs: send, orphanize first all conflicting inodes when processing references
commit 98272bb upstream. When doing an incremental send it is possible that when processing the new references for an inode we end up issuing rename or link operations that have an invalid path, which contains the orphanized name of a directory before we actually orphanized it, causing the receiver to fail. The following reproducer triggers such scenario: $ cat reproducer.sh #!/bin/bash mkfs.btrfs -f /dev/sdi >/dev/null mount /dev/sdi /mnt/sdi touch /mnt/sdi/a touch /mnt/sdi/b mkdir /mnt/sdi/testdir # We want "a" to have a lower inode number then "testdir" (257 vs 259). mv /mnt/sdi/a /mnt/sdi/testdir/a # Filesystem looks like: # # . (ino 256) # |----- testdir/ (ino 259) # | |----- a (ino 257) # | # |----- b (ino 258) btrfs subvolume snapshot -r /mnt/sdi /mnt/sdi/snap1 btrfs send -f /tmp/snap1.send /mnt/sdi/snap1 # Now rename 259 to "testdir_2", then change the name of 257 to # "testdir" and make it a direct descendant of the root inode (256). # Also create a new link for inode 257 with the old name of inode 258. # By swapping the names and location of several inodes and create a # nasty dependency chain of rename and link operations. mv /mnt/sdi/testdir/a /mnt/sdi/a2 touch /mnt/sdi/testdir/a mv /mnt/sdi/b /mnt/sdi/b2 ln /mnt/sdi/a2 /mnt/sdi/b mv /mnt/sdi/testdir /mnt/sdi/testdir_2 mv /mnt/sdi/a2 /mnt/sdi/testdir # Filesystem now looks like: # # . (ino 256) # |----- testdir_2/ (ino 259) # | |----- a (ino 260) # | # |----- testdir (ino 257) # |----- b (ino 257) # |----- b2 (ino 258) btrfs subvolume snapshot -r /mnt/sdi /mnt/sdi/snap2 btrfs send -f /tmp/snap2.send -p /mnt/sdi/snap1 /mnt/sdi/snap2 mkfs.btrfs -f /dev/sdj >/dev/null mount /dev/sdj /mnt/sdj btrfs receive -f /tmp/snap1.send /mnt/sdj btrfs receive -f /tmp/snap2.send /mnt/sdj umount /mnt/sdi umount /mnt/sdj When running the reproducer, the receive of the incremental send stream fails: $ ./reproducer.sh Create a readonly snapshot of '/mnt/sdi' in '/mnt/sdi/snap1' At subvol /mnt/sdi/snap1 Create a readonly snapshot of '/mnt/sdi' in '/mnt/sdi/snap2' At subvol /mnt/sdi/snap2 At subvol snap1 At snapshot snap2 ERROR: link b -> o259-6-0/a failed: No such file or directory The problem happens because of the following: 1) Before we start iterating the list of new references for inode 257, we generate its current path and store it at @valid_path, done at the very beginning of process_recorded_refs(). The generated path is "o259-6-0/a", containing the orphanized name for inode 259; 2) Then we iterate over the list of new references, which has the references "b" and "testdir" in that specific order; 3) We process reference "b" first, because it is in the list before reference "testdir". We then issue a link operation to create the new reference "b" using a target path corresponding to the content at @valid_path, which corresponds to "o259-6-0/a". However we haven't yet orphanized inode 259, its name is still "testdir", and not "o259-6-0". The orphanization of 259 did not happen yet because we will process the reference named "testdir" for inode 257 only in the next iteration of the loop that goes over the list of new references. Fix the issue by having a preliminar iteration over all the new references at process_recorded_refs(). This iteration is responsible only for doing the orphanization of other inodes that have and old reference that conflicts with one of the new references of the inode we are currently processing. The emission of rename and link operations happen now in the next iteration of the new references. A test case for fstests will follow soon. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1 parent 3dfe2dd commit a2b9ad4

1 file changed

Lines changed: 87 additions & 40 deletions

File tree

fs/btrfs/send.c

Lines changed: 87 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -3880,52 +3880,56 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move)
38803880
goto out;
38813881
}
38823882

3883+
/*
3884+
* Before doing any rename and link operations, do a first pass on the
3885+
* new references to orphanize any unprocessed inodes that may have a
3886+
* reference that conflicts with one of the new references of the current
3887+
* inode. This needs to happen first because a new reference may conflict
3888+
* with the old reference of a parent directory, so we must make sure
3889+
* that the path used for link and rename commands don't use an
3890+
* orphanized name when an ancestor was not yet orphanized.
3891+
*
3892+
* Example:
3893+
*
3894+
* Parent snapshot:
3895+
*
3896+
* . (ino 256)
3897+
* |----- testdir/ (ino 259)
3898+
* | |----- a (ino 257)
3899+
* |
3900+
* |----- b (ino 258)
3901+
*
3902+
* Send snapshot:
3903+
*
3904+
* . (ino 256)
3905+
* |----- testdir_2/ (ino 259)
3906+
* | |----- a (ino 260)
3907+
* |
3908+
* |----- testdir (ino 257)
3909+
* |----- b (ino 257)
3910+
* |----- b2 (ino 258)
3911+
*
3912+
* Processing the new reference for inode 257 with name "b" may happen
3913+
* before processing the new reference with name "testdir". If so, we
3914+
* must make sure that by the time we send a link command to create the
3915+
* hard link "b", inode 259 was already orphanized, since the generated
3916+
* path in "valid_path" already contains the orphanized name for 259.
3917+
* We are processing inode 257, so only later when processing 259 we do
3918+
* the rename operation to change its temporary (orphanized) name to
3919+
* "testdir_2".
3920+
*/
38833921
list_for_each_entry(cur, &sctx->new_refs, list) {
3884-
/*
3885-
* We may have refs where the parent directory does not exist
3886-
* yet. This happens if the parent directories inum is higher
3887-
* than the current inum. To handle this case, we create the
3888-
* parent directory out of order. But we need to check if this
3889-
* did already happen before due to other refs in the same dir.
3890-
*/
38913922
ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen);
38923923
if (ret < 0)
38933924
goto out;
3894-
if (ret == inode_state_will_create) {
3895-
ret = 0;
3896-
/*
3897-
* First check if any of the current inodes refs did
3898-
* already create the dir.
3899-
*/
3900-
list_for_each_entry(cur2, &sctx->new_refs, list) {
3901-
if (cur == cur2)
3902-
break;
3903-
if (cur2->dir == cur->dir) {
3904-
ret = 1;
3905-
break;
3906-
}
3907-
}
3908-
3909-
/*
3910-
* If that did not happen, check if a previous inode
3911-
* did already create the dir.
3912-
*/
3913-
if (!ret)
3914-
ret = did_create_dir(sctx, cur->dir);
3915-
if (ret < 0)
3916-
goto out;
3917-
if (!ret) {
3918-
ret = send_create_inode(sctx, cur->dir);
3919-
if (ret < 0)
3920-
goto out;
3921-
}
3922-
}
3925+
if (ret == inode_state_will_create)
3926+
continue;
39233927

39243928
/*
3925-
* Check if this new ref would overwrite the first ref of
3926-
* another unprocessed inode. If yes, orphanize the
3927-
* overwritten inode. If we find an overwritten ref that is
3928-
* not the first ref, simply unlink it.
3929+
* Check if this new ref would overwrite the first ref of another
3930+
* unprocessed inode. If yes, orphanize the overwritten inode.
3931+
* If we find an overwritten ref that is not the first ref,
3932+
* simply unlink it.
39293933
*/
39303934
ret = will_overwrite_ref(sctx, cur->dir, cur->dir_gen,
39313935
cur->name, cur->name_len,
@@ -4004,6 +4008,49 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move)
40044008
}
40054009
}
40064010

4011+
}
4012+
4013+
list_for_each_entry(cur, &sctx->new_refs, list) {
4014+
/*
4015+
* We may have refs where the parent directory does not exist
4016+
* yet. This happens if the parent directories inum is higher
4017+
* than the current inum. To handle this case, we create the
4018+
* parent directory out of order. But we need to check if this
4019+
* did already happen before due to other refs in the same dir.
4020+
*/
4021+
ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen);
4022+
if (ret < 0)
4023+
goto out;
4024+
if (ret == inode_state_will_create) {
4025+
ret = 0;
4026+
/*
4027+
* First check if any of the current inodes refs did
4028+
* already create the dir.
4029+
*/
4030+
list_for_each_entry(cur2, &sctx->new_refs, list) {
4031+
if (cur == cur2)
4032+
break;
4033+
if (cur2->dir == cur->dir) {
4034+
ret = 1;
4035+
break;
4036+
}
4037+
}
4038+
4039+
/*
4040+
* If that did not happen, check if a previous inode
4041+
* did already create the dir.
4042+
*/
4043+
if (!ret)
4044+
ret = did_create_dir(sctx, cur->dir);
4045+
if (ret < 0)
4046+
goto out;
4047+
if (!ret) {
4048+
ret = send_create_inode(sctx, cur->dir);
4049+
if (ret < 0)
4050+
goto out;
4051+
}
4052+
}
4053+
40074054
if (S_ISDIR(sctx->cur_inode_mode) && sctx->parent_root) {
40084055
ret = wait_for_dest_dir_move(sctx, cur, is_orphan);
40094056
if (ret < 0)

0 commit comments

Comments
 (0)