Deprecated
This page is deprecated and may contain information that is no longer up to date.
Truncated disk replica
How to reproduce the problem
Archive a file to tape:
[itctabuild02] ~ > echo -n '1234567890' > ten_byte_file.txt
[itctabuild02] ~ > run_eosuser1_shell
[itctabuild02] ~ (krb5=eosuser1)> xrdcp ten_byte_file.txt root://localhost//eos/dev/userfiles/testdir_1
[10B/10B][100%][==================================================][10B/s]
[itctabuild02] ~ (krb5=eosuser1)> exit
exit
[itctabuild02] ~ >
Observe that the file is only on tape (d0::t1):
[itctabuild02] ~ > run_eosuser1_shell
[itctabuild02] ~ (krb5=eosuser1)> eos root://localhost ls -y /eos/dev/userfiles/testdir_1/ten_byte_file.txt
d0::t1 -rw-r--r-- 1 eosuser1 eosuser1 10 Apr 16 11:14 ten_byte_file.txt
[itctabuild02] ~ (krb5=eosuser1)> exit
exit
[itctabuild02] ~ >
Request that the file be retrieved from tape:
[itctabuild02] ~ > run_eospoweruser1_shell
[itctabuild02] ~ (krb5=eospoweruser1)> xrdfs localhost prepare -s /eos/dev/userfiles/testdir_1/ten_byte_file.txt
eos:044620011458020202280000000001000042:e03bfa81.5e982134:11
[itctabuild02] ~ (krb5=eospoweruser1)> exit
exit
[itctabuild02] ~ >
Observe that the file is both on disk and on tape (d1::t1):
[itctabuild02] ~ > run_eosuser1_shell
[itctabuild02] ~ (krb5=eosuser1)> eos root://localhost ls -y /eos/dev/userfiles/testdir_1/ten_byte_file.txt
d1::t1 -rw-r--r-- 2 eosuser1 eosuser1 10 Apr 16 11:14 ten_byte_file.txt
[itctabuild02] ~ (krb5=eosuser1)> exit
exit
[itctabuild02] ~ >
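The first field of the `eos ls -y` output can also be checked from a script. The following is a minimal sketch, assuming the field always has the form `dN::tM` seen in the listings above and that N and M are the disk and tape replica counts. The function names `disk_replicas`, `tape_replicas` and `replica_state` are hypothetical helpers, not part of EOS.

```shell
# Hypothetical helpers: interpret the first field of `eos ls -y` output.
# Assumption: the field looks like "d1::t1" (disk count, then tape count).

# Print the number after "d" (disk replicas).
disk_replicas() {
    flag=${1%%::*}            # "d1::t1" -> "d1"
    printf '%s\n' "${flag#d}"
}

# Print the number after "t" (tape replicas).
tape_replicas() {
    flag=${1##*::}            # "d1::t1" -> "t1"
    printf '%s\n' "${flag#t}"
}

# Print a short description of where the file currently lives.
replica_state() {
    d=$(disk_replicas "$1")
    t=$(tape_replicas "$1")
    if [ "$d" -gt 0 ] && [ "$t" -gt 0 ]; then
        echo "disk+tape"
    elif [ "$t" -gt 0 ]; then
        echo "tape-only"
    elif [ "$d" -gt 0 ]; then
        echo "disk-only"
    else
        echo "none"
    fi
}

# Possible usage against a live instance (not run here):
#   flag=$(eos root://localhost ls -y "$path" | awk '{print $1}')
#   replica_state "$flag"
```

Note that this only reflects what the EOS namespace believes. As the next steps show, the namespace can report a disk replica even when the physical file has been truncated.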
Determine the location of the underlying physical file on the EOS FST and truncate it:
[itctabuild02] ~ > sudo eos root://localhost fileinfo /eos/dev/userfiles/testdir_1/ten_byte_file.txt --fullpath
File: '/eos/dev/userfiles/testdir_1/ten_byte_file.txt' Flags: 0644
Size: 10
Modify: Thu Apr 16 11:14:35 2020 Timestamp: 1587028475.686152000
Change: Thu Apr 16 11:15:11 2020 Timestamp: 1587028511.310849367
Birth : Thu Apr 16 11:14:35 2020 Timestamp: 1587028475.645430951
CUid: 19227 CGid: 1487 Fxid: 00000010 Fid: 16 Pid: 15 Pxid: 0000000f
XStype: adler XS: 0b 2c 02 0e ETAGs: "4294967296:0b2c020e"
Layout: replica Stripes: 1 Blocksize: 4k LayoutId: 00100012
#Rep: 2
┌───┬──────┬────────────────────────┬────────────────┬────────────────────────────────────────────┬──────────┬──────────────┬────────────┬────────┬────────────────────────┬──────────────────────────────────────────────────────────────┐
│no.│ fs-id│ host│ schedgroup│ path│ boot│ configstatus│ drain│ active│ geotag│ physical location│
└───┴──────┴────────────────────────┴────────────────┴────────────────────────────────────────────┴──────────┴──────────────┴────────────┴────────┴────────────────────────┴──────────────────────────────────────────────────────────────┘
0 65535 localhost tape.0 /does_not_exist off nodrain offline /does_not_exist/00000000/00000010
1 2 itctabuild02.cern.ch spinner.0 /run/media/smurray/250GB/fst_spinner_storage booted rw nodrain online flat /run/media/smurray/250GB/fst_spinner_storage/00000000/00000010
*******
[itctabuild02] ~ > echo -n | sudo tee /run/media/smurray/250GB/fst_spinner_storage/00000000/00000010
[itctabuild02] ~ >
Copy the file out as an end user and print the exit code. The copy reports success (exit code 0) when it should fail:
[itctabuild02] ~ > run_eosuser1_shell
[itctabuild02] ~ (krb5=eosuser1)> xrdcp root://localhost//eos/dev/userfiles/testdir_1/ten_byte_file.txt /tmp/tmp_ten_byte_file.txt
[0B/0B][100%][==================================================][0B/s]
[itctabuild02] ~ (krb5=eosuser1)> echo $?
0
[itctabuild02] ~ (krb5=eosuser1)> exit
exit
[itctabuild02] ~ >
Observe that the copied file has zero length even though the copy reported success, which is an error:
[itctabuild02] ~ > stat /tmp/tmp_ten_byte_file.txt
File: ‘/tmp/tmp_ten_byte_file.txt’
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 801h/2049d Inode: 3014674 Links: 1
Access: (0644/-rw-r--r--) Uid: (19214/ smurray) Gid: ( 1000/ smurray)
Context: unconfined_u:object_r:user_tmp_t:s0
Access: 2020-04-16 11:16:49.369981786 +0200
Modify: 2020-04-16 11:16:49.369981786 +0200
Change: 2020-04-16 11:16:49.369981786 +0200
Birth: -
[itctabuild02] ~ >
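Because the exit code of the copy cannot be trusted in this situation, a client can compare the copied file against the size and checksum reported by `fileinfo` (`Size: 10`, `XS: 0b 2c 02 0e`, type adler). The following is a minimal sketch of such a check. The helper names `file_size`, `adler32` and `verify_copy` are hypothetical, and the pure-shell Adler-32 loop is only practical for small files.

```shell
# Hypothetical helpers: verify a copied file against expected metadata.

# Print the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# Print the Adler-32 checksum of a file as 8 lowercase hex digits.
# Pure shell and slow; intended for small files only.
adler32() {
    a=1
    b=0
    for byte in $(od -An -v -tu1 "$1"); do
        a=$(( (a + byte) % 65521 ))
        b=$(( (b + a) % 65521 ))
    done
    printf '%08x\n' $(( (b << 16) | a ))
}

# verify_copy EXPECTED_SIZE EXPECTED_ADLER32 FILE
# Returns 0 if both match, 1 otherwise (with a message on stderr).
verify_copy() {
    actual_size=$(file_size "$3")
    if [ "$actual_size" -ne "$1" ]; then
        echo "size mismatch: expected $1, got $actual_size" >&2
        return 1
    fi
    actual_xs=$(adler32 "$3")
    if [ "$actual_xs" != "$2" ]; then
        echo "checksum mismatch: expected $2, got $actual_xs" >&2
        return 1
    fi
}
```

For the file in this example, `verify_copy 10 0b2c020e /tmp/tmp_ten_byte_file.txt` would fail on the size check for the truncated copy.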
Observe that EOS ignores an end user’s request to retrieve the file from tape because EOS believes the disk replica already exists:
[itctabuild02] ~ > run_eospoweruser1_shell
[itctabuild02] ~ (krb5=eospoweruser1)> xrdfs localhost prepare -s /eos/dev/userfiles/testdir_1/ten_byte_file.txt
eos:044620011458020202280000000001000042:e03bfa81.5e982134:12
[itctabuild02] ~ (krb5=eospoweruser1)> exit
exit
[itctabuild02] ~ >
[itctabuild02] ~ > grep 'nothing to prepare' /var/log/eos/mgm/xrdlog.mgm
200416 11:39:05 time=1587029945.244306 func=HandleProtoMethodPrepareEvent level=INFO logid=static.............................. unit=mgm@itctabuild02.cern.ch:1094 tid=00007f5c442fa700 source=WFE:1666 tident= sec=(null) uid=99 gid=99 name=- geo="" File /eos/dev/userfiles/testdir_1/ten_byte_file.txt is already on disk, nothing to prepare.
[itctabuild02] ~ >
How an end user can recover the data
Ask EOS to evict the disk replica:
[itctabuild02] ~ > run_eospoweruser1_shell
[itctabuild02] ~ (krb5=eospoweruser1)> xrdfs localhost prepare -e /eos/dev/userfiles/testdir_1/ten_byte_file.txt
[itctabuild02] ~ (krb5=eospoweruser1)> exit
exit
[itctabuild02] ~ >
Observe that EOS now recognises that the disk replica is gone:
[itctabuild02] ~ > run_eosuser1_shell
[itctabuild02] ~ (krb5=eosuser1)> eos root://localhost ls -y /eos/dev/userfiles/testdir_1/ten_byte_file.txt
d0::t1 -rw-r--r-- 1 eosuser1 eosuser1 10 Apr 16 11:14 ten_byte_file.txt
[itctabuild02] ~ (krb5=eosuser1)> exit
exit
[itctabuild02] ~ >
Request that the file be retrieved from tape:
[itctabuild02] ~ > run_eospoweruser1_shell
[itctabuild02] ~ (krb5=eospoweruser1)> xrdfs localhost prepare -s /eos/dev/userfiles/testdir_1/ten_byte_file.txt
eos:044620011458020202280000000001000042:e03bfa81.5e982134:13
[itctabuild02] ~ (krb5=eospoweruser1)> exit
exit
[itctabuild02] ~ >
Observe that the file is both on disk and on tape (d1::t1):
[itctabuild02] ~ > run_eosuser1_shell
[itctabuild02] ~ (krb5=eosuser1)> eos root://localhost ls -y /eos/dev/userfiles/testdir_1/ten_byte_file.txt
d1::t1 -rw-r--r-- 2 eosuser1 eosuser1 10 Apr 16 11:14 ten_byte_file.txt
[itctabuild02] ~ (krb5=eosuser1)> exit
exit
[itctabuild02] ~ >
Copy the recovered file out:
[itctabuild02] ~ > run_eosuser1_shell
[itctabuild02] ~ (krb5=eosuser1)> xrdcp root://localhost//eos/dev/userfiles/testdir_1/ten_byte_file.txt /tmp/tmp_ten_byte_file.txt
[10B/10B][100%][==================================================][10B/s]
[itctabuild02] ~ (krb5=eosuser1)> exit
exit
[itctabuild02] ~ > cat /tmp/tmp_ten_byte_file.txt; echo
1234567890
[itctabuild02] ~ >
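The evict and retrieve steps above can be wrapped in a small script. The following is a sketch only, assuming it runs under an identity that is allowed to issue `prepare` requests (eospoweruser1 in the transcripts above). The names `run`, `recover_from_tape` and the `DRY_RUN` variable are hypothetical; the `xrdfs` invocations are the ones used on this page.

```shell
# Hypothetical wrapper around the manual recovery steps.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# recover_from_tape EOS_PATH
# Evict the (truncated) disk replica, then request retrieval from tape.
recover_from_tape() {
    path=$1
    run xrdfs localhost prepare -e "$path" || return 1   # evict disk replica
    run xrdfs localhost prepare -s "$path" || return 1   # retrieve from tape
}
```

With `DRY_RUN=1 recover_from_tape /eos/dev/userfiles/testdir_1/ten_byte_file.txt` the two commands are printed rather than run. The script does not wait for the retrieval to finish, so the `eos ls -y` check for d1::t1 is still needed before copying the file out.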
What a tape operator can do to recover the data
A tape operator can follow the same procedure as an end user.