
Zero-length Files

How CASTOR handled zero-length files

CASTOR accepted zero-length files. These were pure metadata files that were never archived to tape, so the "m-bit" was never set for them. When a client read such a file, it was touched on a diskserver so that the client had a physical file to read from.

How EOS handles zero-length files

  • eos cp <zero_length_file> ... creates a normal file consisting of a metadata entry in the MGM and a file replica in the FST with size 0 and a valid Adler32 checksum of 0x01.
  • eos touch creates a metadata entry in the MGM, and recent versions of EOS also create a file replica with size 0 and a valid checksum (earlier versions of eos touch did not create a file replica). The replica does not actually exist on the FST, as zero-length files are served from the MGM. Tested in EOS v4.8.1.
  • There is another difference between eos cp and eos touch from CTA's point of view. Files created with eos cp trigger the CREATE/CLOSEW events which send a message to the CTA Frontend. eos touch does not trigger the CREATE/CLOSEW events (the file is created but CTA is not notified).
  • In case of a failed transfer, the file is deleted by EOS. The CREATE event is sent to CTA but there will be no CLOSEW event.
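The Adler32 value of 0x01 quoted above for a zero-length replica can be checked independently of EOS. The following is a minimal sketch using Python's standard zlib module; the helper name adler32_hex is hypothetical and is not part of EOS or CTA.

```python
import zlib


def adler32_hex(data: bytes) -> str:
    """Return the Adler-32 checksum of data as 8 lowercase hex digits."""
    # Mask to 32 bits so the result is always an unsigned value.
    return f"{zlib.adler32(data) & 0xFFFFFFFF:08x}"


# Adler-32 starts from the value 1, so empty input keeps that initial value.
empty_checksum = adler32_hex(b"")
print(empty_checksum)
```

A zero-length file therefore has a well-defined, non-zero Adler32 checksum, which is why it can count as valid even though no data was written.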

Files imported using gRPC

  • Files migrated from CASTOR to EOS using the gRPC interface are created with no disk replica. This is also true for zero-length files.
  • The fact that there is no replica shows up in eos fileinfo but the file has a valid checksum and copy operations work, so this does not seem to be an issue.

Summary

  • There is now only one "flavour" of zero-length file in the MGM: namespace entry + zero-length disk replica with valid checksum. The disk replica may or may not actually exist on the FST but the MGM behaves the same way in either case.
  • The CTA Frontend has to treat zero-length files as a special case in order to behave consistently for eos touch and xrdcp/eos cp (see below).

How CTA handles zero-length files

When the CTA Frontend receives a CLOSEW event for a zero-length file, it returns SUCCESS but no archive request is queued. This allows the file to be created in the namespace with no tape copy.
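The behaviour described above can be illustrated with a small model. This is only a sketch of the decision logic, not the CTA Frontend code: the ClosewEvent and Frontend classes, the handle_closew method and the "SUCCESS" string are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClosewEvent:
    """Hypothetical stand-in for a CLOSEW notification sent by EOS."""
    path: str
    size: int


@dataclass
class Frontend:
    """Hypothetical model of the Frontend's archive queue."""
    archive_queue: List[str] = field(default_factory=list)

    def handle_closew(self, event: ClosewEvent) -> str:
        # Zero-length files: report success but queue nothing,
        # so the file exists in the namespace with no tape copy.
        if event.size == 0:
            return "SUCCESS"
        # Any other file is queued for archival to tape.
        self.archive_queue.append(event.path)
        return "SUCCESS"


frontend = Frontend()
empty_result = frontend.handle_closew(ClosewEvent("/eos/ctaeos/empty", 0))
data_result = frontend.handle_closew(ClosewEvent("/eos/ctaeos/data", 1024))
```

Both calls return the same status to the caller; only the second one adds an entry to the queue.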

Tape residency status of zero-length files

  • Zero-length files have no tape copy in CTA (same as CASTOR).
  • EOS ls -y reports zero-length files as having zero tape copies (same as CASTOR nsls).
  • EOS stat reports BackupExists as not set. This is correct, but it differs from CASTOR: CASTOR xrdfs stat reported the XRootD BackupExists bit as set, which was a workaround for the CASTOR Backup use case.
  • EOS Archive tool was modified to expect a zero-length file to have BackupExists not set (same as EOS stat).

How should the garbage collector treat zero-length files?

The MGM garbage collector will ignore zero-length files.

Outstanding issues

  • FTS will need some additional logic for the "m-bit" check. A file should be reported as successfully transferred if:
       file.size() == 0 and file is in the namespace
    OR file.size() > 0 and file is on tape
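
The condition above could be expressed as a single predicate. This is a sketch only, not FTS code: the function name transfer_succeeded and its three parameters are hypothetical, and how FTS would actually obtain the namespace and tape status is not covered here.

```python
def transfer_succeeded(size: int, in_namespace: bool, on_tape: bool) -> bool:
    """Return True if a file should be reported as successfully transferred.

    size         -- file size in bytes
    in_namespace -- True if the file has a namespace entry
    on_tape      -- True if the file has a tape copy (the "m-bit" check)
    """
    if size == 0:
        # Zero-length files never get a tape copy, so a namespace entry is enough.
        return in_namespace
    # Files with data must be on tape.
    return on_tape


checks = [
    transfer_succeeded(0, in_namespace=True, on_tape=False),
    transfer_succeeded(0, in_namespace=False, on_tape=False),
    transfer_succeeded(10, in_namespace=True, on_tape=False),
    transfer_succeeded(10, in_namespace=True, on_tape=True),
]
print(checks)
```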
    

References

  • CTA issue #693: Zero-length files in CTA
  • CTA issue #801: CTA Frontend should allow users to create zero-length files
  • CTA issue #697: Import zero-length files from CASTOR
  • EOS issue EOS-3874: FST should not queue zero-length files