Deprecated
This page is deprecated and may contain information that is no longer up to date.
Checksums¶
Summary¶
Files stored on tape always have an ADLER32 checksum.
The client software should create this checksum before the file is transferred to the EOSCTA endpoint. This provides a complete end-to-end integrity check from file creation to storage on tape media.
If a file is transferred to the EOSCTA endpoint without an ADLER32 checksum, it will be calculated by EOS and added to the file metadata. In this case, EOSCTA can only check the integrity of the final part of the transfer, between the EOS disk cache and CTA (tape).
It is strongly recommended that an ADLER32 checksum be provided for every file before it is transferred into EOSCTA. If no checksum is provided, it is the responsibility of the client software to verify the integrity of the transfer between the point of file creation and the EOSCTA endpoint.
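As a minimal sketch of the client-side step, the following computes the ADLER32 checksum of a local file using Python's standard `zlib` module. The function name `adler32_of_file`, the chunk size, and the 8-digit lowercase hexadecimal output format are assumptions made for illustration; how the value is then attached to the transfer depends on the client software in use.

```python
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the ADLER32 checksum of a file as 8 lowercase hex digits.

    The file is read in chunks so that large files do not need to fit
    in memory. (Hypothetical helper, not part of EOS or CTA.)
    """
    value = 1  # ADLER32 is defined to start from 1, not 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in to continue the checksum
            value = zlib.adler32(chunk, value)
    return f"{value & 0xFFFFFFFF:08x}"
```

Computing the checksum at the point of file creation, rather than after a first copy, is what makes the integrity check end-to-end.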
Technical Details (EOS)¶
At the time of writing, EOS supports the following checksum types:
- ADLER32
- CRC32
- CRC32C
- CRC64
- MD5 hash
- SHA1 hash
- SHA256 hash
- XXH64 hash
As described below, CTA only supports verification of ADLER32 checksums. If another checksum type is provided, CTA will store its value but will not verify that it is correct. EOS and CTA can store multiple checksums of different types for a single file.
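Because several checksums of different types can be stored for one file, a client may wish to compute them in a single pass over the data. The sketch below does this with the Python standard library only, so it covers ADLER32, CRC32, MD5, SHA1 and SHA256; CRC32C, CRC64 and XXH64 are omitted because they would need third-party packages. The function name `checksums_of_file` and the dictionary keys are illustrative assumptions, not EOS or CTA identifiers.

```python
import hashlib
import zlib


def checksums_of_file(path, chunk_size=1024 * 1024):
    """Compute several checksums of one file in a single read pass.

    Returns a dict mapping a (hypothetical) type label to a lowercase
    hexadecimal string.
    """
    adler = 1  # ADLER32 initial value
    crc = 0    # CRC32 initial value
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}

    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            adler = zlib.adler32(chunk, adler)
            crc = zlib.crc32(chunk, crc)
            for digest in digests.values():
                digest.update(chunk)

    result = {"adler32": f"{adler:08x}", "crc32": f"{crc:08x}"}
    result.update({name: d.hexdigest() for name, d in digests.items()})
    return result
```

Of the values produced here, only the ADLER32 one would be verified by CTA; the others would be stored as provided.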
The EOSCTA endpoint should be configured to automatically calculate ADLER32 checksums if they are not provided by the client. The target directory in EOS must therefore have the following extended attribute set:
Technical Details (CTA)¶
All files stored on tape must have an ADLER32 checksum. There are two reasons for this:
- CASTOR used ADLER32 checksums exclusively. CTA is designed to allow the seamless migration of CASTOR files, so the same checksum format is supported by default.
- The tape drives calculate the ADLER32 checksum in hardware and perform a read-after-write test for every file as it is written to tape.
The ADLER32 checksum calculated by the tape drive is compared to the provided ADLER32 checksum in the file metadata to verify the integrity of the transfer from EOS.
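The comparison can be pictured with a small software model. This is only an illustration of the logic, not CTA code: the function name `verify_adler32`, the hexadecimal string representation of the provided checksum, and the use of `zlib` in place of the tape drive's calculation are all assumptions.

```python
import zlib
from typing import Optional


def verify_adler32(data: bytes, provided: Optional[str]) -> bool:
    """Return True only if a checksum was provided and it matches.

    `provided` is assumed to be a hexadecimal string taken from the
    file metadata, or None if no ADLER32 checksum is present.
    """
    if provided is None:
        return False  # no checksum in the metadata: nothing to verify against
    try:
        expected = int(provided, 16)
    except ValueError:
        return False  # malformed value is treated as a mismatch in this model
    computed = zlib.adler32(data) & 0xFFFFFFFF
    return computed == expected
```

In this model a missing checksum and a mismatched checksum are both treated as verification failures.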
If an ADLER32 checksum is not provided to CTA, or the checksum does not match, CTA will not archive the file to tape. (Technically, the file is written to tape but not added to the file catalogue.) The tape server raises an exception and returns an error to EOS, which records it in the file metadata in the EOS namespace:
sys.archive.error="Aug 14 16:05:51.259997 hostname In ArchiveMount::reportJobsBatchWritten(): got an exception"
The explanation for the exception is logged in the tape server logs:
Aug 14 16:05:48 hostname cta-taped: LVL="ERROR" PID="14712" TID="14817"
MSG="In ArchiveMount::reportJobsBatchWritten(): got an exception"
thread="MainThread" tapeDrive="VDSTK1" mountId="21"
exceptionMessageValue="Checksum type expected=NONE actual=ADLER32" ...
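When searching the tape server logs for this kind of failure, it can help to split an entry into its `key="value"` fields. The following sketch does so with a regular expression. It assumes that the fields of one entry are separated by whitespace and that values contain no embedded double quotes; the sample string is the excerpt above joined into one line, without the elided trailing fields. The name `parse_log_fields` is hypothetical.

```python
import re

# Matches fields of the form key="value"; assumes no escaped quotes in values
FIELD = re.compile(r'(\w+)="([^"]*)"')


def parse_log_fields(text):
    """Return the key="value" fields of a log entry as a dict."""
    return dict(FIELD.findall(text))


sample = (
    'Aug 14 16:05:48 hostname cta-taped: LVL="ERROR" PID="14712" TID="14817" '
    'MSG="In ArchiveMount::reportJobsBatchWritten(): got an exception" '
    'thread="MainThread" tapeDrive="VDSTK1" mountId="21" '
    'exceptionMessageValue="Checksum type expected=NONE actual=ADLER32"'
)

fields = parse_log_fields(sample)
if fields.get("LVL") == "ERROR":
    print(fields["tapeDrive"], fields["exceptionMessageValue"])
```

The `exceptionMessageValue` field carries the explanation, so it is usually the one to inspect first.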
Support for other checksum types¶
It is envisaged that CTA will in future support checksum types other than ADLER32, for example MD5. This will require a modification to the tape server, and is not expected to be done until all CASTOR files have been migrated to CTA.