Supported tape formats

CTA reads and writes the CTA tape format, which is similar to that of CASTOR.

Additionally, OSM and Enstore tape formats are supported for reading only.

CTA/CASTOR tape format

CTA uses the same AUL file format as CASTOR. CASTOR used several file formats over time, but by 2013, only the AUL format was in use. This format is based on ANSI INCITS 27-1987 and is described in detail on the Tape Labels, ANSI and IBM web page (last updated in 2008).

The AUL format has the following descriptors:

  • Volume Label (VOL1)
  • Header Blocks: Headers (HDR1, HDR2) and User Header Labels (UHL1)
  • Trailer Blocks: Trailers (EOF1, EOF2) and User Trailer Labels (UTL1)
    • The UHLs and UTLs are defined in ANSI X3.27. The general description of the ANSI fields is documented in IBM’s z/OS documentation.

Each of these descriptor labels is contained in an 80-byte tape block of ASCII text. Empty bytes are stored as spaces (0x20). Each label must begin with its 4-byte identifier. Labels are terminated by a file mark: Tape Mark (TM) or End of File (EOF).

VOL1 | HDR1 | HDR2 | UHL1 | TM | DATA | TM | EOF1 | EOF2 | UTL1 | TM
<-------------------------- One data file --------------------------------------------------->

Volumes that have just been initialised contain no data records, just a single ‘header label group’:

VOL1 | HDR1(PRELABEL) | TM
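
The per-file layout described above can be checked mechanically. The following is a minimal sketch, not CTA code: it assumes a hypothetical in-memory model in which each tape block is a `bytes` object and a tape mark is represented by `None`. The names `classify`, `layout` and `make_label` are illustrative only.

```python
LABEL_IDS = {"VOL1", "HDR1", "HDR2", "UHL1", "EOF1", "EOF2", "UTL1"}
FILE_LAYOUT = ["HDR1", "HDR2", "UHL1", "TM", "DATA", "TM",
               "EOF1", "EOF2", "UTL1", "TM"]


def classify(block):
    """Return a label identifier, 'TM' for a tape mark, or 'DATA'."""
    if block is None:  # assumption: None models a tape mark
        return "TM"
    ident = block[:4].decode("ascii", "replace")
    if len(block) == 80 and ident in LABEL_IDS:
        return ident
    return "DATA"


def layout(blocks):
    """Classify each block, collapsing runs of data blocks into one 'DATA'."""
    kinds = []
    for block in blocks:
        kind = classify(block)
        if kind == "DATA" and kinds and kinds[-1] == "DATA":
            continue
        kinds.append(kind)
    return kinds


def make_label(ident):
    """Build a dummy 80-byte label that carries only its identifier."""
    return ident.encode("ascii").ljust(80)  # bytes.ljust pads with spaces


# One file: header group, two data blocks, trailer group.
tape_file = [
    make_label("HDR1"), make_label("HDR2"), make_label("UHL1"), None,
    b"\x00" * 262144, b"\x00" * 1000, None,
    make_label("EOF1"), make_label("EOF2"), make_label("UTL1"), None,
]
assert layout(tape_file) == FILE_LAYOUT
```

A real reader would get tape marks from the drive rather than from a sentinel value, and would not guess whether a block is a label from its content alone. The sketch only illustrates the expected ordering.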

Volume Label (VOLn)

The very first label record on a labelled volume is VOL1. If this label is missing or incorrect, the volume cannot be read any further.

The structure of the volume label (VOL1)

Bytes Length Offset Content
0-3 4 0x00 Volume label indicator: the characters VOL1
4-9 6 0x04 Volume serial number (VSN) (e.g., “AB1234”)
10 1 0x0A Accessibility (left as empty space)
11-23 13 0x0B Reserved (spaces)
24-36 13 0x18 Implementation identifier (left as empty spaces)
37-50 14 0x25 Owner identifier (the string “CASTOR” or STAGESUPERUSER name, padded with spaces)
51-78 28 0x33 Reserved (spaces)
79 1 0x4F Label standard level (1, 3 and 4 are listed as valid in IBM’s documentation. CASTOR uses ASCII ‘3’)

Examples

An example of the beginning of the tape:
00000000 56 4f 4c 31 56 35 32 30 30 31 20 20 20 20 20 20 |VOL1V52001      |
00000010 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
00000020 20 20 20 20 20 43 41 53 54 4f 52 20 20 20 20 20 |     CASTOR     |
00000030 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
00000040 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 33 |               3|
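
As an illustration of the VOL1 table, the sketch below slices an 80-byte block at the documented offsets. It is a minimal example and not taken from the CTA code base; `Vol1Label` and `parse_vol1` are hypothetical names, and the sample block is rebuilt from the hex dump above.

```python
from dataclasses import dataclass

LABEL_SIZE = 80  # every AUL label is one 80-byte ASCII block


@dataclass
class Vol1Label:
    vsn: str             # bytes 4-9
    owner: str           # bytes 37-50, trailing spaces removed
    standard_level: str  # byte 79


def parse_vol1(block: bytes) -> Vol1Label:
    """Slice a VOL1 block using the byte ranges from the table."""
    if len(block) != LABEL_SIZE:
        raise ValueError(f"expected {LABEL_SIZE} bytes, got {len(block)}")
    text = block.decode("ascii")
    if text[0:4] != "VOL1":
        raise ValueError(f"not a VOL1 label: {text[0:4]!r}")
    return Vol1Label(
        vsn=text[4:10],
        owner=text[37:51].rstrip(),
        standard_level=text[79],
    )


# Rebuild the example label: identifier + VSN, spaces up to offset 37,
# a 14-byte owner field, 28 reserved spaces and the label standard level.
example = "VOL1V52001".ljust(37) + "CASTOR".ljust(14) + " " * 28 + "3"
vol1 = parse_vol1(example.encode("ascii"))
print(vol1)
```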

Header Label (HDRn)

HDR1 and HDR2 are normally found together at the beginning of a dataset.

The format for HDR1

Bytes Length Offset Content
0-3 4 0x00 Header label: the characters “HDR1” or “EOF1”
4-20 17 0x04 File identifier: hexadecimal CASTOR NS file ID, left-aligned. nsgetpath -x can be used to find the full CASTOR path name. For a prelabelled tape, ‘PRELABEL’ is used instead of the file ID.
21-26 6 0x15 The volume serial number of the tape.
27-30 4 0x1B File section number: a number (0001 to 9999) that indicates the order of the volume within the multivolume aggregate. This number is always 0001 for a single volume data set.
31-34 4 0x1F File sequence number: a number that indicates the relative position of the data set within a multiple data set group (aggregate). CASTOR stores the file sequence number (fseq) modulo 10000.
35-38 4 0x23 Generation number: ‘0001’ in CASTOR.
39-40 2 0x27 Version number of generation: ‘00’ in CASTOR.
41-46 6 0x29 Creation date: Date when allocation begins for creating the data set. The date format is cyyddd, where: c = century (blank=19; 0=20; 1=21; etc.) yy = year (00-99) ddd = day (001-366)
47-52 6 0x2F Expiration date: year and day of the year when the data set may be scratched or overwritten. The date is shown in the format cyyddd. When a volume is being initialised (‘prelabelled’), it is advisable to set the expiration date to a date before the current date, so that writing to the tape is immediately possible.
53 1 0x35 Accessibility: a code indicating the security status of the data set. A space means no data set access protection.
54-59 6 0x36 Block count: in the trailer label, this field shows the number of data blocks in the data set on the current volume. In the header label, this field is always ‘000000’.
60-72 13 0x3C System code of creating system: a unique code that identifies the system. In CASTOR, this is the string ‘CASTOR’ followed by the CASTOR BASEVERSION number (e.g., ‘CASTOR 2.1.13’).
73-79 7 0x49 Reserved
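
The cyyddd date encoding and the HDR1 field offsets can be sketched as follows. This is a minimal illustration under the assumptions stated in the table, not the CTA implementation; `decode_cyyddd` and `parse_hdr1` are hypothetical names. The sample block reproduces the HDR1 of the second file shown in the Examples of this section.

```python
from datetime import date, timedelta


def decode_cyyddd(field: str) -> date:
    """Decode a cyyddd date: blank century = 19xx, '0' = 20xx, '1' = 21xx."""
    if len(field) != 6:
        raise ValueError(f"expected 6 characters, got {field!r}")
    century = 19 if field[0] == " " else 20 + int(field[0])
    year = century * 100 + int(field[1:3])
    day_of_year = int(field[3:6])
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


def parse_hdr1(block: bytes) -> dict:
    """Slice an HDR1 or EOF1 block at the documented byte offsets."""
    text = block.decode("ascii")
    if len(text) != 80 or text[0:4] not in ("HDR1", "EOF1"):
        raise ValueError("not an 80-byte HDR1/EOF1 label")
    return {
        "label": text[0:4],
        "file_id": text[4:21].rstrip(),  # hex file ID, or 'PRELABEL'
        "vsn": text[21:27],
        "file_section": int(text[27:31]),
        "fseq": int(text[31:35]),        # stored modulo 10000
        "created": decode_cyyddd(text[41:47]),
        "expires": decode_cyyddd(text[47:53]),
        "block_count": int(text[54:60]),
        "system_code": text[60:73].rstrip(),
    }


example = (
    "HDR1" + "12A160C38".ljust(17) + "V52001" + "0001" + "0002"
    + "0001" + "00" + "012041" + "012041" + " " + "000000"
    + "CASTOR 2.1.12".ljust(13) + " " * 7
)
hdr1 = parse_hdr1(example.encode("ascii"))
print(hdr1["created"], hdr1["system_code"])
```

Because the file identifier may be the literal ‘PRELABEL’, the sketch keeps it as a string. A caller would convert it with `int(file_id, 16)` only after ruling that case out.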

The format for HDR2

Bytes Length Offset Content
0-3 4 0x00 Header label: the characters “HDR2” or “EOF2”
4 1 0x04 Record format: an alphabetic character that indicates the format of the records in the associated data set. For AUL, the only value is F (fixed length). U was used in HDR2 for prelabelled tapes.
5-9 5 0x05 Block length in bytes (maximum). For block sizes of 100000 bytes or more, which do not fit in five digits, the value is 00000.
10-14 5 0x0A Record length in bytes (maximum). For record sizes of 100000 bytes or more, which do not fit in five digits, the value is 00000.
15 1 0x0F Tape density. The value depends on the tape density: ‘2’ for D800, ‘3’ for D1600, ‘4’ for D6250.
16-33 18 0x10 Reserved
34-35 2 0x22 Tape recording technique. The only technique available for 9-track tape is odd parity with no translation. For a magnetic tape subsystem with Improved Data Recording Capability, the values are: ‘P ’ - record data in compacted format; spaces - record data in standard uncompacted format. In CASTOR, it is ‘P ’ if the drive is configured to use compression (i.e. xxxGC).
36-49 14 0x24 Reserved
50-51 2 0x32 Buffer offset: ‘00’ for AL and AUL tapes.
52-79 28 0x34 Reserved

Examples

Example for the empty tape with PRELABEL and one HDR1:
00000000 56 4f 4c 31 56 35 32 30 30 31 20 20 20 20 20 20 |VOL1V52001      |
00000010 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
00000020 20 20 20 20 20 72 6f 6f 74 20 20 20 20 20 20 20 |     root       |
00000030 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
00000040 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 33 |               3|
00000050 48 44 52 31 50 52 45 4c 41 42 45 4c 20 20 20 20 |HDR1PRELABEL    |
00000060 20 20 20 20 20 56 35 32 30 30 31 30 30 30 31 30 |     V5200100010|
00000070 30 30 31 30 30 30 31 30 30 30 31 33 32 33 34 30 |0010001000132340|
00000080 31 33 32 33 34 20 30 30 30 30 30 30 43 41 53 54 |13234 000000CAST|
00000090 4f 52 20 32 2e 31 2e 31 33 20 20 20 20 20 20 20 |OR 2.1.13       |
Example of HDR1 for the second file on the tape:
00000000 48 44 52 31 31 32 41 31 36 30 43 33 38 20 20 20 |HDR112A160C38   |
00000010 20 20 20 20 20 56 35 32 30 30 31 30 30 30 31 30 |     V5200100010|
00000020 30 30 32 30 30 30 31 30 30 30 31 32 30 34 31 30 |0020001000120410|
00000030 31 32 30 34 31 20 30 30 30 30 30 30 43 41 53 54 |12041 000000CAST|
00000040 4f 52 20 32 2e 31 2e 31 32 20 20 20 20 20 20 20 |OR 2.1.12       |
Example of HDR2 for the first file on the tape:
00000000 48 44 52 32 46 30 30 30 30 30 30 30 30 30 30 20 |HDR2F0000000000 |
00000010 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
00000020 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
00000030 20 20 30 30 20 20 20 20 20 20 20 20 20 20 20 20 |  00            |
00000040 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
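
A short sketch of reading the HDR2 fields is shown below. It uses the hexadecimal offsets from the HDR2 table and rebuilds the example block from the hex dump above. `parse_hdr2` is a hypothetical helper, not part of CTA. A block or record length of 0 is taken to mean that the real value did not fit in the five-digit field.

```python
def parse_hdr2(block: bytes) -> dict:
    """Slice an HDR2 or EOF2 block at the documented offsets."""
    text = block.decode("ascii")
    if len(text) != 80 or text[0:4] not in ("HDR2", "EOF2"):
        raise ValueError("not an 80-byte HDR2/EOF2 label")
    return {
        "label": text[0:4],
        "record_format": text[0x04],
        "block_length": int(text[0x05:0x0A]),   # 0: too large for the field
        "record_length": int(text[0x0A:0x0F]),  # 0: too large for the field
        "density": text[0x0F],
        # Two-byte recording technique: 'P ' when the data is compacted.
        "compressed": text[0x22:0x24].strip() == "P",
        "buffer_offset": text[0x32:0x34],
    }


# Rebuild the example: 'HDR2', format F, two zeroed length fields,
# spaces up to offset 0x32, buffer offset '00', spaces to the end.
example = ("HDR2" + "F" + "00000" + "00000").ljust(0x32) + "00"
example = example.ljust(80)
hdr2 = parse_hdr2(example.encode("ascii"))
print(hdr2)
```

When the length fields are zero, a reader would presumably rely on the actual block size and record length stored in UHL1.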

User Header Label (UHLn)

The format for UHL1

Bytes Length Offset Content
0-3 4 0x00 User header label: the characters “UHL1” or “UTL1”.
4-13 10 0x04 Actual file sequence number (left-padded with ‘0’).
14-23 10 0x0E Actual block size (left-padded with ‘0’).
24-33 10 0x18 Actual record length (left-padded with ‘0’).
34-41 8 0x22 Site: part of the domain name, in uppercase.
42-51 10 0x2A Tape mover host name, in uppercase, without the domain name.
52-59 8 0x34 Drive manufacturer.
60-67 8 0x3C Drive model (first 8 bytes of the PRODUCT IDENTIFICATION field in the SCSI INQUIRY reply).
68-79 12 0x44 Drive serial number.

Examples

Example for the second file on the tape:
00000000 55 48 4c 31 30 30 30 30 30 30 30 30 30 32 30 30 |UHL1000000000200|
00000010 30 30 32 36 32 31 34 34 30 30 30 30 32 36 32 31 |0026214400002621|
00000020 34 34 43 45 52 4e 20 20 20 20 4c 58 43 32 44 45 |44CERN    LXC2DE|
00000030 56 35 44 32 53 54 4b 20 20 20 20 20 54 31 30 30 |V5D2STK     T100|
00000040 30 30 42 20 58 59 5a 5a 59 5f 42 31 20 20 20 20 |00B XYZZY_B1    |
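
The UHL1 and UTL1 fields can be read in the same way. The sketch below is illustrative only; `UserLabel` and `parse_user_label` are hypothetical names. It decodes the example block shown above.

```python
from typing import NamedTuple


class UserLabel(NamedTuple):
    label: str
    fseq: int
    block_size: int
    record_length: int
    site: str
    host: str
    drive_vendor: str
    drive_model: str
    drive_serial: str


def parse_user_label(block: bytes) -> UserLabel:
    """Slice a UHL1 or UTL1 block at the documented byte offsets."""
    text = block.decode("ascii")
    if len(text) != 80 or text[0:4] not in ("UHL1", "UTL1"):
        raise ValueError("not an 80-byte UHL1/UTL1 label")
    return UserLabel(
        label=text[0:4],
        fseq=int(text[4:14]),
        block_size=int(text[14:24]),
        record_length=int(text[24:34]),
        site=text[34:42].rstrip(),
        host=text[42:52].rstrip(),
        drive_vendor=text[52:60].rstrip(),
        drive_model=text[60:68].rstrip(),
        drive_serial=text[68:80].rstrip(),
    )


# The example label, one string literal per field.
example = (
    "UHL1" "0000000002" "0000262144" "0000262144"
    "CERN    " "LXC2DEV5D2" "STK     " "T10000B " "XYZZY_B1    "
)
uhl1 = parse_user_label(example.encode("ascii"))
print(uhl1)
```

Unlike the five-digit HDR2 fields, the ten-digit fields here are wide enough to hold the 262144-byte block size in full.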

Data Records

After a ‘header label group’ come the data records, in any number and of any length. The data records end with an EOF, after which a ‘trailer label group’ is expected.

The data block size is configurable but in practice a block size of 256 KiB has been used everywhere.

End of File (EOFn)

EOF1 and EOF2 are normally found together at the end of a dataset.

Note that an End of Volume (EOVn) label will appear instead of EOFn if this is the final label group on the volume, but the dataset continues on another volume. EOV1 and EOV2 are only expected together and at the end of a volume.

User Trailer Label (UTLn)

The format for UTL1 is the same as UHL1 (see above).

Checksums

When a file is written to tape, an Adler32 checksum is computed over the file. The main advantages of Adler32 are that it is faster to compute than CRC32 or MD5, and that the checksum of a multi-block file can be computed block by block. This checksum is not stored on the tape; it is stored as metadata in the Catalogue.
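
How an Adler32 checksum can be accumulated over the blocks of a file is sketched below with Python's standard `zlib.adler32`, which accepts a running value as its second argument. This illustrates the technique only and does not show how CTA computes its checksums. The sample payload and the helper `adler32_of_blocks` are hypothetical.

```python
import zlib

BLOCK_SIZE = 256 * 1024  # the block size quoted in this document


def adler32_of_blocks(blocks) -> int:
    """Accumulate an Adler32 checksum over a sequence of blocks."""
    checksum = 1  # Adler32 starting value
    for block in blocks:
        checksum = zlib.adler32(block, checksum)
    return checksum


# Hypothetical 1 MiB payload, split into tape-sized blocks.
payload = bytes(range(256)) * 4096
blocks = [payload[i:i + BLOCK_SIZE] for i in range(0, len(payload), BLOCK_SIZE)]

# The block-by-block result equals the checksum of the whole payload.
assert adler32_of_blocks(blocks) == zlib.adler32(payload)
print(f"{adler32_of_blocks(blocks):08x}")
```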

Note

The tape drives also compute a CRC32 checksum on each block, which is checked in firmware. This checksum is not seen by the software.

Read-only formats

CTA also supports a set of other tape formats for read-only operations. These allow CTA adopters to use their existing tapes without having to rewrite the data in the CTA/CASTOR format. New files can only be written in the CTA/CASTOR format.

OSM

The OSM tape label format is supported in CTA from version 4.7.9-1 onwards.

Enstore

The Enstore tape label format is supported in CTA from version 4.7.11-1 onwards.