The CTA Tape Lifecycle
Introduction
In CTA, information about each tape is stored in the CTA Catalogue and is managed by means of the `cta-admin` command.
Different tape states and supported state transitions
What can be done on each final state
| State | Queue user read requests | Queue user write requests | Queue repack requests | Queue repack read sub-requests (*) | Queue repack write sub-requests (*) | Mountable | Reclaimable |
|---|---|---|---|---|---|---|---|
| ACTIVE | YES | YES | NO | NO | YES | YES | YES |
| DISABLED | YES | YES | NO | NO | YES | NO | YES |
| REPACKING | NO | NO | YES | YES | NO | YES | NO |
| REPACKING_DISABLED | NO | NO | YES | YES | NO | NO | NO |
| BROKEN | NO | NO | NO | NO | NO | NO | YES |
| EXPORTED | NO | NO | NO | NO | NO | NO | NO |
(*) Repack sub-request queueing is handled internally by the maintenance process.
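
The table above can be read as a lookup from a tape state to the operations it permits. The following is a minimal Python sketch of that lookup, not CTA's actual implementation: the `Capabilities` type, its field names, and the `allows` helper are hypothetical names chosen for illustration. Only the YES/NO values are taken from the table.

```python
from typing import NamedTuple


class Capabilities(NamedTuple):
    """One row of the table. Field names are hypothetical; values come from the table."""
    user_read: bool
    user_write: bool
    repack: bool
    repack_read_sub: bool
    repack_write_sub: bool
    mountable: bool
    reclaimable: bool


Y, N = True, False

# Columns follow the table order, left to right.
TAPE_STATE_CAPABILITIES = {
    "ACTIVE":             Capabilities(Y, Y, N, N, Y, Y, Y),
    "DISABLED":           Capabilities(Y, Y, N, N, Y, N, Y),
    "REPACKING":          Capabilities(N, N, Y, Y, N, Y, N),
    "REPACKING_DISABLED": Capabilities(N, N, Y, Y, N, N, N),
    "BROKEN":             Capabilities(N, N, N, N, N, N, Y),
    "EXPORTED":           Capabilities(N, N, N, N, N, N, N),
}


def allows(state: str, operation: str) -> bool:
    """Return True if the table marks `operation` as YES for `state`."""
    return getattr(TAPE_STATE_CAPABILITIES[state], operation)
```

For example, `allows("DISABLED", "user_read")` is true while `allows("DISABLED", "mountable")` is false, which matches the description of `DISABLED` as a state where requests can still be queued but the tape is not mounted.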
Tape states explained
- ACTIVE:
    - A tape that is `ACTIVE` is in good condition, so users can read data from it or write data to it. A tape newly added to CTA is `ACTIVE` by default.
- DISABLED:
    - A tape that is `DISABLED` cannot be mounted, but can still have retrieve requests queued to it.
    - Can be set for the following reasons:
        - A tape server disabled it after encountering an issue that led to a failure (for example, a failed dismount).
        - Monitoring probes decided to disable it (for example, too many errors over the past XX hours).
        - An operator thinks the tape is in bad shape and must be kept away from users while it is investigated.
    - A tape should not stay in this state for more than one week. The person on rota must follow up on tapes that have been disabled for longer than that.
- REPACKING:
    - A tape must be moved to the `REPACKING` state before the operator submits a repack request. Otherwise, the repack request won't be accepted.
    - Likewise, it's not possible to move out of `REPACKING` while a repack request is ongoing (except to the `REPACKING_DISABLED` state).
    - When a change to `REPACKING` is requested, the tape first moves to the temporary state `REPACKING_PENDING`. It then waits for the maintenance process to clean all user requests from the tape queue before finally moving to the `REPACKING` state. For more details see: Queue Cleanup Runner.
- REPACKING_DISABLED:
    - The state `REPACKING_DISABLED` is similar to `DISABLED`, but for repacking tapes (a tape cannot leave the repacking states while a repack is ongoing).
    - A tape should not stay in this state for more than one week. The person on rota must follow up on tapes that have been disabled for longer than that.
    - New repack requests can be queued on a `REPACKING_DISABLED` tape. However, the tape will not be mounted while it is in this state.
- BROKEN:
    - A tape can stay in this state for a long time, and it is very likely to be its final state.
    - When a change to `BROKEN` is requested, the tape first moves to the temporary state `BROKEN_PENDING`. It then waits for the maintenance process to clean all user requests from the tape queue before finally moving to the `BROKEN` state. This guarantees that all requests are properly disposed of.
    - Can be set for the following reasons:
        - A problematic (earlier `DISABLED`) tape that requires non-trivial effort for its data to be recovered:
            - a slow, low-level tape extraction is needed
            - the tape was sent to the vendor for data recovery, so it is not present in the library
            - all recovery attempts are exhausted and the tape is permanently broken, but action by the experiment is still needed (deleting the lost files from the catalogs)
        - In rare cases, an `ACTIVE` tape that falls on the floor because of a gripper incident can move from `ACTIVE` straight to `BROKEN`, as it must be physically put back in a slot.
    - Normally, very few tapes are in the `BROKEN` state.
    - No operations are allowed on a `BROKEN` tape.
- EXPORTED:
    - A tape can stay in this state for a long time. It means that the tape was removed from the tape library.
    - While we at CERN do not remove tape cartridges from tape libraries, other sites do. The `EXPORTED` state helps to distinguish that case.
    - This state behaves very similarly to the `BROKEN` state. The difference is in the error messages reported to the user.
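
The pending-state behaviour described for `REPACKING` and `BROKEN`, together with the rule that a repack request is only accepted on a tape in a repacking state, can be sketched as a small state machine. This is an illustrative Python model under stated assumptions, not CTA code: the `Tape` class and the functions `request_state_change`, `run_queue_cleanup`, and `can_submit_repack` are hypothetical names, and a plain list stands in for the real tape queue. The sketch assumes that targets without a pending state are applied directly, and it does not check which transitions are supported or whether a repack is ongoing.

```python
from dataclasses import dataclass, field

# Target states that are reached through a temporary pending state (from the text).
PENDING_FOR = {"REPACKING": "REPACKING_PENDING", "BROKEN": "BROKEN_PENDING"}
FINAL_FOR = {pending: final for final, pending in PENDING_FOR.items()}


@dataclass
class Tape:
    """Hypothetical in-memory stand-in for a tape entry in the catalogue."""
    vid: str
    state: str = "ACTIVE"  # a newly added tape is ACTIVE by default
    user_requests: list = field(default_factory=list)


def request_state_change(tape: Tape, target: str) -> None:
    """Enter the pending state if the target has one; otherwise (assumption) apply directly."""
    tape.state = PENDING_FOR.get(target, target)


def run_queue_cleanup(tape: Tape) -> None:
    """Stand-in for the maintenance process: clear queued user requests, then finalise."""
    if tape.state in FINAL_FOR:
        tape.user_requests.clear()
        tape.state = FINAL_FOR[tape.state]


def can_submit_repack(tape: Tape) -> bool:
    """Repack requests are only accepted on tapes in a repacking state."""
    return tape.state in ("REPACKING", "REPACKING_DISABLED")
```

In this model, calling `request_state_change(tape, "REPACKING")` on an `ACTIVE` tape leaves it in `REPACKING_PENDING`, where `can_submit_repack` still returns false. Only after `run_queue_cleanup(tape)` has emptied the user request list does the tape reach `REPACKING` and accept a repack request.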