Introduction

CTA is designed to be a scalable and reliable tape storage system. Files are archived and retrieved through so-called workflow events, which the disk buffer system triggers on the CTA frontend. The disk buffer system therefore needs to be compatible with CTA. At present, two disk buffer systems can be used in conjunction with CTA: EOS and dCache. The disk buffer is needed so that the tape drives can read and write at high throughput, and it typically consists of fast SSDs.

CTA is designed to be deployed on-premises. A deployment requires:

  • Machines to run the CTA frontend
  • One or more tape drives attached to a dedicated machine that can act as a tape server. The tape server runs, among other things, the Tape Daemon, which handles the drive interaction
  • A fast disk buffer system that can be put in front of CTA
  • A database system that can act as the Catalogue. CTA supports both Oracle and Postgres databases for the Catalogue. This service must be highly available, as all fundamental information is stored here
  • A Ceph cluster to handle the scheduler queues; this is the setup we run at CERN. We are currently developing a Postgres-based alternative for the scheduler, but this is still in its early stages

In addition, a good monitoring setup is required so that operators can intervene quickly when necessary.

Please visit our Community Forum if you have any questions on this.

Component Overview

The CTA system consists of various components. Below we briefly list these components and explain what they do:

  • Frontend: handles workflow events from the disk buffer and serves cta-admin requests
  • Tape server: finds jobs to do and handles the drive interaction
  • Disk Buffer: a buffer for the files to be archived/retrieved to/from tape. Sends workflow events to the frontend
  • Catalogue: central database that stores metadata regarding the files, tapes, libraries and other things
  • Scheduler: a system that stores and handles the scheduling queues. It is not a global scheduler: it does not actively assign work to, for example, the tape servers. Instead, the frontend pushes jobs to the scheduler, and each tape server pulls jobs individually
  • Admin Client: a separate client that can be used to execute cta-admin commands on the frontend

CTA Component Overview
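
The push/pull relationship between the frontend, the scheduler and the tape servers described in the component list can be illustrated with a small sketch. This is not CTA code: the class names, method names and queue names below are hypothetical, and an in-memory `deque` stands in for the real scheduler backend. It only shows the idea that the scheduler is a passive store of queues, that the frontend pushes jobs into it, and that each tape server pulls work on its own.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional


@dataclass(frozen=True)
class Job:
    kind: str      # "archive" or "retrieve"
    file_id: str   # hypothetical file identifier


class SchedulerQueues:
    """Passive queue store (assumption: stands in for the scheduler backend).

    It never assigns work to anyone; it only stores and hands out jobs.
    """

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[Job]] = {}

    def push(self, queue: str, job: Job) -> None:
        # Create the queue on first use, then append in FIFO order.
        self._queues.setdefault(queue, deque()).append(job)

    def pull(self, queue: str) -> Optional[Job]:
        # Return the oldest job, or None if the queue is empty or unknown.
        pending = self._queues.get(queue)
        return pending.popleft() if pending else None


class Frontend:
    """Turns workflow events from the disk buffer into queued jobs."""

    def __init__(self, scheduler: SchedulerQueues) -> None:
        self.scheduler = scheduler

    def on_workflow_event(self, kind: str, file_id: str, queue: str) -> None:
        self.scheduler.push(queue, Job(kind, file_id))


class TapeServer:
    """Pulls jobs on its own initiative; nothing is pushed to it."""

    def __init__(self, name: str, scheduler: SchedulerQueues) -> None:
        self.name = name
        self.scheduler = scheduler
        self.completed: List[Job] = []

    def poll(self, queue: str) -> Optional[Job]:
        job = self.scheduler.pull(queue)
        if job is not None:
            # Real drive interaction would happen here.
            self.completed.append(job)
        return job


if __name__ == "__main__":
    scheduler = SchedulerQueues()
    frontend = Frontend(scheduler)
    servers = [TapeServer("tpsrv01", scheduler), TapeServer("tpsrv02", scheduler)]

    # The disk buffer triggers two archive events on the frontend.
    frontend.on_workflow_event("archive", "file-1", "pool-a")
    frontend.on_workflow_event("archive", "file-2", "pool-a")

    # Each tape server independently asks for work.
    for server in servers:
        print(server.name, server.poll("pool-a"))
```

Because the tape servers pull rather than receive assignments, adding another tape server in this sketch needs no change to the frontend or to the queue store, which mirrors the reason the scheduler is described as non-global.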