CTA Scheduling system

Introduction

This documentation describes how CTA mount scheduling works. It covers what a single tapeserver does to decide whether or not to schedule a mount.

Mount policy description

In CTA, a mount policy is assigned to each queued Archive and Retrieve Request. A mount policy is described by 5 values:

| Value | Type | Description |
|---|---|---|
| Name | String | The name of the mount policy |
| ArchivePriority | Unsigned int | The priority of Archive Requests. If this number is high, the Archival priority will be high |
| ArchiveMinRequestAge (in seconds) | Unsigned int | The minimum age of the queued Archive Request to trigger a mount |
| RetrievePriority | Unsigned int | The priority of Retrieve Requests. If this number is high, the Retrieval priority will be high |
| RetrieveMinRequestAge (in seconds) | Unsigned int | The minimum age of the queued Retrieve Request to trigger a mount |

Mount policy resolution

Mount policy resolution is done at queueing time. The mount policy of an archive/retrieve request is determined by the mount rules in the CTA catalogue.

There are three types of mount rules:

  • Activity Mount Rule - Matches a mount policy to a username and activity regex pair. The username must be the same as that of the user performing the request, and the activity of the queued request must match the regex of the activity mount rule. If more than one activity mount rule matches a request, the one whose mount policy has the highest retrieve priority is chosen.

  • Requester Mount Rule - Matches a mount policy to the username of the user performing the archive or retrieve request.

  • Group Mount Rule - Matches a mount policy to the group of the user performing the archive or retrieve request.

All mount rules are associated with a Disk Instance. Only mount rules with the same Disk Instance as the archive/retrieve request are used for resolution.

The mount policy of archive requests is defined by a matching Requester Mount Rule, or failing that, a matching Group Mount Rule. The mount policy of retrieve requests is defined by a matching Activity Mount Rule, failing that a Requester Mount Rule and failing that a Group Mount Rule. The queueing fails if no mount rule matches the archive/retrieve request.
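The resolution order described above can be sketched as follows. This is only an illustration, not CTA code: the `MountPolicy` and `MountRule` classes, the `resolve_mount_policy` function and the use of `re.search` for the activity regex are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MountPolicy:
    name: str
    archive_priority: int
    archive_min_request_age: int   # seconds
    retrieve_priority: int
    retrieve_min_request_age: int  # seconds


@dataclass(frozen=True)
class MountRule:
    # Hypothetical flat representation of the three rule types.
    kind: str                 # "activity", "requester" or "group"
    disk_instance: str
    name: str                 # username (activity/requester) or group name
    policy: MountPolicy
    activity_regex: str = ""  # only meaningful for activity rules


def resolve_mount_policy(rules, disk_instance, user, group,
                         is_retrieve, activity=None):
    """Return the mount policy for a request, or raise if no rule matches."""
    # Only rules of the request's disk instance take part in the resolution.
    rules = [r for r in rules if r.disk_instance == disk_instance]

    # Activity rules are only considered for retrieve requests.
    if is_retrieve and activity is not None:
        matches = [r.policy for r in rules
                   if r.kind == "activity" and r.name == user
                   and re.search(r.activity_regex, activity)]
        if matches:
            # Several matches: keep the highest retrieve priority.
            return max(matches, key=lambda p: p.retrieve_priority)

    # Then the requester rule, and failing that the group rule.
    for kind, name in (("requester", user), ("group", group)):
        for r in rules:
            if r.kind == kind and r.name == name:
                return r.policy

    # No rule matched: queueing fails.
    raise LookupError("no mount rule matches the request")
```

With a requester rule for one user and a group rule for that user's group, an archive request from the user resolves to the requester rule's policy, while a request from another member of the group falls back to the group rule.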

The only way to change the priority of a queued request is to change the priority of the mount policy selected.

The scheduling steps

Each CTA tapeserver has an attached DriveProcess that looks for work to do every 10 seconds by using the Scheduler::getNextMountDryRun() and Scheduler::getNextMount() methods.

Scheduler::getNextMountDryRun() returns true if there is a mount to schedule, false otherwise. The Scheduler::getNextMount() method returns the actual mount to be done in order to create the tape session (Read or Write). These two methods work in exactly the same way. Here are the steps that are executed:

  • Look at all queue statistics for work to be done (each queue is a Potential Mount)
  • Look for existing mounts
  • Among all Potential Mounts, determine the best mount to be returned, which then triggers the tape session

WARNING: If the logical library of the drive is disabled, no mount will be triggered.

Look at all queue statistics for work to be done

This step is done by the OStoreDB::fetchMountInfo() method.

The following queues are looked at:

  • RetrieveQueueToTransfer that contains User and Repack retrieve requests
  • ArchiveQueueToTransferForUser that contains only User archive requests
  • ArchiveQueueToTransferForRepack that contains only Repack archive requests

For each queue in the objectstore, a PotentialMount object will be created and will contain the following statistics associated to the queue:

  • the VID (for Retrieve queues) or the TapePool (for Archive queues)
  • the type
  • the number of files queued
  • the number of bytes queued
  • the time the oldest job in the queue was created
  • the mount policies related statistics:
    • the mount policy name
    • the priority
    • the minimum request age to trigger a mount

Here is an example to explain how the mount policies statistics are stored in a queue.

Suppose we have two mount policies:

| MountPolicy | Archive priority | Retrieve priority | Archive min request age | Retrieve min request age |
|---|---|---|---|---|
| MP1 | 1 | 3 | 300 | 300 |
| MP2 | 2 | 2 | 100 | 400 |

If a user queues 2 Retrieve Requests for VID1 with the mount policy MP1, and 1 Archive Request with the mount policy MP2, the Retrieve queue VID1 mount policy statistics will be:

| Statistic | Key | Value |
|---|---|---|
| name | MP1 | 2 |
| Retrieve Priority | 3 | 2 |
| Retrieve min request age | 300 | 2 |

The Archive queue mount policy statistics will be:

| Statistic | Key | Value |
|---|---|---|
| name | MP2 | 1 |
| Archive Priority | 2 | 1 |
| Archive min request age | 100 | 1 |

The mount policy statistics of the queues are stored as maps (ValueCountMap in the objectstore), one map for each mount policy item. The key is a value of the mount policy item, and the value is the number of queued jobs whose mount policy has that value for the item.

WARNING: The best mount policy statistics values will be given to the PotentialMount created.
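A value-count map of this kind can be sketched as below. The class only shares its name with the objectstore ValueCountMap; its methods are hypothetical. The sketch also assumes that the "best" values are the highest priority and the lowest minimum request age, which the text above does not state explicitly.

```python
from collections import Counter


class ValueCountMap:
    """Counts how many queued jobs carry each value of one policy item."""

    def __init__(self):
        self._counts = Counter()

    def increment(self, value):
        # Called when a job with this value is queued.
        self._counts[value] += 1

    def decrement(self, value):
        # Called when a job with this value leaves the queue.
        if value not in self._counts:
            raise KeyError(value)
        self._counts[value] -= 1
        if self._counts[value] == 0:
            del self._counts[value]

    def as_dict(self):
        return dict(self._counts)

    def max_value(self):
        return max(self._counts)

    def min_value(self):
        return min(self._counts)


# Retrieve queue VID1 from the example: two jobs queued with MP1.
priority = ValueCountMap()
min_age = ValueCountMap()
for _ in range(2):
    priority.increment(3)
    min_age.increment(300)

# A third retrieve job queued with MP2 (priority 2, min age 400).
priority.increment(2)
min_age.increment(400)

# Assumed meaning of "best": highest priority, lowest minimum age.
best_priority = priority.max_value()
best_min_age = min_age.min_value()
```

Keeping a count per value means that when the last job of a given policy leaves the queue, its value disappears from the map and the best value is recomputed from the remaining jobs.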

Look for existing mounts

This step is done at the end of the OStoreDB::fetchMountInfo() method. It simply locks the DriveRegister and gets, for each drive:

  • its status
  • the tapepool of the mounted tape
  • the VID of the mounted tape
  • the number of transferred files
  • the number of transferred bytes
  • the latest bandwidth

This information about existing mounts is then given to the Scheduler::getNextMount() method.

Filter the potential mounts

The scheduler then has to filter the PotentialMount objects created in the steps above. This is done in the Scheduler::sortAndGetTapesForMountInfo() method.

First, Retrieve PotentialMounts are filtered on compatible logical libraries. Because the queue statistics step loops over all the queues, the scheduler must keep only the potential mounts that are compatible with the logical library where the drive is located.

A second filter is applied to each PotentialMount to check whether it has enough bytes or files queued. These values are configurable in the tapeserver configuration file:

taped MountCriteria 500000000000,10000

If these values are not reached, but there is a Request that is older than the queue's minimum request age mount policy statistic, then the PotentialMount will still be considered.
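The check described above can be sketched as follows. The field names and the `passes_mount_criteria` function are hypothetical. The sketch assumes that the two MountCriteria numbers are a byte count followed by a file count, and that reaching either threshold is enough. It ignores any other condition the real scheduler may apply.

```python
from dataclasses import dataclass

# Thresholds taken from the MountCriteria example (assumed: bytes, files).
MIN_BYTES = 500_000_000_000
MIN_FILES = 10_000


@dataclass
class PotentialMount:
    bytes_queued: int
    files_queued: int
    oldest_job_start_time: float  # epoch seconds
    min_request_age: int          # seconds, from the mount policy statistics


def passes_mount_criteria(m, now, min_bytes=MIN_BYTES, min_files=MIN_FILES):
    """Return True if the potential mount should be kept."""
    # Enough data queued to warrant a mount.
    if m.bytes_queued >= min_bytes or m.files_queued >= min_files:
        return True
    # Otherwise keep it only if the oldest request has waited long enough.
    return now - m.oldest_job_start_time > m.min_request_age
```

For example, a queue holding a single small file is rejected while its oldest request is younger than the minimum request age, and accepted once that age is exceeded.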

A last filter is then applied. If the virtual organization of the tapepool of the potential mount is already using all the drives it is allowed to use, the mount is removed from the list of potential mounts.

The number of drives a virtual organization is allowed to use for read and for write can be configured by using the following command:

cta-admin virtualorganization ch --vo VO --readmaxdrives x --writemaxdrives y
Where x is the number of drives the virtual organization is allowed to use for reading, and y is the number of drives the virtual organization is allowed to use for writing (all types of Archive mounts).
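A minimal sketch of this quota filter is shown below. The data layout is an assumption made for the example: potential and existing mounts are reduced to (virtual organization, "read" or "write") pairs, and the limits are a plain dictionary holding the readmaxdrives and writemaxdrives values.

```python
from collections import Counter


def filter_by_vo_quota(potential_mounts, existing_mounts, vo_limits):
    """Drop potential mounts whose VO already uses all its allowed drives.

    potential_mounts: list of (vo, kind) pairs, kind is "read" or "write"
    existing_mounts:  list of (vo, kind) pairs for drives currently in use
    vo_limits:        {vo: {"read": x, "write": y}}
    """
    in_use = Counter(existing_mounts)
    return [(vo, kind) for vo, kind in potential_mounts
            if in_use[(vo, kind)] < vo_limits[vo][kind]]


# Hypothetical VO allowed 2 read drives and 1 write drive.
limits = {"VO": {"read": 2, "write": 1}}
existing = [("VO", "read"), ("VO", "write")]
candidates = [("VO", "read"), ("VO", "write")]

remaining = filter_by_vo_quota(candidates, existing, limits)
```

Here the read candidate is kept because only one of the two read drives is in use, while the write candidate is removed because the single write drive is already taken.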

Determine the best mount to be triggered

The determination of the best mount to be triggered is also done in the Scheduler::sortAndGetTapesForMountInfo() method.

Once all this filtering is done, the remaining PotentialMounts are sorted according to PotentialMount::operator<() in order to select the best mount to trigger.

The sorting will be done in the following order:

  1. priority (extracted from the queue mount policy statistics)
  2. mount type (archival has a higher priority than retrieval)
  3. the age of the job: the older the job is, the higher priority it has

The sorted list of PotentialMounts is then given to the Scheduler::getNextMount() method, which verifies that the tape can be mounted for Retrieve or finds a tape for Archival. The mount is then returned to the DriveProcess in order to create a tape read or write session.
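The three-level sort order listed above can be illustrated with a sort key. This is a sketch of the ordering only, not a transcription of PotentialMount::operator<(). The field names are hypothetical, and it assumes that a higher priority number sorts first.

```python
from dataclasses import dataclass


@dataclass
class PotentialMount:
    name: str
    priority: int                 # from the queue mount policy statistics
    is_archive: bool              # True for archival, False for retrieval
    oldest_job_start_time: float  # epoch seconds, smaller means older


def sort_key(m):
    # sorted() is ascending, so the best mount must get the smallest key:
    # 1. higher priority first, 2. archive before retrieve, 3. oldest first.
    return (-m.priority, 0 if m.is_archive else 1, m.oldest_job_start_time)


candidates = [
    PotentialMount("A", priority=3, is_archive=False, oldest_job_start_time=100.0),
    PotentialMount("B", priority=3, is_archive=True, oldest_job_start_time=200.0),
    PotentialMount("C", priority=3, is_archive=True, oldest_job_start_time=50.0),
    PotentialMount("D", priority=5, is_archive=True, oldest_job_start_time=900.0),
]

best_first = sorted(candidates, key=sort_key)
```

D comes first because of its higher priority, even though its oldest job is the most recent. Among the priority 3 mounts, the two archive mounts precede the retrieve mount, and the archive mount with the older job (C) precedes B.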

Conclusion

CTA scheduling is done in four steps: create a PotentialMount for each queue found, filter the PotentialMounts, sort the remaining PotentialMounts, and trigger the best mount that is actually possible. The mount policies and the virtual organization read/write max drives play an important role in this process. The mount policy minimum request age and the virtual organization read/write max drives are used in the filtering part of the scheduling, and the priority of the mount policy is used to sort the remaining PotentialMounts.

Currently, we do not have any tool or mechanism to tell when a tape is going to be scheduled per logical library.