Install a Local EOS Instance¶

This Appendix describes how to install a local EOS instance.

Install XRoot and EOS RPMs¶

The yum repos, priorities and version lock list should be configured as described in Section configure yum.

Next, install the RPMs:

# yum install eos-client eos-server xrootd-client xrootd-debuginfo xrootd-server heimdal-server heimdal-workstation

Now the list of installed EOS and XRoot RPMs should look something like this:

# rpm -qa | egrep 'eos|xrootd|heimdal' | sort
eos-client-4.1.11-20170118171644git24cd94a.el7.x86_64
eos-server-4.1.11-20170118171644git24cd94a.el7.x86_64
heimdal-libs-1.6.0-0.9.20140621gita5adc06.el7.x86_64
heimdal-server-1.6.0-0.9.20140621gita5adc06.el7.x86_64
heimdal-workstation-1.6.0-0.9.20140621gita5adc06.el7.x86_64
libmicrohttpd-0.9.38-eos.wves.el7.cern.x86_64
xrootd-4.4.1-1.el7.x86_64
xrootd-client-4.4.1-1.el7.x86_64
xrootd-client-libs-4.4.1-1.el7.x86_64
xrootd-debuginfo-4.4.1-1.el7.x86_64
xrootd-libs-4.4.1-1.el7.x86_64
xrootd-selinux-4.4.1-1.el7.noarch
xrootd-server-4.4.1-1.el7.x86_64
xrootd-server-libs-4.4.1-1.el7.x86_64
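To compare a host against the list above, a small helper along the following lines could be used. This is only a sketch: the function name missing_packages is hypothetical, and it assumes the name-version-release.arch format printed by rpm -qa, with versions that start with a digit.

```shell
# Hypothetical helper (not part of EOS or XRootD): reads an installed-package
# list on stdin, in the form printed by `rpm -qa`, and prints every required
# package name (given as arguments) that is absent from that list.
missing_packages() {
    installed=$(cat)
    for pkg in "$@"; do
        # Assumption: the version starts with a digit, so "<name>-<digit>"
        # distinguishes xrootd-client from xrootd-client-libs.
        printf '%s\n' "$installed" | grep -E "^${pkg}-[0-9]" >/dev/null ||
            printf '%s\n' "$pkg"
    done
}

# Example invocation on a host where rpm is available:
#   rpm -qa | missing_packages eos-client eos-server xrootd-client xrootd-server
```

No output means every listed package was found.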

Create a Kerberos `keytab` file¶

The instructions below are for machines on structured cabling which have a fixed IP address. Machines without a fixed IP address—including virtual machines or Docker pods—are not recognised by the CERN Kerberos servers. To install a local EOS instance in this case, we need to run our own KDC server. See Section Installing EOS in docker for more information on how to do this.

The EOS mgm in the XRoot daemon authenticates users using a key from the Kerberos keytab.

If /etc/krb5.keytab does not already exist, we need to create a new EOS service principal in the KDC and install the key in the keytab.

In the case of machines at CERN with a fixed IP address, there is a package for this[^1]:

# yum install cern-get-keytab

This can be used to create the host and EOS service Kerberos keys, and the host and CTA service keys, as follows:

# rm -f /etc/krb5.keytab
# cern-get-keytab --service eosdev --force
# rm /etc/krb5.keytab
# cern-get-keytab --service cta --force
# echo -e "read_kt /etc/krb5.keytab\nlist\nquit" | ktutil

Extract the key for the EOS principal to a new keytab, which should be owned by user daemon so that it is readable by the mgm:

# rm -f /etc/krb5.keytab.eosdev
# cp /etc/krb5.keytab /etc/krb5.keytab.eosdev
# chown daemon:daemon /etc/krb5.keytab.eosdev
# echo -e "read_kt /etc/krb5.keytab.eosdev\nlist\nquit" | ktutil

Note that this process generates a new version of every other key for this host, which might require client users to run kdestroy to discard the corresponding cached tickets.

Note that the ktutil utility can also be run in interactive mode. Here is a more verbose way to extract the keys:

# ktutil
ktutil:  read_kt /etc/krb5.keytab
ktutil:  list
slot KVNO Principal
---- ---- ---------------------------------------------------------------------
   1   14                          devbox$@CERN.CH
   2   14                          devbox$@CERN.CH
   3   14                          devbox$@CERN.CH
   4   14               eoscta/devbox.cern.ch@CERN.CH
   5   14               eoscta/devbox.cern.ch@CERN.CH
   6   14               eoscta/devbox.cern.ch@CERN.CH
ktutil:  delent 1
ktutil:  delent 1
ktutil:  delent 1
ktutil:  list
slot KVNO Principal
---- ---- ---------------------------------------------------------------------
   1   14               eoscta/devbox.cern.ch@CERN.CH
   2   14               eoscta/devbox.cern.ch@CERN.CH
   3   14               eoscta/devbox.cern.ch@CERN.CH
ktutil:  write_kt /etc/krb5.keytab.eos
ktutil:  quit
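To confirm which principals ended up in a keytab, the table printed by the list command can be reduced to a set of distinct principal names. The helper below is a sketch only: keytab_principals is a hypothetical name, and it assumes each table row appears on its own line with numeric slot and KVNO columns, as in the listing above.

```shell
# Hypothetical helper: reads a ktutil "list" table on stdin and prints each
# distinct principal once. Rows are recognised by their numeric slot and KVNO
# columns, so the header, the separator line and any prompts are skipped.
keytab_principals() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { print $3 }' | sort -u
}

# Example invocation, reusing the non-interactive form shown earlier:
#   echo -e "read_kt /etc/krb5.keytab.eos\nlist\nquit" | ktutil | keytab_principals
```

For the second listing above, this would print a single eoscta/devbox.cern.ch@CERN.CH line.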

Create a Simple Shared Secret `keytab` file¶

The cta user must exist before creating the SSS keys. This should be done automatically when the CTA RPMs are installed.

The EOS mgm and fst nodes authenticate to each other using the Simple Shared Secret (SSS) mechanism. The xrdsssadmin tool is used to create a keytab containing the Tape Server key and EOS instance key:

# xrdsssadmin -k cta-taped -u cta -g cta add /etc/eos.keytab
xrdsssadmin: Keyfile '/etc/eos.keytab' does not exist. Create it? (y | n): y
xrdsssadmin: 1 key out of 1 kept (0 expired).
# xrdsssadmin -k eosdev -u daemon -g daemon add /etc/eos.keytab
xrdsssadmin: 2 keys out of 2 kept (0 expired).
# chown daemon:daemon /etc/eos.keytab

Then create a second keytab containing only the Tape Server key.

# cp /etc/eos.keytab /etc/cta-taped.keytab
# xrdsssadmin -k eosdev del /etc/cta-taped.keytab
xrdsssadmin: 1 key out of 2 kept (0 expired).
# chown cta:tape /etc/cta-taped.keytab

Note that XRoot clients which parse a multi-line SSS keytab file use the last line in the file as their key.
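Because the last line determines the key a client uses, it can be worth checking which key name that line carries. The following is a rough sketch under a stated assumption: that each keytab line consists of space-separated fields and that the key name is stored in a field of the form n:<name>. The function name sss_last_key_name is hypothetical, and the assumed format should be checked against a real keytab before relying on it.

```shell
# Hypothetical helper: prints the key name found on the last non-empty line of
# an SSS keytab read from stdin.
# Assumption: fields are space-separated and the key name is held in a field
# of the form n:<name>. This layout is not confirmed by the text above.
sss_last_key_name() {
    awk 'NF { last = $0 }
         END {
             n = split(last, f, " ")
             for (i = 1; i <= n; i++)
                 if (f[i] ~ /^n:/) { print substr(f[i], 3); exit }
         }'
}

# Example invocation (run as a user allowed to read the keytab):
#   sss_last_key_name < /etc/cta-taped.keytab
```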

Configure EOS `sysconfig`¶

It looks like there is a new config file /etc/eos.systemd.conf which overlaps with (or supersedes?) /etc/sysconfig/eos_env. Check which of these is the correct one to use.

Create the /etc/sysconfig/eos_env file based on the example installed by the eos-server RPM:

# cp /etc/sysconfig/eos_env.example /etc/sysconfig/eos_env

Edit the file and comment out the lines for the SYNC and FED daemons. This reduces the XRoot daemon roles to the minimum set of three (mq, mgm and fst), i.e. three XRoot daemons will run for EOS on the local development box.

#sync=sync
#fed=fed

Continue editing the file to set the name of the EOS instance and replace all the hostnames with the fully-qualified domain name of the local development box. The resulting hostname entries should look something like the following (where devbox.cern.ch should be replaced with the FQDN of the development box where EOS is being installed):

EOS_INSTANCE_NAME=eosdev
EOS_GEOTAG=devbox # 8 characters max
EOS_MGM_HOST=devbox.cern.ch
EOS_MGM_HOST_TARGET=devbox.cern.ch
EOS_MGM_MASTER1=devbox.cern.ch
EOS_MGM_MASTER2=devbox.cern.ch
EOS_MGM_ALIAS=devbox.cern.ch
EOS_MAIL_CC="your.name@cern.ch"
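Since several entries must carry the same FQDN, a quick consistency check can catch a hostname that was missed while editing. The sketch below assumes plain NAME=value lines (optionally prefixed with export) as in the example above; the function name check_mgm_hosts is hypothetical.

```shell
# Hypothetical check: prints every EOS_MGM_* host assignment in an env file
# whose value differs from the expected FQDN. No output means all match.
check_mgm_hosts() {
    env_file=$1
    fqdn=$2
    awk -F= -v want="$fqdn" '
        /^(export +)?EOS_MGM_(HOST|HOST_TARGET|MASTER1|MASTER2|ALIAS)=/ {
            value = $2
            gsub(/"/, "", value)      # tolerate quoted values
            if (value != want) print
        }' "$env_file"
}

# Example invocation:
#   check_mgm_hosts /etc/sysconfig/eos_env devbox.cern.ch
```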

Configure XRoot `mgm`¶

SSS will be used for communication between the CTA Tape Server and EOS. In a production system, Kerberos is also required for communication between the user and EOS.

This is configured in the /etc/xrd.cf.mgm file which was installed by the eos-server RPM. Edit this file and comment out the UNIX- and GSI-based authentication mechanisms, leaving only Simple Shared Secret (SSS) and Kerberos (KRB) authentication enabled. Then edit the Kerberos protocol line to use the EOS-specific Kerberos keytab file as shown below:

# UNIX authentication
#sec.protocol unix
# SSS authentication
sec.protocol sss -c /etc/eos.keytab -s /etc/eos.keytab
# KRB  authentication
sec.protocol krb5 /etc/krb5.keytab.eosdev eosdev/<host>@CERN.CH
# GSI authentication
#sec.protocol gsi -crl:0 -cert:/etc/grid-security/daemon/hostcert.pem ...
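After editing, it is easy to verify which mechanisms are still active by listing the uncommented sec.protocol lines. The following is a minimal sketch; active_sec_protocols is a hypothetical helper name, and it simply treats the second whitespace-separated word of each such line as the mechanism name.

```shell
# Hypothetical helper: prints the mechanism name of every uncommented
# sec.protocol directive in the given XRootD configuration file.
active_sec_protocols() {
    awk '$1 == "sec.protocol" { print $2 }' "$1"
}

# Example invocation:
#   active_sec_protocols /etc/xrd.cf.mgm
```

With the configuration shown above, the expected result is the two lines sss and krb5.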

Set the order of authentication mechanisms on all hosts to be Kerberos followed by Simple Shared Secret:

#sec.protbind localhost.localdomain unix sss
#sec.protbind localhost unix sss
sec.protbind * only krb5 sss

Configure the EOS mgm instance name:

mgmofs.broker root://devbox.cern.ch:1097//eos/
mgmofs.instance eosdev
mgmofs.cfgredishost devbox.cern.ch

Ensure that the EOS namespace plugin will be loaded:

#-------------------------------------------------------------------------------
# Set the namespace plugin implementation
#-------------------------------------------------------------------------------
mgmofs.nslib /usr/lib64/libEosNsInMemory.so

Configure the CTA Frontend endpoint and resources:

mgmofs.protowfendpoint ctafrontend.cern.ch:10955
mgmofs.protowfresource /ctafrontend

The EOS MGM is responsible for constructing the destination “open for write” URLs that are passed to CTA prepare requests. These destination URLs are used by tape servers to open the disk files they are writing back to disk. By default, tape servers will write disk files to the default EOS space. If this must be changed, then the following line should be added to /etc/xrd.cf.mgm. The example given names the destination EOS space as spinner. The spinner space is specific to an EOSCTA installation for ALICE. Please see Section Configure the spinner EOS space for more details.

mgmofs.prepare.dest.space spinner

A prepare request with the Prep_EVICT flag can only call XrdMgmOfs::prepare() if XRootD believes an alternative Prepare plugin is present. "xrootd.fslib -2" invokes XrdSfsGetFileSystem2() which tells XRootD that such a plugin is present.

xrootd.fslib -2 libXrdEosMgm.so
xrootd.seclib libXrdSec.so
xrootd.async off nosf
xrootd.chksum adler32

xrd.sched mint 8 maxt 256 idle 64

Create a local directory for EOS mgm. It seems this is not actually used for anything, but creating it suppresses a spurious error in the EOS logs.

# mkdir -p /mgm
# chown daemon:daemon /mgm/

Configure XRoot `fst`¶

In /etc/xrd.cf.fst, replace references to localhost with the FQDN of the local development box:

fstofs.broker root://devbox.cern.ch:1097//eos/

Create a local directory to be used to store files by the EOS fst:

# mkdir -p /fst
# chown daemon:daemon /fst

In order to allow for third party copying, make sure the following lines exist in /etc/sysconfig/eos_env on the fst node:

# Switch off enforcement of SSS in order to allow third party copies to work
export EOS_FST_NO_SSS_ENFORCEMENT=1

Also make sure the ofs.tpc directive is set within the /etc/xrd.cf.fst file. For example:

ofs.tpc  pgm /usr/bin/xrdcp

Start XRoot daemons¶

Start the XRoot daemons that will run the EOS mgm, mq and fst plugins:

# systemctl start eos

The logs for the XRoot daemons will be created under /var/log/eos/fst, /var/log/eos/mgm and /var/log/eos/mq.

Enable EOS authentication mechanisms¶

Enable the Kerberos and Simple Shared Secret authentication mechanisms within EOS (as opposed to XRoot):

# eos vid enable sss
# eos vid enable krb5

Configure the default EOS space¶

Register the directory for the default EOS space:

# echo 'EOS_MGM_URL=root://devbox.cern.ch' > /etc/sysconfig/eos
# eosfstregister -r /fst default:1
###########################
# <eosfstregister> v1.0.0
###########################
/fst : uuid=abae0ba6-ffc5-491f-9ef3-291f291493af fsid=undef
success:   mapped 'abae0ba6-ffc5-491f-9ef3-291f291493af' <=> fsid=1

/usr/sbin/eosfstregister is a convenience script for registering an EOS fst node with an EOS mgm node. It parses the deprecated configuration file /etc/sysconfig/eos to determine the location of the EOS mgm node. For the time being, we need to create this file even though it is not used by EOS directly.

eosfstregister is not maintained by the EOS developers, and the same work can be done by standard EOS commands. We will modify our CTA install/configure/setup scripts to use the standard commands instead of eosfstregister.

Enable the WorkFlow Engine (WFE) for the default EOS space, enable the default EOS space and bring the EOS fst node online:

# eos space config default space.wfe=on
# eos space set default on
# eos node set devbox.cern.ch on

Wait until the EOS disk filesystem comes online:

# eos fs ls /fst | grep online
devbox.cern.ch (1095) 1 /fst default.0 ...evbox.cern.ch booted rw nodrain online unknown raid
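In a setup script, the wait can be automated with a small polling loop. The sketch below is a generic helper that reruns a command until its output contains a pattern; the name wait_for_output, the retry count and the delay are all hypothetical choices, not part of EOS.

```shell
# Hypothetical helper: wait_for_output PATTERN TRIES DELAY CMD [ARGS...]
# Reruns CMD up to TRIES times, sleeping DELAY seconds between attempts, and
# returns 0 as soon as the command output contains PATTERN, else 1.
wait_for_output() {
    pattern=$1
    tries=$2
    delay=$3
    shift 3
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" 2>/dev/null | grep -- "$pattern" >/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example invocation: poll every 2 seconds, for at most 30 attempts.
#   wait_for_output online 30 2 eos fs ls /fst || echo "fst did not come online"
```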

Create the /eos directory within the EOS namespace, map it to the EOS default space and set the number of replicas to 1:

# 
# eos attr -r set sys.forced.layout="replica" /eos
# eos attr -r set sys.forced.nstripes=1 /eos

Configure the spinner EOS space¶

The ALICE LHC experiment requires its EOSCTA instance to have two EOS storage spaces: the default space for archiving files to tape and the spinner space for retrieving them back.

The EOS space named default will be composed of SSDs. End users wishing to store files on tape will copy their files into the default space. The CTA tape servers will then read these files and write them to tape. Once these files are safely stored on tape, they will be immediately deleted from the default space. The default space will therefore act as a pure producer-consumer buffer in which files have a relatively short lifespan.

The EOS space named spinner will be composed of mechanical spinning hard drives. The tape servers will use the spinner space when retrieving files back from tape. Files will remain in the spinner space for a relatively long period of time. Tape servers will explicitly set the space to be used when opening the disk files for writing. The “open for writing” URLs will include eos.space=spinner.
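To illustrate the shape of such a URL, the sketch below appends the eos.space parameter to an XRootD URL. It is only an illustration of the CGI syntax: the function name with_eos_space is hypothetical, and the real URLs are built by the MGM and tape servers, not by a script like this.

```shell
# Hypothetical illustration: appends eos.space=<space> to an XRootD URL,
# using "?" when the URL has no CGI part yet and "&" otherwise.
with_eos_space() {
    url=$1
    space=$2
    case $url in
        *\?*) printf '%s&eos.space=%s\n' "$url" "$space" ;;
        *)    printf '%s?eos.space=%s\n' "$url" "$space" ;;
    esac
}

# Example invocation:
#   with_eos_space root://devbox.cern.ch//eos/users/test/motd spinner
```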

Please note that the default space must not be enforced using the sys.forced.space extended attribute. If this attribute is set then the “open for writing” URLs of the tape servers would also need to include eos.layout.noforce plus the full description of the disk file layout. It was decided that there was no point in sending around disk layout information if it could be avoided.

Create the spinner EOS space:

# eos group set spinner.0 on

Assuming that you have created a /spinner_fst directory, register it for the spinner EOS space:

# eosfstregister -r /spinner_fst spinner:1

There is no need to enable the WorkFlow Engine (WFE) for the spinner space because previously enabling it for the default space enables it for all spaces.

Wait until the EOS disk filesystem comes online:

# eos fs ls /spinner_fst | grep online
devbox.cern.ch (1095) 1 /spinner_fst spinner.0 ...evbox.cern.ch booted rw nodrain online unknown raid

Setting `EOS_MGM_STATVFS_DEFAULT_SPACE`¶

Users who only use xrdcp and xrdfs can query the storage statistics of the EOS space named default by executing the following xrdfs command:

xrdfs query space /

Users can also query a specific EOS space by providing its name in CGI appended to the end of the previous xrdfs query space / command like so:

xrdfs query space /?eos.space=EOS_SPACE_NAME

If users executing the "space-less" xrdfs query space / command should ALWAYS get the storage statistics of an EOS space other than default then the EOS_MGM_STATVFS_DEFAULT_SPACE parameter of the /etc/sysconfig/eos_env file can be used as follows:

EOS_MGM_STATVFS_DEFAULT_SPACE="Name of the EOS space"

The MGM will need to be restarted for this configuration change to take effect:

systemctl restart eos@mgm

Configuring the MGM tape-aware garbage collector¶

There are two types of tape-aware garbage collector within an EOSCTA system, one type in each MGM and the other type in each FST. The garbage collector in the MGM, referred to as the MGM GC, uses a queue of file identifiers organised in Least Recently Used (LRU) order to decide which disk replica should be garbage collected next if and when free disk space is considered to be running low. A garbage collector within an FST, referred to as an FST GC, is purposely very simple. It iterates through the listings of all of the local disk replicas and considers a file for garbage collection if free space is too low and the file being examined is considered too old.

This section describes how to configure an MGM GC. This garbage collector is part of the MGM xrootd process. It is configured through the /etc/xrd.cf.mgm configuration file and through runtime configuration commands issued with the eos client program.

The following parameters must be set in /etc/xrd.cf.mgm:

mgmofs.tapeenabled true
mgmofs.tgc.enablespace EOS_SPACE_NAME

The MGM GC abides by the value of the mgmofs.tapeenabled parameter and will not take any action unless the value is set to true. Garbage collection then has to be enabled per EOS space, hence the mgmofs.tgc.enablespace parameter.

How the MGM GC garbage collects an EOS space can be fine tuned using the following eos commands:

sudo eos space config EOS_SPACE_NAME space.tgc.availbytes=NB_BYTES
sudo eos space config EOS_SPACE_NAME space.tgc.qryperiodsecs=NB_SECS
sudo eos space config EOS_SPACE_NAME space.tgc.totalbytes=NB_BYTES

The space.tgc.availbytes parameter specifies the threshold where the MGM GC considers there to be enough free/available space. The MGM GC will not attempt to garbage collect any files if the actual amount of free/available space is above this number.

The space.tgc.totalbytes parameter specifies the number of storage bytes that must be available to the MGM for read/write access before the MGM GC can even begin to take action. This parameter solves a “startup” problem. Once an MGM XRootD process is started, the MGM GC must not immediately start considering files for garbage collection because FSTs will not have had time to register their free/available space and the MGM GC will see too little free/available space and will start prematurely and incorrectly garbage collecting.
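The interplay of the two thresholds can be summarised as a simple decision rule. The sketch below is not the MGM GC implementation; tgc_should_collect is a hypothetical function, and the handling of the exact boundary values (equal to a threshold) is an assumption.

```shell
# Hypothetical sketch of the decision described above; not the MGM GC code.
# Arguments: current available bytes, current total bytes,
#            space.tgc.availbytes, space.tgc.totalbytes
# Returns 0 if garbage collection may proceed, 1 otherwise.
tgc_should_collect() {
    avail=$1
    total=$2
    availbytes=$3
    totalbytes=$4
    # Startup guard: too little storage has registered with the MGM so far.
    [ "$total" -ge "$totalbytes" ] || return 1
    # Free space is above the threshold: nothing to collect.
    [ "$avail" -gt "$availbytes" ] && return 1
    return 0
}

# Example: 50 bytes free out of 1000, thresholds availbytes=100, totalbytes=500
#   tgc_should_collect 50 1000 100 500 && echo "collect"
```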

The space.tgc.qryperiodsecs parameter specifies the period at which the garbage collector should query the MGM for statistics about the EOS space being managed. This value should be greater than the value of EOS_FST_DELETE_QUERY_INTERVAL in /etc/sysconfig/eos_env and 5 seconds greater than the value of publish.interval of each EOS node/FST in the EOS space being managed by the garbage collector. An EOS node/FST publishes its file system statistics at a period of publish.interval seconds ± 5. The value of publish.interval for a given FST can be set using the following command:

eos node config NODE publish.interval=NB_SECS

The value of publish.interval for a given EOS node/FST can be displayed if it has been set at least once by running the following command:

eos node status NODE | egrep '^publish.interval'
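The constraints on space.tgc.qryperiodsecs can be checked before applying a value. The sketch below reads "5 seconds greater than publish.interval" as "at least publish.interval + 5"; that interpretation and the function name tgc_period_ok are assumptions.

```shell
# Hypothetical check of a candidate space.tgc.qryperiodsecs value.
# Arguments: candidate period, EOS_FST_DELETE_QUERY_INTERVAL, publish.interval
# (all in seconds). Returns 0 if the candidate satisfies both constraints.
tgc_period_ok() {
    period=$1
    delete_interval=$2
    publish_interval=$3
    # Must exceed the delete query interval, and (assumed reading) be at least
    # 5 seconds larger than the publish interval.
    [ "$period" -gt "$delete_interval" ] &&
        [ "$period" -ge $((publish_interval + 5)) ]
}

# Example: a 320 second period with a 300 second delete query interval and a
# 10 second publish interval.
#   tgc_period_ok 320 300 10 && echo "ok"
```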

Test the EOS Installation¶

Perform a simple write and read test of the new EOS installation:

# eos mkdir /eos/users/test
# eos chmod 777 /eos/users/test
# xrdcp /etc/motd root://devbox.cern.ch//eos/users/test/
# xrdcp root://devbox.cern.ch//eos/users/test/motd /tmp/eostest
# diff /tmp/eostest /etc/motd

Installing EOS in Docker¶

The instructions above are for installing EOS on real hardware. To install EOS in a Docker pod, note that you must:

Create a Docker network to allow DNS/reverse DNS queries to work.
Use a privileged Docker container and mount the cgroup to allow systemd to run.

To run systemd on Docker host bandersnatch on network wonderland, the command would be:

$ sudo docker network create wonderland
$ sudo docker run --net=wonderland --name bandersnatch --hostname bandersnatch.wonderland --privileged -v /sys/fs/cgroup:/sys/fs/cgroup:ro -d ctatest /usr/sbin/init

See also the eos-docker GitLab project, prepared by the IT-ST-AD section.

The EOS mgm startup script checks if we are running under systemd, but the test it performs does not work inside a Docker container, as (a) pidof is part of sysvinit-tools, which is not installed, and (b) the systemd process is started as init (a symbolic link to systemd):

/usr/sbin/pidof systemd

This has been reported to the EOS team. In the meantime, the following hack works around the problem:

# yum install -y sysvinit-tools
# ln -s /usr/bin/sleep /tmp/systemd
# /tmp/systemd 1000 &
# systemctl start eos

EOS Access Control Lists¶

Access control lists in EOS are based on POSIX ACLs but with some differences.

EOS user access permission bits are:

a          Archive flag (archiving allowed)
c          Chown flag (owner change allowed)
m,!m       Chmod flag (mode change allowed/disallowed)
r,w,x      Read, Write, Execute (browse)
wo         Write-once access (allows creation, disallows delete)
[+]d,!d    Deletable flag (allows/disallows deletion)
[+]u,!u    Update flag (allows/disallows updates)
q          Quota flag (set quota allowed)
i          Directory immutable flag

EOS system access permission bits are:

p           Prepare flag (allows triggering workflows)

Directories with tape workflows must have the !u and p bits set.

[^1]: For more details, see http://linux.web.cern.ch/linux/docs/kerberos-access.shtml