
Rucio Command Line Tools

Setup

Install the required packages:

pip install --user rucio-clients==38.5.1
pip install --user pyjwt
pip install --user -i https://test.pypi.org/simple/ rucio-dlclient-gluex-fix==0.0.2

Downloads and uploads require the XRootD client to be installed on your system. First check whether it is already available on your machine:

xrdcp -h
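
If you prefer a check that does not print the full help text, the following is a minimal sketch using the POSIX `command -v` builtin. The helper name `have_cmd` is hypothetical and is not part of XRootD or Rucio.

```shell
# have_cmd: hypothetical helper (not part of XRootD or Rucio).
# Succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd xrdcp; then
    echo "xrdcp found at: $(command -v xrdcp)"
else
    echo "xrdcp not found; install the XRootD client first" >&2
fi
```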

If it is not installed, install the XRootD client yourself or ask your system administrator to install it. Choose the option below that matches your operating system.

  • In RPM-based distributions, like Alma, Rocky, Fedora, CentOS Stream, and RHEL, one can install XRootD with
dnf install xrootd-client python3-xrootd
Warning: On RHEL-based distributions, you must first install the EPEL release repository with

dnf install epel-release
  • On Debian 11 or later, and Ubuntu 22.04 or later, XRootD can be installed via apt
apt install xrootd-client python3-xrootd
  • On macOS, XRootD is available via Homebrew
brew install xrootd

Configuration

Create a file named rucio.cfg with the following content:

[client]
rucio_host = https://rucio-jlab.jlab.org
auth_host = https://rucio-jlab.jlab.org
auth_type = oidc
account = <your_account_name>
ca_cert = /etc/pki/tls/certs/ca-bundle.crt
request_retries = 3

[download]
use_rucio_oidc_token = True
Note: Replace <your_account_name> with your username.

Set the environment variable to point to your configuration file:

export RUCIO_CONFIG=/path/to/rucio.cfg
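
Before testing authentication, it can help to confirm that the variable really points at a readable file containing the keys shown above. The following is a minimal sketch, not an official Rucio tool: `check_rucio_config` is a hypothetical helper, and the list of keys it looks for is simply taken from the example rucio.cfg on this page.

```shell
# check_rucio_config: hypothetical helper (not part of Rucio).
# Checks the file given as $1, or $RUCIO_CONFIG if no argument is passed.
check_rucio_config() {
    cfg="${1:-${RUCIO_CONFIG:-}}"
    if [ -z "$cfg" ]; then
        echo "RUCIO_CONFIG is not set" >&2
        return 1
    fi
    if [ ! -r "$cfg" ]; then
        echo "cannot read config file: $cfg" >&2
        return 1
    fi
    # Keys assumed from the example [client] section on this page.
    for key in rucio_host auth_host auth_type account; do
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$cfg"; then
            echo "missing key in $cfg: $key" >&2
            return 1
        fi
    done
    # Catch a template that was copied without editing the account name.
    if grep -q '<your_account_name>' "$cfg"; then
        echo "placeholder <your_account_name> is still present in $cfg" >&2
        return 1
    fi
    echo "config looks complete: $cfg"
}
```

With the function defined in your shell, running `check_rucio_config` with no arguments inspects the file named by RUCIO_CONFIG.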

Test Setup

Verify your installation and authentication by running:

rucio whoami

Complete the authentication flow:

  1. Copy the URL displayed in the terminal and paste it into your web browser
  2. On the CILogon page, select Thomas Jefferson National Accelerator Facility
  3. Log in using your JLab credentials
  4. Copy the authentication code displayed after login
  5. Paste the code back into your terminal and press Enter

Searching for Data

Data in Rucio is identified by a Data Identifier (DID) in the format scope:name. Refer to the Namespace Documentation for path patterns.
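
To illustrate the scope:name format, the sketch below splits a DID at its first colon using plain shell parameter expansion. The variable names are illustrative only; the DID is one of the examples used on this page.

```shell
did='recon:/RunPeriod-2025-01/recon/ver01/REST/132389'

scope="${did%%:*}"   # everything before the first colon
name="${did#*:}"     # everything after the first colon

echo "scope=$scope"  # scope=recon
echo "name=$name"    # name=/RunPeriod-2025-01/recon/ver01/REST/132389
```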

Search for Datasets

To find a logical group of files (a dataset), use rucio did list. Wildcards (*) are supported:

# List reconstructed REST datasets for a run period with ver01
rucio did list 'recon:/RunPeriod-2025-01/recon/ver01/REST/*'

List Files in a Dataset

To see the individual files contained within a dataset:

rucio did content list recon:/RunPeriod-2025-01/recon/ver01/REST/132389

Check Data Location (Tape vs. Disk)

To determine where a dataset's files are physically located (e.g., MSS/Tape vs. Cache/Disk), check the replication rules:

rucio rule list --did <scope>:<dataset_name>

Interpreting the Output

Look at the RSE_EXPRESSION column:

  • ONLY JLAB-TAPE-SE: Data is cold/offline (on tape) and cannot be directly downloaded. You must first request a disk copy (see Data Recall).
  • Includes DISK or CACHE RSE (e.g., JLAB-DISK-SE): Data is online and can be downloaded directly. Proceed to Download Files.

Check the STATE column:

  • OK: Replication is complete
  • REPLICATING: Data is still being copied
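
The RSE_EXPRESSION rule above can be sketched as a small shell function. This is only an illustration: `data_location` is a hypothetical helper, and it assumes that RSE names contain DISK, CACHE, or TAPE as in the examples on this page. Real RSE expressions can be more complex than a single RSE name, so treat the result as a hint and read the rule list yourself when in doubt.

```shell
# data_location: hypothetical helper (not part of Rucio).
# Pass the RSE_EXPRESSION value of every rule on the dataset as arguments.
# Assumes RSE names contain DISK, CACHE, or TAPE, as in the examples above.
data_location() {
    result="unknown"
    for expr in "$@"; do
        case "$expr" in
            *DISK*|*CACHE*) echo "online"; return 0 ;;  # any disk copy is enough
            *TAPE*)         result="tape-only" ;;       # keep looking for a disk rule
        esac
    done
    echo "$result"
}

data_location JLAB-TAPE-SE                # tape-only
data_location JLAB-TAPE-SE JLAB-DISK-SE   # online
```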

Data Recall

If the only rule for a dataset is on tape and there is no disk replica, you must request a disk copy manually.

Step A: Request the Disk Replica

# Add a rule to bring data to JLAB-DISK-SE for 7 days (604800 seconds)
rucio rule add <scope>:<dataset_name> --copies 1 --rses JLAB-DISK-SE --lifetime 604800

Rucio will now stage (copy) the data from Tape to Disk.

Note: You can adjust the lifetime value based on your needs:

  • 7 days: 604800 seconds
  • 14 days: 1209600 seconds
  • 30 days: 2592000 seconds
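
For other durations, the lifetime is the number of days multiplied by 86400 (seconds per day). A minimal sketch using shell arithmetic follows; `days_to_seconds` is a hypothetical helper name.

```shell
# days_to_seconds: hypothetical helper; converts whole days to seconds.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

days_to_seconds 7    # 604800
days_to_seconds 14   # 1209600
days_to_seconds 30   # 2592000
```

The result can be substituted into the rule add command from Step A, for example `--lifetime "$(days_to_seconds 14)"`.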

Step B: Monitor the Disk Rule

Tape recalls can take hours or even days depending on the JLab MSS queue. Monitor your specific rule:

# List all your rules to find the RULE_ID
rucio rule list --account <your_account_name>

# Check the status of a specific rule
rucio rule show <RULE_ID>

Wait until the STATE changes to OK before attempting to download.
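
Because recalls can be slow, you may want to poll rather than check by hand. The sketch below is a generic polling loop, not a Rucio feature: `wait_for_state` is a hypothetical helper that repeatedly runs a command you supply and compares its output with the wanted state. How you extract the STATE value from the rucio rule show output is left to that command, since the exact output layout is not described here.

```shell
# wait_for_state: hypothetical helper (not part of Rucio).
# usage: wait_for_state <wanted_state> <max_tries> <interval_seconds> <command...>
# <command...> must print the current state (for example OK or REPLICATING).
wait_for_state() {
    wanted="$1"; max_tries="$2"; interval="$3"
    shift 3
    try=1
    while [ "$try" -le "$max_tries" ]; do
        state="$("$@")"
        if [ "$state" = "$wanted" ]; then
            echo "state is $wanted after $try check(s)"
            return 0
        fi
        echo "check $try/$max_tries: state is '$state', waiting ${interval}s" >&2
        sleep "$interval"
        try=$((try + 1))
    done
    echo "gave up after $max_tries checks" >&2
    return 1
}
```

For example, if you write your own function `rule_state` that prints the state of your rule, then `wait_for_state OK 48 1800 rule_state` would check every 30 minutes for up to 24 hours.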

Download Files

The rucio download command retrieves data to your local storage or computing node.

Download an Entire Dataset

This is the most common way to get data for analysis. It downloads all files associated with the dataset DID:

rucio download --no-subdir ana:/RunPeriod-2019-11/analysis/ver01/omegapi_prime/merged

Download a Single File

If you only need one specific file, use the file-level DID:

rucio download recon:/RunPeriod-2025-01/recon/ver01/REST/132389/dana_rest_132389_035.hddm

Download to a Specific Directory

# Download to a specific directory
rucio download --no-subdir --dir /path/to/destination <scope>:<name>

Common Workflows

Example: Find and Download Reconstruction Data


# 1. List datasets
rucio did list 'recon:/RunPeriod-2025-01/recon/ver01/REST/*'

# 2. Check if data is on disk
rucio rule list --did recon:/RunPeriod-2025-01/recon/ver01/REST/132389

# 3a. If on tape, request a disk copy for 7 days (604800 seconds)
rucio rule add recon:/RunPeriod-2025-01/recon/ver01/REST/132389 --copies 1 --rses JLAB-DISK-SE --lifetime 604800

# 3b. Monitor until STATE is OK
rucio rule list --account <your_account_name>

# 4. Download the dataset
rucio download recon:/RunPeriod-2025-01/recon/ver01/REST/132389

Troubleshooting

Authentication Issues

  • Ensure the RUCIO_CONFIG environment variable points to your rucio.cfg file
  • Check that your JLab credentials are valid
  • Try running rucio whoami again to refresh your token