Rucio Command Line Tools
Setup
Install the required packages:
pip install --user rucio-clients==38.5.1
pip install --user pyjwt
pip install --user -i https://test.pypi.org/simple/ rucio-dlclient-gluex-fix==0.0.2
For download and upload, the XRootD client must be installed on your system. First check whether it is already available:
xrdcp -h
If it is not installed, install the XRootD client or ask your system administrator to install it. Choose the option that matches your operating system.
- In RPM-based distributions, like Alma, Rocky, Fedora, CentOS Stream, and RHEL, one can install XRootD with
dnf install xrootd-client python3-xrootd
In RHEL-based distributions, it will be necessary to first install the EPEL release repository with
dnf install epel-release
- On Debian 11 or later, and Ubuntu 22.04 or later, XRootD can be installed via apt
apt install xrootd-client python3-xrootd
- On macOS, XRootD is available via Homebrew
brew install xrootd
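The availability check at the start of this section can be wrapped in a small helper, which is handy in job scripts that should fail early when the client is missing. This is a minimal sketch; the function names have_cmd and check_xrootd are illustrative and not part of Rucio or XRootD.

```shell
# Return success (exit status 0) if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print whether the XRootD copy tool (xrdcp) is available.
check_xrootd() {
    if have_cmd xrdcp; then
        echo "xrdcp found"
    else
        echo "xrdcp missing"
    fi
}
```

Calling check_xrootd before a download step makes a missing client visible immediately rather than partway through a transfer.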
Configuration
Create a file named rucio.cfg with the following content:
[client]
rucio_host = https://rucio-jlab.jlab.org
auth_host = https://rucio-jlab.jlab.org
auth_type = oidc
account = <your_account_name>
ca_cert = /etc/pki/tls/certs/ca-bundle.crt
request_retries = 3
[download]
use_rucio_oidc_token = True
Replace <your_account_name> with your username.
Set the environment variable to point to your configuration file:
export RUCIO_CONFIG=/path/to/rucio.cfg
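The configuration steps above could also be scripted, for example when setting up several machines. The sketch below writes the same settings shown in this section to a path of your choice; the function name write_rucio_cfg and the example account and path in the comments are hypothetical.

```shell
# Write a rucio.cfg for the given account to the given path.
# Usage: write_rucio_cfg <account_name> <path_to_cfg>
write_rucio_cfg() {
    account="$1"
    cfg_path="$2"
    cat > "$cfg_path" <<EOF
[client]
rucio_host = https://rucio-jlab.jlab.org
auth_host = https://rucio-jlab.jlab.org
auth_type = oidc
account = $account
ca_cert = /etc/pki/tls/certs/ca-bundle.crt
request_retries = 3
[download]
use_rucio_oidc_token = True
EOF
}

# Example with a hypothetical account name and path:
#   write_rucio_cfg jdoe "$HOME/rucio.cfg"
#   export RUCIO_CONFIG="$HOME/rucio.cfg"
```

Note that the ca_cert path is the one given in this guide and may differ on non-RHEL systems.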
Test Setup
Verify your installation and authentication by running:
rucio whoami
Complete the authentication flow:
- Copy the URL displayed in the terminal and paste it into your web browser
- On the CILogon page, select Thomas Jefferson National Accelerator Facility
- Log in using your JLab credentials
- Copy the authentication code displayed after login
- Paste the code back into your terminal and press Enter
Searching for Data
Data in Rucio is identified by a Data Identifier (DID) in the format scope:name. Refer to the Namespace Documentation for path patterns.
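When scripting around DIDs, it can help to split the scope:name form into its two parts. The sketch below does this with plain shell parameter expansion, splitting on the first colon; the helper names did_scope and did_name are illustrative.

```shell
# Print the scope: everything before the first colon.
did_scope() {
    printf '%s\n' "${1%%:*}"
}

# Print the name: everything after the first colon.
did_name() {
    printf '%s\n' "${1#*:}"
}

# Example:
#   did_scope 'recon:/RunPeriod-2025-01/recon/ver01/REST/132389'
#   did_name  'recon:/RunPeriod-2025-01/recon/ver01/REST/132389'
```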
Search for Datasets
To find a logical group of files (a dataset), use rucio did list. You can use wildcards (*):
# List reconstructed REST datasets for a run period with ver01
rucio did list 'recon:/RunPeriod-2025-01/recon/ver01/REST/*'
List Files in a Dataset
To see the individual files contained within a dataset:
rucio did content list recon:/RunPeriod-2025-01/recon/ver01/REST/132389
Check Data Location (Tape vs. Disk)
To determine where a dataset's files are physically located (e.g., MSS/Tape vs. Cache/Disk), check the replication rules:
rucio rule list --did <scope>:<dataset_name>
Interpreting the Output
Look at the RSE_EXPRESSION column:
- ONLY JLAB-TAPE-SE: Data is cold/offline (on tape) and cannot be directly downloaded. You must first request a disk copy (see Data Recall).
- Includes a DISK or CACHE RSE (e.g., JLAB-DISK-SE): Data is online and can be downloaded directly. Proceed to Download Files.
Check the STATE column:
- OK: Replication is complete
- REPLICATING: Data is still being copied
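The RSE_EXPRESSION interpretation above can be expressed as a small decision helper. This is only a sketch: it assumes you pass in the RSE expressions you read from the rule list yourself, and it relies on the naming convention that disk-backed RSEs contain DISK or CACHE and tape-backed RSEs contain TAPE. The function name dataset_location is hypothetical.

```shell
# Classify a dataset from the RSE expressions of its rules.
# Prints "online" if any expression names a DISK or CACHE RSE,
# "tape" if only TAPE RSEs were seen, and "unknown" otherwise.
dataset_location() {
    result=unknown
    for expr in "$@"; do
        case "$expr" in
            *DISK*|*CACHE*) echo online; return 0 ;;
            *TAPE*) result=tape ;;
        esac
    done
    echo "$result"
}

# Example:
#   dataset_location JLAB-TAPE-SE
#   dataset_location JLAB-TAPE-SE JLAB-DISK-SE
```

A result of tape means a disk copy must be requested before downloading.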
Data Recall
If you see a rule only on Tape with no Disk replica, you must manually request a disk copy.
Step A: Request the Disk Replica
# Add a rule to bring data to JLAB-DISK-SE for 7 days (604800 seconds)
rucio rule add <scope>:<dataset_name> --copies 1 --rses JLAB-DISK-SE --lifetime 604800
Rucio will now stage (copy) the data from Tape to Disk.
Note: You can adjust the lifetime value based on your needs:
- 7 days: 604800 seconds
- 14 days: 1209600 seconds
- 30 days: 2592000 seconds
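The lifetime is given in seconds, so other durations follow from multiplying the number of days by 86400. A minimal helper for this arithmetic might look as follows; the name days_to_seconds is illustrative.

```shell
# Convert a whole number of days to seconds (1 day = 86400 s).
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# Example: request a 14-day disk copy.
#   rucio rule add <scope>:<dataset_name> --copies 1 --rses JLAB-DISK-SE \
#       --lifetime "$(days_to_seconds 14)"
```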
Step B: Monitor the Disk Rule
Tape recalls can take hours or even days depending on the JLab MSS queue. Monitor your specific rule:
# List all your rules to find the RULE_ID
rucio rule list --account <your_account_name>
# Check the status of a specific rule
rucio rule show <RULE_ID>
Wait until the STATE changes to OK before attempting to download.
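Because recalls can take a long time, checking the rule by hand gets tedious. The sketch below polls until the state is OK or a maximum number of attempts is reached. To avoid assuming any particular output format, the state is obtained from a command you supply; wait_for_rule_ok and the rule_state example in the comments are hypothetical helpers, and the "State:" line parsed there is an assumption about the rucio rule show output that you should verify against your own terminal.

```shell
# Poll a caller-supplied command until it prints "OK".
# Usage: wait_for_rule_ok <interval_seconds> <max_tries> <command> [args...]
# Returns 0 once the state is OK, 1 if max_tries is exhausted.
wait_for_rule_ok() {
    interval="$1"
    max_tries="$2"
    shift 2
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        state=$("$@")
        if [ "$state" = "OK" ]; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 1
}

# Example (ASSUMPTION: "rucio rule show" prints a line "State: <value>"):
#   rule_state() { rucio rule show "$1" | awk '/State:/ {print $2; exit}'; }
#   wait_for_rule_ok 600 144 rule_state <RULE_ID> && echo "ready to download"
```

A long interval such as ten minutes is a reasonable choice here, since tape recalls are slow and frequent polling adds load without speeding anything up.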
Download Files
The rucio download command retrieves data to your local storage or computing node.
Download an Entire Dataset
This is the most common way to get data for analysis. It downloads all files associated with the dataset DID:
rucio download --no-subdir ana:/RunPeriod-2019-11/analysis/ver01/omegapi_prime/merged
Download a Single File
If you only need one specific file, use the file-level DID:
rucio download recon:/RunPeriod-2025-01/recon/ver01/REST/132389/dana_rest_132389_035.hddm
Download to a Specific Directory
# Download to a specific directory
rucio download --no-subdir --dir /path/to/destination <scope>:<name>
Common Workflows
Example: Find and Download Reconstruction Data
# 1. List datasets
rucio did list 'recon:/RunPeriod-2025-01/recon/ver01/REST/*'
# 2. Check if data is on disk
rucio rule list --did recon:/RunPeriod-2025-01/recon/ver01/REST/132389
# 3a. If on tape, request disk copy
rucio rule add recon:/RunPeriod-2025-01/recon/ver01/REST/132389 --copies 1 --rses JLAB-DISK-SE --lifetime 604800
# 3b. Monitor until STATE is OK
rucio rule list --account <your_account_name>
# 4. Download the dataset
rucio download recon:/RunPeriod-2025-01/recon/ver01/REST/132389
Troubleshooting
Authentication Issues
- Ensure the RUCIO_CONFIG environment variable points to your rucio.cfg file
- Check that your JLab credentials are valid
- Try running rucio whoami again to refresh your token