GlueX Rucio Namespace

Overview

In Rucio, the smallest unit of data is a file. Files are grouped into datasets, and a single file can be part of multiple datasets. Files and datasets share one naming scheme: each is identified by a Data Identifier (DID) of the form scope:name.

Scopes

GlueX data is divided into multiple scopes that categorize data by type:

  • raw: Raw data from the detector
  • recon: Production reconstructed data
  • calib: Calibration-related data
  • ana: Analysis-level data
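As a minimal sketch of the scope:name form, a DID string can be split at its first colon and the scope checked against the four scopes listed above. The names parse_did, DID, and KNOWN_SCOPES are hypothetical helpers for illustration and are not part of the Rucio client.

```python
from typing import NamedTuple

# The four GlueX scopes listed above.
KNOWN_SCOPES = {"raw", "recon", "calib", "ana"}


class DID(NamedTuple):
    """A Data Identifier split into its two parts (hypothetical helper type)."""
    scope: str
    name: str


def parse_did(did: str) -> DID:
    """Split 'scope:name' at the first colon and validate the scope."""
    scope, sep, name = did.partition(":")
    if not sep or not scope or not name:
        raise ValueError(f"not a scope:name DID: {did!r}")
    if scope not in KNOWN_SCOPES:
        raise ValueError(f"unknown scope {scope!r} in DID {did!r}")
    return DID(scope, name)


if __name__ == "__main__":
    print(parse_did("raw:/RunPeriod-2019-11/rawdata/Run051730"))
```

Splitting at the first colon only means a name that happens to contain a colon would still parse; whether such names occur in practice is not stated here.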

Name Structure

The name part of a DID follows specific patterns depending on the scope:

Raw Data (raw scope)

/<RunPeriod-YYYY-MM>/rawdata/Run<run_num>/<filename>.evio

Reconstructed Data (recon scope)

/<RunPeriod-YYYY-MM>/recon/ver<version_number>/REST/<run_num>/<filename>.hddm

Calibration Data (calib scope)

/<RunPeriod-YYYY-MM>/calib/ver<version_number>/<calibration_trigger>/Run<run_num>/<filename>.evio

Analysis Data (ana scope)

Note: Only merged files are stored at the analysis level.

/<RunPeriod-YYYY-MM>/analysis/ver<version_number>/<reaction_name>/merged/<filename>.root
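The four patterns above can be expressed as small formatting functions. This is a sketch under the assumption that run numbers are zero-padded to six digits and versions to two digits; the function names are hypothetical, and the filename argument is the base name without extension.

```python
def raw_name(run_period: str, run_num: int, filename: str) -> str:
    """Name of a raw-scope file; the run directory carries a 'Run' prefix."""
    return f"/{run_period}/rawdata/Run{run_num:06d}/{filename}.evio"


def recon_name(run_period: str, version: int, run_num: int, filename: str) -> str:
    """Name of a recon-scope file; the run directory has no 'Run' prefix."""
    return f"/{run_period}/recon/ver{version:02d}/REST/{run_num:06d}/{filename}.hddm"


def calib_name(run_period: str, version: int, trigger: str,
               run_num: int, filename: str) -> str:
    """Name of a calib-scope file, grouped by calibration trigger and run."""
    return (f"/{run_period}/calib/ver{version:02d}/{trigger}"
            f"/Run{run_num:06d}/{filename}.evio")


def ana_name(run_period: str, version: int, reaction: str, filename: str) -> str:
    """Name of an ana-scope file; only merged files exist at this level."""
    return f"/{run_period}/analysis/ver{version:02d}/{reaction}/merged/{filename}.root"


if __name__ == "__main__":
    print("raw:" + raw_name("RunPeriod-2019-11", 51730, "hd_rawdata_051730_000"))
    print("recon:" + recon_name("RunPeriod-2019-11", 2, 51730, "dana_rest_051730_000"))
```

Note that the raw and calib patterns use Run<run_num> for the run directory, while the recon pattern uses the bare run number.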

DID Hierarchy

GlueX Rucio implements a two-level hierarchy: datasets contain files. There is no hierarchy above datasets or below files.

Dataset Level

A dataset DID combines the scope with a path that goes up to (but excludes) the filename. The dataset represents a logical grouping of related files.

Example dataset DIDs:

  • raw:/RunPeriod-2019-11/rawdata/Run051730
  • recon:/RunPeriod-2019-11/recon/ver02/REST/051730
  • ana:/RunPeriod-2019-11/analysis/ver01/omegapi_prime/merged

File Level

A file DID combines the scope with the complete path, including the filename.

Example file DIDs:

  • raw:/RunPeriod-2019-11/rawdata/Run051730/hd_rawdata_051730_000.evio
  • recon:/RunPeriod-2019-11/recon/ver02/REST/051730/dana_rest_051730_000.hddm
  • ana:/RunPeriod-2019-11/analysis/ver01/omegapi_prime/merged/tree_omegapi_prime.root
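Because a dataset DID is the file DID with the filename removed, the dataset that follows the naming scheme can be derived from a file DID by dropping the last path component. The sketch below assumes this path-based dataset is the one of interest; a file may also be attached to other datasets, which cannot be inferred from its name. The function name dataset_did_for is hypothetical.

```python
def dataset_did_for(file_did: str) -> str:
    """Return the dataset DID obtained by stripping the filename from a file DID."""
    scope, sep, name = file_did.partition(":")
    if not sep or not scope:
        raise ValueError(f"not a scope:name DID: {file_did!r}")
    # Everything before the last '/' is the dataset path; the rest is the filename.
    parent, slash, filename = name.rpartition("/")
    if not slash or not parent or not filename:
        raise ValueError(f"name has no directory/filename split: {name!r}")
    return f"{scope}:{parent}"


if __name__ == "__main__":
    print(dataset_did_for(
        "recon:/RunPeriod-2019-11/recon/ver02/REST/051730/dana_rest_051730_000.hddm"
    ))
```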

Relationship Summary

  • Each dataset contains one or more files
  • Each file can belong to multiple datasets
  • There are no containers or collections above the dataset level
  • The path-like structure in names is purely organizational; the only actual hierarchy is dataset → files
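The relationships summarized above amount to a many-to-many mapping between datasets and files with no further levels. The toy in-memory model below illustrates this; it is not how Rucio stores the catalog, the class DatasetIndex is hypothetical, and the second dataset name (example_selection) is invented purely for illustration.

```python
from collections import defaultdict


class DatasetIndex:
    """Toy model of the two-level dataset -> files hierarchy (not Rucio itself)."""

    def __init__(self) -> None:
        self._files = defaultdict(set)     # dataset DID -> file DIDs
        self._datasets = defaultdict(set)  # file DID -> dataset DIDs

    def attach(self, dataset_did: str, file_did: str) -> None:
        """Record that a file belongs to a dataset; a file may be attached many times."""
        self._files[dataset_did].add(file_did)
        self._datasets[file_did].add(dataset_did)

    def files_in(self, dataset_did: str) -> list:
        return sorted(self._files.get(dataset_did, ()))

    def datasets_of(self, file_did: str) -> list:
        return sorted(self._datasets.get(file_did, ()))


index = DatasetIndex()
run_dataset = "raw:/RunPeriod-2019-11/rawdata/Run051730"
# Hypothetical second dataset, showing that the path in a name does not
# limit which datasets a file can belong to.
extra_dataset = "raw:/RunPeriod-2019-11/rawdata/example_selection"
file_did = "raw:/RunPeriod-2019-11/rawdata/Run051730/hd_rawdata_051730_000.evio"

index.attach(run_dataset, file_did)
index.attach(extra_dataset, file_did)
print(index.datasets_of(file_did))
```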

Naming Conventions

  • <RunPeriod-YYYY-MM>: Run period identifier with year and month (e.g., RunPeriod-2019-11)
  • <run_num>: Six-digit run number (e.g., 051730)
  • <version_number>: Two-digit version number (e.g., 02)
  • <filename>: Base filename without path or extension; the extension (.evio, .hddm, or .root) is fixed by the pattern for each scope
  • <calibration_trigger>: Specific calibration trigger type
  • <reaction_name>: Physics reaction or analysis name
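These conventions can be checked mechanically with regular expressions. The sketch below covers the raw and recon patterns only and assumes exactly six digits for the run number and two for the version, as stated above; it does not check that the month lies in 01 to 12. The names RAW_NAME_RE, RECON_NAME_RE, and parse_name are hypothetical.

```python
import re
from typing import Optional

# Shared fragment for <RunPeriod-YYYY-MM>; the month range is not validated.
_RUN_PERIOD = r"(?P<run_period>RunPeriod-\d{4}-\d{2})"

RAW_NAME_RE = re.compile(
    rf"/{_RUN_PERIOD}/rawdata/Run(?P<run_num>\d{{6}})/(?P<filename>[^/]+)\.evio"
)
RECON_NAME_RE = re.compile(
    rf"/{_RUN_PERIOD}/recon/ver(?P<version_number>\d{{2}})"
    rf"/REST/(?P<run_num>\d{{6}})/(?P<filename>[^/]+)\.hddm"
)

_PATTERNS = {"raw": RAW_NAME_RE, "recon": RECON_NAME_RE}


def parse_name(scope: str, name: str) -> Optional[dict]:
    """Return the placeholder values in a file name, or None if it does not match."""
    pattern = _PATTERNS.get(scope)
    if pattern is None:
        raise ValueError(f"no pattern defined in this sketch for scope {scope!r}")
    match = pattern.fullmatch(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_name(
        "raw", "/RunPeriod-2019-11/rawdata/Run051730/hd_rawdata_051730_000.evio"
    ))
```

Run numbers and versions are kept as strings so that leading zeros survive a round trip.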