hipo2root¶
Convert HIPO files to ROOT RNTuple format.
Synopsis¶
hipo2root [options] <files...>
Description¶
hipo2root converts HIPO files to ROOT RNTuple format using RNTupleParallelWriter for true concurrent writing. Each thread writes directly to the same output file via its own FillContext, eliminating the need for temporary files per thread.
ROOT Required
This tool requires ROOT 6.30+ to be installed on your system (for RNTuple support). All other hipo-utils tools work without ROOT.
Options¶
| Option | Description | Default |
|---|---|---|
| `-i, --input <files>` | Input HIPO files (comma-separated, supports globs) | - |
| `-o, --output <dir>` | Output directory | `.` |
| `-m, --merged <file>` | Merged output filename | `merged.root` |
| `-t, --tree <name>` | RNTuple name in output file | `hipo` |
| `-j, --jobs <n>` | Number of parallel jobs | CPU cores |
| `-n, --nevt <n>` | Events to inspect per file to find banks with data (increase for rare banks) | 1000 |
| `-I, --include-banks <banks>` | Only include specified banks (comma-separated) | - |
| `-E, --exclude-banks <banks>` | Exclude specified banks (comma-separated) | - |
Examples¶
Basic Conversion¶
# Convert single file
hipo2root data.hipo
# Convert to specific directory
hipo2root -o output/ data.hipo
# Using -i option
hipo2root -i data.hipo -o output/
Multiple Files¶
# Convert all HIPO files in data/
hipo2root data/*.hipo -o output/
# Using -i with comma-separated files
hipo2root -i file1.hipo,file2.hipo,file3.hipo -m output.root
# Custom output filename
hipo2root -m output.root file1.hipo file2.hipo
Parallel Processing¶
Records within each file are processed in parallel across threads using RNTupleParallelWriter:
# Use 4 threads for parallel writing
hipo2root --jobs 4 data/*.hipo
# Use all CPU cores (default)
hipo2root data/*.hipo
# Single file benefits from parallelism too
hipo2root --jobs 8 large_file.hipo
Bank Selection¶
# Include only specific banks
hipo2root -I REC::Particle,REC::Calorimeter data.hipo
# Exclude banks
hipo2root -E "RAW::*" data.hipo
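The quoted pattern `"RAW::*"` above suggests shell-style wildcard matching against bank names. A minimal Python sketch of how such exclusion filtering behaves, assuming `fnmatch` semantics (the tool's actual matcher may differ):

```python
from fnmatch import fnmatch

# Hypothetical bank list; RAW::* banks are excluded, the rest are kept
banks = ["REC::Particle", "RAW::adc", "RAW::tdc"]
kept = [b for b in banks if not fnmatch(b, "RAW::*")]
print(kept)  # ['REC::Particle']
```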
Custom RNTuple Name¶
# Name the RNTuple "events" instead of the default "hipo"
hipo2root -t events data.hipo
Output Structure¶
The ROOT file contains an RNTuple with one field per bank column. All fields are stored as std::vector<T> to handle multi-row banks.
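Judging from the reader example in the next section (field `REC_Particle_pid` for the `pid` column of `REC::Particle`), field names appear to join the bank name and column name with underscores, with `::` replaced by `_`. A small Python sketch of this assumed mapping:

```python
def field_name(bank: str, column: str) -> str:
    """Map a HIPO bank/column pair to the assumed RNTuple field name:
    '::' in the bank name becomes '_', then the column name is appended."""
    return bank.replace("::", "_") + "_" + column

print(field_name("REC::Particle", "pid"))  # REC_Particle_pid
```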
Reading RNTuple Output¶
// C++ example (ROOT 6.30+; in older releases RNTupleReader lives in ROOT::Experimental)
#include <ROOT/RNTupleReader.hxx>
#include <iostream>
#include <vector>

// Open the RNTuple named "hipo" inside output.root
auto reader = ROOT::RNTupleReader::Open("hipo", "output.root");
std::cout << "Entries: " << reader->GetNEntries() << std::endl;

// Access a field: one std::vector<int> per event
auto pidView = reader->GetView<std::vector<int>>("REC_Particle_pid");
for (auto i : reader->GetEntryRange()) {
    for (int pid : pidView(i)) {
        // process particle IDs
    }
}
# Python example (requires ROOT with RNTuple support)
import ROOT
reader = ROOT.RNTupleReader.Open("hipo", "output.root")
print(f"Entries: {reader.GetNEntries()}")
Architecture¶
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Worker 0 │ │ Worker 1 │ │ Worker N │
│ FillContext │ │ FillContext │ │ FillContext │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
└───────────────────┼───────────────────┘
│
┌──────▼──────┐
│ RNTuple │
│ParallelWriter│
│(single file)│
└─────────────┘
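The fan-in above can be mimicked generically: each worker owns a private buffer (its "fill context") and only bulk flushes touch the shared output, so individual fills never contend on a lock. A minimal Python sketch of that pattern (illustrative only; the real tool does this in C++ via RNTupleParallelWriter and FillContext):

```python
import threading

output = []                     # stands in for the single output file
output_lock = threading.Lock()  # serializes only bulk flushes, not fills

def worker(worker_id: int, n_events: int) -> None:
    context = []  # per-thread buffer, analogous to a FillContext
    for i in range(n_events):
        context.append((worker_id, i))  # lock-free "fill"
        if len(context) >= 100:         # flush a full "cluster"
            with output_lock:
                output.extend(context)
            context.clear()
    if context:                         # flush the remaining tail
        with output_lock:
            output.extend(context)

threads = [threading.Thread(target=worker, args=(w, 250)) for w in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(output))  # 1000 events total; order may differ from sequential
```

As the Notes below mention for the real tool, event order in the shared output depends on thread scheduling, but no events are lost.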
Notes¶
- True parallel writing: uses `RNTupleParallelWriter`, where each thread has its own `FillContext` for lock-free concurrent writes to the same file
- No temporary files: all input files stream directly to a single output file; no intermediate temp files or post-merge step required
- Files are processed sequentially, but each file fully utilizes all threads for record processing
- Schema discovery inspects N events per file (configurable with `-n`) to find banks that contain data; increase this value if you have rare banks that appear only occasionally
- Event order within the output may differ from the original sequential order when using multiple threads (records are processed in parallel)
- Output uses 64MB cluster sizes for optimal parallel I/O performance
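The schema-discovery note above amounts to a simple scan: inspect the first N events and keep every bank that held data at least once. A sketch under assumed data shapes (bank names and the event-as-dict structure here are hypothetical, not the tool's internal representation):

```python
def discover_banks(events, n_inspect=1000):
    """Scan up to n_inspect events; keep banks that had rows at least once."""
    found = set()
    for event in events[:n_inspect]:
        for bank, rows in event.items():
            if rows:  # bank present with at least one row
                found.add(bank)
    return sorted(found)

# A rare bank missed by a small n_inspect would be absent from the schema
events = [{"REC::Particle": [1, 2], "RAW::adc": []},
          {"REC::Particle": [3], "REC::Calorimeter": [4]}]
print(discover_banks(events))  # ['REC::Calorimeter', 'REC::Particle']
```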
See Also¶
- hipo-dump - Export to CSV/JSON
- hipo-slice - Extract bank subsets before conversion