Performance

Optimization Tips

1. Use Column Indices Instead of Names

Accessing a column by name resolves the name through a map on every call, which is slower than direct index access:

// Slow: map lookup every call
for (int row = 0; row < bank.getRows(); row++) {
    int pid = bank.getInt("pid", row);  // map lookup
}

// Fast: cache the index
int pidCol = bank.getSchema().getEntryOrder("pid");
for (int row = 0; row < bank.getRows(); row++) {
    int pid = bank.getInt(pidCol, row);  // direct access
}
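
The same effect can be reproduced without the library. The sketch below is a toy stand-in, not HIPO's implementation: `ToyBank`, `columnIndex`, and `countPid` are hypothetical names, and the storage layout is an assumption chosen only to show a by-name accessor that pays a hash lookup per call next to a by-index accessor that reads the buffer directly.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for a bank (hypothetical, not the HIPO class): integer
// columns stored column-major, plus a name -> column index map.
struct ToyBank {
    std::unordered_map<std::string, int> order;  // column name -> index
    std::vector<std::vector<int>> columns;       // columns[col][row]

    void addColumn(const std::string& name, std::vector<int> values) {
        order[name] = static_cast<int>(columns.size());
        columns.push_back(std::move(values));
    }
    int getRows() const {
        return columns.empty() ? 0 : static_cast<int>(columns[0].size());
    }
    // Resolve a name to its index; meant to be called once, outside the hot loop.
    int columnIndex(const std::string& name) const { return order.at(name); }
    // By index: a direct read.
    int getInt(int col, int row) const { return columns[col][row]; }
    // By name: one hash lookup per call, then the same direct read.
    int getInt(const std::string& name, int row) const {
        return getInt(columnIndex(name), row);
    }
};

// Counts rows whose "pid" equals `wanted`, resolving the name only once.
int countPid(const ToyBank& bank, int wanted) {
    const int pidCol = bank.columnIndex("pid");
    int n = 0;
    for (int row = 0; row < bank.getRows(); ++row) {
        if (bank.getInt(pidCol, row) == wanted) ++n;
    }
    return n;
}
```

Both accessors return the same value; the only difference is where the name resolution happens, which is why hoisting it out of the loop is safe.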

2. Reuse Bank Objects

Banks are designed to be reused across events. Don't create new bank objects inside the event loop:

// Good: create once, reuse
hipo::bank particles(dict.getSchema("REC::Particle"));
while (reader.next()) {
    reader.read(event);     // load the current event
    event.read(particles);  // overwrites in-place
}

// Bad: creates a new bank every event
while (reader.next()) {
    reader.read(event);
    hipo::bank particles(dict.getSchema("REC::Particle"));  // allocation!
    event.read(particles);
}
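
Why reuse helps can be seen with a plain `std::vector`. The sketch below is a toy stand-in under the assumption that a bank owns a heap buffer which it refills on every read; `ToyBuffer` and `growthsWhenReused` are hypothetical names and say nothing about how `hipo::bank` is actually implemented.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a bank's storage (hypothetical): refilling keeps the
// heap buffer from earlier events instead of allocating a new one.
struct ToyBuffer {
    std::vector<int> data;
    void read(std::size_t rows) {
        data.clear();  // size drops to 0, capacity is kept
        for (std::size_t i = 0; i < rows; ++i) {
            data.push_back(static_cast<int>(i));
        }
    }
};

// Counts how many events forced the reused buffer to grow.
inline int growthsWhenReused(const std::vector<std::size_t>& eventRows) {
    ToyBuffer buf;  // created once, outside the loop
    int growths = 0;
    for (std::size_t rows : eventRows) {
        const std::size_t before = buf.data.capacity();
        buf.read(rows);
        if (buf.data.capacity() != before) ++growths;
    }
    return growths;
}
```

With event sizes of 1000, 10, 500, and 1000 rows, only the first event grows the buffer; a buffer constructed inside the loop would start from zero capacity every time.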

3. Use Banklist for Multiple Banks

The next(banklist) variant reads all banks in one call, avoiding redundant event parsing:

auto list = reader.getBanks({"REC::Particle", "REC::Event"});
while (reader.next(list)) {
    // more efficient than separate event.read() calls
}

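4. Cache Parser Objects

When filtering rows across many events, construct the Parser once outside the loop instead of passing the expression string to filter() on every event: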

// Good: parse expression once
hipo::Parser filter("pid==11&&charge<0");
while (reader.next(list)) {
    list[0].getMutableRowList().filter(filter);
}

// Bad: re-parses expression every event
while (reader.next(list)) {
    list[0].getMutableRowList().filter("pid==11&&charge<0");
}

5. Use Chain for Multi-File Processing

The chain class handles multi-file processing and spreads the work across worker threads:

hipo::chain ch(0);  // auto-detect thread count
ch.add_pattern("data/*.hipo");
ch.process([](auto& event, int file_idx, long event_idx) {
    // parallelized at record level
});

Performance Characteristics

Read Path

A read does work at three points, listed roughly from most to least expensive:

Record decompression: LZ4 decompression of a compressed record. This is CPU-bound rather than I/O-bound, and its cost is amortized across every event in the record.
Event iteration: advancing within an already-decompressed record copies the event's structures into your banks; no disk access is involved.
Column access: getInt / getFloat by index is a direct buffer read. Access by name adds a map lookup to resolve the name to a column index, so cache the index for hot loops (see Optimization Tips).

Record Sizing

Default record limits:

100,000 events per record
8 MB uncompressed data per record

These are tuned for typical CLAS12 event sizes. Record size is a trade-off:

Larger records: Better compression ratio, larger memory footprint
Smaller records: Lower latency for random access, more I/O overhead
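
Which of the two default limits closes a record depends on the average event size. The sketch below works through that arithmetic. It assumes that 8 MB means 8 * 1024 * 1024 bytes and that a record closes as soon as either limit is reached, and it ignores header overhead; `eventsPerRecord` and `recordsNeeded` are hypothetical helpers, not library functions.

```cpp
#include <algorithm>
#include <cstdint>

// Default limits quoted above. 8 MB is taken as 8 * 1024 * 1024 bytes,
// which is an assumption about the exact constant.
const std::uint64_t kMaxEventsPerRecord = 100000;
const std::uint64_t kMaxRecordBytes = 8ull * 1024 * 1024;

// Estimated events per record for a given average event size.
// Whichever limit is reached first closes the record; headers are ignored.
inline std::uint64_t eventsPerRecord(std::uint64_t avgEventBytes) {
    if (avgEventBytes == 0) return kMaxEventsPerRecord;
    return std::min(kMaxEventsPerRecord, kMaxRecordBytes / avgEventBytes);
}

// Estimated number of records needed to hold totalEvents, rounded up.
inline std::uint64_t recordsNeeded(std::uint64_t totalEvents,
                                   std::uint64_t avgEventBytes) {
    const std::uint64_t per =
        std::max<std::uint64_t>(1, eventsPerRecord(avgEventBytes));
    return (totalEvents + per - 1) / per;
}
```

Under these assumptions, 2 KB events fill a record after 4096 events, so the byte limit is the one that applies; the event-count limit only comes into play for events smaller than roughly 84 bytes.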

Compression

Records are LZ4-compressed as a unit. Grouping many events into one record gives the compressor enough shared structure to work with, so records compress well; a file written as many tiny records compresses poorly. Decompression is CPU-bound, not I/O-bound — see Record Sizing for the size trade-off.

Limitations

No Random Column Access

HIPO's columnar layout is within each bank, not across the file, so reading "column X from all events" still requires decompressing every record. For this access pattern, use record::getColumn(), keeping in mind that each record must still be decompressed first.

No Parallel Writing

The writer class is single-threaded. For parallel workflows, use separate writers for separate output files.

Event Size Limit

The default event buffer is 128 KB. An event that exceeds the buffer size overflows it, so events that may grow beyond the default need a larger buffer.
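
A size check before filling an event is one way to catch this early. The sketch below is illustrative only: `fitsInEvent` is a hypothetical helper, 128 KB is assumed to mean 128 * 1024 bytes, and per-bank and per-event header overhead is not modeled, so a real check should leave some margin.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Default event buffer size quoted above, taken as 128 * 1024 bytes
// (an assumption about the exact constant).
const std::size_t kDefaultEventBytes = 128 * 1024;

// Returns true if banks with the given payload sizes fit in the capacity.
// Header overhead is not modeled, so this is an optimistic estimate.
inline bool fitsInEvent(const std::vector<std::size_t>& bankBytes,
                        std::size_t capacity = kDefaultEventBytes) {
    const std::size_t total =
        std::accumulate(bankBytes.begin(), bankBytes.end(), std::size_t{0});
    return total <= capacity;
}
```

For example, two banks of 60,000 bytes each pass the check, while banks of 100,000 and 40,000 bytes together do not.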

No In-Place Modification

HIPO files cannot be modified in-place. To update data, read from one file and write to another.

Benchmarking

Use the built-in benchmark class to measure performance:

hipo::benchmark readBench;
hipo::benchmark processBench;

while (reader.next()) {
    readBench.resume();
    reader.read(event);
    event.read(particles);
    readBench.pause();

    processBench.resume();
    // analysis code...
    processBench.pause();
}

printf("Read time: %.3f sec\n", readBench.getTimeSec());
printf("Process time: %.3f sec\n", processBench.getTimeSec());
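
The resume/pause pattern above accumulates only the time spent between each resume() and the following pause(). For readers who want the same measurement outside the library, here is a minimal sketch built on `std::chrono`. `StopWatch` is a hypothetical stand-in and is not how `hipo::benchmark` is implemented.

```cpp
#include <chrono>

// Minimal pause/resume stopwatch (hypothetical stand-in, not hipo::benchmark).
// Time accumulates only between a resume() and the following pause().
class StopWatch {
    using Clock = std::chrono::steady_clock;

public:
    void resume() {
        if (!running_) {
            start_ = Clock::now();
            running_ = true;
        }
    }
    void pause() {
        if (running_) {
            total_ += Clock::now() - start_;
            running_ = false;
            ++intervals_;
        }
    }
    // Accumulated time in seconds across all completed intervals.
    double seconds() const {
        return std::chrono::duration<double>(total_).count();
    }
    // Number of completed resume/pause pairs.
    int intervals() const { return intervals_; }

private:
    Clock::time_point start_{};
    Clock::duration total_ = Clock::duration::zero();
    bool running_ = false;
    int intervals_ = 0;
};
```

Using a monotonic clock keeps the totals unaffected by wall-clock adjustments, and ignoring a pause() without a matching resume() keeps an unbalanced call from corrupting the total.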

Record-level benchmarks are also available:

auto& readBM  = record.getReadBenchmark();
auto& unzipBM = record.getUnzipBenchmark();
printf("Disk read: %.3f sec, LZ4 decompress: %.3f sec\n",
       readBM.getTimeSec(), unzipBM.getTimeSec());