High Performance Output (HIPO) is a data format for experimental physics. It stores data in chunked blocks with LZ4 compression, which gives highly efficient read speeds. It was originally implemented in Java for physics reconstruction software built on a service-oriented architecture (SOA), and now has a C++ implementation with wrappers for Fortran, Julia, and Python. Full-featured Python bindings (hipopy) are available via pybind11, with NumPy and Awkward Array integration.
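To make the idea of chunked, compressed storage concrete, here is a minimal Python sketch. It is not the HIPO on-disk layout: the function names (`write_chunks`, `read_chunk`), the length-prefixed framing, and the use of `zlib` from the standard library in place of LZ4 are all assumptions made for illustration. It only shows why grouping events into independently compressed blocks lets a reader decompress the one block it needs.

```python
import struct
import zlib


def write_chunks(events, events_per_chunk):
    """Pack byte-string events into independently compressed chunks.

    Returns (blob, index); index holds one (offset, length) pair per chunk.
    zlib stands in for LZ4, and the framing is hypothetical, not the HIPO layout.
    """
    blob = bytearray()
    index = []
    for start in range(0, len(events), events_per_chunk):
        group = events[start:start + events_per_chunk]
        # Length-prefix each event so the chunk can be split again after decompression.
        raw = b"".join(struct.pack("<I", len(e)) + e for e in group)
        packed = zlib.compress(raw)
        index.append((len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_chunk(blob, index, chunk_id):
    """Decompress one chunk and return its events; other chunks are never touched."""
    offset, length = index[chunk_id]
    raw = zlib.decompress(blob[offset:offset + length])
    events, pos = [], 0
    while pos < len(raw):
        (size,) = struct.unpack_from("<I", raw, pos)
        pos += 4
        events.append(raw[pos:pos + size])
        pos += size
    return events


# Ten made-up events, stored four per chunk.
events = [f"event-{i}".encode() * 10 for i in range(10)]
blob, index = write_chunks(events, events_per_chunk=4)
print("chunks:", len(index))
print("last chunk matches:", read_chunk(blob, index, 2) == events[8:10])
```

The chunk size is a trade-off in this sketch: larger chunks tend to compress better, while smaller chunks reduce how much data must be decompressed to reach a single event.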
Write File Example (Java)
This example writes a simple file in which each event contains one bank.
SchemaBuilder b = new SchemaBuilder("data::clusters",12,1)
    .addEntry("type", "B", "cluster type")      // B - type byte
    .addEntry("n", "S", "cluster multiplicity") // S - type short
    .addEntry("x", "F", "x position")           // F - type float
    .addEntry("y", "F", "y position")           // F - type float
    .addEntry("z", "F", "z position");          // F - type float
Schema schema = b.build();
HipoWriter w = new HipoWriter();
w.getSchemaFactory().addSchema(schema);
w.open("clusters.h5");
Random r = new Random();
r.setSeed(123456);
Event event = new Event();
for(int i = 0; i < 1200; i++){
    event.reset();
    int nrows = r.nextInt(12)+4; // rows 4-15, randomly generated
    Bank bank = new Bank(schema,nrows);
    for(int row = 0; row < nrows; row++){
        bank.putByte( "type", row, (byte) (r.nextInt(8)+1));
        bank.putShort( "n", row, (short) (r.nextInt(15)+1));
        bank.putFloat( "x", row, r.nextFloat());
        bank.putFloat( "y", row, r.nextFloat());
        bank.putFloat( "z", row, r.nextFloat());
    }
    event.write(bank); // write the bank to the event; an event can contain multiple banks
    w.addEvent(event); // add the event to the file
}
w.close(); // close the file
Read File Example (Java)
This example shows how to read the file created by the previous example and do some analysis on the bank that was written.
HipoReader r = new HipoReader("clusters.h5");
Bank[] banks = r.getBanks("data::clusters");
while(r.nextEvent(banks)){
int nrows = banks[0].getRows();
for(int row = 0; row < nrows; row++){
if(banks[0].getInt("type", row)==5){
int nhits = banks[0].getInt("n", row);
if(nhits>4&&nhits<12){
System.out.printf("%4d, %5d, %8.5f %8.5f %8.5f\n",
banks[0].getInt("type", row),nhits,
banks[0].getFloat("x", row),
banks[0].getFloat("y", row),
banks[0].getFloat("z", row));
}
}
}
}
Write File Example (C++)
hipo::schema schemaPart("event::particle",100,1);
hipo::schema schemaDet( "event::detector",100,2);
//---------------------------------------------------------
// Defining structure of the schema (bank)
// The columns in the banks (or leafs, if you like ROOT)
// are given comma separated with type after the name.
// Available types : I-integer, S-short, B-byte, F-float,
// D-double, L-long
schemaPart.parse("pid/S,px/F,py/F,pz/F");
schemaDet.parse("pindex/I,detectorid/I,x/F,y/F,z/F,time/F,energy/F");
//---------------------------------------------------------
// Open a writer and register schemas with the writer.
// The schemas have to be added to the writer before opening
// the file, since they are written into the header of the file.
hipo::writer writer;
writer.getDictionary().addSchema(schemaPart);
writer.getDictionary().addSchema(schemaDet);
writer.open("outputFile.h5");
//hipo::bank partBank(schemaPart,30);
//hipo::bank detBank( schemaDet ,30);
hipo::event outEvent;
for(int i = 0; i < 25000; i++){
int nparts = 2 + rand()%10;
int ndets = 5 + rand()%20;
//-------------------------------------------------
// Create banks with random rows based on schemas
hipo::bank partBank(schemaPart,nparts);
hipo::bank detBank( schemaDet ,ndets);
// Fill banks with random numbers
dataFill(partBank);
dataFill(detBank);
// Printout on the screen content of the banks
printf("particles = %d, detectors = %d\n",nparts, ndets);
partBank.show();
detBank.show();
//-------------------------------------------------
// Clear the output event, append the banks to it,
// and write it to the output file
outEvent.reset();
outEvent.addStructure(partBank);
outEvent.addStructure(detBank);
writer.addEvent(outEvent);
}
writer.close();
Here is the subroutine that fills the banks:
void dataFill(hipo::bank &bank){
int nrows = bank.getRows();
int nentries = bank.getSchema().getEntries();
for(int row = 0; row < nrows; row++){
for(int e = 0 ; e < nentries; e++){
int type = bank.getSchema().getEntryType(e);
if(type==1||type==2||type==3){
int inum = rand()%20;
bank.putInt(bank.getSchema().getEntryName(e).c_str(),row,inum);
}
if(type==4){
float ifloat = ((float) rand()) / (float) RAND_MAX;
bank.putFloat(bank.getSchema().getEntryName(e).c_str(),row,ifloat);
}
}
}
}
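The schema strings passed to `parse` above are comma-separated `name/type` pairs. As a rough illustration of that notation, the following Python sketch splits such a string into columns and estimates the size of one row. It is not the HIPO parser: `parse_schema` and `row_size` are hypothetical helper names, and the byte widths (B=1, S=2, I=4, F=4, D=8, L=8, with no padding) are assumptions based on the usual sizes of those types.

```python
import struct

# Type letters from the C++ example comments, mapped to struct format codes.
# The widths implied here (I = 32-bit, L = 64-bit) are an assumption.
TYPE_CODES = {"B": "b", "S": "h", "I": "i", "F": "f", "D": "d", "L": "q"}


def parse_schema(spec):
    """Split a 'name/T,name/T' string into a list of (name, type_letter) pairs."""
    columns = []
    for item in spec.split(","):
        name, _, type_letter = item.strip().partition("/")
        if not name or type_letter not in TYPE_CODES:
            raise ValueError(f"bad schema entry: {item!r}")
        columns.append((name, type_letter))
    return columns


def row_size(columns):
    """Bytes needed for one row if every column is packed without padding."""
    return sum(struct.calcsize("<" + TYPE_CODES[t]) for _, t in columns)


particle = parse_schema("pid/S,px/F,py/F,pz/F")
detector = parse_schema("pindex/I,detectorid/I,x/F,y/F,z/F,time/F,energy/F")
print(particle)
print("particle row bytes:", row_size(particle))
print("detector row bytes:", row_size(detector))
```

Rejecting unknown type letters early, as done here, is a design choice for the sketch; how the real library reacts to a malformed schema string is not covered by this document.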
Reading File Example (C++)
This example reads the file created by the previous example and shows the content of the banks.
hipo::reader reader;
reader.open("outputFile.h5");
hipo::dictionary factory;
reader.readDictionary(factory);
factory.show();
hipo::bank particles(factory.getSchema("event::particle"));
hipo::bank detectors(factory.getSchema("event::detector"));
hipo::event event;
int counter = 0;
while(reader.next()==true){
reader.read(event);
event.read(particles);
event.read(detectors);
particles.show();
detectors.show();
int nrows = particles.getRows();
printf("---------- PARTICLE BANK CONTENT -------\n");
for(int row = 0; row < nrows; row++){
int pid = particles.getInt("pid",row);
float px = particles.getFloat("px",row);
float py = particles.getFloat("py",row);
float pz = particles.getFloat("pz",row);
printf("%6d : %6d %8.4f %8.4f %8.4f\n",row,pid,px,py,pz);
}
printf("---------- END OF PARTICLE BANK -------\n");
counter++;
}
printf("processed events = %d\n",counter);
Python Bindings (hipopy)
The hipopy package provides full Python access to HIPO files with NumPy and Awkward Array support.
Building
meson setup build -Dpython_bindings=true
cd build && ninja
Requires pybind11 and numpy. Install with: pip install pybind11 numpy
Read File Example (Python)
import hipopy
import numpy as np
with hipopy.open("data.hipo", banks=["REC::Particle"]) as f:
    for banks in f:
        pid = banks["REC::Particle"]["pid"] # NumPy array
        px = banks["REC::Particle"]["px"]
        py = banks["REC::Particle"]["py"]
        pt = np.sqrt(px**2 + py**2)
        print(f"Particles: {len(pid)}, max pT = {pt.max():.3f}")
Write File Example (Python)
import hipopy
schema = hipopy.Schema("data::clusters", 12, 1)
schema.parse("type/B,n/S,x/F,y/F,z/F")
d = hipopy.Dictionary()
d.add_schema(schema)
writer = hipopy.Writer()
writer.add_dictionary(d)
writer.open("clusters.hipo")
for i in range(1200):
    event = hipopy.Event()
    bank = hipopy.Bank(schema, 8)
    for row in range(8):
        bank.put_byte("type", row, row % 8 + 1)
        bank.put_short("n", row, row + 1)
        bank.put_float("x", row, 0.1 * row)
        bank.put_float("y", row, 0.2 * row)
        bank.put_float("z", row, 0.3 * row)
    event.add_structure(bank)
    writer.add_event(event)
writer.close()
Batch Reading (Python)
import hipopy
import numpy as np
# Flat NumPy arrays (all events concatenated)
data = hipopy.read_columns("data.hipo", "REC::Particle", ["pid", "px", "py"])
pt = np.sqrt(data["px"]**2 + data["py"]**2)
# Ragged Awkward Arrays (per-event structure preserved)
particles = hipopy.read_bank("data.hipo", "REC::Particle", ["pid", "px", "py"])
print(particles.type) # e.g. 1000 * {pid: var * int32, px: var * float32, ...}
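The difference between the two batch modes is whether event boundaries survive: a flat column is one long sequence of rows, while a ragged array keeps one variable-length list per event (the `var *` in the type string above). The plain-Python sketch below illustrates that relationship only. It does not use hipopy or Awkward Array, the helper `to_ragged` and the variable `rows_per_event` are hypothetical, and the values are made up.

```python
def to_ragged(flat, counts):
    """Split a flat column into per-event lists, given the row count of each event."""
    if sum(counts) != len(flat):
        raise ValueError("row counts do not add up to the column length")
    ragged, start = [], 0
    for n in counts:
        ragged.append(flat[start:start + n])
        start += n
    return ragged


# Made-up values: three events holding 2, 0 and 3 particles.
flat_pid = [11, 211, 2212, 22, -211]
rows_per_event = [2, 0, 3]

per_event = to_ragged(flat_pid, rows_per_event)
print(per_event)

# Per-event questions need the ragged form, e.g. the particle count of each event.
print([len(event) for event in per_event])
```

In this picture the flat form suits whole-sample histograms, while the ragged form is needed for anything computed event by event, such as multiplicities or pairing particles within an event.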
Installing the Package (Java)
Clone the repository and compile with Maven. The package will be deployed soon.
Installing the Package (C++)
Dependencies
Click each for details
<details>
🔸 Meson and Ninja
- Meson: https://mesonbuild.com/
Likely available in your package manager (apt, brew, dnf, etc.), but the versions may be too old, in which case, use pip (python -m pip install meson ninja)
</details>
<details>
🔸 LZ4
- LZ4: https://lz4.org/
Likely available in your package manager, but if you do not have it, it will be installed locally for you
</details>
<details>
🔸 ROOT (optional)
- ROOT: https://root.cern.ch/
ROOT is optional and only needed for certain extensions and examples, such as HipoDataFrame; if you do not have ROOT, the complete HIPO library will still be built.
</details>
Building
Use standard Meson commands to build HIPO.
For example, to create a build directory named ./build and set the installation location to ./install, run the following command from the source code directory:
meson setup build --prefix=`pwd`/install
<details>
Click here for more details
- you may run this command from any directory; in that case, provide the path to the source code directory (e.g., meson setup build /path/to/source)
- the installation prefix must be an absolute path; you can change it later (meson configure)
</details>
The build directory is where you can compile, test, and more:
```bash
cd build
ninja           # compiles
ninja install   # installs to your specified prefix (../install/, in the example)
ninja test      # runs the tests
ninja clean     # clean the build directory, if you need to start over
```