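ROOT File I/O in Practice (Part 1)
General format of .root files
A .root file is ROOT's own file format; it usually stores data structures such as TTree or TH1.
Writing ROOT files
Writing ordinary objects
Let's first look at how an ordinary object is written to a .root file.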
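import ROOT
# Create a TH1D object (a one-dimensional histogram)
h1 = ROOT.TH1D("h1", "A simple histogram", 10, 0, 12)
# Fill it with data
for i in range(7):
    h1.Fill(i + 1)
# Write it to a file ("UPDATE" creates the file if it does not exist yet)
path = 'test.root'
outfile = ROOT.TFile(path, "UPDATE")
outfile.WriteObject(h1, 'h1')
outfile.Close()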
After running the code, a test.root file is created in the current directory; opening it shows that the h1 object has been written.
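A quick way to confirm the write from Python itself is to reopen the file and ask the stored object about its contents. The following is only a minimal sketch: the helper name `summarize_hist` is made up for illustration, and the `try/except` around `import ROOT` is there solely so the snippet still loads on a machine without PyROOT.

```python
import os

try:
    import ROOT  # PyROOT; guarded only so the snippet loads without it
except ImportError:
    ROOT = None


def summarize_hist(path, name="h1"):
    """Reopen `path` read-only and report a few facts about histogram `name`."""
    infile = ROOT.TFile(path, "READ")
    hist = infile.Get(name)  # a null object (falsy) if the key does not exist
    if not hist:
        infile.Close()
        raise KeyError(name)
    summary = {
        "class": hist.ClassName(),
        "entries": int(hist.GetEntries()),
        "nbins": hist.GetNbinsX(),
    }
    infile.Close()
    return summary


# Only runs when PyROOT is available and the file from the text exists
if ROOT is not None and os.path.exists("test.root"):
    print(summarize_hist("test.root"))
```

For the histogram written above, the summary should report the class TH1D, 10 bins and one entry per `Fill` call.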
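Ways to view the file:
- the VSCODE extension ROOT File Viewer
- ROOT's built-in TBrowser
- ROOT code
(1) With the VSCODE extension, simply right-click the file and choose to open it with VSCODE.
(2) With TBrowser:
# open from a terminal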
root -l test.root
root [0] TBrowser b
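(3) When viewing with ROOT code, the interactive command-line mode is recommended: it is quick, and it is usually how files are inspected on a server. Note that ROOT's interactive prompt takes C++.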
root -l test.root
root [0]
Attaching file test.root as _file0...
(TFile *) 0x5ff3944ba220
root [1] .ls
TFile** test.root
TFile* test.root
KEY: TH1D h1;1 object title
root [2] h1->Draw()
If you now modify the data and run the code once more without deleting the file, the file structure becomes:
TFile** test.root
TFile* test.root
KEY: TH1D h1;2 object title [current cycle]
KEY: TH1D h1;1 object title [backup cycle]
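When this happens it is not always obvious which one to read. By default the newest cycle is read, and the duplication can in fact be avoided at write time.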
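One way to avoid the extra cycle, sketched here under assumptions, is to write with the `kOverwrite` option so that the existing key is replaced rather than kept as a backup. The helper names `write_hist_once` and `list_keys` and the temporary demo path are hypothetical; `TObject.kOverwrite` and `TFile.GetListOfKeys` are regular ROOT API. The guarded import only keeps the snippet loadable without PyROOT.

```python
import os
import tempfile

try:
    import ROOT  # PyROOT; guarded only so the snippet loads without it
except ImportError:
    ROOT = None


def write_hist_once(path, n_fills=7):
    """Write h1 to `path`, replacing an existing h1 instead of adding a cycle."""
    h1 = ROOT.TH1D("h1", "A simple histogram", 10, 0, 12)
    h1.SetDirectory(0)  # detach from ROOT's in-memory directory bookkeeping
    for i in range(n_fills):
        h1.Fill(i + 1)
    outfile = ROOT.TFile(path, "UPDATE")
    # kOverwrite replaces the stored key, so no backup cycle is left behind
    h1.Write("h1", ROOT.TObject.kOverwrite)
    outfile.Close()


def list_keys(path):
    """Return (name, cycle) for every key stored in the file."""
    infile = ROOT.TFile(path, "READ")
    keys = [(key.GetName(), key.GetCycle()) for key in infile.GetListOfKeys()]
    infile.Close()
    return keys


if ROOT is not None:
    demo_path = os.path.join(tempfile.mkdtemp(), "cycle_demo.root")
    write_hist_once(demo_path)
    write_hist_once(demo_path)  # second run against the same file
    print(list_keys(demo_path))  # a single h1 key is expected
```

If a file already contains several cycles, a specific one can still be requested by name, for example `infile.Get("h1;1")`. Opening the file with "RECREATE" instead of "UPDATE" also avoids cycles, at the cost of discarding everything already in the file.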
Writing TTree objects
This is the case you will meet most often.
Let's start with a simple example:
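import ROOT
import array

def copyarray(a, b):
    '''
    Copy a into the first len(a) slots of b, i.e. b[:len(a)] = a.
    b is never resized, because ROOT keeps a pointer to its buffer.
    '''
    for i in range(len(a)):
        b[i] = a[i]
    print(b[:len(a)], a)

outfile = ROOT.TFile('test0.root', "UPDATE")
tree = ROOT.TTree('testtree', 'title')
# One buffer per branch; array buffers are sized for the longest event
ab = {}
ab['a'] = array.array("I", [10])
ab['b'] = array.array("d", [0] * 4)
ab['c'] = array.array('B', [0])  # 1 byte, matching Bool_t ("/O")
ab['d'] = array.array('Q', [0])
ab['e'] = array.array('H', [0] * 5)
tree.Branch('a', ab['a'], "a/i")
tree.Branch('b', ab['b'], "b[a]/D")  # the length of b is taken from branch a
tree.Branch('c', ab['c'], "c/O")
tree.Branch('d', ab['d'], "d/l")
tree.Branch('e', ab['e'], "e[a]/s")  # the length of e is taken from branch a
file = {
    'a': [4, 2, 3, 0],
    'b': [[10, 10, 10, 10.0], [1, 2], [1.1, 2, 3], []],
    'e': [[True, False, True, True], [True, False], [False, True, True], []],
    'c': [False, True, False, True],
    'd': [1 << 56, (1 << 56) | 1, 1 << 16, 1 << 8],
}
for i in range(4):
    # Load event i into the buffers, then let the tree take a snapshot of them
    ab['a'][0] = file['a'][i]
    ab['c'][0] = file['c'][i]
    ab['d'][0] = file['d'][i]
    copyarray(file['b'][i], ab['b'])
    copyarray(file['e'][i], ab['e'])
    tree.Fill()
# The key in the file is named 'tree'; the TTree itself keeps the name 'testtree'
outfile.WriteObject(tree, 'tree')
outfile.Close()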
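The code above writes a tree to the file with five branches: a, b, c, d and e. Branches a, c and d are scalars, while b and e are arrays. Because we are working in Python, the basic procedure is to bind a Python array to each Branch and then fill the tree once per event.
A TTree is a table in which each row is one event. Every event carries the quantities a, b, c, d and e: a, c and d are plain numbers for each event, while b and e are one-dimensional arrays whose length varies from event to event.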
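The "fixed buffer plus length counter" idea can be tried without ROOT at all. The sketch below uses only the standard library and a hypothetical helper `stage_event`. It stages variable-length events into one preallocated `array.array` and checks that the buffer address never moves, which is the property a branch bound to that buffer relies on.

```python
import array


def stage_event(values, buf, counter):
    """Copy one event into the fixed buffer and record its length."""
    if len(values) > len(buf):
        raise ValueError("event is longer than the preallocated buffer")
    counter[0] = len(values)
    for i, value in enumerate(values):
        buf[i] = value  # slots beyond counter[0] keep stale values and are ignored


# Same layout as branches a and b in the example: a counter and a 4-slot buffer
n = array.array("I", [0])
b = array.array("d", [0.0] * 4)

address_before = b.buffer_info()[0]
for event in ([10.0, 10.0, 10.0, 10.0], [1.0, 2.0], [1.1, 2.0, 3.0], []):
    stage_event(event, b, n)
    # Only the first n[0] slots are meaningful for this event
    print(n[0], list(b[:n[0]]))
address_after = b.buffer_info()[0]

print("buffer moved:", address_before != address_after)
```

Assigning to existing slots never reallocates the array, whereas `append` and `pop` may, so a buffer that has been handed to a branch is best left at a fixed size.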
root -l test0.root
root [1] .ls
TFile** test0.root
TFile* test0.root
KEY: TTree tree;1 title
root [2] tree->Print()
******************************************************************************
*Tree :testtree : title *
*Entries : 4 : Total = 3661 bytes File Size = 0 *
* : : Tree compression factor = 1.00 *
******************************************************************************
*Br 0 :a : a/i *
*Entries : 4 : Total Size= 645 bytes One basket in memory *
*Baskets : 0 : Basket Size= 32000 bytes Compression= 1.00 *
*............................................................................*
*Br 1 :b : b[a]/D *
*Entries : 4 : Total Size= 804 bytes One basket in memory *
*Baskets : 0 : Basket Size= 32000 bytes Compression= 1.00 *
*............................................................................*
*Br 2 :c : c/O *
*Entries : 4 : Total Size= 627 bytes One basket in memory *
*Baskets : 0 : Basket Size= 32000 bytes Compression= 1.00 *
*............................................................................*
*Br 3 :d : d/l *
*Entries : 4 : Total Size= 669 bytes One basket in memory *
*Baskets : 0 : Basket Size= 32000 bytes Compression= 1.00 *
*............................................................................*
*Br 4 :e : e[a]/s *
*Entries : 4 : Total Size= 738 bytes One basket in memory *
*Baskets : 0 : Basket Size= 32000 bytes Compression= 1.00 *
*............................................................................*
The main methods for inspecting a TTree are Print, Show and Scan; each has its own strengths and weaknesses.
root [5] tree->Show(0)//Show method
======> EVENT:0
a = 4
b = 10,
10, 10, 10
c = 0
d = 72057594037927936
e = 1,
0, 1, 1
root [6] tree->Show(1)//Show method
======> EVENT:1
a = 2
b = 1,
2
c = 1
d = 72057594037927937
e = 1,
0
root [7] tree->Scan()
***********************************************************************************
* Row * Instance * a.a * b.b * c.c * d.d * e.e *
***********************************************************************************
* 0 * 0 * 4 * 10 * 0 * 7.205e+16 * 1 *
* 0 * 1 * 4 * 10 * 0 * 7.205e+16 * 0 *
* 0 * 2 * 4 * 10 * 0 * 7.205e+16 * 1 *
* 0 * 3 * 4 * 10 * 0 * 7.205e+16 * 1 *
* 1 * 0 * 2 * 1 * 1 * 7.205e+16 * 1 *
* 1 * 1 * 2 * 2 * 1 * 7.205e+16 * 0 *
* 2 * 0 * 3 * 1.1 * 0 * 65536 * 0 *
* 2 * 1 * 3 * 2 * 0 * 65536 * 1 *
* 2 * 2 * 3 * 3 * 0 * 65536 * 1 *
* 3 * 0 * 0 * * 1 * 256 * *
***********************************************************************************
(long long) 10
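The same check can be done from Python by binding buffers with `SetBranchAddress` and looping over the entries, mirroring how the branches were filled. This is only a sketch: `read_events` is a hypothetical helper, it reads just branches a and b, and it assumes the key name 'tree' and a maximum array length of 4 as in the example above. The guarded import only keeps the snippet loadable without PyROOT.

```python
import array
import os

try:
    import ROOT  # PyROOT; guarded only so the snippet loads without it
except ImportError:
    ROOT = None


def read_events(path, key="tree", max_len=4):
    """Return [(a, b[:a]), ...] for every entry of the tree stored under `key`."""
    infile = ROOT.TFile(path, "READ")
    tree = infile.Get(key)
    # Buffers must be at least as large as the longest stored array
    a = array.array("I", [0])
    b = array.array("d", [0.0] * max_len)
    tree.SetBranchAddress("a", a)
    tree.SetBranchAddress("b", b)
    events = []
    for i in range(tree.GetEntries()):
        tree.GetEntry(i)  # loads entry i into the bound buffers
        events.append((a[0], list(b[:a[0]])))
    infile.Close()
    return events


# Only runs when PyROOT is available and the file from the text exists
if ROOT is not None and os.path.exists("test0.root"):
    for a_value, b_values in read_events("test0.root"):
        print(a_value, b_values)
```

Only the first `a` slots of `b` are meaningful for an entry; the remaining slots may still hold values from an earlier, longer entry.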
For writing TTrees in general, I wrote a function that you can use as a reference:
'''
Helpers for writing output to files
'''
import ROOT
import os
import array

def copyarray(a, b):
    # copy a into the first len(a) slots of b: b[:len(a)] = a
    for i in range(len(a)):
        b[i] = a[i]
def savefile(path, file, shape, TreeName='tree'):
    '''
    Save data to a TTree in a file: append if path exists, otherwise create it.
    Input
        path: file path, including the file name
        file: dict, key is the variable name, value is the list of per-event values [event 1 value, ...]
        shape: dict, key is the variable name, value is (per-event shape, type in array, type in tree)
            per-event shape: [] means a single value; [n] means the first dimension holds at most n values.
            n must not be smaller than the longest array that can occur!
    Example
        savefile(savepath,
            {
                'EVENT':numpy.arange(len(ID)),
                'ID':ID,
                'Q':Q,
                'X':X,
                'BOOLdata':BOOLdata,
            },{
                'EVENT':([],"L","/l"),  # in Python 2 "L" stands for uint64; in Python 3 use "Q"
                'ID':([INDEXMAX+1],"i","/I"),  # H,/s
                'Q':([INDEXMAX+1],"L","/l"),
                'X':([INDEXMAX+1],"d","/D"),
                'BOOLdata':([],"H","/s"),
            }
        )
    Types for a TTree Branch:
        C : a character string terminated by the 0 character
        B : an 8 bit signed integer (Char_t)
        b : an 8 bit unsigned integer (UChar_t)
        S : a 16 bit signed integer (Short_t)
        s : a 16 bit unsigned integer (UShort_t)
        I : a 32 bit signed integer (Int_t)
        i : a 32 bit unsigned integer (UInt_t)
        F : a 32 bit floating point (Float_t)
        f : a 24 bit floating point with truncated mantissa (Float16_t)
        D : a 64 bit floating point (Double_t)
        d : a 24 bit truncated floating point (Double32_t)
        L : a 64 bit signed integer (Long64_t)
        l : a 64 bit unsigned integer (ULong64_t)
        G : a long signed integer, stored as 64 bit (Long_t)
        g : a long unsigned integer, stored as 64 bit (ULong_t)
        O : [the letter o, not a zero] a boolean (Bool_t)
    Typecodes for array.array:
        Python 3: b, B, u, h, H, i, I, l, L, q, Q, f or d
        Python 2: b, B, u, h, H, i, I, l, L, f or d, plus c (character)
        code  C type               minimum size in bytes
        b     signed char          1
        B     unsigned char        1
        u     Py_UNICODE           2
        h     signed short         2
        H     unsigned short       2
        i     signed int           2
        I     unsigned int         2
        l     signed long          4
        L     unsigned long        4
        q     signed long long     8
        Q     unsigned long long   8   (uint64)
        f     float                4
        d     double               8
        These sizes are minimums. The real size depends on the platform
        (see array.array(code).itemsize); on typical 64-bit Linux i/I are
        4 bytes and l/L are 8 bytes. The array type and the branch type
        must have the same size.
    '''
    exist0 = True
    bianlianglist = {}  # branch name -> array buffer bound to that branch
    if not os.path.exists(path):
        # New file: create the tree and one branch per variable
        exist0 = False
        print("# path not exists")
        outfile = ROOT.TFile(path, "UPDATE")
        tree = ROOT.TTree(TreeName, TreeName)
        for i in shape:
            if len(shape[i][0]) == 0:
                # scalar branch: a one-element buffer
                bianlianglist[i] = array.array(shape[i][1], [0])
                tree.Branch(i, bianlianglist[i], i + shape[i][2])
            elif len(shape[i][0]) == 1:
                # array branch: all of them share the length counter 'n'
                if 'n' not in bianlianglist:
                    bianlianglist['n'] = array.array('I', [0])
                    tree.Branch('n', bianlianglist['n'], 'n/i')
                bianlianglist[i] = array.array(shape[i][1], [0] * shape[i][0][0])
                tree.Branch(i, bianlianglist[i], i + '[n]' + shape[i][2])
            else:
                print('wrong! shapes with more than one dimension are not implemented yet')
    else:
        # Existing file: fetch the tree by name and rebind the buffers
        outfile = ROOT.TFile(path, "UPDATE")
        tree = outfile.Get(TreeName)
        for i in shape:
            if len(shape[i][0]) == 0:
                bianlianglist[i] = array.array(shape[i][1], [0])
                tree.SetBranchAddress(i, bianlianglist[i])
            elif len(shape[i][0]) == 1:
                if 'n' not in bianlianglist:
                    bianlianglist['n'] = array.array('I', [0])
                    tree.SetBranchAddress('n', bianlianglist['n'])
                bianlianglist[i] = array.array(shape[i][1], [0] * shape[i][0][0])
                tree.SetBranchAddress(i, bianlianglist[i])
            else:
                print('wrong! shapes with more than one dimension are not implemented yet')
    # Number of events, taken from the first variable
    N = len(file[list(file.keys())[0]])
    for i in range(N):
        for x in shape:
            if len(shape[x][0]) == 0:
                bianlianglist[x][0] = file[x][i]
            elif len(shape[x][0]) == 1:
                # 'n' is shared, so all array variables of one event must have the same length
                bianlianglist['n'][0] = len(file[x][i])
                copyarray(file[x][i], bianlianglist[x])
            else:
                print('wrong! shapes with more than one dimension are not implemented yet')
        tree.Fill()
    # outfile.WriteObject(tree,'tree')  # this would store the tree twice
    tree.Write("", ROOT.TObject.kOverwrite)
    outfile.Close()
    print('File saved')
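Because the array typecode sizes are platform dependent, it can be worth checking a `shape` dictionary before handing it to `savefile`. The following standard-library sketch is not part of the function above: the table `ROOT_LEAF_BYTES` and the helpers `sizes_match` and `report_mismatches` are hypothetical names, and the table covers only the fixed-width leaf types.

```python
import array

# In-memory width in bytes of the fixed-width TTree leaf types
ROOT_LEAF_BYTES = {
    "B": 1, "b": 1, "O": 1,
    "S": 2, "s": 2,
    "I": 4, "i": 4, "F": 4,
    "L": 8, "l": 8, "D": 8,
}


def sizes_match(typecode, leaf):
    """True if array.array(typecode) items are as wide as the leaf type, e.g. '/I'."""
    return array.array(typecode).itemsize == ROOT_LEAF_BYTES[leaf.lstrip("/")]


def report_mismatches(shape):
    """Return the variable names whose array typecode and leaf type differ in width."""
    return [name for name, (_, typecode, leaf) in shape.items()
            if not sizes_match(typecode, leaf)]


# Same (per-event shape, array typecode, leaf type) layout as in savefile
example_shape = {
    "EVENT": ([], "Q", "/l"),
    "X": ([8], "d", "/D"),
    "BOOLdata": ([], "H", "/s"),
    "flag": ([], "H", "/O"),  # deliberately mismatched: 2 bytes against a 1-byte Bool_t
}
print(report_mismatches(example_shape))
```

The check compares widths only. It does not catch a signed type paired with an unsigned one, or an integer paired with a float of the same width, so those still need to be checked by eye.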