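Data from the paper "Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models"
The authors of the paper host the data here:
The object weights generated by the authors are here:
Opening it shows a file directory like this:
That is, a directory like this:
The spatial_imgs folder, which is not expanded above, holds the bounding-box coverage information for the images. (I have not fully understood this part myself; if you have a more detailed understanding, please leave a comment.)
Parsing the data: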
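Weight files unrelated to the data
ViT-B_32_spatial_logits.pth, in the vg root directory.
ViT-B_32_clip_obj_feature_dict.pth, in the clip_obj_feature folder.
ViT-B_32_box_proposal_dict.pth and ViT-B_32_spatial_name.pth, in the proposal folder.
These files are all CLIP or ViT weights, so there is no need to load and study them; we only look at the data-specific files.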
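Files related to the data:
Four files are related to the data: des_prompts.pth, des_weight.npy, obj_valid_tris.npy, and sub_valid_tris.npy.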
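des_prompts.pth defines, for each relation, the possible attributes of the sub (subject), the obj (object), the pos (position), and the size.
We print the dictionary's keys with the code below; the output follows it: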
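import torch
# A raw string keeps the Windows backslashes literal (the original mixed "\c", "\\r" and "\d").
state_dict = torch.load(r"C:\code\read_pth\recode\vg\des_prompts.pth")
print(state_dict.keys())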
dict_keys(['carrying_product_human', 'carrying_product_animal', 'carrying_animal_animal', 'carrying_human_human', 'carrying_human_animal', 'carrying_animal_human', 'carrying_animal_product', 'carrying_human_product', 'carrying_product_product', 'covered in_product_animal', 'covered in_product_product', 'covered in_animal_product', 'covering_product_product', 'covering_human_product', 'covering_product_human', 'eating_product_animal', 'eating_animal_animal', 'eating_animal_product', 'eating_product_product', 'flying in_animal_animal', 'flying in_product_product', 'growing on_product_product', 'hanging from_product_product', 'hanging from_animal_product', 'hanging from_human_product', 'hanging from_product_human', 'holding_product_human', 'holding_animal_human', 'holding_human_human', 'holding_human_animal', 'holding_human_product', 'laying on_animal_human', 'laying on_animal_animal', 'laying on_human_animal', 'laying on_human_human', 'laying on_product_product', 'looking at_human_human', 'looking at_animal_human', 'looking at_human_animal', 'looking at_animal_animal', 'looking at_animal_product', 'mounted on_product_product', 'parked on_product_product', 'playing_animal_animal', 'playing_human_human', 'riding_animal_animal', 'riding_human_human', 'riding_product_product', 'says_human_human', 'says_product_product', 'sitting on_product_human', 'sitting on_animal_animal', 'sitting on_human_animal', 'sitting on_animal_human', 'sitting on_human_human', 'sitting on_product_product', 'covered in_human_product', 'covering_product_animal', 'eating_human_product', 'eating_human_animal', 'flying in_animal_product', 'growing on_product_animal', 'growing on_product_human', 'hanging from_product_animal', 'holding_animal_product', 'holding_animal_animal', 'holding_product_animal', 'holding_product_product', 'laying on_human_product', 'laying on_animal_product', 'looking at_human_product', 'looking at_product_animal', 'looking at_product_product', 'lying on_human_product', 'lying 
on_animal_product', 'lying on_human_animal', 'lying on_animal_animal', 'lying on_animal_human', 'lying on_human_human', 'lying on_product_product', 'mounted on_product_human', 'mounted on_product_animal', 'mounted on_human_animal', 'painted on_human_product', 'painted on_product_product', 'painted on_animal_product', 'playing_human_animal', 'playing_human_product', 'playing_animal_product', 'playing_product_product', 'riding_human_product', 'riding_animal_product', 'riding_human_animal', 'says_human_animal', 'says_human_product', 'sitting on_human_product', 'sitting on_animal_product', 'standing on_human_product', 'standing on_animal_product', 'standing on_animal_animal', 'standing on_human_animal', 'standing on_product_product', 'to_animal_product', 'to_human_human', 'to_product_animal', 'to_product_product', 'using_human_product', 'using_animal_product', 'using_human_human', 'using_product_product', 'walking in_human_product', 'walking in_animal_product', 'walking in_animal_animal', 'walking on_human_product', 'walking on_animal_product', 'walking on_animal_animal', 'walking on_product_human', 'watching_human_product', 'watching_human_animal', 'watching_human_human', 'watching_animal_product', 'watching_animal_human', 'standing on_human_human', 'to_human_product', 'to_product_human', 'walking in_human_human', 'walking on_product_product', 'watching_animal_animal', 'watching_product_animal'])
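Each key appears to follow the pattern `<relation>_<category>_<category>`, where the categories are `human`, `animal`, or `product`. Below is a minimal sketch for splitting such a key. It assumes the relation name itself never contains an underscore (true of the keys printed above), and `parse_prompt_key` is a hypothetical helper name. Which category is the subject and which is the object is not confirmed here, so the two are returned in key order.

```python
def parse_prompt_key(key):
    """Split e.g. 'covered in_product_animal' into its three parts.

    Hypothetical helper: assumes the last two underscore-separated
    fields are coarse categories and everything before is the relation.
    """
    relation, first_cat, second_cat = key.rsplit("_", 2)
    return relation, first_cat, second_cat


# Keys copied from the printed output above.
sample_keys = ["carrying_product_human", "covered in_product_animal", "to_human_human"]
parsed = [parse_prompt_key(k) for k in sample_keys]
for item in parsed:
    print(item)
```

Splitting from the right keeps multi-word relations such as `covered in` intact, because the space is part of the relation name and not a separator.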
des_weight.npy defines a weight vector for each of the keys above.
Code used:
import numpy as np
np_des_weight = np.load(r'C:\code\read_pth\recode\vg\des_weight.npy', allow_pickle=True)
print(np_des_weight)
Its output is as follows:
{'painted on_product_product': array([0.5, 0.4, 0.1]), 'to_human_product': array([0.4, 0.4, 0.2]), 'carrying_product_human': array([0.6, 0.2, 0.2]), 'carrying_product_animal': array([0.4, 0.4, 0.2]), 'carrying_animal_animal': array([0.5, 0.4, 0.1]), 'carrying_human_human': array([0.35, 0.35, 0.3 ]), 'carrying_human_animal': array([0.6, 0.3, 0.1]), 'carrying_animal_human': array([0.3, 0.4, 0.3]), 'carrying_animal_product': array([0.2, 0.6, 0.2]), 'carrying_human_product': array([0.4, 0.4, 0.2]), 'carrying_product_product': array([0.4, 0.4, 0.2]), 'covered in_product_animal': array([0.5, 0.4, 0.1]), 'covered in_product_product': array([0.4, 0.4, 0.2]), 'covered in_animal_product': array([0.5, 0.3, 0.2]), 'covering_product_product': array([0.4, 0.4, 0.2]), 'covering_human_product': array([0.2, 0.2, 0.6]), 'covering_product_human': array([0.5, 0.3, 0.2]), 'eating_product_animal': array([0.4, 0.4, 0.2]), 'eating_animal_animal': array([0.4, 0.5, 0.1]), 'eating_animal_product': array([0.4, 0.4, 0.2]), 'eating_product_product': array([0.5, 0.3, 0.2]), 'flying in_animal_animal': array([0.5, 0.5, 0. 
]), 'flying in_product_product': array([0.5 , 0.35, 0.15]), 'growing on_product_product': array([0.15, 0.7 , 0.15]), 'hanging from_product_product': array([0.5, 0.3, 0.2]), 'hanging from_animal_product': array([0.4, 0.4, 0.2]), 'hanging from_human_product': array([0.3, 0.3, 0.4]), 'hanging from_product_human': array([0.5, 0.3, 0.2]), 'holding_product_human': array([0.4, 0.4, 0.2]), 'holding_animal_human': array([0.4, 0.4, 0.2]), 'holding_human_human': array([0.4, 0.4, 0.2]), 'holding_human_animal': array([0.6, 0.2, 0.2]), 'holding_human_product': array([0.4, 0.4, 0.2]), 'laying on_animal_human': array([0.2, 0.2, 0.6]), 'laying on_animal_animal': array([0.2, 0.3, 0.5]), 'laying on_human_animal': array([0.2, 0.3, 0.5]), 'laying on_human_human': array([0.1, 0.1, 0.8]), 'laying on_product_product': array([0.2, 0.2, 0.6]), 'looking at_human_human': array([0.3, 0.3, 0.4]), 'looking at_animal_human': array([0.3, 0.4, 0.3]), 'looking at_human_animal': array([0.5, 0.4, 0.1]), 'looking at_animal_animal': array([0.5, 0.5, 0. 
]), 'looking at_animal_product': array([0.3, 0.5, 0.2]), 'mounted on_product_product': array([0.35, 0.35, 0.3 ]), 'parked on_product_product': array([0.5, 0.3, 0.2]), 'playing_animal_animal': array([0.4, 0.4, 0.2]), 'playing_human_human': array([0.4, 0.4, 0.2]), 'riding_animal_animal': array([0.4, 0.4, 0.2]), 'riding_human_human': array([0.6, 0.2, 0.2]), 'riding_product_product': array([0.3, 0.3, 0.4]), 'says_human_human': array([0.4, 0.4, 0.2]), 'says_product_product': array([0.5 , 0.35, 0.15]), 'sitting on_product_human': array([0.3, 0.3, 0.4]), 'sitting on_animal_animal': array([0.6, 0.2, 0.2]), 'sitting on_human_animal': array([0.4, 0.4, 0.2]), 'sitting on_animal_human': array([0.2, 0.5, 0.3]), 'sitting on_human_human': array([0.1, 0.1, 0.8]), 'sitting on_product_product': array([0.5, 0.4, 0.1]), 'covered in_human_product': array([0.5, 0.4, 0.1]), 'covering_product_animal': array([0.5, 0.4, 0.1]), 'eating_human_product': array([0.45, 0.45, 0.1 ]), 'eating_human_animal': array([0.3, 0.5, 0.2]), 'flying in_animal_product': array([0.7, 0.1, 0.2]), 'growing on_product_animal': array([0.6, 0.3, 0.1]), 'growing on_product_human': array([0.4, 0.4, 0.2]), 'hanging from_product_animal': array([0.5, 0.4, 0.1]), 'holding_animal_product': array([0.4, 0.4, 0.2]), 'holding_animal_animal': array([0.4, 0.4, 0.2]), 'holding_product_animal': array([0.5, 0.4, 0.1]), 'holding_product_product': array([0.4, 0.4, 0.2]), 'laying on_human_product': array([0.2, 0.2, 0.6]), 'laying on_animal_product': array([0.1, 0.1, 0.8]), 'looking at_human_product': array([0.6, 0.2, 0.2]), 'looking at_product_animal': array([0.375, 0.375, 0.25 ]), 'looking at_product_product': array([0.4, 0.4, 0.2]), 'lying on_human_product': array([0.3, 0.3, 0.4]), 'lying on_animal_product': array([0.2, 0.6, 0.2]), 'lying on_human_animal': array([0.6, 0.2, 0.2]), 'lying on_animal_animal': array([0.3, 0.4, 0.3]), 'lying on_animal_human': array([0.5, 0.3, 0.2]), 'lying on_human_human': array([0.2, 0.2, 0.6]), 'lying 
on_product_product': array([0.2, 0.6, 0.2]), 'mounted on_product_human': array([0.6, 0.3, 0.1]), 'mounted on_product_animal': array([0.4, 0.5, 0.1]), 'mounted on_human_animal': array([0.4, 0.4, 0.2]), 'painted on_human_product': array([0.35, 0.35, 0.3 ]), 'painted on_animal_product': array([0.35, 0.45, 0.2 ]), 'playing_human_animal': array([0.35, 0.35, 0.3 ]), 'playing_human_product': array([0.5, 0.3, 0.2]), 'playing_animal_product': array([0.4, 0.4, 0.2]), 'playing_product_product': array([0.4, 0.4, 0.2]), 'riding_human_product': array([0.5, 0.4, 0.1]), 'riding_animal_product': array([0.2, 0.4, 0.4]), 'riding_human_animal': array([0.3, 0.5, 0.2]), 'says_human_animal': array([0.4, 0.4, 0.2]), 'says_human_product': array([0.4, 0.4, 0.2]), 'sitting on_human_product': array([0.4, 0.4, 0.2]), 'sitting on_animal_product': array([0.45, 0.45, 0.1 ]), 'standing on_human_product': array([0.6, 0.3, 0.1]), 'standing on_animal_product': array([0.6, 0.3, 0.1]), 'standing on_animal_animal': array([0.6, 0.3, 0.1]), 'standing on_human_animal': array([0.4, 0.4, 0.2]), 'to_animal_product': array([0.3, 0.3, 0.4]), 'to_human_human': array([0.1, 0.1, 0.8]), 'to_product_animal': array([0.3, 0.3, 0.4]), 'to_product_product': array([0.4, 0.4, 0.2]), 'using_human_product': array([0.5, 0.3, 0.2]), 'using_animal_product': array([0.4, 0.4, 0.2]), 'using_human_human': array([0.5, 0.4, 0.1]), 'using_product_product': array([0.35, 0.35, 0.3 ]), 'walking in_human_product': array([0.4, 0.4, 0.2]), 'walking in_animal_product': array([0.2, 0.6, 0.2]), 'walking in_animal_animal': array([0.45, 0.45, 0.1 ]), 'walking on_human_product': array([0.4, 0.4, 0.2]), 'walking on_animal_product': array([0.4, 0.4, 0.2]), 'walking on_animal_animal': array([0.4, 0.4, 0.2]), 'walking on_product_human': array([0.6, 0.2, 0.2]), 'watching_human_product': array([0.45, 0.45, 0.1 ]), 'watching_human_animal': array([0.4, 0.4, 0.2]), 'watching_human_human': array([0.5, 0.5, 0. 
]), 'watching_animal_product': array([0.4, 0.4, 0.2]), 'watching_animal_human': array([0.5, 0.4, 0.1]), 'standing on_human_human': array([0.5, 0.5, 0. ]), 'to_product_human': array([0.4, 0.4, 0.2]), 'walking in_human_human': array([0.2, 0.2, 0.6]), 'walking on_product_product': array([0.4, 0.4, 0.2]), 'watching_animal_animal': array([0.45, 0.45, 0.1 ]), 'watching_product_animal': array([0.4, 0.4, 0.2]), 'standing on_product_product': array([0.1, 0.1, 0.8])}
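When a `.npy` file holds a pickled dict, `np.load(..., allow_pickle=True)` returns a 0-dimensional object array, and calling `.item()` recovers the dict. The sketch below shows this on a tiny stand-in file (two entries copied from the output above) and checks that each weight vector sums to 1. What the three positions mean is not established here; presumably they weight three kinds of cues, but treat that as an assumption.

```python
import os
import tempfile

import numpy as np

# Stand-in for des_weight.npy: two entries copied from the printed output.
sample = {
    "painted on_product_product": np.array([0.5, 0.4, 0.1]),
    "looking at_product_animal": np.array([0.375, 0.375, 0.25]),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "des_weight_sample.npy")
    # np.save wraps the dict in a 0-d object array and pickles it.
    np.save(path, sample, allow_pickle=True)
    loaded = np.load(path, allow_pickle=True)
    weights = loaded.item()  # unwrap the 0-d array to get the dict back

# Sum each weight vector to check that it is normalized.
sums = {key: float(vec.sum()) for key, vec in weights.items()}
for key, total in sums.items():
    print(key, total)
```

With the real file, the same `.item()` call on `np_des_weight` should give a dict that can be indexed by key.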
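obj_valid_tris.npy stores, for each object, whether it is valid for a given relation.
Code used:
obj_valid_tris = np.load(r'C:\code\read_pth\recode\vg\obj_valid_tris.npy', allow_pickle=True)
print(obj_valid_tris)
Its output is as follows:
(There is a lot of data and pasting all of it would make the post too long, so only part is shown; try printing it yourself.)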
{'airplane_carrying': False, 'airplane_covered in': False, 'airplane_covering': True, 'airplane_eating': False, 'airplane_flying in': True, 'airplane_growing on': False, 'airplane_hanging from': True, 'airplane_holding': False, 'airplane_laying on': True, 'airplane_looking at': True, 'airplane_lying on': True, 'airplane_mounted on': True, 'airplane_parked on': False, 'airplane_playing': False, 'airplane_riding': True, 'airplane_says': True, 'airplane_sitting on': True, 'airplane_standing on': True, 'airplane_to': False, 'airplane_using': True, 'airplane_walking in': False, 'airplane_walking on': False, 'airplane_watching': True, 'animal_carrying': True, 'animal_covered in': True, 'animal_covering': True, 'animal_eating': True, 'animal_flying in': True, 'animal_growing on': True, 'animal_hanging from': True, 'animal_holding': True, 'animal_laying on': True, 'animal_looking at': True, 'animal_lying on': True, 'animal_mounted on': True, 'animal_painted on': True, 'animal_parked on': False, 'animal_playing': False, 'animal_says': True, 'animal_sitting on': True, 'animal_standing on': True, 'animal_to': True, 'animal_using': True, 'animal_walking on': False, 'animal_watching': True, 'arm_carrying': False, 'arm_covered in': True, 'arm_covering': True, 'arm_eating': False, 'arm_growing on': True, 'arm_hanging from': True, 'arm_holding': True, 'arm_laying on': True, 'arm_looking at': False, 'arm_mounted on': True, 'arm_painted on': True, 'arm_parked on': False, 'arm_playing': False, 'arm_says': False, 'arm_sitting on': True, 'arm_standing on': True, 'arm_to': False, 'arm_walking in': False, 'arm_walking on': False, 'arm_watching': False, 'bag_carrying': True, 'bag_covered in': True, 'bag_covering': False, 'bag_eating': False, 'bag_growing on': False, 'bag_hanging from': True, 'bag_holding': True, 'bag_laying on': True, 'bag_looking at': True, 'bag_mounted on': False, 'bag_painted on': False, 'bag_parked on': False, 'bag_playing': False,
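Each key here is `<name>_<relation>` and each value is a boolean. Below is a small sketch that regroups such a dict into `name -> list of valid relations`, using a few entries copied from the output above. The helper name `group_valid_relations` is hypothetical, and splitting on the last underscore assumes relation names never contain one.

```python
from collections import defaultdict


def group_valid_relations(valid_tris):
    """Map each entity name to the sorted relations flagged True for it."""
    grouped = defaultdict(list)
    for key, is_valid in valid_tris.items():
        name, relation = key.rsplit("_", 1)
        if is_valid:
            grouped[name].append(relation)
    return {name: sorted(rels) for name, rels in grouped.items()}


# Entries copied from the printed output above.
sample = {
    "airplane_carrying": False,
    "airplane_covering": True,
    "airplane_flying in": True,
    "bag_carrying": True,
    "bag_eating": False,
}
grouped = group_valid_relations(sample)
print(grouped)
```

For the real file, the dict would first need to be unwrapped with `obj_valid_tris.item()`, since `np.load` returns a 0-dimensional object array for a pickled dict.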
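sub_valid_tris.npy stores, for each subject, whether it is valid for a given relation.
Code used:
sub_valid_tris = np.load(r'C:\code\read_pth\recode\vg\sub_valid_tris.npy', allow_pickle=True)
print(sub_valid_tris)
Its output is as follows:
(There is a lot of data and pasting all of it would make the post too long, so only part is shown; try printing it yourself.)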
{'airplane_carrying': False, 'airplane_covered in': False, 'airplane_covering': True, 'airplane_eating': False, 'airplane_flying in': True, 'airplane_growing on': False, 'airplane_hanging from': False, 'airplane_holding': True, 'airplane_laying on': False, 'airplane_looking at': False, 'airplane_lying on': False, 'airplane_mounted on': True,
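One plausible use of the two tables together is to discard a candidate triplet when either its subject or its object is flagged invalid for the relation. This is my reading and is not confirmed from the authors' code. Below is a minimal sketch with a hypothetical function name (`is_plausible_triplet`) and toy tables whose entries are copied from the outputs above. Treating missing keys as invalid is also an assumption.

```python
def is_plausible_triplet(subject, relation, obj, sub_valid, obj_valid):
    """Return True only if both lookup tables flag the pairing as valid."""
    sub_ok = sub_valid.get(f"{subject}_{relation}", False)
    obj_ok = obj_valid.get(f"{obj}_{relation}", False)
    return sub_ok and obj_ok


# Toy tables; the entries are copied from the two printed outputs above.
sub_valid = {"airplane_flying in": True, "airplane_carrying": False}
obj_valid = {"airplane_flying in": True, "bag_carrying": True}

first = is_plausible_triplet("airplane", "flying in", "airplane", sub_valid, obj_valid)
second = is_plausible_triplet("airplane", "carrying", "bag", sub_valid, obj_valid)
print(first, second)
```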
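Data download
Original data: it is available on the authors' official GitHub repository and can be downloaded directly from there. I also provide a Baidu Netdisk copy.
Baidu Netdisk download link:
File shared via Baidu Netdisk: recode.zip
Link: https://pan.baidu.com/s/1XZxrew0pvDLuCENWQwO1QA
Extraction code: q3pr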
Download of the data I organized:
I used the following code to save des_prompts.pth to csv, txt, and json files.
import csv
import json

import numpy as np
import torch


# Function to save dictionary keys and values to a text file
def save_to_txt(state_dict, filename, delimiter=" "):
    """
    Saves dictionary keys and values (one line per entry) to a text file.

    Args:
        state_dict (dict): The dictionary containing keys and values.
        filename (str): The name of the text file to save to.
        delimiter (str): Separator written between each key and its value.
    """
    with open(filename, 'w', encoding='utf-8') as f:
        for key, value in state_dict.items():
            # Handle potential non-string values by converting them to strings
            key_str = str(key)
            value_str = str(value)
            f.write(f"{key_str}{delimiter}{value_str}\n")


# Function to save dictionary keys and values to a CSV file
def save_to_csv_file(state_dict, csv_filename):
    """
    Saves the contents of a state_dict (keys and corresponding values)
    to a CSV file in a tabular format.

    Only the first element of each of 'sub', 'obj', 'pos' and 'size' is written.

    Args:
        state_dict (dict): The state_dict to be saved.
        csv_filename (str): The name of the output CSV file.
    """
    with open(csv_filename, mode='w', newline='', encoding='utf-8') as csv_file:
        writer = csv.writer(csv_file)
        # Write the header
        writer.writerow(['key', 'sub', 'obj', 'pos', 'size'])
        # Write the data rows
        for key, value in state_dict.items():
            writer.writerow([key, value['sub'][0], value['obj'][0], value['pos'][0], value['size'][0]])


# Function to convert numpy arrays in the dictionary to lists (modifies d in place)
def convert_arrays_to_lists(d):
    for key, value in d.items():
        if isinstance(value, dict):
            convert_arrays_to_lists(value)
        elif isinstance(value, np.ndarray):
            d[key] = value.tolist()
    return d


# Function to save dictionary to a compact JSON file
def save_to_json_file(state_dict, json_filename):
    """
    Saves the contents of a state_dict to a JSON file in a compact format.

    Args:
        state_dict (dict): The state_dict to be saved.
        json_filename (str): The name of the output JSON file.
    """
    # Convert numpy arrays to lists, since JSON cannot serialize ndarrays
    state_dict = convert_arrays_to_lists(state_dict)
    # Save to JSON file in a compact format
    with open(json_filename, 'w', encoding='utf-8') as json_file:
        json.dump(state_dict, json_file, separators=(',', ':'))


if __name__ == '__main__':
    # Load the state_dict (raw string keeps the Windows backslashes literal)
    state_dict = torch.load(r"C:\code\read_pth\recode\vg\des_prompts.pth")
    # Print the keys for reference (optional)
    print(state_dict.keys())  # This can be helpful for debugging or verification

    # Save the state_dict to a text file
    text_filename = "state_dict_output.txt"
    save_to_txt(state_dict, text_filename)  # Replace with your desired filename
    print(f"Successfully saved data to text file: {text_filename}")
    # Staying efficient remains the most important thing in research.

    # Save the state_dict to a CSV file
    csv_filename = "state_dict_output.csv"
    save_to_csv_file(state_dict, csv_filename)
    print(f"State dict keys and values saved to {csv_filename}")

    # Save the state_dict to a JSON file
    json_filename = "state_dict_output.json"
    save_to_json_file(state_dict, json_filename)
    print(f"State dict keys and values saved to {json_filename}")
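The CSV writer above keeps only the first description of each field (`value['sub'][0]` and so on). If the lists hold more than one description, a long-format export keeps them all. Below is a hedged sketch that assumes each entry is a dict whose `sub`, `obj`, `pos`, and `size` values are lists of strings. The helper name `to_long_rows` is hypothetical, and the sample entry is made up for illustration; it is not taken from `des_prompts.pth`.

```python
import csv
import io


def to_long_rows(state_dict, fields=("sub", "obj", "pos", "size")):
    """Yield one (key, field, index, text) row per description."""
    for key, value in state_dict.items():
        for field in fields:
            for index, text in enumerate(value.get(field, [])):
                yield key, field, index, str(text)


# Made-up sample entry, only to exercise the function.
sample = {
    "riding_human_animal": {
        "sub": ["placeholder subject cue A", "placeholder subject cue B"],
        "obj": ["placeholder object cue"],
        "pos": ["placeholder position cue"],
        "size": ["placeholder size cue"],
    }
}

# Write to an in-memory buffer; swap in open(..., newline='') for a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["key", "field", "index", "text"])
rows = list(to_long_rows(sample))
writer.writerows(rows)
print(buffer.getvalue())
```

A long format trades a wider table for more rows, which avoids guessing how many descriptions each field has.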
The contents of the CSV file look like this:
CSV file download link:
File shared via Baidu Netdisk: state_dict_output.csv
Link: https://pan.baidu.com/s/17zIgirIaPypDz42JlWkI8g
Extraction code: 89ni