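Introduction to CVAT
CVAT (Computer Vision Annotation Tool) is an open-source annotation platform with the following advantages:
- It runs in a web browser.
- It can annotate both images and videos.
- It supports AI-assisted automatic annotation using trained models.
- Administrators can distribute annotation tasks within CVAT.
- It has a review mode for checking annotations and raising issues on them.
CVAT annotation features:
- A single image can be annotated with polygons, rectangles, ellipses, polylines, and points, and the image itself can be given a tag.
- Attributes can be predefined for the annotated shapes.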
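CVAT for image
After finishing an annotation task, users can export the results in whichever format suits their needs, for example YOLO 1.1 or LabelMe 3.0. However, a format designed for one specific framework may not support every task type (object detection, instance segmentation, image classification, and so on).
CVAT for image is the CVAT platform's own annotation format. It uses XML syntax and records the annotations for all images of a task in a single file.
Because one file can hold boxes, polygons, tags, and the other shape types side by side, an export in CVAT for image format can cover all of these task types.
Example of data exported in CVAT for image format: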
<?xml version="1.0" encoding="utf-8"?>
<annotations>
<version>1.1</version>
<meta>
<job>
<id>262</id>
<size>21</size>
<mode>annotation</mode>
<overlap>0</overlap>
<bugtracker></bugtracker>
<created>2023-12-14 08:32:59.313518+00:00</created>
<updated>2023-12-20 08:03:26.561203+00:00</updated>
<subset>default</subset>
<start_frame>0</start_frame>
<stop_frame>20</stop_frame>
<frame_filter></frame_filter>
<segments>
<segment>
<id>262</id>
<start>0</start>
<stop>20</stop>
<url>http://192.168.3.254:8080/api/jobs/262</url>
</segment>
</segments>
<owner>
<username>ruoyi-test-user1</username>
<email>r.nnlvgtl@qq.com</email>
</owner>
<assignee></assignee>
<labels>
<label>
<name>人</name>
<color>#43FF40</color>
<type>any</type>
<attributes>
</attributes>
</label>
</labels>
</job>
<dumped>2023-12-20 08:03:40.048887+00:00</dumped>
</meta>
<image id="0" name="hard_hat_workers0.png" width="416" height="416">
<polygon label="人" source="manual" occluded="0" points="117.59,133.51;67.77,229.03;161.75,325.59;240.33,321.99;288.09,141.21" z_order="0">
</polygon>
</image>
<image id="1" name="hard_hat_workers1.png" width="416" height="416">
<polygon label="人" source="manual" occluded="0" points="141.72,201.30;84.72,259.33;122.21,336.88;224.41,329.18;257.28,230.06" z_order="0">
</polygon>
</image>
<image id="2" name="hard_hat_workers10.png" width="416" height="416">
<polygon label="人" source="manual" occluded="0" points="155.59,65.20;79.07,149.94;137.62,249.58;312.23,235.20;361.02,92.42" z_order="0">
</polygon>
</image>
<image id="3" name="hard_hat_workers11.png" width="416" height="415">
<tag label="人" source="manual">
</tag>
</image>
<image id="4" name="hard_hat_workers12.png" width="416" height="415">
<tag label="人" source="manual">
</tag>
</image>
<image id="5" name="hard_hat_workers13.png" width="416" height="415">
<box label="人" source="manual" occluded="0" xtl="111.70" ytl="116.06" xbr="290.51" ybr="292.82" z_order="0">
</box>
</image>
<image id="6" name="hard_hat_workers14.png" width="416" height="416">
<box label="人" source="manual" occluded="0" xtl="71.36" ytl="77.01" xbr="259.33" ybr="270.12" z_order="0">
</box>
</image>
<image id="7" name="hard_hat_workers15.png" width="416" height="416">
<polygon label="人" source="manual" occluded="0" points="80.09,96.02;43.12,226.98;112.96,298.88;279.88,295.80;356.91,141.21" z_order="0">
</polygon>
</image>
<image id="8" name="hard_hat_workers16.png" width="416" height="416">
<polygon label="人" source="manual" occluded="0" points="74.96,110.40;40.55,198.73;151.48,294.77;252.14,263.96;215.17,117.07" z_order="0">
</polygon>
</image>
<image id="9" name="hard_hat_workers17.png" width="416" height="416">
<polygon label="人" source="manual" occluded="0" points="290.15,88.31;186.92,176.65;250.09,278.34;363.59,285.01;403.65,143.26" z_order="0">
</polygon>
</image>
<image id="10" name="hard_hat_workers18.png" width="416" height="416">
<polygon label="人" source="manual" occluded="0" points="148.40,111.94;89.34,190.00;124.78,301.45;263.96,331.75;331.75,263.44" z_order="0">
</polygon>
</image>
<image id="11" name="hard_hat_workers19.png" width="416" height="415">
<polygon label="人" source="manual" occluded="0" points="154.74,70.97;93.77,295.38;317.67,334.32;374.02,114.01;257.72,42.79" z_order="0">
</polygon>
</image>
<image id="12" name="hard_hat_workers2.png" width="416" height="415">
<polygon label="人" source="manual" occluded="0" points="282.83,188.81;234.67,338.42;327.91,395.80;401.69,362.50;405.79,191.89" z_order="0">
</polygon>
</image>
<image id="13" name="hard_hat_workers20.png" width="416" height="416">
<polygon label="人" source="manual" occluded="0" points="127.34,104.23;44.14,279.36;351.26,368.21;406.22,91.39;225.44,72.39" z_order="0">
</polygon>
</image>
<image id="14" name="hard_hat_workers3.png" width="416" height="416">
<polygon label="人" source="manual" occluded="0" points="186.92,149.94;84.20,261.39;271.66,352.81;314.80,253.68;316.34,139.67" z_order="0">
</polygon>
</image>
<image id="15" name="hard_hat_workers4.png" width="416" height="415">
<polygon label="人" source="manual" occluded="0" points="138.35,119.64;13.85,250.81;138.35,336.37;250.55,251.32;180.87,153.97" z_order="0">
</polygon>
</image>
<image id="16" name="hard_hat_workers5.png" width="416" height="415">
<polygon label="人" source="manual" occluded="0" points="100.43,97.61;34.85,194.96;88.65,261.56;175.23,227.24;153.72,122.72" z_order="0">
</polygon>
</image>
<image id="17" name="hard_hat_workers6.png" width="416" height="415">
<polygon label="人" source="manual" occluded="0" points="93.77,128.35;51.25,184.71;112.22,276.42;201.36,235.43;191.12,131.94" z_order="0">
</polygon>
</image>
<image id="18" name="hard_hat_workers7.png" width="416" height="415">
<tag label="人" source="manual">
</tag>
</image>
<image id="19" name="hard_hat_workers8.png" width="416" height="415">
<tag label="人" source="manual">
</tag>
</image>
<image id="20" name="hard_hat_workers9.png" width="415" height="416">
<tag label="人" source="manual">
</tag>
</image>
</annotations>
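To make the layout of the export concrete, here is a minimal sketch that walks the `<image>` elements with the JDK's built-in DOM parser and counts their child elements by tag name (`box`, `polygon`, `tag`, and so on). The class name `CvatShapeCounter` and the method `countShapes` are hypothetical names chosen for this illustration, and the sketch assumes the whole XML is available as a string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of CVAT or of the article's code.
final class CvatShapeCounter {

    private CvatShapeCounter() {
    }

    /**
     * Counts the direct child elements of every <image> element, grouped by tag name.
     * Wraps parser errors in an unchecked exception to keep the call site short.
     */
    static Map<String, Integer> countShapes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, Integer> counts = new LinkedHashMap<>();
            NodeList images = doc.getDocumentElement().getElementsByTagName("image");
            for (int i = 0; i < images.getLength(); i++) {
                NodeList children = images.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip whitespace text nodes; only elements are annotations
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        counts.merge(child.getNodeName(), 1, Integer::sum);
                    }
                }
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse CVAT XML", e);
        }
    }
}
```

Calling `CvatShapeCounter.countShapes(xmlText)` on an export returns a map from element name to the number of annotations of that kind, which is a quick way to see which task types a file actually contains.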
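Reading the XML annotation file with Java
Once you have the XML file in CVAT for image format, a program can read it and convert it into other formats for model training.
The following example reads the XML annotation file with Java.
Utility class for reading the XML (uses Hutool's XmlUtil)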
import cn.hutool.core.util.XmlUtil;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

import java.io.File;
import java.util.ArrayList;
import java.util.List;

/**
 * @author lipei
 */
public class CvatForImageUtil {

    /**
     * Reads a CVAT annotation file (annotations.xml) exported in CVAT for image format.
     *
     * @param xml the exported annotations.xml file
     * @return the labels, boxes, polygons, and tags found in the file
     */
    public static CvatAnnotations readAnnotationsXml(File xml) {
        Document document = XmlUtil.readXML(xml);
        CvatAnnotations cvatAnnotations = new CvatAnnotations();
        // Root node: <annotations>
        Element rootElement = XmlUtil.getRootElement(document);
        // The sample file was exported from a job, so the metadata sits under <meta>/<job>
        Element meta = XmlUtil.getElement(rootElement, "meta");
        Element job = XmlUtil.getElement(meta, "job");
        Element size = XmlUtil.getElement(job, "size");
        cvatAnnotations.setTotalSize(Integer.valueOf(size.getTextContent().trim()));

        // Labels receive a zero-based index in declaration order
        Element labels = XmlUtil.getElement(job, "labels");
        List<CvatAnnotationLabel> cvatAnnotationLabels = new ArrayList<>();
        List<Element> labelList = XmlUtil.getElements(labels, "label");
        for (int i = 0; i < labelList.size(); i++) {
            Element label = labelList.get(i);
            Element name = XmlUtil.getElement(label, "name");
            cvatAnnotationLabels.add(new CvatAnnotationLabel(i, name.getTextContent()));
        }
        cvatAnnotations.setLabels(cvatAnnotationLabels);

        List<CvatAnnotationBox> boxes = new ArrayList<>();
        List<CvatAnnotationPolygon> polygons = new ArrayList<>();
        List<CvatAnnotationTag> tags = new ArrayList<>();
        List<Element> imageList = XmlUtil.getElements(rootElement, "image");
        for (Element image : imageList) {
            String name = image.getAttribute("name");
            Integer width = Integer.valueOf(image.getAttribute("width"));
            Integer height = Integer.valueOf(image.getAttribute("height"));

            // An image may hold several shapes of the same type, so read all of them
            // rather than only the first match
            for (Element box : XmlUtil.getElements(image, "box")) {
                CvatAnnotationBox cvatAnnotationBox = new CvatAnnotationBox();
                cvatAnnotationBox.setImageName(name);
                cvatAnnotationBox.setWidth(width);
                cvatAnnotationBox.setHeight(height);
                cvatAnnotationBox.setLabelName(box.getAttribute("label"));
                // Top-left (xtl, ytl) and bottom-right (xbr, ybr) corners in pixels
                cvatAnnotationBox.setX1(Double.parseDouble(box.getAttribute("xtl")));
                cvatAnnotationBox.setY1(Double.parseDouble(box.getAttribute("ytl")));
                cvatAnnotationBox.setX2(Double.parseDouble(box.getAttribute("xbr")));
                cvatAnnotationBox.setY2(Double.parseDouble(box.getAttribute("ybr")));
                boxes.add(cvatAnnotationBox);
            }

            for (Element polygon : XmlUtil.getElements(image, "polygon")) {
                CvatAnnotationPolygon cvatAnnotationPolygon = new CvatAnnotationPolygon();
                cvatAnnotationPolygon.setImageName(name);
                cvatAnnotationPolygon.setWidth(width);
                cvatAnnotationPolygon.setHeight(height);
                cvatAnnotationPolygon.setLabelName(polygon.getAttribute("label"));
                // The points attribute has the form "x1,y1;x2,y2;..."
                List<double[]> points = new ArrayList<>();
                for (String pair : polygon.getAttribute("points").split(";")) {
                    String[] xySplit = pair.split(",");
                    points.add(new double[]{
                            Double.parseDouble(xySplit[0]),
                            Double.parseDouble(xySplit[1])
                    });
                }
                cvatAnnotationPolygon.setPoints(points);
                polygons.add(cvatAnnotationPolygon);
            }

            for (Element tag : XmlUtil.getElements(image, "tag")) {
                CvatAnnotationTag cvatAnnotationTag = new CvatAnnotationTag();
                cvatAnnotationTag.setImageName(name);
                cvatAnnotationTag.setWidth(width);
                cvatAnnotationTag.setHeight(height);
                cvatAnnotationTag.setLabelName(tag.getAttribute("label"));
                tags.add(cvatAnnotationTag);
            }
        }
        cvatAnnotations.setBoxes(boxes);
        cvatAnnotations.setPolygons(polygons);
        cvatAnnotations.setTags(tags);
        return cvatAnnotations;
    }
}
Entity classes used
// CvatAnnotations.java
import lombok.Data;

import java.io.Serializable;
import java.util.List;

/**
 * @author lipei
 */
@Data
public class CvatAnnotations implements Serializable {
    private static final long serialVersionUID = 1L;
    private Integer totalSize;
    private List<CvatAnnotationLabel> labels;
    private List<CvatAnnotationBox> boxes;
    private List<CvatAnnotationPolygon> polygons;
    private List<CvatAnnotationTag> tags;
}

// CvatAnnotationLabel.java
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.io.Serializable;

/**
 * @author lipei
 */
@Data
@AllArgsConstructor
@NoArgsConstructor
public class CvatAnnotationLabel implements Serializable {
    // Serializable so that a CvatAnnotations instance holding labels can be serialized
    private static final long serialVersionUID = 1L;
    private Integer index;
    private String name;
}

// CvatAnnotationBox.java
import lombok.Data;

import java.io.Serializable;

/**
 * @author lipei
 */
@Data
public class CvatAnnotationBox implements Serializable {
    private static final long serialVersionUID = 1L;
    private String imageName;
    private Integer width;
    private Integer height;
    private String labelName;
    private double x1;
    private double y1;
    private double x2;
    private double y2;
}

// CvatAnnotationPolygon.java
import lombok.Data;

import java.io.Serializable;
import java.util.List;

/**
 * @author lipei
 */
@Data
public class CvatAnnotationPolygon implements Serializable {
    private static final long serialVersionUID = 1L;
    private String imageName;
    private Integer width;
    private Integer height;
    private String labelName;
    // Each entry is an {x, y} pair in pixels
    private List<double[]> points;
}

// CvatAnnotationTag.java
import lombok.Data;

import java.io.Serializable;

/**
 * @author lipei
 */
@Data
public class CvatAnnotationTag implements Serializable {
    // Serializable so that a CvatAnnotations instance holding tags can be serialized
    private static final long serialVersionUID = 1L;
    private String imageName;
    private Integer width;
    private Integer height;
    private String labelName;
}
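As one possible next step toward training data, the sketch below converts the pixel coordinates read above into the normalized text lines commonly used by YOLO-style datasets: `class cx cy w h` for a box, and `class x1 y1 x2 y2 ...` for a polygon. The class `CvatToYolo` and its two methods are hypothetical names introduced for this illustration. They take plain numbers instead of the entity classes so the snippet compiles without Lombok, and they assume the class index is the zero-based label index assigned by the reader.

```java
import java.util.List;
import java.util.Locale;
import java.util.StringJoiner;

// Hypothetical converter for illustration; adapt the output layout to your training framework.
final class CvatToYolo {

    private CvatToYolo() {
    }

    /** Converts a pixel box (top-left, bottom-right) into "class cx cy w h", normalized to the image size. */
    static String boxToYoloLine(int classIndex, double xtl, double ytl, double xbr, double ybr,
                                int imageWidth, int imageHeight) {
        double cx = (xtl + xbr) / 2.0 / imageWidth;
        double cy = (ytl + ybr) / 2.0 / imageHeight;
        double w = (xbr - xtl) / imageWidth;
        double h = (ybr - ytl) / imageHeight;
        // Locale.ROOT keeps '.' as the decimal separator on every machine
        return String.format(Locale.ROOT, "%d %.6f %.6f %.6f %.6f", classIndex, cx, cy, w, h);
    }

    /** Converts pixel polygon points ({x, y} pairs) into "class x1 y1 x2 y2 ...", normalized to the image size. */
    static String polygonToYoloSegLine(int classIndex, List<double[]> points,
                                       int imageWidth, int imageHeight) {
        StringJoiner joiner = new StringJoiner(" ");
        joiner.add(Integer.toString(classIndex));
        for (double[] point : points) {
            joiner.add(String.format(Locale.ROOT, "%.6f", point[0] / imageWidth));
            joiner.add(String.format(Locale.ROOT, "%.6f", point[1] / imageHeight));
        }
        return joiner.toString();
    }
}
```

With the entity classes, a call might look like `CvatToYolo.boxToYoloLine(labelIndex, box.getX1(), box.getY1(), box.getX2(), box.getY2(), box.getWidth(), box.getHeight())`, where `labelIndex` is looked up from the label list by `box.getLabelName()`. One line is written per annotation into a text file named after the image.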