Parsing and Persisting XML Files

Article link: https://blog.csdn.net/weixin_43296576/article/details/106714452

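A recent project involved parsing several kinds of configuration files, mainly xml, json, yml, ini, and properties. This is also my first blog post; I mainly intend it as notes for my own future reference. I haven't been doing Java development for long, so corrections and suggestions are welcome.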

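There are many ways to parse XML files. If you'd like to know more, have a look at the article "XML文档的四种生成和解析方法详解" (Four Ways to Generate and Parse XML Documents, Explained), which describes the advantages and disadvantages of each approach in detail.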

I went with DOM4J, and I've included the database design that goes with it here as well.

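Table holding the tag information from the XML file:
[Figure: XML tag information table]
Tag relationships get their own table mainly to make it easier to restore the XML file later:
[Figure: XML tag relationship table]
Table holding the attribute information from the XML file:
[Figure: XML attribute information table]
Association table between tags and attributes (one tag to many attributes):
[Figure: XML tag-attribute relationship table]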

XML parsing

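The business logic looks like this. The main idea: parse the XML file with dom4j and collect the tag information, attribute information, namespaces, and related fields into a list. Then loop over that list to pull out the tag names, tag values, attribute names, attribute values, and so on, and store them in the database. I chose not to write to the database while parsing because keeping the two steps separate reduces coupling and lets the data be written in batches, which performs better.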

    // Assumed imports: java.io.*, java.util.*, org.dom4j.*, org.dom4j.io.SAXReader.
    // EncodingDetect, LOGGER and the DAO fields belong to the project and are not shown here.
    public Boolean xmlAnalyse(String fileAbsPath, String fileStorePath, Long fileId, String remark) {

        String createUser = "admin";
        String filePath = fileAbsPath + fileStorePath;
        SAXReader reader = new SAXReader();
        Document document;
        // Detect the file's encoding, then parse through a Reader that uses it.
        // try-with-resources makes sure the stream is closed in every case.
        try (InputStream in = new FileInputStream(filePath)) {
            String encoding = EncodingDetect.getJavaEncode(filePath);
            Reader read = new InputStreamReader(in, encoding);
            document = reader.read(read);
        } catch (FileNotFoundException e) {
            LOGGER.error("File not found: " + filePath, e);
            return false;
        } catch (UnsupportedEncodingException e) {
            LOGGER.error("Unsupported file encoding", e);
            return false;
        } catch (DocumentException e) {
            LOGGER.error("Failed to parse the file as XML", e);
            return false;
        } catch (IOException e) {
            LOGGER.error("Failed to read the file", e);
            return false;
        }
        Element root = document.getRootElement();
        List<String> list = new ArrayList<>();
        List<Map<String, Object>> fileMgsList = new ArrayList<>();
        try {
            // Walk the document and flatten it into fileMgsList
            listNodes(root, list, fileMgsList);
        } catch (Exception e) {
            LOGGER.error("Failed to walk the XML document", e);
            return false;
        }
        // Persist the collected data
        return save(fileMgsList, fileId, createUser, remark);
    }

    private void listNodes(Element node, List<String> nodeLists, List<Map<String, Object>> fileMgsList) {
        Map<String, Object> tagMgsMap = new HashMap<>();
        String parent = getParentPath(node);
        tagMgsMap.put("parentTagLevel", parent);
        tagMgsMap.put("tagContent", node.getTextTrim());
        // Position among siblings: 0 for the root, 1-based for child elements
        tagMgsMap.put("tagOrder", (long) nodeLists.size());
        Namespace ns = node.getNamespace();
        String prefix = ns == null ? "" : ns.getPrefix();
        String uri = ns == null ? "" : ns.getURI();
        if (node.isRootElement()) {
            // Strings are compared with isEmpty(), since == and != only compare references
            if (prefix.isEmpty() && !uri.isEmpty()) {
                tagMgsMap.put("nameSpace1", "xmlns=" + uri);
            } else if (!prefix.isEmpty() && !uri.isEmpty()) {
                tagMgsMap.put("nameSpace1", prefix + "=" + uri);
            }
            List<Namespace> namespaceList = node.additionalNamespaces();
            for (Namespace namespace : namespaceList) {
                // Each put overwrites the previous one, so only the last additional namespace is kept
                tagMgsMap.put("nameSpace2", namespace.getPrefix() + "=" + namespace.getURI() + "|");
            }
        } else {
            if (!prefix.isEmpty()) {
                tagMgsMap.put("nameSpace1", uri);
            }
            List<Namespace> namespaceList = node.additionalNamespaces();
            for (Namespace namespace : namespaceList) {
                tagMgsMap.put("nameSpace2", namespace.getURI());
            }
        }
        tagMgsMap.put("tagName", node.getName());
        tagMgsMap.put("qualifiedName", node.getQualifiedName());
        List<Attribute> attributeList = node.attributes();
        tagMgsMap.put("attributes", attributeList);
        // "01" = the tag has attributes, "02" = it has none
        tagMgsMap.put("ifAttribute", attributeList.isEmpty() ? "02" : "01");
        fileMgsList.add(tagMgsMap);
        // Recurse into the child elements of the current node, in document order
        List<Element> elementList = node.elements();
        List<String> nodeList = new ArrayList<>();
        for (Element child : elementList) {
            nodeList.add(child.getName());
            listNodes(child, nodeList, fileMgsList);
        }
    }
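    // Build the ancestor path, nearest parent first, e.g. "b/a/" for <c> inside <a><b>
    public String getParentPath(Element node) {
        StringBuilder parentPath = new StringBuilder();
        while (node.getParent() != null) {
            parentPath.append(node.getParent().getQualifiedName()).append("/");
            node = node.getParent();
        }
        return parentPath.toString();
    }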
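
To make the flattened list easier to picture, here is a minimal sketch of the same pre-order walk. It deliberately uses the JDK's built-in DOM API instead of DOM4J so that it runs without extra dependencies; the class name `FlattenDemo` and the `name|order|parentPath` row format are made up for illustration and are not part of the project code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class FlattenDemo {

    // Builds the reversed ancestor path, e.g. "b/a/" for an element under a/b
    static String parentPath(Element e) {
        StringBuilder sb = new StringBuilder();
        Node p = e.getParentNode();
        while (p instanceof Element) {
            sb.append(((Element) p).getTagName()).append('/');
            p = p.getParentNode();
        }
        return sb.toString();
    }

    // Pre-order walk: records "name|order|parentPath" for every element
    static void flatten(Element e, int order, List<String> out) {
        out.add(e.getTagName() + "|" + order + "|" + parentPath(e));
        int childOrder = 0;
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                flatten((Element) n, ++childOrder, out);
            }
        }
    }

    // Parses the XML string and returns one row per element, in document order
    static List<String> flatten(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        List<String> out = new ArrayList<>();
        flatten(root, 0, out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        for (String row : flatten("<a><b><c/></b><d/></a>")) {
            System.out.println(row);
        }
    }
}
```

For `<a><b><c/></b><d/></a>` the rows are `a|0|`, `b|1|a/`, `c|1|b/a/`, `d|2|a/`. A first child (order 1) always comes right after its parent in the list, which is what the save step relies on when it works out parent ids.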



    // Store the data parsed from the XML file in the database (persistence)
    public Boolean save(List<Map<String, Object>> fileMsgList, Long fileId, String createUser, String remark) {
        Long parentTagId = null;
        Map<String, Object> parentIdMap = new LinkedHashMap<>();
        List<XmlTagAttributeRelEntity> tagAttributeRelTemp = new ArrayList<>();
        List<FileAnalyseRelEntity> fileRelListTemp = new ArrayList<>();
        List<XmlTagInfoEntity> tagInfoEntitieTemp=new ArrayList<>();
        List<XmlTagInfoEntity> tagInfoEntities=new ArrayList<>();
        List<XmlTagRelEntity> xmlTagRelEntitieTemp=new ArrayList<>();
        List<XmlAttributeInfoEntity> attributeTemp=new ArrayList<>();
        List<XmlAttributeInfoEntity> attributes=new ArrayList<>();
        int tagNum=0;
        int attributeNum=0;
        int tagRelNUm=0;
        int fileRelNum=0;
        int attributeRelNum=0;
        for (int i = 0; i < fileMsgList.size(); i++) {
            Map<String, Object> map = (Map<String, Object>) fileMsgList.get(i);
            // Tag table
            XmlTagInfoEntity xmlEntity = new XmlTagInfoEntity();
            xmlEntity.setCreateUser(createUser);
            xmlEntity.setRemark(remark);
            xmlEntity.setQualifiedName((String) map.get("qualifiedName"));
            xmlEntity.setTagName((String) map.get("tagName"));
            xmlEntity.setTagContent((String) map.get("tagContent"));
            Long tagOrder = (Long) map.get("tagOrder");
            xmlEntity.setTagOrder(tagOrder);
            xmlEntity.setIfAttribute((String) map.get("ifAttribute"));
            xmlEntity.setNameSpace1((String) map.get("nameSpace1"));
            xmlEntity.setNameSpace2((String) map.get("nameSpace2"));
            tagInfoEntities.add(xmlEntity);
            tagInfoEntitieTemp.add(xmlEntity);
            // Once more than 1000 rows have accumulated, write them to the database to keep memory use down
            if(tagInfoEntitieTemp.size()>1000){
                int n=xmlTagInfoDao.insertList(tagInfoEntitieTemp);
                tagNum=tagNum+n;
                tagInfoEntitieTemp=new ArrayList<>();
            }
            // Attributes
            List<Attribute> attributeList = (List<Attribute>) map.get("attributes");
            for (Attribute attribute : attributeList) {
                XmlAttributeInfoEntity xmlattribute = new XmlAttributeInfoEntity();
                xmlattribute.setAttributeName(attribute.getQualifiedName());
                xmlattribute.setAttributeValue(attribute.getValue());
                xmlattribute.setCreateUser(createUser);
                xmlattribute.setRemark(remark);
                attributes.add(xmlattribute);
                attributeTemp.add(xmlattribute);
            }
            if(attributeTemp.size()>1000){
                int m=xmlAttributeInfoDao.insertList(attributeTemp);
                attributeNum=attributeNum+m;
                attributeTemp=new ArrayList<>();
            }
        }
        // Batch-insert the remaining tags; the generated ids are written back onto the entities
        if(tagInfoEntitieTemp.size()>0){
           int n= xmlTagInfoDao.insertList(tagInfoEntitieTemp);
            tagNum=tagNum+n;
        }
        if(tagNum!=tagInfoEntities.size()){
            return false;
        }
        if(attributeTemp.size()>0){
            int m=xmlAttributeInfoDao.insertList(attributeTemp);
            attributeNum=attributeNum+m;
        }
        if(attributeNum!=attributes.size()){
            return false;
        }
        int attributeIndex=0;
        for (int i = 0; i < fileMsgList.size(); i++) {
            Map<String, Object> map = (Map<String, Object>) fileMsgList.get(i);
            Long tagOrder = (Long) map.get("tagOrder");
            // File-to-tag association
            FileAnalyseRelEntity fileRelEntity = new FileAnalyseRelEntity();
            fileRelEntity.setTagId(tagInfoEntities.get(i).getTagId());
            fileRelEntity.setFileId(fileId);
            fileRelEntity.setCreateUser(createUser);
            fileRelEntity.setCreateTime(new Date());

            fileRelListTemp.add(fileRelEntity);
            if(fileRelListTemp.size()>1000){
                int n=fileAnalyseRelDao.insertList(fileRelListTemp);
                fileRelNum=fileRelNum+n;
                fileRelListTemp=new ArrayList<>();
            }
            List<Attribute> attributeList = (List<Attribute>) map.get("attributes");
            // Tag-to-attribute association
            if(attributeList.size()>0){
                for (int j = 0; j < attributeList.size(); j++) {
                    XmlTagAttributeRelEntity relEntity = new XmlTagAttributeRelEntity();
                    relEntity.setAttributeId(attributes.get(attributeIndex).getAttributeId());
                    relEntity.setTagId(tagInfoEntities.get(i).getTagId());
                    relEntity.setCreateUser(createUser);
                    relEntity.setCreateTime(new Date());
                    tagAttributeRelTemp.add(relEntity);
                    attributeIndex++;
                }
            }
            if(tagAttributeRelTemp.size()>1000){
                int n=xmlTagAttributeRelDao.insertList(tagAttributeRelTemp);
                attributeRelNum=attributeRelNum+n;
                tagAttributeRelTemp=new ArrayList<>();
            }

            XmlTagRelEntity xmlTagRelEntity=new XmlTagRelEntity();
            xmlTagRelEntity.setTagId(tagInfoEntities.get(i).getTagId());
            if (i == 0) {
                xmlTagRelEntity.setParentTagId(0L);
            } else {
                if (tagOrder == 1) {
                    xmlTagRelEntity.setParentTagId(parentTagId);
                } else {
                    xmlTagRelEntity.setParentTagId((Long) parentIdMap.get(map.get("parentTagLevel")));
                }
            }
            xmlTagRelEntity.setCreateUser(createUser);
            xmlTagRelEntity.setCreateTime(new Date());
            xmlTagRelEntity.setRemark(remark);
            xmlTagRelEntitieTemp.add(xmlTagRelEntity);
            if(xmlTagRelEntitieTemp.size()>1000){
                int p=xmlTagRelDao.insertList(xmlTagRelEntitieTemp);
                tagRelNUm=tagRelNUm+p;
                xmlTagRelEntitieTemp=new ArrayList<>();
            }
            // Remember which tag id each parent path currently points at. A first child
            // (tagOrder == 1) directly follows its parent in document order, so the previously
            // processed tag is its parent; later siblings look the id up by parent path.
            String parentTagLevel = (String) map.get("parentTagLevel");
            if (tagOrder.longValue() == 1
                    || (tagOrder.longValue() == 0 && !parentIdMap.containsKey(parentTagLevel))) {
                parentIdMap.put(parentTagLevel, parentTagId);
            }
            parentTagId = tagInfoEntities.get(i).getTagId();

        }
        if(xmlTagRelEntitieTemp.size()>0){
            int p=xmlTagRelDao.insertList(xmlTagRelEntitieTemp);
            tagRelNUm=tagRelNUm+p;
        }
        if(tagRelNUm!=tagNum){
            return false;
        }
        if(fileRelListTemp.size()>0){
            int n=fileAnalyseRelDao.insertList(fileRelListTemp);
            fileRelNum=fileRelNum+n;
        }
        if(fileRelNum!=tagNum){
            return false;
        }
        if(tagAttributeRelTemp.size()>0){
            int m=xmlTagAttributeRelDao.insertList(tagAttributeRelTemp);
            attributeRelNum=attributeRelNum+m;
        }
        if(attributeRelNum!=attributeNum){
            return false;
        }
        return true;
    }
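
The "collect rows, flush when the list gets big" pattern appears five times in `save`. As a rough sketch of how it could be factored out, the helper below sends a list to an insert callback in fixed-size chunks and adds up the reported row counts. `BatchDemo` and `insertInBatches` are hypothetical names, the lambda stands in for a DAO call such as `xmlTagInfoDao.insertList(batch)`, and unlike the code above it caps each batch at exactly 1000 rows rather than flushing at 1001.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

public class BatchDemo {

    // Sends `items` to `insertList` in chunks of at most `batchSize` and returns
    // the total number of rows the callback reported as inserted.
    static <T> int insertInBatches(List<T> items, int batchSize, ToIntFunction<List<T>> insertList) {
        int total = 0;
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the slice so the callback never holds a view of the source list
            total += insertList.applyAsInt(new ArrayList<>(items.subList(from, to)));
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rows.add(i);
        }
        List<Integer> batchSizes = new ArrayList<>();
        // Stand-in for a real DAO: records each batch size and reports every row as inserted
        int inserted = insertInBatches(rows, 1000, batch -> {
            batchSizes.add(batch.size());
            return batch.size();
        });
        System.out.println(inserted + " rows in batches " + batchSizes);
    }
}
```

With 2500 rows this prints `2500 rows in batches [1000, 1000, 500]`. Comparing the returned total with `items.size()` gives the same consistency check that `save` performs by hand.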

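When the file is large, batch inserts perform much better for writing to the database. Here I learned a technique I hadn't known about: my team lead told me that you can insert into the database and get the primary key ids back at the same time. I had no idea MyBatis could do something so magical. Here is the SQL I used in the mapper:
[Figure: batch insert in MyBatis that returns the primary keys]
This is how I do a batch insert in MyBatis that returns the primary keys. It really is magical, haha; please don't laugh at my inexperience.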
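
For readers who cannot see the screenshot, a mapper entry of this kind typically looks roughly like the fragment below. This is only a sketch, not the original mapper: the table name `xml_tag_info` and the column names are assumptions, the property names are inferred from the entity setters used in `save`, and it assumes a database with auto-increment keys and multi-row `INSERT` support (for example MySQL). As far as I know, writing generated keys back for a list parameter needs MyBatis 3.3.1 or later.

```xml
<!-- Hypothetical mapper entry; table and column names are assumptions -->
<insert id="insertList" useGeneratedKeys="true" keyProperty="tagId">
    INSERT INTO xml_tag_info (tag_name, qualified_name, tag_content, tag_order, create_user, remark)
    VALUES
    <foreach collection="list" item="item" separator=",">
        (#{item.tagName}, #{item.qualifiedName}, #{item.tagContent}, #{item.tagOrder},
         #{item.createUser}, #{item.remark})
    </foreach>
</insert>
```

`useGeneratedKeys="true"` asks the JDBC driver for the generated keys, and `keyProperty="tagId"` tells MyBatis which property of each list element to write them into. That is why `tagInfoEntities.get(i).getTagId()` returns a real id after `insertList` has run.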
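While working on the XML parsing I learned two main things. 1: namespaces in an XML file. At first I thought they were attributes, but they are not; they exist to avoid element naming conflicts. 2: the MyBatis trick mentioned above.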
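
A small sketch of the naming-conflict point: two elements that are both called `table` but live in different namespaces. It uses the JDK's namespace-aware DOM parser rather than DOM4J so it runs as-is; the class name `NamespaceDemo` and the `urn:shop` / `urn:furniture` URIs are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class NamespaceDemo {

    // Parses an XML string with namespace support switched on
    static Element parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // off by default; needed for namespace URIs to be resolved
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order xmlns='urn:shop' xmlns:f='urn:furniture'>"
                + "<table>12</table><f:table>oak</f:table></order>";
        Element root = parse(xml);
        // Same local name, different namespace: the two lookups find different elements
        Element shopTable = (Element) root.getElementsByTagNameNS("urn:shop", "table").item(0);
        Element furnitureTable = (Element) root.getElementsByTagNameNS("urn:furniture", "table").item(0);
        System.out.println(shopTable.getLocalName() + " -> " + shopTable.getNamespaceURI());
        System.out.println(furnitureTable.getLocalName() + " -> " + furnitureTable.getNamespaceURI());
    }
}
```

Both elements report the local name `table`, but one resolves to `urn:shop` and the other to `urn:furniture`, so a program can tell them apart even though the tag names collide.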
In this article I tore the XML file limb from limb; in the next one I'll put it back together again, hehe.
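I don't think there is much else worth recording; the rest is just simple SQL. Thanks for reading. I'm still a beginner, so please be gentle if you don't like it, and pointers are very welcome!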