Using regular expressions to match a TTL (Turtle) document processed by rdf3x

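1. Consider the following fragment of a TTL document produced by rdf3x:

Note that the continuation lines are indented with exactly two spaces.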

<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2659";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "30S ribosomal protein S1".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> , <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623>.
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2623";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "16S/23S ribosomal RNA interface".
<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;
  <http://www.w3.org/2000/01/rdf-schema#label> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> "CHEMBL_BS_2624";
  <http://rdf.ebi.ac.uk/terms/chembl#hasTarget> <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022>;
  <http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName> "23S ribosomal RNA".
<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022> <http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> <http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624>.

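2. The regular-expression code in Java

The active Pattern line and the two commented-out alternatives produce the three different results shown below; enable one of them at a time.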

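package com.jena;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class rdfReader3 {

    // Exactly one of the three patterns below should be active at a time.
    // (1) Every <...> pair: captures whatever lies between the angle brackets.
    private static final Pattern PATTERN = Pattern.compile("<([^<>]*)>");
    // (2) Subjects only: a <...> pair at the very start of a line.
    //     readLine() strips line terminators, so a leading "\n*" is not needed.
    // private static final Pattern PATTERN = Pattern.compile("^<([^<>]*)>");
    // (3) Predicates on continuation lines: two spaces followed by a <...> pair.
    // private static final Pattern PATTERN = Pattern.compile("  <([^<>]*)>");

    public static void main(String[] args) {
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(
                new FileReader("C:/Users/Don/workspace/Jena/src/com/jena/bindingsite"))) {
            String line;
            while ((line = br.readLine()) != null) {
                Matcher m = PATTERN.matcher(line);
                // Print capture group 1 (the text inside the brackets) for each match.
                while (m.find()) {
                    System.out.println(m.group(1));
                }
            }
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
    }
}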

(1) Match the contents of every pair of angle brackets

Output:

http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2622
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://rdf.ebi.ac.uk/terms/chembl#BindingSite
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
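
To try case (1) without the input file, the same pattern can be applied to a single line held in a string. The following is only a minimal sketch: the class name IriExtractor and the method extractAll are hypothetical names introduced for illustration, and the sample line is the one from the fragment that lists two objects separated by a comma.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original program.
final class IriExtractor {
    // '<', then any run of characters that are not angle brackets, then '>'.
    private static final Pattern IRI = Pattern.compile("<([^<>]*)>");

    // Returns the text inside every <...> pair on the line, in order of appearance.
    static List<String> extractAll(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = IRI.matcher(line);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String line = "<http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965> "
                + "<http://rdf.ebi.ac.uk/terms/chembl#hasBindingSite> "
                + "<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> , "
                + "<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623>.";
        // Subject, predicate and both objects are printed, one per line.
        for (String iri : extractAll(line)) {
            System.out.println(iri);
        }
    }
}
```

One caveat: the pattern does not know about quoted literals, so a literal that happened to contain text in angle brackets would be matched as well. The sample fragment contains no such literal.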

(2) Match every subject, that is, the contents of the first pair of angle brackets on each line that does not start with two spaces

Output:

http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363853
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2363965
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2623
http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624
http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL2364022
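
What makes case (2) work is the ^ anchor: it only allows a match at the first character of the line, so indented lines are skipped. Because BufferedReader.readLine() returns each line without its line terminator, nothing more than ^ is needed in front of the opening bracket. A minimal sketch, assuming that layout; SubjectMatcher and subjectOf are hypothetical names used only here:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original program.
final class SubjectMatcher {
    // '^' pins the match to the start of the line, so only a leading <...> qualifies.
    private static final Pattern SUBJECT = Pattern.compile("^<([^<>]*)>");

    // Returns the subject IRI if the line starts with one, otherwise an empty Optional.
    static Optional<String> subjectOf(String line) {
        Matcher m = SUBJECT.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] lines = {
            "<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2624> "
                + "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
                + "<http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;",
            "  <http://www.w3.org/2000/01/rdf-schema#label> \"CHEMBL_BS_2624\";"
        };
        // Only the first line starts with '<', so only its subject is printed.
        for (String line : lines) {
            subjectOf(line).ifPresent(System.out::println);
        }
    }
}
```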

 

 

(3) Match lines that start with two spaces, followed by a pair of angle brackets whose contents contain no angle brackets

Output:

http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
http://www.w3.org/2000/01/rdf-schema#label
http://rdf.ebi.ac.uk/terms/chembl#chemblId
http://rdf.ebi.ac.uk/terms/chembl#hasTarget
http://rdf.ebi.ac.uk/terms/chembl#bindingSiteName
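
Cases (2) and (3) can also be used together: a line that starts with < opens a new subject, and each following two-space line adds one more predicate to it. The snippet below is a rough sketch of that idea, not a Turtle parser. It assumes exactly the layout shown in the fragment, and the names TtlLayoutSketch and predicatesBySubject are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: groups predicate IRIs by subject using only the line layout.
final class TtlLayoutSketch {
    // A subject line starts with '<' and carries its first predicate on the same line.
    private static final Pattern SUBJECT_LINE = Pattern.compile("^<([^<>]*)> <([^<>]*)>");
    // A continuation line starts with exactly two spaces, then a predicate IRI.
    private static final Pattern CONTINUATION_LINE = Pattern.compile("^ {2}<([^<>]*)>");

    static Map<String, List<String>> predicatesBySubject(List<String> lines) {
        // LinkedHashMap keeps subjects in the order they first appear.
        Map<String, List<String>> result = new LinkedHashMap<>();
        String current = null;
        for (String line : lines) {
            Matcher subject = SUBJECT_LINE.matcher(line);
            Matcher continuation = CONTINUATION_LINE.matcher(line);
            if (subject.find()) {
                current = subject.group(1);
                result.computeIfAbsent(current, k -> new ArrayList<>()).add(subject.group(2));
            } else if (current != null && continuation.find()) {
                result.get(current).add(continuation.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "<http://rdf.ebi.ac.uk/resource/chembl/binding_site/CHEMBL_BS_2659> "
                + "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
                + "<http://rdf.ebi.ac.uk/terms/chembl#BindingSite>;",
            "  <http://www.w3.org/2000/01/rdf-schema#label> \"CHEMBL_BS_2659\";",
            "  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> \"CHEMBL_BS_2659\".");
        predicatesBySubject(lines)
            .forEach((subject, predicates) -> System.out.println(subject + " -> " + predicates));
    }
}
```

For anything beyond this fixed layout (prefixes, blank nodes, multi-line literals), a real Turtle parser is the safer choice.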

 

To match data that begins with two spaces, simply type the two spaces at the start of the pattern:

  Pattern p= Pattern.compile("  <([^<>]*)>"); 
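
Since Matcher.find() searches the whole line, the two-space pattern would also match two spaces that occur in the middle of a line. The sample file has none, so the result is the same there, but anchoring the pattern with ^ makes the intent explicit. A small sketch comparing the two variants; the class IndentedPredicateMatcher, its methods, and the example.org line are made up for illustration and do not come from the sample data:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original program.
final class IndentedPredicateMatcher {
    // The pattern from the text: two spaces anywhere on the line, then <...>.
    private static final Pattern UNANCHORED = Pattern.compile("  <([^<>]*)>");
    // Same idea, but the two spaces must be the first two characters of the line.
    private static final Pattern ANCHORED = Pattern.compile("^ {2}<([^<>]*)>");

    private static String firstMatch(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    static String unanchored(String line) {
        return firstMatch(UNANCHORED, line);
    }

    static String anchored(String line) {
        return firstMatch(ANCHORED, line);
    }

    public static void main(String[] args) {
        String indented = "  <http://rdf.ebi.ac.uk/terms/chembl#chemblId> \"CHEMBL_BS_2659\";";
        // Made-up line with a doubled space after the subject.
        String doubled = "<http://example.org/s>  <http://example.org/p> <http://example.org/o>.";

        System.out.println(unanchored(indented)); // matches the predicate
        System.out.println(anchored(indented));   // matches the predicate
        System.out.println(unanchored(doubled));  // also matches, although the line is not indented
        System.out.println(anchored(doubled));    // no match, prints null
    }
}
```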

 
