Querying an RDF Knowledge Base with Python

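This article uses SPARQL to query the knowledge base.
SPARQL (SPARQL Protocol and RDF Query Language) is a query language and data-access protocol for RDF. It is defined in terms of the RDF data model developed by the W3C, but it can be used with any information resource that can be represented as RDF.
A basic query looks like this (the resource IRI must be wrapped in angle brackets):
SELECT ?Predicate ?Object WHERE { <http://baike.com/resource/华南师大> ?Predicate ?Object }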
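The same triple pattern works for any combination of known and unknown positions: a known term is written as an IRI in angle brackets, and an unknown one as a ?variable. The sketch below builds such query strings; `build_query` and `term` are hypothetical helper names, not part of rdflib, and the namespace is the one used in this article's data. It also rejects characters that the SPARQL grammar does not allow inside `<...>`, so a stray `>` in user input cannot change the shape of the query.

```python
import re

BASE = "http://baike.com/resource/"  # namespace used in this article's data

# Characters the SPARQL grammar forbids inside <...> (plus control chars/space)
_ILLEGAL = re.compile(r'[<>"{}|^`\\\x00-\x20]')


def term(name, value):
    """Return <IRI> for a known value, or ?name for an unknown one."""
    if not value:
        return "?" + name
    if _ILLEGAL.search(value):
        raise ValueError("illegal character in %r" % value)
    return "<" + BASE + value + ">"


def build_query(subject="", predicate="", obj="", limit=20):
    """Build a SELECT query over a single triple pattern (hypothetical helper)."""
    parts = [("Subject", subject), ("Predicate", predicate), ("Object", obj)]
    variables = ["?" + name for name, value in parts if not value]
    if not variables:
        raise ValueError("at least one position must be left unknown")
    pattern = " ".join(term(name, value) for name, value in parts)
    return "SELECT %s WHERE { %s } LIMIT %d" % (
        " ".join(variables), pattern, limit)


print(build_query(subject="华南师大"))
```

The resulting string can be passed to `Graph.query` in rdflib as is.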
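However, the results come back as triples of full resource URIs, which are not convenient to read directly, so they need some post-processing. The test below uses the RDF file saved in the earlier post 《Python 将json格式文件转存为RDF格式文件》 (Converting a JSON file to an RDF file with Python). The code is as follows: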

# Requires Python 3 and rdflib.
import rdflib

BASE = "http://baike.com/resource/"
MAX_RESULTS = 20  # print at most the first 20 matches

g1 = rdflib.Graph()
g1.parse("data.rdf", format="xml")


def local_name(term):
    # Drop the resource prefix so a result term reads as plain text.
    text = str(term)
    return text[len(BASE):] if text.startswith(BASE) else text


while True:
    line = input("Enter Subject,Predicate,Object separated by ','; "
                 "leave a field empty if you don't know it: ")
    try:
        # Surrounding quotes are stripped, so 'a','','' and a,, both work.
        Subject, Predicate, Object = [
            f.strip().strip("'\"") for f in line.split(",")]
    except ValueError:
        continue  # wrong number of fields: ask again
    if not (Subject or Predicate or Object):
        break  # nothing entered: quit

    known = {"Subject": Subject, "Predicate": Predicate, "Object": Object}
    if all(known.values()):
        # All three given: as with a Subject-only search, list every
        # predicate and object recorded for that Subject.
        known["Predicate"] = known["Object"] = ""

    # A known term becomes <IRI>; an unknown one becomes a ?variable.
    unknown = [name for name, value in known.items() if not value]
    pattern = " ".join("<" + BASE + value + ">" if value else "?" + name
                       for name, value in known.items())
    q = "select %s where { %s } limit %d" % (
        " ".join("?" + name for name in unknown), pattern, MAX_RESULTS)

    try:
        for row in g1.query(q):
            data = dict(known)
            data.update((name, local_name(term))
                        for name, term in zip(unknown, row))
            print(data["Subject"], data["Predicate"], data["Object"])
    except Exception as err:
        # e.g. the input contained characters that are illegal in an IRI
        print("query failed:", err)

When the data set is large, printing every match takes too long, so only the first 20 results are shown.
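One way to cap the output is to stop iterating after 20 rows instead of measuring the whole result first. The sketch below shows this with `itertools.islice` on a stand-in generator; `first_n` and `fake_results` are hypothetical names used only for illustration. An rdflib query result is iterable, so the same call could wrap it, and adding a `LIMIT 20` clause to the query itself is another option.

```python
from itertools import islice


def first_n(rows, n=20):
    """Return at most the first n items of any iterable."""
    return list(islice(rows, n))


def fake_results(count):
    # Stand-in for a large query result; yields one row at a time.
    for i in range(count):
        yield ("subject", "predicate", "object%d" % i)


shown = first_n(fake_results(1000))
print(len(shown))
```

Because `islice` stops pulling from the generator once it has enough rows, the remaining rows are never produced.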
Test results: [screenshot]
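The input may be any one, any two, or all three parts of a triple, separated by commas. Malformed input is ignored and the prompt is shown again.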
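The input handling can be isolated in a small function, which makes the "ignore malformed input" rule explicit rather than relying on a catch-all exception handler. This is a minimal sketch under the assumption that fields are plain text separated by commas and may optionally be quoted; `parse_triple` is a hypothetical name.

```python
def parse_triple(line):
    """Split 'Subject,Predicate,Object' into a tuple of three strings.

    Returns None when the line does not contain exactly three fields.
    """
    fields = [f.strip().strip("'\"") for f in line.split(",")]
    if len(fields) != 3:
        return None
    return tuple(fields)


print(parse_triple("华南师大,,"))
print(parse_triple("only one field"))
```

A caller can then re-prompt whenever the function returns None and stop when all three fields are empty.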
Test data and program: click to download
