Preface
Whoosh: the search engine. jieba: the Chinese word segmenter. django-haystack: a third-party app that plugs search engines into Django.
Preparation
pip3 install whoosh
pip3 install jieba
pip3 install django-haystack
Configuration
Add haystack to INSTALLED_APPS:
    INSTALLED_APPS = [
        'django.contrib.admin',
        'django.contrib.auth',
        'django.contrib.contenttypes',
        'django.contrib.sessions',
        'django.contrib.messages',
        'django.contrib.staticfiles',
        # other apps ...
        'search_liu',
        'haystack',
    ]
Then add the following settings:
project/settings.py

    HAYSTACK_CONNECTIONS = {
        'default': {
            'ENGINE': 'search_liu.whoosh_cn_backend.WhooshEngine',  # use the Whoosh search engine
            'PATH': os.path.join(BASE_DIR, 'whooshindex'),
        },
    }
    HAYSTACK_SEARCH_RESULTS_PER_PAGE = 10  # ten results per page
    HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'
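'ENGINE': 'search_liu.whoosh_cn_backend.WhooshEngine' does not exist yet; we will create it in a later step. The dotted path must match the package where whoosh_cn_backend.py is placed.
'PATH' is where the index files are stored. We set it to the whooshindex folder under the project root BASE_DIR (the folder is created automatically when the index is built).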
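To check where the index will end up, the sketch below evaluates the same path expression outside Django. BASE_DIR here is a stand-in (a temporary directory), not the project's real setting.

```python
import os
import tempfile

# Stand-in for the project's BASE_DIR (assumption: any directory works for the demo).
BASE_DIR = tempfile.mkdtemp()

# Same expression as the 'PATH' entry in HAYSTACK_CONNECTIONS.
index_path = os.path.join(BASE_DIR, 'whooshindex')

print(index_path)
# The folder does not exist yet; it only appears once the index is built.
print(os.path.exists(index_path))
```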
Configuring the index
Create a search_indexes.py file in the app and add the following code:
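    from haystack import indexes
    from web.models import news  # adjust to wherever your model lives


    class newsIndex(indexes.SearchIndex, indexes.Indexable):
        text = indexes.CharField(document=True, use_template=True)

        def get_model(self):
            return news

        def index_queryset(self, using=None):
            # only index rows that satisfy this condition
            return self.get_model().objects.filter(newsState=2)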
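Because I need to search several tables, I wrote three such classes (table name + Index) in search_indexes.py under the search app, so index files are built for all three tables at once.
Then list the fields to be searched in templates/search/indexes/yourapp/\<model_name>_text.txt; with several tables there are several txt files.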
{{ object.title }}
{{ object.mainBody }}
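The template is rendered once per object, and the rendered text becomes the content of the `text` document field. The plain-Python sketch below only approximates that step without Django; `FakeNews` and `render_document` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeNews:
    # Hypothetical stand-in for the real model; only the two indexed fields.
    title: str
    mainBody: str


def render_document(obj):
    # Mirrors the two-line template: one field per line.
    return f"{obj.title}\n{obj.mainBody}\n"


doc = render_document(FakeNews(title="Hello", mainBody="World"))
print(doc)
```

Any field left out of the template is not part of the searchable document.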
Switching the search engine to Chinese word segmentation
Create a ChineseAnalyser.py file in the search app with the following code:
    import jieba
    from whoosh.analysis import Tokenizer, Token


    class ChineseTokenizer(Tokenizer):
        def __call__(self, value, positions=False, chars=False,
                     keeporiginal=False, removestops=True,
                     start_pos=0, start_char=0, mode='', **kwargs):
            t = Token(positions, chars, removestops=removestops, mode=mode, **kwargs)
            seglist = jieba.cut_for_search(value)
            for w in seglist:
                t.original = t.text = w
                t.boost = 1.0
                if positions:
                    t.pos = start_pos + value.find(w)
                if chars:
                    t.startchar = start_char + value.find(w)
                    t.endchar = start_char + value.find(w) + len(w)
                yield t


    def chinese_analyzer():
        return ChineseTokenizer()
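One detail of the tokenizer worth knowing: `value.find(w)` returns the first occurrence of a word, so a word that appears twice gets the same offsets both times. The sketch below reproduces only the offset arithmetic in plain Python; `token_offsets` is a hypothetical helper, and the word list is hand-written rather than produced by jieba.

```python
def token_offsets(value, words, start_char=0):
    """Mimic the startchar/endchar arithmetic of ChineseTokenizer."""
    offsets = []
    for w in words:
        start = start_char + value.find(w)  # find() returns the FIRST match
        offsets.append((w, start, start + len(w)))
    return offsets


# Hand-written segmentation (assumption); the real tokenizer gets its words
# from jieba.cut_for_search.
result = token_offsets("搜索引擎搜索", ["搜索", "引擎", "搜索"])
print(result)
```

The second "搜索" is reported at offset 0 rather than 4. This probably matters little for plain keyword matching, but it can affect features that rely on character offsets, such as highlighting.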
In your Python installation, find whoosh_backend.py under Lib\site-packages\haystack\backends, copy it into the search app, and rename it whoosh_cn_backend.py.
Add this import to it:
from search import ChineseAnalyser
Then find the following statement and change its analyzer argument as shown:
schema_fields[field_class.index_fieldname] = TEXT(stored=True, analyzer=ChineseAnalyser.chinese_analyzer(), field_boost=field_class.boost, sortable=True)
Finally, run python3 manage.py rebuild_index to build the index files.
Creating the search form
    <div class="input-group" style="width:370px">
      <div style="float:right">
        <form action="" id="search_form" method="get" onsubmit='return sub_search_form()'>
          <!-- do not change name='q' -->
          <input type="text" class="form-control" style="width:229px;float:left;" name="q" placeholder=" Enter keywords">
          <span class="input-group-btn">
            <button class="btn btn-info" id="search" style="width:60px;height:34px;background-color:purple;border-color:purple" type="submit"><i class="glyphicon glyphicon-search"></i></button>
          </span>
        </form>
      </div>
      <!-- do not put the select tag inside the form -->
      <select id="option" class="form-control" style="height:32px;width:77px;">
        <option value="0">All</option>
        <option value="1">News</option>
        <option value="2">Announcements</option>
        <option value="3">Theses</option>
      </select>
      <!-- do not put the select tag inside the form -->
    </div>
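Backend handling
The form above sends its request to the backend through JavaScript, which picks the target URL from the select box. The relevant JS is: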
    function sub_search_form() {
        // 1: news  2: announcements  3: theses
        var obj = document.getElementById('option');
        var form = document.getElementById('search_form');
        var value = obj.value;
        switch (value) {
            case '0':
                form.action = '/search/';
                break;
            case '1':
                form.action = '/search/news/';
                break;
            case '2':
                form.action = '/search/announcement/';
                break;
            case '3':
                form.action = '/search/thesis_information/';
                break;
            default:
                break;
        }
        return true;  // let the form submit to the chosen action
    }
search/views.py is as follows:
    from haystack.generic_views import SearchView
    from haystack.query import SearchQuerySet
    from web.models import news, announcement, thesis_information

    model_map = {
        'news': news,
        'announcement': announcement,
        'thesis_information': thesis_information,
    }


    class VisitorSearchView(SearchView):
        def get_queryset(self):
            queryset = super().get_queryset()
            self.context_object_name = 'search_list'
            # model name captured from the URL, if any
            model_name = self.kwargs.get('model')
            if model_name:
                # search a single table
                model = model_map[model_name]
                queryset = SearchQuerySet().models(model)
                if model_name == 'thesis_information':
                    self.context_object_name = 'search_thesis_list'
            else:
                # search all tables
                self.template_name = 'search/search_all.html'
            return queryset
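Note that `model_map[model_name]` raises KeyError for a URL such as /search/unknown/, which would surface as a server error. A minimal sketch of a more forgiving lookup, assuming you would rather fall back or return a 404 in the view; `resolve_model` and the placeholder classes are hypothetical and not part of the original code.

```python
def resolve_model(model_name, model_map):
    """Return the mapped model, or None when the name is missing or unknown."""
    if not model_name:
        return None
    return model_map.get(model_name)


# Placeholder classes standing in for the real Django models (assumption).
class News:
    pass


class Announcement:
    pass


demo_map = {'news': News, 'announcement': Announcement}

print(resolve_model('news', demo_map))     # the News class
print(resolve_model('unknown', demo_map))  # None instead of a KeyError
```

In the view, a None result could be turned into a 404 response or treated like the search-all case, depending on which behavior you prefer.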
search/urls.py
    from django.urls import path
    from search.views import VisitorSearchView

    urlpatterns = [
        path('<str:model>/', VisitorSearchView.as_view()),
        path('', VisitorSearchView.as_view()),
    ]
References
https://www.cnblogs.com/fuhuixiang/p/4488029.html
https://www.zmrenwu.com/courses/django-blog-tutorial/materials/27/
https://www.cnblogs.com/ftl1012/p/10397553.html
https://github.com/stormsha/blog
https://stormsha.com/
SearchView: https://blog.csdn.net/BetrayArmy/article/details/83512700