Getting Started with Solr 4.9 and Django haystack

Forked from: http://www.alexanderinteractive.com/blog/2012/08/getting-started-with-solr-and-django/


Getting Started with Solr and Django

Solr is a powerful search server, and the basics (full text search, facets, and related items) are quick to get up and running. We will be using haystack to handle the communication between Django and Solr. All code for this can be viewed on github.

Install

Assuming you already have Django up and running, the first thing we need to do is install Solr.

unzip apache-solr-4.0.0-BETA.zip
cd apache-solr-4.0.0-BETA
cd example
java -jar start.jar

Next, install pysolr and haystack. (At the time of this writing, the git checkout of haystack works better with the Solr 4.0 beta than the 1.2.7 release that’s in pip.)

pip install pysolr

Add ‘haystack’ to INSTALLED_APPS in settings.py and add the following haystack connection:

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://127.0.0.1:8983/solr',
    },
}
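
Before indexing anything, it can help to confirm that something is actually answering at the URL configured above. The following is a minimal sketch using only the Python 3 standard library. The function name solr_reachable and the timeout value are assumptions made for illustration; they are not part of haystack or pysolr.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def solr_reachable(url, timeout=2.0):
    """Return True if an HTTP server answers at `url`, False otherwise.

    Hypothetical helper for this tutorial: it only checks that something
    responds, not that the Solr schema or core is set up correctly.
    """
    try:
        with urlopen(url, timeout=timeout):
            return True
    except HTTPError:
        # The server answered, just not with a success status.
        return True
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, and so on.
        return False


if __name__ == "__main__":
    # Same URL as HAYSTACK_CONNECTIONS['default']['URL'] above.
    print(solr_reachable("http://127.0.0.1:8983/solr"))
```

If this prints False, check that java -jar start.jar is still running before looking for problems on the Django side.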

Full Text Search

For the example, we’re going to create a simple job database that a recruiter might use. Here is the model:

from django.db import models
from django.contrib.localflavor.us import models as us_models

# Stored value, human-readable label
JOB_TYPES = (
    ('pt', 'Part Time'),
    ('ft', 'Full Time'),
    ('ct', 'Contract'),
)


class Company(models.Model):
    name = models.CharField(max_length=64)
    address = models.TextField(blank=True, null=True)
    contact_email = models.EmailField()

    def __unicode__(self):
        return self.name


class Location(models.Model):
    city = models.CharField(max_length=64)
    state = us_models.USStateField()

    def __unicode__(self):
        return "%s, %s" % (self.city, self.state)


class Job(models.Model):
    name = models.CharField(max_length=64)
    description = models.TextField()
    salary = models.CharField(max_length=64, blank=True, null=True)
    type = models.CharField(max_length=2, choices=JOB_TYPES)
    company = models.ForeignKey(Company, related_name='jobs')
    location = models.ForeignKey(Location, related_name='location_jobs')
    contact_email = models.EmailField(blank=True, null=True)
    added_at = models.DateTimeField(auto_now=True)

    def __unicode__(self):
        return self.name

    def get_contact_email(self):
        # Fall back to the company's address when the job has none
        if self.contact_email:
            return self.contact_email
        return self.company.contact_email

The next step is to create the SearchIndex object that will be used to transfer the data to Solr. Save this as search_indexes.py in the same folder as your models.py. The text field with its template will be used for full text search on Solr. The other two fields will be used for faceted (drill down) navigation. For more details on this file, check out the haystack tutorial.

from haystack import indexes

from jobs.models import Job


class JobIndex(indexes.SearchIndex, indexes.Indexable):
    # Main full text field, rendered from the template described below
    text = indexes.CharField(document=True, use_template=True)
    # Faceted fields used for drill down navigation
    type = indexes.CharField(model_attr='type', faceted=True)
    location = indexes.CharField(model_attr='location', faceted=True)

    def get_model(self):
        return Job

    def index_queryset(self, using=None):
        # Haystack 2.x passes `using`, so the argument must be accepted
        return self.get_model().objects.all()

Create the search index template in your template folder with the following naming convention: search/indexes/[app]/[model]_text.txt
For us, this is templates/search/indexes/jobs/job_text.txt

{{ object.name }}
{{ object.description }}
{{ object.salary }}
{{ object.type }}
{{ object.added_at }}
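
To make the role of this template concrete: for every Job, haystack renders it and stores the result in the text field, so a search for a word from the name, description, salary, or type can match. The snippet below is a plain-Python stand-in for that rendering step. It does not use Django's template engine, and the sample values are made up for illustration.

```python
# Field order mirrors job_text.txt above.
TEMPLATE_FIELDS = ("name", "description", "salary", "type", "added_at")


def render_document(job):
    """Join the indexed fields into one newline-separated text blob.

    `job` is a plain dict standing in for a Job instance (an assumption
    of this sketch; haystack passes the real model object as `object`).
    """
    return "\n".join(str(job[field]) for field in TEMPLATE_FIELDS)


# Hypothetical sample data.
sample_job = {
    "name": "Python Developer",
    "description": "Build search features with Django and Solr.",
    "salary": "80k",
    "type": "ft",
    "added_at": "2012-08-01 09:00:00",
}

if __name__ == "__main__":
    print(render_document(sample_job))
```

Anything you leave out of the template will not be searchable through the text field, which is why the company or location would need their own lines if you want them matched by full text queries.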

Now, let’s get our data into Solr. Run ./manage.py build_solr_schema and save its output as schema.xml. Move this file into example\solr\conf in your Solr install. Note: if using Solr 4, edit this file and replace stopwords_en.txt with lang/stopwords_en.txt in all locations. Restart Solr after changing schema.xml so the new schema is picked up. To test everything and load your data, run ./manage.py rebuild_index. Subsequent updates can be made with ./manage.py update_index.
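
The stopwords edit for Solr 4 is easy to get wrong by hand, so here is a small sketch of doing it in Python 3. The function name fix_stopwords_paths and the command-line usage are assumptions for illustration. The negative lookbehind skips references that already point at lang/, which keeps the script safe to run twice.

```python
import re
import sys

# Match stopwords_en.txt only when it is not already under lang/.
_STOPWORDS = re.compile(r"(?<!lang/)stopwords_en\.txt")


def fix_stopwords_paths(schema_xml):
    """Point every stopwords_en.txt reference at lang/stopwords_en.txt."""
    return _STOPWORDS.sub("lang/stopwords_en.txt", schema_xml)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Assumed usage: python fix_schema.py path/to/schema.xml
    path = sys.argv[1]
    with open(path, encoding="utf-8") as handle:
        original = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(fix_stopwords_paths(original))
```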

If that all worked, we can start working on the front end to see the data in Django. Add this to your urls.py:

(r'^$', include('haystack.urls')),

At this point there are at least two templates we’ll need: one for the search results page, and a sub-template to represent each item we are pulling back. My example uses Twitter Bootstrap for some layout help and styling; see my base.html here if interested.

Create templates/search/search.html
This gives you a basic search form, the results, and pagination for the results.

{% extends 'base.html' %}

{% block hero_text %}Search{% endblock %}
{% block header %}<p>Click around!</p>{% endblock %}

{% block content %}
<div class="span12">
    <h1>Search</h1>
    <form class="form-search" action="." method="get">
        <table>
            {{ form.as_table }}
        </table>
        <input type="submit" value="Search" />
    </form>
</div>
{% if query %}
<div class="span8">
    <h3>Results</h3>
    <div id="accordion2" class="accordion">
        {% for result in page.object_list %}
            {% include 'search/_result_object.html' %}
        {% empty %}
            <p>No results found.</p>
        {% endfor %}
    </div>
    {% if page.has_previous or page.has_next %}
    <div>
        {% if page.has_previous %}<a href="?q={{ query }}&amp;page={{ page.previous_page_number }}">{% endif %}&laquo; Previous{% if page.has_previous %}</a>{% endif %}
        |
        {% if page.has_next %}<a href="?q={{ query }}&amp;page={{ page.next_page_number }}">{% endif %}Next &raquo;{% if page.has_next %}</a>{% endif %}
    </div>
    {% endif %}
</div>
{% else %}
{% endif %}
{% endblock %}

And the templates/search/_result_object.html

{% with obj=result.object %}
<div class="accordion-group">
    <div class="accordion-heading">
        <a class="accordion-toggle" href="#collapse_{{ obj.id }}" data-toggle="collapse" data-parent="#accordion2">
            {{ obj.name }}
        </a>
        <div style="padding: 8px 15px;">
            <p>Company: {{ obj.company }}</p>
            <p>Type: {{ obj.type }}</p>
            {% if obj.salary %}<p>Salary: {{ obj.salary }}</p>{% endif %}
            <p>Location: {{ obj.location }}</p>
        </div>
    </div>
    <div id="collapse_{{ obj.id }}" class="accordion-body collapse in">
        <div class="accordion-inner">
            <p>Contact: <a href="mailto:{{ obj.get_contact_email }}">{{ obj.get_contact_email }}</a></p>
            {{ obj.description }}
        </div>
    </div>
</div>
{% endwith %}

Start up your dev server for search!

Related Items

Adding Related Items is as simple as using the more_like_this tag from the haystack more_like_this tag library and tweaking our Solr config. Open up solrconfig.xml and add a MoreLikeThisHandler within the <config> tag:

<requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
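
Once the handler is registered and Solr has been restarted, you can query it directly to check that it responds before touching the templates. The sketch below only builds the request URL with the Python 3 standard library. It assumes the base URL from HAYSTACK_CONNECTIONS, that haystack stored each document id as app_label.model_name.pk (for example jobs.job.1), and that text is the field to compare on. The helper name mlt_url is hypothetical.

```python
from urllib.parse import urlencode


def mlt_url(base_url, doc_id, field="text", rows=5):
    """Build a URL asking the /mlt handler for documents similar to doc_id."""
    params = [
        ("q", "id:%s" % doc_id),   # the document to find neighbours for
        ("mlt.fl", field),         # field(s) Solr compares on
        ("rows", rows),            # how many similar documents to return
        ("wt", "json"),            # ask for a JSON response
    ]
    return "%s/mlt?%s" % (base_url.rstrip("/"), urlencode(params))


if __name__ == "__main__":
    print(mlt_url("http://127.0.0.1:8983/solr", "jobs.job.1"))
```

Paste the printed URL into a browser. With only a handful of jobs indexed, the handler's frequency thresholds (mlt.mintf and mlt.mindf) may filter every term out, so an empty result does not necessarily mean the handler is misconfigured.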

Our full _result_object.html now looks like this:

{% load more_like_this %}

{% with obj=result.object %}
<div class="accordion-group">
    <div class="accordion-heading">
        <a class="accordion-toggle" data-toggle="collapse" data-parent="#accordion2" href="#collapse_{{ obj.id }}">
            {{ obj.name }}
        </a>
        <div style="padding: 8px 15px;">
            <p>Company: {{ obj.company }}</p>
            <p>Type: {{ obj.type }}</p>
            {% if obj.salary %}<p>Salary: {{ obj.salary }}</p>{% endif %}
            <p>Location: {{ obj.location }}</p>
        </div>
    </div>
    <div id="collapse_{{ obj.id }}" class="accordion-body collapse in">
        <div class="accordion-inner">
            <p>Contact: <a href="mailto:{{ obj.get_contact_email }}">{{ obj.get_contact_email }}</a></p>
            {{ obj.description }}
            {% more_like_this obj as related_content limit 5 %}
            {% if related_content %}
                <div>
                    <br>
                    <p><strong>Related:</strong></p>
                    <ul>
                        {% for related in related_content %}
                            <li><a>{{ related.object.name }}</a></li>
                        {% endfor %}
                    </ul>
                </div>
            {% endif %}
        </div>
    </div>
</div>
{% endwith %}

Facets

To get our type and location facets, we’ll have to add them to a queryset and pass this to a FacetedSearchView instead of the default one. urls.py now looks like this:

from django.conf.urls import patterns, include, url
from haystack.forms import FacetedSearchForm
from haystack.query import SearchQuerySet
from haystack.views import FacetedSearchView

# Ask Solr for facet counts on the two faceted index fields
sqs = SearchQuerySet().facet('type').facet('location')

urlpatterns = patterns('haystack.views',
    url(r'^$', FacetedSearchView(form_class=FacetedSearchForm, searchqueryset=sqs), name='haystack_search'),
)

Then we can use the generated facets in the search template through the facets variable:

{% extends 'base.html' %}

{% block hero_text %}Search{% endblock %}
{% block header %}<p>Click around!</p>{% endblock %}

{% block content %}
<div class="span12">
    <h1>Search</h1>
    <form method="get" action="." class="form-search">
        <table>
            {{ form.as_table }}
        </table>
        <input type="submit" value="Search">
    </form>
</div>
{% if query %}
    <div class="span2">
        <h3>Filter</h3>
        {% if facets.fields.type %}
            <div>
                <h4>Type</h4>
                <ul>
                {% for type in facets.fields.type %}
                    <li><a href="{{ request.get_full_path }}&amp;selected_facets=type_exact:{{ type.0|urlencode }}">{{ type.0 }}</a> ({{ type.1 }})</li>
                {% endfor %}
                </ul>
            </div>
        {% endif %}
        {% if facets.fields.location %}
            <div>
                <h4>Location</h4>
                <ul>
                {% for location in facets.fields.location %}
                    <li><a href="{{ request.get_full_path }}&amp;selected_facets=location_exact:{{ location.0|urlencode }}">{{ location.0 }}</a> ({{ location.1 }})</li>
                {% endfor %}
                </ul>
            </div>
        {% endif %}
    </div>
    <div class="span6">
        <h3>Results</h3>
        <div class="accordion" id="accordion2">
            {% for result in page.object_list %}
                {% include 'search/_result_object.html' %}
            {% empty %}
                <p>No results found.</p>
            {% endfor %}
        </div>

        {% if page.has_previous or page.has_next %}
            <div>
                {% if page.has_previous %}<a href="?q={{ query }}&amp;page={{ page.previous_page_number }}">{% endif %}&laquo; Previous{% if page.has_previous %}</a>{% endif %}
                |
                {% if page.has_next %}<a href="?q={{ query }}&amp;page={{ page.next_page_number }}">{% endif %}Next &raquo;{% if page.has_next %}</a>{% endif %}
            </div>
        {% endif %}
    </div>
{% else %}
    <div class="span6">
        {# Show some example queries to run, maybe query syntax, something else? #}
    </div>
{% endif %}
{% endblock %}
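
The facet links in this template append a selected_facets parameter of the form field_exact:value to the current URL, and the faceted form later splits each one back into a field and a value to narrow the results. The sketch below mimics both halves in plain Python 3 so the format is easy to see. The helper names are hypothetical and this is not haystack's own code.

```python
from urllib.parse import parse_qs, quote, urlsplit


def facet_link(full_path, field, value):
    """Build the href the template produces for one facet value."""
    return "%s&selected_facets=%s_exact:%s" % (full_path, field, quote(value))


def selected_facets(url):
    """Return (field, value) pairs for every selected_facets parameter."""
    query = parse_qs(urlsplit(url).query)
    pairs = []
    for raw in query.get("selected_facets", []):
        if ":" not in raw:
            continue  # ignore malformed entries
        field, value = raw.split(":", 1)
        pairs.append((field, value))
    return pairs


if __name__ == "__main__":
    link = facet_link("/?q=developer", "location", "New York, NY")
    print(link)
    print(selected_facets(link))
```

Note that the link is built by appending &selected_facets=... to request.get_full_path, so it relies on the current URL already containing a query string. That holds here because the filter column is only rendered inside {% if query %}.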

And we’re done! As I said, check out the haystack documentation for more information. Leave any questions in the comments and I’ll be sure to answer them. Spelling suggestions to come in the next post.

9 Comments

  1. I followed the same steps as mentioned above but my query returns no search results. I am using SQLite3 in my Django project.

    Rishabh Yadav on May 20, 2013 at 2:38 am
  2. Do you see results in the Solr admin console?

    Tim Broder on May 20, 2013 at 7:46 am
  3. Using solr 4.3.0 and haystack 1.2.7 and django 1.5.1

    Helpful tutorial, no real problems running it using sqlite3. I also had to add this line to the generated schema.xml to get solr to work properly:

    thanks

    geoff washam on May 20, 2013 at 9:34 pm
  4. Thank you for the tutorial. It helped me a lot; however, I’m stuck on one point.

    In trying to index my data on Solr, it gives me this error: http 404 error, missing required field: Id.

    How can that be resolved?

    kenneth on June 18, 2013 at 1:17 pm
  5. The demo above has a bug:

    def index_queryset(self):
        return self.get_model().objects.all()

    should be

    def index_queryset(self, using=None):
        return self.get_model().objects.all()

    If your queries return nothing, try this change and run python ./manage.py rebuild_index.

    伍正飞 on July 9, 2013 at 12:16 am
  6. Nice tutorial.
    After editing schema.xml, restart your Solr server.

    O on August 24, 2013 at 12:43 am
  7. Thanks for the detailed tutorial.

    For at least Haystack v. 2.1.0, search_indexes.py should be updated to have the following prototype:

    def index_queryset(self, using = None):

    Andy on February 11, 2014 at 3:42 pm
  8. May you please tell what is the need of

    from local_settings import *

    in settings.py file

    when I run
    python manage.py runserver I am getting error for this import

    Shireesha on July 25, 2014 at 7:17 pm
  9. @Shireesha

    We use that pattern to hold environment specific settings.

    settings.py has all the settings that don’t change; local_settings.py has things such as DB credentials.

    Tim on July 28, 2014 at 8:03 am
