forked from : http://www.alexanderinteractive.com/blog/2012/08/getting-started-with-solr-and-django/
Getting Started with Solr and Django
Solr is a very powerful search tool, and the basics, such as full text search, facets, and related items, are easy to get up and running quickly. We will use haystack to handle the communication between Django and Solr. All code for this can be viewed on github.
Install
Assuming you already have Django up and running, the first thing we need to do is install Solr.
unzip apache-solr-4.0.0-BETA.zip
cd apache-solr-4.0.0-BETA
cd example
java -jar start.jar
Next, install pysolr and haystack. (At the time of this writing, the git checkout of haystack works better with the Solr 4.0 beta than the 1.2.7 release that’s in pip.)
pip install pysolr
pip install -e git+https://github.com/toastdriven/django-haystack.git#egg=django-haystack
Add ‘haystack’ to INSTALLED_APPS in settings.py and add the following haystack connection:
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
    },
}
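The connection above names only the engine. Haystack's Solr backend also needs to know where Solr is listening, so a fuller settings.py entry might look like the sketch below. The 'URL' value is an assumption: it is the address the bundled example server normally listens on, so adjust it to match your install.

```python
# settings.py (sketch). The 'URL' value is an assumption based on the
# default port of Solr's bundled example server; change it if yours differs.
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://127.0.0.1:8983/solr',
    },
}
```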
Full Text Search
For the example, we’re going to create a simple job database that a recruiter might use. Here is the model:
from django.db import models
from django.contrib.localflavor.us import models as us_models

JOB_TYPES = (
    ('pt', 'Part Time'),
    ('ft', 'Full Time'),
    ('ct', 'Contract')
)

class Company(models.Model):
    name = models.CharField(max_length=64)
    address = models.TextField(blank=True, null=True)
    contact_email = models.EmailField()

    def __unicode__(self):
        return self.name

class Location(models.Model):
    city = models.CharField(max_length=64)
    state = us_models.USStateField()

    def __unicode__(self):
        return "%s, %s" % (self.city, self.state)

class Job(models.Model):
    name = models.CharField(max_length=64)
    description = models.TextField()
    salary = models.CharField(max_length=64, blank=True, null=True)
    type = models.CharField(max_length=2, choices=JOB_TYPES)
    company = models.ForeignKey(Company, related_name='jobs')
    location = models.ForeignKey(Location, related_name='location_jobs')
    contact_email = models.EmailField(blank=True, null=True)
    added_at = models.DateTimeField(auto_now=True)

    def __unicode__(self):
        return self.name

    def get_contact_email(self):
        if self.contact_email:
            return self.contact_email
        return self.company.contact_email
The next step is to create the SearchIndex object that will be used to transfer the data to Solr. Save this as search_indexes.py in the same folder as your models.py. The text field with its template will be used for full text search on Solr. The other two fields will be used for faceted (drill down) navigation. For more details on this file, check out the haystack tutorial.
from haystack import indexes
from jobs.models import Job

class JobIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    type = indexes.CharField(model_attr='type', faceted=True)
    location = indexes.CharField(model_attr='location', faceted=True)

    def get_model(self):
        return Job

    # Haystack 2.x releases pass a `using` argument, so accept it here.
    def index_queryset(self, using=None):
        return self.get_model().objects.all()
Create the search index template in your template folder with the following naming convention: search/indexes/[app]/[model]_text.txt
For us, this is templates/search/indexes/jobs/job_text.txt
{{ object.name }}
{{ object.description }}
{{ object.salary }}
{{ object.type }}
{{ object.added_at }}
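The naming convention above can be sketched as a small helper. The function name index_template_path is hypothetical and is not part of haystack; it only spells out how the app label, model name, and field name combine into the path.

```python
def index_template_path(app_label, model_name, field_name="text"):
    """Build the template path for a use_template=True field (hypothetical helper)."""
    # The model name is lowercased, so the Job model becomes "job".
    return "search/indexes/%s/%s_%s.txt" % (app_label, model_name.lower(), field_name)


print(index_template_path("jobs", "Job"))  # search/indexes/jobs/job_text.txt
```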
Now, let’s get our data into Solr. Run ./manage.py build_solr_schema to generate a schema.xml file, move it into example\solr\conf in your Solr install, and restart Solr. Note: if using Solr 4, edit this file and replace stopwords_en.txt with lang/stopwords_en.txt in all locations. To test everything and load your data, run manage.py rebuild_index. Subsequent updates can be made with manage.py update_index.
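If you would rather not do the Solr 4 stopwords edit by hand, the replacement can be scripted. This is a minimal sketch; the function name fix_stopwords_path and the sample line are illustrative assumptions, and your generated schema.xml may look different.

```python
import re


def fix_stopwords_path(schema_xml):
    """Rewrite stopwords_en.txt references to lang/stopwords_en.txt."""
    # The negative lookbehind skips references that already carry the lang/
    # prefix, so running the function on an already-fixed file changes nothing.
    return re.sub(r"(?<!lang/)stopwords_en\.txt", "lang/stopwords_en.txt", schema_xml)


# Illustrative line only; the real schema.xml comes from build_solr_schema.
sample = '<filter class="solr.StopFilterFactory" words="stopwords_en.txt"/>'
fixed = fix_stopwords_path(sample)
print(fixed)
```

To apply it for real, read the contents of schema.xml, pass them through fix_stopwords_path, and write the result back before copying the file into Solr.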
If that all worked we can start working on the front-end to see the data in Django. Add this to your urls.py
(r'^$', include('haystack.urls')),
At this point there are at least two templates we’ll need: one for the search results page, and a sub-template to represent each item we are pulling back. My example uses twitter bootstrap for some layout help and styling; see my base.html here if interested.
Create templates/search/search.html
This gives you a basic search form, the results, and pagination for a number of results
{% extends 'base.html' %}

{% block hero_text %}Search{% endblock %}

{% block header %}
Click around!
{% endblock %}

{% block content %}
<div class="span12">
    <h1>Search</h1>
    <form class="form-search" action="." method="get">
        {{ form.as_table }}
        <input type="submit" value="Search" />
    </form>
</div>

{% if query %}
<div class="span8">
    <h3>Results</h3>
    <div id="accordion2" class="accordion">
        {% for result in page.object_list %}
            {% include 'search/_result_object.html' %}
        {% empty %}
            No results found.
        {% endfor %}
    </div>
    {% if page.has_previous or page.has_next %}
    <div>
        {% if page.has_previous %}<a href="?q={{ query }}&page={{ page.previous_page_number }}">{% endif %}« Previous{% if page.has_previous %}</a>{% endif %}
        |
        {% if page.has_next %}<a href="?q={{ query }}&page={{ page.next_page_number }}">{% endif %}Next »{% if page.has_next %}</a>{% endif %}
    </div>
    {% endif %}
</div>
{% else %}
{% endif %}
{% endblock %}
And the templates/search/_result_object.html
{% with obj=result.object %}
<div class="accordion-group">
    <div class="accordion-heading">
        <a class="accordion-toggle" href="#collapse_{{ obj.id }}" data-toggle="collapse" data-parent="#accordion2">
            {{ obj.name }}
        </a>
        <div style="padding: 8px 15px;">
            Company: {{ obj.company }}
            Type: {{ obj.type }}
            {% if obj.salary %}
            Salary: {{ obj.salary }}
            {% endif %}
            Location: {{ obj.location }}
        </div>
    </div>
    <div id="collapse_{{ obj.id }}" class="accordion-body collapse in">
        <div class="accordion-inner">
            Contact: <a href="mailto:{{ obj.get_contact_email }}">{{ obj.get_contact_email }}</a>
            {{ obj.description }}
        </div>
    </div>
</div>
{% endwith %}
Start up your dev server for search!
Related Items
Adding Related Items is as simple as using the more_like_this tag from the haystack more_like_this tag library and tweaking our Solr config. Open up solrconfig.xml and add a MoreLikeThisHandler alongside the existing request handlers:
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
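To see what the new handler does, it can help to query it directly before wiring up the template. The sketch below only builds the request URL. The helper name more_like_this_url, the Solr address, and the document id format are assumptions: haystack typically stores ids like jobs.job.1, but check what your own index contains.

```python
from urllib.parse import urlencode


def more_like_this_url(solr_url, document_id, fields="text", rows=5):
    """Build a URL for the /mlt handler (hypothetical helper, not haystack API)."""
    params = [
        ("q", 'id:"%s"' % document_id),  # the document to find neighbours for
        ("mlt.fl", fields),              # field(s) Solr compares for similarity
        ("rows", rows),                  # how many related documents to return
        ("wt", "json"),
    ]
    return "%s/mlt?%s" % (solr_url.rstrip("/"), urlencode(params))


# Assumed address and id; adjust both to your setup.
url = more_like_this_url("http://127.0.0.1:8983/solr", "jobs.job.1")
print(url)
```

Opening the printed URL in a browser while Solr is running should list documents similar to the one named in q, which is roughly the data the template tag works with.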
Our full _result_object.html now looks like this:
{% load more_like_this %}
{% with obj=result.object %}
<div class="accordion-group">
    <div class="accordion-heading">
        <a class="accordion-toggle" data-toggle="collapse" data-parent="#accordion2" href="#collapse_{{ obj.id }}">
            {{ obj.name }}
        </a>
        <div style="padding: 8px 15px;">
            <p>Company: {{ obj.company }}</p>
            <p>Type: {{ obj.type }}</p>
            {% if obj.salary %}<p>Salary: {{ obj.salary }}</p>{% endif %}
            <p>Location: {{ obj.location }}</p>
        </div>
    </div>
    <div id="collapse_{{ obj.id }}" class="accordion-body collapse in">
        <div class="accordion-inner">
            <p>Contact: <a href="mailto:{{ obj.get_contact_email }}">{{ obj.get_contact_email }}</a></p>
            {{ obj.description }}
            {% more_like_this obj as related_content limit 5 %}
            {% if related_content %}
            <div>
                <br>
                <p><strong>Related:</strong></p>
                <ul>
                {% for related in related_content %}
                    <li><a>{{ related.object.name }}</a></li>
                {% endfor %}
                </ul>
            </div>
            {% endif %}
        </div>
    </div>
</div>
{% endwith %}
Facets
To get our type and location facets, we’ll have to add them to a queryset and pass this to a FacetedSearchView instead of the default one. urls.py now looks like this:
from django.conf.urls import patterns, include, url
from haystack.forms import FacetedSearchForm
from haystack.query import SearchQuerySet
from haystack.views import FacetedSearchView

sqs = SearchQuerySet().facet('type').facet('location')

urlpatterns = patterns('haystack.views',
    url(r'^$', FacetedSearchView(form_class=FacetedSearchForm, searchqueryset=sqs), name='haystack_search'),
)
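When a visitor clicks a facet link, the form receives one or more selected_facets values of the form field:value and uses them to narrow the results. The sketch below illustrates that parsing step in plain Python. It is an approximation for illustration, not haystack's actual code, and the function name narrow_queries is hypothetical.

```python
def narrow_queries(selected_facets):
    """Turn 'field:value' strings into narrowing queries (illustrative only)."""
    queries = []
    for facet in selected_facets:
        if ":" not in facet:
            continue  # ignore malformed entries
        # Split on the first colon only, so values may contain colons.
        field, value = facet.split(":", 1)
        if value:
            # A real implementation would also escape the value for Solr.
            queries.append('%s:"%s"' % (field, value))
    return queries


print(narrow_queries(["type_exact:ft", "bogus", "location_exact:Boston, MA"]))
```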
Then we can use the generated facets, available in the facets variable, in the search template:
{% extends 'base.html' %}
{% block hero_text %}Search{% endblock %}
{% block header %}<p>Click around!</p>{% endblock %}
{% block content %}
<div class="span12">
    <h1>Search</h1>
    <form method="get" action="." class="form-search">
        <table>
            {{ form.as_table }}
        </table>
        <input type="submit" value="Search">
    </form>
</div>
{% if query %}
<div class="span2">
    <h3>Filter</h3>
    {% if facets.fields.type %}
    <div>
        <h4>Type</h4>
        <ul>
        {% for type in facets.fields.type %}
            <li><a href="{{ request.get_full_path }}&selected_facets=type_exact:{{ type.0|urlencode }}">{{ type.0 }}</a> ({{ type.1 }})</li>
        {% endfor %}
        </ul>
    </div>
    {% endif %}
    {% if facets.fields.location %}
    <div>
        <h4>Location</h4>
        <ul>
        {% for location in facets.fields.location %}
            <li><a href="{{ request.get_full_path }}&selected_facets=location_exact:{{ location.0|urlencode }}">{{ location.0 }}</a> ({{ location.1 }})</li>
        {% endfor %}
        </ul>
    </div>
    {% endif %}
</div>
<div class="span6">
    <h3>Results</h3>
    <div class="accordion" id="accordion2">
        {% for result in page.object_list %}
            {% include 'search/_result_object.html' %}
        {% empty %}
            <p>No results found.</p>
        {% endfor %}
    </div>
    {% if page.has_previous or page.has_next %}
    <div>
        {% if page.has_previous %}<a href="?q={{ query }}&page={{ page.previous_page_number }}">{% endif %}« Previous{% if page.has_previous %}</a>{% endif %}
        |
        {% if page.has_next %}<a href="?q={{ query }}&page={{ page.next_page_number }}">{% endif %}Next »{% if page.has_next %}</a>{% endif %}
    </div>
    {% endif %}
</div>
{% else %}
<div class="span6">
    {# Show some example queries to run, maybe query syntax, something else? #}
</div>
{% endif %}
{% endblock %}
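The facet links in the template append &selected_facets=... to the current path, which works because the facet list is only rendered once a query string is present. The sketch below shows the same URL construction in plain Python, handling the case where no query string exists yet. The helper name facet_url is hypothetical.

```python
from urllib.parse import quote


def facet_url(full_path, field, value):
    """Append a selected_facets parameter to a path (hypothetical helper)."""
    # Use "?" when the path has no query string yet, "&" otherwise.
    separator = "&" if "?" in full_path else "?"
    return "%s%sselected_facets=%s_exact:%s" % (full_path, separator, field, quote(value))


print(facet_url("/?q=python", "type", "ft"))  # /?q=python&selected_facets=type_exact:ft
print(facet_url("/", "type", "ft"))           # /?selected_facets=type_exact:ft
```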
And we’re done! As I said, check out the haystack documentation for more information. Leave any questions in the comments and I’ll be sure to answer them. Spelling suggestions to come in the next post.
9 Comments
I followed the same steps as mentioned above but my query returns no search results. I am using Sqlite3 in my Django project.
Do you see results in the Solr admin console?
Using solr 4.3.0 and haystack 1.2.7 and django 1.5.1
Helpful tutorial, no real problems running it using sqlite3. I also had to add this line to the generated schema.xml to get solr to work properly:
thanks
Thank you for the tutorial. It helped me a lot, however I’m stuck on one point.
In trying to index my data on Solr, it gives me this error: http 404 error, missing required field: Id.
How can that be resolved?
The demo above has a bug:
def index_queryset(self):
    return self.get_model().objects.all()
should be
def index_queryset(self, using=None):
    return self.get_model().objects.all()
If your query returns nothing, try modifying it like this, and run python ./manage.py rebuild_index.
Nice tutorial.
After editing schema.xml, restart your Solr server.
Thanks for the detailed tutorial.
For at least Haystack v. 2.1.0, search_indexes.py should be updated to have the following prototype:
def index_queryset(self, using = None):
Could you please tell me why from local_settings import * is needed in the settings.py file? When I run python manage.py runserver I get an error for this import.
@Shireesha
We use that pattern to hold environment-specific settings.
settings.py has all the settings that don’t change, while local_settings.py has things such as db credentials.