How to add full-text search to your Django app (with a PostgreSQL backend)
In Django 1.10, the django.contrib.postgres.search module was added to make it really easy to use PostgreSQL's full text search engine with a Django app.
I currently use it for this blog, Remote Python's job and developer search, and our knowledge base for Highview Apps. I'm really impressed with how good the search is and how little code is needed to get it working. Unless you have huge amounts of data or traffic and need really advanced search, you can keep your infrastructure simple by taking advantage of this feature instead of adding a separate Solr or Elasticsearch service. Another plus is that the search runs directly against your tables, so you don't have to think about keeping a separate search index in sync with your database records.
Before this feature was added, I was just doing very basic search by essentially using SQL LIKE operators on this site. The funny thing is the code is actually simpler with this module. You can even do more advanced stuff like adding weights to certain fields.
Let's get started by taking a look at the view code of a simple Blog app.
    # views.py
    from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
    from django.views.generic import ListView

    from myproject.apps.blog.models import Blog


    class BlogSearchListView(ListView):
        """
        Display a Blog List page filtered by the search query.
        """
        model = Blog
        paginate_by = 10

        def get_queryset(self):
            qs = Blog.objects.published()

            keywords = self.request.GET.get('q')
            if keywords:
                query = SearchQuery(keywords)
                vector = SearchVector('title', 'content')
                qs = qs.annotate(search=vector).filter(search=query)
                qs = qs.annotate(rank=SearchRank(vector, query)).order_by('-rank')

            return qs
Here, we're just passing all the keywords to a SearchQuery object. By default, the keywords are treated as plain terms, and a document has to contain all of them to match. In the SearchVector, we simply specify which model fields to search. We then use SearchRank to order the queryset results by relevance. PostgreSQL computes the ranking for us based on things like how often the terms appear in the document and how close together they are.
Let's tweak the code just a little to add weighting to the fields. When searching blog posts or articles, we'd normally want to put more weight on the title, since that basically tells you what the document is about. If a keyword appears in the title, that document is more likely to be relevant to what the user is searching for.
    # views.py
    from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
    from django.views.generic import ListView

    from myproject.apps.blog.models import Blog


    class BlogSearchListView(ListView):
        """
        Display a Blog List page filtered by the search query.
        """
        model = Blog
        paginate_by = 10

        def get_queryset(self):
            qs = Blog.objects.published()

            keywords = self.request.GET.get('q')
            if keywords:
                query = SearchQuery(keywords)
                title_vector = SearchVector('title', weight='A')
                content_vector = SearchVector('content', weight='B')
                vectors = title_vector + content_vector
                qs = qs.annotate(search=vectors).filter(search=query)
                qs = qs.annotate(rank=SearchRank(vectors, query)).order_by('-rank')

            return qs
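In the code above, we simply created two separate vectors for the "title" and "content" fields and combined them with the + operator. We assigned the weight "A" to title and "B" to content. The weight labels run from "D" (lowest) to "A" (highest), so a match in the title counts for more than a match in the content when the results are ranked.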
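To get a feel for why the weight labels change the ordering, here is a toy sketch in plain Python. It is not PostgreSQL's actual ranking formula: it just multiplies the number of keyword hits in each field by the default numeric value PostgreSQL assigns to each label (0.1, 0.2, 0.4 and 1.0 for D, C, B and A). The function names and the sample posts are made up for illustration.

```python
# Toy illustration only: this is NOT PostgreSQL's ts_rank formula.
import re

# Default numeric values PostgreSQL assigns to the weight labels D, C, B, A.
WEIGHTS = {"D": 0.1, "C": 0.2, "B": 0.4, "A": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_rank(fields, keywords):
    """Sum weight * keyword occurrences over (text, weight_label) pairs."""
    terms = set(tokenize(keywords))
    score = 0.0
    for text, label in fields:
        hits = sum(1 for word in tokenize(text) if word in terms)
        score += WEIGHTS[label] * hits
    return score


# Hypothetical posts as (title, content) pairs.
posts = [
    ("Deploying Django", "Notes on servers, with one mention of search."),
    ("Full-text search in Django", "How to query documents quickly."),
]

# Title gets weight A, content gets weight B, as in the view above.
ranked = sorted(
    posts,
    key=lambda post: toy_rank([(post[0], "A"), (post[1], "B")], "search"),
    reverse=True,
)
```

Both posts mention "search" exactly once, but the one with the keyword in its title sorts first because the title hit is multiplied by the "A" weight.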
There are more things you can do with this search module, such as using Trigram Similarity. I actually haven't played with this feature myself as I found using weighting and having PostgreSQL figure out the relevancy by default is more than good enough for my use cases.
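For the curious, trigram similarity compares strings by the three-character chunks they share, which is what makes it tolerant of typos. In Django it is exposed as TrigramSimilarity in django.contrib.postgres.search and needs the pg_trgm extension installed in the database. The snippet below is only a rough plain-Python approximation of the measure, written to show the idea; the function names are my own, and the real computation happens inside PostgreSQL.

```python
import re


def trigrams(text):
    """Return the set of trigrams in text, padding each word the way pg_trgm does."""
    grams = set()
    # Non-alphanumeric characters act as word separators.
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


exact = similarity("django", "django")   # identical strings
typo = similarity("djagno", "django")    # misspelling still shares some trigrams
```

A misspelled keyword still gets a non-zero score against the correct word, which is why this approach is handy for "did you mean" style matching where an exact full-text match would return nothing.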
For a more detailed explanation, I recommend reading the full text search section of the official Django documentation (django.contrib.postgres.search).
Tags: howto, python, django, tech, software development, database, postgresql, search