Integrating Elasticsearch in the VoIPGRID platform
Written by Hans Adema on 14th July 2017
In my previous article, I showed you how I used the results from Elasticsearch to optimize the search results for the VoIPGRID telephony platform. A lot has happened since then: my final report has been handed in and my internship is done!
Last month I gave a talk outlining the basics of my new and improved search functionality. However, I didn’t really have the time to go into detail about how my prototype actually works. In this blog post, I would like to explain how I integrated Elasticsearch in VoIPGRID.
The Elasticsearch implementation for VoIPGRID leans heavily on two existing libraries: Elasticsearch DSL and Django Elasticsearch DSL.
Elasticsearch DSL is an official wrapper for the main Elasticsearch client for Python, and it makes it easy to use Elasticsearch in any Python project. You can define the schema for your Elasticsearch documents as Python classes (not unlike how database models in Django are defined) and build your search queries with a fluent, object-oriented API (again, not unlike Django’s QuerySet). Elasticsearch DSL allows you to use Elasticsearch without having to deal with clunky JSON documents manually.
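To make the contrast with hand-written JSON a bit more concrete, here is a minimal sketch of what a fluent query builder does under the hood. ToySearch, its methods and the "users" index are hypothetical names made up for this illustration; this is not the Elasticsearch DSL API, only a toy that produces the same kind of JSON request body (a bool query with match and term clauses).

```python
import json


class ToySearch:
    """Hypothetical, heavily simplified fluent builder (NOT Elasticsearch DSL)."""

    def __init__(self, index):
        self.index = index
        self._must = []    # scored full-text clauses
        self._filter = []  # unscored yes/no clauses
        self._from = 0
        self._size = 10

    def _clone(self):
        # Every chained call works on a copy, so the original stays reusable.
        clone = ToySearch(self.index)
        clone._must = list(self._must)
        clone._filter = list(self._filter)
        clone._from, clone._size = self._from, self._size
        return clone

    def query(self, field, text):
        clone = self._clone()
        clone._must.append({"match": {field: text}})
        return clone

    def filter(self, field, value):
        clone = self._clone()
        clone._filter.append({"term": {field: value}})
        return clone

    def to_dict(self):
        # This is the JSON body you would otherwise have to write by hand.
        return {
            "query": {"bool": {"must": self._must, "filter": self._filter}},
            "from": self._from,
            "size": self._size,
        }


search = ToySearch("users").query("name", "hans").filter("active", True)
body = search.to_dict()
print(json.dumps(body, indent=2))
```

The real library offers far more than this, but the idea is the same: each method call returns a new search object, and the JSON is only generated at the very end.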
While the way of defining documents in Elasticsearch DSL is similar to Django, it’s in no way tied to the framework. Django Elasticsearch DSL is a library which bridges the gap between Django and Elasticsearch DSL. It lets you do a few things:
- Assign a model class to Elasticsearch DocType classes, so when the model is saved, the corresponding DocType is updated as well.
- List the fields from the model you want to index, and Django Elasticsearch DSL will automatically convert them to Elasticsearch data types.
- Use the “search_index” management command to easily rebuild your Elasticsearch indexes and fill them with data from your Django database.
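The three points above can be sketched without Django or Elasticsearch at all. Everything in the example below (ToyIndex, ToyRegistry, the User class and its fields) is a hypothetical stand-in that I made up to show the mechanism; the real library hooks into Django's model machinery instead of a hand-written save() method, and its actual classes look different.

```python
class ToyIndex:
    """In-memory stand-in for an Elasticsearch index (document id -> body)."""

    def __init__(self):
        self.documents = {}


class ToyRegistry:
    """Hypothetical registry that links model classes to an index."""

    def __init__(self):
        self._links = {}  # model class -> (index, names of the indexed fields)

    def register(self, model_cls, index, fields):
        self._links[model_cls] = (index, tuple(fields))

    def update(self, instance):
        # Copy only the listed fields into the index document.
        index, fields = self._links[type(instance)]
        index.documents[instance.pk] = {
            name: getattr(instance, name) for name in fields
        }

    def rebuild(self, model_cls, instances):
        # Rough equivalent of rebuilding an index from the database.
        index, _ = self._links[model_cls]
        index.documents.clear()
        for instance in instances:
            self.update(instance)


registry = ToyRegistry()


class User:
    """Stand-in for a Django model; a real model would also hit the database."""

    def __init__(self, pk, name, email, password):
        self.pk, self.name, self.email, self.password = pk, name, email, password

    def save(self):
        # Saving the model also refreshes the search document.
        registry.update(self)


user_index = ToyIndex()
registry.register(User, user_index, fields=["name", "email"])

user = User(1, "Hans", "hans@example.com", "secret")
user.save()
print(user_index.documents)
```

Note that the password never ends up in the index, because only the listed fields are copied.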
Keeping search simple with SimpleSearch
Django Elasticsearch DSL handles the issue of keeping the Elasticsearch indexes up to date, but doesn’t do anything meaningful when it comes to executing search queries. Elasticsearch DSL makes composing search queries Pythonic, but it’s a bit too explicit to be conveniently used directly throughout an application.
That’s why I wrapped the construction of search queries in a class called SimpleSearch. You give SimpleSearch a model instance, and it creates a Search object from Elasticsearch DSL and points it to the indexes and document types which have been defined for the model.
You can also pass additional filters from the view (like active or deleted filters), a search text together with the fields which should be searched, and an instance of the currently logged in user to automatically apply authorization filters. These are then transformed into the appropriate method calls on the Search object.
You can then pass the SimpleSearch object as an item list to your view and use it like it’s a Django QuerySet. You can slice it and paginate it, and SimpleSearch will pass the options to the DSL Search object. When you iterate over it, SimpleSearch will execute the Elasticsearch query and build you a response.
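The QuerySet-like behaviour described above can be sketched in a few lines. This is not the actual SimpleSearch code: ToySimpleSearch, FakeBackend, the constructor arguments and the client_id authorization rule are all assumptions made for this example, and the backend is a plain Python list instead of Elasticsearch. What it does show is the pattern: slicing only records the requested window, and the query is executed when you iterate.

```python
import copy


class FakeBackend:
    """Stands in for Elasticsearch: searches a list of dicts in memory."""

    def __init__(self, documents):
        self.documents = documents
        self.calls = 0  # counts how often a query was actually executed

    def execute(self, text, fields, filters, start, stop):
        self.calls += 1
        hits = [
            doc for doc in self.documents
            if all(doc.get(key) == value for key, value in filters.items())
            and (not text or any(
                text.lower() in str(doc.get(field, "")).lower()
                for field in fields
            ))
        ]
        return hits[start:stop]


class ToySimpleSearch:
    """Hypothetical lazy wrapper; only non-negative plain slices are supported."""

    def __init__(self, backend, text="", fields=(), filters=None, user=None):
        self.backend = backend
        self.text = text
        self.fields = tuple(fields)
        self.filters = dict(filters or {})
        if user is not None:
            # Assumed authorization rule: only show the user's own client.
            self.filters["client_id"] = user["client_id"]
        self._start, self._stop = 0, None

    def __getitem__(self, key):
        if not isinstance(key, slice) or key.step is not None:
            raise TypeError("this sketch only supports plain slices")
        clone = copy.copy(self)  # slicing never touches the backend
        clone._start = self._start + (key.start or 0)
        if key.stop is not None:
            stop = self._start + key.stop
            clone._stop = stop if self._stop is None else min(stop, self._stop)
        return clone

    def __iter__(self):
        # The query only runs here, when the results are needed.
        return iter(self.backend.execute(
            self.text, self.fields, self.filters, self._start, self._stop
        ))


backend = FakeBackend([
    {"id": 1, "name": "Hans", "client_id": 7, "active": True},
    {"id": 2, "name": "Hanna", "client_id": 7, "active": True},
    {"id": 3, "name": "Hank", "client_id": 8, "active": True},
    {"id": 4, "name": "Johan", "client_id": 7, "active": False},
])
results = ToySimpleSearch(
    backend, text="han", fields=["name"],
    filters={"active": True}, user={"client_id": 7},
)
page = results[0:1]  # nothing has been executed yet
print([doc["name"] for doc in page])
```

Because the object is sliceable and iterable, a paginator or template that expects a QuerySet-like list can work with it without knowing that a search engine is involved.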
How to interact with SimpleSearch (and how it interacts with Elasticsearch)
This brings us to one of the less conventional features of this implementation: you do not use the Elasticsearch results directly. SimpleSearch wraps the Elasticsearch response in an instance of ModelResponse. ModelResponse subsequently uses the IDs of the documents to load the corresponding