Update (November 4, 2014): While this post still contains some useful and relevant information,
we have released advanced query functionality which solves a lot of the problems this post discusses.
You can read more about it in our queries blog post.
In Part 1 of this post, we covered a series of common SQL queries and how they can be recreated in Firebase, building off our authoritative Denormalizing is Normal post from last year. We’re going to build on several principles introduced in those articles, so be sure to check those out before digging into this post.
In this tutorial, we’ll explore a fast and powerful approach to performing text searches, or content searches, in Firebase.
Why Not Just WHERE foo LIKE '%bar%'?
The 20-year-old SQL paradigm for content queries (WHERE foo LIKE '%bar%') is a staple for static databases, but not a simple prospect for real-time data. Our team is hard at work on a series of tools to bring content searches into the lightning-fast realm of Firebase’s NoSQL data store. Look for more news on these indexing and query-related tools in the coming months.
Until then, I’d like to introduce you to a few quick scripts that can add powerful content searches to your app. At the end of the article, I’ll share a library that incorporates these strategies into a service you can clone, configure, and run on your own box.
Introducing ElasticSearch
ElasticSearch, based on Lucene, is an extremely powerful document storage and indexing tool. However, at its core is a very simple search feature, which is nearly plug-and-play compatible with Firebase.
While it’s certainly not the only way to write content searches in Firebase, ElasticSearch’s simple integration makes it fast to implement and takes very little code to utilize, while its powerful indexing capabilities provide for customization, allowing it to scale with your app.
You can set up a local instance for testing in three steps:
- Download ElasticSearch
- Decompress the archive
- Run ./bin/elasticsearch
It’s really that simple! And, surprisingly, deploying a free, hosted instance requires nothing more than a button click thanks to Heroku’s Bonsai add-on.
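Once the local instance is running, a quick way to confirm it is reachable is to request its root URL, which answers with a small JSON document describing the node. The sketch below assumes the default localhost:9200 address used throughout this post; parseNodeInfo and checkElasticSearch are hypothetical helper names, not part of any library.

```javascript
var http = require('http');

// Pull the interesting bits out of the JSON that ElasticSearch serves at "/".
function parseNodeInfo(body) {
   var info = JSON.parse(body);
   return {
      version: info.version && info.version.number,
      tagline: info.tagline
   };
}

// Fetch http://<host>:<port>/ and hand the parsed result to callback(err, info).
function checkElasticSearch(host, port, callback) {
   http.get({ host: host, port: port, path: '/' }, function(res) {
      var body = '';
      res.setEncoding('utf8');
      res.on('data', function(chunk) { body += chunk; });
      res.on('end', function() {
         var info;
         try { info = parseNodeInfo(body); }
         catch (e) { return callback(e); }
         callback(null, info);
      });
   }).on('error', callback);
}

// Example usage (requires a running instance):
// checkElasticSearch('localhost', 9200, function(err, info) {
//    if( err ) console.error('ElasticSearch is not reachable', err);
//    else console.log('running ElasticSearch', info.version);
// });
```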
Indexing Firebase Data
The first step is to get data into ElasticSearch so it can be indexed. A simple Node.js script can plug Firebase into ElasticSearch with a few lines of work. I utilized the node-elasticsearch-client library, which is optional, but simplifies the process by wrapping the lower level ElasticSearch client:
var Firebase = require('firebase');
var ElasticClient = require('elasticsearchclient');

// the ElasticSearch index and type the widgets are stored under
var INDEX = 'firebase';
var TYPE = 'widget';

// initialize our ElasticSearch API
var client = new ElasticClient({ host: 'localhost', port: 9200 });

// listen for changes to Firebase data
var fb = new Firebase('https://<INSTANCE>.firebaseio.com/widgets');
fb.on('child_added', createOrUpdateIndex);
fb.on('child_changed', createOrUpdateIndex);
fb.on('child_removed', removeIndex);

function createOrUpdateIndex(snap) {
   client.index(INDEX, TYPE, snap.val(), snap.key())
      .on('data', function(data) { console.log('indexed ', snap.key()); })
      .on('error', function(err) { /* handle errors */ })
      .exec(); // dispatch the request
}

function removeIndex(snap) {
   client.deleteDocument(INDEX, TYPE, snap.key(), function(error, data) {
      if( error ) console.error('failed to delete', snap.key(), error);
      else console.log('deleted', snap.key());
   });
}
Drop that in a hosting environment like Heroku or Nodejitsu, or onto your own host with forever to monitor up-time, and search indexing is done! Now it’s time to read some of that data back.
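The indexing script above sends the whole record, snap.val(), to ElasticSearch. If only some fields should be searchable, the record can be trimmed first. The following is a minimal sketch, not part of the original script; pickFields is a hypothetical helper, and the title and description field names are made up for illustration.

```javascript
// Copy only the listed fields from a Firebase record into a new object.
// Non-object values (Firebase can store bare strings or numbers) are wrapped
// so that ElasticSearch always receives a JSON object.
function pickFields(record, fields) {
   if( record === null || typeof record !== 'object' ) {
      return { value: record };
   }
   var doc = {};
   fields.forEach(function(field) {
      if( Object.prototype.hasOwnProperty.call(record, field) ) {
         doc[field] = record[field];
      }
   });
   return doc;
}
```

Inside createOrUpdateIndex, pickFields(snap.val(), ['title', 'description']) would then take the place of snap.val().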
A Brute Force Search
Once we have our data indexed in ElasticSearch, we could directly query our index using a wrapper like elastic.js. This is a perfectly reasonable option, but does add coupling and dependencies to the client:
<script src="elastic.min.js"></script>
<script src="elastic-jquery-client.min.js"></script>
<script>
   // assumes the elastic.js release that bundles the jQuery client
   ejs.client = ejs.jQueryClient('http://localhost:9200');
   ejs.Request({ indices: 'firebase', types: 'widget' })
      .query(ejs.MatchQuery('title', 'foo'))
      .doSearch(function (response) {
         // handle response
      });
</script>
Since our clients are already using Firebase, wouldn’t it be great to keep our client code agnostic and push the request to Firebase instead?
A Firebase Search Queue
This little Node script listens at /search/request for incoming searches, handles the interactions with ElasticSearch, and pushes results back into /search/response:
var Firebase = require('firebase');
var ElasticClient = require('elasticsearchclient');

// initialize our ElasticSearch API
var client = new ElasticClient({ host: 'localhost', port: 9200 });

// listen for requests at https://<INSTANCE>.firebaseio.com/search/request
var queue = new Firebase('https://<INSTANCE>.firebaseio.com/search');
queue.child('request').on('child_added', processRequest);

function processRequest(snap) {
   snap.ref().remove(); // clear the request after we receive it
   var dat = snap.val();
   // Query ElasticSearch
   client.search(dat.index, dat.type, { query: { query_string: { query: dat.query } } })
      .on('data', function(data) {
         // the client hands back the raw JSON response; keep only its hits
         // section ({ total: ..., hits: [...] }), which is what the browser code reads
         var results = JSON.parse(data).hits || { total: 0 };
         // Post the results to https://<INSTANCE>.firebaseio.com/search/response
         queue.child('response/'+snap.key()).set(results);
      })
      .on('error', function(error){ /* process errors */ })
      .exec();
}
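Because any client can write to /search/request, it may be worth checking each request before passing it to ElasticSearch. The following is a sketch under that assumption; buildSearchBody is a hypothetical helper, not part of the script above, and it produces the same query_string body that processRequest sends.

```javascript
// Validate a queued request and build the ElasticSearch query body for it.
// Returns null when the request is missing a required field.
function buildSearchBody(request) {
   var required = ['index', 'type', 'query'];
   var valid = request !== null && typeof request === 'object' &&
      required.every(function(key) {
         return typeof request[key] === 'string' && request[key].length > 0;
      });
   if( !valid ) { return null; }
   return { query: { query_string: { query: request.query } } };
}
```

In processRequest, a null result could be answered with an error record under /search/response instead of running a search.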
A Client Example
Now that we have a way to queue requests into Firebase, the client can simply push requests and listen for results:
<script>
   var queue = new Firebase('https://<INSTANCE>.firebaseio.com/search');

   function search(index, type, searchTerm, callback) {
      // post search requests to https://<INSTANCE>.firebaseio.com/search/request
      var reqRef = queue.child('request').push({ index: index, type: type, query: searchTerm });
      // read the replies from https://<INSTANCE>.firebaseio.com/search/response
      queue.child('response/'+reqRef.key()).on('value', function fn(snap) {
         if( snap.val() !== null ) {     // wait for data
            snap.ref().off('value', fn); // stop listening
            snap.ref().remove();         // clear the queue
            callback(snap.val());
         }
      });
   }

   // invoke a search for *foo*
   search('firebase', 'widget', '*foo*', function(data) {
      console.log('got back '+data.total+' hits');
      if( data.hits ) {
         data.hits.forEach(function(hit) {
            console.log(hit);
         });
      }
   });
</script>
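Each entry in data.hits is a standard ElasticSearch hit, which carries the document key in _id, the relevance in _score, and the indexed record in _source. As a sketch, a hypothetical summarizeHits helper could flatten the results into something easier to render; it assumes the response has the { total, hits } shape used in the example above.

```javascript
// Flatten ElasticSearch hits into { key, score, record } objects.
function summarizeHits(data) {
   if( !data || !Array.isArray(data.hits) ) { return []; }
   return data.hits.map(function(hit) {
      return { key: hit._id, score: hit._score, record: hit._source };
   });
}
```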
A Pre-Built Library for Your Use
We’ve implemented a content search for Firebase using ElasticSearch, set up a queue to ferry data transparently to and from our search engine, and finished all of this in a couple of short scripts. We’ve also tapped into the powerful, flexible, and scalable world of ElasticSearch, which will grow with our app.
Easy enough? I’ve gone ahead and baked these scripts into a GitHub repo just to make things even simpler: Fork the Flashlight service on GitHub.
It’s MIT Licensed and ready for cloning. Just edit the config file, start it up, and get back to work on your app!
We want your feedback!
Have fun and let us know how we did! We love getting feedback from our dev community. Chat with us in the comment thread or send an email to wulf@firebase.com. You can also follow @Firebase on Twitter.