Queries, Part 1: Common SQL Queries Converted for Firebase

October 1, 2013

Update (November 4, 2014): While this post still contains some useful and relevant information, we have released advanced query functionality which solves a lot of the problems this post discusses. You can read more about it in our queries blog post.

No WHERE clauses? No JOIN statements? No problem!

Coming from a SQL background as I did, it can take a while to grok the freedom of NoSQL data structures and the simplicity of Firebase’s dynamic, real-time query environment.

Part 1 of this double-header will will cover some of the common queries we know and love and talk about how they can be converted to Firebase queries. Part 2 goes on to cover some advanced query techniques for Firebase and a solution for full-blown content searches.

So let’s jump in. Here’s what we’re going to cover today:

Select a user by ID (WHERE id = x)
Find a user by email address (WHERE email = x)
Get messages posted yesterday (WHERE timestamp BETWEEN x AND y)
Paginate through widgets (LIMIT 10 OFFSET 10)
Join records using an id (FROM table1 JOIN table2 USING id)
More tools to come

Getting Started

This article relies heavily on the proper use of the following API calls, all of which are introduced in the documentation for Queries and Limiting Data:

startAt: Specify the first priority or record ID to be returned
endAt: Specify the last priority or record ID to be returned
limit: Return up to this many children, instead of all records at the location
on: Listen in real-time for any change to the data
once: Fetch data exactly one time

This article also leans heavily on theory from Anant’s authoritative post, Denormalizing is Normal. Where Anant’s post covers a wide breadth and highly foundational concepts, this post serves more as a quick reference and recipe book.

We’re going to work with the examples-sql-queries Firebase for all the examples. Feel free to browse the data and get a feel for the structure.

A lot of times in our docs, you’ll see something like var ref = new Firebase(URL); and then later, ref.child('user/1'). But in our examples we use new Firebase('URL/child/path'). So which should you use? They are functionally equivalent; use the one that keeps your code simple to read and maintain.

Using a variable will be a bit DRYer if it’s going to be referenced several times, but creating multiple Firebase instances does not incur any additional overhead as this is all optimized internally by the SDK.

Examples

Select a user by ID (`WHERE id = x`)

We’ll start off with the basics and build from here. In Firebase queries, records are stored in a “path”, which is simply a URL in the data hierarchy. In our sample data, we’ve stored our users at /user. So to retrieve record by it’s id, we just append it to the URL:

new Firebase('https://example-data-sql.firebaseio.com/user/1').once('value', function(snap) {
   console.log('I fetched a user!', snap.val());
});

See it work

Find a user by email address (`WHERE email = x`)

Selecting an ID is all good and fine. But what if I want to look up an account by something that’s not already part of the URL path?

Well this is where ordered data becomes our friend. Since we know that email addresses will be a common lookup method, we can call setPriority() whenever we add a new record. Then we can use that priority to look them up later.

new Firebase("https://examples-sql-queries.firebaseio.com/user")
    .startAt('kato@firebase.com')
    .endAt('kato@firebase.com')
    .once('value', function(snap) {
       console.log('accounts matching email address', snap.val())
    });

See it work

Pretty cool and useful for most cases, but what if we can’t use priorities? Or we need to search on more than one field? Well then it’s time to employ some indices!

See this example, which uses index/ to link email addresses to user accounts.

Get messages posted yesterday (`WHERE timestamp BETWEEN x AND y`)

What if we’d like to select a range of data? Ordering data with priorities is quite useful for this as well:

new Firebase("https://examples-sql-queries.firebaseio.com/messages")
    .startAt(startTime)
    .endAt(endTime)
    .once('value', function(snap) {
       console.log('messages in range', snap.val());
    });

See it work

Paginate through widgets (LIMIT 10 OFFSET 10)

First of all, let’s make some assertions. Unless we’re talking about a static data set, pagination behavior becomes very ambiguous. For instance, how do I define page numbers in a constantly changing data set where records are deleted or added frequently? How do I define the offset? The “last” page? If those questions are difficult to answer, then pagination is probably not the right answer for your use case.

Pagination for small, static data sets (less than 1MB) can be done entirely client side. For larger static data sets, things get a bit more challenging. Assuming we’re writing append-only data, we can use our ordered data examples above and assign each message a page number or a unique incremental counter and then use startAt()/endAt().

// fetch page 2 of messages
new Firebase("https://examples-sql-queries.firebaseio.com/messages")
    .startAt(2) // assumes the priority is the page number
    .endAt(2)
    .once('value', function(snap) {
       console.log('messages in range', snap.val());
    });

But what if we’re working with something like our [widgets path](https://examples-sql- queries.firebaseio.com/widget), which doesn’t have priorities? We can simply “start at” the last record on the previous page by passing null for a priority, followed by the last record id:

// fetch page 2 of widgets
new Firebase("https://examples-sql-queries.firebaseio.com/widget")
   .startAt(null, lastWidgetOnPrevPage)
   .limitToFirst(LIMIT+1) // add one to limit to account for lastWidgetOnPrevPage
   .once('value', function(snap) {
      var vals = snap.val()||{};
      delete vals[lastWidgetOnPrevPage]; // delete the extraneous record
      console.log('widgets on this page', vals);
   });

See it work or see a full pagination example

Join records using an id (FROM table1 JOIN table2 USING id)

Firebasers talk a lot about denormalization, which is great advice, but how do you put things back together once you’ve split them apart? Well, it’s a great deal simpler than it might seem.

Firebase is a real-time sync platform. It’s built for speed and efficiency. You don’t need to worry about creating extra references, and can listen to as many paths as you’d like to retrieve your data:

var fb = new Firebase("https://examples-sql-queries.firebaseio.com/");
fb.child('user/123').once('value', function(userSnap) {
   fb.child('media/123').once('value', function(mediaSnap) {
       // extend function: https://gist.github.com/katowulf/6598238
       console.log( extend({}, userSnap.val(), mediaSnap.val()) );
   });
});

See it work

If you are anything like me, your perfectionist instincts will be kicking in about now, since our merge logic synchronously waits for the [user](https://examples-sql- queries.firebaseio.com/user) data to load before grabbing media. It also gets a bit verbose if we add several paths to be joined. So let’s expand this into a utility that will merge any number of paths asynchronously, and stick a fork in it:

See three paths merged in parallel

More Tools to Come

Part 2 of this post covers some advanced techniques for performing content searching (e.g. WHERE description IS LIKE '%foo%').

We’re hard at work optimizing Firebase’s search and querying features by combining the best aspects of patterns like map reduce with the simplicity and speed of our real-time tools. Look for more news on this in the next few months! In the mean time, we’d love to hear your feedback. Let us know in the comments or email firebase-support@google.com.

The Firebase Blog