BigQuery and Firebase: Integration and Streaming

When you have an app out there in the world with users, naturally your app will generate a lot of data from both user activity and app behavior. Sometimes, you might have some questions about your app or your users’ experiences that can be answered by looking more closely at this data. Maybe you’re using Google Analytics and want to see the retention rate by country, or maybe you want to check Crashlytics to look at the number of crashes per day for the past month to ensure all of the bugs you worked so hard to fix truly made your app more stable over time.

If you’ve used the various product dashboards in the Firebase console before, you might have noticed that some of these questions can already be answered by the information provided there. But you can answer even more complex questions by integrating Firebase with BigQuery, where you can perform custom analysis on your data set in unique ways that are important to your app or business, and you can even join it with other data that you’ve collected. So, when should you consider integrating Firebase with BigQuery instead of just relying on the out-of-the-box dashboard analysis?

Get more out of Firebase data with BigQuery

BigQuery is Google Cloud’s enterprise data warehouse, which is designed to help ingest, store, analyze, and visualize an enormous amount of data. As such, it’s often used to get insights from data quickly; and when you integrate Firebase with BigQuery, you can dig into the data to identify patterns and trends. Exporting Firebase data to BigQuery also means that you can decide what to do with the data, like storing it for longer-term analysis, using it with other Google services, and even exporting it to third-party tooling.

You can also use BigQuery queries to get more actionable insights and explore your data in ways that are more customized than out-of-the-box data in the Firebase console. In the Firebase console, you can already, for example, get a breakdown of users by country and compare groups in the Google Analytics dashboard. When you take that data and export it to BigQuery, you can perform more advanced analysis, like group users based on when they first started using your app. You can then drill down even further to compare these groups by country, platform, business metrics, or other custom criteria you care about.

To export data to BigQuery from Firebase, first make sure you have the required IAM permissions to view or manage settings for data export to BigQuery. Next, go to the Integrations page in the Firebase console, click the BigQuery card, and then follow the instructions.

Which Firebase products and integrations can export data to BigQuery?

You can export data to BigQuery with these products and integrations: Analytics, Cloud Messaging, Crashlytics, Performance Monitoring, and A/B Testing. After integrating with BigQuery, you can export data from each of these products and use SQL to query the data to gain insights. You can also use a Firebase Extension to export Cloud Firestore collections to BigQuery. To learn what data is exported to BigQuery for each product and what you’ll be able to query, see each product’s data model schema along with examples: Analytics (blog post), Cloud Messaging (query examples), Crashlytics (query examples), and Performance Monitoring (query examples).

BigQuery integration for these products is fairly similar, with some variations, and you can learn more about BigQuery integration with each product in the documentation: A/B Testing, Analytics, Cloud Messaging, Crashlytics, and Performance Monitoring. This blog post focuses on Google Analytics and Crashlytics, as they have specific BigQuery features.

Get data into BigQuery in real time

When you link Firebase products and integrations to BigQuery, you gain insights into data for the previous day. In other words, today’s data isn’t available today, and you typically have to wait until tomorrow to query today’s data. Accessing today’s data can be critical for your app or business though, so Google Analytics and Crashlytics support streaming, which makes data for the current day available by publishing periodic updates to intraday tables throughout the day.

To enable streaming, follow the directions for Crashlytics BigQuery streaming and similarly for Google Analytics. Below, you can read more about each.

Google Analytics streaming

For Google Analytics streaming, BigQuery has real-time information that you can analyze about your users and their traffic on your property. For each day, streaming export creates one new table, events_intraday_YYYYMMDD, which is an internal staging table that includes records of session activity that took place during the day.

Most data that is automatically exported to BigQuery from Google Analytics is included in the intraday table, except for the following: Channel Grouping data, data that is widened through other Ads sources (like Google Ads, Campaign Manager 360, Google Ad Manager, etc.), and a few other fields. This intraday table can include partial records of a session when that session spans multiple export operations, and is eventually deleted when the daily export is complete.

The real-time information in the intraday table can be useful, for example, to understand the best time to offer a discount to consumers. Oftentimes, giving an offer right after a user showed intent (like adding a product to their cart) will convert consumers into actually making that purchase. In a case like this, it’s critical to use the freshest data to identify those users within minutes and deploy the right campaign to encourage a purchase now. Using Google Analytics streaming into BigQuery, you can make the following query to see when most users add a product to their cart (assuming that action triggers an event called add_to_cart).

SELECT event_timestamp FROM `app-name.table_name.events_intraday_20221109`
WHERE event_name = 'add_to_cart' 
ORDER BY event_timestamp

Streaming export has the following characteristics:

  • Is a best-effort operation
  • May omit some data for reasons such as the processing of late hits and/or failed uploads
  • Has data that is exported continuously throughout the day
  • Has no backfill for Google Analytics, meaning that until you link Google Analytics to BigQuery, there will be no Google Analytics data available in BigQuery. So if you plan on using BigQuery, or even if you just think you might use it, it’s probably worth linking it sooner rather than later so you have data available to query.

Export Limits

Important: While there is no limit on streaming export, standard Google Analytics properties have a daily BigQuery export limit of 1 million events. If your property consistently exceeds the export limit, the daily BigQuery export will be paused and the previous days’ exports will not be reprocessed. To increase this limit, upgrade to GA 360. To help you remain within the events limit, use data filtering to manage which events are exported.

Crashlytics streaming

When you integrate Crashlytics with BigQuery (without streaming), you automatically have a batch table. After you include streaming for Crashlytics, you’ll also see a real-time table in addition to the batch table.

The batch table has the following characteristics:

  • Has data exported once daily
  • Can provide backfilled data up to 90 days prior to the export
  • Is ideal for long-term analysis and identifying trends over time because Firebase durably stores events before writing them

The real-time table has the following attributes:

  • Has data exported in real time and is immediately written to BigQuery
  • Does not have backfill available
  • Is ideal for live dashboards and custom alerts

To benefit from both of these tables, you can combine them with a stitching query. See Example 9: Top 5 issues since DATE, including today.

Use BigQuery free of charge

When you use BigQuery, some features are free of charge and some features are a paid service. While BigQuery provides features that let you load and export data at no cost, BigQuery does charge for storing data, streaming inserts, and querying data.

Want to try out BigQuery to make sure it fits your needs? Use BigQuery Sandbox to evaluate and explore BigQuery without providing a form of payment. However, the Sandbox has the following limits:

  • A default table expiration time of 60 days, which means that the tables will be deleted in 60 days.
  • No access to some features like streaming to BigQuery, data transfer service, and data manipulation language statements.

If you want to access these paid features, you can upgrade at any time from the Sandbox to use BigQuery without these limits. To learn more, see BigQuery pricing.

Integrating A/B Testing, Analytics, Cloud Messaging, Crashlytics, and Performance Monitoring with BigQuery helps you gain more insights into your critical app and user data, giving you a better understanding of your users. Happy querying!