App users demand instant, dynamic app experiences, whether they’re chatting with an AI assistant, generating content in real time, or waiting for live data.
Since their introduction in 2017, Cloud Functions for Firebase’s callable functions have focused on providing you with a streamlined server(less) development experience that simplifies server-to-client interactions. Now, we’re excited to introduce two new features that make it easier than ever to build responsive, AI-powered apps:
- Streaming responses for callable functions – Deliver data incrementally to clients.
- A new onCallGenkit trigger – Seamlessly integrate Firebase Genkit to productionize generative AI workflows.
Send data to your app as it is generated
Traditionally, when a client application calls a server function, it has to wait for the entire response to be generated before it can do anything. This “all-or-nothing” approach works well for many scenarios, but it falls short when dealing with:
- Generative AI: LLMs, such as those behind the Gemini API, often produce responses token by token. Waiting for the entire generation to complete before displaying anything to the user leads to a sluggish, unresponsive experience.
- Large Datasets: Retrieving and processing large amounts of data (such as weather forecasts for multiple locations or real-time data feeds) can take time. Users shouldn’t have to stare at a loading spinner while the server churns.
With the new streaming capability for Callable Functions, your server can send data to the client in chunks, as it becomes available. The client can then process and display this data incrementally, creating a much more responsive and engaging user experience.
Example: Weather Forecast Aggregator
Imagine we’re building a web app for weather enthusiasts or frequent travelers who want to see the forecast for lots of locations at once, and that the app calls a callable function to get these forecasts. Without streaming, users would have to sit and wait while the forecast for each location is fetched, and could only start browsing the results once every location’s forecast has been returned from the server.
With streaming, our weather enthusiasts can start viewing each location’s data as soon as it’s ready:
To stream results, check the acceptsStreaming field of the request object, and call response.sendChunk every time you have a chunk of data that’s ready to send back to the client. Return the full dataset at the end for clients that don’t accept streaming.
const { onCall } = require("firebase-functions/v2/https");

exports.getForecast = onCall(async (request, response) => {
  const locations = request.data.locations;
  const allLocations = locations.map(async ({ latitude, longitude }) => {
    const forecast = await fetchWeather(latitude, longitude);
    // Stream each forecast as it resolves!
    if (request.acceptsStreaming) {
      response.sendChunk({ latitude, longitude, forecast });
    }
    return { latitude, longitude, forecast };
  });
  // Fallback for non-streaming clients
  return Promise.all(allLocations);
});
See the full sample code here.
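The fetchWeather helper isn’t shown here; the full sample linked above contains the real implementation. As a rough, hypothetical sketch (the endpoint and response shape below are placeholders, not part of the sample), it could look something like this:

// Hypothetical helper: fetch the forecast for a single coordinate pair.
// The endpoint below is a placeholder, not a real weather API.
async function fetchWeather(latitude, longitude) {
  const url = `https://weather.example.com/forecast?lat=${latitude}&lon=${longitude}`;
  const response = await fetch(url); // global fetch is built into Node.js 18+
  if (!response.ok) {
    throw new Error(`Forecast request failed with status ${response.status}`);
  }
  return response.json(); // e.g. { summary: "Sunny", highC: 24, lowC: 13 }
}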
Use the Firebase client SDKs to consume the streaming response from a web app:
import { getFunctions, httpsCallable } from "firebase/functions";

const getForecast = httpsCallable(getFunctions(), "getForecast");
const { stream, data } = await getForecast.stream({ locations });
for await (const chunk of stream) {
  updateUI(chunk); // Live updates!
}
const allData = await data; // Full response when done
If you’re already using callable functions in your app, adding streaming behavior may just be a small change to your server and client code that results in big user-experience benefits!
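To illustrate, call sites that invoke the callable the regular way keep working, because the function still returns the complete result to clients that don’t request streaming. Here’s a minimal sketch, assuming the getForecast function from above and a hypothetical renderForecasts UI helper:

import { getFunctions, httpsCallable } from "firebase/functions";

const getForecast = httpsCallable(getFunctions(), "getForecast");

// A regular, non-streaming invocation: the complete forecast list
// (the function's return value) arrives all at once in result.data.
const result = await getForecast({ locations });
renderForecasts(result.data); // hypothetical UI helper, like updateUI above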
For generative AI applications, we’ve built a new onCallGenkit trigger on top of this new streaming functionality.
Productionize Your AI-Powered Apps Faster
Firebase Genkit is an all-in-one toolkit to prototype, test, build, and monitor production-ready generative AI applications. The new onCallGenkit trigger now lets you deploy Genkit flows as secure, scalable APIs with one line:
// An example Genkit flow. (`ai` is an initialized Genkit instance;
// see the fuller sample below for the complete setup.)
const jokeTeller = ai.defineFlow(
  {
    name: "jokeTeller",
    // ...
  },
  async (jokeType, { sendChunk }) => {
    const { stream, response } = ai.generateStream(
      `Tell me a ${jokeType} joke.`,
    );
    for await (const chunk of stream) {
      sendChunk(chunk.text); // Stream results as they are generated!
    }
    return (await response).text; // Full response for non-streaming clients.
  },
);

// Deploy the flow with Cloud Functions for Firebase
exports.tellJoke = onCallGenkit(jokeTeller);
As above, use the Firebase client SDK to consume the results from a client app:
const tellJoke = httpsCallable(getFunctions(), "tellJoke");
const { stream, data } = await tellJoke.stream("knock-knock");
for await (const chunk of stream) {
  updateUI(chunk); // Live updates!
}
const allData = await data; // Full response when done
With onCallGenkit, you can leverage all the features we offer in Cloud Functions for Firebase, like:
- Simplified Deployment: Deploy your Genkit flows with just the firebase deploy command you are familiar with.
- Automatic Streaming: onCallGenkit automatically handles the streaming setup for you. Just use sendChunk() within your Genkit flow, and call the stream method from the client SDK.
- Built-in Abuse Prevention: Protect your AI-powered function with Firebase App Check.
- Secret Management: Securely store and access API keys and other sensitive information with enterprise-grade Cloud Secret Manager.
- Concurrency: Benefit from Cloud Functions for Firebase 2nd gen features like concurrency to scale your service efficiently. With concurrency, one function instance can handle multiple concurrent requests, a great fit for AI applications that depend heavily on API calls (see the sketch after this list).
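As a rough sketch of that last point (an illustration, not part of the official sample), here’s how you might raise the per-instance concurrency of an onCallGenkit function; the value of 80 is an arbitrary example:

const { onCallGenkit } = require("firebase-functions/https");

// Let a single instance serve many requests at once while it waits on
// model API calls. Deploy as usual with: firebase deploy --only functions
exports.tellJoke = onCallGenkit(
  {
    concurrency: 80, // illustrative value; tune it for your workload
  },
  jokeTeller,
);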
Example: Tell Me a Joke
Here’s a more robust sample based on the code above:
const { genkit, z } = require("genkit");
const { googleAI, gemini15Flash } = require("@genkit-ai/googleai");
const { onCallGenkit } = require("firebase-functions/https");
const { defineSecret } = require("firebase-functions/params");

const ai = genkit({
  plugins: [googleAI()],
  model: gemini15Flash, // example default model; any supported Gemini model works
});

const jokeTeller = ai.defineFlow(
  {
    name: "jokeTeller",
    inputSchema: z.string().nullable(),
    outputSchema: z.string(),
    streamSchema: z.string(),
  },
  async (jokeType, { sendChunk }) => {
    const { stream, response } = ai.generateStream(
      `Tell me a ${jokeType} joke.`,
    );
    for await (const chunk of stream) {
      sendChunk(chunk.text); // Stream each chunk as it's generated!
    }
    return (await response).text; // Full response for non-streaming clients
  },
);

// Deploy to Firebase with secrets and App Check
const geminiApiKey = defineSecret("GOOGLE_GENAI_API_KEY");

exports.tellJoke = onCallGenkit(
  {
    secrets: [geminiApiKey],
    enforceAppCheck: true,
  },
  jokeTeller,
);
See the full sample code here.
Roadmap
These new features are ready to use now with the 2nd gen Node.js runtime, but we’re still working on Python support. You can follow along as we implement it!
We’re still working on updating our Swift, Kotlin, and Dart client SDKs so that native mobile apps can also use this streaming functionality, alongside web apps. Until then, your functions can fall back to non-streaming behavior for these clients.
Get Started Today
We’re incredibly excited to see what you build with these new capabilities. Take a look at the documentation to start building your next-generation AI-powered app with Genkit and Cloud Functions for Firebase.
We can’t wait to see what you create! If you hit any snags along the way, reach out to support to get help troubleshooting an issue. Or, request new features on our UserVoice page. Happy coding!