You can use Firebase AI Logic to build AI experiences in your apps that take advantage of hybrid inference. This means your AI features can use on-device models when available and seamlessly fall back to cloud-hosted models otherwise (and vice versa!). This capability is currently available for web apps, which can access the on-device Gemini Nano model when running in Chrome on desktop.
Here are some practical examples demonstrating how you can use hybrid inference in your apps!
Text summarization
❓ A web app that allows users to summarize personal notes, emails, or articles directly on their device, even when offline. If the on-device model isn't available, the request is automatically sent to the cloud-hosted model for processing instead.
💡 This can be achieved by using generateContent() for text-only input and setting the inference mode to PREFER_ON_DEVICE.
This code initializes a generative model that prefers to run on the user's device. The summarizeText() function takes text as input, adds a summarization instruction, and then generates a summary using the on-device model if it's available (otherwise, the cloud-hosted model is used).
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, InferenceMode } from "firebase/ai";
// Initialize FirebaseApp and the Gemini Developer API backend
const firebaseApp = initializeApp({/* ... */});
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
// Create a model instance that prefers on-device inference
const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_ON_DEVICE });
async function summarizeText(textToSummarize) {
  // Provide a prompt to summarize the text
  const prompt = `Summarize the following text: ${textToSummarize}`;
  const result = await model.generateContent(prompt);
  const summary = result.response.text();
  console.log(summary);
}
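Here's a quick usage sketch; the sample text is just an illustration (and since the file is an ES module, top-level await is fine):
// Example usage (the sample text is illustrative)
await summarizeText(
  "Firebase AI Logic gives web apps hybrid inference: prompts run against " +
  "Gemini Nano on-device when Chrome makes it available, and fall back to " +
  "a cloud-hosted Gemini model otherwise."
);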
Offline image captioning
❓ An image gallery web app that can automatically generate descriptive captions for photos, even when the user is offline. Image captioning is useful for organizing photos and for accessibility by providing descriptions for visually impaired users.
💡 This use case leverages the multimodal capabilities of the on-device and cloud-hosted models, taking both an image and a text prompt to generate a caption.
In this example, an image file is converted into a format the model can understand (one way to do that conversion is sketched after the code below). The generateCaption() function then sends the image along with a text prompt to the model to generate a descriptive caption. This feature works offline by using the on-device model.
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, InferenceMode } from "firebase/ai";
// Initialize FirebaseApp and the Gemini Developer API backend
const firebaseApp = initializeApp({/* ... */});
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
// Create a model instance that prefers on-device inference
const model = getGenerativeModel(ai, { mode: InferenceMode.PREFER_ON_DEVICE });
async function generateCaption(imageBase64Encoded) {
  const prompt = "Describe this image in a single sentence.";
  const imagePart = { inlineData: { data: imageBase64Encoded, mimeType: "image/jpeg" } };
  const result = await model.generateContent([prompt, imagePart]);
  const imageCaption = result.response.text();
  console.log(imageCaption);
}
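The code above assumes you already have a base64-encoded image. If you're starting from a File (for example, from an <input type="file"> element), one way to produce that string is with the browser's standard FileReader API. This helper is a minimal sketch, and the fileToBase64() name is our own:
// Convert a File to a base64 string (without the "data:image/jpeg;base64," prefix)
function fileToBase64(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    // reader.result is a data URL like "data:image/jpeg;base64,/9j/4AAQ..."
    reader.onload = () => resolve(reader.result.split(",")[1]);
    reader.onerror = reject;
    reader.readAsDataURL(file);
  });
}

// Example usage: caption the first file a user selects
// await generateCaption(await fileToBase64(fileInput.files[0]));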
Real-time translation
❓ A web app that offers live transcription and translation capabilities.
💡 Working with common phrases and short sentences is a great use case for the smaller, no-cost, low-latency on-device model. However, for more complex or longer text, the app can send the request to a cloud-hosted model that's better suited to challenging tasks.
This code shows how to implement routing logic in your app that uses the on-device model for short sentences (PREFER_ON_DEVICE) and the cloud-hosted model for complex text (PREFER_IN_CLOUD).
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, InferenceMode } from "firebase/ai";
// Define what constitutes a "small" sentence (e.g., number of words)
const SMALL_SENTENCE_WORD_LIMIT = 15;
// Initialize FirebaseApp and the Gemini Developer API backend
const firebaseApp = initializeApp({/* ... */});
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
// Create model 1 that prefers on-device inference for short sentences
const preferOnDeviceModel = getGenerativeModel(ai, {
  mode: InferenceMode.PREFER_ON_DEVICE
});
// Create model 2 that prefers cloud-hosted inference for complex text
const preferInCloudModel = getGenerativeModel(ai, {
  mode: InferenceMode.PREFER_IN_CLOUD,
  inCloudParams: { model: "gemini-2.5-flash" }
});
async function translateHybrid(text, targetLanguage) {
  const wordCount = text.split(/\s+/).length;

  let modelToUse;
  if (wordCount <= SMALL_SENTENCE_WORD_LIMIT) {
    console.log("Using on-device model for short sentence.");
    modelToUse = preferOnDeviceModel;
  } else {
    console.log("Using cloud-hosted model for complex sentence.");
    modelToUse = preferInCloudModel;
  }

  try {
    const prompt = `Translate this text to ${targetLanguage}: ${text}`;
    const result = await modelToUse.generateContent(prompt);
    const translation = result.response.text();
    console.log(translation);
  } catch (error) {
    console.error("Translation failed:", error);
    // If even the cloud model fails, provide a fallback message
    console.log("Translation not available at this moment.");
  }
}
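A quick usage sketch (the sample strings are illustrative): a short phrase stays on-device when possible, while a longer passage is routed to the cloud-hosted model.
// Short phrase (under the word limit): handled on-device when possible
await translateHybrid("Good morning, how are you today?", "Spanish");

// Longer passage (over the word limit): routed to the cloud-hosted model
await translateHybrid(
  "The keynote covered advances in on-device machine learning, privacy-preserving " +
  "inference, and the trade-offs between latency, cost, and model capability.",
  "Spanish"
);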
The future of web apps isn't just about integrating AI; it's about distributing AI intelligently. Hybrid inference lets you build AI features that accommodate your users whether they're online or offline, tapping into the power of cloud-hosted models when you need it while still benefiting from the speed and no-cost advantages of on-device models.
🚀 Get started today
The hybrid on-device/in-cloud inference feature is now available in preview. Check out the documentation to get started.
We also encourage you to join the Google Chrome Built-in AI Challenge 2025 and create new web applications or Chrome Extensions using Firebase AI Logic hybrid inference to shape the future of client-side AI!