
LiteRT-LM

Google's open-source inference framework for deploying large language models on edge devices.

5.5K Stars
Apache-2.0 License
0.6K Forks
Open source · Local first
ai.google.dev · verified 2026-06-10

About

LiteRT-LM overview

LiteRT-LM is Google's open-source, production-oriented inference framework for running LLMs on edge devices. It is relevant for teams evaluating local, mobile, and on-device agent stacks where latency, privacy, and hardware constraints matter.

Edge-first inference

LiteRT-LM focuses on deploying LLMs on edge and on-device environments.

Local inference can reduce latency, preserve privacy, and keep agents useful when cloud access is constrained.

Google AI Edge ecosystem

The project sits under Google's AI Edge GitHub organization.

Teams already following Google's mobile and edge AI stack gain an open-source inference option to evaluate.

Production-oriented model serving

The repository describes LiteRT-LM as a production-ready inference framework.

Agent builders need model layers that can move beyond notebooks and onto real devices.

Use cases

When to use LiteRT-LM

On-device assistants

Evaluate LiteRT-LM when an assistant needs local responses on mobile, desktop, or embedded hardware.

Private local inference

Use edge deployment to reduce dependence on cloud APIs for sensitive workflows.

Model runtime comparison

Compare LiteRT-LM with other local inference projects before choosing an agent model layer.

Compare

How it compares

When to choose LiteRT-LM

Compare it with similar local inference projects on hosting model, integration surface, license, and whether the official docs cover the workflow you need.

FAQ

Questions

Is LiteRT-LM open source?

Yes. The GitHub repository is listed under the Apache-2.0 license.

Who should evaluate LiteRT-LM?

Teams building edge, mobile, desktop, or privacy-sensitive AI applications should evaluate it.

Tags

Capabilities

local inference, inference, open source, local first, local ai