Device licenses and fleet controls for local LLMs

Per-device activation. Fleet-wide model allowlists. Signed policy bundles pushed from one admin console. Air-gap support and local audit logs — so even offline machines stay under policy.

Access control for machines, not just humans

Per-device

Licenses

Tie activations to laptops, not shared secrets

<5 sec

Model Lockdown

Block an unapproved weight across the fleet

Offline

First Auth

Devices keep working after activation, even air-gapped

Signed

Admin Bundles

Push model sets and policies as signed artifacts

The full device lifecycle

Activate, scope, update, and revoke — with a complete local audit trail.

Per-Device Licenses

Each VirexaLLM install gets a unique license tied to the machine. Revoke the laptop and inference stops — no shared key to rotate.

Admin Console for the Fleet

A single dashboard to see every activated device, which models it has, and which policies are in effect. Push updates in one click.

Per-Device Model Access

Allow Llama 3 8B on engineering laptops, restrict larger Qwen variants to ML workstations, and block unapproved GGUFs everywhere else.

Local Auth

Sign in to the desktop app once. Auth artifacts stay on the device; prompts never round-trip to a cloud identity service.

SSO for Admins

Okta, Azure AD, or Google Workspace for the admin console. End users keep the local-first experience.

Offline Audit Trail

Every model load, policy change, and activation event captured locally — exportable as a signed log when the machine comes back online.
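One way a tamper-evident offline log can work is a hash chain: each entry commits to the previous entry's hash, so edits made while the machine is offline show up at export time. This is an illustrative sketch only; the event fields and chaining scheme are assumptions, not VirexaLLM's actual log format.

```python
import hashlib
import json

def append_event(log: list, event: dict) -> None:
    """Append an event chained to the previous entry's hash, so any
    offline tampering is detectable when the log is later exported."""
    prev = log[-1]["hash"] if log else "genesis"
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    log.append({"prev": prev, "event": event,
                "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify_chain(log: list) -> bool:
    """Recompute every hash in order; any edited or dropped entry breaks the chain."""
    prev = "genesis"
    for entry in log:
        body = json.dumps({"prev": prev, "event": entry["event"]}, sort_keys=True)
        if entry["prev"] != prev or entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

log = []
append_event(log, {"type": "model_load", "model": "llama-3-8b"})
append_event(log, {"type": "policy_change", "bundle": "v42"})
print(verify_chain(log))  # True
```

Because verification needs only the log itself, an admin can validate an exported trail from an air-gapped machine without trusting the device that produced it.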

One control plane for every machine

Licenses, profiles, and policies — uniform across macOS, Windows, and Linux.

Per-Device Licenses · Per-User Profiles · Model Allowlists · Quant Allowlists · GGUF Signing Enforcement · Air-Gap Mode · Offline Activation · Signed Policy Bundles · Admin SSO · Role-Based Admin · Fleet Model Pushes · Remote Revocation

Onboard a whole engineering floor in one click

Issue signed activation bundles for every workstation. SSO into the admin console to invite admins; developers get a local app that works the moment they open their laptop — online or off.

Fleet onboarding for local LLMs

How a device is activated

Four steps. Every policy enforced locally.

1

Enroll

Register a laptop in the admin console or hand the user an activation bundle.

2

Scope

Assign a profile: allowed models, quantizations, air-gap flag, and session limits.

3

Activate

The desktop app validates the signed license locally — no cloud dependency after setup.

4

Govern

Push policy updates, revoke on offboarding, or lock the device into air-gap mode forever.
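The local check in step 3 can be sketched as: a license payload (device ID, profile) plus a signature, verified on every launch with no network call. For simplicity an HMAC over a demo key stands in for the asymmetric signature a real deployment would use; every name and field here is hypothetical, not VirexaLLM's actual license format.

```python
import hashlib
import hmac
import json

DEMO_KEY = b"demo-signing-key"  # stand-in for the vendor's real key pair

def sign_license(payload: dict) -> dict:
    """Issued at enrollment: wrap the payload with a signature."""
    body = json.dumps(payload, sort_keys=True).encode()
    return {"payload": payload,
            "sig": hmac.new(DEMO_KEY, body, hashlib.sha256).hexdigest()}

def validate_locally(blob: dict, machine_id: str) -> bool:
    """Step 3: the desktop app validates the license with no cloud call."""
    body = json.dumps(blob["payload"], sort_keys=True).encode()
    expected = hmac.new(DEMO_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, blob["sig"]):
        return False  # forged or tampered license
    return blob["payload"]["device_id"] == machine_id  # tied to this machine

lic = sign_license({"device_id": "MBP-ENG-042", "profile": "engineering"})
print(validate_locally(lic, "MBP-ENG-042"))  # True
print(validate_locally(lic, "MBP-ENG-999"))  # False: the license is per-device
```

Binding the signature to the device ID is what makes the license non-transferable: copying the blob to another laptop fails the final check.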

Revocation that actually stops inference

Revoke a device and the next launch refuses to load any model. No phone-home required — the license check is local, the policy is signed, and the machine self-disables within seconds of the next boot.
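A minimal sketch of that gate, assuming revocations travel inside the signed policy bundle: the device verifies the bundle's signature locally, applies the revocation list, and then the launch check is a pure local lookup. The bundle fields and function names are illustrative assumptions, and an HMAC demo key stands in for real signing infrastructure.

```python
import hashlib
import hmac
import json

ADMIN_KEY = b"demo-admin-key"  # stand-in for the console's real signing key

def sign_bundle(policy: dict) -> dict:
    """Console side: wrap a policy in a signature before pushing it."""
    body = json.dumps(policy, sort_keys=True).encode()
    return {"policy": policy,
            "sig": hmac.new(ADMIN_KEY, body, hashlib.sha256).hexdigest()}

def apply_bundle(bundle: dict, state: dict) -> dict:
    """Device side: accept a bundle only if its signature verifies locally."""
    body = json.dumps(bundle["policy"], sort_keys=True).encode()
    sig = hmac.new(ADMIN_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, bundle["sig"]):
        raise ValueError("tampered or unsigned bundle: ignored")
    state["revoked"] = set(bundle["policy"]["revoked_devices"])
    return state

def may_load_model(device_id: str, state: dict) -> bool:
    """Launch-time gate: purely local, no phone-home."""
    return device_id not in state["revoked"]

state = apply_bundle(sign_bundle({"revoked_devices": ["MBP-ENG-042"]}),
                     {"revoked": set()})
print(may_load_model("MBP-ENG-007", state))  # True
print(may_load_model("MBP-ENG-042", state))  # False: launch refuses to load models
```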

Enforce model policy at the device

Decide which weights ship to which machines — and keep unapproved GGUFs off the fleet entirely.

Model Allowlists

Whitelist specific model IDs and quantizations. Unknown weights refuse to load.

Quant Caps

Cap RAM use by restricting which quantization tiers a device can run — protect shared workstations.

Alerts

Get notified when a device goes offline, fails license checks, or attempts to load a blocked model.
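The three controls above can be sketched as one load-time decision: check the model ID against the allowlist, check the quantization tier against the cap, and surface a reason that an alert can carry. The profile fields and model names are illustrative assumptions, not VirexaLLM's real policy schema.

```python
# Hypothetical per-device profile matching the controls described above.
PROFILE = {
    "allowed_models": {"llama-3-8b-instruct", "phi-3-mini"},
    "allowed_quants": {"Q4_K_M", "Q5_K_M"},  # cap RAM by capping quant tier
}

def check_load(model_id: str, quant: str, profile: dict) -> tuple[bool, str]:
    """Decide at load time whether a GGUF may run; the reason feeds alerts."""
    if model_id not in profile["allowed_models"]:
        return False, f"blocked model: {model_id}"
    if quant not in profile["allowed_quants"]:
        return False, f"blocked quant tier: {quant}"
    return True, "ok"

print(check_load("llama-3-8b-instruct", "Q4_K_M", PROFILE))  # (True, 'ok')
print(check_load("qwen-72b", "Q4_K_M", PROFILE))             # blocked model
print(check_load("phi-3-mini", "Q8_0", PROFILE))             # blocked quant tier
```

Returning a denial reason rather than a bare boolean is what lets the same check drive both enforcement on the device and the alert stream in the admin console.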

Frequently asked questions

How are devices authorized?
Each workstation receives a signed per-device license during activation. The license is validated locally on every launch — no ongoing cloud call required.
What happens when a laptop is lost?
Revoke the device from the admin console. The license stops validating, the app refuses to load models, and the machine falls back to a disabled state.
Can we run this air-gapped?
Yes. Generate an offline activation bundle, side-load it onto the isolated machine, and the app never contacts the network again.
How do we control which models a developer can run?
Push a signed policy bundle that whitelists model IDs and quantizations. Anything outside the list is refused at load time.
Does SSO touch end-user prompts?
No. SSO gates admin access to the console only. End-user inference stays 100% local.

Your laptop is the server now

Download VirexaLLM and run Llama, Mistral, Phi-3, Gemma, or Qwen locally in minutes. Free desktop app for macOS, Windows, and Linux — your prompts never leave the device.