AI: LFM2-24B-A2B chat
LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B parameter Mixture-of-Experts model with only 2B active parameters per token, it delivers high-quality generation while maintaining low inference costs. The model fits within 32 GB of RAM, making it practical to run on consumer laptops and desktops without sacrificing capability.
Capabilities
Context Window 32k tokens
Max Output 0 tokens
Inputs
Outputs
Pricing (per 1M tokens)
Input $0.03
Output $0.12
Cache Read -
Cache Write -
Supported Parameters
frequency_penaltylogit_biasmax_tokensmin_ppresence_penaltyrepetition_penaltystoptemperaturetop_ktop_p