Likhitha Aralimara

Software Developer @Orbytring by Sensio Enterprises | Flutter Consultant

Flutter developer and consultant who helps startups and SMEs build apps from scratch, from architecture to deployment. She has been the sole app developer for Orbytring by Sensio Enterprises, SpaceMarvel AI, and EventHQ, delivering production-ready applications across EventOps, health tech, AI, and enterprise use cases. Recently, she began working on health-monitoring applications at Orbytring, where she first explored an offline-first approach to reduce latency, improve responsiveness, and ensure seamless user experiences. That exploration sparked a deeper interest in the untapped potential of offline-first architectures, leading to her ongoing work on integrating on-device AI pipelines.

Flutter’s Cloudless AI Revolution: Solving Latency, Privacy, and Cost with SLMs & FFIs

Most AI integrations in mobile apps today rely on cloud APIs, introducing latency, recurring costs, and data privacy concerns. This talk demonstrates a cloudless approach: embedding quantized Small Language Models (SLMs) directly into Flutter apps, using the Foreign Function Interface (FFI) to call into runtimes like llama.cpp, whisper.cpp, or tflite. Attendees will see how on-device inference enables zero-latency, privacy-first, and cost-free AI interactions. We'll explore the full technical pipeline: model quantization (INT4/INT8), memory optimization strategies for mobile devices, FFI bindings in Dart, and integrating offline STT/TTS for seamless conversational experiences. By the end, Flutter developers will understand how to build production-ready apps that run advanced AI locally without relying on the cloud.

Key Takeaways

1. Learn how to quantize and embed SLMs (INT4/INT8) into Flutter apps for on-device inference.
2. Understand how to bridge Flutter with C/C++ runtimes like llama.cpp, whisper.cpp, and tflite using FFI.
3. Explore architectural strategies for offline-first, zero-latency AI pipelines in Flutter.
4. Gain practical techniques to optimize model performance under mobile constraints (memory, CPU, battery).
5. See how cloudless AI directly addresses latency, privacy, and cost - the hardest problems in deploying AI at scale.
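To make the FFI bridging concrete, here is a minimal sketch of how a Dart app can bind to a C function exposed by a native inference runtime. This is illustrative only: the library name `libllama.so` and the function `generate` (and its signature) are assumptions standing in for whatever the real runtime exports, not the actual llama.cpp API.

```dart
// Minimal Dart FFI sketch: calling a hypothetical C function
//   int32_t generate(const char* prompt, char* out, int32_t out_len);
// exported by a native inference library bundled with the app.
import 'dart:ffi';
import 'package:ffi/ffi.dart';

// Native and Dart views of the same C signature.
typedef GenerateNative = Int32 Function(
    Pointer<Utf8> prompt, Pointer<Utf8> out, Int32 outLen);
typedef GenerateDart = int Function(
    Pointer<Utf8> prompt, Pointer<Utf8> out, int outLen);

void main() {
  // Load the compiled runtime shipped inside the app bundle.
  // ('libllama.so' is a placeholder name for illustration.)
  final lib = DynamicLibrary.open('libllama.so');
  final generate =
      lib.lookupFunction<GenerateNative, GenerateDart>('generate');

  // Marshal the Dart string to native UTF-8 and allocate an output buffer.
  final prompt = 'Hello, on-device model'.toNativeUtf8();
  final out = calloc<Uint8>(512).cast<Utf8>();

  // Run inference entirely on device; the return value is the byte count.
  final written = generate(prompt, out, 512);
  print(out.toDartString(length: written));

  // Native memory is not garbage-collected, so free it explicitly.
  calloc.free(prompt);
  calloc.free(out);
}
```

In a real app this binding would typically be wrapped in an isolate so that inference does not block the UI thread.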