Flagship, top-ranking open-source AI model to be production-ready via new NVIDIA NIM microservices that deliver enterprise-ready inference for thousands of LLMs
Paris, France - Abu Dhabi, UAE - 11 June 2025: Abu Dhabi’s Technology Innovation Institute (TII), a leading global research center and the developer behind the globally ranked Falcon open-source AI models and privacy-preserving technologies, today announced that Falcon-H1, its next-generation, hybrid-architecture large language model, will be available as an NVIDIA NIM microservice.
The announcement, timed with NVIDIA’s GTC Paris showcase, positions Falcon-H1 for seamless enterprise deployment across cloud, on-premises, and hybrid environments. Developers will soon be able to access and scale Falcon-H1 with production-grade performance, without the engineering overhead typically required to adapt open-source models for real-world applications.
Dr. Najwa Aaraj, CEO of TII, commented: “Falcon-H1’s availability on NVIDIA NIM reflects our ongoing leadership in shaping the future of open, sovereign, and cross-domain, deployment-ready AI. It demonstrates that breakthrough innovation from our region is not only competitive on the global stage - it’s setting new benchmarks for scalable, secure, and enterprise-ready AI.”
At the heart of Falcon-H1 is a novel hybrid Transformer–Mamba architecture, combining the efficiency of state space models (SSMs) with the expressiveness of Transformer networks. Designed in-house by TII researchers, the architecture supports context windows of up to 256k tokens, an order-of-magnitude leap in long-context reasoning, while preserving high-speed inference and reduced memory demands. Multilingual by design, Falcon-H1 delivers robust performance ahead of models in its category across both high- and low-resource languages, making it well suited for global-scale applications.
Supported soon for deployment via the universal LLM NIM microservice, Falcon-H1 becomes a plug-and-play asset for enterprises building agentic systems, retrieval-augmented generation (RAG) workflows, or domain-specific assistants. Whether running with NVIDIA TensorRT-LLM, vLLM, or SGLang, NIM abstracts away the underlying inference stack, enabling developers to deploy Falcon-H1 in minutes using standard tools such as Docker and Hugging Face, with automated hardware optimization and enterprise-grade SLAs.
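As a sketch of what that workflow looks like, the following commands show the general pattern for standing up a model with a NIM container via Docker. The image name, tag, and model identifier below are illustrative assumptions only (the Falcon-H1 container had not shipped at the time of this announcement); consult build.nvidia.com for the actual release artifacts.

```shell
# Hypothetical sketch: running Falcon-H1 through the universal LLM NIM container.
# Image path, tag, and NIM_MODEL_NAME value are placeholders, not confirmed artifacts.
export NGC_API_KEY=<your-ngc-key>

docker run --rm --gpus all \
  -e NGC_API_KEY \
  -e NIM_MODEL_NAME="tiiuae/Falcon-H1-34B-Instruct" \
  -p 8000:8000 \
  nvcr.io/nim/nvidia/llm-nim:latest

# NIM microservices expose an OpenAI-compatible API, so once the service
# is up, a standard chat-completions request works against localhost:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "tiiuae/Falcon-H1-34B-Instruct",
       "messages": [{"role": "user", "content": "Summarize RAG in one sentence."}]}'
```

Because the endpoint follows the OpenAI API convention, existing client libraries and RAG frameworks can typically point at it with only a base-URL change.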
“Falcon-H1’s availability on NVIDIA NIM bridges the gap between cutting-edge model design and real-world operability. It combines our hybrid architecture with the performance and reliability of NVIDIA microservices. Developers can integrate Falcon-H1 optimized for long-context reasoning, multilingual versatility, and real-world applications. What once required weeks of infrastructure tuning becomes achievable in minutes at scale, with multilingual depth and production resilience,” said Dr. Hakim Hacid, Chief AI Researcher at TII.
The release also marks Falcon-H1’s integration with NVIDIA NeMo microservices and NVIDIA AI Blueprints, giving developers access to full lifecycle tooling, from data curation and guardrailing to continuous evaluation and post-deployment tuning. Crucially, this makes Falcon-H1 viable in regulated, latency-sensitive, and sovereign AI contexts, with full-stack NVIDIA support.
With over 55 million downloads to date, the Falcon series has become one of the most widely adopted open-source model families from the Middle East. Beyond its scale, Falcon-H1’s smaller variants routinely outperform larger peers on reasoning and mathematical tasks, while the 34B model now leads several industry benchmarks.
TII’s strategic alignment with NVIDIA’s validated deployment framework underscores that open-source models can be production-ready assets. Falcon-H1’s availability on NIM cements its place as a sovereign, scalable, and secure alternative to closed-weight incumbents.
For technical documentation, deployment guides, and model access, Falcon-H1 is available via the Falcon LLM portal, and will be supported by an upcoming release of the universal LLM NIM microservice container at build.nvidia.com.