The real-time LLM inference engine