With 20 years of hands-on AI and systems engineering experience, spanning Fortune 500 programs and award-winning open source, I take a velocity-driven approach: ship working prototypes fast, then scale them into robust, observable, and safe AI systems. Under the hood: Python/PyTorch, C++/CMake, CUDA, ONNX, TensorRT, vLLM/llama.cpp, AWS/GCP, and edge-optimized inference, backed by evaluation, observability, safety guardrails, and end-to-end MLOps.
Shlomo Kashani is an AIMO-2 Gold Medalist, published author (Deep Learning Interviews, GitHub), and founder of QNeura.ai, where he leads strategy and hands-on delivery for LLM-powered systems, RAG pipelines, agentic AI, and MLOps at production scale.
An interdisciplinary technologist and acting Chief Scientist, he combines advanced AI research with strategic and philosophical inquiry informed by Defense and Strategic Studies (DSS), pairing scientific rigor with ethical and cultural awareness.
He works end-to-end across modern AI stacks, including agentic frameworks and orchestrators, LLMs and VLMs (Anthropic, OpenAI, DeepSeek), and the full model lifecycle: pre-training, fine-tuning (including LoRA), multi-GPU inference, and deployment via Hugging Face and vLLM.
His academic background spans Strategic Studies (MSU), Quantum Physics and Computing (Johns Hopkins University), Digital Signal Processing (Queen Mary University of London), and Engineering (Ben-Gurion University). His open-source work includes QuantumLLMInstruct, metalQwen3, vLLM-5090, and osxQ.
Strategic guidance on AI implementation, technology selection, and organizational transformation for quantum-ready enterprises.
Custom quantum machine learning solutions, algorithm development, and hybrid classical-quantum system design.
Seamless integration of quantum-enhanced AI capabilities into existing business workflows and technical infrastructure.
Ready to accelerate your AI roadmap or discuss fractional leadership? Reach out and we’ll help you get started.