From fine-tuning LLMs at Google to building autonomous perception systems and leading ML at scale — I architect intelligent systems that ship. 8+ years turning research papers into production-grade AI.
I'm Muhammad Hasnain Khan — a Lead Machine Learning Engineer who builds AI systems that work in the real world, not just in notebooks.
With a Master's in Computer Science (specialization in Deep Learning) from FAST-NUCES and a Bachelor's from the University of Bradford, I've spent 8+ years shipping ML systems across Google, FrontNow, COMPREDICT, and ikeGPS. My work spans LLMs, computer vision, autonomous vehicles, NLP, and multi-sensor fusion.
I've trained large language models at Google for visual storytelling, built RAG-powered conversational AI platforms that boosted engagement by 50%, and deployed real-time perception systems on autonomous vehicles with sub-50ms latency. I'm as comfortable fine-tuning a Qwen 32B model as I am wiring up Kafka streams for real-time inference.
From research prototypes to production-grade systems at scale
Fine-tuning, RAG pipelines, and vision-language models
Sensor fusion, perception, and real-time edge deployment
Leading development of fine-tuned LLMs for a multilingual conversational AI platform. Engineered a hybrid RAG pipeline with FAISS and knowledge graphs, boosting engagement by 50%. Built streaming inference with RabbitMQ and Apache Flink and established MLOps practices with A/B testing, cutting inference latency by 30%.
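The retrieval step of a hybrid RAG pipeline like this one can be sketched in a few lines. This is an illustrative minimal version only: it uses NumPy cosine similarity over toy 4-dimensional vectors in place of a production FAISS index and real sentence embeddings, and the `retrieve` helper and document ids are hypothetical names.

```python
import numpy as np

def retrieve(query_vec, doc_vecs, doc_ids, k=2):
    """Return the k document ids most similar to the query (cosine similarity)."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per document
    top = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
    return [doc_ids[i] for i in top]

# Toy embeddings standing in for real encoder output.
docs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0, 0.0]])
ids = ["doc_a", "doc_b", "doc_c"]
hits = retrieve(np.array([1.0, 0.05, 0.0, 0.0]), docs, ids)  # → ['doc_a', 'doc_c']
```

In production the brute-force scan would be replaced by a FAISS index, and the retrieved passages would be merged with knowledge-graph context before generation.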
Built virtual sensor systems for automotive diagnostics using CAN bus and OBD-II data. Predicted and diagnosed vehicle faults to reduce maintenance costs and increase reliability. Integrated ML-powered virtual sensors with cross-functional automotive platforms.
Trained large language models to extract semantics from natural language for visual storytelling. Devised transformer-based pipelines that aggregate multilingual corpora into latent scene embeddings driving 3D GAN-based generation, reducing annotation overhead by 30%.
Built visual attention-based models in PyTorch for novel webpage object detection, leveraging contextual features from ordered web elements via a ResNet101 backbone. Achieved 95% accuracy for product price detection, 8.5% above a Fast R-CNN baseline.
Designed architecture and implemented core features for AR applications. Transformed design specifications into functional apps and established development pipelines and strategy.
Research-driven projects spanning LLMs, autonomous systems, computer vision, and multimodal AI
Engineered a unified perception pipeline fusing camera, LiDAR, and radar via a cross-modal BEV-centric architecture with attention-based alignment and 3D detection heads. Deployed on an autonomous vehicle testbed with model quantization for sub-50ms latency.
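The attention-based fusion at the core of such a pipeline can be illustrated with a minimal sketch: per-modality features are weighted by softmax attention scores and summed. This is a toy NumPy version under stated assumptions, with hypothetical function names and scalar relevance logits standing in for learned attention; a real BEV fusion model operates on full feature maps.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()

def fuse_modalities(feats, logits):
    """Fuse per-modality feature vectors by softmax-weighted sum.

    feats:  dict of modality name -> feature vector (same length)
    logits: dict of modality name -> scalar relevance score
    """
    names = list(feats)
    w = softmax(np.array([logits[n] for n in names]))
    fused = sum(w[i] * feats[n] for i, n in enumerate(names))
    return fused, dict(zip(names, w))

feats = {"camera": np.array([1.0, 0.0]),
         "lidar":  np.array([0.0, 1.0]),
         "radar":  np.array([0.5, 0.5])}
fused, weights = fuse_modalities(feats, {"camera": 2.0, "lidar": 1.0, "radar": 0.0})
```

The same re-weighting idea lets the network lean on radar when camera and LiDAR degrade, which is what makes the approach robust across conditions.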
Architected a Multimodal LLM integrating a Vision Transformer with a decoder-only LLM. Implemented a projection module to align visual features with word embeddings, enabling conversational VQA, referring expression generation, and multimodal grounding.
Built a novel framework enhancing factual reasoning by fine-tuning a decoder-only LLM with a domain-specific Knowledge Graph. Used PEFT with a RAG pipeline injecting relevant sub-graphs and Chain-of-Thought prompting, reducing hallucination significantly.
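The sub-graph injection step of such a pipeline amounts to grounding the prompt in retrieved triples before Chain-of-Thought generation. A minimal sketch, assuming the retrieval has already happened; the `build_prompt` helper, the example triples, and the prompt wording are all illustrative, not the actual system.

```python
def build_prompt(question, subgraph_triples):
    """Assemble a chain-of-thought prompt grounded in retrieved KG triples.

    subgraph_triples: list of (subject, relation, object) tuples pulled
    from a domain knowledge graph for this question.
    """
    facts = "\n".join(f"- {s} {r} {o}." for s, r, o in subgraph_triples)
    return (
        "Use only the facts below when answering.\n"
        f"Facts:\n{facts}\n\n"
        f"Question: {question}\n"
        "Let's think step by step."
    )

prompt = build_prompt(
    "What class of drug is metformin?",
    [("metformin", "is_a", "biguanide"),
     ("biguanide", "treats", "type 2 diabetes")],
)
```

Constraining generation to retrieved facts is what drives the hallucination reduction: the model reasons over grounded triples rather than its parametric memory alone.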
Fine-tuned Qwen 2.5-32B on 1,000 math problems using a novel "Wait" token technique to extend reasoning. Generated chain-of-thought traces in a self-supervised loop for iterative training. Achieved 56.7% on AIME 2024. Deployed on Google Cloud Run for real-time inference.
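The core of the "Wait" technique is a decoding-time intervention: when the model tries to end its reasoning, the end-of-thinking marker is replaced with "Wait", forcing it to keep going. A minimal sketch with a stub generator; `step_fn`, the `</think>` marker, and the toy reasoning steps are assumptions for illustration, not the real decoding loop.

```python
def generate_with_wait(step_fn, prompt, max_extensions=2, stop="</think>"):
    """Extend reasoning by swapping the end-of-thinking marker for "Wait".

    step_fn: callable taking the text so far and returning the next chunk
             (a stand-in for a real token-by-token decoding loop).
    """
    text, extensions = prompt, 0
    while True:
        chunk = step_fn(text)
        if stop in chunk and extensions < max_extensions:
            chunk = chunk.replace(stop, "Wait,")  # suppress stop, keep reasoning
            extensions += 1
        text += chunk
        if stop in chunk:  # budget exhausted: let the model finish
            return text

# Stub "model" that tries to stop after each short reasoning step.
steps = iter(["Try x=3. </think>",
              " Check: 3*3=9, not 10. </think>",
              " Then x must satisfy x*x=10, so x≈3.16. </think>"])
out = generate_with_wait(lambda t: next(steps), "Solve x*x=10. ")
```

Each suppressed stop buys the model another round of self-correction, which is where the gains on hard math benchmarks come from.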
Designed a multimodal emotion recognition system combining computer vision, speech processing, and NLP. Used CNNs, LSTMs, and attention mechanisms to capture temporal and spatial dynamics of human emotions and integrated with a robotic platform.
Built a self-supervised audio-visual fusion system using footstep sounds and camera imaging for real-time pedestrian detection. Attention-based multimodal network achieves LiDAR-comparable performance at lower cost. Deployed on Jetson Orin Nano.
Developed a condition-aware BEV perception stack that dynamically re-weights camera, LiDAR, radar, and map priors across rain, fog, and nighttime scenes. Optimized for edge deployment with consistent low-latency inference under dense traffic.
Built a grounded enterprise QA copilot over PDFs, SOPs, and internal docs using OCR-aware chunking, retrieval, and citation-backed generation. Improved answer reliability while keeping response time fast enough for daily analyst workflows.
Built a creative eval and guardrail lab for agentic workflows with scenario stress tests, trace-based judges, and release gates. Visual diagnostics highlight failure concentration, drift, cost-quality tradeoffs, and deployment risk in one place.
Developed a city-scale digital twin that fuses traffic streams, graph topology, and event context to forecast congestion and simulate policy interventions. Includes rich visual mapping and frontier plots for practical planning decisions.
Built an advanced RAG platform over text, tables, charts, and images with modality-aware retrieval orchestration, citation validation, and hard regression gates. Includes deeply instrumented evaluation for correctness, faithfulness, robustness, latency, and cost.
Specialization in Deep Learning
Teaching: NLP, Neural Networks, Discrete Structures
Foundation in computer science, software engineering, and mathematics
"Hasnain's RAG pipeline and semantic search engine completely transformed our conversational platform. User engagement jumped 50%. He understands both the research and the engineering side — a rare combination."
"Hasnain built our Visual Attention model from scratch and beat the Fast R-CNN baseline by 8.5%. His PyTorch expertise and ability to deliver production-ready ML systems made all the difference."
"His work on virtual sensor systems and predictive diagnostics helped us cut maintenance costs significantly. Hasnain bridges the gap between automotive hardware and machine learning seamlessly."
Have a project in mind? I'd love to hear about it. Let's discuss how we can work together.