ML Engineer - AI Infra Group
Dream Security
- AI
- Tel Aviv
- Full-time
Description
At Dream, we redefine cyber defense by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it’s a Dream job. Dream is where we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let’s build something extraordinary together.
Dream's AI cybersecurity platform applies a new, out-of-the-ordinary, multi-layered approach, covering endless and evolving security challenges across the entire infrastructure of the most critical and sensitive networks. Central to Dream's proprietary Cyber Language Models are innovative technologies that provide contextual intelligence for the future of cybersecurity.
At Dream, our talented team, driven by passion, expertise, and innovative thinking, inspires us daily. We are not just dreamers; we are dream-makers.
The Dream Job
We are on an expedition to find you, someone who is passionate about creating intuitive, out-of-this-world, production-grade AI infrastructure. This group builds scalable, high-performance AI systems for internal users and external customers, designed to run seamlessly across cloud and on-premises environments using the latest hardware advancements.
The Dream-Maker Responsibilities
- Design and optimize LLM serving infrastructure using inference engines (vLLM, TensorRT-LLM, Triton Inference Server)
- Implement and tune distributed inference strategies including tensor parallelism, pipeline parallelism, and multi-node serving
- Develop and apply model compression techniques to optimize cost, latency, and memory footprint while maintaining model quality
- Build self-service fine-tuning platforms that enable data scientists to run experiments (LoRA, QLoRA, full fine-tuning) in a standardized, reproducible, and governed manner
- Optimize inference performance through batching strategies, KV-cache tuning, and speculative decoding
- Develop reusable APIs, abstractions, and platform services for model deployment, scaling, and lifecycle management
- Collaborate with AI researchers and product teams to productionize models and meet latency/throughput requirements
- Evaluate and benchmark new model architectures, compression methods, and serving frameworks
The Dream Skill Set
- 5+ years of experience in software engineering or ML engineering, with a significant focus on ML systems or backend infrastructure
- Strong proficiency in Python and deep learning frameworks (PyTorch)
- Hands-on experience with LLM inference engines (vLLM, TensorRT-LLM, Triton Inference Server)
- Deep understanding of transformer architectures and LLM-specific optimizations (attention mechanisms, KV-cache, and quantization methods and formats such as GPTQ, AWQ, and GGUF)
- Experience with distributed training/fine-tuning frameworks (Ray, DeepSpeed, FSDP)
- Ability to build developer-facing tools and platforms with clear APIs and documentation
- Understanding of GPU performance profiling and optimization
- Familiarity with LLM evaluation methodologies and benchmarking
Never Stop Dreaming...
If you think this role doesn't fully match your skills but you are eager to grow and break glass ceilings, we’d love to hear from you!