Principal Engineer, AI Serving Framework Architect (Software)
Samsung Semiconductor
Compensation
$219,000 - $351,000/year
Job Description
Please Note:
To provide the best candidate experience amidst our high application volumes, each candidate is limited to 10 applications across all open jobs within a 6-month period.
Advancing the World’s Technology Together
Our technology solutions power the tools you use every day, including smartphones, electric vehicles, hyperscale data centers, IoT devices, and so much more. Here, you’ll have an opportunity to be part of a global leader whose innovative designs are pushing the boundaries of what’s possible and powering the future.
We believe innovation and growth are driven by an inclusive culture and a diverse workforce. We’re dedicated to empowering people to be their true selves. Together, we’re building a better tomorrow for our employees, customers, partners, and communities.
Job Title: Principal Engineer, AI Serving Framework Architect (Software)
What You’ll Do
The Architecture Research Lab (ARL) focuses on addressing fundamental system-level bottlenecks in modern AI, particularly in memory capacity/bandwidth and system-scale communication. By leveraging Samsung’s world-class memory technologies, ARL explores and defines next-generation AI system architectures that deliver step-function improvements in performance, efficiency, and scalability.
We are seeking a Principal AI System Architect who will play a key role in bridging AI workloads, system architecture, and hardware design. In this role, you will develop system-level performance models, drive architecture-level design decisions, and propose forward-looking AI system architectures that shape Samsung’s long-term AI platform strategy.
Location: Daily onsite presence at our San Jose office in alignment with our Flexible Work policy
Job ID: 42853
- Serving as a Tech Lead, leading research teams in Korea and proposing technical direction
- Researching dynamic scheduling methodologies for maximizing AI inference performance in multi-rack-scale, memory-centric systems composed of heterogeneous compute-capable memory and hierarchical memory
- Investigating methods to accelerate search operations in RAG’s vector DB and AI Agent’s knowledge-graph by leveraging compute-capable memory
- Studying strategies for optimally placing the KVCache and a vector DB in hierarchical memory to minimize frequent SSD accesses and reduce IO stalls
- Proposing SW designs for implementing the derived optimization algorithms on open-source platforms such as vLLM
What You Bring
- PhD in Computer Science or a related field with 10+ years of experience in AI serving frameworks for large-scale computing, with a focus on AI workloads.
- Experience leading a project to build and optimize a Large Language Model (LLM) inference software stack on a multi-rack-scale system delivering AI inference services to over 100,000 users.
- Extensive experience in designing AI inference software stacks for heterogeneous devices.
- In-depth understanding of the internal architecture and operating mechanisms of inference engines such as vLLM.
- Proficiency in AI inference system profiling and optimization.
- Knowledge and practical experience with future AI workloads, including reasoning models, multi-modal solutions, AI agents, and world models.
- Strong understanding of compute, memory, and networking bottlenecks in AI systems.
- Required skillsets: PyTorch, Python, and C++
- A collaborative mindset, curiosity, and resilience in solving complex challenges.
- Excellent verbal, presentation, and written communication skills.
- (Nice to have) Native or fluent Korean language skills.
- You’re inclusive, adapting your style to the situation and diverse global norms of our people.
- You approach challenges with curiosity and resilience, seeking data to help build understanding.