Performance Architect
In this position, you will develop advanced system architectures and complex simulation models based on AI Storage Solutions for Sandisk's next-generation products. You will need to initiate and analyze changes to the architecture of the product. Typical activities include designing, programming, debugging, and modifying simulation models to evaluate these changes and assess the performance, power, and endurance of the product.
You will work closely with excellent fellow engineers, tackle complex challenges, innovate, and develop products that will change the data-centric architecture paradigm.
Essential Duties and Responsibilities:
- Build SystemC performance models for AI Storage Solutions-based products, covering the end-to-end path from GPU/TPU/NPU/xPU, host interface, memory hierarchy, and base die controller to AI Storage Solutions, using various packaging technologies
- Improve AI/ML ASIC architecture performance through hardware and software co-optimization, post-silicon performance analysis, and influence on the strategic product roadmap
- Workload analysis and characterization of ASIC and competitive datacenter and AI solutions to identify opportunities for performance improvement in our products
- Collaborate with the Architecture team to resolve performance issues and optimize the performance and TCO of their AI Storage Solutions-based datacenter technologies
- Experience modeling one or more components of AI/ML accelerator ASICs, such as AI Storage Solutions, PCIe/UCIe/CXL, NoC, DMA, firmware interactions, NAND, xPU, fabrics, etc.
- Performance modeling and optimization for multi-trillion-parameter LLM training/inference, including dense and Mixture of Experts (MoE) models with multiple modalities (text, vision, speech)
- Model and optimize novel parallelization strategies across the tensor, pipeline, context, expert, and data parallel dimensions
- Architect memory-efficient training systems utilizing techniques like structured pruning, quantization (MX formats), continuous batching/chunked prefill, and speculative decoding
- Incorporate and extend SOTA models such as GPT-4, reasoning models like DeepSeek-R1, and multi-modal architectures
- Collaborate with internal and external stakeholders and ML researchers to disseminate results and iterate at a rapid pace
In the AI Storage Solutions Performance Architecture Group, we build on our deep expertise in microarchitecture and simulation to analyze and optimize high-performance ASIC designs for critical areas such as AI/ML accelerators, cloud computing, and high-performance computing.
