We are looking for a Junior NPU Kernel/Operator Engineer to develop and optimize deep learning operators for a custom AI accelerator (NPU). The role focuses on operator implementation, performance tuning, and correctness validation across a broad range of neural network workloads.
This role is a good fit for candidates with strong C/C++ and Python skills who are interested in hardware-aware software optimization. Prior NPU experience is helpful but not required.
Responsibilities
Requirements
Preferred