AI Systems Engineer – Serverless Distributed Computing
Huawei Technologies Canada Co., Ltd. · Markham, Ontario, CA
Job description
Huawei Canada has an immediate permanent opening for a Software Engineer.

About the team:
The Distributed Data Storage and Management Lab leads research in distributed data systems, aiming to develop next-generation cloud serverless products that encompass core infrastructure and databases. This lab addresses various data challenges, including cloud-native disaggregated databases, pay-by-query user models, and optimizing low-level data transfers via RDMA. Teams within this lab create advanced cloud serverless data infrastructure and implement cutting-edge networking technologies for Huawei's global AI infrastructure.

About the job:
- Architect and develop frameworks and engines for next-generation serverless computing tailored to AI workloads (LLM training/inference, agent execution, RL training, etc.).
- Analyze and optimize end-to-end AI system performance, including distributed scheduling, data flow, and memory utilization across large clusters.
- Research and evaluate cutting-edge technologies in distributed computing, serverless infrastructure, reinforcement learning, and LLM-based AI agents.
- Collaborate cross-functionally with research, product, and platform teams to transform conc...