What you’ll be doing:
Platform architecture and hardware bring-up of NVIDIA HGX GPU baseboards. Software architecture and design for various firmware, drawing on an understanding of embedded system limitations and Linux kernel internals to meet performance, scalability, and resiliency requirements for firmware running on embedded devices.
Work closely with hardware teams to influence hardware design and review hardware architecture and schematics.
Work with internal and external team members to narrow down performance and resiliency requirements for firmware running on NVIDIA data center products. Hands-on coding, code review, and BMC firmware development, including various manageability features for NVIDIA’s server platforms.
Actively design and develop the CI/CD framework to ensure the best firmware quality. Write and review design documents, review QA test plans, and work closely with all collaborators to reach consensus on design and testability per product requirements.
Design solutions for errors, statistics, and configuration appropriate to CPUs, GPUs, DIMMs, SSDs, NICs, IB, PSUs, BMCs, FPGAs, CPLDs, etc., for enterprise readiness of NVIDIA server platforms.
Actively work with the whole organization to instrument code for maximum code coverage, write and automate unit tests for each implemented module, and maintain detailed unit test case reports.
Mentor the team on best practices for writing efficient, bug-free code. Work with internal and external partners to drive design architecture into real products.
Work with the security team to ensure developed code is in line with product security goals.