High-efficiency NPU delivering 128 TOPS with sparse acceleration, targeting edge and cloud AI inference.
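As a rough guide to what the headline figure means in practice, the sketch below estimates sustained throughput for a partly sparse workload. It assumes 128 TOPS is the dense peak and that sparse acceleration yields about a 2x speedup on structured-sparse operations; the speedup and utilization values are illustrative assumptions, not datasheet numbers.

```python
# Minimal sketch: effective NPU throughput under structured sparsity.
# The 128 TOPS dense peak comes from the spec above; the 2x sparse
# speedup and the utilization default are illustrative assumptions.

DENSE_PEAK_TOPS = 128.0   # dense peak from the spec
SPARSE_SPEEDUP = 2.0      # assumed gain from sparse acceleration

def effective_tops(sparse_fraction: float, utilization: float = 0.6) -> float:
    """Estimate sustained TOPS for a workload whose ops are partly sparse.

    sparse_fraction: share of ops that benefit from sparse acceleration (0..1).
    utilization: fraction of peak the NPU sustains in practice (assumed).
    """
    dense_part = (1.0 - sparse_fraction) * DENSE_PEAK_TOPS
    sparse_part = sparse_fraction * DENSE_PEAK_TOPS * SPARSE_SPEEDUP
    return utilization * (dense_part + sparse_part)

if __name__ == "__main__":
    for frac in (0.0, 0.5, 1.0):
        print(f"sparse fraction {frac:.0%}: ~{effective_tops(frac):.0f} TOPS")
```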
64 GT/s link rate with low latency and SR-IOV support for hardware-assisted virtualization.
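As a sketch of the SR-IOV path, the snippet below enables virtual functions for a PCIe device through the standard Linux sysfs interface (sriov_totalvfs, sriov_numvfs). The PCI address is a placeholder and the VF count is arbitrary; writing requires root.

```python
# Minimal sketch: enabling SR-IOV virtual functions on Linux via sysfs.
# The PCI address below is a placeholder; look up your device with lspci.
from pathlib import Path

PCI_ADDR = "0000:01:00.0"  # placeholder PCI address
DEV = Path("/sys/bus/pci/devices") / PCI_ADDR

def enable_vfs(num_vfs: int) -> None:
    """Carve out num_vfs virtual functions, capped at the device maximum."""
    total = int((DEV / "sriov_totalvfs").read_text())
    if num_vfs > total:
        raise ValueError(f"device supports at most {total} VFs")
    # The kernel requires resetting to 0 before setting a new nonzero count.
    (DEV / "sriov_numvfs").write_text("0")
    (DEV / "sriov_numvfs").write_text(str(num_vfs))

if __name__ == "__main__":
    enable_vfs(4)  # requires root; exposes 4 VFs to guests or containers
```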