Tesla’s workloads demand a high level of efficiency from the chip. The D1 delivers 362 teraflops of FP16/CFP8 throughput. On that measure, Tesla’s chip outperforms Nvidia’s most powerful GPU, the A100, for FP16-type data: the A100 offers 312 teraflops for FP16.
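As a quick check on that comparison, the Python sketch below computes how far apart the two quoted figures are. The function name `relative_advantage` is purely illustrative, and the calculation assumes both numbers are peak throughput quoted on a comparable basis, which the figures alone do not establish.

```python
def relative_advantage(a_tflops: float, b_tflops: float) -> float:
    """Return how much larger a is than b, as a fraction of b."""
    if b_tflops <= 0:
        raise ValueError("baseline throughput must be positive")
    return a_tflops / b_tflops - 1.0


# Peak figures quoted in the text, in teraflops.
D1_TFLOPS = 362.0    # Tesla D1, FP16/CFP8
A100_TFLOPS = 312.0  # Nvidia A100, FP16

advantage = relative_advantage(D1_TFLOPS, A100_TFLOPS)
print(f"D1 peak is {advantage:.1%} higher than the A100 figure")
# prints "D1 peak is 16.0% higher than the A100 figure"
```

On paper the gap is about 16 percent. Peak numbers like these say nothing about sustained performance on a real workload.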
Tesla has built the chip as a network of functional units (FUs). Each FU contains a 64-bit processor with a dedicated ISA designed for operations such as transposes, gathers, broadcasts, and link traversals.
Each FU delivers 1 teraflop for BF16 or CFP8 calculations and 64 gigaflops for FP32 calculations. It also has 512 GB/s of bandwidth in each direction. Overall, this design reduces latency and increases performance.
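The per-unit figures can be related to the chip-level figure with some simple arithmetic. The Python sketch below assumes that chip throughput is the plain sum of the per-FU peaks and that the 1 teraflop per-FU value is exact rather than rounded. Under those assumptions, the unit count and the FP32 total it prints are rough estimates, not published specifications. The function names are illustrative.

```python
# Per-FU and chip-level figures quoted in the text.
FU_BF16_TFLOPS = 1.0   # BF16/CFP8 throughput per functional unit
FU_FP32_GFLOPS = 64.0  # FP32 throughput per functional unit
CHIP_TFLOPS = 362.0    # chip-level FP16/CFP8 throughput


def implied_unit_count(chip_tflops: float, unit_tflops: float) -> float:
    """Estimate the number of units, assuming chip peak = units * unit peak."""
    return chip_tflops / unit_tflops


def aggregate_fp32_tflops(unit_count: float, unit_fp32_gflops: float) -> float:
    """Estimate chip-level FP32 throughput in teraflops (1 TFLOPS = 1000 GFLOPS)."""
    return unit_count * unit_fp32_gflops / 1000.0


units = implied_unit_count(CHIP_TFLOPS, FU_BF16_TFLOPS)
fp32_total = aggregate_fp32_tflops(units, FU_FP32_GFLOPS)
low_to_fp32_ratio = FU_BF16_TFLOPS * 1000.0 / FU_FP32_GFLOPS

print(f"Implied functional units: about {units:.0f}")
print(f"Estimated chip FP32 throughput: about {fp32_total:.1f} TFLOPS")
print(f"BF16/CFP8 to FP32 throughput ratio per FU: {low_to_fp32_ratio:.3f}")
```

The ratio of roughly 15.6 to 1 between the low-precision and FP32 rates shows how strongly each unit is weighted toward BF16/CFP8 arithmetic.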