Nvidia and Intel Race For The Future Of Machine Learning

Two things happened recently that 99% of the ICT world would normally miss. After all, microprocessor and chip interconnect technology is a geeky area that most of us generally don’t venture into. So why would I want to bring this to your attention?

We are excited about the innovation that analytics, machine learning (ML) and all things real-time processing will bring to our lives and the way we run our businesses. The data center, be it on an enterprise’s premises or on a cloud service provider’s infrastructure, is under pressure to provide the compute, memory, input/output (I/O) and storage needed to take advantage of what hardware engineers call ‘accelerators’. In its simplest form, an accelerator microprocessor does the specialty work for ML and analytics algorithms while the main microprocessor tries to hold everything else together and keep all of the silicon parts in sync. If an ML accelerator is too fast with its answers, it will sit and wait for everyone else while its outcomes are squeezed down a narrow, slow pipe or interconnect. In other words, the servers in the data center are not optimized for these workloads. The connection between the accelerators and the main components becomes the slowest and weakest link. So now back to the news of the day.
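As a brief aside, the weakest-link effect described above can be put into rough numbers. The sketch below is only an illustration under a simplifying assumption: sustained throughput is capped by the slower of the accelerator and the interconnect. The function names and the figures are hypothetical and do not describe any real product.

```python
def effective_throughput(accel_gb_per_s: float, link_gb_per_s: float) -> float:
    """Sustained rate is capped by the slower of the accelerator and the link."""
    return min(accel_gb_per_s, link_gb_per_s)


def accelerator_idle_fraction(accel_gb_per_s: float, link_gb_per_s: float) -> float:
    """Fraction of time the accelerator spends waiting on the interconnect."""
    sustained = effective_throughput(accel_gb_per_s, link_gb_per_s)
    return 1.0 - sustained / accel_gb_per_s


# Hypothetical figures, chosen only for illustration.
accel_rate = 100.0  # GB/s of results the accelerator could produce
link_rate = 25.0    # GB/s the interconnect can actually carry

print(effective_throughput(accel_rate, link_rate))       # sustained GB/s
print(accelerator_idle_fraction(accel_rate, link_rate))  # share of time spent waiting
```

With these made-up figures the accelerator would spend most of its time waiting, which is the situation a faster interconnect is meant to relieve.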

A new high-speed CPU-to-device interconnect standard, Compute Express Link (CXL) 1.0, was announced by Intel and a consortium of leading technology companies (Huawei and Cisco in the network infrastructure space, HPE and Dell EMC in the server hardware market, and Alibaba, Facebook, Google and Microsoft for the cloud services provider markets). CXL joins a crowded field of standards already in the server link market, including CAPI, NVLink, Gen-Z and CCIX. CXL is being positioned to improve the performance of the links between the CPU and FPGAs and GPUs, the most common accelerators involved in ML-like workloads.

Of course, some names were absent from the launch: Arm, AMD, Nvidia, IBM, Amazon and Baidu. Each of them is a member of other standards bodies and is probably playing the waiting game.

Now let’s pause for a moment and look at the other announcement that happened at the same time. Nvidia and Mellanox announced that the two companies had reached a definitive agreement under which Nvidia will acquire Mellanox for $6.9 billion. Nvidia gives its reasons for the acquisition as follows: “The data and compute intensity of modern workloads in AI, scientific computing and data analytics is growing exponentially and has put enormous performance demands on hyperscale and enterprise datacenters. While computing demand is surging, CPU performance advances are slowing as Moore’s law has ended. This has led to the adoption of accelerated computing with Nvidia GPUs and Mellanox’s intelligent networking solutions.”

So to me it seems that, although Intel has been working on CXL for four years, it might have been outbid by Nvidia for Mellanox. Mellanox has been around for 20 years and has been the major supplier of InfiniBand, a high-speed interconnect that is common in high-performance workloads and very well accepted by the HPC industry. (Note: Intel was also one of the founders of the InfiniBand Trade Association, IBTA, before it opted to refocus on the PCI bus.) With the growing need for fast links between the accelerators and the microprocessors, it would seem that Mellanox’s persistence has paid off and that it now has the market coming to it. One can’t help but think that as soon as Intel knew that Nvidia was getting Mellanox, it pushed forward with the CXL announcement, although this is a rumor that none of the parties has responded to.

Advice for Tech Suppliers:

The two announcements are great news for any vendor entering the world of AI and compute-intensive workloads that rely on graphics and floating-point arithmetic functions. We know that more digitally oriented solutions are asking for analytics-based outcomes, so there will be growing demand for broader, commoditized server platforms to support them. Tech suppliers should avoid backing either CXL or InfiniBand for the moment, until we see how the CXL standard evolves and how Nvidia integrates Mellanox.

Advice for Tech Users:

These two announcements reflect innovation that is generally so far removed from the end user that it can go unnoticed. However, think about how USB (Universal Serial Bus) has changed the way we connect devices to our laptops, servers and mobile devices. The same will be true for this connection as more and more data is read, and more outcomes are generated, by the ‘accelerators’ for the way we drive our cars, digitize our factories, run our hospitals and search the Internet. Innovation in this space just got a shot in the arm from these two announcements.
