Where the Chips Fall: Navigating the Silicon Storm

GenAI has taken the world by storm, with organisations big and small eager to pilot use cases for automation and productivity boosts. Tech giants like Google, AWS, and Microsoft are offering cloud-based GenAI tools, but the demand is straining current infrastructure capabilities needed for training and deploying large language models (LLMs) like ChatGPT and Bard.

Understanding the Demand for Chips

The microchip manufacturing process is intricate, involving hundreds of steps and spanning up to four months from design to mass production. Semiconductor plants are also expensive to build and slow to bring online, so supply cannot expand quickly, and global demand has surpassed it. This imbalance affects technology companies, automakers, and other chip users, causing production slowdowns.

Supply chain disruptions, raw material shortages (such as rare earth metals), and geopolitical situations have also contributed to chip shortages. For example, restrictions by the US on China’s largest chip manufacturer, SMIC, made it harder for the company to sell to several organisations with American ties. This triggered a ripple effect, prompting tech vendors to start hoarding hardware and worsening supply challenges.

As AI advances and organisations start exploring GenAI, specialised AI chips are becoming the need of the hour to meet their immense computing demands. AI chips can include graphics processing units (GPUs), application-specific integrated circuits (ASICs), and field-programmable gate arrays (FPGAs). These specialised AI accelerators can be tens or even thousands of times faster and more efficient than CPUs when it comes to AI workloads.

The surge in GenAI adoption across industries has heightened the demand for improved chip packaging, as advanced AI algorithms require more powerful and specialised hardware. Effective packaging solutions must manage heat and power consumption for optimal performance. TSMC, one of the world’s largest chipmakers, announced a shortage in advanced chip packaging capacity at the end of 2023, which is expected to persist through 2024.

The scarcity of essential hardware, limited manufacturing capacity, and AI packaging shortages have impacted tech providers. Microsoft acknowledged the AI chip crunch as a potential risk factor in its 2023 annual report, emphasising the need to expand data centre locations and server capacity to meet customer demands, particularly for AI services. The chip squeeze has highlighted the dependency of tech giants on semiconductor suppliers. To address this, companies like Amazon and Apple are investing heavily in internal chip design and production to reduce dependence on large players such as NVIDIA – the current leader in AI chip sales.

How are Chipmakers Responding?

NVIDIA, one of the largest manufacturers of GPUs, has been forced to pivot its strategy in response to this shortage. The company has shifted focus towards developing chips specifically designed to handle complex AI workloads, such as the A100 and V100 GPUs. These AI accelerators feature specialised hardware like tensor cores optimised for AI computations, high memory bandwidth, and native support for AI software frameworks.

While this move positions NVIDIA at the forefront of the AI hardware race, experts say that it comes at a significant cost. By reallocating resources towards AI-specific GPUs, the company’s ability to meet the demand for consumer-grade GPUs has been severely impacted. This strategic shift has worsened the ongoing GPU shortage, further straining the market dynamics surrounding GPU availability and demand.

Others like Intel, a stalwart in traditional CPUs, are expanding into AI, edge computing, and autonomous systems. A significant competitor to Intel in high-performance computing, AMD acquired Xilinx to offer integrated solutions combining high-performance central processing units (CPUs) and programmable logic devices.

Global Resolve Key to Address Shortages

Governments worldwide are boosting chip capacity to tackle the semiconductor crisis and fortify supply chains. Initiatives like the CHIPS for America Act and the European Chips Act aim to bolster domestic semiconductor production through investments and incentives. Leading manufacturers like TSMC and Samsung are also expanding production capacities, reflecting a global consensus on self-reliance and supply chain diversification. Asian governments are similarly investing in semiconductor manufacturing to address shortages and enhance their global market presence.

Japan. Japan is providing generous government subsidies and incentives to attract major foreign chipmakers such as TSMC, Samsung, and Micron to invest and build advanced semiconductor plants in the country. Subsidies have helped to bring greenfield investments in Japan’s chip sector in recent years. TSMC alone is investing over USD 20 billion to build two cutting-edge plants in Kumamoto by 2027. The government has earmarked around USD 13 billion just in this fiscal year to support the semiconductor industry.

Moreover, Japan’s collaboration with the US and the establishment of Rapidus, an advanced logic chip venture backed by major corporations, further show its ambitions to revitalise its semiconductor industry. Japan is also looking into advancements in semiconductor materials like silicon carbide (SiC) and gallium nitride (GaN) – crucial for powering electric vehicles, renewable energy systems, and 5G technology.

South Korea. While Taiwan holds the lead in semiconductor manufacturing volume, South Korea dominates the memory chip sector, largely due to Samsung. The country is also spending USD 470 billion over the next 23 years to build the world’s largest semiconductor “mega cluster” covering 21,000 hectares in Gyeonggi Province near Seoul. The ambitious project, a partnership with Samsung and SK Hynix, will centralise and boost self-sufficiency in chip materials and components to 50% by 2030. The mega cluster is South Korea’s bold plan to cement its position as a global semiconductor leader and reduce dependence on the US amidst growing geopolitical tensions.

Vietnam. Vietnam is actively positioning itself to become a major player in the global semiconductor supply chain amid the push to diversify away from China. The Southeast Asian nation is offering tax incentives, investing in training tens of thousands of semiconductor engineers, and encouraging major chip firms like Samsung, NVIDIA, and Amkor to set up production facilities and design centres. However, Vietnam faces challenges such as a limited pool of skilled labour, outdated energy infrastructure leading to power shortages in key manufacturing hubs, and competition from other regional players like Taiwan and Singapore that are also vying for semiconductor investments.

The Potential of SLMs in Addressing Infrastructure Challenges

Small language models (SLMs) offer reduced computational requirements compared to larger models, potentially alleviating strain on semiconductor supply chains by deploying on smaller, specialised hardware.
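To give a rough sense of scale, here is a minimal back-of-the-envelope sketch of how much memory is needed just to hold a model's weights. The parameter counts below are illustrative round numbers, not figures for any specific model, and the function name is hypothetical. The estimate ignores activations, caches, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to store model weights, in gigabytes.

    Assumes every parameter is stored at the same precision and counts
    only the weights themselves (no activations or runtime overhead).
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# Illustrative round numbers (assumptions), not measurements of real models.
example_models = {
    "small model (2B parameters)": 2e9,
    "large model (70B parameters)": 70e9,
}

for name, params in example_models.items():
    for bits in (16, 4):
        size = weight_memory_gb(params, bits)
        print(f"{name}, {bits}-bit weights: about {size:.0f} GB")
```

Under these assumptions, the small model's weights fit comfortably within the memory of a phone or laptop, while the large model's weights call for data-centre-class accelerators. That gap is the reason smaller models are expected to ease pressure on scarce high-end chips.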

Innovative SLMs like Google’s Gemini Nano and Mistral AI’s Mixtral 8x7B enhance efficiency, running on modest hardware, unlike their larger counterparts. Gemini Nano runs on-device on Pixel 8 smartphones, while Mixtral 8x7B supports multiple languages and suits tasks like classification and customer support.

The shift towards smaller AI models can be pivotal to the AI landscape, democratising AI and ensuring accessibility and sustainability. While they may not be able to handle complex tasks as well as LLMs yet, the ability of SLMs to balance model size, compute power, and ethical considerations will shape the future of AI development.

AI Will be the “Next Big Thing” in End-User Computing

I have spent many years analysing the mobile and end-user computing markets, going all the way back to 1995, when I was part of a desktop PC research team, through running the European wireless and mobile comms practice and my time at 3 Mobile in Australia, and for many years since, helping clients with their end-user computing strategies. I have followed the market from the birth of mobile data services (GPRS, WAP, and on to 3G, 4G and 5G), from simple phones to powerful foldable devices, and from desktop computers to a complex array of mobile computing devices that meet many and varied employee needs. I am always looking for the “next big thing” – and there have been some significant milestones: Palm devices, Blackberries, the iPhone, Android, foldables, wearables, and smaller, thinner, faster, more powerful laptops.

But over the past few years, innovation in this space has tailed off. Outside of the foldable space (which is already four years old), the major benefits of new devices are faster processors, brighter screens, and better cameras. I review a lot of great computers too (like many of the recent Surface devices) – and while they are continuously improving, not much has got my clients or me “excited” over the past few years (outside of some of the very cool accessibility initiatives). 

The Force of AI 

But this is all about to change. Devices are going to get smarter based on their data ecosystem, the cloud, and AI-specific local processing power. To be honest, this has been happening for some time – but most of the “magic” has been invisible to us. It happened when cameras took multiple shots and selected the best one; it happened when pixels were sharpened and images got brighter, better, and more attractive; it happened when digital assistants were called upon to answer questions and provide context.  

Microsoft, among others, is about to put AI smarts front and centre of the experience – Windows Copilot will add a smart assistant that can not only advise but also act on that advice. It will help employees improve their focus and productivity, summarise documents and long chat threads, select music, distribute content to the right audience, and find connections. Combined with Microsoft 365 Copilot, it will help knowledge workers spend less time searching and reading – and more time doing and improving.

The greater integration of public and personal data with “intent insights” will also play out on our mobile devices. We are likely to see the emergence of the much-promised “integrated app” – one that can take on many of the tasks that we currently undertake across multiple applications, mobile websites, and sometimes even multiple devices. This will initially be through the use of public LLMs like Bard and ChatGPT, but as more custom, private models emerge they will serve very specific functions.

Focused AI Chips will Drive New Device Wars 

In parallel to these developments, we expect the emergence of very specific AI processors that are paired to very specific AI capabilities. As local processing power becomes a necessity for some AI algorithms, the broad CPUs – and even the AI-focused ones (like Google’s Tensor Processor) – will need to be complemented by specific chips that serve specific AI functions. These chips will perform the processing more efficiently – preserving the battery and improving the user experience.  

While this will be a longer-term trend, it is likely to significantly change the game for what can be achieved locally on a device – enabling capabilities that are not in the realm of imagination today. They will also spur a new wave of device competition and innovation – with a greater desire to be on the “latest and greatest” devices than we see today! 

So, while the levels of device innovation have flattened, AI-driven software and chipset innovation will see current and future devices enable new levels of employee productivity and consumer capability. The focus in 2023 and beyond needs to be less on the hardware announcements and more on the platforms and tools. End-user computing strategies need to be refreshed with a new perspective around intent and intelligence. The persona-based strategies of the past have to be changed in a world where form factors and processing power are less relevant than outcomes and insights. 
