GenAI has taken the world by storm, with organisations big and small eager to pilot use cases for automation and productivity boosts. Tech giants like Google, AWS, and Microsoft are offering cloud-based GenAI tools, but demand is straining the infrastructure needed to train and deploy the large language models (LLMs) behind services like ChatGPT and Bard.
Understanding the Demand for Chips
The microchip manufacturing process is intricate, involving hundreds of steps and spanning up to four months from design to mass production. The expense and time required to build semiconductor plants have left global demand outstripping supply. This imbalance affects technology companies, automakers, and other chip users, causing production slowdowns.
Supply chain disruptions, raw material shortages (such as rare earth metals), and geopolitical tensions have also played a significant role in chip shortages. For example, US restrictions on SMIC, China’s largest chip manufacturer, made it harder for the company to trade with organisations that have American ties. This triggered a ripple effect, prompting tech vendors to hoard hardware and worsening supply challenges.
As AI advances and organisations start exploring GenAI, specialised AI chips have become the need of the hour to meet its immense computing demands. AI chips include graphics processing units (GPUs), application-specific integrated circuits (ASICs), and field-programmable gate arrays (FPGAs). These specialised AI accelerators can be tens or even thousands of times faster and more efficient than general-purpose central processing units (CPUs) for AI workloads.
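To make the scale of that gap concrete, here is a rough, minimal benchmark sketch in Python using PyTorch that times a large matrix multiplication – the core operation behind most AI workloads – on a CPU and, where available, a GPU. The matrix size and repeat count are arbitrary illustrative choices, and real speedups vary widely with hardware and workload.

```python
# Rough illustration of the CPU/GPU gap on a core AI operation:
# timing a large matrix multiplication with PyTorch.
import time
import torch

def time_matmul(device: str, size: int = 4096, repeats: int = 10) -> float:
    """Average seconds per (size x size) matrix multiplication on `device`."""
    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)
    if device == "cuda":
        torch.cuda.synchronize()  # finish setup before timing
    start = time.perf_counter()
    for _ in range(repeats):
        _ = a @ b
    if device == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU work to complete
    return (time.perf_counter() - start) / repeats

print(f"CPU: {time_matmul('cpu'):.4f} s per matmul")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda'):.4f} s per matmul")
```

On typical hardware the GPU figure often comes out one to two orders of magnitude lower than the CPU one – the gap driving demand for AI accelerators.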
The surge in GenAI adoption across industries has heightened the demand for improved chip packaging, as advanced AI algorithms require more powerful and specialised hardware. Effective packaging solutions must manage heat and power consumption for optimal performance. TSMC, one of the world’s largest chipmakers, announced a shortage in advanced chip packaging capacity at the end of 2023 that is expected to persist through 2024.
The scarcity of essential hardware, limited manufacturing capacity, and AI packaging shortages have impacted tech providers. Microsoft acknowledged the AI chip crunch as a potential risk factor in its 2023 annual report, emphasising the need to expand data centre locations and server capacity to meet customer demand, particularly for AI services. The chip squeeze has highlighted tech giants’ dependency on semiconductor suppliers. To address this, companies like Amazon and Apple are investing heavily in in-house chip design and production to reduce dependence on large players such as Nvidia – the current leader in AI chip sales.
How are Chipmakers Responding?
NVIDIA, one of the largest manufacturers of GPUs, has been forced to pivot its strategy in response to this shortage. The company has shifted focus towards chips designed specifically for complex AI workloads, such as its A100 and V100 GPUs. These AI accelerators feature specialised hardware like tensor cores optimised for AI computations, high memory bandwidth, and native support for AI software frameworks.
While this move positions NVIDIA at the forefront of the AI hardware race, experts say it comes at a significant cost. By reallocating resources towards AI-specific GPUs, NVIDIA has severely constrained its ability to meet demand for consumer-grade GPUs. This strategic shift has worsened the ongoing GPU shortage, further straining market dynamics around GPU availability and demand.
Others, like Intel, a stalwart in traditional CPUs, are expanding into AI, edge computing, and autonomous systems. AMD, a significant competitor to Intel in high-performance computing, acquired Xilinx to offer integrated solutions combining high-performance CPUs and programmable logic devices.
Global Resolve Key to Address Shortages
Governments worldwide are boosting chip capacity to tackle the semiconductor crisis and fortify supply chains. Initiatives like the CHIPS for America Act and the European Chips Act aim to bolster domestic semiconductor production through investments and incentives. Leading manufacturers like TSMC and Samsung are also expanding production capacities, reflecting a global consensus on self-reliance and supply chain diversification. Asian governments are similarly investing in semiconductor manufacturing to address shortages and enhance their global market presence.
Japan. Japan is providing generous government subsidies and incentives to attract major foreign chipmakers such as TSMC, Samsung, and Micron to build advanced semiconductor plants in the country. These subsidies have brought greenfield investments into Japan’s chip sector in recent years. TSMC alone is investing over USD 20 billion to build two cutting-edge plants in Kumamoto by 2027, and the government has earmarked around USD 13 billion in this fiscal year alone to support the semiconductor industry.
Moreover, Japan’s collaboration with the US and the establishment of Rapidus, a foundry venture targeting advanced logic chips and backed by major corporations, further underline its ambitions to revitalise its semiconductor industry. Japan is also looking into advances in semiconductor materials like silicon carbide (SiC) and gallium nitride (GaN) – crucial for powering electric vehicles, renewable energy systems, and 5G technology.
South Korea. While Taiwan holds the lead in semiconductor manufacturing volume, South Korea dominates the memory chip sector, largely thanks to Samsung. The country is also spending USD 470 billion over the next 23 years to build the world’s largest semiconductor “mega cluster”, covering 21 million square metres in Gyeonggi Province near Seoul. The ambitious project, a partnership with Samsung and SK Hynix, will centralise production and boost self-sufficiency in chip materials and components to 50% by 2030. The mega cluster is South Korea’s bold plan to cement its position as a global semiconductor leader and reduce dependence on the US amidst growing geopolitical tensions.
Vietnam. Vietnam is actively positioning itself to become a major player in the global semiconductor supply chain amid the push to diversify away from China. The Southeast Asian nation is offering tax incentives, investing in training tens of thousands of semiconductor engineers, and encouraging major chip firms like Samsung, Nvidia, and Amkor to set up production facilities and design centres. However, Vietnam faces challenges such as a limited pool of skilled labour, outdated energy infrastructure leading to power shortages in key manufacturing hubs, and competition from other regional players like Taiwan and Singapore that are also vying for semiconductor investments.
The Potential of SLMs in Addressing Infrastructure Challenges
Small language models (SLMs) have far lower computational requirements than larger models and can be deployed on smaller, specialised hardware, potentially easing the strain on semiconductor supply chains.
Innovative SLMs like Google’s Gemini Nano and Mistral AI’s Mixtral 8x7B prioritise efficiency, running on modest hardware unlike their larger counterparts. Gemini Nano runs on-device on Pixel 8 Pro smartphones, while Mixtral 8x7B supports multiple languages and suits tasks like classification and customer support.
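To give a sense of how modest the required hardware can be, below is a minimal sketch, assuming the Hugging Face transformers, accelerate, and bitsandbytes packages, that loads a small open model in 4-bit quantised form so it can fit on a single consumer-grade GPU. The model ID and generation settings are illustrative assumptions rather than recommendations.

```python
# A minimal sketch: running a small open language model in 4-bit form so it
# fits on a single consumer-grade GPU. Assumes `transformers`, `accelerate`,
# and `bitsandbytes` are installed; the model ID is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative small model

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit weights cut memory ~4x vs fp16
    bnb_4bit_compute_dtype=torch.float16,  # keep compute in fp16 for speed
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",  # spread layers across available GPU/CPU memory
)

# A classification-style task of the kind SLMs are well suited to.
prompt = ("Classify the sentiment of this review as positive or negative: "
          "'The delivery was late again.'")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```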
The shift towards smaller AI models can be pivotal to the AI landscape, democratising AI and ensuring accessibility and sustainability. While they may not be able to handle complex tasks as well as LLMs yet, the ability of SLMs to balance model size, compute power, and ethical considerations will shape the future of AI development.
I have spent many years analysing the mobile and end-user computing markets – going all the way back to 1995, when I was part of a Desktop PC research team, through running the European wireless and mobile comms practice, to my time at 3 Mobile in Australia and many years since, helping clients with their end-user computing strategies. I have followed the journey from the birth of mobile data services (GPRS, WAP, and so on) to 3G, 4G, and 5G; from simple phones to powerful foldable devices; and from desktop computers to a complex array of mobile computing devices meeting many and varied employee needs. I am always looking for the “next big thing” – and there have been some significant milestones: Palm devices, Blackberries, the iPhone, Android, foldables, wearables, and smaller, thinner, faster, more powerful laptops.
But over the past few years, innovation in this space has tailed off. Outside of the foldable space (which is already four years old), the major benefits of new devices are faster processors, brighter screens, and better cameras. I review a lot of great computers too (like many of the recent Surface devices) – and while they are continuously improving, not much has got my clients or me “excited” over the past few years (outside of some of the very cool accessibility initiatives).
The Force of AI
But this is all about to change. Devices are going to get smarter based on their data ecosystem, the cloud, and AI-specific local processing power. To be honest, this has been happening for some time – but most of the “magic” has been invisible to us. It happened when cameras took multiple shots and selected the best one; it happened when pixels were sharpened and images got brighter, better, and more attractive; it happened when digital assistants were called upon to answer questions and provide context.
Microsoft, among others, is about to make AI smarts front and centre of the experience – Windows Copilot will add a smart assistant that can not only advise but execute on that advice. It will help employees improve their focus and productivity, summarise documents and long chat threads, select music, distribute content to the right audience, and find connections. Added to Microsoft 365 Copilot, it will help knowledge workers spend less time searching and reading – and more time doing and improving.
The greater integration of public and personal data with “intent insights” will also play out on our mobile devices. We are likely to see the emergence of the much-promised “integrated app” – one that can take on many of the tasks we currently spread across multiple applications, mobile websites, and sometimes even multiple devices. This will initially happen through public LLMs like Bard and ChatGPT, but as more custom, private models emerge, they will serve very specific functions.
Focused AI Chips will Drive New Device Wars
In parallel to these developments, we expect the emergence of very specific AI processors paired to very specific AI capabilities. As local processing power becomes a necessity for some AI algorithms, broad CPUs – and even AI-focused chips like Google’s Tensor – will need to be complemented by chips that serve specific AI functions. These chips will perform the processing more efficiently, preserving the battery and improving the user experience.
While this will be a longer-term trend, it is likely to significantly change the game for what can be achieved locally on a device – enabling capabilities that are barely imaginable today. These chips will also spur a new wave of device competition and innovation, with a greater desire to be on the “latest and greatest” devices than we see today!
So, while the levels of device innovation have flattened, AI-driven software and chipset innovation will see current and future devices enable new levels of employee productivity and consumer capability. The focus in 2023 and beyond needs to be less on the hardware announcements and more on the platforms and tools. End-user computing strategies need to be refreshed with a new perspective around intent and intelligence. The persona-based strategies of the past have to be changed in a world where form factors and processing power are less relevant than outcomes and insights.
It is not hyperbole to state that AI is on the cusp of having significant implications for society, business, economies, governments, individuals, cultures, politics, the arts, manufacturing, customer experience… I think you get the idea! We cannot overstate the impact that AI will have on society. In times gone by, businesses tested ideas, new products, or services with small customer segments before going live. But with AI, we are all part of the experiment on its impacts on society – its benefits, use cases, weaknesses, and threats.
What seemed preposterous just six months ago is not only possible but EASY! Do you want a virtual version of yourself, a friend, your CEO, or your deceased family member? Sure – just feed the data. Will succession planning be more about recording all conversations and interactions with an executive so their avatar can make the decisions when they leave? Why not? How about you turn the thousands of hours of recorded customer conversations with your contact centre team into a virtual contact centre team? Your head of product can present in multiple countries in multiple languages, tailored to the customer segments, industries, geographies, or business needs at the same moment.
AI has the potential to create digital clones of your employees; it can spread fake news as easily as real news; it can be used for deception as easily as for benefit. Is your organisation prepared for the social, personal, cultural, and emotional impacts of AI? Do you know how AI will evolve in your organisation?
When we focus on the future of AI, we often interview AI leaders, business leaders, futurists, and analysts. I haven’t seen enough focus on psychologists, sociologists, historians, academics, counsellors, or even regulators! The Internet and social media changed the world more than we ever imagined – and at this stage, it looks like those two were just a rehearsal for the real show: Artificial Intelligence.
Lack of Government or Industry Regulation Means You Need to Self-Regulate
These rapid developments – and the notable silence from governments, lawmakers, and regulators – make an AI Ethics Policy an urgent requirement for your organisation! Even if you have one, it probably needs updating, as the scenarios within which AI can operate are growing and changing literally every day.
- For example, your customer service team might want to create a virtual customer service agent modelled on a real person. What is the policy on this? How will it impact that person?
- Your marketing team might be using ChatGPT or Bard for content creation. Do you have a policy specifically for the creation and use of content using assets your business does not own?
- What data is acceptable to be ingested by a public LLM? And are you governing data at creation and publishing to ensure these policies are met?
- With the impending public launch of Microsoft’s Copilot AI service, what data can be ingested by Copilot? How are you governing the distribution of the insights that come out of that capability?
If policies are not put in place, data not tagged, and staff not trained before using a tool such as Copilot, your business will likely break privacy or employment laws – on the very first day!
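To make that governance point concrete, here is a minimal, hypothetical sketch of the kind of pre-ingestion gate such a policy implies: documents are tagged for sensitivity at creation, and anything not explicitly cleared for AI use is blocked before it reaches an external model. The tag names and `Document` structure are invented for illustration and do not reference any real product API.

```python
# Hypothetical pre-ingestion policy gate: only documents explicitly tagged as
# approved for AI use may be sent to an external LLM or a tool like Copilot.
from dataclasses import dataclass

# Illustrative sensitivity tags a business might define in its AI policy.
ALLOWED_FOR_AI = {"public", "internal-approved-for-ai"}

@dataclass
class Document:
    name: str
    sensitivity: str  # tag applied at creation/publishing time
    text: str

def approved_for_llm(doc: Document) -> bool:
    """Return True only if the document's tag clears it for AI ingestion."""
    return doc.sensitivity in ALLOWED_FOR_AI

docs = [
    Document("press-release.md", "public", "..."),
    Document("salary-review.xlsx", "confidential", "..."),
]

for doc in docs:
    if approved_for_llm(doc):
        print(f"OK to ingest: {doc.name}")
    else:
        print(f"BLOCKED by policy: {doc.name} (tag: {doc.sensitivity})")
```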
What do the LLMs Say About AI Ethics Policies?
So where do you go when looking for an AI Ethics policy? ChatGPT and Bard, of course! I asked both for a modern AI Ethics policy.
You can read what they generated in the graphic below.
I personally prefer the GPT-4 version as it is more prescriptive. At the same time, I would argue that MOST of the AI tools your business has access to today don’t meet all of these principles. And while they are tools, and ethics should dictate the way tools are used, with AI you cannot always separate the process and outcome from the tool.
For example, a tool that is inherently designed to learn an employee’s character, style, or mannerisms cannot be unbiased if what it learns from is biased (and humans have biases!).
LLMs take data, content, and insights created by others and give them to their customers to reuse. Are you happy with your website being used as a tool to train a startup on the opportunities in the markets and customers you serve?
By making content public, you acknowledge the risk of others using it. But at least they visited your website or app to consume it. Not anymore…
A Policy is Useless if it Sits on a Shelf
Your AI ethics policy needs to be more than a published document. It should be the beginning of a conversation across the entire organisation about the use of AI. Your employees need to be trained in the policy. It needs to be part of the culture of the business – particularly as low- and no-code capabilities push these AI tools, practices, and capabilities into the hands of many of your employees.
Nearly every business leader I interview mentions that their organisation is an “intelligent, data-led business.” What is the role of AI in driving this intelligent business? If being data-driven and analytical is in the DNA of your organisation, soon AI will also be at the heart of your business. You might think you can delay your investments to get it right – but your competitors may be ahead of you.
So, as you jump head-first into the AI pool, start to create, improve and/or socialise your AI Ethics Policy. It should guide your investments, protect your brand, empower your employees, and keep your business resilient and compliant with legacy and new legislation and regulations.
Google recently extended its Generative AI, Bard, to include coding in more than 20 programming languages, including C++, Go, Java, JavaScript, and Python. The search giant has been eager to respond to last year’s launch of ChatGPT, but as the trusted incumbent it has naturally been hesitant to move too quickly: the tendency of large language models to produce controversial and erroneous outputs can tarnish established brands. Bard was released in March in the US and the UK but lacked the coding ability of OpenAI’s ChatGPT and Microsoft’s Bing Chat.
Bard’s new features include code generation, optimisation, debugging, and explanation. Using natural language, users can describe their requirements and ask the AI to generate code that can then be exported to an integrated development environment (IDE) or executed directly in the browser with Google Colab. Similarly, users can ask Bard to debug existing code, explain code snippets, or optimise code to improve performance.
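As a flavour of the workflow, a developer might give the chatbot a plain-English request and receive something like the Python below, which they can then ask the model to explain, debug, or optimise. The snippet is an illustrative example of typical LLM output, not an actual Bard transcript.

```python
# Illustrative response to the prompt:
# "Write a Python function that returns the n most common words in a text file."
import re
from collections import Counter

def most_common_words(path: str, n: int = 10) -> list[tuple[str, int]]:
    """Return the n most frequent words in the file at `path`."""
    with open(path, encoding="utf-8") as f:
        words = re.findall(r"[a-z']+", f.read().lower())
    return Counter(words).most_common(n)

print(most_common_words("article.txt", 5))  # assumes article.txt exists
```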
Google continues to refer to Bard as an experiment and highlights that, as with generated text, code produced by the AI may not function as expected. Regardless, the new functionality will be useful for both beginner and experienced developers. Those learning to code can use Generative AI to debug and explain their mistakes or write simple programs, while more experienced developers can hand it lower-value work, such as commenting code, or use it as scaffolding to identify potential problems.
GitHub Copilot X to Face Competition
While the ability of Bard, Bing Chat, and ChatGPT to generate code is one of their most important use cases, developers are now demanding AI directly in their IDEs.
In March, Microsoft made one of its most significant announcements of the year when it demonstrated GitHub Copilot X, which embeds GPT-4 in the development environment. Earlier this year, Microsoft invested USD 10 billion in OpenAI, adding to the USD 1 billion it invested in 2019 and cementing the partnership between the two AI heavyweights. Among other benefits, this agreement makes Azure the exclusive cloud provider to OpenAI and gives Microsoft the opportunity to enhance its software with AI co-pilots.
Currently in technical preview, Copilot X will, when it eventually launches, integrate into Visual Studio, Microsoft’s IDE. Presented as a sidebar or chat directly in the IDE, Copilot X will be able to generate, explain, and comment on code, debug, write unit tests, and identify vulnerabilities. The “Hey, GitHub” functionality will allow users to chat using voice – suitable for mobile users or for more natural interaction on a desktop.
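As an illustration of what “write unit tests” means in practice, an assistant pointed at a small helper function might produce something like the pytest-style sketch below. Both the `slugify` helper and the tests are invented examples, not Copilot X output.

```python
# Illustrative example of AI-generated unit tests for a simple helper (pytest style).
import re

def slugify(title: str) -> str:
    """Lower-case a title and replace runs of non-alphanumerics with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def test_slugify_basic():
    assert slugify("Hello, World!") == "hello-world"

def test_slugify_collapses_separators():
    assert slugify("  AI --- Code   Generators ") == "ai-code-generators"
```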
Not to be outdone by its cloud rivals, AWS announced in April the general availability of what it describes as a real-time AI coding companion. Amazon CodeWhisperer integrates with a range of IDEs – Visual Studio Code, IntelliJ IDEA, CLion, GoLand, WebStorm, Rider, PhpStorm, PyCharm, RubyMine, and DataGrip – and runs natively in the AWS Cloud9 and AWS Lambda consoles. While the preview supported Python, Java, JavaScript, TypeScript, and C#, the general release extends support to most popular languages. Amazon’s key differentiation is that CodeWhisperer is free for individual users, while GitHub Copilot is subscription-based, with exceptions only for teachers, students, and maintainers of open-source projects.
The Next Step: Generative AI in Security
The next battleground for Generative AI will be assisting overworked security analysts. Two of the greatest challenges Security Operations Centres (SOCs) face are understaffing and an overwhelming volume of alerts. Security vendors, such as IBM and Securonix, have already deployed automation to reduce alert noise and help analysts prioritise tasks so they are not chasing false threats.
Google recently introduced Sec-PaLM and Microsoft announced Security Copilot, bringing the power of Generative AI to the SOC. These tools will let analysts interact conversationally with their threat management systems and will explain alerts in natural language. How effective they will be remains to be seen, considering that a hallucination in a security context is far riskier than one in an essay written with ChatGPT.
The Future of AI Code Generators
Although GitHub Copilot and Amazon CodeWhisperer had already launched with limited feature sets, it was the release of ChatGPT last year that ushered in a new era in AI code generation. There is now a race between the cloud hyperscalers to win over developers and to provide AI that supports other functions, such as security.
Despite fears that AI will replace humans, in their current state these tools are more likely to augment developers than replace them. Although AI and automated testing reduce the burden on an already stretched workforce, humans will continue to be in demand to ensure code is secure and satisfies requirements. A likely scenario is that, as coding becomes simpler, the volume and quality of code written will increase rather than the number of developers shrinking. AI will generate a new wave of citizen developers able to work on projects that would previously have been impossible to start. This may, in turn, increase demand for developers to build on these proofs-of-concept.
How the Generative AI landscape evolves over the next year will be interesting. In a recent interview, OpenAI co-founder Sam Altman explained that the non-profit model the company initially pursued is not feasible, necessitating the launch of a capped-profit subsidiary. The company retains its values, however, focusing on advancing AI responsibly and transparently, with public consultation. The arrival of Microsoft, Google, and AWS will undoubtedly change market dynamics and may force OpenAI to reconsider its approach once again.