10. April 2023 By Lilian Do Khac
A rough guide to the vast jungle of AI language models
AGI, narrow AI and strong AI – an introduction to the terms
With the advent of large language models, people are coming across the term artificial general intelligence (AGI) more and more often. The term is intended to distinguish systems by the diversity of their applications. This diversity results, among other things, from a fundamental change in thinking about how such AI models are created, used and accounted for. Traditional AI systems are rather narrow in terms of their possible applications (narrow AI) – that is, they are only intended for one or a few tasks. In contrast, more general AI systems have a wide range of possible applications. On an informal level, the term helps to describe the continuum between traditional AI – more specifically, narrow AI – and strong AI (which includes fictional characters such as WALL-E, JARVIS or you name it). The large language models are now referred to as ‘foundation models’. These are AI models that have been pre-trained on an enormous volume of data. Taking these foundation models as a basis, it is possible to adapt them to downstream tasks by using appropriate instructions (soft-prompting) or by fine-tuning (example results: ChatGPT, Bard). This means that foundation models sit on the continuum between narrow AI and AGI and are moving in the direction of AGI.
From the legislative point of view (EU AI Act), a distinction is made (as of Q1/2023) between narrow AI and non-narrow AI. Non-narrow AI is referred to there as ‘general purpose AI’: an AI system that is trained on broad data at scale, is designed for generality of output and can be adapted to a wide range of tasks (EURACTIV, 14 March 2023). The EU legislators leading the drafting of the AI Act have proposed significant obligations for the providers of large generative AI models, such as ChatGPT and Stable Diffusion, while trying to clarify responsibilities along the AI value chain.
What models are currently on the market?
Several providers offer models such as those mentioned above. The range on offer is limited, though – especially in the commercial sense. At the moment, the market can fairly be described as an oligopoly. As of today (23 March 2023), there are only a handful of commercial providers of enterprise-ready services. For context, commercial availability means that a product or service has successfully completed the alpha and beta phases and meets the quality requirements for a general release. This also includes the ability to guarantee terms and conditions of business, such as availability of products or services, as well as to meet industry standards in terms of security. These providers include, for example, Microsoft with its OpenAI endpoint for ChatGPT, GPT-3 and GPT-4, Alphabet (Google, DeepMind) with its PaLM variants, and Anthropic, a company backed by Google, with Claude.
The Luminous models from Aleph Alpha are the European counterpart. BLOOM is an open-source foundation model that was developed by independent scientists together with the US company Hugging Face. In the Asian region, notable enterprise-ready offerings are coming from the platform-oriented South Korean company Naver (with its product, the HyperCLOVA LLM platform).
Apart from those models, there are other really interesting variants, such as models that have been trained on protein sequences and can thus predict protein structures. These include ESMFold from Meta AI and AlphaFold from DeepMind/Alphabet. Then there are models that are primarily concerned with generating images, including Midjourney and Stable Diffusion.
In the following figure (Source: linkedin/Three facts on ChatGPT), other models/providers are listed in addition to the commercial (enterprise-ready service) offerings:
This chart will surely become obsolete relatively quickly – probably after a week. That being said, the intention here is rather to provide an overview of these models and their offshoots, and in the end, the origin of the models remains the same.