Conversational Technologies
4 min read
Ziv Gidron

Mythbusting Google’s New Trillion-Parameter AI Language Model

Earlier this month, Google announced that its researchers have developed and benchmarked techniques enabling them to train a language model containing more than one trillion parameters. If you're not sure what parameters are, we highly recommend checking out our previously published blog post about OpenAI's language model, GPT-3. In the context of advanced language models, parameters are the parts of the model learned from historical training data, and they are key to how machine learning algorithms make predictions. As a blanket statement: more parameters generally means a more sophisticated language model, which is why, when Google claimed it had trained its new language model with over one trillion parameters, the conversational AI community was flush with excitement and prognostications. 

So, in service to all those curious to learn more about Google's latest proclamation, we sifted through the available facts and research to distinguish between what it is and what it isn't.

What It Is

This new language model (as of now nameless) is a remarkable accomplishment thanks to the techniques its developers employed to train it with a staggering 1.6 trillion parameters (the most to date), achieving up to a 4x speedup over Google's previously largest language model, T5-XXL. 

Training T5, Google’s previous language model. Source: Google AI

A paper released by the language model's researchers states that large-scale training is still one of the most effective paths toward powerful models. However, these mammoth language models are few and far between because such massive-scale training is computationally intensive, often prohibitively so.

To tackle this obstacle, the developers implemented the Switch Transformer technique, which activates only a subset of the model's parameters to transform any given input. The Switch Transformer is based on an AI approach first introduced in the early '90s called mixture of experts. These experts (or learners) are not of the Homo sapiens variety but are instead composed of various neural networks and machine learning models. In the simplest terms possible, these models, each responsible for a specific task, live within a larger “mothership” model and are orchestrated by a "gating network" that determines which experts to consult to produce the desired output. 
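To make the gating idea concrete, here is a minimal, illustrative sketch of switch-style (top-1) mixture-of-experts routing in NumPy. All names (`experts`, `gate_w`, `switch_layer`) and dimensions are our own assumptions for demonstration, not Google's actual implementation:

```python
# Hypothetical sketch of top-1 ("switch") mixture-of-experts routing.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, n_tokens = 8, 4, 5

# Each "expert" here is just a simple feed-forward weight matrix.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

# The gating network scores every token against every expert.
gate_w = rng.standard_normal((d_model, n_experts))

def switch_layer(tokens):
    """Route each token to exactly one expert (top-1 routing)."""
    logits = tokens @ gate_w                        # (n_tokens, n_experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)      # softmax over experts
    chosen = probs.argmax(axis=-1)                  # one expert per token
    out = np.empty_like(tokens)
    for i, e in enumerate(chosen):
        # Only the chosen expert's parameters are touched for this token,
        # scaled by the gate probability.
        out[i] = probs[i, e] * (tokens[i] @ experts[e])
    return out, chosen

tokens = rng.standard_normal((n_tokens, d_model))
out, chosen = switch_layer(tokens)
```

The key point the sketch illustrates: although all four experts' parameters exist in memory, each token's forward pass uses only one of them, which is how a model's parameter count can grow far faster than its per-token compute cost.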

To quote Tristan Greene of TNW: "Put simply, the Brain team (the model's researchers) has figured out a way to make the model itself as simple as possible while squeezing in as much raw compute power as possible to make the increased number of parameters possible.”

Source: Google

Applying the Switch Transformer afforded the developers an over 7x speedup without having to exhaust exorbitant computational resources. In one test where a Switch Transformer model was trained to translate between more than 100 different languages, the researchers observed "a universal improvement" across 101 languages, with 91% of the languages benefitting from an over 4x speedup compared with a baseline model.

To tie this all together, Google's new language model is a laudable achievement in computational linguistics and AI, but should you expect to witness this force in action during your next exchange with an AI-powered chatbot?


Not quite. 

What It Isn’t

To the disappointment of conversational AI enthusiasts, this language model isn't suitable for real-world or business-setting scenarios; training at this scale on public data means the language model absorbed the biases ingrained in that data. Therefore, it is highly likely that when used "in the wild", human-to-language-model exchanges could turn sour, resulting in an ill-informed, or offended, user.

Furthermore, regulating and policing such an extensive cache of data is a tremendous challenge that opens the door to malignant perpetrators using the model to spread misinformation and sow chaos and discord. 

As it stands, this language model is geared towards academic study, and even if it were offered to the general public for sale, like OpenAI's GPT-3 (Generative Pre-trained Transformer 3) which preceded it, the design is inherently unsuited to customer-facing conversational AI. 

Although over one trillion parameters can give the impression that this language model was trained on all the data available online, it cannot automatically update itself, meaning Google's new model is only as good as the data it was fed (plentiful as it may be). One new variable, one new product, an update of a service, or a change in content can topple the entire house of cards and mislead a user seeking to accomplish a task. Furthermore, these models are not amenable to minor iterations, and judging by GPT-3's pricing model, any such adjustments will come with considerable costs. 

The Tip of the Iceberg

This past year has seen inspiring innovation in language models and conversational AI that would have been inconceivable only a decade ago. With every milestone reached and each obstacle overcome, the appetite for greater achievements grows steadily in the boardrooms of Google and OpenAI, which are locked in a language model arms race.

If and how these models will be put into widespread commercial use is still unclear, but their mere existence implies just how much focus, attention, and funds are currently being funneled towards conversational AI. The good news is that while we wait to see how this timeline progresses, there are already exceptional conversational AI solutions for businesses on the market, some fueled with technologies and computational linguistic acrobatics that are no less impressive than what Google and its giant tech peers are rolling out.

Want to learn more about language models, machine learning, and computational linguistics? Check out our other thought pieces from Hyro.

Ziv Gidron
January 29, 2021