Conversational Technologies - 4 min read
Ziv Gidron

Mythbusting Google’s New Trillion-Parameter AI Language Model

Earlier this month, Google announced that its researchers had developed and benchmarked techniques enabling them to train a language model containing more than one trillion parameters. If you're not sure what parameters are, we highly recommend that you check out our previously published blog post about OpenAI's language model, GPT-3. In the context of advanced language models, parameters are the values a model learns from historical training data, and they are what the model relies on to make its predictions. As a rule of thumb, more parameters tend to yield a more capable language model, which is why, when Google announced that its new language model contains over one trillion parameters, the conversational AI community was flush with excitement and prognostications.


So, in service to all those curious to learn more about Google's latest proclamation, we sifted through the available facts and research to distinguish between what it is and what it isn't.


What It Is

This new language model (as of now nameless) is a roaring accomplishment because of the techniques its developers employed to train it. The model contains a staggering 1.6 trillion parameters (the most to date), and it trained up to 4x faster than T5-XXL, previously the largest language model developed by Google.


Training T5, Google’s previous language model. Source: Google AI


A paper released by the language model's researchers states that large-scale training is still one of the most effective paths toward powerful models. However, these mammoth language models are few and far between because such massive-scale training is computationally intensive, often prohibitively so.


To tackle this obstacle, the developers implemented a technique called the Switch Transformer, which activates only a subset of the model's parameters for any given input. The Switch Transformer is based on an AI paradigm first introduced in the early '90s called Mixture of Experts. These experts (or learners) are not of the Homo sapiens variety but are instead smaller neural networks and machine learning models. In the simplest terms possible, these models, each specializing in a particular kind of input, live within a larger "mothership" model and are orchestrated by a "gating network" that determines which experts to consult for a given input.
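To make the gating idea concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions and not Google's implementation: the `SwitchLayer` class, its random weights, and the tiny dimensions are all hypothetical, and each "expert" is reduced to a single weight vector where a real expert would be a full feed-forward network. What it does show is the core routing step: the gating network scores every expert, and only the top-scoring one processes the token.

```python
import math
import random


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class SwitchLayer:
    """Toy layer (hypothetical): a gating network sends each token to one expert."""

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)
        # Gating network: one score vector per expert (random, for illustration).
        self.gate = [[rng.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(num_experts)]
        # Each expert is a single weight vector here, purely for brevity.
        self.experts = [[rng.uniform(-1, 1) for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, token):
        """Return the index of the highest-probability expert and its probability."""
        probs = softmax([dot(weights, token) for weights in self.gate])
        best = max(range(len(probs)), key=probs.__getitem__)
        return best, probs[best]

    def forward(self, token):
        """Run the token through the chosen expert only; the others stay idle."""
        expert_id, gate_prob = self.route(token)
        weights = self.experts[expert_id]
        # Scale the expert's output by the gate probability.
        output = [gate_prob * w * x for w, x in zip(weights, token)]
        return expert_id, output


layer = SwitchLayer(dim=4, num_experts=8)
expert_id, output = layer.forward([0.5, -1.0, 0.25, 2.0])
print(f"token routed to expert {expert_id}")
```

All eight experts exist in memory, but each token touches only one of them. That is the sense in which the model uses just a subset of its parameters at a time.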


To quote Tristan Greene of TNW: "Put simply, the Brain team (the model's researchers) has figured out a way to make the model itself as simple as possible while squeezing in as much raw compute power as possible to make the increased number of parameters possible."


Source: Google


Applying the Switch Transformer gave the developers a speedup of over 7x without requiring exorbitant computational resources. In one test where a Switch Transformer model was trained to translate between over 100 different languages, the researchers observed "a universal improvement" across 101 languages, with 91% of the languages benefitting from an over 4x speedup compared with a baseline model.
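A bit of back-of-the-envelope arithmetic helps explain how parameter count can balloon while the work done per input stays roughly flat. The sketch below uses made-up layer sizes (`d_model`, `d_ff`, and `num_experts` are illustrative assumptions, not figures from Google's model) and compares how many weights a single sparse layer stores with how many it actually uses for one token.

```python
def ffn_params(d_model, d_ff):
    """Weights in one feed-forward block: an up- and a down-projection (biases ignored)."""
    return 2 * d_model * d_ff


def sparse_layer_counts(d_model, d_ff, num_experts):
    """Return (weights stored, weights used per token) for one top-1 routed layer."""
    per_expert = ffn_params(d_model, d_ff)
    router = d_model * num_experts  # the gating network's own weights
    total = num_experts * per_expert + router
    active = per_expert + router  # one expert plus the router
    return total, active


# Hypothetical sizes, chosen only to illustrate the ratio.
total, active = sparse_layer_counts(d_model=1024, d_ff=4096, num_experts=64)
print(f"stored: {total:,}  used per token: {active:,}  ratio: {total / active:.1f}x")
```

Adding experts multiplies the stored parameters, while the per-token cost grows only by the small router term. This toy count ignores real-world overheads such as moving tokens between devices.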


To tie this all together, Google's new language model is a laudable achievement in computational linguistics and AI, but should you expect to witness this force in action during your next exchange with an AI-powered chatbot?

 

Not quite. 


What It Isn’t

To the disappointment of conversational AI enthusiasts, this language model isn't suitable for real-world or business-setting scenarios. A model of this scale has to be trained on vast amounts of public data, and it absorbs the biases ingrained in that data. Therefore, it is highly likely that when used "in the wild", human-to-model interactions could turn sour, resulting in an ill-informed, or offended, user.


Furthermore, regulating and policing such an extensive cache of data is a tremendous challenge, one that opens the door to malicious actors using the model to spread misinformation and sow chaos and discord.


As it stands, this language model is geared towards academic study. Even if it were offered to the general public for sale, like OpenAI's GPT-3 (Generative Pre-trained Transformer 3), which preceded it, its design is inherently ill-suited to customer-facing conversational AI.


Although over one trillion parameters can give the impression that this language model was trained on all the data available online, it cannot automatically update itself, meaning Google's new model is only as good as the data it was fed (plentiful as it may be). One variable, one new product, an update of service, or a change in content can topple the entire house of cards and mislead or misguide a user seeking to accomplish a task. Furthermore, these models do not accommodate small, incremental changes, and judging by GPT-3's pricing model, any such adjustments will come with considerable costs.


The Tip of the Iceberg

This past year has seen inspiring innovation in language models and conversational AI that would have been inconceivable only a decade ago. With every milestone reached and each obstacle overcome, the appetite for more outstanding achievements grows steadily in the boardrooms of tech giants such as Google and OpenAI (which are locked in a language model arms race).


Whether and how these models will be put into widespread commercial use is still unclear, but their mere existence shows just how much focus, attention, and funding is currently being funneled towards conversational AI. The good news is that while we wait to see how this timeline progresses, there are already exceptional conversational AI solutions for businesses on the market, some fueled by technologies and computational linguistic acrobatics that are no less impressive than what Google and its tech giant peers are rolling out.





Ziv Gidron
January 29, 2021