2024 Prediction – Society Will Arrive At An Inflection Point in AI Advancement.
Posted on | December 21, 2023 | 4 Comments
Mike Magee
For my parents, March, 1965 was a banner month. First, that was the month that NASA launched the Gemini program, unleashing “transformative capabilities and cutting-edge technologies that paved the way for not only Apollo, but the achievements of the space shuttle, building the International Space Station and setting the stage for human exploration of Mars.” It also was the last month that either of them took a puff of their favored cigarette brand – L&M’s.
They are long gone, but the words “Gemini” and the L’s and the M’s have taken on new meaning and relevance now six decades later.
The name Gemini reemerged with great fanfare on December 6, 2023, when Google CEO Sundar Pichai introduced “Gemini: our largest and most capable AI model.” Embedded in the announcement were the L’s and the M’s, as we see here: “From natural image, audio and video understanding to mathematical reasoning, Gemini’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.”
Google’s announcement also offered a head-to-head comparison with GPT-4 (Generative Pretrained Transformer 4). It is the product of a non-profit initiative, OpenAI, and was released on March 14, 2023. Microsoft’s AI search engine, Bing, helpfully informs us that “OpenAI is a research organization that aims to create artificial general intelligence (AGI) that can benefit all of humanity…They have created models such as Generative Pretrained Transformers (GPT) which can understand and generate text or code, and DALL-E, which can generate and edit images given a text description.”
While “Bing” goes all the way back to a Steve Ballmer announcement on May 28, 2009, it was nearly 14 years later, on February 7, 2023, that the company announced a major overhaul. One month after that, Microsoft was able to broadcast that Bing (by leveraging an agreement with OpenAI) now had more than 100 million users.
Which brings us back to the other LLM (large language model), GPT-4, which the Gemini announcement pits head-to-head against its new offering. Google embraces text, image, video, and audio comparisons, and declares Gemini superior to the OpenAI/Microsoft GPT-4.
Mark Minevich, a “highly regarded and trusted Digital Cognitive Strategist,” writing this month in Forbes, seems to agree: “Google rocked the technology world with the unveiling of Gemini – an artificial intelligence system representing their most significant leap in AI capabilities. Hailed as a potential game-changer across industries, Gemini combines data types like never before to unlock new possibilities in machine learning… Its multimodal nature builds on yet goes far beyond predecessors like GPT-3.5 and GPT-4 in its ability to understand our complex world dynamically.”
Expect to hear the word “multimodality” repeatedly, and with emphasis, in 2024. But academics will be quick to remind us that its origins can be traced all the way back to 1952 scholarly debates about “discourse analysis”, at a time when my Mom and Dad were still puffing on their L&M’s. Language and communication experts at the time recognized “a major shift from analyzing language, or mono-mode, to dealing with multi-mode meaning making practices such as: music, body language, facial expressions, images, architecture, and a great variety of communicative modes.”
Minevich believes that “With Gemini’s launch, society has arrived at an inflection point with AI advancement.” Powerhouse consulting group, BCG (Boston Consulting Group), definitely agrees. They’ve upgraded their L&M’s, with a new acronym, LMM, standing for “large multimodal model.” Leonid Zhukov, Ph.D, director of the BCG Global AI Institute, believes “LMMs have the potential to become the brains of autonomous agents—which don’t just sense but also act on their environment—in the next 3 to 5 years. This could pave the way for fully automated workflows.”
BCG predicts an explosion of activity among its corporate clients focused on labor productivity, personalized customer experiences, and accelerated R&D, especially in the sciences. But they also see high-volume consumer engagement generating content, new ideas, efficiency gains, and tailored personal experiences.
This seems to be BCG talk for “You ain’t seen nothing yet.” In 2024, they say all eyes are on “autonomous agents.” As they describe what’s coming next: “Autonomous agents are, in effect, dynamic systems that can both sense and act on their environment. In other words, with stand-alone LLMs, you have access to a powerful brain; autonomous agents add arms and legs.”
This kind of talk is making a whole bunch of people nervous. Most have already heard Elon Musk’s famous 2018 quote, “Mark my words, AI is far more dangerous than nukes. I am really quite close to the cutting edge in AI, and it scares the hell out of me.” BCG acknowledges as much, saying, “Using AI, which generates as much hope as it does horror, therefore poses a conundrum for business… Maintaining human control is central to responsible AI; the risks of AI failures are greatest when timely human intervention isn’t possible. It also demands tempering business performance with safety, security, and fairness… scientists usually focus on the technical challenge of building goodness and fairness into AI, which, logically, is impossible to accomplish unless all humans are good and fair.”
Expect in 2024 to see, once again, the worn-out phrase “Three Pillars”. This time it will be attached to LMM AI, and it will advocate for three forms of “license” to operate:
- Legal license – “regulatory permits and statutory obligations.”
- Economic license – ROI to shareholders and executives.
- Social license – a social contract delivering transparency, equity and justice to society.
BCG suggests that trust will be the core challenge, and that technology is tricky. We’ve been there before. The 1964 Surgeon General’s report knocked the socks off tobacco company execs who thought high-tech filters would shield them from liability. The government report burst that bubble by stating, “Cigarette smoking is a health hazard of sufficient importance in the United States to warrant appropriate remedial action.” Then came Gemini 6A’s first attempt to launch on December 12, 1965. It was cancelled when its fuel igniter failed.
Generative AI-driven LMMs will “likely be transformative,” but clearly they will also have their ups and downs. As BCG cautions, “Trust is critical for social acceptance, especially in cases where AI can act independent of human supervision and have an impact on human lives.”
Comments
4 Responses to “2024 Prediction – Society Will Arrive At An Inflection Point in AI Advancement.”
December 28th, 2023 @ 2:07 pm
Pretty wild stuff, Mike. Love how you are staying current and fully alive as a human being!
December 30th, 2023 @ 5:12 pm
Thanks, Jack. Happy New Year to you and yours. Best, Mike
December 31st, 2023 @ 12:06 am
Hey Mike. Wow, that is a truly far-ranging article. There is so much going on there that I don’t know where to start a response. I guess I must acknowledge that you have become an author, historian, AI futurist, and medical industrial complex “tour de force”. Somehow I missed all that potential when we were staggering back to the dorm from Wanda’s bar or Stampalia’s (sp?) restaurant. But I always saw a good friend who helped me tremendously to emerge from my protective shell and social fear.
So Happy New Year, my friend, to you and Patricia and all of your progeny. May 2024 be the best year yet for all of us.
December 31st, 2023 @ 9:44 am
Thanks, Larry. The years fly by, but you and I remain equally curious in so many good ways…and true friends! Happy New Year to you and your family! Mike