A closer look at the hype and harms of generative AI
Generative AI, touted as revolutionary, may not live up to its promises.
“Artificial intelligence” has captured the global imagination. Its promoters claim that generative AI (contracted here to AInt) is as revolutionary as the internet itself.
Strip away the hype and AInt not only fails to do what it is purported to do; what it does do comes with a great number of negative effects, including on the environment.
The underlying datasets are often non-inclusive and will likely widen the already frightening divide between the global North and South, while the uncritical and hasty deployment of the technology has outpaced thoughtful regulation.
And yet it is touted as the best thing since sliced bread.
Hysterical predictions, like that of US financial services company Goldman Sachs, which estimates that AInt “could boost global labour productivity by more than 1 percentage point a year in the decade following widespread usage”, have helped ensure the major beneficiaries of the AI arms race remain the same handful of US-based companies that are already insanely rich.
Despite showing no profit, AInt has already received $170 billion (about R3.1 trillion) in funding from the big four tech companies: Apple, Facebook/Meta, Google/Alphabet and Microsoft.
But what is all this excitement about? At the simplest level, AInt is nothing but a very fancy “auto-complete” using statistical modelling.
What propels AInt far beyond the level of a rudimentary chatbot are the extraordinarily large datasets utilised, together with the data structures inside the programs. Data structures are things like spreadsheets, lists or binary trees, which help manipulate the datasets – and the datasets, in most instances, are scraped from the internet.
There is nothing particularly new about the underlying statistical models. We use auto-complete daily: when we type letters into our smartphones, auto-complete suggests what the rest of the word we have started might be, based on a statistical model of our language combined with other factors, such as where we are located.
The software on your phone contains large statistical tables which show that the most likely word which begins “theat” in English is “theatre” – although in the American dialect of English, that most likely word is instead “theater”. Thus, the software gives the possible auto-complete “theatre” after you’ve typed “theat” – or it gives “theater” if you’re doing this typing in New York.
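To make this concrete, here is a minimal sketch, in Python, of such a table-driven auto-complete. The frequency counts and locale labels below are invented for illustration; a real keyboard ships with vastly larger tables built from actual language data.

```python
# Toy, table-driven auto-complete: look up every known word that starts
# with the typed prefix and return the most frequent one for the locale.
# The counts here are invented, not real language statistics.
COUNTS = {
    "en-GB": {"theatre": 9120, "theater": 310, "theatrical": 1480},
    "en-US": {"theater": 8740, "theatre": 1050, "theatrical": 1390},
}

def suggest(prefix: str, locale: str) -> str:
    """Return the most frequent word starting with `prefix` for a locale."""
    candidates = {w: n for w, n in COUNTS[locale].items() if w.startswith(prefix)}
    return max(candidates, key=candidates.get)

print(suggest("theat", "en-GB"))  # theatre
print(suggest("theat", "en-US"))  # theater
```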
What shocked the world about a year ago, when ChatGPT was made public, is how immensely this fancy auto-complete had improved, thanks to changes in the software programs and the underlying datasets.
The statistical tables were effectively replaced by other ways of storing and organising data called “neural networks” (in an extremely poor analogy with how neurons are organised in animals).
These data structures can have dozens of layers and millions, or even hundreds of millions, of parameters describing how new data should flow through the structure.
The parameter values are found after extremely costly computations using terabytes of “training data”, often scraped from the public internet in ways that might violate copyright law and almost certainly violate creators’ wishes for how their works will be used.
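The scale is easy to underestimate. A back-of-the-envelope sketch, using invented layer sizes for a plain fully connected network (real models are organised rather differently), shows how quickly the parameter count balloons:

```python
# How parameter counts balloon in layered networks. The width and depth
# below are hypothetical; the arithmetic, not the architecture, is the point.

def layer_params(d_in: int, d_out: int) -> int:
    """One dense layer: a d_in x d_out weight matrix plus d_out biases."""
    return d_in * d_out + d_out

width, depth = 2048, 24  # invented sizes for a toy network
total = depth * layer_params(width, width)
print(f"{total:,} parameters")  # 100,712,448 -- over 100 million
```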
When prompted, AInt does not use this neural net software with its millions of parameters to reason about the prompt, answer a question or summarise some given information: it merely strings together tokens – fragments of words – in sequences of relatively high probability, computed from those parameters.
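A toy version of that token-by-token assembly might look like the sketch below. The probability table is invented and hard-coded, standing in for what a trained network would compute from its parameters; notice that nothing resembling reasoning happens at any step.

```python
import random

# Invented next-token probabilities after a prompt such as "The capital of".
# In real systems these numbers come out of the trained network.
NEXT = {"France": 0.41, "the": 0.22, "city": 0.21, "a": 0.14, "Mars": 0.02}

def next_token(probs: dict) -> str:
    """Draw one token at random, weighted by the model's probabilities."""
    tokens = list(probs)
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

print(next_token(NEXT))  # usually "France", occasionally "Mars"
```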
Although all advanced technologies may seem like magic, it is always important to draw back the curtain. And this one falls short in a number of ways.
Even though it generates reasonable-looking blocks of text, these can be riddled with errors. Generative texts are synthetic language: they manufacture something that has plausibility, but in a fact- and reference-free zone.
There are already instances of ChatGPT documents generated by lawyers in which references have been produced that simply do not exist, and the legal community has not taken kindly to “alternative case law”.
AInt is based on material scraped from the ’net, which could be a copyright violation in itself. Legal experts disagree on this point, but we personally think that, even if making and distributing the models does not amount to direct copyright violation, there are most certainly distinct examples where the use of copyrighted works in AInt violates authors’ intentions.
Data scraped from the public internet reiterates the biases and injustices that are so rampant in this domain. Some have wryly, but not inaccurately, deemed the outputs of AInt as “mansplaining as a service”.
The fault lies within the training sets: Garbage In equates to Garbage Out, the so-called Gigo effect.
When the input data is fundamentally flawed – deliberately skewed to a particular world view, inadvertently lacking in diversity, too small, too geographically bound, limited to but a small subset of the teeming variation that is humanity, or focused only on those few languages for which a lot of texts are available – the output suffers accordingly.
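A tiny, deliberately rigged example makes the point. Train a toy word-prediction model on a skewed corpus (invented here) and it can only ever echo that skew back:

```python
from collections import Counter, defaultdict

# A deliberately skewed, invented training corpus.
CORPUS = ("the engineer said he was done . the engineer said he was late . "
          "the nurse said she was done . the nurse said she was late .")

# "Training": count which word follows each two-word context.
follows = defaultdict(Counter)
words = CORPUS.split()
for a, b, c in zip(words, words[1:], words[2:]):
    follows[(a, b)][c] += 1

# The model can only reproduce the skew it was fed.
print(follows[("engineer", "said")])  # Counter({'he': 2})
print(follows[("nurse", "said")])     # Counter({'she': 2})
```

Scale that up to terabytes of internet text and the same mechanism reproduces the internet’s biases at industrial volume.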
There are myriad examples where the use of AInt betrays no understanding whatsoever, the generated material being nothing like something written by a human with insight, empathy or pedagogical experience.
Some believe AInt will bring economic and scientific benefits so large that whichever superpower first chains this demon will quickly rise to overwhelming global dominance.
But we have to ask: who benefits from this current stock-market bubble?
And what if X user Maple Cocaine is right to wonder: “What if a computer was stupid?”
- Kathryn Kure and Jonathan Poritz are independent consultants