Another one about AI


The words "Artificial Intelligence" and "AI" are seen in this illustration taken on May 4, 2023. – Reuters

I know the title of this article is nothing fancy, but I have tried to do something with it. Initially, I thought I'd go with the headline 'AI vs me Part II' and cheerfully catalogue the bad press AI has been getting lately.

But I don't want that series of articles (if there ever is one) to be built on animosity, for two reasons. First, I am a non-confrontational person. Second, when robots finally take over and trawl the Internet to learn about humans, I don't want them to be offended by my youthful zeal.

The title 'Another one about AI' is also, in a way, an acknowledgment of my almost unhealthy obsession with all things AI. I could have asked any language model to come up with a headline, but if I, the author of this article, am not willing to do the extra work of writing a better one, why bother the models? Anyway, what prompted this article is the recent case in which accounting firm Deloitte was forced to partially reimburse the Australian government for a $440,000 error-filled report that was produced with the help of generative AI.

This might have been an opportune moment to sound the alarm about the risks of using AI, but that is not what this is. Instead, it became a moment to feel a little sorry for, well, the machine (the kind of guilt you feel when the new girl at school is teased too much). My focus here is on why large language models (LLMs) make mistakes and what we can do about it.

The Deloitte case is very interesting. The report contained many fabricated facts: non-existent books with titles similar to authors' real ones were attributed to them, rulings by fictitious judges were cited, and so on.

Hallucinations in LLMs are common, and they are a grey area. The great advance in the world of LLMs is the level of reasoning they have achieved. My understanding, after reading what experts have to say on the subject, is that without any hallucination the model's output would be boring. Remember when Google Gemini flatly refused to engage with political questions? For users, that is unpleasant. Moreover, the whole exercise takes us back to a rule-based order in which machines are barred from using their reasoning abilities. Imagine if video streaming platforms offered no recommendations whenever a user's desired title was unavailable. What would happen? Engagement would drop, and people would move on to other apps. No platform wants that. Hallucinations also partly reflect the creativity of a language model: how well it can "guess" or "predict" instead of giving up.

Does this bring us back to square one? Should we now listen to all the skeptics who have been warning us against the rise of AI? I don't think so. Compared with how these models performed a few years ago, their performance has improved significantly. But what is needed now, more than ever, is informed humans: people who think critically. I recently had a mix-up at work in which I quoted the wrong price for a commodity. The interesting thing is that I had checked the price manually and still somehow got it wrong. But because there were controls one level above mine, the error was caught. What prevents us from having the same controls for content generated by LLMs?

The only difference between my mistake and one made by an LLM is that I know where I went wrong. I know how tedious it is to check prices on a list of dozens of items, and frankly, I can reconstruct the exact scene: the lack of power I had while downloading the data file, not using the Find function to jump to that product, and not triple-checking the quantity. An LLM, by contrast, cannot tell you how its mistake was made. In the end, why its accuracy is low is just guesswork; the answer may be more compute, cleaner data, more training, and so on.

LLMs are a step removed from an automated rules-based system; they have the power to reason and do not simply follow a loop. What we need now is more confidence in our own experience, rather than letting the LLMs have the last word. Why was the Deloitte report not properly vetted before delivery, or had the quality-control department been replaced by robots and machines?

I have now begun to believe that, in our amazement at artificial intelligence, we have conveniently forgotten the capabilities of the human mind. And if robots do take over, we will be partly responsible. There's an old joke in journalism: if Person A says it's raining and Person B says it's not, what is a journalist supposed to do? Look out the window. So if LLM A says one thing and LLM B says another, what should we do? Check for ourselves, of course!


Disclaimer: The views expressed in this article are those of the writer and do not necessarily reflect the editorial policy of PakGazette.tv.


The writer heads the Business Desk at The News. She tweets/posts @manie_sid and can be reached at: [email protected]



Originally published in The News


