Is ChatGPT Losing its Edge?

Jaya Mahajanam
August 24, 2023

A couple of days ago, I stumbled upon the ChatGPT subreddit, and one post that piqued my interest read, “I use ChatGPT for hours every day and can say 100% it’s been nerfed over the last month or so.” Intrigued, I couldn’t resist going down the rabbit hole of the comment section, which was filled with both lovers and skeptics of AI’s capabilities. Users passionately debated whether the recent changes to ChatGPT’s performance were intentional or simply a matter of perception. Amid the fervent discussions, one thing was certain: ChatGPT’s impact on people’s lives is undeniable, igniting curiosity, controversy, and a quest for deeper understanding. As the AI landscape continues to evolve, conversations like these remind us of the profound influence AI technologies have on our daily interactions and the fascinating journey of human-machine collaboration.

In this article, I am going to explore the following:

Latest on ChatGPT
What Should Enterprises be Wary of?
Conclusion
Enterprise Automation with E42

What is Actually Happening with ChatGPT?

The underlying reason for these changes lies in the frequent updates OpenAI is making to its renowned product. These updates are sometimes seen as radical redesigns, driven by continuous attempts to jailbreak ChatGPT, a surge of legal actions, FTC (Federal Trade Commission) inquiries, and a decline in user engagement. Although some users have experienced faster response times, many have observed a notable decrease in the quality of responses, leading to concerns about overall performance.

The use of GPT-3.5 and GPT-4 has become widespread, with GPT-4 being updatable based on data, user feedback, and design changes. However, the lack of transparency in the update process raises concerns about stable integration into workflows and result reproducibility. Moreover, it is essential to understand whether updates aimed at improving certain aspects might inadvertently reduce capabilities in other dimensions.

To address these questions, Stanford conducted an evaluation of the March 2023 and June 2023 versions of GPT-3.5 and GPT-4. The research focused on evaluating their performance on eight LLM tasks commonly used in performance and safety benchmarks. These tasks include solving math problems (with two problem types), answering sensitive questions, responding to opinion surveys, running a LangChain agent, generating code, taking the USMLE medical exam, and visual reasoning.

One of the noteworthy findings from the research was that GPT-4’s accuracy in identifying prime numbers declined drastically, from 97.6% in March to a shocking 2.4% in June. Similarly, in code generation, the June versions made significantly more formatting mistakes than the March versions. These results shed light on the varying performance of ChatGPT on specific tasks over time and highlight the need for continuous monitoring and improvement.

This evaluation of GPT-3.5 and GPT-4 highlights the evolving nature of AI language models like ChatGPT. While the models show promise, their performance on specific tasks can vary from one version to the next. Transparency in the update process and proactive management of trade-offs are essential to optimize ChatGPT’s performance in the journey towards revolutionizing enterprise operations and customer experiences.
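
For readers who want to see what continuous monitoring could look like in practice, here is a minimal sketch in Python. It is not the method used in the evaluation described above. It scores a model on the prime-number task against a ground-truth check and flags a drop between two snapshots. The names `prime_accuracy`, `has_regressed`, `snapshot_a`, and `snapshot_b` are hypothetical, and the two snapshot functions are simple stand-ins; a real setup would wrap a call to the model being monitored.

```python
from typing import Callable, Iterable


def is_prime(n: int) -> bool:
    """Ground-truth primality check by trial division."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def prime_accuracy(ask_model: Callable[[int], bool], numbers: Iterable[int]) -> float:
    """Fraction of numbers for which the model's yes/no answer is correct."""
    numbers = list(numbers)
    correct = sum(ask_model(n) == is_prime(n) for n in numbers)
    return correct / len(numbers)


def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True when accuracy fell by more than the tolerance (an assumed threshold)."""
    return baseline - current > tolerance


# Hypothetical stand-ins for two model snapshots (assumptions, not real models).
def snapshot_a(n: int) -> bool:
    return is_prime(n)  # answers correctly every time


def snapshot_b(n: int) -> bool:
    return False  # always answers "not prime"


numbers = range(2, 102)
baseline = prime_accuracy(snapshot_a, numbers)
current = prime_accuracy(snapshot_b, numbers)
print(f"baseline={baseline:.2f} current={current:.2f} "
      f"regressed={has_regressed(baseline, current)}")
```

Running the same fixed test set against every new model version, and alerting when `has_regressed` returns true, is one simple way for a team to notice this kind of drift before it reaches a production workflow.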

What Should Enterprises be Wary of?

Effective automation of intricate enterprise processes with limited data sets requires specialized AI tools designed to interact with internal systems, manage information through APIs, and execute actions based on collected data. However, it’s important to acknowledge the limitations of ChatGPT in this context. While ChatGPT can be valuable for tasks such as content creation and software development, its capacity for automating enterprise processes is constrained, particularly by data security compliance requirements and the questionable accuracy of its results.

ChatGPT has been known to produce ‘hallucinations’, which are incorrect or misleading statements

These hallucinations can be caused by several factors, including the size and quality of the training dataset, the optimization process used during training, and the input context. The hallucination problem can have a number of negative impacts on enterprises that use ChatGPT. For example, a study by the University of California, Berkeley found that businesses that use ChatGPT to generate marketing copy are more likely to be sued for false advertising. Additionally, a study by the Stanford University School of Medicine found that businesses that use ChatGPT to make decisions about medical treatments are more likely to make mistakes that could harm patients.

ChatGPT ushers in a host of security concerns that necessitate stringent precautions

ChatGPT’s ability to handle sensitive data and its propensity to generate contextually coherent responses underscore the need for comprehensive security measures. One significant apprehension involves the potential disclosure of confidential information through unintended prompts. To mitigate these concerns, enterprises must adopt a multi-faceted approach. First and foremost, robust encryption protocols must be employed to safeguard data both at rest and in transit. Role-based access controls are imperative to limit system access only to authorized personnel. Additionally, regular security audits, vulnerability assessments, and penetration testing can proactively identify and address potential weaknesses.
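
As a rough illustration of two of the precautions above, the Python sketch below restricts who may send prompts and masks obviously sensitive strings before a prompt leaves the enterprise. Everything in it is an assumption made for illustration: the role names, the `prepare_prompt` and `redact` helpers, and the two regular expressions, which are far from a complete detector of confidential data. Encryption at rest and in transit is not shown.

```python
import re

# Illustrative patterns only; real deployments need much broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13 to 16 digits, optional separators

# Hypothetical roles allowed to use the chatbot integration.
ALLOWED_ROLES = {"analyst", "admin"}


def redact(prompt: str) -> str:
    """Replace email addresses and card-like numbers with placeholders."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    prompt = CARD.sub("[CARD]", prompt)
    return prompt


def prepare_prompt(user_role: str, prompt: str) -> str:
    """Apply a role-based access check, then redact the prompt."""
    if user_role not in ALLOWED_ROLES:
        raise PermissionError(f"role '{user_role}' may not send prompts")
    return redact(prompt)


safe = prepare_prompt("analyst", "Card 4111 1111 1111 1111 for jane@example.com")
print(safe)
```

Placing a check like this in a gateway that every prompt passes through keeps the policy in one place, rather than relying on each user to remember what must not be pasted into a chat window.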

The adoption of ChatGPT and other large language models is subject to strict scrutiny by businesses and regulators

“Efforts to regulate AI appear to be gathering pace,” the World Economic Forum has stated. Data from Stanford University’s 2023 AI Index shows that 37 bills related to AI were passed into law throughout the world in 2022. Even India is now contemplating a regulatory structure for artificial intelligence (AI) technology and tools such as ChatGPT. IT Minister Ashwini Vaishnaw recently announced that the Indian government is evaluating the establishment of an AI regulatory framework covering aspects such as algorithmic bias and copyright concerns. He suggested that AI regulations will likely take a cooperative global approach, like the efforts seen in the EU and China, indicating the widespread significance of establishing a comprehensive AI regulatory framework.

Even independent, nonpartisan organizations like the Center for AI and Digital Policy (CAIDP) are stepping into this arena, advocating for responsible and secure AI development. CAIDP’s recent action exemplifies this commitment: the organization submitted a formal complaint to the Federal Trade Commission (FTC), urging an investigation into OpenAI’s GPT models. CAIDP’s objective is clear: to ensure that necessary safety measures are in place before the release of such models. This call aligns with both the FTC’s established AI product guidelines and the evolving global standards for AI governance.

As the AI landscape gains momentum, organizations like CAIDP underscore the need for robust AI governance. This further accentuates the challenges, highlighted earlier, that enterprises face in integrating complex AI systems like ChatGPT into their operations while navigating intricate regulatory requirements. What does this mean for enterprises? Finding trustworthy vendors, or in all likelihood certified certifiers, will be essential not only to complying with the new laws, but also to signaling to consumers that their products are as safe as possible.

Conclusion

ChatGPT, a formidable language model with billions of parameters, reveals its limitations when applied to small datasets. Consequently, its capabilities are limited when it comes to extending intelligent process automation beyond conversational AI. In contrast, AI co-workers equipped with Cognitive Process Automation (CPA) capabilities leverage AI models to comprehend specific enterprise datasets. They adeptly manage information from assorted structured and unstructured sources, make instant decisions, and execute tasks seamlessly.

Make your Enterprise Intelligent with E42!

E42 is a no-code Cognitive Process Automation (CPA) platform for creating AI co-workers that automate business processes across functions at scale. Each AI co-worker can be customized with specific features to address particular problem areas in any industry or vertical. At the core of every AI co-worker’s configuration is the ability to think like humans, understand user sentiments, take action based on those sentiments, and learn from every interaction. AI co-workers can be used independently to automate specific processes, or they can be combined to provide process-agnostic automation to an enterprise. To start your automation journey, get in touch with us at interact@e42.ai today!
