The world was still discovering the wonders of OpenAI’s ChatGPT (Chat Generative Pre-trained Transformer) 3.5 when the artificial intelligence company released its successor, calling it “the latest milestone in OpenAI’s effort in scaling up deep learning.”
GPT-4 can process both image and text inputs and generate text output, a significant advance over its predecessor. It has been reported to achieve “human-level” performance on various professional and academic evaluations.
“GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks.”
OpenAI says GPT-4 is more reliable, more creative, and able to handle much more nuanced instructions than GPT-3.5. To reach this level of language processing capability, the company rebuilt its deep learning stack and, together with Azure, co-designed a supercomputer to accommodate its workload.
Let’s dive in and understand what makes the latest version of ChatGPT superior to its predecessor, and what the limitations of the AI-based chatbot are.
The biggest improvement in the latest version is its ability to interpret image and text inputs together and generate human-like responses, which is useful across a range of tasks. For example, if a user sends an image of the inside of a closet, GPT-4 will not only recognize the different garments, but will also suggest combinations in which they can be worn.
The company illustrated GPT-4’s image processing capabilities on its website: “GPT-4 can accept a prompt of text and images, which, parallel to the text-only setting, lets the user specify any vision or language task. Specifically, it generates text outputs (natural language, code, etc.) given image inputs.” However, OpenAI also noted that image inputs are still a research preview and not publicly available.

Most AI chatbots can understand and generate responses in English, but GPT-4 has broken the language barrier with the ability to produce results in 26 languages, even though, as OpenAI notes, “many existing ML benchmarks are written in English.”
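To make the multimodal workflow concrete, here is a minimal sketch of how a text-plus-image request could be assembled in the style of the OpenAI Chat Completions API. The model name and image URL are illustrative placeholders, no network call is made, and the exact request shape is an assumption based on the API’s published message format (image input required special access while in research preview).

```python
def build_vision_request(prompt: str, image_url: str) -> dict:
    """Combine a text prompt and an image reference into one request body.

    Sketch only: the payload mirrors the Chat Completions message format,
    where a user message's content can be a list of text and image parts.
    """
    return {
        "model": "gpt-4",  # placeholder model identifier
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


# Example mirroring the closet scenario from the article; the URL is fictional.
request = build_vision_request(
    "List outfit combinations from the garments in this photo.",
    "https://example.com/closet.jpg",
)
print(request["messages"][0]["content"][1]["type"])  # prints "image_url"
```

In practice this dictionary would be sent by an API client with valid credentials; the point here is only that one request can interleave text and image parts.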
The company also mentioned a feature called “steerability,” which refers to ChatGPT’s ability to respect and behave according to the user’s commands and directions. “Rather than the classic ChatGPT personality with a fixed verbosity, tone, and style, developers (and soon ChatGPT users) can now prescribe their AI’s style and task by describing those directions in the ‘system’ message. System messages allow API users to significantly customize their users’ experience within bounds,” said OpenAI.
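The “system” message described above can be sketched as follows. This builds a request body in the style of the Chat Completions API, with the developer’s style instructions placed in the system message ahead of the user’s prompt; no request is actually sent, and the model name and example prompts are illustrative assumptions.

```python
def build_steered_request(system_style: str, user_prompt: str) -> dict:
    """Place the developer's style/task directions in the system message,
    so they steer how the model answers the user's prompt."""
    return {
        "model": "gpt-4",  # placeholder model identifier
        "messages": [
            {"role": "system", "content": system_style},
            {"role": "user", "content": user_prompt},
        ],
    }


# Hypothetical example: a terse, poetic tutor persona.
req = build_steered_request(
    "You are a terse Shakespearean tutor. Answer in rhymed couplets.",
    "Explain what a prime number is.",
)
print(req["messages"][0]["role"])  # prints "system"
```

Changing only the system message swaps the persona while the user prompt stays the same, which is the customization-within-bounds OpenAI describes.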
What are the limitations of ChatGPT-4? While the advanced version offers a host of improved features, some limitations still need to be worked on. Hallucinations: despite its capabilities, GPT-4 has limitations similar to earlier GPT models. Most importantly, it is still not fully reliable: it “hallucinates” facts and makes errors in reasoning. While this remains a real problem, GPT-4 significantly reduces hallucinations compared with previous models, which have themselves improved with each iteration. According to OpenAI, “GPT-4 scores 40% higher than our latest GPT-3.5 on our internal adversarial factuality evaluations.”
Despite all the benefits of the latest version of ChatGPT, OpenAI CEO Sam Altman cautioned that GPT-4 “is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.”