ChatGPT will soon have speech and vision

Posted On: September 26, 2023

The interface that popularized generative artificial intelligence (capable of producing text, images and other content in everyday language upon simple request) will soon be able to process requests that include images and to converse aloud with its users.

For example, Internet users will be able to take a photo of a monument and have a conversation with ChatGPT to learn about the building's history, or show the software what's in their fridge so it can offer them a recipe, OpenAI suggests in a press release.

These new tools will be rolled out over the next two weeks to subscribers to the paid ChatGPT Plus service and to organizations that are customers of the service.

The company announced the impending addition of such features last March when it unveiled GPT-4, the latest version of its language model.

GPT-4 is multimodal, in the sense that it can process data other than text or computer code.

Risks of hallucinations still present

The success of ChatGPT since the end of 2022 has sparked a major race in generative artificial intelligence among the technology giants, with Google and Microsoft in the lead.

However, the rapid introduction of these still very poorly regulated programs is a cause for great concern, especially since they tend to hallucinate, that is, to invent answers from scratch.

Vision-enabled models present new challenges, OpenAI acknowledges in a press release. Among them, the company cites the hallucinations the models may produce, but also the risk of users trusting the model's interpretation of images in areas where the stakes are high.

The start-up says it has tested the model on topics such as extremism and scientific knowledge, and that it is relying on real-world use and feedback from Internet users to improve it.

It has also limited ChatGPT's ability to analyze people, because the interface is not always accurate and these systems must respect individuals' privacy.

Spotify joins forces with OpenAI

The streaming platform Spotify also announced a partnership with OpenAI on Monday to translate podcasts directly using artificial intelligence.

Programs recorded in English are now also available in other languages while retaining the speaker's distinctive vocal characteristics, the service said in a statement.

The Swedish company says that the new speech generation technology from OpenAI reproduces the style of the original speaker, enabling a more authentic, personal and natural listening experience than traditional dubbing.