OpenAI rolled out the Advanced Voice Mode with Vision feature in ChatGPT on Thursday. The feature, which lets the artificial intelligence (AI) chatbot access the smartphone's camera to capture visual information about the user's surroundings, will be available to all ChatGPT Plus, Team and Pro subscribers. The feature draws on the capabilities of GPT-4o and can provide real-time voice responses about what the camera is showing. Vision in ChatGPT was first unveiled in May during the company's Spring Updates event.
ChatGPT Gets Vision Capabilities
The new ChatGPT feature was rolled out on day six of OpenAI's 12-day feature release schedule. The AI firm has so far released the full version of the o1 model, the Sora video generation model, and a new Canvas tool. Now, with Advanced Voice Mode with Vision, users can let the AI see their surroundings and ask questions based on them.
Just in time for the holidays, video and screensharing are now starting to roll out in Advanced Voice in the ChatGPT mobile app. pic.twitter.com/HFHX2E33S8
— OpenAI (@OpenAI) December 12, 2024
In a demonstration, OpenAI team members interacted with the chatbot with the camera on and introduced several people to it. The AI could then answer quiz questions about those people even when they were no longer on screen. This shows that the vision mode also comes with memory, although the company did not specify how long the memory lasts.
Users can use the ChatGPT vision feature to show the AI their fridge and ask for recipes, or show it their wardrobe and ask for outfit recommendations. They can also show the AI a landmark outside and ask questions about it. This feature is paired with the chatbot's low-latency, emotive Advanced Voice mode, making it easier for users to interact in natural language.
Once the feature rolls out to users, they can open the ChatGPT mobile app and tap the Advanced Voice icon. The new interface shows a video option; tapping it gives the AI access to the user's camera feed. A Screenshare feature is also available and can be accessed by tapping the three-dot menu.
The Screenshare feature lets the AI see the user's device screen, including any app or screen they navigate to. This way, the chatbot can also help users with smartphone-related issues and queries. Notably, OpenAI said that all Team subscribers will get access to the feature within the next week in the latest version of the ChatGPT mobile app.
Most Plus and Pro users will also get the feature. However, users in the European Union, Switzerland, Iceland, Norway, and Liechtenstein will not get it at present. Enterprise and Edu users, meanwhile, will get access to ChatGPT's Advanced Voice with Vision in early 2025.