OpenAI has launched voice and image capabilities in ChatGPT, giving it a more intuitive interface: you can now hold spoken conversations with it and show it what you are talking about.
Voice Conversations with ChatGPT
ChatGPT now offers voice conversations. You can converse with your assistant, ask for bedtime stories, or have debates.
To start, open Settings → New Features in the mobile app and opt in to voice conversations. Then tap the headphone icon and choose one of five voices.
The feature is powered by a new text-to-speech model, which generates lifelike audio from text, and by Whisper, OpenAI's open-source speech recognition system, which transcribes your spoken words into text.
OpenAI has collaborated with professional voice actors for this and is actively working to mitigate potential risks like impersonation or fraud.
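The voice flow described above (speech is transcribed to text, the assistant produces a text reply, and the reply is converted back to audio) can be sketched as a three-stage pipeline. This is a minimal illustration only, not OpenAI's implementation: the names `voice_turn` and `VoiceTurn` and the three stand-in stage functions are hypothetical, and the stand-ins just pass text through as bytes so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """Result of one spoken exchange (hypothetical container)."""
    transcript: str     # what the user said, as text
    reply_text: str     # the assistant's textual answer
    reply_audio: bytes  # the answer rendered as audio


def voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],   # speech recognition stage
    respond: Callable[[str], str],        # language model stage
    synthesize: Callable[[str], bytes],   # text-to-speech stage
) -> VoiceTurn:
    """Run one turn: audio -> text -> reply text -> reply audio."""
    transcript = transcribe(audio_in)
    reply_text = respond(transcript)
    reply_audio = synthesize(reply_text)
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in stages (assumptions for illustration; no real models involved).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


turn = voice_turn(
    b"tell me a bedtime story",
    fake_transcribe,
    fake_respond,
    fake_synthesize,
)
print(turn.reply_text)
```

Passing the stages in as functions keeps the sketch independent of any particular model. In a real system the transcription stage could be backed by Whisper; how the other two stages are implemented is not specified here.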
Discussing Images with ChatGPT
ChatGPT now supports image discussions. You can share images, analyze graphs, or use the drawing tool in the mobile app to highlight parts of an image.
To start, tap the photo button to capture or choose an image; on iOS and Android, tap the plus button first. You can discuss multiple images and guide your assistant using the drawing tool.
The image understanding feature uses multimodal GPT-3.5 and GPT-4 models to apply language reasoning to various images, including photos, screenshots, and documents.
However, OpenAI acknowledges challenges like misinterpretations and has conducted extensive testing for responsible usage.
The company also notes that while ChatGPT is useful for specialized topics, it has limitations and should not be relied on in high-risk situations without verification.
The model excels at English text but may struggle with other languages, especially those written in non-Roman scripts, so non-English users are advised to use it cautiously.
The voice and image features in ChatGPT will roll out to Plus and Enterprise users over the next two weeks. Voice will be available on iOS and Android, where you opt in through Settings, and images will be available on all platforms.
OpenAI also has plans to extend these capabilities to other user groups, including developers, in the near future.