Exciting news, fellow tech enthusiasts and chatbot aficionados! ChatGPT is about to get more versatile with two new features: voice chats and image-based queries. OpenAI, the company behind ChatGPT, is rolling out these enhancements, and they are set to change the way we interact with the chatbot.
Voice Chats: Talk to ChatGPT Like Never Before
Imagine having a real conversation with ChatGPT using just your voice. Well, now you can! Voice chats are coming to ChatGPT, and they’ll be initially available on Android and iOS devices. The feature will eventually expand to all platforms, but for now, Plus and Enterprise users get the first taste.
To dive into voice chats, you’ll need to opt in through the ChatGPT app’s settings menu. Simply navigate to ‘Settings’ and then ‘New Features.’ Once you’ve done that, you can tap the microphone button and select from five different voices. Yes, you read that right—five distinct voices to chat with. It’s like having a virtual voice actor at your disposal.
These voice conversations are powered by an impressive text-to-speech model that can transform plain text and a short sample of speech into remarkably lifelike audio. OpenAI even enlisted professional actors to craft these voices. On the flip side, OpenAI’s Whisper speech recognition system works its magic, converting your spoken words into text that ChatGPT can understand and respond to.
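For the curious, the round trip described above (speech in, text to ChatGPT, speech out) can be sketched as a three-stage pipeline. This is only an illustration of the flow, not OpenAI’s implementation: the VoiceTurn class, the stage names, and the stub functions are all hypothetical stand-ins so the example runs without any model or network access.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One voice exchange. All three stages are hypothetical placeholders."""
    transcribe: Callable[[bytes], str]        # speech recognition (the role Whisper plays)
    respond: Callable[[str], str]             # chat model that produces a text reply
    synthesize: Callable[[str, str], bytes]   # text-to-speech: (text, voice) -> audio

    def run(self, audio_in: bytes, voice: str) -> bytes:
        text_in = self.transcribe(audio_in)   # spoken words -> text
        reply = self.respond(text_in)         # text -> text reply
        return self.synthesize(reply, voice)  # text reply -> audio in the chosen voice


# Stub stages (assumptions for illustration only; no real models are called).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


turn = VoiceTurn(fake_transcribe, fake_respond, fake_synthesize)
audio_out = turn.run(b"why won't my grill start?", voice="voice-1")
print(audio_out)
```

Keeping each stage as a separate function mirrors the article’s description: the speech recognizer and the text-to-speech model sit on either side of the chatbot, which itself only ever sees and produces text.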
Image-Based Queries: ChatGPT Gets Visual
But wait, there’s more! ChatGPT is also getting a major boost in its image recognition capabilities. You can now show this AI marvel a picture and ask it questions based on what it sees. For example, you can snap a shot of your balky grill and ask ChatGPT why it won’t start. Or maybe you want to plan a meal using the contents of your fridge, so you send ChatGPT a pic and let it do the meal planning for you. It can even solve math problems if you show it the equations in a picture. It’s like having a digital detective for visual conundrums!
The technology behind this visual prowess is built on the GPT-3.5 and GPT-4 models. To put it into action, just tap the photo button in the app (on iOS or Android, tap the plus button first), and you can either take a new photo or select an existing one from your device. ChatGPT can handle multiple photos, and it even offers a handy drawing tool to highlight specific areas of an image for more precise queries.
Privacy and Safety First
Impressive as these new features are, OpenAI is aware of the potential for misuse. With voice technology, there’s a concern that bad actors could mimic real people’s voices and commit fraud. To reduce that risk, OpenAI is limiting the technology to voice conversations in ChatGPT for now and is working with select partners on a few other narrowly defined use cases.
Regarding image-based queries, OpenAI has taken privacy seriously. They’ve collaborated with organizations like Be My Eyes to ensure the technology respects individuals’ privacy, especially when people appear in images. OpenAI has also published a paper on the safety properties of this image-based functionality, known as GPT-4 with vision.
English Dominates the Image Game
While ChatGPT excels at understanding English text in images, it’s currently not as proficient in other languages, especially those using non-Roman scripts. Non-English users might want to hold off on using ChatGPT for text in images until further improvements are made.
Spotify’s Voice Translation: A Unique Collaboration
In a surprising twist, Spotify has teamed up with OpenAI to utilize the voice-based technology for an intriguing purpose: Voice Translation for podcasters. This innovative tool can translate podcasts into different languages while retaining the unique speech characteristics of the original speakers. English-based podcasts are getting the translation treatment, starting with Spanish versions of select shows, and French and German variants are on the horizon.
These updates represent a significant leap forward in the capabilities of ChatGPT, making it more versatile and useful than ever before. Whether you’re a tech geek, a language enthusiast, or just someone who loves chatting with AI, there’s a lot to look forward to with these new features. So, go ahead and opt into voice chats, explore image-based queries, and see where the future of conversational AI is taking us. The possibilities are as vast as your imagination!
Frequently Asked Questions (FAQs) about ChatGPT New Features
Q: What are the new features introduced in ChatGPT?
A: ChatGPT is introducing two exciting features: voice chats and image-based queries. Voice chats allow users to engage in voice conversations with the chatbot, while image-based queries enable users to ask questions based on images they provide.
Q: Who can access the new voice chat feature?
A: Initially, the voice chat feature will be available to Plus and Enterprise users. However, it is expected to expand to all users in the future.
Q: How do I enable voice conversations with ChatGPT?
A: To enable voice conversations, opt in through the ChatGPT app’s settings menu: go to ‘Settings,’ then ‘New Features.’ Once you’ve opted in, tap the microphone button and choose from five different voices.
Q: What powers the back-and-forth voice conversations with ChatGPT?
A: Voice conversations are powered by a cutting-edge text-to-speech model capable of generating human-like audio from text and a short sample of speech. OpenAI collaborated with professional actors to create five distinct voices.
Q: How does ChatGPT understand spoken words in voice chats?
A: OpenAI employs its Whisper speech recognition system, which converts users’ spoken words into text that ChatGPT can comprehend and respond to.
Q: What can I do with image-based queries in ChatGPT?
A: With image-based queries, you can show ChatGPT pictures and ask questions based on what’s in the images. For example, you can ask it to diagnose issues with your grill, plan a meal from the contents of your fridge, or even solve math problems from pictures of equations.
Q: What models power ChatGPT’s image recognition features?
A: ChatGPT uses GPT-3.5 and GPT-4 models to enhance its image recognition capabilities, allowing it to understand and respond to questions related to images.
Q: What safety precautions are in place for these new features?
A: OpenAI is aware of potential misuse, especially voice mimicking that could enable fraud. For that reason, it is limiting the voice technology to voice conversations in ChatGPT for now and working with select partners on a few other limited use cases. For image-based queries, OpenAI has collaborated with organizations like Be My Eyes to help the technology respect the privacy of people who appear in images.
Q: Is ChatGPT equally proficient in all languages for image-based queries?
A: ChatGPT performs exceptionally well with English text in images. However, it currently has limitations in understanding other languages, particularly those using non-Roman scripts.
Q: What is Spotify’s collaboration with OpenAI related to voice technology?
A: Spotify and OpenAI have partnered to introduce Voice Translation for podcasters. This tool can translate podcasts into different languages while retaining the unique speech characteristics of the original speakers. It initially focuses on English-based podcasts, with plans to expand to other languages.