OpenAI Leading the Revolution in Accessible AI

This week's release of OpenAI's GPT-4o has sent a wave of excitement through Click Creative Digital Agency. This groundbreaking update to the popular ChatGPT platform marks a significant leap forward in both the accessibility and the capabilities of large language models (LLMs).

GPT-4o is freely available to everyone, both online and through a dedicated desktop app. This democratisation of AI technology allows anyone, from students and hobbyists to entrepreneurs and established businesses, to utilise GPT-4o's advanced features. OpenAI, in a recent blog post, explained the "omni" in GPT-4o's name as signifying a step towards more natural human-computer interaction.

But what exactly makes GPT-4o so revolutionary? 

When can I have access? 

The multimodal marvel that is ‘4o’ boasts the ability to "reason across audio, vision, and text in real-time" (OpenAI, 2024). This groundbreaking technology is being rolled out progressively, with both free and paid users gaining access. While initial access within ‘Plus’ user accounts is limited to GPT-4o's text and image functionalities, the highly anticipated voice and video features are slated for a future release.

For web browser users of ChatGPT, accessing GPT-4o is a straightforward process. Simply log in and navigate to the drop-down menu in the top left corner. Users fortunate enough to receive the update will find GPT-4o as the default option, identified as OpenAI's "newest and most advanced model."

However, the rollout for mobile and desktop applications is currently proceeding at a more measured pace. As of this writing, GPT-4o remains unavailable on iOS or Android apps, and the recently launched Mac app is still in its early stages of deployment. For Windows users, a dedicated version is planned for "later this year" (OpenAI, 2024).

What makes it so exciting? 

OpenAI's GPT-4o demo, as seen in the video below, offered a glimpse into the future of AI interaction. The model's ability to engage in real-time conversational speech and vision-based interaction, allowing it to "see" and converse simultaneously, generated significant excitement. However, these functionalities will require a bit more time before widespread adoption.

Currently, developers like our Studio team have access to GPT-4o within the API as a text and vision model, which differs from the image-based capabilities available to free and paid users since launch.
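For a rough sense of what that text-and-vision API access looks like in practice, here is a minimal sketch (not our Studio team's actual integration) assuming the OpenAI Python SDK's chat-completions interface; the helper function name and the example image URL are hypothetical:

```python
import os

# Hypothetical helper: builds a chat-completions message list that pairs a
# text prompt with an image URL, using the multi-part content format from
# OpenAI's public documentation for vision-capable models.
def build_vision_messages(prompt: str, image_url: str) -> list[dict]:
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI  # requires `pip install openai`

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=build_vision_messages(
            "Describe this image in one sentence.",
            "https://example.com/photo.jpg",  # placeholder URL
        ),
    )
    print(response.choices[0].message.content)
```

The actual call only runs when an API key is present; the payload builder itself works offline, which keeps experimentation cheap before wiring it into a real workflow.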

Regarding the much-anticipated voice features, OpenAI plans to "roll out a new alpha version of Voice Mode with GPT-4o within ChatGPT Plus in the coming weeks" (OpenAI, 2024). Additionally, they intend to "launch support for GPT-4o's new audio and video capabilities to a select group of trusted partners in the API in the coming weeks" (OpenAI, 2024).

The Future is “Open”

This measured rollout strategy, with some of GPT-4o's most captivating features initially restricted to testers and developers among paid users, is entirely understandable. The complex technology powering OpenAI's demos likely necessitates significant processing power, and a wider launch may take time. Nonetheless, the unveiling of GPT-4o presents a significant leap forward in AI accessibility and paves the way for a future filled with enhanced human-computer interaction.

While there are still ethical considerations surrounding the use of LLMs, OpenAI's commitment to open access signifies a shift towards a future where everyone can benefit from the power of AI. As GPT-4o continues to develop, it will be fascinating to see what new possibilities it unlocks for both the Click digital agency Melbourne team and our clients.


Let's talk AI!