OpenAI’s Image Explosion, Grok Gets Vision, Firefly Takes Flight


AI Highlights
My top-3 picks of AI news this week.
OpenAI
1. OpenAI’s Image Explosion
OpenAI has made its powerful image generation capabilities available to developers via the API with their new gpt-image-1 model, following the enormous success of the feature in ChatGPT.
Massive adoption: The image generation feature in ChatGPT saw over 130 million users create more than 700 million images in just the first week after launch.
Enterprise integration: Major creative platforms like Adobe, Figma, Canva, and Wix are already implementing the API for professional design workflows.
Flexible pricing: Usage is charged per token with separate rates for text input ($5/1M tokens), image input ($10/1M tokens), and image output ($40/1M tokens), translating to roughly $0.02-$0.19 per generated image.
Alex’s take: It’s clear AI image generation is moving beyond mere novelty. I remember when the first version of DALL-E was released at the beginning of 2021. It’s incredible to see image generation now becoming an essential part of creative workflows, especially given how frictionless it is to turn ideas into reality.
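As a rough sketch, the per-token rates above can be combined into a simple cost estimator. The token counts in the example are purely hypothetical; actual counts vary with prompt length, image size, and quality:

```python
# Rough cost estimator for gpt-image-1 API usage, based on the
# per-token rates quoted above.
RATES_PER_MILLION = {
    "text_in": 5.00,    # $5 per 1M text input tokens
    "image_in": 10.00,  # $10 per 1M image input tokens
    "image_out": 40.00, # $40 per 1M image output tokens
}

def request_cost(text_in: int, image_in: int, image_out: int) -> float:
    """Return the dollar cost of one generation request."""
    return (
        text_in * RATES_PER_MILLION["text_in"]
        + image_in * RATES_PER_MILLION["image_in"]
        + image_out * RATES_PER_MILLION["image_out"]
    ) / 1_000_000

# Hypothetical request: a short prompt (~50 text tokens) producing an
# image billed at ~1,000 output tokens, with no image input.
print(f"${request_cost(50, 0, 1000):.4f}")  # roughly $0.04 per image
```

Plugging in larger output-token counts for higher-quality or larger images pushes the figure towards the upper end of the $0.02-$0.19 range quoted above.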
xAI
2. Grok Gets Vision
xAI has expanded Grok’s capabilities with several significant updates, most notably the introduction of visual perception through Grok Vision.
Grok Vision: Users can now point their phone's camera at objects, products, signs, and documents to ask questions about what they see (currently available only on iOS).
Multilingual audio: Android users subscribed to the $30 monthly SuperGrok plan can now access voice interactions in multiple languages.
Real-time search: Voice mode now incorporates real-time search capabilities for more current information, again exclusively for SuperGrok subscribers on Android.
Alex’s take: The race for multimodal AI continues to intensify, with xAI now joining Google and OpenAI in offering real-time visual understanding. Only last week, we saw memory integrated into Grok to enable personalised knowledge across conversations. xAI has been shipping at an incredible pace, not only catching up to but, in some instances, surpassing the capabilities of other leading LLM providers.
Adobe
3. Firefly Takes Flight
Adobe has announced major updates to its Firefly platform at its MAX London event this week.
Firefly Image Model 4 & 4 Ultra: Adobe's fastest and most controllable image models yet, offering lifelike quality with precision control over structure, style, and camera angles at up to 2K resolution.
Partner integration: Now incorporates models from Google Cloud and OpenAI, with more partners coming soon, including fal.ai, Ideogram, Luma, Pika and Runway.
Firefly Boards: A new collaborative, AI-first canvas for moodboarding and exploring creative concepts that streamlines ideation to production workflows.
Alex’s take: I attended the Adobe MAX event in London this week and was inspired by their approach, not just in terms of technical capabilities, but also in their commitment to commercially safe AI. It’s a bit like the wild west right now, with an abundance of lawsuits flying around as image and video generation providers train on copyrighted works. Firefly, however, is trained exclusively on Adobe Stock and non-copyrighted works, meaning outputs can be used in commercial projects with confidence.
Today’s Signal is brought to you by GrowHub.
Don’t Leave LinkedIn Growth To Chance
Create LinkedIn posts from blogs, articles and YouTube videos
AI trained on thousands of high-performing posts
Generate sensational content in any language
Content I Enjoyed
Invisible AI to Cheat On Everything?
When a video of AI glasses (which don’t actually exist) helping someone cheat on a date went viral this week, I couldn’t help but watch the backlash with a mix of fascination and mild horror.
The product, Cluely, markets itself as an “undetectable AI-powered assistant” for interviews, sales calls, and meetings.
Behind Cluely is Chungin "Roy" Lee, a former Columbia student who was suspended after creating an AI tool for cheating on job interviews. His company has already raised $5.3 million from venture capital firms Abstract and Susa Ventures, and has gained 70,000 users since its launch.
I found myself wondering which season of Black Mirror this technology would fit into, especially since the company chose dating as its showcase scenario when using this type of tech in work meetings would be far more practical and far less controversial.
Then again, that’s the point. To appeal to interest, not to reason. To spark a reaction. And it did exactly that with 12M impressions on the launch video.
Lee argues that “ultimately, this is where the world is headed. We will be 100 times more efficient”, suggesting we need to rethink what “cheating” means in the age of AI. What happens when any publicly available fact or figure is accessible at the speed of thought?
While I'm sceptical about the ethics, I do see a huge opportunity in visual copilots and their use for gaining an informational advantage.
Idea I Learned
The AI Education Gap Is Creating a New Generation of Leaders
This week, President Trump signed an Executive Order establishing AI literacy as a national educational priority across all grade levels in the United States.
The initiative will integrate AI concepts throughout the K-12 curriculum, ensuring all students develop fundamental AI competencies regardless of their career path.
What struck me most was the potential generational divide that this will create.
While adults are figuring out prompt engineering and attempting to apply basic AI tools, today's children will grow up with AI concepts woven into their educational DNA.
This idea reminds me of Toby Brown, a teenage AI founder whom I met last year at an AI demo day in London. He recently secured a $1 million investment from Silicon Valley before even finishing his GCSEs, and if I’m being honest, I’m not even surprised. Toby was the sharpest person I spoke to at that event, and his enthusiasm for building was infectious. He started coding at age 7 and is now building and raising capital in SF while his peers are still in school.
It raises the question: will we soon see 20-year-old CEOs regularly managing teams of 40-somethings who are struggling to keep pace with AI advancements?
Those of us who didn't learn AI in school will need to work twice as hard to stay relevant. The plasticity of young minds gives them a natural advantage. That’s why it's your job to prioritise both the education and application of the latest tools. Even an hour a week will put you ahead of your peer group.
What’s more, tools like Grok are now introducing Socratic responses that guide users through a topic via questions and reasoning to promote critical thinking. Instead of receiving direct answers, you actively engage with a subject rather than passively consuming information.
So, I recommend that you find something you're interested in related to AI and pursue that curiosity—it’ll take you further and lead to a far deeper understanding than someone doing it “because I have to.”
Geoffrey Hinton challenges our traditional view of human intelligence:
Geoffrey Hinton says the more we understand how AI and the brain actually work, the less human thinking looks like logic.
We're not reasoning machines, he says. We're analogy machines. We think by resonance, not deduction.
“We're much less rational than we thought.”
— vitrupo (@vitrupo), Apr 22, 2025
Hinton, one of AI's founding fathers, suggests that as we gain deeper insights into both AI systems and the human brain, we're discovering that human cognition isn’t primarily based on logical reasoning.
Instead, he proposes that we are "analogy machines" who think by resonance rather than deduction.
I think this idea is absolutely spot on. Humans love drawing comparisons to past experiences or familiar patterns rather than strict logical deduction.
Take, for example, choosing a restaurant to eat at when visiting a new city.
You don’t logically evaluate every restaurant’s menu, reviews, and pricing through deduction.
Instead, you think, “Ah, I’m going to find something similar to that pizza restaurant I love in my hometown, with those thick crusts and sourdough base.”
You use analogy, comparison and resonance of past experience to guide your choice.
In other words, you use “vibes”.
You think emotionally, not rationally. The emotions tied to these analogical memories shape our future decisions.
Hinton rightly states that we have a thin layer of reasoning on top for things like mathematics, but our primary decision-making engine runs on analogy.
As LLMs get smarter and as we advance to AGI, we may need to reconsider our most fundamental assumptions about “rational” thought itself.
Source: University of Toronto, YouTube
Question to Ponder
“Who will win the AI race?”
The short answer: it’s too early to tell.
The long answer: it’ll be whoever controls the vertical from chip to chatbot.
Compute is the new oil of our time. Just look at Nvidia’s revenue growth. A staggering $39.3 billion last quarter.
What’s interesting is that whilst there has been a flurry of excitement about OpenAI over the last few years, I believe we’ll see Google stretch its advantage over the next three.
Why? Google’s TPUs.
When testing state-of-the-art LLMs on the Aider Polyglot coding benchmark, Gemini 2.5 Pro had a remarkably low cost of $6.32 while achieving a high accuracy of 72.9%.
In contrast, OpenAI’s o3 ($111.03), which offers comparable performance, is far more expensive due to its reliance on GPU infrastructure, tracing back to their vendor: Nvidia.
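A quick back-of-the-envelope calculation, using only the benchmark costs quoted above, makes the efficiency gap concrete:

```python
# Benchmark costs on Aider Polyglot, as quoted above (in dollars).
gemini_cost = 6.32   # Gemini 2.5 Pro, 72.9% accuracy
o3_cost = 111.03     # OpenAI o3, comparable performance

# For comparable performance, o3 costs roughly 17-18x more to run.
ratio = o3_cost / gemini_cost
print(f"o3 is ~{ratio:.1f}x more expensive than Gemini 2.5 Pro")
```

A single benchmark is a narrow lens, but a cost gap of this size compounds quickly when serving billions of requests.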
Google’s TPUs (Tensor Processing Units) are designed specifically for tensor operations, the core of neural network computations.
This optimisation for AI-specific workloads reduces computational overhead and speeds up task completion.
Unlike GPUs, which handle a wide range of computing tasks, TPUs focus solely on what matters for AI.
But Google's true advantage lies in vertical integration. They control both hardware and software, creating a perfectly synchronised end-to-end ecosystem that tailors their infrastructure precisely to their models' needs.
Energy efficiency becomes crucial at scale. TPUs consume significantly less power than GPUs for equivalent AI workloads—critical when running massive training jobs (for Gemini 3, 4, 5, etc.) or serving billions of inference requests daily.
That leads me onto the cost implications. Google's decade-long investment in custom silicon means they can offer more compute per dollar through economies of scale. While OpenAI has captured our imagination so far, the economics of running models at scale might ultimately decide this race.
The winner won’t just have the best consumer-facing models. They’ll have the most efficient path from electricity all the way through to intelligence.

How was the signal this week?

See you next week, Alex Banks