It was just a taste, two 8-second Veo 3 videos, but as with so many life-altering things, I will never forget my first time generating synchronized audio and video from a carefully crafted prompt.
I am currently running Google AI Pro, the $19.99-per-month account that gives access to the Gemini 2.5 Pro model and, more importantly, a limited trial of Veo 3 video generation.
Veo 3 is the inflection point in generative video creation: for the first time, it is possible to create videos with dialogue, background noise and sound effects, all synchronized with the action.
While I understood that my Veo 3 access could be limited, I was not sure how many videos I could generate with the new model. The answer, apparently, is exactly two. If I want unlimited access, I can upgrade to Google AI Ultra for $249.99 per month (there is a three-month deal for $124.99 per month). And Veo 3 is currently only available in the US.
Since Veo 3 launched at Google I/O 2025, my TikTok feed has been filled with these incredible, already quite realistic AI clips. Some look like infomercials or commercials, others are simply impossible, like a woman interviewing a smiling man who is clearly on fire.
I was torn between creating realism, hyperrealism and something fantastical. In the end, I wrote a prompt in the Gemini 2.5 Pro window, which supports video creation, that was a mix of science fiction, drama and fantasy.
However, writing inside the prompt window turned out to be a mistake because I accidentally hit Return before fully developing my idea, and suddenly Veo 3 was busy generating my video.
This was my first prompt:
“Bill and Jessica live in a log cabin built on the surface of Mars. Bill emerges from the cabin to find Jessica fighting a Martian using nothing more than a stuffed animal.
Bill shouts to Jessica: What are you doing?
Jessica: This damn Martian wants our land and can’t have it.”
As you can see, there are not many details, and as easy as it is to generate a video in Veo 3 (and in Veo 2, without audio), you will get a better result by including more detail and dialogue. Veo 3 will not make the characters say anything you did not write. In this case, because I hit Return too soon, Jessica’s dialogue is cut short and I could not polish my prompt.
Even so, Veo 3 took the few details and in approximately 5 minutes created a striking video. Take a look (sound on for the full effect).
My first Veo 3 video: A cabin on Mars pic.twitter.com/at63W2LQDM (May 28, 2025)
It is far from perfect. Bill does not actually speak his line on screen, although we hear it from off camera. Jessica’s scream (or is it the Martian’s?) also comes from somewhere off camera.
There is an unfortunate sound effect that could have come from Bill, and that I did not write. Also, I don’t know why Jessica delivers her lines directly to the camera.
Again, I suppose that if I had specified who she should be talking to, Veo 3 might have made a different decision.
Even so, there are many more subtle things that are impressive. Veo 3 gets the setting right; note the reddish tint of the Martian day. The Martian is scary. However, I am most impressed by the sound effects, such as the cabin door, the footsteps on the Martian ground and the stuffed animal hitting the Martian’s chest.
Take 2
For my second prompt, I wrote and edited the text outside of Gemini. I did my best to set the scene, describe the characters and spell out the dialogue and any sound effects. Here is the prompt:
The scene is a lush forest with sunlight streaming down from above. We hear pterodactyl screeches in the background and the sound of leaves swaying in a light breeze.
A Tyrannosaurus is carefully painting a large canvas that depicts a colorful image of a man about to be destroyed by an asteroid.
The Tyrannosaurus is singing quietly to himself, “Pink Pony Club, I’m gonna keep on dancing at the …”
A Velociraptor wanders over and asks: “Why are you painting that?”
The Tyrannosaurus: “AI made me do it.”
The Velociraptor recoils in horror and says: “What?!”
As you can see, I was partly inspired by some of the self-referential Veo 3 videos I had been watching on TikTok, where the characters break the fourth wall and mention that they are in a video. While my detailed work paid off, Veo made a series of questionable decisions.
I do not know why it chose to dress the T-Rex but neglected to give him a brush, or why the character in the painting looks like some kind of kids’ detective from the ’70s. And although Gemini clearly knows a thing or two about what dinosaurs look like, the relative sizes of the T-Rex and Velociraptor are off. I was also disappointed that, instead of “pterodactyl screeches,” I got a static image of pterodactyls and the sound of birdsong in the background.
The dialogue sync is mostly good, although I expected more from the Velociraptor.
In all, it took me a few minutes to write these prompts and another 3 to 5 minutes for Veo 3 to generate each video. I think that if I spent more time painting a detailed picture, even writing a complete short story, I could get an even better result.
I would let you know for sure, but I just used up my brief trial run. If you plan to try a couple of Veo 3 videos, here are my top tips:
- Write your prompt outside of Gemini
- Choose your subject carefully
- Explain every detail, from the appearance of the characters to the scene
- Detail every action, or Veo 3 will make something up or leave a character doing nothing
- Delineate the dialogue so it is clear who says what
- Describe the emotion behind the delivery of the dialogue
- Include details about background noise
- Include sound effects descriptions if you want specific sounds
- Each video is a maximum of 8 seconds. Plan accordingly
- Try to create multiple videos that continue a story, but maintain consistent descriptions
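
One way to follow these tips is to draft the prompt as structured fields and render it to plain text before pasting it into Gemini. The sketch below is only an illustration of that idea: `ScenePrompt`, `DialogueLine` and their field names are hypothetical, the character descriptions and emotion notes are made up for the example, and nothing here calls Veo 3 or any Google API. It simply assembles text.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueLine:
    speaker: str
    line: str
    emotion: str = ""       # how the line should be delivered
    addressed_to: str = ""  # who the speaker is talking to


@dataclass
class ScenePrompt:
    setting: str
    characters: dict[str, str] = field(default_factory=dict)  # name -> appearance
    actions: list[str] = field(default_factory=list)
    dialogue: list[DialogueLine] = field(default_factory=list)
    background_audio: list[str] = field(default_factory=list)
    sound_effects: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join every field into one plain-text prompt, one item per line."""
        parts = [f"Scene: {self.setting}"]
        for name, look in self.characters.items():
            parts.append(f"{name}: {look}")
        parts.extend(self.actions)
        for d in self.dialogue:
            who = d.speaker
            if d.addressed_to:
                who += f" (to {d.addressed_to})"
            if d.emotion:
                who += f", {d.emotion}"
            parts.append(f'{who}: "{d.line}"')
        if self.background_audio:
            parts.append("Background audio: " + "; ".join(self.background_audio))
        if self.sound_effects:
            parts.append("Sound effects: " + "; ".join(self.sound_effects))
        return "\n".join(parts)


# Illustrative values only, loosely based on the Mars prompt in this article.
prompt = ScenePrompt(
    setting="A log cabin on the surface of Mars.",
    characters={
        "Bill": "a settler in a dusty jacket",      # made-up appearance
        "Jessica": "a settler holding a stuffed animal",  # made-up appearance
    },
    actions=["Bill emerges from the cabin and finds Jessica fighting a Martian."],
    dialogue=[
        DialogueLine("Bill", "What are you doing?",
                     emotion="shouting", addressed_to="Jessica"),
    ],
    background_audio=["wind over the Martian ground"],
    sound_effects=["cabin door creaking open"],
)

prompt_text = prompt.render()
print(prompt_text)
```

Keeping one `ScenePrompt` per clip and reusing the same `characters` entries across clips is a simple way to keep descriptions consistent when several 8-second videos continue a story.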
Good luck with your Veo 3 trial runs. Let me know how it goes in the comments below.