Multimodal prompting with a 44-minute movie | Gemini 1.5 Pro Demo

This is a demo of long-context understanding, an experimental feature in our newest model, Gemini 1.5 Pro, using a 44-minute silent Buster Keaton movie, Sherlock Jr., and a series of multimodal prompts. The demo is a continuous recording of a live model interaction. Sequences have been shortened, with response times shown.

Token count details: The input video (696,161 tokens) and image (256 tokens) total 696,417 tokens. The text inputs add further tokens to the prompt, yielding the 696,538-token total shown in the interface.
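The token arithmetic above can be checked with a short sketch. The variable names are illustrative. The text-token figure is not stated in the description; it is derived here by subtraction, assuming the interface total is exactly the sum of the video, image, and text tokens.

```python
# Token counts as stated in the description.
video_tokens = 696_161
image_tokens = 256
interface_total = 696_538

# Video plus image tokens, which the description gives as 696,417.
media_tokens = video_tokens + image_tokens

# Assumption: whatever remains of the interface total comes from the text prompts.
text_tokens = interface_total - media_tokens

print(media_tokens)  # 696417
print(text_tokens)   # 121
```

Under that assumption, the text prompts account for 121 tokens, a tiny share of the total next to the video.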