In this video I look at the latest model from Qwen, Qwen 2.5 Omni, which accepts full multimodal input (text, images, video, audio) and produces either text or audio output in real time.
Blog: https://qwenlm.github.io/blog/qwen2.5...
Qwen Chat: https://chat.qwen.ai/
Try the model: https://huggingface.co/Qwen/Qwen2.5-O...
Colab: https://dripl.ink/tUoYL
For more tutorials on using LLMs and building agents, check out my Patreon
Patreon: / samwitteveen
Twitter: https://x.com/Sam_Witteveen
🕵️ Interested in building LLM Agents? Fill out the form below
Building LLM Agents Form: https://drp.li/dIMes
👨‍💻 GitHub:
https://github.com/samwit/llm-tutorials
⏱️ Timestamps:
00:00 Intro
00:23 Qwen2.5 Omni Blog
00:29 Qwen2.5 Omni Architecture
00:59 Chat: Demo - Audio
03:17 Chat: Demo - Video
05:24 Hugging Face
05:33 Qwen 2.5 Omni Paper
11:05 Qwen 2.5 Demo Colab