How Microsoft gets AI to Click the Right Buttons!

In this video, I look at Microsoft's recent release of OmniParser, a tool that lets agents read the screens of various UIs and produce output that an LLM can use to interact with those screens.

For more tutorials on using LLMs and building agents, check out my Patreon.
Patreon:   / samwitteveen  
Twitter:   / sam_witteveen  

OmniParser: https://microsoft.github.io/OmniParser/
Colab: https://drp.li/rsuVh

🕵️ Interested in building LLM agents? Fill out the form below.
Building LLM Agents Form: https://drp.li/dIMes

👨‍💻 GitHub:
https://github.com/samwit/langchain-t... (updated)
https://github.com/samwit/llm-tutorials
