Introducing DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL. DeepSeek-VL2 demonstrates superior capabilities across various tasks, including but not limited to visual question answering, optical character recognition, document/table/chart understanding, and visual grounding. Our model series is composed of three variants: DeepSeek-VL2-Tiny, DeepSeek-VL2-Small and DeepSeek-VL2, with 1.0B, 2.8B and 4.5B activated parameters respectively. DeepSeek-VL2 achieves competitive or state-of-the-art performance with similar or fewer activated parameters compared to existing open-source dense and MoE-based models.
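The "activated parameters" distinction comes from the Mixture-of-Experts design: only a few experts run per token, so the compute cost tracks the activated count rather than the full model size. Below is a minimal conceptual sketch of top-k expert routing in NumPy; the expert count, hidden size, and single-matrix "experts" are illustrative assumptions, not DeepSeek-VL2's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # hypothetical expert count, for illustration only
TOP_K = 2         # experts activated per token
D = 16            # toy hidden size

# Each "expert" is reduced to a single weight matrix for clarity.
experts = [rng.standard_normal((D, D)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D, NUM_EXPERTS))

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]        # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D)
out = moe_forward(token)

total_params = sum(e.size for e in experts)      # parameters stored
active_params = TOP_K * experts[0].size          # parameters used per token
print(out.shape, total_params, active_params)
```

Here only 2 of 8 experts contribute per token, so the per-token compute uses a quarter of the stored expert weights — the same reason DeepSeek-VL2 can report 1.0B/2.8B/4.5B activated parameters while the total parameter counts are larger.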
🔗 Links 🔗
https://huggingface.co/deepseek-ai/de... (DeepSeek-VL2 full model)
DeepSeek-VL2 paper - "DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding" - https://arxiv.org/pdf/2412.10302
DeepSeek-VL2-Small demo - https://huggingface.co/spaces/deepsee...
DeepSeek-VL UI demo (AskUI) - https://huggingface.co/spaces/AskUI/D...
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - https://ko-fi.com/1littlecoder
🧭 Follow me on 🧭
Twitter - / 1littlecoder