Alibaba Introduces QVQ-Max: An AI That Sees, Understands, and Thinks

Alibaba has released a new AI model called QVQ-Max. It is designed to understand photos and videos, then reason about them to provide solutions.

The model is intended to bridge the gap between text-based AI and real-world understanding. Using visual reasoning, it processes images, identifies key details, and provides insights. It is designed for tasks such as illustration design, video script generation, and role-playing.

Unlike text-only AI chatbots, QVQ-Max is built with visual capabilities. It can help solve math and physics problems that include diagrams, and it can provide cooking guidance based on recipe images.

This is the first version of the model, and Alibaba has shared its plans for improvement. Image recognition accuracy is expected to improve through grounding techniques. The model should also handle multi-step tasks and complex problems better, enabling it to operate devices and play games. Future updates will expand its interaction beyond text-based responses to include tool verification and visual generation.

To use QVQ-Max, visit chat.qwen.ai. Open the model dropdown menu, enable more models, select QVQ-Max, and start chatting. Attach images to let the AI demonstrate its visual capabilities.

📌 Source: Neowin

One Minute Data