Chinese technology group Alibaba has just announced a new AI model called QVQ-Max of the Qwen series, marking a breakthrough in the field of multimedia AI. The special feature of this model is the ability to analyze image/video content, then make arguments and solutions based on the information obtained.
Impressive ability
QVQ-Max is described by Alibaba as a bridge between pure text-based AI models and the real world. With its visual reasoning capabilities, the system can:
- Analyze images and identify key elements
- Versatile application in many fields from illustration design, video script creation to character role-playing
- Solve problems with diagrams (math, physics)
- Step by step cooking instructions based on recipe pictures
Alibaba says the model bridges the gap between text-based AI and fact-based information. Thanks to its image-based reasoning capabilities, QVQ-Max can “see, understand, and think” about the world around it. The company says the model excels at analyzing images and identifying key elements, and is flexible enough to be used in areas such as illustration design, video scripting, and role-playing.

Like other AI chatbots, QVQ-Max supports work, education and personal life, but thanks to visual integration, it also solves more specific tasks such as: solving math/physics problems with diagrams, cooking instructions through recipe images.
Alibaba sees QVQ-Max as a first version and has outlined a roadmap for future upgrades. First, they want to improve image recognition accuracy using grounding techniques. Second, the model will be optimized to handle multiple tasks and complex problems such as operating a phone, computer, or playing a game. Finally, Alibaba plans to expand from text interaction to tool verification and image content generation.
Users can experience QVQ-Max by:
- Visit chat.qwen.ai
- Select the model menu in the left corner → " Expand more models "
- Select QVQ-Max and start chatting
- Attach image files to explore AI processing capabilities
With the launch of QVQ-Max, Alibaba continues to assert its position in the race to develop multimedia AI, competing directly with global technology giants. The model promises to bring practical applications in work, education and personal life.