WorldBench: Top MLLM Scores 64%
MIT researchers released a new test called WorldBench. It checks how AI models understand images.
They tested 15 multimodal models. The top model scored 64%. Some models performed near chance level.
Most benchmarks focus on tasks like reading charts or text. WorldBench focuses on visual diversity instead. It covers thousands of visual concepts, including living things and landscapes.
Key facts:
- Released June 4 on arXiv.
- Top model scored 64%.
- Tests visual breadth over task depth.
- Exposes gaps in visual understanding.
The results suggest that visual diversity is a major weak point for current models. They likely need better vision encoders and more diverse training data.
The researchers have not released the code or data yet, so you cannot replicate the results for now.
Source: https://dev.to/gentic_news/worldbench-top-mllm-scores-64-on-visually-diverse-benchmark-3h0g