Singapore Institute of Technology

MerCulture: A Comprehensive Benchmark to Evaluate Vision-Language Models on Cultural Understanding in Singapore

conference contribution
posted on 2025-10-02, 02:36 authored by Tushar PranavTushar Pranav, Eshan Pandey, Lyka Diane Austria, Yin Yin Loo, Jing Hao LimJing Hao Lim, Indriyati AtmosukartoIndriyati Atmosukarto, Cheng Lock, Donny SohCheng Lock, Donny Soh
Vision-Language Models (VLMs) have achieved remarkable performance across multimodal tasks. However, they continue to exhibit significant limitations in cultural understanding, particularly when interpreting non-Western imagery. These limitations primarily stem from biases in training data, which predominantly reflect Western-centric perspectives. Existing benchmarks fail to address this issue, lacking diversity in cultural representation and evaluation criteria. To bridge this gap, we introduce MerCulture, a multimodal benchmark designed to assess VLMs' ability to interpret culturally significant objects, traditions, and symbols. MerCulture consists of two core tasks: MerCulture VQA, which evaluates culturally grounded question answering, and MerCulture Visual Grounding, which measures object-context associations. To ensure rigorous evaluation, we propose novel metrics tailored for cultural fidelity and bias measurement. Specifically, for the MerCulture VQA task, we introduce the Cultural Alignment Score (CAS) and Bias Reinforcement Rate (BRR). For MerCulture Visual Grounding, we define the Mean Cultural Grounding Score (MCGS) and Textual Alignment Score (TAS). Benchmarking state-of-the-art VLMs on MerCulture reveals substantial performance disparities, underscoring the urgent need for more culturally inclusive multimodal AI systems. Our findings establish a foundation for advancing cross-cultural AI applications in domains such as education, heritage preservation, and multilingual systems.

Journal/Conference/Book title

2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Publication date

2025-06-11

Project ID

  • 14613 (R-AIS-A405-0003) LEARN: Language automated Evaluation by generating Answers / questions from caRtooNs
