Unable to reproduce the results for S3 on StreamingBench #12

@kailigo

I followed the instructions strictly but got the following results, which are noticeably lower than those reported in the paper. Any idea why this happens? Thanks.

ViSpeak_real_stats.csv

task_type,total,correct,accuracy
Clips Summarize,317,240,0.7570977917981072
total,2500,1667,0.6668
Object Recognition,367,279,0.7602179836512262
Attribute Recognition,306,227,0.7418300653594772
Prospective Reasoning,108,65,0.6018518518518519
Action Recognition,353,239,0.6770538243626062
Spatial Understanding,246,147,0.5975609756097561
Event Understanding,161,119,0.7391304347826086
Counting,193,45,0.23316062176165803
Text-Rich Understanding,321,223,0.6947040498442367
Causal Reasoning,128,83,0.6484375

ViSpeak_sqa_stats.csv

task_type,total,correct,accuracy
Sequential Question Answering,250,96,0.384

ViSpeak_proactive_stats.csv

task_type,total,time_correct,time_accuracy,answer_correct,answer_accuracy
Proactive Output,250,115,0.46,109,0.436

ViSpeak_omni_stats.csv

task_type,total,correct,accuracy
Misleading Context Understanding,250,84,0.336
total,1500,790,0.5266666666666666
Source Discrimination,250,141,0.564
Emotion Recognition,250,112,0.448
Anomaly Context Understanding,250,114,0.456
Scene Understanding,250,143,0.572
Multimodal Alignment,250,196,0.784
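
As a sanity check that the stats files are internally consistent (so the gap is not an aggregation error on my side), here is a minimal sketch that re-sums the per-task rows and compares them with the file's own `total` row. It assumes only the column layout shown above; `check_stats` and `OMNI_CSV` are hypothetical names used for illustration and are not part of the ViSpeak evaluation code.

```python
import csv
import io

# Rows copied verbatim from ViSpeak_omni_stats.csv above.
OMNI_CSV = """task_type,total,correct,accuracy
Misleading Context Understanding,250,84,0.336
total,1500,790,0.5266666666666666
Source Discrimination,250,141,0.564
Emotion Recognition,250,112,0.448
Anomaly Context Understanding,250,114,0.456
Scene Understanding,250,143,0.572
Multimodal Alignment,250,196,0.784
"""


def check_stats(csv_text):
    """Sum the per-task rows and verify them against the `total` row.

    Returns (n_total, n_correct, accuracy), where accuracy is the
    micro-average: all correct answers divided by all questions.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    tasks = [r for r in rows if r["task_type"] != "total"]
    summary = next(r for r in rows if r["task_type"] == "total")

    n_total = sum(int(r["total"]) for r in tasks)
    n_correct = sum(int(r["correct"]) for r in tasks)

    # The file's own summary row should match the re-computed sums.
    assert n_total == int(summary["total"])
    assert n_correct == int(summary["correct"])
    return n_total, n_correct, n_correct / n_total


n_total, n_correct, acc = check_stats(OMNI_CSV)
print(n_total, n_correct, round(acc, 4))  # 1500 790 0.5267
```

The same check passes for ViSpeak_real_stats.csv (2500 questions, 1667 correct, 0.6668), so the reported totals match the per-task counts.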
