Neuro-Symbolic Agent Challenge

Overview

The Neuro-Symbolic Agent Challenge aims to foster research in video-based agent tasks by encouraging the development of datasets, benchmarks, and evaluation frameworks analogous to those available for LLM-based agents. This challenge provides an initial dataset and evaluation metrics to serve as a foundation for future research.

Challenge Goals

The primary objective is to design and evaluate video agents that leverage deep-learning and neuro-symbolic methods to process videos and respond to complex natural language queries. The three specified tasks for a video agent are video search, tool calling, and video generation.

1. Video Search

Predicts the temporal span of a video segment corresponding to a query.

Requirements:

  • Parsing Queries: Extract objects, events, and temporal logic.
  • Perception: Utilize models to detect relevant elements.
  • Prediction: Identify spans with high probability.
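The pipeline above (parse the query, run perception per frame, predict a span) can be sketched as follows. This is a minimal illustration, not the challenge API: the names `Query` and `predict_span` are hypothetical, and the per-frame label sets stand in for the output of a real perception model.

```python
# Hypothetical sketch of video search: match a parsed query against
# per-frame detections and return the covering temporal span.
from dataclasses import dataclass

@dataclass(frozen=True)
class Query:
    obj: str    # object mentioned in the query, e.g. "car"
    event: str  # event predicate, e.g. "turning_left"

def predict_span(query, frame_labels):
    """Return the inclusive (start, end) frame range where the queried
    (object, event) pair is detected, or None if it never occurs.

    frame_labels: one set of (object, event) detections per frame,
    standing in for a real perception model's output.
    """
    hits = [i for i, labels in enumerate(frame_labels)
            if (query.obj, query.event) in labels]
    if not hits:
        return None
    return (hits[0], hits[-1])

frames = [set(),
          {("car", "turning_left")},
          {("car", "turning_left")},
          set()]
print(predict_span(Query("car", "turning_left"), frames))  # (1, 2)
```

A real system would replace the exact-match lookup with a probabilistic score per frame and select the span maximizing that score, but the input/output contract is the same.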

2. Tool Calling

Determines the correct tool and executes it with appropriate inputs.

Requirements:

  • Tool Selection: Identify the right API/tool for a given span.
  • Tool Invocation: Provide tool inputs based on the detected video clip.
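The two requirements above amount to a lookup (selection) followed by a call with clip-derived arguments (invocation). A minimal sketch, with entirely illustrative tool names and a plain dictionary as the registry:

```python
# Hypothetical tool registry: maps a detected event type to a callable.
# The tools and event names here are illustrative, not part of the challenge.
def send_alert(clip):
    return f"alert sent for clip {clip}"

def log_event(clip):
    return f"logged clip {clip}"

TOOLS = {
    "collision": send_alert,
    "lane_change": log_event,
}

def call_tool(event, clip):
    """Select the tool registered for `event` and invoke it on `clip`."""
    tool = TOOLS.get(event)
    if tool is None:
        raise KeyError(f"no tool registered for event {event!r}")
    return tool(clip)

print(call_tool("collision", "video_001"))  # alert sent for clip video_001
```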

3. Video Generation

Synthesizes videos based on extended natural language queries.

Requirements:

  • Synthesis: Generate novel video sequences.
  • Evaluation: Ensure high visual and semantic quality.
  • Improvement & Editing: Iteratively refine videos with neuro-symbolic feedback.
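The synthesize-evaluate-refine cycle described above can be expressed as a generic loop. This is only a control-flow sketch under toy assumptions: `generate`, `score`, and `edit` are placeholders for a video generator, a neuro-symbolic scorer (e.g. NeuS-V), and an editing step, and the "video" below is just an integer quality level.

```python
# Illustrative generate-score-refine loop; all callables are stubs.
def refine_video(generate, score, edit, query, threshold=0.9, max_iters=5):
    """Generate a video, then iteratively edit it until the scorer
    reports quality >= threshold or max_iters edits have been applied."""
    video = generate(query)
    for _ in range(max_iters):
        quality = score(video, query)
        if quality >= threshold:
            break
        video = edit(video, query, quality)
    return video

# Toy stand-ins: a "video" here is just an int whose value is its quality.
result = refine_video(
    generate=lambda q: 0,
    score=lambda v, q: v / 10,
    edit=lambda v, q, s: v + 3,
    query="a car turning left at night",
)
print(result)  # 9
```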

Datasets

The challenge provides an initial dataset, the TLV dataset, hosted on Hugging Face (see Get Started below).

Evaluation Metrics

  • Accuracy of Events: F1-score comparing predicted and ground-truth spans.
  • Tool Calling: Accuracy of tool selection and input specification.
  • Synthetic Video Quality: Measured via VBench (visual fidelity) and NeuS-V (temporal coherence).
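For the first metric, one natural reading of "F1-score comparing predicted and ground-truth spans" is a frame-level F1 over the two spans. The exact definition used by the challenge is not spelled out here, so the following is an assumed interpretation:

```python
# Assumed frame-level F1 between a predicted and a ground-truth span.
# Spans are inclusive (start, end) frame indices.
def span_f1(pred, gt):
    pred_frames = set(range(pred[0], pred[1] + 1))
    gt_frames = set(range(gt[0], gt[1] + 1))
    tp = len(pred_frames & gt_frames)  # frames in both spans
    if tp == 0:
        return 0.0
    precision = tp / len(pred_frames)
    recall = tp / len(gt_frames)
    return 2 * precision * recall / (precision + recall)

print(round(span_f1((10, 30), (15, 35)), 4))  # 0.7619
```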

Get Started

  1. Clone the repository:
    git clone https://github.com/UTAustin-SwarmLab/Neuro-Symbolic-Agent-Challenge.git
  2. Install dependencies:
    pip install -r requirements.txt
  3. Download the NeuS-V distributions file (create the assets directory first so wget can write to it):
    mkdir -p assets
    wget https://raw.githubusercontent.com/UTAustin-SwarmLab/NeuS-V/main/assets/distributions.pkl -O assets/distributions.pkl
    
  4. Download the TLV Dataset from Hugging Face.
  5. Explore the dataset and benchmarks.
