Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models

This is the repository implements code for the paper Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models and the model used in the paper are GPT-4-0125-Preview and GPT-4-Vision.

Ideation

The paper propose a new prompting method which is called Visualization-of-Thought or VoT where LLM could visualize an imaginary visualization in their memory (Mind's Eye or Mental Images) and use that to help them solve problem that required visualize out each of the step in order to keep track and seeing the update of the changes.

The paper focused on compare the performance between VoT, CoT and No Visualization CoT for GPT-4 and GPT-4V

The result of the comparison can be seen at here where GPT-4 with VoT outperform CoT in both Vision and Non-Vision GPT-4 Version. This proven that VoT can really be used for task where image play a role as information and we can explain the image using natural language and ask the LLM to visualize it out in their thinking process.

Setup

1. Local

git clone https://github.com/sitloboi2012/Visualization-Of-Thought.git
conda create --name vot-prompt python=3.11 -y
conda activate vot-prompt
set OPENAI_API_KEY=sk-123
pip install -r requirements.txt

----
cd prompts
python next_step_prediction.py #to run Next Step Prediction Task
python route_planning.py #to run the Route Planning Task
python natural_language_navigation.py #to run the Natural Language Task

2. Docker

TBA

References

@misc{shao2024visual,
      title={Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models}, 
      author={Wenshan Wu, Shaoguang Mao, Yadong Zhang, Yan Xia, Li Dong, Lei Cui, Furu Wei},
      year={2024},
      eprint={2404.03622},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
media		media
prompts		prompts
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models

Ideation

Setup

1. Local

2. Docker

References

About

Releases

Packages

Languages

sitloboi2012/Visualization-of-Thought

Folders and files

Latest commit

History

Repository files navigation

Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models

Ideation

Setup

1. Local

2. Docker

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages