
feat: add camera vision to look around #84

Open
flov wants to merge 2 commits into Thokoop:main from flov:look-around

Conversation

@flov commented Mar 13, 2026

This PR adds camera support to the billy-b-assistant. Right now it can look around and describe what it sees.
GPTARS inspired me to try this; in the future I would like to get Billy to play a round of chess with me via the camera.

Summary

  • Adds look_around tool that lets Billy take a photo and describe what he sees using GPT-4o-mini
    vision
  • Billy has no innate visual capability — the tool is the only way he can describe his surroundings,
    enforced via system prompt
  • Camera capture supports Pi Camera Module (via picamera2), USB webcams, and MacBook FaceTime camera
    (via OpenCV) for dev/mockfish use
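Since the tool takes no arguments and just captures the current view, its declaration to the model is presumably a zero-parameter function-calling schema. A hypothetical sketch (the actual schema registered in base_tools may differ):

```python
# Hypothetical sketch of the look_around tool declaration for the OpenAI
# function-calling interface; names and description text are illustrative.
LOOK_AROUND_TOOL = {
    "type": "function",
    "name": "look_around",
    "description": (
        "Take a photo with the camera and describe what is visible. "
        "This is Billy's only source of visual information."
    ),
    "parameters": {
        "type": "object",
        "properties": {},  # no arguments: the tool always captures the current view
        "required": [],
    },
}
```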

New files

  • core/camera.py — frame capture (picamera2 → OpenCV fallback) + GPT-4o-mini vision API call
  • test/list-cameras.py — utility to enumerate camera devices and pick the right CAMERA_DEVICE
    index
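The picamera2 → OpenCV fallback described above could be selected roughly like this (a sketch under the PR's stated fallback order; the actual core/camera.py may be structured differently):

```python
def pick_backend():
    """Return the first available capture backend, in the PR's fallback order."""
    try:
        import picamera2  # noqa: F401 -- Pi Camera Module path
        return "picamera2"
    except ImportError:
        pass
    try:
        import cv2  # noqa: F401 -- USB webcam / MacBook FaceTime path
        return "opencv"
    except ImportError:
        return None  # no capture backend installed
```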

Configuration

CAMERA_ENABLED=true   # default: false
CAMERA_DEVICE=0       # camera index, use test/list-cameras.py to find the right one
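One way these two variables might be parsed (a hypothetical helper; the project may read its .env differently):

```python
def read_camera_config(env):
    """Parse CAMERA_ENABLED / CAMERA_DEVICE from an environment mapping.

    Pass os.environ in production; a plain dict works for testing.
    """
    enabled = env.get("CAMERA_ENABLED", "false").strip().lower() == "true"
    device = int(env.get("CAMERA_DEVICE", "0"))
    return enabled, device
```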

Test plan

- Run python test/list-cameras.py to confirm cameras are detected
- Set CAMERA_ENABLED=true and CAMERA_DEVICE=<correct index> in .env
- Start a session and ask "what do you see?" — Billy should call look_around and describe the scene
- Confirm Billy does NOT hallucinate a scene when camera is disabled (CAMERA_ENABLED=false)
- Test on Raspberry Pi with Pi Camera Module (picamera2 path)
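The enumeration step presumably probes a range of device indices and reports which ones open. A sketch of that idea with an injectable probe so it can be exercised without hardware (the real test/list-cameras.py may differ):

```python
def find_cameras(max_index=5, probe=None):
    """Return indices in [0, max_index) that open as camera devices.

    `probe` defaults to an OpenCV check; inject a fake callable to test
    the scan logic without any camera attached.
    """
    if probe is None:
        import cv2

        def probe(index):
            cap = cv2.VideoCapture(index)
            ok = cap.isOpened()
            cap.release()
            return ok

    return [i for i in range(max_index) if probe(i)]
```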

@Thokoop (Owner) commented Mar 13, 2026

Hi @flov , thanks for your contribution.

My billy hangs on the toilet wall, so I didn't want to add a camera myself, haha. But it would be a fun optional upgrade!

Could you update your branch by merging my recent changes from main first? I have done some refactoring recently that split up the session.py file.
Please also set the destination branch to my dev branch; that way I can first merge it to test before releasing it as a main version.

I will definitely try it out, I will order a picamera but in the meantime also test with a usb cam.

I think we can also hook it directly into the 'normal' realtime session, since the gpt-realtime models support image inputs as well:

import json  # needed for json.dumps below

# "ws" is assumed to be an already-open WebSocket connection to the realtime API
base64_image = "<BASE64_IMAGE_BYTES>"

message_event = {
    "type": "conversation.item.create",
    "item": {
        "type": "message",
        "role": "user",
        "content": [
            {
                "type": "input_text",
                "text": "Tell me what you see in this image."
            },
            {
                "type": "input_image",
                "image_url": f"data:image/jpeg;base64,{base64_image}",
                "detail": "high",
            }
        ],
    },
}

ws.send(json.dumps(message_event))

response_event = {
    "type": "response.create",
    "response": {
        "output_modalities": ["text", "audio"]
    },
}

ws.send(json.dumps(response_event))
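To feed a captured frame into the snippet above, the raw JPEG bytes have to be base64-encoded into a data URL. A minimal sketch (hypothetical helper name, stdlib only):

```python
import base64


def to_image_url(jpeg_bytes):
    """Wrap raw JPEG bytes in the data-URL form used by the input_image content part."""
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{encoded}"
```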

@flov (Author) commented Mar 14, 2026

haha, yes, having a fish watch you while you're on the toilet is indeed a bit creepy 😂
I will refactor the code and rebase it onto your dev branch.
I've been testing it on my MacBook Pro with MOCKFISH=true, and it works incredibly well: he can tell me exactly what I'm wearing, what's visible in the background, etc.
But my Billy is still way too friendly. A friendly fish is not funny. I want a fish that roasts me based on my looks and makes sarcastic jokes about them.

@flov (Author) commented Mar 14, 2026

btw, I've tried to play chess with him using the python-chess library, but it didn't work so well. He kept making illegal chess moves :D. I don't think ChatGPT is good at playing chess. It should be possible by integrating Stockfish, but that's not trivial.

flov added 2 commits March 14, 2026 19:58
Billy can now take a photo and describe what he sees. Adds:
- core/camera.py: capture via picamera2 (Pi) or OpenCV (Mac/USB)
- look_around tool registered in base_tools and handled in session.py
- CAMERA_ENABLED / CAMERA_DEVICE config vars
- test/list-cameras.py helper to identify camera device indices
- README section documenting setup and usage