feat: update human-pose and age-gender experiments #650

Open: wants to merge 13 commits into base: gen3

Conversation

jkbmrz

@jkbmrz jkbmrz commented Apr 17, 2025

Purpose

Updating the experiments with the new host nodes (GatherData) and helper classes/functions (AnnotationHelper, generate_script_content).

Specification

  • human-pose:
    • utilize GatherData node,
    • utilize the depthai_nodes' generate_script_content function (instead of a custom implementation),
    • update AnnotationNode to use AnnotationHelper.
  • age-gender:
    • utilize GatherData node (instead of a custom implementation of the sync node in DetectionsAgeGenderSync),
    • utilize the depthai_nodes' generate_script_content function (instead of a custom implementation in ProcessDetections node),
    • update AnnotationNode to use AnnotationHelper,
    • allow FPS setting.

Dependencies & Potential Impact

None

Deployment Plan

None

Testing & Validation

Ran the experiments on RVC4 and RVC2 (on RVC2, human-pose only, as age-gender does not support it).

@jkbmrz
Author

jkbmrz commented Apr 23, 2025

Points of discussion:

  • Should we extend the AnnotationHelper with a method that automatically draws objects with coordinates relative to a bbox? For example, if one provides a bbox and keypoints with coordinates relative to that bbox, the AnnotationHelper could automatically convert the keypoint coordinates to absolute values. This would save some lines of code in human-pose/utils/annotation_node.py and probably also elsewhere;
  • The depthai_nodes' generate_script_content function could also be utilized in some other experiments. We can add it either in this PR or in a separate one. I've identified the following experiments:
    • pose-estimation/animal-pose,
    • object-tracking/deepsort-tracking,
    • 3D-detection/objectron.
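
A minimal sketch of the coordinate conversion raised in the first point, in plain Python. The function name `keypoints_to_absolute` and the tuple layouts are hypothetical and not part of AnnotationHelper; the sketch assumes an axis-aligned, unrotated bbox given as normalized (xmin, ymin, xmax, ymax), with keypoints in the 0..1 range relative to that bbox and no crop padding.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Assumed layout: (xmin, ymin, xmax, ymax), normalized to the full frame.
BBox = Tuple[float, float, float, float]


def keypoints_to_absolute(bbox: BBox, keypoints: List[Point]) -> List[Point]:
    """Map bbox-relative keypoints (0..1 inside the box) to frame coordinates."""
    xmin, ymin, xmax, ymax = bbox
    width, height = xmax - xmin, ymax - ymin
    # Scale each keypoint by the box size, then shift by the box origin.
    return [(xmin + kx * width, ymin + ky * height) for kx, ky in keypoints]
```

A helper along these lines would let an annotation node pass the bbox and the relative keypoints together, instead of repeating the scaling arithmetic in each experiment.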

@jkbmrz jkbmrz marked this pull request as ready for review April 23, 2025 12:06
Contributor

@klemen1999 klemen1999 left a comment

generate_script_content function could also be utilized in some other experiments

Essentially, this could be used in every two-stage experiment, no?

pose_nn.out.link(gather_data_node.input_data)
detections_filter.out.link(gather_data_node.input_reference)

skeleton_edges = (
Contributor

I believe this information should now already be part of the ImgDetectionExtended message

Author

The detection model here is YOLOv6 so we are dealing with dai.ImgDetections. Do you propose we transform to ImgDetectionsExtended and merge it together with Keypoints into a single message?

Contributor

Oh my bad, then this is ok as is I think

age_gender_node.getOutput(0).link(sync_node.age_input)
age_gender_node.getOutput(1).link(sync_node.gender_input)
# gather age-gender info
rec_gathered = pipeline.create(GatherData).build(fps, wait_count_fn=return_one)
Contributor

What is the purpose of this gather? Generally, the outputs of the same NN should already be synced with each other, no?

Author

The purpose is to join the two recognitions (age and gender) into a single message that can then be synced with the detections message. Do you maybe see a better way of how this could be done?

Contributor

@klemen1999 klemen1999 Apr 23, 2025

Hm, I'm thinking about how we should set an example of best practice in this case. Maybe just creating the GatheredData type manually would be better? You only want to "package" everything up into one message, and adding a GatherData node just for this seems a bit overkill and adds overhead. But I'm not sure, maybe this is in fact the cleanest solution. Thoughts?
Maybe @dominik737 if you have some thoughts?

Author

Yeah, I agree it is not the cleanest solution. Ideally, GatherData would sync input_data from multiple streams. But that might be overkill for a single node.

I would advise against using GatherData for this, as its primary purpose should be to gather data on a non-trivial condition, e.g. NNData based on len(ImgDetections.detections).
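
To illustrate what "gathering on a non-trivial condition" means here, the following is a simplified plain-Python sketch. `GatherSketch` and its methods are hypothetical stand-ins, not the depthai_nodes GatherData API; the sketch assumes messages arrive in order and ignores timestamp matching entirely.

```python
from collections import deque
from typing import Any, Callable, Deque, List, Optional, Tuple


class GatherSketch:
    """Hypothetical stand-in: pairs one reference with N data items,
    where N is computed from the reference by wait_count_fn."""

    def __init__(self, wait_count_fn: Callable[[Any], int]) -> None:
        self._wait_count_fn = wait_count_fn
        self._references: Deque[Any] = deque()
        self._data: Deque[Any] = deque()

    def add_reference(self, reference: Any) -> Optional[Tuple[Any, List[Any]]]:
        self._references.append(reference)
        return self._try_emit()

    def add_data(self, item: Any) -> Optional[Tuple[Any, List[Any]]]:
        self._data.append(item)
        return self._try_emit()

    def _try_emit(self) -> Optional[Tuple[Any, List[Any]]]:
        # Nothing to emit until a reference is waiting.
        if not self._references:
            return None
        needed = self._wait_count_fn(self._references[0])
        # Keep waiting until enough data items have arrived.
        if len(self._data) < needed:
            return None
        reference = self._references.popleft()
        gathered = [self._data.popleft() for _ in range(needed)]
        return reference, gathered


# Wait for one pose result per detection in the reference message.
gather = GatherSketch(wait_count_fn=len)
gather.add_reference(["person_a", "person_b"])
gather.add_data("pose_a")
result = gather.add_data("pose_b")
```

With `wait_count_fn=len` the count depends on the reference message, which is the case described above. A function that always returns one, as in the age-gender snippet under review, reduces the node to plain 1:1 packaging.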

The ways I would go about this are:

  1. Using dai.node.Sync, which is made to sync 1:1 messages based on timestamp. The output message is of type dai.MessageGroup.
  2. Implement a node that can merge inputs into a dai.MessageGroup without syncing. The merging could be done based on a lambda as in GatherData node.
  3. Make a custom type for this particular experiment and return it from a custom node that will merge the messages.
  4. Have outputs property be part of the ParsingNeuralNetwork, where the multiple head outputs are outputted into a single dai.MessageGroup.

I personally like option (4) the most. I think there will hardly be any use case besides this one of merging multiple head outputs into a group. If other such cases are expected, I would opt for (2) instead.
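
As a rough illustration of the idea behind option (4), the sketch below merges the outputs of several heads from one inference into a single named group. It is plain Python only: `HeadOutput`, `HeadGroup`, and `merge_head_outputs` are hypothetical names, and `HeadGroup` merely stands in for a dai.MessageGroup; none of this is the ParsingNeuralNetwork API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable


@dataclass(frozen=True)
class HeadOutput:
    """Hypothetical parsed output of one model head."""
    name: str
    timestamp: float
    payload: Any


@dataclass
class HeadGroup:
    """Hypothetical stand-in for a message group: named messages, one timestamp."""
    timestamp: float
    messages: Dict[str, Any] = field(default_factory=dict)


def merge_head_outputs(outputs: Iterable[HeadOutput]) -> HeadGroup:
    """Merge head outputs of a single inference into one group, without syncing."""
    outputs = list(outputs)
    if not outputs:
        raise ValueError("expected at least one head output")
    # All heads of one inference are assumed to share the same timestamp.
    if len({out.timestamp for out in outputs}) != 1:
        raise ValueError("head outputs come from different inferences")
    return HeadGroup(
        timestamp=outputs[0].timestamp,
        messages={out.name: out.payload for out in outputs},
    )
```

Under these assumptions, the age and gender heads would arrive downstream as one message that can then be gathered against the detections, without a separate packaging node in between.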

I like the 4th option. It is easy to implement on the dai-nodes side and quite useful in such cases (there are at least 5 multi-head models in the ZOO), and it can work with GatherData.

Contributor

I like option 4 as well, let's add it on dai-nodes side yeah

Author

I agree. Starting a PR for it HERE.

4 participants