rotate face to neutral pose first #3
Comments
Actually, a simpler solution than rotating the landmarks would be to project the points onto a plane defined by some local axes.
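A minimal sketch of that projection, assuming the landmarks arrive as a (468, 3) NumPy array; the reference indices used to define the local frame are only illustrative:

```python
import numpy as np

def project_to_local_plane(landmarks, origin_idx, x_idx, y_idx):
    """Project 3D landmarks onto a plane spanned by two local axes.

    landmarks: (N, 3) array of 3D points.
    origin_idx, x_idx, y_idx: indices of landmarks defining the local frame
    (illustrative choices; pick stable points such as nose/eye corners).
    """
    origin = landmarks[origin_idx]
    x_axis = landmarks[x_idx] - origin
    x_axis /= np.linalg.norm(x_axis)
    # Build a second in-plane axis orthogonal to x_axis.
    y_ref = landmarks[y_idx] - origin
    y_axis = y_ref - np.dot(y_ref, x_axis) * x_axis
    y_axis /= np.linalg.norm(y_axis)
    # 2D coordinates of every landmark in the local plane.
    rel = landmarks - origin
    return np.stack([rel @ x_axis, rel @ y_axis], axis=-1)
```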
I actually got those points in a better way, but haven't had the time to implement it properly yet.
Currently, the code uses metric landmarks or normalized landmarks (image pixel space) to calculate blendshape values. I tried both methods and the results look awful. However, both approaches ignore face identity, and different people have very different faces. I even tried a rigid transformation to map my metric landmarks onto the canonical face provided by MediaPipe, but even neutral faces look different in the transformed (canonical) space. Do you have any suggestions? I am also working on a data-driven blendshape solver (deep learning, by collecting enough MetaHuman faces and their blendshape values).
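For reference, the rigid transformation mentioned above can be done with a standard Kabsch alignment; this is only a sketch, assuming corresponding (N, 3) arrays for the detected metric landmarks and the canonical face mesh:

```python
import numpy as np

def rigid_align(source, target):
    """Least-squares rigid alignment (Kabsch) of source onto target.

    source, target: (N, 3) arrays of corresponding landmarks, e.g. the
    detected metric landmarks and MediaPipe's canonical face mesh.
    Returns rotation R and translation t so that source @ R.T + t ~ target.
    """
    src_c = source - source.mean(axis=0)
    tgt_c = target - target.mean(axis=0)
    H = src_c.T @ tgt_c
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = target.mean(axis=0) - source.mean(axis=0) @ R.T
    return R, t
```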
A deep-learning-based approach seems OK, but it requires a lot of paired data for training.
Yes. It needs massive paired data, such as hundreds of faces. Luckily, MetaHuman is realistic enough to substitute for collecting real human faces. I am working on a MetaHuman project that receives blendshape values and saves the results as images.
I am no longer at NetEase, but image-blendshape pairs from MetaHuman can be acquired easily if you are familiar with UE.
Some updates. Method: training a neural network mapping synthesized MetaHuman faces to 52 blendshape values. Result: the network converges well on the synthesized dataset, and testing on synthesized data works well. However, it does not generalize to real human faces.
@qhanson I suggest training the model to predict blendshape values directly from MediaPipe landmarks. To generate the ground-truth blendshape values for the dataset, you'll have to use something like LiveLinkFace, mentioned in the README. This MediaPipe -> blendshape model is the missing piece for replacing LiveLinkFace.
In my experiment, directly learning the mapping (468*3 -> 52) with a 4-layer MLP does not work well. With L1 loss, the output stays the same; with L2 loss, the mouth can open and close, but the eyes stay open all the time. This reminds me of the mesh classification problem: passing the rendered mesh or the point cloud of 468 landmarks may work, although that way we cannot exploit the pretrained weights of MediaPipe. I also do not know the minimum number of paired image-to-blendshape samples needed. Note: I have not tested this approach.
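For context, a sketch of the 4-layer MLP described above, in PyTorch; the hidden width and the sigmoid output (to keep blendshape weights in [0, 1]) are assumptions:

```python
import torch
import torch.nn as nn

class Landmark2Blendshape(nn.Module):
    """4-layer MLP mapping 468 3D landmarks to 52 blendshape values."""

    def __init__(self, n_landmarks=468, n_blendshapes=52, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_landmarks * 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_blendshapes),
            nn.Sigmoid(),  # blendshape weights live in [0, 1]
        )

    def forward(self, landmarks):
        # landmarks: (batch, 468, 3), flattened before the MLP
        return self.net(landmarks.flatten(start_dim=1))

# model = Landmark2Blendshape()
# pred = model(torch.randn(8, 468, 3))   # -> (8, 52)
```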
I would try a smaller input: you don't need all 468 keypoints. I would start with the ones I'm using in my config file and slowly add more (by looking at the ones that really matter for facial expressions). That way you will need far less training data (and training time).
Can you share your data? I want to use it to train a mediapipe2blendshape network; if it works well, I will share the network with you.
For simple experiments, you do not need these datasets to train a model. You can try https://github.com/yeemachine/kalidokit
What loss function did you use to train this network? There is another morphable head model named FLAME, which offers a tool to generate a 3D mesh from its 100 expression parameters (something like blendshapes). With such a tool we could build loss functions by mapping the parameters back to the mesh/image space (3D -> 2D) and comparing the landmarks of the face. But it seems ARKit lacks this kind of tool for the mapping. If you use a plain L1 loss or similar, it only measures similarity of the numbers, not similarity of the actual expressions. I guess that's why your model is not generalizing well.
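As a sketch of that idea without FLAME or ARKit tooling: if a differentiable decoder from blendshape weights to landmark positions is available (a plain linear blendshape basis is assumed here purely for illustration), the loss can be computed in landmark space instead of parameter space:

```python
import torch

def landmark_space_loss(pred_bs, target_bs, neutral, basis):
    """Compare expressions in landmark space rather than parameter space.

    pred_bs, target_bs: (batch, 52) blendshape weights.
    neutral: (N, 3) neutral-pose vertices.
    basis: (52, N, 3) per-blendshape vertex offsets (linear model assumed).
    """
    pred_mesh = neutral + torch.einsum('bk,knd->bnd', pred_bs, basis)
    target_mesh = neutral + torch.einsum('bk,knd->bnd', target_bs, basis)
    return torch.mean(torch.abs(pred_mesh - target_mesh))
```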
I noticed that the results vary with face rotation / head tilt, since the values used are tuned to the neutral, upright face orientation. I think you should first rotate the landmarks into a neutral pose before doing the calculations, so that the results are rotation-invariant. Are there plans to add this feature?
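A sketch of that neutralization step, assuming a stored copy of a few roughly expression-rigid landmarks in the neutral/canonical pose is available; the landmark indices below are only illustrative:

```python
import numpy as np

# Landmarks that barely move with expressions (illustrative picks:
# nose bridge, nose tip, outer eye corners).
RIGID_IDX = [6, 4, 33, 263]

def neutralize_rotation(landmarks, canonical_rigid):
    """Rotate landmarks so the head matches the neutral/upright orientation.

    landmarks: (468, 3) detected landmarks.
    canonical_rigid: (len(RIGID_IDX), 3) same points in the neutral pose.
    """
    src = landmarks[RIGID_IDX] - landmarks[RIGID_IDX].mean(axis=0)
    tgt = canonical_rigid - canonical_rigid.mean(axis=0)
    U, _, Vt = np.linalg.svd(src.T @ tgt)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # current pose -> neutral pose
    center = landmarks.mean(axis=0)
    return (landmarks - center) @ R.T + center
```

The de-rotated landmarks could then be fed into the existing blendshape calculations so the tuned thresholds see a roughly upright face regardless of head tilt.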