BodyPose Model Card
bias, model card, BodyPose | March 14, 2025
BodyPose is built on TensorFlow's MoveNet and BlazePose models.
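For context, the underlying model is chosen when BodyPose is instantiated. Below is a minimal sketch assuming the ml5.js BodyPose API (ml5.bodyPose / detectStart); exact option and callback names may differ between library versions.

```ts
// Minimal usage sketch, assuming the ml5.js BodyPose API; names may
// differ between ml5.js versions.
declare const ml5: any; // ml5.js ships as a browser global without official types

// Passing "MoveNet" (the default) or "BlazePose" selects which TensorFlow
// model backs BodyPose.
const bodyPose = ml5.bodyPose("BlazePose", () => {
  console.log("BlazePose loaded");
});

const video = document.querySelector("video")!;

// Each detected pose exposes keypoints with positions and confidence scores.
bodyPose.detectStart(video, (poses: any[]) => {
  for (const pose of poses) {
    console.log(pose.keypoints);
  }
});
```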
MoveNet
MoveNet was trained on two datasets:
COCO Keypoint Dataset Training Set 2017
- Date created: 2017
- Size: 28K images
- How the data was collected: “In-the-wild images with diverse scenes, instance sizes, and occlusions.” The original dataset of 64K images was distilled to the final 28K so that each image contains three or fewer people.
- Bias:
- According to the public model card, the qualitative analysis shows that although the dataset has a 3:1 male-to-female ratio and favors young and light-skinned individuals, the models are stated to perform “fairly” (< 5% performance difference between most categories).
- Categories of evaluation:
- Male / Female (gender)
- Young / Middle-age / Old (age)
- Darker / Medium / Lighter (skin tone)
- Additional research:
- There has been a fair amount of research on the COCO dataset, most of it showing that the dataset contains numerous biases stemming from the underrepresentation of certain demographics. (A sketch of the per-group gap check behind the “< 5%” fairness criterion follows this list.)
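To make the “< 5% performance difference” criterion concrete, here is a minimal sketch of the per-group gap check it implies. The category scores are hypothetical placeholders, not numbers from the model card.

```ts
// Hypothetical per-group keypoint accuracy scores (placeholders, not
// numbers from the MoveNet model card).
const groupScores: Record<string, number> = {
  male: 0.87,
  female: 0.85,
  young: 0.88,
  middleAge: 0.86,
  old: 0.84,
  darker: 0.84,
  medium: 0.86,
  lighter: 0.88,
};

// "Fair" here mirrors the model card's criterion: the spread between the
// best- and worst-performing categories stays under 5 percentage points.
function maxGap(scores: Record<string, number>): number {
  const values = Object.values(scores);
  return Math.max(...values) - Math.min(...values);
}

const gap = maxGap(groupScores);
console.log(`max gap: ${(gap * 100).toFixed(1)} pp, fair: ${gap < 0.05}`);
```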
Active Dataset Training Set
- Date created: 2017-2021 (assumed)
- Size: 23.5k images
- How the data was collected: “Images sampled from YouTube fitness videos which capture people exercising (e.g. HIIT, weight-lifting, etc.), stretching, or dancing. It contains diverse poses and motion with more motion blur and self-occlusions.”
- Bias:
- According to the model card, the models are stated to perform “fairly” (< 5% performance differences between all categories).
- Categories of evaluation:
- Male / Female (gender)
- Young / Middle-age / Old (age)
- Darker / Medium / Lighter (skin tone)
- Additional research:
- The Active Single Person Image set, unlike the COCO dataset, is not public, so no additional research has been conducted to evaluate its fairness.
As stated, fitness videos uploaded to YouTube were used to assemble this internal Google dataset. Only in 2024 did Google give creators the option to opt out of having their videos used for its AI/ML research.
BlazePose
The following is drawn from BlazePose’s research paper and model card.
- Date created: 2020-2021 (assumed)
- Size: 80K images
- How the data was collected: Not stated in the original research paper. The model card asserts: “This model was trained and evaluated on images, including consented images (30K), of people using a mobile AR application captured with smartphone cameras in various “in-the-wild” conditions. The majority of training images (85K) capture a wide range of fitness poses.”
- Bias:
- According to the model card, the models are stated to perform “fairly”.
- Categories of evaluation:
- 14 subregions
- Male / Female (gender)
- 6 skin tones
- Evaluation results:
- Subregions (14): difference in confidence between the average and worst-performing regions is 4.8% for the heavy model, 4.8% for the full model, and 6.5% for the lite model.
- Gender: difference in confidence is 1.1% for the heavy model, 2.2% for the full model, and 3.1% for the lite model.
- Skin tones: difference in confidence between the worst- and best-performing categories is 5.7% for the heavy model, 7.0% for the full model, and 7.3% for the lite model. (A sketch of these two gap computations follows this list.)
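Note that the subregion metric above compares the average region against the worst-performing one, while the skin-tone metric compares best against worst. A minimal sketch of both computations, using hypothetical per-category confidences rather than values from the model card:

```ts
// Hypothetical per-subregion mean confidences for one model variant
// (placeholders, not values from the BlazePose model card).
const regionConfidence: number[] = [
  0.95, 0.93, 0.96, 0.91, 0.94, 0.92, 0.9,
  0.95, 0.93, 0.94, 0.96, 0.92, 0.91, 0.94,
];

// Subregion metric: average confidence minus the worst region's confidence.
function averageVsWorstGap(conf: number[]): number {
  const avg = conf.reduce((a, b) => a + b, 0) / conf.length;
  return avg - Math.min(...conf);
}

// Skin-tone metric: best-performing category minus worst-performing category.
function bestVsWorstGap(conf: number[]): number {
  return Math.max(...conf) - Math.min(...conf);
}

console.log(`avg vs worst: ${(averageVsWorstGap(regionConfidence) * 100).toFixed(1)} pp`);
console.log(`best vs worst: ${(bestVsWorstGap(regionConfidence) * 100).toFixed(1)} pp`);
```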
No additional research has been conducted to evaluate the fairness of this model, and there is no specific information on how consent was obtained for the images.