Extracting Sign Language Articulation from Videos with MediaPipe
Original version: In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 169–178
This paper evaluates methods for extracting phonological information about Swedish Sign Language signs from video data using MediaPipe’s pose estimation. The methods estimate i) the articulation phase, ii) hand dominance (left vs. right), iii) the number of hands articulating (one- vs. two-handed signs) and iv) the sign’s place of articulation. The results show that MediaPipe’s tracking of the hands’ location and movement in videos can be used to estimate the articulation phase of signs. Whereas including transport movements improves the accuracy of estimating hand dominance and number of hands, removing transport movements is crucial for estimating a sign’s place of articulation.
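As a rough illustration of how hand dominance and the number of articulating hands might be estimated from tracked hand positions, the sketch below compares how far each wrist travels over a clip. It is not the paper's method: the function names, the use of wrist path length as the movement measure, and the `two_handed_ratio` threshold of 0.5 are all assumptions made for this example. The input is assumed to be per-frame (x, y) wrist coordinates of the kind a pose estimator such as MediaPipe outputs, already extracted into plain lists.

```python
import math


def path_length(points):
    """Sum of Euclidean distances between consecutive (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def estimate_hands(left_wrist, right_wrist, two_handed_ratio=0.5):
    """Guess the dominant hand and whether the sign is two-handed.

    Hypothetical heuristic: the hand that travels farther is treated as
    dominant, and the sign counts as two-handed when the less active hand
    travels at least `two_handed_ratio` times as far as the more active one.
    """
    left = path_length(left_wrist)
    right = path_length(right_wrist)
    dominant = "right" if right >= left else "left"
    larger, smaller = max(left, right), min(left, right)
    two_handed = larger > 0 and smaller / larger >= two_handed_ratio
    return dominant, two_handed


# Made-up trajectory: the right wrist rises and returns, the left stays at rest.
right = [(0.6, 0.8), (0.6, 0.6), (0.55, 0.4), (0.6, 0.6), (0.6, 0.8)]
left = [(0.4, 0.8)] * 5
print(estimate_hands(left, right))
```

In this sketch the whole trajectory is used, including the movement to and from the rest position, which loosely mirrors the finding that transport movements help for these two estimates. Estimating place of articulation would instead call for trimming the trajectory to the articulation phase first.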