GCTW Alignment for isolated gesture recognition
Universidad Católica San Pablo
In recent years, there has been increasing interest in developing automatic Sign Language Recognition (SLR) systems because Sign Language (SL) is the main mode of communication between deaf people all over the world. However, most people outside the deaf community do not understand SL, generating a communication problem, between both communities. Recognizing signs is a challenging problem because manual signing (not taking into account facial gestures) has four components that have to be recognized, namely, handshape, movement, location and palm orientation. Even though the appearance and meaning of basic signs are well-defined in sign language dictionaries, in practice, many variations arise due to different factors like gender, age, education or regional, social and ethnic factors which can lead to significant variations making hard to develop a robust SL recognition system. This project attempts to introduce the alignment of videos into isolated SLR, given that this approach has not been studied deeply, even though it presents a great potential for correctly recognize isolated gestures. We also aim for a user-independent recognition, which means that the system should give have a good recognition accuracy for the signers that were not represented in the data set. The main features used for the alignment are the wrists coordinates that we extracted from the videos by using OpenPose. These features will be aligned by using Generalized Canonical Time Warping. The resultant videos will be classified by making use of a 3D CNN. Our experimental results show that the proposed method has obtained a 65.02% accuracy, which places us 5th in the 2017 Chalearn LAP isolated gesture recognition challenge, only 2.69% away from the first place.