Open Japanese Sign Language Data Collection & Learning Games for Parents of Deaf Kids
Aug. 2024 – Present
Core Collaborators: Thad Starner (Google DeepMind, Georgia Tech), Asuka Ando (UTokyo), Misa Suzuki (Gallaudet University), Uiko Yano (Kwansei Gakuin), Yoshihiro Kawahara (UTokyo), Kai Kunze (Keio University)
Large-scale sign language data collection with tablets.
Sign videos of ある (exists) and 歌う (sing), recorded in a studio by our team and used as reference signs during large-scale data collection.
About 1 in 1,000 children is born deaf, and 90% of them are born to hearing parents. Parents who cannot use sign language often struggle to communicate effectively with their children, leading to language deprivation and dinner table syndrome: feelings of isolation and exclusion when deaf children cannot fully participate in conversations during group meals or other social gatherings. As a result, these children frequently face social isolation, mental health issues, and unemployment. To address this, our goal is to create an environment where hearing parents of deaf children can learn sign language.
We will implement a sign language learning game on tablets, allowing users to learn sign language anywhere in their free time. Existing Japanese Sign Language (JSL) learning games are limited because they do not recognize the user's signing, preventing effective practice and reducing learning efficacy. Building practice into the game requires a sign language recognition AI model and its training data; however, no large-scale JSL dataset exists that can serve this purpose.
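To make the practice loop concrete, below is a minimal sketch, in Python, of how a tablet game could check a learner's sign: landmarks extracted from the camera with MediaPipe Holistic are stacked into a short clip and handed to a recognizer, whose prediction is compared against the target sign. The classifier interface, frame count, and camera index are illustrative assumptions, not our actual implementation.

```python
# Hypothetical sketch of an in-game practice check. The classifier is
# assumed to map a (frames, features) array to a gloss label; this is
# an assumption for illustration, not our production pipeline.
import cv2
import mediapipe as mp
import numpy as np

mp_holistic = mp.solutions.holistic

def landmarks_to_vector(results):
    """Flatten pose + hand landmarks into one feature vector per frame."""
    parts = [results.pose_landmarks, results.left_hand_landmarks,
             results.right_hand_landmarks]
    if all(p is None for p in parts):
        return None
    coords = []
    for part, n in zip(parts, (33, 21, 21)):  # landmark counts per part
        if part is None:
            coords.extend([0.0] * n * 3)      # pad missing parts with zeros
        else:
            for lm in part.landmark:
                coords.extend([lm.x, lm.y, lm.z])
    return np.asarray(coords, dtype=np.float32)

def practice_one_sign(classifier, target_label, num_frames=60):
    """Capture a short clip, classify it, and report whether it matched."""
    cap = cv2.VideoCapture(0)  # assumed tablet front camera index
    frames = []
    with mp_holistic.Holistic() as holistic:
        while len(frames) < num_frames:
            ok, bgr = cap.read()
            if not ok:
                break
            results = holistic.process(cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB))
            vec = landmarks_to_vector(results)
            if vec is not None:
                frames.append(vec)
    cap.release()
    if not frames:
        return False
    predicted = classifier.predict(np.stack(frames))
    return predicted == target_label
```

The sketch operates on a sequence of frames rather than a single image because signs are dynamic; any real recognizer for this task would need temporal context.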
To execute this project, we need deep access to the sign language community, know-how for large-scale data collection, and expertise in AI model training. I lead the data collection and model training team, in partnership with a linguistics team (including Deaf researchers and Children of Deaf Adults, CODAs) that will create a core parent-child communication vocabulary. Together, we will capture videos of Deaf signers across Japan, recording targeted concepts without geographic constraints. AI model training is conducted in collaboration with Thad Starner and his teams at Google and Georgia Tech.
We have collected 90k annotated JSL sign videos from 20 individuals so far, making this the most extensive JSL dataset to date. The dataset is planned for public release in December 2025 under the CC-BY 4.0 license, allowing researchers to build their own JSL recognizers for applications such as learning apps and translation.
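As an illustration of how the released dataset might be consumed, the sketch below indexes a hypothetical annotations file and performs a signer-independent split, a common evaluation setup in sign language recognition. The file layout and column names (annotations.csv, gloss, signer_id) are assumptions for this example; the published format may differ.

```python
# Sketch of indexing the dataset for training a recognizer. The layout
# (annotations.csv with 'video', 'gloss', 'signer_id' columns) is a
# hypothetical assumption, not the published format.
import csv
from collections import Counter
from pathlib import Path

def load_annotations(root):
    """Read one row per annotated sign video."""
    with open(root / "annotations.csv", newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

def signer_independent_split(rows, held_out):
    """Hold out entire signers so evaluation measures how well the model
    generalizes to people it has never seen."""
    train = [r for r in rows if r["signer_id"] not in held_out]
    test = [r for r in rows if r["signer_id"] in held_out]
    return train, test

rows = load_annotations(Path("jsl-dataset"))
print(Counter(r["gloss"] for r in rows).most_common(5))  # most frequent signs
train, test = signer_independent_split(rows, held_out={"S18", "S19", "S20"})
print(f"{len(train)} training clips, {len(test)} held-out clips")
```

With only 20 signers, holding out whole individuals rather than random clips is the design choice that matters most: it prevents a model from scoring well simply by memorizing each signer's style.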
We are hosting a game co-design workshop with parents of deaf children, deaf individuals, professional game designers, and game design students to determine the best game design for learning JSL. The game is scheduled for release in December 2025. Additionally, we are expanding the vocabulary covered by our dataset.
This project is part of the Google & UTokyo: AI Symbiotic Future Society Program. We thank the participants for generously taking part in the data collection.