Autonomous mobile robots - Giving a robot the ability to interpret humanmovement patterns, and output a relevantresponse.
MetadataShow full item record
The demographic challenges caused by the proliferation of people of advanced age, and the following large expense of care facilities, are faced by many western countries, including Norway (eldrebølgen). A common denominator for the health conditions faced by the elderly is that they can be improved through the use of physical therapy. By combining the state-of-the-art methods in deep learning and robotics, one can potentially develop systems relevant for assisting in rehabilitation training for patients suffering from various diseases, such as stroke. Such systems can be made to not depend on physical contact, i.e. socially assistive robots. As of this writing, the current state-of-the-art for action recognition is presented in a paper called ``Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition'', introducing a deep learning model called spatial temporal graph convolutional network (ST-GCN) trained on DeepMind’s Kinetics dataset. We combine the ST-GCN model with the Robot Operating System (ROS) into a system deployed on a TurtleBot 3 Waffle Pi, equipped with a NVIDIA Jetson AGX Xavier, and a web camera mounted on top. This results in a completely physically independent system, able to interact with people, both interpreting input, and outputting relevant responses. Furthermore, we achieve a substantial decrease in the inference time compared to the ST-GCN pipeline, making the pipeline about 150 times faster and achieving close to real-time processing of video input. We also run multiple experiments to increase the model’s accuracy, such as transfer learning, layer freezing, and hyperparameter tuning, focusing on batch size, learning rate, and weight decay.