Hand motion-aware surgical tool localization and classification from an egocentric camera

Tomohiro Shimizu, Ryo Hachiuma, Hiroki Kajita, Yoshifumi Takatsume, Hideo Saito

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)


Detecting surgical tools is an essential task for the analysis and evaluation of surgical videos. However, in open surgery such as plastic surgery, detection is difficult because some surgical tools, such as scissors and needle holders, have similar shapes. Unlike in endoscopic surgery, the tips of the tools are often hidden in the operating field and are not captured clearly due to low camera resolution, whereas the movements of the tools and hands can be captured. Because the different uses of each tool require different hand movements, hand movement data can be used to classify the two types of tools. We combined three modules, for localization, selection, and classification, to detect the two tools. In the localization module, we employed Faster R-CNN to detect surgical tools and target hands, and in the classification module, we extracted hand movement information by combining ResNet-18 and LSTM to classify the two tools. We created a dataset in which seven different types of open surgery were recorded, and we provided annotations for surgical tool detection. Our experiments show that our approach successfully detected the two different tools and outperformed the two baseline methods.
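
The abstract names a selection module that sits between localization and classification but does not say how the target hand is chosen. As a rough illustration only, the sketch below assumes a nearest-center rule: among the detected hand boxes, it picks the one whose center lies closest to the detected tool box. The `Box` type, the `select_hand` function, and the matching rule itself are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical detector output)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def select_hand(tool: Box, hands: List[Box]) -> Optional[Box]:
    """Return the hand box whose center is nearest to the tool's center.

    Assumption: nearest-center matching. The source does not specify
    the rule used by its selection module.
    """
    if not hands:
        return None  # no hand detected in this frame
    tx, ty = tool.center
    return min(hands, key=lambda h: hypot(h.center[0] - tx, h.center[1] - ty))


# Example: one tool detection and two hand detections in a single frame.
tool_box = Box(100, 100, 140, 140)
near_hand = Box(110, 110, 170, 170)
far_hand = Box(400, 50, 460, 110)
target = select_hand(tool_box, [far_hand, near_hand])
```

In a pipeline of this shape, the selected hand region would then be cropped from consecutive frames and passed to the motion classifier; other matching rules, such as box overlap, would slot into the same place.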

Original language: English
Article number: 15
Journal: Journal of Imaging
Issue number: 2
Publication status: Published - 2021 Feb


  • Egocentric camera
  • Object detection
  • Open surgery
  • Surgical tools

ASJC Scopus subject areas

  • Radiology Nuclear Medicine and imaging
  • Computer Vision and Pattern Recognition
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering

