Single-modal incremental terrain clustering from self-supervised audio-visual feature learning

Reina Ishikawa, Ryo Hachiuma, Akiyoshi Kurobe, Hideo Saito

Research output: Conference contribution

1 Citation (Scopus)

Abstract

The key to an accurate understanding of terrain is extracting informative features from multi-modal data obtained from different devices. Sensors such as RGB cameras, depth sensors, vibration sensors, and microphones serve as the sources of this multi-modal data. Many studies, especially in robotics, have explored ways to use them, and some have successfully introduced single-modal or multi-modal methods. In practice, however, robots can face extreme conditions: microphones do not work well in crowded scenes, and an RGB camera cannot capture terrain well in the dark. In this paper, we present a novel framework that applies a multi-modal variational autoencoder and the Gaussian mixture model clustering algorithm to image and audio data for terrain type clustering. Our method enables terrain type clustering even if one of the modalities (either image or audio) is missing at test time. We compared its clustering accuracy against a conventional multi-modal terrain type clustering method and conducted ablation studies to show the effectiveness of our approach.
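
The abstract does not say how the model copes with a missing modality. One common choice in multi-modal VAEs is product-of-experts fusion of per-modality Gaussian posteriors, where an absent modality is simply left out of the product. The sketch below assumes that choice. The names `poe_fuse` and `gmm_assign` are hypothetical, and the hand-set Gaussian parameters stand in for trained encoders and a fitted mixture. It illustrates the idea and is not the authors' implementation.

```python
import math

def poe_fuse(experts):
    """Fuse diagonal-Gaussian experts with a standard-normal prior expert.

    experts: list of (mean, var) pairs of equal-length lists, or None for a
    missing modality. Assumption: product-of-experts fusion, which the
    abstract does not confirm.
    """
    present = [e for e in experts if e is not None]
    if not present:
        raise ValueError("at least one modality must be present")
    dim = len(present[0][0])
    precision = [1.0] * dim   # prior N(0, 1) contributes precision 1
    weighted = [0.0] * dim    # prior mean is 0, so it adds nothing here
    for mean, var in present:
        for i in range(dim):
            precision[i] += 1.0 / var[i]
            weighted[i] += mean[i] / var[i]
    fused_mean = [w / p for w, p in zip(weighted, precision)]
    fused_var = [1.0 / p for p in precision]
    return fused_mean, fused_var

def gmm_assign(z, components):
    """Return the index of the mixture component with the highest posterior.

    components: list of (weight, mean, var) with diagonal covariance.
    """
    log_posts = []
    for weight, mean, var in components:
        lp = math.log(weight)
        for zi, m, v in zip(z, mean, var):
            lp += -0.5 * (math.log(2.0 * math.pi * v) + (zi - m) ** 2 / v)
        log_posts.append(lp)
    return max(range(len(log_posts)), key=log_posts.__getitem__)

# Illustrative, hand-set values (not taken from the paper).
image_expert = ([2.0, 0.0], [0.5, 0.5])
audio_expert = ([1.5, 0.2], [1.0, 1.0])
clusters = [(0.5, [-2.0, 0.0], [1.0, 1.0]),
            (0.5, [2.0, 0.0], [1.0, 1.0])]

both_mean, _ = poe_fuse([image_expert, audio_expert])
image_only_mean, _ = poe_fuse([image_expert, None])
audio_only_mean, _ = poe_fuse([None, audio_expert])

labels = [gmm_assign(m, clusters)
          for m in (both_mean, image_only_mean, audio_only_mean)]
```

In this toy setting, the fused latent mean lands nearest the same cluster whether both modalities or only one is available, which is the behaviour the abstract describes at test time.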

Original language: English
Title of host publication: Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 9399-9406
Number of pages: 8
ISBN (Electronic): 9781728188089
DOI
Publication status: Published - 2020
Event: 25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Milan, Italy
Duration: 10 Jan 2021 to 15 Jan 2021

Publication series

Name: Proceedings - International Conference on Pattern Recognition
ISSN (Print): 1051-4651

Conference

Conference: 25th International Conference on Pattern Recognition, ICPR 2020
Country/Territory: Italy
City: Virtual, Milan
Period: 10 Jan 2021 to 15 Jan 2021

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
