VAST: Visual and Spectral Terrain Classification in Unstructured Multi-Class Environments


Terrain classification is a challenging task for robots operating in unstructured environments. Existing classification methods make simplifying assumptions, such as a reduced number of classes, clearly segmentable roads, or good lighting conditions, and focus primarily on one sensor type. These assumptions do not translate well to off-road vehicles, which operate in varying terrain conditions. To provide mobile robots with the capability to identify the terrain being traversed and avoid undesirable surface types, we propose a multimodal sensor suite capable of classifying different terrains. We capture high-resolution macro images of surface texture, spectral reflectance curves, and localization data from a 9-degree-of-freedom (DOF) inertial measurement unit (IMU) on 11 different terrains at different times of day. Using this dataset, we train an individual neural network on each modality and then combine their outputs in a fusion network. The fused network achieved an accuracy of 99.98% on the test set, exceeding the result of the best individual network by 0.98%. We conclude that a combination of visual, spectral, and IMU data provides a meaningful improvement over state-of-the-art terrain classification approaches. The data created for this research is available at
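
As a rough illustration of the late-fusion idea described above (per-modality networks whose outputs are combined by a fusion network), the following minimal sketch concatenates three per-modality class-probability vectors and passes them through a single dense layer with a softmax. Everything here is an assumption made for illustration: the abstract does not specify the fusion architecture, the helper names (`softmax`, `make_fusion_layer`, `fuse`) are hypothetical, the per-modality outputs are random placeholders, and the fusion weights are untrained.

```python
import math
import random

NUM_CLASSES = 11  # the abstract reports 11 terrain types
MODALITIES = ("visual", "spectral", "imu")  # the three sensor streams


def softmax(scores):
    """Convert raw scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_fusion_layer(num_inputs, num_classes, seed=0):
    """Hypothetical single dense layer with small random (untrained) weights."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(num_inputs)]
               for _ in range(num_classes)]
    biases = [0.0] * num_classes
    return weights, biases


def fuse(per_modality_probs, weights, biases):
    """Concatenate per-modality outputs and apply the dense layer + softmax."""
    features = [p for name in MODALITIES for p in per_modality_probs[name]]
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Placeholder per-modality outputs; in practice these would come from the
# trained visual, spectral, and IMU networks.
rng = random.Random(1)
per_modality_probs = {
    name: softmax([rng.gauss(0.0, 1.0) for _ in range(NUM_CLASSES)])
    for name in MODALITIES
}

weights, biases = make_fusion_layer(len(MODALITIES) * NUM_CLASSES, NUM_CLASSES)
fused = fuse(per_modality_probs, weights, biases)
predicted_class = max(range(NUM_CLASSES), key=fused.__getitem__)
```

In a real system the fusion layer would be trained on held-out per-modality outputs, so that it learns which modality to trust for each terrain class; the sketch only shows the data flow.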

2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)