Visual perception is one of the bottlenecks limiting vehicle and pedestrian detection algorithms: a single visual sensor struggles to cope with complex road and weather conditions when detecting objects on the road. This paper therefore uses multimodal data to improve detection performance. First, we build a multimodal data acquisition system from a visible-light camera, a visible-light polarization camera core, a short-wave infrared camera core, and a long-wave infrared camera core, and construct a multimodal dataset to fill the gap in available data. Second, for heterogeneous image registration, we propose a registration algorithm based on multi-scale partial intensity invariant features built on improved SIFT feature points. Third, for object detection, we propose a multimodal object detection network based on YOLOv5. Finally, the proposed method improves the average accuracy by 1.0% on the daytime dataset and by 10.9% on the mixed daytime and nighttime dataset.