Abstract: [Objectives] The Amur Tiger and Amur Leopard are endangered, protected animals, and efficient means of identifying and monitoring them are of great significance for the conservation of species diversity. To address the tree occlusion, background interference, and difficulty of nighttime identification encountered in infrared camera monitoring in the wild, this study builds a deep learning model incorporating an attention mechanism as the basic framework for target recognition, providing an efficient and accurate method for wildlife identification.
[Methods] Video footage of wild animals was captured by automatic infrared cameras deployed in the Northeast China Tiger and Leopard National Park. Keyframes were extracted from 800 selected videos. After noise removal, image enhancement, and image calibration, a dataset of 11,020 images was constructed for five species: Amur Tiger, Amur Leopard, Wild Boar, Sika Deer, and Roe Deer. This study proposed a convolutional neural network model that integrates an attention mechanism module and realizes local cross-channel communication to reduce the impact of complex backgrounds on target recognition. The model accurately identified animals under varied conditions, including day and night, different angles, and different scenes. Its recognition performance was evaluated using metrics such as average precision, recall, accuracy, and F1 score.
[Results] The mean average precision (mAP) of the YOLO_v5m algorithm was 86.67%. After transfer learning was introduced, the mAP increased to 91.16% and the time consumption was shortened by 106 min. Among the four attention mechanisms (CA, CBAM, SE, and ECA), CA performed best, achieving an average accuracy of 93.72%, which was 1.85%, 1.78%, and 1.05% higher than the other three mechanisms, respectively (Fig. 5).
[Conclusion] The deep learning model proposed in this study, which integrates transfer learning and an attention mechanism, offers high accuracy and strong robustness while balancing training speed and recognition accuracy. Because the images were captured by infrared cameras deployed in the wild, the model could be tested under the real living conditions of wild animals. The improved model is better suited to identifying Amur Tigers and Amur Leopards against complex backgrounds.
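
The evaluation metrics named in the Methods (precision, recall, accuracy, F1 score, and mean average precision) can be illustrated with a minimal sketch. This is not the study's evaluation code: the function names `classification_metrics` and `mean_average_precision`, the detection counts, and the per-class AP values below are all hypothetical and serve only to show how the metrics relate to one another.

```python
def classification_metrics(tp, fp, fn, tn=0):
    """Compute precision, recall, F1, and accuracy from raw counts.

    tp, fp, fn, tn are true positives, false positives, false negatives,
    and true negatives for one class. Zero denominators yield 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def mean_average_precision(per_class_ap):
    """Average the per-class average precision (AP) values."""
    return sum(per_class_ap.values()) / len(per_class_ap)


if __name__ == "__main__":
    # Hypothetical counts for a single class, not taken from the study.
    print(classification_metrics(tp=90, fp=10, fn=10))

    # Hypothetical per-class AP values for the five species in the dataset.
    example_ap = {
        "Amur Tiger": 0.95,
        "Amur Leopard": 0.93,
        "Wild Boar": 0.90,
        "Sika Deer": 0.92,
        "Roe Deer": 0.91,
    }
    print(mean_average_precision(example_ap))
```

In a detection setting, each class's AP is itself derived from the precision-recall curve at a chosen overlap threshold; the sketch assumes those per-class values are already available and only shows the final averaging step.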