Commuting flow prediction is a crucial issue for transport optimization and urban planning. However, the two existing types of solutions have inherent flaws. One is traditional models, such as the gravity model and the radiation model. These models rely on fixed, simple mathematical formulas derived from physics and ignore rich geographic semantics, which makes it difficult for them to capture complex human mobility patterns. The other is machine learning models, most of which simply leverage Origin-Destination (OD) features, ignoring the topological nature of the interaction network and the spatial correlation introduced by nearby areas. In this paper, we propose a ‘preprocessing-encoder-decoder’ hybrid learning model that makes full use of geographic semantic information and spatial neighborhood effects, thereby significantly improving prediction performance.