| Submesoscale eddies play significant roles in oceanic energy transfer and ecological regulation, yet accurate, real-time monitoring of them remains challenging. To address the limitations of conventional monitoring methods, such as insufficient timeliness, limited spatial resolution, and poor flexibility of observation platforms, this paper proposes an edge search algorithm for submesoscale eddies based on a highly maneuverable multi-Unmanned Surface Vehicle (multi-USV) platform. The algorithm integrates a gradient-adaptive A-star algorithm with a dynamic reward-driven reinforcement learning method (Dynamic-Award Q-learning, DA-Q learning). To cope with the complex morphology of submesoscale eddy edges, rapidly changing environments, and incomplete information in realistic scenarios, the algorithm first uses gradient-adaptive A-star to guide multiple USVs to converge quickly on the eddy region. During the edge search, a dynamic reward mechanism combining temperature gradient variations and historical path information drives the DA-Q learning algorithm, enabling the USVs to adapt autonomously to environmental changes and adjust their courses flexibly. Using both simulated data and simulated sea surface temperature data, multiple sets of comparative experiments were designed, covering different seasons, numbers of USVs, and initial configurations. The performance and robustness of the algorithm were analyzed under various environmental conditions, eddy types, and eddy distortion scenarios. Results show that the multi-USV strategy for submesoscale eddy edge search outperforms the single-USV approach in both path length and search efficiency. In terms of deployment strategy, distributed multi-point deployment further enhances search coverage and mission response efficiency compared with release from a single mother vessel. 
The findings of this study can provide technical support for multi-USV monitoring of submesoscale eddies in complex marine environments. |
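
The dynamic reward idea summarized above can be illustrated with a minimal sketch. It is not the DA-Q learning algorithm itself, since the abstract does not specify the reward's functional form. The sketch assumes a reward that grows with the local sea surface temperature gradient magnitude and shrinks with the number of previous visits to a cell, and plugs it into a standard tabular Q-learning update. The grid layout, the action set `ACTIONS`, the function names, and the weights `w_grad` and `w_visit` are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical 8-connected moves on a regular grid, as (row, col) offsets.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1),
           (-1, -1), (-1, 1), (1, -1), (1, 1)]


def gradient_magnitude(sst):
    """Magnitude of the horizontal SST gradient on a regular grid."""
    d_row, d_col = np.gradient(sst)
    return np.hypot(d_row, d_col)


def dynamic_reward(grad_mag, visits, cell, w_grad=1.0, w_visit=0.5):
    """Assumed reward form: favor strong gradients, penalize revisited cells."""
    return w_grad * grad_mag[cell] - w_visit * visits[cell]


def q_step(q, grad_mag, visits, state, action, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the next state.

    q has shape (rows, cols, len(ACTIONS)); state is a (row, col) tuple.
    """
    rows, cols = grad_mag.shape
    d_row, d_col = ACTIONS[action]
    # Clamp the move so the agent stays inside the grid.
    nxt = (min(max(state[0] + d_row, 0), rows - 1),
           min(max(state[1] + d_col, 0), cols - 1))
    reward = dynamic_reward(grad_mag, visits, nxt)
    # Recording the visit lowers the reward for returning to this cell later.
    visits[nxt] += 1
    td_target = reward + gamma * q[nxt].max()
    idx = state + (action,)
    q[idx] += alpha * (td_target - q[idx])
    return nxt
```

Because `visits` is updated on every step, the reward for a given cell changes over the course of a search, which is one simple way to combine gradient information with path history. In a multi-USV setting, sharing a single `visits` array among agents would be one possible way to discourage overlapping tracks, though that too is an assumption rather than the paper's stated design.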