Autonomous target tracking of multi-UAV

research-article

  • Authors:
  • Jiahua Wang, School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
  • Ping Zhang, School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
  • Yang Wang, State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550525, China

Applied Soft Computing, Volume 145, Issue C, Sep 2023, https://doi.org/10.1016/j.asoc.2023.110604

Published: 01 September 2023

Abstract

In recent years, deep reinforcement learning (DRL) has developed rapidly and has been applied to multi-UAV target tracking (MTT). However, DRL still struggles with inefficient data utilization and slow learning. To address these two problems, this paper proposes a novel two-stage DRL-based decision-making method for multi-UAV systems. Specifically, a sample generator that combines an artificial potential field with a proportional–integral–derivative controller produces expert experience data. On this basis, a two-stage reinforcement learning training method is introduced. In the first stage, the policy network and the critic network are pre-trained on the expert data with a behavior cloning loss and an additional Q-value loss, which reduces ineffective exploration and speeds up learning. In the second stage, the average return of the most recent k excellent episodes is used to select the excellent experience generated by the agent itself; this experience then guides the policy network toward high-reward actions, improving the efficiency of data utilization. Extensive simulation experiments show that the method not only enables multiple UAVs to track the target continuously in environments with obstacles but also significantly improves learning speed and convergence.
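
The second-stage screening step can be illustrated with a minimal sketch. The abstract does not give the exact selection rule, so the code below makes an assumption: an episode counts as "excellent" when its return is at least the average return of the most recent k excellent episodes. The class name `ExcellentEpisodeFilter` and its methods are hypothetical and are not taken from the paper.

```python
from collections import deque
from statistics import mean


class ExcellentEpisodeFilter:
    """Hypothetical helper: keeps episodes whose return is at least the
    mean return of the last k episodes that were themselves kept.
    This acceptance rule is an assumption, not the paper's definition."""

    def __init__(self, k):
        self.recent_returns = deque(maxlen=k)  # returns of the last k kept episodes
        self.buffer = []                       # transitions from kept episodes

    def threshold(self):
        # With no history yet, every episode is accepted.
        if not self.recent_returns:
            return float("-inf")
        return mean(self.recent_returns)

    def offer(self, transitions, episode_return):
        """Store the episode if it passes the threshold; report the decision."""
        if episode_return >= self.threshold():
            self.recent_returns.append(episode_return)
            self.buffer.extend(transitions)
            return True
        return False


screen = ExcellentEpisodeFilter(k=2)
decisions = [
    screen.offer(["a"], 1.0),  # first episode, accepted
    screen.offer(["b"], 0.5),  # below the mean of [1.0], rejected
    screen.offer(["c"], 3.0),  # above the mean of [1.0], accepted
    screen.offer(["d"], 2.5),  # above the mean of [1.0, 3.0] = 2.0, accepted
    screen.offer(["e"], 2.6),  # below the mean of [3.0, 2.5] = 2.75, rejected
]
```

Because the threshold tracks only episodes that were already accepted, it tends to rise as the agent improves, so the stored experience becomes progressively more selective. How the stored transitions are then used to guide the policy network is not specified in the abstract and is left out of this sketch.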

Highlights

A new decision-making framework is proposed for MTT in obstacle environments.

TSDRL-EE makes full use of expert data and the excellent experience of the agent.

TSDRL-EE has obvious advantages in learning speed and convergence effect.

          ISSN: 1568-4946

          Publisher: Elsevier Science Publishers B. V., Netherlands

          Author Tags

          • Multi-UAV
          • DRL
          • TD3
          • Expert experience
          • Target tracking
