Paper notes: Collaborative Deep Reinforcement Learning for Joint Object Search

  This paper proposes learning inter-agent communication through gated cross connections between the Q-networks.

  1. It is the first work in object detection to use a collaborative deep RL algorithm;

Collaborative Deep Reinforcement Learning for Joint Object Search  

      3.2.2 Joint Exploitation Sampling  

  3. Collaborative RL for Joint Object Search 

      --- gated cross connections between different Q-networks;

  On the other hand, it seems especially beneficial in the context of visual object localization where different objects often appear with certain correlation patterns, e.g., a person riding a bicycle, a cup on a table, and so on.

        where $m^{(i)}$ denotes the message sent out by agent $i$, and $M^{(-i)}$ denotes the messages received from the other agents.
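
        To make the roles of the outgoing message $m^{(i)}$ and the incoming messages $M^{(-i)}$ concrete, here is a minimal sketch in plain Python. It is not the paper's architecture: the class `GatedAgent`, the single-layer encoder, the sigmoid gate, summing the other agents' messages, and all dimensions (8-dimensional state, 4 hidden units, 9 actions) are assumptions chosen only for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

class GatedAgent:
    """Hypothetical one-layer Q-network that emits a gated message."""

    def __init__(self, state_dim, hidden_dim, n_actions, rng):
        self.W_h = rand_matrix(hidden_dim, state_dim, rng)      # state encoder
        self.W_g = rand_matrix(hidden_dim, hidden_dim, rng)     # gate weights
        self.W_q = rand_matrix(n_actions, 2 * hidden_dim, rng)  # Q-value head

    def encode(self, state):
        return [math.tanh(x) for x in matvec(self.W_h, state)]

    def message(self, hidden):
        # m(i): the agent's hidden features, scaled by a gate in (0, 1).
        gates = [sigmoid(x) for x in matvec(self.W_g, hidden)]
        return [g * h for g, h in zip(gates, hidden)]

    def q_values(self, hidden, incoming):
        # Q-values depend on the agent's own features and on M(-i).
        return matvec(self.W_q, hidden + incoming)

def joint_q_values(agents, states):
    hiddens = [a.encode(s) for a, s in zip(agents, states)]
    messages = [a.message(h) for a, h in zip(agents, hiddens)]
    q_all = []
    for i, (agent, hidden) in enumerate(zip(agents, hiddens)):
        # M(-i): element-wise sum of the messages from all other agents.
        others = [m for j, m in enumerate(messages) if j != i]
        incoming = [sum(vals) for vals in zip(*others)]
        q_all.append(agent.q_values(hidden, incoming))
    return q_all

rng = random.Random(0)
agents = [GatedAgent(state_dim=8, hidden_dim=4, n_actions=9, rng=rng) for _ in range(2)]
states = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in agents]
q = joint_q_values(agents, states)
greedy_actions = [max(range(len(qi)), key=qi.__getitem__) for qi in q]
```

        Each agent still picks its own greedy action, but its Q-values now depend on what the other agent chose to pass through its gate, which is the sense in which the connection is "gated".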

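  Traditional bottom-up object region proposal methods extract a large number of proposals, so the subsequent computation depends on powerful hardware such as GPUs. When computing resources are limited, this restricts where such methods can be applied. Active search methods (i.e., RL-based methods) offer a good alternative: they can greatly reduce the number of proposals that need to be evaluated.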

Motivation:

      This paper extends the single-agent method to the multi-agent setting. The key concepts are:

      3.2.1 Q-Networks with Gated Cross Connections  

      --- joint exploitation sampling for generating corresponding training data, 

  When these objects interact, they can provide more contextual cues. These cues have good potential to support more effective search strategies.

      m is

  The paper studies the problem of joint active search for multiple objects that interact with each other.

  On the one hand, it is interesting to consider such a collaborative detection "game" played by multiple agents under an RL setting; 

  1. how to make communications effective in between different agents ; 

      --- a virtual agent implementation that facilitates easy adaptation to existing deep Q-learning algorithms. 

      This paper builds on the Q-function. A standard Q-function can be written as $Q(s, a; \theta)$, and a deep Q-network simply uses a neural network to estimate it. Suppose each agent $i$ has its own Q-network $Q^{(i)}(a^{(i)}, s^{(i)}; \theta^{(i)})$. Then, in the multi-agent RL setting, it is natural to design a Q-function that facilitates inter-agent communication, such as:
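
      A sketch of such a communication-aware Q-function, written with the message symbols used in these notes ($m^{(i)}$ sent by agent $i$, $M^{(-i)}$ received from the other agents). The exact argument order is an assumption:

      $$Q^{(i)}\left(a^{(i)}, m^{(i)}, s^{(i)}, M^{(-i)}; \theta^{(i)}\right)$$

      Compared with the per-agent network $Q^{(i)}(a^{(i)}, s^{(i)}; \theta^{(i)})$, the only change is that the value now also depends on the message the agent sends and on the messages it receives.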

CVPR 2017

  3. The method effectively exploits useful contextual information between related objects and further improves detection performance.

  2. how to jointly learn good policies for all agents. 

  2. propose a novel multi-agent Q-learning solution that facilitates learnable inter-agent communication with gated cross connections between the Q-networks;

  Contributions:

      The authors first review the general approach to single-agent object detection; it is not repeated here.

  

    3.2. Collaborative RL for Joint Object Localization 

    3.1. Single Agent RL Object Localization 

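  This paper proposes a collaborative multi-agent deep RL algorithm to learn the optimal policy for joint object localization. The proposal follows the existing RL framework but allows multiple agents to collaborate. The open questions in this area are: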