Abstract: | We are developing a helper robot that carries out tasks ordered by users through speech. The robot needs a vision system to recognize the objects named in the orders. However, conventional vision systems cannot recognize objects in complex scenes: they may find many objects and be unable to determine which one is the target. This paper proposes a method that solves this problem through conversation with the user. The robot asks a question that the user can easily answer and whose answer efficiently reduces the number of candidate objects. The method considers the characteristics of the features used for object identification, such as how easily humans can describe them in words, to generate a user-friendly and efficient sequence of questions. Experimental results show that the robot can detect target objects by asking the questions generated by the method. |
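The question-selection idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's actual algorithm: the feature names, the `EASE` weights, the scoring rule (ease multiplied by the expected reduction in candidates), and all function names are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical ease-of-description weights in [0, 1]; not taken from the paper.
EASE = {"color": 0.9, "size": 0.6, "shape": 0.4}


def expected_remaining(candidates, feature):
    """Expected number of candidates left after the user answers a question
    about `feature`, assuming every candidate is equally likely to be the target."""
    counts = Counter(obj[feature] for obj in candidates)
    n = len(candidates)
    # A group of size c is hit with probability c / n and leaves c candidates.
    return sum(c * c for c in counts.values()) / n


def choose_feature(candidates, ease=EASE):
    """Pick the feature with the highest ease-weighted expected reduction."""
    n = len(candidates)

    def score(feature):
        return ease[feature] * (n - expected_remaining(candidates, feature))

    return max(ease, key=score)


def filter_candidates(candidates, feature, answer):
    """Keep only the candidates whose `feature` matches the user's answer."""
    return [obj for obj in candidates if obj[feature] == answer]


# Toy scene: four detected objects with assumed symbolic feature values.
candidates = [
    {"name": "cup", "color": "red", "size": "small", "shape": "cylinder"},
    {"name": "can", "color": "red", "size": "small", "shape": "cylinder"},
    {"name": "box", "color": "blue", "size": "large", "shape": "cuboid"},
    {"name": "ball", "color": "green", "size": "small", "shape": "sphere"},
]

feature = choose_feature(candidates)  # the robot would ask about this feature
remaining = filter_candidates(candidates, feature, "blue")  # user answers "blue"
```

In this toy scene, color and shape split the candidates equally well, but color is assumed to be easier to put into words, so the sketch asks about color first. Repeating the choose-then-filter step until one candidate remains yields a sequence of questions.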