Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera.
Sensors (Basel). 2016 Dec 13; 16(12).

Abstract

Controlling robots by natural language (NL) is attracting increasing attention for its versatility, its convenience, and the fact that users need no extensive training. Grounding, i.e., enabling robots to understand NL instructions from humans, is a crucial challenge in this setting. This paper explores the object grounding problem and concretely studies how to detect target objects from NL instructions using an RGB-D camera in robotic manipulation applications. In particular, a simple yet robust vision algorithm is applied to segment objects of interest. Using the metric information of all segmented objects, object attributes and relations between objects are then extracted. NL instructions that incorporate multiple cues for object specification are parsed into domain-specific annotations. The annotations from NL and the information extracted from the RGB-D camera are matched in a computational state estimation framework that searches all possible object grounding states. The final grounding is obtained by selecting the states with the maximum probabilities. An RGB-D scene dataset associated with different groups of NL instructions, based on different cognition levels of the robot, is collected. Quantitative evaluations on the dataset illustrate the advantages of the proposed method. Experiments on NL-controlled object manipulation and NL-based task programming using a mobile manipulator show the method's effectiveness and practicability in robotic applications.
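The pipeline sketched in the abstract (segment objects, extract attributes and spatial relations, parse the instruction into annotations, then score candidate grounding states and keep the most probable one) can be illustrated with a small, self-contained example. The Python code below is only a toy sketch of that max-probability matching idea, not the authors' implementation; the annotation schema, attribute names, spatial-relation rule, and scoring weights are all hypothetical assumptions.

```python
# Toy illustration of grounding-by-maximum-probability matching.
# Not the paper's implementation: object attributes, the annotation
# schema, the relation rule, and all weights are made-up assumptions.

from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SceneObject:
    """An object segmented from the RGB-D scene, with extracted attributes."""
    name: str
    color: str
    position: Tuple[float, float, float]  # metric coordinates from depth data


def attribute_score(obj: SceneObject, annotation: Dict[str, str]) -> float:
    """Probability-like score for how well one object matches an NL annotation."""
    score = 1.0
    if "category" in annotation:
        score *= 0.9 if annotation["category"] == obj.name else 0.1
    if "color" in annotation:
        score *= 0.9 if annotation["color"] == obj.color else 0.1
    return score


def relation_score(target: SceneObject, reference: SceneObject, relation: str) -> float:
    """Score a spatial relation between two objects; smaller x counts as 'left'."""
    dx = target.position[0] - reference.position[0]
    if relation == "left of":
        return 0.9 if dx < 0 else 0.1
    if relation == "right of":
        return 0.9 if dx > 0 else 0.1
    return 0.5  # unknown relation: uninformative


def ground(objects: List[SceneObject], annotation: Dict) -> SceneObject:
    """Enumerate candidate grounding states and return the most probable target."""
    best_obj, best_p = None, -1.0
    for candidate in objects:
        p = attribute_score(candidate, annotation.get("target", {}))
        rel = annotation.get("relation")
        if rel:
            # Combine with the best-matching reference object for this relation.
            ref_scores = [
                attribute_score(o, annotation["reference"]) * relation_score(candidate, o, rel)
                for o in objects
                if o is not candidate
            ]
            p *= max(ref_scores, default=0.0)
        if p > best_p:
            best_obj, best_p = candidate, p
    return best_obj


if __name__ == "__main__":
    scene = [
        SceneObject("cup", "red", (0.2, 0.5, 0.0)),
        SceneObject("cup", "blue", (0.6, 0.5, 0.0)),
        SceneObject("box", "green", (0.4, 0.7, 0.0)),
    ]
    # Hypothetical parsed form of "pick up the cup left of the green box".
    instruction = {
        "target": {"category": "cup"},
        "relation": "left of",
        "reference": {"category": "box", "color": "green"},
    }
    print(ground(scene, instruction).color)  # -> "red"
```

Running the script grounds the instruction to the red cup, the candidate whose joint attribute and relation score is highest, which mirrors the paper's idea of selecting the grounding state with maximum probability.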

Authors and Affiliations

Jiatong Bao: Department of Hydraulic, Energy and Power Engineering, Yangzhou University, Yangzhou 225000, China. jtbao@yzu.edu.cn
Yunyi Jia: Department of Automotive Engineering, Clemson University, Greenville, SC 29607, USA. yunyij@clemson.edu
Yu Cheng: Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA. chengyu9@msu.edu
Hongru Tang: Department of Hydraulic, Energy and Power Engineering, Yangzhou University, Yangzhou 225000, China. hrtang@yzu.edu.cn
Ning Xi: Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA. xin@msu.edu

Pub Type(s)

Journal Article

Language

eng

PubMed ID

27983604

Citation

Bao, Jiatong, et al. "Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera." Sensors (Basel, Switzerland), vol. 16, no. 12, 2016.
Bao J, Jia Y, Cheng Y, et al. Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera. Sensors (Basel). 2016;16(12).
Bao, J., Jia, Y., Cheng, Y., Tang, H., & Xi, N. (2016). Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera. Sensors (Basel, Switzerland), 16(12).
Bao J, et al. Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera. Sensors (Basel). 2016 Dec 13;16(12). PubMed PMID: 27983604.
* Article titles in AMA citation format should be in sentence-case
TY  - JOUR
T1  - Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera.
AU  - Bao, Jiatong
AU  - Jia, Yunyi
AU  - Cheng, Yu
AU  - Tang, Hongru
AU  - Xi, Ning
Y1  - 2016/12/13/
PY  - 2016/09/20/received
PY  - 2016/11/24/revised
PY  - 2016/12/07/accepted
PY  - 2016/12/17/entrez
PY  - 2016/12/17/pubmed
PY  - 2016/12/17/medline
KW  - natural language control
KW  - natural language processing
KW  - object grounding
KW  - object recognition
KW  - robotic manipulation system
KW  - target object detection
JF  - Sensors (Basel, Switzerland)
JO  - Sensors (Basel)
VL  - 16
IS  - 12
N2  - Controlling robots by natural language (NL) is increasingly attracting attention for its versatility, convenience and no need of extensive training for users. Grounding is a crucial challenge of this problem to enable robots to understand NL instructions from humans. This paper mainly explores the object grounding problem and concretely studies how to detect target objects by the NL instructions using an RGB-D camera in robotic manipulation applications. In particular, a simple yet robust vision algorithm is applied to segment objects of interest. With the metric information of all segmented objects, the object attributes and relations between objects are further extracted. The NL instructions that incorporate multiple cues for object specifications are parsed into domain-specific annotations. The annotations from NL and extracted information from the RGB-D camera are matched in a computational state estimation framework to search all possible object grounding states. The final grounding is accomplished by selecting the states which have the maximum probabilities. An RGB-D scene dataset associated with different groups of NL instructions based on different cognition levels of the robot are collected. Quantitative evaluations on the dataset illustrate the advantages of the proposed method. The experiments of NL controlled object manipulation and NL-based task programming using a mobile manipulator show its effectiveness and practicability in robotic applications.
SN  - 1424-8220
UR  - https://www.unboundmedicine.com/medline/citation/27983604/Detecting_Target_Objects_by_Natural_Language_Instructions_Using_an_RGB_D_Camera_
L2  - http://www.mdpi.com/resolver?pii=s16122117
DB  - PRIME
DP  - Unbound Medicine
ER  -