<p dir="ltr">This paper introduces a human-robot interaction approach that enables a robot to learn complex automated tasks from natural human actions. Our system comprises marker-less human pose estimation that is retargeted onto a robot for learning from demonstration (LfD), while providing augmented visual feedback to the human. This is achieved through a digital twin setup: a virtual environment that replicates the physical scene, with the robot on one end and the human teleoperator on the other. The system begins with a low-cost, off-the-shelf marker-less human motion capture module; the captured posture is then mapped onto the robot via its virtual digital twin. Our mapping accounts for the kinematic differences between the human and the robot by normalizing the Euclidean distances between adjacent joints (i.e., link lengths). We also include a preliminary adaptive hysteresis thresholding method to suppress jitter in the estimated joint locations during motion capture while preserving high-frequency movements. Finally, our system is integrated with state-of-the-art 3D scene reconstruction and object detection methods.</p>
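The abstract does not specify the adaptive hysteresis thresholding in detail. Below is a minimal sketch of one plausible deadband-style filter for tracked joint positions, assuming a displacement threshold that shrinks as the joint's instantaneous speed grows, so small stationary jitter is held back while fast, high-amplitude movements pass through. The function name `hysteresis_filter` and the parameters `base_thresh` and `speed_gain` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def hysteresis_filter(positions, base_thresh=0.01, speed_gain=0.5):
    """Suppress small jitter in a tracked joint trajectory.

    positions: (T, 3) array of raw joint positions over time.
    base_thresh: displacement (in metres) below which a change is
        treated as jitter when the joint is at rest (assumed value).
    speed_gain: scales how quickly the threshold shrinks with speed,
        so genuine fast movements are preserved (assumed heuristic).
    """
    out = np.empty_like(positions)
    out[0] = positions[0]
    last = positions[0].copy()       # last accepted (output) position
    prev_raw = positions[0].copy()   # previous raw measurement
    for t in range(1, len(positions)):
        # Instantaneous motion magnitude from the raw stream.
        speed = np.linalg.norm(positions[t] - prev_raw)
        # Adaptive threshold: shrinks as the joint moves faster.
        thresh = base_thresh / (1.0 + speed_gain * speed / base_thresh)
        if np.linalg.norm(positions[t] - last) > thresh:
            last = positions[t].copy()   # accept the new position
        out[t] = last                    # otherwise hold (deadband)
        prev_raw = positions[t].copy()
    return out
```

With this shape of filter, millimetre-scale noise around a resting pose is frozen out, while a large step (e.g. a quick reach) immediately lowers the effective threshold and is accepted on the same frame.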