I reimplemented the Georgia Tech paper on basic monocular depth estimation (see the reference). It works reasonably well, but it requires a reference measurement of the target object to be taken before any depth can be estimated.
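To illustrate why a prior measurement is needed, here is a minimal sketch of the generic pinhole-camera (similar triangles) approach to monocular distance estimation. It is an assumption that this matches the spirit of the reimplementation; it is not taken from the paper or from this repository. The function names `focal_length_px` and `estimate_depth` and all numbers are hypothetical.

```python
# Sketch only: generic pinhole-model distance estimation from a known object size.
# Names and values are illustrative assumptions, not the paper's method.

def focal_length_px(known_distance, known_width, width_px):
    """Calibrate the focal length (in pixels) from one reference image.

    known_distance: measured camera-to-object distance (e.g. metres)
    known_width:    real width of the object (same unit as known_distance)
    width_px:       width of the object in the reference image, in pixels
    """
    if known_width <= 0 or width_px <= 0:
        raise ValueError("known_width and width_px must be positive")
    return width_px * known_distance / known_width


def estimate_depth(focal_px, known_width, width_px):
    """Estimate distance to the object from its apparent width in pixels."""
    if width_px <= 0:
        raise ValueError("width_px must be positive")
    return focal_px * known_width / width_px


if __name__ == "__main__":
    # Reference measurement: a 0.2 m wide object appears 100 px wide at 1.0 m.
    f = focal_length_px(known_distance=1.0, known_width=0.2, width_px=100)
    # Later frame: the same object now appears 50 px wide.
    print(estimate_depth(f, known_width=0.2, width_px=50))
```

The one-time reference measurement fixes the focal length in pixels; after that, the object's apparent width alone is enough to estimate distance, which is why the object's size has to be known in advance.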
Reference - N. Yao, E. Anaya, Q. Tao, S. Cho, H. Zheng and F. Zhang, "Monocular vision-based human following on miniature robotic blimp," 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 2017, pp. 3244-3249, doi: 10.1109/ICRA.2017.7989369.