Mumbai | IIT researchers have developed an efficient method to track surgical instruments in 3D using standard 2D video and basic geometry.
Surgeons and patients increasingly choose laparoscopic surgery, also called keyhole surgery, because patients experience less pain and recover faster.
When surgeons manipulate robotic arms to guide a tiny tool inside the body in 3D space, they rely on experience and skill to perceive depth from 2D video. That video is captured by a tiny camera that is also inserted through a keyhole incision and positioned at the operation site.
While some tertiary healthcare facilities in big cities in India have high-end robotic surgery systems with 3D visualisations, such facilities are limited and expensive.
Dr Shubhangi Nema and Prof Leena Vachhani from the Indian Institute of Technology (IIT) Bombay and Abhishek Mathur from the Indian Institute of Technology Goa have developed a novel software technique that enables 3D tracking of surgical instruments using standard video feeds, eliminating the need for expensive sensors and high-end computing.
The cost-effective approach, based on fundamental geometry, can enhance virtual reality training and has the potential to significantly lower the cost of 3D visualisation systems in surgeries.
"We chose a geometric approach because geometry is fundamentally reliable and interpretable. We leveraged geometric cues such as perspective projection, instrument shape constraints, and interval-based uncertainty modelling (using a range of possible position coordinates instead of an exact position)," said Dr Nema.
The researchers developed an algorithm that treats surgical tools as connected geometric shapes and tracks them using bounding boxes in 2D video.
By analysing changes in the size, shape, position and angles of these boxes according to the rules of perspective, the system estimates the instrument's depth, movement and rotation in 3D.
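The perspective reasoning described above can be illustrated with the standard pinhole camera model, in which an object of known physical width W that appears w pixels wide, seen through a lens with a focal length of f pixels, lies at a depth of roughly Z = f * W / w. The sketch below is only an illustration of that relationship, not the researchers' algorithm. The function names, the focal length, the image centre and the tool width are all hypothetical values chosen for the example.

```python
def depth_from_bbox_width(focal_px, real_width_mm, bbox_width_px):
    """Estimate depth (mm) from the apparent width of a bounding box.

    Pinhole model: bbox_width_px = focal_px * real_width_mm / depth_mm,
    so depth_mm = focal_px * real_width_mm / bbox_width_px.
    """
    if bbox_width_px <= 0:
        raise ValueError("bounding box width must be positive")
    return focal_px * real_width_mm / bbox_width_px


def back_project(u, v, depth_mm, focal_px, cx, cy):
    """Convert a pixel (u, v) at a known depth into camera-frame 3D coordinates (mm)."""
    x = (u - cx) * depth_mm / focal_px
    y = (v - cy) * depth_mm / focal_px
    return (x, y, depth_mm)


if __name__ == "__main__":
    # Assumed values: 800 px focal length, 5 mm wide tool shaft,
    # 640x480 image with its centre at (320, 240).
    z = depth_from_bbox_width(800.0, 5.0, 40.0)
    print("depth (mm):", z)
    print("3D position (mm):", back_project(400.0, 300.0, z, 800.0, 320.0, 240.0))
```

With these assumed numbers, a 5 mm shaft that appears 40 pixels wide is estimated to be 100 mm from the camera. If the box shrinks to 20 pixels, the estimate doubles to 200 mm, which is how a change in box size can signal motion along the viewing direction.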
Accurately estimating depth from 2D images is challenging due to unclear object outlines caused by poor lighting, camera noise, or motion blur.
"From a single camera view, multiple 3D configurations can produce the same 2D projection. We introduced geometric constraints and interval-based bounds to narrow the feasible solution space," explained Dr Nema.
Instead of saying that the tool tip is at an exact point P, the algorithm gives a range, or an interval, within which the tip can lie, she said.
"By incorporating known instrument dimensions and motion continuity, we reduced ambiguity. This approach makes 3D estimation more stable and robust," Dr Nema added.
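The idea of reporting a range rather than a single point, and then narrowing it with known dimensions and motion continuity, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the method from the study: it assumes a pinhole camera, a bounding-box width measured to within a fixed pixel error, and a hypothetical upper limit on how fast the tool can move between frames. All names and numbers are invented for the example.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    lo: float
    hi: float


def depth_interval(focal_px, real_width_mm, bbox_width_px, pixel_err):
    """Depth range (mm) consistent with a box width known only to +/- pixel_err."""
    w_min = bbox_width_px - pixel_err
    w_max = bbox_width_px + pixel_err
    if w_min <= 0:
        raise ValueError("pixel error is too large for this box width")
    # A wider box means a closer object, so w_max gives the lower depth bound.
    return Interval(focal_px * real_width_mm / w_max,
                    focal_px * real_width_mm / w_min)


def predict(prev, max_speed_mm_s, dt_s):
    """Widen the previous depth interval by the farthest the tool could have moved."""
    reach = max_speed_mm_s * dt_s
    return Interval(prev.lo - reach, prev.hi + reach)


def intersect(a, b) -> Optional[Interval]:
    """Keep only depths allowed by both intervals; None if they are inconsistent."""
    lo, hi = max(a.lo, b.lo), min(a.hi, b.hi)
    return Interval(lo, hi) if lo <= hi else None


if __name__ == "__main__":
    # Assumed values: 800 px focal length, 5 mm shaft, 40 px box, +/- 2 px error.
    measured = depth_interval(800.0, 5.0, 40.0, 2.0)
    # Previous frame's estimate, widened for 20 ms of motion at up to 50 mm/s.
    predicted = predict(Interval(98.0, 101.0), 50.0, 0.02)
    print("measurement only:", measured)
    print("after continuity:", intersect(measured, predicted))
```

In this example the measurement alone allows depths from roughly 95 mm to 105 mm, while intersecting it with the motion-continuity prediction narrows the range to about 97 mm to 102 mm. An empty intersection would indicate that the measurement and the motion assumption disagree.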
The study also found that the system is efficient enough to run on a standard computer processor without specialised graphics hardware. It processes video at roughly 50 frames per second, which is well within the requirements for real-time applications.
To validate their method, the team recorded known motions of a scaled physical model using a highly precise motion capture system and a stationary webcam. The researchers found the errors to be negligible, making the method usable for labelling and motion tracking of instruments in future applications.
The researchers plan to implement their strategy in an experimental setup to provide real-time training or assistance to surgeons.
"This work demonstrates that a three-dimensional visual experience for surgeons can be achieved using the existing monocular laparoscopic camera itself, offering a cost-effective and practical pathway toward improved depth perception in minimally invasive surgery," added Prof Vachhani.