Shuhaiber JH. Augmented Reality in Surgery. Arch Surg. 2004;139(2):170-174. doi:10.1001/archsurg.139.2.170
To evaluate the history and current knowledge of computer-augmented reality in the field of surgery and its potential goals in education, surgeon training, and patient treatment.
National Library of Medicine's database and additional library searches.
Only articles suited to surgical sciences with a well-defined aim of study, methodology, and precise description of outcome were included.
Augmented reality is an effective tool in executing surgical procedures requiring low-performance surgical dexterity; it remains a science determined mainly by stereotactic registration and ergonomics. Strong evidence was found that it is an effective teaching tool for training residents. Weaker evidence was found to suggest a significant influence on surgical outcome, both morbidity and mortality. No evidence of cost-effectiveness was found.
Augmented reality is a new approach in executing detailed surgical operations. Its application is still in a preliminary stage, and further research is needed to evaluate its long-term clinical impact on patients, surgeons, and hospital administrators. Its widespread use and the universal transfer of such technology remain limited until there is a better understanding of registration and ergonomics.
The computer has invaded society and has become an integral part of continual advancements in medicine and science. Surgeons can no longer ignore the impact of this technology in changing our daily activities and patient treatment. Patient histories are now stored as electronic records. Computer programs exist to place patient orders, including imaging tests. Computer-based simulation empowers surgical residents in visualizing anatomy.
In the preoperative phase, most surgeons have a mental image of where the target lesion is and plan the route of exposure. Marking structures of interest on radiographic images that can be superimposed on live video camera images allows a surgeon to simultaneously visualize the surgical site and the overlaid graphic images, creating a so-called semi-immersive environment. The term is synonymous with augmented reality (AR). Virtual reality (VR) and AR are the 2 principal means by which computer technology will meet reality and offer the ultimate surgical environment.
This article reviews the developmental milestones and application of AR in the operating room in various surgical specialties and discusses the hopes and fears engendered by this evolving technology.
Augmented reality is a recent technology that is similar to the VR paradigm. It superimposes 3-dimensional (3D) computer-generated objects and text onto real images and video, all in real time. The main difference between VR and AR is that the latter combines real images and video frames with 3D graphics, whereas VR relies on 3D graphics alone.1
The ability to quantify and manipulate spatial information to relate one set of data to another (registration) is fundamental to surgical navigation. Registration is the product of mathematical methods that relate 2 or more coordinate spaces and stereotactic operating systems that will integrate these databases into the operative field.2 Stereoscopy, which is the science of both vision and perception of parallax, is not new to medical imaging. It has been extensively used in general radiography and cerebral angiography.3 Once a stereoscopic image is generated, interaction and display in the operating room is possible.4
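As an illustration of what relating 2 coordinate spaces can mean in practice, the following is a minimal sketch of point-based rigid registration, assuming matched landmark pairs are already available in both spaces. It is deliberately reduced to 2 dimensions so that the least-squares rotation has a closed form; clinical systems work in 3D and the methods cited in this review may differ. The function names (`register_rigid_2d`, `apply_rigid_2d`, `rms_error`) are hypothetical and not taken from any cited system.

```python
import math

def register_rigid_2d(source, target):
    """Least-squares rigid fit (rotation + translation) mapping source onto target.

    source, target: equal-length lists of matched (x, y) landmarks.
    Returns (theta, (shift_x, shift_y)) with theta in radians.
    """
    n = len(source)
    if n < 2 or n != len(target):
        raise ValueError("need at least 2 matched landmark pairs")
    # Centroids of both landmark sets.
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    # Accumulate dot and cross products of the centered landmarks.
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - sx, py - sy
        bx, by = qx - tx, qy - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated source centroid onto the target centroid.
    return theta, (tx - (c * sx - s * sy), ty - (s * sx + c * sy))

def apply_rigid_2d(theta, shift, point):
    """Rotate a point by theta, then translate it by shift."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + shift[0], s * x + c * y + shift[1])

def rms_error(theta, shift, source, target):
    """Root-mean-square residual distance at the landmarks after registration."""
    total = 0.0
    for p, q in zip(source, target):
        x, y = apply_rigid_2d(theta, shift, p)
        total += (x - q[0]) ** 2 + (y - q[1]) ** 2
    return math.sqrt(total / len(source))
```

With noise-free landmarks the residual returned by `rms_error` is essentially zero; with real fiducials it gives a rough indication of how well the two spaces agree at the landmarks, which is not the same as accuracy at the surgical target.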
The need for an ideal head tracking system that accurately and continuously follows all subtle movements of the surgical site and transparently adjusts the initial registration through the whole surgical procedure in a noninvasive way was the driving force underpinning the evolution of AR.
Several hurdles were recognized and overcome early on in neurosurgery by a series of surgical navigation systems advancing from mechanical arms to 2-dimensional charge-coupled devices.5 Limitations in orientation, anatomic landmark registration, and freedom of navigation became apparent to the operating surgeon, for example, in the need to shift conjugate gaze constantly between computed tomographic images and the surgical field.
A crucial problem to be solved was how to merge all data and systems necessary for achieving AR. Building a hybrid patient model was a plausible answer, but it created a registration problem for both the developer and the surgeon.6
Virtual reality offered a potential solution to build a virtual 3D patient model. Enthusiasts appreciated that to create a hybrid model (real and virtual), a complete representation that merged the real patient during surgery with useful computerized patient data was vital. The latter would encompass preoperative images (computed tomography, magnetic resonance imaging, magnetic resonance angiography, and others), anatomic models, intraoperative images (x-rays, ultrasound, video endoscope, and microscope), and position and shape information. It would also coordinate auditory or visual systems with operative guiding systems, ie, systems that give the accurate position of a tool freely moved by the surgeon or robot.
Lavallee et al6 put forward a definition of the hybrid model construction. As they defined it, a coordinate system would be associated with each preoperative and intraoperative imaging modality, each statistical geometrical model, each sensor, each surgical tool, and each guiding system. Building the hybrid model would require computing a chain of geometrical transformations (T1, T2, ..., Tn) between all involved coordinate systems. This system is the essence of successful functioning of AR.
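A chain of geometrical transformations between coordinate systems can be sketched with 4 x 4 homogeneous matrices, where composing the chain reduces to matrix multiplication. The example below is an assumption-laden illustration, not the formulation of Lavallee et al: the frame names (CT image, tracker, camera), the numeric values, and all function names are hypothetical.

```python
import math
from functools import reduce

IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul4(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    """Homogeneous matrix for a pure translation."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(degrees):
    """Homogeneous matrix for a rotation about the z axis."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def compose(*transforms):
    """Collapse T1, T2, ..., Tn into one matrix that applies T1 first and Tn last."""
    return reduce(lambda acc, t: matmul4(t, acc), transforms, IDENTITY)

def transform_point(m, point):
    """Apply a homogeneous transform to a 3D point."""
    x, y, z = point
    h = (x, y, z, 1.0)
    return tuple(sum(m[i][k] * h[k] for k in range(4)) for i in range(3))

# Hypothetical chain: CT image frame -> tracker frame -> camera frame.
ct_to_tracker = rotation_z(90.0)
tracker_to_camera = translation(5.0, 0.0, 100.0)
ct_to_camera = compose(ct_to_tracker, tracker_to_camera)
lesion_in_camera = transform_point(ct_to_camera, (10.0, 0.0, 0.0))
```

Here a point at (10, 0, 0) in the hypothetical CT frame lands at approximately (5, 10, 100) in the camera frame. Because every link in the chain is multiplied into the result, an error in any single transformation propagates to the final overlay.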
Once the hybrid patient model provided the virtual component of AR, the next task was to register the virtual frame of reference with what the user is seeing: the real patient. This registration is more critical in an AR than a VR system because our eyes are more sensitive to visual misalignments than to the type of visual-kinesthetic errors arising in the VR system.7
The scene is viewed by an imaging device, in this case a video camera. The camera performs a perspective projection of the 3D world onto a 2-dimensional image plane. The intrinsic (focal length and lens distortion) and extrinsic (position and orientation) parameters of the device determine exactly what is projected onto its image plane. The virtual image is generated with a standard computer graphics system. The virtual objects are modeled in an object reference frame. The graphics system requires information about imaging of the real scene so that it can correctly render these objects. These data will control the synthetic camera that is used to generate the image of the virtual objects. This image is then merged with the image of the real scene to form the AR image.8
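The perspective projection described above can be sketched with an idealized pinhole camera. This is a simplified illustration under stated assumptions: lens distortion is ignored, the focal length is expressed in pixels, and the function name `project_point`, the parameter names, and the numeric values are hypothetical rather than drawn from any cited system.

```python
def project_point(point_world, rotation, translation, focal_px, principal_point):
    """Project a 3D world point onto the image plane of an ideal pinhole camera.

    rotation (3x3 nested list) and translation (length 3) are the extrinsic
    parameters; focal_px and principal_point are the intrinsic parameters.
    Lens distortion is deliberately not modeled in this sketch.
    """
    # Extrinsic step: express the point in the camera's coordinate frame.
    cam = [sum(rotation[i][k] * point_world[k] for k in range(3)) + translation[i]
           for i in range(3)]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsic step: perspective divide, then scale and shift into pixels.
    u = focal_px * cam[0] / cam[2] + principal_point[0]
    v = focal_px * cam[1] / cam[2] + principal_point[1]
    return (u, v)

# Hypothetical setup: camera axes aligned with the world, scene 500 mm away.
no_rotation = [[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]]
pixel = project_point((50.0, -25.0, 0.0), no_rotation, (0.0, 0.0, 500.0),
                      focal_px=800.0, principal_point=(320.0, 240.0))
```

With these assumed values the point projects to pixel (400, 200). A synthetic camera that renders the virtual objects with the same intrinsic and extrinsic parameters would draw them at matching pixel positions, which is what allows the two images to be merged.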
Research continues to improve both registration technology and quality of display in the most ergonomic fashion.9,10
For successful execution of AR, a large number of display modalities have been considered. Two main display modalities have been adopted: head-mounted displays and heads-up displays.
Two types of head-mounted displays exist: video see-through and optical see-through. The video see-through display does not allow the operator's visual field to have direct contact with the real world, while the optical see-through display does. The optical see-through display offers less of a feeling of being immersed in the environment created by the display. Although no studies exist to show which type of head-mounted display is superior, the optical see-through display offers more control of the environment should an emergency arise or when misalignment of anatomic graphic images is recognized.
The heads-up display has already been used in airplane cockpits and recently in some experimental automobiles, allowing 2 images to be merged on a monitor facing the head rather than the window of the cockpit or the windshield. All displays have an obligatory delay for image processing, but each type has distinct advantages.
The video see-through displays allow the video-generated image to reach the rest of the AR system to provide immediate tracking information. The optical see-through device activates the human brain for further transformation processes regarding information tracking. This can cause eyestrain and, in severe cases, nausea and headaches for the surgeon. The resolution of the virtual image is directly mapped over the real-world view when an optical see-through display is used. With a monitor or video see-through display, both the real and virtual worlds are reduced to the resolution of the display device. Magnetic trackers, when used for motion tracking, also introduce errors caused by any surrounding metal objects in the environment, as well as measurement delays.11
In summary, imaging devices project a 3D world on a 2-dimensional image plane. The intrinsic and extrinsic parameters of the device determine exactly what is projected. These features are not error free.
Augmented reality is still in a rapidly progressing stage of development with further challenges. Despite its infancy, attempts to apply AR in surgery have been successful and promising. Neurosurgery, otolaryngology, and maxillofacial surgery are the main disciplines that have used this technology to navigate their specific surgical fields.
No specialty has been more involved than neurosurgery in the implementation of computer-aided surgery since its inception. Neurosurgeons are always trying to resect the smallest possible volume of brain tissue containing tumor. While methods exist (eg, magnetic resonance imaging and computed tomography) for imaging and displaying the 3D structure of the brain, the surgeon must relate what he or she sees on the 3D display with the patient's actual anatomy. Understanding of registration, stereotactic surgery, and stereoscopic surgery offered answers as to how to go about navigating a brain tumor. Primitive solutions to this problem involved a stereotactic frame for the patient's skull, imaging the skull and frame as a unit. A search for a more reliable frame suitable for the surgeon and comfortable for the patient initiated development of automatic registration methods for frameless stereotaxy, image-guided surgery, and AR.12 In the AR environment, a navigation system superimposes a 3D image (volume graph) of the anatomic part of the brain on the real operating field. This creates a 3D anatomic atlas–like interactive environment for the navigating surgeon.13 Surgical navigation therefore is key for reduction of surgical intervention in a narrow operative field. To the advantage of the neurosurgeon, the surgical anatomy is more fixed in space than abdominal organs are, allowing feasible registration.
These new technologies for surgical navigation and image analysis have been termed interactive image-guided neurosurgery. This system is composed of 5 fundamental elements: a method of registration of images and physical space, an interactive localization device, a computer with its requisite software interface and video display system, the integration of real-time feedback, and robotics.
Concerns surrounding the application of AR are similar to those in other surgical disciplines. Tissue movement during surgery caused by cerebrospinal fluid leakage, gravity, and tumor resection can affect registration.13
General surgery took up AR applications later than other specialties. Its already established traditional modes of practice and limited sources of funding14 are possible reasons for the delay. Nonetheless, a number of steps toward the development of an AR system combined with computer-assisted surgery have been made, especially in the field of liver surgery. Soler et al15 published the first fully automatic 3D reconstruction of a liver model, achieved by translating detailed anatomic knowledge into topologic and geometric constraints. Such an approach allows the surgeon to automatically build an anatomic segmentation of the liver, based on the Couinaud definition of the 8 subsegments of the liver, with delineation of the hepatic and portal veins in VR.
Other steps to visualize complex anatomy included the development of teleimmersive collaborations in virtual pelvic floor16 and virtual abdomen.17 Although these models have not been used in the operating room, it is hoped that these environments will support widespread dissemination of surgical expertise.
Another important step was the application of frameless stereotactic liver surgery in tumor resection.18 Similar to neurosurgery, an interactive image-guided surgery system for liver surgery was evaluated for accurate instrument tracking. The results from human and porcine data showed accuracies ranging from 1.4 to 2.1 mm. Liver motion due to insufflation was 2.5 ± 1.4 mm in laparoscopy, while total liver motion during respiration was 10.8 ± 2.5 mm.
In the field of breast cancer, AR visualization was shown to be effective in phantom and clinical data.19 This novel approach allowed superimposition of 3D tumor models onto live video images of the breast, enabling the surgeon to perceive the exact 3D position of the tumor as if it were visible through the breast skin. Sato et al19 claimed that surgical AR helped the surgeon target surgical resection in a more objective and accurate process, thereby minimizing risk of relapse and maximizing breast conservation. Further research is needed to work out AR's reliability and validity in surgical oncology.
Although AR applications in musculoskeletal surgery are not yet clinically available, several research systems are being used to solve orthopedic problems. These applications include implant alignment in total hip and knee replacement, where an AR system can be used to guide the proper placement of implant components on the basis of preoperative plans.20 Limb kinematics is the mathematical analysis of pressure distribution during motion and soft-tissue tension. In the AR environment, the visualization of the vector form could allow total knee replacement and high tibial osteotomies to be adjusted and tailored to the individual patient.21 In knee surgery, the application of AR is highlighted by a recent experimental and stand-alone device, the mechatronic arthroscope. This tool allows the surgeon to apply force without damage, virtually integrating preoperative and intraoperative images, to navigate the knee joint during the planning phase and to intervene during the AR phase.22 Variations on the same theme include accurate placement of an intramedullary rod, proper manipulation of the bones, tumor resection, and cartilage resurfacing.
The use of AR in maxillofacial surgery has extended to orthognathic surgery, tumor surgery, temporomandibular joint motion analysis, foreign body removal, osteotomy, minimally invasive biopsy, prosthetic surgery, and dental implantation.23 One of the chief attractions is the provision of information on deep-tissue structures during the operation, allowing surgery to be less invasive. The application of AR technology to osteotomies of the facial skeleton could allow points, lines, and planes to be transferred from stereolithographic skull models, cephalometric drawings, splints, and diagnostic imaging data to the patient.24 In the context of oncology, the surgeon draws the tumor borders manually as an overlay using VR system software tools onto the computed tomographic data set. Adjustments and alignments are made. Thereafter, the overlay can be transmitted into any other data sets through the video image of the operative field and later into the heads-up display or head-mounted display units. The resection margins can then be seen in the context of the tumor borders. This may minimize the deformity generated by traditional surgical methods while optimizing the chances for successful curative and reconstructive surgery.25
Augmented reality has stepped into the field of otolaryngology. Most of the advances have come from the introduction of minimally invasive surgery of the head and neck. Such systems are revolutionary in aiding the surgeon with intraoperative anatomic landmarks, especially when distorted or absent. The use of AR, especially in diagnosis, biopsy from sinuses, skull base surgery, orbital decompression, carcinoma excision, and foreign body removal, has both advantages and disadvantages.26 Improved patient safety with improved mechanical and registration accuracy (within 0.2-3 mm) during real-time surgery27 allows for surgical precision. The technology is easy to use. However, surgeons cannot always predict which cases may benefit from localization, especially when associated with increased operative time and expense.28 The actual surgical time is unlikely to be prolonged. The latter disadvantage may not apply to routine surgery of the head and neck. Recently, computerized reconstruction of an anatomically accurate 3D model of the human temporal bone from serial histologic sections was achieved. The resulting 3D virtual model of the temporal bone has been demonstrated as an efficient tool for education.29 The human temporal bone is a 3D complex anatomic region with many unique qualities that make anatomic teaching and learning difficult. The model may be interactively navigated from any viewpoint, greatly simplifying the task of conceptualizing and learning the anatomy. Automated tracking of tissue motion, however, remains a current research problem.30
Minimally invasive surgery in the chest via a thoracoscope has allowed AR to be used in thoracic surgery. Whether a thoracoscopic approach to diagnosis or treatment could replace more conventional approaches remains to be seen. However, according to Colt,31 the training capabilities will soon be enhanced by the incorporation of VR simulators. Thoracoscopy is partly the result of the impact of laparoscopic surgery on general surgical practice. This window to the pleural and pericardial cavity allows for diagnosis and treatment of pleural effusions, lung cancer, mediastinal tumors, vasospastic disease via thoracoscopic sympathectomies, empyema, and ligation of the patent ductus arteriosus.31
In cardiac surgery, the adoption of thoracoscopic access and a robot remotely operated by the surgeon's hands promises a novel method of endoscopic coronary artery bypass grafting. The robot provides the surgeon with delicate prehensile function using instruments. A television-video screen allows the surgeon to use his or her vision to track both the robot hands and the anatomy for coronary anastomoses.32 It has been stated that surgical technique with this method can be challenging to the surgical team.33 No literature on real-time AR in this setting exists yet. However, this technology promises to contribute to reduced hospital days, earlier return to normal activity, less pain, and better cosmesis.34
Totally endoscopic mitral valve repair35 and aortic valve replacement are now feasible. Further studies need to develop ways to facilitate the anastomosis, reduce errors, and superimpose anatomic images on real anatomic landmarks. A multicenter study will be essential to define the efficacy and clinical value of these techniques. As it stands, graft occlusion rate after minimally invasive direct coronary artery bypass remains slightly higher than that after traditional revascularization.36
In the field of off-pump coronary artery bypass grafting, both real-time imaging and automation will lead the way to improve the quality of coronary anastomosis. Visual synchronization and motion compensation will be required to present a still image of a beating heart.37
The absolute role and indications for AR in surgery are yet to be established. The data so far generated in AR are not substantial. The outcomes discussed in most publications to date include user-friendly features, accuracy of targeting tissues, and costs as end points.
These outcomes are not measured quantitatively, and subjective statements are not supported by well-organized research stratified by patient case mix. The use of multicenter trials and structured research will help determine the cost-effectiveness of AR and answer questions in an evidence-based medicine fashion.
It must be noted that a universal problem for any surgeon in the AR environment is that the organ of interest does not behave as expected. Human organs are not rigid, but deform according to the rhythms of the heartbeat and respiration, according to pressure during laparoscopic insufflation,18 or when physically probed. This physical problem will be more marked for liver18 and intestinal surgery (pliable organs) than for bone and brain surgery (semirigid organs).21
Standard platforms for stereoscopic AR computer projection are recent innovations38 and have not yet reached "wearable" applicability.39 Comfort issues may limit prolonged use. For example, the weight of a head-mounted display is determined by the type of motion-tracking systems: electromagnetic, ultrasonic, or optical. Moreover, concerns may arise regarding fitting such devices into an already crowded operating room environment.40 In addition, outcome has yet to be measured qualitatively (risk-benefit ratio) and quantitatively.
A common underlying error-generating process in AR will always exist because of the tremendous variability in the fundamental elements: definition of accuracy, image acquisition, registration techniques, computers and software interfaces, interactive localization devices and intraoperative use, integration of real-time data, tissue displacement, robotics, and, finally, judgment and clinical experience.1
Augmented reality so far promises us additional information that cannot be detected by the 5 senses of a human being. Despite the basic function of AR systems as "x-ray vision" for surgical planning, the system extends to robots and simulation. Interventional AR systems are the most recent application to provide a "third hand" as an assistant. The clinical application of this tool is still very basic and passively driven by the surgeon because of concerns such as safety and minimizing device sophistication. The current versions of the passive-arm manipulator include use as a tool holder or retractor.41
The dynamic association of operating on a real organ with imaging data may create new modes of diagnosis and treatment of technically challenging patients. Very experienced surgeons can benefit from such systems by extending the limit of a safe area to allow for more complete and radical operative therapy, while less experienced surgeons may at least benefit by being oriented to critical anatomic landmarks. A new sense of perceiving the real and virtual world has been achieved. Advancing AR to become user-friendly has rekindled interest in real-time surgical anatomy as a way to maximize the number of safe surgical hands in the next century.
Corresponding author and reprints: Jeffrey H. Shuhaiber, MD, Department of Surgery, University of Illinois at Chicago, 840 Southwood St (CSB Suite 518-E), Chicago, IL 60612 (e-mail: email@example.com).
Accepted for publication July 12, 2003.