Forces of Tool-Tissue Interaction to Assess Surgical Skill Level | Medical Education and Training | JAMA Surgery | JAMA Network
Figure 1.  Force Data Recorded by SmartForceps for 3 Surgical Types of Tasks

A, Four strain gauges are attached to the lateral surface of the bipolar forceps prongs for sensing coagulation (closing) and dissection (opening) forces. The strain gauges were covered with medical-grade sterilizable tape, and the forceps was steam autoclaved before clinical use. B-D, Examples of forces recorded for coagulation (arrowheads indicate closing) (B), division (arrowheads indicate opening and closing) (C), and dissection (arrowheads indicate opening) (D).

Figure 2.  Representative Force Data and Definition of Force Errors

A and B, Representative force data from an experienced (A) and a novice (B) surgeon for the same surgical task and case (coagulation of sphenoid ridge meningioma). The forces from the novice surgeon varied (circled variable peak forces), whereas the experienced surgeon applied forces uniformly. C, The definition of force errors based on experienced surgeons’ data. The fifth and 95th percentile of the maximum forces and 95th percentile of the coefficient of variation (CV, arrow) of forces for all trials from experienced surgeons were selected for the cutoff values. FVE indicates force variability error; HFE, high force error; and LFE, low force error.

Figure 3.  Comparison of Standardized Force and Rates of Error Between Successful and Unsuccessful Trials and Among Surgeon Groups

Standardized mean, maximum, and coefficient of variation (CV) of force (A and C) with rates of error (B and D). In A and C, the horizontal lines within the box represent the mean force values; the black squares, medians; and the error bars, SDs. FVE indicates force variability error; HFE, high force error; and LFE, low force error.

a P < .01.

b P < .05.

Figure 4.  Error Rates and Scores for the 3 Groups of Surgeons

A-C, Radar chart for the 3 groups of surgeons showing error rates. D, Scatterplot of the canonical scores from stepwise discriminant analysis for the 3 groups of surgeons. The x signs indicate the center of the ellipses. For stepwise discrimination analysis data used to construct the scatterplot, see eTable 5 in the Supplement. DF indicates discriminant function; FVE, force variability error; HFE, high force error; and LFE, low force error.

Table.  Acquired Force Profile From Experienced Surgeon and Baseline Cutoff Value
1. Birkmeyer JD, Finks JF, O’Reilly A, et al; Michigan Bariatric Surgery Collaborative. Surgical skill and complication rates after bariatric surgery. N Engl J Med. 2013;369(15):1434-1442.
2. Holmboe ES, Sherbino J, Long DM, Swing SR, Frank JR. The role of assessment in competency-based medical education. Med Teach. 2010;32(8):676-682.
3. Reznick RK, MacRae H. Teaching surgical skills—changes in the wind. N Engl J Med. 2006;355(25):2664-2669.
4. van Hove PD, Tuijthof GJ, Verdaasdonk EG, Stassen LP, Dankelman J. Objective assessment of technical surgical skills. Br J Surg. 2010;97(7):972-987.
5. Champion HR, Meglan DA, Shair EK. Minimizing surgical error by incorporating objective assessment into surgical education. J Am Coll Surg. 2008;207(2):284-291.
6. Richstone L, Schwartz MJ, Seideman C, Cadeddu J, Marshall S, Kavoussi LR. Eye metrics as an objective assessment of surgical skill. Ann Surg. 2010;252(1):177-182.
7. Aoun SG, El Ahmadieh TY, El Tecle NE, et al. A pilot study to assess the construct and face validity of the Northwestern Objective Microanastomosis Assessment Tool. J Neurosurg. 2015;123(1):103-109.
8. Martin JA, Regehr G, Reznick R, et al. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg. 1997;84(2):273-278.
9. Kalu PU, Atkins J, Baker D, Green CJ, Butler PE. How do we assess microsurgical skill? Microsurgery. 2005;25(1):25-29.
10. Darzi A, Smith S, Taffinder N. Assessing operative skill: needs to become more objective. BMJ. 1999;318(7188):887-888.
11. Ghasemloonia A, Maddahi Y, Zareinia K, Lama S, Dort JC, Sutherland GR. Surgical skill assessment using motion quality and smoothness. J Surg Educ. 2017;74(2):295-305.
12. Harada K, Morita A, Minakawa Y, et al. Assessing microneurosurgical skill with medico-engineering technology. World Neurosurg. 2015;84(4):964-971.
13. Chmarra MK, Klein S, de Winter JC, Jansen FW, Dankelman J. Objective classification of residents based on their psychomotor laparoscopic skills. Surg Endosc. 2010;24(5):1031-1039.
14. Gan LS, Zareinia K, Lama S, Maddahi Y, Yang FW, Sutherland GR. Quantification of forces during a neurosurgical procedure: a pilot study. World Neurosurg. 2015;84(2):537-548.
15. Zareinia K, Maddahi Y, Gan LS, et al. A force-sensing bipolar forceps to quantify tool-tissue interaction forces in microsurgery. IEEE/ASME Trans Mechatron. 2016;21(5):2365-2377.
16. Tang B, Hanna GB, Cuschieri A. Analysis of errors enacted by surgical trainees during skills training courses. Surgery. 2005;138(1):14-20.
17. Cundy TP, Thangaraj E, Rafii-Tari H, et al. Force-Sensing Enhanced Simulation Environment (ForSense) for laparoscopic surgery training and assessment. Surgery. 2015;157(4):723-731.
18. Obdeijn MC, Horeman T, de Boer LL, van Baalen SJ, Liverneaux P, Tuijthof GJ. Navigation forces during wrist arthroscopy: assessment of expert levels. Knee Surg Sports Traumatol Arthrosc. 2016;24(11):3684-3692.
19. Trejos AL, Patel RV, Malthaner RA, Schlachta CM. Development of force-based metrics for skills assessment in minimally invasive surgery. Surg Endosc. 2014;28(7):2106-2119.
20. Sutherland GR, Lama S, Gan LS, Wolfsberger S, Zareinia K. Merging machines with microsurgery: clinical experience with neuroArm. J Neurosurg. 2013;118(3):521-529.
21. Sutherland GR, Maddahi Y, Gan LS, Lama S, Zareinia K. Robotics in the neurosurgical treatment of glioma. Surg Neurol Int. 2015;6(suppl 1):S1-S8.
22. Payne CJ, Marcus HJ, Yang GZ. A smart haptic hand-held device for neurosurgical microdissection. Ann Biomed Eng. 2015;43(9):2185-2195.
23. Maddahi Y, Ghasemloonia A, Zareinia K, Sepehri N, Sutherland GR. Quantifying force and positional frequency bands in neurosurgical tasks. J Robot Surg. 2016;10(2):97-102.
24. Sutherland GR, Zareinia K, Gan LS, Hirmer TJ, Lama S. Bipolar forceps with force measurement. Google patents. 2013. http://www.google.com/patents/US20150005768. Accessed October 9, 2017.
25. Lodha N, Misra G, Coombes SA, Christou EA, Cauraugh JH. Increased force variability in chronic stroke: contributions of force modulation below 1 Hz. PLoS One. 2013;8(12):e83468.
26. Horeman T, Rodrigues SP, Jansen FW, Dankelman J, van den Dobbelsteen JJ. Force parameters for skills assessment in laparoscopy. IEEE Trans Haptics. 2012;5(4):312-322.
27. Azarnoush H, Siar S, Sawaya R, et al. The force pyramid: a spatial analysis of force application during virtual reality brain tumor resection. J Neurosurg. 2017;127(1):171-181.
Original Investigation
March 2018

Forces of Tool-Tissue Interaction to Assess Surgical Skill Level

Author Affiliations
  • 1 Department of Clinical Neurosciences and the Hotchkiss Brain Institute, University of Calgary, Calgary, Alberta, Canada
  • 2 Department of Neurosurgery, Hokkaido University Graduate School of Medicine, Kita-ku, Sapporo, Japan
JAMA Surg. 2018;153(3):234-242. doi:10.1001/jamasurg.2017.4516
Key Points

Questions  What are the forces of tool-tissue interaction in microsurgery, and can they be used to assess surgeon skill level?

Findings  In this study of a catalog of tool-tissue interaction in 3 groups of surgeons (novice, intermediate, and experienced), force analysis and corresponding video recording revealed an association between high force error and bleeding and low force error with the need to repeat the task; discriminative analysis allowed differentiation of surgeons by their skill level.

Meaning  Force analysis of tool-tissue interaction may help distinguish surgeon skill level, which could enhance surgical education as it shifts to a competency-based paradigm.

Abstract

Importance  The application of optimal forces between surgical instruments and tissue is fundamental to surgical performance and learning. To date, this force has not been measured clinically during the performance of microsurgery.

Objectives  To establish a normative catalog of force profiles during the performance of surgery, to compare force variables among surgeons with different skill levels, and to evaluate whether such a force-based metric determines or differentiates skill level.

Design, Setting, and Participants  Through installation of strain gauge sensors, a force-sensing bipolar forceps was developed, and force data were obtained from predetermined surgical tasks at the Foothills Medical Centre, University of Calgary, a tertiary care center that serves Southern Alberta, Canada. Sixteen neurosurgeons (3 groups: novice, intermediate, and experienced) performed surgery on 26 neurosurgical patients with various conditions. Normative baseline force ranges were obtained using the force profiles (mean and maximum forces and force variability) from the experienced surgeons. Standardized force profiles and force errors (high force error [HFE], low force error [LFE], and force variability error [FVE]) were analyzed and compared among surgeons with different skill levels.

Main Outcomes and Measures  Each trial of the forceps use was termed successful or unsuccessful. The force profiles and force errors were analyzed and compared.

Results  This study included 26 patients (10 [38%] male and 16 [62%] female; mean [SD] age, 43 [15] years) undergoing neurosurgery by 16 surgeons (6 in the novice group, 5 in the intermediate group, and 5 in the experienced group). Unsuccessful trial–incomplete significantly correlated with LFE and FVE, and unsuccessful trial–bleeding correlated with HFE and FVE. The force strengths exerted by novice surgeons were significantly higher than those of experienced surgeons (0.74 vs 0.00; P < .001), and force variability decreased from novice (0.43) to intermediate (0.28) to experienced (0.00) surgeons; however, these differences varied among surgical tasks. The rate of HFE and FVE inversely correlated with surgeon level of experience (HFE, 0.27 for novice surgeons, 0.12 for intermediate surgeons, and 0.05 for experienced surgeons; FVE, 0.16 for novice surgeons, 0.10 for intermediate surgeons, and 0.05 for experienced surgeons). The rate of LFE significantly increased in intermediate (0.12) and novice (0.10) surgeons compared with experienced surgeons (0.04; P < .001). There was no difference in LFE between intermediate and novice surgeons. Stepwise discriminant analysis revealed that combined use of these error rates could accurately discriminate the groups (87.5%).

Conclusions and Relevance  Force-sensing bipolar forceps and force analysis may help distinguish surgeon skill level, which is particularly important as surgical education shifts to a competency-based paradigm.

Introduction

A previous study1 found that surgical proficiency correlates with clinical outcome. Surgeons spend years in hands-on training and deliberate practice to master psychomotor skills; however, achieving excellence remains a matter of intuition rather than reason. Traditionally, trainees have learned the nuances of surgical skill through case observation and have received largely qualitative feedback on performance from their mentors. As the apprenticeship model of surgical education evolves toward a more competency-based paradigm, there is a need to devise sensitive and reliable methods for objective assessment of surgical skill.2-4 This process is necessary not only to ensure patient safety by defining surgeon competence but also to provide information about training progress, credentialing, and standardization of surgery.5,6

To improve patient safety, criteria-based measures, including checklists and global rating scales, have been used7,8; however, these measures are subjective and have multiple biases.9,10 Morbidity, mortality, and length of procedure have been used as quantitative surrogate markers of surgical performance; however, such data may be affected by many factors not necessarily related to surgical skill.9,10 To overcome these limitations, quantitative metrics that reflect the proficiency of surgical performance are evolving.6,11,12 In surgical technique, motion smoothness and tool handling are necessary for technical finesse.4,11-13 Accordingly, knowledge of tool-tissue interaction forces is an important aspect of safe and efficient surgical performance.12,14,15 Surgical simulation has revealed that more than 50% of errors by surgical trainees are attributable to excessive force.16 Several force-sensing tools have been developed, mainly for endoscopic and robotic surgery.17-21 For microsurgery, force-sensing handheld devices, including microdissectors22 and jeweler’s forceps, have also evolved.12 We designed a force-sensing bipolar forceps, SmartForceps, by installing strain gauge force sensors onto the prongs of a conventional bipolar forceps, a tool commonly used in neurosurgery and other surgical specialties.14,15,23,24

Despite these advances, identification of a technique or system that provides and documents quantitative assessment of surgical performance in the operating room is lacking. In the present study, we demonstrated the force profiles of microsurgery obtained from patients with various neurosurgical conditions. The aims were to (1) establish a normative catalog of force profiles during the performance of neurosurgical tasks, (2) compare force profiles and force error rates among surgeons with different surgical skill levels, and (3) evaluate how such a force-based metric is useful in the assessment of surgical skill.

Methods
Force-Sensing Bipolar Forceps and Data Collection

The SmartForceps, which we developed for real-time sensing of intraoperative closing (coagulation) and opening (dissection) forces, is shown in Figure 1A. After demonstrating to Surgical Services, Foothills Medical Centre, University of Calgary, Calgary, Alberta, Canada, that the SmartForceps could be sterilized while maintaining functionality, we received approval to use the system on patients. Force data were recorded but not given to the surgeon during the procedure. A description of SmartForceps development and data collection is available in the eMethods in the Supplement. Verbal informed consent was obtained from all 26 patients, and all data were deidentified. The study was approved by the University of Calgary Conjoint Health Research Ethics Board.

Patients and Surgeons

The study included 26 adult neurosurgical patients. Sixteen surgeons participated in this study and were stratified into 3 groups based on their level of neurosurgical experience. The novice group included junior neurosurgical residents (postgraduate years 1-3, n = 6), and the intermediate group consisted of senior residents (postgraduate years 4-6) and clinical fellows who did not perform microsurgery independently (n = 5). The experienced group included staff neurosurgeons who performed more than 100 microsurgical procedures per year (n = 5) and assumed the role of primary surgeon in this study. Intermediate and novice surgeons participated and performed selected surgical tasks per their individual training levels under supervision of the primary surgeon.

Surgical Tasks and Trials

Commonly performed neurosurgical maneuvers were separated into 18 surgical tasks developed by expert consensus and validated through case observation in patients with various intracranial conditions: (1) coagulation of scalp vessel, (2) coagulation of dura, (3) coagulation of pia-arachnoid, (4) coagulation of large cortical vessel (>1 mm), (5) coagulation of small cortical vessel (<1 mm), (6) coagulation of bridging vein, (7) coagulation of surface of the extra-axial tumor, (8) removal of arachnoid from the tumor, (9) gripping of tumor tissue, (10) placement of cotton patties, (11) manipulation of bone wax, (12) division of pia-arachnoid and gray matter, (13) division of brain tissue (white matter), (14) division of extra-axial tumor tissue, (15) division of intra-axial tumor tissue (glioma), (16) opening of cistern, (17) dissection of arteries within the cistern, and (18) dissection between brain and extra-axial tumor. Of these, data from 10 surgical tasks (tasks 1, 2, 3, 5, 7, 13, 14, 15, 16, and 17) were selected for comparison among surgeon groups because these tasks reflect the performance of microsurgery, and sufficient data were available for at least 2 of the surgeon groups. The distribution of the trials is summarized in eTable 1 in the Supplement.

For each trial, the surgeon stated the task being performed. The time when the tips of the SmartForceps made contact with the tissue was documented as the starting time of force recording, and the stopping time was when the tips of the forceps no longer made contact with the tissue. Microscope video recording of the surgical field provided a corresponding time stamp for post hoc analysis of force data by a neurosurgeon investigator (T.S.) who was not involved in performing the tasks.

Rating of Trials

In the division and grip tasks, bleeding is usually anticipated; therefore, only coagulation and dissection tasks were used for this performance evaluation. A given surgical task without any vascular injury (bleeding) and/or any requirement for repetition of the task was defined as a successful trial. A task that caused bleeding was considered to be an unsuccessful trial–bleeding. A task that required repetitive maneuvers at the same location because of failure of task completion was considered to be an unsuccessful trial–incomplete.

Force Data Analysis

In the coagulation and grip tasks (closure of the forceps tips), positive peak values (local maxima) were used for data analysis (Figure 1B), whereas in the division task (closing and opening of the forceps tips), both positive and negative peak values (local maxima and minima) were used (Figure 1C). For the dissection task (opening of the forceps tips), negative peak values were used (Figure 1D).14 Data points with a force level less than 0.1 N were excluded from the calculation to avoid inclusion of baseline noise data. Any untoward maneuvers, such as the prongs of the forceps touching bone or another tool, were excluded from analysis.
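The peak selection described above can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the neighbor-comparison definition of a peak, and the sign convention (closing forces positive, opening forces negative) are assumptions made for illustration; only the 0.1 N noise floor and the task-to-peak mapping come from the text.

```python
from typing import List

# From the text: readings below 0.1 N are treated as baseline noise.
NOISE_FLOOR_N = 0.1


def local_peaks(samples: List[float], sign: int = 1) -> List[float]:
    """Return local maxima (sign=+1) or local minima (sign=-1) of a force trace.

    Assumed peak rule: after multiplying by `sign`, a sample must be strictly
    greater than its left neighbor, at least as large as its right neighbor,
    and at least NOISE_FLOOR_N in magnitude.
    """
    peaks = []
    for i in range(1, len(samples) - 1):
        left, mid, right = (sign * samples[j] for j in (i - 1, i, i + 1))
        if mid > left and mid >= right and mid >= NOISE_FLOOR_N:
            peaks.append(samples[i])
    return peaks


def task_peaks(samples: List[float], task_type: str) -> List[float]:
    """Select peaks according to the task type named in the text."""
    if task_type in ("coagulation", "grip"):   # closing only
        return local_peaks(samples, +1)
    if task_type == "dissection":              # opening only
        return local_peaks(samples, -1)
    if task_type == "division":                # closing and opening
        return local_peaks(samples, +1) + local_peaks(samples, -1)
    raise ValueError(f"unknown task type: {task_type}")


# Made-up trace in newtons, for illustration only.
trace = [0.0, 0.05, 0.0, 0.3, 0.5, 0.2, -0.4, -0.1, 0.0]
closing = task_peaks(trace, "coagulation")
opening = task_peaks(trace, "dissection")
both = task_peaks(trace, "division")
```

In this toy trace the small 0.05 N bump is dropped by the noise floor, so only the 0.5 N closing peak and the −0.4 N opening peak survive.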

The mean and maximum values of all peak readings from each trial were used to represent each force value (Figure 2A and B). Because the magnitude of force variability within each trial was considered to be an important element of the force profile, the coefficient of variation (CV) of all the peak readings was calculated as follows25: CV = σ/μ, where σ is the SD of forces and μ is the mean value of measured forces. Because force profiles differed among surgical tasks, all variables were standardized for comparison. Standardized variables (z) were calculated using the following formula: z = (x − μ)/σ, where μ and σ represent the mean and SD of the corresponding variable obtained from experienced surgeons and x is the raw force variable.
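As a hedged illustration of the two formulas above, the following sketch computes the CV of one trial's peak readings and standardizes a raw value against a reference set from experienced surgeons. The function names are hypothetical, the sample SD is an assumption (the text does not say whether the sample or population SD was used), and taking absolute values so that opening forces give a positive CV is also an assumption.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(peaks: Sequence[float]) -> float:
    """CV = sigma / mu over the peak readings of one trial.

    Assumption: magnitudes are used so negative (opening) peaks yield a
    positive CV; sample SD is used.
    """
    magnitudes = [abs(p) for p in peaks]
    return stdev(magnitudes) / mean(magnitudes)


def standardize(x: float, reference: Sequence[float]) -> float:
    """z = (x - mu) / sigma, with mu and sigma taken from the reference
    (experienced-surgeon) values for the same task and variable."""
    return (x - mean(reference)) / stdev(reference)


# Made-up numbers in newtons, for illustration only.
trial_peaks = [0.4, 0.5, 0.6]
cv = coefficient_of_variation(trial_peaks)   # 0.1 / 0.5

reference_means = [0.4, 0.5, 0.6]            # hypothetical experienced-surgeon values
z_high = standardize(0.7, reference_means)   # two SDs above the reference mean
z_ref = standardize(mean(reference_means), reference_means)
```

Under this definition a value equal to the reference mean standardizes to 0, which is consistent with the experienced group being reported at 0.00 in the comparisons.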

For error analysis, the 5th and 95th percentiles of the maximum forces and the 95th percentile of the CV of forces for all trials from experienced surgeons were selected (Figure 2C) based on the assumption that experienced surgeons make few errors (ie, <5%).18 The trials in which maximum peak force exceeded the upper threshold were regarded as high force error (HFE), and those that did not reach the lower threshold were regarded as low force error (LFE). The trials in which the CV of forces exceeded the threshold were regarded as force variability error (FVE).
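The threshold rule above could be sketched as follows. This is an illustrative reading, not the authors' implementation: the text does not state which percentile estimator was used, so linear interpolation between order statistics is an assumption, and the function and key names are hypothetical.

```python
from typing import Dict, Sequence, Set


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile by linear interpolation (q in 0..100); estimator is assumed."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values")
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def error_thresholds(expert_max_forces: Sequence[float],
                     expert_cvs: Sequence[float]) -> Dict[str, float]:
    """Cutoffs derived from experienced-surgeon trials for one task."""
    return {
        "low": percentile(expert_max_forces, 5),    # below this: LFE
        "high": percentile(expert_max_forces, 95),  # above this: HFE
        "cv": percentile(expert_cvs, 95),           # above this: FVE
    }


def classify_trial(max_force: float, cv: float, th: Dict[str, float]) -> Set[str]:
    """Return the set of force errors (possibly empty) for one trial."""
    errors = set()
    if max_force > th["high"]:
        errors.add("HFE")
    elif max_force < th["low"]:
        errors.add("LFE")
    if cv > th["cv"]:
        errors.add("FVE")
    return errors


# Made-up reference data, for illustration only.
th = error_thresholds([i / 10 for i in range(21)], [i / 100 for i in range(21)])
too_hard = classify_trial(2.0, 0.10, th)
too_soft_and_erratic = classify_trial(0.05, 0.25, th)
within_range = classify_trial(1.0, 0.10, th)
```

A trial can carry FVE together with either HFE or LFE, which matches the text treating variability as a separate error from force magnitude.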

Statistical Analysis

The normality of the data was investigated using the Shapiro-Wilk test. Normally distributed continuous data were compared using the Welch 2-tailed, unpaired t test between 2 groups and by 1-factor analysis of variance followed by the post hoc Bonferroni test among the 3 groups. The data not normally distributed were compared using the Mann-Whitney test between 2 groups and by the Kruskal-Wallis test followed by the Steel-Dwass test among 3 groups. Homogeneity of variance was tested using the F test between 2 groups and the Bartlett test among 3 groups. P < .05 (2-sided) was considered to be statistically significant.
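For readers unfamiliar with the Welch test named above, the following sketch computes its test statistic and the Welch-Satterthwaite degrees of freedom from two samples. It is a textbook formula written out by hand, not the authors' R or SPSS procedure, and it stops short of a P value, which would require a t-distribution function from a statistics library.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t, df) for the Welch unequal-variance t test."""
    va = variance(a) / len(a)   # squared standard error of mean(a)
    vb = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up samples, for illustration only.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

Because the degrees of freedom shrink when the two variances differ, the test remains usable when the homogeneity-of-variance check described above fails.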

Stepwise forward canonical discriminant analysis was performed to discriminate surgeon skill level based on the 6 variables (mean, maximum, CV, and HFE, LFE, and FVE rates) of each surgeon. The variables that contributed the most to the discrimination of the 3 different surgeon groups were selected using the F values (P < .10) as the criterion for inclusion in the stepwise analysis. Correlation ratio (η2), Wilks λ, and the F values were used to test the significance and estimate the weight of each variable. Performance of the discrimination was examined using a leave-one-out cross-validation.13,26 Statistical analysis was completed with the use of R software, version 2.2-3 (R Development Core Team) and SPSS Statistics, version 24.0 (IBM Inc).
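The leave-one-out procedure mentioned above can be sketched in a few lines. To keep the example self-contained, a nearest-centroid classifier stands in for the stepwise canonical discriminant analysis used in the study; it is a deliberately simpler, swapped-in technique. The feature triples (HFE, LFE, and FVE rates) are made-up values, not study data, and all names are hypothetical.

```python
from math import dist
from statistics import mean
from typing import Dict, List, Tuple

Row = Tuple[Tuple[float, ...], str]  # (feature vector, group label)


def centroids(rows: List[Row]) -> Dict[str, Tuple[float, ...]]:
    """Mean feature vector per group."""
    grouped: Dict[str, List[Tuple[float, ...]]] = {}
    for features, label in rows:
        grouped.setdefault(label, []).append(features)
    return {label: tuple(mean(col) for col in zip(*members))
            for label, members in grouped.items()}


def predict(features: Tuple[float, ...], cents: Dict[str, Tuple[float, ...]]) -> str:
    """Assign the group whose centroid is closest (Euclidean distance)."""
    return min(cents, key=lambda label: dist(features, cents[label]))


def leave_one_out_accuracy(rows: List[Row]) -> float:
    """Hold out each surgeon once, fit on the rest, and score the held-out one."""
    hits = 0
    for i, (features, label) in enumerate(rows):
        training = rows[:i] + rows[i + 1:]
        hits += predict(features, centroids(training)) == label
    return hits / len(rows)


# Hypothetical per-surgeon (HFE, LFE, FVE) rates, for illustration only.
surgeons: List[Row] = [
    ((0.30, 0.10, 0.18), "novice"),
    ((0.25, 0.12, 0.15), "novice"),
    ((0.28, 0.09, 0.16), "novice"),
    ((0.12, 0.12, 0.10), "intermediate"),
    ((0.14, 0.11, 0.09), "intermediate"),
    ((0.10, 0.13, 0.11), "intermediate"),
    ((0.05, 0.04, 0.05), "experienced"),
    ((0.04, 0.05, 0.04), "experienced"),
    ((0.06, 0.03, 0.06), "experienced"),
]
accuracy = leave_one_out_accuracy(surgeons)
```

The point of the held-out loop is that each surgeon is classified by a model that never saw that surgeon's data, which guards against an optimistic accuracy estimate in a sample as small as 16.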

Results
Study Participants

This study included 26 patients (10 [38%] male and 16 [62%] female; mean [SD] age, 43 [15] years) undergoing neurosurgery by 16 surgeons (6 in the novice group, 5 in the intermediate group, and 5 in the experienced group). Six patients had cerebrovascular disease (3 vascular malformations, 2 cranial nerve microvascular decompression, and 1 unruptured aneurysm); 5, intra-axial tumors (glioma); 14, extra-axial tumors (5 schwannoma, 8 meningioma, and 1 dural-based metastatic carcinoma); and 1, temporal lobectomy for epilepsy.

Baseline Force Profile of Experienced Surgeons

The force profile of the mean, maximum, and CV of force peak readings from experienced surgeons across 18 surgical tasks is summarized in the Table. Force levels varied across surgical tasks, indicating that the applied forces were well adjusted according to the size and fragility of the tissue. Forces for the extracranial procedure and grip task tended to be higher; however, most intracranial tasks required less than 0.6 N in mean force and less than 1.0 N in maximum force. Of interest, a mean force of only 0.45 N to 0.54 N with a maximum force of 0.71 N to 0.84 N was enough to divide normal brain. Using this data set, we standardized all force profiles and defined thresholds of each surgical task for comparison analyses.

Task Performance Evaluation and Force Analysis

Video recordings of 887 trials (451 from experienced, 249 from intermediate, and 187 from novice surgeon trials) were evaluated. A total of 46 experienced (10.2%), 33 intermediate (13.3%), and 45 novice (24.1%) surgeon trials resulted in unsuccessful trial–incomplete. Unsuccessful trial–bleeding was observed in 17 experienced (3.8%), 17 (6.8%) intermediate, and 18 novice (9.6%) surgeon trials.
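The percentages above follow directly from the reported counts, as the short check below shows. The dictionary and function names are illustrative; the counts are those given in the text.

```python
# Trial counts per surgeon group, as reported in the text.
trials = {"experienced": 451, "intermediate": 249, "novice": 187}
incomplete = {"experienced": 46, "intermediate": 33, "novice": 45}
bleeding = {"experienced": 17, "intermediate": 17, "novice": 18}


def percent(count: int, total: int) -> float:
    """Rate as a percentage rounded to 1 decimal place."""
    return round(100 * count / total, 1)


total_trials = sum(trials.values())
incomplete_pct = {g: percent(incomplete[g], trials[g]) for g in trials}
bleeding_pct = {g: percent(bleeding[g], trials[g]) for g in trials}

for group in trials:
    print(f"{group}: incomplete {incomplete_pct[group]}%, bleeding {bleeding_pct[group]}%")
```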

Standardized force profiles and force error rates of successful and unsuccessful trials are shown in Figure 3A and B. Compared with successful trials, the applied force tended to be lower in unsuccessful trial–incomplete (0.23 vs −0.1; P = .01) and significantly higher in unsuccessful trial–bleeding (0.23 vs 0.90; maximum, 0.24 vs 1.35; P < .001). Force variability (CV of force) was also significantly higher in unsuccessful trial–bleeding than successful trials (0.76 vs 0.09; P < .001). Error analysis revealed that the rate of HFE was significantly higher in unsuccessful trial–bleeding (0.40 vs 0.09; P < .001), and the rate of LFE was significantly higher in unsuccessful trial–incomplete (0.26 vs 0.03; P < .001). The rate of FVE was significantly higher in both unsuccessful trial–bleeding and unsuccessful trial–incomplete (0.26 and 0.14 vs 0.06; P < .001).

Comparison Among Surgeon Groups

Standardized force profiles and force error rates for the selected 10 surgical tasks are shown in Figure 3C and D, and the data are summarized in eTable 2 in the Supplement. The mean and maximum forces of novice surgeons were significantly higher than those of experienced and intermediate surgeons (0.74 vs 0.00 and 0.08; P < .001). No difference was found between intermediate and experienced surgeons for most tasks; however, in the dissection and coagulation of small cortical vessel tasks, intermediate surgeons used higher forces than experienced surgeons (for dissection, 1.31 vs 0.00; P < .001; for coagulation, 0.54 vs 0.00; P = .02), whereas in division of glioma, the opposite was observed (−0.5 vs 0.00; P = .01). Tasks such as coagulation of scalp vessel and dura showed no force differences among the 3 groups.

For most tasks, novice surgeons performed with higher variation of forces, indicating that their performance was inconsistent among trials. The force variability of intermediate and novice surgeons was also significantly higher than that of experienced surgeons, indicating inconsistent use of forces, whereas experienced surgeons were more consistent. Figure 3D shows the rate of HFE, LFE, and FVE in all trials; the data are detailed in eTable 3 in the Supplement. All 3 error rates among intermediate (HFE, 0.12; LFE, 0.12; and FVE, 0.10) and novice surgeons (HFE, 0.27; LFE, 0.10; and FVE, 0.16) were significantly higher than those of experienced surgeons (HFE, 0.05; LFE, 0.04; and FVE, 0.05; P < .001). Even in the tasks in which force strength of novice surgeons was higher than that of experienced surgeons, the rates of LFE and HFE were higher than those among experienced surgeons (eTable 2 and eTable 3 in the Supplement). This observation was further supported by the dispersion of force data of each trial being higher than that of experienced surgeons. The rates of HFE and FVE among novice surgeons were significantly higher than those among intermediate surgeons (HFE, 0.27 vs 0.12; P < .001; FVE, 0.16 vs 0.10; P = .01). The rate of LFE did not differ between novice and intermediate surgeons.

Surgeon Profiles and Discriminant Analysis

The surgeon error rates (Figure 4A-C and eTable 4 in the Supplement) varied among surgeons (ie, some often used too much force, some used too little, and some were inconsistent). To investigate which variables contributed most to determining surgeon skill level, stepwise discriminant analysis was performed. We found that the 3 error rates had stronger discriminant power than the standardized force profiles (Figure 4D and eTable 5 in the Supplement). In particular, the rates of HFE and FVE contributed the most to the discriminant model. With use of discriminant functions 1 and 2, 14 of the original 16 grouped surgeons (87.5%) were correctly classified into their respective groups (Figure 4D). With use of the leave-one-out method, 14 of the 16 cross-validated grouped surgeons (87.5%) (100% of the experienced and intermediate group surgeons) were correctly classified. A confusion matrix is given in eTable 6 in the Supplement.

Discussion

To our knowledge, this is the first study that obtained force data from neurosurgical patients during surgery. Although other studies12,14,15,22,27 have used surgical models or simulators in assessing force data, none of these represented an ideal, realistic surgical scenario that includes not only the physical properties but also the psychological stress related to the performance of surgery. The range of forces in this study was different from those previously reported using cadaver brain.14 For example, the force for coagulation of scalp vessels was 0.2 N to 0.41 N in a previous cadaveric study,14 whereas it was 0.68 N in this report. This finding is not surprising because cadaveric specimens have different mechanical properties, being obtained hours after death, and lack pulsatile blood flow. The force data presented in this study are therefore an accurate representation of forces applied during neurosurgery.

A selection bias might have occurred with easier surgical tasks assigned to the novice surgeon group. For simple tasks, such as coagulation of scalp vessels, no force differences were observed among the 3 groups. Differences were encountered as the complexity and risk of the task increased. Therefore, complex tasks outside the level of training for novice surgeons would be expected to be associated with increased differences in the use of forces. Investigation of this assumption would mandate experimental studies using animal or cadaveric models elaborated in the eMethods in the Supplement. The different force levels among the 18 surgical tasks were likely related to the variability in tissue consistency and thus the resistance encountered and, in part, to knowledge of surgical-structural anatomy. The 5 experienced surgeons had less variation of force profiles in most of the tasks. That the profiles were comparable led to 2 assumptions: (1) experienced surgeons used similar forces to complete a given task, and (2) the frequency of unsuccessful trials among experienced surgeons was low, and therefore, the cutoff value for error analysis was set at the 5th and 95th percentiles. Although the choice of the cutoff value needs to be validated in the future, this choice was supported by video recordings, in which experienced surgeons failed to complete a given task approximately 10% of the time, and the frequency of vessel injury during the performance of a task was approximately 4%.

Similar to a previous report,17 for most surgical tasks, force strength exerted by novice surgeons was higher than that of experienced surgeons, and force variability decreased from novice to experienced surgeons. These differences, however, varied with some exceptions. In the division of glioma, novice and intermediate surgeons applied lower force strength than experienced surgeons. This finding might suggest hesitancy among novice surgeons, who may be trying to avoid injury to delicate structures while having difficulty distinguishing tumor from normal brain. Of importance, error analyses revealed clearer differences between novice and experienced surgeons. Even in the tasks in which force strength of novice surgeons was lower than that of experienced surgeons, the rates of HFE were higher than among experienced surgeons and vice versa. This finding suggests that error analysis could be an indicator for the objective assessment of surgical skill. Discriminant analysis also revealed that the 3 error rates (HFE, LFE, and FVE) had a stronger discriminant power for surgeon skill level than force profiles (mean, maximum, and CV of force). As the surgeon training level decreased, HFE and FVE increased, establishing an inverse association. Although the rate of LFE among novice and intermediate surgeons was also higher than that among experienced surgeons, no differences were found between intermediate and novice surgeons. In general, in the training process from novice to intermediate level, the surgical trainee is more cognizant of the adverse effect of HFE than LFE. This finding is also supported by the observation that HFE was associated with consequential errors (ie, tissue injury or serious complication).16

Limitations

One of the limitations of the SmartForceps was that it allowed measurement of forces only in the 2 most important directions: dissection (opening) and coagulation (closing). A more comprehensive understanding of tool-tissue interaction requires force measurement along the other 2 axes, such as those involved in puncturing of tissues and in upward retraction and tissue lifting. This limitation and future work toward providing real-time feedback (machine learning) and validation of results using an established animal model (surgical performance and learning curve) are detailed in the eMethods in the Supplement.

Conclusions

The SmartForceps introduces a mechanism to quantify the forces related to the performance of surgery in the operating room. Furthermore, force profiles and the result of error analysis uniquely reflect surgeon skill level and experience. This finding suggests that assessment of surgical forces can be used as an indicator of surgical skill proficiency in terms of applied force and instrument handling. This technology thus provides the ability to quantify surgical skill, which is in line with the ongoing evolution of competency-based surgical education for assessing trainee performance and progression. Application of this principle to commonly used toolsets will encompass the broader scope of surgical training and patient safety.

Article Information

Corresponding Author: Garnette R. Sutherland, MD, FRCSC, Department of Clinical Neurosciences and the Hotchkiss Brain Institute, University of Calgary, 1C60 HRIC 3280 Hospital Drive NW, Calgary AB T2N 4Z6, Canada (garnette@ucalgary.ca).

Accepted for Publication: July 2, 2017.

Correction: This article was corrected on January 10, 2018, for incorrect data presentation and figure citations.

Published Online: November 15, 2017. doi:10.1001/jamasurg.2017.4516

Author Contributions: Dr Sugiyama had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study concept and design: All authors.

Acquisition, analysis, or interpretation of data: Sugiyama, Gan, Maddahi, Zareinia, Sutherland.

Drafting of the manuscript: Sugiyama, Lama, Gan, Maddahi, Zareinia.

Critical revision of the manuscript for important intellectual content: Sugiyama, Lama, Maddahi, Zareinia, Sutherland.

Statistical analysis: Sugiyama, Gan, Maddahi, Zareinia.

Obtained funding: Sugiyama, Zareinia, Sutherland.

Administrative, technical, or material support: Sugiyama, Lama, Gan, Zareinia, Sutherland.

Study supervision: Lama, Zareinia, Sutherland.

Conflict of Interest Disclosures: All the authors were involved in the development of the SmartForceps and are now working toward its commercialization.

Funding/Support: This work was supported by grant 8766 from the Canada Foundation for Innovation (Dr Sutherland), Western Economic Diversification (Dr Sutherland), Alberta Advanced Education and Technology (Dr Sutherland), and the KANAE Foundation for the Promotion of Medical Science (Dr Sugiyama).

Role of the Funder/Sponsor: The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and the decision to submit the manuscript for publication.

Meeting Presentation: Poster presented at the American Association of Neurological Surgeons Annual Scientific Meeting; April 22-27, 2017; Los Angeles, California.

Additional Contributions: Killam Trusts provided a postdoctoral fellowship award to Dr Maddahi. Fang Wei Yang, MD, Department of Clinical Neurosciences, University of Calgary, Calgary, Alberta, Canada, provided technical contribution. Lara Cooke, MD, Yves Starrveld, MD, Alim Mitha, MD, and John Kelly, MD, Department of Clinical Neurosciences, University of Calgary, Calgary, Alberta, Canada, together with the University of Calgary Neurosurgery Residency Program, also participated in this study. None received financial compensation for their work.

References
1. Birkmeyer JD, Finks JF, O’Reilly A, et al; Michigan Bariatric Surgery Collaborative. Surgical skill and complication rates after bariatric surgery. N Engl J Med. 2013;369(15):1434-1442.
2. Holmboe ES, Sherbino J, Long DM, Swing SR, Frank JR. The role of assessment in competency-based medical education. Med Teach. 2010;32(8):676-682.
3. Reznick RK, MacRae H. Teaching surgical skills—changes in the wind. N Engl J Med. 2006;355(25):2664-2669.
4. van Hove PD, Tuijthof GJ, Verdaasdonk EG, Stassen LP, Dankelman J. Objective assessment of technical surgical skills. Br J Surg. 2010;97(7):972-987.
5. Champion HR, Meglan DA, Shair EK. Minimizing surgical error by incorporating objective assessment into surgical education. J Am Coll Surg. 2008;207(2):284-291.
6. Richstone L, Schwartz MJ, Seideman C, Cadeddu J, Marshall S, Kavoussi LR. Eye metrics as an objective assessment of surgical skill. Ann Surg. 2010;252(1):177-182.
7. Aoun SG, El Ahmadieh TY, El Tecle NE, et al. A pilot study to assess the construct and face validity of the Northwestern Objective Microanastomosis Assessment Tool. J Neurosurg. 2015;123(1):103-109.
8. Martin JA, Regehr G, Reznick R, et al. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg. 1997;84(2):273-278.
9. Kalu PU, Atkins J, Baker D, Green CJ, Butler PE. How do we assess microsurgical skill? Microsurgery. 2005;25(1):25-29.
10. Darzi A, Smith S, Taffinder N. Assessing operative skill: needs to become more objective. BMJ. 1999;318(7188):887-888.
11. Ghasemloonia A, Maddahi Y, Zareinia K, Lama S, Dort JC, Sutherland GR. Surgical skill assessment using motion quality and smoothness. J Surg Educ. 2017;74(2):295-305.
12. Harada K, Morita A, Minakawa Y, et al. Assessing microneurosurgical skill with medico-engineering technology. World Neurosurg. 2015;84(4):964-971.
13. Chmarra MK, Klein S, de Winter JC, Jansen FW, Dankelman J. Objective classification of residents based on their psychomotor laparoscopic skills. Surg Endosc. 2010;24(5):1031-1039.
14. Gan LS, Zareinia K, Lama S, Maddahi Y, Yang FW, Sutherland GR. Quantification of forces during a neurosurgical procedure: a pilot study. World Neurosurg. 2015;84(2):537-548.
15. Zareinia K, Maddahi Y, Gan LS, et al. A force-sensing bipolar forceps to quantify tool-tissue interaction forces in microsurgery. IEEE/ASME Trans Mechatron. 2016;21(5):2365-2377.
16. Tang B, Hanna GB, Cuschieri A. Analysis of errors enacted by surgical trainees during skills training courses. Surgery. 2005;138(1):14-20.
17. Cundy TP, Thangaraj E, Rafii-Tari H, et al. Force-Sensing Enhanced Simulation Environment (ForSense) for laparoscopic surgery training and assessment. Surgery. 2015;157(4):723-731.
18. Obdeijn MC, Horeman T, de Boer LL, van Baalen SJ, Liverneaux P, Tuijthof GJ. Navigation forces during wrist arthroscopy: assessment of expert levels. Knee Surg Sports Traumatol Arthrosc. 2016;24(11):3684-3692.
19. Trejos AL, Patel RV, Malthaner RA, Schlachta CM. Development of force-based metrics for skills assessment in minimally invasive surgery. Surg Endosc. 2014;28(7):2106-2119.
20. Sutherland GR, Lama S, Gan LS, Wolfsberger S, Zareinia K. Merging machines with microsurgery: clinical experience with neuroArm. J Neurosurg. 2013;118(3):521-529.
21. Sutherland GR, Maddahi Y, Gan LS, Lama S, Zareinia K. Robotics in the neurosurgical treatment of glioma. Surg Neurol Int. 2015;6(suppl 1):S1-S8.
22. Payne CJ, Marcus HJ, Yang GZ. A smart haptic hand-held device for neurosurgical microdissection. Ann Biomed Eng. 2015;43(9):2185-2195.
23. Maddahi Y, Ghasemloonia A, Zareinia K, Sepehri N, Sutherland GR. Quantifying force and positional frequency bands in neurosurgical tasks. J Robot Surg. 2016;10(2):97-102.
24. Sutherland GR, Zareinia K, Gan LS, Hirmer TJ, Lama S. Bipolar forceps with force measurement. Google patents. 2013. http://www.google.com/patents/US20150005768. Accessed October 9, 2017.
25. Lodha N, Misra G, Coombes SA, Christou EA, Cauraugh JH. Increased force variability in chronic stroke: contributions of force modulation below 1 Hz. PLoS One. 2013;8(12):e83468.
26. Horeman T, Rodrigues SP, Jansen FW, Dankelman J, van den Dobbelsteen JJ. Force parameters for skills assessment in laparoscopy. IEEE Trans Haptics. 2012;5(4):312-322.
27. Azarnoush H, Siar S, Sawaya R, et al. The force pyramid: a spatial analysis of force application during virtual reality brain tumor resection. J Neurosurg. 2017;127(1):171-181.