A critical appraisal of Evicore’s guidelines for advanced diagnostic imaging of the spine for lower extremity pain with neurological features
• Evicore’s spinal imaging guidelines for lower extremity pain with neurological features would benefit from identification and diversification of development parties and stakeholders, increased rigor of development, and incorporation of explicit suggestions for implementation of recommendations by healthcare providers.
What is known and what is new?
• Prior authorization is required by insurance companies for advanced imaging studies.
• Insurance companies use Evicore’s guidelines to determine the need for advanced imaging.
• The AGREE II guideline appraisal tool examines quality domains including scope and purpose, stakeholder involvement, rigor of development, clarity of presentation, applicability, and editorial independence.
• AGREE II was leveraged to evaluate Evicore’s guidelines.
What is the implication, and what should change now?
• Except for clarity of presentation, all domains received AGREE II scores <50%, indicating weak methodologic quality.
• Evicore’s guidelines should be updated for improved quality in these domains.
Prior authorization (PA) is required by many insurance companies for advanced imaging studies such as magnetic resonance imaging (MRI). Typically, office notes undergo review, and additional information or a peer-to-peer discussion may be required before the study is authorized. The PA process is controversial because of the increased burden it places on physicians and staff and its potential to delay care for patients (1). While some PA programs have been shown to decrease healthcare expenditures in select populations, many have disputed their efficacy (2-5). Physicians frequently report that the PA process can lead to serious adverse events (6). Further, a survey conducted by the American Hospital Association in 2019 found that 89% of hospital systems had experienced an increase in claim denials, with PA issues being the main cause (7). Given their questionable efficacy, an increasing rate of denials, and the potential for adverse events, PA guidelines should be critically investigated.
Many insurance companies use Evicore’s guidelines to determine the need for advanced imaging for musculoskeletal pathology. Evicore aims to provide an evidence-based approach that leverages analytics to guide decision making and quality reporting while reducing administrative burden and costs (8). The guidelines produced by Evicore attempt to address areas of over-utilization and unnecessary spending as well as pinpoint areas to improve care and increase savings. This review was performed to evaluate whether Evicore’s guidelines are of appropriate methodological quality, that is, whether they were developed through a transparent decision-making process based on high-quality evidence.
Specifically, our goal was to assess the quality of guideline development for advanced diagnostic imaging for lower extremity pain with neurological features, with or without back pain, by examining the General Guidelines (SP-1) as well as section 6-1 of Evicore’s Clinical Guidelines Spine Imaging Policy. These sections were chosen because requests for MRIs for patients who present with these features often result in denial and the need for a peer-to-peer review. Section 6-1 states that for patients with lower extremity pain with neurological features:
“All of the following are required prior to advanced imaging including MRI lumbar spine without contrast (or CT lumbar spine without contrast/CT myelography when MRI is contraindicated):
- Initial clinical evaluation must be performed;
- A face-to-face evaluation must be performed within the last 60 days;
- Initial evaluation is not required within the last 60 days if another face-to-face evaluation was performed in that time frame. This may be satisfied by the initial evaluation, re-evaluation or another visit;
- Failure of recent (within 3 months) 6-week trial of provider-directed treatment;
- Clinical re-evaluation after treatment period (may consist of a face-to-face evaluation or other meaningful contact).”
We used the Appraisal of Guidelines for Research and Evaluation (AGREE) II tool to evaluate each guideline. The AGREE II is a standardized tool that provides a framework to assess the quality of guideline development (9). It was designed to address the issues of variability in guideline quality and improve methodological rigor and transparency during the guideline development process. Since its 2009 release, the AGREE II has been used to evaluate hundreds of guidelines and has been extensively validated (9-12). More specifically, this tool has been utilized to evaluate guidelines pertaining to surgical and nonsurgical management of back pain and spinal cord injury. These appraisals have demonstrated room for improvement in knowledge translation and health system changes while also highlighting strengths in guideline development (2,13-15).
The objective of our study was to evaluate Evicore’s advanced diagnostic spinal imaging guidelines for lower extremity pain with neurological symptoms with or without lower back pain using the AGREE II tool.
Five appraisers conducted a comprehensive assessment (LB, BS, NI, JZ, JK). The appraisers used training tools developed by the AGREE collaboration before conducting appraisals to ensure a quality review of the guidelines (9,10). Evicore’s guidelines for advanced diagnostic spinal imaging for lower extremity pain with neurological features with or without lower back pain were rated independently with the AGREE II tool by each appraiser. Appraisers did not discuss their respective scores during the appraisal process. All five appraisers evaluated all items within each domain.
The AGREE II tool comprises 23 items organized into six quality domains: scope and purpose, stakeholder involvement, rigor of development, clarity of presentation, applicability, and editorial independence (Table 1). Each item within these six domains is rated from 1 (strongly disagree) to 7 (strongly agree). Detailed criteria for each item within the AGREE II tool were utilized by the appraisers to guide their evaluation. The appraisers also provided an overall assessment of the guidelines on the 1 to 7 scale, as well as a statement regarding whether they would recommend the guideline as is, recommend it with modifications, or not recommend it at all.
|Domain||Item||Description|
|1 Scope and purpose||1||The overall objective(s) of the guideline is (are) specifically described|
|1 Scope and purpose||2||The health question(s) covered by the guideline is (are) specifically described|
|1 Scope and purpose||3||The population to whom the guideline is meant to apply is specifically described|
|2 Stakeholder involvement||4||The guideline development group includes individuals from all relevant professional groups|
|2 Stakeholder involvement||5||The views and preferences of the target population (patients, public, etc.) have been sought|
|2 Stakeholder involvement||6||The target users of the guideline are clearly defined|
|3 Rigor of development||7||Systematic methods were used to search for evidence|
|3 Rigor of development||8||The criteria for selecting the evidence are clearly described|
|3 Rigor of development||9||The strengths and limitations of the body of evidence are clearly described|
|3 Rigor of development||10||The methods for formulating the recommendations are clearly described|
|3 Rigor of development||11||The health benefits, side effects, and risks have been considered in formulating the recommendations|
|3 Rigor of development||12||There is an explicit link between the recommendations and the supporting evidence|
|3 Rigor of development||13||The guideline has been externally reviewed by experts prior to its publication|
|3 Rigor of development||14||A procedure for updating the guideline is provided|
|4 Clarity of presentation||15||The recommendations are specific and unambiguous|
|4 Clarity of presentation||16||The different options for management of the condition or health issue are clearly presented|
|4 Clarity of presentation||17||Key recommendations are easily identifiable|
|5 Applicability||18||The guideline describes facilitators and barriers to its application|
|5 Applicability||19||The guideline provides advice and/or tools on how the recommendations can be put into practice|
|5 Applicability||20||The potential resource implications of applying the recommendations have been considered|
|5 Applicability||21||The guideline presents monitoring and/or auditing criteria|
|6 Editorial independence||22||The views of the funding body have not influenced the content of the guideline|
|6 Editorial independence||23||Competing interests of guideline development group members have been recorded and addressed|
Average scores for Evicore’s guideline were calculated across all appraisers for each domain of the AGREE II tool. Overall average appraisal scores for each individual appraiser were also calculated for the guideline. Scaled domain scores were converted into a percentage as per AGREE II recommendations: scaled domain score = (obtained score − minimum possible score)/(maximum possible score − minimum possible score) × 100, where the obtained score is the sum of all appraisers’ ratings for all items in the domain. In accordance with examples provided by the AGREE II user manual, a clinical guideline was considered satisfactory if it scored at least 50% on all six domains.
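The scaled domain score described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' spreadsheet: the function names, the constant names, and the example ratings are hypothetical, and the only values taken from the text are the 1 to 7 rating scale and the 50% threshold.

```python
from typing import Sequence

MIN_RATING = 1  # "strongly disagree"
MAX_RATING = 7  # "strongly agree"
SATISFACTORY_THRESHOLD = 50.0  # percent; the cut-off applied in this study


def scaled_domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Return the scaled score (0-100) for one AGREE II domain.

    `ratings` has one row per appraiser and one column per item in the domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    for row in ratings:
        if len(row) != n_items:
            raise ValueError("every appraiser must rate every item in the domain")
        if any(not MIN_RATING <= r <= MAX_RATING for r in row):
            raise ValueError("ratings must lie between 1 and 7")

    obtained = sum(sum(row) for row in ratings)
    minimum = MIN_RATING * n_items * n_appraisers
    maximum = MAX_RATING * n_items * n_appraisers
    return 100.0 * (obtained - minimum) / (maximum - minimum)


def is_satisfactory(domain_scores: Sequence[float]) -> bool:
    """A guideline is satisfactory only if every domain reaches the threshold."""
    return all(score >= SATISFACTORY_THRESHOLD for score in domain_scores)


# Hypothetical ratings for a three-item domain from five appraisers
# (made up for illustration; these are not the study's data).
hypothetical_ratings = [
    [5, 6, 6],
    [4, 5, 3],
    [2, 4, 3],
    [3, 3, 2],
    [4, 5, 4],
]

if __name__ == "__main__":
    score = scaled_domain_score(hypothetical_ratings)
    print(f"Scaled domain score: {score:.1f}%")
```

Because the minimum and maximum possible scores scale with both the number of items and the number of appraisers, domains of different sizes become comparable on the same 0 to 100 scale.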
Raw appraisal scores were tabulated in Microsoft Excel and sent to appraisers for detection and correction of errors. Final scaled domain percentages were calculated, as was an intraclass correlation coefficient, ICC(2,k), to assess agreement among raters. ICC estimates <0.50, between 0.50 and 0.75, between 0.75 and 0.90, and >0.90 can be interpreted as poor, moderate, good, and excellent reliability, respectively (16). ICC calculations were performed using Python Version 3.7, and P values less than 0.05 were considered statistically significant.
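For readers unfamiliar with ICC(2,k), the following is a minimal pure-Python sketch of the two-way random-effects, absolute-agreement, average-measures form computed from ANOVA mean squares. The article does not state which Python library or data layout was used, so the layout here (one row per rated item, one column per appraiser), the function names, and the example ratings are assumptions. How values falling exactly on a band boundary are labeled is also an assumption.

```python
from typing import Sequence


def icc2k(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,k): two-way random effects, absolute agreement, average of k raters.

    `scores` has one row per rated target (assumed here: AGREE II items)
    and one column per rater (appraiser).
    """
    n = len(scores)      # number of targets
    k = len(scores[0])   # number of raters
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((v - grand_mean) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-target mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


def interpret_icc(icc: float) -> str:
    """Map an ICC estimate to the reliability bands quoted in the text."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Illustrative ratings: six targets scored by four raters (not the study's data).
illustrative_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

if __name__ == "__main__":
    estimate = icc2k(illustrative_ratings)
    print(f"ICC(2,k) = {estimate:.3f} ({interpret_icc(estimate)})")
```

This sketch returns only the point estimate; a confidence interval and P value would require the corresponding F-distribution calculations, which are omitted here.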
Overall assessment scores and recommendations for use of Evicore’s 6-1 guideline are provided in Table 2. None of the appraisers recommended the guideline as is; three recommended the guideline with modifications; two did not recommend the guideline at all.
|Authors||Overall assessment score||Recommendation|
|Lainey Bukowiec (LB)||2||Recommended, with modifications|
|Brandon Stahl (BS)||2||Not recommended|
|Nareena Imam (NI)||4||Recommended, with modifications|
|Jay Zaifman (JZ)||2||Not recommended|
|John Koerner (JK)||3||Recommended, with modifications|
Figure 1 presents the scaled domain percentages for all five appraisers. Clarity of presentation was the only domain that scored over 50%.
Table 1 provides a detailed description of the AGREE II Parameters by domain and item. A further breakdown of domain and item scores for each individual appraiser is provided in Table 3. The intraclass correlation coefficient was 0.881 [95% confidence interval (CI): 0.77, 0.94], indicating good reliability amongst raters, as seen in Table 4.
LB, Lainey Bukowiec; BS, Brandon Stahl; NI, Nareena Imam; JZ, Jay Zaifman; JK, John Koerner.
|Type||ICC||95% CI||P value|
ICC, intraclass correlation coefficient; CI, confidence interval.
As the number and usage of clinical practice guidelines increase, the need to evaluate their development and quality becomes paramount. These guidelines have the potential to greatly impact patient outcomes, especially as the rate of PA denials continues to increase (6,7). Thorough evaluation and targeted improvement of these guidelines likely benefit patient care (17). The AGREE II tool provides a systematic method of assessing the value of a guideline and proposing specific and applicable corrections. Our study used the AGREE II assessment tool to evaluate the quality of a single guideline provided by Evicore for the diagnostic workup of lower extremity pain with neurological features with or without lower back pain.
Except for the clarity of presentation domain, this guideline received low AGREE II scores, with all five remaining domains scoring lower than 50%. Scores lower than 50% indicate weak methodologic quality with a lack of supporting evidence including randomized trials and systematic reviews (18).
Scope and purpose
The scope and purpose domain scored 40%, below the satisfactory threshold of 50%. Guidelines that clearly specify their objectives and intended patient population in a well-structured format tend to score highly on this domain (19,20). Appraisers noted that while the intent of the guideline (to use MRI of the lumbar spine to aid in diagnosing radiculopathy, radiculitis, plexopathy and neuropathy) was stated, there was no explicit mention of the benefits expected should these guidelines be employed. In addition, the target population of individuals with lower extremity pain with neurological features, with or without lower back pain, was defined but would benefit from further stratification based on, for example, patient gender and presence of comorbidities.
Stakeholder involvement
The stakeholder involvement domain received an unsatisfactory score of 11%. The guideline did not specify its target users or the extent of input from vested parties, including patients and providers. While the culture of medicine has shifted to place a greater emphasis on patient-centered care, developers of clinical guidelines have been slow to seek and include patient viewpoints (19-21). It may be argued that including patient preference when creating guidelines is unnecessary because preferences vary and some patients may prefer not to be involved in their care; however, patient input may ultimately improve clinical outcomes (22-24). Although the practice is controversial, the use of patient satisfaction and patient-reported outcome measures to evaluate medical care is rising; stating whether patient preference was sought and resolving the ambiguity regarding target users would therefore improve the overall quality of this guideline (25). This guideline would also benefit from a clear acknowledgement section that explicitly names the individuals from relevant professional groups who were involved in the guideline development process.
Rigor of development
The rigor of development domain received an unsatisfactory score of 10%. Appraisers pointed out that systematic methods used to search for supporting evidence and to subsequently create guideline recommendations were not specified. Additionally, discussions regarding strengths and limitations of supporting evidence, consequential benefits and risks of implementation of the guidelines, and a procedure for updating the guideline regularly as new evidence emerges were not included. Although the guideline mentions that relevant external stakeholders reviewed the guideline prior to its publication, an explicit statement identifying specialties, titles and/or affiliations of these experts would improve this domain.
Clarity of presentation
Out of the six domains, only the clarity of presentation domain received a score considered satisfactory (66%). Appraisers noted positive features of this guideline including straightforward bulleted lists and tables with easy-to-read text, links to outside material that are highlighted and accessible, and succinctly summarized recommendations.
Applicability
The applicability domain received a score of 13%, with appraisers citing a lack of consideration of barriers to application of the guideline, resources required for its implementation, and specific information on auditing criteria. Consideration of these factors during the development process is essential, as even guidelines with high-quality recommendations are rendered ineffective if they are difficult to implement. Previous studies using the AGREE II tool to evaluate guidelines have also recognized this domain as commonly deficient across various fields of medicine (19,21,26). Guideline characteristics that increase the chance of utilization by providers include intelligibility, convenience, and accessibility of required resources (27). Evicore would benefit from considering these characteristics in formulating their recommendations. Specifically, incorporation of checklists, how-to manuals, cost considerations, or outcomes of pilot tests and lessons learned would be beneficial to this guideline’s applicability.
Editorial independence
Editorial independence was the lowest scoring domain at 0%. The editorial independence domain addresses whether a guideline’s developers and funding parties have conflicts of interest that were disclosed and addressed to minimize bias. The consistently low scores for this domain provided by our appraisers reflect the lack of mention of funding bodies, developers, and subsequent discussion of competing interests of these unnamed parties in Evicore’s guidelines. These problems are not unique to Evicore’s guidelines, as questionable editorial independence and failure to disclose conflicts of interest are widespread problems throughout the healthcare industry that can have negative consequences (28,29). As determined by a survey of AGREE II users, rigor of development and editorial independence domains have the strongest influence on overall assessment of a guideline’s quality (30). It is likely that this guideline’s low editorial independence score profoundly impacted the appraisers’ negative overall assessment scores. Publicly available disclosure of funding sources and conflicts of interest would improve the transparency of Evicore’s spinal imaging guidelines and allow clinicians to judge the influence of bias for themselves.
Although the AGREE II is a well-validated tool used to meaningfully assess guideline quality, the current evaluation has several limitations, including the relatively small number of appraisers. The AGREE II tool is validated for a minimum of two appraisers, and the AGREE II manual recommends four. Although our study had five appraisers, exceeding this recommendation, it would have had greater power had additional appraisers provided scores. Additionally, these appraisers were healthcare professionals who were relatively inexperienced in guideline development and evaluation. Lastly, an inherent limitation of the AGREE II tool is that it assesses methodological quality rather than clinical content. While a guideline can be methodologically sound, it may not be appropriate for clinical use (31).
Based on the conclusions drawn by the five appraisers, the guidelines developed by Evicore would greatly benefit from increased identification and diversification of guideline development parties and stakeholders, increased rigor of development including a more robust discussion of the body of evidence and its strengths and limitations, and incorporation of more explicit suggestions for implementation of guideline recommendations.
Peer Review File: Available at https://jss.amegroups.com/article/view/10.21037/jss-22-57/prf
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jss.amegroups.com/article/view/10.21037/jss-22-57/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
1. American Medical Association. 2019 AMA prior authorization (PA) physician survey. (2018). Available online: https://www.ama-assn.org/system/files/2019-02/prior-auth-2018.pdf
2. Bussières AE, Stewart G, Al-Zoubi F, et al. Spinal Manipulative Therapy and Other Conservative Treatments for Low Back Pain: A Guideline From the Canadian Chiropractic Guideline Initiative. J Manipulative Physiol Ther 2018;41:265-93.
3. Dickens DS, Pollock BH. Medication prior authorization in pediatric hematology and oncology. Pediatr Blood Cancer 2017;64.
4. Simeone JC, Marcoux RM, Quilliam BJ. Cost and utilization of behavioral health medications associated with rescission of an exemption for prior authorization for severe and persistent mental illness in the Vermont Medicaid Program. J Manag Care Pharm 2010;16:317-28.
5. Siracuse MV, Vuchetich PJ. Impact of Medicaid prior authorization requirement for COX-2 inhibitor drugs in Nebraska. Health Serv Res 2008;43:435-50.
6. American Medical Association. 2021 AMA prior authorization (PA) physician survey. (2021). Available online: https://www.ama-assn.org/system/files/prior-authorization-survey.pdf
7. American Hospital Association (2019). Addressing Commercial Health Plan Abuses to Ensure Fair Coverage for Patients and Providers [White paper]. Available online: https://www.aha.org/system/files/media/file/2020/12/addressing-commercial-health-plan-abuses-ensure-fair-coverage-patients-providers.pdf
8. “EviCore.” Available online: www.evicore.com. Accessed 15 Mar. 2022.
9. Brouwers MC, Kho ME, Browman GP, et al. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ 2010;182:E839-42.
10. Brouwers MC, Kerkvliet K, Spithoff K, et al. The AGREE Reporting Checklist: a tool to improve reporting of clinical practice guidelines. BMJ 2016;352:i1152.
11. Brouwers MC, Kho ME, Browman GP, et al. Development of the AGREE II, part 1: performance, usefulness and areas for improvement. CMAJ 2010;182:1045-52.
12. Brouwers MC, Kho ME, Browman GP, et al. Development of the AGREE II, part 2: assessment of validity of items and tools to support application. CMAJ 2010;182:E472-8.
13. Stochkendahl MJ, Kjaer P, Hartvigsen J, et al. National Clinical Guidelines for non-surgical treatment of patients with recent onset low back pain or lumbar radiculopathy. Eur Spine J 2018;27:60-75.
14. Layne EI, Roffey DM, Coyle MJ, et al. Activities performed and treatments conducted before consultation with a spine surgeon: are patients and clinicians following evidence-based clinical practice guidelines? Spine J 2018;18:614-9.
15. Martin Ginis KA, van der Scheer JW, Latimer-Cheung AE, et al. Evidence-based scientific exercise guidelines for adults with spinal cord injury: an update and a new guideline. Spinal Cord 2018;56:308-21.
16. Koo TK, Li MY. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med 2016;15:155-63.
17. Grimshaw JM, Russell IT. Effect of clinical guidelines on medical practice: a systematic review of rigorous evaluations. Lancet 1993;342:1317-22.
18. Hoffmann-Eßer W, Siering U, Neugebauer EAM, et al. Systematic review of current guideline appraisals performed with the Appraisal of Guidelines for Research & Evaluation II instrument-a third of AGREE II users apply a cut-off for guideline quality. J Clin Epidemiol 2018;95:120-7.
19. Sanclemente G, Acosta JL, Tamayo ME, et al. Clinical practice guidelines for treatment of acne vulgaris: a critical appraisal using the AGREE II instrument. Arch Dermatol Res 2014;306:269-77.
20. Polus S, Lerberg P, Vogel J, et al. Appraisal of WHO guidelines in maternal health using the AGREE II assessment tool. PLoS One 2012;7:e38891.
21. Sabharwal S, Patel NK, Gauher S, et al. High methodologic quality but poor applicability: assessment of the AAOS guidelines using the AGREE II instrument. Clin Orthop Relat Res 2014;472:1982-8.
22. Nwosu K, Hershman S, Cha T. Shared decision-making in spine surgery. Semin Spine Surg 2018;30:99-103.
23. Sepucha KR, Atlas SJ, Chang Y, et al. Informed, Patient-Centered Decisions Associated with Better Health Outcomes in Orthopedics: Prospective Cohort Study. Med Decis Making 2018;38:1018-26.
24. Youm J, Chenok K, Belkora J, et al. The Emerging Case for Shared Decision Making in Orthopaedics. J Bone Joint Surg Am 2012;94:1907-12.
25. Lizzio VA, Dekhne MS, Makhni EC. Electronic Patient-Reported Outcome Collection Systems in Orthopaedic Clinical Practice. JBJS Rev 2019;7:e2.
26. Gagliardi AR, Brouwers MC. Do guidelines offer implementation advice to target users? A systematic review of guideline applicability. BMJ Open 2015;5:e007047.
27. Francke AL, Smit MC, de Veer AJ, et al. Factors influencing the implementation of clinical guidelines for health care professionals: a systematic meta-review. BMC Med Inform Decis Mak 2008;8:38.
28. Spithoff S, Leece P, Sullivan F, et al. Drivers of the opioid crisis: An appraisal of financial conflicts of interest in clinical practice guideline panels at the peak of opioid prescribing. PLoS One 2020;15:e0227045.
29. Bindslev JB, Schroll J, Gøtzsche PC, et al. Underreporting of conflicts of interest in clinical practice guidelines: cross sectional study. BMC Med Ethics 2013;14:19.
30. Hoffmann-Eßer W, Siering U, Neugebauer EAM, et al. Guideline appraisal with AGREE II: online survey of the potential influence of AGREE II items on overall assessment of guideline quality and recommendation for use. BMC Health Serv Res 2018;18:143.
31. Vlayen J, Aertgeerts B, Hannes K, et al. A systematic review of appraisal tools for clinical practice guidelines: multiple similarities and one common deficit. Int J Qual Health Care 2005;17:235-42.