2.4.2.1 Search sensitivity and specificity
At each step, reviewers have an opportunity to discuss conflicts, revise search strategies and improve results, each time with the goal of ensuring that all relevant literature is included in the review. Each stage of the search also provides an opportunity to evaluate the sensitivity and specificity of the search and to adjust it if needed. Sensitivity, also called recall, is the number of relevant articles retrieved as a proportion of all relevant articles in existence. A highly sensitive search will retrieve all, or nearly all, of the relevant articles on a specific topic. Specificity, or precision, is the number of relevant articles retrieved as a proportion of all articles retrieved. Searching is always an attempt to strike a balance between sensitivity and specificity, because broadening a search to raise sensitivity usually retrieves more irrelevant records as well. Hausner et al. (2012) describe an approach in which a search strategy is developed from a test set of seed references and then validated against a second set of seed references. The search strategy below is a good example that captures the 25 seed reference PMIDs at line 37 (Silva e Silva et al. 2022). Checking the reference lists of included records can also be used to check the sensitivity of a search strategy.
Figure 1: Example search strategy that captures ‘seed’ references.
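The two measures defined above, and the seed-reference check, can be illustrated with a short calculation. This is a minimal sketch only: the function names and the record identifiers are hypothetical and are not taken from any cited strategy. In practice the full set of relevant articles is unknown, so sensitivity can only be estimated against a reference set such as the seed references.

```python
def sensitivity(retrieved, relevant):
    """Relevant records retrieved, as a proportion of all known relevant records (recall)."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("the set of relevant records is empty")
    return len(set(retrieved) & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Relevant records retrieved, as a proportion of all records retrieved."""
    retrieved = set(retrieved)
    if not retrieved:
        raise ValueError("the set of retrieved records is empty")
    return len(retrieved & set(relevant)) / len(retrieved)


def missing_seeds(retrieved, seeds):
    """Seed references that the search failed to capture, in sorted order."""
    return sorted(set(seeds) - set(retrieved))


# Made-up identifiers for illustration (not real PMIDs).
retrieved = {"A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"}
seeds = {"A1", "A2", "A3", "B1"}

print(sensitivity(retrieved, seeds))    # 0.75  (3 of 4 seeds captured)
print(precision(retrieved, seeds))      # 0.375 (3 of 8 retrieved records are seeds)
print(missing_seeds(retrieved, seeds))  # ['B1']
```

A seed reference that appears in the output of `missing_seeds` signals that the strategy may need additional keywords or indexed terms, harvested from that record, before the search is run again.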
Whenever possible, the search should be undertaken by librarians, information specialists or other expert searchers (Koffel 2015; Rethlefsen, Murad & Livingston 2014; Rethlefsen et al. 2015). At the very least, a librarian trained in systematic searching should be consulted for advice on searching and database syntax. Their detailed knowledge of information sources is of great benefit to content experts and methodologists, who are expert in the topic at hand but may be less experienced in evidence synthesis methodology or searching. Furthermore, librarians and information specialists are familiar with search logic, with how to combine keywords in the most efficient way, and with how to make the most of database features, such as Boolean and proximity operators, to achieve the best balance between sensitivity and specificity. The use of expert searchers in reviews is long established and is recommended by JBI and other review organisations (McGowan & Sampson 2005; Rethlefsen et al. 2014, 2015).
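The basic Boolean logic referred to above can be sketched as follows: synonyms for one concept are joined with OR, and the resulting concept blocks are joined with AND. This is an illustrative sketch under stated assumptions, not a recommended strategy. The helper names and the example terms are hypothetical, and real strategies also rely on database-specific field tags, truncation, indexed terms and proximity operators, which differ between databases and are omitted here.

```python
def or_block(terms):
    """Join the synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def and_blocks(concepts):
    """Join the concept blocks with AND so every concept must be present."""
    return " AND ".join(or_block(terms) for terms in concepts)


# Hypothetical example terms, chosen only to show the structure.
concepts = [
    ["nurses", "nursing staff"],
    ["burnout", "occupational stress"],
]

print(and_blocks(concepts))
# (nurses OR "nursing staff") AND (burnout OR "occupational stress")
```

Adding a synonym inside a block tends to raise sensitivity, while adding a further concept block with AND tends to raise specificity, which is why the choice of terms and blocks benefits from an expert searcher's judgement.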
There is a lot of hype about the possibilities of generative AI tools such as ChatGPT for use in evidence synthesis. At the time of writing (2024), the recommendation is not to use ChatGPT to generate comprehensive Boolean search strategies. An Evidence Summary examining the role of ChatGPT in developing systematic literature searches recommends not relying solely on ChatGPT to generate Boolean search strategies and stresses that human oversight is essential (Parisi & Sutton 2024). The authors highlight the limitations of ChatGPT: it struggles with database syntax, fabricates MeSH terms and hallucinates inaccurate citations. ChatGPT and other tools such as Perplexity.ai or Consensus.app can be useful for suggesting keyword synonyms and for finding seed references from which keywords and indexed terms can be harvested. Wang et al. (2023) found that Boolean queries generated by ChatGPT had higher precision but lower sensitivity (recall), which runs the risk of missing relevant studies.
Even before developing a protocol, it is important to determine whether a review is needed. A previously published review on the same or a very similar topic, particularly one published recently (usually within the last 5 to 10 years), may mean that the protocol does not need to be developed further, or that it needs to be refined. There are numerous places to search for existing health-related systematic reviews, including JBI, Cochrane, Embase, Epistemonikos, MEDLINE/PubMed and PROSPERO. Further information about considering a review can be found in Section 1.2 of the JBI Manual for Evidence Synthesis.