
Good review practice: a researcher guide to systematic review methodology in the sciences of food and health

The importance of the search


The literature search lays the foundation of evidence for a systematic review.  A weak search degrades and potentially invalidates a review’s value.  An information professional or librarian with systematic review experience should be a part of the systematic review team if possible.  If not possible, training and consultation with such a professional should be sought to ensure that the searching step of a review is competent. 

Three principles shape a systematic review’s search.   The search must be systematic, comprehensive, and reproducible.

A systematic search is deliberately built to take full advantage of each database's structure and syntax, so that it retrieves the relevant literature from that database as effectively as possible.  A systematic review search will have a single search string for each database.

A comprehensive search is built on two factors.  First is the robustness of the string, which must capture the variety of terms describing key concepts across the literature, combined with controlled vocabulary terms.  Second is running the string, appropriately modified, in multiple databases chosen for the likelihood that they index the needed literature.  Depending on the SR question, database searches may be supplemented by searching for grey literature and unpublished studies.

A reproducible search captures the exact search string as configured for each database and its hosting platform.  The strings are published initially in the protocol and then in the body of, or as a supplement to, the finished study.  If the exact string, database, or platform is missing from the reporting, the search is not actually reproducible.

Precision versus recall


Every search balances recall and precision.  Recall is the number of relevant results a search retrieves compared to all the relevant results indexed in a database.  Precision is the number of relevant results retrieved compared to the total number of results retrieved.  A more precise search returns a higher proportion of relevant results but is more likely to exclude relevant ones. Ideally a systematic review of quantitative evidence is truly comprehensive, but that ideal must be balanced against the authors' capacity to screen results.
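The two measures can be stated as simple ratios.  A minimal sketch, with invented numbers for illustration:

```python
def recall(relevant_retrieved: int, relevant_in_database: int) -> float:
    """Share of all relevant records indexed in the database that the search found."""
    return relevant_retrieved / relevant_in_database


def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Share of all retrieved records that are actually relevant."""
    return relevant_retrieved / total_retrieved


# Hypothetical search: 80 of the 100 relevant indexed records found,
# 4,000 records retrieved in total.
print(recall(80, 100))      # 0.8  -> high recall
print(precision(80, 4000))  # 0.02 -> low precision: a heavy screening burden
```

The example shows the trade-off in miniature: a string broad enough to find 80% of the relevant records may bury them among thousands of irrelevant ones that still must be screened.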

The question's framing might introduce opportunities to limit the results that must be screened, by limiting to human studies, for instance, or to a particular type of study, or to a geographical region. 

Good practice point:  All of the results from the finalised search must be downloaded, deduplicated, and screened for a systematic review.
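Deduplication is normally handled by reference-management software, but the underlying logic is simple to sketch.  This hypothetical example (the records and matching rule are invented for illustration) keeps the first copy of each record, matching on DOI when present and otherwise on a normalised title:

```python
def dedupe(records):
    """Keep the first copy of each record.

    Match on DOI when present, otherwise on a whitespace- and
    case-normalised title. Real projects use reference-management
    software, but the logic is the same.
    """
    seen, unique = set(), []
    for rec in records:  # each rec is a dict with optional 'doi' and 'title' keys
        doi = (rec.get("doi") or "").lower().strip()
        key = doi if doi else " ".join(rec.get("title", "").lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


records = [
    {"doi": "10.1000/xyz1", "title": "Vitamin D and bone health"},
    {"doi": "10.1000/XYZ1", "title": "Vitamin D and Bone Health"},  # duplicate DOI
    {"title": "Fermentation  kinetics of kimchi"},
    {"title": "fermentation kinetics of kimchi"},                   # duplicate title
]
print(len(dedupe(records)))  # 2
```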


Search steps


Choosing databases

A systematic review cannot be based on evidence from a single database.  The search should be run in as many databases as are available to the review team and include relevant research.

Which databases are most appropriate for a systematic review will depend on the subject focus of the review question.  To judge if a database is potentially useful, look at the database's stated scope and content, which describes, in general terms, what will be found in the database.  Bear in mind that a database often includes a broader spread of literature than its name implies.  Other important strategies include running exploratory searches in any database that could be useful, and consulting with a librarian.

Databases to consider searching include—but are not limited to—FSTA, CAB Abstracts, Medline or PubMed, SciELO, Web of Science Core Collection, Agricola, Scopus, EMBASE, and Biosis Previews.  

Good practice point:  An academic search engine like Google Scholar is not a database.  While Google Scholar may be a useful tool for supplementing the evidence with grey literature, it sits outside the databases that will be the main sources for finding the evidence base for the systematic review.

Identify key articles


A group of key articles representing the evidence that the search strategy needs to find is equally important at the beginning of the search process and in the testing phase.  At the start, they are key sources to be mined for terms when building the search strategy.  At the end of the process, they are used to test the search string in each database included in the search.

Develop the search strategy


Developing the search strategy takes expertise and time.  Two components make up a search string: the terms used, and the Boolean operators connecting them.  The terms will include both free text search terms and controlled vocabulary terms.

The free text search terms are those that have been brainstormed and gathered to capture the widest appropriate terminology representing each concept in the question framework.

Controlled vocabulary terms are pulled from the thesauri that undergird subject-specific databases.  These vocabularies can have specific names, like MeSH for PubMed and Medline, or they may be called subject terms, keywords, descriptors, or subject headings, depending upon the platform hosting the database.  These terms are applied by indexers to each record in a database.  By adding regularized terms that capture the main concepts and topics addressed in the research a record represents, they make relevant research easier to find.  They can also be “exploded” so that a search includes the term itself plus any more specific terms subsumed under it, which improves the comprehensiveness of the search.  Controlled vocabulary terms also pull together many variations on a term.

A database’s thesaurus should be searched to identify appropriate terms.  The specifics of how to search a thesaurus and explode its terms vary according to the database and the platform it is being searched on.  The help section of any database will provide guidance, as will an experienced librarian.
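Exploding can be pictured as collecting a term plus every narrower term beneath it in the thesaurus hierarchy.  A minimal sketch using an invented thesaurus fragment (the terms and hierarchy are made up for illustration):

```python
# Invented thesaurus fragment: term -> list of immediately narrower terms.
THESAURUS = {
    "fermented foods": ["yogurt", "kimchi"],
    "kimchi": ["baechu kimchi"],
}


def explode(term, thesaurus):
    """Return the term plus all narrower terms, to the bottom of the hierarchy."""
    terms = [term]
    for narrower in thesaurus.get(term, []):
        terms.extend(explode(narrower, thesaurus))
    return terms


print(explode("fermented foods", THESAURUS))
# ['fermented foods', 'yogurt', 'kimchi', 'baechu kimchi']
```

Note that some platforms explode all levels down and others only one level; the chart later in this guide shows how this varies for a single database.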

Controlled vocabulary terms should not be confused with the author supplied keywords that are included in records in the Web of Science Core Collection and Scopus.   Those terms are idiosyncratic and do not collate variations of a term. 

Once appropriate terms have been gathered and tested for each concept, they need to be pulled together into a search string.  All the terms that represent a single concept are connected by the Boolean operator OR, and the controlled vocabulary terms are added to those terms with OR.  Finally, the section built for each concept is connected to the others with the Boolean operator AND.  Every result retrieved by the search will therefore contain at least one term from each concept block included in the search string.
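The OR-within-concepts, AND-between-concepts pattern can be sketched programmatically.  The concepts and terms below are invented for illustration:

```python
def concept_block(terms):
    """OR together all terms (free text and controlled vocabulary) for one concept."""
    return "(" + " OR ".join(terms) + ")"


def build_search(concepts):
    """AND together one parenthesised block per concept."""
    return " AND ".join(concept_block(terms) for terms in concepts)


# Invented example: a question about probiotics and gut health.
concepts = [
    ['probiotic*', '"lactic acid bacteria"', 'Lactobacillus'],
    ['"gut health"', 'microbiome*', '"intestinal flora"'],
]
print(build_search(concepts))
# (probiotic* OR "lactic acid bacteria" OR Lactobacillus) AND ("gut health" OR microbiome* OR "intestinal flora")
```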

Find additional information about building a search string, along with guidance on formatting terms (truncation), combining terms for sensitivity (adjacency searching), and the pros and cons of limits, filters, and hedges, in the supplementary section on building a search.

Good practice point: Sometimes including a concept in the search does not work well.  Outcomes, for example, tend not to be mentioned in a record's title or abstract, so if the outcome is included as a concept in the search, there is a risk that relevant studies will be missed.  Each concept added with AND further restricts the search.

Test the search and translate to additional databases


When all the terms for all concepts have been gathered and combined with correct syntax (the rules governing how terms need to be connected in each interface) for the first database, the search string must be reviewed and tested. The first level of testing should be run by the author of the string, looking for any internal errors, but also being sure that it is retrieving all of the key articles in the database being searched.  The string should also be reviewed by subject experts in the review team, to ensure that no free text search terms have been omitted.  Ideally the string should also be peer reviewed or checked by a librarian or information professional.[2]   

The search string as it has been developed in the first database will not run correctly if it is simply copied and pasted into others.  Each subject specific database will have its own controlled vocabulary built around its subject focus, and each database platform will have its own syntax.   Boolean AND and OR are used across platforms, but fields, proximity operators and the process of building a search string vary. 
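As one concrete example of this variation, proximity (adjacency) operators take different forms on different platforms — commonly documented as adjN on Ovid, Nn on EBSCOhost, and NEAR/n on Web of Science, though the exact forms should always be verified against each platform's help pages.  A small lookup-table sketch:

```python
# Proximity ("within n words") operator syntax on three common platforms.
# Verify against each platform's help documentation before relying on these.
PROXIMITY = {
    "Ovid": "ADJ{n}",              # e.g. food ADJ3 safety
    "EBSCOhost": "N{n}",           # e.g. food N3 safety
    "Web of Science": "NEAR/{n}",  # e.g. food NEAR/3 safety
}


def proximity(platform, left, right, n=3):
    """Render a proximity expression for the given platform."""
    operator = PROXIMITY[platform].format(n=n)
    return f"{left} {operator} {right}"


print(proximity("Web of Science", "food", "safety"))  # food NEAR/3 safety
```

A table like this is only a starting point: translating a full string also means swapping controlled vocabulary, field codes, and truncation symbols for each platform.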

When running the searches in additional databases, it is normal to get many duplicate results.  Each database will also have unique content, and it is because of this that the evidence base for a systematic review will not be comprehensive unless multiple databases are searched.

Complexity of translation


This chart gives a sense of some of the syntax and functionality variations across platforms for a single database, FSTA.   

Variations between different databases on different platforms will be greater.





Labels used for controlled vocabulary terms
  • Ovid: Subject headings (which are phrase searched) or heading words (which are word searched).
  • EBSCOhost: Descriptor, subject, or keyword, depending upon which screen you are on.
  • Web of Science: Descriptor as the field to search, but keyword in the record.

Combined title/abstract/heading words fields search
  • Ovid: Available in Advanced Search (when map to subject heading is not ticked).
  • EBSCOhost: Not available; must be built manually. The All Text Fields search searches all fields except Section and Subsection Codes, which helps cut down on false hits for some search terms.
  • Web of Science: Available as a Topic search.

Stemming and lemmatization
  • Ovid: None. Searches exactly what is typed, unless modified with wildcards.
  • EBSCOhost: Default setting includes stemming and lemmatization. To override, put the search term within quotation marks.
  • Web of Science: Default setting includes stemming and lemmatization. To block it, go to Advanced Search and toggle to exact search under More options, or put terms inside quotation marks. Wildcards in a search term will also turn off stemming and lemmatization for that term.

Indexing date available
  • Ovid: Yes. Use the Entry Date field.
  • EBSCOhost: Not available as a field, but can be entered manually. Type UC [three letters for month] [year] or UC [year], e.g. UC Feb 2021 or UC 2021. Find more information about field codes in EBSCO Help.
  • Web of Science: Yes. In basic search mode see Add date range, and use Index Date.

Explode functionality
  • Ovid: Explode includes all levels of narrower terms to the bottom of the thesaurus hierarchy.
  • EBSCOhost: Explode includes narrower terms one level below the term exploded.
  • Web of Science: Not available. Can be done manually by adding terms in the thesaurus view.




Searching beyond databases


Searching appropriate databases will be the main method for identifying a review's evidence.   But additional searching methods should supplement the database searches to minimize chances that relevant literature is missed.   Supplementary search methods include:

  • Handsearching:  so called because in the past it required scanning the tables of contents of key journals' issues by hand; in the electronic age, this can be done online on journal websites or with table-of-contents services.  Key journal titles for the topic are scanned for any relevant articles that, because of search or indexing anomalies, have been missed by the database search.
  • Reference list searching: the reference lists of the key articles should be scanned for relevant articles.
  • Citation searching:  Web of Science, Scopus, and Google Scholar enable viewing the articles that have cited an article, and so can lead to relevant records published after a key article was published.
  • Grey literature:  Depending on the review question, grey literature — literature that has not been formally published in an academic book or journal — might be a valuable source of evidence.  Grey literature can include conference proceedings, governmental or organisational reports, dissertations, patents, and more.  Some databases include grey literature, and some do not.  Even when grey literature is included within a database, if this source of evidence is appropriate for a review question, it should also be sought by going directly to appropriate organizations' websites.  A list of other resources for locating grey literature can be found in the appendix.
  • Unpublished studies: when the data from unpublished studies can be tracked down and included in a systematic review it can help guard against publication bias, which is the phenomenon where studies with null results are less likely to be accepted for publication than studies with statistically significant results.   The existence of unpublished studies can be discovered through conference proceedings, dissertations and through research networks, and researchers can be contacted directly.

Updating searches


Near the end of conducting a systematic review, all searches should be rerun to see if any relevant studies have been published since they were run towards the start of the project. 

The most efficient way to do this is to formulate the search to only include records that have been added to each database since the finalised searches were run at the review's start.  Be aware that the field indicating when records were added to a database has many different names, depending on both the database and the platform it is being searched on.   Check a database's help section to confirm which field to use.
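The bookkeeping can be sketched as combining the finalised string with a date-added limit.  The field names below are those listed for FSTA in the platform chart earlier in this guide; the combining syntax itself is illustrative only and must be checked against each platform's help section:

```python
# "Date added to database" field names, as listed for FSTA in the platform
# chart above. The combining syntax below is illustrative, not platform-exact.
DATE_ADDED_FIELD = {
    "Ovid": "Entry Date",
    "EBSCOhost": "UC",
    "Web of Science": "Index Date",
}


def update_search(original_string: str, date_limit: str) -> str:
    """Rerun the finalised string over records added since the original search."""
    return f"({original_string}) AND {date_limit}"


print(update_search('probiotic* AND "gut health"', "UC 2021"))
# (probiotic* AND "gut health") AND UC 2021
```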

Tip: The date records are added to a database is not the same as the publication date of an article.  They sometimes align, but often do not.  It is a peculiarity of publishing that articles are frequently released months before their stated publication date. When this happens, the articles are indexed (added to the database) earlier than their publication date.  Sometimes articles are added to a database later than their stated publication date. This can happen when a new journal title is added to a database and its back issues are indexed along with the most current articles. It also happens when a publisher releases material after the publication date.

Reporting the search


Finalised search strategies are usually bulky.  It is acceptable not to include them in the main body of the article, but they must be included in full in an appendix or in a repository such as the Open Science Framework, linked from the review. Journals provide guidance on their preferred formatting.

Documentation needs to be detailed enough that another researcher could reproduce the review's search strategy exactly, without having to guess about any of the elements that could change the search's results.  

PRISMA-S provides extensive guidance on the elements that must be reported for a search strategy to truly be reproducible.  Information is required both about the information sources searched, and about the search strategy as run in each information source. 

Good practice point:  Always copy and paste search strings into protocol documents and into manuscripts or supplementary materials rather than retyping them.  It is easy to introduce errors when retyping, which invalidates the reproducibility of the search.