Good review practice: a researcher guide to systematic review methodology in the sciences of food and health

Evaluation of included studies: quality assessment

This step evaluates the quality of the included studies and the overall strength of the evidence. [3] Full texts of the included studies are required for the quality assessment process, which involves the following steps:


Identifying potential sources of bias for critically appraising primary studies

The first step in planning the quality assessment process is to identify which biases are expected, so that the evaluation can anticipate them and effectively exclude studies that are not of sufficient quality relative to the intended outcomes of the review. At the protocol planning stage, any source of bias, including those related to primary studies, should be made explicit, and the methods and measures for identifying them should be documented. These sources include the key aspects of the design, conduct, data analysis and reporting of individual studies that are known to induce bias in the estimation of the outcomes.

In healthcare research, explicit categories are used to define the types of bias, each associated with the point in a primary study at which bias can arise. There are five core categories, which were designed for randomised controlled trials but can also be used for other study designs. These are selection bias, performance bias, detection bias, attrition bias, and selective outcome reporting (reporting bias). [4]


Assessing 'risk of bias' of individual studies (internal validity)

The study characteristics needed for assessing ‘risk of bias’ can be structured into checklists or sets of questions to facilitate this process.

Checklists are already available for many subject areas and disciplines in healthcare to standardise the assessment procedure. They can often be modified if the review question needs to accommodate aspects that differ from those in the checklists. They are useful tools for guiding the assessment process and for rating the included studies that will be used in the synthesis process. (See Appendix B for access to ‘risk of bias’ tools.)

Overall assessments: the process of rating primary studies against each relevant ‘risk of bias’ category is predetermined by the relevant guidelines in different fields. Generally, a final rating is recorded for the results of each included study once the assessment of the specific ‘risk of bias’ categories is completed. [5]

The final ratings agreed by the reviewers are then used in grading the strength of the body of evidence. These explicit ratings can also guide which studies are included in the data synthesis.
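
As a minimal sketch of how domain-level judgements might be combined into one rating per study, the Python example below applies a "worst domain wins" rule. The rule, the domain keys and the labels low, unclear and high are assumptions made for illustration only; the actual rating rule should be taken from the guideline or tool adopted in the review protocol.

```python
# Illustrative only: the domain keys mirror the five core bias categories,
# and the "worst domain wins" rule is an assumption, not a guideline rule.
DOMAINS = ("selection", "performance", "detection", "attrition", "reporting")
SEVERITY = {"low": 0, "unclear": 1, "high": 2}


def overall_risk_of_bias(judgements):
    """Return the most severe judgement across all domains for one study."""
    missing = [d for d in DOMAINS if d not in judgements]
    if missing:
        raise ValueError("missing judgement for: " + ", ".join(missing))
    # max() with the severity lookup picks the worst judgement.
    return max((judgements[d] for d in DOMAINS), key=SEVERITY.__getitem__)


# Hypothetical study with one unclear domain and four low-risk domains.
study = {
    "selection": "low",
    "performance": "low",
    "detection": "unclear",
    "attrition": "low",
    "reporting": "low",
}
print(overall_risk_of_bias(study))  # prints "unclear"
```

Recording the domain-level judgements alongside the overall rating keeps the reasoning behind each rating visible to readers of the review.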

Applicability is assessed only when certain features of the primary studies affect the applicability of their outcomes.2 For example, the duration of follow-up may differ between studies and can be a potential source of bias in the applicability of the outcomes to the topic. [6]

If reviewers anticipate bias from sources of funding, it is recommended that they use explicit categories to define these sources and decide how their likely impact will be measured, rather than simply assigning a high risk of bias to studies funded by industry or authored by guest or ghost authors. [7]


Assessing Quality of Reporting 

Poor reporting of primary studies is frequently found and reported in systematic reviews of interventions in healthcare research. It is important to separate reporting deficiencies from the methodological aspects of the studies. Reporting issues, including underreporting of study characteristics or study methods and missing data, do not in themselves reflect the quality of the methods.

The Cochrane ‘risk of bias’ tool, for example, recognises issues related to underreporting as ‘unclear’ risk. However, it is advised that the assessment focus on study design and conduct and not on the quality of reporting.3 Reporting issues should nevertheless be assessed, identified and clearly documented to inform the relevant research communities of weaknesses in reporting standards in their fields of study. [7]

For human and animal studies, there are clear guidelines and guidance statements on what information should be reported from primary studies across the relevant disciplines. These guidelines list standard reporting items and are available for different study designs. They are useful tools for identifying reporting issues and can help standardise the reporting assessment through the use of appropriate terminology. (See Appendix B for access to reporting guidelines.)


Assessing the strength of the body of evidence (external validity)

Assessing the strength of the body of evidence is the process of identifying sources of non-systematic error and differences in those aspects of the included studies that are relevant to the generalisability of the outcomes. Depending on the context of the topic, these may include the size of the studies, the validity and precision of their findings, and the level of heterogeneity between studies. [3]

There are standard frameworks that use explicit categories for assessing relevant factors from the results in order to rate the overall body of evidence. GRADE, which is the most widely used framework across many disciplines for health-related topics, assesses the body of evidence on five core domains: risk of bias, directness, consistency, precision, and publication bias. It assigns one of four levels of rating (very low, low, moderate, or high) as a measure of the certainty of the outcomes. (See Appendix B for access to grading checklists.)
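
The way domain-level concerns feed into a single certainty rating can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the GRADE procedure itself: the function name, the domain keys and the additive rule are hypothetical, the body of evidence is assumed to start at "high" (as GRADE does for randomised trials), and each domain is downgraded by zero, one or two levels. GRADE's own term for the third level is "moderate", which is the label used here. Factors that can raise the rating are not modelled.

```python
# Simplified, illustrative sketch; not an implementation of the GRADE handbook.
LEVELS = ["very low", "low", "moderate", "high"]
DOMAINS = (
    "risk_of_bias",
    "indirectness",
    "inconsistency",
    "imprecision",
    "publication_bias",
)


def rate_certainty(downgrades, start="high"):
    """Map per-domain downgrades (0, 1 or 2 levels each) to a certainty level."""
    for domain, levels in downgrades.items():
        if domain not in DOMAINS:
            raise ValueError("unknown domain: " + domain)
        if levels not in (0, 1, 2):
            raise ValueError("downgrade must be 0, 1 or 2 levels")
    total = sum(downgrades.values())
    # The rating cannot fall below the lowest level, "very low".
    index = max(LEVELS.index(start) - total, 0)
    return LEVELS[index]


# Hypothetical body of evidence with serious concerns in two domains.
print(rate_certainty({"risk_of_bias": 1, "imprecision": 1}))  # prints "low"
```

In a real review, the decision to downgrade in each domain is a documented judgement by the reviewers; the sketch only shows the bookkeeping once those judgements are made.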

Good Practice point: For transparency, the terminology used in the ‘quality assessment’ section of the review, and any self-assigned category used to rate the studies for risk of bias, should be clearly defined. Usually, at least two reviewers should assess primary studies for risk of bias and, if necessary, a third reviewer may be consulted to resolve any disagreements.
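
The dual-reviewer step can be sketched in Python as follows. The function names, study identifiers and ratings are hypothetical, and the sketch assumes, for simplicity, that the third reviewer's rating is taken as final wherever the first two reviewers disagree.

```python
def find_disagreements(ratings_a, ratings_b):
    """Return {study_id: (rating_a, rating_b)} for studies rated differently."""
    if ratings_a.keys() != ratings_b.keys():
        raise ValueError("both reviewers must rate the same set of studies")
    return {
        study: (ratings_a[study], ratings_b[study])
        for study in sorted(ratings_a)
        if ratings_a[study] != ratings_b[study]
    }


def final_ratings(ratings_a, ratings_b, third_reviewer):
    """Keep agreed ratings; use the third reviewer's rating for disagreements."""
    disagreements = find_disagreements(ratings_a, ratings_b)
    unresolved = [s for s in disagreements if s not in third_reviewer]
    if unresolved:
        raise ValueError("no third rating for: " + ", ".join(unresolved))
    final = dict(ratings_a)
    final.update({s: third_reviewer[s] for s in disagreements})
    return final


# Hypothetical ratings from two reviewers and a third reviewer's decision.
reviewer_a = {"study_1": "low", "study_2": "high", "study_3": "unclear"}
reviewer_b = {"study_1": "low", "study_2": "unclear", "study_3": "unclear"}
print(find_disagreements(reviewer_a, reviewer_b))
print(final_ratings(reviewer_a, reviewer_b, {"study_2": "high"}))
```

Keeping the list of disagreements, rather than only the final ratings, makes it possible to report how often the reviewers differed and how each case was resolved.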

