Trial sequential analysis (TSA) is a recent cumulative meta-analysis method used to weigh type I and II errors and to estimate when the effect is large enough to be unaffected by further studies. The aim of this study was to illustrate possible TSA scenarios and their significance using meta-analyses published in the

We performed a systematic search of the medical literature for meta-analyses published in the KJA. TSA was performed on each main outcome. The required sample size was estimated from the calculated effect size of the intervention, with a type I error of 5% and a power of 90% or 99%.

Six meta-analyses with a total of ten main outcomes were included in the analysis. Seven TSAs confirmed the results of the meta-analyses; however, only three of them reached the required sample size. In two TSAs, the cumulative z-line did not reach statistical significance. In one TSA, the boundary for effect was reached in the 90% power analysis but not in the 99% power analysis.

With TSA, the pooled effect of a meta-analysis may be established when the cumulative sample size is large enough. TSA can be used to add strength to the conclusions of meta-analyses; however, pre-registration of the TSA protocol is of paramount importance. This study may help readers better understand the use of TSA as an additional statistical tool to improve meta-analysis quality.

Traditional meta-analyses examine only the pooled effect size; they do not evaluate whether the number of participants and the corresponding number of trials are sufficient to draw any conclusions. Moreover, the use of the traditional 95% CI or the 5% statistical significance threshold will lead to too many false-positive conclusions (type I errors) and too many false-negative conclusions (type II errors) [

Trial sequential analysis (TSA) is a recently described cumulative frequentist meta-analysis method [

TSA generates a graphical output divided by four lines into four areas: “benefit,” “harm,” “inner wedge,” and “non-statistically significant.” The first two areas (“benefit” and “harm”) represent a statistically significant result, whereas the “inner wedge” area represents strong evidence that further studies are unlikely to change the no-effect result (

The aim of this study was to illustrate the possible scenarios and possible significance of TSA using meta-analyses published in the

We performed a systematic search of the medical literature following the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) Statement Guidelines for the identification, screening, and inclusion of articles. The search was performed by two researchers (ADC and MT) in close collaboration with the rest of the research team.

The search was performed on May 10, 2021, using the search tool on the KJA website with the following terms: “meta-analysis,” “metaanalysis,” and “meta analysis.” We did not apply any restrictions on publication type, date, language, or status.

Two researchers (ADC and MT) independently screened the titles and abstracts of the identified papers to select those that were relevant. Only meta-analyses were considered eligible for analysis.

After identifying those studies meeting the inclusion criteria, two researchers (FG and AB) independently reviewed and assessed each of the included studies. The following information was collected: first author, year of the study, total number of patients per group, registration number, main outcome, and data for intervention and control relative to the main outcome.

If the main outcome was not clearly stated, it was retrieved by examining the registered protocol or by contacting the main author of the paper.

TSA was performed on the main outcome of each paper using TSA software (Copenhagen Trial Unit, Centre for Clinical Intervention Research, Copenhagen). The effect measure (mean difference, odds ratio, relative risk, risk difference, or Peto odds ratio) and the pooling model (fixed effects, or random effects using the DerSimonian–Laird, Sidik–Jonkman, or Biggerstaff–Tweedie method) were selected according to the outcome measure and model reported for each outcome. No continuity correction was applied for zero-event trials. We estimated the required sample size from the calculated effect size of the intervention, considering a type I error of 5% and a power of 90%; the benefit, harm, and inner wedge boundaries were drawn using the O’Brien–Fleming spending function.
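The two quantities named above, the required sample size and the O’Brien–Fleming spending function, can be illustrated with a minimal Python sketch. It is not the TSA software's implementation: it assumes a binary outcome, equal allocation, a two-sided test, and no heterogeneity adjustment, and it uses the Lan–DeMets O’Brien–Fleming-type approximation for the cumulative alpha spent. The function names and the example event proportions are hypothetical and are not taken from the included meta-analyses.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def required_information_size(p_control, p_intervention, alpha=0.05, power=0.90):
    """Total participants (both arms) needed to detect the given difference.

    Assumption: normal-approximation formula for a binary outcome,
    equal allocation, two-sided alpha, no heterogeneity adjustment.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    p_bar = (p_control + p_intervention) / 2   # average event proportion
    delta = p_control - p_intervention         # anticipated absolute difference
    if delta == 0:
        raise ValueError("the anticipated difference must be non-zero")
    return ceil(4 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / delta ** 2)


def obf_alpha_spent(information_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at a given information fraction.

    Assumption: Lan-DeMets O'Brien-Fleming-type spending function,
    alpha(t) = 2 * (1 - Phi(z_{1-alpha/2} / sqrt(t))), for 0 < t <= 1.
    """
    if not 0 < information_fraction <= 1:
        raise ValueError("information_fraction must be in (0, 1]")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _STD_NORMAL.cdf(z_alpha / sqrt(information_fraction)))


if __name__ == "__main__":
    # Hypothetical proportions: 50% events in controls, 40% with intervention.
    print(required_information_size(0.50, 0.40, power=0.90))
    print(required_information_size(0.50, 0.40, power=0.99))
    for t in (0.25, 0.50, 0.75, 1.00):
        print(t, obf_alpha_spent(t))
```

The spending function allows very little alpha to be spent while the accrued information is small and releases the full 5% only when the required sample size is reached, which is why early boundaries are so strict. Turning the spent alpha into boundary z-values requires numerical integration, which this sketch does not attempt.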

Moreover, as a more conservative approach, a second TSA with a type I error of 5% and a power of 99% was performed for each main outcome. This post-hoc conservative approach allowed us to assess whether the data provided convincing evidence of the true effect.
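To give a sense of how much more demanding the 99% power analysis is, the ratio of the two required sample sizes can be worked out under a simplifying assumption not stated in the text: that the required size is proportional to $(z_{1-\alpha/2} + z_{1-\beta})^2$, as in the standard normal-approximation formula, with the same anticipated effect, variance, and two-sided type I error of 5% in both analyses.

```latex
\frac{N_{99\%}}{N_{90\%}}
  = \frac{\left(z_{1-\alpha/2} + z_{0.99}\right)^{2}}
         {\left(z_{1-\alpha/2} + z_{0.90}\right)^{2}}
  = \frac{(1.960 + 2.326)^{2}}{(1.960 + 1.282)^{2}}
  \approx \frac{18.37}{10.51}
  \approx 1.75
```

Under these assumptions, raising the power from 90% to 99% increases the required sample size by roughly 75%, so an outcome can reach the required size in the first analysis and fall short of it in the second.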

We identified 11 papers [

The topics of the meta-analyses were as follows: curare side effects [

Choi et al. [

Bailey et al. [

Min et al. [

The effect of a single dose of ibuprofen was evaluated by Kim et al. [

Kim et al. [

Another study by the same group of authors [

A TSA analyzes the cumulative evidence in a meta-analysis. Its output is represented by a cumulative z-line that may lie in one of four areas: benefit (labeled A in

If the cumulative sample size is large enough, a pooled effect in favor of the intervention (benefit) or in favor of the control (harm), or the absence of any effect (inner wedge), may be established. Conversely, when the cumulative z-line lies in the area that is not statistically significant, further studies that increase the overall sample size are deemed necessary.
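The reading rule described above can be written as a short Python sketch. It is illustrative only. It assumes that a positive z-score favors the intervention, that the monitoring boundaries are symmetric, and that the caller supplies the boundary values at the current cumulative sample size (for example, read from the TSA software output). The function names are hypothetical, and the cumulative z-score helper assumes a fixed-effect inverse-variance model.

```python
from math import sqrt
from typing import List, Optional, Sequence


def cumulative_z(effects: Sequence[float], variances: Sequence[float]) -> List[float]:
    """Cumulative z-score after each trial (fixed-effect, inverse variance).

    Assumption: trials are given in chronological order and a positive
    effect estimate favors the intervention.
    """
    z_scores = []
    sum_w = 0.0    # running sum of weights (1 / variance)
    sum_wy = 0.0   # running sum of weight * effect
    for effect, variance in zip(effects, variances):
        weight = 1.0 / variance
        sum_w += weight
        sum_wy += weight * effect
        z_scores.append(sum_wy / sqrt(sum_w))
    return z_scores


def classify_tsa_area(z: float, efficacy_bound: float,
                      futility_bound: Optional[float] = None) -> str:
    """Return the TSA area in which the cumulative z-score lies.

    efficacy_bound: positive boundary value at the current information size.
    futility_bound: inner wedge boundary at the same point, or None if the
                    inner wedge has not yet opened.
    """
    if efficacy_bound <= 0:
        raise ValueError("efficacy_bound must be positive")
    if z >= efficacy_bound:
        return "benefit"
    if z <= -efficacy_bound:
        return "harm"
    if futility_bound is not None and abs(z) <= futility_bound:
        return "inner wedge"
    return "not statistically significant"


if __name__ == "__main__":
    # Hypothetical data: three trials with log-scale effects and variances.
    zs = cumulative_z([0.40, 0.55, 0.30], [0.09, 0.06, 0.04])
    print(zs)
    print(classify_tsa_area(zs[-1], efficacy_bound=2.4, futility_bound=0.5))
```

Only the first two labels correspond to a statistically significant pooled effect; the last one is the case in which further trials are needed.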

Seven out of ten TSAs confirmed the results of meta-analyses. However, only in three of them (

On the contrary, in four TSAs (

In two TSAs (

None of the TSAs in our analysis fell within the inner wedge zone. However, for completeness, we briefly illustrate this eventuality. The inner wedge zone is delimited by the futility boundaries, which form an isosceles triangle with its base on the sample size line. If the cumulative z-score lies in the inner wedge zone, future studies on the topic must be considered futile because they are unlikely to change the no-effect result.

The importance of registering the TSA protocol before conducting the analysis is depicted in

Although there are no guidelines or clear recommendations regarding the choice of the power of the analysis, this example shows the limitation of a post-hoc analysis, in which the power could be arbitrarily changed to either confirm or contradict the result.

Our study has some limitations. Only a limited number of TSAs were included in the analysis, and no examples of a TSA lying in the inner wedge were available.

Other methods, such as the law of the iterated logarithm, which penalizes the z-value according to the strength of the available evidence and the number of statistical tests, could be used to adjust for repeated significance testing. In our study, we chose the cumulative z-curve approach, but we recognize that this was an arbitrary choice.
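For readers unfamiliar with this alternative, a minimal Python sketch of one common form of the penalty follows. It assumes the penalized statistic is Z* = Z / sqrt(λ · ln(ln(I))), where I is a measure of the cumulative evidence and λ is a tuning constant chosen by the analyst. The exact definition of I (for example, cumulative statistical information or cumulative number of participants), the default λ = 2.0 used here as a placeholder, and the function name are assumptions of this sketch rather than details taken from our analysis.

```python
from math import e, log, sqrt


def penalized_z(z: float, information: float, lam: float = 2.0) -> float:
    """Penalize a cumulative z-score using the law of the iterated logarithm.

    z:           unadjusted cumulative z-score.
    information: cumulative amount of evidence (assumed measure; must
                 exceed e so that ln(ln(information)) is positive).
    lam:         tuning constant (placeholder default, an assumption).
    """
    if information <= e:
        raise ValueError("information must be greater than e")
    if lam <= 0:
        raise ValueError("lam must be positive")
    return z / sqrt(lam * log(log(information)))


if __name__ == "__main__":
    # Hypothetical example: z = 2.5 with 1000 units of cumulative evidence.
    print(penalized_z(2.5, 1000))
```

The penalized value is then compared with the conventional critical value (for example, 1.96) instead of with sequential monitoring boundaries.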

We also presented a guide to help clinicians interpret TSA; however, we did not explain the statistical basis of this analysis, which we recognize as a limitation.

We showed several examples of how a TSA can be applied to meta-analyses published in the KJA. We believe that this study provides useful insights to better understand the use of this statistical tool.

We sincerely thank Michele Salvagno, MD, for drawing Fig. 1.

No potential conflict of interest relevant to this article was reported.

Alessandro De Cassai (Conceptualization; Formal analysis; Methodology; Project administration; Supervision; Writing – original draft; Writing – review & editing)

Martina Tassone (Conceptualization; Writing – original draft; Writing – review & editing)

Federico Geraldini (Writing – original draft; Writing – review & editing)

Massimo Sergi (Writing – original draft; Writing – review & editing)

Nicolò Sella (Writing – original draft; Writing – review & editing)

Annalisa Boscolo (Writing – original draft; Writing – review & editing)

Marina Munari (Writing – original draft; Writing – review & editing)

Graphical representation of the trial sequential analysis (TSA) outcome. A: favors intervention (benefit), B: non-statistically significant, C: inner wedge, D: favors control (harm).

Flow chart of study inclusion.

Trial sequential analysis (TSA) of the effect of lidocaine in reducing rocuronium-induced withdrawal movement [

Trial sequential analysis (TSA) of the effect of opioids in reducing rocuronium-induced withdrawal movement [

Trial sequential analysis (TSA) of the effect of multimodal anesthesia compared to that of continuous peripheral nerve blocks on pain at 48 hours following midline laparotomy [

Trial sequential analysis (TSA) of the effect of epidural anesthesia compared to that of continuous peripheral nerve blocks on pain at 48 hours following midline laparotomy [

Trial sequential analysis (TSA) of the effect of meperidine compared to that of placebo on postoperative shivering [

Trial sequential analysis (TSA) of the effect of clonidine compared to that of placebo on postoperative shivering [

Trial sequential analysis (TSA) of the effect of ibuprofen on postoperative opioid consumption [

Trial sequential analysis (TSA) of the effect of ibuprofen on postoperative pain [

Trial sequential analysis (TSA) of the efficacy of ramosetron in preventing postoperative nausea and vomiting [

Trial sequential analysis (TSA) of the efficacy of lidocaine/tetracaine patch and peel on pain [

Characteristics of the Included Studies

| Author (yr) | Registration number | Main outcome | n | Intervention | Control | Overall effect (95% CI) |
|---|---|---|---|---|---|---|
| Choi et al. (2014) [ | - | Incidence of rocuronium-induced withdrawal movement following pretreatment with lidocaine | 905 | 223/480 | 316/425 | Random effects using the M-H method: RR 0.60 (0.49, 0.74) |
| | | Incidence of rocuronium-induced withdrawal movement following pretreatment with opioids | 1016 | 146/582 | 353/434 | Random effects using the M-H method: RR 0.28 (0.18, 0.44) |
| Bailey et al. (2020) [ | CRD42017051770 | Cumulative opioid consumption at 48 hours in patients undergoing midline laparotomy with continuous peripheral nerve blocks versus multimodal analgesia | 1080 | 552 | 528 | Random effects using the MD IV: −31.52 (−42.81, −20.22) |
| | | Cumulative opioid consumption at 48 hours in patients undergoing midline laparotomy with continuous peripheral nerve blocks versus epidural analgesia | 566 | 293 | 273 | Random effects using the MD IV: 16.13 (−0.10, 32.36) |
| Min et al. (1999) [ | - | Meperidine for prevention of postoperative shivering | 70 | 5/35 | 17/35 | Fixed effects using Peto OR: 0.2 (0.1, 0.5) |
| | | Clonidine for prevention of postoperative shivering | 518 | 99/259 | 161/259 | Fixed effects using Peto OR: 0.3 (0.2, 0.5) |
| Kim et al. (2021) [ | CRD42020166141 | Opioid consumption following treatment with ibuprofen | 269 | 135 | 134 | Random effects using MD IV: −170.70 (−265.64, −75.77) |
| | | Postoperative pain scores following treatment with ibuprofen | 266 | 185 | 181 | Random effects using MD IV: −0.58 (−0.99, −0.18) |
| Kim et al. (2011) [ | - | Incidence of postoperative nausea and vomiting following pretreatment with ramosetron | 685 | 106/340 | 216/345 | Random effects using RR IV: 0.40 (0.27, 0.58) |
| Kim et al. (2012) [ | - | Efficacy and safety of lidocaine/tetracaine patch and peel to treat pain | 574 | 211/298 | 70/276 | Fixed effects using RR IV: 2.49 (2.01, 3.07) |

n: number, CI: confidence interval, M-H: Mantel–Haenszel, RR: relative risk, MD: mean difference, IV: inverse variance, OR: odds ratio.