Comparison of four indirect (data mining) approaches to derive within-subject biological variation
Within-subject biological variation (CVi) is fundamental to laboratory medicine, underpinning the interpretation of serial results, the partitioning of reference intervals and the setting of analytical performance specifications. Four indirect (data mining) approaches to determining CVi were directly compared.
Paired serial laboratory results for 5,000 patients were simulated using four parameters: d, the percentage difference in the means between the pathological and non-pathological populations; CVi, the within-subject coefficient of variation for non-pathological values; f, the fraction of pathological values; and e, the relative increase in CVi of the pathological distribution. These parameters resulted in a total of 128 permutations. The performance of four methods was compared: the Expected Mean Squares (EMS) method, the median method, a result ratio method with Tukey’s outlier exclusion, and a modified result ratio method with Tukey’s outlier exclusion.
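The simulation and one of the compared estimators can be illustrated with a minimal sketch. It is not the study's implementation: the normal distributions, the common setpoint of 100, the choice to make each individual result pathological independently with probability f, and the estimate of CVi as the standard deviation of the result ratios divided by √2 are all assumptions made for illustration, and the function names are hypothetical.

```python
import random
import statistics
from math import sqrt


def simulate_pairs(n, d, cv_i, f, e, mean=100.0, seed=0):
    """Simulate n paired results (assumed model, not the study's code).

    d    : percentage difference in means (pathological vs non-pathological)
    cv_i : within-subject CV of non-pathological values, as a fraction
    f    : fraction of pathological values
    e    : relative increase in CV of the pathological distribution
    """
    rng = random.Random(seed)

    def draw():
        # Assumption: each result is pathological independently with probability f.
        if rng.random() < f:
            shifted = mean * (1 + d / 100)
            return rng.gauss(shifted, shifted * cv_i * (1 + e))
        return rng.gauss(mean, mean * cv_i)

    return [(draw(), draw()) for _ in range(n)]


def tukey_filter(values, k=1.5):
    """Keep values inside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def result_ratio_cv(pairs):
    """Estimate CVi from second/first result ratios after Tukey exclusion.

    Assumption: for small CVs the SD of the ratio is about sqrt(2) * CVi.
    """
    ratios = [second / first for first, second in pairs]
    return statistics.stdev(tukey_filter(ratios)) / sqrt(2)


if __name__ == "__main__":
    pairs = simulate_pairs(5000, d=20, cv_i=0.05, f=0.1, e=0.5)
    print(f"estimated CVi: {result_ratio_cv(pairs):.4f}")
```

Tukey's fences also trim the tails of the non-pathological ratios, so this sketch slightly underestimates CVi even when f is zero.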
Across the 128 permutations examined in this study, the EMS method placed 101/128 permutations within ±0.20 fractional error of the ‘true’ simulated CVi, ahead of the result ratio method with Tukey’s exclusion (78/128 permutations). The median method grossly underestimated the CVi. The modified result ratio method with Tukey’s rule performed best overall, with 114/128 permutations within allowable error.
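The acceptance criterion can be sketched as follows, assuming fractional error is defined as (estimated − true) / true. The function names and the example values are hypothetical and are not results from the study.

```python
def fractional_error(estimated, true):
    """Signed error of an estimate relative to the true value (assumed definition)."""
    return (estimated - true) / true


def within_tolerance(estimated, true, tol=0.20):
    """True if the estimate lies within +/- tol fractional error of the true value."""
    return abs(fractional_error(estimated, true)) <= tol


def count_within(results, tol=0.20):
    """Count (estimated, true) pairs that meet the tolerance."""
    return sum(within_tolerance(est, true, tol) for est, true in results)


if __name__ == "__main__":
    # Hypothetical (estimated, true) CVi pairs, one per simulated permutation.
    example = [(0.055, 0.05), (0.030, 0.05), (0.058, 0.05)]
    print(f"{count_within(example)}/{len(example)} within tolerance")
```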
This simulation study demonstrates that, with careful selection of the statistical approach, the influence of outliers from pathological populations can be minimised, and that it is possible to recover CVi values close to those of the ‘true’ underlying non-pathological population. This finding provides further evidence for the use of routine laboratory databases in the derivation of biological variation components.
Journal/Conference/Book title: Clinical Chemistry and Laboratory Medicine (CCLM)