Sample Size Estimation for Outlier Detection

Timothy Gebhard, Inga Koerte, Sylvain Bouix

18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015)


Cite this paper

@InProceedings{   Gebhard_2015,
  title         = {Sample Size Estimation for Outlier Detection},
  author        = {Gebhard, Timothy and Koerte, Inga and Bouix, Sylvain},
  year          = {2015},
  pages         = {743--750},
  doi           = {10.1007/978-3-319-24574-4_89},
  booktitle     = {Medical Image Computing and Computer-Assisted Intervention
                  -- MICCAI 2015},
  publisher     = {Springer International Publishing},
  editor        = {Navab, Nassir and Hornegger, Joachim and Wells, William M.
                  and Frangi, Alejandro F.},
  isbn          = {978-3-319-24574-4}
}


The study of brain disorders that display spatially heterogeneous patterns of abnormalities has led to a number of techniques for producing subject-specific abnormality (SSA) maps. One popular method for identifying SSAs is to calculate, for a set of regions, the z-score between a feature in a test subject and the distribution of this feature in a normative atlas, and to flag regions that exceed a threshold. While sample size estimation and power calculations are well understood for group comparisons, neither the confidence interval of a z-score threshold nor the sample size needed for a desired threshold uncertainty has been thoroughly considered in SSA analyses. In this paper, we propose a method to quantify the impact of the size and distribution properties of the control data on the uncertainty of the z-score threshold. The main idea is that a confidence interval for the z-score threshold can be approximated by Gaussian error propagation of the uncertainties associated with the sample mean and standard deviation. In addition, we provide a method to estimate the control sample size required for a given z-score threshold and desired confidence interval width. We provide both parametric and resampling methods to estimate these confidence intervals, and we apply our techniques to establish the confidence of SSA maps of diffusion data in subjects with traumatic brain injury.
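
As a rough illustration of the error-propagation idea described in the abstract, the following Python sketch approximates a confidence interval for a threshold of the form t = mean + z * sd and searches for the smallest control sample size that meets a target interval width. It is not taken from the paper: the function names, the assumption of normally distributed control data, the large-sample standard error of the standard deviation, sd / sqrt(2(n - 1)), and the normal quantile 1.96 for a 95% interval are all assumptions made here for illustration.

```python
import math


def threshold_ci(mean, sd, n, z=2.0, conf_z=1.96):
    """Approximate CI for the threshold t = mean + z * sd.

    Assumes normally distributed control data, so the sample mean and
    sample sd are independent and first-order Gaussian error propagation
    has no covariance term. (Illustrative assumption, not the paper's code.)
    """
    se_mean = sd / math.sqrt(n)            # standard error of the sample mean
    se_sd = sd / math.sqrt(2 * (n - 1))    # large-sample standard error of the sample sd
    se_t = math.sqrt(se_mean ** 2 + (z * se_sd) ** 2)
    t = mean + z * sd
    return t - conf_z * se_t, t + conf_z * se_t


def required_n(z, half_width_in_sd, conf_z=1.96, n_max=100_000):
    """Smallest n whose CI half-width, in units of sd, is at most the target."""
    for n in range(3, n_max + 1):
        lo, hi = threshold_ci(0.0, 1.0, n, z=z, conf_z=conf_z)
        if (hi - lo) / 2 <= half_width_in_sd:
            return n
    raise ValueError("target width not reachable within n_max samples")


if __name__ == "__main__":
    print(threshold_ci(mean=0.0, sd=1.0, n=51, z=2.0))
    print(required_n(z=2.0, half_width_in_sd=0.5))
```

Because the half-width shrinks roughly with 1 / sqrt(n) under these assumptions, halving the target width requires about four times as many control subjects. A resampling alternative, such as bootstrapping the control data, would avoid the normality assumption at the cost of more computation.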