
Ovarian Cancer Screening Based on Changepoint Mixture Model

ZOU Chen-chen, FANG Xiang-zhong, ZHAI Guang-he
(School of Mathematical Sciences, Peking University, Beijing 100871)

Abstract: Ovarian cancer is one of the deadliest gynecological malignancies in many regions, while an effective early screening strategy can greatly improve survival. CA125 and HE4 are the tumor markers most commonly used in recent ovarian cancer screening research and have been validated as efficacious. In this paper, we construct a bivariate changepoint and mixture model based on longitudinal CA125 and HE4 levels, and estimate its parameters by maximum likelihood under the assumption that the preclinical duration is right-censored. Compared with the Bayesian approach proposed by Skates, this method is more adaptive and yields comparable accuracy. Consistency of the estimators is proved. We also ran a 5-year simulation of sequential screening, comparing diagnosis based on the calculated risk of cancer with diagnosis based on a hypothesis test of the true incidence time. The results show that diagnosis based on the hypothesis test performs better in early detection.

Key words: longitudinal data; changepoint; mixture model; maximum likelihood estimation; ovarian cancer screening

Foundations: the Ph.D. Programs Foundation of Ministry of Education of China (No. 20090001110005); the National Natural Science Foundation of China (Grant No. 11171007)

Author information: ZOU Chen-chen, Ph.D. candidate, Peking University (advisor: Prof. FANG Xiang-zhong); FANG Xiang-zhong, professor, Peking University; ZHAI Guang-he, M.S., Peking University

0  Introduction

Ovarian cancer has a low incidence (30–50/100,000) [1], high mortality [2], poor survival in the advanced stage [3] and a remarkable cure rate (90%) in the early stage [4], which makes early detection an effective approach to saving lives. However, the lack of manifest clinical symptoms
in the early stage [3] and the fact that over 70% of patients are diagnosed at an advanced stage greatly limit the sensitivity¹ of screening and detection. Carbohydrate antigen 125 (CA125) is the tumor marker most commonly used in clinical practice, but only 50% of early-stage patients show a rise in CA125 [5], and several diseases other than ovarian cancer can also elevate the CA125 level [6–8], which restricts its sensitivity and specificity.

In 1990, Jacobs [9] proposed a risk of malignancy index (RMI) for early prediction, using logistic regression on menopausal status, transvaginal ultrasound and CA125. The index increased the sensitivity of CA125, but it is expensive. Moreover, the transvaginal ultrasound (TVS) report is somewhat subjective and inconvenient to obtain.

The study conducted by Berg [10] illustrates the drawbacks of current screening, which relies mainly on a fixed-cutoff test of CA125 and transvaginal ultrasound. Almost 70,000 women were randomly split into two roughly equal-sized groups: one received yearly screening for ovarian cancer between 1993 and 2001 with both blood tests and TVS, and the other did not. In the screening group, 212 women were diagnosed with ovarian cancer and 118 of them died from the disease, compared with 176 diagnoses and 100 deaths in the group without regular screening. In both groups, more than three-quarters of the women diagnosed with ovarian cancer already had stage 3 or 4 disease. The study also recorded more than 3,000 false positives in the screening group; more than 1,000 of these women, who did not turn out to have ovarian cancer, had surgery to remove an ovary because of a positive test result. Those surgeries resulted in serious complications in 163 women.

Skates [11] adopted a Bayesian approach that calculates the posterior probability of disease from longitudinal CA125 levels under a hierarchical changepoint (i.e., assuming that the trends of CA125 before and after disease onset differ) and mixture model. Posterior distributions are obtained by Markov chain Monte Carlo (MCMC).
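
The changepoint assumption described above can be pictured with a minimal Python sketch. It assumes, purely for illustration, that the mean log-marker level is flat before the changepoint and rises linearly afterwards; the function name `marker_mean`, the linear form and all numeric values are hypothetical and are not taken from Skates [11] or from the model developed in this paper.

```python
def marker_mean(t, baseline, slope, changepoint):
    """Mean log-marker level at time t (years) under a simple changepoint shape.

    Assumed shape (illustrative only): flat at `baseline` before
    `changepoint`, then increasing linearly at rate `slope`.
    """
    return baseline + slope * max(0.0, t - changepoint)


# Hypothetical screening times and parameter values, for illustration only.
times = [0.0, 1.0, 2.0, 3.0, 4.0]

# A healthy woman: no change in trend, so the mean stays at the baseline.
control = [marker_mean(t, baseline=3.0, slope=0.0, changepoint=0.0) for t in times]

# A case: the trend changes at year 2, after which the marker climbs.
case = [marker_mean(t, baseline=3.0, slope=0.5, changepoint=2.0) for t in times]

print(control)
print(case)
```

In this picture, the mixture part of the model corresponds to not knowing in advance which of the two trajectories a given woman follows, and the changepoint time is the unknown quantity that screening tries to detect early.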
Although Skates' approach outperforms Jacobs' RMI and traditional fixed-cutoff screening, its outcome depends on the choice of prior. Moreover, it requires assigning a specific distribution to the preclinical duration of ovarian cancer, a question that is far from settled and on which existing studies are inconsistent. According to Campbell [12] and Van Nagell [13], the average preclinical duration should be 2 years, while Andersen's research [3] indicates that tumor markers begin to climb about 3 years before clinical diagnosis. Patrick [14] even argues that the early stage of ovarian cancer alone lasts 4 years. Different assumptions generate different results and thus affect early screening as well.

¹ sensitivity = true positive / (true positive + false negative)

On the other hand, researchers have found that a new serum marker called HE4 (human epididymis protein 4) has a high expr