Winkel, Rikke Rass (1); von Euler-Chelpin, My (2); Nielsen, Mads (2); Diao, Pengfei (3); Nielsen, Michael Bachmann (1); Uldall, Wei Yao (1); Vejborg, Ilse (1)
(1) Radiologisk Klinik, Diagnostisk Center, Rigshospitalet, The Capital Region of Denmark
(2) unknown
(3) Datalogisk Institut
impact on relative risk of breast cancer
BACKGROUND: Mammographic breast density and parenchymal patterns are well-established risk factors for breast cancer. We aimed to report inter-observer agreement for three subjective methods of assessing mammographic density and parenchymal pattern and, secondarily, to examine the potential impact of reproducibility on relative risk estimates of breast cancer.
METHODS: This retrospective case-control study included 122 cases and 262 age- and time-matched controls (765 breasts) based on a 2007 screening cohort of 14,736 women with negative screening mammograms from Bispebjerg Hospital, Copenhagen. Digitised, randomized film-based mammograms were classified independently by two readers according to two radiological visual classifications (BI-RADS and Tabár) and a computerized interactive threshold technique measuring area-based percent mammographic density (denoted PMD). Kappa statistics, the Intraclass Correlation Coefficient (ICC) (equivalent to weighted kappa), Pearson's linear correlation coefficient and limits-of-agreement analysis were used to evaluate inter-observer agreement. High/low-risk agreement was also determined by defining the following categories as high-risk: BI-RADS D3 and D4, Tabár PIV and PV, and the upper two quartiles (within density range) of PMD. The relative risk of breast cancer was estimated using logistic regression to calculate age-adjusted odds ratios (ORs), which were compared between the two readers.
RESULTS: Substantial inter-observer agreement was seen for BI-RADS and Tabár (κ=0.68 and 0.64), and agreement was almost perfect when the ICC was calculated for the ordinal BI-RADS scale (ICC=0.88) and the continuous PMD measure (ICC=0.93). The two readers assigned 5% (PMD), 10% (Tabár) and 13% (BI-RADS) of the women to different high/low-risk categories. Inter-reader variability had an impact on the relative risk of breast cancer estimated by the two readers when a multiple-category scale was used, but not on a high/low-risk scale. Tabár's pattern IV demonstrated the highest ORs of all density patterns investigated.
CONCLUSIONS: Our study shows that the Tabár classification has inter-observer reproducibility comparable to that of well-tested density methods, and confirms the association between Tabár's PIV and breast cancer. Despite comparably high inter-observer agreement for all three methods, the impact on ORs for breast cancer seems to differ according to the density scale used. Automated computerized techniques are needed to fully overcome the impact of subjectivity.
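
To illustrate two of the agreement measures named in METHODS, the following is a minimal sketch of unweighted Cohen's kappa and of the share of women that two readers place in different high/low-risk groups. It is not the authors' analysis code: the function names and the ten example ratings are hypothetical and were made up for illustration; only the definition of high risk as BI-RADS D3 and D4 comes from the abstract.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two readers' categorical ratings."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty set of cases")
    n = len(reader_a)
    # Observed agreement: fraction of cases with identical ratings.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement from each reader's marginal category frequencies.
    count_a = Counter(reader_a)
    count_b = Counter(reader_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both readers used one identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def high_low_disagreement(reader_a, reader_b, high_risk):
    """Fraction of cases the readers assign to different high/low-risk groups."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty set of cases")
    differing = sum(
        (a in high_risk) != (b in high_risk) for a, b in zip(reader_a, reader_b)
    )
    return differing / len(reader_a)


# Hypothetical BI-RADS density ratings for ten women (illustrative only).
ratings_a = ["D1", "D2", "D2", "D3", "D3", "D4", "D1", "D2", "D3", "D4"]
ratings_b = ["D1", "D2", "D3", "D3", "D3", "D4", "D2", "D2", "D4", "D4"]

# High-risk categories as defined in the abstract for BI-RADS.
BIRADS_HIGH_RISK = {"D3", "D4"}

kappa = cohen_kappa(ratings_a, ratings_b)
disagreement = high_low_disagreement(ratings_a, ratings_b, BIRADS_HIGH_RISK)
print(f"kappa = {kappa:.2f}, high/low-risk disagreement = {disagreement:.0%}")
```

Unweighted kappa treats every disagreement as equally serious, whereas a weighted kappa or the ICC gives partial credit for near misses on an ordinal scale, which is in line with the higher ICC than κ reported for BI-RADS.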