MICCAI 2020 RibFrac Challenge: 
Rib Fracture Detection and Classification

1. Evaluation of Detection

FROC score: the average sensitivity at 2, 4, and 6 false positives, based on Free-Response ROC (FROC) analysis, is used to evaluate the detection task.
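The FROC score can be sketched as follows. This is a minimal illustration, not the official evaluation script: the function name `froc_score` and its arguments are hypothetical. It assumes that the false positive levels are averages per scan, and that the sensitivity at each level is the best sensitivity reachable without exceeding that level (the official script may interpolate instead).

```python
import numpy as np


def froc_score(confidences, matched_gt, n_gt, n_scans, fp_levels=(2, 4, 6)):
    """Average sensitivity at the given mean false positives per scan.

    confidences: one confidence value per detection proposal (all scans pooled).
    matched_gt:  for each proposal, the global index of the ground-truth
                 fracture it hits, or -1 if it is a false positive.
    n_gt:        total number of ground-truth fractures.
    n_scans:     number of scans (assumption: FP levels are per scan).
    """
    # Sweep the confidence threshold from high to low.
    order = np.argsort(-np.asarray(confidences, dtype=float), kind="stable")
    matched = np.asarray(matched_gt, dtype=int)[order]

    # Cumulative false positives at each threshold.
    fp_cum = np.cumsum(matched < 0)

    # Cumulative number of distinct ground-truth fractures detected,
    # so that several proposals on one fracture count only once.
    seen = set()
    tp_cum = []
    for g in matched:
        if g >= 0:
            seen.add(int(g))
        tp_cum.append(len(seen))

    # Prepend the operating point where no proposal is accepted.
    fp_per_scan = np.concatenate([[0.0], fp_cum / n_scans])
    sensitivity = np.concatenate([[0.0], np.asarray(tp_cum) / n_gt])

    # Best sensitivity that stays within each false positive budget.
    sens_at = [float(sensitivity[fp_per_scan <= level].max()) for level in fp_levels]
    return float(np.mean(sens_at)), sens_at


# Toy example: one scan, four ground-truth fractures, nine proposals.
example_score, example_sens = froc_score(
    confidences=[0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1],
    matched_gt=[0, -1, -1, 1, -1, -1, 2, -1, -1],
    n_gt=4,
    n_scans=1,
)
```

In this toy example, the sensitivities at 2, 4, and 6 false positives are 0.5, 0.75, and 0.75, so the FROC score is their mean.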

Instance-level masks are required to calculate a "hit". If a detection proposal has an IoU > 0.2 with a ground-truth fracture annotation, it counts as a "hit". The threshold is low because segmentations of elongated objects tend to have low IoU.
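The hit criterion can be illustrated with a short sketch on binary instance masks. The helper names `mask_iou` and `is_hit` are hypothetical and are not taken from the official evaluation script; the masks are assumed to be arrays of the same shape, one per instance.

```python
import numpy as np


def mask_iou(pred_mask, gt_mask):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Two empty masks: treat as no overlap.
        return 0.0
    return float(np.logical_and(pred, gt).sum() / union)


def is_hit(pred_mask, gt_mask, iou_threshold=0.2):
    """A proposal hits a ground-truth fracture if IoU > threshold."""
    return mask_iou(pred_mask, gt_mask) > iou_threshold


# Toy example: an elongated ground-truth instance of 4 voxels.
gt = np.zeros((4, 4, 4), dtype=bool)
gt[1, 1, 0:4] = True

# Proposal A overlaps half of the fracture: IoU = 2 / 6.
pred_a = np.zeros_like(gt)
pred_a[1, 1, 2:4] = True
pred_a[1, 2, 2:4] = True

# Proposal B touches one voxel but is mostly elsewhere: IoU = 1 / 20.
pred_b = np.zeros_like(gt)
pred_b[1, 1, 3] = True
pred_b[2, :, :] = True
```

Here proposal A counts as a hit and proposal B does not.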

Both sensitivity and false positives are important in this detection task. In addition, because the objects are elongated, instance masks rather than bounding boxes are used to calculate a "hit".

We will also report the global Dice of the submissions. However, it will not be used for ranking, because detection matters much more than segmentation in this task.
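One possible reading of "global Dice" is sketched below. It assumes that all instances are binarized into a single foreground mask and that the voxel counts are pooled over all scans before the Dice coefficient is computed. Both points are assumptions, and the function name `global_dice` is hypothetical; the official script may define the metric differently.

```python
import numpy as np


def global_dice(pred_masks, gt_masks):
    """Dice coefficient pooled over all scans (assumed definition).

    pred_masks, gt_masks: sequences of label arrays, one pair per scan.
    Any non-zero voxel is treated as foreground.
    """
    intersection = 0
    total = 0
    for pred, gt in zip(pred_masks, gt_masks):
        p = np.asarray(pred) > 0
        g = np.asarray(gt) > 0
        intersection += int(np.logical_and(p, g).sum())
        total += int(p.sum()) + int(g.sum())
    if total == 0:
        # No foreground anywhere: prediction and ground truth agree.
        return 1.0
    return 2.0 * intersection / total


# Toy example with two scans.
gt_1 = np.zeros((4, 4), dtype=int)
gt_1[0, :] = 1                      # 4 foreground voxels
pred_1 = np.zeros((4, 4), dtype=int)
pred_1[0, 2:] = 3                   # 2 overlapping voxels
pred_1[1, 2:] = 3                   # 2 non-overlapping voxels

gt_2 = np.zeros((4, 4), dtype=int)
gt_2[3, 0:2] = 2                    # 2 foreground voxels, fully missed
pred_2 = np.zeros((4, 4), dtype=int)

example_dice = global_dice([pred_1, pred_2], [gt_1, gt_2])
```

In the toy example the pooled intersection is 2 voxels and the pooled foreground count is 10 voxels, which gives a global Dice of 0.4.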

Figure. FROC analysis

2. Evaluation of Classification

Macro-averaged F1 score over the 4 categories.  

A multi-class confusion matrix will be calculated over the prediction ("missing", buckle, nondisplaced, displaced, segmental) and ground truth (buckle, nondisplaced, displaced, segmental). A macro-averaged F1 score is calculated based on the confusion matrix.
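The macro-averaged F1 score can be sketched from such a confusion matrix as follows. The layout is an assumption: rows are the 4 ground-truth categories, and columns are the 5 prediction outcomes with "missing" first. A missed fracture is counted as a false negative for its ground-truth category. The function name `macro_f1` is hypothetical, and the official script may treat edge cases differently.

```python
import numpy as np

CLASSES = ("buckle", "nondisplaced", "displaced", "segmental")


def macro_f1(confusion):
    """Macro-averaged F1 from a 4 x 5 confusion matrix.

    Rows:    ground truth (buckle, nondisplaced, displaced, segmental).
    Columns: prediction ("missing", buckle, nondisplaced, displaced, segmental).
    """
    cm = np.asarray(confusion, dtype=float)
    if cm.shape != (len(CLASSES), len(CLASSES) + 1):
        raise ValueError("expected a 4 x 5 confusion matrix")

    f1_per_class = []
    for i in range(len(CLASSES)):
        tp = cm[i, i + 1]                # correct category predicted
        fn = cm[i].sum() - tp            # missed or wrongly classified
        fp = cm[:, i + 1].sum() - tp     # other categories predicted as this one
        denom = 2 * tp + fp + fn
        f1_per_class.append(float(2 * tp / denom) if denom > 0 else 0.0)
    return float(np.mean(f1_per_class)), f1_per_class


# Toy confusion matrix (counts are made up for illustration).
example_cm = [
    [1, 3, 1, 0, 0],   # buckle
    [0, 1, 4, 0, 0],   # nondisplaced
    [2, 0, 0, 2, 0],   # displaced
    [0, 0, 0, 1, 1],   # segmental
]
example_macro_f1, example_f1s = macro_f1(example_cm)
```

Because every class contributes equally to the mean, rare categories weigh as much as frequent ones in this score.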

Table. Confusion matrix

Note that the performance of Task 2 is partially dependent on the performance of Task 1. We do not design a separate evaluation protocol for Task 2, since the detection task is more clinically important.

3. Evaluation Scripts and Starter Code

We will provide the starter code and evaluation scripts at https://github.com/M3DV/RibFrac-Challenge. Please stay tuned!

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.