Passed
Push — master ( b916d3...99e6cf )
by Fernando
01:17
created

RSNAMICCAI._get_subjects()   C

Complexity

Conditions 10

Size

Total Lines 39
Code Lines 35

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0

Metric  Value
eloc    35
dl      0
loc     39
rs      5.9999
c       0
b       0
f       0
cc      10
nop     4

How to fix: Complexity

Complex units like torchio.datasets.rsna_miccai.RSNAMICCAI._get_subjects() often do a lot of different things. To break such a unit down, we need to identify a cohesive component within it. A common approach to finding such a component is to look for fields or methods that share the same prefixes or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
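Because the flagged unit here is a single method rather than a whole class, a lighter first step than Extract Class is Extract Method: pull each responsibility of _get_subjects() (reading labels, filtering subject directories, resolving image paths) into its own small function. The following is only a sketch under that assumption. The helper names read_labels, is_valid_subject_id, list_subject_dirs and resolve_image_path are hypothetical and not part of TorchIO, and plain paths stand in for Subject and ScalarImage so that the example runs on its own.

```python
import csv
from pathlib import Path
from typing import Dict, Iterable, List


def read_labels(csv_path: Path, id_key: str, label_key: str) -> Dict[str, int]:
    """Map each subject ID found in the CSV file to its integer label."""
    with open(csv_path, newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        return {row[id_key]: int(row[label_key]) for row in reader}


def is_valid_subject_id(subject_id: str) -> bool:
    """Mirror the original check: the directory name must parse as an int."""
    try:
        int(subject_id)
    except ValueError:
        return False
    return True


def list_subject_dirs(subjects_dir: Path, skipped_ids: Iterable[str]) -> List[Path]:
    """Return the sorted subject directories, minus skipped and invalid ones."""
    skipped = set(skipped_ids)
    return [
        path
        for path in sorted(subjects_dir.iterdir())
        if path.name not in skipped and is_valid_subject_id(path.name)
    ]


def resolve_image_path(image_dir: Path) -> Path:
    """Return the single file in the directory, or the directory itself."""
    filepaths = list(image_dir.iterdir())
    return filepaths[0] if len(filepaths) == 1 else image_dir
```

With helpers like these, the body of _get_subjects() would reduce to one loop that calls them in turn, which is where most of the ten conditions counted above would move out of the method.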

import csv
from typing import List
from pathlib import Path

from ..typing import TypePath
from .. import SubjectsDataset, Subject, ScalarImage


class RSNAMICCAI(SubjectsDataset):
    """RSNA-MICCAI Brain Tumor Radiogenomic Classification challenge dataset.

    This is a helper class for the dataset used in the
    `RSNA-MICCAI Brain Tumor Radiogenomic Classification challenge`_ hosted on
    `kaggle <https://www.kaggle.com/>`_. The dataset must be downloaded before
    instantiating this class (as opposed to, e.g., :class:`torchio.datasets.IXI`).

    If you reference or use the dataset in any form, include the following
    citation:

    U. Baid, et al., "The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor
    Segmentation and Radiogenomic Classification", arXiv:2107.02314, 2021.

    Args:
        root_dir: Directory containing the dataset (``train`` directory,
            ``test`` directory, etc.).
        train: If ``True``, the training set will be used. Otherwise the
            validation set will be used.
        ignore_empty: If ``True``, the three subjects flagged as "presenting
            issues" (empty images) by the challenge organizers will be ignored.

    .. _RSNA-MICCAI Brain Tumor Radiogenomic Classification challenge: https://www.kaggle.com/c/rsna-miccai-brain-tumor-radiogenomic-classification
    """
    id_key = 'BraTS21ID'
    label_key = 'MGMT_value'
    modalities = 'T1w', 'T1wCE', 'T2w', 'FLAIR'
    bad_subjects = '00109', '00123', '00709'

    def __init__(
            self,
            root_dir: TypePath,
            train: bool = True,
            ignore_empty: bool = True,
            **kwargs,
            ):
        self.root_dir = Path(root_dir).expanduser().resolve()
        subjects = self._get_subjects(self.root_dir, train, ignore_empty)
        super().__init__(subjects, **kwargs)
        self.train = train

    def _get_subjects(
            self,
            root_dir: Path,
            train: bool,
            ignore_empty: bool,
            ) -> List[Subject]:
        subjects = []
        if train:
            csv_path = root_dir / 'train_labels.csv'
            with open(csv_path) as csvfile:
                reader = csv.DictReader(csvfile)
                labels_dict = {
                    row[self.id_key]: int(row[self.label_key])
                    for row in reader
                }
            subjects_dir = root_dir / 'train'
        else:
            subjects_dir = root_dir / 'test'

        for subject_dir in sorted(subjects_dir.iterdir()):
            subject_id = subject_dir.name
            if ignore_empty and subject_id in self.bad_subjects:
                continue
            try:
                int(subject_id)
            except ValueError:
                continue
            images_dict = {self.id_key: subject_dir.name}
            if train:
                images_dict[self.label_key] = labels_dict[subject_id]
                # Issue reported for the line above: The variable labels_dict
                # does not seem to be defined for all execution paths.
            for modality in self.modalities:
                image_dir = subject_dir / modality
                filepaths = list(image_dir.iterdir())
                num_files = len(filepaths)
                path = filepaths[0] if num_files == 1 else image_dir
                images_dict[modality] = ScalarImage(path)
            subject = Subject(images_dict)
            subjects.append(subject)
        return subjects
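The reported issue about labels_dict arises because the name is bound only inside the first "if train:" branch and read inside a second one. At runtime both branches depend on the same flag, so the lookup is never reached with the name unbound, but the analyzer cannot prove that. One possible way to make this explicit, sketched here with hypothetical helpers load_labels and build_metadata that are not part of TorchIO, is to bind the labels on every path and branch on the value itself rather than on the flag:

```python
import csv
from pathlib import Path
from typing import Dict, Optional


def load_labels(
        root_dir: Path,
        train: bool,
        id_key: str = 'BraTS21ID',
        label_key: str = 'MGMT_value',
        ) -> Optional[Dict[str, int]]:
    """Return the labels for the training set, or None for the test set."""
    if not train:
        return None
    with open(root_dir / 'train_labels.csv', newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        return {row[id_key]: int(row[label_key]) for row in reader}


def build_metadata(
        subject_id: str,
        labels: Optional[Dict[str, int]],
        id_key: str = 'BraTS21ID',
        label_key: str = 'MGMT_value',
        ) -> dict:
    """Build the non-image entries for one subject."""
    metadata = {id_key: subject_id}
    # The guard tests the value, so the name is defined on every path.
    if labels is not None:
        metadata[label_key] = labels[subject_id]
    return metadata
```

Since the return value of load_labels is always assigned, the later lookup no longer depends on two separate checks of the train flag staying in sync.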