Conditions | 8 |
Total Lines | 61 |
Code Lines | 47 |
Lines | 0 |
Ratio | 0 % |
Changes | 0 |
Small methods make your code easier to understand, particularly when they have a good name. A small method is also usually much easier to name well.
For example, if you find yourself adding comments inside a method's body, this is usually a sign that the commented part should be extracted into a new method. The comment is then a good starting point for naming that method.
Commonly applied refactorings include:
If many parameters/temporary variables are present:
Methods with many parameters are not only hard to understand; their parameter lists also tend to become inconsistent when you need more or different data.
There are several approaches to avoid long parameter lists:
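One such approach is to bundle parameters that travel together into a single parameter object. The following is a minimal sketch, not the project's actual code: `CompoundInput`, `has_lookup_key`, and `describe` are hypothetical names, and the fields are borrowed from the `_process` signature in this finding.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompoundInput:
    """Hypothetical parameter object bundling the fields of one input row."""

    compound_id: Optional[str]
    library: Optional[str]
    inchi: Optional[str]
    inchikey: Optional[str]
    pubchem_id: Optional[str]
    chembl_id: Optional[str]
    line_no: int

    @property
    def has_lookup_key(self) -> bool:
        # True if at least one of the three identifiers is present,
        # i.e. the opposite of the "all three are None" guard.
        keys = (self.inchikey, self.pubchem_id, self.chembl_id)
        return any(key is not None for key in keys)


def describe(row: CompoundInput) -> str:
    # Stand-in for a method that would otherwise take seven arguments.
    if not row.has_lookup_key:
        return f"[line {row.line_no}] No data for {row.compound_id}"
    return f"[line {row.line_no}] lookup for {row.compound_id}"
```

A caller then passes one `CompoundInput` instead of seven positional values, and adding a field later does not change every signature along the call chain.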
from __future__ import annotations

    def _process(
        self,
        compound_id: Optional[str],
        library: Optional[str],
        inchi: Optional[str],
        inchikey: Optional[str],
        pubchem_id: Optional[str],
        chembl_id: Optional[str],
        line_no: int,
    ):
        if inchikey is pubchem_id is chembl_id is None:
            logger.error(f"[line {line_no}] No data for {compound_id}")
            return dict(
                inchi=inchi,
                inchikey=inchikey,
                chembl_id=None,
                chembl_inchi=None,
                chembl_inchikey=None,
                pubchem_id=None,
                pubchem_inchi=None,
                pubchem_inchikey=None,
            )
        fake_x = CompoundStruct("input", compound_id, inchi, inchikey)
        chembl_x = self._get_chembl(inchikey, chembl_id)
        pubchem_x = self._get_pubchem(inchikey, pubchem_id)
        #################################################################################
        # This is important and weird!
        # Where DNE = does not exist and E = exists
        # If chembl DNE and pubchem E ==> fill chembl
        # THEN: If chembl E and (pubchem E or pubchem DNE) ==> fill pubchem
        # we might therefore go from pubchem --> chembl --> pubchem
        # The advantage is that chembl might have a good parent compound
        # Whereas pubchem does not
        # This is often true: chembl is much better at this than pubchem
        # In contrast, only fill ChEMBL if it's missing
        if chembl_x is None and pubchem_x is not None:
            chembl_x = self._get_chembl(pubchem_x.inchikey, None)
        if chembl_x is not None:
            pubchem_x = self._get_pubchem(chembl_x.inchikey, None)
        #################################################################################
        # the order is from best to worst
        prioritize_choices = [chembl_x, pubchem_x, fake_x]
        db_to_struct = {o.db: o for o in prioritize_choices if o is not None}
        inchikey, inchikey_choices = self._choose(db_to_struct, "inchikey")
        inchi, inchi_choices = self._choose(db_to_struct, "inchi")
        about = " ; ".join([x.simple_str for x in prioritize_choices if x is not None])
        if len(inchikey_choices) == 0:
            logger.error(f"[line {line_no}] no database inchikeys found :: {about}")
        elif len(inchikey_choices) > 1:
            logger.error(f"[line {line_no}] inchikey mismatch :: {about} :: {inchikey_choices}")
        elif len(inchi_choices) > 1:
            logger.debug(f"[line {line_no}] inchi mismatch :: {about} :: {inchi_choices}")
        return dict(
            inchi=inchi,
            inchikey=inchikey,
            chembl_id=look(chembl_x, "id"),
            chembl_inchi=look(chembl_x, "inchi"),
            chembl_inchikey=look(chembl_x, "inchikey"),
            pubchem_id=look(pubchem_x, "id"),
            pubchem_inchi=look(pubchem_x, "inchi"),
            pubchem_inchikey=look(pubchem_x, "inchikey"),
        )
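The long comment block in the middle of `_process` marks a step that could be extracted into its own method, with the comment supplying the name and docstring. The sketch below shows one possible shape of that extraction. It is an illustration under assumptions: `Struct` is a hypothetical stand-in for `CompoundStruct`, `cross_fill` is an invented name, and the two lookups are passed in as single-argument callables rather than the two-argument `self._get_chembl` and `self._get_pubchem` calls of the original.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Struct:
    """Hypothetical stand-in for CompoundStruct; only db and inchikey are modeled."""

    db: str
    inchikey: str


# Assumed lookup shape: an InChIKey goes in, a record or None comes out.
Lookup = Callable[[str], Optional[Struct]]


def cross_fill(
    chembl_x: Optional[Struct],
    pubchem_x: Optional[Struct],
    get_chembl: Lookup,
    get_pubchem: Lookup,
) -> Tuple[Optional[Struct], Optional[Struct]]:
    """Fill a missing ChEMBL record from PubChem, then refresh PubChem from ChEMBL.

    The path may run pubchem --> chembl --> pubchem, because ChEMBL may
    know a good parent compound where PubChem does not.
    """
    # Only fill ChEMBL if it is missing.
    if chembl_x is None and pubchem_x is not None:
        chembl_x = get_chembl(pubchem_x.inchikey)
    # Whenever ChEMBL exists, re-query PubChem with ChEMBL's key.
    if chembl_x is not None:
        pubchem_x = get_pubchem(chembl_x.inchikey)
    return chembl_x, pubchem_x
```

With such a helper, the body of `_process` would shrink to a single call at that point, and the explanation would live in the helper's docstring instead of an inline banner comment.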