Conditions | 6 |
Total Lines | 126 |
Code Lines | 42 |
Lines | 0 |
Ratio | 0 % |
Changes | 0 |
Small methods make your code easier to understand, particularly when they are combined with a good name. A good name is also usually much easier to find for a small method.
For example, if you find yourself adding comments to a method's body, this is usually a good sign that the commented part should be extracted into a new method, with the comment serving as a starting point for the new method's name.
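As a minimal sketch of this idea, a commented step inside a function can be moved into a helper whose name is derived from the comment. The functions and data below are hypothetical and are not taken from the analyzed code.

```python
# Hypothetical example: all names and data are illustrative only.

def report_before(prices):
    # drop missing values
    cleaned = [p for p in prices if p is not None]
    # compute the average price
    average = sum(cleaned) / len(cleaned)
    return f"average price: {average:.2f}"


def drop_missing_values(values):
    """Return the values without None entries."""
    return [v for v in values if v is not None]


def average_of(values):
    """Return the arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def report_after(prices):
    # Each former comment is now expressed by a function name.
    cleaned = drop_missing_values(prices)
    return f"average price: {average_of(cleaned):.2f}"
```

Both versions return the same string; the second one reads without comments because the helper names carry the intent.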
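Commonly applied refactorings include Extract Method. If many parameters or temporary variables are present, Replace Temp with Query, Introduce Parameter Object, and Preserve Whole Object can help.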
Methods with many parameters are not only hard to understand; their parameter lists also tend to become inconsistent as you need more, or different, data.
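There are several approaches to avoid long parameter lists, for example Replace Parameter with Method Call, Preserve Whole Object, and Introduce Parameter Object.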
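One such approach groups related parameters into a single object, often called a parameter object. The sketch below is only an illustration under stated assumptions: `PlotStyle` and `describe_plot` are hypothetical names, and `describe_plot` is a stand-in that returns a string instead of drawing anything.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PlotStyle:
    """Hypothetical parameter object bundling display options."""
    figsize: tuple = (12, 10)
    annot: bool = True
    linewidths: float = 0.5
    annot_kws: dict = field(default_factory=lambda: {"size": 10})


def describe_plot(title, style=None):
    """Stand-in for a plotting function that takes one style object
    instead of a long list of separate display arguments."""
    style = style or PlotStyle()
    width, height = style.figsize
    return (f"{title}: {width}x{height}, annot={style.annot}, "
            f"linewidths={style.linewidths}")
```

A caller then overrides only what differs from the defaults, for instance `describe_plot("corr", PlotStyle(figsize=(6, 4), annot=False))`, and new display options can be added to the dataclass without changing the function signature.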
# Imports inferred from the names used in the function
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def corr_plot(data, split=None, threshold=0, cmap=sns.color_palette("BrBG", 250), figsize=(12, 10), annot=True, dev=False, **kwargs):
    '''
    Two-dimensional visualization of the correlation between feature-columns, excluding NA values.

    Parameters:
    ----------
    data: 2D dataset that can be coerced into an ndarray. If a Pandas DataFrame is provided, the index/column information will be used to label the columns and rows.

    split: {None, 'pos', 'neg', 'high', 'low'}, default None
        Type of split to be performed.

        * None: visualize all correlations between the feature-columns.
        * pos: visualize all positive correlations between the feature-columns above the threshold.
        * neg: visualize all negative correlations between the feature-columns below the threshold.
        * high: visualize all correlations between the feature-columns for which abs(corr) >= threshold is True.
        * low: visualize all correlations between the feature-columns for which abs(corr) <= threshold is True.

    threshold: float, default 0
        Value in the range 0 <= threshold <= 1.

    cmap: matplotlib colormap name or object, or list of colors, default sns.color_palette("BrBG", 250)
        The mapping from data values to color space.

    figsize: tuple, default (12, 10)
        Use to control the figure size.

    annot: bool, default True
        Use to show or hide annotations.

    dev: bool, default False
        Display figure settings in the plot by setting dev = True. If False, the settings are not displayed. Use for presentations.

    **kwargs: optional
        Additional elements to control the visualization of the plot, e.g.:

        * mask: bool, default True
            If set to False the entire correlation matrix, including the upper triangle, is shown. Set dev = False in this case to avoid overlap.
        * vmax: float, default is calculated from the given correlation coefficients.
            Value between -1 or vmin <= vmax <= 1, limits the range of the colorbar.
        * vmin: float, default is calculated from the given correlation coefficients.
            Value between -1 <= vmin <= 1 or vmax, limits the range of the colorbar.
        * linewidths: float, default 0.5
            Controls the line width between the squares.
        * annot_kws: dict, default {'size' : 10}
            Controls the font size of the annotations. Only available when annot = True.
        * cbar_kws: dict, default {'shrink': .95, 'aspect': 30}
            Controls the size of the colorbar.
        * Many more kwargs are available, e.g. 'alpha' to control blending, or options to adjust labels, ticks ...

        Kwargs can be supplied through a dictionary of key-value pairs (see above).

    Returns:
    -------
    ax: matplotlib Axes. Axes object with the heatmap.
    '''

    # Compute the correlation matrix once and reuse it in every branch
    full_corr = data.corr()

    if split == 'pos':
        corr = full_corr.where((full_corr >= threshold) & (full_corr > 0))
        threshold = '-'
    elif split == 'neg':
        corr = full_corr.where((full_corr <= threshold) & (full_corr < 0))
        threshold = '-'
    elif split == 'high':
        corr = full_corr.where(np.abs(full_corr) >= threshold)
    elif split == 'low':
        corr = full_corr.where(np.abs(full_corr) <= threshold)
    else:
        corr = full_corr
        split = "full"
        threshold = 'None'

    # Generate mask for the upper triangle (builtin bool, since np.bool is not available in every NumPy release)
    mask = np.triu(np.ones_like(corr, dtype=bool))

    # Compute the correlation range of the visible (lower) triangle to adjust settings
    vmax = np.round(np.nanmax(corr.where(~mask)) - 0.05, 2)
    vmin = np.round(np.nanmin(corr.where(~mask)) + 0.05, 2)

    # Set up the matplotlib figure
    fig, ax = plt.subplots(figsize=figsize)

    # Default kwargs for the heatmap; user-supplied kwargs take precedence
    kwargs = {'mask': mask,
              'cmap': cmap,
              'annot': annot,
              'vmax': vmax,
              'vmin': vmin,
              'linewidths': .5,
              'annot_kws': {'size': 10},
              'cbar_kws': {'shrink': .95, 'aspect': 30},
              **kwargs}

    # Draw heatmap with mask and some default settings
    sns.heatmap(corr,
                center=0,
                square=True,
                fmt='.2f',
                **kwargs
                )

    ax.set_title('Feature-correlation Matrix', fontdict={'fontsize': 18})

    if dev:  # show settings
        fig.suptitle(f"\
            Settings (dev-mode): \n\
            - split-mode: {split} \n\
            - threshold: {threshold} \n\
            - cbar: \n\
                - vmax: {vmax} \n\
                - vmin: {vmin} \n\
            - linewidths: {kwargs['linewidths']} \n\
            - annot_kws: {kwargs['annot_kws']} \n\
            - cbar_kws: {kwargs['cbar_kws']}",
                     fontsize=12,
                     color='gray',
                     x=0.35,
                     y=0.8,
                     ha='left')

    return ax
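The if/elif chain that selects correlations by split mode is one candidate for extraction from `corr_plot`. The sketch below is an assumption about how that could look, not part of the analyzed code: it applies the same rules to single float values rather than to a DataFrame, and the names `SPLIT_RULES` and `keep_correlation` are hypothetical.

```python
# Hypothetical sketch: mirrors the split rules of corr_plot on single values.

SPLIT_RULES = {
    "pos": lambda corr, threshold: corr >= threshold and corr > 0,
    "neg": lambda corr, threshold: corr <= threshold and corr < 0,
    "high": lambda corr, threshold: abs(corr) >= threshold,
    "low": lambda corr, threshold: abs(corr) <= threshold,
}


def keep_correlation(corr, split=None, threshold=0):
    """Return True if a correlation value passes the given split rule."""
    rule = SPLIT_RULES.get(split)
    # A missing or unknown split mode keeps every value,
    # like the else branch of the original chain.
    return True if rule is None else rule(corr, threshold)
```

Moving the rules into a lookup table removes five branches from the calling function, and a new split mode then only needs one additional table entry.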