Completed
Push — master ( 4274a7...754032 )
by Kolen
26s
created

parse_alignment()   F

Complexity

Conditions 11

Size

Total Lines 52

Duplication

Lines 0
Ratio 0 %

Importance

Changes 7
Bugs 1 Features 0
Metric  Value
cc      11
c       7
b       1
f       0
dl      0
loc     52
rs      3.8571

How to fix   Long Method    Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include Extract Method.
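A minimal sketch of Extract Method applied to parse_alignment(): each commented step in the function body ("truncate and debug if too long", "parsing alignment", "fill up with default if too short") becomes a small function named after its comment. The helper names, the ALIGN_NAMES constant and the debug() stand-in for panflute.debug are assumptions made for this illustration, and the sketch only handles Python 3 strings.

```python
import sys

ALIGN_NAMES = {"l": "AlignLeft", "c": "AlignCenter",
               "r": "AlignRight", "d": "AlignDefault"}


def debug(message):
    """Stand-in for panflute.debug so the sketch runs without panflute."""
    sys.stderr.write(message + "\n")


def truncate_alignment(alignment_string, number_of_columns):
    """Former comment: truncate and debug if too long."""
    if len(alignment_string) > number_of_columns:
        debug("pantable: alignment string is too long, truncated instead.")
        return alignment_string[:number_of_columns]
    return alignment_string


def translate_alignment(alignment_string):
    """Former comment: parsing alignment. Returns None on an invalid character."""
    try:
        return [ALIGN_NAMES[char.lower()] for char in alignment_string]
    except KeyError:
        debug("pantable: alignment: invalid character found, "
              "default is used instead.")
        return None


def pad_alignment(alignment, number_of_columns):
    """Former comment: fill up with default if too short."""
    missing = number_of_columns - len(alignment)
    return alignment + ["AlignDefault"] * missing


def parse_alignment(alignment_string, number_of_columns):
    """Same cases as the original, now one short step per helper."""
    if not alignment_string:
        return None
    if not isinstance(alignment_string, str):  # Python 3 only in this sketch
        debug("pantable: alignment string is invalid")
        return None
    alignment = translate_alignment(
        truncate_alignment(alignment_string, number_of_columns))
    if alignment is None:
        return None
    return pad_alignment(alignment, number_of_columns)
```

With this split, the top-level function reads as a list of steps, and each helper carries only one or two branches instead of all of them sitting in a single 52-line body.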

Complexity

Complex code units like parse_alignment() often do a lot of different things. To break such a unit down, we need to identify a cohesive component within it. A common approach to finding such a component is to look for fields/methods that share the same prefixes or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
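As a rough illustration of the shared-suffix heuristic on this module: get_width(), get_table_width() and auto_width() all end in "width", which hints that they form one cohesive component. The sketch below groups them with Extract Class. The class name ColumnWidths and its method names are hypothetical, the panflute.debug messages are left out for brevity, and, unlike the original helpers, a zero denominator such as "1/0" is treated as invalid.

```python
from fractions import Fraction


class ColumnWidths:
    """Hypothetical home for the width-related helpers (not in the original)."""

    def __init__(self, options, number_of_columns):
        self.options = options
        self.number_of_columns = number_of_columns

    def table_width(self):
        """Relative table width; falls back to 1.0 when missing or invalid."""
        try:
            value = float(Fraction(self.options.get("table-width", 1.0)))
        except (ValueError, TypeError, ZeroDivisionError):
            return 1.0
        return value if value > 0 else 1.0

    def explicit(self):
        """Widths given in the options, or None when missing or invalid."""
        raw = self.options.get("width")
        if not isinstance(raw, list) or len(raw) != self.number_of_columns:
            return None
        try:
            widths = [float(Fraction(item)) for item in raw]
        except (ValueError, TypeError, ZeroDivisionError):
            return None
        return widths if all(w >= 0 for w in widths) else None

    def automatic(self, table_list):
        """Widths proportional to the longest cell line; None if all are empty."""
        # The +3 padding mirrors auto_width() in the listing.
        absolute = [
            3 + max(len(line) for row in table_list
                    for line in row[index].split("\n"))
            for index in range(self.number_of_columns)
        ]
        total = sum(absolute)
        if total == 3 * self.number_of_columns:
            return None
        return [width / total * self.table_width() for width in absolute]

    def resolve(self, table_list):
        """Explicit widths win; otherwise fall back to the automatic ones."""
        widths = self.explicit()
        return widths if widths is not None else self.automatic(table_list)
```

A caller such as convert2table() would then need a single call, for example ColumnWidths(options, number_of_columns).resolve(table_list), instead of coordinating three separate functions.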

r"""
Panflute filter to parse table in fenced YAML code blocks.
Currently only CSV table is supported.

7 metadata keys are recognized:

-   caption: the caption of the table. If omitted, no caption will be inserted.
-   alignment: a string of characters among L,R,C,D, case-insensitive,
        corresponds to Left-aligned, Right-aligned,
        Center-aligned, Default-aligned respectively.
    e.g. LCRD for a table with 4 columns
    default: DDD...
-   width: a list of relative widths corresponding to the width of each column.
    default: auto calculate from the length of each line in table cells.
-   table-width: the relative width of the table (e.g. relative to \linewidth).
    default: 1.0
-   header: If it has a header row. default: true
-   markdown: If CSV table cell contains markdown syntax. default: False
-   include: the path to a CSV file.
    If non-empty, override the CSV in the CodeBlock.
    default: None

When a metadata key is invalid, the default will be used instead.
Note that width and table-width accept fractions as well.

e.g.

```table
---
caption: '*Awesome* **Markdown** Table'
alignment: RC
table-width: 2/3
markdown: True
---
First row,defaulted to be header row,can be disabled
1,cell can contain **markdown**,"It can be arbitrary block element:

- following standard markdown syntax
- like this"
2,"Any markdown syntax, e.g.",$$E = mc^2$$
```
"""

import csv
import fractions
import io
import panflute

import sys
py2 = sys.version_info[0] == 2

# begin helper functions


def to_bool(to_be_bool, default=True):
    """
    Do nothing if to_be_bool is boolean,
    return `True` if it is "true" or "yes" (case-insensitive),
    return `False` if it is "false" or "no" (case-insensitive),
    otherwise return default.
    """
    if isinstance(to_be_bool, bool):
        # nothing to do if already boolean
        return to_be_bool
    else:
        bool_dict = {"false": False, "true": True,
                     "no": False, "yes": True}
        try:
            booled = bool_dict[to_be_bool.lower()]
        except (KeyError, AttributeError):
            booled = default
            panflute.debug("""pantable: invalid boolean. \
Should be true/false/yes/no, case-insensitive. Default is used.""")
    return booled


def get_width(options, number_of_columns):
    """
    get width: set to `None` when

    1. not given
    2. not a list
    3. length not equal to the number of columns
    4. negative entries
    """
    try:
        # if width does not exist, exit immediately through except
        width = options['width']
        assert len(width) == number_of_columns
        custom_float = lambda x: float(fractions.Fraction(x))
        width = [custom_float(x) for x in options['width']]
        assert all(i >= 0 for i in width)
    except KeyError:
        width = None
    except (AssertionError, ValueError, TypeError):
        width = None
        panflute.debug("pantable: invalid width")
    return width


def get_table_width(options):
    """
    `table-width` set to `1.0` if invalid
    """
    try:
        table_width = float(fractions.Fraction(
            (options.get('table-width', 1.0))))
        assert table_width > 0
    except (ValueError, AssertionError, TypeError):
        table_width = 1.0
        panflute.debug("pantable: invalid table-width")
    return table_width
# end helper functions


def auto_width(table_width, number_of_columns, table_list):
    """
    `width` is auto-calculated if not given in YAML
    It also returns None when table is empty.
    """
    # calculate width
    # The +3 matches the way pandoc handles width, see jgm/pandoc commit 0dfceda
    width_abs = [3 + max(
        [max(
            [len(line) for line in row[column_index].split("\n")]
        ) for row in table_list]
    ) for column_index in range(number_of_columns)]
    try:
        width_tot = sum(width_abs)
        # when all are 3 means all are empty, see comment above
        assert width_tot != 3 * number_of_columns
        width = [
            each_width / width_tot * table_width
            for each_width in width_abs
        ]
    except AssertionError:
        width = None
        panflute.debug("pantable: table is empty")
    return width


def parse_alignment(alignment_string, number_of_columns):
    """
    `alignment` string is parsed into pandoc format (AlignDefault, etc.).
    Cases are checked:

    - if not given, return None (let panflute handle it)
    - if wrong type
    - if too long
    - if invalid characters are given
    - if too short
    """
    # alignment string can be None or empty; return None: set to default by
    # panflute
    if not alignment_string:
        return None

    # prepare alignment_string
    try:
        # test valid type
        str_universal = basestring if py2 else str
        if not isinstance(alignment_string, str_universal):
            raise TypeError
        number_of_alignments = len(alignment_string)
        # truncate and debug if too long
        assert number_of_alignments <= number_of_columns
    except TypeError:
        panflute.debug("pantable: alignment string is invalid")
        # return None: set to default by panflute
        return None
    except AssertionError:
        alignment_string = alignment_string[:number_of_columns]
        panflute.debug(
            "pantable: alignment string is too long, truncated instead.")

    # parsing alignment
    align_dict = {'l': "AlignLeft",
                  'c': "AlignCenter",
                  'r': "AlignRight",
                  'd': "AlignDefault"}
    try:
        alignment = [align_dict[i.lower()] for i in alignment_string]
    except KeyError:
        panflute.debug(
            "pantable: alignment: invalid character found, default is used instead.")
        return None

    # fill up with default if too short
    if number_of_columns > number_of_alignments:
        alignment += ["AlignDefault" for __ in range(
            number_of_columns - number_of_alignments)]

    return alignment


def read_data(include, data):
    """
    read csv and return the table in list.
    Return None when the include path is invalid.
    """
    if include is None:
        if py2:
            data = data.encode('utf-8')
        io_universal = io.BytesIO if py2 else io.StringIO
        with io_universal(data) as file:
            raw_table_list = list(csv.reader(file))
    else:
        try:
            with open(str(include)) as file:
                raw_table_list = list(csv.reader(file))
        except IOError:  # FileNotFoundError is not in Python2
            raw_table_list = None
            panflute.debug("pantable: file not found from the path", include)
    return raw_table_list


def regularize_table_list(raw_table_list):
    """
    When the rows are of uneven length, pad each to the length of the longest row.
    """
    length_of_rows = [len(row) for row in raw_table_list]
    number_of_columns = max(length_of_rows)
    try:
        assert all(i == number_of_columns for i in length_of_rows)
        table_list = raw_table_list
    except AssertionError:
        table_list = [
            row + ['' for __ in range(number_of_columns - len(row))] for row in raw_table_list]
        panflute.debug(
            "pantable: table rows are of irregular length. Empty cells appended.")
    return (table_list, number_of_columns)


def parse_table_list(markdown, table_list):
    """
    read table in list and return panflute table format
    """
    # make functions local
    to_table_row = panflute.TableRow
    if markdown:
        to_table_cell = lambda x: panflute.TableCell(*panflute.convert_text(x))
    else:
        to_table_cell = lambda x: panflute.TableCell(
            panflute.Plain(panflute.Str(x)))
    return [to_table_row(*[to_table_cell(x) for x in row]) for row in table_list]


def convert2table(options, data, **__):
    """
    provided to panflute.yaml_filter to parse its content as pandoc table.
    """
    # prepare table in list from data/include
    raw_table_list = read_data(options.get('include', None), data)
    # delete element if table is empty (by returning [])
    # element unchanged if include is invalid (by returning None)
    try:
        assert raw_table_list and raw_table_list is not None
    except AssertionError:
        panflute.debug("pantable: table is empty or include is invalid")
        # [] means delete the current element; None means kept as is
        return raw_table_list
    # regularize table: all rows should have same length
    table_list, number_of_columns = regularize_table_list(raw_table_list)

    # Initialize the `options` output from `panflute.yaml_filter`
    # parse width
    width = get_width(options, number_of_columns)
    # auto-width when width is not specified
    if width is None:
        width = auto_width(get_table_width(
            options), number_of_columns, table_list)
    # delete element if table is empty (by returning [])
    # width remains None only when table is empty
    try:
        assert width is not None
    except AssertionError:
        panflute.debug("pantable: table is empty")
        return []
    # parse alignment
    alignment = parse_alignment(options.get(
        'alignment', None), number_of_columns)
    header = to_bool(options.get('header', True), True)
    markdown = to_bool(options.get('markdown', False), False)

    # get caption: parsed as markdown into panflute AST if non-empty.
    caption = panflute.convert_text(str(options['caption']))[
        0].content if 'caption' in options else None
    # parse list to panflute table
    table_body = parse_table_list(markdown, table_list)
    # extract header row
    header_row = table_body.pop(0) if (
        len(table_body) > 1 and header
    ) else None
    return panflute.Table(
        *table_body,
        caption=caption,
        alignment=alignment,
        width=width,
        header=header_row
    )


def main(_=None):
    """
    Fenced code block with class table will be parsed using
    panflute.yaml_filter with the function convert2table above.
    """
    return panflute.run_filter(
        panflute.yaml_filter,
        tag='table',
        function=convert2table,
        strict_yaml=True
    )

if __name__ == '__main__':
    main()