Completed
Branch master (78a222)
by Chris
14:36
created

_synoname_word_approximation()   F

Complexity

Conditions 73

Size

Total Lines 202
Code Lines 140

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 103
CRAP Score 73.0047

Importance

Changes 0
Metric Value
eloc 140
dl 0
loc 202
ccs 103
cts 104
cp 0.9904
rs 0
c 0
b 0
f 0
cc 73
nop 5
crap 73.0047

How to fix: Long Method, Complexity

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include Extract Method.

Complexity

Complex code units like abydos.distance._synoname._synoname_word_approximation() often do a lot of different things. The advice here is phrased for classes, but it carries over to a long function: to break it down, we need to identify a cohesive component within it. A common approach to finding such a component is to look for fields, methods, or (in a function) local variables that share the same prefixes or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

# -*- coding: utf-8 -*-

# Copyright 2018 by Christopher C. Little.
# This file is part of Abydos.
#
# Abydos is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Abydos is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Abydos. If not, see <http://www.gnu.org/licenses/>.

"""abydos.distance.synoname.

The distance.synoname module implements Synoname.
"""

from __future__ import division, unicode_literals

try:
    from collections.abc import Iterable
except ImportError:  # Python 2 has no collections.abc
    from collections import Iterable

from ._levenshtein import levenshtein
from ._sequence import sim_ratcliff_obershelp

# noinspection PyProtectedMember
from ..fingerprint._synoname import _synoname_special_table, synoname_toolcode

__all__ = ['synoname']


def _synoname_strip_punct(word):
    """Return a word with punctuation stripped out.

    :param word: a word to strip punctuation from
    :returns: The word stripped of punctuation

    >>> _synoname_strip_punct('AB;CD EF-GH$IJ')
    'ABCD EFGHIJ'
    """
    stripped = ''
    for char in word:
        if char not in set(',-./:;"&\'()!{|}?$%*+<=>[\\]^_`~'):
            stripped += char
    return stripped.strip()


# Review (Comprehensibility): This function exceeds the maximum number of
# variables (30/15).
# Review (best-practice): Too many return statements (10/6).
# Review (Coding Style): Wrong hanging indentation before block (add 4
# spaces); reported for the parameter line.
def _synoname_word_approximation(
    src_ln, tar_ln, src_fn='', tar_fn='', features=None
):
    """Return the Synoname word approximation score for two names.

    :param str src_ln: last name of the source
    :param str tar_ln: last name of the target
    :param str src_fn: first name of the source (optional)
    :param str tar_fn: first name of the target (optional)
    :param features: a dict containing special features calculated via
        fingerprint.synoname_toolcode() (optional)
    :returns: The word approximation score
    :rtype: float

    >>> _synoname_word_approximation('Smith Waterman', 'Waterman',
    ... 'Tom Joe Bob', 'Tom Joe')
    0.6
    """
    if features is None:
        features = {}
    if 'src_specials' not in features:
        features['src_specials'] = []
    if 'tar_specials' not in features:
        features['tar_specials'] = []

    src_len_specials = len(features['src_specials'])
    tar_len_specials = len(features['tar_specials'])

    # 1
    # Review (Coding Style): Wrong hanging indentation before block (add 4
    # spaces); reported for the 'roman_conflict' continuation line.
    if ('gen_conflict' in features and features['gen_conflict']) or (
        'roman_conflict' in features and features['roman_conflict']
    ):
        return 0

    # 3 & 7
    full_tar1 = ' '.join((tar_ln, tar_fn)).replace('-', ' ').strip()
    for s_pos, s_type in features['tar_specials']:
        if s_type == 'a':
            full_tar1 = full_tar1[
                : -(1 + len(_synoname_special_table[s_pos][1]))
            ]
        elif s_type == 'b':
            loc = (
                full_tar1.find(' ' + _synoname_special_table[s_pos][1] + ' ')
                + 1
            )
            full_tar1 = (
                full_tar1[:loc]
                + full_tar1[loc + len(_synoname_special_table[s_pos][1]) :]
            )
        elif s_type == 'c':
            full_tar1 = full_tar1[1 + len(_synoname_special_table[s_pos][1]) :]

    full_src1 = ' '.join((src_ln, src_fn)).replace('-', ' ').strip()
    for s_pos, s_type in features['src_specials']:
        if s_type == 'a':
            full_src1 = full_src1[
                : -(1 + len(_synoname_special_table[s_pos][1]))
            ]
        elif s_type == 'b':
            loc = (
                full_src1.find(' ' + _synoname_special_table[s_pos][1] + ' ')
                + 1
            )
            full_src1 = (
                full_src1[:loc]
                + full_src1[loc + len(_synoname_special_table[s_pos][1]) :]
            )
        elif s_type == 'c':
            full_src1 = full_src1[1 + len(_synoname_special_table[s_pos][1]) :]

    full_tar2 = full_tar1
    for s_pos, s_type in features['tar_specials']:
        if s_type == 'd':
            full_tar2 = full_tar2[len(_synoname_special_table[s_pos][1]) :]
        elif s_type == 'X' and _synoname_special_table[s_pos][1] in full_tar2:
            loc = full_tar2.find(' ' + _synoname_special_table[s_pos][1])
            full_tar2 = (
                full_tar2[:loc]
                + full_tar2[loc + len(_synoname_special_table[s_pos][1]) :]
            )

    full_src2 = full_src1
    for s_pos, s_type in features['src_specials']:
        if s_type == 'd':
            full_src2 = full_src2[len(_synoname_special_table[s_pos][1]) :]
        elif s_type == 'X' and _synoname_special_table[s_pos][1] in full_src2:
            loc = full_src2.find(' ' + _synoname_special_table[s_pos][1])
            full_src2 = (
                full_src2[:loc]
                + full_src2[loc + len(_synoname_special_table[s_pos][1]) :]
            )

    full_tar1 = _synoname_strip_punct(full_tar1)
    tar1_words = full_tar1.split()
    tar1_num_words = len(tar1_words)

    full_src1 = _synoname_strip_punct(full_src1)
    src1_words = full_src1.split()
    src1_num_words = len(src1_words)

    full_tar2 = _synoname_strip_punct(full_tar2)
    tar2_words = full_tar2.split()
    tar2_num_words = len(tar2_words)

    full_src2 = _synoname_strip_punct(full_src2)
    src2_words = full_src2.split()
    src2_num_words = len(src2_words)

    # 2
    # Review (Coding Style): Wrong hanging indentation before block (add 4
    # spaces); reported for each of the 4 condition lines below.
    if (
        src1_num_words < 2
        and src_len_specials == 0
        and src2_num_words < 2
        and tar_len_specials == 0
    ):
        return 0

    # 4
    # Review (Coding Style): Wrong hanging indentation before block (add 4
    # spaces); reported for each of the 3 condition lines below.
    if (
        tar1_num_words == 1
        and src1_num_words == 1
        and tar1_words[0] == src1_words[0]
    ):
        return 1
    if tar1_num_words < 2 and tar_len_specials == 0:
        return 0

    # 5
    last_found = False
    for word in tar1_words:
        if src_ln.endswith(word) or word + ' ' in src_ln:
            last_found = True

    if not last_found:
        for word in src1_words:
            if tar_ln.endswith(word) or word + ' ' in tar_ln:
                last_found = True

    # 6
    matches = 0
    if last_found:
        for i, s_word in enumerate(src1_words):
            for j, t_word in enumerate(tar1_words):
                if s_word == t_word:
                    src1_words[i] = '@'
                    tar1_words[j] = '@'
                    matches += 1
    w_ratio = matches / max(tar1_num_words, src1_num_words)
    # Review (best-practice): Too many boolean expressions in if statement
    # (6/5).
    # Review (Coding Style): Wrong hanging indentation before block (add 4
    # spaces); reported for each of the 4 condition lines below.
    if matches > 1 or (
        matches == 1
        and src1_num_words == 1
        and tar1_num_words == 1
        and (tar_len_specials > 0 or src_len_specials > 0)
    ):
        return w_ratio

    # 8
    # Review (Coding Style): Wrong hanging indentation before block (add 4
    # spaces); reported for each of the 3 condition lines below.
    if (
        tar2_num_words == 1
        and src2_num_words == 1
        and tar2_words[0] == src2_words[0]
    ):
        return 1
    # I see no way that the following can be True if the equivalent in
    # #4 was False.
    if tar2_num_words < 2 and tar_len_specials == 0:  # pragma: no cover
        return 0

    # 9
    last_found = False
    for word in tar2_words:
        if src_ln.endswith(word) or word + ' ' in src_ln:
            last_found = True

    if not last_found:
        for word in src2_words:
            if tar_ln.endswith(word) or word + ' ' in tar_ln:
                last_found = True

    if not last_found:
        return 0

    # 10
    matches = 0
    if last_found:
        for i, s_word in enumerate(src2_words):
            for j, t_word in enumerate(tar2_words):
                if s_word == t_word:
                    src2_words[i] = '@'
                    tar2_words[j] = '@'
                    matches += 1
    w_ratio = matches / max(tar2_num_words, src2_num_words)
    # Review (best-practice): Too many boolean expressions in if statement
    # (6/5).
    # Review (Coding Style): Wrong hanging indentation before block (add 4
    # spaces); reported for each of the 4 condition lines below.
    if matches > 1 or (
        matches == 1
        and src2_num_words == 1
        and tar2_num_words == 1
        and (tar_len_specials > 0 or src_len_specials > 0)
    ):
        return w_ratio

    return 0


# Review (best-practice): Too many arguments (6/5).
# Review (Comprehensibility): This function exceeds the maximum number of
# variables (45/15).
# Review (best-practice): Too many return statements (18/6).
# Review (Coding Style): Wrong hanging indentation before block (add 4
# spaces); reported for each of the 6 parameter lines below.
def synoname(
    src,
    tar,
    word_approx_min=0.3,
    char_approx_min=0.73,
    tests=2 ** 12 - 1,
    ret_name=False,
):
    """Return the Synoname similarity type of two words.

    Cf. :cite:`Getty:1991,Gross:1991`

    :param str src: source string for comparison
    :param str tar: target string for comparison
    :param float word_approx_min: the minimum word approximation value to
        signal a 'word_approx' match
    :param float char_approx_min: the minimum character approximation value to
        signal a 'char_approx' match
    :param int or Iterable tests: either an integer indicating tests to
        perform or a list of test names to perform (defaults to performing all
        tests)
    :param bool ret_name: if True, returns the match name rather than its
        integer equivalent
    :returns: Synoname value
    :rtype: int (or str if ret_name is True)

    >>> synoname(('Breghel', 'Pieter', ''), ('Brueghel', 'Pieter', ''))
    2
    >>> synoname(('Breghel', 'Pieter', ''), ('Brueghel', 'Pieter', ''),
    ... ret_name=True)
    'omission'
    >>> synoname(('Dore', 'Gustave', ''),
    ... ('Dore', 'Paul Gustave Louis Christophe', ''),
    ... ret_name=True)
    'inclusion'
    >>> synoname(('Pereira', 'I. R.', ''), ('Pereira', 'I. Smith', ''),
    ... ret_name=True)
    'word_approx'
    """
    test_dict = {
        val: 2 ** n
        for n, val in enumerate(
            [
                'exact',
                'omission',
                'substitution',
                'transposition',
                'punctuation',
                'initials',
                'extension',
                'inclusion',
                'no_first',
                'word_approx',
                'confusions',
                'char_approx',
            ]
        )
    }
    match_name = [
        '',
        'exact',
        'omission',
        'substitution',
        'transposition',
        'punctuation',
        'initials',
        'extension',
        'inclusion',
        'no_first',
        'word_approx',
        'confusions',
        'char_approx',
        'no_match',
    ]
    match_type_dict = {val: n for n, val in enumerate(match_name)}

    if isinstance(tests, Iterable):
        new_tests = 0
        for term in tests:
            if term in test_dict:
                new_tests += test_dict[term]
        tests = new_tests

    if isinstance(src, tuple):
        src_ln, src_fn, src_qual = src
    elif '#' in src:
        src_ln, src_fn, src_qual = src.split('#')[-3:]
    else:
        src_ln, src_fn, src_qual = src, '', ''

    if isinstance(tar, tuple):
        tar_ln, tar_fn, tar_qual = tar
    elif '#' in tar:
        tar_ln, tar_fn, tar_qual = tar.split('#')[-3:]
    else:
        tar_ln, tar_fn, tar_qual = tar, '', ''

    def _split_special(spec):
        spec_list = []
        while spec:
            spec_list.append((int(spec[:3]), spec[3:4]))
            spec = spec[4:]
        return spec_list

    def _fmt_retval(val):
        if ret_name:
            return match_name[val]
        return val

    # 1. Preprocessing

    # Lowercasing
    src_fn = src_fn.strip().lower()
    src_ln = src_ln.strip().lower()
    src_qual = src_qual.strip().lower()

    tar_fn = tar_fn.strip().lower()
    tar_ln = tar_ln.strip().lower()
    tar_qual = tar_qual.strip().lower()

    # Create toolcodes
    src_ln, src_fn, src_tc = synoname_toolcode(src_ln, src_fn, src_qual)
    tar_ln, tar_fn, tar_tc = synoname_toolcode(tar_ln, tar_fn, tar_qual)

    src_generation = int(src_tc[2])
    src_romancode = int(src_tc[3:6])
    src_len_fn = int(src_tc[6:8])
    src_tc = src_tc.split('$')
    src_specials = _split_special(src_tc[1])

    tar_generation = int(tar_tc[2])
    tar_romancode = int(tar_tc[3:6])
    tar_len_fn = int(tar_tc[6:8])
    tar_tc = tar_tc.split('$')
    tar_specials = _split_special(tar_tc[1])

    gen_conflict = (src_generation != tar_generation) and bool(
        src_generation or tar_generation
    )
    roman_conflict = (src_romancode != tar_romancode) and bool(
        src_romancode or tar_romancode
    )

    ln_equal = src_ln == tar_ln
    fn_equal = src_fn == tar_fn

    # approx_c
    def _approx_c():
        if gen_conflict or roman_conflict:
            return False, 0

        full_src = ' '.join((src_ln, src_fn))
        if full_src.startswith('master '):
            full_src = full_src[len('master ') :]
            # Review (Coding Style): Wrong hanging indentation before block
            # (add 4 spaces); reported for each of the 5 list items below.
            for intro in [
                'of the ',
                'of ',
                'known as the ',
                'with the ',
                'with ',
            ]:
                if full_src.startswith(intro):
                    full_src = full_src[len(intro) :]

        full_tar = ' '.join((tar_ln, tar_fn))
        if full_tar.startswith('master '):
            full_tar = full_tar[len('master ') :]
            # Review (Coding Style): Wrong hanging indentation before block
            # (add 4 spaces); reported for each of the 5 list items below.
            for intro in [
                'of the ',
                'of ',
                'known as the ',
                'with the ',
                'with ',
            ]:
                if full_tar.startswith(intro):
                    full_tar = full_tar[len(intro) :]

        loc_ratio = sim_ratcliff_obershelp(full_src, full_tar)
        return loc_ratio >= char_approx_min, loc_ratio

    # Review (Unused Code): The variable approx_c_result seems to be unused.
    approx_c_result, ca_ratio = _approx_c()

    if tests & test_dict['exact'] and fn_equal and ln_equal:
        return _fmt_retval(match_type_dict['exact'])
    if tests & test_dict['omission']:
        if fn_equal and levenshtein(src_ln, tar_ln, cost=(1, 1, 99, 99)) == 1:
            if not roman_conflict:
                return _fmt_retval(match_type_dict['omission'])
        # Review (Coding Style): Wrong hanging indentation before block (add
        # 4 spaces); reported for the condition line below.
        elif (
            ln_equal and levenshtein(src_fn, tar_fn, cost=(1, 1, 99, 99)) == 1
        ):
            return _fmt_retval(match_type_dict['omission'])
    if tests & test_dict['substitution']:
        if fn_equal and levenshtein(src_ln, tar_ln, cost=(99, 99, 1, 99)) == 1:
            return _fmt_retval(match_type_dict['substitution'])
        # Review (Coding Style): Wrong hanging indentation before block (add
        # 4 spaces); reported for the condition line below.
        elif (
            ln_equal and levenshtein(src_fn, tar_fn, cost=(99, 99, 1, 99)) == 1
        ):
            return _fmt_retval(match_type_dict['substitution'])
    if tests & test_dict['transposition']:
        # Review (Coding Style): Wrong hanging indentation before block (add
        # 4 spaces); reported for the levenshtein() line in each branch below.
        if fn_equal and (
            levenshtein(src_ln, tar_ln, mode='osa', cost=(99, 99, 99, 1)) == 1
        ):
            return _fmt_retval(match_type_dict['transposition'])
        elif ln_equal and (
            levenshtein(src_fn, tar_fn, mode='osa', cost=(99, 99, 99, 1)) == 1
        ):
            return _fmt_retval(match_type_dict['transposition'])
    if tests & test_dict['punctuation']:
        np_src_fn = _synoname_strip_punct(src_fn)
        np_tar_fn = _synoname_strip_punct(tar_fn)
        np_src_ln = _synoname_strip_punct(src_ln)
        np_tar_ln = _synoname_strip_punct(tar_ln)

        if (np_src_fn == np_tar_fn) and (np_src_ln == np_tar_ln):
            return _fmt_retval(match_type_dict['punctuation'])

        np_src_fn = _synoname_strip_punct(src_fn.replace('-', ' '))
        np_tar_fn = _synoname_strip_punct(tar_fn.replace('-', ' '))
        np_src_ln = _synoname_strip_punct(src_ln.replace('-', ' '))
        np_tar_ln = _synoname_strip_punct(tar_ln.replace('-', ' '))

        if (np_src_fn == np_tar_fn) and (np_src_ln == np_tar_ln):
            return _fmt_retval(match_type_dict['punctuation'])

    if tests & test_dict['initials'] and ln_equal:
        if src_fn and tar_fn:
            src_initials = _synoname_strip_punct(src_fn).split()
            tar_initials = _synoname_strip_punct(tar_fn).split()
            initials = bool(
                (len(src_initials) == len(''.join(src_initials)))
                or (len(tar_initials) == len(''.join(tar_initials)))
            )
            if initials:
                src_initials = ''.join(_[0] for _ in src_initials)
                tar_initials = ''.join(_[0] for _ in tar_initials)
                if src_initials == tar_initials:
                    return _fmt_retval(match_type_dict['initials'])
                initial_diff = abs(len(src_initials) - len(tar_initials))
                # Review (Coding Style): Wrong hanging indentation before
                # block (add 4 spaces); reported for the opening line of each
                # of the 2 parenthesized operands below.
                if initial_diff and (
                    (
                        initial_diff
                        == levenshtein(
                            src_initials, tar_initials, cost=(1, 99, 99, 99)
                        )
                    )
                    or (
                        initial_diff
                        == levenshtein(
                            tar_initials, src_initials, cost=(1, 99, 99, 99)
                        )
                    )
                ):
                    return _fmt_retval(match_type_dict['initials'])
    if tests & test_dict['extension']:
        # Review (Coding Style): Wrong hanging indentation before block (add
        # 4 spaces); reported for the startswith() continuation line.
        if src_ln[1] == tar_ln[1] and (
            src_ln.startswith(tar_ln) or tar_ln.startswith(src_ln)
        ):
            # Review (best-practice): Too many boolean expressions in if
            # statement (7/5).
            # Review (Coding Style): Wrong hanging indentation before block
            # (add 4 spaces); reported for each of the 3 condition lines below.
            if (
                (not src_len_fn and not tar_len_fn)
                or (tar_fn and src_fn.startswith(tar_fn))
                or (src_fn and tar_fn.startswith(src_fn))
            ) and not roman_conflict:
                return _fmt_retval(match_type_dict['extension'])
    if tests & test_dict['inclusion'] and ln_equal:
        if (src_fn and src_fn in tar_fn) or (tar_fn and tar_fn in src_ln):
            return _fmt_retval(match_type_dict['inclusion'])
    if tests & test_dict['no_first'] and ln_equal:
        if src_fn == '' or tar_fn == '':
            return _fmt_retval(match_type_dict['no_first'])
    if tests & test_dict['word_approx']:
        ratio = _synoname_word_approximation(
            src_ln,
            tar_ln,
            src_fn,
            tar_fn,
            {
                'gen_conflict': gen_conflict,
                'roman_conflict': roman_conflict,
                'src_specials': src_specials,
                'tar_specials': tar_specials,
            },
        )
        if ratio == 1 and tests & test_dict['confusions']:
            # Review (Coding Style): Wrong hanging indentation before block
            # (add 4 spaces); reported for each of the 2 condition lines below.
            if (
                ' '.join((src_fn, src_ln)).strip()
                == ' '.join((tar_fn, tar_ln)).strip()
            ):
                return _fmt_retval(match_type_dict['confusions'])
        if ratio >= word_approx_min:
            return _fmt_retval(match_type_dict['word_approx'])
    if tests & test_dict['char_approx']:
        if ca_ratio >= char_approx_min:
            return _fmt_retval(match_type_dict['char_approx'])
    return _fmt_retval(match_type_dict['no_match'])


if __name__ == '__main__':
    import doctest

    doctest.testmod()