abydos.fingerprint.synoname.synoname_toolcode() - Code Metrics - Inspection of "Merge pull request #120 from chrislit/modularize" - chrislit/abydos - Measure and Improve Code Quality continuously with Scrutinizer

Test Failed

Push — master ( 64abe2...a464fa )

by Chris

created 2018-10-19 22:32 UTC

abydos.fingerprint.synoname.synoname_toolcode() F

↳ Parent: abydos.fingerprint.synoname

Complexity

Conditions

Size

Total Lines	195
Code Lines	132

Duplication

Lines	0
Ratio	0 %

Importance

Changes

Metric	Value
cc	51
eloc	132
nop	4
dl	0
loc	195
rs	0
c	0
b	0
f	0

How to fix Long Method Complexity

# -*- coding: utf-8 -*-

# Copyright 2018 by Christopher C. Little.
# This file is part of Abydos.
#
# Abydos is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Abydos is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Abydos. If not, see <http://www.gnu.org/licenses/>.

"""abydos.fingerprint.

The fingerprint.synoname module implements the Synoname toolcode.
"""

from __future__ import unicode_literals


__all__ = ['synoname_toolcode']

_synoname_special_table = (
    # Roman, match, extra, method
    (False, 'NONE', '', 0),
    (False, 'aine', '', 3),
    (False, 'also erroneously', '', 4),
    (False, 'also identified with the', '', 2),
    (False, 'also identified with', '', 2),
    (False, 'archbishop', '', 7),
    (False, 'atelier', '', 7),
    (False, 'baron', '', 7),
    (False, 'cadet', '', 3),
    (False, 'cardinal', '', 7),
    (False, 'circle of', '', 5),
    (False, 'circle', '', 5),
    (False, 'class of', '', 5),
    (False, 'conde de', '', 7),
    (False, 'countess', '', 7),
    (False, 'count', '', 7),
    (False, "d'", " d'", 15),
    (False, 'dai', '', 15),
    (False, "dall'", " dall'", 15),
    (False, 'dalla', '', 15),
    (False, 'dalle', '', 15),
    (False, 'dal', '', 15),
    (False, 'da', '', 15),
    (False, 'degli', '', 15),
    (False, 'della', '', 15),
    (False, 'del', '', 15),
    (False, 'den', '', 15),
    (False, 'der altere', '', 3),
    (False, 'der jungere', '', 3),
    (False, 'der', '', 15),
    (False, 'de la', '', 15),
    (False, 'des', '', 15),
    (False, "de'", " de'", 15),
    (False, 'de', '', 15),
    (False, 'di ser', '', 7),
    (False, 'di', '', 15),
    (False, 'dos', '', 15),
    (False, 'du', '', 15),
    (False, 'duke of', '', 7),
    (False, 'earl of', '', 7),
    (False, 'el', '', 15),
    (False, 'fils', '', 3),
    (False, 'florentine follower of', '', 5),
    (False, 'follower of', '', 5),
    (False, 'fra', '', 7),
    (False, 'freiherr von', '', 7),
    (False, 'giovane', '', 7),
    (False, 'group', '', 5),
    (True, 'iii', '', 3),
    (True, 'ii', '', 3),
    (False, 'il giovane', '', 7),
    (False, 'il vecchio', '', 7),
    (False, 'il', '', 15),
    (False, "in't", '', 7),
    (False, 'in het', '', 7),
    (True, 'iv', '', 3),
    (True, 'ix', '', 3),
    (True, 'i', '', 3),
    (False, 'jr.', '', 3),
    (False, 'jr', '', 3),
    (False, 'juniore', '', 3),
    (False, 'junior', '', 3),
    (False, 'king of', '', 7),
    (False, "l'", " l'", 15),
    (False, "l'aine", '', 3),
    (False, 'la', '', 15),
    (False, 'le jeune', '', 3),
    (False, 'le', '', 15),
    (False, 'lo', '', 15),
    (False, 'maestro', '', 7),
    (False, 'maitre', '', 7),
    (False, 'marchioness', '', 7),
    (False, 'markgrafin von', '', 7),
    (False, 'marquess', '', 7),
    (False, 'marquis', '', 7),
    (False, 'master of the', '', 7),
    (False, 'master of', '', 7),
    (False, 'master known as the', '', 7),
    (False, 'master with the', '', 7),
    (False, 'master with', '', 7),
    (False, 'masters', '', 7),
    (False, 'master', '', 7),
    (False, 'meister', '', 7),
    (False, 'met de', '', 7),
    (False, 'met', '', 7),
    (False, 'mlle.', '', 7),
    (False, 'mlle', '', 7),
    (False, 'monogrammist', '', 7),
    (False, 'monsu', '', 7),
    (False, 'nee', '', 2),
    (False, 'of', '', 3),
    (False, 'oncle', '', 3),
    (False, 'op den', '', 15),
    (False, 'op de', '', 15),
    (False, 'or', '', 2),
    (False, 'over den', '', 15),
    (False, 'over de', '', 15),
    (False, 'over', '', 7),
    (False, 'p.re', '', 7),
    (False, 'p.r.a.', '', 1),
    (False, 'padre', '', 7),
    (False, 'painter', '', 7),
    (False, 'pere', '', 3),
    (False, 'possibly identified with', '', 6),
    (False, 'possibly', '', 6),
    (False, 'pseudo', '', 15),
    (False, 'r.a.', '', 1),
    (False, 'reichsgraf von', '', 7),
    (False, 'ritter von', '', 7),
    (False, 'sainte-', ' sainte-', 8),
    (False, 'sainte', '', 7),
    (False, 'saint-', ' saint-', 8),
    (False, 'saint', '', 7),
    (False, 'santa', '', 15),
    (False, "sant'", " sant'", 15),
    (False, 'san', '', 15),
    (False, 'ser', '', 7),
    (False, 'seniore', '', 3),
    (False, 'senior', '', 3),
    (False, 'sir', '', 5),
    (False, 'sr.', '', 3),
    (False, 'sr', '', 3),
    (False, 'ss.', ' ss.', 14),
    (False, 'ss', '', 6),
    (False, 'st-', ' st-', 8),
    (False, 'st.', ' st.', 15),
    (False, 'ste-', ' ste-', 8),
    (False, 'ste.', ' ste.', 15),
    (False, 'studio', '', 7),
    (False, 'sub-group', '', 5),
    (False, 'sultan of', '', 7),
    (False, 'ten', '', 15),
    (False, 'ter', '', 15),
    (False, 'the elder', '', 3),
    (False, 'the younger', '', 3),
    (False, 'the', '', 7),
    (False, 'tot', '', 15),
    (False, 'unidentified', '', 1),
    (False, 'van den', '', 15),
    (False, 'van der', '', 15),
    (False, 'van de', '', 15),
    (False, 'vanden', '', 15),
    (False, 'vander', '', 15),
    (False, 'van', '', 15),
    (False, 'vecchia', '', 7),
    (False, 'vecchio', '', 7),
    (True, 'viii', '', 3),
    (True, 'vii', '', 3),
    (True, 'vi', '', 3),
    (True, 'v', '', 3),
    (False, 'vom', '', 7),
    (False, 'von', '', 15),
    (False, 'workshop', '', 7),
    (True, 'xiii', '', 3),
    (True, 'xii', '', 3),
    (True, 'xiv', '', 3),
    (True, 'xix', '', 3),
    (True, 'xi', '', 3),
    (True, 'xviii', '', 3),
    (True, 'xvii', '', 3),
    (True, 'xvi', '', 3),
    (True, 'xv', '', 3),
    (True, 'xx', '', 3),
    (True, 'x', '', 3),
    (False, 'y', '', 7)
)


def synoname_toolcode(lname, fname='', qual='', normalize=0):
    """Build the Synoname toolcode.

    Cf. :cite:`Getty:1991,Gross:1991`.

    :param str lname: last name
    :param str fname: first name (can be blank)
    :param str qual: qualifier
    :param int normalize: normalization mode (0, 1, or 2)
    :returns: the transformed last and first names and the synoname toolcode
    :rtype: tuple

    >>> synoname_toolcode('hat')
    ('hat', '', '0000000003$$h')
    >>> synoname_toolcode('niall')
    ('niall', '', '0000000005$$n')
    >>> synoname_toolcode('colin')
    ('colin', '', '0000000005$$c')
    >>> synoname_toolcode('atcg')
    ('atcg', '', '0000000004$$a')
    >>> synoname_toolcode('entreatment')
    ('entreatment', '', '0000000011$$e')

    >>> synoname_toolcode('Ste.-Marie', 'Count John II', normalize=2)
    ('ste.-marie ii', 'count john', '0200491310$015b049a127c$smcji')
    >>> synoname_toolcode('Michelangelo IV', '', 'Workshop of')
    ('michelangelo iv', '', '3000550015$055b$mi')
    """
    method_dict = {'end': 1, 'middle': 2, 'beginning': 4,
                   'beginning_no_space': 8}

    lname = lname.lower()
    fname = fname.lower()
    qual = qual.lower()

    # Start with the basic code
    toolcode = ['0', '0', '0', '000', '00', '00', '$', '', '$', '']

    full_name = ' '.join((lname, fname))

    # Fill field 0 (qualifier)
    qual_3 = {'adaptation after', 'after', 'assistant of', 'assistants of',
              'circle of', 'follower of', 'imitator of', 'in the style of',
              'manner of', 'pupil of', 'school of', 'studio of',
              'style of', 'workshop of'}
    qual_2 = {'copy after', 'copy after?', 'copy of'}
    qual_1 = {'ascribed to', 'attributed to or copy after',
              'attributed to', 'possibly'}

    if qual in qual_3:
        toolcode[0] = '3'
    elif qual in qual_2:
        toolcode[0] = '2'
    elif qual in qual_1:
        toolcode[0] = '1'

    # Fill field 1 (punctuation)
    if '.' in full_name:
        toolcode[1] = '2'
    else:
        for punct in ',-/:;"&\'()!{|}?$%*+<=>[\\]^_`~':
            if punct in full_name:
                toolcode[1] = '1'
                break

    # Fill field 2 (generation)
    gen_1 = ('the elder', ' sr.', ' sr', 'senior', 'der altere', 'il vecchio',
             "l'aine", 'p.re', 'padre', 'seniore', 'vecchia', 'vecchio')
    gen_2 = (' jr.', ' jr', 'der jungere', 'il giovane', 'giovane', 'juniore',
             'junior', 'le jeune', 'the younger')

    elderyounger = ''  # save elder/younger for possible movement later
    for gen in gen_1:
        if gen in full_name:
            toolcode[2] = '1'
            elderyounger = gen
            break
    else:
        for gen in gen_2:
            if gen in full_name:
                toolcode[2] = '2'
                elderyounger = gen
                break

    # do comma flip
    if normalize:
        comma = lname.find(',')
        if comma != -1:
            # lstrip avoids an IndexError when nothing follows the comma
            lname_end = lname[comma + 1:].lstrip(' ,')
            fname = lname_end + ' ' + fname
            lname = lname[:comma].strip()

    # do elder/younger move
    if normalize == 2 and elderyounger:
        elderyounger_loc = fname.find(elderyounger)
        if elderyounger_loc != -1:
            lname = ' '.join((lname, elderyounger.strip()))
            fname = ' '.join((fname[:elderyounger_loc].strip(),
                              fname[elderyounger_loc +
                                    len(elderyounger):])).strip()

    toolcode[4] = '{:02d}'.format(len(fname))
    toolcode[5] = '{:02d}'.format(len(lname))

    # strip punctuation
    for char in ',/:;"&()!{|}?$%*+<=>[\\]^_`~':
        full_name = full_name.replace(char, '')
    for pos, char in enumerate(full_name):
        if char == '-' and full_name[pos - 1:pos + 2] != 'b-g':
            full_name = full_name[:pos] + ' ' + full_name[pos + 1:]

    # Fill field 9 (search range)
    for letter in [_[0] for _ in full_name.split()]:
        if letter not in toolcode[9]:
            toolcode[9] += letter
        if len(toolcode[9]) == 15:
            break

    def roman_check(numeral, fname, lname):
        """Move Roman numerals from first name to last."""
        loc = fname.find(numeral)
        # Parenthesized so that a numeral absent from fname (loc == -1)
        # is never treated as a match.
        if loc != -1 and (len(fname[loc:]) == len(numeral) or
                          fname[loc + len(numeral)] in {' ', ','}):
            lname = ' '.join((lname, numeral))
            fname = ' '.join((fname[:loc].strip(),
                              fname[loc + len(numeral):].lstrip(' ,')))
        return fname.strip(), lname.strip()

    # Fill fields 7 (specials) and 3 (roman numerals)
    for num, special in enumerate(_synoname_special_table):

        roman, match, extra, method = special
        if method & method_dict['end']:
            match_context = ' ' + match
            loc = full_name.find(match_context)
            if ((len(full_name) > len(match_context)) and
                    (loc == len(full_name) - len(match_context))):
                if roman:
                    if not any(abbr in fname for abbr in ('i.', 'v.', 'x.')):
                        full_name = full_name[:loc]
                        toolcode[7] += '{:03d}'.format(num) + 'a'
                        if toolcode[3] == '000':
                            toolcode[3] = '{:03d}'.format(num)
                        if normalize == 2:
                            fname, lname = roman_check(match, fname, lname)
                else:
                    full_name = full_name[:loc]
                    toolcode[7] += '{:03d}'.format(num) + 'a'
        if method & method_dict['middle']:
            match_context = ' ' + match + ' '
            loc = 0
            while loc != -1:
                loc = full_name.find(match_context, loc+1)
                if loc > 0:
                    if roman:
                        if not any(abbr in fname for abbr in
                                   ('i.', 'v.', 'x.')):
                            full_name = (full_name[:loc] +
                                         full_name[loc + len(match) + 1:])
                            toolcode[7] += '{:03d}'.format(num) + 'b'
                            if toolcode[3] == '000':
                                toolcode[3] = '{:03d}'.format(num)
                            if normalize == 2:
                                fname, lname = roman_check(match, fname, lname)
                    else:
                        full_name = (full_name[:loc] +
                                     full_name[loc + len(match) + 1:])
                        toolcode[7] += '{:03d}'.format(num) + 'b'
        if method & method_dict['beginning']:
            match_context = match + ' '
            loc = full_name.find(match_context)
            if loc == 0:
                full_name = full_name[len(match) + 1:]
                toolcode[7] += '{:03d}'.format(num) + 'c'
        if method & method_dict['beginning_no_space']:
            loc = full_name.find(match)
            if loc == 0:
                toolcode[7] += '{:03d}'.format(num) + 'd'
                if full_name[:len(match)] not in toolcode[9]:
                    toolcode[9] += full_name[:len(match)]

        if extra:
            loc = full_name.find(extra)
            if loc != -1:
                toolcode[7] += '{:03d}'.format(num) + 'X'
                # Since extras are unique, we only look for each of them
                # once, and they include otherwise impossible characters for
                # this field, it's not possible for the following line to have
                # ever been false.
                # if full_name[loc:loc+len(extra)] not in toolcode[9]:
                toolcode[9] += full_name[loc:loc+len(match)]

    return lname, fname, ''.join(toolcode)


if __name__ == '__main__':
    import doctest
    doctest.testmod()
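
The returned toolcode is the concatenation of ten fields, as the initial list ['0', '0', '0', '000', '00', '00', '$', '', '$', ''] in the listing shows: six fixed-width fields followed by two variable-length fields, each preceded by '$'. The following is a minimal sketch of splitting a toolcode back into those fields. The names split_toolcode and ToolcodeFields are hypothetical and not part of Abydos; the field labels follow the comments in the listing, and fields 4 and 5 are labeled by what the code stores in them.

```python
from collections import namedtuple

# Hypothetical container; the labels mirror the "Fill field N" comments.
ToolcodeFields = namedtuple(
    'ToolcodeFields',
    ['qualifier', 'punctuation', 'generation', 'roman',
     'fname_len', 'lname_len', 'specials', 'search_range'],
)


def split_toolcode(toolcode):
    """Split a toolcode string into its fields (illustrative helper)."""
    # Fields 6 and 8 are literal '$' separators; the fixed-width head and
    # the specials field consist only of digits and letters.
    head, specials, search_range = toolcode.split('$', 2)
    if len(head) != 10:
        raise ValueError('expected 10 characters before the first "$"')
    return ToolcodeFields(
        qualifier=head[0],
        punctuation=head[1],
        generation=head[2],
        roman=head[3:6],
        fname_len=head[6:8],
        lname_len=head[8:10],
        specials=specials,
        search_range=search_range,
    )
```

Applied to the docstring example '0200491310$015b049a127c$smcji', this separates the Roman numeral field '049' from the specials field '015b049a127c' and the search range 'smcji'.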


Coding Style / Naming (introduced 2018-10-19 22:31 UTC), at the definition of _synoname_special_table: Constant name "_synoname_special_table" doesn't conform to UPPER_CASE naming style ('(([A-Z_][A-Z0-9_]*)|(__.*__))$' pattern). This check looks for invalid names for a range of different identifiers. You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements. If your project includes a Pylint configuration file, the settings contained in that file take precedence.
Comprehensibility (introduced 2018-09-03 11:25 UTC), at def synoname_toolcode(lname, fname='', qual='', normalize=0): This function exceeds the maximum number of variables (30/15).
Coding Style (introduced 2018-09-25 05:30 UTC), at the continuation line of the method_dict definition ('beginning_no_space': 8}): Wrong continued indentation (remove 10 spaces).
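
The method_dict flagged here defines bit flags (1, 2, 4, 8), and the fourth column of _synoname_special_table combines them, so a value such as 7 enables the 'end', 'middle' and 'beginning' checks at once. The following is a small sketch of decoding such a value; decode_method and METHOD_FLAGS are hypothetical names used only for illustration.

```python
# Same names and bit values as method_dict in the listing, kept in a
# fixed order so the decoded list is deterministic.
METHOD_FLAGS = (
    ('end', 1),
    ('middle', 2),
    ('beginning', 4),
    ('beginning_no_space', 8),
)


def decode_method(method):
    """Return the names of the match positions enabled by a method value."""
    return [name for name, bit in METHOD_FLAGS if method & bit]
```

For example, the table entry (False, 'aine', '', 3) decodes to 'end' and 'middle', while 15 enables all four positions.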
unused-code (introduced 2018-09-25 05:30 UTC), at for num, special in enumerate(_synoname_special_table): Too many nested blocks (6/5).
unused-code (introduced 2018-10-05 08:54 UTC), at the same loop: Too many nested blocks (7/5).
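
Part of the nesting reported for this loop comes from repeating the position test and the Roman numeral abbreviation guard inline in both the 'end' and 'middle' branches. The following is a minimal sketch of pulling those two tests out into helpers, which would let the loop body use flat guard conditions. The helper names are hypothetical and not part of Abydos; find_end_match deliberately keeps the first-occurrence find() semantics of the listing.

```python
ROMAN_ABBREVIATIONS = ('i.', 'v.', 'x.')


def has_roman_abbreviation(fname):
    """Return True if fname contains an abbreviation such as 'v.'."""
    return any(abbr in fname for abbr in ROMAN_ABBREVIATIONS)


def find_end_match(full_name, match):
    """Return the index at which ' ' + match ends full_name, or -1.

    As in the listing, only the first occurrence of ' ' + match is
    considered, and full_name must be longer than the matched context.
    """
    match_context = ' ' + match
    loc = full_name.find(match_context)
    if (len(full_name) > len(match_context) and
            loc == len(full_name) - len(match_context)):
        return loc
    return -1
```

In the 'end' branch, the outer condition could then read: loc = find_end_match(full_name, match), followed by a single test such as "if loc != -1 and not (roman and has_roman_abbreviation(fname))".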
