Completed
Pull Request — master (#141)
by Chris
11:42
created

abydos.compression._BWT.BWT.encode()   A

Complexity

Conditions 4

Size

Total Lines 38
Code Lines 11

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 8
CRAP Score 4

Importance

Changes 0
Metric Value
cc 4
eloc 11
nop 3
dl 0
loc 38
ccs 8
cts 8
cp 1
crap 4
rs 9.85
c 0
b 0
f 0
1
# -*- coding: utf-8 -*-
0 ignored issues
show
Coding Style Naming introduced by
The name _BWT does not conform to the module naming conventions ((([a-z_][a-z0-9_]*)|([A-Z][a-zA-Z0-9]+))$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.

Loading history...
2
3
# Copyright 2014-2018 by Christopher C. Little.
4
# This file is part of Abydos.
5
#
6
# Abydos is free software: you can redistribute it and/or modify
7
# it under the terms of the GNU General Public License as published by
8
# the Free Software Foundation, either version 3 of the License, or
9
# (at your option) any later version.
10
#
11
# Abydos is distributed in the hope that it will be useful,
12
# but WITHOUT ANY WARRANTY; without even the implied warranty of
13
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
14
# GNU General Public License for more details.
15
#
16
# You should have received a copy of the GNU General Public License
17
# along with Abydos. If not, see <http://www.gnu.org/licenses/>.
18
19 1
"""abydos.compression._BWT.
20
21
Burrows-Wheeler Transform encoder/decoder
22
"""
23
24 1
from __future__ import (
25
    unicode_literals,
26
    absolute_import,
27
    division,
28
    print_function,
29
)
30
31 1
from six.moves import range
32
33
34 1
__all__ = ['BWT', 'bwt_decode', 'bwt_encode']
35
36
37 1
class BWT(object):
0 ignored issues
show
Unused Code introduced by
The variable __class__ seems to be unused.
Loading history...
38
    """Burrows-Wheeler Transform.
39
40
    The Burrows-Wheeler transform is an attempt at placing similar characters
41
    together to improve compression.
42
    Cf. :cite:`Burrows:1994`.
43
    """
44
45 1
    def encode(self, word, terminator='\0'):
0 ignored issues
show
Coding Style introduced by
This method could be written as a function/class method.

If a method does not access any attributes of the class, it could also be implemented as a function or static method. This can help improve readability. For example

class Foo:
    def some_method(self, x, y):
        return x + y;

could be written as

class Foo:
    @classmethod
    def some_method(cls, x, y):
        return x + y;
Loading history...
46
        r"""Return the Burrows-Wheeler transformed form of a word.
47
48
        Args:
49
            word (str): The word to transform using BWT
50
            terminator (str): A character added to signal the end of the string
51
52
        Returns:
53
            str: Word encoded by BWT
54
55
        Raises:
56
            ValueError: Specified terminator absent from code.
57
58
        Examples:
59
            >>> bwt = BWT()
60
            >>> bwt.encode('align')
61
            'n\x00ilag'
62
            >>> bwt.encode('banana')
63
            'annb\x00aa'
64
            >>> bwt.encode('banana', '@')
65
            'annb@aa'
66
67
        """
68 1
        if word:
69 1
            if terminator in word:
70 1
                raise ValueError(
71
                    'Specified terminator, {}, already in word.'.format(
72
                        terminator if terminator != '\0' else '\\0'
73
                    )
74
                )
75
            else:
76 1
                word += terminator
77 1
                wordlist = sorted(
78
                    word[i:] + word[:i] for i in range(len(word))
79
                )
80 1
                return ''.join([w[-1] for w in wordlist])
81
        else:
82 1
            return terminator
83
84 1
    def decode(self, code, terminator='\0'):
0 ignored issues
show
Coding Style introduced by
This method could be written as a function/class method.

If a method does not access any attributes of the class, it could also be implemented as a function or static method. This can help improve readability. For example

class Foo:
    def some_method(self, x, y):
        return x + y;

could be written as

class Foo:
    @classmethod
    def some_method(cls, x, y):
        return x + y;
Loading history...
85
        r"""Return a word decoded from BWT form.
86
87
        Args:
88
            code (str): The word to transform from BWT form
89
            terminator (str): A character added to signal the end of the string
90
91
        Returns:
92
            str: Word decoded by BWT
93
94
        Raises:
95
            ValueError: Specified terminator absent from code.
96
97
        Examples:
98
            >>> bwt = BWT()
99
            >>> bwt.decode('n\x00ilag')
100
            'align'
101
            >>> bwt.decode('annb\x00aa')
102
            'banana'
103
            >>> bwt.decode('annb@aa', '@')
104
            'banana'
105
106
        """
107 1
        if code:
108 1
            if terminator not in code:
109 1
                raise ValueError(
110
                    'Specified terminator, {}, absent from code.'.format(
111
                        terminator if terminator != '\0' else '\\0'
112
                    )
113
                )
114
            else:
115 1
                wordlist = [''] * len(code)
116 1
                for i in range(len(code)):
0 ignored issues
show
Unused Code introduced by
The variable i seems to be unused.
Loading history...
117 1
                    wordlist = sorted(
118
                        code[i] + wordlist[i] for i in range(len(code))
119
                    )
120 1
                rows = [w for w in wordlist if w[-1] == terminator][0]
121 1
                return rows.rstrip(terminator)
122
        else:
123 1
            return ''
124
125
126 1
def bwt_encode(word, terminator='\0'):
127
    r"""Return the Burrows-Wheeler transformed form of a word.
128
129
    This is a wrapper for :py:meth:`BWT.encode`.
130
131
    Args:
132
        word (str): The word to transform using BWT
133
        terminator (str): A character added to signal the end of the string
134
135
    Returns:
136
        str: Word encoded by BWT
137
138
    Examples:
139
        >>> bwt_encode('align')
140
        'n\x00ilag'
141
        >>> bwt_encode('banana')
142
        'annb\x00aa'
143
        >>> bwt_encode('banana', '@')
144
        'annb@aa'
145
146
    """
147 1
    return BWT().encode(word, terminator)
148
149
150 1
def bwt_decode(code, terminator='\0'):
151
    r"""Return a word decoded from BWT form.
152
153
    This is a wrapper for :py:meth:`BWT.decode`.
154
155
    Args:
156
        code (str): The word to transform from BWT form
157
        terminator (str): A character added to signal the end of the string
158
159
    Returns:
160
        str: Word decoded by BWT
161
162
    Examples:
163
        >>> bwt_decode('n\x00ilag')
164
        'align'
165
        >>> bwt_decode('annb\x00aa')
166
        'banana'
167
        >>> bwt_decode('annb@aa', '@')
168
        'banana'
169
170
    """
171 1
    return BWT().decode(code, terminator)
172
173
174
if __name__ == '__main__':
175
    import doctest
176
177
    doctest.testmod()
178