Completed
Push — master ( c76636...9def71 )
by Philip
29s
created

TestDelimitedReader   A

Complexity

Total Complexity 7

Size/Duplication

Total Lines 61
Duplicated Lines 0 %

Importance

Changes 2
Bugs 0 Features 0
Metric   Value
c        2
b        0
f        0
dl       0
loc      61
rs       10
wmc      7

7 Methods

Rating   Name                      Duplication   Size   Complexity
A        test_stream_reader()      0             8      1
A        test_from_zipfile()       0             11     1
A        setUp()                   0             4      1
A        test_from_file()          0             10     1
A        test_file_headers()       0             6      1
A        test_line_number()        0             9      1
A        test_zip_file_headers()   0             6      1
import csv
Coding Style: This module should have a docstring.

The coding style of this project requires that you add a docstring to this code element. Below is an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

To learn more about docstrings, see PEP 257: Docstring Conventions.
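The same convention extends to modules, classes and functions. The following is a minimal sketch only; `PipeDialect` and `sample_text` are hypothetical names chosen for illustration and are not part of the reviewed file.

```python
"""Illustrative module docstring: one summary line describing the module."""

import csv


class PipeDialect(csv.Dialect):
    """Pipe-delimited CSV dialect (hypothetical, for illustration only)."""

    delimiter = '|'
    quotechar = '"'
    doublequote = True
    skipinitialspace = False
    lineterminator = '\n'
    quoting = csv.QUOTE_MINIMAL


def sample_text():
    """Return a two-line pipe-delimited sample as a string."""
    return '"NAME"|"SCORE"\n"Dave"|82'


# Parse the sample with the documented dialect.
rows = list(csv.reader(sample_text().splitlines(), dialect=PipeDialect()))
```

Each docstring is the first statement of its module, class or function, which is where tools such as Pylint look for it.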
import io
import os
import unittest
import textwrap
import zipfile
from collections import namedtuple
from datetime import date, datetime
from tempfile import NamedTemporaryFile

from foil.fileio import (concatenate_streams, DelimitedReader,
                         DelimitedSubsetReader, TextReader, ZipReader)


class MockDialect(csv.Dialect):
Coding Style: This class should have a docstring.
    delimiter = '|'
    quotechar = '"'
    doublequote = True
    skipinitialspace = False
    lineterminator = '\n'
    quoting = csv.QUOTE_MINIMAL


def delimited_text():
Coding Style: This function should have a docstring.
    file_content = textwrap.dedent(r"""
"NAME"|"CLASS"|"DATE"|"ASSIGNMENT"|"SCORE"|"AVERAGE"
"Dave"|"American History"|2015-03-02|"QUIZ 1"|82|82.0
"Dave"|"American History"|2015-04-04|"QUIZ 2"|91|86.5
"Dave"|"American History"|2015-04-20|"Mid-term"|77|83.333
""").strip()
    return file_content


def data_records(fields):
Coding Style: This function should have a docstring.
    Record = namedtuple('Record', fields)
Coding Style (Naming): The name Record does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.
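To see why names such as Record, setUpClass and maxDiff are flagged, the quoted pattern can be tried directly. This is a rough sketch that assumes the pattern is matched from the start of the name; `conforms` is a hypothetical helper, not part of Pylint.

```python
import re

# Pattern quoted in the report. The trailing `$` anchors the end of the
# name, and re.match anchors the start.
NAME_RE = re.compile(r'[a-z_][a-z0-9_]{2,30}$')


def conforms(name):
    """Return True if `name` matches the quoted naming pattern."""
    return NAME_RE.match(name) is not None


# Names taken from the reviewed file, plus two lowercase alternatives.
for name in ('Record', 'record_cls', 'setUpClass', 'maxDiff', 'tmp'):
    print(name, conforms(name))
```

Any uppercase letter makes a name fail. setUpClass, tearDownClass and maxDiff are names defined by unittest and cannot be renamed, so for those the usual remedy is to relax the pattern in the project's Pylint configuration.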
    records = [
        Record('Dave', 'American History', date(2015, 3, 2), 'QUIZ 1', 82.0, 82.0),
        Record('Dave', 'American History', date(2015, 4, 4), 'QUIZ 2', 91.0, 86.5),
        Record('Dave', 'American History', date(2015, 4, 20), 'Mid-term', 77.0, 83.333)]

    return records


def partial_data_records(fields):
Coding Style: This function should have a docstring.
    Record = namedtuple('Record', fields)
Coding Style (Naming): The name Record does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).
    records = [
        Record('Dave', 'American History', 82.0),
        Record('Dave', 'American History', 86.5),
        Record('Dave', 'American History', 83.333)]

    return records


def single_field_records(fields):
    Record = namedtuple('Record', fields)
Coding Style (Naming): The name Record does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).
    records = [Record('Dave'), Record('Dave'), Record('Dave')]

    return records


def parse_date(date_str):
    # Compare by value: `is ''` tests object identity, which only works
    # when the interpreter happens to reuse the same string object.
    if date_str == '':
        return None
    else:
        return datetime.strptime(date_str, '%Y-%m-%d').date()


class ReaderFixture(object):
Coding Style: This class should have a docstring.
    encoding = 'UTF-8'

    @classmethod
    def setUpClass(cls):
Coding Style (Naming): The name setUpClass does not conform to the method naming conventions ([a-z_][a-z0-9_]{2,30}$).
        cls._prep()

    @classmethod
    def _prep(cls):
        pass

    @classmethod
    def _clean(cls):
        pass

    @classmethod
    def tearDownClass(cls):
Coding Style (Naming): The name tearDownClass does not conform to the method naming conventions ([a-z_][a-z0-9_]{2,30}$).
        cls._clean()

        if os.path.exists(cls.path):
Bug: The Class ReaderFixture does not seem to have a member named path.

This check looks for calls to members that are non-existent. These calls will fail.

The member could have been renamed or removed.
            os.unlink(cls.path)
Bug: The Class ReaderFixture does not seem to have a member named path.
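The warning arises because path is only assigned in subclasses. One possible way to address it, sketched here as an assumption rather than the project's actual fix, is to declare the attribute on the base class and guard the cleanup. `ExampleFixture` is a hypothetical subclass used only to exercise the sketch.

```python
import os
import tempfile


class ReaderFixture(object):
    """Base fixture; subclasses are expected to set `path` in `_prep`."""

    # Assumption: declaring the attribute here means it always exists,
    # even when a subclass never creates a file.
    path = None

    @classmethod
    def setUpClass(cls):
        cls._prep()

    @classmethod
    def _prep(cls):
        pass

    @classmethod
    def tearDownClass(cls):
        # Only unlink when a subclass actually created a file.
        if cls.path is not None and os.path.exists(cls.path):
            os.unlink(cls.path)


class ExampleFixture(ReaderFixture):
    """Hypothetical subclass that creates one temporary file."""

    @classmethod
    def _prep(cls):
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            cls.path = tmp.name
```

With the default in place, calling tearDownClass on the base class is a no-op instead of raising AttributeError.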


class DelimitedReaderFixture(ReaderFixture):
Coding Style: This class should have a docstring.
    zip_filename = 'delimited_file1.txt'

    @classmethod
    def _prep(cls):
        cls.dialect = MockDialect()
        file_content = delimited_text()
        with NamedTemporaryFile(prefix='delim_', suffix='.txt', delete=False) as tmp:
            with open(tmp.name, 'w', encoding=cls.encoding) as text_file:
                text_file.write(file_content)
            cls.path = tmp.name

        file_content_bytes = bytes(file_content, cls.encoding)
        with NamedTemporaryFile(prefix='zipped_', suffix='.zip', delete=False) as tmp:
            with zipfile.ZipFile(tmp.name, mode='w') as myzip:
                myzip.writestr(cls.zip_filename, file_content_bytes)
            cls.zip_path = tmp.name

    @classmethod
    def _clean(cls):
        if os.path.exists(cls.zip_path):
            os.unlink(cls.zip_path)


class TestTextReader(ReaderFixture, unittest.TestCase):
    @classmethod
    def _prep(cls):
        file_content = 'hello\nworld\n'
        with NamedTemporaryFile(prefix='text_', suffix='.txt', delete=False) as tmp:
            with open(tmp.name, 'w', encoding=cls.encoding) as text_file:
                text_file.write(file_content)
            cls.path = tmp.name

    def test_text_reader(self):
        reader = TextReader(self.path, 'UTF-8')

        expected = ['hello', 'world']
        result = list(reader)

        self.assertEqual(expected, result)


class TestDelimitedReader(DelimitedReaderFixture, unittest.TestCase):
    def setUp(self):
        self.fields = ['NAME', 'CLASS', 'DATE', 'ASSIGNMENT', 'SCORE', 'AVERAGE']
        self.converters = [str, str, parse_date, str, float, float]
        self.expected = data_records(self.fields)

    def test_stream_reader(self):
        stream = io.StringIO(delimited_text())
        reader = DelimitedReader(stream, dialect=self.dialect,
                                 fields=self.fields, converters=self.converters)

        result = list(reader)

        self.assertSequenceEqual(self.expected, result)

    def test_from_file(self):
        reader = DelimitedReader.from_file(path=self.path,
                                           encoding=self.encoding,
                                           dialect=self.dialect,
                                           fields=self.fields,
                                           converters=self.converters)

        result = list(reader)

        self.assertSequenceEqual(self.expected, result)

    def test_from_zipfile(self):
        reader = DelimitedReader.from_zipfile(path=self.zip_path,
                                              filename=self.zip_filename,
                                              encoding=self.encoding,
                                              dialect=self.dialect,
                                              fields=self.fields,
                                              converters=self.converters)

        result = list(reader)

        self.assertSequenceEqual(self.expected, result)

    def test_file_headers(self):
        headers = DelimitedReader.file_headers(path=self.path,
                                               encoding=self.encoding,
                                               dialect=self.dialect)

        self.assertEqual(self.fields, list(headers))

    def test_zip_file_headers(self):
        headers = DelimitedReader.zipfile_headers(path=self.zip_path,
                                                  filename=self.zip_filename,
                                                  encoding=self.encoding,
                                                  dialect=self.dialect)
        self.assertEqual(self.fields, list(headers))

    def test_line_number(self):
        # counts skipped header line
        stream = io.StringIO(delimited_text())
        reader = DelimitedReader(stream, dialect=self.dialect, fields=self.fields,
                                 converters=self.converters)
        next(reader)
        next(reader)

        self.assertEqual(reader.file_line_number, 3)


class TestDelimitedSubsetReader(DelimitedReaderFixture, unittest.TestCase):
    def setUp(self):
        self.maxDiff = None
Coding Style (Naming): The name maxDiff does not conform to the attribute naming conventions ([a-z_][a-z0-9_]{2,30}$).
        self.headers = ['NAME', 'CLASS', 'DATE', 'ASSIGNMENT', 'SCORE', 'AVERAGE']
        self.fields = ['NAME', 'CLASS', 'AVERAGE']
        self.field_index = [self.headers.index(field) for field in self.fields]
        self.converters = [str, str, float]
        self.expected = partial_data_records(self.fields)

    def test_stream_reader(self):
        stream = io.StringIO(delimited_text())
        reader = DelimitedSubsetReader(stream,
                                       dialect=self.dialect,
                                       fields=self.fields,
                                       converters=self.converters,
                                       field_index=self.field_index)

        result = list(reader)

        self.assertSequenceEqual(self.expected, result)

    def test_from_file(self):
        reader = DelimitedSubsetReader.from_file(path=self.path,
                                                 encoding=self.encoding,
                                                 dialect=self.dialect,
                                                 fields=self.fields,
                                                 converters=self.converters,
                                                 field_index=self.field_index)

        result = list(reader)

        self.assertSequenceEqual(self.expected, result)

    def test_from_zipfile(self):
        reader = DelimitedSubsetReader.from_zipfile(path=self.zip_path,
                                                    filename=self.zip_filename,
                                                    encoding=self.encoding,
                                                    dialect=self.dialect,
                                                    fields=self.fields,
                                                    converters=self.converters,
                                                    field_index=self.field_index)

        result = list(reader)

        self.assertSequenceEqual(self.expected, result)

    def test_single_field(self):
        stream = io.StringIO(delimited_text())
        field_index = [0]
        fields = [self.fields[index] for index in field_index]
        converters = [self.converters[index] for index in field_index]

        reader = DelimitedSubsetReader(stream,
                                       dialect=self.dialect,
                                       fields=fields,
                                       converters=converters,
                                       field_index=field_index)

        expected = single_field_records(fields)
        result = list(reader)

        self.assertEqual(expected, result)


class TestZipReader(ReaderFixture, unittest.TestCase):
    filename = 'sample_file.txt'
    file_content = 'abc\neasy\n123.\n'

    @classmethod
    def _prep(cls):
        cls.file_content_bytes = bytes(cls.file_content, encoding=cls.encoding)
        with NamedTemporaryFile(prefix='zipped_', suffix='.zip', delete=False) as tmp:
            with zipfile.ZipFile(tmp.name, mode='w') as myzip:
                    myzip.writestr(cls.filename, cls.file_content_bytes)
Coding Style: The indentation here looks off. 16 spaces were expected, but 20 were found.
            cls.path = tmp.name

    def test_read(self):
        result = ZipReader(self.path, self.filename).read(self.encoding)

        self.assertEqual(self.file_content, result)

    def test_readlines(self):
        line_gen = ZipReader(self.path, self.filename).readlines(self.encoding)
        expected = ['abc', 'easy', '123.']

        self.assertEqual(expected, list(line_gen))


class TestConcatenateStreams(unittest.TestCase):
    def test_concatenate_streams(self):
        streams = [[1, 2, 3], ['a', 'b', 'c']]

        expected = [1, 2, 3, 'a', 'b', 'c']
        result = list(concatenate_streams(streams))

        self.assertEqual(expected, result)
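
The DelimitedReader tests above expect each data row to come back as a namedtuple with one converter applied per field and the header line skipped. The behaviour can be approximated with the standard library alone. This is only a sketch of what the tests check, not foil's implementation; `PipeDialect` and `read_records` are hypothetical names, and the three-column sample is made up for brevity.

```python
import csv
import io
from collections import namedtuple
from datetime import datetime


class PipeDialect(csv.Dialect):
    """Pipe-delimited dialect mirroring the settings used in the tests."""

    delimiter = '|'
    quotechar = '"'
    doublequote = True
    skipinitialspace = False
    lineterminator = '\n'
    quoting = csv.QUOTE_MINIMAL


def parse_date(date_str):
    """Parse YYYY-MM-DD, treating an empty field as missing."""
    if date_str == '':  # value comparison, not identity
        return None
    return datetime.strptime(date_str, '%Y-%m-%d').date()


def read_records(stream, dialect, fields, converters):
    """Yield one namedtuple per data row, skipping the header line."""
    Record = namedtuple('Record', fields)
    rows = csv.reader(stream, dialect=dialect)
    next(rows)  # discard the header row
    for row in rows:
        yield Record(*(convert(value)
                       for convert, value in zip(converters, row)))


text = '"NAME"|"DATE"|"SCORE"\n"Dave"|2015-03-02|82\n"Dave"||91'
records = list(read_records(io.StringIO(text), PipeDialect(),
                            ['NAME', 'DATE', 'SCORE'],
                            [str, parse_date, float]))
```

The second data row has an empty DATE field, which parse_date maps to None instead of passing it to strptime.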