Passed | Push: 2.x (7fbcc6...4b71c6) | by Ramon | created 09:29

senaite.core.exportimport.instruments.parser   B

Complexity

Total Complexity 47

Size/Duplication

Total Lines 350
Duplicated Lines 0 %

Importance

Changes 0
Metric Value
wmc 47
eloc 137
dl 0
loc 350
rs 8.64
c 0
b 0
f 0

23 Methods

Rating   Name   Duplication   Size   Complexity  
A InstrumentResultsFileParser.parse() 0 13 1
A InstrumentTXTResultsFileParser.splitLine() 0 3 1
A InstrumentTXTResultsFileParser.__init__() 0 6 1
A InstrumentResultsFileParser.getRawResults() 0 53 1
A InstrumentResultsFileParser._emptyRawResults() 0 4 1
B InstrumentTXTResultsFileParser.parse() 0 31 6
A InstrumentResultsFileParser.getResultsTotalCount() 0 7 2
A InstrumentCSVResultsFileParser.splitLine() 0 3 1
A InstrumentResultsFileParser.getAnalysesTotalCount() 0 4 1
A InstrumentCSVResultsFileParser.__init__() 0 5 1
A InstrumentResultsFileParser.resume() 0 10 2
A InstrumentCSVResultsFileParser._parseline() 0 8 1
A InstrumentResultsFileParser._addRawResult() 0 35 3
A InstrumentResultsFileParser.getAttachmentFileType() 0 7 1
C InstrumentCSVResultsFileParser.parse() 0 42 10
A InstrumentTXTResultsFileParser._parseline() 0 8 1
A InstrumentResultsFileParser.__init__() 0 7 1
A InstrumentResultsFileParser.getHeader() 0 4 1
A InstrumentResultsFileParser.getObjectsTotalCount() 0 4 1
A InstrumentResultsFileParser.getInputFile() 0 4 1
A InstrumentResultsFileParser.getAnalysisKeywords() 0 8 3
A InstrumentResultsFileParser.getFileMimeType() 0 4 1
A InstrumentTXTResultsFileParser.read_file() 0 17 5

How to fix: Complexity

Complex classes like those in senaite.core.exportimport.instruments.parser often do many different things. To break such a class down, identify a cohesive component within it. A common way to find one is to look for fields or methods that share the same prefix or suffix.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
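As a rough illustration of Extract Class applied to this module, the sketch below groups the members that share the `RawResult` and `TotalCount` naming into one small collaborator. The names `RawResultsStore` and `SketchParser` are hypothetical and do not exist in senaite.core; `SketchParser` is only a stand-in for the base parser so the example runs on its own.

```python
class RawResultsStore(object):
    """Hypothetical extracted class: owns the raw results dict and the
    counters derived from it."""

    def __init__(self):
        self._rawresults = {}

    def add(self, resid, values=None, override=False):
        # Same branching as _addRawResult in the listing
        values = {} if values is None else values
        if override or resid not in self._rawresults:
            self._rawresults[resid] = [values]
        else:
            self._rawresults[resid].append(values)

    def as_dict(self):
        return self._rawresults

    def objects_count(self):
        return len(self._rawresults)

    def results_count(self):
        # One "result" per stored row, as in getResultsTotalCount
        return sum(len(rows) for rows in self._rawresults.values())

    def keywords(self):
        found = set()
        for rows in self._rawresults.values():
            for row in rows:
                found.update(row.keys())
        return sorted(found)


class SketchParser(object):
    """Stand-in for the base parser: keeps the public method names and
    delegates the bookkeeping to the extracted store."""

    def __init__(self):
        self._results = RawResultsStore()

    def _addRawResult(self, resid, values=None, override=False):
        self._results.add(resid, values, override)

    def getRawResults(self):
        return self._results.as_dict()

    def getObjectsTotalCount(self):
        return self._results.objects_count()

    def getResultsTotalCount(self):
        return self._results.results_count()

    def getAnalysisKeywords(self):
        return self._results.keywords()
```

Keeping the original method names on the parser means existing callers would not need to change; only the storage and counting logic moves.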

# -*- coding: utf-8 -*-
#
# This file is part of SENAITE.CORE.
#
# SENAITE.CORE is free software: you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation, version 2.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
# details.
#
# You should have received a copy of the GNU General Public License along with
# this program; if not, write to the Free Software Foundation, Inc., 51
# Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
#
# Copyright 2018-2024 by its authors.
# Some rights reserved, see README and LICENSE.

import codecs

from senaite.core.exportimport.instruments.logger import Logger
from zope.deprecation import deprecate


class InstrumentResultsFileParser(Logger):
    """Base parser
    """

    def __init__(self, infile, mimetype):
        Logger.__init__(self)
        self._infile = infile
        self._header = {}
        self._rawresults = {}
        self._mimetype = mimetype
        self._numline = 0

    def getInputFile(self):
        """ Returns the results input file
        """
        return self._infile

    def parse(self):
        """Parses the input file and populates the rawresults dict.

        See getRawResults() method for more info about rawresults format

        Returns True if the file has been parsed successfully.

        It is highly recommended to use the _addRawResult method when adding
        raw results.

        IMPORTANT: To be implemented by child classes
        """
        raise NotImplementedError

    @deprecate("Please use getRawResults directly")
    def resume(self):
        """Resumes the parse process

        Called by the Results Importer after parse() call
        """
        if len(self.getRawResults()) == 0:
            self.warn("No results found")
            return False
        return True

    def getAttachmentFileType(self):
        """ Returns the file type name that will be used when creating the
            AttachmentType used by the importer for saving the results file as
            an attachment in each Analysis matched.
            By default returns self.getFileMimeType()
        """
        return self.getFileMimeType()

    def getFileMimeType(self):
        """ Returns the results file type
        """
        return self._mimetype

    def getHeader(self):
        """ Returns a dictionary with custom key, values
        """
        return self._header

    def _addRawResult(self, resid, values=None, override=False):
        """ Adds a set of raw results for an object with id=resid
            resid is usually an Analysis Request ID or Worksheet's Reference
            Analysis ID. The values are a dictionary in which the keys are
            analysis service keywords and the values, another dictionary with
            the key,value results.
            The column 'DefaultResult' must be provided, because it is used to
            map to the column from which the default result must be retrieved.

            Example:
            resid  = 'DU13162-001-R1'
            values = {
                'D2': {'DefaultResult': 'Final Conc',
                       'Remarks':       '',
                       'Resp':          '5816',
                       'ISTD Resp':     '274638',
                       'Resp Ratio':    '0.0212',
                       'Final Conc':    '0.9145',
                       'Exp Conc':      '1.9531',
                       'Accuracy':      '98.19' },

                'D3': {'DefaultResult': 'Final Conc',
                       'Remarks':       '',
                       'Resp':          '5816',
                       'ISTD Resp':     '274638',
                       'Resp Ratio':    '0.0212',
                       'Final Conc':    '0.9145',
                       'Exp Conc':      '1.9531',
                       'Accuracy':      '98.19' }
                }
        """
        if values is None:
            # Avoid sharing one mutable default dict between calls
            values = {}
        if override or resid not in self._rawresults:
            self._rawresults[resid] = [values]
        else:
            self._rawresults[resid].append(values)

    def _emptyRawResults(self):
        """Remove all grabbed raw results
        """
        self._rawresults = {}

    def getObjectsTotalCount(self):
        """The total number of objects (ARs, ReferenceSamples, etc.) parsed
        """
        return len(self.getRawResults())

    def getResultsTotalCount(self):
        """The total number of analysis results parsed
        """
        count = 0
        for val in self.getRawResults().values():
            count += len(val)
        return count

    def getAnalysesTotalCount(self):
        """ The total number of different analyses parsed
        """
        return len(self.getAnalysisKeywords())

    def getAnalysisKeywords(self):
        """Return found analysis service keywords
        """
        analyses = []
        for rows in self.getRawResults().values():
            for row in rows:
                # list() keeps the concatenation valid where keys() is a view
                analyses = list(set(analyses + list(row.keys())))
        return analyses

    def getRawResults(self):
        """Returns a dictionary containing the parsed results data

        Each dict key is the results row ID (usually AR ID or Worksheet's
        Reference Sample ID). Each item is a list of dictionaries, in which
        each key is an AS keyword.

        Inside the AS dict, the column 'DefaultResult' must be provided, that
        maps to the column from which the default result must be retrieved.

        If 'Remarks' column is found, its value will be set in Analysis Remarks
        field when using the default Importer.

        Example:

            raw_results['DU13162-001-R1'] = [{

                'D2': {'DefaultResult': 'Final Conc',
                       'Remarks':       '',
                       'Resp':          '5816',
                       'ISTD Resp':     '274638',
                       'Resp Ratio':    '0.0212',
                       'Final Conc':    '0.9145',
                       'Exp Conc':      '1.9531',
                       'Accuracy':      '98.19' },

                'D3': {'DefaultResult': 'Final Conc',
                       'Remarks':       '',
                       'Resp':          '5816',
                       'ISTD Resp':     '274638',
                       'Resp Ratio':    '0.0212',
                       'Final Conc':    '0.9145',
                       'Exp Conc':      '1.9531',
                       'Accuracy':      '98.19' }}]

            in which:
            - 'DU13162-001-R1' is the Analysis Request ID,
            - 'D2' column is an analysis service keyword,
            - 'DefaultResult' column maps to the column with default result
            - 'Remarks' column with Remarks results for that Analysis
            - The rest of the dict columns are results (or additional info)
              that can be set to the analysis if needed (the default importer
              will look for them if the analysis has Interim fields).

            In the case of reference samples:
            Control/Blank:
            raw_results['QC13-0001-0002'] = [{...}]

            Duplicate of sample DU13162-009 (from AR DU13162-009-R1)
            raw_results['QC-DU13162-009-002'] = [{...}]

        """
        return self._rawresults


class InstrumentCSVResultsFileParser(InstrumentResultsFileParser):
    """Parser for CSV files
    """

    def __init__(self, infile, encoding=None):
        InstrumentResultsFileParser.__init__(self, infile, 'CSV')
        # Some Instruments can generate files with different encodings, so we
        # may need this parameter
        self._encoding = encoding

    def parse(self):
        infile = self.getInputFile()
        self.log("Parsing file ${file_name}",
                 mapping={"file_name": infile.filename})
        jump = 0
        # We test in import functions if the file was uploaded
        try:
            if self._encoding:
                f = codecs.open(infile.name, 'r', encoding=self._encoding)
            else:
                f = open(infile.name, 'rU')
        except AttributeError:
            f = infile
        except IOError:
            f = infile.file
        for line in f.readlines():
            self._numline += 1
            if jump == -1:
                # Something went wrong. Finish
                self.err("File processing finished due to critical errors")
                return False
            if jump > 0:
                # Jump some lines
                jump -= 1
                continue

            if not line or not line.strip():
                continue

            line = line.strip()
            jump = 0
            if line:
                jump = self._parseline(line)

        self.log(
            "End of file reached successfully: ${total_objects} objects, "
            "${total_analyses} analyses, ${total_results} results",
            mapping={"total_objects": self.getObjectsTotalCount(),
                     "total_analyses": self.getAnalysesTotalCount(),
                     "total_results": self.getResultsTotalCount()}
        )
        return True

    def splitLine(self, line):
        sline = line.split(',')
        return [token.strip() for token in sline]

    def _parseline(self, line):
        """ Parses a line from the input CSV file and populates rawresults
            (look at getRawResults comment)
            returns -1 if critical error found and parser must end
            returns the number of lines to be jumped in next read. If 0, the
            parser reads the next line as usual
        """
        raise NotImplementedError


class InstrumentTXTResultsFileParser(InstrumentResultsFileParser):
    """Parser for TXT files
    """

    def __init__(self, infile, separator, encoding=None):
        InstrumentResultsFileParser.__init__(self, infile, 'TXT')
        self._separator = separator
        # Some Instruments can generate files with different encodings, so we
        # may need this parameter
        self._encoding = encoding

    def parse(self):
        infile = self.getInputFile()
        self.log("Parsing file ${file_name}",
                 mapping={"file_name": infile.filename})
        jump = 0
        lines = self.read_file(infile)
        for line in lines:
            self._numline += 1
            if jump == -1:
                # Something went wrong. Finish
                self.err("File processing finished due to critical errors")
                return False
            if jump > 0:
                # Jump some lines
                jump -= 1
                continue

            if not line:
                continue

            jump = 0
            if line:
                jump = self._parseline(line)

        self.log(
            "End of file reached successfully: ${total_objects} objects, "
            "${total_analyses} analyses, ${total_results} results",
            mapping={"total_objects": self.getObjectsTotalCount(),
                     "total_analyses": self.getAnalysesTotalCount(),
                     "total_results": self.getResultsTotalCount()}
        )
        return True

    def read_file(self, infile):
        """Given an input file, read its contents, strip whitespace from the
        beginning and end of each line and return a list of the preprocessed
        lines read.

        :param infile: file that contains the data to be read
        :return: list of the read lines with stripped whitespace
        """
        try:
            encoding = self._encoding if self._encoding else None
            mode = 'r' if self._encoding else 'rU'
            with codecs.open(infile.name, mode, encoding=encoding) as f:
                lines = f.readlines()
        except AttributeError:
            lines = infile.readlines()
        lines = [line.strip() for line in lines]
        return lines

    def splitLine(self, line):
        sline = line.split(self._separator)
        return [token.strip() for token in sline]

    def _parseline(self, line):
        """ Parses a line from the input TXT file and populates rawresults
            (look at getRawResults comment)
            returns -1 if critical error found and parser must end
            returns the number of lines to be jumped in next read. If 0, the
            parser reads the next line as usual
        """
        raise NotImplementedError
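
The `_parseline` docstrings describe a small return-value protocol: -1 aborts, 0 continues, and a positive number skips that many following lines. The standalone sketch below mimics the loop in `parse()` without importing senaite.core, to show how a subclass implementation could drive it. The function names `run_parse_loop` and `make_parseline`, the three-column layout and the units line are assumptions made up for illustration, not part of any real instrument format.

```python
def run_parse_loop(lines, parseline):
    """Mimic the parse() loop: feed non-empty lines to parseline and
    honour its return value (-1 abort, 0 continue, n > 0 skip n lines)."""
    jump = 0
    for line in lines:
        if jump == -1:
            # The previous line reported a critical error
            return False
        if jump > 0:
            # Skip this line, as requested by an earlier parseline call
            jump -= 1
            continue
        line = line.strip()
        if not line:
            continue
        jump = parseline(line)
    return True


def make_parseline(results):
    """Build a parseline callback for a hypothetical layout:
    SampleID,Keyword,Result followed by one units line."""
    def parseline(line):
        tokens = [token.strip() for token in line.split(",")]
        if tokens[0] == "SampleID":
            return 1  # column header found: skip the units line after it
        if len(tokens) < 3:
            return -1  # malformed row: ask the loop to stop
        resid, keyword, value = tokens[:3]
        row = {keyword: {"DefaultResult": "Result", "Result": value}}
        results.setdefault(resid, []).append(row)
        return 0
    return parseline


results = {}
ok = run_parse_loop(
    ["SampleID,Keyword,Result", ",,mg/L", "", "DU13162-001-R1,D2,0.9145"],
    make_parseline(results),
)
```

Because the -1 check sits at the top of the loop, both in this sketch and in the listing, a critical error is only noticed when the next line is read; a -1 returned for the very last line still ends with `True`.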