senaite.core.exportimport.instruments.parser - Code Metrics - Inspection of "Added GPSCoordinates widget and field (#2565)" - senaite/senaite.core - Measure and Improve Code Quality continuously with Scrutinizer

Passed

Push — 2.x ( 7fbcc6...4b71c6 )

by Ramon

created 2024-05-30 09:59 UTC

senaite.core.exportimport.instruments.parser B

↳ Parent: Project

Complexity

Total Complexity

Size/Duplication

Total Lines	350
Duplicated Lines	0 %

Metric	Value
wmc	47
eloc	137
dl	0
loc	350
rs	8.64
c	0
b	0
f	0

23 Methods

Rating	Name	Size	Complexity
A	InstrumentResultsFileParser.parse()	13	1
A	InstrumentTXTResultsFileParser.splitLine()	3	1
A	InstrumentTXTResultsFileParser.__init__()	6	1
A	InstrumentResultsFileParser.getRawResults()	53	1
A	InstrumentResultsFileParser._emptyRawResults()	4	1
B	InstrumentTXTResultsFileParser.parse()	31	6
A	InstrumentResultsFileParser.getResultsTotalCount()	7	2
A	InstrumentCSVResultsFileParser.splitLine()	3	1
A	InstrumentResultsFileParser.getAnalysesTotalCount()	4	1
A	InstrumentCSVResultsFileParser.__init__()	5	1
A	InstrumentResultsFileParser.resume()	10	2
A	InstrumentCSVResultsFileParser._parseline()	8	1
A	InstrumentResultsFileParser._addRawResult()	35	3
A	InstrumentResultsFileParser.getAttachmentFileType()	7	1
C	InstrumentCSVResultsFileParser.parse()	42	10
A	InstrumentTXTResultsFileParser._parseline()	8	1
A	InstrumentResultsFileParser.__init__()	7	1
A	InstrumentResultsFileParser.getHeader()	4	1
A	InstrumentResultsFileParser.getObjectsTotalCount()	4	1
A	InstrumentResultsFileParser.getInputFile()	4	1
A	InstrumentResultsFileParser.getAnalysisKeywords()	8	3
A	InstrumentResultsFileParser.getFileMimeType()	4	1
A	InstrumentTXTResultsFileParser.read_file()	17	5
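
The `wmc` value in the metric table appears to be the sum of the per-method complexity scores listed above. This assumes `wmc` is a weighted method count, which the report does not spell out. A quick check, with the Complexity column copied by hand from the table:

```python
# Complexity column of the "23 Methods" table, in table order.
complexities = [1, 1, 1, 1, 1, 6, 2, 1, 1, 1, 2, 1,
                3, 1, 10, 1, 1, 1, 1, 1, 3, 1, 5]

assert len(complexities) == 23  # one entry per listed method

wmc = sum(complexities)
print(wmc)  # 47, the same value the metric table reports for wmc

# The two parse() implementations account for a third of the total.
print(10 + 6)  # 16
```

Under that reading, the CSV and TXT `parse()` methods (complexity 10 and 6) are the only ones rated below A and are the natural place to start reducing the total.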

How to fix Complexity

# -*- coding: utf-8 -*-
#
# This file is part of SENAITE.CORE.
#
# SENAITE.CORE is free software: you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation, version 2.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
# details.
#
# You should have received a copy of the GNU General Public License along with
# this program; if not, write to the Free Software Foundation, Inc., 51
# Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
#
# Copyright 2018-2024 by its authors.
# Some rights reserved, see README and LICENSE.

import codecs

from senaite.core.exportimport.instruments.logger import Logger
from zope.deprecation import deprecate


class InstrumentResultsFileParser(Logger):
    """Base parser
    """

    def __init__(self, infile, mimetype):
        Logger.__init__(self)
        self._infile = infile
        self._header = {}
        self._rawresults = {}
        self._mimetype = mimetype
        self._numline = 0

    def getInputFile(self):
        """ Returns the results input file
        """
        return self._infile

    def parse(self):
        """Parses the input file and populates the rawresults dict.

        See getRawResults() method for more info about rawresults format

        Returns True if the file has been parsed successfully.

        It is highly recommended to use the _addRawResult method when adding
        raw results.

        IMPORTANT: To be implemented by child classes
        """
        raise NotImplementedError

    @deprecate("Please use getRawResults directly")
    def resume(self):
        """Checks whether the parse process produced any results

        Called by the Results Importer after parse() call
        """
        if len(self.getRawResults()) == 0:
            self.warn("No results found")
            return False
        return True

    def getAttachmentFileType(self):
        """ Returns the file type name that will be used when creating the
            AttachmentType used by the importer for saving the results file as
            an attachment in each Analysis matched.
            By default returns self.getFileMimeType()
        """
        return self.getFileMimeType()

    def getFileMimeType(self):
        """ Returns the results file type
        """
        return self._mimetype

    def getHeader(self):
        """ Returns a dictionary with custom key/value pairs
        """
        return self._header

    def _addRawResult(self, resid, values=None, override=False):
        """ Adds a set of raw results for an object with id=resid
            resid is usually an Analysis Request ID or Worksheet's Reference
            Analysis ID. The values are a dictionary in which the keys are
            analysis service keywords and the values are another dictionary
            with the key/value results.
            The column 'DefaultResult' must be provided, because it is used to
            map to the column from which the default result must be retrieved.

            Example:
            resid  = 'DU13162-001-R1'
            values = {
                'D2': {'DefaultResult': 'Final Conc',
                       'Remarks':       '',
                       'Resp':          '5816',
                       'ISTD Resp':     '274638',
                       'Resp Ratio':    '0.0212',
                       'Final Conc':    '0.9145',
                       'Exp Conc':      '1.9531',
                       'Accuracy':      '98.19' },

                'D3': {'DefaultResult': 'Final Conc',
                       'Remarks':       '',
                       'Resp':          '5816',
                       'ISTD Resp':     '274638',
                       'Resp Ratio':    '0.0212',
                       'Final Conc':    '0.9145',
                       'Exp Conc':      '1.9531',
                       'Accuracy':      '98.19' }
                }
        """
        # Avoid a mutable default argument: a single shared {} would end up
        # in every result list created without explicit values
        if values is None:
            values = {}
        if override or resid not in self._rawresults:
            self._rawresults[resid] = [values]
        else:
            self._rawresults[resid].append(values)

    def _emptyRawResults(self):
        """Remove all grabbed raw results
        """
        self._rawresults = {}

    def getObjectsTotalCount(self):
        """The total number of objects (ARs, ReferenceSamples, etc.) parsed
        """
        return len(self.getRawResults())

    def getResultsTotalCount(self):
        """The total number of analysis results parsed
        """
        count = 0
        for val in self.getRawResults().values():
            count += len(val)
        return count

    def getAnalysesTotalCount(self):
        """ The total number of different analyses parsed
        """
        return len(self.getAnalysisKeywords())

    def getAnalysisKeywords(self):
        """Return found analysis service keywords
        """
        # Collect keywords in a set; works whether keys() returns a list
        # (Python 2) or a view (Python 3)
        analyses = set()
        for rows in self.getRawResults().values():
            for row in rows:
                analyses.update(row.keys())
        return list(analyses)

    def getRawResults(self):
        """Returns a dictionary containing the parsed results data

        Each dict key is the results row ID (usually AR ID or Worksheet's
        Reference Sample ID). Each item is a list of dictionaries, in which
        each key is an AS Keyword.

        Inside the AS dict, the column 'DefaultResult' must be provided, which
        maps to the column from which the default result must be retrieved.

        If a 'Remarks' column is found, its value will be set in the Analysis
        Remarks field when using the default Importer.

        Example:

            raw_results['DU13162-001-R1'] = [{

                'D2': {'DefaultResult': 'Final Conc',
                        'Remarks':       '',
                        'Resp':          '5816',
                        'ISTD Resp':     '274638',
                        'Resp Ratio':    '0.0212',
                        'Final Conc':    '0.9145',
                        'Exp Conc':      '1.9531',
                        'Accuracy':      '98.19' },

                'D3': {'DefaultResult': 'Final Conc',
                        'Remarks':       '',
                        'Resp':          '5816',
                        'ISTD Resp':     '274638',
                        'Resp Ratio':    '0.0212',
                        'Final Conc':    '0.9145',
                        'Exp Conc':      '1.9531',
                        'Accuracy':      '98.19' }}]

            in which:
            - 'DU13162-001-R1' is the Analysis Request ID,
            - 'D2' column is an analysis service keyword,
            - 'DefaultResult' column maps to the column with default result
            - 'Remarks' column with Remarks results for that Analysis
            - The rest of the dict columns are results (or additional info)
              that can be set to the analysis if needed (the default importer
              will look for them if the analysis has Interim fields).

            In the case of reference samples:
            Control/Blank:
            raw_results['QC13-0001-0002'] = {...}

            Duplicate of sample DU13162-009 (from AR DU13162-009-R1)
            raw_results['QC-DU13162-009-002'] = {...}

        """
        return self._rawresults


class InstrumentCSVResultsFileParser(InstrumentResultsFileParser):
    """Parser for CSV files
    """

    def __init__(self, infile, encoding=None):
        InstrumentResultsFileParser.__init__(self, infile, 'CSV')
        # Some Instruments can generate files with different encodings, so we
        # may need this parameter
        self._encoding = encoding

    def parse(self):
        infile = self.getInputFile()
        self.log("Parsing file ${file_name}",
                 mapping={"file_name": infile.filename})
        jump = 0
        # We test in import functions if the file was uploaded
        try:
            if self._encoding:
                f = codecs.open(infile.name, 'r', encoding=self._encoding)
            else:
                f = open(infile.name, 'rU')
        except AttributeError:
            f = infile
        except IOError:
            f = infile.file
        for line in f.readlines():
            self._numline += 1
            if jump == -1:
                # Something went wrong. Finish
                self.err("File processing finished due to critical errors")
                return False
            if jump > 0:
                # Jump some lines
                jump -= 1
                continue

            line = line.strip()
            if not line:
                continue

            jump = self._parseline(line)

        self.log(
            "End of file reached successfully: ${total_objects} objects, "
            "${total_analyses} analyses, ${total_results} results",
            mapping={"total_objects": self.getObjectsTotalCount(),
                     "total_analyses": self.getAnalysesTotalCount(),
                     "total_results": self.getResultsTotalCount()}
        )
        return True

    def splitLine(self, line):
        sline = line.split(',')
        return [token.strip() for token in sline]

    def _parseline(self, line):
        """ Parses a line from the input CSV file and populates rawresults
            (see the getRawResults docstring).
            Returns -1 if a critical error is found and the parser must stop.
            Otherwise returns the number of lines to skip on the next read;
            if 0, the parser reads the next line as usual.
        """
        raise NotImplementedError


class InstrumentTXTResultsFileParser(InstrumentResultsFileParser):
    """Parser for TXT files
    """

    def __init__(self, infile, separator, encoding=None):
        InstrumentResultsFileParser.__init__(self, infile, 'TXT')
        self._separator = separator
        # Some Instruments can generate files with different encodings, so we
        # may need this parameter
        self._encoding = encoding

    def parse(self):
        infile = self.getInputFile()
        self.log("Parsing file ${file_name}",
                 mapping={"file_name": infile.filename})
        jump = 0
        lines = self.read_file(infile)
        for line in lines:
            self._numline += 1
            if jump == -1:
                # Something went wrong. Finish
                self.err("File processing finished due to critical errors")
                return False
            if jump > 0:
                # Jump some lines
                jump -= 1
                continue

            if not line:
                continue

            jump = self._parseline(line)

        self.log(
            "End of file reached successfully: ${total_objects} objects, "
            "${total_analyses} analyses, ${total_results} results",
            mapping={"total_objects": self.getObjectsTotalCount(),
                     "total_analyses": self.getAnalysesTotalCount(),
                     "total_results": self.getResultsTotalCount()}
        )
        return True

    def read_file(self, infile):
        """Given an input file read its contents, strip whitespace from the
         beginning and end of each line and return a list of the preprocessed
         lines read.

        :param infile: file that contains the data to be read
        :return: list of the read lines with stripped whitespace
        """
        try:
            encoding = self._encoding if self._encoding else None
            mode = 'r' if self._encoding else 'rU'
            with codecs.open(infile.name, mode, encoding=encoding) as f:
                lines = f.readlines()
        except AttributeError:
            lines = infile.readlines()
        lines = [line.strip() for line in lines]
        return lines

    def splitLine(self, line):
        sline = line.split(self._separator)
        return [token.strip() for token in sline]

    def _parseline(self, line):
        """ Parses a line from the input TXT file and populates rawresults
            (see the getRawResults docstring).
            Returns -1 if a critical error is found and the parser must stop.
            Otherwise returns the number of lines to skip on the next read;
            if 0, the parser reads the next line as usual.
        """
        raise NotImplementedError
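

The `_parseline` contract described in the docstrings (return 0 to continue, a positive number to skip lines, -1 to abort) can be illustrated with a minimal sketch. Everything here is hypothetical and self-contained: `MiniParser` is a stand-in that only mimics the parse loop and `_addRawResult` bookkeeping of the listing above, so no SENAITE import is needed, and the `HEADER` / `SampleID,Keyword,Result` file layout of `DemoCSVParser` is an invented format, not that of any real instrument.

```python
class MiniParser(object):
    """Hypothetical stand-in for the CSV parser in the listing."""

    def __init__(self, lines):
        self._lines = lines
        self._rawresults = {}

    def _addRawResult(self, resid, values=None, override=False):
        # One list of result dicts per object ID, as in the listing.
        values = {} if values is None else values
        if override or resid not in self._rawresults:
            self._rawresults[resid] = [values]
        else:
            self._rawresults[resid].append(values)

    def parse(self):
        jump = 0
        for line in self._lines:
            if jump == -1:
                return False  # the previous line reported a critical error
            if jump > 0:
                jump -= 1     # skip a line, as requested by _parseline
                continue
            line = line.strip()
            if not line:
                continue
            jump = self._parseline(line)
        return True

    def _parseline(self, line):
        raise NotImplementedError


class DemoCSVParser(MiniParser):
    """Invented format: a 'HEADER' line, one line of column titles,
    then 'SampleID,Keyword,Result' rows."""

    def _parseline(self, line):
        tokens = [token.strip() for token in line.split(",")]
        if tokens[0] == "HEADER":
            return 1   # skip the column titles on the next line
        if len(tokens) != 3:
            return -1  # malformed row: ask parse() to stop
        resid, keyword, result = tokens
        self._addRawResult(resid, {
            keyword: {"DefaultResult": "Result", "Result": result},
        })
        return 0


parser = DemoCSVParser([
    "HEADER,demo instrument",
    "Sample,Keyword,Result",
    "DU13162-001-R1,D2,0.9145",
    "DU13162-001-R1,D3,1.2000",
    "",
    "QC13-0001-0002,D2,0.0010",
])
ok = parser.parse()
raw = parser._rawresults

objects = len(raw)                               # distinct IDs
results = sum(len(rows) for rows in raw.values())  # result dicts overall
keywords = sorted({kw for rows in raw.values() for row in rows for kw in row})

print(ok)        # True
print(objects)   # 2
print(results)   # 3
print(keywords)  # ['D2', 'D3']
```

The three counts correspond to what `getObjectsTotalCount()`, `getResultsTotalCount()` and `getAnalysisKeywords()` compute in the listing. As in the listing, the -1 check happens at the top of the next iteration, so a critical error reported on the very last line does not change the return value of `parse()`.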


