Passed
Pull Request — master (#26)
by Dafne van
02:48
created

e2edutch.stanza   A

Complexity

Total Complexity 3

Size/Duplication

Total Lines 75
Duplicated Lines 0 %

Importance

Changes 0
Metric Value
wmc 3
eloc 38
dl 0
loc 75
rs 10
c 0
b 0
f 0

3 Methods

Rating   Name   Duplication   Size   Complexity  
A CorefProcessor._set_up_model() 0 3 1
A CorefProcessor.__init__() 0 22 1
A CorefProcessor.process() 0 20 1
import os
Issue: Missing module docstring
Unused Code: The import os seems to be unused.
import stanza
Unused Code: The import stanza seems to be unused.
import logging
Issue: standard import "import logging" should be placed before "import stanza"

from pathlib import Path
Issue: standard import "from pathlib import Path" should be placed before "import stanza"

from e2edutch import util
from e2edutch import coref_model as cm
Unused Code: Unused coref_model imported from e2edutch as cm
from e2edutch.download import download_data
from e2edutch.predict import Predictor

from stanza.pipeline.processor import Processor, register_processor
Issue: third party import "from stanza.pipeline.processor import Processor, register_processor" should be placed before "from e2edutch import util"
from stanza.models.common.doc import Document
Unused Code: Unused Document imported from stanza.models.common.doc
Issue: third party import "from stanza.models.common.doc import Document" should be placed before "from e2edutch import util"

import tensorflow.compat.v1 as tf
Issue: Unable to import 'tensorflow.compat.v1'
Unused Code: Unused tensorflow.compat.v1 imported as tf
Issue: third party import "import tensorflow.compat.v1 as tf" should be placed before "from e2edutch import util"

logger = logging.getLogger('e2edutch')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())


@register_processor('coref')
class CorefProcessor(Processor):
    ''' Processor that appends coreference information '''
    _requires = set(['tokenize'])
    _provides = set(['coref'])
    
Coding Style: Trailing whitespace
    def __init__(self, config, pipeline, use_gpu):
Bug: The __init__ method of the super-class Processor is not called.

It is generally advisable to initialize the super-class by calling its __init__ method:

class SomeParent:
    def __init__(self):
        self.x = 1

class SomeChild(SomeParent):
    def __init__(self):
        # Initialize the super class
        SomeParent.__init__(self)
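In Python 3 the same initialization is usually written with super(). The following is a minimal sketch, assuming a hypothetical stand-in base class: BaseProcessor and ChildProcessor are illustrative names, not Stanza's Processor, whose constructor arguments should be checked against the installed version.

```python
class BaseProcessor:
    """Hypothetical stand-in for a framework base class (not Stanza's Processor)."""

    def __init__(self, config, pipeline, use_gpu):
        self.config = config
        self.pipeline = pipeline
        self.use_gpu = use_gpu


class ChildProcessor(BaseProcessor):
    """Subclass that initializes the parent before adding its own state."""

    def __init__(self, config, pipeline, use_gpu):
        # Let the parent set up its attributes first.
        super().__init__(config, pipeline, use_gpu)
        # Then add subclass-specific state (an assumed, illustrative field).
        self.model_name = config.get('model_name', 'final')


processor = ChildProcessor({'model_name': 'final'}, pipeline=None, use_gpu=False)
print(processor.use_gpu, processor.model_name)
```

Calling the parent first means that any attribute the base class defines exists before the subclass code runs.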
        # Make e2edutch follow Stanza's GPU settings:
        # set the environment value for GPU, so that initialize_from_env picks it up.
        #if use_gpu:
        #    os.environ['GPU'] = ' '.join(tf.config.experimental.list_physical_devices('GPU'))
        #else:
        #    if 'GPU' in os.environ['GPU'] :
        #        os.environ.pop('GPU')

        self.e2econfig = util.initialize_from_env(model_name='final')

        # Override datapath and log_root:
        # store e2edata with the Stanza resources, ie. a 'stanza_resources/nl/coref' directory
        self.e2econfig['datapath'] = Path(config['model_path']).parent
        self.e2econfig['log_root'] = Path(config['model_path']).parent

        # Download data files if not present
        download_data(self.e2econfig)

        # Start and stop a session to cache all models
        predictor = Predictor(config=self.e2econfig)
        predictor.end_session()

    def _set_up_model(self, *args):
Unused Code: The argument args seems to be unused.
Coding Style: This method could be written as a function/class method.

If a method does not access any attributes of the class, it could also be implemented as a function or static method. This can help improve readability. For example:

class Foo:
    def some_method(self, x, y):
        return x + y

could be written as

class Foo:
    @classmethod
    def some_method(cls, x, y):
        return x + y
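The note mentions a function or static method, while its example shows a class method. For completeness, here is a minimal sketch of the other two forms, reusing the illustrative Foo and some_method names from the note (some_function is a hypothetical name):

```python
class Foo:
    @staticmethod
    def some_method(x, y):
        # No self or cls parameter: the method uses neither instance nor class state.
        return x + y


def some_function(x, y):
    # Module-level alternative when the logic does not belong to the class at all.
    return x + y


print(Foo.some_method(1, 2), some_function(1, 2))
```

A static method keeps the helper namespaced under the class. A plain function is the simpler choice when nothing ties it to the class.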
        print ('_set_up_model')
Coding Style: No space allowed before bracket
        pass
Unused Code: Unnecessary pass statement

    def process(self, doc):

        predictor = Predictor(config=self.e2econfig)

        # build the example argument for predict:
        #   example (dict): dict with the following fields:
        #                     sentences ([[str]])
        #                     doc_id (str)
        #                     clusters ([[(int, int)]]) (optional)
        example = {}
        example['sentences'] = [sentence.text for sentence in doc.sentences]
        example['doc_id'] = 'document_from_stanza'
        example['doc_key'] = 'undocumented'

        # predicted_clusters, _ = predictor.predict(example)
        print(predictor.predict(example))

        predictor.end_session()

        return doc
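
The comment in process() describes sentences as a list of token lists ([[str]]), whereas sentence.text yields one string per sentence. The following is a minimal sketch of building the nested form, assuming Stanza-style objects in which each sentence exposes .tokens and each token exposes .text. The dataclasses are hypothetical stand-ins so the snippet runs without Stanza, and build_example is an illustrative helper, not part of e2edutch. Whether Predictor.predict requires this shape should be checked against e2edutch itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
    """Hypothetical stand-in for a Stanza token."""
    text: str


@dataclass
class Sentence:
    """Hypothetical stand-in for a Stanza sentence."""
    tokens: List[Token] = field(default_factory=list)


@dataclass
class Doc:
    """Hypothetical stand-in for a Stanza document."""
    sentences: List[Sentence] = field(default_factory=list)


def build_example(doc):
    """Return the example dict with 'sentences' as a list of token lists."""
    return {
        # One inner list of token strings per sentence, i.e. [[str]].
        'sentences': [[token.text for token in sentence.tokens]
                      for sentence in doc.sentences],
        'doc_id': 'document_from_stanza',
        'doc_key': 'undocumented',
    }


doc = Doc([Sentence([Token('Jan'), Token('slaapt')]),
           Sentence([Token('Hij'), Token('droomt')])])
example = build_example(doc)
print(example['sentences'])
```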