Passed
Push — master ( 25ec64...5aaeb0 )
by Marcin
02:04
created

build.rna_tools.tools.PyMOL4RNA.libs.calc_gdt   A

Complexity

Total Complexity 4

Size/Duplication

Total Lines 82
Duplicated Lines 75.61 %

Importance

Changes 0
Metric Value
eloc 49
dl 62
loc 82
rs 10
c 0
b 0
f 0
wmc 4

2 Functions

Rating   Name   Duplication   Size   Complexity  
A gdt_hs() 31 31 2
A gdt() 31 31 2

How to fix   Duplicated Code   

Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.

Common duplication problems, and corresponding solutions are:

1
#!/usr/bin/env python
2
# -*- coding: utf-8 -*-
3
"""
4
Author: Guillaume Bouvier -- [email protected]
5
https://research.pasteur.fr/en/member/guillaume-bouvier/
6
2018-01-24 15:23:33 (UTC+0100)
7
https://bougui505.github.io/2018/01/24/compute_the_global_distance_test_(gdt)_with_pymol.html
8
"""
9
10
from pymol import cmd
11
import numpy as np
12
13
print("Loaded calc_gdt.py: gdt(sel1, sel2), gdt_hs(sel1, sel2)")
14 View Code Duplication
@cmd.extend
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated in your project.
Loading history...
15
def gdt(sel1, sel2):
16
    """
17
    Calculate the GDT (Global Distance Test) between two RNA (using P atoms) selections.
18
    Args:
19
        sel1 (str): selection string for the protein to compare.
20
        sel2 (str): selection string for the reference protein.
21
    """
22
    # Select only C-alpha atoms
23
    sel1 += ' and name P'
24
    sel2 += ' and name P'
25
    cutoffs = [1.0, 2.0, 4.0, 8.0]
26
    # Align without transforming for raw alignment data
27
    cmd.align(sel1, sel2, cycles=0, transform=0, object='aln')
28
    # Get raw alignment data
29
    mapping = cmd.get_raw_alignment('aln')
30
    distances = []
31
    for pair in mapping:
32
        atom1 = '%s and id %d' % (pair[0][0], pair[0][1])
33
        atom2 = '%s and id %d' % (pair[1][0], pair[1][1])
34
        dist = cmd.get_distance(atom1, atom2)
35
        distances.append(dist)
36
    distances = np.array(distances)
37
    gdts = [(distances <= cutoff).mean() for cutoff in cutoffs]
38
    out = np.array(list(zip(cutoffs, gdts))).flatten()
39
    print("GDT_%d: %.4f; GDT_%d: %.4f; GDT_%d: %.4f; GDT_%d: %.4f;" % tuple(out))
40
    GDT = np.mean(gdts)
41
    print("GDT: %.4f" % GDT)
42
    # Color by distance
43
    cmd.spectrum('b', 'green_yellow_red', sel1)
44
    return GDT
45
46 View Code Duplication
@cmd.extend
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated in your project.
Loading history...
47
def gdt_hs(sel1, sel2):
48
    """
49
    Calculate the GDT (Global Distance Test) High Accuracy between two RNA (using P atoms) selections.
50
    Args:
51
        sel1 (str): selection string for the protein to compare.
52
        sel2 (str): selection string for the reference protein.
53
    """
54
    # Select only C-alpha atoms
55
    sel1 += ' and name P'
56
    sel2 += ' and name P'
57
    cutoffs = [0.5, 1, 2, 4]
58
    # Align without transforming for raw alignment data
59
    cmd.align(sel1, sel2, cycles=0, transform=0, object='aln')
60
    # Get raw alignment data
61
    mapping = cmd.get_raw_alignment('aln')
62
    distances = []
63
    for pair in mapping:
64
        atom1 = '%s and id %d' % (pair[0][0], pair[0][1])
65
        atom2 = '%s and id %d' % (pair[1][0], pair[1][1])
66
        dist = cmd.get_distance(atom1, atom2)
67
        distances.append(dist)
68
    distances = np.array(distances)
69
    gdts = [(distances <= cutoff).mean() for cutoff in cutoffs]
70
    out = np.array(list(zip(cutoffs, gdts))).flatten()
71
    print("GDT_HS%d: %.4f; GDT_%d: %.4f; GDT_%d: %.4f; GDT_%d: %.4f;" % tuple(out))
72
    GDT = np.mean(gdts)
73
    print("GDT_HS: %.4f" % GDT)
74
    # Color by distance
75
    cmd.spectrum('b', 'green_yellow_red', sel1)
76
    return GDT
77
78
if __name__ == '__main__':
79
    # Assume the first two objects are the models to compare
80
    model1, reference = cmd.get_object_list('all')[:2]
81
    gdt_gs(model1, reference)
0 ignored issues
show
Comprehensibility Best Practice introduced by
The variable gdt_gs does not seem to be defined.
Loading history...
82