SessionSet.worst_possible_score() - Code Metrics - Inspection of "Improved session attendance estimation" - minnestar/sessionizer - Measure and Improve Code Quality continuously with Scrutinizer

Completed

Push — scheduler-improvements ( dabd67 )

unknown

created 2017-03-17 17:07 UTC

SessionSet.worst_possible_score() A

↳ Parent: SessionSet

Complexity

Conditions

Size

Total Lines

Duplication

Lines	0
Ratio	0 %

Importance

Changes	2
Bugs	0	Features	0

Metric	Value
cc	2
c	2
b	0
f	0
dl	0
loc	4
rs	10

module Scheduling

  # A set of sessions which can be scored against a particular schedule.
  #
  # This can represent either the set of sessions an attendee is interested in seeing,
  # or the set a presenter is presenting. Either way, we want sessions booked in
  # nonoverlapping timeslots.
  #
  class SessionSet
    def initialize(ctx, superset: nil, penalty_callback: ->(*args) { 0 })
      @ctx = ctx
      @superset = superset
      @penalty_callback = penalty_callback

      @sessions = Set.new
    end

    def add(session_id)
      @sessions << session_id
      @superset.add(session_id) if @superset
    end

    # The score if sessions are evenly distributed among all available timeslots.
    #
    def best_possible_score
      return 1 if empty?

      @best_possible_score ||= begin
        slot_count = @ctx.timeslots.size
        even_split_floor = size / slot_count
        big_slots = size % slot_count
        small_slots = slot_count - big_slots
        [
          small_slots * slot_value(even_split_floor)    + 
            big_slots * slot_value(even_split_floor + 1),
          1.0
        ].min
      end
    end

    # The score if everything is in the same timeslot.
    #
    def worst_possible_score
      return 1 if empty?
      1.0 / size
    end

    # The average score of a randomly assigned schedule. This is a good baseline
    # for what the scheduler is trying to improve on — more so than worst_possible_score,
    # which tends to be far worse than any actual schedule would generate.
    #
    def random_schedule_score
      return 1 if empty?

      slot_count = @ctx.timeslots.size

      @@random_schedule_scores ||= {}

      @@random_schedule_scores[[slot_count, size]] ||= begin
        # Someone more knowledgeable than me can probably find
        # a closed form for this. But brute force works!
        trials = 1000
        total = 0
        trials.times do
          counts = [0] * slot_count
          size.times { counts[rand(slot_count)] += 1 }
          total += counts.sum { |c| slot_value(c) }
        end
        total / trials.to_f
      end
    end

    def score(schedule)
      return 1 if empty?
      
      slot_session_count = Hash.new(0)
      penalty = 0

      @sessions.each do |session|
        slot = schedule.slot_id_for(session)
        slot_session_count[slot] += 1
        penalty += @penalty_callback.call(slot)
      end

      satisfaction = slot_session_count.values.sum do |k|
        slot_value(k)
      end

      satisfaction - penalty / size.to_f
    end

    def size
      @sessions.size
    end

    def empty?
      @sessions.empty?
    end

  private

    # Assume each attendee has some ranking of all the sessions in which they expressed interest.
    # (Not an unreasonable assumption.) Further assume that the attendee values their sessions of interest
    # linearly: least interested=1,2,3...N=most interested. (That's more of a stretch, but likely bears
    # some resemblance to reality.)
    #
    # In each timeslot, assume the attendee will choose the session they're most interested in. We don't
    # know attendee's actual ranking, so assume sessions of interest are distributed randomly in the schedule.
    # The expected maximum of a slot in which there are k sessions is v_slot = k(N+1)/(k+1). The total
    # value of all sessions is v_total = N(N+1)/2. The expected fraction of total value in one slot is
    # thus v_slot / v_total = 2k/(N(k+1)). Sum that over all slots to get the attendee's overall satisfaction.
    #
    # A similar logic applies to presenters avoiding double-booking.
    #
    # (The old scheduling algorithm didn't take into account how deep the overlap ran within a given slot,
    # essentially set v_slot = {1/N if any sessions of interest, 0 otherwise}.)
    #
    def slot_value(k)
      2 * k / (size.to_f * (k + 1))
    end

  end
end


1			module Scheduling
2
3			# A set of sessions which can be scored against a particular schedule.
4			#
5			# This can represent either the set of sessions an attendee is interested in seeing,
6			# or the set a presenter is presenting. Either way, we want sessions booked in
7			# nonoverlapping timeslots.
8			#
9			class SessionSet
10			def initialize(ctx, superset: nil, penalty_callback: ->(*args) { 0 })
11			@ctx = ctx
12			@superset = superset
13			@penalty_callback = penalty_callback
14
15			@sessions = Set.new
16			end
17
18			def add(session_id)
19			@sessions << session_id
20			@superset.add(session_id) if @superset
21			end
22
23			# The score if sessions are evenly distributed among all available timeslots.
24			#
25			def best_possible_score
26			return 1 if empty?
27
28			@best_possible_score \|\|= begin
29			slot_count = @ctx.timeslots.size
30			even_split_floor = size / slot_count
31			big_slots = size % slot_count
32			small_slots = slot_count - big_slots
33			[
34			small_slots * slot_value(even_split_floor) +
35			big_slots * slot_value(even_split_floor + 1),
36			1.0
37			].min
38			end
39			end
40
41			# The score if everything is in the same timeslot.
42			#
43			def worst_possible_score
44			return 1 if empty?
45			1.0 / size
46			end
47
48			# The average score of a randomly assigned schedule. This is a good baseline
49			# for what the scheduler is trying to improve on — more so than worst_possible_score,
50			# which tends to be far worse than any actual schedule would generate.
51			#
52			def random_schedule_score
53			return 1 if empty?
54
55			slot_count = @ctx.timeslots.size
56
57			@@random_schedule_scores \|\|= {}
			0 ignored issues – show Comprehensibility Best Practice introduced 2017-03-17 17:09 UTC by Report Bug Copy Issue Report Using a class variable like `@@random_schedule_scores` is generally not recommended; did you consider using an class instance variable instead? Loading history...
58			@@random_schedule_scores[[slot_count, size]] \|\|= begin
59			# Someone more knowledgeable than me can probably find
60			# a closed form for this. But brute force works!
61			trials = 1000
62			total = 0
63			trials.times do
64			counts = [0] * slot_count
65			size.times { counts[rand(slot_count)] += 1 }
66			total += counts.sum { \|c\| slot_value(c) }
67			end
68			total / trials.to_f
69			end
70			end
71
72			def score(schedule)
73			return 1 if empty?
74
75			slot_session_count = Hash.new(0)
76			penalty = 0
77
78			@sessions.each do \|session\|
79			slot = schedule.slot_id_for(session)
80			slot_session_count[slot] += 1
81			penalty += @penalty_callback.call(slot)
82			end
83
84			satisfaction = slot_session_count.values.sum do \|k\|
85			slot_value(k)
86			end
87
88			satisfaction - penalty / size.to_f
89			end
90
91			def size
92			@sessions.size
93			end
94
95			def empty?
96			@sessions.empty?
97			end
98
99			private
100
101			# Assume each attendee has some ranking of all the sessions in which they expressed interest.
102			# (Not an unreasonable assumption.) Further assume that the attendee values their sessions of interest
103			# linearly: least interested=1,2,3...N=most interested. (That's more of a stretch, but likely bears
104			# some resemblance to reality.)
105			#
106			# In each timeslot, assume the attendee will choose the session they're most interested in. We don't
107			# know attendee's actual ranking, so assume sessions of interest are distributed randomly in the schedule.
108			# The expected maximum of a slot in which there are k sessions is v_slot = k(N+1)/(k+1). The total
109			# value of all sessions is v_total = N(N+1)/2. The expected fraction of total value in one slot is
110			# thus v_slot / v_total = 2k/(N(k+1)). Sum that over all slots to get the attendee's overall satisfaction.
111			#
112			# A similar logic applies to presenters avoiding double-booking.
113			#
114			# (The old scheduling algorithm didn't take into account how deep the overlap ran within a given slot,
115			# essentially set v_slot = {1/N if any sessions of interest, 0 otherwise}.)
116			#
117			def slot_value(k)
118			2 * k / (size.to_f * (k + 1))
119			end
120
121			end
122			end
123

minnestar / sessionizer

Push — scheduler-improvements ( dabd67 )

SessionSet.worst_possible_score() A

Complexity

Size

Duplication

Importance

Duplication Side-by-Side

Filter issues like