| Metric | Value |
| --- | --- |
| Conditions | 4 |
| Total Lines | 65 |
| Code Lines | 41 |
| Lines | 0 |
| Ratio | 0 % |
| Changes | 0 |
Small methods make your code easier to understand, especially when combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, that is usually a sign that the commented part should be extracted into a new method, with the comment serving as a starting point for its name.
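A minimal before/after sketch of that idea (the function and the tax rate are made up for illustration; nothing here comes from the code discussed below):

```python
# Before: a comment labels the block that computes the order total.
def handle_order(order):
    # compute total including tax
    total = 0.0
    for item in order.items:
        total += item.price * item.quantity
    total *= 1.24
    return total


# After: the commented block is extracted, and the comment becomes the method name.
def total_including_tax(order):
    return sum(item.price * item.quantity for item in order.items) * 1.24


def handle_order(order):
    return total_including_tax(order)
```

The extracted function is small enough that its name says exactly what it does, so the comment is no longer needed.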
Commonly applied refactorings include:

- Extract Method

If many parameters or temporary variables are present:

- Replace Temp with Query
- Introduce Parameter Object
- Preserve Whole Object
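To give one of these a concrete shape, here is a rough sketch of Introduce Parameter Object (the functions and the `PageLayout` class are hypothetical and unrelated to the method shown below): values that always travel together are bundled into a single object, shortening every signature that used to carry them individually.

```python
from dataclasses import dataclass


# Before: related values are passed around one by one.
def render_page_before(title: str, width: int, height: int, margin: int, font_size: int) -> str:
    return f"{title}: {width}x{height}, margin {margin}, font {font_size}"


# After: the values travel together as one parameter object.
@dataclass
class PageLayout:
    width: int
    height: int
    margin: int
    font_size: int


def render_page(title: str, layout: PageLayout) -> str:
    return f"{title}: {layout.width}x{layout.height}, margin {layout.margin}, font {layout.font_size}"
```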
1 | """Backend utilizing a large-language model.""" |
||
140 | def _llm_suggest_batch( |
||
141 | self, |
||
142 | texts: list[str], |
||
143 | suggestion_batch: SuggestionBatch, |
||
144 | params: dict[str, Any], |
||
145 | ) -> SuggestionBatch: |
||
146 | |||
147 | model = params["model"] |
||
148 | encoding = tiktoken.encoding_for_model(model.rsplit("-", 1)[0]) |
||
149 | max_prompt_tokens = int(params["max_prompt_tokens"]) |
||
150 | |||
151 | system_prompt = """ |
||
152 | You will be given text and a list of keywords to describe it. Your task is |
||
153 | to score the keywords with a value between 0.0 and 1.0. The score value |
||
154 | should depend on how well the keyword represents the text: a perfect |
||
155 | keyword should have score 1.0 and completely unrelated keyword score |
||
156 | 0.0. You must output JSON with keywords as field names and add their scores |
||
157 | as field values. |
||
158 | There must be the same number of objects in the JSON as there are lines in |
||
159 | the intput keyword list; do not skip scoring any keywords. |
||
160 | """ |
||
161 | |||
162 | labels_batch = self._get_labels_batch(suggestion_batch) |
||
163 | |||
164 | llm_batch_suggestions = [] |
||
165 | for text, labels in zip(texts, labels_batch): |
||
166 | prompt = "Here are the keywords:\n" + "\n".join(labels) + "\n" * 3 |
||
167 | text = self._truncate_text(text, encoding, max_prompt_tokens) |
||
168 | prompt += "Here is the text:\n" + text + "\n" |
||
169 | |||
170 | response = self._call_llm( |
||
171 | system_prompt, |
||
172 | prompt, |
||
173 | model, |
||
174 | params, |
||
175 | response_format={"type": "json_object"}, |
||
176 | ) |
||
177 | try: |
||
178 | llm_result = json.loads(response) |
||
179 | except (TypeError, json.decoder.JSONDecodeError) as err: |
||
180 | print(f"Error decoding JSON response from LLM: {response}") |
||
181 | print(f"Error: {err}") |
||
182 | llm_batch_suggestions.append( |
||
183 | [SubjectSuggestion(subject_id=None, score=0.0) for _ in labels] |
||
184 | ) |
||
185 | continue |
||
186 | llm_batch_suggestions.append( |
||
187 | [ |
||
188 | ( |
||
189 | SubjectSuggestion( |
||
190 | subject_id=self.project.subjects.by_label( |
||
191 | llm_label, self.params["labels_language"] |
||
192 | ), |
||
193 | score=score, |
||
194 | ) |
||
195 | if llm_label in labels |
||
196 | else SubjectSuggestion(subject_id=None, score=0.0) |
||
197 | ) |
||
198 | for llm_label, score in llm_result.items() |
||
199 | ] |
||
200 | ) |
||
201 | |||
202 | return SuggestionBatch.from_sequence( |
||
203 | llm_batch_suggestions, |
||
204 | self.project.subjects, |
||
205 | ) |
||
217 |
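Applying Extract Method to this code could look roughly like the sketch below. This is only an illustration: it assumes the surrounding class keeps its existing helpers (`_call_llm`, `_get_labels_batch`, `_truncate_text`) and types (`SubjectSuggestion`, `SuggestionBatch`), and the names `SYSTEM_PROMPT`, `_build_prompt`, and `_response_to_suggestions` are invented for this sketch, not taken from the project.

```python
    def _llm_suggest_batch(
        self,
        texts: list[str],
        suggestion_batch: SuggestionBatch,
        params: dict[str, Any],
    ) -> SuggestionBatch:
        model = params["model"]
        encoding = tiktoken.encoding_for_model(model.rsplit("-", 1)[0])
        max_prompt_tokens = int(params["max_prompt_tokens"])
        labels_batch = self._get_labels_batch(suggestion_batch)

        llm_batch_suggestions = []
        for text, labels in zip(texts, labels_batch):
            prompt = self._build_prompt(text, labels, encoding, max_prompt_tokens)
            response = self._call_llm(
                SYSTEM_PROMPT,  # scoring instructions moved to a module-level constant
                prompt,
                model,
                params,
                response_format={"type": "json_object"},
            )
            llm_batch_suggestions.append(
                self._response_to_suggestions(response, labels)
            )

        return SuggestionBatch.from_sequence(
            llm_batch_suggestions, self.project.subjects
        )

    def _build_prompt(self, text, labels, encoding, max_prompt_tokens):
        """Assemble the per-document prompt (hypothetical helper)."""
        text = self._truncate_text(text, encoding, max_prompt_tokens)
        return (
            "Here are the keywords:\n" + "\n".join(labels) + "\n" * 3
            + "Here is the text:\n" + text + "\n"
        )

    def _response_to_suggestions(self, response, labels):
        """Turn the LLM's JSON reply into suggestions (hypothetical helper)."""
        try:
            llm_result = json.loads(response)
        except (TypeError, json.decoder.JSONDecodeError):
            # Error logging omitted for brevity; fall back to empty suggestions.
            return [SubjectSuggestion(subject_id=None, score=0.0) for _ in labels]
        return [
            SubjectSuggestion(
                subject_id=self.project.subjects.by_label(
                    llm_label, self.params["labels_language"]
                ),
                score=score,
            )
            if llm_label in labels
            else SubjectSuggestion(subject_id=None, score=0.0)
            for llm_label, score in llm_result.items()
        ]
```

Each piece is now short enough to be named after what it does, and the main method reads as a summary of the steps: build a prompt, call the model, convert the response.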