potluck.rubrics
Main classes for defining a rubric, including `Rubric` itself, as well
as `Goal` and `Context`.
rubrics.py
The big picture is that a `Rubric` contains `Goal` objects, which in
turn each get evaluated in one or more `potluck.contexts.Context`s.

The context graph determines what tests actually get run, and supplies
test results, often from both submitted and solution code, in the form
of a context dictionary containing various slots (there is a list of
common slot names in the documentation for `potluck.contexts.Context`).

`Goals` then read those values and make decisions: they are either
accomplished, partially accomplished, or failed in each particular
context that they list, and then they also get an overall status based
on combining the statuses from their evaluation in individual contexts.

One common `Goal` type is the `ComparisonTest`, which compares two
context values, for example one produced from submitted code against
another produced by the same process from solution code.

Another common `Goal` type is the `ImplementationCheck`, which uses the
`mast` module to look for certain patterns in the AST of the submitted
code.

Once a `Rubric`'s goals have been evaluated and have individually been
assigned statuses, the entire rubric gets an overall evaluation via a
metric function, such as `core_extras_categorized_metric`. This process
is handled via the `Rubric.evaluate` method.
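As a minimal sketch of how these pieces fit together (hypothetical: the
`FileExistsGoal` subclass, task ID, identifier, and description below
are invented for illustration; real task specifications typically build
goals through other `potluck` helpers rather than direct subclassing):

    import os

    from potluck import rubrics

    class FileExistsGoal(rubrics.Goal):
        """Hypothetical Goal subclass: passes if the submission exists."""
        def evaluate_in_context(self, context=None):
            path = os.path.join(
                context["submission_root"], context["actual_file"]
            )
            exists = os.path.exists(path)
            self.result = {
                "status": "accomplished" if exists else "failed"
            }
            self.set_explanation(
                context,
                default="Looked for your submission file."
            )
            return self.result

    goal = FileExistsGoal(
        taskid="exampleTask",
        identifier="submission_exists",
        description=(
            "Submit a code file",
            "You must submit a Python file for this task."
        ),
        tags={"category": "core", "goal_type": "other"}
    )

    # Rubric.evaluate would then run this goal (given task info, a
    # username, and a submission target) and summarize the outcome
    # using the chosen metric function:
    rubric = rubrics.Rubric(
        [goal],
        rubrics.foundational_core_extras_metric
    )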
1""" 2Main classes for defining a rubric, including `Rubric` itself, as well as 3`Goal` and `Context`. 4 5rubrics.py 6 7The big picture is that a `Rubric` contains `Goal` objects, which in 8turn each get evaluated in one or more `potluck.contexts.Context`s. 9 10The context graph determines what tests actually get run, and supplies 11test results, often from both submitted and solution code, in the form of 12a context dictionary containing various slots (there is a list of common 13slot names in the documentation for `potluck.contexts.Context`). 14 15`Goals` then read those values and make decisions: they are either 16accomplished, partially accomplished, or failed in each particular 17context that they list, and then they also get an overall status based on 18combining the statuses from their evaluation in individual contexts. 19 20One common `Goal` type is the `ComparisonTest` which compares two 21context values, for example one produced from submitted code against 22another produced by the same process from solution code. 23 24Another common `Goal` type is the `ImplementationCheck`, which uses the 25`mast` module to look for certain patterns in the AST of the submitted 26code. 27 28Once a `Rubric`'s goals have been evaluated and have individually been 29assigned statuses, the entire rubric gets an overall evaluation via a 30metric function, such as `core_extras_categorized_metric`. This process 31is handled via the `Rubric.evaluate` method. 32""" 33 34import os 35import copy 36import math 37import ast 38import traceback 39 40from . import mast 41from . import patterns 42from . import html_tools 43from . import phrasing 44from . import contexts 45from . import context_utils 46from . import logging 47 48# TODO: Import Goal, and Rubric stuff from codder.buoy 49 50 51GOAL_TYPE_RUBRICS = { 52 "style": ( 53 "Style Requirements", 54 "How your code is written.", 55 "Style Requirements", 56 ( 57 "Checks regarding the raw text of your code file and how it" 58 + " is organized stylistically (e.g., how many characters are" 59 + " in a line of code, or how many comments there are)." 60 ) 61 ), 62 "procedure": ( 63 "Procedure Requirements", 64 "What code you use to solve the problem.", 65 "Procedure Requirements", 66 ( 67 "Code checks which require that your code is written in a" 68 + " certain way, regardless of what happens when it runs" 69 + " (e.g., how many lines of code call a certain function)." 70 ), 71 ), 72 "process": ( 73 "Process Requirements", 74 "How your code achieves its results.", 75 "Process Requirements", 76 ( 77 "Process checks which check how your code works by recording" 78 + " each operation that happens in order (e.g., how many times" 79 + " or with what arguments a certain function is called)." 80 ) 81 ), 82 "product": ( 83 "Product Requirements", 84 "Your code's result values.", 85 "Product Requirements", 86 ( 87 "Result tests that run your code and check for the internal" 88 + " result values it produces (e.g., the return value from a" 89 + " specific function)." 90 ) 91 ), 92 "behavior": ( 93 "Behavior Requirements", 94 "What your code does from the user's perspective.", 95 "Behavior Requirements", 96 ( 97 "Behavior tests that run your code and check what inputs it" 98 + " requests from and what outputs it displays to the user" 99 + " (e.g., what is printed given certain typed inputs)." 
100 ) 101 ), 102 "testing": ( 103 "Testing Requirements", 104 "What tests you define for your code and their results.", 105 "Testing Requirements", 106 ( 107 "Expectation tests that look at the test cases you've set up" 108 + " and the expectations you've defined and ensure that you" 109 + " have enough tests and/or that your tests are working." 110 ) 111 ), 112 "other": ( 113 "Other Requirements", 114 "Requirements that don't fall into any other category." 115 ) 116 # TODO: Long explanation with an e.g. part? 117} 118""" 119A dictionary mapping goal types to 4-part descriptions that explain what 120each goal type means, for use in rubric tables that categorize goals by 121type. 122""" 123 124 125#---------------------# 126# The base Goal class # 127#---------------------# 128 129BLANK_RESULT = { 130 "status": "unknown", 131 "explanation": "This goal has not been evaluated." 132} 133""" 134The default blank result value that a goal acquires when first 135constructed or whenever it is reset. 136""" 137 138STATUS_DESCRIPTORS = { 139 "accomplished": "accomplished", 140 "partial": "partially accomplished", 141 "failed": "failed", 142 "not applicable": "not applicable", 143 "unknown": "unknown", 144} 145""" 146Full human-oriented strings for each status string. 147""" 148 149 150class Goal: 151 """ 152 A goal is a line-item on a rubric: something that a submission should 153 accomplish. When evaluated, it updates its 'result' to an evaluation 154 object that has a status (one of "unknown", "accomplished", 155 "partial", "failed", or "not applicable") and an explanation. It also 156 has a dictionary of strings that describe different tags it is labeled 157 with. 158 159 A Goal is able to produce a table of results (and possibly 160 sub-results) via its table method. 161 """ 162 USED_IDS = {} 163 """ 164 A dictionary of identifier values that have been used already, 165 organized into sub-dictionaries indexed by task IDs. 166 """ 167 168 def unique_id(taskid, category, identifier): 169 """ 170 A static method: given a task of interest, a category, and an 171 identifier, keeps track of identifiers provided and returns them 172 as-is, except when a duplicate is provided, in which case it 173 appends a -number suffix to the duplicate to make it unique and 174 returns that. The -number suffixes start at -2 for the second 175 copy; -1 is never used because the first copy doesn't get a 176 suffix added. 177 178 The result is prefixed with 'goal:<category>.' which can also 179 de-duplicate IDs without needing a suffix sometimes. 180 """ 181 task_ids = Goal.USED_IDS.setdefault(taskid, {}) 182 183 full_id = "goal:" + category + '.' + identifier 184 185 seen_before = task_ids.get(full_id, 0) 186 187 if seen_before == 0: # not seen before 188 task_ids[full_id] = 1 189 return full_id 190 else: # was seen before; would be a duplicate 191 task_ids[full_id] += 1 192 result = full_id + '-' + str(seen_before + 1) 193 return result 194 195 def __init__( 196 self, 197 taskid, 198 identifier, 199 description=("BLANK GOAL", "THIS GOAL HAS NOT BEEN DEFINED"), 200 test_in=None, 201 explanations=None, 202 tags=None 203 ): 204 """ 205 You must supply a task ID (a string), an identifier (a string) 206 and a description-tuple: the title of the goal, and a more 207 detailed explanation that will be available if the user requests 208 more information. 
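
        For example, a minimal two-entry description (hypothetical
        values) might be:

            ("Define a hello function",
             "Your code must define a function named hello.")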

        The description-tuple may also include a third and/or fourth
        entry: these will be used instead of the first and second entry
        (respectively) when the Goal is being presented as part of
        graded feedback instead of as part of a blank rubric. This can
        be used to avoid showing exactly which tests are performed when
        the rubric is constructed, but include that information when
        feedback is given.

        Note that if the identifier you supply is already in use within
        the specified task, a numeric suffix will be appended to make it
        unique.

        You may also supply:

        1. A 'test_in' dictionary which has the following keys:
            - 'contexts': A list of Context objects within which
                this goal should be independently tested.
            - 'required': The amount of credit which this goal
                needs to count as accomplished overall. For
                contexts where it evaluates to "accomplished", one
                unit of credit is earned, and contexts where it
                evaluates to "partial" earn 1/2 unit of credit by
                default. The default value is the number of
                contexts supplied, implying conjunctive logic. Set
                this to 1 for disjunctive logic (and set 'strict'
                to True for pure disjunction).
            - 'partial': If present, the amount of credit needed to
                count as partially accomplished. If absent, will
                default to 1/2 the 'required' value. Set to False
                to prevent the goal from being marked as partially
                accomplished.
            - 'count_partial_as': If present, should be a number
                between 0 and 1, which specifies how much credit
                to give for contexts where this goal evaluates as
                partially-accomplished when comparing results
                against required/partial thresholds. Default 0.5.
            - 'strict': If present and truthy, overrides 'partial'
                and 'count_partial_as' to set 'partial' to False
                and 'count_partial_as' to 0.
            - 'warn': If present and truthy, when a context builder
                fails, the failure explanation will be generated as
                a warning instead of as a note (the default).
            - 'fail_without_context': If context creation fails,
                the goal will be marked as failing in that context
                without even being evaluated. True by default.

            During evaluation, this goal will be independently
            evaluated in each provided context, and the aggregate
            results of those evaluations will be used to determine
            the overall status of this goal. Note that context keys
            provided by these Context objects override any context
            keys that may be established and provided by a super-goal
            during testing, and the super-goal context is not made
            available to these Context objects as part of their
            context construction process.

            If no test_in dictionary is provided, the goal is simply
            evaluated in a blank context (or in whatever context its
            parent goal passed down).

        2. An 'explanations' dictionary with some or all of the keys:
            "accomplished", "partial", "failed", and/or "crash"
            (typically used when a goal fails due to an exception). If
            a relevant key exists in this dictionary, the value will
            be used as the explanation for this goal if it has the
            specified outcome. If the value is a function instead of a
            string, it will be given the Goal object, which will
            already include a partial 'result' object, and the
            evaluation context, and the string that it returns will be
            used as an explanation.

        3. A 'tags' dictionary of strings that tag the goal. Some
            tags affect how certain types of goals behave.
        """
        if not isinstance(taskid, str):
            raise TypeError("A Goal's task ID must be a string.")

        self.taskid = taskid

        if not isinstance(identifier, str):
            raise TypeError("A Goal's identifier must be a string.")

        if (
            not isinstance(description, (list, tuple))
            or not 2 <= len(description) <= 4
        ):
            raise ValueError(
                (
                    "The description for a goal must be a 2-to-4-element"
                    " list or tuple (got: {})."
                ).format(repr(description))
            )
        self.description = description
        self.test_in = test_in
        # TODO: Figure this out.
        #if (
        #    not isinstance(self.test_in, (dict))
        #    or any(
        #        not isinstance(x, contexts.Context)
        #        for x in self.test_in["contexts"]
        #    )
        #):
        #    raise ValueError(
        #        "Every item in the test_in 'contexts' slot must be a"
        #        + " Context, and test_in must be a dictionary."
        #    )
        self.explanations = explanations or {}
        self.tags = tags or {}

        # Get a unique ID for this goal
        category = self.tags.get("category", "unknown")
        self.identifier = Goal.unique_id(taskid, category, identifier)

        # Initialize blank result:
        self.reset()

    def __copy__(self):
        """
        `Goal`s may not be copied (there is no good way to do so since
        they're entangled with both each other and with
        `potluck.contexts.Context` objects).
        """
        raise NotImplementedError("Goals may not be copied.")

    def __deepcopy__(self, memo):
        """
        `Goal`s may not be copied (they are entangled with other goals
        and with `potluck.contexts.Context` objects).
        """
        raise NotImplementedError("Goals may not be copied.")

    def description_topic(self):
        """
        Gets the rubric version of this `Goal`'s topic.
        """
        return self.description[0]

    def description_details(self):
        """
        Gets the rubric version of this `Goal`'s details.
        """
        return self.description[1]

    def feedback_topic(self):
        """
        Gets the feedback version of this `Goal`'s topic, or just the
        normal topic if there is no feedback version.
        """
        return self.description[::2][-1]

    def feedback_details(self):
        """
        Gets the feedback version of this `Goal`'s details, or just the
        normal details if there is no feedback version.
        """
        return self.description[1::2][-1]

    def get_goal_type(self):
        """
        Inspects this `Goal`'s tags for a "goal_type" slot and returns
        the associated value, or None if there is no such slot.
        """
        return self.tags.get("goal_type", None)

    def set_default_goal_type(self, default_type):
        """
        Sets the given goal type for this goal (adds a tag), but only
        does so if the goal doesn't already have a goal type tag.
        """
        if "goal_type" not in self.tags:
            self.tags["goal_type"] = default_type

    def reset(self):
        """
        Resets internal state so that the goal can be evaluated again.
        Does not affect internal state of any sub-goals, and does not
        affect cached context values.
        """
        self.result = copy.deepcopy(BLANK_RESULT)

    def reset_network(self):
        """
        Resets our internal state and the states of any sub-goals, but
        does not affect context caches.
        """
        self.reset()
        for goal in self.subgoals():
            goal.reset_network()

    def full_reset(self):
        """
        Does a full reset, including a full reset of subgoals plus
        burning of context caches.
        """
        self.reset()
        for goal in self.subgoals():
            goal.full_reset()
        if self.test_in:
            for ctx in self.test_in["contexts"]:
                ctx.burn_cache()

    def subgoals(self):
        """
        Returns a list of `Goal` objects that are considered subgoals of
        this goal. Different `Goal` classes have different relationships
        to their subgoals, but this method allows other code to discover
        the full tree of goals regardless of those relationships. `Goal`
        classes without subgoals can safely inherit this method, which
        returns an empty list.
        """
        # A base Goal has no subgoals.
        return []

    def evaluate(self, base_context=None):
        """
        Evaluates this goal independently within each of its contexts,
        and produces an overall evaluation that combines explanations
        from each context. If there are no contexts, simply evaluates
        the goal normally.

        A base context is normally required, as otherwise the goal won't
        have access to the submitted code or even basic info about the
        task being evaluated.

        Keeps track of the set of all distinct explanations generated,
        and if there was only a single shared explanation across all
        contexts, it uses that as the final explanation, but if there
        were multiple different explanations, creates a combined
        explanation with sections for the different contexts that had
        different explanations.

        Note: During this process, if the goal ever evaluates to
        "unknown" in one of the contexts, the end result will be
        "unknown" overall regardless of results from other contexts.

        Note: If one of the contexts cannot be created, the goal will
        count as failed in that context (as long as
        'fail_without_context' is left True), and a note will be
        attached to the result. For a
        `potluck.context_utils.ContextCreationError`, the 'warn'
        setting determines whether a warning or a note (the default) is
        recorded; any other error during context creation is recorded
        as a note.
        """
        if not self.test_in or len(self.test_in["contexts"]) == 0:
            # No contexts listed: simply evaluate in base context
            try:
                # this will set self.result
                self.evaluate_in_context(base_context)
            except Exception:
                self.result = {
                    "status": "failed",
                    "warnings": [],
                    "notes": [
                        # generic context creation failure is usually not
                        # warning-worthy. TODO: Sometimes it is!
                        "Context creation failed"
                        + " unexpectedly:<br>\n{}".format(
                            html_tools.html_traceback(
                                title='Error:',
                                linkable=context_utils.linkmap(base_context)
                            )
                        )
                    ]
                }
            return self.result

        else:  # has specific contexts to test in
            credit = 0
            full_count = 0
            partial_count = 0
            notes = []
            warnings = []
            # mapping from explanation strings to lists of status,
            # context pairs:
            explanations = {}
            for i, builder in enumerate(self.test_in["contexts"]):
                # Construct context:
                this_context = copy.copy(base_context)
                # Note: can't deep-copy things like modules

                # Set goal_id and which_context value to provide enough
                # information in the context dictionary to uniquely
                # identify a specific context-building operation.
                this_context["goal_id"] = self.identifier
                this_context["which_context"] = i

                add_failures_to = notes
                if self.test_in.get("warn"):
                    add_failures_to = warnings

                err = None
                try:
                    this_context.update(builder.create(this_context))
                except context_utils.ContextCreationError as e:
                    err = e.explanation()
                    add_failures_to.append(e.explanation())
                except Exception:
                    err = html_tools.html_traceback(
                        title="Unexpected Error:",
                        linkable=context_utils.linkmap(this_context)
                    )
                    notes.append(
                        "Context creation failed unexpectedly:<br>\n"
                        + err
                    )

                # reset this goal and its subgoals, but don't disturb
                # Context caches:
                self.reset_network()

                # evaluate ourselves:
                if (
                    self.test_in.get("fail_without_context", True)
                    and err is not None
                ):
                    res = {
                        "status": "failed",
                        "explanation": (
                            "Failed to establish testing context:<br>\n{}"
                        ).format(err)
                    }
                else:
                    res = self.evaluate_in_context(this_context)

                if res["status"] == "accomplished":
                    credit += 1
                    full_count += 1
                elif res["status"] == "partial":
                    credit += self.test_in.get("count_partial_as", 0.5)
                    partial_count += 1
                elif res["status"] == "unknown":
                    # Immediately abandon evaluation across contexts:
                    return {
                        "status": "unknown",
                        "explanation": (
                            "Unable to evaluate in context:<br>\n{}"
                        ).format(builder.html_topic(in_feedback=True))
                        # TODO: Does this need to be html_context_tree
                        # for disambiguation?
                    }

                # record explanation & status:
                expl = res.get("explanation", "")
                if expl not in explanations:
                    explanations[expl] = []
                explanations[expl].append((res["status"], builder, res))

                # copy notes and warnings
                if "notes" in res:
                    notes.extend(res["notes"])
                if "warnings" in res:
                    warnings.extend(res["warnings"])

            # Compute credit required/partial
            required = self.test_in.get(
                "required",
                len(self.test_in["contexts"])
            )
            partial = self.test_in.get("partial", required / 2)

            # Compute status
            # TODO: Should credit-logic be made visible since it's not
            # always consistent?!?
            status = "failed"
            if credit >= required:
                status = "accomplished"
            elif partial is not False and credit >= partial:
                status = "partial"

            self.result = {
                "status": status,
                "notes": notes,
                "warnings": warnings
            }

            # Combine explanations:
            if len(explanations) == 0:
                # TODO: Should we be bypassing set_explanation here?
                self.result["explanation"] = "THIS SHOULDN'T BE POSSIBLE!"
            elif len(explanations) == 1:
                # Single explanation: don't bother worrying about
                # multiple contexts and statuses:
                # TODO: This logic is bad or hides stuff?
                self.result["explanation"] = list(explanations.keys())[0]
                # In this case pick up extra keys from the result...
                competing = list(explanations.values())[0]
                if len(competing) == 1:
                    sole_result = competing[0][2]
                    for k in sole_result:
                        if k not in self.result:
                            self.result[k] = sole_result[k]
            else:
                # Multiple explanations: mix & group by statuses/contexts
                # TODO: What to do about multiple potentially
                # contradictory custom result keys?

                # Group by status:
                by_status = {}
                for expl in explanations:
                    for status, builder, result in explanations[expl]:
                        if status not in by_status:
                            by_status[status] = []
                        by_status[status].append((expl, builder))

                # Order statuses:
                status_order = ["accomplished", "partial", "failed"]
                for status in by_status:
                    if status not in status_order:
                        status_order.append(status)

                # Build parts of explanation:
                expl_parts = []
                for status in status_order:
                    if status not in by_status:
                        continue
                    expls_and_builders = by_status[status]
                    n_ctx = len(expls_and_builders)
                    if n_ctx == 0:
                        raise ValueError(
                            "Shouldn't have zero explanations!"
                        )
                    elif n_ctx == 1:
                        in_ctx = "in one context"
                    else:
                        in_ctx = "in {} contexts".format(n_ctx)

                    this_expl = html_tools.build_html_details(
                        '{} {}:'.format(
                            STATUS_DESCRIPTORS.get(status, status)
                            .capitalize(),
                            in_ctx
                        ),
                        '<ul class="contextual_explanations">{}</ul>'.format(
                            '\n'.join(
                                (
                                    '<li>In context {}\n'
                                    + '<div class="expl_in_context {}">\n'
                                    + '{}\n{}\n'
                                    + '</div>\n'
                                    + '</li>'
                                ).format(
                                    builder.html_topic(in_feedback=True),
                                    # TODO: Does this need to be
                                    # html_context_tree for
                                    # disambiguation?
                                    html_tools.status_css_class(status),
                                    html_tools.build_status_indicator(
                                        status
                                    ),
                                    expl
                                )
                                for expl, builder in expls_and_builders
                            )
                        )
                    )
                    expl_parts.append((status, this_expl))

                # Combine parts into one explanation:
                rstatus = self.result["status"]
                rsdesc = STATUS_DESCRIPTORS.get(rstatus, rstatus)
                if rstatus == "accomplished":
                    if full_count >= required:
                        self.result["explanation"] = "{} (in {})".format(
                            rsdesc,
                            phrasing.obj_num(full_count, "context")
                        )
                    else:
                        self.result["explanation"] = (
                            "{} (in {} and partially accomplished in {})"
                        ).format(
                            rsdesc,
                            phrasing.obj_num(full_count, "context"),
                            phrasing.obj_num(partial_count, "context")
                        )
                else:
                    if full_count > 0:
                        if partial_count > 0:
                            self.result["explanation"] = (
                                "{} (accomplished in {};"
                                " partially accomplished in {})"
                            ).format(
                                rsdesc.capitalize(),
                                phrasing.obj_num(full_count, "context"),
                                phrasing.obj_num(partial_count, "context")
                            )
                        else:
                            self.result["explanation"] = (
                                "{} (accomplished in {})"
                            ).format(
                                rsdesc.capitalize(),
                                phrasing.obj_num(full_count, "context")
                            )
                    else:
                        if partial_count > 0:
                            self.result["explanation"] = (
                                "{} (partially accomplished in {})"
                            ).format(
                                rsdesc.capitalize(),
                                phrasing.obj_num(partial_count, "context")
                            )
                        else:
                            self.result["explanation"] = (
                                "{} (not accomplished in any contexts)"
                            ).format(rsdesc.capitalize())

                # Append parts describing success/failure in different
                # contexts:
                self.result["explanation"] += "<br>\n".join(
                    '<div class="expl_part {}">{}</div>'.format(
                        html_tools.status_css_class(status),
                        part
                    )
                    for status, part in expl_parts
                )

            # Return our result:
            return self.result

    def evaluate_in_context(self, context=None):
        """
        The evaluate_in_context method of a Goal subclass should update
        its 'result' value and return that new value. The result value
        must be a dictionary with keys 'status' and 'explanation', where
        the 'status' is one of the strings "unknown", "accomplished",
        "partial", "failed", or "not applicable", and the 'explanation'
        value is a (possibly-HTML) string. The result dictionary may
        also optionally include a list of notes and/or a list of
        warnings, which are HTML strings.

        The evaluate_in_context method does not need to worry about a
        goal's test_in value or the associated Context objects:
        `evaluate` takes care of constructing the context dictionaries
        that get passed to this method, so it should simply evaluate
        this goal within the given context. Typical context keys are
        explained in the documentation for the
        `potluck.contexts.Context` class.
        """
        raise NotImplementedError("Cannot evaluate base Goal object!")

    def table(self, blank=False):
        """
        Creates a table report for this goal. The table includes a list
        of rows, where each row contains a result dictionary, with a
        "description" key including this goal's description and an
        optional extra "subtable" key containing a sub-table of
        additional results. The "notes" and "warnings" entries will
        always be lists, and will be empty if there were no such keys
        (or their values were explicitly None). The following keys are
        canonical:

        - 'id': This goal's unique ID (see `Goal.unique_id`). May be
            absent on some rows representing groups of goals rather
            than individual goals.
        - 'description': A pair of strings describing this goal.
        - 'tags': A dictionary of the tags for this goal.
        - 'status': The goal's status.
        - 'explanation': An explanation for the goal's success or
            failure.
        - 'notes': A list of strings describing additional feedback for
            this goal.
        - 'warnings': A list of strings describing any warnings that
            arose during the evaluation of this goal.
        - 'subtable': A list of table rows from sub-goals.

        If "blank" is given as True, the BLANK_RESULT will be used as
        the basis instead of this goal's current result, so there will
        be no notes or warnings, and the status will be "unknown".
        """
        if blank:
            row = copy.deepcopy(BLANK_RESULT)
        else:
            row = copy.deepcopy(self.result)
        row["notes"] = self.result.get("notes") or []
        row["warnings"] = self.result.get("warnings") or []
        row["id"] = self.identifier
        row["description"] = list(self.description[:])
        row["tags"] = copy.copy(self.tags)
        row["subtable"] = []
        return [ row ]

    def set_explanation(
        self,
        context,
        status=None,
        default="",
        specific_context=True
    ):
        """
        Implements the explanations logic, where if self.explanations
        contains an appropriate key, the string or function value for
        that key is used to provide an explanation, and otherwise the
        given default explanation is used. If no status string is
        given, self.result["status"] is used as the key.

        For cross-context final evaluation, this function is not used,
        and explanation-overrides are ignored.
        TODO: Really that?

        The resulting explanation string is inserted into self.result
        under the "explanation" key, in addition to being returned.
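
        For example, given a (hypothetical) explanations dictionary
        like {"failed": "Check the name of your function."}, a goal
        whose status comes out as "failed" will report that string
        instead of the default explanation computed by the evaluation
        code.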
801 """ 802 status = status or self.result["status"] 803 expl = self.explanations.get(status, default) 804 if isinstance(expl, type(lambda x: x)): 805 expl = expl(self, context) 806 807 self.result["explanation"] = expl 808 return expl 809 810 811#------------------# 812# The Rubric class # 813#------------------# 814 815class Rubric: 816 """ 817 A rubric has a list of goals, and a method for determining overall 818 performance based on the evaluation of each individual goal. It may 819 also have a separate list of validation goals to be tested during the 820 validation step (e.g., goals that a certain number of tests should be 821 defined; see `potluck.validation`). 822 """ 823 def __init__( 824 self, 825 evaluation_goals, 826 performance_metric, 827 validation_goals=None, 828 spec_file=None 829 ): 830 """ 831 Sets up the rubric with a list of goals to be evaluated, and a 832 performance metric function that accepts a list of evaluated 833 goals and returns a performance report object. 834 835 A filename for the specification the rubric was loaded from may 836 be provided, in which case certain tracebacks within output may 837 be rewritten to abbreviate that filename. 838 """ 839 self.evaluation_goals = evaluation_goals 840 self.validation_goals = validation_goals or [] 841 self.metric = performance_metric 842 self.spec_file = spec_file 843 844 def all_contexts(self, goals): 845 """ 846 Crawls the provided list of goals and their subgoals to find all 847 relevant `potluck.contexts.Context` objects that might possibly 848 be used by evaluation tests in this rubric. Returns a list in 849 breadth-first traversal order of this rubric's goals, their 850 contexts, and those contexts' dependencies. 851 """ 852 # Map of object IDs 853 idmap = {} 854 855 queue = goals[:] 856 while queue: 857 # pop first 858 first = queue.pop(0) 859 860 # Process a Goal object (queue subgoals and contexts) 861 if isinstance(first, Goal): 862 queue.extend(first.subgoals()) 863 864 # Add associated contexts to our queue 865 if first.test_in: 866 queue.extend(first.test_in.get("contexts", [])) 867 868 # Process a Context object (accumulate and queue dependencies) 869 elif isinstance(first, contexts.Context): 870 queue.extend(first.depends) 871 872 # Add novel contexts to our idmap 873 if id(first) not in idmap: 874 idmap[id(first)] = first 875 queue.extend(first.depends) 876 877 result = list(idmap.values()) 878 879 return result 880 881 # TODO: HERE 882 def create_contexts_list(self, goals, base_context=None): 883 """ 884 Returns a list of context summary dictionaries describing all of 885 the contexts used by goals in the given goals list. It has the 886 same format as returned by 887 `potluck.contexts.list_and_render_contexts`. 888 889 A base context object is necessary to generate context values; 890 if no base context is given then context slots will not include 891 values and will use their redacted topics and details. 892 """ 893 clist = self.all_contexts(goals) 894 if self.spec_file: 895 html_tools.set_tb_rewrite( 896 self.spec_file, 897 "<task specification>" 898 ) 899 900 # Ensure that duplicate topics are distinguished 901 contexts.add_context_numbering(clist) 902 903 cgraph = contexts.build_context_graph(clist) 904 905 if len(clist) == 0: 906 return [] 907 908 return contexts.list_and_render_contexts(cgraph, base_context) 909 910 def create_blank_report(self, task_info): 911 """ 912 Creates a blank report for this rubric that simply shows what the 913 goals and contexts are. 
This function will erase any existing 914 results associated with rubric goals. 915 916 It uses False as the in_feedback value, so included context 917 descriptions will be obfuscated. 918 919 The returned report is a dictionary with the following keys: 920 921 - taskid: The task ID (from the given taskspec) 922 - evaluation: The string 'unknown' 923 - warnings: An empty list 924 - summary: A description of the task that this rubric belongs to. 925 - table: A table (in the format returned by `Goal.table`) detailing 926 each goal and subgoal. 927 - contexts: A list of context summary dictionaries in the format 928 returned by `potluck.contexts.list_and_render_contexts`, 929 which summarizes all contexts used by this rubric. 930 """ 931 # Empty report: 932 report = { 933 "taskid": task_info["id"], 934 "evaluation": "unknown", 935 "warnings": [], 936 "summary": f"Rubric for {task_info['id']}.", 937 "table": [], 938 "contexts": self.create_contexts_list(self.evaluation_goals) 939 } 940 941 # Reset our goals: 942 for g in self.evaluation_goals: 943 g.reset_network() 944 945 # Just in case set up a rewrite rule for the spec file 946 if self.spec_file: 947 html_tools.set_tb_rewrite( 948 self.spec_file, 949 "<task specification>" 950 ) 951 952 # Run metric over un-evaluated goals and ask for a blank result: 953 metric_result = self.metric(self.evaluation_goals, blank=True) 954 955 # Integrate results into our report: 956 report["evaluation"] = metric_result["evaluation"] 957 report["summary"] = metric_result["summary"] 958 report["table"] = metric_result["table"] 959 report["warnings"].extend(metric_result["warnings"]) 960 961 return report 962 963 def create_blank_validation_report(self, task_info): 964 """ 965 Creates a blank validation report for this rubric that simply 966 shows what the validation goals and contexts are. Just like 967 `Rubric.create_blank_report`, this function will erase any 968 existing results associated with validation rubric goals. 969 970 It uses False as the in_feedback value, so included context 971 descriptions will be obfuscated. 972 973 The result has the same keys as `Rubric.create_blank_report` 974 does. 975 """ 976 # Empty report: 977 report = { 978 "taskid": task_info["id"], 979 "evaluation": "unknown", 980 "warnings": [], 981 "summary": f"Validation rubric for {task_info['id']}.", 982 "table": [], 983 "contexts": self.create_contexts_list(self.validation_goals) 984 } 985 986 # Reset our goals: 987 for g in self.validation_goals: 988 g.reset_network() 989 990 # Just in case set up a rewrite rule for the spec file 991 if self.spec_file: 992 html_tools.set_tb_rewrite( 993 self.spec_file, 994 "<task specification>" 995 ) 996 997 # Run metric over un-evaluated goals and ask for a blank result: 998 metric_result = self.metric(self.validation_goals, blank=True) 999 1000 # Integrate results into our report: 1001 report["evaluation"] = metric_result["evaluation"] 1002 report["summary"] = metric_result["summary"] 1003 report["table"] = metric_result["table"] 1004 report["warnings"].extend(metric_result["warnings"]) 1005 1006 return report 1007 1008 def evaluate(self, task_info, username, submission_target): 1009 """ 1010 Evaluates this rubric based on the given submitted task (the 1011 task_info includes generic info about the task, the username 1012 identifies who submitted it, and the submission_target 1013 identifies the file or folder to be evaluated). 
1014 1015 See `tasks.json` for the task info format (it's a dictionary 1016 stored in the "tasks" slot under its taskid as a key). 1017 1018 Returns a report object that has information about which goal(s) 1019 from the rubric passed or failed, and the overall performance as 1020 determined by the rubric's metric. 1021 1022 If submitted code cannot be loaded due to a syntax error or 1023 parsing fails for some other reason, the report will mention 1024 that in as much detail as it can, and the normal rubric items 1025 will be skipped. 1026 1027 Note: This function completely resets all evaluation goals and 1028 clears the caches of any associated `potluck.contexts.Context` 1029 objects before it starts evaluating goals. 1030 1031 The returned report dictionary has the following keys: 1032 1033 - taskid: The task ID (from the given taskspec) 1034 - evaluation: A string summarizing the performance on the entire 1035 task (from the metric function). 1036 - summary: An HTML string summarizing performance on the task 1037 (from the metric function). 1038 - files: A list of dictionaries with 'filename' and 'code' slots 1039 containing the file names and raw code text of the submitted 1040 file(s). 1041 - warnings: A list of warnings (from the metric function plus a 1042 few custom warnings if things are seriously wrong). 1043 - table: A table (in the format returned by `Goal.table`) detailing 1044 each goal and subgoal (from the metric function). 1045 - contexts: A list of context summary dictionaries in the format 1046 returned by `potluck.contexts.list_and_render_contexts`, 1047 which summarizes all contexts used by this rubric (see 1048 `Rubric.create_contexts_list`). 1049 - TODO: Add a partner_username field here? 1050 """ 1051 # Empty report: 1052 report = { 1053 "taskid": task_info["id"], 1054 "evaluation": "unknown", 1055 "summary": "No summary has been generated.", 1056 "files": [], 1057 "warnings": [], 1058 "table": [], 1059 "contexts": [] 1060 } 1061 1062 # Set up a rewrite rule for the spec file 1063 if self.spec_file: 1064 html_tools.set_tb_rewrite( 1065 self.spec_file, 1066 "<task specification>" 1067 ) 1068 1069 # Check for a missing submission: 1070 if not os.path.exists(submission_target): 1071 report["warnings"] = [ 1072 "You did not submit any code for this task." 1073 ] 1074 report["evaluation"] = "incomplete" 1075 report["summary"] = "You did not submit any code for this task." 1076 # Early return: no need to grade rubric items 1077 return report 1078 1079 # Check for accidental submission of the starter file: 1080 if os.path.isfile(submission_target): 1081 with open(submission_target, 'r', encoding="utf-8") as fin: 1082 submitted_code = fin.read() 1083 if submitted_code == task_info["specification"].starter_src: 1084 report["warnings"] = [ 1085 "You submitted the starter file without any" 1086 " changes (you probably submitted the wrong file?)." 1087 ] 1088 report["evaluation"] = "incomplete" 1089 report["summary"] = ( 1090 "You submitted an unchanged starter file." 
1091 ) 1092 1093 # Reset each goal + any associated contexts: 1094 for g in self.evaluation_goals: 1095 g.full_reset() 1096 1097 # Ensure context descriptions are unique: 1098 clist = self.all_contexts(self.evaluation_goals) 1099 contexts.add_context_numbering(clist) 1100 1101 # Create our base context: 1102 if os.path.isdir(submission_target): 1103 submission_root = submission_target 1104 default_file = task_info["target"] 1105 actual_file = default_file 1106 else: 1107 submission_root, actual_file = os.path.split(submission_target) 1108 default_file = task_info["target"] 1109 base_context = { 1110 "task_info": task_info, 1111 "username": username, 1112 "submission_root": submission_root, 1113 "default_file": default_file, 1114 "actual_file": actual_file 1115 } 1116 1117 if len(self.evaluation_goals) == 0: 1118 raise ValueError("Rubric does not have any goals!") 1119 1120 # Evaluate each goal: 1121 for g in self.evaluation_goals: 1122 logging.debug_msg( 1123 "Evaluating goal '{}' @ {}...".format( 1124 g.feedback_topic(), 1125 id(g) 1126 ) 1127 ) 1128 # Task is automatically made available as part of context. 1129 result = g.evaluate(base_context) 1130 logging.debug_msg("...result is: {}".format(result)) 1131 logging.debug_msg("...review result is: {}".format(g.result)) 1132 1133 # Double-check that the goal correctly stored the value it 1134 # returned 1135 if result != g.result: 1136 logging.debug_msg( 1137 f"WARNING: Goal's returned result differs from" 1138 f" stored result!\nGoal:" 1139 f" '{g.feedback_topic()}'\nReturned:" 1140 f" {result}\nStored: {g.result}" 1141 ) 1142 1143 # Run our metric over the evaluated goals: 1144 metric_result = self.metric(self.evaluation_goals) 1145 1146 # Integrate results into our report: 1147 report["evaluation"] = metric_result["evaluation"] 1148 report["summary"] = metric_result["summary"] 1149 report["table"] = metric_result["table"] 1150 report["warnings"].extend(metric_result["warnings"]) 1151 1152 # Build our contexts list now that contexts should be caching the 1153 # same values used during testing: 1154 report["contexts"] = self.create_contexts_list( 1155 self.evaluation_goals, 1156 base_context 1157 ) 1158 1159 # Elevate warnings from contexts to the main warnings list. 1160 for creport in report["contexts"]: 1161 report["warnings"].extend(creport.get("warnings", [])) 1162 1163 # Build our files dictionary based on FileContext objects. It 1164 # maps file names to dictionaries with "path" slots (and possibly 1165 # more if we can dig up more info). 
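        # For example, after the code below runs, all_filenames might
        # map (hypothetical names) "exercise.py" to
        # {"path": "/submissions/user/exercise.py", "source": "..."}.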
        all_filenames = {
            base_context["default_file"]: {
                "path": os.path.abspath(
                    os.path.join(
                        base_context["submission_root"],
                        base_context["actual_file"]
                    )
                )
            }
        }
        for ctx in clist:
            if isinstance(ctx, contexts.FileContext):
                if ctx.target_file is not None:
                    ctx_result = ctx.create(base_context)
                    name = ctx_result.get("filename", ctx.target_file)
                    path = ctx_result.get("file_path", name)
                    if name not in all_filenames:
                        all_filenames[name] = { "path": path }

        # Look for code contexts which have handled parsing on target
        # files, and add "source" and possibly "original_source" slots
        for ctx in clist:
            if isinstance(ctx, contexts.CodeContext):
                ctx_result = ctx.create(base_context)
                if "filename" in ctx_result:
                    name = ctx_result["filename"]
                    original = ctx_result["original_source"]
                    fixed = ctx_result["source"]
                    all_filenames[name]["source"] = fixed
                    if original != fixed:
                        all_filenames[name]["original_source"] = original
                # Otherwise we assume there was some kind of error

        # Grab file contents if we haven't already
        for filename in all_filenames:
            file_info = all_filenames[filename]
            entry = {
                "filename": filename,
                "path": file_info["path"]
            }
            report["files"].append(entry)
            if "source" in file_info:
                entry["code"] = file_info["source"]
            else:
                with open(entry["path"], 'r', encoding="utf-8") as fin:
                    if entry["path"].endswith(".py"):
                        entry["code"] = fin.read()
                    else:
                        entry["raw"] = fin.read()

            if "original_source" in file_info:
                entry["original_code"] = file_info["original_source"]

        return report

    def validate(self, task_info, username, tests_target, target):
        """
        Validates tests for this task based on the given submitted tests
        file and submission file (the task_info includes generic info
        about the task, the username identifies who submitted it, the
        tests_target identifies the file or folder to be evaluated, and
        the target identifies the base task file or folder to run tests
        against).

        See `tasks.json` for the task info format (it's a dictionary
        stored in the "tasks" slot under its taskid as a key).

        Returns a report object that has information about which
        validation goal(s) from the rubric passed or failed, and the
        overall performance as determined by the rubric's metric.

        If submitted tests cannot be loaded due to a syntax error or
        parsing fails for some other reason, the report will mention
        that in as much detail as it can, and the normal rubric items
        will be skipped.

        Note: This function completely resets all validation goals and
        clears the caches of any associated `potluck.contexts.Context`
        objects before it starts evaluating goals.

        TODO: We mostly effectively ignore the `target` argument because
        we grab the solution (see `contexts.TestsFileContext`). Get rid
        of it?

        The returned report dictionary has the same keys/values as the
        result from `Rubric.evaluate`.
        """
        # Empty report:
        report = {
            "taskid": task_info["id"],
            "evaluation": "unknown",
            "summary": "No summary has been generated.",
            "files": [],
            "warnings": [],
            "table": [],
            "contexts": []
        }

        # Set up a rewrite rule for the spec file
        if self.spec_file:
            html_tools.set_tb_rewrite(
                self.spec_file,
                "<task specification>"
            )

        # Check for a missing tests submission:
        if not os.path.exists(tests_target):
            report["warnings"] = [
                "You did not submit any tests for this task."
            ]
            report["evaluation"] = "incomplete"
            report["summary"] = (
                "You did not submit any tests for this task."
            )
            # Early return: no need to grade rubric items
            return report

        # Check for missing code to test against:
        if not os.path.exists(target):
            report["warnings"] = [
                "We did not find the code to test."
            ]
            report["evaluation"] = "incomplete"
            report["summary"] = "We did not find the code to test."
            # Early return: no need to grade rubric items
            return report

        # Reset each goal + any associated contexts:
        for g in self.validation_goals:
            g.full_reset()

        # Ensure context descriptions are unique:
        clist = self.all_contexts(self.validation_goals)
        contexts.add_context_numbering(clist)

        # Figure out whether the tests target is a directory or a file
        if os.path.isdir(tests_target):
            submission_root = tests_target
            default_file = task_info.get(
                "tests_target",
                "test_" + task_info["target"]
            )
            actual_file = default_file
        else:
            submission_root, actual_file = os.path.split(tests_target)
            default_file = task_info.get(
                "tests_target",
                "test_" + task_info["target"]
            )

        # Figure out whether the submission target is a directory or a
        # file
        if os.path.isdir(target):
            target_root = target
            target_default_file = task_info["target"]
            target_actual_file = target_default_file
        else:
            target_root, target_actual_file = os.path.split(target)
            target_default_file = task_info["target"]

        # Create our base context:
        base_context = {
            "task_info": task_info,
            "username": username,
            "submission_root": target_root,
            "default_file": target_default_file,
            "actual_file": target_actual_file,
            "tests_submission_root": submission_root,
            "default_tests_file": default_file,
            "actual_tests_file": actual_file
        }

        if len(self.validation_goals) == 0:
            raise ValueError("Rubric does not have any validation goals!")

        # Evaluate each goal:
        for g in self.validation_goals:
            logging.debug_msg(
                "Evaluating validation goal '{}' @ {}...".format(
                    g.feedback_topic(),
                    id(g)
                )
            )
            # Task is automatically made available as part of context.
            result = g.evaluate(base_context)
            logging.debug_msg("...result is: {}".format(result))
            logging.debug_msg("...review result is: {}".format(g.result))

            # Double-check that the goal correctly stored the value it
            # returned
            if result != g.result:
                logging.debug_msg(
                    f"WARNING: Validation goal's returned result differs"
                    f" from stored result!\nGoal:"
                    f" '{g.feedback_topic()}'\nReturned:"
                    f" {result}\nStored: {g.result}"
                )

        # Run our metric over the evaluated goals:
        # TODO: Allow/require separate validation metrics?
        metric_result = self.metric(self.validation_goals)

        # Integrate results into our report:
        report["evaluation"] = metric_result["evaluation"]
        report["summary"] = metric_result["summary"]
        report["table"] = metric_result["table"]
        report["warnings"].extend(metric_result["warnings"])

        # Build our contexts list now that contexts should be caching
        # the same values used during testing:
        report["contexts"] = self.create_contexts_list(
            self.validation_goals,
            base_context
        )

        # Elevate warnings from contexts to the main warnings list.
        for creport in report["contexts"]:
            report["warnings"].extend(creport.get("warnings", []))

        # Build our files dictionary based on TestsFileContext objects.
        # It maps file names to dictionaries with "path" slots (and
        # possibly more if we can dig up more info).
        all_filenames = {
            base_context["default_tests_file"]: {
                "path": os.path.abspath(
                    os.path.join(
                        base_context["submission_root"],
                        base_context["actual_tests_file"]
                    )
                )
            }
        }
        for ctx in clist:
            if isinstance(ctx, contexts.TestsFileContext):
                if ctx.target_tests_file is not None:
                    ctx_result = ctx.create(base_context)
                    name = ctx_result.get(
                        "tests_filename",
                        ctx.target_tests_file
                    )
                    path = ctx_result.get("tests_file_path", name)
                    if name not in all_filenames:
                        all_filenames[name] = { "path": path }

        # Look for code contexts which have handled parsing on target
        # files, and add "source" and possibly "original_source" slots
        for ctx in clist:
            if isinstance(ctx, contexts.CodeContext):
                ctx_result = ctx.create(base_context)
                if "tests_filename" in ctx_result:
                    name = ctx_result["tests_filename"]
                    original = ctx_result["original_tests_source"]
                    fixed = ctx_result["tests_source"]
                    all_filenames[name]["source"] = fixed
                    if original != fixed:
                        all_filenames[name]["original_source"] = original
                # Otherwise we assume there was some kind of error

        # Grab file contents if we haven't already
        for filename in all_filenames:
            file_info = all_filenames[filename]
            entry = {
                "filename": filename,
                "path": file_info["path"]
            }
            report["files"].append(entry)
            if "source" in file_info:
                entry["code"] = file_info["source"]
            else:
                with open(entry["path"], 'r', encoding="utf-8") as fin:
                    if entry["path"].endswith(".py"):
                        entry["code"] = fin.read()
                    else:
                        entry["raw"] = fin.read()

            if "original_source" in file_info:
                entry["original_code"] = file_info["original_source"]

        return report

    def goals_by_id(self, fragment):
        """
        Retrieves one or more of the goals from this rubric according to
        its identifier. Note that it's possible for multiple goals to
        share the same identifier (only when rendered into HTML do they
        get suffixes to make them unique), so this function always
        returns a list of goals, which is likely to be length-1. Of
        course, an empty list is returned if no goals have the given
        ID. Any goal whose identifier contains the provided string will
        be included in the goals returned, although '^^^' will be added
        to the front and '$$$' to the end when checking this, so you
        can use those in your fragment; neither string normally appears
        inside of non-custom identifiers.
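
        For example, assuming a (hypothetical) goal whose full
        identifier is "goal:core.check_hello", both of these calls
        would retrieve it:

            rubric.goals_by_id("goal:core.check_hello")
            rubric.goals_by_id("^^^goal:core.check_hello$$$")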
1456 """ 1457 # TODO: Prefix these for evaluation/validation? 1458 return [ 1459 g 1460 for g in self.evaluation_goals + self.validation_goals 1461 if fragment in ('^^^' + g.identifier + '$$$') 1462 ] 1463 1464 1465#----------------------------------------------------------------------# 1466# Performance metrics create evaluation, summary, and table from goals # 1467#----------------------------------------------------------------------# 1468 1469def overall_evaluation(foundational, core, extra): 1470 """ 1471 Given lists of evaluated foundational, core, and extra goals, returns 1472 a pair containing an overall evaluation string and a summary string 1473 based on the following rules. Treating each core goal as 1 point, 1474 with 1/2 point for partial accomplishment, the metric computes a 1475 point total for core goals and then: 1476 1477 - If a score of at least 1/2 of the number of core goals is met, and 1478 all of the foundational goals are accomplished, the overall 1479 evaluation is "partially complete". 1480 - Depending on the number of core goals, a completeness point 1481 threshold is established (TODO: more principled than this?): 1482 - If there is only 1 core goal, the threshold is 1 (it's impossible 1483 to score 'almost complete' in this scenario). 1484 - Otherwise, the threshold is the number of core goals minus a 1485 fudge factor of 10% rounded up to the nearest 0.5. In other 1486 words, for 2-5 core goals the fudge factor is 0.5, for 5-10 it's 1487 1, for 11-15 it's 1.5, for 16-20 it's 2, etc. 1488 - If at least one core goal is not fully accomplished, but the core 1489 point total is equal to or greater than the completeness point 1490 threshold, then the overall evaluation is "almost complete". 1491 - If all of the core goals are fully accomplished, but at least one 1492 extra goal is not fully accomplished, the evaluation is "complete". 1493 - If all of the core goals and all of the extra goals are 1494 accomplished, the overall evaluation is "excellent". 1495 - If either at least one foundational goal is failed, or the score 1496 for core goals is less than 1/2 of the number of core goals, the 1497 evaluation is "incomplete". 
1498 """ 1499 # Check foundational goals 1500 failed_foundational = [] 1501 for g in foundational: 1502 logging.debug_msg( 1503 "Reviewing foundational goal '{}' @ {}...".format( 1504 g.feedback_topic(), 1505 id(g) 1506 ) 1507 ) 1508 logging.debug_msg("...result is: {}".format(g.result)) 1509 if g.result["status"] not in ("accomplished", "partial"): 1510 failed_foundational.append(g) 1511 1512 # Check core goals 1513 core_score = 0 1514 core_accomplished = [] 1515 core_partial = [] 1516 for g in core: 1517 logging.debug_msg( 1518 "Reviewing core goal '{}' @ {}...".format( 1519 g.feedback_topic, 1520 id(g) 1521 ) 1522 ) 1523 logging.debug_msg("...result is: {}".format(g.result)) 1524 if g.result["status"] == "accomplished": 1525 core_score += 1 1526 core_accomplished.append(g) 1527 elif g.result["status"] == "partial": 1528 core_score += 0.5 1529 core_partial.append(g) 1530 1531 # Nicer repr 1532 if int(core_score) == core_score: 1533 core_score = int(core_score) 1534 1535 # Check extra goals 1536 extra_unaccomplished = [] 1537 for g in extra: 1538 logging.debug_msg( 1539 "Reviewing extra goal '{}' @ {}...".format( 1540 g.feedback_topic(), 1541 id(g) 1542 ) 1543 ) 1544 logging.debug_msg("...result is: {}".format(g.result)) 1545 if g.result["status"] != "accomplished": 1546 extra_unaccomplished.append(g) 1547 1548 # Feedback for core and extra goals: 1549 if len(core) < 2: 1550 core_threshold = len(core) 1551 else: 1552 core_threshold = len(core) - (math.ceil(0.2 * len(core)) / 2) - 0.01 1553 # the 0.01 is extra careful of rounding errors 1554 1555 if core_score == len(core): 1556 # Perfect core score -> 'complete' or 'excellent' overall 1557 if len(extra_unaccomplished) == 0: 1558 # Extras all accomplished + core all accomplished -> 'excellent' 1559 return ( 1560 "excellent", 1561 "<p>You accomplished all core and extra goals. Great job!</p>" 1562 ) 1563 else: 1564 return ( 1565 "complete", 1566 ( 1567 "<p>You accomplished the core goals. Good job!</p>" 1568 "<p>You accomplished" 1569 f" {len(extra) - len(extra_unaccomplished)} of" 1570 f" {len(extra)} extra goals.</p>" 1571 ) 1572 ) 1573 1574 elif core_score >= core_threshold: 1575 # Close-enough core score: "almost complete" 1576 return ( 1577 "almost complete", 1578 ( 1579 f"<p>You accomplished {core_score} (nearly all) of the" 1580 f" {len(core)} core goals.</p>" 1581 ) 1582 ) 1583 1584 else: 1585 # Not even close-enough 1586 half = len(core) * 0.5 1587 if half == int(half): # Nicer repr 1588 half = int(half) 1589 if core_score >= half: 1590 return ( 1591 "partially complete", 1592 ( 1593 f"<p>You accomplished {core_score} (which is at" 1594 f" least half) of the {len(core)} core goals.</p>" 1595 ) 1596 ) 1597 else: 1598 return ( 1599 "incomplete", 1600 ( 1601 f"<p>You accomplished only {core_score} (which is" 1602 f" less than half) of the {len(core)} core goals.</p>" 1603 ) 1604 ) 1605 1606 1607def summarize_category_row( 1608 row, 1609 goals, 1610 all_or_nothing=False, 1611 half_matters=False, 1612 blank=False 1613): 1614 """ 1615 Given a table row and a list of goals, adds "status" and 1616 "explanation" entries to the given row based on whether some, 1617 more/less than half, and/or all of the goals in the list were 1618 accomplished. 1619 1620 If all_or_nothing is given as True, then the status will always be 1621 either "accomplished" if all goals were, or "failed" if at least one 1622 wasn't (even if it was partial). 

    If half_matters is given as True, a note about whether or not at
    least half of the goals were accomplished will be added, counting
    partially-accomplished goals as 1/2 point.

    This function modifies the provided row and doesn't return
    anything.

    If blank is set to True, the status/explanation are set according
    to BLANK_RESULT.
    """
    if blank:
        row["status"] = BLANK_RESULT["status"]
        row["explanation"] = BLANK_RESULT["explanation"]
        return

    accomplished = len(
        [g for g in goals if g.result["status"] == "accomplished"]
    )
    partial = len(
        [g for g in goals if g.result["status"] == "partial"]
    )
    count = len(goals)

    if accomplished == count:
        row["status"] = "accomplished"
        row["explanation"] = "Accomplished all {} {}.".format(
            count,
            phrasing.plural(count, "goal")
        )
    else:
        points = accomplished + 0.5 * partial
        if all_or_nothing:
            row["status"] = "failed"
            row["explanation"] = (
                "Failed to fully accomplish {} {}.".format(
                    count - accomplished,
                    phrasing.plural(count - accomplished, "goal")
                )
            )
        else:
            row["explanation"] = "Accomplished {} of {} {}.".format(
                round(points) if round(points) == points else points,
                count,
                phrasing.plural(count, "goal")
            )
            if points < count / 2 - 0.001:
                row["status"] = "failed"
            else:
                row["status"] = "partial"

        if half_matters:
            if points < count / 2 - 0.001:
                half_msg = " (less than half overall)."
            else:
                half_msg = " (at least half overall)."
            row["explanation"] = row["explanation"][:-1] + half_msg


def foundational_core_extras_metric(goals, blank=False):
    """
    Summarizes a list of evaluated goals by looking at those tagged
    as "foundational" and "core" and treating the rest as extras, while
    ignoring any tagged with "feedback_only". It assigns an evaluation
    using the `overall_evaluation` function.

    If blank is given as True, the report will include an evaluation of
    "not evaluated" and will not assign success or failure overall or
    to individual goal categories. Use this, along with unevaluated
    goals, to create a blank rubric.

    This function returns a dictionary with the following keys:

    - evaluation: A short string providing an overall evaluation of
        the submission, as described above.
    - summary: A string containing HTML code that summarizes the
        evaluation in a few sentences. It contains descriptions of
        how many goals in each category were accomplished.
    - table: A table dictionary, similar to those returned by
        `Goal.table`. It will have 'description', 'tags', 'status',
        'explanation', and perhaps 'subtable' keys.
    - warnings: A list of HTML strings including all warnings
        generated by any goal. TODO: Actually just an empty list for
        now.
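
    For example, the "table" value might contain (hypothetical,
    abbreviated) rows like:

        [{"description": ("Core goals", "..."),
          "tags": {"category": "core"},
          "status": "partial",
          "explanation": "Accomplished 2 of 4 goals (at least half overall).",
          "subtable": [...]}]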
1704 """ 1705 1706 # Sort goals into categories (multiple membership allowed in some cases) 1707 foundational = [] 1708 core = [] 1709 extra = [] 1710 feedback = [] 1711 for g in goals: 1712 category = g.tags.get("category", "extra") 1713 if category == "foundational": 1714 foundational.append(g) 1715 elif category == "core": 1716 core.append(g) 1717 elif category == "feedback_only": 1718 feedback.append(g) 1719 else: 1720 extra.append(g) 1721 1722 # Include foundational goals: 1723 foundation_row = { 1724 "description": ( 1725 "Foundational goals", 1726 "If one fails, the assignment is incomplete." 1727 ), 1728 "tags": { "category": "foundational" }, 1729 "status": "unknown", 1730 "explanation": "No explanation yet.", 1731 "subtable": [], 1732 } 1733 for g in foundational: 1734 foundation_row["subtable"].extend(g.table(blank=blank)) 1735 summarize_category_row( 1736 foundation_row, 1737 foundational, 1738 all_or_nothing=True, 1739 blank=blank 1740 ) 1741 1742 # Include core goals: 1743 core_row = { 1744 "description": ( 1745 "Core goals", 1746 ( 1747 "Complete all of these for core credit. Get partial" 1748 " credit for completing at least half, and more" 1749 + " partial credit for completing at least 90%." 1750 ) 1751 ), 1752 "tags": { "category": "core" }, 1753 "status": "unknown", 1754 "explanation": "No explanation yet.", 1755 "subtable": [], 1756 } 1757 for g in core: 1758 core_row["subtable"].extend(g.table(blank=blank)) 1759 summarize_category_row(core_row, core, half_matters=True, blank=blank) 1760 1761 # Include extra goals: 1762 extra_row = { 1763 "description": ( 1764 "Extra goals", 1765 ( 1766 "Complete all of these in addition to all of the core" 1767 + " goals for a perfect score." 1768 ) 1769 ), 1770 "tags": { "category": "extra" }, 1771 "status": "unknown", 1772 "explanation": "No explanation yet.", 1773 "subtable": [], 1774 } 1775 for g in extra: 1776 extra_row["subtable"].extend(g.table(blank=blank)) 1777 summarize_category_row(extra_row, extra, all_or_nothing=True, blank=blank) 1778 1779 # Include feedback_only goals: 1780 feedback_row = { 1781 "description": ("Additional feedback (not graded):", ""), 1782 "tags": { "category": "feedback_only" }, 1783 "status": "not applicable", 1784 "explanation": ( 1785 "These extra items are not graded, but provide potentially " 1786 + "useful feedback ." 1787 ), 1788 "subtable": [], 1789 } 1790 for g in feedback: 1791 logging.debug_msg( 1792 "Reviewing feedback goal '{}' @ {}...".format( 1793 g.feedback_topic(), 1794 id(g) 1795 ) 1796 ) 1797 logging.debug_msg("...result is: {}".format(g.result)) 1798 feedback_row["subtable"].extend(g.table(blank=blank)) 1799 summarize_category_row(feedback_row, feedback, blank=blank) 1800 1801 nonempty_rows = list( 1802 filter( 1803 lambda row: len(row.get("subtable", [])) > 0, 1804 [ 1805 foundation_row, 1806 core_row, 1807 extra_row, 1808 feedback_row 1809 ] 1810 ) 1811 ) 1812 1813 # If we're creating a blank rubric, stop here and just report what 1814 # the goals were. 1815 if blank: 1816 return { 1817 "evaluation": "not evaluated", 1818 "summary": "Blank rubric.", 1819 "table": nonempty_rows, 1820 "warnings": [] # TODO: Mechanism to generate these? 1821 } 1822 1823 # Build summary and table rows; decide evaluation: 1824 evaluation, summary = overall_evaluation(foundational, core, extra) 1825 1826 return { 1827 "evaluation": evaluation, 1828 "summary": summary, 1829 "table": nonempty_rows, 1830 "warnings": [] # TODO: Mechanism to generate these? 
1831 } 1832 1833 1834def core_extras_categorized_metric(goals, blank=False): 1835 """ 1836 Works like `foundational_core_extras_metric`, but does not use 1837 foundational goals (only goals tagged "core" vs. not are 1838 distinguished). However, this version looks at goal type tags and 1839 creates a table organizing goals by their types and then categories. 1840 The goal types (supplied via "goal_type" tags) are: 1841 1842 - "style" 1843 - "procedure" 1844 - "process" 1845 - "product" 1846 - "behavior" 1847 - "tests" 1848 - "other" (any goal not tagged with a type will get this type) 1849 1850 The overall evaluation and summary of the dictionary returned are the 1851 same as for the `foundational_core_extras_metric`, and the goal types 1852 are not relevant to the evaluation result. 1853 """ 1854 1855 # Sort goals into categories 1856 core = [] 1857 extra = [] 1858 feedback = [] 1859 for g in goals: 1860 cat = g.tags.get("category", "extra") 1861 if cat == "core": 1862 core.append(g) 1863 elif cat == "feedback_only": 1864 feedback.append(g) 1865 else: 1866 extra.append(g) 1867 1868 # Get evaluation & summary for the goals (no foundational goals) 1869 evaluation, summary = overall_evaluation([], core, extra) 1870 1871 # Sort goals again by type tags 1872 rows = [] 1873 for gtype in GOAL_TYPE_RUBRICS: 1874 gtype_description = GOAL_TYPE_RUBRICS[gtype] 1875 gtype_goals = [] 1876 for g in goals: 1877 if g.tags.get("goal_type", "other") == gtype: 1878 gtype_goals.append(g) 1879 1880 # If there aren't any goals in this category, we skip it entirely 1881 if len(gtype_goals) == 0: 1882 continue 1883 1884 # Core/extra/feedback sub-rows for this category 1885 core_subrow = { 1886 "description": ( 1887 "Core goals", 1888 ( 1889 "Complete all core goals for core credit. Get partial" 1890 " credit for completing at least half, and more" 1891 + " partial credit for completing at least 90%." 1892 ) 1893 ), 1894 "tags": { "category": "core", "goal_type": gtype }, 1895 "status": "unknown", 1896 "explanation": "No explanation yet.", 1897 "subtable": [], 1898 } 1899 extra_subrow = { 1900 "description": ( 1901 "Extra goals", 1902 ( 1903 "Complete all extra goals in addition to the core" 1904 + " goals for a perfect score." 1905 ) 1906 ), 1907 "tags": { "category": "extra", "goal_type": gtype }, 1908 "status": "unknown", 1909 "explanation": "No explanation yet.", 1910 "subtable": [], 1911 } 1912 feedback_subrow = { 1913 "description": ( 1914 "Additional feedback (not graded):", 1915 ( 1916 "These checks and tests are provided to give you" 1917 + " more insight into the assignment, but are not part" 1918 + " of the grading." 1919 ) 1920 ), 1921 "tags": { "category": "feedback_only", "goal_type": gtype }, 1922 "status": "not applicable", 1923 "explanation": ( 1924 "These extra items are not graded, but provide potentially " 1925 + "useful feedback." 
            ),
            "subtable": [],
        }

        # Add goals to sub-rows
        core_here = []
        extra_here = []
        feedback_here = []
        for g in gtype_goals:
            if g in core:
                core_here.append(g)
                core_subrow["subtable"].extend(g.table(blank=blank))
            elif g in feedback:
                feedback_here.append(g)
                feedback_subrow["subtable"].extend(g.table(blank=blank))
            else:
                extra_here.append(g)
                extra_subrow["subtable"].extend(g.table(blank=blank))

        # List the non-empty sub-rows
        nonempty_subrows = []
        for sub in (core_subrow, extra_subrow, feedback_subrow):
            if len(sub["subtable"]) > 0:
                nonempty_subrows.append(sub)

        # Main row for this category
        row = {
            "description": gtype_description,
            "tags": { "category": "type_group", "goal_type": gtype },
            "status": "unknown",
            "explanation": "No explanation yet.",
            "subtable": nonempty_subrows,
        }
        # Add this row to our rows list
        rows.append(row)

        # Summarize each sub-row
        summarize_category_row(core_subrow, core_here, blank=blank)
        summarize_category_row(extra_subrow, extra_here, blank=blank)
        summarize_category_row(feedback_subrow, feedback_here, blank=blank)

        # Status + explanation for this entire category
        if blank:
            # Blank status + explanation
            row["status"] = BLANK_RESULT["status"]
            row["explanation"] = BLANK_RESULT["explanation"]
        else:
            # Goal-type group status based on core goals alone
            row["status"] = core_subrow["status"]
            ngoals = len(core_subrow["subtable"])
            if ngoals == 0:
                # no core goals in this category
                if len(extra_subrow["subtable"]) == 0:
                    # no evaluated goals at all...
                    row["status"] = "unknown"
                    ngoals = len(feedback_subrow["subtable"])
                    row["explanation"] = (
                        "The {} {} {} contribute to your overall"
                        " evaluation ({} just informative)."
                    ).format(
                        gtype,
                        phrasing.plural(ngoals, "goal"),
                        phrasing.plural(ngoals, "does not", "do not"),
                        phrasing.plural(ngoals, "it's", "they're"),
                    )
                else:
                    # Base on the extra goals
                    row["status"] = extra_subrow["status"]
                    ngoals = len(extra_subrow["subtable"])
                    if row["status"] == "accomplished":
                        if ngoals > 1:
                            row["explanation"] = (
                                "Accomplished all {} extra {} goals."
                            ).format(ngoals, gtype)
                        else:
                            row["explanation"] = (
                                "Accomplished the extra {} goal."
                            ).format(gtype)
                    elif row["status"] == "partial":
                        if ngoals > 1:
                            row["explanation"] = (
                                "Accomplished most of the {} extra {}"
                                " goals."
                            ).format(ngoals, gtype)
                        else:
                            row["explanation"] = (
                                "Partially accomplished the extra {}"
                                " goal."
                            ).format(gtype)
                    elif row["status"] == "failed":
                        if ngoals > 1:
                            row["explanation"] = (
                                "Did not accomplish at least half of"
                                " the {} extra {} goals."
                            ).format(ngoals, gtype)
                        else:
                            row["explanation"] = (
                                "Did not accomplish the extra {} goal."
                            ).format(gtype)
                    else:
                        row["explanation"] = (
                            "No conclusive evaluation for the extra {}"
                            " {}."
2029 ).format(gtype, phrasing.plural(ngoals, "goal")) 2030 elif row["status"] == "accomplished": 2031 # Explanation tweaked based on extra goals 2032 nextra = len(extra_subrow["subtable"]) 2033 cat_phrase = "core" 2034 if ( 2035 nextra > 0 2036 and extra_subrow["status"] == "accomplished" 2037 ): 2038 cat_phrase = "core and extra" 2039 ngoals += nextra 2040 if ngoals > 1: 2041 row["explanation"] = ( 2042 "Accomplished all {} {} {} goals." 2043 ).format(ngoals, cat_phrase, gtype) 2044 else: 2045 row["explanation"] = ( 2046 "Accomplished the {} {} goal." 2047 ).format(cat_phrase, gtype) 2048 elif row["status"] == "partial": 2049 if ngoals > 1: 2050 row["explanation"] = ( 2051 "Accomplished most of the core {} goals." 2052 ).format(gtype) 2053 else: 2054 row["explanation"] = ( 2055 "Partially accomplished the core {} goal." 2056 ).format(gtype) 2057 elif row["status"] == "failed": 2058 if ngoals > 1: 2059 row["explanation"] = ( 2060 "Did not accomplish at least half of the core" 2061 " {} goals." 2062 ).format(gtype) 2063 else: 2064 row["explanation"] = ( 2065 "Did not at least partially accomplish the core" 2066 " {} goal." 2067 ).format(gtype) 2068 else: 2069 row["explanation"] = ( 2070 "No conclusive evaluation for the core {} {}." 2071 ).format(gtype, phrasing.plural(ngoals, "goal")) 2072 2073 # If we're creating a blank rubric, stop here and just report what 2074 # the goals were. 2075 if blank: 2076 return { 2077 "evaluation": "not evaluated", 2078 "summary": "Blank rubric.", 2079 "table": rows, 2080 "warnings": [] # TODO: Mechanism to generate these? 2081 } 2082 else: 2083 # Otherwise, include the evaluation and summary 2084 return { 2085 "evaluation": evaluation, 2086 "summary": summary, 2087 "table": rows, 2088 "warnings": [] # TODO: Mechanism to generate these? 2089 } 2090 2091 2092def core_extras_flat_metric(goals, blank=False): 2093 """ 2094 Works like the `core_extras_categorized_metric` but returns a flat 2095 table without goal-type or goal-category rows. This table can be used 2096 with custom sorting controls to allow re-grouping by goal-type, 2097 goal-category, etc. 2098 2099 The overall evaluation and summary of the dictionary returned are the 2100 same as for the `foundational_core_extras_metric`. 2101 """ 2102 2103 # Sort goals into categories 2104 core = [] 2105 extra = [] 2106 feedback = [] 2107 for g in goals: 2108 cat = g.tags.get("category", "extra") 2109 if cat == "core": 2110 core.append(g) 2111 elif cat == "feedback_only": 2112 feedback.append(g) 2113 else: 2114 extra.append(g) 2115 2116 # Get evaluation & summary for the goals 2117 evaluation, summary = overall_evaluation([], core, extra) 2118 2119 # Accumulate rows for each goal 2120 rows = [] 2121 for g in goals: 2122 rows.extend(g.table(blank=blank)) 2123 2124 # If we're creating a blank rubric, use empty evaluation/summary. 2125 if blank: 2126 return { 2127 "evaluation": "not evaluated", 2128 "summary": "Blank rubric.", 2129 "table": rows, 2130 "warnings": [] # TODO: Mechanism to generate these? 2131 } 2132 else: 2133 # Otherwise, include the evaluation and summary 2134 return { 2135 "evaluation": evaluation, 2136 "summary": summary, 2137 "table": rows, 2138 "warnings": [] # TODO: Mechanism to generate these? 2139 } 2140 2141 2142#---------------# 2143# Goal subtypes # 2144#---------------# 2145 2146class NoteGoal(Goal): 2147 """ 2148 A NoteGoal just serves as an extra rubric entry that's not associated 2149 with any test. 
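
    For example (a sketch; the task ID and identifier are made up):

        NoteGoal(
            "task1",
            "manual_feedback",
            (
                "Additional notes",
                "Miscellaneous manual feedback about your code."
            )
        )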
2150 """ 2151 def __init__( 2152 self, 2153 taskid, 2154 identifier, 2155 description=("BLANK NOTE GOAL", "THIS GOAL HAS NOT BEEN DEFINED"), 2156 explanation="", 2157 **kwargs 2158 ): 2159 """ 2160 A task ID (string), an identifier (string), and a description are 2161 required, and an explanation (shown only during feedback, but not 2162 on the rubric) may also be given. If no category tag is 2163 specified, the category tag will be set to "feedback_only". 2164 2165 The categorizer "note:" will be prepended to the identifier. 2166 """ 2167 tags = kwargs.setdefault("tags", {}) 2168 if "category" not in tags: 2169 tags["category"] = "feedback_only" 2170 2171 super().__init__( 2172 taskid, 2173 "node:" + identifier, 2174 description, 2175 **kwargs 2176 ) 2177 self.set_default_goal_type("other") 2178 self.explanation = explanation 2179 2180 def evaluate_in_context(self, context=None): 2181 """ 2182 Simply returns the pre-defined explanation. 2183 """ 2184 return { 2185 "status": "not applicable", 2186 "explanation": self.explanation 2187 } 2188 2189 2190class JointGoal(Goal): 2191 """ 2192 A joint goal requires 1 or more subgoals to succeed and bases its 2193 success off of the success of its subgoals. 2194 2195 If the `JointGoal` is tagged as "transparent", then when producing a 2196 table, it will not create an entry for itself and instead will just 2197 return the subtable containing sub-goals. This is useful when it is 2198 obvious from context how failure of a subgoal would affect the 2199 super-goal. 2200 2201 The joint goal takes its goal type tag from the tags of its child 2202 goals, or sets its tag to "other" if its children have more than one 2203 goal type tag. 2204 """ 2205 def __init__( 2206 self, 2207 taskid, 2208 identifier, 2209 description=("BLANK JOINT GOAL", "THIS GOAL HAS NOT BEEN DEFINED"), 2210 parts=None, 2211 required=None, 2212 partial_required=None, 2213 stop_early=False, 2214 **kwargs 2215 ): 2216 """ 2217 You must provide a task ID, an identifier, a description, a list 2218 of parts (default empty list), and a number of parts required 2219 (default is the size of the given parts list). 2220 2221 The categorizer "joint:" is prepended to the identifier. 2222 2223 If partial_required is given, as long as that many parts are 2224 strictly accomplished, this goal will count as partially 2225 accomplished (must be lower than required). 2226 2227 If stop_early is given as True, if the outcome is known based on 2228 goals already evaluated, the `JointGoal` will not evaluate 2229 subsequent goals. 2230 """ 2231 parts = parts or [] 2232 2233 # Pre-specify goal type tag 2234 subgoal_types = set() 2235 for p in parts: 2236 subgoal_types |= set( 2237 [t for t in p.tags if t in GOAL_TYPE_RUBRICS] 2238 ) 2239 2240 if len(subgoal_types) == 1: 2241 goal_type = list(subgoal_types)[0] 2242 else: 2243 # Zero or more than one explicit subgoal type 2244 goal_type = "other" 2245 2246 super().__init__( 2247 taskid, 2248 "joint:" + identifier, 2249 description, 2250 **kwargs 2251 ) 2252 self.set_default_goal_type(goal_type) 2253 self.parts = parts 2254 if required is None: 2255 required = len(parts) 2256 self.required = required 2257 self.partial_required = partial_required 2258 self.stop_early = stop_early 2259 2260 def subgoals(self): 2261 """ 2262 List of subgoals of this goal (our sub-goals). 
2263 """ 2264 return self.parts 2265 2266 def table(self, blank=False): 2267 """ 2268 A `JointGoal`'s table by default contains a sub-table consisting of 2269 the combined tables for each of its sub-goals, but this is 2270 suppressed if the goal has the "hide_subgoals" tag. 2271 2272 If it has the "hide_unevaluated" tag, parts which were never 2273 evaluated due to early stopping are omitted from the subtable. 2274 2275 See `Goal.table` regarding the table format. 2276 """ 2277 subtable = [] 2278 if "hide_subgoals" in self.tags: 2279 subtable = None 2280 else: 2281 for i, subgoal in enumerate(self.parts): 2282 # Only goals that we actually evaluated belong in our result 2283 # table: 2284 if ( 2285 "hide_unevaluated" in self.tags 2286 and i >= self.result.get("goals_evaluated", len(self.parts)) 2287 ): 2288 break 2289 subtable.extend(subgoal.table(blank=blank)) 2290 2291 if "transparent" in self.tags: 2292 result = subtable 2293 else: 2294 result = super().table(blank=blank) 2295 result[0]["subtable"] = subtable 2296 2297 return result 2298 2299 def evaluate_in_context(self, context=None): 2300 """ 2301 To evaluate a `JointGoal`, we evaluate each subgoal in order. If at 2302 least the required number of them are "accomplished", the joint 2303 goal is also "accomplished". If not, but at least the required 2304 number are either "accomplished" or "partial", the joint goal is 2305 "partial". Otherwise, it is "failed". If the result is known 2306 before all goals are evaluated, the `JointGoal` will skip 2307 unnecessary parts, unless it was created with stop_early=False. 2308 """ 2309 context = context or {} 2310 2311 passed = 0 2312 partial = 0 2313 remaining = len(self.parts) 2314 2315 if self.required == 0 and self.stop_early: 2316 self.result = { 2317 "status": "accomplished", 2318 "goals_evaluated": 0 2319 } 2320 self.set_explanation( 2321 context, 2322 default="Context established; no testing required." 2323 ) 2324 return self.result 2325 2326 if self.required == 0: 2327 pass_msg = "Context established; no testing required." 2328 partial_msg = "ERROR: THIS MESSAGE SHOULD NEVER BE DISPLAYED (1)" 2329 fail_msg = "ERROR: THIS MESSAGE SHOULD NEVER BE DISPLAYED (2)" 2330 elif self.required == len(self.parts) and self.required > 1: 2331 pass_msg = "All parts accomplished." 2332 if self.partial_required is not None: 2333 partial_msg = ( 2334 "All parts at least partially accomplished, or at " 2335 + "least {} of {} parts accomplished." 2336 ).format(self.partial_required, len(self.parts)) 2337 else: 2338 partial_msg = "All parts at least partially accomplished." 2339 fail_msg = "At least one part failed." 2340 elif self.required == len(self.parts): 2341 pass_msg = "Subgoal accomplished." 2342 partial_msg = "Subgoal partially accomplished." 2343 fail_msg = "Subgoal failed." 2344 else: 2345 pass_msg = "At least {} of {} parts accomplished.".format( 2346 self.required, 2347 len(self.parts) 2348 ) 2349 if self.partial_required is not None: 2350 partial_msg = ( 2351 "At least {} of {} parts accomplished or partially " 2352 + "accomplished, or at least {} of {} parts accomplished." 2353 ).format( 2354 self.required, 2355 len(self.parts), 2356 self.partial_required, 2357 len(self.parts), 2358 ) 2359 fail_msg = ( 2360 "Failed to accomplish at least {} of {} parts." 2361 ).format( 2362 self.partial_required, 2363 len(self.parts) 2364 ) 2365 else: 2366 partial_msg = ( 2367 "At least {} of {} parts accomplished or partially " 2368 + "accomplished." 
                ).format(self.required, len(self.parts))
                fail_msg = (
                    "Failed to accomplish at least {} of {} parts."
                ).format(
                    self.required,
                    len(self.parts)
                )

        goals_evaluated = 0
        for subgoal in self.parts:
            # Shallow copy of our context:
            sub_context = {}
            sub_context.update(context)
            result = subgoal.evaluate(sub_context)
            goals_evaluated += 1
            result_status = result.get("status", "unknown")
            remaining -= 1
            if result_status == "accomplished":
                passed += 1
            elif result_status == "partial":
                partial += 1

            if self.stop_early:
                if passed >= self.required:
                    self.result = {
                        "status": "accomplished",
                        "goals_evaluated": goals_evaluated
                    }
                    self.set_explanation(context, default=pass_msg)
                    return self.result
                elif (
                    (
                        passed + partial >= self.required
                        and passed + remaining < self.required
                    )
                    or (
                        self.partial_required is not None
                        and passed >= self.partial_required
                        and passed + remaining < self.required
                    )
                ):
                    self.result = {
                        "status": "partial",
                        "goals_evaluated": goals_evaluated
                    }
                    self.set_explanation(context, default=partial_msg)
                    return self.result

        if passed >= self.required:
            self.result = {
                "status": "accomplished",
                "goals_evaluated": goals_evaluated
            }
            self.set_explanation(context, default=pass_msg)
            return self.result
        elif (
            (passed + partial >= self.required)
            or (
                self.partial_required is not None
                and passed >= self.partial_required
            )
        ):
            self.result = {
                "status": "partial",
                "goals_evaluated": goals_evaluated
            }
            self.set_explanation(context, default=partial_msg)
            return self.result
        else:
            self.result = {
                "status": "failed",
                "goals_evaluated": goals_evaluated
            }
            self.set_explanation(context, default=fail_msg)
            return self.result


class FailGoal(Goal):
    """
    A fail goal simply swaps accomplished for failed and vice versa in
    the result of a sub-goal.
    """
    def __init__(
        self,
        taskid,
        identifier,
        description=None,
        goal=None,
        permit_partial=True,
        **kwargs
    ):
        """
        Requires a task ID, an identifier, and a subgoal, with optional
        description, explanations, and tags. The description should
        generally be phrased as the negation of the subgoal's
        description, and the default (if None is given explicitly) is to
        add "Do not " in front of the subgoal's description title and add
        " You need to avoid this." to the end of its details.

        The categorizer "fail:" is prepended to the identifier.

        If permit_partial is specified, True means that partial success
        of the subgoal is partial success of this goal (the default), and
        False means that even partial success of the subgoal is full
        failure of this goal.
        """
        # We need the subgoal before anything else, since the default
        # description and the goal type are both derived from it:
        if goal is None:
            raise ValueError("A FailGoal must be provided a subgoal!")

        # Auto description
        if description is None:
            subtitle, subdetails = goal.description
            if subtitle[0].isupper():
                subtitle = subtitle[0].lower() + subtitle[1:]
            description = (
                "Do not " + subtitle,
                subdetails + " You need to avoid this."
            )

        # Lift goal type from sub-goal
        goal_type = goal.tags.get("goal_type", "other")

        super().__init__(
            taskid,
            "fail:" + identifier,
            description,
            **kwargs
        )
        self.set_default_goal_type(goal_type)

        self.goal = goal
        self.permit_partial = permit_partial

    def subgoals(self):
        """
        List of subgoals of this goal (just our single goal).
        """
        if self.goal:
            return [ self.goal ]
        else:
            return []

    def table(self, blank=False):
        """
        The table for a `FailGoal` is a copy of its subgoal's table,
        with the status, description, and explanation from the
        `FailGoal`'s result. This means that the `FailGoal` itself does
        not appear as a separate entry in rubric tables. Any tags for the
        `FailGoal` are added to the tags of the subgoal.

        See `Goal.table` regarding the table format.
        """
        row = self.goal.table(blank=blank)[0]
        category = self.tags.get("category", "unknown")
        row["id"] = "goal:" + category + '.' + self.identifier
        row["description"] = list(self.description[:])
        row["tags"] = list(set(row["tags"]) | set(self.tags))
        if not blank:
            row["status"] = self.result["status"]
            row["explanation"] = self.result["explanation"]

        return [ row ]

    def evaluate_in_context(self, context=None):
        """
        Evaluates the sub-goal, and returns a result which replaces
        "accomplished" with "failed" and vice versa. Does not affect a
        result of "partial" unless permit_partial is set to False, in
        which case a "partial" result is converted to "failed."
        """
        context = context or {}
        self.result = {}
        self.result.update(self.goal.evaluate(context))
        if self.result["status"] == "accomplished":
            self.result["status"] = "failed"
        elif self.result["status"] == "failed":
            self.result["status"] = "accomplished"
        elif self.result["status"] == "partial" and not self.permit_partial:
            self.result["status"] = "failed"
        # else don't modify the status

        # Update explanation from sub_result only if we have a matching
        # explanation function.
        self.set_explanation(context, default=self.result["explanation"])

        return self.result


class PreconditionGoal(Goal):
    """
    A precondition goal requires that a condition goal is achieved, and
    only if it is does it return an evaluation based on a subgoal.
    """
    def __init__(
        self,
        taskid,
        identifier,
        description=(
            "BLANK PRECONDITION GOAL",
            "THIS GOAL HAS NOT BEEN DEFINED"
        ),
        precondition=None,
        goal=None,
        **kwargs
    ):
        """
        You must provide a task ID, an identifier, a description, a
        precondition goal, and a subgoal.

        The categorizer "precondition:" is prepended to the identifier.
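
        For example (a sketch; both subgoals are hypothetical `Goal`
        objects defined elsewhere):

            PreconditionGoal(
                "task1",
                "results_after_definition",
                (
                    "Your function must return the right result",
                    "We only test results if the function is defined."
                ),
                precondition=defines_required_function,
                goal=returns_correct_result
            )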
2579 """ 2580 # Pre-specify goal type tag 2581 subgoal_types = set() 2582 for sg in [precondition, goal]: 2583 subgoal_types |= set( 2584 [t for t in sg.tags if t in GOAL_TYPE_RUBRICS] 2585 ) 2586 2587 if len(subgoal_types) == 1: 2588 goal_type = list(subgoal_types)[0] 2589 else: 2590 # Zero or more than one explicit subgoal type 2591 goal_type = "other" 2592 2593 super().__init__( 2594 taskid, 2595 "precondition:" + identifier, 2596 description, 2597 **kwargs 2598 ) 2599 self.set_default_goal_type(goal_type) 2600 if precondition is None or goal is None: 2601 raise ValueError( 2602 "A PreconditionGoal must have both a precondition and a goal!" 2603 ) 2604 self.precondition = precondition 2605 self.goal = goal 2606 2607 def subgoals(self): 2608 """ 2609 List of subgoals of this goal (our precondition and our goal). 2610 """ 2611 return [ self.precondition, self.goal ] 2612 2613 def evaluate_in_context(self, context={}): 2614 """ 2615 To evaluate a `PreconditionGoal`, we evaluate the precondition. If 2616 it does not evaluate as "accomplished," then the entire goal 2617 evaluates to "failed" immediately. If it does evaluate to 2618 "accomplished," the final goal is evaluated and that result is 2619 returned. 2620 2621 If the precondition passes, it is not mentioned in the 2622 explanation that results, but if it fails, its failure 2623 explanation is used as the explanation for this goal's failure. 2624 2625 Even if the precondition passes, this node's explanation function 2626 is still run on the results, but if it fails, the special 2627 explanation status "precondition_failed" is used (to 2628 differentiate from a failed sub-goal post-precondition). 2629 """ 2630 pre = self.precondition.evaluate(context) 2631 if pre.get("status") != "accomplished": 2632 self.result = { 2633 "status": "failed", 2634 "precondition_failed": True, 2635 } 2636 self.set_explanation( 2637 context, 2638 status="precondition_failed", 2639 default="Precondition failed:<br>\n{}".format( 2640 pre.get("explanation", "Cause unknown") 2641 ) 2642 ) 2643 return self.result 2644 else: 2645 self.result = self.goal.evaluate(context) 2646 self.result["precondition_failed"] = False 2647 self.set_explanation(context, default=self.result["explanation"]) 2648 return self.result 2649 2650 def table(self, blank=False): 2651 """ 2652 A `PreconditionGoal`'s table depends on the result from its 2653 precondition. If the precondition failed, the table will be the 2654 precondition's table; otherwise it will be the main goal's table. 2655 The fact that there is a precondition is thus not visible from 2656 the table unless the precondition fails. 2657 TODO: Not that? 2658 2659 See `Goal.table` regarding the table format. 2660 """ 2661 if self.result.get("precondition_failed", False): 2662 return self.precondition.table(blank=blank) 2663 else: 2664 return self.goal.table(blank=blank) 2665 2666 2667class ComparisonTest(Goal): 2668 """ 2669 Runs a checker function on two arbitrary context slots. 
2670 """ 2671 def __init__( 2672 self, 2673 taskid, 2674 identifier, 2675 description=( 2676 "BLANK COMPARISON TEST", 2677 "THIS GOAL HAS NOT BEEN DEFINED" 2678 ), 2679 context_slot="value", 2680 checker=None, 2681 ref_slot=None, 2682 **kwargs 2683 ): 2684 """ 2685 In addition to a task ID (string) and an identifier (string), a 2686 description, and optional explanations and/or tags (see the 2687 `Goal` class), a checker function is needed, which should accept 2688 value and reference objects and return a goal result (a 2689 dictionary with status + explanation keys). The context_slot is 2690 used to determine which slot in the current context to check, and 2691 ref_slot specifies where to get the reference object, although if 2692 not given it will default to "ref_" + context_slot. 2693 2694 The categorizer "test:" is prepended to the identifier. 2695 2696 If the checker is omitted or given explicitly as None, the goal 2697 will succeed as long as the appropriate context_slot (and 2698 ref_slot) are present, and will only fail if the assigned context 2699 fails to even establish those keys. 2700 2701 If the ref_slot is the same as the context_slot, the checker 2702 function will be called with only one value. 2703 """ 2704 super().__init__( 2705 taskid, 2706 "test:" + identifier, 2707 description, 2708 **kwargs 2709 ) 2710 self.context_slot = context_slot 2711 self.checker = checker 2712 if ref_slot is None: 2713 ref_slot = "ref_" + context_slot 2714 self.ref_slot = ref_slot 2715 2716 # subgoals is inherited (no subgoals) 2717 2718 # table is inherited 2719 2720 def evaluate_in_context(self, context=None): 2721 """ 2722 Runs the checker and returns its result. 2723 """ 2724 context = context or {} 2725 2726 if self.checker is None: 2727 if self.context_slot in context and self.ref_slot in context: 2728 self.result = { 2729 "status": "accomplished", 2730 "explanation": ( 2731 "Successfully established '{}' context." 2732 ).format(self.context_slot) 2733 } 2734 elif self.context_slot not in context: 2735 self.result = { 2736 "status": "failed", 2737 "explanation": ( 2738 "Failed to establish '{}' context." 2739 ).format(self.context_slot) 2740 } 2741 else: 2742 self.result = { 2743 "status": "failed", 2744 "explanation": ( 2745 "Failed to establish '{}' context." 2746 ).format(self.ref_slot) 2747 } 2748 else: 2749 try: 2750 val = context[self.context_slot] 2751 except KeyError: 2752 self.result = { 2753 "status": "failed", 2754 "traceback": html_tools.html_traceback( 2755 linkable=context_utils.linkmap(context) 2756 ) 2757 } 2758 self.set_explanation( 2759 context, 2760 status="crash", 2761 default=( 2762 "Could not access '{}' for testing." 2763 " Context has keys:<br>{}" 2764 ).format( 2765 self.context_slot, 2766 ', '.join(repr(k) for k in context.keys()) 2767 ) 2768 ) 2769 return self.result 2770 2771 try: 2772 ref = context[self.ref_slot] 2773 except KeyError: 2774 self.result = { 2775 "status": "failed", 2776 "traceback": html_tools.html_traceback( 2777 linkable=context_utils.linkmap(context) 2778 ) 2779 } 2780 self.set_explanation( 2781 context, 2782 status="crash", 2783 default=( 2784 "Could not access '{}' for testing." 
2785 " Context has keys:<br>{}" 2786 ).format( 2787 self.ref_slot, 2788 ', '.join(repr(k) for k in context.keys()) 2789 ) 2790 ) 2791 return self.result 2792 2793 try: 2794 if self.context_slot == self.ref_slot: 2795 self.result = self.checker(val) 2796 else: 2797 self.result = self.checker(val, ref) 2798 2799 if self.result is None: 2800 raise ValueError( 2801 "Context checker {} returned None!".format( 2802 self.checker 2803 ) 2804 ) 2805 2806 self.set_explanation( 2807 context, 2808 default=self.result["explanation"] 2809 ) 2810 except Exception: 2811 self.result = { 2812 "status": "failed", 2813 "traceback": html_tools.html_traceback( 2814 linkable=context_utils.linkmap(context) 2815 ) 2816 } 2817 self.set_explanation( 2818 context, 2819 status="crash", 2820 default=html_tools.html_traceback( 2821 title="Error while checking {}:".format( 2822 self.context_slot 2823 ), 2824 linkable=context_utils.linkmap(context) 2825 ) 2826 ) 2827 2828 return self.result 2829 2830 2831class ImplementationCheck(Goal): 2832 """ 2833 An `ImplementationCheck` inspects the AST of submitted code to 2834 determine whether it counts as accomplished or failed. An 2835 `ImplementationCheck`'s subrules must all be accomplished for the 2836 parent check to count as accomplished. An `ImplementationCheck` looks 2837 for the first match that can satisfy its subrules. 2838 2839 `ImplementationCheck`s by default run on the 'scope' context slot 2840 which contains an AST for the submitted module, or (via refinement by 2841 `ImplementationCheck`s) a subset of that code. When created, unless 2842 explicit dependencies are specified via a `test_in` keyword argument, 2843 each `ImplementationCheck` will grab the current automatic "scope" 2844 context as its only dependency. 2845 """ 2846 def __init__( 2847 self, 2848 taskid, 2849 identifier, 2850 description=( 2851 "BLANK IMPLEMENTATION CHECK", 2852 "THIS GOAL HAS NOT BEEN DEFINED" 2853 ), 2854 pattern="_", 2855 name=None, 2856 match=lambda code, node, env: True, 2857 use=None, 2858 min=None, max=None, 2859 softmin=False, softmax=False, 2860 outside=None, 2861 callees=False, 2862 subrules=None, 2863 match_identity=lambda code, node, envs: ( 2864 tuple(node) if isinstance(node, list) else node 2865 ), 2866 subslip=None, 2867 normalize=False, 2868 check_in_def=False, 2869 force_smaller_match=False, 2870 **kwargs 2871 ): 2872 """ 2873 A task ID, an identifier, and a description are required (see the 2874 `Goal` class). An appropriate `test_in` dictionary which will 2875 provide a "scope" slot is typically required. 2876 2877 The categorizer "check:" is prepended to the identifier. 2878 2879 `ImplementationCheck` itself uses the following arguments: 2880 2881 - pattern: A string containing Python code that will be matched 2882 against using mast. May instead be a list of strings, in 2883 which case they will be tried in turn to generate matches. 2884 - name: specifies a name for the construct being searched for. 2885 The plural will be constructed by adding 's', unless name is 2886 a tuple, in which case the first entry will be used as the 2887 singular and the second as the plural. May contain HTML code. 2888 If pattern is not a list, this can be left out, and the 2889 pattern will be used as the name. 2890 - match: A function that accepts the entire submitted AST, the 2891 node being considered for a match right now, and the current 2892 binding environment. This function should return True or 2893 False, and any matches for which it does not return True will 2894 be ignored. 
2895 - use/min/max: Either the 'use' argument, or one or both of the 2896 'min' and 'max' arguments should be given, but not both. 2897 Supplying 'use' sets both 'min' and 'max' to that value. If 2898 'max' is 0, the pattern is considered a negative pattern, and 2899 the goal will fail if any matches are found. Otherwise, the 2900 goal will succeed if the number of matches is between the 2901 given min and max values, inclusive. If none of these are 2902 given, the min defaults to 1 and the max to None (no limit). 2903 - softmin/softmax: If one of these is true, the minimum (or 2904 maximum) restriction on the number of matches will be treated 2905 as a soft constraint, and if violated the goal will be 2906 treated as partially accomplished instead of failed. If they 2907 are exactly either the string "warn" or "note", then the goal 2908 will still count as fully accomplished if that constraint is 2909 violated, but a warning or note will be attached mentioning 2910 the unexpectedly low/high number of matches. They may also be 2911 integers or floats, in which case they establish an alternate 2912 min/max threshold for partial completion. For softmin, 2913 partial matches are counted as 0.5 of a match towards 2914 achieving the threshold, but for softmax partial matches are 2915 ignored. 2916 - outside: If present, the 'outside' pattern (or list of 2917 patterns) is checked, and matches will only be considered 2918 valid if they are not sub-nodes of a match for one of the 2919 given outside patterns. 2920 - callees: If given as True, instead of simply searching within 2921 the context's scope node, this check will look for matches 2922 within other functions defined in the submitted code which 2923 are called from within the given scope node. TODO: This is 2924 still (as of 2020-6) experimental/unstable. 2925 - subrules: A list of `ImplementationCheck` goals to be tested 2926 within matches of this goal. Only matches where this goal and 2927 all of its subrules are accomplished (or partially 2928 accomplished) will be considered valid (respectively, 2929 partially valid). If this goal is a negative goal (max = 0), 2930 it fails if there are any fully valid matches, and partial 2931 matches are ignored. On the other hand, if it is a positive 2932 goal (max != 0), it counts as accomplished if the number of 2933 fully valid matches is within the min and max limits 2934 (inclusive), and partially accomplished if the number of 2935 fully valid matches is below the min limit but the number of 2936 fully valid + partially valid matches is at least the min 2937 limit. 2938 - match_identity: a function that returns a hashable object to 2939 represent the identity of a match for match-counting 2940 purposes. The function will be given the entire code context, 2941 the matching node, and a list of matching environments as 2942 input. It may return a list of identities instead of a single 2943 identity and each will be counted. By default this is a 2944 function which just returns the matching node, such that 2945 multiple matching environments based on the same node are not 2946 counted as separate matches. One reasonable alternative if 2947 you know what type of node you're matching would be to return 2948 some associated string (e.g., the id of a Call node that has 2949 a Name as its func). 2950 - subslip: A number of subgoals which are allowed to be violated 2951 and still count a potential match as a partial match. 
May be 2952 fractional, since partially-matched subgoals will be counted 2953 as 1/2 a point. By default this number will be set equal to 2954 the number of subgoals, meaning that even if all subgoals 2955 fail a match for a specified structure will still count as a 2956 partial match. 2957 - normalize: default False; experimental mast option that tries 2958 to inline some local variable assignments into larger 2959 expressions for better matching. 2960 - check_in_def: default False, this option changes the context 2961 within which the check occurs by default: the check will use 2962 the 'scope' element from the current context as usual, but 2963 will then assume that that AST node is a Call to a function 2964 defined in the same file (within the 'code' element) and will 2965 look up that definition, running the check in the context of 2966 that definition rather than in the original scope context 2967 given to it. This is useful for placing requirements on 2968 helper functions whose names aren't known ahead of time: a 2969 parent `ImplementationCheck` can be used to match the helper 2970 function call, with child checks using check_in_def that 2971 place requirements on the code in the helper function. The 2972 check will fail if the 'scope' context provided to it is not 2973 a Call node, or if it can't find the matching FunctionDef 2974 node in the 'code' tree of the context it's given. 2975 - force_smaller_match: default False. If set to True, a match 2976 which matches the entire target scope will not be considered a 2977 real match. Use this in places where you want to require 2978 things like nested loops, since otherwise a sub-requirement 2979 that's the same as a super-requirement will simply match the 2980 entire node matched by the super-requirement. 2981 """ 2982 # Grab parent context 2983 if "test_in" not in kwargs or kwargs["test_in"] is None: 2984 kwargs["test_in"] = {} 2985 if "contexts" not in kwargs["test_in"]: 2986 kwargs["test_in"]["contexts"] = contexts.auto("scope") 2987 2988 # Set up Goal properties 2989 super().__init__( 2990 taskid, 2991 "check:" + identifier, 2992 description, 2993 **kwargs 2994 ) 2995 self.set_default_goal_type("procedure") 2996 2997 # Ensure patterns is a list 2998 if isinstance(pattern, str): 2999 self.patterns = [ pattern ] 3000 else: 3001 self.patterns = pattern 3002 3003 # Figure out name 3004 if name is None: 3005 if len(self.patterns) > 1: 3006 raise ValueError( 3007 ( 3008 "When building an ImplementationCheck, if there are " 3009 + "multiple patterns, a name must be specified." 3010 + " (topic: '{}' / patterns: {})" 3011 ).format(self.feedback_topic(), self.patterns) 3012 ) 3013 else: 3014 self.name = self.patterns[0] 3015 self.pl_name = self.name + 's' 3016 elif isinstance(name, (list, tuple)): 3017 self.name, self.pl_name = name 3018 else: 3019 self.name = name 3020 self.pl_name = self.name + 's' 3021 3022 self.match = match 3023 3024 # Figure out min and max 3025 if (min is not None or max is not None) and use is not None: 3026 raise ValueError( 3027 ( 3028 "When building an ImplementationCheck, you may supply " 3029 + "*either* 'use' or 'min'/'max', but you may not supply " 3030 + "'use' if either 'min' or 'max' is given." 
3031 + " (topic: '{}' / patterns: {})" 3032 ).format(self.feedback_topic(), self.patterns) 3033 ) 3034 elif use is not None: 3035 self.min_allowed = use 3036 self.max_allowed = use 3037 elif min is None and max is None: 3038 # Default is "at least 1" 3039 self.min_allowed = 1 3040 self.max_allowed = None 3041 else: 3042 self.min_allowed = min 3043 self.max_allowed = max 3044 3045 # Is this goal a positive goal (keep searching for any match 3046 # across possible environments?) or not (fail if any match is 3047 # found in any environment). 3048 self.is_positive = self.max_allowed != 0 3049 3050 self.softmin = softmin 3051 self.softmax = softmax 3052 3053 # Make sure outside is a list 3054 if outside is None: 3055 self.outside = [] 3056 elif isinstance(outside, str): 3057 self.outside = [ outside ] 3058 else: 3059 self.outside = outside 3060 3061 self.callees = callees 3062 3063 self.force_smaller_match = force_smaller_match 3064 3065 # Set subrules 3066 if subrules is None: 3067 self.subrules = [] 3068 else: 3069 self.subrules = subrules 3070 3071 self.match_identity = match_identity 3072 3073 self.subslip = subslip 3074 if self.subslip is None: 3075 self.subslip = len(self.subrules) 3076 3077 self.normalize = normalize 3078 3079 self.check_in_def = check_in_def 3080 3081 def subgoals(self): 3082 """ 3083 List of subgoals of this goal (our precondition and our goal). 3084 """ 3085 return self.subrules 3086 3087 def table(self, blank=False): 3088 """ 3089 Includes sub-table with subrule statuses preserved from the last 3090 full match, or the last partial match if there are no full 3091 matches. 3092 3093 See `Goal.table` regarding the table format. 3094 """ 3095 result = super().table(blank=blank) 3096 3097 # Maybe add a subtable: 3098 if blank: 3099 result[0]["subtable"] = self.build_subtable(blank=blank) 3100 elif self.is_positive: 3101 # TODO: What about tables requested during pre-evaluation 3102 # description construction? 3103 result[0]["subtable"] = self.result.get("subtable") or [] 3104 elif self.result.get("status") != "accomplished": 3105 # For negative rules where we don't want any matches, reporting 3106 # the successful discovery of sub-rules only makes sense if 3107 # we failed the goal (because there was a match that 3108 # shouldn't have been there). 3109 result[0]["subtable"] = self.result.get("subtable") or [] 3110 # Otherwise don't attach a subtable (negative rules that 3111 # succeeded because they didn't have any full matches). 3112 3113 return result 3114 3115 def build_subtable(self, blank=False): 3116 """ 3117 Builds a sub-table using the results of each subrule as currently 3118 evaluated. 3119 """ 3120 result = [] 3121 for subrule in self.subrules: 3122 result.extend(subrule.table(blank=blank)) 3123 return result 3124 3125 def evaluate_in_context(self, context=None): 3126 """ 3127 Checks the rule within the 'scope' node of the given context, 3128 respecting bindings in the 'env' dictionary from the given 3129 context. Uses the entire submitted code if no scope is present, 3130 and uses an empty dictionary if there is no binding environment. 3131 Use build_code_context to establish a top-level scope beforehand 3132 if you are worried about parsing issues causing code to be 3133 missing. 
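
        For example, a check might be constructed and evaluated by hand
        like this (a sketch: the arguments are hypothetical, the pattern
        syntax follows the `mast` examples used elsewhere in this module,
        and normally the context graph supplies these slots instead):

            check = ImplementationCheck(
                "task1",
                "uses_loop",
                ("Use a for loop", "Your code must include a for loop."),
                pattern="for _ in _:\n    ___",
                name=("for loop", "for loops")
            )

            import ast
            code = open("submission.py").read()
            tree = ast.parse(code)
            check.evaluate_in_context({
                "task_info": {"id": "task1"},
                "filename": "submission.py",
                "top_scope": tree,
                "scope": tree,
            })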
3134 """ 3135 # Grab scope and top-scope slots 3136 task_info = context_utils.extract(context, "task_info") 3137 scope = context_utils.extract(context, "scope") 3138 top_scope = context_utils.extract(context, "top_scope") 3139 filename = context_utils.extract(context, "filename") 3140 3141 # Create sub-context 3142 context = context or {} 3143 sub_context = {} 3144 sub_context.update(context) 3145 3146 # Create/extract matching environment 3147 if sub_context.get("env") is not None: 3148 env = sub_context["env"] 3149 else: 3150 env = {} 3151 3152 # Swap from the specified scope over to the matching definition 3153 # if check_in_def was specified: 3154 if self.check_in_def: 3155 if not isinstance(scope, ast.Call): 3156 raise context_utils.MissingContextError( 3157 "Attempt to check in a definition but parent check" 3158 + " didn't provide a function call to work from:" 3159 + "\n{}\n{}".format(scope, self.description) 3160 ) 3161 3162 if not isinstance(scope.func, ast.Name): 3163 raise context_utils.MissingContextError( 3164 "Attempt to check in a definition but the parent" 3165 + " check provided a function call with a complex func" 3166 + " expression:\n {}".format(scope) 3167 ) 3168 3169 defs = mast.findall( 3170 top_scope, 3171 "def {}(___):\n ___".format(scope.func.id), 3172 env=env, 3173 gen=False 3174 ) 3175 3176 if len(defs) == 0: 3177 raise context_utils.MissingContextError( 3178 ( 3179 "Attempt to check in a definition but the parent" 3180 + " check provided a function call (to {}) with no" 3181 + " matching definitions:\n {}" 3182 ).format(scope.func.id, scope) 3183 ) 3184 3185 # last definition overrides earlier ones if there are multiple 3186 last_node, last_envs = defs[-1] 3187 # TODO: DEBUG 3188 if last_node is None: 3189 print("None last_node") 3190 3191 scope = last_node 3192 # arbitrarily use first env; shouldn't be multiple we hope? 3193 env = last_envs[0] 3194 3195 # list of matching AST nodes 3196 matches = [] 3197 3198 # Scope our match predicate: 3199 my_match = self.match 3200 3201 # Our match filter: 3202 match_filter = lambda node, env: my_match(top_scope, node, env) 3203 3204 # Define match-collecting function 3205 def collect_matches(in_scope, memo=None): 3206 """ 3207 This local function collects matches to any of the patterns 3208 in this goal's patterns list, subject to the goal's matching 3209 rule. It accepts a scope (an AST node to search within) and 3210 uses a memo set to remember which callees have been 3211 investigated so that recursive functions with callees=True 3212 will not create an infinite loop. 3213 """ 3214 nonlocal self 3215 if memo is None: # remember which callees we've investigated 3216 memo = set() 3217 for pat in self.patterns: 3218 try: 3219 for node, envs in mast.findall( 3220 in_scope, 3221 pat, 3222 outside=self.outside, 3223 matchpred=match_filter, 3224 env=env, 3225 normalize=self.normalize, 3226 gen=True 3227 ): 3228 for prev_node, prev_envs in matches: 3229 if prev_node == node: 3230 # TODO: worry whether this duplicates envs? 3231 prev_envs.extend(envs) 3232 break 3233 else: # if we didn't ever break 3234 if not ( 3235 self.force_smaller_match 3236 and node is in_scope 3237 ): 3238 matches.append((node, envs)) 3239 3240 except Exception: 3241 # Rule checks shouldn't crash no matter what students 3242 # do... 
                    traceback.print_exc()
                    logging.log(
                        (
                            'ERROR CHECKING RULE\n rule name: "{}"\n'
                            + ' attempted pattern: {}'
                        ).format(self.name, pat)
                    )
                    raise  # will be caught below

            # Check for matches in callees too.
            # WARNINGS:
            # - Matches only calls where the function position is a name
            #   (not an arbitrary expression)
            # - Searches the top-level task code node for this name
            #   without understanding shadowing and without considering
            #   arguments/parameters
            # - Attempts to match the full pattern within a single
            #   function (currently cannot automatically split pattern
            #   across a call)
            # - Likely to cause even more exponential blowup
            # - No attempts are made to respect scope when unifying
            #   env with match environments in callees
            if self.callees:
                callee_names = set(
                    call_env['f'].id
                    for call_node, call_envs in mast.findall(
                        in_scope,
                        '_f_(___)',
                        gen=True,
                        matchpred=(
                            lambda node, env: type(env['f']) == ast.Name
                        )
                    )  # noqa: E123
                    for call_env in call_envs
                )
                # Exclude already-checked callees and update memo:
                callee_names -= memo
                memo |= callee_names
                # Check each callee
                for callee_name in callee_names:
                    callee_patterns = [
                        pat.replace("_f_", callee_name)
                        for pat in patterns.ALL_DEF_PATTERNS
                    ]
                    outside_patterns = [
                        pat.replace("_f_", in_scope.name)
                        for pat in patterns.ALL_DEF_PATTERNS
                    ] if type(scope) == ast.FunctionDef else []
                    for cpat in callee_patterns:
                        for callee_def_node, callee_def_env in mast.findall(
                            top_scope,
                            cpat,
                            outside=outside_patterns,
                            gen=True
                        ):
                            collect_matches(callee_def_node, memo=memo)

        # Now that we've defined our collect_matches function, let's use it:
        try:
            collect_matches(scope)
        except Exception:
            logging.log(
                '>>> WARNING: check_ast_rule exception:\n'
                + html_tools.string_traceback()
                + '\n<<<'
            )
            logging.log(
                (
                    "Exception while performing ImplementationCheck:\n"
                    "(topic: '{}', patterns: {})"
                ).format(self.feedback_topic(), self.patterns)
            )
            self.result = {
                "status": "unknown",
                "traceback": html_tools.html_traceback(
                    linkable=context_utils.linkmap(context)
                ),
                "warnings": [
                    "There was an error while checking your implementation."
3327 ] 3328 } 3329 self.set_explanation( 3330 context, 3331 status="crash", 3332 default=html_tools.html_traceback( 3333 title="Error while checking implementation:", 3334 linkable=context_utils.linkmap(context) 3335 ) 3336 ) 3337 return self.result 3338 3339 # Used for messaging in presence of subrules: 3340 unrefined_match_count = len(matches) 3341 3342 # Refine matches by running subrules: 3343 partial_matches = [] 3344 full_matches = [] 3345 full_match_subtable = None 3346 partial_match_subtable = None 3347 closest_subtable = None 3348 closest_successes = -1 3349 closest_partials = -1 3350 for (node, envs) in matches: 3351 for env in envs: 3352 subsuccesses = 0 3353 subpartials = 0 3354 for rule in self.subrules: 3355 this_sub_context = {} 3356 this_sub_context.update(sub_context) 3357 this_sub_context["scope"] = node 3358 this_sub_context["env"] = env 3359 # evaluate sub-rule 3360 sub_result = rule.evaluate_in_context(this_sub_context) 3361 if sub_result["status"] == "accomplished": 3362 subsuccesses += 1 3363 elif sub_result["status"] == "partial": 3364 subpartials += 1 3365 3366 # tally sub-results 3367 if subsuccesses == len(self.subrules): 3368 # all succeeded: this is a full match 3369 if full_match_subtable is None: 3370 full_match_subtable = self.build_subtable() 3371 for prev_node, prev_envs in full_matches: 3372 if prev_node == node: 3373 prev_envs.append(env) 3374 break 3375 else: # if we didn't break 3376 full_matches.append((node, [env])) 3377 elif ( 3378 (subsuccesses + subpartials) == len(self.subrules) 3379 or ( 3380 (subsuccesses + subpartials / 2) 3381 >= (len(self.subrules) - self.subslip) 3382 ) 3383 ): 3384 # partially succeeded 3385 if partial_match_subtable is None: 3386 partial_match_subtable = self.build_subtable() 3387 for prev_node, prev_envs in partial_matches: 3388 if prev_node == node: 3389 prev_envs.append(env) 3390 break 3391 else: # if we didn't break 3392 partial_matches.append((node, [env])) 3393 elif ( 3394 subsuccesses > closest_successes 3395 or ( 3396 subsuccesses == closest_successes 3397 and subpartials > closest_partials 3398 ) 3399 ): 3400 # Best so far in terms of subrule successes 3401 closest_successes = subsuccesses 3402 closest_partials = subpartials 3403 closest_subtable = self.build_subtable() 3404 3405 # Get counts: 3406 full_match_identities = [] 3407 for n, envs in full_matches: 3408 identity_or_identities = self.match_identity(top_scope, n, envs) 3409 if isinstance(identity_or_identities, list): 3410 full_match_identities.extend(identity_or_identities) 3411 else: 3412 full_match_identities.append(identity_or_identities) 3413 3414 n_full_matches = len(set(full_match_identities)) 3415 3416 partial_match_identities = [] 3417 for n, envs in partial_matches: 3418 identity_or_identities = self.match_identity(top_scope, n, envs) 3419 if isinstance(identity_or_identities, list): 3420 partial_match_identities.extend(identity_or_identities) 3421 else: 3422 partial_match_identities.append(identity_or_identities) 3423 3424 n_partial_matches = len(set(partial_match_identities)) 3425 3426 # Check bounds now that we know which matches are partial/full: 3427 violated_min = ( 3428 self.min_allowed is not None 3429 and self.min_allowed > n_full_matches 3430 ) 3431 violated_max = ( 3432 self.max_allowed is not None 3433 and self.max_allowed < n_full_matches 3434 ) 3435 obeyed_min_partially = ( 3436 self.min_allowed is None 3437 or self.min_allowed <= n_partial_matches 3438 ) 3439 # You can't use partial matches to satisfy the max limit 3440 3441 # 
Notes and warnings for our ultimate result:
        notes = []
        warnings = []

        # Assign status
        result_status = None
        if violated_min:
            if obeyed_min_partially:
                result_status = "partial"

            if self.softmin:
                if isinstance(self.softmin, (str, list, tuple)):
                    if "note" in self.softmin:
                        notes.append(
                            "Found fewer {} than expected.".format(
                                self.pl_name
                            )
                        )

                    if "warn" in self.softmin:
                        warnings.append(
                            "Found fewer {} than expected.".format(
                                self.pl_name
                            )
                        )

                    if "partial" in self.softmin:
                        result_status = "partial"

                    if "fail" in self.softmin:
                        result_status = "failed"

                elif (
                    isinstance(self.softmin, (int, float))
                    and not isinstance(self.softmin, bool)
                ):
                    matchpoints = n_full_matches + 0.5 * n_partial_matches
                    if matchpoints >= self.softmin:
                        result_status = "partial"
                    else:
                        result_status = "failed"
                else:
                    # softmin is True (or another truthy value): soften
                    # the failure to partial
                    result_status = "partial"

            elif not obeyed_min_partially:
                result_status = "failed"

        if violated_max:
            if self.softmax:
                if isinstance(self.softmax, (str, list, tuple)):
                    if "note" in self.softmax:
                        notes.append(
                            f"Found more {self.pl_name} than expected."
                        )

                    if "warn" in self.softmax:
                        warnings.append(
                            f"Found more {self.pl_name} than expected."
                        )

                    if "partial" in self.softmax:
                        # Don't upgrade failed (e.g. due to softmin):
                        if result_status != "failed":
                            result_status = "partial"

                    if "fail" in self.softmax:
                        # old status is irrelevant
                        result_status = "failed"

                elif (
                    isinstance(self.softmax, (int, float))
                    and not isinstance(self.softmax, bool)
                ):
                    # partial matches don't count against max
                    if (
                        n_full_matches <= self.softmax
                        and result_status != "failed"
                    ):
                        result_status = "partial"
                    else:
                        result_status = "failed"
                elif self.softmax:
                    # softmax is True (or another truthy value): soften
                    # the failure to partial
                    if result_status != "failed":
                        result_status = "partial"
            else:
                result_status = "failed"

        # No status assigned by min/max constraints? Then it's accomplished:
        if result_status is None:
            result_status = "accomplished"

        # Figure out line numbers for matches
        matching_lines = [
            mast.node_line(node)
            for node, envs in full_matches
        ]
        partial_lines = [
            mast.node_line(node)
            for node, envs in partial_matches
        ]
        arent_extra = [
            node for node, envs in full_matches
        ] + [
            node for node, envs in partial_matches
        ]
        non_matching_lines = [
            mast.node_line(node)
            for node, envs in matches
            if node not in arent_extra
        ]

        # Create explanation:
        plural = True
        if self.max_allowed == 0:
            quantity = "zero"
        elif self.min_allowed is None:
            if self.max_allowed is None:
                quantity = "any number of"
            else:
                quantity = "no more than {}".format(self.max_allowed)
                plural = self.max_allowed != 1
        else:
            if self.max_allowed is None:
                quantity = "at least {}".format(self.min_allowed)
                plural = self.min_allowed != 1
            elif self.min_allowed == self.max_allowed:
                quantity = "exactly {}".format(self.min_allowed)
                plural = self.max_allowed != 1
            else:
                quantity = "between {} and {}".format(
                    self.min_allowed,
                    self.max_allowed
                )
                plural = True

        extra_unrefined = (
            unrefined_match_count
            - len(full_matches)
            - len(partial_matches)
        )
        explanation = (
            "Expected {quantity} {name}, found {found}{sub}."
3577 ).format( 3578 quantity=quantity, 3579 name=self.pl_name if plural else self.name, 3580 found=( 3581 str(n_full_matches) 3582 if ( 3583 result_status == "accomplished" # partials are irrelevant 3584 or len(partial_match_identities) == 0 # no partials 3585 or self.max_allowed == 0 # partials are irrelevant 3586 ) 3587 else 3588 "{} {}, plus {} partial {} which did not satisfy {}".format( 3589 n_full_matches, 3590 phrasing.plural(n_full_matches, "match", "matches"), 3591 n_partial_matches, 3592 phrasing.plural(n_partial_matches, "match", "matches"), 3593 phrasing.plural( 3594 len(self.subrules), 3595 "the sub-rule", 3596 f"all {len(self.subrules)} sub-rules" 3597 ) 3598 ) 3599 ), 3600 sub=( 3601 " (found {}{} possible {} which did not satisfy {})" 3602 ).format( 3603 extra_unrefined, 3604 " more" if n_partial_matches > 0 else '', 3605 phrasing.plural(extra_unrefined, "match", "matches"), 3606 phrasing.plural( 3607 len(self.subrules), 3608 "the sub-rule", 3609 f"enough of the {len(self.subrules)} sub-rules" 3610 ), 3611 ) if self.subrules and extra_unrefined else "" 3612 ) 3613 3614 # Add line numbers: 3615 if len(matching_lines) > 0: 3616 notes.append( 3617 "Found on line(s): {}".format( 3618 ', '.join( 3619 html_tools.html_link_to_line( 3620 task_info["id"], 3621 filename, 3622 ln 3623 ) 3624 for ln in matching_lines 3625 ) 3626 ) 3627 ) 3628 if len(partial_lines) > 0 and result_status != "accomplished": 3629 notes.append( 3630 "Found partial matches on line(s): {}".format( 3631 ', '.join( 3632 html_tools.html_link_to_line( 3633 task_info["id"], 3634 filename, 3635 ln 3636 ) 3637 for ln in partial_lines 3638 ) 3639 ) 3640 ) 3641 if ( 3642 self.subrules 3643 and extra_unrefined 3644 and result_status != "accomplished" 3645 ): 3646 notes.append( 3647 "Found disqualified matches on line(s): {}".format( 3648 ", ".join( 3649 html_tools.html_link_to_line( 3650 task_info["id"], 3651 filename, 3652 ln 3653 ) 3654 for ln in non_matching_lines 3655 ) 3656 ) 3657 ) 3658 3659 if full_match_subtable is not None: 3660 subtable = full_match_subtable 3661 elif partial_match_subtable is not None: 3662 subtable = partial_match_subtable 3663 else: 3664 subtable = closest_subtable # might still be None in some cases 3665 3666 self.result = { 3667 "status": result_status, 3668 "notes": notes, 3669 "warnings": warnings, 3670 "subtable": subtable 3671 } 3672 3673 self.set_explanation(context, default=explanation) 3674 # TODO: Bubble warnings from sub-rules? 3675 return self.result 3676 3677 3678class NoParseErrors(Goal): 3679 """ 3680 This goal is simply accomplished if there are no parsing errors 3681 during task loading, and failed otherwise. If generate_warnings is given, 3682 it generates a warning for each parse error. The created goal will 3683 always use the identifier "misc:no_parse_errors". 3684 """ 3685 def __init__( 3686 self, 3687 taskid, 3688 description=( 3689 "No errors loading code", 3690 ( 3691 "Your code should be able to be loaded without errors. Run " 3692 + "your code before submitting it to make sure this is true." 3693 ) 3694 ), 3695 generate_warnings=True, 3696 **kwargs 3697 ): 3698 """ 3699 A task ID is required. A default description is available. If 3700 generate_warnings is given as False, parse errors will not be 3701 turned into warnings, but in the default case, they will be. 3702 3703 The goal identifier will be "misc:no_parse_errors".
3704 """ 3705 super().__init__( 3706 taskid, 3707 "misc:no_parse_errors", 3708 description, 3709 **kwargs 3710 ) 3711 self.set_default_goal_type("procedure") 3712 self.generate_warnings = generate_warnings 3713 3714 # subgoals is inherited (no subgoals) 3715 3716 # table is inherited 3717 3718 def evaluate_in_context(self, context=None): 3719 """ 3720 Checks whether there were any parse errors. 3721 """ 3722 context = context or {} 3723 if ( 3724 "parse_errors" not in context 3725 or len(context["parse_errors"]) == 0 3726 ): 3727 self.result = { "status": "accomplished" } 3728 self.set_explanation( 3729 context, 3730 default="There weren't any parsing errors." 3731 ) 3732 return self.result 3733 else: 3734 message = "There were errors during parsing." 3735 if not self.generate_warnings: 3736 # Incorporate errors into message directly: 3737 message += "<br>\n" + '<br>\n'.join( 3738 html_tools.summarize_parse_error(e) 3739 for e in context["parse_errors"] 3740 ) 3741 3742 self.result = { "status": "failed" } 3743 3744 if self.generate_warnings: 3745 # Generate a warning for each error: 3746 self.result["warnings"] = [ 3747 html_tools.summarize_parse_error(e) 3748 for e in context["parse_errors"] 3749 ] 3750 3751 self.set_explanation(context, default=message) 3752 return self.result 3753 3754 3755#--------------------------# 3756# Specialized linter goals # 3757#--------------------------# 3758 3759class LintCheck(Goal): 3760 """ 3761 Runs a linter function against the auto-context for "scope". Inherit 3762 and override the `check` method with a function that accepts a 3763 context and returns a goal evaluation result to define your linter. 3764 """ 3765 def check(self, context): 3766 """ 3767 Not implemented; override to define specific linters. 3768 """ 3769 raise NotImplementedError( 3770 "LintCheck is an abstract class that can't be used directly." 3771 ) 3772 3773 def __init__( 3774 self, 3775 taskid, 3776 identifier, 3777 description=( 3778 "BLANK LINT CHECK", 3779 "THIS GOAL HAS NOT BEEN DEFINED" 3780 ), 3781 goal_type="style", 3782 uses_slots=("scope",), 3783 **kwargs 3784 ): 3785 """ 3786 In addition to a task ID, an identifier, and a description, a 3787 goal type may be supplied other than the default "style". 3788 "procedure" is the most likely alternative. 3789 3790 The categorizer "link:" will be prepended to the identifier 3791 provided. 3792 3793 The slots required should be given as uses_slots, and a relevant 3794 context will be selected or created as the testing context. 3795 3796 Any extra arguments are passed through to the `Goal` constructor. 3797 """ 3798 # Auto context dependency based on uses_slots 3799 depends = contexts.auto(*uses_slots) 3800 if len(depends) == 1: 3801 test_context = depends[0] 3802 else: 3803 # TODO: De-duplicate stuff where one context actually 3804 # provides everything needed via inheritance but auto 3805 # doesn't see that? 3806 test_context = contexts.Context( 3807 description=( 3808 "Details of your code", 3809 ( 3810 "The " + phrasing.comma_list(uses_slots) 3811 + " of your code." 
3812 ) 3813 ), 3814 builder=lambda ctx: ctx, 3815 depends=depends 3816 ) 3817 3818 if "test_in" not in kwargs: 3819 kwargs["test_in"] = {} 3820 if "contexts" not in kwargs["test_in"]: 3821 kwargs["test_in"]["contexts"] = [ test_context ] 3822 3823 # Specified goal type 3824 if "tags" not in kwargs: 3825 kwargs["tags"] = {} 3826 kwargs["tags"]["goal_type"] = goal_type 3827 3828 # Set up Goal stuff 3829 super().__init__( 3830 taskid, 3831 "lint:" + identifier, 3832 description, 3833 **kwargs 3834 ) 3835 3836 # subgoals is inherited (no subgoals) 3837 3838 # table is inherited 3839 3840 def evaluate_in_context(self, context=None): 3841 """ 3842 Runs the checker and returns its result. 3843 """ 3844 context = context or {} 3845 3846 try: 3847 self.result = self.check(context) 3848 3849 if self.result is None: 3850 raise ValueError( 3851 f"Linter for {self.__class__.__name__} returned None!" 3852 ) 3853 except Exception: 3854 self.result = { 3855 "status": "failed", 3856 "traceback": html_tools.html_traceback( 3857 linkable=context_utils.linkmap(context) 3858 ) 3859 } 3860 self.set_explanation( 3861 context, 3862 status="crash", 3863 default=html_tools.html_traceback( 3864 title="Error while inspecting your code.", 3865 linkable=context_utils.linkmap(context) 3866 ) 3867 ) 3868 return self.result 3869 3870 self.set_explanation( 3871 context, 3872 default=self.result["explanation"] 3873 ) 3874 3875 return self.result 3876 3877 3878class AllFunctionsHaveDocstrings(LintCheck): 3879 """ 3880 A `LintCheck` which requires that all functions defined in the 3881 submitted module must have non-empty docstrings. 3882 """ 3883 def __init__(self, taskid, exclude=None, **kwargs): 3884 """ 3885 A task ID is required. A list of function names to ignore may be 3886 given as `exclude`. All other keyword arguments are passed to the 3887 `LintCheck` constructor. If no description is specified, a 3888 default description will be included. 3889 3890 The identifier will be "docstrings". 3891 """ 3892 self.exclude = exclude or [] 3893 3894 if "description" not in kwargs: 3895 kwargs["description"] = ( 3896 "All functions are documented", 3897 ( 3898 "Each function you define must include a non-empty" 3899 + " documentation string as the very first thing in" 3900 + " the function." 3901 ) 3902 ) 3903 3904 super().__init__( 3905 taskid, 3906 "docstrings", 3907 uses_slots=["docstrings", "defs"], 3908 **kwargs 3909 ) 3910 3911 def check(self, context): 3912 """ 3913 Checks that none of the extracted docstrings are None or 3914 empty. Requires a context that has a "docstrings" slot. 3915 """ 3916 docmap = context_utils.extract(context, "docstrings") 3917 empty_docstrings = [] 3918 has_docstrings = [] 3919 for fname in sorted(docmap): 3920 if fname not in self.exclude and docmap[fname] == '': 3921 empty_docstrings.append(fname) 3922 elif fname not in self.exclude: 3923 has_docstrings.append(fname) 3924 3925 if empty_docstrings: 3926 if has_docstrings: 3927 return { 3928 "status": "partial", 3929 "explanation": ( 3930 "Some functions had docstrings but others" 3931 " didn't. 
Functions missing docstrings:" 3932 "<br>\n{}" 3933 ).format( 3934 '<br>\n'.join( 3935 '<code>{}</code>'.format(fname) 3936 for fname in empty_docstrings 3937 ) 3938 ) 3939 } 3940 else: 3941 return { 3942 "status": "failed", 3943 "explanation": ( 3944 "One or more functions were missing" 3945 " docstrings or had empty docstrings:" 3946 "<br>\n{}" 3947 ).format( 3948 '<br>\n'.join( 3949 '<code>{}</code>'.format(fname) 3950 for fname in empty_docstrings 3951 ) 3952 ) 3953 } 3954 else: 3955 return { 3956 "status": "accomplished", 3957 "explanation": ( 3958 "All required functions included docstrings." 3959 ) 3960 } 3961 3962 3963class FunctionsArentNested(LintCheck): 3964 """ 3965 A `LintCheck` which requires that no functions are defined inside 3966 other functions. 3967 """ 3968 def __init__(self, taskid, exclude=None, **kwargs): 3969 """ 3970 A task ID is required. A list of function names to exclude from 3971 the check may be provided. These functions will be ignored if 3972 they are nested, and functions nested inside them will not be 3973 flagged. 3974 3975 The identifier will be "functions_arent_nested". 3976 """ 3977 self.exclude = exclude or [] 3978 3979 if "description" not in kwargs: 3980 kwargs["description"] = ( 3981 "Do not define functions inside of other functions", 3982 ( 3983 "None of your function definitions may be placed" 3984 " inside of other function definitions." 3985 ) 3986 ) 3987 3988 super().__init__( 3989 taskid, 3990 "functions_arent_nested", 3991 uses_slots=["docstrings"], 3992 goal_type="procedure", 3993 **kwargs 3994 ) 3995 3996 def check(self, context): 3997 """ 3998 A linter function that checks a defs context to make sure 3999 that none of the definitions includes an interior def. 4000 """ 4001 filename = context_utils.extract(context, "filename") 4002 defsmap = context_utils.extract(context, "defs") 4003 task_info = context_utils.extract(context, "task_info") 4004 4005 has_nested = {} 4006 for name in defsmap: 4007 if name not in self.exclude: 4008 inners = defsmap[name].body 4009 for pat in patterns.ALL_DEF_PATTERNS: 4010 for inner_statement in inners: 4011 for (match, bindings) in mast.findall( 4012 inner_statement, 4013 pat 4014 ): 4015 if match.name not in self.exclude: 4016 has_nested.setdefault( 4017 name, 4018 set() 4019 ).add(match) 4020 4021 if has_nested: 4022 all_defs = set( 4023 [name for name in defsmap if name not in self.exclude] 4024 ) 4025 nested_defs = set() 4026 for outer in has_nested: 4027 nested_defs |= has_nested[outer] 4028 4029 pct_nested = len(nested_defs) / len(all_defs) 4030 4031 nested_msg = ( 4032 "We found the following functions defined within" 4033 + " other functions:<br>\n<ul>" 4034 + "\n".join( 4035 "<li>Within {} (on line {}):<br>{}</li>".format( 4036 outer, 4037 html_tools.html_link_to_line( 4038 task_info["id"], 4039 filename, 4040 defsmap[outer].lineno 4041 ), 4042 "<br>\n".join( 4043 "<code>{}</code> on line {}".format( 4044 inner.name, 4045 html_tools.html_link_to_line( 4046 task_info["id"], 4047 filename, 4048 inner.lineno 4049 ) 4050 ) 4051 for inner in has_nested[outer] 4052 ) 4053 ) 4054 for outer in has_nested 4055 ) 4056 ) 4057 4058 if pct_nested < 0.5: 4059 return { 4060 "status": "partial", 4061 "explanation": ( 4062 "Some relevant definitions were found inside" 4063 " other definitions. " 4064 ) + nested_msg 4065 } 4066 else: 4067 return { 4068 "status": "failed", 4069 "explanation": ( 4070 "More than half of relevant definitions were" 4071 " found within other definitions! 
" 4072 ) + nested_msg 4073 } 4074 4075 return { 4076 "status": "accomplished", 4077 "explanation": "No defs were found within other defs." 4078 } 4079 else: 4080 return { 4081 "status": "accomplished", 4082 "explanation": "No defs were found within other defs." 4083 } 4084 4085 4086class DoesntWasteFruit(LintCheck): 4087 """ 4088 A `LintCheck` that makes sure that any fruitful function or method 4089 calls get stored in variables or used as part of expressions. A 4090 fruitful function or method is one of: 4091 4092 1. Defined in the submission itself with an interior return node that 4093 has an expression associated with it, which isn't inside a nested 4094 definition. 4095 2. One of the functions named in the `extra` list of strings, or a 4096 method named in that list with a '.' at the start. 4097 4098 This goal will fail if at least one function call to a fruitful 4099 function or method doesn't use the result, but will partially succeed 4100 if there's at least one that does use the result. 4101 """ 4102 def __init__(self, taskid, exclude=None, extra=None, **kwargs): 4103 """ 4104 A task ID is required. A list of strings specifying names of 4105 functions to exclude from this check may be given. The code in 4106 those functions won't be inspected for wasting fruit, but calls 4107 to those functions in other contexts will still be inspected if 4108 they're fruitful. 4109 4110 A description tuple can be supplied but a reasonable default will be 4111 use if it isn't given. 4112 4113 The identifier will be "doesnt_waste_fruit". 4114 """ 4115 self.exclude = exclude or [] 4116 self.extra = extra or [] 4117 4118 if "description" not in kwargs: 4119 kwargs["description"] = ( 4120 ( 4121 "Do not ignore the results of any fruitful function" 4122 " calls" 4123 ), 4124 ( 4125 "According to the \"Don't waste fruit\" principle," 4126 " every place you call a fruitful function" 4127 " (built-in or custom) you must store the result in" 4128 " a variable, or that function call must be part of" 4129 " a larger expression that uses its return value." 4130 ) 4131 ) 4132 4133 super().__init__( 4134 taskid, 4135 "doesnt_waste_fruit", 4136 uses_slots=["scope"], 4137 goal_type="procedure", 4138 **kwargs 4139 ) 4140 4141 def check(self, context): 4142 """ 4143 Returns success if none of the fruitful function and/or method 4144 calls in the given AST tree has a result but fails to either 4145 store it in a variable or use it as part of a larger expression 4146 or statement. 4147 """ 4148 filename = context_utils.extract(context, "filename") 4149 scope = context_utils.extract(context, "scope") 4150 task_info = context_utils.extract(context, "task_info") 4151 4152 # Variables to accumulate results 4153 fruitful_defs = {} 4154 4155 used_calls = set() 4156 unused_calls = set() 4157 4158 # Maps from function names (or method names prefixed with '.') to 4159 # AST Call nodes for good calls (fruitful functions called in a 4160 # way that uses their result) and bad calls (fruitful functions 4161 # called as bare expressions). 
4162 good_calls = {} 4163 bad_calls = {} 4164 4165 # Gather fruitful definitions 4166 for pat in patterns.ALL_DEF_PATTERNS: 4167 for (matching_node, bindings) in mast.findall(scope, pat): 4168 if mast.find( 4169 matching_node.body, # so we don't exclude this def itself 4170 "return _", 4171 outside=patterns.ALL_DEF_PATTERNS 4172 ): 4173 fruitful_defs[matching_node.name] = matching_node 4174 4175 # Search entire code for used/unused function or method calls: 4176 self.accumulate_function_and_method_calls( 4177 scope, 4178 used_calls, 4179 unused_calls, 4180 self.exclude 4181 ) 4182 4183 # Find bad unused calls to fruitful functions 4184 for call in unused_calls: 4185 # Get the name of the function we're calling 4186 if isinstance(call.func, ast.Name): 4187 # A direct function call 4188 fname = call.func.id 4189 mname = fname 4190 elif isinstance(call.func, ast.Attribute): 4191 # A method call 4192 fname = call.func.attr 4193 mname = '.' + fname 4194 else: 4195 # Too complex to analyze; skip this function call 4196 continue 4197 4198 # Decide if this call is bad or not: 4199 if ( 4200 mname in self.extra 4201 or fname in fruitful_defs 4202 ): 4203 bad_calls.setdefault(mname, []).append(call) 4204 4205 # Find good used calls to fruitful functions 4206 for call in used_calls: 4207 # Get the name of the function we're calling 4208 if isinstance(call.func, ast.Name): 4209 # A direct function call 4210 fname = call.func.id 4211 mname = fname 4212 elif isinstance(call.func, ast.Attribute): 4213 # A method call 4214 fname = call.func.attr 4215 mname = '.' + fname 4216 else: 4217 # Too complex to analyze; skip this function call 4218 continue 4219 4220 # Decide if this call is good or not: 4221 if ( 4222 mname in self.extra 4223 or fname in fruitful_defs 4224 ): 4225 good_calls.setdefault(mname, []).append(call) 4226 4227 # Report results 4228 if (len(bad_calls) > 0): 4229 bad_call_report = ( 4230 "We found the following calls to fruitful functions" 4231 + " whose results were ignored:\n<ul>{}</ul>" 4232 ).format( 4233 "\n".join( 4234 "<li><code>{}</code> on line(s) {}</li>".format( 4235 fname, 4236 ", ".join( 4237 html_tools.html_link_to_line( 4238 task_info["id"], 4239 filename, 4240 call.lineno 4241 ) 4242 for call in bad_calls[fname] 4243 ) 4244 ) 4245 for fname in bad_calls 4246 ) 4247 ) 4248 4249 if len(good_calls) == 0: 4250 return { 4251 "status": "failed", 4252 "explanation": ( 4253 "Your code used fruitful functions but ignored" 4254 + " their results. " 4255 ) + bad_call_report 4256 } 4257 else: 4258 return { 4259 "status": "partial", 4260 "explanation": ( 4261 "Your code used some fruitful functions but" 4262 + " ignored their results. " 4263 ) + bad_call_report 4264 } 4265 else: # no bad calls! 4266 return { 4267 "status": "accomplished", 4268 "explanation": ( 4269 "All calls to fruitful functions in your code" 4270 + " correctly made use of their results." 4271 ) 4272 } 4273 4274 def accumulate_function_and_method_calls( 4275 self, 4276 node, 4277 used, 4278 unused, 4279 exclude=[] 4280 ): 4281 """ 4282 Recursively accumulates used and unused function and method 4283 calls. Ignores function calls where the function being called is 4284 the result of an expression that's not an ast.Name or an 4285 ast.Attribute. 4286 4287 The 'used' and 'unused' parameters are treated as sets of AST 4288 nodes. 4289 4290 The `exclude` parameter is optional and lists functions whose 4291 definitions won't be inspected. 
4292 """ 4293 # We won't process things which come up in recursion that aren't AST 4294 # nodes (like strings, None, etc.). Note that when we recurse we make 4295 # sure to recurse into the AST nodes within lists like bodies. 4296 if not isinstance(node, ast.AST): 4297 return 4298 4299 # If this is a function call that hasn't already been marked as 4300 # unused, mark it as used 4301 if isinstance(node, ast.Call) and node not in unused: 4302 # Only add it if it's a simple call to a function or method 4303 if isinstance(node.func, (ast.Name, ast.Attribute)): 4304 used.add(node) 4305 4306 # Don't recurse or process statements if we're the definition of 4307 # an excluded function 4308 if isinstance(node, ast.FunctionDef) and node.name in exclude: 4309 return 4310 4311 # Gather places to look for calls that qualify as unused: 4312 statements = [] 4313 if isinstance( 4314 node, 4315 ( 4316 ast.Module, 4317 ast.FunctionDef, 4318 ast.ClassDef, 4319 ast.ExceptHandler, 4320 ast.With 4321 ) 4322 ): 4323 # A node that has a body 4324 statements = node.body 4325 4326 elif isinstance(node, (ast.If, ast.For, ast.While)): 4327 # We need to inspect both the body and the orelse 4328 statements = node.body + node.orelse 4329 4330 elif isinstance(node, ast.Try): 4331 # Inspect body, finalbody, and orelse (handlers will be inspected 4332 # when recursing on them) 4333 statements = node.body + node.finalbody + node.orelse 4334 4335 # No other AST nodes define blocks, so they can't give rise to unused 4336 # function/method calls. 4337 4338 # Inspect the block-level statements for unused expressions 4339 # TODO: Should we negate this? ANY expression which isn't a function 4340 # call to a non-fruitful function is wasting a value when it appears 4341 # as a statement... 4342 for statement in statements: 4343 if ( 4344 isinstance(statement, ast.Expr) 4345 and isinstance(statement.value, ast.Call) 4346 ): 4347 call = statement.value 4348 if isinstance(call.func, ast.Name): 4349 unused.add(call) 4350 elif isinstance(call.func, ast.Attribute): 4351 unused.add(call) 4352 # else ignore this call; it's too complex 4353 4354 # Recurse to accumulate results from inner nodes 4355 for field in node._fields: 4356 4357 if not hasattr(node, field): # skip missing fields 4358 continue 4359 4360 child = getattr(node, field) 4361 if isinstance(child, list): # recurse into each element 4362 for child_part in child: 4363 self.accumulate_function_and_method_calls( 4364 child_part, 4365 used, 4366 unused, 4367 exclude 4368 ) 4369 else: # Just recurse into this item 4370 self.accumulate_function_and_method_calls( 4371 child, 4372 used, 4373 unused, 4374 exclude 4375 ) 4376 4377 4378class DoesntWasteBoxes(LintCheck): 4379 """ 4380 A `LintCheck` which looks for unused variables, excluding a list of 4381 strings (named functions won't be inspected at all and named 4382 variables won't be counted as unused). The given partial tolerance 4383 value controls how many unused variables must exist before the goal 4384 is failed instead of partially completed. Set it to 0 to force a 4385 strict binary accomplished/failed result. 4386 4387 The special name '_' will always be permitted, as it explicitly 4388 hints that a value will not be used. By default, loop variables will 4389 not be checked, although they can be inspected by setting 4390 `check_loop_vars` to True. 
4391 4392 An unused variable is defined as a variable which is set but never 4393 loaded, which we detect via ast.Name nodes and 4394 ast.FunctionDef/ast.Lambda arguments and the presence of Store vs. 4395 Load contexts. This goal will happily accept load-before-store, 4396 but other parts of your rubric will probably notice the code crashing 4397 when run... 4398 4399 Note that our handling of scopes is primitive: we recognize the 4400 global scope and function def scopes, but not all the nuances of 4401 other scopes. 4402 """ 4403 def __init__( 4404 self, 4405 taskid, 4406 exclude=None, 4407 tolerance=2, 4408 check_loop_vars=False, 4409 **kwargs 4410 ): 4411 """ 4412 A task ID is required. A list of strings specifying names of 4413 functions and/or variables to exclude from this check may be 4414 given. Excluded functions won't have their code inspected, and 4415 excluded variables won't be checked. 4416 4417 The identifier will be "doesnt_waste_boxes". 4418 4419 A category other than the default 'core' may also be specified. 4420 4421 A tolerance (resulting in partial instead of complete failure) 4422 other than the default of 2 may be specified. 4423 4424 A custom description tuple may be supplied, but a default 4425 description will be added if a custom one isn't provided. 4426 """ 4427 self.exclude = exclude or [] 4428 self.partial_tolerance = tolerance 4429 self.check_loop_vars = check_loop_vars 4430 4431 if "description" not in kwargs: 4432 kwargs["description"] = ( 4433 ( 4434 "Do not create any variables that you never make" 4435 " use of" 4436 ), 4437 ( 4438 "According to the \"Don't waste boxes\" principle," 4439 " every time you create a variable (using" 4440 " <code>=</code> or by defining a parameter for a" 4441 " function) you must also later use that variable" 4442 " as part of another expression. If you need to" 4443 " create a variable that you won't use, it must" 4444 " have the name <code>_</code>, but you should only" 4445 " do this if absolutely necessary." 4446 ) 4447 ) 4448 4449 super().__init__( 4450 taskid, 4451 "doesnt_waste_boxes", 4452 uses_slots=["scope"], 4453 goal_type="procedure", 4454 **kwargs 4455 ) 4456 4457 def check(self, context): 4458 """ 4459 A checker function which requires that there are no unused 4460 variables in the given scope or in any particular function 4461 definition scope inside it (more complex scoping rules aren't 4462 attended to). 4463 """ 4464 node = context_utils.extract(context, "scope") 4465 task_info = context_utils.extract(context, "task_info") 4466 filename = context_utils.extract(context, "filename") 4467 4468 # Variable to track scopes (see gather_loads_and_stores) 4469 scopes = {} 4470 4471 # Find all Name nodes plus arguments, noting which scope(s) they 4472 # are a part of.
4473 self.gather_loads_and_stores( 4474 node, 4475 scopes, 4476 exclude=self.exclude, 4477 include_loop_vars=self.check_loop_vars 4478 ) 4479 4480 # Report string and count of unused variables 4481 report = "Found the following variables that were never used:\n<ol>\n" 4482 num_unused = 0 4483 4484 # Check each scope to look for stores that don't have 4485 # corresponding loads and assemble our report: 4486 for scope in scopes: 4487 missing_loads = ( 4488 set(scopes[scope].get('store', {})) 4489 - scopes[scope].get('load', set()) 4490 - set(self.exclude) 4491 - { '_' } # single-underscore is valid for an unused result 4492 ) 4493 4494 if missing_loads: 4495 num_unused += len(missing_loads) 4496 if scope == ("__global__",): 4497 scope_repr = "the global scope" 4498 else: 4499 scope_repr = ' → '.join( 4500 "<code>{}</code>".format(sc) 4501 for sc in scope[1:] 4502 ) 4503 4504 report += "<li>In {}, found:\n<ol>\n{}\n</ol></li>\n".format( 4505 scope_repr, 4506 "\n".join( 4507 "<li>Variable <code>{}</code> on line(s) {}</li>\n" 4508 .format( 4509 var, 4510 ", ".join( 4511 html_tools.html_link_to_line( 4512 task_info["id"], 4513 filename, 4514 node.lineno 4515 ) 4516 for node in scopes[scope]['store'][var] 4517 ) 4518 ) 4519 for var in missing_loads 4520 ) 4521 ) 4522 4523 # Succeed or fail 4524 if num_unused > 0: 4525 if num_unused > self.partial_tolerance: 4526 status = "failed" 4527 else: 4528 status = "partial" 4529 4530 return { 4531 "status": status, 4532 "explanation": ( 4533 "Your code created {} variables which it did not" 4534 + " make use of:\n{}" 4535 ).format(num_unused, report + "</ol>") 4536 } 4537 else: 4538 return { 4539 "status": "accomplished", 4540 "explanation": ( 4541 "Your code did not create any variables which it did" 4542 + " not make use of." 4543 ) 4544 } 4545 4546 def gather_loads_and_stores( 4547 self, 4548 node, 4549 result, 4550 current_scopes=("__global__",), 4551 exclude=[], 4552 include_loop_vars=True 4553 ): 4554 """ 4555 Recursively traverses an AST and makes note of each time a Name 4556 appears including its Load or Store context. Accumulates results 4557 into the 'result' dictionary, which has scope-name-tuples (e.g., 4558 ("__global__",) or ("__global__", "foo", "<lambda at line 12 col 4559 8>")) as keys and values which are dictionaries with 'load' and 4560 'store' keys. The 'load' value is a set of variable names, while 4561 the 'store' value is a dictionary mapping variable names to lists 4562 of AST nodes. 4563 4564 If `include_loop_vars` is set to False (default is True), loop 4565 variables of for loops will not be included. 4566 4567 As it traverses the AST tree, the current_scopes tuple indicates 4568 which scope(s) we're inside of. We add loads to all parent scopes 4569 but stores just to the innermost scope. Note that we aren't 4570 really keeping track of shadowing properly, so a shadowed global 4571 variable would still think that it's referenced even if it's not 4572 (TODO: fix that!) 4573 """ 4574 # We won't process non-AST items 4575 if not isinstance(node, ast.AST): 4576 return 4577 4578 # Don't process if we're the definition of an excluded function 4579 if isinstance(node, ast.FunctionDef) and node.name in exclude: 4580 return 4581 4582 # Process this node if it's a Name...
4583 if isinstance(node, ast.Name): 4584 if isinstance(node.ctx, ast.Load): 4585 for i in range(1, len(current_scopes) + 1): 4586 result.setdefault(current_scopes[:i], {})\ 4587 .setdefault('load', set())\ 4588 .add(node.id) 4589 elif isinstance(node.ctx, ast.Store): 4590 result.setdefault(current_scopes, {})\ 4591 .setdefault('store', {})\ 4592 .setdefault(node.id, [])\ 4593 .append(node) 4594 # Note: we don't track Del-context Name references 4595 4596 # If this node is a FunctionDef, it creates a scope and we've also 4597 # got to add its arguments as stored variables. 4598 if isinstance(node, ast.FunctionDef) or isinstance(node, ast.Lambda): 4599 if isinstance(node, ast.FunctionDef): 4600 inner_scopes = current_scopes + (node.name,) 4601 else: # Lambdas create anonymous scopes 4602 scope_id = "<lambda at line {} col {}>".format( 4603 node.lineno, 4604 node.col_offset 4605 ) 4606 inner_scopes = current_scopes + (scope_id,) 4607 4608 for arg in ( 4609 # Note: some relevant Python versions don't have posonlyargs 4610 ( 4611 getattr(node.args, "posonlyargs") 4612 if hasattr(node.args, "posonlyargs") 4613 else [] 4614 ) 4615 + node.args.args 4616 + node.args.kwonlyargs 4617 + ([node.args.vararg] if node.args.vararg else []) 4618 + ([node.args.kwarg] if node.args.kwarg else []) 4619 ): 4620 result.setdefault(inner_scopes, {})\ 4621 .setdefault('store', {})\ 4622 .setdefault(arg.arg, [])\ 4623 .append(node) 4624 else: 4625 # Otherwise, the inner scopes are the same as the current scopes 4626 inner_scopes = current_scopes 4627 4628 # Recurse to accumulate results from inner nodes 4629 for field in node._fields: 4630 4631 if not hasattr(node, field): # skip missing fields 4632 continue 4633 4634 # Skip the target of a for loop if include_loop_vars is False 4635 if ( 4636 not include_loop_vars 4637 and isinstance(node, (ast.For, ast.AsyncFor)) 4638 and field == "target" 4639 ): 4640 continue 4641 4642 child = getattr(node, field) 4643 if isinstance(child, list): # recurse into each element 4644 for child_part in child: 4645 self.gather_loads_and_stores( 4646 child_part, 4647 result, 4648 inner_scopes, 4649 exclude, 4650 include_loop_vars 4651 ) 4652 else: # Just recurse into this item 4653 self.gather_loads_and_stores( 4654 child, 4655 result, 4656 inner_scopes, 4657 exclude, 4658 include_loop_vars 4659 )
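As a quick illustration of how the linter goals above might be combined when authoring a rubric, here is a hedged sketch (not taken from the module itself); the task ID "example_task" and the excluded/extra names are hypothetical placeholders:

    goals = [
        NoParseErrors("example_task"),
        AllFunctionsHaveDocstrings("example_task", exclude=["main"]),
        FunctionsArentNested("example_task"),
        # '.pop' is a hypothetical fruitful method name to check:
        DoesntWasteFruit("example_task", extra=[".pop"]),
        DoesntWasteBoxes("example_task", tolerance=2),
    ]

These goals would then be collected into a `Rubric` and evaluated via `Rubric.evaluate` (the `Rubric` constructor itself is not shown in this excerpt).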
A dictionary mapping goal types to 4-part descriptions that explain what each goal type means, for use in rubric tables that categorize goals by type.
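For example, a rubric renderer might unpack the four parts for a given goal type like this (an illustrative sketch; the variable names are arbitrary):

    rubric_title, rubric_blurb, feedback_title, feedback_blurb = (
        GOAL_TYPE_RUBRICS["procedure"]
    )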
The default blank result value that a goal acquires when first constructed or whenever it is reset.
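Based on `Goal.reset` (which deep-copies this value) and `Goal.table` (which notes that blank rows have status "unknown"), `BLANK_RESULT` is a result dictionary along the following lines; this is a hedged sketch, and the exact keys and wording in the module may differ:

    BLANK_RESULT = {
        "status": "unknown",  # per Goal.table's description of blank rows
        "explanation": "This goal has not been evaluated."  # assumed wording
    }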
Full human-oriented strings for each status string.
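`Goal.evaluate` uses this mapping when assembling combined multi-context explanations; the lookup pattern it uses falls back to the raw status string when no descriptor is defined:

    label = STATUS_DESCRIPTORS.get(status, status)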
151class Goal: 152 """ 153 A goal is a line-item on a rubric: something that a submission should 154 accomplish. When evaluated, it updates its 'result' to an evaluation 155 object that has a status (one of "unknown", "accomplished", 156 "partial", "failed", or "not applicable") and an explanation. It also 157 has a dictionary of strings that describe different tags it is labeled 158 with. 159 160 A Goal is able to produce a table of results (and possibly 161 sub-results) via its table method. 162 """ 163 USED_IDS = {} 164 """ 165 A dictionary of identifier values that have been used already, 166 organized into sub-dictionaries indexed by task IDs. 167 """ 168 169 def unique_id(taskid, category, identifier): 170 """ 171 A static method: given a task of interest, a category, and an 172 identifier, keeps track of identifiers provided and returns them 173 as-is, except when a duplicate is provided, in which case it 174 appends a -number suffix to the duplicate to make it unique and 175 returns that. The -number suffixes start at -2 for the second 176 copy; -1 is never used because the first copy doesn't get a 177 suffix added. 178 179 The result is prefixed with 'goal:<category>.' which can also 180 de-duplicate IDs without needing a suffix sometimes. 181 """ 182 task_ids = Goal.USED_IDS.setdefault(taskid, {}) 183 184 full_id = "goal:" + category + '.' + identifier 185 186 seen_before = task_ids.get(full_id, 0) 187 188 if seen_before == 0: # not seen before 189 task_ids[full_id] = 1 190 return full_id 191 else: # was seen before; would be a duplicate 192 task_ids[full_id] += 1 193 result = full_id + '-' + str(seen_before + 1) 194 return result 195 196 def __init__( 197 self, 198 taskid, 199 identifier, 200 description=("BLANK GOAL", "THIS GOAL HAS NOT BEEN DEFINED"), 201 test_in=None, 202 explanations=None, 203 tags=None 204 ): 205 """ 206 You must supply a task ID (a string), an identifier (a string) 207 and a description-tuple: the title of the goal, and a more 208 detailed explanation that will be available if the user requests 209 more information. The description-tuple may also include a third 210 and/or fourth entry: these will be used instead of the first and 211 second entry (respectively) when the Goal is being presented as 212 part of graded feedback instead of as part of a blank rubric. 213 This can be used to avoid showing exactly which tests are 214 performed when the rubric is constructed, but include that 215 information when feedback is given. 216 217 Note that if the identifier you supply is already in use within 218 the specified task, a numeric suffix will be appended to make it 219 unique. 220 221 You may also supply: 222 223 1. A 'test_in' dictionary which has the following keys: 224 - 'contexts': A list of Context objects within which 225 this goal should be independently tested. 226 - 'required': The amount of credit which this goal 227 needs to count as accomplished overall. For 228 contexts where it evaluates to "accomplished", one 229 unit of credit is earned, and contexts where it 230 evaluates to "partial" earn 1/2 unit of credit by 231 default. The default value is the number of 232 contexts supplied, implying conjunctive logic. Set 233 this to 1 for disjunctive logic (and set 'strict' 234 to True for pure disjunction). 235 - 'partial': If present, the amount of credit needed to 236 count as partially accomplished. If absent, will 237 default to 1/2 the 'required' value. Set to False 238 to prevent the goal from being marked as partially 239 accomplished. 
240 - 'count_partial_as': If present, should be a number 241 between 0 and 1, which specifies how much credit 242 to give for contexts where this goal evaluates as 243 partially-accomplished when comparing results 244 against required/partial thresholds. Default 0.5. 245 - 'strict': If present and truthy, overrides 'partial' 246 and 'count_partial_as' to set 'partial' to False 247 and 'count_partial_as' to 0. 248 - 'warn': If present and truthy, when a context builder 249 fails, the failure explanation will be generated as 250 a warning instead of as a note (the default). 251 - 'fail_without_context': If context creation fails, 252 the goal will be marked as failing in that context 253 without even being evaluated. True by default. 254 255 During evaluation, this goal will be independently 256 evaluated in each provided context, and the aggregate 257 results of those evaluations will be used to determine 258 the overall status of this goal. Note that context keys 259 provided by these Context objects override any context 260 keys that may be established and provided by a super-goal 261 during testing, and the super-goal context is not made 262 available to these Context objects as part of their 263 context construction process. 264 265 If no test_in dictionary is provided, the goal is simply 266 evaluated in a blank context (or in whatever context its 267 parent goal passed down). 268 269 2. An 'explanations' dictionary with some or all of the keys: 270 "accomplished", "partial", "failed", and/or "crash" 271 (typically used when a goal fails due to an exception). If 272 a relevant key exists in this dictionary, the value will 273 be used as the explanation for this goal if it has the 274 specified outcome. If the value is a function instead of a 275 string, it will be given the Goal object, which will 276 already include a partial 'result' object, and the 277 evaluation context, and the string that it returns will be 278 used as an explanation. 279 280 3. A 'tags' dictionary of strings that tag the goal. Some 281 tags affect how certain types of goals behave. 282 """ 283 if not isinstance(taskid, str): 284 raise TypeError("A Goal's task ID must be a string.") 285 286 self.taskid = taskid 287 288 if not isinstance(identifier, str): 289 raise TypeError("A Goal's identifier must be a string.") 290 291 if ( 292 not isinstance(description, (list, tuple)) 293 or not 2 <= len(description) <= 4 294 ): 295 raise ValueError( 296 ( 297 "The description for a goal must be a 2-to-4-element" 298 " list or tuple (got: {})." 299 ).format(repr(description)) 300 ) 301 self.description = description 302 self.test_in = test_in 303 # TODO: Figure this out. 304 #if ( 305 # not isinstance(self.test_in, (dict)) 306 # or any( 307 # not isinstance(x, contexts.Context) 308 # for x in self.test_in["contexts"] 309 # ) 310 #): 311 # raise ValueError( 312 # "Every item in the test_in 'contexts' slot must be a" 313 # + " Context, and test_in must be a dictionary." 314 # ) 315 self.explanations = explanations or {} 316 self.tags = tags or {} 317 318 # Get a unique ID for this goal 319 category = self.tags.get("category", "unknown") 320 self.identifier = Goal.unique_id(taskid, category, identifier) 321 322 # Initialize blank result: 323 self.reset() 324 325 def __copy__(self): 326 """ 327 `Goal`s may not be copied (there is no good way to do so since 328 they're entangled with both each other and with 329 `potluck.contexts.Context` objects.
330 """ 331 raise NotImplementedError("Goals may not be copied.") 332 333 def __deepcopy__(self, memo): 334 """ 335 `Goals` may not be copied (they are entangled with other goals 336 and with `potluck.contexts.Context` objects). 337 """ 338 raise NotImplementedError("Goals may not be copied.") 339 340 def description_topic(self): 341 """ 342 Gets the rubric version of this `Goal`'s topic. 343 """ 344 return self.description[0] 345 346 def description_details(self): 347 """ 348 Gets the rubric version of this `Goal`'s details. 349 """ 350 return self.description[1] 351 352 def feedback_topic(self): 353 """ 354 Gets the feedback version of this `Goal`'s topic, or just the 355 normal topic if there is no feedback version. 356 """ 357 return self.description[::2][-1] 358 359 def feedback_details(self): 360 """ 361 Gets the feedback version of this `Goal's details, or just the 362 normal details if there is no feedback version. 363 """ 364 return self.description[1::2][-1] 365 366 def get_goal_type(self): 367 """ 368 Inspects this `Goal`'s tags for a "goal_type" slot and returns 369 the associated value, or None if there is no such slot. 370 """ 371 return self.tags.get("goal_type", None) 372 373 def set_default_goal_type(self, default_type): 374 """ 375 Sets the given goal type for this goal (adds a tag), but only 376 does so if the goal doesn't already have a goal type tag. 377 """ 378 if "goal_type" not in self.tags: 379 self.tags["goal_type"] = default_type 380 381 def reset(self): 382 """ 383 Resets internal state so that the goal can be evaluated again. 384 Does not affect internal state of any sub-goals, and does not 385 affect cached context values. 386 """ 387 self.result = copy.deepcopy(BLANK_RESULT) 388 389 def reset_network(self): 390 """ 391 Resets our internal state and the states of any sub-goals, but 392 does not affect context caches. 393 """ 394 self.reset() 395 for goal in self.subgoals(): 396 goal.reset_network() 397 398 def full_reset(self): 399 """ 400 Does a full reset, including a full reset of subgoals plus 401 burning of context caches. 402 """ 403 self.reset() 404 for goal in self.subgoals(): 405 goal.full_reset() 406 if self.test_in: 407 for ctx in self.test_in["contexts"]: 408 ctx.burn_cache() 409 410 def subgoals(self): 411 """ 412 Returns a list of `Goal` objects that are considered subgoals of 413 this goal. Different `Goal` classes have different relationships 414 to their subgoals, but this method allows other code to discover 415 the full tree of goals regardless of those relationships. `Goal` 416 classes without subgoals can safely inherit this method, which 417 returns an empty list. 418 """ 419 # A base Goal has no subgoals. 420 return [] 421 422 def evaluate(self, base_context=None): 423 """ 424 Evaluates this goal independently within each of its contexts, 425 and produces an overall evaluation that combines explanations 426 from each context. If there are no contexts, simply evaluates the 427 goal normally. 428 429 A base context is normally required, as otherwise the goal won't 430 have access to the submitted code or even basic info about the 431 task being evaluated. 432 433 Keeps track of the set of all distinct explanations generated, 434 and if there was only a single shared explanation across all 435 contexts, it uses that as the final explanation, but if there 436 were multiple different explanations, creates a combined 437 explanation with sections for the different contexts that had 438 different explanations. 
439 440 Note: During this process, if the goal ever evaluates to 441 "unknown" in one of the contexts, the end result will be 442 "unknown" overall regardless of results from other contexts. 443 444 Note: If one of the contexts cannot be created, the goal will 445 count as failed in that context, and a note will be attached to 446 the result. If a context builder function generates an error 447 other than a `potluck.context_utils.ContextCreationError`, a 448 warning is generated, but in other cases the 'warn' 449 setting determines whether a warning or a note is generated. 450 """ 451 if not self.test_in or len(self.test_in["contexts"]) == 0: 452 # No contexts listed: simply evaluate in base context 453 try: 454 # this will set self.result 455 self.evaluate_in_context(base_context) 456 except Exception: 457 self.result = { 458 "status": "failed", 459 "warnings": [], 460 "notes": [ 461 # generic context creation failure is usually not 462 # warning-worthy. TODO: Sometimes it is! 463 "Context creation failed" 464 + " unexpectedly:<br>\n{}".format( 465 html_tools.html_traceback( 466 title='Error:', 467 linkable=context_utils.linkmap(base_context) 468 ) 469 ) 470 ] 471 } 472 return self.result 473 474 else: # has specific contexts to test in 475 credit = 0 476 full_count = 0 477 partial_count = 0 478 notes = [] 479 warnings = [] 480 # mapping from explanation strings to lists of status, 481 # context pairs: 482 explanations = {} 483 for i, builder in enumerate(self.test_in["contexts"]): 484 # Construct context: 485 this_context = copy.copy(base_context) 486 # Note: can't deep-copy things like modules 487 488 # Set goal_id and which_context value to provide enough 489 # information in the context dictionary to uniquely 490 # identify a specific context-building operation. 491 this_context["goal_id"] = self.identifier 492 this_context["which_context"] = i 493 494 add_failures_to = notes 495 if self.test_in.get("warn"): 496 add_failures_to = warnings 497 498 err = None 499 try: 500 this_context.update(builder.create(this_context)) 501 except context_utils.ContextCreationError as e: 502 err = e.explanation() 503 add_failures_to.append(e.explanation()) 504 except Exception: 505 err = html_tools.html_traceback( 506 title="Unexpected Error:", 507 linkable=context_utils.linkmap(this_context) 508 ) 509 notes.append( 510 "Context creation failed unexpectedly:<br>\n" 511 + err 512 ) 513 514 # reset this and subgoals, but don't disturb Context caches: 515 self.reset_network() 516 517 # evaluate ourselves: 518 if ( 519 self.test_in.get("fail_without_context", True) 520 and err is not None 521 ): 522 res = { 523 "status": "failed", 524 "explanation": ( 525 "Failed to establish testing context:<br>\n{}" 526 ).format(err) 527 } 528 else: 529 res = self.evaluate_in_context(this_context) 530 531 if res["status"] == "accomplished": 532 credit += 1 533 full_count += 1 534 elif res["status"] == "partial": 535 credit += self.test_in.get("count_partial_as", 0.5) 536 partial_count += 1 537 elif res["status"] == "unknown": 538 # Immediately abandon evaluation across contexts: 539 return { 540 "status": "unknown", 541 "explanation": ( 542 "Unable to evaluate in context:<br>\n{}" 543 ).format(builder.html_topic(in_feedback=True)) 544 # TODO: Does this need to be html_context_tree 545 # for disambiguation?
546 } 547 548 # record explanation & status: 549 expl = res.get("explanation", "") 550 if expl not in explanations: 551 explanations[expl] = [] 552 explanations[expl].append((res["status"], builder, res)) 553 554 # copy notes and warnings 555 if "notes" in res: 556 notes.extend(res["notes"]) 557 if "warnings" in res: 558 warnings.extend(res["warnings"]) 559 560 # Compute credit required/partial 561 required = self.test_in.get( 562 "required", 563 len(self.test_in["contexts"]) 564 ) 565 partial = self.test_in.get("partial", required / 2) 566 567 # Compute status 568 # TODO: Should credit-logic be made visible since it's not 569 # always consistent?!? 570 status = "failed" 571 if credit >= required: 572 status = "accomplished" 573 elif partial is not False and credit >= partial: 574 status = "partial" 575 576 self.result = { 577 "status": status, 578 "notes": notes, 579 "warnings": warnings 580 } 581 582 # Combine explanations: 583 if len(explanations) == 0: 584 # TODO: Should we be bypassing set_explanation here? 585 self.result["explanation"] = "THIS SHOULDN'T BE POSSIBLE!" 586 elif len(explanations) == 1: 587 # Single explanation: don't bother worrying about 588 # multiple contexts and statuses: 589 # TODO: This logic is bad or hides stuff? 590 self.result["explanation"] = list(explanations.keys())[0] 591 # In this case pick up extra keys from the result... 592 competing = list(explanations.values())[0] 593 if len(competing) == 1: 594 sole_result = competing[0][2] 595 for k in sole_result: 596 if k not in self.result: 597 self.result[k] = sole_result[k] 598 else: 599 # Multiple explanations: mix & group by statuses/contexts 600 # TODO: What to do about multiple potentially 601 # contradictory custom result keys? 602 603 # Group by status: 604 by_status = {} 605 for expl in explanations: 606 for status, builder, result in explanations[expl]: 607 if status not in by_status: 608 by_status[status] = [] 609 by_status[status].append((expl, builder)) 610 611 # Order statuses: 612 status_order = ["accomplished", "partial", "failed"] 613 for status in by_status: 614 if status not in status_order: 615 status_order.append(status) 616 617 # Build parts of explanation: 618 expl_parts = [] 619 for status in status_order: 620 if status not in by_status: 621 continue 622 expls_and_builders = by_status[status] 623 n_ctx = len(expls_and_builders) 624 if n_ctx == 0: 625 raise ValueError("Shouldn't have zero explanations!") 626 elif n_ctx == 1: 627 in_ctx = "in one context" 628 else: 629 in_ctx = "in {} contexts".format(n_ctx) 630 631 this_expl = html_tools.build_html_details( 632 '{} {}:'.format( 633 STATUS_DESCRIPTORS.get(status, status) 634 .capitalize(), 635 in_ctx 636 ), 637 '<ul class="contextual_explanations">{}</ul>'.format( 638 '\n'.join( 639 ( 640 '<li>In context {}\n' 641 + '<div class="expl_in_context {}">\n' 642 + '{}\n{}\n' 643 + '</div>\n' 644 + '</li>' 645 ).format( 646 builder.html_topic(in_feedback=True), 647 # TODO: Does this need to be 648 # html_context_tree for 649 # disambiguation? 
650 html_tools.status_css_class(status), 651 html_tools.build_status_indicator(status), 652 expl 653 ) 654 for expl, builder in expls_and_builders 655 ) 656 ) 657 ) 658 expl_parts.append((status, this_expl)) 659 660 # Combine parts into one explanation: 661 rstatus = self.result["status"] 662 rsdesc = STATUS_DESCRIPTORS.get(rstatus, rstatus) 663 if rstatus == "accomplished": 664 if full_count >= required: 665 self.result["explanation"] = "{} (in {})".format( 666 rsdesc, 667 phrasing.obj_num(full_count, "context") 668 ) 669 else: 670 self.result["explanation"] = ( 671 "{} (in {} and partially accomplished in {})" 672 ).format( 673 rsdesc, 674 phrasing.obj_num(full_count, "context"), 675 phrasing.obj_num(partial_count, "context") 676 ) 677 else: 678 if full_count > 0: 679 if partial_count > 0: 680 self.result["explanation"] = ( 681 "{} (accomplished in {};" 682 " partially accomplished in {})" 683 ).format( 684 rsdesc.capitalize(), 685 phrasing.obj_num(full_count, "context"), 686 phrasing.obj_num(partial_count, "context") 687 ) 688 else: 689 self.result["explanation"] = ( 690 "{} (accomplished in {})" 691 ).format( 692 rsdesc.capitalize(), 693 phrasing.obj_num(full_count, "context") 694 ) 695 else: 696 if partial_count > 0: 697 self.result["explanation"] = ( 698 "{} (partially accomplished in {})" 699 ).format( 700 rsdesc.capitalize(), 701 phrasing.obj_num(partial_count, "context") 702 ) 703 else: 704 self.result["explanation"] = ( 705 "{} (not accomplished in any contexts)" 706 ).format(rsdesc.capitalize()) 707 708 # Append parts describing success/failure in different 709 # contexts: 710 self.result["explanation"] += "<br>\n".join( 711 '<div class="expl_part {}">{}</div>'.format( 712 html_tools.status_css_class(status), 713 part 714 ) 715 for status, part in expl_parts 716 ) 717 718 # Return our result: 719 return self.result 720 721 def evaluate_in_context(self, context=None): 722 """ 723 The evaluate_in_context method of a Goal subclass should update 724 its 'result' value and return that new value. The result value 725 must be a dictionary with keys 'status' and 'explanation', where 726 the 'status' is one of the strings "unknown", "accomplished", 727 "partial", "failed", or "not applicable", and the 'explanation' 728 value is a (possibly-HTML) string. The result dictionary may also 729 optionally include a list of notes and/or a list of warnings, 730 which are HTML strings. 731 732 The evaluate_in_context method does not need to worry about a 733 goal's test_in value or the associated Context objects: 734 evaluate takes care of constructing context dictionaries which 735 are given to evaluate, so evaluate should just evaluate this goal 736 within the given context. Typical context keys are explained in 737 the documentation for the `potluck.contexts.Context` class. 738 """ 739 raise NotImplementedError("Cannot evaluate base Goal object!") 740 741 def table(self, blank=False): 742 """ 743 Creates a table report for this goal. The table includes a list 744 of rows, where each row contains a result dictionary, with a 745 "description" key including this goal's description and an 746 optional extra "subtable" key containing a sub-table of 747 additional results. The "notes" and "warnings" entries will 748 always be lists, and will be empty if there were no such keys 749 (or their values were explicitly None). The following keys are 750 canonical: 751 752 - 'id': This goal's unique ID (see `Goal.unique_id`). 
May be 753 absent on some rows representing groups of goals rather than 754 individual goals. 755 - 'description': A pair of strings describing this goal. 756 - 'tags': A dictionary of the tags for this goal. 757 - 'status': The goal's status. 758 - 'explanation': An explanation for the goal's success or 759 failure. 760 - 'notes': A list of strings describing additional feedback for 761 this goal. 762 - 'warnings': A list of strings describing any warnings that 763 arose during the evaluation of this goal. 764 - 'subtable': A list of table rows from sub-goals. 765 766 If "blank" is given as True, the BLANK_RESULT will be used as the 767 basis instead of this goal's current result, so there will be no 768 notes or warnings, and the status will be "unknown." 769 """ 770 if blank: 771 row = copy.deepcopy(BLANK_RESULT) 772 else: 773 row = copy.deepcopy(self.result) 774 row["notes"] = self.result.get("notes") or [] 775 row["warnings"] = self.result.get("warnings") or [] 776 row["id"] = self.identifier 777 row["description"] = list(self.description[:]) 778 row["tags"] = copy.copy(self.tags) 779 row["subtable"] = [] 780 return [ row ] 781 782 def set_explanation( 783 self, 784 context, 785 status=None, 786 default="", 787 specific_context=True 788 ): 789 """ 790 Implements the explanations logic, where if self.explanations 791 contains an appropriate key, the string or function value for 792 that key is used to provide an explanation, and otherwise the 793 given default explanation is used. If no status string is given, 794 self.result.status is used as the key. 795 796 For cross-context final evaluation, this function is not used, 797 and explanation-overrides are ignored. 798 TODO: Really that? 799 800 The resulting explanation string is inserted into self.result 801 under the "explanation" key, in addition to being returned. 802 """ 803 status = status or self.result["status"] 804 expl = self.explanations.get(status, default) 805 if isinstance(expl, type(lambda x: x)): 806 expl = expl(self, context) 807 808 self.result["explanation"] = expl 809 return expl
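To make the `evaluate_in_context` contract concrete, here is a minimal hypothetical subclass (the identifier, description, and the "output" context slot are invented for illustration; they are not part of the module):

    class OutputIsNonEmpty(Goal):
        def __init__(self, taskid, **kwargs):
            super().__init__(
                taskid,
                "output_nonempty",  # hypothetical identifier
                (
                    "Your program prints something",
                    "Running your program must produce at least some output."
                ),
                **kwargs
            )

        def evaluate_in_context(self, context=None):
            context = context or {}
            # Set self.result first so set_explanation can fill in the
            # "explanation" key (honoring any explanation overrides):
            if context.get("output", "").strip():
                self.result = {"status": "accomplished"}
                self.set_explanation(context, default="Your code printed output.")
            else:
                self.result = {"status": "failed"}
                self.set_explanation(context, default="Your code printed nothing.")
            return self.result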
A goal is a line-item on a rubric: something that a submission should accomplish. When evaluated, it updates its 'result' to an evaluation object that has a status (one of "unknown", "accomplished", "partial", "failed", or "not applicable") and an explanation. It also has a dictionary of strings that describe different tags it is labeled with.
A Goal is able to produce a table of results (and possibly sub-results) via its table method.
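In use, a goal is evaluated against a base context and then reported via its table method; a rough sketch (the `goal` and `base_context` values here are placeholders):

    result = goal.evaluate(base_context)
    # result is a dictionary like {"status": "accomplished", "explanation": "..."}
    for row in goal.table():
        print(row["id"], row["status"], row["description"][0])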
196 def __init__( 197 self, 198 taskid, 199 identifier, 200 description=("BLANK GOAL", "THIS GOAL HAS NOT BEEN DEFINED"), 201 test_in=None, 202 explanations=None, 203 tags=None 204 ): 205 """ 206 You must supply a task ID (a string), an identifier (a string) 207 and a description-tuple: the title of the goal, and a more 208 detailed explanation that will be available if the user requests 209 more information. The description-tuple may also include a third 210 and/or fourth entry: these will be used instead of the first and 211 second entry (respectively) when the Goal is being presented as 212 part of graded feedback instead of as part of a blank rubric. 213 This can be used to avoid showing exactly which tests are 214 performed when the rubric is constructed, but include that 215 information when feedback is given. 216 217 Note that if the identifier you supply is already in use within 218 the specified task, a numeric suffix will be appended to make it 219 unique. 220 221 You may also supply: 222 223 1. A 'test_in' dictionary which has the following keys: 224 - 'contexts': A list of Context objects within which 225 this goal should be independently tested. 226 - 'required': The amount of credit which this goal 227 needs to count as accomplished overall. For 228 contexts where it evaluates to "accomplished", one 229 unit of credit is earned, and contexts where it 230 evaluates to "partial" earn 1/2 unit of credit by 231 default. The default value is the number of 232 contexts supplied, implying conjunctive logic. Set 233 this to 1 for disjunctive logic (and set 'strict' 234 to True for pure disjunction). 235 - 'partial': If present, the amount of credit needed to 236 count as partially accomplished. If absent, will 237 default to 1/2 the 'required' value. Set to False 238 to prevent the goal from being marked as partially 239 accomplished. 240 - 'count_partial_as': If present, should be a number 241 between 0 and 1, which specifies how much credit 242 to give for contexts where this goal evaluates as 243 partially-accomplished when comparing results 244 against required/partial thresholds. Default 0.5. 245 - 'strict': If present and truthy, overrides 'partial' 246 and 'count_partial_as' to set 'partial' to False 247 and 'count_partial_as' to 0. 248 - 'warn': If present and truthy, when a context builder 249 fails, the failure explanation will be generated as 250 a warning instead of as a note (the default). 251 - 'fail_without_context': If context creation fails, 252 the goal will be marked as failing in that context 253 without even being evaluated. True by default. 254 255 During evaluation, this goal will be independently 256 evaluated in each provided context, and the aggregate 257 results of those evaluations will be used to determine 258 the overall status of this goal. Note that context keys 259 provided by these Context objects override any context 260 keys that may be established and provided by a super-goal 261 during testing, and the super-goal context is not made 262 available to these Context objects as part of their 263 context construction process. 264 265 If no test_in dictionary is provided, the goal is simply 266 evaluated in a blank context (or in whatever context its 267 parent goal passed down). 268 269 2. An 'explanations' dictionary with some or all of the keys: 270 "accomplished", "partial", "failed", and/or "crash" 271 (typically used when a goal fails due to an exception).
If 272 a relevant key exists in this dictionary, the value will 273 be used as the explanation for this goal if it has the 274 specified outcome. If the value is a function instead of a 275 string, it will be given the Goal object, which will 276 already include a partial 'result' object, and the 277 evaluation context, and the string that it returns will be 278 used as an explanation. 279 280 3. A 'tags' dictionary of strings that tag the goal. Some 281 tags affect how certain types of goals behave. 282 """ 283 if not isinstance(taskid, str): 284 raise TypeError("A Goal's task ID must be a string.") 285 286 self.taskid = taskid 287 288 if not isinstance(identifier, str): 289 raise TypeError("A Goal's identifier must be a string.") 290 291 if ( 292 not isinstance(description, (list, tuple)) 293 or not 2 <= len(description) <= 4 294 ): 295 raise ValueError( 296 ( 297 "The description for a goal must be a 2-to-4-element" 298 " list or tuple (got: {})." 299 ).format(repr(description)) 300 ) 301 self.description = description 302 self.test_in = test_in 303 # TODO: Figure this out. 304 #if ( 305 # not isinstance(self.test_in, (dict)) 306 # or any( 307 # not isinstance(x, contexts.Context) 308 # for x in self.test_in["contexts"] 309 # ) 310 #): 311 # raise ValueError( 312 # "Every item in the test_in 'contexts' slot must be a" 313 # + " Context, and test_in must be a dictionary." 314 # ) 315 self.explanations = explanations or {} 316 self.tags = tags or {} 317 318 # Get a unique ID for this goal 319 category = self.tags.get("category", "unknown") 320 self.identifier = Goal.unique_id(taskid, category, identifier) 321 322 # Initialize blank result: 323 self.reset()
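To make the credit thresholds concrete, here is a hedged sketch of a 'test_in' configuration. `SomeGoal` is a hypothetical `Goal` subclass (the base class cannot be evaluated), and `ctx1`, `ctx2`, and `ctx3` stand in for already-constructed `potluck.contexts.Context` objects:

    # Hypothetical subclass and contexts, for illustration only:
    goal = SomeGoal(
        taskid="task1",
        identifier="check_output",
        description=(
            "Your program prints the correct output",
            "We run your program and compare what it prints against"
            " the output of the solution program."
        ),
        test_in={
            # Evaluate this goal separately in three contexts:
            "contexts": [ctx1, ctx2, ctx3],
            # If 'required' were omitted it would default to 3 (all
            # contexts must succeed); requiring 2 units of credit means
            # two full successes, or one success plus two partials, etc.
            "required": 2,
            # One unit of credit (e.g., a single fully-accomplished
            # context) earns partial credit overall:
            "partial": 1,
        }
    )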
`USED_IDS` is a dictionary of identifier values that have been used already, organized into sub-dictionaries indexed by task ID.
    def unique_id(taskid, category, identifier):
        """
        A static method: given a task of interest, a category, and an
        identifier, keeps track of identifiers provided and returns them
        as-is, except when a duplicate is provided, in which case it
        appends a -number suffix to the duplicate to make it unique and
        returns that. The -number suffixes start at -2 for the second
        copy; -1 is never used because the first copy doesn't get a
        suffix added.

        The result is prefixed with 'goal:<category>.' which can also
        de-duplicate IDs without needing a suffix sometimes.
        """
        task_ids = Goal.USED_IDS.setdefault(taskid, {})

        full_id = "goal:" + category + '.' + identifier

        seen_before = task_ids.get(full_id, 0)

        if seen_before == 0:  # not seen before
            task_ids[full_id] = 1
            return full_id
        else:  # was seen before; would be a duplicate
            task_ids[full_id] += 1
            result = full_id + '-' + str(seen_before + 1)
            return result
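For example (a small sketch assuming a fresh `Goal.USED_IDS` table), repeated identifiers within the same task pick up numeric suffixes, while the category prefix alone can keep IDs distinct:

    Goal.unique_id("task1", "procedure", "check_loop")  # 'goal:procedure.check_loop'
    Goal.unique_id("task1", "procedure", "check_loop")  # 'goal:procedure.check_loop-2'
    Goal.unique_id("task1", "procedure", "check_loop")  # 'goal:procedure.check_loop-3'
    # The category is part of the full ID, so the same identifier under
    # a different category needs no suffix:
    Goal.unique_id("task1", "style", "check_loop")      # 'goal:style.check_loop'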
    def description_topic(self):
        """
        Gets the rubric version of this `Goal`'s topic.
        """
        return self.description[0]
    def description_details(self):
        """
        Gets the rubric version of this `Goal`'s details.
        """
        return self.description[1]
    def feedback_topic(self):
        """
        Gets the feedback version of this `Goal`'s topic, or just the
        normal topic if there is no feedback version.
        """
        return self.description[::2][-1]
    def feedback_details(self):
        """
        Gets the feedback version of this `Goal`'s details, or just the
        normal details if there is no feedback version.
        """
        return self.description[1::2][-1]
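The slicing in `feedback_topic` and `feedback_details` is what implements the fallback behavior: a 2-tuple description reuses the rubric strings, a 4-tuple uses the feedback strings, and a 3-tuple overrides only the topic. A quick sketch:

    d2 = ("Topic", "Details")
    d3 = ("Topic", "Details", "Feedback topic")
    d4 = ("Topic", "Details", "Feedback topic", "Feedback details")

    d2[::2][-1], d2[1::2][-1]  # ('Topic', 'Details')
    d3[::2][-1], d3[1::2][-1]  # ('Feedback topic', 'Details')
    d4[::2][-1], d4[1::2][-1]  # ('Feedback topic', 'Feedback details')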
    def get_goal_type(self):
        """
        Inspects this `Goal`'s tags for a "goal_type" slot and returns
        the associated value, or None if there is no such slot.
        """
        return self.tags.get("goal_type", None)
    def set_default_goal_type(self, default_type):
        """
        Sets the given goal type for this goal (adds a tag), but only
        does so if the goal doesn't already have a goal type tag.
        """
        if "goal_type" not in self.tags:
            self.tags["goal_type"] = default_type
    def reset(self):
        """
        Resets internal state so that the goal can be evaluated again.
        Does not affect internal state of any sub-goals, and does not
        affect cached context values.
        """
        self.result = copy.deepcopy(BLANK_RESULT)
    def reset_network(self):
        """
        Resets our internal state and the states of any sub-goals, but
        does not affect context caches.
        """
        self.reset()
        for goal in self.subgoals():
            goal.reset_network()
    def full_reset(self):
        """
        Does a full reset, including a full reset of subgoals plus
        burning of context caches.
        """
        self.reset()
        for goal in self.subgoals():
            goal.full_reset()
        if self.test_in:
            for ctx in self.test_in["contexts"]:
                ctx.burn_cache()
    def subgoals(self):
        """
        Returns a list of `Goal` objects that are considered subgoals of
        this goal. Different `Goal` classes have different relationships
        to their subgoals, but this method allows other code to discover
        the full tree of goals regardless of those relationships. `Goal`
        classes without subgoals can safely inherit this method, which
        returns an empty list.
        """
        # A base Goal has no subgoals.
        return []
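A compound goal type would override this method. For instance, a hypothetical `AllOf` goal holding explicit children (this is just a sketch of the contract, not one of potluck's real compound goal types) might look like:

    class AllOf(Goal):
        """
        A hypothetical compound goal whose subgoals are explicit
        children.
        """
        def __init__(self, taskid, identifier, children, **kwargs):
            super().__init__(taskid, identifier, **kwargs)
            self.children = children

        def subgoals(self):
            # Expose children so that reset_network, full_reset, and
            # rubric crawling can find them:
            return self.children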
    def evaluate(self, base_context=None):
        """
        Evaluates this goal independently within each of its contexts,
        and produces an overall evaluation that combines explanations
        from each context. If there are no contexts, simply evaluates the
        goal normally.

        A base context is normally required, as otherwise the goal won't
        have access to the submitted code or even basic info about the
        task being evaluated.

        Keeps track of the set of all distinct explanations generated,
        and if there was only a single shared explanation across all
        contexts, it uses that as the final explanation, but if there
        were multiple different explanations, creates a combined
        explanation with sections for the different contexts that had
        different explanations.

        Note: During this process, if the goal ever evaluates to "not
        evaluated" in one of the contexts, the end result will be "not
        evaluated" overall regardless of results from other contexts.

        Note: If one of the contexts cannot be created, the goal will
        count as failed in that context, and a note will be attached to
        the result. If a context builder function generates an error
        other than a `potluck.context_utils.ContextCreationError`, a
        warning is generated; otherwise, the 'warn' setting determines
        whether a warning or a note is generated.
        """
        if not self.test_in or len(self.test_in["contexts"]) == 0:
            # No contexts listed: simply evaluate in base context
            try:
                # this will set self.result
                self.evaluate_in_context(base_context)
            except Exception:
                self.result = {
                    "status": "failed",
                    "warnings": [],
                    "notes": [
                        # generic context creation failure is usually not
                        # warning-worthy. TODO: Sometimes it is!
                        "Context creation failed"
                        + " unexpectedly:<br>\n{}".format(
                            html_tools.html_traceback(
                                title='Error:',
                                linkable=context_utils.linkmap(base_context)
                            )
                        )
                    ]
                }
            return self.result

        else:  # has specific contexts to test in
            credit = 0
            full_count = 0
            partial_count = 0
            notes = []
            warnings = []
            # mapping from explanation strings to lists of status,
            # context pairs:
            explanations = {}
            for i, builder in enumerate(self.test_in["contexts"]):
                # Construct context:
                this_context = copy.copy(base_context)
                # Note: can't deep-copy things like modules

                # Set goal_id and which_context value to provide enough
                # information in the context dictionary to uniquely
                # identify a specific context-building operation.
                this_context["goal_id"] = self.identifier
                this_context["which_context"] = i

                add_failures_to = notes
                if self.test_in.get("warn"):
                    add_failures_to = warnings

                err = None
                try:
                    this_context.update(builder.create(this_context))
                except context_utils.ContextCreationError as e:
                    err = e.explanation()
                    add_failures_to.append(e.explanation())
                except Exception:
                    err = html_tools.html_traceback(
                        title="Unexpected Error:",
                        linkable=context_utils.linkmap(this_context)
                    )
                    notes.append(
                        "Context creation failed unexpectedly:<br>\n"
                        + err
                    )

                # reset this and subgoals, but don't disturb Context caches:
                self.reset_network()

                # evaluate ourselves:
                if (
                    self.test_in.get("fail_without_context", True)
                    and err is not None
                ):
                    res = {
                        "status": "failed",
                        "explanation": (
                            "Failed to establish testing context:<br>\n{}"
                        ).format(err)
                    }
                else:
                    res = self.evaluate_in_context(this_context)

                if res["status"] == "accomplished":
                    credit += 1
                    full_count += 1
                elif res["status"] == "partial":
                    credit += self.test_in.get("count_partial_as", 0.5)
                    partial_count += 1
                elif res["status"] == "unknown":
                    # Immediately abandon evaluation across contexts:
                    return {
                        "status": "unknown",
                        "explanation": (
                            "Unable to evaluate in context:<br>\n{}"
                        ).format(builder.html_topic(in_feedback=True))
                        # TODO: Does this need to be html_context_tree
                        # for disambiguation?
                    }

                # record explanation & status:
                expl = res.get("explanation", "")
                if expl not in explanations:
                    explanations[expl] = []
                explanations[expl].append((res["status"], builder, res))

                # copy notes and warnings
                if "notes" in res:
                    notes.extend(res["notes"])
                if "warnings" in res:
                    warnings.extend(res["warnings"])

            # Compute credit required/partial
            required = self.test_in.get(
                "required",
                len(self.test_in["contexts"])
            )
            partial = self.test_in.get("partial", required / 2)

            # Compute status
            # TODO: Should credit-logic be made visible since it's not
            # always consistent?!?
            status = "failed"
            if credit >= required:
                status = "accomplished"
            elif partial is not False and credit >= partial:
                status = "partial"

            self.result = {
                "status": status,
                "notes": notes,
                "warnings": warnings
            }

            # Combine explanations:
            if len(explanations) == 0:
                # TODO: Should we be bypassing set_explanation here?
                self.result["explanation"] = "THIS SHOULDN'T BE POSSIBLE!"
            elif len(explanations) == 1:
                # Single explanation: don't bother worrying about
                # multiple contexts and statuses:
                # TODO: This logic is bad or hides stuff?
                self.result["explanation"] = list(explanations.keys())[0]
                # In this case pick up extra keys from the result...
                competing = list(explanations.values())[0]
                if len(competing) == 1:
                    sole_result = competing[0][2]
                    for k in sole_result:
                        if k not in self.result:
                            self.result[k] = sole_result[k]
            else:
                # Multiple explanations: mix & group by statuses/contexts
                # TODO: What to do about multiple potentially
                # contradictory custom result keys?

                # Group by status:
                by_status = {}
                for expl in explanations:
                    for status, builder, result in explanations[expl]:
                        if status not in by_status:
                            by_status[status] = []
                        by_status[status].append((expl, builder))

                # Order statuses:
                status_order = ["accomplished", "partial", "failed"]
                for status in by_status:
                    if status not in status_order:
                        status_order.append(status)

                # Build parts of explanation:
                expl_parts = []
                for status in status_order:
                    if status not in by_status:
                        continue
                    expls_and_builders = by_status[status]
                    n_ctx = len(expls_and_builders)
                    if n_ctx == 0:
                        raise ValueError("Shouldn't have zero explanations!")
                    elif n_ctx == 1:
                        in_ctx = "in one context"
                    else:
                        in_ctx = "in {} contexts".format(n_ctx)

                    this_expl = html_tools.build_html_details(
                        '{} {}:'.format(
                            STATUS_DESCRIPTORS.get(status, status)
                            .capitalize(),
                            in_ctx
                        ),
                        '<ul class="contextual_explanations">{}</ul>'.format(
                            '\n'.join(
                                (
                                    '<li>In context {}\n'
                                    + '<div class="expl_in_context {}">\n'
                                    + '{}\n{}\n'
                                    + '</div>\n'
                                    + '</li>'
                                ).format(
                                    builder.html_topic(in_feedback=True),
                                    # TODO: Does this need to be
                                    # html_context_tree for
                                    # disambiguation?
                                    html_tools.status_css_class(status),
                                    html_tools.build_status_indicator(status),
                                    expl
                                )
                                for expl, builder in expls_and_builders
                            )
                        )
                    )
                    expl_parts.append((status, this_expl))

                # Combine parts into one explanation:
                rstatus = self.result["status"]
                rsdesc = STATUS_DESCRIPTORS.get(rstatus, rstatus)
                if rstatus == "accomplished":
                    if full_count >= required:
                        self.result["explanation"] = "{} (in {})".format(
                            rsdesc,
                            phrasing.obj_num(full_count, "context")
                        )
                    else:
                        self.result["explanation"] = (
                            "{} (in {} and partially accomplished in {})"
                        ).format(
                            rsdesc,
                            phrasing.obj_num(full_count, "context"),
                            phrasing.obj_num(partial_count, "context")
                        )
                else:
                    if full_count > 0:
                        if partial_count > 0:
                            self.result["explanation"] = (
                                "{} (accomplished in {};"
                                " partially accomplished in {})"
                            ).format(
                                rsdesc.capitalize(),
                                phrasing.obj_num(full_count, "context"),
                                phrasing.obj_num(partial_count, "context")
                            )
                        else:
                            self.result["explanation"] = (
                                "{} (accomplished in {})"
                            ).format(
                                rsdesc.capitalize(),
                                phrasing.obj_num(full_count, "context")
                            )
                    else:
                        if partial_count > 0:
                            self.result["explanation"] = (
                                "{} (partially accomplished in {})"
                            ).format(
                                rsdesc.capitalize(),
                                phrasing.obj_num(partial_count, "context")
                            )
                        else:
                            self.result["explanation"] = (
                                "{} (not accomplished in any contexts)"
                            ).format(rsdesc.capitalize())

                # Append parts describing success/failure in different
                # contexts:
                self.result["explanation"] += "<br>\n".join(
                    '<div class="expl_part {}">{}</div>'.format(
                        html_tools.status_css_class(status),
                        part
                    )
                    for status, part in expl_parts
                )

        # Return our result:
        return self.result
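To trace the aggregation arithmetic: suppose a goal lists four contexts and is accomplished in two of them, partially accomplished in one, and fails in one. A sketch of the resulting bookkeeping, using the defaults described in `__init__`:

    # Four contexts: accomplished in 2, partial in 1, failed in 1.
    credit = 2 * 1 + 1 * 0.5  # -> 2.5 (partials count as 0.5 by default)
    required = 4              # default: the number of contexts
    partial = required / 2    # default: 2.0

    status = "failed"
    if credit >= required:
        status = "accomplished"
    elif partial is not False and credit >= partial:
        status = "partial"

    print(status)             # -> "partial"

With "required": 1 in the test_in dictionary (disjunctive logic), the same 2.5 units of credit would clear the threshold and the goal would count as accomplished; adding "strict": True would additionally zero out the credit earned from partially-accomplished contexts.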
    def evaluate_in_context(self, context=None):
        """
        The evaluate_in_context method of a Goal subclass should update
        its 'result' value and return that new value. The result value
        must be a dictionary with keys 'status' and 'explanation', where
        the 'status' is one of the strings "unknown", "accomplished",
        "partial", "failed", or "not applicable", and the 'explanation'
        value is a (possibly-HTML) string. The result dictionary may also
        optionally include a list of notes and/or a list of warnings,
        which are HTML strings.

        The evaluate_in_context method does not need to worry about a
        goal's test_in value or the associated Context objects:
        `evaluate` takes care of constructing the context dictionaries
        that are passed to this method, so it should just evaluate this
        goal within the given context. Typical context keys are explained
        in the documentation for the `potluck.contexts.Context` class.
        """
        raise NotImplementedError("Cannot evaluate base Goal object!")
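As a sketch of this contract (a toy, not one of potluck's real goal types), a minimal subclass might look like the following; the context slot name "value" is a placeholder for whichever slots a real goal would read:

    class ValueIsPositive(Goal):
        """
        A toy Goal subclass that checks a single context slot.
        """
        def evaluate_in_context(self, context=None):
            context = context or {}
            if "value" not in context:
                self.result = {
                    "status": "unknown",
                    "explanation": "No value was established to check."
                }
            elif context["value"] > 0:
                self.result = {
                    "status": "accomplished",
                    "explanation": "The value was positive."
                }
            else:
                self.result = {
                    "status": "failed",
                    "explanation": "The value was not positive."
                }
            # Allow explanations-dictionary overrides, then return:
            self.set_explanation(
                context,
                default=self.result["explanation"]
            )
            return self.result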
    def table(self, blank=False):
        """
        Creates a table report for this goal. The table includes a list
        of rows, where each row contains a result dictionary, with a
        "description" key including this goal's description and an
        optional extra "subtable" key containing a sub-table of
        additional results. The "notes" and "warnings" entries will
        always be lists, and will be empty if there were no such keys
        (or if their values were explicitly None). The following keys
        are canonical:

        - 'id': This goal's unique ID (see `Goal.unique_id`). May be
            absent on some rows representing groups of goals rather than
            individual goals.
        - 'description': A pair of strings describing this goal.
        - 'tags': A dictionary of the tags for this goal.
        - 'status': The goal's status.
        - 'explanation': An explanation for the goal's success or
            failure.
        - 'notes': A list of strings describing additional feedback for
            this goal.
        - 'warnings': A list of strings describing any warnings that
            arose during the evaluation of this goal.
        - 'subtable': A list of table rows from sub-goals.

        If "blank" is given as True, the BLANK_RESULT will be used as the
        basis instead of this goal's current result, so there will be no
        notes or warnings, and the status will be "unknown."
        """
        if blank:
            row = copy.deepcopy(BLANK_RESULT)
        else:
            row = copy.deepcopy(self.result)
        row["notes"] = self.result.get("notes") or []
        row["warnings"] = self.result.get("warnings") or []
        row["id"] = self.identifier
        row["description"] = list(self.description[:])
        row["tags"] = copy.copy(self.tags)
        row["subtable"] = []
        return [ row ]
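Because each row may carry a 'subtable' of sub-goal rows, consumers of a report typically recurse. A small sketch of flattening a report table into (depth, id, status) triples:

    def flatten_table(rows, depth=0):
        """
        Yields a (depth, id, status) triple for each row and, recursively,
        for each of its sub-rows. Rows representing groups of goals may
        lack an 'id', so we use .get for safety.
        """
        for row in rows:
            yield (depth, row.get("id"), row.get("status"))
            for item in flatten_table(row.get("subtable", []), depth + 1):
                yield item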
    def set_explanation(
        self,
        context,
        status=None,
        default="",
        specific_context=True
    ):
        """
        Implements the explanations logic: if self.explanations
        contains an appropriate key, the string or function value for
        that key is used to provide an explanation; otherwise the
        given default explanation is used. If no status string is given,
        self.result["status"] is used as the key.

        For cross-context final evaluation, this function is not used,
        and explanation-overrides are ignored.
        TODO: Is that really right?

        The resulting explanation string is inserted into self.result
        under the "explanation" key, in addition to being returned.
        """
        status = status or self.result["status"]
        expl = self.explanations.get(status, default)
        if isinstance(expl, type(lambda x: x)):
            # A function value is called with this goal and the current
            # context to produce the explanation string:
            expl = expl(self, context)

        self.result["explanation"] = expl
        return expl
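For instance (a sketch reusing the hypothetical `SomeGoal` subclass from earlier), an explanations dictionary can mix static strings with functions of the goal and the evaluation context:

    goal = SomeGoal(
        taskid="task1",
        identifier="check_output",
        description=(
            "Correct output",
            "Your program prints what we expect."
        ),
        explanations={
            # Used verbatim when the goal fails:
            "failed": "The printed output didn't match; check your"
            " spacing.",
            # Called with the goal and the evaluation context:
            "accomplished": lambda goal, context: (
                "Output matched in context #{}.".format(
                    context.get("which_context", 0)
                )
            ),
        }
    )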
class Rubric:
    """
    A rubric has a list of goals, and a method for determining overall
    performance based on the evaluation of each individual goal. It may
    also have a separate list of validation goals to be tested during the
    validation step (e.g., goals requiring that a certain number of tests
    be defined; see `potluck.validation`).
    """
    def __init__(
        self,
        evaluation_goals,
        performance_metric,
        validation_goals=None,
        spec_file=None
    ):
        """
        Sets up the rubric with a list of goals to be evaluated, and a
        performance metric function that accepts a list of evaluated
        goals and returns a performance report object.

        A filename for the specification the rubric was loaded from may
        be provided, in which case certain tracebacks within output may
        be rewritten to abbreviate that filename.
        """
        self.evaluation_goals = evaluation_goals
        self.validation_goals = validation_goals or []
        self.metric = performance_metric
        self.spec_file = spec_file

    def all_contexts(self, goals):
        """
        Crawls the provided list of goals and their subgoals to find all
        relevant `potluck.contexts.Context` objects that might possibly
        be used by evaluation tests in this rubric. Returns a list in
        breadth-first traversal order of this rubric's goals, their
        contexts, and those contexts' dependencies.
        """
        # Map of object IDs
        idmap = {}

        queue = goals[:]
        while queue:
            # pop first
            first = queue.pop(0)

            # Process a Goal object (queue subgoals and contexts)
            if isinstance(first, Goal):
                queue.extend(first.subgoals())

                # Add associated contexts to our queue
                if first.test_in:
                    queue.extend(first.test_in.get("contexts", []))

            # Process a Context object (accumulate and queue dependencies)
            elif isinstance(first, contexts.Context):
                queue.extend(first.depends)

                # Add novel contexts to our idmap
                if id(first) not in idmap:
                    idmap[id(first)] = first
                    queue.extend(first.depends)

        result = list(idmap.values())

        return result

    # TODO: HERE
    def create_contexts_list(self, goals, base_context=None):
        """
        Returns a list of context summary dictionaries describing all of
        the contexts used by goals in the given goals list. It has the
        same format as returned by
        `potluck.contexts.list_and_render_contexts`.

        A base context object is necessary to generate context values;
        if no base context is given then context slots will not include
        values and will use their redacted topics and details.
        """
        clist = self.all_contexts(goals)
        if self.spec_file:
            html_tools.set_tb_rewrite(
                self.spec_file,
                "<task specification>"
            )

        # Ensure that duplicate topics are distinguished
        contexts.add_context_numbering(clist)

        cgraph = contexts.build_context_graph(clist)

        if len(clist) == 0:
            return []

        return contexts.list_and_render_contexts(cgraph, base_context)

    def create_blank_report(self, task_info):
        """
        Creates a blank report for this rubric that simply shows what the
        goals and contexts are. This function will erase any existing
        results associated with rubric goals.

        It uses False as the in_feedback value, so included context
        descriptions will be obfuscated.

        The returned report is a dictionary with the following keys:

        - taskid: The task ID (from the given taskspec)
        - evaluation: The string 'unknown'
        - warnings: An empty list
        - summary: A description of the task that this rubric belongs to.
        - table: A table (in the format returned by `Goal.table`)
            detailing each goal and subgoal.
        - contexts: A list of context summary dictionaries in the format
            returned by `potluck.contexts.list_and_render_contexts`,
            which summarizes all contexts used by this rubric.
        """
        # Empty report:
        report = {
            "taskid": task_info["id"],
            "evaluation": "unknown",
            "warnings": [],
            "summary": f"Rubric for {task_info['id']}.",
            "table": [],
            "contexts": self.create_contexts_list(self.evaluation_goals)
        }

        # Reset our goals:
        for g in self.evaluation_goals:
            g.reset_network()

        # Just in case set up a rewrite rule for the spec file
        if self.spec_file:
            html_tools.set_tb_rewrite(
                self.spec_file,
                "<task specification>"
            )

        # Run metric over un-evaluated goals and ask for a blank result:
        metric_result = self.metric(self.evaluation_goals, blank=True)

        # Integrate results into our report:
        report["evaluation"] = metric_result["evaluation"]
        report["summary"] = metric_result["summary"]
        report["table"] = metric_result["table"]
        report["warnings"].extend(metric_result["warnings"])

        return report

    def create_blank_validation_report(self, task_info):
        """
        Creates a blank validation report for this rubric that simply
        shows what the validation goals and contexts are. Just like
        `Rubric.create_blank_report`, this function will erase any
        existing results associated with validation rubric goals.

        It uses False as the in_feedback value, so included context
        descriptions will be obfuscated.

        The result has the same keys as `Rubric.create_blank_report`
        does.
        """
        # Empty report:
        report = {
            "taskid": task_info["id"],
            "evaluation": "unknown",
            "warnings": [],
            "summary": f"Validation rubric for {task_info['id']}.",
            "table": [],
            "contexts": self.create_contexts_list(self.validation_goals)
        }

        # Reset our goals:
        for g in self.validation_goals:
            g.reset_network()

        # Just in case set up a rewrite rule for the spec file
        if self.spec_file:
            html_tools.set_tb_rewrite(
                self.spec_file,
                "<task specification>"
            )

        # Run metric over un-evaluated goals and ask for a blank result:
        metric_result = self.metric(self.validation_goals, blank=True)

        # Integrate results into our report:
        report["evaluation"] = metric_result["evaluation"]
        report["summary"] = metric_result["summary"]
        report["table"] = metric_result["table"]
        report["warnings"].extend(metric_result["warnings"])

        return report

    def evaluate(self, task_info, username, submission_target):
        """
        Evaluates this rubric based on the given submitted task (the
        task_info includes generic info about the task, the username
        identifies who submitted it, and the submission_target
        identifies the file or folder to be evaluated).

        See `tasks.json` for the task info format (it's a dictionary
        stored in the "tasks" slot under its taskid as a key).

        Returns a report object that has information about which goal(s)
        from the rubric passed or failed, and the overall performance as
        determined by the rubric's metric.

        If submitted code cannot be loaded due to a syntax error or
        parsing fails for some other reason, the report will mention
        that in as much detail as it can, and the normal rubric items
        will be skipped.

        Note: This function completely resets all evaluation goals and
        clears the caches of any associated `potluck.contexts.Context`
        objects before it starts evaluating goals.

        The returned report dictionary has the following keys:

        - taskid: The task ID (from the given taskspec)
        - evaluation: A string summarizing the performance on the entire
            task (from the metric function).
        - summary: An HTML string summarizing performance on the task
            (from the metric function).
        - files: A list of dictionaries with 'filename' and 'code' slots
            containing the file names and raw code text of the submitted
            file(s).
        - warnings: A list of warnings (from the metric function plus a
            few custom warnings if things are seriously wrong).
        - table: A table (in the format returned by `Goal.table`)
            detailing each goal and subgoal (from the metric function).
        - contexts: A list of context summary dictionaries in the format
            returned by `potluck.contexts.list_and_render_contexts`,
            which summarizes all contexts used by this rubric (see
            `Rubric.create_contexts_list`).
        - TODO: Add a partner_username field here?
        """
        # Empty report:
        report = {
            "taskid": task_info["id"],
            "evaluation": "unknown",
            "summary": "No summary has been generated.",
            "files": [],
            "warnings": [],
            "table": [],
            "contexts": []
        }

        # Set up a rewrite rule for the spec file
        if self.spec_file:
            html_tools.set_tb_rewrite(
                self.spec_file,
                "<task specification>"
            )

        # Check for a missing submission:
        if not os.path.exists(submission_target):
            report["warnings"] = [
                "You did not submit any code for this task."
            ]
            report["evaluation"] = "incomplete"
            report["summary"] = "You did not submit any code for this task."
            # Early return: no need to grade rubric items
            return report

        # Check for accidental submission of the starter file:
        if os.path.isfile(submission_target):
            with open(submission_target, 'r', encoding="utf-8") as fin:
                submitted_code = fin.read()
            if submitted_code == task_info["specification"].starter_src:
                report["warnings"] = [
                    "You submitted the starter file without any"
                    " changes (you probably submitted the wrong file?)."
                ]
                report["evaluation"] = "incomplete"
                report["summary"] = (
                    "You submitted an unchanged starter file."
                )

        # Reset each goal + any associated contexts:
        for g in self.evaluation_goals:
            g.full_reset()

        # Ensure context descriptions are unique:
        clist = self.all_contexts(self.evaluation_goals)
        contexts.add_context_numbering(clist)

        # Create our base context:
        if os.path.isdir(submission_target):
            submission_root = submission_target
            default_file = task_info["target"]
            actual_file = default_file
        else:
            submission_root, actual_file = os.path.split(submission_target)
            default_file = task_info["target"]
        base_context = {
            "task_info": task_info,
            "username": username,
            "submission_root": submission_root,
            "default_file": default_file,
            "actual_file": actual_file
        }

        if len(self.evaluation_goals) == 0:
            raise ValueError("Rubric does not have any goals!")

        # Evaluate each goal:
        for g in self.evaluation_goals:
            logging.debug_msg(
                "Evaluating goal '{}' @ {}...".format(
                    g.feedback_topic(),
                    id(g)
                )
            )
            # Task is automatically made available as part of context.
            result = g.evaluate(base_context)
            logging.debug_msg("...result is: {}".format(result))
            logging.debug_msg("...review result is: {}".format(g.result))

            # Double-check that the goal correctly stored the value it
            # returned
            if result != g.result:
                logging.debug_msg(
                    f"WARNING: Goal's returned result differs from"
                    f" stored result!\nGoal:"
                    f" '{g.feedback_topic()}'\nReturned:"
                    f" {result}\nStored: {g.result}"
                )

        # Run our metric over the evaluated goals:
        metric_result = self.metric(self.evaluation_goals)

        # Integrate results into our report:
        report["evaluation"] = metric_result["evaluation"]
        report["summary"] = metric_result["summary"]
        report["table"] = metric_result["table"]
        report["warnings"].extend(metric_result["warnings"])

        # Build our contexts list now that contexts should be caching the
        # same values used during testing:
        report["contexts"] = self.create_contexts_list(
            self.evaluation_goals,
            base_context
        )

        # Elevate warnings from contexts to the main warnings list.
        for creport in report["contexts"]:
            report["warnings"].extend(creport.get("warnings", []))

        # Build our files dictionary based on FileContext objects. It
        # maps file names to dictionaries with "path" slots (and possibly
        # more if we can dig up more info).
        all_filenames = {
            base_context["default_file"]: {
                "path": os.path.abspath(
                    os.path.join(
                        base_context["submission_root"],
                        base_context["actual_file"]
                    )
                )
            }
        }
        for ctx in clist:
            if isinstance(ctx, contexts.FileContext):
                if ctx.target_file is not None:
                    ctx_result = ctx.create(base_context)
                    name = ctx_result.get("filename", ctx.target_file)
                    path = ctx_result.get("file_path", name)
                    if name not in all_filenames:
                        all_filenames[name] = { "path": path }

        # Look for code contexts which have handled parsing on target
        # files, and add "source" and possibly "original_source" slots
        for ctx in clist:
            if isinstance(ctx, contexts.CodeContext):
                ctx_result = ctx.create(base_context)
                if "filename" in ctx_result:
                    name = ctx_result["filename"]
                    original = ctx_result["original_source"]
                    fixed = ctx_result["source"]
                    all_filenames[name]["source"] = fixed
                    if original != fixed:
                        all_filenames[name]["original_source"] = original
                # Otherwise there was some kind of error we assume

        # Grab file contents if we haven't already
        for filename in all_filenames:
            file_info = all_filenames[filename]
            entry = {
                "filename": filename,
                "path": file_info["path"]
            }
            report["files"].append(entry)
            if "source" in file_info:
                entry["code"] = file_info["source"]
            else:
                with open(entry["path"], 'r', encoding="utf-8") as fin:
                    if entry["path"].endswith(".py"):
                        entry["code"] = fin.read()
                    else:
                        entry["raw"] = fin.read()

            if "original_source" in file_info:
                entry["original_code"] = file_info["original_source"]

        return report

    def validate(self, task_info, username, tests_target, target):
        """
        Validates tests for this task based on the given submitted tests
        file and submission file (the task_info includes generic info
        about the task, the username identifies who submitted it, the
        tests_target identifies the file or folder to be evaluated, and
        the target identifies the base task file or folder to run tests
        against).

        See `tasks.json` for the task info format (it's a dictionary
        stored in the "tasks" slot under its taskid as a key).

        Returns a report object that has information about which
        validation goal(s) from the rubric passed or failed, and the
        overall performance as determined by the rubric's metric.

        If submitted tests cannot be loaded due to a syntax error or
        parsing fails for some other reason, the report will mention
        that in as much detail as it can, and the normal rubric items
        will be skipped.

        Note: This function completely resets all validation goals and
        clears the caches of any associated `potluck.contexts.Context`
        objects before it starts evaluating goals.

        TODO: We mostly effectively ignore the `target` argument because
        we grab the solution (see `contexts.TestsFileContext`). Get rid
        of it?

        The returned report dictionary has the same keys/values as the
        result from `Rubric.evaluate`.
        """
        # Empty report:
        report = {
            "taskid": task_info["id"],
            "evaluation": "unknown",
            "summary": "No summary has been generated.",
            "files": [],
            "warnings": [],
            "table": [],
            "contexts": []
        }

        # Set up a rewrite rule for the spec file
        if self.spec_file:
            html_tools.set_tb_rewrite(
                self.spec_file,
                "<task specification>"
            )

        # Check for a missing submission:
        if not os.path.exists(tests_target):
            report["warnings"] = [
                "You did not submit any tests for this task."
            ]
            report["evaluation"] = "incomplete"
            report["summary"] = "You did not submit any tests for this task."
            # Early return: no need to grade rubric items
            return report

        # Check for missing code to test:
        if not os.path.exists(target):
            report["warnings"] = [
                "We did not find the code to test."
            ]
            report["evaluation"] = "incomplete"
            report["summary"] = "We did not find the code to test."
            # Early return: no need to grade rubric items
            return report

        # Reset each goal + any associated contexts:
        for g in self.validation_goals:
            g.full_reset()

        # Ensure context descriptions are unique:
        clist = self.all_contexts(self.validation_goals)
        contexts.add_context_numbering(clist)

        # Figure out whether tests target is a directory or file
        if os.path.isdir(tests_target):
            submission_root = tests_target
            default_file = task_info.get(
                "tests_target",
                "test_" + task_info["target"]
            )
            actual_file = default_file
        else:
            submission_root, actual_file = os.path.split(tests_target)
            default_file = task_info.get(
                "tests_target",
                "test_" + task_info["target"]
            )

        # Figure out whether submission target is a directory or file
        if os.path.isdir(target):
            target_root = target
            target_default_file = task_info["target"]
            target_actual_file = target_default_file
        else:
            target_root, target_actual_file = os.path.split(target)
            target_default_file = task_info["target"]

        # Create our base context:
        base_context = {
            "task_info": task_info,
            "username": username,
            "submission_root": target_root,
            "default_file": target_default_file,
            "actual_file": target_actual_file,
            "tests_submission_root": submission_root,
            "default_tests_file": default_file,
            "actual_tests_file": actual_file
        }

        if len(self.validation_goals) == 0:
            raise ValueError("Rubric does not have any validation goals!")

        # Evaluate each goal:
        for g in self.validation_goals:
            logging.debug_msg(
                "Evaluating validation goal '{}' @ {}...".format(
                    g.feedback_topic(),
                    id(g)
                )
            )
            # Task is automatically made available as part of context.
            result = g.evaluate(base_context)
            logging.debug_msg("...result is: {}".format(result))
            logging.debug_msg("...review result is: {}".format(g.result))

            # Double-check that the goal correctly stored the value it
            # returned
            if result != g.result:
                logging.debug_msg(
                    f"WARNING: Validation goal's returned result differs"
                    f" from stored result!\nGoal:"
                    f" '{g.feedback_topic()}'\nReturned:"
                    f" {result}\nStored: {g.result}"
                )

        # Run our metric over the evaluated goals:
        # TODO: Allow/require separate validation metrics?
        metric_result = self.metric(self.validation_goals)

        # Integrate results into our report:
        report["evaluation"] = metric_result["evaluation"]
        report["summary"] = metric_result["summary"]
        report["table"] = metric_result["table"]
        report["warnings"].extend(metric_result["warnings"])

        # Build our contexts list now that contexts should be caching the
        # same values used during testing:
        report["contexts"] = self.create_contexts_list(
            self.validation_goals,
            base_context
        )

        # Elevate warnings from contexts to the main warnings list.
        for creport in report["contexts"]:
            report["warnings"].extend(creport.get("warnings", []))

        # Build our files dictionary based on TestsFileContext objects.
        # It maps file names to dictionaries with "path" slots (and
        # possibly more if we can dig up more info).
        all_filenames = {
            base_context["default_tests_file"]: {
                "path": os.path.abspath(
                    os.path.join(
                        base_context["submission_root"],
                        base_context["actual_tests_file"]
                    )
                )
            }
        }
        for ctx in clist:
            if isinstance(ctx, contexts.TestsFileContext):
                if ctx.target_tests_file is not None:
                    ctx_result = ctx.create(base_context)
                    name = ctx_result.get(
                        "tests_filename",
                        ctx.target_tests_file
                    )
                    path = ctx_result.get("tests_file_path", name)
                    if name not in all_filenames:
                        all_filenames[name] = { "path": path }

        # Look for code contexts which have handled parsing on target
        # files, and add "source" and possibly "original_source" slots
        for ctx in clist:
            if isinstance(ctx, contexts.CodeContext):
                ctx_result = ctx.create(base_context)
                if "tests_filename" in ctx_result:
                    name = ctx_result["tests_filename"]
                    original = ctx_result["original_tests_source"]
                    fixed = ctx_result["tests_source"]
                    all_filenames[name]["source"] = fixed
                    if original != fixed:
                        all_filenames[name]["original_source"] = original
                # Otherwise there was some kind of error we assume

        # Grab file contents if we haven't already
        for filename in all_filenames:
            file_info = all_filenames[filename]
            entry = {
                "filename": filename,
                "path": file_info["path"]
            }
            report["files"].append(entry)
            if "source" in file_info:
                entry["code"] = file_info["source"]
            else:
                with open(entry["path"], 'r', encoding="utf-8") as fin:
                    if entry["path"].endswith(".py"):
                        entry["code"] = fin.read()
                    else:
                        entry["raw"] = fin.read()

            if "original_source" in file_info:
                entry["original_code"] = file_info["original_source"]

        return report

    def goals_by_id(self, fragment):
        """
        Retrieves one or more of the goals from this rubric according to
        their identifiers. Note that it's possible for multiple goals to
        share the same identifier (only when rendered into HTML do they
        get suffixes to make them unique), so this function always
        returns a list of goals, which is likely to be length-1. Of
        course, an empty list is returned if no goals have the given ID.
        Any goal whose identifier contains the provided string will be
        included in the goals returned, although '^^^' will be added to
        the front of each identifier and '$$$' to the end when checking,
        so you can use those in your fragment; neither sequence normally
        appears inside of non-custom identifiers.
        """
        # TODO: Prefix these for evaluation/validation?
        return [
            g
            for g in self.evaluation_goals + self.validation_goals
            if fragment in ('^^^' + g.identifier + '$$$')
        ]
A rubric has a list of goals, and a method for determining overall
performance based on the evaluation of each individual goal. It may
also have a separate list of validation goals to be tested during the
validation step (e.g., goals that a certain number of tests should be
defined; see potluck.validation
).
824 def __init__( 825 self, 826 evaluation_goals, 827 performance_metric, 828 validation_goals=None, 829 spec_file=None 830 ): 831 """ 832 Sets up the rubric with a list of goals to be evaluated, and a 833 performance metric function that accepts a list of evaluated 834 goals and returns a performance report object. 835 836 A filename for the specification the rubric was loaded from may 837 be provided, in which case certain tracebacks within output may 838 be rewritten to abbreviate that filename. 839 """ 840 self.evaluation_goals = evaluation_goals 841 self.validation_goals = validation_goals or [] 842 self.metric = performance_metric 843 self.spec_file = spec_file
Sets up the rubric with a list of goals to be evaluated, and a performance metric function that accepts a list of evaluated goals and returns a performance report object.
A filename for the specification the rubric was loaded from may be provided, in which case certain tracebacks within output may be rewritten to abbreviate that filename.
845 def all_contexts(self, goals): 846 """ 847 Crawls the provided list of goals and their subgoals to find all 848 relevant `potluck.contexts.Context` objects that might possibly 849 be used by evaluation tests in this rubric. Returns a list in 850 breadth-first traversal order of this rubric's goals, their 851 contexts, and those contexts' dependencies. 852 """ 853 # Map of object IDs 854 idmap = {} 855 856 queue = goals[:] 857 while queue: 858 # pop first 859 first = queue.pop(0) 860 861 # Process a Goal object (queue subgoals and contexts) 862 if isinstance(first, Goal): 863 queue.extend(first.subgoals()) 864 865 # Add associated contexts to our queue 866 if first.test_in: 867 queue.extend(first.test_in.get("contexts", [])) 868 869 # Process a Context object (accumulate and queue dependencies) 870 elif isinstance(first, contexts.Context): 871 queue.extend(first.depends) 872 873 # Add novel contexts to our idmap 874 if id(first) not in idmap: 875 idmap[id(first)] = first 876 queue.extend(first.depends) 877 878 result = list(idmap.values()) 879 880 return result
Crawls the provided list of goals and their subgoals to find all
relevant potluck.contexts.Context
objects that might possibly
be used by evaluation tests in this rubric. Returns a list in
breadth-first traversal order of this rubric's goals, their
contexts, and those contexts' dependencies.
883 def create_contexts_list(self, goals, base_context=None): 884 """ 885 Returns a list of context summary dictionaries describing all of 886 the contexts used by goals in the given goals list. It has the 887 same format as returned by 888 `potluck.contexts.list_and_render_contexts`. 889 890 A base context object is necessary to generate context values; 891 if no base context is given then context slots will not include 892 values and will use their redacted topics and details. 893 """ 894 clist = self.all_contexts(goals) 895 if self.spec_file: 896 html_tools.set_tb_rewrite( 897 self.spec_file, 898 "<task specification>" 899 ) 900 901 # Ensure that duplicate topics are distinguished 902 contexts.add_context_numbering(clist) 903 904 cgraph = contexts.build_context_graph(clist) 905 906 if len(clist) == 0: 907 return [] 908 909 return contexts.list_and_render_contexts(cgraph, base_context)
Returns a list of context summary dictionaries describing all of
the contexts used by goals in the given goals list. It has the
same format as returned by
potluck.contexts.list_and_render_contexts
.
A base context object is necessary to generate context values; if no base context is given then context slots will not include values and will use their redacted topics and details.
911 def create_blank_report(self, task_info): 912 """ 913 Creates a blank report for this rubric that simply shows what the 914 goals and contexts are. This function will erase any existing 915 results associated with rubric goals. 916 917 It uses False as the in_feedback value, so included context 918 descriptions will be obfuscated. 919 920 The returned report is a dictionary with the following keys: 921 922 - taskid: The task ID (from the given taskspec) 923 - evaluation: The string 'unknown' 924 - warnings: An empty list 925 - summary: A description of the task that this rubric belongs to. 926 - table: A table (in the format returned by `Goal.table`) detailing 927 each goal and subgoal. 928 - contexts: A list of context summary dictionaries in the format 929 returned by `potluck.contexts.list_and_render_contexts`, 930 which summarizes all contexts used by this rubric. 931 """ 932 # Empty report: 933 report = { 934 "taskid": task_info["id"], 935 "evaluation": "unknown", 936 "warnings": [], 937 "summary": f"Rubric for {task_info['id']}.", 938 "table": [], 939 "contexts": self.create_contexts_list(self.evaluation_goals) 940 } 941 942 # Reset our goals: 943 for g in self.evaluation_goals: 944 g.reset_network() 945 946 # Just in case set up a rewrite rule for the spec file 947 if self.spec_file: 948 html_tools.set_tb_rewrite( 949 self.spec_file, 950 "<task specification>" 951 ) 952 953 # Run metric over un-evaluated goals and ask for a blank result: 954 metric_result = self.metric(self.evaluation_goals, blank=True) 955 956 # Integrate results into our report: 957 report["evaluation"] = metric_result["evaluation"] 958 report["summary"] = metric_result["summary"] 959 report["table"] = metric_result["table"] 960 report["warnings"].extend(metric_result["warnings"]) 961 962 return report
Creates a blank report for this rubric that simply shows what the goals and contexts are. This function will erase any existing results associated with rubric goals.
It uses False as the in_feedback value, so included context descriptions will be obfuscated.
The returned report is a dictionary with the following keys:
- taskid: The task ID (from the given taskspec)
- evaluation: The string 'unknown'
- warnings: An empty list
- summary: A description of the task that this rubric belongs to.
- table: A table (in the format returned by
Goal.table
) detailing each goal and subgoal. - contexts: A list of context summary dictionaries in the format
returned by
potluck.contexts.list_and_render_contexts
, which summarizes all contexts used by this rubric.
964 def create_blank_validation_report(self, task_info): 965 """ 966 Creates a blank validation report for this rubric that simply 967 shows what the validation goals and contexts are. Just like 968 `Rubric.create_blank_report`, this function will erase any 969 existing results associated with validation rubric goals. 970 971 It uses False as the in_feedback value, so included context 972 descriptions will be obfuscated. 973 974 The result has the same keys as `Rubric.create_blank_report` 975 does. 976 """ 977 # Empty report: 978 report = { 979 "taskid": task_info["id"], 980 "evaluation": "unknown", 981 "warnings": [], 982 "summary": f"Validation rubric for {task_info['id']}.", 983 "table": [], 984 "contexts": self.create_contexts_list(self.validation_goals) 985 } 986 987 # Reset our goals: 988 for g in self.validation_goals: 989 g.reset_network() 990 991 # Just in case set up a rewrite rule for the spec file 992 if self.spec_file: 993 html_tools.set_tb_rewrite( 994 self.spec_file, 995 "<task specification>" 996 ) 997 998 # Run metric over un-evaluated goals and ask for a blank result: 999 metric_result = self.metric(self.validation_goals, blank=True) 1000 1001 # Integrate results into our report: 1002 report["evaluation"] = metric_result["evaluation"] 1003 report["summary"] = metric_result["summary"] 1004 report["table"] = metric_result["table"] 1005 report["warnings"].extend(metric_result["warnings"]) 1006 1007 return report
Creates a blank validation report for this rubric that simply shows what the validation goals and contexts are. Just like `Rubric.create_blank_report`, this function will erase any existing results associated with validation rubric goals.

It uses False as the in_feedback value, so included context descriptions will be obfuscated.

The result has the same keys as `Rubric.create_blank_report` does.
1009 def evaluate(self, task_info, username, submission_target): 1010 """ 1011 Evaluates this rubric based on the given submitted task (the 1012 task_info includes generic info about the task, the username 1013 identifies who submitted it, and the submission_target 1014 identifies the file or folder to be evaluated). 1015 1016 See `tasks.json` for the task info format (it's a dictionary 1017 stored in the "tasks" slot under its taskid as a key). 1018 1019 Returns a report object that has information about which goal(s) 1020 from the rubric passed or failed, and the overall performance as 1021 determined by the rubric's metric. 1022 1023 If submitted code cannot be loaded due to a syntax error or 1024 parsing fails for some other reason, the report will mention 1025 that in as much detail as it can, and the normal rubric items 1026 will be skipped. 1027 1028 Note: This function completely resets all evaluation goals and 1029 clears the caches of any associated `potluck.contexts.Context` 1030 objects before it starts evaluating goals. 1031 1032 The returned report dictionary has the following keys: 1033 1034 - taskid: The task ID (from the given taskspec) 1035 - evaluation: A string summarizing the performance on the entire 1036 task (from the metric function). 1037 - summary: An HTML string summarizing performance on the task 1038 (from the metric function). 1039 - files: A list of dictionaries with 'filename' and 'code' slots 1040 containing the file names and raw code text of the submitted 1041 file(s). 1042 - warnings: A list of warnings (from the metric function plus a 1043 few custom warnings if things are seriously wrong). 1044 - table: A table (in the format returned by `Goal.table`) detailing 1045 each goal and subgoal (from the metric function). 1046 - contexts: A list of context summary dictionaries in the format 1047 returned by `potluck.contexts.list_and_render_contexts`, 1048 which summarizes all contexts used by this rubric (see 1049 `Rubric.create_contexts_list`). 1050 - TODO: Add a partner_username field here? 1051 """ 1052 # Empty report: 1053 report = { 1054 "taskid": task_info["id"], 1055 "evaluation": "unknown", 1056 "summary": "No summary has been generated.", 1057 "files": [], 1058 "warnings": [], 1059 "table": [], 1060 "contexts": [] 1061 } 1062 1063 # Set up a rewrite rule for the spec file 1064 if self.spec_file: 1065 html_tools.set_tb_rewrite( 1066 self.spec_file, 1067 "<task specification>" 1068 ) 1069 1070 # Check for a missing submission: 1071 if not os.path.exists(submission_target): 1072 report["warnings"] = [ 1073 "You did not submit any code for this task." 1074 ] 1075 report["evaluation"] = "incomplete" 1076 report["summary"] = "You did not submit any code for this task." 1077 # Early return: no need to grade rubric items 1078 return report 1079 1080 # Check for accidental submission of the starter file: 1081 if os.path.isfile(submission_target): 1082 with open(submission_target, 'r', encoding="utf-8") as fin: 1083 submitted_code = fin.read() 1084 if submitted_code == task_info["specification"].starter_src: 1085 report["warnings"] = [ 1086 "You submitted the starter file without any" 1087 " changes (you probably submitted the wrong file?)." 1088 ] 1089 report["evaluation"] = "incomplete" 1090 report["summary"] = ( 1091 "You submitted an unchanged starter file." 
1092 ) 1093 1094 # Reset each goal + any associated contexts: 1095 for g in self.evaluation_goals: 1096 g.full_reset() 1097 1098 # Ensure context descriptions are unique: 1099 clist = self.all_contexts(self.evaluation_goals) 1100 contexts.add_context_numbering(clist) 1101 1102 # Create our base context: 1103 if os.path.isdir(submission_target): 1104 submission_root = submission_target 1105 default_file = task_info["target"] 1106 actual_file = default_file 1107 else: 1108 submission_root, actual_file = os.path.split(submission_target) 1109 default_file = task_info["target"] 1110 base_context = { 1111 "task_info": task_info, 1112 "username": username, 1113 "submission_root": submission_root, 1114 "default_file": default_file, 1115 "actual_file": actual_file 1116 } 1117 1118 if len(self.evaluation_goals) == 0: 1119 raise ValueError("Rubric does not have any goals!") 1120 1121 # Evaluate each goal: 1122 for g in self.evaluation_goals: 1123 logging.debug_msg( 1124 "Evaluating goal '{}' @ {}...".format( 1125 g.feedback_topic(), 1126 id(g) 1127 ) 1128 ) 1129 # Task is automatically made available as part of context. 1130 result = g.evaluate(base_context) 1131 logging.debug_msg("...result is: {}".format(result)) 1132 logging.debug_msg("...review result is: {}".format(g.result)) 1133 1134 # Double-check that the goal correctly stored the value it 1135 # returned 1136 if result != g.result: 1137 logging.debug_msg( 1138 f"WARNING: Goal's returned result differs from" 1139 f" stored result!\nGoal:" 1140 f" '{g.feedback_topic()}'\nReturned:" 1141 f" {result}\nStored: {g.result}" 1142 ) 1143 1144 # Run our metric over the evaluated goals: 1145 metric_result = self.metric(self.evaluation_goals) 1146 1147 # Integrate results into our report: 1148 report["evaluation"] = metric_result["evaluation"] 1149 report["summary"] = metric_result["summary"] 1150 report["table"] = metric_result["table"] 1151 report["warnings"].extend(metric_result["warnings"]) 1152 1153 # Build our contexts list now that contexts should be caching the 1154 # same values used during testing: 1155 report["contexts"] = self.create_contexts_list( 1156 self.evaluation_goals, 1157 base_context 1158 ) 1159 1160 # Elevate warnings from contexts to the main warnings list. 1161 for creport in report["contexts"]: 1162 report["warnings"].extend(creport.get("warnings", [])) 1163 1164 # Build our files dictionary based on FileContext objects. It 1165 # maps file names to dictionaries with "path" slots (and possibly 1166 # more if we can dig up more info). 
1167 all_filenames = { 1168 base_context["default_file"]: { 1169 "path": os.path.abspath( 1170 os.path.join( 1171 base_context["submission_root"], 1172 base_context["actual_file"] 1173 ) 1174 ) 1175 } 1176 } 1177 for ctx in clist: 1178 if isinstance(ctx, contexts.FileContext): 1179 if ctx.target_file is not None: 1180 ctx_result = ctx.create(base_context) 1181 name = ctx_result.get("filename", ctx.target_file) 1182 path = ctx_result.get("file_path", name) 1183 if name not in all_filenames: 1184 all_filenames[name] = { "path": path } 1185 1186 # Look for code contexts which have handled parsing on target 1187 # files, and add "source" and possibly "original_source" slots 1188 for ctx in clist: 1189 if isinstance(ctx, contexts.CodeContext): 1190 ctx_result = ctx.create(base_context) 1191 if "filename" in ctx_result: 1192 name = ctx_result["filename"] 1193 original = ctx_result["original_source"] 1194 fixed = ctx_result["source"] 1195 all_filenames[name]["source"] = fixed 1196 if original != fixed: 1197 all_filenames[name]["original_source"] = original 1198 # Otherwise there was some kind of error we assume 1199 1200 # Grab file contents if we haven't already 1201 for filename in all_filenames: 1202 file_info = all_filenames[filename] 1203 entry = { 1204 "filename": filename, 1205 "path": file_info["path"] 1206 } 1207 report["files"].append(entry) 1208 if "source" in file_info: 1209 entry["code"] = file_info["source"] 1210 else: 1211 with open(entry["path"], 'r', encoding="utf-8") as fin: 1212 if entry["path"].endswith(".py"): 1213 entry["code"] = fin.read() 1214 else: 1215 entry["raw"] = fin.read() 1216 1217 if "original_source" in file_info: 1218 entry["original_code"] = file_info["original_source"] 1219 1220 return report
Evaluates this rubric based on the given submitted task (the task_info includes generic info about the task, the username identifies who submitted it, and the submission_target identifies the file or folder to be evaluated).
See `tasks.json` for the task info format (it's a dictionary stored in the "tasks" slot under its taskid as a key).
Returns a report object that has information about which goal(s) from the rubric passed or failed, and the overall performance as determined by the rubric's metric.
If submitted code cannot be loaded due to a syntax error or parsing fails for some other reason, the report will mention that in as much detail as it can, and the normal rubric items will be skipped.
Note: This function completely resets all evaluation goals and clears the caches of any associated `potluck.contexts.Context` objects before it starts evaluating goals.
The returned report dictionary has the following keys:
- taskid: The task ID (from the given taskspec)
- evaluation: A string summarizing the performance on the entire task (from the metric function).
- summary: An HTML string summarizing performance on the task (from the metric function).
- files: A list of dictionaries with 'filename' and 'code' slots containing the file names and raw code text of the submitted file(s).
- warnings: A list of warnings (from the metric function plus a few custom warnings if things are seriously wrong).
- table: A table (in the format returned by `Goal.table`) detailing each goal and subgoal (from the metric function).
- contexts: A list of context summary dictionaries in the format returned by `potluck.contexts.list_and_render_contexts`, which summarizes all contexts used by this rubric (see `Rubric.create_contexts_list`).
- TODO: Add a partner_username field here?
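A minimal usage sketch (the username and paths here are hypothetical; normally the surrounding potluck machinery supplies them):

# Hypothetical: evaluate one submission against this rubric.
report = rubric.evaluate(
    task_info,                               # entry from tasks.json
    "acarter",                               # username of the submitter
    "submissions/acarter/functionsTask.py"   # file (or folder) to evaluate
)
print(report["evaluation"])   # e.g. "complete" or "incomplete"
for w in report["warnings"]:
    print("WARNING:", w)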
1222 def validate(self, task_info, username, tests_target, target): 1223 """ 1224 Validates tests for this task based on the given submitted tests 1225 file and submission file (the task_info includes generic info 1226 about the task, the username identifies who submitted it, the 1227 tests_target identifies the file or folder to be evaluated, and 1228 the target identifies the base task file or folder to run tests 1229 against). 1230 1231 See `tasks.json` for the task info format (it's a dictionary 1232 stored in the "tasks" slot under its taskid as a key). 1233 1234 Returns a report object that has information about which 1235 validation goal(s) from the rubric passed or failed, and the 1236 overall performance as determined by the rubric's metric. 1237 1238 If submitted tests cannot be loaded due to a syntax error or 1239 parsing fails for some other reason, the report will mention 1240 that in as much detail as it can, and the normal rubric items 1241 will be skipped. 1242 1243 Note: This function completely resets all validation goals and 1244 clears the caches of any associated `potluck.contexts.Context` 1245 objects before it starts evaluating goals. 1246 1247 TODO: We mostly effectively ignore the `target` argument because 1248 we grab the solution (see `contexts.TestsFileContext`). Get rid 1249 of it? 1250 1251 The returned report dictionary has the same keys/values as the 1252 result from `Rubric.evaluate`. 1253 """ 1254 # Empty report: 1255 report = { 1256 "taskid": task_info["id"], 1257 "evaluation": "unknown", 1258 "summary": "No summary has been generated.", 1259 "files": [], 1260 "warnings": [], 1261 "table": [], 1262 "contexts": [] 1263 } 1264 1265 # Set up a rewrite rule for the spec file 1266 if self.spec_file: 1267 html_tools.set_tb_rewrite( 1268 self.spec_file, 1269 "<task specification>" 1270 ) 1271 1272 # Check for a missing submission: 1273 if not os.path.exists(tests_target): 1274 report["warnings"] = [ 1275 "You did not submit any tests for this task." 1276 ] 1277 report["evaluation"] = "incomplete" 1278 report["summary"] = "You did not submit any tests for this task." 1279 # Early return: no need to grade rubric items 1280 return report 1281 1282 # Check for a missing submission: 1283 if not os.path.exists(target): 1284 report["warnings"] = [ 1285 "We did not find the code to test." 1286 ] 1287 report["evaluation"] = "incomplete" 1288 report["summary"] = "We did not find the code to test." 
1289 # Early return: no need to grade rubric items 1290 return report 1291 1292 # Reset each goal + any associated contexts: 1293 for g in self.validation_goals: 1294 g.full_reset() 1295 1296 # Ensure context descriptions are unique: 1297 clist = self.all_contexts(self.validation_goals) 1298 contexts.add_context_numbering(clist) 1299 1300 # Figure out whether tests target is a directory or file 1301 if os.path.isdir(tests_target): 1302 submission_root = tests_target 1303 default_file = task_info.get( 1304 "tests_target", 1305 "test_" + task_info["target"] 1306 ) 1307 actual_file = default_file 1308 else: 1309 submission_root, actual_file = os.path.split(tests_target) 1310 default_file = task_info.get( 1311 "tests_target", 1312 "test_" + task_info["target"] 1313 ) 1314 1315 # Figure out whether submission target is a directory or file 1316 if os.path.isdir(target): 1317 target_root = target 1318 target_default_file = task_info["target"] 1319 target_actual_file = target_default_file 1320 else: 1321 target_root, target_actual_file = os.path.split(target) 1322 target_default_file = task_info["target"] 1323 1324 # Create our base context: 1325 base_context = { 1326 "task_info": task_info, 1327 "username": username, 1328 "submission_root": target_root, 1329 "default_file": target_default_file, 1330 "actual_file": target_actual_file, 1331 "tests_submission_root": submission_root, 1332 "default_tests_file": default_file, 1333 "actual_tests_file": actual_file 1334 } 1335 1336 if len(self.validation_goals) == 0: 1337 raise ValueError("Rubric does not have any validation goals!") 1338 1339 # Evaluate each goal: 1340 for g in self.validation_goals: 1341 logging.debug_msg( 1342 "Evaluating validation goal '{}' @ {}...".format( 1343 g.feedback_topic(), 1344 id(g) 1345 ) 1346 ) 1347 # Task is automatically made available as part of context. 1348 result = g.evaluate(base_context) 1349 logging.debug_msg("...result is: {}".format(result)) 1350 logging.debug_msg("...review result is: {}".format(g.result)) 1351 1352 # Double-check that the goal correctly stored the value it 1353 # returned 1354 if result != g.result: 1355 logging.debug_msg( 1356 f"WARNING: Validation goal's returned result differs" 1357 f" from stored result!\nGoal:" 1358 f" '{g.feedback_topic()}'\nReturned:" 1359 f" {result}\nStored: {g.result}" 1360 ) 1361 1362 # Run our metric over the evaluated goals: 1363 # TODO: Allow/require separate validation metrics? 1364 metric_result = self.metric(self.validation_goals) 1365 1366 # Integrate results into our report: 1367 report["evaluation"] = metric_result["evaluation"] 1368 report["summary"] = metric_result["summary"] 1369 report["table"] = metric_result["table"] 1370 report["warnings"].extend(metric_result["warnings"]) 1371 1372 # Build our contexts list now that contexts should be caching the 1373 # same values used during testing: 1374 report["contexts"] = self.create_contexts_list( 1375 self.validation_goals, 1376 base_context 1377 ) 1378 1379 # Elevate warnings from contexts to the main warnings list. 1380 for creport in report["contexts"]: 1381 report["warnings"].extend(creport.get("warnings", [])) 1382 1383 # Build our files dictionary based on TestsFileContext objects. 1384 # It maps file names to dictionaries with "path" slots (and 1385 # possibly more if we can dig up more info). 
1386 all_filenames = { 1387 base_context["default_tests_file"]: { 1388 "path": os.path.abspath( 1389 os.path.join( 1390 base_context["submission_root"], 1391 base_context["actual_tests_file"] 1392 ) 1393 ) 1394 } 1395 } 1396 for ctx in clist: 1397 if isinstance(ctx, contexts.TestsFileContext): 1398 if ctx.target_tests_file is not None: 1399 ctx_result = ctx.create(base_context) 1400 name = ctx_result.get( 1401 "tests_filename", 1402 ctx.target_tests_file 1403 ) 1404 path = ctx_result.get("tests_file_path", name) 1405 if name not in all_filenames: 1406 all_filenames[name] = { "path": path } 1407 1408 # Look for code contexts which have handled parsing on target 1409 # files, and add "source" and possibly "original_source" slots 1410 for ctx in clist: 1411 if isinstance(ctx, contexts.CodeContext): 1412 ctx_result = ctx.create(base_context) 1413 if "tests_filename" in ctx_result: 1414 name = ctx_result["tests_filename"] 1415 original = ctx_result["original_tests_source"] 1416 fixed = ctx_result["tests_source"] 1417 all_filenames[name]["source"] = fixed 1418 if original != fixed: 1419 all_filenames[name]["original_source"] = original 1420 # Otherwise there was some kind of error we assume 1421 1422 # Grab file contents if we haven't already 1423 for filename in all_filenames: 1424 file_info = all_filenames[filename] 1425 entry = { 1426 "filename": filename, 1427 "path": file_info["path"] 1428 } 1429 report["files"].append(entry) 1430 if "source" in file_info: 1431 entry["code"] = file_info["source"] 1432 else: 1433 with open(entry["path"], 'r', encoding="utf-8") as fin: 1434 if entry["path"].endswith(".py"): 1435 entry["code"] = fin.read() 1436 else: 1437 entry["raw"] = fin.read() 1438 1439 if "original_source" in file_info: 1440 entry["original_code"] = file_info["original_source"] 1441 1442 return report
Validates tests for this task based on the given submitted tests file and submission file (the task_info includes generic info about the task, the username identifies who submitted it, the tests_target identifies the file or folder to be evaluated, and the target identifies the base task file or folder to run tests against).
See `tasks.json` for the task info format (it's a dictionary stored in the "tasks" slot under its taskid as a key).
Returns a report object that has information about which validation goal(s) from the rubric passed or failed, and the overall performance as determined by the rubric's metric.
If submitted tests cannot be loaded due to a syntax error or parsing fails for some other reason, the report will mention that in as much detail as it can, and the normal rubric items will be skipped.
Note: This function completely resets all validation goals and clears the caches of any associated `potluck.contexts.Context` objects before it starts evaluating goals.

TODO: We mostly effectively ignore the `target` argument because we grab the solution (see `contexts.TestsFileContext`). Get rid of it?

The returned report dictionary has the same keys/values as the result from `Rubric.evaluate`.
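Usage mirrors `Rubric.evaluate`, except that both a tests file and the code to test are supplied (paths hypothetical):

# Hypothetical: validate a student's tests against the base task code.
report = rubric.validate(
    task_info,
    "acarter",
    "submissions/acarter/test_functionsTask.py",  # tests_target
    "solutions/functionsTask.py"                  # target (mostly ignored; see TODO)
)
print(report["evaluation"])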
1444 def goals_by_id(self, fragment): 1445 """ 1446 Retrieves one or more of the goals from this rubric according to 1447 its identifier. Note that it's possible for multiple goals to 1448 share the same identifier (only when rendered into HTML do they 1449 get suffixes to make them unique), so this function always 1450 returns a list of goals, which is likely to be length-1. Of 1451 course, an empty list is returned if no goals have the given ID. 1452 Any goal whose identifier contains the provided string will be 1453 included in the goals returned, although '^^^' will be added to 1454 the front and a '$$$' to the end when checking this, so you can 1455 use those in your fragment; neither character normally appears 1456 inside of non-custom identifiers. 1457 """ 1458 # TODO: Prefix these for evaluation/validation? 1459 return [ 1460 g 1461 for g in self.evaluation_goals + self.validation_goals 1462 if fragment in ('^^^' + g.identifier + '$$$') 1463 ]
Retrieves one or more goals from this rubric by identifier. Note that it's possible for multiple goals to share the same identifier (only when rendered into HTML do they get suffixes to make them unique), so this function always returns a list of goals, which is likely to be length-1. Of course, an empty list is returned if no goals have the given ID.

Any goal whose identifier contains the provided string will be included in the goals returned, although '^^^' will be added to the front and '$$$' to the end when checking this, so you can use those markers in your fragment; neither sequence normally appears inside of non-custom identifiers.
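The matching rule is plain substring containment against the wrapped identifier, so the markers let you anchor a fragment at either end. For example (using a hypothetical identifier):

wrapped = '^^^' + "goal:core.check_result" + '$$$'
'^^^goal:core' in wrapped      # True: anchored at the start
'check_result$$$' in wrapped   # True: anchored at the end
'core.check' in wrapped        # True: plain substring match
'^^^check_result' in wrapped   # False: identifier doesn't start with this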
1470def overall_evaluation(foundational, core, extra): 1471 """ 1472 Given lists of evaluated foundational, core, and extra goals, returns 1473 a pair containing an overall evaluation string and a summary string 1474 based on the following rules. Treating each core goal as 1 point, 1475 with 1/2 point for partial accomplishment, the metric computes a 1476 point total for core goals and then: 1477 1478 - If a score of at least 1/2 of the number of core goals is met, and 1479 all of the foundational goals are accomplished, the overall 1480 evaluation is "partially complete". 1481 - Depending on the number of core goals, a completeness point 1482 threshold is established (TODO: more principled than this?): 1483 - If there is only 1 core goal, the threshold is 1 (it's impossible 1484 to score 'almost complete' in this scenario). 1485 - Otherwise, the threshold is the number of core goals minus a 1486 fudge factor of 10% rounded up to the nearest 0.5. In other 1487 words, for 2-5 core goals the fudge factor is 0.5, for 5-10 it's 1488 1, for 11-15 it's 1.5, for 16-20 it's 2, etc. 1489 - If at least one core goal is not fully accomplished, but the core 1490 point total is equal to or greater than the completeness point 1491 threshold, then the overall evaluation is "almost complete". 1492 - If all of the core goals are fully accomplished, but at least one 1493 extra goal is not fully accomplished, the evaluation is "complete". 1494 - If all of the core goals and all of the extra goals are 1495 accomplished, the overall evaluation is "excellent". 1496 - If either at least one foundational goal is failed, or the score 1497 for core goals is less than 1/2 of the number of core goals, the 1498 evaluation is "incomplete". 1499 """ 1500 # Check foundational goals 1501 failed_foundational = [] 1502 for g in foundational: 1503 logging.debug_msg( 1504 "Reviewing foundational goal '{}' @ {}...".format( 1505 g.feedback_topic(), 1506 id(g) 1507 ) 1508 ) 1509 logging.debug_msg("...result is: {}".format(g.result)) 1510 if g.result["status"] not in ("accomplished", "partial"): 1511 failed_foundational.append(g) 1512 1513 # Check core goals 1514 core_score = 0 1515 core_accomplished = [] 1516 core_partial = [] 1517 for g in core: 1518 logging.debug_msg( 1519 "Reviewing core goal '{}' @ {}...".format( 1520 g.feedback_topic, 1521 id(g) 1522 ) 1523 ) 1524 logging.debug_msg("...result is: {}".format(g.result)) 1525 if g.result["status"] == "accomplished": 1526 core_score += 1 1527 core_accomplished.append(g) 1528 elif g.result["status"] == "partial": 1529 core_score += 0.5 1530 core_partial.append(g) 1531 1532 # Nicer repr 1533 if int(core_score) == core_score: 1534 core_score = int(core_score) 1535 1536 # Check extra goals 1537 extra_unaccomplished = [] 1538 for g in extra: 1539 logging.debug_msg( 1540 "Reviewing extra goal '{}' @ {}...".format( 1541 g.feedback_topic(), 1542 id(g) 1543 ) 1544 ) 1545 logging.debug_msg("...result is: {}".format(g.result)) 1546 if g.result["status"] != "accomplished": 1547 extra_unaccomplished.append(g) 1548 1549 # Feedback for core and extra goals: 1550 if len(core) < 2: 1551 core_threshold = len(core) 1552 else: 1553 core_threshold = len(core) - (math.ceil(0.2 * len(core)) / 2) - 0.01 1554 # the 0.01 is extra careful of rounding errors 1555 1556 if core_score == len(core): 1557 # Perfect core score -> 'complete' or 'excellent' overall 1558 if len(extra_unaccomplished) == 0: 1559 # Extras all accomplished + core all accomplished -> 'excellent' 1560 return ( 1561 "excellent", 1562 
"<p>You accomplished all core and extra goals. Great job!</p>" 1563 ) 1564 else: 1565 return ( 1566 "complete", 1567 ( 1568 "<p>You accomplished the core goals. Good job!</p>" 1569 "<p>You accomplished" 1570 f" {len(extra) - len(extra_unaccomplished)} of" 1571 f" {len(extra)} extra goals.</p>" 1572 ) 1573 ) 1574 1575 elif core_score >= core_threshold: 1576 # Close-enough core score: "almost complete" 1577 return ( 1578 "almost complete", 1579 ( 1580 f"<p>You accomplished {core_score} (nearly all) of the" 1581 f" {len(core)} core goals.</p>" 1582 ) 1583 ) 1584 1585 else: 1586 # Not even close-enough 1587 half = len(core) * 0.5 1588 if half == int(half): # Nicer repr 1589 half = int(half) 1590 if core_score >= half: 1591 return ( 1592 "partially complete", 1593 ( 1594 f"<p>You accomplished {core_score} (which is at" 1595 f" least half) of the {len(core)} core goals.</p>" 1596 ) 1597 ) 1598 else: 1599 return ( 1600 "incomplete", 1601 ( 1602 f"<p>You accomplished only {core_score} (which is" 1603 f" less than half) of the {len(core)} core goals.</p>" 1604 ) 1605 )
Given lists of evaluated foundational, core, and extra goals, returns a pair containing an overall evaluation string and a summary string based on the following rules. Treating each core goal as 1 point, with 1/2 point for partial accomplishment, the metric computes a point total for core goals and then:
- If a score of at least 1/2 of the number of core goals is met, and all of the foundational goals are accomplished, the overall evaluation is "partially complete".
- Depending on the number of core goals, a completeness point threshold is established (TODO: more principled than this?):
    - If there is only 1 core goal, the threshold is 1 (it's impossible to score 'almost complete' in this scenario).
    - Otherwise, the threshold is the number of core goals minus a fudge factor of 10% rounded up to the nearest 0.5. In other words, for 2-5 core goals the fudge factor is 0.5, for 6-10 it's 1, for 11-15 it's 1.5, for 16-20 it's 2, etc.
- If at least one core goal is not fully accomplished, but the core point total is equal to or greater than the completeness point threshold, then the overall evaluation is "almost complete".
- If all of the core goals are fully accomplished, but at least one extra goal is not fully accomplished, the evaluation is "complete".
- If all of the core goals and all of the extra goals are accomplished, the overall evaluation is "excellent".
- If either at least one foundational goal is failed, or the score for core goals is less than 1/2 of the number of core goals, the evaluation is "incomplete".
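As a worked example of the threshold rule, the helper below mirrors the computation used in this function (including its 0.01 guard against rounding errors):

import math

def almost_complete_threshold(n):
    # Mirrors the completeness threshold from overall_evaluation.
    if n < 2:
        return n  # with 0 or 1 core goals, only a perfect score qualifies
    return n - (math.ceil(0.2 * n) / 2) - 0.01

# With 7 core goals the fudge factor is 1, so a core score of 6
# (or 6.5) earns "almost complete":
print(almost_complete_threshold(7))  # 5.99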
1608def summarize_category_row( 1609 row, 1610 goals, 1611 all_or_nothing=False, 1612 half_matters=False, 1613 blank=False 1614): 1615 """ 1616 Given a table row and a list of goals, adds "status" and 1617 "explanation" entries to the given row based on whether some, 1618 more/less than half, and/or all of the goals in the list were 1619 accomplished. 1620 1621 If all_or_nothing is given as True, then the status will always be 1622 either "accomplished" if all goals were, or "failed" if at least one 1623 wasn't (even if it was partial). 1624 1625 If half_matters is given as True, a note about whether or not at 1626 least half of the goals were accomplished will be added, counting 1627 partially-accomplished goals as 1/2 point. 1628 1629 This function modifies the provided row and doesn't return anything. 1630 1631 If blank is set to True, the status/explanation are set according to 1632 BLANK_RESULT. 1633 """ 1634 if blank: 1635 row["status"] = BLANK_RESULT["status"] 1636 row["explanation"] = BLANK_RESULT["explanation"] 1637 return 1638 1639 accomplished = len( 1640 [g for g in goals if g.result["status"] == "accomplished"] 1641 ) 1642 partial = len( 1643 [g for g in goals if g.result["status"] == "partial"] 1644 ) 1645 count = len(goals) 1646 1647 if accomplished == count: 1648 row["status"] = "accomplished" 1649 row["explanation"] = "Accomplished all {} {}.".format( 1650 count, 1651 phrasing.plural(count, "goal") 1652 ) 1653 else: 1654 points = accomplished + 0.5 * partial 1655 if all_or_nothing: 1656 row["status"] = "failed" 1657 row["explanation"] = "Failed to fully accomplish {} {}.".format( 1658 count - accomplished, 1659 phrasing.plural(count - accomplished, "goal") 1660 ) 1661 else: 1662 row["explanation"] = "Accomplished {} of {} {}.".format( 1663 round(points) if round(points) == points else points, 1664 count, 1665 phrasing.plural(count, "goal") 1666 ) 1667 if points < count / 2 - 0.001: 1668 row["status"] = "failed" 1669 else: 1670 row["status"] = "partial" 1671 1672 if half_matters: 1673 if points < count / 2 - 0.001: 1674 half_msg = " (less than half overall)." 1675 else: 1676 half_msg = " (at least half overall)." 1677 row["explanation"] = row["explanation"][:-1] + half_msg
Given a table row and a list of goals, adds "status" and "explanation" entries to the given row based on whether some, more/less than half, and/or all of the goals in the list were accomplished.
If all_or_nothing is given as True, then the status will always be either "accomplished" if all goals were, or "failed" if at least one wasn't (even if it was partial).
If half_matters is given as True, a note about whether or not at least half of the goals were accomplished will be added, counting partially-accomplished goals as 1/2 point.
This function modifies the provided row and doesn't return anything.
If blank is set to True, the status/explanation are set according to BLANK_RESULT.
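A sketch using stub goals: the function only reads each goal's result['status'], so simple stand-ins suffice for illustration (real code passes `Goal` instances):

class StubGoal:
    # Stand-in exposing just the slot summarize_category_row reads.
    def __init__(self, status):
        self.result = {"status": status}

goals = [StubGoal("accomplished"), StubGoal("partial"), StubGoal("failed")]
row = {"description": ("Core goals", ""), "tags": {}}
summarize_category_row(row, goals, half_matters=True)
print(row["status"])       # "partial" (1.5 of 3 points is at least half)
print(row["explanation"])  # "Accomplished 1.5 of 3 goals (at least half overall)."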
1680def foundational_core_extras_metric(goals, blank=False): 1681 """ 1682 Summarizes a list of evaluated goals by looking at those tagged 1683 as "foundational" and "core" and treating the rest as extras, while 1684 ignoring any tagged with "feedback_only". It assigns an evaluation 1685 using the overall_evaluation function. 1686 1687 If blank is given as True, the report will include an evaluation of 1688 "not evaluated" and will not assign success or failure overall or to 1689 individual goal categories. Use this, along with unevaluated goals, 1690 to create a blank rubric. 1691 1692 This function returns a dictionary with the following keys: 1693 1694 - evaluation: A short string providing an overall evaluation of 1695 the submission, as described above. 1696 - summary: A string containing HTML code that summarizes the 1697 evaluation in a few sentences. It contains descriptions of 1698 how many goals in each category were accomplished. 1699 - table: A table dictionary, similar to those returned by 1700 `Goal.table. It will have 'description', 'tags', 'status', 1701 'explanation', and perhaps 'subtable' keys. 1702 - warnings: A list of HTML strings including all warnings 1703 generated by any goal. TODO: Actually just an empty list for 1704 now. 1705 """ 1706 1707 # Sort goals into categories (multiple membership allowed in some cases) 1708 foundational = [] 1709 core = [] 1710 extra = [] 1711 feedback = [] 1712 for g in goals: 1713 category = g.tags.get("category", "extra") 1714 if category == "foundational": 1715 foundational.append(g) 1716 elif category == "core": 1717 core.append(g) 1718 elif category == "feedback_only": 1719 feedback.append(g) 1720 else: 1721 extra.append(g) 1722 1723 # Include foundational goals: 1724 foundation_row = { 1725 "description": ( 1726 "Foundational goals", 1727 "If one fails, the assignment is incomplete." 1728 ), 1729 "tags": { "category": "foundational" }, 1730 "status": "unknown", 1731 "explanation": "No explanation yet.", 1732 "subtable": [], 1733 } 1734 for g in foundational: 1735 foundation_row["subtable"].extend(g.table(blank=blank)) 1736 summarize_category_row( 1737 foundation_row, 1738 foundational, 1739 all_or_nothing=True, 1740 blank=blank 1741 ) 1742 1743 # Include core goals: 1744 core_row = { 1745 "description": ( 1746 "Core goals", 1747 ( 1748 "Complete all of these for core credit. Get partial" 1749 " credit for completing at least half, and more" 1750 + " partial credit for completing at least 90%." 1751 ) 1752 ), 1753 "tags": { "category": "core" }, 1754 "status": "unknown", 1755 "explanation": "No explanation yet.", 1756 "subtable": [], 1757 } 1758 for g in core: 1759 core_row["subtable"].extend(g.table(blank=blank)) 1760 summarize_category_row(core_row, core, half_matters=True, blank=blank) 1761 1762 # Include extra goals: 1763 extra_row = { 1764 "description": ( 1765 "Extra goals", 1766 ( 1767 "Complete all of these in addition to all of the core" 1768 + " goals for a perfect score." 
1769 ) 1770 ), 1771 "tags": { "category": "extra" }, 1772 "status": "unknown", 1773 "explanation": "No explanation yet.", 1774 "subtable": [], 1775 } 1776 for g in extra: 1777 extra_row["subtable"].extend(g.table(blank=blank)) 1778 summarize_category_row(extra_row, extra, all_or_nothing=True, blank=blank) 1779 1780 # Include feedback_only goals: 1781 feedback_row = { 1782 "description": ("Additional feedback (not graded):", ""), 1783 "tags": { "category": "feedback_only" }, 1784 "status": "not applicable", 1785 "explanation": ( 1786 "These extra items are not graded, but provide potentially " 1787 + "useful feedback ." 1788 ), 1789 "subtable": [], 1790 } 1791 for g in feedback: 1792 logging.debug_msg( 1793 "Reviewing feedback goal '{}' @ {}...".format( 1794 g.feedback_topic(), 1795 id(g) 1796 ) 1797 ) 1798 logging.debug_msg("...result is: {}".format(g.result)) 1799 feedback_row["subtable"].extend(g.table(blank=blank)) 1800 summarize_category_row(feedback_row, feedback, blank=blank) 1801 1802 nonempty_rows = list( 1803 filter( 1804 lambda row: len(row.get("subtable", [])) > 0, 1805 [ 1806 foundation_row, 1807 core_row, 1808 extra_row, 1809 feedback_row 1810 ] 1811 ) 1812 ) 1813 1814 # If we're creating a blank rubric, stop here and just report what 1815 # the goals were. 1816 if blank: 1817 return { 1818 "evaluation": "not evaluated", 1819 "summary": "Blank rubric.", 1820 "table": nonempty_rows, 1821 "warnings": [] # TODO: Mechanism to generate these? 1822 } 1823 1824 # Build summary and table rows; decide evaluation: 1825 evaluation, summary = overall_evaluation(foundational, core, extra) 1826 1827 return { 1828 "evaluation": evaluation, 1829 "summary": summary, 1830 "table": nonempty_rows, 1831 "warnings": [] # TODO: Mechanism to generate these? 1832 }
Summarizes a list of evaluated goals by looking at those tagged as "foundational" and "core" and treating the rest as extras, while ignoring any tagged with "feedback_only". It assigns an evaluation using the overall_evaluation function.
If blank is given as True, the report will include an evaluation of "not evaluated" and will not assign success or failure overall or to individual goal categories. Use this, along with unevaluated goals, to create a blank rubric.
This function returns a dictionary with the following keys:
- evaluation: A short string providing an overall evaluation of the submission, as described above.
- summary: A string containing HTML code that summarizes the evaluation in a few sentences. It contains descriptions of how many goals in each category were accomplished.
- table: A table dictionary, similar to those returned by `Goal.table`. It will have 'description', 'tags', 'status', 'explanation', and perhaps 'subtable' keys.
- warnings: A list of HTML strings including all warnings generated by any goal. TODO: Actually just an empty list for now.
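Goals select their category via a "category" tag, and the result slots directly into a report. A sketch of the returned shape (values abbreviated and illustrative):

result = foundational_core_extras_metric(goals)  # goals: evaluated Goal objects
# result == {
#     "evaluation": "almost complete",           # from overall_evaluation
#     "summary": "<p>You accomplished ...</p>",  # HTML summary
#     "table": [                                 # one row per non-empty category
#         {
#             "description": ("Core goals", "..."),
#             "tags": {"category": "core"},
#             "status": "partial",
#             "explanation": "Accomplished 3 of 4 goals (at least half overall).",
#             "subtable": [...],                  # per-goal rows from Goal.table
#         },
#         ...
#     ],
#     "warnings": [],                             # currently always empty
# }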
1835def core_extras_categorized_metric(goals, blank=False): 1836 """ 1837 Works like `foundational_core_extras_metric`, but does not use 1838 foundational goals (only goals tagged "core" vs. not are 1839 distinguished). However, this version looks at goal type tags and 1840 creates a table organizing goals by their types and then categories. 1841 The goal types (supplied via "goal_type" tags) are: 1842 1843 - "style" 1844 - "procedure" 1845 - "process" 1846 - "product" 1847 - "behavior" 1848 - "tests" 1849 - "other" (any goal not tagged with a type will get this type) 1850 1851 The overall evaluation and summary of the dictionary returned are the 1852 same as for the `foundational_core_extras_metric`, and the goal types 1853 are not relevant to the evaluation result. 1854 """ 1855 1856 # Sort goals into categories 1857 core = [] 1858 extra = [] 1859 feedback = [] 1860 for g in goals: 1861 cat = g.tags.get("category", "extra") 1862 if cat == "core": 1863 core.append(g) 1864 elif cat == "feedback_only": 1865 feedback.append(g) 1866 else: 1867 extra.append(g) 1868 1869 # Get evaluation & summary for the goals (no foundational goals) 1870 evaluation, summary = overall_evaluation([], core, extra) 1871 1872 # Sort goals again by type tags 1873 rows = [] 1874 for gtype in GOAL_TYPE_RUBRICS: 1875 gtype_description = GOAL_TYPE_RUBRICS[gtype] 1876 gtype_goals = [] 1877 for g in goals: 1878 if g.tags.get("goal_type", "other") == gtype: 1879 gtype_goals.append(g) 1880 1881 # If there aren't any goals in this category, we skip it entirely 1882 if len(gtype_goals) == 0: 1883 continue 1884 1885 # Core/extra/feedback sub-rows for this category 1886 core_subrow = { 1887 "description": ( 1888 "Core goals", 1889 ( 1890 "Complete all core goals for core credit. Get partial" 1891 " credit for completing at least half, and more" 1892 + " partial credit for completing at least 90%." 1893 ) 1894 ), 1895 "tags": { "category": "core", "goal_type": gtype }, 1896 "status": "unknown", 1897 "explanation": "No explanation yet.", 1898 "subtable": [], 1899 } 1900 extra_subrow = { 1901 "description": ( 1902 "Extra goals", 1903 ( 1904 "Complete all extra goals in addition to the core" 1905 + " goals for a perfect score." 1906 ) 1907 ), 1908 "tags": { "category": "extra", "goal_type": gtype }, 1909 "status": "unknown", 1910 "explanation": "No explanation yet.", 1911 "subtable": [], 1912 } 1913 feedback_subrow = { 1914 "description": ( 1915 "Additional feedback (not graded):", 1916 ( 1917 "These checks and tests are provided to give you" 1918 + " more insight into the assignment, but are not part" 1919 + " of the grading." 1920 ) 1921 ), 1922 "tags": { "category": "feedback_only", "goal_type": gtype }, 1923 "status": "not applicable", 1924 "explanation": ( 1925 "These extra items are not graded, but provide potentially " 1926 + "useful feedback." 
1927 ), 1928 "subtable": [], 1929 } 1930 1931 # Add goals to sub-rows 1932 core_here = [] 1933 extra_here = [] 1934 feedback_here = [] 1935 for g in gtype_goals: 1936 if g in core: 1937 core_here.append(g) 1938 core_subrow["subtable"].extend(g.table(blank=blank)) 1939 elif g in feedback: 1940 feedback_here.append(g) 1941 feedback_subrow["subtable"].extend(g.table(blank=blank)) 1942 else: 1943 extra_here.append(g) 1944 extra_subrow["subtable"].extend(g.table(blank=blank)) 1945 1946 # List the non-empty sub-rows 1947 nonempty_subrows = [] 1948 for sub in (core_subrow, extra_subrow, feedback_subrow): 1949 if len(sub["subtable"]) > 0: 1950 nonempty_subrows.append(sub) 1951 1952 # Main row for this category 1953 row = { 1954 "description": gtype_description, 1955 "tags": { "category": "type_group", "goal_type": gtype }, 1956 "status": "unknown", 1957 "explanation": "No explanation yet.", 1958 "subtable": nonempty_subrows, 1959 } 1960 # Add this row to our rows list 1961 rows.append(row) 1962 1963 # Summarize each sub-row 1964 summarize_category_row(core_subrow, core_here, blank=blank) 1965 summarize_category_row(extra_subrow, extra_here, blank=blank) 1966 summarize_category_row(feedback_subrow, feedback_here, blank=blank) 1967 1968 # Status + explanation for this entire category 1969 if blank: 1970 # Blank status + explanation 1971 row["status"] = BLANK_RESULT["status"] 1972 row["explanation"] = BLANK_RESULT["explanation"] 1973 else: 1974 # Goal-type group status based on core goals alone 1975 row["status"] = core_subrow["status"] 1976 ngoals = len(core_subrow["subtable"]) 1977 if ngoals == 0: 1978 # no core goals in this category 1979 if len(extra_subrow["subtable"]) == 0: 1980 # no evaluated goals at all... 1981 row["status"] = "unknown" 1982 ngoals = len(feedback_subrow["subtable"]) 1983 row["explanation"] = ( 1984 "The {} {} {} contribute to your overall" 1985 " evaluation ({} just informative)." 1986 ).format( 1987 gtype, 1988 phrasing.plural(ngoals, "goal"), 1989 phrasing.plural(ngoals, "does not", "do not"), 1990 phrasing.plural(ngoals, "they're", "it's"), 1991 ) 1992 else: 1993 # Base on the extra goals 1994 row["status"] = extra_subrow["status"] 1995 ngoals = len(extra_subrow["subtable"]) 1996 if row["status"] == "accomplished": 1997 if ngoals > 1: 1998 row["explanation"] = ( 1999 "Accomplished all {} extra {} goals." 2000 ).format(ngoals, gtype) 2001 else: 2002 row["explanation"] = ( 2003 "Accomplished the {} extra {} goal." 2004 ).format(ngoals, gtype) 2005 elif row["status"] == "partial": 2006 if ngoals > 1: 2007 row["explanation"] = ( 2008 "Accomplished most of the {} extra {}" 2009 " goals." 2010 ).format(ngoals, gtype) 2011 else: 2012 row["explanation"] = ( 2013 "Partially accomplished the extra {}" 2014 " goal." 2015 ).format(gtype) 2016 elif row["status"] == "failed": 2017 if ngoals > 1: 2018 row["explanation"] = ( 2019 "Did not accomplish at least half of" 2020 " the {} extra {} goals." 2021 ).format(ngoals, gtype) 2022 else: 2023 row["explanation"] = ( 2024 "Did not accomplish the extra {} goal." 2025 ).format(gtype) 2026 else: 2027 row["explanation"] = ( 2028 "No conclusive evaluation for the extra {}" 2029 " {}." 
2030 ).format(gtype, phrasing.plural(ngoals, "goal")) 2031 elif row["status"] == "accomplished": 2032 # Explanation tweaked based on extra goals 2033 nextra = len(extra_subrow["subtable"]) 2034 cat_phrase = "core" 2035 if ( 2036 nextra > 0 2037 and extra_subrow["status"] == "accomplished" 2038 ): 2039 cat_phrase = "core and extra" 2040 ngoals += nextra 2041 if ngoals > 1: 2042 row["explanation"] = ( 2043 "Accomplished all {} {} {} goals." 2044 ).format(ngoals, cat_phrase, gtype) 2045 else: 2046 row["explanation"] = ( 2047 "Accomplished the {} {} goal." 2048 ).format(cat_phrase, gtype) 2049 elif row["status"] == "partial": 2050 if ngoals > 1: 2051 row["explanation"] = ( 2052 "Accomplished most of the core {} goals." 2053 ).format(gtype) 2054 else: 2055 row["explanation"] = ( 2056 "Partially accomplished the core {} goal." 2057 ).format(gtype) 2058 elif row["status"] == "failed": 2059 if ngoals > 1: 2060 row["explanation"] = ( 2061 "Did not accomplish at least half of the core" 2062 " {} goals." 2063 ).format(gtype) 2064 else: 2065 row["explanation"] = ( 2066 "Did not at least partially accomplish the core" 2067 " {} goal." 2068 ).format(gtype) 2069 else: 2070 row["explanation"] = ( 2071 "No conclusive evaluation for the core {} {}." 2072 ).format(gtype, phrasing.plural(ngoals, "goal")) 2073 2074 # If we're creating a blank rubric, stop here and just report what 2075 # the goals were. 2076 if blank: 2077 return { 2078 "evaluation": "not evaluated", 2079 "summary": "Blank rubric.", 2080 "table": rows, 2081 "warnings": [] # TODO: Mechanism to generate these? 2082 } 2083 else: 2084 # Otherwise, include the evaluation and summary 2085 return { 2086 "evaluation": evaluation, 2087 "summary": summary, 2088 "table": rows, 2089 "warnings": [] # TODO: Mechanism to generate these? 2090 }
Works like `foundational_core_extras_metric`, but does not use foundational goals (only goals tagged "core" vs. not are distinguished). However, this version looks at goal type tags and creates a table organizing goals by their types and then categories. The goal types (supplied via "goal_type" tags) are:
- "style"
- "procedure"
- "process"
- "product"
- "behavior"
- "tests"
- "other" (any goal not tagged with a type will get this type)
The overall evaluation and summary of the dictionary returned are the same as for the `foundational_core_extras_metric`, and the goal types are not relevant to the evaluation result.
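The resulting table is thus nested two levels deep: goal-type rows contain core/extra/feedback sub-rows, which in turn contain per-goal rows. A traversal sketch (assuming `goals` is a list of evaluated goals):

result = core_extras_categorized_metric(goals)
for type_row in result["table"]:              # one row per goal type present
    print(type_row["description"][0], "->", type_row["status"])
    for cat_row in type_row["subtable"]:      # core / extra / feedback_only
        print(" ", cat_row["tags"]["category"], "->", cat_row["status"])
        for goal_row in cat_row["subtable"]:  # rows from Goal.table
            print("   ", goal_row["status"])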
2093def core_extras_flat_metric(goals, blank=False): 2094 """ 2095 Works like the `core_extras_categorized_metric` but returns a flat 2096 table without goal-type or goal-category rows. This table can be used 2097 with custom sorting controls to allow re-grouping by goal-type, 2098 goal-category, etc. 2099 2100 The overall evaluation and summary of the dictionary returned are the 2101 same as for the `foundational_core_extras_metric`. 2102 """ 2103 2104 # Sort goals into categories 2105 core = [] 2106 extra = [] 2107 feedback = [] 2108 for g in goals: 2109 cat = g.tags.get("category", "extra") 2110 if cat == "core": 2111 core.append(g) 2112 elif cat == "feedback_only": 2113 feedback.append(g) 2114 else: 2115 extra.append(g) 2116 2117 # Get evaluation & summary for the goals 2118 evaluation, summary = overall_evaluation([], core, extra) 2119 2120 # Accumulate rows for each goal 2121 rows = [] 2122 for g in goals: 2123 rows.extend(g.table(blank=blank)) 2124 2125 # If we're creating a blank rubric, use empty evaluation/summary. 2126 if blank: 2127 return { 2128 "evaluation": "not evaluated", 2129 "summary": "Blank rubric.", 2130 "table": rows, 2131 "warnings": [] # TODO: Mechanism to generate these? 2132 } 2133 else: 2134 # Otherwise, include the evaluation and summary 2135 return { 2136 "evaluation": evaluation, 2137 "summary": summary, 2138 "table": rows, 2139 "warnings": [] # TODO: Mechanism to generate these? 2140 }
Works like the `core_extras_categorized_metric` but returns a flat table without goal-type or goal-category rows. This table can be used with custom sorting controls to allow re-grouping by goal-type, goal-category, etc.

The overall evaluation and summary of the dictionary returned are the same as for the `foundational_core_extras_metric`.
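Because each flat row keeps its goal's tags, regrouping is a small dictionary exercise (a sketch; rows come from Goal.table and carry a 'tags' slot):

from collections import defaultdict

result = core_extras_flat_metric(goals)
by_type = defaultdict(list)
for row in result["table"]:
    # Untagged rows fall into "other", matching the categorized metric.
    by_type[row["tags"].get("goal_type", "other")].append(row)
for gtype, rows in by_type.items():
    print(gtype, len(rows))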
2147class NoteGoal(Goal): 2148 """ 2149 A NoteGoal just serves as an extra rubric entry that's not associated 2150 with any test. 2151 """ 2152 def __init__( 2153 self, 2154 taskid, 2155 identifier, 2156 description=("BLANK NOTE GOAL", "THIS GOAL HAS NOT BEEN DEFINED"), 2157 explanation="", 2158 **kwargs 2159 ): 2160 """ 2161 A task ID (string), an identifier (string), and a description are 2162 required, and an explanation (shown only during feedback, but not 2163 on the rubric) may also be given. If no category tag is 2164 specified, the category tag will be set to "feedback_only". 2165 2166 The categorizer "note:" will be prepended to the identifier. 2167 """ 2168 tags = kwargs.setdefault("tags", {}) 2169 if "category" not in tags: 2170 tags["category"] = "feedback_only" 2171 2172 super().__init__( 2173 taskid, 2174 "note:" + identifier, 2175 description, 2176 **kwargs 2177 ) 2178 self.set_default_goal_type("other") 2179 self.explanation = explanation 2180 2181 def evaluate_in_context(self, context=None): 2182 """ 2183 Simply returns the pre-defined explanation. 2184 """ 2185 return { 2186 "status": "not applicable", 2187 "explanation": self.explanation 2188 }
A NoteGoal just serves as an extra rubric entry that's not associated with any test.
2152 def __init__( 2153 self, 2154 taskid, 2155 identifier, 2156 description=("BLANK NOTE GOAL", "THIS GOAL HAS NOT BEEN DEFINED"), 2157 explanation="", 2158 **kwargs 2159 ): 2160 """ 2161 A task ID (string), an identifier (string), and a description are 2162 required, and an explanation (shown only during feedback, but not 2163 on the rubric) may also be given. If no category tag is 2164 specified, the category tag will be set to "feedback_only". 2165 2166 The categorizer "note:" will be prepended to the identifier. 2167 """ 2168 tags = kwargs.setdefault("tags", {}) 2169 if "category" not in tags: 2170 tags["category"] = "feedback_only" 2171 2172 super().__init__( 2173 taskid, 2174 "note:" + identifier, 2175 description, 2176 **kwargs 2177 ) 2178 self.set_default_goal_type("other") 2179 self.explanation = explanation
A task ID (string), an identifier (string), and a description are required, and an explanation (shown only during feedback, but not on the rubric) may also be given. If no category tag is specified, the category tag will be set to "feedback_only".
The categorizer "note:" will be prepended to the identifier.
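A construction sketch (the task ID and wording are hypothetical):

note = NoteGoal(
    "functionsTask",      # hypothetical task ID
    "style_reminder",     # stored as "note:style_reminder"
    description=(
        "Code style",
        "General advice about the style of your code."
    ),
    explanation="Consider shorter lines and more comments."
)
# Untagged NoteGoals default to the ungraded category:
print(note.tags["category"])  # "feedback_only"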
2191class JointGoal(Goal): 2192 """ 2193 A joint goal requires 1 or more subgoals to succeed and bases its 2194 success off of the success of its subgoals. 2195 2196 If the `JointGoal` is tagged as "transparent", then when producing a 2197 table, it will not create an entry for itself and instead will just 2198 return the subtable containing sub-goals. This is useful when it is 2199 obvious from context how failure of a subgoal would affect the 2200 super-goal. 2201 2202 The joint goal takes its goal type tag from the tags of its child 2203 goals, or sets its tag to "other" if its children have more than one 2204 goal type tag. 2205 """ 2206 def __init__( 2207 self, 2208 taskid, 2209 identifier, 2210 description=("BLANK JOINT GOAL", "THIS GOAL HAS NOT BEEN DEFINED"), 2211 parts=None, 2212 required=None, 2213 partial_required=None, 2214 stop_early=False, 2215 **kwargs 2216 ): 2217 """ 2218 You must provide a task ID, an identifier, a description, a list 2219 of parts (default empty list), and a number of parts required 2220 (default is the size of the given parts list). 2221 2222 The categorizer "joint:" is prepended to the identifier. 2223 2224 If partial_required is given, as long as that many parts are 2225 strictly accomplished, this goal will count as partially 2226 accomplished (must be lower than required). 2227 2228 If stop_early is given as True, if the outcome is known based on 2229 goals already evaluated, the `JointGoal` will not evaluate 2230 subsequent goals. 2231 """ 2232 parts = parts or [] 2233 2234 # Pre-specify goal type tag 2235 subgoal_types = set() 2236 for p in parts: 2237 subgoal_types |= set( 2238 [t for t in p.tags if t in GOAL_TYPE_RUBRICS] 2239 ) 2240 2241 if len(subgoal_types) == 1: 2242 goal_type = list(subgoal_types)[0] 2243 else: 2244 # Zero or more than one explicit subgoal type 2245 goal_type = "other" 2246 2247 super().__init__( 2248 taskid, 2249 "joint:" + identifier, 2250 description, 2251 **kwargs 2252 ) 2253 self.set_default_goal_type(goal_type) 2254 self.parts = parts 2255 if required is None: 2256 required = len(parts) 2257 self.required = required 2258 self.partial_required = partial_required 2259 self.stop_early = stop_early 2260 2261 def subgoals(self): 2262 """ 2263 List of subgoals of this goal (our sub-goals). 2264 """ 2265 return self.parts 2266 2267 def table(self, blank=False): 2268 """ 2269 A `JointGoal`'s table by default contains a sub-table consisting of 2270 the combined tables for each of its sub-goals, but this is 2271 suppressed if the goal has the "hide_subgoals" tag. 2272 2273 If it has the "hide_unevaluated" tag, parts which were never 2274 evaluated due to early stopping are omitted from the subtable. 2275 2276 See `Goal.table` regarding the table format. 2277 """ 2278 subtable = [] 2279 if "hide_subgoals" in self.tags: 2280 subtable = None 2281 else: 2282 for i, subgoal in enumerate(self.parts): 2283 # Only goals that we actually evaluated belong in our result 2284 # table: 2285 if ( 2286 "hide_unevaluated" in self.tags 2287 and i >= self.result.get("goals_evaluated", len(self.parts)) 2288 ): 2289 break 2290 subtable.extend(subgoal.table(blank=blank)) 2291 2292 if "transparent" in self.tags: 2293 result = subtable 2294 else: 2295 result = super().table(blank=blank) 2296 result[0]["subtable"] = subtable 2297 2298 return result 2299 2300 def evaluate_in_context(self, context=None): 2301 """ 2302 To evaluate a `JointGoal`, we evaluate each subgoal in order. 
If at 2303 least the required number of them are "accomplished", the joint 2304 goal is also "accomplished". If not, but at least the required 2305 number are either "accomplished" or "partial", the joint goal is 2306 "partial". Otherwise, it is "failed". If the result is known 2307 before all goals are evaluated, the `JointGoal` will skip 2308 unnecessary parts, unless it was created with stop_early=False. 2309 """ 2310 context = context or {} 2311 2312 passed = 0 2313 partial = 0 2314 remaining = len(self.parts) 2315 2316 if self.required == 0 and self.stop_early: 2317 self.result = { 2318 "status": "accomplished", 2319 "goals_evaluated": 0 2320 } 2321 self.set_explanation( 2322 context, 2323 default="Context established; no testing required." 2324 ) 2325 return self.result 2326 2327 if self.required == 0: 2328 pass_msg = "Context established; no testing required." 2329 partial_msg = "ERROR: THIS MESSAGE SHOULD NEVER BE DISPLAYED (1)" 2330 fail_msg = "ERROR: THIS MESSAGE SHOULD NEVER BE DISPLAYED (2)" 2331 elif self.required == len(self.parts) and self.required > 1: 2332 pass_msg = "All parts accomplished." 2333 if self.partial_required is not None: 2334 partial_msg = ( 2335 "All parts at least partially accomplished, or at " 2336 + "least {} of {} parts accomplished." 2337 ).format(self.partial_required, len(self.parts)) 2338 else: 2339 partial_msg = "All parts at least partially accomplished." 2340 fail_msg = "At least one part failed." 2341 elif self.required == len(self.parts): 2342 pass_msg = "Subgoal accomplished." 2343 partial_msg = "Subgoal partially accomplished." 2344 fail_msg = "Subgoal failed." 2345 else: 2346 pass_msg = "At least {} of {} parts accomplished.".format( 2347 self.required, 2348 len(self.parts) 2349 ) 2350 if self.partial_required is not None: 2351 partial_msg = ( 2352 "At least {} of {} parts accomplished or partially " 2353 + "accomplished, or at least {} of {} parts accomplished." 2354 ).format( 2355 self.required, 2356 len(self.parts), 2357 self.partial_required, 2358 len(self.parts), 2359 ) 2360 fail_msg = ( 2361 "Failed to accomplish at least {} of {} parts." 2362 ).format( 2363 self.partial_required, 2364 len(self.parts) 2365 ) 2366 else: 2367 partial_msg = ( 2368 "At least {} of {} parts accomplished or partially " 2369 + "accomplished." 2370 ).format(self.required, len(self.parts)) 2371 fail_msg = ( 2372 "Failed to accomplish at least {} of {} parts." 
2373 ).format( 2374 self.required, 2375 len(self.parts) 2376 ) 2377 2378 goals_evaluated = 0 2379 for subgoal in self.parts: 2380 # Shallow copy of our context: 2381 sub_context = {} 2382 sub_context.update(context) 2383 result = subgoal.evaluate(sub_context) 2384 goals_evaluated += 1 2385 result_status = result.get("status", "unknown") 2386 remaining -= 1 2387 if result_status == "accomplished": 2388 passed += 1 2389 elif result_status == "partial": 2390 partial += 1 2391 2392 if self.stop_early: 2393 if passed >= self.required: 2394 self.result = { 2395 "status": "accomplished", 2396 "goals_evaluated": goals_evaluated 2397 } 2398 self.set_explanation(context, default=pass_msg) 2399 return self.result 2400 elif ( 2401 ( 2402 passed + partial >= self.required 2403 and passed + remaining < self.required 2404 ) 2405 or ( 2406 self.partial_required is not None 2407 and passed >= self.partial_required 2408 and passed + remaining < self.required 2409 ) 2410 ): 2411 self.result = { 2412 "status": "partial", 2413 "goals_evaluated": goals_evaluated 2414 } 2415 self.set_explanation(context, default=partial_msg) 2416 return self.result 2417 2418 if passed >= self.required: 2419 self.result = { 2420 "status": "accomplished", 2421 "goals_evaluated": goals_evaluated 2422 } 2423 self.set_explanation(context, default=pass_msg) 2424 return self.result 2425 elif ( 2426 (passed + partial >= self.required) 2427 or ( 2428 self.partial_required is not None 2429 and passed >= self.partial_required 2430 ) 2431 ): 2432 self.result = { 2433 "status": "partial", 2434 "goals_evaluated": goals_evaluated 2435 } 2436 self.set_explanation(context, default=partial_msg) 2437 return self.result 2438 else: 2439 self.result = { 2440 "status": "failed", 2441 "goals_evaluated": goals_evaluated 2442 } 2443 self.set_explanation(context, default=fail_msg) 2444 return self.result
A joint goal requires 1 or more subgoals to succeed and bases its success off of the success of its subgoals.
If the `JointGoal` is tagged as "transparent", then when producing a table, it will not create an entry for itself and instead will just return the subtable containing sub-goals. This is useful when it is obvious from context how failure of a subgoal would affect the super-goal.
The joint goal takes its goal type tag from the tags of its child goals, or sets its tag to "other" if its children have more than one goal type tag.
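A construction sketch (`part1` through `part3` stand for previously created goals; the other values are hypothetical):

# Accomplishing any 2 of the 3 parts accomplishes the joint goal, and
# accomplishing just 1 still earns partial credit; with stop_early,
# evaluation stops as soon as the outcome is decided.
combined = JointGoal(
    "functionsTask",     # hypothetical task ID
    "io_checks",         # stored as "joint:io_checks"
    description=("Input/output checks", "At least 2 of 3 must pass."),
    parts=[part1, part2, part3],
    required=2,
    partial_required=1,
    stop_early=True
)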
2206 def __init__( 2207 self, 2208 taskid, 2209 identifier, 2210 description=("BLANK JOINT GOAL", "THIS GOAL HAS NOT BEEN DEFINED"), 2211 parts=None, 2212 required=None, 2213 partial_required=None, 2214 stop_early=False, 2215 **kwargs 2216 ): 2217 """ 2218 You must provide a task ID, an identifier, a description, a list 2219 of parts (default empty list), and a number of parts required 2220 (default is the size of the given parts list). 2221 2222 The categorizer "joint:" is prepended to the identifier. 2223 2224 If partial_required is given, as long as that many parts are 2225 strictly accomplished, this goal will count as partially 2226 accomplished (must be lower than required). 2227 2228 If stop_early is given as True, if the outcome is known based on 2229 goals already evaluated, the `JointGoal` will not evaluate 2230 subsequent goals. 2231 """ 2232 parts = parts or [] 2233 2234 # Pre-specify goal type tag 2235 subgoal_types = set() 2236 for p in parts: 2237 subgoal_types |= set( 2238 [t for t in p.tags if t in GOAL_TYPE_RUBRICS] 2239 ) 2240 2241 if len(subgoal_types) == 1: 2242 goal_type = list(subgoal_types)[0] 2243 else: 2244 # Zero or more than one explicit subgoal type 2245 goal_type = "other" 2246 2247 super().__init__( 2248 taskid, 2249 "joint:" + identifier, 2250 description, 2251 **kwargs 2252 ) 2253 self.set_default_goal_type(goal_type) 2254 self.parts = parts 2255 if required is None: 2256 required = len(parts) 2257 self.required = required 2258 self.partial_required = partial_required 2259 self.stop_early = stop_early
You must provide a task ID, an identifier, a description, a list of parts (default empty list), and a number of parts required (default is the size of the given parts list).
The categorizer "joint:" is prepended to the identifier.
If partial_required is given, as long as that many parts are strictly accomplished, this goal will count as partially accomplished (must be lower than required).
If stop_early is given as True, then once the outcome is known based on goals already evaluated, the `JointGoal` will not evaluate subsequent goals.
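For orientation, here is a minimal construction sketch; `check_a` and `check_b` stand in for previously defined `Goal` objects, and the task ID, identifier, and description text are likewise hypothetical:

from potluck.rubrics import JointGoal

# Accomplished if at least one of the two parts is accomplished;
# stop_early means evaluation stops as soon as the outcome is decided.
either_check = JointGoal(
    "task1",
    "either_check",  # stored as "joint:either_check"
    ("Either check passes", "At least one of the two checks must succeed."),
    parts=[check_a, check_b],
    required=1,
    stop_early=True
)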
2261 def subgoals(self): 2262 """ 2263 List of subgoals of this goal (our sub-goals). 2264 """ 2265 return self.parts
List of subgoals of this goal (our sub-goals).
2267 def table(self, blank=False): 2268 """ 2269 A `JointGoal`'s table by default contains a sub-table consisting of 2270 the combined tables for each of its sub-goals, but this is 2271 suppressed if the goal has the "hide_subgoals" tag. 2272 2273 If it has the "hide_unevaluated" tag, parts which were never 2274 evaluated due to early stopping are omitted from the subtable. 2275 2276 See `Goal.table` regarding the table format. 2277 """ 2278 subtable = [] 2279 if "hide_subgoals" in self.tags: 2280 subtable = None 2281 else: 2282 for i, subgoal in enumerate(self.parts): 2283 # Only goals that we actually evaluated belong in our result 2284 # table: 2285 if ( 2286 "hide_unevaluated" in self.tags 2287 and i >= self.result.get("goals_evaluated", len(self.parts)) 2288 ): 2289 break 2290 subtable.extend(subgoal.table(blank=blank)) 2291 2292 if "transparent" in self.tags: 2293 result = subtable 2294 else: 2295 result = super().table(blank=blank) 2296 result[0]["subtable"] = subtable 2297 2298 return result
A `JointGoal`'s table by default contains a sub-table consisting of the combined tables for each of its sub-goals, but this is suppressed if the goal has the "hide_subgoals" tag.
If it has the "hide_unevaluated" tag, parts which were never evaluated due to early stopping are omitted from the subtable.
See `Goal.table` regarding the table format.
2300 def evaluate_in_context(self, context=None): 2301 """ 2302 To evaluate a `JointGoal`, we evaluate each subgoal in order. If at 2303 least the required number of them are "accomplished", the joint 2304 goal is also "accomplished". If not, but at least the required 2305 number are either "accomplished" or "partial", the joint goal is 2306 "partial". Otherwise, it is "failed". If the result is known 2307 before all goals are evaluated, the `JointGoal` will skip 2308 unnecessary parts, unless it was created with stop_early=False. 2309 """ 2310 context = context or {} 2311 2312 passed = 0 2313 partial = 0 2314 remaining = len(self.parts) 2315 2316 if self.required == 0 and self.stop_early: 2317 self.result = { 2318 "status": "accomplished", 2319 "goals_evaluated": 0 2320 } 2321 self.set_explanation( 2322 context, 2323 default="Context established; no testing required." 2324 ) 2325 return self.result 2326 2327 if self.required == 0: 2328 pass_msg = "Context established; no testing required." 2329 partial_msg = "ERROR: THIS MESSAGE SHOULD NEVER BE DISPLAYED (1)" 2330 fail_msg = "ERROR: THIS MESSAGE SHOULD NEVER BE DISPLAYED (2)" 2331 elif self.required == len(self.parts) and self.required > 1: 2332 pass_msg = "All parts accomplished." 2333 if self.partial_required is not None: 2334 partial_msg = ( 2335 "All parts at least partially accomplished, or at " 2336 + "least {} of {} parts accomplished." 2337 ).format(self.partial_required, len(self.parts)) 2338 else: 2339 partial_msg = "All parts at least partially accomplished." 2340 fail_msg = "At least one part failed." 2341 elif self.required == len(self.parts): 2342 pass_msg = "Subgoal accomplished." 2343 partial_msg = "Subgoal partially accomplished." 2344 fail_msg = "Subgoal failed." 2345 else: 2346 pass_msg = "At least {} of {} parts accomplished.".format( 2347 self.required, 2348 len(self.parts) 2349 ) 2350 if self.partial_required is not None: 2351 partial_msg = ( 2352 "At least {} of {} parts accomplished or partially " 2353 + "accomplished, or at least {} of {} parts accomplished." 2354 ).format( 2355 self.required, 2356 len(self.parts), 2357 self.partial_required, 2358 len(self.parts), 2359 ) 2360 fail_msg = ( 2361 "Failed to accomplish at least {} of {} parts." 2362 ).format( 2363 self.partial_required, 2364 len(self.parts) 2365 ) 2366 else: 2367 partial_msg = ( 2368 "At least {} of {} parts accomplished or partially " 2369 + "accomplished." 2370 ).format(self.required, len(self.parts)) 2371 fail_msg = ( 2372 "Failed to accomplish at least {} of {} parts." 
2373 ).format( 2374 self.required, 2375 len(self.parts) 2376 ) 2377 2378 goals_evaluated = 0 2379 for subgoal in self.parts: 2380 # Shallow copy of our context: 2381 sub_context = {} 2382 sub_context.update(context) 2383 result = subgoal.evaluate(sub_context) 2384 goals_evaluated += 1 2385 result_status = result.get("status", "unknown") 2386 remaining -= 1 2387 if result_status == "accomplished": 2388 passed += 1 2389 elif result_status == "partial": 2390 partial += 1 2391 2392 if self.stop_early: 2393 if passed >= self.required: 2394 self.result = { 2395 "status": "accomplished", 2396 "goals_evaluated": goals_evaluated 2397 } 2398 self.set_explanation(context, default=pass_msg) 2399 return self.result 2400 elif ( 2401 ( 2402 passed + partial >= self.required 2403 and passed + remaining < self.required 2404 ) 2405 or ( 2406 self.partial_required is not None 2407 and passed >= self.partial_required 2408 and passed + remaining < self.required 2409 ) 2410 ): 2411 self.result = { 2412 "status": "partial", 2413 "goals_evaluated": goals_evaluated 2414 } 2415 self.set_explanation(context, default=partial_msg) 2416 return self.result 2417 2418 if passed >= self.required: 2419 self.result = { 2420 "status": "accomplished", 2421 "goals_evaluated": goals_evaluated 2422 } 2423 self.set_explanation(context, default=pass_msg) 2424 return self.result 2425 elif ( 2426 (passed + partial >= self.required) 2427 or ( 2428 self.partial_required is not None 2429 and passed >= self.partial_required 2430 ) 2431 ): 2432 self.result = { 2433 "status": "partial", 2434 "goals_evaluated": goals_evaluated 2435 } 2436 self.set_explanation(context, default=partial_msg) 2437 return self.result 2438 else: 2439 self.result = { 2440 "status": "failed", 2441 "goals_evaluated": goals_evaluated 2442 } 2443 self.set_explanation(context, default=fail_msg) 2444 return self.result
To evaluate a `JointGoal`, we evaluate each subgoal in order. If at least the required number of them are "accomplished", the joint goal is also "accomplished". If not, but at least the required number are either "accomplished" or "partial", the joint goal is "partial". Otherwise, it is "failed". If the result is known before all goals are evaluated, the `JointGoal` will skip unnecessary parts, as long as it was created with stop_early=True.
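The final status determination reduces to threshold checks on the tallies of full and partial successes; here is a standalone sketch of just that logic (the function name is ours, not part of the module):

def joint_status(passed, partial, required, partial_required=None):
    # Mirrors the end-of-loop checks above: full credit when enough
    # parts fully passed; partial credit when full plus partial passes
    # meet the requirement, or when full passes alone meet the
    # partial_required threshold.
    if passed >= required:
        return "accomplished"
    elif (passed + partial >= required
          or (partial_required is not None
              and passed >= partial_required)):
        return "partial"
    else:
        return "failed"

For example, with required=2, one full and one partial success gives joint_status(1, 1, 2) == "partial".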
2447class FailGoal(Goal): 2448 """ 2449 A fail goal simply swaps accomplished for failed and vice versa in 2450 the result of a sub-goal. 2451 """ 2452 def __init__( 2453 self, 2454 taskid, 2455 identifier, 2456 description=None, 2457 goal=None, 2458 permit_partial=True, 2459 **kwargs 2460 ): 2461 """ 2462 Requires a task ID, an identifier, and a subgoal, with optional 2463 description, explanations, and tags. The description should 2464 generally be phrased as the negation of the subgoal's 2465 description, and the default (if None is given explicitly) is to 2466 add "Do not " in front of the subgoal's description title and add 2467 " You need to avoid this." to the end of its details. 2468 2469 The categorizer "fail:" is prepended to the identifier. 2470 2471 If permit_partial is specified, True means that partial success 2472 of the subgoal is partial success of this goal (the default), and 2473 False means that even partial success of the subgoal is full 2474 failure of this goal. 2475 """ 2476 # Auto description 2477 if description is None: 2478 subrub = goal.description 2479 subtitle, subdetails = subrub 2480 if subtitle[0].isupper(): 2481 subtitle = subtitle[0].lower() + subtitle[1:] 2482 description = ( 2483 "Do not " + subtitle, 2484 subdetails + " You need to avoid this." 2485 ) 2486 2487 # Lift goal type from sub-goal 2488 goal_type = goal.tags.get("goal_type", "other") 2489 2490 super().__init__( 2491 taskid, 2492 "fail:" + identifier, 2493 description, 2494 **kwargs 2495 ) 2496 self.set_default_goal_type(goal_type) 2497 2498 if goal is None: 2499 raise ValueError("A FailGoal must be provided a subgoal!") 2500 self.goal = goal 2501 self.permit_partial = permit_partial 2502 2503 def subgoals(self): 2504 """ 2505 List of subgoals of this goal (just our single goal). 2506 """ 2507 if self.goal: 2508 return [ self.goal ] 2509 else: 2510 return [] 2511 2512 def table(self, blank=False): 2513 """ 2514 The table for a `FailGoal` is a copy of it's subgoal's table, 2515 with the status, description, and explanation from the 2516 `FailGoal`'s result. This means that the `FailGoal` itself does 2517 not appear as a separate entry in rubric tables. Any tags for the 2518 `FailGoal` are added to the tags of the subgoal. 2519 2520 See `Goal.table` regarding the table format. 2521 """ 2522 row = self.goal.table(blank=blank)[0] 2523 category = self.tags.get("category", "unknown") 2524 row["id"] = "goal:" + category + '.' + self.identifier 2525 row["description"] = list(self.description[:]) 2526 row["tags"] = list(set(row["tags"]) | set(self.tags)) 2527 if not blank: 2528 row["status"] = self.result["status"] 2529 row["explanation"] = self.result["explanation"] 2530 2531 return [ row ] 2532 2533 def evaluate_in_context(self, context=None): 2534 """ 2535 Evaluates the sub-goal, and returns a result which replaces 2536 "accomplished" with "failed" and vice versa. Does not affect a 2537 result of "partial" unless permit_partial is set to False, in 2538 which case a "partial" result is converted to "failed." 
2539 """ 2540 context = context or {} 2541 self.result = {} 2542 self.result.update(self.goal.evaluate(context)) 2543 if self.result["status"] == "accomplished": 2544 self.result["status"] = "failed" 2545 elif self.result["status"] == "failed": 2546 self.result["status"] = "accomplished" 2547 elif self.result["status"] == "partial" and not self.permit_partial: 2548 self.result["status"] = "failed" 2549 # else don't modify the status 2550 2551 # Update explanation from sub_result only if we have a matching 2552 # explanation function. 2553 self.set_explanation(context, default=self.result["explanation"]) 2554 2555 return self.result
A fail goal simply swaps accomplished for failed and vice versa in the result of a sub-goal.
2452 def __init__( 2453 self, 2454 taskid, 2455 identifier, 2456 description=None, 2457 goal=None, 2458 permit_partial=True, 2459 **kwargs 2460 ): 2461 """ 2462 Requires a task ID, an identifier, and a subgoal, with optional 2463 description, explanations, and tags. The description should 2464 generally be phrased as the negation of the subgoal's 2465 description, and the default (if None is given explicitly) is to 2466 add "Do not " in front of the subgoal's description title and add 2467 " You need to avoid this." to the end of its details. 2468 2469 The categorizer "fail:" is prepended to the identifier. 2470 2471 If permit_partial is specified, True means that partial success 2472 of the subgoal is partial success of this goal (the default), and 2473 False means that even partial success of the subgoal is full 2474 failure of this goal. 2475 """ 2476 # Auto description 2477 if description is None: 2478 subrub = goal.description 2479 subtitle, subdetails = subrub 2480 if subtitle[0].isupper(): 2481 subtitle = subtitle[0].lower() + subtitle[1:] 2482 description = ( 2483 "Do not " + subtitle, 2484 subdetails + " You need to avoid this." 2485 ) 2486 2487 # Lift goal type from sub-goal 2488 goal_type = goal.tags.get("goal_type", "other") 2489 2490 super().__init__( 2491 taskid, 2492 "fail:" + identifier, 2493 description, 2494 **kwargs 2495 ) 2496 self.set_default_goal_type(goal_type) 2497 2498 if goal is None: 2499 raise ValueError("A FailGoal must be provided a subgoal!") 2500 self.goal = goal 2501 self.permit_partial = permit_partial
Requires a task ID, an identifier, and a subgoal, with optional description, explanations, and tags. The description should generally be phrased as the negation of the subgoal's description, and the default (if None is given explicitly) is to add "Do not " in front of the subgoal's description title and add " You need to avoid this." to the end of its details.
The categorizer "fail:" is prepended to the identifier.
If permit_partial is specified, True means that partial success of the subgoal is partial success of this goal (the default), and False means that even partial success of the subgoal is full failure of this goal.
2503 def subgoals(self): 2504 """ 2505 List of subgoals of this goal (just our single goal). 2506 """ 2507 if self.goal: 2508 return [ self.goal ] 2509 else: 2510 return []
List of subgoals of this goal (just our single goal).
2512 def table(self, blank=False): 2513 """ 2514 The table for a `FailGoal` is a copy of it's subgoal's table, 2515 with the status, description, and explanation from the 2516 `FailGoal`'s result. This means that the `FailGoal` itself does 2517 not appear as a separate entry in rubric tables. Any tags for the 2518 `FailGoal` are added to the tags of the subgoal. 2519 2520 See `Goal.table` regarding the table format. 2521 """ 2522 row = self.goal.table(blank=blank)[0] 2523 category = self.tags.get("category", "unknown") 2524 row["id"] = "goal:" + category + '.' + self.identifier 2525 row["description"] = list(self.description[:]) 2526 row["tags"] = list(set(row["tags"]) | set(self.tags)) 2527 if not blank: 2528 row["status"] = self.result["status"] 2529 row["explanation"] = self.result["explanation"] 2530 2531 return [ row ]
The table for a `FailGoal` is a copy of its subgoal's table, with the status, description, and explanation from the `FailGoal`'s result. This means that the `FailGoal` itself does not appear as a separate entry in rubric tables. Any tags for the `FailGoal` are added to the tags of the subgoal.
See `Goal.table` regarding the table format.
2533 def evaluate_in_context(self, context=None): 2534 """ 2535 Evaluates the sub-goal, and returns a result which replaces 2536 "accomplished" with "failed" and vice versa. Does not affect a 2537 result of "partial" unless permit_partial is set to False, in 2538 which case a "partial" result is converted to "failed." 2539 """ 2540 context = context or {} 2541 self.result = {} 2542 self.result.update(self.goal.evaluate(context)) 2543 if self.result["status"] == "accomplished": 2544 self.result["status"] = "failed" 2545 elif self.result["status"] == "failed": 2546 self.result["status"] = "accomplished" 2547 elif self.result["status"] == "partial" and not self.permit_partial: 2548 self.result["status"] = "failed" 2549 # else don't modify the status 2550 2551 # Update explanation from sub_result only if we have a matching 2552 # explanation function. 2553 self.set_explanation(context, default=self.result["explanation"]) 2554 2555 return self.result
Evaluates the sub-goal, and returns a result which replaces "accomplished" with "failed" and vice versa. Does not affect a result of "partial" unless permit_partial is set to False, in which case a "partial" result is converted to "failed."
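The status inversion at the heart of this method is a small mapping; a sketch of that step in isolation (the helper name is ours):

def inverted_status(status, permit_partial=True):
    # "accomplished" and "failed" trade places; "partial" is kept
    # unless permit_partial is False, in which case it becomes "failed".
    if status == "accomplished":
        return "failed"
    elif status == "failed":
        return "accomplished"
    elif status == "partial" and not permit_partial:
        return "failed"
    else:
        return status

So inverted_status("partial", permit_partial=False) yields "failed", while inverted_status("partial") leaves the status unchanged.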
2558class PreconditionGoal(Goal): 2559 """ 2560 A precondition goal requires that a condition goal is achieved, and 2561 only if it is does it return an evaluation based on a subgoal. 2562 """ 2563 def __init__( 2564 self, 2565 taskid, 2566 identifier, 2567 description=( 2568 "BLANK PRECONDITION GOAL", 2569 "THIS GOAL HAS NOT BEEN DEFINED" 2570 ), 2571 precondition=None, 2572 goal=None, 2573 **kwargs 2574 ): 2575 """ 2576 You must provide a task ID, an identifier, a description, a 2577 precondition goal, and a subgoal. 2578 2579 The categorizer "precondition:" is prepended to the identifier. 2580 """ 2581 # Pre-specify goal type tag 2582 subgoal_types = set() 2583 for sg in [precondition, goal]: 2584 subgoal_types |= set( 2585 [t for t in sg.tags if t in GOAL_TYPE_RUBRICS] 2586 ) 2587 2588 if len(subgoal_types) == 1: 2589 goal_type = list(subgoal_types)[0] 2590 else: 2591 # Zero or more than one explicit subgoal type 2592 goal_type = "other" 2593 2594 super().__init__( 2595 taskid, 2596 "precondition:" + identifier, 2597 description, 2598 **kwargs 2599 ) 2600 self.set_default_goal_type(goal_type) 2601 if precondition is None or goal is None: 2602 raise ValueError( 2603 "A PreconditionGoal must have both a precondition and a goal!" 2604 ) 2605 self.precondition = precondition 2606 self.goal = goal 2607 2608 def subgoals(self): 2609 """ 2610 List of subgoals of this goal (our precondition and our goal). 2611 """ 2612 return [ self.precondition, self.goal ] 2613 2614 def evaluate_in_context(self, context={}): 2615 """ 2616 To evaluate a `PreconditionGoal`, we evaluate the precondition. If 2617 it does not evaluate as "accomplished," then the entire goal 2618 evaluates to "failed" immediately. If it does evaluate to 2619 "accomplished," the final goal is evaluated and that result is 2620 returned. 2621 2622 If the precondition passes, it is not mentioned in the 2623 explanation that results, but if it fails, its failure 2624 explanation is used as the explanation for this goal's failure. 2625 2626 Even if the precondition passes, this node's explanation function 2627 is still run on the results, but if it fails, the special 2628 explanation status "precondition_failed" is used (to 2629 differentiate from a failed sub-goal post-precondition). 2630 """ 2631 pre = self.precondition.evaluate(context) 2632 if pre.get("status") != "accomplished": 2633 self.result = { 2634 "status": "failed", 2635 "precondition_failed": True, 2636 } 2637 self.set_explanation( 2638 context, 2639 status="precondition_failed", 2640 default="Precondition failed:<br>\n{}".format( 2641 pre.get("explanation", "Cause unknown") 2642 ) 2643 ) 2644 return self.result 2645 else: 2646 self.result = self.goal.evaluate(context) 2647 self.result["precondition_failed"] = False 2648 self.set_explanation(context, default=self.result["explanation"]) 2649 return self.result 2650 2651 def table(self, blank=False): 2652 """ 2653 A `PreconditionGoal`'s table depends on the result from its 2654 precondition. If the precondition failed, the table will be the 2655 precondition's table; otherwise it will be the main goal's table. 2656 The fact that there is a precondition is thus not visible from 2657 the table unless the precondition fails. 2658 TODO: Not that? 2659 2660 See `Goal.table` regarding the table format. 2661 """ 2662 if self.result.get("precondition_failed", False): 2663 return self.precondition.table(blank=blank) 2664 else: 2665 return self.goal.table(blank=blank)
A precondition goal requires that a condition goal is achieved, and only if it is does it return an evaluation based on a subgoal.
2563 def __init__( 2564 self, 2565 taskid, 2566 identifier, 2567 description=( 2568 "BLANK PRECONDITION GOAL", 2569 "THIS GOAL HAS NOT BEEN DEFINED" 2570 ), 2571 precondition=None, 2572 goal=None, 2573 **kwargs 2574 ): 2575 """ 2576 You must provide a task ID, an identifier, a description, a 2577 precondition goal, and a subgoal. 2578 2579 The categorizer "precondition:" is prepended to the identifier. 2580 """ 2581 # Pre-specify goal type tag 2582 subgoal_types = set() 2583 for sg in [precondition, goal]: 2584 subgoal_types |= set( 2585 [t for t in sg.tags if t in GOAL_TYPE_RUBRICS] 2586 ) 2587 2588 if len(subgoal_types) == 1: 2589 goal_type = list(subgoal_types)[0] 2590 else: 2591 # Zero or more than one explicit subgoal type 2592 goal_type = "other" 2593 2594 super().__init__( 2595 taskid, 2596 "precondition:" + identifier, 2597 description, 2598 **kwargs 2599 ) 2600 self.set_default_goal_type(goal_type) 2601 if precondition is None or goal is None: 2602 raise ValueError( 2603 "A PreconditionGoal must have both a precondition and a goal!" 2604 ) 2605 self.precondition = precondition 2606 self.goal = goal
You must provide a task ID, an identifier, a description, a precondition goal, and a subgoal.
The categorizer "precondition:" is prepended to the identifier.
2608 def subgoals(self): 2609 """ 2610 List of subgoals of this goal (our precondition and our goal). 2611 """ 2612 return [ self.precondition, self.goal ]
List of subgoals of this goal (our precondition and our goal).
2614 def evaluate_in_context(self, context={}): 2615 """ 2616 To evaluate a `PreconditionGoal`, we evaluate the precondition. If 2617 it does not evaluate as "accomplished," then the entire goal 2618 evaluates to "failed" immediately. If it does evaluate to 2619 "accomplished," the final goal is evaluated and that result is 2620 returned. 2621 2622 If the precondition passes, it is not mentioned in the 2623 explanation that results, but if it fails, its failure 2624 explanation is used as the explanation for this goal's failure. 2625 2626 Even if the precondition passes, this node's explanation function 2627 is still run on the results, but if it fails, the special 2628 explanation status "precondition_failed" is used (to 2629 differentiate from a failed sub-goal post-precondition). 2630 """ 2631 pre = self.precondition.evaluate(context) 2632 if pre.get("status") != "accomplished": 2633 self.result = { 2634 "status": "failed", 2635 "precondition_failed": True, 2636 } 2637 self.set_explanation( 2638 context, 2639 status="precondition_failed", 2640 default="Precondition failed:<br>\n{}".format( 2641 pre.get("explanation", "Cause unknown") 2642 ) 2643 ) 2644 return self.result 2645 else: 2646 self.result = self.goal.evaluate(context) 2647 self.result["precondition_failed"] = False 2648 self.set_explanation(context, default=self.result["explanation"]) 2649 return self.result
To evaluate a `PreconditionGoal`, we evaluate the precondition. If it does not evaluate as "accomplished," then the entire goal evaluates to "failed" immediately. If it does evaluate to "accomplished," the final goal is evaluated and that result is returned.
If the precondition passes, it is not mentioned in the explanation that results, but if it fails, its failure explanation is used as the explanation for this goal's failure.
Even if the precondition passes, this node's explanation function is still run on the results, but if it fails, the special explanation status "precondition_failed" is used (to differentiate from a failed sub-goal post-precondition).
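As an illustration, a sketch of pairing a precondition with a main goal; `defines_helper` and `helper_returns_list` stand in for previously constructed `Goal` objects, and the task ID and description text are likewise hypothetical:

from potluck.rubrics import PreconditionGoal

# If defines_helper is not accomplished, guarded fails immediately and
# reports the precondition's explanation; otherwise the result of
# helper_returns_list is used directly.
guarded = PreconditionGoal(
    "task1",
    "guarded_return_check",
    ("Helper returns a list", "Your helper function must return a list."),
    precondition=defines_helper,
    goal=helper_returns_list
)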
2651 def table(self, blank=False): 2652 """ 2653 A `PreconditionGoal`'s table depends on the result from its 2654 precondition. If the precondition failed, the table will be the 2655 precondition's table; otherwise it will be the main goal's table. 2656 The fact that there is a precondition is thus not visible from 2657 the table unless the precondition fails. 2658 TODO: Not that? 2659 2660 See `Goal.table` regarding the table format. 2661 """ 2662 if self.result.get("precondition_failed", False): 2663 return self.precondition.table(blank=blank) 2664 else: 2665 return self.goal.table(blank=blank)
A `PreconditionGoal`'s table depends on the result from its precondition. If the precondition failed, the table will be the precondition's table; otherwise it will be the main goal's table. The fact that there is a precondition is thus not visible from the table unless the precondition fails.
TODO: Not that?
See `Goal.table` regarding the table format.
2668class ComparisonTest(Goal): 2669 """ 2670 Runs a checker function on two arbitrary context slots. 2671 """ 2672 def __init__( 2673 self, 2674 taskid, 2675 identifier, 2676 description=( 2677 "BLANK COMPARISON TEST", 2678 "THIS GOAL HAS NOT BEEN DEFINED" 2679 ), 2680 context_slot="value", 2681 checker=None, 2682 ref_slot=None, 2683 **kwargs 2684 ): 2685 """ 2686 In addition to a task ID (string) and an identifier (string), a 2687 description, and optional explanations and/or tags (see the 2688 `Goal` class), a checker function is needed, which should accept 2689 value and reference objects and return a goal result (a 2690 dictionary with status + explanation keys). The context_slot is 2691 used to determine which slot in the current context to check, and 2692 ref_slot specifies where to get the reference object, although if 2693 not given it will default to "ref_" + context_slot. 2694 2695 The categorizer "test:" is prepended to the identifier. 2696 2697 If the checker is omitted or given explicitly as None, the goal 2698 will succeed as long as the appropriate context_slot (and 2699 ref_slot) are present, and will only fail if the assigned context 2700 fails to even establish those keys. 2701 2702 If the ref_slot is the same as the context_slot, the checker 2703 function will be called with only one value. 2704 """ 2705 super().__init__( 2706 taskid, 2707 "test:" + identifier, 2708 description, 2709 **kwargs 2710 ) 2711 self.context_slot = context_slot 2712 self.checker = checker 2713 if ref_slot is None: 2714 ref_slot = "ref_" + context_slot 2715 self.ref_slot = ref_slot 2716 2717 # subgoals is inherited (no subgoals) 2718 2719 # table is inherited 2720 2721 def evaluate_in_context(self, context=None): 2722 """ 2723 Runs the checker and returns its result. 2724 """ 2725 context = context or {} 2726 2727 if self.checker is None: 2728 if self.context_slot in context and self.ref_slot in context: 2729 self.result = { 2730 "status": "accomplished", 2731 "explanation": ( 2732 "Successfully established '{}' context." 2733 ).format(self.context_slot) 2734 } 2735 elif self.context_slot not in context: 2736 self.result = { 2737 "status": "failed", 2738 "explanation": ( 2739 "Failed to establish '{}' context." 2740 ).format(self.context_slot) 2741 } 2742 else: 2743 self.result = { 2744 "status": "failed", 2745 "explanation": ( 2746 "Failed to establish '{}' context." 2747 ).format(self.ref_slot) 2748 } 2749 else: 2750 try: 2751 val = context[self.context_slot] 2752 except KeyError: 2753 self.result = { 2754 "status": "failed", 2755 "traceback": html_tools.html_traceback( 2756 linkable=context_utils.linkmap(context) 2757 ) 2758 } 2759 self.set_explanation( 2760 context, 2761 status="crash", 2762 default=( 2763 "Could not access '{}' for testing." 2764 " Context has keys:<br>{}" 2765 ).format( 2766 self.context_slot, 2767 ', '.join(repr(k) for k in context.keys()) 2768 ) 2769 ) 2770 return self.result 2771 2772 try: 2773 ref = context[self.ref_slot] 2774 except KeyError: 2775 self.result = { 2776 "status": "failed", 2777 "traceback": html_tools.html_traceback( 2778 linkable=context_utils.linkmap(context) 2779 ) 2780 } 2781 self.set_explanation( 2782 context, 2783 status="crash", 2784 default=( 2785 "Could not access '{}' for testing." 
2786 " Context has keys:<br>{}" 2787 ).format( 2788 self.ref_slot, 2789 ', '.join(repr(k) for k in context.keys()) 2790 ) 2791 ) 2792 return self.result 2793 2794 try: 2795 if self.context_slot == self.ref_slot: 2796 self.result = self.checker(val) 2797 else: 2798 self.result = self.checker(val, ref) 2799 2800 if self.result is None: 2801 raise ValueError( 2802 "Context checker {} returned None!".format( 2803 self.checker 2804 ) 2805 ) 2806 2807 self.set_explanation( 2808 context, 2809 default=self.result["explanation"] 2810 ) 2811 except Exception: 2812 self.result = { 2813 "status": "failed", 2814 "traceback": html_tools.html_traceback( 2815 linkable=context_utils.linkmap(context) 2816 ) 2817 } 2818 self.set_explanation( 2819 context, 2820 status="crash", 2821 default=html_tools.html_traceback( 2822 title="Error while checking {}:".format( 2823 self.context_slot 2824 ), 2825 linkable=context_utils.linkmap(context) 2826 ) 2827 ) 2828 2829 return self.result
Runs a checker function on two arbitrary context slots.
2672 def __init__( 2673 self, 2674 taskid, 2675 identifier, 2676 description=( 2677 "BLANK COMPARISON TEST", 2678 "THIS GOAL HAS NOT BEEN DEFINED" 2679 ), 2680 context_slot="value", 2681 checker=None, 2682 ref_slot=None, 2683 **kwargs 2684 ): 2685 """ 2686 In addition to a task ID (string) and an identifier (string), a 2687 description, and optional explanations and/or tags (see the 2688 `Goal` class), a checker function is needed, which should accept 2689 value and reference objects and return a goal result (a 2690 dictionary with status + explanation keys). The context_slot is 2691 used to determine which slot in the current context to check, and 2692 ref_slot specifies where to get the reference object, although if 2693 not given it will default to "ref_" + context_slot. 2694 2695 The categorizer "test:" is prepended to the identifier. 2696 2697 If the checker is omitted or given explicitly as None, the goal 2698 will succeed as long as the appropriate context_slot (and 2699 ref_slot) are present, and will only fail if the assigned context 2700 fails to even establish those keys. 2701 2702 If the ref_slot is the same as the context_slot, the checker 2703 function will be called with only one value. 2704 """ 2705 super().__init__( 2706 taskid, 2707 "test:" + identifier, 2708 description, 2709 **kwargs 2710 ) 2711 self.context_slot = context_slot 2712 self.checker = checker 2713 if ref_slot is None: 2714 ref_slot = "ref_" + context_slot 2715 self.ref_slot = ref_slot
In addition to a task ID (string), an identifier (string), a description, and optional explanations and/or tags (see the `Goal` class), a checker function is needed, which should accept value and reference objects and return a goal result (a dictionary with status + explanation keys). The context_slot is used to determine which slot in the current context to check, and ref_slot specifies where to get the reference object, although if not given it will default to "ref_" + context_slot.
The categorizer "test:" is prepended to the identifier.
If the checker is omitted or given explicitly as None, the goal will succeed as long as the appropriate context_slot (and ref_slot) are present, and will only fail if the assigned context fails to even establish those keys.
If the ref_slot is the same as the context_slot, the checker function will be called with only one value.
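To make the checker contract concrete, here is a hedged sketch of a simple equality checker and a test built around it; the "output" slot name, the helper, and the other specifics are illustrative assumptions, not part of this module:

from potluck.rubrics import ComparisonTest

def same_output(val, ref):
    # A checker receives the submitted value and the reference value
    # and must return a goal result dictionary.
    if val == ref:
        return {
            "status": "accomplished",
            "explanation": "Printed output matches the solution's."
        }
    else:
        return {
            "status": "failed",
            "explanation": "Printed output differs from the solution's."
        }

output_test = ComparisonTest(
    "task1",
    "output_matches",
    ("Correct printed output", "Your output must match the solution's."),
    context_slot="output",  # the reference defaults to "ref_output"
    checker=same_output
)

With checker=None, the same goal instead functions as a pure existence check on the two slots, as described above.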
2721 def evaluate_in_context(self, context=None): 2722 """ 2723 Runs the checker and returns its result. 2724 """ 2725 context = context or {} 2726 2727 if self.checker is None: 2728 if self.context_slot in context and self.ref_slot in context: 2729 self.result = { 2730 "status": "accomplished", 2731 "explanation": ( 2732 "Successfully established '{}' context." 2733 ).format(self.context_slot) 2734 } 2735 elif self.context_slot not in context: 2736 self.result = { 2737 "status": "failed", 2738 "explanation": ( 2739 "Failed to establish '{}' context." 2740 ).format(self.context_slot) 2741 } 2742 else: 2743 self.result = { 2744 "status": "failed", 2745 "explanation": ( 2746 "Failed to establish '{}' context." 2747 ).format(self.ref_slot) 2748 } 2749 else: 2750 try: 2751 val = context[self.context_slot] 2752 except KeyError: 2753 self.result = { 2754 "status": "failed", 2755 "traceback": html_tools.html_traceback( 2756 linkable=context_utils.linkmap(context) 2757 ) 2758 } 2759 self.set_explanation( 2760 context, 2761 status="crash", 2762 default=( 2763 "Could not access '{}' for testing." 2764 " Context has keys:<br>{}" 2765 ).format( 2766 self.context_slot, 2767 ', '.join(repr(k) for k in context.keys()) 2768 ) 2769 ) 2770 return self.result 2771 2772 try: 2773 ref = context[self.ref_slot] 2774 except KeyError: 2775 self.result = { 2776 "status": "failed", 2777 "traceback": html_tools.html_traceback( 2778 linkable=context_utils.linkmap(context) 2779 ) 2780 } 2781 self.set_explanation( 2782 context, 2783 status="crash", 2784 default=( 2785 "Could not access '{}' for testing." 2786 " Context has keys:<br>{}" 2787 ).format( 2788 self.ref_slot, 2789 ', '.join(repr(k) for k in context.keys()) 2790 ) 2791 ) 2792 return self.result 2793 2794 try: 2795 if self.context_slot == self.ref_slot: 2796 self.result = self.checker(val) 2797 else: 2798 self.result = self.checker(val, ref) 2799 2800 if self.result is None: 2801 raise ValueError( 2802 "Context checker {} returned None!".format( 2803 self.checker 2804 ) 2805 ) 2806 2807 self.set_explanation( 2808 context, 2809 default=self.result["explanation"] 2810 ) 2811 except Exception: 2812 self.result = { 2813 "status": "failed", 2814 "traceback": html_tools.html_traceback( 2815 linkable=context_utils.linkmap(context) 2816 ) 2817 } 2818 self.set_explanation( 2819 context, 2820 status="crash", 2821 default=html_tools.html_traceback( 2822 title="Error while checking {}:".format( 2823 self.context_slot 2824 ), 2825 linkable=context_utils.linkmap(context) 2826 ) 2827 ) 2828 2829 return self.result
Runs the checker and returns its result.
2832class ImplementationCheck(Goal): 2833 """ 2834 An `ImplementationCheck` inspects the AST of submitted code to 2835 determine whether it counts as accomplished or failed. An 2836 `ImplementationCheck`'s subrules must all be accomplished for the 2837 parent check to count as accomplished. An `ImplementationCheck` looks 2838 for the first match that can satisfy its subrules. 2839 2840 `ImplementationCheck`s by default run on the 'scope' context slot 2841 which contains an AST for the submitted module, or (via refinement by 2842 `ImplementationCheck`s) a subset of that code. When created, unless 2843 explicit dependencies are specified via a `test_in` keyword argument, 2844 each `ImplementationCheck` will grab the current automatic "scope" 2845 context as its only dependency. 2846 """ 2847 def __init__( 2848 self, 2849 taskid, 2850 identifier, 2851 description=( 2852 "BLANK IMPLEMENTATION CHECK", 2853 "THIS GOAL HAS NOT BEEN DEFINED" 2854 ), 2855 pattern="_", 2856 name=None, 2857 match=lambda code, node, env: True, 2858 use=None, 2859 min=None, max=None, 2860 softmin=False, softmax=False, 2861 outside=None, 2862 callees=False, 2863 subrules=None, 2864 match_identity=lambda code, node, envs: ( 2865 tuple(node) if isinstance(node, list) else node 2866 ), 2867 subslip=None, 2868 normalize=False, 2869 check_in_def=False, 2870 force_smaller_match=False, 2871 **kwargs 2872 ): 2873 """ 2874 A task ID, an identifier, and a description are required (see the 2875 `Goal` class). An appropriate `test_in` dictionary which will 2876 provide a "scope" slot is typically required. 2877 2878 The categorizer "check:" is prepended to the identifier. 2879 2880 `ImplementationCheck` itself uses the following arguments: 2881 2882 - pattern: A string containing Python code that will be matched 2883 against using mast. May instead be a list of strings, in 2884 which case they will be tried in turn to generate matches. 2885 - name: specifies a name for the construct being searched for. 2886 The plural will be constructed by adding 's', unless name is 2887 a tuple, in which case the first entry will be used as the 2888 singular and the second as the plural. May contain HTML code. 2889 If pattern is not a list, this can be left out, and the 2890 pattern will be used as the name. 2891 - match: A function that accepts the entire submitted AST, the 2892 node being considered for a match right now, and the current 2893 binding environment. This function should return True or 2894 False, and any matches for which it does not return True will 2895 be ignored. 2896 - use/min/max: Either the 'use' argument, or one or both of the 2897 'min' and 'max' arguments should be given, but not both. 2898 Supplying 'use' sets both 'min' and 'max' to that value. If 2899 'max' is 0, the pattern is considered a negative pattern, and 2900 the goal will fail if any matches are found. Otherwise, the 2901 goal will succeed if the number of matches is between the 2902 given min and max values, inclusive. If none of these are 2903 given, the min defaults to 1 and the max to None (no limit). 2904 - softmin/softmax: If one of these is true, the minimum (or 2905 maximum) restriction on the number of matches will be treated 2906 as a soft constraint, and if violated the goal will be 2907 treated as partially accomplished instead of failed. 
If they 2908 are exactly either the string "warn" or "note", then the goal 2909 will still count as fully accomplished if that constraint is 2910 violated, but a warning or note will be attached mentioning 2911 the unexpectedly low/high number of matches. They may also be 2912 integers or floats, in which case they establish an alternate 2913 min/max threshold for partial completion. For softmin, 2914 partial matches are counted as 0.5 of a match towards 2915 achieving the threshold, but for softmax partial matches are 2916 ignored. 2917 - outside: If present, the 'outside' pattern (or list of 2918 patterns) is checked, and matches will only be considered 2919 valid if they are not sub-nodes of a match for one of the 2920 given outside patterns. 2921 - callees: If given as True, instead of simply searching within 2922 the context's scope node, this check will look for matches 2923 within other functions defined in the submitted code which 2924 are called from within the given scope node. TODO: This is 2925 still (as of 2020-6) experimental/unstable. 2926 - subrules: A list of `ImplementationCheck` goals to be tested 2927 within matches of this goal. Only matches where this goal and 2928 all of its subrules are accomplished (or partially 2929 accomplished) will be considered valid (respectively, 2930 partially valid). If this goal is a negative goal (max = 0), 2931 it fails if there are any fully valid matches, and partial 2932 matches are ignored. On the other hand, if it is a positive 2933 goal (max != 0), it counts as accomplished if the number of 2934 fully valid matches is within the min and max limits 2935 (inclusive), and partially accomplished if the number of 2936 fully valid matches is below the min limit but the number of 2937 fully valid + partially valid matches is at least the min 2938 limit. 2939 - match_identity: a function that returns a hashable object to 2940 represent the identity of a match for match-counting 2941 purposes. The function will be given the entire code context, 2942 the matching node, and a list of matching environments as 2943 input. It may return a list of identities instead of a single 2944 identity and each will be counted. By default this is a 2945 function which just returns the matching node, such that 2946 multiple matching environments based on the same node are not 2947 counted as separate matches. One reasonable alternative if 2948 you know what type of node you're matching would be to return 2949 some associated string (e.g., the id of a Call node that has 2950 a Name as its func). 2951 - subslip: A number of subgoals which are allowed to be violated 2952 and still count a potential match as a partial match. May be 2953 fractional, since partially-matched subgoals will be counted 2954 as 1/2 a point. By default this number will be set equal to 2955 the number of subgoals, meaning that even if all subgoals 2956 fail a match for a specified structure will still count as a 2957 partial match. 2958 - normalize: default False; experimental mast option that tries 2959 to inline some local variable assignments into larger 2960 expressions for better matching. 
2961 - check_in_def: default False, this option changes the context 2962 within which the check occurs by default: the check will use 2963 the 'scope' element from the current context as usual, but 2964 will then assume that that AST node is a Call to a function 2965 defined in the same file (within the 'code' element) and will 2966 look up that definition, running the check in the context of 2967 that definition rather than in the original scope context 2968 given to it. This is useful for placing requirements on 2969 helper functions whose names aren't known ahead of time: a 2970 parent `ImplementationCheck` can be used to match the helper 2971 function call, with child checks using check_in_def that 2972 place requirements on the code in the helper function. The 2973 check will fail if the 'scope' context provided to it is not 2974 a Call node, or if it can't find the matching FunctionDef 2975 node in the 'code' tree of the context it's given. 2976 - force_smaller_match: default False. If set to True, a match 2977 which matches the entire target scope will not be considered a 2978 real match. Use this in places where you want to require 2979 things like nested loops, since otherwise a sub-requirement 2980 that's the same as a super-requirement will simply match the 2981 entire node matched by the super-requirement. 2982 """ 2983 # Grab parent context 2984 if "test_in" not in kwargs or kwargs["test_in"] is None: 2985 kwargs["test_in"] = {} 2986 if "contexts" not in kwargs["test_in"]: 2987 kwargs["test_in"]["contexts"] = contexts.auto("scope") 2988 2989 # Set up Goal properties 2990 super().__init__( 2991 taskid, 2992 "check:" + identifier, 2993 description, 2994 **kwargs 2995 ) 2996 self.set_default_goal_type("procedure") 2997 2998 # Ensure patterns is a list 2999 if isinstance(pattern, str): 3000 self.patterns = [ pattern ] 3001 else: 3002 self.patterns = pattern 3003 3004 # Figure out name 3005 if name is None: 3006 if len(self.patterns) > 1: 3007 raise ValueError( 3008 ( 3009 "When building an ImplementationCheck, if there are " 3010 + "multiple patterns, a name must be specified." 3011 + " (topic: '{}' / patterns: {})" 3012 ).format(self.feedback_topic(), self.patterns) 3013 ) 3014 else: 3015 self.name = self.patterns[0] 3016 self.pl_name = self.name + 's' 3017 elif isinstance(name, (list, tuple)): 3018 self.name, self.pl_name = name 3019 else: 3020 self.name = name 3021 self.pl_name = self.name + 's' 3022 3023 self.match = match 3024 3025 # Figure out min and max 3026 if (min is not None or max is not None) and use is not None: 3027 raise ValueError( 3028 ( 3029 "When building an ImplementationCheck, you may supply " 3030 + "*either* 'use' or 'min'/'max', but you may not supply " 3031 + "'use' if either 'min' or 'max' is given." 3032 + " (topic: '{}' / patterns: {})" 3033 ).format(self.feedback_topic(), self.patterns) 3034 ) 3035 elif use is not None: 3036 self.min_allowed = use 3037 self.max_allowed = use 3038 elif min is None and max is None: 3039 # Default is "at least 1" 3040 self.min_allowed = 1 3041 self.max_allowed = None 3042 else: 3043 self.min_allowed = min 3044 self.max_allowed = max 3045 3046 # Is this goal a positive goal (keep searching for any match 3047 # across possible environments?) or not (fail if any match is 3048 # found in any environment). 
3049 self.is_positive = self.max_allowed != 0 3050 3051 self.softmin = softmin 3052 self.softmax = softmax 3053 3054 # Make sure outside is a list 3055 if outside is None: 3056 self.outside = [] 3057 elif isinstance(outside, str): 3058 self.outside = [ outside ] 3059 else: 3060 self.outside = outside 3061 3062 self.callees = callees 3063 3064 self.force_smaller_match = force_smaller_match 3065 3066 # Set subrules 3067 if subrules is None: 3068 self.subrules = [] 3069 else: 3070 self.subrules = subrules 3071 3072 self.match_identity = match_identity 3073 3074 self.subslip = subslip 3075 if self.subslip is None: 3076 self.subslip = len(self.subrules) 3077 3078 self.normalize = normalize 3079 3080 self.check_in_def = check_in_def 3081 3082 def subgoals(self): 3083 """ 3084 List of subgoals of this goal (our precondition and our goal). 3085 """ 3086 return self.subrules 3087 3088 def table(self, blank=False): 3089 """ 3090 Includes sub-table with subrule statuses preserved from the last 3091 full match, or the last partial match if there are no full 3092 matches. 3093 3094 See `Goal.table` regarding the table format. 3095 """ 3096 result = super().table(blank=blank) 3097 3098 # Maybe add a subtable: 3099 if blank: 3100 result[0]["subtable"] = self.build_subtable(blank=blank) 3101 elif self.is_positive: 3102 # TODO: What about tables requested during pre-evaluation 3103 # description construction? 3104 result[0]["subtable"] = self.result.get("subtable") or [] 3105 elif self.result.get("status") != "accomplished": 3106 # For negative rules where we don't want any matches, reporting 3107 # the successful discovery of sub-rules only makes sense if 3108 # we failed the goal (because there was a match that 3109 # shouldn't have been there). 3110 result[0]["subtable"] = self.result.get("subtable") or [] 3111 # Otherwise don't attach a subtable (negative rules that 3112 # succeeded because they didn't have any full matches). 3113 3114 return result 3115 3116 def build_subtable(self, blank=False): 3117 """ 3118 Builds a sub-table using the results of each subrule as currently 3119 evaluated. 3120 """ 3121 result = [] 3122 for subrule in self.subrules: 3123 result.extend(subrule.table(blank=blank)) 3124 return result 3125 3126 def evaluate_in_context(self, context=None): 3127 """ 3128 Checks the rule within the 'scope' node of the given context, 3129 respecting bindings in the 'env' dictionary from the given 3130 context. Uses the entire submitted code if no scope is present, 3131 and uses an empty dictionary if there is no binding environment. 3132 Use build_code_context to establish a top-level scope beforehand 3133 if you are worried about parsing issues causing code to be 3134 missing. 
3135 """ 3136 # Grab scope and top-scope slots 3137 task_info = context_utils.extract(context, "task_info") 3138 scope = context_utils.extract(context, "scope") 3139 top_scope = context_utils.extract(context, "top_scope") 3140 filename = context_utils.extract(context, "filename") 3141 3142 # Create sub-context 3143 context = context or {} 3144 sub_context = {} 3145 sub_context.update(context) 3146 3147 # Create/extract matching environment 3148 if sub_context.get("env") is not None: 3149 env = sub_context["env"] 3150 else: 3151 env = {} 3152 3153 # Swap from the specified scope over to the matching definition 3154 # if check_in_def was specified: 3155 if self.check_in_def: 3156 if not isinstance(scope, ast.Call): 3157 raise context_utils.MissingContextError( 3158 "Attempt to check in a definition but parent check" 3159 + " didn't provide a function call to work from:" 3160 + "\n{}\n{}".format(scope, self.description) 3161 ) 3162 3163 if not isinstance(scope.func, ast.Name): 3164 raise context_utils.MissingContextError( 3165 "Attempt to check in a definition but the parent" 3166 + " check provided a function call with a complex func" 3167 + " expression:\n {}".format(scope) 3168 ) 3169 3170 defs = mast.findall( 3171 top_scope, 3172 "def {}(___):\n ___".format(scope.func.id), 3173 env=env, 3174 gen=False 3175 ) 3176 3177 if len(defs) == 0: 3178 raise context_utils.MissingContextError( 3179 ( 3180 "Attempt to check in a definition but the parent" 3181 + " check provided a function call (to {}) with no" 3182 + " matching definitions:\n {}" 3183 ).format(scope.func.id, scope) 3184 ) 3185 3186 # last definition overrides earlier ones if there are multiple 3187 last_node, last_envs = defs[-1] 3188 # TODO: DEBUG 3189 if last_node is None: 3190 print("None last_node") 3191 3192 scope = last_node 3193 # arbitrarily use first env; shouldn't be multiple we hope? 3194 env = last_envs[0] 3195 3196 # list of matching AST nodes 3197 matches = [] 3198 3199 # Scope our match predicate: 3200 my_match = self.match 3201 3202 # Our match filter: 3203 match_filter = lambda node, env: my_match(top_scope, node, env) 3204 3205 # Define match-collecting function 3206 def collect_matches(in_scope, memo=None): 3207 """ 3208 This local function collects matches to any of the patterns 3209 in this goal's patterns list, subject to the goal's matching 3210 rule. It accepts a scope (an AST node to search within) and 3211 uses a memo set to remember which callees have been 3212 investigated so that recursive functions with callees=True 3213 will not create an infinite loop. 3214 """ 3215 nonlocal self 3216 if memo is None: # remember which callees we've investigated 3217 memo = set() 3218 for pat in self.patterns: 3219 try: 3220 for node, envs in mast.findall( 3221 in_scope, 3222 pat, 3223 outside=self.outside, 3224 matchpred=match_filter, 3225 env=env, 3226 normalize=self.normalize, 3227 gen=True 3228 ): 3229 for prev_node, prev_envs in matches: 3230 if prev_node == node: 3231 # TODO: worry whether this duplicates envs? 3232 prev_envs.extend(envs) 3233 break 3234 else: # if we didn't ever break 3235 if not ( 3236 self.force_smaller_match 3237 and node is in_scope 3238 ): 3239 matches.append((node, envs)) 3240 3241 except Exception: 3242 # Rule checks shouldn't crash no matter what students 3243 # do... 
3244 traceback.print_exc() 3245 logging.log( 3246 ( 3247 'ERROR CHECKING RULE\n rule name: "{}"\n' 3248 + ' attempted pattern: {}' 3249 ).format(self.name, pat) 3250 ) 3251 raise # will be caught below 3252 3253 # Check for matches in callees too. 3254 # WARNINGS: 3255 # - Matches only calls where the function position is a name 3256 # (not an arbitrary expression) 3257 # - Searches the top-level task code node for this name 3258 # without understanding shadowing and without considering 3259 # arguments/parameters 3260 # - Attempts to match the full pattern within a single 3261 # function (currently cannot automatically split pattern 3262 # across a call) 3263 # - Likely to cause even more exponential blowup 3264 # - No attempts are made to respect scope when unifying 3265 # env with match environments in callees 3266 if self.callees: 3267 callee_names = set( 3268 call_env['f'].id 3269 for call_node, call_envs in mast.findall( 3270 in_scope, 3271 '_f_(___)', 3272 gen=True, 3273 matchpred=( 3274 lambda node, env: type(env['f']) == ast.Name 3275 ) 3276 ) # noqa: E123 3277 for call_env in call_envs 3278 ) 3279 # Exclude already-checked callees and update memo: 3280 callee_names -= memo 3281 memo |= callee_names 3282 # Check each callee 3283 for callee_name in callee_names: 3284 callee_patterns = [ 3285 pat.replace("_f_", callee_name) 3286 for pat in patterns.ALL_DEF_PATTERNS 3287 ] 3288 outside_patterns = [ 3289 pat.replace("_f_", in_scope.name) 3290 for pat in patterns.ALL_DEF_PATTERNS 3291 ] if type(scope) == ast.FunctionDef else [] 3292 for cpat in callee_patterns: 3293 for callee_def_node, callee_def_env in mast.findall( 3294 top_scope, 3295 cpat, 3296 outside=outside_patterns, 3297 gen=True 3298 ): 3299 collect_matches(callee_def_node, memo=memo) 3300 pass 3301 pass 3302 pass 3303 pass 3304 pass 3305 3306 # Now that we've defined our collect_matches function, let's use it: 3307 try: 3308 collect_matches(scope) 3309 except Exception: 3310 logging.log( 3311 '>>> WARNING: check_ast_rule exception:\n' 3312 + html_tools.string_traceback() 3313 + '\n<<<' 3314 ) 3315 logging.log( 3316 ( 3317 "Exception while performing ImplementationCheck:\n" 3318 "(topic: '{}', patterns: {})" 3319 ).format(self.feedback_topic(), self.patterns) 3320 ) 3321 self.result = { 3322 "status": "unknown", 3323 "traceback": html_tools.html_traceback( 3324 linkable=context_utils.linkmap(context) 3325 ), 3326 "warnings": [ 3327 "There was an error while checking your implementation." 
3328 ] 3329 } 3330 self.set_explanation( 3331 context, 3332 status="crash", 3333 default=html_tools.html_traceback( 3334 title="Error while checking implementation:", 3335 linkable=context_utils.linkmap(context) 3336 ) 3337 ) 3338 return self.result 3339 3340 # Used for messaging in presence of subrules: 3341 unrefined_match_count = len(matches) 3342 3343 # Refine matches by running subrules: 3344 partial_matches = [] 3345 full_matches = [] 3346 full_match_subtable = None 3347 partial_match_subtable = None 3348 closest_subtable = None 3349 closest_successes = -1 3350 closest_partials = -1 3351 for (node, envs) in matches: 3352 for env in envs: 3353 subsuccesses = 0 3354 subpartials = 0 3355 for rule in self.subrules: 3356 this_sub_context = {} 3357 this_sub_context.update(sub_context) 3358 this_sub_context["scope"] = node 3359 this_sub_context["env"] = env 3360 # evaluate sub-rule 3361 sub_result = rule.evaluate_in_context(this_sub_context) 3362 if sub_result["status"] == "accomplished": 3363 subsuccesses += 1 3364 elif sub_result["status"] == "partial": 3365 subpartials += 1 3366 3367 # tally sub-results 3368 if subsuccesses == len(self.subrules): 3369 # all succeeded: this is a full match 3370 if full_match_subtable is None: 3371 full_match_subtable = self.build_subtable() 3372 for prev_node, prev_envs in full_matches: 3373 if prev_node == node: 3374 prev_envs.append(env) 3375 break 3376 else: # if we didn't break 3377 full_matches.append((node, [env])) 3378 elif ( 3379 (subsuccesses + subpartials) == len(self.subrules) 3380 or ( 3381 (subsuccesses + subpartials / 2) 3382 >= (len(self.subrules) - self.subslip) 3383 ) 3384 ): 3385 # partially succeeded 3386 if partial_match_subtable is None: 3387 partial_match_subtable = self.build_subtable() 3388 for prev_node, prev_envs in partial_matches: 3389 if prev_node == node: 3390 prev_envs.append(env) 3391 break 3392 else: # if we didn't break 3393 partial_matches.append((node, [env])) 3394 elif ( 3395 subsuccesses > closest_successes 3396 or ( 3397 subsuccesses == closest_successes 3398 and subpartials > closest_partials 3399 ) 3400 ): 3401 # Best so far in terms of subrule successes 3402 closest_successes = subsuccesses 3403 closest_partials = subpartials 3404 closest_subtable = self.build_subtable() 3405 3406 # Get counts: 3407 full_match_identities = [] 3408 for n, envs in full_matches: 3409 identity_or_identities = self.match_identity(top_scope, n, envs) 3410 if isinstance(identity_or_identities, list): 3411 full_match_identities.extend(identity_or_identities) 3412 else: 3413 full_match_identities.append(identity_or_identities) 3414 3415 n_full_matches = len(set(full_match_identities)) 3416 3417 partial_match_identities = [] 3418 for n, envs in partial_matches: 3419 identity_or_identities = self.match_identity(top_scope, n, envs) 3420 if isinstance(identity_or_identities, list): 3421 partial_match_identities.extend(identity_or_identities) 3422 else: 3423 partial_match_identities.append(identity_or_identities) 3424 3425 n_partial_matches = len(set(partial_match_identities)) 3426 3427 # Check bounds now that we know which matches are partial/full: 3428 violated_min = ( 3429 self.min_allowed is not None 3430 and self.min_allowed > n_full_matches 3431 ) 3432 violated_max = ( 3433 self.max_allowed is not None 3434 and self.max_allowed < n_full_matches 3435 ) 3436 obeyed_min_partially = ( 3437 self.min_allowed is None 3438 or self.min_allowed <= n_partial_matches 3439 ) 3440 # You can't use partial matches to satisfy the max limit 3441 3442 # 
Notes and warnings for our ultimate result: 3443 notes = [] 3444 warnings = [] 3445 3446 # Assign status 3447 result_status = None 3448 if violated_min: 3449 if obeyed_min_partially: 3450 result_status = "partial" 3451 3452 if self.softmin: 3453 if isinstance(self.softmin, (str, list, tuple)): 3454 if "note" in self.softmin: 3455 notes.append( 3456 "Found fewer {} than expected.".format( 3457 self.pl_name 3458 ) 3459 ) 3460 3461 if "warn" in self.softmin: 3462 warnings.append( 3463 "Found fewer {} than expected.".format( 3464 self.pl_name 3465 ) 3466 ) 3467 3468 if "partial" in self.softmin: 3469 result_status = "partial" 3470 3471 if "fail" in self.softmin: 3472 result_status = "failed" 3473 3474 elif isinstance(self.softmin, (int, float)): 3475 matchpoints = n_full_matches + 0.5 * n_partial_matches 3476 if matchpoints >= self.softmin: 3477 result_status = "partial" 3478 else: 3479 result_status = "failed" 3480 else: 3481 result_status = "partial" 3482 3483 elif not obeyed_min_partially: 3484 result_status = "failed" 3485 3486 if violated_max: 3487 if self.softmax: 3488 if isinstance(self.softmax, (str, list, tuple)): 3489 if "note" in self.softmax: 3490 notes.append( 3491 f"Found more {self.pl_name} than expected." 3492 ) 3493 3494 if "warn" in self.softmax: 3495 warnings.append( 3496 f"Found more {self.pl_name} than expected." 3497 ) 3498 3499 if "partial" in self.softmax: 3500 # Don't upgrade failed (e.g. due to softmax): 3501 if result_status != "failed": 3502 result_status = "partial" 3503 3504 if "fail" in self.softmax: 3505 # old status is irrelevant 3506 result_status = "failed" 3507 3508 elif isinstance(self.softmax, (int, float)): 3509 # partial matches don't count against max 3510 if ( 3511 n_full_matches <= self.softmax 3512 and result_status != "failed" 3513 ): 3514 result_status = "partial" 3515 else: 3516 result_status = "failed" 3517 elif self.softmax: 3518 if result_status != "failed": 3519 result_status = "partial" 3520 else: 3521 result_status = "failed" 3522 3523 # No status assigned by min/max constraints? Then it's accomplished: 3524 if result_status is None: 3525 result_status = "accomplished" 3526 3527 # Figure out line numbers for matches 3528 matching_lines = [ 3529 mast.node_line(node) 3530 for node, envs in full_matches 3531 ] 3532 partial_lines = [ 3533 mast.node_line(node) 3534 for node, envs in partial_matches 3535 ] 3536 arent_extra = [ 3537 node for node, env in full_matches 3538 ] + [ 3539 node for node, env in partial_matches 3540 ] 3541 non_matching_lines = [ 3542 mast.node_line(node) 3543 for node, envs in matches 3544 if node not in arent_extra 3545 ] 3546 3547 # Create explanation: 3548 plural = True 3549 if self.max_allowed == 0: 3550 quantity = "zero" 3551 elif self.min_allowed is None: 3552 if self.max_allowed is None: 3553 quantity = "any number of" 3554 else: 3555 quantity = "no more than {}".format(self.max_allowed) 3556 plural = self.max_allowed != 1 3557 else: 3558 if self.max_allowed is None: 3559 quantity = "at least {}".format(self.min_allowed) 3560 plural = self.min_allowed != 1 3561 elif self.min_allowed == self.max_allowed: 3562 quantity = "exactly {}".format(self.min_allowed) 3563 plural = self.max_allowed != 1 3564 else: 3565 quantity = "between {} and {}".format( 3566 self.min_allowed, 3567 self.max_allowed 3568 ) 3569 plural = True 3570 3571 extra_unrefined = ( 3572 unrefined_match_count 3573 - len(full_matches) 3574 - len(partial_matches) 3575 ) 3576 explanation = ( 3577 "Expected {quantity} {name}, found {found}{sub}." 
3578 ).format( 3579 quantity=quantity, 3580 name=self.pl_name if plural else self.name, 3581 found=( 3582 str(n_full_matches) 3583 if ( 3584 result_status == "accomplished" # partials are irrelevant 3585 or len(partial_match_identities) == 0 # no partials 3586 or self.max_allowed == 0 # partials are irrelevant 3587 ) 3588 else 3589 "{} {}, plus {} partial {} which did not satisfy {}".format( 3590 n_full_matches, 3591 phrasing.plural(n_full_matches, "match", "matches"), 3592 n_partial_matches, 3593 phrasing.plural(n_partial_matches, "match", "matches"), 3594 phrasing.plural( 3595 len(self.subrules), 3596 "the sub-rule", 3597 f"all {len(self.subrules)} sub-rules" 3598 ) 3599 ) 3600 ), 3601 sub=( 3602 " (found {}{} possible {} which did not satisfy {})" 3603 ).format( 3604 extra_unrefined, 3605 " more" if n_partial_matches > 0 else '', 3606 phrasing.plural(extra_unrefined, "match", "matches"), 3607 phrasing.plural( 3608 len(self.subrules), 3609 "the sub-rule", 3610 f"enough of the {len(self.subrules)} sub-rules" 3611 ), 3612 ) if self.subrules and extra_unrefined else "" 3613 ) 3614 3615 # Add line numbers: 3616 if len(matching_lines) > 0: 3617 notes.append( 3618 "Found on line(s): {}".format( 3619 ', '.join( 3620 html_tools.html_link_to_line( 3621 task_info["id"], 3622 filename, 3623 ln 3624 ) 3625 for ln in matching_lines 3626 ) 3627 ) 3628 ) 3629 if len(partial_lines) > 0 and result_status != "accomplished": 3630 notes.append( 3631 "Found partial matches on line(s): {}".format( 3632 ', '.join( 3633 html_tools.html_link_to_line( 3634 task_info["id"], 3635 filename, 3636 ln 3637 ) 3638 for ln in partial_lines 3639 ) 3640 ) 3641 ) 3642 if ( 3643 self.subrules 3644 and extra_unrefined 3645 and result_status != "accomplished" 3646 ): 3647 notes.append( 3648 "Found disqualified matches on line(s): {}".format( 3649 ", ".join( 3650 html_tools.html_link_to_line( 3651 task_info["id"], 3652 filename, 3653 ln 3654 ) 3655 for ln in non_matching_lines 3656 ) 3657 ) 3658 ) 3659 3660 if full_match_subtable is not None: 3661 subtable = full_match_subtable 3662 elif partial_match_subtable is not None: 3663 subtable = partial_match_subtable 3664 else: 3665 subtable = closest_subtable # might still be None in some cases 3666 3667 self.result = { 3668 "status": result_status, 3669 "notes": notes, 3670 "warnings": warnings, 3671 "subtable": subtable 3672 } 3673 3674 self.set_explanation(context, default=explanation) 3675 # TODO: Bubble warnings from sub-rules? 3676 return self.result
An `ImplementationCheck` inspects the AST of submitted code to determine whether it counts as accomplished or failed. An `ImplementationCheck`'s subrules must all be accomplished for the parent check to count as accomplished. An `ImplementationCheck` looks for the first match that can satisfy its subrules.

`ImplementationCheck`s by default run on the 'scope' context slot, which contains an AST for the submitted module, or (via refinement by `ImplementationCheck`s) a subset of that code. When created, unless explicit dependencies are specified via a `test_in` keyword argument, each `ImplementationCheck` will grab the current automatic "scope" context as its only dependency.
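For concreteness, here is a minimal construction sketch. The task ID, identifier, and pattern are hypothetical, and the exact pattern syntax is governed by the `mast` module:

```python
from potluck import rubrics

# Hypothetical check: the submission must contain at least one for loop.
# "taskX" and "uses_loop" are made-up values for illustration.
loop_check = rubrics.ImplementationCheck(
    "taskX",                         # task ID
    "uses_loop",                     # identifier ("check:" is prepended)
    description=(
        "Use a for loop",
        "Your solution must contain at least one for loop."
    ),
    pattern="for _ in _:\n    ___",  # mast pattern (syntax assumed)
    name=("for loop", "for loops"),
    min=1                            # require at least one full match
)
```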
2847 def __init__( 2848 self, 2849 taskid, 2850 identifier, 2851 description=( 2852 "BLANK IMPLEMENTATION CHECK", 2853 "THIS GOAL HAS NOT BEEN DEFINED" 2854 ), 2855 pattern="_", 2856 name=None, 2857 match=lambda code, node, env: True, 2858 use=None, 2859 min=None, max=None, 2860 softmin=False, softmax=False, 2861 outside=None, 2862 callees=False, 2863 subrules=None, 2864 match_identity=lambda code, node, envs: ( 2865 tuple(node) if isinstance(node, list) else node 2866 ), 2867 subslip=None, 2868 normalize=False, 2869 check_in_def=False, 2870 force_smaller_match=False, 2871 **kwargs 2872 ): 2873 """ 2874 A task ID, an identifier, and a description are required (see the 2875 `Goal` class). An appropriate `test_in` dictionary which will 2876 provide a "scope" slot is typically required. 2877 2878 The categorizer "check:" is prepended to the identifier. 2879 2880 `ImplementationCheck` itself uses the following arguments: 2881 2882 - pattern: A string containing Python code that will be matched 2883 against using mast. May instead be a list of strings, in 2884 which case they will be tried in turn to generate matches. 2885 - name: specifies a name for the construct being searched for. 2886 The plural will be constructed by adding 's', unless name is 2887 a tuple, in which case the first entry will be used as the 2888 singular and the second as the plural. May contain HTML code. 2889 If pattern is not a list, this can be left out, and the 2890 pattern will be used as the name. 2891 - match: A function that accepts the entire submitted AST, the 2892 node being considered for a match right now, and the current 2893 binding environment. This function should return True or 2894 False, and any matches for which it does not return True will 2895 be ignored. 2896 - use/min/max: Either the 'use' argument, or one or both of the 2897 'min' and 'max' arguments should be given, but not both. 2898 Supplying 'use' sets both 'min' and 'max' to that value. If 2899 'max' is 0, the pattern is considered a negative pattern, and 2900 the goal will fail if any matches are found. Otherwise, the 2901 goal will succeed if the number of matches is between the 2902 given min and max values, inclusive. If none of these are 2903 given, the min defaults to 1 and the max to None (no limit). 2904 - softmin/softmax: If one of these is true, the minimum (or 2905 maximum) restriction on the number of matches will be treated 2906 as a soft constraint, and if violated the goal will be 2907 treated as partially accomplished instead of failed. If they 2908 are exactly either the string "warn" or "note", then the goal 2909 will still count as fully accomplished if that constraint is 2910 violated, but a warning or note will be attached mentioning 2911 the unexpectedly low/high number of matches. They may also be 2912 integers or floats, in which case they establish an alternate 2913 min/max threshold for partial completion. For softmin, 2914 partial matches are counted as 0.5 of a match towards 2915 achieving the threshold, but for softmax partial matches are 2916 ignored. 2917 - outside: If present, the 'outside' pattern (or list of 2918 patterns) is checked, and matches will only be considered 2919 valid if they are not sub-nodes of a match for one of the 2920 given outside patterns. 2921 - callees: If given as True, instead of simply searching within 2922 the context's scope node, this check will look for matches 2923 within other functions defined in the submitted code which 2924 are called from within the given scope node. 
TODO: This is 2925 still (as of 2020-6) experimental/unstable. 2926 - subrules: A list of `ImplementationCheck` goals to be tested 2927 within matches of this goal. Only matches where this goal and 2928 all of its subrules are accomplished (or partially 2929 accomplished) will be considered valid (respectively, 2930 partially valid). If this goal is a negative goal (max = 0), 2931 it fails if there are any fully valid matches, and partial 2932 matches are ignored. On the other hand, if it is a positive 2933 goal (max != 0), it counts as accomplished if the number of 2934 fully valid matches is within the min and max limits 2935 (inclusive), and partially accomplished if the number of 2936 fully valid matches is below the min limit but the number of 2937 fully valid + partially valid matches is at least the min 2938 limit. 2939 - match_identity: a function that returns a hashable object to 2940 represent the identity of a match for match-counting 2941 purposes. The function will be given the entire code context, 2942 the matching node, and a list of matching environments as 2943 input. It may return a list of identities instead of a single 2944 identity and each will be counted. By default this is a 2945 function which just returns the matching node, such that 2946 multiple matching environments based on the same node are not 2947 counted as separate matches. One reasonable alternative if 2948 you know what type of node you're matching would be to return 2949 some associated string (e.g., the id of a Call node that has 2950 a Name as its func). 2951 - subslip: A number of subgoals which are allowed to be violated 2952 and still count a potential match as a partial match. May be 2953 fractional, since partially-matched subgoals will be counted 2954 as 1/2 a point. By default this number will be set equal to 2955 the number of subgoals, meaning that even if all subgoals 2956 fail a match for a specified structure will still count as a 2957 partial match. 2958 - normalize: default False; experimental mast option that tries 2959 to inline some local variable assignments into larger 2960 expressions for better matching. 2961 - check_in_def: default False, this option changes the context 2962 within which the check occurs by default: the check will use 2963 the 'scope' element from the current context as usual, but 2964 will then assume that that AST node is a Call to a function 2965 defined in the same file (within the 'code' element) and will 2966 look up that definition, running the check in the context of 2967 that definition rather than in the original scope context 2968 given to it. This is useful for placing requirements on 2969 helper functions whose names aren't known ahead of time: a 2970 parent `ImplementationCheck` can be used to match the helper 2971 function call, with child checks using check_in_def that 2972 place requirements on the code in the helper function. The 2973 check will fail if the 'scope' context provided to it is not 2974 a Call node, or if it can't find the matching FunctionDef 2975 node in the 'code' tree of the context it's given. 2976 - force_smaller_match: default False. If set to True, a match 2977 which matches the entire target scope will not be considered a 2978 real match. Use this in places where you want to require 2979 things like nested loops, since otherwise a sub-requirement 2980 that's the same as a super-requirement will simply match the 2981 entire node matched by the super-requirement. 
2982 """ 2983 # Grab parent context 2984 if "test_in" not in kwargs or kwargs["test_in"] is None: 2985 kwargs["test_in"] = {} 2986 if "contexts" not in kwargs["test_in"]: 2987 kwargs["test_in"]["contexts"] = contexts.auto("scope") 2988 2989 # Set up Goal properties 2990 super().__init__( 2991 taskid, 2992 "check:" + identifier, 2993 description, 2994 **kwargs 2995 ) 2996 self.set_default_goal_type("procedure") 2997 2998 # Ensure patterns is a list 2999 if isinstance(pattern, str): 3000 self.patterns = [ pattern ] 3001 else: 3002 self.patterns = pattern 3003 3004 # Figure out name 3005 if name is None: 3006 if len(self.patterns) > 1: 3007 raise ValueError( 3008 ( 3009 "When building an ImplementationCheck, if there are " 3010 + "multiple patterns, a name must be specified." 3011 + " (topic: '{}' / patterns: {})" 3012 ).format(self.feedback_topic(), self.patterns) 3013 ) 3014 else: 3015 self.name = self.patterns[0] 3016 self.pl_name = self.name + 's' 3017 elif isinstance(name, (list, tuple)): 3018 self.name, self.pl_name = name 3019 else: 3020 self.name = name 3021 self.pl_name = self.name + 's' 3022 3023 self.match = match 3024 3025 # Figure out min and max 3026 if (min is not None or max is not None) and use is not None: 3027 raise ValueError( 3028 ( 3029 "When building an ImplementationCheck, you may supply " 3030 + "*either* 'use' or 'min'/'max', but you may not supply " 3031 + "'use' if either 'min' or 'max' is given." 3032 + " (topic: '{}' / patterns: {})" 3033 ).format(self.feedback_topic(), self.patterns) 3034 ) 3035 elif use is not None: 3036 self.min_allowed = use 3037 self.max_allowed = use 3038 elif min is None and max is None: 3039 # Default is "at least 1" 3040 self.min_allowed = 1 3041 self.max_allowed = None 3042 else: 3043 self.min_allowed = min 3044 self.max_allowed = max 3045 3046 # Is this goal a positive goal (keep searching for any match 3047 # across possible environments?) or not (fail if any match is 3048 # found in any environment). 3049 self.is_positive = self.max_allowed != 0 3050 3051 self.softmin = softmin 3052 self.softmax = softmax 3053 3054 # Make sure outside is a list 3055 if outside is None: 3056 self.outside = [] 3057 elif isinstance(outside, str): 3058 self.outside = [ outside ] 3059 else: 3060 self.outside = outside 3061 3062 self.callees = callees 3063 3064 self.force_smaller_match = force_smaller_match 3065 3066 # Set subrules 3067 if subrules is None: 3068 self.subrules = [] 3069 else: 3070 self.subrules = subrules 3071 3072 self.match_identity = match_identity 3073 3074 self.subslip = subslip 3075 if self.subslip is None: 3076 self.subslip = len(self.subrules) 3077 3078 self.normalize = normalize 3079 3080 self.check_in_def = check_in_def
A task ID, an identifier, and a description are required (see the `Goal` class). An appropriate `test_in` dictionary which will provide a "scope" slot is typically required.

The categorizer "check:" is prepended to the identifier.

`ImplementationCheck` itself uses the following arguments:
- pattern: A string containing Python code that will be matched against using mast. May instead be a list of strings, in which case they will be tried in turn to generate matches.
- name: specifies a name for the construct being searched for. The plural will be constructed by adding 's', unless name is a tuple, in which case the first entry will be used as the singular and the second as the plural. May contain HTML code. If pattern is not a list, this can be left out, and the pattern will be used as the name.
- match: A function that accepts the entire submitted AST, the node being considered for a match right now, and the current binding environment. This function should return True or False, and any matches for which it does not return True will be ignored.
- use/min/max: Either the 'use' argument, or one or both of the 'min' and 'max' arguments should be given, but not both. Supplying 'use' sets both 'min' and 'max' to that value. If 'max' is 0, the pattern is considered a negative pattern, and the goal will fail if any matches are found. Otherwise, the goal will succeed if the number of matches is between the given min and max values, inclusive. If none of these are given, the min defaults to 1 and the max to None (no limit).
- softmin/softmax: If one of these is true, the minimum (or maximum) restriction on the number of matches will be treated as a soft constraint, and if violated the goal will be treated as partially accomplished instead of failed. If they are exactly either the string "warn" or "note", then the goal will still count as fully accomplished if that constraint is violated, but a warning or note will be attached mentioning the unexpectedly low/high number of matches. They may also be integers or floats, in which case they establish an alternate min/max threshold for partial completion. For softmin, partial matches are counted as 0.5 of a match towards achieving the threshold, but for softmax partial matches are ignored.
- outside: If present, the 'outside' pattern (or list of patterns) is checked, and matches will only be considered valid if they are not sub-nodes of a match for one of the given outside patterns.
- callees: If given as True, instead of simply searching within the context's scope node, this check will look for matches within other functions defined in the submitted code which are called from within the given scope node. TODO: This is still (as of 2020-6) experimental/unstable.
- subrules: A list of `ImplementationCheck` goals to be tested within matches of this goal. Only matches where this goal and all of its subrules are accomplished (or partially accomplished) will be considered valid (respectively, partially valid). If this goal is a negative goal (max = 0), it fails if there are any fully valid matches, and partial matches are ignored. On the other hand, if it is a positive goal (max != 0), it counts as accomplished if the number of fully valid matches is within the min and max limits (inclusive), and partially accomplished if the number of fully valid matches is below the min limit but the number of fully valid + partially valid matches is at least the min limit (see the sketch after this list).
- match_identity: a function that returns a hashable object to represent the identity of a match for match-counting purposes. The function will be given the entire code context, the matching node, and a list of matching environments as input. It may return a list of identities instead of a single identity and each will be counted. By default this is a function which just returns the matching node, such that multiple matching environments based on the same node are not counted as separate matches. One reasonable alternative if you know what type of node you're matching would be to return some associated string (e.g., the id of a Call node that has a Name as its func).
- subslip: A number of subgoals which are allowed to be violated and still count a potential match as a partial match. May be fractional, since partially-matched subgoals will be counted as 1/2 a point. By default this number will be set equal to the number of subgoals, meaning that even if all subgoals fail a match for a specified structure will still count as a partial match.
- normalize: default False; experimental mast option that tries to inline some local variable assignments into larger expressions for better matching.
- check_in_def: default False, this option changes the context within which the check occurs by default: the check will use the 'scope' element from the current context as usual, but will then assume that that AST node is a Call to a function defined in the same file (within the 'code' element) and will look up that definition, running the check in the context of that definition rather than in the original scope context given to it. This is useful for placing requirements on helper functions whose names aren't known ahead of time: a parent `ImplementationCheck` can be used to match the helper function call, with child checks using check_in_def that place requirements on the code in the helper function. The check will fail if the 'scope' context provided to it is not a Call node, or if it can't find the matching FunctionDef node in the 'code' tree of the context it's given.
- force_smaller_match: default False. If set to True, a match which matches the entire target scope will not be considered a real match. Use this in places where you want to require things like nested loops, since otherwise a sub-requirement that's the same as a super-requirement will simply match the entire node matched by the super-requirement.
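Below is a sketch combining `subrules` with `force_smaller_match` to require nested loops; all identifiers and the pattern syntax are hypothetical:

```python
from potluck import rubrics

nested_loops = rubrics.ImplementationCheck(
    "taskX",
    "nested_loops",
    description=(
        "Use nested loops",
        "Somewhere in your code, one loop must contain another."
    ),
    pattern="for _ in _:\n    ___",
    name=("nested loop", "nested loops"),
    subrules=[
        rubrics.ImplementationCheck(
            "taskX",
            "inner_loop",
            description=("Inner loop", "A loop inside the outer loop."),
            pattern="for _ in _:\n    ___",
            name=("inner loop", "inner loops"),
            # Without force_smaller_match, this inner pattern would simply
            # re-match the outer loop node itself:
            force_smaller_match=True
        )
    ]
)
```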
3082 def subgoals(self): 3083 """ 3084 List of subgoals of this goal (our precondition and our goal). 3085 """ 3086 return self.subrules
List of subgoals of this goal (its subrules).
3088 def table(self, blank=False): 3089 """ 3090 Includes sub-table with subrule statuses preserved from the last 3091 full match, or the last partial match if there are no full 3092 matches. 3093 3094 See `Goal.table` regarding the table format. 3095 """ 3096 result = super().table(blank=blank) 3097 3098 # Maybe add a subtable: 3099 if blank: 3100 result[0]["subtable"] = self.build_subtable(blank=blank) 3101 elif self.is_positive: 3102 # TODO: What about tables requested during pre-evaluation 3103 # description construction? 3104 result[0]["subtable"] = self.result.get("subtable") or [] 3105 elif self.result.get("status") != "accomplished": 3106 # For negative rules where we don't want any matches, reporting 3107 # the successful discovery of sub-rules only makes sense if 3108 # we failed the goal (because there was a match that 3109 # shouldn't have been there). 3110 result[0]["subtable"] = self.result.get("subtable") or [] 3111 # Otherwise don't attach a subtable (negative rules that 3112 # succeeded because they didn't have any full matches). 3113 3114 return result
Includes sub-table with subrule statuses preserved from the last full match, or the last partial match if there are no full matches.
See `Goal.table` regarding the table format.
3116 def build_subtable(self, blank=False): 3117 """ 3118 Builds a sub-table using the results of each subrule as currently 3119 evaluated. 3120 """ 3121 result = [] 3122 for subrule in self.subrules: 3123 result.extend(subrule.table(blank=blank)) 3124 return result
Builds a sub-table using the results of each subrule as currently evaluated.
3126 def evaluate_in_context(self, context=None): 3127 """ 3128 Checks the rule within the 'scope' node of the given context, 3129 respecting bindings in the 'env' dictionary from the given 3130 context. Uses the entire submitted code if no scope is present, 3131 and uses an empty dictionary if there is no binding environment. 3132 Use build_code_context to establish a top-level scope beforehand 3133 if you are worried about parsing issues causing code to be 3134 missing. 3135 """ 3136 # Grab scope and top-scope slots 3137 task_info = context_utils.extract(context, "task_info") 3138 scope = context_utils.extract(context, "scope") 3139 top_scope = context_utils.extract(context, "top_scope") 3140 filename = context_utils.extract(context, "filename") 3141 3142 # Create sub-context 3143 context = context or {} 3144 sub_context = {} 3145 sub_context.update(context) 3146 3147 # Create/extract matching environment 3148 if sub_context.get("env") is not None: 3149 env = sub_context["env"] 3150 else: 3151 env = {} 3152 3153 # Swap from the specified scope over to the matching definition 3154 # if check_in_def was specified: 3155 if self.check_in_def: 3156 if not isinstance(scope, ast.Call): 3157 raise context_utils.MissingContextError( 3158 "Attempt to check in a definition but parent check" 3159 + " didn't provide a function call to work from:" 3160 + "\n{}\n{}".format(scope, self.description) 3161 ) 3162 3163 if not isinstance(scope.func, ast.Name): 3164 raise context_utils.MissingContextError( 3165 "Attempt to check in a definition but the parent" 3166 + " check provided a function call with a complex func" 3167 + " expression:\n {}".format(scope) 3168 ) 3169 3170 defs = mast.findall( 3171 top_scope, 3172 "def {}(___):\n ___".format(scope.func.id), 3173 env=env, 3174 gen=False 3175 ) 3176 3177 if len(defs) == 0: 3178 raise context_utils.MissingContextError( 3179 ( 3180 "Attempt to check in a definition but the parent" 3181 + " check provided a function call (to {}) with no" 3182 + " matching definitions:\n {}" 3183 ).format(scope.func.id, scope) 3184 ) 3185 3186 # last definition overrides earlier ones if there are multiple 3187 last_node, last_envs = defs[-1] 3188 # TODO: DEBUG 3189 if last_node is None: 3190 print("None last_node") 3191 3192 scope = last_node 3193 # arbitrarily use first env; shouldn't be multiple we hope? 3194 env = last_envs[0] 3195 3196 # list of matching AST nodes 3197 matches = [] 3198 3199 # Scope our match predicate: 3200 my_match = self.match 3201 3202 # Our match filter: 3203 match_filter = lambda node, env: my_match(top_scope, node, env) 3204 3205 # Define match-collecting function 3206 def collect_matches(in_scope, memo=None): 3207 """ 3208 This local function collects matches to any of the patterns 3209 in this goal's patterns list, subject to the goal's matching 3210 rule. It accepts a scope (an AST node to search within) and 3211 uses a memo set to remember which callees have been 3212 investigated so that recursive functions with callees=True 3213 will not create an infinite loop. 3214 """ 3215 nonlocal self 3216 if memo is None: # remember which callees we've investigated 3217 memo = set() 3218 for pat in self.patterns: 3219 try: 3220 for node, envs in mast.findall( 3221 in_scope, 3222 pat, 3223 outside=self.outside, 3224 matchpred=match_filter, 3225 env=env, 3226 normalize=self.normalize, 3227 gen=True 3228 ): 3229 for prev_node, prev_envs in matches: 3230 if prev_node == node: 3231 # TODO: worry whether this duplicates envs? 
3232 prev_envs.extend(envs) 3233 break 3234 else: # if we didn't ever break 3235 if not ( 3236 self.force_smaller_match 3237 and node is in_scope 3238 ): 3239 matches.append((node, envs)) 3240 3241 except Exception: 3242 # Rule checks shouldn't crash no matter what students 3243 # do... 3244 traceback.print_exc() 3245 logging.log( 3246 ( 3247 'ERROR CHECKING RULE\n rule name: "{}"\n' 3248 + ' attempted pattern: {}' 3249 ).format(self.name, pat) 3250 ) 3251 raise # will be caught below 3252 3253 # Check for matches in callees too. 3254 # WARNINGS: 3255 # - Matches only calls where the function position is a name 3256 # (not an arbitrary expression) 3257 # - Searches the top-level task code node for this name 3258 # without understanding shadowing and without considering 3259 # arguments/parameters 3260 # - Attempts to match the full pattern within a single 3261 # function (currently cannot automatically split pattern 3262 # across a call) 3263 # - Likely to cause even more exponential blowup 3264 # - No attempts are made to respect scope when unifying 3265 # env with match environments in callees 3266 if self.callees: 3267 callee_names = set( 3268 call_env['f'].id 3269 for call_node, call_envs in mast.findall( 3270 in_scope, 3271 '_f_(___)', 3272 gen=True, 3273 matchpred=( 3274 lambda node, env: type(env['f']) == ast.Name 3275 ) 3276 ) # noqa: E123 3277 for call_env in call_envs 3278 ) 3279 # Exclude already-checked callees and update memo: 3280 callee_names -= memo 3281 memo |= callee_names 3282 # Check each callee 3283 for callee_name in callee_names: 3284 callee_patterns = [ 3285 pat.replace("_f_", callee_name) 3286 for pat in patterns.ALL_DEF_PATTERNS 3287 ] 3288 outside_patterns = [ 3289 pat.replace("_f_", in_scope.name) 3290 for pat in patterns.ALL_DEF_PATTERNS 3291 ] if type(scope) == ast.FunctionDef else [] 3292 for cpat in callee_patterns: 3293 for callee_def_node, callee_def_env in mast.findall( 3294 top_scope, 3295 cpat, 3296 outside=outside_patterns, 3297 gen=True 3298 ): 3299 collect_matches(callee_def_node, memo=memo) 3300 pass 3301 pass 3302 pass 3303 pass 3304 pass 3305 3306 # Now that we've defined our collect_matches function, let's use it: 3307 try: 3308 collect_matches(scope) 3309 except Exception: 3310 logging.log( 3311 '>>> WARNING: check_ast_rule exception:\n' 3312 + html_tools.string_traceback() 3313 + '\n<<<' 3314 ) 3315 logging.log( 3316 ( 3317 "Exception while performing ImplementationCheck:\n" 3318 "(topic: '{}', patterns: {})" 3319 ).format(self.feedback_topic(), self.patterns) 3320 ) 3321 self.result = { 3322 "status": "unknown", 3323 "traceback": html_tools.html_traceback( 3324 linkable=context_utils.linkmap(context) 3325 ), 3326 "warnings": [ 3327 "There was an error while checking your implementation." 
3328 ] 3329 } 3330 self.set_explanation( 3331 context, 3332 status="crash", 3333 default=html_tools.html_traceback( 3334 title="Error while checking implementation:", 3335 linkable=context_utils.linkmap(context) 3336 ) 3337 ) 3338 return self.result 3339 3340 # Used for messaging in presence of subrules: 3341 unrefined_match_count = len(matches) 3342 3343 # Refine matches by running subrules: 3344 partial_matches = [] 3345 full_matches = [] 3346 full_match_subtable = None 3347 partial_match_subtable = None 3348 closest_subtable = None 3349 closest_successes = -1 3350 closest_partials = -1 3351 for (node, envs) in matches: 3352 for env in envs: 3353 subsuccesses = 0 3354 subpartials = 0 3355 for rule in self.subrules: 3356 this_sub_context = {} 3357 this_sub_context.update(sub_context) 3358 this_sub_context["scope"] = node 3359 this_sub_context["env"] = env 3360 # evaluate sub-rule 3361 sub_result = rule.evaluate_in_context(this_sub_context) 3362 if sub_result["status"] == "accomplished": 3363 subsuccesses += 1 3364 elif sub_result["status"] == "partial": 3365 subpartials += 1 3366 3367 # tally sub-results 3368 if subsuccesses == len(self.subrules): 3369 # all succeeded: this is a full match 3370 if full_match_subtable is None: 3371 full_match_subtable = self.build_subtable() 3372 for prev_node, prev_envs in full_matches: 3373 if prev_node == node: 3374 prev_envs.append(env) 3375 break 3376 else: # if we didn't break 3377 full_matches.append((node, [env])) 3378 elif ( 3379 (subsuccesses + subpartials) == len(self.subrules) 3380 or ( 3381 (subsuccesses + subpartials / 2) 3382 >= (len(self.subrules) - self.subslip) 3383 ) 3384 ): 3385 # partially succeeded 3386 if partial_match_subtable is None: 3387 partial_match_subtable = self.build_subtable() 3388 for prev_node, prev_envs in partial_matches: 3389 if prev_node == node: 3390 prev_envs.append(env) 3391 break 3392 else: # if we didn't break 3393 partial_matches.append((node, [env])) 3394 elif ( 3395 subsuccesses > closest_successes 3396 or ( 3397 subsuccesses == closest_successes 3398 and subpartials > closest_partials 3399 ) 3400 ): 3401 # Best so far in terms of subrule successes 3402 closest_successes = subsuccesses 3403 closest_partials = subpartials 3404 closest_subtable = self.build_subtable() 3405 3406 # Get counts: 3407 full_match_identities = [] 3408 for n, envs in full_matches: 3409 identity_or_identities = self.match_identity(top_scope, n, envs) 3410 if isinstance(identity_or_identities, list): 3411 full_match_identities.extend(identity_or_identities) 3412 else: 3413 full_match_identities.append(identity_or_identities) 3414 3415 n_full_matches = len(set(full_match_identities)) 3416 3417 partial_match_identities = [] 3418 for n, envs in partial_matches: 3419 identity_or_identities = self.match_identity(top_scope, n, envs) 3420 if isinstance(identity_or_identities, list): 3421 partial_match_identities.extend(identity_or_identities) 3422 else: 3423 partial_match_identities.append(identity_or_identities) 3424 3425 n_partial_matches = len(set(partial_match_identities)) 3426 3427 # Check bounds now that we know which matches are partial/full: 3428 violated_min = ( 3429 self.min_allowed is not None 3430 and self.min_allowed > n_full_matches 3431 ) 3432 violated_max = ( 3433 self.max_allowed is not None 3434 and self.max_allowed < n_full_matches 3435 ) 3436 obeyed_min_partially = ( 3437 self.min_allowed is None 3438 or self.min_allowed <= n_partial_matches 3439 ) 3440 # You can't use partial matches to satisfy the max limit 3441 3442 # 
Notes and warnings for our ultimate result: 3443 notes = [] 3444 warnings = [] 3445 3446 # Assign status 3447 result_status = None 3448 if violated_min: 3449 if obeyed_min_partially: 3450 result_status = "partial" 3451 3452 if self.softmin: 3453 if isinstance(self.softmin, (str, list, tuple)): 3454 if "note" in self.softmin: 3455 notes.append( 3456 "Found fewer {} than expected.".format( 3457 self.pl_name 3458 ) 3459 ) 3460 3461 if "warn" in self.softmin: 3462 warnings.append( 3463 "Found fewer {} than expected.".format( 3464 self.pl_name 3465 ) 3466 ) 3467 3468 if "partial" in self.softmin: 3469 result_status = "partial" 3470 3471 if "fail" in self.softmin: 3472 result_status = "failed" 3473 3474 elif isinstance(self.softmin, (int, float)): 3475 matchpoints = n_full_matches + 0.5 * n_partial_matches 3476 if matchpoints >= self.softmin: 3477 result_status = "partial" 3478 else: 3479 result_status = "failed" 3480 else: 3481 result_status = "partial" 3482 3483 elif not obeyed_min_partially: 3484 result_status = "failed" 3485 3486 if violated_max: 3487 if self.softmax: 3488 if isinstance(self.softmax, (str, list, tuple)): 3489 if "note" in self.softmax: 3490 notes.append( 3491 f"Found more {self.pl_name} than expected." 3492 ) 3493 3494 if "warn" in self.softmax: 3495 warnings.append( 3496 f"Found more {self.pl_name} than expected." 3497 ) 3498 3499 if "partial" in self.softmax: 3500 # Don't upgrade failed (e.g. due to softmax): 3501 if result_status != "failed": 3502 result_status = "partial" 3503 3504 if "fail" in self.softmax: 3505 # old status is irrelevant 3506 result_status = "failed" 3507 3508 elif isinstance(self.softmax, (int, float)): 3509 # partial matches don't count against max 3510 if ( 3511 n_full_matches <= self.softmax 3512 and result_status != "failed" 3513 ): 3514 result_status = "partial" 3515 else: 3516 result_status = "failed" 3517 elif self.softmax: 3518 if result_status != "failed": 3519 result_status = "partial" 3520 else: 3521 result_status = "failed" 3522 3523 # No status assigned by min/max constraints? Then it's accomplished: 3524 if result_status is None: 3525 result_status = "accomplished" 3526 3527 # Figure out line numbers for matches 3528 matching_lines = [ 3529 mast.node_line(node) 3530 for node, envs in full_matches 3531 ] 3532 partial_lines = [ 3533 mast.node_line(node) 3534 for node, envs in partial_matches 3535 ] 3536 arent_extra = [ 3537 node for node, env in full_matches 3538 ] + [ 3539 node for node, env in partial_matches 3540 ] 3541 non_matching_lines = [ 3542 mast.node_line(node) 3543 for node, envs in matches 3544 if node not in arent_extra 3545 ] 3546 3547 # Create explanation: 3548 plural = True 3549 if self.max_allowed == 0: 3550 quantity = "zero" 3551 elif self.min_allowed is None: 3552 if self.max_allowed is None: 3553 quantity = "any number of" 3554 else: 3555 quantity = "no more than {}".format(self.max_allowed) 3556 plural = self.max_allowed != 1 3557 else: 3558 if self.max_allowed is None: 3559 quantity = "at least {}".format(self.min_allowed) 3560 plural = self.min_allowed != 1 3561 elif self.min_allowed == self.max_allowed: 3562 quantity = "exactly {}".format(self.min_allowed) 3563 plural = self.max_allowed != 1 3564 else: 3565 quantity = "between {} and {}".format( 3566 self.min_allowed, 3567 self.max_allowed 3568 ) 3569 plural = True 3570 3571 extra_unrefined = ( 3572 unrefined_match_count 3573 - len(full_matches) 3574 - len(partial_matches) 3575 ) 3576 explanation = ( 3577 "Expected {quantity} {name}, found {found}{sub}." 
3578 ).format( 3579 quantity=quantity, 3580 name=self.pl_name if plural else self.name, 3581 found=( 3582 str(n_full_matches) 3583 if ( 3584 result_status == "accomplished" # partials are irrelevant 3585 or len(partial_match_identities) == 0 # no partials 3586 or self.max_allowed == 0 # partials are irrelevant 3587 ) 3588 else 3589 "{} {}, plus {} partial {} which did not satisfy {}".format( 3590 n_full_matches, 3591 phrasing.plural(n_full_matches, "match", "matches"), 3592 n_partial_matches, 3593 phrasing.plural(n_partial_matches, "match", "matches"), 3594 phrasing.plural( 3595 len(self.subrules), 3596 "the sub-rule", 3597 f"all {len(self.subrules)} sub-rules" 3598 ) 3599 ) 3600 ), 3601 sub=( 3602 " (found {}{} possible {} which did not satisfy {})" 3603 ).format( 3604 extra_unrefined, 3605 " more" if n_partial_matches > 0 else '', 3606 phrasing.plural(extra_unrefined, "match", "matches"), 3607 phrasing.plural( 3608 len(self.subrules), 3609 "the sub-rule", 3610 f"enough of the {len(self.subrules)} sub-rules" 3611 ), 3612 ) if self.subrules and extra_unrefined else "" 3613 ) 3614 3615 # Add line numbers: 3616 if len(matching_lines) > 0: 3617 notes.append( 3618 "Found on line(s): {}".format( 3619 ', '.join( 3620 html_tools.html_link_to_line( 3621 task_info["id"], 3622 filename, 3623 ln 3624 ) 3625 for ln in matching_lines 3626 ) 3627 ) 3628 ) 3629 if len(partial_lines) > 0 and result_status != "accomplished": 3630 notes.append( 3631 "Found partial matches on line(s): {}".format( 3632 ', '.join( 3633 html_tools.html_link_to_line( 3634 task_info["id"], 3635 filename, 3636 ln 3637 ) 3638 for ln in partial_lines 3639 ) 3640 ) 3641 ) 3642 if ( 3643 self.subrules 3644 and extra_unrefined 3645 and result_status != "accomplished" 3646 ): 3647 notes.append( 3648 "Found disqualified matches on line(s): {}".format( 3649 ", ".join( 3650 html_tools.html_link_to_line( 3651 task_info["id"], 3652 filename, 3653 ln 3654 ) 3655 for ln in non_matching_lines 3656 ) 3657 ) 3658 ) 3659 3660 if full_match_subtable is not None: 3661 subtable = full_match_subtable 3662 elif partial_match_subtable is not None: 3663 subtable = partial_match_subtable 3664 else: 3665 subtable = closest_subtable # might still be None in some cases 3666 3667 self.result = { 3668 "status": result_status, 3669 "notes": notes, 3670 "warnings": warnings, 3671 "subtable": subtable 3672 } 3673 3674 self.set_explanation(context, default=explanation) 3675 # TODO: Bubble warnings from sub-rules? 3676 return self.result
Checks the rule within the 'scope' node of the given context, respecting bindings in the 'env' dictionary from the given context. Uses the entire submitted code if no scope is present, and uses an empty dictionary if there is no binding environment. Use build_code_context to establish a top-level scope beforehand if you are worried about parsing issues causing code to be missing.
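As a rough sketch of calling this directly (in normal use the context graph supplies these slots; the slot names below are taken from this method's source, and the minimal `task_info` stand-in is an assumption):

```python
import ast

code = "total = 0\nfor i in range(3):\n    total += i\n"
tree = ast.parse(code)
context = {
    "task_info": {"id": "taskX"},  # minimal stand-in; real task info is richer
    "filename": "submission.py",
    "scope": tree,                 # node to search within
    "top_scope": tree,             # AST of the entire submitted module
}
# loop_check is the hypothetical ImplementationCheck sketched earlier:
result = loop_check.evaluate_in_context(context)
print(result["status"])            # expected: "accomplished"
```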
3679class NoParseErrors(Goal): 3680 """ 3681 This goal is simply accomplished if there are no parsing errors 3682 during task loading, and failed otherwise. If generate_warnings is given it 3683 generates a warning for each parse error. The created goal will 3684 always use the identifier "syntax:no_parse_errors". 3685 """ 3686 def __init__( 3687 self, 3688 taskid, 3689 description=( 3690 "No errors loading code", 3691 ( 3692 "Your code should be able to be loaded without errors. Run " 3693 + "your code before submitting it to make sure this is true." 3694 ) 3695 ), 3696 generate_warnings=True, 3697 **kwargs 3698 ): 3699 """ 3700 A task ID is required. A default description is available. If 3701 generate_warnings is given as False, parse errors will not be 3702 turned into warnings, but in the default case, they will be. 3703 3704 The goal identifier will be "syntax:no_parse_errors". 3705 """ 3706 super().__init__( 3707 taskid, 3708 "misc:no_parse_errors", 3709 description, 3710 **kwargs 3711 ) 3712 self.set_default_goal_type("procedure") 3713 self.generate_warnings = generate_warnings 3714 3715 # subgoals is inherited (no subgoals) 3716 3717 # table is inherited 3718 3719 def evaluate_in_context(self, context=None): 3720 """ 3721 Checks whether there were any parse errors. 3722 """ 3723 context = context or {} 3724 if ( 3725 "parse_errors" not in context 3726 or len(context["parse_errors"]) == 0 3727 ): 3728 self.result = { "status": "accomplished" } 3729 self.set_explanation( 3730 context, 3731 default="There weren't any parsing errors." 3732 ) 3733 return self.result 3734 else: 3735 message = "There were errors during parsing." 3736 if not self.generate_warnings: 3737 # Incorporate errors into message directly: 3738 message += "<br>\n" + '<br>\n'.join( 3739 html_tools.summarize_parse_error(e) 3740 for e in context["parse_errors"] 3741 ) 3742 3743 self.result = { "status": "failed" } 3744 3745 if self.generate_warnings: 3746 # Generate a warning for each error: 3747 self.result["warnings"] = [ 3748 html_tools.summarize_parse_error(e) 3749 for e in context["parse_errors"] 3750 ] 3751 3752 self.set_explanation(context, default=message) 3753 return self.result
This goal is simply accomplished if there are no parsing errors during task loading, and failed otherwise. If `generate_warnings` is given, it generates a warning for each parse error. The created goal will always use the identifier "misc:no_parse_errors".
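A usage sketch (the task ID is hypothetical, and passing a bare context dictionary is an assumption):

```python
from potluck import rubrics

goal = rubrics.NoParseErrors("taskX")
# With no recorded parse errors, the goal is accomplished:
result = goal.evaluate_in_context({"parse_errors": []})
print(result["status"])  # "accomplished"
```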
3686 def __init__( 3687 self, 3688 taskid, 3689 description=( 3690 "No errors loading code", 3691 ( 3692 "Your code should be able to be loaded without errors. Run " 3693 + "your code before submitting it to make sure this is true." 3694 ) 3695 ), 3696 generate_warnings=True, 3697 **kwargs 3698 ): 3699 """ 3700 A task ID is required. A default description is available. If 3701 generate_warnings is given as False, parse errors will not be 3702 turned into warnings, but in the default case, they will be. 3703 3704 The goal identifier will be "syntax:no_parse_errors". 3705 """ 3706 super().__init__( 3707 taskid, 3708 "misc:no_parse_errors", 3709 description, 3710 **kwargs 3711 ) 3712 self.set_default_goal_type("procedure") 3713 self.generate_warnings = generate_warnings
A task ID is required. A default description is available. If generate_warnings is given as False, parse errors will not be turned into warnings, but in the default case, they will be.
The goal identifier will be "misc:no_parse_errors".
3719 def evaluate_in_context(self, context=None): 3720 """ 3721 Checks whether there were any parse errors. 3722 """ 3723 context = context or {} 3724 if ( 3725 "parse_errors" not in context 3726 or len(context["parse_errors"]) == 0 3727 ): 3728 self.result = { "status": "accomplished" } 3729 self.set_explanation( 3730 context, 3731 default="There weren't any parsing errors." 3732 ) 3733 return self.result 3734 else: 3735 message = "There were errors during parsing." 3736 if not self.generate_warnings: 3737 # Incorporate errors into message directly: 3738 message += "<br>\n" + '<br>\n'.join( 3739 html_tools.summarize_parse_error(e) 3740 for e in context["parse_errors"] 3741 ) 3742 3743 self.result = { "status": "failed" } 3744 3745 if self.generate_warnings: 3746 # Generate a warning for each error: 3747 self.result["warnings"] = [ 3748 html_tools.summarize_parse_error(e) 3749 for e in context["parse_errors"] 3750 ] 3751 3752 self.set_explanation(context, default=message) 3753 return self.result
Checks whether there were any parse errors.
3760class LintCheck(Goal): 3761 """ 3762 Runs a linter function against the auto-context for "scope". Inherit 3763 and override the `check` method with a function that accepts a 3764 context and returns a goal evaluation result to define your linter. 3765 """ 3766 def check(self, context): 3767 """ 3768 Not implemented; override to define specific linters. 3769 """ 3770 raise NotImplementedError( 3771 "LintCheck is an abstract class that can't be used directly." 3772 ) 3773 3774 def __init__( 3775 self, 3776 taskid, 3777 identifier, 3778 description=( 3779 "BLANK LINT CHECK", 3780 "THIS GOAL HAS NOT BEEN DEFINED" 3781 ), 3782 goal_type="style", 3783 uses_slots=("scope",), 3784 **kwargs 3785 ): 3786 """ 3787 In addition to a task ID, an identifier, and a description, a 3788 goal type may be supplied other than the default "style". 3789 "procedure" is the most likely alternative. 3790 3791 The categorizer "link:" will be prepended to the identifier 3792 provided. 3793 3794 The slots required should be given as uses_slots, and a relevant 3795 context will be selected or created as the testing context. 3796 3797 Any extra arguments are passed through to the `Goal` constructor. 3798 """ 3799 # Auto context dependency based on uses_slots 3800 depends = contexts.auto(*uses_slots) 3801 if len(depends) == 1: 3802 test_context = depends[0] 3803 else: 3804 # TODO: De-duplicate stuff where one context actually 3805 # provides everything needed via inheritance but auto 3806 # doesn't see that? 3807 test_context = contexts.Context( 3808 description=( 3809 "Details of your code", 3810 ( 3811 "The " + phrasing.comma_list(uses_slots) 3812 + " of your code." 3813 ) 3814 ), 3815 builder=lambda ctx: ctx, 3816 depends=depends 3817 ) 3818 3819 if "test_in" not in kwargs: 3820 kwargs["test_in"] = {} 3821 if "contexts" not in kwargs["test_in"]: 3822 kwargs["test_in"]["contexts"] = [ test_context ] 3823 3824 # Specified goal type 3825 if "tags" not in kwargs: 3826 kwargs["tags"] = {} 3827 kwargs["tags"]["goal_type"] = goal_type 3828 3829 # Set up Goal stuff 3830 super().__init__( 3831 taskid, 3832 "lint:" + identifier, 3833 description, 3834 **kwargs 3835 ) 3836 3837 # subgoals is inherited (no subgoals) 3838 3839 # table is inherited 3840 3841 def evaluate_in_context(self, context=None): 3842 """ 3843 Runs the checker and returns its result. 3844 """ 3845 context = context or {} 3846 3847 try: 3848 self.result = self.check(context) 3849 3850 if self.result is None: 3851 raise ValueError( 3852 f"Linter for {self.__class__.__name__} returned None!" 3853 ) 3854 except Exception: 3855 self.result = { 3856 "status": "failed", 3857 "traceback": html_tools.html_traceback( 3858 linkable=context_utils.linkmap(context) 3859 ) 3860 } 3861 self.set_explanation( 3862 context, 3863 status="crash", 3864 default=html_tools.html_traceback( 3865 title="Error while inspecting your code.", 3866 linkable=context_utils.linkmap(context) 3867 ) 3868 ) 3869 return self.result 3870 3871 self.set_explanation( 3872 context, 3873 default=self.result["explanation"] 3874 ) 3875 3876 return self.result
Runs a linter function against the auto-context for "scope". Inherit and override the `check` method with a function that accepts a context and returns a goal evaluation result to define your linter.
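For example, a concrete linter subclass might look like the following sketch; the "source" slot name is an assumption (see `potluck.contexts.Context` for the actual common slot names):

```python
from potluck import rubrics, context_utils

class NoTabsCheck(rubrics.LintCheck):
    """Hypothetical linter: flags tab characters in the raw source."""
    def check(self, context):
        source = context_utils.extract(context, "source")  # assumed slot
        if "\t" in source:
            return {
                "status": "failed",
                "explanation": "Tab characters found; indent with spaces."
            }
        return {
            "status": "accomplished",
            "explanation": "No tab characters were found."
        }
```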
3774 def __init__( 3775 self, 3776 taskid, 3777 identifier, 3778 description=( 3779 "BLANK LINT CHECK", 3780 "THIS GOAL HAS NOT BEEN DEFINED" 3781 ), 3782 goal_type="style", 3783 uses_slots=("scope",), 3784 **kwargs 3785 ): 3786 """ 3787 In addition to a task ID, an identifier, and a description, a 3788 goal type may be supplied other than the default "style". 3789 "procedure" is the most likely alternative. 3790 3791 The categorizer "link:" will be prepended to the identifier 3792 provided. 3793 3794 The slots required should be given as uses_slots, and a relevant 3795 context will be selected or created as the testing context. 3796 3797 Any extra arguments are passed through to the `Goal` constructor. 3798 """ 3799 # Auto context dependency based on uses_slots 3800 depends = contexts.auto(*uses_slots) 3801 if len(depends) == 1: 3802 test_context = depends[0] 3803 else: 3804 # TODO: De-duplicate stuff where one context actually 3805 # provides everything needed via inheritance but auto 3806 # doesn't see that? 3807 test_context = contexts.Context( 3808 description=( 3809 "Details of your code", 3810 ( 3811 "The " + phrasing.comma_list(uses_slots) 3812 + " of your code." 3813 ) 3814 ), 3815 builder=lambda ctx: ctx, 3816 depends=depends 3817 ) 3818 3819 if "test_in" not in kwargs: 3820 kwargs["test_in"] = {} 3821 if "contexts" not in kwargs["test_in"]: 3822 kwargs["test_in"]["contexts"] = [ test_context ] 3823 3824 # Specified goal type 3825 if "tags" not in kwargs: 3826 kwargs["tags"] = {} 3827 kwargs["tags"]["goal_type"] = goal_type 3828 3829 # Set up Goal stuff 3830 super().__init__( 3831 taskid, 3832 "lint:" + identifier, 3833 description, 3834 **kwargs 3835 )
In addition to a task ID, an identifier, and a description, a goal type may be supplied other than the default "style". "procedure" is the most likely alternative.
The categorizer "link:" will be prepended to the identifier provided.
The slots required should be given as uses_slots, and a relevant context will be selected or created as the testing context.
Any extra arguments are passed through to the `Goal` constructor.
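Instantiating the hypothetical subclass above might then look like:

```python
no_tabs = NoTabsCheck(
    "taskX",
    "no_tabs",  # becomes "lint:no_tabs"
    description=(
        "Indent with spaces",
        "Your code must not contain any tab characters."
    ),
    uses_slots=("source",)  # assumed slot name, as above
)
```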
3766 def check(self, context): 3767 """ 3768 Not implemented; override to define specific linters. 3769 """ 3770 raise NotImplementedError( 3771 "LintCheck is an abstract class that can't be used directly." 3772 )
Not implemented; override to define specific linters.
3841 def evaluate_in_context(self, context=None): 3842 """ 3843 Runs the checker and returns its result. 3844 """ 3845 context = context or {} 3846 3847 try: 3848 self.result = self.check(context) 3849 3850 if self.result is None: 3851 raise ValueError( 3852 f"Linter for {self.__class__.__name__} returned None!" 3853 ) 3854 except Exception: 3855 self.result = { 3856 "status": "failed", 3857 "traceback": html_tools.html_traceback( 3858 linkable=context_utils.linkmap(context) 3859 ) 3860 } 3861 self.set_explanation( 3862 context, 3863 status="crash", 3864 default=html_tools.html_traceback( 3865 title="Error while inspecting your code.", 3866 linkable=context_utils.linkmap(context) 3867 ) 3868 ) 3869 return self.result 3870 3871 self.set_explanation( 3872 context, 3873 default=self.result["explanation"] 3874 ) 3875 3876 return self.result
Runs the checker and returns its result.
3879class AllFunctionsHaveDocstrings(LintCheck): 3880 """ 3881 A `LintCheck` which requires that all functions defined in the 3882 submitted module must have non-empty docstrings. 3883 """ 3884 def __init__(self, taskid, exclude=None, **kwargs): 3885 """ 3886 A task ID is required. A list of function names to ignore may be 3887 given as `exclude`. All other keyword arguments are passed to the 3888 `LintCheck` constructor. If no description is specified, a 3889 default description will be included. 3890 3891 The identifier will be "docstrings". 3892 """ 3893 self.exclude = exclude or [] 3894 3895 if "description" not in kwargs: 3896 kwargs["description"] = ( 3897 "All functions are documented", 3898 ( 3899 "Each function you define must include a non-empty" 3900 + " documentation string as the very first thing in" 3901 + " the function." 3902 ) 3903 ) 3904 3905 super().__init__( 3906 taskid, 3907 "docstrings", 3908 uses_slots=["docstrings", "defs"], 3909 **kwargs 3910 ) 3911 3912 def check(self, context): 3913 """ 3914 Checks that none of the extracted docstrings are None or 3915 empty. Requires a context that has a "docstrings" slot. 3916 """ 3917 docmap = context_utils.extract(context, "docstrings") 3918 empty_docstrings = [] 3919 has_docstrings = [] 3920 for fname in sorted(docmap): 3921 if fname not in self.exclude and docmap[fname] == '': 3922 empty_docstrings.append(fname) 3923 elif fname not in self.exclude: 3924 has_docstrings.append(fname) 3925 3926 if empty_docstrings: 3927 if has_docstrings: 3928 return { 3929 "status": "partial", 3930 "explanation": ( 3931 "Some functions had docstrings but others" 3932 " didn't. Functions missing docstrings:" 3933 "<br>\n{}" 3934 ).format( 3935 '<br>\n'.join( 3936 '<code>{}</code>'.format(fname) 3937 for fname in empty_docstrings 3938 ) 3939 ) 3940 } 3941 else: 3942 return { 3943 "status": "failed", 3944 "explanation": ( 3945 "One or more functions were missing" 3946 " docstrings or had empty docstrings:" 3947 "<br>\n{}" 3948 ).format( 3949 '<br>\n'.join( 3950 '<code>{}</code>'.format(fname) 3951 for fname in empty_docstrings 3952 ) 3953 ) 3954 } 3955 else: 3956 return { 3957 "status": "accomplished", 3958 "explanation": ( 3959 "All required functions included docstrings." 3960 ) 3961 }
A `LintCheck` which requires that every function defined in the submitted module has a non-empty docstring.
3884 def __init__(self, taskid, exclude=None, **kwargs): 3885 """ 3886 A task ID is required. A list of function names to ignore may be 3887 given as `exclude`. All other keyword arguments are passed to the 3888 `LintCheck` constructor. If no description is specified, a 3889 default description will be included. 3890 3891 The identifier will be "docstrings". 3892 """ 3893 self.exclude = exclude or [] 3894 3895 if "description" not in kwargs: 3896 kwargs["description"] = ( 3897 "All functions are documented", 3898 ( 3899 "Each function you define must include a non-empty" 3900 + " documentation string as the very first thing in" 3901 + " the function." 3902 ) 3903 ) 3904 3905 super().__init__( 3906 taskid, 3907 "docstrings", 3908 uses_slots=["docstrings", "defs"], 3909 **kwargs 3910 )
A task ID is required. A list of function names to ignore may be given as `exclude`. All other keyword arguments are passed to the `LintCheck` constructor. If no description is specified, a default description will be included.
The identifier will be "docstrings".
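A construction sketch (the task ID and excluded name are hypothetical):

```python
from potluck import rubrics

docs_goal = rubrics.AllFunctionsHaveDocstrings(
    "taskX",
    exclude=["main"]  # hypothetically exempt a main() entry point
)
```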
3912 def check(self, context): 3913 """ 3914 Checks that none of the extracted docstrings are None or 3915 empty. Requires a context that has a "docstrings" slot. 3916 """ 3917 docmap = context_utils.extract(context, "docstrings") 3918 empty_docstrings = [] 3919 has_docstrings = [] 3920 for fname in sorted(docmap): 3921 if fname not in self.exclude and docmap[fname] == '': 3922 empty_docstrings.append(fname) 3923 elif fname not in self.exclude: 3924 has_docstrings.append(fname) 3925 3926 if empty_docstrings: 3927 if has_docstrings: 3928 return { 3929 "status": "partial", 3930 "explanation": ( 3931 "Some functions had docstrings but others" 3932 " didn't. Functions missing docstrings:" 3933 "<br>\n{}" 3934 ).format( 3935 '<br>\n'.join( 3936 '<code>{}</code>'.format(fname) 3937 for fname in empty_docstrings 3938 ) 3939 ) 3940 } 3941 else: 3942 return { 3943 "status": "failed", 3944 "explanation": ( 3945 "One or more functions were missing" 3946 " docstrings or had empty docstrings:" 3947 "<br>\n{}" 3948 ).format( 3949 '<br>\n'.join( 3950 '<code>{}</code>'.format(fname) 3951 for fname in empty_docstrings 3952 ) 3953 ) 3954 } 3955 else: 3956 return { 3957 "status": "accomplished", 3958 "explanation": ( 3959 "All required functions included docstrings." 3960 ) 3961 }
Checks that none of the extracted docstrings are None or empty. Requires a context that has a "docstrings" slot.
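A rough sketch of how `check` responds to a hand-built context, assuming `context_utils.extract` accepts a plain dictionary with the needed slot:

```python
result = docs_goal.check({
    "docstrings": {
        "area": "Computes the area of a rectangle.",  # documented
        "main": "",    # excluded above, so ignored
        "helper": ""   # empty docstring -> flagged
    }
})
print(result["status"])  # "partial": some documented, some not
```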
3964class FunctionsArentNested(LintCheck): 3965 """ 3966 A `LintCheck` which requires that no functions are defined inside 3967 other functions. 3968 """ 3969 def __init__(self, taskid, exclude=None, **kwargs): 3970 """ 3971 A task ID is required. A list of function names to exclude from 3972 the check may be provided. These functions will be ignored if 3973 they are nested, and functions nested inside them will not be 3974 flagged. 3975 3976 The identifier will be "functions_arent_nested". 3977 """ 3978 self.exclude = exclude or [] 3979 3980 if "description" not in kwargs: 3981 kwargs["description"] = ( 3982 "Do not define functions inside of other functions", 3983 ( 3984 "None of your function definitions may be placed" 3985 " inside of other function definitions." 3986 ) 3987 ) 3988 3989 super().__init__( 3990 taskid, 3991 "functions_arent_nested", 3992 uses_slots=["docstrings"], 3993 goal_type="procedure", 3994 **kwargs 3995 ) 3996 3997 def check(self, context): 3998 """ 3999 A linter function that checks a defs context to make sure 4000 that none of the definitions includes an interior def. 4001 """ 4002 filename = context_utils.extract(context, "filename") 4003 defsmap = context_utils.extract(context, "defs") 4004 task_info = context_utils.extract(context, "task_info") 4005 4006 has_nested = {} 4007 for name in defsmap: 4008 if name not in self.exclude: 4009 inners = defsmap[name].body 4010 for pat in patterns.ALL_DEF_PATTERNS: 4011 for inner_statement in inners: 4012 for (match, bindings) in mast.findall( 4013 inner_statement, 4014 pat 4015 ): 4016 if match.name not in self.exclude: 4017 has_nested.setdefault( 4018 name, 4019 set() 4020 ).add(match) 4021 4022 if has_nested: 4023 all_defs = set( 4024 [name for name in defsmap if name not in self.exclude] 4025 ) 4026 nested_defs = set() 4027 for outer in has_nested: 4028 nested_defs |= has_nested[outer] 4029 4030 pct_nested = len(nested_defs) / len(all_defs) 4031 4032 nested_msg = ( 4033 "We found the following functions defined within" 4034 + " other functions:<br>\n<ul>" 4035 + "\n".join( 4036 "<li>Within {} (on line {}):<br>{}</li>".format( 4037 outer, 4038 html_tools.html_link_to_line( 4039 task_info["id"], 4040 filename, 4041 defsmap[outer].lineno 4042 ), 4043 "<br>\n".join( 4044 "<code>{}</code> on line {}".format( 4045 inner.name, 4046 html_tools.html_link_to_line( 4047 task_info["id"], 4048 filename, 4049 inner.lineno 4050 ) 4051 ) 4052 for inner in has_nested[outer] 4053 ) 4054 ) 4055 for outer in has_nested 4056 ) 4057 ) 4058 4059 if pct_nested < 0.5: 4060 return { 4061 "status": "partial", 4062 "explanation": ( 4063 "Some relevant definitions were found inside" 4064 " other definitions. " 4065 ) + nested_msg 4066 } 4067 else: 4068 return { 4069 "status": "failed", 4070 "explanation": ( 4071 "More than half of relevant definitions were" 4072 " found within other definitions! " 4073 ) + nested_msg 4074 } 4075 4076 return { 4077 "status": "accomplished", 4078 "explanation": "No defs were found within other defs." 4079 } 4080 else: 4081 return { 4082 "status": "accomplished", 4083 "explanation": "No defs were found within other defs." 4084 }
A `LintCheck` which requires that no functions are defined inside other functions.
3969 def __init__(self, taskid, exclude=None, **kwargs): 3970 """ 3971 A task ID is required. A list of function names to exclude from 3972 the check may be provided. These functions will be ignored if 3973 they are nested, and functions nested inside them will not be 3974 flagged. 3975 3976 The identifier will be "functions_arent_nested". 3977 """ 3978 self.exclude = exclude or [] 3979 3980 if "description" not in kwargs: 3981 kwargs["description"] = ( 3982 "Do not define functions inside of other functions", 3983 ( 3984 "None of your function definitions may be placed" 3985 " inside of other function definitions." 3986 ) 3987 ) 3988 3989 super().__init__( 3990 taskid, 3991 "functions_arent_nested", 3992 uses_slots=["docstrings"], 3993 goal_type="procedure", 3994 **kwargs 3995 )
A task ID is required. A list of function names to exclude from the check may be provided. These functions will be ignored if they are nested, and functions nested inside them will not be flagged.
The identifier will be "functions_arent_nested".
    def check(self, context):
        """
        A linter function that checks a defs context to make sure
        that none of the definitions includes an interior def.
        """
        filename = context_utils.extract(context, "filename")
        defsmap = context_utils.extract(context, "defs")
        task_info = context_utils.extract(context, "task_info")

        has_nested = {}
        for name in defsmap:
            if name not in self.exclude:
                inners = defsmap[name].body
                for pat in patterns.ALL_DEF_PATTERNS:
                    for inner_statement in inners:
                        for (match, bindings) in mast.findall(
                            inner_statement,
                            pat
                        ):
                            if match.name not in self.exclude:
                                has_nested.setdefault(
                                    name,
                                    set()
                                ).add(match)

        if has_nested:
            all_defs = set(
                [name for name in defsmap if name not in self.exclude]
            )
            nested_defs = set()
            for outer in has_nested:
                nested_defs |= has_nested[outer]

            pct_nested = len(nested_defs) / len(all_defs)

            nested_msg = (
                "We found the following functions defined within"
                + " other functions:<br>\n<ul>"
                + "\n".join(
                    "<li>Within {} (on line {}):<br>{}</li>".format(
                        outer,
                        html_tools.html_link_to_line(
                            task_info["id"],
                            filename,
                            defsmap[outer].lineno
                        ),
                        "<br>\n".join(
                            "<code>{}</code> on line {}".format(
                                inner.name,
                                html_tools.html_link_to_line(
                                    task_info["id"],
                                    filename,
                                    inner.lineno
                                )
                            )
                            for inner in has_nested[outer]
                        )
                    )
                    for outer in has_nested
                )
            )

            if pct_nested < 0.5:
                return {
                    "status": "partial",
                    "explanation": (
                        "Some relevant definitions were found inside"
                        " other definitions. "
                    ) + nested_msg
                }
            else:
                return {
                    "status": "failed",
                    "explanation": (
                        "More than half of relevant definitions were"
                        " found within other definitions! "
                    ) + nested_msg
                }
        else:
            return {
                "status": "accomplished",
                "explanation": "No defs were found within other defs."
            }
A linter function that checks a defs context to make sure that none of the definitions includes an interior def.
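The heart of this check can be sketched outside of potluck using only the standard ast module. The following simplified stand-in for the mast-based pattern matching above flags definitions nested inside other definitions:

    import ast

    CODE = """
    def outer():
        def inner():  # nested definition: would be flagged
            return 1
        return inner()
    """

    tree = ast.parse(CODE)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for child in ast.walk(node):
                if (
                    child is not node
                    and isinstance(
                        child, (ast.FunctionDef, ast.AsyncFunctionDef)
                    )
                ):
                    print(f"{child.name!r} is nested inside {node.name!r}")

Running this prints that 'inner' is nested inside 'outer'; the real check additionally resolves exclusions and builds linked HTML explanations.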
class DoesntWasteFruit(LintCheck):
A `LintCheck` that makes sure the result of every fruitful function or method call is stored in a variable or used as part of a larger expression. A fruitful function or method is one of:

1. Defined in the submission itself with an interior return node that has an expression associated with it, which isn't inside a nested definition.
2. One of the functions named in the `extra` list of strings, or a method named in that list with a '.' at the start.
This goal will fail if at least one function call to a fruitful function or method doesn't use the result, but will partially succeed if there's at least one that does use the result.
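For concreteness, here is a small illustration (not part of the rubric machinery itself) of calls this goal would and wouldn't flag:

    def biggest(numbers):
        return max(numbers)  # fruitful: returns a value

    biggest([3, 1, 2])           # bad: the result is thrown away
    top = biggest([3, 1, 2])     # good: the result is stored
    print(biggest([3, 1, 2]))    # good: the result feeds a larger expression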
    def __init__(self, taskid, exclude=None, extra=None, **kwargs):
        """
        A task ID is required. A list of strings specifying names of
        functions to exclude from this check may be given. The code in
        those functions won't be inspected for wasting fruit, but calls
        to those functions in other contexts will still be inspected if
        they're fruitful.

        A description tuple can be supplied, but a reasonable default
        will be used if it isn't given.

        The identifier will be "doesnt_waste_fruit".
        """
        self.exclude = exclude or []
        self.extra = extra or []

        if "description" not in kwargs:
            kwargs["description"] = (
                (
                    "Do not ignore the results of any fruitful function"
                    " calls"
                ),
                (
                    "According to the \"Don't waste fruit\" principle,"
                    " every place you call a fruitful function"
                    " (built-in or custom) you must store the result in"
                    " a variable, or that function call must be part of"
                    " a larger expression that uses its return value."
                )
            )

        super().__init__(
            taskid,
            "doesnt_waste_fruit",
            uses_slots=["scope"],
            goal_type="procedure",
            **kwargs
        )
A task ID is required. A list of strings specifying names of functions to exclude from this check may be given. The code in those functions won't be inspected for wasting fruit, but calls to those functions in other contexts will still be inspected if they're fruitful.
A description tuple can be supplied, but a reasonable default will be used if it isn't given.
The identifier will be "doesnt_waste_fruit".
    def check(self, context):
        """
        Returns success if none of the fruitful function and/or method
        calls in the given AST tree has a result but fails to either
        store it in a variable or use it as part of a larger expression
        or statement.
        """
        filename = context_utils.extract(context, "filename")
        scope = context_utils.extract(context, "scope")
        task_info = context_utils.extract(context, "task_info")

        # Variables to accumulate results
        fruitful_defs = {}

        used_calls = set()
        unused_calls = set()

        # Maps from function names (or method names prefixed with '.') to
        # AST Call nodes for good calls (fruitful functions called in a
        # way that uses their result) and bad calls (fruitful functions
        # called as bare expressions).
        good_calls = {}
        bad_calls = {}

        # Gather fruitful definitions
        for pat in patterns.ALL_DEF_PATTERNS:
            for (matching_node, bindings) in mast.findall(scope, pat):
                if mast.find(
                    matching_node.body,  # so we don't exclude this def itself
                    "return _",
                    outside=patterns.ALL_DEF_PATTERNS
                ):
                    fruitful_defs[matching_node.name] = matching_node

        # Search entire code for used/unused function or method calls:
        self.accumulate_function_and_method_calls(
            scope,
            used_calls,
            unused_calls,
            self.exclude
        )

        # Find bad unused calls to fruitful functions
        for call in unused_calls:
            # Get the name of the function we're calling
            if isinstance(call.func, ast.Name):
                # A direct function call
                fname = call.func.id
                mname = fname
            elif isinstance(call.func, ast.Attribute):
                # A method call
                fname = call.func.attr
                mname = '.' + fname
            else:
                # Too complex to analyze; skip this function call
                continue

            # Decide if this call is bad or not:
            if (
                mname in self.extra
                or fname in fruitful_defs
            ):
                bad_calls.setdefault(mname, []).append(call)

        # Find good used calls to fruitful functions
        for call in used_calls:
            # Get the name of the function we're calling
            if isinstance(call.func, ast.Name):
                # A direct function call
                fname = call.func.id
                mname = fname
            elif isinstance(call.func, ast.Attribute):
                # A method call
                fname = call.func.attr
                mname = '.' + fname
            else:
                # Too complex to analyze; skip this function call
                continue

            # Decide if this call is good or not:
            if (
                mname in self.extra
                or fname in fruitful_defs
            ):
                good_calls.setdefault(mname, []).append(call)

        # Report results
        if len(bad_calls) > 0:
            bad_call_report = (
                "We found the following calls to fruitful functions"
                + " whose results were ignored:\n<ul>{}</ul>"
            ).format(
                "\n".join(
                    "<li><code>{}</code> on line(s) {}</li>".format(
                        fname,
                        ", ".join(
                            html_tools.html_link_to_line(
                                task_info["id"],
                                filename,
                                call.lineno
                            )
                            for call in bad_calls[fname]
                        )
                    )
                    for fname in bad_calls
                )
            )

            if len(good_calls) == 0:
                return {
                    "status": "failed",
                    "explanation": (
                        "Your code used fruitful functions but ignored"
                        + " their results. "
                    ) + bad_call_report
                }
            else:
                return {
                    "status": "partial",
                    "explanation": (
                        "Your code used some fruitful functions but"
                        + " ignored their results. "
                    ) + bad_call_report
                }
        else:  # no bad calls!
            return {
                "status": "accomplished",
                "explanation": (
                    "All calls to fruitful functions in your code"
                    + " correctly made use of their results."
                )
            }
Returns success if every fruitful function and/or method call in the given AST tree either stores its result in a variable or uses it as part of a larger expression or statement.
    def accumulate_function_and_method_calls(
        self,
        node,
        used,
        unused,
        exclude=[]
    ):
        """
        Recursively accumulates used and unused function and method
        calls. Ignores function calls where the function being called is
        the result of an expression that's not an ast.Name or an
        ast.Attribute.

        The 'used' and 'unused' parameters are treated as sets of AST
        nodes.

        The `exclude` parameter is optional and lists functions whose
        definitions won't be inspected.
        """
        # We won't process things which come up in recursion that aren't
        # AST nodes (like strings, None, etc.). Note that when we recurse
        # we make sure to recurse into the AST nodes within lists like
        # bodies.
        if not isinstance(node, ast.AST):
            return

        # If this is a function call that hasn't already been marked as
        # unused, mark it as used
        if isinstance(node, ast.Call) and node not in unused:
            # Only add it if it's a simple call to a function or method
            if isinstance(node.func, (ast.Name, ast.Attribute)):
                used.add(node)

        # Don't recurse or process statements if we're the definition of
        # an excluded function
        if isinstance(node, ast.FunctionDef) and node.name in exclude:
            return

        # Gather places to look for calls that qualify as unused:
        statements = []
        if isinstance(
            node,
            (
                ast.Module,
                ast.FunctionDef,
                ast.ClassDef,
                ast.ExceptHandler,
                ast.With
            )
        ):
            # A node that has a body
            statements = node.body

        elif isinstance(node, (ast.If, ast.For, ast.While)):
            # We need to inspect both the body and the orelse
            statements = node.body + node.orelse

        elif isinstance(node, ast.Try):
            # Inspect body, finalbody, and orelse (handlers will be
            # inspected when recursing on them)
            statements = node.body + node.finalbody + node.orelse

        # No other AST nodes define blocks, so they can't give rise to
        # unused function/method calls.

        # Inspect the block-level statements for unused expressions
        # TODO: Should we negate this? ANY expression which isn't a
        # function call to a non-fruitful function is wasting a value
        # when it appears as a statement...
        for statement in statements:
            if (
                isinstance(statement, ast.Expr)
                and isinstance(statement.value, ast.Call)
            ):
                call = statement.value
                if isinstance(call.func, (ast.Name, ast.Attribute)):
                    unused.add(call)
                # else ignore this call; it's too complex

        # Recurse to accumulate results from inner nodes
        for field in node._fields:
            if not hasattr(node, field):  # skip missing fields
                continue

            child = getattr(node, field)
            if isinstance(child, list):  # recurse into each element
                for child_part in child:
                    self.accumulate_function_and_method_calls(
                        child_part,
                        used,
                        unused,
                        exclude
                    )
            else:  # Just recurse into this item
                self.accumulate_function_and_method_calls(
                    child,
                    used,
                    unused,
                    exclude
                )
Recursively accumulates used and unused function and method calls. Ignores function calls where the function being called is the result of an expression that's not an ast.Name or an ast.Attribute.
The 'used' and 'unused' parameters are treated as sets of AST nodes.
The `exclude` parameter is optional and lists functions whose definitions won't be inspected.
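The key observation behind this method is that a call whose result is discarded appears in the AST as an ast.Expr statement wrapping an ast.Call. A minimal standalone demonstration using only the standard ast module:

    import ast

    CODE = """
    x = len("abc")   # used: part of an assignment
    len("abcdef")    # unused: a bare expression statement
    """

    tree = ast.parse(CODE)
    for stmt in tree.body:
        if isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
            func = stmt.value.func
            if isinstance(func, (ast.Name, ast.Attribute)):
                name = func.id if isinstance(func, ast.Name) else "." + func.attr
                print(f"unused call to {name} on line {stmt.lineno}")

This reports only the bare `len("abcdef")` call; the method above applies the same test inside every kind of statement block, not just the module body.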
class DoesntWasteBoxes(LintCheck):
A `LintCheck` which looks for unused variables, excluding a list of strings (named functions won't be inspected at all, and named variables won't be counted as unused). The given partial tolerance value controls how many unused variables are tolerated (with a partial result) before the goal is failed outright. Set it to 0 to force a strict binary accomplished/failed result.

The special name '_' will always be permitted, as it explicitly hints that a value will not be used. By default, loop variables will not be checked, although they can be inspected by setting `check_loop_vars` to True.

An unused variable is defined as a variable which is set but never loaded, which we detect via ast.Name nodes and ast.FunctionDef/ast.Lambda arguments and the presence of Store vs. Load contexts. This goal will happily accept load-before-store, but other parts of your rubric will probably notice the code crashing when run...
Note that our handling of scopes is primitive: we recognize the global scope and function def scopes, but not all the nuances of other scopes.
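For example, under this check the following (hypothetical) snippet would flag `total` as unused, while `_` is always permitted:

    def summarize(values):
        total = sum(values)   # stored but never loaded: flagged as unused
        _ = values.count(0)   # explicitly discarded: always allowed
        return len(values)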
    def __init__(
        self,
        taskid,
        exclude=None,
        tolerance=2,
        check_loop_vars=False,
        **kwargs
    ):
        """
        A task ID is required. A list of strings specifying names of
        functions and/or variables to exclude from this check may be
        given. Excluded functions won't have their code inspected, and
        excluded variables won't be checked.

        The identifier will be "doesnt_waste_boxes".

        A category other than the default 'core' may also be specified.

        A tolerance (resulting in partial instead of complete failure)
        other than the default of 2 may be specified.

        A custom description tuple may be supplied, but a default
        description will be added if a custom one isn't provided.
        """
        self.exclude = exclude or []  # normalize None so that set(self.exclude)
                                      # and membership tests in check() work
        self.partial_tolerance = tolerance
        self.check_loop_vars = check_loop_vars

        if "description" not in kwargs:
            kwargs["description"] = (
                (
                    "Do not create any variables that you never make"
                    " use of"
                ),
                (
                    "According to the \"Don't waste boxes\" principle,"
                    " every time you create a variable (using"
                    " <code>=</code> or by defining a parameter for a"
                    " function) you must also later use that variable"
                    " as part of another expression. If you need to"
                    " create a variable that you won't use, it must"
                    " have the name <code>_</code>, but you should only"
                    " do this if absolutely necessary."
                )
            )

        super().__init__(
            taskid,
            "doesnt_waste_boxes",
            uses_slots=["scope"],
            goal_type="procedure",
            **kwargs
        )
A task ID is required. A list of strings specifying names of functions and/or variables to exclude from this check may be given. Excluded functions won't have their code inspected, and excluded variables won't be checked.
The identifier will be "doesnt_waste_boxes".
A category other than the default 'core' may also be specified.
A tolerance (resulting in partial instead of complete failure) other than the default of 2 may be specified.
A custom description tuple may be supplied, but a default description will be added if a custom one isn't provided.
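Putting the parameters together, a construction sketch (the task ID and excluded name are hypothetical):

    from potluck import rubrics

    goal = rubrics.DoesntWasteBoxes(
        "example_task",        # hypothetical task ID
        exclude=["scratch"],   # don't inspect scratch(); permit a variable
                               # named scratch to go unused
        tolerance=0,           # any unused variable fails the goal outright
        check_loop_vars=True   # also flag never-used for-loop variables
    )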
    def check(self, context):
        """
        A checker function which requires that there are no unused
        variables in the given scope or in any particular function
        definition scope inside it (more complex scoping rules aren't
        attended to).
        """
        node = context_utils.extract(context, "scope")
        task_info = context_utils.extract(context, "task_info")
        filename = context_utils.extract(context, "filename")

        # Variable to track scopes (see gather_loads_and_stores)
        scopes = {}

        # Find all Name nodes plus arguments, noting which scope(s) they
        # are a part of.
        self.gather_loads_and_stores(
            node,
            scopes,
            exclude=self.exclude,
            include_loop_vars=self.check_loop_vars
        )

        # Report string and count of unused variables
        report = "Found the following variables that were never used:\n<ol>\n"
        num_unused = 0

        # Check each scope to look for stores that don't have
        # corresponding loads and assemble our report:
        for scope in scopes:
            missing_loads = (
                set(scopes[scope].get('store', {}))
                - scopes[scope].get('load', set())
                - set(self.exclude)
                - { '_' }  # single-underscore is valid for an unused result
            )

            if missing_loads:
                num_unused += len(missing_loads)
                if scope == ("__global__",):
                    scope_repr = "the global scope"
                else:
                    scope_repr = ' → '.join(
                        "<code>{}</code>".format(sc)
                        for sc in scope[1:]
                    )

                report += "<li>In {}, found:\n<ol>\n{}\n</ol></li>\n".format(
                    scope_repr,
                    "\n".join(
                        "<li>Variable <code>{}</code> on line(s) {}</li>\n"
                        .format(
                            var,
                            ", ".join(
                                html_tools.html_link_to_line(
                                    task_info["id"],
                                    filename,
                                    node.lineno
                                )
                                for node in scopes[scope]['store'][var]
                            )
                        )
                        for var in missing_loads
                    )
                )

        # Succeed or fail
        if num_unused > 0:
            if num_unused > self.partial_tolerance:
                status = "failed"
            else:
                status = "partial"

            return {
                "status": status,
                "explanation": (
                    "Your code created {} variables which it did not"
                    + " make use of:\n{}"
                ).format(num_unused, report)
            }
        else:
            return {
                "status": "accomplished",
                "explanation": (
                    "Your code did not create any variables which it did"
                    + " not make use of."
                )
            }
A checker function which requires that there are no unused variables in the given scope or in any particular function definition scope inside it (more complex scoping rules aren't attended to).
    def gather_loads_and_stores(
        self,
        node,
        result,
        current_scopes=("__global__",),
        exclude=[],
        include_loop_vars=True
    ):
        """
        Recursively traverses an AST and makes note of each time a Name
        appears, including its Load or Store context. Accumulates results
        into the 'result' dictionary, which has scope-name-tuples (e.g.,
        ("__global__",) or ("__global__", "foo", "<lambda at line 12 col
        8>")) as keys and values which are dictionaries with 'load' and
        'store' keys. The 'load' value is a set of variable names, while
        the 'store' value is a dictionary mapping variable names to lists
        of AST nodes.

        If `include_loop_vars` is set to False (default is True), loop
        variables of for loops will not be included.

        As it traverses the AST tree, the current_scopes tuple indicates
        which scope(s) we're inside of. We add loads to all parent scopes
        but stores just to the innermost scope. Note that we aren't
        really keeping track of shadowing properly, so a shadowed global
        variable would still think that it's referenced even if it's not
        (TODO: fix that!)
        """
        # We won't process non-AST items
        if not isinstance(node, ast.AST):
            return

        # Don't process if we're the definition of an excluded function
        if isinstance(node, ast.FunctionDef) and node.name in exclude:
            return

        # Process this node if it's a Name...
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                for i in range(1, len(current_scopes) + 1):
                    result.setdefault(current_scopes[:i], {})\
                        .setdefault('load', set())\
                        .add(node.id)
            elif isinstance(node.ctx, ast.Store):
                result.setdefault(current_scopes, {})\
                    .setdefault('store', {})\
                    .setdefault(node.id, [])\
                    .append(node)
            # Note: we don't track Del-context Name references

        # If this node is a FunctionDef or Lambda, it creates a scope,
        # and we've also got to add its arguments as stored variables.
        if isinstance(node, (ast.FunctionDef, ast.Lambda)):
            if isinstance(node, ast.FunctionDef):
                inner_scopes = current_scopes + (node.name,)
            else:  # Lambdas create anonymous scopes
                scope_id = "<lambda at line {} col {}>".format(
                    node.lineno,
                    node.col_offset
                )
                # Extend the current scope tuple (rather than replacing
                # it) so lambda scopes nest like named function scopes
                inner_scopes = current_scopes + (scope_id,)

            for arg in (
                # Note: some relevant Python versions don't have posonlyargs
                getattr(node.args, "posonlyargs", [])
                + node.args.args
                + node.args.kwonlyargs
                + ([node.args.vararg] if node.args.vararg else [])
                + ([node.args.kwarg] if node.args.kwarg else [])
            ):
                result.setdefault(inner_scopes, {})\
                    .setdefault('store', {})\
                    .setdefault(arg.arg, [])\
                    .append(node)
        else:
            # Otherwise, the inner scopes are the same as the current scopes
            inner_scopes = current_scopes

        # Recurse to accumulate results from inner nodes
        for field in node._fields:
            if not hasattr(node, field):  # skip missing fields
                continue

            # Skip the target of a for loop if include_loop_vars is False
            if (
                not include_loop_vars
                and isinstance(node, (ast.For, ast.AsyncFor))
                and field == "target"
            ):
                continue

            child = getattr(node, field)
            if isinstance(child, list):  # recurse into each element
                for child_part in child:
                    self.gather_loads_and_stores(
                        child_part,
                        result,
                        inner_scopes,
                        exclude,
                        include_loop_vars
                    )
            else:  # Just recurse into this item
                self.gather_loads_and_stores(
                    child,
                    result,
                    inner_scopes,
                    exclude,
                    include_loop_vars
                )
Recursively traverses an AST and makes note of each time a Name appears, including its Load or Store context. Accumulates results into the 'result' dictionary, which has scope-name-tuples (e.g., ("__global__",) or ("__global__", "foo", "<lambda at line 12 col 8>")) as keys and values which are dictionaries with 'load' and 'store' keys. The 'load' value is a set of variable names, while the 'store' value is a dictionary mapping variable names to lists of AST nodes.

If `include_loop_vars` is set to False (default is True), loop variables of for loops will not be included.
As it traverses the AST tree, the current_scopes tuple indicates which scope(s) we're inside of. We add loads to all parent scopes but stores just to the innermost scope. Note that we aren't really keeping track of shadowing properly, so a shadowed global variable would still think that it's referenced even if it's not (TODO: fix that!)
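To see the Load/Store bookkeeping this method relies on, here is a stripped-down, standalone sketch using only the standard ast module. It collapses the scope handling to a single function rather than building scope tuples, so it is an illustration of the idea, not the potluck implementation:

    import ast

    CODE = """
    def f(x):
        unused = x + 1
        return x
    """

    loads, stores = set(), {}
    tree = ast.parse(CODE)
    func = tree.body[0]
    for node in ast.walk(func):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loads.add(node.id)
            elif isinstance(node.ctx, ast.Store):
                stores.setdefault(node.id, []).append(node.lineno)

    # Parameters count as stores too, mirroring the argument handling above
    for arg in func.args.args:
        stores.setdefault(arg.arg, [])

    never_loaded = set(stores) - loads - {"_"}
    print(never_loaded)  # {'unused'}

Here `x` is stored (as a parameter) and loaded (in the body), so only `unused` survives the subtraction, matching what the full check would report.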