CddMeeting 31 01 2008

Revision as of 00:50, 4 February 2008 by Mogh (Talk | contribs) (Stefan)

CDD Meeting, Tuesday, 31.1.2008, 10:00

Next Meeting

  • Thursday, 5.2.2008, 10:00

Tasks

Andreas

  • Forumulate Experiment Hypothesis (Andreas)
  • Fix AutoTest for courses
    • New release
  • Write documentation and videos tutorials (together with final release)
  • [done] Commit dangling patch from 6.0 to 6.1
  • [done] Make it so that tester target never has extraction or execution enabled
    • remove hack from CDD_MANAGER.schedule_testing_restart
  • [done] Make CDD Windows apear by default
  • Finish tuple_002 test case
  • Retest if test cases with errors are properly ignored (after 6.1 port)

Arno

  • When test class gets removed manually, update test suite
  • Clean up test case in interpreter after each execution (through garbage collection?)
  • Build releasable delivery for Linux (after each Beta I guess...)
  • Display ignored test class compilation errors (looks like we will have this for free in 6.1)
  • Make sure CDD Tools are visible by default (what layout would you prefer?)
    • Main tool shares tabs with clusters/features tool, output tool after C output tool
  • Red bg for failing test cases in view
  • Write new simple "New Manual Test Case" dialog
  • Tesy case for (user defined) expanded types
  • test case containing feature names with underscores and "like Current"


Bug Fixing

  • Result type (like Current) produces syntax error in new test class
  • Fix interpreter hang after runtime crash
  • Check why EiffelStudio quits after debugging a test routine and ignoring violations
  • Check if interpreter compilation errors are propagated correctly (seems to start interpreter even though compilation has failed)

Ilinca

  • Integrate variable declarations into AutoTest trunk (by 8.2.2008)

Stefan

  • [DONE] Uniqe id to tag test cases with. To be used in logs. So test logs are resiliant to test class renamings
  • [DONE, except probably for bad memory corruption bugs which didn't occur anymore since agents are ignored] Make popup on interpreter crash go away (win32 only)
  • [RECURRENT] Build releasable delivery on Windows
  • Logging
    • What data to log?
    • Implement storing
    • Define how students should submit logs
  • Data Gathering
    • Define what data to gather
    • Define how to process gather data
  • Second Chance re-run to find true prestate (with Jocelyn)
  • Allow for test case extraction of passing routine invocations (with Jocelyn)
  • Rebuilding manual test suite through extraction and synthesizing
  • Find performance bottleneck of test case extraction and propose extraction method for second chance

Bugs/Things to look at

  • For big projects (like ES itself) background compilation of the interpreter leads to completely unresponsive ES
  • Is it still necessary to ever call the routine update actions with argument "void"?
  • Crash upon closing of EiffelStudio (feature call on void target in breakpoint tool)

Manu

  • Define Project for SoftEng (due by next meeting)
    • Find System level test suite for us to test students code
    • Find project with pure functional part
  • Install CDD in student labs (Manu)
  • Devise questionnaires
    • Initial (due next meeting after Manu's vacation)
    • Midterm
    • Final
  • Analyze questionnaires
  • Rework example profiles
  • Assis will use CDD to get a feel for it and create a test suite for the students to start with

Unassigned

  • Cache debug values when extracting several test cases.
  • Enable execution and extraction by default for new projects.
  • Make CDD Window and CDD Log Window visiable by default
  • "Debug selected test routine" should be grayed out if no test case is currently selected
  • Testing V2 Application should not interupt flow
  • Retest if test cases with errors are properly ignored (after 6.1 port)
  • Extraction for inline agents not currently working (at least not always)
    • Create inline agent test case
    • Fix extraction for inline agents
  • Revive system level test suite

Questionnaires

  • Use ELBA

Software Engineering Project

  • One large project, but divided into testable subcomponents
  • Students required to write test cases
  • Fixed API to make things uniformly testable
  • Public/Secret test cases (similar to Zeller course)
  • Competitions:
    • Group A test cases applied to Group A project
    • Group A test cases applied to Groupt B project
  • Idea how to cancel out bias while allowing fair grading:
    • Subtasks 1 and 2, Students divided into groups A and B
    • First both groups do 1, A is allowed to use tool, B not
    • Then both groups do 2, B is allowed to use tool, A not
    • Bias cancelation:
      • Project complexity
      • Experience of students
      • Experience gained in first subtask, when developing second
      • Risk: One task might be better suited for the tool than the other

Data to harvest

  • IDE Time with CDD(extraction) enabled / IDE Time with CDD(extraction) disabled
  • Test Case Source (just final version, or all versions?)
    • Use Profiler to get coverage approximation
  • TC Meta Data (with timestamps -> Evolution of Test Case)
    • TC Added/Removed/Changed
    • TC Outcome (transitions from FAIL/PASS/UNRESOLVED[bad_communication <-> does_not_compile <-> bad_input])
    • TC execution time
    • Modificiations to a testcase (compiler needs to recompile)
  • Development Session Data
    • IDE Startup
    • File save
  • Questionnairs
    • Initial
    • Final


Logging

  • "Meta" log entries
    • Project opened (easy)
    • CDD enable/disable (easy)
    • general EiffelStudio action log entries for Developer Behaviour (harder... what do we need??)
  • CDD actions log entries
    • Compilation of interpreter (start, end, duration)
    • Execution of test cases (start, end, do we need individual duration of each test cases that gets executed?)
    • Extraction of new test case (extraction time)
  • Test Suite Status
    • Test suite: after each refresh log list of all test cases (class level, needed because it's not possible to know when manual test cases get added...)
    • Test class: (do we need info on this level)
    • Test routine: status (basically as you see it in the tool)

Experiment Hypotheses

Do Contracts improve Tests?

  • Is there a correlation between Tests quantity or quality and the quantity or quality of contracts?

Corellation between failure/fault type and test type?

  • Do certain kind of tests find certain kind of failures/faults?

Use of CDD increases development productivity

  • Did the use of testing decrease development time?
  • Meassures:
    • Number of compilations
    • Number of saves
    • Number of revisions
    • IDE time
    • Asking the students

Emphasis on quetionnair result. Correlation with logs only if it makes sense

Use of CDD increases code correctness

  • Is there a relation between code correctness of project (vs. some system level test suite) and test activity?
  • Measures:
    • number of tests
    • number of times test were run
    • Number of pass/fail, fail/pass transitions, (also consider unresolved/* transitions ?)
    • Secret test suite

Developer Profile: Is there a correlation between Developer Profile and the way they use testing tools

  • How did students use the testing tools?
  • Are ther clusters of similar use?
  • What is charactersitic for these clusters?
  • Meassures:
    • Aksing students before and after
    • Are there projects where tests initially always fail resp. pass
    • How often do they test?
    • How correct is their project?

Midterm questionnaire will be used to phrase questions for final questionnaire.

Example profiles

  • Waldundwiesen Hacker
    • No explicit structure. Does whatever seems appriorate at the time. No QA plan.
  • Agile
    • Processes interleave. Conscionsness for QA. Maybe even Test First or TDD.
  • Waterfall inspired
    • Explicit process model. Phases don't interleave.
  •  ?

How do extracted, synthesized and manually written test cases compare?

  • Which tests are the most useful to students?
  • How many tests are there in each category?
  • What's the test suite quality of each category?
  • Were some excluded from testing more often than others?
  • How many red/green and green/red transitions are there in each category?
  • Which had compile-time errors most often that did not get fixed?
  • Meassures:
    • LOC
    • Number of tests
    • Number of executions
    • Outcome transitions