CddMeeting 05 02 2008

Revision as of 02:20, 6 February 2008 by Aleitner (Beta Tester Feedback)

CDD Meeting, Tuesday, 05.02.2008, 10:00

Next Meeting

  • Tuesday, 12.02.2008, 10:00

Tasks

Andreas

  • Formulate Experiment Hypothesis (Andreas)
  • Fix AutoTest for courses
    • New release
  • Write documentation and video tutorials (together with final release)
  • Finish tuple_002 test case
  • Retest if test cases with errors are properly ignored (after 6.1 port)
  • Add timeout judgement
  • Timeout -> 5 sec

Arno

  • When test class gets removed manually, update test suite
  • Build releasable delivery for Linux (after each Beta I guess...)
  • Display ignored test class compilation errors (looks like we will have this for free in 6.1)
  • Red bg for failing test cases in view
  • When debugging extracted test case, set first breakpoint in "covers." feature
  • Extraction for inline agents not currently working (at least not always)
    • Create inline agent test case
    • Fix extraction for inline agents


Bug Fixing

  • Result type (like Current) produces syntax error in new test class
  • Check why EiffelStudio quits after debugging a test routine and ignoring violations

Ilinca

  • Integrate variable declarations into AutoTest trunk (by 8.2.2008)

Stefan

  • [RECURRENT] Build releasable delivery on Windows
  • Distinguish extracted, synthesized and manual test cases in logs
  • Log TS Snapshot after compilation
  • Log TS Snapshot after testing
  • Log when ES starts up and shuts down
  • Log time it takes to extract test case
  • Log time it takes to compile SUT
  • Log time it takes to compile test suite
  • Log original exception (make it part of test routine's state)
  • Second Chance re-run to find true prestate (with Jocelyn)
  • Allow for test case extraction of passing routine invocations (with Jocelyn)
  • Revive system level test suite
  • Rebuilding manual test suite through extraction and synthesizing
  • Find performance bottleneck of test case extraction and propose extraction method for second chance

Bugs/Things to look at

  • For big projects (like ES itself) background compilation of the interpreter leads to completely unresponsive ES
  • Crash upon closing of EiffelStudio (feature call on void target in breakpoint tool)

Manu

  • Install CDD in student labs (Manu)
  • Devise questionnaires
    • Initial (due next meeting after Manu's vacation)
    • Midterm
    • Final
  • Analyze questionnaires
  • Rework example profiles
  • Assistants will use CDD to get a feel for it and create a test suite for the students to start with

Bernd

  • Define Project for SoftEng
    • Find test suite for us to test students' code
    • Find project with pure functional part

Unassigned

  • Only execute unresolved test cases once. Disable them afterwards. (Needs discussion)
  • Cache debug values when extracting several test cases.

Beta Tester Feedback

(Please put your name so we can get back to you in the case of questions)

  • It should be possible to set the location of the cdd_tests directory (what if location of .ecf file is not readable?)
    • home directory? application_data directory?
  • There should be UI support for deletion of Test Case
  • [BUG] the manual test case creation dialog should check if class with chosen name is already in the system
  • It would be nice if there was a way to configure the timeout for the interpreter

Questionnaires

  • Use ELBA

Software Engineering Project

  • Task 1: Implement VCard API
  • Task 2: Implement Mime API
  • Task 3: Write test cases to reveal faults in foreign VCard implementations
  • Task 4: Write test cases to reveal faults in foreign Mime implementations
  • Group A:
    • Task 1, Manual Tests
    • Task 2, Extracted Tests
    • Task 3, Manual Tests
    • Task 4, Extracted Tests
  • Group B:
    • Task 1, Extracted Tests
    • Task 2, Manual Tests
    • Task 3, Extracted Tests
    • Task 4, Manual Tests
  • One large project, but divided into testable subcomponents
  • Students required to write test cases
  • Fixed API to make things uniformly testable
  • Public/Secret test cases (similar to Zeller course)
  • Competitions:
    • Group A test cases applied to Group A project
    • Group A test cases applied to Group B project
  • Idea how to cancel out bias while allowing fair grading:
    • Subtasks 1 and 2, Students divided into groups A and B
    • First both groups do 1, A is allowed to use tool, B not
    • Then both groups do 2, B is allowed to use tool, A not
    • Bias cancelation:
      • Project complexity
      • Experience of students
      • Experience gained in first subtask, when developing second
      • Risk: One task might be better suited for the tool than the other
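
The two-group, two-subtask idea above could be sketched as follows. This is only an illustration under assumptions: the function names, the random split, and the seed are hypothetical and were not decided in the meeting.

```python
import random

# Which group may use the tool in which subtask, as described in the notes:
# subtask 1 -> group A uses the tool, subtask 2 -> group B uses the tool.
TOOL_ALLOWED = {
    (1, "A"): True,
    (1, "B"): False,
    (2, "A"): False,
    (2, "B"): True,
}


def assign_groups(students, seed=0):
    """Randomly split students into groups A and B (hypothetical helper)."""
    shuffled = list(students)
    random.Random(seed).shuffle(shuffled)
    half = (len(shuffled) + 1) // 2
    return {"A": sorted(shuffled[:half]), "B": sorted(shuffled[half:])}


def schedule(groups):
    """Return one (student, group, subtask, tool_allowed) row per student and subtask."""
    rows = []
    for subtask in (1, 2):
        for group in ("A", "B"):
            for student in groups[group]:
                rows.append((student, group, subtask, TOOL_ALLOWED[(subtask, group)]))
    return rows


if __name__ == "__main__":
    for row in schedule(assign_groups(["s1", "s2", "s3", "s4"])):
        print(row)
```

Every student works once with the tool and once without, so project complexity and prior experience affect both conditions. The risk noted above remains: if one subtask suits the tool better than the other, the schedule does not correct for that.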

Data to harvest

  • IDE Time with CDD(extraction) enabled / IDE Time with CDD(extraction) disabled
  • Test Case Source (just final version, or all versions?)
    • Use Profiler to get coverage approximation
  • TC Meta Data (with timestamps -> Evolution of Test Case)
    • TC Added/Removed/Changed
    • TC Outcome (transitions from FAIL/PASS/UNRESOLVED[bad_communication <-> does_not_compile <-> bad_input])
    • TC execution time
    • Modifications to a test case (compiler needs to recompile)
  • Development Session Data
    • IDE Startup
    • File save
  • Questionnaires
    • Initial
    • Final


Logging

  • "Meta" log entries
    • Project opened (easy)
    • CDD enable/disable (easy)
    • general EiffelStudio action log entries for Developer Behaviour (harder... what do we need??)
  • CDD actions log entries
    • Compilation of interpreter (start, end, duration)
    • Execution of test cases (start, end, do we need individual duration of each test case that gets executed?)
    • Extraction of new test case (extraction time)
  • Test Suite Status
    • Test suite: after each refresh log list of all test cases (class level, needed because it's not possible to know when manual test cases get added...)
    • Test class: (do we need info on this level)
    • Test routine: status (basically as you see it in the tool)
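
To make the three kinds of log entries more concrete, here is a minimal sketch in Python. It is not the CDD implementation: the class `CddLog`, the entry kinds, and the field names are all assumptions chosen for illustration.

```python
import json
import time
from contextlib import contextmanager


class CddLog:
    """Hypothetical in-memory log; each entry is a dict with a time and a kind."""

    def __init__(self, clock=time.time):
        self.entries = []
        self._clock = clock

    def record(self, kind, **details):
        """Append one entry, e.g. a "meta" entry such as project_opened."""
        entry = {"time": self._clock(), "kind": kind, **details}
        self.entries.append(entry)
        return entry

    @contextmanager
    def timed(self, kind, **details):
        """Log start and end of a CDD action, with its duration on the end entry."""
        start = self._clock()
        self.record(kind + "_start", **details)
        try:
            yield
        finally:
            duration = self._clock() - start
            self.record(kind + "_end", duration=duration, **details)

    def snapshot(self, statuses):
        """Log the status of every test routine after a test suite refresh."""
        return self.record("test_suite_snapshot", statuses=dict(statuses))

    def dump(self):
        """Serialize all entries, one JSON object per line."""
        return "\n".join(json.dumps(entry, sort_keys=True) for entry in self.entries)


if __name__ == "__main__":
    log = CddLog()
    log.record("project_opened", project="example")
    with log.timed("interpreter_compilation"):
        pass  # the real compilation would happen here
    log.snapshot({"TEST_FOO.test_bar": "FAIL"})
    print(log.dump())
```

Logging a full snapshot after each refresh matches the note above that manually added test cases cannot be detected at the moment they are added.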

Experiment Hypotheses

Do Contracts improve Tests?

  • Is there a correlation between the quantity or quality of tests and the quantity or quality of contracts?

Correlation between failure/fault type and test type?

  • Do certain kinds of tests find certain kinds of failures/faults?

Use of CDD increases development productivity

  • Did the use of testing decrease development time?
  • Measures:
    • Number of compilations
    • Number of saves
    • Number of revisions
    • IDE time
    • Asking the students

Emphasis on questionnaire results. Correlate with logs only if it makes sense.

Use of CDD increases code correctness

  • Is there a relation between code correctness of project (vs. some system level test suite) and test activity?
  • Measures:
    • number of tests
    • number of times tests were run
    • Number of pass/fail, fail/pass transitions, (also consider unresolved/* transitions ?)
    • Secret test suite
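
As an illustration of the transition measure, the following sketch counts outcome changes in the logged history of a single test routine. The outcome labels and the function name are assumptions; the actual log format was still open at the time of the meeting.

```python
from collections import Counter


def count_transitions(outcomes):
    """Count changes between consecutive outcomes of one test routine.

    `outcomes` is the chronological list of logged outcomes,
    e.g. "PASS", "FAIL", or "UNRESOLVED" (labels are assumed here).
    Repeated identical outcomes are not counted as transitions.
    """
    transitions = Counter()
    for previous, current in zip(outcomes, outcomes[1:]):
        if previous != current:
            transitions[(previous, current)] += 1
    return transitions


if __name__ == "__main__":
    history = ["FAIL", "FAIL", "PASS", "UNRESOLVED", "PASS", "FAIL"]
    for (before, after), count in sorted(count_transitions(history).items()):
        print(before, "->", after, count)
```

Keeping the unresolved transitions in the same counter leaves the question of whether to consider them to the analysis step, where they can be included or filtered out.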

Developer Profile: Is there a correlation between Developer Profile and the way they use testing tools

  • How did students use the testing tools?
  • Are there clusters of similar use?
  • What is characteristic of these clusters?
  • Measures:
    • Asking students before and after
    • Are there projects where tests initially always fail or always pass?
    • How often do they test?
    • How correct is their project?

Midterm questionnaire will be used to phrase questions for final questionnaire.

Example profiles

  • Run-of-the-mill ("Wald-und-Wiesen") Hacker
    • No explicit structure. Does whatever seems appropriate at the time. No QA plan.
  • Agile
    • Processes interleave. Awareness of QA. Maybe even Test First or TDD.
  • Waterfall inspired
    • Explicit process model. Phases don't interleave.
  •  ?

How do extracted, synthesized and manually written test cases compare?

  • Which tests are the most useful to students?
  • How many tests are there in each category?
  • What's the test suite quality of each category?
  • Were some excluded from testing more often than others?
  • How many red/green and green/red transitions are there in each category?
  • Which had compile-time errors most often that did not get fixed?
  • Measures:
    • LOC
    • Number of tests
    • Number of executions
    • Outcome transitions