CddMeeting 31 01 2008
Contents
- 1 CDD Meeting, Tuesday, 31.1.2008, 10:00
- 1.1 Next Meeting
- 1.2 Tasks
- 1.3 Questionnaires
- 1.4 Software Engineering Project
- 1.5 Data to harvest
- 1.6 Logging
- 1.7 Experiment Hypotheses
- 1.8 Do Contracts improve Tests?
- 1.9 Correlation between failure/fault type and test type?
CDD Meeting, Tuesday, 31.1.2008, 10:00
Next Meeting
- Thursday, 5.2.2008, 10:00
Tasks
Andreas
- Formulate Experiment Hypothesis (Andreas)
- Fix AutoTest for courses
- New release
- Write documentation and video tutorials (together with final release)
- [done] Commit dangling patch from 6.0 to 6.1
- [done] Make it so that tester target never has extraction or execution enabled
- remove hack from CDD_MANAGER.schedule_testing_restart
- [done] Make CDD Windows appear by default
- Finish tuple_002 test case
- Retest if test cases with errors are properly ignored (after 6.1 port)
Arno
- When test class gets removed manually, update test suite
- Clean up test case in interpreter after each execution (through garbage collection?)
- Build releasable delivery for Linux (after each Beta I guess...)
- Display ignored test class compilation errors (looks like we will have this for free in 6.1)
- Make sure CDD Tools are visible by default (what layout would you prefer?)
- Main tool shares tabs with clusters/features tool, output tool after C output tool
- Red bg for failing test cases in view
- Write new simple "New Manual Test Case" dialog
- Test case for (user defined) expanded types
- test case containing feature names with underscores and "like Current"
Bug Fixing
- Result type (like Current) produces syntax error in new test class
- Fix interpreter hang after runtime crash
- Check why EiffelStudio quits after debugging a test routine and ignoring violations
- Check if interpreter compilation errors are propagated correctly (seems to start interpreter even though compilation has failed)
Ilinca
- Integrate variable declarations into AutoTest trunk (by 8.2.2008)
Stefan
- [DONE] Unique id to tag test cases with, to be used in logs, so that test logs are resilient to test class renamings
- [DONE, except probably for bad memory corruption bugs which didn't occur anymore since agents are ignored] Make popup on interpreter crash go away (win32 only)
- [RECURRENT] Build releasable delivery on Windows
- Logging
- What data to log?
- Implement storing
- Define how students should submit logs
- Data Gathering
- Define what data to gather
- Define how to process gathered data
- Second Chance re-run to find true prestate (with Jocelyn)
- Allow for test case extraction of passing routine invocations (with Jocelyn)
- Rebuilding manual test suite through extraction and synthesizing
- Find performance bottleneck of test case extraction and propose extraction method for second chance
Bugs/Things to look at
- For big projects (like ES itself) background compilation of the interpreter leads to completely unresponsive ES
- Is it still necessary to ever call the routine update actions with argument "void"?
- Crash upon closing of EiffelStudio (feature call on void target in breakpoint tool)
Manu
- Define Project for SoftEng (due by next meeting)
- Find System level test suite for us to test students code
- Find project with pure functional part
- Install CDD in student labs (Manu)
- Devise questionnaires
- Initial (due next meeting after Manu's vacation)
- Midterm
- Final
- Analyze questionnaires
- Rework example profiles
- Assis will use CDD to get a feel for it and create a test suite for the students to start with
Unassigned
- Cache debug values when extracting several test cases.
- Enable execution and extraction by default for new projects.
- Make CDD Window and CDD Log Window visible by default
- "Debug selected test routine" should be grayed out if no test case is currently selected
- Testing V2 Application should not interrupt flow
- Retest if test cases with errors are properly ignored (after 6.1 port)
- Extraction for inline agents not currently working (at least not always)
- Create inline agent test case
- Fix extraction for inline agents
- Revive system level test suite
Questionnaires
- Use ELBA
Software Engineering Project
- One large project, but divided into testable subcomponents
- Students required to write test cases
- Fixed API to make things uniformly testable
- Public/Secret test cases (similar to Zeller course)
- Competitions:
- Group A test cases applied to Group A project
- Group A test cases applied to Group B project
- Idea how to cancel out bias while allowing fair grading:
- Subtasks 1 and 2, Students divided into groups A and B
- First both groups do 1, A is allowed to use tool, B not
- Then both groups do 2, B is allowed to use tool, A not
- Bias cancelation:
- Project complexity
- Experience of students
- Experience gained in first subtask, when developing second
- Risk: One task might be better suited for the tool than the other
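The two-subtask scheme above is a crossover design. A minimal sketch of how the group split and the tool schedule could be written down, assuming a random split into equal halves (the notes do not say how students are assigned; `assign_crossover` and the student ids are hypothetical):

```python
import random

def assign_crossover(students, seed=0):
    """Split students into groups A and B and attach the tool schedule.

    Hypothetical helper: group A may use the tool in subtask 1 only,
    group B in subtask 2 only, as described in the notes.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(students)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    groups = {"A": shuffled[:half], "B": shuffled[half:]}
    # (group, subtask) -> is the tool allowed?
    tool_allowed = {
        ("A", 1): True, ("B", 1): False,
        ("A", 2): False, ("B", 2): True,
    }
    return groups, tool_allowed

# Made-up student ids, for illustration only.
groups, tool_allowed = assign_crossover(["s1", "s2", "s3", "s4", "s5", "s6"])
for subtask in (1, 2):
    for group in ("A", "B"):
        print(subtask, group, tool_allowed[(group, subtask)])
```

Every student works once with the tool and once without it, which is what lets project complexity and student experience cancel out. The risk noted above remains: if one subtask suits the tool better, the schedule alone does not correct for it.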
Data to harvest
- IDE Time with CDD(extraction) enabled / IDE Time with CDD(extraction) disabled
- Test Case Source (just final version, or all versions?)
- Use Profiler to get coverage approximation
- TC Meta Data (with timestamps -> Evolution of Test Case)
- TC Added/Removed/Changed
- TC Outcome (transitions from FAIL/PASS/UNRESOLVED[bad_communication <-> does_not_compile <-> bad_input])
- TC execution time
- Modifications to a test case (compiler needs to recompile)
- Development Session Data
- IDE Startup
- File save
- Questionnaires
- Initial
- Final
Logging
- "Meta" log entries
- Project opened (easy)
- CDD enable/disable (easy)
- general EiffelStudio action log entries for Developer Behaviour (harder... what do we need??)
- CDD actions log entries
- Compilation of interpreter (start, end, duration)
- Execution of test cases (start, end, do we need individual duration of each test cases that gets executed?)
- Extraction of new test case (extraction time)
- Test Suite Status
- Test suite: after each refresh log list of all test cases (class level, needed because it's not possible to know when manual test cases get added...)
- Test class: (do we need info on this level)
- Test routine: status (basically as you see it in the tool)
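One possible shape for these log entries, sketched as one JSON object per line. The field names (`time`, `event`, `test_id`, `status`, `duration_s`) and the event names are assumptions for illustration, not an agreed format. The `test_id` field stands in for the unique id from Stefan's task list, so that entries survive a test class renaming:

```python
import json
import uuid
from datetime import datetime, timezone

def new_test_id():
    """Return an id that stays with a test case even if its class is renamed."""
    return uuid.uuid4().hex

def log_entry(event, timestamp=None, **fields):
    """Serialize one log entry as a single JSON line (hypothetical format)."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    record = {"time": timestamp.isoformat(), "event": event}
    record.update(fields)
    return json.dumps(record, sort_keys=True)

test_id = new_test_id()
lines = [
    log_entry("project_opened"),                              # "meta" entry
    log_entry("cdd_enabled"),                                 # "meta" entry
    log_entry("interpreter_compilation", duration_s=12.5),    # CDD action
    log_entry("test_routine_status", test_id=test_id, status="FAIL"),
]
for line in lines:
    print(line)
```

A line-oriented format keeps the log append-only and easy to concatenate when students submit their logs. Whether per-test durations are worth logging is still the open question raised above.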
Experiment Hypotheses
Do Contracts improve Tests?
- Is there a correlation between the quantity or quality of tests and the quantity or quality of contracts?
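As a starting point, the question could be checked with a plain Pearson correlation over per-project counts. This is only a sketch: the notes do not fix a statistic, and the numbers below are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)

# Made-up data: contract clauses and test cases per student project.
contracts = [2, 4, 6, 8]
tests = [1, 3, 5, 7]
r = pearson(contracts, tests)
print(r)  # 1.0 for this perfectly linear toy data
```

Counts only cover the "quantity" half of the question. Measuring the quality of tests or contracts would need a separate metric that is not defined here.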
Correlation between failure/fault type and test type?
- Do certain kinds of tests find certain kinds of failures/faults?
Use of CDD increases development productivity
- Did the use of testing decrease development time?
- Measures:
- Number of compilations
- Number of saves
- Number of revisions
- IDE time
- Asking the students
Emphasis on questionnaire results. Correlate with logs only if it makes sense.
Use of CDD increases code correctness
- Is there a relation between code correctness of project (vs. some system level test suite) and test activity?
- Measures:
- number of tests
- number of times tests were run
- Number of pass/fail, fail/pass transitions, (also consider unresolved/* transitions ?)
- Secret test suite
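The transition measure could be derived from the logged outcome history of each test routine. A minimal sketch, assuming the history is available as an ordered list of outcome strings (the list format and `count_transitions` are hypothetical):

```python
from collections import Counter

def count_transitions(outcomes):
    """Count changes between consecutive outcomes of one test routine."""
    transitions = Counter()
    for before, after in zip(outcomes, outcomes[1:]):
        if before != after:  # repeated outcomes are not transitions
            transitions[(before, after)] += 1
    return transitions

# Made-up outcome history of a single test routine.
history = ["FAIL", "FAIL", "PASS", "UNRESOLVED", "PASS", "FAIL"]
counts = count_transitions(history)
print(counts[("FAIL", "PASS")])  # 1
print(counts[("PASS", "FAIL")])  # 1
```

Keying the counter by `(before, after)` pairs keeps the unresolved transitions available too, so the decision whether to include them can be made at analysis time.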
Developer Profile: Is there a correlation between Developer Profile and the way they use testing tools
- How did students use the testing tools?
- Are there clusters of similar use?
- What is characteristic of these clusters?
- Measures:
- Asking students before and after
- Are there projects where tests initially always fail or always pass?
- How often do they test?
- How correct is their project?
Midterm questionnaire will be used to phrase questions for final questionnaire.
Example profiles
- Waldundwiesen Hacker
- No explicit structure. Does whatever seems appropriate at the time. No QA plan.
- Agile
- Processes interleave. Awareness of QA. Maybe even Test First or TDD.
- Waterfall inspired
- Explicit process model. Phases don't interleave.
- ?
How do extracted, synthesized and manually written test cases compare?
- Which tests are the most useful to students?
- How many tests are there in each category?
- What's the test suite quality of each category?
- Were some excluded from testing more often than others?
- How many red/green and green/red transitions are there in each category?
- Which had compile-time errors most often that did not get fixed?
- Measures:
- LOC
- Number of tests
- Number of executions
- Outcome transitions