Difference between revisions of "CddMeeting01082008"
Latest revision as of 02:20, 11 January 2008
CDD Meeting, Tuesday, 8.1.2008, 10:00
Next Meeting
- Friday, 11.1.2008, 10:00
Tasks
- Add filters and tags for extracted, manual tests and automated tests
- Fix extraction for tuples -> DONE
- Look at/fix test case execution for agents (Stefan)
- CDD log window in IDE (Arno)
- "New manual test case" Button (Arno)
- Better icons for GUI (Arno)
  - http://www.famfamfam.com/lab/icons/silk/
  - http://tango.freedesktop.org/Tango_Icon_Library
- Status / Progress bar (Arno)
- Port to 6.1 (?, probably only after Beta 1)
- Manual re-run to find true prestate (Jocelyn, Stefan)
- Logging (Stefan)
  - What data to log?
  - Implement storing
  - Define how students should submit logs
- Data Gathering (Stefan)
  - Define what data to gather
  - Define how to process gathered data
- Formulate Experiment Hypothesis (Andreas)
- Define Project for SoftEng (Manu)
  - Find system-level test suite for us to test students' code
  - Find project with pure functional part
- "Execute visible test cases only" Button (?)
- Restore open nodes and selection after grid update (Arno)
  - Maybe better/easier solved via incremental updates from tree
- Automate CDD System level tests (Stefan)
- Install CDD in student labs (Manu)
- Pause test execution and compilation during regular compilation and execution (Arno)
- Add most important convenience routine to CDD_TEST_CASE (Stefan)
- Add failure context window (Arno)
  - Maybe also additional information such as previous outcomes?
- Check why Gobo slows down compilation of a project not using Gobo when melting (performance issue for compiling the interpreter)
- Fix AutoTest for courses
  - Integrate AUT_TEST_CASE into CDD_TEST_CASE hierarchy
  - Variable declaration for failing test cases
  - New release
- Move logs below cdd_tests
- Environment variable (or better user preference) for qualifying class names (to avoid svn conflicts)
- Unique id to tag test cases with, to be used in logs, so that test logs are resilient to test class renamings
- While extracting test cases, flag objects that are the target of a currently executing routine
- During setup, check the invariants of all objects that are not flagged
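The unique-id item above could work roughly as follows. This is only a sketch in Python, not CDD code: the names `TaggedTestCase`, `log_outcome` and `history`, the class names and the use of a random UUID are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class TaggedTestCase:
    """Hypothetical record: a class name that may change plus a stable id."""
    class_name: str
    test_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def log_outcome(log, test_case, outcome):
    """Append a log entry keyed by the stable id, not by the class name."""
    log.append({
        "test_id": test_case.test_id,
        "class_name": test_case.class_name,
        "outcome": outcome,
    })


def history(log, test_id):
    """Return all outcomes logged for one test case, in logging order."""
    return [entry["outcome"] for entry in log if entry["test_id"] == test_id]


log = []
tc = TaggedTestCase("TEST_BANK_ACCOUNT_01")  # made-up class name
log_outcome(log, tc, "FAIL")
tc.class_name = "TEST_ACCOUNT_DEPOSIT"  # class renamed; the id is unchanged
log_outcome(log, tc, "PASS")
print(history(log, tc.test_id))  # ['FAIL', 'PASS']
```

Because the history is looked up by `test_id`, the rename between the two runs does not split the log into two unrelated test cases.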
Software Engineering Project
- One large project, but divided into testable subcomponents
- Students required to write test cases
- Fixed API to make things uniformly testable
- Public/Secret test cases (similar to Zeller course)
- Competitions:
  - Group A test cases applied to Group A project
  - Group A test cases applied to Group B project
Data to harvest
- IDE Time with CDD(extraction) enabled / IDE Time with CDD(extraction) disabled
- Test Case Source (just final version, or all versions?)
  - Use Profiler to get coverage approximation
- TC Meta Data (with timestamps -> Evolution of Test Case)
  - TC Added/Removed
  - TC Outcome (transitions between FAIL/PASS/UNRESOLVED[bad_communication <-> does_not_compile <-> bad_input])
  - TC execution time
  - Modifications to a test case (compiler needs to recompile)
- Development Session Data
  - IDE Startup
  - File save
- Questionnaires
  - Initial
  - Final
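One possible shape for a timestamped log entry covering the items above, as a Python sketch. The record layout, the field names and the `kind` values are assumptions for discussion, not an agreed CDD log format.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class LogEvent:
    """Hypothetical log record; every field name here is an assumption."""
    timestamp: str                        # ISO 8601, e.g. "2008-01-08T10:00:00"
    kind: str                             # e.g. "tc_added", "tc_outcome", "file_save"
    test_case: Optional[str] = None       # set only for test-case events
    outcome: Optional[str] = None         # "PASS", "FAIL" or "UNRESOLVED"
    execution_time_s: Optional[float] = None


def to_log_line(event):
    """Serialize one event as a single JSON line."""
    return json.dumps(asdict(event), sort_keys=True)


def parse_log_line(line):
    """Rebuild an event from a line written by to_log_line."""
    return LogEvent(**json.loads(line))


event = LogEvent("2008-01-08T10:00:00", "tc_outcome",
                 test_case="TEST_EXAMPLE", outcome="FAIL", execution_time_s=0.25)
line = to_log_line(event)
assert parse_log_line(line) == event
```

One line per event keeps the logs easy to append to during a session and easy for students to submit as a single file.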
Experiment Hypotheses
Use of CDD increases development productivity
- Did the use of testing decrease development time?
- This can be measured by looking at any of:
  - Number of compilations
  - Number of saves
  - Number of revisions
  - IDE time
  - Asking the students
None of the above strikes me as particularly reliable though. Also, it is easy to develop quickly if you do a bad job. In order to compare apples to apples we must be careful to compare projects with similar correctness and completeness. We could use an external test suite to assess correctness, or the students' grades.
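The apples-to-apples comparison could be sketched like this in Python: group projects into correctness bands first, then compare IDE time only within a band. The field names, the 20-point band width and the numbers are invented for illustration; they are not measurements.

```python
from collections import defaultdict
from statistics import mean


def mean_time_by_band(projects, band_width=20):
    """Average IDE hours per (correctness band, CDD enabled) group.

    Correctness is assumed to be an integer percentage (0..100) of an
    external test suite passed; band_width is an arbitrary choice.
    """
    top_band = 100 // band_width - 1
    groups = defaultdict(list)
    for project in projects:
        band = min(project["correctness"] // band_width, top_band)
        groups[(band, project["cdd"])].append(project["ide_hours"])
    return {key: mean(values) for key, values in groups.items()}


# Invented illustration data, not results.
projects = [
    {"correctness": 85, "cdd": True, "ide_hours": 30},
    {"correctness": 95, "cdd": True, "ide_hours": 34},
    {"correctness": 90, "cdd": False, "ide_hours": 40},
    {"correctness": 40, "cdd": False, "ide_hours": 12},
]
result = mean_time_by_band(projects)
print(result[(4, True)], result[(4, False)])  # 32 40
```

The fast but mostly incorrect project lands in a different band, so it does not make the group without CDD look more productive.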
Use of CDD increases code correctness
- Is there a relation between code correctness of project (vs. some system level test suite) and test activity?
Measures for test activity:
- number of tests
- number of times tests were run
- Number of pass/fail, fail/pass transitions
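Counting the transitions for one test case could look like the following Python sketch. It assumes the logged outcomes are available as a chronological list of strings; the function name `count_transitions` is made up.

```python
from collections import Counter


def count_transitions(outcomes):
    """Count changes between consecutive outcomes of one test case.

    `outcomes` is assumed to be in chronological order. Repeated
    identical outcomes (e.g. FAIL followed by FAIL) are not counted.
    """
    counts = Counter()
    for previous, current in zip(outcomes, outcomes[1:]):
        if previous != current:
            counts[(previous, current)] += 1
    return counts


runs = ["FAIL", "FAIL", "PASS", "PASS", "FAIL", "PASS"]
transitions = count_transitions(runs)
print(transitions[("FAIL", "PASS")])  # 2
print(transitions[("PASS", "FAIL")])  # 1
```

The number of runs is simply `len(runs)`, so all three measures can come from the same outcome list.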
Developer Profile
- How did students use the testing tools?
- Are there clusters of similar use?
- What is characteristic of these clusters?
- Measures:
  - Asking students before and after
  - Are there projects where tests initially always fail or always pass?
  - How often do they test?
  - How correct is their project?
I am not completely sure yet what to assess here.
How do extracted, synthesized and manually written test cases compare?
- Which tests are the most useful to students?
- How many tests are there in each category?
- What's the test suite quality of each category?
- Were some excluded from testing more often than others?
- How many red/green and green/red transitions are there in each category?
- Which had compile-time errors most often that did not get fixed?