CDD Meeting, Tuesday, 8.1.2008, 10:00
Next Meeting
- Friday, 11.1.2008, 10:00
Tasks
- Add filters and tags for extracted, manual tests and automated tests
- Fix extraction for tuples -> DONE
- Look at/fix test case execution for agents (Stefan)
- CDD log window in IDE (Arno)
- "New manual test case" Button (Arno)
- Better Icons for GUI (Arno)
- Status / Progress bar (Arno)
- Port to 6.1 (?, probably only after Beta 1)
- Manual re-run to find true prestate (Jocelyn, Stefan)
- Logging (Stefan)
- What data to log?
- Implement storing
- Define how students should submit logs
- Data Gathering (Stefan)
- Define what data to gather
- Define how to process the gathered data
- Formulate Experiment Hypotheses (Andreas)
- Define Project for SoftEng (Manu)
- Find system-level test suite for us to test students' code
- Find project with pure functional part
- "Execute visible test cases only" Button (?)
- Restore open nodes and selection after grid update (Arno)
- Maybe better/easier solved via incremental updates from tree
- Automate CDD System level tests (Stefan)
- Install CDD in student labs (Manu)
- Pause test execution and compilation during regular compilation and execution (Arno)
- Add most important convenience routine to CDD_TEST_CASE (Stefan)
- Add failure context window (Arno)
- Maybe also additional information such as previous outcomes?
- Check why Gobo slows down compilation of projects not using Gobo when melting (performance issue for compiling the interpreter)
- Fix AutoTest for courses
- Integrate AUT_TEST_CASE into CDD_TEST_CASE hierarchy
- Variable declaration for failing test cases
- New release
- Move logs below cdd_tests
- Environment variable (or better user preference) for qualifying class names (to avoid svn conflicts)
- Unique id to tag test cases with, to be used in logs, so that test logs are resilient to test class renamings (see the sketch after this list)
- While extracting test cases, flag objects that are the target of a currently executing routine
- During setup, check the invariant of all objects that are not flagged
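The unique-id item above could work roughly as follows; a minimal Python sketch, assuming a dictionary-based log record (the field names and the use of UUIDs are illustrative assumptions, not the actual CDD log format):

    # Minimal sketch: log entries reference a stable test id instead of the
    # class name, so renaming a CDD test class does not break its history.
    import uuid

    def new_test_id() -> str:
        """Generate the id once, when a test case is first created or extracted."""
        return uuid.uuid4().hex

    def make_log_entry(test_id: str, class_name: str, outcome: str, timestamp: float) -> dict:
        return {
            "test_id": test_id,        # survives renamings; used for all history joins
            "class_name": class_name,  # current name, informational only
            "outcome": outcome,        # e.g. "PASS", "FAIL", "UNRESOLVED"
            "timestamp": timestamp,
        }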
Software Engineering Project
- One large project, but divided into testable subcomponents
- Students required to write test cases
- Fixed API to make things uniformly testable
- Public/Secret test cases (similar to Zeller course)
- Competitions:
- Group A test cases applied to Group A project
- Group A test cases applied to Group B project
Data to harvest
- IDE time with CDD (extraction) enabled / IDE time with CDD (extraction) disabled
- Test Case Source (just final version, or all versions?)
- Use Profiler to get coverage approximation
- TC metadata (with timestamps -> evolution of the test case; a possible record layout is sketched after this list)
- TC Added/Removed
- TC Outcome (transitions between FAIL, PASS and UNRESOLVED, where UNRESOLVED can mean bad_communication, does_not_compile or bad_input)
- TC execution time
- Modifications to a test case (compiler needs to recompile)
- Development Session Data
- IDE Startup
- File save
- Questionnaires
- Initial
- Final
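One possible record layout for the data above, as a Python sketch (all field names and event kinds are assumptions; the actual logging format is still to be defined per the tasks):

    # Sketch of log records for test-case events and development-session events.
    from dataclasses import dataclass

    @dataclass
    class TestCaseEvent:
        test_id: str                 # stable id, see the unique-id task
        timestamp: float
        kind: str                    # "added", "removed", "outcome", "modified"
        outcome: str = ""            # "PASS", "FAIL" or "UNRESOLVED" for "outcome" events
        unresolved_reason: str = ""  # e.g. "bad_communication", "does_not_compile", "bad_input"
        execution_time: float = 0.0

    @dataclass
    class SessionEvent:
        timestamp: float
        kind: str                             # "ide_startup", "file_save", "compilation"
        cdd_extraction_enabled: bool = False  # to split IDE time by CDD state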
Experiment Hypotheses
Use of CDD increases development productivity
- Did the use of testing decrease development time?
- This can be measured by looking at:
- Number of compilations
- Number of saves
- Number of revisions
- IDE time
- Asking the students
None of the above strikes me as particularly reliable, though. Also, it is easy to develop quickly if you do a bad job. To compare apples to apples, we must be careful to compare projects of similar correctness and completeness. We could use an external test suite to assess correctness, or the students' grades. A rough way to compute such proxies from session logs is sketched below.
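A minimal sketch, assuming the SessionEvent layout from the record sketch above (the IDE-time estimate is deliberately crude and ignores idle time and gaps between sessions):

    # Sketch: per-project productivity proxies from a chronological event list.
    def productivity_proxies(events):
        compilations = sum(1 for e in events if e.kind == "compilation")
        saves = sum(1 for e in events if e.kind == "file_save")
        ide_time = events[-1].timestamp - events[0].timestamp if events else 0.0
        return {"compilations": compilations, "saves": saves, "ide_time": ide_time}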
Use of CDD increases code correctness
- Is there a relation between the code correctness of a project (measured against some system-level test suite) and its test activity?
Measures for test activity:
- Number of tests
- Number of times tests were run
- Number of pass/fail and fail/pass transitions (see the counting sketch below)
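A small counting sketch for the transition measure, assuming each test case's outcome history is available as a chronological list of strings:

    # Count PASS->FAIL and FAIL->PASS transitions in an outcome history.
    def count_transitions(outcomes):
        pass_fail = fail_pass = 0
        for prev, cur in zip(outcomes, outcomes[1:]):
            if prev == "PASS" and cur == "FAIL":
                pass_fail += 1
            elif prev == "FAIL" and cur == "PASS":
                fail_pass += 1
        return pass_fail, fail_pass

For example, count_transitions(["PASS", "FAIL", "PASS"]) returns (1, 1).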
Developer Profile
- How did students use the testing tools?
- Are there clusters of similar use?
- What is characteristic of these clusters?
- Measures:
- Asking students before and after
- Are there projects where tests initially always fail or always pass?
- How often do they test?
- How correct is their project?
I am not completely sure yet what to assess here.
How do extracted, synthesized and manually written test cases compare?
- Which tests are the most useful to students?
- How many tests are there in each category?
- What's the test suite quality of each category?
- Were some excluded from testing more often than others?
- How many red/green and green/red transitions are there in each category?
- Which category most often had compile-time errors that did not get fixed? (See the per-category sketch below.)
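A sketch of the per-category comparison, assuming each test id is tagged with its category ("extracted", "synthesized", "manual") via the filters/tags from the task list, and that outcome histories are chronological lists of strings:

    # Per-category test counts and red/green transition counts.
    from collections import defaultdict

    def per_category_stats(category_of_test, outcome_history):
        # category_of_test: dict test_id -> category
        # outcome_history:  dict test_id -> list of outcomes, oldest first
        stats = defaultdict(lambda: {"tests": 0, "pass_fail": 0, "fail_pass": 0})
        for test_id, outcomes in outcome_history.items():
            pairs = list(zip(outcomes, outcomes[1:]))
            cat = category_of_test.get(test_id, "unknown")
            stats[cat]["tests"] += 1
            stats[cat]["pass_fail"] += sum(1 for p in pairs if p == ("PASS", "FAIL"))
            stats[cat]["fail_pass"] += sum(1 for p in pairs if p == ("FAIL", "PASS"))
        return dict(stats)

Keeping the counts per test id would also make it easy to add an "excluded from testing" counter per category for the exclusion question above.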