CDD Meeting, Tuesday, 8.1.2008, 10:00

Next Meeting

  • Friday, 11.1.2008, 10:00

Tasks

  • Add filters and tags for extracted, manual tests and automated tests
  • Fix extraction for tuples -> DONE
  • Look at/fix test case execution for agents (Stefan)
  • CDD log window in IDE (Arno)
  • "New manual test case" Button (Arno)
  • Better Icons for GUI (Arno)
  • Status / Progress bar (Arno)
  • Port to 6.1 (?, probably only after Beta 1)
  • Manual re-run to find true prestate (Jocelyn, Stefan)
  • Logging (Stefan)
    • What data to log?
    • Implement storing
    • Define how students should submit logs
  • Data Gathering (Stefan)
    • Define what data to gather
    • Define how to process gathered data
  • Formulate Experiment Hypotheses (Andreas)
  • Define Project for SoftEng (Manu)
    • Find system-level test suite for us to test students' code
    • Find project with pure functional part
  • "Execute visible test cases only" Button (?)
  • Restore open nodes and selection after grid update (Arno)
    • Maybe better/easier solved via incremental updates from tree
  • Automate CDD System level tests (Stefan)
  • Install CDD in student labs (Manu)
  • Pause test execution and compilation during regular compilation and execution (Arno)
  • Add most important convenience routine to CDD_TEST_CASE (Stefan)
  • Add failure context window (Arno)
    • Maybe also additional information such as previous outcomes?
  • Check why Gobo slows down compilation of projects not using Gobo when melting (performance issue for compiling the interpreter)
  • Fix AutoTest for courses
    • Integrate AUT_TEST_CASE into CDD_TEST_CASE hierarchy
    • Variable declaration for failing test cases
    • New release
  • Move logs below cdd_tests
  • Environment variable (or better user preference) for qualifying class names (to avoid svn conflicts)
  • Unique id to tag test cases with, to be used in logs, so test logs are resilient to test class renamings (a log sketch follows after this list)
  • While extracting test cases, flag objects that are the target of a currently executing routine
  • During setup, check the invariant of all objects that are not flagged
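
A minimal sketch of the unique-id idea from the list above, assuming a JSON-lines log and hypothetical field names (this is not the actual CDD log layout): the log writer keys every record on an id assigned once at test-case creation, so later renamings of the test class leave the history intact.

  import json
  import uuid
  from datetime import datetime, timezone

  def new_test_case_id() -> str:
      # Assigned once when the test case is created, then stored with it.
      return uuid.uuid4().hex

  def append_outcome(log, tc_id: str, class_name: str, outcome: str) -> None:
      # The id, not the class name, identifies the test case in the log,
      # so renaming the test class does not orphan its history.
      record = {
          "tc_id": tc_id,
          "class": class_name,        # informational only, may change
          "outcome": outcome,         # e.g. PASS / FAIL / UNRESOLVED
          "timestamp": datetime.now(timezone.utc).isoformat(),
      }
      log.write(json.dumps(record) + "\n")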

Software Engineering Project

  • One large project, but divided into testable subcomponents
  • Students required to write test cases
  • Fixed API to make things uniformly testable
  • Public/Secret test cases (similar to Zeller course)
  • Competitions:
    • Group A test cases applied to Group A project
    • Group A test cases applied to Group B project

Data to harvest

  • IDE time with CDD (extraction) enabled / IDE time with CDD (extraction) disabled
  • Test Case Source (just final version, or all versions?)
    • Use Profiler to get coverage approximation
  • TC Meta Data (with timestamps -> Evolution of Test Case; see the timeline sketch after this list)
    • TC Added/Removed
    • TC Outcome (transitions from FAIL/PASS/UNRESOLVED[bad_communication <-> does_not_compile <-> bad_input])
    • TC execution time
    • Modifications to a test case (compiler needs to recompile)
  • Development Session Data
    • IDE Startup
    • File save
  • Questionnaires
    • Initial
    • Final
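
A sketch of how the harvested TC meta data could be turned into a per-test-case evolution timeline, assuming JSON-lines records with the hypothetical tc_id, outcome and timestamp fields used in the log sketch above:

  import json
  from collections import defaultdict

  def evolution(log_path: str) -> dict:
      # Group all records by test case id and sort each group by timestamp,
      # giving the outcome history (added, pass/fail, removed, ...) per test case.
      history = defaultdict(list)
      with open(log_path) as log:
          for line in log:
              record = json.loads(line)
              history[record["tc_id"]].append(record)
      for records in history.values():
          records.sort(key=lambda r: r["timestamp"])
      return history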

Experiment Hypotheses

Use of CDD increases development productivity

  • Did the use of testing decrease development time?
  • This can be measured by looking at:
    • Number of compilations
    • Number of saves
    • Number of revisions
    • IDE time
    • Asking the students

None of the above strikes me as particularly reliable, though. Also, it is easy to develop quickly if you do a bad job. In order to compare apples to apples, we must be careful to compare projects with similar correctness and completeness. We could use an external test suite to assess correctness, or the students' grades.
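
If the development session data above gets logged as timestamped events, the proxies listed under this hypothesis could be counted mechanically. A minimal sketch, assuming hypothetical JSON-lines event records with event names like "compile", "save", "ide_start" and "ide_stop" (none of these are the actual CDD log fields):

  import json
  from collections import Counter
  from datetime import datetime

  def session_stats(log_path: str) -> dict:
      # Count compilations and saves, and sum up IDE time between
      # ide_start / ide_stop event pairs.
      counts = Counter()
      ide_seconds = 0.0
      started = None
      with open(log_path) as log:
          for line in log:
              event = json.loads(line)
              counts[event["event"]] += 1
              when = datetime.fromisoformat(event["timestamp"])
              if event["event"] == "ide_start":
                  started = when
              elif event["event"] == "ide_stop" and started is not None:
                  ide_seconds += (when - started).total_seconds()
                  started = None
      return {
          "compilations": counts["compile"],
          "saves": counts["save"],
          "ide_hours": ide_seconds / 3600.0,
      }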


Use of CDD increases code correctness

  • Is there a relation between the code correctness of a project (measured against some system-level test suite) and test activity?

Measures for test activity:

  • Number of tests
  • Number of times tests were run
  • Number of pass/fail and fail/pass transitions
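
A small sketch of the transition counting, assuming the chronological outcome list per test case can be reconstructed from the log (as in the timeline sketch above):

  def count_transitions(outcomes: list[str]) -> dict:
      # outcomes is the chronological outcome list of one test case,
      # e.g. ["PASS", "PASS", "FAIL", "PASS"].
      pass_to_fail = fail_to_pass = 0
      for previous, current in zip(outcomes, outcomes[1:]):
          if previous == "PASS" and current == "FAIL":
              pass_to_fail += 1
          elif previous == "FAIL" and current == "PASS":
              fail_to_pass += 1
      return {"pass/fail": pass_to_fail, "fail/pass": fail_to_pass}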

Developer Profile

  • How did students use the testing tools?
  • Are there clusters of similar use?
  • What is characteristic of these clusters?
  • Measures:
    • Asking students before and after
    • Are there projects where tests initially always fail or always pass?
    • How often do they test?
    • How correct is their project?

I am not completely sure yet what to assess here.
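
One way to look for clusters of similar tool use would be to build a small feature vector per project or student from the logs and feed it to any standard clustering method. A sketch of the feature-vector step only, with made-up feature names (what exactly to include is still open):

  def usage_profile(records: list[dict], ide_hours: float) -> dict:
      # records: all parsed outcome records of one project.
      runs = len(records)
      fails = sum(1 for r in records if r["outcome"] == "FAIL")
      tests = len({r["tc_id"] for r in records})
      return {
          "tests": tests,
          "runs_per_hour": runs / ide_hours if ide_hours else 0.0,
          "fail_rate": fails / runs if runs else 0.0,
      }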

How do extracted, synthesized and manually written test cases compare?

  • Which tests are the most useful to students?
  • How many tests are there in each category?
  • What's the test suite quality of each category?
  • Were some excluded from testing more often than others?
  • How many red/green and green/red transitions are there in each category?
  • Which had compile-time errors most often that did not get fixed?
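
Assuming each log record also carries a category tag (extracted / synthesized / manual; again a hypothetical field), the per-category comparison could start as a simple aggregation; red/green transition counts could be added by reusing count_transitions from the sketch above.

  from collections import defaultdict

  def per_category(records: list[dict]) -> dict:
      # Count distinct test cases and FAIL outcomes per category.
      summary = defaultdict(lambda: {"tests": set(), "failures": 0})
      for r in records:
          entry = summary[r["category"]]
          entry["tests"].add(r["tc_id"])
          if r["outcome"] == "FAIL":
              entry["failures"] += 1
      return {cat: {"tests": len(v["tests"]), "failures": v["failures"]}
              for cat, v in summary.items()}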