Difference between revisions of "CITADEL"

(Discussions)
(Discussions)
Line 37: Line 37:
 
   that in the first version we may leave aside those containers, that  
 
   that in the first version we may leave aside those containers, that  
 
   aren't inherited from CONTAINER.
 
   aren't inherited from CONTAINER.
 +
 +
Ilinca: I'm not sure about the idea of including argument-less functions in
 +
  this flattened form. What if the precondition of the function doesn't
 +
  hold or we run into some kind of trouble when executing the function
 +
  body (trouble such as the contracts of a sub-call made by this
 +
  function not holding or other bugs)? On the other hand, including
 +
  functions would bring significantly more information about the state
 +
  of the object... Does the Daikon frontend for Java, for instance, by
 +
  default evaluate functions too as part of the flattened form of the
 +
  objects?
 +
 +
Nadia: Well, this idea is disputable, of cause.
 +
  In Java considering argument-less functions as variables is more
 +
  dangerous, because Java, like all C-family languages, encourages
 +
  side-effects in functions. However, in Daikon front-end for Java there
 +
  exists an option that allows using functions as variables. Together with
 +
  enabling this option user has to provide a special "purity" file that
 +
  lists those functions from the system, that are side-effect free. This
 +
  feature is described in Daikon User Manual, page 82.
 +
  Considering functions as variables is even more relevant in Eiffel
 +
  context by two main reasons. Firstly, Eiffel encourages absence of
 +
  abstract side-effects in functions (and, I believe, most Eiffel
 +
  functions indeed do not have abstract side-effect, am I right?). For
 +
  those cases, when some functions in system are known to have
 +
  side-effect, we can allow user to specify an "impurity" file, that would
 +
  list those functions. I think that these cases would be rare.
 +
  Secondly, Eiffel advocates uniform access principle. Routine's clients
 +
  don't have to know, which queries, used in precondition, are attributes
 +
  and which are functions. If inferred preconditions contained only
 +
  attributes, it would be inconsistent with uniform access principle, I
 +
  think...
 +
 
 +
  The issue with functions' preconditions that you've mentioned isn't a
 +
  big problem. If after flattening we got a variable of the form, e.g.,
 +
  `x.f1.f2', where `f1' and `f2' are functions, printing instructions for
 +
  this variable can be like this:
 +
    if x.f1_precondition_holds and then x.f1.f2_precondition_holds
 +
    then
 +
      print (x.f1.f2)
 +
    else
 +
      print (Nonsensical)
 +
    end
 +
 
 +
  "Nonsensical" is a special Daikon value that indicates that variable
 +
  cannot be evaluated.
 +
  Function preconditions are known after parsing. To be able to evaluate
 +
  them separately, we may, for example, wrap them into functions.
 +
  `f1_precondition_holds' in the above code is an example of such a
 +
  function, that was introduced by instrumenter and added to x's
 +
  generating class.
 +
 
 +
  BTW, for attributes we also need check in printing instructions: the
 +
  target on which attribute is called should be non-void, otherwise
 +
  attribute value should be printed as `Nonsensical'.
 +
 
 +
  As for troubles that may arise when calling functions... As I
 +
  understand, exception in a function, called when its precondition holds,
 +
  is a bug. So, if it happens, we can just tell user that we found a bug
 +
  in certain function and that before inferring contracts the bug should
 +
  be fixed.
  
 
* Support for loop invariants
 
* Support for loop invariants
Line 49: Line 109:
 
   so to say, "standalone" and doesn't try to connect it with any other.
 
   so to say, "standalone" and doesn't try to connect it with any other.
 
   We don't need any extra support from Daikon to handle loop invariants.
 
   We don't need any extra support from Daikon to handle loop invariants.
 
+
 
 
   If we want to infer loop invariants, we should introduce a separate  
 
   If we want to infer loop invariants, we should introduce a separate  
 
   program point for each loop. So, we need a way to distinguish loops  
 
   program point for each loop. So, we need a way to distinguish loops  
Line 59: Line 119:
 
   CI_INSTRUCTION_COUNTER is responsible for counting instruction while  
 
   CI_INSTRUCTION_COUNTER is responsible for counting instruction while  
 
   iterating through AST.
 
   iterating through AST.
 +
 +
* Inferring contracts for generic classes
 +
Nadia: there are actually two different cases.
 +
  1. Contracts inside generic classes. This is not really a problem.
 +
  Entities, which have parameter type, can be treated as if their
 +
  generator is constraining class (ANY, if genericity is unconstrained).
 +
 
 +
  2. Contracts in clients of generic classes. Here some problems arise. If
 +
  we have an entity `x: ARRAYED_LIST[INTEGER]', its flattening involves
 +
  processing the `item' query. `item' is declared to be of type `G' in
 +
  ARRAYED_LIST and, as I understand, parsing is not enough (some type
 +
  analysis is needed) to infer that `G' means `INTEGER' in current
 +
  context. Two approaches to this problem, that I see at present, are:
 +
  perform such a type analysis in compile time (with the help of gobo
 +
  compilation facilities) or leave some flattening work until runtime,
 +
  when reflections may be used to determine variable's generating class.

Revision as of 07:07, 14 December 2007


Overview

CITADEL (Contract-Inference Tool that Applies Daikon to the Eiffel Language) is a tool that provides an Eiffel front-end for the Daikon invariant detector. It is being implemented as a Master's project by Nadia Polikarpova supervised by Ilinca Ciupa (ETH).

Discussions

  • Flattened variables
Nadia: Daikon can process values of only few types: five scalar types (boolean, 
 integer, double, string and hashcode) and five more types, which are 
 arrays of these scalars.
 So, if in Eiffel code we a have a variable of one of basic types 
 (BOOLEAN, NATURAL, INTEGER, REAL or STRING) we can print its value 
 directly as if it has one of Daikon's types.
 If, however, we have a variable of any other type, we have to collect 
 all the information about it that is of interest and is available at 
 certain program point, and present this information to Daikon as a set 
 of  variables, which have Daikon types. The information of interest 
 about a variable includes results of all queries that can be called on 
 this variable at this program point.  However, since queries results may 
 themselves be of non-basic types, we have to perform this operation 
 recursively (to a certain depth).
 For a reference variable information of interest includes also its 
 address (Daikon type "hashcode" is intended for representing object 
 addresses).
 For example, if we had a variable c: POINT, queries of class POINT 
 available at current program point were `x: REAL', `y: REAL' and `twin: 
 POINT' and flattening depth were set to 2, then `c.flattened' would be a 
 set of variables: {$c, c.x, c.y, $c.twin, c.twin.x, c.twin.y}.
 
 Containers need special kind of flattening. If we want Daikon to handle 
 them sensibly, they should be converted to arrays of scalars (to be more 
 precise, it should be done only if all container elements are observable 
 at current program point...). It isn't difficult for descendants of 
 CONTAINER class from standard library. Instrumenter may check, if a 
 variable is an indirect instance of CONTAINER, get its 
 `linear_representation', then traverse it and output as array. I think 
 that in the first version we may leave aside those containers, that 
 aren't inherited from CONTAINER.
Ilinca: I'm not sure about the idea of including argument-less functions in 
 this flattened form. What if the precondition of the function doesn't 
 hold or we run into some kind of trouble when executing the function 
 body (trouble such as the contracts of a sub-call made by this 
 function not holding or other bugs)? On the other hand, including 
 functions would bring significantly more information about the state 
 of the object... Does the Daikon frontend for Java, for instance, by 
 default evaluate functions too as part of the flattened form of the 
 objects?
Nadia: Well, this idea is disputable, of cause.
 In Java considering argument-less functions as variables is more 
 dangerous, because Java, like all C-family languages, encourages 
 side-effects in functions. However, in Daikon front-end for Java there 
 exists an option that allows using functions as variables. Together with 
 enabling this option user has to provide a special "purity" file that 
 lists those functions from the system, that are side-effect free. This 
 feature is described in Daikon User Manual, page 82.
 Considering functions as variables is even more relevant in Eiffel 
 context by two main reasons. Firstly, Eiffel encourages absence of 
 abstract side-effects in functions (and, I believe, most Eiffel 
 functions indeed do not have abstract side-effect, am I right?). For 
 those cases, when some functions in system are known to have 
 side-effect, we can allow user to specify an "impurity" file, that would 
 list those functions. I think that these cases would be rare.
 Secondly, Eiffel advocates uniform access principle. Routine's clients 
 don't have to know, which queries, used in precondition, are attributes 
 and which are functions. If inferred preconditions contained only 
 attributes, it would be inconsistent with uniform access principle, I 
 think...
 
 The issue with functions' preconditions that you've mentioned isn't a 
 big problem. If after flattening we got a variable of the form, e.g., 
 `x.f1.f2', where `f1' and `f2' are functions, printing instructions for 
 this variable can be like this:
   if x.f1_precondition_holds and then x.f1.f2_precondition_holds
   then
      print (x.f1.f2)
   else
      print (Nonsensical)
   end
 
 "Nonsensical" is a special Daikon value that indicates that variable 
 cannot be evaluated.
 Function preconditions are known after parsing. To be able to evaluate 
 them separately, we may, for example, wrap them into functions. 
 `f1_precondition_holds' in the above code is an example of such a 
 function, that was introduced by instrumenter and added to x's 
 generating class.
 
 BTW, for attributes we also need check in printing instructions: the 
 target on which attribute is called should be non-void, otherwise 
 attribute value should be printed as `Nonsensical'.
 
 As for troubles that may arise when calling functions... As I 
 understand, exception in a function, called when its precondition holds, 
 is a bug. So, if it happens, we can just tell user that we found a bug 
 in certain function and that before inferring contracts the bug should 
 be fixed.
  • Support for loop invariants
Ilinca: Does the "classic" Daikon support loop invariants?
Nadia: Daikon uses standard program point 
 suffixes to infer more invariants. E.g., it understands that 
 some_routine:::ENTER and some_routine:::EXIT are entry and exit points 
 of the same routine. Thanks to this, it knows `old' values of variables 
 at exit and can infer invariants like "x = old x". As I understood, if 
 Daikon sees an unknown suffix, it treats corresponding program point as, 
 so to say, "standalone" and doesn't try to connect it with any other.
 We don't need any extra support from Daikon to handle loop invariants.
 
 If we want to infer loop invariants, we should introduce a separate 
 program point for each loop. So, we need a way to distinguish loops 
 within a routine. One way is to take its position in source code as 
 index. However, I thought, that position is too unstable to become loop 
 identifier, so I chose instruction number instead. These numbers are 
 unique numbers of instructions within surrounding routine, they may be 
 used not only for loops, but for arbitrary instructions. 
 CI_INSTRUCTION_COUNTER is responsible for counting instruction while 
 iterating through AST.
  • Inferring contracts for generic classes
Nadia: there are actually two different cases.
 1. Contracts inside generic classes. This is not really a problem. 
 Entities, which have parameter type, can be treated as if their 
 generator is constraining class (ANY, if genericity is unconstrained).
 
 2. Contracts in clients of generic classes. Here some problems arise. If 
 we have an entity `x: ARRAYED_LIST[INTEGER]', its flattening involves 
 processing the `item' query. `item' is declared to be of type `G' in 
 ARRAYED_LIST and, as I understand, parsing is not enough (some type 
 analysis is needed) to infer that `G' means `INTEGER' in current 
 context. Two approaches to this problem, that I see at present, are: 
 perform such a type analysis in compile time (with the help of gobo 
 compilation facilities) or leave some flattening work until runtime, 
 when reflections may be used to determine variable's generating class.