Difference between revisions of "CITADEL"
m (→Discussions) |
(→Flattened variables) |
||
| Line 63: | Line 63: | ||
side-effect, we can allow user to specify an "impurity" file, that would | side-effect, we can allow user to specify an "impurity" file, that would | ||
list those functions. I think that these cases would be rare. | list those functions. I think that these cases would be rare. | ||
| + | |||
| + | Ilinca: Indeed, I also think these cases would be rare, but, depending on the | ||
| + | size of the system, it might be very cumbersome for users to determine | ||
| + | these functions. Something else: what Daikon calls "purity" is actually | ||
| + | weak purity, isn't it? (Weak purity means that creations of local | ||
| + | variables are allowed in the body of a pure function. Strong purity | ||
| + | disallows this.) | ||
| + | |||
Secondly, Eiffel advocates uniform access principle. Routine's clients | Secondly, Eiffel advocates uniform access principle. Routine's clients | ||
don't have to know, which queries, used in precondition, are attributes | don't have to know, which queries, used in precondition, are attributes | ||
| Line 88: | Line 96: | ||
generating class. | generating class. | ||
| + | Ilinca: I don't think it's that easy, mainly because of polymorphism. Statically | ||
| + | you can't determine the type that x will be a direct instance of at | ||
| + | runtime. Hence, you don't know which preconditions to wrap. Of course, | ||
| + | since redefined routines can keep or weaken the preconditions of the | ||
| + | original routines, you can take the pessimistic approach of wrapping | ||
| + | only the precondition of the function from the static type of x, but | ||
| + | then you might miss cases in which the weakened version of the | ||
| + | precondition does hold. Do you understand my point? | ||
| + | |||
| + | There's one further problem with this approach: when a function is | ||
| + | called as part of the evaluation of a contract, this function's | ||
| + | contracts are not checked. The reason for this is to avoid infinite | ||
| + | loops. If you wrap your precondition into a function, the contracts of | ||
| + | any routines it calls will be evaluated, and thus the system can | ||
| + | actually behave differently. | ||
| + | |||
BTW, for attributes we also need check in printing instructions: the | BTW, for attributes we also need check in printing instructions: the | ||
target on which attribute is called should be non-void, otherwise | target on which attribute is called should be non-void, otherwise | ||
Revision as of 06:17, 14 December 2007
Contents
Overview
CITADEL (Contract-Inference Tool that Applies Daikon to the Eiffel Language) is a tool that provides an Eiffel front-end for the Daikon invariant detector. It is being implemented as a Master's project by Nadia Polikarpova supervised by Ilinca Ciupa (ETH).
Discussions
Flattened variables
Nadia: Daikon can process values of only few types: five scalar types (boolean,
integer, double, string and hashcode) and five more types, which are
arrays of these scalars.
So, if in Eiffel code we a have a variable of one of basic types
(BOOLEAN, NATURAL, INTEGER, REAL or STRING) we can print its value
directly as if it has one of Daikon's types.
If, however, we have a variable of any other type, we have to collect
all the information about it that is of interest and is available at
certain program point, and present this information to Daikon as a set
of variables, which have Daikon types. The information of interest
about a variable includes results of all queries that can be called on
this variable at this program point. However, since queries results may
themselves be of non-basic types, we have to perform this operation
recursively (to a certain depth).
For a reference variable information of interest includes also its
address (Daikon type "hashcode" is intended for representing object
addresses).
For example, if we had a variable c: POINT, queries of class POINT
available at current program point were `x: REAL', `y: REAL' and `twin:
POINT' and flattening depth were set to 2, then `c.flattened' would be a
set of variables: {$c, c.x, c.y, $c.twin, c.twin.x, c.twin.y}.
Containers need special kind of flattening. If we want Daikon to handle
them sensibly, they should be converted to arrays of scalars (to be more
precise, it should be done only if all container elements are observable
at current program point...). It isn't difficult for descendants of
CONTAINER class from standard library. Instrumenter may check, if a
variable is an indirect instance of CONTAINER, get its
`linear_representation', then traverse it and output as array. I think
that in the first version we may leave aside those containers, that
aren't inherited from CONTAINER.
Ilinca: I'm not sure about the idea of including argument-less functions in this flattened form. What if the precondition of the function doesn't hold or we run into some kind of trouble when executing the function body (trouble such as the contracts of a sub-call made by this function not holding or other bugs)? On the other hand, including functions would bring significantly more information about the state of the object... Does the Daikon frontend for Java, for instance, by default evaluate functions too as part of the flattened form of the objects?
Nadia: Well, this idea is disputable, of cause.
In Java considering argument-less functions as variables is more
dangerous, because Java, like all C-family languages, encourages
side-effects in functions. However, in Daikon front-end for Java there
exists an option that allows using functions as variables. Together with
enabling this option user has to provide a special "purity" file that
lists those functions from the system, that are side-effect free. This
feature is described in Daikon User Manual, page 82.
Considering functions as variables is even more relevant in Eiffel
context by two main reasons. Firstly, Eiffel encourages absence of
abstract side-effects in functions (and, I believe, most Eiffel
functions indeed do not have abstract side-effect, am I right?). For
those cases, when some functions in system are known to have
side-effect, we can allow user to specify an "impurity" file, that would
list those functions. I think that these cases would be rare.
Ilinca: Indeed, I also think these cases would be rare, but, depending on the
size of the system, it might be very cumbersome for users to determine
these functions. Something else: what Daikon calls "purity" is actually
weak purity, isn't it? (Weak purity means that creations of local
variables are allowed in the body of a pure function. Strong purity
disallows this.)
Secondly, Eiffel advocates uniform access principle. Routine's clients
don't have to know, which queries, used in precondition, are attributes
and which are functions. If inferred preconditions contained only
attributes, it would be inconsistent with uniform access principle, I
think...
The issue with functions' preconditions that you've mentioned isn't a
big problem. If after flattening we got a variable of the form, e.g.,
`x.f1.f2', where `f1' and `f2' are functions, printing instructions for
this variable can be like this:
if x.f1_precondition_holds and then x.f1.f2_precondition_holds
then
print (x.f1.f2)
else
print (Nonsensical)
end
"Nonsensical" is a special Daikon value that indicates that variable
cannot be evaluated.
Function preconditions are known after parsing. To be able to evaluate
them separately, we may, for example, wrap them into functions.
`f1_precondition_holds' in the above code is an example of such a
function, that was introduced by instrumenter and added to x's
generating class.
Ilinca: I don't think it's that easy, mainly because of polymorphism. Statically
you can't determine the type that x will be a direct instance of at
runtime. Hence, you don't know which preconditions to wrap. Of course,
since redefined routines can keep or weaken the preconditions of the
original routines, you can take the pessimistic approach of wrapping
only the precondition of the function from the static type of x, but
then you might miss cases in which the weakened version of the
precondition does hold. Do you understand my point?
There's one further problem with this approach: when a function is
called as part of the evaluation of a contract, this function's
contracts are not checked. The reason for this is to avoid infinite
loops. If you wrap your precondition into a function, the contracts of
any routines it calls will be evaluated, and thus the system can
actually behave differently.
BTW, for attributes we also need check in printing instructions: the
target on which attribute is called should be non-void, otherwise
attribute value should be printed as `Nonsensical'.
As for troubles that may arise when calling functions... As I
understand, exception in a function, called when its precondition holds,
is a bug. So, if it happens, we can just tell user that we found a bug
in certain function and that before inferring contracts the bug should
be fixed.
Support for loop invariants
Ilinca: Does the "classic" Daikon support loop invariants?
Nadia: Daikon uses standard program point suffixes to infer more invariants. E.g., it understands that some_routine:::ENTER and some_routine:::EXIT are entry and exit points of the same routine. Thanks to this, it knows `old' values of variables at exit and can infer invariants like "x = old x". As I understood, if Daikon sees an unknown suffix, it treats corresponding program point as, so to say, "standalone" and doesn't try to connect it with any other. We don't need any extra support from Daikon to handle loop invariants. If we want to infer loop invariants, we should introduce a separate program point for each loop. So, we need a way to distinguish loops within a routine. One way is to take its position in source code as index. However, I thought, that position is too unstable to become loop identifier, so I chose instruction number instead. These numbers are unique numbers of instructions within surrounding routine, they may be used not only for loops, but for arbitrary instructions. CI_INSTRUCTION_COUNTER is responsible for counting instruction while iterating through AST.
Inferring contracts for generic classes
Nadia: there are actually two different cases. 1. Contracts inside generic classes. This is not really a problem. Entities, which have parameter type, can be treated as if their generator is constraining class (ANY, if genericity is unconstrained). 2. Contracts in clients of generic classes. Here some problems arise. If we have an entity `x: ARRAYED_LIST[INTEGER]', its flattening involves processing the `item' query. `item' is declared to be of type `G' in ARRAYED_LIST and, as I understand, parsing is not enough (some type analysis is needed) to infer that `G' means `INTEGER' in current context. Two approaches to this problem, that I see at present, are: perform such a type analysis in compile time (with the help of gobo compilation facilities) or leave some flattening work until runtime, when reflections may be used to determine variable's generating class.

