Difference between revisions of "EiffelBase2"
Peter gummer (Talk | contribs) m (Typo) |
(→ToDo list) |
||
Line 166: | Line 166: | ||
Below you can see a list of most important changes planned for EiffelBase2. | Below you can see a list of most important changes planned for EiffelBase2. | ||
* Test extensively with AutoTest. | * Test extensively with AutoTest. | ||
− | |||
* Implement <e>PREORDER_ITERATOR</e> and <e>POSTORDER_ITERATOR</e> to traverse binaries in pre- and postorder. | * Implement <e>PREORDER_ITERATOR</e> and <e>POSTORDER_ITERATOR</e> to traverse binaries in pre- and postorder. | ||
− | |||
* Finish the implementation of the MML library. | * Finish the implementation of the MML library. | ||
* Rewrite loops using the <e>across</e> where appropriate. | * Rewrite loops using the <e>across</e> where appropriate. |
Revision as of 03:24, 29 July 2010
Contents
Overview
EiffelBase2, currently under development, is a general-purpose data structures library for Eiffel. It is intended as the future replacement for the EiffelBase library ("Classic EiffelBase" in this document), which has for many years played a central role in Eiffel development.
Design goals
The design goals for EiffelBase2 are:
- Verifiability. The library is designed to allow proofs of correctness.
- Full contracts. Partly as a result of the verifiability goal, but also for clarity and documentation, the contracts associated with classes and features should describe the relevant semantics in its entirety.
- Simple and consistent hierarchy, in particular avoidance of descendant hiding and of "taxomania" (classes not representing a meaningful abstraction, unnecessary inheritance links).
- As in Classic EiffelBase, application of a systematic classification (a theory) of fundamental data structures.
- Full traversal mechanisms, based on external cursors (with internal cursors also provided when useful).
- Client-interface compatibility with corresponding classes in Classic EiffelBase, whenever it does not conflict with the preceding goals.
Design overview
Traversal mechanisms and classification criteria
A design decision that significantly influenced the architecture of the library is using external iterators to traverse containers as opposed to internal ones available in classic EiffelBase. In the design inspired by external iterators we found it natural to separate containers (with their measurements and modification means) from interfaces to access elements of containers, here called observers.
On the observer side we distinguish maps (with the defining property "accessing elements by unique keys") and iterators (providing linear traversal). An observer can be either used by a container (as a supplier) or inherited by it. The first case produces an external access mechanism and is useful when multiple instances of an observer for the same container can exist at the same time; the second case results in an internal access mechanism and allows only one instance of an observer per container at a time. In the rest of the library maps are mostly inherited and iterators are mostly used, but this is not enforced by the design. For infinite sequences like RANDOM
it makes sense to inherit from an iterator, because they cannot have more than one active iterator.
On the other side, a container is a finite storage of values. Containers are deliberately confined to finite structures and understood real, physical storage. Infinite structures, instead, are usually represented as functions or mechanisms that are external to a program. Most of the time we can only access their elements (not add or remove) and for this purpose we have observers: maps and iterators (the latter in the infinite case are called streams).
Containers may be classified based on different means of modification, but previous experience shows that a complete and clean classification here is impossible (there are too many kinds of modification, basically one per "concrete" data structure). So concrete data structures mostly just inherit directly from CONTAINER
. There is one exception: SEQUENCE
, which represents a sequence of values accessible by indices from a continuous interval. It serves as a common ancestor for ARRAY and LIST and factors out their common search and replacement mechanisms.
Below you can find the class diagram of EiffelBase2, split into two hierarchies (according to connectedness). The first one is a hierarchy of containers and maps. Note: dash-bordered ovals stand for classes that might be included in the library, but are not the first priority of the developers. All EiffelBase2 class names start with V_
(for Verified), but the prefix is omitted in the current document for brevity.
The second one is a hierarchy of streams and iterators.
Comparison criteria
Another important design decision for a data structures library is how to represent and modify comparison criteria for container elements. Adjustable comparison criterion is used in search operations and to define the uniqueness property in sets and tables (uniqueness of keys).
- Classic EiffelBase uses the boolean
object_comparison
attribute inCONTAINER
for this purpose. - Gobo.structures uses another approach: storing equivalence as an object of a specific class, which can be redefined by the user to implement an arbitrary equalivalence relation.
- A similar approach would be to use agents to store the equalivalence relation (no need to create classes). The downside is that agents do not serialize well.
- Another approach is to make
has (x: G)
andsearch (x: G)
always use=
, but also introduceexists (p: PREDICATE [ANY, TUPLE [G]])
andsatisfy (p: PREDICATE [ANY, TUPLE [G]])
. This is more flexible (better fits cases when the predicate is not an equivalence), but duplicates the number of search features and doesn't solve the problem with sets.
An additional question (if the comparison criterion is stored in the container) is whether it can be changed during the object's lifetime. A well known practice is that in sets it is not allowed to change if the set is not empty. This gives rise to changeable_comparison_criterion
query in CONTAINER.
Note that for most set and table implementations (hashed, sorted) the comparison criterion is defined by the corresponding hash function, order, etc. and cannot be modified arbitrarily even when the set or table is empty.
The strategy chosen in EiffelBase2 is to treat differently comparison criteria on which a set or a table is based and ones just used for search. For the first the Gobo approach is taken (2 in the list above), because it is more flexible than in EiffelBase, as we can use whatever equivalence relation we want instead of just is_equal
(useful for library classes). Moreover, the equivalence relation is passed as an argument to the creation procedure and cannot be changed afterward.
For search operations the approach 4 is taken, because it doesn't clutter the container state with properties that don't pertain to the container itself, but to specific search operations.
Indexing
The current version of EiffelBase2 is using the same policy for indexing as the one used in the classic EiffelBase: indexing of lists (and iterators) always starts from 1, whereas for arrays the starting index can be set to an arbitrary value. However, during the code review in the Chair of Software Engineering (17.03.2010) is was decided that the possibility to start arrays from an arbitrary index is not crucial and is used very rarely, but complicates the API. Thus it will probably be changed in future so that indices of arrays, like those of other indexed data structures, always start at 1.
Model-based contracts
Models
EiffelBase2 is specified using the model-based contracts specification method. This method prescribes that each class defines a model - a mathematical object that represents explicitly its abstract state space. The model of a class is expressed as tuple of one or more predefined mathematical objects: booleans, integers, sets, relations, maps, sequences or bags. We also introduce a special mathematical sort for object IDs (references).
Such mathematical objects in the program are represented by model classes, which are immutable and thus are straightforward translations of mathematical definitions. The Mathematical Model Library (MML), which is a part of EiffelBase2 contains model classes for sets, relations, maps, sequences and bags. Boolean and integer components of models are represented by standard classes BOOLEAN
and INTEGER
. Finally, arbitrary reference classes can be used as model components to denote the mathematical sort of references.
For example, a mathematical sequence is a model for a stack or a queue. A pair consisting of a sequence and an integer is a model for a list with an internal cursor.
The value of each component of the model is defined by a model query. You define the model of the class in the source code by listing its model queries under the tag model
in the class's note
clause. For example:
note model: sequence, index class LIST [G] feature -- Access index: INTEGER -- Cursor position deferred end ... feature -- Specification sequence: MML_SEQUENCE [G] -- Sequence of list elements note status: specification ... end
Here we declared the model of class LIST
consisting of two components: a mathematical sequence of type MML_SEQUENCE
and an integer index. As you can see, model queries are not necessarily introduced specifically for specification purposes (as is the case with sequence
); regular queries meant to be called from the code can be also reused as model queries (as index
in this example). We attach a status: specification
note to a query to indicate that its primary purpose is specification.
Contracts
The purpose of introducing model queries is to define the postconditions of the regular features in their terms. For queries we define their result (or the model of the result, in case the query returns a fresh object) as a function of the model of Current
and (the models of) the arguments. Definitions of zero-argument queries, as usual, can be moved to the class invariant. For commands we define their effects on the model queries of Current
and the arguments. If a model query is not mentioned in the postcondition of a command, it is equivalent to stating that it's not modified.
The model-based contracts approach does not constrain the way in which you write preconditions: it is not necessary to express them through model queries if they can be conveniently expressed otherwise.
Class invariants in the model-based contracts approach constrain the values of model queries to make them reflect precisely the abstract state space of the class.
Let us add a couple of features and model-based contracts into the class LIST
shown above:
note model: sequence, index class LIST [G] feature -- Access item: G -- Item at current position require not_off: not off deferred ensure definition: Result = sequence [index] end ... feature -- Status report off: BOOLEAN -- Is cursor off all elements? deferred ensure definition: not sequence.domain [index] end ... feature -- Element change put_right (v: G) -- Put `v' to the right of the cursor require not_after: not after deferred ensure sequence_effect: sequence |=| old (sequence.front (index).extended (v) + sequence.tail (index + 1)) index_effect: index = old index end ... invariant index_in_range: 0 <= index and index <= sequence.count + 1 end
Inheritance
If a class B
inherits from A
it is free to choose, whether to reuse each of A
's model queries to represent its own model, as well as introduce new model queries. If B
does not reuse an A
's model query is has to provide a linking invariant: a definition of the old model query in terms of B
's model. Linking invariants explain the parent's model in terms of the heir's model and thus make sure that the inherited model-based contracts make sense in the heir.
For example, suppose that LIST
inherits directly from CONTAINER
, whose model is a bag:
note model: bag class CONTAINER [G] ... end note model: sequence, index class LIST [G] ... invariant ... bag_domain_definition: bag.domain = sequence.range bag_definition: bag.domain.for all (agent (x: G): BOOLEAN do Result := bag [x] = sequence.occurrences (x) end) end
Here the linking invariant, provided as part of the class invariant in LIST
completely defines an old model query bag
in terms of a new model query sequence
.
Status and roadmap
Development of EiffelBase has started as a project at ETH Zurich; see the project page. Documentation and other material will soon be transferred to the present page.
ToDo list
Below you can see a list of most important changes planned for EiffelBase2.
- Test extensively with AutoTest.
- Implement
PREORDER_ITERATOR
andPOSTORDER_ITERATOR
to traverse binaries in pre- and postorder. - Finish the implementation of the MML library.
- Rewrite loops using the
across
where appropriate. - Implement
RANDOM
andSTANDARD_INPUT
streams. - Add an iterator over keys for
TABLE
. - Make the library void-safe.
- Implement more efficient data structures: e.g., tables with fast lookup both ways, heaps, skip lists, treaps, etc.