Dynamic Binding

Latest revision as of 01:14, 11 April 2007

Not Ready for Review: This Page is Under Development!

Introduction

Dynamic binding is a key feature behind object technology. In the following piece of code:

foo (a: A)
   do
      a.f
   end

It is not known at compile time which feature has to be called for the call a.f. This depends on the exact type of the object referenced by a, also called its dynamic type.

Some Ids

The following sections introduce several ids. Their explanation is based on the following example system (referred to below as the ABC system):

DynamicBindingABCExample.png

routine_id

In a compiled Eiffel system, every feature that is a seed gets its own unique routine_id. Features that are just inherited or are a redeclaration have the same routine_id as their precursor (see Routine IDs).

A feature can have more than one routine_id when it has more than one precursor. This is the case for feature j of class C, which redefines both feature g of A and feature h of B and thus gets both of their routine_ids.

As will be shown later, the chosen routine_ids have an impact on the dynamic binding semantics. In class C the feature f_b gets the routine_id 1 since it is selected; the feature f of C, which is not selected, gets a new routine_id.

body_id and real_body_id

Every routine body, that is, every declaration or redeclaration, gets its own unique body_id. The body_id of feature g is the same in classes A and B, since g is not redefined in B.

A body of a generic class may need several translated versions when the class has different type instantiations. The real_body_id is used to pick the version that belongs to a given type instantiation. For example, there might be two translations of feature f of class A, one for each of the types A [INTEGER] and A [STRING]. Their real_body_ids will then be different. The real_body_id matters mostly in workbench mode.

feature_id

Every feature of a class gets a feature_id that is unique in the context of this class.

class_id and type_id

Every class gets a unique class_id and every type a unique type_id. In the ABC system the classes A, B and C have the class_ids 10, 11 and 12.

Dynamic binding in finalized code

In finalized code, dynamic binding is very simple. For every routine_id there is an array that maps dynamic type ids to function pointers. This array is called the routine table.

For attributes there is also a routine table but it contains offsets instead of function pointers.

The code snippet shows what the routine table for routine_id 1 looks like, followed by a dynamically bound call (it is the feature foo from above):

char *(*Txiockf[6])();
void Txiockf_init () {
   Txiockf[0] = (char *(*)()) A_INTEGER_f;      // type_id 50: A [INTEGER]
   Txiockf[1] = (char *(*)()) A_REFERENCE_f;    // type_id 51: A [STRING]
   Txiockf[2] = (char *(*)()) B_INTEGER_f_b;    // type_id 52: B [INTEGER]
   Txiockf[3] = (char *(*)()) B_REFERENCE_f_b;  // type_id 53: B [STRING]
   Txiockf[4] = (char *(*)()) B_INTEGER_f_b;    // type_id 54: C [INTEGER]
   Txiockf[5] = (char *(*)()) B_REFERENCE_f_b;  // type_id 55: C [STRING]
}

void Fcubdx0 (EIF_REFERENCE Current, EIF_REFERENCE arg1) {
   ...
   // Subtracting the offset 50 (the lowest type_id) allows a smaller routine table
   (FUNCTION_CAST(void, (EIF_REFERENCE))
      Txiockf[Dtype(arg1)-50])(arg1);
   ...
}

The function names in the routine table are made up to make the table easier to understand. A_INTEGER_f, for example, is the function for the feature f of type A [INTEGER].

Two facts become obvious:

  • The type_ids of classes related by inheritance should be close together; otherwise the routine tables have big holes and become huge.
  • When, in the above example, an object of a wrong type is passed as Current, the consequences are disastrous.
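
Both facts can be made concrete with a minimal, self-contained C sketch. It is not the generated code: the function bodies, the checked lookup lookup_routine, the object layouts obj_a and obj_b and the attribute x are all hypothetical and exist only for illustration. The sketch assumes the type_ids 50 to 55 used above and also shows how an attribute table could hold offsets instead of function pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the generated feature bodies. */
typedef const char *(*routine_ptr)(void);

const char *A_INTEGER_f(void)     { return "A_INTEGER_f"; }
const char *A_REFERENCE_f(void)   { return "A_REFERENCE_f"; }
const char *B_INTEGER_f_b(void)   { return "B_INTEGER_f_b"; }
const char *B_REFERENCE_f_b(void) { return "B_REFERENCE_f_b"; }

enum { FIRST_TYPE_ID = 50, TABLE_SIZE = 6 };

/* Routine table for routine_id 1: slot i serves type_id 50 + i. */
static const routine_ptr routine_table_1[TABLE_SIZE] = {
    A_INTEGER_f,     /* 50: A [INTEGER] */
    A_REFERENCE_f,   /* 51: A [STRING]  */
    B_INTEGER_f_b,   /* 52: B [INTEGER] */
    B_REFERENCE_f_b, /* 53: B [STRING]  */
    B_INTEGER_f_b,   /* 54: C [INTEGER] */
    B_REFERENCE_f_b  /* 55: C [STRING]  */
};

/* Checked lookup (an addition of this sketch): returns NULL for a
   dynamic type outside the table instead of reading past its bounds. */
routine_ptr lookup_routine(int dtype)
{
    int slot = dtype - FIRST_TYPE_ID;
    if (slot < 0 || slot >= TABLE_SIZE)
        return NULL;
    return routine_table_1[slot];
}

/* Hypothetical object layouts; the first field holds the dynamic type. */
struct obj_a { int dtype; int x; };
struct obj_b { int dtype; int extra; int x; };

/* Attribute table for the hypothetical attribute x: byte offsets
   instead of function pointers, indexed like the routine table. */
static const size_t x_offsets[TABLE_SIZE] = {
    offsetof(struct obj_a, x), offsetof(struct obj_a, x),
    offsetof(struct obj_b, x), offsetof(struct obj_b, x),
    offsetof(struct obj_b, x), offsetof(struct obj_b, x)
};

/* Reads x from an object of any of the six types. */
int read_x(const void *obj)
{
    int dtype = *(const int *)obj;
    assert(dtype >= FIRST_TYPE_ID && dtype < FIRST_TYPE_ID + TABLE_SIZE);
    return *(const int *)((const char *)obj + x_offsets[dtype - FIRST_TYPE_ID]);
}
```

With these definitions lookup_routine(54) yields B_INTEGER_f_b, while lookup_routine(56) yields NULL. The unchecked indexing in the generated call above would read outside the table in that case. If the six type_ids were spread over a wider range, the table would need one slot for every id in between, used or not.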

Dynamic binding in workbench code

Before the dynamic part is explained, the generated workbench code for two feature calls is shown:

(FUNCTION_CAST(void, (EIF_REFERENCE)) RTVF(50, 1, "f", arg1))(arg1); //red call
(FUNCTION_CAST(void, (EIF_REFERENCE)) RTVF(55, 3, "f_b", arg1))(arg1); //blue call

The macro RTVF resolves the correct feature at run time. It does so based on three things (the values given are those of the red call):

  • The static type: 50 (for A [INTEGER])
  • The feature_id: 1 (for f of A)
  • The dynamic type: resolved at run time through the pointer arg1. It is either 50, 52 or 54.

At run time the real_body_id of the feature that has to be called is calculated. This is done in three steps:

  1. The routine_id is calculated.
  2. The rout_info structure is resolved.
  3. The real_body_id is calculated.

Calculating the routine_id

The routine_id is resolved with a two-level lookup in the ecall table, using the static type and the feature_id. This is shown in the following picture:

ABCecall.png

The lookup path taken for the red call (see the C code above) is highlighted in red and that of the blue call in blue.

This raises the question of why the routine_id is calculated at run time at all, instead of being calculated at compile time and saved in the generated code. The reason is that the routine_id for a given static type and feature_id can indeed change, and if it were stored in the generated code, all clients of the features with a changed routine_id would need to be melted.
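
The two-level lookup can be sketched in C as follows. This is a simplified illustration, not the runtime's actual data structure: the array layout, the function routine_id_of and the convention that 0 means "no routine" are assumptions of this sketch. Only the entry for the red call (static type 50, feature_id 1, routine_id 1) is taken from the text; the other entries are made up.

```c
#include <assert.h>

enum { FIRST_TYPE_ID = 50, TYPE_COUNT = 6, MAX_FEATURE_ID = 3 };

/* First level: one row per static type (type_id - 50).
   Second level: feature_id -> routine_id, 0 meaning "no routine".
   Rows that are not listed stay zero-filled in this sketch. */
static const int ecall[TYPE_COUNT][MAX_FEATURE_ID + 1] = {
    [50 - FIRST_TYPE_ID] = { [1] = 1, [2] = 2 }, /* A [INTEGER]: f -> 1 (from the text), g -> 2 (assumed) */
    [51 - FIRST_TYPE_ID] = { [1] = 1, [2] = 2 }  /* A [STRING]: assumed to mirror A [INTEGER] */
};

/* Resolves the routine_id for a static type and a feature_id. */
int routine_id_of(int static_type, int feature_id)
{
    assert(static_type >= FIRST_TYPE_ID && static_type < FIRST_TYPE_ID + TYPE_COUNT);
    assert(feature_id >= 0 && feature_id <= MAX_FEATURE_ID);
    return ecall[static_type - FIRST_TYPE_ID][feature_id];
}
```

For the red call, routine_id_of(50, 1) yields 1. Because the mapping lives in a table rather than in the code of each client, a changed routine_id only requires a new table entry in this sketch.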

Calculating the rout_info

The origin of a feature is the class that first introduced the feature. More formally, it is the highest class in the inheritance hierarchy that has a feature with the same routine_id.

Example: The origin of feature f_b of class B is class A.

Every class introduces only a fixed number of features. Each of them gets a unique offset.

Example 1: Class A introduces features f and g. So they get the offsets zero and one.

Example 2: Class B only introduces feature h. Feature h gets offset zero.

Origin and offset together are called the rout_info. The rout_info can be resolved with the routine_id through the ecall table:

ABCeorg.png

The red and blue colors have the same meaning as in the previous picture.
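
The pairing of origin and offset can be sketched as a small C table indexed by routine_id. The field names, the table name rout_infos and the function rout_info_of are assumptions of this sketch. The class_ids 10 and 11, the offsets and the routine_id 1 of f follow the examples above, while the routine_ids 2 and 3 are made up.

```c
#include <assert.h>

/* Origin (a class_id) and offset of a feature within its origin. */
struct rout_info {
    int origin;
    int offset;
};

enum { MAX_ROUTINE_ID = 3 };

/* Indexed by routine_id; index 0 is unused in this sketch. */
static const struct rout_info rout_infos[MAX_ROUTINE_ID + 1] = {
    [1] = { 10, 0 }, /* f: origin A, offset 0 */
    [2] = { 10, 1 }, /* g: origin A, offset 1 (routine_id assumed) */
    [3] = { 11, 0 }  /* h: origin B, offset 0 (routine_id assumed) */
};

/* Resolves the rout_info for a routine_id. */
struct rout_info rout_info_of(int routine_id)
{
    assert(routine_id >= 1 && routine_id <= MAX_ROUTINE_ID);
    return rout_infos[routine_id];
}
```

For routine_id 1 this yields origin 10 and offset 0, the two values the red call needs in the next step.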

Calculating the real_body_id

Central to the dynamic binding in workbench mode is the desc_tab. For the ABC system this table looks like this:

ABCDescTab.png

For every origin in the system there is a second table (shown in yellow) that has an entry for every type that is an instantiation of either the origin itself or of a descendant.

Example: The table for class B has four entries, one for each of the types B [INTEGER], B [STRING], C [INTEGER] and C [STRING] with ids 52, 53, 54 and 55.

Every yellow table belongs to a certain origin. Consequently, every entry of a yellow table references another table with as many entries as the origin introduces features.

These gray tables are indexed by the offset (of rout_info) and finally yield the real_body_id.

Example: The entries of the yellow table for class B (with class_id 11) reference tables with one entry, since class B only introduces one feature (feature h).

The exact resolution for the red call and the blue call is again shown in the corresponding colors.

Example: For the red call, once the class_id (10) and the offset (0) are known, the real_body_id is found with the following piece of C code:

desc_tab [10] [50] [0]

For the blue call, with class_id 11 and offset zero, the lookup is:

desc_tab [11] [55] [0]
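
The three-level structure behind these two expressions can be sketched in C as nested tables. This is an illustration under stated assumptions, not the runtime's implementation: the indices are shifted by 10 and 50 to keep the arrays small, every yellow table has a slot for all six types (with NULL where a type is not a descendant of the origin), class C is assumed to be no origin, and all real_body_id values are made up.

```c
#include <assert.h>
#include <stddef.h>

enum { FIRST_CLASS_ID = 10, CLASS_COUNT = 3, FIRST_TYPE_ID = 50, TYPE_COUNT = 6 };

/* Gray tables: offset -> real_body_id (all values are made up). */
static const int a_in_a_int[2] = { 100, 110 }; /* f, g as seen from A [INTEGER] */
static const int a_in_a_str[2] = { 101, 111 };
static const int a_in_b_int[2] = { 102, 112 };
static const int a_in_b_str[2] = { 103, 113 };
static const int a_in_c_int[2] = { 102, 114 };
static const int a_in_c_str[2] = { 103, 115 };
static const int b_in_b_int[1] = { 120 };      /* h as seen from B [INTEGER] */
static const int b_in_b_str[1] = { 121 };
static const int b_in_c_int[1] = { 122 };
static const int b_in_c_str[1] = { 123 };

/* Yellow tables: one slot per type_id (type_id - 50). */
static const int *const yellow_A[TYPE_COUNT] = {
    a_in_a_int, a_in_a_str, a_in_b_int, a_in_b_str, a_in_c_int, a_in_c_str
};
static const int *const yellow_B[TYPE_COUNT] = {
    NULL, NULL, b_in_b_int, b_in_b_str, b_in_c_int, b_in_c_str
};

/* Top level: one slot per origin (class_id - 10). */
static const int *const *const desc_tab[CLASS_COUNT] = { yellow_A, yellow_B, NULL };

/* Resolves the real_body_id for an origin, a dynamic type and an offset. */
int real_body_id_of(int origin, int dtype, int offset)
{
    assert(origin >= FIRST_CLASS_ID && origin < FIRST_CLASS_ID + CLASS_COUNT);
    assert(dtype >= FIRST_TYPE_ID && dtype < FIRST_TYPE_ID + TYPE_COUNT);
    const int *const *yellow = desc_tab[origin - FIRST_CLASS_ID];
    assert(yellow != NULL);
    const int *gray = yellow[dtype - FIRST_TYPE_ID];
    assert(gray != NULL);
    return gray[offset];
}
```

Here real_body_id_of(10, 50, 0) plays the role of desc_tab [10] [50] [0] and real_body_id_of(11, 55, 0) that of desc_tab [11] [55] [0]. The sketch does not check the offset against the size of the gray table, so a wrong offset would read out of bounds.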