Difference between revisions of "Dynamic Binding"
| m | m | ||
| (64 intermediate revisions by one other user not shown) | |||
| Line 1: | Line 1: | ||
| [[Category:Compiler]] | [[Category:Compiler]] | ||
| − | {{ | + | {{UnderConstruction}} | 
| + | ===Introduction=== | ||
| Dynamic binding is a key feature behind object technology. In the following piece of code: | Dynamic binding is a key feature behind object technology. In the following piece of code: | ||
| Line 13: | Line 14: | ||
| |} | |} | ||
| − | It is not known at compile time, which feature has to be called for call a.f. This depends on the exact type of the object referenced by a, also called its dynamic type.   | + | It is not known at compile time, which feature has to be called for call a.f. This depends on the exact type of the object referenced by a, also called its dynamic type. | 
| − | ==== | + | ===Some Ids=== | 
| + | In the following sections several ids are introduced. Their explanation is based on the following example system (Further referenced as the ABC system): | ||
| − | + | {|border="0" cellpadding="0" cellspacing="0" align="center" | |
| − | + | |-valign="top" -halign="center" | |
| + | |[[Image:DynamicBindingABCExample.png]] | ||
| + | |} | ||
| − | + | ====routine_id==== | |
| + | In a compiled Eiffel system, every feature that is a seed gets its own unique '''routine_id'''. Features that are just inherited or are a redeclaration have the same '''routine_id''' as their precursor (See [[Routine IDs]]).   | ||
| − | In the  | + | It is possible for a feature to have more than one '''routine_id''' when it has more than one precursor. This is the case for feature j of class C that redefines both feature g of A and feature h of B and thus gets both their '''routine_id''''s. | 
| + | |||
| + | As it will be shown later, the chosen '''routine_id''' have an impact on the dynamic binding semantics. In class C the feature f_b gets the routine_id 1 since it is selected, the not selected feature f of C gets a new '''routine_id'''. | ||
| + | |||
| + | ====body_id and real_body_id==== | ||
| + | Every routine body, that means every declaration or redeclaration, gets its own unique '''body_id'''. The '''body_id''' of feature g in classes A and B are the same since g is not redefined in B. | ||
| + | |||
| + | A body of a generic class may need several translated versions, when there are different type instantiations of the class.  | ||
| + | For getting the right version of the correct type instantiation the '''real_body_id''' is used.  | ||
| + | There might be for example two translations for feature f of class A. One for each of the types A [INTEGER] and A [STRING]. Their '''real_body_id''' will thus be different. | ||
| + | The '''real_body_id''' is mostly important in workbench mode. | ||
| + | |||
| + | ====feature_id==== | ||
| + | Every feature of a class gets a '''feature_id''' that is unique in the context of this class. | ||
| + | |||
| + | ====class_id and type_id==== | ||
| + | |||
| + | Every class gets a unique class_id an every type a unique '''type_id'''. In the ABC system the classes A, B and C have the class_id's 10, 11 and 12 | ||
| + | |||
| + | ===Dynamic binding in finalized code=== | ||
| + | In finalized code the dynamic binding is very simple. For every routine_id there exists an array, that maps dynamic type ids to function pointers. This array is called routine table.  | ||
| + | |||
| + | For attributes there is also a routine table but it contains offsets instead of function pointers. | ||
| + | |||
| + | The code snipped shows how the routine table for routine_id 1 looks like and on the right side a dynamically bound call (its the feature foo from above):   | ||
| {|border="0" cellpadding="2" cellspacing="0" align="center" | {|border="0" cellpadding="2" cellspacing="0" align="center" | ||
| |-valign="top" -halign="center" | |-valign="top" -halign="center" | ||
| − | |[[ | + | | | 
| + | <code>[c, N]char *(*Txiockf[6])(); | ||
| + | void Txiockf_init () { | ||
| + |    Txiockf[0] = (char *(*)()) A_INTEGER_f; | ||
| + |    Txiockf[1] = (char *(*)()) A_REFERENCE_f; | ||
| + |    Txiockf[2] = (char *(*)()) B_INTEGER_f_b; | ||
| + |    Txiockf[3] = (char *(*)()) B_REFERENCE_f_b; | ||
| + |    Txiockf[4] = (char *(*)()) B_INTEGER_f_b; | ||
| + |    Txiockf[5] = (char *(*)()) B_REFERENCE_f_b; | ||
| + | }</code> | ||
| + | | | ||
| + | | | ||
| + | <code>[c,N]void Fcubdx0 (EIF_REFERENCE Current, EIF_REFERENCE arg1) { | ||
| + |    ... | ||
| + |       //Offset 50 is used to allow a smaller routine table | ||
| + |    (FUNCTION_CAST(void, (EIF_REFERENCE))  | ||
| + |       Txiockf[Dtype(arg1)-50])(arg1);      | ||
| + |    ...                                     | ||
| + | }</code> | ||
| + | | | ||
| |} | |} | ||
| − | + | The function names in the routine table are artificial to ease its understanding. A_INTEGER_f for example is the function for the feature f of type A [INTEGER]. | |
| − | + | ||
| − | + | Two facts become obvious: | |
| − | + | *The type_id of inherited classes should be close together, otherwise the routine tables will have big holes and become huge.   | |
| − | + | *When in the above example an object of a wrong type is passed to Current it will have disastrous consequences. | |
| − | + | ||
| − | + | ||
| − | + | ||
| − | + | ===Dynamic binding in workbench code=== | |
| − | A  | + | Before explaining the dynamic part the generated workbench code for two feature calls are shown: | 
| + | |||
| + | {|border="0" cellpadding="2" cellspacing="0" align="center" | ||
| + | |-valign="top" -halign="center" | ||
| + | | | ||
| + | <code>[c] | ||
| + | (FUNCTION_CAST(void, (EIF_REFERENCE)) RTVF(50, 1, "f", arg1))(arg1); //red call | ||
| + | (FUNCTION_CAST(void, (EIF_REFERENCE)) RTVF(55, 3, "f_b", arg1))(arg1); //blue call | ||
| + | </code> | ||
| + | | | ||
| + | |} | ||
| + | |||
| + | The macro RTVF will resolve the correct feature at runtime. It does so based on three things: | ||
| + | * The static type: 50 (for A [INTEGER]) | ||
| + | * The feature_id: 1 (for f of A) | ||
| + | * The dynamic type: Resolved at runtime through the pointer arg1. Is either 50, 52 or 54. | ||
| + | |||
| + | At run-time the real_body_id of the feature that has to be called will be calculated. This is done in three steps: | ||
| + | # The routine_id is calculated. | ||
| + | # The rout_info structure is resolved. | ||
| + | # The real_body_id is calculated. | ||
| + | |||
| + | ====Calculating the routine_id==== | ||
| + | The routine_id is resolved with a two level look up in the ecall table with the static type and the feature_id. This is shown in the following picture: | ||
| + | |||
| + | {|border="0" cellpadding="2" cellspacing="0" align="center" | ||
| + | |-valign="top" -halign="center" | ||
| + | | | ||
| + | [[Image:ABCecall.png]] | ||
| + | | | ||
| + | |} | ||
| + | |||
| + | The look up path taken for the red call (see c code above) is emphasized in red and the one of the blue call in blue. | ||
| + | |||
| + | There is the question why it is even necessary to calculate the routine_id at runtime instead of just doing so at compile time and save it in the generated code. All the clients of the features with a changed routine_id needed to be melted. And it can indeed happen, that the routine_id changes for given static type and feature_id. | ||
| + | |||
| + | ====Calculating the rout_info==== | ||
| + | The origin of a feature is the class that first introduced the feature. Or more formal, the highest class in the inheritance hierarchy that has a feature with the same routine_id.  | ||
| + | |||
| + | Example: The origin of feature f_b of class B is class A. | ||
| + | |||
| + | Every class introduces only a fixed amount of features. All of them get a unique offset.   | ||
| + | |||
| + | Example 1: Class A introduces features f and g. So they get the offsets zero and one. | ||
| + | |||
| + | Example 2: Class B only introduces feature h. Feature h gets offset zero.  | ||
| + | |||
| + | Origin and offset together are called the rout_info. The rout_info can be resolved with the routine_id through the ecall table: | ||
| + | |||
| + | {|border="0" cellpadding="2" cellspacing="0" align="center" | ||
| + | |-valign="top" -halign="center" | ||
| + | | | ||
| + | [[Image:ABCeorg.png]] | ||
| + | | | ||
| + | |} | ||
| + | |||
| + | The red and blue color have the same meaning as in the previous picture. | ||
| + | |||
| + | ====Calculating the real_body_id==== | ||
| + | Central to the dynamic binding in workbench mode is the desc_tab. For the ABC system this table looks like this: | ||
| + | |||
| + | {|border="0" cellpadding="2" cellspacing="0" align="center" | ||
| + | |-valign="top" -halign="center" | ||
| + | | | ||
| + | [[Image:ABCDescTab.png]] | ||
| + | |||
| + | | | ||
| + | |} | ||
| + | |||
| + | For every origin in the system there is a second table (shown in yellow) that has an entry   for every type that is an instantiation of either the origin itself or of a descendant.  | ||
| + | |||
| + | Example: The table for class B has four entries, one for each of the types B [INTEGER], B [STRING], C [INTEGER] and C [STRING] with ids 52, 53, 54 and 55. | ||
| + | |||
| + | Every yellow table belongs to a certain origin. And so it comes, that every entry of such a yellow table references an other table with as many entries as the origin introduces features.  | ||
| + | |||
| + | These gray tables are indexed by the offset (of rout_info) and finally yield the real_body_id. | ||
| + | |||
| + | Example: The entries of the yellow table for class B (with class_id 11) reference tables with one entry, since class B only introduces one feature (feature h).  | ||
| + | |||
| + | The exact resolution for the read call and blue call is again shown in their corresponding color.   | ||
| + | |||
| + | Example: When doing the look up for the read call and the class_id (10) and offset (0) are already known the real_body_id is found with the following piece of C code: | ||
| + | |||
| + | {|border="0" cellpadding="2" cellspacing="0" align="center" | ||
| + | |-valign="top" -halign="center" | ||
| + | | | ||
| + | <code>[c] | ||
| + | desc_tab [10] [50] [0] | ||
| + | </code> | ||
| + | | | ||
| + | |} | ||
| + | |||
| + | For the blue call with class_id 11 and offset zero the look up looks like: | ||
| + | |||
| + | {|border="0" cellpadding="2" cellspacing="0" align="center" | ||
| + | |-valign="top" -halign="center" | ||
| + | | | ||
| + | <code>[c] | ||
| + | desc_tab [11] [55] [0] | ||
| + | </code> | ||
| + | | | ||
| + | |} | ||
Latest revision as of 02:14, 11 April 2007
Contents
Introduction
Dynamic binding is a key feature behind object technology. In the following piece of code:
| foo (a: A) do a.f end | 
It is not known at compile time, which feature has to be called for call a.f. This depends on the exact type of the object referenced by a, also called its dynamic type.
Some Ids
In the following sections several ids are introduced. Their explanation is based on the following example system (Further referenced as the ABC system):
|   | 
routine_id
In a compiled Eiffel system, every feature that is a seed gets its own unique routine_id. Features that are just inherited or are a redeclaration have the same routine_id as their precursor (See Routine IDs).
It is possible for a feature to have more than one routine_id when it has more than one precursor. This is the case for feature j of class C that redefines both feature g of A and feature h of B and thus gets both their routine_id's.
As it will be shown later, the chosen routine_id have an impact on the dynamic binding semantics. In class C the feature f_b gets the routine_id 1 since it is selected, the not selected feature f of C gets a new routine_id.
body_id and real_body_id
Every routine body, that means every declaration or redeclaration, gets its own unique body_id. The body_id of feature g in classes A and B are the same since g is not redefined in B.
A body of a generic class may need several translated versions, when there are different type instantiations of the class. For getting the right version of the correct type instantiation the real_body_id is used. There might be for example two translations for feature f of class A. One for each of the types A [INTEGER] and A [STRING]. Their real_body_id will thus be different. The real_body_id is mostly important in workbench mode.
feature_id
Every feature of a class gets a feature_id that is unique in the context of this class.
class_id and type_id
Every class gets a unique class_id an every type a unique type_id. In the ABC system the classes A, B and C have the class_id's 10, 11 and 12
Dynamic binding in finalized code
In finalized code the dynamic binding is very simple. For every routine_id there exists an array, that maps dynamic type ids to function pointers. This array is called routine table.
For attributes there is also a routine table but it contains offsets instead of function pointers.
The code snipped shows how the routine table for routine_id 1 looks like and on the right side a dynamically bound call (its the feature foo from above):
| char *(*Txiockf[6])(); void Txiockf_init () { Txiockf[0] = (char *(*)()) A_INTEGER_f; Txiockf[1] = (char *(*)()) A_REFERENCE_f; Txiockf[2] = (char *(*)()) B_INTEGER_f_b; Txiockf[3] = (char *(*)()) B_REFERENCE_f_b; Txiockf[4] = (char *(*)()) B_INTEGER_f_b; Txiockf[5] = (char *(*)()) B_REFERENCE_f_b; } | void Fcubdx0 (EIF_REFERENCE Current, EIF_REFERENCE arg1) { ... //Offset 50 is used to allow a smaller routine table (FUNCTION_CAST(void, (EIF_REFERENCE)) Txiockf[Dtype(arg1)-50])(arg1); ... } | 
The function names in the routine table are artificial to ease its understanding. A_INTEGER_f for example is the function for the feature f of type A [INTEGER].
Two facts become obvious:
- The type_id of inherited classes should be close together, otherwise the routine tables will have big holes and become huge.
- When in the above example an object of a wrong type is passed to Current it will have disastrous consequences.
Dynamic binding in workbench code
Before explaining the dynamic part the generated workbench code for two feature calls are shown:
| (FUNCTION_CAST(void, (EIF_REFERENCE)) RTVF(50, 1, "f", arg1))(arg1); //red call (FUNCTION_CAST(void, (EIF_REFERENCE)) RTVF(55, 3, "f_b", arg1))(arg1); //blue call | 
The macro RTVF will resolve the correct feature at runtime. It does so based on three things:
- The static type: 50 (for A [INTEGER])
- The feature_id: 1 (for f of A)
- The dynamic type: Resolved at runtime through the pointer arg1. Is either 50, 52 or 54.
At run-time the real_body_id of the feature that has to be called will be calculated. This is done in three steps:
- The routine_id is calculated.
- The rout_info structure is resolved.
- The real_body_id is calculated.
Calculating the routine_id
The routine_id is resolved with a two level look up in the ecall table with the static type and the feature_id. This is shown in the following picture:
The look up path taken for the red call (see c code above) is emphasized in red and the one of the blue call in blue.
There is the question why it is even necessary to calculate the routine_id at runtime instead of just doing so at compile time and save it in the generated code. All the clients of the features with a changed routine_id needed to be melted. And it can indeed happen, that the routine_id changes for given static type and feature_id.
Calculating the rout_info
The origin of a feature is the class that first introduced the feature. Or more formal, the highest class in the inheritance hierarchy that has a feature with the same routine_id.
Example: The origin of feature f_b of class B is class A.
Every class introduces only a fixed amount of features. All of them get a unique offset.
Example 1: Class A introduces features f and g. So they get the offsets zero and one.
Example 2: Class B only introduces feature h. Feature h gets offset zero.
Origin and offset together are called the rout_info. The rout_info can be resolved with the routine_id through the ecall table:
The red and blue color have the same meaning as in the previous picture.
Calculating the real_body_id
Central to the dynamic binding in workbench mode is the desc_tab. For the ABC system this table looks like this:
For every origin in the system there is a second table (shown in yellow) that has an entry for every type that is an instantiation of either the origin itself or of a descendant.
Example: The table for class B has four entries, one for each of the types B [INTEGER], B [STRING], C [INTEGER] and C [STRING] with ids 52, 53, 54 and 55.
Every yellow table belongs to a certain origin. And so it comes, that every entry of such a yellow table references an other table with as many entries as the origin introduces features.
These gray tables are indexed by the offset (of rout_info) and finally yield the real_body_id.
Example: The entries of the yellow table for class B (with class_id 11) reference tables with one entry, since class B only introduces one feature (feature h).
The exact resolution for the read call and blue call is again shown in their corresponding color.
Example: When doing the look up for the read call and the class_id (10) and offset (0) are already known the real_body_id is found with the following piece of C code:
| desc_tab [10] [50] [0] | 
For the blue call with class_id 11 and offset zero the look up looks like:
| desc_tab [11] [55] [0] | 






