Difference between revisions of "Melting Ice Technology"

m (Patterns to call the other side)
m (Compiled versus interpreted)
 
(13 intermediate revisions by 4 users not shown)
Line 1: Line 1:
[[Category:Documentation]]
+
[[Category:Compiler]]
{{Warning|'''Warning''': Article under development}}
+
{{UnderConstruction}}
 
+
====Introduction====
 
There are two main criteria for any good compiler. Both the compilation and the compiled program need to be fast. There is a trade off, the more time a compiler takes to make optimizations, the faster the compiled program is and the slower the compilation process becomes.  
 
There are two main criteria for any good compiler. Both the compilation and the compiled program need to be fast. There is a trade off, the more time a compiler takes to make optimizations, the faster the compiled program is and the slower the compilation process becomes.  
  
Fortunately the need for fast compilation and fast compiled programs occurs at different times during the development cycle. When the developer is incrementally writing and testing a program he needs short compilation time to be productive. At the end, when his work is thoroughly tested and ready to ship there can a slower compilation that generates a very fast delivery.  
+
Fortunately the need for fast compilation and fast compiled programs occurs at different times during the development cycle. When the developer is incrementally writing and testing a program he needs short compilation time to be productive. At the end, when his work is thoroughly tested and ready to ship there can be a slower compilation that generates a very fast delivery.  
  
EiffelStudio exploits this by providing two basic compilation modes: workbench and finalized mode.  
+
EiffelStudio's compiler, known from here on in as "Eiffel compiler", exploits this by providing two basic compilation modes: workbench and finalized mode.  
 
The C code generated by the compiler looks different for workbench and finalized mode. The terms workbench code and finalized code are used to refer to the corresponding generated C code.
 
The C code generated by the compiler looks different for workbench and finalized mode. The terms workbench code and finalized code are used to refer to the corresponding generated C code.
  
Line 14: Line 14:
 
* It supports precompiles.
 
* It supports precompiles.
  
Finalized code has only two advantages, smallness and speed. Whereas the former contributes to the latter due too better cache efficiency. Finalized code can only be debugged at the C level. This article focuses on the workbench mode of EiffelStudio
+
Finalized code has only two advantages, footprint and speed. A small footprint contributes to speed due to better cache efficiency. Finalized code can only be debugged at the C level. This article focuses on the workbench mode of the Eiffel compiler.
  
 
====Compiled versus interpreted====
 
====Compiled versus interpreted====
EiffelStudio was designed with the following principle in mind: When a programmer makes a small change he expects a short recompilation time, when he makes a big change he will accept some waiting time.  
+
The Eiffel compiler was designed with the following principle in mind: when a programmer makes a small change he expects a short recompilation time, and when he makes a big change he will accept some waiting time.  
  
 
Traditionally, the fastest programming environments (in terms of recompilation time) were interpreted languages (like VisualBasic). Unfortunately interpreted languages are not competitive performance wise.  
 
Traditionally, the fastest programming environments (in terms of recompilation time) were interpreted languages (like VisualBasic). Unfortunately interpreted languages are not competitive performance wise.  
  
The designers of EiffelStudio circumnavigated the trade off by taking the best of both approaches. Thus, in workbench mode EiffelStudio is half compiler and half interpreter. The technology behind this is called melting ice.  
+
The designers of the Eiffel compiler circumnavigated the trade off by taking the best of both approaches. Thus, in workbench mode the Eiffel compiler is half compiler and half interpreter. The technology behind this is called ''melting ice''.
  
An Eiffel System compiled in workbench mode consists of both frozen and melted code. Frozen code is code, that is translated to C code. Melted code is not yet translated to C code but to a form of byte code that has to be interpreted. This byte code is called EiffelStudio byte code (EBC).
+
An Eiffel system compiled in workbench mode consists of both frozen and melted code. Frozen code is code that is translated to C code. Melted code is not yet translated to C code but to a form of byte code that has to be interpreted. This byte code is called Eiffel byte code (EBC).
  
This yields two requirements to the EiffelStudio runtime. It needs both to be able to execute the EiffelStudio byte code and to handle calls from frozen code into melted code and vice versa.
+
This yields two requirements to the Eiffel runtime. It needs both to be able to execute the EBC and to handle calls from frozen code into melted code and vice versa.
  
 
====The interpreter and the byte code====
 
====The interpreter and the byte code====
The EiffelStudio byte code interpreter is a stack machine. The byte code itself is not very different from .NET byte code or Java Byte Code.  
+
The EBC interpreter is a stack machine. The byte code itself is not very different from .NET byte code or Java Byte Code.  
  
This article won't explain the byte codes. This is partially done in [[Byte Code]]. To give a general idea the Fibonacci feature and its EiffelStudio byte code are shown:
+
This article won't explain the byte codes. This is partially done in [[Byte Code]]. To give a general idea the Fibonacci feature and its EBC are shown:
  
 
{|border="0" cellpadding="2" cellspacing="0" align="center"
 
{|border="0" cellpadding="2" cellspacing="0" align="center"
Line 80: Line 80:
 
|
 
|
 
<code>  
 
<code>  
90: BC_JMP 80
 
 
  95: BC_HOOK 4
 
  95: BC_HOOK 4
 
100: BC_ARG 1  
 
100: BC_ARG 1  
Line 162: Line 161:
  
 
This two patterns need to be available not for every feature in the system but for every signature type. They are generated into the file epattern.c.
 
This two patterns need to be available not for every feature in the system but for every signature type. They are generated into the file epattern.c.
 +
 +
====Locating the right pattern====
 +
In workbench mode every piece of translated code is referenced by its real_body_id. In the runtime the array mpatidtab serves as a function between real_body_id and pattern_id. There is a second array called pattern that makes the mapping between pattern_id and the two function pointers to the ''toi'' and ''toc'' patterns.
 +
 +
So the right pattern ''toi'' could be resolved like:
 +
 +
{|border="0" cellpadding="2" cellspacing="0" align="center"
 +
|-valign="top" -halign="center"
 +
|
 +
<code>[c,N]
 +
pattern [mpatidtab [real_body_id]].toi
 +
</code>
 +
|
 +
|}

Latest revision as of 17:25, 14 June 2007

Construction.png Not Ready for Review: This Page is Under Development!

Introduction

There are two main criteria for any good compiler. Both the compilation and the compiled program need to be fast. There is a trade off, the more time a compiler takes to make optimizations, the faster the compiled program is and the slower the compilation process becomes.

Fortunately the need for fast compilation and fast compiled programs occurs at different times during the development cycle. When the developer is incrementally writing and testing a program he needs short compilation time to be productive. At the end, when his work is thoroughly tested and ready to ship there can be a slower compilation that generates a very fast delivery.

EiffelStudio's compiler, known from here on in as "Eiffel compiler", exploits this by providing two basic compilation modes: workbench and finalized mode. The C code generated by the compiler looks different for workbench and finalized mode. The terms workbench code and finalized code are used to refer to the corresponding generated C code.

Workbench code has the following properties:

  • It is easily testable (by debugging).
  • It compiles and recompiles very fast (due to melting ice technology).
  • It supports precompiles.

Finalized code has only two advantages, footprint and speed. A small footprint contributes to speed due to better cache efficiency. Finalized code can only be debugged at the C level. This article focuses on the workbench mode of the Eiffel compiler.

Compiled versus interpreted

The Eiffel compiler was designed with the following principle in mind: when a programmer makes a small change he expects a short recompilation time, and when he makes a big change he will accept some waiting time.

Traditionally, the fastest programming environments (in terms of recompilation time) were interpreted languages (like VisualBasic). Unfortunately interpreted languages are not competitive performance wise.

The designers of the Eiffel compiler circumnavigated the trade off by taking the best of both approaches. Thus, in workbench mode the Eiffel compiler is half compiler and half interpreter. The technology behind this is called melting ice.

An Eiffel system compiled in workbench mode consists of both frozen and melted code. Frozen code is code that is translated to C code. Melted code is not yet translated to C code but to a form of byte code that has to be interpreted. This byte code is called Eiffel byte code (EBC).

This yields two requirements to the Eiffel runtime. It needs both to be able to execute the EBC and to handle calls from frozen code into melted code and vice versa.

The interpreter and the byte code

The EBC interpreter is a stack machine. The byte code itself is not very different from .NET byte code or Java Byte Code.

This article won't explain the byte codes. This is partially done in Byte Code. To give a general idea the Fibonacci feature and its EBC are shown:

fibonacci (n: INTEGER): INTEGER
   require
      n >= 0
   do
      if n = 0 then
         Result := 0
      elseif n = 1 then
         Result := 1
      else
         Result := 
            fibonacci (n - 1) + 
            fibonacci (n - 2)
      end
   end
1: BC_START 
Routine Id   : 186
Body Id      : 185
Result Type  : [INTEGER_32]
Nr. args     : 1
Nr. locals   : 0
  18: BC_NO_CLONE_ARG 
Routine name : fibonacci
Written      : 9
 32: BC_PRECOND offset 23
 37: BC_HOOK 1
 42: BC_ASSERT <8, 16>
 45: BC_ARG 1 
 48: BC_INT32 0
 53: BC_GE 
 54: BC_END_PRE offset 0
 59: BC_RAISE_PREC 
 60: BC_HOOK 2
 65: BC_ARG 1 
 68: BC_INT32 0
 73: BC_EQ 
 74: BC_JMP_F 16
 79: BC_HOOK 3
 84: BC_INT32 0
 89: BC_RASSIGN 
 90: BC_JMP 80
95: BC_HOOK 4
100: BC_ARG 1 
103: BC_INT32 1
108: BC_EQ 
109: BC_JMP_F 16
114: BC_HOOK 5
119: BC_INT32 1
124: BC_RASSIGN 
125: BC_JMP 45
130: BC_HOOK 6
135: BC_ARG 1 
138: BC_INT32 1
143: BC_MINUS 
144: BC_CURRENT 
145: BC_FEATURE fid 36 [9: HELLO_WORLD]
154: BC_ARG 1 
157: BC_INT32 2
162: BC_MINUS 
163: BC_CURRENT 
164: BC_FEATURE fid 36 [9: HELLO_WORLD]
173: BC_PLUS 
174: BC_RASSIGN 
175: BC_HOOK 7
180: BC_NULL

The byte code is not that difficult to understand. Most instructions have an effect on the interpreter stack. Instruction 124 for example pops one value from the stack and saves it to the result of the current feature.

Patterns to call the other side

Every feature call in workbench mode is a possible transition out of or into interpreted code. This transitions are handled by so called patterns (term comes from the idea of calling patterns).

Patterns that go out of interpreted code have the prefix toc (to C code) and the ones that go into interpreted code toi (to interpreted code).

The following snippet shows the toi pattern that could be used for the Fibonacci feature (or any other feature that expects an INTEGER_32 as argument and returns an INTEGER_32):

static EIF_INTEGER_32 toi49 (EIF_REFERENCE Current, EIF_INTEGER_32 arg1) {
	GTCX
	struct item *it;
       	it = iget();                // Push arg1 on the interpreters stack 
	it->type = SK_INT32;           
	it->it_int32 = arg1;          
	it = iget();                // Push Current on the interpreters stack 
	it->type = SK_REF;             
	it->it_ref = Current;
	xinterp(IC);                // Call the interpreter
	it = opop();                // Pop the result from the interpreters stack
	return it->it_int32;        // Return the result in the C way
}

Arguments that were passed to the C function are pushed onto the interpreters stack. After that the interpreter is called and finally the result is popped from the interpreter stack and returned to the caller.

What follows is the pattern that goes out of interpreted code:

static void toc49 (fnptr ptr) {    //The patterns expects a pointer to the C function
	GTCX
	EIF_REFERENCE Current;
	EIF_INTEGER_32 result;
	struct item *it;
	EIF_INTEGER_32 arg1;
	Current = opop()->it_ref;          //The current is popped from the interpreter stack
	arg1 = opop()->it_int32;           //arg1 is popped from the interpreter stack
        //The function is called with the arguments that are popped from the stack.
	result = (FUNCTION_CAST(EIF_INTEGER_32, (EIF_REFERENCE, EIF_INTEGER_32)) ptr)(Current,arg1);
	it = iget();                       //The result of the function is pushed onto the interpreters stack
	it->type = SK_INT32;
	it->it_int32 = result;
}

The arguments for the feature call are popped from the interpreters stack. Then the C function for the feature is called before the result is pushed back onto the interpreters stack.

This two patterns need to be available not for every feature in the system but for every signature type. They are generated into the file epattern.c.

Locating the right pattern

In workbench mode every piece of translated code is referenced by its real_body_id. In the runtime the array mpatidtab serves as a function between real_body_id and pattern_id. There is a second array called pattern that makes the mapping between pattern_id and the two function pointers to the toi and toc patterns.

So the right pattern toi could be resolved like:

pattern [mpatidtab [real_body_id]].toi