ACE to ECF: The Transition Explained

Latest revision as of 13:22, 4 June 2012

This page addresses the rationale behind EiffelStudio's new ECF configuration format, and clarifies which needs were covered by the new system. ECF stands for Eiffel Configuration File.

Why the change?

As part of the EiffelStudio 5.7 project plan (made in 2005), we finally decided to improve the way one builds projects using EiffelStudio. Most of the ideas introduced in 5.7 originated from discussions we had over the past six years with both employees of Eiffel Software and users of EiffelStudio. We had wanted to make this change for some time, but lack of time and various constraints prevented us from doing it until 5.7.

Here are the major points raised by the above discussions:

  • The Project Settings dialog of EiffelStudio 5.6 (and earlier) was not complete, that is to say not all the Lace constructs could be reached from the UI.
  • The Project Settings dialog had a lot of bugs, such as making a mess of the format of the Ace file. For example, it moved comments to the wrong place in the Ace file.
  • Lack of clear documentation on the most complex aspects of the Lace specification: mostly class renaming, but also recent additions to Lace for .NET projects.
  • It was hard to have one Ace file for building portable systems. Usually one needed to have at least four Ace files: one for Windows, one for .NET (Windows), one for Mac OS X and one for UNIX. With the possible addition of Mono on UNIX and Mac OS X, this number could be brought up to six.
  • When you had C code depending on the Eiffel Software runtime, again you needed two additional Ace files: one for workbench mode and one for finalized mode.
  • When you had a library that could be compiled in mono- or multithreaded mode, again you would need two Ace files: one for monothreaded and the other one for multithreaded.
  • One could not have both debug and release builds of the same project in the same Ace file; again two Ace files were required.
  • When you had multiple projects sharing the same clusters, defaults, options, etc., an Ace file was needed for each project. The commonalities had to be copied manually to each Ace file, making it difficult to ensure that every Ace file was consistent and up to date.
  • The class renaming mechanism was not completely understood and was actually difficult to use without asking Eiffel Software how it worked.
  • The addition of .NET components increased the class name clashing dramatically, requiring a robust class name clash resolution mechanism.
  • Recursive clusters were transformed into actual clusters, sometimes making it hard to apply some of the Lace construct specifications involving clusters, since cluster names had to be guessed by the end-user.
  • Override clusters were simply a hack over normal clusters, preventing users from seeing which classes they were actually overriding.
  • Support for ECMA type mapping was needed, to make INTEGER mean INTEGER_32, NATURAL mean NATURAL_32, etc.
  • Although Lace was a public format (specified in ETL2), Eiffel Software's implementation was quite different and there was no library available to parse it.
  • It was not easy to move a library's location since the use of relative paths was not working properly. (EiffelStudio used as reference directory the directory where the EIFGEN was located, not the one where the Ace was located.)

What's new?

One configuration for all

As pointed out above, one of the major problems with Lace was the multiplication of Ace files for various platforms/modes/builds of compilation. To address this issue, almost every element of a configuration file can be conditioned. That is, you can say that an external declaration is only valid for Windows, and another one for UNIX. When the compiler sees the two external declarations, it only chooses the one matching the current platform. If you need to perform a cross-platform compilation, you can manually set the required platform, and the compiler will choose the selected platform rather than the current platform.

The built-in conditions are:

  • Platforms: Windows, UNIX, vxWorks, Macintosh
  • .NET: True, False
  • .NET version: a range can be specified
  • Build: Workbench, Finalize
  • Runtime: Static, Dynamic
  • Thread: Multithreaded, Monothreaded
  • Compiler version: a range can be specified

In addition, you can define your own variables and use them for writing custom conditions.
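
The selection rule described above can be sketched as follows. This is a toy model under stated assumptions, not the compiler's logic: the `Element` class, the condition keys (`platform`, `build`) and the custom variable `debug_io` are hypothetical names chosen for the example, and only equality conditions are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A configuration element with optional conditions (hypothetical model)."""
    name: str
    conditions: dict = field(default_factory=dict)  # e.g. {"platform": "windows"}


def is_enabled(element, state):
    """An element applies when every condition it states matches the build state."""
    return all(state.get(key) == wanted for key, wanted in element.conditions.items())


def select(elements, state):
    """Keep only the elements whose conditions hold for this compilation."""
    return [e.name for e in elements if is_enabled(e, state)]


externals = [
    Element("win_socket.lib", {"platform": "windows"}),
    Element("libsocket.a", {"platform": "unix"}),
    Element("trace.o", {"build": "workbench", "debug_io": "on"}),  # custom variable
    Element("common.h"),  # no condition: always applies
]

# Compiling on Windows in finalized mode.
windows_state = {"platform": "windows", "build": "finalize"}
# Cross-platform compilation: the platform is set manually instead of detected.
unix_state = {"platform": "unix", "build": "workbench", "debug_io": "on"}

print(select(externals, windows_state))
print(select(externals, unix_state))
```

A single list of elements serves both compilations; what used to require one Ace file per platform or mode becomes a different build state applied to the same configuration.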

Library approach

One new element of ECF is the ability to use another ECF file that we call a library. With Ace files, this was achieved by copying and pasting from a master Ace file. Needless to say, this made the work of library authors very difficult since they were limited in their refactoring to keeping classes where they were; otherwise, every user of the library would have to change his Ace files using it. The copy and paste operation was needed not only for clusters, but also for C compilation options which often changed depending on the platform.

For the end user, a library is a black box which exposes a set of classes defined in the library. There is no need to know how the library is implemented, nor is there a need to know, from the programming point of view, which other libraries it may reference internally. The one exception is from a management point of view: you need to know the dependencies, since the library may not compile on your system if you do not have the referenced libraries. A good analogy for a library dependency graph is to see it as a .NET Assembly, except that this is presented in a source component rather than in a binary component.

Class name clashing resolution

We are not going to explain how the Lace solution worked since very few people understood it. The solution adopted by the new configuration mechanism is based on information hiding, renaming and the new library abstraction.

Information Hiding

As mentioned in the library approach section, a library is a black box that only exposes classes written in the library; that is to say, none of the classes from the other libraries used by this library are externally visible. This is a very efficient way to get rid of roughly 90% of common cases of name clashes. For example, let's consider the following three libraries:

General library layout.png

Because library c only directly depends on library a, it can only access classes from library a (i.e. only A in this example). The class from library b is not exposed to c. This is how we can solve potential class name clashes coming from library b.

If library c wants to use classes from library b, it needs to explicitly reference library b:

General library layout 2.png

And in this case, library c has access to both A and B.
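
The visibility rule described above can be sketched as a small toy model. This is only an illustration of the rule, not the compiler's implementation: the `Library` class and its method names are hypothetical, and library c is given no classes of its own so that the result matches the two diagrams.

```python
class Library:
    """Toy model of a library: its own classes plus the libraries it references."""

    def __init__(self, name, classes, uses=()):
        self.name = name
        self.classes = set(classes)
        self.uses = list(uses)

    def exposed(self):
        """A library exposes only the classes written in it, never those it uses."""
        return set(self.classes)

    def visible(self):
        """Classes usable inside this library: its own and those of direct references."""
        names = set(self.classes)
        for lib in self.uses:
            names |= lib.exposed()
        return names


b = Library("b", ["B"])
a = Library("a", ["A"], uses=[b])

# First diagram: c references only a, so B stays hidden from c.
c = Library("c", [], uses=[a])
print(sorted(c.visible()))

# Second diagram: c references b explicitly, so both A and B are visible.
c_with_b = Library("c", [], uses=[a, b])
print(sorted(c_with_b.visible()))
```

Because `visible` only looks at direct references, a class named B defined somewhere else could be used by c in the first layout without clashing with the one in b.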

Renaming

Now we need to tackle the ten percent remaining. For that purpose we use the same mechanism used in Eiffel for solving feature name clashes with multiple inheritance, that is to say, class name renaming. Let's take for example the following diagram representing three libraries:

Class name clash in libraries.png

Both libraries a and b define a class A. Library c, which uses both a and b, needs a clear way to distinguish A from a and A from b. We do this by renaming A from b into A_FROM_B. Now when you encounter A in classes of c it means the version from a, and when you encounter A_FROM_B it means the version from b. In other words, the classes available to c are:

  • A
  • A_FROM_B

The good news is that the renaming has a local scope, meaning it is only valid in the context of library c. Another library used by c that may use either a or b will not be affected by the renaming, making this solution highly scalable for large systems composed of many libraries.
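
The local scope of renaming can be illustrated with a minimal sketch. The function `visible_classes` and its data layout are hypothetical and only model the idea: each library that references others builds its own name table, and a renaming is recorded on the reference, not on the referenced library.

```python
def visible_classes(references):
    """Build the name table of one library.

    references: list of (library_name, exposed_class_names, renaming) tuples,
    where renaming maps an original class name to its local name.
    Returns {local_name: (origin_library, original_name)}.
    """
    table = {}
    for lib_name, exposed, renaming in references:
        for original in exposed:
            local = renaming.get(original, original)
            if local in table:
                raise ValueError(f"class name clash on {local}")
            table[local] = (lib_name, original)
    return table


# Without renaming, using both a and b from c is a clash.
try:
    visible_classes([("a", ["A"], {}), ("b", ["A"], {})])
except ValueError as error:
    print(error)

# Library c renames A from b; the renaming lives in c's table only.
c_view = visible_classes([("a", ["A"], {}), ("b", ["A"], {"A": "A_FROM_B"})])
print(c_view)

# Another library d that uses only b still sees the class under its original name.
d_view = visible_classes([("b", ["A"], {})])
print(d_view)
```

Since every library owns its table, adding a renaming in c never changes what d sees, which is the property that lets the approach scale to many libraries.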

Why the syntax change?

Because we wanted to make most modifications through the UI or through the configuration library, we could have used an object binary format; but we rejected this approach because it would have made configuration versioning in the CMS tool completely useless since one would not have been able to visualize the differences.

So we were left with:

  1. Modifying Lace
  2. Adopting a new syntax
  3. Using the quite common XML syntax

In the end, we chose XML for the following reasons:

  • We wanted new Eiffel users to easily create their own configurations. XML is very good at that since XML-aware text editors are able to provide code completion when we provide a schema, which we do. With Lace not being self-describing, it is harder for someone completely new to Eiffel to get a feel for what possibilities are offered.
  • Anyone can parse XML: no need for a special parser.
  • The node ordering in XML matches nicely with the layout of an Eiffel system, meaning you have libraries within libraries, you have clusters within clusters and you have either classes or clusters.
  • Schema evolution: with an XSL transform, it is easy to automate the conversion between successive revisions of the XML file without building complex tools.
  • There were some semantic changes we added in the new configuration approach that would have required an upgrade from the old Lace format to the new one. So no matter the chosen syntax, a migration was necessary. And because XML offered more than Lace for processing, this was one more reason for choosing XML.
  • XML was already used by other tools addressing some of the shortcomings of Lace.
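
To illustrate the point about parsing and nesting, here is a minimal sketch that reads a configuration-like XML fragment with a standard XML parser (Python's `xml.etree.ElementTree`). The fragment is an assumption made for the example: its element and attribute names are illustrative and are not quoted from the ECF schema, so a real ECF file should be checked against the schema before reusing this.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration fragment; names are illustrative only.
CONFIG = """\
<system name="sample">
  <target name="sample">
    <library name="a" location="a.ecf"/>
    <library name="b" location="b.ecf">
      <renaming old_name="A" new_name="A_FROM_B"/>
    </library>
    <cluster name="src" location="src">
      <cluster name="util" location="util"/>
    </cluster>
  </target>
</system>
"""

root = ET.fromstring(CONFIG)

# Referenced libraries, in document order.
libraries = [lib.get("name") for lib in root.iter("library")]

# Renamings attached to library references.
renamings = {r.get("old_name"): r.get("new_name") for r in root.iter("renaming")}

# Clusters nested inside another cluster mirror the directory layout.
nested = [c.get("name") for c in root.findall("./target/cluster/cluster")]

print(libraries, renamings, nested)
```

No dedicated parser is involved: the tree structure of the document already carries the nesting of clusters and the attachment of a renaming to one library reference.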

Summary

If you wish to remember a few things about ECF, they should be:

  • One configuration for all platforms (.NET, Windows, UNIX, Mac OS X), all modes (multithreaded/monothreaded, ...), and all builds (debug/release, ...) of compilation for your system.
  • Easy-to-create libraries that only list the required dependencies (making the configuration simpler and smaller).
  • Pre- and post-compilation tasks.
  • It uses the only approach guaranteeing a complete solution to class name clashes; all other approaches simply postpone the problem one step further.
  • ECF can be manipulated using the configuration library (see https://svn.eiffel.com/eiffelstudio/trunk/Src/framework/configuration).

Future of ECF

  • Because ECF is a library as well as a file format, and we use the library to process the file format, ECF can easily be changed to use a syntax other than XML. At the time of this writing no other compelling syntax has been found, but we are open to suggestions.
  • We plan to add versioning information to libraries (see Library Dependencies).
  • Automatic discovery of libraries without having to download the missing libraries manually (see Discovery of Libraries).
  • ...