ACE to ECF: The Transition Explained
Latest revision as of 12:22, 4 June 2012
This page addresses the rationale behind EiffelStudio's new ECF configuration format, and clarifies which needs were covered by the new system. ECF stands for Eiffel Configuration File.
Why the change?
As part of the EiffelStudio 5.7 project plan (made in 2005), we decided to improve the way one builds projects using EiffelStudio. Most of the ideas introduced in 5.7 originated from discussions we had over the previous six years with both employees of Eiffel Software and users of EiffelStudio. We had wanted to make this change for a while, but lack of time and various constraints prevented us from doing it until 5.7.
Here are the major points raised by the above discussions:
- The Project Settings dialog of EiffelStudio 5.6 (and earlier) was not complete, that is to say not all the Lace constructs could be reached from the UI.
- The Project Settings dialog had a lot of bugs, such as making a mess of the format of the Ace file. For example, it moved comments to the wrong place in the Ace file.
- Lack of clear documentation on the most complex aspects of the Lace specification: mostly class renaming, but also recent additions to Lace for .NET projects.
- It was hard to have one Ace file for building portable systems. Usually one needed at least four Ace files: one for Windows, one for .NET (Windows), one for Mac OS X and one for UNIX. With the possible addition of Mono on UNIX and Mac OS X, this number could be brought up to six.
- When you had C code depending on the Eiffel Software runtime, again you needed two additional Ace files: one for workbench mode and one for finalized mode.
- When you had a library that could be compiled in mono- or multithreaded mode, again you would need two Ace files: one for monothreaded and the other one for multithreaded.
- One could not have both debug and release builds of the same project in the same Ace file; again two Ace files were required.
- When you had multiple projects sharing the same clusters, defaults, options, etc., an Ace file was needed for each project. The commonalities had to be copied manually to each Ace file, making it difficult to ensure that every Ace file was consistent and up to date.
- The class renaming mechanism was not completely understood and was actually difficult to use without asking Eiffel Software how it worked.
- The addition of .NET components dramatically increased the number of class name clashes, requiring a robust class name clash resolution mechanism.
- Recursive clusters were transformed into actual clusters, sometimes making it hard to apply some of the Lace construct specifications involving clusters, since cluster names had to be guessed by the end-user.
- Override clusters were simply a hack over normal clusters, preventing users from seeing which classes they were actually overriding.
- Support for ECMA type mapping was needed, so that INTEGER means INTEGER_32, NATURAL means NATURAL_32, etc.
- Although Lace was a public format (specified in ETL2), Eiffel Software's implementation was quite different and there was no library available to parse it.
- It was not easy to move a library's location since relative paths did not work properly. (EiffelStudio used the directory where the EIFGEN was located as the reference directory, not the one where the Ace file was located.)
What's new?
One configuration for all
As pointed out above, one of the major problems with Lace was the multiplication of Ace files for various platforms/modes/builds of compilation. To address this issue, almost every element of a configuration file can be conditioned. That is, you can say that an external declaration is only valid for Windows, and another one for UNIX. When the compiler sees the two external declarations, it only chooses the one matching the current platform. If you need to perform a cross-platform compilation, you can manually set the required platform, and the compiler will choose the selected platform rather than the current platform.
The built-in conditions are:
- Platforms: Windows, UNIX, vxWorks, Macintosh
- .NET: True, False
- .NET version: a range can be specified
- Build: Workbench, Finalize
- Runtime: Static, Dynamic
- Thread: Multithreaded, Monothreaded
- Compiler version: a range can be specified
In addition, you can define your own variables and use them for writing custom conditions.
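The selection rule described above can be sketched in a few lines of Python. This is only an illustrative model, not EiffelStudio's implementation and not ECF syntax: the `Conditioned` class, the `select` function, the file names and the `tracing` custom variable are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Conditioned:
    """A configuration element and the condition under which it applies."""
    value: str
    condition: dict = field(default_factory=dict)  # empty condition: always applies


def select(elements, settings):
    """Keep the elements whose condition entries all match the given settings."""
    return [
        e.value
        for e in elements
        if all(settings.get(key) == wanted for key, wanted in e.condition.items())
    ]


# Hypothetical external declarations, each guarded by a condition.
externals = [
    Conditioned("win_glue.lib", {"platform": "windows"}),
    Conditioned("libunix_glue.a", {"platform": "unix"}),
    # "tracing" stands in for a user-defined variable used in a custom condition.
    Conditioned("trace.o", {"build": "workbench", "tracing": "on"}),
    Conditioned("common.h"),
]

# Settings of the machine running the compiler.
current = {"platform": "windows", "build": "finalize"}
print(select(externals, current))  # ['win_glue.lib', 'common.h']

# Cross-platform compilation: the platform is set manually and overrides the current one.
target = {**current, "platform": "unix"}
print(select(externals, target))  # ['libunix_glue.a', 'common.h']
```

The point of the sketch is that both platform-specific declarations live in the same configuration, and only the settings passed to the selection step change.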
Library approach
One new element of ECF is the ability to use another ECF file, which we call a library. With Ace files, this was achieved by copying and pasting from a master Ace file. Needless to say, this made the work of library authors very difficult since they were limited in their refactoring to keeping classes where they were; otherwise, every user of the library would have to change the Ace files that used it. The copy and paste operation was needed not only for clusters, but also for C compilation options, which often changed depending on the platform.
For the end user, a library is a black box which exposes a set of classes defined in the library. There is no need to know how the library is implemented, nor is there a need to know from the programming point of view which other libraries it may reference internally (with one exception, from a management point of view, where you need to know dependencies since otherwise the library may not compile on your system if you do not have the referenced libraries). A good analogy for a library dependency graph is to see it as a .NET Assembly, except that this is presented in a source component rather than in a binary component.
Class name clashing resolution
We are not going to explain how the Lace solution worked since very few people understood it. The solution adopted by the new configuration mechanism is based on information hiding, renaming and the new library abstraction.
Information Hiding
As mentioned in the library approach section, a library is a black box that only exposes classes written in the library; that is to say, classes from the other libraries used by this library are not externally visible. This is a very efficient way to get rid of roughly 90% of common cases of name clashes. For example, let's consider the following three libraries:
[Image: General_library_layout.png]
Because library c only directly depends on library a, it can only access classes from library a (i.e. only A in this example). The class from library b is not exposed to c. This is how we can solve potential class name clashes coming from library b.
If library c wants to use classes from library b, it needs to explicitly reference library b:
[Image: General_library_layout_2.png]
And in this case, library c has access to both A and B.
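The visibility rule can be modelled with a short Python sketch. It is a hedged illustration, not the compiler's algorithm: the dictionary layout and the `visible_classes` function are hypothetical, the class C owned by library c is invented for the example, and the sketch assumes that library a is the one referencing library b in the first diagram.

```python
def visible_classes(name, libraries):
    """Classes a library can use: its own, plus those written in the
    libraries it references directly (never those of indirect references)."""
    lib = libraries[name]
    visible = set(lib["classes"])
    for used in lib["uses"]:
        visible |= libraries[used]["classes"]
    return visible


# First layout: c references only a (assumed: a itself references b).
layout_1 = {
    "a": {"classes": {"A"}, "uses": ["b"]},
    "b": {"classes": {"B"}, "uses": []},
    "c": {"classes": {"C"}, "uses": ["a"]},
}

# Second layout: c also references b explicitly.
layout_2 = {**layout_1, "c": {"classes": {"C"}, "uses": ["a", "b"]}}

print(sorted(visible_classes("c", layout_1)))  # ['A', 'C']
print(sorted(visible_classes("c", layout_2)))  # ['A', 'B', 'C']
```

Because the lookup never follows references transitively, a class in b can share a name with a class in c without any clash, as long as c does not reference b directly.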
Renaming
Now we need to tackle the remaining ten percent. For that purpose we use the same mechanism used in Eiffel for solving feature name clashes with multiple inheritance, that is to say, renaming, applied here to class names. Let's take for example the following diagram representing three libraries:
[Image: Class_name_clash_in_libraries.png]
Both libraries a and b define a class A. Library c, which uses both a and b, needs a clear way to distinguish A from a and A from b. We do this by renaming A from b into A_FROM_B. Now when you encounter A in classes of c it means the version from a, and when you encounter A_FROM_B it means the version from b. In other words, the list of classes available to c is:
- A
- A_FROM_B
The good news is that the renaming has a local scope, meaning it is only valid in the context of library c. Another library used by c that may use either a or b will not be affected by the renaming, making this solution highly scalable for large systems composed of many libraries.
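As a rough illustration of local renaming, the Python sketch below resolves the class names seen by one library from its direct references. It is a simplified model under stated assumptions, not how EiffelStudio is implemented: the `resolve` function and the tuple layout `(library name, exported classes, renaming)` are hypothetical.

```python
def resolve(own_classes, references):
    """Map each class name visible to a library to the library providing it.

    references is a list of (library name, exported classes, renaming) tuples;
    renaming maps an exported class name to the name used locally.
    Raises ValueError when two classes end up with the same local name.
    """
    visible = {cls: "<own>" for cls in own_classes}
    for lib_name, exported, renaming in references:
        for cls in exported:
            local = renaming.get(cls, cls)  # the renaming only applies to this reference
            if local in visible:
                raise ValueError(
                    f"class name clash on {local}: "
                    f"defined by {visible[local]} and by {lib_name}"
                )
            visible[local] = lib_name
    return visible


a = ("a", {"A"}, {})
b_plain = ("b", {"A"}, {})
b_renamed = ("b", {"A"}, {"A": "A_FROM_B"})

# Without renaming, c cannot reference both a and b.
try:
    resolve(set(), [a, b_plain])
except ValueError as err:
    print(err)  # class name clash on A: defined by a and by b

# With the renaming declared on c's reference to b, both classes are usable.
print(resolve(set(), [a, b_renamed]))  # {'A': 'a', 'A_FROM_B': 'b'}
```

The renaming is stored on the reference from c to b rather than on b itself, which is what keeps its scope local: any other library resolving its own references to a or b starts from the original names.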
Why the syntax change?
Because we wanted to make most modifications through the UI or through the configuration library, we could have used an object binary format; but we rejected this approach because it would have made configuration versioning in the CMS tool completely useless since one would not have been able to visualize the differences.
So we were left with:
- Modifying Lace
- Adopting a new syntax
- Using the quite common XML syntax
In the end, we chose XML for the following reasons:
- We wanted new Eiffel users to easily create their own configurations. XML is very good at that since XML-aware text editors are able to provide code completion when we provide a schema, which we do. With Lace not being self-describing, it is harder for someone completely new to Eiffel to get a feel for what possibilities are offered.
- Anyone can parse XML: no need for a special parser.
- The node ordering in XML matches nicely with the layout of an Eiffel system, meaning you have libraries within libraries, you have clusters within clusters and you have either classes or clusters.
- Schema evolution: with an XSL transform, it is easy to automate the conversion between successive revisions of the XML file without building complex tools.
- There were some semantic changes in the new configuration approach that would have required an upgrade from the old Lace format to the new one. So no matter the chosen syntax, a migration was necessary. And because XML offered more than Lace for processing, this was one more reason for choosing XML.
- XML was already used by other tools addressing some of the shortcomings of Lace.
Summary
If you wish to remember a few things about ECF, they should be:
- One configuration for all platforms (.NET, Windows, UNIX, Mac OS X), all modes (multithreaded/monothreaded, ...), and all builds (debug/release, ...) of compilation for your system.
- Easy-to-create libraries that only list the required dependencies (making the configuration simpler and smaller).
- Pre- and post-compilation tasks.
- It uses the only approach guaranteeing a complete solution to class name clashes; all other approaches simply postpone the problem one step further.
- ECF can be manipulated using the configuration library (see https://svn.eiffel.com/eiffelstudio/trunk/Src/framework/configuration).
Future of ECF
- Because ECF is a library as well as a file format, and we use the library to process the file format, ECF can easily be changed to use a syntax other than XML. At the time of this writing no other compelling syntax has been found, but we are open to suggestions.
- We plan to add versioning information to libraries (see Library Dependencies).
- Automatic discovery of libraries without having to download the missing libraries manually (see Discovery of Libraries).
- ...