Difference between revisions of "Syntax level"


Revision as of 05:20, 13 February 2008

The ECMA standard makes several modifications to the Eiffel syntax. Compilers generally respond by extending the syntax rules so that both the old and the new styles are supported, as happened with new constructs such as alias and assign. However, this approach cannot always be followed.

Mixed grammar

The old syntax can be updated to accept the new one by extending the old grammar with the new constructs and falling back to the old rules where the new ones cannot be used. For example, the ECMA grammar allows the following code:

item: INTEGER assign put
put (value: INTEGER)
   do
      ...
   end

Here the keyword assign declares an assigner command for the query item. But what if we also have a feature declaration like the following?

assign (value: INTEGER)
   do
      ...
   end

The compiler can see that the keyword assign does not fit this context, whereas an identifier with the same name does. It can therefore treat the word as an identifier, because the source code is valid according to the pre-ECMA rules.
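The contextual decision can be illustrated with a toy sketch. It is written in Python rather than Eiffel and does not reflect how any real Eiffel compiler is implemented. The function name classify_assign, the pre-split token lists, and the rule itself (assign is a keyword only when something precedes it and an identifier follows it) are all assumptions made for illustration.

```python
def classify_assign(tokens):
    """Return the role of every `assign` token in one feature declaration.

    Hypothetical rule for this sketch only: `assign` is a keyword when it is
    preceded by another token (the query's type) and followed by an
    identifier (the assigner command name). Otherwise it is an identifier.
    """
    roles = []
    for i, tok in enumerate(tokens):
        if tok != "assign":
            continue
        has_prev = i > 0
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if has_prev and nxt.isidentifier():
            roles.append("keyword")
        else:
            roles.append("identifier")
    return roles


# item: INTEGER assign put
query = ["item", ":", "INTEGER", "assign", "put"]
# assign (value: INTEGER)
command = ["assign", "(", "value", ":", "INTEGER", ")"]

print(classify_assign(query))
print(classify_assign(command))
```

A real parser would make this decision from grammar rules rather than from a one-token lookahead, but the principle is the same: the position of the word determines its role.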

Ambiguous grammar

Unfortunately, this approach does not work for every new keyword. Consider the following code snippet:

class A
invariant
   a: b
   note
   c: d
end

In pre-ECMA Eiffel, the class invariant consists of three assertion clauses, the second of which (note) has no associated tag. In ECMA Eiffel, note is a keyword, so the class invariant has one clause and is followed by a new Notes construct with one entry. Both readings are legal, so no grammar can use the surrounding context to decide whether note is an identifier or a keyword. A grammar that accepts both is ambiguous.
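The two readings can be made concrete with a small Python sketch. It is only a model of the ambiguity, not a parser: the function read_invariant, its note_is_keyword flag, and the line-per-clause input are assumptions introduced here.

```python
def read_invariant(lines, note_is_keyword):
    """Split the lines after `invariant` into assertion clauses and note entries."""
    clauses, notes = [], []
    in_notes = False
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if note_is_keyword and text == "note":
            in_notes = True          # ECMA reading: a Notes construct starts here
        elif in_notes:
            notes.append(text)       # entry of the Notes construct
        else:
            clauses.append(text)     # assertion clause, tagged or untagged
    return clauses, notes


body = ["a: b", "note", "c: d"]

pre_ecma = read_invariant(body, note_is_keyword=False)
ecma = read_invariant(body, note_is_keyword=True)

print(pre_ecma)  # three clauses, no notes
print(ecma)      # one clause, one note entry
```

The same three lines produce two different, equally valid structures, and nothing in the input tells the reader which one was intended.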

Transitional grammar

How can we get the best of both worlds if no single grammar can cover both the old and the new syntax? The idea is to use a grammar that accepts both constructs with some limitations. The keyword note is more or less a replacement for the keyword indexing, so we allow both to be used as keywords and forbid both as identifiers. This resolves the ambiguity, at the cost that note can no longer be used as an identifier. The table below summarizes the behaviour, using the note/indexing transition as an example.

Syntax compatibility level

Obsolete
   indexing is used as a keyword
   note is used as an identifier with an optional warning that it becomes a keyword in the future

Transitional
   indexing is used as a keyword with an optional warning that it's an obsolete keyword that should be replaced with note
   note is used as a keyword

Standard
   indexing is used as an identifier
   note is used as a keyword
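The table above can also be read as a lookup from syntax level and word to a role. The Python sketch below encodes it that way. The names RULES and classify, the lowercase level names, the warnings flag, and the exact warning texts are assumptions for illustration, not part of any compiler's interface.

```python
# One entry per table cell: level -> word -> (role, optional warning text).
# Level names are lowercase spellings of the column headers (an assumption).
RULES = {
    "obsolete": {
        "indexing": ("keyword", None),
        "note": ("identifier", "note becomes a keyword in the future"),
    },
    "transitional": {
        "indexing": ("keyword", "indexing is obsolete and should be replaced with note"),
        "note": ("keyword", None),
    },
    "standard": {
        "indexing": ("identifier", None),
        "note": ("keyword", None),
    },
}


def classify(word, level, warnings=False):
    """Return (role, warning) for `indexing` or `note` at the given level.

    Warnings are optional in the table, so they are reported only on request.
    """
    role, warning = RULES[level][word]
    return role, (warning if warnings else None)


print(classify("note", "obsolete", warnings=True))
print(classify("indexing", "transitional"))
print(classify("indexing", "standard"))
```

Only the transitional level treats both words as keywords, which is what removes the ambiguity while old and new code coexist.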