Difference between revisions of "Syntax checking/Parser"


Latest revision as of 00:49, 11 July 2006



TODO

Refactoring

  • EIFFEL_PARSER currently inherits from EIFFEL_PARSER_SKELETON only indirectly, via EIFFEL_PARSER_ERROR_REPORTER; change this inheritance structure.

report a lot more errors

classes with errors to check if parser works

  • See this post (http://origo.ethz.ch/pipermail/es-devel/2006-June/000372.html) on the es-devel mailing list. It's probably better to wait, as we don't have much time left.

Important Classes/Files

eiffel.y / eiffel.l

  • Eiffel grammar description.
  • create EIFFEL_SCANNER, EIFFEL_PARSER and EIFFEL_TOKENS from these files using the makefile in $EIFFEL_SRC/Eiffel/parser/parser/

EIFFEL_TOKENS

  • defines the tokens

EIFFEL_SCANNER

  • reads list of tokens from file/string

EIFFEL_PARSER

  • inherits from EIFFEL_PARSER_SKELETON (where the features parse, parse_from_string and make_with_factory are implemented)
  • make_with_factory (a_factory: AST_FACTORY): pass an AST_FACTORY to build an AST, or an AST_NULL_FACTORY (which inherits from AST_FACTORY) to only check the syntax
    • AST_NULL_FACTORY doesn't build an AST; AST_FACTORY does, and the AST is in EIFFEL_PARSER.root_node after parsing
  • parse (a_file: KL_BINARY_INPUT_FILE) and parse_from_string (a_string: STRING) run the parser on a file or on a string.

EIFFEL_PARSER_ERROR_REPORTER

  • provides error reporting features (report_*)

EIFFEL_AST

  • base class of AST nodes

ERROR

  • deferred; superclass of all error types like EIFFEL_ERROR or SYNTAX_ERROR
  • features line, column: INTEGER give location of error

ERROR_HANDLER

  • feature error_list holds the ERROR objects found by the parser

SHARED_ERROR_HANDLER

  • singleton used by all relevant classes

EIFFEL_CLASS_C

  • features build_ast and parse_ast show how the parser can be used.
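
The classes above can be combined roughly as follows. This is only a minimal sketch, not code taken from EiffelStudio: the class name PARSER_DEMO is made up, and it assumes that AST_FACTORY can be created with its default creation procedure, that SHARED_ERROR_HANDLER exposes the handler as Error_handler (as used in the grammar actions further down), and that error_list offers is_empty. It uses the routine syntax of the time (with `is'). How errors propagate in the real parser (for example, whether an exception is raised) is not covered by this page.

```eiffel
class
	PARSER_DEMO
		-- Hypothetical root class; not part of EiffelStudio.

inherit
	SHARED_ERROR_HANDLER

create
	make

feature -- Initialization

	make is
			-- Parse a small class text and say whether errors were recorded.
		local
			l_parser: EIFFEL_PARSER
		do
				-- AST_FACTORY builds an AST; pass an AST_NULL_FACTORY
				-- instead if only the syntax check is of interest.
				-- Assumption: AST_FACTORY supports default creation.
			create l_parser.make_with_factory (create {AST_FACTORY})
			l_parser.parse_from_string ("class A end")
				-- Assumption: `error_list' is a list offering `is_empty'.
			if Error_handler.error_list.is_empty then
				io.put_string ("No syntax errors; the AST is in l_parser.root_node%N")
			else
				io.put_string ("Syntax errors were reported%N")
			end
		end

end
```

EIFFEL_CLASS_C.build_ast and parse_ast remain the authoritative examples of how the compiler itself drives the parser.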


creating a project that uses the parser

The new build (June 11 2006) finally has the new system configuration GUI. It's quite easy to create a project that uses the parser:

  1. set ISE_LIBRARY to your checkout (i.e. the directory that contains the 'library' subdirectory)
  2. start EiffelStudio and create a new project
  3. open Project > Project settings...
  4. go to Target > Group (on the left)
  5. click on the library icon (the one with books on it) and add the gobo library
  6. click on the icon again, type 'parser' in the Name field and set the location to $ISE_LIBRARY/Eiffel/parser/parser.ecf

Now you can use the parser.

  • example of an application that uses the parser: http://n.ethz.ch/student/luderm/es/root_class.e


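Tools

  • meld (http://meld.sourceforge.net/)
  • xxdiff (http://furius.ca/xxdiff/)
  • geyacc (http://www.gobosoft.com/eiffel/gobo/geyacc/)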

Grammar definition file (eiffel.y)

Some general information:

  • Names written entirely in capitals, like TE_STR_LT, are tokens. Search for them in eiffel.l (lower-case L):
 \""<"\"		{				
 				ast_factory.set_buffer (token_buffer2, Current)
 				last_token := TE_STR_LT
 			}

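So TE_STR_LT corresponds to the string "<", quotes included (\""<"\" is a regular expression that matches a double quote, the character <, and another double quote).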

  • All other names are non-terminals, so you can find them in eiffel.y. If you don't know what a non-terminal stands for, look up its rule in eiffel.y or ask the person responsible for that part of the file.

changes

There are several types of changes we can make to eiffel.y while merging Paul's eiffel.y with the current eiffel.y:

renaming

Renaming doesn't change the functionality and shouldn't be a problem.

Example: Paul renamed infix_operator to infix_string (for whatever reason).

new non-terminals

New non-terminals are introduced where they simplify an existing rule or simplify error handling.

Example:

current version:

 Default_manifest_string: 
 		Non_empty_string
 			{ $$ := $1 }
 	|	TE_EMPTY_STRING
 			{
 				$$ := ast_factory.new_string_as ("", line, column, string_position, position + text_count - string_position, token_buffer2)
 			}
 	|	TE_EMPTY_VERBATIM_STRING
 			{
 				$$ := ast_factory.new_verbatim_string_as ("", verbatim_marker.substring (2, verbatim_marker.count), not has_old_verbatim_strings and then verbatim_marker.item (1) = ']', line, column, string_position, position + text_count - string_position, token_buffer2)
 			}
 	;

changed to:

 Default_manifest_string: 
 		Non_empty_string
 			{ $$ := $1 }
 	|	Empty_string
 			{ $$ := $1 }
 	;
 
 Empty_string: 
 		TE_EMPTY_STRING
 			{ $$ := ast_factory.new_string_as ("", line, column, string_position, position + text_count - string_position, token_buffer2) }
 	|	TE_EMPTY_VERBATIM_STRING
 			{ $$ := ast_factory.new_verbatim_string_as ("", verbatim_marker.substring (2, verbatim_marker.count), not has_old_verbatim_strings and then verbatim_marker.item (1) = ']', line, column, string_position, position + text_count - string_position, token_buffer2) }
 	;

new rules for error handling

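Rules are added to non-terminals to handle errors: an extra alternative containing the special error token matches malformed input at that point and reports it.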

Example:

current Obsolete non-terminal:

 Obsolete: -- Empty
 			-- { $$ := Void }
 	|	TE_OBSOLETE Manifest_string
 			{
 				$$ := ast_factory.new_keyword_string_pair ($1, $2)
 			}
 	;

Paul's Obsolete non-terminal:

 Obsolete: -- Empty
 			-- { $$ := Void }
 	|	TE_OBSOLETE Manifest_string
 			{
 				$$ := ast_factory.new_keyword_string_pair ($1, $2)
 			}
 	|	TE_OBSOLETE error { report_expected_after_error (parser_errors.obsolete_keyword, $1, parser_errors.obsolete_string, False) }
 	;

changed error handling

The error handling that already exists in eiffel.y is usually longer than Paul's, mainly because he moved that code out of the grammar actions into features.

Example:

current version:

 Inheritance: -- Empty
 			-- { $$ := Void }
 	|	TE_INHERIT ASemi
 			{
 				if has_syntax_warning then
 					Error_handler.insert_warning (
 						create {SYNTAX_WARNING}.make (line, column, filename,
 						"Use `inherit ANY' or do not specify an empty inherit clause"))
 				end
 				--- $$ := Void
 				$$ := ast_factory.new_eiffel_list_parent_as (0)
 				if $$ /= Void then
 					$$.set_inherit_keyword ($1)
 				end
 			}
 	[...]
 	;

Paul's version:

 Inheritance: -- Empty
 			-- { $$ := Void }
 	|	TE_INHERIT ASemi
 			{
 				report_warning (parser_errors.empty_inherit_clause_warning, Void)
 				$$ := ast_factory.new_eiffel_list_parent_as (0)
 				if $$ /= Void then
 					$$.set_inherit_keyword ($1)
 				end
 			}
 	[...]
 	;
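
To illustrate what "putting that code into features" could look like, here is a hypothetical sketch of a report_warning feature. It is not Paul's actual implementation: the argument types are guesses (the first is assumed to be a STRING message, the second is left as ANY because its role is not described here), and the body simply reuses the code from the current grammar action. The fragment is meant to live in a class where has_syntax_warning, line, column, filename and Error_handler are available, such as the parser skeleton.

```eiffel
feature -- Error reporting (hypothetical sketch)

	report_warning (a_message: STRING; a_context: ANY) is
			-- Insert a syntax warning with `a_message' at the current
			-- scanner position, if syntax warnings are enabled.
			-- `a_context' is a placeholder for the second argument seen
			-- in the grammar; its real type and use are assumptions.
		do
			if has_syntax_warning then
				Error_handler.insert_warning (
					create {SYNTAX_WARNING}.make (line, column, filename, a_message))
			end
		end
```

With a feature like this, each grammar action shrinks to a single call, and the message texts can be collected in one place (parser_errors in Paul's version).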


BON diagrams

Parser

SynChé BON PARSER.png


SYNTAX_MESSAGE descendants

SynChé BON SYNTAX MESSAGE.png