Difference between revisions of "Talk:Syntax checking/Parser"
Latest revision as of 07:19, 22 June 2006
find more than one error
As far as I've seen, the Parser throws an ERROR as soon as it doesn't like something in the source, which aborts parsing. The found error is in the SHARED_ERROR_HANDLER's error_list (I've never seen more than one error in there, why is it a list anyway?). Has anybody found a way to tell the parser to parse the whole file?
maser 14:55, 17 May 2006 (CEST)
Hmmm... I don't know how to get all the errors with one call. But you don't have to parse the whole file at once: you can parse piecewise, and if an error occurs, you continue parsing after the string that caused the error. This is a way to get all errors. It's complicated, but it should work!?
Chrigu 17:19, 17 May 2006 (CEST)
Actually, read http://www.gobosoft.com/eiffel/gobo/geyacc/error.html for more info on how to modify `eiffel.y' to recover from errors and therefore detect more than one error at a time.
--manus 18:10, 17 May 2006 (CEST)
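To make the recovery idea concrete, here is a minimal, language-neutral sketch of "panic mode" recovery: on a syntax error the parser records the error, skips ahead to a synchronizing token and carries on, so one pass yields several errors. It is not geyacc or EIFFEL_PARSER code. The toy grammar (`NAME = NUMBER ;`) and the names `SyntaxProblem` and `parse_all` are assumptions made up for illustration, and the `;` synchronizing token is exactly what Eiffel lacks, so a real eiffel.y would need other recovery points.

```python
from dataclasses import dataclass


@dataclass
class SyntaxProblem:
    """Hypothetical error record: index of the offending token plus a message."""
    position: int
    message: str


def parse_all(tokens):
    """Parse `NAME = NUMBER ;` statements, collecting every error
    instead of stopping at the first one."""
    parsed = []
    errors = []
    i = 0
    while i < len(tokens):
        stmt = tokens[i:i + 4]
        ok = (
            len(stmt) == 4
            and stmt[0].isidentifier()
            and stmt[1] == "="
            and stmt[2].isdigit()
            and stmt[3] == ";"
        )
        if ok:
            parsed.append((stmt[0], int(stmt[2])))
            i += 4
            continue
        errors.append(
            SyntaxProblem(i, f"bad statement starting at token {tokens[i]!r}")
        )
        # Panic-mode recovery: skip to just past the next ';' and resume.
        while i < len(tokens) and tokens[i] != ";":
            i += 1
        i += 1
    return parsed, errors
```

Feeding it `"a = 1 ; b = = ; c 3 ; d = 4 ;".split()` reports both bad statements and still parses the two good ones, which is the behaviour we want from a modified eiffel.y.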
Chrigu: How exactly do you parse piecewise? Though it wouldn't solve the problem because there can be several errors in one piece.
Manu: Thanks for the hint! Looks very promising!
maser 22:55, 17 May 2006 (CEST)
Maser: Couldn't we read several strings of the file and use parse_with_string instead of parse? I'm not sure...
Chrigu 15:52, 18 May 2006 (CEST)
Chrigu: A normal EIFFEL_PARSER accepts just classes, but you can use set_expression_parser (or some other `Parser type setting' feature) to create a parser that parses Eiffel expressions (I knew I had seen something like that before, but couldn't find it yesterday). But changing the eiffel.y file seems like a better solution. Maybe we should use expression and/or feature parsers while checking if a previously found error has been corrected by the user (I assume parsing just one expression is pretty fast).
I've looked through the documentation of geyacc and I think it shouldn't be too difficult to change eiffel.y to our needs; the file's size will probably be the biggest problem.
maser 16:31, 18 May 2006 (CEST)
Everyone, I've sent Martin something that I worked on a couple of months ago. It's by no means finished or polished. It got put on the back burner because I need to work on other parts of EiffelEnvision. I also do not think it is the most recent version, but I will need to check at home for the latest bits.
Basically I sent him what I started implementing for a recoverable parser in the new EiffelEnvision editor. It supported recovering from errors (more complex cases cannot be supported with ease because it's hard to find a recoverable token: Eiffel does not have mandatory end-of-statement tokens like C/C++/C#), precise error/warning reporting (a real message stating exactly what is wrong) and absolute error positions (if you notice, EiffelStudio does not always give the exact error location).
One thing that I started doing, which you guys are going to need to address too, is error/warning spans. It would be nice if the error reporting indicated the start and end coordinates (x1, y1 - x2, y2) of the error, without draining performance.
The parser is common to EiffelStudio/Compiler, EiffelEnvision and a number of internal tools, which is something you need to be aware of. That means do not add any references to anything related to the compiler or EiffelStudio. The parser itself is a stand-alone cluster.
Creating a recoverable parser is hard and placement of those error tokens is a black art (that's what O'Reilly says and I agree)! Good luck :)
--Paulb 17:59, 18 May 2006 (CEST)
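As an illustration of the error/warning spans mentioned above (start and end coordinates x1, y1 - x2, y2), here is a small sketch of what such a record could look like and how absolute character offsets could be turned into line/column pairs. The names `ErrorSpan`, `line_col` and `span_from_offsets` are hypothetical and not part of the parser cluster; lines and columns are assumed to be 1-based and the end coordinate inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorSpan:
    """Hypothetical span record: 1-based, end coordinate inclusive."""
    start_line: int
    start_col: int
    end_line: int
    end_col: int

    def __post_init__(self):
        if (self.end_line, self.end_col) < (self.start_line, self.start_col):
            raise ValueError("span ends before it starts")

    def contains(self, line, col):
        """True if (line, col) lies inside the span."""
        start = (self.start_line, self.start_col)
        end = (self.end_line, self.end_col)
        return start <= (line, col) <= end


def line_col(text, offset):
    """Convert a 0-based character offset into a 1-based (line, column)."""
    line = text.count("\n", 0, offset) + 1
    last_newline = text.rfind("\n", 0, offset)  # -1 on the first line
    return line, offset - last_newline


def span_from_offsets(text, first, last):
    """Build a span from the offsets of the first and last offending characters."""
    return ErrorSpan(*line_col(text, first), *line_col(text, last))
```

Computing the line/column pair only when an error is actually reported, rather than tracking it for every token, is one way to keep the cost low; whether that is cheap enough for the editor would have to be measured.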
Thanks a lot for the code, Paul!
I quickly looked through it and I'll summarize how (I think) it works: eiffel.y is modified as described on the page Manu mentioned (according to Paul it contains bugs because he didn't finish it, and it isn't up to date because eiffel.y was extended to support new language features), so that it reports multiple errors (and stores them in SHARED_ERROR_HANDLER.error_list). It's about a thousand lines longer than the version in the repository. Additionally, there are new ERROR classes that describe the found syntax errors and some helper classes. Paul also mentioned that he made changes to EIFFEL_PARSER_SKELETON, but he couldn't find them.
maser 20:15, 18 May 2006 (CEST)
The ERROR classes in Paul's code aren't compatible with the ones in the current version (mainly the versions of SYNTAX_ERROR and SYNTAX_MESSAGE are quite different).
maser 00:54, 19 May 2006 (CEST)
That's correct. The existing error classes did not support everything they needed to support, so I rewrote them. If you actually look at the parser error/warning classes in the parser cluster (not EiffelStudio's error classes) you'll see that the warning class is not even implemented!
As I said, the parser is a stand-alone library. The parser cluster was only moved under Src/Eiffel to make it easier for users to check out and compile Eiffel. In that respect you should just create an application that uses the parser library, as I did. I then sent a generated error string, from the error classes, to the command-line shell.
Paulb 22:14, 20 May 2006 (CEST)
I'm not sure if I understand what you mean. Basically, I should be able to take just the parser cluster and write a program that parses Eiffel files? But that's not possible, because EIFFEL_PARSER_SKELETON inherits from SHARED_ERROR_HANDLER, which is in the compiler cluster. Also, the error/warning classes in the parser cluster aren't used in ES, only the ones in the compiler cluster. I'd really like to create an application that uses the parser library, but I don't understand how it's going to work without the compiler cluster.
maser 01:38, 21 May 2006 (CEST)
If you take the configuration file parser.ecf in $EIFFEL_SRC/Eiffel/parser then it should include all you need.
--manus 19:30, 21 May 2006 (CEST)
Oh, you're right, everything's there, I just didn't see it. I could create a new project based on that file and it seems like there's really a lot to do in the ERROR classes.
maser 00:36, 22 May 2006 (CEST)
Interface to visualization group
The parser produces ERROR objects, which contain the position and description of each error, and stores them in SHARED_ERROR_HANDLER.error_list. I suggest we use this list as the interface for exchanging data between the parser and visualisation groups.
maser 16:43, 18 May 2006 (CEST)
I didn't look into it too much, but the ERROR classes can do some stuff with TEXT_FORMATTERs (several elements of the GUI inherit from TEXT_FORMATTER) which could be useful to display information about the ERROR.
maser 02:22, 21 May 2006 (CEST)
suggestion for implementation
I've added a suggestion on how to implement our changes. How bad is it? Comments? Improvements?
maser 16:35, 19 May 2006 (CEST)
It's not bad, it sounds good ;)
Chrigu 12:39, 21 May 2006 (CEST)
acex for parser application
I've uploaded a parser.acex (Edit: removed, there's information on this topic in Syntax checking/Parser). You can also use the parser.ecf file in Eiffel/parser/ in the current trunk, as Manu pointed out.
maser 17:01, 23 May 2006 (CEST)
error classes
I've studied Paul's error classes. They all inherit from syntax_error, and they can be called by the eiffel_parser_error_reporter. But where is it decided which class should represent a given error?
Chrigu 13:18, 5 June 2006 (CEST)
It's decided in eiffel.y, as far as I know. There's a 'report_*' feature for each error class, and in eiffel.y the appropriate one is called with the right string from the EIFFEL_PARSER_ERRORS class.
maser 13:54, 5 June 2006 (CEST)
I have some problems: in EIFFEL_PARSER (generated by eiffel.y) there are a lot of identifiers that are not defined (leaf_list_as: LEAF_AS_LIST; recoverable_parser, single_parser_type, successful: BOOLEAN; end_recover is do end; if_part_tuple: ?????). Where are they from? Should I inherit from a class to solve that problem?
Chrigu 22:46, 20 June 2006 (CEST)
A lot of those features are very basic. I just sent Martin a brief description of `end_recover'. It simply was used to recover from an error and continue parsing. It called `clear_token' and would reset a state flag, used in error reporting, to allow errors to be generated and added to ERROR_HANDLER. In the case where `max_errors' was set, if the number of parse errors generated was greater than `max_errors', the state flag would not be reset.
`if_part_tuple' - I have no idea.
`successful' - Indicates if parsing was successful.
`recoverable_parser' - Indicates if the parser should recover from errors (for instance, syntax formatting tools should have this set to False).
`leaf_list_as' - Was just needed for extracting location information from generated AS nodes. It was simply an empty initialized LEAF_AS_NODE. It's required for accessing AST_EIFFEL.complete_start_location/complete_end_location, as AST_EIFFEL.start_location/end_location will ignore keywords. The full location information is required to create correct location information for errors.
As a note: the parser I gave you is only a reference. Do not use it as the base for your implementation.
Paulb 17:19, 22 June 2006 (CEST)
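For anyone reading along, the `end_recover' / `max_errors' behaviour described above could be sketched roughly as follows. This is a Python stand-in, not the Eiffel implementation: the class name `RecoveryState`, the `report` method, the `lookahead` attribute (standing in for what `clear_token' resets) and the assumption that reporting an error switches the flag off until recovery finishes are all guesses made for illustration.

```python
class RecoveryState:
    """Hypothetical model of the state flag reset by `end_recover'."""

    def __init__(self, max_errors=None):
        self.max_errors = max_errors    # None means no limit was set
        self.errors = []                # stand-in for ERROR_HANDLER's list
        self.reporting_enabled = True   # the state flag used in error reporting
        self.lookahead = None           # stand-in for the parser's current token

    def report(self, message):
        """Record an error unless reporting is currently suppressed."""
        if not self.reporting_enabled:
            return
        self.errors.append(message)
        # Assumption: suppress follow-on errors until recovery has finished.
        self.reporting_enabled = False

    def end_recover(self):
        """Finish recovery: discard the lookahead token and re-enable reporting,
        unless more than `max_errors` errors have already been generated."""
        self.lookahead = None  # plays the role of `clear_token'
        if self.max_errors is None or len(self.errors) <= self.max_errors:
            self.reporting_enabled = True
```

With `max_errors=2`, the third error is still recorded, but after it the flag stays off, so later errors are dropped.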