Difference between revisions of "CA Library Implementation"
m (Manus moved page User:Stefan/Code Analysis/Library Implementation to CA Library Implementation without leaving a redirect) |
(10 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | + | [[Category:Code Analysis]] | |
− | + | ||
− | + | ||
− | The code for | + | The code for Inspector Eiffel is located at three different places in the EiffelStudio source: |
− | # The | + | # The framework—by far the largest part, with the rule checking, the rules, the control flow graph functionality, and more—is represented as a ''library''; |
− | # The graphical user interface can be found in the ''interface'' cluster of | + | # The graphical user interface can be found in the ''interface'' cluster of EiffelStudio; |
− | # The command-line interface for code analysis is a single class in the ''tty'' cluster of | + | # The command-line interface for code analysis is a single class in the ''tty'' cluster of EiffelStudio. |
− | + | The whole '''Inspector Eiffel framework''' is located in the '''library ''code_analysis'''''. | |
− | + | == Class Relations == | |
− | + | The following diagram shows an overview of the relations between the classes of the code analysis framework. All classes are located in the ''code_analysis'' library except for <e>CLASS_C</e> (EiffelStudio), <e>ROTA_TIMED_TASK_I</e> (''ecosystem'' cluster), <e>EWB_CODE_ANALYSIS</e> (command-line interface), and <e>ES_CODE_ANALYSIS_BENCH_HELPER</e> (GUI). | |
+ | |||
+ | [[File:CA Framework Diagram.png|thumb|center|800px|The most interesting classes of the code analysis framework.]] | ||
+ | |||
+ | == Interface == | ||
In this section it is explained from a client view how to use the code analyzer. The code analyzer is represented by the class <e>CA_CODE_ANALYZER</e>, so a client must have or access an instance of this class. Before the analyzer can be launched all the classes that shall be analyzed must be added using one of the following features. If you use more than one of these commands then the added classes from all commands will be conjoined. | In this section it is explained from a client view how to use the code analyzer. The code analyzer is represented by the class <e>CA_CODE_ANALYZER</e>, so a client must have or access an instance of this class. Before the analyzer can be launched all the classes that shall be analyzed must be added using one of the following features. If you use more than one of these commands then the added classes from all commands will be conjoined. | ||
Line 29: | Line 31: | ||
Then, to start analyzing simply call <e>{CA_CODE_ANALYZER}.analyze</e>. | Then, to start analyzing simply call <e>{CA_CODE_ANALYZER}.analyze</e>. | ||
− | + | == Rule Checking == | |
In the GUI we want to be able to continue to work while the code analyzer is running. Analyzing larger sets of classes (such as whole libraries) can take from several seconds to several minutes. For this reason the code analyzer uses an ''asynchronous task'', <e>{CA_RULE_CHECKING_TASK}</e>. In <e>{CA_CODE_ANALYZER}.analyze</e> this task (<e>l_task</e>) is invoked as follows: | In the GUI we want to be able to continue to work while the code analyzer is running. Analyzing larger sets of classes (such as whole libraries) can take from several seconds to several minutes. For this reason the code analyzer uses an ''asynchronous task'', <e>{CA_RULE_CHECKING_TASK}</e>. In <e>{CA_CODE_ANALYZER}.analyze</e> this task (<e>l_task</e>) is invoked as follows: | ||
− | + | === In <e>{CA_CODE_ANALYZER}.analyze</e>: === | |
<e> | <e> | ||
Line 43: | Line 45: | ||
<e>{CA_RULE_CHECKING_TASK}</e> essentially runs the whole analysis. Like all other conformants to <e>{ROTA_TASK_I}</e> this class executes a series of ''steps'' between which the user interface gets some time to process its events. In <e>{CA_RULE_CHECKING_TASK}</e> each step analyses one class. This means that a class is checked by ''all'' the rules for violations. The following code does that: | <e>{CA_RULE_CHECKING_TASK}</e> essentially runs the whole analysis. Like all other conformants to <e>{ROTA_TASK_I}</e> this class executes a series of ''steps'' between which the user interface gets some time to process its events. In <e>{CA_RULE_CHECKING_TASK}</e> each step analyses one class. This means that a class is checked by ''all'' the rules for violations. The following code does that: | ||
− | + | === From <e>{CA_RULE_CHECKING_TASK}</e>: === | |
<e> | <e> | ||
Line 102: | Line 104: | ||
In the <e>rescue</e> clause all possible exceptions are caught and recorded. In case of such an exception it then proceeds to the next class. | In the <e>rescue</e> clause all possible exceptions are caught and recorded. In case of such an exception it then proceeds to the next class. | ||
− | == | + | === Checking ''Standard'' Rules === |
− | The | + | The relatively large class <e>{CA_ALL_RULES_CHECKER}</e> is responsible for checking ''standard rules''. It does this in a straightforward way. It is a subclass of <e>{AST_ITERATOR}</e>, a realization of a visitor on the AST. |
− | + | Rules can register their actions with <e>{CA_ALL_RULES_CHECKER}</e> by calling a procedure like <e>add_bin_lt_pre_action (a_action: attached PROCEDURE [ANY, TUPLE [BIN_LT_AS]])</e> or <e>add_if_post_action (a_action: attached PROCEDURE [ANY, TUPLE [IF_AS]])</e>. These "pre" and "post" actions exist for many other types of AST nodes as well. All the registered actions are stored in <e>ACTION_SEQUENCE</e> variables: | |
− | + | <e> | |
− | + | if_pre_actions, if_post_actions: ACTION_SEQUENCE [TUPLE [IF_AS]] | |
− | + | ||
− | + | ||
− | + | ||
+ | add_if_post_action (a_action: attached PROCEDURE [ANY, TUPLE [IF_AS]]) | ||
+ | do | ||
+ | if_post_actions.extend (a_action) | ||
+ | end | ||
− | + | -- And similar for all other relevant AST nodes... | |
+ | </e> | ||
− | + | The corresponding visitor procedures are redefined. This is done in the following way: |
+ | <e> | ||
+ | process_if_as (a_if: IF_AS) | ||
+ | do | ||
+ | if_pre_actions.call ([a_if]) | ||
+ | Precursor (a_if) | ||
+ | if_post_actions.call ([a_if]) | ||
+ | end | ||
− | + | -- And similar for all other relevant AST nodes... | |
+ | </e> | ||
+ | Since the actual iteration over the AST is done in the ancestor we need only very little code to analyze a class: | ||
<e> | <e> | ||
− | + | feature {CA_RULE_CHECKING_TASK} -- Execution Commands | |
− | + | ||
− | + | run_on_class (a_class_to_check: CLASS_C) | |
− | + | -- Check all rules that have added their agents. | |
+ | local | ||
+ | l_ast: CLASS_AS | ||
+ | do | ||
+ | last_run_successful := False | ||
+ | l_ast := a_class_to_check.ast | ||
+ | class_pre_actions.call ([l_ast]) | ||
+ | process_class_as (l_ast) | ||
+ | class_post_actions.call ([l_ast]) | ||
+ | last_run_successful := True | ||
+ | end | ||
</e> | </e> | ||
− | + | This code analyzes a class for all active ''standard'' rules. <e>class_pre_actions</e> and <e>class_post_actions</e> are action sequences of the same kind as those for the other AST nodes. <e>process_class_as</e>, which is implemented in <e>{AST_ITERATOR}</e>, will recursively visit all relevant AST nodes and execute their action sequences. |
+ | |||
+ | == Example: Rule # 71: ''Self-comparison'' == | ||
+ | |||
+ | We will go through the implementation of rule # 71 (''Self-comparison'') in detail. | ||
+ | |||
+ | The heart of this implementation lies in the feature <e>analyze_self</e>. It tests whether a binary expression is a self-comparison. <e>is_self</e>, a <e>BOOLEAN</e> attribute, is set to true if and only if the argument is a comparison between two identical variables. | ||
<e> | <e> | ||
− | + | analyze_self (a_bin: attached BINARY_AS) | |
− | + | -- Is `a_bin' a self-comparison? | |
− | + | do | |
− | + | is_self := False | |
− | + | ||
− | + | if | |
− | + | attached {EXPR_CALL_AS} a_bin.left as l_e1 | |
− | + | and then attached {ACCESS_ID_AS} l_e1.call as l_l | |
− | + | and then attached {EXPR_CALL_AS} a_bin.right as l_e2 | |
− | + | and then attached {ACCESS_ID_AS} l_e2.call as l_r | |
− | + | then | |
− | + | is_self := l_l.feature_name.is_equal (l_r.feature_name) | |
− | + | self_name := l_l.access_name_32 | |
− | + | end | |
− | + | end | |
− | + | ||
+ | is_self: BOOLEAN | ||
+ | -- Is `a_bin' from last call to `analyze_self' a self-comparison? | ||
+ | |||
+ | self_name: detachable STRING_32 | ||
+ | -- Name of the self-compared variable. | ||
+ | </e> | ||
+ | |||
+ | Both sides of the comparison, <e>a_bin.left</e> and <e>a_bin.right</e>, are tested to have the types that indicate that they are variable or feature accesses. If the tests succeed then <e>is_self</e> is set according to the equality of the two feature names. Then the name is stored in an internal attribute. | ||
+ | |||
+ | <e>analyze_self</e> is used in <e>process_comparison</e>, which creates a rule violation if a self-comparison was detected. | ||
+ | |||
+ | <e> | ||
+ | process_comparison (a_comparison: BINARY_AS) | ||
+ | -- Checks `a_comparison' for rule violations. | ||
+ | local | ||
+ | l_viol: CA_RULE_VIOLATION | ||
+ | do | ||
+ | if not in_loop then | ||
+ | analyze_self (a_comparison) | ||
+ | if is_self then | ||
+ | create l_viol.make_with_rule (Current) | ||
+ | l_viol.set_location (a_comparison.start_location) | ||
+ | l_viol.long_description_info.extend (self_name) | ||
+ | violations.extend (l_viol) | ||
end | end | ||
− | + | end | |
− | + | end | |
+ | </e> | ||
+ | |||
+ | First we check that we are not dealing with a loop condition. Self-comparisons in loop conditions are more dangerous and need special treatment (see below). For the rule violation, we set the location to the start location of the binary comparison. We add the variable or feature name to the violation. | ||
+ | |||
+ | Different kinds of comparisons also have different types in the AST. That is why in an AST iterator they are processed independently. Thus, we need to add some delegation to each of the actions that are called when processing a comparison. | ||
+ | |||
+ | <e> | ||
+ | process_bin_eq (a_bin_eq: BIN_EQ_AS) | ||
+ | do | ||
+ | process_comparison (a_bin_eq) | ||
+ | end | ||
+ | |||
+ | process_bin_ge (a_bin_ge: BIN_GE_AS) | ||
+ | do | ||
+ | process_comparison (a_bin_ge) | ||
+ | end | ||
+ | |||
+ | process_bin_gt (a_bin_gt: BIN_GT_AS) | ||
+ | do | ||
+ | process_comparison (a_bin_gt) | ||
+ | end | ||
+ | |||
+ | process_bin_le (a_bin_le: BIN_LE_AS) | ||
+ | do | ||
+ | process_comparison (a_bin_le) | ||
+ | end | ||
+ | |||
+ | process_bin_lt (a_bin_lt: BIN_LT_AS) | ||
+ | do | ||
+ | process_comparison (a_bin_lt) | ||
+ | end | ||
+ | </e> | ||
+ | |||
+ | In the case that a loop condition is a self-comparison, the loop is either never entered or it is never exited. The latter case is more severe; the code below treats only an equality comparison as the former case. For this reason we analyze loop conditions separately. If we find such a violation we set <e>in_loop</e> to <e>True</e> so that any further self-comparisons are ignored until we have left the loop. | ||
+ | |||
+ | <e> | ||
+ | pre_process_loop (a_loop: LOOP_AS) | ||
+ | -- Checking a loop `a_loop' for self-comparisons needs more work. If the until expression | ||
+ | -- is a self-comparison that does not compare for equality then the loop will | ||
+ | -- not terminate, which is a more severe consequence compared to other self-comparisons. | ||
+ | local | ||
+ | l_viol: CA_RULE_VIOLATION | ||
+ | do | ||
+ | if attached {BINARY_AS} a_loop.stop as l_bin then | ||
+ | analyze_self (l_bin) | ||
+ | if is_self then | ||
+ | create l_viol.make_with_rule (Current) | ||
+ | l_viol.set_location (a_loop.stop.start_location) | ||
+ | l_viol.long_description_info.extend (self_name) | ||
+ | if not attached {BIN_EQ_AS} l_bin then | ||
+ | -- It is only a dangerous loop stop condition if we do not have | ||
+ | -- an equality comparison. | ||
+ | l_viol.long_description_info.extend ("loop_stop") | ||
+ | end | ||
+ | violations.extend (l_viol) | ||
+ | in_loop := True | ||
end | end | ||
end | end | ||
end | end | ||
+ | </e> | ||
− | if | + | <e>format_violation_description</e>, which is declared in <e>CA_RULE</e> as <e>deferred</e>, must be implemented. Here, together with a predefined localized text, we mention the name of the self-compared variable. If the self-comparison is located in a loop stop condition we add an additional warning text. |
+ | |||
+ | <e> | ||
+ | format_violation_description (a_violation: attached CA_RULE_VIOLATION; a_formatter: attached TEXT_FORMATTER) | ||
+ | local | ||
+ | l_info: LINKED_LIST [ANY] | ||
+ | do | ||
+ | l_info := a_violation.long_description_info | ||
+ | a_formatter.add ("'") | ||
+ | if l_info.count >= 1 and then attached {STRING_32} l_info.first as l_name then | ||
+ | a_formatter.add_local (l_name) | ||
+ | end | ||
+ | a_formatter.add (ca_messages.self_comparison_violation_1) | ||
+ | |||
+ | l_info.compare_objects | ||
+ | if l_info.has ("loop_stop") then | ||
+ | -- Dangerous loop stop condition. | ||
+ | a_formatter.add (ca_messages.self_comparison_violation_2) | ||
+ | end | ||
+ | end | ||
</e> | </e> | ||
+ | |||
+ | Then we must implement the usual properties. | ||
+ | |||
+ | <e> | ||
+ | title: STRING_32 | ||
+ | do | ||
+ | Result := ca_names.self_comparison_title | ||
+ | end | ||
+ | |||
+ | id: STRING_32 = "CA071" | ||
+ | -- <Precursor> | ||
+ | |||
+ | description: STRING_32 | ||
+ | do | ||
+ | Result := ca_names.self_comparison_description | ||
+ | end | ||
+ | </e> | ||
+ | |||
+ | Finally, in the initialization we use the default settings, which can be set by calling <e>{CA_RULE}.make_with_defaults</e>. To the default severity score we assign a custom value. In <e>register_actions</e> we must add all the agents for processing the loop and comparison nodes of the AST. | ||
+ | |||
+ | <e> | ||
+ | feature {NONE} -- Initialization | ||
+ | |||
+ | make | ||
+ | -- Initialization. | ||
+ | do | ||
+ | make_with_defaults | ||
+ | default_severity_score := 70 | ||
+ | end | ||
+ | |||
+ | feature {NONE} -- Activation | ||
+ | |||
+ | register_actions (a_checker: attached CA_ALL_RULES_CHECKER) | ||
+ | do | ||
+ | a_checker.add_bin_eq_pre_action (agent process_bin_eq) | ||
+ | a_checker.add_bin_ge_pre_action (agent process_bin_ge) | ||
+ | a_checker.add_bin_gt_pre_action (agent process_bin_gt) | ||
+ | a_checker.add_bin_le_pre_action (agent process_bin_le) | ||
+ | a_checker.add_bin_lt_pre_action (agent process_bin_lt) | ||
+ | a_checker.add_loop_pre_action (agent pre_process_loop) | ||
+ | a_checker.add_loop_post_action (agent post_process_loop) | ||
+ | end | ||
+ | </e> | ||
+ | |||
+ | The complete source code of this rule is available in the [https://svn.eiffel.com/eiffelstudio/branches/eth/eve/Src/framework/code_analysis/rules/expressions/ca_self_comparison_rule.e SVN repository]. | ||
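To make the three cases above concrete, here is a small sample class that could be given to the analyzer as input. It is only a sketch: the class and feature names are hypothetical, and the expected reports are inferred from the description above rather than taken from an actual analyzer run. The class is meant to be analyzed, not executed.

<e>
class
	SELF_COMPARISON_SAMPLE
		-- Hypothetical analyzer input (illustration only).

feature -- Access

	count: INTEGER
			-- Some value.

feature -- Basic operations

	report
			-- Contains a self-comparison outside of a loop.
		do
			if count = count then
				io.put_string ("always printed")
			end
		end

	skip_loop
			-- The exit condition is an equality self-comparison:
			-- it holds immediately, so the loop body never runs.
		local
			i: INTEGER
		do
			from
				i := 1
			until
				i = i
			loop
				i := i + 1
			end
		end

	spin
			-- The exit condition is a non-equality self-comparison:
			-- it never holds, so the loop never exits. Do not call.
		local
			i: INTEGER
		do
			from
				i := 1
			until
				i > i
			loop
				i := i + 1
			end
		end

end
</e>

Following the code shown above, <e>report</e> and <e>skip_loop</e> would be expected to produce a plain self-comparison violation, while <e>spin</e> would additionally carry the <e>"loop_stop"</e> marker and therefore the extra warning text.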
+ | |||
+ | == Example: Rule # 2: ''Unused argument'' == | ||
+ | |||
+ | The ''unused argument'' rule processes the ''feature'', ''body'', ''access id'', and ''converted expression'' AST nodes. The feature node is stored for the description and for ignoring ''deferred'' features. The body node is used to retrieve the arguments. The ''access id'' and ''converted expression'' nodes may represent used arguments, so the nodes are used to mark arguments as read. We register the ''pre'' actions for all the AST nodes as well as the ''post'' action for the ''body'' node in <e>register_actions</e>. | ||
+ | |||
+ | <e> | ||
+ | feature {NONE} -- Activation | ||
+ | |||
+ | register_actions (a_checker: attached CA_ALL_RULES_CHECKER) | ||
+ | do | ||
+ | a_checker.add_feature_pre_action (agent process_feature) | ||
+ | a_checker.add_body_pre_action (agent process_body) | ||
+ | a_checker.add_body_post_action (agent post_process_body) | ||
+ | a_checker.add_access_id_pre_action (agent process_access_id) | ||
+ | a_checker.add_converted_expr_pre_action (agent process_converted_expr) | ||
+ | end | ||
+ | </e> | ||
+ | |||
+ | On processing a feature we store the feature instance, which will be used later. | ||
+ | |||
+ | <e> | ||
+ | process_feature (a_feature_as: FEATURE_AS) | ||
+ | -- Sets the current feature. | ||
+ | do | ||
+ | current_feature := a_feature_as | ||
+ | end | ||
+ | </e> | ||
+ | |||
+ | Before processing the body of a feature we store a list of all the argument names. This is however only done if the feature is a routine, if it has arguments, and if it is not external. In the code we need two nested loops since the arguments are grouped by type. For example, two consecutive <e>STRING</e> arguments as in <e>feature print(first, second: STRING)</e> are in one entry of <e>{BODY_AS}.arguments</e>. This single entry is itself a list of arguments. | ||
+ | |||
+ | <e> | ||
+ | process_body (a_body_as: BODY_AS) | ||
+ | -- Retrieves the arguments from `a_body_as'. | ||
+ | local | ||
+ | j: INTEGER | ||
+ | do | ||
+ | has_arguments := (a_body_as.arguments /= Void) | ||
+ | create args_used.make (0) | ||
+ | n_arguments := 0 | ||
+ | if | ||
+ | attached a_body_as.as_routine as l_rout | ||
+ | and then has_arguments | ||
+ | and then not l_rout.is_external | ||
+ | then | ||
+ | routine_body := a_body_as | ||
+ | create arg_names.make (0) | ||
+ | across a_body_as.arguments as l_args loop | ||
+ | from | ||
+ | j := 1 | ||
+ | until | ||
+ | j > l_args.item.id_list.count | ||
+ | loop | ||
+ | arg_names.extend (l_args.item.item_name (j)) | ||
+ | args_used.extend (False) | ||
+ | n_arguments := n_arguments + 1 | ||
+ | j := j + 1 | ||
+ | end | ||
+ | end | ||
+ | end | ||
+ | end | ||
+ | |||
+ | has_arguments: BOOLEAN | ||
+ | -- Does current feature have arguments? | ||
+ | |||
+ | current_feature: FEATURE_AS | ||
+ | -- Currently checked feature. | ||
+ | |||
+ | routine_body: BODY_AS | ||
+ | -- Current routine body. | ||
+ | |||
+ | n_arguments: INTEGER | ||
+ | -- # arguments for current routine. | ||
+ | |||
+ | arg_names: ARRAYED_LIST [STRING_32] | ||
+ | -- Argument names of current routine. | ||
+ | |||
+ | args_used: ARRAYED_LIST [BOOLEAN] | ||
+ | -- Which argument has been used? | ||
+ | </e> | ||
+ | |||
+ | Both the nodes <e>{ACCESS_ID_AS}</e> and <e>{CONVERTED_EXPR_AS}</e> may represent used arguments. <e>{ACCESS_ID_AS}</e> is a usual variable usage, while <e>{CONVERTED_EXPR_AS}</e> stands for an argument used in inline C code (the dollar sign syntax: <e>$arg</e>). In both routines <e>check_arguments</e> is called eventually, which updates the internal data structures of our rule class. | ||
+ | |||
+ | <e> | ||
+ | process_access_id (a_aid: ACCESS_ID_AS) | ||
+ | -- Checks if `a_aid' is an argument. | ||
+ | do | ||
+ | check_arguments (a_aid.feature_name.name_32) | ||
+ | end | ||
+ | |||
+ | process_converted_expr (a_conv: CONVERTED_EXPR_AS) | ||
+ | -- Checks if `a_conv' is an argument used in the | ||
+ | -- form `$arg'. | ||
+ | local | ||
+ | j: INTEGER | ||
+ | do | ||
+ | if | ||
+ | attached {ADDRESS_AS} a_conv.expr as l_address | ||
+ | and then attached {FEAT_NAME_ID_AS} l_address.feature_name as l_id | ||
+ | then | ||
+ | check_arguments (l_id.feature_name.name_32) | ||
+ | end | ||
+ | end | ||
+ | |||
+ | check_arguments (a_var_name: attached STRING_32) | ||
+ | -- Mark an argument as used if it corresponds to `a_var_name'. | ||
+ | local | ||
+ | j: INTEGER | ||
+ | do | ||
+ | from | ||
+ | j := 1 | ||
+ | until | ||
+ | j > n_arguments | ||
+ | loop | ||
+ | if not args_used [j] and then arg_names [j].is_equal (a_var_name) then | ||
+ | args_used [j] := True | ||
+ | end | ||
+ | j := j + 1 | ||
+ | end | ||
+ | end | ||
+ | </e> | ||
+ | |||
+ | <e>post_process_body</e> finally checks if there exist unused arguments. If this is the case then all the relevant variable names are stored in the rule violation. Also, the feature is stored (for the feature name). The location of the violation is set to the start of the routine body. No rule violation is issued if the feature is deferred. | ||
+ | |||
+ | <e> | ||
+ | post_process_body (a_body: BODY_AS) | ||
+ | -- Adds a violation if the feature contains unused arguments. | ||
+ | local | ||
+ | l_violation: CA_RULE_VIOLATION | ||
+ | j: INTEGER | ||
+ | do | ||
+ | if | ||
+ | a_body.content /= Void | ||
+ | and then not current_feature.is_deferred | ||
+ | and then has_arguments | ||
+ | and then args_used.has (False) | ||
+ | then | ||
+ | create l_violation.make_with_rule (Current) | ||
+ | l_violation.set_location (routine_body.start_location) | ||
+ | l_violation.long_description_info.extend (current_feature) | ||
+ | from | ||
+ | j := 1 | ||
+ | until | ||
+ | j > n_arguments | ||
+ | loop | ||
+ | if not args_used.at (j) then | ||
+ | l_violation.long_description_info.extend (arg_names.at (j)) | ||
+ | end | ||
+ | j := j + 1 | ||
+ | end | ||
+ | violations.extend (l_violation) | ||
+ | end | ||
+ | end | ||
+ | </e> | ||
+ | |||
+ | All the information that was stored in the rule violation is used for the formatted description: | ||
+ | |||
+ | <e> | ||
+ | format_violation_description (a_violation: attached CA_RULE_VIOLATION; a_formatter: attached TEXT_FORMATTER) | ||
+ | local | ||
+ | j: INTEGER | ||
+ | do | ||
+ | a_formatter.add (ca_messages.unused_argument_violation_1) | ||
+ | from | ||
+ | j := 2 | ||
+ | until | ||
+ | j > a_violation.long_description_info.count | ||
+ | loop | ||
+ | if j > 2 then a_formatter.add (", ") end | ||
+ | a_formatter.add ("'") | ||
+ | if attached {STRING_32} a_violation.long_description_info.at (j) as l_arg then | ||
+ | a_formatter.add_local (l_arg) | ||
+ | end | ||
+ | a_formatter.add ("'") | ||
+ | j := j + 1 | ||
+ | end | ||
+ | a_formatter.add (ca_messages.unused_argument_violation_2) | ||
+ | if attached {FEATURE_AS} a_violation.long_description_info.first as l_feature then | ||
+ | a_formatter.add_feature_name (l_feature.feature_name.name_32, a_violation.affected_class) | ||
+ | end | ||
+ | a_formatter.add (ca_messages.unused_argument_violation_3) | ||
+ | end | ||
+ | </e> | ||
+ | |||
+ | The complete source code of this rule is available in the [https://svn.eiffel.com/eiffelstudio/branches/eth/eve/Src/framework/code_analysis/rules/features/ca_unused_argument_rule.e SVN repository]. |
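As an illustration of what this rule looks for, the following sample class could serve as analyzer input. It is a sketch with hypothetical names; the expected outcome is inferred from the description above, not from an actual analyzer run.

<e>
class
	UNUSED_ARGUMENT_SAMPLE
		-- Hypothetical analyzer input (illustration only).

feature -- Basic operations

	greet (a_name: STRING; a_times: INTEGER)
			-- Print a greeting for `a_name'.
			-- `a_times' is never read in the body.
		do
			io.put_string ("Hello, " + a_name)
			io.put_new_line
		end

	greet_repeatedly (a_name: STRING; a_times: INTEGER)
			-- Print a greeting for `a_name', `a_times' times.
			-- Both arguments are read in the body.
		local
			i: INTEGER
		do
			from
				i := 1
			until
				i > a_times
			loop
				greet (a_name, a_times)
				i := i + 1
			end
		end

end
</e>

Based on the code above, <e>greet</e> would be expected to yield one violation that names <e>a_times</e>, while <e>greet_repeatedly</e> would yield none, since both argument names occur as ''access id'' nodes in its body.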
Latest revision as of 14:21, 3 June 2014
The code for Inspector Eiffel is located at three different places in the EiffelStudio source:
- The framework—by far the largest part, with the rule checking, the rules, the control flow graph functionality, and more—is represented as a library;
- The graphical user interface can be found in the interface cluster of EiffelStudio;
- The command-line interface for code analysis is a single class in the tty cluster of EiffelStudio.
The whole Inspector Eiffel framework is located in the library code_analysis.
Class Relations
The following diagram shows an overview of the relations between the classes of the code analysis framework. All classes are located in the code_analysis library except for <e>CLASS_C</e> (EiffelStudio), <e>ROTA_TIMED_TASK_I</e> (ecosystem cluster), <e>EWB_CODE_ANALYSIS</e> (command-line interface), and <e>ES_CODE_ANALYSIS_BENCH_HELPER</e> (GUI).
Interface
This section explains, from a client's point of view, how to use the code analyzer. The code analyzer is represented by the class <e>CA_CODE_ANALYZER</e>, so a client must have, or have access to, an instance of this class. Before the analyzer can be launched, all the classes to be analyzed must be added using one of the following features. If you use more than one of these commands, the classes added by all of them are combined.
- <e>{CA_CODE_ANALYZER}.add_whole_system</e>: Adds all the classes that are part of the current system. Classes of referenced libraries will not be added. So, for example, if your system consists of the classes <e>MY_MAIN</e>, <e>MY_BOX</e>, and <e>MY_ITEM</e> then these three classes will be added to the list of classes to be analyzed.
- <e>.add_class (a_class: attached CONF_CLASS)</e>: Adds a single class.
- <e>.add_classes (a_classes: attached ITERABLE [attached CONF_CLASS])</e>: Adds a list of classes.
- <e>.add_cluster (a_cluster: attached CLUSTER_I)</e>: Adds all classes of a cluster (and all the classes of the sub-clusters recursively).
- <e>.add_group (a_group: attached CONF_GROUP)</e>: Adds all classes of a configuration group. An example of a configuration group is a library.
Here are other features which can be called before starting to analyze:
- <e>{CA_CODE_ANALYZER}.clear_classes_to_analyze</e>: Removes all classes that have been added to the list of classes to analyze.
- <e>.add_completed_action (a_action: attached PROCEDURE [ANY, TUPLE [ITERABLE [TUPLE [detachable EXCEPTION, CLASS_C]]]])</e>: Adds <e>`a_action'</e> to the list of procedures that will be called when analysis has completed. The procedures have one argument, a list of exceptions (with the corresponding class). In the case an exception is thrown during analysis the exception is caught by the code analyzer and is added to this list. In the graphical user interface such exceptions would show up as errors at the top of the list of rule violations.
- <e>.add_output_action (a_action: attached PROCEDURE [ANY, TUPLE [READABLE_STRING_GENERAL]])</e>: Adds <e>`a_action'</e> to the procedures that are called for outputting the status. The final results (rule violations) are not given to these procedures. These output actions are used by the command-line mode and by the status bar in the GUI.
- <e>.is_rule_checkable (a_rule: attached CA_RULE): BOOLEAN</e>: Tells whether <e>`a_rule'</e> will be checked based on the current preferences and based on the current checking scope (whole system or custom set of classes).
Then, to start analyzing, simply call <e>{CA_CODE_ANALYZER}.analyze</e>.
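The features listed above can be combined as in the following sketch. It is a hypothetical client, not code from the library: the class name and the features <e>analyze_system</e>, <e>print_status</e>, and <e>on_completed</e> are invented for illustration, and since this page does not say how a <e>CA_CODE_ANALYZER</e> instance is obtained, the sketch simply receives one as an argument. Only the analyzer features with the signatures listed above are called.

<e>
class
	MY_ANALYSIS_CLIENT
		-- Hypothetical client of the code analyzer (illustration only).

feature -- Basic operations

	analyze_system (a_analyzer: attached CA_CODE_ANALYZER)
			-- Analyze all classes of the current system with `a_analyzer'.
		do
				-- Start from an empty list of classes, then add the whole system.
			a_analyzer.clear_classes_to_analyze
			a_analyzer.add_whole_system
				-- Status messages and the completion notification go to agents.
			a_analyzer.add_output_action (agent print_status)
			a_analyzer.add_completed_action (agent on_completed)
			a_analyzer.analyze
		end

feature {NONE} -- Implementation

	print_status (a_message: READABLE_STRING_GENERAL)
			-- Print the status text `a_message' on standard output.
		do
			io.put_string (a_message.as_string_8)
			io.put_new_line
		end

	on_completed (a_exceptions: ITERABLE [TUPLE [detachable EXCEPTION, CLASS_C]])
			-- Report how many classes raised an exception during analysis.
		local
			l_count: INTEGER
		do
			across a_exceptions as l_cursor loop
				l_count := l_count + 1
			end
			io.put_string ("Analysis completed; classes with exceptions: ")
			io.put_integer (l_count)
			io.put_new_line
		end

end
</e>

Because the analysis runs as an asynchronous task (see the next section), a client of this kind should expect the completed action to be called some time after <e>analyze</e> has returned.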
Rule Checking
In the GUI we want to be able to continue to work while the code analyzer is running. Analyzing larger sets of classes (such as whole libraries) can take from several seconds to several minutes. For this reason the code analyzer uses an asynchronous task, <e>{CA_RULE_CHECKING_TASK}</e>. In <e>{CA_CODE_ANALYZER}.analyze</e> this task (<e>l_task</e>) is invoked as follows:
In <e>{CA_CODE_ANALYZER}.analyze</e>:
<e>
create l_task.make (l_rules_checker, l_rules_to_check, classes_to_analyze, agent analysis_completed)
l_task.set_output_actions (output_actions)
rota.run_task (l_task)
</e>
<e>{CA_RULE_CHECKING_TASK}</e> essentially runs the whole analysis. Like all other classes conforming to <e>{ROTA_TASK_I}</e>, this class executes a series of steps between which the user interface gets some time to process its events. In <e>{CA_RULE_CHECKING_TASK}</e> each step analyzes one class. This means that a class is checked by all the rules for violations. The following code does that:
From <e>{CA_RULE_CHECKING_TASK}</e>:
<e>
step
		-- <Precursor>
	do
		if has_next_step then
				-- Gather type information
			type_recorder.clear
			type_recorder.analyze_class (classes.item)
			context.set_node_types (type_recorder.node_types)
			context.set_checking_class (classes.item)

			across rules as l_rules loop
					-- If rule is non-standard then it will not be checked by l_rules_checker.
					-- We will have the rule check the current class here:
				if
					l_rules.item.is_enabled.value
					and then attached {CA_CFG_RULE} l_rules.item as l_cfg_rule
				then
					l_cfg_rule.check_class (classes.item)
				end
			end

				-- Status output.
			if output_actions /= Void then
				output_actions.call ([ca_messages.analyzing_class (classes.item.name)])
			end

			rules_checker.run_on_class (classes.item)

			classes.forth
			has_next_step := not classes.after
			if not has_next_step then
				completed_action.call ([exceptions])
			end
		end
	rescue
			-- Instant error output.
		if output_actions /= Void then
			output_actions.call ([ca_messages.error_on_class (classes.item.name)])
		end
		exceptions.extend ([exception_manager.last_exception, classes.item])
			-- Jump to the next class.
		classes.forth
		has_next_step := not classes.after
		if not has_next_step then
			completed_action.call ([exceptions])
		end
		retry
	end
</e>
<e>type_recorder</e> is of type <e>{CA_AST_TYPE_RECORDER}</e>. It uses functionality of the Eiffel compiler to determine the type of some AST nodes in the current class. The AST itself (as provided by the Eiffel compiler) does not contain any type information. <e>context</e> has type <e>{CA_ANALYSIS_CONTEXT}</e> and contains side information such as the previously mentioned types and the current class. The rules were given this context beforehand so that they can access it when needed.
The <e>across</e> loop only checks control flow graph rules. All the standard rules are checked by the line <e>rules_checker.run_on_class (classes.item)</e>. <e>rules_checker</e> has type <e>{CA_ALL_RULES_CHECKER}</e>. This is the class where each rule must register the AST nodes the rule visits. <e>run_on_class</e> iterates over the AST and calls all the actions that were registered by the standard rules. In this way all rules are used to check the current class. <e>step</e> is executed repeatedly until there are no classes left to analyze.
In the <e>rescue</e> clause all possible exceptions are caught and recorded. After such an exception the task proceeds to the next class.
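The record-skip-retry pattern used in <e>step</e> can be isolated in a small, self-contained class. The following is only a sketch: <e>RESILIENT_STEPPER</e> and all of its features are hypothetical and use plain strings instead of classes to analyze. It relies only on <e>ARRAYED_LIST</e> and <e>DEVELOPER_EXCEPTION</e> from EiffelBase.

<e>
class
	RESILIENT_STEPPER
		-- Hypothetical toy class: processes a list item by item and
		-- survives failures, in the style of `step' above.

create
	make

feature {NONE} -- Initialization

	make (a_items: attached ARRAYED_LIST [STRING])
			-- Prepare to process `a_items'.
		do
			items := a_items
			items.start
			create failed.make (0)
		end

feature -- Access

	items: ARRAYED_LIST [STRING]
			-- Items to process.

	failed: ARRAYED_LIST [STRING]
			-- Items whose processing raised an exception.

feature -- Basic operations

	step
			-- Process the current item, if any, and advance.
		do
			if not items.after then
				process (items.item)
				items.forth
			end
		rescue
				-- Record the failing item, skip it, and re-run the body.
			failed.extend (items.item)
			items.forth
			retry
		end

feature {NONE} -- Implementation

	process (a_item: STRING)
			-- Stand-in for the real work; fails on empty strings.
		do
			if a_item.is_empty then
				(create {DEVELOPER_EXCEPTION}).raise
			end
		end

end
</e>

As in the original <e>step</e>, the <e>retry</e> instruction re-executes the routine body, so a call to <e>step</e> that hits a failing item goes on to process the following item before it returns.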
Checking Standard Rules
The relatively large class <e>{CA_ALL_RULES_CHECKER}</e> is responsible for checking standard rules. It does this in a straightforward way. It is a subclass of <e>{AST_ITERATOR}</e>, a realization of a visitor on the AST.
Rules can register their actions with <e>{CA_ALL_RULES_CHECKER}</e> by calling a procedure like <e>add_bin_lt_pre_action (a_action: attached PROCEDURE [ANY, TUPLE [BIN_LT_AS]])</e> or <e>add_if_post_action (a_action: attached PROCEDURE [ANY, TUPLE [IF_AS]])</e>. These "pre" and "post" actions exist for many other types of AST nodes as well. All the registered actions are stored in <e>ACTION_SEQUENCE</e> variables:
<e>
if_pre_actions, if_post_actions: ACTION_SEQUENCE [TUPLE [IF_AS]]

add_if_post_action (a_action: attached PROCEDURE [ANY, TUPLE [IF_AS]])
	do
		if_post_actions.extend (a_action)
	end

-- And similar for all other relevant AST nodes...
</e>
The corresponding visitor procedures are redefined. This is done in the following way:
<e>
process_if_as (a_if: IF_AS)
	do
		if_pre_actions.call ([a_if])
		Precursor (a_if)
		if_post_actions.call ([a_if])
	end

-- And similar for all other relevant AST nodes...
</e>
Since the actual iteration over the AST is done in the ancestor we need only very little code to analyze a class:
<e>
feature {CA_RULE_CHECKING_TASK} -- Execution Commands

	run_on_class (a_class_to_check: CLASS_C)
			-- Check all rules that have added their agents.
		local
			l_ast: CLASS_AS
		do
			last_run_successful := False
			l_ast := a_class_to_check.ast
			class_pre_actions.call ([l_ast])
			process_class_as (l_ast)
			class_post_actions.call ([l_ast])
			last_run_successful := True
		end
</e>
This code analyzes a class for all active standard rules. <e>class_pre_actions</e> and <e>class_post_actions</e> are action sequences of the same kind as those for the other AST nodes. <e>process_class_as</e>, which is implemented in <e>{AST_ITERATOR}</e>, will recursively visit all relevant AST nodes and execute their action sequences.
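The registration mechanism can be tried out in isolation with a much reduced stand-in for the checker. The following sketch is hypothetical throughout: <e>MINI_CHECKER</e>, its features, and the use of a plain string as a "node" are invented for illustration. It uses the two-parameter form <e>PROCEDURE [ANY, TUPLE [...]]</e> found in the signatures above; other EiffelStudio versions may expect a different form.

<e>
class
	MINI_CHECKER
		-- Hypothetical, much reduced analogue of the rules checker;
		-- a "node" is just a string here.

create
	make

feature {NONE} -- Initialization

	make
			-- Create empty action sequences.
		do
			create node_pre_actions
			create node_post_actions
		end

feature -- Registration

	add_node_pre_action (a_action: attached PROCEDURE [ANY, TUPLE [STRING]])
			-- Have `a_action' called before a node is visited.
		do
			node_pre_actions.extend (a_action)
		end

	add_node_post_action (a_action: attached PROCEDURE [ANY, TUPLE [STRING]])
			-- Have `a_action' called after a node has been visited.
		do
			node_post_actions.extend (a_action)
		end

feature -- Basic operations

	process_node (a_node: STRING)
			-- Call all pre actions, visit `a_node', call all post actions.
		do
			node_pre_actions.call ([a_node])
				-- The real checker calls `Precursor' here to visit child nodes.
			node_post_actions.call ([a_node])
		end

	demo
			-- Register two logging agents and process one node.
		do
			add_node_pre_action (agent log ("enter", ?))
			add_node_post_action (agent log ("leave", ?))
			process_node ("if")
		end

feature {NONE} -- Implementation

	node_pre_actions, node_post_actions: ACTION_SEQUENCE [TUPLE [STRING]]
			-- Registered actions.

	log (a_phase, a_node: STRING)
			-- Print `a_phase' and `a_node' on one line.
		do
			io.put_string (a_phase + " " + a_node)
			io.put_new_line
		end

end
</e>

Calling <e>demo</e> on a fresh instance runs the "enter" agent before the "leave" agent, which mirrors how a rule's pre action sees a node before its children are visited and its post action sees it afterwards.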
Example: Rule # 71: Self-comparison
We will go through the implementation of rule # 71 (Self-comparison) in detail.
The heart of this implementation lies in the feature <e>analyze_self</e>. It tests whether a binary expression is a self-comparison. <e>is_self</e>, a <e>BOOLEAN</e> attribute, is set to true if and only if the argument is a comparison between two identical variables.
<e>
analyze_self (a_bin: attached BINARY_AS)
		-- Is `a_bin' a self-comparison?
	do
		is_self := False

		if
			attached {EXPR_CALL_AS} a_bin.left as l_e1
			and then attached {ACCESS_ID_AS} l_e1.call as l_l
			and then attached {EXPR_CALL_AS} a_bin.right as l_e2
			and then attached {ACCESS_ID_AS} l_e2.call as l_r
		then
			is_self := l_l.feature_name.is_equal (l_r.feature_name)
			self_name := l_l.access_name_32
		end
	end

is_self: BOOLEAN
		-- Is `a_bin' from last call to `analyze_self' a self-comparison?

self_name: detachable STRING_32
		-- Name of the self-compared variable.
</e>
Both sides of the comparison, a_bin.left
and a_bin.right
, are tested to have the types that indicate that they are variable or feature accesses. If the tests succeed then is_self
is set according to the equality of the two feature names. Then the name is stored in an internal attribute.
analyze_self is used in process_comparison, which creates a rule violation if a self-comparison was detected.
process_comparison (a_comparison: BINARY_AS)
        -- Checks `a_comparison' for rule violations.
    local
        l_viol: CA_RULE_VIOLATION
    do
        if not in_loop then
            analyze_self (a_comparison)
            if is_self then
                create l_viol.make_with_rule (Current)
                l_viol.set_location (a_comparison.start_location)
                l_viol.long_description_info.extend (self_name)
                violations.extend (l_viol)
            end
        end
    end
First we check that we are not dealing with a loop condition. Self-comparisons in loop conditions are more dangerous and need special treatment (see below). For the rule violation, we set the location to the start location of the binary comparison. We add the variable or feature name to the violation.
Different kinds of comparisons also have different types in the AST. That is why in an AST iterator they are processed independently. Thus, we need to add some delegation to each of the actions that are called when processing a comparison.
process_bin_eq (a_bin_eq: BIN_EQ_AS)
    do
        process_comparison (a_bin_eq)
    end

process_bin_ge (a_bin_ge: BIN_GE_AS)
    do
        process_comparison (a_bin_ge)
    end

process_bin_gt (a_bin_gt: BIN_GT_AS)
    do
        process_comparison (a_bin_gt)
    end

process_bin_le (a_bin_le: BIN_LE_AS)
    do
        process_comparison (a_bin_le)
    end

process_bin_lt (a_bin_lt: BIN_LT_AS)
    do
        process_comparison (a_bin_lt)
    end
If a loop condition is a self-comparison, the loop is either never entered or never exited. The latter case is more severe; the former only arises with an equality comparison. For this reason we analyze loop conditions separately. If we find such a violation we set in_loop to True so that any further self-comparisons are ignored until we have left the loop.
pre_process_loop (a_loop: LOOP_AS)
        -- Checking a loop `a_loop' for self-comparisons needs more work. If the until expression
        -- is a self-comparison that does not compare for equality then the loop will
        -- not terminate, which is a more severe consequence than for other self-comparisons.
    local
        l_viol: CA_RULE_VIOLATION
    do
        if attached {BINARY_AS} a_loop.stop as l_bin then
            analyze_self (l_bin)
            if is_self then
                create l_viol.make_with_rule (Current)
                l_viol.set_location (a_loop.stop.start_location)
                l_viol.long_description_info.extend (self_name)
                if not attached {BIN_EQ_AS} l_bin then
                        -- It is only a dangerous loop stop condition if we do not have
                        -- an equality comparison.
                    l_viol.long_description_info.extend ("loop_stop")
                end
                violations.extend (l_viol)
                in_loop := True
            end
        end
    end
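As an illustration of the kind of client code this rule targets, the following hypothetical class (SELF_COMPARISON_SAMPLE is not part of EiffelStudio) contains one self-comparison in a conditional and one in a loop stop condition. Going by the description above, both would presumably be reported, while the ordinary loop in between would not. This is a sketch based on the text, not verified analyzer output.

```eiffel
class
    SELF_COMPARISON_SAMPLE

create
    make

feature {NONE} -- Initialization

    make
            -- Run code containing self-comparisons.
        local
            i, n: INTEGER
        do
            n := 3

                -- Self-comparison in a conditional: always True.
            if n = n then
                print ("always executed%N")
            end

                -- Ordinary stop condition comparing two different variables.
            from
                i := 1
            until
                i >= n
            loop
                i := i + 1
            end

                -- Equality self-comparison as stop condition: the loop
                -- body is never entered.
            from
                i := 1
            until
                i = i
            loop
                i := i + 1
            end

                -- A stop condition such as `i > i' is never True, so the
                -- loop would never exit. This is the case that receives
                -- the additional "loop_stop" marker. It is left out here
                -- so that the sample terminates.
        end

end
```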
format_violation_description, which is declared in CA_RULE as deferred, must be implemented. Here, together with a predefined localized text, we mention the name of the self-compared variable. If the self-comparison is located in a loop stop condition we add an additional warning text.
format_violation_description (a_violation: attached CA_RULE_VIOLATION; a_formatter: attached TEXT_FORMATTER)
    local
        l_info: LINKED_LIST [ANY]
    do
        l_info := a_violation.long_description_info
        a_formatter.add ("'")
        if l_info.count >= 1 and then attached {STRING_32} l_info.first as l_name then
            a_formatter.add_local (l_name)
        end
        a_formatter.add (ca_messages.self_comparison_violation_1)

        l_info.compare_objects
        if l_info.has ("loop_stop") then
                -- Dangerous loop stop condition.
            a_formatter.add (ca_messages.self_comparison_violation_2)
        end
    end
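The call to compare_objects before has matters: EiffelBase containers compare items by reference unless object comparison is switched on, and each evaluation of a manifest string yields a new object. The following hypothetical class (COMPARE_OBJECTS_SKETCH is only an illustration) sketches the difference using a LINKED_LIST [ANY] like the one above.

```eiffel
class
    COMPARE_OBJECTS_SKETCH

create
    make

feature {NONE} -- Initialization

    make
            -- Search for a marker string with both comparison modes.
        local
            l_info: LINKED_LIST [ANY]
        do
            create l_info.make
            l_info.extend ("loop_stop")

                -- Reference comparison is the default: the argument below
                -- is a different object than the stored item.
            print (l_info.has ("loop_stop"))
            print ("%N")

                -- With object comparison, items are compared by value.
            l_info.compare_objects
            print (l_info.has ("loop_stop"))
            print ("%N")
        end

end
```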
Then we must implement the usual properties.
title: STRING_32
    do
        Result := ca_names.self_comparison_title
    end

id: STRING_32 = "CA071"
        -- <Precursor>

description: STRING_32
    do
        Result := ca_names.self_comparison_description
    end
Finally, in the initialization we use the default settings, which can be set by calling {CA_RULE}.make_with_defaults. To the default severity score we assign a custom value. In register_actions we must add all the agents for processing the loop and comparison nodes of the AST.
feature {NONE} -- Initialization

    make
            -- Initialization.
        do
            make_with_defaults
            default_severity_score := 70
        end

feature {NONE} -- Activation

    register_actions (a_checker: attached CA_ALL_RULES_CHECKER)
        do
            a_checker.add_bin_eq_pre_action (agent process_bin_eq)
            a_checker.add_bin_ge_pre_action (agent process_bin_ge)
            a_checker.add_bin_gt_pre_action (agent process_bin_gt)
            a_checker.add_bin_le_pre_action (agent process_bin_le)
            a_checker.add_bin_lt_pre_action (agent process_bin_lt)
            a_checker.add_loop_pre_action (agent pre_process_loop)
            a_checker.add_loop_post_action (agent post_process_loop)
        end
The complete source code of this rule is available in the SVN repository.
Example: Rule # 2: Unused argument
The unused argument rule processes the feature, body, access id, and converted expression AST nodes. The feature node is stored for the description and for ignoring deferred features. The body node is used to retrieve the arguments. The access id and converted expression nodes may represent used arguments, so the nodes are used to mark arguments as read. We register the pre actions for all the AST nodes as well as the post action for the body node in register_actions.
feature {NONE} -- Activation

    register_actions (a_checker: attached CA_ALL_RULES_CHECKER)
        do
            a_checker.add_feature_pre_action (agent process_feature)
            a_checker.add_body_pre_action (agent process_body)
            a_checker.add_body_post_action (agent post_process_body)
            a_checker.add_access_id_pre_action (agent process_access_id)
            a_checker.add_converted_expr_pre_action (agent process_converted_expr)
        end
On processing a feature we store the feature instance, which will be used later.
process_feature (a_feature_as: FEATURE_AS)
        -- Sets the current feature.
    do
        current_feature := a_feature_as
    end
Before processing the body of a feature we store a list of all the argument names. This is however only done if the feature is a routine, if it has arguments, and if it is not external. In the code we need two nested loops since the arguments are grouped by type. For example, two consecutive STRING arguments as in feature print (first, second: STRING) are in one entry of {BODY_AS}.arguments. This single entry is itself a list of arguments.
process_body (a_body_as: BODY_AS)
        -- Retrieves the arguments from `a_body_as'.
    local
        j: INTEGER
    do
        has_arguments := (a_body_as.arguments /= Void)
        create args_used.make (0)
        n_arguments := 0
        if
            attached a_body_as.as_routine as l_rout
            and then has_arguments
            and then not l_rout.is_external
        then
            routine_body := a_body_as
            create arg_names.make (0)
            across a_body_as.arguments as l_args loop
                from
                    j := 1
                until
                    j > l_args.item.id_list.count
                loop
                    arg_names.extend (l_args.item.item_name (j))
                    args_used.extend (False)
                    n_arguments := n_arguments + 1
                    j := j + 1
                end
            end
        end
    end

has_arguments: BOOLEAN
        -- Does current feature have arguments?

current_feature: FEATURE_AS
        -- Currently checked feature.

routine_body: BODY_AS
        -- Current routine body.

n_arguments: INTEGER
        -- # arguments for current routine.

arg_names: ARRAYED_LIST [STRING_32]
        -- Argument names of current routine.

args_used: ARRAYED_LIST [BOOLEAN]
        -- Which argument has been used?
Both the nodes {ACCESS_ID_AS} and {CONVERTED_EXPR_AS} may represent used arguments. {ACCESS_ID_AS} is a usual variable usage, while {CONVERTED_EXPR_AS} stands for an argument used in inline C code (the dollar sign syntax: $arg). In both routines check_arguments is called eventually, which updates the internal data structures of our rule class.
process_access_id (a_aid: ACCESS_ID_AS)
        -- Checks if `a_aid' is an argument.
    do
        check_arguments (a_aid.feature_name.name_32)
    end

process_converted_expr (a_conv: CONVERTED_EXPR_AS)
        -- Checks if `a_conv' is an argument used in the
        -- form `$arg'.
    do
        if
            attached {ADDRESS_AS} a_conv.expr as l_address
            and then attached {FEAT_NAME_ID_AS} l_address.feature_name as l_id
        then
            check_arguments (l_id.feature_name.name_32)
        end
    end

check_arguments (a_var_name: attached STRING_32)
        -- Mark an argument as used if it corresponds to `a_var_name'.
    local
        j: INTEGER
    do
        from
            j := 1
        until
            j > n_arguments
        loop
            if not args_used [j] and then arg_names [j].is_equal (a_var_name) then
                args_used [j] := True
            end
            j := j + 1
        end
    end
post_process_body finally checks if there exist unused arguments. If this is the case then all the relevant variable names are stored in the rule violation. Also, the feature is stored (for the feature name). The location of the violation is set to the start of the routine body. No rule violation is issued if the feature is deferred.
post_process_body (a_body: BODY_AS)
        -- Adds a violation if the feature contains unused arguments.
    local
        l_violation: CA_RULE_VIOLATION
        j: INTEGER
    do
        if
            a_body.content /= Void
            and then not current_feature.is_deferred
            and then has_arguments
            and then args_used.has (False)
        then
            create l_violation.make_with_rule (Current)
            l_violation.set_location (routine_body.start_location)
            l_violation.long_description_info.extend (current_feature)
            from
                j := 1
            until
                j > n_arguments
            loop
                if not args_used.at (j) then
                    l_violation.long_description_info.extend (arg_names.at (j))
                end
                j := j + 1
            end
            violations.extend (l_violation)
        end
    end
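For orientation, here is a small hypothetical class (UNUSED_ARGUMENT_SAMPLE and its feature greet are invented for illustration) of the kind this rule is meant to catch. The two arguments are declared in one group, so they would form a single entry in the argument list, and only first is read in the body. Based on the description above, second would presumably be reported as unused; this is a sketch, not verified analyzer output.

```eiffel
class
    UNUSED_ARGUMENT_SAMPLE

create
    make

feature {NONE} -- Initialization

    make
            -- Call the routine with the unused argument.
        do
            greet ("Hello", "never read")
        end

feature -- Basic operations

    greet (first, second: STRING)
            -- Print `first'.
            -- `first' and `second' share one type declaration, hence
            -- they belong to the same argument group.
        do
                -- Only `first' is accessed; `second' is never read.
            print (first + "%N")
        end

end
```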
All the information that was stored in the rule violation is used for the formatted description:
format_violation_description (a_violation: attached CA_RULE_VIOLATION; a_formatter: attached TEXT_FORMATTER)
    local
        j: INTEGER
    do
        a_formatter.add (ca_messages.unused_argument_violation_1)
        from
            j := 2
        until
            j > a_violation.long_description_info.count
        loop
            if j > 2 then
                a_formatter.add (", ")
            end
            a_formatter.add ("'")
            if attached {STRING_32} a_violation.long_description_info.at (j) as l_arg then
                a_formatter.add_local (l_arg)
            end
            a_formatter.add ("'")
            j := j + 1
        end
        a_formatter.add (ca_messages.unused_argument_violation_2)
        if attached {FEATURE_AS} a_violation.long_description_info.first as l_feature then
            a_formatter.add_feature_name (l_feature.feature_name.name_32, a_violation.affected_class)
        end
        a_formatter.add (ca_messages.unused_argument_violation_3)
    end
The complete source code of this rule is available in the SVN repository.