HeaderDoc::MacroFilter

Declared In:

Introduction

Filters content based on C preprocessor directives.

Discussion

This class is basically a data structure for interpreting if statements, #if statements, and other similar conditional statements (e.g. switch), depending on the value of HeaderDoc::interpret_case.

The way the tree behaves regarding unknown tokens is described in more detail in the discussion for doit.



Member Functions

adjconstraint

Adds a new variable or operator to an existing constraint.

attop

Returns whether there are any non-null constraints between the current constraint and the top of the constraint tree.

doit

Parses a #if/#ifdef/#else block and builds up a constraint tree.

dotest

Runs a single test of the macro filter engine.

filterFileString

Filters an entire file string based on the specified macros.

hasReturnOrBreak

Checks for return/break statements in a code tree.

ignoreWithinCPPDirective

Determines whether the contents of a given C preprocessor directive should be ignored or not.

isnullconstraint

Returns whether this is a null constraint.

localmatch

Checks the value for a given node without recursion.

matchesconstraints

Returns whether the constraints match the current set of macro definitions.

matchesconstraints_sub

The recursive portion of matchesconstraints.

newchild

Creates a new constraint as a child of the current constraint.

newconstraint

Creates a new constraint for insertion into the constraint tree.

newparenguts

Creates a new parenthesis guts sibling node for a parenthesis constraint.

newsibling

Creates a new constraint as a sibling of the current constraint.

printconstraint

Prints a constraint for debugging purposes.

run_macro_filter_tests

Runs a series of tests on the macro filter engine.

unrolltoparen

Unrolls from current constraint to the nearest enclosing parenthesis constraint.

walkTree

Walks through a parse tree and builds up the corresponding constraint tree.


adjconstraint


Adds a new variable or operator to an existing constraint.

sub adjconstraint(
    $$$$$$) 
Parameters
constraint

The constraint to alter.

parenconstraint

The nearest enclosing parenthesis around this constraint.

topconstraint

The top node in the constraint tree.

lefttoken

The token on the left side of the comparison operator.

comparison

The comparison operator (==, !=, etc.).

righttoken

The token on the right side of the comparison operator.


attop


Returns whether there are any non-null constraints between the current constraint and the top of the constraint tree.

sub attop 
Parameters
topconstraint

The top node in the constraint tree.

constraint

The constraint node to check.


doit


Parses a #if/#ifdef/#else block and builds up a constraint tree.

sub doit 
Parameters
block

The block of code to parse.

Discussion

Called by ignoreWithinCPPDirective and dotest; also calls itself recursively.

This is the core of the macro filter engine. It parses the declaration using blockParse, then calls walkTree to build up a constraint tree. The calling function can then call matchesconstraints to determine whether to include or exclude any portion of the content.

Propagation of "Don't care":

If a symbol is marked as an explicit "must be undefined", its value is 0 just as it would be with a real C preprocessor. However, this leaves open the issue of symbols for which no value is specified. We call these "don't care" values.

The way these values are handled makes it possible to have combinations that cannot occur in the real world (a value being interpreted one way in one spot and differently in another spot). This is intentional because we prefer inclusion over exclusion for these values in all cases.

To support this goal, when we see an unknown symbol, we mark that constraint as a "don't care" value. This propagates up the chain as follows:

  1. Check the && chain (parent/child). If we get a logical false farther down, the constraint must be false because "false && X" is false for all X.

  2. Check the || chain (siblings). If we get a logical true to the left or right, the constraint must be true because "true || X" is true for all X.

  3. If we get here, propagate DC up one level.

  4. If top level is DC, assume true.
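
The propagation rules above can be sketched as a small three-valued evaluator. This is a simplified illustration, not the module's code: the constants TV_TRUE, TV_FALSE, and TV_DC and the helpers tv_and, tv_or, and resolve_top are hypothetical names, and the real engine works on constraint nodes rather than bare values.

```perl
use strict;
use warnings;

# Hypothetical encoding of the three states; the real module may differ.
use constant { TV_TRUE => 1, TV_FALSE => 0, TV_DC => -1 };

# Rule 1: "false && X" is false for all X; otherwise "don't care" wins.
sub tv_and {
    my ($l, $r) = @_;
    return TV_FALSE if $l == TV_FALSE || $r == TV_FALSE;
    return TV_DC    if $l == TV_DC    || $r == TV_DC;
    return TV_TRUE;
}

# Rule 2: "true || X" is true for all X; otherwise "don't care" wins.
sub tv_or {
    my ($l, $r) = @_;
    return TV_TRUE if $l == TV_TRUE || $r == TV_TRUE;
    return TV_DC   if $l == TV_DC   || $r == TV_DC;
    return TV_FALSE;
}

# Rule 4: a "don't care" that reaches the top level is assumed true.
sub resolve_top {
    my ($v) = @_;
    return $v == TV_DC ? TV_TRUE : $v;
}

# An explicitly false operand defeats an unknown one.
my $excluded = resolve_top(tv_and(TV_FALSE, TV_DC));
# An unknown symbol on its own is included.
my $included = resolve_top(TV_DC);
```

Rule 3 (propagating DC up one level) corresponds to tv_and and tv_or returning TV_DC when neither short-circuit case applies.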


dotest


Runs a single test of the macro filter engine.

sub dotest(
    $$) 
Parameters
string

The initial #if directive (or similar conditional) to test.

expected_value

The expected return value from the engine.


filterFileString


Filters an entire file string based on the specified macros.

Parameters
data

The entire contents of a file as a string.


hasReturnOrBreak


Checks for return/break statements in a code tree.

Discussion

This function determines whether a parse tree fragment contains a return or break statement in every possible path through a tree of if() {...} else {...} statements.

This function is not used by HeaderDoc. It is provided for use by other tools that take advantage of the HeaderDoc parser and related modules.
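
The "every possible path" check can be illustrated with a short recursive sketch. The statement representation here (hashes with type, then, and else keys) and the name always_exits are assumptions made for the example; the real function operates on HeaderDoc parse trees.

```perl
use strict;
use warnings;

# Hypothetical statement shape:
#   { type => 'return' | 'break' | 'if' | 'other', then => [...], else => [...] }
sub always_exits {
    my ($stmts) = @_;
    for my $s (@{ $stmts || [] }) {
        # A bare return or break ends every path through this block.
        return 1 if $s->{type} eq 'return' || $s->{type} eq 'break';
        if ($s->{type} eq 'if' && $s->{else}) {
            # An if/else exits only when both branches exit.
            return 1 if always_exits($s->{then}) && always_exits($s->{else});
        }
    }
    return 0;    # at least one path falls off the end
}

my $both_exit = [ { type => 'if',
                    then => [ { type => 'return' } ],
                    else => [ { type => 'break' } ] } ];
my $one_exits = [ { type => 'if',
                    then => [ { type => 'return' } ],
                    else => [ { type => 'other' } ] } ];
```

An if without an else never counts as exiting in this sketch, because the path that skips the body continues past it.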


ignoreWithinCPPDirective


Determines whether the contents of a given C preprocessor directive should be ignored or not.

Parameters
cpp_command

The C preprocessor command (e.g. "#if").

text

The entire contents of the #if/#else/#ifdef, including the enclosed text (but not including the #else for a #if).

curshow

Indicates whether the previous block is ignored. Used to determine whether #else clauses should return true or false.

Discussion

The primary purpose of this function is to pre-parse the declaration and scrape out only the actual #if expression without the contents.

Secondarily, this simplifies "#ifdef" to "#if (defined(...))" so that the later parsing code is smaller.
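
A minimal sketch of that pre-parse step follows. The helper name normalize_directive is hypothetical, the #ifndef rewrite is an assumption by analogy (the text above only mentions #ifdef), and backslash line continuations are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the directive line and rewrite #ifdef.
sub normalize_directive {
    my ($text) = @_;

    # Drop the enclosed contents; only the first line carries the expression.
    my ($first) = split /\n/, $text, 2;
    $first = '' unless defined $first;

    # "#ifdef FOO" becomes "#if (defined(FOO))".
    $first =~ s/^\s*#\s*ifdef\s+(\w+).*$/#if (defined($1))/;
    # Assumption: "#ifndef FOO" becomes "#if (!defined(FOO))".
    $first =~ s/^\s*#\s*ifndef\s+(\w+).*$/#if (!defined($1))/;

    return $first;
}

my $expr = normalize_directive("#ifdef FOO\nint x;\n#endif\n");
```

Reducing every form to #if means the later parsing code only needs to understand one directive shape.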


isnullconstraint


Returns whether this is a null constraint.

Parameters
constraint

The constraint to check.

printing

Set to disable debug output if you are in the middle of printing a constraint. (Optional. Default is 0.)

Discussion

In the process of building up the tree, it is sometimes necessary to insert null constraints as placeholders. (For example, the first node in any chain is a NULL constraint.) These constraints merely propagate the values below/beside them.


localmatch


Checks the value for a given node without recursion.

Parameters
constraint

The constraint to check.

use_default_value

Indicates that the macro filter engine should use a specific default value for undefined parameters.

default_value

The default value to use for undefined parameters.

printing

Set to disable debug output if you are in the middle of printing a constraint. (Optional. Default is 0.)

Return Value

Returns 0 if the constraint fails explicitly with either an "if (!defined(X))" where "X" is defined or with a comparison failure where both values are defined.

Returns 1 if the constraint succeeds explicitly with either an "if (defined(X))" where "X" is defined or with a comparison success where both values are defined.

Returns -1 for a null comparison. Also returns -1 if the result cannot be determined because of a constant token without a specified value and use_default_value is 1.

Returns -3 if the result cannot be determined because of a constant token without a specified value and use_default_value is 0.
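
As an illustration of how a caller might interpret these codes, the following sketch maps each documented return value to a label. The helper names describe_result and is_undetermined are hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: label a localmatch-style return code.
sub describe_result {
    my ($code) = @_;
    my %meaning = (
         1 => 'explicit success',
         0 => 'explicit failure',
        -1 => 'null comparison, or unknown with a default value in use',
        -3 => 'unknown, no default value in use',
    );
    return exists $meaning{$code} ? $meaning{$code} : 'unexpected code';
}

# Negative codes mean the node could not decide on its own.
sub is_undetermined {
    my ($code) = @_;
    return $code < 0 ? 1 : 0;
}
```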


matchesconstraints


Returns whether the constraints match the current set of macro definitions.

Parameters
constraint

The constraint to check.

default_value

The default value to use for undefined variables.


matchesconstraints_sub


The recursive portion of matchesconstraints.

Parameters
constraint

The constraint to check.

use_default_value

Indicates that the macro filter engine should use a specific default value for undefined parameters.

default_value

The default value to use for undefined parameters.


newchild


Creates a new constraint as a child of the current constraint.

sub newchild 

newconstraint


Creates a new constraint for insertion into the constraint tree.


newparenguts


Creates a new parenthesis guts sibling node for a parenthesis constraint.

Discussion

The constraints representing what comes between this parenthesis and the matching parenthesis hang off the PARENTREE chain. This function creates an initial null constraint for those additional constraints to hang off of.


newsibling


Creates a new constraint as a sibling of the current constraint.

Parameters
constraint

The current constraint.

parenconstraint

If specified, this represents an outer bound for unrolling to the nearest enclosing parenthesis. (See unrolltoparen for details; the call sets including to 0.)

If unspecified, the function does not unroll to the nearest parenthesis constraint.


printconstraint


Prints a constraint for debugging purposes.

Parameters
constraint

The constraint node to print.

nodeonly

Indicates that children and successors of this node should not be printed. (Optional. Defaults to 0.)


run_macro_filter_tests


Runs a series of tests on the macro filter engine.


unrolltoparen


Unrolls from current constraint to the nearest enclosing parenthesis constraint.

Parameters
constraint

The constraint to alter.

parenconstraint

An enclosing parenthesis around this constraint.

Depending on the value of including, this is either an absolute upper bound parenthesis that should never be reached or this is a parenthesis constraint whose enclosing parenthesis constraint you want to obtain.

including

If this is 1, the function unrolls to the parenthesis constraint that encloses parenconstraint.

If this is 0, unroll to the parenthesis constraint that encloses constraint.
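
The two modes can be sketched as follows. This is a deliberately reduced model that assumes the answer is always available from the PARENWRAPPER and PREVPAREN fields described under Member Data; the name unroll_sketch is hypothetical, and the real function presumably does more work when those links are not already set.

```perl
use strict;
use warnings;

# Hypothetical simplification using only PARENWRAPPER and PREVPAREN.
sub unroll_sketch {
    my ($constraint, $parenconstraint, $including) = @_;
    if ($including) {
        # Unroll to the parenthesis that encloses $parenconstraint.
        return $parenconstraint->{PREVPAREN};
    }
    # Unroll to the parenthesis that encloses $constraint.
    return $constraint->{PARENWRAPPER};
}

# Nodes for an expression shaped like "( ( A ) )".
my $outer = { ISPAREN => 1 };
my $inner = { ISPAREN => 1, PREVPAREN => $outer };
my $node  = { LEFTTOKEN => 'A', PARENWRAPPER => $inner };
```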


walkTree


Walks through a parse tree and builds up the corresponding constraint tree.

sub walkTree(
    $$$$) 
Parameters
parseTree

The parse tree to walk.

constraint

The constraint to alter.

parenconstraint

The nearest enclosing parenthesis around this constraint.

topconstraint

The top node in the constraint tree.


Member Data

ALWAYSFALSE

If set, this constraint always returns false. This is vestigial; this flag is never set. Do not depend on this behavior.

COMPARISON

The comparison operator itself.

DEFINED

True if the token is the word "defined". Used for interpreting #if (defined(...)) statements.

DEFINEDSKIPCP

Used to prevent normal parenthesis nesting in contexts where it makes no sense.

ELSEGUTS

The contents of an else statement (from the elseContents field in the ParserState class).

ELSETREE

A constraint tree representing any nested if/#if statements or similar within the "else" side of this conditional.

FIRSTCHILD

Points to a constraint that should be treated as mandatory in order for this constraint to succeed. In other words, an "AND" clause.

For example, if you have A && B, the FIRSTCHILD chain for A points to the constraint for B.

GROUP

Set to true for the defined token and the case token (switch). This is vestigial. Do not depend on this behavior.

HASRETURNORBREAK

Cache for the hasReturnOrBreak function.

HeaderDoc::MacroFilter::VERSION

The revision control revision number for this module.

IFDECLARATION

The #if declaration itself.

IFGUTS

The contents of an if/#if statement (from the ifContents field in the ParserState class).

IFTREE

A constraint tree representing any nested if/#if statements or similar within the "if" side of this conditional.

ISPAREN

True if the constraint is for a parenthesis.

LASTJOIN

The joining operator that connects this to its immediate predecessor in the actual code. For something hanging off the NEXT tree, this is usually "||". For something hanging off the "FIRSTCHILD" tree, this is usually "&&". In the case of a parenthesis constraint, the value of LASTJOIN is an open parenthesis.

LEFTDONTCARE

The value on the left side is a symbol that was neither explicitly defined (-D flag) nor undefined (-U flag).

LEFTISSYMBOL

If set, the left side of the comparison is nonempty and was either explicitly defined or undefined by flags on the command line.

LEFTTOKEN

The actual text token from the left side of the comparison.

LEFTVALUE

The numerical value assigned to the left side of the comparison.

NEXT

The next constraint. This represents any statement that should be treated as an alternative to this one.

For example, if you have A || B, then for the constraint A, the NEXT constraint is the constraint for B.

NOT

True if the constraint's token was preceded by an exclamation point that inverts this constraint's return values.

PARENT

Points to the constraint whose FIRSTCHILD field points to this one or to a constraint in this constraint's PREVIOUS chain.

PARENTREE

The tree of constraints for contents within this parenthesis constraint.

For example, if you have the string (A || B), the top constraint represents the open parenthesis. Its PARENTREE chain points to the constraint for A (whose NEXT field, in turn, points to the constraint for B).

PARENWRAPPER

The parenthesis token that encloses the current token. Used as the fast path cache for unrolltoparen.

PREVIOUS

Points to the constraint whose NEXT field points to this one.

PREVPAREN

For a parenthesis token, a reference to the enclosing parenthesis token.

If HeaderDoc::interpret_case is set, then for a case statement, this points to the enclosing switch statement.

RIGHTDONTCARE

The value on the right side is a symbol that was neither explicitly defined (-D flag) nor undefined (-U flag).

RIGHTISSYMBOL

If set, the right side of the comparison is nonempty and was either explicitly defined or undefined by flags on the command line.

RIGHTTOKEN

The actual text token from the right side of the comparison.

RIGHTVALUE

The numerical value assigned to the right side of the comparison.

SWITCHGUTS

The contents of a switch statement (from the functionContents field in the ParserState class). Not currently used by HeaderDoc.

SWITCHTREE

A constraint tree representing any nested if/#if statements or similar within the body of a switch statement. Not currently used by HeaderDoc.

TOKENCONCAT

Contains a token that might be followed by another token as part of an operator or might live on its own. This is a temporary value used while building the constraint tree.

For example, while building up the tree, when the parser encounters an exclamation point (!), this could either negate the token after it or could be part of a not-equals operator (!=). Until it knows which, the exclamation point gets stored in the TOKENCONCAT field.

WAITINGCOMPARISON

A comparison found before any recognized tokens.

WAITINGTOKEN

A token found before any comparisons.


ALWAYSFALSE


If set, this constraint always returns false. This is vestigial; this flag is never set. Do not depend on this behavior.

$self->{ALWAYSFALSE}

COMPARISON


The comparison operator itself.

$self->{COMPARISON}

DEFINED


True if the token is the word "defined". Used for interpreting #if (defined(...)) statements.

$self->{DEFINED}

DEFINEDSKIPCP


Used to prevent normal parenthesis nesting in contexts where it makes no sense.

$self->{DEFINEDSKIPCP}

ELSEGUTS


The contents of an else statement (from the elseContents field in the ParserState class).

$self->{ELSEGUTS}

ELSETREE


A constraint tree representing any nested if/#if statements or similar within the "else" side of this conditional.

$self->{ELSETREE}

FIRSTCHILD


Points to a constraint that should be treated as mandatory in order for this constraint to succeed. In other words, an "AND" clause.

For example, if you have A && B, the FIRSTCHILD chain for A points to the constraint for B.

$self->{FIRSTCHILD}
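
Reading FIRSTCHILD as an "AND" link, a chain for the expression A && B might be laid out as in the sketch below. The plain hashes and the and_chain helper are illustrative assumptions; the placeholder null constraint that starts a real chain is omitted for brevity.

```perl
use strict;
use warnings;

# Hypothetical nodes for "A && B": B hangs off A's FIRSTCHILD link.
my $node_a = { LEFTTOKEN => 'A' };
my $node_b = { LEFTTOKEN => 'B', LASTJOIN => '&&', PARENT => $node_a };
$node_a->{FIRSTCHILD} = $node_b;

# Collect the tokens that must all hold by following FIRSTCHILD.
sub and_chain {
    my ($node) = @_;
    my @tokens;
    while ($node) {
        push @tokens, $node->{LEFTTOKEN};
        $node = $node->{FIRSTCHILD};
    }
    return @tokens;
}

my $rendered = join(' && ', and_chain($node_a));
```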

GROUP


Set to true for the defined token and the case token (switch). This is vestigial. Do not depend on this behavior.

$self->{GROUP}

HASRETURNORBREAK


Cache for the hasReturnOrBreak function.

$self->{HASRETURNORBREAK}

HeaderDoc::MacroFilter::VERSION


The revision control revision number for this module.

$HeaderDoc::MacroFilter::VERSION = '$Revision: 1298084578 $';  
Discussion

In the git repository, contains the number of seconds since January 1, 1970.
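
As a small illustration, the numeric part of the revision string can be extracted and read as an epoch timestamp. The variable names are arbitrary, and treating the value as UTC is an assumption.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Single quotes keep Perl from interpolating "$Revision".
my $version = '$Revision: 1298084578 $';

# Pull the numeric part out of the keyword-style string.
my ($seconds) = $version =~ /\$Revision:\s*(\d+)\s*\$/;

# Interpret it as seconds since January 1, 1970 (UTC assumed).
my $date = strftime('%Y-%m-%d', gmtime($seconds));
print "$seconds -> $date\n";
```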


IFDECLARATION


The #if declaration itself.

$self->{IFDECLARATION}

IFGUTS


The contents of an if/#if statement (from the ifContents field in the ParserState class).

$self->{IFGUTS}

IFTREE


A constraint tree representing any nested if/#if statements or similar within the "if" side of this conditional.

$self->{IFTREE}

ISPAREN


True if the constraint is for a parenthesis.

$self->{ISPAREN}

LASTJOIN


The joining operator that connects this to its immediate predecessor in the actual code. For something hanging off the NEXT tree, this is usually "||". For something hanging off the "FIRSTCHILD" tree, this is usually "&&". In the case of a parenthesis constraint, the value of LASTJOIN is an open parenthesis.

$self->{LASTJOIN}

LEFTDONTCARE


The value on the left side is a symbol that was neither explicitly defined (-D flag) nor undefined (-U flag).

$self->{LEFTDONTCARE}

LEFTISSYMBOL


If set, the left side of the comparison is nonempty and was either explicitly defined or undefined by flags on the command line.

$self->{LEFTISSYMBOL}

LEFTTOKEN


The actual text token from the left side of the comparison.

$self->{LEFTTOKEN}

LEFTVALUE


The numerical value assigned to the left side of the comparison.

$self->{LEFTVALUE}

NEXT


The next constraint. This represents any statement that should be treated as an alternative to this one.

For example, if you have A || B, then for the constraint A, the NEXT constraint is the constraint for B.

$self->{NEXT}

NOT


True if the constraint's token was preceded by an exclamation point that inverts this constraint's return values.

$self->{NOT}

PARENT


Points to the constraint whose FIRSTCHILD field points to this one or to a constraint in this constraint's PREVIOUS chain.

$self->{PARENT}

PARENTREE


The tree of constraints for contents within this parenthesis constraint.

For example, if you have the string (A || B), the top constraint represents the open parenthesis. Its PARENTREE chain points to the constraint for A (whose NEXT field, in turn, points to the constraint for B).

$self->{PARENTREE}
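
The (A || B) example can be sketched with plain hashes, as below. The node layout and the render_paren helper are illustrative assumptions; in particular, the initial null constraint that newparenguts creates at the head of the PARENTREE chain is left out to keep the example short.

```perl
use strict;
use warnings;

# Hypothetical nodes for "(A || B)".
my $paren  = { ISPAREN => 1, LASTJOIN => '(' };
my $node_a = { LEFTTOKEN => 'A', PARENWRAPPER => $paren };
my $node_b = { LEFTTOKEN => 'B', LASTJOIN => '||',
               PREVIOUS => $node_a, PARENWRAPPER => $paren };
$node_a->{NEXT}     = $node_b;
$paren->{PARENTREE} = $node_a;

# Render the alternatives inside a parenthesis node by following NEXT.
sub render_paren {
    my ($p) = @_;
    my @alts;
    for (my $n = $p->{PARENTREE}; $n; $n = $n->{NEXT}) {
        push @alts, $n->{LEFTTOKEN};
    }
    return '(' . join(' || ', @alts) . ')';
}

my $text = render_paren($paren);
```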

PARENWRAPPER


The parenthesis token that encloses the current token. Used as the fast path cache for unrolltoparen.

$self->{PARENWRAPPER}

PREVIOUS


Points to the constraint whose NEXT field points to this one.

$self->{PREVIOUS}

PREVPAREN


For a parenthesis token, a reference to the enclosing parenthesis token.

If HeaderDoc::interpret_case is set, then for a case statement, this points to the enclosing switch statement.

$self->{PREVPAREN}

RIGHTDONTCARE


The value on the right side is a symbol that was neither explicitly defined (-D flag) nor undefined (-U flag).

$self->{RIGHTDONTCARE}

RIGHTISSYMBOL


If set, the right side of the comparison is nonempty and was either explicitly defined or undefined by flags on the command line.

$self->{RIGHTISSYMBOL}

RIGHTTOKEN


The actual text token from the right side of the comparison.

$self->{RIGHTTOKEN}

RIGHTVALUE


The numerical value assigned to the right side of the comparison.

$self->{RIGHTVALUE}

SWITCHGUTS


The contents of a switch statement (from the functionContents field in the ParserState class). Not currently used by HeaderDoc.

$self->{SWITCHGUTS}

SWITCHTREE


A constraint tree representing any nested if/#if statements or similar within the body of a switch statement. Not currently used by HeaderDoc.

$self->{SWITCHTREE}

TOKENCONCAT


Contains a token that might be followed by another token as part of an operator or might live on its own. This is a temporary value used while building the constraint tree.

For example, while building up the tree, when the parser encounters an exclamation point (!), this could either negate the token after it or could be part of a not-equals operator (!=). Until it knows which, the exclamation point gets stored in the TOKENCONCAT field.

$self->{TOKENCONCAT}
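
The deferred decision about an exclamation point can be sketched as a one-token lookahead. The function classify_bang and its output node shape are hypothetical, and the sketch handles only "!" (not doubled negation or other multi-character operators).

```perl
use strict;
use warnings;

# Hypothetical sketch: hold "!" until the next token shows what it means.
sub classify_bang {
    my (@tokens) = @_;
    my @out;
    my $tokenconcat = '';    # plays the role of the TOKENCONCAT field
    for my $tok (@tokens) {
        if ($tokenconcat eq '!') {
            $tokenconcat = '';
            if ($tok eq '=') {
                # "!" followed by "=" is the not-equals operator.
                push @out, { COMPARISON => '!=' };
                next;
            }
            # Otherwise the "!" negates the token that follows it.
            push @out, { TOKEN => $tok, NOT => 1 };
            next;
        }
        if ($tok eq '!') {
            $tokenconcat = '!';    # meaning not yet known
            next;
        }
        push @out, { TOKEN => $tok, NOT => 0 };
    }
    return @out;
}

my @compare = classify_bang('A', '!', '=', 'B');
my @negate  = classify_bang('!', 'A');
```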

WAITINGCOMPARISON


A comparison found before any recognized tokens.

$self->{WAITINGCOMPARISON}

WAITINGTOKEN


A token found before any comparisons.

$self->{WAITINGTOKEN}