Introduction
Core parser routines and parser interfaces
Discussion
The BlockParse package is a group of functions that
are used for parsing declarations in every supported
language except Python. (Support functions in this
package are used when parsing Python, but the actual
parsing of Python declarations happens in the
PythonParse
package.)
The main entry points are blockParse (used for
parsing a declaration and returning information about
what was parsed) and blockParseOutside (used for
taking both a declaration and a HeaderDoc comment and
reconciling them into a HeaderDoc object descended from
HeaderElement).
Other important functions are cpp_add (adds a C
preprocessor macro from a parse tree), cpp_add_string
(adds a C preprocessor macro from a string), and
blockParseReturnState (used for handling APIs inside
classes; it interprets a
ParserState
object hidden away inside the parse tree for the class,
returning the same results that blockParse would have
returned had it been called on the individual declaration).
Member Functions
- blockParse
The core of HeaderDoc's parse engine.
- blockParseOutside
The outer block parser.
- blockParseReturnState
The magic box.
- bracematching
Returns the closing token to match a given
opening token.
- buildCommentFromFields
Reconstructs a HeaderDoc comment from a field list.
- changeAll
Changes an array of TypeHelper objects.
- changeAllMatching
Changes matching members of an array of TypeHelper objects.
- configureAccessControlStateForClass
Configures the access control state and optional/required
state for methods and variables within a class based on
the current language and class type.
- cpp_add
Adds a C preprocessor macro to the parser.
- cpp_add_cl
Adds C preprocessor macro passed in with the -D flag
on the command line.
- cpp_add_string
Adds a C preprocessor macro to the parser.
- cpp_argparse
Parses C preprocessor arguments.
- cpp_preprocess
Performs C preprocessing on a single token.
- cpp_remove
Removes a token from the C preprocessor macros list.
- cpp_subparse
Used by cpp_argparse to recursively perform preprocessing on tokens within the
actual arguments to a macro.
- cppHashMerge
Merges CPP hashes and CPP argument hashes based on interpreting
a stack of #if ... #else ... #elif ... #endif
directives.
- cppsupers
Scrapes the C++ superclass information from a declaration.
- decomment
Strips comments out of a return type declaration.
- defParmParse
Parses #define arguments.
- empty_comment
Returns true if a field set is effectively empty.
- findMatch
Searches an array of TypeHelper objects for a matching name.
- getAndClearCPPHash
Returns the current C preprocessor hash tables and
wipes them clean for the next header.
- getLangAndSublangFromClassType
Returns the new language and language dialect based on the
token that began a class declaration.
- ignore
Returns whether a token should be ignored.
- macroRegexpFromList
Returns a regular expression for searching for macro tokens
derived from a hash table.
- mergeComplexAvailability
Merges availability from multiple sources.
- nameObjDump
Dumps an array of TypeHelper objects for debugging purposes.
- nspaces
A legacy piece of code that generates spaces for the raw
declaration.
- objForType
Returns a HeaderDoc object (Var, Enum, Typedef, CPPClass, etc.)
for a given set of type information.
- objlink
Creates "see also" references between related APIs.
- pbs
A piece of debug code that prints the brace stack.
- peekmatch
Returns the closing token that matches the token at the top
of the brace stack.
- setCPPHashes
Sets a new CPP hash and CPP argument hash in place of the existing one.
- spacefix
A legacy piece of code that adjusts spacing in the raw
declaration.
The core of HeaderDoc's parse engine.
Parameters
-
fullpath
The path to the file being parsed.
-
fileoffset
The line number where the current block begins. The
line number printed is (fileoffset + inputCounter).
-
inputLinesRef
A reference to an array of code lines.
-
inputCounter
The offset within the array. This is added to fileoffset
when printing the line number.
-
argparse
-
Set to 1 for parsing function arguments, enum constants,
or struct fields, 2 for reparsing embedded
HeaderDoc markup in a class, 0 otherwise.
This has the following effects:
Disables warnings when parsing arguments to avoid seeing them twice.
Disables C preprocessing (to avoid double-replacement).
Sets $parseTokens{assignmentwithcolon} = 2 in
AppleScript.
Disables the handling of the of and in
tokens and label keywords in AppleScript.
Forces the block parser to return only the outer name for
a type (a la $HeaderDoc::outerNamesOnly) if
argparse is 2.
-
ignoreref
A reference to a hash of tokens to ignore on all headers.
-
perheaderignoreref
A reference to a hash of tokens, generated from @ignore
HeaderDoc comments.
-
perheaderignorefuncmacrosref
A reference to a hash of tokens, generated from
@ignorefuncmacro HeaderDoc comments.
-
keywordhashref
A reference to a hash of keywords.
-
case_sensitive
Boolean value that controls whether keywords should
be processed in a case-sensitive fashion.
-
lang
The language family to use in parsing. Overrides
HeaderDoc::lang.
-
sublang
The language variant to use in parsing. Overrides
HeaderDoc::sublang.
Return Value
Returns the array ($inputCounter, $declaration, $typelist, $namelist, $posstypes, $value, @pplStack, $returntype, $privateDeclaration, $treeTop, $simpleTDcontents, $availability) to the caller.
Discussion
Most of the variables used by this parser are things that are
used for determining what type of declaration we just parsed.
Such variables are stored as keys in the $parserState
variable. For more information about these variables, see
the documentation for the
ParserState class.
This parser consists of four parsers running in parallel:
The namePending parser — looks for names a certain number of
non-keyword tokens after keyword tokens like struct. Used mainly for
data structures.
The startOfDec parser — looks for names based on the number of
tokens since the start of the declaration (SOD/SODEC). Used for
functions, etc.
The parameter list parser.
The callback name parser — uses parameter list parse results.
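As a rough illustration of the namePending idea, the sketch below scans a token list and reports the first non-keyword word that follows a naming keyword. It is written in Python rather than HeaderDoc's Perl, and the keyword sets, the function name, and the "one token after the keyword" rule are all assumptions made for the example; the real parser tracks a configurable count inside the parser state.

```python
# Hypothetical keyword tables; the real parser gets these from its keyword hash.
KEYWORDS = {"struct", "union", "enum", "typedef", "const"}
NAME_KEYWORDS = {"struct", "union", "enum"}  # keywords that announce a name


def find_pending_names(tokens):
    """Return (keyword, name) pairs for names that follow a naming keyword."""
    names = []
    pending = None  # the keyword whose name we are still waiting for
    for token in tokens:
        if token in NAME_KEYWORDS:
            pending = token
        elif pending and token.isidentifier() and token not in KEYWORDS:
            names.append((pending, token))
            pending = None
        elif token in ("{", ";"):
            # The body or the end of the declaration arrived first:
            # the data structure is anonymous.
            pending = None
    return names
```

For the tokens of `typedef struct point { int x; } point_t;` this reports `("struct", "point")`; the trailing `point_t` is the kind of name the start-of-declaration parser is described as handling.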
Local Variables
External variables
HeaderDoc::parseIfElse
Enables parsing of if/else
statements. Not used by HeaderDoc; used by other
tools that share this parser.
HeaderDoc::fileDebug
Set to 1 by the outer layers when the filename
matches a particular filename. This is useful
when you need to enable lots of debugging for a
single file. When 1, enables lots of debugging.
HeaderDoc::lang
The programming language being parsed. This is
deprecated, and is used only if you do not pass
in a value for the lang parameter.
HeaderDoc::sublang
The programming language dialect being parsed
(e.g. cpp for C++). This is
deprecated, and is used only if you do not pass
in a value for the sublang parameter.
HeaderDoc::AccessControlState
The current access control state (public, private,
protected, etc.). When a permanent access control
change (with a colon after it) occurs, this global
variable is modified. After a declaration, the
temporary (per-declaration) access control state is
restored from this variable.
HeaderDoc::parsing_man_pages
Set to 1 if (in C) you want a function declaration
to end after the closing parenthesis even if there
is no trailing semicolon. Do NOT set this for
normal parsing; it will break many typedef
declarations and similar. This also enables
some broken man page detection for deleting lines
that say "or" and "and".
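The interplay between the permanent and the per-declaration access control state described for HeaderDoc::AccessControlState can be pictured with a small Python model. The class name, method names, and the default of "public" are assumptions made for the sketch; only the save-and-restore behavior comes from the description above.

```python
class AccessControl:
    """Toy model of permanent versus per-declaration access control."""

    def __init__(self, default="public"):
        # Plays the role of the HeaderDoc::AccessControlState global.
        self.permanent = default
        # Temporary state that applies to the declaration being parsed.
        self.current = default

    def see_keyword(self, keyword, followed_by_colon):
        """Handle an access control token such as public or private."""
        self.current = keyword
        if followed_by_colon:
            # A change with a colon after it is permanent.
            self.permanent = keyword

    def end_declaration(self):
        """Restore the temporary state from the permanent one."""
        self.current = self.permanent
```

A token such as `private:` changes both values, while a one-off modifier changes only `current` and is undone by `end_declaration()`.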
Key parser state variables
continue
Indicates that parsing should continue. Upon receiving a terminating token,
this gets set to zero, and parsing ends at the end of the line.
continue_no_return
This is set when we see an opening brace at the start of parsing. If the
parser returned, you would get a bogus declaration, so instead, the parser
reboots itself, starting parsing from scratch at the next line.
lang
The programming language being parsed. Set from
HeaderDoc::lang.
sublang
The programming language dialect being parsed (e.g. cpp for C++).
Set from HeaderDoc::sublang.
callback_typedef_and_name_on_one_line
Legacy formatting cruft variable.
inRegexp
-
Indicates whether the parser is in a regular expression. Values are:
0 — Not in a regex (or in the tail of a regex).
1 — In the second part of a two-part regexp, or the only
part of a one-part regexp.
2 — Between the two parts; only occurs if
the separator is neither '|' nor '/'. Otherwise,
this state gets skipped.
3 — In the first part of a regular expression
after the first separator.
4 — Before the first separator. This state
ends instantly unless there is a prefix.
inRegexpFirstPart
-
When parsing regular expressions, the contents of the right side are
largely unparsed (no parenthesis or bracket interpolation, for example).
Thus, it is important to know whether you are in the left side or the
right side during parsing. Unfortunately, the inRegexp
variable only indicates how many pieces remain in the regexp. Although
this is vital information, it is insufficient for this purpose.
For a single-part regexp, you would have to look for 1, but for a two-part
regexp, a 1 would indicate the last part instead of the first.
This variable solves that problem.
Values are:
0 — Not in the first part of a regular expression.
1 — In the first part of a regular expression.
2 — Before the first part of a regular expression.
The value is 2 up to and including the leading symbol (e.g. /).
It goes to zero upon reaching the symbol that terminates the first
part of the regular expression (e.g. /).
inRegexpCharClass
-
In a regular expression character class, the first character
behaves differently; a closing bracket as the first character
in a character class is treated as a literal. (For example,
[]] is a character class containing only a close bracket.)
To support this, the inRegexpCharClass variable has
several values:
0 — Not in a character class.
1 — In a character class (not at the beginning).
2 — The first character of a character class. (Reduced
to 1 at end of token loop.)
3 — Just saw the opening bracket. (Reduced to 2 at end
of token loop.)
4 — In a nested character class. (Reduced to 1
after closing :] mark.)
5 — In a nested character class after possible trailing colon.
(Reduced to 1 if next character is a right bracket.)
6 — In a nested character class at possible trailing colon.
(Reduced to 5 at end of token loop.)
regexpNoInterpolate
Certain regular expression commands don't result in any parsing
within them (e.g. the tr command). If set, this is equivalent to
setting inRegexpFirstPart to 0.
leavingRegexp
In the trailing part of a regular expression.
inParen
Indicates the number of levels of nested parentheses the current
token is within.
inPType
Indicates that the parser is currently processing a Pascal type declaration.
ppSkipOneToken
Used to tell the parameter parsing code to skip the end-of-comment
character. (The value of inComment (in the
parserState object) goes to 0 before that code, so
without this, it would end up at the start of the next parameter.)
asConcat
In AppleScript parsing, set to 1 when a vertical pipe operator (|) is
encountered to protect an identifier. Set to 0 when the next vertical
pipe operator is encountered.
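The countdown used by inRegexpCharClass (3, then 2, then 1) is easier to follow as code. The Python sketch below models only states 0 through 3 and treats each character as one token; nested classes (states 4 to 6), escapes, and everything else the real parser handles are deliberately left out, and the function name is hypothetical.

```python
def char_class_states(pattern):
    """Return (character, state) pairs, where state is the value in effect
    after the character is handled and before the end-of-loop reduction."""
    state = 0
    trace = []
    for ch in pattern:
        if state == 0 and ch == "[":
            state = 3  # just saw the opening bracket
        elif state == 1 and ch == "]":
            state = 0  # a non-leading ']' closes the class
        trace.append((ch, state))
        # End-of-token-loop reductions: 3 becomes 2, and 2 becomes 1.
        if state in (2, 3):
            state -= 1
    return trace
```

For the pattern `[]]`, the first `]` is seen while the state is 2, so it is kept as a literal, and only the second `]` closes the class.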
Parameter parsing
Token variables
curline
The (input) line being parsed.
part
The current token being processed (from curline).
nextpart
The token after the token being processed (from curline).
treepart
In some cases, it is necessary to drop a token for formatting purposes but keep it in
the parse tree. When this is needed, the treepart variable contains
the original token, and the part variable contains a placeholder value
(generally a space).
lastchar
This variable is rather odd. The last token in this string is the last character,
but it may contain multiple characters. This should probably not be used in the
parser, but it is used in a few spots.
lastnspart
The last non-space token encountered.
lasttoken
The last token encountered (though newlines and carriage returns may be
replaced by a space).
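A minimal Python sketch of how lasttoken and lastnspart might be maintained across the token loop follows. The function name is hypothetical and the tokenizer is assumed to have run already; the sketch only shows the two update rules described above (line endings recorded as a space, whitespace never recorded as the last non-space token).

```python
def track_tokens(tokens):
    """Return (part, lasttoken, lastnspart) as seen when each part is processed."""
    lasttoken = ""
    lastnspart = ""
    history = []
    for part in tokens:
        history.append((part, lasttoken, lastnspart))
        # Newlines and carriage returns are remembered as a single space.
        lasttoken = " " if part in ("\n", "\r") else part
        # Only tokens with non-whitespace content update lastnspart.
        if part.strip():
            lastnspart = part
    return history
```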
Parser states and parser state insertion
parserState
The ParserState
object used for storing most of the parser state variables.
sethollow
This variable is normally 0. It gets set to 1 to tell the hollow insertion code
(at the bottom of the token loop) to set the value of the hollow
variable (in the parserState object) to the tree node for the current
token (which has not been created yet at the time this variable gets set).
hollowskip
Indicates that in spite of sethollow being set to 1, the current node is a bad place
to insert the parser state because it is one of the access control tokens (e.g.
public/private) or because it isn't really being inserted into the tree.
pushParserStateAfterToken
Normally 0. Set to 1 if the parser state should be pushed onto the stack
after this token.
pushParserStateAfterWordToken
Normally 0. Set to 1 if the parser state should be pushed onto the stack
after the next word token. May also be set to 2 if the parser state
should be pushed at the word token after the next word token.
pushParserStateAtBrace
Normally 0. Set to 1 if the parser state should be pushed onto the stack
after the next opening brace.
occPushParserStateOnWordTokenAfterNext
-
Normally 0. The name of this variable is slightly misleading. When used,
the variable is initially set to 2. On the next word token (and only a word
token), this variable is decremented to 1.
At this point, matching behavior changes, and the parser state is pushed
at the first token that is either a word token, an at sign (@), a
minus sign (-), or a plus sign (+).
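The two-stage behavior of occPushParserStateOnWordTokenAfterNext can be sketched as a small countdown. This is an illustrative Python model, not the parser's code: the function name and the definition of a word token (one or more word characters) are assumptions.

```python
import re


def is_word(token):
    """Assumed definition of a word token: one or more word characters."""
    return re.fullmatch(r"\w+", token) is not None


def find_push_point(tokens):
    """Return the index of the token at which the parser state would be pushed,
    or None if that point is never reached."""
    counter = 2  # the variable is initially set to 2
    for index, token in enumerate(tokens):
        if counter == 2:
            if is_word(token):
                counter = 1  # only a word token triggers the decrement
        elif counter == 1:
            # Matching is now looser: a word token, '@', '-', or '+'.
            if is_word(token) or token in ("@", "-", "+"):
                return index
    return None
```

With the tokens `NSString`, a space, and `-`, the push point is the `-` token, which is why the description calls the variable name slightly misleading.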
Tree management
treeTop
The top of the current parse tree.
treeCur
The current position in the parse tree.
treeNest
-
Used to control whether the code at the bottom of the token loop should trigger a loop
nesting after the current token.
0 — tokens after this one should be siblings of this one.
1 — tokens after this one should be nested as children of this node.
2 — tokens after this one should be nested as children of this node
and this node has already been inserted into the tree, so it should not be
inserted again at the bottom of the loop.
treeSkip
This gets set to 1 if the current part should not be inserted into the parse tree
(generally because it has already been inserted in some form during parsing).
treePopOnNewLine
This indicates that the current position in the parse tree should be popped from
the treeStack stack after the next newline character.
trailingHide
Indicates that this is a token that follows a state change to a new state in which
the seenBraces flag was previously set, and that this token should be treated as
though seenBraces were still set. This flag is only supported in bits of code after
where it is first set (in the right closing brace code).
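To make the sibling-versus-child distinction behind treeNest concrete, here is a toy tree builder in Python. The Node class, the choice of `{` as the only nesting token, and the placement of `}` as a sibling of its `{` are simplifying assumptions for the sketch and are not taken from the ParseTree implementation.

```python
class Node:
    """Minimal stand-in for a parse tree node."""

    def __init__(self, token):
        self.token = token
        self.children = []


def build_tree(tokens):
    top = Node("")  # stands in for treeTop
    cur = top       # parent under which the next token is inserted
    stack = []      # stands in for treeStack
    for token in tokens:
        # 1 means later tokens nest under this one; 0 means they are siblings.
        tree_nest = 1 if token == "{" else 0
        if token == "}" and stack:
            cur = stack.pop()  # leave the nested level before inserting
        node = Node(token)
        cur.children.append(node)
        if tree_nest:
            stack.append(cur)
            cur = node
    return top
```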
Parser stacks
regexpStack
Stack for regular expression characters.
braceStack
Stack for brace tokens, including the left curly brace, the start-of-template
(sotemplate) value, the left square bracket, the left parenthesis
and the opening class marker for class markers that aren't followed by a left
curly brace (Objective-C @interface, for example).
parsedParamParseStack
A stack containing values from parsedParamParse (in
the parserState object). These are
pushed and popped on curly braces, parentheses, etc. This is basically used
for keeping track of which split character to use as the parser goes into
deeper nesting levels (e.g. when dropping into a function pointer/callback
inside a struct).
treeStack
A stack of parse trees. These are pushed and popped at various points during
the parse process as braces, colons, parentheses, etc. The behavior is
controlled by the variables treeNest, treeSkip,
treePopTwo (in parserState), and
treePopOnNewLine.
Legacy junk variables
prespace
Temporary variable used for leading space during formatting.
prespaceadjust
Temporary variable used for leading space during formatting.
scratch
Temporary storage used during formatting.
curstring
The string currently being parsed. Was at one time used
for checking for quoting, but no longer.
continuation
An obscure spacing workaround.
forcenobreak
An obscure spacing workaround.
setNoInsert
When set to a nonzero value, the noInsert variable in the ParseTree
object created after the next open curly brace gets set to this value.
The outer block parser
Parameters
-
apiOwner
The API owner object (class, header, etc.)
into which new declarations should be inserted.
-
fullpath
The full (possibly relative) path to the file being
parsed.
-
inFunction
Set to 1 if an @function comment
preceded this declaration.
-
inUnknown
Set to 1 if a new-style comment (with no
top-level HeaderDoc tag) preceded this declaration.
-
inTypedef
Set to 1 if an @typedef comment
preceded this declaration.
-
inStruct
Set to 1 if an @struct comment
preceded this declaration.
-
inEnum
Set to 1 if an @enum comment
preceded this declaration.
-
inUnion
Set to 1 if an @union comment
preceded this declaration.
-
inConstant
Set to 1 if an @constant or
@const comment preceded this
declaration.
-
inVar
Set to 1 if an @var comment
preceded this declaration.
-
inMethod
Set to 1 if an @method comment
preceded this declaration.
-
inPDefine
-
Set to 1 if an @define comment
preceded this declaration.
Set to 2 if an @defineblock or
@definedblock comment preceded
this declaration.
-
inClass
Set to 1 if an @class comment
preceded this declaration.
-
inInterface
Set to 1 if an @interface comment
preceded this declaration.
-
blockOffset
The line number where the current block begins. The
line number printed is (blockOffset + inputCounter).
-
categoryObjectsref
A reference to the initial array of category
(HeaderDoc::ObjCCategory) objects.
New category objects are added to this array.
-
classObjectsref
A reference to the initial array of class
(HeaderDoc::CPPClass and
HeaderDoc::ObjCClass) objects.
New class objects are added to this array.
-
classType
-
The class type, based on what class was
last parsed. Used when parsing fragments
within a class. Legal values are
intf, occ,
occCat, or any value that
is valid for sublang.
This is used to determine whether to treat the
@method tag as an Objective-C method
(HeaderDoc::Method) or as a normal
method (HeaderDoc::Function).
-
cppAccessControlState
The new access control state (public, private, etc.).
It is named cpp because at the time it was named, the
only language that required it was C++ (where
sublang = "cpp").
-
fieldsref
An array of fields returned from a call to
stringToFields on a HeaderDoc comment.
-
functionGroup
The function group currently in effect.
-
headerObject
The header object that will eventually contain any
objects produced.
-
inputCounter
The offset within the array. This is added to
blockOffset when printing the line number.
-
inputlinesref
A reference to an array of code lines.
-
lang
The language family to use in parsing. Overrides
HeaderDoc::lang.
-
nlines
The number of lines in inputlinesref.
-
preAtPart
Text before the initial @ in the
preceding HeaderDoc comment. Contains the
discussion in a new-style comment. Otherwise,
contains whitespace.
-
xml_output
Set to 1 if output should be in XML format, else 0.
This sets the outputformat value on new
objects.
-
localDebug
Set to 1 to enable lots of general debug spew.
-
hangDebug
Set to 1 to enable lots of debug spew specific to
tracking down infinite loops.
-
parmDebug
Set to 1 to enable lots of debug spew specific to
parameter handling.
-
blockDebug
Set to 1 to debug block handling (both define blocks
and blocks wrapped in C preprocessor macros).
-
subparse
Set to 1 to use subparse mode (handling a declaration
extracted out of an existing parse tree).
-
subparseTree
The source parse tree in subparse mode. Ignored
otherwise.
-
nodec
No longer used. Always pass zero.
-
allow_multi
Pass 1 to allow blocks to be created when a
#if statement is found immediately
after a HeaderDoc comment. Pass 0 to disable this
feature.
-
subparseCommentTree
Used in block mode because subparseTree is empty by
definition when the comment precedes the declaration.
-
sublang
The language variant to use in parsing. Overrides
HeaderDoc::sublang, which previous versions of this
function used. Currently optional.
-
hashtreecur
-
A HashObject instance
that reflects the current position in the CPP hash tree.
This is used by the parser to manage the C preprocessor
hash tables in the presence of #if directives.
For a detailed explanation, see the documentation for the
HashObject class.
Although this is optional, if you don't pass these correctly,
you won't get support for #if/#else/#endif blocks.
-
hashtreeroot
-
A HashObject instance
that represents the root of the CPP hash tree.
This is used by the parser to manage the C preprocessor
hash tables in the presence of #if directives.
For a detailed explanation, see the documentation for the
HashObject class.
Although this is optional, if you don't pass these correctly,
you won't get support for #if/#else/#endif blocks.
Return Value
Returns the array ($inputCounter, $cppAccessControlState, $classType, @classObjects, @categoryObjects, $blockOffset, $numcurlybraces, $foundMatch, $lang, $sublang).
inputCounter
- The new value for inputCounter, adjusted for the lines
that were parsed.
cppAccessControlState
- The new access control state (public, private, etc.).
classType
- The new value for class type, based on what class was
last parsed. Used when parsing fragments within a class.
classObjects
- A reference to an array of class objects (either
CPPClass or ObjCClass).
categoryObjects
- A reference to an array of category objects
(ObjCCategory).
blockOffset
- The new block offset (relative to inputCounter),
adjusted for the lines already parsed.
numcurlybraces
- The number of curly braces parsed. Not
particularly useful anymore.
foundMatch
- True if this pass found an object that matches
the requested type (e.g. an
@function
comment matched a function or function-like macro).
lang
- The programming language.
sublang
- The sublanguage (which may change as new
classes are parsed).
Discussion
This is the block parser API you should generally be calling if you are
reusing this code for other purposes. It parses a declaration and
returns an appropriate set of HeaderDoc objects. It includes all of
the HeaderDoc name processing voodoo. More explanation of this code
is probably in order, but there's no time right now.
Common mistakes:
Unlike blockParse, you must increment the input
counter or you risk an infinite loop. (When looping
with blockParse, you must not increment
the input counter or you will skip lines.)
Local Variables
blockmode
-
Possible values:
-
0 — Not in a block of any kind.
-
1 — Got an @defineblock comment,
but have not yet seen any #define macros.
-
2 — Got an @defineblock comment,
and have seen at least one #define macro.
3 — Got a #if macro before the first
declaration. Treat the following declarations as
a group until the corresponding #endif macro
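The blockmode values can be read as a small state machine. The Python sketch below encodes only the transitions implied by the descriptions above; the event labels are hypothetical, nested #if directives are ignored, and because this section does not say how modes 1 and 2 end, the sketch leaves them unchanged on any other event.

```python
def next_blockmode(blockmode, event):
    """Return the new blockmode after one event.

    Events are hypothetical labels: "defineblock" (an @defineblock comment),
    "define" (a #define macro), "if" (a #if before the first declaration),
    and "endif" (the corresponding #endif).
    """
    if event == "defineblock":
        return 1  # comment seen, no #define macros yet
    if event == "define" and blockmode == 1:
        return 2  # at least one #define macro seen
    if event == "if" and blockmode == 0:
        return 3  # group the declarations that follow
    if event == "endif" and blockmode == 3:
        return 0  # the group ends at the matching #endif
    return blockmode
```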
The magic box.
Parameters
-
parserState
The topmost parser state context object from blockParse.
-
treeTop
The top of the parser tree object from blockParse.
-
argparse
Set to 1 for parsing function arguments, enum constants,
or struct fields, 2 for reparsing embedded
HeaderDoc markup in a class, 0 otherwise. For more details,
see blockParse.
-
declaration
The declaration returned by blockParse. If you pass
an empty string, the declaration is obtained from the parse tree.
-
inPrivateParamTypes
Set to 1 if a C++ method with private parameters has been parsed
and the public declaration needs to be restored.
-
publicDeclaration
The public declaration to restore.
-
lastACS
The access control state when the block parser finished, including
any access control changes parsed this round.
-
forcedebug
Set to 1 to dump lots of debug information.
-
fileoffset
The base line number of the
LineRange
object containing this declaration. In subparse mode (reprocessing
a declaration embedded in a class), this value gets overwritten with
the correct value from the tree. Thus, this value is only relevant
when this function is called from blockParse itself.
-
subparse
Set to 0 when this is called from blockParse. Set to 1
when reinterpreting a parse tree obtained from a declaration within
a class.
-
definename
The token for #define. Used to determine whether to run a
separate parser to extract the #define macro parameters.
-
inputCounter
The line number relative to the start of the
LineRange
object containing this declaration. In subparse mode (reprocessing
a declaration embedded in a class), this value gets overwritten with
the correct value from the tree. Thus, this value is only relevant
when this function is called from blockParse itself.
Discussion
The block parser consists of a fairly complex
state machine. Inside it lies a complex state
object that requires further interpretation if
you want to derive any useful information from
it.
This code was originally part of the
blockParse function itself. However,
to improve class handling performance, the
code was modified to reuse the previous class
parse and extract information about each
embedded method, etc. To support this, the
parser state information needed to be stored
in the parse tree and interpreted later.
Thus, this portion was split off from the
parser to interpret the structure when needed.
This function is called in three main places:
at the end of blockParse, in the
blockParseOutside function when
reprocessing a parse tree, and at the end of
pythonParse.
Local Variables
External variables
Returns the closing token to match a given
opening token.
Parameters
-
tos
The opening symbol.
-
calledByParser
If 1, returns the original symbol and prints a warning
message on error. If 0, returns an empty string on error (with
no warning).
Discussion
This is used by peekmatch (and by other bits of code) to find the
ending token that matches a starting token for braces, parentheses,
and various other tokens that behave similarly.
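The relationship between bracematching and peekmatch can be sketched in a few lines of Python. The table of token pairs is an assumption (the real set depends on the language being parsed), as are the warning text and the behavior of peekmatch on an empty stack; the two error behaviors follow the calledByParser description above.

```python
import sys

# Hypothetical subset of opening/closing pairs.
CLOSERS = {"{": "}", "(": ")", "[": "]"}


def bracematching(tos, called_by_parser=False):
    """Return the closing token that matches the opening token tos."""
    if tos in CLOSERS:
        return CLOSERS[tos]
    if called_by_parser:
        # On error: warn and hand back the original symbol.
        print(f"warning: unknown block delimiter {tos!r}", file=sys.stderr)
        return tos
    # On error without the parser flag: empty string, no warning.
    return ""


def peekmatch(brace_stack):
    """Return the closing token for the top of the brace stack."""
    if not brace_stack:
        return ""  # assumption: nothing to match on an empty stack
    return bracematching(brace_stack[-1])
```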
Reconstructs a HeaderDoc comment from a field list.
Parameters
-
fields
An array of fields.
-
preAtPart
The part before the first @ sign (the discussion of a
new-style HeaderDoc comment, or empty for an old-style
HeaderDoc comment).
-
message
Content to use if the field set is empty.
Changes an array of TypeHelper objects.
Parameters
-
arrayRef
The array to modify.
-
element
The key in each object to modify.
-
value
The desired value for the specified key.
-
append
-
If 0, replace the existing value with $value.
If 1, append $value to the existing
value (space-delimited).
Changes matching members of an array of TypeHelper objects.
Parameters
-
arrayRef
The array to modify.
-
matchingElement
The key in each object to match.
-
matchingValue
The value for that key that, if matching, indicates the object should be modified.
-
element
The key in each object to modify.
-
value
The desired value for the specified key.
-
append
-
If 0, replace the existing value with $value.
If 1, append $value to the existing
value (space-delimited).
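The replace-or-append behavior shared by changeAll and changeAllMatching can be sketched as follows. Plain Python dictionaries stand in for TypeHelper objects, which is an assumption of the example, as is the choice to store the bare value when appending to an empty field.

```python
def change_all(objects, element, value, append=False):
    """Set or append value under the key element in every object."""
    for obj in objects:
        if append and obj.get(element):
            # Append mode: space-delimited addition to the existing value.
            obj[element] = f"{obj[element]} {value}"
        else:
            obj[element] = value


def change_all_matching(objects, matching_element, matching_value,
                        element, value, append=False):
    """Like change_all, but only for objects whose matching_element
    equals matching_value."""
    matches = [o for o in objects if o.get(matching_element) == matching_value]
    change_all(matches, element, value, append)
```

Calling `change_all_matching(objs, "name", "b", "type", "*", append=True)` touches only the entries named `b` and leaves the rest of the array as it was.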
Configures the access control state and optional/required
state for methods and variables within a class based on
the current language and class type.
Adds a C preprocessor macro to the parser.
Parameters
-
parseTree
The parse tree for the macro in question.
-
dropdeclaration
True if the declaration's contents should be omitted entirely.
Adds C preprocessor macro passed in with the -D flag
on the command line.
Adds a C preprocessor macro to the parser.
Parameters
-
string
The string form of the macro in question.
-
dropdeclaration
True if the declaration's contents should be omitted entirely.
Parses C preprocessor arguments.
Parameters
-
name
The name of the C preprocessor macro for which these arguments are the parameters.
-
linenum
The line number where this line appears. Used for determining which #define directives
apply at that point in time.
-
arglistref
An array containing a parse tree for each actual parameter to this instance of the C
preprocessor macro (in order of occurrence).
Performs C preprocessing on a single token.
Parameters
-
part
The part to process.
-
linenum
The line number where the part appears. Used for determining which #define directives
apply at that point in time.
Return Value
Returns the array ($newtoken, $hasargs, @arguments)
Discussion
Much of the actual processing happens in the caller. For simple substitutions, this
returns the updated part. For function-like macros, this returns true for the
hasargs value and also returns an array of argument names for use when processing
the contents of the macros. In practice, that third value is never used.
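The shape of the three-part return value is easier to see in a small model. In this Python sketch the macro tables are passed in explicitly instead of being looked up by line number, and a function-like macro's token is returned unchanged; both choices are assumptions of the example, not documented behavior.

```python
def cpp_preprocess(part, macros, macro_args):
    """Return (newtoken, hasargs, arguments) for a single token.

    macros maps simple macro names to replacement text; macro_args maps
    function-like macro names to their formal argument names.
    """
    if part in macro_args:
        # Function-like macro: the caller must collect the actual arguments.
        return part, True, list(macro_args[part])
    if part in macros:
        # Simple substitution: return the updated part.
        return macros[part], False, []
    # Not a macro: the token passes through untouched.
    return part, False, []
```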
Removes a token from the C preprocessor macros list.
Discussion
Used with availability macros so that the C preprocessor doesn't
strip the availability macro tokens out before the parser sees
them.
Used by cpp_argparse to recursively perform preprocessing on tokens within the
actual arguments to a macro.
Parameters
-
tree
A parse tree for the actual argument in question.
Merges CPP hashes and CPP argument hashes based on interpreting
a stack of #if ... #else ... #elif ... #endif
directives.
Discussion
Used when processing blocks that might corrupt each other.
For example, if you have a #if ... #else ... #endif
block in which the #if side is a #define
that defines the name of a nonexistent function to an existing
function and the #else side or #elif
side is a real function definition for that same symbol name,
the C preprocessor would dutifully turn that function declaration
into a declaration for the other function. Oops.
Instead, upon entering such a block, the parser makes a backup of
the C preprocessor's working hashes (which contain C preprocessing
tokens and argument lists). This gives the preprocessor a
base state for the block. Whenever a #else or
#elif directive appears, the parser makes an
intermediate copy of the hash coming out of that block, then
resets the working hashes to the base state (prior to the initial
#if). When the closing #endif
directive appears, the parser merges all of the intermediate
(per-block) hashes together and sets the working hashes to
that value.
For a detailed explanation, see the documentation for
HashObject.
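The backup, reset, and merge sequence described above can be modeled with ordinary dictionaries. This Python sketch handles a single, non-nested conditional block, uses a hypothetical list of directive tuples as input, and lets later branches win when two branches define the same name; none of that is taken from the HashObject implementation.

```python
def run(directives):
    """Process ('define', name, value), ('use', name), ('if',), ('elif',),
    ('else',), and ('endif',) tuples. Returns (working_hash, expansions)."""
    working = {}        # working hash: macro name -> replacement
    base = None         # backup taken when the block is entered
    branch_hashes = []  # intermediate copy saved at the end of each branch
    expansions = []     # what each 'use' token expanded to
    for kind, *args in directives:
        if kind == "define":
            working[args[0]] = args[1]
        elif kind == "use":
            expansions.append(working.get(args[0], args[0]))
        elif kind == "if":
            base = dict(working)
        elif kind in ("elif", "else"):
            branch_hashes.append(working)
            working = dict(base)  # reset to the state before the #if
        elif kind == "endif":
            branch_hashes.append(working)
            working = {}
            for branch in branch_hashes:
                working.update(branch)  # merge the per-branch hashes
            base, branch_hashes = None, []
    return working, expansions
```

If the #if side defines `foo` as `bar` and the #else side uses `foo`, the use inside the #else branch is left alone, while a use after the #endif sees the merged definition.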
Scrapes the C++ superclass information from a declaration.
Discussion
This function is also used for the Java implements information.
Strips comments out of a return type declaration.
Discussion
This should only be used when handling return types. It does not handle
strings or anything requiring actual parsing. It strictly rips out
C comments (both single-line and standard).
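A comment stripper of this limited kind can be sketched with a single regular expression. The Python version below is an illustration under the same restriction stated above: it knows nothing about string literals, so it is only suitable for short fragments such as return types.

```python
import re

# Standard /* ... */ comments (possibly multi-line) and // comments.
_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def decomment(text):
    """Remove C comments from text without any real parsing."""
    return _COMMENT.sub("", text)
```

For example, `decomment("int /* a\nb */x")` returns `"int x"`.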
Parses #define arguments.
Parameters
-
declaration
The text of the declaration to parse.
-
inputCounter
The line number (for debugging purposes).
-
definename
The name of the #define.
-
braceDebug
Set to 1 to print debug info.
-
fullpath
The header file path (for debugging purposes).
Returns true if a field set is effectively empty.
Searches an array of TypeHelper objects for a matching name.
Parameters
-
arrayRef
A reference to the array to search.
-
element
The key in each object to search.
-
value
The expected value of that key.
Returns the current C preprocessor hash tables and
wipes them clean for the next header.
Returns the new language and language dialect based on the
token that began a class declaration.
Parameters
-
classtype
The class token.
Discussion
This function takes a class token (class,
@class, @interface, etc.) and returns
a lang and sublang value. Pretty trivial,
but critical.
Returns whether a token should be ignored.
Return Value
Returns the availability string if one is available.
Otherwise, returns 0 if the token is a normal token,
1 if the token is in the ignore list and should be
dropped during parsing, or 3 if the token represents
an availability macro that has arguments and thus
needs special handling.
Discussion
Checks the ignore list and availability macros.
Returns a regular expression for searching for macro tokens
derived from a hash table.
Parameters
-
nameref
A reference to a hash in which the names of the macro tokens
(e.g. #define, #if, #ifdef) are the hash keys.
-
onlywithpound
-
If 0, includes all tokens as-is.
If 1, includes only tokens that begin with a # sign and
strips off the leading #, e.g. define instead of
#define.
If 2, includes only tokens that do not begin with a # sign.
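The three onlywithpound modes can be sketched as follows. This is an illustrative Python version, not the Perl original: the function name `macro_token_pattern` is hypothetical, and sorting the alternatives longest-first (so that `ifdef` is tried before `if`) is an assumption about how one might build such a pattern safely.

```python
import re


def macro_token_pattern(names, onlywithpound):
    """Build an alternation pattern from the keys of `names`."""
    tokens = []
    for name in names:
        has_pound = name.startswith("#")
        if onlywithpound == 1:
            if not has_pound:
                continue
            name = name[1:]          # strip the leading '#'
        elif onlywithpound == 2 and has_pound:
            continue
        tokens.append(re.escape(name))
    # Longest first, so a short token never shadows a longer one.
    tokens.sort(key=len, reverse=True)
    return "(?:" + "|".join(tokens) + ")"


names = {"#define": 1, "#if": 1, "#ifdef": 1, "pragma": 1}
print(macro_token_pattern(names, 1))
```

With mode 0 the pattern covers all four keys as written, and with mode 2 it covers only `pragma`.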
Merges availability from multiple sources.
Parameters
-
orig_avail
The original availability derived from comments.
-
nodearrayref
The array of availability nodes generated from
tokens parsed by the parser.
Dumps an array of TypeHelper objects for debugging purposes.
Parameters
-
arrayRef
A reference to the array to dump.
A legacy piece of code that generates spaces for the raw
declaration.
Deprecated
This is going away eventually.
Returns a HeaderDoc object (Var, Enum, Typedef, CPPClass, etc.)
for a given set of type information.
Parameters
-
curObj
IN: The current master object from blockParseOutside. This master object
is generated based on the top-level tag in the HeaderDoc comment,
if present. (If the comment has no top-level tag, this is a generic
HeaderElement
object.)
-
typedefname
IN: The typedefname parse token (obtained from a call to
parseTokens).
-
typestring
IN: The typestring field out of the parser state object.
See the parserState class for more information.
-
posstypes
IN: The posstypes field out of the parser state object.
See the parserState class for more information.
-
outertype
IN: The primary type returned by the parser. For example, in the case of a
typedef struct declaration, the outer type would be
typedef.
-
curtype
IN: The type that we are searching for (as defined by the HeaderDoc comment).
-
classType
-
IN: The class type of the enclosing context.
OUT: The class type of the class just parsed; unchanged if the current
declaration is not a class.
-
classKeyword
IN: The class keyword from the HeaderDoc comment (e.g. for
an @class comment, the value is class). If
unspecified, the value is auto.
-
declaration
IN: The raw declaration. For classes, passed to
classTypeFromFieldAndBPinfo. Otherwise unused.
-
fieldref
IN: A reference to the array of fields from the HeaderDoc comment.
-
functionGroup
IN: The name of the current function group.
-
varIsConstant
-
IN: Probably doesn't matter.
OUT: Returns 1 if the variable declaration is a constant, else 0.
-
blockmode
IN: Nonzero if the parser is in a #define block. (For details, see
blockParseOutside.)
-
inClass
IN: Nonzero if the HeaderDoc comment began with @class.
-
inInterface
IN: Nonzero if the HeaderDoc comment began with @interface.
-
inTypedef
IN: Nonzero if the HeaderDoc comment began with @typedef.
-
inStruct
IN: Nonzero if the HeaderDoc comment began with @struct.
-
fullpath
IN: The filename with leading path parts (for debugging purposes/warnings).
-
inputCounter
IN: The position within the current text block (for debugging purposes/warnings).
-
blockOffset
IN: The offset of the current text block from the start of the file (for debugging purposes/warnings).
-
lang
IN: The programming language of the file being parsed. Used to determine whether certain
Pascal-specific keywords are active.
-
outerLocalDebug
IN: The value of localDebug in blockParseOutside. Set high for debugging.
-
functionContents
IN: The function body. Used to populate the object (if it's a function).
-
apiOwner
IN: The object into which this object will eventually be inserted. (Used to set the
appropriate field in the object; this function does NOT add the object to the
apiOwner object in any way.)
-
subparseInputCounter
IN: An override for the inputCounter field used when doing a subparse (handling a
parse tree that has already been parsed once). Leave unset normally.
-
subparseBlockOffset
IN: An override for the blockOffset field used when doing a subparse (handling a
parse tree that has already been parsed once). Leave unset normally.
-
extendsClass
IN: The superclass name (obtained from the block parser).
-
implementsClass
IN: The name of the class that this class implements (Java-specific, obtained from
the block parser).
-
alwaysProcessComment
IN: Indicates that the processComment() call should be made on the resulting object
even if curtype is UNKNOWN (meaning that the comment would normally get processed
later in blockParseOutside). Used only in the case of a conversion request in
blockParseOutside.
Return Value
Returns the array ($extra, $classType, $varIsConstant).
Discussion
This logic got so large that it was too much of a pain to maintain
in two places in blockParseOutside, hence the separate function.
Creates "see also" references between related APIs.
Parameters
-
listref
A reference to an array of objects to be cross-linked.
Discussion
When the parser sees, for example, an @typedef
comment, followed by a struct, followed by a
typedef, it treats these as related APIs and
automatically associates the comment with both
declarations. This function links those together at
the end.
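The cross-linking step can be pictured with a small Python sketch. Objects are modeled as dictionaries, and the function name `cross_link` and the `see_also` key are hypothetical; the actual HeaderDoc objects store their references differently.

```python
def cross_link(objects):
    """Give every object a 'see also' entry for each of the other objects."""
    for obj in objects:
        for other in objects:
            if other is not obj and other["name"] not in obj["see_also"]:
                obj["see_also"].append(other["name"])


# A struct and a typedef documented by the same @typedef comment.
related = [
    {"name": "MyStruct", "kind": "struct", "see_also": []},
    {"name": "MyType", "kind": "typedef", "see_also": []},
]
cross_link(related)
print(related)
```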
A piece of debug code that prints the brace stack.
Discussion
This does nothing unless localDebug is set to 1 inside the function.
This should probably be revisited to key off something
in the calling function.
Returns the closing token that matches the token at the top
of the brace stack.
Parameters
-
ref
A reference to the brace stack array.
-
fullpath
The path of the current header. Used for error
messages.
-
linenum
The current line number within the header. Used for error
messages.
Discussion
This is a variant of peek.
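A peek that returns the matching closer rather than the opener itself might look like the Python sketch below. The `CLOSERS` table, the function name `peek_match`, and the warning printed for an empty stack are all assumptions made for illustration; the real function delegates to the parser's own brace-matching logic.

```python
import sys

# Hypothetical opening-to-closing token table.
CLOSERS = {"{": "}", "(": ")", "[": "]", "/*": "*/"}


def peek_match(stack, fullpath, linenum):
    """Return the closer for the top of `stack` without popping it."""
    if not stack:
        # fullpath and linenum are only used for the error message.
        print(f"{fullpath}:{linenum}: warning: brace stack is empty",
              file=sys.stderr)
        return None
    return CLOSERS.get(stack[-1])


stack = ["{", "("]
print(peek_match(stack, "example.h", 12))
```

The stack is left unchanged, which is what makes this a peek rather than a pop.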
Sets a new CPP hash and CPP argument hash in place of the existing ones.
Parameters
-
cpphashref
A reference to the new CPP symbol hash.
-
cpparghashref
A reference to the new CPP argument hash.
A legacy piece of code that adjusts spacing in the raw
declaration.
Deprecated
This is going away eventually.
Member Data
- CPP_ARG_HASH
C preprocessor argument hash for the current header.
- CPP_HASH
C preprocessor token hash for the current header.
- HeaderDoc::BlockParse::VERSION
The revision control revision number for this module.
- HeaderDoc::hideIDLAttributes
Controls whether IDL attributes (e.g. [foo]) should be hidden
in HTML output.
- HeaderDoc::includeFunctionContents
Tells the block parser to include the function body
in the parse tree.
- HeaderDoc::inputCounterDebug
Global variable that turns on input counter debugging in
various parts of the code.
- HeaderDoc::OptionalOrRequired
Stores whether Objective-C protocol methods are optional or required.
- HeaderDoc::useParmNameForUnlabeledParms
Change this to 0 if you want to hide the parameter name
for unlabeled parameters (old behavior).
C preprocessor argument hash for the current header.
Discussion
The argument hash contains a mapping of C preprocessor token
names to their argument lists. For example, if you have
the following define:
#define FOO(x, y) (x + (3 * y))
then the C preprocessor argument hash would contain a key called
FOO with a (string) value of x, y.
C preprocessor token hash for the current header.
Discussion
The token hash contains a mapping of C preprocessor tokens
to their values. For example, if you have the following define:
#define FOO(x, y) (x + (3 * y))
then the C preprocessor token hash would contain a key called
FOO with a (string) value of (x + (3 * y)).
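To make the relationship between the two hashes concrete, here is a minimal Python sketch that fills both from a single #define line. The function name `record_define` and the regular expression are hypothetical and far simpler than the real parser (no line continuations, no comments); only the resulting key and value layout follows the description above.

```python
import re

# Name, optional "(args)" immediately after the name, then the body.
_DEFINE_RE = re.compile(r"#\s*define\s+(\w+)(?:\(([^)]*)\))?\s*(.*)")


def record_define(line, cpp_hash, cpp_arg_hash):
    """Store a macro's body in cpp_hash and its arguments in cpp_arg_hash."""
    m = _DEFINE_RE.match(line.strip())
    if m is None:
        return False
    name, args, body = m.groups()
    cpp_hash[name] = body
    if args is not None:
        # Only function-like macros get an argument hash entry.
        cpp_arg_hash[name] = args.strip()
    return True


cpp_hash, cpp_arg_hash = {}, {}
record_define("#define FOO(x, y) (x + (3 * y))", cpp_hash, cpp_arg_hash)
print(cpp_hash["FOO"])      # (x + (3 * y))
print(cpp_arg_hash["FOO"])  # x, y
```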
The revision control revision number for this module.
$HeaderDoc::BlockParse::VERSION = '$Revision: 1333753010 $';
Discussion
In the git repository, this contains a timestamp expressed as
the number of seconds since January 1, 1970.
Controls whether IDL attributes (e.g. [foo]) should be hidden
in HTML output.
Discussion
By default, these tokens are hidden. Because this switch
is unlikely to ever be used by anyone, it can be set only
by changing the default value in BlockParse.pm
from 1 to 0.
Tells the block parser to include the function body
in the parse tree.
Global variable that turns on input counter debugging in
various parts of the code.
Stores whether Objective-C protocol methods are optional or required.
Change this to 0 if you want to hide the parameter name
for unlabeled parameters (old behavior).
Discussion
Historically, HeaderDoc left out unlabeled parameters
in constructing Objective-C method names. If you
want that behavior, change this value. This is not
an end-user-tunable parameter (without changing the
code) because it doesn't seem likely that many
people will want to change this behavior.
Last Updated: Saturday, August 06, 2016