Introduction
Core data structure for the parser.
Discussion
The ParserState object represents an almost-complete
view of the state machine inside the parser.
(There are a few local variables in the parser that
contain additional transient state information.)
ParserState object instances are routinely stored
on a stack to provide the ability to fully parse
and interpret one declaration that appears inside
another declaration (variable declarations within
the parameter list of a function, for example).
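The stack usage described above can be pictured with a short sketch. It is written in Python purely for illustration and does not reproduce the module's Perl code; the names ParserStateSketch, enter_nested, and leave_nested are hypothetical, and the only point shown is that the outer state is saved while a nested declaration is parsed and restored afterward.

```python
from dataclasses import dataclass, field


@dataclass
class ParserStateSketch:
    """Hypothetical stand-in for a ParserState object."""
    name: str = ""
    parsed_params: list = field(default_factory=list)


state_stack = []  # saved outer states


def enter_nested(outer):
    """Save the outer state and start a fresh one for the nested declaration."""
    state_stack.append(outer)
    return ParserStateSketch()


def leave_nested(inner):
    """Restore the outer state and record the nested declaration's name on it."""
    outer = state_stack.pop()
    outer.parsed_params.append(inner.name)
    return outer


# Parse a variable declaration "count" inside the parameter list of myFunction.
current = ParserStateSketch(name="myFunction")
current = enter_nested(current)
current.name = "count"
current = leave_nested(current)
```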
Member Functions
- _initialize
Initializes an instance of a ParserState object.
- addBackslash
Increments the backslash counter.
- braceCount
Returns the number of tokens currently on the brace stack.
- dbprint
Prints object for debugging purposes.
- free
Releases resources associated with a parser state object.
- isContinuationLine
Returns whether the current line is a Python continuation line.
- isLeftBrace
Returns whether or not this token should be
treated as a left brace.
- isQuoted
Returns whether the current token is quoted (escaped), based on the backslash counter.
- isRubyCloseQuote
Returns whether a token should be interpreted as
a Ruby close quote mark.
- isRubyOpenQuote
Returns whether a token should be interpreted as
a Ruby open quote mark.
- new
Creates a new ParserState object.
- peekBrace
Looks at the top token on the brace stack.
- peekBraceMatch
Looks at the top token on the brace stack and
returns the closing token that would match it.
- popBrace
Pops a token off of the brace stack and returns it.
- print
Alias for dbprint.
- pushBrace
Pushes a token onto the brace stack.
- resetBackslash
Resets the backslash counter to zero.
- rollback
Rolls back the parser state to the last state
saved by a call to rollbackSet.
- rollbackSet
Creates a clone of the object for future rollbacks.
- setHollowWithLineNumbers
Sets the hollow field in this object,
and sets the input counter and block offset values
for the tree node.
- treePop
Pops a tree from the tree stack.
- treePush
Pushes a tree onto the tree stack.
_initialize
Initializes an instance of a ParserState object.
Parameters
-
self
The object to initialize.
addBackslash
Increments the backslash counter.
Parameters
braceCount
Returns the number of tokens currently on the brace stack.
Parameters
Discussion
This is currently only used for the Python
parser. Eventually, the main parser should
be modified to share this stack instead of
using a local variable.
dbprint
Prints object for debugging purposes.
Parameters
free
Releases resources associated with a parser state object.
Parameters
-
self
The ParserState object.
isContinuationLine
Returns whether the current line is a Python continuation line.
Discussion
In Python, if you are inside a string, a multiline string,
a parenthesized expression, an array, etc., subsequent lines
are treated as part of the current line implicitly. Those
subsequent lines are called continuation lines.
A continuation line also occurs explicitly when the previous
line ends with a backslash.
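The two kinds of continuation described above look like this in Python source. The variable names are arbitrary and chosen only for this illustration.

```python
# Implicit continuation: the open parenthesis keeps the logical line going.
total = (1 +
         2 +
         3)

# Implicit continuation inside a list literal.
values = [
    10,
    20,
]

# Explicit continuation: the previous physical line ends with a backslash.
combined = total + \
    len(values)
```

In each case, every physical line after the first is a continuation line of the same logical line.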
isLeftBrace
Returns whether or not this token should be
treated as a left brace.
Parameters
-
self
This object.
-
part
The token to check.
-
lang
The programming language.
-
lbrace
The primary left brace character.
-
lbraceunconditionalre
A regular expression containing other patterns that
are always considered left braces. Currently used
for for/if in Python and Ruby, and tell in AppleScript.
-
lbraceconditionalre
In Ruby/Python, a set of tokens that are treated as
left braces unless they are immediately after a
right brace. Basically, this handles
begin/while/until when used at the end of a line
in Ruby/Python.
-
classisbrace
Set to 1 if a class declaration is treated as an
open brace. (This is not used for ObjC classes;
they are special.)
-
functionisbrace
Set to 1 if a function declaration is treated as an
open brace.
-
case_sensitive
Set to 1 for most languages. Set to 0 if the
language uses case-insensitive token matching
(e.g. Pascal).
-
curBraceCount
The current brace count. This is used to prevent
nesting of braces in languages that don't work that way.
isQuoted
Returns whether the current token is quoted (escaped), based on the backslash counter.
Parameters
-
self
This object.
-
lang
The current programming language.
-
sublang
The current language dialect.
isRubyCloseQuote
Returns whether a token should be interpreted as
a Ruby close quote mark.
Parameters
-
self
This object.
-
part
The string to check.
Discussion
The value returned depends on whether the close token
matches the open token. This is determined based on
the value stored in the inRuby variable
in this parser state instance. If not in a
Ruby string, this returns zero.
isRubyOpenQuote
Returns whether a token should be interpreted as
a Ruby open quote mark.
Parameters
-
self
This object.
-
part
The string to check.
Discussion
The value returned, if nonzero, indicates the value
that should be stored in the inRuby variable
in this parser state instance. If already in a
Ruby string, this returns zero.
new
Creates a new ParserState object.
Parameters
-
param
A reference to the relevant package object (e.g.
HeaderDoc::ParserState->new() to allocate
a new instance of this class).
peekBrace
Looks at the top token on the brace stack.
Parameters
Discussion
This is currently only used for the Python
parser. Eventually, the main parser should
be modified to share this stack instead of
using a local variable.
peekBraceMatch
Looks at the top token on the brace stack and
returns the closing token that would match it.
Parameters
Discussion
This is currently only used for the Python
parser. Eventually, the main parser should
be modified to share this stack instead of
using a local variable.
popBrace
Pops a token off of the brace stack and returns it.
Parameters
Discussion
This is currently only used for the Python
parser. Eventually, the main parser should
be modified to share this stack instead of
using a local variable.
print
Alias for dbprint.
Parameters
pushBrace
Pushes a token onto the brace stack.
Parameters
-
self
This object.
-
token
The token to push.
Discussion
This is currently only used for the Python
parser. Eventually, the main parser should
be modified to share this stack instead of
using a local variable.
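The brace stack operations (pushBrace, popBrace, peekBrace, and peekBraceMatch) can be sketched as follows. This is a Python illustration under stated assumptions, not the module's Perl code: the class name BraceStackSketch is hypothetical, and the CLOSERS table covers only three opening tokens, whereas the real stack also holds template and class markers.

```python
# Assumed mapping from opening tokens to closing tokens (illustrative only).
CLOSERS = {"{": "}", "(": ")", "[": "]"}


class BraceStackSketch:
    """Hypothetical model of the brace stack held in a parser state."""

    def __init__(self):
        self.brace_stack = []

    def push_brace(self, token):
        self.brace_stack.append(token)

    def pop_brace(self):
        return self.brace_stack.pop()

    def peek_brace(self):
        # Look at the top token without removing it.
        return self.brace_stack[-1] if self.brace_stack else None

    def peek_brace_match(self):
        # Closing token that would match the top of the stack, if known.
        return CLOSERS.get(self.peek_brace())


stack = BraceStackSketch()
stack.push_brace("(")
stack.push_brace("[")
match_for_top = stack.peek_brace_match()
popped = stack.pop_brace()
new_top = stack.peek_brace()
```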
resetBackslash
Resets the backslash counter to zero.
Parameters
rollback
Rolls back the parser state to the last state
saved by a call to rollbackSet.
Parameters
rollbackSet
Creates a clone of the object for future rollbacks.
Parameters
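The pairing of rollbackSet and rollback can be pictured as a save-and-restore of a cloned state. The sketch below is a Python illustration only; the class RollbackSketch, its fields dictionary, and the use of a deep copy are assumptions made for the example rather than a description of how the Perl module stores its clone.

```python
import copy


class RollbackSketch:
    """Hypothetical parser state with a single saved rollback point."""

    def __init__(self):
        self.fields = {"sodname": "", "inString": 0}
        self.saved = None

    def rollback_set(self):
        # Deep copy so later changes do not leak into the saved state.
        self.saved = copy.deepcopy(self.fields)

    def rollback(self):
        # Restore the most recently saved state, if there is one.
        if self.saved is not None:
            self.fields = copy.deepcopy(self.saved)


state = RollbackSketch()
state.fields["sodname"] = "foo"
state.rollback_set()
state.fields["sodname"] = "bar"
state.rollback()
```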
setHollowWithLineNumbers
Sets the hollow field in this object,
and sets the input counter and block offset values
for the tree node.
Parameters
-
self
This object.
-
treeCur
The tree node to modify, and also the tree node
that the hollow field should reference.
-
blockOffset
The block offset value to set in the tree node.
-
inputCounter
The input counter value to set in the tree node.
treePop
Pops a tree from the tree stack.
Parameters
Discussion
This is currently only used for the Python
parser. Eventually, the main parser should
be modified to share this stack instead of
using a local variable.
treePush
Pushes a tree onto the tree stack.
Parameters
-
self
This object.
-
tree
The tree to push.
Discussion
This is currently only used for the Python
parser. Eventually, the main parser should
be modified to share this stack instead of
using a local variable.
Member Data
- afterNL
A nondestructive variant of firstpastnl that is available to
any programming language (and currently used in TCL).
Set to 2 after a newline, 1 during the first non-space
token, 0 after.
- afterSemi
In shell, initially 0, set to 2 after a double-semicolon or
1 after a semicolon (but never set to 1 after it is already 2).
Reset to 0 after the first non-space token. Used in case/esac
parsing.
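The afterSemi transitions can be sketched as a small step function. This is a Python illustration with assumptions: the helper name step_after_semi is hypothetical, the double semicolon is assumed to arrive as a single token, and the reset is applied on the first non-space token itself rather than after it.

```python
def step_after_semi(after_semi, token):
    """Return the next afterSemi value for one token (hypothetical helper)."""
    if token == ";;":
        return 2
    if token == ";":
        return max(after_semi, 1)  # never downgrade 2 to 1
    if token.strip() == "":
        return after_semi          # whitespace leaves the value alone
    return 0                       # first non-space token resets the value


def trace_after_semi(tokens):
    state, history = 0, []
    for tok in tokens:
        state = step_after_semi(state, tok)
        history.append(state)
    return history


single = trace_after_semi([";", " ", "foo"])
double = trace_after_semi([";;", ";", " ", "esac"])
```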
- APIODONE
Set on parser state objects that represent declarations
within classes so that they do not get processed twice.
- ASlabel
The AppleScript label currently being parsed. Each
label is treated as a parsed parameter.
- attributeState
-
Used when parsing the GCC __attribute__
info, __asm__ declarations, and other
similar pieces of info (certain availability macros,
for example).
Legal values are:
0 — Not parsing an attribute.
1 — Just saw the leading token.
-1 — Got the leading open parenthesis.
Decremented to smaller negative values as
additional open parentheses are parsed.
Incremented towards 0 as close parentheses
are parsed. When it reaches zero, the tree
is popped up a level, and attribute parsing
is complete.
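The attributeState values listed above behave like a parenthesis-depth counter that runs negative. The following Python sketch is illustrative only: the helper step_attribute_state is hypothetical, only __attribute__ is treated as a leading token, and the behavior for a non-parenthesis token in state 1 (staying in state 1) is an assumption.

```python
def step_attribute_state(state, token):
    """Return the next attributeState value for one token (hypothetical helper)."""
    if state == 0:
        return 1 if token == "__attribute__" else 0
    if state == 1:
        return -1 if token == "(" else 1
    # state < 0: inside the parenthesized attribute body
    if token == "(":
        return state - 1
    if token == ")":
        return state + 1  # reaching 0 means attribute parsing is complete
    return state


def trace_attribute(tokens):
    state, history = 0, []
    for tok in tokens:
        state = step_attribute_state(state, tok)
        history.append(state)
    return history


history = trace_attribute(["__attribute__", "(", "(", "packed", ")", ")"])
```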
- autoContinue
In Python, this indicates the number of block
nesting levels deep the parser is (e.g. the start
of a function sets this to 1, an if statement
inside that function increases it to 2, and so on).
- availability
Contains the contents of an availability macro that was seen by the parser.
- availabilityNodesArray
Temporary storage scribbled into by blockParse.
Each token in this array is the top of a subtree
that begins with one of the "Magic" availability
macros in Availability.list (e.g.
__OSX_AVAILABLE_BUT_DEPRECATED or
__OSX_AVAILABLE_STARTING).
- backslashcount
The number of backslashes since the last non-backslash
token. Modified by resetBackslash and
addBackslash.
- basetype
The type name in a simple typedef, e.g. foo in
typedef struct foo bar;.
- bracePending
-
Normally 0.
Set to 1 if the parser is expecting a brace
at the end of the first part of a struct, union,
or enum declaration. If it gets a word token
instead, the parser is parsing a variable
declaration rather than a type declaration.
Set to 2 if the parser is expecting another
word token before changing this variable to 1.
For example, if the parser encounters a
double colon (::), the next word
token is part of the structure name, but a
subsequent word token after that would make it
a structure variable instead.
- braceStack
-
Stack for brace tokens, including the left curly brace, the start-of-template
(sotemplate) value, the left square bracket, the left parenthesis
and the opening class marker for class markers that aren't followed by a left
curly brace (Objective-C @interface, for example).
This is currently used exclusively for Python.
Other languages use a local variable in blockParse.
- callbackIsTypedef
Indicates whether the callback is wrapped in a typedef (1) or not (0).
Sets priority order of type matching (up one level in blockParseOutside).
- callbackName
The name of this callback. This takes priority over all other names,
including the sodname.
- callbackNamePending
-
In a typedef of a callback, indicates that the next word token
is the name of a callback. (Non-typedef callback names get
picked up naturally by the parameter parsing code---if a second
set of parsed parameters appear, the first set becomes the
callback name.) Values are:
0 — Normal state.
1 — Just saw leading typedef token.
2 — Saw first word after typedef.
3 — Saw parenthesis after first word. Capture
the name now.
4 — Saw name token after parenthesis.
(Further word tokens mean it's not a callback.)
5 — Saw :: after name. Continue to capture
the name here.
- categoryClass
The owning class for an Objective-C category.
- cbsodname
-
When a second open parenthesis is encountered in parsing
the callback name, this tells the parser that it is really
seeing a function that returns a callback instead of a
callback variable. The original sodname value is stored
here, and the functionReturnsCallback flag
is set so that this value can be restored later.
If a typedef contains a second set of parentheses and is
not identified as a function returning a callback, the
name inside the first set is the callback name, so this
gets cleared.
- classIsObjC
Set to 1 when an Objective-C class token is encountered.
In addition to playing a key role in parsing decisions,
this also causes sublang to be set to
occ.
- classNameConcat
-
Set to 1 on encountering a period while parsing the name of an
IDL class. This causes the next token to be interpreted as an
additional part of the name rather than turning the whole thing
into a class instance. Set to 0 after encountering the next
token.
The bleeding of JavaScript-specific syntax into IDL files is
really something of an abuse of the language, but supporting
it is necessary to parse certain content.
- classNameFound
Set to 1 after a class name has been parsed. (Set back
to 0 if double colons are seen.) If a second word token
is encountered in this state, it's a variable instead of
a class (e.g. class foo *foo_instance;).
- classtype
Contains the token that began the current class
declaration with any leading @ sign merged.
Returned to the caller.
- conformsToList
Stores the list of classes to which this protocol
conforms. This variable contains a string.
- constKeywordFound
Set to 1 after the const keyword is found.
- cppMacroHasArgs
Indicates that the #define macro described by the parser state
object has an argument list associated with it. Used to
determine the definetype attribute for the macro in XML output.
- curvarstars
-
Temporary storage for asterisks before each variable
name in a declaration with more than one name.
This variable is reset to empty when the parser
encounters a comma in such a declaration.
See curvarstars for more information.
- declarationEndsAtNewLine
TCL variables, AppleScript variables, and TCL
functions end at a newline character. When
these are detected (by token matching), this
variable is set to 1.
- elseContents
The contents of the else part of an if/else conditional.
Only valid if $HeaderDoc::parseIfElse is 1.
- endgame
-
In Python, this variable determines whether the
declaration is done after this token, in which case
a new parser state (sibling) must be added.
0 — Nope.
1 — In this state if we got a newline and
autoContinue is 0 (we're not in
a nested block). We're done after this token,
but it should be added to the parse tree.
2 — seenLeading is less than
leadspace. Don't add this token
to the parse tree because it's part of the
next declaration.
3 — seenLeading is less than
parentLeading. Don't add this
token to the parse tree because it's part of
the next declaration.
- endOfString
In shell (and Perl), set to the token after a << that is
treated as the start of a multi-line string. Reset to
an empty string upon leaving the multi-line string. While
in this state, inString is set to 13.
- endOfTripleQuote
-
The number of quote tokens in a row when
potentially leaving a triple-quoted string.
This value is reset to zero upon
encountering a non-quote token.
If this reaches 2, the next quote mark causes
the three quotes to be combined into a single
token, and the value is reset to 0.
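The quote-counting rule above can be sketched as a pass over a token list. This is a Python illustration under assumptions: the function merge_triple_quotes is hypothetical, it ignores whether the parser is actually inside a triple-quoted string, and it edits a flat list instead of a parse tree.

```python
def merge_triple_quotes(tokens):
    """Fold three consecutive quote tokens into one (hypothetical helper)."""
    out = []
    run = 0  # plays the role of endOfTripleQuote
    for tok in tokens:
        if tok == '"':
            if run == 2:
                # Third quote in a row: replace the previous two with one token.
                del out[-2:]
                out.append('"""')
                run = 0
            else:
                out.append(tok)
                run += 1
        else:
            out.append(tok)
            run = 0  # any non-quote token resets the count
    return out


merged = merge_triple_quotes(["text", '"', '"', '"', "x"])
unmerged = merge_triple_quotes(['"', '"', "a"])
```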
- endOfTripleQuoteToken
When a quote mark is seen, the object is
added here so that the parser can easily
go back to it later if it turns out to be
a triple quote. This is used to merge the
three quote marks into a single token in
the parse tree.
- extendsClass
The name of the class that this class extends.
- extendsProtocol
Stores the name of the Objective-C protocol that
this protocol extends (the tokens within angle
brackets).
- externC
-
In C, when the extern token is encountered,
this flag is set to 1 and the rollbackSet
function is called to set a rollback point. The
declaration to date is also stored in the
preExternCdeclaration field at this
time.
If what comes after this token is "C",
then the previous declaration is restored and the
parser state is rolled back to this point.
- firstpastnl
In shell (and Perl), set to 1 after a newline until the
first non-space token.
- followingrubyrbrace
-
A while or other statement right after
an end statement (on the same line) is
treated as applying to the preceding
block instead of starting a new one.
Set to 1 when end is encountered, 0 at
following newline.
- forceClassDone
Set to 1 after reaching the left brace after a class. This
essentially tells the parser to stop appending superclass tokens
to forceClassSuper.
- forceClassName
When the parser sees a colon (indicating a superclass name is coming),
or the keywords extends or implements in Java,
etc., this gets a copy of the class name so that it doesn't get overwritten.
- forceClassSuper
Holds the superclass information after a colon token. Used in
conjunction with forceClassName.
- freezereturn
Once the parser passes the opening curly brace of a function body, the
return type information is frozen. This prevents other things that look
too much like function declarations from overwriting the return type info.
- freezeStack
Copy of the pplStack when the stack is frozen by stackFrozen.
- frozensodname
A copy of the sodname variable frozen at a particular point in time.
Freezing occurs when the parser enters certain contexts like parameter parsing
because the sodname field would otherwise get overwritten by other things.
- FULLPATH
The full path for the file containing the
declaration that this parser state describes.
By storing the info here, it is available for
debug messages during subparse operations
(reprocessing declarations nested within
class declarations).
- functionContents
The contents of a function (or, when parsing a switch
statement, the contents of the struct body).
- functionReturnsCallback
-
Indicates that the parser has seen a function that
returns a callback. If set, the parser restores the
value from cbsodname into the
sodname field.
This is incremented to 2 while parsing the parameters
for the callback, and decremented back to 1 at the end.
- gatheringObjCReturnType
-
While parsing an Objective-C method, this gets
set to 1 upon seeing an open parenthesis, 2 at
the bottom of the loop. While at 2 or greater,
tokens are appended to the
occmethodreturntype variable.
This value is incremented when additional open
parentheses are encountered, and is decremented
when close parentheses are encountered. When it
reaches 1 again, it is reset to 0.
- HeaderDoc::ParserState::VERSION
The revision control revision number for this module.
- hollow
This variable holds a reference to the node in
the parse tree where the parser state should be stored when the current declaration
has been fully parsed.
- ifContents
The contents of the if part of an if/else conditional
(not including the test expression). Only valid if
$HeaderDoc::parseIfElse is 1.
- ignoreAvailabilityMacros
Set high within the definition for any of the built-in
availability macros so that those macro definitions can
be properly parsed even if they refer to other
availability macros.
- implementsClass
The name of the abstract class that this class
implements.
- inBitfield
Indicates that we are at a token that might be the start of
a C bitfield. This goes high when a colon occurs. If the next
token is a non-colon (i.e. it's not ::),
startOfDec gets reset to zero to lock the name and
stuff.
- inBrackets
Indicates the number of levels of nested square brackets the current
token is within.
- inCase
In shell, initially 0, incremented upon entering a case
statement, and decremented on exit.
- inChar
Inside a single-quoted character/string literal.
- inClass
-
Indicates whether we are in a class. Possible values are:
0 — Not in a class declaration.
1 — Enters this state when a class keyword is
encountered (except @protocol or
@interface).
2 — Enters this state when the @interface
class keyword is encountered. Returns to 1 when a colon or
close parenthesis is encountered.
3 — Enters this state on the first word token found while in state 2.
Returns to 1 when colon or close parenthesis is encountered.
- inClassConformingToProtocol
-
Set to 1 when a conforming left angle bracket (<) is seen in an
@protocol declaration.
Set to 2 after that token. While this value is 2, tokens are
gathered in the conformsToList string.
Reset to 0 upon seeing the matching right angle bracket (>).
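The three values of inClassConformingToProtocol can be traced with a short loop. The sketch is in Python for illustration; the function gather_conforms_to is hypothetical, the tokenization shown is an assumption, and tokens are concatenated without separators for simplicity.

```python
def gather_conforms_to(tokens):
    """Collect tokens between < and > into a string (hypothetical helper)."""
    state = 0
    conforms_to_list = ""
    for tok in tokens:
        if state == 1:
            state = 2  # the "<" token has been consumed
        if state == 0 and tok == "<":
            state = 1
        elif state == 2:
            if tok == ">":
                state = 0
            else:
                conforms_to_list += tok
    return conforms_to_list, state


result, final_state = gather_conforms_to(
    ["@protocol", "Foo", "<", "NSObject", ",", "NSCoding", ">"]
)
```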
- inComment
Indicates whether we are in a multi-line comment. See also
the ppSkipOneToken local variable in
blockParse.
- inEnum
Set to 1 while inside an enumeration.
- inExtends
Set to 1 when the extends keyword is encountered in
Java. Reset to 0 when an implements keyword occurs.
- inGiven
Set to 1 when a given token is seen in AppleScript. Reset to 0
at the following newline.
- INIF
Inside an if statement. Only used if the HeaderDoc::parseIfElse
variable is set to 1.
- inImplements
Set to 1 when the implements keyword is encountered
in Java. Reset to 0 when an extends keyword occurs.
- inInlineComment
-
Indicates whether we are in a single-line comment (i.e. one
beginning with a hash or two slashes).
Initial value is 4. Decremented to 3 at end of loop.
Decremented to 2 after next token, then 1, increased to 3
if 1 and saw exclamation point. I don't remember what this
code does, and it is probably wrong.
- initbsCount
Contains the number of braces on the brace stack
when this parser state was created. When the
number of braces drops below this level, this
parser state must go away.
- inLabel
-
Set to 1 when a label token is seen in AppleScript. (See the
labelregexp variable in
parseTokens for
a list of these tokens.)
Reset to 0 after the next word token, at the following newline,
or when a given token is encountered.
- inMacro
-
Indicates that the current declaration is a #define macro or similar. Values are:
0 — Not in a macro.
1 — Got leading #.
2 — Got something else after # (error case).
3 — Got #define.
4 — Got another C preprocessor token, including
#if, #ifdef,
#ifndef, #endif,
#else, #undef,
#elif, #error,
#warning, #pragma,
#import, and #include.
See also inMacroLine.
- inMacroLine
Used for handling macros in the middle of declarations.
- inMacroTail
Set high upon encountering the first whitespace after
a macro name. Once this key is set, the value of the
cppMacroHasArgs key is no longer set upon
encountering an open parenthesis.
- INMODULE
-
Indicates that the parser is in a module declaration.
Possible values are:
0 — Not in a module declaration.
1 — Saw the module token.
2 — Unused vestigial state.
3 — Unused vestigial state.
- inOfIn
Set to 1 when AppleScript of or in token
is encountered. Reset to 0 on newline or after encountering the
next word token and appending it to OfIn.
- inOperator
In a C++ operator declaration.
- inPrivateParamTypes
Set to 1 after the colon in a C++ method declaration.
Indicates that the parser is parsing the private parameter
declarations for the method.
- inProtocol
-
Possible values are:
0 — Not in a protocol.
1 — Saw @protocol token.
2 — After next word token after @protocol. Returns to
this state after closing > token.
In this state, it is capturing tokens into
the extendsProtocol field.
3 — Inside conforming angle braces (<).
- inputCounter
The input counter. Used for restoring the
value during a subparse (reprocessing a
declaration within an already-parsed class).
- inrbraceargument
-
Some languages take an additional argument for their equivalent of
a right brace. For example, in AppleScript, a tell
block ends with end tell. In effect, end
terminates the block, but the next token does not start the next
block.
If rbracetakesargument
is set in the object returned by a call to
parseTokens,
then that trailing tell is included in the
trailer for the block.
- inRuby
In a Ruby quote. Quotes in Ruby are much more complex
than in any sane language, so they get their own
variable....
- inRubyBlock
The character that began the current Ruby block. For example, the
<< token.
- inRubyClass
-
Normally 0.
Set to 1 when a Ruby class declaration is encountered.
Set to 2 when the first newline after a Ruby class is encountered.
- inString
Inside a double-quoted string literal if 1, else 0.
Set to 13 for a multi-line string (e.g. FOO <<EOF...).
- inTCLRegExpCommand
-
In TCL, set to 1 when a command is encountered that takes an
unquoted (non-string) regular expression as an argument.
Set to 0 upon entering the regular expression or when a
newline or carriage return is encountered.
- inTemplate
Within C++ template braces (< and >). Also used for
IDL bracket notation.
- inTypedef
Set to 1 while inside a C typedef.
- inUnion
Set to 1 when the union keyword is encountered. Remains high
until the end of this declaration.
- isConstructor
Set to 1 after the constructor token is seen in TCL
(or equivalent in other languages). (Not used in C++.)
- ISFORWARDDECLARATION
Indicates whether a class declaration is a forward declaration
(1) or the actual class declaration (0). That way, the
resulting object is a Var
object instead of a CPPClass
object.
- isProperty
Set to 1 after a keyword is parsed that indicates that this
variable is an Objective-C property.
- isStatic
Set to 1 when static or equivalent
(e.g. my in perl) is seen. Used to
determine whether a variable is file-scoped or
global.
- justLeftStringToken
After an empty string (""), this gets set high
in Python. That way, if the next token is
also a double quote mark, the opening triple
quote of a triple-quoted string can be easily
detected.
- kr_c_function
-
Indicates that the current code is a K&R-style C function (with separate
parameter type declarations), e.g.
int foo(a, b)
int a;
char *b;
{ ... function body ... }
- kr_c_name
Contains the name of a K&R C function. The normal function name detection
code would fail hard because of the existence of multiple declarations.
- lang
The language that the parser was parsing when
this parser state was created.
- lastNLWasQuoted
In Python, set to 1 if the last newline was preceded
by a backslash, else unset. Used to determine
whether to care about the leading whitespace count.
- lastpart
Holds the last part before the one being processed
by the Python parser. Similar to the local variable
of the same name in blockParse.
- lastsymbol
The last token, wiped by braces, parentheses, and so on. It is used primarily
for handling names of typedefs. In general, when writing code, except in a few
specific contexts, you probably want the local variable
lasttoken in blockParse instead. Also
related are the local variables lastnspart and
lastchar.
- lastTreeNode
-
The last node in the parse tree rooted at this node.
This node is marked with EODEC in parse tree dumps.
For example, the lastTreeNode value for
a class declaration would point to the closing brace
or semicolon at the end of the class.
Note that within the class, each nested
declaration also has a lastTreeNode
value that points to the end of that nested
declaration.
- leadspace
-
The number of leading spaces in the first line
since the parser state was created.
Initial value is -1 indicating that the value
has not yet been determined. This value does
not get set until the first line that
contains at least one non-space token after
that whitespace and before the trailing newline.
If the current line's leading space (in
seenLeading) drops to this level or
lower, the end of block is considered to have been
reached.
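The indentation test described above can be sketched as follows. This is a Python illustration only: the helpers leading_spaces and block_ended are hypothetical, tabs are not handled, and the first line of the block is used to establish leadspace as the text describes.

```python
def leading_spaces(line):
    """Count the leading space characters on a line."""
    return len(line) - len(line.lstrip(" "))


def block_ended(seen_leading, leadspace):
    """True when the current line's indentation closes the block (sketch)."""
    if leadspace < 0:
        return False  # -1 means the block's indentation is not known yet
    return seen_leading <= leadspace


lines = ["def f():", "    return 1", "x = 2"]
leadspace = leading_spaces(lines[0])
ended = [block_ended(leading_spaces(line), leadspace) for line in lines[1:]]
```

Here the second line is indented more deeply than the block's first line, so the block continues; the third line drops back to the same level, so the block is treated as finished.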
- leavingComment
Set to 1 on an end-of-comment token so that
the ending comment token won't get added to
the return type.
- macroNoTrunc
Set to 1 to avoid truncating the body of macros that
don't begin with a parenthesis or brace. Otherwise 0.
- MODULE
Temporary storage for the name of a module.
The module token is treated much
like an @indexgroup tag.
- name
The name of a data type parsed by the main (namePending) parser.
This is the lowest priority name; it gets overridden by the sodname name
more often than not.
- nameList
In Pascal, upon seeing a colon (after a variable name),
the sodname and sodtype
fields are concatenated together (with a space)
into this field. This later becomes the
variable name.
- namePending
-
Set to 1 when the parser expects a name:
After the keyword function, procedure, sub, or other similar
function delimiter tokens.
Set to 2 after the keyword typedef, struct, union, and so on
because the name is the second non-keyword token after this one.
Decremented at the end of the token loop.
- namepending
Python-specific parser state variable.
The initial value is 1. Set high after
a class keyword or a def keyword.
Set low after a word token (the name).
- nestAfter
Indicates that after inserting this token into the parse
tree, future tokens should be nested under this one.
- newlineIsSemi
In Ruby, an end marks the end of a function,
so treat the newline after it as the end of the declaration.
- NEXTTOKENNOCPP
-
Turns off the C preprocessor temporarily.
0 — Normal operation.
1 — Just saw #if. Goes to 3 if you get a defined token.
2 — Just saw #ifdef. Don't do C preprocessing for the symbol that follows. Goes to 0 after the next word token.
3 — In #if defined. Don't do C preprocessing for the symbol that follows, and drop back to state 1 after a word token.
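The NEXTTOKENNOCPP transitions can be traced with a small step function. This Python sketch rests on assumptions: the helper step_no_cpp is hypothetical, the directives are assumed to arrive as single tokens such as "#if", and the return from state 1 to state 0 is not modeled because the description does not say when it happens.

```python
import re


def step_no_cpp(state, token):
    """Return the next NEXTTOKENNOCPP value for one token (hypothetical helper)."""
    is_word = re.fullmatch(r"\w+", token) is not None
    if token == "#if":
        return 1
    if token == "#ifdef":
        return 2
    if state == 1 and token == "defined":
        return 3
    if state == 2 and is_word:
        return 0  # the symbol after #ifdef has been skipped
    if state == 3 and is_word:
        return 1  # the symbol after "defined" has been skipped
    return state


def trace_no_cpp(tokens):
    state, history = 0, []
    for tok in tokens:
        state = step_no_cpp(state, tok)
        history.append(state)
    return history


if_trace = trace_no_cpp(["#if", "defined", "FOO", "||", "defined", "BAR"])
ifdef_trace = trace_no_cpp(["#ifdef", "FOO"])
```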
- noInsert
Set high to indicate that the next curly brace should not
result in a parser state insertion. Used when, for example,
a curly brace appears on its own prior to any actual
declaration.
- occmethod
Value is 1 if this is an Objective-C method, else 0/undefined.
- occmethodname
The name of this Objective-C method. As new fragments get parsed, this gets
extended to be foo:bar:baz:
- occmethodreturntype
Stores the return type for an Objective-C method.
- occmethodtype
The Objective-C method type. Contains either a
- or + character.
- occparmlabelfound
-
Possible values are:
-2 — Colon encountered without seeing a label.
In this state, the token is captured as the name of the
parameter because the parameter has no label. After
a word token is captured, the state returns to 0
because the next token is the name of the next
parameter.
-1 — Colon encountered while in state 1. The
parameter name follows. After a word token is
captured, this gets incremented to 0 because the next
token is the name of the next parameter.
0 — Default state. If colon is encountered, goes to state -2.
1 — Enters this state on first word token that's not in parentheses (thus skipping types in Objective-C methods). If colon is
encountered, go to state -1.
- occSuper
The superclass of an Objective-C class.
- OfIn
Set to the actual of or in token
encountered when parsing AppleScript. The word token after it is
appended to this variable (delimited by a space).
- onlyComments
Initially, this is set to 1. As soon as the parser sees a valid code token,
this variable is set to 0. This serves two purposes. If the parser sees an
opening curly brace before this gets set to 0, it restarts parsing without
returning. (See continue_no_return in blockParse.) Also, once the parser has seen
a code token, it will not allow the C preprocessing code to take over
and return a #define that appears in the middle of a declaration.
- optionalOrRequired
Either @optional or @required, depending on the current
state of the parser.
- parentLeading
-
Holds the number of leading spaces at the beginning
of the line for the enclosing block.
If the current line's leading space drops to this
level or lower, the end of block is considered to
have been reached.
- parsedParam
Temporary storage for the parsed parameter being parsed. Used only by the
Python parser. (The main block parser uses a local variable,
$parsedParam instead.)
- parsedParamAtBrace
Any in-progress parsed parameters when we enter a brace.
- parsedParamList
An array of parsed parameter strings. When parsing a function, these are the
parameters to the function. When parsing a struct or similar, these are the
fields in the structure.
- parsedParamParse
-
Indicates parameter parsing is in progress. Possible values are:
0 — Not parsing parameters
1 — Parsing semicolon-delimited parameters.
2 — About to parse semicolon-delimited parameters.
3 — Parsing comma-delimited parameters.
4 — About to parse comma-delimited parameters.
5 — Parsing whitespace-delimited parameters.
6 — About to parse whitespace-delimited parameters.
The value is set to the even-numbered variant first, which causes the current
token (usually a brace or parenthesis) to be skipped and the value to be
decremented by 1, after which all future tokens are parsed.
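The even-to-odd handoff described above can be sketched like this. It is a Python illustration only; the helper step_param_parse is hypothetical, and it returns the token unchanged as the captured value instead of building parsed parameter strings.

```python
NOT_PARSING = 0


def step_param_parse(state, token):
    """Return (new_state, captured) for one token (hypothetical helper)."""
    if state != NOT_PARSING and state % 2 == 0:
        # "About to parse": skip the opening token, then start parsing.
        return state - 1, None
    if state == NOT_PARSING:
        return state, None
    return state, token


state = 4  # about to parse comma-delimited parameters
captured = []
for tok in ["(", "int", "a"]:
    state, got = step_param_parse(state, tok)
    if got is not None:
        captured.append(got)
```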
- parsedParamStateAtBrace
The state of parameter parsing when we enter a brace.
- pendingBracedParameters
Used in languages where parameters are wrapped in
curly braces. A value of 1 indicates that the next
curly brace should start parameter parsing. A value
of 2 indicates that such a brace has been parsed.
The default value is 0.
- perlClassName
Stores a Perl class name (this::that::the_other).
When a :: token is encountered,
:: is appended (if this variable is
nonempty), followed by sodname.
- popAfter
In the Python parser, indicates that a new
$treeCur should be popped from
the stack (treeStack field)
after inserting this node.
- popAtEnd
Set to 1 if parser sees a colon while bracePending
is set. This indicates that if this declaration
ends at the end of this line, the parse tree (which has
become nested by the colon) needs to be popped back out.
- posstypes
List of type names that follow after a complex typedef, e.g.
bar and baz in the declaration
typedef struct foo { ...} bar, baz;.
- posstypesPending
The next token should go into the posstypes variable.
- pplStack
A stack of parsed parameter lists. Used to handle fields and parameters in
nested structures/callbacks.
- preclasssodtype
The contents of sodtype when class
or other similar keyword is encountered. This is used to
restore things when class appears as part of a
function's return type (e.g. static class
foo *returnsfoo();).
- preEqualsSymbol
The last symbol before the equals sign. Used to obtain the name of a variable
with an initial value.
- preExternCcurline
The value of curline is stored in this
variable when the extern token is
encountered. This value is rolled back when
rollbackPending is set. See
externC for details.
- preExternCdeclaration
In C, when the extern token is encountered,
the declaration to date is stored here. See
externC for details.
- prekeywordsodname
See prekeywordsodtype.
- prekeywordsodtype
If startofDec is 2, the parser has
seen proc, sub, function, or
equivalent keyword or has seen the first token of the
declaration. Either way, the start-of-declaration
parser is expecting a name. If it sees a
keyword, the sodtype variable is copied
into prekeywordsodtype and
the sodname variable is copied into
prekeywordsodname.
This basically fixed a bug where the setter keyword
wrecked things if it appeared after the name of an
Objective-C property.
- preTemplateSymbol
Used primarily for determining whether this is a function or a function template.
- pushedfuncbrace
Set to 1 when a sofunction token is seen
in the few languages that both use this token and
do not precede the function body with any other
opening brace.
- pushParserStateOnBrace
Set to 1 when a keyword is encountered that should
cause the parser state to be pushed the next time the
tree is nested (a class keyword, specifically).
Set to 2 when the colon at the end of the class
declaration is parsed. After the token is pushed
onto the tree, the parser state is pushed onto
the parser stack, and the value is incremented
to 3 so that it does not get pushed again.
- returntype
The return type of a function, callback, or
(non-Objective-C) method.
- rollbackPending
Set to 1 during parsing to indicate that the state should be
rolled back when done handling this token. After this token,
the parser calls rollback to roll back to the
previously saved state.
- rollbackState
A temporary copy of the parser state that the parser can roll
back to under certain circumstances. Set by rollbackSet
and used by rollback.
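The relationship between rollbackSet, rollbackState, and rollback can be sketched as a save-and-restore pair. This is a minimal Python illustration under assumed names (the State class, its fields dictionary, and the method names are hypothetical); the real object is a Perl hash and may clone differently.

```python
import copy

class State:
    """Hypothetical stand-in for a parser state with a single rollback point."""

    def __init__(self):
        self.fields = {"curline": "", "rollbackPending": 0}
        self.rollback_state = None  # plays the role of rollbackState

    def rollback_set(self):
        # Save a deep copy so later changes do not leak into the saved state.
        self.rollback_state = copy.deepcopy(self.fields)

    def rollback(self):
        # Restore the last saved state, if any.
        if self.rollback_state is not None:
            self.fields = copy.deepcopy(self.rollback_state)
```

A deep copy is used here so that nested structures (for example, lists of parsed parameters) are restored to their saved contents rather than shared with the live state.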
- seenBraces
The opening brace of functions/methods and function-like macros
has been seen by the parser, so the parser is now in a state
where it does nothing but walk to the matching close brace.
- seenElse
If $HeaderDoc::parseIfElse is 1, this
flag is set to indicate that the tree associated
with this parser state contains an else clause.
- seenIf
If $HeaderDoc::parseIfElse is 1, this
flag is set to indicate that the tree associated
with this parser state contains an if clause.
- seenLeading
The number of leading spaces on the current line.
If this indentation drops to be at or below the
indentation in leadspace (the
indentation of the first line inside this nesting
level) or if leadspace is -1 (and
thus uncheckable) and this value drops to be at
or below the value in parentLeading
(the nesting level above this one), the block is
done.
- seenMacroName
Set high after the macro name has been parsed.
If this is set and inMacroTail is not set,
if a parenthesis is encountered, it represents
the start of an argument list, which causes
cppMacroHasArgs to be set.
- seenMacroPart
Indicates that we've seen at least one non-whitespace token after
the #define. (This means the name should be locked, among
other things.)
- seenMacroStart
Set high after a #define token has been parsed. Once set,
the seenMacroName key is set on the next word token.
- seenTilde
Indicates that we are in a C++ destructor.
- seenToken
Used by the Python parser to determine whether
it has seen the first non-space token in a line.
This disables leading space counting.
- setHollowAfter
Used by the Python parser to indicate that after this
token has been inserted into the tree, the
hollow field should be set to the resulting
tree node.
- setleading
In Python, indicates that this is the first
line of a nonempty declaration encountered, so
the next leading space should not result in
any comparisons of indentation.
- simpleTDcontents
The guts of a simple typedef.
- simpleTypedef
Indicates a typedef without braces (0/1). This is used for three things:
To determine whether the next brace starts field parsing or not.
(Field parsing starts at the first brace.)
To determine whether the namelist variable contains tag names
for a complex typedef. (Tag names appear after struct and
before the opening curly brace.) In the case of a simple
typedef, this would contain bogus data.
In parsing MIG declarations, to determine whether a return
type was specified.
- skiptoken
Set to 1 when the parser state has just been
pushed so that the hollow value won't
point to (at least) the next token.
- sodbrackets
Captures the data between square brackets when
startOfDec is 2. This state typically
occurs after the first non-symbol token in the line.
Used for temporarily storing the bracketed
attributes in an IDL file.
- sodclass
The sodclass variable contains a standardized name for the type
being parsed, specifically one of: variable, function,
enum, or class.
The sod stands for "start of declaration". This variable, along with
sodtype and sodname, is used for parsing functions and
callbacks (but not the names of callbacks).
These parser variables are controlled by the startOfDec
counter variable. With a few exceptions (callback names, in particular,
come to mind), the startOfDec parser takes precedence over
the other parsers.
- sodname
The sodname variable contains the parsed name.
The sod stands for "start of declaration". This variable, along with
sodtype and sodclass, is used for parsing functions and
callbacks (but not the names of callbacks).
These parser variables are controlled by the startOfDec
counter variable. With a few exceptions (callback names, in particular,
come to mind), the startOfDec parser takes precedence over
the other parsers.
- sodtype
The sodtype variable contains code symbols that may be used for
various purposes.
The sod stands for "start of declaration". This variable, along with
sodname and sodclass, is used for parsing functions and
callbacks (but not the names of callbacks).
These parser variables are controlled by the startOfDec
counter variable. With a few exceptions (callback names, in particular,
come to mind), the startOfDec parser takes precedence over
the other parsers.
- sodtypeclasstoken
Contains the token that began the current class
declaration. Used to restore the class token
if it is really just the start of a variable name.
- stackFrozen
Once the parser passes the opening curly brace of a function body, the
parsed parameter stack is frozen. This prevents other things that look
like parameter lists (e.g. the expression of an if or while statement)
from getting parsed.
- startOfDec
The control variable for the startOfDec parser. Used to
control when the variables sodname and
sodtype get filled.
- storeDec
Temporary storage for nested declarations, used
to build up the vestigial plain text declaration.
- structClassName
The last symbol before a colon in a struct declaration.
Used for structs that look like this:
struct foo : bar {...}
In this case, the actual name of the struct is
foo, so that token gets stored in
structClassName and restored later.
- sublang
The language dialect that the parser was parsing
when this parser state was created (e.g. cpp
for C++).
- temponlyComments
When a semicolon is encountered, if the parser might
be parsing a parameter list that is semicolon-delimited
(parsedParamParse <= 2), this gets
the value of the onlyComments field,
and the value is replaced at the end of the loop.
If this was not the first character in the overall
declaration, this has the effect of preventing the
onlyComments value from being reset by
the semicolon handler.
If this was the first character in the overall
declaration, the value of onlyComments
was already zero, so this has no effect.
Note: this could probably be replaced by a flag
to simply tell the various bits of code not to
change the onlyComments value, but
it's probably not worth the effort for the
limited simplification this would cause.
- treePopTwo
This gets set to 1 when a token is encountered that causes the tree to be nested
but has no explicit ending token (e.g. +, -, or :). Thus, when the enclosing
context ends and the parse tree gets popped from the treeStack stack,
the code pops a second time for this token.
- treeStack
A stack of parse trees. These are pushed and popped at various points during
the parse process as braces, colons, parentheses, etc. are encountered. The behavior is
controlled by the variables treeNest, treeSkip,
treePopTwo, and treePopOnNewLine
(most of which are local variables in blockParse
and/or pythonParse).
This is currently used exclusively for Python.
Other languages use a local variable in blockParse
by the same name.
- typestring
The outer type keyword (in C, struct, union,
enum, or typedef).
- value
The parsed value of a constant.
- valuepending
This variable goes high after an equals sign, indicating that
the next tokens contain the value of the constant.
- variableNameConcat
Tells the parser to concatenate extra bits onto the name of
a function, variable, etc. For example, foo.bar is
(ostensibly) a valid name in Java, JavaScript, and IDL.
Set to 2 on encountering a period while parsing the name of
a variable, function, etc. Goes down to 1 when the period is
concatenated, zero when the next word token is concatenated.
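The 2, 1, 0 countdown described for variableNameConcat can be sketched as follows. This is a minimal Python illustration, not the parser's actual code: the function concat_name and its token list are assumptions, and the sketch simply stops at the first token that cannot extend the name.

```python
def concat_name(tokens):
    """Build a dotted name such as foo.bar from a list of tokens (hypothetical helper)."""
    name, state = "", 0
    for tok in tokens:
        if tok == "." and name and state == 0:
            state = 2      # period seen while parsing the name
            name += tok    # the period is concatenated...
            state = 1      # ...so the counter drops to 1
        elif state == 1:
            name += tok    # the next word token is concatenated
            state = 0
        elif not name:
            name = tok     # first token of the name
        else:
            break          # anything else ends the name
    return name
```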
- variablenames
Contains a hash table mapping variable names to
values when parsing variable declarations that
define more than one variable.
- variablestars
Contains a hash table mapping variable names to
the number of leading * characters
before them. By separating this from the type
information, it ensures that variables within
declarations that contain a mixture of pointer
and nonpointer types (char *a, b, **c;,
for example) are typed correctly.
The variable curvarstars is used for
temporary storage of subsequent groups of asterisks.
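The reason for tracking asterisks per variable can be seen in a small sketch. This is a simplified Python illustration with assumed names (split_declarators is hypothetical) that only handles a single-word base type; it is not how the parser tokenizes declarations.

```python
def split_declarators(declaration):
    """Return (base_type, {name: star_count}) for a simple C declaration."""
    decl = declaration.strip().rstrip(";")
    base_type, _, rest = decl.partition(" ")
    stars = {}
    for item in rest.split(","):
        item = item.strip()
        name = item.lstrip("*").strip()
        # Count only the leading asterisks that belong to this variable.
        stars[name] = len(item) - len(item.lstrip("*"))
    return base_type, stars
```

For the declaration char *a, b, **c; this yields the base type char with star counts of 1, 0, and 2, so b is not mistaken for a pointer.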
- variabletype
Temporary storage of the variable type (e.g. int)
used to prevent its destruction when parsing variable
declarations that define more than one variable.
- waitingForExceptions
Set to 1 when Ruby parsing encounters a left angle
bracket (<) in a class declaration.
- waitingForTypeInformation
By default, 0.
Set to 2 on a colon within a variable declaration.
If 2, set to 1 on non-space.
If 1, set to 3 on open parenthesis, else -1 if non-space.
Basically, if this goes to 3, the variable is a
Pascal enumerated type, e.g.
pascal_var_e: (apple, pear, banana, orange, lemon);
Otherwise, the declaration is just a normal variable.
A nondestructive variant of firstpastnl that is available to
any programming language (and currently used in TCL).
Set to 2 after a newline, 1 during the first non-space
token, 0 after.
In shell, initially 0, set to 2 after a double-semicolon or
1 after a semicolon (but never set to 1 after it is already 2).
Reset to 0 after the first non-space token. Used in case/esac
parsing.
Set on parser state objects that represent declarations
within classes so that it does not get processed twice.
The AppleScript label currently being parsed. Each
label is treated as a parsed parameter.
Used when parsing the GCC __attribute__
info, __asm__ declarations, and other
similar pieces of info (certain availability macros,
for example).
Legal values are:
0 — Not parsing an attribute.
1 — Just saw the leading token.
-1 — Got the leading open parenthesis.
Decremented to smaller negative values as
additional open parentheses are parsed.
Incremented towards 0 as close parentheses
are parsed. When it reaches zero, the tree
is popped up a level, and attribute parsing
is complete.
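The attribute counter described above (0, then 1, then negative values that track parenthesis depth) can be traced with a short sketch. This is a minimal Python illustration; the function attribute_states and the choice of __attribute__ as the only leading token are assumptions, and the tree manipulation is omitted.

```python
def attribute_states(tokens, leading="__attribute__"):
    """Return the counter value after each token (hypothetical helper)."""
    state = 0
    history = []
    for tok in tokens:
        if state == 0 and tok == leading:
            state = 1            # just saw the leading token
        elif state == 1 and tok == "(":
            state = -1           # got the leading open parenthesis
        elif state < 0 and tok == "(":
            state -= 1           # one level deeper
        elif state < 0 and tok == ")":
            state += 1           # reaching 0 ends attribute parsing
        history.append(state)
    return history
```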
In Python, this indicates the number of block
nesting levels deep the parser is (e.g. the start
of a function sets this to 1, an if statement
inside that function increases it to 2, and so on).
Contains the contents of an availability macro that was seen by the parser.
Temporary storage scribbled into by blockParse.
Each token in this array is the top of a subtree
that begins with one of the "Magic" availability
macros in Availability.list (e.g.
__OSX_AVAILABLE_BUT_DEPRECATED or
__OSX_AVAILABLE_STARTING).
$self->{availabilityNodesArray}
The number of backslashes since the last non-backslash
token. Modified by resetBackslash and
addBackslash.
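The interplay of addBackslash and resetBackslash amounts to counting a run of consecutive backslash tokens. A minimal Python sketch, assuming a hypothetical BackslashCounter class and field name (the real methods live on the Perl ParserState object):

```python
class BackslashCounter:
    """Hypothetical stand-in for the backslash counter field and its two methods."""

    def __init__(self):
        self.backslashcount = 0  # assumed field name

    def add_backslash(self):
        self.backslashcount += 1

    def reset_backslash(self):
        self.backslashcount = 0

def count_trailing_backslashes(tokens):
    """Count backslashes since the last non-backslash token."""
    counter = BackslashCounter()
    for tok in tokens:
        if tok == "\\":
            counter.add_backslash()
        else:
            counter.reset_backslash()
    return counter.backslashcount
```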
The type name in a simple typedef, e.g. foo in
typedef struct foo bar;.
Normally 0.
Set to 1 if the parser is expecting a brace
at the end of the first part of a struct, union,
or enum declaration. If it gets a word token
instead, the parser is parsing a variable
declaration rather than a type declaration.
Set to 2 if the parser is expecting another
word token before changing this variable to 1.
For example, if the parser encounters a
double colon (::), the next word
token is part of the structure name, but a
subsequent word token after that would make it
a structure variable instead.
Stack for brace tokens, including the left curly brace, the start-of-template
(sotemplate) value, the left square bracket, the left parenthesis
and the opening class marker for class markers that aren't followed by a left
curly brace (Objective-C @interface, for example).
This is currently used exclusively for Python.
Other languages use a local variable in blockParse.
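The brace stack and its accessors (pushBrace, popBrace, peekBrace, peekBraceMatch) can be sketched as a small class. This is a Python illustration only; the BraceStack class and the fixed MATCHING table are assumptions, since the real parser derives opening and closing tokens from the language being parsed.

```python
class BraceStack:
    """Hypothetical stack of opening tokens with a lookup for the matching close."""

    MATCHING = {"{": "}", "(": ")", "[": "]", "<": ">"}  # assumed token pairs

    def __init__(self):
        self._stack = []

    def push_brace(self, token):
        self._stack.append(token)

    def pop_brace(self):
        return self._stack.pop() if self._stack else None

    def peek_brace(self):
        return self._stack[-1] if self._stack else None

    def peek_brace_match(self):
        # Closing token that would match the top of the stack, or None if empty.
        return self.MATCHING.get(self.peek_brace())
```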
Indicates whether the callback is wrapped in a typedef (1) or not (0).
Sets priority order of type matching (up one level in blockParseOutside).
$self->{callbackIsTypedef}
The name of this callback. This takes priority over all other names,
including the sodname.
In a typedef of a callback, indicates that the next word token
is the name of a callback. (Non-typedef callback names get
picked up naturally by the parameter parsing code---if a second
set of parsed parameters appear, the first set becomes the
callback name.) Values are:
0 — Normal state.
1 — Just saw leading typedef token.
2 — Saw first word after typedef.
3 — Saw parenthesis after first word. Capture
the name now.
4 — Saw name token after parenthesis.
(Further word tokens mean it's not a callback.)
5 — Saw :: after name. Continue to capture
the name here.
$self->{callbackNamePending}
The owning class for an Objective-C category.
When a second open parenthesis is encountered in parsing
the callback name, this tells the parser that it is really
seeing a function that returns a callback instead of a
callback variable. The original sodname value is stored
here, and the functionReturnsCallback flag
is set so that this value can be restored later.
If a typedef contains a second set of parentheses and is
not identified as a function returning a callback, the
name inside the first set is the callback name, so this
gets cleared.
Set to 1 when an Objective-C class token is encountered.
In addition to playing a key role in parsing decisions,
this also causes sublang to be set to
occ.
Set to 1 on encountering a period while parsing the name of an
IDL class. This causes the next token to be interpreted as an
additional part of the name rather than turning the whole thing
into a class instance. Set to 0 after encountering the next
token.
The bleeding of JavaScript-specific syntax into IDL files is
really something of an abuse of the language, but supporting
it is necessary to parse certain content.
Set to 1 after a class name has been parsed. (Set back
to 0 if double colons are seen.) If a second word token
is encountered in this state, it's a variable instead of
a class (e.g. class foo *foo_instance;).
Contains the token that began the current class
declaration with any leading @ sign merged.
Returned to the caller.
The list of classes to which this protocol
conforms. This variable contains a string.
Set to 1 after the const keyword is found.
$self->{constKeywordFound}
Indicates that the #define macro described by the parser state
object has an argument list associated with it. Used to
determine the definetype attribute for the macro in XML output.
Temporary storage for asterisks before each variable
name in a declaration with more than one name.
This variable is reset to empty when the parser
encounters a comma in such a declaration.
See curvarstars for more information.
TCL variables, AppleScript variables, and TCL
functions end at a newline character. When
these are detected (by token matching), this
variable is set to 1.
$self->{declarationEndsAtNewLine}
The contents of the else part of an if/else conditional.
Only valid if $HeaderDoc::parseIfElse is 1.
In Python, this variable determines whether the
declaration is done after this token, in which case
a new parser state (sibling) must be added.
0 — Nope.
1 — In this state if we got a newline and
autoContinue is 0 (we're not in
a nested block). We're done after this token,
but it should be added to the parse tree.
2 — seenLeading is less than
leadspace. Don't add this token
to the parse tree because it's part of the
next declaration.
3 — seenLeading is less than
parentLeading. Don't add this
token to the parse tree because it's part of
the next declaration.
In shell (and Perl), set to the token after a << that is
treated as the start of a multi-line string. Reset to
an empty string upon leaving the multi-line string. While
in this state, inString is set to 13.
The number of quote tokens in a row when
potentially leaving a triple-quoted string.
This value is reset to zero upon
encountering a non-quote token.
If this reaches 2, the next quote mark causes
the three quotes to be combined into a single
token, and the value is reset to 0.
$self->{endOfTripleQuote}
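The quote-counting rule above can be sketched as a pass over a token list. This is a minimal Python illustration with an assumed helper name (merge_closing_triple_quote); it merges tokens at the tail of a list, whereas the parser keeps a reference to the first quote's tree node for the same purpose.

```python
def merge_closing_triple_quote(tokens, quote='"'):
    """Combine three consecutive quote tokens into one token (hypothetical helper)."""
    out = []
    count = 0  # quote tokens seen in a row
    for tok in tokens:
        if tok == quote:
            if count == 2:
                # Third quote in a row: replace the two pending quotes with one token.
                del out[-2:]
                out.append(quote * 3)
                count = 0
            else:
                out.append(tok)
                count += 1
        else:
            out.append(tok)
            count = 0  # any non-quote token resets the counter
    return out
```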
When a quote mark is seen, the object is
added here so that the parser can easily
go back to it later if it turns out to be
a triple quote. This is used to merge the
three quote marks into a single token in
the parse tree.
$self->{endOfTripleQuoteToken}
The name of the class that this class extends.
Stores the name of the Objective-C protocol that
this protocol extends (the tokens within angle
brackets).
In C, when the extern token is encountered,
this flag is set to 1 and the rollbackSet
function is called to set a rollback point. The
declaration to date is also stored in the
preExternCdeclaration field at this
time.
If what comes after this token is C,
then the previous declaration is restored and the
parser state is rolled back to this point.
In shell (and Perl), set to 1 after a newline until the
first non-space token.
A while or other statement right after
an end statement (on the same line) is
treated as applying to the preceding
block instead of starting a new one.
Set to 1 when end is encountered, 0 at
following newline.
$self->{followingrubyrbrace}
Set to 1 after reaching the left brace after a class. This
essentially tells the parser to stop appending superclass tokens
to forceClassSuper.
When the parser sees a colon (indicating a superclass name is coming),
or the keywords extends or implements in Java,
etc., this gets a copy of the class name so that it doesn't get overwritten.
Holds the superclass information after a colon token. Used in
conjunction with forceClassName.
Once the parser passes the opening curly brace of a function body, the
return type information is frozen. This prevents other things that look
too much like function declarations from overwriting the return type info.
Copy of the pplStack when the stack is frozen by stackFrozen.
A copy of the sodname variable frozen at a particular point in time.
Freezing occurs when the parser enters certain contexts like parameter parsing
because the sodname field would otherwise get overwritten by other things.
The full path for the file containing the
declaration that this parser state describes.
By storing the info here, it is available for
debug messages during subparse operations
(reprocessing declarations nested within
class declarations).
The contents of a function (or, when parsing a switch
statement, the contents of the struct body).
$self->{functionContents}
Indicates that the parser has seen a function that
returns a callback. If set, the parser restores the
value from cbsodname into the
sodname field.
This is incremented to 2 while parsing the parameters
for the callback, and decremented back to 1 at the end.
$self->{functionReturnsCallback}
While parsing an Objective-C method, this gets
set to 1 upon seeing an open parenthesis, 2 at
the bottom of the loop. While at 2 or greater,
tokens are appended to the
occmethodreturntype variable.
This value is incremented when additional open
parentheses are encountered, and is decremented
when close parentheses are encountered. When it
reaches 1 again, it is reset to 0.
$self->{gatheringObjCReturnType}
The revision control revision number for this module.
$HeaderDoc::ParserState::VERSION = '$Revision: 1333753010 $';
Discussion
In the git repository, contains the number of seconds since
January 1, 1970.
This variable holds a reference to the node in
the parse tree where the parser state should be stored when the current declaration
has been fully parsed.
The contents of the if part of an if/else conditional
(not including the test expression). Only valid if
$HeaderDoc::parseIfElse is 1.
Set high within the definition for any of the built-in
availability macros so that those macro definitions can
be properly parsed even if they refer to other
availability macros.
$self->{ignoreAvailabilityMacros}
The name of the abstract class that this class
implements.
Indicates that we are at a token that might be the start of
a C bitfield. This goes high when a colon occurs. If the next
token is a non-colon (i.e. it's not ::),
startOfDec gets reset to zero to lock the name and
related state.
Indicates the number of levels of nested square brackets the current
token is within.
In shell, initially 0, incremented upon entering a case
statement, and decremented on exit.
Inside a single-quoted character/string literal.
Indicates whether we are in a class. Possible values are:
0 — Not in a class declaration.
1 — Enters this state when a class keyword is
encountered (except @protocol or
@interface).
2 — Enters this state when the @interface
class keyword is encountered. Returns to 1 when a colon or
close parenthesis is encountered.
3 — Enters this state on the first word token found while in state 2.
Returns to 1 when colon or close parenthesis is encountered.
Set to 1 when a conforming left angle bracket (<) is seen in an
@protocol declaration.
Set to 2 after that token. While this value is 2, tokens are
gathered in the conformsToList string.
Reset to 0 upon seeing the matching right angle bracket (>).
$self->{inClassConformingToProtocol}
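The three values of this flag (0, 1, 2) can be traced with a short sketch. This is a minimal Python illustration; the function gather_conforms_to is hypothetical, and whether separators such as commas are kept in the gathered string is an assumption of the example.

```python
def gather_conforms_to(tokens):
    """Return (final_state, conforms_to) for a tokenized @protocol line."""
    state, conforms = 0, ""
    for tok in tokens:
        if state == 2:
            if tok == ">":
                state = 0          # matching right angle bracket ends gathering
            else:
                conforms += tok    # gathered while the value is 2
        elif tok == "<":
            state = 1              # conforming left angle bracket seen
        if state == 1:
            state = 2              # advance once the "<" token has been handled
    return state, conforms
```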
Indicates whether we are in a multi-line comment. See also
the ppSkipOneToken local variable in
blockParse.
Set to 1 while inside an enumeration.
Set to 1 when the extends keyword is encountered in
Java. Reset to 0 when an implements keyword occurs.
Set to 1 when a given token is seen in AppleScript. Reset to 0
at the following newline.
Inside an if statement. Only used if the HeaderDoc::parseIfElse
variable is set to 1.
Set to 1 when the implements keyword is encountered
in Java. Reset to 0 when an extends keyword occurs.
Indicates whether we are in a single-line comment (i.e. one
beginning with a hash or two slashes).
Initial value is 4. Decremented to 3 at end of loop.
Decremented to 2 after next token, then 1, increased to 3
if 1 and saw exclamation point. I don't remember what this
code does, and it is probably wrong.
Contains the number of braces on the brace stack
when this parser state was created. When the
number of braces drops below this level, this
parser state must go away.
Set to 1 when a label token is seen in AppleScript. (See the
labelregexp variable in
parseTokens for
a list of these tokens.)
Reset to 0 after the next word token, at the following newline,
or when a given token is encountered.
Indicates that the current declaration is a #define macro or similar. Values are:
0 — Not in a macro.
1 — Got leading #.
2 — Got something else after # (error case).
3 — Got #define.
4 — Got another C preprocessor token, including
#if, #ifdef,
#ifndef, #endif,
#else, #undef,
#elif, #error,
#warning, #pragma,
#import, and #include.
See also inMacroLine.
Used for handling macros in the middle of declarations.
Set high upon encountering the first whitespace after
a macro name. Once this key is set, the value of the
cppMacroHasArgs key is no longer set upon
encountering an open parenthesis.
Indicates that the parser is in a module declaration.
Possible values are:
0 — Not in a module declaration.
1 — Saw the module token.
2 — Unused vestigial state.
3 — Unused vestigial state.
Set to 1 when AppleScript of or in token
is encountered. Reset to 0 on newline or after encountering the
next word token and appending it to OfIn.
In a C++ operator declaration.
Set to 1 after the colon in a C++ method declaration.
Indicates that the parser is parsing the private parameter
declarations for the method.
$self->{inPrivateParamTypes}
Possible values are:
0 — Not in a protocol.
1 — Saw @protocol token.
2 — After next word token after @protocol. Returns to
this state after closing > token.
In this state, it is capturing tokens into
the extendsProtocol field.
3 — Inside conforming angle braces (<).
The input counter. Used for restoring the
value during a subparse (reprocessing a
declaration within an already-parsed class).
Some languages take an additional argument for their equivalent of
a right brace. For example, in AppleScript, a tell
block ends with end tell. In effect, end
terminates the block, but the next token does not start the next
block.
If rbracetakesargument
is set in the object returned by a call to
parseTokens,
then that trailing tell is included in the
trailer for the block.
$self->{inrbraceargument}
In a Ruby quote. Quotes in Ruby are much more complex
than in any sane language, so they get their own
variable....
The character that began the current Ruby block. For example, the
<< token.
Normally 0.
Set to 1 when a Ruby class declaration is encountered.
Set to 2 when the first newline after a Ruby class is encountered.
Inside a double-quoted string literal if 1, else 0.
Set to 13 for a multi-line string (e.g. FOO <<EOF...).
In TCL, set to 1 when a command is encountered that takes an
unquoted (non-string) regular expression as an argument.
Set to 0 upon entering the regular expression or when a
newline or carriage return is encountered.
$self->{inTCLRegExpCommand}
Within C++ template braces (< and >). Also used for
IDL bracket notation.
Set to 1 while inside a C typedef.
Set to 1 when the union keyword is encountered. Remains high
until the end of this declaration.
Set to 1 after the constructor token is seen in TCL
(or equivalent in other languages). (Not used in C++.)
Indicates whether a class declaration is a forward declaration
(1) or the actual class declaration (0). That way, the
resulting object is a Var
object instead of a CPPClass
object.
$self->{ISFORWARDDECLARATION}
Set to 1 after a keyword is parsed that indicates that this
variable is an Objective-C property.
Set to 1 when static or equivalent
(e.g. my in perl) is seen. Used to
determine whether a variable is file-scoped or
global.
After an empty string (""), this gets set high
in Python. That way, if the next token is
also a double quote mark, the opening triple
quote of a triple-quoted string can be easily
detected.
$self->{justLeftStringToken}
Indicates that the current code is a K&R-style C function (with separate
parameter type declarations), e.g.
int foo(a, b)
int a;
char *b;
{ ... function body ... }
Contains the name of a K&R C function. The normal function name detection
code would fail hard because of the existence of multiple declarations.
The language that the parser was parsing when
this parser state was created.
In Python, set to 1 if the last newline was preceded
by a backslash, else unset. Used to determine
whether to care about the leading whitespace count.
Holds the last part before the one being processed
by the Python parser. Similar to the local variable
of the same name in blockParse.
The last token, wiped by braces, parentheses, and so on. It is used primarily
for handling names of typedefs. In general, when writing code, except in a few
specific contexts, you probably want the local variable
lasttoken in blockParse instead. Also
related are the local variables lastnspart and
lastchar.
The last node in the parse tree rooted at this node.
This node is marked with EODEC in parse tree dumps.
For example, the lastTreeNode value for
a class declaration would point to the closing brace
or semicolon at the end of the class.
Note that within the class, each nested
declaration also has a lastTreeNode
value that points to the end of that nested
declaration.
The number of leading spaces in the first line
since the parser state was created.
Initial value is -1 indicating that the value
has not yet been determined. This value does
not get set until the first line that
contains at least one non-space token after
that whitespace and before the trailing newline.
If the current line's leading space (in
seenLeading) drops to this level or
lower, the end of block is considered to have been
reached.
Set to 1 on an end-of-comment token so that
the ending comment token won't get added to
the return type.
Set to 1 to avoid truncating the body of macros that
don't begin with a parenthesis or brace. Otherwise 0.
Temporary storage for the name of a module.
The module token is treated much
like an @indexgroup tag.
The name of a data type parsed by the main (namePending) parser.
This is the lowest priority name; it gets overridden by the sodname name
more often than not.
In Pascal, upon seeing a colon (after a variable name),
the sodname and sodtype
fields are concatenated together (with a space)
into this field. This later becomes the
variable name.
Set to 1 when the parser expects a name:
After the keyword function, procedure, sub, or other similar
function delimiter tokens.
Set to 2 after the keyword typedef, struct, union, and so on
because the name is the second non-keyword token after this one.
Decremented at the end of the token loop.
Python-specific parser state variable.
The initial value is 1. Set high after
a class keyword or a def keyword.
Set low after a word token (the name).
Indicates that after inserting this token into the parse
tree, future tokens should be nested under this one.
In Ruby, an end marks the end of a function,
so treat the newline after it as the end of the declaration.
Turns off the C preprocessor temporarily.
0 — Normal operation.
1 — Just saw #if. Goes to 3 if you get a defined token.
2 — Just saw #ifdef. Don't do C preprocessing for the symbol that follows. Goes to 0 after the next word token.
3 — In #if defined. Don't do C preprocessing for the symbol that follows, and drop back to state 1 after a word token.
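The four states above can be traced with a small sketch that records which symbols are shielded from macro substitution. This is a Python illustration under stated assumptions: the function suppressed_symbols is hypothetical, only states 2 and 3 are treated as suppressing, and the return from state 1 to 0 (not described above) is left out.

```python
import re

def suppressed_symbols(tokens):
    """Return (final_state, symbols) where symbols were not preprocessed."""
    state = 0
    suppressed = []
    for tok in tokens:
        if tok == "#if":
            state = 1
        elif tok == "#ifdef":
            state = 2
        elif state == 1 and tok == "defined":
            state = 3
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):  # word token
            if state == 2:
                suppressed.append(tok)
                state = 0   # back to normal after the symbol
            elif state == 3:
                suppressed.append(tok)
                state = 1   # still inside the #if expression
    return state, suppressed
```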
Set high to indicate that the next curly brace should not
result in a parser state insertion. Used when, for example,
a curly brace appears on its own prior to any actual
declaration.
Value is 1 if this is an Objective-C method, else 0/undefined.
The name of this Objective-C method. As new fragments get parsed, this gets
extended to be foo:bar:baz:
Stores the return type for an Objective-C method.
$self->{occmethodreturntype}
The Objective-C method type. Contains either a
- or + character.
Possible values are:
-2 — Colon encountered without seeing a label.
In this state, the token is captured as the name of the
parameter because the parameter has no label. After
a word token is captured, the state returns to 0
because the next token is the name of the next
parameter.
-1 — Colon encountered while in state 1. The
parameter name follows. After a word token is
captured, this gets incremented to 0 because the next
token is the name of the next parameter.
0 — Default state. If colon is encountered, goes to state -2.
1 — Enters this state on the first word token that's not in parentheses (thus skipping types in Objective-C methods). If a colon is
encountered, goes to state -1.
$self->{occparmlabelfound}
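The label-tracking states above can be walked through with a short model. This is a Python sketch under stated assumptions, not the Perl implementation: the function name scan_objc_selector is hypothetical, the token list is assumed to start after the leading - or + character, and an unlabeled parameter is recorded with an empty label.

```python
def scan_objc_selector(tokens):
    """Return (labels, parameter names) for a pre-tokenized method."""
    state = 0            # models occparmlabelfound
    depth = 0            # parenthesis nesting; types live inside parentheses
    labels, params = [], []
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        elif depth > 0:
            continue                 # skip types in parentheses
        elif tok == ":":
            if state == 1:
                state = -1           # colon after a label
            elif state == 0:
                state = -2           # colon without a label
                labels.append("")
        elif state in (-1, -2):
            params.append(tok)       # word token is the parameter name
            state = 0
        elif state == 0:
            labels.append(tok)       # first word outside parentheses
            state = 1
    return labels, params

labels, params = scan_objc_selector(
    ["(", "void", ")", "setWidth", ":", "(", "int", ")", "w",
     "height", ":", "(", "int", ")", "h"])
selector = "".join(label + ":" for label in labels)
```

Joining the labels with colons gives a name of the foo:bar:baz: form described for the method name field.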
The superclass of an Objective-C class.
Set to the actual token (of or in)
encountered when parsing AppleScript. The word token after it is
appended to this variable (delimited by a space).
Initially, this is set to 1. As soon as the parser sees a valid code token,
this variable is set to 0. This serves two purposes. If the parser sees an
opening curly brace before this gets set to 0, it restarts parsing without
returning. (See continue_no_return in blockParse.) Also, once the parser has seen
a code token, it will not allow the C preprocessing code to take over
and return a #define that appears in the middle of a declaration.
Either @optional or @required, depending on the current
state of the parser.
$self->{optionalOrRequired}
Holds the number of leading spaces at the beginning
of the line for the enclosing block.
If the current line's leading space drops to this
level or lower, the end of block is considered to
have been reached.
Temporary storage for the parameter currently being parsed. Used only by the
Python parser. (The main block parser uses a local variable,
$parsedParam, instead.)
Any in-progress parsed parameters when we enter a brace.
$self->{parsedParamAtBrace}
An array of parsed parameter strings. When parsing a function, these are the
parameters to the function. When parsing a struct or similar, these are the
fields in the structure.
Indicates parameter parsing is in progress. Possible values are:
0 — Not parsing parameters
1 — Parsing semicolon-delimited parameters.
2 — About to parse semicolon-delimited parameters.
3 — Parsing comma-delimited parameters.
4 — About to parse comma-delimited parameters.
5 — Parsing whitespace-delimited parameters.
6 — About to parse whitespace-delimited parameters.
The value is set to the even-numbered variant first, which causes the current
token (usually a brace or parenthesis) to be skipped and the value to be
decremented by 1, after which all future tokens are parsed.
$self->{parsedParamParse}
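The even-then-odd convention described above can be sketched as follows. This is an illustrative Python model, not the Perl parser: collect_params and DELIMITERS are hypothetical names, whitespace delimiting is reduced to a single space token, and detection of the closing brace or parenthesis is omitted.

```python
# Odd states and the delimiter each one splits on (from the list above).
DELIMITERS = {1: ";", 3: ",", 5: " "}

def collect_params(tokens, start_state):
    """Collect parameter strings from tokens, starting in start_state."""
    state = start_state              # models parsedParamParse
    params, current = [], ""
    for tok in tokens:
        if state in (2, 4, 6):
            state -= 1               # skip the opening token, then parse
            continue
        if state == 0:
            continue                 # not parsing parameters
        if tok == DELIMITERS[state]:
            if current.strip():
                params.append(current.strip())
            current = ""
        else:
            current += tok
    if current.strip():
        params.append(current.strip())  # flush the last parameter
    return params
```

For example, starting in state 4 with the tokens of "(int a, char b", the opening parenthesis is skipped and the remaining tokens are split on commas.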
The state of parameter parsing when we enter a brace.
$self->{parsedParamStateAtBrace}
Used in languages where parameters are wrapped in
curly braces. A value of 1 indicates that the next
curly brace should start parameter parsing. A value
of 2 indicates that such a brace has been parsed.
The default value is 0.
$self->{pendingBracedParameters}
Stores a Perl class name (this::that::the_other).
When a :: token is encountered,
:: is appended (if this variable is
nonempty), followed by sodname.
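Read literally, the rule above accumulates every name component that precedes a :: token. A minimal Python sketch follows, assuming (for illustration only) that sodname simply holds the most recent word token; scan_perl_name is a hypothetical name and this is not the Perl implementation.

```python
def scan_perl_name(tokens):
    """Return (accumulated class name, final sodname) for a token list."""
    perl_class_name = ""
    sodname = ""                     # assumption: last word token seen
    for tok in tokens:
        if tok == "::":
            if perl_class_name:
                perl_class_name += "::"   # separator only when nonempty
            perl_class_name += sodname
        else:
            sodname = tok
    return perl_class_name, sodname
```

Under this reading, the component after the final :: stays in sodname until another :: token arrives.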
In the Python parser, indicates that a new
$treeCur should be popped from
the stack (treeStack field)
after inserting this node.
Set to 1 if parser sees a colon while bracePending
is set. This indicates that if this declaration
ends at the end of this line, the parse tree (which has
become nested by the colon) needs to be popped back out.
List of type names that follow after a complex typedef, e.g.
bar and baz in the declaration
typedef struct foo { ...} bar, baz;.
The next token should go into the posstypes variable.
$self->{posstypesPending}
A stack of parsed parameter lists. Used to handle fields and parameters in
nested structures/callbacks.
The contents of sodtype when class
or other similar keyword is encountered. This is used to
restore things when class appears as part of a
function's return type (e.g. static class
foo *returnsfoo();).
The last symbol before the equals sign. Used to obtain the name of a variable
with an initial value.
The value of curline is stored in this
variable when the extern token is
encountered. This value is rolled back when
rollbackPending is set. See
externC for details.
$self->{preExternCcurline}
In C, when the extern token is encountered,
the declaration to date is stored here. See
externC for details.
$self->{preExternCdeclaration}
See prekeywordsodtype.
$self->{prekeywordsodname}
If startOfDec is 2, the parser has
seen proc, sub, function, or an
equivalent keyword or has seen the first token of the
declaration. Either way, the start-of-declaration
parser is expecting a name. If it sees a
keyword, the sodtype variable is copied
into prekeywordsodtype and
the sodname variable is copied into
prekeywordsodname.
This basically fixed a bug where the setter keyword
wrecked things if it appeared after the name of an
Objective-C property.
$self->{prekeywordsodtype}
Used primarily for determining whether this is a function or a function template.
$self->{preTemplateSymbol}
Set to 1 when a sofunction token is seen
in the few languages that both use this token and
do not precede the function body with any other
opening brace.
Set to 1 when a keyword is encountered that should
cause the parser state to be pushed the next time the
tree is nested (a class keyword, specifically).
Set to 2 when the colon at the end of the class
declaration is parsed. After the token is pushed
onto the tree, the parser state is pushed onto
the parser stack, and the value is incremented
to 3 so that it does not get pushed again.
$self->{pushParserStateOnBrace}
The return type of a function, callback, or
(non-Objective-C) method.
Set to 1 during parsing to indicate that the state should be
rolled back when done handling this token. After this token,
the parser calls rollback to roll back to the
previously saved state.
A temporary copy of the parser state that the parser can roll
back to under certain circumstances. Set by rollbackSet
and used by rollback.
The opening brace of functions/methods and function-like macros
has been seen by the parser, so the parser is now in a state
where it does nothing but walk to the matching close brace.
If $HeaderDoc::parseIfElse is 1, this
flag is set to indicate that the tree associated
with this parser state contains an else clause.
If $HeaderDoc::parseIfElse is 1, this
flag is set to indicate that the tree associated
with this parser state contains an if clause.
The number of leading spaces on the current line.
If this indentation drops to be at or below the
indentation in leadspace (the
indentation of the first line inside this nesting
level) or if leadspace is -1 (and
thus uncheckable) and this value drops to be at
or below the value in parentLeading
(the nesting level above this one), the block is
done.
Set high after the macro name has been parsed.
If this is set and inMacroTail is not set,
if a parenthesis is encountered, it represents
the start of an argument list, which causes
cppMacroHasArgs to be set.
Indicates that we've seen at least one non-whitespace token after
the #define. (This means the name should be locked, among
other things.)
Set high after a #define token has been parsed. Once set,
the seenMacroName key is set on the next word token.
Indicates that we are in a C++ destructor.
Used by the Python parser to determine whether
it has seen the first non-space token in a line.
This disables leading space counting.
Used by the Python parser to indicate that after this
token has been inserted into the tree, the
hollow field should be set to the resulting
tree node.
In Python, indicates that this is the first
line of a nonempty declaration encountered, so
the next leading space should not result in
any comparisons of indentation.
The guts of a simple typedef.
$self->{simpleTDcontents}
Indicates a typedef without braces (0/1). This is used for three things:
To determine whether the next brace starts field parsing or not.
(Field parsing starts at the first brace.)
To determine whether the namelist variable contains tag names
for a complex typedef. (Tag names appear after struct and
before the opening curly brace.) In the case of a simple
typedef, this would contain bogus data.
In parsing MIG declarations, to determine whether a return
type was specified.
Set to 1 when the parser state has just been
pushed so that the hollow value won't
point to (at least) the next token.
Captures the data between square brackets when
startOfDec is 2. This state typically
occurs after the first non-symbol token in the line.
Used for temporarily storing the bracketed
attributes in an IDL file.
The sodclass variable contains a standardized name for the type
being parsed, specifically one of: variable, function,
enum, or class.
The sod stands for "start of declaration". This variable, along with
sodtype, sodname, and sodclass,
is used for parsing functions and
callbacks (but not the names of callbacks).
These parser variables are controlled by the startOfDec
counter variable. With a few exceptions (callback names, in particular,
come to mind), the startOfDec parser takes precedence over
the other parsers.
The sodname variable contains the parsed name.
The sod stands for "start of declaration". This variable, along with
sodtype, sodname, and sodclass,
is used for parsing functions and
callbacks (but not the names of callbacks).
These parser variables are controlled by the startOfDec
counter variable. With a few exceptions (callback names, in particular,
come to mind), the startOfDec parser takes precedence over
the other parsers.
The sodtype variable contains code symbols that may be used for
various purposes.
The sod stands for "start of declaration". This variable, along with
sodtype, sodname, and sodclass,
is used for parsing functions and
callbacks (but not the names of callbacks).
These parser variables are controlled by the startOfDec
counter variable. With a few exceptions (callback names, in particular,
come to mind), the startOfDec parser takes precedence over
the other parsers.
Contains the token that began the current class
declaration. Used to restore the class token
if it is really just the start of a variable name.
$self->{sodtypeclasstoken}
Once the parser passes the opening curly brace of a function body, the
parsed parameter stack is frozen. This prevents other things that look
like parameter lists (e.g. the expression of an if or while statement)
from getting parsed.
The control variable for the startOfDec parser. Used to
control when the variables sodname and
sodtype get filled.
Temporary storage for nested declarations, used
to build up the vestigial plain text declaration.
The last symbol before a colon in a struct declaration.
Used for structs that look like this:
struct foo : bar {...}
In this case, the actual name of the struct is
foo, so that token gets stored in
structClassName and restored later.
The language dialect that the parser was parsing
when this parser state was created (e.g. cpp
for C++).
When a semicolon is encountered, if the parser might
be parsing a parameter list that is semicolon-delimited
(parsedParamParse <= 2), this gets
the value of the onlyComments field,
and the value is replaced at the end of the loop.
If this was not the first character in the overall
declaration, this has the effect of preventing the
onlyComments value from being reset by
the semicolon handler.
If this was the first character in the overall
declaration, the value of onlyComments
was already zero, so this has no effect.
Note: this could probably be replaced by a flag
to simply tell the various bits of code not to
change the onlyComments value, but
it's probably not worth the effort for the
limited simplification this would cause.
$self->{temponlyComments}
This gets set to 1 when a token is encountered that causes the tree to be nested
but has no explicit ending token (e.g. +, -, or :). Thus, when the enclosing
context ends and the parse tree gets popped from the treeStack stack,
the code pops a second time for this token.
A stack of parse trees. These are pushed and popped at various points during
the parse process as braces, colons, parentheses, etc. are encountered. The behavior is
controlled by the variables treeNest, treeSkip,
treePopTwo, and treePopOnNewLine
(most of which are local variables in blockParse
and/or
pythonParse).
This is currently used exclusively for Python.
Other languages use a local variable in blockParse
by the same name.
The outer type keyword (in C, struct, union,
enum, or typedef).
The parsed value of a constant.
This variable goes high after an equals sign, indicating that
the next tokens contain the value of the constant.
Tells the parser to concatenate extra bits onto the name of
a function, variable, etc. For example, foo.bar is
(ostensibly) a valid name in Java, JavaScript, and IDL.
Set to 2 on encountering a period while parsing the name of
a variable, function, etc. Goes down to 1 when the period is
concatenated, zero when the next word token is concatenated.
$self->{variableNameConcat}
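The countdown above can be modelled in a few lines. This is a Python sketch for illustration, not the Perl implementation; build_dotted_name is a hypothetical name, and the first token is assumed to be the start of the name.

```python
def build_dotted_name(tokens):
    """Return the dotted name assembled from a token list."""
    name = ""
    concat = 0                       # models variableNameConcat
    for tok in tokens:
        if tok == ".":
            concat = 2               # period seen while parsing the name
        if concat == 2:
            name += tok              # concatenate the period itself
            concat = 1
        elif concat == 1:
            name += tok              # concatenate the following word token
            concat = 0
        elif not name:
            name = tok               # assumption: first token starts the name
    return name
```

Tokens that arrive while the counter is 0 and a name already exists are ignored in this model, so a trailing parenthesis does not extend the name.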
Contains a hash table mapping variable names to
values when parsing variable declarations that
define more than one variable.
Contains a hash table mapping variable names to
the number of leading * characters
before them. By separating this from the type
information, it ensures that variables within
declarations that contain a mixture of pointer
and nonpointer types (char *a, b, **c;,
for example) are typed correctly.
The variable curvarstars is used for
temporary storage of subsequent groups of asterisks.
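For the char *a, b, **c; example above, the resulting mapping can be illustrated with a small Python sketch. It is not the Perl implementation: count_var_stars is a hypothetical name, and the sketch assumes a single-word base type and no initializers.

```python
def count_var_stars(declaration):
    """Return (base type, {variable name: number of leading asterisks})."""
    decl = declaration.strip().rstrip(";")
    base_type, _, rest = decl.partition(" ")   # assumption: one-word type
    stars = {}
    for part in rest.split(","):
        part = part.strip()
        name = part.lstrip("*").strip()
        stars[name] = len(part) - len(part.lstrip("*"))
    return base_type, stars

base_type, stars = count_var_stars("char *a, b, **c;")
```

Keeping the counts per name means a is typed as char *, b as char, and c as char **, even though all three share one declaration.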
Temporary storage of the variable type (e.g. int)
used to prevent its destruction when parsing variable
declarations that define more than one variable.
Set to 1 when Ruby parsing encounters a left angle
bracket (<) in a class declaration.
$self->{waitingForExceptions}
By default, 0.
Set to 2 on a colon within a variable declaration.
If 2, set to 1 on non-space.
If 1, set to 3 on open parenthesis, else -1 if non-space.
Basically, if this goes to 3, the variable is a
Pascal enumerated type, e.g.
pascal_var_e: (apple, pear, banana, orange, lemon);
Otherwise, the declaration is just a normal variable.
$self->{waitingForTypeInformation}
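The transitions above leave the ordering of the checks open. The Python sketch below picks one reading and labels it as an assumption: the step from 2 to 1 fires on the colon token itself, so the next non-space token decides between 3 and -1. The name type_info_state is hypothetical, and this is not the Perl implementation.

```python
def type_info_state(tokens):
    """Return the final state after scanning a variable declaration."""
    state = 0                        # models waitingForTypeInformation
    for tok in tokens:
        if state == 0 and tok == ":":
            state = 2                # colon within a variable declaration
        if tok.isspace():
            continue                 # whitespace never changes the state
        if state == 2:
            state = 1                # assumption: fires on the colon itself
        elif state == 1:
            state = 3 if tok == "(" else -1
    return state
```

With this reading, the tokens of pascal_var_e: (apple, ...) end in state 3 (an enumerated type), while a declaration such as count: integer ends in state -1.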
Last Updated: Saturday, August 06, 2016