HeaderDoc::ParserState

Declared In:

Introduction

Core data structure for the parser.

Discussion

The ParserState object represents an almost-complete view of the state machine inside the parser. (There are a few local variables in the parser that contain additional transient state information.)

ParserState object instances are routinely stored on a stack to provide the ability to fully parse and interpret one declaration that appears inside another declaration (variable declarations within the parameter list of a function, for example).
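
The stacking idea can be illustrated with a minimal Python sketch. It is not the module's Perl code: the SketchState class, its two fields, and the parse_nested helper are hypothetical names, and treating every parenthesis as the start of a nested declaration is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SketchState:
    """Hypothetical stand-in for a ParserState; only two fields are modeled."""
    kind: str
    tokens: list = field(default_factory=list)


def parse_nested(tokens):
    """Collect tokens per declaration, pushing a state at '(' and popping at ')'."""
    stack = [SketchState("outer")]
    finished = []
    for tok in tokens:
        if tok == "(":
            stack[-1].tokens.append(tok)
            stack.append(SketchState("inner"))  # begin a nested declaration
        elif tok == ")":
            finished.append(stack.pop())        # nested declaration is complete
            stack[-1].tokens.append(tok)
        else:
            stack[-1].tokens.append(tok)
    finished.append(stack.pop())                # the outermost declaration
    return finished


states = parse_nested(["int", "f", "(", "char", "*s", ")", ";"])
```

Here the inner state ends up holding only the parameter tokens, while the outer state keeps the function's own tokens.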



Member Functions

_initialize

Initializes an instance of a ParserState object.

addBackslash

Increments the backslash counter.

braceCount

Returns the number of tokens currently on the brace stack.

dbprint

Prints object for debugging purposes.

free

Releases resources associated with a parser state object.

isContinuationLine

Returns whether the current line is a Python continuation line.

isLeftBrace

Returns whether or not this token should be treated as a left brace.

isQuoted

Returns whether the current token is quoted (escaped by preceding backslashes).

isRubyCloseQuote

Returns whether a token should be interpreted as a Ruby close quote mark.

isRubyOpenQuote

Returns whether a token should be interpreted as a Ruby open quote mark.

new

Creates a new ParserState object.

peekBrace

Looks at the top token on the brace stack.

peekBraceMatch

Looks at the top token on the brace stack and returns the closing token that would match it.

popBrace

Pops a token off of the brace stack and returns it.

print

Alias for dbprint.

pushBrace

Pushes a token onto the brace stack.

resetBackslash

Resets the backslash counter to zero.

rollback

Rolls back the parser state to the last state saved by a call to rollbackSet.

rollbackSet

Creates a clone of the object for future rollbacks.

setHollowWithLineNumbers

Sets the hollow field in this object, and sets the input counter and block offset values for the tree node.

treePop

Pops a tree from the tree stack.

treePush

Pushes a tree onto the tree stack.


_initialize


Initializes an instance of a ParserState object.

Parameters
self

The object to initialize.


addBackslash


Increments the backslash counter.

Parameters
self

This object.


braceCount


Returns the number of tokens currently on the brace stack.

Parameters
self

This object.

Discussion

This is currently only used for the Python parser. Eventually, the main parser should be modified to share this stack instead of using a local variable.


dbprint


Prints object for debugging purposes.

sub dbprint 
Parameters
self

This object.


free


Releases resources associated with a parser state object.

sub free 
Parameters
self

The ParserState object.


isContinuationLine


Returns whether the current line is a Python continuation line.

Discussion

In Python, if you are inside a string, a multiline string, a parenthesized expression, an array, etc., subsequent lines are treated as part of the current line implicitly. Those subsequent lines are called continuation lines.

A continuation line also occurs explicitly when the previous line ends with a backslash.
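
As a rough illustration of the two cases, the following Python sketch combines the implicit and explicit rules. The function name and its three parameters are hypothetical; the real method works from the fields of the parser state object rather than from arguments.

```python
def is_continuation_line(open_brackets, in_string, prev_line):
    """Return True if the next line continues the current logical line.

    open_brackets: number of unclosed (, [ or { tokens (hypothetical input).
    in_string:     True while inside a multi-line string (hypothetical input).
    prev_line:     text of the previous physical line.
    """
    implicit = open_brackets > 0 or in_string
    explicit = prev_line.rstrip("\n").endswith("\\")
    return implicit or explicit
```

For example, a line following `x = (1,` is an implicit continuation, and a line following `x = 1 + \` is an explicit one.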


isLeftBrace


Returns whether or not this token should be treated as a left brace.

Parameters
self

This object.

part

The token to check.

lang

The programming language.

lbrace

The primary left brace character.

lbraceunconditionalre

A regular expression containing other patterns that are always considered left braces. Currently used for for/if in Python and Ruby, and tell in AppleScript.

lbraceconditionalre

In Ruby/Python, a set of tokens that are treated as left braces unless they are immediately after a right brace. Basically, this handles begin/while/until when used at the end of a line in Ruby/Python.

classisbrace

Set to 1 if a class declaration is treated as an open brace. (This is not used for ObjC classes; they are special.)

functionisbrace

Set to 1 if a function declaration is treated as an open brace.

case_sensitive

Set to 1 for most languages. Set to 0 if the language uses case-insensitive token matching (e.g. Pascal).

curBraceCount

The current brace count. This is used to prevent nesting of braces in languages that don't work that way.


isQuoted


Returns whether the current token is quoted (escaped by preceding backslashes).

sub isQuoted 
Parameters
self

This object.

lang

The current programming language.

sublang

The current language dialect.


isRubyCloseQuote


Returns whether a token should be interpreted as a Ruby close quote mark.

Parameters
self

This object.

part

The string to check.

Discussion

The value returned depends on whether the close token matches the open token. This is determined based on the value stored in the inRuby variable in this parser state instance. If not in a Ruby string, this returns zero.


isRubyOpenQuote


Returns whether a token should be interpreted as a Ruby open quote mark.

Parameters
self

This object.

part

The string to check.

Discussion

The value returned, if nonzero, indicates the value that should be stored in the inRuby variable in this parser state instance. If already in a Ruby string, this returns zero.


new


Creates a new ParserState object.

sub new 
Parameters
param

A reference to the relevant package object (e.g. HeaderDoc::ParserState->new() to allocate a new instance of this class).


peekBrace


Looks at the top token on the brace stack.

sub peekBrace 
Parameters
self

This object.

Discussion

This is currently only used for the Python parser. Eventually, the main parser should be modified to share this stack instead of using a local variable.


peekBraceMatch


Looks at the top token on the brace stack and returns the closing token that would match it.

Parameters
self

This object.

Discussion

This is currently only used for the Python parser. Eventually, the main parser should be modified to share this stack instead of using a local variable.


popBrace


Pops a token off of the brace stack and returns it.

sub popBrace 
Parameters
self

This object.

Discussion

This is currently only used for the Python parser. Eventually, the main parser should be modified to share this stack instead of using a local variable.


print


Alias for dbprint.

sub print 
Parameters
self

This object.


pushBrace


Pushes a token onto the brace stack.

sub pushBrace 
Parameters
self

This object.

token

The token to push.

Discussion

This is currently only used for the Python parser. Eventually, the main parser should be modified to share this stack instead of using a local variable.
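
The brace stack operations fit together roughly as in the Python sketch below. The BraceStack class, the CLOSERS table, and the behavior on an empty stack are assumptions made for illustration; the count method assumes that braceCount reports the stack depth.

```python
# Hypothetical table of opening tokens and the closers that match them.
CLOSERS = {"{": "}", "(": ")", "[": "]"}


class BraceStack:
    """Sketch of the brace stack operations (not the module's Perl code)."""

    def __init__(self):
        self._stack = []

    def push_brace(self, token):
        self._stack.append(token)

    def pop_brace(self):
        # Assumption: popping an empty stack yields None instead of failing.
        return self._stack.pop() if self._stack else None

    def peek_brace(self):
        return self._stack[-1] if self._stack else None

    def peek_brace_match(self):
        # Closing token that would match the top of the stack, if any.
        return CLOSERS.get(self.peek_brace())

    def brace_count(self):
        # Assumption: the count is simply the stack depth.
        return len(self._stack)


braces = BraceStack()
braces.push_brace("(")
braces.push_brace("[")
```

With "(" and "[" pushed, peek_brace_match() returns "]", and one call to pop_brace() leaves a count of 1.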


resetBackslash


Resets the backslash counter to zero.

Parameters
self

This object.


rollback


Rolls back the parser state to the last state saved by a call to rollbackSet.

sub rollback 
Parameters
self

This object.


rollbackSet


Creates a clone of the object for future rollbacks.

Parameters
self

This object.
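
The relationship between rollbackSet and rollback can be sketched in Python as a deep copy that is saved and later restored. The RollbackState class and its fields dictionary are hypothetical; how the real module stores the clone is not described here.

```python
import copy


class RollbackState:
    """Sketch of a state object with a single saved rollback point."""

    def __init__(self):
        self.fields = {"inString": 0, "braceStack": []}
        self._saved = None

    def rollback_set(self):
        # Deep copy so later changes to nested lists do not leak into the clone.
        self._saved = copy.deepcopy(self.fields)

    def rollback(self):
        # Assumption: rolling back without a saved point is a no-op.
        if self._saved is not None:
            self.fields = copy.deepcopy(self._saved)


state = RollbackState()
state.fields["braceStack"].append("{")
state.rollback_set()
state.fields["braceStack"].append("(")
state.fields["inString"] = 1
state.rollback()
```

After the rollback, the fields hold the values they had when rollback_set was called.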


setHollowWithLineNumbers


Sets the hollow field in this object, and sets the input counter and block offset values for the tree node.

Parameters
self

This object.

treeCur

The tree node to modify, and also the tree node that the hollow field should reference.

blockOffset

The block offset value to set in the tree node.

inputCounter

The input counter value to set in the tree node.


treePop


Pops a tree from the tree stack.

sub treePop 
Parameters
self

This object.

Discussion

This is currently only used for the Python parser. Eventually, the main parser should be modified to share this stack instead of using a local variable.


treePush


Pushes a tree onto the tree stack.

sub treePush 
Parameters
self

This object.

tree

The tree to push.

Discussion

This is currently only used for the Python parser. Eventually, the main parser should be modified to share this stack instead of using a local variable.


Member Data

afterNL

A nondestructive variant of firstpastnl that is available to any programming language (and currently used in TCL). Set to 2 after a newline, 1 during the first non-space token, 0 after.

afterSemi

In shell, initially 0, set to 2 after a double-semicolon or 1 after a semicolon (but never set to 1 after it is already 2). Reset to 0 after the first non-space token. Used in case/esac parsing.

APIODONE

Set on parser state objects that represent declarations within classes so that it does not get processed twice.

ASlabel

The AppleScript label currently being parsed. Each label is treated as a parsed parameter.

attributeState

Used when parsing the GCC __attribute__ info, __asm__ declarations, and other similar pieces of info (certain availability macros, for example).

Legal values are:

  • 0 — Not parsing an attribute.

  • 1 — Just saw the leading token.

  • -1 — Got the leading open parenthesis. Decremented to smaller negative values as additional open parentheses are parsed. Incremented towards 0 as close parentheses are parsed. When it reaches zero, the tree is popped up a level, and attribute parsing is complete.
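
The counting scheme above can be sketched in Python as a single step function. The function name is hypothetical, only __attribute__ is treated as a leading token, and staying in state 1 until an open parenthesis arrives is an assumption.

```python
def step_attribute_state(state, token):
    """Advance a sketch of attributeState by one token.

    Returns (new_state, done), where done is True when the attribute ends.
    """
    if state == 0:
        return (1, False) if token == "__attribute__" else (0, False)
    if state == 1:
        # Assumption: wait here until the leading open parenthesis arrives.
        return (-1, False) if token == "(" else (1, False)
    # Negative states: track the parenthesis nesting depth.
    if token == "(":
        return (state - 1, False)
    if token == ")":
        state += 1
        return (state, state == 0)
    return (state, False)


trace = []
current = 0
for tok in ["__attribute__", "(", "(", "packed", ")", ")"]:
    current, finished = step_attribute_state(current, tok)
    trace.append((current, finished))
```

The trace moves through 1, -1, -2, -2, -1 and finally back to 0, at which point the sketch reports that attribute parsing is complete.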

autoContinue

In Python, this indicates the number of block nesting levels deep the parser is (e.g. the start of a function sets this to 1, an if statement inside that function increases it to 2, and so on).

availability

Contains the contents of an availability macro that was seen by the parser.

availabilityNodesArray

Temporary storage scribbled into by blockParse. Each token in this array is the top of a subtree that begins with one of the "Magic" availability macros in Availability.list (e.g. __OSX_AVAILABLE_BUT_DEPRECATED or __OSX_AVAILABLE_STARTING).

backslashcount

The number of backslashes since the last non-backslash token. Modified by resetBackslash and addBackslash.
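
A small Python sketch shows how such a counter might be used. It assumes that a token counts as quoted when it follows an odd number of backslashes; the class and helper names are hypothetical.

```python
class BackslashCounter:
    """Sketch of the backslash counter (not the module's Perl code)."""

    def __init__(self):
        self.backslashcount = 0

    def add_backslash(self):
        self.backslashcount += 1

    def reset_backslash(self):
        self.backslashcount = 0

    def is_quoted(self):
        # Assumption: an odd run of backslashes escapes the next token.
        return self.backslashcount % 2 == 1


def escaped_flags(tokens):
    """Pair each non-backslash token with whether it was escaped."""
    counter = BackslashCounter()
    flags = []
    for tok in tokens:
        if tok == "\\":
            counter.add_backslash()
        else:
            flags.append((tok, counter.is_quoted()))
            counter.reset_backslash()
    return flags
```

A quote mark after one backslash is reported as escaped, while a quote mark after two backslashes is not.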

basetype

The type name in a simple typedef, e.g. foo in typedef struct foo bar;.

bracePending

Normally 0.

Set to 1 if the parser is expecting a brace at the end of the first part of a struct, union, or enum declaration. If it gets a word token instead, the parser is parsing a variable declaration rather than a type declaration.

Set to 2 if the parser is expecting another word token before changing this variable to 1. For example, if the parser encounters a double colon (::), the next word token is part of the structure name, but a subsequent word token after that would make it a structure variable instead.

braceStack

Stack for brace tokens, including the left curly brace, the start-of-template (sotemplate) value, the left square bracket, the left parenthesis and the opening class marker for class markers that aren't followed by a left curly brace (Objective-C @interface, for example).

This is currently used exclusively for Python. Other languages use a local variable in blockParse.

callbackIsTypedef

Indicates whether the callback is wrapped in a typedef (1) or not (0). Sets priority order of type matching (up one level in blockParseOutside).

callbackName

The name of this callback. This takes priority over all other names, including the sodname.

callbackNamePending

In a typedef of a callback, indicates that the next word token is the name of a callback. (Non-typedef callback names get picked up naturally by the parameter parsing code---if a second set of parsed parameters appear, the first set becomes the callback name.) Values are:

  • 0 — Normal state.

  • 1 — Just saw leading typedef token.

  • 2 — Saw first word after typedef.

  • 3 — Saw parenthesis after first word. Capture the name now.

  • 4 — Saw name token after parenthesis. (Further word tokens mean it's not a callback.)

  • 5 — Saw :: after name. Continue to capture the name here.

categoryClass

The owning class for an Objective-C category.

cbsodname

When a second open parenthesis is encountered in parsing the callback name, this tells the parser that it is really seeing a function that returns a callback instead of a callback variable. The original sodname value is stored here, and the functionReturnsCallback flag is set so that this value can be restored later.

If a typedef contains a second set of parentheses and is not identified as a function returning a callback, the name inside the first set is the callback name, so this gets cleared.

classIsObjC

Set to 1 when an Objective-C class token is encountered. In addition to playing a key role in parsing decisions, this also causes sublang to be set to occ.

classNameConcat

Set to 1 on encountering a period while parsing the name of an IDL class. This causes the next token to be interpreted as an additional part of the name rather than turning the whole thing into a class instance. Set to 0 after encountering the next token.

The bleeding of JavaScript-specific syntax into IDL files is really something of an abuse of the language, but supporting it is necessary to parse certain content.

classNameFound

Set to 1 after a class name has been parsed. (Set back to 0 if double colons are seen.) If a second word token is encountered in this state, it's a variable instead of a class (e.g. class foo *foo_instance;).

classtype

Contains the token that began the current class declaration with any leading @ sign merged. Returned to the caller.

conformsToList

The list of classes to which this protocol conforms. This variable contains a string.

constKeywordFound

Set to 1 after the const keyword is found.

cppMacroHasArgs

Indicates that the #define macro described by the parser state object has an argument list associated with it. Used to determine the definetype attribute for the macro in XML output.

curvarstars

Temporary storage for asterisks before each variable name in a declaration with more than one name. This variable is reset to empty when the parser encounters a comma in such a declaration.

See curvarstars for more information.

declarationEndsAtNewLine

TCL variables, AppleScript variables, and TCL functions end at a newline character. When these are detected (by token matching), this variable is set to 1.

elseContents

The contents of the else part of an if/else conditional. Only valid if $HeaderDoc::parseIfElse is 1.

endgame

In Python, this variable determines whether the declaration is done after this token, in which case a new parser state (sibling) must be added.

  • 0 — Nope.

  • 1 — In this state if we got a newline and autoContinue is 0 (we're not in a nested block). We're done after this token, but it should be added to the parse tree.

  • 2 — seenLeading is less than leadspace. Don't add this token to the parse tree because it's part of the next declaration.

  • 3 — seenLeading is less than parentLeading. Don't add this token to the parse tree because it's part of the next declaration.

endOfString

In shell (and Perl), set to the token after a << that is treated as the start of a multi-line string. Reset to an empty string upon leaving the multi-line string. While in this state, inString is set to 13.

endOfTripleQuote

The number of quote tokens in a row when potentially leaving a triple-quoted string.

This value is reset to zero upon encountering a non-quote token.

If this reaches 2, the next quote mark causes the three quotes to be combined into a single token, and the value is reset to 0.
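
The counting rule can be sketched in Python as follows. The function name is hypothetical, and the sketch merges any three consecutive quote tokens without checking whether the parser is actually inside a triple-quoted string.

```python
def merge_closing_quotes(tokens):
    """Merge each run of three '"' tokens into a single '\"\"\"' token."""
    out = []
    run = 0  # plays the role of endOfTripleQuote
    for tok in tokens:
        if tok == '"':
            if run == 2:
                # Third quote in a row: replace the two already emitted.
                del out[-2:]
                out.append('"""')
                run = 0
            else:
                out.append(tok)
                run += 1
        else:
            out.append(tok)
            run = 0  # any non-quote token resets the count
    return out
```

Quote marks separated by other tokens are left alone because the counter is reset before it can reach 2.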

endOfTripleQuoteToken

When a quote mark is seen, the object is added here so that the parser can easily go back to it later if it turns out to be a triple quote. This is used to merge the three quote marks into a single token in the parse tree.

extendsClass

The name of the class that this class extends.

extendsProtocol

Stores the name of the Objective-C protocol that this protocol extends (the tokens within angle brackets).

externC

In C, when the extern token is encountered, this flag is set to 1 and the rollbackSet function is called to set a rollback point. The declaration to date is also stored in the preExternCdeclaration field at this time.

If what comes after this token is C, then the previous declaration is restored and the parser state is rolled back to this point.

firstpastnl

In shell (and Perl), set to 1 after a newline until the first non-space token.

followingrubyrbrace

A while or other statement right after an end statement (on the same line) is treated as applying to the preceding block instead of starting a new one.

Set to 1 when end is encountered, 0 at following newline.

forceClassDone

Set to 1 after reaching the left brace after a class. This essentially tells the parser to stop appending superclass tokens to forceClassSuper.

forceClassName

When the parser sees a colon (indicating a superclass name is coming), or the keywords extends or implements in Java, etc., this gets a copy of the class name so that it doesn't get overwritten.

forceClassSuper

Holds the superclass information after a colon token. Used in conjunction with forceClassName.

freezereturn

Once the parser passes the opening curly brace of a function body, the return type information is frozen. This prevents other things that look too much like function declarations from overwriting the return type info.

freezeStack

Copy of the pplStack when the stack is frozen by stackFrozen.

frozensodname

A copy of the sodname variable frozen at a particular point in time. Freezing occurs when the parser enters certain contexts like parameter parsing because the sodname field would otherwise get overwritten by other things.

FULLPATH

The full path for the file containing the declaration that this parser state describes. By storing the info here, it is available for debug messages during subparse operations (reprocessing declarations nested within class declarations).

functionContents

The contents of a function (or, when parsing a switch statement, the contents of the struct body).

functionReturnsCallback

Indicates that the parser has seen a function that returns a callback. If set, the parser restores the value from cbsodname into the sodname field.

This is incremented to 2 while parsing the parameters for the callback, and decremented back to 1 at the end.

gatheringObjCReturnType

While parsing an Objective-C method, this gets set to 1 upon seeing an open parenthesis, 2 at the bottom of the loop. While at 2 or greater, tokens are appended to the occmethodreturntype variable.

This value is incremented when additional open parentheses are encountered, and is decremented when close parentheses are encountered. When it reaches 1 again, it is reset to 0.

HeaderDoc::ParserState::VERSION

The revision control revision number for this module.

hollow

This variable holds a reference to the node in the parse tree where the parser state should be stored when the current declaration has been fully parsed.

ifContents

The contents of the if part of an if/else conditional (not including the test expression). Only valid if $HeaderDoc::parseIfElse is 1.

ignoreAvailabilityMacros

Set high within the definition for any of the built-in availability macros so that those macro definitions can be properly parsed even if they refer to other availability macros.

implementsClass

The name of the abstract class that this class implements.

inBitfield

Indicates that we are at a token that might be the start of a C bitfield. This goes high when a colon occurs. If the next token is a non-colon (i.e. it's not ::), startOfDec gets reset to zero to lock the name and stuff.

inBrackets

Indicates the number of levels of nested square brackets the current token is within.

inCase

In shell, initially 0, incremented upon entering a case statement, and decremented on exit.

inChar

Inside a single-quoted character/string literal.

inClass

Indicates whether we are in a class. Possible values are:

  • 0 — Not in a class declaration.

  • 1 — Enters this state when a class keyword is encountered (except @protocol or @interface).

  • 2 — Enters this state when the @interface class keyword is encountered. Returns to 1 when a colon or close parenthesis is encountered.

  • 3 — Enters this state on the first word token found while in state 2. Returns to 1 when colon or close parenthesis is encountered.

inClassConformingToProtocol

Set to 1 when a conforming left angle bracket (<) is seen in an @protocol declaration.

Set to 2 after that token. While this value is 2, tokens are gathered in the conformsToList string.

Reset to 0 upon seeing the matching right angle bracket (>).
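
The three steps can be sketched in Python as a small loop. The function name is hypothetical, and appending tokens without separators is an assumption about how the conformsToList string is built.

```python
def gather_conforms_to(tokens):
    """Collect the tokens between < and > into one string."""
    state = 0               # plays the role of inClassConformingToProtocol
    conforms_to_list = ""
    for tok in tokens:
        if state == 2:
            if tok == ">":
                state = 0   # matching right angle bracket ends gathering
            else:
                conforms_to_list += tok
        elif state == 0 and tok == "<":
            state = 1       # just saw the conforming left angle bracket
        if state == 1:
            state = 2       # after that token, start gathering
    return conforms_to_list
```

For the tokens of `@protocol Foo <NSObject, NSCopying>` (with whitespace tokens omitted), the sketch returns the string `NSObject,NSCopying`.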

inComment

Indicates whether we are in a multi-line comment. See also the ppSkipOneToken local variable in blockParse.

inEnum

Set to 1 while inside an enumeration.

inExtends

Set to 1 when the extends keyword is encountered in Java. Reset to 0 when an implements keyword occurs.

inGiven

Set to 1 when a given token is seen in AppleScript. Reset to 0 at the following newline.

INIF

Inside an if statement. Only used if the HeaderDoc::parseIfElse variable is set to 1.

inImplements

Set to 1 when the implements keyword is encountered in Java. Reset to 0 when an extends keyword occurs.

inInlineComment

Indicates whether we are in a single-line comment (i.e. one beginning with a hash or two slashes).

Initial value is 4. Decremented to 3 at end of loop. Decremented to 2 after next token, then 1, increased to 3 if 1 and saw exclamation point. I don't remember what this code does, and it is probably wrong.

initbsCount

Contains the number of braces on the brace stack when this parser state was created. When the number of braces drops below this level, this parser state must go away.

inLabel

Set to 1 when a label token is seen in AppleScript. (See the labelregexp variable in parseTokens for a list of these tokens.)

Reset to 0 after the next word token, at the following newline, or when a given token is encountered.

inMacro

Indicates that the current declaration is a #define macro or similar. Values are:

  • 0 — Not in a macro.

  • 1 — Got leading #.

  • 2 — Got something else after # (error case).

  • 3 — Got #define.

  • 4 — Got another C preprocessor token, including #if, #ifdef, #ifndef, #endif, #else, #undef, #elif, #error, #warning, #pragma, #import, and #include.

See also inMacroLine.

inMacroLine

Used for handling macros in the middle of declarations.

inMacroTail

Set high upon encountering the first whitespace after a macro name. Once this key is set, the value of the cppMacroHasArgs key is no longer set upon encountering an open parenthesis.

INMODULE

Indicates that the parser is in a module declaration. Possible values are:

  • 0 — Not in a module declaration.

  • 1 — Saw the module token.

  • 2 — Unused vestigial state.

  • 3 — Unused vestigial state.

inOfIn

Set to 1 when AppleScript of or in token is encountered. Reset to 0 on newline or after encountering the next word token and appending it to OfIn.

inOperator

In a C++ operator declaration.

inPrivateParamTypes

Set to 1 after the colon in a C++ method declaration. Indicates that the parser is parsing the private parameter declarations for the method.

inProtocol

Possible values are:

  • 0 — Not in a protocol.

  • 1 — Saw @protocol token.

  • 2 — After next word token after @protocol. Returns to this state after closing > token. In this state, it is capturing tokens into the extendsProtocol field.

  • 3 — Inside conforming angle braces (<).

inputCounter

The input counter. Used for restoring the value during a subparse (reprocessing a declaration within an already-parsed class).

inrbraceargument

Some languages take an additional argument for their equivalent of a right brace. For example, in AppleScript, a tell block ends with end tell. In effect, end terminates the block, but the next token does not start the next block.

If rbracetakesargument is set in the object returned by a call to parseTokens, then that trailing tell is included in the trailer for the block.

inRuby

In a Ruby quote. Quotes in Ruby are much more complex than in any sane language, so they get their own variable....

inRubyBlock

The character that began the current Ruby block. For example, the << token.

inRubyClass

Normally 0.

Set to 1 when a Ruby class declaration is encountered.

Set to 2 when the first newline after a Ruby class is encountered.

inString

Inside a double-quoted string literal if 1, else 0. Set to 13 for a multi-line string (e.g. FOO <<EOF...).

inTCLRegExpCommand

In TCL, set to 1 when a command is encountered that takes an unquoted (non-string) regular expression as an argument.

Set to 0 upon entering the regular expression or when a newline or carriage return is encountered.

inTemplate

Within C++ template braces (< and >). Also used for IDL bracket notation.

inTypedef

Set to 1 while inside a C typedef.

inUnion

Set to 1 when the union keyword is encountered. Remains high until the end of this declaration.

isConstructor

Set to 1 after the constructor token is seen in TCL (or equivalent in other languages). (Not used in C++.)

ISFORWARDDECLARATION

Indicates whether a class declaration is a forward declaration (1) or the actual class declaration (0). That way, the resulting object is a Var object instead of a CPPClass object.

isProperty

Set to 1 after a keyword is parsed that indicates that this variable is an Objective-C property.

isStatic

Set to 1 when static or equivalent (e.g. my in perl) is seen. Used to determine whether a variable is file-scoped or global.

justLeftStringToken

After an empty string (""), this gets set high in Python. That way, if the next token is also a double quote mark, the opening triple quote of a triple-quoted string can be easily detected.

kr_c_function

Indicates that the current code is a K&R-style C function (with separate parameter type declarations), e.g.

             int foo(a, b)
             int a;
             char *b;
             { ... function body ... }

kr_c_name

Contains the name of a K&R C function. The normal function name detection code would fail hard because of the existence of multiple declarations.

lang

The language that the parser was parsing when this parser state was created.

lastNLWasQuoted

In Python, set to 1 if the last newline was preceded by a backslash, else unset. Used to determine whether to care about the leading whitespace count.

lastpart

Holds the last part before the one being processed by the Python parser. Similar to the local variable of the same name in blockParse.

lastsymbol

The last token, wiped by braces, parentheses, and so on. It is used primarily for handling names of typedefs. In general, when writing code, except in a few specific contexts, you probably want the local variable lasttoken in blockParse instead. Also related are the local variables lastnspart and lastchar.

lastTreeNode

The last node in the parse tree rooted at this node. This node is marked with EODEC in parse tree dumps.

For example, the lastTreeNode value for a class declaration would point to the closing brace or semicolon at the end of the class.

Note that for nodes within the class, each nested declaration also has a lastTreeNode value that points to the end of that nested declaration.

leadspace

The number of leading spaces in the first line since the parser state was created.

Initial value is -1 indicating that the value has not yet been determined. This value does not get set until the first line that contains at least one non-space token after that whitespace and before the trailing newline.

If the current line's leading space (in seenLeading) drops to this level or lower, the end of block is considered to have been reached.
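
The end-of-block test can be sketched in Python as follows. The helper names are hypothetical, and letting blank lines pass through without affecting the comparison is an assumption.

```python
def leading_spaces(line):
    """Count the spaces at the start of a line."""
    return len(line) - len(line.lstrip(" "))


def block_lines(lines):
    """Return the lines that belong to the block opened by the first line."""
    leadspace = -1  # -1 means "not yet determined"
    block = []
    for line in lines:
        if not line.strip():
            block.append(line)   # blank lines never set or test leadspace
            continue
        seen_leading = leading_spaces(line)
        if leadspace == -1:
            leadspace = seen_leading
        elif seen_leading <= leadspace:
            break                # indentation dropped: end of block
        block.append(line)
    return block
```

Given a function definition followed by a statement at the same indentation, the sketch stops just before that statement.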

leavingComment

Set to 1 on an end-of-comment token so that the ending comment token won't get added to the return type.

macroNoTrunc

Set to 1 to avoid truncating the body of macros that don't begin with a parenthesis or brace. Otherwise 0.

MODULE

Temporary storage for the name of a module. The module token is treated much like an @indexgroup tag.

name

The name of a data type parsed by the main (namePending) parser. This is the lowest priority name; it gets overridden by the sodname name more often than not.

nameList

In Pascal, upon seeing a colon (after a variable name), the sodname and sodtype fields are concatenated together (with a space) into this field. This later becomes the variable name.

namePending

Indicates that the parser expects a name:

  • Set to 1 after the keyword function, procedure, sub, or other similar function delimiter tokens.

  • Set to 2 after the keyword typedef, struct, union, and so on because the name is the second non-keyword token after this one. Decremented at the end of the token loop.

namepending

Python-specific parser state variable. The initial value is 1. Set high after a class keyword or a def keyword. Set low after a word token (the name).

nestAfter

Indicates that after inserting this token into the parse tree, future tokens should be nested under this one.

newlineIsSemi

In Ruby, an end marks the end of a function, so treat the newline after it as the end of the declaration.

NEXTTOKENNOCPP

Turns off the C preprocessor temporarily.

  • 0 — Normal operation.

  • 1 — Just saw #if. Goes to 3 if you get a defined token.

  • 2 — Just saw #ifdef. Don't do C preprocessing for the symbol that follows. Goes to 0 after the next word token.

  • 3 — In #if defined. Don't do C preprocessing for the symbol that follows, and drop back to state 1 after a word token.
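
The transitions above can be sketched in Python as a filter that reports which word tokens would still be handed to the C preprocessor. The function name is hypothetical, and resetting to state 0 at a newline is an assumption that the list above does not state.

```python
def tokens_seen_by_cpp(tokens):
    """Return the word tokens that remain subject to C preprocessing."""
    state = 0  # plays the role of NEXTTOKENNOCPP
    seen = []
    for tok in tokens:
        if tok == "\n":
            state = 0            # assumption: a directive ends at the newline
        elif tok == "#if":
            state = 1
        elif tok == "#ifdef":
            state = 2
        elif state == 1 and tok == "defined":
            state = 3
        elif tok.isidentifier():
            if state == 2:
                state = 0        # symbol after #ifdef is not preprocessed
            elif state == 3:
                state = 1        # symbol after defined is not preprocessed
            else:
                seen.append(tok)
    return seen


sample = ["#if", "defined", "(", "FOO", ")", "&&", "BAR", "\n",
          "#ifdef", "BAZ", "\n", "QUX"]
```

In the sample, FOO and BAZ are skipped because they are being tested for definition, while BAR and QUX are still preprocessed.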

noInsert

Set high to indicate that the next curly brace should not result in a parser state insertion. Used when, for example, a curly brace appears on its own prior to any actual declaration.

occmethod

Value is 1 if this is an Objective-C method, else 0/undefined.

occmethodname

The name of this Objective-C method. As new fragments get parsed, this gets extended to be foo:bar:baz:

occmethodreturntype

Stores the return type for an Objective-C method.

occmethodtype

The Objective-C method type. Contains either a - or + character.

occparmlabelfound

Possible values are:

  • -2 — Colon encountered without seeing a label. In this state, the token is captured as the name of the parameter because the parameter has no label. After a word token is captured, the state returns to 0 because the next token is the name of the next parameter.

  • -1 — Colon encountered while in state 1. The parameter name follows. After a word token is captured, this gets incremented to 0 because the next token is the name of the next parameter.

  • 0 — Default state. If colon is encountered, goes to state -2.

  • 1 — Enters this state on first word token that's not in parentheses (thus skipping types in Objective-C methods). If colon is encountered, go to state -1.

occSuper

The superclass of an Objective-C class.

OfIn

Set to the actual of or in token encountered when parsing AppleScript. The word token after it is appended to this variable (delimited by a space).

onlyComments

Initially, this is set to 1. As soon as the parser sees a valid code token, this variable is set to 0. This serves two purposes. If the parser sees an opening curly brace before this gets set to 0, it restarts parsing without returning. (See continue_no_return in blockParse.) Also, once the parser has seen a code token, it will not allow the C preprocessing code to take over and return a #define that appears in the middle of a declaration.

optionalOrRequired

Either @optional or @required, depending on the current state of the parser.

parentLeading

Holds the number of leading spaces at the beginning of the line for the enclosing block.

If the current line's leading space drops to this level or lower, the end of block is considered to have been reached.

parsedParam

Temporary storage for the parsed parameter being parsed. Used only by the Python parser. (The main block parser uses a local variable, $parsedParam instead.)

parsedParamAtBrace

Any in-progress parsed parameters when we enter a brace.

parsedParamList

An array of parsed parameter strings. When parsing a function, these are the parameters to the function. When parsing a struct or similar, these are the fields in the structure.

parsedParamParse

Indicates parameter parsing is in progress. Possible values are:

  • 0 — Not parsing parameters

  • 1 — Parsing semicolon-delimited parameters.

  • 2 — About to parse semicolon-delimited parameters.

  • 3 — Parsing comma-delimited parameters.

  • 4 — About to parse comma-delimited parameters.

  • 5 — Parsing whitespace-delimited parameters.

  • 6 — About to parse space-delimited parameters.

The value is set to the even-numbered variant first, which causes the current token (usually a brace or parenthesis) to be skipped and the value to be decremented by 1, after which all future tokens are parsed.
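
The even/odd convention can be illustrated with a short Python sketch. It is a simplified model, not the blockParse implementation: the helper name collect_params, the pre-split token list, and the handling of only the comma-delimited pair (4, then 3) are assumptions for illustration.

```python
def collect_params(tokens, start_token="(", end_token=")", delimiter=","):
    """Toy model (hypothetical helper) of the even -> odd parsedParamParse step."""
    parsed_param_parse = 0          # 0: not parsing parameters
    current = []
    params = []
    for tok in tokens:
        if parsed_param_parse == 0 and tok == start_token:
            parsed_param_parse = 4  # about to parse comma-delimited parameters
        if parsed_param_parse == 0:
            continue
        if parsed_param_parse % 2 == 0:
            # Even value: skip the opening token itself, then decrement by 1.
            parsed_param_parse -= 1
            continue
        if tok == end_token:
            if current:
                params.append(" ".join(current))
            current = []
            parsed_param_parse = 0
        elif tok == delimiter:
            params.append(" ".join(current))
            current = []
        else:
            current.append(tok)
    return params
```

The point of the two-step value is that the token that triggers parameter parsing (here the opening parenthesis) never ends up inside the first parameter.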

parsedParamStateAtBrace

The state of parameter parsing when we enter a brace.

pendingBracedParameters

Used in languages where parameters are wrapped in curly braces. A value of 1 indicates that the next curly brace should start parameter parsing. A value of 2 indicates that such a brace has been parsed. The default value is 0.

perlClassName

Stores a Perl class name (this::that::the_other). When a :: token is encountered, :: is appended (if this variable is nonempty), followed by sodname.

popAfter

In the Python parser, indicates that a new $treeCur should be popped from the stack (treeStack field) after inserting this node.

popAtEnd

Set to 1 if the parser sees a colon while bracePending is set. This indicates that if this declaration ends at the end of this line, the parse tree (which has become nested by the colon) needs to be popped back out.

posstypes

List of type names that follow after a complex typedef, e.g. bar and baz in the declaration typedef struct foo { ...} bar, baz;.

posstypesPending

The next token should go into the posstypes variable.

pplStack

A stack of parsed parameter lists. Used to handle fields and parameters in nested structures/callbacks.

preclasssodtype

The contents of sodtype when class or other similar keyword is encountered. This is used to restore things when class appears as part of a function's return type (e.g. static class foo *returnsfoo();).

preEqualsSymbol

The last symbol before the equals sign. Used to obtain the name of a variable with an initial value.

preExternCcurline

The value of curline is stored in this variable when the extern token is encountered. This value is rolled back when rollbackPending is set. See externC for details.

preExternCdeclaration

In C, when the extern token is encountered, the declaration to date is stored here. See externC for details.

prekeywordsodname

See prekeywordsodtype.

prekeywordsodtype

If startOfDec is 2, the parser has seen proc, sub, function, or an equivalent keyword, or has seen the first token of the declaration. Either way, the start-of-declaration parser is expecting a name. If it sees a keyword, the sodtype variable is copied into prekeywordsodtype and the sodname variable is copied into prekeywordsodname.

This basically fixed a bug where the setter keyword wrecked things if it appeared after the name of an Objective-C property.

preTemplateSymbol

Used primarily for determining whether this is a function or a function template.

pushedfuncbrace

Set to 1 when a sofunction token is seen in the few languages that both use this token and do not precede the function body with any other opening brace.

pushParserStateOnBrace

Set to 1 when a keyword is encountered that should cause the parser state to be pushed the next time the tree is nested (a class keyword, specifically).

Set to 2 when the colon at the end of the class declaration is parsed. After the token is pushed onto the tree, the parser state is pushed onto the parser stack, and the value is incremented to 3 so that it does not get pushed again.

returntype

The return type of a function, callback, or (non-Objective-C) method.

rollbackPending

Set to 1 during parsing to indicate that the state should be rolled back when done handling this token. After this token, the parser calls rollback to roll back to the previously saved state.

rollbackState

A temporary copy of the parser state that the parser can roll back to under certain circumstances. Set by rollbackSet and used by rollback.
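
The save-and-restore pattern behind rollbackSet and rollback can be sketched in Python as follows. This is a minimal model under stated assumptions, not the module's Perl code: the class ToyParserState, the single curline field, and the choice to clear the saved copy after a rollback are all hypothetical.

```python
import copy

class ToyParserState:
    """Hypothetical stand-in for a parser state with a rollback point."""

    def __init__(self):
        self.curline = ""
        self.rollback_state = None   # plays the role of rollbackState

    def rollback_set(self):
        # Clone the whole object so later changes cannot leak into the snapshot.
        snapshot = copy.deepcopy(self)
        snapshot.rollback_state = None   # assumption: snapshots are not nested
        self.rollback_state = snapshot

    def rollback(self):
        saved = self.rollback_state
        if saved is None:
            return                       # nothing to roll back to
        # Overwrite every field with the saved values.
        self.__dict__.update(copy.deepcopy(saved.__dict__))
```

A deep copy is used so that nested containers (stacks, lists of parameters) are restored as well, which a shallow copy would not guarantee.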

seenBraces

The opening brace of functions/methods and function-like macros has been seen by the parser, so the parser is now in a state where it does nothing but walk to the matching close brace.

seenElse

If $HeaderDoc::parseIfElse is 1, this flag is set to indicate that the tree associated with this parser state contains an else clause.

seenIf

If $HeaderDoc::parseIfElse is 1, this flag is set to indicate that the tree associated with this parser state contains an if clause.

seenLeading

The number of leading spaces on the current line.

If this indentation drops to or below the indentation in leadspace (the indentation of the first line inside this nesting level), the block is done. If leadspace is -1 (and thus uncheckable), the block is done when this value drops to or below the value in parentLeading (the nesting level above this one).

seenMacroName

Set high after the macro name has been parsed. If this is set and inMacroTail is not set, if a parenthesis is encountered, it represents the start of an argument list, which causes cppMacroHasArgs to be set.

seenMacroPart

Indicates that we've seen at least one non-whitespace token after the #define. (This means the name should be locked, among other things.)

seenMacroStart

Set high after a #define token has been parsed. Once set, the seenMacroName key is set on the next word token.

seenTilde

Indicates that we are in a C++ destructor.

seenToken

Used by the Python parser to determine whether it has seen the first non-space token in a line. This disables leading space counting.

setHollowAfter

Used by the Python parser to indicate that after this token has been inserted into the tree, the hollow field should be set to the resulting tree node.

setleading

In Python, indicates that this is the first line of a nonempty declaration encountered, so the next leading space should not result in any comparisons of indentation.

simpleTDcontents

The guts of a simple typedef.

simpleTypedef

Indicates a typedef without braces (0/1). This is used for three things:

  • To determine whether the next brace starts field parsing or not. (Field parsing starts at the first brace.)

  • To determine whether the namelist variable contains tag names for a complex typedef. (Tag names appear after struct and before the opening curly brace.) In the case of a simple typedef, this would contain bogus data.

  • In parsing MIG declarations, to determine whether a return type was specified.

skiptoken

Set to 1 when the parser state has just been pushed so that the hollow value won't point to (at least) the next token.

sodbrackets

Captures the data between square brackets when startOfDec is 2. This state typically occurs after the first non-symbol token in the line. Used for temporarily storing the bracketed attributes in an IDL file.

sodclass

The sodclass variable contains a standardized name for the type being parsed, specifically one of: variable, function, enum, or class.

The sod stands for "start of declaration". This variable, along with sodtype, sodname, and sodclass are used for parsing functions and callbacks (but not the names of callbacks).

These parser variables are controlled by the startOfDec counter variable. With a few exceptions (callback names, in particular, come to mind), the startOfDec parser takes precedence over the other parsers.

sodname

The sodname variable contains the parsed name.

The sod stands for "start of declaration". This variable, along with sodtype, sodname, and sodclass are used for parsing functions and callbacks (but not the names of callbacks).

These parser variables are controlled by the startOfDec counter variable. With a few exceptions (callback names, in particular, come to mind), the startOfDec parser takes precedence over the other parsers.

sodtype

The sodtype variable contains code symbols that may be used for various purposes.

The sod stands for "start of declaration". This variable, along with sodtype, sodname, and sodclass are used for parsing functions and callbacks (but not the names of callbacks).

These parser variables are controlled by the startOfDec counter variable. With a few exceptions (callback names, in particular, come to mind), the startOfDec parser takes precedence over the other parsers.

sodtypeclasstoken

Contains the token that began the current class declaration. Used to restore the class token if it is really just the start of a variable name.

stackFrozen

Once the parser passes the opening curly brace of a function body, the parsed parameter stack is frozen. This prevents other things that look like parameter lists (e.g. the expression of an if or while statement) from getting parsed.

startOfDec

The control variable for the startOfDec parser. Used to control when the variables sodname and sodtype get filled.

storeDec

Temporary storage for nested declarations, used to build up the vestigial plain text declaration.

structClassName

The last symbol before a colon in a struct declaration. Used for structs that look like this:

struct foo : bar {...}

In this case, the actual name of the struct is foo, so that token gets stored in structClassName and restored later.

sublang

The language dialect that the parser was parsing when this parser state was created (e.g. cpp for C++).

temponlyComments

When a semicolon is encountered, if the parser might be parsing a parameter list that is semicolon-delimited (parsedParamParse <= 2), this gets the value of the onlyComments field, and the value is replaced at the end of the loop.

If this was not the first character in the overall declaration, this has the effect of preventing the onlyComments value from being reset by the semicolon handler.

If this was the first character in the overall declaration, the value of onlyComments was already zero, so this has no effect.

Note: this could probably be replaced by a flag to simply tell the various bits of code not to change the onlyComments value, but it's probably not worth the effort for the limited simplification this would cause.

treePopTwo

This gets set to 1 when a token is encountered that causes the tree to be nested but has no explicit ending token (e.g. +, -, or :). Thus, when the enclosing context ends and the parse tree gets popped from the treeStack stack, the code pops a second time for this token.

treeStack

A stack of parse trees. These are pushed and popped at various points during the parse process as braces, colons, parentheses, etc. are encountered. The behavior is controlled by the variables treeNest, treeSkip, treePopTwo, and treePopOnNewLine (most of which are local variables in blockParse and/or pythonParse).

This is currently used exclusively for Python. Other languages use a local variable in blockParse by the same name.

typestring

The outer type keyword (in C, struct, union, enum, or typedef).

value

The parsed value of a constant.

valuepending

This variable goes high after an equals sign, indicating that the next tokens contain the value of the constant.

variableNameConcat

Tells the parser to concatenate extra bits onto the name of a function, variable, etc. For example, foo.bar is (ostensibly) a valid name in Java, JavaScript, and IDL.

Set to 2 on encountering a period while parsing the name of a variable, function, etc. Goes down to 1 when the period is concatenated, zero when the next word token is concatenated.
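
The 2, 1, 0 countdown can be illustrated with a small Python sketch. The function parse_dotted_name and its pre-split token list are assumptions for illustration; the real parser performs this inside its main token loop.

```python
def parse_dotted_name(tokens):
    """Toy model (hypothetical helper) of the variableNameConcat countdown."""
    name = ""
    variable_name_concat = 0
    for tok in tokens:
        if tok == "." and name:
            variable_name_concat = 2        # period seen while parsing a name
        if variable_name_concat == 2:
            name += tok                     # concatenate the period
            variable_name_concat = 1
        elif variable_name_concat == 1:
            name += tok                     # concatenate the next word token
            variable_name_concat = 0
        elif not name:
            name = tok                      # first word of the name
        else:
            break                           # anything else ends the name
    return name
```

Because the counter returns to zero after each word, a name with several periods (a.b.c) is handled by repeating the same cycle.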

variablenames

Contains a hash table mapping variable names to values when parsing variable declarations that define more than one variable.

variablestars

Contains a hash table mapping variable names to the number of leading * characters before them. By separating this from the type information, it ensures that variables within declarations that contain a mixture of pointer and nonpointer types (char *a, b, **c;, for example) are typed correctly.

The variable curvarstars is used for temporary storage of subsequent groups of asterisks.
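
As a rough illustration of why the asterisk count is kept per name, the following Python sketch splits the example declaration into a type and a name-to-stars mapping. It is a simplification: the helper split_declarators and its tokenizer are hypothetical, and every group of asterisks is routed through curvarstars, which may differ from how the real parser treats the first group.

```python
import re

def split_declarators(declaration):
    """Toy model (hypothetical helper) of variabletype, variablestars, and curvarstars."""
    tokens = re.findall(r"[A-Za-z_]\w*|\*|,|;", declaration)
    variabletype = None
    variablestars = {}
    curvarstars = ""
    for tok in tokens:
        if tok == "*":
            curvarstars += "*"          # collect asterisks for the next name
        elif tok == ",":
            curvarstars = ""            # reset at each comma
        elif tok == ";":
            break
        elif variabletype is None:
            variabletype = tok          # first word is the shared type
        else:
            variablestars[tok] = len(curvarstars)
    return variabletype, variablestars
```

For char *a, b, **c; this keeps char as the shared type while recording that a has one asterisk, b has none, and c has two.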

variabletype

Temporary storage of the variable type (e.g. int) used to prevent its destruction when parsing variable declarations that define more than one variable.

waitingForExceptions

Set to 1 when Ruby parsing encounters a left angle bracket (<) in a class declaration.

waitingForTypeInformation

By default, 0.

Set to 2 on a colon within a variable declaration.

If 2, set to 1 on non-space.

If 1, set to 3 on open parenthesis, else -1 if non-space.

Basically, if this goes to 3, the variable is a Pascal enumerated type, e.g.

pascal_var_e: (apple, pear, banana, orange, lemon);

Otherwise, the declaration is just a normal variable.


afterNL


A nondestructive variant of firstpastnl that is available to any programming language (and currently used in TCL). Set to 2 after a newline, 1 during the first non-space token, 0 after.

$self->{afterNL}

afterSemi


In shell, initially 0, set to 2 after a double-semicolon or 1 after a semicolon (but never set to 1 after it is already 2). Reset to 0 after the first non-space token. Used in case/esac parsing.

$self->{afterSemi}

APIODONE


Set on parser state objects that represent declarations within classes so that it does not get processed twice.

$self->{APIODONE}

ASlabel


The AppleScript label currently being parsed. Each label is treated as a parsed parameter.

$self->{ASlabel}

attributeState


Used when parsing the GCC __attribute__ info, __asm__ declarations, and other similar pieces of info (certain availability macros, for example).

Legal values are:

  • 0 — Not parsing an attribute.

  • 1 — Just saw the leading token.

  • -1 — Got the leading open parenthesis. Decremented to smaller negative values as additional open parentheses are parsed. Incremented towards 0 as close parentheses are parsed. When it reaches zero, the tree is popped up a level, and attribute parsing is complete.

$self->{attributeState}
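
The negative-depth counting used by attributeState can be sketched in Python. This is an illustrative model only: the function attribute_span, the default leading token, and the returned token list are assumptions, and the real parser pops the parse tree rather than returning a list.

```python
def attribute_span(tokens, leading_token="__attribute__"):
    """Toy model (hypothetical helper): collect the tokens of one attribute."""
    attribute_state = 0                 # 0: not parsing an attribute
    captured = []
    for tok in tokens:
        if attribute_state == 0:
            if tok == leading_token:
                attribute_state = 1     # just saw the leading token
                captured.append(tok)
            continue
        captured.append(tok)
        if tok == "(":
            # First open parenthesis gives -1; deeper ones go more negative.
            attribute_state = -1 if attribute_state == 1 else attribute_state - 1
        elif tok == ")" and attribute_state < 0:
            attribute_state += 1        # move back toward 0
            if attribute_state == 0:
                break                   # attribute parsing is complete
    return captured
```

Using negative values for depth leaves 1 free to mean "leading token seen, no parenthesis yet", so a single integer covers both phases.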

autoContinue


In Python, this indicates the number of block nesting levels deep the parser is (e.g. the start of a function sets this to 1, an if statement inside that function increases it to 2, and so on).

$self->{autoContinue}

availability


Contains the contents of an availability macro that was seen by the parser.

$self->{availability}

availabilityNodesArray


Temporary storage scribbled into by blockParse. Each token in this array is the top of a subtree that begins with one of the "Magic" availability macros in Availability.list (e.g. __OSX_AVAILABLE_BUT_DEPRECATED or __OSX_AVAILABLE_STARTING).

$self->{availabilityNodesArray}

backslashcount


The number of backslashes since the last non-backslash token. Modified by resetBackslash and addBackslash.

$self->{backslashcount}

basetype


The type name in a simple typedef, e.g. foo in typedef struct foo bar;.

$self->{basetype}

bracePending


Normally 0.

Set to 1 if the parser is expecting a brace at the end of the first part of a struct, union, or enum declaration. If it gets a word token instead, the parser is parsing a variable declaration rather than a type declaration.

Set to 2 if the parser is expecting another word token before changing this variable to 1. For example, if the parser encounters a double colon (::), the next word token is part of the structure name, but a subsequent word token after that would make it a structure variable instead.

$self->{bracePending}

braceStack


Stack for brace tokens, including the left curly brace, the start-of-template (sotemplate) value, the left square bracket, the left parenthesis and the opening class marker for class markers that aren't followed by a left curly brace (Objective-C @interface, for example).

This is currently used exclusively for Python. Other languages use a local variable in blockParse.

$self->{braceStack}
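
The brace stack and the pushBrace, peekBrace, popBrace, and peekBraceMatch operations can be modeled with a short Python sketch. The class BraceStack, the CLOSERS table, and the choice to return None on an empty stack are assumptions for illustration; the real closing tokens depend on the language being parsed.

```python
# Assumed opener-to-closer table for the example; not taken from HeaderDoc.
CLOSERS = {"{": "}", "(": ")", "[": "]", "<": ">", "@interface": "@end"}

class BraceStack:
    """Hypothetical stand-in for the braceStack field and its helpers."""

    def __init__(self):
        self._stack = []

    def push_brace(self, token):
        self._stack.append(token)

    def peek_brace(self):
        # Top token, or None if the stack is empty (an assumption).
        return self._stack[-1] if self._stack else None

    def pop_brace(self):
        return self._stack.pop() if self._stack else None

    def peek_brace_match(self):
        # Closing token that would match the top of the stack.
        return CLOSERS.get(self.peek_brace())
```

Looking up the closer from the top of the stack lets the parser ask "what would end the current context?" without popping anything.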

callbackIsTypedef


Indicates whether the callback is wrapped in a typedef (1) or not (0). Sets priority order of type matching (up one level in blockParseOutside).

$self->{callbackIsTypedef}

callbackName


The name of this callback. This takes priority over all other names, including the sodname.

$self->{callbackName}

callbackNamePending


In a typedef of a callback, indicates that the next word token is the name of a callback. (Non-typedef callback names get picked up naturally by the parameter parsing code: if a second set of parsed parameters appears, the first set becomes the callback name.) Values are:

  • 0 — Normal state.

  • 1 — Just saw leading typedef token.

  • 2 — Saw first word after typedef.

  • 3 — Saw parenthesis after first word. Capture the name now.

  • 4 — Saw name token after parenthesis. (Further word tokens mean it's not a callback.)

  • 5 — Saw :: after name. Continue to capture the name here.

$self->{callbackNamePending}
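
The callbackNamePending values can be traced with a Python sketch of the state machine. It is a simplified reading of the list above, not the parser's code: the function typedef_callback_name, its tokenizer, the rule that extra words in state 2 leave the state unchanged, and the decision to stop at the first close parenthesis are all assumptions.

```python
import re

WORD = r"[A-Za-z_]\w*"

def typedef_callback_name(declaration):
    """Toy model (hypothetical helper) of callbackNamePending states 0 to 5."""
    tokens = re.findall(r"::|" + WORD + r"|\S", declaration)
    state = 0
    name = ""
    for tok in tokens:
        is_word = re.fullmatch(WORD, tok) is not None
        if state == 0:
            if tok == "typedef":
                state = 1               # just saw the leading typedef
        elif state == 1:
            if is_word:
                state = 2               # first word after typedef
        elif state == 2:
            if tok == "(":
                state = 3               # parenthesis: capture the name next
        elif state == 3:
            if is_word:
                name = tok
                state = 4               # name token seen
        elif state == 4:
            if tok == "::":
                name += tok
                state = 5               # keep capturing after ::
            elif is_word:
                return None             # further word: not a callback
            elif tok == ")":
                return name
        elif state == 5:
            if is_word:
                name += tok
                state = 4
    return None
```

A declaration without a parenthesized name, such as typedef struct foo bar;, never leaves state 2 in this model and so produces no callback name.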

categoryClass


The owning class for an Objective-C category.

$self->{categoryClass}

cbsodname


When a second open parenthesis is encountered in parsing the callback name, this tells the parser that it is really seeing a function that returns a callback instead of a callback variable. The original sodname value is stored here, and the functionReturnsCallback flag is set so that this value can be restored later.

If a typedef contains a second set of parentheses and is not identified as a function returning a callback, the name inside the first set is the callback name, so this gets cleared.

$self->{cbsodname}

classIsObjC


Set to 1 when an Objective-C class token is encountered. In addition to playing a key role in parsing decisions, this also causes sublang to be set to occ.

$self->{classIsObjC}

classNameConcat


Set to 1 on encountering a period while parsing the name of an IDL class. This causes the next token to be interpreted as an additional part of the name rather than turning the whole thing into a class instance. Set to 0 after encountering the next token.

The bleeding of JavaScript-specific syntax into IDL files is really something of an abuse of the language, but supporting it is necessary to parse certain content.

$self->{classNameConcat}

classNameFound


Set to 1 after a class name has been parsed. (Set back to 0 if double colons are seen.) If a second word token is encountered in this state, it's a variable instead of a class (e.g. class foo *foo_instance;).

$self->{classNameFound}

classtype


Contains the token that began the current class declaration with any leading @ sign merged. Returned to the caller.

$self->{classtype}

conformsToList


Stores the list of classes to which this protocol conforms. This variable contains a string.

$self->{conformsToList}

constKeywordFound


Set to 1 after the const keyword is found.

$self->{constKeywordFound}

cppMacroHasArgs


Indicates that the #define macro described by the parser state object has an argument list associated with it. Used to determine the definetype attribute for the macro in XML output.

$self->{cppMacroHasArgs}

curvarstars


Temporary storage for asterisks before each variable name in a declaration with more than one name. This variable is reset to empty when the parser encounters a comma in such a declaration.

See variablestars for more information.

$self->{curvarstars}

declarationEndsAtNewLine


TCL variables, AppleScript variables, and TCL functions end at a newline character. When these are detected (by token matching), this variable is set to 1.

$self->{declarationEndsAtNewLine}

elseContents


The contents of the else part of an if/else conditional. Only valid if $HeaderDoc::parseIfElse is 1.

$self->{elseContents}

endgame


In Python, this variable determines whether the declaration is done after this token, in which case a new parser state (sibling) must be added.

  • 0 — Nope.

  • 1 — In this state if we got a newline and autoContinue is 0 (we're not in a nested block). We're done after this token, but it should be added to the parse tree.

  • 2 — seenLeading is less than leadspace. Don't add this token to the parse tree because it's part of the next declaration.

  • 3 — seenLeading is less than parentLeading. Don't add this token to the parse tree because it's part of the next declaration.

$self->{endgame}

endOfString


In shell (and Perl), set to the token after a << that is treated as the start of a multi-line string. Reset to an empty string upon leaving the multi-line string. While in this state, inString is set to 13.

$self->{endOfString}

endOfTripleQuote


The number of quote tokens in a row when potentially leaving a triple-quoted string.

This value is reset to zero upon encountering a non-quote token.

If this reaches 2, the next quote mark causes the three quotes to be combined into a single token, and the value is reset to 0.

$self->{endOfTripleQuote}
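
The quote counting described above can be sketched in Python. This is only a model: the function merge_closing_triple_quote and the flat token list are assumptions, and deleting the last two list items stands in for going back to the earlier quote node (the role endOfTripleQuoteToken plays in the parse tree).

```python
def merge_closing_triple_quote(tokens, quote='"'):
    """Toy model (hypothetical helper) of the endOfTripleQuote counter."""
    out = []
    end_of_triple_quote = 0
    for tok in tokens:
        if tok != quote:
            end_of_triple_quote = 0         # any non-quote token resets the count
            out.append(tok)
        elif end_of_triple_quote == 2:
            # Third quote in a row: fold the previous two into one token.
            del out[-2:]
            out.append(quote * 3)
            end_of_triple_quote = 0
        else:
            end_of_triple_quote += 1
            out.append(tok)
    return out
```

A lone quote inside the string is left as its own token because the following non-quote token resets the counter before it can reach 2.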

endOfTripleQuoteToken


When a quote mark is seen, the object is added here so that the parser can easily go back to it later if it turns out to be a triple quote. This is used to merge the three quote marks into a single token in the parse tree.

$self->{endOfTripleQuoteToken}

extendsClass


The name of the class that this class extends.

$self->{extendsClass}

extendsProtocol


Stores the name of the Objective-C protocol that this protocol extends (the tokens within angle brackets).

$self->{extendsProtocol}

externC


In C, when the extern token is encountered, this flag is set to 1 and the rollbackSet function is called to set a rollback point. The declaration to date is also stored in the preExternCdeclaration field at this time.

If what comes after this token is C, then the previous declaration is restored and the parser state is rolled back to this point.

$self->{externC}

firstpastnl


In shell (and Perl), set to 1 after a newline until the first non-space token.

$self->{firstpastnl}

followingrubyrbrace


A while or other statement right after an end statement (on the same line) is treated as applying to the preceding block instead of starting a new one.

Set to 1 when end is encountered, 0 at the following newline.

$self->{followingrubyrbrace}

forceClassDone


Set to 1 after reaching the left brace after a class. This essentially tells the parser to stop appending superclass tokens to forceClassSuper.

$self->{forceClassDone}

forceClassName


When the parser sees a colon (indicating a superclass name is coming), or the keywords extends or implements in Java, etc., this gets a copy of the class name so that it doesn't get overwritten.

$self->{forceClassName}

forceClassSuper


Holds the superclass information after a colon token. Used in conjunction with forceClassName.

$self->{forceClassSuper}

freezereturn


Once the parser passes the opening curly brace of a function body, the return type information is frozen. This prevents other things that look too much like function declarations from overwriting the return type info.

$self->{freezereturn}

freezeStack


Copy of the pplStack when the stack is frozen by stackFrozen.

$self->{freezeStack}

frozensodname


A copy of the sodname variable frozen at a particular point in time. Freezing occurs when the parser enters certain contexts like parameter parsing because the sodname field would otherwise get overwritten by other things.

$self->{frozensodname}

FULLPATH


The full path for the file containing the declaration that this parser state describes. By storing the info here, it is available for debug messages during subparse operations (reprocessing declarations nested within class declarations).

$self->{FULLPATH}

functionContents


The contents of a function (or, when parsing a switch statement, the contents of the struct body).

$self->{functionContents}

functionReturnsCallback


Indicates that the parser has seen a function that returns a callback. If set, the parser restores the value from cbsodname into the sodname field.

This is incremented to 2 while parsing the parameters for the callback, and decremented back to 1 at the end.

$self->{functionReturnsCallback}

gatheringObjCReturnType


While parsing an Objective-C method, this gets set to 1 upon seeing an open parenthesis, 2 at the bottom of the loop. While at 2 or greater, tokens are appended to the occmethodreturntype variable.

This value is incremented when additional open parentheses are encountered, and is decremented when close parentheses are encountered. When it reaches 1 again, it is reset to 0.

$self->{gatheringObjCReturnType}
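
The counter behavior for gatheringObjCReturnType can be traced with a small Python sketch. It is a model under assumptions, not the parser's code: the function objc_return_type, the pre-split token list, and the choice to stop at the first completed return type and to leave out its closing parenthesis are hypothetical.

```python
def objc_return_type(tokens):
    """Toy model (hypothetical helper) of the gatheringObjCReturnType counter."""
    gathering = 0
    return_type = []                    # stands in for occmethodreturntype
    for tok in tokens:
        if gathering >= 2:
            if tok == "(":
                gathering += 1          # nested open parenthesis
            elif tok == ")":
                gathering -= 1
            if gathering == 1:
                gathering = 0           # matching close parenthesis reached
                break
            return_type.append(tok)
        elif gathering == 0 and tok == "(":
            gathering = 1
        # Bottom of the loop: 1 becomes 2 so the parenthesis itself is not gathered.
        if gathering == 1:
            gathering = 2
    return " ".join(return_type)
```

Starting the depth count at 2 rather than 1 is what lets the value 1 act as the "just saw the parenthesis" marker on the way in and as the "done" marker on the way out.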

HeaderDoc::ParserState::VERSION


The revision control revision number for this module.

$HeaderDoc::ParserState::VERSION = '$Revision: 1333753010 $';  
Discussion

In the git repository, contains the number of seconds since January 1, 1970.


hollow


This variable holds a reference to the node in the parse tree where the parser state should be stored when the current declaration has been fully parsed.

$self->{hollow}

ifContents


The contents of the if part of an if/else conditional (not including the test expression). Only valid if $HeaderDoc::parseIfElse is 1.

$self->{ifContents}

ignoreAvailabilityMacros


Set high within the definition for any of the built-in availability macros so that those macro definitions can be properly parsed even if they refer to other availability macros.

$self->{ignoreAvailabilityMacros}

implementsClass


The name of the abstract class that this class implements.

$self->{implementsClass}

inBitfield


Indicates that we are at a token that might be the start of a C bitfield. This goes high when a colon occurs. If the next token is a non-colon (i.e. it's not ::), startOfDec gets reset to zero to lock the name and stuff.

$self->{inBitfield}

inBrackets


Indicates the number of levels of nested square brackets the current token is within.

$self->{inBrackets}

inCase


In shell, initially 0, incremented upon entering a case statement, and decremented on exit.

$self->{inCase}

inChar


Inside a single-quoted character/string literal.

$self->{inChar}

inClass


Indicates whether we are in a class. Possible values are:

  • 0 — Not in a class declaration.

  • 1 — Enters this state when a class keyword is encountered (except @protocol or @interface).

  • 2 — Enters this state when the @interface class keyword is encountered. Returns to 1 when a colon or close parenthesis is encountered.

  • 3 — Enters this state on the first word token found while in state 2. Returns to 1 when colon or close parenthesis is encountered.

$self->{inClass}

inClassConformingToProtocol


Set to 1 when a conforming left angle bracket (<) is seen in an @protocol declaration.

Set to 2 after that token. While this value is 2, tokens are gathered in the conformsToList string.

Reset to 0 upon seeing the matching right angle bracket (>).

$self->{inClassConformingToProtocol}

inComment


Indicates whether we are in a multi-line comment. See also the ppSkipOneToken local variable in blockParse.

$self->{inComment}

inEnum


Set to 1 while inside an enumeration.

$self->{inEnum}

inExtends


Set to 1 when the extends keyword is encountered in Java. Reset to 0 when an implements keyword occurs.

$self->{inExtends}

inGiven


Set to 1 when a given token is seen in AppleScript. Reset to 0 at the following newline.

$self->{inGiven}

INIF


Inside an if statement. Only used if $HeaderDoc::parseIfElse is 1.

$self->{INIF}

inImplements


Set to 1 when the implements keyword is encountered in Java. Reset to 0 when an extends keyword occurs.

$self->{inImplements}

inInlineComment


Indicates whether we are in a single-line comment (i.e. one beginning with a hash or two slashes).

Initial value is 4. Decremented to 3 at end of loop. Decremented to 2 after next token, then 1, increased to 3 if 1 and saw exclamation point. I don't remember what this code does, and it is probably wrong.

$self->{inInlineComment}

initbsCount


Contains the number of braces on the brace stack when this parser state was created. When the number of braces drops below this level, this parser state must go away.

$self->{initbsCount}

inLabel


Set to 1 when a label token is seen in AppleScript. (See the labelregexp variable in parseTokens for a list of these tokens.)

Reset to 0 after the next word token, at the following newline, or when a given token is encountered.

$self->{inLabel}

inMacro


Indicates that the current declaration is a #define macro or similar. Values are:

  • 0 — Not in a macro.

  • 1 — Got leading #.

  • 2 — Got something else after # (error case).

  • 3 — Got #define.

  • 4 — Got another C preprocessor token, including #if, #ifdef, #ifndef, #endif, #else, #undef, #elif, #error, #warning, #pragma, #import, and #include.

See also inMacroLine.

$self->{inMacro}
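
As an illustration of the inMacro values, the following Python sketch maps a single source line to the value the list above would assign once the directive has been read. The function in_macro_value and the regular expression are assumptions for the example; the real parser reaches these values token by token.

```python
import re

# Directives that the list above assigns to value 4.
OTHER_DIRECTIVES = {
    "if", "ifdef", "ifndef", "endif", "else", "undef", "elif",
    "error", "warning", "pragma", "import", "include",
}

def in_macro_value(line):
    """Toy model (hypothetical helper): final inMacro value for one line."""
    m = re.match(r"\s*#\s*(\w*)", line)
    if not m:
        return 0                # not in a macro
    word = m.group(1)
    if word == "":
        return 1                # only the leading # so far
    if word == "define":
        return 3
    if word in OTHER_DIRECTIVES:
        return 4
    return 2                    # something else after # (error case)
```

Keeping #define separate from the other directives matters because only #define goes on to use the macro-name fields such as seenMacroStart and seenMacroName.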

inMacroLine


Used for handling macros in the middle of declarations.

$self->{inMacroLine}

inMacroTail


Set high upon encountering the first whitespace after a macro name. Once this key is set, the value of the cppMacroHasArgs key is no longer set upon encountering an open parenthesis.

$self->{inMacroTail}

INMODULE


Indicates that the parser is in a module declaration. Possible values are:

  • 0 — Not in a module declaration.

  • 1 — Saw the module token.

  • 2 — Unused vestigial state.

  • 3 — Unused vestigial state.

$self->{INMODULE}

inOfIn


Set to 1 when AppleScript of or in token is encountered. Reset to 0 on newline or after encountering the next word token and appending it to OfIn.

$self->{inOfIn}

inOperator


In a C++ operator declaration.

$self->{inOperator}

inPrivateParamTypes


Set to 1 after the colon in a C++ method declaration. Indicates that the parser is parsing the private parameter declarations for the method.

$self->{inPrivateParamTypes}

inProtocol


Possible values are:

  • 0 — Not in a protocol.

  • 1 — Saw @protocol token.

  • 2 — After next word token after @protocol. Returns to this state after closing > token. In this state, it is capturing tokens into the extendsProtocol field.

  • 3 — Inside conforming angle braces (<).

$self->{inProtocol}

inputCounter


The input counter. Used for restoring the value during a subparse (reprocessing a declaration within an already-parsed class).

$self->{inputCounter}

inrbraceargument


Some languages take an additional argument for their equivalent of a right brace. For example, in AppleScript, a tell block ends with end tell. In effect, end terminates the block, but the next token does not start the next block.

If rbracetakesargument is set in the object returned by a call to parseTokens, then that trailing tell is included in the trailer for the block.

$self->{inrbraceargument}

inRuby


In a Ruby quote. Quotes in Ruby are much more complex than in any sane language, so they get their own variable....

$self->{inRuby}

inRubyBlock


The character that began the current Ruby block. For example, the << token.

$self->{inRubyBlock}

inRubyClass


Normally 0.

Set to 1 when a Ruby class declaration is encountered.

Set to 2 when the first newline after a Ruby class is encountered.

$self->{inRubyClass}

inString


Inside a double-quoted string literal if 1, else 0. Set to 13 for a multi-line string (e.g. FOO <<EOF...).

$self->{inString}

inTCLRegExpCommand


In TCL, set to 1 when a command is encountered that takes an unquoted (non-string) regular expression as an argument.

Set to 0 upon entering the regular expression or when a newline or carriage return is encountered.

$self->{inTCLRegExpCommand}

inTemplate


Within C++ template braces (< and >). Also used for IDL bracket notation.

$self->{inTemplate}

inTypedef


Set to 1 while inside a C typedef.

$self->{inTypedef}

inUnion


Set to 1 when the union keyword is encountered. Remains high until the end of this declaration.

$self->{inUnion}

isConstructor


Set to 1 after the constructor token is seen in TCL (or equivalent in other languages). (Not used in C++.)

$self->{isConstructor}

ISFORWARDDECLARATION


Indicates whether a class declaration is a forward declaration (1) or the actual class declaration (0). That way, the resulting object is a Var object instead of a CPPClass object.

$self->{ISFORWARDDECLARATION}

isProperty


Set to 1 after a keyword is parsed that indicates that this variable is an Objective-C property.

$self->{isProperty}

isStatic


Set to 1 when static or equivalent (e.g. my in Perl) is seen. Used to determine whether a variable is file-scoped or global.

$self->{isStatic}

justLeftStringToken


After an empty string (""), this gets set high in Python. That way, if the next token is also a double quote mark, the opening triple quote of a triple-quoted string can be easily detected.

$self->{justLeftStringToken}

kr_c_function


Indicates that the current code is a K&R-style C function (with separate parameter type declarations), for example:

 
             int foo(a, b)
             int a;
             char *b;
             { ... function body ... }
              
$self->{kr_c_function}

kr_c_name


Contains the name of a K&R C function. The normal function name detection code would fail hard because of the existence of multiple declarations.

$self->{kr_c_name}

lang


The language that the parser was parsing when this parser state was created.

$self->{lang}

lastNLWasQuoted


In Python, set to 1 if the last newline was preceded by a backslash, else unset. Used to determine whether to care about the leading whitespace count.

$self->{lastNLWasQuoted}

lastpart


Holds the last part before the one being processed by the Python parser. Similar to the local variable of the same name in blockParse.

$self->{lastpart}

lastsymbol


The last token, wiped by braces, parentheses, and so on. It is used primarily for handling names of typedefs. In general, when writing code, except in a few specific contexts, you probably want the local variable lasttoken in blockParse instead. Also related are the local variables lastnspart and lastchar.

$self->{lastsymbol}

lastTreeNode


The last node in the parse tree rooted at this node. This node is marked with EODEC in parse tree dumps.

For example, the lastTreeNode value for a class declaration would point to the closing brace or semicolon at the end of the class.

Note that for nodes within the class, each nested declaration also has its own lastTreeNode value that points to the end of that nested declaration.

$self->{lastTreeNode}

leadspace


The number of leading spaces in the first line since the parser state was created.

Initial value is -1, indicating that the value has not yet been determined. This value does not get set until the first line that contains at least one non-space token after that whitespace and before the trailing newline.

If the current line's leading space (in seenLeading) drops to this level or lower, the end of block is considered to have been reached.

$self->{leadspace}

leavingComment


Set to 1 on an end-of-comment token so that the ending comment token won't get added to the return type.

$self->{leavingComment}

macroNoTrunc


Set to 1 to avoid truncating the body of macros that don't begin with a parenthesis or brace. Otherwise 0.

$self->{macroNoTrunc}

MODULE


Temporary storage for the name of a module. The module token is treated much like an @indexgroup tag.

$self->{MODULE}

name


The name of a data type parsed by the main (namePending) parser. This is the lowest priority name; it gets overridden by the sodname name more often than not.

$self->{name}

nameList


In Pascal, upon seeing a colon (after a variable name), the sodname and sodtype fields are concatenated together (with a space) into this field. This later becomes the variable name.

$self->{nameList}

namePending


Indicates that the parser expects a name:

  • Set to 1 after the keyword function, procedure, sub, or another similar function delimiter token.

  • Set to 2 after the keyword typedef, struct, union, and so on because the name is the second non-keyword token after this one. Decremented at the end of the token loop.

$self->{namePending}

namepending


Python-specific parser state variable. The initial value is 1. Set high after a class keyword or a def keyword. Set low after a word token (the name).

$self->{namepending}

nestAfter


Indicates that after inserting this token into the parse tree, future tokens should be nested under this one.

$self->{nestAfter}

newlineIsSemi


In Ruby, an end marks the end of a function, so treat the newline after it as the end of the declaration.

$self->{newlineIsSemi}

NEXTTOKENNOCPP


Turns off the C preprocessor temporarily.

  • 0 — Normal operation.

  • 1 — Just saw #if. Goes to 3 if you get a defined token.

  • 2 — Just saw #ifdef. Don't do C preprocessing for the symbol that follows. Goes to 0 after the next word token.

  • 3 — In #if defined. Don't do C preprocessing for the symbol that follows, and drop back to state 1 after a word token.

$self->{NEXTTOKENNOCPP}
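The four values above form a small state machine. The following Python sketch models only the transitions listed; it is not HeaderDoc's Perl code. The names step and trace, the word-matching pattern, and the choice to leave the state unchanged on any token the list does not mention are assumptions made for illustration.

```python
import re

# Assumed definition of a "word token" for this sketch.
WORD = re.compile(r"[A-Za-z_]\w*$")

def step(state, token):
    """Return (next_state, suppress_cpp) for a single token."""
    if token == "#if":
        return 1, False
    if token == "#ifdef":
        return 2, False
    if state == 1 and token == "defined":
        return 3, False
    if state == 2 and WORD.match(token):
        return 0, True   # symbol after #ifdef: no C preprocessing
    if state == 3 and WORD.match(token):
        return 1, True   # symbol after defined: no C preprocessing
    return state, False  # transition not described: keep the state

def trace(tokens):
    """Run step over a token list and record each result."""
    state, out = 0, []
    for token in tokens:
        state, suppress = step(state, token)
        out.append((token, state, suppress))
    return out
```

For example, trace(["#if", "defined", "(", "FOO", ")"]) passes through states 1, 3, 3, 1, 1, and only FOO is flagged as exempt from preprocessing.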

noInsert


Set high to indicate that the next curly brace should not result in a parser state insertion. Used when, for example, a curly brace appears on its own prior to any actual declaration.

$self->{noInsert}

occmethod


Value is 1 if this is an Objective-C method, else 0/undefined.

$self->{occmethod}

occmethodname


The name of this Objective-C method. As new fragments get parsed, this gets extended to be foo:bar:baz:

$self->{occmethodname}

occmethodreturntype


Stores the return type for an Objective-C method.

$self->{occmethodreturntype}

occmethodtype


The Objective-C method type. Contains either a - or + character.

$self->{occmethodtype}

occparmlabelfound


Possible values are:

  • -2 — Colon encountered without seeing a label. In this state, the token is captured as the name of the parameter because the parameter has no label. After a word token is captured, the state returns to 0 because the next token is the name of the next parameter.

  • -1 — Colon encountered while in state 1. The parameter name follows. After a word token is captured, this gets incremented to 0 because the next token is the name of the next parameter.

  • 0 — Default state. If colon is encountered, goes to state -2.

  • 1 — Enters this state on first word token that's not in parentheses (thus skipping types in Objective-C methods). If colon is encountered, go to state -1.

$self->{occparmlabelfound}
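As a rough model of the states listed above, the Python sketch below walks the tokens that follow the return type of an Objective-C method and collects parameter names. It is not HeaderDoc's Perl code. The function name parameter_names, the pre-split token list, and the reading that any word outside parentheses seen in state 0 moves to state 1 are assumptions made for illustration.

```python
def parameter_names(tokens):
    """Return (names, final_state) for the tokens after the return type."""
    state = 0   # models occparmlabelfound
    depth = 0   # parenthesis nesting; types in parentheses are skipped
    names = []
    for token in tokens:
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
        elif depth > 0:
            continue                  # type token inside parentheses
        elif token == ":":
            if state == 0:
                state = -2            # colon without a label
            elif state == 1:
                state = -1            # colon after a label
        elif token.isidentifier():
            if state in (-1, -2):
                names.append(token)   # captured as the parameter name
                state = 0
            elif state == 0:
                state = 1             # word outside parentheses: a label
    return names, state
```

With the tokens of foo:(int)a bar:(int)b the sketch returns the names a and b. The same names result when the second label is omitted, as in foo:(int)a :(int)b, by way of state -2.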

occSuper


The superclass of an Objective-C class.

$self->{occSuper}

OfIn


Set to the actual of or in token encountered when parsing AppleScript. The word token after it is appended to this variable (delimited by a space).

$self->{OfIn}

onlyComments


Initially, this is set to 1. As soon as the parser sees a valid code token, this variable is set to 0. This serves two purposes. If the parser sees an opening curly brace before this gets set to 0, it restarts parsing without returning. (See continue_no_return in blockParse.) Also, once the parser has seen a code token, it will not allow the C preprocessing code to take over and return a #define that appears in the middle of a declaration.

$self->{onlyComments}

optionalOrRequired


Either @optional or @required, depending on the current state of the parser.

$self->{optionalOrRequired}

parentLeading


Holds the number of leading spaces at the beginning of the line for the enclosing block.

If the current line's leading space drops to this level or lower, the end of block is considered to have been reached.

$self->{parentLeading}

parsedParam


Temporary storage for the parameter currently being parsed. Used only by the Python parser. (The main block parser uses a local variable, $parsedParam, instead.)

$self->{parsedParam}

parsedParamAtBrace


Any in-progress parsed parameters when we enter a brace.

$self->{parsedParamAtBrace}

parsedParamList


An array of parsed parameter strings. When parsing a function, these are the parameters to the function. When parsing a struct or similar, these are the fields in the structure.

$self->{parsedParamList}

parsedParamParse


Indicates parameter parsing is in progress. Possible values are:

  • 0 — Not parsing parameters

  • 1 — Parsing semicolon-delimited parameters.

  • 2 — About to parse semicolon-delimited parameters.

  • 3 — Parsing comma-delimited parameters.

  • 4 — About to parse comma-delimited parameters.

  • 5 — Parsing whitespace-delimited parameters.

  • 6 — About to parse whitespace-delimited parameters.

The value is set to the even-numbered variant first, which causes the current token (usually a brace or parenthesis) to be skipped and the value to be decremented by 1, after which all future tokens are parsed.

$self->{parsedParamParse}
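The even/odd convention described above can be sketched in Python as follows. This is an illustration only, not HeaderDoc's Perl code. The names is_delimiter and collect_params, the pre-split token list, and the fact that the sketch simply stops at the end of the list (the text does not say how the list is closed) are assumptions.

```python
def is_delimiter(mode, token):
    """True if token separates parameters in the given odd-numbered mode."""
    if mode == 1:
        return token == ";"
    if mode == 3:
        return token == ","
    if mode == 5:
        return token.isspace()
    return False

def collect_params(tokens, mode):
    """Return (parameter strings, final mode)."""
    params, current = [], []

    def flush():
        text = "".join(current).strip()
        if text:
            params.append(text)
        current.clear()

    for token in tokens:
        if mode in (2, 4, 6):
            mode -= 1        # skip the opening token, then start parsing
            continue
        if mode == 0:
            break            # not parsing parameters
        if is_delimiter(mode, token):
            flush()
        else:
            current.append(token)
    flush()
    return params, mode
```

Starting in mode 4 with the tokens of (int a, char *b, the opening parenthesis is skipped, the mode drops to 3, and the result is the two strings int a and char *b.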

parsedParamStateAtBrace


The state of parameter parsing when we enter a brace.

$self->{parsedParamStateAtBrace}

pendingBracedParameters


Used in languages where parameters are wrapped in curly braces. A value of 1 indicates that the next curly brace should start parameter parsing. A value of 2 indicates that such a brace has been parsed. The default value is 0.

$self->{pendingBracedParameters}

perlClassName


Stores a Perl class name (this::that::the_other). When a :: token is encountered, :: is appended (if this variable is nonempty), followed by sodname.

$self->{perlClassName}
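The accumulation rule above can be sketched in Python as follows. This is not HeaderDoc's Perl code. The function name track_class_name, the pre-split token list, and the assumption that sodname always holds the most recent word token are made for illustration.

```python
def track_class_name(tokens):
    """Return (perl_class_name, sodname) after scanning the tokens."""
    perl_class_name = ""
    sodname = ""
    for token in tokens:
        if token == "::":
            if perl_class_name:
                perl_class_name += "::"   # appended only if nonempty
            perl_class_name += sodname
        else:
            sodname = token               # assumed: latest word token
    return perl_class_name, sodname
```

For the tokens of this::that::the_other::func, the sketch ends with the class name this::that::the_other and func as the remaining name.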

popAfter


In the Python parser, indicates that a new $treeCur should be popped from the stack (treeStack field) after inserting this node.

$self->{popAfter}

popAtEnd


Set to 1 if the parser sees a colon while bracePending is set. This indicates that if this declaration ends at the end of this line, the parse tree (which has become nested by the colon) needs to be popped back out.

$self->{popAtEnd}

posstypes


List of type names that follow after a complex typedef, e.g. bar and baz in the declaration typedef struct foo { ...} bar, baz;.

$self->{posstypes}

posstypesPending


The next token should go into the posstypes variable.

$self->{posstypesPending}

pplStack


A stack of parsed parameter lists. Used to handle fields and parameters in nested structures/callbacks.

$self->{pplStack}

preclasssodtype


The contents of sodtype when class or other similar keyword is encountered. This is used to restore things when class appears as part of a function's return type (e.g. static class foo *returnsfoo();).

$self->{preclasssodtype}

preEqualsSymbol


The last symbol before the equals sign. Used to obtain the name of a variable with an initial value.

$self->{preEqualsSymbol}

preExternCcurline


The value of curline is stored in this variable when the extern token is encountered. This value is rolled back when rollbackPending is set. See externC for details.

$self->{preExternCcurline}

preExternCdeclaration


In C, when the extern token is encountered, the declaration to date is stored here. See externC for details.

$self->{preExternCdeclaration}

prekeywordsodname


See prekeywordsodtype.

$self->{prekeywordsodname}

prekeywordsodtype


If startOfDec is 2, the parser has seen a proc, sub, function, or equivalent keyword, or has seen the first token of the declaration. Either way, the start-of-declaration parser is expecting a name. If it sees a keyword, the sodtype variable is copied into prekeywordsodtype and the sodname variable is copied into prekeywordsodname.

This fixes a bug in which the setter keyword broke parsing if it appeared after the name of an Objective-C property.

$self->{prekeywordsodtype}

preTemplateSymbol


Used primarily for determining whether this is a function or a function template.

$self->{preTemplateSymbol}

pushedfuncbrace


Set to 1 when a sofunction token is seen in the few languages that both use this token and do not precede the function body with any other opening brace.

$self->{pushedfuncbrace}

pushParserStateOnBrace


Set to 1 when a keyword is encountered that should cause the parser state to be pushed the next time the tree is nested (a class keyword, specifically).

Set to 2 when the colon at the end of the class declaration is parsed. After the token is pushed onto the tree, the parser state is pushed onto the parser stack, and the value is incremented to 3 so that it does not get pushed again.

$self->{pushParserStateOnBrace}

returntype


The return type of a function, callback, or (non-Objective-C) method.

$self->{returntype}

rollbackPending


Set to 1 during parsing to indicate that the state should be rolled back when done handling this token. After this token, the parser calls rollback to roll back to the previously saved state.

$self->{rollbackPending}

rollbackState


A temporary copy of the parser state that the parser can roll back to under certain circumstances. Set by rollbackSet and used by rollback.

$self->{rollbackState}
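The save-and-restore pattern shared by rollbackSet, rollback, and this field can be sketched in Python as follows. This is not HeaderDoc's Perl code. The class name StateSketch, the fields dictionary, and the use of a deep copy for the clone are assumptions made for illustration.

```python
import copy

class StateSketch:
    """Hypothetical stand-in for a parser state with rollback support."""

    def __init__(self):
        self.fields = {"sodname": "", "sodtype": ""}
        self.rollback_state = None      # models the rollbackState field

    def rollback_set(self):
        # Keep a clone of the current fields for a later rollback.
        self.rollback_state = copy.deepcopy(self.fields)

    def rollback(self):
        # Restore the last saved clone, if there is one.
        if self.rollback_state is not None:
            self.fields = copy.deepcopy(self.rollback_state)
```

A caller would invoke rollback_set before a speculative step and rollback afterward if the step has to be undone.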

seenBraces


The opening brace of functions/methods and function-like macros has been seen by the parser, so the parser is now in a state where it does nothing but walk to the matching close brace.

$self->{seenBraces}

seenElse


If $HeaderDoc::parseIfElse is 1, this flag is set to indicate that the tree associated with this parser state contains an else clause.

$self->{seenElse}

seenIf


If $HeaderDoc::parseIfElse is 1, this flag is set to indicate that the tree associated with this parser state contains an if clause.

$self->{seenIf}

seenLeading


The number of leading spaces on the current line.

The block is considered done in either of two cases: this indentation drops to or below the indentation in leadspace (the indentation of the first line inside this nesting level), or leadspace is -1 (and thus uncheckable) and this value drops to or below the value in parentLeading (the nesting level above this one).

$self->{seenLeading}
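The end-of-block test described above combines seenLeading, leadspace, and parentLeading. The Python sketch below restates it; it is not HeaderDoc's Perl code, and the function names leading_spaces and block_is_done are hypothetical.

```python
def leading_spaces(line):
    """Count the leading space characters on a line."""
    return len(line) - len(line.lstrip(" "))

def block_is_done(seen_leading, leadspace, parent_leading):
    """Apply the end-of-block rule from the description above."""
    if leadspace != -1:
        # leadspace is known: compare against it.
        return seen_leading <= leadspace
    # leadspace not yet determined: fall back to the enclosing block.
    return seen_leading <= parent_leading
```

For instance, block_is_done(8, 4, 0) is false because the line is still indented past leadspace, while block_is_done(0, -1, 0) is true because leadspace is unset and the line has dropped to the parent's level.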

seenMacroName


Set high after the macro name has been parsed. If this is set, inMacroTail is not set, and a parenthesis is encountered, the parenthesis represents the start of an argument list, which causes cppMacroHasArgs to be set.

$self->{seenMacroName}

seenMacroPart


Indicates that we've seen at least one non-whitespace token after the #define. (This means the name should be locked, among other things.)

$self->{seenMacroPart}

seenMacroStart


Set high after a #define token has been parsed. Once set, the seenMacroName key is set on the next word token.

$self->{seenMacroStart}

seenTilde


Indicates that we are in a C++ destructor.

$self->{seenTilde}

seenToken


Used by the Python parser to determine whether it has seen the first non-space token in a line. Once set, leading space counting is disabled.

$self->{seenToken}

setHollowAfter


Used by the Python parser to indicate that after this token has been inserted into the tree, the hollow field should be set to the resulting tree node.

$self->{setHollowAfter}

setleading


In Python, indicates that this is the first line of a nonempty declaration encountered, so the next leading space should not result in any comparisons of indentation.

$self->{setleading}

simpleTDcontents


The guts of a simple typedef.

$self->{simpleTDcontents}

simpleTypedef


Indicates a typedef without braces (0/1). This is used for three things:

  • To determine whether the next brace starts field parsing or not. (Field parsing starts at the first brace.)

  • To determine whether the namelist variable contains tag names for a complex typedef. (Tag names appear after struct and before the opening curly brace.) In the case of a simple typedef, this would contain bogus data.

  • In parsing MIG declarations, to determine whether a return type was specified.

$self->{simpleTypedef}

skiptoken


Set to 1 when the parser state has just been pushed so that the hollow value won't point to (at least) the next token.

$self->{skiptoken}

sodbrackets


Captures the data between square brackets when startOfDec is 2. This state typically occurs after the first non-symbol token in the line. Used for temporarily storing the bracketed attributes in an IDL file.

$self->{sodbrackets}

sodclass


The sodclass variable contains a standardized name for the type being parsed, specifically one of: variable, function, enum, or class.

The sod stands for "start of declaration". This variable, along with sodtype and sodname, is used for parsing functions and callbacks (but not the names of callbacks).

These parser variables are controlled by the startOfDec counter variable. With a few exceptions (callback names, in particular, come to mind), the startOfDec parser takes precedence over the other parsers.

$self->{sodclass}

sodname


The sodname variable contains the parsed name.

The sod stands for "start of declaration". This variable, along with sodtype and sodclass, is used for parsing functions and callbacks (but not the names of callbacks).

These parser variables are controlled by the startOfDec counter variable. With a few exceptions (callback names, in particular, come to mind), the startOfDec parser takes precedence over the other parsers.

$self->{sodname}

sodtype


The sodtype variable contains code symbols that may be used for various purposes.

The sod stands for "start of declaration". This variable, along with sodname and sodclass, is used for parsing functions and callbacks (but not the names of callbacks).

These parser variables are controlled by the startOfDec counter variable. With a few exceptions (callback names, in particular, come to mind), the startOfDec parser takes precedence over the other parsers.

$self->{sodtype}

sodtypeclasstoken


Contains the token that began the current class declaration. Used to restore the class token if it is really just the start of a variable name.

$self->{sodtypeclasstoken}

stackFrozen


Once the parser passes the opening curly brace of a function body, the parsed parameter stack is frozen. This prevents other things that look like parameter lists (e.g. the expression of an if or while statement) from getting parsed.

$self->{stackFrozen}

startOfDec


The control variable for the startOfDec parser. Used to control when the variables sodname and sodtype get filled.

$self->{startOfDec}

storeDec


Temporary storage for nested declarations, used to build up the vestigial plain text declaration.

$self->{storeDec}

structClassName


The last symbol before a colon in a struct declaration. Used for structs that look like this:

struct foo : bar {...}

In this case, the actual name of the struct is foo, so that token gets stored in structClassName and restored later.

$self->{structClassName}

sublang


The language dialect that the parser was parsing when this parser state was created (e.g. cpp for C++).

$self->{sublang}

temponlyComments


When a semicolon is encountered, if the parser might be parsing a parameter list that is semicolon-delimited (parsedParamParse <= 2), this gets the value of the onlyComments field, and the value is replaced at the end of the loop.

If this was not the first character in the overall declaration, this has the effect of preventing the onlyComments value from being reset by the semicolon handler.

If this was the first character in the overall declaration, the value of onlyComments was already zero, so this has no effect.

Note: this could probably be replaced by a flag to simply tell the various bits of code not to change the onlyComments value, but it's probably not worth the effort for the limited simplification this would cause.

$self->{temponlyComments}

treePopTwo


This gets set to 1 when a token is encountered that causes the tree to be nested but has no explicit ending token (e.g. +, -, or :). Thus, when the enclosing context ends and the parse tree gets popped from the treeStack stack, the code pops a second time for this token.

$self->{treePopTwo}

treeStack


A stack of parse trees. These are pushed and popped at various points during the parse process as braces, colons, parentheses, etc. are encountered. The behavior is controlled by the variables treeNest, treeSkip, treePopTwo, and treePopOnNewLine (most of which are local variables in blockParse and/or pythonParse).

This is currently used exclusively for Python. Other languages use a local variable in blockParse by the same name.

$self->{treeStack}

typestring


The outer type keyword (in C, struct, union, enum, or typedef).

$self->{typestring}

value


The parsed value of a constant.

$self->{value}

valuepending


This variable goes high after an equals sign, indicating that the next tokens contain the value of the constant.

$self->{valuepending}

variableNameConcat


Tells the parser to concatenate extra bits onto the name of a function, variable, etc. For example, foo.bar is (ostensibly) a valid name in Java, JavaScript, and IDL.

Set to 2 on encountering a period while parsing the name of a variable, function, etc. Goes down to 1 when the period is concatenated, zero when the next word token is concatenated.

$self->{variableNameConcat}
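The 2, 1, 0 countdown described above can be sketched in Python as follows. This is not HeaderDoc's Perl code. The function name parse_dotted_name, the pre-split token list, and the assumption that the period is concatenated during the same loop pass in which it is seen are made for illustration.

```python
def parse_dotted_name(tokens):
    """Return (name, counter value recorded after each token)."""
    name = ""
    concat = 0      # models variableNameConcat
    trace = []
    for token in tokens:
        if token == "." and name:
            concat = 2              # period seen while parsing a name
        if concat == 2:
            name += "."
            concat = 1              # period has been concatenated
        elif token.isidentifier():
            if concat == 1 or not name:
                name += token       # first word, or word after a period
                concat = 0
        trace.append(concat)
    return name, trace
```

For the tokens foo, ., bar the sketch builds foo.bar and records the counter as 0, 1, 0. The value 2 is transient within a single pass and so never appears in the trace.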

variablenames


Contains a hash table mapping variable names to values when parsing variable declarations that define more than one variable.

$self->{variablenames}

variablestars


Contains a hash table mapping variable names to the number of leading * characters before them. By separating this from the type information, it ensures that variables within declarations that contain a mixture of pointer and nonpointer types (char *a, b, **c;, for example) are typed correctly.

The variable curvarstars is used for temporary storage of subsequent groups of asterisks.

$self->{variablestars}
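To illustrate why the asterisk counts are kept apart from the type, the Python sketch below splits the example declaration into a base type and a per-variable star count. It is not HeaderDoc's Perl code. The function names split_declaration and typed_names are hypothetical, and the sketch handles only a single-word base type with no initializers.

```python
def split_declaration(decl):
    """Return (base_type, {variable name: number of leading asterisks})."""
    decl = decl.strip().rstrip(";")
    base_type, _, rest = decl.partition(" ")
    stars = {}
    for part in rest.split(","):
        part = part.strip()
        name = part.lstrip("*").strip()
        stars[name] = len(part) - len(part.lstrip("*"))
    return base_type, stars

def typed_names(base_type, stars):
    """Rebuild one 'type name' string per variable."""
    return [f"{base_type} {'*' * count}{name}" for name, count in stars.items()]
```

Applied to char *a, b, **c; the first function yields the type char and the counts 1, 0, and 2 for a, b, and c, so each variable can be given its own pointer depth.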

variabletype


Temporary storage of the variable type (e.g. int) used to prevent its destruction when parsing variable declarations that define more than one variable.

$self->{variabletype}

waitingForExceptions


Set to 1 when Ruby parsing encounters a left angle bracket (<) in a class declaration.

$self->{waitingForExceptions}

waitingForTypeInformation


By default, 0.

Set to 2 on a colon within a variable declaration.

If 2, set to 1 on non-space.

If 1, set to 3 on open parenthesis, else -1 if non-space.

Basically, if this goes to 3, the variable is a Pascal enumerated type, e.g.

pascal_var_e: (apple, pear, banana, orange, lemon);

Otherwise, the declaration is just a normal variable.

$self->{waitingForTypeInformation}
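The transitions above can be modeled with the Python sketch below. It is not HeaderDoc's Perl code. The function name final_state and the pre-split token list are hypothetical. The text does not say on which token the step from 2 to 1 fires; this sketch assumes it fires on the colon itself, which is one reading under which state 3 is reachable.

```python
def final_state(tokens):
    """Return the modeled waitingForTypeInformation value after the tokens."""
    state = 0
    for token in tokens:
        if token.isspace():
            continue                    # only non-space tokens matter
        if state == 1:
            state = 3 if token == "(" else -1
        elif state == 0 and token == ":":
            state = 2                   # colon within a variable declaration
        if state == 2:
            state = 1                   # assumed: fires on the colon itself
    return state
```

With the tokens of pascal_var_e: (apple, pear); the result is 3, marking an enumerated type. With the tokens of x: integer; the result is -1, marking a normal variable.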