The parser needs a review, both to update the algorithm's documentation and to reduce repetitive code.
As of 868b93c, much of the parser is littered with manual look-ahead loops that resolve contextual ambiguities:
```cpp
// Check three tokens ahead to make sure that we're not dealing with a constructor initialization...
// ( 350.0f , <--- Could be the scenario
// Example : <Capture_Start> <Value> <Comma>
//                idx          +1      +2
bool detected_comma = _ctx->parser.Tokens.Arr[ _ctx->parser.Tokens.Idx + 2 ].Type == Tok_Comma;

b32 detected_non_varadic_unpaired_param = detected_comma && nexttok.Type != Tok_Varadic_Argument;

if (! detected_non_varadic_unpaired_param && nexttok.Type == Tok_Preprocess_Macro_Expr) for( s32 break_scope = 0; break_scope == 0; ++ break_scope)
{
    Macro* macro = lookup_macro( nexttok.Text );
    if (macro == nullptr || ! macro_is_functional(* macro))
        break;

    // ( <Macro_Expr> (
    // Idx    +1      +2
    s32 idx   = _ctx->parser.Tokens.Idx + 1;
    s32 level = 0;

    // Find the end of the token expression
    for ( ; idx < array_num(_ctx->parser.Tokens.Arr); idx++ )
    {
        Token tok = _ctx->parser.Tokens.Arr[ idx ];

        if ( tok.Type == Tok_Capture_Start )
            level++;
        else if ( tok.Type == Tok_Capture_End && level > 0 )
            level--;

        if (level == 0 && tok.Type == Tok_Capture_End)
            break;
    }
    ++ idx; // Will increment to the possible comma position

    if ( _ctx->parser.Tokens.Arr[ idx ].Type != Tok_Comma )
        break;

    detected_non_varadic_unpaired_param = true;
}
```
For example, the above uses raw iteration through the lexed tokens to determine whether a comma follows the macro argument.
We need to set up a slice type for looking ahead over a set of tokens, behaving as a sub-slice of the full lexed slice:
```cpp
struct LexSlice
{
    Token* Ptr;
    s32    Len;
    s32    Idx;
};
```
Like the regular tokens array, it needs a simple interface for navigating it (we could probably just recycle the current TokArray interface and change it to take a slice instead).
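As a sketch of what that navigation interface could look like (the helper names are hypothetical, and the `Token`/`TokType` stand-ins below are minimal substitutes for the parser's real types):

```cpp
// Minimal stand-ins for the parser's real types (hypothetical, for illustration only).
typedef int s32;
enum TokType { Tok_Invalid, Tok_Capture_Start, Tok_Capture_End, Tok_Comma, Tok_Number };
struct Token { TokType Type; };

struct LexSlice
{
    Token* Ptr;
    s32    Len;
    s32    Idx;
};

// Hypothetical navigation helpers, mirroring what TokArray offers today.
inline bool  slice_left   ( LexSlice const& s )        { return s.Idx < s.Len; }
inline Token slice_current( LexSlice const& s )        { return s.Ptr[ s.Idx ]; }
inline Token slice_peek   ( LexSlice const& s, s32 n ) { return s.Ptr[ s.Idx + n ]; }
inline void  slice_advance( LexSlice& s )              { ++ s.Idx; }

// A sub-slice shares the backing token array; no copy is made.
inline LexSlice slice_sub( LexSlice const& s, s32 offset, s32 len )
{
    LexSlice result = { s.Ptr + s.Idx + offset, len, 0 };
    return result;
}
```

With helpers like these, the three-tokens-ahead comma check above becomes `slice_peek( s, 2 ).Type == Tok_Comma` without touching the parser's cursor.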
This sort of iteration is also used throughout the parser to aggregate tokens it cannot parse:
```cpp
eat( Tok_Capture_Start );
s32 level = 0;
while ( left && ( currtok.Type != Tok_Capture_End || level > 0 ) )
{
    if ( currtok.Type == Tok_Capture_Start )
        level++;
    else if ( currtok.Type == Tok_Capture_End && level > 0 )
        level--;
    eat( currtok.Type );
}
eat( Tok_Capture_End );
```
It can be generalized for both consumption and look-ahead.
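One possible shape for that generalization (a sketch only; `scan_capture` and the type stand-ins are hypothetical, not existing gencpp API): a single routine that finds the matching `Tok_Capture_End` in a slice and only commits the cursor when asked, so the same code serves both look-ahead and consumption.

```cpp
// Minimal stand-ins for the parser's real types (hypothetical, for illustration only).
typedef int s32;
enum TokType { Tok_Invalid, Tok_Capture_Start, Tok_Capture_End, Tok_Comma, Tok_Number };
struct Token { TokType Type; };

struct LexSlice
{
    Token* Ptr;
    s32    Len;
    s32    Idx;
};

// Scan a balanced ( ... ) capture, assuming s.Idx sits on the opening
// Tok_Capture_Start. Returns the index of the matching Tok_Capture_End,
// or s.Len if the capture is unbalanced.
//   consume == false : pure look-ahead, the slice's cursor is untouched.
//   consume == true  : the cursor advances one past the closing capture.
s32 scan_capture( LexSlice& s, bool consume )
{
    s32 idx   = s.Idx;
    s32 level = 0;
    for ( ; idx < s.Len; idx++ )
    {
        TokType type = s.Ptr[ idx ].Type;
        if ( type == Tok_Capture_Start )
            level++;
        else if ( type == Tok_Capture_End )
        {
            level--;
            if ( level == 0 )
                break;
        }
    }
    if ( consume && idx < s.Len )
        s.Idx = idx + 1;
    return idx;
}
```

The macro-argument check above would then reduce to scanning ahead without consuming and testing whether the token after the returned index is `Tok_Comma`.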
(The snippet above is from gencpp/base/components/parser.cpp, lines 2488 to 2524 in 868b93c.)