項(xiàng)目的完整代碼在 C2j-Compiler <https://github.com/dejavudwh/C2j-Compiler>
前言
在之前完成了描述符號(hào)表的數(shù)據(jù)結(jié)構(gòu),現(xiàn)在就可以正式構(gòu)造符號(hào)表了。符號(hào)表的創(chuàng)建自然是要根據(jù)語(yǔ)法分析過程中走的,所以符號(hào)表的創(chuàng)建就在LRStateTableParser里的takeActionForReduce方法
不過在此之前,當(dāng)然還需要一個(gè)方便對(duì)這個(gè)符號(hào)表操作的類了
這一篇主要的兩個(gè)文件是
* TypeSystem.java
* LRStateTableParser.java
操作符號(hào)表
操作符號(hào)表的方法都在TypeSystem類里。
TypeSystem里主要有這幾個(gè)方法:
類型說明符
邏輯都很簡(jiǎn)單
public TypeLink newType(String typeText) { Specifier sp; int type =
Specifier.NONE; boolean isLong = false, isSigned = true; switch
(typeText.charAt(0)) { case 'c': if (typeText.charAt(1) == 'h') { type =
Specifier.CHAR; } break; case 'd': case 'f': System.err.println("Floating point
Numbers are not supported"); System.exit(1); break; case 'i': type =
Specifier.INT; break; case 'l': isLong = true; break; case 'u': isSigned =
false; break; case 'v': if (typeText.charAt(2) == 'i') { type = Specifier.VOID;
} break; case 's': //ignore short signed break; default: break; } sp = new
Specifier(); sp.setType(type); sp.setLong(isLong); sp.setSign(isSigned);
TypeLink link = new TypeLink(false, false, sp); return link; }
創(chuàng)建存儲(chǔ)類型
其實(shí)這一部分有的到后面解釋執(zhí)行或者代碼生成的時(shí)候,現(xiàn)在這個(gè)編譯器是不處理的
這一部分的邏輯也很簡(jiǎn)單
public TypeLink newClass(String classText) { Specifier sp = new Specifier();
sp.setType(Specifier.NONE); setClassType(sp, classText.charAt(0)); TypeLink
link = new TypeLink(false, false, sp); return link; } private void
setClassType(Specifier sp, char c) { switch(c) { case 0:
sp.setStorageClass(Specifier.FIXED); sp.setStatic(false);
sp.setExternal(false); break; case 't': sp.setStorageClass(Specifier.TYPEDEF);
break; case 'r': sp.setStorageClass(Specifier.REGISTER); break; case 's':
sp.setStatic(true); break; case 'e': sp.setExternal(true); break; default:
System.err.println("Internal error, Invalid Class type"); System.exit(1);
break; } }
給符號(hào)添加修飾符
addSpecifierToDeclaration是為當(dāng)前整個(gè)Symbol鏈都加上修飾符,比如遇見int x,y,z;這種情況
public Declarator addDeclarator(Symbol symbol, int declaratorType) {
Declarator declarator = new Declarator(declaratorType); TypeLink link = new
TypeLink(true, false, declarator); symbol.addDeclarator(link); return
declarator; } public void addSpecifierToDeclaration(TypeLink specifier, Symbol
symbol) { while (symbol != null) { symbol.addSpecifier(specifier); symbol =
symbol.getNextSymbol(); } }
剩下用到再提
構(gòu)造符號(hào)表
構(gòu)造符號(hào)表的過程在語(yǔ)法分析的過程里,也就是進(jìn)行reduce操作的時(shí)候。很好理解問什么符號(hào)表的構(gòu)建會(huì)在reduce操作時(shí)發(fā)生,因?yàn)楫?dāng)發(fā)生reduce操作的時(shí)候就代表產(chǎn)生了一個(gè)變量名或者是產(chǎn)生了一個(gè)變量定義,這時(shí)候是把它們加入符號(hào)表的最好時(shí)機(jī)
所以在語(yǔ)法分析的過程中加入一個(gè)方法來處理reduce時(shí)的操作,此外還需要一個(gè)屬性堆棧來輔助操作,屬性堆棧的作用就是用來保存之前的操作,以方便后面使用
比如現(xiàn)在產(chǎn)生了一個(gè)修飾符,但是語(yǔ)法分析過程還沒有讀入變量名,就先把這個(gè)修飾符壓入屬性堆棧,等讀入變量名的時(shí)候就可以創(chuàng)建一個(gè)Symbol,再把修飾符彈出堆棧鏈接到Symbol上
takeActionForReduce
takeActionForReduce方法的參數(shù)就是做reduce操作依據(jù)的產(chǎn)生式的編號(hào)
只看一下比較復(fù)雜的:
*
StructSpecifier_TO_TypeSpecifier:
這是結(jié)構(gòu)定義生成一個(gè)結(jié)構(gòu)體類型的declartor,對(duì)應(yīng)的推導(dǎo)式是
* TYPE_SPECIFIER -> STRUCT_SPECIFIER * STTUCT_SPECIFIER -> STRUCT OPT_TAG LC
DEF_LIST RC * | STRUCT TAG
先生成一個(gè)Specifier再設(shè)置它的vStruct屬性,也就是聲明這是一個(gè)結(jié)構(gòu)體,之后拿到結(jié)構(gòu)體定義,從這個(gè)推導(dǎo)式中我們可以看出,結(jié)構(gòu)體定義肯定在它的上一步,也就是被放入了屬性堆棧的頂端
*
SPECIFIERS_TypeOrClass_TO_SPECIFIERS
這里對(duì)應(yīng)的推導(dǎo)式是
* SPECIFIERS -> SPECIFIERS TYPE_OR_CLASS
目的其實(shí)就是合并多個(gè)specifier
*
DEFLIST
case SyntaxProductionInit.ExtDeclList_COMMA_ExtDecl_TO_ExtDeclList: case
SyntaxProductionInit.VarList_COMMA_ParamDeclaration_TO_VarList: case
SyntaxProductionInit.DeclList_Comma_Decl_TO_DeclList: case
SyntaxProductionInit.DefList_Def_TO_DefList:
其實(shí)這部分是連接一系列變量的定義,比如ExtDeclList_COMMA_ExtDecl_TO_ExtDeclList
就是對(duì)應(yīng)像x,y這樣用逗號(hào)分割的連續(xù)定義,把它們的符號(hào)連接起來
*
DECLARTOR
case SyntaxProductionInit.OptSpecifier_ExtDeclList_Semi_TO_ExtDef: case
SyntaxProductionInit.TypeNT_VarDecl_TO_ParamDeclaration: case
SyntaxProductionInit.Specifiers_DeclList_Semi_TO_Def:
這里其實(shí)是完成一個(gè)完成的定義,像int
x,y;后把拿到Specifier放入到每個(gè)符號(hào)去,比較特殊的是拿到Specifier的位置,這是根據(jù)reduce次數(shù)計(jì)算的。
*
STRUCT和FUNCTION
接下來比較需要注意的就是處理struct和function的方法
*
在處理連續(xù)的變量聲明的時(shí)候,如果遇見的類型是結(jié)構(gòu)體的話,就進(jìn)行這樣一個(gè)處理,如果當(dāng)前是個(gè)結(jié)構(gòu)體聲明,我們就直接把這個(gè)結(jié)構(gòu)體里的符號(hào),也就是structDefine的fields放入argList(這個(gè)原本應(yīng)該是放函數(shù)參數(shù)的)
private void handleStructVariable(Symbol symbol) { if (symbol == null) {
return; } boolean isStruct = false; TypeLink typeLink = symbol.typeLinkBegin;
Specifier specifier = null; while (typeLink != null) { if
(!typeLink.isDeclarator) { specifier = (Specifier) typeLink.getTypeObject(); if
(specifier.getType() == Specifier.STRUCTURE) { isStruct = true; break; } }
typeLink = typeLink.toNext(); } if (isStruct) { StructDefine structDefine =
specifier.getStruct(); Symbol copy = null, headCopy = null, original =
structDefine.getFields(); while (original != null) { if (copy != null) { Symbol
sym = original.copy(); copy.setNextSymbol(sym); copy = sym; } else { copy =
original.copy(); headCopy = copy; } original = original.getNextSymbol(); }
symbol.setArgList(headCopy); } }
* 這個(gè)方法其實(shí)就是根據(jù)有沒有參數(shù)來判斷當(dāng)前函數(shù)的名字在堆棧的哪個(gè)位置 private void setFunctionSymbol(boolean
hasArgs) { Symbol funcSymbol; if (hasArgs) { funcSymbol = (Symbol)
valueStack.get(valueStack.size() - 4); } else { funcSymbol = (Symbol)
valueStack.get(valueStack.size() - 3); } typeSystem.addDeclarator(funcSymbol,
Declarator.FUNCTION); attributeForParentNode = funcSymbol; }
takeActionForReduce源碼
private void takeActionForReduce(int productionNum) { switch (productionNum) {
case SyntaxProductionInit.TYPE_TO_TYPE_SPECIFIER: attributeForParentNode =
typeSystem.newType(text); break; case
SyntaxProductionInit.EnumSpecifier_TO_TypeSpecifier: attributeForParentNode =
typeSystem.newType("int"); break; case
SyntaxProductionInit.StructSpecifier_TO_TypeSpecifier: attributeForParentNode =
typeSystem.newType(text); TypeLink link = (TypeLink) attributeForParentNode;
Specifier sp = (Specifier) link.getTypeObject();
sp.setType(Specifier.STRUCTURE); StructDefine struct = (StructDefine)
valueStack.get(valueStack.size() - 1); sp.setStruct(struct); break; case
SyntaxProductionInit.SPECIFIERS_TypeOrClass_TO_SPECIFIERS:
attributeForParentNode = valueStack.peek(); Specifier last = (Specifier)
((TypeLink) valueStack.get(valueStack.size() - 2)).getTypeObject(); Specifier
dst = (Specifier) ((TypeLink) attributeForParentNode).getTypeObject();
typeSystem.specifierCopy(dst, last); break; case
SyntaxProductionInit.NAME_TO_NewName: case SyntaxProductionInit.Name_TO_NameNT:
attributeForParentNode = typeSystem.newSymbol(text, nestingLevel); break; case
SyntaxProductionInit.START_VarDecl_TO_VarDecl: case
SyntaxProductionInit.Start_VarDecl_TO_VarDecl:
typeSystem.addDeclarator((Symbol) attributeForParentNode, Declarator.POINTER);
break; case SyntaxProductionInit.VarDecl_LB_ConstExpr_RB_TO_VarDecl: Declarator
declarator = typeSystem.addDeclarator((Symbol) valueStack.get(valueStack.size()
- 4), Declarator.ARRAY); int arrayNum = (Integer) attributeForParentNode;
declarator.setElementNum(arrayNum); attributeForParentNode =
valueStack.get(valueStack.size() - 4); break; case
SyntaxProductionInit.Name_TO_Unary: attributeForParentNode =
typeSystem.getSymbolByText(text, nestingLevel, symbolScope); break; case
SyntaxProductionInit.ExtDeclList_COMMA_ExtDecl_TO_ExtDeclList: case
SyntaxProductionInit.VarList_COMMA_ParamDeclaration_TO_VarList: case
SyntaxProductionInit.DeclList_Comma_Decl_TO_DeclList: case
SyntaxProductionInit.DefList_Def_TO_DefList: { Symbol currentSym = (Symbol)
attributeForParentNode; Symbol lastSym = null; if (productionNum ==
SyntaxProductionInit.DefList_Def_TO_DefList) { lastSym = (Symbol)
valueStack.get(valueStack.size() - 2); } else { lastSym = (Symbol)
valueStack.get(valueStack.size() - 3); } currentSym.setNextSymbol(lastSym); }
break; case SyntaxProductionInit.OptSpecifier_ExtDeclList_Semi_TO_ExtDef: case
SyntaxProductionInit.TypeNT_VarDecl_TO_ParamDeclaration: case
SyntaxProductionInit.Specifiers_DeclList_Semi_TO_Def: Symbol symbol = (Symbol)
attributeForParentNode; TypeLink specifier; if (productionNum ==
SyntaxProductionInit.TypeNT_VarDecl_TO_ParamDeclaration) { specifier =
(TypeLink) (valueStack.get(valueStack.size() - 2)); } else { specifier =
(TypeLink) (valueStack.get(valueStack.size() - 3)); }
typeSystem.addSpecifierToDeclaration(specifier, symbol);
typeSystem.addSymbolsToTable(symbol, symbolScope);
handleStructVariable(symbol); break; case
SyntaxProductionInit.VarDecl_Equal_Initializer_TO_Decl: //Here you need to set
the Symbol object for the response, otherwise there will be an error above
Symbol symbol = (Symbol)attributeForParentNode; attributeForParentNode =
(Symbol) valueStack.get(valueStack.size() - 2); break; case
SyntaxProductionInit.NewName_LP_VarList_RP_TO_FunctDecl:
setFunctionSymbol(true); Symbol argList = (Symbol)
valueStack.get(valueStack.size() - 2); ((Symbol) attributeForParentNode).args =
argList; typeSystem.addSymbolsToTable((Symbol) attributeForParentNode,
symbolScope); symbolScope = ((Symbol) attributeForParentNode).getName(); Symbol
sym = argList; while (sym != null) { sym.addScope(symbolScope); sym =
sym.getNextSymbol(); } break; case
SyntaxProductionInit.NewName_LP_RP_TO_FunctDecl: setFunctionSymbol(false);
typeSystem.addSymbolsToTable((Symbol) attributeForParentNode, symbolScope);
symbolScope = ((Symbol) attributeForParentNode).getName(); break; case
SyntaxProductionInit.OptSpecifiers_FunctDecl_CompoundStmt_TO_ExtDef: symbol =
(Symbol) valueStack.get(valueStack.size() - 2); specifier = (TypeLink)
(valueStack.get(valueStack.size() - 3));
typeSystem.addSpecifierToDeclaration(specifier, symbol); symbolScope =
GLOBAL_SCOPE; break; case SyntaxProductionInit.Name_To_Tag: symbolScope = text;
attributeForParentNode = typeSystem.getStructFromTable(text); if
(attributeForParentNode == null) { attributeForParentNode = new
StructDefine(text, nestingLevel, null);
typeSystem.addStructToTable((StructDefine) attributeForParentNode); } break;
case SyntaxProductionInit.Struct_OptTag_LC_DefList_RC_TO_StructSpecifier:
Symbol defList = (Symbol) valueStack.get(valueStack.size() - 2); StructDefine
structObj = (StructDefine) valueStack.get(valueStack.size() - 4);
structObj.setFields(defList); attributeForParentNode = structObj; symbolScope =
GLOBAL_SCOPE; break; case SyntaxProductionInit.Enum_TO_EnumNT: enumVal = 0;
break; case SyntaxProductionInit.NameNT_TO_Emurator: doEnum(); break; case
SyntaxProductionInit.Name_Eequal_ConstExpr_TO_Enuerator: enumVal = (Integer)
(valueStack.get(valueStack.size() - 1)); attributeForParentNode = (Symbol)
(valueStack.get(valueStack.size() - 3)); doEnum(); break; case
SyntaxProductionInit.Number_TO_ConstExpr: case
SyntaxProductionInit.Number_TO_Unary: attributeForParentNode =
Integer.valueOf(text); break; default: break; }
astBuilder.buildSyntaxTree(productionNum, text); }
作用域
在上面說的構(gòu)造符號(hào)表的過程還有一個(gè)點(diǎn)沒有說,就是每個(gè)符號(hào)的作用域問題。
在LRStateTableParser有兩個(gè)屬性
* nestingLevel
用來表明當(dāng)前符號(hào)的嵌套層次
* symbolScope
用來表示當(dāng)前符號(hào)的作用域,如果是全局變量就設(shè)置為GLOBAL_SCOPE,為函數(shù)內(nèi)部的即設(shè)置為對(duì)應(yīng)的函數(shù)名
nestingLevel在shift操作遇見左右括號(hào)時(shí),就會(huì)進(jìn)行相應(yīng)的加減
symbolScope則是在reduce過程,如果遇見一個(gè)函數(shù)定義或者完成一個(gè)完整的函數(shù)定義就會(huì)進(jìn)行相應(yīng)的設(shè)置
小結(jié)
這一篇就是利用reduce操作的時(shí)候,我們可以知道是根據(jù)哪個(gè)產(chǎn)生式做的reduce,就可以在其中插入符號(hào)表的構(gòu)建。過程看著可能比較復(fù)雜,但其實(shí)就是:
根據(jù)reduce的產(chǎn)生式創(chuàng)建信息 -> 根據(jù)reduce的產(chǎn)生式來組合信息
另外的github博客:https://dejavudwh.cn <https://dejavudwh.cn>
熱門工具 換一換