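The complete code for this project is in C2j-Compiler <https://github.com/dejavudwh/C2j-Compiler>
Preface
With the basic Java bytecode instructions explained in the previous post, we can move on to the actual code generation. This post first covers the classes that code generation relies on, that is, the operations used to emit instructions.
The files used in this post are all under codegen: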
* Directive.java
* Instruction.java
* CodeGenerator.java
* ProgramGenerator.java
Directive.java
This is an enum used to generate the more special directives.
They all declare things such as a class or the extent of a method, so it is fairly simple.
public enum Directive {
    CLASS_PUBLIC(".class public"),
    END_CLASS(".end class"),
    SUPER(".super"),
    FIELD_PRIVATE_STATIC(".field private static"),
    METHOD_STATIC(".method static"),
    METHOD_PUBLIC(".method public"),
    FIELD_PUBLIC(".field public"),
    METHOD_PUBBLIC_STATIC(".method public static"),
    END_METHOD(".end method"),
    LIMIT_LOCALS(".limit locals"),
    LIMIT_STACK(".limit stack"),
    VAR(".var"),
    LINE(".line");

    private String text;

    Directive(String text) {
        this.text = text;
    }

    public String toString() {
        return text;
    }
}
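As a minimal sketch of how such an enum turns into assembly text, the example below joins a directive with its operand. MiniDirective is a trimmed-down stand-in for Directive, and the line helper and the class name CSourceToJava are hypothetical names chosen only for illustration; they are not taken from the project.

```java
class DirectiveSketch {
    // Hypothetical subset of the Directive enum, for illustration only.
    enum MiniDirective {
        CLASS_PUBLIC(".class public"),
        SUPER(".super"),
        END_METHOD(".end method");

        private final String text;

        MiniDirective(String text) {
            this.text = text;
        }

        @Override
        public String toString() {
            return text;
        }
    }

    // Hypothetical helper: a directive followed by its operand, if it has one.
    static String line(MiniDirective directive, String operand) {
        return operand.isEmpty() ? directive.toString() : directive + " " + operand;
    }

    public static void main(String[] args) {
        System.out.println(line(MiniDirective.CLASS_PUBLIC, "CSourceToJava"));
        System.out.println(line(MiniDirective.SUPER, "java/lang/Object"));
        System.out.println(line(MiniDirective.END_METHOD, ""));
    }
}
```

Because toString returns the directive text, the enum constant can be concatenated straight into the output line.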
Instruction.java
This is also an enum, used to generate the basic instructions.
public enum Instruction {
    LDC("ldc"),
    GETSTATIC("getstatic"),
    SIPUSH("sipush"),
    IADD("iadd"),
    IMUL("imul"),
    ISUB("isub"),
    IDIV("idiv"),
    INVOKEVIRTUAL("invokevirtual"),
    INVOKESTATIC("invokestatic"),
    INVOKESPECIAL("invokespecial"),
    RETURN("return"),
    IRETURN("ireturn"),
    ILOAD("iload"),
    ISTORE("istore"),
    NEWARRAY("newarray"),
    NEW("new"),
    DUP("dup"),
    ASTORE("astore"),
    IASTORE("iastore"),
    ALOAD("aload"),
    PUTFIELD("putfield"),
    GETFIELD("getfield"),
    ANEWARRAY("anewarray"),
    AASTORE("aastore"),
    AALOAD("aaload"),
    IF_ICMPEG("if_icmpeq"),
    IF_ICMPNE("if_icmpne"),
    IF_ICMPLT("if_icmplt"),
    IF_ICMPGE("if_icmpge"),
    IF_ICMPGT("if_icmpgt"),
    IF_ICMPLE("if_icmple"),
    GOTO("goto"),
    IALOAD("iaload");

    private String text;

    Instruction(String s) {
        this.text = s;
    }

    public String toString() {
        return text;
    }
}
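To make the role of this enum concrete, here is a small sketch that formats the instruction lines for computing 1 + 2. MiniInstruction is a hypothetical subset of Instruction, and format and addOneAndTwo are illustrative helpers, not methods from the project.

```java
import java.util.ArrayList;
import java.util.List;

class InstructionSketch {
    // Hypothetical subset of the Instruction enum, for illustration only.
    enum MiniInstruction {
        SIPUSH("sipush"), IADD("iadd"), IRETURN("ireturn");

        private final String text;

        MiniInstruction(String text) {
            this.text = text;
        }

        @Override
        public String toString() {
            return text;
        }
    }

    // One instruction line: a tab, the mnemonic, then an optional operand.
    static String format(MiniInstruction op, String operand) {
        return "\t" + op + (operand == null ? "" : " " + operand);
    }

    // Instruction lines that push 1 and 2, add them, and return the sum.
    static List<String> addOneAndTwo() {
        List<String> lines = new ArrayList<>();
        lines.add(format(MiniInstruction.SIPUSH, "1"));
        lines.add(format(MiniInstruction.SIPUSH, "2"));
        lines.add(format(MiniInstruction.IADD, null));
        lines.add(format(MiniInstruction.IRETURN, null));
        return lines;
    }

    public static void main(String[] args) {
        for (String line : addOneAndTwo()) {
            System.out.println(line);
        }
    }
}
```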
CodeGenerator.java
Now for the key part. The generation logic lives mainly in CodeGenerator and ProgramGenerator; CodeGenerator is the parent class of ProgramGenerator.
The CodeGenerator constructor creates an output stream that writes the bytecode to xxx.j.
public CodeGenerator() {
    String assemblyFileName = programName + ".j";
    try {
        bytecodeFile = new PrintWriter(new PrintStream(new File(assemblyFileName)));
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}
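emit, emitString, emitDirective, and emitBlankLine are the methods that output basic instructions; each has several overloads to handle different operations and operands. Note that some instructions may need to be cached first and submitted together at the end: buffered and classDefine are the boolean flags that decide whether output should be cached first.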
public void emitString(String s) {
    if (buffered) {
        bufferedContent += s + "\n";
        return;
    }
    if (classDefine) {
        classDefinition += s + "\n";
        return;
    }
    bytecodeFile.print(s);
    bytecodeFile.flush();
}

public void emit(Instruction opcode) {
    if (buffered) {
        bufferedContent += "\t" + opcode.toString() + "\n";
        return;
    }
    if (classDefine) {
        classDefinition += "\t" + opcode.toString() + "\n";
        return;
    }
    bytecodeFile.println("\t" + opcode.toString());
    bytecodeFile.flush();
    ++instructionCount;
}

public void emitDirective(Directive directive, String operand1, String operand2, String operand3) {
    if (buffered) {
        bufferedContent += directive.toString() + " " + operand1 + " " + operand2 + " " + operand3 + "\n";
        return;
    }
    if (classDefine) {
        classDefinition += directive.toString() + " " + operand1 + " " + operand2 + " " + operand3 + "\n";
        return;
    }
    bytecodeFile.println(directive.toString() + " " + operand1 + " " + operand2 + " " + operand3);
    ++instructionCount;
}

public void emitBlankLine() {
    if (buffered) {
        bufferedContent += "\n";
        return;
    }
    if (classDefine) {
        classDefinition += "\n";
        return;
    }
    bytecodeFile.println();
    bytecodeFile.flush();
}
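The buffering idea can be sketched in isolation as follows. This is a simplified model under stated assumptions: a StringBuilder stands in for bytecodeFile, every emitted string gets a trailing newline, and commitBuffer is a hypothetical name for the step that submits the cached text (how the project actually commits its buffer is not shown in this post).

```java
class BufferedEmitSketch {
    private final StringBuilder output = new StringBuilder();          // stands in for bytecodeFile
    private final StringBuilder bufferedContent = new StringBuilder(); // text held back for later
    private boolean buffered = false;

    void setBuffered(boolean buffered) {
        this.buffered = buffered;
    }

    // While buffered is set, text goes to the cache instead of the output.
    void emitString(String s) {
        if (buffered) {
            bufferedContent.append(s).append("\n");
            return;
        }
        output.append(s).append("\n");
    }

    // Hypothetical commit step: append the cached text and clear the cache.
    void commitBuffer() {
        output.append(bufferedContent);
        bufferedContent.setLength(0);
    }

    String result() {
        return output.toString();
    }

    // Emits a method body before its header, yet the header ends up first.
    static String demo() {
        BufferedEmitSketch gen = new BufferedEmitSketch();
        gen.setBuffered(true);
        gen.emitString("\tiload 0");
        gen.setBuffered(false);
        gen.emitString(".method public static f(I)I");
        gen.commitBuffer();
        return gen.result();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

The point of the flag is ordering: text produced early can be placed after text produced later.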
ProgramGenerator.java
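ProgramGenerator extends CodeGenerator, so it inherits these basic operations. The instruction output for things like structs and arrays described in the previous post is all in this class.
Handling nesting
First look at four fields, which are mainly used to handle nested branches and loops.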
private int branch_count = 0;
private int branch_out = 0;
private String embedded = "";
private int loopCount = 0;
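* Each time an if-else statement is nested, a character 'i' is appended to the embedded field; when a branch is exited, that 'i' is cut off again
* branch_count and branch_out both mark branch jumps within the same scope
* In other words, nesting is handled with embedded, and branches in the same scope are distinguished with branch_count and branch_out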
public void incraseIfElseEmbed() {
    embedded += "i";
}

public void decraseIfElseEmbed() {
    embedded = embedded.substring(1);
}

public void emitBranchOut() {
    String s = "\n" + embedded + "branch_out" + branch_out + ":\n";
    this.emitString(s);
    branch_out++;
}
loopCount handles nested loops.
public void emitLoopBranch() {
    String s = "\n" + "loop" + loopCount + ":" + "\n";
    emitString(s);
}

public String getLoopBranch() {
    return "loop" + loopCount;
}

public void increaseLoopCount() {
    loopCount++;
}
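The label naming scheme can be tried out on its own. The sketch below copies only the naming logic for branch-out and loop labels into a standalone class; the method names enterIfElse, leaveIfElse, nextBranchOutLabel, and the demo methods are hypothetical, and the order in which the real compiler calls these operations is an assumption made for the example.

```java
class LabelSketch {
    private int branchOut = 0;
    private String embedded = "";
    private int loopCount = 0;

    void enterIfElse() {
        embedded += "i";
    }

    void leaveIfElse() {
        embedded = embedded.substring(1);
    }

    // One 'i' per nesting level, then a counter shared by all branch-out labels.
    String nextBranchOutLabel() {
        String label = embedded + "branch_out" + branchOut;
        branchOut++;
        return label;
    }

    String getLoopBranch() {
        return "loop" + loopCount;
    }

    void increaseLoopCount() {
        loopCount++;
    }

    // Labels produced at nesting depth 0, 1, 2, and 0 again.
    static String branchDemo() {
        LabelSketch gen = new LabelSketch();
        StringBuilder sb = new StringBuilder();
        sb.append(gen.nextBranchOutLabel()).append(' ');
        gen.enterIfElse();
        sb.append(gen.nextBranchOutLabel()).append(' ');
        gen.enterIfElse();
        sb.append(gen.nextBranchOutLabel()).append(' ');
        gen.leaveIfElse();
        gen.leaveIfElse();
        sb.append(gen.nextBranchOutLabel());
        return sb.toString();
    }

    // Labels for two loops that follow one another.
    static String loopDemo() {
        LabelSketch gen = new LabelSketch();
        String first = gen.getLoopBranch();
        gen.increaseLoopCount();
        String second = gen.getLoopBranch();
        return first + " " + second;
    }

    public static void main(String[] args) {
        System.out.println(branchDemo()); // prints "branch_out0 ibranch_out1 iibranch_out2 branch_out3"
        System.out.println(loopDemo());   // prints "loop0 loop1"
    }
}
```

The prefix keeps labels at different nesting depths distinct, while the counter keeps labels at the same depth distinct.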
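Handling structs
putStructToClassDeclaration defines a struct, that is, it creates a new class instance. declareStructAsClass handles the variables inside the struct, which amounts to handling the fields of the class.
* Once a struct has a class definition, it is added to structNameList so that it is not defined again
* If symbol.getValueSetter() is not null, this indicates a struct array, so the instance is loaded directly from the array and does not need to be created on the stack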
* declareStructAsClass creates a class using the class-related Java bytecode instructions described in the previous post
public void putStructToClassDeclaration(Symbol symbol) {
    Specifier sp = symbol.getSpecifierByType(Specifier.STRUCTURE);
    if (sp == null) {
        return;
    }

    StructDefine struct = sp.getStruct();
    if (structNameList.contains(struct.getTag())) {
        return;
    } else {
        structNameList.add(struct.getTag());
    }

    if (symbol.getValueSetter() == null) {
        this.emit(Instruction.NEW, struct.getTag());
        this.emit(Instruction.DUP);
        this.emit(Instruction.INVOKESPECIAL, struct.getTag() + "/" + "<init>()V");
        int idx = this.getLocalVariableIndex(symbol);
        this.emit(Instruction.ASTORE, "" + idx);
    }

    declareStructAsClass(struct);
}

private void declareStructAsClass(StructDefine struct) {
    this.setClassDefinition(true);

    this.emitDirective(Directive.CLASS_PUBLIC, struct.getTag());
    this.emitDirective(Directive.SUPER, "java/lang/Object");

    // Declare one public field per struct member.
    Symbol fields = struct.getFields();
    do {
        String fieldName = fields.getName() + " ";
        if (fields.getDeclarator(Declarator.ARRAY) != null) {
            fieldName += "[";
        }

        // The char-pointer check comes before the plain char check;
        // in the other order the String branch could never be reached.
        if (fields.hasType(Specifier.INT)) {
            fieldName += "I";
        } else if (fields.hasType(Specifier.CHAR) && fields.getDeclarator(Declarator.POINTER) != null) {
            fieldName += "Ljava/lang/String;";
        } else if (fields.hasType(Specifier.CHAR)) {
            fieldName += "C";
        }

        this.emitDirective(Directive.FIELD_PUBLIC, fieldName);
        fields = fields.getNextSymbol();
    } while (fields != null);

    // Constructor: call Object's constructor, then give each field a default value.
    this.emitDirective(Directive.METHOD_PUBLIC, "<init>()V");
    this.emit(Instruction.ALOAD, "0");
    String superInit = "java/lang/Object/<init>()V";
    this.emit(Instruction.INVOKESPECIAL, superInit);

    fields = struct.getFields();
    do {
        this.emit(Instruction.ALOAD, "0");
        String fieldName = struct.getTag() + "/" + fields.getName();
        String fieldType = "";
        if (fields.hasType(Specifier.INT)) {
            fieldType = "I";
            this.emit(Instruction.SIPUSH, "0");
        } else if (fields.hasType(Specifier.CHAR) && fields.getDeclarator(Declarator.POINTER) != null) {
            fieldType = "Ljava/lang/String;";
            this.emit(Instruction.LDC, " ");
        } else if (fields.hasType(Specifier.CHAR)) {
            fieldType = "C";
            this.emit(Instruction.SIPUSH, "0");
        }

        String classField = fieldName + " " + fieldType;
        this.emit(Instruction.PUTFIELD, classField);
        fields = fields.getNextSymbol();
    } while (fields != null);

    this.emit(Instruction.RETURN);
    this.emitDirective(Directive.END_METHOD);
    this.emitDirective(Directive.END_CLASS);

    this.setClassDefinition(false);
}
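To see the text that declareStructAsClass builds up, the sketch below reproduces the same sequence of directives and instructions with plain strings. structToJasmin is a hypothetical helper, the struct name Point and its fields are made up for the example, and only int ("I") and char ("C") fields are covered. It mirrors the layout of the emitted text only; it is not run through an assembler here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StructClassSketch {
    // Builds the class text for a struct; fields maps field name to JVM descriptor ("I" or "C").
    static String structToJasmin(String tag, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append(".class public ").append(tag).append("\n");
        sb.append(".super java/lang/Object\n");

        // One public field per struct member.
        for (Map.Entry<String, String> f : fields.entrySet()) {
            sb.append(".field public ").append(f.getKey()).append(" ").append(f.getValue()).append("\n");
        }

        // Constructor: call Object's constructor, then zero every field.
        sb.append(".method public <init>()V\n");
        sb.append("\taload 0\n");
        sb.append("\tinvokespecial java/lang/Object/<init>()V\n");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            sb.append("\taload 0\n");
            sb.append("\tsipush 0\n");
            sb.append("\tputfield ").append(tag).append("/").append(f.getKey())
              .append(" ").append(f.getValue()).append("\n");
        }
        sb.append("\treturn\n");
        sb.append(".end method\n");
        sb.append(".end class\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("x", "I");
        fields.put("mark", "C");
        System.out.print(structToJasmin("Point", fields));
    }
}
```

A LinkedHashMap is used so that the fields come out in declaration order, just as the original walks the field list with getNextSymbol.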
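Getting stack information
The rest of the Java bytecode handling follows the previous post and the logic is not complicated. Now look at one method, getLocalVariableIndex, which returns the variable's current position in the local variable queue.
* First get the function currently being executed, then get its parameters and reverse them (this has to do with the order in which arguments are pushed)
* Then add the symbols in the current symbol's scope to the list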
* Finally, traverse the list to work out the symbol's position in the queue
public int getLocalVariableIndex(Symbol symbol) {
    TypeSystem typeSys = TypeSystem.getInstance();
    String funcName = nameStack.peek();
    Symbol funcSym = typeSys.getSymbolByText(funcName, 0, "main");

    // Collect the function's parameters, then reverse them.
    ArrayList<Symbol> localVariables = new ArrayList<>();
    Symbol s = funcSym.getArgList();
    while (s != null) {
        localVariables.add(s);
        s = s.getNextSymbol();
    }
    Collections.reverse(localVariables);

    // Append the symbols of the same scope that are not already in the list.
    ArrayList<Symbol> list = typeSys.getSymbolsByScope(symbol.getScope());
    for (int i = 0; i < list.size(); i++) {
        if (!localVariables.contains(list.get(i))) {
            localVariables.add(list.get(i));
        }
    }

    // The position in the list is the local variable index.
    for (int i = 0; i < localVariables.size(); i++) {
        if (localVariables.get(i) == symbol) {
            return i;
        }
    }

    return -1;
}
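The three steps can be sketched with plain strings in place of Symbol objects. In this hedged example, localIndex is a hypothetical helper, and it assumes that the argument list of a function f(a, b) is linked as b then a, which is why the reversal restores declaration order; the real linking order depends on the project's symbol table. The sketch also compares names with equals, whereas the original compares Symbol references.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LocalIndexSketch {
    // argsAsLinked: parameter names in the order the symbol table links them (assumed reversed).
    // scopeSymbols: names of the symbols in the same scope as the variable we look up.
    static int localIndex(List<String> argsAsLinked, List<String> scopeSymbols, String name) {
        // Step 1: take the parameters and reverse them.
        List<String> slots = new ArrayList<>(argsAsLinked);
        Collections.reverse(slots);

        // Step 2: append scope symbols that are not already present.
        for (String s : scopeSymbols) {
            if (!slots.contains(s)) {
                slots.add(s);
            }
        }

        // Step 3: the position in the list is the index, or -1 if absent.
        return slots.indexOf(name);
    }

    public static void main(String[] args) {
        // Models: int f(int a, int b) { int c; ... }
        List<String> linkedArgs = Arrays.asList("b", "a");
        List<String> scope = Arrays.asList("c");
        System.out.println(localIndex(linkedArgs, scope, "a")); // prints 0
        System.out.println(localIndex(linkedArgs, scope, "b")); // prints 1
        System.out.println(localIndex(linkedArgs, scope, "c")); // prints 2
        System.out.println(localIndex(linkedArgs, scope, "d")); // prints -1
    }
}
```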
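Summary
This post mainly provides different methods, based on the JVM bytecode from the previous post, to output the instructions for different operations.
Stars are welcome!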