14th of January

Discovered that the benchmark download had lost its connection at 1 am, so it won’t be completed until tomorrow. Started by recovering it, then made a start on tackling how all this LLVM ties in with QEMU.

To help with this I decided to write down the questions I had (typed out, in my case) and then answer them myself by looking through the code and Andrew’s diary. If I couldn’t work something out from the code, or it was too obscure, such as something Andrew did in a different way, I would ask him why.

Saw Brad, who also liked this idea and really wanted to push that I get my head around how the low-level machine stuff works as well as QEMU, so that for the upcoming meeting we really have something to push for.


The following contains the questions that I asked myself, mainly today. I filled in the answer if I had already looked at it, then did extra research to ensure it was correct.

Q: Where does the transformation to LLVM take place?

A: The appropriately named llvm_translate, inside llvm_disas_arm_ans. The instruction is decoded, then the LLVM instructions are built using LLVMBuild(Add|Sub) and LLVMBuildStore. Additional: as ARM instructions are 32-bit, the instruction is passed as an argument to the function.
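A minimal sketch of the decode step for a data-processing instruction, assuming the standard 32-bit ARM field layout. The struct and function names here (arm_dp_fields, decode_dp) are hypothetical and are not taken from llvm_translate:

```c
#include <stdint.h>

/* Fields of a 32-bit ARM data-processing instruction (hypothetical struct). */
struct arm_dp_fields {
    unsigned cond;      /* bits 31:28, condition code (0xE = always) */
    unsigned imm;       /* bit 25, 1 if operand 2 is an immediate */
    unsigned opcode;    /* bits 24:21, e.g. 0x4 = ADD, 0x2 = SUB */
    unsigned set_flags; /* bit 20, the S bit */
    unsigned rn;        /* bits 19:16, first operand register */
    unsigned rd;        /* bits 15:12, destination register */
    unsigned operand2;  /* bits 11:0, register/shift or immediate */
};

/* Pull the fields out of the instruction word passed in as an argument. */
static struct arm_dp_fields decode_dp(uint32_t insn)
{
    struct arm_dp_fields f;
    f.cond      = (insn >> 28) & 0xF;
    f.imm       = (insn >> 25) & 0x1;
    f.opcode    = (insn >> 21) & 0xF;
    f.set_flags = (insn >> 20) & 0x1;
    f.rn        = (insn >> 16) & 0xF;
    f.rd        = (insn >> 12) & 0xF;
    f.operand2  = insn & 0xFFF;
    return f;
}
```

For example, 0xE0810002 (ADD r0, r1, r2) decodes to opcode 0x4 with rn = 1 and rd = 0. Once the fields are known, the matching LLVM build call can be chosen from the opcode.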

Q: Where do the external compiler jobs get queued?

A: The actual queuing takes place in ext_blk_cmpl_new_jobs, which has the appropriate mutex/concurrency locks in place and deals with the queue. This is not, however, where it is decided whether we will use LLVM; that is answered in the next question.
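A minimal sketch of a mutex-protected job queue, to show the general shape of this kind of locking. It is not the code in ext_blk_cmpl_new_jobs; the names (job_queue, job_queue_push, job_queue_pop) and the fixed capacity are assumptions made for illustration:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define JOB_QUEUE_CAP 8  /* hypothetical capacity */

/* Fixed-size FIFO of opaque job pointers, guarded by one mutex. */
struct job_queue {
    pthread_mutex_t lock;
    void *jobs[JOB_QUEUE_CAP];
    size_t head;   /* index of the oldest job */
    size_t count;  /* number of queued jobs */
};

static void job_queue_init(struct job_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = 0;
    q->count = 0;
}

/* Queue a job; returns false if the queue is full. */
static bool job_queue_push(struct job_queue *q, void *job)
{
    bool ok = false;
    pthread_mutex_lock(&q->lock);
    if (q->count < JOB_QUEUE_CAP) {
        q->jobs[(q->head + q->count) % JOB_QUEUE_CAP] = job;
        q->count++;
        ok = true;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Take the oldest job; returns NULL if the queue is empty. */
static void *job_queue_pop(struct job_queue *q)
{
    void *job = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        job = q->jobs[q->head];
        q->head = (q->head + 1) % JOB_QUEUE_CAP;
        q->count--;
    }
    pthread_mutex_unlock(&q->lock);
    return job;
}
```

In this sketch the translating thread would call job_queue_push and the compiler thread would call job_queue_pop; holding the lock around every access to head and count is what keeps the two from corrupting the queue.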

Q: Where is it decided whether we are going to use LLVM?

A: Tracing where ext_blk_cmpl_new_jobs is called from shows it's used in exec.c, in tb_gen_code. Whether it is called depends on the translation block having its llvmable field set to true, which is set in target-arm/translate.c. llvm_qemu_insn_test: this takes an instruction (32-bit) and decodes the condition, category, flags and opcodes, similar to how the actual translate code does, then shifts it and sets a bit.

Q: What is the disassembler (udis86) used for?

A: It is used to calculate the size of a function: it is pointed at the start of a block and counts until it gets to the return instruction (llvm_translate@x86_64_function_size). It is also used by a helper function, unused at the moment, designed for printing out the LLVM version of the block.

Q: Where does the replacement of TCG-generated code with LLVM-generated code take place?

A: This is found in ext-blk-cmpl.c, right at the bottom of the ext_blk_cmpl function. It does this by putting a mov %14, %rdi at the start of the function. This function is called by the external block compiler process job function in the same file, which is what sets up the LLVM Module, Passes and JIT Compiler and performs the optimizations; it is also where the mutex locks are. To sum up, this function is created as a thread in linux-user/main.c. Notes: Instruction buffer size is 512 bytes.

Q: What does the "Enable this if theres trouble with LLVM generating blocks" code do?

A: It performs an LLVMVerifyModule, prints the error, then calls DisposeMessage (which frees the error string). LLVMVerifyModule: going by Analysis.h, it performs checks on a module, then gives the caller a string containing a human-readable description of any problems.

Q: Why aren’t the replaced blocks being printed out via the out_asm option?

Need to ask Andrew specifically, because he wrote the method block_dias in llvm_translate to get the blocks, so he will more than likely know why they are not getting printed.

Q: Which ARM -> LLVM instructions are already implemented?

A: From translate.c, and checked with Andrew. DATA_PROC with ADD, AND, ORR, SUB and MOV, and LD_ST_IM/LD_ST_REG: STUW and LDUW with PRE and POST.
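A minimal sketch of how the data-processing part of that list could be tested with a bit mask, one bit per opcode. This is an illustration, not the code in llvm_qemu_insn_test: the names are hypothetical, the opcode numbers are the standard ARM encodings, and the check is simplified in that it ignores the multiply and miscellaneous encodings that share bits 27:26 = 00.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard ARM data-processing opcode numbers (bits 24:21). */
enum {
    ARM_OP_AND = 0x0,
    ARM_OP_SUB = 0x2,
    ARM_OP_ADD = 0x4,
    ARM_OP_ORR = 0xC,
    ARM_OP_MOV = 0xD
};

/* One bit per supported opcode (hypothetical mask). */
#define SUPPORTED_DP_MASK                                     \
    ((1u << ARM_OP_AND) | (1u << ARM_OP_SUB) |                \
     (1u << ARM_OP_ADD) | (1u << ARM_OP_ORR) |                \
     (1u << ARM_OP_MOV))

/* True if insn is a data-processing instruction with a supported opcode.
 * Simplification: does not exclude multiplies or other special encodings. */
static bool dp_insn_is_supported(uint32_t insn)
{
    if (((insn >> 26) & 0x3) != 0)
        return false;                 /* not the data-processing class */
    unsigned opcode = (insn >> 21) & 0xF;
    return ((SUPPORTED_DP_MASK >> opcode) & 1u) != 0;
}
```

With this mask, ADD r0, r1, r2 (0xE0810002) is accepted while EOR r0, r1, r2 (0xE0210002) is rejected, so a block containing the EOR would not be marked as translatable.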

Shift with register – pitfall

A special point, which Andrew made a particular note of when I found some of the related code commented out, is that ARM has a special case for what shifting by 0 means: for ARM it means shift by 31. This means conditional logic is needed when the shift amount comes from a register, to check whether the operand is 0.
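A minimal sketch of the kind of conditional logic involved, written against the ARM register-shift rules as I understand them rather than against Andrew's code. The assumptions are: only the low byte of the shift register is used, an amount of 0 leaves the value unchanged, and amounts of 32 or more must be handled separately because a plain C (or LLVM) shift by the full width or more is undefined. For an arithmetic shift, clamping to 31 gives the same result as any larger amount, which may be where the 31 comes from. The names are hypothetical.

```c
#include <stdint.h>

enum arm_shift { ARM_LSL, ARM_LSR, ARM_ASR };

/* Shift value by the amount held in a register (rs), guarding the
 * cases that a raw 32-bit shift cannot express. */
static uint32_t arm_shift_by_reg(enum arm_shift kind, uint32_t value,
                                 uint32_t rs)
{
    unsigned n = rs & 0xFF;          /* only the bottom byte is used */

    if (n == 0)
        return value;                /* shift by 0: value unchanged */

    switch (kind) {
    case ARM_LSL:
        return n < 32 ? value << n : 0;
    case ARM_LSR:
        return n < 32 ? value >> n : 0;
    case ARM_ASR: {
        if (n > 31)
            n = 31;                  /* 32 or more behaves like 31 */
        /* Build the sign extension by hand to stay portable. */
        uint32_t fill = (value & 0x80000000u) ? ~(0xFFFFFFFFu >> n) : 0;
        return (value >> n) | fill;
    }
    }
    return value;
}
```

The same guards would have to be expressed as compare and select (or branch) instructions when the shift is built in LLVM, since the amount is only known at run time.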

Statistic Script Improvements

Worked on improving the instruction statistics script by adding a new method. I plan to improve it further, to make life easier, by introducing a kind of pipeline system: the Reading is one pass, the Instruction Usage Count is another pass, then some analysis on that is another.
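A minimal sketch of that pass idea, written in C for consistency with the rest of these notes rather than in the script's own language. Everything here is hypothetical (the context struct, the pass names, the one-mnemonic-per-line input format); it only shows passes sharing one context and being run in order:

```c
#include <stddef.h>
#include <string.h>

#define MAX_INSNS 64
#define MAX_KINDS 16

/* State handed from pass to pass (hypothetical layout). */
struct stats_ctx {
    char text[256];                /* raw input, one mnemonic per line */
    const char *insns[MAX_INSNS];  /* filled by the reading pass */
    size_t n_insns;
    const char *kinds[MAX_KINDS];  /* distinct mnemonics, from counting */
    unsigned counts[MAX_KINDS];
    size_t n_kinds;
    const char *most_used;         /* filled by the analysis pass */
};

typedef void (*stats_pass)(struct stats_ctx *);

/* Pass 1: split the raw text into one entry per instruction. */
static void pass_read(struct stats_ctx *c)
{
    c->n_insns = 0;
    for (char *tok = strtok(c->text, "\n");
         tok != NULL && c->n_insns < MAX_INSNS;
         tok = strtok(NULL, "\n"))
        c->insns[c->n_insns++] = tok;
}

/* Pass 2: count how often each mnemonic is used. */
static void pass_count(struct stats_ctx *c)
{
    c->n_kinds = 0;
    for (size_t i = 0; i < c->n_insns; i++) {
        size_t k = 0;
        while (k < c->n_kinds && strcmp(c->kinds[k], c->insns[i]) != 0)
            k++;
        if (k == c->n_kinds) {
            if (c->n_kinds == MAX_KINDS)
                continue;          /* table full: skip new mnemonics */
            c->kinds[k] = c->insns[i];
            c->counts[k] = 0;
            c->n_kinds++;
        }
        c->counts[k]++;
    }
}

/* Pass 3: a simple analysis, here the most frequently used mnemonic. */
static void pass_most_used(struct stats_ctx *c)
{
    unsigned best = 0;
    c->most_used = NULL;
    for (size_t k = 0; k < c->n_kinds; k++) {
        if (c->counts[k] > best) {
            best = c->counts[k];
            c->most_used = c->kinds[k];
        }
    }
}

/* Run every pass in order over the same context. */
static void run_pipeline(struct stats_ctx *c, const stats_pass *passes,
                         size_t n)
{
    for (size_t i = 0; i < n; i++)
        passes[i](c);
}

/* Convenience wrapper: load text, run all three passes, return the result. */
static const char *most_used_mnemonic(struct stats_ctx *c, const char *text)
{
    static const stats_pass passes[] = { pass_read, pass_count,
                                         pass_most_used };
    strncpy(c->text, text, sizeof c->text - 1);
    c->text[sizeof c->text - 1] = '\0';
    run_pipeline(c, passes, sizeof passes / sizeof passes[0]);
    return c->most_used;
}
```

The appeal of this layout is that a new analysis is just another function appended to the passes array, without touching the reading or counting code.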