A look at Mach-O

1 Comment

Yesterday (2010-11-06), when I took a break from studying, I started looking at the Mach-O Format used by Apple’s Mac OS X.  It is a format for executable,  similar in purpose to the PE Format (Portable Executable) that is used by Microsoft Windows. That said both are not limited just to runnable executable but could also be a dynamic library (Usually denoted by dll extension for/on Windows). I started working on a simple C program, were you provide it the path to a Mach-O binary and it would print on the information. I am aware there is objdump and a few other tools which will do this but I was after learning the format, not after a way to decode it, and none of them had a nice modular/library interface to being able to do such a thing anyway.

The binary I was looking at was “Mach-O fat file with 2 architectures” but more on this later. I was able to use 7-zip to extract one of the architecture contained in it out. The format is documented on the Apple Developer site, and the main structure is as follows:

struct mach_header macho_header_struct
{
    uint32_t magic;
    cpu_type_t cputype;
    cpu_subtype_t cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;
    uint32_t sizeofcmds;
    uint32_t flags;
};

 

Ignoring the error checking, the actual workload looked something like this:

struct mach_header header;
bufferSize = fread (&header,1,sizeof(header), file);
printf("mach object header\n"); 
printf("magic number\t%08x\n", header.magic); 
printf("cputype\t\t%08x\t%s\n", header.cputype, cputype_to_string(false, header.cputype)); 
printf("subcputype\t%08x\n",  header.cpusubtype); 
printf("filetype\t%08x\t%s\n",  header.filetype, filetype_to_string(header.filetype)); 
printf("ncmds\t\t%08x\n",  header.ncmds); 
printf("sizeofcmds\t%08x\n",  header.sizeofcmds); 
printf("flags\t\t%08x\n",  header.flags); 
 

Once I had that working, the next step I choose to take was to look into the “fat” file again. Since I had the above in place I realised that if it ias a “fat” executable it has a different magic number and it matches the magic number defined in the documentation. Instead of a mach_header there is a fat header which has the magic number and the number of architectures. Straight after that structure in the file is an array of structures containing the information about where to find the mach_header for each architecture. It would seem the Apple marketing term for a “fat” executable is a Universal binary as it can be used on more then one platform.

This diagram shows the layout of data for an fat executable, it shows the fields in the fat_header and fat_arch structures. Something I found interesting was the offset, the offset is from the start of the file and not from the offset location or at the least the start of the fat_arch structure. After the fat_arch should be the first mach_header structure. Something that may be interesting to follow up is if the mach_header comes straight after the last fat_arch structure or is there padding/zeroed space or some other undocumented metadata that I haven’t came across in my studies so far. Since the location is based from the offset it would suggest there no requirement for it to be straight after.

macho_header

I then was able to seek to the offset of one of the fat_arch and then the code above I had previously written for just reading the mach_header and printing it out, worked. The next step from here is to decode the load commands into the appropriate structures, a problem here is there are 37 structures. I can see two ways of dealing with this, the first way is create each different structure read it, then I can also print out information while I work but this requires doing it for all 37 structures. The second option is generically allocate it, since the command structure all have 2 fields the type and the size, so the size could be used to allocate the amount of memory it requires, then it can just read in the structure as bytes, delaying the structure till later where it can be casted on use. Another aspect I will need to think about soon is and appropriate API and how to go about making the current stuff more modular, so for example for a fat binary it will give the array of fat_arch structures, then that can be used to then get a mach_header, followed by a load_commands or maybe even just get_commands, that is just some food for my thoughts.

Fibonacci in x86 Assembly

Leave a comment

On the 31st of October 2009, sometime in the afternoon, the topic of Fibonacci came up, on the computer science channel for the University of Adelaide. I think the weeks coding activity for first years was to write a recursive Fibonacci algorithm. That said, future post will be why recursive Fibonacci is the wrong way of thinking about the problem. I was in my third year and for fun I decided that instead I would write an iterative one, in p in C. After that was done was still not happy with that as being an end point so I decided it was time to learn a little bit of X86 assembly. So the logical step here was to translate the algorithm I wrote to x86 assembly which goes a little as follows:

fib.x86.asm

        .text
.globl _fib

# Fibonacci in x86 assembly by Sean Donno <darkdonno@gmail.com> 
#
# Paramter n is in  0x8(esp) store in ebx
# Return via eax (last)
# ecx = n  (loaded from 0x(esp)
# ebx = secondlast
# eax = last  (and the value returned)
 
_fib:
    pushl %ebp
    movl  %esp, %ebp
    pushl %ebx
    pushl %ecx
    # Saved Registers
 
    movl 8(%ebp), %ecx          # ecx = n
    movl $0, %ebx               # ebx = secondlast = 1
    movl $1, %eax               # eax = last = 0
    jmp    fib2
fib1:
    add  %ebx, %eax             # last = last + secondlast
    neg  %ebx
    add  %eax, %ebx             # secondlast = secondlast + last
    dec  %ecx                   # n = n  1
fib2:
    cmp  $0, %ecx
    jne  fib1                   # if n != 0 goto fib1
 
    # Restore registers and return
    popl %ecx
    popl %ebx
    popl %ebp 
    ret

Code available on Google Docs at https://docs.google.com/Doc?docid=0AXW7elIWGgADZGR6ZHJkMl8xMzd6ZHBxNmhk&hl=en, I really wish WordPress had support this JavaScript syntax highlighter so it provides a a plain version, and can do line numbers, and colouring on the fly, it might and I just haven’t looked hard enough. At the time of writing this i am using hosting at wordpress.com so to the best of my knowledge installing any plugins are not possible.

The comments make more sense when you can compare it to the C version, a while later I followed this up with a version for DLX, for those who haven’t gone to the University of Adelaide (UofA), DLX is a a RISC processor designed by Hennessy and Patterson, who worked on the MIPS architecture, and its a slightly slimmer version of MIPS, that is taught in the Computer Systems course at UofA.  For some comparison the DLX version is 10 instructions, but passes arguments via registers instead of a stack, which is what 7 of the instructions is the x86 version does. There are 15  lines of comments in the traditional ‘function’ commenting style used in the course, which outnumbers the actual lines of code.

SEGP – Fourth Milestone – 1

1 Comment

This post documents the work done for the fourth milestone on the Software Engineering Group Project done at the University of Adelaide in association with ASTC as follow up to the Summer Project I worked on. 

The first part of the follow up work involved improving the tools that I had written during summer for collecting results such that more information could be extracted and formulated. Another thing that was done was the addition of using oprofile to determine which blocks the QEMU spends the most time in for a given test program. This means its easier to determine where time is spent the most and can point people into the right direction of what instructions and blocks to look at getting translatable next. The final thing was the addition of new instructions, which were reverse subtract and exclusive or. The addressing modes for using ASR and LSL were looked at and implementation provided, along with a implementation for the specialised zero case but none of the tests we run use that so those cases are disabled. The big one was the push instruction, out of two variations that were implemented only one works to date.

The plan was to remove the memcpy of LLVM Blocks by allowing qemu to jump out to the location were LLVM generates the native code for a block and then returning to the epilogue for that block, with the hope to remove the locking/blocking. For extended detail on this I will write the post at a later date. (*IF NOT PUT IT HERE AND SEPERATE THIS)

The first task to do was to first of all calculating the total time spent in the critical section

This was added in ext-blk-cmpl.c

#include <time.h>
	uint64_t ext_blk_time_spent_processing = 0;

 

After the lock is successful in ext_blk_cmpl_replace_block then the current time is recorded.

	struct timeval tv;
	gettimeofday(&tv, NULL);
	uint64_t start_time = (tv.tv_sec * 1000000) + tv.tv_usec;

After the lock is released update the total time spent by adding on the difference in current time with start time along with a print out of the time so far.

	
	gettimeofday(&tv, NULL);
	ext_blk_time_spent_processing += (tv.tv_sec * 1000000) + tv.tv_usec - start_time;
	printf("Acc Time spent blocked processing the jobs: %llu\n",ext_blk_time_spent_processing);

This has a higher precision, i first did it with just time which is second precision and it just read as zero. The result on my machine from that is the final was 36673 microseconds which is 0.036673 seconds. That is a lot smaller then expected,

Acc Time spent blocked processing the jobs: 36673

Research – Exploring the TranslationBlock structure and epilogue.The structure has a field unit8_t *tc_ptr, this is a pointer to the translated code. (The Address in the code cache). What we are after is weather or not it is read from and stored elsewhere, such as in the epilogue of another block. An quick check with ack-grep for tc_ptr reveals 43 lines in which tc_ptr appears. 28 are pointer references (->). 15 out of the 43 of them were in the ext-blk-cmpl which is created by our work so easily to handle, leaving 28 references in translate-all.c, exec-all.h, exec_c, cpu-exec.c, two of which are in comments.

 

So we have 28 occurrences of tc_ptr, and of those there are 13 that use pointer referencing. Three of them are uint8_t *tc_ptr so eliminating that brings us to 25, eliminating the two .tc_ptr which are in comments is down to 23.Remove another 2, one for a function with argument tc_ptr and unsigned long tc_ptr; and one is doing the disassembly of the code to the log file so that brings us to 20 possible usages of tc_ptr that may result in a problem. For the exact command I did, the result after wards is not exactly but i just modified it so it was blog friendly.

ack-grep tc_ptr | grep "\.[ch]" | grep -v ext-blk-cmpl | grep -v \*tc_ptr | grep -v "]\.tc" | grep -v "log_disas" | grep -v "unsigned long tc_ptr" | wc -l 

translate-all.c: 102, 173, 174, 185
exec-all.h: 242, 245, 265
exec.c: 735,875, 876, 1179, 1234, 1235, 1243, 1244, 1246
cpu-exec.c: 109, 581, 607, 614

 

  • translate-all.c
    • 102 – Located in cpu_gen_code, it is read into gen_code_buf, which is passed to tcg_gen_code so this is SAFE, as this is before TCG code is generated, and is only used there.
    • 173/174 – Located in cpu_restore_state and tc_ptr is read from and checked against searched_pc  (Potential problem comment says finds opc index corresponding to search pc)
    • 185 – Located in cpu_restore_state –> tcg_gen_code_search_pc is called with the tc_ptr and searched_pc – tc_ptr (Specially here is the problem)
      • This is the only location where tcg_gen_code_search_pc is called from, this calls tcg_gen_code_common.
  • exec-all.h
    • 242/245 – Insidetb_set_jmp_target, calls tb_set_jmp_target1 with the tc_ptr.
      • This is located in an #elif defined(__arm__) and requires USE_DIRECT_JUMP, which is also defined when arm is used
      • Now  following it down the rabbit hole: static inline void tb_set_jmp_target1(unsigned long jmp_addr, unsigned long addr)
      • Called from
        • tb_reset_jump in exec.c where by it is used to “reset the jump entry ‘n’ of a TB so taht it is not chained to another TB
          • the address is tc_ptr + tb->tb_next_offset[n]
        • tb_add_jump 264  where it is used to patch a native jump address. (Assuming it is used to chain)
      • tb_jmp_offset is 4 elements and is the offset of jump instruction.
    • 265 – Located in tb_add_jump, this is used to patch the native jump address
      • Called in cpu-exec.c line 593
  • exec.c
    • 735 – Used in calculation for tb_set_jmp_target see 242/245 in exec-all.h (above)
    • 875/876 – Located inside tb_gen_code this is where the tc_ptr in the TranslationBlock is first set so no problem.
    • 1179 – Located in tb_free, it frees a translation block, if its the last one used.
    • 1234/1235/1234/1244/1246 – Located in a function called tb_find_pc – Similar problem as the search_pc. This returns the translation block of a given PC. PROBLEM it uses the tc_ptr the thing for searching.
  • cpu-exec.c
    • 109 – Located in cpu_exec_nocache, tcg_qemu_tb_exec is called on the translated code. This executes the code after it is generated, so safe.
    • 581 – Used for the trace debug statement output
    • 607/614 – This reads in the tc_ptr from the translation block and executes it.

 

Summary of problem functions

  • tb_find_pc – It uses the the Translated Code Pointer to find a Translation Block. Located in exec.c.
  • cpu_restore_state – It uses the Translated Code Pointer. Requires more details, and research.
  • Chaining
    • tb_reset_jump – Called/Defined in exec.c
    • tb_add_jump – Called in cpu-exec.c and defined in exec-all.h

Distributed Music Player

Leave a comment

Well last Monday, was the last day for the game jam for the Game Development Club, I had hoped to finish off one of my entries but things took a detour instead. I had gone into university to submit my preferences for working as a prac supervisor/demonstrator. I was initially up in Innova 21 (the new building on campus). Waiting for Wilson to finish with his meeting as he recently got back from China, so there was some catch-up about how that was, and then we met with Lennie for lunch and more about Wilson’s adventures in China was discussed. Now the detour happened when after lunch we ended up in the Computer Graphics lab, as Lennie was working on improving his Distributed Music Player that he first wrote in 2007.

The problem was he was discussion the program and I decided to contribute to explain how I would do it and what he should do however, he turned around and suggested I look at it/ do it. So pretty much for the last three days that is what I have been doing. The work I did was created a new python class that was responsible for holding a number of clients,  able to load the ip list, and able to perform the play, stop and loadFile command on all the clients. I then started using multiprocessing library in Python 2.6 and later to do multi threading however slight problem with that as it uses pickle (serialisation) and network sockets do not like serialisation so rather then try to find a way to get that working decided to just use the threading library which is available in 2.5. Long story short I now have python 2.5 (what i had originally), python 2.6 and python 2.7.

 

That night I wasted most the night learning how to use Glade and PyGTK, at the end of the night I eventually had a basic GUI up and running as seen below. Glade is a Visual editor for creating user interfaces for GTK. and GTK is a user interface API/ library. However in hindsight I could of ported the existing Client and Host classes to C# and then developed the interface in Visual Studio in about an hour. However on the plus side I did learn a new tool, which could help in the future.

dmpcc2

Day two, I went into uni again, hooked up a slider to be the volume bar, and rehauled the interface to have a toolbar and status bar, and moved the buttons to the status bar, introduced a context menu for right clicking on clients, didn’t get that working, had issues passing what client was ‘selected. Added the loading orchrestra files since I didn’t want to make a dialog that had a list of items to select, and this way it was now playable. That night, I added a menubar and an about screen, and disconnect button.

dmpgoing

Day three, added the logout button to the menu and created a “question” dialog, which is a dialog that has a "Message or Question”, and an Text Entry area so you can enter. This was then hooked up as the “Connect to a client”, dialog when you right click on the client area somewhere there is not a client. Added Solo and Mute support before going to bed, I was only planning on adding it to the gui and the templates however I started writing comments for the steps to implement solo and mute but i ended up doing the code instead. I actually touched the java code for the node to perform a little cleanup, I removed some unused methods, and variables and changed all the fields to private followed by running a reformatter over the code but only indentation was changed .

 

Hopefully I will post a follow up later to night.

Latin Invaders

Leave a comment

I present you Latin Invaders. This was for the Adelaide University Game Development Club – Third Game Jam in 2010.
It took about an hour to adapt the existing code base I had from Space Invaders to fit the theme "Every Letter of the Alphabet"

Story
The Latin character sets from the planeta Marineris.
You are the last line of defence and must shoot them out of the sky.

Theme Usage
I choose instead of using the theme in a subliminal manner to go for the superliminal manner (superliminal is a word from the Simpsons)

How to Play
Left Arrow Key – Moves Left
Right Arrow Key – Moves Right
Space Bar – Fires

Download
I will post a download to the source and Window Binary at a later time.

 

Development

The first step I took was in the InitGame function where it loads the graphics for the Invaders, arrays of SDL_Surfaces were setup to hold the sprites for the Invaders, and setup a for loop to populate this, then changed the existing loading to instead just refer to the some sprites in this array for testing to ensure it rendered correctly. There were 3 structures one for the top set, one for the middle set and one bottom set, which held onto the frames for the particular type of invaders (since there are 3 types). Upon simply commented out or deleting it showed all the use cases. I realised that I could improve the system, I had originally had 2 arrays one for frame 1 and one for frame 2, then I simplified it to be vaderFrames[2][26] but upon realising the structure mentioned earlier took an array of SDL_Surfaces which are the frames so changed it to do vaderFrames[26][2] this allows  it to just go theFramesForVaderZ = invaderFrame[25];. The structure was removed and the code was simplified so each Vader Instance (which has weather its alive or dead), also held a pointer to the frames, which is what the theFramesForVaderZ really is its more invader[25].frames = invaderFrame[25]. That was it, just  fixed hte rendering code to use the new array. The next step was to replace the loading of the player and bonus ship graphics with text instead for added effect.

Cleanup on the data files were performance to removed the unused graphics, since the Bonus ship, invaders and the player graphics were now text based.

Since I had done all my development so far on Microsoft Windows with Visual Studio, I decided to boot up Canonical Ubuntu in Oracle VirtualBox, and quickly found there were a few problems with the Makefile, the first being it always produced SpaceInvaders.exe, so split this into two parts TARGET and EXT (these are variables), the TARGET would be latininvaders and on Windows_NT the EXT would be .exe. The next problem was the run target, on windows launching it with Release\latininvaders.exe is fine but on Linux it should be ./Release/latininvaders so added the RUN command to the if def and yes it uses TARGET and EXT variables.

I was planning on changing the background image with one generated from text but lost motivation and just wanted to get this little project completed. I then ran Radical Image Optimization Tool over certain images to make them smaller. It was quite successfully reduced entire size of all the images and the font to be half the file size of the background image original was. Very nice tool, that I recommend as it features a preview which you can zoom in so you can see the pixels closer to see the differences.

Pangrams Game Idea

Leave a comment

This is an idea I had the night of the Adelaide Unviersity Game Development Club Third GameJam started or possibly the day after. I however had no focus to actually getting around to coding it and since there is also requires art work to match the game play neither the time to waste on trying to draw the stuff my self or finding external resources (I had considered clipart but anyway).

Have not thought of a proper name for the game yet so I have titled it Pangrams.

Enviroment
The game play style is point and click style, you click on the scene to move the character around, you can also click on actions such as “Use” or “Pick Up” and have an inventory. The objective is to collect different objects or perform different actions in a “scene” to complete it. The idea was to have about 4-5 scenes maybe more depending on time and available art work. There would be multiple entry and exit points to the scene.

Theme
The theme comes from that the levels/scenes are constructed from Pangrams, which for those not up to date on bizarre words are “are sentences using every letter of the alphabet at least once. ”

Game Play Example
The first would be depict the most widely known pangram. “The quick brown fox jumps over the lazy dog.” , so there is a Dog lying lets say on a bridge and so you the fox can’t bet past, to leave the right hand side of the screen to go to the next level. You have no current way to get past him so you leave to the left to go to another scene. This scene would be Jackdaws love my big sphinx of quartz. So you have to take a Jackdaws (bird) to a sphinx made out of quartz, upon doing this you get a pogostick (this was Shrubber’s idea). Now you return to the original scene and can jump over the dog with the pogostick.
Ideally here for simplicity the player would not always be a fox, so possibly when apporitate (need a new main character) the player transformers into the another… Maybe this is a artifical reality game, and they change avatar.

List of possible appropriate pangrams:

  • The jay, pig, fox, zebra, and my wolves quack!
  • Wolf zombies quickly spot the jinxed grave
  • Waxy and quivering, jocks fumble pizza.
  • Vamp fox held quartz duck just by wing.
  • How quickly daft jumping zebras vex.
  • The lazy major was fixing Cupid’s broken quiver.

Ideally game play is a whole lot more adventure like if you don’t realise what the panagrams are. The idea was once you compelte one it would flash up with the pangram on the screen.

Sorry, I did implementation this game.

Wave Sleep

Leave a comment

I was trying to get to sleep last night (22nd-May-2010), and started thinking about projects I would like to make, which lead onto the hosting and project management. During the week I had read the Google I/O talk about Programming on/with Wave. The idea is a robot that will check a subversion repository, and will post to a wave of the changes, etc, so kind of like the emails that you can get as a post-commit script. Little bit of thinking I realised , it may be possible to handle the waving as a post commit hook script. I am hoping to get a chance to look up some more information about Google Wave to see if this is possible .The idea would be if the option has ‘standard layout’, meaning if it has trunk, branches, and tags. It will “do a ‘reply/ threadiness ” of where the thing was branched and continue on there. Granted at this stage I am really not that familiar with Wave or the terminology so bare with me. Since I was in bed trying to sleep I didn’t have access to a computer or the internet to check if someone had already implemented a system like this.

As I was typing this up I decided to have a quick search and sure enough http://code.google.com/p/svn-py-robot/

“A google wave robot to add the log message of a svn commit to a wave blip.”

Ideally I then thought, since I have been using Git lately would be nice to do it for that as well. So will look into that as well. Likewise really why stop there, when you could use wave to post issues and feature requests, and also use it as a forum. So ideally it would be the backend for a website for your project.

I never completed this post so will followed up at a later date.

Older Entries