Some more shellcode

MacOS Shellcode Primer #2

Abstract

In this series of articles, I am analysing the pieces of shellcode written by Odzhan on the page Shellcode: Mac OSX amd64.

In the last article, Come taste some shellcode..., we introduced some basic binary analysis and we learned how to call a syscall with no arguments. In this article, we will analyse how to work with more complex arguments.

I will also introduce the Hopper disassembler (https://www.hopperapp.com/), one of the tools I use the most.

Execute a command

We start with the code:

; 43 bytes execute command
;
bits    64

global _main
_main:
    push    59
    pop     rax         ; eax = sys_execve
    cdq                 ; edx = 0
    bts     eax, 25     ; eax = 0x0200003B
    mov     rbx, '/bin//sh'
    push    rdx         ; 0
    push    rbx         ; "/bin//sh"
    push    rsp
    pop     rdi         ; rdi="/bin//sh", 0
    ; ---------
    push    rdx         ; 0
    push    word '-c'
    push    rsp
    pop     rbx         ; rbx="-c", 0
    push    rdx         ; argv[3]=NULL
    jmp     l_cmd64

r_cmd64:                ; argv[2]=cmd
    push    rbx         ; argv[1]="-c"
    push    rdi         ; argv[0]="/bin//sh"
    push    rsp
    pop     rsi         ; rsi=argv
    syscall

l_cmd64:
    call    r_cmd64
    ; put your command here followed by null terminator
    db      'cat /etc/passwd',0

We assemble and link it:

gbiondo@tripleX Odzhan % nasm -f macho64 cmdRun.asm
gbiondo@tripleX Odzhan % ld -L /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem -o cmdRun cmdRun.o

and we can obviously run it. I will not show the output (why should I disclose the users on my machine, anyway?), but it runs perfectly on my machine under macOS Monterey.

A bit of static analysis

If you read this blog, the following commands should not be new to you. If they are, you may want to read the previous articles :) or consult the man pages.

Getting information about the executable

gbiondo@tripleX Odzhan % file cmdRun
cmdRun: Mach-O 64-bit executable x86_64

No surprises here. Just compare the above with how we assembled the file (nasm -f macho64).

Looking for C strings

There are none. In fact:

gbiondo@tripleX Odzhan % strings cmdRun
gbiondo@tripleX Odzhan %

Getting information about the sections

This is a very simple program, after all:

gbiondo@tripleX Odzhan % objdump -m --section-headers cmdRun

Sections:
Idx Name          Size     VMA              Type
  0 __text        0000003b 0000000100003f7d TEXT
  1 __unwind_info 00000048 0000000100003fb8 DATA

Getting the symbol table

No big hint is given by the symbol table:

gbiondo@tripleX Odzhan % objdump -m --syms cmdRun           
cmdRun:

SYMBOL TABLE:
0000000100003f9d l     F __TEXT,__text r_cmd64
0000000100000000 g       *ABS* __mh_execute_header
0000000100003f7d g     F __TEXT,__text _main

Dynamic Analysis

Time for some debugging. This time we are using Hopper. This is not supposed to be a tutorial for Hopper (one can be found here: https://www.hopperapp.com/tutorial.html or under the Help menu of the application).

Let's remember what an execve-based shellcode must do:

  • first parameter must be stored in RDI. It is a pointer to a string, holding the path of the executable.
  • second parameter must be stored in RSI. It is a pointer to a null-terminated array of strings, containing the parameters to the command.
  • third parameter must be stored in RDX. It is a pointer to a null-terminated array of strings, containing the environment variables.
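To make the three parameters concrete, here is a minimal Python sketch that launches a command with the same argv the shellcode is going to build. The helper name run_like_shellcode is hypothetical, and env={} is only a stand-in for the NULL envp (RDX = 0) used by the shellcode; a harmless echo replaces cat /etc/passwd.

```python
import subprocess


def run_like_shellcode(cmd):
    """Run cmd through /bin//sh -c, mirroring the argv built by the shellcode."""
    # argv[0] = path of the shell, argv[1] = "-c", argv[2] = the command.
    # The terminating NULL pointer (argv[3]) is added by the OS wrapper.
    argv = ["/bin//sh", "-c", cmd]
    # An empty environment approximates envp = NULL (assumption, see lead-in).
    return subprocess.run(argv, env={}, capture_output=True, text=True)


if __name__ == "__main__":
    result = run_like_shellcode("echo hello")
    print(result.stdout, end="")
```

The doubled slash in /bin//sh is harmless to the kernel: it only pads the path to exactly eight bytes so that it fits in one 64-bit register.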

I am choosing Select Debugger from the Debug menu, and I opt for the local debugger. I am then presented with this window:

Screenshot 2022-04-07 at 15.05.17.png

Observe the "Controls" box. There are 10 buttons. From left to right, these are:

  • Continue execution
  • Pause execution
  • Step into
  • Step out
  • Step over
  • Continue until current position
  • Continue until basic block end
  • Trace procedure
  • Stop execution
  • Toggle breakpoint

Below, a Tab View controller can be seen. It contains the four main areas we are interested in:

  • General purpose registers (GPR)
  • Memory
  • Debugger Console
  • Application Output

I also set a breakpoint at the beginning of the main routine:

Screenshot 2022-04-07 at 15.05.57.png

It's worthwhile to set a few more breakpoints. At the beginning, we'll stop at every single instruction.

If we set a breakpoint on the very next instruction and we hit Continue execution twice, we'll notice how the contents of the RAX register change (59 has been pushed to the stack and then popped into RAX).

Before:

Screenshot 2022-04-07 at 15.21.10.png

After:

Screenshot 2022-04-07 at 15.21.25.png

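If we continue, we reach the CDQ instruction. CDQ sign-extends EAX into EDX:EAX: since EAX holds a positive value at this point, EDX is filled with zeros, and because writing a 32-bit register also clears the upper half of the 64-bit one, we end up with RDX=0 (see below).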

Screenshot 2022-04-07 at 15.28.13.png
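The effect of CDQ can be modelled in a couple of lines. This is only a sketch of the documented behaviour (EDX receives 32 copies of the sign bit of EAX); the function name cdq is mine.

```python
def cdq(eax):
    """Return the value CDQ leaves in EDX for a given 32-bit EAX."""
    eax &= 0xFFFFFFFF
    # EDX is filled with copies of bit 31 (the sign bit) of EAX.
    return 0xFFFFFFFF if eax & 0x80000000 else 0x00000000


if __name__ == "__main__":
    print(hex(cdq(59)))          # EAX = 59 is positive, so EDX becomes 0
    print(hex(cdq(0x80000000)))  # a negative EAX would fill EDX with ones
```

This is why the trick only works as a one-byte "zero RDX" when EAX is known to be non-negative, which push 59 / pop rax guarantees here.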

Then bit 25 of EAX (counting from 0) is set to 1 by bts, turning 59 (0x3B) into 0x0200003B. This technique should already be familiar to the reader, so we won't illustrate it.
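For readers who want to double-check the arithmetic, a small sketch follows. The helper bts is a hypothetical Python stand-in for the instruction; the constant 0x2000000 is, as far as I know, the UNIX/BSD syscall class marker that macOS expects in the upper bits of the syscall number.

```python
SYS_EXECVE = 59  # value loaded with push 59 / pop rax


def bts(value, bit):
    """Model of the x86 bts instruction on a register: set one bit."""
    return value | (1 << bit)


syscall_number = bts(SYS_EXECVE, 25)

if __name__ == "__main__":
    print(hex(syscall_number))
```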

With the next instruction we can see something I love about Hopper. The instruction is illustrated below:

Screenshot 2022-04-07 at 15.32.01.png

or, in plain text:

movabs     rbx, 0x68732f2f6e69622f

If we right click on the 0x68732f2f6e69622f operand and select "Characters", we convert it into something more human readable (in this case, /bin//sh). Good job, Hopper!

Screenshot 2022-04-07 at 15.32.55.png

and RBX will be updated accordingly.
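What Hopper does with "Characters" can be reproduced by hand: x86 is little-endian, so the least significant byte of the constant is the first character in memory. A minimal sketch, with a function name of my own choosing:

```python
def qword_to_text(value):
    """Decode a 64-bit immediate the way it will be laid out in memory."""
    # Little-endian: the lowest byte (0x2f, '/') comes first.
    return value.to_bytes(8, "little").decode("ascii")


if __name__ == "__main__":
    print(qword_to_text(0x68732F2F6E69622F))
```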

We hit continue: first the contents of RDX (a zeroed register) and then the contents of RBX (0x68732F2F6E69622F, the string /bin//sh) are pushed onto the stack. After this, the stack holds a null-terminated string containing /bin//sh.

In fact, before running the instructions, RSP, the stack pointer, points to 00007FF7BFEFFA68. The contents of the memory can be seen in the third line of the memory dump of the image below:

Screenshot 2022-04-08 at 09.57.59.png

Then RDX is pushed and RSP updated accordingly, so RSP becomes 00007FF7BFEFFA60. The stack is updated, see second line of the memory pane:

Screenshot 2022-04-08 at 10.01.23.png

Finally RBX is pushed and we have:

  • RSP: contains 00007FF7BFEFFA58
  • RBX: contains 68732F2F6E69622F
  • and the stack is represented below

Screenshot 2022-04-08 at 10.04.53.png
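The two pushes can be replayed on paper. The sketch below starts from the RSP value observed above and models each 64-bit push as "subtract 8, then store"; push_qwords is a hypothetical helper, not something Hopper provides.

```python
import struct


def push_qwords(rsp, values):
    """Push 64-bit values in order; return the final RSP and the bytes from RSP upward."""
    image = b""
    for value in values:
        rsp -= 8                                  # the stack grows toward lower addresses
        image = struct.pack("<Q", value) + image  # the newest value sits at the lowest address
    return rsp, image


# push rdx (0), then push rbx ("/bin//sh")
rsp, image = push_qwords(0x00007FF7BFEFFA68, [0, 0x68732F2F6E69622F])

if __name__ == "__main__":
    print(hex(rsp))
    print(image)
```

Reading the image from RSP upward gives the eight characters of /bin//sh followed by eight zero bytes: the zero pushed first is what terminates the string.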

The value of RSP is then pushed and popped into RDI, so RDI now points to the /bin//sh string on the stack. Let's have a look at what we have after the instruction pop rdi is executed:

Screenshot 2022-04-11 at 08.58.15.png

Now the contents of RDX are pushed onto the stack, and then we also push the 16-bit constant 0x632d, whose bytes in memory (2d 63) spell the string -c. See below:

Screenshot 2022-04-11 at 09.05.29.png

The stack looks as follows:

Screenshot 2022-04-11 at 09.06.33.png
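The same reasoning applies to the -c argument, with one difference: push word moves RSP by 2 bytes instead of 8. A short sketch (the function name is hypothetical) shows the bytes as they sit in memory, with the previously pushed zero acting as terminator:

```python
def push_zero_then_word(word):
    """Bytes seen from RSP upward after push rdx (0) followed by a 16-bit push."""
    zero_qword = (0).to_bytes(8, "little")   # push rdx
    word_bytes = word.to_bytes(2, "little")  # push word: only 2 bytes, RSP -= 2
    return word_bytes + zero_qword


image = push_zero_then_word(0x632D)

if __name__ == "__main__":
    print(image.split(b"\x00")[0])
```

A side effect of the 2-byte push is that RSP is no longer 8-byte aligned from here on, which the shellcode simply ignores.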

The next instruction pushes the contents of rsp onto the stack, then the value is popped into rbx, which now points to the -c string:

Screenshot 2022-04-11 at 09.13.33.png

Finally, the value of rdx, which is null, is pushed onto the stack. Here there is a noticeable difference between the original code, with its two labels (r_cmd64 and l_cmd64), and the code that has been assembled, linked and then disassembled: only r_cmd64 is present, and the instruction jmp r_cmd64+6 is used instead. When we hit 'continue', the jump sends the control flow to 0x0000000100003FA3, which contains call r_cmd64. Why jump back and forth? Because call pushes its return address onto the stack, and the bytes that follow the call are the command string: this way the pointer to the command string lands on the stack as argv[2]. Below there's the 'before'...
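The addresses involved can be cross-checked against the symbol table and the section headers shown earlier. The sketch below assumes that the call is encoded as a 5-byte near call (opcode E8 plus a 32-bit displacement); the constant names are mine.

```python
TEXT_START = 0x0000000100003F7D   # __text VMA, from the section headers
TEXT_SIZE = 0x3B                  # __text size, from the section headers
R_CMD64 = 0x0000000100003F9D      # from the symbol table
CALL_ADDR = R_CMD64 + 6           # 0x100003FA3, the target of jmp r_cmd64+6
CALL_LEN = 5                      # assumption: E8 + rel32

# call pushes the address of the instruction that would follow it.
return_address = CALL_ADDR + CALL_LEN
command = b"cat /etc/passwd\x00"

if __name__ == "__main__":
    print(hex(return_address))
    # The command string should fill __text exactly up to its end.
    print(return_address + len(command) == TEXT_START + TEXT_SIZE)
```

The fact that the return address plus the 16 bytes of the command lands exactly on the end of __text is a nice confirmation that the string is the last thing in the section and that the pushed "return address" is really a data pointer.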

Screenshot 2022-04-11 at 09.33.41.png

...and the 'after':

Screenshot 2022-04-11 at 09.35.01.png

Now we can push rsp and pop the value into rsi, which then points to the argv array:

Screenshot 2022-04-11 at 09.38.48.png

and

Screenshot 2022-04-11 at 09.39.09.png
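To tie everything together, the whole push sequence can be replayed in a toy model of the stack. This is a sketch under stated assumptions: the initial RSP is the one observed earlier, the command address is the return address pushed by call (0x100003FA8, derived from the symbol table assuming a 5-byte call), and the Stack class and build_argv function are hypothetical helpers.

```python
class Stack:
    """Tiny model of a downward-growing stack backed by a byte dictionary."""

    def __init__(self, rsp):
        self.rsp = rsp
        self.mem = {}

    def push(self, value, size=8):
        self.rsp -= size
        for i, byte in enumerate(value.to_bytes(size, "little")):
            self.mem[self.rsp + i] = byte

    def qword(self, addr):
        return int.from_bytes(bytes(self.mem[addr + i] for i in range(8)), "little")

    def cstring(self, addr):
        out = bytearray()
        while self.mem[addr] != 0:
            out.append(self.mem[addr])
            addr += 1
        return out.decode("ascii")


def build_argv():
    s = Stack(0x00007FF7BFEFFA68)
    cmd_addr = 0x0000000100003FA8          # assumed address of the command string
    for i, byte in enumerate(b"cat /etc/passwd\x00"):
        s.mem[cmd_addr + i] = byte         # the string lives in __text, not on the stack
    s.push(0)                              # push rdx
    s.push(0x68732F2F6E69622F)             # push rbx ("/bin//sh")
    rdi = s.rsp                            # push rsp / pop rdi
    s.push(0)                              # push rdx
    s.push(0x632D, size=2)                 # push word '-c'
    rbx = s.rsp                            # push rsp / pop rbx
    s.push(0)                              # push rdx: argv[3] = NULL
    s.push(cmd_addr)                       # call r_cmd64: argv[2] = cmd
    s.push(rbx)                            # push rbx: argv[1] = "-c"
    s.push(rdi)                            # push rdi: argv[0] = "/bin//sh"
    rsi = s.rsp                            # push rsp / pop rsi
    argv = []
    pointer = s.qword(rsi)
    while pointer != 0:                    # walk the array until the NULL entry
        argv.append(s.cstring(pointer))
        rsi += 8
        pointer = s.qword(rsi)
    return s.cstring(rdi), argv


if __name__ == "__main__":
    path, argv = build_argv()
    print(path, argv)
```

Walking the array from RSI yields exactly the three strings followed by a NULL pointer, which is the layout execve expects in its second parameter.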

Finally the syscall is invoked, and the execution of the shellcode terminates.

Conclusions

As strange as it may seem, I find it easier to work with lldb - but I am an old guy :) Hopper has powerful features anyway - I am using it for many other purposes (like changing memory contents on the fly).

We have shown how to approach some static and dynamic binary analysis for this kind of program.

Finally: we start to understand assembly, but always having no talent and being proud of having none!
