lib/syscall_shim/arch/x86_64: Save proper rsp in execenv prologue
The execenv prologue for native builds mimics the context save/restore
that follows a syscall instruction, except that it is entered through a
direct call instruction. Because call pushes the 8-byte return address,
the rsp on entry is 8 bytes lower than the rsp we are supposed to show
to users of this execenv. To cope with this, add 8 to the saved rsp
after pushing it, so that children (e.g. vfork) that may inherit this
context see the proper rsp. Because of this, upon exiting the execenv
assembly wrapper we must undo the aforementioned addition on the
context being restored, since the wrapper returns like a normal
function through ret and needs rsp to point at the return address
again.
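
As a rough illustration of the arithmetic only (not the actual assembly),
the adjustment can be modelled in C. All names below (model_execenv,
model_call, model_prologue, model_epilogue, model_ret) are hypothetical
and exist only for this sketch; it tracks stack pointer values and does
not touch real registers or memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: holds only the rsp recorded in the saved context. */
struct model_execenv {
	uint64_t saved_rsp;
};

/* A call instruction pushes the 8-byte return address. */
static uint64_t model_call(uint64_t caller_rsp)
{
	return caller_rsp - 8;
}

/* Prologue: record the rsp the caller had before the call, which is
 * what a syscall instruction would have left in place.
 */
static void model_prologue(struct model_execenv *env, uint64_t rsp_on_entry)
{
	env->saved_rsp = rsp_on_entry + 8;
}

/* Epilogue: undo the addition so that rsp points at the return
 * address again before ret.
 */
static uint64_t model_epilogue(const struct model_execenv *env)
{
	return env->saved_rsp - 8;
}

/* A ret instruction pops the 8-byte return address. */
static uint64_t model_ret(uint64_t rsp)
{
	return rsp + 8;
}

/* Round trip: the saved context shows the caller's rsp, and the
 * wrapper still returns to the caller with rsp unchanged.
 */
static void model_selfcheck(uint64_t caller_rsp)
{
	struct model_execenv env;
	uint64_t entry = model_call(caller_rsp);

	model_prologue(&env, entry);
	assert(env.saved_rsp == caller_rsp);
	assert(model_ret(model_epilogue(&env)) == caller_rsp);
}
```

Without the +8 in the model prologue, saved_rsp would equal the entry
value, so a child inheriting the context would see an rsp 8 bytes too
low.
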
This bug has not been caught before because, for native builds, this
prologue has only been used for the clone syscall. Unlike with vfork,
children created by clone typically begin execution on a brand new
stack instead of reusing and mimicking that of the parent.