xenbits.xensource.com Git - qemu-xen.git/commit
target/i386: move C0-FF opcodes to new decoder (except for x87)
author Paolo Bonzini <pbonzini@redhat.com>
Sat, 21 Oct 2023 15:36:34 +0000 (17:36 +0200)
committer Paolo Bonzini <pbonzini@redhat.com>
Tue, 7 May 2024 06:53:26 +0000 (08:53 +0200)
commit d7c41a60d0c5228d5adfc73c83facb1307a1d45e
tree c283bdf985df2a7130ff2676ead03c3c9f294e77
parent b603136402d2ae217b5051cd041a8591f09b04ba

The shift instructions are rewritten instead of reusing code from the old
decoder.  Rotates use CC_OP_ADCOX more extensively and generally rely
more on the optimizer, so that the code generators are shared between
the immediate-count and variable-count cases.

In particular, this makes gen_RCL and gen_RCR pretty efficient for the
count == 1 case, which becomes (apart from a few extra movs) something like:

  (compute_cc_all if needed)
  // save old value for OF calculation
  mov     cc_src2, T0
  // the bulk of RCL is just this!
  deposit T0, cc_src, T0, 1, TARGET_LONG_BITS - 1
  // compute carry
  shr     cc_dst, cc_src2, length - 1
  and     cc_dst, cc_dst, 1
  // compute overflow
  xor     cc_src2, cc_src2, T0
  extract cc_src2, cc_src2, length - 1, 1
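
For readers who want to check the flag arithmetic, the op sequence above can be modeled in plain C. This is only an illustrative sketch, not code from the patch: the names deposit_model, extract_model, rcl1 and Rcl1Result are hypothetical, the model assumes a 64-bit target long (TARGET_LONG_BITS == 64) and that the incoming carry sits in bit 0 of cc_src, and the final masking to the operand size stands in for whatever the real writeback does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a deposit op: copy the low `len` bits of `field`
 * into `base` at bit position `pos`, leaving the other bits of `base`. */
static uint64_t deposit_model(uint64_t base, uint64_t field,
                              unsigned pos, unsigned len)
{
    assert(len >= 1 && pos + len <= 64);
    uint64_t mask = (~0ULL >> (64 - len)) << pos;
    return (base & ~mask) | ((field << pos) & mask);
}

/* Hypothetical model of an extract op: return `len` bits starting at `pos`. */
static uint64_t extract_model(uint64_t value, unsigned pos, unsigned len)
{
    assert(len >= 1 && pos + len <= 64);
    return (value >> pos) & (~0ULL >> (64 - len));
}

typedef struct {
    uint64_t result; /* rotated value, masked to the operand size */
    unsigned cf;     /* new carry flag */
    unsigned of;     /* new overflow flag */
} Rcl1Result;

/* Model of RCL with count == 1 on an operand of `length` bits.
 * `cc_src` carries the old CF in bit 0 (assumption of this sketch). */
static Rcl1Result rcl1(uint64_t t0, uint64_t cc_src, unsigned length)
{
    assert(length >= 1 && length <= 64);
    Rcl1Result r;

    /* save old value for OF calculation */
    uint64_t old = t0;
    /* the bulk of RCL: shift left by one, old CF lands in bit 0 */
    uint64_t rotated = deposit_model(cc_src, t0, 1, 63);
    /* compute carry: the bit rotated out of the top of the operand */
    r.cf = (unsigned)((old >> (length - 1)) & 1);
    /* compute overflow: old MSB xor new MSB */
    r.of = (unsigned)extract_model(old ^ rotated, length - 1, 1);
    /* illustrative writeback: keep only the operand-sized part */
    r.result = length == 64 ? rotated : rotated & ((1ULL << length) - 1);
    return r;
}
```

For example, rotating the 8-bit value 0x80 with CF clear gives result 0x00 with CF and OF both set, which matches the architectural rule that OF after a one-bit RCL is the new MSB xor the new CF.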

32-bit MUL and IMUL are also slightly more efficient on 64-bit hosts.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
target/i386/tcg/decode-new.c.inc
target/i386/tcg/decode-new.h
target/i386/tcg/emit.c.inc
target/i386/tcg/translate.c