Forcing a register operand hides (from the compiler) the fact that clflush
behaves as a read from the memory operand (wrt memory order, faults, etc.).
It also reduces the compiler's flexibility with register scheduling.
Re-implement clflush() (and wbinvd() for consistency) as a static inline rather
than a macro, and have it take a const void pointer.
In practice, the only generated code which gets modified by this is in
mwait_idle_with_hints(), where a disp8 encoding now gets used.
While here, I noticed that &mwait_wakeup(cpu) was being calculated twice.
This is caused by the memory clobber in mb(), so take the opportunity to help
the optimiser by calculating it once, ahead of time. bloat-o-meter reports a
delta of -26 as a result of this change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
{
unsigned int cpu = smp_processor_id();
s_time_t expires = per_cpu(timer_deadline, cpu);
+ const void *monitor_addr = &mwait_wakeup(cpu);
if ( boot_cpu_has(X86_FEATURE_CLFLUSH_MONITOR) )
{
mb();
- clflush((void *)&mwait_wakeup(cpu));
+ clflush(monitor_addr);
mb();
}
- __monitor((void *)&mwait_wakeup(cpu), 0, 0);
+ __monitor(monitor_addr, 0, 0);
smp_mb();
/*
__sel; \
})
-#define wbinvd() \
- asm volatile ( "wbinvd" : : : "memory" )
+static inline void wbinvd(void)
+{
+ asm volatile ( "wbinvd" ::: "memory" );
+}
-#define clflush(a) \
- asm volatile ( "clflush (%0)" : : "r"(a) )
+static inline void clflush(const void *p)
+{
+ asm volatile ( "clflush %0" :: "m" (*(const char *)p) );
+}
#define xchg(ptr,v) \
((__typeof__(*(ptr)))__xchg((unsigned long)(v),(ptr),sizeof(*(ptr))))