At the moment, flush_xen_tlb_range_va{,_local}() are using system-wide
memory barriers (dsb(sy)). This is quite expensive and unnecessary.
For the local version, a non-shareable barrier is sufficient.
For the SMP version, an inner-shareable barrier is sufficient.
Furthermore, the initial barrier only needs to be a store barrier, as it
only has to make the prior page-table updates visible to the page-table
walker.
For the full explanation of the sequence, see asm/arm{32,64}/flushtlb.h.
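As a rough sketch, the range flush now follows the pattern below (shown
with the barrier mnemonics used in the SMP version; the local variant
replaces ishst/ish with the non-shareable nshst/nsh domain):

    dsb(ishst);   /* Prior page-table updates visible to the walker */
    /* one TLB invalidation per page in the range */
    dsb(ish);     /* The TLB invalidation has completed */
    isb();        /* Synchronize the instruction stream */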
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Henry Wang <Henry.Wang@arm.com>
static inline void flush_xen_tlb_range_va_local(vaddr_t va, unsigned long size)
{
vaddr_t end = va + size;
- dsb(sy); /* Ensure preceding are visible */
+ /* See asm/arm{32,64}/flushtlb.h for the explanation of the sequence. */
+ dsb(nshst); /* Ensure prior page-table updates have completed */
while ( va < end )
{
__flush_xen_tlb_one_local(va);
va += PAGE_SIZE;
}
- dsb(sy); /* Ensure completion of the TLB flush */
+ dsb(nsh); /* Ensure the TLB invalidation has completed */
isb();
}
static inline void flush_xen_tlb_range_va(vaddr_t va, unsigned long size)
{
vaddr_t end = va + size;
- dsb(sy); /* Ensure preceding are visible */
+ /* See asm/arm{32,64}/flushtlb.h for the explanation of the sequence. */
+ dsb(ishst); /* Ensure prior page-table updates have completed */
while ( va < end )
{
__flush_xen_tlb_one(va);
va += PAGE_SIZE;
}
- dsb(sy); /* Ensure completion of the TLB flush */
+ dsb(ish); /* Ensure the TLB invalidation has completed */
isb();
}