libxl: rework domain userdata file lock
The lock introduced in d2cd9d4f ("libxl: functions to lock / unlock
libxl userdata store") has a bug that can leak the lock file when domain
destruction races with other functions that try to get hold of the lock.
There are two issues:
1. The lock is released too early, because libxl__userdata_destroyall
deletes everything in the userdata store, including the lock file.
2. The check for domain existence is only done at the beginning of the
lock function. By the time the lock is acquired, the domain might
already be gone.
The effect of these two issues is that we can run into the following
situation:
Process 1 Process 2 domain destruction
# LOCK FUNCTION # LOCK FUNCTION
check domain existence check domain existence
acquire lock (file created)
# LOCK FUNCTION
destroy all files (lock file deleted,
lock released)
acquire lock (file created)
# LOCK FUNCTION destroy domain
# UNLOCK (close fd only)
[ lock file leaked ]
Fix this problem by making the following changes:
1. Unlink the lock file in the unlock function.
2. Modify libxl__userdata_destroyall to not delete domain-userdata-lock,
so that the lock remains held until the unlock function is called.
3. Check that the domain still exists once the lock is acquired, and
unlock if the domain is already gone.
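For illustration only, changes 1 and 3 can be sketched roughly as below.
This is a minimal standalone sketch, not the libxl code: struct
userdata_lock, lock_acquire, lock_release, domain_exists_fn and
flag_exists are hypothetical names, and the domain check is reduced to a
caller-supplied callback. The retry loop that compares fstat() and stat()
results is an assumption of this sketch; some such check is needed once
the unlock path unlinks the file, because a waiter may end up holding a
lock on an inode that no longer has a name.

```c
#define _DEFAULT_SOURCE         /* for flock() and strdup() on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical lock handle: the open fd plus the path to unlink later. */
struct userdata_lock {
    int fd;
    char *path;
};

/* Hypothetical stand-in for "does the domain still exist?". */
typedef bool (*domain_exists_fn)(void *opaque);

/* Example callback: the domain "exists" while the pointed-to flag is true. */
bool flag_exists(void *opaque)
{
    return *(bool *)opaque;
}

/* Change 1: unlink the lock file before closing the fd (close drops flock). */
void lock_release(struct userdata_lock *lock)
{
    assert(lock != NULL);
    unlink(lock->path);
    close(lock->fd);
    free(lock->path);
    free(lock);
}

struct userdata_lock *lock_acquire(const char *path,
                                   domain_exists_fn exists, void *opaque)
{
    struct userdata_lock *lock = malloc(sizeof(*lock));
    if (!lock)
        return NULL;
    lock->path = strdup(path);
    if (!lock->path) {
        free(lock);
        return NULL;
    }

    for (;;) {
        struct stat fd_st, path_st;

        lock->fd = open(path, O_RDWR | O_CREAT, 0666);
        if (lock->fd < 0)
            goto fail;
        if (flock(lock->fd, LOCK_EX) < 0 || fstat(lock->fd, &fd_st) < 0)
            goto fail_close;
        if (stat(path, &path_st) == 0) {
            if (fd_st.st_dev == path_st.st_dev &&
                fd_st.st_ino == path_st.st_ino)
                break;          /* we hold the lock on the live file */
        } else if (errno != ENOENT) {
            goto fail_close;
        }
        close(lock->fd);        /* file was unlinked under us: retry */
    }

    /* Change 3: re-check the domain now that the lock is actually held. */
    if (!exists(opaque)) {
        lock_release(lock);     /* also removes the file we just created */
        return NULL;
    }
    return lock;

fail_close:
    close(lock->fd);
fail:
    free(lock->path);
    free(lock);
    return NULL;
}
```

With this shape, a caller that loses the race against domain destruction
gets NULL back and leaves no lock file behind, since the release path
always unlinks the file it holds.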
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>