OMPI people want an easy way to recognize MICs and nodes using hwloc.
We already have MICSerialNumber inside OS devices in the host topology,
so add the same attribute to the root object of MIC topologies.
Unfortunately, we couldn't find anything better than parsing /proc/elog
to find that number.
crashes when hwloc is dynamically loaded by another plugin mechanism.
+ Add --with-hwloc-plugins-path to specify the install/load directories
of plugins.
+ + Add the MICSerialNumber info attribute to the root object when running
+ hwloc inside a Xeon Phi to match the same attribute in the MIC OS device
+ when running in the host.
* API
+ hwloc.h and hwloc/helper.h have been reorganized to clarify the
documentation sections. The actual inline code has moved out of hwloc.h
<dt>NVIDIAUUID, NVIDIASerial (NVML GPU OS devices)</dt>
<dd>The UUID and Serial of NVIDIA GPUs.
</dd>
-<dt>MICFamily, MICSKU, MICSerialNumber, MICActiveCores, MICMemorySize</dt>
-<dd>The family, SKU (model), serial number,
+<dt>MICSerialNumber</dt>
+<dd>
+ The serial number of an Intel Xeon Phi (MIC) coprocessor.
+ When running hwloc on the host, each hwloc OS device object that
+ corresponds to a Xeon Phi gets such an attribute.
+ When running hwloc inside a Xeon Phi, the root object of the topology
+ gets this attribute.
+ It enables easy identification of devices and topologies when multiple
+ nodes and MICs are involved.
+</dd>
+<dt>MICFamily, MICSKU, MICActiveCores, MICMemorySize</dt>
+<dd>The family, SKU (model),
number of active cores, and memory size (in kB)
of an Intel Xeon Phi (MIC) coprocessor.
</dd>
****** Main Topology Discovery ******
*************************************/
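+/* Set MICSerialNumber on the root object from the first line of /proc/elog,
+ * which is expected to start with "Card <serial>:". Do nothing otherwise. */
+static void
+hwloc__linux_get_mic_sn(struct hwloc_topology *topology, struct hwloc_linux_backend_data_s *data)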
+{
+ FILE *file;
+ char line[64], *tmp, *end;
+ file = hwloc_fopen("/proc/elog", "r", data->root_fd);
+ if (!file)
+ return;
+ if (!fgets(line, sizeof(line), file))
+ goto out_with_file;
+ if (strncmp(line, "Card ", 5))
+ goto out_with_file;
+ tmp = line + 5;
+ end = strchr(tmp, ':');
+ if (!end)
+ goto out_with_file;
+ *end = '\0';
+ hwloc_obj_add_info(hwloc_get_root_obj(topology), "MICSerialNumber", tmp);
+
+ out_with_file:
+ fclose(file);
+}
+
static void
hwloc_linux_fallback_pu_level(struct hwloc_topology *topology)
{
free(cpuset_name);
}
+ hwloc__linux_get_mic_sn(topology, data);
+
/* gather uname info if fsroot wasn't changed */
if (topology->is_thissystem)
hwloc_add_uname_info(topology);
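
The parsing step in hwloc__linux_get_mic_sn() can be tried outside hwloc with a minimal standalone sketch. The helper name parse_mic_serial() and its output-buffer interface are hypothetical and not part of hwloc. The only assumption about the file format is what the patch itself checks: a first line that starts with "Card " and carries the serial number up to the first ':'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an hwloc function): copy the serial number out of
 * a line shaped like "Card <serial>:...", the same checks as the patch.
 * Returns 0 on success, -1 if the line does not match or out is too small. */
static int
parse_mic_serial(const char *line, char *out, size_t outlen)
{
  const char *start, *end;
  size_t len;

  /* The line must begin with the literal prefix "Card ". */
  if (strncmp(line, "Card ", 5) != 0)
    return -1;
  start = line + 5;

  /* The serial number ends at the first colon. */
  end = strchr(start, ':');
  if (!end)
    return -1;

  /* Refuse to truncate: the serial plus its terminator must fit. */
  len = (size_t)(end - start);
  if (len >= outlen)
    return -1;

  memcpy(out, start, len);
  out[len] = '\0';
  return 0;
}
```

Unlike the patch, which terminates the string in place inside its own line buffer, this sketch copies into a caller-supplied buffer so that the input can stay const. A caller would pass the first line read from the file and, on success, hand the result to hwloc_obj_add_info() as the patch does.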