Answer key for trace; DiskIOHigh.etl #2

@itoleck

Scenario:

Windows Server Backup was running, backing up a spanned volume on SSD disks 2, 3, 4, and 5 to HDD disk 6.

Analysis:

  1. Find the current disk usage. Open the trace in WPA and add the Disk Usage graph.

  2. Order the columns similar to the following: Disk | Priority | IO Type | Process | IO Init Stack |GOLDBAR| Disk Service Time(us) Avg | Size | Count |BLUEBAR| Disk Service Time(us) Sum

  3. Sort by the Disk Service Time(us) Avg column. This shows the average latency for each disk. Disks 1-5 are within normal latency levels (1.294 ms or less), but disk 6 averages 108.9 ms, well above the 15-25 ms that is normal for a 7200 RPM hard drive.

DiskIOHigh1
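
The per-disk comparison in step 3 can also be reproduced outside WPA, for example on rows exported from the Disk Usage table. The following is a minimal sketch, assuming each IO has been reduced to a (disk, service time in microseconds) pair; the sample values, the function names, and the 25 ms threshold (the upper end of the range quoted above) are illustrative assumptions and are not read from DiskIOHigh.etl.

```python
from collections import defaultdict

# Hypothetical rows as (disk number, disk service time in microseconds).
# Illustrative values only, not taken from the trace.
SAMPLE_IOS = [
    (2, 900), (2, 1_100),
    (3, 1_200),
    (6, 95_000), (6, 120_000),
]


def avg_service_time_ms(ios):
    """Return {disk: average disk service time in milliseconds}."""
    totals = defaultdict(lambda: [0, 0])  # disk -> [sum of microseconds, IO count]
    for disk, service_us in ios:
        totals[disk][0] += service_us
        totals[disk][1] += 1
    return {disk: total / count / 1000 for disk, (total, count) in totals.items()}


def slow_disks(ios, threshold_ms=25.0):
    """Return disks whose average exceeds threshold_ms, worst first."""
    averages = avg_service_time_ms(ios)
    over = [disk for disk, ms in averages.items() if ms > threshold_ms]
    return sorted(over, key=lambda disk: averages[disk], reverse=True)


if __name__ == "__main__":
    for disk, ms in sorted(avg_service_time_ms(SAMPLE_IOS).items()):
        print(f"disk {disk}: {ms:.3f} ms")
    print("above threshold:", slow_disks(SAMPLE_IOS))
```

Sorting worst first mirrors sorting the Avg column in descending order, so the disk that needs attention appears at the top.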

  1. Long disk service time (high latency) is sometimes caused by the IO being issued at a low priority. In this trace, only disk 0 has IO at a priority other than Normal. To exclude it, move the Priority column to the first slot, select the Low and Very Low IO priorities, then right-click and select Filter Out Selection to remove them from the view. Move the Disk column back to the first slot after filtering.

DiskIOHigh2
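
To illustrate why low-priority IO is filtered out before judging latency, here is a small sketch of the same idea applied to exported rows. It assumes rows of (disk, priority, service time in microseconds); the sample values and helper names are hypothetical and only show how a single deprioritized IO can inflate a disk's average.

```python
# Hypothetical rows as (disk, priority, disk service time in microseconds).
# Illustrative values only, not taken from the trace.
SAMPLE_ROWS = [
    (0, "Normal", 1_000),
    (0, "Normal", 2_000),
    (0, "Very Low", 300_000),
    (6, "Normal", 110_000),
]


def filter_out_priorities(rows, excluded=("Low", "Very Low")):
    """Drop rows whose priority is in `excluded`, like Filter Out Selection."""
    return [row for row in rows if row[1] not in excluded]


def avg_ms(rows, disk):
    """Average service time in ms for one disk, or None if it has no rows."""
    times = [service_us for d, _priority, service_us in rows if d == disk]
    return sum(times) / len(times) / 1000 if times else None


if __name__ == "__main__":
    kept = filter_out_priorities(SAMPLE_ROWS)
    print("disk 0 before filtering:", avg_ms(SAMPLE_ROWS, 0), "ms")
    print("disk 0 after filtering: ", avg_ms(kept, 0), "ms")
    print("disk 6 after filtering: ", avg_ms(kept, 6), "ms")
```

In this made-up data the disk 0 average collapses once the Very Low row is removed, while disk 6 stays high because all of its IO is Normal priority, which is the pattern the analysis relies on.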

  1. Expand Disk 6, priority Normal, and view the IO type. In this case most of the IO is writes. To find out which process is writing, expand the Process column. The wbengine.exe (10856) process is responsible for most of the writes.

  2. Expand the IO Init Stack column to the end for the wbengine.exe process to find 69 writes of exactly 32 MB each. Most of these writes have high latency (> 100 ms).

DiskIOHigh3
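
As a quick check on the scale of the backup writes, the count and size above can be multiplied out, and the same summary can be computed for any exported list of writes. This is a rough sketch: the 69 and 32 come from the analysis, treating 32 MB as 32 MiB is an assumption, and the per-write sample rows and function names are hypothetical.

```python
MIB = 1024 * 1024  # assumption: WPA's "MB" is treated here as MiB


def total_written_mib(write_count, write_size_mib):
    """Total data written, in MiB, for equally sized writes."""
    return write_count * write_size_mib


def summarize_writes(writes, slow_ms=100.0):
    """Summarize (size in bytes, service time in microseconds) pairs."""
    writes = list(writes)
    slow = [w for w in writes if w[1] / 1000 > slow_ms]
    return {
        "count": len(writes),
        "total_mib": sum(size for size, _us in writes) / MIB,
        "slow_count": len(slow),
    }


if __name__ == "__main__":
    # Figures from the analysis: 69 writes of 32 MB each.
    print("total written:", total_written_mib(69, 32), "MiB")

    # Hypothetical per-write rows, not taken from the trace.
    sample = [(32 * MIB, 150_000), (32 * MIB, 90_000), (32 * MIB, 120_000)]
    print(summarize_writes(sample))
```

For the figures in the trace this gives 2208 MiB, a little over 2 GiB, written to disk 6 in large sequential chunks.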

Remediation:

Back up to faster storage, or split the backups across different hard drives to spread the load.
