Question on tracking hxestorage progress #114

Open
dougmill-ibm opened this issue Aug 2, 2017 · 2 comments

Comments

@dougmill-ibm

I am trying to diagnose a problem that occurs at a specific point in HTX mdt.io. It always appears at the same point relative to starting HTX mdt.io, approximately 11 hours into the test. It seems to be about when "cycle count" goes from "0" to "1". The problem persists for about 2 hours, then subsides until the start of the next cycle (cycle count goes to "2" - approximately 9 hours after going to "1"). This pattern then repeats indefinitely throughout the testing (i.e. a couple hours of problems every ~9 hours).
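For reference, the timing pattern described above can be written down as a small calculation. This is only a sketch using the approximate figures from this comment (first occurrence about 11 hours in, repeating about every 9 hours, lasting about 2 hours); the function name and its parameters are made up for illustration.

```python
def expected_problem_windows(first_start_h=11.0, period_h=9.0,
                             duration_h=2.0, count=3):
    """Return (start, end) pairs, in hours since HTX start, for the
    first `count` problem windows. All defaults are the approximate
    values quoted in the issue, not measured constants."""
    windows = []
    for i in range(count):
        start = first_start_h + i * period_h
        windows.append((start, start + duration_h))
    return windows


if __name__ == "__main__":
    for start, end in expected_problem_windows():
        print(f"{start:.0f}h - {end:.0f}h after HTX start")
```

Adding these offsets to the wall-clock time at which HTX was started gives rough time bounds to compare against system logs.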

What I am trying to determine is just what sorts of I/O are being performed during that time. How do I get HTX to tell me what tests are being run, or how do I get it to log more information on what tests are being run?

@preeti-dhir
Contributor

The hxestorage exerciser dumps details of approximately the last 1500 I/Os performed on any disk device to the following location on the test system:
/tmp/htx/hxestorage/<device_name>/IO_details_dump.log
Each entry looks like this:
thread_id: rule_5_1 - time: 0x3c63254e211c7, oper: wrc, cur_oper: read, blkno: 0x1afbddb8, num_blks: 227, rbuf addr: 0x110382df0
time: when the I/O started.
oper: the operation to be performed, here WRC (i.e. write/read/compare).
cur_oper: the operation currently in progress.
blkno: the block number where the I/O started.
num_blks: the transfer size in blocks.
rbuf addr: the address of the buffer the read goes into.

Hope this is helpful to you.
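As an illustration of the entry layout described above, here is a minimal Python sketch that splits one such line into typed fields. The regular expression is inferred from the single sample line in this comment, so it is an assumption about the format rather than a specification; `parse_io_line` is a hypothetical helper, not part of HTX.

```python
import re

# Pattern inferred from the one sample line shown in this thread;
# the real dump may contain variations this does not cover.
LINE_RE = re.compile(
    r"thread_id:\s*(?P<thread_id>\S+)\s*-\s*"
    r"time:\s*(?P<time>0x[0-9a-fA-F]+),\s*"
    r"oper:\s*(?P<oper>\w+),\s*"
    r"cur_oper:\s*(?P<cur_oper>\w+),\s*"
    r"blkno:\s*(?P<blkno>0x[0-9a-fA-F]+),\s*"
    r"num_blks:\s*(?P<num_blks>\d+),\s*"
    r"rbuf addr:\s*(?P<rbuf_addr>0x[0-9a-fA-F]+)"
)


def parse_io_line(line):
    """Return a dict of fields for one dump entry, or None if the
    line does not match the assumed layout."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "thread_id": d["thread_id"],
        "time": int(d["time"], 16),        # raw hex time value
        "oper": d["oper"],
        "cur_oper": d["cur_oper"],
        "blkno": int(d["blkno"], 16),
        "num_blks": int(d["num_blks"]),
        "rbuf_addr": int(d["rbuf_addr"], 16),
    }


SAMPLE = ("thread_id: rule_5_1 - time: 0x3c63254e211c7, oper: wrc, "
          "cur_oper: read, blkno: 0x1afbddb8, num_blks: 227, "
          "rbuf addr: 0x110382df0")

if __name__ == "__main__":
    print(parse_io_line(SAMPLE))
```

Lines that do not match return None, so the helper can be run over a whole file and the non-matching lines skipped.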

To explain a bit about the exerciser: in mdt.io, hxestorage uses the default.hdd or default.ssd rulefile. These rulefiles contain multiple stanzas, which act as input to the exerciser. Each stanza defines one testcase with its own inputs. During each stanza run, multiple threads do I/O simultaneously on the disk. In one cycle, the hxestorage exerciser runs all of these stanzas, i.e. it completes one pass of the rulefile, increments the cycle count, and starts from the beginning again. Since the problem is seen every time around when the cycle count changes, it is most likely in stanza 3 or 4. You can check "Curr Stanza" on the screen around the time you see the problem.

If you can let me know what kind of issue you are seeing and provide me with system login details, I may be able to provide more details related to it.

@dougmill-ibm
Author

What I need is to be able to relate behavior of the system, bounded by some timestamp values, to what HTX was doing during that time. I guess I could dig through the rules and get an idea of what is going on, but I do not have any log that shows what rules were being run at a given time. Is there any sort of file that contains this information, for example a timestamp (human readable) and indication of the beginning/ending of a step?

Am I correct in assuming that these rule files will help explain what each step does?

Also, the "time:" field in the IO_details_dump.log file seems to be the value of the PPC timebase register, and is difficult (impossible) to relate back to wall-clock time. Also, since that file only shows the last 1500 I/Os it doesn't help for a long running test. I typically see around 1 million I/Os per hour, per disk.
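To make the two points above concrete, here is a rough Python sketch. It assumes the "time:" value really is a timebase tick count, that one reference pair (a timebase value and the wall-clock time at the same instant) can be captured somehow, and that the tick rate is 512 MHz; that rate is only an assumption and should be checked on the actual machine (on Linux on Power, /proc/cpuinfo normally reports a "timebase" line). Both function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed tick rate; verify it on the system under test.
ASSUMED_TIMEBASE_HZ = 512_000_000


def timebase_to_wallclock(tb, ref_tb, ref_time,
                          timebase_hz=ASSUMED_TIMEBASE_HZ):
    """Convert a timebase value to wall-clock time, given a reference
    pair (ref_tb, ref_time) sampled at the same moment."""
    return ref_time + timedelta(seconds=(tb - ref_tb) / timebase_hz)


def dump_window_seconds(ios_in_dump=1500, ios_per_hour=1_000_000):
    """Seconds of activity covered by the dump at the quoted I/O rate."""
    return ios_in_dump / (ios_per_hour / 3600.0)


if __name__ == "__main__":
    # Hypothetical reference pair, for demonstration only.
    ref_time = datetime(2017, 8, 2, 12, 0, 0, tzinfo=timezone.utc)
    ref_tb = 0x3c63254e211c7
    later_tb = ref_tb + 60 * ASSUMED_TIMEBASE_HZ  # 60 s worth of ticks
    print(timebase_to_wallclock(later_tb, ref_tb, ref_time))
    print(f"dump covers about {dump_window_seconds():.1f} s")
```

At roughly 1 million I/Os per hour, the last 1500 I/Os span only about 5.4 seconds, which is why the dump says little about a multi-hour test.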
