Regent: Scalability Issue of Cholesky.rg in legion/language/examples #663

Open
qyz96 opened this issue Oct 19, 2019 · 52 comments
Labels: question, Regent (Issues pertaining to Regent)

@qyz96 commented Oct 19, 2019

Hi,

I am running cholesky.rg in legion/language/examples for some runtime comparisons. I ran the code on the Sherlock cluster with different numbers of cores per task, tasks per node, and nodes. The size of the matrix is set to 80*40 and the number of blocks to 40. The results are shown below:

8 cores per task, 1 task per node, 1 node: 214526.257 ms
16 cores per task, 1 task per node, 1 node: 255938.125 ms
24 cores per task, 1 task per node, 1 node: 210877.760 ms
24 cores per task, 1 task per node, 2 nodes: 213083.923 ms
8 cores per task, 3 tasks per node, 1 node: 208970.563 ms
4 cores per task, 6 tasks per node, 1 node: 211100.003 ms
4 cores per task, 6 tasks per node, 2 nodes: 229718.337 ms
8 cores per task, 3 tasks per node, 2 nodes: 229127.595 ms

It appears that cholesky.rg has worse performance on multiple nodes. I understand that there may be some problems with the structure of the code, as mentioned in #574. It was said that index launches need to be used and that the structure of the code needs to match the one shown in:

https://www.oerc.ox.ac.uk/sites/default/files/uploads/ProjectFiles/ASEArch/starpu.pdf#page=41

However, it seems to me that the for loop part, which is shown below,

var is = ispace(f2d, { x = n, y = n })
var cs = ispace(f2d, { x = np, y = np })
var rA = region(is, double)
var rB = region(is, double)
var pA = partition(equal, rA, cs)
var pB = partition(equal, rB, cs)
var bn = n / np

for x = 0, np do
  dpotrf(x, n, bn, pB[f2d { x = x, y = x }])
  for y = x + 1, np do
    dtrsm(x, y, n, bn, pB[f2d { x = x, y = y }], pB[f2d { x = x, y = x }])
  end
  for k = x + 1, np do
    dsyrk(x, k, n, bn, pB[f2d { x = k, y = k }], pB[f2d { x = x, y = k }])
    for y = k + 1, np do
      dgemm(x, y, k, n, bn,
            pB[f2d { x = k, y = y }],
            pB[f2d { x = x, y = y }],
            pB[f2d { x = x, y = k }])
    end
  end
end

already matches the right looking structure, or am I missing something? Also, maybe I am just not so familiar with regent, but isn't the for loop of the decomposition in the code already using index spaces and partitions? Is the scalability issue caused by the fact that the code is parallelized but not optimized in some way?

@elliottslaughter (Contributor)

Well, first of all, none of those loops are being index launched because they're failing in the optimizer. The easiest way to tell is to put __demand(__index_launch) on any of the inner loops and see that the compiler rejects them.

For the dtrsm and dsyrk loops, an easy fix is to add __demand(__index_launch) and then pass the flag -foverride-demand-index-launch 1 to the compiler. That's because those really are valid index launches; the compiler just doesn't know it, because there's some math with the indices that it's not able to handle right now.
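Concretely, that's just (a minimal sketch; this is the existing dtrsm loop from the example with the annotation added):

    __demand(__index_launch)
    for y = x + 1, np do
      dtrsm(x, y, n, bn, pB[f2d { x = x, y = y }], pB[f2d { x = x, y = x }])
    end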

But the other two loops cannot possibly be a valid index launch. The problem is that the loops aren't properly nested:

    for k = x + 1, np do
      dsyrk(x, k, n, bn, pB[f2d { x = k, y = k }], pB[f2d { x = x, y = k }])
      for y = k + 1, np do
        dgemm(x, y, k, n, bn,
              pB[f2d { x = k, y = y }],
              pB[f2d { x = x, y = y }],
              pB[f2d { x = x, y = k }])
      end
    end

The point of linking the slides is that it's possible to write this in a way where they are properly nested:

for (n = k+1 .. tiles-1)
  SYRK(A[n,k], A[n,n])
for (n = k+1 .. tiles-1)
  for (m = k+1 .. tiles-1)
    GEMM(A[m,k], A[n,k], A[m,n])

You'll see there aren't any loops with two or more task calls; each loop nest contains exactly one task call.

The other nice thing about this is that the loop at the bottom is a square, whereas the one in the current Regent code is a triangle (i.e. the bounds on the y loop depend on k). It's not impossible to make a triangular index space, but it's a lot of work, and it can go away completely if you use a rectangular index space.

Bottom line is that the code does not match what I linked in that slide, and will have to be rewritten if you really want it to work well.

@elliottslaughter (Contributor)

@magnatelee pointed out to me that 80*40 is probably a quite small problem size for Cholesky, so even aside from the issue with index launches (which you should still certainly aim to fix), increasing the problem size might help quite a bit. We'd need a Legion Prof chart to be sure.

@qyz96 (Author) commented Oct 23, 2019

Thanks! I have changed the code to the following:

for x = 0, np do
  dpotrf(x, n, bn, pB[f2d { x = x, y = x }])
  __demand(__index_launch)
  for y = x + 1, np do
    dtrsm(x, y, n, bn, pB[f2d { x = x, y = y }], pB[f2d { x = x, y = x }])
  end
  __demand(__index_launch)
  for k = x + 1, np do
    dsyrk(x, k, n, bn, pB[f2d { x = k, y = k }], pB[f2d { x = x, y = k }])
  end
  for k = x + 1, np do
    __demand(__index_launch)
    for y = k + 1, np do
      dgemm(x, y, k, n, bn,
            pB[f2d { x = k, y = y }],
            pB[f2d { x = x, y = y }],
            pB[f2d { x = x, y = k }])
    end
  end
end

I ran the code with the flag -foverride-demand-index-launch 1 on a small problem of size 80*4 and everything is good:
[0 - 7fbc58dc2700] {4}{runtime}: [warning 1019] LEGION WARNING: Region requirements 0 and 1 of index task dsyrk (UID 52) in parent task cholesky (UID 3) are potentially interfering. It's possible that this is a false positive if there are projection region requirements and each of the point tasks are non-interfering. If the runtime is built in debug mode then it will check that the region requirements of all points are actually non-interfering. If you see no further error messages for this index task launch then everything is good. (from file /home/users/adncat/yizhou/Legion/runtime/legion/legion_tasks.cc:6993)[warning 1019]
For more information see:
http://legion.stanford.edu/messages/warning_code.html#warning_code_1019

ELAPSED TIME = 25.915 ms

However, when I changed the size back to 80*40, I got the following error:
[0 - 7ff1d479d700] {4}{runtime}: [warning 1071] LEGION WARNING: Region requirement 1 of operation transpose_copy (UID 797) in parent task cholesky (UID 3) is using uninitialized data for field(s) ,101 of logical region (1094,1,1) (from file /home/users/adncat/yizhou/Legion/runtime/legion/legion_ops.cc:539)[warning 1071]
For more information see:
http://legion.stanford.edu/messages/warning_code.html#warning_code_1071

[0 - 7ff1d475c700] {6}{compqueue}: completion queue overflow: cq=1900000000000003 size=1024

This seems to only happen on the master branch (the code with the same size runs normally on the stable branch).

@streichler (Contributor)

@qyz96 which commit from master are you using? The completion queue overflow you're seeing was hopefully fixed with commit 61cbe9d from yesterday afternoon.

@qyz96 (Author) commented Oct 23, 2019

Yes, I have that update in my history:
[adncat@sh-107-42 ~/yizhou/Legion/language]$ git log -10 master
commit 3459e17
Author: qyz96 [email protected]
Date: Wed Oct 23 15:54:31 2019 -0700

Index Launch 1

commit a4f42af
Author: Sean Treichler [email protected]
Date: Wed Oct 23 12:14:28 2019 -0700

realm: reuse compqueue IDs and detect exhaustion

commit e4b4847
Author: Seshu Yamajala [email protected]
Date: Wed Oct 23 12:05:29 2019 -0700

Print expected number of fields. #658

commit fcc1e34
Author: Seshu Yamajala [email protected]
Date: Wed Oct 23 11:53:08 2019 -0700

Selectively export tasks in save_tasks. #657

commit 70f3699
Author: Mike [email protected]
Date: Tue Oct 22 23:57:27 2019 -0700

legion: fix for virtual mappings in traces

commit 268ebe7
Author: Sean Treichler [email protected]
Date: Tue Oct 22 15:39:57 2019 -0700

test: test both bounded and unbounded completion queues

commit 61cbe9d
Author: Sean Treichler [email protected]
Date: Tue Oct 22 15:39:27 2019 -0700

realm: fix off-by-one in completion queue resizing code

commit b8e06dc
:
But the error still appears:

[0 - 7f3f4c26a700] {4}{runtime}: [warning 1071] LEGION WARNING: Region requirement 1 of operation transpose_copy (UID 793) in parent task cholesky (UID 3) is using uninitialized data for field(s) ,101 of logical region (1014,1,1) (from file /home/users/adncat/yizhou/Legion/runtime/legion/legion_ops.cc:539)[warning 1071]
For more information see:
http://legion.stanford.edu/messages/warning_code.html#warning_code_1071

[0 - 7f3f4c26a700] {4}{runtime}: [warning 1071] LEGION WARNING: Region requirement 1 of operation transpose_copy (UID 795) in parent task cholesky (UID 3) is using uninitialized data for field(s) ,101 of logical region (1054,1,1) (from file /home/users/adncat/yizhou/Legion/runtime/legion/legion_ops.cc:539)[warning 1071]
For more information see:
http://legion.stanford.edu/messages/warning_code.html#warning_code_1071

[0 - 7f3f4c26a700] {4}{runtime}: [warning 1071] LEGION WARNING: Region requirement 1 of operation transpose_copy (UID 797) in parent task cholesky (UID 3) is using uninitialized data for field(s) ,101 of logical region (1094,1,1) (from file /home/users/adncat/yizhou/Legion/runtime/legion/legion_ops.cc:539)[warning 1071]
For more information see:
http://legion.stanford.edu/messages/warning_code.html#warning_code_1071

[0 - 7f3f4c229700] {6}{compqueue}: completion queue overflow: cq=1900000000000003 size=1024

@elliottslaughter (Contributor)

Just to confirm, you did re-run install.py after updating, right?

@qyz96 (Author) commented Oct 23, 2019

Yes.

@elliottslaughter (Contributor)

Is the version of the code you are running the same as what you had in the pull request? Or do you want to put up a new version?

@qyz96 (Author) commented Oct 23, 2019

I created a new pull request:
#665

@streichler added the bug and Legion (Issues pertaining to Legion) labels Oct 24, 2019
@streichler (Contributor)

Passing this over to @lightsighter. Legion is creating a bounded completion queue with a size of 1024 and it's overflowing:

$ ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1 -level runtime=5,compqueue=2 -logfile c.log
[0 - 7f5a9b244700] {6}{compqueue}: completion queue overflow: cq=1900000000000003 size=1024

$ grep 1900000000000003 c.log
[0 - 7f5a9b285700] {2}{compqueue}: created completion queue: cq=1900000000000003 size=1024
[0 - 7f5a9b244700] {2}{compqueue}: event registered with completion queue: cq=1900000000000003 event=8000000208f00003
...
[0 - 7f5a9b244700] {2}{compqueue}: event pushed: cq=1900000000000003 event=80000001cd300002
[0 - 7f5a9b244700] {6}{compqueue}: completion queue overflow: cq=1900000000000003 size=1024

$ grep 1900000000000003 c.log | grep -c pushed
1041

$ grep 1900000000000003 c.log | grep popped
[0 - 7f5a9b285700] {2}{compqueue}: events popped: cq=1900000000000003 max=16 act=16 events=[8000000208f00003, 8000000208600005, 8000000119f00006, 800000020ad00004, 800000020ad00006, 800000020a40000a, 8000000207800009, 800000020b00000c, 800000020790000d, 800000020790000f, 800000002300000c, 8000000119f00015, 8000000119f00017, 8000000041000008, 800000004100000a, 800000001ba00014]

(Note that you need the modified cholesky.rg from #665 for this.)

@lightsighter (Contributor)

Fixed with 105cc4b

@lightsighter added the Regent (Issues pertaining to Regent) label and removed the Legion, bug labels Oct 24, 2019
@elliottslaughter (Contributor)

@qyz96 Can you pull and try again?

@qyz96 (Author) commented Oct 24, 2019

Thanks for the quick fix! Yes the new commit works for me now.

@elliottslaughter (Contributor)

Ok, can you go ahead and run your code with Legion Prof, and we'll see how well it's actually performing now? (You can zip the file and attach it here.)

@qyz96 (Author) commented Oct 24, 2019

I am having some difficulties opening index.html in the legion_prof folder. It looks like the entire folder is too big to upload here (1GB).

@elliottslaughter (Contributor)

Maybe you can put it on some sort of a shared location on Sherlock?

@qyz96 (Author) commented Oct 25, 2019

@elliottslaughter (Contributor)

So what I see in this profile is that the utility processor is 100% full, while the CPU is only 50% full. This is not surprising since np is set to 40. There are a relatively large number of tasks and those tasks are relatively small.

I would change np to be either the total number of CPU cores, or maybe 2x the number of CPU cores. You should see the utility processor utilization go down and the CPU utilization go up. It will also result in generating smaller traces.
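For example, on a 24-core node that might look like the following (the sizes here are purely illustrative, not a recommendation; -p is set to the core count so each block is 200x200):

    ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1 -n 4800 -p 24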

@qyz96 (Author) commented Nov 6, 2019

I have changed the size to 200*24 and ran the program on 1 node with 24 cores (using salloc). The profile link is here:
https://www.dropbox.com/s/1lncoh9lggm7ytw/legion_prof.zip?dl=0

It looks like the CPU utilization is unusually low in this profile. I tried to increase the size of each task by changing the matrix size to 400*24, but I got an out-of-memory error (which I did not encounter with other systems):
[adncat@sh-107-47 ~/yizhou/Legion/language]$ ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1
[0 - 7fa40465b700] {5}{default_mapper}: Default mapper failed allocation of size 800000000 bytes for region requirement 0 of task make_pds_matrix (UID 6) in memory 1e00000000000000 for processor 1d00000000000001. This means the working set of your application is too big for the allotted capacity of the given memory under the default mapper's mapping scheme. You have three choices: ask Realm to allocate more memory, write a custom mapper to better manage working sets, or find a bigger machine.
terra: /home/users/adncat/yizhou/Legion/runtime/mappers/default_mapper.cc:2410: void Legion::Mapping::DefaultMapper::default_report_failed_instance_creation(const Legion::Task&, unsigned int, Legion::Processor, Legion::Memory, size_t) const: Assertion `false' failed.
[adncat@sh-107-47 ~/yizhou/Legion/language]$ free -m
total used free shared buff/cache available
Mem: 191668 2333 185989 19 3345 187946
Swap: 4095 0 4095

Also, I tried compiling with different numbers of threads (using install.py -j NUM_THREADS), but the performance always seems to be the same. Is there a way to check the number of threads the program is using? Thanks!

@qyz96 (Author) commented Nov 11, 2019

Never mind, I just found the flags that specify the number of cores and memory per core. I will let you know if I have further problems.

@qyz96 (Author) commented Nov 30, 2019

Do you know how I can deal with this bug? I am running on a node with four cores. Thanks!
[adncat@sh-105-11 ~]$ ./yizhou/Legion/language/regent.py yizhou/Legion/language/examples/cholesky.rg -fflow 0 -llcpu 4 -ll:csize 8124 -foverride-demand-index-launch 1 -n 4000 -p 20
/home/users/adncat/yizhou/Legion/bindings/regent/libregent.so(_ZN5Realm8TypeConv8from_cppIvPKvmS3_mNS_9ProcessorEEENS_4TypeENS0_14CppTypeCaptureIPFT_T0_T1_T2_T3_T4_EEE+0x37) [0x7fed7649f4f7]
1 terra (JIT) 0x00007fed75bdeca3 $main + 15299

@elliottslaughter (Contributor)

A couple things:

  1. It would help to have the full Legion Prof trace. E.g. I can't see the utility processor or zoom in with this view.
  2. How many physical cores do you actually have on this machine?
  3. Do you have a reference implementation (e.g. MPI or OpenMP) to compare against? Otherwise it's hard to know how we should expect the code to scale.

@eddy16112 (Contributor)

A quick way to compare with a reference implementation, without actually running the reference code, is to calculate the flop rate of the cholesky. A high-performance DPOTRF would have a flop rate similar to DGEMM, and MKL DGEMM is close to the peak flops of the CPU.
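For reference, a back-of-the-envelope version of that calculation, using the standard operation counts (the 512 block size is just the value used elsewhere in this thread):

    dgemm on b x b blocks:  2*b^3 flops; b = 512 gives ~2.7e8 flops per task
    Cholesky of an n x n matrix:  ~n^3/3 flops total
    GFLOP/s = (n^3/3) / (elapsed seconds) / 1e9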

@qyz96 (Author) commented Dec 4, 2019

I have shared the profiling output here:
https://www.dropbox.com/sh/iiyh5qlxjg1e5cx/AAAMciyUn_EIgONRWZTEe1CHa?dl=0
I have 24 cores (used something like salloc -c 24 -n 1). I compared Legion with StarPU and another task scheduling system (TaskTorrent) from our group. The result is something like this:
[benchmark chart comparing Legion, StarPU, and TaskTorrent]
Here 200*20 means 20 by 20 blocks, with each block of size 200 by 200. For Legion I also played a little bit with -ll:cpu and -ll:util, so in this case I am only using 20 cores for computation and 4 cores for utility.

@elliottslaughter (Contributor)

The trace shows you're 100% bottlenecked on the utility processor. This means you either need a larger problem size (so that the tasks get bigger), or you have to make fewer blocks (do you really need 400 blocks when you only have 24 processors?), or you need to use tracing (which means adding a timestep loop around the outside, using __demand(__trace) on it, and starting timing from the 3rd iteration or so).

In general Legion is designed to reserve 1 or 2 cores for the runtime's internal use. So it's not surprising that it ticks up at the end. But given the utility bottleneck, that's the biggest issue.
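For the tracing option, a hypothetical sketch (nreps is a made-up name; the point is just to repeat an identical task graph so the runtime can capture and replay it, and to time only the later iterations):

    __demand(__trace)
    for rep = 0, nreps do
      -- the existing x/y/k factorization loops go here, unchanged
    end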

@qyz96 (Author) commented Feb 5, 2020

Hi, I am encountering this issue again:
#663 (comment)

Could you help me with this? I have run the code in gdb and the output says:

Starting program: /home/users/adncat/yizhou/Legion/language/terra/terra yizhou/Legion/language/examples/cholesky.rg -ll:cpu 12 -ll:util 4 -ll:csize 8192 -fflow 0 -foverride-demand-index-launch 1 -n 20000 -np 20
warning: File "/share/software/user/open/gcc/8.1.0/lib64/libstdc++.so.6.0.25-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load:/usr/bin/mono-gdb.py".
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Detaching after fork from child process 103988.
Detaching after fork from child process 103991.
Detaching after fork from child process 103996.
Detaching after fork from child process 103999.
Detaching after fork from child process 104001.
Detaching after fork from child process 104004.
Detaching after fork from child process 104006.
Detaching after fork from child process 104009.
Detaching after fork from child process 104012.
Detaching after fork from child process 104014.

Program received signal SIGILL, Illegal instruction.
0x00007ffff44814f7 in Realm::Type Realm::TypeConv::from_cpp<void, void const*, unsigned long, void const*, unsigned long, Realm::Processor>(Realm::TypeConv::CppTypeCapture<void (*)(void const*, unsigned long, void const*, unsigned long, Realm::Processor)>) () from /home/users/adncat/yizhou/Legion/bindings/regent/libregent.so
(gdb) c
Continuing.
/home/users/adncat/yizhou/Legion/bindings/regent/libregent.so(_ZN5Realm8TypeConv8from_cppIvPKvmS3_mNS_9ProcessorEEENS_4TypeENS0_14CppTypeCaptureIPFT_T0_T1_T2_T3_T4_EEE+0x37) [0x7ffff44814f7]
1 terra (JIT) 0x00007ffff7e08d6e $main + 15470

Program received signal SIGILL, Illegal instruction.
0x00007ffff44814f7 in Realm::Type Realm::TypeConv::from_cpp<void, void const*, unsigned long, void const*, unsigned long, Realm::Processor>(Realm::TypeConv::CppTypeCapture<void (*)(void const*, unsigned long, void const*, unsigned long, Realm::Processor)>) () from /home/users/adncat/yizhou/Legion/bindings/regent/libregent.so
(gdb) c
Continuing.

Program terminated with signal SIGILL, Illegal instruction.
The program no longer exists.

Thanks!

@elliottslaughter (Contributor)

You got a SIGILL, which makes me wonder if the code was compiled correctly. What machine are you running on again? If it's a distributed machine, is it possible the node you compiled on has a different architecture than the node you're running on?

@qyz96 (Author) commented Feb 5, 2020

I am running on the Sherlock cluster, but possibly on a different node each time. Does that mean I have to run install.py every time? By the way, is there a way to disable printing the following warning?

[0 - 7fb5c004b700] {4}{runtime}: [warning 1019] LEGION WARNING: Region requirements 0 and 2 of index task dgemm (UID 16794) in parent task cholesky (UID 3) are potentially interfering. It's possible that this is a false positive if there are projection region requirements and each of the point tasks are non-interfering. If the runtime is built in debug mode then it will check that the region requirements of all points are actually non-interfering. If you see no further error messages for this index task launch then everything is good. (from file /home/users/adncat/yizhou/Legion/runtime/legion/legion_tasks.cc:6993)[warning 1019]
For more information see:
http://legion.stanford.edu/messages/warning_code.html#warning_code_1019

Thanks!

@elliottslaughter (Contributor)

I don't know, but just for the moment, let's try that to make sure we remove it as a potential cause.

You can run with -level 5, but that will get rid of all warnings, not just this one.
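For example (the same command as before, with the level flag appended):

    ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1 -level 5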

@lightsighter (Contributor)

If you do a run with the runtime built in debug mode and it doesn't report an error, then you can safely ignore the warning about potentially interfering region requirements.

@qyz96 (Author) commented May 21, 2020

Hi, is cholesky.rg here a distributed version that I can run on multiple nodes? I tried running with gasnetrun_ibv, but it says the following:
gasnetrun: unable to locate a GASNet program in '/home/darve/adncat/legion/language/./regent.py examples/cholesky.rg'

Thank you!

@elliottslaughter
Copy link
Contributor

That error has nothing to do with this program. We do not recommend running with gasnetrun_* on any platform. Instead we recommend enabling the MPI compat mode in GASNet (which is on if you compiled with our scripts here) and then running with mpirun or similar depending on what MPI you have.

@qyz96 (Author) commented May 22, 2020

Yes, I compiled with the ibv configuration, which I believe has the --enable-mpi-compat flag. I tried running with the following command:
mpirun -n 1 ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1
But it shows the following error:
[cli_0]: write_line error; fd=14 buf=:cmd=init pmi_version=1 pmi_subversion=1
:
system msg for write_line failure : Bad file descriptor
[cli_0]: Unable to write to PMI_fd
[cli_0]: write_line error; fd=14 buf=:cmd=barrier_in
:
system msg for write_line failure : Bad file descriptor
[cli_0]: write_line error; fd=14 buf=:cmd=get_ranks2hosts
:
system msg for write_line failure : Bad file descriptor
[cli_0]: expecting cmd="put_ranks2hosts", got cmd=""
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(805): fail failed
MPID_Init(1743)......: channel initialization failed
MPID_Init(2144)......: PMI_Init returned -1
[cli_0]: write_line error; fd=14 buf=:cmd=abort exitcode=68204815
:
system msg for write_line failure : Bad file descriptor

@elliottslaughter (Contributor) commented May 22, 2020

How about this?

LAUNCHER="mpirun -n 1" ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1

Note though that if this is OpenMPI, you need some additional flags to make sure all the environment variables get passed through.
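With OpenMPI that would look something like this (-x exports an environment variable to the launched processes; which variables you need depends on your setup):

    LAUNCHER="mpirun -n 1 -x LD_LIBRARY_PATH" ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1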

@qyz96 (Author) commented May 22, 2020

I am using intel MPI library:
[adncat@compute-1-1 language]$ mpirun --version
Intel(R) MPI Library for Linux* OS, Version 2018 Update 2 Build 20180125 (id: 18157)
Copyright 2003-2018 Intel Corporation.

That suggestion does work! But I am getting a seg fault at the end:
[adncat@compute-1-1 language]$ LAUNCHER="mpirun -n 1" ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1 -level 5
ELAPSED TIME = 2.863 ms

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 12065 RUNNING AT compute-1-1
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 12065 RUNNING AT compute-1-1
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

Intel(R) MPI Library troubleshooting guide:
https://software.intel.com/node/561764

@qyz96 (Author) commented May 22, 2020

I also have the following error:
/home/darve/adncat/legion/language/terra/release/bin/terra: symbol lookup error: /home/darve/adncat/legion/bindings/regent/libregent.so: undefined symbol: _ZNK5Realm18RegionInstanceImpl8Metadata13serialize_msgINS_13ActiveMessageINS_23MetadataResponseMessageEEEEEvRT_

when I try to run this command on multiple nodes:
LAUNCHER="mpirun -n ${SLURM_NTASKS}" ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1 -level 5

Thank you!

@elliottslaughter (Contributor)

Is it possible you rebuilt or something? This isn't a symbol directly used by Regent, so it would be only used transitively through some sort of Legion API call. I wouldn't expect that to go wrong unless there was a miscompilation or perhaps a dirty compilation of some sort.

@qyz96 (Author) commented May 26, 2020

I ran the script file ./setup_env.py and then built with ./install.py --gasnet. It seems that the error disappeared after I reinstalled with plain ./install.py. The seg fault that I was seeing on one node also disappeared:

LAUNCHER="mpirun -n 1" ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1 -level 5 -n 10000 -p 20 -ll:cpu 16 -ll:csize 8192
ELAPSED TIME = 11794.601 ms

However, I did not really see any scaling with multiple nodes. I saw the following output with 8 nodes:
ELAPSED TIME = 11743.746 ms
ELAPSED TIME = 11752.615 ms
ELAPSED TIME = 11593.876 ms
ELAPSED TIME = 11677.211 ms
ELAPSED TIME = 11669.117 ms
ELAPSED TIME = 11674.317 ms
ELAPSED TIME = 11736.817 ms
ELAPSED TIME = 11797.231 ms

I am not sure if I am running the code correctly; the exact command was:
OPEN_NUM_THREADS=1 LAUNCHER="mpirun -n ${SLURM_NTASKS}" ./regent.py examples/cholesky.rg -fflow 0 -foverride-demand-index-launch 1 -level 5 -n 10000 -p 20 -ll:cpu 16 -ll:csize 8192

Am I missing something here? Am I supposed to install with GASNet support (with the --gasnet flag) and resolve the previous issue, or did I miss some configuration in the command? I could profile the process, but I am also not sure how to generate profiling results for each node in the process.

@eddy16112 (Contributor)

@qyz96 Your matrix size is too small for an 8-node run. Cholesky is not a suitable application for strong scaling; you can try weak scaling instead (1 node, 4 nodes, 16 nodes, ...). Also try some larger block sizes; the default 500*500 may not be large enough to saturate the CPU.

Instead of using the actual run time, converting the time into flops can tell you how close the cholesky is to the peak flops of the machine.
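For example, taking the single-node run above at face value (n = 10000, ~11.8 s):

    10000^3 / 3 ≈ 3.3e11 flops
    3.3e11 flops / 11.8 s ≈ 28 GFLOP/s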

Last, I am not sure this cholesky is optimized for a distributed run. A high-performance Cholesky should use a 2D block-cyclic data distribution, which is not in the default mapper, so you might need to create a custom mapper.
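The distribution itself is just a formula. A minimal sketch of the owner computation, assuming a hypothetical Pr x Pc logical processor grid (all names here are made up; the real work is wiring this into a mapper):

    -- maps block (x, y) to a node rank under a 2D block-cyclic distribution
    task block_owner(x : int, y : int, Pr : int, Pc : int) : int
      return (x % Pr) * Pc + (y % Pc)
    end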

@qyz96 (Author) commented May 27, 2020

I see. I am not very familiar with writing custom mappers. Do you have some references or documentation regarding that? Thanks!

@qyz96 (Author) commented Jun 22, 2020

Hi,
I have been running some benchmarks for both Cholesky and GEMM using Regent. In particular, you can see the scaling results for different numbers of CPUs per node and numbers of nodes in the table below:
[scaling results table]
I noticed that when the total number of cores goes beyond a certain number (in this case 16 cores), the runtime stops decreasing. I checked the profiling result, and it appears that when we have more than 16 cores, the mapping does not give any work to CPUs other than those 16 cores. Below is the link to the profiling result for 4 cores per node, 8 nodes in total, a matrix size of 8192, and a block size of 512.
https://www.dropbox.com/sh/lvau3ek3q87fiad/AAAUa_d6o7bI6pGFSnzU3SNUa?dl=0
You can see there that nodes 4-7 are not doing any computational work. I am just wondering if I am using the mapping correctly. If I need to improve the mapping, what would be the easiest way to change the default mapping? The exact command I am running now is as follows:
LAUNCHER="mpirun -n 8" ./regent.py my_dgemm.rg (or cholesky.rg) -fflow 0 -level 5 -n 8192 -p 16 -ll:csize 8192 -ll:cpu ${NUM_CPU} -ll:util ${NUM_UTILITY} -foverride-demand-index-launch 1

Thank you!

@magnatelee (Contributor)

I don't know much about your dgemm code, but I believe for cholesky the number you give to -p determines the total number of tasks to launch, and it looks like you only give 16, which may not be enough if you have more than 16 cores. I'd set -p to the total number of cores.

@EricDarve

-p is the number of blocks in the matrix partitioning along the rows and columns. The matrix is partitioned into 16x16 blocks. The total number of tasks for a GEMM is 16^3 = 4,096. You have 16 cores per node and 16 nodes at maximum, so that's 256 tasks per node, or 16 tasks per core. The workload for all nodes is the same. We did some benchmarks with StarPU and our runtime TaskTorrent; they both scale well on this problem with these parameters, so the calculation scales well in this parameter range.

We had some issues with the utility processor in the past, but this looks different. Even if the utility processor is very busy, the workload should still be evenly distributed, yet we are seeing that several nodes are basically idle.

Here are some screenshots from the profiler. The profiling data is here:
https://www.dropbox.com/sh/lvau3ek3q87fiad/AAAUa_d6o7bI6pGFSnzU3SNUa?dl=0

[profiler screenshot: overall utilization across all nodes]

Notice the 50% all nodes (CPU) utilization.

[profiler screenshot: per-node timelines]

Nodes 4 through 7 are idle. There is some utility activity towards the end; the tasks at the end are all Send Shutdown Notifications. If you zoom in:

[profiler screenshot: zoomed view of the end of the run]

You can find the Regent code here:
https://github.com/qyz96/Legion_Benchmark/blob/master/my_dgemm.rg

If you look at the benchmark above, the matrix has size 16 x 16 blocks (of size 512), so each loop does 16 iterations. The code seems to stop scaling when we try to use more than 16 cores, so it seems that the loops are not parallelized correctly. Is there something preventing the compiler from parallelizing the loops?

The performance seems consistent with parallelizing only one of the loops. In the code below, pC is read-write, so the k loop is sequential (in principle a reduction could be used for pC, leading to more parallelism, but this is not done). The x and y loops should be parallel, so the concurrency is 16^2 = 256.

The code is quite simple. Three nested loops:

  var bn = n / np
  for k = 0, np do
    for x = 0, np do
      __demand(__index_launch)
      for y = 0, np do
        dgemm(x, y, k, n, bn,
              pC[f2d { x = x, y = y }],
              pA[f2d { x = x, y = k }],
              pB[f2d { x = k, y = y }])
      end
    end
  end

and the dgemm task:

task dgemm(x : int, y : int, k : int, n : int, bn : int,
           rA : region(ispace(f2d), double),
           rB : region(ispace(f2d), double),
           rC : region(ispace(f2d), double))
where reads writes(rA), reads(rB, rC)
do
  dgemm_terra(x, y, k, n, bn,
              __physical(rA)[0], __fields(rA)[0],
              __physical(rB)[0], __fields(rB)[0],
              __physical(rC)[0], __fields(rC)[0])
end
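For the reduction idea mentioned above, a hypothetical variant of the task declaration would look like the following; only the privileges are sketched, since whether the dgemm_terra call can be kept as-is on a reduction instance is a separate question:

    task dgemm_red(x : int, y : int, k : int, n : int, bn : int,
                   rA : region(ispace(f2d), double),
                   rB : region(ispace(f2d), double),
                   rC : region(ispace(f2d), double))
    where reduces +(rA), reads(rB, rC)
    do
      -- would accumulate the rB * rC block product into rA via += only
    end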

@magnatelee (Contributor) commented Jun 27, 2020

Aha, I was wrong about the code that I wrote myself. :) In general, code like this cholesky needs a custom mapper, since the default mapper is not smart enough to handle irregular task graphs. I just didn't bother to write one for the cholesky example, as it wasn't really intended for any performance evaluation. (And I'd be happy to help write the right custom mapper, if you are planning to use it for any serious performance comparison.)

For your gemm code though, I think what you need is a 2D index launch like the following (which may need -foverride-demand-index-launch to be accepted by the compiler):

  var bn = n / np
  var launch_domain = rect2d { int2d {0, 0}, int2d {np - 1, np - 1} }
  for k = 0, np do
    __demand(__index_launch)
    for p in launch_domain do
      dgemm(p.x, p.y, k, n, bn,
            pC[f2d { x = p.x, y = p.y }],
            pA[f2d { x = p.x, y = k }],
            pB[f2d { x = k, y = p.y }])
    end
  end

The default mapper should be able to distribute the tasks from this code evenly across all the nodes, as long as np^2 is greater than or equal to the number of cores (e.g. with np = 16 on 8 nodes of 16 cores, np^2 = 256 >= 128).

@EricDarve

Makes sense. Thanks.

@leopoldcambier commented Jun 29, 2020

Thanks for the suggestion, Wonchan; it does make a lot of sense.
We tried it on the GEMM example, and it seems to improve the load balancing.

However, we sometimes hit random segfaults/hangs on "large" problems.
In particular, on an 8192x8192 matrix with a block size of 512 it happens almost systematically.
When it doesn't crash, it seems to be giving correct results.

I attached a GASNet backtrace to this message.
The exact GEMM code is here: https://github.com/qyz96/Legion_Benchmark/blob/lcambier/dgemm_index_launch_crash/my_dgemm.rg
And we launch it using this script: https://github.com/qyz96/Legion_Benchmark/blob/lcambier/dgemm_index_launch_crash/run_gemm.sh
(For instance, doing NUM_CPUS=16 NUM_BLOCKS=16 BLOCK_SIZE=512 sbatch -c 32 -n 8 run_gemm.sh.)

Looking at the code, is there anything that seems off to you?

I built Legion on commit 3418c3205cf7de6df761c71a9e8e0bf60edb5e41 (master as of June 18) using GCC 8.3.0 and we use Intel MPI.

compute-1-1
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Model name:            Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
Stepping:              7
CPU MHz:               2000.000
CPU max MHz:           2601.0000
CPU min MHz:           1200.0000
BogoMIPS:              5200.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-7,16-23
NUMA node1 CPU(s):     8-15,24-31
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts
compute-1-1
compute-1-3
compute-1-8
compute-1-2
compute-1-7
compute-1-6
compute-1-4
compute-1-5
>>>>slurm_id=27156,matrix_size=8192,num_blocks=16,block_size=512,num_ranks=8,num_cpus=16
*** Caught a fatal signal (proc 1): SIGSEGV(11)
NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
[1] Invoking GDB for backtrace...
[1] /usr/bin/gdb -nx -batch -x /tmp/gasnet_bFaOAp '/home/darve/lcambier/yizhou_legion/./my_dgemm.rg' 18259
[1] [New LWP 18324]
[1] [New LWP 18323]
[1] [New LWP 18322]
[1] [New LWP 18321]
[1] [New LWP 18320]
[1] [New LWP 18319]
[1] [New LWP 18318]
[1] [New LWP 18317]
[1] [New LWP 18316]
[1] [New LWP 18315]
[1] [New LWP 18314]
[1] [New LWP 18313]
[1] [New LWP 18312]
[1] [New LWP 18311]
[1] [New LWP 18310]
[1] [New LWP 18309]
[1] [New LWP 18308]
[1] [New LWP 18307]
[1] [New LWP 18306]
[1] [New LWP 18305]
[1] [New LWP 18304]
[1] [New LWP 18303]
[1] [New LWP 18302]
[1] [New LWP 18301]
[1] [New LWP 18300]
[1] [New LWP 18299]
[1] [New LWP 18298]
[1] [New LWP 18297]
[1] [New LWP 18296]
[1] [New LWP 18295]
[1] [New LWP 18294]
[1] [New LWP 18293]
[1] [New LWP 18292]
[1] [New LWP 18291]
[1] [New LWP 18290]
[1] [New LWP 18289]
[1] [New LWP 18288]
[1] [New LWP 18287]
[1] [Thread debugging using libthread_db enabled]
[1] Using host libthread_db library "/lib64/libthread_db.so.1".
[1] 0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] To enable execution of this file add
[1] 	add-auto-load-safe-path /opt/ohpc/pub/compiler/gcc/8.3.0/lib64/libstdc++.so.6.0.25-gdb.py
[1] line to your configuration file "/home/darve/lcambier/.gdbinit".
[1] To completely disable this security protection add
[1] 	set auto-load safe-path /
[1] line to your configuration file "/home/darve/lcambier/.gdbinit".
[1] For more information about this security protection see the
[1] "Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
[1] 	info "(gdb)Auto-loading safe path"
[1]   Id   Target Id         Frame
[1]   39   Thread 0x7f6492455700 (LWP 18287) "terra" 0x00007f649c255bed in poll () from /lib64/libc.so.6
[1]   38   Thread 0x7f648bfff700 (LWP 18288) "terra" 0x00007f649c245727 in sched_yield () from /lib64/libc.so.6
[1]   37   Thread 0x7f6490352780 (LWP 18289) "terra" 0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   36   Thread 0x7f648b7fe700 (LWP 18290) "terra" 0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   35   Thread 0x7f648affd700 (LWP 18291) "terra" 0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   34   Thread 0x7f648a7fc700 (LWP 18292) "terra" 0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   33   Thread 0x7f6494670780 (LWP 18293) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   32   Thread 0x7f6494664780 (LWP 18294) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   31   Thread 0x7f6494658780 (LWP 18295) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   30   Thread 0x7f649464c780 (LWP 18296) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   29   Thread 0x7f6494640780 (LWP 18297) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   28   Thread 0x7f6494634780 (LWP 18298) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   27   Thread 0x7f6494628780 (LWP 18299) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   26   Thread 0x7f649461c780 (LWP 18300) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   25   Thread 0x7f6494610780 (LWP 18301) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   24   Thread 0x7f6494604780 (LWP 18302) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   23   Thread 0x7f64945f8780 (LWP 18303) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   22   Thread 0x7f64945ec780 (LWP 18304) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   21   Thread 0x7f64945e0780 (LWP 18305) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   20   Thread 0x7f64945d4780 (LWP 18306) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   19   Thread 0x7f64945c8780 (LWP 18307) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   18   Thread 0x7f64945bc780 (LWP 18308) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   17   Thread 0x7f64945b0780 (LWP 18309) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   16   Thread 0x7f64945a4780 (LWP 18310) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   15   Thread 0x7f6494598780 (LWP 18311) "terra" 0x00007f649c227469 in waitpid () from /lib64/libc.so.6
[1]   14   Thread 0x7f649458c780 (LWP 18312) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   13   Thread 0x7f6494580780 (LWP 18313) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   12   Thread 0x7f6494574780 (LWP 18314) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   11   Thread 0x7f6494568780 (LWP 18315) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   10   Thread 0x7f649455c780 (LWP 18316) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   9    Thread 0x7f6494550780 (LWP 18317) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   8    Thread 0x7f6494544780 (LWP 18318) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   7    Thread 0x7f6494538780 (LWP 18319) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   6    Thread 0x7f649452c780 (LWP 18320) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   5    Thread 0x7f6494520780 (LWP 18321) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   4    Thread 0x7f649014e780 (LWP 18322) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   3    Thread 0x7f6490142780 (LWP 18323) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]   2    Thread 0x7f6490136780 (LWP 18324) "terra" 0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] * 1    Thread 0x7f649d603780 (LWP 18259) "terra" 0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1]
[1] Thread 39 (Thread 0x7f6492455700 (LWP 18287)):
[1] #0  0x00007f649c255bed in poll () from /lib64/libc.so.6
[1] #1  0x00007f649246f053 in cm_thread () from /lib64/libdaploucm.so.2
[1] #2  0x00007f649245aaca in dapli_thread_init () from /lib64/libdaploucm.so.2
[1] #3  0x00007f649cdd4e65 in start_thread () from /lib64/libpthread.so.0
[1] #4  0x00007f649c26088d in clone () from /lib64/libc.so.6
[1]
[1] Thread 38 (Thread 0x7f648bfff700 (LWP 18288)):
[1] #0  0x00007f649c245727 in sched_yield () from /lib64/libc.so.6
[1] #1  0x00007f649b8e543d in gasnetc_am_sema_poll () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b8e54b5 in gasnetc_am_get_credit () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b8eaeff in gasnetc_AMRequestMediumM () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649b4d30bb in ActiveMessageEndpoint::send_short(OutgoingMessage*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #5  0x00007f649b4ceb34 in EndpointManager::polling_worker_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #6  0x00007f649b345dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #7  0x00007f649cdd4e65 in start_thread () from /lib64/libpthread.so.0
[1] #8  0x00007f649c26088d in clone () from /lib64/libc.so.6
[1]
[1] Thread 37 (Thread 0x7f6490352780 (LWP 18289)):
[1] #0  0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b4cd928 in IncomingMessageManager::get_messages(int&, bool) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b4cdc78 in IncomingMessageManager::handler_thread_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b345dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649cdd4e65 in start_thread () from /lib64/libpthread.so.0
[1] #5  0x00007f649c26088d in clone () from /lib64/libc.so.6
[1]
[1] Thread 36 (Thread 0x7f648b7fe700 (LWP 18290)):
[1] #0  0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b322cce in Realm::XferDesQueue::dequeue_xferDes(Realm::DMAThread*, bool) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b322daf in Realm::DMAThread::dma_thread_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b345dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649cdd4e65 in start_thread () from /lib64/libpthread.so.0
[1] #5  0x00007f649c26088d in clone () from /lib64/libc.so.6
[1]
[1] Thread 35 (Thread 0x7f648affd700 (LWP 18291)):
[1] #0  0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b32cc58 in Realm::DmaRequestQueue::dequeue_request(bool) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b33463b in Realm::DmaRequestQueue::worker_thread_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b345dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649cdd4e65 in start_thread () from /lib64/libpthread.so.0
[1] #5  0x00007f649c26088d in clone () from /lib64/libc.so.6
[1]
[1] Thread 34 (Thread 0x7f648a7fc700 (LWP 18292)):
[1] #0  0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b36abdb in Realm::PartitioningOpQueue::worker_thread_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b345dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649cdd4e65 in start_thread () from /lib64/libpthread.so.0
[1] #4  0x00007f649c26088d in clone () from /lib64/libc.so.6
[1]
[1] Thread 33 (Thread 0x7f6494670780 (LWP 18293)):
[1] #0  0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b3433a4 in Realm::CondVar::timedwait(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b35e38d in Realm::ThreadedTaskScheduler::WorkCounter::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b35e745 in Realm::ThreadedTaskScheduler::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649b363df8 in Realm::ThreadedTaskScheduler::scheduler_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #5  0x00007f649b34b264 in Realm::UserThread::uthread_entry() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #6  0x00007f649c1aa140 in ?? () from /lib64/libc.so.6
[1] #7  0x0000000000000000 in ?? ()
[1]
[1] Thread 32 (Thread 0x7f6494664780 (LWP 18294)):
[1] #0  0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b3433a4 in Realm::CondVar::timedwait(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b35e38d in Realm::ThreadedTaskScheduler::WorkCounter::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b35e745 in Realm::ThreadedTaskScheduler::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649b363df8 in Realm::ThreadedTaskScheduler::scheduler_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #5  0x00007f649b34b264 in Realm::UserThread::uthread_entry() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #6  0x00007f649c1aa140 in ?? () from /lib64/libc.so.6
[1] #7  0x0000000000000000 in ?? ()
[1]
[1] Thread 31 (Thread 0x7f6494658780 (LWP 18295)):
[1] #0  0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b3433a4 in Realm::CondVar::timedwait(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b35e38d in Realm::ThreadedTaskScheduler::WorkCounter::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b35e745 in Realm::ThreadedTaskScheduler::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649b363df8 in Realm::ThreadedTaskScheduler::scheduler_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #5  0x00007f649b34b264 in Realm::UserThread::uthread_entry() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #6  0x00007f649c1aa140 in ?? () from /lib64/libc.so.6
[1] #7  0x0000000000000000 in ?? ()
[1]
[1] Thread 30 (Thread 0x7f649464c780 (LWP 18296)):
[1] #0  0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b3433a4 in Realm::CondVar::timedwait(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b35e38d in Realm::ThreadedTaskScheduler::WorkCounter::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b35e745 in Realm::ThreadedTaskScheduler::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649b363df8 in Realm::ThreadedTaskScheduler::scheduler_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #5  0x00007f649b34b264 in Realm::UserThread::uthread_entry() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #6  0x00007f649c1aa140 in ?? () from /lib64/libc.so.6
[1] #7  0x0000000000000000 in ?? ()
[1]
[1] Thread 29 (Thread 0x7f6494640780 (LWP 18297)):
[1] #0  0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b3433a4 in Realm::CondVar::timedwait(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b35e38d in Realm::ThreadedTaskScheduler::WorkCounter::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b35e745 in Realm::ThreadedTaskScheduler::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649b363df8 in Realm::ThreadedTaskScheduler::scheduler_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #5  0x00007f649b34b264 in Realm::UserThread::uthread_entry() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #6  0x00007f649c1aa140 in ?? () from /lib64/libc.so.6
[1] #7  0x0000000000000000 in ?? ()
[1]
[1] Thread 28 (Thread 0x7f6494634780 (LWP 18298)):
[1] #0  0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b3433a4 in Realm::CondVar::timedwait(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b35e38d in Realm::ThreadedTaskScheduler::WorkCounter::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b35e745 in Realm::ThreadedTaskScheduler::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649b363df8 in Realm::ThreadedTaskScheduler::scheduler_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #5  0x00007f649b34b264 in Realm::UserThread::uthread_entry() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #6  0x00007f649c1aa140 in ?? () from /lib64/libc.so.6
[1] #7  0x0000000000000000 in ?? ()
[1]
[1] Thread 27 (Thread 0x7f6494628780 (LWP 18299)):
[1] #0  0x00007f649cdd8da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b3433a4 in Realm::CondVar::timedwait(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b35e38d in Realm::ThreadedTaskScheduler::WorkCounter::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649b35e745 in Realm::ThreadedTaskScheduler::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649b363df8 in Realm::ThreadedTaskScheduler::scheduler_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #5  0x00007f649b34b264 in Realm::UserThread::uthread_entry() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #6  0x00007f649c1aa140 in ?? () from /lib64/libc.so.6
[1] #7  0x0000000000000000 in ?? ()
[1]
[... Threads 26-16 (LWP 18300-18310) elided: all idle, with the same Realm::ThreadedTaskScheduler::wait_for_work backtrace as above ...]
[1]
[1] Thread 15 (Thread 0x7f6494598780 (LWP 18311)):
[1] #0  0x00007f649c227469 in waitpid () from /lib64/libc.so.6
[1] #1  0x00007f649c1a4f12 in do_system () from /lib64/libc.so.6
[1] #2  0x00007f649c1a52c1 in system () from /lib64/libc.so.6
[1] #3  0x00007f649b904ce4 in gasneti_system_redirected () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649b905418 in gasneti_bt_gdb () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #5  0x00007f649b90902f in gasneti_print_backtrace () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #6  0x00007f649ac81185 in gasneti_defaultSignalHandler () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #7  <signal handler called>
[1] #8  0x00007f64971dc6a7 in mkl_blas_avx_dgemm_kernel_0 () from /home/darve/lcambier/yizhou_legion/myblas.so
[1] #9  0x00007f641c2002b0 in ?? ()
[1] #10 0x00007f641c200680 in ?? ()
[1] #11 0x0000000000000000 in ?? ()
[1]
[... Threads 14-2 (LWP 18312-18324) elided: all idle, same wait_for_work backtrace as above ...]
[1]
[1] Thread 1 (Thread 0x7f649d603780 (LWP 18259)):
[1] #0  0x00007f649cdd89f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[1] #1  0x00007f649b19a820 in Realm::RuntimeImpl::wait_for_shutdown() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #2  0x00007f649b19fd01 in Realm::Runtime::wait_for_shutdown() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #3  0x00007f649af4e448 in Legion::Internal::Runtime::start(int, char**, bool) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[1] #4  0x00007f649883e486 in ?? ()
[1] #5  0x00007fff00000100 in ?? ()
[1] #6  0x00007f6498846ee0 in ?? ()
[1] #7  0x0000000000000000 in ?? ()

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 18259 RUNNING AT compute-1-2
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
   Intel(R) MPI Library troubleshooting guide:
      https://software.intel.com/node/561764
===================================================================================

@streichler
Copy link
Contributor

In the backtrace above, the segfault appears to be occurring inside an MKL-accelerated routine. Can you look at that call with a debugger and confirm the arguments are reasonable? We've seen problems with other BLAS libraries (e.g., Cray's) being unhappy about being called from multiple tasks concurrently, so I'd look into how to get MKL to run in single-threaded mode and/or experiment with a different BLAS implementation.
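
For reference, a minimal sketch of capping MKL's threading at runtime (this is an assumption, not something confirmed in this thread: it presumes libmkl_rt.so is resolvable by the dynamic loader, and uses MKL's documented mkl_set_num_threads control, declared by hand so mkl.h isn't needed):

-- Minimal sketch (assumption: libmkl_rt.so is on the loader path).
-- Declaring the prototype inline avoids depending on mkl.h's location.
local mkl = terralib.includecstring [[
void mkl_set_num_threads(int nt);
]]
terralib.linklibrary("libmkl_rt.so")
mkl.mkl_set_num_threads(1)  -- cap MKL at one thread per calling task

With the sequential interface linked (as in the next comment) this call is a no-op, but it guards against a threaded libmkl being picked up by accident.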

@eddy16112
Copy link
Contributor

By the way, here is how to link against the single-threaded MKL:
terralib.linklibrary("libmkl_core.so")
terralib.linklibrary("libmkl_sequential.so")
terralib.linklibrary("libmkl_intel_lp64.so")
If you are using the netlib BLAS, that is already single-threaded.
I will try the dgemm example later.

@leopoldcambier
Copy link

leopoldcambier commented Jun 29, 2020

Ha, I actually didn't notice the MKL call in the backtrace, good catch. I agree it's probably coming from there.
I'm already using the sequential version (I wasn't really sure how to link everything, so myblas.so is just a library where I put all of the MKL stuff directly).
I tried @eddy16112's suggestion, but unfortunately the same thing happens (hang/crash in mkl_blas_avx_dgemm_kernel_0).
I'll try to debug this a bit.

@leopoldcambier
Copy link

leopoldcambier commented Jun 29, 2020

I tried OpenBLAS instead; no difference, it still hangs/segfaults.

However, I may have found something.
In the Terra dgemm function I tried printing the arguments (hopefully this is valid Regent/Terra/Lua code, I'm not 100% sure).
(Note: the offsets are all 512, so [bn*bn - 1] should index the last entry of each block; the matrix blocks are contiguous, I believe.)

c.printf("dgemm (%d, %d, %d, %d, %d, %e, %e, %e, %e, %e, %e)\n",
          x, y, k, n, bn, rawA.ptr[0], rawB.ptr[0], rawC.ptr[0], rawA.ptr[bn*bn - 1], rawB.ptr[bn*bn - 1], rawC.ptr[bn*bn - 1])

and I saw this (full log below) on rank 3 before segfaulting

[3] dgemm (14, 4, 0, 8192, 512, 0.000000e+00, 5.434722e-323, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[3] dgemm (12, 4, 0, 8192, 512, 0.000000e+00, 2.121996e-314, 0.000000e+00, -nan, 0.000000e+00, 0.000000e+00)
[3] dgemm (15, 5, 0, 8192, 512, -nan, 6.250440e-316, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[3] dgemm (13, 5, 0, 8192, 512, -nan, 2.014297e-307, 0.000000e+00, -nan, 0.000000e+00, 0.000000e+00)
[3] *** Caught a fatal signal (proc 3): SIGSEGV(11)
[3] NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
[3] [3] Invoking GDB for backtrace...

Notice the -nan and 6.250440e-316 values. That looks like uninitialized memory; maybe it's not even allocated, hence the segfaults?

Finally, all the other arguments (alpha, beta, n, bn, x, y, k, ...) look fine.
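
A slightly more systematic version of the same check, as a minimal sketch (check_block is a hypothetical helper, not in the original code; it assumes each operand really is a contiguous bn*bn buffer, as the 512 offsets suggest): scan a whole block for NaNs before handing it to dgemm.

local c = terralib.includec("stdio.h")

-- Hypothetical helper: walk a contiguous bn*bn block and report NaNs.
-- NaN is the only value that compares not-equal to itself.
terra check_block(name : rawstring, ptr : &double, bn : int)
  var hits = 0
  for i = 0, bn * bn do
    var v = ptr[i]
    if v ~= v then
      if hits < 4 then c.printf("%s: NaN at index %d\n", name, i) end
      hits = hits + 1
    end
  end
  if hits > 0 then c.printf("%s: %d NaN entries total\n", name, hits) end
end

Calling it on rawA.ptr, rawB.ptr and rawC.ptr just before the dgemm would show whether whole blocks arrive uninitialized or only stray entries do.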

$ GASNET_BACKTRACE=1 LAUNCHER="mpirun -l -n 8" ../legion/language/regent.py ./my_dgemm.rg -fflow 0 -level 5 -n 8192 -p 16 -ll:csize 16384 -foverride-demand-index-launch 1 -ll:cpu 1 -ll:util 1 -verify
[0] dgemm (0, 0, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 0, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 0, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 0, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 1, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 1, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 1, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 1, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 2, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 2, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 2, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 2, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 3, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 3, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 3, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 3, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 4, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 4, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 4, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 4, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 5, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 5, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 5, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 5, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 6, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 6, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 6, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 6, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 7, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 7, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 7, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 7, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 0, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 0, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 0, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 0, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 1, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 1, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 1, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 1, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 2, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 2, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 2, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[3] dgemm (12, 5, 0, 8192, 512, 0.000000e+00, 2.121996e-314, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (10, 10, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 2, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (8, 10, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[3] dgemm (14, 4, 0, 8192, 512, 0.000000e+00, 5.434722e-323, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 3, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (9, 10, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[3] dgemm (12, 4, 0, 8192, 512, 0.000000e+00, 2.121996e-314, 0.000000e+00, -nan, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 3, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (11, 10, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 3, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[3] dgemm (15, 5, 0, 8192, 512, -nan, 6.250440e-316, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (8, 11, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 3, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[3] dgemm (13, 5, 0, 8192, 512, -nan, 2.014297e-307, 0.000000e+00, -nan, 0.000000e+00, 0.000000e+00)
[6] dgemm (9, 11, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 4, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (11, 11, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[3] *** Caught a fatal signal (proc 3): SIGSEGV(11)
[3] NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
[3] [3] Invoking GDB for backtrace...
[0] dgemm (1, 4, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (10, 11, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 4, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 4, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 5, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 5, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 5, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 5, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (11, 15, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (2, 12, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[7] dgemm (14, 8, 0, 8192, 512, 6.031070e-316, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 6, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (10, 15, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (1, 12, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[7] *** Caught a fatal signal (proc 7): SIGSEGV(11)
[7] NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
[7] [7] Invoking GDB for backtrace...
[0] dgemm (1, 6, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (9, 15, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (0, 12, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 6, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (8, 15, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (1, 9, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 6, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (11, 14, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (2, 9, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (0, 7, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (9, 9, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (3, 9, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (1, 7, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (10, 9, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (0, 9, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (2, 7, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (11, 9, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (2, 14, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[0] dgemm (3, 7, 1, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (8, 9, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (1, 14, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (9, 8, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (3, 14, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (10, 8, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (1, 8, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (11, 8, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (2, 8, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (8, 8, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (3, 8, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (8, 13, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (0, 8, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (10, 13, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (0, 14, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (9, 13, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (3, 13, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (8, 12, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (2, 13, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (10, 14, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (1, 13, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (9, 14, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (1, 11, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (8, 14, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (2, 11, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (11, 13, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (3, 11, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (11, 12, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (0, 11, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (10, 12, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (1, 10, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[6] dgemm (9, 12, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (2, 10, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (3, 10, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (0, 10, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (1, 15, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[4] dgemm (0, 15, 0, 8192, 512, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00)
[7] [7] /usr/bin/gdb -nx -batch -x /tmp/gasnet_ShJOOd '/home/darve/lcambier/yizhou_legion/./my_dgemm.rg' 14853
[7] [7] [New LWP 15103]
[7] [7] [New LWP 15102]
[7] [7] [New LWP 15077]
[7] [7] [New LWP 15072]
[7] [7] [New LWP 15067]
[7] [7] [New LWP 15061]
[7] [7] [New LWP 15050]
[7] [7] [Thread debugging using libthread_db enabled]
[7] [7] Using host libthread_db library "/lib64/libthread_db.so.1".
[7] [7] 0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7] To enable execution of this file add
[7] [7] 	add-auto-load-safe-path /opt/ohpc/pub/compiler/gcc/8.3.0/lib64/libstdc++.so.6.0.25-gdb.py
[7] [7] line to your configuration file "/home/darve/lcambier/.gdbinit".
[7] [7] To completely disable this security protection add
[7] [7] 	set auto-load safe-path /
[7] [7] line to your configuration file "/home/darve/lcambier/.gdbinit".
[7] [7] For more information about this security protection see the
[7] [7] "Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
[7] [7] 	info "(gdb)Auto-loading safe path"
[7] [7]   Id   Target Id         Frame
[7] [7]   8    Thread 0x7fac05720700 (LWP 15050) "terra" 0x00007fac196c8427 in gasnetc_poll_rcv_hca () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7]   7    Thread 0x7fabfffff700 (LWP 15061) "terra" 0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7]   6    Thread 0x7fac04f1e780 (LWP 15067) "terra" 0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7]   5    Thread 0x7fac04d1b700 (LWP 15072) "terra" 0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7]   4    Thread 0x7fabff7fe700 (LWP 15077) "terra" 0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7]   3    Thread 0x7fac04519780 (LWP 15102) "terra" 0x00007fac1aba4da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7]   2    Thread 0x7fac0450d780 (LWP 15103) "terra" 0x00007fac19ff3469 in waitpid () from /lib64/libc.so.6
[7] [7] * 1    Thread 0x7fac1b3ce780 (LWP 14853) "terra" 0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7]
[7] [7] Thread 8 (Thread 0x7fac05720700 (LWP 15050)):
[7] [7] #0  0x00007fac196c8427 in gasnetc_poll_rcv_hca () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #1  0x00007fac196c8b84 in gasnetc_do_poll () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #2  0x00007fac196b2273 in gasnetc_AMPoll () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #3  0x00007fac1929abb5 in EndpointManager::polling_worker_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #4  0x00007fac19111dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #5  0x00007fac1aba0e65 in start_thread () from /lib64/libpthread.so.0
[7] [7] #6  0x00007fac1a02c88d in clone () from /lib64/libc.so.6
[7] [7]
[7] [7] Thread 7 (Thread 0x7fabfffff700 (LWP 15061)):
[7] [7] #0  0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7] #1  0x00007fac19136bdb in Realm::PartitioningOpQueue::worker_thread_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #2  0x00007fac19111dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #3  0x00007fac1aba0e65 in start_thread () from /lib64/libpthread.so.0
[7] [7] #4  0x00007fac1a02c88d in clone () from /lib64/libc.so.6
[7] [7]
[7] [7] Thread 6 (Thread 0x7fac04f1e780 (LWP 15067)):
[7] [7] #0  0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7] #1  0x00007fac19299928 in IncomingMessageManager::get_messages(int&, bool) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #2  0x00007fac19299c78 in IncomingMessageManager::handler_thread_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #3  0x00007fac19111dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #4  0x00007fac1aba0e65 in start_thread () from /lib64/libpthread.so.0
[7] [7] #5  0x00007fac1a02c88d in clone () from /lib64/libc.so.6
[7] [7]
[7] [7] Thread 5 (Thread 0x7fac04d1b700 (LWP 15072)):
[7] [7] #0  0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7] #1  0x00007fac190f8c58 in Realm::DmaRequestQueue::dequeue_request(bool) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #2  0x00007fac1910063b in Realm::DmaRequestQueue::worker_thread_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #3  0x00007fac19111dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #4  0x00007fac1aba0e65 in start_thread () from /lib64/libpthread.so.0
[7] [7] #5  0x00007fac1a02c88d in clone () from /lib64/libc.so.6
[7] [7]
[7] [7] Thread 4 (Thread 0x7fabff7fe700 (LWP 15077)):
[7] [7] #0  0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7] #1  0x00007fac190eecce in Realm::XferDesQueue::dequeue_xferDes(Realm::DMAThread*, bool) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #2  0x00007fac190eedaf in Realm::DMAThread::dma_thread_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #3  0x00007fac19111dfa in Realm::KernelThread::pthread_entry(void*) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #4  0x00007fac1aba0e65 in start_thread () from /lib64/libpthread.so.0
[7] [7] #5  0x00007fac1a02c88d in clone () from /lib64/libc.so.6
[7] [7]
[7] [7] Thread 3 (Thread 0x7fac04519780 (LWP 15102)):
[7] [7] #0  0x00007fac1aba4da2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7] #1  0x00007fac1910f3a4 in Realm::CondVar::timedwait(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #2  0x00007fac1912a38d in Realm::ThreadedTaskScheduler::WorkCounter::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #3  0x00007fac1912a745 in Realm::ThreadedTaskScheduler::wait_for_work(long long) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #4  0x00007fac1912fdf8 in Realm::ThreadedTaskScheduler::scheduler_loop() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #5  0x00007fac19117264 in Realm::UserThread::uthread_entry() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #6  0x00007fac19f76140 in ?? () from /lib64/libc.so.6
[7] [7] #7  0x0000000000000000 in ?? ()
[7] [7]
[7] [7] Thread 2 (Thread 0x7fac0450d780 (LWP 15103)):
[7] [7] #0  0x00007fac19ff3469 in waitpid () from /lib64/libc.so.6
[7] [7] #1  0x00007fac19f70f12 in do_system () from /lib64/libc.so.6
[7] [7] #2  0x00007fac19f712c1 in system () from /lib64/libc.so.6
[7] [7] #3  0x00007fac196d0ce4 in gasneti_system_redirected () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #4  0x00007fac196d1418 in gasneti_bt_gdb () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #5  0x00007fac196d502f in gasneti_print_backtrace () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #6  0x00007fac18a4d185 in gasneti_defaultSignalHandler () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #7  <signal handler called>
[7] [7] #8  0x00007fac1b1ca71b in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
[7] [7] #9  0x00007fac1b1cb04f in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
[7] [7] #10 0x00007fac1b1cfd9e in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
[7] [7] #11 0x00007fac1b1d799a in _dl_runtime_resolve_xsave () from /lib64/ld-linux-x86-64.so.2
[7] [7] #12 0x00007fabc5560f0b in ?? ()
[7] [7] #13 0x0000000000000100 in ?? ()
[7] [7] #14 0x0000000000000000 in ?? ()
[7] [7]
[7] [7] Thread 1 (Thread 0x7fac1b3ce780 (LWP 14853)):
[7] [7] #0  0x00007fac1aba49f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
[7] [7] #1  0x00007fac18f66820 in Realm::RuntimeImpl::wait_for_shutdown() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #2  0x00007fac18f6bd01 in Realm::Runtime::wait_for_shutdown() () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #3  0x00007fac18d1a448 in Legion::Internal::Runtime::start(int, char**, bool) () from /home/darve/lcambier/legion/bindings/regent/libregent.so
[7] [7] #4  0x00007fac1660b496 in ?? ()
[7] [7] #5  0x00007fff00000100 in ?? ()
[7] [7] #6  0x00007fac16613f70 in ?? ()
[7] [7] #7  0x0000000000000000 in ?? ()
[4] *** Caught a signal (proc 4): SIGINT(2)
[5] *** Caught a signal (proc 5): SIGINT(2)
[6] *** Caught a signal (proc 6): SIGINT(2)

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 14853 RUNNING AT armstrong-login-0-1
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
   Intel(R) MPI Library troubleshooting guide:
      https://software.intel.com/node/561764
===================================================================================
