-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Variance adjusted UniFrac #20
Conversation
Once more, could you update this branch? |
@ElDeveloper all green |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I have not tested this locally as I am not familiar at all with variance adjusted UniFrac, so I don't really know what to expect.
@@ -743,6 +743,48 @@ void test_generalized_unifrac() { | |||
SUITE_END(); | |||
} | |||
|
|||
void test_vaw_unifrac_weighted_normalized() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor stylistic note, I checked in vim and this test case has a mix of tabs and spaces. Up to you if you want to fix. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
sucpp/unifrac.cpp
Outdated
@@ -322,6 +321,56 @@ void progressbar(float progress) { | |||
std::cout.flush(); | |||
} | |||
|
|||
void initialize_embedded(double*& prop, const su::task_parameters* task_p) { | |||
posix_memalign((void **)&prop, 32, sizeof(double) * task_p->n_samples * 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line also has a tab instead of spaces ¯\_(ツ)_/¯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
sucpp/unifrac.cpp
Outdated
Method unifrac_method, | ||
const su::task_parameters* task_p) { | ||
for(unsigned int i = task_p->start; i < task_p->stop; i++){ | ||
posix_memalign((void **)&dm_stripes[i], 32, sizeof(double) * task_p->n_samples); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case, the interwebs seem to indicate that posix_memalign has a return value specific for out of memory, ENOMEM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating all instances
break; | ||
case generalized: | ||
func = &su::_vaw_generalized_unifrac_task; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a default case where this returns an error and the program terminates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added. I'm not too concerned as the switch
is on an enum
so this can only arise if the switch receives an enum
value that is not described from su::Method
which I believe is a compile-time constraint. But it is defensive and could also arise from programmer error.
* at the moment. basically, we can't assume the presence of avx2. | ||
*/ | ||
for(unsigned int j = 0; j < task_p->n_samples / 4; j++) { | ||
int k = j * 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k is used here and in a loop below, perhaps worth a different letter ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow? k
is instantiated here as it is the stride offset. Or is it because k
is used in the in the cleanup loop as well? I also think it is appropriate to stick with k
there as the variable should not leak scope, and the use of the variable are consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, just thought it was confusing, I wasn't concerned with the leakage.
dm_stripe_total = dm_stripes_total[stripe]; | ||
|
||
for(unsigned int j = 0; j < task_p->n_samples / 4; j++) { | ||
int k = j * 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above regarding k
double sum1 = (u1 + v1) / vaw; | ||
double sub1 = fabs(u1 - v1) / vaw; | ||
double sum_pow1 = pow(sum1, task_p->g_unifrac_alpha) * length; | ||
dm_stripe[j] += sum_pow1 * (sub1 / sum1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could sum1 ever be zero i.e. u1 and v1 being zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's implicit and originally the zero check was done on sum1
. The reason this should never happen is that, if sum1
is zero then vaw
must be zero as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweet!
sucpp/unifrac_task.hpp
Outdated
unsigned int tid; // thread ID | ||
double g_unifrac_alpha; // generalized unifrac alpha | ||
}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to need some documentation at some point ... 😟
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bam bam
double sum1 = (u1 + v1) / vaw; | ||
double sub1 = fabs(u1 - v1) / vaw; | ||
double sum_pow1 = pow(sum1, task_p->g_unifrac_alpha) * length; | ||
dm_stripe[j] += sum_pow1 * (sub1 / sum1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweet!
Dependent on #18 and #19.
Tests are not written for VA unweighted, weighted unnormalized and generalized unifrac. These need to be done by hand as there isn't a reference implementation.