Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variance adjusted UniFrac #20

Merged
merged 28 commits into from
Aug 21, 2017
Merged

Variance adjusted UniFrac #20

merged 28 commits into from
Aug 21, 2017

Conversation

wasade
Copy link
Member

@wasade wasade commented Aug 17, 2017

Dependent on #18 and #19.

Tests are not written for VA unweighted, weighted unnormalized and generalized unifrac. These need to be done by hand as there isn't a reference implementation.

@wasade wasade mentioned this pull request Aug 17, 2017
@ElDeveloper
Copy link
Member

Once more, could you update this branch?

@wasade
Copy link
Member Author

wasade commented Aug 19, 2017

@ElDeveloper all green

Copy link
Member

@ElDeveloper ElDeveloper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I have not tested this locally as I am not familiar at all with variance adjusted UniFrac, so I don't really know what to expect.

@@ -743,6 +743,48 @@ void test_generalized_unifrac() {
SUITE_END();
}

void test_vaw_unifrac_weighted_normalized() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor stylistic note, I checked in vim and this test case has a mix of tabs and spaces. Up to you if you want to fix. 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

@@ -322,6 +321,56 @@ void progressbar(float progress) {
std::cout.flush();
}

void initialize_embedded(double*& prop, const su::task_parameters* task_p) {
posix_memalign((void **)&prop, 32, sizeof(double) * task_p->n_samples * 2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line also has a tab instead of spaces ¯\_(ツ)_/¯

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

Method unifrac_method,
const su::task_parameters* task_p) {
for(unsigned int i = task_p->start; i < task_p->stop; i++){
posix_memalign((void **)&dm_stripes[i], 32, sizeof(double) * task_p->n_samples);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case, the interwebs seem to indicate that posix_memalign has a return value specific for out of memory, ENOMEM.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating all instances

break;
case generalized:
func = &su::_vaw_generalized_unifrac_task;
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth adding a default case where this returns an error and the program terminates?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added. I'm not too concerned as the switch is on an enum so this can only arise if the switch receives an enum value that is not described from su::Method which I believe is a compile-time constraint. But it is defensive and could also arise from programmer error.

* at the moment. basically, we can't assume the presence of avx2.
*/
for(unsigned int j = 0; j < task_p->n_samples / 4; j++) {
int k = j * 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k is used here and in a loop below, perhaps worth a different letter ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow? k is instantiated here as it is the stride offset. Or is it because k is used in the in the cleanup loop as well? I also think it is appropriate to stick with k there as the variable should not leak scope, and the use of the variable are consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, just thought it was confusing, I wasn't concerned with the leakage.

dm_stripe_total = dm_stripes_total[stripe];

for(unsigned int j = 0; j < task_p->n_samples / 4; j++) {
int k = j * 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above regarding k

double sum1 = (u1 + v1) / vaw;
double sub1 = fabs(u1 - v1) / vaw;
double sum_pow1 = pow(sum1, task_p->g_unifrac_alpha) * length;
dm_stripe[j] += sum_pow1 * (sub1 / sum1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could sum1 ever be zero i.e. u1 and v1 being zero?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's implicit and originally the zero check was done on sum1. The reason this should never happen is that, if sum1 is zero then vaw must be zero as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet!

unsigned int tid; // thread ID
double g_unifrac_alpha; // generalized unifrac alpha
};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to need some documentation at some point ... 😟

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bam bam

double sum1 = (u1 + v1) / vaw;
double sub1 = fabs(u1 - v1) / vaw;
double sum_pow1 = pow(sum1, task_p->g_unifrac_alpha) * length;
dm_stripe[j] += sum_pow1 * (sub1 / sum1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet!

@ElDeveloper ElDeveloper merged commit 0b4659d into master Aug 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants