Skip to content

Write data #12

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: master
Choose a base branch
from
Open

Write data #12

wants to merge 16 commits into from

Conversation

zkamvar
Copy link
Contributor

@zkamvar zkamvar commented Aug 4, 2016

This partially addresses #9 by adding a function that will create a table or list of tables as output. @daattali, @kplaney, @thibautjombart, what do you think?

NEW FUNCTIONS

  • tablesFromMatch() generates a tidy table or a list of tables containing the matches
  • getIndexList() switches the weight output of matchEpiData() to index output

The function to generate the table(s) will only output the information used for matching. This way, the user can use it as a reference to filter their own data. When the output is a single tidy table, the information of the group, data set, index, and weight/score are included.

SHINY

I've modified server.R slightly to use the giveWeights = TRUE argument for matchEpiData(). This will allow us to write something to display the weights/scores in the future.

zkamvar added 11 commits July 16, 2016 16:38
I've added a function to return tables from the
match query. It can work in addition or in lieu
of `matchEpiData()`. I have added documentation
for this and also updated the documentation of
`matchEpiData()`.
This will allow future development so that we can
give the user the scores/weights in the table that
is output.
@daattali
Copy link
Collaborator

daattali commented Aug 5, 2016

Nice work! I don't think you need to add all the library(epimatch) to the examples sections, I don't commonly see that in Hadley et al's packages.

Regarding tablesFromMatch: this is excellent. I haven't looked into it for more than a minute, but does it differentiate between which rows are from which dataset? In the shiny app, when there's a table showing records from two different linelists, it says which records are from which dataset

@zkamvar
Copy link
Contributor Author

zkamvar commented Aug 6, 2016

I think I put the library(epimatch) calls in there during a layover because I was being silly. I can take them out if you like.

tablesFromMatch() includes a column the specifies data set as well as their respective indices.

@daattali
Copy link
Collaborator

daattali commented Aug 6, 2016

Oh I see, really nice function. To be nitpicky:

  • are the row names in the output needed? If they are needed, would it not make more sense to have them as a proper variable rather than rownames?
  • I do see now that using collapse=T prints out where each record came from. But when collapse=F, I don't think there's any identifying information?

Other than that , this is awesome

@zkamvar
Copy link
Contributor Author

zkamvar commented Aug 6, 2016

Oh I see, really nice function. To be nitpicky:

Nits are meant to be picked!

are the row names in the output needed? If they are needed, would it not make more sense to have them as a proper variable rather than rownames?

They are not needed. I can easily get rid of them.

I do see now that using collapse=T prints out where each record came from. But when collapse=F, I don't think there's any identifying information?

collapse = FALSE does not identify the origin (except by rownames, which is a no-no). Should I add that as an extra column?

@daattali
Copy link
Collaborator

daattali commented Aug 6, 2016

It seems to me that it would be more consistent behaviour and make sense to
the end user if the dataset of origin on each record was mentioned. Whether
the result is multiple tables separately or collapsed should probably not
change the amount of output information. If that makes sense to you? Just
my initial thought, but obviously you spent a lot more time with these
functions and data so it's your call! But regarding removing the rownames,
that I do feel strongly about if they're not meaningful :)


http://deanattali.com

On 5 August 2016 at 19:01, Zhian N. Kamvar [email protected] wrote:

Oh I see, really nice function. To be nitpicky:

Nits are meant to be picked!

are the row names in the output needed? If they are needed, would it not
make more sense to have them as a proper variable rather than rownames?

They are not needed. I can easily get rid of them.

I do see now that using collapse=T prints out where each record came from.
But when collapse=F, I don't think there's any identifying information?

collapse = FALSE does not identify the origin (except by rownames, which
is a no-no). Should I add that as an extra column?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#12 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AA6IFHP5dpBnilaCrgukZwAbnYuK9n5sks5qc-sWgaJpZM4JdJBM
.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants