Combinatorics navigation #7

Open
wants to merge 25 commits into
base: master

hyunjimoon
Collaborator

No description provided.

@Dashadower
Collaborator

By 97chadol

  1. Hierarchy is a list.
  2. Each element of the Hierarchy list is a dictionary.
  3. The i-th element of the Hierarchy list describes the hierarchy among the instances (terminal nodes) of the i-th module.
  4. Each key is an instance (terminal node) of a module, and its value is the list of instances (terminal nodes) of the same module that are inferior to the key.

The following is an example of the Hierarchy list for the simple example.

Simple example

Hierarchy = [{},
{'lognormal,StddevInformative:yes':['standard'],'lognormal,StddevInformative:no':['standard']}
]

The following is an example of the Hierarchy list for the Birthday case study.

Birthday Case study

Hierarchy = [{'yes,DayOfWeekWeights:weighted':['no'],'yes,DayOfWeekWeights:uniform':['no']},
{'yes,DayOfYearHierarchicalVariance:yes,DayOfYearNormalVariance:yes':['no'],
'yes,DayOfYearHierarchicalVariance:yes,DayOfYearNormalVariance:no':['no'],
'yes,DayOfYearHierarchicalVariance:no,DayOfYearNormalVariance:yes':['no'],
'yes,DayOfYearHierarchicalVariance:no,DayOfYearNormalVariance:no':['no']},
{'yes':['no']},
{'yes': ['no']},
{'yes': ['no']}
]
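As a quick check of the format described above, here is a minimal Python sketch that walks a Hierarchy list and prints every "superior > inferior" pair it encodes. The function name `hierarchy_edges` is hypothetical, and module indices here are 0-based, which is an assumption of this sketch rather than something the thread fixes.

```python
# Hierarchy for the simple example, copied from the comment above.
hierarchy = [
    {},
    {
        "lognormal,StddevInformative:yes": ["standard"],
        "lognormal,StddevInformative:no": ["standard"],
    },
]


def hierarchy_edges(hierarchy):
    """Yield (module_index, superior, inferior) for every ordering in the list.

    hierarchy_edges is a hypothetical helper name; module_index is 0-based.
    """
    for module_index, module_dict in enumerate(hierarchy):
        for superior, inferiors in module_dict.items():
            for inferior in inferiors:
                yield module_index, superior, inferior


edges = list(hierarchy_edges(hierarchy))
for edge in edges:
    print(edge)
```

The first module contributes no pairs because its dictionary is empty, i.e. no ordering is declared between its implementations.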

@hyunjimoon
Collaborator Author

hyunjimoon commented Jan 7, 2022

Assumption:

  1. The longest chain starts from the most complex model.
  2. Greedy selection among the most complex models is allowed.

A possible hierarchy format might be (for Birthday)
[{{{w:n, u:n}},{{},{}},{},{},{}}]

Step 1: @Dashadower. Steps 2-4: @97chadol.

1. Form hierarchy

len(H[1]) = 5
len(H[1][1]) = 1
len(H[1][2]) = 2
len(H[1][3]) = 0
len(H[1][4]) = 0
len(H[1][5]) = 0

Hier_dict[k, v] : k > v (implement)

2. maximal_i (terminal implementation)

terminal_imp := Union(keys of Hier_dict) ∪ Union(values of Hier_dict)
maximal_i := {i in terminal_imp : i not in Union(values of Hier_dict)}
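A minimal Python sketch of step 2 for a single module's dictionary, assuming the Hierarchy format described earlier in this thread. The function name `maximal_implementations` is hypothetical.

```python
def maximal_implementations(module_dict):
    """Return the terminal implementations that are not inferior to any other.

    module_dict maps an implementation to the list of implementations
    inferior to it (one element of the Hierarchy list).
    """
    keys = set(module_dict)
    values = {v for inferiors in module_dict.values() for v in inferiors}
    terminal_imp = keys | values   # every implementation mentioned anywhere
    return terminal_imp - values   # drop anything that appears as an inferior


# First module of the Birthday Hierarchy, copied from the thread.
day_of_week = {
    "yes,DayOfWeekWeights:weighted": ["no"],
    "yes,DayOfWeekWeights:uniform": ["no"],
}
maximal_i = maximal_implementations(day_of_week)
print(sorted(maximal_i))
```

Note that an empty dictionary (a module with no declared ordering) yields an empty set under this definition, so such modules would presumably need separate handling, for example by treating all of their implementations as maximal.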

3. maximal_m (model)

The most complex models are generated by a Cartesian product: maximal_m := product of maximal_i over the modules, obtained by inspecting the values of the dictionary list.
To avoid generating many overlapping models from nested for loops, which produce a repeated structure like the following, we change from Bellman-Ford to Dijkstra.
3

  • 2
    --1
    --1
  • 2
    --1
    --1
  • 2
    --1
    --1
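The Cartesian product in step 3 can be sketched with `itertools.product`. This is only an illustration: the per-module maximal sets below are written out by hand from the Birthday Hierarchy posted earlier, and `maximal_models` is a hypothetical name.

```python
from itertools import product


def maximal_models(maximal_per_module):
    """Return every model that picks one maximal implementation per module."""
    return [tuple(choice) for choice in product(*maximal_per_module)]


# Maximal implementations per module, read off the Birthday Hierarchy by hand.
maximal_per_module = [
    ["yes,DayOfWeekWeights:weighted", "yes,DayOfWeekWeights:uniform"],
    [
        "yes,DayOfYearHierarchicalVariance:yes,DayOfYearNormalVariance:yes",
        "yes,DayOfYearHierarchicalVariance:yes,DayOfYearNormalVariance:no",
        "yes,DayOfYearHierarchicalVariance:no,DayOfYearNormalVariance:yes",
        "yes,DayOfYearHierarchicalVariance:no,DayOfYearNormalVariance:no",
    ],
    ["yes"],
    ["yes"],
    ["yes"],
]

models = maximal_models(maximal_per_module)
print(len(models))  # 2 * 4 * 1 * 1 * 1 combinations
```

Each model is a tuple with one entry per module, in the same order as the Hierarchy list.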
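def model_out_in(model, i, in_i):
    """Return a copy of model with the i-th module's implementation replaced by in_i."""
    return model[:i] + (in_i,) + model[i + 1:]

def gen_chain(model, hierarchy):
    """Return the longest chain of successively simpler models starting at model."""
    longest_tail = []
    for i, mi in enumerate(model):            # implementation chosen for module i
        for j in hierarchy[i].get(mi, []):    # implementations inferior to mi
            tail = gen_chain(model_out_in(model, i, j), hierarchy)
            if len(tail) > len(longest_tail):
                longest_tail = tail
    return [model] + longest_tail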

4. Decompose Network to chain

Repeat until Network is empty:

  1. Pick the most complex model in Network and run gen_chain from it.
  2. Network <- Network \ gen_chain (remove the models on that chain from Network).
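Putting steps 1-4 together, here is a self-contained Python sketch of the decomposition loop. It assumes a model is a tuple with one implementation per module, and it uses a made-up `complexity` score (the number of modules whose chosen implementation has something inferior to it) to pick "the most complex model". All names here are hypothetical, and the recursion is not memoized, so it is only meant for small networks.

```python
from itertools import product


def swap(model, i, impl):
    """Copy of model with module i's implementation replaced by impl."""
    return model[:i] + (impl,) + model[i + 1:]


def gen_chain(model, hierarchy, available):
    """Longest chain of successively simpler models, restricted to `available`."""
    longest_tail = []
    for i, current in enumerate(model):
        for inferior in hierarchy[i].get(current, []):
            candidate = swap(model, i, inferior)
            if candidate in available:
                tail = gen_chain(candidate, hierarchy, available)
                if len(tail) > len(longest_tail):
                    longest_tail = tail
    return [model] + longest_tail


def complexity(model, hierarchy):
    """Assumed score: modules whose choice is superior to something."""
    return sum(1 for i, impl in enumerate(model) if hierarchy[i].get(impl))


def decompose(network, hierarchy):
    """Repeatedly peel the longest chain off the most complex remaining model."""
    remaining = set(network)
    chains = []
    while remaining:
        # sorted() makes tie-breaking deterministic.
        start = max(sorted(remaining), key=lambda m: complexity(m, hierarchy))
        chain = gen_chain(start, hierarchy, remaining)
        chains.append(chain)
        remaining -= set(chain)
    return chains


# Toy network: two modules, each with 'yes' superior to 'no'.
hierarchy = [{"yes": ["no"]}, {"yes": ["no"]}]
network = list(product(["yes", "no"], repeat=2))
chains = decompose(network, hierarchy)
for chain in chains:
    print(chain)
```

On this toy network the first chain runs from ('yes', 'yes') down to ('no', 'no'), and the one model left over forms a chain of its own.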

@hyunjimoon changed the title from "commit for pr" to "Combinatorics navigation" on Jan 8, 2022
@hyunjimoon
Collaborator Author

FYI @97chadol: 8.8 (flow decomposition) in Korte and Vygen, Combinatorial Optimization (5th ed.), and its proof (induction, separating out a maximal element at each step) show sequential decomposition via maximal elements.

@Dashadower
Collaborator

Added a new feature to find the models of the highest hierarchy in each subtree. It uses a very simple rule that omits implementations named "no" and includes all others. It assumes a maximum signature count of 2 per branch for branches below the top-level branch.

Run with:

mstan -f simple_example.m.stan get-highest-models

simple example output:

Mean:standard
Mean:normal
Stddev:standard
Stddev:lognormal,StddevInformative:yes

birthday model output:

LongTermTrend:yes
SeasonalTrend:yes
DayOfWeekTrend:yes,DayOfWeekWeights:weighted
DayOfWeekTrend:yes,DayOfWeekWeights:uniform
DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes
HolidayTrend:yes

@Dashadower
Collaborator

@rybern I've tried to mess around with the Haskell code a bit and added the feature that we mentioned in the email about finding the max complexity model. It's probably gonna break with the blow of the wind and I'm super duper new with Haskell so it's gonna be eye-gouging to read but if you have time, can you do a quick review?

@Dashadower
Collaborator

I completely forgot about the golf model and to my surprise when I ran it, looks like it works!!

NSuccesses:binomial
NSuccesses:proportional
PSuccess:logistic
PSuccess:angle_only,PAngleSuccess:angle_success
PSuccess:angle_and_distance,PAngleSuccess:angle_success,PDistanceSuccess:distance_success,OvershootModel:fixed
PSuccess:angle_and_distance,PAngleSuccess:angle_success,PDistanceSuccess:distance_success,OvershootModel:parametric

@rybern
Owner

rybern commented Jan 13, 2022

@Dashadower Hi Shin - I'm impressed you were able to dive into the Haskell! Haskell is pretty unique, I hope you're enjoying it.

I have some initial questions:

  • What is the desired output of each command? Most of the other commands output complete model IDs (complete sets of Signature:Implementation pairs), but these seem to be outputting partial sets
  • It looks like you're effectively removing implementations called "no". Is that right? If so, there are some alternative implementations:
    • Start by filtering out "no" modules, then enumerate models in the usual way
    • Start by enumerating models in the usual way, then filter out models that use "no" implementations

I'm not sure that naming models "no" is a good way for users to annotate nested modules. For example, when there are more than two possible implementations, what do you call the second nested module?

@Dashadower
Collaborator

Dashadower commented Jan 13, 2022

@rybern Thanks! So far I'd say the Haskell experience was neither great nor terrible.

The goal was to identify the "most complicated" models for each signature subtree. Right now it just prints the portions separately, but it's not hard to compose them into sound models (just another combination of nodes).

I think the way you mentioned would be way more effective: removing "no" implementations when building sigImpls.

Right now, there honestly is no way for us to determine "which model is more complex" other than whether module A is included or not, at least from the current tree point of view. That's why it's very restrictive, and another reason I'm paying very close attention to Yuling's original email to figure out other approaches that might not be too hard to add.

@rybern
Owner

rybern commented Jan 13, 2022

When building sigImpls would work, but we could also just filter them out of the program, like:

execCommand prog GetHighestModels = do
  let filteredProg = prog { implementations = filter ((/= ImplName "no") . implName) (implementations prog) }
  let sels = allSelections filteredProg
  return . map showSelection . Set.toList $ sels

> Right now, there honestly is no way for us to determine "which model is more complex" other than whether module A is included or not, at least from the current tree point of view.

Right, the goal as I understand it would be to let users annotate which implementations are superseded by others; I just think that using the name "no" as the annotation won't generalize very well. But, I don't want to get in the way of your experiments, so maybe "no" is a good place to start. Maybe filtering out model IDs with ":no" in them from the outside program (like graph_search.py) would also make more sense for experimenting, so that different experiments can have different conventions for specifying nested models.
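For the "filter from the outside program" option, a rough Python sketch could look like the following. It assumes model IDs are comma-separated Signature:Implementation pairs, as in the outputs posted in this thread. The function names are hypothetical and this is not code from graph_search.py.

```python
def uses_marker_implementation(model_id, marker="no"):
    """True if any Signature:Implementation pair uses the marker implementation."""
    for pair in model_id.split(","):
        _signature, _sep, implementation = pair.partition(":")
        if implementation == marker:
            return True
    return False


def drop_marked_models(model_ids, marker="no"):
    """Keep only the model IDs that never select the marker implementation."""
    return [m for m in model_ids if not uses_marker_implementation(m, marker)]


model_ids = [
    "LongTermTrend:yes,SeasonalTrend:yes",
    "LongTermTrend:no,SeasonalTrend:yes",
    "Mean:normal,Stddev:standard",
]
kept = drop_marked_models(model_ids)
print(kept)
```

Comparing the implementation name exactly, rather than searching for the substring ":no", avoids dropping IDs such as "Mean:normal" by accident. Passing a different `marker` is one way each experiment could use its own naming convention.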

@Dashadower
Collaborator

> When building sigImpls would work, but we could also just filter them out of the program, like:

Wow that's way better! I'll definitely change to your code. Thanks for the suggestion.

> Right, the goal as I understand it would be to let users annotate which implementations are superseded by others; I just think that using the name "no" as the annotation won't generalize very well. But, I don't want to get in the way of your experiments, so maybe "no" is a good place to start. Maybe filtering out model IDs with ":no" in them from the outside program (like graph_search.py) would also make more sense for experimenting, so that different experiments can have different conventions for specifying nested models.

Yeah I agree with your points. And changing model representation conventions is something I haven't thought about, but it seems like there's opportunity for exploration here. Again, thanks a lot for the feedback, this is all valuable food for thought for me.

@Dashadower
Collaborator

apex predator search result:

{'DayOfWeekTrend:yes,DayOfWeekWeights:uniform,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,DayOfYearTrend:yes,HolidayTrend:yes,LongTermTrend:yes,Regression:glm,SeasonalTrend:yes': 14624.1187889647, 
'DayOfWeekTrend:yes,DayOfWeekWeights:weighted,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,DayOfYearTrend:yes,HolidayTrend:yes,LongTermTrend:yes,Regression:glm,SeasonalTrend:yes': 13173.9060251408}

@Dashadower
Collaborator

Starting chain:

['DayOfWeekTrend:yes,DayOfWeekWeights:weighted', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes', 'HolidayTrend:yes', 'LongTermTrend:yes', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:weighted', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:weighted', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:weighted', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:uniform', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:uniform', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:no']
['DayOfWeekTrend:yes,DayOfWeekWeights:uniform', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:no,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:no']
['DayOfWeekTrend:no', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:no,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:no']

K = 3 result:

model: DayOfWeekTrend:yes,DayOfWeekWeights:weighted,DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,HolidayTrend:yes,LongTermTrend:yes,SeasonalTrend:yes ELPD:13173.9060251408
model: DayOfWeekTrend:yes,DayOfWeekWeights:weighted,DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,HolidayTrend:yes,LongTermTrend:no,SeasonalTrend:yes ELPD:1746.41966617266
model: DayOfWeekTrend:yes,DayOfWeekWeights:uniform,DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,HolidayTrend:yes,LongTermTrend:no,SeasonalTrend:yes ELPD:7361.39343209761
Traceback (most recent call last):
  File "Chain_Generation_and_Search.py", line 149, in <module>
    best_model, best_elpd = Chain_Search(Chain=chain, K=K, data_file_dir=data_file_dir)
  File "Chain_Generation_and_Search.py", line 117, in Chain_Search
    step_size_Uniform = (n-1-cur_ind)/(K-num_ELPD_computed)
ZeroDivisionError: division by zero
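The traceback is consistent with the budget being used up: with K = 3, three ELPD values were printed before the failure, so K - num_ELPD_computed would be zero. One possible guard is sketched below. The function name and the choice to return None are assumptions, since the rest of Chain_Search is not shown here.

```python
def uniform_step_size(n, cur_ind, K, num_elpd_computed):
    """Spread the remaining chain positions over the remaining ELPD evaluations.

    Returns None when the budget K is already spent, instead of dividing by zero.
    """
    remaining_budget = K - num_elpd_computed
    if remaining_budget <= 0:
        return None  # caller should stop searching at this point
    return (n - 1 - cur_ind) / remaining_budget


print(uniform_step_size(n=8, cur_ind=1, K=3, num_elpd_computed=1))
print(uniform_step_size(n=8, cur_ind=2, K=3, num_elpd_computed=3))
```

The caller would then need to treat None as "stop and report the best model so far" rather than taking another step.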

@Dashadower
Collaborator

20 iterations of Bayesian probabilistic update.
The probability distribution over models at each iteration:
[Figures: model_pmf_1 through model_pmf_20, one per iteration]

@Dashadower
Collaborator

(Continued) The probability-ELPD scatterplots, again for each iteration:

[Figures: prob-epld_plot_1 through prob-epld_plot_20, one per iteration]

@Dashadower
Collaborator

(Continued)
And finally, the plots for each signature, after iteration 20 is completed:
[Figures: sigplot_DayOfWeekTrend, sigplot_DayOfWeekWeights, sigplot_DayOfYearHeirarchicalVariance, sigplot_DayOfYearNormalVariance, sigplot_DayOfYearTrend, sigplot_HolidayTrend, sigplot_LongTermTrend, sigplot_SeasonalTrend]

@rybern
Owner

rybern commented Feb 15, 2022

It looks like there's some exciting stuff happening in here!

Just a heads up, you might want to merge with the master branch. I've found and fixed some edge case bugs in the network-of-models code.

@Dashadower
Collaborator

> It looks like there's some exciting stuff happening in here!
>
> Just a heads up, you might want to merge with the master branch. I've found and fixed some edge case bugs in the network-of-models code.

Thanks for the heads up, will do.

I've been writing an example model here but am confused on when to use or omit return in module blocks. Can you give a light explanation on the syntax?

@rybern
Owner

rybern commented Feb 17, 2022

Yeah, the short answer is that you should use return when the module is going to be used as a value/expression, and not when it's going to be used as a statement. Looks like you've got it right in your example.

A note from your example: since you use Regression like y ~ Regression(), I suggest adding y as an argument like module "glm" Regression(y). Just like Stan functions, the left side of the ~ is passed into the module.
