Combinatorics navigation #7

hyunjimoon · 2022-01-07T01:22:09Z

No description provided.

Dashadower · 2022-01-07T01:22:57Z

By 97chadol

Hierarchy is a list.
Each element of the 'Hierarchy' list is a dictionary.
The ith element of the 'Hierarchy' list describes the hierarchy between the different instances (terminal nodes) of the ith module.
Each key is an instance (terminal node) of a module, and its value is the list of instances (terminal nodes) of the same module that are inferior to the key.

The following is an example of the Hierarchy list for the Simple example.

Simple example

Hierarchy = [{},
{'lognormal,StddevInformative:yes':['standard'],'lognormal,StddevInformative:no':['standard']}
]

The following is an example of the Hierarchy list for the Birthday case study.

Birthday Case study

Hierarchy = [{'yes,DayOfWeekWeights:weighted':['no'],'yes,DayOfWeekWeights:uniform':['no']},
{'yes,DayOfYearHierarchicalVariance:yes,DayOfYearNormalVariance:yes':['no'],
'yes,DayOfYearHierarchicalVariance:yes,DayOfYearNormalVariance:no':['no'],
'yes,DayOfYearHierarchicalVariance:no,DayOfYearNormalVariance:yes':['no'],
'yes,DayOfYearHierarchicalVariance:no,DayOfYearNormalVariance:no':['no']},
{'yes':['no']},
{'yes': ['no']},
{'yes': ['no']}
]
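A minimal sketch of how this structure might be queried, using the Simple example list above. The helper name `inferior_to` is hypothetical and not part of the PR; it assumes that an instance which never appears as a key has nothing below it.

```python
# Hierarchy list copied from the Simple example above.
Hierarchy = [
    {},
    {
        "lognormal,StddevInformative:yes": ["standard"],
        "lognormal,StddevInformative:no": ["standard"],
    },
]


def inferior_to(hierarchy, module_index, instance):
    """Return the instances of one module that are inferior to `instance`.

    Hypothetical helper: instances that never appear as a key are treated
    as having no inferior instances.
    """
    return hierarchy[module_index].get(instance, [])


print(inferior_to(Hierarchy, 1, "lognormal,StddevInformative:yes"))  # ['standard']
print(inferior_to(Hierarchy, 1, "standard"))  # []
```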

hyunjimoon · 2022-01-07T04:30:13Z

Assumptions:

- The longest chain starts from the most complex model.
- Greedy selection among the most complex models is allowed.

A possible hierarchy format might be (for Birthday)
[{{{w:n, u:n}},{{},{}},{},{},{}}]

Step 1: @Dashadower. Steps 2-4: @97chadol.

1. Form hierarchy

len(H[1]) = 5
len(H[1][1]) = 1
len(H[1][2]) = 2
len(H[1][3]) = 0
len(H[1][4]) = 0
len(H[1][5]) = 0

Hier_dict[k, v] : k > v (implement)

2. maximal_i (terminal implementation)

terminal_imp := Union(keys of Hier_dict, values of Hier_dict)
maximal_i := { i in terminal_imp : i not in Union(values of Hier_dict) }
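The definition of maximal_i can be sketched in Python as a set difference. This is only an illustration: the function name `maximal_implementations` is hypothetical, and the sample dictionary is the first Birthday entry from the Hierarchy list earlier in the thread.

```python
def maximal_implementations(hier_dict):
    """Return the implementations that are not inferior to any other one."""
    keys = set(hier_dict)
    values = {v for inferiors in hier_dict.values() for v in inferiors}
    terminal_imp = keys | values  # every implementation mentioned anywhere
    return sorted(terminal_imp - values)  # never listed as someone's inferior


hier_dict = {
    "yes,DayOfWeekWeights:weighted": ["no"],
    "yes,DayOfWeekWeights:uniform": ["no"],
}
print(maximal_implementations(hier_dict))
```

Under this definition a module whose dictionary is empty yields no maximal implementations, so its implementations would have to be supplied some other way.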

3. maximal_m (model)

The most complex models are generated by a Cartesian product: maximal_m := product of maximal_i over the modules, obtained by inspecting the values of the dictionary list.
To avoid generating many overlapping models from nested for loops, which produce the structure below with n branches at level n, we switch from Bellman-Ford to Dijkstra.
3

2
--1
--1
2
--1
--1
2
--1
--1
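The Cartesian-product step for maximal_m might look like the following sketch. It assumes the maximal implementations have already been collected per module. The function name `maximal_models` is hypothetical, and the sample names are borrowed from the Birthday model purely for illustration.

```python
from itertools import product


def maximal_models(maximal_per_module):
    """Combine one maximal implementation per module into candidate models."""
    return [tuple(choice) for choice in product(*maximal_per_module)]


# Illustrative input: one list of maximal implementations per module.
maximal_per_module = [
    ["DayOfWeekTrend:yes,DayOfWeekWeights:weighted",
     "DayOfWeekTrend:yes,DayOfWeekWeights:uniform"],
    ["LongTermTrend:yes"],
    ["SeasonalTrend:yes"],
]

for model in maximal_models(maximal_per_module):
    print(",".join(model))
```

The number of candidates is the product of the per-module list lengths, so here two maximal models are produced.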

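def model_out_in(model, out_i, in_i):
    # Return `model` with implementation out_i swapped for in_i.
    # (Body not given in this thread; the `model` argument is an assumption.)
    ...

def gen_chain(model):
    # Longest chain of successively simpler models starting at `model`.
    max_chain_from_j = []
    for mi in maximal_i[model]:      # maximal implementations used by the model
        for j in Hier_dict[mi]:      # implementations inferior to mi
            candidate = gen_chain(model_out_in(model, mi, j))
            if len(candidate) > len(max_chain_from_j):
                max_chain_from_j = candidate
    return [model] + max_chain_from_j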

4. Decompose Network to chain

while Network is not empty:
    pick the most complex model from Network and run gen_chain on it
    Network <- Network \ gen_chain(model)
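The decomposition loop could be sketched as follows. This is a simplified illustration, not the PR's implementation. Models are tuples of implementations, the modules A and B are made up, "most complex" is taken to mean "starts the longest remaining chain", and all function names are hypothetical.

```python
def simpler_models(model, hier_dict):
    """Yield models obtained by swapping one implementation for an inferior one."""
    for pos, impl in enumerate(model):
        for inferior in hier_dict.get(impl, []):
            yield model[:pos] + (inferior,) + model[pos + 1:]


def longest_chain(model, remaining, hier_dict):
    """Longest chain starting at `model` that stays inside `remaining`."""
    best_tail = []
    for nxt in simpler_models(model, hier_dict):
        if nxt in remaining:
            tail = longest_chain(nxt, remaining, hier_dict)
            if len(tail) > len(best_tail):
                best_tail = tail
    return [model] + best_tail


def decompose(network, hier_dict):
    """Repeatedly peel the longest remaining chain off the network."""
    remaining = set(network)
    chains = []
    while remaining:
        # sorted() makes tie-breaking between equally long chains deterministic.
        chain = max(
            (longest_chain(m, remaining, hier_dict) for m in sorted(remaining)),
            key=len,
        )
        chains.append(chain)
        remaining -= set(chain)
    return chains


# Made-up two-module network: every combination of yes/no for A and B.
hier_dict = {"A:yes": ["A:no"], "B:yes": ["B:no"]}
network = [(a, b) for a in ("A:yes", "A:no") for b in ("B:yes", "B:no")]
for chain in decompose(network, hier_dict):
    print(chain)
```

The recursion is not memoized, so this only suits small networks; caching the longest chain per model would be the natural next step.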

hyunjimoon · 2022-01-09T02:56:06Z

FYI, 8.8 (flow decomposition) in Korte, Combinatorial Optimization (5e), and its proof (induction that separates out the maximal element at each step) show sequential decomposition via maximal elements @97chadol.

Dashadower · 2022-01-13T15:42:03Z

Added a new feature to find the models of the highest hierarchy in each subtree. It uses a very simple rule that omits implementations named "no" and includes all others. It assumes a maximum signature count of 2 per branch for branches below the top-level branch.

Run with:

mstan -f simple_example.m.stan get-highest-models

simple example output:

Mean:standard
Mean:normal
Stddev:standard
Stddev:lognormal,StddevInformative:yes

birthday model output:

LongTermTrend:yes
SeasonalTrend:yes
DayOfWeekTrend:yes,DayOfWeekWeights:weighted
DayOfWeekTrend:yes,DayOfWeekWeights:uniform
DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes
HolidayTrend:yes

Dashadower · 2022-01-13T15:45:04Z

@rybern I've tried to mess around with the Haskell code a bit and added the feature that we mentioned in the email about finding the max complexity model. It's probably gonna break with the blow of the wind and I'm super duper new with Haskell so it's gonna be eye-gouging to read but if you have time, can you do a quick review?

Dashadower · 2022-01-13T15:52:25Z

I completely forgot about the golf model and to my surprise when I ran it, looks like it works!!

NSuccesses:binomial
NSuccesses:proportional
PSuccess:logistic
PSuccess:angle_only,PAngleSuccess:angle_success
PSuccess:angle_and_distance,PAngleSuccess:angle_success,PDistanceSuccess:distance_success,OvershootModel:fixed
PSuccess:angle_and_distance,PAngleSuccess:angle_success,PDistanceSuccess:distance_success,OvershootModel:parametric

rybern · 2022-01-13T16:55:48Z

@Dashadower Hi Shin - I'm impressed you were able to dive into the Haskell! Haskell is pretty unique, I hope you're enjoying it.

I have some initial questions:

1. What is the desired output of each command? Most of the other commands output complete model IDs (complete sets of Signature:Implementation pairs), but these seem to be outputting partial sets.
2. It looks like you're effectively removing implementations called "no". Is that right? If so, there are some alternative implementations:
- Start by filtering out "no" modules, then enumerate models in the usual way
- Start by enumerating models in the usual way, then filter out models that use "no" implementations

I'm not sure that naming models "no" is a good way for users to annotate nested modules. For example, when there are more than two possible implementations, what do you call the second nested module?

Dashadower · 2022-01-13T17:09:27Z

@rybern Thanks! So far I'd say the Haskell experience was neither great nor terrible.

The goal was to identify the "most complicated" models for each signature subtree. Right now it just prints the portions separately, but it's not hard to compose them into sound models (just another combination of nodes).

I think the way you mentioned would be way more effective: removing "no" implementations when building sigImpls.

Right now, there honestly is no way for us to determine "which model is more complex" other than whether module A is included or not, at least from the current tree point of view. Which is why it's very restrictive, and another reason I'm paying very close attention to Yuling's original email to figure out other ways that might not be too hard to add in.

rybern · 2022-01-13T17:54:15Z

When building sigImpls would work, but we could also just filter them out of the program, like:

execCommand prog GetHighestModels = do
  -- Drop every implementation named "no" before enumerating selections.
  let filteredProg = prog { implementations = filter ((/= ImplName "no") . implName) (implementations prog) }
  let sels = allSelections filteredProg
  return . map showSelection . Set.toList $ sels

> Right now, there honestly is no way for us to determine "which model is more complex" other than whether module A is included or not, at least from the current tree point of view.

Right, the goal as I understand it would be to let users annotate which implementations are superseded by others; I just think that using the name "no" as the annotation won't generalize very well. But, I don't want to get in the way of your experiments, so maybe "no" is a good place to start. Maybe filtering out model IDs with ":no" in them from the outside program (like graph_search.py) would also make more sense for experimenting, so that different experiments can have different conventions for specifying nested models.
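Filtering from the outside program could be sketched like this. It assumes model IDs are comma-separated Signature:Implementation pairs, as in the outputs in this thread. The function name `drop_no_models` is hypothetical and is not taken from graph_search.py.

```python
def drop_no_models(model_ids):
    """Keep only model IDs in which no signature selects the "no" implementation."""
    kept = []
    for model_id in model_ids:
        pairs = model_id.split(",")
        # Compare the implementation part exactly, so that names such as
        # "Mean:normal" are not mistaken for "no".
        if all(pair.partition(":")[2] != "no" for pair in pairs):
            kept.append(model_id)
    return kept


model_ids = [
    "LongTermTrend:yes,SeasonalTrend:yes",
    "LongTermTrend:no,SeasonalTrend:yes",
    "Mean:normal,Stddev:standard",
]
print(drop_no_models(model_ids))
```

Keeping the convention in one small function like this would let each experiment swap in its own rule for marking nested models.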

Dashadower · 2022-01-14T02:27:08Z

> When building sigImpls would work, but we could also just filter them out of the program, like:

Wow that's way better! I'll definitely change to your code. Thanks for the suggestion.

> Right, the goal as I understand it would be to let users annotate which implementations are superseded by others; I just think that using the name "no" as the annotation won't generalize very well. But, I don't want to get in the way of your experiments, so maybe "no" is a good place to start. Maybe filtering out model IDs with ":no" in them from the outside program (like graph_search.py) would also make more sense for experimenting, so that different experiments can have different conventions for specifying nested models.

Yeah I agree with your points. And changing model representation conventions is something I haven't thought about, but it seems like there's opportunity for exploration here. Again, thanks a lot for the feedback, this is all valuable food for thought for me.

Dashadower · 2022-01-18T03:19:30Z

apex predator search result:

{'DayOfWeekTrend:yes,DayOfWeekWeights:uniform,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,DayOfYearTrend:yes,HolidayTrend:yes,LongTermTrend:yes,Regression:glm,SeasonalTrend:yes': 14624.1187889647, 
'DayOfWeekTrend:yes,DayOfWeekWeights:weighted,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,DayOfYearTrend:yes,HolidayTrend:yes,LongTermTrend:yes,Regression:glm,SeasonalTrend:yes': 13173.9060251408}

Dashadower · 2022-01-21T03:40:44Z

Starting chain:

['DayOfWeekTrend:yes,DayOfWeekWeights:weighted', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes', 'HolidayTrend:yes', 'LongTermTrend:yes', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:weighted', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:weighted', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:weighted', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:uniform', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:yes']
['DayOfWeekTrend:yes,DayOfWeekWeights:uniform', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:no']
['DayOfWeekTrend:yes,DayOfWeekWeights:uniform', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:no,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:no']
['DayOfWeekTrend:no', 'DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:no,DayOfYearNormalVariance:no', 'HolidayTrend:yes', 'LongTermTrend:no', 'SeasonalTrend:no']

K = 3 result:

model: DayOfWeekTrend:yes,DayOfWeekWeights:weighted,DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,HolidayTrend:yes,LongTermTrend:yes,SeasonalTrend:yes ELPD:13173.9060251408
model: DayOfWeekTrend:yes,DayOfWeekWeights:weighted,DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,HolidayTrend:yes,LongTermTrend:no,SeasonalTrend:yes ELPD:1746.41966617266
model: DayOfWeekTrend:yes,DayOfWeekWeights:uniform,DayOfYearTrend:yes,DayOfYearHeirarchicalVariance:yes,DayOfYearNormalVariance:yes,HolidayTrend:yes,LongTermTrend:no,SeasonalTrend:yes ELPD:7361.39343209761
Traceback (most recent call last):
  File "Chain_Generation_and_Search.py", line 149, in <module>
    best_model, best_elpd = Chain_Search(Chain=chain, K=K, data_file_dir=data_file_dir)
  File "Chain_Generation_and_Search.py", line 117, in Chain_Search
    step_size_Uniform = (n-1-cur_ind)/(K-num_ELPD_computed)
ZeroDivisionError: division by zero
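The traceback suggests the denominator K-num_ELPD_computed reaches zero once K ELPD values have been computed. One possible guard is sketched below. The function name `uniform_step_size` is hypothetical, and stopping the search when the budget is spent is an assumption about the intended behavior, not something confirmed in this thread.

```python
def uniform_step_size(n, cur_ind, K, num_elpd_computed):
    """Step size over the rest of the chain, or None once the budget K is spent."""
    remaining_budget = K - num_elpd_computed
    if remaining_budget <= 0:
        # All K evaluations are used: signal the caller to stop
        # instead of dividing by zero.
        return None
    return (n - 1 - cur_ind) / remaining_budget


print(uniform_step_size(n=8, cur_ind=2, K=3, num_elpd_computed=1))
print(uniform_step_size(n=8, cur_ind=2, K=3, num_elpd_computed=3))
```

The caller would then need to check for None and exit the search loop.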

Dashadower · 2022-02-05T08:07:28Z

20 iterations of Bayesian probabilistic update.
The probability distribution of models:

Dashadower · 2022-02-05T08:08:27Z

(Continued) The probability-ELPD scatterplots, again for each iteration:

Dashadower · 2022-02-05T08:09:13Z

(Continued)
And finally, the plots for each signature, after iteration 20 is completed:

rybern · 2022-02-15T16:09:46Z

It looks like there's some exciting stuff happening in here!

Just a heads up, you might want to merge with the master branch. I've found and fixed some edge case bugs in the network-of-models code.

Dashadower · 2022-02-16T22:15:02Z

> It looks like there's some exciting stuff happening in here!
>
> Just a heads up, you might want to merge with the master branch. I've found and fixed some edge case bugs in the network-of-models code.

Thanks for the heads up, will do.

I've been writing an example model here but am confused about when to use or omit return in module blocks. Can you give a brief explanation of the syntax?

rybern · 2022-02-17T00:54:15Z

Yeah, the short answer is that you should use return when the module is going to be used as a value/expression, and not when it's going to be used as a statement. Looks like you've got it right in your example.

A note from your example: since you use Regression like y ~ Regression(), I suggest adding y as an argument like module "glm" Regression(y). Just like Stan functions, the left side of the ~ is passed into the module.

commit for pr

844a135

hyunjimoon changed the title ~~commit for pr~~ Combinatorics navigation Jan 8, 2022

Dashadower added 3 commits January 9, 2022 13:15

Merge branch 'master' into combinatorics

dbd92d5

Find max hierarchy models for each subtree

1ae4889

Update n(signature)=1 case

7cc22e8

Update highest model command and add apex search

70233bd

Dashadower added 3 commits January 21, 2022 10:36

Add chain search

73a450d

Update chain search

67fedf0

Update chain search code

80025a9

Dashadower added 7 commits January 28, 2022 09:11

update search and add model df

8733124

Create model dataframe module and birthday model df

e80399d

update typing

d365077

Move files to directory, create mstan inferface and bayesian probabil…

f65f1bc

…ity update algorithm

Update probability search and add score based algorithm

7b0577c

Update bayesian probabilistic search to include plots

883178b

remove pyc

3796258

Dashadower added 4 commits February 13, 2022 11:49

Upload full birthday DF

a76d53b

Remove redundant csv

ef62161

full elpd for birthday

827e5af

Add roach model, score search, and notebook template

9f514a4

Upload roach csv

64587d8

Dashadower added 5 commits February 18, 2022 11:19

Update prob search

3eacced

Merge remote-tracking branch 'origin/master' into combinatorics

876b4b4

Update notebook

72fdd62

Update chain algorithms and notebook

639adf9

Update notebook

a4f090d
