-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Source priority input #2022
Source priority input #2022
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you describe the design in this PR? From what I remember we need a configurable priority per nodetype only? And if we needed to override that (not sure), there can be a column in the Node table.
Given that context, why do we need a python/ribasim/ribasim/allocation/source_priorities.py
?
It's still nice to summarize what the changes are in a PR description (and later commit message). Can be very short, as in rename |
A remark on tests that were changed: In some cases the allocated amounts actually changed because a change in order of sources and how they are clustered together. But more generally, it is a bad idea to test based on obtaining the values from the variables (e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! 💯 Could you comment on your approach in general?
I'm having a hard time to understand the refactor, mostly because of the rename with AllocationSourceType (going from the old boundary_node basin main_to_sub user_return buffer to user_demand, boundary, level_demand, flow_demand and subnetwork_inlet?). It would be good to document that. Same for the AllocationSource functions using the Val.
The changes in the tests slightly worry me, do you/we fully understand the changes in the source order? Are there other tests that assert correct behavior (if any)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's up with all the added newlines? Best to remove them (from the generator script).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def migration_func(df: DataFrame, schema_version: int) -> DataFrame: | ||
if schema_version < 4: | ||
warnings.warn( | ||
f"Migrating outdated {node_type} / {table_type} table.", UserWarning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That warning is not so important that we need to adhere to a specific format, feel free to move the function outside of the loop.
if !( | ||
(type isa DataType) && | ||
type <: Legolas.AbstractRecord && | ||
hasfield(type, :demand_priority) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I appreciate this refactor, I'd say the previous code was more readable. I now have to look up all tables with demand_priority
to get the same understanding. Just a general remark, it's a fine line between preventing duplication bugs and readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. On the other hand, if we add a new table with a demand_priority_column
, this is typically something that we would forget to update and then the problem might be hard to locate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably something that would have to be undone if we were to prepare our code for static compilation?
It seems that with these changes we're back in a situation where the allocation results are not deterministic. This might be due to the fact that now again in certain situations the best result (as measured by the loss function) is not unique. For instance, say there are two sources in a subnetwork that have the same priority and thus are used within the same optimization problem. If these two sources can both supply to a certain demand, there is no constraint that pins down which fraction of the demand comes from which of the 2 sources. We can solve this by going back to optimizing per source. At the moment we do not enforce unique source priorities so in order to have a fixed order for the sources, per source priority the sources are ordered by node ID. If desired the modeler can set specific priorities for nodes to deviate from this behavior. @FaridAlavi, is there a general way in which you avoid non-uniqueness problems like this? I just read about using a secondary objective. So first optimize for the allocation of demands, and then when those are fixed, solve for a secondary objective, say retrieving the same amount from every source, to make the solution unique. |
python/ribasim/ribasim/schemas.py
Outdated
@@ -36,614 +36,779 @@ def migrate(cls, df: Any, schema_version: int) -> Any: | |||
|
|||
class BasinConcentrationExternalSchema(_BaseSchema): | |||
fid: Index[Int32] = pa.Field(default=1, check_name=True, coerce=True) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure where all these white spaces come from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if !( | ||
(type isa DataType) && | ||
type <: Legolas.AbstractRecord && | ||
hasfield(type, :demand_priority) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. On the other hand, if we add a new table with a demand_priority_column
, this is typically something that we would forget to update and then the problem might be hard to locate
if !( | ||
(type isa DataType) && | ||
type <: Legolas.AbstractRecord && | ||
hasfield(type, :demand_priority) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably something that would have to be undone if we were to prepare our code for static compilation?
@evetion a comment on the approach in general: In Ribasim Python, source priorities can now be added via the In the core the default or overridden source priorities are assigned to the source nodes in each subnetwork. Currently the sources within a subnetwork with the same priority are solved for at the same time which yield (more) situations with non-uniqueness of an optimal solution. Regarding the rename of the Regarding the |
125889f
to
b70895f
Compare
My latest commit reverts back to solving allocation per source to reduce the likelihood of non-unique optimal solutions which can lead to non-deterministic behavior. Because of this the allocation tests give the same results as main again so @evetion can sleep soundly 😴 The changes by @visr tried to fix a problem with CI which turned out to be unrelated to this PR. Regarding non-unique solutions in general: they are still possible even now, for instance if flow can take 2 different routes from a source to a demand. There is a new issue for addressing this problem rigorously: #2051 |
Fixes #1505.
To do:
priority -> demand_priority
a non-breaking migration step