Utility vs Distance threshold for cutoff point when creating bike route choice paths and logsums. #329
Background

The existing Java implementation has a bespoke Dijkstra's algorithm that uses generalized costs when building the paths but tracks distance separately in order to apply the limit. The limit is currently set to 20 miles for TAZs and 3 miles for MAZs. It is not feasible to move away from SciPy's Dijkstra implementation for the Python development of the bike route choice model, so we are stuck with having to supply a generalized cost as the limit. The question is how we present that limit to the user: either as a distance that gets converted to a cost, or as a cost itself.

Case for Distance Limit

This implementation, while theoretically appropriate, produces paths that are much longer than our expected distance threshold. We suspect a few reasons why this is the case:
These essentially boil down to one issue: the cost-per-mile across the entire network is not representative of the cost-per-mile along paths that are actually chosen.

Case for Utility Limit

Downsides to this approach are:

Upsides:
My Personal Take

I also don't think it really matters: the number of bike trips happening out at this 20-ish mile range is so small that changing the limit by +/- a mile or so will make no meaningful difference. For reference, the plot below is the distribution of bike trips coming out of the ABM3 model, with a tail that extends all the way out to 27 miles. The 20-mile distance threshold in the current Java implementation does not translate directly to the outputs anyway.
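To make the distance-to-cost conversion concrete, here is a minimal sketch (not the project's code) of how a distance limit converted with a network-wide average cost-per-mile can admit paths longer than that limit. The toy network, its edge values, and the helper `path_miles` are all hypothetical; the only real API used is `scipy.sparse.csgraph.dijkstra` with its `limit` argument.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

# Hypothetical toy network (illustrative values only).
# Nodes 0-1-2-3 form a low-stress chain; 0->4 is a high-stress link.
edges = [
    # (from, to, miles, generalized cost)
    (0, 1, 1.0, 1.0),
    (1, 2, 1.0, 1.0),
    (2, 3, 1.0, 1.0),
    (0, 4, 1.0, 5.0),
]
n_nodes = 5
rows = [e[0] for e in edges]
cols = [e[1] for e in edges]
costs = [e[3] for e in edges]
miles = {(e[0], e[1]): e[2] for e in edges}

cost_graph = csr_matrix((costs, (rows, cols)), shape=(n_nodes, n_nodes))

# Convert a distance limit to a cost limit using the network-wide
# average cost-per-mile (the assumption questioned in the post).
distance_limit_miles = 2.0
avg_cost_per_mile = sum(costs) / sum(miles.values())
cost_limit = distance_limit_miles * avg_cost_per_mile

# SciPy only accepts a limit in the units of the edge weights (cost).
cost_to, pred = dijkstra(
    cost_graph, directed=True, indices=0,
    return_predecessors=True, limit=cost_limit,
)

def path_miles(node, pred, miles):
    """Sum edge lengths along the min-cost path back to the origin."""
    total = 0.0
    while pred[node] >= 0:  # SciPy marks "no predecessor" with -9999
        prev = int(pred[node])
        total += miles[(prev, int(node))]
        node = prev
    return total

for node in range(1, n_nodes):
    if np.isfinite(cost_to[node]):
        print(node, cost_to[node], path_miles(node, pred, miles))
    else:
        print(node, "beyond cost limit")
```

In this toy network the 2-mile limit converts to a cost limit of 4.0, yet node 3 is still reached along a 3-mile low-cost path, while node 4, only 1 mile away, is cut off because its link is expensive.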
Replies: 3 comments
Thanks David, this is a great summary and I think it covers most of what we discussed last week. I'm leaning towards using a utility cutoff for the reasons you gave: it is more reflective of the actual implementation using SciPy and gives more consistent output. Because a utility threshold is less intuitive for users than a distance threshold, we would want to clearly document how we arrived at that threshold and what steps to take to update it in the future.

I want to graph the bike logsum values for all bike trips in some existing no-build and build scenarios so we can try to find a threshold beyond which almost no trips choose the bike mode. How close can we expect the final logsum to be to the original path utilities? Do we need to temporarily add a "logsum without path size" to the bike model output to get closer to the path utility?
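One possible way to turn that graph into a number is to pick the threshold as a low quantile of the utilities observed on bike trips. The sketch below is only an illustration under assumptions: the values are synthetic stand-ins generated with NumPy, not ABM3 output, and the names `bike_trip_utils` and `share_kept` are hypothetical. It also treats the value attached to each trip as a path utility, which is exactly the open question raised above.

```python
import numpy as np

# Synthetic stand-in for the utility of each observed bike trip
# (negative values; more negative means a costlier trip).
# These are NOT model outputs, only illustrative draws.
rng = np.random.default_rng(0)
bike_trip_utils = -rng.gamma(shape=2.0, scale=1.5, size=10_000)

# Hypothetical target: keep roughly 99.9% of observed bike trips
# inside the cutoff.
share_kept = 0.999
threshold = np.quantile(bike_trip_utils, 1.0 - share_kept)

# Share of trips at or above the threshold (i.e. not cut off).
kept = np.mean(bike_trip_utils >= threshold)

# Expressed as a generalized cost limit, the cutoff would be -threshold.
cost_limit = -threshold
print(f"utility threshold: {threshold:.3f}, share kept: {kept:.4f}")
```

Documenting the chosen share alongside the resulting threshold would make the value reproducible whenever the coefficients change.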
The biggest driver of the conversation to date has been the fact that we're seeing distances that significantly exceed the threshold we specify, in both the Python and Java implementations. David did a great job of summarizing the path ahead of us. Practically, there is no way to implement a true distance cutoff without re-implementing Dijkstra's algorithm, and we don't presently have the resources to do that. I support shifting to a cost-based threshold in spite of the user-unfriendliness because it should actually adhere to a firm cutoff. Nonetheless, it's important to keep in mind that, either way, the cutoff is merely a heuristic for improving runtime and, if set properly, should have next to no impact on the actual logsums used in the model.

I agree wholeheartedly with Alexander about wanting a clearly documented derivation of the chosen thresholds. I think it's worth designing a procedure to derive them from scratch (either in addition to or in lieu of a procedure for updating existing thresholds). As David mentioned, the cutoff will likely need to change every time the coefficients are altered, and I can envision scenarios (e.g. the e-bike vs. regular bike discussion) in which the existing cutoff values have little to no relevance. Whether deriving from scratch or updating existing values is more efficient will likely depend heavily on the changes made, but having a means to do so from scratch should probably come first; we can then build updating procedures on top of that.
Thank you all for your input. We do not update the coefficients very often. I recommend going ahead with a utility threshold.

Next steps