Performance optimization for large matrices #84

@vagmcs

Description

In choice_calcs.py, around line 928, the library checks whether weights for computing the weighted log-likelihood are provided; if they are not, it sets them to an array of ones, then multiplies by the rows_to_obs array and takes a per-column max. When rows_to_obs is very large, materializing it as a dense array can cause an out-of-memory error. If the weights are not provided, or they are all equal to one, we can simply set weights_per_obs to an array of ones and skip the multiplication and max operations entirely, which greatly improves performance.

The existing code:

if weights is None:
    weights = np.ones(design.shape[0])
weights_per_obs =\
    np.max(rows_to_obs.toarray() * weights[:, None], axis=0)
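Beyond the all-ones fast path, it may also be possible to avoid the dense temporary altogether when weights are genuinely non-uniform, by keeping rows_to_obs sparse. This is a sketch, not part of the proposed fix: the toy rows_to_obs and weights below are made up for illustration, and it assumes scipy's sparse `multiply` and per-axis `max` (available since scipy 0.19):

```python
import numpy as np
from scipy import sparse

# Hypothetical toy mapping: 5 rows belonging to 3 observations.
rows_to_obs = sparse.csr_matrix(np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float))
weights = np.array([2.0, 1.0, 3.0, 1.0, 4.0])

# Library's dense path: materializes the full rows-by-obs array.
dense_result = np.max(rows_to_obs.toarray() * weights[:, None], axis=0)

# Sparse alternative: scale only the stored entries, then take a
# per-column max. sparse .max(axis=0) returns a sparse matrix,
# hence the final toarray().ravel().
sparse_result = (rows_to_obs
                 .multiply(weights[:, None])
                 .max(axis=0)
                 .toarray()
                 .ravel())

assert np.array_equal(dense_result, sparse_result)
```

The two paths agree because every column of rows_to_obs has at least one nonzero entry and the weights are positive, so the column max is always attained at a stored entry.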

and the proposed fix:

if weights is None or np.all(weights == 1):
    weights_per_obs = np.ones(rows_to_obs.shape[1])
else:
    weights_per_obs = \
        np.max(rows_to_obs.toarray() * weights[:, None], axis=0)
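To sanity-check that the fast path is equivalent, here is a small hedged example (the toy rows_to_obs is invented for illustration): when weights is None, the library fills in ones, and the per-column max of rows_to_obs scaled by ones is just a vector of ones, since every observation has at least one row mapped to it.

```python
import numpy as np
from scipy import sparse

# Hypothetical toy mapping: 5 rows belonging to 3 observations.
rows_to_obs = sparse.csr_matrix(np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float))

# The common case: weights is None, so the library fills in ones.
weights = np.ones(rows_to_obs.shape[0])

# Original path: dense multiply followed by a per-column max.
original = np.max(rows_to_obs.toarray() * weights[:, None], axis=0)

# Proposed fast path: skip the dense temporary entirely.
fast = np.ones(rows_to_obs.shape[1])

assert np.array_equal(original, fast)
```

The dense path allocates an array of shape (num_rows, num_obs) just to recover a vector of ones, which is exactly the waste the fix avoids.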

I have created a pull request to address the issue (see #85).
