Performance optimization for large matrices #84

@vagmcs

Description

In choice_calcs.py, around line 928, the library checks whether weights for computing the weighted log-likelihood are provided; if they are not, it sets them to an array of ones, then multiplies by the rows_to_obs array and takes a per-column max. When rows_to_obs is very large, materializing it as a dense array can cause an out-of-memory error. If the weights are not provided, or they are all equal to one, we can simply set weights_per_obs to an array of ones and skip the multiplication and max operations entirely, which greatly improves performance.

The existing code:

if weights is None:
    weights = np.ones(design.shape[0])
weights_per_obs =\
    np.max(rows_to_obs.toarray() * weights[:, None], axis=0)
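Beyond the all-ones fast path, it may also be possible to avoid the dense temporary altogether when weights are genuinely non-uniform, by keeping rows_to_obs sparse. This is a sketch, not part of the proposed fix: the toy rows_to_obs and weights below are made up for illustration, and it assumes scipy's sparse `multiply` and per-axis `max` (available since scipy 0.19):

```python
import numpy as np
from scipy import sparse

# Hypothetical toy mapping: 5 rows belonging to 3 observations.
rows_to_obs = sparse.csr_matrix(np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float))
weights = np.array([2.0, 1.0, 3.0, 1.0, 4.0])

# Library's dense path: materializes the full rows-by-obs array.
dense_result = np.max(rows_to_obs.toarray() * weights[:, None], axis=0)

# Sparse alternative: scale only the stored entries, then take a
# per-column max. sparse .max(axis=0) returns a sparse matrix,
# hence the final toarray().ravel().
sparse_result = (rows_to_obs
                 .multiply(weights[:, None])
                 .max(axis=0)
                 .toarray()
                 .ravel())

assert np.array_equal(dense_result, sparse_result)
```

The two paths agree because every column of rows_to_obs has at least one nonzero entry and the weights are positive, so the column max is always attained at a stored entry.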

and the proposed fix:

if weights is None or np.all(weights == 1):
    weights_per_obs = np.ones(rows_to_obs.shape[1])
else:
    weights_per_obs = \
        np.max(rows_to_obs.toarray() * weights[:, None], axis=0)
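To sanity-check that the fast path is equivalent, here is a small hedged example (the toy rows_to_obs is invented for illustration): when weights is None, the library fills in ones, and the per-column max of rows_to_obs scaled by ones is just a vector of ones, since every observation has at least one row mapped to it.

```python
import numpy as np
from scipy import sparse

# Hypothetical toy mapping: 5 rows belonging to 3 observations.
rows_to_obs = sparse.csr_matrix(np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float))

# The common case: weights is None, so the library fills in ones.
weights = np.ones(rows_to_obs.shape[0])

# Original path: dense multiply followed by a per-column max.
original = np.max(rows_to_obs.toarray() * weights[:, None], axis=0)

# Proposed fast path: skip the dense temporary entirely.
fast = np.ones(rows_to_obs.shape[1])

assert np.array_equal(original, fast)
```

The dense path allocates an array of shape (num_rows, num_obs) just to recover a vector of ones, which is exactly the waste the fix avoids.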

I have created a pull request to address the issue (see #85).
