Missing something like numpy.bincount #1786

Open
mateusrodriguesxyz opened this issue Jan 22, 2025 · 8 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@mateusrodriguesxyz

mateusrodriguesxyz commented Jan 22, 2025

While converting a KNN implementation from NumPy to MLX Swift, I stumbled upon a use of bincount to count the number of occurrences of each value in an array. I couldn't find an MLX equivalent, so I did the following:

let prediction = kNearestLabels
        .asArray(Int.self) // MLXArray -> [Int]
        .reduce(into: [:]) { $0[$1, default: 0] += 1 }  // count labels
        .max(by: { $0.1 < $1.1 })!  // max '(label, count)' pair
        .key // label (prediction)

The NumPy code did this in just one line:

np.bincount(k_nearest_labels).argmax()

Is there a more "MLX way" of doing this? If not, maybe MLX should have a bincount method; it seems quite handy.

@davidkoski
Contributor

Transfer this issue to mlx -- I don't see one in the underlying C++ core. Per the docs, it looks like it produces a histogram. I think the problem with the API (as written) is that the output size is variable based on the contents of the input, so it doesn't match the mlx model.
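
To make the shape concern concrete, here is a minimal plain-Python model of what bincount computes for non-negative integers. It is an illustrative sketch, not NumPy's or MLX's implementation, and the helper name `bincount_model` is hypothetical. The point is that the result length is `max(values) + 1`, so it cannot be known from the input shape alone.

```python
def bincount_model(values):
    """Plain-Python model of a bincount over non-negative integers.

    Hypothetical helper for illustration only.
    """
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    if not values:
        return []
    # The output length depends on the largest value in the data,
    # not on how many values there are.
    counts = [0] * (max(values) + 1)
    for v in values:
        counts[v] += 1
    return counts


# Two inputs of the same length produce outputs of different lengths.
short = bincount_model([0, 1, 1, 2])
long = bincount_model([0, 1, 1, 7])
print(len(short), len(long))
```

Both inputs have four elements, yet the second result is longer because it contains a larger value.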

@davidkoski
Contributor

Oh, I can't transfer it. Well @awni what do you think?

@awni
Member

awni commented Jan 22, 2025

I'll transfer it to MLX. I think we could have a version where the maximum value is provided as an argument, but without that it would be an op with a data-dependent output shape, and we don't have support for those in MLX at the moment.

@awni transferred this issue from ml-explore/mlx-swift Jan 22, 2025
@angeloskath
Member

angeloskath commented Jan 22, 2025

Well given a fixed output size N then you can simply write

mx.zeros(N, dtype=mx.int32).at[k_nearest_labels].add(1).argmax()
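
As a rough model of what that one-liner does, the plain-Python sketch below starts from N zeros, adds 1 at every label position so that repeated labels accumulate, and then takes the index of the largest count. The helper names `bincount_fixed` and `argmax` and the sample labels are assumptions made for illustration. The range check is also a choice of this sketch; it says nothing about how MLX treats out-of-range indices.

```python
def bincount_fixed(labels, n):
    """Count occurrences of each label in range(n); output length is always n.

    Hypothetical helper modelling a scatter-add of ones into n zeros.
    """
    counts = [0] * n
    for label in labels:
        if not 0 <= label < n:
            # Assumption of this sketch: reject labels outside [0, n).
            raise ValueError(f"label {label} is outside [0, {n})")
        counts[label] += 1  # duplicates accumulate
    return counts


def argmax(xs):
    """Index of the first largest element."""
    return max(range(len(xs)), key=xs.__getitem__)


k_nearest_labels = [0, 1, 0, 1, 1, 2, 1, 2, 2, 2, 2]
counts = bincount_fixed(k_nearest_labels, 3)
prediction = argmax(counts)
print(counts, prediction)  # [2, 4, 5] 2
```

Because the caller supplies the output size, the result shape no longer depends on the label values.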

@mateusrodriguesxyz
Author

mateusrodriguesxyz commented Jan 23, 2025

Well given a fixed output size N then you can simply write

mx.zeros(N, dtype=mx.int32).at[k_nearest_labels].add(1).argmax()

Thanks! Can this be done in Swift? I can't find equivalents of the at and add methods. The best I could do was this:

let a = MLXArray([0, 0, 0])
let idx = MLXArray([0, 1, 0, 1, 1, 2, 1, 2, 2, 2, 2])
for i in idx {
    a[i] = a[i] + 1
}
print(a.argMax())

@davidkoski
Contributor

Added an issue to add the missing at() function: ml-explore/mlx-swift#188

@awni
Member

awni commented Jan 24, 2025

@angeloskath the one-liner you wrote for this is the entire implementation, so I’m debating whether it’s worth adding this op or closing this. It might be handy to have the op since it’s not entirely obvious. I’ll mark it as an enhancement for now, but I’m also OK with closing it.

@awni added the enhancement and good first issue labels Jan 24, 2025
@angeloskath
Member

Personally I don't think it is worth an op, but I tend to be overly conservative w.r.t. these things, i.e. I don't like ops like relu etc. I think they hide more than they are worth. OTOH, JAX has it 🤷‍♂️


4 participants