[Bug]: Biomedical NEN #3600

Open
darije opened this issue Jan 24, 2025 · 5 comments
Labels
bug Something isn't working

darije commented Jan 24, 2025

Describe the bug

When running the example from the Biomedical NEN documentation (https://github.com/flairNLP/flair/blob/master/resources/docs/HUNFLAIR2.md), I get AttributeError: 'csr_matrix' object has no attribute 'A'.

To Reproduce

from flair.data import Sentence
from flair.models import EntityMentionLinker
from flair.nn import Classifier

# make a sentence
sentence = Sentence("Behavioral abnormalities in the Fmr1 KO2 Mouse Model of Fragile X Syndrome")

# load biomedical NER tagger + predict entities
tagger = Classifier.load("hunflair2")
tagger.predict(sentence)

# load gene linker and perform normalization
gene_linker = EntityMentionLinker.load("gene-linker")
gene_linker.predict(sentence)

# load disease linker and perform normalization
disease_linker = EntityMentionLinker.load("disease-linker")
disease_linker.predict(sentence)

# load species linker and perform normalization
species_linker = EntityMentionLinker.load("species-linker")
species_linker.predict(sentence)

for entity in sentence.get_labels("link"):
    print(entity)

Expected behavior

Span[0:2]: "Behavioral abnormalities" → MESH:D001523/name=Mental Disorders (197.9467010498047)
Span[4:5]: "Fmr1" → 108684022/name=FRAXA (219.9510040283203)
Span[6:7]: "Mouse" → 10090/name=Mus musculus (213.6201934814453)
Span[9:12]: "Fragile X Syndrome" → MESH:D005600/name=Fragile X Syndrome (193.7115020751953)

Logs and Stack traces

Traceback (most recent call last):
  File "c:\Users\darij\Desktop\FlairNLP\biomedNEN.py", line 14, in <module>
    gene_linker.predict(sentence)
  File "C:\Users\darij\Desktop\FlairNLP\.venv\Lib\site-packages\flair\models\entity_mention_linking.py", line 822, in predict
    candidates = self.candidate_generator.search(entity_mentions=mentions[i : i + batch_size], top_k=top_k)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\darij\Desktop\FlairNLP\.venv\Lib\site-packages\flair\models\entity_mention_linking.py", line 614, in search
    mention_embs = self.embed(entity_mentions)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\darij\Desktop\FlairNLP\.venv\Lib\site-packages\flair\models\entity_mention_linking.py", line 594, in embed
    self.embeddings["sparse"].embed(inputs)
  File "C:\Users\darij\Desktop\FlairNLP\.venv\Lib\site-packages\flair\embeddings\document.py", line 213, in embed
    tfidf_vectors = torch.from_numpy(self.vectorizer.transform(raw_sentences).A)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'csr_matrix' object has no attribute 'A'

Screenshots

No response

Additional Context

Windows 11, VS Code, Python venv 3.12

Environment

Versions:

Flair: 0.15.0
Pytorch: 2.5.1+cu124
Transformers: 4.48.1
GPU: True

darije added the bug label Jan 24, 2025
@alanakbik
Collaborator

@WangXII can you take a look?

Collaborator

WangXII commented Jan 27, 2025

Pinging @sg-wbi, our expert on the NEN module

Collaborator

sg-wbi commented Jan 28, 2025

The problem is with the installation of dependencies:

scikit-learn>=1.0.2 pulls in scipy-1.15.1

Collecting scikit-learn>=1.0.2 (from flair==0.15.0)
  Using cached scikit_learn-1.6.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (18 kB)
Collecting scipy>=1.6.0 (from scikit-learn>=1.0.2->flair==0.15.0)
  Using cached scipy-1.15.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)

but scipy-1.14.0 removed the .A attribute.

The simple fix is to change .A to .toarray(): PR here.

Or you can try downgrading scipy to 1.13.
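
For illustration, here is a minimal sketch of the replacement on a small stand-in matrix. It does not reproduce the Flair code path: the array values and variable names are made up, and only `csr_matrix` and its `toarray()` method come from scipy itself.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical stand-in for the sparse TF-IDF matrix that
# vectorizer.transform(...) returns in the traceback above.
sparse_vectors = csr_matrix(np.array([[0.0, 0.5, 0.0], [0.3, 0.0, 0.7]]))

# Before: sparse_vectors.A  (fails on the scipy release from the pip log)
# After: toarray() returns the same data as a dense numpy.ndarray.
dense_vectors = sparse_vectors.toarray()

print(type(dense_vectors).__name__, dense_vectors.shape)
```

Since `toarray()` is also available in older scipy releases, this change should not require pinning scipy.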

Collaborator

sg-wbi commented Jan 28, 2025

I don't think scipy is used much in the library, but a dependency conflict keeps the tests from picking up new scipy releases.

In the test environment, gensim>=4.2.0 is installed, which removes scipy-1.15.1 and installs scipy-1.13.1 (for whatever reason).
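
To see which side of this an environment falls on, a rough check like the one below may help. It is only a sketch: the helper names are hypothetical, and the 1.14 threshold is taken from the comment above rather than verified independently.

```python
def parse_major_minor(version_string: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.15.1'."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def dot_a_removed(scipy_version: str) -> bool:
    """Guess whether `.A` is gone, assuming the 1.14 threshold from this thread."""
    return parse_major_minor(scipy_version) >= (1, 14)


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("scipy")
    except PackageNotFoundError:
        print("scipy is not installed")
    else:
        print(installed, "-> .A removed:", dot_a_removed(installed))
```

Running this in both a fresh install and the test environment should show whether they resolve to different scipy versions.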

Author

darije commented Jan 28, 2025

@sg-wbi "The simple fix is to change .A to .toarray" - this worked!
Thank you!
