Tinyyuumi/PMC-xml-read

This repository extracts the authors, title, abstract, citation locations, and references from the XML of papers hosted on the PMC website.

https://www.ncbi.nlm.nih.gov/pmc

1. Export the search result list from the website and save it as pmc_result.txt

2. Run extract_pcmid.py to extract the PMCIDs from pmc_result.txt and save them in id.txt

python extract_pcmid.py

3. Run article.py to extract the article information for each PMCID listed in id.txt

python article.py
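
The two scripts are not reproduced here, so the following is only a minimal sketch of what the steps above might look like, not the repository's actual code. It assumes that the exported list writes each ID as "PMC" followed by digits, and that the article XML uses the un-namespaced JATS tags PMC serves (`article-title`, `contrib`, `abstract`, `xref` with `ref-type="bibr"`, `ref`). The function names `extract_pmcids` and `parse_article` are hypothetical, and downloading the XML is left out.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: IDs in the exported list look like "PMC" followed by digits.
PMCID_PATTERN = re.compile(r"PMC\d+")


def extract_pmcids(text):
    """Return the unique PMCIDs found in text, in order of first appearance."""
    seen = []
    for pmcid in PMCID_PATTERN.findall(text):
        if pmcid not in seen:
            seen.append(pmcid)
    return seen


def _text(element):
    """Join all text inside an element, collapsing runs of whitespace."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def parse_article(xml_string):
    """Extract title, authors, abstract, citation locations, and references."""
    root = ET.fromstring(xml_string)

    # Authors are <contrib contrib-type="author"> entries with a <name> child.
    authors = []
    for contrib in root.iter("contrib"):
        if contrib.get("contrib-type") != "author":
            continue
        given = _text(contrib.find("name/given-names"))
        surname = _text(contrib.find("name/surname"))
        authors.append(f"{given} {surname}".strip())

    # Map each reference id to the text of its entry in the reference list.
    references = {ref.get("id"): _text(ref) for ref in root.iter("ref")}

    # Assumption: a "citation location" is the paragraph containing the <xref>.
    citations = []
    for paragraph in root.iter("p"):
        for xref in paragraph.iter("xref"):
            if xref.get("ref-type") == "bibr":
                citations.append(
                    {"rid": xref.get("rid"), "context": _text(paragraph)}
                )

    return {
        "title": _text(root.find(".//article-title")),
        "authors": authors,
        "abstract": _text(root.find(".//abstract")),
        "citations": citations,
        "references": references,
    }
```

Each entry in `citations` carries the `rid` of the cited reference, so it can be looked up in `references` to pair a citation location with the work it cites.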

About

Fetches paper XML from the PMC website and extracts the citation locations from it.
