JaviAgua/CorpusSampling

CorpusSampling

Compare two text files and extract a random sample from the larger one.
01 Count the number of tokens and types in one file
02 Count the number of tokens and types in a second file
03 Compare the number of tokens and types in the two files
04 Extract a random sample from the larger file
05 Extract a random sample from the larger file using pandas
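
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes that tokens are lowercased, whitespace-separated strings, that the sampling unit is the line, and that the sample should have as many lines as the smaller text. The function names `count_tokens_types` and `sample_lines` and the two in-memory texts are hypothetical.

```python
import random


def count_tokens_types(text):
    """Return (number of tokens, number of types) for a text.

    Assumption: tokens are lowercased whitespace-separated strings;
    the repository may tokenize differently.
    """
    tokens = text.lower().split()
    return len(tokens), len(set(tokens))


def sample_lines(lines, k, seed=None):
    """Return up to k lines drawn at random without replacement.

    Assumption: the sampling unit is the line. Passing a seed makes
    the draw repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(lines, min(k, len(lines)))


# Hypothetical in-memory corpora standing in for the two text files.
small = "the cat sat\non the mat"
large = "a dog ran\nthe dog sat\na cat ran\nthe end\nno more"

# Steps 01-03: count tokens and types in each text and compare them.
small_counts = count_tokens_types(small)
large_counts = count_tokens_types(large)
print("small (tokens, types):", small_counts)
print("large (tokens, types):", large_counts)

# Step 04: sample as many lines from the larger text as the smaller one has.
sample = sample_lines(large.splitlines(), k=len(small.splitlines()), seed=0)
print(sample)
```

With real files, the two strings would instead be read from disk with `open(path, encoding="utf-8").read()`. For step 05, pandas' `DataFrame.sample(n=..., random_state=...)` provides a comparable draw once the lines are loaded into a DataFrame.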
