Commit

Exclude LaTeX figures and tables from word count
fschaffner committed May 30, 2019
1 parent 2d82282 commit ea2587b
Showing 5 changed files with 30 additions and 12 deletions.
Binary file added .DS_Store
Binary file not shown.
21 changes: 12 additions & 9 deletions DESCRIPTION
@@ -2,20 +2,23 @@ Package: wordcountaddin
 Type: Package
 Title: Word counts and readability statistics in R markdown documents
 Version: 0.3.0.9000
-Authors@R: c(person("Ben", "Marwick",
+Authors@R: c(person("Ben", "Marwick",
 email = "[email protected]",
 role = c("aut", "cre")),
-person("JooYoung", "Seo",
-email = "[email protected]",
+person("JooYoung", "Seo",
+email = "[email protected]",
 role = "ctb"),
-person("Henrik", "Bengtsson",
-email = "[email protected]",
-role = "ctb"))
+person("Henrik", "Bengtsson",
+email = "[email protected]",
+role = "ctb"),
+person("Florian S.", "Schaffner",
+email = "[email protected]",
+role = "ctb"))
 Maintainer: Ben Marwick <[email protected]>
-Description: An addin for RStudio that will count the words and characters
+Description: An addin for RStudio that will count the words and characters
 in a plain text document. It is designed for use with RMarkdown
-documents and will exclude YAML header content, code chunks and inline
-code from the counts. It also computes readability statistics so you can
+documents and will exclude YAML header content, code chunks and inline
+code from the counts. It also computes readability statistics so you can
 get an idea of how easy or difficult your text is to read.
 License: MIT + file LICENSE
 LazyData: TRUE
6 changes: 5 additions & 1 deletion R/hello.R
@@ -180,9 +180,13 @@ prep_text <- function(text){
 # don't include html tags
 text <- gsub("<.+?>|</.+?>", "", text)

-# don't include percent signs because they trip up stringi
+# don't include percent signs because they trip up stringi
 text <- gsub("%", "", text)

+# don't include figures and tables inserted using plain LaTeX code
+text <- gsub("\\\\begin\\{figure\\}(.*?)\\\\end\\{figure\\}", "", text)
+text <- gsub("\\\\begin\\{table\\}(.*?)\\\\end\\{table\\}", "", text)
+
 # don't include LaTeX \eggs{ham}
 # how to do? problem with capturing \x

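The two substitutions added to `prep_text()` can be tried in isolation. The following is a minimal sketch, assuming R's default regex engine as used in the commit; `strip_latex_floats()` and `sample_text` are hypothetical names introduced here for illustration and are not part of wordcountaddin.

```r
# Hypothetical helper (not in the package): applies only the two gsub()
# calls that this commit adds to prep_text().
strip_latex_floats <- function(text) {
  # Non-greedy (.*?) stops at the first matching \end{...}, so text between
  # two separate figure or table environments is preserved.
  text <- gsub("\\\\begin\\{figure\\}(.*?)\\\\end\\{figure\\}", "", text)
  text <- gsub("\\\\begin\\{table\\}(.*?)\\\\end\\{table\\}", "", text)
  text
}

# Illustrative input: three plain words around one figure and one table.
sample_text <- paste(
  "One \\begin{figure} \\caption{text} \\end{figure}",
  "Two \\begin{table} a & b \\end{table}",
  "Three"
)

stripped <- strip_latex_floats(sample_text)

# Split on whitespace to see which words survive.
words <- strsplit(trimws(stripped), "\\s+")[[1]]
print(words)
```

Because the patterns match the literal environment names `figure` and `table`, starred variants such as `\begin{figure*}` are left in the text and would still be counted.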
4 changes: 2 additions & 2 deletions man/text_stats.Rd

Some generated files are not rendered by default.

11 changes: 11 additions & 0 deletions tests/testthat/test_wordcountaddin.R
@@ -213,6 +213,17 @@ test_that("Word count is correct for text with % sign", {
 "|Word count |9 |9 |")
 })

+
+test_that("Word count is correct for text with figures included using LaTeX code", {
+# test that figure environments written in plain LaTeX are excluded from the count
+text_with_figures <- "One \\begin{figure} \\caption{text} \\label{text} \\includegraphics[width=\\textwidth]{figure.png} \\end{figure} Two \\begin{figure} \\caption{text} \\label{text} \\includegraphics[width=\\textwidth]{figure.png} \\end{figure} Three"
+
+text_stats_percent_chr_out <- text_stats_chr(text_with_figures)
+expect_equal(text_stats_percent_chr_out[3],
+"|Word count |3 |3 |")
+})
+
+
 test_that("Word count is a single integer for a Rmd file when using word_count", {
 # test that we can word count on a file
 the_rmd_word_count <- word_count(filename = test_path("test_wordcountaddin.Rmd"))