Commit 3064d8d

krivit and yihui authored
close #77: tidy_source(width.cutoff = I(number)) enforces the maximum line length instead of the minimum (#71)
Co-authored-by: Yihui Xie <[email protected]>
1 parent 437668a commit 3064d8d

6 files changed, with 106 additions and 19 deletions


DESCRIPTION

Lines changed: 3 additions & 1 deletion
@@ -4,9 +4,11 @@ Title: Format R Code Automatically
 Version: 1.8.2
 Authors@R: c(
     person("Yihui", "Xie", role = c("aut", "cre"), email = "[email protected]", comment = c(ORCID = "0000-0003-0645-5666")),
+    person("Ed", "Lee", role = "ctb"),
     person("Eugene", "Ha", role = "ctb"),
     person("Kohske", "Takahashi", role = "ctb"),
-    person("Ed", "Lee", role = "ctb")
+    person("Pavel", "Krivitsky", role = "ctb"),
+    person()
     )
 Description: Provides a function tidy_source() to format R source code. Spaces
     and indent will be added to the code automatically, and comments will be

NEWS

Lines changed: 11 additions & 0 deletions
@@ -5,6 +5,17 @@ NEW FEATURES
   o Lines will be wrapped after operators `%>%`, `%T%`, `%$%`, and `%<>%` now
     (thanks, @g4challenge #54, @jzelner #62, @edlee123 #68).

+  o The argument `width.cutoff` of `tidy_source()` used to be the lower bound of
+    line widths. Now if you pass a number wrapped in I(), it will be treated as
+    the upper bound, e.g., `tidy_source(width.cutoff = I(60))`. However, please
+    note that the upper bound cannot always be respected, e.g., when the code
+    contains an extremely long string, there is no way to break it into shorter
+    lines automatically (thanks, @krivit @pablo14, #71).
+
+  o The value of the argument `width.cutoff` can be specified in the global
+    option `formatR.width` now. By default, the value is still taken from the
+    global option `width` like before.
+
 BUG FIXES

   o When the text in the clipboard on macOS does not have a final EOL,
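
Note on the `width.cutoff` entry above: the "lower bound" behavior comes from `base::deparse()` and can be seen without formatR. The sketch below uses an illustrative expression and a cutoff of 20, neither of which is taken from the commit. `deparse()` only tries to break a line once it has grown past the cutoff, so the emitted lines can be wider than the value passed.

```r
# illustrative expression with made-up names; any call with several
# arguments would do
expr = quote(f(aaaaaaaa, bbbbbbbb, cccccccc, dddddddd, eeeeeeee))

# ask deparse() to try breaking lines at 20 characters
out = deparse(expr, width.cutoff = 20)
out = sub('\\s+$', '', out)  # drop trailing spaces before measuring

# width of the widest line; it can exceed 20 because the cutoff is the
# point where line breaking starts to be tried, not a hard limit
widest = max(nchar(out, type = 'width'))
print(out)
print(widest)
```

This is the behavior that `I()` is meant to work around: with a plain number the cutoff is passed to `deparse()` as is, so lines may overshoot it.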

R/tidy.R

Lines changed: 68 additions & 8 deletions
@@ -5,6 +5,15 @@
 #' \code{\link{deparse}}. It can also replace \code{=} with \code{<-} where
 #' \code{=} means assignments, and reindent code by a specified number of spaces
 #' (default is 4).
+#'
+#' If the value of the argument \code{width.cutoff} is wrapped in
+#' \code{\link{I}()} (e.g., \code{I(60)}), it will be treated as the \emph{upper
+#' bound} on the line width, but this upper bound may not be satisfied. In this
+#' case, the function will perform a binary search for a width value that can
+#' make \code{deparse()} return code with line width smaller than or equal to
+#' the \code{width.cutoff} value. If the search fails to find such a value, it
+#' will emit a warning, which can be suppressed by the global option
+#' \code{options(formatR.width.warning = FALSE)}.
 #' @param source a character string: location of the source code (default to be
 #' the clipboard; this means we can copy the code to clipboard and use
 #' \code{tidy_source()} without specifying the argument \code{source})
@@ -16,15 +25,17 @@
 #' @param indent number of spaces to indent the code (default 4)
 #' @param wrap whether to wrap comments to the linewidth determined by
 #' \code{width.cutoff} (note that roxygen comments will never be wrapped)
+#' @param width.cutoff Passed to \code{\link{deparse}()}: an integer in
+#' \code{[20, 500]} determining the cutoff at which line-breaking is tried
+#' (default to be \code{getOption("width")}). In other words, this is the
+#' \emph{lower bound} of the line width. See \sQuote{Details} if an upper
+#' bound is desired instead.
 #' @param output output to the console or a file using \code{\link{cat}}?
 #' @param text an alternative way to specify the input: if it is \code{NULL},
 #' the function will read the source code from the \code{source} argument;
 #' alternatively, if \code{text} is a character vector containing the source
 #' code, it will be used as the input and the \code{source} argument will be
 #' ignored
-#' @param width.cutoff passed to \code{\link{deparse}}: integer in [20, 500]
-#' determining the cutoff at which line-breaking is tried (default to be
-#' \code{getOption("width")})
 #' @param ... other arguments passed to \code{\link{cat}}, e.g. \code{file}
 #' (this can be useful for batch-processing R scripts, e.g.
 #' \code{tidy_source(source = 'input.R', file = 'output.R')})
@@ -47,8 +58,8 @@ tidy_source = function(
   brace.newline = getOption('formatR.brace.newline', FALSE),
   indent = getOption('formatR.indent', 4),
   wrap = getOption('formatR.wrap', TRUE),
-  output = TRUE, text = NULL,
-  width.cutoff = getOption('width'), ...
+  width.cutoff = getOption('formatR.width', getOption('width')),
+  output = TRUE, text = NULL, ...
 ) {
   if (is.null(text)) {
     if (source == 'clipboard' && Sys.info()['sysname'] == 'Darwin') {
@@ -72,6 +83,8 @@ tidy_source = function(
     n2 = attr(regexpr('\n*$', one), 'match.length')
   }
   on.exit(.env$line_break <- NULL, add = TRUE)
+  if (width.cutoff > 500) width.cutoff[1] = 500
+  if (width.cutoff < 20) width.cutoff[1] = 20
   # insert enough spaces into infix operators such as %>% so the lines can be
   # broken after the operators
   spaces = paste(rep(' ', max(10, width.cutoff)), collapse = '')
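
Note on the two clamping lines in this hunk: they assign to `width.cutoff[1]` rather than to `width.cutoff` itself. A plausible reason (my reading, not stated in the commit) is that element assignment keeps the `AsIs` class added by `I()`, which `tidy_block()` later tests with `inherits(width, 'AsIs')`. A minimal base-R sketch, where `clamp_width()` is a hypothetical helper and not part of formatR:

```r
# hypothetical helper mirroring the two clamping lines in tidy_source()
clamp_width = function(width.cutoff) {
  if (width.cutoff > 500) width.cutoff[1] = 500
  if (width.cutoff < 20) width.cutoff[1] = 20
  width.cutoff
}

w1 = clamp_width(I(600))  # upper-bound request, out of range
w2 = clamp_width(10)      # plain number, out of range
w3 = clamp_width(I(60))   # already in range, returned unchanged

# element assignment keeps attributes, so the AsIs marker survives
print(inherits(w1, 'AsIs'))
print(inherits(w2, 'AsIs'))

# whole-object assignment replaces the value and drops the marker
w4 = I(600)
w4 = 500
print(inherits(w4, 'AsIs'))
```

If the class were lost during clamping, an out-of-range `I()` value would silently fall back to the lower-bound behavior.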
@@ -94,16 +107,63 @@ begin.comment = '.BeGiN_TiDy_IdEnTiFiEr_HaHaHa'
 end.comment = '.HaHaHa_EnD_TiDy_IdEnTiFiEr'
 pat.comment = sprintf('invisible\\("\\%s|\\%s"\\)', begin.comment, end.comment)
 mat.comment = sprintf('invisible\\("\\%s([^"]*)\\%s"\\)', begin.comment, end.comment)
-inline.comment = ' %InLiNe_IdEnTiFiEr%[ ]*"([ ]*#[^"]*)"'
+inline.comment = ' %\b%[ ]*"([ ]*#[^"]*)"'
 blank.comment = sprintf('invisible("%s%s")', begin.comment, end.comment)
 blank.comment2 = sprintf('(\n)\\s+invisible\\("%s%s"\\)(\n|$)', begin.comment, end.comment)

+# first, perform a (semi-)binary search to find the greatest cutoff width such
+# that the width of the longest line <= `width`; if the search fails, use
+# brute-force to try all possible widths
+deparse2 = function(expr, width, warn = getOption('formatR.width.warning', TRUE)) {
+  wmin = 20 # if deparse() can't manage it with width.cutoff <= 20, issue a warning
+  wmax = min(500, width + 10) # +10 because a larger width may result in smaller actual width
+
+  r = seq(wmin, wmax)
+  k = setNames(rep(NA, length(r)), as.character(r)) # results of width checks
+  d = p = list() # deparsed results and lines exceeding desired width
+
+  check_width = function(w) {
+    i = as.character(w)
+    if (!is.na(x <- k[i])) return(x)
+    x = deparse(expr, w)
+    x = gsub('\\s+$', '', x)
+    d[[i]] <<- x
+    x2 = grep(pat.comment, x, invert = TRUE, value = TRUE) # don't check comments
+    p[[i]] <<- x2[nchar(x2, type = 'width') > width]
+    k[i] <<- length(p[[i]]) == 0
+  }
+
+  # if the desired width happens to just work, return the result
+  if (check_width(w <- width)) return(d[[as.character(w)]])
+
+  repeat {
+    if (!any(is.na(k))) break # has tried all possibilities
+    if (wmin >= wmax) break
+    w = ceiling((wmin + wmax)/2)
+    if (check_width(w)) wmin = w else wmax = wmax - 2
+  }
+
+  # try all the rest of widths if no suitable width has been found
+  if (!any(k, na.rm = TRUE)) for (i in r[is.na(k)]) check_width(i)
+  r = r[which(k)]
+  if ((n <- length(r)) > 0) return(d[[as.character(r[n])]])
+
+  i = as.character(width)
+  if (warn) warning(
+    'Unable to find a suitable cut-off to make the line widths smaller than ',
+    width, ' for the line(s) of code:\n', paste0(' ', p[[i]], collapse = '\n'),
+    call. = FALSE
+  )
+  d[[i]]
+}
+
 # wrapper around parse() and deparse()
 tidy_block = function(text, width = getOption('width'), arrow = FALSE) {
   exprs = parse_only(text)
   if (length(exprs) == 0) return(character(0))
   exprs = if (arrow) replace_assignment(exprs) else as.list(exprs)
-  sapply(exprs, function(e) paste(base::deparse(e, width), collapse = '\n'))
+  deparse = if (inherits(width, 'AsIs')) deparse2 else base::deparse
+  sapply(exprs, function(e) paste(deparse(e, width), collapse = '\n'))
 }

 # Restore the real source code from the masked text
109169
# Restore the real source code from the masked text
@@ -113,7 +173,7 @@ unmask_source = function(text.mask, spaces) {
   if (!is.null(m)) text.mask = gsub(m, '\n', text.mask)
   ## if the comments were separated into the next line, then remove '\n' after
   ## the identifier first to move the comments back to the same line
-  text.mask = gsub('%InLiNe_IdEnTiFiEr%[ ]*\n', '%InLiNe_IdEnTiFiEr%', text.mask)
+  text.mask = gsub('(%\b%)[ ]*\n', '\\1', text.mask)
   ## move 'else ...' back to the last line
   text.mask = gsub('\n\\s*else(\\s+|$)', ' else\\1', text.mask)
   if (any(grepl('\\\\\\\\', text.mask)) &&
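
Note on `deparse2()` in this file: the idea of searching for a cutoff that satisfies an upper bound can be approximated by a much simpler linear scan. The sketch below is not the commit's semi-binary search. `fit_width()` is a hypothetical helper that tries cutoffs from the requested width down to 20 and returns the first `deparse()` result whose lines all fit, or `NULL` when none does. It uses only base R and ignores the masked comments that `deparse2()` skips.

```r
# hypothetical helper: a simplified linear scan, not the commit's search
fit_width = function(expr, width, wmin = 20) {
  for (w in seq(max(wmin, min(500, width)), wmin)) {
    x = sub('\\s+$', '', deparse(expr, width.cutoff = w))
    if (all(nchar(x, type = 'width') <= width)) return(x)
  }
  NULL  # no cutoff in [wmin, width] satisfies the upper bound
}

# illustrative expression with made-up names
expr = quote(f(aaaaaaaa, bbbbbbbb, cccccccc, dddddddd, eeeeeeee))
res = fit_width(expr, 30)
print(res)

# a single long string cannot be broken, so the bound cannot be met
long = fit_width(
  quote(x <- "a string that is much longer than twenty characters"), 20
)
print(is.null(long))
```

The scan visits every cutoff in the worst case, which is the cost the commit's search tries to reduce before falling back to brute force.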

R/utils.R

Lines changed: 2 additions & 2 deletions
@@ -62,7 +62,7 @@ mask_comments = function(x, width, keep.blank.line, wrap = TRUE, spaces) {
   # mask block and inline comments
   d.text[c1 & !c3] = reflow_comments(d.text[c1 & !c3], width)
   d.text[c3] = sprintf('invisible("%s%s%s")', begin.comment, d.text[c3], end.comment)
-  d.text[c2] = sprintf('%%InLiNe_IdEnTiFiEr%% "%s"', d.text[c2])
+  d.text[c2] = sprintf('%%\b%% "%s"', d.text[c2])

   # add blank lines
   if (keep.blank.line) for (i in seq_along(d.text)) {
@@ -100,7 +100,7 @@ mask_inline = function(x) {
     p = paste('{\ninvisible("', begin.comment, '\\1', end.comment, '")', sep = '')
     x[idx] = gsub('\\{\\s*(#.*)$', p, x[idx])
   }
-  gsub('(#[^"]*)$', ' %InLiNe_IdEnTiFiEr% "\\1"', x)
+  gsub('(#[^"]*)$', ' %\b% "\\1"', x)
 }

 # reflow comments (excluding roxygen comments)
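
Note on the `%\b%` marker used in this file: the masking round trip can be sketched in base R. This is a simplified illustration, not formatR's actual pipeline. The sample line is made up, and the patterns are adapted from the `gsub()` call in `mask_inline()` and from `inline.comment` in R/tidy.R. In an R string literal, `'\b'` is a single backspace character, so the marker is three characters long, against 19 for `%InLiNe_IdEnTiFiEr%`.

```r
# the marker operator from the commit: '%', a backspace character, '%'
op = '%\b%'
print(nchar(op))                     # number of characters in the new marker
print(nchar('%InLiNe_IdEnTiFiEr%'))  # number of characters in the old marker

code = 'x = 1 + 1  # add one and one'  # made-up input line

# mask: turn the trailing comment into a string operand of the marker
masked = gsub('(#[^"]*)$', paste0(' ', op, ' "\\1"'), code)

# parse() accepts the masked line, and deparse() keeps the string operand
expr = parse(text = masked, keep.source = FALSE)[[1]]
out = paste(deparse(expr), collapse = '\n')

# unmask: drop the marker and the quotes to free the comment
restored = gsub(paste0(' ', op, '[ ]*"([ ]*#[^"]*)"'), '  \\1', out)
print(restored)
```

Because the masked comment travels through `deparse()` as part of the code line, a shorter marker adds fewer characters to the line being measured.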

man/tidy_source.Rd

Lines changed: 17 additions & 5 deletions
Some generated files are not rendered by default.

vignettes/formatR.Rmd

Lines changed: 5 additions & 3 deletions
@@ -273,10 +273,10 @@ will become
 Inline comments are first disguised as a weird operation with its preceding R code, which is essentially meaningless but syntactically correct! For example,

 ```r
-1+1 %InLiNe_IdEnTiFiEr% "# comments"
+1+1 %\b% "# comments"
 ```

-then `base::parse()` will deal with this expression; again, the disguised comments will not be removed. In the end, inline comments will be freed as well (remove the operator `%InLiNe_IdEnTiFiEr%` and surrounding double quotes).
+then `base::parse()` will deal with this expression; again, the disguised comments will not be removed. In the end, inline comments will be freed as well (remove the operator `%\b%` and surrounding double quotes).

 All these special treatments to comments are due to the fact that `base::parse()` and `base::deparse()` can tidy the R code at the price of dropping all the comments.

@@ -290,6 +290,8 @@ There are global options which can override some arguments in `tidy_source()`:
 | `blank` | `options('formatR.blank')` | `TRUE` |
 | `arrow` | `options('formatR.arrow')` | `FALSE` |
 | `indent` | `options('formatR.indent')` | `4` |
+| `wrap` | `options('formatR.wrap')` | `TRUE` |
+| `width.cutoff` | `options('formatR.width')` | `options('width')` |
 | `brace.newline` | `options('formatR.brace.newline')` | `FALSE` |

-Also note that single lines of long comments will be wrapped into shorter ones automatically, but roxygen comments will not be wrapped (i.e., comments that begin with `#'`).
+Also note that single lines of long comments will be wrapped into shorter ones automatically when `wrap = TRUE`, but roxygen comments will not be wrapped (i.e., comments that begin with `#'`).
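
Note on the new `width.cutoff` row in the options table: the fallback it describes is the nested `getOption()` call in the new default argument of `tidy_source()`. The sketch below reproduces that expression in base R. `default_cutoff()` is a hypothetical helper, and the values 80 and 60 are illustrative.

```r
# hypothetical helper reproducing the new default of `width.cutoff`
default_cutoff = function() getOption('formatR.width', getOption('width'))

# start from a known state and remember the previous settings
old = options(width = 80, formatR.width = NULL)
a = default_cutoff()  # formatR.width is unset: falls back to `width`

options(formatR.width = 60)
b = default_cutoff()  # formatR.width takes precedence once it is set

options(old)  # restore the previous settings
print(c(a, b))
```

Setting `formatR.width` therefore changes formatR's default without touching the console width that other functions read from `width`.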
