
Commit afc12a2

Quick Save
1 parent 80145a6 commit afc12a2


84 files changed, with 875 additions and 373 deletions


CITATION.cff

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
-# YAML 1.2
 cff-version: 1.2.0
 message: "If you use this software, please cite it as below."
 title: datatools

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 
-Copyright (c) 2021, Caltech
+Copyright (c) 2022, Caltech
 All rights not granted herein are expressly reserved by Caltech.
 
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

TODO.html

Lines changed: 93 additions & 41 deletions
@@ -30,7 +30,8 @@
 <a href="how-to/">How To</a>
 </li>
 <li>
-<a href="https://github.com/caltechlibrary/datatools">Github</a>
+<a
+href="https://github.com/caltechlibrary/datatools">Github</a>
 </li>
 </ul>
 </nav>
@@ -44,46 +45,58 @@ <h2 id="bug">
 </h2>
 <ul class="task-list">
 <li>
-<input type="checkbox" disabled="" /> findfile v0.0.23-pre option -f, -full-path doesn’t return full paths
+<input type="checkbox" disabled="" /> findfile v0.0.23-pre option -f,
+-full-path doesn’t return full paths
 </li>
 </ul>
 <h2 id="next">
 Next
 </h2>
 <ul class="task-list">
 <li>
-<input type="checkbox" disabled="" /> upgrade to use the new cli v0.0.5-dev
+<input type="checkbox" disabled="" /> upgrade to use the new cli
+v0.0.5-dev
 </li>
 <li>
-<input type="checkbox" disabled="" /> csvrows would output a range of rows (e.g. [2:] would be all rows but the first row)
+<input type="checkbox" disabled="" /> csvrows would output a range of
+rows (e.g. [2:] would be all rows but the first row)
 </li>
 <li>
-<input type="checkbox" disabled="" /> csv utilities should support integer ranges notation for columns and rows references, E.g. “1,3:4,7,10:” or all
+<input type="checkbox" disabled="" /> csv utilities should support
+integer ranges notation for columns and rows references, E.g.
+“1,3:4,7,10:” or all
 </li>
 </ul>
 <h2 id="someday-maybe">
 Someday, Maybe
 </h2>
 <ul class="task-list">
 <li>
-<input type="checkbox" disabled="" /> finddir should have an option to exclude directories (e.g. exclude .git directories from a listing)
+<input type="checkbox" disabled="" /> finddir should have an option to
+exclude directories (e.g. exclude .git directories from a listing)
 </li>
 <li>
-<input type="checkbox" disabled="" /> textscraper - a tool for select out text and storing it as a JSON field value, sort grep plus sed cleanup and semi-structured text (e.g. webpage)
+<input type="checkbox" disabled="" /> textscraper - a tool for select
+out text and storing it as a JSON field value, sort grep plus sed
+cleanup and semi-structured text (e.g. webpage)
 <ul>
 <li>
-look at how cut, sed, grep are commonly used in my scripts and merge that functionality into a single tool
+look at how cut, sed, grep are commonly used in my scripts and merge
+that functionality into a single tool
 </li>
 </ul>
 </li>
 <li>
-<input type="checkbox" disabled="" /> csvcols, csvrows should have a length option to give you a number of columns or rows respectively
+<input type="checkbox" disabled="" /> csvcols, csvrows should have a
+length option to give you a number of columns or rows respectively
 </li>
 <li>
-<input type="checkbox" disabled="" /> csvcols, csvrows should have a filter option to filter to support filting output conditionally
+<input type="checkbox" disabled="" /> csvcols, csvrows should have a
+filter option to filter to support filting output conditionally
 </li>
 <li>
-<input type="checkbox" disabled="" /> csvsort should allow a multi-column sort respecting column headings
+<input type="checkbox" disabled="" /> csvsort should allow a
+multi-column sort respecting column headings
 <ul>
 <li>
 plus column number would be ascending by that column
@@ -95,74 +108,95 @@ <h2 id="someday-maybe">
 sort would be read from left to right
 </li>
 <li>
-it would be good to include support for column names and not just column numbers to describe the sort
+it would be good to include support for column names and not just column
+numbers to describe the sort
 </li>
 </ul>
 </li>
 <li>
-<input type="checkbox" disabled="" /> jsonmodify takes a JSON document, a dotpath and value then creates/updates the dotpath in the JSON document with the new value
+<input type="checkbox" disabled="" /> jsonmodify takes a JSON document,
+a dotpath and value then creates/updates the dotpath in the JSON
+document with the new value
 <ul>
 <li>
 “(delete DOTPATH)” would remove the property described by the dotpath
 </li>
 <li>
-“(update DOTPATH NEW_VALUE)” would replace the property described by the dotpath with a new value (value can be a string, number, or JSON)
+“(update DOTPATH NEW_VALUE)” would replace the property described by the
+dotpath with a new value (value can be a string, number, or JSON)
 </li>
 <li>
-“(create” DOTPATH NEW_VALUE)" would add a new property at the described dotpath with a new value (value can be a string, number, or JSON)
+“(create” DOTPATH NEW_VALUE)” would add a new property at the described
+dotpath with a new value (value can be a string, number, or JSON)
 </li>
 <li>
-“(join DOTH_PATH SEP)” combines JSON array elements into a string version using separator
+“(join DOTH_PATH SEP)” combines JSON array elements into a string
+version using separator
 </li>
 <li>
-“(concat DOTPATH1 DOTPATH2… SEP)” combines values into a concatenated string, it takes one or more dotpath values (must be string or number) and return them as a concatenated value (concat .last_name .first_name “,”) would return a last name comma first name string.
+“(concat DOTPATH1 DOTPATH2… SEP)” combines values into a concatenated
+string, it takes one or more dotpath values (must be string or number)
+and return them as a concatenated value (concat .last_name .first_name
+“,”) would return a last name comma first name string.
 </li>
 <li>
-“(split DOTH_PATH SEP)” turns a string into an array of strings using separator
+“(split DOTH_PATH SEP)” turns a string into an array of strings using
+separator
 </li>
 </ul>
 </li>
 <li>
-<input type="checkbox" disabled="" /> csvcols, csvrows should have a filter mechanism should provide a mechanism to filter by column or row
+<input type="checkbox" disabled="" /> csvcols, csvrows should have a
+filter mechanism should provide a mechanism to filter by column or row
 <ul>
 <li>
-using a prefix notation (e.g. ‘(and (eq (join (cols (colNo “Last Name”) (colNo “First Name”)) “,”) “Doiel, R. S.”) (gt (cols 4) “2017-06-12”))’)
+using a prefix notation (e.g. ‘(and (eq (join (cols (colNo “Last Name”)
+(colNo “First Name”)) “,”) “Doiel, R. S.”) (gt (cols 4) “2017-06-12”))’)
 </li>
 </ul>
 </li>
 <li>
-<input type="checkbox" disabled="" /> csvfind, csvjoin should have an inverted match operation
+<input type="checkbox" disabled="" /> csvfind, csvjoin should have an
+inverted match operation
 </li>
 <li>
-<input type="checkbox" disabled="" /> a range should accept the word “all” as well as comma delimited list of rows and ranges
+<input type="checkbox" disabled="" /> a range should accept the word
+“all” as well as comma delimited list of rows and ranges
 </li>
 <li>
-<input type="checkbox" disabled="" /> Add -uuid and -skip-header-row options constistantly to all csv tools
+<input type="checkbox" disabled="" /> Add -uuid and -skip-header-row
+options constistantly to all csv tools
 <ul class="task-list">
 <li>
 <input type="checkbox" disabled="" /> csvcols
 </li>
 </ul>
 </li>
 <li>
-<input type="checkbox" disabled="" /> unify the options vocabulary to work the same between each cli
+<input type="checkbox" disabled="" /> unify the options vocabulary to
+work the same between each cli
 <ul>
 <li>
 Need a common approach to column ranges in csvcols, csvfind, csvjoin
 </li>
 <li>
-csv2json, csv2mdtable, csv2xlsx should accept a column and row range option for output
+csv2json, csv2mdtable, csv2xlsx should accept a column and row range
+option for output
 </li>
 </ul>
 </li>
 <li>
-<input type="checkbox" disabled="" /> csvfind add filter by row number (helpful when combined with csvcols for snapshotting the middle of a table)
+<input type="checkbox" disabled="" /> csvfind add filter by row number
+(helpful when combined with csvcols for snapshotting the middle of a
+table)
 </li>
 <li>
-<input type="checkbox" disabled="" /> csv2json should have an option that will include a row number in JSON blob output
+<input type="checkbox" disabled="" /> csv2json should have an option
+that will include a row number in JSON blob output
 </li>
 <li>
-<input type="checkbox" disabled="" /> csv2json should have the options to normalize property names in JSON objects
+<input type="checkbox" disabled="" /> csv2json should have the options
+to normalize property names in JSON objects
 <ul>
 <li>
 camel case
@@ -185,19 +219,27 @@ <h2 id="someday-maybe">
 </ul>
 </li>
 <li>
-<input type="checkbox" disabled="" /> csvrotate would take a CSV file as import and output columns as rows
+<input type="checkbox" disabled="" /> csvrotate would take a CSV file as
+import and output columns as rows
 </li>
 <li>
-<input type="checkbox" disabled="" /> smartcat would function like cat but with support for ranges of lines (e.g. show me last 20 lines: smartcat -start=0 -end=“-20” file.txt; cat starting with 10th line: smartcat -start=10 file.txt)
+<input type="checkbox" disabled="" /> smartcat would function like cat
+but with support for ranges of lines (e.g. show me last 20 lines:
+smartcat -start=0 -end=“-20” file.txt; cat starting with 10th line:
+smartcat -start=10 file.txt)
 <ul class="task-list">
 <li>
-<input type="checkbox" disabled="" /> allow prefix line number with a specific delimiter (E.g. comma would let you cat a CSV file adding row numbers as first column)
+<input type="checkbox" disabled="" /> allow prefix line number with a
+specific delimiter (E.g. comma would let you cat a CSV file adding row
+numbers as first column)
 </li>
 <li>
-<input type="checkbox" disabled="" /> show lines with prefix, suffix, containing or regxp
+<input type="checkbox" disabled="" /> show lines with prefix, suffix,
+containing or regxp
 </li>
 <li>
-<input type="checkbox" disabled="" /> show lines without prefix, suffix, containing or regexp
+<input type="checkbox" disabled="" /> show lines without prefix, suffix,
+containing or regexp
 </li>
 </ul>
 </li>
@@ -207,32 +249,42 @@ <h2 id="completed">
 </h2>
 <ul class="task-list">
 <li>
-<input type="checkbox" disabled="" checked="" /> consolidate string utilities (e.g. toupper, tolower, totitle) into string cli
+<input type="checkbox" disabled="" checked="" /> consolidate string
+utilities (e.g. toupper, tolower, totitle) into string cli
 </li>
 <li>
-<input type="checkbox" disabled="" checked="" /> csvcols -col option should not be a boolean, it should take a range like other csv cli
+<input type="checkbox" disabled="" checked="" /> csvcols -col option
+should not be a boolean, it should take a range like other csv cli
 </li>
 <li>
-<input type="checkbox" disabled="" checked="" /> utilities should use starting index of 1 instead of zero as humans refer to column 1 when intending to work on the first column
+<input type="checkbox" disabled="" checked="" /> utilities should use
+starting index of 1 instead of zero as humans refer to column 1 when
+intending to work on the first column
 </li>
 <li>
-<input type="checkbox" disabled="" checked="" /> for all cli the -delimiter option should support special characters like , /li&gt;
+<input type="checkbox" disabled="" checked="" /> for all cli the
+-delimiter option should support special characters like ,
+</li>
 <li>
-<input type="checkbox" disabled="" checked="" /> csvfind would accept CSV input from stdin and output rows with matching column values
+<input type="checkbox" disabled="" checked="" /> csvfind would accept
+CSV input from stdin and output rows with matching column values
 <ul>
 <li>
-E.g. <code>cat file1.csv | csvfind -levenshtein -stop-words=“the:a:of” -col=1 “This Red Book of West March”</code>
+E.g. <code>cat file1.csv | csvfind -levenshtein -stop-words=“the:a:of”
+-col=1 “This Red Book of West March”</code>
 </li>
 <li>
-E.g. <code>cat file1.csv | csvfind -inverted -levenstein -stop-words=“the:a:of” -col=1 “This Red Book of West March”</code>
+E.g. <code>cat file1.csv | csvfind -inverted -levenstein
+-stop-words=“the:a:of” -col=1 “This Red Book of West March”</code>
 </li>
 <li>
 E.g. <code>cat file1.csv | csvfind -contains -col=1 “Red Book”</code>
 </li>
 </ul>
 </li>
 <li>
-<input type="checkbox" disabled="" checked="" /> csvjoin should have option for fuzzy match on columns (e.g. comparing titles)
+<input type="checkbox" disabled="" checked="" /> csvjoin should have
+option for fuzzy match on columns (e.g. comparing titles)
 </li>
 </ul>
 </section>

docs/csv2json/index.html

Lines changed: 5 additions & 2 deletions
@@ -30,7 +30,8 @@
 <a href="../../how-to/">How To</a>
 </li>
 <li>
-<a href="https://github.com/caltechlibrary/datatools">Github</a>
+<a
+href="https://github.com/caltechlibrary/datatools">Github</a>
 </li>
 </ul>
 </nav>
@@ -44,7 +45,9 @@ <h2 id="description">
 DESCRIPTION
 </h2>
 <p>
-csv2json reads CSV from stdin and writes a JSON to stdout. JSON output can be either an array of JSON blobs or one JSON blob (row as object) per line.
+csv2json reads CSV from stdin and writes a JSON to stdout. JSON output
+can be either an array of JSON blobs or one JSON blob (row as object)
+per line.
 </p>
 <h2 id="options">
 OPTIONS

docs/csv2mdtable/index.html

Lines changed: 4 additions & 2 deletions
@@ -30,7 +30,8 @@
 <a href="../../how-to/">How To</a>
 </li>
 <li>
-<a href="https://github.com/caltechlibrary/datatools">Github</a>
+<a
+href="https://github.com/caltechlibrary/datatools">Github</a>
 </li>
 </ul>
 </nav>
@@ -44,7 +45,8 @@ <h2 id="description">
 DESCRIPTION
 </h2>
 <p>
-csv2mdtable reads CSV from stdin and writes a Github Flavored Markdown table to stdout.
+csv2mdtable reads CSV from stdin and writes a Github Flavored Markdown
+table to stdout.
 </p>
 <h2 id="options">
 OPTIONS

docs/csv2xlsx/index.html

Lines changed: 8 additions & 4 deletions
@@ -30,7 +30,8 @@
 <a href="../../how-to/">How To</a>
 </li>
 <li>
-<a href="https://github.com/caltechlibrary/datatools">Github</a>
+<a
+href="https://github.com/caltechlibrary/datatools">Github</a>
 </li>
 </ul>
 </nav>
@@ -44,7 +45,8 @@ <h2 id="description">
 DESCRIPTION
 </h2>
 <p>
-csv2xlsx will take CSV input and create a new sheet in an Excel Workbook. If the Workbook does not exist then it is created.
+csv2xlsx will take CSV input and create a new sheet in an Excel
+Workbook. If the Workbook does not exist then it is created.
 </p>
 <h2 id="options">
 OPTIONS
@@ -74,11 +76,13 @@ <h2 id="examples">
 </p>
 <pre><code>csv2xlsx -i data.csv MyWorkbook.xlsx &#39;My worksheet 1&#39;</code></pre>
 <p>
-This creates a new ‘My worksheet 1’ in the Excel Workbook called ‘MyWorkbook.xlsx’ with the contents of data.csv.
+This creates a new ‘My worksheet 1’ in the Excel Workbook called
+‘MyWorkbook.xlsx’ with the contents of data.csv.
 </p>
 <pre><code>cat data.csv | csv2xlsx MyWorkbook.xlsx &#39;My worksheet 2&#39;</code></pre>
 <p>
-This does the same but the contents of data.csv are piped into the workbook’s ‘My worksheet 2’ sheet.
+This does the same but the contents of data.csv are piped into the
+workbook’s ‘My worksheet 2’ sheet.
 </p>
 <p>
 csv2xlsx v0.0.25
