Commit 3679b66

Merge pull request #101 from queryverse/remove-xlsx-support
Remove support for XLSX files
2 parents 7285836 + 73f8f89 commit 3679b66

9 files changed: 138 additions, 140 deletions

NEWS.md

Lines changed: 2 additions & 1 deletion
@@ -1,6 +1,7 @@
-# ExcelReaders.jl v1.0.0 Release Notes
+# ExcelReaders.jl v0.12.0 Release Notes
 * Drop julia 0.7 support
 * Migrate to Project.toml
+* Drop support for modern Excel files, this package now only supports legacy xls files
 
 # ExcelReaders.jl v0.11.0 Release Notes
 * Update to PyCall.jl 1.90.0
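
Since this release only reads legacy `.xls` files, code that accepts arbitrary Excel paths may want to check the extension before calling `openxl`. The following is a minimal sketch, not part of ExcelReaders.jl: the helper name `excel_backend` and the returned symbols are assumptions made for illustration, and the check looks only at the file name, not at the file contents.

```julia
# Hypothetical helper, not part of ExcelReaders.jl: decide which package
# should handle a path, based only on its extension.
function excel_backend(path::AbstractString)
    ext = lowercase(splitext(path)[2])   # e.g. ".xls" or ".xlsx"
    if ext == ".xls"
        return :ExcelReaders             # legacy format, still supported here
    elseif ext == ".xlsx"
        return :XLSX                     # modern format, see XLSX.jl
    else
        throw(ArgumentError("not a recognized Excel extension: $ext"))
    end
end
```

A caller could branch on the returned symbol and load the corresponding package only when it is needed.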

Project.toml

Lines changed: 4 additions & 1 deletion
@@ -6,14 +6,17 @@ version = "1.0.0-DEV"
 DataValues = "e7dc6d0d-1eca-5fa6-8ad6-5aecde8b7ea5"
 Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
 PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
+Conda = "8f4d0f93-b110-5947-807f-2305c1781a2d"
 
 [extras]
 Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
+TestItemRunner = "f8b46487-2199-4994-9208-9a1283c18c0a"
 
 [targets]
-test = ["Test"]
+test = ["Test", "TestItemRunner"]
 
 [compat]
 DataValues = "0.4.4"
 PyCall = "1.90"
+Conda = "1 - 1.8.0"
 julia = "1.6"
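
The new `[compat]` entry `Conda = "1 - 1.8.0"` is a hyphen range. Under Pkg's compatibility rules, as I understand them, it admits every Conda.jl version from 1.0.0 up to and including 1.8.0. The sketch below spells that reading out with plain `VersionNumber` comparisons. The function name `conda_compat_ok` is hypothetical and Pkg itself is not called.

```julia
# Hypothetical check mirroring the hyphen range "1 - 1.8.0":
# every version from 1.0.0 up to and including 1.8.0 is allowed.
conda_compat_ok(v::VersionNumber) = v"1.0.0" <= v <= v"1.8.0"
```

Under this reading, Conda.jl 1.9 or later would not satisfy the bound, which caps the dependency rather than leaving it open-ended.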

README.md

Lines changed: 10 additions & 12 deletions
@@ -7,13 +7,9 @@
 
 ExcelReaders is a package that provides functionality to read Excel files.
 
-**WARNING**: Version v0.9.0 removed all support for [DataFrames.jl](https://github.com/JuliaData/DataFrames.jl)
-from this package. The [ExcelFiles.jl](https://github.com/queryverse/ExcelFiles.jl)
-package now provides functionality to read data from an Excel file into
-a ``DataFrame`` (or any other table type), and users are encouraged to use
-that package for tabular data going forward. Version v0.9.0 also no longer
-uses [DataArrays.jl](https://github.com/JuliaStats/DataArrays.jl), but instead
-is based on [DataValues.jl](https://github.com/queryverse/DataValues.jl).
+**WARNING**: Version v0.12 removed support for modern Excel files. This package is now _only_ supporting legacy xls files. The reason for this is that the underlying Python package made that move a couple of years ago as well.
+
+The [XLSX.jl](https://github.com/felipenoris/XLSX.jl) provides excellent support for modern Excel files.
 
 ## Installation
 

@@ -23,6 +19,8 @@ The package uses the Python xlrd library. If either Python or the xlrd package a
 
 ## Alternatives
 
+The [XLSX.jl](https://github.com/felipenoris/XLSX.jl) provides excellent support for modern Excel files.
+
 The [Taro](https://github.com/aviks/Taro.jl) package also provides Excel file reading functionality. The main difference between the two packages (in terms of Excel functionality) is that ExcelReaders uses the Python package [xlrd](https://github.com/python-excel/xlrd) for its processing, whereas Taro uses the Java packages Apache [Tika](http://tika.apache.org/) and Apache [POI](http://poi.apache.org/).
 
 ## Basic usage
@@ -32,17 +30,17 @@ The most basic usage is this:
 ````julia
 using ExcelReaders
 
-data = readxl("Filename.xlsx", "Sheet1!A1:C4")
+data = readxl("Filename.xls", "Sheet1!A1:C4")
 ````
 
-This will return an array with all the data in the cell range A1 to C4 on Sheet1 in the Excel file Filename.xlsx.
+This will return an array with all the data in the cell range A1 to C4 on Sheet1 in the Excel file Filename.xls.
 
 If you expect to read multiple ranges from the same Excel file you can get much better performance by opening the Excel file only once:
 
 ````julia
 using ExcelReaders
 
-f = openxl("Filename.xlsx")
+f = openxl("Filename.xls")
 
 data1 = readxl(f, "Sheet1!A1:C4")
 data2 = readxl(f, "Sheet2!B4:F10")
@@ -55,10 +53,10 @@ The ``readxlsheet`` function reads complete Excel sheets, without a need to spec
 ````julia
 using ExcelReaders
 
-data = readxlsheet("Filename.xlsx", "Sheet1")
+data = readxlsheet("Filename.xls", "Sheet1")
 ````
 
-This will read all content on Sheet1 in the file Filename.xlsx. Eventual blank rows and columns at the top and left are skipped. ``readxlsheet`` takes a number of optional keyword arguments:
+This will read all content on Sheet1 in the file Filename.xls. Eventual blank rows and columns at the top and left are skipped. ``readxlsheet`` takes a number of optional keyword arguments:
 
 - ``skipstartrows`` accepts either ``:blanks`` (default) or a positive integer. With ``:blank`` any empty initial rows are skipped. An integer skips as many rows as specified.
 - ``skipstartcols`` accepts either ``:blanks`` (default) or a positive integer. With ``:blank`` any empty initial columns are skipped. An integer skips as many columns as specified.
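
The `:blanks` behaviour described in the README context above can be illustrated without the package. The sketch below is not ExcelReaders' implementation. The function name `skip_leading_blanks` is hypothetical, and `nothing` stands in for an empty cell (an assumption for illustration only). It drops leading all-blank rows and columns from a matrix and raises an error for a fully blank sheet.

```julia
# Hypothetical illustration of the `:blanks` default, not the library's code.
# `nothing` is used here as a stand-in for an empty cell.
function skip_leading_blanks(m::AbstractMatrix)
    isblank(x) = x === nothing
    # First row and first column that contain at least one non-blank cell.
    r = findfirst(i -> !all(isblank, view(m, i, :)), axes(m, 1))
    c = findfirst(j -> !all(isblank, view(m, :, j)), axes(m, 2))
    (r === nothing || c === nothing) && error("sheet is empty")
    return m[r:end, c:end]
end
```

Passing an integer for ``skipstartrows`` or ``skipstartcols`` would instead correspond to slicing from a fixed offset rather than searching for the first non-blank row or column.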

src/ExcelReaders.jl

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@ file only once with ``openxl``.
 
 # Example
 ````julia
-f = openxl("filename.xlsx")
+f = openxl("filename.xls")
 data = readxl(f, "Sheet1!A1:C4")
 ````
 """

src/package_documentation.jl

Lines changed: 5 additions & 5 deletions
@@ -30,18 +30,18 @@ The most basic usage is this:
 
 ````julia
 using ExcelReaders
-data = readxl("Filename.xlsx", "Sheet1!A1:C4")
+data = readxl("Filename.xls", "Sheet1!A1:C4")
 ````
 
 This will return an array with all the data in the cell range A1 to
-C4 on Sheet1 in the Excel file Filename.xlsx.
+C4 on Sheet1 in the Excel file Filename.xls.
 
 If you expect to read multiple ranges from the same Excel file you can get much
 better performance by opening the Excel file only once:
 
 ````julia
 using ExcelReaders
-f = openxl("Filename.xlsx")
+f = openxl("Filename.xls")
 data1 = readxl(f, "Sheet1!A1:C4")
 data2 = readxl(f, "Sheet2!B4:F10")
 ````
@@ -53,10 +53,10 @@ specify precise range information. The most basic usage is
 
 ````julia
 using ExcelReaders
-data = readxlsheet("Filename.xlsx", "Sheet1")
+data = readxlsheet("Filename.xls", "Sheet1")
 ````
 
-This will read all content on Sheet1 in the file Filename.xlsx. Eventual blank
+This will read all content on Sheet1 in the file Filename.xls. Eventual blank
 rows and columns at the top and left are skipped. ``readxlsheet`` takes a number
 of optional keyword arguments:
 

test/TestData.xls

28.5 KB
Binary file not shown.

test/TestData.xlsx

-17.4 KB
Binary file not shown.

test/runtests.jl

Lines changed: 2 additions & 120 deletions
@@ -1,121 +1,3 @@
-using ExcelReaders
-using Dates
-using PyCall
-using DataValues
-using Test
+using TestItemRunner
 
-@testset "ExcelReaders" begin
-
-    # TODO Throw julia specific exceptions for these errors
-    @test_throws PyCall.PyError openxl("FileThatDoesNotExist.xlsx")
-    @test_throws PyCall.PyError openxl("runtests.jl")
-
-    filename = normpath(@__DIR__, "TestData.xlsx")
-    file = openxl(filename)
-    @test file.filename == "TestData.xlsx"
-
-    buffer = IOBuffer()
-    show(buffer, file)
-    @test String(take!(buffer)) == "ExcelFile <TestData.xlsx>"
-
-    for (k, v) in Dict(0 => "#NULL!", 7 => "#DIV/0!", 23 => "#REF!", 42 => "#N/A", 29 => "#NAME?", 36 => "#NUM!", 15 => "#VALUE!")
-        errorcell = ExcelErrorCell(k)
-        buffer = IOBuffer()
-        show(buffer, errorcell)
-        @test String(take!(buffer)) == v
-    end
-
-    # Read into DataValueArray
-    for f in [file, filename]
-        @test_throws ErrorException readxl(f, "Sheet1!C4:G3")
-        @test_throws ErrorException readxl(f, "Sheet1!G2:B5")
-        @test_throws ErrorException readxl(f, "Sheet1!G5:B2")
-
-        data = readxl(f, "Sheet1!C3:N7")
-        @test size(data) == (5, 12)
-        @test data[4,1] == 2.0
-        @test data[2,2] == "A"
-        @test data[2,3] == true
-        @test DataValues.isna(data[4,5])
-        @test data[2,9] == Date(2015, 3, 3)
-        @test data[3,9] == DateTime(2015, 2, 4, 10, 14)
-        @test data[4,9] == DateTime(1988, 4, 9, 0, 0)
-        @test data[5,9] == Time(15, 2, 0)
-        @test data[3,10] == DateTime(1950, 8, 9, 18, 40)
-        @test DataValues.isna(data[5,10])
-        @test isa(data[2,11], ExcelErrorCell)
-        @test isa(data[3,11], ExcelErrorCell)
-        @test isa(data[4,12], ExcelErrorCell)
-        @test DataValues.isna(data[5,12])
-
-        # Test readxlsheet function
-        @test_throws ErrorException readxlsheet(f, "Empty Sheet")
-        for sheetinfo = ["Second Sheet", 2]
-            @test_throws ErrorException readxlsheet(f, sheetinfo, skipstartrows = -1)
-            @test_throws ErrorException readxlsheet(f, sheetinfo, skipstartrows = :nonsense)
-
-            @test_throws ErrorException readxlsheet(f, sheetinfo, skipstartcols = -1)
-            @test_throws ErrorException readxlsheet(f, sheetinfo, skipstartcols = :nonsense)
-
-            @test_throws ErrorException readxlsheet(f, sheetinfo, nrows = -1)
-            @test_throws ErrorException readxlsheet(f, sheetinfo, nrows = :nonsense)
-
-            @test_throws ErrorException readxlsheet(f, sheetinfo, ncols = -1)
-            @test_throws ErrorException readxlsheet(f, sheetinfo, ncols = :nonsense)
-
-            data = readxlsheet(f, sheetinfo)
-            @test size(data) == (6, 6)
-            @test data[2,1] == 1.
-            @test data[5,2] == "CCC"
-            @test data[3,3] == false
-            @test data[6,6] == Time(15, 2, 00)
-            @test DataValues.isna(data[4,3])
-            @test DataValues.isna(data[4,6])
-
-            data = readxlsheet(f, sheetinfo, skipstartrows = :blanks, skipstartcols = :blanks)
-            @test size(data) == (6, 6)
-            @test data[2,1] == 1.
-            @test data[5,2] == "CCC"
-            @test data[3,3] == false
-            @test data[6,6] == Time(15, 2, 00)
-            @test DataValues.isna(data[4,3])
-            @test DataValues.isna(data[4,6])
-
-            data = readxlsheet(f, sheetinfo, skipstartrows = 0, skipstartcols = 0)
-            @test size(data) == (6 + 7, 6 + 3)
-            @test data[2 + 7,1 + 3] == 1.
-            @test data[5 + 7,2 + 3] == "CCC"
-            @test data[3 + 7,3 + 3] == false
-            @test data[6 + 7,6 + 3] == Time(15, 2, 00)
-            @test DataValues.isna(data[4 + 7,3 + 3])
-            @test DataValues.isna(data[4 + 7,6 + 3])
-
-            data = readxlsheet(f, sheetinfo, skipstartrows = 0, )
-            @test size(data) == (6 + 7, 6)
-            @test data[2 + 7,1] == 1.
-            @test data[5 + 7,2] == "CCC"
-            @test data[3 + 7,3] == false
-            @test data[6 + 7,6] == Time(15, 2, 00)
-            @test DataValues.isna(data[4 + 7,3])
-            @test DataValues.isna(data[4 + 7,6])
-
-            data = readxlsheet(f, sheetinfo, skipstartcols = 0)
-            @test size(data) == (6, 6 + 3)
-            @test data[2,1 + 3] == 1.
-            @test data[5,2 + 3] == "CCC"
-            @test data[3,3 + 3] == false
-            @test data[6,6 + 3] == Time(15, 2, 00)
-            @test DataValues.isna(data[4,3 + 3])
-            @test DataValues.isna(data[4,6 + 3])
-
-            data = readxlsheet(f, sheetinfo, skipstartrows = 1, skipstartcols = 1, nrows = 11, ncols = 7)
-            @test size(data) == (11, 7)
-            @test data[2 + 6,1 + 2] == 1.
-            @test data[5 + 6,2 + 2] == "CCC"
-            @test data[3 + 6,3 + 2] == false
-            @test_throws BoundsError data[6 + 6,6 + 2] == Time(15, 2, 00)
-            @test DataValues.isna(data[4 + 6,2 + 2])
-        end
-    end
-
-end
+@run_package_tests

test/test_excelreaders.jl

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
+@testitem "ExcelReaders" begin
+    using Dates, PyCall, DataValues
+
+    # TODO Throw julia specific exceptions for these errors
+    @test_throws PyCall.PyError openxl("FileThatDoesNotExist.xls")
+    @test_throws PyCall.PyError openxl("runtests.jl")
+
+    filename = normpath(@__DIR__, "TestData.xls")
+    file = openxl(filename)
+    @test file.filename == "TestData.xls"
+
+    buffer = IOBuffer()
+
+    @test sprint(show, file) == "ExcelFile <TestData.xls>"
+
+    for (k, v) in Dict(0 => "#NULL!", 7 => "#DIV/0!", 23 => "#REF!", 42 => "#N/A", 29 => "#NAME?", 36 => "#NUM!", 15 => "#VALUE!")
+        errorcell = ExcelErrorCell(k)
+        @test sprint(show, errorcell) == v
+    end
+
+    # Read into DataValueArray
+    for f in [file, filename]
+        @test_throws ErrorException readxl(f, "Sheet1!C4:G3")
+        @test_throws ErrorException readxl(f, "Sheet1!G2:B5")
+        @test_throws ErrorException readxl(f, "Sheet1!G5:B2")
+
+        data = readxl(f, "Sheet1!C3:N7")
+        @test size(data) == (5, 12)
+        @test data[4,1] == 2.0
+        @test data[2,2] == "A"
+        @test data[2,3] == true
+        @test DataValues.isna(data[4,5])
+        @test data[2,9] == Date(2015, 3, 3)
+        @test data[3,9] == DateTime(2015, 2, 4, 10, 14)
+        @test data[4,9] == DateTime(1988, 4, 9, 0, 0)
+        @test data[5,9] == Time(15, 2, 0)
+        @test data[3,10] == DateTime(1950, 8, 9, 18, 40)
+        @test DataValues.isna(data[5,10])
+        @test isa(data[2,11], ExcelErrorCell)
+        @test isa(data[3,11], ExcelErrorCell)
+        @test isa(data[4,12], ExcelErrorCell)
+        @test DataValues.isna(data[5,12])
+
+        # Test readxlsheet function
+        @test_throws ErrorException readxlsheet(f, "Empty Sheet")
+        for sheetinfo = ["Second Sheet", 2]
+            @test_throws ErrorException readxlsheet(f, sheetinfo, skipstartrows = -1)
+            @test_throws ErrorException readxlsheet(f, sheetinfo, skipstartrows = :nonsense)
+
+            @test_throws ErrorException readxlsheet(f, sheetinfo, skipstartcols = -1)
+            @test_throws ErrorException readxlsheet(f, sheetinfo, skipstartcols = :nonsense)
+
+            @test_throws ErrorException readxlsheet(f, sheetinfo, nrows = -1)
+            @test_throws ErrorException readxlsheet(f, sheetinfo, nrows = :nonsense)
+
+            @test_throws ErrorException readxlsheet(f, sheetinfo, ncols = -1)
+            @test_throws ErrorException readxlsheet(f, sheetinfo, ncols = :nonsense)
+
+            data = readxlsheet(f, sheetinfo)
+            @test size(data) == (6, 6)
+            @test data[2,1] == 1.
+            @test data[5,2] == "CCC"
+            @test data[3,3] == false
+            @test data[6,6] == Time(15, 2, 00)
+            @test DataValues.isna(data[4,3])
+            @test DataValues.isna(data[4,6])
+
+            data = readxlsheet(f, sheetinfo, skipstartrows = :blanks, skipstartcols = :blanks)
+            @test size(data) == (6, 6)
+            @test data[2,1] == 1.
+            @test data[5,2] == "CCC"
+            @test data[3,3] == false
+            @test data[6,6] == Time(15, 2, 00)
+            @test DataValues.isna(data[4,3])
+            @test DataValues.isna(data[4,6])
+
+            data = readxlsheet(f, sheetinfo, skipstartrows = 0, skipstartcols = 0)
+            @test size(data) == (6 + 7, 6 + 3)
+            @test data[2 + 7,1 + 3] == 1.
+            @test data[5 + 7,2 + 3] == "CCC"
+            @test data[3 + 7,3 + 3] == false
+            @test data[6 + 7,6 + 3] == Time(15, 2, 00)
+            @test DataValues.isna(data[4 + 7,3 + 3])
+            @test DataValues.isna(data[4 + 7,6 + 3])
+
+            data = readxlsheet(f, sheetinfo, skipstartrows = 0, )
+            @test size(data) == (6 + 7, 6)
+            @test data[2 + 7,1] == 1.
+            @test data[5 + 7,2] == "CCC"
+            @test data[3 + 7,3] == false
+            @test data[6 + 7,6] == Time(15, 2, 00)
+            @test DataValues.isna(data[4 + 7,3])
+            @test DataValues.isna(data[4 + 7,6])
+
+            data = readxlsheet(f, sheetinfo, skipstartcols = 0)
+            @test size(data) == (6, 6 + 3)
+            @test data[2,1 + 3] == 1.
+            @test data[5,2 + 3] == "CCC"
+            @test data[3,3 + 3] == false
+            @test data[6,6 + 3] == Time(15, 2, 00)
+            @test DataValues.isna(data[4,3 + 3])
+            @test DataValues.isna(data[4,6 + 3])
+
+            data = readxlsheet(f, sheetinfo, skipstartrows = 1, skipstartcols = 1, nrows = 11, ncols = 7)
+            @test size(data) == (11, 7)
+            @test data[2 + 6,1 + 2] == 1.
+            @test data[5 + 6,2 + 2] == "CCC"
+            @test data[3 + 6,3 + 2] == false
+            @test_throws BoundsError data[6 + 6,6 + 2] == Time(15, 2, 00)
+            @test DataValues.isna(data[4 + 6,2 + 2])
+        end
+    end
+
+end
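
Besides switching to `.xls` fixtures, the new test file replaces the `IOBuffer` / `show` / `take!` sequence with `sprint(show, x)`. The sketch below shows that the two patterns produce the same string. The `DemoFile` type and both `render_*` helpers are hypothetical stand-ins introduced for illustration; ExcelReaders' own `ExcelFile` is not used.

```julia
# Hypothetical stand-in type; ExcelReaders' own ExcelFile is not used here.
struct DemoFile
    filename::String
end

Base.show(io::IO, f::DemoFile) = print(io, "DemoFile <", f.filename, ">")

# Old pattern from runtests.jl: render through an explicit buffer.
function render_with_buffer(x)
    buffer = IOBuffer()
    show(buffer, x)
    return String(take!(buffer))
end

# New pattern from test_excelreaders.jl: `sprint` creates the buffer,
# calls `show` on it, and returns the resulting String.
render_with_sprint(x) = sprint(show, x)
```

With `sprint` the explicit buffer is no longer needed, which suggests the `buffer = IOBuffer()` line remaining in the new test file is unused.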
