-
Notifications
You must be signed in to change notification settings - Fork 13
[WIP] Las 1.3 and 1.4 read support. #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@visr In my opinion this requires us to change our
I'm thinking about the following API (apart from the whole datastreams). header = load(file)
points = points(header)
waves = waves(header) Where the header (could be renamed) stores the file/stream reference and can load the points or waves on the fly. All small VLRs can be stored in the header, but larger ones (internal waves) will only be streamed/mmapped. I'm also thinking of disabling pipes for these versions and updating LazIO with 1.3/1.4 support. Pipes for loading Let me know your thoughts. |
I'm aware of the las14 branch started by @c42f, and will probably merge the work in there (mainly fixed size strings) at a later stage. |
Exciting! Yeah I realize the API will probably need to change to support this. Let's try to get it right this time. Will need some time to go over this. Regarding the fixed size strings, I was also looking at https://github.com/JuliaComputing/FixedSizeStrings.jl, but it's not very active, so might be better to provide for that in here for now. |
Got a first working version of waveform parsing for the internal 1.3 spec. Still using hardcoded vlrs and assuming 8 bits per sample, but we're getting there. h, p, e = load("test/LDR100301_120340_4.LAS")
wf = LasIO.waveform(p[3], h, e[end])
plot(wf[:, 1], wf[:,end], label="$(p[3])", xlabel="Distance (ps)", ylabel="Sample (V)", title="Digitized waveform") |
Codecov Report
@@ Coverage Diff @@
## master #16 +/- ##
==========================================
- Coverage 62.62% 56.75% -5.87%
==========================================
Files 8 10 +2
Lines 396 481 +85
==========================================
+ Hits 248 273 +25
- Misses 148 208 +60
Continue to review full report at Codecov.
|
@visr Could you review the current functionality and the way it's implemented? The goal here is to fully support reading LAS 1.x. I will start writing tests and documentation now. Write support can be added later in another PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also thinking of disabling pipes for these versions and updating LazIO with 1.3/1.4 support. Pipes for loading .laz files can't work with this API at all and with the filesizes it quickly leads to memory problems.
Yeah dropping this piping LAZ support is fine by me. We can work on getting LazIO up to speed afterwards.
I'm thinking about the following API (apart from the whole datastreams).
header = load(file) points = points(header) waves = waves(header)Where the header (could be renamed) stores the file/stream reference and can load the points or waves on the fly. All small VLRs can be stored in the header, but larger ones (internal waves) will only be streamed/mmapped.
Something like that sounds good to me, it's also in line with #4 (comment). How do you decide what's a small VLR and what is big though? Do you mean VLR vs EVLR? We don't want to have a situation where people don't have enough RAM to read the header since it also reads large VLRs. So perhaps we should take it out of the header altogether? It aligns also better with the structure described in the spec:
We might want to name the functions laspoints
/getpoints
or something like that if we want to export them.
variable_length_records::Vector{LasVariableLengthRecord} | ||
user_defined_bytes::Vector{UInt8} | ||
end | ||
|
||
function Base.show(io::IO, header::LasHeader) | ||
n = Int(header.records_count) | ||
n = Int(header.records_count_new) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that here you need to do something similar as above?
n = lv >= v"1.4" ? header.records_count_new : header.records_count
@@ -128,15 +131,15 @@ end | |||
|
|||
function save(s::Stream{format"LAS"}, header::LasHeader, pointdata::AbstractVector{<:LasPoint}) | |||
# checks | |||
header_n = header.records_count | |||
header_n = header.records_count_new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here
@@ -64,7 +64,7 @@ mutable struct LasHeader | |||
point_return_count_new::Vector{UInt64} # 15 | |||
|
|||
# VLRs | |||
variable_length_records::Vector{LasVariableLengthRecord} | |||
variable_length_records::Dict{UInt16, Union{LasVariableLengthRecord, ExtendedLasVariableLengthRecord}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you prefer to make the Vector into a Dict?
Would be good to mention here what the key is supposed to be. It is Record ID right?
That means we cannot store two of the same Record IDs, I don't see why that should not be allowed. Also from the LAS 1.4 spec:
The Record ID is dependent upon the User ID.
Different User IDs are likely to use the same Record IDs.
@@ -8,7 +8,7 @@ using FileIO | |||
using FixedPointNumbers | |||
using ColorTypes | |||
using GeometryTypes # for conversion | |||
using Compat | |||
using StaticArrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
REQUIRE needs to be updated for this
(cherry picked from commit c78d207)
Rebased, this branch now runs on julia 1.0 as well. |
Cool stuff guys, sorry I can't help with it (don't have the time... I no longer use this stuff at all!) Random additional thoughts on lazy loading: las_file = LasIO.open("test.las") # create `LASFile` stream, reading only the minimal amount (header without VLRs?)
# Following possibly return iterators?
# How to get points to associate neatly with waveforms? `mmap` would help a lot there.
p = collect(points(las_file))
v = collect(vlrs(las_file))
h = header(las_file)
data = LasIO.load("test.las") # just load all the data as in the old interface
# Could return a `LASData` instead of a tuple?
p = points(data)
h = header(data)
v = vlrs(data)
w = waveforms(data)
# It's a bit weird, but if `LASData` supports iteration, destructuring can be made to work
# (not suitable for production code which will have to deal with optional waveforms)
header,points = LasIO.load("test.las") The function names there are a bit generic, not to be taken too seriously ;-) |
Uh oh!
There was an error while loading. Please reload this page.