-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GRIB source feature for NWP emulator #3
base: develop
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #3 +/- ##
============================================
+ Coverage 36.59% 48.90% +12.31%
============================================
Files 44 55 +11
Lines 1585 2476 +891
Branches 67 241 +174
============================================
+ Hits 580 1211 +631
- Misses 1005 1265 +260 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review to be continued.
src/nwp_emulator/base_data_reader.h
Outdated
* @class BaseDataReader | ||
* @brief Base class for handling different types of data sources for the emulator. | ||
*/ | ||
class BaseDataReader { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would drop the 'Base' bit. It should be clear from the structure.
src/nwp_emulator/grib_file_reader.h
Outdated
class GRIBFileReader : public BaseDataReader { | ||
private: | ||
std::vector<eckit::PathName> srcFilenames_; | ||
FILE* currentFile_ = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you have an std::fstream
instead of the C-style FILE*
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As eccodes requires a FILE*
and there is no easy translation between an fstream
and a FILE*
I'll stick to this for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm done with reviewing it in full. See comments. Overall, very nicely structured work.
src/nwp_emulator/grib_file_reader.cc
Outdated
} | ||
// Broadcast emulator source params from main reader (root) | ||
// 1. Grid identifier | ||
std::vector<char> gridNameBuffer(10); // typical grid names are 4-6 chars |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use std::string
instead. std::vector<char>
is really for generic buffers (i.e. even binary data). You may have to use a different broadcast
overload, but it is supported in eckit::mpi
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like there is a broadcast overload that accepts string so the other way I see to do this is to broadcast the size of the string first and then the string as iterators like so:
size_t gridNameSize = gridName_.size();
eckit::mpi::comm().broadcast(gridNameSize, root_);
if (rank_ != root_) {
gridName_.resize(gridNameSize);
}
eckit::mpi::comm().broadcast(gridName_.begin(), gridName_.end(), root_);
Is that what you have in mind ?
src/nwp_emulator/grib_file_reader.cc
Outdated
gridName_.resize(strlen(gridName_.c_str())); | ||
} | ||
// 2. Field names & metadata (parameters) | ||
std::vector<char> paramBuffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, using std::string would make the subsequent code simpler.
src/nwp_emulator/grib_file_reader.cc
Outdated
eckit::mpi::comm().broadcast(paramBuffer, root_); // broadcast a single string to limit communication | ||
if (rank_ != root_) { | ||
std::string paramBufferStr(paramBuffer.begin(), paramBuffer.end()); | ||
paramBufferStr.resize(strlen(paramBufferStr.c_str())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this line doing exactly? Isn't this the same thing as calling shrink_to_fit
? See https://en.cppreference.com/w/cpp/string/basic_string/shrink_to_fit .
// For later consistency checks & emulator setup : | ||
openGribFile(srcPath, true); | ||
eckit::Log::info() << "Number of messages : " << std::to_string(count_) << std::endl; | ||
char buffer[64]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not std::string
? Then you can still pass the C-style string to eccodes by calling c_str()
on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c_str
is const
while codes_get_string
attempts to populate the buffer with the data it found in the codes handle (and the parameter buffer should be a char *
). I later convert it back to string but I don't think it can work with a buffer of type string, unless I'm missing something ?
src/nwp_emulator/grib_file_reader.cc
Outdated
readMsgMetadata(_, paramMd); | ||
params_.push_back(paramMd); | ||
eckit::Log::info() << params_[i] << "; "; | ||
if (i == count_ - 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would do it slightly differently: loop to count_ - 1
, then it is a clean body of the for loop. Add the last element after the for loop.
gridName = std::string(buffer); | ||
} | ||
paramMd = ""; | ||
std::vector<const char*> keywords = {"shortName", "levtype", "level"}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this an vector of std::string
. The overall principle is that we want to make the translation between C++ and C at the latest possible point, which is at the point of calling the C API.
src/nwp_emulator/nwp_data_provider.h
Outdated
* @enum DataSourceType | ||
* @brief Enumeration of the supported types of data source for the emulator. | ||
*/ | ||
enum DataSourceType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use enum class
.
src/nwp_emulator/nwp_data_provider.h
Outdated
* | ||
* @return true if model data has been successfully generated for the step, false otherwise. | ||
*/ | ||
bool getLocalStepData(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the get
prefix from all of these member functions, so localStepData
, globalStepData
, etc. This looks a bit boilerplate-y.
src/nwp_emulator/nwp_data_provider.h
Outdated
class NWPDataProvider { | ||
private: | ||
const DataSourceType sourceType_; | ||
BaseDataReader* dataReader = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use an std::unique_ptr
instead of a raw pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I address this point in a different PR so leaving as is here for now
switch (sourceType_) { | ||
case DataSourceType::GRIB: | ||
eckit::Log::info() << "Emulator will use GRIB files as data source from " << inputPath << std::endl; | ||
dataReader = new GRIBFileReader(inputPath, rank, root); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No 'naked' new
in C++ code if at all possible. If the dataReader
is an std::unique_ptr
then you please insatiate it with std::make_unique
.
} | ||
// Steps should be stored in alphabetical order | ||
std::sort(srcFilenames_.begin(), srcFilenames_.end()); | ||
srcPath = eckit::PathName(srcFilenames_[0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I should test if
srcFilenames_
is empty here in case no grib files were found in the source directory
d592ba8
to
ad30730
Compare
dacb4aa
to
ffe3b93
Compare
ffe3b93
to
6ce826a
Compare
06a0f01
to
7e1343d
Compare
7e1343d
to
e90687f
Compare
This PR does not contain a script to run the emulator, it only consists of the class that can then be used by an emulator. For reference, that's how using it can look like (tested in parallel environment):
Example of results of the extreme event detection plugin run on storm Christian data from ERA5:
