EVF Tutorial
This tutorial shows how to use the Extended Vector Framework (EVF) to create a simple format plugin. The EVF has also been called the "row set framework" and the "new scan framework". Here we focus on using the framework. Other pages in this section provide background information for when you need features beyond those shown here.
The Drill log plugin is the focus of this tutorial. A simplified version of this plugin is explained in the Learning Apache Drill book. The version used here is [the one which ships with Drill|https://github.com/apache/drill/tree/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/log].
In Drill 1.16 and earlier, the {{LogRecordReader}} writes to value vectors in the usual way, through each vector's associated {{Mutator}} class. For example, for a nullable VarChar vector:
    private static class VarCharDefn extends ColumnDefn {

      private NullableVarCharVector.Mutator mutator;

      public VarCharDefn(String name, int index) {
        super(name, index);
      }

      @Override
      public void define(OutputMutator outputMutator) throws SchemaChangeException {
        MaterializedField field = MaterializedField.create(getName(),
            Types.optional(MinorType.VARCHAR));
        mutator = outputMutator.addField(field, NullableVarCharVector.class).getMutator();
      }

      @Override
      public void load(int rowIndex, String value) {
        byte[] bytes = value.getBytes();
        mutator.setSafe(rowIndex, bytes, 0, bytes.length);
      }
    }
Other readers were more elaborate: the "V2" text reader (Drill 1.16 and earlier) worked with direct memory itself, handling its own buffer allocation, offset vector calculations, and so on.
The log reader code uses a {{ColumnDefn}} class to convert from the String value provided by the regex parser to the Java type needed by the {{Mutator}}.
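The {{ColumnDefn}} pattern described above can be illustrated without any Drill dependencies. The following is a minimal sketch, not the log plugin's actual code: the class {{ColumnDefnSketch}}, the {{IntDefn}} subclass, the {{loadLine}} helper and the sample regex are all hypothetical, and a plain {{List}} stands in for the value vector and its {{Mutator}}. It shows only the idea that each column definition owns the conversion from the parsed string to the type its column stores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the log reader's ColumnDefn pattern.
// None of these types are Drill classes; a List plays the role of a vector.
class ColumnDefnSketch {

  /** One output column: knows its position and how to convert parsed text. */
  abstract static class ColumnDefn {
    final String name;
    final int index;                               // zero-based column position
    final List<Object> values = new ArrayList<>(); // stand-in for a value vector

    ColumnDefn(String name, int index) {
      this.name = name;
      this.index = index;
    }

    /** Converts the text and stores it at rowIndex (rows must arrive in order). */
    abstract void load(int rowIndex, String value);
  }

  /** Stores the text as-is. */
  static class VarCharDefn extends ColumnDefn {
    VarCharDefn(String name, int index) { super(name, index); }

    @Override
    void load(int rowIndex, String value) {
      values.add(rowIndex, value);
    }
  }

  /** Converts the text to an int before storing it. */
  static class IntDefn extends ColumnDefn {
    IntDefn(String name, int index) { super(name, index); }

    @Override
    void load(int rowIndex, String value) {
      values.add(rowIndex, Integer.parseInt(value));
    }
  }

  /** Parses one line; returns false (and writes nothing) if it does not match. */
  static boolean loadLine(Pattern pattern, String line, int rowIndex,
                          ColumnDefn[] columns) {
    Matcher m = pattern.matcher(line);
    if (!m.matches()) {
      return false;
    }
    for (ColumnDefn col : columns) {
      // Regex groups are one-based; group 0 is the whole match.
      col.load(rowIndex, m.group(col.index + 1));
    }
    return true;
  }

  public static void main(String[] args) {
    Pattern pattern = Pattern.compile("(\\d{4})-\\d{2}-\\d{2} (\\w+) (.*)");
    ColumnDefn[] columns = {
      new IntDefn("year", 0),
      new VarCharDefn("level", 1),
      new VarCharDefn("message", 2)
    };
    String[] lines = {
      "2019-05-01 INFO Drillbit started",
      "not a log line",
      "2019-05-02 WARN Low memory"
    };
    int rowIndex = 0;
    for (String line : lines) {
      if (loadLine(pattern, line, rowIndex, columns)) {
        rowIndex++;   // advance only when a row was actually written
      }
    }
    for (ColumnDefn col : columns) {
      System.out.println(col.name + " = " + col.values);
    }
  }
}
```

In this sketch the caller tracks {{rowIndex}} and passes it to every {{load}} call, which mirrors the {{Mutator}}-style code shown earlier, where each write names the row explicitly.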
With the EVF, we'll replace the {{Mutator}} with a {{ColumnWriter}}. We'll first do the simplest possible conversion, then look at how to use advanced features, such as type conversions, schemas and table properties.
To use the EVF, we must also change how the plugin is structured: the new version of the "easy" plugin implementation defines the reader using the new scan framework.