Feature request: Differentiate between empty string and null value #81

@StefRe

readAllWithHeader returns a List<Map<String,String>>, so empty columns are read as empty strings. In the following example we therefore get "" for both col1 and col2:

"col1","col2"
"",

I'd really like to get null for col2 here. (This only makes sense if all strings are quoted; otherwise it wouldn't be clear how to interpret empty columns.) I understand that you can't change the result type to List<Map<String,String?>> now, but maybe you could add a nullCode option for reading, as already exists for writing. Its default value would be the empty string "" (the current behavior). I could then simply do

val nullCode = "NULL"
// nullCode on the reader is the option proposed in this issue; it does not exist yet
val rows = csvReader { this.nullCode = nullCode }.readAllWithHeader(inputStream)
    .map { row -> row.mapValues { col -> if (col.value == nullCode) null else col.value } }
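
The mapping step above can be tried without the library. The following is a minimal sketch; `withNulls` is a hypothetical helper name (not part of kotlin-csv), and the input is assumed to look like what the reader would return if the proposed option existed.

```kotlin
// Hypothetical helper, not part of kotlin-csv:
// replaces every value equal to the sentinel nullCode with a real null.
fun List<Map<String, String>>.withNulls(nullCode: String): List<Map<String, String?>> =
    map { row ->
        row.mapValues { (_, value) -> if (value == nullCode) null else value }
    }

// Rows shaped like the assumed reader output for the example CSV:
// col1 was a quoted empty string, col2 was an unquoted empty column.
val sampleRows: List<Map<String, String>> =
    listOf(mapOf("col1" to "", "col2" to "NULL"))

val sampleWithNulls: List<Map<String, String?>> = sampleRows.withNulls("NULL")
```

One consequence of the sentinel approach: a column that genuinely contains the quoted text "NULL" would also be mapped to null, so the nullCode has to be a string that cannot occur in the data.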

At first glance, it seems this only requires changing
https://github.com/doyaaaaaken/kotlin-csv/blob/c23a51b98fbbcb1348b3928d561f1f5e11fba965/src/commonMain/kotlin/com/github/doyaaaaaken/kotlincsv/parser/ParseStateMachine.kt#L36-L48
to

                    delimiter -> {
                        field.append(nullCode)
                        flushField()
                        state = ParseState.DELIMITER
                    }
                    '\n', '\u2028', '\u2029', '\u0085' -> {
                        field.append(nullCode)
                        flushField()
                        state = ParseState.END
                    }
                    '\r' -> {
                        if (nextCh == '\n') pos += 1
                        field.append(nullCode)
                        flushField()
                        state = ParseState.END
                    }

and the same for https://github.com/doyaaaaaken/kotlin-csv/blob/c23a51b98fbbcb1348b3928d561f1f5e11fba965/src/commonMain/kotlin/com/github/doyaaaaaken/kotlincsv/parser/ParseStateMachine.kt#L87-L99

but I haven't checked this thoroughly.
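
To illustrate the intended behavior independently of the library, here is a simplified single-line splitter. It is only a sketch under stated assumptions: `parseLine` and the `wasQuoted` flag are hypothetical names, it is not the library's ParseStateMachine, and it ignores line terminators and malformed input. It emits nullCode for an unquoted empty field and keeps "" for a quoted empty one.

```kotlin
// Simplified, hypothetical field splitter; NOT kotlin-csv's ParseStateMachine.
// An unquoted empty field becomes nullCode, a quoted empty field stays "".
fun parseLine(
    line: String,
    nullCode: String = "",   // default "" reproduces the current behavior
    delimiter: Char = ',',
    quote: Char = '"'
): List<String> {
    val fields = mutableListOf<String>()
    val field = StringBuilder()
    var inQuotes = false
    var wasQuoted = false    // true once the current field has opened a quote

    fun flushField() {
        // Only a field that is empty AND was never quoted counts as null.
        fields += if (field.isEmpty() && !wasQuoted) nullCode else field.toString()
        field.setLength(0)
        wasQuoted = false
    }

    var pos = 0
    while (pos < line.length) {
        val ch = line[pos]
        when {
            // Doubled quote inside a quoted field is an escaped quote character.
            inQuotes && ch == quote && pos + 1 < line.length && line[pos + 1] == quote -> {
                field.append(quote)
                pos += 1
            }
            inQuotes && ch == quote -> inQuotes = false
            inQuotes -> field.append(ch)
            ch == quote -> {
                inQuotes = true
                wasQuoted = true
            }
            ch == delimiter -> flushField()
            else -> field.append(ch)
        }
        pos += 1
    }
    flushField() // last field of the line
    return fields
}
```

Called as `parseLine("\"\",", nullCode = "NULL")` on the data row from the example, this returns a quoted-empty first field and the sentinel for the second; with the default nullCode both fields come back as "".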
