Should we make a clear value-level distinction between floating and integral numbers? #546

Open
@leftaroundabout

This basic subject has been discussed several times before:

#227
#181

The thing is that Aeson can't make a distinction between integer values and floating-point values (scientific values) which merely happen to be equal to an integer.

```
Prelude Data.Aeson> encode 4
"4"
Prelude Data.Aeson> encode 4.0
"4"
Prelude Data.Aeson> encode 4e+30
"4000000000000000000000000000000"
Prelude Data.Aeson> encode 4.53986e+30
"4539860000000000000000000000000"
```

The former two may be sensible enough. The latter two are kinda correct too, at least in a dynamically-typed understanding of numbers, but they're arguably not desirable. This is most obvious when JSON is treated as a human-readable format, as the aeson-pretty library does; there, this behaviour is under discussion right now. In science, numbers like 4.53986e+30 are not at all unusual, and nobody would consider it acceptable to print them in expanded, exact-integer form when the purpose is human readability.

Now, it's understood that Aeson itself does not really focus on human readability, but even so, such insignificant-digit-monsters are a bit of a pain. The space usage and parsing overhead can't be completely irrelevant.
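To put a rough number on that overhead, here is a minimal sketch using only base. `Sci`, `expanded` and `compact` are hypothetical names for illustration, not aeson or scientific API; `Sci c e` just stands for the value c * 10^e.

```haskell
-- Hypothetical stand-in for a scientific number: Sci c e means c * 10^e.
data Sci = Sci Integer Int

-- Exponent form: coefficient, then 'e', then the base-10 exponent.
compact :: Sci -> String
compact (Sci c e) = show c ++ "e" ++ show e

-- Expanded exact-integer form, as in the encode output above.
-- Negative exponents are outside this sketch and fall back to `compact`.
expanded :: Sci -> String
expanded (Sci c e)
  | e >= 0    = show (c * 10 ^ e)
  | otherwise = compact (Sci c e)
```

For 4.53986e+30, i.e. `Sci 453986 25`, `expanded` produces the 31-character string shown above, while `compact` produces `453986e25`, which is 9 characters and still a valid JSON number.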

Dynamic languages like Python actually make a distinction between integers and integral-floats in JSON:

```
Python 3.4.3 (default, Nov 17 2016, 01:08:31)
Type "copyright", "credits" or "license" for more information.

IPython 4.0.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import json

In [5]: json.dumps(4)
Out[5]: '4'

In [6]: json.dumps(4.0)
Out[6]: '4.0'

In [7]: json.dumps(4e+32)
Out[7]: '4e+32'
```
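On the reading side, Python's json module applies the matching rule: a literal with no fraction and no exponent becomes an int, anything else a float. A minimal Haskell sketch of that rule, assuming the input is already a syntactically valid JSON number literal (`NumKind` and `classifyLiteral` are hypothetical names, not part of aeson):

```haskell
-- Which of the two proposed cases a JSON number literal would fall into.
data NumKind = Integral | Floating
  deriving (Eq, Show)

-- A valid JSON number literal is treated as floating exactly when it
-- contains a fraction part ('.') or an exponent part ('e' or 'E').
classifyLiteral :: String -> NumKind
classifyLiteral s
  | any (`elem` ".eE") s = Floating
  | otherwise            = Integral
```

With this rule `"4"` is `Integral`, while `"4.0"` and `"4e+32"` are both `Floating`, so the distinction survives a round trip through text.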

Aeson used to have such a distinction too: before aeson-0.7, Number used Data.Attoparsec.Number, which has separate cases for the two. Now granted, this can lead to some problems too:

sol/aeson-qq#4

but all in all I think we should follow Python here. So would it be viable to try something like

```haskell
data Value = Object !Object
           | Array !Array
           | String !Text
           | Scientific !Scientific
           | Int !Int
           | Bool !Bool
           | Null
             deriving (Eq, Read, Show, Typeable, Data)
```

to have a clear grasp of which numbers really are integers and which ones merely happen to equal an integer to within experimental error?
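As a rough illustration of what the encoder could then do, here is a minimal sketch covering only the two number cases and using only base. `Number`, `IntNum`, `SciNum` and `encodeNumber` are hypothetical names; `SciNum c e` is a stand-in for a `Scientific` with coefficient c and base-10 exponent e, and none of this is existing aeson API.

```haskell
-- Simplified stand-in for the two number cases of the proposed Value type.
data Number
  = IntNum Int          -- a genuine integer
  | SciNum Integer Int  -- SciNum c e means c * 10^e
  deriving (Eq, Show)

dropTrailingZeros :: String -> String
dropTrailingZeros = reverse . dropWhile (== '0') . reverse

-- Integers are printed plainly; scientific values always get a fraction
-- part (and an exponent when it is non-zero), so the two stay distinguishable.
encodeNumber :: Number -> String
encodeNumber (IntNum n) = show n
encodeNumber (SciNum c e) =
  case show (abs c) of
    "0"    -> "0.0"
    d : ds ->
      let e'   = e + length ds  -- exponent once the point sits after the first digit
          frac = case dropTrailingZeros ds of
                   "" -> "0"
                   fs -> fs
          expo = if e' == 0 then "" else 'e' : show e'
          sign = if c < 0 then "-" else ""
      in sign ++ [d] ++ "." ++ frac ++ expo
    []     -> "0.0"             -- unreachable: show never returns an empty string
```

Here `encodeNumber (IntNum 4)` gives `4`, `encodeNumber (SciNum 4 0)` gives `4.0`, and `encodeNumber (SciNum 453986 25)` gives `4.53986e30`. The exact text differs slightly from Python's (`4.0e32` rather than `4e+32`), but both are valid JSON numbers.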
