The timeseries.ipynb example shows how to visualize one or more sets of timeseries data by plotting them in 2D (as connected line segments) and then aggregating them in 2D. This is a more principled form of a common plotting paradigm in which large numbers of timeseries are overplotted as 2D curves; aggregating lets the density of overlapping lines be displayed faithfully. However, the approach requires the curves to have a meaningful alignment in both x and y, which is true for the example data in timeseries.ipynb but won't always be the case. Often there is alignment in x (e.g. a shared time axis) but no meaningful alignment in the precise y values, even for closely related variables. For instance, stocks for different companies in the same sector may covary but have very different absolute values, and various weather-related measurements will all vary as weather fronts come and go, yet will be on different scales. In some cases these traces can be aligned by normalizing the y values onto similar scales, but in others such normalization would be very difficult, e.g. if strong drift in one variable obscures smaller day-to-day variations that overlap. In any of these cases, aggregating the plots in 2D will not be very informative about the properties of the set of curves.
Instead, one can aggregate a set of curves in 1D directly, yielding a 1D array. The existing code can already be used for this by ensuring that there is only a single pixel in the y direction, passing in a fake fixed value for y in the aggregation, and then aggregating what would have been the y value for a 2D line plot as a value dimension. Alternatively, we could provide a specific interface for performing 1D aggregation, which would let us e.g. support non-uniform sampling along the x axis and would spare users from having to munge their data into fake 2D as above.
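To make the idea of a dedicated 1D aggregation concrete, here is a minimal NumPy sketch. It is not the library's API: the function name `aggregate_1d` and its parameters are hypothetical, and it bins individual sample points by x rather than rasterizing connected line segments, so it only approximates what a real implementation might do.

```python
import numpy as np

def aggregate_1d(x, y, x_range, n_bins):
    """Bin samples by x and reduce the y values per bin (hypothetical helper).

    Returns (count, mean), each of length n_bins. Empty bins have mean NaN.
    Assumes x and y contain no NaN values.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lo, hi = x_range
    # Map each sample to a bin index along x.
    idx = np.floor((x - lo) / (hi - lo) * n_bins).astype(int)
    idx[x == hi] = n_bins - 1  # treat the right edge as part of the last bin
    # Drop samples that fall outside x_range.
    keep = (idx >= 0) & (idx < n_bins)
    idx, y = idx[keep], y[keep]
    count = np.bincount(idx, minlength=n_bins)
    total = np.bincount(idx, weights=y, minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count  # 0/0 gives NaN for empty bins
    return count, mean

# Two curves sampled at different (non-uniform) x positions and on different scales.
x1, y1 = [0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0]
x2, y2 = [0.5, 2.5], [10.0, 20.0]
count, mean = aggregate_1d(x1 + x2, y1 + y2, x_range=(0.0, 4.0), n_bins=4)
```

Because samples are assigned to bins purely by their x value, the curves do not need to share a sampling grid, which is one of the benefits mentioned above for a dedicated 1D interface.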
In either case, we'll then need to determine appropriate ways to visualize and analyze the results. One approach is outlined in #103, i.e. to see if the aggregated values fit some statistical model (e.g. by aggregating into a histogram, then fitting the histogram from each x value with a model and computing the goodness of fit), and then plotting the fit values (for a good fit) or the actual histogram (for pixels with a bad fit). We can also highlight anomalies in some way, or do other automatic calculations on the 1D aggregate.
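As a rough illustration of the histogram-and-fit idea, the sketch below builds one histogram of y values per x bin and scores each against a normal model using a chi-square-style statistic. The choice of a normal model, the statistic, and the helper names `column_histograms` and `normal_fit` are all assumptions made for this example; they are not taken from #103 or from any existing code.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # elementwise error function without SciPy

def column_histograms(x, y, x_edges, y_edges):
    """Count samples per (x bin, y bin); row i is the y histogram for x bin i."""
    hist, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    return hist

def normal_fit(counts, y_edges):
    """Fit a normal model to one histogram; return (mu, sigma, chi2).

    A larger chi2 indicates a worse fit. Returns NaN where undefined.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    if n == 0:
        return math.nan, math.nan, math.nan
    centers = 0.5 * (y_edges[:-1] + y_edges[1:])
    mu = (counts * centers).sum() / n
    sigma = math.sqrt((counts * (centers - mu) ** 2).sum() / n)
    if sigma == 0:
        return mu, sigma, math.nan
    # Expected counts per bin under N(mu, sigma), from the normal CDF at the edges.
    cdf = 0.5 * (1.0 + _erf((y_edges - mu) / (sigma * math.sqrt(2.0))))
    expected = n * np.diff(cdf)
    ok = expected > 0
    chi2 = (((counts - expected) ** 2)[ok] / expected[ok]).sum()
    return mu, sigma, chi2

# Synthetic data: one x bin holds normally distributed values, the other bimodal ones.
rng = np.random.default_rng(0)
y_normal = rng.normal(0.0, 1.0, 2000)
y_bimodal = np.concatenate([rng.normal(-3.0, 0.5, 1000), rng.normal(3.0, 0.5, 1000)])
x = np.concatenate([np.full(2000, 0.5), np.full(2000, 1.5)])
y = np.concatenate([y_normal, y_bimodal])
x_edges = np.linspace(0.0, 2.0, 3)
y_edges = np.linspace(-6.0, 6.0, 25)
hist = column_histograms(x, y, x_edges, y_edges)
fits = [normal_fit(row, y_edges) for row in hist]
```

A plot could then show the fitted mean and spread for x bins whose statistic falls below some threshold and fall back to the raw histogram elsewhere. How to choose that threshold is left open here.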