martindurant
Member

Simply copying the script code to a notebook. Could use some cleaning and prose. For some reason, we are getting lookback on the cluster centres, whereas we should only be showing the most recent.

cc @maximlt

Contributor

maximlt commented Jan 8, 2022

Thanks for starting to gather the code in a notebook 👍

If you want to display the most recent clusters (n=3) then this change would be required (fixing a typo too):

diff --git a/examples/river_kmeans.py b/examples/river_kmeans.py
index 4acbaf1..f204e9e 100644
--- a/examples/river_kmeans.py
+++ b/examples/river_kmeans.py
@@ -58,7 +58,7 @@ def main(viz=True):
         return concat([previous, new]).iloc[-last_lines:, :]
 
     partition_obs = 10
-    particion_clusters = 10
+    partition_clusters = 10
     backlog_obs = 100
 
     # .partition is used to gather x number of points
@@ -74,8 +74,8 @@ def main(viz=True):
     )
     (
         clusters
-        .partition(particion_clusters)
-        .map(pd.concat)
+        .partition(partition_clusters)
+        .map(lambda t: t[-1])
         .sink(pipe_out.send)
     )

Instead of concatenating all the clusters accumulated in .partition (n = 3 x partition), the stream now keeps only the last set of clusters (n = 3). Note that this means the other clusters (3 x (partition - 1) of them) are never displayed.
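
To make the counts concrete, here is a minimal pure-Python sketch of the two behaviours. It does not use streamz or pandas: the `partition` helper is a hypothetical stand-in that, like `.partition(n)`, emits tuples of n items, and each cluster update is modelled as a plain list of 3 rows rather than a DataFrame. The numbers (3 clusters, partition of 10) come from the diff above; everything else is an assumption for illustration.

```python
from itertools import chain, islice


def partition(items, n):
    """Yield consecutive tuples of n items.

    Hypothetical stand-in for a stream's .partition(n); an incomplete
    trailing chunk is dropped here for simplicity.
    """
    it = iter(items)
    while True:
        chunk = tuple(islice(it, n))
        if len(chunk) < n:
            return
        yield chunk


N_CLUSTERS = 3            # cluster centres per update
PARTITION_CLUSTERS = 10   # updates gathered per partition

# Each update is a list of (step, cluster_id) rows, one row per centre.
updates = [[(step, c) for c in range(N_CLUSTERS)] for step in range(30)]

# Old behaviour: concatenate every update in the partition.
concatenated = [
    list(chain.from_iterable(chunk))
    for chunk in partition(updates, PARTITION_CLUSTERS)
]

# New behaviour: keep only the most recent update, as in `lambda t: t[-1]`.
latest = [chunk[-1] for chunk in partition(updates, PARTITION_CLUSTERS)]

rows_concat = len(concatenated[0])   # 3 * 10 rows per emission
rows_latest = len(latest[0])         # 3 rows per emission
skipped = rows_concat - rows_latest  # 3 * (10 - 1) rows never shown

print(rows_concat, rows_latest, skipped)
```

With these values each emission carries 3 rows instead of 30, and 27 intermediate rows per partition are dropped, which matches the counts given above.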

Even though the script now runs at the required cadence thanks to .partition, I've opened an issue in holoviews (holoviz/holoviews#5178) to report the error observed when holoviews (or panel, bokeh or tornado...) could not keep up with high-frequency updates.
