Skip to content

Conversation

@jf---
Copy link
Collaborator

@jf--- jf--- commented Dec 10, 2025

Summary

  • Douglas-Peucker simplification in C++
  • AABB tree batch closest point queries (CGAL)
  • Python wrapper with tests

New API

from compas_cgal.polylines import simplify_polyline, closest_point_on_polylines

Test plan

  • builds successfully
  • tests pass

- Douglas-Peucker simplification in C++
- AABB tree batch closest point queries (CGAL)
- Python wrapper with tests
@tomvanmele
Copy link
Member

@petrasvestartas can you help reviewing the technical aspects of the binding? thanks!

@tomvanmele tomvanmele added the enhancement New feature or request label Dec 10, 2025
@tomvanmele
Copy link
Member

@jf--- without an entry in the changelog, the checks will not complete...

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new polylines module to compas_cgal providing polyline simplification using the Douglas-Peucker algorithm and efficient batch closest point queries using CGAL's AABB tree.

Key Changes:

  • Implements Douglas-Peucker simplification in C++ for 2D/3D polylines with configurable distance threshold
  • Adds AABB tree-based closest point queries for batch processing of query points against a polyline
  • Provides Python wrappers with functions: simplify_polyline, simplify_polylines, and closest_points_on_polyline

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 16 comments.

Show a summary per file
File Description
tests/test_polylines.py Comprehensive test suite covering simplification and closest point functionality with various edge cases
src/polylines.h C++ header declaring the two main functions with documentation
src/polylines.cpp C++ implementation of Douglas-Peucker algorithm and AABB tree closest point queries with Python bindings
src/compas_cgal/polylines.py Python wrapper module providing user-facing API with type hints and documentation
src/compas_cgal/init.py Registers the new polylines module for lazy loading
CMakeLists.txt Adds the polylines module to the build system
CHANGELOG.md Documents the new features in the unreleased section

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +13 to +34
static double perpendicular_distance(
double px, double py,
double x1, double y1, double x2, double y2)
{
double dx = x2 - x1;
double dy = y2 - y1;
double len_sq = dx * dx + dy * dy;

if (len_sq < 1e-12) {
// Degenerate segment
return std::sqrt((px - x1) * (px - x1) + (py - y1) * (py - y1));
}

// Project point onto line
double t = ((px - x1) * dx + (py - y1) * dy) / len_sq;
t = std::max(0.0, std::min(1.0, t));

double proj_x = x1 + t * dx;
double proj_y = y1 + t * dy;

return std::sqrt((px - proj_x) * (px - proj_x) + (py - proj_y) * (py - proj_y));
}
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Douglas-Peucker implementation only uses X and Y coordinates (2D projection) for distance calculations, even when working with 3D polylines. This means that simplification will not consider deviations in the Z dimension.

For example, a 3D polyline like [[0,0,0], [1,0,5], [2,0,0]] would be simplified to just start and end points if the threshold is 0.1, even though the middle point has significant Z deviation.

This should be documented clearly, or the algorithm should be extended to handle 3D distances. The function documentation in the header file (line 11) and Python docstrings should specify that simplification is performed in the XY plane only.

Copilot uses AI. Check for mistakes.
src/polylines.h Outdated
* @brief Simplify multiple 2D polylines using Douglas-Peucker algorithm.
*
* @param polylines Vector of polylines, each as Nx2 or Nx3 matrix (float64)
* @param threshold Squared distance threshold for simplification
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring states "Squared distance threshold" but the implementation uses linear distance. In the Douglas-Peucker algorithm (line 63 in polylines.cpp), max_dist is compared directly with threshold, and perpendicular_distance returns the Euclidean distance (with sqrt), not squared distance. The documentation should say "Distance threshold" instead.

Suggested change
* @param threshold Squared distance threshold for simplification
* @param threshold Distance threshold for simplification

Copilot uses AI. Check for mistakes.
Comment on lines 47 to 48
"""
arrays = [np.asarray(p, dtype=np.float64) for p in polylines]
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing input validation. The function should validate that:

  1. threshold is non-negative (negative thresholds don't make sense for distance)
  2. polylines is not empty
  3. Each polyline has valid dimensions (at least 2 columns)

Without validation, users will get cryptic C++ errors instead of helpful Python exceptions. Consider adding validation before calling the C++ function.

Suggested change
"""
arrays = [np.asarray(p, dtype=np.float64) for p in polylines]
"""
# Input validation
if threshold < 0:
raise ValueError("threshold must be non-negative")
if not polylines or len(polylines) == 0:
raise ValueError("polylines must not be empty")
arrays = [np.asarray(p, dtype=np.float64) for p in polylines]
for i, arr in enumerate(arrays):
if arr.ndim != 2 or arr.shape[1] < 2:
raise ValueError(
f"Each polyline must be a 2D array with at least 2 columns (got shape {arr.shape} for polyline {i})"
)

Copilot uses AI. Check for mistakes.
"""
queries = np.asarray(query_points, dtype=np.float64)
poly = np.asarray(polyline, dtype=np.float64)
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing input validation. The function should validate that:

  1. query_points and polyline are not empty
  2. Both inputs have valid dimensions (at least 2 columns)
  3. Dimensions are compatible (both 2D or both 3D, or handle mixed cases explicitly)

Without validation, users will get cryptic C++ errors instead of helpful Python exceptions. Consider adding validation before calling the C++ function.

Suggested change
poly = np.asarray(polyline, dtype=np.float64)
poly = np.asarray(polyline, dtype=np.float64)
# Input validation
if queries.size == 0:
raise ValueError("query_points is empty.")
if poly.size == 0:
raise ValueError("polyline is empty.")
if queries.ndim != 2 or poly.ndim != 2:
raise ValueError("Both query_points and polyline must be 2D arrays.")
if queries.shape[1] < 2 or poly.shape[1] < 2:
raise ValueError("Both query_points and polyline must have at least 2 columns (2D or 3D points).")
if queries.shape[1] != poly.shape[1]:
raise ValueError(
f"Dimension mismatch: query_points has {queries.shape[1]} columns, polyline has {poly.shape[1]} columns."
)

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually not a fan of input validation (ask for forgiveness not for permission), but this is negligible overhead so worth considering.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would indeed take the copilot review just as a bunch of suggestions. i leave it up to you what to do with them...

Comment on lines +39 to +49
def test_simplify_with_z_coords(self):
"""Simplification should work with 3D points (z preserved)."""
polyline = [
[0, 0, 1],
[1, 0.01, 2],
[2, 0, 3],
]
result = simplify_polyline(polyline, threshold=0.1)
assert result.shape[1] == 3 # Preserves 3 columns
assert len(result) == 2 # Middle point removed

Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test doesn't verify that Z coordinates are preserved correctly during simplification. The test checks that the result has 3 columns and 2 rows, but doesn't verify what the Z values actually are. Given that simplification only considers XY distance (see polylines.cpp lines 13-34), the Z coordinates should be preserved from the kept points.

Consider adding assertions like:

assert np.allclose(result[0], [0, 0, 1])  # First point preserved with its Z
assert np.allclose(result[1], [2, 0, 3])  # Last point preserved with its Z

Copilot uses AI. Check for mistakes.
Comment on lines +132 to +134
if (n_poly == 0) {
result.setZero();
return result;
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing handling for empty query points. When n_queries == 0, the function will return an empty result matrix, which is correct. However, when n_poly == 0 (line 132), the result is filled with zeros which may not be the expected behavior. Consider either:

  1. Throwing an error for invalid input (empty polyline)
  2. Documenting this behavior clearly

The test suite should also cover the case of empty query points to ensure it behaves as expected.

Copilot uses AI. Check for mistakes.
Comment on lines +71 to +77
def test_closest_point_on_segment(self):
"""Find closest points on a single segment."""
polyline = [[0, 0], [10, 0]]
queries = [[5, 5]]
result = closest_points_on_polyline(queries, polyline)
# Closest point on horizontal segment to (5, 5) is (5, 0)
assert np.allclose(result[0], [5, 0], atol=1e-6)
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test assumes 2D output but doesn't validate Z coordinates were removed. When query points are 3D and polyline is 2D (or vice versa), the function uses std::min(q_cols, p_cols) for output dimensions (line 128 in polylines.cpp). The test should verify that:

  1. The Z coordinate handling is correct when mixing 2D/3D inputs
  2. The result dimensions match expectations

Consider adding a test case like:

polyline_2d = [[0, 0], [10, 0]]
queries_3d = [[5, 5, 10]]
result = closest_points_on_polyline(queries_3d, polyline_2d)
assert result.shape == (1, 2)  # Output should be 2D

Copilot uses AI. Check for mistakes.
Comment on lines 36 to 66
// Recursive Douglas-Peucker
static void rdp_simplify(
const compas::RowMatrixXd& polyline,
int start, int end,
double threshold,
std::vector<bool>& keep)
{
if (end <= start + 1) return;

double max_dist = 0.0;
int max_idx = start;

double x1 = polyline(start, 0);
double y1 = polyline(start, 1);
double x2 = polyline(end, 0);
double y2 = polyline(end, 1);

for (int i = start + 1; i < end; ++i) {
double dist = perpendicular_distance(
polyline(i, 0), polyline(i, 1),
x1, y1, x2, y2);
if (dist > max_dist) {
max_dist = dist;
max_idx = i;
}
}

if (max_dist > threshold) {
keep[max_idx] = true;
rdp_simplify(polyline, start, max_idx, threshold, keep);
rdp_simplify(polyline, max_idx, end, threshold, keep);
Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential integer overflow in recursive algorithm. The Douglas-Peucker algorithm uses recursion depth proportional to the number of points kept. For very large polylines with many features (worst case: O(n) recursion depth), this could cause stack overflow.

Consider either:

  1. Documenting maximum recommended polyline size
  2. Implementing an iterative version using an explicit stack
  3. Adding a depth limit check

This is particularly important since CGAL is often used for large geometric datasets.

Suggested change
// Recursive Douglas-Peucker
static void rdp_simplify(
const compas::RowMatrixXd& polyline,
int start, int end,
double threshold,
std::vector<bool>& keep)
{
if (end <= start + 1) return;
double max_dist = 0.0;
int max_idx = start;
double x1 = polyline(start, 0);
double y1 = polyline(start, 1);
double x2 = polyline(end, 0);
double y2 = polyline(end, 1);
for (int i = start + 1; i < end; ++i) {
double dist = perpendicular_distance(
polyline(i, 0), polyline(i, 1),
x1, y1, x2, y2);
if (dist > max_dist) {
max_dist = dist;
max_idx = i;
}
}
if (max_dist > threshold) {
keep[max_idx] = true;
rdp_simplify(polyline, start, max_idx, threshold, keep);
rdp_simplify(polyline, max_idx, end, threshold, keep);
// Iterative Douglas-Peucker using explicit stack to avoid stack overflow
static void rdp_simplify(
const compas::RowMatrixXd& polyline,
int start, int end,
double threshold,
std::vector<bool>& keep)
{
struct Segment {
int start;
int end;
};
std::vector<Segment> stack;
stack.push_back({start, end});
while (!stack.empty()) {
Segment seg = stack.back();
stack.pop_back();
int s = seg.start;
int e = seg.end;
if (e <= s + 1) continue;
double max_dist = 0.0;
int max_idx = s;
double x1 = polyline(s, 0);
double y1 = polyline(s, 1);
double x2 = polyline(e, 0);
double y2 = polyline(e, 1);
for (int i = s + 1; i < e; ++i) {
double dist = perpendicular_distance(
polyline(i, 0), polyline(i, 1),
x1, y1, x2, y2);
if (dist > max_dist) {
max_dist = dist;
max_idx = i;
}
}
if (max_dist > threshold) {
keep[max_idx] = true;
stack.push_back({s, max_idx});
stack.push_back({max_idx, e});
}

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the most valuable insight will look into that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think also the "by ref" suggestion is valid. but we have to check how this works with the nanobind mechanism (in pybind you have to mark the parameters as "opaque" to avoid copying).

Comment on lines +2 to +3
import pytest

Copy link

Copilot AI Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'pytest' is not used.

Suggested change
import pytest

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <[email protected]>
@jf---
Copy link
Collaborator Author

jf--- commented Dec 11, 2025

@tomvanmele quite helpful the copilot review

Copy link
Collaborator Author

@jf--- jf--- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think also the "by ref" suggestion is valid. but we have to check how this works with the nanobind mechanism (in pybind you have to mark the parameters as "opaque" to avoid copying).

i missed this important insight.

@jf--- jf--- mentioned this pull request Dec 11, 2025
@jf---
Copy link
Collaborator Author

jf--- commented Dec 11, 2025

@tomvanmele the #59 addresses intermodule consistency wrt how np arrays are handled

@petrasvestartas
Copy link
Collaborator

@jf---

This pr is missing examples in this folder:
https://github.com/compas-dev/compas_cgal/tree/main/docs/examples

You need to make:

  • 1 rst file
  • 1 minimal code (e.g. taken from your tests).
  • run code in the viewer and make a screenshot for the rst file

The highly likely section that the users will use is this:
https://compas.dev/compas_cgal/latest/examples.html

@tomvanmele
Copy link
Member

@petrasvestartas @jf--- maybe we can merge as is and add the example later? would like to switch to mkdocs first. should be done later today...

@petrasvestartas
Copy link
Collaborator

I added documentation and changed the eigen ref. Whenthe build is finished, it is ready to be merged.

@tomvanmele tomvanmele self-requested a review December 12, 2025 13:15
@petrasvestartas petrasvestartas merged commit 2c07439 into compas-dev:main Dec 12, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants