Incorrect nesting #17

@neverfox

In the wild, I have noticed that the results of parse-string don't always nest as expected. For example, fetch the DOM for https://www.google.com/search?q=dentist+pinellas+county+fl (I'm using clj-http to make the request). In the page, there is a div with id ires (the search results) that contains a single child ol, which in turn contains 10 or so children with class g (one per search result). The first of these children also has class _Arj. In the tagsoup result, however, the ol and the _Arj g div appear as siblings, and the remaining g-classed divs appear as direct children of the body tag. I'm not sure whether this is an issue with clj-tagsoup or something upstream, but I thought I'd bring it to your attention.
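A minimal sketch of how the nesting could be inspected, assuming clj-http and clj-tagsoup are on the classpath and that parse-string returns hiccup-style vectors readable with tagsoup's tag, attributes and children accessors. The helpers find-first and child-summary are hypothetical names introduced here for illustration, and the markup Google serves may differ from what is described above.

```clojure
(require '[clj-http.client :as http]
         '[pl.danieljanus.tagsoup :as ts])

(defn find-first
  "Depth-first search for the first element node satisfying pred.
  Hypothetical helper, not part of clj-tagsoup."
  [pred node]
  (when (vector? node)                      ; text nodes are strings; skip them
    (if (pred node)
      node
      (some #(find-first pred %) (ts/children node)))))

(defn child-summary
  "Returns [tag class] for each element child of node, or nil if node is nil.
  Hypothetical helper, not part of clj-tagsoup."
  [node]
  (when node
    (for [child (ts/children node)
          :when (vector? child)]
      [(ts/tag child) (:class (ts/attributes child))])))

(let [url  "https://www.google.com/search?q=dentist+pinellas+county+fl"
      tree (ts/parse-string (:body (http/get url)))
      ires (find-first #(= "ires" (:id (ts/attributes %))) tree)
      body (find-first #(= :body (ts/tag %)) tree)]
  ;; Expected per the page source: #ires has one ol child holding the g divs.
  (println "children of #ires:" (child-summary ires))
  ;; If the nesting is wrong, g-classed divs show up here instead.
  (println "children of body: " (child-summary body)))
```

Comparing the two printed lists against the structure shown in a browser's DOM inspector should make it clear where the g-classed divs ended up.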
