Resolve headers using Clang C++ API? #453

Open
TravisCardwell opened this issue Feb 27, 2025 · 0 comments
Labels
enhancement New feature or request performance

As of #433 and #450, headers are resolved using libclang: we parse an in-memory header that contains only an #include directive, and the fold breaks as soon as that directive has been processed. Because we use libclang, we can be confident that resolution behaves exactly as it does for the headers that we translate.
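To make the mechanism concrete, the in-memory header amounts to a single line. The following is a minimal sketch of how such a line could be rendered. `IncludeKind` and `includeDirective` are hypothetical names used for illustration only, not the hs-bindgen implementation, and the angle-bracket case is an assumption added for completeness.

```haskell
module Main where

-- | How a header is named in an include directive.
-- Hypothetical type for illustration; not the hs-bindgen type.
data IncludeKind = Quote | Bracket
  deriving (Eq, Show)

-- | Render the one-line in-memory header that triggers resolution.
includeDirective :: IncludeKind -> FilePath -> String
includeDirective Quote   path = "#include \"" ++ path ++ "\"\n"
includeDirective Bracket path = "#include <" ++ path ++ ">\n"

main :: IO ()
main = do
  putStr (includeDirective Quote "epoxy/gl_generated.h")
  putStr (includeDirective Bracket "stdint.h")
```

Parsing this one line is enough to make Clang look up the named header, which is why the cost of resolution depends on the header being included rather than on the in-memory header itself.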

A drawback, however, is performance. Each header resolution requires creating a new translation unit and parsing headers. We break after the include directive has been processed, but it was not known whether Clang parses headers lazily. The benchmarks below indicate that it does not: the whole header is parsed before the break, so larger headers take longer to resolve.

As a performance enhancement, we may want to resolve headers using the Clang C++ API instead. This involves creating frontend and preprocessor instances, but perhaps that can be done once at program start, with all headers then resolved (using LookupFile) through that single instance.
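The shape of "set up once, resolve many times" might look as follows. This is only a model: `Resolver`, `newResolver`, and `resolveWith` are hypothetical names, the search-directory scan is a simplified stand-in for Clang's LookupFile and ignores most of what Clang's header search actually does, and file existence is passed in as a predicate so that the sketch stays pure.

```haskell
module Main where

import Data.List (find)

-- | Hypothetical handle for state that is built once at program start.
-- In a real implementation this would wrap the Clang frontend and
-- preprocessor instances; here it only records the search directories.
newtype Resolver = Resolver { searchDirs :: [FilePath] }

-- | One-time setup (assumed to be the expensive step in practice).
newResolver :: [FilePath] -> Resolver
newResolver = Resolver

-- | Resolve a header by trying each search directory in order.
-- The first candidate path accepted by the existence predicate wins.
resolveWith :: (FilePath -> Bool) -> Resolver -> FilePath -> Maybe FilePath
resolveWith exists (Resolver dirs) name =
  find exists [ dir ++ "/" ++ name | dir <- dirs ]

main :: IO ()
main = do
  -- The resolver is created once and shared by every lookup.
  let resolver = newResolver ["hs-bindgen/examples", "/usr/include"]
      -- Stub file system for the example.
      known    = [ "hs-bindgen/examples/simple_structs.h"
                 , "/usr/include/epoxy/gl_generated.h"
                 ]
      exists   = (`elem` known)
  mapM_ (print . resolveWith exists resolver)
    ["simple_structs.h", "epoxy/gl_generated.h", "missing.h"]
```

The point of the design is that no header contents are read during a lookup, so resolution time would no longer scale with header size.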

If we do this, we must check that headers resolve the same way as they do with libclang. The clang CLI behaves differently, as it includes the compiler include directory in the system include path while libclang does not, and we will not know how the Clang C++ API behaves until we test it.
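Such a check could be a simple differential test that runs both resolvers over the same headers and reports any disagreement. The sketch below assumes pure resolvers for brevity (the real ones run in IO). `Resolver`, `Mismatch`, and `disagreements` are hypothetical names, and the two stubs in `main` are invented data that only mimic the compiler-include-directory difference described above.

```haskell
module Main where

import Data.Maybe (mapMaybe)

-- | A resolver maps a header name to its resolved path, if any.
type Resolver = FilePath -> Maybe FilePath

-- | A header on which the two resolvers disagree.
data Mismatch = Mismatch
  { mismatchHeader :: FilePath
  , viaReference   :: Maybe FilePath
  , viaCandidate   :: Maybe FilePath
  } deriving (Eq, Show)

-- | Run both resolvers on every header and keep only the disagreements.
disagreements :: Resolver -> Resolver -> [FilePath] -> [Mismatch]
disagreements reference candidate = mapMaybe check
  where
    check h =
      let r = reference h
          c = candidate h
      in if r == c then Nothing else Just (Mismatch h r c)

main :: IO ()
main =
  mapM_ print $
    disagreements libclangStub candidateStub
      ["epoxy/gl_generated.h", "stddef.h"]
  where
    -- Stub standing in for the current libclang-based resolution.
    libclangStub "epoxy/gl_generated.h" =
      Just "/usr/include/epoxy/gl_generated.h"
    libclangStub _ = Nothing
    -- Stub standing in for a resolver that also searches the compiler
    -- include directory (placeholder path, not a real location).
    candidateStub "stddef.h" = Just "<compiler include dir>/stddef.h"
    candidateStub h = libclangStub h
```

An empty result over a representative set of headers would be the evidence needed before switching implementations.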

The Clang C++ API is not as stable as the libclang API, but perhaps there would not be too many stability/maintenance issues if the implementation is minimal.


Benchmarking notes

I found a large C header (for libepoxy): /usr/include/epoxy/gl_generated.h.

Finding large headers
$ find /usr/include -type f -name '*.h' -printf "%s\t%p\n" | sort -n | tail
...
1492422	/usr/include/epoxy/gl_generated.h
2365781	/usr/include/isl/typed_cpp.h
Resolving headers in the REPL
λ: import HsBindgen.Clang.Paths.Resolve
λ: import HsBindgen.Clang.Paths
λ: import HsBindgen.Clang.Args
λ: let args = defaultClangArgs { clangQuoteIncludePathDirs = ["/usr/include"] }
λ: resolveHeader args $ CHeaderQuoteIncludePath "epoxy/gl_generated.h"
Just "/usr/include/epoxy/gl_generated.h"
it :: Maybe SourcePath
Benchmark program
{-# LANGUAGE OverloadedStrings #-}

{-# OPTIONS_GHC -fno-warn-orphans #-}

module Main where

import Control.DeepSeq
import Criterion.Main

import HsBindgen.Clang.Args
import HsBindgen.Clang.Paths
import HsBindgen.Clang.Paths.Resolve

------------------------------------------------------------------------------

instance NFData SourcePath where
  rnf (SourcePath t) = rnf t

main :: IO ()
main = do
    print =<<
      resolveHeader args (CHeaderQuoteIncludePath "simple_structs.h")
    print =<<
      resolveHeader args (CHeaderQuoteIncludePath "epoxy/gl_generated.h")
    defaultMain
      [ bench "simple_structs.h" . nfIO $
          resolveHeader args (CHeaderQuoteIncludePath "simple_structs.h")
      , bench "epoxy/gl_generated.h" . nfIO $
          resolveHeader args (CHeaderQuoteIncludePath "epoxy/gl_generated.h")
      ]
  where
    args :: ClangArgs
    args = defaultClangArgs
      { clangStdInc = False
      , clangQuoteIncludePathDirs = ["hs-bindgen/examples", "/usr/include"]
      }

Results:

Just "hs-bindgen/examples/simple_structs.h"
Just "/usr/include/epoxy/gl_generated.h"
benchmarking simple_structs.h
time                 1.180 ms   (1.172 ms .. 1.190 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 1.173 ms   (1.169 ms .. 1.178 ms)
std dev              15.56 μs   (12.68 μs .. 20.44 μs)

benchmarking epoxy/gl_generated.h
time                 415.2 ms   (387.4 ms .. 440.7 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 394.3 ms   (386.4 ms .. 402.5 ms)
std dev              10.21 ms   (373.5 μs .. 12.53 ms)
variance introduced by outliers: 19% (moderately inflated)
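
To put the two results side by side, the following small calculation derives the slowdown and a rough cost per megabyte from the figures reported above. The names are hypothetical, and the per-megabyte figure counts only the bytes of gl_generated.h itself, not any headers it includes in turn.

```haskell
module Main where

-- Mean times reported by Criterion above, in milliseconds.
meanSmallMs, meanLargeMs :: Double
meanSmallMs = 1.173   -- simple_structs.h
meanLargeMs = 394.3   -- epoxy/gl_generated.h

-- Size of gl_generated.h in bytes, from the find output above.
largeBytes :: Double
largeBytes = 1492422

-- | How many times slower the large header is to resolve.
slowdown :: Double
slowdown = meanLargeMs / meanSmallMs

-- | Resolution time per megabyte (10^6 bytes) of the large header.
msPerMegabyte :: Double
msPerMegabyte = meanLargeMs / (largeBytes / 1e6)

main :: IO ()
main = do
  putStrLn ("slowdown: " ++ show (round slowdown :: Int) ++ "x")
  putStrLn ("ms per MB: " ++ show (round msPerMegabyte :: Int))
```

Resolving the large header is about 336 times slower than resolving the small one, at roughly 264 ms per megabyte, which is consistent with the whole header being parsed before the break.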