From b3e10bc079de759b6ea1cc1a8ba525dcc59e7283 Mon Sep 17 00:00:00 2001 From: IgorSwat Date: Wed, 29 Oct 2025 18:44:57 +0100 Subject: [PATCH 01/11] chore: update v0.5.0 benchmarks (#481) --- docs/docs/04-benchmarks/inference-time.md | 96 ++++++++---------- docs/docs/04-benchmarks/memory-usage.md | 56 ++++++----- docs/docs/04-benchmarks/model-size.md | 6 +- .../04-benchmarks/inference-time.md | 98 +++++++++---------- .../04-benchmarks/memory-usage.md | 59 ++++++----- .../version-0.5.x/04-benchmarks/model-size.md | 12 ++- 6 files changed, 168 insertions(+), 159 deletions(-) diff --git a/docs/docs/04-benchmarks/inference-time.md b/docs/docs/04-benchmarks/inference-time.md index dd0f1275a..89f1f9de1 100644 --- a/docs/docs/04-benchmarks/inference-time.md +++ b/docs/docs/04-benchmarks/inference-time.md @@ -8,46 +8,48 @@ Times presented in the tables are measured as consecutive runs of the model. Ini ## Classification -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | +| EFFICIENTNET_V2_S | 105 | 110 | 149 | 299 | 227 | ## Object Detection -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 116 | 120 | 164 | 257 | 129 | ## Style Transfer -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_CANDY | 1356 | 1550 | 2003 | 2578 | 2328 | +| STYLE_TRANSFER_MOSAIC | 1376 | 1456 | 1971 | 2657 | 2394 | +| STYLE_TRANSFER_UDNIE | 1389 | 1499 | 1858 | 2380 | 2124 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1339 | 1514 | 2004 | 2608 | 2371 | ## OCR -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | 
:-------------------------------: | :-------------------------------: | -| Detector (CRAFT_800) | 2099 | 2227 | ❌ | 2245 | 7108 | -| Recognizer (CRNN_512) | 70 | 252 | ❌ | 54 | 151 | -| Recognizer (CRNN_256) | 39 | 123 | ❌ | 24 | 78 | -| Recognizer (CRNN_128) | 17 | 83 | ❌ | 14 | 39 | +Notice that the recognizer models were executed between 3 and 7 times during a single recognition. +The values below represent the averages across all runs for the benchmark image. -❌ - Insufficient RAM. +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Detector (CRAFT_800_QUANTIZED) | 669 | 649 | 825 | 541 | 474 | +| Recognizer (CRNN_512) | 48 | 47 | 60 | 91 | 72 | +| Recognizer (CRNN_256) | 22 | 22 | 29 | 51 | 30 | +| Recognizer (CRNN_128) | 11 | 11 | 14 | 28 | 17 | ## Vertical OCR -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: | -| Detector (CRAFT_1280) | 5457 | 5833 | ❌ | 6296 | 14053 | -| Detector (CRAFT_320) | 1351 | 1460 | ❌ | 1485 | 3101 | -| Recognizer (CRNN_512) | 39 | 123 | ❌ | 24 | 78 | -| Recognizer (CRNN_64) | 10 | 33 | ❌ | 7 | 18 | +Notice that the recognizer models, as well as detector CRAFT_320 model, were executed between 4 and 21 times during a single recognition. +The values below represent the averages across all runs for the benchmark image. -❌ - Insufficient RAM. +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Detector (CRAFT_1280_QUANTIZED) | 1749 | 1804 | 2105 | 1216 | 1171 | +| Detector (CRAFT_320_QUANTIZED) | 458 | 474 | 561 | 360 | 332 | +| Recognizer (CRNN_512) | 54 | 52 | 68 | 144 | 72 | +| Recognizer (CRNN_64) | 5 | 6 | 7 | 28 | 11 | ## LLMs @@ -62,41 +64,31 @@ Times presented in the tables are measured as consecutive runs of the model. Ini ❌ - Insufficient RAM. -### Streaming mode - -Notice than for `Whisper` model which has to take as an input 30 seconds audio chunks (for shorter audio it is automatically padded with silence to 30 seconds) `fast` mode has the lowest latency (time from starting transcription to first token returned, caused by streaming algorithm), but the slowest speed. If you believe that this might be a problem for you, prefer `balanced` mode instead. 
- -| Model (mode) | iPhone 16 Pro (XNNPACK) [latency \| tokens/s] | iPhone 14 Pro (XNNPACK) [latency \| tokens/s] | iPhone SE 3 (XNNPACK) [latency \| tokens/s] | Samsung Galaxy S24 (XNNPACK) [latency \| tokens/s] | OnePlus 12 (XNNPACK) [latency \| tokens/s] | -| ----------------------- | :-------------------------------------------: | :-------------------------------------------: | :-----------------------------------------: | :------------------------------------------------: | :----------------------------------------: | -| Whisper-tiny (fast) | 2.8s \| 5.5t/s | 3.7s \| 4.4t/s | 4.4s \| 3.4t/s | 5.5s \| 3.1t/s | 5.3s \| 3.8t/s | -| Whisper-tiny (balanced) | 5.6s \| 7.9t/s | 7.0s \| 6.3t/s | 8.3s \| 5.0t/s | 8.4s \| 6.7t/s | 7.7s \| 7.2t/s | -| Whisper-tiny (quality) | 10.3s \| 8.3t/s | 12.6s \| 6.8t/s | 7.8s \| 8.9t/s | 13.5s \| 7.1t/s | 12.9s \| 7.5t/s | - ### Encoding Average time for encoding audio of given length over 10 runs. For `Whisper` model we only list 30 sec audio chunks since `Whisper` does not accept other lengths (for shorter audio the audio needs to be padded to 30sec with silence). -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| Whisper-tiny (30s) | 1034 | 1344 | 1269 | 2916 | 2143 | +| Whisper-tiny (30s) | 1391 | 1372 | 1894 | 1303 | 1214 | ### Decoding -Average time for decoding one token in sequence of 100 tokens, with encoding context is obtained from audio of noted length. +Average time for decoding one token in sequence of approximately 100 tokens, with encoding context is obtained from audio of noted length. 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| Whisper-tiny (30s) | 128.03 | 113.65 | 141.63 | 89.08 | 84.49 | +| Whisper-tiny (30s) | 53 | 53 | 74 | 100 | 84 | ## Text Embeddings -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] | -| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: | -| ALL_MINILM_L6_V2 | 15 | 22 | 23 | 36 | 31 | -| ALL_MPNET_BASE_V2 | 71 | 96 | 101 | 112 | 105 | -| MULTI_QA_MINILM_L6_COS_V1 | 15 | 22 | 23 | 36 | 31 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 71 | 95 | 100 | 112 | 105 | -| CLIP_VIT_BASE_PATCH32_TEXT | 31 | 47 | 48 | 55 | 49 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| ALL_MINILM_L6_V2 | 16 | 16 | 19 | 54 | 28 | +| ALL_MPNET_BASE_V2 | 115 | 116 | 144 | 145 | 95 | +| MULTI_QA_MINILM_L6_COS_V1 | 16 | 16 | 20 | 47 | 28 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 112 | 119 | 144 | 146 | 96 | +| CLIP_VIT_BASE_PATCH32_TEXT | 47 | 45 | 57 | 65 | 48 | :::info Benchmark times for text embeddings are highly dependent on the sentence length. The numbers above are based on a sentence of around 80 tokens. For shorter or longer sentences, inference time may vary accordingly. @@ -104,9 +96,9 @@ Benchmark times for text embeddings are highly dependent on the sentence length. ## Image Embeddings -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| --------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| CLIP_VIT_BASE_PATCH32_IMAGE | 48 | 64 | 69 | 65 | 63 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| CLIP_VIT_BASE_PATCH32_IMAGE | 70 | 70 | 90 | 66 | 58 | :::info Image embedding benchmark times are measured using 224×224 pixel images, as required by the model. All input images, whether larger or smaller, are resized to 224×224 before processing. Resizing is typically fast for small images but may be noticeably slower for very large images, which can increase total inference time. 
@@ -114,8 +106,6 @@ Image embedding benchmark times are measured using 224×224 pixel images, as req ## Text to Image -Average time for generating one image of size 256×256 in 10 inference steps. - -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :-------------------: | :-------------------------------: | :-----------------------: | -| BK_SDM_TINY_VPRED_256 | 19100 | 25000 | ❌ | ❌ | 23100 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| BK_SDM_TINY_VPRED_256 | 21184 | 21021 | ❌ | 18834 | 16617 | diff --git a/docs/docs/04-benchmarks/memory-usage.md b/docs/docs/04-benchmarks/memory-usage.md index e34c8a7ca..a0c5a7b6d 100644 --- a/docs/docs/04-benchmarks/memory-usage.md +++ b/docs/docs/04-benchmarks/memory-usage.md @@ -2,76 +2,80 @@ title: Memory Usage --- +:::info +All the below benchmarks were performed on iPhone 17 Pro (iOS) and OnePlus 12 (Android). +::: + ## Classification | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ----------------- | :--------------------: | :----------------: | -| EFFICIENTNET_V2_S | 130 | 85 | +| EFFICIENTNET_V2_S | 230 | 87 | ## Object Detection | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------------------------ | :--------------------: | :----------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 164 | 132 | ## Style Transfer | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ---------------------------- | :--------------------: | :----------------: | -| STYLE_TRANSFER_CANDY | 950 | 350 | -| STYLE_TRANSFER_MOSAIC | 950 | 350 | -| STYLE_TRANSFER_UDNIE | 950 | 350 | -| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | +| STYLE_TRANSFER_CANDY | 1200 | 380 | +| STYLE_TRANSFER_MOSAIC | 1200 | 380 | +| STYLE_TRANSFER_UDNIE | 1200 | 380 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1200 | 380 | ## OCR -| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | -| -------------------------------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_800) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 2100 | 1782 | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ------------------------------------------------------------------------------------------------------ | :--------------------: | :----------------: | +| Detector (CRAFT_800_QUANTIZED) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 1400 | 1320 | ## Vertical OCR -| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | -| -------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 2770 | 3720 | -| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1770 | 2740 | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ---------------------------------------------------------------------------------------- | :--------------------: | :----------------: | +| Detector 
(CRAFT_1280_QUANTIZED) + Detector (CRAFT_320_QUANTIZED) + Recognizer (CRNN_512) | 1540 | 1470 | +| Detector(CRAFT_1280_QUANTIZED) + Detector(CRAFT_320_QUANTIZED) + Recognizer (CRNN_64) | 1070 | 1000 | ## LLMs | Model | Android (XNNPACK) [GB] | iOS (XNNPACK) [GB] | | --------------------- | :--------------------: | :----------------: | -| LLAMA3_2_1B | 3.2 | 3.1 | -| LLAMA3_2_1B_SPINQUANT | 1.9 | 2 | -| LLAMA3_2_1B_QLORA | 2.2 | 2.5 | +| LLAMA3_2_1B | 3.3 | 3.1 | +| LLAMA3_2_1B_SPINQUANT | 1.9 | 2.4 | +| LLAMA3_2_1B_QLORA | 2.7 | 2.8 | | LLAMA3_2_3B | 7.1 | 7.3 | | LLAMA3_2_3B_SPINQUANT | 3.7 | 3.8 | -| LLAMA3_2_3B_QLORA | 4 | 4.1 | +| LLAMA3_2_3B_QLORA | 3.9 | 4.0 | ## Speech to text | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------ | :--------------------: | :----------------: | -| WHISPER_TINY | 900 | 600 | +| WHISPER_TINY | 410 | 375 | ## Text Embeddings | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | -------------------------- | :--------------------: | :----------------: | -| ALL_MINILM_L6_V2 | 85 | 100 | -| ALL_MPNET_BASE_V2 | 390 | 465 | -| MULTI_QA_MINILM_L6_COS_V1 | 115 | 130 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 415 | 490 | -| CLIP_VIT_BASE_PATCH32_TEXT | 195 | 250 | +| ALL_MINILM_L6_V2 | 95 | 110 | +| ALL_MPNET_BASE_V2 | 405 | 455 | +| MULTI_QA_MINILM_L6_COS_V1 | 120 | 140 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 435 | 455 | +| CLIP_VIT_BASE_PATCH32_TEXT | 200 | 280 | ## Image Embeddings | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | --------------------------- | :--------------------: | :----------------: | -| CLIP_VIT_BASE_PATCH32_IMAGE | 350 | 340 | +| CLIP_VIT_BASE_PATCH32_IMAGE | 345 | 340 | ## Text to Image | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | --------------------- | ---------------------- | ------------------ | -| BK_SDM_TINY_VPRED_256 | 2900 | 2800 | -| BK_SDM_TINY_VPRED | 6700 | 6560 | +| BK_SDM_TINY_VPRED_256 | 2400 | 2400 | +| BK_SDM_TINY_VPRED | 6210 | 6050 | diff --git a/docs/docs/04-benchmarks/model-size.md b/docs/docs/04-benchmarks/model-size.md index 5cf87f6fa..30999fa7b 100644 --- a/docs/docs/04-benchmarks/model-size.md +++ b/docs/docs/04-benchmarks/model-size.md @@ -27,7 +27,7 @@ title: Model Size | Model | XNNPACK [MB] | | --------------------- | :----------: | -| Detector (CRAFT_800) | 83.1 | +| Detector (CRAFT_800) | 19.8 | | Recognizer (CRNN_512) | 15 - 18\* | | Recognizer (CRNN_256) | 16 - 18\* | | Recognizer (CRNN_128) | 17 - 19\* | @@ -38,8 +38,8 @@ title: Model Size | Model | XNNPACK [MB] | | ------------------------ | :----------: | -| Detector (CRAFT_1280) | 83.1 | -| Detector (CRAFT_320) | 83.1 | +| Detector (CRAFT_1280) | 19.8 | +| Detector (CRAFT_320) | 19.8 | | Recognizer (CRNN_EN_512) | 15 - 18\* | | Recognizer (CRNN_EN_64) | 15 - 16\* | diff --git a/docs/versioned_docs/version-0.5.x/04-benchmarks/inference-time.md b/docs/versioned_docs/version-0.5.x/04-benchmarks/inference-time.md index 504c0f6e9..89f1f9de1 100644 --- a/docs/versioned_docs/version-0.5.x/04-benchmarks/inference-time.md +++ b/docs/versioned_docs/version-0.5.x/04-benchmarks/inference-time.md @@ -8,46 +8,48 @@ Times presented in the tables are measured as consecutive runs of the model. 
Ini ## Classification -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | +| EFFICIENTNET_V2_S | 105 | 110 | 149 | 299 | 227 | ## Object Detection -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 116 | 120 | 164 | 257 | 129 | ## Style Transfer -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_CANDY | 1356 | 1550 | 2003 | 2578 | 2328 | +| STYLE_TRANSFER_MOSAIC | 1376 | 1456 | 1971 | 2657 | 2394 | +| STYLE_TRANSFER_UDNIE | 1389 | 1499 | 1858 | 2380 | 2124 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1339 | 1514 | 2004 | 2608 | 2371 | ## OCR -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: | -| Detector (CRAFT_800) | 2099 | 2227 | ❌ | 2245 | 7108 | -| Recognizer (CRNN_512) | 70 | 252 | ❌ | 54 | 151 | -| Recognizer (CRNN_256) | 39 | 123 | ❌ | 24 | 78 | -| Recognizer (CRNN_128) | 17 | 83 | ❌ | 14 | 39 | +Notice that the recognizer models were executed between 3 and 7 times during a single recognition. +The values below represent the averages across all runs for the benchmark image. -❌ - Insufficient RAM. 
+| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Detector (CRAFT_800_QUANTIZED) | 669 | 649 | 825 | 541 | 474 | +| Recognizer (CRNN_512) | 48 | 47 | 60 | 91 | 72 | +| Recognizer (CRNN_256) | 22 | 22 | 29 | 51 | 30 | +| Recognizer (CRNN_128) | 11 | 11 | 14 | 28 | 17 | ## Vertical OCR -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: | -| Detector (CRAFT_1280) | 5457 | 5833 | ❌ | 6296 | 14053 | -| Detector (CRAFT_320) | 1351 | 1460 | ❌ | 1485 | 3101 | -| Recognizer (CRNN_512) | 39 | 123 | ❌ | 24 | 78 | -| Recognizer (CRNN_64) | 10 | 33 | ❌ | 7 | 18 | +Notice that the recognizer models, as well as detector CRAFT_320 model, were executed between 4 and 21 times during a single recognition. +The values below represent the averages across all runs for the benchmark image. -❌ - Insufficient RAM. +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Detector (CRAFT_1280_QUANTIZED) | 1749 | 1804 | 2105 | 1216 | 1171 | +| Detector (CRAFT_320_QUANTIZED) | 458 | 474 | 561 | 360 | 332 | +| Recognizer (CRNN_512) | 54 | 52 | 68 | 144 | 72 | +| Recognizer (CRNN_64) | 5 | 6 | 7 | 28 | 11 | ## LLMs @@ -62,41 +64,31 @@ Times presented in the tables are measured as consecutive runs of the model. Ini ❌ - Insufficient RAM. -### Streaming mode - -Notice than for `Whisper` model which has to take as an input 30 seconds audio chunks (for shorter audio it is automatically padded with silence to 30 seconds) `fast` mode has the lowest latency (time from starting transcription to first token returned, caused by streaming algorithm), but the slowest speed. If you believe that this might be a problem for you, prefer `balanced` mode instead. - -| Model (mode) | iPhone 16 Pro (XNNPACK) [latency \| tokens/s] | iPhone 14 Pro (XNNPACK) [latency \| tokens/s] | iPhone SE 3 (XNNPACK) [latency \| tokens/s] | Samsung Galaxy S24 (XNNPACK) [latency \| tokens/s] | OnePlus 12 (XNNPACK) [latency \| tokens/s] | -| ------------------------- | :-------------------------------------------: | :-------------------------------------------: | :-----------------------------------------: | :------------------------------------------------: | :----------------------------------------: | -| Whisper-tiny (fast) | 2.8s \| 5.5t/s | 3.7s \| 4.4t/s | 4.4s \| 3.4t/s | 5.5s \| 3.1t/s | 5.3s \| 3.8t/s | -| Whisper-tiny (balanced) | 5.6s \| 7.9t/s | 7.0s \| 6.3t/s | 8.3s \| 5.0t/s | 8.4s \| 6.7t/s | 7.7s \| 7.2t/s | -| Whisper-tiny (quality) | 10.3s \| 8.3t/s | 12.6s \| 6.8t/s | 7.8s \| 8.9t/s | 13.5s \| 7.1t/s | 12.9s \| 7.5t/s | - ### Encoding Average time for encoding audio of given length over 10 runs. 
For `Whisper` model we only list 30 sec audio chunks since `Whisper` does not accept other lengths (for shorter audio the audio needs to be padded to 30sec with silence). -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| -------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| Whisper-tiny (30s) | 1034 | 1344 | 1269 | 2916 | 2143 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Whisper-tiny (30s) | 1391 | 1372 | 1894 | 1303 | 1214 | ### Decoding -Average time for decoding one token in sequence of 100 tokens, with encoding context is obtained from audio of noted length. +Average time for decoding one token in sequence of approximately 100 tokens, with encoding context is obtained from audio of noted length. -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| -------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| Whisper-tiny (30s) | 128.03 | 113.65 | 141.63 | 89.08 | 84.49 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Whisper-tiny (30s) | 53 | 53 | 74 | 100 | 84 | ## Text Embeddings -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] | -| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: | -| ALL_MINILM_L6_V2 | 15 | 22 | 23 | 36 | 31 | -| ALL_MPNET_BASE_V2 | 71 | 96 | 101 | 112 | 105 | -| MULTI_QA_MINILM_L6_COS_V1 | 15 | 22 | 23 | 36 | 31 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 71 | 95 | 100 | 112 | 105 | -| CLIP_VIT_BASE_PATCH32_TEXT | 31 | 47 | 48 | 55 | 49 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| ALL_MINILM_L6_V2 | 16 | 16 | 19 | 54 | 28 | +| ALL_MPNET_BASE_V2 | 115 | 116 | 144 | 145 | 95 | +| MULTI_QA_MINILM_L6_COS_V1 | 16 | 16 | 20 | 47 | 28 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 112 | 119 | 144 | 146 | 96 | +| CLIP_VIT_BASE_PATCH32_TEXT | 47 | 45 | 57 | 65 | 48 | :::info Benchmark times for text embeddings are highly dependent on the sentence length. The numbers above are based on a sentence of around 80 tokens. 
For shorter or longer sentences, inference time may vary accordingly. @@ -104,10 +96,16 @@ Benchmark times for text embeddings are highly dependent on the sentence length. ## Image Embeddings -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| --------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| CLIP_VIT_BASE_PATCH32_IMAGE | 48 | 64 | 69 | 65 | 63 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| CLIP_VIT_BASE_PATCH32_IMAGE | 70 | 70 | 90 | 66 | 58 | :::info Image embedding benchmark times are measured using 224×224 pixel images, as required by the model. All input images, whether larger or smaller, are resized to 224×224 before processing. Resizing is typically fast for small images but may be noticeably slower for very large images, which can increase total inference time. ::: + +## Text to Image + +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| BK_SDM_TINY_VPRED_256 | 21184 | 21021 | ❌ | 18834 | 16617 | diff --git a/docs/versioned_docs/version-0.5.x/04-benchmarks/memory-usage.md b/docs/versioned_docs/version-0.5.x/04-benchmarks/memory-usage.md index 684020e2a..a0c5a7b6d 100644 --- a/docs/versioned_docs/version-0.5.x/04-benchmarks/memory-usage.md +++ b/docs/versioned_docs/version-0.5.x/04-benchmarks/memory-usage.md @@ -2,69 +2,80 @@ title: Memory Usage --- +:::info +All the below benchmarks were performed on iPhone 17 Pro (iOS) and OnePlus 12 (Android). 
+::: + ## Classification | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ----------------- | :--------------------: | :----------------: | -| EFFICIENTNET_V2_S | 130 | 85 | +| EFFICIENTNET_V2_S | 230 | 87 | ## Object Detection | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------------------------ | :--------------------: | :----------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 164 | 132 | ## Style Transfer | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ---------------------------- | :--------------------: | :----------------: | -| STYLE_TRANSFER_CANDY | 950 | 350 | -| STYLE_TRANSFER_MOSAIC | 950 | 350 | -| STYLE_TRANSFER_UDNIE | 950 | 350 | -| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | +| STYLE_TRANSFER_CANDY | 1200 | 380 | +| STYLE_TRANSFER_MOSAIC | 1200 | 380 | +| STYLE_TRANSFER_UDNIE | 1200 | 380 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1200 | 380 | ## OCR -| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | -| -------------------------------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_800) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 2100 | 1782 | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ------------------------------------------------------------------------------------------------------ | :--------------------: | :----------------: | +| Detector (CRAFT_800_QUANTIZED) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 1400 | 1320 | ## Vertical OCR -| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | -| -------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 2770 | 3720 | -| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1770 | 2740 | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ---------------------------------------------------------------------------------------- | :--------------------: | :----------------: | +| Detector (CRAFT_1280_QUANTIZED) + Detector (CRAFT_320_QUANTIZED) + Recognizer (CRNN_512) | 1540 | 1470 | +| Detector(CRAFT_1280_QUANTIZED) + Detector(CRAFT_320_QUANTIZED) + Recognizer (CRNN_64) | 1070 | 1000 | ## LLMs | Model | Android (XNNPACK) [GB] | iOS (XNNPACK) [GB] | | --------------------- | :--------------------: | :----------------: | -| LLAMA3_2_1B | 3.2 | 3.1 | -| LLAMA3_2_1B_SPINQUANT | 1.9 | 2 | -| LLAMA3_2_1B_QLORA | 2.2 | 2.5 | +| LLAMA3_2_1B | 3.3 | 3.1 | +| LLAMA3_2_1B_SPINQUANT | 1.9 | 2.4 | +| LLAMA3_2_1B_QLORA | 2.7 | 2.8 | | LLAMA3_2_3B | 7.1 | 7.3 | | LLAMA3_2_3B_SPINQUANT | 3.7 | 3.8 | -| LLAMA3_2_3B_QLORA | 4 | 4.1 | +| LLAMA3_2_3B_QLORA | 3.9 | 4.0 | ## Speech to text | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------ | :--------------------: | :----------------: | -| WHISPER_TINY | 900 | 600 | +| WHISPER_TINY | 410 | 375 | ## Text Embeddings | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | -------------------------- | :--------------------: | :----------------: | -| ALL_MINILM_L6_V2 | 85 | 100 | -| ALL_MPNET_BASE_V2 | 390 | 465 | -| MULTI_QA_MINILM_L6_COS_V1 | 115 | 130 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 415 | 490 | -| CLIP_VIT_BASE_PATCH32_TEXT | 195 | 250 | +| ALL_MINILM_L6_V2 | 95 | 110 | +| ALL_MPNET_BASE_V2 | 405 | 455 | +| MULTI_QA_MINILM_L6_COS_V1 | 120 | 140 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 435 | 455 | 
+| CLIP_VIT_BASE_PATCH32_TEXT | 200 | 280 | ## Image Embeddings | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | --------------------------- | :--------------------: | :----------------: | -| CLIP_VIT_BASE_PATCH32_IMAGE | 350 | 340 | +| CLIP_VIT_BASE_PATCH32_IMAGE | 345 | 340 | + +## Text to Image + +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| --------------------- | ---------------------- | ------------------ | +| BK_SDM_TINY_VPRED_256 | 2400 | 2400 | +| BK_SDM_TINY_VPRED | 6210 | 6050 | diff --git a/docs/versioned_docs/version-0.5.x/04-benchmarks/model-size.md b/docs/versioned_docs/version-0.5.x/04-benchmarks/model-size.md index 9d20c95d5..30999fa7b 100644 --- a/docs/versioned_docs/version-0.5.x/04-benchmarks/model-size.md +++ b/docs/versioned_docs/version-0.5.x/04-benchmarks/model-size.md @@ -27,7 +27,7 @@ title: Model Size | Model | XNNPACK [MB] | | --------------------- | :----------: | -| Detector (CRAFT_800) | 83.1 | +| Detector (CRAFT_800) | 19.8 | | Recognizer (CRNN_512) | 15 - 18\* | | Recognizer (CRNN_256) | 16 - 18\* | | Recognizer (CRNN_128) | 17 - 19\* | @@ -38,8 +38,8 @@ title: Model Size | Model | XNNPACK [MB] | | ------------------------ | :----------: | -| Detector (CRAFT_1280) | 83.1 | -| Detector (CRAFT_320) | 83.1 | +| Detector (CRAFT_1280) | 19.8 | +| Detector (CRAFT_320) | 19.8 | | Recognizer (CRNN_EN_512) | 15 - 18\* | | Recognizer (CRNN_EN_64) | 15 - 16\* | @@ -82,3 +82,9 @@ title: Model Size | Model | XNNPACK [MB] | | --------------------------- | :----------: | | CLIP_VIT_BASE_PATCH32_IMAGE | 352 | + +## Text to Image + +| Model | Text encoder (XNNPACK) [MB] | UNet (XNNPACK) [MB] | VAE decoder (XNNPACK) [MB] | +| ----------------- | --------------------------- | ------------------- | -------------------------- | +| BK_SDM_TINY_VPRED | 492 | 1290 | 198 | From 8fbfd006e0f1939def764baa867f64a65f2ae52d Mon Sep 17 00:00:00 2001 From: IgorSwat Date: Mon, 3 Nov 2025 08:14:02 +0100 Subject: [PATCH 02/11] chore: update v0.4.0 benchmarks (#481) --- .../benchmarks/inference-time.md | 68 ++++++++++--------- .../version-0.4.x/benchmarks/memory-usage.md | 14 ++-- .../version-0.4.x/benchmarks/model-size.md | 24 +++---- 3 files changed, 54 insertions(+), 52 deletions(-) diff --git a/docs/versioned_docs/version-0.4.x/benchmarks/inference-time.md b/docs/versioned_docs/version-0.4.x/benchmarks/inference-time.md index da35e7b6e..f5d6d0113 100644 --- a/docs/versioned_docs/version-0.4.x/benchmarks/inference-time.md +++ b/docs/versioned_docs/version-0.4.x/benchmarks/inference-time.md @@ -8,50 +8,52 @@ Times presented in the tables are measured as consecutive runs of the model. 
Ini ## Classification -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | +| EFFICIENTNET_V2_S | 150 | 161 | 227 | 196 | 214 | ## Object Detection -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 261 | 279 | 414 | 125 | 115 | ## Style Transfer -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_CANDY | 1565 | 1675 | 2325 | 1750 | 1620 | +| STYLE_TRANSFER_MOSAIC | 1565 | 1675 | 2325 | 1750 | 1620 | +| STYLE_TRANSFER_UDNIE | 1565 | 1675 | 2325 | 1750 | 1620 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1565 | 1675 | 2325 | 1750 | 1620 | ## OCR -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: | -| Detector (CRAFT_800) | 2099 | 2227 | ❌ | 2245 | 7108 | -| Recognizer (CRNN_512) | 70 | 252 | ❌ | 54 | 151 | -| Recognizer (CRNN_256) | 39 | 123 | ❌ | 24 | 78 | -| Recognizer (CRNN_128) | 17 | 83 | ❌ | 14 | 39 | +Notice that the recognizer models were executed between 3 and 7 times during a single recognition. +The values below represent the averages across all runs for the benchmark image. -❌ - Insufficient RAM. 
+| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Detector (CRAFT_800_QUANTIZED) | 779 | 897 | 1276 | 553 | 586 | +| Recognizer (CRNN_512) | 77 | 74 | 244 | 56 | 57 | +| Recognizer (CRNN_256) | 35 | 37 | 120 | 28 | 30 | +| Recognizer (CRNN_128) | 18 | 19 | 60 | 14 | 16 | ## Vertical OCR -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: | -| Detector (CRAFT_1280) | 5457 | 5833 | ❌ | 6296 | 14053 | -| Detector (CRAFT_320) | 1351 | 1460 | ❌ | 1485 | 3101 | -| Recognizer (CRNN_512) | 39 | 123 | ❌ | 24 | 78 | -| Recognizer (CRNN_64) | 10 | 33 | ❌ | 7 | 18 | +Notice that the recognizer models, as well as detector CRAFT_320 model, were executed between 4 and 21 times during a single recognition. +The values below represent the averages across all runs for the benchmark image. -❌ - Insufficient RAM. +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Detector (CRAFT_1280_QUANTIZED) | 1918 | 2304 | 3371 | 1391 | 1445 | +| Detector (CRAFT_320_QUANTIZED) | 473 | 563 | 813 | 361 | 382 | +| Recognizer (CRNN_512) | 78 | 83 | 310 | 59 | 57 | +| Recognizer (CRNN_64) | 9 | 9 | 38 | 8 | 7 | ## LLMs -| Model | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone 13 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] | +| Model | iPhone 17 Pro (XNNPACK) [tokens/s] | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] | | --------------------- | :--------------------------------: | :--------------------------------: | :------------------------------: | :-------------------------------------: | :-----------------------------: | | LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 | | LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 | @@ -68,7 +70,7 @@ Times presented in the tables are measured as consecutive runs of the model. Ini Notice than for `Whisper` model which has to take as an input 30 seconds audio chunks (for shorter audio it is automatically padded with silence to 30 seconds) `fast` mode has the lowest latency (time from starting transcription to first token returned, caused by streaming algorithm), but the slowest speed. That's why for the lowest latency and the fastest transcription we suggest using `Moonshine` model, if you still want to proceed with `Whisper` use preferably the `balanced` mode. 
-| Model (mode) | iPhone 16 Pro (XNNPACK) [latency \| tokens/s] | iPhone 14 Pro (XNNPACK) [latency \| tokens/s] | iPhone SE 3 (XNNPACK) [latency \| tokens/s] | Samsung Galaxy S24 (XNNPACK) [latency \| tokens/s] | OnePlus 12 (XNNPACK) [latency \| tokens/s] | +| Model (mode) | iPhone 17 Pro (XNNPACK) [latency \| tokens/s] | iPhone 16 Pro (XNNPACK) [latency \| tokens/s] | iPhone SE 3 (XNNPACK) [latency \| tokens/s] | Samsung Galaxy S24 (XNNPACK) [latency \| tokens/s] | OnePlus 12 (XNNPACK) [latency \| tokens/s] | | ------------------------- | :-------------------------------------------: | :-------------------------------------------: | :-----------------------------------------: | :------------------------------------------------: | :----------------------------------------: | | Moonshine-tiny (fast) | 0.8s \| 19.0t/s | 1.5s \| 11.3t/s | 1.5s \| 10.4t/s | 2.0s \| 8.8t/s | 1.6s \| 12.5t/s | | Moonshine-tiny (balanced) | 2.0s \| 20.0t/s | 3.2s \| 12.4t/s | 3.7s \| 10.4t/s | 4.6s \| 11.2t/s | 3.4s \| 14.6t/s | @@ -81,7 +83,7 @@ Notice than for `Whisper` model which has to take as an input 30 seconds audio c Average time for encoding audio of given length over 10 runs. For `Whisper` model we only list 30 sec audio chunks since `Whisper` does not accept other lengths (for shorter audio the audio needs to be padded to 30sec with silence). -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | -------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | | Moonshine-tiny (5s) | 99 | 95 | 115 | 284 | 277 | | Moonshine-tiny (10s) | 178 | 177 | 204 | 555 | 528 | @@ -92,7 +94,7 @@ Average time for encoding audio of given length over 10 runs. For `Whisper` mode Average time for decoding one token in sequence of 100 tokens, with encoding context is obtained from audio of noted length. 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | -------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | | Moonshine-tiny (5s) | 48.98 | 47.98 | 46.86 | 36.70 | 29.03 | | Moonshine-tiny (10s) | 54.24 | 51.74 | 55.07 | 46.31 | 32.41 | @@ -101,9 +103,9 @@ Average time for decoding one token in sequence of 100 tokens, with encoding con ## Text Embeddings -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] | -| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: | -| ALL_MINILM_L6_V2 | 53 | 69 | 78 | 60 | 65 | -| ALL_MPNET_BASE_V2 | 352 | 423 | 478 | 521 | 527 | -| MULTI_QA_MINILM_L6_COS_V1 | 135 | 166 | 180 | 158 | 165 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 503 | 598 | 680 | 694 | 743 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| ALL_MINILM_L6_V2 | 50 | 58 | 84 | 58 | 58 | +| ALL_MPNET_BASE_V2 | 352 | 428 | 879 | 483 | 517 | +| MULTI_QA_MINILM_L6_COS_V1 | 133 | 161 | 269 | 151 | 155 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 502 | 796 | 1216 | 915 | 713 | diff --git a/docs/versioned_docs/version-0.4.x/benchmarks/memory-usage.md b/docs/versioned_docs/version-0.4.x/benchmarks/memory-usage.md index 862ffd574..25298f630 100644 --- a/docs/versioned_docs/version-0.4.x/benchmarks/memory-usage.md +++ b/docs/versioned_docs/version-0.4.x/benchmarks/memory-usage.md @@ -25,16 +25,16 @@ title: Memory Usage ## OCR -| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | -| -------------------------------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_800) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 2100 | 1782 | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ------------------------------------------------------------------------------------------------------ | :--------------------: | :----------------: | +| Detector (CRAFT_800_QUANTIZED) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 1400 | 1320 | ## Vertical OCR -| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | -| -------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 2770 | 3720 | -| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1770 | 2740 | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ---------------------------------------------------------------------------------------- | :--------------------: | :----------------: | +| Detector (CRAFT_1280_QUANTIZED) + Detector (CRAFT_320_QUANTIZED) + Recognizer 
(CRNN_512) | 1540 | 1470 | +| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1070 | 1000 | ## LLMs diff --git a/docs/versioned_docs/version-0.4.x/benchmarks/model-size.md b/docs/versioned_docs/version-0.4.x/benchmarks/model-size.md index f39fa2f14..d5e890120 100644 --- a/docs/versioned_docs/version-0.4.x/benchmarks/model-size.md +++ b/docs/versioned_docs/version-0.4.x/benchmarks/model-size.md @@ -25,23 +25,23 @@ title: Model Size ## OCR -| Model | XNNPACK [MB] | -| --------------------- | :----------: | -| Detector (CRAFT_800) | 83.1 | -| Recognizer (CRNN_512) | 15 - 18\* | -| Recognizer (CRNN_256) | 16 - 18\* | -| Recognizer (CRNN_128) | 17 - 19\* | +| Model | XNNPACK [MB] | +| ------------------------------ | :----------: | +| Detector (CRAFT_800_QUANTIZED) | 19.8 | +| Recognizer (CRNN_512) | 15 - 18\* | +| Recognizer (CRNN_256) | 16 - 18\* | +| Recognizer (CRNN_128) | 17 - 19\* | \* - The model weights vary depending on the language. ## Vertical OCR -| Model | XNNPACK [MB] | -| ------------------------ | :----------: | -| Detector (CRAFT_1280) | 83.1 | -| Detector (CRAFT_320) | 83.1 | -| Recognizer (CRNN_EN_512) | 15 - 18\* | -| Recognizer (CRNN_EN_64) | 15 - 16\* | +| Model | XNNPACK [MB] | +| ------------------------------- | :----------: | +| Detector (CRAFT_1280_QUANTIZED) | 19.8 | +| Detector (CRAFT_320_QUANTIZED) | 19.8 | +| Recognizer (CRNN_EN_512) | 15 - 18\* | +| Recognizer (CRNN_EN_64) | 15 - 16\* | \* - The model weights vary depending on the language. From b9de78e5d01ea2999a1e09bae1d9a1e208375b4f Mon Sep 17 00:00:00 2001 From: IgorSwat Date: Mon, 3 Nov 2025 08:31:02 +0100 Subject: [PATCH 03/11] chore: minor naming fixes --- docs/docs/04-benchmarks/model-size.md | 24 +++++++++---------- .../version-0.5.x/04-benchmarks/model-size.md | 24 +++++++++---------- 2 files changed, 24 insertions(+), 24 deletions(-) diff --git a/docs/docs/04-benchmarks/model-size.md b/docs/docs/04-benchmarks/model-size.md index 30999fa7b..128cbd7fb 100644 --- a/docs/docs/04-benchmarks/model-size.md +++ b/docs/docs/04-benchmarks/model-size.md @@ -25,23 +25,23 @@ title: Model Size ## OCR -| Model | XNNPACK [MB] | -| --------------------- | :----------: | -| Detector (CRAFT_800) | 19.8 | -| Recognizer (CRNN_512) | 15 - 18\* | -| Recognizer (CRNN_256) | 16 - 18\* | -| Recognizer (CRNN_128) | 17 - 19\* | +| Model | XNNPACK [MB] | +| ------------------------------ | :----------: | +| Detector (CRAFT_800_QUANTIZED) | 19.8 | +| Recognizer (CRNN_512) | 15 - 18\* | +| Recognizer (CRNN_256) | 16 - 18\* | +| Recognizer (CRNN_128) | 17 - 19\* | \* - The model weights vary depending on the language. ## Vertical OCR -| Model | XNNPACK [MB] | -| ------------------------ | :----------: | -| Detector (CRAFT_1280) | 19.8 | -| Detector (CRAFT_320) | 19.8 | -| Recognizer (CRNN_EN_512) | 15 - 18\* | -| Recognizer (CRNN_EN_64) | 15 - 16\* | +| Model | XNNPACK [MB] | +| ------------------------------- | :----------: | +| Detector (CRAFT_1280_QUANTIZED) | 19.8 | +| Detector (CRAFT_320_QUANTIZED) | 19.8 | +| Recognizer (CRNN_EN_512) | 15 - 18\* | +| Recognizer (CRNN_EN_64) | 15 - 16\* | \* - The model weights vary depending on the language. 
diff --git a/docs/versioned_docs/version-0.5.x/04-benchmarks/model-size.md b/docs/versioned_docs/version-0.5.x/04-benchmarks/model-size.md index 30999fa7b..128cbd7fb 100644 --- a/docs/versioned_docs/version-0.5.x/04-benchmarks/model-size.md +++ b/docs/versioned_docs/version-0.5.x/04-benchmarks/model-size.md @@ -25,23 +25,23 @@ title: Model Size ## OCR -| Model | XNNPACK [MB] | -| --------------------- | :----------: | -| Detector (CRAFT_800) | 19.8 | -| Recognizer (CRNN_512) | 15 - 18\* | -| Recognizer (CRNN_256) | 16 - 18\* | -| Recognizer (CRNN_128) | 17 - 19\* | +| Model | XNNPACK [MB] | +| ------------------------------ | :----------: | +| Detector (CRAFT_800_QUANTIZED) | 19.8 | +| Recognizer (CRNN_512) | 15 - 18\* | +| Recognizer (CRNN_256) | 16 - 18\* | +| Recognizer (CRNN_128) | 17 - 19\* | \* - The model weights vary depending on the language. ## Vertical OCR -| Model | XNNPACK [MB] | -| ------------------------ | :----------: | -| Detector (CRAFT_1280) | 19.8 | -| Detector (CRAFT_320) | 19.8 | -| Recognizer (CRNN_EN_512) | 15 - 18\* | -| Recognizer (CRNN_EN_64) | 15 - 16\* | +| Model | XNNPACK [MB] | +| ------------------------------- | :----------: | +| Detector (CRAFT_1280_QUANTIZED) | 19.8 | +| Detector (CRAFT_320_QUANTIZED) | 19.8 | +| Recognizer (CRNN_EN_512) | 15 - 18\* | +| Recognizer (CRNN_EN_64) | 15 - 16\* | \* - The model weights vary depending on the language. From bbd7f530d6e91f4aedf2c7af3bd69f9b16dcfaa7 Mon Sep 17 00:00:00 2001 From: IgorSwat Date: Mon, 17 Nov 2025 10:46:25 +0100 Subject: [PATCH 04/11] chore: update docs subsections --- .../useSpeechToText.md | 30 ++-- .../useTextEmbeddings.md | 24 ++-- .../02-computer-vision/useClassification.md | 6 +- .../02-computer-vision/useImageEmbeddings.md | 6 +- .../02-hooks/02-computer-vision/useOCR.md | 46 +++--- .../02-computer-vision/useObjectDetection.md | 6 +- .../02-computer-vision/useStyleTransfer.md | 18 +-- .../02-computer-vision/useTextToImage.md | 6 +- .../02-computer-vision/useVerticalOCR.md | 44 +++--- .../computer-vision/useClassification.md | 8 +- .../version-0.4.x/computer-vision/useOCR.md | 14 +- .../computer-vision/useObjectDetection.md | 4 +- .../computer-vision/useStyleTransfer.md | 10 +- .../computer-vision/useVerticalOCR.md | 14 +- .../useTextEmbeddings.md | 12 +- .../useSpeechToText.md | 30 ++-- .../useTextEmbeddings.md | 24 ++-- .../02-computer-vision/useClassification.md | 6 +- .../02-computer-vision/useImageEmbeddings.md | 10 +- .../02-hooks/02-computer-vision/useOCR.md | 60 ++++---- .../02-computer-vision/useObjectDetection.md | 6 +- .../02-computer-vision/useStyleTransfer.md | 18 +-- .../02-computer-vision/useTextToImage.md | 133 ++++++++++++++++++ .../02-computer-vision/useVerticalOCR.md | 58 ++++---- 24 files changed, 357 insertions(+), 236 deletions(-) create mode 100644 docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useTextToImage.md diff --git a/docs/docs/02-hooks/01-natural-language-processing/useSpeechToText.md b/docs/docs/02-hooks/01-natural-language-processing/useSpeechToText.md index 8876bf37e..d94c96a66 100644 --- a/docs/docs/02-hooks/01-natural-language-processing/useSpeechToText.md +++ b/docs/docs/02-hooks/01-natural-language-processing/useSpeechToText.md @@ -75,20 +75,20 @@ For more information on loading resources, take a look at [loading models](../.. 
### Returns -| Field | Type | Description | -| --------------------------- | ---------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `transcribe` | `(waveform: Float32Array \| number[], options?: DecodingOptions \| undefined) => Promise` | Starts a transcription process for a given input array, which should be a waveform at 16kHz. The second argument is an options object, e.g. `{ language: 'es' }` for multilingual models. Resolves a promise with the output transcription when the model is finished. Passing `number[]` is deprecated. | -| `stream` | `(options?: DecodingOptions \| undefined) => Promise` | Starts a streaming transcription process. Use in combination with `streamInsert` to feed audio chunks and `streamStop` to end the stream. The argument is an options object, e.g. `{ language: 'es' }` for multilingual models. Updates `committedTranscription` and `nonCommittedTranscription` as transcription progresses. | -| `streamInsert` | `(waveform: Float32Array \| number[]) => void` | Inserts a chunk of audio data (sampled at 16kHz) into the ongoing streaming transcription. Call this repeatedly as new audio data becomes available. Passing `number[]` is deprecated. | -| `streamStop` | `() => void` | Stops the ongoing streaming transcription process. | -| `encode` | `(waveform: Float32Array \| number[]) => Promise` | Runs the encoding part of the model on the provided waveform. Passing `number[]` is deprecated. | -| `decode` | `(tokens: number[] \| Int32Array, encoderOutput: Float32Array \| number[]) => Promise` | Runs the decoder of the model. Passing `number[]` is deprecated. | -| `committedTranscription` | `string` | Contains the part of the transcription that is finalized and will not change. Useful for displaying stable results during streaming. | -| `nonCommittedTranscription` | `string` | Contains the part of the transcription that is still being processed and may change. Useful for displaying live, partial results during streaming. | -| `error` | `string \| null` | Contains the error message if the model failed to load. | -| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. | -| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. | -| `downloadProgress` | `number` | Tracks the progress of the model download process. | +| Field | Type | Description | +| --------------------------- | ---------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `transcribe` | `(waveform: Float32Array \| number[], options?: DecodingOptions \| undefined) => Promise` | Starts a transcription process for a given input array, which should be a waveform at 16kHz. The second argument is an options object, e.g. `{ language: 'es' }` for multilingual models. 
Resolves a promise with the output transcription when the model is finished. Passing `number[]` is deprecated. | +| `stream` | `(options?: DecodingOptions \| undefined) => Promise` | Starts a streaming transcription process. Use in combination with `streamInsert` to feed audio chunks and `streamStop` to end the stream. The argument is an options object, e.g. `{ language: 'es' }` for multilingual models. Updates `committedTranscription` and `nonCommittedTranscription` as transcription progresses. | +| `streamInsert` | `(waveform: Float32Array \| number[]) => void` | Inserts a chunk of audio data (sampled at 16kHz) into the ongoing streaming transcription. Call this repeatedly as new audio data becomes available. Passing `number[]` is deprecated. | +| `streamStop` | `() => void` | Stops the ongoing streaming transcription process. | +| `encode` | `(waveform: Float32Array \| number[]) => Promise` | Runs the encoding part of the model on the provided waveform. Passing `number[]` is deprecated. | +| `decode` | `(tokens: number[] \| Int32Array, encoderOutput: Float32Array \| number[]) => Promise` | Runs the decoder of the model. Passing `number[]` is deprecated. | +| `committedTranscription` | `string` | Contains the part of the transcription that is finalized and will not change. Useful for displaying stable results during streaming. | +| `nonCommittedTranscription` | `string` | Contains the part of the transcription that is still being processed and may change. Useful for displaying live, partial results during streaming. | +| `error` | `string \| null` | Contains the error message if the model failed to load. | +| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. | +| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. | +| `downloadProgress` | `number` | Tracks the progress of the model download process. |
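+
+Below is a minimal sketch of how the streaming pieces above fit together. Only the calls on the hook come from the API in this table; the model constant follows the pattern used elsewhere in these docs, and `onAudioChunk` stands in for whatever audio-capture integration you use to obtain 16 kHz `Float32Array` chunks.
+
+```tsx
+import { useSpeechToText, WHISPER_TINY } from 'react-native-executorch';
+import { Button, Text, View } from 'react-native';
+
+function StreamingTranscription() {
+  const stt = useSpeechToText({ model: WHISPER_TINY });
+
+  // Start a streaming session; pass e.g. { language: 'es' } for multilingual models.
+  const start = () => {
+    stt.stream();
+  };
+
+  // Wire this up to your own audio capture code (a placeholder, not part of the library);
+  // every chunk must be a waveform sampled at 16 kHz.
+  const onAudioChunk = (chunk: Float32Array) => {
+    stt.streamInsert(chunk);
+  };
+
+  // End the session once recording is finished.
+  const stop = () => {
+    stt.streamStop();
+  };
+
+  return (
+    <View>
+      <Button title="Start" onPress={start} />
+      <Button title="Stop" onPress={stop} />
+      {/* Finalized text first, then the part that may still change */}
+      <Text>
+        {stt.committedTranscription} {stt.nonCommittedTranscription}
+      </Text>
+    </View>
+  );
+}
+```
+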
Type definitions @@ -340,4 +340,4 @@ function App() { | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------ | :--------------------: | :----------------: | -| WHISPER_TINY | 900 | 600 | +| WHISPER_TINY | 410 | 375 | diff --git a/docs/docs/02-hooks/01-natural-language-processing/useTextEmbeddings.md b/docs/docs/02-hooks/01-natural-language-processing/useTextEmbeddings.md index c40d19e94..fd595d208 100644 --- a/docs/docs/02-hooks/01-natural-language-processing/useTextEmbeddings.md +++ b/docs/docs/02-hooks/01-natural-language-processing/useTextEmbeddings.md @@ -133,11 +133,11 @@ For the supported models, the returned embedding vector is normalized, meaning t | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | -------------------------- | :--------------------: | :----------------: | -| ALL_MINILM_L6_V2 | 85 | 100 | -| ALL_MPNET_BASE_V2 | 390 | 465 | -| MULTI_QA_MINILM_L6_COS_V1 | 115 | 130 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 415 | 490 | -| CLIP_VIT_BASE_PATCH32_TEXT | 195 | 250 | +| ALL_MINILM_L6_V2 | 95 | 110 | +| ALL_MPNET_BASE_V2 | 405 | 455 | +| MULTI_QA_MINILM_L6_COS_V1 | 120 | 140 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 435 | 455 | +| CLIP_VIT_BASE_PATCH32_TEXT | 200 | 280 | ### Inference time @@ -145,13 +145,13 @@ For the supported models, the returned embedding vector is normalized, meaning t Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] | -| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: | -| ALL_MINILM_L6_V2 | 15 | 22 | 23 | 36 | 31 | -| ALL_MPNET_BASE_V2 | 71 | 96 | 101 | 112 | 105 | -| MULTI_QA_MINILM_L6_COS_V1 | 15 | 22 | 23 | 36 | 31 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 71 | 95 | 100 | 112 | 105 | -| CLIP_VIT_BASE_PATCH32_TEXT | 31 | 47 | 48 | 55 | 49 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| ALL_MINILM_L6_V2 | 16 | 16 | 19 | 54 | 28 | +| ALL_MPNET_BASE_V2 | 115 | 116 | 144 | 145 | 95 | +| MULTI_QA_MINILM_L6_COS_V1 | 16 | 16 | 20 | 47 | 28 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 112 | 119 | 144 | 146 | 96 | +| CLIP_VIT_BASE_PATCH32_TEXT | 47 | 45 | 57 | 65 | 48 | :::info Benchmark times for text embeddings are highly dependent on the sentence length. The numbers above are based on a sentence of around 80 tokens. For shorter or longer sentences, inference time may vary accordingly. 
diff --git a/docs/docs/02-hooks/02-computer-vision/useClassification.md b/docs/docs/02-hooks/02-computer-vision/useClassification.md index b4d3f34a6..e17bfa775 100644 --- a/docs/docs/02-hooks/02-computer-vision/useClassification.md +++ b/docs/docs/02-hooks/02-computer-vision/useClassification.md @@ -100,7 +100,7 @@ function App() { | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ----------------- | :--------------------: | :----------------: | -| EFFICIENTNET_V2_S | 130 | 85 | +| EFFICIENTNET_V2_S | 230 | 87 | ### Inference time @@ -108,6 +108,6 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | +| EFFICIENTNET_V2_S | 105 | 110 | 149 | 299 | 227 | diff --git a/docs/docs/02-hooks/02-computer-vision/useImageEmbeddings.md b/docs/docs/02-hooks/02-computer-vision/useImageEmbeddings.md index 6dbdc7dcc..4d417590c 100644 --- a/docs/docs/02-hooks/02-computer-vision/useImageEmbeddings.md +++ b/docs/docs/02-hooks/02-computer-vision/useImageEmbeddings.md @@ -123,9 +123,9 @@ For the supported models, the returned embedding vector is normalized, meaning t Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. Performance also heavily depends on image size, because resize is expansive operation, especially on low-end devices. ::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| --------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| CLIP_VIT_BASE_PATCH32_IMAGE | 48 | 64 | 69 | 65 | 63 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| CLIP_VIT_BASE_PATCH32_IMAGE | 70 | 70 | 90 | 66 | 58 | :::info Image embedding benchmark times are measured using 224×224 pixel images, as required by the model. All input images, whether larger or smaller, are resized to 224×224 before processing. Resizing is typically fast for small images but may be noticeably slower for very large images, which can increase total inference time. 
diff --git a/docs/docs/02-hooks/02-computer-vision/useOCR.md b/docs/docs/02-hooks/02-computer-vision/useOCR.md index 037daebf7..08e28f829 100644 --- a/docs/docs/02-hooks/02-computer-vision/useOCR.md +++ b/docs/docs/02-hooks/02-computer-vision/useOCR.md @@ -288,20 +288,20 @@ You need to make sure the recognizer models you pass in `recognizerSources` matc ### Model size -| Model | XNNPACK [MB] | -| --------------------- | :----------: | -| Detector (CRAFT_800) | 83.1 | -| Recognizer (CRNN_512) | 15 - 18\* | -| Recognizer (CRNN_256) | 16 - 18\* | -| Recognizer (CRNN_128) | 17 - 19\* | +| Model | XNNPACK [MB] | +| ------------------------------ | :----------: | +| Detector (CRAFT_800_QUANTIZED) | 19.8 | +| Recognizer (CRNN_512) | 15 - 18\* | +| Recognizer (CRNN_256) | 16 - 18\* | +| Recognizer (CRNN_128) | 17 - 19\* | \* - The model weights vary depending on the language. ### Memory usage -| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | -| -------------------------------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_800) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 1600 | 1700 | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ------------------------------------------------------------------------------------------------------ | :--------------------: | :----------------: | +| Detector (CRAFT_800_QUANTIZED) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 1400 | 1320 | ### Inference time @@ -317,18 +317,16 @@ Times presented in the tables are measured as consecutive runs of the model. Ini **Time measurements:** -| Metric | iPhone 14 Pro Max
[ms] | iPhone 16 Pro
[ms] | iPhone SE 3 | Samsung Galaxy S24
[ms] | OnePlus 12
[ms] | -| ------------------------- | ----------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | -| **Total Inference Time** | 4330 | 2537 | ❌ | 6648 | 5993 | -| **Detector (CRAFT_800)** | 1945 | 1809 | ❌ | 2080 | 1961 | -| **Recognizer (CRNN_512)** | | | | | | -| ├─ Average Time | 273 | 76 | ❌ | 289 | 252 | -| ├─ Total Time (3 runs) | 820 | 229 | ❌ | 867 | 756 | -| **Recognizer (CRNN_256)** | | | | | | -| ├─ Average Time | 137 | 39 | ❌ | 260 | 229 | -| ├─ Total Time (7 runs) | 958 | 271 | ❌ | 1818 | 1601 | -| **Recognizer (CRNN_128)** | | | | | | -| ├─ Average Time | 68 | 18 | ❌ | 239 | 214 | -| ├─ Total Time (7 runs) | 478 | 124 | ❌ | 1673 | 1498 | - -❌ - Insufficient RAM. +| Metric | iPhone 17 Pro
[ms] | iPhone 16 Pro
[ms] | iPhone SE 3
[ms] | Samsung Galaxy S24
[ms] | OnePlus 12
[ms] | +| ---------------------------------- | ------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | +| **Total Inference Time** | 1160 | 1144 | 1498 | 1567 | 1160 | +| **Detector (CRAFT_800_QUANTIZED)** | 669 | 649 | 825 | 541 | 474 | +| **Recognizer (CRNN_512)** | | | | | | +| ├─ Average Time | 48 | 47 | 60 | 91 | 72 | +| ├─ Total Time (3 runs) | 144 | 141 | 180 | 273 | 216 | +| **Recognizer (CRNN_256)** | | | | | | +| ├─ Average Time | 22 | 22 | 29 | 51 | 30 | +| ├─ Total Time (7 runs) | 154 | 154 | 203 | 357 | 210 | +| **Recognizer (CRNN_128)** | | | | | | +| ├─ Average Time | 11 | 11 | 14 | 28 | 17 | +| ├─ Total Time (7 runs) | 77 | 77 | 98 | 196 | 119 | diff --git a/docs/docs/02-hooks/02-computer-vision/useObjectDetection.md b/docs/docs/02-hooks/02-computer-vision/useObjectDetection.md index ac756d6a6..7f49e8389 100644 --- a/docs/docs/02-hooks/02-computer-vision/useObjectDetection.md +++ b/docs/docs/02-hooks/02-computer-vision/useObjectDetection.md @@ -139,7 +139,7 @@ function App() { | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------------------------ | :--------------------: | :----------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 164 | 132 | ### Inference time @@ -147,6 +147,6 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 116 | 120 | 164 | 257 | 129 | diff --git a/docs/docs/02-hooks/02-computer-vision/useStyleTransfer.md b/docs/docs/02-hooks/02-computer-vision/useStyleTransfer.md index 899a619ca..2bedba325 100644 --- a/docs/docs/02-hooks/02-computer-vision/useStyleTransfer.md +++ b/docs/docs/02-hooks/02-computer-vision/useStyleTransfer.md @@ -95,10 +95,10 @@ function App() { | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ---------------------------- | :--------------------: | :----------------: | -| STYLE_TRANSFER_CANDY | 950 | 350 | -| STYLE_TRANSFER_MOSAIC | 950 | 350 | -| STYLE_TRANSFER_UDNIE | 950 | 350 | -| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | +| STYLE_TRANSFER_CANDY | 1200 | 380 | +| STYLE_TRANSFER_MOSAIC | 1200 | 380 | +| STYLE_TRANSFER_UDNIE | 1200 | 380 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1200 | 380 | ### Inference time @@ -106,9 +106,9 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. 
::: -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_CANDY | 1356 | 1550 | 2003 | 2578 | 2328 | +| STYLE_TRANSFER_MOSAIC | 1376 | 1456 | 1971 | 2657 | 2394 | +| STYLE_TRANSFER_UDNIE | 1389 | 1499 | 1858 | 2380 | 2124 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1339 | 1514 | 2004 | 2608 | 2371 | diff --git a/docs/docs/02-hooks/02-computer-vision/useTextToImage.md b/docs/docs/02-hooks/02-computer-vision/useTextToImage.md index 83e47a3e2..3eaf7d826 100644 --- a/docs/docs/02-hooks/02-computer-vision/useTextToImage.md +++ b/docs/docs/02-hooks/02-computer-vision/useTextToImage.md @@ -124,9 +124,9 @@ The number following the underscore (\_) indicates that the model supports gener ### Inference time -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :-------------------: | :-------------------------------: | :-----------------------: | -| BK_SDM_TINY_VPRED_256 | 19100 | 25000 | ❌ | ❌ | 23100 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| BK_SDM_TINY_VPRED_256 | 21184 | 21021 | ❌ | 18834 | 16617 | :::info Text-to-image benchmark times are measured generating 256×256 images in 10 inference steps. diff --git a/docs/docs/02-hooks/02-computer-vision/useVerticalOCR.md b/docs/docs/02-hooks/02-computer-vision/useVerticalOCR.md index 29a4de452..94e5e3054 100644 --- a/docs/docs/02-hooks/02-computer-vision/useVerticalOCR.md +++ b/docs/docs/02-hooks/02-computer-vision/useVerticalOCR.md @@ -302,12 +302,12 @@ You need to make sure the recognizer models you pass in `recognizerSources` matc ### Model size -| Model | XNNPACK [MB] | -| --------------------- | :----------: | -| Detector (CRAFT_1280) | 83.1 | -| Detector (CRAFT_320) | 83.1 | -| Recognizer (CRNN_512) | 15 - 18\* | -| Recognizer (CRNN_64) | 15 - 16\* | +| Model | XNNPACK [MB] | +| ------------------------------- | :----------: | +| Detector (CRAFT_1280_QUANTIZED) | 19.8 | +| Detector (CRAFT_32_QUANTIZED) | 19.8 | +| Recognizer (CRNN_512) | 15 - 18\* | +| Recognizer (CRNN_64) | 15 - 16\* | \* - The model weights vary depending on the language. 
@@ -315,8 +315,8 @@ You need to make sure the recognizer models you pass in `recognizerSources` matc | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | -------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 2172 | 2214 | -| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1774 | 1705 | +| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 1540 | 1470 | +| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1070 | 1000 | ### Inference time @@ -332,18 +332,16 @@ Times presented in the tables are measured as consecutive runs of the model. Ini **Time measurements:** -| Metric | iPhone 14 Pro Max
[ms] | iPhone 16 Pro
[ms] | iPhone SE 3 | Samsung Galaxy S24
[ms] | OnePlus 12
[ms] | -| -------------------------------------------------------------------------- | ----------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | -| **Total Inference Time** | 9350 / 9620 | 8572 / 8621 | ❌ | 13737 / 10570 | 13436 / 9848 | -| **Detector (CRAFT_1250)** | 4895 | 4756 | ❌ | 5574 | 5016 | -| **Detector (CRAFT_320)** | | | | | | -| ├─ Average Time | 1247 | 1206 | ❌ | 1350 | 1356 | -| ├─ Total Time (3 runs) | 3741 | 3617 | ❌ | 4050 | 4069 | -| **Recognizer (CRNN_64)**
(_With Flag `independentChars == true`_) | | | | | | -| ├─ Average Time | 31 | 9 | ❌ | 195 | 207 | -| ├─ Total Time (21 runs) | 649 | 191 | ❌ | 4092 | 4339 | -| **Recognizer (CRNN_512)**
(_With Flag `independentChars == false`_) | | | | | | -| ├─ Average Time | 306 | 80 | ❌ | 308 | 250 | -| ├─ Total Time (3 runs) | 919 | 240 | ❌ | 925 | 751 | - -❌ - Insufficient RAM. +| Metric | iPhone 17 Pro
[ms] | iPhone 16 Pro
[ms] | iPhone SE 3
[ms] | Samsung Galaxy S24
[ms] | OnePlus 12
[ms] | +| -------------------------------------------------------------------------- | ------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | +| **Total Inference Time** | 3819 / 3716 | 3978 / 3841 | 4751 / 4532 | 3095 / 3286 | 2787 / 2770 | +| **Detector (CRAFT_1280_QUANTIZED)** | 1749 | 1804 | 2105 | 1216 | 1171 | +| **Detector (CRAFT_320_QUANTIZED)** | | | | | | +| ├─ Average Time | 458 | 474 | 561 | 360 | 332 | +| ├─ Total Time (4 runs) | 1832 | 1896 | 2244 | 1440 | 1328 | +| **Recognizer (CRNN_64)**
(_With Flag `independentChars == true`_) | | | | | | +| ├─ Average Time | 5 | 6 | 7 | 28 | 11 | +| ├─ Total Time (21 runs) | 105 | 126 | 147 | 588 | 231 | +| **Recognizer (CRNN_512)**
(_With Flag `independentChars == false`_) | | | | | | +| ├─ Average Time | 54 | 52 | 68 | 144 | 72 | +| ├─ Total Time (4 runs) | 216 | 208 | 272 | 576 | 288 | diff --git a/docs/versioned_docs/version-0.4.x/computer-vision/useClassification.md b/docs/versioned_docs/version-0.4.x/computer-vision/useClassification.md index fb812fb57..caef31b3d 100644 --- a/docs/versioned_docs/version-0.4.x/computer-vision/useClassification.md +++ b/docs/versioned_docs/version-0.4.x/computer-vision/useClassification.md @@ -85,8 +85,8 @@ function App() { ## Supported models -| Model | Number of classes | Class list | -| --------------------------------------------------------------------------------------------------------------- | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Model | Number of classes | Class list | +| ----------------------------------------------------------------------------------------------------------------- | ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | [efficientnet_v2_s](https://pytorch.org/vision/stable/models/generated/torchvision.models.efficientnet_v2_s.html) | 1000 | [ImageNet1k_v1](https://github.com/software-mansion/react-native-executorch/blob/release/0.4/android/src/main/java/com/swmansion/rnexecutorch/models/classification/Constants.kt) | ## Benchmarks @@ -109,6 +109,6 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | +| EFFICIENTNET_V2_S | 150 | 161 | 227 | 196 | 214 | diff --git a/docs/versioned_docs/version-0.4.x/computer-vision/useOCR.md b/docs/versioned_docs/version-0.4.x/computer-vision/useOCR.md index 2c12300e8..960815719 100644 --- a/docs/versioned_docs/version-0.4.x/computer-vision/useOCR.md +++ b/docs/versioned_docs/version-0.4.x/computer-vision/useOCR.md @@ -321,11 +321,9 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: | -| Detector (CRAFT_800) | 2099 | 2227 | ❌ | 2245 | 7108 | -| Recognizer (CRNN_512) | 70 | 252 | ❌ | 54 | 151 | -| Recognizer (CRNN_256) | 39 | 123 | ❌ | 24 | 78 | -| Recognizer (CRNN_128) | 17 | 83 | ❌ | 14 | 39 | - -❌ - Insufficient RAM. 
+| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Detector (CRAFT_800_QUANTIZED) | 779 | 897 | 1276 | 553 | 586 | +| Recognizer (CRNN_512) | 77 | 74 | 244 | 56 | 57 | +| Recognizer (CRNN_256) | 35 | 37 | 120 | 28 | 30 | +| Recognizer (CRNN_128) | 18 | 19 | 60 | 14 | 16 | diff --git a/docs/versioned_docs/version-0.4.x/computer-vision/useObjectDetection.md b/docs/versioned_docs/version-0.4.x/computer-vision/useObjectDetection.md index b18faa8f8..0bdbeef01 100644 --- a/docs/versioned_docs/version-0.4.x/computer-vision/useObjectDetection.md +++ b/docs/versioned_docs/version-0.4.x/computer-vision/useObjectDetection.md @@ -145,6 +145,6 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 261 | 279 | 414 | 125 | 115 | diff --git a/docs/versioned_docs/version-0.4.x/computer-vision/useStyleTransfer.md b/docs/versioned_docs/version-0.4.x/computer-vision/useStyleTransfer.md index 40f30a1d0..09599bac7 100644 --- a/docs/versioned_docs/version-0.4.x/computer-vision/useStyleTransfer.md +++ b/docs/versioned_docs/version-0.4.x/computer-vision/useStyleTransfer.md @@ -107,9 +107,9 @@ function App(){ Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. 
::: -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_CANDY | 1565 | 1675 | 2325 | 1750 | 1620 | +| STYLE_TRANSFER_MOSAIC | 1565 | 1675 | 2325 | 1750 | 1620 | +| STYLE_TRANSFER_UDNIE | 1565 | 1675 | 2325 | 1750 | 1620 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1565 | 1675 | 2325 | 1750 | 1620 | diff --git a/docs/versioned_docs/version-0.4.x/computer-vision/useVerticalOCR.md b/docs/versioned_docs/version-0.4.x/computer-vision/useVerticalOCR.md index 98cc301bf..ce9b456e3 100644 --- a/docs/versioned_docs/version-0.4.x/computer-vision/useVerticalOCR.md +++ b/docs/versioned_docs/version-0.4.x/computer-vision/useVerticalOCR.md @@ -342,11 +342,9 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] | -| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: | -| Detector (CRAFT_1280) | 5457 | 5833 | ❌ | 6296 | 14053 | -| Detector (CRAFT_320) | 1351 | 1460 | ❌ | 1485 | 3101 | -| Recognizer (CRNN_512) | 39 | 123 | ❌ | 24 | 78 | -| Recognizer (CRNN_64) | 10 | 33 | ❌ | 7 | 18 | - -❌ - Insufficient RAM. +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| ------------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| Detector (CRAFT_1280_QUANTIZED) | 1918 | 2304 | 3371 | 1391 | 1445 | +| Detector (CRAFT_320_QUANTIZED) | 473 | 563 | 813 | 361 | 382 | +| Recognizer (CRNN_512) | 78 | 83 | 310 | 59 | 57 | +| Recognizer (CRNN_64) | 9 | 9 | 38 | 8 | 7 | diff --git a/docs/versioned_docs/version-0.4.x/natural-language-processing/useTextEmbeddings.md b/docs/versioned_docs/version-0.4.x/natural-language-processing/useTextEmbeddings.md index 43fefe3d6..5aeeaa02b 100644 --- a/docs/versioned_docs/version-0.4.x/natural-language-processing/useTextEmbeddings.md +++ b/docs/versioned_docs/version-0.4.x/natural-language-processing/useTextEmbeddings.md @@ -148,9 +148,9 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. 
::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] | -| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: | -| ALL_MINILM_L6_V2 | 53 | 69 | 78 | 60 | 65 | -| ALL_MPNET_BASE_V2 | 352 | 423 | 478 | 521 | 527 | -| MULTI_QA_MINILM_L6_COS_V1 | 135 | 166 | 180 | 158 | 165 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 503 | 598 | 680 | 694 | 743 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| ALL_MINILM_L6_V2 | 50 | 58 | 84 | 58 | 58 | +| ALL_MPNET_BASE_V2 | 352 | 428 | 879 | 483 | 517 | +| MULTI_QA_MINILM_L6_COS_V1 | 133 | 161 | 269 | 151 | 155 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 502 | 796 | 1216 | 915 | 713 | diff --git a/docs/versioned_docs/version-0.5.x/02-hooks/01-natural-language-processing/useSpeechToText.md b/docs/versioned_docs/version-0.5.x/02-hooks/01-natural-language-processing/useSpeechToText.md index 3256e2e88..d94c96a66 100644 --- a/docs/versioned_docs/version-0.5.x/02-hooks/01-natural-language-processing/useSpeechToText.md +++ b/docs/versioned_docs/version-0.5.x/02-hooks/01-natural-language-processing/useSpeechToText.md @@ -75,20 +75,20 @@ For more information on loading resources, take a look at [loading models](../.. ### Returns -| Field | Type | Description | -| --------------------------- | ---------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `transcribe` | `(waveform: Float32Array \| number[], options?: DecodingOptions \| undefined) => Promise` | Starts a transcription process for a given input array, which should be a waveform at 16kHz. The second argument is an options object, e.g. `{ language: 'es' }` for multilingual models. Resolves a promise with the output transcription when the model is finished. Passing `number[]` is deprecated. | -| `stream` | `() => Promise` | Starts a streaming transcription process. Use in combination with `streamInsert` to feed audio chunks and `streamStop` to end the stream. Updates `committedTranscription` and `nonCommittedTranscription` as transcription progresses. | -| `streamInsert` | `(waveform: Float32Array \| number[]) => void` | Inserts a chunk of audio data (sampled at 16kHz) into the ongoing streaming transcription. Call this repeatedly as new audio data becomes available. Passing `number[]` is deprecated. | -| `streamStop` | `() => void` | Stops the ongoing streaming transcription process. | -| `encode` | `(waveform: Float32Array \| number[]) => Promise` | Runs the encoding part of the model on the provided waveform. Passing `number[]` is deprecated. | -| `decode` | `(tokens: number[] \| Int32Array, encoderOutput: Float32Array \| number[]) => Promise` | Runs the decoder of the model. Passing `number[]` is deprecated. 
| -| `committedTranscription` | `string` | Contains the part of the transcription that is finalized and will not change. Useful for displaying stable results during streaming. | -| `nonCommittedTranscription` | `string` | Contains the part of the transcription that is still being processed and may change. Useful for displaying live, partial results during streaming. | -| `error` | `string \| null` | Contains the error message if the model failed to load. | -| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. | -| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. | -| `downloadProgress` | `number` | Tracks the progress of the model download process. | +| Field | Type | Description | +| --------------------------- | ---------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `transcribe` | `(waveform: Float32Array \| number[], options?: DecodingOptions \| undefined) => Promise` | Starts a transcription process for a given input array, which should be a waveform at 16kHz. The second argument is an options object, e.g. `{ language: 'es' }` for multilingual models. Resolves a promise with the output transcription when the model is finished. Passing `number[]` is deprecated. | +| `stream` | `(options?: DecodingOptions \| undefined) => Promise` | Starts a streaming transcription process. Use in combination with `streamInsert` to feed audio chunks and `streamStop` to end the stream. The argument is an options object, e.g. `{ language: 'es' }` for multilingual models. Updates `committedTranscription` and `nonCommittedTranscription` as transcription progresses. | +| `streamInsert` | `(waveform: Float32Array \| number[]) => void` | Inserts a chunk of audio data (sampled at 16kHz) into the ongoing streaming transcription. Call this repeatedly as new audio data becomes available. Passing `number[]` is deprecated. | +| `streamStop` | `() => void` | Stops the ongoing streaming transcription process. | +| `encode` | `(waveform: Float32Array \| number[]) => Promise` | Runs the encoding part of the model on the provided waveform. Passing `number[]` is deprecated. | +| `decode` | `(tokens: number[] \| Int32Array, encoderOutput: Float32Array \| number[]) => Promise` | Runs the decoder of the model. Passing `number[]` is deprecated. | +| `committedTranscription` | `string` | Contains the part of the transcription that is finalized and will not change. Useful for displaying stable results during streaming. | +| `nonCommittedTranscription` | `string` | Contains the part of the transcription that is still being processed and may change. Useful for displaying live, partial results during streaming. | +| `error` | `string \| null` | Contains the error message if the model failed to load. | +| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. | +| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. | +| `downloadProgress` | `number` | Tracks the progress of the model download process. |
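+
+Both entry points accept the same `DecodingOptions` object, so a language hint can be passed to either. A short sketch – `stt` stands for the object returned by the hook and `waveform` for a 16 kHz `Float32Array` you have already recorded (both are assumptions of this example):
+
+```typescript
+// One-shot transcription with a language hint (multilingual models only).
+const text = await stt.transcribe(waveform, { language: 'es' });
+
+// The streaming entry point takes the same options object.
+stt.stream({ language: 'es' });
+stt.streamInsert(waveform); // feed 16 kHz chunks as they arrive
+stt.streamStop();
+```
+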
Type definitions @@ -340,4 +340,4 @@ function App() { | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------ | :--------------------: | :----------------: | -| WHISPER_TINY | 900 | 600 | +| WHISPER_TINY | 410 | 375 | diff --git a/docs/versioned_docs/version-0.5.x/02-hooks/01-natural-language-processing/useTextEmbeddings.md b/docs/versioned_docs/version-0.5.x/02-hooks/01-natural-language-processing/useTextEmbeddings.md index c40d19e94..fd595d208 100644 --- a/docs/versioned_docs/version-0.5.x/02-hooks/01-natural-language-processing/useTextEmbeddings.md +++ b/docs/versioned_docs/version-0.5.x/02-hooks/01-natural-language-processing/useTextEmbeddings.md @@ -133,11 +133,11 @@ For the supported models, the returned embedding vector is normalized, meaning t | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | -------------------------- | :--------------------: | :----------------: | -| ALL_MINILM_L6_V2 | 85 | 100 | -| ALL_MPNET_BASE_V2 | 390 | 465 | -| MULTI_QA_MINILM_L6_COS_V1 | 115 | 130 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 415 | 490 | -| CLIP_VIT_BASE_PATCH32_TEXT | 195 | 250 | +| ALL_MINILM_L6_V2 | 95 | 110 | +| ALL_MPNET_BASE_V2 | 405 | 455 | +| MULTI_QA_MINILM_L6_COS_V1 | 120 | 140 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 435 | 455 | +| CLIP_VIT_BASE_PATCH32_TEXT | 200 | 280 | ### Inference time @@ -145,13 +145,13 @@ For the supported models, the returned embedding vector is normalized, meaning t Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] | -| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: | -| ALL_MINILM_L6_V2 | 15 | 22 | 23 | 36 | 31 | -| ALL_MPNET_BASE_V2 | 71 | 96 | 101 | 112 | 105 | -| MULTI_QA_MINILM_L6_COS_V1 | 15 | 22 | 23 | 36 | 31 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 71 | 95 | 100 | 112 | 105 | -| CLIP_VIT_BASE_PATCH32_TEXT | 31 | 47 | 48 | 55 | 49 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| ALL_MINILM_L6_V2 | 16 | 16 | 19 | 54 | 28 | +| ALL_MPNET_BASE_V2 | 115 | 116 | 144 | 145 | 95 | +| MULTI_QA_MINILM_L6_COS_V1 | 16 | 16 | 20 | 47 | 28 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 112 | 119 | 144 | 146 | 96 | +| CLIP_VIT_BASE_PATCH32_TEXT | 47 | 45 | 57 | 65 | 48 | :::info Benchmark times for text embeddings are highly dependent on the sentence length. The numbers above are based on a sentence of around 80 tokens. For shorter or longer sentences, inference time may vary accordingly. 
diff --git a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useClassification.md b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useClassification.md index b4d3f34a6..e17bfa775 100644 --- a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useClassification.md +++ b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useClassification.md @@ -100,7 +100,7 @@ function App() { | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ----------------- | :--------------------: | :----------------: | -| EFFICIENTNET_V2_S | 130 | 85 | +| EFFICIENTNET_V2_S | 230 | 87 | ### Inference time @@ -108,6 +108,6 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 | +| EFFICIENTNET_V2_S | 105 | 110 | 149 | 299 | 227 | diff --git a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useImageEmbeddings.md b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useImageEmbeddings.md index 1849a95ce..4d417590c 100644 --- a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useImageEmbeddings.md +++ b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useImageEmbeddings.md @@ -91,9 +91,9 @@ try { ## Supported models -| Model | Language | Image size | Embedding Dimensions | Description | +| Model | Language | Image size | Embedding dimensions | Description | | ---------------------------------------------------------------------------------- | :------: | :--------: | :------------------: | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| [clip-vit-base-patch32-image](https://huggingface.co/openai/clip-vit-base-patch32) | English | 224 x 224 | 512 | CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. CLIP allows to embed images and text into the same vector space. This allows to find similar images as well as to implement image search. This is the image encoder part of the CLIP model. To embed text checkout [clip-vit-base-patch32-text](../01-natural-language-processing/useTextEmbeddings.md#supported-models). | +| [clip-vit-base-patch32-image](https://huggingface.co/openai/clip-vit-base-patch32) | English | 224×224 | 512 | CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. CLIP allows to embed images and text into the same vector space. This allows to find similar images as well as to implement image search. 
This is the image encoder part of the CLIP model. To embed text checkout [clip-vit-base-patch32-text](../01-natural-language-processing/useTextEmbeddings.md#supported-models). | **`Image size`** - the size of an image that the model takes as an input. Resize will happen automatically. @@ -123,9 +123,9 @@ For the supported models, the returned embedding vector is normalized, meaning t Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. Performance also heavily depends on image size, because resize is expansive operation, especially on low-end devices. ::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| --------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| CLIP_VIT_BASE_PATCH32_IMAGE | 48 | 64 | 69 | 65 | 63 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| CLIP_VIT_BASE_PATCH32_IMAGE | 70 | 70 | 90 | 66 | 58 | :::info Image embedding benchmark times are measured using 224×224 pixel images, as required by the model. All input images, whether larger or smaller, are resized to 224×224 before processing. Resizing is typically fast for small images but may be noticeably slower for very large images, which can increase total inference time. diff --git a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useOCR.md b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useOCR.md index a23acd17c..5a1e80cfc 100644 --- a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useOCR.md +++ b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useOCR.md @@ -134,13 +134,13 @@ For more information on loading resources, take a look at [loading models](../.. The hook returns an object with the following properties: -| Field | Type | Description | -| ------------------ | -------------------------------------------- | ------------------------------------------------------------------------------------------- | -| `forward` | `(input: string) => Promise` | A function that accepts an image (url, b64) and returns an array of `OCRDetection` objects. | -| `error` | string | null | Contains the error message if the model loading failed. | -| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. | -| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. | -| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1. | +| Field | Type | Description | +| ------------------ | -------------------------------------------------- | ------------------------------------------------------------------------------------------- | +| `forward` | `(imageSource: string) => Promise` | A function that accepts an image (url, b64) and returns an array of `OCRDetection` objects. | +| `error` | string | null | Contains the error message if the model loading failed. 
| +| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. | +| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. | +| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1. | ## Running the model @@ -288,20 +288,20 @@ You need to make sure the recognizer models you pass in `recognizerSources` matc ### Model size -| Model | XNNPACK [MB] | -| --------------------- | :----------: | -| Detector (CRAFT_800) | 83.1 | -| Recognizer (CRNN_512) | 15 - 18\* | -| Recognizer (CRNN_256) | 16 - 18\* | -| Recognizer (CRNN_128) | 17 - 19\* | +| Model | XNNPACK [MB] | +| ------------------------------ | :----------: | +| Detector (CRAFT_800_QUANTIZED) | 19.8 | +| Recognizer (CRNN_512) | 15 - 18\* | +| Recognizer (CRNN_256) | 16 - 18\* | +| Recognizer (CRNN_128) | 17 - 19\* | \* - The model weights vary depending on the language. ### Memory usage -| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | -| -------------------------------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_800) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 1600 | 1700 | +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| ------------------------------------------------------------------------------------------------------ | :--------------------: | :----------------: | +| Detector (CRAFT_800_QUANTIZED) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 1400 | 1320 | ### Inference time @@ -317,18 +317,16 @@ Times presented in the tables are measured as consecutive runs of the model. Ini **Time measurements:** -| Metric | iPhone 14 Pro Max
[ms] | iPhone 16 Pro
[ms] | iPhone SE 3 | Samsung Galaxy S24
[ms] | OnePlus 12
[ms] | -| ------------------------- | ----------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | -| **Total Inference Time** | 4330 | 2537 | ❌ | 6648 | 5993 | -| **Detector (CRAFT_800)** | 1945 | 1809 | ❌ | 2080 | 1961 | -| **Recognizer (CRNN_512)** | | | | | | -| ├─ Average Time | 273 | 76 | ❌ | 289 | 252 | -| ├─ Total Time (3 runs) | 820 | 229 | ❌ | 867 | 756 | -| **Recognizer (CRNN_256)** | | | | | | -| ├─ Average Time | 137 | 39 | ❌ | 260 | 229 | -| ├─ Total Time (7 runs) | 958 | 271 | ❌ | 1818 | 1601 | -| **Recognizer (CRNN_128)** | | | | | | -| ├─ Average Time | 68 | 18 | ❌ | 239 | 214 | -| ├─ Total Time (7 runs) | 478 | 124 | ❌ | 1673 | 1498 | - -❌ - Insufficient RAM. +| Metric | iPhone 17 Pro
[ms] | iPhone 16 Pro
[ms] | iPhone SE 3
[ms] | Samsung Galaxy S24
[ms] | OnePlus 12
[ms] | +| ---------------------------------- | ------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | +| **Total Inference Time** | 1160 | 1144 | 1498 | 1567 | 1160 | +| **Detector (CRAFT_800_QUANTIZED)** | 669 | 649 | 825 | 541 | 474 | +| **Recognizer (CRNN_512)** | | | | | | +| ├─ Average Time | 48 | 47 | 60 | 91 | 72 | +| ├─ Total Time (3 runs) | 144 | 141 | 180 | 273 | 216 | +| **Recognizer (CRNN_256)** | | | | | | +| ├─ Average Time | 22 | 22 | 29 | 51 | 30 | +| ├─ Total Time (7 runs) | 154 | 154 | 203 | 357 | 210 | +| **Recognizer (CRNN_128)** | | | | | | +| ├─ Average Time | 11 | 11 | 14 | 28 | 17 | +| ├─ Total Time (7 runs) | 77 | 77 | 98 | 196 | 119 | diff --git a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useObjectDetection.md b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useObjectDetection.md index ac756d6a6..7f49e8389 100644 --- a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useObjectDetection.md +++ b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useObjectDetection.md @@ -139,7 +139,7 @@ function App() { | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | ------------------------------ | :--------------------: | :----------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 90 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 164 | 132 | ### Inference time @@ -147,6 +147,6 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 | +| SSDLITE_320_MOBILENET_V3_LARGE | 116 | 120 | 164 | 257 | 129 | diff --git a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useStyleTransfer.md b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useStyleTransfer.md index 899a619ca..2bedba325 100644 --- a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useStyleTransfer.md +++ b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useStyleTransfer.md @@ -95,10 +95,10 @@ function App() { | Model | Android (XNNPACK) [MB] | iOS (Core ML) [MB] | | ---------------------------- | :--------------------: | :----------------: | -| STYLE_TRANSFER_CANDY | 950 | 350 | -| STYLE_TRANSFER_MOSAIC | 950 | 350 | -| STYLE_TRANSFER_UDNIE | 950 | 350 | -| STYLE_TRANSFER_RAIN_PRINCESS | 950 | 350 | +| STYLE_TRANSFER_CANDY | 1200 | 380 | +| STYLE_TRANSFER_MOSAIC | 1200 | 380 | +| STYLE_TRANSFER_UDNIE | 1200 | 380 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1200 | 380 | ### Inference time @@ -106,9 +106,9 @@ function App() { Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. 
::: -| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 | -| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 | +| STYLE_TRANSFER_CANDY | 1356 | 1550 | 2003 | 2578 | 2328 | +| STYLE_TRANSFER_MOSAIC | 1376 | 1456 | 1971 | 2657 | 2394 | +| STYLE_TRANSFER_UDNIE | 1389 | 1499 | 1858 | 2380 | 2124 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1339 | 1514 | 2004 | 2608 | 2371 | diff --git a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useTextToImage.md b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useTextToImage.md new file mode 100644 index 000000000..476f8d95d --- /dev/null +++ b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useTextToImage.md @@ -0,0 +1,133 @@ +--- +title: useTextToImage +keywords: [image generation] +description: "Learn how to use image generation models in your React Native applications with React Native ExecuTorch's useTextToImage hook." +--- + +Text-to-image is a process of generating images directly from a description in natural language by conditioning a model on the provided text input. Our implementation follows the Stable Diffusion pipeline, which applies the diffusion process in a lower-dimensional latent space to reduce memory requirements. The pipeline combines a text encoder to preprocess the prompt, a U-Net that iteratively denoises latent representations, and a VAE decoder to reconstruct the final image. React Native ExecuTorch offers a dedicated hook, `useTextToImage`, for this task. + + + +:::warning +It is recommended to use models provided by us which are available at our Hugging Face repository, you can also use [constants](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/constants/modelUrls.ts) shipped with our library. +::: + +## Reference + +```typescript +import { useTextToImage, BK_SDM_TINY_VPRED_256 } from 'react-native-executorch'; + +const model = useTextToImage({ model: BK_SDM_TINY_VPRED_256 }); + +const input = 'a castle'; + +try { + const image = await model.generate(input); +} catch (error) { + console.error(error); +} +``` + +### Arguments + +**`model`** - Object containing the model source. + +- **`schedulerSource`** - A string that specifies the location of the scheduler config. + +- **`tokenizerSource`** - A string that specifies the location of the tokenizer config. + +- **`encoderSource`** - A string that specifies the location of the text encoder binary. + +- **`unetSource`** - A string that specifies the location of the U-Net binary. + +- **`decoderSource`** - A string that specifies the location of the VAE decoder binary. + +**`preventLoad?`** - Boolean that can prevent automatic model loading (and downloading the data if you load it for the first time) after running the hook. 
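+
+If you host the exported files yourself, the `model` object can be assembled from the individual sources listed above instead of the bundled constant. A minimal sketch – every URL below is a placeholder for your own hosting location, not an official artifact path:
+
+```typescript
+import { useTextToImage } from 'react-native-executorch';
+
+const model = useTextToImage({
+  model: {
+    // Placeholder URLs – replace with the locations of your exported files.
+    schedulerSource: 'https://example.com/bk-sdm-tiny/scheduler_config.json',
+    tokenizerSource: 'https://example.com/bk-sdm-tiny/tokenizer.json',
+    encoderSource: 'https://example.com/bk-sdm-tiny/text_encoder.pte',
+    unetSource: 'https://example.com/bk-sdm-tiny/unet.pte',
+    decoderSource: 'https://example.com/bk-sdm-tiny/vae_decoder.pte',
+  },
+  // Postpone downloading and loading until you trigger it yourself.
+  preventLoad: true,
+});
+```
+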
+ +For more information on loading resources, take a look at [loading models](../../01-fundamentals/02-loading-models.md) page. + +### Returns + +| Field | Type | Description | +| ------------------ | ------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `generate` | `(input: string, imageSize?: number, numSteps?: number, seed?: number) => Promise` | Runs the model to generate an image described by `input`, and conditioned by `seed`, performing `numSteps` inference steps. The resulting image, with dimensions `imageSize`×`imageSize` pixels, is returned as a base64-encoded string. | +| `error` | string | null | Contains the error message if the model failed to load. | +| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. | +| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. | +| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1. | +| `interrupt()` | `() => void` | Interrupts the current inference. The model is stopped in the nearest inference step. | + +## Running the model + +To run the model, you can use the `forward` method. It accepts four arguments: a text prompt describing the requested image, a size of the image in pixels, a number of denoising steps, and an optional seed value, which enables reproducibility of the results. + +The image size must be a multiple of 32 due to the architecture of the U-Net and VAE models. The seed should be a positive integer. + +:::warning +Larger imageSize values require significantly more memory to run the model. +::: + +## Example + +```tsx +import { useTextToImage, BK_SDM_TINY_VPRED_256 } from 'react-native-executorch'; + +function App() { + const model = useTextToImage({ model: BK_SDM_TINY_VPRED_256 }); + + //... + const input = 'a medieval castle by the sea shore'; + + const imageSize = 256; + const numSteps = 25; + + try { + image = await model.generate(input, imageSize, numSteps); + } catch (error) { + console.error(error); + } + //... + + return ; +} +``` + +| ![Castle 256x256](../../../../static/img/castle256.png) | ![Castle 512x512](../../../../static/img/castle512.png) | +| ------------------------------------------------------- | ------------------------------------------------------- | +| Image of size 256×256 | Image of size 512×512 | + +## Supported models + +| Model | Parameters [B] | Description | +| ------------------------------------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| [bk-sdm-tiny-vpred](https://huggingface.co/vivym/bk-sdm-tiny-vpred) | 0.5 | BK-SDM (Block-removed Knowledge-distilled Stable Diffusion Model) is a compressed version of Stable Diffusion v1.4 with several residual and attention blocks removed. The BK-SDM-Tiny is a v-prediction variant of the model, obtained through further block removal, built around a 0.33B-parameter U-Net. 
| + +## Benchmarks + +:::info +The number following the underscore (\_) indicates that the model supports generating image with dimensions ranging from 128 pixels up to that value. This setting doesn’t affect the model’s file size - it only determines how memory is allocated at runtime, based on the maximum allowed image size. +::: + +### Model size + +| Model | Text encoder (XNNPACK) [MB] | UNet (XNNPACK) [MB] | VAE decoder (XNNPACK) [MB] | +| --------------------- | --------------------------- | ------------------- | -------------------------- | +| BK_SDM_TINY_VPRED_256 | 492 | 1290 | 198 | +| BK_SDM_TINY_VPRED_512 | 492 | 1290 | 198 | + +### Memory usage + +| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | +| --------------------- | ---------------------- | ------------------ | +| BK_SDM_TINY_VPRED_256 | 2900 | 2800 | +| BK_SDM_TINY_VPRED_512 | 6700 | 6560 | + +### Inference time + +| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | +| BK_SDM_TINY_VPRED_256 | 21184 | 21021 | ❌ | 18834 | 16617 | + +:::info +Text-to-image benchmark times are measured generating 256×256 images in 10 inference steps. +::: diff --git a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useVerticalOCR.md b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useVerticalOCR.md index e15c08fbe..73c3fc108 100644 --- a/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useVerticalOCR.md +++ b/docs/versioned_docs/version-0.5.x/02-hooks/02-computer-vision/useVerticalOCR.md @@ -147,13 +147,13 @@ For more information on loading resources, take a look at [loading models](../.. The hook returns an object with the following properties: -| Field | Type | Description | -| ------------------ | -------------------------------------------- | ------------------------------------------------------------------------------------------- | -| `forward` | `(input: string) => Promise` | A function that accepts an image (url, b64) and returns an array of `OCRDetection` objects. | -| `error` | string | null | Contains the error message if the model loading failed. | -| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. | -| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. | -| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1. | +| Field | Type | Description | +| ------------------ | -------------------------------------------------- | ------------------------------------------------------------------------------------------- | +| `forward` | `(imageSource: string) => Promise` | A function that accepts an image (url, b64) and returns an array of `OCRDetection` objects. | +| `error` | string | null | Contains the error message if the model loading failed. | +| `isGenerating` | `boolean` | Indicates whether the model is currently processing an inference. | +| `isReady` | `boolean` | Indicates whether the model has successfully loaded and is ready for inference. | +| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1. 
| ## Running the model @@ -302,12 +302,12 @@ You need to make sure the recognizer models you pass in `recognizerSources` matc ### Model size -| Model | XNNPACK [MB] | -| --------------------- | :----------: | -| Detector (CRAFT_1280) | 83.1 | -| Detector (CRAFT_320) | 83.1 | -| Recognizer (CRNN_512) | 15 - 18\* | -| Recognizer (CRNN_64) | 15 - 16\* | +| Model | XNNPACK [MB] | +| ------------------------------- | :----------: | +| Detector (CRAFT_1280_QUANTIZED) | 19.8 | +| Detector (CRAFT_32_QUANTIZED) | 19.8 | +| Recognizer (CRNN_512) | 15 - 18\* | +| Recognizer (CRNN_64) | 15 - 16\* | \* - The model weights vary depending on the language. @@ -315,8 +315,8 @@ You need to make sure the recognizer models you pass in `recognizerSources` matc | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] | | -------------------------------------------------------------------- | :--------------------: | :----------------: | -| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 2172 | 2214 | -| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1774 | 1705 | +| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 1540 | 1470 | +| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1070 | 1000 | ### Inference time @@ -332,18 +332,16 @@ Times presented in the tables are measured as consecutive runs of the model. Ini **Time measurements:** -| Metric | iPhone 14 Pro Max
[ms] | iPhone 16 Pro
[ms] | iPhone SE 3 | Samsung Galaxy S24
[ms] | OnePlus 12
[ms] | -| -------------------------------------------------------------------------- | ----------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | -| **Total Inference Time** | 9350 / 9620 | 8572 / 8621 | ❌ | 13737 / 10570 | 13436 / 9848 | -| **Detector (CRAFT_1250)** | 4895 | 4756 | ❌ | 5574 | 5016 | -| **Detector (CRAFT_320)** | | | | | | -| ├─ Average Time | 1247 | 1206 | ❌ | 1350 | 1356 | -| ├─ Total Time (3 runs) | 3741 | 3617 | ❌ | 4050 | 4069 | -| **Recognizer (CRNN_64)**
(_With Flag `independentChars == true`_) | | | | | | -| ├─ Average Time | 31 | 9 | ❌ | 195 | 207 | -| ├─ Total Time (21 runs) | 649 | 191 | ❌ | 4092 | 4339 | -| **Recognizer (CRNN_512)**
(_With Flag `independentChars == false`_) | | | | | | -| ├─ Average Time | 306 | 80 | ❌ | 308 | 250 | -| ├─ Total Time (3 runs) | 919 | 240 | ❌ | 925 | 751 | - -❌ - Insufficient RAM. +| Metric | iPhone 17 Pro
[ms] | iPhone 16 Pro
[ms] | iPhone SE 3 | Samsung Galaxy S24
[ms] | OnePlus 12
[ms] | +| -------------------------------------------------------------------------- | ------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | +| **Total Inference Time** | 3819 / 3716 | 3978 / 3841 | 4751 / 4532 | 3095 / 3286 | 2787 / 2770 | +| **Detector (CRAFT_1280_QUANTIZED)** | 1749 | 1804 | 2105 | 1216 | 1171 | +| **Detector (CRAFT_320_QUANTIZED)** | | | | | | +| ├─ Average Time | 458 | 474 | 561 | 360 | 332 | +| ├─ Total Time (4 runs) | 1832 | 1896 | 2244 | 1440 | 1328 | +| **Recognizer (CRNN_64)**
(_With Flag `independentChars == true`_) | | | | | | +| ├─ Average Time | 5 | 6 | 7 | 28 | 11 | +| ├─ Total Time (21 runs) | 105 | 126 | 147 | 588 | 231 | +| **Recognizer (CRNN_512)**
(_With Flag `independentChars == false`_) | | | | | | +| ├─ Average Time | 54 | 52 | 68 | 144 | 72 | +| ├─ Total Time (4 runs) | 216 | 208 | 272 | 576 | 288 | From 2d1ac228d6ff5eea8b82489c0c8f2a74906ef3c0 Mon Sep 17 00:00:00 2001 From: Mateusz Kopcinski <120639731+mkopcins@users.noreply.github.com> Date: Wed, 3 Dec 2025 11:20:11 +0100 Subject: [PATCH 05/11] feat: Remove stft calculation within the encoder (#658) ## Description The Whisper model export now takes in a plain waveform instead of pre-computed STFT. This PR aims to change the current API to accept waveforms instead. Before merging this, make sure to re-export all the existing Whisper models with the new export script. ### Introduces a breaking change? - [ ] Yes - [x] No ### Type of change - [ ] Bug fix (change which fixes an issue) - [x] New feature (change which adds functionality) - [ ] Documentation update (improves or adds clarity to existing documentation) - [ ] Other (chores, tests, code style improvements etc.) ### Tested on - [x] iOS - [x] Android ### Testing instructions ### Screenshots ### Related issues ### Checklist - [ ] I have performed a self-review of my code - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have updated the documentation accordingly - [ ] My changes generate no new warnings ### Additional notes --------- Co-authored-by: chmjkb Co-authored-by: Jakub Chmura <92989966+chmjkb@users.noreply.github.com> Co-authored-by: IgorSwat <114943112+IgorSwat@users.noreply.github.com> --- .../rnexecutorch/data_processing/dsp.cpp | 46 ------------------- .../common/rnexecutorch/models/BaseModel.cpp | 12 ++--- .../common/rnexecutorch/models/BaseModel.h | 17 ++++--- .../models/speech_to_text/asr/ASR.cpp | 31 +++++-------- .../models/speech_to_text/asr/ASR.h | 9 ++-- .../VoiceActivityDetection.cpp | 3 +- .../src/constants/modelUrls.ts | 46 +++++++++++-------- 7 files changed, 61 insertions(+), 103 deletions(-) diff --git a/packages/react-native-executorch/common/rnexecutorch/data_processing/dsp.cpp b/packages/react-native-executorch/common/rnexecutorch/data_processing/dsp.cpp index d3761dced..b1c8714a2 100644 --- a/packages/react-native-executorch/common/rnexecutorch/data_processing/dsp.cpp +++ b/packages/react-native-executorch/common/rnexecutorch/data_processing/dsp.cpp @@ -1,6 +1,4 @@ -#include #include -#include #include #include #include @@ -18,48 +16,4 @@ std::vector hannWindow(size_t size) { return window; } -std::vector stftFromWaveform(std::span waveform, - size_t fftWindowSize, size_t hopSize) { - // Initialize FFT - FFT fft(fftWindowSize); - - const auto numFrames = 1 + (waveform.size() - fftWindowSize) / hopSize; - const auto numBins = fftWindowSize / 2; - const auto hann = hannWindow(fftWindowSize); - auto inBuffer = std::vector(fftWindowSize); - auto outBuffer = std::vector>(fftWindowSize); - - // Output magnitudes in dB - std::vector magnitudes; - magnitudes.reserve(numFrames * numBins); - const auto magnitudeScale = 1.0f / static_cast(fftWindowSize); - constexpr auto epsilon = std::numeric_limits::epsilon(); - constexpr auto dbConversionFactor = 20.0f; - - for (size_t t = 0; t < numFrames; ++t) { - const size_t offset = t * hopSize; - // Clear the input buffer first - std::ranges::fill(inBuffer, 0.0f); - - // Fill frame with windowed signal - const size_t samplesToRead = - std::min(fftWindowSize, waveform.size() - offset); - for (size_t i = 0; i < samplesToRead; i++) { - inBuffer[i] = waveform[offset + i] * hann[i]; - } - - fft.doFFT(inBuffer.data(), outBuffer); - - // 
Calculate magnitudes in dB (only positive frequencies) - for (size_t i = 0; i < numBins; i++) { - const auto magnitude = std::abs(outBuffer[i]) * magnitudeScale; - const auto magnitude_db = - dbConversionFactor * log10f(magnitude + epsilon); - magnitudes.push_back(magnitude_db); - } - } - - return magnitudes; -} - } // namespace rnexecutorch::dsp diff --git a/packages/react-native-executorch/common/rnexecutorch/models/BaseModel.cpp b/packages/react-native-executorch/common/rnexecutorch/models/BaseModel.cpp index a1194de69..ee53c7d5a 100644 --- a/packages/react-native-executorch/common/rnexecutorch/models/BaseModel.cpp +++ b/packages/react-native-executorch/common/rnexecutorch/models/BaseModel.cpp @@ -30,7 +30,7 @@ BaseModel::BaseModel(const std::string &modelSource, } std::vector BaseModel::getInputShape(std::string method_name, - int32_t index) { + int32_t index) const { if (!module_) { throw std::runtime_error("Model not loaded: Cannot get input shape"); } @@ -56,7 +56,7 @@ std::vector BaseModel::getInputShape(std::string method_name, } std::vector> -BaseModel::getAllInputShapes(std::string methodName) { +BaseModel::getAllInputShapes(std::string methodName) const { if (!module_) { throw std::runtime_error("Model not loaded: Cannot get all input shapes"); } @@ -88,7 +88,7 @@ BaseModel::getAllInputShapes(std::string methodName) { /// to JS. It is not meant to be used within C++. If you want to call forward /// from C++ on a BaseModel, please use BaseModel::forward. std::vector -BaseModel::forwardJS(std::vector tensorViewVec) { +BaseModel::forwardJS(std::vector tensorViewVec) const { if (!module_) { throw std::runtime_error("Model not loaded: Cannot perform forward pass"); } @@ -136,7 +136,7 @@ BaseModel::forwardJS(std::vector tensorViewVec) { } Result -BaseModel::getMethodMeta(const std::string &methodName) { +BaseModel::getMethodMeta(const std::string &methodName) const { if (!module_) { throw std::runtime_error("Model not loaded: Cannot get method meta!"); } @@ -161,7 +161,7 @@ BaseModel::forward(const std::vector &input_evalues) const { Result> BaseModel::execute(const std::string &methodName, - const std::vector &input_value) { + const std::vector &input_value) const { if (!module_) { throw std::runtime_error("Model not loaded, cannot run execute."); } @@ -175,7 +175,7 @@ std::size_t BaseModel::getMemoryLowerBound() const noexcept { void BaseModel::unload() noexcept { module_.reset(nullptr); } std::vector -BaseModel::getTensorShape(const executorch::aten::Tensor &tensor) { +BaseModel::getTensorShape(const executorch::aten::Tensor &tensor) const { auto sizes = tensor.sizes(); return std::vector(sizes.begin(), sizes.end()); } diff --git a/packages/react-native-executorch/common/rnexecutorch/models/BaseModel.h b/packages/react-native-executorch/common/rnexecutorch/models/BaseModel.h index b7b7b54ed..cf2940429 100644 --- a/packages/react-native-executorch/common/rnexecutorch/models/BaseModel.h +++ b/packages/react-native-executorch/common/rnexecutorch/models/BaseModel.h @@ -25,18 +25,20 @@ class BaseModel { Module::LoadMode loadMode = Module::LoadMode::MmapUseMlockIgnoreErrors); std::size_t getMemoryLowerBound() const noexcept; void unload() noexcept; - std::vector getInputShape(std::string method_name, int32_t index); + std::vector getInputShape(std::string method_name, + int32_t index) const; std::vector> - getAllInputShapes(std::string methodName = "forward"); + getAllInputShapes(std::string methodName = "forward") const; std::vector - forwardJS(std::vector tensorViewVec); + 
forwardJS(std::vector tensorViewVec) const; Result> forward(const EValue &input_value) const; Result> forward(const std::vector &input_value) const; - Result> execute(const std::string &methodName, - const std::vector &input_value); + Result> + execute(const std::string &methodName, + const std::vector &input_value) const; Result - getMethodMeta(const std::string &methodName); + getMethodMeta(const std::string &methodName) const; protected: // If possible, models should not use the JS runtime to keep JSI internals @@ -49,7 +51,8 @@ class BaseModel { std::size_t memorySizeLowerBound{0}; private: - std::vector getTensorShape(const executorch::aten::Tensor &tensor); + std::vector + getTensorShape(const executorch::aten::Tensor &tensor) const; }; } // namespace models diff --git a/packages/react-native-executorch/common/rnexecutorch/models/speech_to_text/asr/ASR.cpp b/packages/react-native-executorch/common/rnexecutorch/models/speech_to_text/asr/ASR.cpp index d0f965cb3..bf8f9fb86 100644 --- a/packages/react-native-executorch/common/rnexecutorch/models/speech_to_text/asr/ASR.cpp +++ b/packages/react-native-executorch/common/rnexecutorch/models/speech_to_text/asr/ASR.cpp @@ -4,7 +4,6 @@ #include "ASR.h" #include "executorch/extension/tensor/tensor_ptr.h" #include "rnexecutorch/data_processing/Numerical.h" -#include "rnexecutorch/data_processing/dsp.h" #include "rnexecutorch/data_processing/gzip.h" namespace rnexecutorch::models::speech_to_text::asr { @@ -37,8 +36,7 @@ ASR::getInitialSequence(const DecodingOptions &options) const { return seq; } -GenerationResult ASR::generate(std::span waveform, - float temperature, +GenerationResult ASR::generate(std::span waveform, float temperature, const DecodingOptions &options) const { std::vector encoderOutput = this->encode(waveform); @@ -94,7 +92,7 @@ float ASR::getCompressionRatio(const std::string &text) const { } std::vector -ASR::generateWithFallback(std::span waveform, +ASR::generateWithFallback(std::span waveform, const DecodingOptions &options) const { std::vector temperatures = {0.0f, 0.2f, 0.4f, 0.6f, 0.8f, 1.0f}; std::vector bestTokens; @@ -209,7 +207,7 @@ ASR::estimateWordLevelTimestampsLinear(std::span tokens, return wordObjs; } -std::vector ASR::transcribe(std::span waveform, +std::vector ASR::transcribe(std::span waveform, const DecodingOptions &options) const { int32_t seek = 0; std::vector results; @@ -218,7 +216,7 @@ std::vector ASR::transcribe(std::span waveform, int32_t start = seek * ASR::kSamplingRate; const auto end = std::min( (seek + ASR::kChunkSize) * ASR::kSamplingRate, waveform.size()); - std::span chunk = waveform.subspan(start, end - start); + auto chunk = waveform.subspan(start, end - start); if (std::cmp_less(chunk.size(), ASR::kMinChunkSamples)) { break; @@ -246,19 +244,12 @@ std::vector ASR::transcribe(std::span waveform, return results; } -std::vector ASR::encode(std::span waveform) const { - constexpr int32_t fftWindowSize = 512; - constexpr int32_t stftHopLength = 160; - constexpr int32_t innerDim = 256; - - std::vector preprocessedData = - dsp::stftFromWaveform(waveform, fftWindowSize, stftHopLength); - const auto numFrames = - static_cast(preprocessedData.size()) / innerDim; - std::vector inputShape = {numFrames, innerDim}; +std::vector ASR::encode(std::span waveform) const { + auto inputShape = {static_cast(waveform.size())}; const auto modelInputTensor = executorch::extension::make_tensor_ptr( - std::move(inputShape), std::move(preprocessedData)); + std::move(inputShape), waveform.data(), + 
executorch::runtime::etensor::ScalarType::Float); const auto encoderResult = this->encoder->forward(modelInputTensor); if (!encoderResult.ok()) { @@ -268,7 +259,7 @@ std::vector ASR::encode(std::span waveform) const { } const auto decoderOutputTensor = encoderResult.get().at(0).toTensor(); - const int32_t outputNumel = decoderOutputTensor.numel(); + const auto outputNumel = decoderOutputTensor.numel(); const float *const dataPtr = decoderOutputTensor.const_data_ptr(); return {dataPtr, dataPtr + outputNumel}; @@ -277,8 +268,10 @@ std::vector ASR::encode(std::span waveform) const { std::vector ASR::decode(std::span tokens, std::span encoderOutput) const { std::vector tokenShape = {1, static_cast(tokens.size())}; + auto tokensLong = std::vector(tokens.begin(), tokens.end()); + auto tokenTensor = executorch::extension::make_tensor_ptr( - std::move(tokenShape), tokens.data(), ScalarType::Int); + tokenShape, tokensLong.data(), ScalarType::Long); const auto encoderOutputSize = static_cast(encoderOutput.size()); std::vector encShape = {1, ASR::kNumFrames, diff --git a/packages/react-native-executorch/common/rnexecutorch/models/speech_to_text/asr/ASR.h b/packages/react-native-executorch/common/rnexecutorch/models/speech_to_text/asr/ASR.h index 20180ebe4..a0ea7e181 100644 --- a/packages/react-native-executorch/common/rnexecutorch/models/speech_to_text/asr/ASR.h +++ b/packages/react-native-executorch/common/rnexecutorch/models/speech_to_text/asr/ASR.h @@ -14,9 +14,9 @@ class ASR { const models::BaseModel *decoder, const TokenizerModule *tokenizer); std::vector - transcribe(std::span waveform, + transcribe(std::span waveform, const types::DecodingOptions &options) const; - std::vector encode(std::span waveform) const; + std::vector encode(std::span waveform) const; std::vector decode(std::span tokens, std::span encoderOutput) const; @@ -44,11 +44,10 @@ class ASR { std::vector getInitialSequence(const types::DecodingOptions &options) const; - types::GenerationResult generate(std::span waveform, - float temperature, + types::GenerationResult generate(std::span waveform, float temperature, const types::DecodingOptions &options) const; std::vector - generateWithFallback(std::span waveform, + generateWithFallback(std::span waveform, const types::DecodingOptions &options) const; std::vector calculateWordLevelTimestamps(std::span tokens, diff --git a/packages/react-native-executorch/common/rnexecutorch/models/voice_activity_detection/VoiceActivityDetection.cpp b/packages/react-native-executorch/common/rnexecutorch/models/voice_activity_detection/VoiceActivityDetection.cpp index d07dbfb3c..dbc974706 100644 --- a/packages/react-native-executorch/common/rnexecutorch/models/voice_activity_detection/VoiceActivityDetection.cpp +++ b/packages/react-native-executorch/common/rnexecutorch/models/voice_activity_detection/VoiceActivityDetection.cpp @@ -6,7 +6,6 @@ #include #include #include -#include #include namespace rnexecutorch::models::voice_activity_detection { @@ -158,4 +157,4 @@ VoiceActivityDetection::postprocess(const std::vector &scores, return speechSegments; } -} // namespace rnexecutorch::models::voice_activity_detection \ No newline at end of file +} // namespace rnexecutorch::models::voice_activity_detection diff --git a/packages/react-native-executorch/src/constants/modelUrls.ts b/packages/react-native-executorch/src/constants/modelUrls.ts index 57381cf15..e9fe9e4d9 100644 --- a/packages/react-native-executorch/src/constants/modelUrls.ts +++ b/packages/react-native-executorch/src/constants/modelUrls.ts 
@@ -307,29 +307,32 @@ export const STYLE_TRANSFER_UDNIE = { }; // S2T -const WHISPER_TINY_EN_TOKENIZER = `${URL_PREFIX}-whisper-tiny.en/${VERSION_TAG}/tokenizer.json`; -const WHISPER_TINY_EN_ENCODER = `${URL_PREFIX}-whisper-tiny.en/${VERSION_TAG}/xnnpack/whisper_tiny_en_encoder_xnnpack.pte`; -const WHISPER_TINY_EN_DECODER = `${URL_PREFIX}-whisper-tiny.en/${VERSION_TAG}/xnnpack/whisper_tiny_en_decoder_xnnpack.pte`; +const WHISPER_TINY_EN_TOKENIZER = `${URL_PREFIX}-whisper-tiny.en/${NEXT_VERSION_TAG}/tokenizer.json`; +const WHISPER_TINY_EN_ENCODER = `${URL_PREFIX}-whisper-tiny.en/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_en_encoder_xnnpack.pte`; +const WHISPER_TINY_EN_DECODER = `${URL_PREFIX}-whisper-tiny.en/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_en_decoder_xnnpack.pte`; -const WHISPER_BASE_EN_TOKENIZER = `${URL_PREFIX}-whisper-base.en/${VERSION_TAG}/tokenizer.json`; -const WHISPER_BASE_EN_ENCODER = `${URL_PREFIX}-whisper-base.en/${VERSION_TAG}/xnnpack/whisper_base_en_encoder_xnnpack.pte`; -const WHISPER_BASE_EN_DECODER = `${URL_PREFIX}-whisper-base.en/${VERSION_TAG}/xnnpack/whisper_base_en_decoder_xnnpack.pte`; +const WHISPER_TINY_EN_ENCODER_QUANTIZED = `${URL_PREFIX}-whisper-tiny-quantized.en/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_quantized_en_encoder_xnnpack.pte`; +const WHISPER_TINY_EN_DECODER_QUANTIZED = `${URL_PREFIX}-whisper-tiny-quantized.en/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_quantized_en_decoder_xnnpack.pte`; -const WHISPER_SMALL_EN_TOKENIZER = `${URL_PREFIX}-whisper-small.en/${VERSION_TAG}/tokenizer.json`; -const WHISPER_SMALL_EN_ENCODER = `${URL_PREFIX}-whisper-small.en/${VERSION_TAG}/xnnpack/whisper_small_en_encoder_xnnpack.pte`; -const WHISPER_SMALL_EN_DECODER = `${URL_PREFIX}-whisper-small.en/${VERSION_TAG}/xnnpack/whisper_small_en_decoder_xnnpack.pte`; +const WHISPER_BASE_EN_TOKENIZER = `${URL_PREFIX}-whisper-base.en/${NEXT_VERSION_TAG}/tokenizer.json`; +const WHISPER_BASE_EN_ENCODER = `${URL_PREFIX}-whisper-base.en/${NEXT_VERSION_TAG}/xnnpack/whisper_base_en_encoder_xnnpack.pte`; +const WHISPER_BASE_EN_DECODER = `${URL_PREFIX}-whisper-base.en/${NEXT_VERSION_TAG}/xnnpack/whisper_base_en_decoder_xnnpack.pte`; -const WHISPER_TINY_TOKENIZER = `${URL_PREFIX}-whisper-tiny/${VERSION_TAG}/tokenizer.json`; -const WHISPER_TINY_ENCODER_MODEL = `${URL_PREFIX}-whisper-tiny/${VERSION_TAG}/xnnpack/whisper_tiny_encoder_xnnpack.pte`; -const WHISPER_TINY_DECODER_MODEL = `${URL_PREFIX}-whisper-tiny/${VERSION_TAG}/xnnpack/whisper_tiny_decoder_xnnpack.pte`; +const WHISPER_SMALL_EN_TOKENIZER = `${URL_PREFIX}-whisper-small.en/${NEXT_VERSION_TAG}/tokenizer.json`; +const WHISPER_SMALL_EN_ENCODER = `${URL_PREFIX}-whisper-small.en/${NEXT_VERSION_TAG}/xnnpack/whisper_small_en_encoder_xnnpack.pte`; +const WHISPER_SMALL_EN_DECODER = `${URL_PREFIX}-whisper-small.en/${NEXT_VERSION_TAG}/xnnpack/whisper_small_en_decoder_xnnpack.pte`; -const WHISPER_BASE_TOKENIZER = `${URL_PREFIX}-whisper-base/${VERSION_TAG}/tokenizer.json`; -const WHISPER_BASE_ENCODER_MODEL = `${URL_PREFIX}-whisper-base/${VERSION_TAG}/xnnpack/whisper_base_encoder_xnnpack.pte`; -const WHISPER_BASE_DECODER_MODEL = `${URL_PREFIX}-whisper-base/${VERSION_TAG}/xnnpack/whisper_base_decoder_xnnpack.pte`; +const WHISPER_TINY_TOKENIZER = `${URL_PREFIX}-whisper-tiny/${NEXT_VERSION_TAG}/tokenizer.json`; +const WHISPER_TINY_ENCODER_MODEL = `${URL_PREFIX}-whisper-tiny/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_encoder_xnnpack.pte`; +const WHISPER_TINY_DECODER_MODEL = 
`${URL_PREFIX}-whisper-tiny/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_decoder_xnnpack.pte`; -const WHISPER_SMALL_TOKENIZER = `${URL_PREFIX}-whisper-small/${VERSION_TAG}/tokenizer.json`; -const WHISPER_SMALL_ENCODER_MODEL = `${URL_PREFIX}-whisper-small/${VERSION_TAG}/xnnpack/whisper_small_encoder_xnnpack.pte`; -const WHISPER_SMALL_DECODER_MODEL = `${URL_PREFIX}-whisper-small/${VERSION_TAG}/xnnpack/whisper_small_decoder_xnnpack.pte`; +const WHISPER_BASE_TOKENIZER = `${URL_PREFIX}-whisper-base/${NEXT_VERSION_TAG}/tokenizer.json`; +const WHISPER_BASE_ENCODER_MODEL = `${URL_PREFIX}-whisper-base/${NEXT_VERSION_TAG}/xnnpack/whisper_base_encoder_xnnpack.pte`; +const WHISPER_BASE_DECODER_MODEL = `${URL_PREFIX}-whisper-base/${NEXT_VERSION_TAG}/xnnpack/whisper_base_decoder_xnnpack.pte`; + +const WHISPER_SMALL_TOKENIZER = `${URL_PREFIX}-whisper-small/${NEXT_VERSION_TAG}/tokenizer.json`; +const WHISPER_SMALL_ENCODER_MODEL = `${URL_PREFIX}-whisper-small/${NEXT_VERSION_TAG}/xnnpack/whisper_small_encoder_xnnpack.pte`; +const WHISPER_SMALL_DECODER_MODEL = `${URL_PREFIX}-whisper-small/${NEXT_VERSION_TAG}/xnnpack/whisper_small_decoder_xnnpack.pte`; export const WHISPER_TINY_EN = { isMultilingual: false, @@ -338,6 +341,13 @@ export const WHISPER_TINY_EN = { tokenizerSource: WHISPER_TINY_EN_TOKENIZER, }; +export const WHISPER_TINY_EN_QUANTIZED = { + isMultilingual: false, + encoderSource: WHISPER_TINY_EN_ENCODER_QUANTIZED, + decoderSource: WHISPER_TINY_EN_DECODER_QUANTIZED, + tokenizerSource: WHISPER_TINY_EN_TOKENIZER, +}; + export const WHISPER_BASE_EN = { isMultilingual: false, encoderSource: WHISPER_BASE_EN_ENCODER, From 73a47a8df8bf20ab0f5ac32ab68a375f196c3d6b Mon Sep 17 00:00:00 2001 From: Jakub Chmura <92989966+chmjkb@users.noreply.github.com> Date: Wed, 3 Dec 2025 11:24:04 +0100 Subject: [PATCH 06/11] fix: Import Expo FS conditionally to work with Expo 54 (#699) ## Description Expo 54 introduces a new FileSystem API, deprecating the ones used in our codebase. The old APIs can still be accessed under `expo-file-system/legacy`. This is a temporary fix to work with old Expo versions. ### Introduces a breaking change? - [ ] Yes - [x] No ### Type of change - [x] Bug fix (change which fixes an issue) - [ ] New feature (change which adds functionality) - [ ] Documentation update (improves or adds clarity to existing documentation) - [ ] Other (chores, tests, code style improvements etc.) 
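For reference, a minimal sketch of the consumer-side pattern this change introduces (the helper name and import path are taken from the diff below; this is an illustration, not a new public API):

```typescript
// Before: static import, which resolves to APIs deprecated on Expo SDK 54
// import { readAsStringAsync } from 'expo-file-system';

// After: resolve the legacy module at runtime through the helper added in this patch
import { importLegacyExpoFSModules } from '../utils/ResourceFetcher';

const { readAsStringAsync } = importLegacyExpoFSModules();
```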
### Tested on - [x] iOS - [x] Android ### Testing instructions ### Screenshots ### Related issues ### Checklist - [ ] I have performed a self-review of my code - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have updated the documentation accordingly - [ ] My changes generate no new warnings ### Additional notes --- .../src/constants/directories.ts | 4 ++- .../src/controllers/LLMController.ts | 7 ++++-- .../src/utils/ResourceFetcher.ts | 25 +++++++++++++++++-- .../src/utils/ResourceFetcherUtils.ts | 20 +++++++-------- 4 files changed, 40 insertions(+), 16 deletions(-) diff --git a/packages/react-native-executorch/src/constants/directories.ts b/packages/react-native-executorch/src/constants/directories.ts index ac20d04d8..3cc6e68a9 100644 --- a/packages/react-native-executorch/src/constants/directories.ts +++ b/packages/react-native-executorch/src/constants/directories.ts @@ -1,3 +1,5 @@ -import { documentDirectory } from 'expo-file-system'; +import { importLegacyExpoFSModules } from '../utils/ResourceFetcher'; + +const { documentDirectory } = importLegacyExpoFSModules(); export const RNEDirectory = `${documentDirectory}react-native-executorch/`; diff --git a/packages/react-native-executorch/src/controllers/LLMController.ts b/packages/react-native-executorch/src/controllers/LLMController.ts index bcc131eba..bbc113a76 100644 --- a/packages/react-native-executorch/src/controllers/LLMController.ts +++ b/packages/react-native-executorch/src/controllers/LLMController.ts @@ -1,9 +1,11 @@ import { ResourceSource } from '../types/common'; -import { ResourceFetcher } from '../utils/ResourceFetcher'; +import { + importLegacyExpoFSModules, + ResourceFetcher, +} from '../utils/ResourceFetcher'; import { ETError, getError } from '../Error'; import { Template } from '@huggingface/jinja'; import { DEFAULT_CHAT_CONFIG } from '../constants/llmDefaults'; -import { readAsStringAsync } from 'expo-file-system'; import { ChatConfig, GenerationConfig, @@ -14,6 +16,7 @@ import { } from '../types/llm'; import { parseToolCall } from '../utils/llm'; import { Logger } from '../common/Logger'; +const { readAsStringAsync } = importLegacyExpoFSModules(); export class LLMController { private nativeModule: any; diff --git a/packages/react-native-executorch/src/utils/ResourceFetcher.ts b/packages/react-native-executorch/src/utils/ResourceFetcher.ts index fa2fd8c09..efc16ab63 100644 --- a/packages/react-native-executorch/src/utils/ResourceFetcher.ts +++ b/packages/react-native-executorch/src/utils/ResourceFetcher.ts @@ -27,7 +27,27 @@ * - Implements linked list behavior via the `.next` attribute * - Automatically processes subsequent downloads when `.next` contains a valid resource */ -import { +import type * as FileSystemTypes from 'expo-file-system'; + +export function importLegacyExpoFSModules() { + let FileSystem: typeof FileSystemTypes; + + try { + const expoPkg = require('expo/package.json'); + const sdkVersion = expoPkg.version.split('.')[0]; + + if (Number(sdkVersion) > 53) { + FileSystem = require('expo-file-system/legacy'); + } else { + FileSystem = require('expo-file-system'); + } + } catch (e) { + throw new Error('Expo must be installed to use react-native-executorch'); + } + return FileSystem; +} + +const { cacheDirectory, copyAsync, createDownloadResumable, @@ -37,7 +57,8 @@ import { EncodingType, deleteAsync, readDirectoryAsync, -} from 'expo-file-system'; +} = importLegacyExpoFSModules(); + import { Asset } from 'expo-asset'; import { Platform } from 'react-native'; import { 
RNEDirectory } from '../constants/directories'; diff --git a/packages/react-native-executorch/src/utils/ResourceFetcherUtils.ts b/packages/react-native-executorch/src/utils/ResourceFetcherUtils.ts index 67d6edc9b..d36a9ba5e 100644 --- a/packages/react-native-executorch/src/utils/ResourceFetcherUtils.ts +++ b/packages/react-native-executorch/src/utils/ResourceFetcherUtils.ts @@ -1,16 +1,14 @@ -/** - * @internal - */ - -import { - DownloadResumable, - getInfoAsync, - makeDirectoryAsync, -} from 'expo-file-system'; +import type * as FileSystemTypes from 'expo-file-system'; import { RNEDirectory } from '../constants/directories'; import { ResourceSource } from '../types/common'; import { Asset } from 'expo-asset'; import { Logger } from '../common/Logger'; +import { importLegacyExpoFSModules } from './ResourceFetcher'; + +/** + * @internal + */ +const { getInfoAsync, makeDirectoryAsync } = importLegacyExpoFSModules(); export const enum HTTP_CODE { OK = 200, @@ -42,7 +40,7 @@ export interface ResourceSourceExtended { } export interface DownloadResource { - downloadResumable: DownloadResumable; + downloadResumable: FileSystemTypes.DownloadResumable; status: DownloadStatus; extendedInfo: ResourceSourceExtended; } @@ -75,7 +73,7 @@ export namespace ResourceFetcherUtils { let totalLength = 0; let previousFilesTotalLength = 0; for (const source of sources) { - const type = await ResourceFetcherUtils.getType(source); + const type = ResourceFetcherUtils.getType(source); let length = 0; try { if (type === SourceType.REMOTE_FILE && typeof source === 'string') { From 0da57dc7ff704a02ba6b44d25531beef7aea3c63 Mon Sep 17 00:00:00 2001 From: Jakub Chmura <92989966+chmjkb@users.noreply.github.com> Date: Wed, 3 Dec 2025 12:17:35 +0100 Subject: [PATCH 07/11] fix: prevent OpenCV from overriding our threading configuration (#700) ## Description We observed activity on all CPU cores despite manually configuring the thread pool. OpenCV's internal threading was activating all available cores, overriding our optimized thread configuration and resulting in worse performance. ### Introduces a breaking change? - [ ] Yes - [x] No ### Type of change - [x] Bug fix (change which fixes an issue) - [ ] New feature (change which adds functionality) - [ ] Documentation update (improves or adds clarity to existing documentation) - [ ] Other (chores, tests, code style improvements etc.) 
### Tested on - [x] iOS - [x] Android ### Testing instructions ### Screenshots ### Related issues ### Checklist - [ ] I have performed a self-review of my code - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have updated the documentation accordingly - [ ] My changes generate no new warnings ### Additional notes --- .../rnexecutorch/RnExecutorchInstaller.cpp | 16 ---------------- .../rnexecutorch/threads/GlobalThreadPool.h | 4 ++++ 2 files changed, 4 insertions(+), 16 deletions(-) diff --git a/packages/react-native-executorch/common/rnexecutorch/RnExecutorchInstaller.cpp b/packages/react-native-executorch/common/rnexecutorch/RnExecutorchInstaller.cpp index 95f6e9e55..c25fbd13f 100644 --- a/packages/react-native-executorch/common/rnexecutorch/RnExecutorchInstaller.cpp +++ b/packages/react-native-executorch/common/rnexecutorch/RnExecutorchInstaller.cpp @@ -108,22 +108,6 @@ void RnExecutorchInstaller::injectJSIBindings( threads::utils::unsafeSetupThreadPool(); threads::GlobalThreadPool::initialize(); - -#if defined(__ANDROID__) && defined(__aarch64__) - auto num_of_perf_cores = - ::executorch::extension::cpuinfo::get_num_performant_cores(); - log(LOG_LEVEL::Info, "Detected ", num_of_perf_cores, " performant cores"); - // setting num_of_cores to floor(num_of_perf_cores / 2) + 1) because depending - // on cpu arch as when possible we want to leave at least 2 performant cores - // for other tasks (setting more actually results in drop of performance). For - // older devices (i.e. samsung s22) resolves to 3 cores, and for newer ones - // (like OnePlus 12) resolves to 4, which when benchamrked gives highest - // throughput. - auto num_of_cores = static_cast(num_of_perf_cores / 2) + 1; - ::executorch::extension::threadpool::get_threadpool() - ->_unsafe_reset_threadpool(num_of_cores); - log(LOG_LEVEL::Info, "Configuring xnnpack for ", num_of_cores, " threads"); -#endif } } // namespace rnexecutorch diff --git a/packages/react-native-executorch/common/rnexecutorch/threads/GlobalThreadPool.h b/packages/react-native-executorch/common/rnexecutorch/threads/GlobalThreadPool.h index 8b61080f8..50025eeeb 100644 --- a/packages/react-native-executorch/common/rnexecutorch/threads/GlobalThreadPool.h +++ b/packages/react-native-executorch/common/rnexecutorch/threads/GlobalThreadPool.h @@ -4,6 +4,7 @@ #include #include #include +#include #include #include #include @@ -38,6 +39,9 @@ class GlobalThreadPool { numThreads, "threads"); instance = std::make_unique(numThreads.value(), config); + // Disable OpenCV's internal threading to prevent it from overriding our + // thread pool configuration, which would cause degraded performance + cv::setNumThreads(0); }); } From 270c85e93fb9dcdaa935dabcf3cbee11671ef327 Mon Sep 17 00:00:00 2001 From: Jakub Chmura <92989966+chmjkb@users.noreply.github.com> Date: Thu, 4 Dec 2025 13:35:42 +0100 Subject: [PATCH 08/11] chore: update HuggingFace model URL tags to v0.6 (#701) ## Description ### Introduces a breaking change? - [ ] Yes - [ ] No ### Type of change - [ ] Bug fix (change which fixes an issue) - [ ] New feature (change which adds functionality) - [ ] Documentation update (improves or adds clarity to existing documentation) - [ ] Other (chores, tests, code style improvements etc.) 
### Tested on - [ ] iOS - [ ] Android ### Testing instructions ### Screenshots ### Related issues ### Checklist - [ ] I have performed a self-review of my code - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have updated the documentation accordingly - [ ] My changes generate no new warnings ### Additional notes --- .../src/constants/modelUrls.ts | 46 +++++++++---------- 1 file changed, 23 insertions(+), 23 deletions(-) diff --git a/packages/react-native-executorch/src/constants/modelUrls.ts b/packages/react-native-executorch/src/constants/modelUrls.ts index e9fe9e4d9..50e7ef5a8 100644 --- a/packages/react-native-executorch/src/constants/modelUrls.ts +++ b/packages/react-native-executorch/src/constants/modelUrls.ts @@ -2,8 +2,8 @@ import { Platform } from 'react-native'; const URL_PREFIX = 'https://huggingface.co/software-mansion/react-native-executorch'; -const VERSION_TAG = 'resolve/v0.5.0'; -const NEXT_VERSION_TAG = 'resolve/v0.6.0'; +const VERSION_TAG = 'resolve/v0.6.0'; +// const NEXT_VERSION_TAG = 'resolve/v0.7.0'; // LLMs @@ -307,32 +307,32 @@ export const STYLE_TRANSFER_UDNIE = { }; // S2T -const WHISPER_TINY_EN_TOKENIZER = `${URL_PREFIX}-whisper-tiny.en/${NEXT_VERSION_TAG}/tokenizer.json`; -const WHISPER_TINY_EN_ENCODER = `${URL_PREFIX}-whisper-tiny.en/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_en_encoder_xnnpack.pte`; -const WHISPER_TINY_EN_DECODER = `${URL_PREFIX}-whisper-tiny.en/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_en_decoder_xnnpack.pte`; +const WHISPER_TINY_EN_TOKENIZER = `${URL_PREFIX}-whisper-tiny.en/${VERSION_TAG}/tokenizer.json`; +const WHISPER_TINY_EN_ENCODER = `${URL_PREFIX}-whisper-tiny.en/${VERSION_TAG}/xnnpack/whisper_tiny_en_encoder_xnnpack.pte`; +const WHISPER_TINY_EN_DECODER = `${URL_PREFIX}-whisper-tiny.en/${VERSION_TAG}/xnnpack/whisper_tiny_en_decoder_xnnpack.pte`; -const WHISPER_TINY_EN_ENCODER_QUANTIZED = `${URL_PREFIX}-whisper-tiny-quantized.en/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_quantized_en_encoder_xnnpack.pte`; -const WHISPER_TINY_EN_DECODER_QUANTIZED = `${URL_PREFIX}-whisper-tiny-quantized.en/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_quantized_en_decoder_xnnpack.pte`; +const WHISPER_TINY_EN_ENCODER_QUANTIZED = `${URL_PREFIX}-whisper-tiny-quantized.en/${VERSION_TAG}/xnnpack/whisper_tiny_quantized_en_encoder_xnnpack.pte`; +const WHISPER_TINY_EN_DECODER_QUANTIZED = `${URL_PREFIX}-whisper-tiny-quantized.en/${VERSION_TAG}/xnnpack/whisper_tiny_quantized_en_decoder_xnnpack.pte`; -const WHISPER_BASE_EN_TOKENIZER = `${URL_PREFIX}-whisper-base.en/${NEXT_VERSION_TAG}/tokenizer.json`; -const WHISPER_BASE_EN_ENCODER = `${URL_PREFIX}-whisper-base.en/${NEXT_VERSION_TAG}/xnnpack/whisper_base_en_encoder_xnnpack.pte`; -const WHISPER_BASE_EN_DECODER = `${URL_PREFIX}-whisper-base.en/${NEXT_VERSION_TAG}/xnnpack/whisper_base_en_decoder_xnnpack.pte`; +const WHISPER_BASE_EN_TOKENIZER = `${URL_PREFIX}-whisper-base.en/${VERSION_TAG}/tokenizer.json`; +const WHISPER_BASE_EN_ENCODER = `${URL_PREFIX}-whisper-base.en/${VERSION_TAG}/xnnpack/whisper_base_en_encoder_xnnpack.pte`; +const WHISPER_BASE_EN_DECODER = `${URL_PREFIX}-whisper-base.en/${VERSION_TAG}/xnnpack/whisper_base_en_decoder_xnnpack.pte`; -const WHISPER_SMALL_EN_TOKENIZER = `${URL_PREFIX}-whisper-small.en/${NEXT_VERSION_TAG}/tokenizer.json`; -const WHISPER_SMALL_EN_ENCODER = `${URL_PREFIX}-whisper-small.en/${NEXT_VERSION_TAG}/xnnpack/whisper_small_en_encoder_xnnpack.pte`; -const WHISPER_SMALL_EN_DECODER = 
`${URL_PREFIX}-whisper-small.en/${NEXT_VERSION_TAG}/xnnpack/whisper_small_en_decoder_xnnpack.pte`; +const WHISPER_SMALL_EN_TOKENIZER = `${URL_PREFIX}-whisper-small.en/${VERSION_TAG}/tokenizer.json`; +const WHISPER_SMALL_EN_ENCODER = `${URL_PREFIX}-whisper-small.en/${VERSION_TAG}/xnnpack/whisper_small_en_encoder_xnnpack.pte`; +const WHISPER_SMALL_EN_DECODER = `${URL_PREFIX}-whisper-small.en/${VERSION_TAG}/xnnpack/whisper_small_en_decoder_xnnpack.pte`; -const WHISPER_TINY_TOKENIZER = `${URL_PREFIX}-whisper-tiny/${NEXT_VERSION_TAG}/tokenizer.json`; -const WHISPER_TINY_ENCODER_MODEL = `${URL_PREFIX}-whisper-tiny/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_encoder_xnnpack.pte`; -const WHISPER_TINY_DECODER_MODEL = `${URL_PREFIX}-whisper-tiny/${NEXT_VERSION_TAG}/xnnpack/whisper_tiny_decoder_xnnpack.pte`; +const WHISPER_TINY_TOKENIZER = `${URL_PREFIX}-whisper-tiny/${VERSION_TAG}/tokenizer.json`; +const WHISPER_TINY_ENCODER_MODEL = `${URL_PREFIX}-whisper-tiny/${VERSION_TAG}/xnnpack/whisper_tiny_encoder_xnnpack.pte`; +const WHISPER_TINY_DECODER_MODEL = `${URL_PREFIX}-whisper-tiny/${VERSION_TAG}/xnnpack/whisper_tiny_decoder_xnnpack.pte`; -const WHISPER_BASE_TOKENIZER = `${URL_PREFIX}-whisper-base/${NEXT_VERSION_TAG}/tokenizer.json`; -const WHISPER_BASE_ENCODER_MODEL = `${URL_PREFIX}-whisper-base/${NEXT_VERSION_TAG}/xnnpack/whisper_base_encoder_xnnpack.pte`; -const WHISPER_BASE_DECODER_MODEL = `${URL_PREFIX}-whisper-base/${NEXT_VERSION_TAG}/xnnpack/whisper_base_decoder_xnnpack.pte`; +const WHISPER_BASE_TOKENIZER = `${URL_PREFIX}-whisper-base/${VERSION_TAG}/tokenizer.json`; +const WHISPER_BASE_ENCODER_MODEL = `${URL_PREFIX}-whisper-base/${VERSION_TAG}/xnnpack/whisper_base_encoder_xnnpack.pte`; +const WHISPER_BASE_DECODER_MODEL = `${URL_PREFIX}-whisper-base/${VERSION_TAG}/xnnpack/whisper_base_decoder_xnnpack.pte`; -const WHISPER_SMALL_TOKENIZER = `${URL_PREFIX}-whisper-small/${NEXT_VERSION_TAG}/tokenizer.json`; -const WHISPER_SMALL_ENCODER_MODEL = `${URL_PREFIX}-whisper-small/${NEXT_VERSION_TAG}/xnnpack/whisper_small_encoder_xnnpack.pte`; -const WHISPER_SMALL_DECODER_MODEL = `${URL_PREFIX}-whisper-small/${NEXT_VERSION_TAG}/xnnpack/whisper_small_decoder_xnnpack.pte`; +const WHISPER_SMALL_TOKENIZER = `${URL_PREFIX}-whisper-small/${VERSION_TAG}/tokenizer.json`; +const WHISPER_SMALL_ENCODER_MODEL = `${URL_PREFIX}-whisper-small/${VERSION_TAG}/xnnpack/whisper_small_encoder_xnnpack.pte`; +const WHISPER_SMALL_DECODER_MODEL = `${URL_PREFIX}-whisper-small/${VERSION_TAG}/xnnpack/whisper_small_decoder_xnnpack.pte`; export const WHISPER_TINY_EN = { isMultilingual: false, @@ -452,7 +452,7 @@ export const BK_SDM_TINY_VPRED_256 = { }; // Voice Activity Detection -const FSMN_VAD_MODEL = `${URL_PREFIX}-fsmn-vad/${NEXT_VERSION_TAG}/xnnpack/fsmn-vad_xnnpack.pte`; +const FSMN_VAD_MODEL = `${URL_PREFIX}-fsmn-vad/${VERSION_TAG}/xnnpack/fsmn-vad_xnnpack.pte`; export const FSMN_VAD = { modelSource: FSMN_VAD_MODEL, From aa87474cc6135544a9f2862a23ee1e4435f56dc1 Mon Sep 17 00:00:00 2001 From: IgorSwat Date: Fri, 5 Dec 2025 08:11:52 +0100 Subject: [PATCH 09/11] update benchmarks for v0.6.0 --- .../useTextEmbeddings.md | 14 +- .../02-computer-vision/useClassification.md | 2 +- .../02-computer-vision/useImageEmbeddings.md | 6 +- .../02-hooks/02-computer-vision/useOCR.md | 16 +- .../02-computer-vision/useObjectDetection.md | 2 +- .../02-computer-vision/useStyleTransfer.md | 8 +- .../02-computer-vision/useVerticalOCR.md | 16 +- docs/docs/04-benchmarks/inference-time.md | 52 +- .../01-fundamentals/01-getting-started.md | 100 ++++ 
.../01-fundamentals/02-loading-models.md | 50 ++ .../03-frequently-asked-questions.md | 39 ++ .../01-fundamentals/_category_.json | 6 + .../_category_.json | 6 + .../01-natural-language-processing/useLLM.md | 537 ++++++++++++++++++ .../useSpeechToText.md | 343 +++++++++++ .../useTextEmbeddings.md | 158 ++++++ .../useTokenizer.md | 104 ++++ .../01-natural-language-processing/useVAD.md | 194 +++++++ .../02-computer-vision/_category_.json | 6 + .../02-computer-vision/useClassification.md | 113 ++++ .../02-computer-vision/useImageEmbeddings.md | 132 +++++ .../useImageSegmentation.md | 117 ++++ .../02-hooks/02-computer-vision/useOCR.md | 332 +++++++++++ .../02-computer-vision/useObjectDetection.md | 152 +++++ .../02-computer-vision/useStyleTransfer.md | 114 ++++ .../02-computer-vision/useTextToImage.md | 133 +++++ .../02-computer-vision/useVerticalOCR.md | 347 +++++++++++ .../03-executorch-bindings/_category_.json | 6 + .../useExecutorchModule.md | 155 +++++ .../version-0.6.x/02-hooks/_category_.json | 6 + .../LLMModule.md | 166 ++++++ .../SpeechToTextModule.md | 252 ++++++++ .../TextEmbeddingsModule.md | 59 ++ .../TokenizerModule.md | 60 ++ .../_category_.json | 6 + .../ClassificationModule.md | 64 +++ .../ImageEmbeddingsModule.md | 60 ++ .../ImageSegmentationModule.md | 77 +++ .../02-computer-vision/OCRModule.md | 135 +++++ .../ObjectDetectionModule.md | 77 +++ .../02-computer-vision/StyleTransferModule.md | 64 +++ .../02-computer-vision/VerticalOCRModule.md | 151 +++++ .../02-computer-vision/_category_.json | 6 + .../ExecutorchModule.md | 164 ++++++ .../03-executorch-bindings/_category_.json | 6 + .../03-typescript-api/_category_.json | 6 + .../04-benchmarks/_category_.json | 6 + .../04-benchmarks/inference-time.md | 111 ++++ .../04-benchmarks/memory-usage.md | 81 +++ .../version-0.6.x/04-benchmarks/model-size.md | 90 +++ .../05-utilities/_category_.json | 6 + .../05-utilities/resource-fetcher.md | 218 +++++++ 52 files changed, 5073 insertions(+), 58 deletions(-) create mode 100644 docs/versioned_docs/version-0.6.x/01-fundamentals/01-getting-started.md create mode 100644 docs/versioned_docs/version-0.6.x/01-fundamentals/02-loading-models.md create mode 100644 docs/versioned_docs/version-0.6.x/01-fundamentals/03-frequently-asked-questions.md create mode 100644 docs/versioned_docs/version-0.6.x/01-fundamentals/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/useLLM.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/useSpeechToText.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/useTextEmbeddings.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/useTokenizer.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/useVAD.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/02-computer-vision/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/02-computer-vision/useClassification.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/02-computer-vision/useImageEmbeddings.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/02-computer-vision/useImageSegmentation.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/02-computer-vision/useOCR.md create mode 100644 
docs/versioned_docs/version-0.6.x/02-hooks/02-computer-vision/useObjectDetection.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/02-computer-vision/useStyleTransfer.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/02-computer-vision/useTextToImage.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/02-computer-vision/useVerticalOCR.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/03-executorch-bindings/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/03-executorch-bindings/useExecutorchModule.md create mode 100644 docs/versioned_docs/version-0.6.x/02-hooks/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/01-natural-language-processing/LLMModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/01-natural-language-processing/SpeechToTextModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/01-natural-language-processing/TextEmbeddingsModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/01-natural-language-processing/TokenizerModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/01-natural-language-processing/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/02-computer-vision/ClassificationModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/02-computer-vision/ImageEmbeddingsModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/02-computer-vision/ImageSegmentationModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/02-computer-vision/OCRModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/02-computer-vision/ObjectDetectionModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/02-computer-vision/StyleTransferModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/02-computer-vision/VerticalOCRModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/02-computer-vision/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/03-executorch-bindings/ExecutorchModule.md create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/03-executorch-bindings/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/03-typescript-api/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/04-benchmarks/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/04-benchmarks/inference-time.md create mode 100644 docs/versioned_docs/version-0.6.x/04-benchmarks/memory-usage.md create mode 100644 docs/versioned_docs/version-0.6.x/04-benchmarks/model-size.md create mode 100644 docs/versioned_docs/version-0.6.x/05-utilities/_category_.json create mode 100644 docs/versioned_docs/version-0.6.x/05-utilities/resource-fetcher.md diff --git a/docs/docs/02-hooks/01-natural-language-processing/useTextEmbeddings.md b/docs/docs/02-hooks/01-natural-language-processing/useTextEmbeddings.md index fd595d208..7d4706f15 100644 --- a/docs/docs/02-hooks/01-natural-language-processing/useTextEmbeddings.md +++ b/docs/docs/02-hooks/01-natural-language-processing/useTextEmbeddings.md @@ -145,13 +145,13 @@ For the supported models, the returned embedding vector is normalized, meaning t Times presented in the tables are measured as consecutive runs of the model. 
Initial run times may be up to 2x longer due to model loading and initialization. ::: -| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| ALL_MINILM_L6_V2 | 16 | 16 | 19 | 54 | 28 | -| ALL_MPNET_BASE_V2 | 115 | 116 | 144 | 145 | 95 | -| MULTI_QA_MINILM_L6_COS_V1 | 16 | 16 | 20 | 47 | 28 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 112 | 119 | 144 | 146 | 96 | -| CLIP_VIT_BASE_PATCH32_TEXT | 47 | 45 | 57 | 65 | 48 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| -------------------------- | :--------------------------: | :-----------------------: | +| ALL_MINILM_L6_V2 | 7 | 21 | +| ALL_MPNET_BASE_V2 | 24 | 90 | +| MULTI_QA_MINILM_L6_COS_V1 | 7 | 19 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 24 | 88 | +| CLIP_VIT_BASE_PATCH32_TEXT | 14 | 39 | :::info Benchmark times for text embeddings are highly dependent on the sentence length. The numbers above are based on a sentence of around 80 tokens. For shorter or longer sentences, inference time may vary accordingly. diff --git a/docs/docs/02-hooks/02-computer-vision/useClassification.md b/docs/docs/02-hooks/02-computer-vision/useClassification.md index e17bfa775..eaf9afcb7 100644 --- a/docs/docs/02-hooks/02-computer-vision/useClassification.md +++ b/docs/docs/02-hooks/02-computer-vision/useClassification.md @@ -110,4 +110,4 @@ Times presented in the tables are measured as consecutive runs of the model. Ini | Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| EFFICIENTNET_V2_S | 105 | 110 | 149 | 299 | 227 | +| EFFICIENTNET_V2_S | 64 | 68 | 217 | 205 | 198 | diff --git a/docs/docs/02-hooks/02-computer-vision/useImageEmbeddings.md b/docs/docs/02-hooks/02-computer-vision/useImageEmbeddings.md index 4d417590c..b6decd1d2 100644 --- a/docs/docs/02-hooks/02-computer-vision/useImageEmbeddings.md +++ b/docs/docs/02-hooks/02-computer-vision/useImageEmbeddings.md @@ -123,9 +123,9 @@ For the supported models, the returned embedding vector is normalized, meaning t Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization. Performance also heavily depends on image size, because resize is expansive operation, especially on low-end devices. ::: -| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| --------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| CLIP_VIT_BASE_PATCH32_IMAGE | 70 | 70 | 90 | 66 | 58 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------------- | :--------------------------: | :-----------------------: | +| CLIP_VIT_BASE_PATCH32_IMAGE | 18 | 55 | :::info Image embedding benchmark times are measured using 224×224 pixel images, as required by the model. 
All input images, whether larger or smaller, are resized to 224×224 before processing. Resizing is typically fast for small images but may be noticeably slower for very large images, which can increase total inference time. diff --git a/docs/docs/02-hooks/02-computer-vision/useOCR.md b/docs/docs/02-hooks/02-computer-vision/useOCR.md index 08e28f829..d07efd601 100644 --- a/docs/docs/02-hooks/02-computer-vision/useOCR.md +++ b/docs/docs/02-hooks/02-computer-vision/useOCR.md @@ -319,14 +319,14 @@ Times presented in the tables are measured as consecutive runs of the model. Ini | Metric | iPhone 17 Pro
[ms] | iPhone 16 Pro
[ms] | iPhone SE 3 | Samsung Galaxy S24
[ms] | OnePlus 12
[ms] | | ---------------------------------- | ------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | -| **Total Inference Time** | 1160 | 1144 | 1498 | 1567 | 1160 | -| **Detector (CRAFT_800_QUANTIZED)** | 669 | 649 | 825 | 541 | 474 | +| **Total Inference Time** | 652 | 600 | 2855 | 1092 | 1034 | +| **Detector (CRAFT_800_QUANTIZED)** | 220 | 221 | 1740 | 521 | 492 | | **Recognizer (CRNN_512)** | | | | | | -| ├─ Average Time | 48 | 47 | 60 | 91 | 72 | -| ├─ Total Time (3 runs) | 144 | 141 | 180 | 273 | 216 | +| ├─ Average Time | 45 | 38 | 110 | 40 | 38 | +| ├─ Total Time (3 runs) | 135 | 114 | 330 | 120 | 114 | | **Recognizer (CRNN_256)** | | | | | | -| ├─ Average Time | 22 | 22 | 29 | 51 | 30 | -| ├─ Total Time (7 runs) | 154 | 154 | 203 | 357 | 210 | +| ├─ Average Time | 21 | 18 | 54 | 20 | 19 | +| ├─ Total Time (7 runs) | 147 | 126 | 378 | 140 | 133 | | **Recognizer (CRNN_128)** | | | | | | -| ├─ Average Time | 11 | 11 | 14 | 28 | 17 | -| ├─ Total Time (7 runs) | 77 | 77 | 98 | 196 | 119 | +| ├─ Average Time | 11 | 9 | 27 | 10 | 10 | +| ├─ Total Time (7 runs) | 77 | 63 | 189 | 70 | 70 | diff --git a/docs/docs/02-hooks/02-computer-vision/useObjectDetection.md b/docs/docs/02-hooks/02-computer-vision/useObjectDetection.md index 7f49e8389..2bae6a658 100644 --- a/docs/docs/02-hooks/02-computer-vision/useObjectDetection.md +++ b/docs/docs/02-hooks/02-computer-vision/useObjectDetection.md @@ -149,4 +149,4 @@ Times presented in the tables are measured as consecutive runs of the model. Ini | Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 116 | 120 | 164 | 257 | 129 | +| SSDLITE_320_MOBILENET_V3_LARGE | 71 | 74 | 257 | 115 | 109 | diff --git a/docs/docs/02-hooks/02-computer-vision/useStyleTransfer.md b/docs/docs/02-hooks/02-computer-vision/useStyleTransfer.md index 2bedba325..f5d0a423c 100644 --- a/docs/docs/02-hooks/02-computer-vision/useStyleTransfer.md +++ b/docs/docs/02-hooks/02-computer-vision/useStyleTransfer.md @@ -108,7 +108,7 @@ Times presented in the tables are measured as consecutive runs of the model. 
Ini | Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| STYLE_TRANSFER_CANDY | 1356 | 1550 | 2003 | 2578 | 2328 | -| STYLE_TRANSFER_MOSAIC | 1376 | 1456 | 1971 | 2657 | 2394 | -| STYLE_TRANSFER_UDNIE | 1389 | 1499 | 1858 | 2380 | 2124 | -| STYLE_TRANSFER_RAIN_PRINCESS | 1339 | 1514 | 2004 | 2608 | 2371 | +| STYLE_TRANSFER_CANDY | 1400 | 1485 | 4255 | 2510 | 2355 | +| STYLE_TRANSFER_MOSAIC | 1400 | 1485 | 4255 | 2510 | 2355 | +| STYLE_TRANSFER_UDNIE | 1400 | 1485 | 4255 | 2510 | 2355 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1400 | 1485 | 4255 | 2510 | 2355 | diff --git a/docs/docs/02-hooks/02-computer-vision/useVerticalOCR.md b/docs/docs/02-hooks/02-computer-vision/useVerticalOCR.md index 94e5e3054..f317d527e 100644 --- a/docs/docs/02-hooks/02-computer-vision/useVerticalOCR.md +++ b/docs/docs/02-hooks/02-computer-vision/useVerticalOCR.md @@ -334,14 +334,14 @@ Times presented in the tables are measured as consecutive runs of the model. Ini | Metric | iPhone 17 Pro
<br/> [ms] | iPhone 16 Pro <br/> [ms] | iPhone SE 3 <br/> [ms] | Samsung Galaxy S24 <br/> [ms] | OnePlus 12 <br/>
[ms] | | -------------------------------------------------------------------------- | ------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- | -| **Total Inference Time** | 3819 / 3716 | 3978 / 3841 | 4751 / 4532 | 3095 / 3286 | 2787 / 2770 | -| **Detector (CRAFT_1280_QUANTIZED)** | 1749 | 1804 | 2105 | 1216 | 1171 | +| **Total Inference Time** | 1104 | 1113 | 8840 | 2845 | 2640 | +| **Detector (CRAFT_1280_QUANTIZED)** | 501 | 507 | 4317 | 1405 | 1275 | | **Detector (CRAFT_320_QUANTIZED)** | | | | | | -| ├─ Average Time | 458 | 474 | 561 | 360 | 332 | -| ├─ Total Time (4 runs) | 1832 | 1896 | 2244 | 1440 | 1328 | +| ├─ Average Time | 125 | 121 | 1060 | 338 | 299 | +| ├─ Total Time (4 runs) | 500 | 484 | 4240 | 1352 | 1196 | | **Recognizer (CRNN_64)**
(_With Flag `independentChars == true`_) | | | | | | -| ├─ Average Time | 5 | 6 | 7 | 28 | 11 | -| ├─ Total Time (21 runs) | 105 | 126 | 147 | 588 | 231 | +| ├─ Average Time | 5 | 6 | 14 | 7 | 6 | +| ├─ Total Time (21 runs) | 105 | 126 | 294 | 147 | 126 | | **Recognizer (CRNN_512)**
(_With Flag `independentChars == false`_) | | | | | | -| ├─ Average Time | 54 | 52 | 68 | 144 | 72 | -| ├─ Total Time (4 runs) | 216 | 208 | 272 | 576 | 288 | +| ├─ Average Time | 46 | 42 | 109 | 47 | 37 | +| ├─ Total Time (4 runs) | 184 | 168 | 436 | 188 | 148 | diff --git a/docs/docs/04-benchmarks/inference-time.md b/docs/docs/04-benchmarks/inference-time.md index 89f1f9de1..dbfc2b21d 100644 --- a/docs/docs/04-benchmarks/inference-time.md +++ b/docs/docs/04-benchmarks/inference-time.md @@ -10,22 +10,22 @@ Times presented in the tables are measured as consecutive runs of the model. Ini | Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| EFFICIENTNET_V2_S | 105 | 110 | 149 | 299 | 227 | +| EFFICIENTNET_V2_S | 64 | 68 | 217 | 205 | 198 | ## Object Detection | Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| SSDLITE_320_MOBILENET_V3_LARGE | 116 | 120 | 164 | 257 | 129 | +| SSDLITE_320_MOBILENET_V3_LARGE | 71 | 74 | 257 | 115 | 109 | ## Style Transfer | Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| STYLE_TRANSFER_CANDY | 1356 | 1550 | 2003 | 2578 | 2328 | -| STYLE_TRANSFER_MOSAIC | 1376 | 1456 | 1971 | 2657 | 2394 | -| STYLE_TRANSFER_UDNIE | 1389 | 1499 | 1858 | 2380 | 2124 | -| STYLE_TRANSFER_RAIN_PRINCESS | 1339 | 1514 | 2004 | 2608 | 2371 | +| STYLE_TRANSFER_CANDY | 1400 | 1485 | 4255 | 2510 | 2355 | +| STYLE_TRANSFER_MOSAIC | 1400 | 1485 | 4255 | 2510 | 2355 | +| STYLE_TRANSFER_UDNIE | 1400 | 1485 | 4255 | 2510 | 2355 | +| STYLE_TRANSFER_RAIN_PRINCESS | 1400 | 1485 | 4255 | 2510 | 2355 | ## OCR @@ -34,10 +34,10 @@ The values below represent the averages across all runs for the benchmark image. | Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| Detector (CRAFT_800_QUANTIZED) | 669 | 649 | 825 | 541 | 474 | -| Recognizer (CRNN_512) | 48 | 47 | 60 | 91 | 72 | -| Recognizer (CRNN_256) | 22 | 22 | 29 | 51 | 30 | -| Recognizer (CRNN_128) | 11 | 11 | 14 | 28 | 17 | +| Detector (CRAFT_800_QUANTIZED) | 220 | 221 | 1740 | 521 | 492 | +| Recognizer (CRNN_512) | 45 | 38 | 110 | 40 | 38 | +| Recognizer (CRNN_256) | 21 | 18 | 54 | 20 | 19 | +| Recognizer (CRNN_128) | 11 | 9 | 27 | 10 | 10 | ## Vertical OCR @@ -46,10 +46,10 @@ The values below represent the averages across all runs for the benchmark image. 
| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| Detector (CRAFT_1280_QUANTIZED) | 1749 | 1804 | 2105 | 1216 | 1171 | -| Detector (CRAFT_320_QUANTIZED) | 458 | 474 | 561 | 360 | 332 | -| Recognizer (CRNN_512) | 54 | 52 | 68 | 144 | 72 | -| Recognizer (CRNN_64) | 5 | 6 | 7 | 28 | 11 | +| Detector (CRAFT_1280_QUANTIZED) | 501 | 507 | 4317 | 1405 | 1275 | +| Detector (CRAFT_320_QUANTIZED) | 125 | 121 | 1060 | 338 | 299 | +| Recognizer (CRNN_512) | 46 | 42 | 109 | 47 | 37 | +| Recognizer (CRNN_64) | 5 | 6 | 14 | 7 | 6 | ## LLMs @@ -70,7 +70,7 @@ Average time for encoding audio of given length over 10 runs. For `Whisper` mode | Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| Whisper-tiny (30s) | 1391 | 1372 | 1894 | 1303 | 1214 | +| Whisper-tiny (30s) | 248 | 254 | 1145 | 435 | 526 | ### Decoding @@ -78,17 +78,17 @@ Average time for decoding one token in sequence of approximately 100 tokens, wit | Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | | ------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| Whisper-tiny (30s) | 53 | 53 | 74 | 100 | 84 | +| Whisper-tiny (30s) | 23 | 25 | 121 | 92 | 115 | ## Text Embeddings -| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| ALL_MINILM_L6_V2 | 16 | 16 | 19 | 54 | 28 | -| ALL_MPNET_BASE_V2 | 115 | 116 | 144 | 145 | 95 | -| MULTI_QA_MINILM_L6_COS_V1 | 16 | 16 | 20 | 47 | 28 | -| MULTI_QA_MPNET_BASE_DOT_V1 | 112 | 119 | 144 | 146 | 96 | -| CLIP_VIT_BASE_PATCH32_TEXT | 47 | 45 | 57 | 65 | 48 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| -------------------------- | :--------------------------: | :-----------------------: | +| ALL_MINILM_L6_V2 | 7 | 21 | +| ALL_MPNET_BASE_V2 | 24 | 90 | +| MULTI_QA_MINILM_L6_COS_V1 | 7 | 19 | +| MULTI_QA_MPNET_BASE_DOT_V1 | 24 | 88 | +| CLIP_VIT_BASE_PATCH32_TEXT | 14 | 39 | :::info Benchmark times for text embeddings are highly dependent on the sentence length. The numbers above are based on a sentence of around 80 tokens. For shorter or longer sentences, inference time may vary accordingly. @@ -96,9 +96,9 @@ Benchmark times for text embeddings are highly dependent on the sentence length. 
## Image Embeddings -| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | -| --------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: | -| CLIP_VIT_BASE_PATCH32_IMAGE | 70 | 70 | 90 | 66 | 58 | +| Model | iPhone 17 Pro (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] | +| --------------------------- | :--------------------------: | :-----------------------: | +| CLIP_VIT_BASE_PATCH32_IMAGE | 18 | 55 | :::info Image embedding benchmark times are measured using 224×224 pixel images, as required by the model. All input images, whether larger or smaller, are resized to 224×224 before processing. Resizing is typically fast for small images but may be noticeably slower for very large images, which can increase total inference time. diff --git a/docs/versioned_docs/version-0.6.x/01-fundamentals/01-getting-started.md b/docs/versioned_docs/version-0.6.x/01-fundamentals/01-getting-started.md new file mode 100644 index 000000000..b5d60c35b --- /dev/null +++ b/docs/versioned_docs/version-0.6.x/01-fundamentals/01-getting-started.md @@ -0,0 +1,100 @@ +--- +title: Getting Started +slug: / +keywords: + [ + react native, + react native ai, + react native llm, + react native qwen, + react native llama, + react native executorch, + executorch, + on-device ai, + pytorch, + mobile ai, + ] +description: 'Get started with React Native ExecuTorch - a framework for running AI models on-device in your React Native applications.' +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +## What is ExecuTorch? + +ExecuTorch is a novel AI framework developed by Meta, designed to streamline deploying PyTorch models on a variety of devices, including mobile phones and microcontrollers. This framework enables exporting models into standalone binaries, allowing them to run locally without requiring API calls. ExecuTorch achieves state-of-the-art performance through optimizations and delegates such as Core ML and XNNPACK. It provides a seamless export process with robust debugging options, making it easier to resolve issues if they arise. + +## React Native ExecuTorch + +React Native ExecuTorch is our way of bringing ExecuTorch into the React Native world. Our API is built to be simple, declarative, and efficient. Plus, we’ll provide a set of pre-exported models for common use cases, so you won’t have to worry about handling exports yourself. With just a few lines of JavaScript, you’ll be able to run AI models (even LLMs 👀) right on your device—keeping user data private and saving on cloud costs. + +## Compatibility + +React Native Executorch supports only the [New React Native architecture](https://reactnative.dev/architecture/landing-page). + +If your app still runs on the old architecture, please consider upgrading to the New Architecture. + +## Installation + +Installation is pretty straightforward, just use your favorite package manager. + + + + + ``` + npm install react-native-executorch + ``` + + + + + ``` + pnpm install react-native-executorch + ``` + + + + + ``` + yarn add react-native-executorch + ``` + + + + +If you're using bare React Native (instead of a managed Expo project), you also need to install Expo Modules because the underlying implementation relies on expo-file-system. 
Since expo-file-system is an Expo package, bare React Native projects need **Expo Modules** to properly integrate and use it. The link provided (https://docs.expo.dev/bare/installing-expo-modules/) offers guidance on setting up Expo Modules in a bare React Native environment. + +If you plan on using your models via require() instead of fetching them from a url, you also need to add following lines to your `metro.config.js`: + +```json +// metro.config.js +... + defaultConfig.resolver.assetExts.push('pte') + defaultConfig.resolver.assetExts.push('bin') +... +``` + +This allows us to use binaries, such as exported models or tokenizers for LLMs. + +:::caution +When using Expo, please note that you need to use a custom development build of your app, not the standard Expo Go app. This is because we rely on native modules, which Expo Go doesn’t support. +::: + +:::info +Because we are using ExecuTorch under the hood, you won't be able to build iOS app for release with simulator selected as the target device. Make sure to test release builds on real devices. +::: + +Running the app with the library: + +```bash +yarn run expo: -d +``` + +## Good reads + +If you want to dive deeper into ExecuTorch or our previous work with the framework, we highly encourage you to check out the following resources: + +- [ExecuTorch docs](https://pytorch.org/executorch/stable/index.html) +- [Native code for iOS](https://medium.com/swmansion/bringing-native-ai-to-your-mobile-apps-with-executorch-part-i-ios-f1562a4556e8?source=user_profile_page---------0-------------250189c98ccf---------------) +- [Native code for Android](https://medium.com/swmansion/bringing-native-ai-to-your-mobile-apps-with-executorch-part-ii-android-29431b6b9f7f?source=user_profile_page---------2-------------b8e3a5cb1c63---------------) +- [Exporting to Android with XNNPACK](https://medium.com/swmansion/exporting-ai-models-on-android-with-xnnpack-and-executorch-3e70cff51c59?source=user_profile_page---------1-------------b8e3a5cb1c63---------------) diff --git a/docs/versioned_docs/version-0.6.x/01-fundamentals/02-loading-models.md b/docs/versioned_docs/version-0.6.x/01-fundamentals/02-loading-models.md new file mode 100644 index 000000000..8763d9614 --- /dev/null +++ b/docs/versioned_docs/version-0.6.x/01-fundamentals/02-loading-models.md @@ -0,0 +1,50 @@ +--- +title: Loading Models +--- + +There are three different methods available for loading model files, depending on their size and location. + +**1. Load from React Native assets folder (For Files < 512MB)** + +```typescript +useExecutorchModule({ + modelSource: require('../assets/llama3_2.pte'), +}); +``` + +**2. Load from remote URL:** + +For files larger than 512MB or when you want to keep size of the app smaller, you can load the model from a remote URL (e.g. HuggingFace). + +```typescript +useExecutorchModule({ + modelSource: 'https://.../llama3_2.pte', +}); +``` + +**3. Load from local file system:** + +If you prefer to delegate the process of obtaining and loading model and tokenizer files to the user, you can use the following method: + +```typescript +useExecutorchModule({ + modelSource: 'file:///var/mobile/.../llama3_2.pte', +}); +``` + +:::info +The downloaded files are stored in documents directory of your application. 
+::: + +## Example + +The following code snippet demonstrates how to load model and tokenizer files using `useLLM` hook: + +```typescript +import { useLLM } from 'react-native-executorch'; + +const llama = useLLM({ + modelSource: 'https://.../llama3_2.pte', + tokenizerSource: require('../assets/tokenizer.bin'), +}); +``` diff --git a/docs/versioned_docs/version-0.6.x/01-fundamentals/03-frequently-asked-questions.md b/docs/versioned_docs/version-0.6.x/01-fundamentals/03-frequently-asked-questions.md new file mode 100644 index 000000000..03914b25d --- /dev/null +++ b/docs/versioned_docs/version-0.6.x/01-fundamentals/03-frequently-asked-questions.md @@ -0,0 +1,39 @@ +--- +title: Frequently Asked Questions +--- + +This section is meant to answer some common community inquiries, especially regarding the ExecuTorch runtime or adding your own models. If you can't see an answer to your question, feel free to open up a [discussion](https://github.com/software-mansion/react-native-executorch/discussions/new/choose). + +### What models are supported? + +Each hook documentation subpage (useClassification, useLLM, etc.) contains a supported models section, which lists the models that are runnable within the library with close to no setup. For running your custom models, refer to `ExecuTorchModule` or `useExecuTorchModule`. + +### How can I run my own AI model? + +To run your own model, you need to directly access the underlying [ExecuTorch Module API](https://pytorch.org/executorch/stable/extension-module.html). We provide an experimental [React hook](../02-hooks/03-executorch-bindings/useExecutorchModule.md) along with a [TypeScript alternative](../03-typescript-api/03-executorch-bindings/ExecutorchModule.md), which serve as a way to use the aforementioned API without the need of diving into native code. In order to get a model in a format runnable by the runtime, you'll need to get your hands dirty with some ExecuTorch knowledge. For more guides on exporting models, please refer to the [ExecuTorch tutorials](https://pytorch.org/executorch/stable/tutorials/export-to-executorch-tutorial.html). Once you obtain your model in a `.pte` format, you can run it with `useExecuTorchModule` and `ExecuTorchModule`. + +### Can you do function calling with useLLM? + +If your model supports tool calling (i.e. its chat template can process tools) you can use the method explained on the [useLLM page](../02-hooks/01-natural-language-processing/useLLM.md). + +If your model doesn't support it, you can still work around it using context. For details, refer to [this comment](https://github.com/software-mansion/react-native-executorch/issues/173#issuecomment-2775082278). + +### Can I use React Native ExecuTorch in bare React Native apps? + +To use the library, you need to install Expo Modules first. For a setup guide, refer to [this tutorial](https://docs.expo.dev/bare/installing-expo-modules/). This is because we use Expo File System under the hood to download and manage the model binaries. + +### Do you support the old architecture? + +The old architecture is not supported and we're currently not planning to add support. + +### Can I run GGUF models using the library? + +No, as of now ExecuTorch runtime doesn't provide a reliable way to use GGUF models, hence it is not possible. + +### Are the models leveraging GPU acceleration? + +While it is possible to run some models using Core ML on iOS, which is a backend that utilizes CPU, GPU and ANE, we currently don't have many models exported to Core ML. 
For Android, the current state of GPU acceleration is pretty limited. As of now, there are attempts of running the models using a Vulkan backend. However the operator support is very limited meaning that the resulting performance is often inferior to XNNPACK. Hence, most of the models use XNNPACK, which is a highly optimized and mature CPU backend that runs on both Android and iOS. + +### Does this library support XNNPACK and Core ML? + +Yes, all of the backends are linked, therefore the only thing that needs to be done on your end is to export the model with the backend that you're interested in using. diff --git a/docs/versioned_docs/version-0.6.x/01-fundamentals/_category_.json b/docs/versioned_docs/version-0.6.x/01-fundamentals/_category_.json new file mode 100644 index 000000000..e3fddcbeb --- /dev/null +++ b/docs/versioned_docs/version-0.6.x/01-fundamentals/_category_.json @@ -0,0 +1,6 @@ +{ + "label": "Fundamentals", + "link": { + "type": "generated-index" + } +} diff --git a/docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/_category_.json b/docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/_category_.json new file mode 100644 index 000000000..0314f315d --- /dev/null +++ b/docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/_category_.json @@ -0,0 +1,6 @@ +{ + "label": "Natural Language Processing", + "link": { + "type": "generated-index" + } +} diff --git a/docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/useLLM.md b/docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/useLLM.md new file mode 100644 index 000000000..3f072f93c --- /dev/null +++ b/docs/versioned_docs/version-0.6.x/02-hooks/01-natural-language-processing/useLLM.md @@ -0,0 +1,537 @@ +--- +title: useLLM +keywords: + [ + react native, + react native ai, + react native llm, + react native qwen, + react native llama, + react native executorch, + executorch, + pytorch, + on-device ai, + mobile ai, + llama 3, + qwen, + text generation, + tool calling, + function calling, + ] +description: "Learn how to use LLMs in your React Native applications with React Native ExecuTorch's useLLM hook." +--- + +React Native ExecuTorch supports a variety of LLMs (checkout our [HuggingFace repository](https://huggingface.co/software-mansion) for model already converted to ExecuTorch format) including Llama 3.2. Before getting started, you’ll need to obtain the .pte binary—a serialized model, the tokenizer and tokenizer config JSON files. There are various ways to accomplish this: + +- For your convenience, it's best if you use models exported by us, you can get them from our [HuggingFace repository](https://huggingface.co/software-mansion). You can also use [constants](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/constants/modelUrls.ts) shipped with our library. +- Follow the official [tutorial](https://github.com/pytorch/executorch/blob/release/0.7/examples/demo-apps/android/LlamaDemo/docs/delegates/xnnpack_README.md) made by ExecuTorch team to build the model and tokenizer yourself. + +:::danger +Lower-end devices might not be able to fit LLMs into memory. We recommend using quantized models to reduce the memory footprint. +::: + +## Initializing + +In order to load a model into the app, you need to run the following code: + +```typescript +import { useLLM, LLAMA3_2_1B } from 'react-native-executorch'; + +const llm = useLLM({ model: LLAMA3_2_1B }); +``` + +
+ +The code snippet above fetches the model from the specified URL, loads it into memory, and returns an object with various functions and properties for controlling the model. You can monitor the loading progress by checking the `llm.downloadProgress` and `llm.isReady` property, and if anything goes wrong, the `llm.error` property will contain the error message. + +### Arguments + +**`model`** - Object containing the model source, tokenizer source, and tokenizer config source. + +- **`modelSource`** - `ResourceSource` that specifies the location of the model binary. + +- **`tokenizerSource`** - `ResourceSource` pointing to the JSON file which contains the tokenizer. + +- **`tokenizerConfigSource`** - `ResourceSource` pointing to the JSON file which contains the tokenizer config. + +**`preventLoad?`** - Boolean that can prevent automatic model loading (and downloading the data if you load it for the first time) after running the hook. + +For more information on loading resources, take a look at [loading models](../../01-fundamentals/02-loading-models.md) page. + +### Returns + +| Field | Type | Description | +| ------------------------ | -------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | +| `generate()` | `(messages: Message[], tools?: LLMTool[]) => Promise` | Runs model to complete chat passed in `messages` argument. It doesn't manage conversation context. | +| `interrupt()` | `() => void` | Function to interrupt the current inference. | +| `response` | `string` | State of the generated response. This field is updated with each token generated by the model. | +| `token` | `string` | The most recently generated token. | +| `isReady` | `boolean` | Indicates whether the model is ready. | +| `isGenerating` | `boolean` | Indicates whether the model is currently generating a response. | +| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. | +| `error` | string | null | Contains the error message if the model failed to load. | +| `configure` | `({chatConfig?: Partial, toolsConfig?: ToolsConfig, generationConfig?: GenerationConfig}) => void` | Configures chat and tool calling. See more details in [configuring the model](#configuring-the-model). | +| `sendMessage` | `(message: string) => Promise` | Function to add user message to conversation. After model responds, `messageHistory` will be updated with both user message and model response. | +| `deleteMessage` | `(index: number) => void` | Deletes all messages starting with message on `index` position. After deletion `messageHistory` will be updated. | +| `messageHistory` | `Message[]` | History containing all messages in conversation. This field is updated after model responds to `sendMessage`. | +| `getGeneratedTokenCount` | `() => number` | Returns the number of tokens generated in the last response. | + +
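+Below is a minimal sketch of how these fields can be wired together in a screen component. It gates the UI on `isReady`, surfaces `downloadProgress` and `error` while the model is being fetched, and only calls `generate` once loading has finished. The component name and UI markup are purely illustrative, and the `Message` type is assumed to be exported by the package, as in the generation examples further below.
+
+```tsx
+import React from 'react';
+import { Button, Text, View } from 'react-native';
+import { useLLM, LLAMA3_2_1B } from 'react-native-executorch';
+import type { Message } from 'react-native-executorch';
+
+export default function LlmStatusExample() {
+  const llm = useLLM({ model: LLAMA3_2_1B });
+
+  // Surface load errors first.
+  if (llm.error) {
+    return <Text>Model failed to load: {llm.error}</Text>;
+  }
+
+  // Report download progress (a value between 0 and 1) until the model is ready.
+  if (!llm.isReady) {
+    return <Text>Loading model: {Math.round(llm.downloadProgress * 100)}%</Text>;
+  }
+
+  const askQuestion = () => {
+    const chat: Message[] = [
+      { role: 'system', content: 'You are a helpful assistant' },
+      { role: 'user', content: 'What is the meaning of life?' },
+    ];
+    // `response` is updated token by token while generation is running.
+    llm.generate(chat);
+  };
+
+  return (
+    <View>
+      <Button title="Ask" onPress={askQuestion} disabled={llm.isGenerating} />
+      <Text>{llm.response}</Text>
+    </View>
+  );
+}
+```
+
+Because `response` streams in token by token, rendering it directly in a `Text` element keeps the UI updated as generation progresses.
+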
+Type definitions + +```typescript +const useLLM: ({ + model, + preventLoad, +}: { + model: { + modelSource: ResourceSource; + tokenizerSource: ResourceSource; + tokenizerConfigSource: ResourceSource; + }; + preventLoad?: boolean; +}) => LLMType; + +interface LLMType { + messageHistory: Message[]; + response: string; + token: string; + isReady: boolean; + isGenerating: boolean; + downloadProgress: number; + error: string | null; + configure: ({ + chatConfig, + toolsConfig, + generationConfig, + }: { + chatConfig?: Partial; + toolsConfig?: ToolsConfig; + generationConfig?: GenerationConfig; + }) => void; + getGeneratedTokenCount: () => number; + generate: (messages: Message[], tools?: LLMTool[]) => Promise; + sendMessage: (message: string) => Promise; + deleteMessage: (index: number) => void; + interrupt: () => void; +} + +type ResourceSource = string | number | object; + +type MessageRole = 'user' | 'assistant' | 'system'; + +interface Message { + role: MessageRole; + content: string; +} +interface ChatConfig { + initialMessageHistory: Message[]; + contextWindowLength: number; + systemPrompt: string; +} + +interface GenerationConfig { + temperature?: number; + topp?: number; + outputTokenBatchSize?: number; + batchTimeInterval?: number; +} + +// tool calling +interface ToolsConfig { + tools: LLMTool[]; + executeToolCallback: (call: ToolCall) => Promise; + displayToolCalls?: boolean; +} + +interface ToolCall { + toolName: string; + arguments: Object; +} + +type LLMTool = Object; +``` + +
+ +## Functional vs managed + +You can use functions returned from this hooks in two manners: + +1. Functional/pure - we will not keep any state for you. You'll need to keep conversation history and handle function calling yourself. Use `generate` (and rarely `forward`) and `response`. Note that you don't need to run `configure` to use those. Furthermore, `chatConfig` and `toolsConfig` will not have any effect on those functions. + +2. Managed/stateful - we will manage conversation state. Tool calls will be parsed and called automatically after passing appropriate callbacks. See more at [managed LLM chat](#managed-llm-chat). + +## Functional way + +### Simple generation + +To perform chat completion you can use the `generate` function. There is no return value. Instead, the `response` value is updated with each token. + +```tsx +const llm = useLLM({ model: LLAMA3_2_1B }); + +const handleGenerate = () => { + const chat: Message[] = [ + { role: 'system', content: 'You are a helpful assistant' }, + { role: 'user', content: 'Hi!' }, + { role: 'assistant', content: 'Hi!, how can I help you?' }, + { role: 'user', content: 'What is the meaning of life?' }, + ]; + + // Chat completion + llm.generate(chat); +}; + +return ( + +