From dea2ab85e9774480e30f03f9d0208765b54efae4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Tue, 11 Mar 2025 17:48:42 +0100 Subject: [PATCH 1/7] [E&A] Drafts initial conceptual docs for EIS. --- explore-analyze/elastic-inference/eis.md | 57 +++++++++++++++++++++++- 1 file changed, 55 insertions(+), 2 deletions(-) diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md index 9a2823744..96c884f6a 100644 --- a/explore-analyze/elastic-inference/eis.md +++ b/explore-analyze/elastic-inference/eis.md @@ -5,6 +5,59 @@ applies_to: navigation_title: Elastic Inference Service (EIS) --- -# Elastic {{infer-cap}} Service +# Elastic {{infer-cap}} Service [elastic-inference-service-eis] -This is the documentation of the Elastic Inference Service. +The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your cluster. +With EIS, you don't need to manage the infrastructure and resources required for large language models (LLMs) by adding, configuring, and scaling {{ml}} nodes. +Instead, you can use {{ml}} models in high-throughput, low-latency scenarios independently of your {{es}} infrastructure. + +Currently, you can perform chat completion tasks through EIS using the {{infer}} API. + +% TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Coming soon) % + +## Default EIS endpoints [default-eis-inference-endpoints] + +Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier to use chat completion via the {{infer}} API: + +* `rainbow-sprinkles-elastic`: uses Anthropic's Claude Sonnet 3.5 model for chat completion {{infer}} tasks. + +::::{note} + +* The model appears as `Elastic LLM` in the AI Assistant, Attack Discovery UI, preconfigured connectors list, and the Search Playground. 
+* To fine-tune prompts sent to `rainbow-sprinkles-elastic`, optimize them for Claude Sonnet 3.5. + +:::: + +% TO DO: Link to the AI assistant documentation in the different solutions and possibly connector docs. % + +## Regions [eis-regions] + +EIS is currently running on AWS and in the following regions: + +* `us-east-1` +* `us-west-2` + +For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/) and the [supported cross-region {{infer}} profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html) documentation. + +## LLM hosts [llm-hosts] + +The LLM used with EIS is hosted by [Amazon Bedrock](https://aws.amazon.com/bedrock/). + +## Examples + +The following example demostrates how to perform a `chat_completion` task through EIS by using the `.rainbow-sprinkles-elastic` default {{infer}} endpoint. + +```json +POST /_inference/chat_completion/.rainbow-sprinkles-elastic/_stream +{ + "messages": [ + { + "role": "user", + "content": "Say yes if it works." + } + ], + "temperature": 0.7, + "max_completion_tokens": 300 + } +} +``` From 5c0499f94ce79ea1dc1b1f98de975ccb5b6f9eb7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Tue, 11 Mar 2025 17:55:29 +0100 Subject: [PATCH 2/7] [E&A] Small edits. 
--- .../inference-api/elastic-inference-service-eis.md | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/explore-analyze/elastic-inference/inference-api/elastic-inference-service-eis.md b/explore-analyze/elastic-inference/inference-api/elastic-inference-service-eis.md index d6127e53f..967c09d93 100644 --- a/explore-analyze/elastic-inference/inference-api/elastic-inference-service-eis.md +++ b/explore-analyze/elastic-inference/inference-api/elastic-inference-service-eis.md @@ -15,12 +15,10 @@ Refer to the [{{infer-cap}} APIs](https://www.elastic.co/docs/api/doc/elasticsea Creates an {{infer}} endpoint to perform an {{infer}} task with the `elastic` service. - ## {{api-request-title}} [infer-service-elastic-api-request] `PUT /_inference//` - ## {{api-path-parms-title}} [infer-service-elastic-api-path-params] `` @@ -34,7 +32,6 @@ Creates an {{infer}} endpoint to perform an {{infer}} task with the `elastic` se * `chat_completion`, * `sparse_embedding`. - ::::{note} The `chat_completion` task type only supports streaming and only through the `_stream` API. @@ -42,8 +39,6 @@ For more information on how to use the `chat_completion` task type, please refer :::: - - ## {{api-request-body-title}} [infer-service-elastic-api-request-body] `max_chunk_size` @@ -64,7 +59,6 @@ For more information on how to use the `chat_completion` task type, please refer `service_settings` : (Required, object) Settings used to install the {{infer}} model. - `model_id` : (Required, string) The name of the model to use for the {{infer}} task. @@ -77,9 +71,7 @@ For more information on how to use the `chat_completion` task type, please refer } ``` - - -## Elastic {{infer-cap}} Service example [inference-example-elastic] +## Elastic {{infer-cap}} Service example [inference-example-elastic] The following example shows how to create an {{infer}} endpoint called `elser-model-eis` to perform a `text_embedding` task type. 
@@ -104,4 +96,3 @@ PUT /_inference/chat_completion/chat-completion-endpoint } } ``` - From 92410c838239e70276c5c9d7793841c61852f7a5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Wed, 12 Mar 2025 14:53:36 +0100 Subject: [PATCH 3/7] Apply suggestions from code review Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --- explore-analyze/elastic-inference/eis.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md index 96c884f6a..ec0f0ef55 100644 --- a/explore-analyze/elastic-inference/eis.md +++ b/explore-analyze/elastic-inference/eis.md @@ -32,7 +32,7 @@ Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier t ## Regions [eis-regions] -EIS is currently running on AWS and in the following regions: +EIS runs on AWS in the following regions: * `us-east-1` * `us-west-2` @@ -45,7 +45,7 @@ The LLM used with EIS is hosted by [Amazon Bedrock](https://aws.amazon.com/bedro ## Examples -The following example demostrates how to perform a `chat_completion` task through EIS by using the `.rainbow-sprinkles-elastic` default {{infer}} endpoint. +The following example demonstrates how to perform a `chat_completion` task through EIS by using the `.rainbow-sprinkles-elastic` default {{infer}} endpoint. ```json POST /_inference/chat_completion/.rainbow-sprinkles-elastic/_stream From 705b0cf371aef3f0ac3467b9eaafd45423094b55 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Thu, 13 Mar 2025 11:57:51 +0100 Subject: [PATCH 4/7] Addresses feedback. 
--- explore-analyze/elastic-inference/eis.md | 39 +++++++++++++++++++----- 1 file changed, 31 insertions(+), 8 deletions(-) diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md index ec0f0ef55..e8e05abf6 100644 --- a/explore-analyze/elastic-inference/eis.md +++ b/explore-analyze/elastic-inference/eis.md @@ -11,7 +11,18 @@ The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered searc With EIS, you don't need to manage the infrastructure and resources required for large language models (LLMs) by adding, configuring, and scaling {{ml}} nodes. Instead, you can use {{ml}} models in high-throughput, low-latency scenarios independently of your {{es}} infrastructure. -Currently, you can perform chat completion tasks through EIS using the {{infer}} API. +% TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Coming soon) % + +## Available task types + +EIS offers the following {{infer}} task types to perform: + +* Chat completion + +## How to use EIS [using-eis] + +Your Elastic deployment comes with default endpoints for EIS that you can use performing {{infer}} tasks. +You can either do it by calling the {{infer}} API or using the default `Elastic LLM` model in the AI Assistant, Attack Discovery UI, and Search Playground. % TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Coming soon) % @@ -19,12 +30,11 @@ Currently, you can perform chat completion tasks through EIS using the {{infer}} Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier to use chat completion via the {{infer}} API: -* `rainbow-sprinkles-elastic`: uses Anthropic's Claude Sonnet 3.5 model for chat completion {{infer}} tasks. +* `rainbow-sprinkles-elastic` ::::{note} * The model appears as `Elastic LLM` in the AI Assistant, Attack Discovery UI, preconfigured connectors list, and the Search Playground. 
-* To fine-tune prompts sent to `rainbow-sprinkles-elastic`, optimize them for Claude Sonnet 3.5. :::: @@ -39,10 +49,6 @@ EIS runs on AWS in the following regions: For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/) and the [supported cross-region {{infer}} profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html) documentation. -## LLM hosts [llm-hosts] - -The LLM used with EIS is hosted by [Amazon Bedrock](https://aws.amazon.com/bedrock/). - ## Examples The following example demonstrates how to perform a `chat_completion` task through EIS by using the `.rainbow-sprinkles-elastic` default {{infer}} endpoint. @@ -58,6 +64,23 @@ POST /_inference/chat_completion/.rainbow-sprinkles-elastic/_stream ], "temperature": 0.7, "max_completion_tokens": 300 - } } ``` + +The request returns the following response: + +```json +(...) +{ + "role" : "assistant", + "content": "Yes", + "model" : "rainbow-sprinkles", + "object" : "chat.completion.chunk", + "usage" : { + "completion_tokens" : 4, + "prompt_tokens" : 13, + "total_tokens" : 17 + } +} +(...) 
+``` From 6937860416d9584ffa69f41866ec776082c3025e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Mon, 17 Mar 2025 09:43:01 +0100 Subject: [PATCH 5/7] Apply suggestions from code review Co-authored-by: Max Jakob --- explore-analyze/elastic-inference/eis.md | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md index e8e05abf6..cbf5f2784 100644 --- a/explore-analyze/elastic-inference/eis.md +++ b/explore-analyze/elastic-inference/eis.md @@ -8,8 +8,8 @@ navigation_title: Elastic Inference Service (EIS) # Elastic {{infer-cap}} Service [elastic-inference-service-eis] The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your cluster. -With EIS, you don't need to manage the infrastructure and resources required for large language models (LLMs) by adding, configuring, and scaling {{ml}} nodes. -Instead, you can use {{ml}} models in high-throughput, low-latency scenarios independently of your {{es}} infrastructure. +With EIS, you don't need to manage the infrastructure and resources required for {{ml}} {{infer}} by adding, configuring, and scaling {{ml}} nodes. +Instead, you can use {{ml}} models for ingest, search and chat independently of your {{es}} infrastructure. % TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Comming soon) % @@ -17,12 +17,11 @@ Instead, you can use {{ml}} models in high-throughput, low-latency scenarios ind EIS offers the following {{infer}} task types to perform: -* Chat completion +* `chat_completion` -## How to use EIS [using-eis] +## AI features powered by EIS [ai-features-powered-by-eis] -Your Elastic deployment comes with default endpoints for EIS that you can use performing {{infer}} tasks. 
-You can either do it by calling the {{infer}} API or using the default `Elastic LLM` model in the AI Assistant, Attack Discovery UI, and Search Playground. +Your Elastic deployment or project comes with a default `Elastic LLM` connector. This connector is used in the AI Assistant, Attack Discovery, Automatic Import, and Search Playground. % TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Coming soon) % @@ -30,11 +29,11 @@ Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier t Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier to use chat completion via the {{infer}} API: -* `rainbow-sprinkles-elastic` +* `.rainbow-sprinkles-elastic` ::::{note} -* The model appears as `Elastic LLM` in the AI Assistant, Attack Discovery UI, preconfigured connectors list, and the Search Playground. +* This endpoint is used by the `Elastic LLM` AI connector, which in turn powers the AI Assistant, Attack Discovery, Automatic Import, and the Search Playground. :::: @@ -42,12 +41,12 @@ Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier t ## Regions [eis-regions] -EIS runs on AWS in the following regions: +All EIS requests are handled by one of these AWS regions: * `us-east-1` * `us-west-2` -For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/) and the [supported cross-region {{infer}} profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html) documentation. +For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/). ## Examples @@ -67,7 +66,7 @@ POST /_inference/chat_completion/.rainbow-sprinkles-elastic/_stream } ``` -The request returns the following response: +The request returns the following response as a stream: ```json (...) 
@@ -75,7 +74,6 @@ The request returns the following response: "role" : "assistant", "content": "Yes", "model" : "rainbow-sprinkles", - "object" : "chat.completion.chunk", "usage" : { "completion_tokens" : 4, "prompt_tokens" : 13, From df74a391a184c2173549b2c61dce09430d468851 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Mon, 17 Mar 2025 12:30:20 +0100 Subject: [PATCH 6/7] Restructures page. --- explore-analyze/elastic-inference/eis.md | 70 +++++++++++++++--------- 1 file changed, 44 insertions(+), 26 deletions(-) diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md index cbf5f2784..2654043cc 100644 --- a/explore-analyze/elastic-inference/eis.md +++ b/explore-analyze/elastic-inference/eis.md @@ -13,18 +13,18 @@ Instead, you can use {{ml}} models for ingest, search and chat independently of % TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Comming soon) % -## Available task types - -EIS offers the following {{infer}} task types to perform: - -* `chat_completion` - ## AI features powered by EIS [ai-features-powered-by-eis] Your Elastic deployment or project comes with a default `Elastic LLM` connector. This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground. % TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Comming soon) % +## Available task types + +EIS offers the following {{infer}} task types to perform: + +* `chat_completion` + ## Default EIS endpoints [default-eis-inference-endpoints] Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier to use chat completion via the {{infer}} API: @@ -39,16 +39,7 @@ Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier t % TO DO: Link to the AI assistant documentation in the different solutions and possibly connector docs. 
% -## Regions [eis-regions] - -All EIS requests are handled by one of these AWS regions: - -* `us-east-1` -* `us-west-2` - -For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/). - -## Examples +### Examples The following example demonstrates how to perform a `chat_completion` task through EIS by using the `.rainbow-sprinkles-elastic` default {{infer}} endpoint. @@ -69,16 +60,43 @@ POST /_inference/chat_completion/.rainbow-sprinkles-elastic/_stream The request returns the following response as a stream: ```json -(...) -{ - "role" : "assistant", - "content": "Yes", +event: message +data: { + "id" : "unified-45ecde2b-6293-4fd6-a195-4252de76ee63", + "choices" : [ + { + "delta" : { + "role" : "assistant" + }, + "index" : 0 + } + ], "model" : "rainbow-sprinkles", - "usage" : { - "completion_tokens" : 4, - "prompt_tokens" : 13, - "total_tokens" : 17 - } + "object" : "chat.completion.chunk" +} + + +event: message +data: { + "id" : "unified-45ecde2b-6293-4fd6-a195-4252de76ee63", + "choices" : [ + { + "delta" : { + "content" : "Yes" + }, + "index" : 0 + } + ], + "model" : "rainbow-sprinkles", + "object" : "chat.completion.chunk" } -(...) ``` + +## Regions [eis-regions] + +All EIS requests are handled by one of these AWS regions: + +* `us-east-1` +* `us-west-2` + +For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/). From d56e228264e0f0607ff084990af3ee9631154b68 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Mon, 24 Mar 2025 14:04:58 +0100 Subject: [PATCH 7/7] Addresses feedback. 
--- explore-analyze/elastic-inference/eis.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md index 2654043cc..6c88d9ab9 100644 --- a/explore-analyze/elastic-inference/eis.md +++ b/explore-analyze/elastic-inference/eis.md @@ -99,4 +99,7 @@ All EIS requests are handled by one of these AWS regions: * `us-east-1` * `us-west-2` +However, projects and deployments can use the Elastic LLM regardless of their cloud provider or region. +The request routing does not restrict the location of your deployments. + For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/).
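A reviewer's note on the streaming example added in PATCH 6: the `chat_completion` response arrives as server-sent events, so a client has to reassemble the answer from the `delta.content` fields of successive `data:` chunks. The sketch below is illustrative only, based solely on the chunk shape shown in the patched docs; `collect_chat_content` and the sample `stream` list are hypothetical names, not part of the documented API.

```python
import json

def collect_chat_content(sse_lines):
    """Join the assistant's content deltas from a chat_completion
    event stream whose data lines look like the chunks in the docs."""
    parts = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip "event: message" lines and blank keep-alives
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # OpenAI-style terminator, if the stream sends one
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content is not None:
                parts.append(content)
    return "".join(parts)

# Chunks shaped like the documented response (first delta carries only the
# role, the second carries the actual content):
stream = [
    "event: message",
    'data: {"id": "unified-45ec", "choices": [{"delta": {"role": "assistant"}, '
    '"index": 0}], "model": "rainbow-sprinkles", "object": "chat.completion.chunk"}',
    "event: message",
    'data: {"id": "unified-45ec", "choices": [{"delta": {"content": "Yes"}, '
    '"index": 0}], "model": "rainbow-sprinkles", "object": "chat.completion.chunk"}',
]
print(collect_chat_content(stream))  # -> Yes
```

In a real client the lines would come from iterating over the HTTP response body of the `POST /_inference/chat_completion/.rainbow-sprinkles-elastic/_stream` request rather than from a list.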