
Clarify hardware profile and instance configuration related docs for ECH #2039


Open
wants to merge 18 commits into
base: main


kunisen
Contributor

@kunisen kunisen commented Jul 4, 2025

Background

Several ECH docs related to hardware profiles and instance configurations are not clear enough and have caused confusion for users and support.
After syncing with @yuvielastic @maggieghamry @stefnestor and @jakommo, we decided to make some doc updates to clarify.

Details in this internal ticket (https://github.com/elastic/support-tech-lead/issues/1586). Below is a quick recap of what we would like to change.

For reviewers

Recap of what to change

:: [1]

https://www.elastic.co/docs/deploy-manage/deploy/elastic-cloud/ec-change-hardware-profile

  • Change title to Manage hardware profiles
  • Add that versions earlier than 7.10 can't do hardware profile migration

:: [2]

https://www.elastic.co/docs/deploy-manage/deploy/elastic-cloud/change-hardware

  • Update the doc from "Change hardware" to "Customize instance configuration", and apply this change throughout the page. It's technically more accurate to say "instance configuration" rather than just "hardware".
  • Add a note that hardware profiles are also referred to as deployment templates (DTs) in ECH, to make the connection between the two terms clearer.
  • Add an "In addition" paragraph explaining some details that have confused users and support so far. We have already confirmed these with Yuvi from a technical perspective.

:: [3]

https://www.elastic.co/docs/deploy-manage/production-guidance/optimize-performance/search-speed

  • The "hardware profile" reference is incorrect. It should point to the actual hardware profile doc mentioned in [1].

:: [4]

https://www.elastic.co/docs/deploy-manage/production-guidance/optimize-performance/indexing-speed

  • The "hardware profile" reference is incorrect. It should point to the actual hardware profile doc mentioned in [1].

:: [5]

Move the "customize IC" doc (https://www.elastic.co/docs/deploy-manage/deploy/elastic-cloud/change-hardware) under the "manage hardware profile" doc (https://www.elastic.co/docs/deploy-manage/deploy/elastic-cloud/ec-change-hardware-profile) to make the structure more logical.



cc @maggieghamry @stefnestor @jakommo

@kunisen kunisen self-assigned this Jul 4, 2025
@kunisen kunisen requested a review from a team as a code owner July 4, 2025 13:35
@kunisen kunisen added the documentation (Improvements or additions to documentation) and supportability (ability enable self-service or support of product) labels Jul 4, 2025
@yuvielastic

Thanks Kuni for the updates.

I wanted to suggest a few changes to this page, under the List of Hardware profiles section. The changes are listed below; can we please incorporate them as well?

Existing under CPU optimized (ARM): This profile is similar to CPU optimized profile but is powered by AWS Graviton2 instances.

Modification: This profile is similar to CPU optimized profile but powered by ARM instances. Currently, we offer ARM instances on AWS.

Existing under General purpose (ARM): This profile is similar to the General purpose profile but is powered by AWS Graviton2 instances.

Modification: This profile is similar to General purpose profile but powered by ARM instances. Currently, we offer ARM instances on AWS.

Existing under Vector search optimized (ARM): This profile is suited for Vector search, Generative AI and Semantic search optimized workloads.

Modification: This profile is suited for Vector search, Generative AI and Semantic search optimized workloads powered by ARM instances. Currently, we offer ARM instances on AWS.

To add: Vector search optimized
This profile is suited for Vector search, Generative AI and Semantic search optimized workloads. You can find the exact storage, memory, and vCPU allotment on the hardware details page for each cloud provider.

Ideal use case

Optimized for applications that leverage Vector Search and/or Generative AI. Also the optimal choice for utilizing ELSER for semantic search applications. Broadly suitable for all semantic search, text embedding, image search, and other Vector Search use cases.

Rest LGTM

@kunisen
Contributor Author

kunisen commented Jul 7, 2025

Thank you @yuvielastic I made the change accordingly:
2749011

Could you please confirm whether it looks good and, if so, approve explicitly? 🙏


FWIW, the diff for the vector search hardware profile may be a bit tricky to understand:

  • We originally had Vector search optimized (ARM), whose description was not specific to ARM but a generic, non-ARM one.
  • We added Vector search optimized and rewrote the previous Vector search optimized (ARM) to make it specific to ARM.

From the GitHub diff perspective, it looks as if we newly added the ARM section, but it's actually the two-step update explained above.


@yuvielastic yuvielastic left a comment


LGTM

@@ -24,6 +26,11 @@ The {{ecloud}} console indicates when a new version of a hardware profile is ava

## Change the hardware profile using the {{ecloud}} console [ec_change_the_hardware_profile_using_the_elastic_cloud_console]

::::{note}
Deployment with Elastic stack version prior to 7.10 does not support hardware profile change {{ecloud}} console and API. If you want to make change on hardware profile, upgrading to version 7.10 and onwards is required.


Just for my understanding: What's causing this behavior? Is this something that's fixable easily?

Contributor Author

@kunisen kunisen Jul 7, 2025


Just for my understanding: What's causing this behavior? Is this something that's fixable easily?

@yuvielastic it's a product limitation: we only support hardware profile migration based on node_roles. Cloud deployments previously used node types, and node_roles were only introduced in Elasticsearch 7.10.

You can review this KB (external link) for more details: https://support.elastic.co/knowledge/2040b616 (you can also view internal link and I can share it with you internally if you need more history context)

Also, @gigerdo and I had a sync recently and he suggested that we should use a more self-explanatory message - which is being handled in CP-11182 (internal ticket).
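
To illustrate the version constraint discussed above, here is a minimal sketch (not part of the PR) of a check for whether a deployment version is at or above 7.10. The function names are hypothetical, and the 7.10 threshold is taken from this thread rather than from any API. Versions are compared as integer tuples because a plain string comparison would wrongly rank "7.9" above "7.10".

```python
def parse_version(version: str) -> tuple[int, int]:
    """Return (major, minor) from a version string such as '7.10.2'.

    Expects at least 'major.minor'; any patch component is ignored.
    """
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def supports_hardware_profile_migration(version: str) -> bool:
    """Hypothetical helper: True if the version is 7.10 or later.

    Assumption from the discussion: node_roles were introduced in
    Elasticsearch 7.10, and hardware profile migration relies on them.
    """
    # Tuple comparison is numeric per component, so (7, 9) < (7, 10).
    return parse_version(version) >= (7, 10)
```

For example, `supports_hardware_profile_migration("7.9.3")` evaluates to `False`, while `"7.10.0"` and `"8.15.1"` evaluate to `True`.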

Contributor Author


IMHO it's not an easy fix, and we probably want to encourage customers to upgrade to 7.10+ (preferably the latest version they can upgrade to) instead of fixing this and supporting HW migration on old versions. Fixing it may cause other issues, as we didn't handle the node type to node roles migration very smoothly in earlier versions, and it might be hard to cover all the test cases when we add additional logic to the hardware migration APIs. The most difficult case I can foresee is a plan failing midway and leaving a mix of node roles and node types, which would be technically hard to tackle.


Thanks Kuni for clarifying and agreed that we should encourage customers to upgrade to 7.10+ especially as that version is super old. Also, good to know that we will be updating the message to be more self-explanatory (separately in CP-11182).

Contributor

@claudia-correia claudia-correia left a comment


LGTM — Thanks for making these changes, I believe the docs are more clear and consistent now! 🙇🏻‍♀️
Just left a small comment about how to use the API in order to get deprecated ICs/DTs.

@kunisen
Contributor Author

kunisen commented Jul 7, 2025

Thank you Claudia for the very quick and helpful review! Much appreciated! 🙇

Contributor

@claudia-correia claudia-correia left a comment


LGTM 🚀 Thanks @kunisen for addressing my comments, the separation between using public doc vs. using the API makes total sense!
(I just added a small rephrasing suggestion below, but feel free to ignore if you don't find it useful 🙂)

@kunisen
Contributor Author

kunisen commented Jul 9, 2025

Thanks Claudia again, great suggestions as always!
Glad to have your approval from CP dev perspective! 🙇

@kunisen
Contributor Author

kunisen commented Jul 9, 2025

@eedugon may I trouble you to review from docs team's perspective and then we could consider the merge please? 🙏 thanks!

Contributor

@eedugon eedugon left a comment


I've made some suggestions to the doc.

One important thing to highlight is that we don't have a proper explanation and introduction about hardware profiles and instance configurations, together, explaining what they are and the relation between them.

But if we don't want to sort that out in this PR I could create a new issue to improve that part at a later stage.

At the moment we offer some links to Hardware Profiles and Instance Configurations, but none of the destinations gives a clear view of what they are.

I think a basic explanation would be more than welcome: hardware profiles are representations of the entire deployment, including all components and tiers, while instance configurations are the actual specification of the virtual hardware where the instances run. It should also explain (if accurate) that a hardware profile includes multiple instance configurations, and that an instance configuration can appear in multiple hardware profiles.

Anyway we could address that in a different PR if you consider better.
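
As a rough illustration of the relationship described in this comment (and subject to the same "if accurate" caveat), the sketch below models hardware profiles as mappings from deployment components to instance configurations, then builds the reverse index. All profile names, component names, and instance configuration IDs are made up for the example and do not correspond to real ECH identifiers.

```python
from collections import defaultdict

# Hypothetical data: each hardware profile maps deployment components
# (tiers or stateless resources) to an instance configuration ID.
hardware_profiles = {
    "profile-a": {"hot": "ic-1", "warm": "ic-2", "kibana": "ic-9"},
    "profile-b": {"hot": "ic-3", "warm": "ic-2", "kibana": "ic-9"},
}


def profiles_by_instance_configuration(profiles):
    """Return a dict of instance configuration ID -> set of profile names."""
    index = defaultdict(set)
    for profile_name, components in profiles.items():
        for instance_configuration in components.values():
            index[instance_configuration].add(profile_name)
    return dict(index)
```

In this made-up data, each profile includes several instance configurations, and `ic-2` and `ic-9` each appear in both profiles, which is the many-to-many relationship the comment suggests documenting.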

@kunisen
Contributor Author

kunisen commented Jul 11, 2025

@eedugon

Anyway we could address that in a different PR if you consider better.

Thank you.
Let's address this in a different PR, since it's a separate topic that stems from this one.

I agree with all your points, but I think this is a bigger topic, and we will need to engage with both the PM and dev teams to get further input on what we expect from those pages.
From a support perspective, I am afraid it's not on us (and we are not authoritative enough) to write such a doc page.

That said, could you please check whether there is anything else I should tweak in the wording, or whether we are ready to merge this? 🙏 Thank you again!

Contributor

@eedugon eedugon left a comment


LGTM, just one minor change proposed (typo) before merging.
Thanks a lot for this!


The virtual hardware on which {{stack}} deployments run is defined by instance configurations. To learn more about what an instance configuration is, refer to [Instance configurations](cloud://reference/cloud-hosted/hardware.md#ec-getting-started-configurations).
This document explains how to modify the instance configurations used by specific components of your deployment without changing the overall hardware profile assigned to the deployment. This advanced configuration scenario is useful when specific situations in which we may need to migrate an Elasticsearch tier or stateless resource to a different hardware type.
Contributor


Suggested change
This document explains how to modify the instance configurations used by specific components of your deployment without changing the overall hardware profile assigned to the deployment. This advanced configuration scenario is useful when specific situations in which we may need to migrate an Elasticsearch tier or stateless resource to a different hardware type.
This document explains how to modify the instance configurations used by specific components of your deployment without changing the overall hardware profile assigned to the deployment. This advanced configuration scenario is useful in situations where you need to migrate an Elasticsearch tier or stateless resource to a different hardware type.

@eedugon
Contributor

eedugon commented Jul 11, 2025

For the current build error, please update the branch first. I think the error is not really related to this PR.

Labels
documentation Improvements or additions to documentation supportability ability enable self-service or support of product
4 participants