Long segment and Host names stored in Zk can cause high heap usage and impact performance #14931

Open
vrajat opened this issue Jan 28, 2025 · 2 comments

Comments

@vrajat
Collaborator

vrajat commented Jan 28, 2025

Segment and host names stored in ZooKeeper as part of cluster and data metadata can be very long. Their length depends on the table names and host names in the cluster. A couple of examples:

pinot-controller-controller-0-0.pinot-pinot-controller-headless.cell-bzf7co-managed.svc.cluster.local_9000
nation_dm2_0_output_4341_csv_FileIngestionTask_1732618352252_3715

In a test setup with a table of 200K segments, there are 5 million String objects, which take up 247 MB of memory.
A couple of stack traces of allocations:

 ↖{j.u.LinkedHashMap}.values
  ↖{j.u.TreeMap}.values
    ↖org.apache.helix.zookeeper.datamodel.ZNRecord.mapFields
      ↖org.apache.helix.model.CurrentState._record

↖{j.u.LinkedHashMap}.values
  ↖{j.u.TreeMap}.values
    ↖org.apache.helix.zookeeper.datamodel.ZNRecord.mapFields
      ↖org.apache.helix.model.ResourceConfig._record
        ↖{j.u.HashMap}.values
          ↖org.apache.helix.common.caches.PropertyCache._objMap
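
To put the 247 MB figure in perspective, here is a rough sketch of what a single name costs on the heap. The layout constants are assumptions (64-bit HotSpot, compressed oops, compact strings on JDK 9+), not measurements from this cluster, and `StringFootprint` is a hypothetical helper name. "247mb" is read here as 247 × 10^6 bytes.

```java
/** Back-of-the-envelope heap estimate for Java Strings; layout constants are assumptions. */
final class StringFootprint {
    // Assumed layout: 64-bit HotSpot, compressed oops, compact strings (JDK 9+).
    static final int STRING_OBJECT_BYTES = 24; // 12-byte header + value ref + hash + coder + flag, padded to 8
    static final int ARRAY_HEADER_BYTES = 16;  // 12-byte header + 4-byte array length

    /** Rounds up to the next multiple of 8, the assumed object alignment. */
    static long alignTo8(long bytes) {
        return (bytes + 7) & ~7L;
    }

    /** Estimated bytes retained by one Latin-1 String with the given number of characters. */
    static long estimateLatin1(int chars) {
        return STRING_OBJECT_BYTES + alignTo8(ARRAY_HEADER_BYTES + chars);
    }

    /** Average bytes per String for a reported total. */
    static double averageBytes(long totalBytes, long stringCount) {
        return (double) totalBytes / stringCount;
    }

    public static void main(String[] args) {
        // The two example names quoted in this issue.
        String host = "pinot-controller-controller-0-0.pinot-pinot-controller-headless"
                + ".cell-bzf7co-managed.svc.cluster.local_9000";
        String segment = "nation_dm2_0_output_4341_csv_FileIngestionTask_1732618352252_3715";

        System.out.println("host name:    ~" + estimateLatin1(host.length()) + " bytes");
        System.out.println("segment name: ~" + estimateLatin1(segment.length()) + " bytes");
        System.out.println("reported average: "
                + averageBytes(247_000_000L, 5_000_000L) + " bytes per String");
    }
}
```

Under these assumptions, every copy of a long name that is not shared costs well over 100 bytes, so the total grows with the number of copies as well as with the name length.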

Long names also affect performance. Here is an example with a relatively short table name:

curl -s -S -n -H "Authorization: Bearer $SCALETEST_TOKEN" "https://$CONTROLLER_HOST:$CONTROLLER_PORT/segments/nation_OFFLINE" -o /dev/null -w "%{time_total},%{size_download},%{speed_download}\n" >> stats.log

❯ cat stats.log
8.461959,4698905,555297
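
For reading `stats.log`: the three fields are curl's `time_total` (seconds), `size_download` (bytes), and `speed_download` (bytes per second), in the order given in the `-w` format string above. A minimal sketch that parses one line and cross-checks the reported speed against size divided by time; `CurlStat` is a hypothetical helper, not part of Pinot.

```java
/** Parses one "time_total,size_download,speed_download" line written by curl -w. */
final class CurlStat {
    final double seconds;      // time_total
    final long bytes;          // size_download
    final long bytesPerSecond; // speed_download as reported by curl

    CurlStat(double seconds, long bytes, long bytesPerSecond) {
        this.seconds = seconds;
        this.bytes = bytes;
        this.bytesPerSecond = bytesPerSecond;
    }

    static CurlStat parse(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 comma-separated fields: " + line);
        }
        return new CurlStat(Double.parseDouble(parts[0]),
                Long.parseLong(parts[1]),
                Long.parseLong(parts[2]));
    }

    /** Throughput recomputed from size and elapsed time. */
    double computedBytesPerSecond() {
        return bytes / seconds;
    }

    public static void main(String[] args) {
        CurlStat stat = parse("8.461959,4698905,555297");
        System.out.printf("%.1f MB in %.2f s, about %.0f KB/s%n",
                stat.bytes / 1e6, stat.seconds, stat.computedBytesPerSecond() / 1e3);
    }
}
```

In other words, the request above took roughly 8.5 seconds to return a response of about 4.7 MB.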
@vrajat
Collaborator Author

vrajat commented Jan 28, 2025

An example stack trace for large memory allocations due to long host names.

Stack Trace	Count	Percentage
int[] java.util.Arrays.copyOf(int[], int)	491	94.8 %
int org.apache.pinot.shaded.com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer._appendLongName(int[], int)	485	93.6 %
String org.apache.pinot.shaded.com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer.addName(String, int[], int)	485	93.6 %
String org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.addName(int[], int, int)	485	93.6 %
String org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.findName(int[], int, int, int)	481	92.9 %
String org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.parseLongName(int, int, int)	481	92.9 %
String org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.parseMediumName2(int, int)	481	92.9 %
String org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.parseMediumName(int)	481	92.9 %
String org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._parseName(int)	481	92.9 %
String org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextFieldName()	481	92.9 %
void org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(JsonParser, DeserializationContext, Map)	481	92.9 %
Map org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(JsonParser, DeserializationContext)	481	92.9 %
Object org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(JsonParser, DeserializationContext)	481	92.9 %
void org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(JsonParser, DeserializationContext, Object)	473	91.3 %
Object org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(JsonParser, DeserializationContext, Object)	473	91.3 %
Object org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(JsonParser, DeserializationContext)	473	91.3 %
Object org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(JsonParser, DeserializationContext)	473	91.3 %
Object org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(JsonParser, DeserializationContext)	473	91.3 %
Object org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(JsonParser, DeserializationContext)	473	91.3 %
Object org.apache.pinot.shaded.com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(JsonParser, JavaType, JsonDeserializer, Object)	473	91.3 %
Object org.apache.pinot.shaded.com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(JsonParser, JavaType)	473	91.3 %
Object org.apache.pinot.shaded.com.fasterxml.jackson.databind.ObjectMapper.readValue(InputStream, Class)	473	91.3 %
Object org.apache.helix.zookeeper.datamodel.serializer.ZNRecordSerializer.deserialize(byte[])	473	91.3 %
Object org.apache.helix.zookeeper.datamodel.serializer.ChainedPathZkSerializer.deserialize(byte[], String)	473	91.3 %
Object org.apache.helix.zookeeper.zkclient.ZkClient.deserialize(byte[], String)	473	91.3 %
List org.apache.helix.manager.zk.ZkBaseDataAccessor.get(List, List, boolean[], boolean)	473	91.3 %
List org.apache.helix.manager.zk.ZkBaseDataAccessor.get(List, List, int, boolean)	473	91.3 %
List org.apache.helix.manager.zk.ZKHelixDataAccessor.getProperty(List, boolean)	473	91.3 %
Map org.apache.helix.common.caches.AbstractDataCache.refreshProperties(HelixDataAccessor, Set, List, Map, Set)	473	91.3 %
void org.apache.helix.common.caches.PropertyCache.doRefreshWithSelectiveUpdate(HelixDataAccessor)	473	91.3 %
void org.apache.helix.common.caches.PropertyCache.refresh(HelixDataAccessor)	473	91.3 %
void org.apache.helix.controller.dataproviders.BaseControllerDataProvider.refreshIdealState(HelixDataAccessor, Set)	473	91.3 %
Set org.apache.helix.controller.dataproviders.BaseControllerDataProvider.doRefresh(HelixDataAccessor)	473	91.3 %
void org.apache.helix.controller.dataproviders.ResourceControllerDataProvider.refresh(HelixDataAccessor)	238	45.9 %
void org.apache.helix.controller.stages.ReadClusterDataStage.process(ClusterEvent)	238	45.9 %
void org.apache.helix.controller.pipeline.Pipeline.handle(ClusterEvent)	238	45.9 %
void org.apache.helix.controller.GenericHelixController.handleEvent(ClusterEvent, BaseControllerDataProvider)	238	45.9 %
void org.apache.helix.controller.GenericHelixController$ClusterEventProcessor.run()	238	45.9 %

@Jackie-Jiang
Contributor

This problem can be broken into 2 parts:

  • Segment name: this is determined by the task generating the segment. Given the total number of segments, we probably don't want to introduce a mapping from logical name to physical name. @KKcorps @swaminathanmanish is it possible to shorten the segment name generated from the minion task?
  • Instance name: Since most of the names are repeated within a table, we should be able to reuse the String objects. @vrajat Can you check whether the strings are already reused? If not, can we change the code so that they are?
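
The reuse suggested for instance names could look roughly like the sketch below: a small canonicalizing pool that returns one shared String per distinct name. This is only an illustration, not existing Pinot or Helix code. `NamePool` and `canonical` are hypothetical names, and where such a pool would be hooked into ZNRecord deserialization is left open.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Returns one shared String instance per distinct name (hypothetical helper). */
final class NamePool {
    private final ConcurrentHashMap<String, String> pool = new ConcurrentHashMap<>();

    /** Returns the pooled instance equal to {@code name}, adding it on first sight. */
    String canonical(String name) {
        String existing = pool.putIfAbsent(name, name);
        return existing != null ? existing : name;
    }

    /** Number of distinct names currently pooled. */
    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        NamePool names = new NamePool();
        // Two equal but distinct String objects, as separate deserializations would produce.
        String first = new String("Server_example-host_8098");
        String second = new String("Server_example-host_8098");

        System.out.println(first == second);                                   // distinct objects
        System.out.println(names.canonical(first) == names.canonical(second)); // one shared object
    }
}
```

A pool like this only pays off for values that repeat, such as instance names, and it holds on to every name it has seen, so it fits a bounded set of names. `String.intern()` is a built-in alternative with a JVM-wide pool. For segment names, which are mostly unique, pooling would not reduce the footprint.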
