[Java] Optimize map deep copy performance #1744

Closed
chaokunyang opened this issue Jul 20, 2024 · 5 comments · Fixed by #1767
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@chaokunyang
Collaborator

chaokunyang commented Jul 20, 2024

Is your feature request related to a problem? Please describe.

Map key/value items are mostly homogeneous, so we can cache the previous item's immutable info and reduce the type dispatch cost for key/value items to speed up deep copy.

Describe the solution you'd like

Additional context

#1679

@chaokunyang chaokunyang added enhancement New feature or request good first issue Good for newcomers labels Jul 20, 2024
@urlyy
Contributor

urlyy commented Jul 21, 2024

Excuse me, I don't understand the issue requirement. Do you mean that in the code below we should not call getOrUpdateClassInfo inside copyObject, but instead get the ClassInfo instance before the loop? Is this enough? Sorry, I don't know where the cache can live (as MapSerializer is not only used for one type of kv map).

Map newMap = fury.copy(oldMap);

@Override
public T copy(T originMap) {
  ......
  copyEntry(originMap, newMap);
  return onMapCopy(newMap);
}

protected <K, V> void copyEntry(Map<K, V> originMap, Map<K, V> newMap) {
  // Sketch only: `entry` is not in scope before the loop, so these lookups would need a first entry.
  ++ ClassInfo keyClassInfo = classResolver.getOrUpdateClassInfo(entry.getKey().getClass());
  ++ ClassInfo valueClassInfo = classResolver.getOrUpdateClassInfo(entry.getValue().getClass());
  for (Map.Entry<K, V> entry : originMap.entrySet()) {
    -- newMap.put(fury.copyObject(entry.getKey()), fury.copyObject(entry.getValue()));
    ++ newMap.put(fury.copyObject(entry.getKey(), keyClassInfo), fury.copyObject(entry.getValue(), valueClassInfo));
  }
}


-- public <T> T copyObject(T obj) {
// We can add a new method with additional parameter classInfo
++ public <T> T copyObject(T obj, ClassInfo classInfo) {
    if (obj == null) {
      return null;
    }
    Object copy;
    -- ClassInfo classInfo = classResolver.getOrUpdateClassInfo(obj.getClass());
    switch (classInfo.getClassId()) {
      case ClassResolver.PRIMITIVE_BOOLEAN_CLASS_ID:
      case ClassResolver.PRIMITIVE_BYTE_CLASS_ID:
      case ClassResolver.PRIMITIVE_CHAR_CLASS_ID:
      case ClassResolver.PRIMITIVE_SHORT_CLASS_ID:
      case ClassResolver.PRIMITIVE_INT_CLASS_ID:
      case ClassResolver.PRIMITIVE_FLOAT_CLASS_ID:
      case ClassResolver.PRIMITIVE_LONG_CLASS_ID:
      case ClassResolver.PRIMITIVE_DOUBLE_CLASS_ID:
      case ClassResolver.BOOLEAN_CLASS_ID:
      case ClassResolver.BYTE_CLASS_ID:
      case ClassResolver.CHAR_CLASS_ID:
      case ClassResolver.SHORT_CLASS_ID:
      case ClassResolver.INTEGER_CLASS_ID:
      case ClassResolver.FLOAT_CLASS_ID:
      case ClassResolver.LONG_CLASS_ID:
      case ClassResolver.DOUBLE_CLASS_ID:
      case ClassResolver.STRING_CLASS_ID:
        return obj;
      case ClassResolver.PRIMITIVE_BOOLEAN_ARRAY_CLASS_ID:
        boolean[] boolArr = (boolean[]) obj;
        return (T) Arrays.copyOf(boolArr, boolArr.length);
      ......
      case ClassResolver.ARRAYLIST_CLASS_ID:
        copy = arrayListSerializer.copy((ArrayList) obj);
        break;
      case ClassResolver.HASHMAP_CLASS_ID:
        copy = hashMapSerializer.copy((HashMap) obj);
        break;
        // todo: add fastpath for other types.
      default:
        copyDepth++;
        copy = classInfo.getSerializer().copy(obj);
        copyDepth--;
    }
    return (T) copy;
  }

@chaokunyang
Collaborator Author

It's like:

  protected <K, V> void copyEntry(Map<K, V> originMap, Map<K, V> newMap) {
    ClassResolver classResolver = fury.getClassResolver();
    for (Map.Entry<K, V> entry : originMap.entrySet()) {
      K key = entry.getKey();
      if (key != null) {
        ClassInfo classInfo = classResolver.getClassInfo(key.getClass(), keyClassInfoWriteCache);
        key = fury.copyObject(key, classInfo.getClassId());
      }
      V value = entry.getValue();
      if (value != null) {
        ClassInfo classInfo =
            classResolver.getClassInfo(value.getClass(), valueClassInfoWriteCache);
        value = fury.copyObject(value, classInfo.getClassId());
      }
      newMap.put(key, value);
    }
  }
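To make the per-call-site cache idea above concrete, here is a minimal, self-contained sketch. It does not use Fury's real classes: `TypeInfo`, `TypeInfoHolder`, `TypeRegistry`, and `HolderCacheDemo` are hypothetical stand-ins for `ClassInfo`, the key/value cache holders, and `ClassResolver`, and the `slowLookups` counter exists only to make the effect visible. The assumption is that each holder remembers the last type seen at one call site, so a homogeneous map pays for a hash map lookup only once per side.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for ClassInfo: just the class and an id.
final class TypeInfo {
  final Class<?> cls;
  final int id;

  TypeInfo(Class<?> cls, int id) {
    this.cls = cls;
    this.id = id;
  }
}

// Hypothetical one-slot cache owned by a single call site (e.g. "map keys").
final class TypeInfoHolder {
  TypeInfo info;
}

// Hypothetical registry standing in for ClassResolver.
final class TypeRegistry {
  private final Map<Class<?>, TypeInfo> infoMap = new HashMap<>();
  int slowLookups; // how many times the hash map had to be consulted

  TypeInfo getTypeInfo(Class<?> cls, TypeInfoHolder holder) {
    TypeInfo cached = holder.info;
    if (cached != null && cached.cls == cls) {
      return cached; // fast path: same class as the previous item at this call site
    }
    slowLookups++;
    TypeInfo info = infoMap.get(cls);
    if (info == null) {
      info = new TypeInfo(cls, infoMap.size());
      infoMap.put(cls, info);
    }
    holder.info = info; // remember it for the next item
    return info;
  }
}

final class HolderCacheDemo {
  // Walks the map the way a copy loop would, with separate holders for keys and values.
  static int countSlowLookups(Map<?, ?> map) {
    TypeRegistry registry = new TypeRegistry();
    TypeInfoHolder keyHolder = new TypeInfoHolder();
    TypeInfoHolder valueHolder = new TypeInfoHolder();
    for (Map.Entry<?, ?> entry : map.entrySet()) {
      if (entry.getKey() != null) {
        registry.getTypeInfo(entry.getKey().getClass(), keyHolder);
      }
      if (entry.getValue() != null) {
        registry.getTypeInfo(entry.getValue().getClass(), valueHolder);
      }
    }
    return registry.slowLookups;
  }

  public static void main(String[] args) {
    Map<String, Integer> map = new LinkedHashMap<>();
    for (int i = 0; i < 1000; i++) {
      map.put("k" + i, i);
    }
    System.out.println(countSlowLookups(map)); // prints 2
  }
}
```

With 1000 `String -> Integer` entries, only the first key and the first value miss their holder. A map whose values alternate between types would miss on every value, which is why this helps mostly for homogeneous maps.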

@chaokunyang
Collaborator Author

The same applies to collection and object array deep copy.

@urlyy
Contributor

urlyy commented Jul 23, 2024

It's like:

  protected <K, V> void copyEntry(Map<K, V> originMap, Map<K, V> newMap) {
    ClassResolver classResolver = fury.getClassResolver();
    for (Map.Entry<K, V> entry : originMap.entrySet()) {
      K key = entry.getKey();
      if (key != null) {
        ClassInfo classInfo = classResolver.getClassInfo(key.getClass(), keyClassInfoWriteCache);
        key = fury.copyObject(key, classInfo.getClassId());
      }
      V value = entry.getValue();
      if (value != null) {
        ClassInfo classInfo =
            classResolver.getClassInfo(value.getClass(), valueClassInfoWriteCache);
        value = fury.copyObject(value, classInfo.getClassId());
      }
      newMap.put(key, value);
    }
  }

classResolver.getOrUpdateClassInfo(obj.getClass()) already uses a Class -> ClassInfo map (classInfoMap in ClassResolver.java), so why do we introduce another classInfoWriteCache? I'm sorry, but I'm struggling to understand this part.

@Internal
public ClassInfo getOrUpdateClassInfo(Class<?> cls) {
  ClassInfo classInfo = classInfoCache;
  if (classInfo.cls != cls) {
    classInfo = classInfoMap.get(cls);
    if (classInfo == null || classInfo.serializer == null) {
      addSerializer(cls, createSerializer(cls));
      classInfo = classInfoMap.get(cls);
    }
    classInfoCache = classInfo;
  }
  return classInfo;
}

@chaokunyang
Collaborator Author

chaokunyang commented Jul 27, 2024

If we serialize a nested object, the cached ClassInfo may be overwritten by another map.
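The overwriting effect can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Fury's implementation: `SlotCache` and `NestedCacheDemo` are hypothetical names, and a "miss" here simply means the one-slot cache held a different class than the one requested. It compares a single shared slot (similar in spirit to the `classInfoCache` field shown earlier) with separate slots for keys, values, and nested elements while walking a `Map<String, List<Long>>`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical one-slot cache: remembers only the last class it was asked about.
final class SlotCache {
  private Class<?> last;
  int misses;

  void lookup(Class<?> cls) {
    if (last != cls) {
      misses++; // a real resolver would fall back to a hash map lookup here
      last = cls;
    }
  }
}

final class NestedCacheDemo {
  // Every lookup goes through one shared slot.
  static int missesWithSharedSlot(Map<String, List<Long>> map) {
    SlotCache shared = new SlotCache();
    walk(map, shared, shared, shared);
    return shared.misses;
  }

  // Keys, values, and nested elements each get their own slot.
  static int missesWithPerSiteSlots(Map<String, List<Long>> map) {
    SlotCache keys = new SlotCache();
    SlotCache values = new SlotCache();
    SlotCache elements = new SlotCache();
    walk(map, keys, values, elements);
    return keys.misses + values.misses + elements.misses;
  }

  // Visits types in the order a recursive deep copy would: key, value, then the value's elements.
  private static void walk(
      Map<String, List<Long>> map, SlotCache keys, SlotCache values, SlotCache elements) {
    for (Map.Entry<String, List<Long>> entry : map.entrySet()) {
      keys.lookup(entry.getKey().getClass());
      values.lookup(entry.getValue().getClass());
      for (Long element : entry.getValue()) {
        elements.lookup(element.getClass());
      }
    }
  }

  static Map<String, List<Long>> sample(int entries) {
    Map<String, List<Long>> map = new LinkedHashMap<>();
    for (int i = 0; i < entries; i++) {
      map.put("k" + i, new ArrayList<>(Arrays.asList(1L, 2L, 3L)));
    }
    return map;
  }

  public static void main(String[] args) {
    Map<String, List<Long>> map = sample(100);
    System.out.println(missesWithSharedSlot(map));   // prints 300
    System.out.println(missesWithPerSiteSlots(map)); // prints 3
  }
}
```

With the shared slot, the nested `Long` elements evict the `String` key type on every entry, so each entry costs three misses. With one slot per call site, each type misses only once.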
