Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] Avoid creating objects that Godot is going to use placement new to initialize #1378

Closed
wants to merge 1 commit into from

Conversation

dsnopek
Copy link
Collaborator

@dsnopek dsnopek commented Jan 31, 2024

This is an idea for a way to maybe fix issue #1240

It's using bits of modern C++ that I don't totally understand, so I may be getting this completely wrong. Feedback is very welcome!

Anyway, the idea is to create a block of memory that can hold the class that we're going to create, but without actually calling that class' constructor on it. Then we pass a pointer to that memory to Godot to initialize it via placement new. And then ultimately cast the data to the real class and return it.

So far, I've only done this in the one case brought up on the issue (converting a Variant to a Dictionary), but if this works, there's a number of other places this should be used.

@dsnopek dsnopek added the bug This has been identified as a bug label Jan 31, 2024
@dsnopek dsnopek added this to the 4.x milestone Jan 31, 2024
@dsnopek dsnopek requested a review from a team as a code owner January 31, 2024 20:43
@dsnopek dsnopek force-pushed the uninitialized-value branch 2 times, most recently from 7300114 to 9b3bcee Compare January 31, 2024 21:38
@allenwp
Copy link
Contributor

allenwp commented Feb 1, 2024

Thanks for writing this @dsnopek -- this is a lot cleaner and nicer than the same concept that I was fiddling with!

Here is the full constructor and destructor calls, in the order they happen for the following code:

{
	Array original_array;
	{
		Variant my_variant = original_array;
		{
			Array my_array = my_variant; // This line causes memory to leak
		}
	}
}

construct-destruct---9b3bcee71d55b731ae88c8d285c566b58423738e

Because the placement new is happening on uninitialized memory, it doesn't need to create and destroy a whole new ArrayPrivate/DictionaryPrivate, etc... But we still need to figure out how to get it to either not make that constructor call get it to correctly call the matching destructor...

This is all somewhat new to me, but I feel that I'm learning quickly, so I'll keep working on solution...

@allenwp
Copy link
Contributor

allenwp commented Feb 1, 2024

Because it seems to be the memory that is created by the Godot core's Variant::operator Array() const cast operator that is leaking, I broke it down to determine which part is actually leaking here:

Variant::operator Array() const {
	if (type == ARRAY) {
		const Array *result_ptr = reinterpret_cast<const Array *>(_data._mem);
		Array result(*result_ptr); // This "result" is cleaned up.
		return result; // THIS LINE LEAKS
	} else {
		return _convert_array_from_variant<Array>(*this);
	}
}

So it's the Array that is constructed during the return from Variant::operator Array() const that is not having its destructor ever called with this PR draft...

(I might be misreading my debugger and my knowledge of C++ is not that great... So I'll clarify that it was the second of two calls to the Array constructor from Variant::operator Array() const that never had its destructor called.)

@allenwp
Copy link
Contributor

allenwp commented Feb 1, 2024

So adding this destructor to the new template you proposed might be the correct fix, but it should only call T::~T() when a placement new has actually been called on data -- I'm not sure how this template could tell if that's been done or not.

~UninitializedValue() {
	reinterpret_cast<T *>(&data)->~T();
}

A quick test suggests that this sort of code works well, but I'm going to need to spend some more time analyzing the memory the whole way through to make sure the destructor is being called at the correct time...

The relevant topic here might be this idea of a "placement delete".

@dsnopek
Copy link
Collaborator Author

dsnopek commented Feb 1, 2024

Ah, ok, I think that makes sense! We need for exiting the scope of the Variant::operator Array() const function to decrement the reference count by 1 because it'll go up by 1 when the return value is assigned to some other Array.

You're right that such a destructor would be dangerous - if the value is never initialized, we'll be running the destructor on junk memory. I wonder if we named it ManuallyInitializedValue<T> or something like that, and put lots of comments that explain that the value must be initialized before going out of scope?

@dsnopek dsnopek force-pushed the uninitialized-value branch from 9b3bcee to 566742d Compare February 1, 2024 19:48
@allenwp
Copy link
Contributor

allenwp commented Feb 1, 2024

You're right that such a destructor would be dangerous - if the value is never initialized, we'll be running the destructor on junk memory. I wonder if we named it ManuallyInitializedValue<T> or something like that, and put lots of comments that explain that the value must be initialized before going out of scope?

That name is definitely an improvement if we go with this approach.

Honestly, any and all of this manual memory management code is the sort of code where you need to know what you're doing to use it. Having an explanation written out in comments for the class might be a reasonable approach?

I'm also curious (from an educational perspective) about deleted copy constructor and operator: why were these deleted and why not any others? I'm also all for thinking this one through as much as possible, because I'm sure it will touch a lot of scenarios that will be hard to think of in advance...

I don't know anything about std::aligned_storage, so I can't comment on that right now.

Theoretically, this could be used on all of the cast operators in GDExtension's Variant, but I suspect that would be a bad idea because it would be virtually no improvement for most of them? Is consistency between all casts better? Or less change to the codebase better? Anyway, lots of little questions that don't matter much :)

@dsnopek
Copy link
Collaborator Author

dsnopek commented Feb 1, 2024

I updated the PR like I described above.

However, I'm starting to wonder if this template is going about this the wrong way. The uninitialized memory that we're passing to Godot isn't initializing the whole godot-cpp class, just the opaque data blob in the class. Pretending like it's initializing the whole class seems dangerous: what if there's more in the godot-cpp class that needs to be initialized, for example, the vptr? This doesn't apply to any of these classes because they don't have virtual methods, but it still kind of bothers me, because it's conceptually wrong.

What if Array had a constructor that took the opaque data blob? So, something like:

class Array {
	static constexpr size_t ARRAY_SIZE = 8;
	struct OpaqueData {
		uint8_t opaque[ARRAY_SIZE] = {};		
	};
	OpaqueData opaque;

	Array(const OpaqueData &p_data) {
		opaque = p_data;
	}

	// ...
};

And then we'd have:

Variant::operator Array() const {
	Array::OpaqueData data;
	to_type_constructor[ARRAY](&data, _native_ptr());
	return Array(data);
}

That adds an extra copy, though... I wonder if there's a way to avoid that?

@allenwp
Copy link
Contributor

allenwp commented Feb 1, 2024

However, I'm starting to wonder if this template is going about this the wrong way. The uninitialized memory that we're passing to Godot isn't initializing the whole godot-cpp class, just the opaque data blob in the class. Pretending like it's initializing the whole class seems dangerous: what if there's more in the godot-cpp class that needs to be initialized, for example, the vptr? This doesn't apply to any of these classes because they don't have virtual methods, but it still kind of bothers me, because it's conceptually wrong.

Good thoughts.

My intuition tells me that the placement new operation should be done with fully uninitialized memory and requires the copy constructor to fully initialize all memory for the class. A placement new operation should, therefore, not be performed with any class that may not fully initialize itself within its copy constructor.

So, if anything is in question, it would be the (existing) liberal use of placement new throughout Variant, etc.?

@dsnopek
Copy link
Collaborator Author

dsnopek commented Feb 1, 2024

Ok, here's a much less hacky PR #1379

@allenwp Can you let me know if this fixes the memory leak in your testing? Thanks!

@allenwp
Copy link
Contributor

allenwp commented Feb 2, 2024

You're right that such a destructor would be dangerous - if the value is never initialized, we'll be running the destructor on junk memory.

I had a shower thought on this specific issue that I think could be addressed by making the template a wrapper of the placement new. This is pseudo-code, in a sense, as I haven't even tried to compile this. It has some issues in that it doesn't address all cases, but I think it gets the point across:

// Performs a placement new: should only be used with classes
// that can be entirely initialized through their copy constructor.
template <class T, size_t Alignment = alignof(T)>
class CopiedValue {
	typename std::aligned_storage<sizeof(T), Alignment>::type data;

public:
	CopiedValue(void *value) {
		to_type_constructor[ARRAY](ptr(), value);
	}

	CopiedValue() = delete;
	CopiedValue(const CopiedValue &) = delete;
	CopiedValue &operator=(const CopiedValue &) = delete;

	~CopiedValue() {
		reinterpret_cast<T *>(&data)->~T();
	}

	T &get() {
		return *reinterpret_cast<T *>(&data);
	}

	T *ptr() {
		return reinterpret_cast<T *>(&data);
	}
};

And it would be used in the cast operators a little like this:

CopiedValue<Array> result(_native_ptr());
return result.get();

I added a note at the top about the placement new that will be performed. I'm still new to this concept, but if I understand it correctly, a placement new should only ever be done on a class that can be fully initialized through its copy constructor... But... maybe that's a reasonable assumption: if a copy constructor is provided by the class, then I think it's reasonable to presume that the class can be fully initialized through it. So maybe this comment is entirely unnecessary.

I am actually partial to performing the placement new on entirely uninitialized memory like this implementation would do. The existing bugs reveal quite clearly that performing a placement new over top of previously initialized memory can cause issues because classes are not normally designed to have their copy constructor called on an existing initialization. This is exactly why the original memory leak was happening.

For this reason, wrapping the placement new in a way that correctly performs it on uninitialized memory and correctly calls the new class's destructor like this template sounds like the correct and safe approach for any and all classes...

(But I haven't deeply looked at the idea in PR #1379 so maybe that's an entirely better approach!)

@dsnopek
Copy link
Collaborator Author

dsnopek commented Feb 2, 2024

Using a constructor to call the Godot function is definitely better, because then we're sure that we attempted to initialize the value. However, I still think the approach on PR #1379 is cleaner and less hacky in general.

I'm going to close this one as superseded by #1379

@dsnopek dsnopek closed this Feb 2, 2024
@AThousandShips AThousandShips removed this from the 4.x milestone Feb 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
archived bug This has been identified as a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants