Support arm64 compilation #314
Comments
The GPU is a 32 bit processor. I haven't checked, but I'm expecting that there's a heck of a lot more work to do to get Khronos or other multimedia extension stuff up and running against a 64bit kernel than just getting userland to build. |
It looks like the build scripts have been merged, so perhaps the issue needs to be closed? |
I forgot I had filed this bug actually. The build scripts @Electron752 is referring to were part of PR #347, which adds -DARM64=ON to compile only the known-working 64-bit code. But the fact remains that a lot of 64-bit broken code still exists. Maybe this bug should remain open and be used to refer to work on fixing the 64-bit broken code? I'll leave that decision to the repo maintainers. |
I wouldn't be surprised if it's from the use of thumb, as it's deprecated in aarch64 |
Since there have been no updates to this in a year, I'm inclined to close it. Any objections? |
@JamesH65 : I'm closely tracking this issue, as the reporter expanded with the following question :
What is your take on this? |
The RPF are not putting any dev effort into a 64 bit userland, it's enough work supporting 32bit! I've no idea if that will change - it is a LOT of work I believe. So any updates will be coming from third parties, and there haven't been any posts here for a year, so presumably either no-one is actually working on it, or it's being documented elsewhere. |
I'm sure there are certain applications where having a 64-bit kernel (let alone userland) may be beneficial, but I suspect the hoped-for performance improvements didn't materialise, otherwise people would be waving benchmark results at us demanding an RPi-supported aarch64 kernel. |
What is the best way to do benchmarks to post? I have a full 64-bit compile with march=armv8-a+crc and neon set in the compile, so it's pretty much optimized to the max of RPi hardware. |
You would think that Neon benchmarks would be the best ones to look at - Aarch64 Neon has double the number of Neon registers. |
No, the aim is not to find something that a 64-bit kernel will excel at, but rather a benchmark or two that reflect performance for the (mythical) typical user by including a bit of everything. |
Is there such a benchmark I could use? |
Hi everyone, what's the status here? |
We have not been working on this, so no change. |
I'm an experienced developer; if I wanted to hack on this in my spare time, where would be a good place to start? I understand if even figuring that out is more work than you guys want to put into this heh, but I figured it wouldn't hurt to ask. |
On 30 July 2018 at 23:06, Robert Thompson ***@***.***> wrote:
I'm an experienced developer; if I wanted to hack on this in my spare time, where would be a good place to start? I understand if even figuring that out is more work than you guys want to put into this heh, but I figured it wouldn't hurt to ask.
I've wondered about doing this; I think it's actually relatively
straightforward. There's a good chance I only think that because of a
combination of ignorance and hubris though.
But anyhow, the basic problem is that there are various ARM/VideoCore
interfaces around which are all designed around a 32 bit architecture
on both sides. The tricky part is that there are places where the ARM
side passes in a context, VC does some stuff, and then sends back a
message with that context. The ARM side then does whatever it needs to
do. That context is a pointer to some memory.
If you're on a 64 bit architecture, then that isn't going to work -
your pointers are obviously too large.
So, what to do?
Well, I think one way is to allocate a virtual region (vma) with a
suitably large size (e.g. 128MB virtual should be plenty, whatever,
it's virtual so it doesn't matter). And then allocate memory in there.
In the APIs, just pass the offset into this region, and on the way
back, convert back to a pointer by adding back the offset.
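
As a rough illustration of the offset idea above, here is a minimal C sketch. All names (`region_t`, `region_init`, `region_alloc`, `region_to_offset`, `region_from_offset`) are hypothetical and not part of VCHIQ. `malloc` stands in for reserving a real virtual region, and the bump allocator is chosen only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types and functions for illustration; not the VCHIQ API. */
typedef struct {
    uint8_t *base;  /* start of the region (a 64-bit pointer on aarch64) */
    uint32_t used;  /* bump-allocator cursor, never exceeds size */
    uint32_t size;  /* region size in bytes */
} region_t;

/* Reserve the region. Assumption: malloc stands in for a real VMA. */
int region_init(region_t *r, uint32_t size)
{
    r->base = malloc(size);
    r->used = 0;
    r->size = size;
    return r->base ? 0 : -1;
}

/* Bump-allocate n bytes inside the region; returns NULL if it does not fit. */
void *region_alloc(region_t *r, uint32_t n)
{
    if (n == 0 || n > r->size)
        return NULL;
    n = (n + 7u) & ~7u;                 /* round up to 8-byte alignment */
    if (n > r->size - r->used)
        return NULL;
    void *p = r->base + r->used;
    r->used += n;
    return p;
}

/* Pointer -> 32-bit offset that fits in a 32-bit context field. */
uint32_t region_to_offset(const region_t *r, const void *p)
{
    ptrdiff_t off = (const uint8_t *)p - r->base;
    assert(off >= 0 && (uint64_t)off < r->size);
    return (uint32_t)off;
}

/* 32-bit offset echoed back by the other side -> pointer (NULL if out of range). */
void *region_from_offset(const region_t *r, uint32_t off)
{
    return off < r->size ? r->base + off : NULL;
}
```

The ARM side would pass `region_to_offset(...)` wherever a 32-bit context is expected and call `region_from_offset(...)` when the message comes back. The range check on the return path means a corrupted offset yields `NULL` rather than a wild pointer.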
Note that VideoCore only ever actually reads or writes at *physical*
addresses as there is no IOMMU, so the virtual address can be
anywhere, and it won't matter. However, there are certainly some code
paths that will want a contiguous region (e.g. the VCHIQ circular
buffers).
The place to start looking at this is in the vchiq driver - fix that
and everything else will be easy (famous last words). There's a vchiq
test program, so once that works you are home and dry.
For example - vchiq_service_params_struct has a void* userdata -
that's an example of the problem. In vchiq_add_service_internal() it
stuffs that pointer into some shared memory - that's probably where
you would want to patch it up.
I think that should take care of vchiq.
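
To make the `userdata` problem concrete, the sketch below contrasts a host-side struct holding a `void *` with a fixed-width layout that a 32-bit peer could share. The struct definitions are simplified, hypothetical stand-ins (not the real `vchiq_service_params_struct`), and `pack_userdata` is an invented helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified host-side parameters: sizeof depends on
 * pointer width (8 bytes on a 32-bit ABI, typically 16 on aarch64). */
typedef struct {
    uint32_t fourcc;
    void    *userdata;
} host_params_t;

/* Hypothetical shared-memory layout: fixed-width fields only, so both
 * sides agree on size and offsets regardless of pointer width. */
typedef struct {
    uint32_t fourcc;
    uint32_t userdata;  /* a 32-bit token, not a raw pointer */
} wire_params_t;

_Static_assert(sizeof(wire_params_t) == 8, "wire layout must be 8 bytes");
_Static_assert(offsetof(wire_params_t, userdata) == 4, "userdata at offset 4");

/* Store p in 32 bits only if nothing is lost.
 * Returns 1 on success, 0 if the pointer does not fit. */
int pack_userdata(const void *p, uint32_t *out)
{
    uint64_t v = (uint64_t)(uintptr_t)p;
    if (v > UINT32_MAX)
        return 0;
    *out = (uint32_t)v;
    return 1;
}
```

A 32-bit process always passes the `pack_userdata` check, while a 64-bit process generally does not, so a 64-bit userland would need a translation step such as an offset or a handle.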
There's also a shared memory driver where VC gets actual ARM-side
addresses; I don't know how that can possibly work directly, probably
it will require a special allocator from this same region, but I think
other architectures have similar problems, so it might not be that
hard to overcome.
I can't help feeling though that I've overlooked something important!
Luke
|
Thanks for the description! I got most of it compiled in 64bit by removing some assertions and changing some types from int to int32_t, for example, or removing void pointers completely. There are still tons of pointer-to-integer cast warnings that are probably critical, but I could not investigate further. The main problem here is that the mmal and vcos code bases (in interfaces) are definitely not 64bit compatible, for the reasons above. Another issue is that mmal has a dependency (khronos) where 32bit assembly was used: /interface/khronos/common/khrn_int_hash_asm.s. I am not familiar enough with 32bit or 64bit ARM assembly to convert this. But maybe this file could be excluded? Well, I am glad that there are more people interested in doing this! Greetings! |
If you can make your code available somewhere then I might be able to have
a look.
Don't worry about mmal for now, as it requires vchiq. vchiq kernel driver
is the place to start.
…On Tue, 31 Jul 2018, 09:03 Konstantin Wachendorff, ***@***.***> wrote:
Thanks for the description!
I got most of it compiled in 64bit removing some assertions and changing
some types from int to int32_t for example or removing void pointers
completely. There are still tons of pointer to integer casts warnings that
are probably critical, but I could not investigate further.
The main problem here is that mmal and vcos code base (in interfaces) are
definitely not 64bit compatible because of above reasons. Another issue is
that mmal has a dependency (khronos) where 32bit assembly was used
/interface/khronos/common/khrn_int_hash_asm.s. I am not that familiar with
32bit nor 64bit arm assembly to convert this. But maybe this file could be
excluded??
Well I am glad that there are more people interested in doing this!!
Greetings!
|
I have deleted all of it because I was stuck on the assembly file. But it wasn't that much work, I just changed the CMakeLists so it would compile everything with an aarch64 compiler (just remove the if not arm64), downloaded the newest linaro toolchain from here and fixed the errors on the go. I hope you have more time, patience and skill than I do! :) |
The assembly issue is trivial - there is a C implementation. Just remove:
from |
So as far as I can tell, you can currently use a 32-bit userland with a 64-bit kernel, and everything that I would expect to work does work (omxplayer, glmark2-es2-dispmanx, etc). Is this just a coincidence of the fact that the userland is always going to be putting a 32-bit pointer into the ostensibly 64-bit |
@popcornmix I guess I missed that... To your question, I don't know but I guess that the original authors maybe didn't plan to make it 64 bit in the first place ... because they might have expected to change the VC or so anyway so it was not necessary to take precautions for 64 bit |
Sorry my question was really directed @luked99 heh, should have made that explicit… |
On 1 August 2018 at 20:37, Robert Thompson ***@***.***> wrote:
Sorry my question was really directed @luked99 heh, should have made that explicit…
Well, I'm a bit surprised at your finding, but an ounce of experience
is worth a pound of theory(*)!
If you do "cat /dev/vchiq" it should give you a list of the services
(I know this should be in debugfs...). If that has something sensible
then it means that vchiq is working 64bit, which makes life much
easier.
In that case, fixing mmal might be just a matter of patching up the
structure definitions, and perhaps doing something as crude as a
lookup table to map from 64 bit address to 32 bit context (ugly, but I
suspect performance might well be fine). Otherwise we have to make
vchiq work 64 bit but I think that should still be OK.
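
The "crude lookup table" mentioned above might look something like the following C sketch. The names (`handle_table_t`, `handle_put`, `handle_take`) and the fixed slot count are assumptions made for illustration; nothing here is taken from the MMAL or VCHIQ sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HANDLE_SLOTS 64u  /* arbitrary size chosen for the sketch */

/* Hypothetical table mapping small 32-bit handles to full-width pointers. */
typedef struct {
    void *slots[HANDLE_SLOTS];
} handle_table_t;

/* Store p and return a nonzero 32-bit handle, or 0 if p is NULL or the
 * table is full. Handle 0 is reserved to mean "invalid". */
uint32_t handle_put(handle_table_t *t, void *p)
{
    if (p == NULL)
        return 0;
    for (uint32_t i = 0; i < HANDLE_SLOTS; i++) {
        if (t->slots[i] == NULL) {
            t->slots[i] = p;
            return i + 1u;
        }
    }
    return 0;
}

/* Resolve a handle back to its pointer and release the slot.
 * Returns NULL for an invalid or already released handle. */
void *handle_take(handle_table_t *t, uint32_t handle)
{
    if (handle == 0 || handle > HANDLE_SLOTS)
        return NULL;
    void *p = t->slots[handle - 1u];
    t->slots[handle - 1u] = NULL;
    return p;
}
```

The linear scan is deliberately simple. With a small number of in-flight contexts it is likely cheap enough, which matches the guess above that performance might be fine. A real implementation would also need locking if handles are created and resolved on different threads.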
The place where it will start getting tricky is if we ever have a 64
bit Raspberry Pi with more than 1GB of physical memory - at that point
I think the 32 bit VideoCore (combined with its various cache
aliases) won't be able to address all of the available memory. But
we're not at that point yet.
I'm on vacation right now so I can't really do anything other than
theorize, sorry!
Luke
(*) I should use SI units, I know, sorry.
|
@luked99 indeed:
and
So it looks to me like the kernel side of this is already taken care of? |
@luked99 Don't worry about vcsm at the moment. There's a new version in the pipeline that replaces the reloc heap with CMA allocations made on behalf of the VPU and mem_wrapped into a MEM_HANDLE_T. There's also a V4L2 codec driver in progress, so that reduces MMAL to only being required for a couple of tasks. |
Hi, is there any update on this? I managed to compile it all without errors. However, I believe mmal does not work. Is there a way to make it work? |
I also found GLES/EGL have issues... first time I do eglSwapBuffers it works but after a few frames I get a segfault. This does not happen on 32 bits build. Any idea? |
We are not currently doing any dev work on 64bit builds, and don't use them in house, so I'm afraid I have no idea about the EGL issue. Without any sort of details on the fault it will also be very difficult to determine the cause of the issue. |
The firmware GLES / EGL drivers will never be updated for 64 bit systems - please use the vc4 KMS drivers instead (those should already support 64 bit). OpenMax IL is very unlikely to get any 64bit love - it's a hideous API to work with, and MMAL offers better functionality. MMAL still needs some work, and that is the one bit that may be tackled. vcsm is being rewritten. That should cover the majority of the userland code. |
I was looking for a solution for Raspberry Pi 3 running Ubuntu 18.04 64-bit. I tried for a while, but I was never able to get userland to compile/run properly on 32-bit and 64-bit. I decided to give V4L2 a try since others were reporting some success. I was able to get it up and running fairly quickly. (Wish I had tried that 8 hours ago.) Below is the info that helped me the most, for anyone trying to do the same:
Below are the two pages that were most helpful in finding this solution: |
Greetings, any update on the progress of this issue? |
Many of the userland apps build for 64bit already. What specific feature are you after? |
Greetings,
I'm looking for transcoding support using ffmpeg with command such as this: ffmpeg -i - -f mpegts -c:a copy -c:v h264_omx -b:v 600k -maxrate 600k -bufsize 1M pipe:1 |
Look at my reply above. You can use V4L2 on a 64-bit Linux, and it works with FFMPEG. |
Unless others choose to do it, then OMX will not be ported to 64bit. Technically it's not complex, but anything handling OMX becomes so. Video codecs are available via the V4L2 stateful codec API, so using h264_v4l2m2m should do what you want. MMAL is in the process of being fixed up for 64bit (#586). Generally speaking it is working well, but there appear to be some odd corner cases to clear up. TBH once MMAL is fully working and merged I'll be closing this issue as all the features will then be available on 64bit systems. |
Thanks for the explanation, I'll use the 32 bit version for now until this issue is solved. |
For transcode the V4L2 codecs are your solution and already work on 64bit systems, so your issue is solved. |
Does V4L2 support USB DTV dongles? |
V4L2 supports:
See the kernel documentation. https://www.kernel.org/doc/html/latest/media/intro.html |
good to know, now I just need to know how to make ffmpeg work with it. Thanks. |
As per the earlier comment, FFmpeg has support. The codec is h264_v4l2m2m. |
ok, will check it out, thanks |
I've started looking into it. First I noticed that for arm64, the main CMake txt file disables BUILD_MMAL, which is what I need. If it is supported, why is it disabled? |
Which bit of my earlier comment did you miss?
Why are you trying to build for the native host system? |
that part :) I assumed it was ready, that explains it.
I didn't say I'm building for native, I pass the --aarch64 switch. |
do I need to merge the changes from 64bit_mmal branch? |
#586 is now merged. @popcornmix are there any other apps or libraries from userland that you view as being needed for 64bit, or shall we close this? The one that might be worth checking is vcdbg, but that's only in the firmware tree rather than userland. I've just made a couple of changes to it for other reasons, so will look to see if it fixes up. (Looks to be only one command that needs disabling, so hopeful). |
I've checked. vcdbg is more involved to get to work on 64bit. Nothing too tricky, but messy enough not to be a 5 minute task. One to put on the backlog for now. |
question, building latest rpi-uland on aarch64 requires libbrcmEGL.so, libbrcmGLESv2.so and libkhrn_client.a. |
found the issue, here is the patch. |
@daggs1 Can you submit that as a Pull Request? |
done, #602 |
@6by9 This is what I did to get vcdbg working on arm64/ubuntu: raspberrypi/firmware#1118 (comment) |
I've got a RPi3 with a test 64-bit kernel + userland setup going, and tried to compile the VideoCore userland, without success. First obstacle was:
Stripping out -Werror allowed it to continue, leading it to:
followed by many more errors for khrn_int_hash_asm.s. Might be more problems after that's cleared.