README.md: added performance options (resolves #25) #27

Merged
merged 1 commit into from
Feb 5, 2014

Conversation

KristofRobot

I propose to update the README.md with recommended performance options.

@ebutera

ebutera commented Jan 23, 2014

I'm not sure that we should recommend thumb2; afaik nobody uses it and its performance gain is uncertain.
The rest looks good to me.

@naguirre

Thumb2 is for older CPUs where flash space has to be saved, but it's no longer used these days.

About hardfloat support: Raoul recently pointed out to me that it's enabled by default in meta-fsl-arm for i.MX6 processors [1]. So in the end I don't see why we couldn't enable it by default!

What do you think?

[1] https://github.com/Freescale/meta-fsl-arm/blob/master/conf/machine/include/imx-base.inc#L34

@ebutera

ebutera commented Jan 23, 2014

I didn't check, but I suppose they force hf because they ship some hf-only binary blobs (gpu, vpu...).

So we should have a good reason to do that too: a noticeable performance gain or, for example, gpu binaries.
I have never used the mali libs, so I don't know whether they require hard or soft fp, and I don't have data to choose between hard and soft.

@KristofRobot can you re-run your benchmarks switching just hard/soft float (without thumb or vfpv4)?
Remember to set the cpu to max frequency to rule out governor differences (set min frequency to the max available frequency).

@KristofRobot
Author

@ebutera @naguirre

Do you perhaps have some links/references that indicate that thumb2 is "not commonly used"?

I am no expert on this, but if I google for thumb2, it seems like a good idea to activate it - see [1], [2], [3].

[1] http://www.cs.uiuc.edu/class/fa05/cs433ug/PROCESSORS/Thumb2.pdf
[2] http://www.cnx-software.com/2011/04/22/compile-with-arm-thumb2-reduce-memory-footprint-and-improve-performance/
[3] http://stackoverflow.com/questions/15846737/arm-thumb-thumb-2-perfomance

@KristofRobot
Author

can you re-run your benchmarks switching just hard/soft float (without thumb or vfpv4)?

Sure, will try to compare DEFAULTTUNE ?= "armv7a-neon" with DEFAULTTUNE ?= "armv7ahf-neon", and report back in issue #25
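
For reference, a minimal sketch of how the two runs could be set up. It assumes the setting goes into `conf/local.conf` in the build directory and that the machine configuration does not assign DEFAULTTUNE more strongly; neither assumption is confirmed in this thread. Only one of the two lines should be active per build:

```
# conf/local.conf (assumed location, inside the build directory)

# Run 1: ARMv7-A with NEON, without the hard-float ABI
DEFAULTTUNE ?= "armv7a-neon"

# Run 2: ARMv7-A with NEON, hard-float ABI ("hf")
# Comment out the line above and enable this one, then rebuild.
#DEFAULTTUNE ?= "armv7ahf-neon"
```

Switching tunes changes the package architecture, so expect the target packages to be rebuilt between the two runs.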

@KristofRobot
Author

Just updated #25 with that comparison. In fact there is no difference (at least not with linpackc). It seems that without vfp, hf does not make any difference.

@KristofRobot
Author

As explained in #25, there is now a patch in oe-core to support vfpv4 specific tuning options. I'll update this pull request to reflect that, recommending to use DEFAULTTUNE = cortexa7thf-neon-vfpv4.

Any further input on the thumb discussion? In my simple performance tests I did not see any significant difference between thumb or no thumb - but googling it, activating thumb2 seems to make sense (see links posted above) - so I'd propose to recommend it, unless someone can point to some references that suggest that it is indeed not used/useful.

Or other comments?

Thanks!

@naguirre

naguirre commented Feb 3, 2014

As far as I know, Thumb2 is ARM's mixed 16/32-bit instruction set. It's only
there to produce smaller code, and the resulting code is less optimized for
performance. I have never seen the Thumb2 instruction set used on Linux; it was
always on microcontrollers, where code size matters. In our case, with a
huge CPU and a lot of NAND, I don't think we need the Thumb2 instruction set.

From my point of view, the NEON instruction set is much more interesting
regarding performance.

Regarding the links you posted, I don't see where the performance
improvement you are talking about is.

I guess you should ask your question on the linux-sunxi mailing list
directly. You may get a better answer than mine ;)

Nicolas Aguirre
Mail: aguirre.nicolas@gmail.com
Web: http://enna.geexbox.org
Blog: http://dev.enlightenment.fr/~captainigloo/

@KristofRobot
Author

Thanks for the answer.

In the meantime I found two additional references, one from ARM, suggesting that thumb2 comes with a 2% performance loss, but 26% space gain - [1].

Another post suggests that Thumb-2 is preferred for everything but performance-critical or system code [2].

But I think popping the question to the linux-sunxi mailing list is a great suggestion, will do that.

And yes, I fully agree that the neon-vfpv4 optimization is the significant one - but while I'm at it, I figured I could as well try to come up with a sensible thumb recommendation ;)

Thanks!

Kristof

[1] http://elinux.org/images/8/8a/Experiment_with_Linux_and_ARM_Thumb-2_ISA.pdf
[2] http://stackoverflow.com/questions/11062936/gcc-mthumb-against-marm

@ebutera

ebutera commented Feb 3, 2014

I have never seen thumb enabled in other big projects, so I'd start by recommending cortexa7hf-neon-vfpv4; if we find some useful cases where thumb2 really makes sense, we can always update it.
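
As a sketch, that recommendation would come down to a single line. This assumes it is placed in `conf/local.conf` (my assumption, not stated in this thread) and that the oe-core checkout already contains the vfpv4 tuning patch mentioned in #25:

```
# conf/local.conf (assumed location, inside the build directory)
# Cortex-A7 tune: hard-float ABI, NEON and VFPv4, no Thumb
DEFAULTTUNE = "cortexa7hf-neon-vfpv4"
```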

@KristofRobot
Author

Agreed - I updated my pull request along those lines (removing the thumb2 part).

Let me know if there are any other comments.

naguirre added a commit that referenced this pull request Feb 5, 2014
README.md: added performance options (resolves #25)
@naguirre naguirre merged commit 0a3c6d4 into linux-sunxi:master Feb 5, 2014
@KristofRobot KristofRobot deleted the CPU branch February 23, 2014 11:44