Releases: Joachim26/StockfishNPS
Update 24/05/13: Some tournaments between SFNNv6.4_240509_MC_armv8 and Stockfish_dev.
Tournaments between SFNNv6.4_240509_MC_armv8 and SF_dev_240505 (aka SFNNv9.3_240509) were performed on Android (Snapdragon 662). They show convincingly that the first guess of Smallnet Threshold = 1750 is good enough to beat Monster Dimension 3072 Stockfish by up to 150 Elo 😂.
The monster-dimensional net seems to be constantly stuck somewhere between accumulator, cache, and RAM...
Seriously again: the current tri-net code probably performs worse on Android than the code with 2 nets, for two reasons:
- The L1=256 nn-90xxxxxxxx.nnue v4 net works very well on Android but hardly at all on Windows. In SFNN6.4 on Android the v4 net therefore takes over the role that the Mediumnet has on Windows and is also fast enough to replace the v3 net.
- The current code does NOT support Finny tables for the Mediumnet, so the Mediumnet is clearly too slow. The SF devs are still constantly "tinkering" with this code, so perhaps the problem will resolve itself without my involvement.
For the time being I am therefore setting the tri-net aside.
Note: SFNNv9.3_240509 with Smallnet Threshold = 0 is SFnps_240509 or,
in other words, SF_dev_240505. Speed is probably a few percent higher
than that of the official pre-built binary.
SFNNv6.4_240509 played all five matches with Smallnet Threshold = 1750.
--------------------------------------------------
TC: 10+0.1s Concurrency: 6
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 330 - 87 - 183 [] 600
Elo difference: 149.26 +/- 24.26, LOS: 100.00 %, DrawRatio: 30.50 %
Ptnml: WW WD DD/WL LD LL
Distr: 83 108 84 19 6
--------------------------------------------------
TC: 10+0.1s Concurrency: 3
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 261 - 108 - 231 [] 600
Elo difference: 90.59 +/- 22.05, LOS: 100.00 %, DrawRatio: 38.50 %
Ptnml: WW WD DD/WL LD LL
Distr: 45 107 108 36 4
--------------------------------------------------
TC: 25+0.25s Concurrency: 4
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 178 - 97 - 225 [] 500
Elo difference: 56.78 +/- 22.64, LOS: 100.00 %, DrawRatio: 45.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 18 89 102 38 3
--------------------------------------------------
TC: 60+0.6s Concurrency: 4
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 97 - 52 - 151 [] 300
Elo difference: 52.51 +/- 27.72, LOS: 99.99 %, DrawRatio: 50.33 %
Ptnml: WW WD DD/WL LD LL
Distr: 9 48 73 19 1
--------------------------------------------------
TC: 180+1s Concurrency: 4
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 45 - 32 - 73 [] 150
Elo difference: 30.19 +/- 39.97, LOS: 93.08 %, DrawRatio: 48.67 %
Ptnml: WW WD DD/WL LD LL
Distr: 0 27 34 14 0
--------------------------------------------------
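The "Elo difference" and "DrawRatio" lines in the reports above can be reproduced from the win-loss-draw counts. The following is a minimal sketch, assuming the usual logistic Elo model; the function names are illustrative and not part of FastChess.

```python
import math

def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match score under the logistic model.

    Not defined for a score of exactly 0 % or 100 %.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points won
    return -400.0 * math.log10(1.0 / score - 1.0)

def draw_ratio(wins: int, losses: int, draws: int) -> float:
    """Share of drawn games in percent."""
    return 100.0 * draws / (wins + losses + draws)

# First match above (TC 10+0.1s, Concurrency 6): 330 - 87 - 183
print(f"{elo_from_wld(330, 87, 183):.2f}")   # 149.26
print(f"{draw_ratio(330, 87, 183):.2f} %")   # 30.50 %
```

The error margin after "+/-" is not covered by this sketch.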
24/05/09
SFNNv6.4_240509_MC_armv8, SFNNv9.4_240509_MC_armv8, and SFNNv9.3_240509_MC_armv8 are uploaded.
Note that tests will be performed to find good Smallnet Threshold values. While for the first two engines the current default value of 1750 may not be completely off, for the last (v9.3) engine the value should be set to 0! This engine is then just SFnps (or Stockfish). As I have mentioned earlier, Smallnet Threshold is hardware dependent and corresponds to the Mediumnet Threshold of the tri-net Windows engines.
Benchmarks of the three uploaded modern builds of SFnps, SFNNv9.5.3, and SFNNv9.6.3:
The identical "signatures" of the three engines, which are actually the number of nodes searched in the benchmark, show that the two tri-net engines play completely identically to SFnps and Stockfish master. Of course, this is only the case because Mediumnet Threshold = 0 is now the default value. The measured nps values are likewise nearly identical. From about TC=120+1.2s at the latest, I would play the two tri-nets only with MnTh = 0. This has changed since the Finny tables, because at the moment, officially, no net size other than the v9 Bignet is Finny-accelerated. So for shorter TCs up to at most TC=120+1.2s, perhaps set MnTh to 500 ms (best to test it yourself, which I have not actually done yet); the old value of 1200 is now much too high! I will have to see whether the upcoming updates support all L1=2^x nets. If so, I will of course update; if not, I may just drop the whole thing and do something more interesting.
Oh, and by the way, I am still relatively optimistic that my proposal with the second, smaller and faster net could work. NNs also store information (knowledge / chess knowledge) in bits, and their number is always limited, so irrelevant knowledge has to be left out and less important knowledge limited as far as possible. That two separate nets can also be packed into one net with a suitable architecture is trivial in principle; the implementation is not quite so trivial. At least for initial testing, two separate nets are therefore better suited.
One final remark: instead of a smaller and faster net, one can also train a net of the same size and speed exclusively on positions with less material (e.g. 16 pieces) and, during the game, switch to this net from the move at which this material count is reached. Again, this can also happen internally by means of a suitable net architecture. This net needs no knowledge about positions with 32 to 17 pieces, nor is it trained on them; it has seen more relevant positions and has more storage available to hold what it learned from them. The only obvious drawback is the non-optimal TT state after the one-time net switch. But that's enough on this 🥴.
24/04/25
Updating SFNNv9.6.3 to Finny tables surprisingly worked with relatively few problems; however, at the moment the engine only runs without accumulator caching for NNv5. I will first wait until the SF devs come up with a solution to the problem for the v3 net.
24/04/24
From now on I am writing in German. For me this has only advantages; in particular, my time investment scales much better with the quality of what is written. Maybe I will even open the Discussion tab, in which case German and Denglisch will be mandatory. After all, there is Google Translate and above all Aria, which translates fantastically. Want an example:
Of course I can help you translate the text into English. Here is the translation:
"Can you please translate this question and the following text into English?
I will now (better: "from now on", but without context Aria cannot know that) write in German. It has only advantages for me, especially because my time investment scales much better with the quality of the writing. Maybe I will even open the Discussion tab, where German and Denglish are then mandatory. There is Google translate and above all Aria, which translates fantastically. Need an example?"
Please note that I am an AI-based language model bot and my translations are based on machine learning. Although I do my best to provide accurate translations, they may sometimes differ from human translations.
If anyone does not know Aria, Google will surely help.
I have just discovered that the text above is not entirely unambiguous, so: I will write only in German from now on, and anyone who does not understand it (but would like to) has to take care of the translation themselves. With ChatGPT, Aria, or whatever😉.
In total and in the longer run, the language switch will probably reduce CO2 emissions, and I am thereby contributing to saving the world. This is most, most certainly not meant ironically😉🤔. But that's enough for today 😁.
24/04/20
Both Windows versions of SFNNv9.5.3_240420 are now released with a new default Mediumnet threshold of 1200. The old value of 300 was just a (dumb and wrong 😂) guess and completely off... For both Windows engines I got warnings from the stupid MS Defender. Just ignore them (as I have done since the 10th false positive) or check the files on VirusTotal.
Use 1200 in SFNNv9.5.3_240410, or even run self-play tests with different values to find a good MnTh parameter (finding the optimal one takes tens of thousands of games). In any case, the optimal value of the Mediumnet threshold (MnTh) depends on hardware (not that much within the same OS) and also on TC. Both dependencies could be minimized by empirical formulas; however, hundreds of thousands of games would be needed🤔. Thus MnTh is currently a UCI parameter.
However, with one(!) good MnTh value Stockfish dev can easily be beaten at STC and LTC. Here are the three tests with MnTh=1200 on my mini PC with a Celeron N5095 (modern builds):
Important note: In the following Windows tests SFNNv9.5.3_240414_0
(identical to SFNN...18/20_0) is playing with Mediumnet threshold=0,
is thus playing exclusively with the SFNNv9 Bignet and is thus
perfectly simulating Stockfish dev of that date (same bench!). The
only difference may be a maximum slowdown of 2% corresponding to about
only 2 Elo which is much smaller than the following Elo differences.
--------------------------------------------------
TC: 10+0.1s
Score of SFNNv9.5.3_240418_1200 vs SFNNv9.5.3_240414_0: 283 - 220 - 497 [] 1000
Elo difference: 21.92 +/- 15.26, LOS: 99.75 %, DrawRatio: 49.70 %
Ptnml: WW WD DD/WL LD LL
Distr: 15 138 251 87 9
--------------------------------------------------
Finished tournament.
--------------------------------------------------
TC: 25+0.25s
Score of SFNNv9.5.3_240418_1200 vs SFNNv9.5.3_240414_0: 168 - 120 - 312 [] 600
Elo dif...
SFnps 16.1 releases
Which nets are needed by an engine can now be seen in the uci-options of the engine. As usual, for Windows the two nets (for SFS only one!) should be copied to the engine directory while for Droidfish the correct place is the logs-folder.
Update 2024-02-10: Several Windows builds
Source codes of all future engine series, all with nps-features, were rebased and somewhat renewed:
- SFnps remains unchanged: With default settings it is playing like a 100% Stockfish clone.
- SFNNv6 series replaces SFM which is discontinued: Larger v6 net and thus stronger, the "D" was omitted since Dual-Net is now mainstream in Stockfish master.
- SFS was already updated to SFS256 (download armv8), which gained a lot of Elo with a new L1_256 net. Now SFS_240123 is a bit stronger than the last SFS256 and is thus published.
- SFDS uses two small nets L1_256 (same as SFS) and L1_128 (same as SF master). At the moment weaker than SFS256.
Note: I really dislike embedded nets (e.g. absolutely unnecessary CO2 emissions due to downloads of dev releases and clones), so most of my releases will not contain the two nets. Since the introduction of the Dual Nets an additional problem exists: The (great) SF devs COMPLETELY hide the small net from the (naive and non-programming) user, even the small net name. So for all downloads down below the nets are as follows:
Nets for SFnps:
nn-baff1edbea57.nnue and nn-baff1ede1f90.nnue
Nets for SFNNv6:
nn-a3d1bfca1672.nnue and nn-baff1ede1f90.nnue
Nets for SFDS:
nn-9067e33176e8.nnue and nn-baff1ede1f90.nnue
(Single) net for SFS (and SFS256nps):
nn-9067e33176e8.nnue
As usual, nets on Windows in the engine folder, for Droidfish in the /logs folder.
Note: "nps" was removed from most new engine names, however, nps features are still in all engines.
Windows update tours (modern [SSE4.1+POPCNT] builds):
-------------------------------------------------------------------
TC 1+0.01s
Score of SFNNv6_240119 vs SFDNNv6_240114: 279 - 253 - 468 [] 1000
Elo difference: 9.04 +/- 15.70, LOS: 87.02 %, DrawRatio: 46.80 %
Ptnml: WW WD DD/WL LD LL
Distr: 29 122 217 110 22
-------------------------------------------------------------------
TC 2.5+0.025s
Score of SFNNv6_240119 vs SFDNNv6_240114: 260 - 247 - 493 [] 1000
Elo difference: 4.52 +/- 15.32, LOS: 71.81 %, DrawRatio: 49.30 %
Ptnml: WW WD DD/WL LD LL
Distr: 9 132 234 113 12
-------------------------------------------------------------------
TC 5+0.05s
Score of SFNNv6_240119 vs SFDNNv6_240114: 247 - 240 - 513 [] 1000
Elo difference: 2.43 +/- 15.02, LOS: 62.45 %, DrawRatio: 51.30 %
Ptnml: WW WD DD/WL LD LL
Distr: 12 126 229 123 10
-------------------------------------------------------------------
TC 10+0.1s
Score of SFNNv6_240119 vs SFDNNv6_240114: 250 - 215 - 535 [] 1000
Elo difference: 12.17 +/- 14.67, LOS: 94.77 %, DrawRatio: 53.50 %
Ptnml: WW WD DD/WL LD LL
Distr: 7 140 239 109 5
-------------------------------------------------------------------
TC 20+0.2s
Score of SFNNv6_240119 vs SFDNNv6_240114: 178 - 167 - 455 [] 800
Elo difference: 4.78 +/- 15.80, LOS: 72.31 %, DrawRatio: 56.88 %
Ptnml: WW WD DD/WL LD LL
Distr: 3 103 198 94 2
-------------------------------------------------------------------
TC 60+0.6s
Score of SFNNv6_240119 vs SFDNNv6_240114: 52 - 36 - 112 [] 200
Elo difference: 27.85 +/- 31.96, LOS: 95.60 %, DrawRatio: 56.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 0 36 44 20 0
-------------------------------------------------------------------
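The Ptnml/Distr lines count game pairs rather than single games, so they can be cross-checked against the score line. The following is a minimal sketch, assuming each pair consists of two games and the five buckets WW, WD, DD/WL, LD, LL are worth 2, 1.5, 1, 0.5, and 0 points; the helper name is illustrative.

```python
def points_from_ptnml(ww: int, wd: int, dd_wl: int, ld: int, ll: int) -> tuple[int, float]:
    """Return (games, points of the first engine) implied by a pentanomial distribution."""
    pairs = ww + wd + dd_wl + ld + ll
    points = 2.0 * ww + 1.5 * wd + 1.0 * dd_wl + 0.5 * ld  # LL pairs score 0
    return 2 * pairs, points

# TC 1+0.01s above: score 279 - 253 - 468, Distr 29 122 217 110 22
games, points = points_from_ptnml(29, 122, 217, 110, 22)
print(games, points)      # 1000 513.0
print(279 + 0.5 * 468)    # 513.0, the same points from the W-L-D line
```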
What's Changed
Full Changelog (for SFnps240119): https://github.com/Joachim26/StockfishNPS/commits/FirstRebasedSourcesRelease
Windows SFDNNv6_240102 updated to SFDNNv6_240114: 10+ ELO!
Update tours (modern [SSE4.1+POPCNT] builds): SFDNNv6_240102 vs SFDNNv6_240114
-------------------------------------------------------------------
TC 2.5+0.025s
Score of SFDNNv6_240102 vs SFDNNv6_240114: 116 - 145 - 239 [] 500
Elo difference: -20.17 +/- 22.00, LOS: 3.63 %, DrawRatio: 47.80 %
Ptnml: WW WD DD/WL LD LL
Distr: 5 60 100 71 14
-------------------------------------------------------------------
TC 15+0.15s
Score of SFDNNv6_240102 vs SFDNNv6_240114: 626 - 718 - 1656 [] 3000
Elo difference: -10.66 +/- 8.31, LOS: 0.60 %, DrawRatio: 55.20 %
Ptnml: WW WD DD/WL LD LL
Distr: 11 320 753 398 18
-------------------------------------------------------------------
Great minus signs: +20 and +11 Elo in favor of the updated SFDNNv6_240114.
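The LOS (likelihood of superiority) values in these reports depend only on wins and losses. The following is a minimal sketch, assuming the widely used normal approximation; it reproduces the values printed above, but it is not taken from the FastChess source, and the function name is illustrative.

```python
import math

def los_percent(wins: int, losses: int) -> float:
    """Likelihood of superiority in percent; draws do not enter."""
    z = (wins - losses) / math.sqrt(2.0 * (wins + losses))
    return 50.0 * (1.0 + math.erf(z))

# The two update tours above, seen from the old engine's side
print(f"{los_percent(116, 145):.2f} %")   # 3.63 %
print(f"{los_percent(626, 718):.2f} %")   # 0.60 %
```

A LOS near 0 % for the old version is the same statement as a LOS near 100 % for the updated one.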
Just another series of tournaments: SFDNNv6_240102(_231231) vs SFnps240110
TC 1+0.01s
----------------------------------------------------------------------------
Score of SFDNNv6_231231 vs SFnps240110: 362 - 225 - 413 [] 1000
Elo difference: 47.90 +/- 16.53, LOS: 100.00 %, DrawRatio: 41.30 %
Ptnml: WW WD DD/WL LD LL
Distr: 51 159 181 94 15
----------------------------------------------------------------------------
Finished tournament.
TC 2.5+0.025s
----------------------------------------------------------------------------
Score of SFDNNv6_231231 vs SFnps240110: 166 - 116 - 218 [] 500
Elo difference: 34.86 +/- 22.90, LOS: 99.85 %, DrawRatio: 43.60 %
Ptnml: WW WD DD/WL LD LL
Distr: 20 79 93 47 11
----------------------------------------------------------------------------
Finished tournament.
TC 5+0.05s
----------------------------------------------------------------------------
Score of SFDNNv6_231231 vs SFnps240110: 139 - 104 - 257 [] 500
Elo difference: 24.36 +/- 21.22, LOS: 98.76 %, DrawRatio: 51.40 %
Ptnml: WW WD DD/WL LD LL
Distr: 12 68 118 47 5
----------------------------------------------------------------------------
Finished tournament.
TC 10+0.1s
----------------------------------------------------------------------------
Score of SFDNNv6_231231 vs SFnps240110: 139 - 104 - 257 [] 500
Elo difference: 24.36 +/- 21.22, LOS: 98.76 %, DrawRatio: 51.40 %
Ptnml: WW WD DD/WL LD LL
Distr: 8 77 110 52 3
----------------------------------------------------------------------------
Finished tournament.
**SFDNNv6 updated!**
TC 25+0.25s
Score of SFDNNv6_240102 vs SFnps240110: 99 - 97 - 304 [] 500
Elo difference: 1.39 +/- 19.06, LOS: 55.68 %, DrawRatio: 60.80 %
Ptnml: WW WD DD/WL LD LL
Distr: 2 55 138 53 2
----------------------------------------------------------------------------
Finished tournament.
TC 120+1s
----------------------------------------------------------------------------
Score of SFDNNv6_240102 vs SFnps240110: 17 - 9 - 28 [] 54
Elo difference: 51.85 +/- 64.93, LOS: 94.17 %, DrawRatio: 51.85 %
Ptnml: WW WD DD/WL LD LL
Distr: 0 9 17 1 0
----------------------------------------------------------------------------
Started game 57 of 500 (SFDNNv6_240102 vs SFnps240110)
Finished tournament.
Saved results.
TC 120+1s (same as last)
----------------------------------------------------------------------------
Score of SFDNNv6_240102 vs SFnps240110: 72 - 54 - 174 [] 300
Elo difference: 20.87 +/- 25.47, LOS: 94.56 %, DrawRatio: 58.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 1 44 77 28 0
----------------------------------------------------------------------------
Started game 303 of 500 (SFDNNv6_240102 vs SFnps240110)
Finished game 301 (SFDNNv6_240102 vs SFnps240110): 1-0 {SFnps240110 got checkmated}
Started game 304 of 500 (SFnps240110 vs SFDNNv6_240102)
Finished tournament.
Saved results.
Command line used to start the last tour was:
d:\Programme\FastChess>FastChess -engine cmd=SFDNNv6_240102 name=SFDNNv6_240102 -engine cmd=SFnps240110
name=SFnps240110 -each tc=120+1 -rounds 250 -repeat -concurrency 3 -openings file=UHO_2022_6mvs_+110_+119.epd
format=epd order=random -pgnout notation=san file=XXGames.pgn nodes=true nps=true -draw movecount=8 score=8
movenumber=30
For the earlier tours the TC was changed, and SFDNNv6 was updated once (when I saw that SF master was coming closer). Tours
were performed on my mini PC (described elsewhere down below, happy searching...). So everything needed to repeat these tours is
now given; all raw tour data are already online.
This time there is a lot of scatter in the final results; in particular, the Elo difference at TC 25+0.25s is completely off.
So the short conclusion is: on my PC SFDNNv6 is stronger than Stockfish master when tournament durations are limited
(like < 12h).
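To repeat the tours with other TCs, the command line quoted above can be assembled programmatically. The following is a minimal sketch that uses only the options appearing in that command line; the Python function and its parameter names are illustrative assumptions, and the engine binaries and opening file are expected to be in the working directory.

```python
import shlex

def fastchess_command(engine_a: str, engine_b: str, tc: str, rounds: int,
                      concurrency: int, openings: str, pgn_out: str) -> list[str]:
    """Build the argument list for a two-engine match (options copied from the command above)."""
    return [
        "FastChess",
        "-engine", f"cmd={engine_a}", f"name={engine_a}",
        "-engine", f"cmd={engine_b}", f"name={engine_b}",
        "-each", f"tc={tc}",
        "-rounds", str(rounds), "-repeat",
        "-concurrency", str(concurrency),
        "-openings", f"file={openings}", "format=epd", "order=random",
        "-pgnout", "notation=san", f"file={pgn_out}", "nodes=true", "nps=true",
        "-draw", "movecount=8", "score=8", "movenumber=30",
    ]

cmd = fastchess_command("SFDNNv6_240102", "SFnps240110", "120+1", 250, 3,
                        "UHO_2022_6mvs_+110_+119.epd", "XXGames.pgn")
print(shlex.join(cmd))  # prints the command as a single line
```

The list can be passed to `subprocess.run(cmd, check=True)` to start the match; only the `tc` argument needs to change for the other tours.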
Older comments:
The two attached Win-binaries down below are currently participating in Jorge's two engine tournaments, see:
https://outskirts.altervista.org/forum/viewtopic.php?t=4472
SFDNNv6_240102_avx2.exe is 20 Elo stronger than
SFDNNv6_240102a_modern.exe and can be found in the 'latest release', just scroll down a bit. Older releases with many more downloads can be found below this 'latest release'.
Update 240116: Update tours for: SFDNNv6_240114_modern.exe, SFDNNv6_240114_avx2.exe, and SFDNNv6_240114_armv8 uploaded. Note that the armv8 build is not manually compiled and thus not very fast. On Archimedes' page faster SFDNNv6_240114 builds are now available, see links below.
Links to Archimedes' SFDNNv6_240114 downloads:
SFDNNv6_240114 for Android (OEX)
SFDNNv6_240114 for Android (zip)
In the next update SFDNNv6 will slightly change its name to SFNNv6, since D=Dual is mainstream now, as predicted somewhere down below. So the future engine name and its main net generation name will be identical: SFNNv6.
Windows TC=90+0.9s tour NewSFDNNv6 vs OldSFDNNv6 vs Raid v2.76i
----------------------------------------------------------------------------
Rank Name Elo +/- Games Points Score Draw TC
1 SFDNNv6_240102 22 25 300 159.5 53.2% 58.3% 90+0.9
2 SFDNNv6_231231 7 25 300 153.0 51.0% 58.0% 90+0.9
3 Raid v2.76i_X_sse41 -29 26 300 137.5 45.8% 57.0% 90+0.9
450 of 3600 games finished.
----------------------------------------------------------------------------
Note that SFDNNv6 is primarily made for Android but it is obviously also pretty strong on Windows😲! Additionally note that the last update of SFDNNv6 worked fine.
Details about the tournament soft- and hardware have been already described elsewhere in the release section ("Battle of the clones"):
https://github.com/Joachim26/StockfishNPS/releases/tag/Master_DroidSFnps-bb4c63b3
All tour data (games, configs,...) will be uploaded when tour is finished.
SFDNnps231206 for Android
with the SFNNv8 (big) EvalFile
nn-0000000000a0.nnue
and the L1-256 (small) EvalFile
ecb35f70ff2a.nnue
SFDNnps231206NNv6 for Android
with the last SFNNv6 (big) EvalFile
nn-a3d1bfca1672.nnue
and the L1-256 (small) EvalFile
ecb35f70ff2a.nnue
SFDNnps231214 for Android and
SFDNnps231215_modern.exe (the small EvalFile has to be in the engine folder)
with the SFNNv8 (big) EvalFile
nn-0000000000a0.nnue
and the L1-128 (small) EvalFile
nn-c01dc0ffeede.nnue
SFDNnps231214NNv6 for Android
with the last SFNNv6 (big) EvalFile
nn-a3d1bfca1672.nnue
and the L1-128 (small) EvalFile
nn-c01dc0ffeede.nnue
All following Android tournaments were played on a Xiaomi Poco M3 (Android 12, Snapdragon 662, 4(+2) GB RAM) using Termux and FastChess for Android. Concurrency is set to 4 and 1 thread per engine is used. Opening suite used is UHO_2022_8mvs_+110_+119.epd. Note that openings of this kind enlarge Elo differences and reduce draw rates significantly.
------------------------------------------------------------------
TC: 1+0.01s
Score of SFDNnps231206NNv6 vs SFDNnps231206: 501 - 213 - 286 [] 1000
Elo difference: 102.97 +/- 18.65, LOS: 100.00 %, DrawRatio: 28.60 %
Ptnml: WW WD DD/WL LD LL
Distr: 118 146 161 56 19
------------------------------------------------------------------
TC: 2.5+0.025s
Score of SFDNnps231206NNv6 vs SFDNnps231206: 275 - 96 - 229 [] 600
Elo difference: 106.90 +/- 22.21, LOS: 100.00 %, DrawRatio: 38.17 %
Ptnml: WW WD DD/WL LD LL
Distr: 49 119 100 26 6
------------------------------------------------------------------
TC: 5+0.05s
Score of SFDNnps231206NNv6 vs SFDNnps231206: 259 - 112 - 229 [] 600
Elo difference: 86.89 +/- 22.10, LOS: 100.00 %, DrawRatio: 38.17 %
Ptnml: WW WD DD/WL LD LL
Distr: 44 108 104 39 5
------------------------------------------------------------------
TC: 10+0.1s
Score of SFDNnps231206NNv6 vs SFDNnps231206: 149 - 71 - 180 [] 400
Elo difference: 68.63 +/- 25.34, LOS: 100.00 %, DrawRatio: 45.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 20 80 64 30 6
------------------------------------------------------------------
TC: 30+0.3s
Score of SFDNnps231206NNv6 vs SFDNnps231206: 66 - 40 - 94 [] 200
Elo difference: 45.42 +/- 35.16, LOS: 99.42 %, DrawRatio: 47.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 3 35 48 13 1
------------------------------------------------------------------
TC: 35+0.35s
Score of SFDNnps231206NNv6 vs SFDNnps231206: 112 - 66 - 222 [] 400
Elo difference: 40.13 +/- 22.67, LOS: 99.97 %, DrawRatio: 55.50 %
Ptnml: WW WD DD/WL LD LL
Distr: 6 70 90 32 2
------------------------------------------------------------------
TC: 45+0.45s concurrency=6
Score of SFDNnps231206NNv6 vs SFDNnps231206: 93 - 55 - 152 [] 300
Elo difference: 44.25 +/- 27.63, LOS: 99.91 %, DrawRatio: 50.67 %
Ptnml: WW WD DD/WL LD LL
Distr: 5 56 64 22 3
------------------------------------------------------------------
TC: 50+0.5s concurrency=6
Score of SFDNnps231206NNv6 vs SFDNnps231206: 95 - 49 - 156 [] 300
Elo difference: 53.70 +/- 27.22, LOS: 99.99 %, DrawRatio: 52.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 4 60 65 20 1
--------------------------------------------------
TC: 60+0.6s
Score of SFDNnps231206NNv6 vs SFDNnps231206: 54 - 42 - 104 [] 200
Elo difference: 20.87 +/- 33.41, LOS: 88.97 %, DrawRatio: 52.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 0 31 51 17 1
------------------------------------------------------------------
TC: 180+1s
Score of SFDNnps231206NNv6 vs SFDNnps231206: 36 - 29 - 93 [] 158
Elo difference: 15.40 +/- 34.80, LOS: 80.74 %, DrawRatio: 58.86 %
Ptnml: WW WD DD/WL LD LL
Distr: 0 20 46 13 0
------------------------------------------------------------------
Convincing victories of SFDNnps with the smaller SFNNv6 main net, so far at least, since nobody knows what the result would be at VVVLTCs. However, TCs longer than 180+1s will not be played since Termux tends to crash at such long TCs. A planned tour of 10 hours where Termux crashes after 6 or 9 hours and all data files are lost... no😊.
Although the error bars look quite large, the resulting curve looks quite smooth, except for the two points determined with concurrency=6. However, since fewer nodes per time interval are calculated under these conditions (4 p-cores + 2 e-cores have on average less performance per core than 4 p-cores, even without thermal throttling, which further reduces the average nps), the two points are shifted somewhat to the left and then fit the other points better.
BTW, the reason for the superiority of SFnpsNNv6 is, certainly, an about 50% higher nps-value (stored in the fcXX.pgn files for each move). Such high values are, at least on my phone, not much below nps-values of other strong Android engines like SFplusNPS, SFMnps, or SFMXnps, all three with the last SFNNv5 net. It is obvious, that tournaments between SFNNv5- and SFNNv6-engines should be made in future. Note that the nps difference between SFNNv5 and SFNNv6/8 engines will decrease when mean nps of the latter engines increases (e.g. due to smaller and faster small nets). This can already be observed for the 231224 update. I am sure that dual nets have high potential and will find their way into SF master, but this will take some time since several details of the upcoming patch are still under discussion.
Net-selection and nps-enhancement
SFnps16 uses the same net version SFNNv6 as SFDNnps231214v6 but the mate is not found after about 6 min:
SFDNnps231214v6 has found a mate after 17 s. The reason is the much higher nps, more than 3 times the nps of SFnps16. After 3 moves of the c-pawn the promotion to queen and the following moves result in very unbalanced positions and these positions are calculated, when called by the search function, with the small net or even with simple eval, which is both much faster than the calculation with the big net. Therefore the more than 3 times higher nps value:
When the mate m30 is found the nps has reached an even higher value of 2636 nps:
![SFDNnps231214v6](https://github...
SFS256/128nps tournaments
All following Android tournaments were played on a Xiaomi Poco M3 (Android 12, Snapdragon 662, 4(+2) GB RAM) using Termux and FastChess for Android. Concurrency is set to 6 and 1 thread per engine is used. Opening suite used is UHO_2022_8mvs_+110_+119.epd. Note that openings of this kind enlarge Elo differences and reduce draw rates significantly.
SFS256nps231206 for Android with L1-256 net nn-9067e33176e8.nnue
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS256nps231206 vs SFSnps16: 277 - 162 - 161 [] 600
Elo difference: 67.43 +/- 24.07, LOS: 100.00 %, DrawRatio: 26.83 %
Ptnml: WW WD DD/WL LD LL
Distr: 52 90 96 45 17
------------------------------------------------------------------
TC: 2.5+0.025s
Score of SFS256nps231206 vs SFSnps16: 237 - 151 - 212 [] 600
Elo difference: 50.14 +/- 22.46, LOS: 100.00 %, DrawRatio: 35.33 %
Ptnml: WW WD DD/WL LD LL
Distr: 29 98 117 42 14
------------------------------------------------------------------
TC: 5+0.05s
Score of SFS256nps231206 vs SFSnps16: 210 - 147 - 243 [] 600
Elo difference: 36.62 +/- 21.48, LOS: 99.96 %, DrawRatio: 40.50 %
Ptnml: WW WD DD/WL LD LL
Distr: 31 77 129 50 13
------------------------------------------------------------------
TC: 10+0.1s
Score of SFS256nps231206 vs SFSnps16: 175 - 132 - 293 [] 600
Elo difference: 24.94 +/- 19.88, LOS: 99.29 %, DrawRatio: 48.83 %
Ptnml: WW WD DD/WL LD LL
Distr: 14 90 129 59 8
------------------------------------------------------------------
TC: 30+0.3s
Score of SFS256nps231206 vs SFSnps16: 112 - 84 - 204 [] 400
Elo difference: 24.36 +/- 23.83, LOS: 97.72 %, DrawRatio: 51.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 2 65 95 35 3
------------------------------------------------------------------
TC: 60+0.6s
Score of SFS256nps231206 vs SFSnps16: 82 - 57 - 161 [] 300
Elo difference: 29.02 +/- 26.76, LOS: 98.30 %, DrawRatio: 53.67 %
Ptnml: WW WD DD/WL LD LL
Distr: 0 51 76 20 3
------------------------------------------------------------------
The new L1-256 net is way better than the old net used in SFS16nps, which still uses hybrid eval but, on the other side, is many patches behind SFS256nps. However, these differences can't explain the large ELO gap.
SFS128nps231207 for Android with L1-128 net nn-a378c9c91bb0.nnue
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS128nps231207 vs SFSnps16: 236 - 208 - 156 [] 600
Elo difference: 16.23 +/- 23.94, LOS: 90.80 %, DrawRatio: 26.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 43 52 126 48 31
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS128nps231207 vs SFSnps16: 218 - 208 - 174 [] 600
Elo difference: 5.79 +/- 23.43, LOS: 68.60 %, DrawRatio: 29.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 35 66 108 56 35
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS128nps231207 vs SFSnps16: 750 - 693 - 557 [] 2000
Elo difference: 9.90 +/- 12.92, LOS: 93.33 %, DrawRatio: 27.85 %
Ptnml: WW WD DD/WL LD LL
Distr: 128 211 357 198 106
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS128nps231207 vs SFSnps16: 713 - 653 - 634 [] 2000
Elo difference: 10.43 +/- 12.57, LOS: 94.77 %, DrawRatio: 31.70 %
Ptnml: WW WD DD/WL LD LL
Distr: 112 233 362 189 104
------------------------------------------------------------------
TC: 10+0.1s
Score of SFS128nps231207 vs SFSnps16: 130 - 213 - 257 [] 600
Elo difference: -48.37 +/- 21.06, LOS: 0.00 %, DrawRatio: 42.83 %
Ptnml: WW WD DD/WL LD LL
Distr: 4 54 123 93 26
------------------------------------------------------------------
TC: 10+0.1s concurrency=3
Score of SFS128nps231207 vs SFSnps16: 110 - 206 - 284 [] 600
Elo difference: -56.07 +/- 20.19, LOS: 0.00 %, DrawRatio: 47.33 %
Ptnml: WW WD DD/WL LD LL
Distr: 4 40 130 108 18
------------------------------------------------------------------
These are the results so far, statistical behaviour looks good:
At 1+0.01s the new half size net is about 10 Elo +/- 7 Elo better, however, at 10+0.1s the old L1=256 is more than 50 ELO stronger.
BTW, not a single time forfeit during the 5200 games at 1+0.01s and (only) 3 in the 1200 games with TC=10+0.1s.
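The "about 10 Elo" figure for 1+0.01s can be checked by pooling the four runs listed above. The following is a minimal sketch, assuming the logistic Elo model and simple pooling of all games, which ignores any run-to-run differences such as concurrency; the variable names are illustrative.

```python
import math

# (wins, losses, draws) of the four 1+0.01s runs of SFS128nps231207 vs SFSnps16
runs = [(236, 208, 156), (218, 208, 174), (750, 693, 557), (713, 653, 634)]

wins = sum(r[0] for r in runs)
losses = sum(r[1] for r in runs)
draws = sum(r[2] for r in runs)
games = wins + losses + draws

score = (wins + 0.5 * draws) / games          # pooled fraction of points
elo = -400.0 * math.log10(1.0 / score - 1.0)  # logistic Elo difference
print(games, f"{elo:.1f}")                    # 5200 10.4
```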
In the meantime Linrock has uploaded two more L1-128 nets. However, when used as solo nets, these small nets are even on Android too weak to be very interesting. Maybe if a small download size matters and strength < 2800 ELO is sufficient (e.g. human play on Lichess against local SF) these nets could be used.
23/10/24: SFMX and CFishNN tournaments
All following Android tournaments were played on a Xiaomi Poco M3 (Android 12, Snapdragon 662, 4(+2) GB RAM) using Termux and the CETSA script, which utilizes c-chess-cli. Concurrency is set to 4 and 1 thread per engine is used.
The absolute value of the rating is set to 3100 Elo for each tournament and the rating offsets and all other values are calculated by the CETSA script with Bayeselo. Opening suite used is UHO_2022_8mvs_+110_+119.epd. Note that openings of this kind enlarge Elo differences and reduce draw rates significantly.
Some selected Android tournament results
(Many more Android tourney results (filename "Bayeselo.txt"), games, and log- and config-files can be found
in the zip-file down below: AllMyAndroidTournamentsTill231022.zip)
TC: 10+0.1s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
---------------------------------------------------------------------------------------------------------
1 SFMX20231021 3138 0.0 11 11 500 304.0 60.8 189 81 230 37.8 46.0 3062
2 ShashChess 34 3062 75.9 11 11 500 196.0 39.2 81 189 230 16.2 46.0 3138
---------------------------------------------------------------------------------------------------------
TC: 30+0.3s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
---------------------------------------------------------------------------------------------------------
1 SFMX20231021 3127 0.0 11 11 500 288.5 57.7 169 92 239 33.8 47.8 3073
2 ShashChess 34 3073 53.9 11 11 500 211.5 42.3 92 169 239 18.4 47.8 3127
---------------------------------------------------------------------------------------------------------
TC: 60+0.6s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
---------------------------------------------------------------------------------------------------------
1 SFMX20231021 3109 0.0 14 15 250 131.5 52.6 65 52 133 26.0 53.2 3091
2 ShashChess 34 3091 18.8 15 14 250 118.5 47.4 52 65 133 20.8 53.2 3109
---------------------------------------------------------------------------------------------------------
TC: 180+1.0s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
---------------------------------------------------------------------------------------------------------
1 SFMX20231021 3110 0.0 32 32 50 26.5 53.0 14 11 25 28.0 50.0 3090
2 ShashChess 34 3090 19.9 32 32 50 23.5 47.0 11 14 25 22.0 50.0 3110
---------------------------------------------------------------------------------------------------------
TC: 180+1.0s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
---------------------------------------------------------------------------------------------------------
1 SFMX20231021 3112 0.0 19 19 100 53.5 53.5 21 14 65 21.0 65.0 3088
2 ShashChess 34 3088 23.8 19 19 100 46.5 46.5 14 21 65 14.0 65.0 3112
---------------------------------------------------------------------------------------------------------
With longer TCs Elo differences usually get smaller and smaller, since the weaker engine can draw
more and more games in a tournament.
However, I am pretty sure that SFMX remains stronger than ShashChess 34 even at very long TCs.
To prove this, it would take many hours or even several days to play a large enough
number of long games on the phone. This is too hard on the battery!
Now STC tours with former STC-Champion😉 CFishNN:
TC: 10+0.1s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
---------------------------------------------------------------------------------------------------------
1 SFMX20231021 3101 0.0 11 11 500 251.5 50.3 138 135 227 27.6 45.4 3099
2 CfishNN20230626 3099 3.0 11 11 500 248.5 49.7 135 138 227 27.0 45.4 3101
---------------------------------------------------------------------------------------------------------
TC: 10+0.1s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
---------------------------------------------------------------------------------------------------------
1 CfishNN20230626 3123 0.0 12 11 500 283.0 56.6 174 108 218 34.8 43.6 3077
2 ShashChess 34 3077 45.8 11 12 500 217.0 43.4 108 174 218 21.6 43.6 3123
---------------------------------------------------------------------------------------------------------
TC: 10+0.1s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
---------------------------------------------------------------------------------------------------------
1 CfishNN20230626 3176 0.0 12 11 500 355.5 71.1 263 52 185 52.6 37.0 3024
2 Vafra Cfish12.4 3024 152.5 11 12 500 144.5 28.9 52 263 185 10.4 37.0 3176
---------------------------------------------------------------------------------------------------------
TC: 10+0.1s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
---------------------------------------------------------------------------------------------------------
1 SFMX20231021 3175 0.0 11 11 500 353.5 70.7 258 51 191 51.6 38.2 3025
2 Vafra Cfish12.5 3025 149.0 11 11 500 146.5 29.3 51 258 191 10.2 38.2 3175
---------------------------------------------------------------------------------------------------------
Rating = (relative) Elo rating
Δ = delta from the next higher rated opponent
+ / - = error (1 sigma / 68%)
# = number of games played
Σ = total score
Σ% = total score in percent
W L D = wins, losses, draws
W% = wins in percent
=% = draw ratio in percent
OppR = Elo of opponent
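The derived columns in the legend above follow directly from W, L and D. A minimal sketch, using the first 10+0.1s row (SFMX20231021: 189 wins, 81 losses, 230 draws); the function name `summarize` is made up for illustration. The last value is the plain logistic Elo difference implied by the score. It is not Bayeselo's model, so it only roughly matches the Δ column (about 76 here versus the reported 75.9).

```python
import math


def summarize(wins, losses, draws):
    """Recompute the legend columns from raw W/L/D counts."""
    games = wins + losses + draws          # column '#'
    points = wins + 0.5 * draws            # column 'Σ'
    score = points / games
    return {
        "games": games,
        "points": points,
        "score_pct": round(100 * score, 1),         # column 'Σ%'
        "win_pct": round(100 * wins / games, 1),    # column 'W%'
        "draw_pct": round(100 * draws / games, 1),  # column '=%'
        # Plain logistic Elo difference (NOT Bayeselo); undefined for 0% or 100%.
        "logistic_elo": 400 * math.log10(score / (1 - score)),
    }


row = summarize(189, 81, 230)
print(row)
```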
On Windows: Battle of the clones
Windows tournaments are played on a Mini PC (Geekom MiniAir 11, Windows 11, Intel N5095, 8 GB RAM) using CuteChess with concurrency = 3. The opening suite used is UHO_2022_8mvs_+110_+119.epd. Note that openings of this kind enlarge Elo differences but reduce draw rates significantly.
Kill these chubby clones as quickly as possible
The three clones use huge SFNNv8 nets while SFMX still uses hybrid eval with the good old last SFNNv5 net.
Note that SFMnps_Droid1x are just internal names for various older SFMX (or SFMXnps😉) versions.
TC 5+0.05s =================================================================================
Rank Name Elo +/- Games Points Score Draw TC
0 SFMnps_Droid12 58 13 1500 874.0 58.3% 46.9% 5+0.05
1 Raid v2.76i_X_sse41 -55 22 500 211.0 42.2% 50.0% 5+0.05
2 TACTICAL 280923_m -56 23 500 210.0 42.0% 44.4% 5+0.05
3 Sun 1.1-sse41-popcnt -63 22 500 205.0 41.0% 46.4% 5+0.05
1500 of 1500 games finished.
TC 7+0.07s =================================================================================
Rank Name Elo +/- Games Points Score Draw TC
0 SFMnps_Droid12 34 17 900 494.0 54.9% 46.4% 7+0.07
1 Raid v2.76i_X_sse41 -10 29 300 145.5 48.5% 45.7% 7+0.07
2 TACTICAL 280923_m -44 28 300 131.0 43.7% 48.0% 7+0.07
3 Sun 1.1-sse41-popcnt -48 29 300 129.5 43.2% 45.7% 7+0.07
900 of 900 games finished.
Continue reading here: https://github.com/Joachim26/StockfishNPS/releases/tag/Master_DroidSFnps-73ae5140
23/10/21: Optimized for Android: SFMXnps
SFMXnps is stronger than Stockfish17dev on Android. To prove this and, in particular, to become more familiar with testing on an Android device, a large number of c-chess-cli tournaments were performed and are shown below.
All tournaments were played on a Xiaomi Poco M3 (Android 12, Snapdragon 662, 4 GB RAM) using Termux and the CETSA script, which utilizes c-chess-cli. Concurrency was set to 4 and TCs from 5+0.05s to 120+1.2s with 1 thread per engine were tested. Also one tournament with 2 threads and concurrency 2 was carried out to test the SMP performance. More details are given in the configuration files which can be, together with the played games, downloaded below.
The absolute value of the rating was set to 3100 Elo for each tournament, and the rating offsets and all other values were calculated by the script with Bayeselo. The opening suite was UHO_2022_8mvs_+110_+119.epd. Note that openings of this kind enlarge Elo differences but have the advantage of reduced draw rates.
TC: 5+0.05s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3142 0.0 17 16 250 154.5 61.8 105 46 99 42.0 39.6 3058
2 SFnps230802 3058 83.1 16 17 250 95.5 38.2 46 105 99 18.4 39.6 3142
---------------------------------------------------------------------------------------------------------
TC: 7+0.07s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3139 0.0 16 16 250 152.5 61.0 98 43 109 39.2 43.6 3061
2 SFnps230802 3061 77.1 16 16 250 97.5 39.0 43 98 109 17.2 43.6 3139
---------------------------------------------------------------------------------------------------------
TC: 10+0.1s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3142 0.0 16 16 250 155.0 62.0 101 41 108 40.4 43.2 3058
2 SFnps230802 3058 84.5 16 16 250 95.0 38.0 41 101 108 16.4 43.2 3142
---------------------------------------------------------------------------------------------------------
TC: 15+0.15s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3125 0.0 15 15 250 143.0 57.2 83 47 120 33.2 48.0 3075
2 SFnps230802 3075 50.4 15 15 250 107.0 42.8 47 83 120 18.8 48.0 3125
---------------------------------------------------------------------------------------------------------
TC: 20+0.2s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3127 0.0 20 19 150 86.5 57.7 51 28 71 34.0 47.3 3073
2 SFnps230802 3073 53.0 19 20 150 63.5 42.3 28 51 71 18.7 47.3 3127
---------------------------------------------------------------------------------------------------------
TC: 25+0.25s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3124 0.0 19 19 150 85.5 57.0 47 26 77 31.3 51.3 3076
2 SFnps230802 3076 48.2 19 19 150 64.5 43.0 26 47 77 17.3 51.3 3124
---------------------------------------------------------------------------------------------------------
TC: 25+0.25s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3135 0.0 18 18 150 90.0 60.0 51 21 78 34.0 52.0 3065
2 SFnps230802 3065 69.4 18 18 150 60.0 40.0 21 51 78 14.0 52.0 3135
---------------------------------------------------------------------------------------------------------
TC: 40+0.4s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3119 0.0 20 20 150 83.0 55.3 50 34 66 33.3 44.0 3081
2 SFnps230802 3081 37.3 20 20 150 67.0 44.7 34 50 66 22.7 44.0 3119
---------------------------------------------------------------------------------------------------------
TC: 60+0.6s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3124 0.0 18 18 150 85.5 57.0 46 25 79 30.7 52.7 3076
2 SFnps230802 3076 48.5 18 18 150 64.5 43.0 25 46 79 16.7 52.7 3124
---------------------------------------------------------------------------------------------------------
TC: 90+0.9s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3117 0.0 17 17 150 82.5 55.0 39 24 87 26.0 58.0 3083
2 SFnps230802 3083 34.4 17 17 150 67.5 45.0 24 39 87 16.0 58.0 3117
---------------------------------------------------------------------------------------------------------
TC: 120+1.2s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3117 0.0 20 20 100 55.0 55.0 24 14 62 24.0 62.0 3083
2 SFnps230802 3083 34.1 20 20 100 45.0 45.0 14 24 62 14.0 62.0 3117
---------------------------------------------------------------------------------------------------------
TC: 120+1.2s Threads: 2
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3121 0.0 20 20 100 56.0 56.0 26 14 60 26.0 60.0 3079
2 SFnps230802 3079 41.1 20 20 100 44.0 44.0 14 26 60 14.0 60.0 3121
---------------------------------------------------------------------------------------------------------
TC: 180+1.0s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3116 0.0 22 22 100 54.5 54.5 27 18 55 27.0 55.0 3084
2 SFnps230802 3084 31.1 22 22 100 45.5 45.5 18 27 55 18.0 55.0 3116
---------------------------------------------------------------------------------------------------------
TC: 180+1.8s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3134 0.0 30 30 50 30.0 60.0 17 7 26 34.0 52.0 3066
2 SFnps230802 3066 67.4 30 30 50 20.0 40.0 7 17 26 14.0 52.0 3134
---------------------------------------------------------------------------------------------------------
Δ = delta from the next higher rated opponent
# = number of games played
Σ = total score, 1 point for win, 1/2 point for draw
Older tournaments:
Two recent tourneys added to the graph:
SF16 has a smaller net than current SF17dev and is faster than SF20230802. Could it be that Stockfish16nps is stronger than SFMX?
TC: 10+0.1s Threads: 1
Rank Name Rating Δ + - # Σ Σ% W L D W% =% OppR
1 SFMXnps230801 3137 0.0 19 19 300 173.5 57.8 107 60 133 35.7 44.3 3082
2 SFnps16 3116 20.8 19 19 300 160.5 53.5 92 71 137 30.7 45.7 3092
3 SFnps20230802 3047 69.3 19 19 300 116.0 38.7 49 117 134 16.3 44.7 3127
---------------------------------------------------------------------------------------------------------
Δ = delta from the next higher rated opponent
# = number of games played
Σ = total score, 1 point for win, 1/2 point for draw
Not very likely but not impossible either. Thus the above question will be answered by future tournaments at various TCs.
Edit 09/09/23: See https://github.com/Joachim26/StockfishNPS/releases/tag/Master_DroidSFnps-bb4c63b3
Windows tournaments included in the graph:
Reasonable results, since the speed ratio of the two engines is smaller on Windows (~1.4 for modern builds) than on Android (~1.6). Even more tournaments with (much) more games should be performed to give clearer results.
Continuation of the SFMX tournaments page...
Kill these chubby clones as quickly as possible
The three clones use huge SFNNv8 nets while SFMX still uses hybrid eval with the good old last SFNNv5 net.
Note that SFMnps_Droid1x are just internal names for various older SFMX (or SFMXnps😉) versions.
TC 5+0.05s =================================================================================
Rank Name Elo +/- Games Points Score Draw TC
0 SFMnps_Droid12 58 13 1500 874.0 58.3% 46.9% 5+0.05
1 Raid v2.76i_X_sse41 -55 22 500 211.0 42.2% 50.0% 5+0.05
2 TACTICAL 280923_m -56 23 500 210.0 42.0% 44.4% 5+0.05
3 Sun 1.1-sse41-popcnt -63 22 500 205.0 41.0% 46.4% 5+0.05
1500 of 1500 games finished.
TC 7+0.07s =================================================================================
Rank Name Elo +/- Games Points Score Draw TC
0 SFMnps_Droid12 34 17 900 494.0 54.9% 46.4% 7+0.07
1 Raid v2.76i_X_sse41 -10 29 300 145.5 48.5% 45.7% 7+0.07
2 TACTICAL 280923_m -44 28 300 131.0 43.7% 48.0% 7+0.07
3 Sun 1.1-sse41-popcnt -48 29 300 129.5 43.2% 45.7% 7+0.07
900 of 900 games finished.
TC 10+0.1s =================================================================================
Rank Name Elo +/- Games Points Score Draw TC
0 SFMnps_Droid12 43 16 900 506.0 56.2% 52.0% 10+0.1
1 TACTICAL 280923_m -40 28 300 133.0 44.3% 50.0% 10+0.1
2 Raid v2.76i_X_sse41 -45 27 300 130.5 43.5% 52.3% 10+0.1
3 Sun 1.1-sse41-popcnt -45 27 300 130.5 43.5% 53.7% 10+0.1
900 of 900 games finished.
TC 15+0.15s =================================================================================
Rank Name Elo +/- Games Points Score Draw TC
0 SFMnps_Droid12 40 16 900 501.5 55.7% 52.3% 15+0.15
1 Raid v2.76i_X_sse41 -33 26 300 136.0 45.3% 54.7% 15+0.15
2 TACTICAL 280923_m -33 27 300 136.0 45.3% 51.3% 15+0.15
3 Sun 1.1-sse41-popcnt -55 28 300 126.5 42.2% 51.0% 15+0.15
900 of 900 games finished.
TC 20+0.2s =================================================================================
Rank Name Elo +/- Games Points Score Draw TC
0 SFMnps_Droid12 28 19 600 324.0 54.0% 53.0% 20+0.2
1 Raid v2.76i_X_sse41 -21 34 200 94.0 47.0% 51.0% 20+0.2
2 TACTICAL 280923_m -24 31 200 93.0 46.5% 58.0% 20+0.2
3 Sun 1.1-sse41-popcnt -38 34 200 89.0 44.5% 50.0% 20+0.2
600 of 600 games finished.
TC 30+0.30s =================================================================================
Rank Name Elo +/- Games Points Score Draw TC
0 SFMnps_Droid12 16 18 600 313.5 52.3% 58.8% 30+0.3
1 TACTICAL 280923_m -14 30 200 96.0 48.0% 60.0% 30+0.3
2 Sun 1.1-sse41-popcnt -14 33 200 96.0 48.0% 53.0% 30+0.3
3 Raid v2.76i_X_sse41 -19 29 200 94.5 47.3% 63.5% 30+0.3
600 of 600 games finished.
From here on, head-to-head matches are played: because the Elo gaps are getting smaller and smaller, smaller error margins, and thus more games, are needed. Raid, the best of the clones in the above tours, is selected as the opponent for SFMX.
TC 25+0.25s =================================================================================
Score of SFMnps_Droid12 vs Raid v2.76i_X_sse41: 128 - 107 - 265 [0.521]
... SFMnps_Droid12 playing White: 123 - 7 - 120 [0.732] 250
... SFMnps_Droid12 playing Black: 5 - 100 - 145 [0.310] 250
... White vs Black: 223 - 12 - 265 [0.711] 500
Elo difference: 14.6 +/- 20.9, LOS: 91.5 %, DrawRatio: 53.0 %
500 of 500 games finished.
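The "Elo difference", error margin and LOS lines in these match summaries can be approximately reproduced from the W - L - D counts. The sketch below assumes the usual trinomial formulas (logistic Elo from the score, a 95 % interval from the per-game variance, LOS from wins and losses only). The function names are hypothetical, and the last digit may differ slightly from the tool's output.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_stats(wins, losses, draws):
    """Return (elo, error_margin, los_percent) for one head-to-head match."""
    games = wins + losses + draws
    w, l, d = wins / games, losses / games, draws / games
    score = w + 0.5 * d
    # Variance of a single game result (1, 0.5 or 0 points) around the mean score.
    var = w * (1.0 - score) ** 2 + l * (0.0 - score) ** 2 + d * (0.5 - score) ** 2
    stderr = math.sqrt(var / games)
    z = 1.959964  # two-sided 95 % quantile of the normal distribution
    margin = (elo_from_score(score + z * stderr)
              - elo_from_score(score - z * stderr)) / 2.0
    # Likelihood of superiority: draws carry no information here.
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return elo_from_score(score), margin, 100.0 * los


# First head-to-head match above: 128 - 107 - 265
elo, margin, los = match_stats(128, 107, 265)
print(f"Elo difference: {elo:.1f} +/- {margin:.1f}, LOS: {los:.1f} %")
```

For 128 - 107 - 265 this lands close to the reported 14.6 +/- 20.9 with an LOS of about 91.5 %.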
TC 30+0.3s =================================================================================
Score of SFMnps_Droid12 vs Raid v2.76i_X_sse41: 122 - 101 - 277 [0.521]
... SFMnps_Droid12 playing White: 118 - 5 - 127 [0.726] 250
... SFMnps_Droid12 playing Black: 4 - 96 - 150 [0.316] 250
... White vs Black: 214 - 9 - 277 [0.705] 500
Elo difference: 14.6 +/- 20.3, LOS: 92.0 %, DrawRatio: 55.4 %
500 of 500 games finished.
TC 45+0.45s =================================================================================
Score of SFMnps_Droid12 vs Raid v2.76i_X_sse41: 82 - 70 - 148 [0.520]
... SFMnps_Droid12 playing White: 81 - 2 - 67 [0.763] 150
... SFMnps_Droid12 playing Black: 1 - 68 - 81 [0.277] 150
... White vs Black: 149 - 3 - 148 [0.743] 300
Elo difference: 13.9 +/- 28.0, LOS: 83.5 %, DrawRatio: 49.3 %
300 of 500 games finished.
SFMnps_Droid12 --> SFMnps_Droid14_1536 (engine change; the new one uses classical eval a bit more often)
TC 25+0.25s =================================================================================
Score of SFMnps_Droid14_1536 vs Raid v2.76i_X_sse41: 86 - 60 - 154 [0.543]
... SFMnps_Droid14_1536 playing White: 80 - 2 - 68 [0.760] 150
... SFMnps_Droid14_1536 playing Black: 6 - 58 - 86 [0.327] 150
... White vs Black: 138 - 8 - 154 [0.717] 300
Elo difference: 30.2 +/- 27.4, LOS: 98.4 %, DrawRatio: 51.3 %
300 of 300 games finished.
TC 60+0.6s =================================================================================
Score of SFMnps_Droid14_1536 vs Raid v2.76i_X_sse41: 67 - 62 - 171 [0.508]
... SFMnps_Droid14_1536 playing White: 64 - 1 - 85 [0.710] 150
... SFMnps_Droid14_1536 playing Black: 3 - 61 - 86 [0.307] 150
... White vs Black: 125 - 4 - 171 [0.702] 300
Elo difference: 5.8 +/- 25.8, LOS: 67.0 %, DrawRatio: 57.0 %
300 of 300 games finished.
TC 75+0.75s =================================================================================
Score of SFMnps_Droid14_1536 vs Raid v2.76i_X_sse41: 65 - 57 - 178 [0.513]
... SFMnps_Droid14_1536 playing White: 65 - 0 - 85 [0.717] 150
... SFMnps_Droid14_1536 playing Black: 0 - 57 - 93 [0.310] 150
... White vs Black: 122 - 0 - 178 [0.703] 300
Elo difference: 9.3 +/- 25.1, LOS: 76.6 %, DrawRatio: 59.3 %
300 of 300 games finished.
TC 100+1s =================================================================================
Score of SFMnps_Droid14_1536 vs Raid v2.76i_X_sse41: 63 - 65 - 172 [0.497]
... SFMnps_Droid14_1536 playing White: 63 - 1 - 86 [0.707] 150
... SFMnps_Droid14_1536 playing Black: 0 - 64 - 86 [0.287] 150
... White vs Black: 127 - 1 - 172 [0.710] 300
Elo difference: -2.3 +/- 25.7, LOS: 43.0 %, DrawRatio: 57.3 %
300 of 300 games finished.
TC 180+1s =================================================================================
Score of SFMnps_Droid12 vs Raid v2.76i_X_sse41: 9 - 5 - 26 [0.550]
... SFMnps_Droid12 playing White: 9 - 0 - 11 [0.725] 20
... SFMnps_Droid12 playing Black: 0 - 5 - 15 [0.375] 20
... White vs Black: 14 - 0 - 26 [0.675] 40
Elo difference: 34.9 +/- 64.1, LOS: 85.7 %, DrawRatio: 65.0 %
40 of 80 games finished.
TC 180+1s =================================================================================
Score of SFMnps_Droid12 vs Raid v2.76i_X_sse41: 16 - 16 - 68 [0.500]
... SFMnps_Droid12 playing White: 16 - 0 - 34 [0.660] 50
... SFMnps_Droid12 playing Black: 0 - 16 - 34 [0.340] 50
... White vs Black: 32 - 0 - 68 [0.660] 100
Elo difference: 0.0 +/- 38.6, LOS: 50.0 %, DrawRatio: 68.0 %
100 of 100 games finished.
More tournaments will follow soon. How long can SFMX resist the chubby clone pack?
At 3min+1s SFMX still seems to be too strong for chubby Raid... However, the sample size is far too small to trust the outcome of the last tour!
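To get a feeling for how far too small the sample is, here is a rough estimate of the number of games needed for a given error margin. It assumes the margin shrinks with 1/sqrt(games) while score and draw ratio stay about the same, which is only an approximation; `games_needed` is a hypothetical helper name.

```python
import math


def games_needed(games_played, margin_now, margin_target):
    """Estimate the games required to shrink the Elo error margin.

    Assumption: margin is proportional to 1/sqrt(games).
    """
    return math.ceil(games_played * (margin_now / margin_target) ** 2)


# Last 180+1s match above: 100 games gave +/- 38.6 Elo.
needed = games_needed(100, 38.6, 10.0)
print(needed)
```

Under these assumptions, a margin of +/- 10 Elo at 180+1s would take roughly 1500 games.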
BTW1: Why do I use the C-word chubby? Simply because I was forced to download something like 2 * 165 + 40 = 370 MB just to test these three clones...unbelievable...😁
BTW2: I was too lazy to compile the clones for Android, so the Windows tournaments are played first. Android tours may follow in the future, but then the "cross over point" will be pushed to such a long TC that it will be practically unmeasurable with the Poco M3 phone...🤣
Battle of the nets
Unfortunately, these Android tournaments were lost.... 😒 (downloads are still assets of the previous page).
Update 23/12/04: SFSnps, SFMnps, SFMXnps, and SFnps with 12, 45, and 38 to 62 MB net.
All three StockfishNPS versions are updated each time official Stockfish17dev is updated. These automatically compiled builds of the various flavors and versions of StockfishNPS can be found down below. A very short TC tournament with three such Windows builds [performed on 23/06/22] is presented here:
Rank Name Elo +/- Games Points Score Draw TC
1 SFSnps_modern 15 17 1000 521.0 52.1% 38.8% 1+0.03
2 SFMnps_modern -3 17 1000 495.0 49.5% 38.6% 1+0.03
3 SFnps_modern -11 17 1000 484.0 48.4% 37.6% 1+0.03
1500 of 1500 games finished.
In addition, manually compiled and (much) faster armv8 (and some armv7) Android builds can be downloaded on the following page: https://github.com/Joachim26/StockfishNPS/releases/tag/Win_modern_and_armv8_dev_release
That page also shows more tournament results, some further information, and links to the nets.