Simplified Transformer

Model

This model is a feedforward network with ReLU activation functions. The input is image blocks, each tagged with a learned vector. Awareness is added by computing the average and variance of each layer's output and feeding them into the next layer. Self-awareness is the sum of the system state mixed back into the system state.
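For concreteness, below is a minimal Go sketch of the awareness step as described above: a dense ReLU layer whose output average and variance are appended to that output before the next layer consumes it. The function names, layer sizes, and random initialization are hypothetical illustrations and are not taken from this repository's code.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// layer applies a fully connected layer followed by ReLU.
// Each row of weights holds the incoming weights for one output
// neuron plus a trailing bias, so rows have length len(input)+1.
func layer(weights [][]float64, input []float64) []float64 {
	out := make([]float64, len(weights))
	for i, row := range weights {
		sum := row[len(input)] // bias term
		for j, x := range input {
			sum += row[j] * x
		}
		out[i] = math.Max(0, sum) // ReLU
	}
	return out
}

// awareness appends the average and variance of a layer's output to
// that output, so the next layer sees a summary of the layer's state.
func awareness(v []float64) []float64 {
	mean, variance := 0.0, 0.0
	for _, x := range v {
		mean += x
	}
	mean /= float64(len(v))
	for _, x := range v {
		d := x - mean
		variance += d * d
	}
	variance /= float64(len(v))
	return append(v, mean, variance)
}

// randomWeights builds an out-by-(in+1) weight matrix with small
// random values; a real model would learn these by backpropagation.
func randomWeights(rng *rand.Rand, out, in int) [][]float64 {
	w := make([][]float64, out)
	for i := range w {
		w[i] = make([]float64, in+1)
		for j := range w[i] {
			w[i][j] = rng.NormFloat64() * 0.1
		}
	}
	return w
}

func main() {
	rng := rand.New(rand.NewSource(1))

	// Hypothetical flattened 4x4 image block as input.
	input := make([]float64, 16)
	for i := range input {
		input[i] = rng.Float64()
	}

	// The second layer takes the first layer's 32 outputs plus the
	// two awareness values (average and variance).
	w1 := randomWeights(rng, 32, len(input))
	w2 := randomWeights(rng, 10, 32+2)

	hidden := layer(w1, input)
	output := layer(w2, awareness(hidden))
	fmt.Println(output)
}
```

Because only two extra values are appended per layer, the next layer's weight matrix grows by just two columns, so the cost of awareness is negligible compared to the dense layers themselves.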

Results

Below is the accuracy on the MNIST test set for different versions of the model. Correct is the number of the 10,000 test images classified correctly; for example, 9679/10000 is 96.79% accuracy for the final version.

| Correct | Total | Width Multiplier | Layers | Change |
|---------|-------|------------------|--------|--------|
| 8474 | 10000 | ? | 1 | without awareness |
| 8924 | 10000 | ? | 1 | with average awareness |
| 8989 | 10000 | ? | 1 | with average and variance awareness |
| 9102 | 10000 | ? | 1 | with dropout |
| 8825 | 10000 | ? | 1 | with tags on middle layer (reverted) |
| 9082 | 10000 | ? | 1 | with awareness on the input (reverted) |
| 9155 | 10000 | ? | 1 | with ReLU on first layer |
| 9325 | 10000 | 2 | 1 | removed TanH from output layer |
| 9599 | 10000 | 2 | 2 | three layers |
| 9662 | 10000 | 2 | 3 | four layers |
| 9679 | 10000 | 4 | 3 | double width |

Citations
