Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: x/tools/go/analysis/passes/appends: check for incorrect slice length initialization #69872

Open
cuishuang opened this issue Oct 14, 2024 · 28 comments
Labels
Milestone

Comments

@cuishuang
Copy link
Contributor

cuishuang commented Oct 14, 2024

Proposal Details

The following code exists in many projects, and developers actually want [0 1 2], but due to the initialization error of slice, the final result is [0 0 0 0 1 2]

package main

import "fmt"

func main() {

	sli := make([]int, 3)

	for i := range 3 {
		sli = append(sli, i)
	}

	fmt.Println(sli) // the result is [0 0 0 0 1 2]

}

The online demo: https://go.dev/play/p/q1BcVCmvidW

Over the past few months, I have conducted extensive research and analysis, and also submitted pull requests to fix issues in many well-known Go projects such as prometheus, zap, vitess. Below are some pull requests submitted by me and others related to this problem.

Due to limitations in search skills and time, I only checked records of such issues in the past few months. The history of more such issues has not been traced back. But I think it's already enough

I would like to propose adding a new analyzer to go vet that can detect such situations, thereby avoiding these issues in the future.

Now I have already completed an initial version of the code, and if the proposal is approved, I would be happy to refine it and add the necessary test cases.

The merged pr:

prometheus/prometheus#14702 (comment)

uber-go/zap#1461

cadence-workflow/cadence#6293

prometheus/prometheus#15026

vitessio/vitess#16674

kedacore/keda#6179

external-secrets/external-secrets#3964

brianvoe/gofakeit#365

fission/fission#3018

DataDog/datadog-agent#29744

superseriousbusiness/gotosocial#3382

ccfos/nightingale#2169

gookit/color#97

vdaas/vald#2672

supabase/auth#1788

pufferpanel/pufferpanel#1367

juju/juju#18176

go-spatial/tegola@0f3131f

lxc/incus#1285

yunionio/cloudpods#21346

taubyte/tau#253

fleetdm/fleet#22608

antrea-io/antrea#6715

tdewolff/canvas#315

Consensys/gnark#1288

superfly/flyctl#3982

bazel-contrib/rules_go#4133

zitadel/oidc#658

jhump/protoreflect#629

apache/rocketmq-client-go#1171

edgexfoundry/edgex-go#4938

dolthub/doltgresql#812

apache/trafficcontrol#8091

pingcap/tidb-operator#5755

botlabs-gg/yagpdb#1734

Altinity/clickhouse-backup#1019

openshift/installer#9072

GoogleCloudPlatform/magic-modules#11919

openmeterio/openmeter#1615

target/goalert#4090

kubeovn/kube-ovn#4579

syyongx/php2go#49

fluid-cloudnative/fluid#4335

akuity/kargo#2648

kubernetes/kubernetes#127785

apache/dubbo-go#2734

letsencrypt/boulder#7725

cortexproject/cortex#6237

kubeedge/kubeedge#5895

grafana/mimir#9449

rocboss/paopao-ce#581

authelia/authelia#7720

cilium/cilium#35164

git-lfs/git-lfs#5874

https://github.com/hashicorp/nomad/pull/24109/files

cosmos/ibc-go#6444

minio/minio#19567

VictoriaMetrics/VictoriaMetrics#6897

hyperledger/fabric#4956

grafana/pyroscope#3600

cosmos/cosmos-sdk#21494 (review)

anchore/grype#2133

https://github.com/ethereum-optimism/optimism/pull/11542/files

https://github.com/libp2p/go-libp2p/pull/2938/files

stashapp/stash#5327

trufflesecurity/trufflehog#3293

c9s/bbgo#1724 (comment)

cosmos/cosmos-sdk#22006

FerretDB/FerretDB#4598

dagger/dagger#8612

letsencrypt/boulder#7731

Layr-Labs/eigenda#767

wal-g/wal-g#1800

VictoriaMetrics/VictoriaMetrics#7161

harmony-one/harmony#4767

stackrox/stackrox#13028

stefanprodan/timoni#430

Altinity/clickhouse-operator#1523

iotexproject/iotex-core#4412

ane more in review process.

@gabyhelp
Copy link

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/619915 mentions this issue: go/analysis: add a new analyzer to check for incorrect slice length initialization

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/619935 mentions this issue: cmd: add a new analyzer to check for incorrect slice length initialization

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/618875 mentions this issue: runtime: fix slice init length

@timothy-king
Copy link
Contributor

What is the proposed criteria to decide when vet should issue a report? It is hard to decide whether to accept a vet proposal without this being specified. We need to assess the likely false positive and negatives rates of the criteria. Thanks.

(You can try to turn the included CL into natural language.)

@zigo101
Copy link

zigo101 commented Oct 14, 2024

I'm surprised that there are so many misuses of the append function.

@adonovan
Copy link
Member

Guessing from logic of the CL I just reviewed, the criterion is:

Vet should report a diagnostic when appending to a slice s if there exists an assignment s = make(T, n) where n is not a constant zero and there are no expressions of the form s[i] or copy(s, ...).

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Oct 14, 2024
@ianlancetaylor ianlancetaylor changed the title cmd/vet: add a new analyzer to check for incorrect slice length initialization proposal: cmd/vet: add a new analyzer to check for incorrect slice length initialization Oct 14, 2024
@gopherbot gopherbot added this to the Proposal milestone Oct 14, 2024
@cuishuang
Copy link
Contributor Author

cuishuang commented Oct 15, 2024

What is the proposed criteria to decide when vet should issue a report? It is hard to decide whether to accept a vet proposal without this being specified. We need to assess the likely false positive and negatives rates of the criteria. Thanks.

(You can try to turn the included CL into natural language.)

Yes, this is the issue I initially wanted to discuss on the CL page. Thanks to @adonovan for the detailed summary.

Additionally, through extensive practice, I have encountered several challenges and gained some insights: How do we balance false positives and omissions? Should we aim to comprehensively identify similar issues and accept a high rate of false positives, or should we focus on precision, which might lead to some omissions? This balance certainly warrants further discussion.

Specifically regarding the code, I’ve noticed that when slice are processed with binary.LittleEndian.PutUint16, they often lead to false positives. Moreover, if a slice is initialized in function A but only appended to after several layers of calls in function Z, should we address this situation? Or should we only consider initialization and appending that occur within the same method?

Last year, with your guidance, I added the "appends" analyzer to go vet, so I have some experience in this area.

However, it’s clear that this situation is more complex than last year, and there are more details to discuss. I look forward to your suggestions.

Thank you all!

@cuishuang
Copy link
Contributor Author

cuishuang commented Oct 15, 2024

I'm surprised that there are so many misuses of the append function.

Yes, if using sli[i] for assignment, there is no such problem, but many people like me are used to using append. So I think it can also be considered as an initialization issue with the slice.

Based on my limited investigation and analysis of high star libraries related to the Go ecosystem, it seems worthwhile to add such an analyzer.

@timothy-king
Copy link
Contributor

timothy-king commented Oct 17, 2024

I took a look at several of these reports. Most looks good. Some things to discuss. Here are my notes.

These all look like good reports to me. I think we can have stricter criteria and still report these. They are very similar.

https://github.com/prometheus/prometheus/pull/14702/files
https://github.com/uber/cadence/pull/6293/files
https://github.com/vitessio/vitess/pull/16674/files
https://github.com/kedacore/keda/pull/6179/files
https://github.com/brianvoe/gofakeit/pull/365/files
https://github.com/fission/fission/pull/3018/files
https://github.com/DataDog/datadog-agent/pull/29744/files
https://github.com/superseriousbusiness/gotosocial/pull/3382/files
https://github.com/ccfos/nightingale/pull/2169/files
https://github.com/gookit/color/pull/97/files
https://github.com/supabase/auth/pull/1788/files
https://github.com/pufferpanel/pufferpanel/pull/1367/files
https://github.com/juju/juju/pull/18176/files
go-spatial/tegola@0f3131f
https://github.com/lxc/incus/pull/1285/files
https://github.com/yunionio/cloudpods/pull/21346/files
https://github.com/taubyte/tau/pull/253/files
https://github.com/fleetdm/fleet/pull/22608/files
https://github.com/antrea-io/antrea/pull/6715/files
https://github.com/tdewolff/canvas/pull/315/files
https://github.com/Consensys/gnark/pull/1288/files
https://github.com/superfly/flyctl/pull/3982/files
bazel-contrib/rules_go#4133
https://github.com/zitadel/oidc/pull/658/files
https://github.com/jhump/protoreflect/pull/629/files
https://github.com/apache/rocketmq-client-go/pull/1171/files
https://github.com/edgexfoundry/edgex-go/pull/4938/files
https://github.com/apache/trafficcontrol/pull/8091/files
https://github.com/pingcap/tidb-operator/pull/5755/files
https://github.com/botlabs-gg/yagpdb/pull/1734/files
https://github.com/Altinity/clickhouse-backup/pull/1019/files
https://github.com/openshift/installer/pull/9072/files
https://github.com/cortexproject/cortex/pull/6237/files
https://github.com/letsencrypt/boulder/pull/7725/files
https://github.com/kubernetes/kubernetes/pull/127785/files
https://github.com/fluid-cloudnative/fluid/pull/4335/files
https://github.com/syyongx/php2go/pull/49/files
https://github.com/kubeovn/kube-ovn/pull/4579/files
https://github.com/target/goalert/pull/4090/files
https://github.com/openmeterio/openmeter/pull/1615/files
https://github.com/openmeterio/openmeter/pull/1615/files
https://github.com/cuishuang/vald/blob/2bb30dac79f0d9afdb0f809f0ca5b8eef0e62d75/internal/net/grpc/client.go
https://github.com/GoogleCloudPlatform/magic-modules/pull/11919/files

This also looks like a good report, but had a more interesting fix:

https://github.com/akuity/kargo/pull/2648/files

These I am not as sure about. It would not be terrible to report these, but it would also not be too bad to not report them. Some of these should be rewritten as literals. Could we suggest alternative rewrites from the Analyzer?

https://github.com/dolthub/doltgresql/pull/812/files
https://github.com/apache/dubbo-go/pull/2734/files

The proposed criteria would only have reported TestAzureKeyVaultSecretManagerGetAllSecrets in:

https://github.com/external-secrets/external-secrets/pull/3964/files
This is a good example for why we need to exclude 0.

We should probably not report either of these from vet. This suggests to me we need to exclude slice expressions too.

https://github.com/prometheus/prometheus/pull/15026/files
https://github.com/uber-go/zap/pull/1461/files

I am still not exactly sure I have a clear theory of what we are reporting yet. But the heuristics seem to be hitting on something promising. Indexing and slicing is out. range is fine. len is fine. Passing the slice as an argument or a variadic expansion... is also fine. Storing it in a field has been fine in these examples so far. I am not sure what this adds up to.

Another point, it would be good for us to have negative examples too. Making an all zero value slice needs to not be reported. And we should have guidance on how to create {0,0,0,1,2,3}. The proposal is going to flag a correct way to create this value. What is the 'right' way to do this? Do we need to give folks a clear escape hatch?

FWIW I think the proposal should be phrased 'as what to report' rather than 'add a new analyzer'. This could plausibly go into the "appends" analyzer.

@cuishuang
Copy link
Contributor Author

I took a look at several of these reports. Most looks good. Some things to discuss. Here are my notes.

These all look like good reports to me. I think we can have stricter criteria and still report these. They are very similar.

https://github.com/prometheus/prometheus/pull/14702/files
https://github.com/uber/cadence/pull/6293/files
https://github.com/vitessio/vitess/pull/16674/files
https://github.com/kedacore/keda/pull/6179/files
https://github.com/brianvoe/gofakeit/pull/365/files
https://github.com/fission/fission/pull/3018/files
https://github.com/DataDog/datadog-agent/pull/29744/files
https://github.com/superseriousbusiness/gotosocial/pull/3382/files
https://github.com/ccfos/nightingale/pull/2169/files
https://github.com/gookit/color/pull/97/files
https://github.com/supabase/auth/pull/1788/files
https://github.com/pufferpanel/pufferpanel/pull/1367/files
https://github.com/juju/juju/pull/18176/files
go-spatial/tegola@0f3131f
https://github.com/lxc/incus/pull/1285/files
https://github.com/yunionio/cloudpods/pull/21346/files
https://github.com/taubyte/tau/pull/253/files
https://github.com/fleetdm/fleet/pull/22608/files
https://github.com/antrea-io/antrea/pull/6715/files
https://github.com/tdewolff/canvas/pull/315/files
https://github.com/Consensys/gnark/pull/1288/files
https://github.com/superfly/flyctl/pull/3982/files
bazel-contrib/rules_go#4133
https://github.com/zitadel/oidc/pull/658/files
https://github.com/jhump/protoreflect/pull/629/files
https://github.com/apache/rocketmq-client-go/pull/1171/files
https://github.com/edgexfoundry/edgex-go/pull/4938/files
https://github.com/apache/trafficcontrol/pull/8091/files
https://github.com/pingcap/tidb-operator/pull/5755/files
https://github.com/botlabs-gg/yagpdb/pull/1734/files
https://github.com/Altinity/clickhouse-backup/pull/1019/files
https://github.com/openshift/installer/pull/9072/files
https://github.com/cortexproject/cortex/pull/6237/files
https://github.com/letsencrypt/boulder/pull/7725/files
https://github.com/kubernetes/kubernetes/pull/127785/files
https://github.com/fluid-cloudnative/fluid/pull/4335/files
https://github.com/syyongx/php2go/pull/49/files
https://github.com/kubeovn/kube-ovn/pull/4579/files
https://github.com/target/goalert/pull/4090/files
https://github.com/openmeterio/openmeter/pull/1615/files
https://github.com/openmeterio/openmeter/pull/1615/files
https://github.com/cuishuang/vald/blob/2bb30dac79f0d9afdb0f809f0ca5b8eef0e62d75/internal/net/grpc/client.go
https://github.com/GoogleCloudPlatform/magic-modules/pull/11919/files

This also looks like a good report, but had a more interesting fix:

https://github.com/akuity/kargo/pull/2648/files

These I am not as sure about. It would not be terrible to report these, but it would also not be too bad to not report them. Some of these should be rewritten as literals. Could we suggest alternative rewrites from the Analyzer?

https://github.com/dolthub/doltgresql/pull/812/files
https://github.com/apache/dubbo-go/pull/2734/files

The proposed criteria would only have reported TestAzureKeyVaultSecretManagerGetAllSecrets in:

https://github.com/external-secrets/external-secrets/pull/3964/files
This is a good example for why we need to exclude 0.

We should probably not report either of these from vet. This suggests to me we need to exclude slice expressions too.

https://github.com/prometheus/prometheus/pull/15026/files
https://github.com/uber-go/zap/pull/1461/files

I am still not exactly sure I have a clear theory of what we are reporting yet. But the heuristics seem to be hitting on something promising. Indexing and slicing is out. range is fine. len is fine. Passing the slice as an argument or a variadic expansion... is also fine. As is storing it in a field has been fine in these examples. I am not sure what this adds up to.

Another point, it would be good for us to have negative examples too. Making an all zero value slice needs to be fine. And we should have guidance on how to create {0,0,0,1,2,3}. The proposal is going to flag a correct way to create this value. What is the 'right' way to do this? Do we need to give folks a clear escape hatch?

FWIW I think the proposal should be phrased 'as what to report' rather than 'add a new analyzer'. This could plausibly go into the "appends" analyzer.

Thanks for your reply and review. I have had a lot of discussions with @adonovan about this on Gerrit.

Under adonovan's suggestion and guidance, I have now added the SuggestedFix feature.

Regarding this fix in https://github.com/akuity/kargo/pull/2648/files, it highlights the difference between using sli[i] and append, but the end result is the same. I might prefer using append, but all the way is ok.

Regarding whether to extend the existing appends analyzer or to add a new one, here are my thoughts: https://go-review.googlesource.com/c/tools/+/619915/comments/0e76d631_e6c99d92

Regarding other questions, such as "What is the 'right' way to do this? Do we need to give folks a clear escape hatch?" I'm also not sure, so perhaps I should listen to other‘s suggestions.

Overall, at least for now, adding such checks is valuable, and there are many details that can still be discussed and improved.

I will pay attention to and follow the final advice of the Go team. Thank you again!

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/620316 mentions this issue: all: register the sliceinit analyzer in a few places and update gopls/go.mod

@timothy-king
Copy link
Contributor

Do we need to give folks a clear escape hatch?

To answer my own question, the escape hatch is to write s := make([]T, n, n) instead of s := make([]T, n) when one intends to initialize the length to n as well. Obvious in retrospect.

It seems okay to encourage this as an escape hatch as the checker is trying to find cases where slice's (len,cap) is initialized to (n,n) when the checker 'thinks' (0,n) is almost certainly intended.

@cuishuang
Copy link
Contributor Author

After communicating with the two maintainers and receiving very good suggestions, we decided not to add a new analyzer, but to add these features based on the previous appends analyzer.

Does this still need to be proposed?

@cuishuang cuishuang changed the title proposal: cmd/vet: add a new analyzer to check for incorrect slice length initialization proposal: cmd/vet: add a new feature to the appends analyzer to check if the slice length initialization is incorrect Oct 23, 2024
@adonovan
Copy link
Member

Yes, major feature changes to analyzers included in vet do need proposals.

@cuishuang
Copy link
Contributor Author

Yes, major feature changes to analyzers included in vet do need proposals.

Thanks. I have moved the code to the appends directory. Some comments may need to be optimized. Please review it when you have time.

@adonovan
Copy link
Member

adonovan commented Oct 23, 2024

I ran the new analyzer (at patch set 18 of CL 619915) over about 15,000 modules, and got 1225 new findings. Here are a random 50.

Inspecting the first 10, at least 7 are false positives, most relating to functions like binary.BigEndian.PutUint32 that fill a fixed-size buffer that is later appended. It does turn up some real bugs though. More work is needed to reduce the false-positive rate beneath our target of 5%.

https://go-mod-viewer.appspot.com/github.com/cloudflare/circl@v1.5.0/abe/cpabe/tkn20/internal/tkn/policy.go#L209: ret initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/emmansun/gmsm@v0.29.1/internal/cryptotest/block.go#L182: expectedDst initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/tractor.dev/toolkit-go@v0.0.0-20241010005851-214d91207d07/duplex/rpc/frame.go#L41: prefix initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/osrg/gobgp/v3@v3.30.0/pkg/packet/bgp/bgp.go#L4784: b initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/gochain/gochain/v3@v3.4.9/p2p/rlpx.go#L477: prefix initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/sagernet/netlink@v0.0.0-20240612041022-b9a21c07ac6a/route_linux.go#L338: attr initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/core-coin/go-core/v2@v2.1.9/contracts/checkpointoracle/oracle_test.go#L135: buf initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/consensys/gnark-crypto@v0.14.0/ecc/bw6-761/twistededwards/eddsa/eddsa.go#L189: offset initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/openshift/installer@v1.4.17/pkg/asset/installconfig/aws/subnet.go#L49: zoneNames initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/pion/dtls/v2@v2.2.12/pkg/protocol/handshake/message_certificate.go#L29: out initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/ethersphere/bee/v2@v2.2.0/pkg/soc/testing/soc.go#L49: id initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/vishvananda/netlink@v1.3.0/route_linux.go#L350: attr initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/ledgerwatch/erigon-lib@v1.0.0/downloader/downloader.go#L683: list initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/cloudwego/hertz@v0.9.3/pkg/protocol/http1/client_test.go#L188: streamRead initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/spheronFdn/akash-api-fork@v1.0.6-sph/go/node/types/v1beta2/resource.go#L75: res initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/ethersphere/bee/v2@v2.2.0/pkg/file/pipeline/hashtrie/hashtrie.go#L153: spb initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/ronperry/cryptoedge@v0.0.0-20150815114006-cc363e290743/singhdas/types.go#L436: d initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/apache/arrow/go/v14@v14.0.2/parquet/internal/utils/bit_reader_test.go#L535: values initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/m3db/m3@v1.5.0/src/query/api/v1/handler/prometheus/response.go#L172: tags initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/status-im/status-go/extkeys@v1.1.2/hdkey.go#L235: extra initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/yandex/pandora@v0.5.32/components/providers/scenario/http/postprocessor/var_xpath.go#L17: result initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/zaquestion/lab@v0.25.1/cmd/label_list.go#L73: labelNames initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/trpc.group/trpc-go/trpc-go@v1.0.3/pool/multiplexed/multiplexed_test.go#L112: head initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/pion/webrtc/v4@v4.0.1/internal/mux/mux.go#L189: pendingPackets initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/GoogleContainerTools/skaffold@v1.39.18/pkg/skaffold/tag/tagger_mux.go#L44: sl initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/wangyougui/gf/v2@v2.6.5/container/gmap/gmap_z_example_int_any_test.go#L406: removeList initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/wangyougui/gf/v2@v2.6.5/container/gmap/gmap_z_example_any_any_test.go#L406: removeList initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/trustbloc/kms-go@v1.1.2/doc/util/jwkkid/kid_creator.go#L240: pad initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/turingchain2020/turingchain@v1.1.21/wallet/bipwallet/go-bip39/bip39.go#L222: newSlice initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/safedep/dry@v0.0.0-20241016050132-a15651f0548b/crypto/aes.go#L57: nonce initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/keybase/client/go@v0.0.0-20241007131713-f10651d043c8/kbfs/libkbfs/kbfs_ops_concur_test.go#L1562: ptrs initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/diggerhq/digger/libs@v0.0.0-20240604170430-9d61cdf01cc5/orchestrator/github/github_test.go#L98: projectNames initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/dubbo.apache.org/dubbo-go/v3@v3.1.1/proxy/proxy_factory/pass_through.go#L104: in initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/phpdave11/gofpdf@v1.4.2/utf8fontfile.go#L1044: answer initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/milvus-io/milvus-sdk-go/v2@v2.4.1/test/testcases/main_test.go#L529: ips initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/hashgraph/hedera-sdk-go/v2@v2.48.0/contract_function_parameters.go#L1762: bytes initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/waldiirawan/apm-agent-go/v2@v2.2.2/stacktrace/stacktrace.go#L45: pc initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/daefrom/go-dae@v1.0.1/contracts/checkpointoracle/oracle_test.go#L135: buf initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/ronperry/cryptoedge@v0.0.0-20150815114006-cc363e290743/jjm/types.go#L440: d initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/bitfinexcom/bitfinex-api-go@v0.0.0-20210608095005-9e0b26f200fb/v2/rest/currencies.go#L43: parsedRaw initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/jcmturner/gokrb5/v8@v8.4.4/crypto/rfc3961/encryption.go#L41: c initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/mattermost/mattermost-server/v5@v5.39.3/model/remote_cluster.go#L278: salt initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/richardwilkes/toolbox@v1.121.0/xio/network/natpmp/natpmp.go#L280: ports initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/oam-dev/cluster-gateway@v1.9.0/pkg/util/cluster/cluster.go#L42: clusters initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/osrg/gobgp/v3@v3.30.0/pkg/packet/bmp/bmp.go#L762: buf initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/cockroachdb/pebble@v1.1.2/tool/db_io_bench.go#L94: elapsed initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/gogf/gf/v2@v2.7.4/container/gmap/gmap_z_example_any_any_test.go#L406: removeList initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/CiscoM31/godata@v1.0.10/providers/mysql.go#L209: entityTypes initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/hyperledger/aries-framework-go@v0.3.2/pkg/didcomm/protocol/legacyconnection/states.go#L412: timestampBuf initialized with non-empty slice; use make(T, 0, cap) instead
https://go-mod-viewer.appspot.com/github.com/luckypickle/go-ethereum-vet@v1.14.2/p2p/rlpx.go#L475: prefix initialized with non-empty slice; use make(T, 0, cap) instead

@timothy-king
Copy link
Contributor

Thanks Alan. Those FPs in the first 10 or so examples are clear cases of wanting to generate {0,0,0,0,1,2,3,4}. We will need a more specific criteria.

@cuishuang
Copy link
Contributor Author

Additionally, through extensive practice, I have encountered several challenges and gained some insights: How do we balance false positives and omissions? Should we aim to comprehensively identify similar issues and accept a high rate of false positives, or should we focus on precision, which might lead to some omissions? This balance certainly warrants further discussion.

Specifically regarding the code, I’ve noticed that when slice are processed with binary.LittleEndian.PutUint16, they often lead to false positives. Moreover, if a slice is initialized in function A but only appended to after several layers of calls in function Z, should we address this situation? Or should we only consider initialization and appending that occur within the same method?

Thanks @adonovan ! In my practice, the main false positives also come from binary.LittleEndian.PutUint16 related methods.

I agree with what @timothy-king said, maybe we need to add some more standards on the basis of the above three standards to reduce the false positive rate. ( But I am surprised why there are so many cases where the original intention is to get {0,0,0,0,1,2,3,4} )

Do the two maintainers and community developers have any suggestions?

My current idea is that if sli is called by binary.LittleEndian.PutUint16 related methods, it is ignored, just like it has been called by copy or sli[i]

@timothy-king
Copy link
Contributor

My current idea is that if sli is called by binary.LittleEndian.PutUint16 related methods, it is ignored, just like it has been called by copy or sli[i]

We can try to list all of the stdlib functions that might be used to intentionally append to a slice, but my intuition is that this will not be sufficient. We have to also consider user appending functions.

My recommendation is to go back to the original examples and to try to tease out why we think those examples are bugs more precisely. We can re-implement once we better understand this.

@cuishuang
Copy link
Contributor Author

We can try to list all of the stdlib functions that might be used to intentionally append to a slice, but my intuition is that this will not be sufficient. We have to also consider user appending functions.

My recommendation is to go back to the original examples and to try to tease out why we think those examples are bugs more precisely. We can re-implement once we better understand this.

Okay, I will list out the methods in the standard library that trigger false positives, such as binary.LittleEndian.PutUint16. Regarding the "original examples," I want to confirm whether you are referring to the initial PRs I listed that were merged, or the results provided by Alan. I believe it should be the latter. We just need to analyze the common situations that lead to false positives, and by modifying the code accordingly, we can significantly reduce the false positive rate to an acceptable level.

Most of the false positives are caused by slices of type []byte. Ignoring checks for this type of slice could immediately reduce the false positive rate, but this approach seems a bit aggressive.

@cuishuang
Copy link
Contributor Author

cuishuang commented Oct 24, 2024

At present, through the analysis of the corpus information provided by Alan and my previous analysis of the go project on GitHub, the following situations occur when calling a method of the standard library and a false positive occurs:

        binary.LittleEndian.PutUint16(): https://github.com/cloudflare/circl/blob/main/abe/cpabe/tkn20/internal/tkn/policy.go#L84:8
	binary.LittleEndian.PutUint32(): https://github.com/nknorg/nkn/blob/master/chain/store/store.go#L480:12
	binary.LittleEndian.PutUint64(): https://github.com/ethersphere/bee/blob/master/pkg/storage/storagetest/storage.go#L101:8   

	binary.LittleEndian.Uint16()
	binary.LittleEndian.Uint32()
	binary.LittleEndian.Uint64()

	binary.LittleEndian.AppendUint16()
	binary.LittleEndian.AppendUint32()
	binary.LittleEndian.AppendUint64()

	binary.BigEndian.PutUint16():  https://github.com/XZB-1248/Spark/blob/master/client/service/desktop/desktop.go#L233:8
	binary.BigEndian.PutUint32()
	binary.BigEndian.PutUint64():  https://github.com/bnb-chain/bsc/blob/master/core/vm/contracts_lightclient.go#L105:11

	binary.BigEndian.Uint16()
	binary.BigEndian.Uint32()
	binary.BigEndian.Uint64()

	binary.BigEndian.AppendUint16()
	binary.BigEndian.AppendUint32()
	binary.BigEndian.AppendUint64()

I have submitted a new patch to exclude the above situation.

But if this method is not in the standard library but is user-defined, should it still be considered?

(Perhaps based on practical experience, the false positive rate after excluding binary.BigEndian is acceptable)

@adonovan
Copy link
Member

Are you able to estimate the the false positive rate when these cases are excluded? The file attached above contains the complete list of 1225 findings. Can you sample a couple of dozen?

@cuishuang
Copy link
Contributor Author

cuishuang commented Oct 25, 2024

Are you able to estimate the the false positive rate when these cases are excluded? The file attached above contains the complete list of 1225 findings. Can you sample a couple of dozen?


I conducted a detailed analysis, and out of 1,225 cases, approximately 375 of them are related to the usage of the standard library's methods involving binary.BigEndian or binary.LittleEndian to process slices. (I simply wrote a program to detect whether the page contained the keywords binary.BigEndian or binary.LittleEndian. I didn’t further check if slices were actually passed as parameters to these methods, but the margin of error is very small. )

For the remaining 850 cases, I analyzed the first 100.

(And also randomly selected 20 from the remaining 775.)

The detailed result: https://github.com/cuishuang/appends-analysis

Consider the first 100 as an example: 24 of them were clearly false positives. Among these, 22 were due to calls to the standard library functions rand.Read(sli) (cases 6, 7, 8, 9, 10, 11, 12, 15, 27, 29, 30, 31, 35, 36, 66), io.ReadFull(rand.Reader, sli) (cases 2, 18, 19, 38), cipher.XORKeyStream (cases 77, 78), and binary.PutUvarint (case 86), which we can easily avoid reporting
.

In case 87, the usage of unicode.ToUpper does not seem to be a false positive in my opinion.

Additionally:

In case 4:


out := make([]rune, len(slice))
out = out[:0]


This is essentially equivalent to out := make([]rune, 0, len(slice)). Perhaps it is necessary to exclude operations like sli[i:j].

For cases 5, 62, and 63, there are copy operations, but we still flagged these for detection. We need to investigate why they were not ignored.

There are about a dozen cases still in the "todo" state, mainly because they involve calls to user-defined methods, such as:


  • Case 1: runes.Convert from "9fans.net/go/cmd/acme/internal/runes"
  • Case 16: util.ReadAtMost
  • Case 60: fee.Amount.BigInt().FillBytes(sli)
  • Cases 79, 80: Wrap
  • Cases 93, 97: utils.NewBitWriter(utils.NewWriterAtBuffer(sli)
    
    Some of these might not be false positives, It's actually a bug that needs to be fixed. such as:
    
  • Cases 48, 53, 55
    
    Aside from these, the rest are fairly standard errors and should be reported.
    

In summary, when the slice type is []byte, and it's often used in cryptographic contexts, the use of make([]byte, n) is usually exactly what the developer intended.


If we handle the listed methods from the standard library while ignoring user-defined methods, I believe the false positive rate can be reduced to around 5%.


@seankhliao seankhliao changed the title proposal: cmd/vet: add a new feature to the appends analyzer to check if the slice length initialization is incorrect proposal: x/tools/go/analysis/passes/appends: check for incorrect slice length initialization Oct 25, 2024
@adonovan adonovan moved this from Incoming to Hold in Proposals Nov 20, 2024
@adonovan
Copy link
Member

I have put this proposal on Hold until we have a promising prototype implementation.

@cuishuang
Copy link
Contributor Author

I have put this proposal on Hold until we have a promising prototype implementation.

OK, thanks. I'll keep an eye on its progress.

If we reach an agreement, I'll be happy to implement its code.

@adonovan
Copy link
Member

If we reach an agreement, I'll be happy to implement its code.

The order is backwards: what this proposal needs to move forwards is a new idea, a prototype, an experiment, and some encouraging data; agreement will follow from that.

@cuishuang
Copy link
Contributor Author

cuishuang commented Nov 21, 2024

If we reach an agreement, I'll be happy to implement its code.

The order is backwards: what this proposal needs to move forwards is a new idea, a prototype, an experiment, and some encouraging data; agreement will follow from that.

OK, I understand.

My current idea is to add checks for special methods in the standard library (such as binary.BigEndian), or to exclude the []byte type more aggressively.

This is indeed a widespread problem, but it is not easy to reduce the false positive rate in implementation. Maybe there will be a better solution later.

Thanks for your patience guidance and review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Hold
Development

No branches or pull requests

6 participants