Hotfix: Expect more possible metadata columns when parsing ES&S CVRs
Previously, we expected exactly 3 metadata columns. When given a file
with more metadata columns, the extra ones were treated like contest
columns. This bloated the contest metadata and caused a major
performance slowdown.
jonahkagan committed Aug 13, 2024
1 parent fdefe6c commit 97c5113
Showing 1 changed file with 17 additions and 3 deletions.
20 changes: 17 additions & 3 deletions server/api/cvrs.py
@@ -796,9 +796,23 @@ def parse_ballots_file(

 def parse_contest_metadata(cvr_csv: CSVIterator) -> CVR_CONTESTS_METADATA:
     headers = next(cvr_csv)
-    # Based on files we've seen, the first 3 columns are Cast Vote Record,
-    # Precinct, Ballot Style and the rest are contest names
-    first_contest_column = 3
+    # Based on files we've seen, the first few columns are metadata, and the
+    # rest are contest names
+    known_metadata_headers = [
+        "Election ID",
+        "Audit Number",
+        "Tabulator CVR",
+        "Cast Vote Record",
+        "Batch",
+        "Ballot Status",
+        "Precinct",
+        "Ballot Style",
+    ]
+    first_contest_column = next(
+        index
+        for index, header in enumerate(headers)
+        if header not in known_metadata_headers
+    )
     contest_names = headers[first_contest_column:]
     # { contest_name: choice_names }
     contest_choices = defaultdict(set)
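
The effect of the change can be seen on a small standalone example. This is only a sketch, not code from the repository: the helper name `find_first_contest_column`, the sample header row, and the contest names are hypothetical. It also passes a default to `next()`, which the commit does not do, so that a header row with no contest columns yields an empty slice instead of raising `StopIteration`.

```python
# Metadata headers listed in the commit.
KNOWN_METADATA_HEADERS = [
    "Election ID",
    "Audit Number",
    "Tabulator CVR",
    "Cast Vote Record",
    "Batch",
    "Ballot Status",
    "Precinct",
    "Ballot Style",
]


def find_first_contest_column(headers):
    """Return the index of the first header that is not a known metadata header."""
    return next(
        (
            index
            for index, header in enumerate(headers)
            if header not in KNOWN_METADATA_HEADERS
        ),
        # Assumption (not in the commit): fall back to len(headers) so that
        # a file with no contest columns produces an empty contest list.
        len(headers),
    )


# Hypothetical header row: five metadata columns followed by two contests.
headers = [
    "Cast Vote Record",
    "Batch",
    "Ballot Status",
    "Precinct",
    "Ballot Style",
    "Mayor",
    "City Council",
]

# Old behavior: a fixed offset of 3 leaks two metadata columns into the contests.
old_contest_names = headers[3:]

# New behavior: skip every leading column whose header is known metadata.
new_contest_names = headers[find_first_contest_column(headers):]

print(old_contest_names)
print(new_contest_names)
```

With the fixed offset, `Precinct` and `Ballot Style` end up being treated as contests, which is the bloat the commit message describes. The header-based lookup stops at `Mayor` regardless of how many known metadata columns come first.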