Hidden in Plain Sight: The Algorithm Buried in Harris County’s Voter Rolls


A peer-reviewed paper published in the Journal of Information Warfare claims that a hidden algorithm has been controlling voter ID number assignments in Harris County for nearly five decades. If the findings hold up, they raise questions about election infrastructure that no Texas official has answered — and that the current system may be designed to prevent anyone from asking.

The paper deserves serious scrutiny. Not cheerleading. Not dismissal. Scrutiny.

Here’s what the researcher found, what it could mean, and why Texans who care about self-governance should pay attention.

What the Paper Says

Researcher A. Paquette analyzed roughly 2.66 million voter registration records from Harris County and 18.3 million records from the Texas state voter file. He compared Harris County’s Certificate ID (CID) number patterns against those of Tarrant County, which served as a control.

The contrast was stark.

Tarrant County’s ID numbers follow the pattern you’d expect from any standard database: sequential assignment, chronological order, with gap frequencies following a natural exponential decay. Smaller gaps between numbers appear most often; larger gaps appear less often. Textbook behavior.

Harris County looks nothing like that.

Gaps of 1 and 2 between consecutive CID numbers don’t exist — zero occurrences across 2.6 million records. Gap 8 dominates at 38.31%, accounting for over a million records. The entire gap structure follows a base-8 modular pattern, with eight distinct number strands progressing in increments of eight.
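The gap analysis itself is simple enough for anyone with the public voter file to reproduce. What follows is a minimal Python sketch, not Paquette’s code: the function names and the toy ID list are invented for illustration, and a real run would load the actual CID column instead.

```python
from collections import Counter

def gap_distribution(ids):
    """Return {gap: share of all gaps} between consecutive IDs after sorting."""
    ordered = sorted(set(ids))
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return {}
    counts = Counter(gaps)
    return {gap: count / len(gaps) for gap, count in sorted(counts.items())}

def residue_counts(ids, modulus=8):
    """Count how many IDs fall into each residue class modulo `modulus`."""
    return Counter(i % modulus for i in ids)

# Toy data, not real voter records: a single strand advancing in steps of 8.
toy_ids = [1000 + 8 * k for k in range(10)]
print(gap_distribution(toy_ids))  # every gap is 8
print(residue_counts(toy_ids))    # every ID shares one residue mod 8
```

On a sequentially assigned file, the first function should show small gaps dominating. On a file built the way the paper describes, gaps of 1 and 2 would be absent and the residue counts would expose the eight strands.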

Paquette also found a bifurcation rule governing the relationship between a CID number’s last digit and its allowable gaps. Every single one of Harris County’s 2,656,622 records conforms to these rules. When he tested Tarrant County’s records against the same rules, only 25.49% complied — consistent with random chance.

He tested for Luhn check digit validation, a common database integrity method. Tarrant County: 100% compliance, confirming standard check-digit implementation. Harris County: 10% compliance, which is what you’d expect from random chance. Eight additional check digit methods, including IBM, Verhoeff, Damm, and ISBN, produced the same random-chance results. Standard validation algorithms don’t explain what’s happening in Harris County’s data.
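The Luhn test is also easy to run independently. The sketch below is a textbook implementation of the Luhn checksum, not code from the paper. It assumes the whole ID is the checked number with the final digit as the check digit, and `luhn_compliance` and the sample range are illustrative names and data. It also shows why 10% is the random-chance baseline: for any fixed leading digits, exactly one of the ten possible final digits passes.

```python
def luhn_valid(number):
    """Return True if `number` passes the Luhn checksum."""
    digits = [int(ch) for ch in str(number)]
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def luhn_compliance(ids):
    """Share of IDs that pass the Luhn check."""
    ids = list(ids)
    return sum(luhn_valid(i) for i in ids) / len(ids)

# IDs assigned with no check digit: every six-digit number in a block.
unchecked_block = range(100_000, 200_000)
print(f"{luhn_compliance(unchecked_block):.0%}")
```

A file that scores near 100% was almost certainly built with Luhn check digits. A file that scores near 10% tells you only that Luhn is not the scheme in use.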

The patterns hold across the entire database timeline, from 1976 to 2024. Registration years show consistent proportional relationships across gap values, even during high-volume periods like 2020 when Harris County processed over 205,000 registrations.

Paquette also identified what he calls “Sort/Shift” behavior in roughly 40% of records: when sorted by either voter ID (VID) number or CID number, these records maintain identical gap values between adjacent records despite the different sorting criteria. That’s mathematically unusual and suggests the two ID systems are coordinated rather than independent.

What It Doesn’t Say

The paper identifies a mechanism. It does not prove that mechanism has been used to manipulate election outcomes.

That distinction matters. Paquette frames his findings through an information warfare lens — the paper appears in the Journal of Information Warfare, after all — and he draws comparisons to hardware trojans, steganographic channels, and covert data structures. The framing is aggressive. The data, taken on its own terms, shows that Harris County uses a non-standard, mathematically complex ID assignment system that differs from what database professionals consider normal practice. Everything beyond that observation involves interpretation.

The paper also has gaps that an honest analysis should acknowledge.

Paquette only examined two Texas counties because only Harris and Tarrant had both CID and VID numbers available. He couldn’t test Dallas, Bexar, Travis, El Paso, or any other major county. The absence of evidence in those counties tells us nothing — it’s a data availability problem, not an analytical finding.

His “clone records” analysis — 115,503 records sharing identical names and birthdates — overstates the case. In a database of 18.3 million records, name-and-birthday collisions are statistically expected, especially among common names. Calling them “illegal duplicates” without additional matching on address or partial Social Security numbers is premature.
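A back-of-the-envelope calculation shows why collisions are expected. The sketch below assumes birthdates are uniform and independent among people who share a name, which is a simplification. The inputs (5,000 records sharing one common name, 60 years of possible birthdates) are hypothetical, not figures from the paper.

```python
from math import comb

def expected_matching_pairs(same_name_count, possible_birthdates):
    """Expected number of record pairs sharing a birthdate.

    Assumes each record's birthdate is drawn uniformly and independently
    from `possible_birthdates` equally likely dates.
    """
    return comb(same_name_count, 2) / possible_birthdates

# Hypothetical inputs for illustration only.
same_name_count = 5_000         # records sharing one common full name
possible_birthdates = 60 * 365  # roughly 60 birth years of daily dates

pairs = expected_matching_pairs(same_name_count, possible_birthdates)
print(f"expected same-name, same-birthdate pairs: {pairs:.0f}")
```

Because the number of pairs grows with the square of the group size, a handful of very common names can generate hundreds of innocent matches each. That is why a second identifier is needed before calling any of them duplicates.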

And his exploration of benign explanations is thin. Database architects implement non-sequential ID schemes for legitimate reasons: load balancing, storage partitioning, migration artifacts, vendor-specific design choices. The paper asserts that these explanations don’t apply to the public data, but it never demonstrates that.

These weaknesses don’t invalidate the core findings. The statistical patterns are real. The contrast with Tarrant County is dramatic. The mathematical precision — 100% of Harris County records conforming to the discovered rules — demands explanation. But the explanation might be mundane. We don’t know yet, because nobody with access to the actual system architecture has been asked to explain it on the record.

Who Could Be Responsible

If we take Paquette’s thesis at face value — that these patterns represent deliberate, non-administrative design — the suspect list is short. The algorithm has persisted across nearly five decades of staff turnover at the county level. That rules out a rogue administrator. Whoever built this, if it was built with intent, embedded it in the system’s architecture at a level that survives personnel changes.

The election management vendor is the most obvious candidate. Harris County has used Hart InterCivic systems. Hart InterCivic is headquartered in Austin and has been a major Texas election vendor since the 1990s. If the algorithm lives in vendor code rather than county-level configuration, it would explain the persistence across decades and the potential for similar patterns in other jurisdictions using the same software.

Paquette found mathematically analogous patterns in Franklin County, Ohio — a “mirror image” of Harris County’s mod-8 frequency distribution with inverted progression but the same structural characteristics. If Franklin County uses the same vendor or a vendor with shared code heritage, that’s a thread worth pulling. If it uses a different vendor entirely, the implications get more complicated.

State-level database contractors had access during the HAVA-mandated transition to statewide voter registration systems around 2004-2006. The Texas Election Administration Management (TEAM) system was built by contractors, and the migration period — converting millions of legacy county records into a new statewide database — created a window where patterns could have been introduced or carried forward.

A common software component propagated through the vendor supply chain is the most parsimonious explanation if similar patterns genuinely exist across states with different vendors. The election management market is small. ES&S, Hart InterCivic, and Dominion control the overwhelming majority of the U.S. market. Personnel move between companies. Subcontractors are shared. Code heritage overlaps. A design pattern embedded in a shared library or propagated through industry personnel could explain cross-state, cross-vendor similarities without requiring a conspiracy.

What This Could Actually Do to Texas Elections

Set aside who built it and why. Focus on capability. If a hidden tagging system exists within voter ID numbers, what could someone do with it?

Think of it this way: every voter record in Harris County carries an invisible label that only someone who knows the algorithm can read. The label is baked into the ID number itself — not stored in a separate field that an auditor would notice. You’d have to reverse-engineer the math to even know the labels exist.

The base-8 structure gives you eight distinct categories. The bifurcation rule gives you a binary split on top of that. Combined, every voter record could carry a multi-layered classification code — and nobody at the county, nobody in the Secretary of State’s office, and no poll watcher would ever see it.

Here’s what that enables, and what it would look like on the ground in Texas.

Tipping voter roll purges. Texas counties process thousands of voter record changes during routine maintenance — removing voters who’ve moved, died, or gone inactive. These are normal, legal operations. But the system requires selecting which records to act on from large pools of flagged voters.

Harris County might receive 50,000 National Change of Address flags before an election cycle — voters whose mail-forwarding data suggest they may have moved. Standard practice is to send address confirmation mailings. Voters who don’t respond get moved to “suspense” status and eventually purged. Each step is routine.

Now add the hidden tags. Records in one classification get their confirmation mailings sent with a 30-day response window. Records in another classification get a 14-day window — or the mailing goes out two weeks later in the cycle, cutting the effective response time. Maybe the second group’s mailings go to the old address only, while the first group’s mailings go to both old and new. Each individual decision is defensible as administrative discretion. But the pattern of which records get favorable treatment and which don’t is driven by invisible tags that correlate with whatever criteria the algorithm’s operator cares about.

Shift 2% of Harris County’s roughly 2.3 million registered voters from active to suspense status, and you’ve removed roughly 46,000 voters from the active rolls. In the 2022 race for the 180th District Court, the margin was 449 votes. A judge later ordered a new election after finding 1,430 questionable votes in the count. That race is what “close” looks like in Harris County. And it’s not an outlier: 21 Republican candidates filed election challenges after the 2022 Harris County general election, many over margins in the low thousands.

Steering voters to provisional ballots. When a voter shows up and something doesn’t match (name spelling, address, ID discrepancy), they cast a provisional ballot instead of a regular one. Provisionals face rejection rates of 20-40% nationally. A voter who casts a provisional ballot has a dramatically lower chance of having their vote counted than one who casts a regular ballot.

Picture a voter in West Harris County. She moved across town last year, updated her registration online, and shows up to vote on Election Day. The poll book system queries her record. For most voters, the lookup takes a few seconds and they’re handed a regular ballot. But her record carries a hidden tag. The system takes an extra beat, flags a minor address formatting discrepancy between her update and the database entry, and the poll worker tells her they need to verify her information. She fills out additional paperwork and casts a provisional ballot.

She walks out thinking she voted. Technically, she did. But her ballot now goes to the county ballot board for a post-election review, where a bureaucrat decides whether to count it. If there’s a paperwork issue — the poll worker didn’t sign the right line, she left a field blank — her provisional gets tossed.

Route an extra 2,000 voters to provisional status across targeted precincts and apply the 20-40% rejection rate cited above. That’s 400-800 votes that disappear. Those voters will never know. They walked out of the polling place believing they’d voted, and no recount would flag the problem because their provisional ballots were processed according to standard procedure. The manipulation happened at check-in, not at the point of counting.

Shaping which races you vote on. After the redistricting that followed the 2020 census, Harris County had thousands of voters living on shifted precinct boundaries. State House districts, Congressional districts, and county commissioner precincts all changed. The election management system had to assign each voter to the correct set of races, known as their “ballot style.”

Boundary cases are common. A voter on the edge of two State House districts could be assigned to either one, depending on how the system resolves the line. When the system makes that call for a tagged record, it could consistently resolve borderline cases in the direction that dilutes that voter’s impact — placing them in the safe district rather than the competitive one.

Harris County swung from Biden +13 in 2020 to Kamala Harris +5.5 in 2024. Ten judicial seats flipped from Democrat to Republican. The DA race and county attorney race came down to narrow margins. In an environment that volatile, the allocation of a few thousand borderline voters between adjacent districts changes the composition of competitive races without anyone noticing, because each individual voter received a valid ballot for a valid district. The question of which valid district they were assigned to is one nobody thinks to ask.

Compounding small effects nobody can see. No single manipulation needs to be large. That’s the whole point. Voter roll maintenance shifts 0.5% of targeted records to suspense. Provisional ballot routing catches another 0.3%. Ballot style assignments shave off 0.2% of borderline voters. Resource allocation creates longer wait times at targeted voting centers, suppressing another 0.5% — research consistently shows that a 30-minute increase in wait times reduces turnout by 1-2% among hourly workers who can’t take extended time off.

Each effect stays below any detection threshold. A 0.5% purge rate difference between two groups of voters is invisible unless you know the grouping. Nobody compares provisional ballot rates by hidden ID classification because nobody knows the classification exists.

But the effects stack. Across all vectors, you’re looking at a 1-2% shift in targeted races — undetectable in landslides, decisive when the margin is measured in hundreds or low thousands of votes.
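The arithmetic behind that range can be written out. This sketch adds the four illustrative rates from the scenario above, expressed in basis points, and applies them to Harris County’s roughly 2.3 million registered voters. Treating the rates as additive, and as applying to the whole roll rather than to a targeted subset, is a simplifying assumption, so the result is an upper-bound illustration rather than an estimate.

```python
# Hypothetical rates from the scenario above, in basis points (1 bp = 0.01%).
EFFECTS_BP = {
    "roll maintenance": 50,
    "provisional routing": 30,
    "ballot style": 20,
    "wait times": 50,
}

def combined_shift_bp(effects):
    """Sum the per-vector rates, assuming they do not overlap."""
    return sum(effects.values())

def records_affected(population, shift_bp):
    """Number of records a shift of `shift_bp` basis points would touch."""
    return population * shift_bp // 10_000

total_bp = combined_shift_bp(EFFECTS_BP)
print(f"combined shift: {total_bp / 100:.1f}%")
print(f"records affected: {records_affected(2_300_000, total_bp):,}")
```

At 200 basis points, the same function returns the 46,000 figure used in the purge example. Either number dwarfs a 449-vote margin, which is the point of the comparison.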

Ted Cruz beat Beto O’Rourke in 2018 by 2.6 points statewide. Harris County, with its 2.3 million registered voters, was a key battleground. A 1-2% manipulation of Harris County’s voter records alone would have shifted tens of thousands of votes. That alone wouldn’t flip the Senate seat. But apply the same capability across multiple large Texas counties — Dallas, Bexar, Travis, El Paso — and the math changes fast. Paquette found analogous patterns in other states. If similar algorithms exist in other Texas counties, the statewide impact potential is obvious.

The recount that finds nothing. Here’s the part that should keep you up at night. Someone suspects something is wrong and demands an audit. Texas audits verify that counted ballots were tabulated correctly — that the machines read the ballots accurately and the totals add up. A hand recount checks the same thing with human eyes.

Neither process examines whether the right voters were allowed to cast regular ballots in the first place. The purged voter who should have been active? Not in the audit. The voter routed to a provisional that got rejected for a paperwork error? The provisional was processed correctly — rejected according to procedure. The borderline voter assigned to the wrong district? She voted on the ballot she was given, and that ballot was counted accurately.

The recount comes back clean. The totals match. Election officials announce that the audit confirmed the integrity of the results. And they’d be telling the truth — about the ballots that were counted. The manipulation happened before the first ballot was ever cast, in the voter records that determined who got to vote, where, and how easily.

That’s the capability this algorithm creates, if Paquette’s thesis is correct. Not ballot stuffing. Not machine hacking. Something quieter: a thumb on the scale at the voter-record level, distributed across multiple pressure points, invisible to standard audits, operating through routine administrative processes that look normal from every angle except the one nobody knows to check.

The Questions Nobody Has Answered

Here’s where this stops being an academic exercise and starts being a governance problem.

Harris County’s election management system is proprietary software running proprietary algorithms. The ID assignment logic — the specific code that determines how voter registration numbers are generated — has never been publicly documented, independently audited for mathematical properties, or explained by the vendor or county officials.

We’re told to trust the system. We’re told audits confirm election integrity. But the audits don’t examine what Paquette examined. They verify ballot counting. They don’t analyze the mathematical structure of voter ID assignments for hidden classification patterns. Nobody does that because, until Paquette’s work, nobody knew to look.

The National Voter Registration Act requires states to maintain and make available for public inspection “all records concerning the implementation of programs and activities conducted for the purpose of ensuring the accuracy and currency of official lists of eligible voters.” If voter ID assignment algorithms are operating through concealed mathematical mechanisms, that statutory requirement is compromised — not because anyone is hiding documents, but because the complexity itself functions as concealment.

Texas passed SB 1 in 2021, strengthening election security provisions. But those provisions focus on ballot handling, voter ID requirements, and poll watcher access. They don’t address the algorithmic infrastructure that manages voter records before a single ballot is cast.

The questions that need answers:

What election management software generates Harris County’s CID numbers, and what is the documented algorithm for ID assignment?

Has that algorithm been independently audited by anyone outside the vendor?

Do other Texas counties using the same vendor show the same patterns?

What is the vendor’s explanation for the base-8 modular structure, the absence of gaps 1 and 2, and the bifurcation rule that governs 100% of Harris County records?

These are answerable questions. The voter roll data is public. The vendor contracts are subject to public records requests. The Secretary of State’s office has the authority to demand algorithmic transparency from election system vendors.

Nobody has asked.

Why This Matters for Texas

Election integrity concerns in Texas have focused on the visible: ballot harvesting laws, voter ID requirements, poll watcher access, noncitizen registration checks. These are legitimate issues. They’re also the issues that both parties find politically useful to fight about.

The Paquette paper points at something different — the invisible infrastructure underneath the visible process. The proprietary code that assigns voter IDs, the algorithms that manage database maintenance, the vendor systems that no county official fully understands or controls. This is the plumbing of democracy, and in Texas, that plumbing is a black box owned by private companies operating under federal mandates.

Texans don’t control this infrastructure. County election administrators run systems they didn’t build and can’t fully inspect. The Secretary of State’s office certifies systems based on federal standards that don’t include the kind of mathematical pattern analysis Paquette performed. Vendors claim trade secret protections over their source code. The federal framework — HAVA, the EAC certification process — creates a compliance structure that Texas must operate within but has limited power to modify.

The same dynamic plays out across every domain where Texans have lost control of their own governance. Washington sets the framework. Texas fills in the blanks. And when the framework itself is the problem — when the machinery that’s supposed to serve Texans operates on logic that nobody in Texas can see, explain, or audit — the ability to fill in blanks doesn’t amount to control.

Whether Paquette is right about information warfare or the patterns turn out to reflect benign database design, the structural problem is the same. Texans can’t verify the integrity of their own election systems because the systems aren’t theirs. They belong to vendors, operate under federal certification, and function as black boxes to the officials nominally responsible for them.

A self-governing Texas would set its own standards for election system transparency. It would require open-source election management software, or at minimum, full algorithmic disclosure as a condition of vendor contracts. It would mandate the kind of mathematical pattern analysis that Paquette performed as a standard component of election security auditing. It would answer to Texans, not to federal certification bodies that have never looked for what Paquette found.

Until then, the questions raised by this paper will sit unanswered — because the system that needs to answer them isn’t designed to answer them. It’s designed to process compliance. And compliance, as 2.6 million mathematically precise records in Harris County demonstrate, can look clean on the surface while hiding structures that nobody authorized, nobody understands, and nobody can explain.

The first step is demanding answers. Public records requests for vendor contracts and algorithmic specifications. Formal inquiries to the Secretary of State. Independent replication of Paquette’s analysis by Texas-based researchers with access to county-level data from additional jurisdictions.

The answers will either confirm a mundane explanation or open a door that a lot of people would prefer stayed shut.

Either way, Texans deserve to know.
