How does Cobalt normalize search queries and handle abbreviations or special characters?

Before running a search, Cobalt applies a standard layer of query normalization to clean up formatting issues that can interfere with matching. This includes:

Removing emojis and emoji variants
Removing private use area characters
Removing zero-width and other invisible characters
Removing the HTML LRM entity (&lrm;)
Converting smart single quotes to straight quotes
Normalizing smart double quotes
Converting dashes to standard hyphens
Converting non-breaking spaces and tabs to normal spaces
Removing newlines
Collapsing consecutive whitespace into a single space
Removing bullets and straight double quotes
Converting the query to lowercase
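
The steps above can be sketched in Python as follows. This is a minimal illustration, not Cobalt's actual implementation: the function name `normalize_query`, the exact character ranges treated as emoji or bullets, the order of operations, and the choice to replace newlines with a space (so that adjacent words do not fuse) are all assumptions.

```python
import re
import unicodedata

# Assumed emoji ranges plus variation selectors; the real ranges may differ.
EMOJI_RE = re.compile("[\U0001F000-\U0001FAFF\u2600-\u27BF\uFE00-\uFE0F]")

# Smart quotes, dashes, and special whitespace mapped to plain equivalents.
TRANSLATE = str.maketrans({
    "\u2018": "'", "\u2019": "'", "\u201A": "'", "\u201B": "'",
    "\u201C": '"', "\u201D": '"', "\u201E": '"', "\u201F": '"',
    "\u2010": "-", "\u2011": "-", "\u2012": "-",
    "\u2013": "-", "\u2014": "-", "\u2015": "-",
    "\u00A0": " ", "\t": " ", "\n": " ", "\r": " ",
})

# Assumed set of bullet characters, plus the straight double quote.
REMOVE_RE = re.compile('[\u2022\u2023\u25E6\u2043"]')


def normalize_query(query: str) -> str:
    """Hypothetical base normalization following the listed steps."""
    text = query.replace("&lrm;", "")   # literal HTML LRM entity
    text = EMOJI_RE.sub("", text)       # emojis and emoji variants
    # "Co" is the private use category; "Cf" covers zero-width and other
    # invisible format characters (U+200B, U+200E, U+FEFF, ...).
    text = "".join(
        ch for ch in text if unicodedata.category(ch) not in ("Co", "Cf")
    )
    text = text.translate(TRANSLATE)    # quotes, dashes, spaces, newlines
    text = REMOVE_RE.sub("", text)      # bullets and straight double quotes
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return text.lower()
```

Under these assumptions, a query pasted from a rich-text source, such as one wrapped in smart double quotes with a non-breaking space and a trailing newline, comes out as a single lowercase line with plain spaces.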

After this base cleanup, Cobalt may apply additional adjustments depending on the state, since Secretary of State search behavior varies by jurisdiction. For example, some states allow characters like / in the search query, while others may require stricter normalization.
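
One way to picture the state-specific layer is a lookup of which punctuation each jurisdiction tolerates. The sketch below is purely illustrative: the state keys `STATE_A` and `STATE_B`, the allowed-character sets, and the function name `apply_state_rules` are hypothetical and do not describe any real state's rules or Cobalt's internals.

```python
import re

# Hypothetical per-state punctuation allowances (placeholders, not real rules).
STATE_ALLOWED_PUNCTUATION = {
    "STATE_A": "/&'-.",   # a permissive jurisdiction that accepts "/"
    "STATE_B": "&'-",     # a stricter jurisdiction
}
DEFAULT_ALLOWED = "&'-"


def apply_state_rules(query: str, state: str) -> str:
    """Replace punctuation the given state does not accept with a space."""
    allowed = STATE_ALLOWED_PUNCTUATION.get(state, DEFAULT_ALLOWED)
    # Keep word characters, whitespace, and the state's allowed punctuation.
    disallowed = re.compile(rf"[^\w\s{re.escape(allowed)}]")
    cleaned = disallowed.sub(" ", query)
    return re.sub(r"\s+", " ", cleaned).strip()
```

With these placeholder rules, `apply_state_rules("a/b co.", "STATE_A")` leaves the query unchanged, while the same query for `STATE_B` becomes `a b co`.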

If an initial search returns no result and the query appears to contain abbreviations, Cobalt may run a second pass: it expands the abbreviated terms and reruns the search.

For example:

“First Natl Bank” may be expanded to “FIRST NATIONAL BANK”

This fallback can surface valid matches that would otherwise be missed. When a match comes from the expanded query, the returned confidenceLevel may be lower than that of an exact match, so clients should review those results rather than treat them as a direct match.
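
The fallback flow can be sketched as below. This is an assumption-laden illustration rather than Cobalt's code: the abbreviation table (only "natl" comes from the example above), the helper names `expand_abbreviations` and `search_with_fallback`, the `search` callable, and the uppercase output (chosen to mirror the example) are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical abbreviation table; only "natl" is taken from the example.
ABBREVIATIONS = {
    "natl": "national",
    "intl": "international",
    "assn": "association",
}


def expand_abbreviations(query: str) -> str:
    """Expand known abbreviations word by word; uppercase mirrors the example."""
    words = [
        ABBREVIATIONS.get(word.lower().rstrip("."), word)
        for word in query.split()
    ]
    return " ".join(words).upper()


def search_with_fallback(
    query: str, search: Callable[[str], Optional[dict]]
) -> Optional[dict]:
    """Try the query as given; on a miss, retry once with expansions."""
    result = search(query)
    if result is not None:
        return result
    expanded = expand_abbreviations(query)
    if expanded == " ".join(query.split()).upper():
        return None  # nothing was expanded, so a rerun would not help
    return search(expanded)
```

A client consuming the result would still inspect `confidenceLevel` on whatever comes back, since a hit found through the second pass is not an exact match on the original query.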
