APR 1, 2026Spreadsheets

CSV gotchas that silently break a reconciliation: encoding, delimiters, and mangled IDs

The CSV opened fine. The rows look right. So why does your reconciliation suddenly show two hundred orders missing that you know shipped? Because somewhere between the export and the match, the file quietly changed your data and didn't tell you. Excel dropped the leading zeros off your SKUs, turned a 16-digit order ID into 1.23E+15, read a European export's semicolons as one giant column, or decided 04/05 meant April when it meant May. None of that throws an error. The match just stops matching. Here is the short list of CSV traps that do this, why each one is invisible, and a two-minute check that catches them before they eat your evening.

Why a clean-looking file lies to you

The CSV problems that error out are the easy ones — the import fails, you see a red message, you fix it in minutes. The expensive ones are silent. The file opens, every row is present, the totals look plausible. But the key column — the field your match depends on — has quietly changed shape. A reconciliation keyed on that column then reports differences that are not real: rows flagged "only in system A" that are sitting right there in system B under a mangled key. You go hunting for missing orders that never went anywhere. An operator on r/ecommerce described the version of this that happens across channels:

your Shopify customer data doesn't match your amazon reports and inventory data is scattered across different CSVs.

Their one-line verdict — "It's absolute hell" — is the sound of a match failing for a reason the grid will not show you. So before you treat a difference as a difference, rule out the file. Here is what to rule out.

The gotchas, in the order they bite

Gotcha	What you see	Why the match breaks	How to stop it
Leading zeros dropped	`00421` becomes `421`	The key no longer equals the same SKU on the other side, so the row reads as missing	Import the column as text — never open the file by double-click
Long IDs in scientific notation	a 16-digit order ID shows as `1.23E+15`	The trailing digits are gone, so two different IDs collapse to one or match nothing	Same fix — force the column to text on import
Encoding mismatch (mojibake)	`café` shows as `cafÃ©`, `€` as `â‚¬`	Any key or name holding the garbled character stops matching	Open as UTF-8; the Windows-1252 default is the usual culprit
A UTF-8 byte-order mark	an invisible marker before the first header	The first header (often the key) silently fails to match its mapping	Save as plain UTF-8 without the BOM, or strip the leading bytes
Wrong delimiter	every row lands in one column	There is no key column at all, so nothing matches	RFC 4180 says comma, but EU exports use semicolons and many use tabs — set the delimiter on import
Unquoted comma inside a field	`Smith, Jr.` splits into two columns	Every column after it shifts right, so the wrong field becomes the key	A correct export quotes fields that contain commas — re-export, or set the delimiter explicitly
Ambiguous dates	`04/05/2026`	One file reads April, the other May, so matched rows look like mismatches	Import dates as text or standardize to ISO 8601 `YYYY-MM-DD` before matching
Trailing / invisible whitespace	`1001` looks identical to `1001`	An exact-match comparison fails on the hidden space	TRIM the key, or clean it on import
Text vs number type	`1001` as text is not `1001` as a number	A type-strict match treats them as two different keys	Decide one type for the key and enforce it on both files

Most of these trace back to a single moment: the instant Excel is allowed to guess what your columns are. Take that decision away from it and the list above mostly disappears.

The one habit that prevents most of them: never let Excel guess

Double-clicking a CSV opens it with automatic type detection switched on. That is the moment your leading zeros vanish and your long IDs go scientific. The fix is to import the file instead of opening it, and to tell Excel the key column is text before it reads a single row.

Open Excel first. Do not double-click the CSV. On the Data tab, choose From Text/CSV and pick the file.
In the preview, set Data Type Detection to "Do not detect data types" — or use the Text Import Wizard and set the ID/key and date columns to Text.
Check the File Origin shows UTF-8. If accented characters or the € sign look garbled, it was saved as Windows-1252 — change the origin or re-export.
Confirm the delimiter split the columns correctly. If everything sits in column one, switch the delimiter to semicolon or tab.
Load, then run your match. The key column now holds exactly what the source system exported.

What "save as CSV UTF-8" actually fixes (and what it doesn't)

When a NetSuite import kept failing on an inventory worksheet, two of the most useful replies were about the file, not the system. The first named the trap nobody can see:

Double check there are no spaces behind the characters in the cells. That one has gotten me before and was basically invisible.

The second pointed straight at encoding:

Sometimes there are weirdness with UTF-8 vs Windows 1252 character set but that's usually with punctuation characters not digits in InternalID.

Both are right, and together they show why "just save it as CSV UTF-8" is half an answer. Saving as UTF-8 fixes the mojibake — the garbled accents and symbols that come from a file written in Windows-1252 and read as Unicode, or the reverse. It does nothing about trailing spaces, type coercion, or a wrong delimiter. And a UTF-8 file saved with a byte-order mark can corrupt the very first header — usually your key — so the column maps to nothing. Encoding is one trap on the list, not the list.

Dates: the trap that survives every other fix

You can fix the encoding, force the IDs to text, and get the delimiter right, and dates will still quietly betray you. 04/05/2026 is April 5th to a US export and the 4th of May to most of the rest of the world. When two files disagree about which, every matched row with a date check reads as a mismatch. A seller juggling US and German marketplace reports put it plainly:

the reports are formatted differently (date formats mess me up every time)

The durable fix is to stop trusting the locale. Import date columns as text so Excel cannot reinterpret them, then standardize to ISO 8601 — YYYY-MM-DD — which is unambiguous by design because it leads with the year. Match on the standardized column, never the raw one.

Settlement and marketplace files: same traps, higher stakes

Settlement and payout exports — Amazon, Stripe, Shopify — carry the worst combination of these traps at once: long transaction IDs, multiple currencies and locales, decimal commas, and embedded commas inside descriptions. When someone on r/Netsuite asked how to record marketplace customer payments from a settlement report, the community's reflex was to throw a tool at it:

I run thousands upon thousands of records monthly through Celigo reconciliation products.

That is a fair option at volume — integration platforms like Celigo or A2X do ingest these files and post the journals for you. But a tool does not repair a malformed export; it matches on whatever the file actually contains. Feed it a key mangled by scientific notation and it reconciles wrong, just faster and at scale. Whether you do it by hand or buy the platform, the file has to be clean first — and a settlement file is also a netted, bundled batch, so once the CSV is sound you are reconciling the payout, not the orders.

A two-minute sanity check before you match

Run this on every file before you trust a single comparison. It is faster than chasing a phantom difference for an hour later.

Row count: does the imported row count equal the source report's row count? A gap means the parser split or merged rows.
Key type: is the key column text, with leading zeros intact and no E+ anywhere in it?
Spot-check three known IDs: pick three records you know and confirm they survived the import character-for-character.
Raw view: open the file in a text editor — confirm the delimiter, that the encoding is clean (no Ã or â‚¬ garble), and that quotes are balanced.
Dates: are all dates in one unambiguous format before you compare them?

This is the same discipline as cleaning your key in a spreadsheet — done one step earlier, at the door, before a bad column gets into the workbook at all.

When the file is clean and the match still fails

If you have ruled out the CSV and rows still will not reconcile, the problem has moved up a layer. Either the key itself is wrong or non-unique — a repeating order number, a missing second field — or the two systems genuinely disagree, and you are now doing real reconciliation rather than fighting a parser. That is the good kind of difference: the kind worth investigating. CSV hygiene just makes sure every difference you chase is a real one. And where the same export lands on your desk every week, the cleaning belongs in a repeatable step, not re-done by hand each time it costs you an evening.

Frequently asked questions

Why does Excel drop the leading zeros from my CSV?

Because opening a CSV by double-click lets Excel auto-detect column types, and it reads a zero-padded value like 00421 as the number 421. Import the file through the Data tab's From Text/CSV instead, and set that column to Text before loading, so the value is preserved exactly.

Why is my long order ID showing as 1.23E+15?

Excel converted a number longer than 15 digits to scientific notation and discarded the trailing digits, so the ID is no longer accurate. Force the column to Text on import. Once the digits are gone they cannot be recovered, so you must re-import from the original file.

My CSV opens as one big column — what went wrong?

The file uses a delimiter Excel did not expect, usually a semicolon (common in European exports where the comma is the decimal separator) or a tab. Use From Text/CSV and set the delimiter explicitly so the columns split correctly.

That is mojibake — the file was written in one character encoding and read in another, most often a Windows-1252 file read as UTF-8 or the reverse. Re-open or re-export the file as UTF-8 and the characters resolve.

How do I stop dates from flipping between day and month when I import?

Import date columns as text so Excel cannot reinterpret them by locale, then convert them to the ISO 8601 format YYYY-MM-DD, which is unambiguous because it leads with the year. Match on the standardized column rather than the raw one.