So, I've been writing a little app to combine some Excel spreadsheets. These sheets all have header rows, so I cand open them up with an OleDbConnection, do a "SELECT * FROM [Sheet1$]"
and go along my merry way. The problem is, two of the columns I need to work with just aren't there. They're on the excel sheet just
fine, but my OleDbDataReader finds nothing in those columns on any row. Looking at the sheet, I see that in those columns, there isn't any data for about 12 rows.
So, I put in zeros at the top of those columns, and then it works fine.
Is the OleDbConnection really making assumptions about the dataset based on the first row? After much googling and little success, I try to find a definition of the
connectionstring, hoping there's some attribute like "rows to scan for schema" I can set, to tell it to actually read my data.
Nathan points me to an excellent
resource,
Connectionstrings.com, and they kindly let me know that I can specify
HDR=Yes; to indicate that I have a
header row in my sheets, and
IMEX=1; which, according to
Connectionstrings.com:
tells the driver to always read "intermixed" data columns as text
Apparently, the two columns in question were, in fact "intermixed" data columns, and once I set that in my connectionstring, all worked fine.
The mixed use of "1" and "Yes" aside, why the hell would your database driver just silently ignore data? I mean, if "intermixed" data columns is an error
condition, then have the balls to throw a exception, warning, event log entry, anything. Don't just not work and expect me to magically know where the problem is.
I almost reimplemented the whole damn thing using Excel objects and the Office API, and that would've taken me another couple of days. A pox on the Excel team!
Well, I guess not a pox, because at some point some manager sat them in a room and said "Ok, now lets let people query this using SQL!", which was probably
punishment enough.