I’m currently working on a new solution to allow geocoding and data augmentation with demographic data. The demographic data set that we are working with is provided to us as several gigabytes of data separated into logical groupings of .DBF files each with 200+ columns of data and an accompanying data dictionary document in Word format.
I needed to upload all of this data into a new SQL Server 2008 R2 database so that we can do some aggregation and other processing against the data. Unfortunately, SQL Server 2008 R2 does not support a direct import of the data found in the .DBF files. One option is to convert the .DBF files into Excel files. Excel handles the conversion easily. Unfortunately, in my case, the width of the tables is simply too wide for the Microsoft Office Data Access Engine to handle properly. When I attempt to import the resulting Excel files into the database I am greeted with a crash error dialog after a lengthy wait.
The second option, which is the one that I had to go with is to use Microsoft Access to do the conversion. Fortunately, Access will open the .DBF files with no problem. I chose to use the import approach as opposed to the linked tables approach. Due to the size of data that I was dealing with, it was actually faster to load the respective tables into Access, and then use the SQL Server upsizing wizard in Access to push the data into a local SQLExpress instance. Once that had completed, I then detached the database, copied the .MDF and .LDF files to the server and reattached.
The only glitch I’ve had with this process is that for some reason Access will fail to copy data into the SQL Server tables. It’s erratic and I’ve not determined why it does this. There are no errors produced during the export process. It creates the table structure with no problem, but occasionally it will fail to copy the data – even though it runs for several minutes. That’s a battle for another time.
Now I’m off to resume my work on the geocoding functionality that will ultimately pull in this data.