When Excel writes a file in the DIF format, SAS is unable to read the file. Here’s a valid DIF that has 24 columns, and 446,213 rows:
TABLE 0,1 "" VECTORS 0,24 "" TUPLES 0,446213
Note that “Tuples” in mathematical formulas are equivalent to “Rows”. A VECTOR is like a dimension, or field. In the case of Excel, it refers to columns. So far so good. However here is how such a file would be saved by Excel 2010:
TABLE 0,1 "EXCEL" VECTORS 0,446213 "" TUPLES 0,24
Excel has no problem reading the above file format, as it ignores tuples/vectors internally. However SAS cannot handle this not-so-standard deviation.
Below is VBA code that after saving a DIF file “fixes” the header by opening the file in binary mode and corrects the issue. Note FName contains both path and filename:
'Fixup TUPLES and VECTORS Dim filenum As Integer Dim strData As String Dim VectorIdx As Integer Dim TuplesIdx As Integer Dim VectorStr As String Dim TuplesStr As String Const CharsToProcess = 60 Dim outStr Dim CRLF As String Dim DoubleQuote As String Dim Fname As String CRLF = Chr(13) & Chr(10) DoubleQuote = Chr(34) Fname = saveString filenum = FreeFile Open Fname For Binary Access Read As filenum strData = String$(CharsToProcess, " ") Get #filenum, , strData Close #filenum VectorIdx = InStr(strData, "VECTORS") TuplesIdx = InStr(strData, "TUPLES") VectorStr = Mid(strData, VectorIdx + 9, 14) 'overly generous portion of chars TuplesStr = Mid(strData, TuplesIdx + 8, 14) If InStr(TuplesStr, Chr(13)) > 0 Then 'trim CR LF TuplesStr = Left(TuplesStr, InStr(TuplesStr, Chr(13)) - 1) End If If InStr(VectorStr, Chr(13)) > 0 Then 'trim CR LF VectorStr = Left(VectorStr, InStr(VectorStr, Chr(13)) - 1) End If outStr = "VECTORS" & CRLF & TuplesStr & CRLF & DoubleQuote & DoubleQuote & CRLF & "TUPLES" & CRLF & VectorStr filenum = FreeFile Open Fname For Binary Access Write As filenum Put #filenum, VectorIdx, outStr Close #filenum
One Response
Very helpful article!