Because the default `nastring` is `NA`, there is the following problem:
- take a data structure that has e.g. a `String` column with missing data in it;
- `save` it to disk using default parameters; missings get converted to `NA` on disk;
- `load` it back and you have the `"NA"` string where you earlier had missings.
The same problem occurs with e.g. `Char` data.
While `NA` is a sensible default for numeric columns, it is confusing for non-numeric columns (and can actually lead to wrong results, as it is entirely possible to have a genuine `NA` string in the data).
I think it would be best to use an empty string for missings in non-numeric data.
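
To make the round trip concrete, here is a minimal sketch in Python using only the standard `csv` module. The `save` and `load` helpers below are hypothetical stand-ins written for this illustration, not the package's actual API, and `None` plays the role of a missing value.

```python
import csv
import io


def save(rows, nastring="NA"):
    """Hypothetical writer: serialize rows to CSV text, writing `nastring` for None."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow([nastring if value is None else value for value in row])
    return buf.getvalue()


def load(text, nastring=None):
    """Hypothetical reader: parse CSV text; cells equal to `nastring` become None.

    With nastring=None no cell is treated as missing.
    """
    reader = csv.reader(io.StringIO(text))
    return [
        [None if nastring is not None and value == nastring else value for value in row]
        for row in reader
    ]


# A string column holding a regular value, a missing value, and a genuine "NA" string.
rows = [["1", "a"], ["2", None], ["3", "NA"]]

on_disk = save(rows)  # the missing value is written as NA

# Reading back without an NA marker turns the missing value into the string "NA".
as_strings = load(on_disk)

# Reading back with "NA" as the marker turns the genuine "NA" string into a missing value.
as_missing = load(on_disk, nastring="NA")

# Using an empty string as the marker on both sides keeps the two cases apart.
roundtrip = load(save(rows, nastring=""), nastring="")
```

In this sketch neither way of reading the default output recovers the original column, while the empty-string marker does. The trade-off is that a genuinely empty string would then be read back as missing.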