
Among the shit I have seen in CSV files: strings with no " quoting (including strings that contain a line break), innovative separators, date formats, and number formats, no escaping of " within strings, extra rows added by the reporting tool used to export to CSV, etc.


True. But most of those problems are pretty easy for the non-technical person to see, understand, and (often) fix. Which strengthens the "friendship bridge".

(I'm assuming the technical person can easily write a basic parsing script for the CSV data - which can flag, if not fix, most of the format problems.)
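A rough sketch of the kind of basic flagging script I mean, in Python. The function name `flag_csv_problems` and its parameters are made up for illustration, and it assumes the first row is a header that sets the expected field count. It only flags blank rows and rows with the wrong number of fields, which is also how an unquoted line break inside a string usually shows up.

```python
import csv
import io


def flag_csv_problems(text, delimiter=",", expected_columns=None):
    """Return a list of (record_number, message) for records that look malformed.

    Hypothetical helper: record numbers count parsed records starting at 1,
    so they can differ from physical line numbers when a quoted field
    legitimately spans several lines.
    """
    problems = []
    # newline="" leaves line endings untouched, as the csv module recommends.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    for record_number, row in enumerate(reader, start=1):
        if not row:
            # csv.reader yields an empty list for a completely blank line.
            problems.append((record_number, "blank row"))
            continue
        if expected_columns is None:
            # Assumption: the first non-blank row is the header.
            expected_columns = len(row)
            continue
        if len(row) != expected_columns:
            problems.append(
                (record_number,
                 f"expected {expected_columns} fields, got {len(row)}")
            )
    return problems
```

Calling `flag_csv_problems(open(path, newline="").read())` on an export gives a list the non-technical person can walk through row by row. It flags, it does not fix; things like odd date or number formats would need their own per-column checks.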

For a dataset of any size, my experience is that most of the time & effort goes into handling records which do not comply with the non-technical person's beliefs about their data. That data came from (say) an old customer database - and between bugs in the db software, and abuse by frustrated, lazy, or just ill-trained CSRs, there are all sorts of "interesting" things which need cleaning up.



