As noted on the scikit-learn mailing list, the poor results of liblinear can be caused by a low convergence tolerance, but also by the fact that the internal memory layout used by liblinear is not optimized for dense input arrays, whereas the layout used by scikit-learn's SGDClassifier is.


Any idea what the preferred algorithm would be for linear SVMs with dense data? (other than SGD, of course)


SGD with averaging [1] :)

More seriously, SGD is pretty hard to beat for fitting linear models (SVM, logistic regression and other l1-penalized models with various loss functions) when the number of samples gets large.

[1] http://leon.bottou.org/projects/sgd
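
For anyone who wants to try this, here is a minimal sketch of a linear SVM fitted by averaged SGD on dense data. It assumes a scikit-learn version whose SGDClassifier accepts the `average` option; the synthetic dataset and every hyperparameter value below are illustrative choices of mine, not settings from the benchmark or the mailing list.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Synthetic dense data (illustrative only, not the benchmark's dataset).
X, y = make_classification(
    n_samples=5000, n_features=50, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hinge loss + L2 penalty gives a linear SVM objective.
# average=True makes coef_ the running average of the SGD iterates.
clf = SGDClassifier(
    loss="hinge",
    penalty="l2",
    alpha=1e-4,      # regularization strength (assumed value)
    average=True,
    max_iter=20,     # fixed number of passes over the training data
    tol=None,        # disable early stopping so all 20 epochs run
    random_state=0,
)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print("test accuracy:", accuracy)
```

Swapping `loss="hinge"` for `loss="log_loss"` (named `"log"` in older releases) turns the same setup into logistic regression, and `penalty="l1"` gives the l1-penalized variants mentioned above.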



