I'm super excited and have been sleepless for a couple of days already. I'm trying every trick I can to improve sequence labeling with Conditional Random Fields. I need to run NER on billions of documents, and it needs to be fast.
CRFSuite is a workhorse, and a baseline that is very hard to beat on speed and precision. But with o3 I've created a Frankenstein with many tricks, such as a variable-order CRF, feature interactions, bidirectionality, and joint learning with word embeddings. The precision is already higher than CRFSuite's. And I believe it will be better than many other solutions such as BiLSTM-CRF. Definitely much faster.
Now I'm trying to port it to Cython to make it as fast as possible. Here o3 is almost useless, but I'm making progress.