Might of the word embeddings

The three most important lessons about neural networks and word embeddings: 1. No free lunch, 2. Size matters, 3. Engineering matters


[New Paper] Information extraction from tables in literature

About two months ago, a paper that resulted from my Ph.D. work was published in the International Journal on Document Analysis and Recognition. The paper is titled “A framework for information extraction from tables in biomedical literature”.


Awarded best paper award at the NLDB 2018 conference


Marvin – A tool for semantic annotation released

Last week I released a version of Marvin, a tool for semantic annotation. It can annotate text using various sources: UMLS (via MetaMap), DBpedia (via a SPARQL interface), WordNet and, probably most importantly, lexicons, dictionaries and terminologies represented in the SKOS (Simple Knowledge Organization System) format. The tool is primarily meant to help with data labeling and normalization of biomedical texts; however, with the help of SKOS, WordNet and DBpedia, it can be useful in any domain.

Since I mentioned normalization and labeling, I should briefly explain them for readers who are not familiar with text mining and some aspects of the semantic web. Basically, usual natural language text