natural language processing


What is the structure in data?


Building a search engine (Lucene tutorial)

Since Google took over our lives and turned its name into a verb for searching, building a search engine has been considered a cool thing to do. I have crossed paths with search engines several times in my life. I even worked at a company that was, I guess, pretending to build a search engine (I was there for just one month before I realized they were not serious). I also got some background from text mining courses (both on Coursera and at the University of Manchester), and I came to a point in my research where I had to build a search engine. Not many people today build a search engine from scratch, since there are several engine libraries out there and one of the


Political bot (AI) fighting human bots (using NLP and OCR)

I should probably write this in Serbian, but to stay consistent, English it is.

Since elections will soon be held in Serbia, there is a lot of talk about political campaigns. One of the major issues in the news is the use of human bots in political campaigns on the internet. Since parties in Serbia have so many members (it is estimated that almost every second person in the country is a member of some party), they use their members as bots who watch news articles on internet portals and comment on them (to make people vote for the party they belong to). A couple of years ago, there was no internet campaigning at all in Serbia. Now, thousands of people are commenting on articles


What is the big deal with natural language processing?

Recently, here at the University of Manchester, in a class for all PhD students, we realized that almost half of the students in the group are doing some kind of natural language processing, and almost everyone is doing something related to machine learning (even the hardware guys are building a neural-network-like multi-processor architecture). Unfortunately, these efforts are not joint but are spread over several research groups (the NLP and text mining research group, the National Centre for Text Mining, which has its own research students, and probably one more group). Still, a lot of effort here goes into natural language understanding. So what is the big deal? Why are so many projects funded in this particular field? I cannot say


Personalized relevance classifier of sentences

In this article I would just like to pitch an idea for a personalized classifier, and I would like to hear your opinion on whether this approach could work and what problems it might have. So what is the problem? I would like to build a personalized relevance classifier.

Problem definition

Every user tracks mentions of some term on the internet or social media. The terms are usually brands, if the users are marketers or business owners, or events, names, etc. Since a term can be ambiguous, the user has the opportunity to tell the program that a given sentence is irrelevant to them. For example, if the user enters “Apple”, at first the program will show all mentions of both the Apple company and the fruit
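
The feedback loop in this problem definition could be sketched roughly as below. This is only an illustration under my own assumptions: the excerpt does not say which model is used, so a small per-user Naive Bayes over bag-of-words features is swapped in, and the class name `UserRelevanceClassifier`, its methods, and the example sentences are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


class UserRelevanceClassifier:
    """Hypothetical per-user Naive Bayes model (an assumption, not the author's method)."""

    LABELS = ("relevant", "irrelevant")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.sentence_counts = Counter()
        self.vocabulary = set()

    def add_feedback(self, sentence, label):
        """Store one sentence that the user marked as relevant or irrelevant."""
        tokens = tokenize(sentence)
        self.word_counts[label].update(tokens)
        self.sentence_counts[label] += 1
        self.vocabulary.update(tokens)

    def _log_score(self, tokens, label):
        # Log prior and log likelihoods, both with add-one smoothing.
        total_sentences = sum(self.sentence_counts.values())
        score = math.log(
            (self.sentence_counts[label] + 1) / (total_sentences + len(self.LABELS))
        )
        total_words = sum(self.word_counts[label].values())
        vocabulary_size = len(self.vocabulary) or 1
        for token in tokens:
            score += math.log(
                (self.word_counts[label][token] + 1) / (total_words + vocabulary_size)
            )
        return score

    def predict(self, sentence):
        """Return the label with the higher smoothed log score."""
        tokens = tokenize(sentence)
        return max(self.LABELS, key=lambda label: self._log_score(tokens, label))


# One model per user: this user tracks the company, not the fruit.
classifier = UserRelevanceClassifier()
classifier.add_feedback("Apple released a new iPhone today", "relevant")
classifier.add_feedback("Apple shares rose after the earnings call", "relevant")
classifier.add_feedback("I ate an apple and a banana for lunch", "irrelevant")
classifier.add_feedback("This apple pie recipe uses fresh fruit", "irrelevant")

company_label = classifier.predict("Apple announced a new iPhone")
fruit_label = classifier.predict("fresh apple pie for lunch")
print(company_label, fruit_label)
```

Keeping a separate model per user is what makes the classifier personalized: the same sentence about apple pie could be relevant for a food blogger and irrelevant for someone tracking the company.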


Artificial intelligence pt. 1

Yesterday we created an artificial intelligence section at the company where I work (Prelovac media), so this is a great reason to write about AI. I realized that many people are not aware of the current state of the field, where we are heading, or what the applications of artificial intelligence are at the moment. So let’s start with the basics.

AI definitions

Artificial intelligence (AI) is the intelligence of machines or software, and it is also the branch of computer science that studies and develops intelligent machines and software. Major AI researchers and textbooks define the field as “the study and design of intelligent agents”, where an intelligent agent is a system that perceives its environment and takes actions that maximize its chances of success.
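
To make the “intelligent agent” definition a bit more concrete, here is a minimal sketch of the perceive-then-act loop. Everything in it is an assumption made for illustration: the thermostat world, the names `choose_action`, `thermostat_utility`, and `run_agent`, and the idea of scoring “success” as closeness to a target temperature.

```python
# Hypothetical thermostat world: the percept is the room temperature in Celsius.
TARGET_TEMPERATURE = 21.0
ACTION_EFFECTS = {"heat": 1.0, "cool": -1.0, "idle": 0.0}


def choose_action(percept, actions, utility):
    """Return the action with the highest estimated utility for this percept."""
    return max(actions, key=lambda action: utility(percept, action))


def thermostat_utility(temperature, action):
    """Score an action by how close the predicted temperature is to the target."""
    predicted = temperature + ACTION_EFFECTS[action]
    return -abs(predicted - TARGET_TEMPERATURE)


def run_agent(temperature, steps):
    """Perceive the temperature, act, and update the environment for some steps."""
    history = []
    for _ in range(steps):
        action = choose_action(temperature, list(ACTION_EFFECTS), thermostat_utility)
        temperature += ACTION_EFFECTS[action]
        history.append((action, temperature))
    return history


history = run_agent(18.0, 4)
print(history)
```

The agent here is deliberately trivial. The point is only the shape of the loop: a percept comes in, candidate actions are scored against a measure of success, and the best-scoring action is taken.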

In artificial