Even if you’re working with 100% machine-created data, you’re more than likely performing some amount of manual inspection, both of your data at different points in the analysis process and of the output of your machine learning models.
Many companies, including Google, GoDaddy, Yahoo!, and LinkedIn, use what’s known as HITL, or Human-In-The-Loop, to improve the accuracy of everything from maps and matching business listings to ranking top search results and referring relevant job postings.
Why are we still at this point? Because many times humans are better at labeling content than machines. However, when we combine human knowledge with machine learning, we can create truly robust data flows. With that being said, what’s the best way to go about it?
Active learning, often described as a special case of semi-supervised machine learning, is where “a computer program’s learning algorithm knows that it can periodically and interactively ask questions of a user (or user group) to gather desired outputs at new data points.”
A real-world example of active learning is Pinterest’s use of crowdsourcing services such as CrowdFlower and Amazon Mechanical Turk to evaluate the relevance of search results and help filter out potentially inappropriate or explicit content.
In these scenarios, a learning algorithm selects unlabeled data, sends it to a human for manual labeling, and then feeds those answers back into itself as labeled training data. Ultimately what you have is a “smarter” algorithm that leverages our innate ability as humans to instantly categorize things.
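As a rough illustration, one round of that select, label, and feed-back loop could be sketched in Python as follows. Everything here is an assumption made for the example: the function names are hypothetical, and the selection rule (picking the items whose predicted probability is closest to 0.5, commonly called uncertainty sampling) is just one common choice, not something the systems mentioned above are known to use.

```python
from typing import Callable, List, Sequence, Tuple


def select_most_uncertain(probabilities: Sequence[float], batch_size: int) -> List[int]:
    """Return indices of the items whose predicted probability is closest to 0.5."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]


def active_learning_round(
    unlabeled: Sequence[str],
    predict_proba: Callable[[str], float],  # hypothetical model scoring function
    ask_human: Callable[[str], str],        # hypothetical call out to a human labeler
    batch_size: int,
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Run one round: pick uncertain items, have a person label them,
    and return (newly labeled pairs, items still unlabeled)."""
    probabilities = [predict_proba(item) for item in unlabeled]
    chosen = set(select_most_uncertain(probabilities, batch_size))
    newly_labeled = [(unlabeled[i], ask_human(unlabeled[i])) for i in sorted(chosen)]
    remaining = [item for i, item in enumerate(unlabeled) if i not in chosen]
    return newly_labeled, remaining
```

In a real pipeline the newly labeled pairs would be appended to the training set and the model retrained before the next round, so each round spends human attention where the model is least sure.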
In a previous position I created a predictive model which determined, based on the given inputs, whether or not two records were similar, and therefore if a record was considered “verified” or not. The inputs to the model were a number of scores produced by running string similarity algorithms.
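To make the idea of similarity scores as model inputs concrete, here is a minimal sketch. It does not reproduce the algorithms used in that project, which are not named here; it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in, and the field names and the `score_record` helper are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, Iterable


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two strings, ignoring case
    and surrounding whitespace. A missing value on either side scores 0.0."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def score_record(submitted: Dict[str, str],
                 reference: Dict[str, str],
                 fields: Iterable[str]) -> Dict[str, float]:
    """Compare a submitted record to a reference record field by field.
    The resulting scores are the kind of features a model could consume."""
    return {
        field: similarity(submitted.get(field, ""), reference.get(field, ""))
        for field in fields
    }


# Hypothetical example: a web-form entry versus a third-party record.
scores = score_record(
    {"name": "Jon Smith", "ssn": "123-45-6789"},
    {"name": "John Smith", "ssn": "123-45-6798"},
    ["name", "ssn"],
)
```

A dropped letter or a pair of transposed digits leaves both scores high but below 1.0, which is exactly the kind of near match that is hard to classify automatically.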
When we developed our first scoring method, we determined through visual inspection of thousands of scores that there was a range of scores we were willing to automatically accept as verified and a range we would accept as not-verified. However, we had a third range – that which was so close that we didn’t trust the machine to mark it as verified or not. Where did this middle ground come from? Human error.
Because we were comparing what someone put into a web form to what a third-party service was providing, and humans tend to mistype or misspell things as well as transpose numbers, there was a range of scores where a record was close to being verified, but we wanted a person to take a look at it.
Due to this middle ground we implemented a HITL process whereby anything that was too close to call was sent to a human for processing before final action was taken. In this way, a person could look at two names or social security numbers, realize that the person entering the data made a mistake, and still mark a record as verified.
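The three score ranges described above map naturally onto a small routing function. The sketch below is illustrative only: the threshold values are made-up placeholders (the real cutoffs came from inspecting thousands of scores), and the `Decision` and `route` names are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    VERIFIED = "verified"
    NOT_VERIFIED = "not_verified"
    NEEDS_REVIEW = "needs_review"  # too close to call: send to a person


def route(score: float,
          reject_below: float = 0.60,        # placeholder threshold
          accept_at_or_above: float = 0.90   # placeholder threshold
          ) -> Decision:
    """Accept or reject automatically at the extremes; queue the middle
    range for human review before any final action is taken."""
    if not 0.0 <= reject_below < accept_at_or_above <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= reject < accept <= 1")
    if score >= accept_at_or_above:
        return Decision.VERIFIED
    if score < reject_below:
        return Decision.NOT_VERIFIED
    return Decision.NEEDS_REVIEW
```

Widening the gap between the two thresholds sends more records to people and fewer to automatic decisions, so the thresholds act as a dial between reviewer workload and tolerance for machine error.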
My research bears out three key points:
Artificial intelligence as seen in the movies is quite a way off.
By far, adding a HITL process is the most popular method of augmenting machine intelligence with human intelligence.
For the time being (measured in months and possibly years), humans produce the best training data.
This brings me to a point I’ve heard over and over again in data science circles: algorithms are not, and should not be, positioned to take the place of human intelligence. Rather, we should be focusing on augmenting our own intelligence with that provided by a machine. While computers can process more data than we could ever hope to, and provide us with an answer, we should use that answer as a guide for our own decisions, especially critical ones, rather than relying on the machine to select the answer for us.