Recruitment is one of the most important parts of any company’s activities and getting the right person for the job can often be essential for the success of the company. Still, there is a lot of uncertainty surrounding this process, and the hiring decisions are often not very rational but based on the preconceptions and bias of the recruiter or hiring manager.
Reducing this bias is of course important not only from the company’s perspective; it is also a matter of justice for the individuals who are affected by the bias and who might not get a job they were in fact well suited for. However, addressing bias is very challenging, partly because our understanding of bias is limited, and partly because the cues we base our decisions on can be very subtle.
When we talk about bias, most people might think about gender and ethnicity, but there are also other less obvious factors, such as height, weight, and age. And even if we talk about and try to become conscious of potential bias, it is very hard to control the unconscious processes that affect our behaviour. These processes might affect not only the assessment of the candidate, but also how the job interview is conducted, which in turn affects the candidate’s behaviour, and thereby the recruiter’s or hiring manager’s perception of them.
Together with TNG, Furhat is developing the world’s first social robot specifically designed to reduce bias in the recruitment process, called Tengai Unbiased. Tengai will assist recruiters and conduct interviews with candidates to assess their skills and competencies, given the requirements and profile of the job.
The robot will conduct the interview in a way that is very similar to how a human recruiter would do it, using competency-based questions, such as “tell me about how you handled a conflict at your last job”. The robot will give feedback (such as nodding, smiling and saying “mhm”), in order to encourage the candidate to give elaborate answers. If the answer is too vague, the robot might, for example, ask the candidate to give more concrete examples. After the interview, the robot will summarise the interview and provide objective recommendations to support a human in making the decision about the candidate.
The conversational behaviour that we will develop for Tengai is much more open-ended and challenging than, for example, the dialog you have with your smartphone or smart speaker, where you typically give the system short commands or queries. To accomplish this, we will have to push the state of the art in conversational systems. Tengai will therefore be developed using machine learning: we will collect data from interviews conducted by humans, where potential bias is reduced as much as possible, and then train the system to replicate the human recruiter’s conversational behaviour.
For the data collection, we are using a setup typically referred to as “Wizard-of-Oz”, named after the 1939 film, in which the mysterious Wizard turns out to be an illusion operated by a man hidden behind a curtain, speaking into a microphone. And this is how we are teaching Tengai in the beginning: the robot is initially controlled by a human recruiter sitting in another room who triggers its behaviour, which has allowed us to record data from human-robot interviews. This data will now be used to develop two different models: one that can replicate the Wizard’s behaviour during the interview, and one that can assess the interviews afterwards.
Keeping Tengai unbiased
So, the question is then: How could such a robot help to reduce bias in the recruitment process? Wouldn’t the robot just replicate the human bias that exists in the data that we train the models on? There are several answers to this.
If we start with the automation of the interview, the robot will in itself have a much more consistent behaviour towards the candidate, since we can easily make sure that it will not, for example, use a certain tone of voice or look skeptical towards certain candidates (something that is impossible to enforce on a human recruiter). Having said that, the Wizard recruiter conducting the interviews could potentially be affected by factors such as age, gender, and ethnicity in the way they press the buttons to control the robot. However, we can still make sure that these factors will not be available to the automated system that we train on this data, which means that they cannot affect the behaviour of the robot in the end.
As a (hypothetical) example, let’s say a recruiter would consistently interrupt female candidates more than male candidates. If the gender of the candidate were not available to the robot, it would simply “fail” to reproduce this behaviour (i.e., it would just learn that it should interrupt sometimes and sometimes not, but the decision would not be based on the candidate’s gender). This argument is valid at least as long as we are talking about direct access to these factors. But it is important to also be aware of the fact that the system could have indirect access to such factors.
To continue with the same example, if the system has access to the pitch of the voice, it could potentially associate high pitch (which is more common for females) with more interruptions. Although this relationship is much more indirect, and we, therefore, would have reduced gender bias, it is still undesirable and it is very important that we understand why the system makes certain decisions.
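To make the pitch example above concrete, here is a minimal sketch of how such an indirect link could be checked. It is not how Tengai is implemented: the data is invented, and the names `TURNS`, `pearson` and `proxy_audit` are hypothetical. The idea is that the gender label is withheld from the model but kept for auditing, so that we can measure how strongly an available feature (pitch) tracks both the withheld factor and the behaviour we are worried about.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data, invented for illustration only:
# (mean pitch in Hz, interrupted by the Wizard? 1/0, gender label kept ONLY for auditing)
TURNS = [
    (210.0, 1, "f"), (225.0, 1, "f"), (198.0, 0, "f"), (232.0, 1, "f"),
    (115.0, 0, "m"), (128.0, 0, "m"), (140.0, 1, "m"), (122.0, 0, "m"),
]


def proxy_audit(turns):
    """Report how strongly pitch tracks gender and interruptions."""
    pitch = [turn[0] for turn in turns]
    interrupted = [turn[1] for turn in turns]
    is_female = [1 if turn[2] == "f" else 0 for turn in turns]
    return {
        "pitch_vs_gender": pearson(pitch, is_female),
        "pitch_vs_interruption": pearson(pitch, interrupted),
    }


report = proxy_audit(TURNS)
```

If both correlations are high, as they are in this toy data, pitch is acting as a stand-in for gender, and a model given pitch as input could pick up the recruiter’s interruption pattern even though it never sees the gender label.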
Thankfully, there are a number of ways in which we can investigate the models that we have trained, in order to understand which factors are used for which decisions, to verify that they are sound and objective. This so-called Explainable AI has recently gained a lot of interest from the research community, and it is indeed a challenging task. However, to analyse and understand the potential bias of an AI is arguably more feasible than opening up the brain and analysing the bias of a human recruiter or a hiring manager.
Automating candidate assessment
When it comes to automating the assessment of the candidates, it is again important that the data that we train our models on contain as little bias as possible, and that the assessment is done using a verified competence framework from a third party. Additionally, we will use several experienced recruiters who are trained in discrimination law, have worked with an unbiased recruitment process for many years, and are well versed in the different dimensions of unconscious bias. Several recruiters will assess the same recorded interviews independently of each other, which means that we will be able to train our models on their aggregated judgment.
In this process, we will make sure that the people doing these assessments are not the same recruiters who conducted the interviews. Also, they will only have access to the raw transcript, and not to the audio and the video of the interview, which means that they will not be able to base their decisions on some of the most obvious potential factors for bias, such as gender, ethnicity, looks, and age.
But even if we reduce human bias as much as possible in the process, there is also a potential risk of introducing so-called algorithmic bias. For example, off-the-shelf face detection software has been reported to perform worse for darker skin colours, and the speech recognizer (which translates speech into words) might perform worse for speakers with a foreign accent or of a specific gender. This could potentially affect the outcome of the interview.
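As an illustration of what training on an “aggregated judgment” could look like, the sketch below combines independent scores from several assessors into one label per competency. The 1 to 5 scale, the competency names, the use of the median and the `max_spread` threshold are all assumptions made for this example, not a description of the actual framework.

```python
from statistics import median


def aggregate(ratings, max_spread=2):
    """Combine independent assessor scores into one label per competency.

    ratings: dict mapping a competency name to a list of scores,
             one score per assessor (hypothetical 1-5 scale).
    max_spread: largest allowed gap between highest and lowest score
                before the item is flagged for a second look.
    """
    summary = {}
    for competency, scores in ratings.items():
        spread = max(scores) - min(scores)
        summary[competency] = {
            "score": median(scores),  # robust to a single outlying assessor
            "needs_review": spread > max_spread,
        }
    return summary


# Hypothetical scores from three assessors who only read the transcript.
example = {"conflict_handling": [4, 4, 3], "teamwork": [2, 5, 3]}
result = aggregate(example)
```

Using the median rather than the mean means that one assessor who deviates strongly cannot pull the label very far, and the review flag makes large disagreements visible instead of silently averaging them away.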
To mitigate this, we will perform thorough analyses of how these components perform on the data that we have recorded, to see if certain groups are affected. One should be careful here, though, to not throw out the baby with the bathwater, since it is not certain that a slightly worse performance at an early stage of processing will affect the final outcome, in terms of robot behaviour and analysis of the interviews. In many cases, there might be ways of compensating for these shortcomings.
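The kind of per-group analysis described above could, for the speech recognizer, be sketched as follows. This is a simplified example under stated assumptions: the group labels, the sentences and the function names are hypothetical, and word error rate (word-level edit distance divided by the number of reference words) is used as one common way of measuring recognition quality.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def wer_by_group(samples):
    """Word error rate per group.

    samples: iterable of (group, reference transcript, recognizer output).
    """
    errors = {}
    words = {}
    for group, reference, hypothesis in samples:
        ref_words = reference.split()
        hyp_words = hypothesis.split()
        errors[group] = errors.get(group, 0) + edit_distance(ref_words, hyp_words)
        words[group] = words.get(group, 0) + len(ref_words)
    # Groups without any reference words are skipped to avoid dividing by zero.
    return {group: errors[group] / words[group] for group in errors if words[group] > 0}


# Hypothetical samples, invented for illustration only.
SAMPLES = [
    ("native", "tell me about your last job", "tell me about your last job"),
    ("accented", "tell me about your last job", "tell me about you last shop"),
]
rates = wer_by_group(SAMPLES)
```

A clear gap between groups would be a signal to look closer, but as noted above, it does not by itself show that the final assessment is affected; that has to be checked further down the pipeline.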
To sum up, there is no single magic bullet against bias, and to completely remove bias from the recruitment process, or in other aspects of our lives, might never be possible.
But we do think that a robot recruiter might add another level of transparency and consistency to the process. Unlike with a human recruiter, we can control the robot’s behaviour in detail, down to the micromovements in the corner of the mouth. We also have much better tools for analysing and understanding the rationale for its decisions, which in turn allows us to avoid repeating known biases that are common in recruitment today. We therefore think that the development of Tengai is one step towards the understanding and reduction of bias in the recruitment process.
For more information about Furhat Robotics please visit www.furhatrobotics.com.
For more information about TNG, please visit www.tng.se.
Gabriel Skantze is Co-founder and Chief Scientist at Furhat Robotics.
Gabriel has played a key role in the development of the Furhat platform. Gabriel is also an Associate Professor in Speech Communication and Technology at the Royal Technical University (KTH) in Stockholm. He is also a faculty member of the SRA ICT-The Next Generation platform at KTH.
For any questions, please contact Gabriel on LinkedIn.