Here at the Search Party we have invested in truly speaking the language of hiring. Not by the water cooler, but through machine learning: we study how terms relate to one another and build neural networks that can interpret the text of a job description and return key information such as salary, industry and job title.
One of the most interesting outcomes was realising how varied our interpretations of some job titles are, and how different the results become when we take that preconception out of the equation by removing the job title from the text used to train the models.
The Data Science Behind it
In simple terms, a neural network is a series of layers of nodes, each performing computations, that is used for machine learning. While traditional programming generally requires patterns or outcomes to be known in advance, a neural network can work out for itself which features of the data matter and improve as it is trained.
This is where it gets interesting. Once a model is created based on known data or a feedback loop of sorts, the neural network will reveal recurring themes in the data in the form of filters learned by the nodes in its layers. These often correspond to determining features in the data, and they allow the model to interpret roles well beyond the training set that was supplied.
The result is that the model understands what we are trying to say, sometimes better than we do.
We maintain and continue to refine two models: one trained on raw job descriptions including job titles, and a separate one trained with the job title removed (either replaced with the word “person” or dropped by omitting the first few lines).
These models produce similar results, but with interesting differences. The table below shows that, given the obvious hint of the job title, the neural net comes to rely on it too much. In recent times, infrastructure engineer roles have tended more towards devops, and this becomes clear when we use only the duties of a role to train the model.
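As a rough illustration of the two title-removal approaches mentioned above, here is a minimal Python sketch. The function names, the sample ad and the choice of three leading lines are all assumptions made for the example; this is not the preprocessing code we actually run.

```python
import re


def mask_job_title(description: str, title: str, placeholder: str = "person") -> str:
    """Replace every case-insensitive occurrence of `title` with `placeholder`."""
    pattern = re.compile(r"\b" + re.escape(title) + r"\b", flags=re.IGNORECASE)
    return pattern.sub(placeholder, description)


def drop_leading_lines(description: str, n_lines: int = 3) -> str:
    """Omit the first `n_lines` lines, where the title usually appears.

    The default of 3 is an assumed stand-in for "the first few lines".
    """
    return "\n".join(description.splitlines()[n_lines:])


# Hypothetical job ad, invented purely for this example.
ad = (
    "Infrastructure Engineer\n"
    "Sydney CBD\n"
    "Full time\n"
    "As our infrastructure engineer you will automate Linux deployments."
)

masked = mask_job_title(ad, "infrastructure engineer")
trimmed = drop_leading_lines(ad)

print(masked)
print(trimmed)
```

Masking keeps the rest of the ad intact but relies on knowing the title in advance, whereas dropping the leading lines needs no title at all but can miss mentions further down the text.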
|Rank||Original job description||Without “infrastructure engineer”|
|1||infrastructure engineer||linux engineer|
|2||linux engineer||devops engineer|
|3||systems engineer||linux systems engineer|
|4||windows engineer||systems engineer|
|5||devops engineer||systems administrator|
|7||linux infrastructure engineer||senior devops engineer|
|8||automation engineer devops||linux systems administrator|
|9||devops infrastructure engineer||devops consultant|
|10||linux systems engineer||build automation engineer|
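To make the shift between the two columns easier to see, the short Python sketch below compares the rankings in the table. The ranks and titles are copied from the table as published (rank 6 does not appear there, so it is left out); the variable and function names are our own choices for the example.

```python
# Ranked predictions copied from the table above (rank 6 is absent in the source).
with_title = {
    1: "infrastructure engineer",
    2: "linux engineer",
    3: "systems engineer",
    4: "windows engineer",
    5: "devops engineer",
    7: "linux infrastructure engineer",
    8: "automation engineer devops",
    9: "devops infrastructure engineer",
    10: "linux systems engineer",
}
without_title = {
    1: "linux engineer",
    2: "devops engineer",
    3: "linux systems engineer",
    4: "systems engineer",
    5: "systems administrator",
    7: "senior devops engineer",
    8: "linux systems administrator",
    9: "devops consultant",
    10: "build automation engineer",
}


def rank_shifts(before: dict, after: dict) -> dict:
    """For titles present in both rankings, return how many places each moved up.

    A positive number means the title ranks higher once the job title is removed.
    """
    before_rank = {title: rank for rank, title in before.items()}
    after_rank = {title: rank for rank, title in after.items()}
    shared = before_rank.keys() & after_rank.keys()
    return {title: before_rank[title] - after_rank[title] for title in shared}


shifts = rank_shifts(with_title, without_title)
for title, shift in sorted(shifts.items(), key=lambda item: -item[1]):
    print(f"{title}: {shift:+d}")
```

Only four titles appear in both columns, and three of them move up once the title is removed, with “devops engineer” and “linux systems engineer” rising the furthest.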
For a full discussion of the method behind this, see here.
What this tells us about the average job ad
In many cases we have seen that the advertised job title is less reliable and less nuanced than the result the full description of the job can give us. While we haven’t yet taken regional differences into account, it is clear that the job title doesn’t always fit the role as well as we might think when placing an ad.
The outcome of having these two models is that one is great at cleaning job titles such as “Web developer – CBD Location – Awesome team asap”, while the other produces a much more meaningful interpretation of a job description and is much more consistent at identifying a job title when that title is not explicitly present. In human terms, one is more useful as a utility whereas the other is more useful for interpretation.
What’s the use?
The practical applications for this are wide-ranging. Standardising a candidate’s job titles to work out salary ranges, and even the ideal next move in their career, can cut through the company preference or fashionability of a job advertisement. We have some obvious uses for this within our employer experience, but the possibility of offering it as a standalone tool has not escaped us.
If you think this could be something you would use in your recruitment business or HR team, please take the time to contact us and sign up for our newsletter (in the footer). If there is enough interest in standalone machine-learning-driven APIs, we will certainly look at making it available with regional influence taken into account.
For full details of the prediction model journey, see our detailed series on job title prediction.