RWE data labelling and correction | TP Group Global

Background and requirement

Human-entered data will inevitably contain some errors, particularly when large amounts are entered under time pressure. This was the case for an Army dataset comprising orders for replacement parts for rotary-wing aircraft.

TP Group were tasked with providing assurance that this data was correctly labelled, and with implementing corrections where needed. The dataset consisted of hundreds of rows, with text and categorical columns that all had to be taken into account when determining whether a row had been correctly categorised.

Approach

A subsection of the data was manually labelled and then used for training, validation, and testing.
Under-sampling was used to balance the classes in the data.
TF-IDF (Term Frequency - Inverse Document Frequency) was applied to the text columns to provide features for training the model.
A TensorFlow neural network classifier was used as the model.
Prediction confidence was used to label the data to an acceptable standard of accuracy.
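
The under-sampling and TF-IDF steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names `undersample` and `tfidf`, the example part descriptions and the category labels are all hypothetical, and the weighting shown (term frequency multiplied by log(N / document frequency)) is just one common TF-IDF variant.

```python
import math
import random
from collections import Counter, defaultdict


def undersample(rows, labels, seed=0):
    """Randomly drop rows so every class keeps as many rows as the smallest class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_min = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return [r for r, _ in balanced], [l for _, l in balanced]


def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical example data: free-text order descriptions and their categories.
texts = [
    "rotor blade replacement",
    "rotor hub bearing",
    "tail rotor blade",
    "hydraulic pump seal",
    "cockpit display unit",
]
labels = ["rotor", "rotor", "rotor", "hydraulics", "avionics"]

balanced_texts, balanced_labels = undersample(texts, labels)
features = tfidf(balanced_texts)
```

In this sketch the over-represented "rotor" class is cut down to one row to match the other classes, and terms that appear in every document receive a weight of zero. The resulting weights would then be turned into fixed-length vectors before being passed to a classifier.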

Outcome

The final model could label data with 83% accuracy. Used this way, the model provided an extra check for the human user, helping them to label more confidently in borderline cases. By accepting only those predictions whose confidence exceeded a threshold, we could achieve 99% accuracy on the predictions kept. This saved the hundreds of hours it would have taken a human to label those rows manually.
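
The effect of a confidence threshold can be illustrated with a short sketch. It rests on assumptions: the helper `evaluate_at_threshold`, the probabilities, the labels and the 0.9 threshold are invented for illustration and are not the project's figures. It treats the highest class probability as the model's confidence, which is one common choice for a classifier that outputs per-class probabilities.

```python
def evaluate_at_threshold(probabilities, true_labels, threshold):
    """Return (accuracy on accepted rows, fraction of rows accepted).

    Each entry of `probabilities` is a list of per-class probabilities.
    A row is accepted when its highest probability is at least `threshold`.
    """
    accepted = 0
    correct = 0
    for probs, truth in zip(probabilities, true_labels):
        confidence = max(probs)
        predicted = probs.index(confidence)
        if confidence >= threshold:
            accepted += 1
            correct += predicted == truth
    accuracy = correct / accepted if accepted else None
    coverage = accepted / len(true_labels)
    return accuracy, coverage


# Hypothetical classifier outputs for five rows and three classes.
probabilities = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.92, 0.03],
    [0.10, 0.30, 0.60],
    [0.02, 0.01, 0.97],
]
true_labels = [0, 1, 1, 2, 2]

overall_accuracy, _ = evaluate_at_threshold(probabilities, true_labels, 0.0)
confident_accuracy, coverage = evaluate_at_threshold(probabilities, true_labels, 0.9)
```

Raising the threshold trades coverage for accuracy: fewer rows are labelled automatically, but those that are labelled are more likely to be correct. Rows that fall below the threshold can be left for a human to review.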

