Maintaining a reliable classification of knowledge assets in the corporate knowledge base of an oil and gas company is challenging because complex business and engineering processes generate a very large number and variety of knowledge assets. Submission is one of the main difficulties, as users may choose the wrong knowledge type or classification. A lengthy manual review is then required to evaluate each submission, and some submissions must be rejected and returned to the users. Misclassified knowledge assets also degrade knowledge search, because users cannot find the correct articles. A further challenge is duplication: some knowledge assets are submitted more than once because other users have already submitted the same assets.
Our research evaluates the potential of Natural Language Processing (NLP) applications, in the form of transformer-based neural language models, to address these challenges. The models are intended to provide the following functions:
Knowledge Categorization: Automatically categorizes knowledge items based on the given abstract. This reduces the likelihood of a wrong classification that can prevent the knowledge from being published, simplifies knowledge submission, and expedites the knowledge review process.
Text Summarization: Automatically generates a summary of a knowledge asset that can be used as a short description or abstract during knowledge submission.
Document Semantic Similarity: Identifies documents that are similar in context, which can support document search or the removal of duplicate documents.
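
As an illustration of the duplicate-detection idea behind the document semantic similarity function, the following is a minimal sketch and not the implementation evaluated in this research. It assumes each abstract is mapped to a vector and that pairs whose cosine similarity exceeds a threshold are flagged for review. To stay self-contained, the hypothetical `embed` function uses plain word counts as a stand-in for a transformer encoder; the asset identifiers, sample abstracts, and the 0.8 threshold are likewise assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def embed(text):
    """Stand-in embedding (assumption): lowercase word counts.

    A transformer sentence encoder would replace this function in practice.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def find_duplicates(abstracts, threshold=0.8):
    """Return (id_a, id_b, score) for every pair scoring at or above threshold.

    The threshold value is an illustrative assumption, not a tuned setting.
    """
    vectors = {doc_id: embed(text) for doc_id, text in abstracts.items()}
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(vectors.items(), 2):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs


# Hypothetical knowledge assets, invented for this example.
abstracts = {
    "KA-001": "Lessons learned from pump failure in offshore gas compression unit",
    "KA-002": "Lessons learned from a pump failure in the offshore gas compression unit",
    "KA-003": "Best practice for corrosion inspection of subsea pipelines",
}

duplicates = find_duplicates(abstracts)
for id_a, id_b, score in duplicates:
    print(f"{id_a} and {id_b} look like duplicates (similarity {score:.2f})")
```

Word counts only capture lexical overlap, so two submissions that describe the same asset in different words would be missed by this stand-in; closing that gap is the reason for comparing contextual embeddings from a language model instead.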