Client's motivation for launching the project: customer reviews usually reflect specific characteristics of a product. The client needed to automate the extraction of keywords that correspond to a specific characteristic. In addition, the task was to predict a product for a customer request based on previously left reviews.
What we had initially: MVideo had no functionality for processing reviews, so it had to be added as a new feature.
Project goals: select a "descriptive" characteristic for each review and build a graph based on this data.
MIL Team's solution: a semi-automatic method for extracting terms from the text of product reviews; construction of a knowledge graph, including matching terms to the given technical characteristics; and training of vector representations (embeddings) of graph elements to predict a product from a review.
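As a rough illustration of the knowledge graph described above, the sketch below stores the graph as (head, relation, tail) triples and follows review -> term -> characteristic edges. All entity and relation names (`review_1`, `mentions_term`, `matches_characteristic`, and so on) are hypothetical placeholders, not the project's actual schema or data.

```python
from collections import defaultdict

# Hypothetical triples (head, relation, tail); every name here is illustrative only.
triples = [
    ("review_1", "mentions_term", "battery life"),
    ("review_1", "about_product", "phone_A"),
    ("battery life", "matches_characteristic", "battery_capacity"),
    ("review_2", "mentions_term", "screen"),
    ("review_2", "about_product", "phone_B"),
    ("screen", "matches_characteristic", "display_size"),
]


def build_graph(triples):
    """Index triples as head -> relation -> list of tails."""
    graph = defaultdict(lambda: defaultdict(list))
    for head, relation, tail in triples:
        graph[head][relation].append(tail)
    return graph


def characteristics_for_review(graph, review):
    """Follow review -> term -> characteristic edges and collect the results."""
    found = []
    for term in graph[review]["mentions_term"]:
        found.extend(graph[term]["matches_characteristic"])
    return found


graph = build_graph(triples)
print(characteristics_for_review(graph, "review_1"))
```

Storing the graph as plain triples keeps it in the form that embedding methods such as TransE consume directly.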
Tools for building the model:
- Adaptive TextRank, guided by technical characteristics and a set of sentiment words, for extracting terms;
- SOTA BERT model for matching terms and specifications;
- TransE method for training vector representations of graph elements;
- ABAE method for highlighting "important" characteristics for products based on a set of reviews.
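To make the TransE item in the list above more concrete, here is a minimal pure-Python sketch of the idea: each entity and relation gets a vector, and a triple (head, relation, tail) is considered plausible when head + relation lands close to tail. This is a simplified illustration, not the project's implementation: it uses squared L2 distance, plain SGD with a margin ranking loss, corrupts only the tail, and skips the entity normalization used in the original method. The triples, the relation name `mentioned_for`, and all hyperparameters are assumptions made for the example.

```python
import random


def score(h, r, t):
    """TransE dissimilarity: squared L2 distance between h + r and t (lower is more plausible)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def train_transe(triples, dim=8, epochs=200, lr=0.05, margin=1.0, seed=0):
    """Train entity and relation vectors with a margin ranking loss and SGD."""
    rng = random.Random(seed)
    entities = sorted({e for h, _, t in triples for e in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    ent = {e: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for e in entities}
    rel = {r: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for r in relations}
    for _ in range(epochs):
        for h, r, t in triples:
            # Negative sample: replace the true tail with another entity.
            t_neg = rng.choice([e for e in entities if e not in (h, t)])
            loss = margin + score(ent[h], rel[r], ent[t]) - score(ent[h], rel[r], ent[t_neg])
            if loss <= 0:
                continue  # margin already satisfied, no update needed
            for i in range(dim):
                g_pos = 2 * (ent[h][i] + rel[r][i] - ent[t][i])
                g_neg = 2 * (ent[h][i] + rel[r][i] - ent[t_neg][i])
                ent[h][i] -= lr * (g_pos - g_neg)
                rel[r][i] -= lr * (g_pos - g_neg)
                ent[t][i] += lr * g_pos      # pull the true tail toward h + r
                ent[t_neg][i] -= lr * g_neg  # push the corrupted tail away
    return ent, rel


def predict_tail(ent, rel, head, relation, candidates):
    """Return the candidate tail with the lowest TransE dissimilarity."""
    return min(candidates, key=lambda c: score(ent[head], rel[relation], ent[c]))


# Hypothetical term -> product triples, for illustration only.
triples = [
    ("battery life", "mentioned_for", "phone_A"),
    ("screen", "mentioned_for", "phone_B"),
    ("noise", "mentioned_for", "vacuum_C"),
]
products = ["phone_A", "phone_B", "vacuum_C"]
ent, rel = train_transe(triples)
print(predict_tail(ent, rel, "screen", "mentioned_for", products))
```

Ranking candidate products by this score is one simple way such embeddings could be used to predict a product from the terms found in a review.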
Model results: sets of terms were obtained for various product categories, graphs were built, and a model for highlighting "important" characteristics was pretrained.
Technology stack: Python, TensorFlow