Client's motivation for the project: a number of documents containing tables had to be converted from images to electronic form.
Starting point:
- the main difficulties in table recognition are restoring the original table structure and handling invisible cell borders;
- classical CV methods perform poorly on tables with invisible borders.
Project goal: to create a module that converts a table image into a structured format.
MIL Team's solution: a segmentation model for table cells, followed by post-processing.
Tools for building the model:
- the PubTabNet dataset;
- Yandex.Toloka crowdsourcing.
Results: a model based on the U-Net architecture, plus an accompanying environment that converts a scanned table to HTML.
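To make the final conversion step concrete, here is a minimal sketch of how detected cells could be assembled into an HTML table. It assumes each cell is already available as a dictionary with a top-left corner (`x`, `y`) and recognized `text`; those field names, the `y_tolerance` parameter, and both helper functions are hypothetical, and merged cells (`rowspan`/`colspan`) are deliberately ignored.

```python
from html import escape


def group_into_rows(cells, y_tolerance=5):
    """Group cells whose top edges lie within y_tolerance pixels of the row's first cell."""
    rows = []
    for cell in sorted(cells, key=lambda c: (c["y"], c["x"])):
        if rows and abs(cell["y"] - rows[-1][0]["y"]) <= y_tolerance:
            rows[-1].append(cell)
        else:
            rows.append([cell])
    # Order the cells inside each row from left to right.
    return [sorted(row, key=lambda c: c["x"]) for row in rows]


def cells_to_html(cells, y_tolerance=5):
    """Render grouped cells as a plain HTML table, escaping the cell text."""
    lines = ["<table>"]
    for row in group_into_rows(cells, y_tolerance):
        tds = "".join(f"<td>{escape(c['text'])}</td>" for c in row)
        lines.append(f"  <tr>{tds}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Illustrative input: a 2x2 table with slightly misaligned top edges.
sample_cells = [
    {"x": 2, "y": 2, "text": "Name"},
    {"x": 32, "y": 3, "text": "Value"},
    {"x": 2, "y": 22, "text": "a<b"},
    {"x": 32, "y": 21, "text": "1"},
]

html_table = cells_to_html(sample_cells)
print(html_table)
```

The tolerance on the vertical coordinate is a simple way to absorb small misalignments between cells of the same row; a real table with merged or multi-line cells would need a more careful grid reconstruction.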
Client: ISP RAS
Technology stack: Python, PyTorch, OpenCV