MIL
  • COMPRESSA:
    DL Models Compression Platform

    Production-ready solution for infrastructure optimization in weeks instead of months

    Wide variety of supported model architectures

  • ML infrastructure cost reduction by decreasing the computational complexity of DL models
  • On-device ML deployment by compressing DL models to fit device limitations
BUSINESS CASES
1. Lower RAM, energy, and CPU/GPU consumption during model inference
2. Transfer of high-quality, complex models onto devices
3. Model inference on low-bit CPUs
4. Speeding up calculations
SOLVING METHODS
  • Post-training and Low-Bit Quantisation (AdaRound, GDRQ, LSQ, our own modification of LSQ, APoT, symmetric, asymmetric, etc.)
  • Pruning and Knowledge Distillation (HRank, CUP, cluster-based, magnitude pruning)
  • Device Placement
  • DL Optimization Solvers
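To make the quantisation bullet concrete, here is a minimal sketch of symmetric post-training quantisation of a weight tensor with NumPy. This is an illustrative toy, not COMPRESSA's implementation; the function names and the 8-bit default are assumptions.

```python
import numpy as np

def symmetric_quantize(weights, n_bits=8):
    # Symmetric scheme: zero-point fixed at 0, scale chosen from max |w|.
    qmax = 2 ** (n_bits - 1) - 1              # 127 for 8-bit
    scale = np.abs(weights).max() / qmax
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct an approximation of the original float weights.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = symmetric_quantize(w)
w_hat = dequantize(q, scale)
# Rounding error per weight is bounded by half the quantisation step.
err = np.abs(w - w_hat).max()
```

Storing `q` (int8) plus one float scale per tensor cuts memory roughly 4x versus float32, which is the kind of RAM/energy saving listed under BUSINESS CASES above.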
EXPERIENCE & EXAMPLES