Contrasting Multiple Representations with the Multi-Marginal Matching Gap
Learning meaningful representations of complex objects that can be seen through multiple (k≥3kgeq 3k≥3) views or modalities is a core task in machine learning. Existing methods use losses originally intended for paired views, and extend them to kkk views, either by instantiating 12k(k−1)tfrac12k(k-1)21k(k−1) loss-pairs, or by using reduced embeddings, following a one vs. average-of-resttextit{one vs. average-of-rest}one vs. average-of-rest strategy. …
Read more “Contrasting Multiple Representations with the Multi-Marginal Matching Gap”