Tag: multimodal

  • Interpretable Features

    A team at 𝐀𝐧𝐭𝐡𝐫𝐨𝐩𝐢𝐜, creator of the Claude models, published a paper about extracting 𝐢𝐧𝐭𝐞𝐫𝐩𝐫𝐞𝐭𝐚𝐛𝐥𝐞 𝐟𝐞𝐚𝐭𝐮𝐫𝐞𝐬 from Claude 3 Sonnet. This is achieved by training a sparse autoencoder on the activations of a layer halfway through the model. An autoencoder is a neural network that learns to encode input data, here the activations of a middle layer of Claude, into an internal representation from which the input can be reconstructed; making that representation sparse, so that only a few features are active at once, is what makes the individual features interpretable.
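
    A minimal PyTorch sketch of the idea, assuming a feature dimension larger than the model dimension and an L1 penalty to encourage sparsity; all names, sizes, and coefficients here are illustrative, not Anthropic's actual setup:

    ```python
    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        def __init__(self, d_model: int, d_features: int):
            super().__init__()
            self.encoder = nn.Linear(d_model, d_features)  # activations -> features
            self.decoder = nn.Linear(d_features, d_model)  # features -> reconstruction

        def forward(self, acts: torch.Tensor):
            features = torch.relu(self.encoder(acts))  # non-negative, encouraged to be sparse
            recon = self.decoder(features)
            return recon, features

    # Training objective: reconstruct the layer's activations while
    # keeping only a few features active per input.
    sae = SparseAutoencoder(d_model=4096, d_features=32768)  # hypothetical sizes
    opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
    acts = torch.randn(64, 4096)  # stand-in for captured middle-layer activations
    recon, features = sae(acts)
    l1_coeff = 1e-3  # illustrative sparsity weight
    loss = ((recon - acts) ** 2).mean() + l1_coeff * features.abs().mean()
    loss.backward()
    opt.step()
    ```

    After training, each learned feature can be inspected by looking at which inputs activate it most strongly, which is where the interpretability comes from.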

  • Chameleon, a mixed-modal early-fusion foundation model

    In a new paper, Meta announces 𝐂𝐡𝐚𝐦𝐞𝐥𝐞𝐨𝐧, a 𝐦𝐢𝐱𝐞𝐝-𝐦𝐨𝐝𝐚𝐥 𝐞𝐚𝐫𝐥𝐲-𝐟𝐮𝐬𝐢𝐨𝐧 foundation model. Unlike earlier multimodal models, which model the different modalities (text, image, audio, etc.) separately, mixed-modal early-fusion foundation models like Chameleon are end-to-end models: they ingest all modalities from the start and project them into one shared representational space. That permits integrating information across modalities within a single model.
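
    A minimal PyTorch sketch of the early-fusion idea, assuming images have already been quantized into discrete codes by a separate image tokenizer; the sizes, names, and the simplified bidirectional backbone here are placeholders, not Chameleon's actual architecture:

    ```python
    import torch
    import torch.nn as nn

    TEXT_VOCAB, IMAGE_VOCAB, D_MODEL = 32000, 8192, 512  # hypothetical sizes

    class EarlyFusionLM(nn.Module):
        def __init__(self):
            super().__init__()
            # One embedding table spans both modalities: image codes are
            # offset into the same vocabulary as text tokens.
            self.embed = nn.Embedding(TEXT_VOCAB + IMAGE_VOCAB, D_MODEL)
            layer = nn.TransformerEncoderLayer(D_MODEL, nhead=8, batch_first=True)
            self.backbone = nn.TransformerEncoder(layer, num_layers=4)
            self.lm_head = nn.Linear(D_MODEL, TEXT_VOCAB + IMAGE_VOCAB)

        def forward(self, tokens: torch.Tensor):
            # One transformer attends over the mixed sequence; a real
            # model would be autoregressive with causal masking.
            return self.lm_head(self.backbone(self.embed(tokens)))

    text_tokens = torch.randint(0, TEXT_VOCAB, (1, 16))
    image_tokens = torch.randint(0, IMAGE_VOCAB, (1, 32)) + TEXT_VOCAB  # shared vocab offset
    sequence = torch.cat([text_tokens, image_tokens], dim=1)  # one interleaved sequence
    logits = EarlyFusionLM()(sequence)
    ```

    Because both modalities live in one token sequence and one representational space, attention can relate an image patch to a word directly, rather than fusing separate per-modality encoders late in the pipeline.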