Author: hans

  • Interpretable Features

    A team at Anthropic, creator of the Claude models, published a paper about extracting interpretable features from Claude 3 Sonnet. This is achieved by placing a sparse autoencoder halfway through the model and then training it. An autoencoder is a neural network that learns to encode its input data, here the activations of a middle layer of Claude, into…
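
    As a rough illustration of the technique (not Anthropic's actual code), the sketch below trains a sparse autoencoder on stand-in activations; the layer width, dictionary size, and L1 penalty are assumed values chosen only for the example.

    ```python
    # Minimal sparse-autoencoder sketch: an overcomplete, sparsity-penalized
    # reconstruction of activations captured at a middle layer of a language model.
    # All sizes and hyperparameters below are illustrative assumptions.
    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        def __init__(self, d_model: int, d_features: int):
            super().__init__()
            self.encoder = nn.Linear(d_model, d_features)  # project activations onto many candidate features
            self.decoder = nn.Linear(d_features, d_model)  # reconstruct the original activations

        def forward(self, acts: torch.Tensor):
            features = torch.relu(self.encoder(acts))      # non-negative codes, encouraged to be sparse
            return features, self.decoder(features)

    d_model, d_features = 512, 4096                        # assumed layer width and dictionary size
    sae = SparseAutoencoder(d_model, d_features)
    optimizer = torch.optim.Adam(sae.parameters(), lr=1e-4)

    activations = torch.randn(64, d_model)                 # stand-in for captured middle-layer activations
    features, reconstruction = sae(activations)
    loss = ((reconstruction - activations) ** 2).mean() + 1e-3 * features.abs().mean()  # MSE + L1 sparsity
    loss.backward()
    optimizer.step()
    ```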

  • Chameleon, a mixed-modal early-fusion foundation model

    In a new paper, Meta announces Chameleon, a mixed-modal early-fusion foundation model. Unlike earlier multimodal models, which model the different modalities (text, image, audio, etc.) separately, mixed-modal early-fusion foundation models like Chameleon are end-to-end models. They ingest all modalities from the start and project them into one representational space. That permits integrating information across…
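
    A toy sketch of the early-fusion idea (not Meta's implementation): both modalities become tokens in one shared embedding space and pass through a single transformer from the first layer on. The vocabulary sizes and the random image-token IDs are placeholders for a real text tokenizer and image quantizer.

    ```python
    # Toy early-fusion sketch: text tokens and (quantized) image tokens share one
    # embedding table and one backbone. All sizes are illustrative assumptions.
    import torch
    import torch.nn as nn

    text_vocab, image_codebook, d_model = 1000, 512, 256
    embed = nn.Embedding(text_vocab + image_codebook, d_model)                   # one table for both modalities
    layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
    backbone = nn.TransformerEncoder(layer, num_layers=2)                        # stand-in for the shared model

    text_ids = torch.randint(0, text_vocab, (1, 12))                             # tokenized text prompt
    image_ids = torch.randint(text_vocab, text_vocab + image_codebook, (1, 64))  # stand-in for a quantized image

    # Early fusion: one interleaved sequence, so attention can integrate information
    # across text and image positions from the very first layer.
    sequence = torch.cat([text_ids, image_ids], dim=1)
    hidden = backbone(embed(sequence))                                           # shape: (1, 76, d_model)
    print(hidden.shape)
    ```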

  • A new report explores the economic impact of generative AI

    I resisted the temptation to have a GenAI summarize the Economic Impact of Generative AI report by Andrew McAfee (MIT). At around 20 pages long (excluding references), the report delivers on the title’s promise in less than one hour of reading. It explains how GenAI is a general-purpose technology that is rapidly improving, becoming pervasive, and spurring…

  • xLSTM: The inventors of the LSTM present a Transformer contender

    Has Sepp Hochreiter done it again? After months of announcements, a group around the inventor of the LSTM finally published a paper presenting xLSTM to the world. Until the appearance of the Transformer in 2017, the LSTM had been the go-to technology for a wide variety of sequence-related tasks, including text generation. Three limitations relegated LSTMs…

  • GenAI’s copyright issues driving model diversity

    This week’s edition of the Economist (subscribers only) ran a feature on artificial intelligence and copyright. Generative AIs have been and continue to be trained on copyrighted material, including texts, images, music, videos, and more. Not all creators are amused. Some have chosen to sue the companies developing these Generative AI models. Others, including Associated…

  • GenAI and the job market: a case for optimism

    I resisted the temptation to have a GenAI summarize the Economic Impact of Generative AI report by Andrew McAfee (MIT). At around 20 pages long (excluding references), the report delivers on the title’s promise in less than one hour of reading. It explains how GenAI is a general-purpose technology that is rapidly improving, becoming pervasive, and…

  • Fraunhofer launches FhGenie

    The Fraunhofer-Gesellschaft, one of Germany’s largest and most renowned research organizations, published a paper, “FhGenie: A Custom, Confidentiality-preserving Chat AI for Corporate and Scientific Use”, in which the authors describe a customized chat AI named FhGenie. The paper lays out the motivation and requirements that led to the design, as well as the solution’s architecture.…

  • Techniques for using language models

    One of the weaknesses of the models currently available on the market is that they have been trained on publicly accessible data, which is not necessarily sufficient to meet specific needs. Take, for example, a company with a large volume of proprietary data, a highly specialized vocabulary, or specific data formats.…

  • From language models to multimodal models

    Language models have remarkable qualities. Their ability to analyze complex queries phrased in human language, acquired by training on the immense volumes of text accessible on the Internet, was enough to spark enthusiasm. However, these algorithms model only one component of human perception: text. Multimodal models aim to overcome this limitation by natively processing different…

  • Copyright and generative AI

    To kick off 2024, I’d like to talk about the current copyright situation for generative models. This is a highly topical issue, since two lawsuits on the subject are currently before British and American courts: the first, in Great Britain, pits the Getty Images image library against Stability AI, a company that supplies an image-generating model.…