Author: hans

  • Interpretable Features

    A team at Anthropic, creator of the Claude models, published a paper about extracting interpretable features from Claude 3 Sonnet. This is achieved by placing a sparse autoencoder halfway through the model and then training it. An autoencoder is a neural network that learns to encode its input data, here the activations of a middle layer of Claude, into…
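
    As a rough illustration of the technique (not Anthropic's actual code), the sketch below trains a sparse autoencoder on stand-in activations; the layer width, dictionary size, and L1 penalty are assumed values chosen only for the example.

    ```python
    # Minimal sparse-autoencoder sketch: an overcomplete, sparsity-penalized
    # reconstruction of activations captured at a middle layer of a language model.
    # All sizes and hyperparameters below are illustrative assumptions.
    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        def __init__(self, d_model: int, d_features: int):
            super().__init__()
            self.encoder = nn.Linear(d_model, d_features)  # project activations onto many candidate features
            self.decoder = nn.Linear(d_features, d_model)  # reconstruct the original activations

        def forward(self, acts: torch.Tensor):
            features = torch.relu(self.encoder(acts))      # non-negative codes, encouraged to be sparse
            return features, self.decoder(features)

    d_model, d_features = 512, 4096                        # assumed layer width and dictionary size
    sae = SparseAutoencoder(d_model, d_features)
    optimizer = torch.optim.Adam(sae.parameters(), lr=1e-4)

    activations = torch.randn(64, d_model)                 # stand-in for captured middle-layer activations
    features, reconstruction = sae(activations)
    loss = ((reconstruction - activations) ** 2).mean() + 1e-3 * features.abs().mean()  # MSE + L1 sparsity
    loss.backward()
    optimizer.step()
    ```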

  • Chameleon, a mixed-modal early-fusion foundation model

    In a new paper, Meta announces Chameleon, a mixed-modal early-fusion foundation model. Unlike earlier multimodal models, which model the different modalities (text, image, audio, etc.) separately, mixed-modal early-fusion foundation models like Chameleon are end-to-end models. They ingest all modalities from the start and project them into one representational space. That permits integrating information across…
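
    A toy sketch of the early-fusion idea (not Meta's implementation): both modalities become tokens in one shared embedding space and pass through a single transformer from the first layer on. The vocabulary sizes and the random image-token IDs are placeholders for a real text tokenizer and image quantizer.

    ```python
    # Toy early-fusion sketch: text tokens and (quantized) image tokens share one
    # embedding table and one backbone. All sizes are illustrative assumptions.
    import torch
    import torch.nn as nn

    text_vocab, image_codebook, d_model = 1000, 512, 256
    embed = nn.Embedding(text_vocab + image_codebook, d_model)                   # one table for both modalities
    layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
    backbone = nn.TransformerEncoder(layer, num_layers=2)                        # stand-in for the shared model

    text_ids = torch.randint(0, text_vocab, (1, 12))                             # tokenized text prompt
    image_ids = torch.randint(text_vocab, text_vocab + image_codebook, (1, 64))  # stand-in for a quantized image

    # Early fusion: one interleaved sequence, so attention can integrate information
    # across text and image positions from the very first layer.
    sequence = torch.cat([text_ids, image_ids], dim=1)
    hidden = backbone(embed(sequence))                                           # shape: (1, 76, d_model)
    print(hidden.shape)
    ```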

  • A new report explores the economic impact of generative AI

    I resisted the temptation to have a GenAI summarize the Economic Impact of Generative AI report by Andrew McAfee (MIT). At around 20 pages long (excluding references), the report delivers on the title’s promise in less than one hour of reading. It explains how GenAI is a general-purpose technology that is rapidly improving, becoming pervasive, and spurring…

  • xLSTM: The inventors of the LSTM present a Transformer contender

    Has Sepp Hochreiter done it again? After months of announcements, a group around the inventor of the LSTM finally published a paper presenting xLSTM to the world. Until the appearance of the Transformer in 2017, the LSTM had been the go-to technology for a wide variety of sequence-related tasks, including text generation. Three limitations relegated LSTMs…

  • GenAI’s copyright issues driving model diversity

    This week’s edition of the Economist (subscribers only) ran a feature on artificial intelligence and copyright. Generative AIs have been and continue to be trained on copyrighted material, including texts, images, music, videos, and more. Not all creators are amused. Some have chosen to sue the companies developing these Generative AI models. Others, including Associated…

  • GenAI and the job market: a case for optimism

    I resisted the temptation to have a GenAI summarize the Economic Impact of Generative AI report by Andrew McAfee (MIT). At around 20 pages long (excluding references), the report delivers on the title’s promise in less than one hour of reading. It explains how GenAI is a general-purpose technology that is rapidly improving, becoming pervasive, and…

  • Fraunhofer launches FhGenie

    The Fraunhofer-Gesellschaft, one of Germany’s largest and most renowned research organizations, published a paper, “FhGenie: A Custom, Confidentiality-preserving Chat AI for Corporate and Scientific Use”, in which the authors describe a customized chat AI named FhGenie. The paper lays out the motivation and requirements that led to the design, as well as the solution’s architecture.…

  • Techniques for using language models

    One of the weaknesses of the models currently available on the market is that they have been trained on publicly accessible data, which is not necessarily sufficient to meet specific needs. Take, for example, a company with a large volume of proprietary data, a highly specialized vocabulary, or specific data formats.…

  • From language models to multimodal models

    Language models have remarkable qualities. Their ability to analyze complex queries phrased in human language, acquired by training on the immense volumes of text accessible on the Internet, was enough to spark enthusiasm. However, these algorithms model only one component of human perception: text. Multimodal models aim to overcome this limitation by natively processing different…

  • Copyright and generative AI

    To kick off 2024, I’d like to talk about the current copyright situation for generative models. This is a highly topical issue, since two lawsuits on the subject are currently before British and American courts: the first, in Great Britain, pits the Getty Images image library against Stability AI, a company that supplies an image-generating model.…