ANALYSIS

ARTLAS Maps 78 Cultural-Technology Institutions Using NLP Clustering

MegaOne AI, Apr 1, 2026
Engine Score 5/10 — Notable

A researcher has proposed a computational framework to systematically classify and compare art-technology institutions, filling a gap in cultural analytics where no unified methodology previously existed for mapping organizations as diverse as academic conferences, electronic music festivals, and hybrid art-science labs.

  • Joonhyung Bae submitted ARTLAS to arXiv on March 28, 2026, covering 78 cultural-technology institutions across festivals, biennials, research labs, and conferences.
  • The methodology uses E5-large-v2 sentence embeddings, UMAP dimensionality reduction, and agglomerative clustering with average linkage at k=10.
  • Clustering achieved a composite score of 0.825, a silhouette coefficient of 0.803, and a Calinski-Harabasz index of 11,196, indicating strong cluster cohesion and separation.
  • Four coherent groupings emerged, ranging from an art-science hub anchored by ZKM and ArtScience Museum to an ACM academic cluster comprising TEI, DIS, and NIME.

What Happened

Researcher Joonhyung Bae submitted a paper to arXiv on March 28, 2026, proposing ARTLAS, a computational pipeline that maps 78 art-technology institutions into a single comparable analytical space using text embeddings and unsupervised clustering. The institutions span festivals, biennials, research labs, conferences, and hybrid organizations. The paper is available at arXiv:2603.28816. The study works entirely from textual descriptions rather than quantitative institutional data such as budgets or attendance figures.

ARTLAS addresses a gap Bae identifies directly in the paper: “systematic frameworks for analyzing their multidimensional characteristics remain scarce,” despite the global art-technology landscape having grown considerably in institutional diversity.

Why It Matters

Art-technology institutions occupy a broad and often overlapping set of identities — a festival like Ars Electronica operates very differently from a conference series like NIME or a research center like ZKM, yet all three are frequently grouped under the same general label. Without a structured taxonomy, cross-institutional comparisons rely on informal knowledge rather than measurable criteria.

Institutional mapping methods from adjacent disciplines — such as scientometrics for academic journals or typologies in cultural economics — have not been systematically adapted to this sector. ARTLAS introduces an eight-axis framework specifically designed for the multidimensional nature of art-technology organizations, moving beyond single-variable classification schemes.

The absence of such a framework has practical implications for funding bodies evaluating institutional portfolios, researchers studying the evolution of the sector, and organizations seeking to identify peer institutions for benchmarking or collaboration.

Technical Details

The ARTLAS pipeline begins with qualitative descriptions of each of the 78 institutions along eight conceptual axes: Curatorial Philosophy, Territorial Relation, Knowledge Production Mode, Institutional Genealogy, Temporal Orientation, Ecosystem Function, Audience Relation, and Disciplinary Positioning. These descriptions are encoded with E5-large-v2 sentence embeddings, then quantized through a word-level codebook into TF-IDF feature vectors.
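The article does not spell out how the codebook is built or applied. One plausible reading, sketched below purely as an assumption, is that each word vector is assigned to its nearest codebook entry and each description is then represented by TF-IDF weights over those code ids. The function names, the tiny 2-D vectors, and the smoothed IDF formula are illustrative choices, not details from the paper.

```python
import math
from collections import Counter

def nearest_code(vector, codebook):
    """Index of the codebook entry closest to the vector (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda c: math.dist(vector, codebook[c]))

def tfidf_over_codes(docs, codebook):
    """docs is a list of documents, each a list of word vectors.

    Returns one TF-IDF row per document, with one column per codebook entry.
    """
    counts = [Counter(nearest_code(v, codebook) for v in doc) for doc in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents does each code appear?
    df = Counter(code for c in counts for code in c)
    features = []
    for c in counts:
        total = sum(c.values())
        row = []
        for code in range(len(codebook)):
            tf = c[code] / total if total else 0.0
            idf = math.log((1 + n_docs) / (1 + df[code])) + 1  # smoothed IDF (assumed)
            row.append(tf * idf)
        features.append(row)
    return features

# Hypothetical toy data: three codebook entries and two short "descriptions".
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
docs = [
    [(0.1, 0.0), (0.9, 1.0)],   # words fall on codes 0 and 1
    [(0.2, 0.1), (5.1, 4.9)],   # words fall on codes 0 and 2
]
features = tfidf_over_codes(docs, codebook)
print(features)
```

In this toy case, code 0 appears in both documents and so receives a lower weight than the codes that distinguish one document from the other, which is the behavior TF-IDF is meant to provide.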

Dimensionality reduction is performed using UMAP before applying agglomerative clustering with average linkage at k=10. The resulting model achieves a composite score of 0.825, a silhouette coefficient of 0.803, and a Calinski-Harabasz index of 11,196. Because the silhouette coefficient ranges from -1 to 1, a value of 0.803 indicates strong intra-cluster cohesion alongside clear separation between groups.
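To make the clustering step and the silhouette metric concrete, here is a minimal standard-library sketch of average-linkage agglomerative clustering followed by a silhouette computation. The 2-D toy points stand in for UMAP-reduced embeddings, and the function names are hypothetical. It is not the paper's implementation; in practice a library such as scikit-learn (`AgglomerativeClustering` with `linkage="average"`, `silhouette_score`, `calinski_harabasz_score`) would normally be used instead.

```python
import math
from itertools import combinations

def average_linkage(points, k):
    """Repeatedly merge the two clusters with the smallest mean pairwise
    distance until only k clusters remain; return one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = sum(math.dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points, where a is the mean
    distance to the point's own cluster and b the mean distance to the
    nearest other cluster. Points in singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = average_linkage(points, k=2)
score = silhouette(points, labels)
print(labels, round(score, 3))
```

On well-separated toy groups like these, the score lands close to 1, which is the regime the reported 0.803 suggests for the ARTLAS clusters.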

Non-negative matrix factorization (NMF) is applied separately to extract ten latent thematic topics across the dataset. A neighbor-cluster entropy measure identifies boundary institutions — organizations that sit between multiple clusters rather than belonging clearly to a single one. An interactive React-based web visualization accompanies the methodology, designed to allow stakeholders to explore institutional similarities and thematic profiles without engaging directly with the underlying NLP pipeline.
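The article does not give the exact definition of the neighbor-cluster entropy measure. A common way to build such a score, assumed here for illustration, is to take each item's nearest neighbors and compute the Shannon entropy of their cluster labels: items surrounded by a single cluster score 0, while items whose neighbors are split across clusters score higher. The function name, the `n_neighbors` parameter, and the toy coordinates are all hypothetical.

```python
import math
from collections import Counter

def neighbor_cluster_entropy(points, labels, n_neighbors=4):
    """Shannon entropy (in bits) of the cluster labels found among each
    point's n_neighbors nearest neighbors, excluding the point itself."""
    entropies = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        counts = Counter(labels[j] for j in others[:n_neighbors])
        total = sum(counts.values())
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies

# Hypothetical toy data: two tight groups plus one point halfway between them.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),       # cluster 0
    (10, 0), (10, 1), (11, 0), (11, 1), (10.5, 0.5),  # cluster 1
    (5.5, 0.5),                                       # in-between point
]
labels = [0] * 5 + [1] * 5 + [0]
entropies = neighbor_cluster_entropy(points, labels, n_neighbors=4)
boundary_index = max(range(len(points)), key=lambda i: entropies[i])
print(boundary_index, entropies[boundary_index])
```

Ranking institutions by this score would surface the candidates that sit between clusters, which matches the role the article attributes to the measure.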

Who’s Affected

The 78 institutions in the dataset include prominent names across the global art-technology sector. The clustering process identified four coherent groupings that reflect substantive differences in how these organizations position themselves with respect to curatorial philosophy, disciplinary scope, and audience relationship: an art-science hub anchored by ZKM and ArtScience Museum; an innovation and industry cluster including Ars Electronica, transmediale, and Sonar; an ACM academic community comprising TEI, DIS, and NIME; and an electronic music and media cluster featuring CTM Festival, MUTEK, and Sonic Acts.

Curators, festival directors, grant-making bodies, and academic researchers working at the art-science intersection are the primary audiences the tool addresses. The React-based visualization is designed to make institutional analysis accessible to practitioners who do not work directly with NLP or clustering methods.

What’s Next

Bae describes ARTLAS as a “replicable, data-driven approach to institutional ecology in the cultural-technology sector,” but the pipeline depends on qualitative axis descriptions as its core input — a step that introduces potential variability depending on how individual institutions are characterized before any encoding takes place. The paper does not report inter-rater reliability measures for this qualitative phase.

The current scope covers 78 institutions, and the paper does not detail how the framework would scale to a larger or more geographically diverse dataset. Validation through expanded coverage and testing of axis-description consistency across different annotators would be required to substantiate the methodology’s claims to broad replicability.
