Michael Günther - Jina AI

Michael Günther

ML Scientist and Engineer @ Jina AI. Enthusiastic about open source and AI with particular interest in solving information retrieval problems.

Still Need Chunking When Long-Context Models Can Do It All?

Comparing how long-context embedding models perform with different chunking strategies to find the optimal approach for your needs.

Diagram illustrating the 'Late Chunking' and 'Long Document Model' processes in machine learning on a black background.

Tech Blog Featured

Late Chunking in Long-Context Embedding Models

Chunking long documents while preserving contextual information is challenging. We introduce "Late Chunking", a method that leverages long-context embedding models to generate contextual chunk embeddings for better retrieval.

What We Learned at ICML2024 ft. PLaG, XRM, tinyBenchmark, MagicLens, Prompt Sketching etc.

We had a blast at ICML 2024 in Vienna, and we want to share with you everything we said, saw, and learned.

Illuminated sign reading "EMNLP 2023 Entry" mounted above a door, suggesting a conference entrance

A Tale of Two Worlds: EMNLP 2023 at Sentosa

Just back from EMNLP 2023 and my mind's still reeling! Witnessed NLP's seismic shift firsthand through daring papers and provocative posters that are challenging everything we thought we knew. Check out my take on the conference's boldest ideas.

Abstract geometric background with bold "Hype Hybrids" text in white and colors against a purple backdrop

Hype and Hybrids: Search is more than Keywords and Vectors

Twenty years ago, “hybrid” was a term used only by botanists and chemists. Today, hybrid is booming… even in search. Many search systems are rolling out hybrid search schemes with the latest AI. But is "hybrid search" really more than a buzzword?