5/n
REFERENCES
[1] Orion Weller, Michael Boratko, Iftekhar Naim, and Jinhyuk Lee. 2025. On the theoretical limitations of embedding-based retrieval. https://arxiv.org/abs/2508.21038
[2] Yifu Qiu, Varun Embar, Yizhe Zhang, Navdeep Jaitly, Shay B. Cohen, and Benjamin Han. 2025. Eliciting in-context retrieval and reasoning for long-context large language models. https://machinelearning.apple.com/research/eliciting-in-context Repo: https://github.com/apple/ml-icr2
#AI #LLM #RAG #Embeddings #Retrieval #RecSys #Search #AgenticAI #paper
1.9.2025 01:19
4/
4. Three ways out: (1) Cross-encoders—placing docs in the prompt of a Long-Context LM; accurate but costly. (2) Multi-vector retrieval. (3) Sparse retrieval.
For cross-encoders, this links directly to our earlier work on ICR² [2], where combining training data design with model re-architecting improved retrieval performance (see picture 4). Many other paths remain open!
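For option (2), one common form of multi-vector retrieval is ColBERT-style late interaction (MaxSim). The post does not say which variant is meant, so treat this as a rough sketch with made-up toy vectors rather than the paper's setup:

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query vector, take its best-matching
    doc vector, then sum those maxima over all query vectors."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Hypothetical toy data: two vectors per query and per doc.
query = [[1, 0], [0, 1]]
doc_a = [[1, 0], [0, 1]]   # covers both query vectors
doc_b = [[1, 0], [1, 0]]   # covers only the first query vector

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a, score_b)
```

Because each doc keeps several vectors, a query can match different parts of it separately instead of relying on one pooled vector.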
#AI #LLM #RAG #Embeddings #Retrieval #RecSys #Search #AgenticAI #paper
1.9.2025 01:19
3/
3. LIMIT dataset: To stress-test real models, they construct the LIMIT dataset consisting of 50k docs and 1k queries, each with 2 relevant docs (see picture 1). All single-vector models fail badly, while BM25 and multi-vector methods perform much better (see picture 2).
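A quick count of how few docs it takes to give each of the 1k queries its own pair of relevant docs. This assumes every query targets a distinct pair, which the post does not state, and the function name is just illustrative:

```python
import math

def min_docs_for_distinct_pairs(num_queries):
    """Smallest k such that k docs offer at least num_queries distinct pairs."""
    k = 2
    while math.comb(k, 2) < num_queries:
        k += 1
    return k

k = min_docs_for_distinct_pairs(1000)
print(k, math.comb(k, 2))
```

Under that assumption, only a small core of the 50k docs needs to be relevant to anything; the rest act as distractors.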
#AI #LLM #RAG #Embeddings #Retrieval #RecSys #Search #AgenticAI #paper
1.9.2025 01:17
2/
2. Oracle experiment: They empirically confirm this bound by directly optimizing embeddings (“free embeddings”) against the relevance matrix. The critical corpus size grows only cubically with dimension. For example, with embedding dimension d = 1024, you can only represent all possible 2-doc query combinations up to about 4 million documents — far below typical web-scale retrieval needs.
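To get a feel for the scaling, here is a back-of-the-envelope sketch that extrapolates from the single data point above (d = 1024, about 4 million docs) using a pure cubic. Ignoring any lower-order terms of the paper's fitted curve is an assumption made here for simplicity, and the function names are illustrative only:

```python
import math

ANCHOR_DIM = 1024          # embedding dimension quoted in the post
ANCHOR_DOCS = 4_000_000    # "about 4 million" documents quoted in the post

def critical_corpus_size(d):
    """Pure cubic extrapolation from the anchor point.
    Assumption: lower-order terms are ignored, so values are rough."""
    return ANCHOR_DOCS * (d / ANCHOR_DIM) ** 3

def two_doc_combinations(n):
    """Number of distinct 2-doc relevance sets over n documents."""
    return math.comb(n, 2)

for d in (512, 1024, 2048, 4096):
    print(d, f"{critical_corpus_size(d):,.0f}")

print(f"{two_doc_combinations(ANCHOR_DOCS):,}")
```

Doubling the dimension buys only about 8x more documents under this approximation, while the number of 2-doc combinations grows quadratically with corpus size.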
#AI #LLM #RAG #Embeddings #Retrieval #RecSys #Search #AgenticAI #paper
1.9.2025 01:16
1/
Embeddings are the beating heart of modern AI—powering RAG and serving as memory for agentic AI. But a new paper [1] shows a ceiling:
1. Dot-product retrieval is bounded by the embedding dimension d: if the relevance matrix (in ±1 form) has sign-rank r, then d >= r - 1 is required, and no amount of training can avoid it.
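A toy way to see why dimension caps what a dot-product retriever can return (an illustration only, not the paper's sign-rank argument): with d = 1, whatever the query, only the doc with the largest or the smallest embedding can ever rank first. The doc values and queries below are made up:

```python
def top1_winners(doc_embs, queries):
    """Return the set of doc indices that rank first for at least one query."""
    winners = set()
    for q in queries:
        scores = [q * x for x in doc_embs]  # 1-D dot product
        winners.add(max(range(len(doc_embs)), key=lambda i: scores[i]))
    return winners

# Hypothetical 1-D embeddings for five docs, plus a few nonzero queries.
docs = [0.3, -0.7, 0.9, 0.1, -0.2]
queries = [-2.0, -0.5, 0.5, 2.0]

winners = top1_winners(docs, queries)
print(sorted(winners))  # indices of the docs with the smallest and largest embedding
```

The other three docs can never be the top result, no matter how the query is chosen; higher dimensions relax this, but only up to a point.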
#AI #LLM #RAG #Embeddings #Retrieval #RecSys #Search #AgenticAI #paper
1.9.2025 01:14
Skipped today’s #Parkrun, and shadow-ran with my daughter in the community 5K race. It's her first race coming back to the school cross-country team after *two* years in recovery from her running injury. She did quite well in the first mile, but ran out of steam after finishing the first 2 miles. But this was only the first week, and I know she will outrun me in no time!
I did manage to sprint a bit at the end -- 4'28"/mi -- my fastest stride ever! 🙂
(half marathon race next Monday!)
31.8.2025 04:33
Coachable human: achievement unlocked!
#coachGPT #running
30.8.2025 03:40
So much wonderful tech and engineering goes into this teeny tiny daily driver.
(you also get to learn why dogs tilt their heads when confused)
https://youtu.be/PB_8dGKh9JI?feature=shared
29.8.2025 02:36
Maier, #Franck, #Schumann: Sonatas for #Violin & #Piano Duo Concertante
https://classical.music.apple.com/us/album/1817421838?l=en-US
#classicalMusic #appleMusicClassical #commute #chamber
28.8.2025 15:18
Beware, my #British friends. https://mstdn.social/@Free_Press/115099995291652362
27.8.2025 16:26
Oliver Davis & Antonio Vivaldi: Seasons
Trafalgar Sinfonia, Grace Davidson, Kerenza Peacock
(looking forward to fall)
https://classical.music.apple.com/us/album/1024672141?l=en-US
#classicalMusic #appleMusicClassical #commute #vocal
27.8.2025 16:20
Cause and effect
#running #health https://mas.to/@SmudgeTheInsultCat/115092405352945410
26.8.2025 16:10
“Exposure to heat waves over just two years could add up to 12 extra days of age-related health damage.”
Humans produce greenhouse gases -> global warming -> people age faster -> population decreases faster -> reduced greenhouse gases
https://www.nytimes.com/2025/08/25/climate/heat-waves-aging.html
#climateCrisis #Heat #health #earth
26.8.2025 16:01
#Scarlatti: #piano sonatas
Javier Perianes
https://classical.music.apple.com/us/album/1810545013?l=en-US
#classicalMusic #appleMusicClassical #commute
26.8.2025 15:48
“Check all constants and make sure they are immutable.”
Life is WONDERFUL!
#genAI #coding #RooCode
25.8.2025 20:55
Happy Monday! Let’s watch #Pika fight first! https://youtu.be/Q2ua2luJW00
#animals #video #bbc
25.8.2025 14:40
Great reading while training for #marathon races, I guess??? 🤔
Are Marathon Runners More Likely to Get Cancer? - VICE https://apple.news/A9uZaF_VXSxqNdeM5Tt6cjQ
24.8.2025 13:23
Adding this rule to project documentation guidelines to prevent #RooCode from going into the infinite-loop hell.
#genAI #coding
23.8.2025 20:07
Today’s run went according to #ChatGPT coach’s plan. The coach even gave a remark on the little drama I encountered! 😆
#Running
22.8.2025 02:09
J. S. #Bach & Sons: #Flute Sonatas
Toshiyuki Shibata, Anthony Romaniuk
https://classical.music.apple.com/us/album/1816634319?l=en-US
#classicalMusic #appleMusicClassical #commute
21.8.2025 15:29