“This is the perfect opportunity to describe retrieval-augmented generation (RAG).” We assume the family had already threatened violence if he mentioned bitcoin.
It is also lovely that the quote follows directly after Google’s glue-on-pizza advice. Just pivot to something else.
But since I don’t trust the linked AI fondler’s description, what is RAG? Sounds like an LLM stapled to a search engine.
If it were merely a search engine, it would risk not being AI enough. We already have search engines, and no one is gonna invest in that old garbage. So instead, it finds something you might want that’s been predigested for ease of AI consumption (Retrieval), dumps it into the context window alongside your original question (Augmentation), and then bullshits about it (Generation).
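If you want it spelled out, the whole loop fits in a dozen lines. A minimal sketch, where `search_index` and `call_llm` are hypothetical stand-ins rather than any real vendor’s API:

```python
# The three letters of RAG, stubbed out. Nothing here is a real library call.

def search_index(query: str, k: int = 3) -> list[str]:
    """Retrieval: return the k predigested passages that best match the query.
    Stubbed with a toy corpus; in practice this is a keyword or vector search."""
    corpus = {
        "what is rag": "RAG pastes retrieved documents into the prompt.",
        "glue on pizza": "Do not put glue on pizza.",
    }
    return [text for key, text in corpus.items() if key in query.lower()][:k]

def call_llm(prompt: str) -> str:
    """Generation: stubbed; imagine an API call to the chatbot of your choice."""
    return f"(confident-sounding answer based on: {prompt!r})"

def rag(question: str) -> str:
    passages = search_index(question)                            # Retrieval
    prompt = "\n".join(passages) + "\n\nQuestion: " + question   # Augmentation
    return call_llm(prompt)                                      # Generation

print(rag("what is rag, exactly?"))
```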
Think of it as exactly the same stuff the LLM folk have already tried to sell you: working around the limitations of training and data availability by providing “cut and paste as a service” to generate ever more complex prompts for you, in the hope that this time you’ll pay more for it than it costs to run.
Wait, is that all it’s doing? Google the prompt, take the first result, append it to the prompt, feed it to the LLM?
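Pretty much, except the Google step is usually swapped for nearest-neighbour search over pre-embedded chunks of text. A toy sketch, with a bag-of-words counter standing in for a real embedding model (the real thing is a fancier version of the same idea):

```python
# "Vector database" retrieval, lovingly hand-rolled. The bag-of-words
# embedding below is a placeholder assumption, not a real embedding model.

from collections import Counter
import math

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "RAG dumps retrieved text into the context window.",
    "Cheese slides off pizza without glue.",
]
index = [(embed(c), c) for c in chunks]  # the "index", built ahead of time

def retrieve(query: str) -> str:
    q = embed(query)
    return max(index, key=lambda pair: cosine(q, pair[0]))[1]

question = "why does cheese slide off pizza?"
prompt = retrieve(question) + "\n\nQuestion: " + question
# feed `prompt` to the LLM and let it bullshit about it
```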
This stuff is getting pushed all the time in Obsidian plugins (note-taking/personal knowledge management software). That kind of drives me crazy, because the whole appeal of the app is that your notes are just plain text you could easily read in Notepad, but some people are chunking up their notes into tiny, confusing bite-sized pieces so they’re better formatted for a RAG (wow, that sounds familiar).
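For the curious, “chunking” is about as glamorous as it sounds: slice the notes into overlapping fixed-size pieces so the retriever has something bite-sized to match against. A sketch of the general idea, not any particular plugin’s actual logic:

```python
# Split a note into overlapping windows. Sizes here are arbitrary assumptions.

def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

note = "Your perfectly readable plain-text note. " * 20
pieces = chunk(note)  # now in tiny, confusing bite-sized pieces, as advertised
```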
Even without a RAG, using LLMs for searching is sketchy. I was digging through a lot of obscure Stack Overflow posts yesterday and thinking: how could an LLM possibly help with this? It takes less than a second to type in the search terms, and you only have to look at the titles and snippets of the results to tell if you’re on the right track. You have the exact same bottleneck of typing and reading, except with ChatGPT or Copilot you also have to pad your query with a bunch of filler and then read all the filler slop in the answer as it streams in a couple thousand times slower than dial-up. Maybe they’re more evenly matched on simpler questions you don’t have to interrogate, but then why even bother? I’ve seen people say ChatGPT is faster, easier, and more accurate than Stack Overflow, and even two crazy ones who said Stack Overflow is completely obsolete; trying to understand that perspective just causes me psychic damage.
RAG feels to me like the other new LLM tricks (like voice commands): the base model can’t be improved much, so they tack on modules that make it look like the base tech is still improving, while in reality it’s just the tacking on of older tech.
Pivot has also declared scheduled downtime until the new year. So there.