25 - LLM Hallucinations and Looping

Creating code example snippets, finding and dealing with LLM hallucinations, and of course ChatGPT Plugins

Mar 27, 2023

Welcome to another edition! I'm Daniel Imfeld, and here I share things I've read recently, updates on what I've been working on, and occasionally nascent blog post drafts.

Book Progress

Made great progress on the book last week. The PostGIS chapter is pretty much done, aside from some extra examples and images. The next one will cover processing spatial data with JavaScript libraries such as Turf, D3, and topojson.

I also need to get some of these examples into actually running code, to ensure that there are no typos or other problems. AsciiDoc actually has great support for this type of thing, allowing you to pull in small parts of content from other files, so that your example code can live in actual running examples, while the AsciiDoc parser automatically pulls in the appropriate parts every time it renders.

This is something I had wanted to build in my Logseq exporter for a while, but I discovered AsciiDoc just as I was about to do so. Yet another squashed opportunity to procrastinate.

Especially for SQL examples, the main thing I want to avoid here is overengineering. I think a very simple set of scripts that can set up a temporary database, populate it with some data, and run some queries should suffice. No need for a real framework yet — that can come after I finish the first book.

LLM Hallucinations

I have been playing around with GPT4 and Claude+ as research partners, rounding out some rough edges of my knowledge. It’s largely been helpful for generating ideas, but inconsistent for more factual questions.

Claude+ suggested a Postgres feature in which you can use the any_but_current table alias in a check constraint to compare against all rows except the current one. This sounded like a nice feature, and I was surprised that I hadn’t heard of it.

Then I discovered that it doesn’t actually exist, in Postgres or any other database as far as I can tell.

Next, I was trying to see if there was a way to use a SQL exclusion constraint to check that shapes in a database table don’t overlap each other. GPT4 suggested the && operator, which is almost correct. But that operator checks only the bounding boxes of the shapes, so it would return a lot of false positives. The correct answer is that you just can’t do it with an exclusion constraint, and have to use some other method such as a constraint trigger.

As usual with these models, it apologetically gave the correct answer once I told it that it was wrong, but this is one of those cases where, if you didn’t know better, you might add buggy code to your application without realizing it.
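To make the bounding-box problem concrete, here is a small sketch of how a box-only check can report an overlap that isn't real. It is only an illustration: circles stand in for arbitrary shapes so the exact test stays short, and the Circle and Box types and all function names are made up for this example rather than taken from PostGIS.

```typescript
// Illustration only: circles stand in for arbitrary shapes so that the
// exact overlap test stays short. All names here are hypothetical.
interface Circle {
  cx: number;
  cy: number;
  r: number;
}

interface Box {
  minX: number;
  minY: number;
  maxX: number;
  maxY: number;
}

// Smallest axis-aligned rectangle that contains the circle.
function boundingBox(c: Circle): Box {
  return { minX: c.cx - c.r, minY: c.cy - c.r, maxX: c.cx + c.r, maxY: c.cy + c.r };
}

// Box-only test, similar in spirit to a bounding-box operator.
function boxesOverlap(a: Box, b: Box): boolean {
  return a.minX <= b.maxX && b.minX <= a.maxX && a.minY <= b.maxY && b.minY <= a.maxY;
}

// Exact test: circles touch or intersect when the distance between their
// centers is no more than the sum of their radii.
function circlesOverlap(a: Circle, b: Circle): boolean {
  return Math.hypot(a.cx - b.cx, a.cy - b.cy) <= a.r + b.r;
}

const first: Circle = { cx: 0, cy: 0, r: 1 };
const second: Circle = { cx: 1.8, cy: 1.8, r: 1 };

const boxHit = boxesOverlap(boundingBox(first), boundingBox(second));
const realHit = circlesOverlap(first, second);
console.log({ boxHit, realHit });
```

The two circles sit diagonally from each other, so the corners of their boxes intersect even though the circles never touch. A constraint built only on the box test would reject this pair.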

Perhaps a better API to these models would always ask the model to evaluate what it just wrote and see if it’s correct. If the model doesn’t give you a confident answer in the followup, then you should be extra wary. I’m not sure yet how well it would actually work, but it’s similar to this idea.

Amjad Masad ⠕ @amasad

Most LLM prompt hacks — eg “let’s think step by step” — are basically introducing looping to a feed forward network. Every new token generated is a loop, and the more you give the LLM space to “talk” the better it can compute because looping is essential for that.
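The self-check idea from before the tweet could be sketched roughly like this. Everything here is an assumption for illustration: the Complete type is a hypothetical stand-in for whatever completion call you have, not a real library API, and the CONFIDENT/UNSURE reply format is just one possible convention that a real model may not follow reliably.

```typescript
// Hypothetical shape for any text-completion call; not a real library API.
type Complete = (prompt: string) => Promise<string>;

interface CheckedAnswer {
  answer: string;
  review: string;
  confident: boolean;
}

// Ask the question, then feed the answer back and ask the model to review it.
async function askWithSelfCheck(complete: Complete, question: string): Promise<CheckedAnswer> {
  const answer = await complete(question);
  const review = await complete(
    `Question: ${question}\nProposed answer: ${answer}\n` +
      "Is the proposed answer correct? Reply with CONFIDENT or UNSURE on the first line, then explain."
  );
  // Assumption: the model follows the requested reply format.
  const confident = review.trim().toUpperCase().startsWith("CONFIDENT");
  return { answer, review, confident };
}

// Stub model for demonstration: it doubts its own answer when asked to review it.
const stubModel: Complete = async (prompt) =>
  prompt.includes("Proposed answer")
    ? "UNSURE\nI cannot find this feature in the documentation."
    : "Use the any_but_current alias in a check constraint.";

askWithSelfCheck(stubModel, "How can a check constraint compare against other rows?").then((result) => {
  console.log(result.confident ? "model stands by its answer" : "be extra wary");
});
```

The second call is the extra "loop": the model gets more room to talk about its own output before you trust it.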

ChatGPT Plugins

Of course, the AI world was rocked yet again as ChatGPT introduced plugins that allow interactivity. While this isn’t a new concept by itself, having been preceded by explorations such as Geoffrey Litt’s Fuzzy API Composition and papers such as Toolformer, there are two big differences here:

Plugins open up the functionality to the general non-coding public.
Plugins are easy to create. You don’t even need to tell ChatGPT what to do; you just tell it what’s available and it figures it out.

Mitchell Hashimoto @mitchellh

For those who aren't aware: you write an OpenAPI manifest for your API, use human language descriptions for everything, and that's it. You let the model figure out how to auth, chain calls, process data in between, format it for viewing, etc. There's absolutely zero glue code.

At this point, I don’t have much else to say on the subject that hasn’t already been said a thousand times. But it’s an exciting time ahead. LangChain is already starting to add support for these plugins as well.

Harrison Chase @hwchase17

⭐️Claude + AI Plugins⭐️
AI Plugins (the ones ChatGPT is using) are usable by ANY language model
It takes a tiny bit of prompt engineering, but here is @AnthropicAI's Claude using them
Code: gist.github.com/hwchase17/554e…

Tagged Unions

Just saw this tweet which resonated with me.

🇺🇦 Ingvar Stepanyan @RReverser

After Rust it feels pretty weird to use or even see languages without tagged unions. It feels like such a crucial core type at this point.

This is the number one thing I would like to have built into JavaScript. Tagged unions allow you to have an enum where each value of the enum can also have associated data. This is incredibly convenient when you have a variable which can be in a few different states, with different data associated with each state. There’s no need to ensure that the data is correctly synced with the state, since the compiler enforces it for you.

TypeScript sort of lets you do the same thing, but it’s less convenient to actually work with. That’s not all TypeScript’s fault of course; building on top of a loosely typed language like JavaScript presents a lot of limitations. I do think that the upcoming pattern matching standard for JavaScript will also help with this kind of work.
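As a minimal sketch of the TypeScript approach, here is a discriminated union, where a shared literal field plays the role of the tag. The RequestState type and its variants are hypothetical, chosen only to show a variable with a few states and different data attached to each.

```typescript
// Hypothetical request state: each variant carries only the data that
// makes sense for that state, keyed by the `kind` tag.
type RequestState =
  | { kind: "idle" }
  | { kind: "loading"; startedAt: number }
  | { kind: "success"; body: string }
  | { kind: "failure"; error: string };

function summarize(state: RequestState): string {
  switch (state.kind) {
    case "idle":
      return "not started";
    case "loading":
      return `loading since ${state.startedAt}`;
    case "success":
      // `body` is only accessible here because the tag has been checked.
      return `got ${state.body.length} characters`;
    case "failure":
      return `failed: ${state.error}`;
    default: {
      // Exhaustiveness check: adding a variant without a matching case
      // makes this assignment fail to compile.
      const unreachable: never = state;
      return unreachable;
    }
  }
}

const message = summarize({ kind: "success", body: "hello" });
console.log(message);
```

The compiler narrows the type inside each case, which gives most of the safety of a Rust enum. The inconvenience shows up in the boilerplate: the tag field, the switch, and the manual exhaustiveness check are all conventions rather than language features.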

Links and Reading

Here’s another from Geoffrey Litt. Malleable Software in the Age of LLMs looks forward to how LLMs may bring software development to the masses, not so much in the sense of replacing existing software developers, but in making it much easier for non-technical people to create small one-off applications to solve specific problems. Worth some time to read and think about.

Daniel’s Substack
