Draft Paper Discovered in Which Joseph Weizenbaum Envisions ELIZA's Applications (sites.google.com)
125 points by abrax3141 on March 19, 2024 | 24 comments


I interviewed Jeff Shrager. He is one of the people behind this site and the effort to investigate Eliza.

They do a lot of interesting work and the history of Eliza is more complicated than you would guess.

(The interview was before ChatGPT was a big thing.) If you don't mind the plug:

https://corecursive.com/eliza-with-jeff-shrager/


I’d highly recommend that podcast as well. I’ve listened to that episode twice. It was really fascinating.

EDIT - meant to add that the summary style of your interviews is great. Keep up the good work!


Thanks!


Same, another recommendation for the podcast.


Wow, this is a great podcast episode. I remember using a version of the Creative Computing code that I reworked for my TRS-80. Fun to hear the backstory.


I really enjoy your podcast! :)


Thanks!


It's amazing to me that the "chat bot" interface for ELIZA, developed in the mid-1960s, really isn't very different from that of ChatGPT 4, 60 years later.


I also recommend "Computer Power and Human Reason", his 1976 treatise. It really presages all of the last few years of AIspew.


To Weizenbaum's point, back in the 80s I used to cart a Teletype and an acoustic coupler to high schools to talk about computer science. The big demo was Eliza. Even after I explained that it was a simplistic program (getting it to say “Perhaps we fragisticulate each other in your dreams”), and showed the students the scripts it was using, I found students would want to have serious conversations with it, of the “please don't look at it just now” variety. Seeing that the students and teachers consistently missed the point, I stopped using it.


> Seeing that the students and teachers consistently missed the point, I stopped using it.

In retrospect you missed the point. But so did everyone else. As primitive as ELIZA is, it is actually closer to the LLM approach than, well, everything else in AI over the past 60 years.

People don't realize how simple LLMs' code is. We're talking a few hundred lines! That's comparable to that of ELIZA.

All the magic is in the data those lines process. That's what takes up gigabytes of storage and requires many gigabytes of memory and GPUs/Apple Silicon to run. (Quite possibly representative of how seven pounds of meat and 20w of power can still outperform said gigabytes and silicon.) Had such compute power been available, there is no reason to think that Weizenbaum or others at the AI Lab could not have built a true language model in 1967.
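To give a sense of what those few hundred lines contain: the heart of a transformer layer is a short stretch of linear algebra. A minimal sketch of scaled dot-product attention (toy dimensions; no batching, masking, or multiple heads), assuming nothing beyond numpy:

  import numpy as np

  def attention(X, Wq, Wk, Wv):
      # Project the input tokens into queries, keys, and values.
      Q, K, V = X @ Wq, X @ Wk, X @ Wv
      # Scaled dot-product scores, softmaxed row by row.
      scores = Q @ K.T / np.sqrt(K.shape[-1])
      weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
      weights /= weights.sum(axis=-1, keepdims=True)
      # Each output token is a data-dependent mix of the value vectors.
      return weights @ V

  rng = np.random.default_rng(0)
  X = rng.normal(size=(5, 8))            # 5 tokens, 8-dim embeddings
  Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
  print(attention(X, Wq, Wk, Wv).shape)  # (5, 8)

The code is trivial; the trained contents of Wq, Wk, Wv (and their stacked siblings) are where the gigabytes go.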

I suspect that, if anything, ELIZA's simplicity caused researchers not to pursue it further, as it was obvious that true AI would of course require millions of intricate lines of code. Thus we got 50 wasted years building ever more complex expert systems (i.e., fancy versions of Twenty Questions, or Akinator), or attempts to replicate the human brain in hardware (Danny Hillis named his company Thinking Machines for a reason). A future history of AI may describe the 50 years between ELIZA and "Attention Is All You Need" in a chapter called "The Route Not Taken".


You're making it sound like everything that's needed for modern AI is computing power.

But going back to the '70s with a bunch of 4090s would not help them much, as they wouldn't have enough data to do very meaningful things. Availability of data and computing power went hand in hand; it's not just a matter of "they didn't know better": they couldn't have done much else to begin with.


> But going back to the '70s with a bunch of 4090s would not help them much, as they wouldn't have enough data to do very meaningful things.

Not at first. But the corpus would have been built over time. Project Gutenberg was founded in 1971. Imagine a world in which the Copyright Act of 1976 requires an electronic copy of every book to be deposited with the Library of Congress.

The best single work of fiction ever created about LLMs' capabilities (and, perhaps, dangers) is Colossus by D. F. Jones. Although I think the film is even better than the book, only the latter mentions how, despite being created specifically for US national defense, Colossus is also fed unrelated data, including Shakespeare's sonnets, because its creators do not know whether it might be important.


"Imagine a world in which the Copyright Act of 1976 requires an electronic copy of every book to be deposited with the Library of Congress."

I fail to imagine such a world. It's 1976, there are very few computers around (at least for the general public and smaller companies), and a lot of typesetting is still not based on computers. Those who are using "digital" typesetting (not Aldus, but interacting with a photo-typesetter via a terminal) are probably storing data in proprietary formats on floppy disks or tapes, and presumably not as a neat .txt file. Who would even consider such a law?

The huge data availability that enabled GPT-2, GPT-3, ChatGPT, etc. is not because of Project Gutenberg, but because of the widespread use of the Internet and the tons of user-produced content you can scrape and reuse. Sure, we can imagine a past in which all these conditions are in place, but then we're just moving all the tech of the 2010s and 2020s 50 years back. And if we do that, I agree we can have current AI approaches and flared trousers go hand in hand :)


The crucial difference is that ELIZA's scripts were made by hand and a modern LLM has gigabytes of "scripts" that were learned automatically. They're not really comparable.

We didn't know how to do this learning back then, nor did we have the data and hardware to train on.
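
To make the contrast concrete, here is a toy, hand-authored ELIZA-style rule set in Python (illustrative only; Weizenbaum's actual DOCTOR script was far richer and also did pronoun swapping):

  import re, random

  # Each rule pairs a hand-written decomposition pattern with
  # hand-written reassembly templates, in the spirit of DOCTOR.
  RULES = [
      (re.compile(r".*\bI am (.*)", re.I),
       ["How long have you been {0}?",
        "Why do you believe you are {0}?"]),
      (re.compile(r".*\bmy (.*)", re.I),
       ["Tell me more about your {0}."]),
  ]

  def respond(line):
      for pattern, templates in RULES:
          m = pattern.match(line)
          if m:
              return random.choice(templates).format(m.group(1))
      return "Please go on."  # default when nothing matches

  print(respond("I am afraid of computers"))
  # -> e.g. "How long have you been afraid of computers?"

Every response an ELIZA can give traces back to a template someone typed in; an LLM's "templates" are billions of learned weights no one wrote by hand.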


Another possibility is that, in fact, vincent-manis and everyone else did not miss the point, and the folks who believe we can achieve “true AI” if we just give Eliza more CPU power and disk space are barking up the wrong tree.

To paraphrase Ted Chiang: a learning thermostat has a goal, but it has no preferences or subjective experience. You’re training it to maintain the temperature of your house, but you’re still interacting with a thermostat. The way we’re approaching AI currently supposes that if you just cobble enough thermostats together, you get a person.


>> if you just cobble enough thermostats together, you get a person

Is that so different from how natural selection produced the human mind?


I think it is.

The completely mechanistic view of human behaviour (that we are merely reaction and/or goal-seeking machines) fails to capture a host of other behaviours we are capable of (sacrifice, altruism). That is, it fails to capture those herd responses that are external to us.

I'm not just talking about fish swimming in schools (which has a natural survival explanation) but more emergent behaviours that lie outside the flock/fight/flee/fuck/feed set of things. Kindness to strangers, empathy, understanding.

It's entirely plausible that AI might produce machines we may deem intelligent, but their calculations can't reproduce the human mind because it's obvious that is not how we work.


"...can't reproduce the human mind because it's obvious that is not how we work" <-- that's a very strong assumption, without any concrete evidence to support it


See above


>To paraphrase Ted Chiang: a learning thermostat has a goal, but it has no preferences or subjective experience. You’re training it to maintain the temperature of your house, but you’re still interacting with a thermostat. The way we’re approaching AI currently supposes that if you just cobble enough thermostats together, you get a person.

Imagine a person whose duty is to maintain the desired temperature of your house by adjusting a thermostat. With training and experience, he does a pretty good job of this. A learning thermostat, after training, does as good a job of maintaining the temperature if not better. Isn't this ASI (S standing for specific, not superior) for that particular role?

Your and most people's answer is of course not; nothing so simple is AI. But why not? It replicates a job that previously required human intelligence and judgment. It performs a task that all the raw energy Star Trek's warp cores can generate cannot.

Now, if one device can be programmed to generalize such training to all such tasks, starting with other household duties, then outside the home ... at what point does it not become AGI?

>Another possibility is that, in fact, vincent-manis and everyone else did not miss the point, and the folks who believe we can achieve “true AI” if we just give Eliza more CPU power and disk space are barking up the wrong tree.

I don't know the answer to this; no one does. But I know of no reason not to think that AGI can be achieved via LLMs with enough CPU power and disk space, either. Speaking of Trek, elsewhere I compare Data to LLMs. <https://np.reddit.com/r/singularity/comments/1bibwz6/data_on...> I point you to my conversation with ninjasaid13.

I also point you to my attempt to annotate a meme about LLMs. <https://np.reddit.com/r/ChatGPT/comments/1bhk4ju/its_magic/k...> Circling back to Chiang, he has a background in software development, which puts him among the midwits (in this very specific situation!). That said, I wonder if Chiang truly does not believe in the possibility of emergence, a sum greater than the parts; he is, after all, the author of "Exhalation".


> A learning thermostat, after training, does as good a job of maintaining the temperature if not better

Maybe you’ve had more luck than me; I’ve hated every “smart” thermostat I’ve come across. If AGI is a bunch of smart thermostats in a trench coat, I hope it comes with an off switch.


I don't know the answer to this, either, of course. :) I have a strong suspicion that LLMs are not, in and of themselves, the way AGI will be achieved. They may be a component of it, but I suspect there is a fundamental difference between "based on knowledge and context, I am confident this is a correct answer to your question" and "based on statistical and lexical analysis, I am confident this is what a correct answer to your question might look like," and I don't think LLM-only technology is capable of making the jump from the latter to the former.

So is just getting better and better at the latter enough? In some contexts, probably -- but somewhat ironically, I don't think those are the contexts that we keep pushing AI into. It's pretty good with problem spaces for which "correctness" isn't a terribly important metric. For instance, I just prompted ChatGPT 3.5-turbo to come up with a synopsis of a 1920s private-detective story, with a few provided details, using the seven-point story structure, and I was surprised at how well it did. It came up with something really derivative, sure (to the point where it named its protagonist "Jack Malone," which sounds suitably detective-ish; Jack Malone turns out to be the name of the main character of the long-running detective show "Without a Trace"), but definitely a passing grade!

But time and time and time again we see examples of AI failing at areas where correctness is important -- "hallucinations" -- and I think that's where my distinction in the first paragraph becomes more than just mere semantics. LLMs do not actually have knowledge and context, but instead just generate something that looks correct. Sometimes it actually is correct! (I could ask ChatGPT to summarize the seven-point story structure and it aced it, for instance.) But sometimes it isn't, and the LLM literally cannot distinguish between correct and incorrect responses. I ran into this at my last job, where…hmm. To be a bit elliptical about it to avoid breaking an NDA I suspect I'm still under, we were trying to train an LLM to call a JSON-based API by analyzing the user's query and filling in the correct API parameters. It did an amazing job most of the time, but the remit was "if the user doesn't specify a required parameter, prompt them for it," and there was no way to get the LLM to do that consistently. Sometimes it would, but sometimes it would just make up what it thought was a plausible value and plunge ahead. "Works nine times out of ten" may or may not be an acceptable success rate, but "one time out of ten makes up shit and sees what happens" is not an acceptable failure mode.
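
(A sketch of the kind of guardrail this pushes you toward, with hypothetical parameter names, not our actual system: validate the model's JSON in ordinary code and re-prompt on anything missing, rather than trusting the model to ask.)

  import json

  REQUIRED = {"customer_id", "order_date"}  # hypothetical required params

  def check_llm_call(raw):
      # Return (parsed args, missing params) instead of trusting the model.
      try:
          args = json.loads(raw)
      except json.JSONDecodeError:
          return None, set(REQUIRED)  # unparseable: ask again for everything
      present = {k for k, v in args.items() if v not in (None, "")}
      return args, REQUIRED - present

  args, missing = check_llm_call('{"customer_id": "C42"}')
  if missing:
      print("Ask the user for:", ", ".join(sorted(missing)))

Of course, this only catches outright omissions; a plausible fabricated value sails straight through, which is exactly the failure mode that made it so maddening.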

And this is where Chiang's thermostat analogy comes into play for me, I think: I simply don't see how you get from "looks correct" to "verifiably correct" by adding more and better thermostats. You're going to get responses that sound better, and you're going to get them faster, but you're still not going to be able to know whether they're bullshit without running them by a fact-checker.


KMP addressed that point:

20240214: Kent Pitman's (KMP) Lisp ELIZA from MIT-AI's ITS history project. We've added Kent Pitman's (a.k.a. KMP, https://github.com/netsettler) MACLisp ELIZA to the elizagen repo. This was discovered by Lars Brinkhoff, and can also be found in his PDP-10 ITS historical sources repo.

KMP has been a fixture of the Lisp community for decades. Among many other Lisp-related activities, he worked on Macsyma, Maclisp, and Emacs, and authored the Common Lisp HyperSpec. He also chaired several of the Common Lisp standards subcommittees. He has written several interesting essays on the history of Lisp and Lisp-adjacent artifacts and activities, available here: https://nhplace.com/kent/Papers/.

In the late 1970s, when he started at MIT, KMP took Joseph Weizenbaum's Intro to Programming (6.030), where he was introduced to ELIZA. Probably having seen Bernie Cosell's BBN-Lisp ELIZA, KMP wrote a version of his own on ITS, a locally developed OS for the PDP-10s that were the main computers in the MIT AI Lab. ITS had a slightly different version of Lisp (MACLisp vs. BBN-Lisp), so Cosell's code couldn't be run directly; KMP also wanted to hack the program a bit. He writes:

>"While [this code has] definitely got my work in it, it's pretty clearly built on someone else's work. Back then there was not a sense of publication so much as just making existing things better. And everything was a proto variant of free software because there was no place one realistically expected one would sell anything. It was just hack upon hack in that environment. Both intellectual property law itself, and my naive understanding of it, was quite different then."

KMP provided some additional thoughts on his ELIZA:

>"The thing about the DOCTOR program, while it was quite elaborate, was that it sort of missed the point of what Weizenbaum was trying to say. It's not that I didn't understand him, but it took me a long time to really come to viscerally understand the seriousness of his point. He said that (paraphrasing, maybe badly, from memory) long before we got to real AI, we would get to things that seemed smart but weren't, and we would overly trust systems that were not really smart. He was, for example, very worried about computerized launch of nuclear missiles using what was then thought to be AI. But the point of the original ELIZA was not to BE elaborate, it was to be simple, and to show how even a tiny, tiny program could seem smart. It was hard not to think it could be smarter if it were not so tiny, but that misses the point, which was about how easily we were confused, not about how much more effectively we could be confused if more energy were put in. So whether my DOCTOR or any other was making the right point was hard to say. My guess is not."



