Mathematically, it is literally a probability distribution, because it fits the definition of a measure whose total mass is one, so I think the language is just imprecise. What they may be trying to say is that semantically it doesn't arise in a principled way from an uncertainty model, such as from Bayesian or frequentist statistics.
Hogwash. If you get into deriving maximum entropy distributions via the calculus of variations, the softmax (multinomial logit) form is the maximum entropy distribution over a set of categories, given a constraint on the expected score.
This is exactly the sense in which it comes up for old school LMs and why it appears in thermodynamics.
Of course it is entirely possible that newfangled ML people use it without understanding that it is derived from first principles, e.g. see the article.
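For anyone who wants to see the maximum entropy argument spelled out, here is a minimal sketch. It assumes the only constraints are normalization and a fixed expected score; the symbols s_i (per-category score), \mu (target expected score), and the multipliers \lambda and \nu are notation introduced here for illustration, not something from the article.

```latex
% Maximize entropy over K categories, subject to normalization
% and a fixed expected score \mu (the assumed constraint).
\begin{align*}
\max_{p}\; & H(p) = -\sum_{i=1}^{K} p_i \log p_i
  \quad \text{s.t.} \quad \sum_{i=1}^{K} p_i = 1, \quad \sum_{i=1}^{K} p_i s_i = \mu \\
\mathcal{L} &= -\sum_{i} p_i \log p_i
  + \lambda \Big( \sum_{i} p_i s_i - \mu \Big)
  + \nu \Big( \sum_{i} p_i - 1 \Big) \\
\frac{\partial \mathcal{L}}{\partial p_i} &= -\log p_i - 1 + \lambda s_i + \nu = 0
  \;\Longrightarrow\; p_i = e^{\nu - 1}\, e^{\lambda s_i} \\
p_i &= \frac{e^{\lambda s_i}}{\sum_{j=1}^{K} e^{\lambda s_j}}
  \quad \text{(after normalizing)}
\end{align*}
```

Absorbing \lambda into the scores gives the usual softmax. Taking s_i to be an energy E_i and \lambda = -1/(k_B T) gives the Boltzmann distribution, which is the thermodynamics connection.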
That definitely could be the case. I was also a bit surprised by what the article said, so I was simply trying to interpret it, but I'm not extremely well versed in ML so I could be missing some details. My main point was that contrary to what the article said, they do in fact have a probability distribution on their hands.
You have a relatively small dictionary of tokens, each prediction has a neural network score that goes into the final token prediction layer, and they are trained based on a log-softmax (i.e. the above function) to predict their next token.
This is exactly how anyone in any field models a conditional multinomial/categorical distribution (i.e. one of a bunch of distinct tokens), and it has been done this way since time immemorial. AFAIK it is also what LLMs generally use as the loss function on the output layer, though I have not deeply investigated all of them.
I am extremely confused by all of the people screaming it's not a probability distribution?!?!?
I have seen computer vision tasks use binomial (one-vs-all) training objectives and then use the multinomial only at inference time. It would be fair to say that the result is not a probability distribution induced by training, even though it is technically a probability distribution in the sense that every entry is ≥ 0 and they sum to 1.
But AFAIK the token-prediction LLMs I am aware of use the softmax for the probability in their loss function, i.e. they maximize the log-softmax.
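To make the point concrete, here is a minimal sketch of log-softmax as a next-token training objective, in plain Python. The function names and the four scores are hypothetical values chosen for illustration; real models do the same thing over a vocabulary of many thousands of tokens using a tensor library.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores (logits)."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def next_token_loss(scores, target_index):
    """Negative log-likelihood of the observed next token."""
    return -log_softmax(scores)[target_index]

# Hypothetical scores for a 4-token vocabulary (illustrative values only).
scores = [2.0, 0.5, -1.0, 0.0]
probs = [math.exp(lp) for lp in log_softmax(scores)]

print(probs)                       # every entry is positive
print(sum(probs))                  # sums to 1 up to floating-point rounding
print(next_token_loss(scores, 0))  # loss if token 0 was the true next token
```

Minimizing `next_token_loss` is the same as maximizing the log-softmax of the observed token, so the outputs are trained as a categorical distribution rather than merely normalized after the fact.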
You're asking the right questions. The going theory as far as I can see is that training models is fair use (although it may not be fully resolved in the courts), in which case this whole exercise would seem to be pointless. If it were that easy, I have to think the FSF etc. would have been all over this years ago.
That is exactly my understanding as well, and certainly that was my intent in my GPL-licensed projects.
Also, about conditions on redistribution, the vast majority of all open-source software places at least some mild conditions, like the preservation of copyright and attribution to the authors, so if there is some kind of "gotcha" here, I don't think it has anything to do with copyleft.
Before even getting to any of the other issues people may have with generative AI, in my opinion by far the most important question is simply: does the AI help students learn better? And it's pretty clear to me that the answer is "no."
To be frank, this one quote from a Google executive pretty much lays bare the whole scam:
> [Sinha of Google for Education] added that, by using A.I. tools, students are “able to create much more impressive projects that you could have never done before.”
This is one of the most obvious lies that the slop shops are trying to peddle. Using generative AI to make something is analogous to, and often literally just, hiring a third party to make the thing for you. It is not analogous to creating the thing yourself. I think the vast majority of people can recognize this, but unsurprisingly some people are buying the snake oil.
This really goes to the heart of what education is, doesn't it? While I'm no expert on theories of learning, I can draw from my own experiences, which I think are not exceptional. In my experience we learn things by (1) passively acquiring information, (2) thinking about the thing on our own, and (3) actively doing something with the knowledge. My point is that (2) and (3) are just as important as (1), and removing or reducing those is actively hampering learning rather than helping. As the article correctly points out, "creating impressive projects" has absolutely nothing to do with education. Duh.
My real worry is that teachers, who are already underpaid and underappreciated, will feel a lot of pressure to adopt some of these tools purely to manage their own workloads, and I think that would be a sad and preventable outcome.
Technically "it depends on the browser settings," but the body font Alegreya is served directly by the site, so I think it would be the one used in almost all cases.
The math fonts used in the formulas are just the ones provided by KaTeX, which I think are just TeX's default math fonts.
This is the actual quote: "There is a very real scenario in which personal computing as we know it is dead." He went on to say this, as reported in another article [1]:
> Still, Framework said that it will not take this lying down. Its event announcement also doubled as its own manifesto, saying that "as long as there is a person in the world who still wants to own their means of computation, we will be here to build the hardware that enables it," and that it "will always be fighting for a future where you can own everything and be free."
In San Francisco, the annual library budget is ~$200,000,000. That's about $20/month for each San Francisco resident (including babies, elderly people etc.).
It might not seem like a lot, but it is a lot when you consider that most residents don't use the library at all, and that adult book collections aren't great.
850,000 people have to share just 2 copies of Thompson's Calculus Made Easy. (I didn't cherry pick this: I looked up at my bookshelf and picked the first book I saw.)
Very little of the money is spent on books. Only 15% of the money is spent on 'collections', and much of that is spent on things other than books.
SF libraries are nice for children (lots of copies of kids' books, lots of desks to do homework when waiting for parents to get back from work).
But I personally don't find them a convenient source for reading material as, if I want a particular book, they usually don't have it.
SFPL's own stats say they see over 10,000 visitors per day and check out over 12 million items annually. Let's say you allocate 50% ($100M) to each of those two missions: serving as a community space vs. lending materials.
That gives you:
- As a community space: $100M ÷ (10,000 visitors × 365 days) = ~$27 per visit. You could hand every person who walks in a $27 gift card to a coffee shop with free Wi-Fi and they'd arguably get a comparable experience for many use cases.
- As a lending library: $100M ÷ 12,000,000 checkouts = ~$8.30 per checkout. You could just buy most paperbacks and many e-books for that price and give them away.
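The arithmetic above can be reproduced with a few lines of Python. The budget, visitor, and checkout figures are the ones quoted in this comment; the 50/50 split between the two missions is the comment's own assumption, not an SFPL figure.

```python
# Figures quoted in the comment; the 50/50 split is an assumption.
annual_budget = 200_000_000      # dollars per year
visitors_per_day = 10_000        # "over 10,000", so treat as a lower bound
checkouts_per_year = 12_000_000  # "over 12 million", also a lower bound
share_per_mission = 0.5

mission_budget = annual_budget * share_per_mission
cost_per_visit = mission_budget / (visitors_per_day * 365)
cost_per_checkout = mission_budget / checkouts_per_year

print(f"per visit: ${cost_per_visit:.2f}")        # per visit: $27.40
print(f"per checkout: ${cost_per_checkout:.2f}")  # per checkout: $8.33
```

Because both usage figures are stated as "over" some number, the per-visit and per-checkout costs here are upper bounds under the 50/50 assumption.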
Libraries do more than lend books and provide community spaces. They also run a lot of programs. So just saying "hand everyone a Starbucks gift card and a paperback every month" doesn't cover everything.
There are worse ways for a city to spend money. SF has a spending problem. Both can be true.
Yep, you also get a quota of 10 suggested purchases for their collection every month - I scour for new books to max mine out and they grant >95% of what I ask for
If you click your username at the top right corner and then the bell (which will have a number if there are notifications) you can find out what happened with those requests
Hmm, if you'd submitted them in the past it's strange to see a count of 0. I see all of mine there. https://sfpl.bibliocommons.com/user_profile/me/notification_... shows:
> Your suggestion has been approved!
> The library will acquire the following Book: <title>. To learn more and manage your requests, go to Suggested Purchases.
Some of the services end up being very expensive, like ebook lending. Some publishers basically charge libraries per loan ($X for an ebook that lasts Y loans), so while it is nice for residents it's not clear that it's a good value, or that it's a good use of tax money.
I once heard from a knowledgeable source that most library lending is bodice rippers. These are available from Amazon/etc. pretty cheaply, which undercuts the value argument. And of course, there's practically no social value in providing the public with free bodice rippers...
I'd be interested to know more about the economics of lending DVDs and Blu-rays. Hopefully libraries get a better deal on these.
If most of lending were made up of educational texts, there would be a social value. Some people describe bodice rippers as porn for women, and people get addicted to them in the same way they get addicted to porn.
Would a library ever lend porn out? I'm guessing no, because of the lack of social value. To the extent that bodice rippers are like porn, the same rationale would apply.
Nope, because even if bodice rippers are not pure porn, it's not clear why libraries should subsidize entertainment for patrons. I'm not saying it's a terrible use of taxpayer money, just disagreeing with OP who said it's "such a great use of tax money". It does not bring people together, educate them, or provide for the common defense. Why not have movie theaters be government-run? It would make as much sense as providing free smut-adjacent books for (almost entirely) women.
If people used libraries as much as they say libraries are great, the shelves would be empty. Libraries have to be some of the most underutilized services.
In my experience, there can be pretty high contention for certain items, so you need to be on the ball or make use of the "place hold" feature judiciously. Yeah, people are using the service.
In the US, libraries are often part of a network, and we have access to all the materials in the network. So if my local library doesn't have it, I simply request it from another library. They ship it to mine and I pick it up (and return to mine).
Then we also have a larger inter-library loan, where I can request things from libraries far, far away (even in another state). It takes much longer, though, and if it is deemed a popular/useful item, my local library may decide to purchase it and give that one to me rather than use ILL.
You may want to check if your local library has something similar.
I recently discovered Kanopy and was surprised by the amount of A-tier movies you can stream there for free with a library membership (SFPL in my case)!
Browsing through the library DVD shelves is somewhat reminiscent of browsing through an '80s/'90s mom & pop type video rental place [1]. I think it's better for randomly finding something interesting than the algorithm (tm). [1] But without the room in the back with the ADULTS ONLY sign. Your library may vary.
Libraries are the single reason I got back into video games after a multi-decade hiatus.
I played very few games from 2002 to 2017. Didn't want to keep buying new computers, and did not want to bother with consoles (graphics were better on PC than on a non-HD TV).
In 2010 I bought a PS3, but only to watch Blu-Ray, Netflix and stream from my PC to TV. Did not play games on it.
Then in 2016/2017, on a whim, I decided to check out a game from the library. I Googled some good games, and picked Telltale's The Walking Dead.
Oh wow. One of the best games I've ever played. For the next 2 months I kept checking out games and playing them.
Then for some reason I stopped. I started again in 2022 and haven't looked back. Seriously cut down my TV watching so I can play the games. I don't use the library any more - I just buy the games.
Engineers are fundamentally pragmatic people. We're problem solvers. Someone who only cares about ingenuity and craft would be a shitty engineer, because that perspective is entirely inward-facing and not directed at the problems at hand. I think this is fundamentally my problem with your question, and I think if you framed the question slightly differently with this in mind, it would make more sense.
To attempt to answer it, I think there are many engineers who care deeply about creativity, ingenuity, and craft, because those are key qualities (among others) needed to solve real-world problems. The question you hinted at is whether LLMs are compatible with that, and I think more people are asking those types of questions now.
Absolutely, the whole point of the rubber duck is that it's inanimate. The act of talking to the rubber duck makes you first of all describe your problem in words, and secondly hear (or read) it back and reprocess it in a slightly different way. It's a completely free way to use more parts of your brain when you need to.
LLMs are a non-free way for you to make use of less of your brain. It seems to me that these are not the same thing.