The "send known CSAM" attack has existed for a while but never made sense. However, this technology enables a new class of attacks: "send legal porn, collided to match CSAM perceptual hashes".
With the previous status quo:
1. The attacker faces charges of possessing and distributing child pornography
2. The victim may be investigated and charged with child pornography if LEO is somehow alerted (which requires work, and can be traced to the attacker).
Poor risk/reward payoff, specifically the risk outweighs the reward. So it doesn't happen (often).
---
With the new status quo of lossy, on-device CSAM scanning and automated LEO alerting:
1. The attacker never sends CSAM, only material that collides with CSAM hashes. They are looking at charges under the CFAA, plus extortion and blackmail.
2. The victim will be automatically investigated by law enforcement, due to Apple's "Safety Voucher" system. The victim will be investigated for possessing child pornography, particularly if the attacker collides legal pornography that may fool a reviewer inspecting a 'visual derivative'.
Great risk/reward payoff. The reward dramatically outweighs the risk, as you can get someone in trouble for CSAM without ever touching CSAM yourself.
If you think ransomware is bad, just imagine CSAM-collision ransomware. Your files will be replaced* with legal pornography designed specifically to collide with CSAM hashes and trigger automated alerting to law enforcement. Pay X monero within the next 30 minutes, or, quite literally, you may go to jail on child pornography possession charges until you spend $XXX,XXX on lawyers and expert testimony to demonstrate your innocence.
* Another delivery mechanism for this is simply sending collided photos over WhatsApp, as WhatsApp allows for up to 30 media images in one message, and has settings that will automatically add these images to your iCloud photo library.
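To make the "collided to match" part concrete, here is a rough sketch of how such a collision could be produced, assuming the attacker has a differentiable reimplementation of the perceptual hash (the publicly demonstrated NeuralHash collisions were, as I understand it, built roughly along these lines). Every name here is hypothetical, not any real tool:

```python
# Hypothetical sketch: nudge an innocuous image until its perceptual hash
# matches a target hash, assuming `hash_model` is a differentiable
# (extracted/reimplemented) version of the hash network. Not a real tool.
import torch

def collide(image, target_bits, hash_model, steps=1000, lr=0.01):
    """`image`: tensor in [0,1]; `target_bits`: tensor of +/-1 per hash bit."""
    x = image.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = hash_model(x)                     # pre-threshold hash outputs
        # push each output toward the sign the target hash demands
        hash_loss = torch.relu(-logits * target_bits).sum()
        # keep the result visually close to the original
        visual_loss = torch.nn.functional.mse_loss(x, image)
        (hash_loss + 10.0 * visual_loss).backward()
        opt.step()
        with torch.no_grad():
            x.clamp_(0.0, 1.0)                     # stay a valid image
    return x.detach()
```

The point isn't this exact loss; it's that once the hash model is known, "legal image with a chosen hash" becomes an ordinary optimization problem.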
Before they make it to human review, photos in decrypted vouchers have to pass the CSAM match against a second classifier that Apple keeps to itself. Presumably, if it doesn’t match the same asset, it won’t be passed along. This is explained towards the end of the threat model document that Apple posted to its website. https://www.apple.com/child-safety/pdf/Security_Threat_Model...
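For anyone who hasn't read that document, here is my rough reading of the flow as a runnable sketch. Apart from the 30-match threshold Apple has stated, all names and types are placeholders I made up, not Apple's API:

```python
# Rough sketch of the server-side review flow described in Apple's threat
# model document. Types and helper names are hypothetical placeholders.
from dataclasses import dataclass
from typing import Callable, Dict, List

THRESHOLD = 30  # Apple's stated match count before vouchers become decryptable

@dataclass
class Voucher:
    neuralhash_matched: bool   # on-device NeuralHash matched a database entry
    matched_entry_id: int      # which database entry it matched
    visual_derivative: bytes   # low-res version, readable only past the threshold

def review_account(vouchers: List[Voucher],
                   private_hash: Callable[[bytes], str],
                   db_private_hashes: Dict[int, str],
                   human_confirms: Callable[[bytes], bool]) -> str:
    matches = [v for v in vouchers if v.neuralhash_matched]
    if len(matches) < THRESHOLD:
        return "no action"     # below threshold, nothing is decryptable
    for v in matches:
        # the second, private perceptual hash must match the *same* entry
        if private_hash(v.visual_derivative) != db_private_hashes[v.matched_entry_id]:
            continue           # a NeuralHash-only collision is dropped here
        if human_confirms(v.visual_derivative):
            return "report to NCMEC"
    return "no action"
```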
What happens if someone leaks or guesses the weights on that "secret" classifier? The whole system is so ridiculous even before considering the amount of shenanigans the FBI could pull by putting in non-CSAM hashes.
For better or worse, opaque server-side CSAM models are the norm in the cloud photo hosting world. I imagine that the consequences would be roughly the same as if Google's, Facebook's or Microsoft's "secret classifiers" were leaked.
But in the cloud setting they have the plaintext of what was uploaded. The attack described above is about abusing the lack of information Apple has, so that Apple ends up reporting an innocent user to the authorities.
The voucher that Apple can decrypt once enough positives have been received contains a scaled-down version of the original. How else would Apple be able to even run a second hash function on the same picture?
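As an aside, the "decryptable only once enough positives have been received" part is threshold secret sharing. Here is a minimal, generic Shamir-style sketch of the idea (not Apple's actual construction):

```python
# Minimal Shamir secret sharing: any `threshold` of the shares recover the
# secret (e.g. a decryption key); fewer reveal nothing. Generic sketch only,
# not Apple's actual scheme.
import random

P = 2**127 - 1  # prime modulus for the finite field

def make_shares(secret, threshold, count):
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    poly = lambda x: sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, poly(x)) for x in range(1, count + 1)]

def recover(shares):
    # Lagrange interpolation at x = 0 recovers the secret
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = make_shares(secret=123456789, threshold=30, count=100)
assert recover(shares[:30]) == 123456789   # 30 matches -> key reconstructable
```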
Can't they just make a new one and recompute the 2nd secret hash on the whole data set fairly easily?
Also, the whole point is that it's fairly easy to create a fake image that collides with one hash, but doing it for 2 is exponentially harder. It's hard to see how you could have an image that collides with both hashes (of the same image mind you).
Two hash models are functionally equivalent to a particular kind of single double-sized hash model. So it shouldn't be any harder to recompute against a 2nd hash, if that 2nd hash were public.
Of course, it won't be public (and if it ever became public they'd replace it with a different secret hash).
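To spell that out: if the second hash ever were public (and differentiable), the collision search sketched above would just target the concatenation of the two hash outputs. Same optimization, bigger target; names are again hypothetical:

```python
# Hypothetical continuation of the earlier collision sketch: attacking two
# public hash models at once by treating them as one concatenated hash.
import torch

def collide_both(image, target_a, target_b, hash_a, hash_b, steps=2000, lr=0.01):
    x = image.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = torch.cat([hash_a(x), hash_b(x)], dim=-1)   # one double-sized hash
        targets = torch.cat([target_a, target_b], dim=-1)
        hash_loss = torch.relu(-logits * targets).sum()
        visual_loss = torch.nn.functional.mse_loss(x, image)
        (hash_loss + 10.0 * visual_loss).backward()
        opt.step()
        with torch.no_grad():
            x.clamp_(0.0, 1.0)
    return x.detach()
```

The secrecy of the second hash, not its mere existence, is doing the work here.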
You can still hack someone's phone and upload actual CSAM images. That exposes the attacker to additional charges, but they're already facing extortion and all that anyway. I don't understand the "golly gee whizz, they'd have to commit a severe felony first in order to launch that kind of attack" argument.
Don't know why this hasn't already been used on other cloud services, but maybe it will be now that it's been more widely publicized.
How... exactly did they train that CSAM classifier, seeing as that training data would be illegal to possess? I'd be most interested in an answer on that one. They are willing to make that training data set a matter of public record at the first trial, yes?
Or are we going to say secret evidence is just fine nowadays? Bloody mathwashing.
It may not be, so honestly I think my objection is best dismissed. Once I ran down the actual chain I mostly sorted things out with a cooler head.
However, the line of thinking was: if Apple has a secondary classifier to run against visual derivatives, the intent is that it can say "CSAM/Not CSAM". Since NeuralHash can collide, that means they'd need something to take in the visual derivatives and match them against an NN trained on actual CSAM. Not hashes. Actual images.
Evidence, as far as I'm aware, is admitted to the public record, and a link needs to exist and be documented in a publicly auditable way. That to me implies any results of an NN would necessarily require that the initial training set be included for replicability, if we were really out to maintain the full integrity of the chain of evidence used as justification for locking someone away. That means a snapshot of the actual training source material, which means large CSAM dump snapshots being stored for each case that uses Apple's classifier as evidence. Even if you handwave the government being blessed to hold onto all that CSAM as fitting comfortably in the law enforcement action exclusions, it's still littering digital storage somewhere with a lotta CSAM. Also, Apple would have to update their model over time, which would require retraining, which would require sending that CSAM source material somewhere other than NCMEC or the FBI (unless both those agencies now rent out ML training infrastructure for you to do your training on, leveraging their legal carve-out, and I've seen no mention of that).
Therefore, I feel that, logistically speaking, someone is committing an illegal act somewhere, but no one wants to rock the boat enough to figure it out, because it's more important to catch pedophiles than to muck about with blast craters created by legislation.
I need to go read the legislation more carefully, so just take my post as a grunt of frustration at how it seems like everyone just wants an excuse/means to punish pedophiles, but no one seems to be making a fuss over the devil in the details, which should really be the core issue in this type of thing, because it's always the parts nobody reads or bothers articulating that come back to haunt you in the end.
i did a bit of reading as well and came across this. you might find it useful or interesting:
https://www.law.cornell.edu/uscode/text/18/2258A
at the end (h1-4), it details that providers must preserve the information they submit and also take steps to limit access to only people who need it. in this sense then, it's not illegal for companies to possess csam. it's not a big leap to then assume that storing csam for the development of detection software is legal (or at least has been thoroughly cleared with the courts, which is about the same). photodna was developed twelve years ago, and i can't find anything about microsoft ever being charged with possession or distribution of cp.
Somehow this didn't solidify my trust in Apple! By this standard you can probably mount a half-decent defence of "ignorance" even if you are caught sending the colliding material. Add this whole debacle on top of what's going on in the EU parliament, and 2021 has been WILD for privacy.
Sure, there is hyperbole in OP's comment (CSAM ransomware and automated law enforcement aren't a thing yet), but we're a few steps from that reality.
Even worse, how long will it take until other cloud storage services such as Dropbox, Amazon S3, Google Drive et al. implement the same features? Or worse, are required by law to do so?
This sounds like the start of an exodus from the cloud, at least in the non-developer consumer space.
Yeah, I was talking in hyperbole, but the possible attack vectors this system enables are so powerful I felt it warranted. Under this system you can effectively DDoS the organizations that verify whether CP was sent, by sending legitimate, low-res porn modified to collide with the hashes. You can trigger legitimate investigations by sending CSAM through WhatsApp or through social engineering. You can also fuck with Apple by sending obvious spam.
* With regard to the legislative branch, they can even mandate changes to this system that they aren't allowed to disclose. Once this system is in place, what is stopping governments from forcing in other sets of hashes for matching?
And this is just one step away from Apple and Microsoft building this scanning into the OS itself (into the kernel/filesystem code, why not?!). This is beyond insane. Stallman was right. Our devices aren't ours anymore.
Now, to be fair, there would be a secondary private hash algorithm running on Apple's servers to minimize the impact of hash collisions, but what's important is that once a file matches a hash locally, the file isn't yours anymore -- it will be uploaded unencrypted to Apple's servers and examined. How easy would it be to shift focus from CSAM into piracy to "protect intellectual property"? Or some other matter?
Yup. As others have pointed out, if Apple were willing to lie about the extent of this system and its inception date, why should we suddenly trust that they won't extend its functionality? They themselves explicitly state that the program will be extended, so if this is the starting point I don't think I will be around for the ride.
It's a shame as I really love some of their privacy-minded features (e.g. precision of access to the phone's sensors and/or media).
> Even worse, how long will it take until other cloud storage services such as Dropbox, Amazon S3, Google Drive et al implement the same features? Or worse, required by law to do so
They already do this. Google and Facebook have even issued reports detailing their various success rates…
If Apple hasn't been honest about WHEN it was built into and added to their code base, why would anyone take their word for HOW it's being used, or for the many other statements in their documents, at least until they are verified?
They'd more or less have to be. Well, not necessarily 'police', but NCMEC.
I did work in automating abuse detection years back, and the US govt clearly tells you that you are not to open/confirm suspected, reported, or happened-upon CP. There are a lot of other seemingly weird laws and rules around it.
Those laws don't apply if it's part of the reporting process. Apple's stated that they do a manual review to decide whether to send a report to NCMEC or not, just like other companies do.
Of course they do. If they didn't, every seedy pedo would be in the process of making a "report." It's probably also why Apple is using 'visual derivatives' for confirmation, rather than the image, though I can't find info on exactly how low resolution 'visual derivatives' are.
It is of course possible that companies may get some special sign off from LE/NCMEC to do this kind of work - I won't argue with you on that as I truly don't know. I can just tell you my company did not, and was very harshly told how to proceed despite knowing the nature of what we were trying to accomplish. But, we weren't anywhere near Apple big.
I remember chatting with our legal team, who made it explicit that the laws didn't cover carve-outs - basically, 'seeing' was illegal. But as you can imagine, police didn't come busting down our doors for happening upon it and reporting it. If you have links to law where this is not the case, I'll gladly eat crow. I've never looked myself and relied on what the lawyers had said.
Not resembles. The adversarial image has to match a private perceptual hash function of the same CSAM image that the NeuralHash function matched before a human reviewer ever looks at it.