Re: ISP wants to stop outgoing web based spam
On Thu, 10 Aug 2006, Stefan Bethke wrote:
Do you have any links or references?
Just ask the user some basic question. E.g.:

What is 2 added to 23?: <textbox>

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune: "Being disintegrated makes me ve-ry an-gry!" <huff, huff>
On Thursday 10 Aug 2006 01:14, Paul Jakma wrote:
On Thu, 10 Aug 2006, Stefan Bethke wrote:
Do you have any links or references?
Just ask the user some basic question. E.g.:
What is 2 added to 23?: <textbox>
I've no doubt some captcha can be invented in ASCII, but this isn't it.

AI already substantially outperforms all but a small minority of humans on mathematical-style IQ tests (they were over 160 when I was a kid), and it would be relatively trivial to code it to handle the types of questions for this kind of test.

It would work for a minority use. Indeed I've already used a BBS that expected you to understand about factoring numbers or some such question on joining.

Something requiring real world knowledge would be better, but it is very hard to automatically generate questions (and answers) that can't be automatically answered. And remember in most cases you need questions that are consistently hard, as the machines won't get bored retrying. If you generate them manually (at least the first time one is encountered).

Visual noise (and auditory noise) is something we are consistently good at removing, and machines are still playing catch-up. But then some of the automated captcha solvers aren't that much worse than a lot of people.

On the upside such captchas might spark more research into AI, as whilst recognising badly mangled images of text is kind of useful for the post office and other handwriting recognition, it has limited applications elsewhere.
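To illustrate how little code the "relatively trivial" claim needs, here is a minimal sketch of a solver for questions of the "What is 2 added to 23?" kind. The phrase table and the `solve` function are hypothetical, made up for this example, and only cover a handful of wordings; they are not taken from any real bot.

```python
import re

# Hypothetical phrase table (an assumption for illustration): each entry maps
# a wording to an operation on (a, b), where a is the first number in the
# question and b is the second.
PATTERNS = [
    (re.compile(r"(\d+)\s+added to\s+(\d+)"), lambda a, b: a + b),
    (re.compile(r"(\d+)\s+plus\s+(\d+)"), lambda a, b: a + b),
    (re.compile(r"(\d+)\s+times\s+(\d+)"), lambda a, b: a * b),
    (re.compile(r"(\d+)\s+subtracted from\s+(\d+)"), lambda a, b: b - a),
]


def solve(question):
    """Return the integer answer, or None if no known wording matches."""
    text = question.lower()
    for pattern, operation in PATTERNS:
        match = pattern.search(text)
        if match:
            return operation(int(match.group(1)), int(match.group(2)))
    return None


print(solve("What is 2 added to 23?"))  # prints 25
```

Each new question wording costs the attacker one more table entry, which is why the questions would need to change faster than such a table can be extended.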
On Thu, 10 Aug 2006, Simon Waters wrote:
I've no doubt some captcha can be invented in ASCII, but this isn't it.
'tis. It works for at least one blog platform, where I've never once had comment spam.
a kid), and it would be relatively trivial to code it to handle the types of questions for this kind of test.
Sure, so change the questions.

The ultimate "captcha defeating AI" is already in use by spammers, by the way - humans (get humans to "solve" captchas in return for some reward, e.g. porn). ASCII or image matters not a jot to those.

ASCII captchas are no less effective than image captchas, just without the nasty "ban the blind from the internet!" side-effects.

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune: byob, v: Believing Your Own Bull
Paul Jakma wrote:
ASCII captchas are no less effective than image captchas, just without the nasty "ban the blind from the internet!" side-effects.
Then again, you have Authen::Captcha, which has sound-based captchas as well...

/ Mat
On Wednesday 16 Aug 2006 01:13, Paul Jakma wrote:
On Thu, 10 Aug 2006, Simon Waters wrote:
I've no doubt some captcha can be invented in ASCII, but this isn't it.
'tis. It works for at least one blog platform, where I've never once had comment spam.
You snipped the bit where I said "It would work for a minority use." I'm sure it works fine for just you, but it doesn't scale, so the folks at NANOG probably don't care.

The reason people use image recognition is that it is something (most) humans find very easy, but requires considerable investment of effort (or resource for self training) to teach computers, and readily permits of variations ('click the kitten' being a good example).

For a demonstration of bashing at ASCII captchas try any good chat bot. I asked the online bot at ellaz.com your question:

"What is 2 added to 23?"

Ellaz replied: "I can tell you that 2, plus 23, is equal to 25"

I hope your parser can recognise that as a valid answer, otherwise you'll have trouble with humans failing the test. Although for blog comments, excluding stupid or overly verbose humans may not be a bad idea; I just get the feeling some days I'd never get to comment on anyone's blog.

I thought maybe spice it up a little:

Simon: "What is the square root of -1?"
Ellaz: "Hey Hey! You cannot take the square root of a negative number. That gives an imaginary number, and I don't go there."

(Spot the canned response.) Shucks.

Unfortunately the Ellaz bot isn't terribly good at non-maths questions, but I think it makes the point well enough.

The reason no one defeated your text captcha was probably because no one tried, but that won't remain the case if it gets popular. We are locked in another arms race here. At the moment greylisting kills most of your email spam, and any captcha (even ones for which solver programs exist that score better than humans) will kill most of your blog spam, but don't expect them to last as a defence, just as greylisting is slowly crumbling.

The real solution is to break the monoculture, and have more security at the leaf nodes, but someone already started that thread (again).

Although possibly the mistake is to assume you can distinguish between humans and computers on the basis of intelligence. It isn't reliably possible to do this yet, but give it a few years and you'll know that if a site asks for all the integer solutions of a given quintic equation, it is probably not that interested in comments from apes, except perhaps the most exceptional apes.
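On the point about a parser having to accept verbose replies: a minimal sketch of a lenient checker, assuming the rule "the last whole number in the reply is the answer" and non-negative integer answers only. The function name `is_correct` and that rule are assumptions for illustration, not how any particular blog platform does it.

```python
import re


def is_correct(reply, expected):
    """Accept a free-text reply if the last whole number in it equals expected.

    Assumption: answers are non-negative integers written in digits, and a
    verbose reply states its result last.
    """
    numbers = re.findall(r"\d+", reply)
    return bool(numbers) and int(numbers[-1]) == expected


# The reply quoted above passes, as does a bare number.
print(is_correct("I can tell you that 2, plus 23, is equal to 25", 25))
print(is_correct("25", 25))
```

A checker this forgiving accepts the chat bot's sentence just as readily as a human's, and a reply spelled out in words ("twenty-five") would still fail, so the leniency helps usability without making the test any harder for machines.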
On Wed, Aug 16, 2006 at 09:21:06AM +0100, Simon Waters wrote:
The reason people use image recognition is it is something (most) humans find very easy, but requires considerable investment of effort (or resource for self training) to teach computers, and readily permits of variations ('click the kitten' being a good example).
How many CAPTCHA tests can a human making minimum wage complete in an hour? Ask the post office people who input handwritten zipcodes.

A tougher question might be, what does any of this have to do with NANOG?

--
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
On Wed, 16 Aug 2006, Simon Waters wrote:
You snipped the bit where I said "It would work for a minority use."
Sorry, I don't think that is relevant really - at least I have no data on what minority uses are for captchas, nor majority uses, or what the difference is.
The reason people use image recognition is it is something (most) humans find very easy, but requires considerable investment of effort (or resource for self training) to teach computers, and readily permits of variations ('click the kitten' being a good example).
Those need vast numbers of "kitten" pictures in order to be immune to dictionary attacks. There's a reason 'captchas' consist of auto-generated images of letters.

You can auto-generate questions too, obviously, with dictionaries of question/answer tuples associated with some template question language. The tuples can be auto-generated; the strength lies in the variety of the question forms in use across the internet and/or across a site.

The questions need not use language; they could be based on ASCII pattern matching, e.g.:

oAwoZwoLwoC

what's the next letter, etc.

Or you could simply test people on their ability to google perhaps? :)
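A rough sketch of what such auto-generation could look like, covering both a templated arithmetic question and the ASCII pattern form. The template list, the function names and the number ranges are all assumptions made up for this example; a real deployment would want far more question forms than this.

```python
import random
import string

# Hypothetical question templates: each pairs a question form with a function
# that computes the answer from the two drawn operands.
TEMPLATES = [
    ("What is {a} added to {b}?", lambda a, b: a + b),
    ("What is the sum of {a} and {b}?", lambda a, b: a + b),
    ("What is {a} multiplied by {b}?", lambda a, b: a * b),
]


def arithmetic_question(rng):
    """Return a (question, answer) tuple built from a random template."""
    template, answer = rng.choice(TEMPLATES)
    a, b = rng.randint(1, 50), rng.randint(1, 50)
    return template.format(a=a, b=b), str(answer(a, b))


def pattern_question(rng, repeats=4):
    """Return a pattern like 'oAwoZwoLwoC' and the letter that comes next.

    Two distinct lowercase letters frame a random capital in each group;
    the string stops after a capital, so the next letter is the trailing one.
    """
    lead, trail = rng.sample(string.ascii_lowercase, 2)
    capitals = [rng.choice(string.ascii_uppercase) for _ in range(repeats)]
    body = trail.join(lead + capital for capital in capitals)
    return body + " what's the next letter?", trail


rng = random.Random()
print(arithmetic_question(rng))
print(pattern_question(rng))
```

Only the question text would be sent to the browser; the answer half of each tuple stays on the server to be compared against the reply.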
For a demonstration of bashing at ASCII captchas try any good chat bot.
And for image captchas, see: http://www.cs.sfu.ca/~mori/research/gimpy/ and there are more. CAPTCHAs are, almost by definition, compelling problems for academia to tackle ;).
The reason no one defeated your text captcha was probably because no one tried, but that won't remain the case if it gets popular. We are locked in another arms race here.
Yes, that applies regardless of the form of the captcha.
Although possibly the mistake is to assume you can distinguish between humans, and computers on the basis of intelligence.
Maybe so.

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune: The meat is rotten, but the booze is holding out. Computer translation of "The spirit is willing, but the flesh is weak."
participants (4)
- Matthew Sullivan
- Paul Jakma
- Richard A Steenbergen
- Simon Waters