/prog/ - shouldn't captcha be numerical?

Name: Anonymous 2010-07-03 15:11

To be more internationally friendly.

Name: Anonymous 2010-07-03 15:42

No

Name: Anonymous 2010-07-03 15:44

Captchas are designed to be unfriendly. There are no computers that don't deal in Latin characters anyway.

Name: Anonymous 2010-07-03 16:14

Yes, let's make it hard for computers to guess by limiting the set of possible inputs.

Name: Anonymous 2010-07-03 16:15

Shouldn't captcha be hiragana?

あはごじゃうぃおぱいはうぇぬいあw

Name: Anonymous 2010-07-03 16:27

>>4
The size of the set of possible characters actually isn't a significant issue when it comes to breaking CAPTCHA. In fact, it isn't for most applications, unless you're brute-forcing.

Name: Anonymous 2010-07-03 17:06

>>6
How so? I'd have thought it would be easier to select an appropriate match to the captcha if you restricted it. Got a reference?

Name: Anonymous 2010-07-03 17:09

>>7
Why would it be easier? The only odds it would affect are the odds of getting it right when you're just entering characters at random.

Name: Anonymous 2010-07-03 17:10

>>7
When you try to guess a captcha and you're incorrect, it generates a new one. For a 6-character captcha, a 1 in 10^6 chance per random guess is strong enough, even when compared to 1 in 26^6.

The real issue is human vs. computer readability.
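
For anyone who wants the numbers spelled out, here is a minimal sketch of the random-guess odds. It assumes each character is drawn uniformly and independently, and that a wrong guess gets you a fresh captcha; `random_guess_odds` is just a name made up for this post.

```python
from fractions import Fraction


def random_guess_odds(alphabet_size: int, length: int = 6) -> Fraction:
    """Chance that one uniformly random guess matches a captcha of this length."""
    return Fraction(1, alphabet_size ** length)


digits = random_guess_odds(10)   # 0-9 only
letters = random_guess_odds(26)  # single-case Latin letters
alnum = random_guess_odds(62)    # digits plus both letter cases

for label, odds in (("digits", digits), ("letters", letters), ("alnum", alnum)):
    print(f"{label}: 1 in {odds.denominator:,}")
```

Even the digits-only case is one in a million per attempt, so blind guessing is not the attack that matters here.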

Name: Anonymous 2010-07-03 17:12

>>8
If you get an ambiguous character, you would have a better chance of selecting the correct one if you were restricted in your input.
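
A toy sketch of what I mean, not a real OCR: the scores below are invented for illustration, and `best_guess` is a made-up helper. The assumption is that the recognizer gives each candidate glyph a score and you pick the best one that the captcha's alphabet allows.

```python
import string


def best_guess(scores: dict[str, float], allowed: str) -> str:
    """Return the highest-scoring candidate that is in the allowed alphabet."""
    candidates = {ch: s for ch, s in scores.items() if ch in allowed}
    return max(candidates, key=candidates.get)


# Hypothetical scores for one distorted glyph whose true value is "1".
scores = {"1": 0.34, "l": 0.36, "I": 0.30}

full_alphabet = string.digits + string.ascii_letters
unrestricted = best_guess(scores, full_alphabet)   # "l" wins narrowly: wrong
digits_only = best_guess(scores, string.digits)    # only "1" is allowed: right

print(unrestricted, digits_only)
```

Whether this happens often enough to matter on a real captcha is a separate question.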

Name: Anonymous 2010-07-03 17:21

>>10
Not significantly, because of the way OCR works.
Also keep in mind that real-life CAPTCHAs have to be human-readable, so they don't tend to allow 1, l and I, for example. That kind of ambiguity is rarely a problem.

Name: Anonymous 2010-07-03 20:05

>>11
Quite significantly on any input that is hard enough to read that there is a chance of getting it wrong. There's a reason numeric forms are routinely read with machines now, while text OCR is still mostly limited to stuff like addresses with their limited domain, and large-scale search where errors do not matter much.

Name: Anonymous 2010-07-03 20:07

>>12
And that reason is not what you think it is, moron.

Name: Anonymous 2010-07-03 20:15

>>13
EXPERT POSTER.

His wisdom and grace ooze from the screen.

Name: Anonymous 2010-07-03 20:47

>>13
Very well, I will not expose that you do not know what you are talking about by bringing cold and cruel reality into this.
I will also concede that you may indeed be capable of writing an OCR that will be too inept to benefit from reducing the number of solutions to a fourth, nut-spit.

Name: Anonymous 2010-07-03 20:56

>>15
62 → 10 is reduction to a fourth now?
And honestly, >>11 is right and >>12 is a non-sequitur. The difference in the examples given isn't the difference between a domain of 10 and a domain of (at least) 62, but the difference in the amount of quality control exerted over the input. If you can print the text to be scanned yourself, OCR can be very close to 100% successful regardless of the size of the domain.
Also,

> nut-spit
Polecat kebabs.

Name: Anonymous 2010-07-03 21:31

>>16
> 62 → 10 is reduction to a fourth now?
Even if you were to use a case-sensitive captcha, which is beyond the literary capacity of approximately half of all internet users, it still would not be exactly 62 due to the various lookalikes, an example of which has already been given.
> If you can print the text to be scanned yourself, OCR can be very close to 100% successful regardless of the size of the domain.
A situation almost directly opposed to the one in actual discussion, the captcha, where the text is specifically meant to be resistant to machine vision techniques. If you are getting next to no classification errors on a modern captcha, the implementor probably went to the same special school that you did.
garbage
Oh dear.

Name: Anonymous 2010-07-03 21:39

>>17
> A situation almost directly opposed to the one in actual discussion
Yes, which makes >>12's comment even more beside the point. The fact of the matter is that the size of the input domain does not significantly impact the difficulty of successfully OCRing text. What does impact it is glyph similarity, which you agree is not an issue with real-world CAPTCHA.

> the implementor probably went to the same special school that you did.
At least I can actually follow a thread of discussion.

Name: Anonymous 2010-07-03 22:21

>>18
Sure, so let us ask the reCAPTCHA guys if there are enough unclassifiable numbers left on earth for a separate numeric captcha. After all, why keep good entertainment to myself?
> the size of the input [sic] domain does not significantly impact the difficulty [...] What does [...] is glyph similarity
Like there is no relation. But no. If your only task was binary classification between e.g. '1' and 'l' you could probably do quite well at it.
> At least I can actually follow a thread of discussion.
A thread of discussion, perhaps. I would not trust you with this particular one.

Name: Anonymous 2010-07-03 22:45

>>19
> [sic]
Oh, so you don't even speak English? No wonder you're having trouble.

Name: Anonymous 2010-07-03 23:22

>>20
I'm actually reasonably good at English, but I'm sure I'm no match for you, as I didn't major in it.
