So, I'm working through the book "The C Programming Language" when this code comes up in the chapter on character arrays:
#include <stdio.h>
/* count digits, white space, others */
int main(void)
{
    int c, i, nwhite, nother;
    int ndigit[10];
    nwhite = nother = 0;
    for (i = 0; i < 10; ++i)
        ndigit[i] = 0;
    while ((c = getchar()) != EOF)
        if (c >= '0' && c <= '9')
            ++ndigit[c - '0'];
        else if (c == ' ' || c == '\n' || c == '\t')
            ++nwhite;
        else
            ++nother;
    printf("digits =");
    for (i = 0; i < 10; ++i)
        printf(" %d", ndigit[i]);
    printf(", white space = %d, other = %d\n", nwhite, nother);
    getchar();
    return 0;
}
It all makes sense, except for the "++ndigit[c - '0'];" statement. I don't understand what this is doing, or why it has to do it. Shouldn't the program run the same if you didn't add the " - '0'" part? When I take that part out, the program still compiles, but it prints 0 for every digit count, even when digits are present in the input.
So what exactly does subtracting '0' do for this program?
Name: Anonymous 2012-09-11 14:38
'0' does not refer to the number 0, but to the character '0', whose ASCII code is 48. Characters and integers are more or less interchangeable in C (this is a simplification): a character constant such as '0' already has type int, so c - '0' is plain integer arithmetic. On an ASCII system, c - '0' is equivalent to c - 48.
c holds the ASCII code of a digit character.
'0' is the first digit character.
c - '0' yields a number from 0 to 9.
That number is used as an index into the ndigit array, which has 10 elements, one for each digit 0 to 9.
So if c is '6':
'6' - '0' = 6 (that is, 54 - 48 in ASCII).
Remember that characters are really just small numbers; ASCII codes run from 0 to 127.
++ndigit[6] increments the seventh element of the ndigit array (index 6) by one.
Name: Anonymous 2012-09-11 14:53
Thank you guys so much. This makes a lot of sense. I replaced the '0' with the number 48, and the program worked exactly the same.
Just so I'm getting this right:
c, when it is '0' through '9' (which would mean it has a value of 48 through 57), has 48 subtracted from it, leaving a raw integer to be used as the index into ndigit. So, rewriting that section of the code, without using the ASCII characters, would look like:
while ((c = getchar()) != EOF)
    if (c >= 48 && c <= 57)
        ++ndigit[c - 48];
And so when you type in a digit, it is actually as if you are typing in the numbers 48 or 49 or so on. So, for example, if you typed in 1, the variable c is set to 49. Then, in the next line, 48 is subtracted from 49, leaving an actual int of 1, so the statement becomes ++ndigit[1].
Is this right? Cause this explains so much.
Name: Anonymous 2012-09-11 14:59
Another question on the same code:
I have a hard time understanding getchar(). I looked something up that explains it well, but just so I understand:
When c = getchar() runs in the first while statement, this is where the code "stops", right? The code then waits for the user to input whatever they want, and when they press return, the code "continues", with c == the first character entered, then again for the second character and so on. Is this right? And this stops when the user enters EOF. And if EOF isn't entered, then c is "emptied" of its characters and the code "stops" at c = getchar() again. Right?
Name: Anonymous 2012-09-11 15:43
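More like the value of c is overwritten with the new value on every call. You have to be careful here, because if there is still data buffered in stdin when getchar() is called again, it returns the next buffered character immediately instead of waiting for new input, which can look like strange behaviour.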
Name: Anonymous 2012-09-11 20:34
Stupid question: wouldn't the ++i in those loops feed it 1..10 instead of 0..9? After all, that's "increment, then return the new value" rather than "use the value, then increment"?
Just saying...
Name: Anonymous 2012-09-11 21:29
>>7
No. With a for loop, the first expression (i = 0) is executed once, then the condition (i < 10) is evaluated, then the loop body is executed, then the last expression (++i) is executed. Then it goes back to evaluating the condition and starts again. The value of ++i itself is discarded, so ++i and i++ behave the same here, and the body sees 0..9.
The commas in >>11 happen to be in places that would make sense for comma-delimited lists, but they have nothing to do with separators. & doesn't mean address-of. No, @ is not related to pointers either. Please learn another language besides seeples.
>>12
Undefined behavior is the price you pay to get portability and speed from one language. I'd rather deal with the gremlins in the standard than write a different assembly program for every machine in the universe.