
Teaching programming help

Name: Anonymous 2006-09-18 11:08

I'm going to teach programming to somebody who's going to study it more formally in a year, but wants to learn part of it in the meantime.

What SIMPLE, CLEAN, STRUCTURED language would you recommend I use for teaching? Please refrain from language wars, as this is not a fanboy thread but a serious question. Before you mention it, I'm not going to start with either Python or Ruby because they're too complex and get too much in the way, and no, I'm not stupid enough to start with Java because a radical OO language (and a crappy one at that) with a shitty enterprise API is not the best either.

I'm thinking Pascal. As much as it sucks, it has strict/anal types (it's better to start anal than to start easy-going and botch it), simple yet not messy syntax, simple stdin/stdout input and output to play with (that's all I'll need), and none of the complexity of OO. Yet it sounds so useless. But I don't know of other languages that meet these requirements.

Name: Anonymous 2006-09-20 3:02

Erm, here's another overwhelming reason why char* C strings are completely useless today (I don't care about thirty years ago): Unicode.

There are wider character types that can be used, such as wchar_t, although I'm sure you knew this.

I don't know how this thread went so wrong. We went from "oh try this language here!" to "C sucks chocolate salty balls" >:[

Name: Anonymous 2006-09-20 7:44

>>37
Oh, of course. Note that, even though I was talking about C strings, I believe I was differentiating characters from bytes in my proposal and said the maximum number of characters is (length >> 1) - 1 or something like that, using wchar_t; too lazy to check. (This is also an approximation, as we're using UTF-16 and not taking surrogate pairs into consideration.)

There's a somewhat vague implementation of Unicode in C99 with wchar_t and the wcs* series of functions, even if it's so poorly and vaguely defined in the standard that it's not too clear what the heck it does. The C standard is really shitty there; instead of fully adopting and embracing Unicode UTF-16 and including a proper library and Unicode metainformation, they came up with vague "wide characters" and multibyte shit.
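
For reference, a minimal sketch of what the wcs* side looks like in practice. The helper names are made up for illustration; only wcslen() and wchar_t come from the standard library, and the width of wchar_t is whatever the implementation decides.

```c
#include <stddef.h>
#include <wchar.h>

/* Length in wide characters, not bytes. wcslen() is the wide
   counterpart of strlen() and stops at the first L'\0'. */
static size_t wide_chars(const wchar_t *s)
{
    return wcslen(s);
}

/* Bytes needed to store the string including its terminator. The size
   of wchar_t is implementation-defined (commonly 2 or 4 bytes). */
static size_t wide_storage_bytes(const wchar_t *s)
{
    return (wcslen(s) + 1) * sizeof(wchar_t);
}
```

The character count is the same everywhere, but the byte count depends on the platform, which is part of why the standard's wording feels vague.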


>>38-40
Thanks for the useful comments, I'll consider all you mentioned. Except Java :p . As for electronics, I won't get that far, but I sure will attempt to give him a rough idea of how it's all done with logic gates, for example with the logic for a simple 4 bit register.

Name: Anonymous 2006-09-20 11:11

>>30
Do you have a reading impediment there? I think I made myself pretty clear. Some architectures allow you to set condition flags based on the value loaded and its signed comparison with zero. Therefore, one fewer compare given an architecture where you can do shit like this.

Let's have a concrete example, shall we? The Motorola 68000 instruction set architecture, though I can't remember whether MOVE.B sets condition codes or not.

loop:
move.b (a0)+, d0
move.b d0, (a1)+
bne loop

That's a 3-instruction strcpy right there. Copies the terminator, too. Contrast with the length-prefixed form:

loop:
move.b (a0)+, d0
move.b d0, (a1)+
subq.l #1, d1
bne loop

This assumes that SUBQ.L sets condition codes too, and there's no reason why it wouldn't; this being the 68k architecture after all, and back then minimizing the instruction count was a big deal. So in this example, the length-prefixed version takes one more instruction per character than the zero-terminated version; that's one third more instructions total. (If the 68k supported memory operands on both sides of MOVE.B, it'd be 2 and 3 instructions respectively.)
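
Roughly the same two loops in C, as a sketch only. The function names are hypothetical, and unlike the assembly above the counted version tests the count before the first copy, so a length of zero copies nothing.

```c
#include <stddef.h>

/* Zero-terminated copy: the byte just copied doubles as the loop
   condition, which is what the first 68k loop exploits. Copies the
   terminator too. No bounds check: the caller must guarantee that
   dst is large enough. */
static void copy_zero_terminated(char *dst, const char *src)
{
    while ((*dst++ = *src++) != '\0')
        ;
}

/* Counted copy: a separate counter is decremented and tested on every
   iteration, mirroring the extra subq.l in the second loop. */
static void copy_counted(char *dst, const char *src, size_t len)
{
    while (len != 0) {
        *dst++ = *src++;
        len--;
    }
}
```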

Name: Anonymous 2006-09-20 11:12

>>42
Avoiding "technicality" in a programming course is something you most certainly don't want to do. I mean, that's heavy gloves territory right there.

Name: Anonymous 2006-09-20 11:28 (sage)

>>43
Whoop, seems I suck balls today. The 68k did support a byte move with memory operands on both sides. So that makes for a 2-instruction strcpy().

Likewise, on an ARM, I suppose a proper strcpy() would be something like:

loop:
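ldrb r0, [r1], #1
strb r0, [r2], #1
cmp r0, #0
bne loop

This uses post-indexed addressing so the first byte isn't skipped. As far as I know LDRB has no form that sets condition codes based on the value loaded (the S in LDRSB means sign-extend), so the explicit CMP is needed; my ARMish is a little rusty.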

Name: Anonymous 2006-09-20 11:34

>>36
Oh boo hoo. Your toy language background is obvious from the way you whine, and your failure to remember that there's a whole fucking family of standard routines in C stdlib for moving around blocks of memory. Also known as buffers. Also known as shit that's binary safe. I mean, fuck, people, learn the language you're using for crying out loud!
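
A minimal sketch of the "binary safe" point. The wrapper names are hypothetical; memcpy() and memcmp() are the real library calls, and both take an explicit length instead of stopping at '\0'.

```c
#include <stddef.h>
#include <string.h>

/* Copy a buffer that may contain embedded zero bytes. memcpy() is
   told the length, so it does not stop at '\0' the way strcpy() does. */
static void copy_block(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Compare two buffers byte for byte, again ignoring any '\0' inside. */
static int blocks_equal(const void *a, const void *b, size_t len)
{
    return memcmp(a, b, len) == 0;
}
```

With a 5-byte buffer such as 'a', 0, 'b', 0, 'c', strlen() would report 1 while the mem* calls handle all five bytes.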

Oh, and
>>31
This should've ended the thread already.

Name: Anonymous 2006-09-20 11:52 (sage)

> That's a 3-instruction strcpy right there.
More like a one way ticket to BUFFER OVERFLOW HELL. Jesus H. Christ.

Name: Anonymous 2006-09-20 12:31

Well, strcpy is Buffer Overflow Incarnate by definition. It's useless, except perhaps to copy static strings.

Name: Anonymous 2006-09-20 12:56 (sage)

DINGDING! Programming C requires care. Film at 11!

Name: Anonymous 2006-09-20 15:47

>>48
Not if it's Pascal-style strings. Gawd.

Name: Anonymous 2006-09-20 16:12

>>50
Because buffers for Pascal strings are always of unlimited size amirite?

Gawd, you faggot.

Name: Anonymous 2006-09-20 16:19 (sage)

>>51
No, because checking whether the string fits the buffer is O(1) and is therefore always done.
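
A sketch of that O(1) check, assuming a hypothetical length-prefixed type that stores both the length in use and the buffer capacity (none of these names come from any real library):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed string: length and capacity sit next
   to the data, so neither requires scanning the contents. */
struct pstring {
    size_t len;   /* bytes currently in use */
    size_t cap;   /* bytes available in data */
    char  *data;
};

/* Copy src into dst only if it fits. The fit test is one integer
   comparison; the copy itself is still O(n).
   Returns 0 on success, -1 if the copy would overflow dst. */
static int pstring_copy(struct pstring *dst, const struct pstring *src)
{
    if (src->len > dst->cap)
        return -1;
    memcpy(dst->data, src->data, src->len);
    dst->len = src->len;
    return 0;
}
```

The zero-terminated equivalent has to run strlen() over the source first, or check the bound on every byte while copying.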

Name: Anonymous 2006-09-20 17:37

>>46
Newsflash: I've been writing software in C for the past decade. But don't let that get in the way of your religion.

I can see what the language's shortcomings are. Only a complete dolt can't see that null-terminated strings are an ugly relic of a bygone era. Only someone who hasn't worked with C a great deal thinks it's the greatest thing since sliced bread. It has problems, C-style strings being only one, got that?

What you missed is that there's a whole duplication of functionality going on because of a single corner-case (\0). Is that stupid or what?

Now, we've listed all the reasons why null-terminated strings are stupid. Are you going to come up with a few compelling reasons why they're not, or are you going to childishly call names again?

Name: Anonymous 2006-09-20 17:39

>>46
You call that "family of functions"? Lol.

> This should've ended the thread already.
If this board had been implemented with C strings, threads would end when some buffer overflows.

Name: Anonymous 2006-09-20 17:49

>>43
> The Motorola 68000
lol
lololol

Okay, let's play archaic:

loop:
  move.b (a0)+, (a1)+
  dbra d0, loop

But wait! There's more! Since we don't need to check what each character is, we can use a move.l! And we haven't even tried tricks with movem yet! Oh nooooooes!

Long story short: you know shit.

Name: Anonymous 2006-09-20 18:46 (sage)

>>55
This is turning out to be an entertaining thread after all. Perhaps a holy war, even. In the red corner, length-prefixed strings! In the blue corner, zero-terminated strings! Match refereed by Haskell strings, where each character is a node in a singly-linked list.

Seriously guys, the only real difference between length-prefixed and zero-terminated is that length-prefixed wastes a byte or three of memory in the common case, and zero-terminated has an O(n) strlen(). There, we happy now?

>>54
Well, if you have a better name for a group of functions that have to do with strings and whose names start with "str", please to be informing us.

Name: Anonymous 2006-09-20 18:49

>>35

Yes, I have. Look up the "REPNE" prefix. It will instruct the following MOVS* instruction to repeat only while it's moving non-zero data. So to do a C-string copy, set ECX to 0xFFFFFFFF, DS:ESI to your source and ES:EDI to your destination, and fire off a REPNE MOVSB.

Name: Anonymous 2006-09-20 19:06

>>53
Well well, status dropping. I've been using C from 1995 onward on a hobby basis (i.e. demo programming and such).

As for your challenge, let's start with the obvious: zero-terminated strings are a nice thing to have, because a pointer to the middle of a zero-terminated string is exactly as good a string as any other. That is to say, O(1) head, tail and drop operations (where "tail" is obviously just "drop 1") though you still need to manage memory on your own. Arbitrary slicing that doesn't collapse to head, tail or drop is still O(n) where n is the length of the slice. Under some circumstances (parsers come to mind), this is not only rather intuitive but also relieves the programmer from having to create and then destroy a slice copy of the string being processed.
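
A minimal sketch of what that looks like in C. The helper names are made up for illustration, and nothing here checks that n stays within the string.

```c
#include <string.h>

/* "drop n": skip the first n characters of a zero-terminated string.
   The result points into the same storage, so there is no copy and the
   cost is O(1). The caller must ensure n <= strlen(s). */
static const char *drop(const char *s, size_t n)
{
    return s + n;
}

/* "tail" is just "drop 1". */
static const char *tail(const char *s)
{
    return drop(s, 1);
}
```

So drop("key=value", 4) is the string "value", usable with every str* function as-is.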

Another, just because I'm feeling like it: it's trivially simple to write a routine that takes character data read from a file, replaces all instances of the line feed with a '\0' and stores the appropriate pointers in an array, thus making for a line splitting function which allocates no extra memory for the resulting lines.
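
Something along these lines, as a sketch. The function name and the max_lines parameter are my own additions, and the caller has to supply a writable, zero-terminated buffer.

```c
#include <stddef.h>

/* Split buf in place: every '\n' becomes '\0' and lines[] receives a
   pointer to the start of each line. No memory is allocated for the
   lines themselves. Returns the number of lines stored, which is at
   most max_lines. */
static size_t split_lines(char *buf, char **lines, size_t max_lines)
{
    size_t count = 0;
    char *start = buf;

    for (char *p = buf; ; p++) {
        if (*p == '\n' || *p == '\0') {
            int at_end = (*p == '\0');
            /* Record the line unless it is an empty one at the very end. */
            if (p != start || !at_end) {
                if (count == max_lines)
                    break;
                lines[count++] = start;
            }
            if (at_end)
                break;
            *p = '\0';          /* terminate this line in place */
            start = p + 1;
        }
    }
    return count;
}
```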

I'd rather like to hear what the matching idioms would be if strings in C were length-prefixed rather than zero-terminated. Both of these examples assume that no garbage collection is being used; I think this is reasonable given that neither C-the-language nor C-the-library requires or specifies any such thing.

Name: Anonymous 2006-09-20 19:09 (sage)

>>57
Oh holy fuck. I forgot just how weird the processor architectures got in the seventies and eighties, when they started making them explicitly for running C programs.

And that'd work for length-prefixed as well, right? Just set ECX to the number of bytes you want to copy and use a REP prefix instead of REPNE?

Name: Anonymous 2006-09-20 19:40

>>59
Yes, that's what I meant earlier:

mov ecx, length
rep movsb

Or get fancy and integer divide the length by four and use a movsd. You probably don't need a movsb on the remainder unless you're not aligning things.
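
The divide-by-four idea, sketched in C rather than assembly. The function name is hypothetical, and this version always handles the remainder, so it does not rely on the length being a multiple of four.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy len bytes as len / 4 32-bit words followed by len % 4 single
   bytes. The small memcpy() calls keep the word accesses legal even
   when src or dst is not 4-byte aligned. */
static void copy_by_words(unsigned char *dst, const unsigned char *src,
                          size_t len)
{
    size_t words = len / 4;
    size_t rest = len % 4;

    for (size_t i = 0; i < words; i++) {
        uint32_t w;
        memcpy(&w, src + 4 * i, sizeof w);
        memcpy(dst + 4 * i, &w, sizeof w);
    }
    for (size_t i = 0; i < rest; i++)
        dst[4 * words + i] = src[4 * words + i];
}
```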

BTW, I do not know of a repne movs. There is a repne scas or repne cmps though, neither of which does what we're after. You're free to look at it here on page 404 (434 in the PDF): http://www.intel.com/design/pentium/manuals/24319101.pdf

Name: Anonymous 2006-09-20 19:55

>>58
Hardly status dropping, mate. I was merely making obvious that my background isn't a "toy" language. It's C.

> because a pointer to the middle of a zero-terminated string is exactly as good a string as any other.
A valid argument. However, if the language supports length-prefixed strings, this doesn't hold. Have a look at how D does it (ignore the C++ bit): http://www.digitalmars.com/d/cppstrings.html

Just because you can do some things nicely doesn't mean it's a good general policy. Hark back to the ever-popular example of goto.

Name: Anonymous 2006-09-20 20:14

I'd vote for scheme or tcl.

Name: Anonymous 2006-09-20 20:23

>>60

You're right, I fail it at recognising what instructions REPNE can prefix.

Name: Anonymous 2006-09-20 20:30

>>58
A line-splitting function with length-prefixed strings would be pretty much the same thing.

struct {
    int len;
    char *str;
};

The struct is what a length-prefixed string looks like. So just pointers to the beginning of the lines, plus the length of each line, are enough. The extra memory is sizeof(int) times the number of lines. You can have a pointer to the middle of the string and you can slice in O(1) oh shi-
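
A sketch of that idea in C, assuming a hypothetical slice type in the spirit of the struct above (the type and function names are invented for illustration). Nothing is written to the text, so it also works on read-only data.

```c
#include <stddef.h>

/* Pointer plus length; the data is not zero-terminated. */
struct slice {
    size_t      len;
    const char *str;
};

/* O(1) sub-slice: no copy, no scan.
   The caller must ensure start + len <= s.len. */
static struct slice slice_of(struct slice s, size_t start, size_t len)
{
    struct slice out = { len, s.str + start };
    return out;
}

/* Split text into line slices without modifying it. Returns the
   number of slices stored, which is at most max_lines. */
static size_t split_slices(struct slice text, struct slice *lines,
                           size_t max_lines)
{
    size_t count = 0, start = 0;

    for (size_t i = 0; i <= text.len; i++) {
        if (i == text.len || text.str[i] == '\n') {
            if (i == text.len && i == start)
                break;          /* no empty line after a final '\n' */
            if (count == max_lines)
                break;
            lines[count++] = slice_of(text, start, i - start);
            start = i + 1;
        }
    }
    return count;
}
```

The catch is that none of these slices can be handed to a str* function without copying and terminating them first.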

Name: Anonymous 2006-09-20 20:45

>>64
The only problem with that is C doesn't support it natively. Having a standard external pointer to the middle of a string won't work, because the string isn't null-terminated. How does any function using that external pointer know where the end of the string is?

In order to pull that off in C, you'd need to use the structs and explicitly calculate the lengths when initializing them. The result is either additional cruft every time you need to initialize a new struct, or more function calls.

D is an example of a language that supports it natively. I really recommend any C types out there look at that link.

Name: Anonymous 2006-09-20 21:49

>>52
That's still irrelevant in practice.

Name: Anonymous 2006-09-20 22:30

>>66
How so?

Name: Anonymous 2006-09-21 12:23

>>67
Because string operations are rarely performance bottlenecks.

Name: Anonymous 2006-09-21 13:44

>>64
This is O(1) only for slices that are not written into however, and isn't practical in plain old C for the reasons >>65 gives. For a pristine, fresh runtime, I don't see much of a problem, but you'll want to note that even D puts an implicit zero at the end of char[] -type strings for compatibility with C libraries. (Plus, D is garbage collected and thus not suitable for a number of contexts for which C is.)

>>61
What, D has a string type now? Last I remember, the D people were dead set against doing that, arguing that it was better to check every write through a char[] (combined with static analysis for e.g. loops and such) and fake copy-on-write semantics that way. Anyway, if anything this reinforces my original point, namely that strings are collections of characters rather than collections of arbitrary bytes. This is further driven home by various languages' (excluding Python, apparently) transition to a Unicode encoding, which makes a bytes-as-characters interpretation tenuous at best. It also makes claims that C strings are of bytes rather than characters strange, even given that the usual C idiom for an unsigned 8-bit quantity is "unsigned char".

Is this sufficient common ground, or would you like to go further? :-)

Name: Anonymous 2006-09-21 15:07

Python. Syntax is easy and it can be used to represent complexity in very simple ways.

Pascal is not a real programming language, it's a "trainer" language. A screwed up version of academic pseudocode.

Name: Anonymous 2006-09-21 15:19

10 REM A SIMPLE CALCULATION PROGRAM
20 PRINT TAB(20) "HI I'M A CALCULATOR."
30 PRINT : PRINT : PRINT
40 PRINT "FIRST NUMBER"
50 INPUT FIRST
60 PRINT
70 PRINT "ADDED TO..."
80 INPUT SECOND
90 RESULT = FIRST + SECOND
100 PRINT : PRINT
110 PRINT "RESULT IS " RESULT
120 END




Why not? :)

Name: Anonymous 2006-09-21 15:31

10 HELL
20 GOTO 10

Name: Anonymous 2006-09-21 15:40 (sage)

FREEBASIC AND STFU ALREADY.

Name: Anonymous 2006-09-21 16:35

>>72
Fail.

10 SIN
20 GOTO HELL

Name: Anonymous 2006-09-22 0:56

>>69
> (Plus, D is garbage collected and thus not suitable for a number of contexts for which C is.)
Yes, that is true. I wasn't presenting D as an alternative though, but rather providing a real-world example of Pascal-style strings in a C-style language. The fact that D is garbage-collected is (nearly) orthogonal.

> Anyway, if anything this reinforces my original point, namely that strings are collections of characters rather than collections of arbitrary bytes.
After thinking this over for a few minutes, I'm inclined to agree. Even so, I don't think using null-termination with any collection of characters is a good idea. For example, some codepoints in various UTF encodings use \x00.
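
A small illustration of the \x00 problem, assuming UTF-16LE as the encoding. The array and function names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* "Hi" encoded as UTF-16LE: each ASCII character becomes two bytes,
   the second of which is 0x00. A two-byte terminator follows. */
static const char utf16le_hi[] = { 'H', 0x00, 'i', 0x00, 0x00, 0x00 };

/* What a byte-oriented, zero-terminated routine sees: it stops at the
   first zero byte, which is the high half of 'H'. */
static size_t bytes_seen_by_strlen(void)
{
    return strlen(utf16le_hi);
}

/* Length in 16-bit code units, found by scanning for a 16-bit zero
   instead of an 8-bit one. */
static size_t utf16_units(const char *p)
{
    size_t n = 0;
    while (p[2 * n] != 0 || p[2 * n + 1] != 0)
        n++;
    return n;
}
```

UTF-8 avoids this because only the NUL codepoint itself encodes to a zero byte, but UTF-16 and UTF-32 hit it on every ASCII character.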

Name: Anonymous 2006-09-22 2:52

C# maybe?  You can design a GUI in it, and it has C-like syntax.

Name: Anonymous 2006-09-22 2:56

C# IS THE DEVIL'S WORK

Name: Anonymous 2006-09-22 15:26

>>76
C# is C-like? FAIL!

Name: Anonymous 2006-09-22 16:22

>>76
Maybe, if by C-like you mean Caesarean section-like

Name: Anonymous 2006-09-22 17:54

>>76
Maybe, if by C-like you mean Java like
fix'd
