C# lost to PHP, and I lost my Job

Name: Anonymous 2014-02-08 12:07

Hey! It is me again.

My last post here was about breaking up with my GF and how a new .NET job would be much cooler than the former C/C++ one. So the first task my employer gave me was rewriting a legacy PHP integration layer in C#.

The assignment sounded easy, because C# is a compiled language with limited type inference. Alas, after writing the initial prototype, I found that C# fails to give me an efficient way to do something as simple as text data storage. Yes, after all the rewriting and optimization, the PHP version was still faster and took less memory than the C# one, which continuously spat out System.OutOfMemoryException. Other programmers at the workplace were enraged that I had lost C#'s honor to some PHP. But the reality is that C# loses to PHP when it comes to memory usage.

The PHP code consisted of a sender.php, which gets called repeatedly by a cron job to spawn new processes, and a lock.file, which these processes lock before accessing the DB. Another watchdog cron job moderates the number of these processes, killing unwanted ones so that they won't eat all resources. The system can sustain around 30 of these PHP processes at any single moment. Each sender.php process fetches 10 MB of UTF-8 messages from an SQL DB, converts them into x-www-form-urlencoded, compresses them through libbz2 and sends the resulting stream as an HTTP POST request to the destination. After the transfer is complete, the PHP process gets killed, freeing all its resources.

In my C# version I copied PHP's behavior exactly: 30 threads taking messages in parallel and establishing 30 connections to the destination. Yet the C# version was taking 100 MB per thread, while PHP took only 20 MB. The problem unfolded as follows:
1. Just like PHP, each C# thread fetched 10 MB of UTF-8 messages.
2. Yet unlike PHP, C# immediately converts the fetched UTF-8 bytes into UTF-16 (string[]'s internal representation), allocating another 20 MB and giving a total of 30 MB.
3. urlencode converts them further, in some cases tripling the size due to the %DE%AD look, and adds 40 MB due to the need for a StringBuilder to generate PHP's &hash_map[key0][key1]=fuck+you arrays.
4. On top of that, an inefficient bzip2 implementation kicks in (PHP's is written in C/C++).
5. All of C#'s threads run in a single process, so nothing is freed after a single thread finishes. This really stresses memory, compared to PHP's shared-nothing architecture. And NO, the garbage collector doesn't free memory until it is too late and System.OutOfMemoryException is generated. Even explicitly forcing a GC won't help.
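
The size growth in steps 1 to 3 can be checked on a small sample. Below is a rough Python sketch, not the original C# or PHP code; the function name stage_sizes is made up for illustration, and it assumes the urlencoded result is again held as a UTF-16 string, as it would be in .NET.

```python
from urllib.parse import quote_plus


def stage_sizes(text: str) -> dict:
    """Report how many bytes the same text needs at each stage."""
    utf8 = text.encode("utf-8")          # what comes out of the DB
    utf16 = text.encode("utf-16-le")     # what a UTF-16 string holds
    encoded = quote_plus(text)           # percent-encodes the UTF-8 bytes
    return {
        "utf8_bytes": len(utf8),
        "utf16_bytes": len(utf16),
        "urlencoded_chars": len(encoded),
        # assumption: the encoded text is stored as UTF-16 again
        "urlencoded_as_utf16_bytes": 2 * len(encoded),
    }


if __name__ == "__main__":
    for sample in ("hello world", "привет"):
        print(sample, stage_sizes(sample))
```

For plain ASCII the UTF-16 copy is twice the UTF-8 size, which matches the 10 MB to 20 MB jump in step 2. For non-ASCII text every UTF-8 byte becomes a three-character %XX group, which is where the tripling in step 3 comes from.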

All in all, I lost to PHP and lost my job, because the team lead got really aggravated by my failure and by my having the nerve to defend PHP's approach to memory management. I'm in search of a new job, my dear /prog/. Hope this time it will be at least Clojure, because it uses UTF-8 to store strings and Java's GC doesn't produce System.OutOfMemoryException out of the blue.

http://www.youtube.com/watch?v=oxL_xY0Tm2w

Name: Anonymous 2014-02-08 13:45

fuck you
you left and /prog/ was invaded with all kind of faggots communists mongrel gay columbian gender confused people

Name: Anonymous 2014-02-08 14:15

You should've used Forth, nigger.

Name: Anonymous 2014-02-08 14:36

                   `
>russians
>white

Name: Anonymous 2014-02-09 9:13

C# supports streaming for all these types of things - you can stream data from the database, into BZip2, into a WebRequest. I am sure that would be much more efficient. Not that it matters now.
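
To show what the streaming idea looks like, here is a minimal Python sketch rather than C#. The names compress_stream and fake_rows are hypothetical; fake_rows stands in for a database cursor that yields already-urlencoded rows. The point is that only one chunk plus the compressor's internal buffer is in memory at a time.

```python
import bz2
from typing import Iterable, Iterator


def compress_stream(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Compress chunks incrementally, yielding output as it is produced."""
    compressor = bz2.BZ2Compressor()
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:                  # the compressor may buffer and return b""
            yield out
    tail = compressor.flush()    # emit whatever is still buffered
    if tail:
        yield tail


def fake_rows(n: int) -> Iterator[bytes]:
    """Hypothetical stand-in for rows streamed from the DB."""
    for i in range(n):
        yield f"msg[{i}]=hello+world&".encode("ascii")


if __name__ == "__main__":
    total = sum(len(part) for part in compress_stream(fake_rows(100_000)))
    print("compressed bytes:", total)
```

The generator returned by compress_stream could then be handed to an HTTP client that accepts an iterable as a chunked request body, so the full 10 MB never has to exist as one string.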

Name: Anonymous 2014-02-09 9:53

>>2
We all left and went to the other site. Why do you still come here? This place is controlled by the Jews.

Name: Anonymous 2014-02-09 11:02

>>5
Well, streaming is a sound idea. Yet, PHP still wins in memory usage.

Name: Anonymous 2014-02-09 16:32

>>7
It's worth mentioning that for some input, such as Chinese or Japanese text, UTF-8 will use 3 bytes where UTF-16 will use only two. But UTF-8 is indeed generally superior, given that UTF-16 no longer has the advantage of being simpler to process (due to surrogate pairs). Unfortunately Microsoft jumped on the Unicode bandwagon too early and now all of Windows and .NET is UTF-16.
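
The byte counts are easy to check. A quick Python sketch (encoded_sizes is just an illustrative name):

```python
def encoded_sizes(text: str) -> tuple:
    """Return (UTF-8 byte count, UTF-16 byte count) for the text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


if __name__ == "__main__":
    # ASCII, Japanese, and a character outside the BMP (surrogate pair)
    for sample in ("abc", "日本語", "\U0001F600"):
        print(repr(sample), encoded_sizes(sample))
```

ASCII doubles in UTF-16, Japanese text shrinks from 3 bytes to 2 per character, and a character outside the Basic Multilingual Plane takes 4 bytes in both encodings.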

PHP's urlencode also allocates a new string for the output, though, so there's nothing different going on there, other than the fact that the amount of memory being doubled is higher to start with.

It seems very odd that the GC wasn't freeing memory after it was no longer needed. Perhaps the strings were being kept alive later than necessary due to unintentional references? In any case, it probably wouldn't be a huge task to convert your code to use multiple processes rather than multiple threads.
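
As a rough illustration of the multi-process approach, here is a Python sketch (not C#) in which each batch is handled by a short-lived child process, so all of its memory goes back to the OS when it exits. CHILD_CODE and run_batch_in_child are hypothetical names, and the child here only counts bytes instead of doing the real encode, compress and send work.

```python
import subprocess
import sys

# Hypothetical worker: reads one batch from stdin, "processes" it, exits.
CHILD_CODE = """
import sys
payload = sys.stdin.buffer.read()
sys.stdout.write(str(len(payload)))
"""


def run_batch_in_child(payload: bytes) -> int:
    """Run one batch in a fresh interpreter; its memory dies with it."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=payload,
        capture_output=True,
        check=True,
    )
    return int(result.stdout.decode("ascii"))


if __name__ == "__main__":
    print(run_batch_in_child(b"x" * 1_000_000))
```

The trade-off is the startup cost and baseline footprint of every child, which matters more the heavier the runtime is.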

Name: Anonymous 2014-02-09 17:56

>>8
>Perhaps the strings were being kept alive later than necessary due to unintentional references?
Most GCs have problems with substrings and array slices. If you take a substring, the whole string stays alive. It can be cheaper to copy the substrings.
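
The same effect can be shown in Python with memoryview, used here only as an analogy for substring or slice sharing in other runtimes. A slice of a memoryview keeps the whole underlying buffer alive, while copying out the bytes does not. The names view_prefix and copy_prefix are made up for this sketch.

```python
def view_prefix(buf: bytes, n: int) -> memoryview:
    """Zero-copy slice: cheap, but it references the entire buffer."""
    return memoryview(buf)[:n]


def copy_prefix(buf: bytes, n: int) -> bytes:
    """Copying slice: costs n bytes, but lets the big buffer be freed."""
    return bytes(memoryview(buf)[:n])


if __name__ == "__main__":
    big = bytes(10_000_000)
    small_view = view_prefix(big, 10)
    small_copy = copy_prefix(big, 10)
    # The view still points at the 10 MB object; the copy does not.
    print(small_view.obj is big, len(small_copy))
```

As long as small_view is reachable, the 10 MB buffer cannot be collected even though only 10 bytes are in use.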

Name: Anonymous 2014-02-09 18:54

>>6
We all left and went to the other site. Why do you still come here? This place is controlled by the Jews.
We come here because we are Jewish, silly ;)

Name: Anonymous 2014-02-09 21:24

tl;dr  OP a shit, blames C#

Name: Anonymous 2014-02-09 23:28

>>8
>it probably wouldn't be a huge task to convert your code to use multiple processes rather than multiple threads.
It would, because the .NET runtime has an order of magnitude larger memory footprint than a primitive PHP interpreter.

>It seems very odd that the GC wasn't freeing memory after it was no longer needed. Perhaps the strings were being kept alive later than necessary due to unintentional references?
It is likely strings spill out of the first generation, so GC doesn't bother collecting them. See http://en.wikipedia.org/wiki/Thrashing_%28computer_science%29

Name: Clay 2014-02-10 22:25

Guys, I'm a complete moron and can't for the life of me figure this out.

It's SUPPOSED to be my own implementation of a power function in C.

I wanted to see if I could grasp the logic of it, but obviously I can't.

Would a kind stranger tell me what's wrong?

http://pastebin.com/0uGN7zFX

Name: Anonymous 2014-02-10 22:36

Were you fired for being unable to implement your own power function in C?

Name: Anonymous 2014-02-10 22:40

>>6
What other site?

Name: Anonymous 2014-02-10 22:43

>>15
/lounge/
We are all waiting for you there
