XML

Name: GvR 2009-11-04 13:28

So I used xml.dom.minidom.parse on an 18MB XML file. The damn thing ate 900MB of RAM. I sort of expected this kind of disaster, but 50 fucking times more memory?

Now, I know jack shit about the in-memory representation this thing uses, but holy fucking shit, you could have an individual struct with 10 pointers for each character and it'd still be smaller (this is on 32 bit btw).
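
For anyone who wants to check the overhead on their own machine, here is a rough sketch using tracemalloc from the standard library. The helper names (build_xml, dom_overhead) and the synthetic document are made up for illustration, and the result will vary by Python version and platform, so treat it as a way to measure rather than a confirmation of the 50x figure.

```python
import tracemalloc
import xml.dom.minidom


def build_xml(n_items):
    # Hypothetical synthetic document: many small elements with one attribute.
    rows = "".join('<item id="%d">value %d</item>' % (i, i) for i in range(n_items))
    return "<root>" + rows + "</root>"


def dom_overhead(n_items=2000):
    """Return (source size in bytes, traced bytes held after parsing, element count)."""
    text = build_xml(n_items)
    tracemalloc.start()
    doc = xml.dom.minidom.parseString(text)
    # Python-level allocations still alive once the DOM has been built.
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    n_elements = len(doc.getElementsByTagName("item"))
    doc.unlink()  # break reference cycles so the tree can be freed promptly
    return len(text.encode("utf-8")), current, n_elements


if __name__ == "__main__":
    size, held, count = dom_overhead()
    print("source bytes:", size)
    print("bytes held by DOM:", held)
    print("elements:", count)
```

Every element becomes a Python object with its own attribute map and child text node, which is where most of the per-node cost comes from.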

Name: Anonymous 2009-11-04 13:31

Try another parser from the hundreds floating around the internet? Make your own? Or just give up and let it use the RAM, it's not like you need it for anything, right?

Name: Anonymous 2009-11-04 13:34

>>2
The RAM usage is not a problem in this particular case, I'm not asking for solutions. I just think it's disgusting.

Name: Anonymous 2009-11-04 14:54

Try a SAX parser? (Get your minds out of the gutter, /prog/; that's SAX, not SEXPR.) It'll require complicated mutable state, but I've recently had to switch from the enterprise javax.xml.parsers.DocumentBuilderFactory to a SAX parser because of not having enough heap space to parse and load data from a 291 MB XML file.
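
The "complicated mutable state" looks roughly like this in Python's xml.sax, which is an assumption on my part since the post above is about Java. The handler class, the element name "title", and collect_titles are hypothetical names for illustration only.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element without building a tree."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False  # mutable state: are we inside a <title>?
        self._buf = []          # text chunks of the current <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buf = []

    def characters(self, content):
        # The parser may deliver one text node in several chunks.
        if self._in_title:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buf))
            self._in_title = False


def collect_titles(xml_bytes):
    handler = TitleCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.titles
```

Memory use stays bounded by whatever the handler chooses to keep, not by the size of the document.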

Name: Anonymous 2009-11-04 15:02

PARSE MY ANUS

Name: Anonymous 2009-11-04 15:09

Your problem is that you're trying to represent the whole XML file in memory at once. You should try lazily consuming it, so that only what you need is being parsed and represented at a time, and older stuff gets garbage collected. I'm not sure which parser would be best for this, since I don't have much experience with them.
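
One way to do the lazy consumption described above, sketched with xml.etree.ElementTree.iterparse from the standard library. The poster did not name a parser, so this choice is an assumption, as are the function name iter_records and the tag name "record".

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag="record"):
    """Yield (attributes, text) for each matching element, discarding it afterwards."""
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield dict(elem.attrib), (elem.text or "")
            # Drop everything parsed so far so it can be garbage collected.
            root.clear()


if __name__ == "__main__":
    sample = io.BytesIO(b'<root><record id="1">a</record><record id="2">b</record></root>')
    for attrs, text in iter_records(sample):
        print(attrs, text)
```

Only the element currently being processed needs to be alive, so older records get collected as the post suggests.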

Name: Anonymous 2009-11-04 15:12

The problem is XML is an enterprise faggot concept that only enterprise faggots would use. Therefore, the parsers are written by enterprise faggots, who can't write efficient code. If a real programmer wrote an XML parser, it would be awesome, but unfortunately real programmers never use XML.

Name: Anonymous 2009-11-04 15:27

>>7
What about when a real programmer needs to scrape some web pages and uses an XML parser-based HTML parser like lxml?

Name: Anonymous 2009-11-04 15:32

>>1
That DOM parser was designed for smaller data, so it is less efficient but provides more information. Check out a more efficient parser if you want less, but you probably should be using a SAX parser for that data anyway.
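
If you want some of the DOM's information without paying for the whole tree, a possible middle ground is xml.dom.pulldom from the standard library. Nobody in the thread mentioned it, so this is a swapped-in technique, and the names expensive_items, "item", "name" and "price" are hypothetical.

```python
from xml.dom import pulldom


def expensive_items(xml_text, threshold):
    """Stream events and build DOM subtrees only for items above the threshold."""
    doc = pulldom.parseString(xml_text)
    found = []
    for event, node in doc:
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            if int(node.getAttribute("price")) > threshold:
                # Pull the rest of this element into a small DOM subtree.
                doc.expandNode(node)
                found.append((node.getAttribute("name"), node.toxml()))
    return found
```

Elements that fail the check are never expanded, so only the interesting subtrees are materialized as DOM nodes.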

Name: Anonymous 2009-11-04 16:52

>>7
> If a real programmer wrote an XML parser, it would be awesome, but unfortunately real programmers never use XML.
It's been done, once.
http://search.cpan.org/~mirod/XML-Twig-3.32/Twig.pm
XML would be totally tolerable if other languages caught on.  But then, so would a lot of things.

Name: Anonymous 2009-11-05 2:25

Name: Anonymous 2009-11-05 3:08

I wrote one in C before. It was intended to be part of a browser project but that never got anywhere.

Name: Anonymous 2009-11-05 6:22

HAX MY XML

Name: Anonymous 2009-11-05 7:28

>>13
That's just disgusting, I could get AIDS or worse, become a Java code monkey

Name: Anonymous 2009-11-05 13:44

HaXe my anus

Name: Anonymous 2009-11-05 14:45

HAXML MY ANUS
