
REPOLT HERE

Name: VIPPER 2006-02-28 18:51

??

Name: VIPPER 2009-03-03 16:32

FORMATTING TEST

This is what I emailed him. I think IHBT for sure, but I don't care as it was somewhat educational and interesting working this out:

Are you really serious about having 650 thousand lines in your hosts file? I can't imagine why you'd need that many. It also has a crippling effect on one's computer.

To test this, I created a sample copy of a hosts file with that many entries, using the "0" shorthand for IP address and a randomized hostname of average 32 characters. Total size of this file is 22855 kilobytes, and after an hour the DNS cache had only loaded a third of it in. This is primarily due to the choice of algorithm used by the DNS cache service - it wasn't designed for tens of thousands of hosts file entries to be stored, so it uses a rather inefficient method of growing its storage, one that involves copying huge swathes of data around for each new entry. It also blocked any name lookups while loading the file.
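For anyone wanting to repeat the test, a sample file of this shape could be generated roughly as follows. This is a minimal sketch, not the script actually used: the function names, the ".com" suffix, the length spread and the CRLF line ending are all assumptions. As a sanity check on the size, "0" plus a space plus a 32-character name plus CRLF is 36 bytes per line, and 650,000 x 36 = 23,400,000 bytes, or about 22,852 KB, which is close to the 22855 KB quoted above.

```python
import random
import string

def make_hosts_lines(count, address="0", mean_len=32, seed=0):
    """Return `count` hosts-file lines mapping random hostnames to `address`."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    alphabet = string.ascii_lowercase + string.digits
    lines = []
    for _ in range(count):
        # Total name length is uniform in mean_len-8 .. mean_len+8, so it averages mean_len.
        length = rng.randint(mean_len - 8, mean_len + 8)
        name = "".join(rng.choice(alphabet) for _ in range(length - 4)) + ".com"
        lines.append(f"{address} {name}")
    return lines

def file_size_bytes(lines, newline="\r\n"):
    """Bytes the lines would occupy on disk with the given line ending (assumed CRLF)."""
    return sum(len(line.encode("ascii")) + len(newline) for line in lines)

# Small demonstration; pass 650_000 to approximate the full-size test file.
sample = make_hosts_lines(1000)
print(sample[0])
print(file_size_bytes(sample), "bytes for", len(sample), "lines")
```

Writing the result out is then just a matter of joining the lines with the chosen line ending and saving them to a scratch file, not the live hosts file.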

So instead of this, I tried with only 65k entries, and made three copies of this file. Each had an identical list of hostnames, but used "0", "0.0.0.0" and "127.0.0.1" respectively. The DNS cache now took 1 minute 55 seconds to load each one; the choice of IP address style didn't make any difference to the loading time as the bulk of the processing was in inserting new entries as described in the paragraph above. Name resolution was at normal speed after that, though. Searching in-cache - even for such a large set of data - added no discernible penalty.
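To illustrate why the insertion cost dominates, here is a rough cost model. It is only a sketch under an assumption: that the store is reallocated and fully copied on every insert, which is one way to read "copying huge swathes of data around for each new entry". The real DNS cache algorithm is not known here, and both function names are made up for the example.

```python
def copies_when_growing_by_one(n):
    """Elements copied if the store is reallocated and copied on every insert."""
    # Inserting entry k (0-based) first copies the k entries already stored.
    return n * (n - 1) // 2

def copies_when_doubling(n):
    """Elements copied if capacity doubles whenever the store is full (for contrast)."""
    copied, capacity = 0, 1
    for size in range(n):
        if size == capacity:
            copied += size      # move the existing entries into the larger block
            capacity *= 2
    return copied

small, large = 65_000, 650_000
ratio = copies_when_growing_by_one(large) / copies_when_growing_by_one(small)
print(round(ratio))                       # about 100: ten times the entries, a hundred times the copying
print(copies_when_doubling(large) < 2 * large)
```

Under this model the style of the address text never enters the cost at all, which fits the identical load times for the three files.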

I decided to try with the DNS cache disabled. This isn't a good idea, as it forces uncached name resolution to be done for every single lookup. This is indeed what it did, and the original 650,000-entry hosts files added around 3 seconds onto every single name lookup, the cumulative effect of which slowed general Internet access down considerably. Unlike the DNS cache loading, this time there was a slight difference in loading times between the different hosts files - this was expected, as it was reading the entire file each time so that became the bottleneck.
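The uncached case can be pictured as a scan over the whole file for every lookup. The sketch below is a simplified model, not the Windows resolver: `build_hosts` and `lookup_by_scanning` are hypothetical names, and counting characters read stands in for the real I/O cost.

```python
def build_hosts(address, names):
    """Hosts-file text mapping every name to `address`, with CRLF line endings."""
    return "".join(f"{address} {name}\r\n" for name in names)

def lookup_by_scanning(hosts_text, wanted):
    """Resolve `wanted` by walking the whole text; return (address, chars_read)."""
    found, chars_read = None, 0
    for line in hosts_text.splitlines(keepends=True):
        chars_read += len(line)                     # every line is read, match or not
        fields = line.split("#", 1)[0].split()      # drop comments, split on whitespace
        if found is None and wanted.lower() in (f.lower() for f in fields[1:]):
            found = fields[0]
    return found, chars_read

names = [f"host{i}.example" for i in range(1000)]
for address in ("0", "0.0.0.0", "127.0.0.1"):
    print(address, lookup_by_scanning(build_hosts(address, names), "host999.example"))
```

Since "0.0.0.0" and "127.0.0.1" are 6 and 8 characters longer than "0", the amount read per lookup grows with the address style in this model, which is the kind of slight difference seen once the file itself is the bottleneck.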

Finally, to address your last question: every IPv4 address is stored in the cache using the same size of four bytes, e.g. both "0" and "0.0.0.0" become 00 00 00 00, both "127.1" and "127.0.0.1" become 7F 00 00 01, and so on. This is consistent with the binary format used in the sockets API.
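The normalization can be sketched in a few lines. This is a hand-rolled illustration of the classic shorthand rule (leading parts are one byte each, the last part fills whatever bytes remain), handling decimal parts only; `normalize_ipv4` is a made-up name, not a function from any Windows library.

```python
def normalize_ipv4(text):
    """Convert dotted or shorthand decimal IPv4 text to its 4-byte form."""
    fields = text.split(".")
    if not 1 <= len(fields) <= 4 or not all(f.isdigit() for f in fields):
        raise ValueError(f"not an IPv4 address: {text!r}")
    *leading, last = (int(f) for f in fields)
    # Leading parts are one byte each; the last part fills all remaining bytes.
    last_width = 4 - len(leading)
    if any(p > 0xFF for p in leading) or last >= 1 << (8 * last_width):
        raise ValueError(f"part out of range: {text!r}")
    return bytes(leading) + last.to_bytes(last_width, "big")

for text in ("0", "0.0.0.0", "127.1", "127.0.0.1"):
    print(text, "->", normalize_ipv4(text).hex())
```

Whatever the text looked like on disk, the result is always exactly four bytes, so no address style can take more room than another once it is in memory.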

In conclusion, using the hosts file to store tens of thousands of entries has a negative effect on the performance of Windows' name resolution. You should really consider another option to filter all those hostnames.

Regards
Harm

----- Original Message -----
From: "Alexander Peter Kowalski" <apk4776239@hotmail.com>;
To: "Harm Sørensen" <harsoren@gmail.com>;
Sent: Monday, March 02, 2009 1:07 PM
Subject: Re: hosts file APK REPLY #1, Thanks + IMPORTANT QUESTION... apk

Thanks,

Still, it does adversely affect speed, nevertheless... you stated:

"And parsing "0" versus "0.0.0.0" is such an insignificant difference it's not even worth talking about. Even on thousands of entries any speed impact would be barely measurable and certainly not noticeable."

BUT, it is quite noticeable... humanly noticeable, & here is why/how/when etc.:

There IS a decent gain, using PLAIN 0 vs. 0.0.0.0 or 127.0.0.1 when you have as many lines in your HOSTS file as I do @ 650,000++ of them!

Simply because an extra 5 bytes per line using 0.0.0.0 or an extra 7 bytes using 127.0.0.1, vs plain 0, begins to show its face & compounds itself further the more lines your HOSTS file has.

That's where the problem is really. You noted it as did I below in your reply, No questions asked!

Doing the math alone (theory) shows that, &, as you noted, the loading speed of the IP stack hauling the HOSTS file in shows it, & also via unloading/flushing the local DNS cache via ipconfig /flushdns...

This simple test can illustrate it for you, as a test if you wish in fact.

I am more concerned on memory usage, & it appears that the use of IPConfig /flushdns increases using larger Blocking IP Addresses than 0 (& those of course being 0.0.0.0 &/or 127.0.0.1) & your explanation of this took care of that for me (I knew that HOSTS files did not load repeated entries upon load into the local DNS cache, but, not that 0 converted to 0.0.0.0).

QUESTION:  What about 127.0.0.1 vs. 0.0.0.0?

127.0.0.1 doesn't resolve to 0.0.0.0, so it IS bigger in RAM once the HOSTS is loaded into a local DNS cache though, where 0 is same as 0.0.0.0 once loaded into the local DNS cache, correct??

Thanks for the answer...

APK


----- Original Message -----
From: "Harm Sørensen" <harsoren@googlemail.com>;
To: "Alexander Peter Kowalski" <apk4776239@hotmail.com>;
Sent: Sunday, March 01, 2009 9:20 PM
Subject: hosts file

Hello,

In regards to your recent postings on Slashdot et al about how integer IP addresses such as "0" no longer work in the hosts file - the fact of the matter is that the difference this makes to the speed of one's system is insignificant, so there is no reason to just change the hosts file to use "0.0.0.0" instead.

The reason is this: the DNS Client service - assuming it is running, which it should be - reads the hosts file in once when it starts and only reads it in again on the rare occurrence that it changes. Using the DNSAPI library, it reads in each line, tokenizes it and parses it. Integers are converted during the parsing stage to their equivalent IP addresses, e.g. "0" becomes 0.0.0.0. These host to address mappings are then stored in their normalized (i.e. binary) form in the DNS cache.

Therefore, the only possible saving of speed would be in the initial loading of the hosts file, which happens rarely. And parsing "0" versus "0.0.0.0" is such an insignificant difference it's not even worth talking about. Even on thousands of entries any speed impact would be barely measurable and certainly not noticeable.

The only saving of space would be in the hosts file itself - again, a negligible difference. Addresses stored as either "0" or "0.0.0.0" in the hosts file on disk are stored precisely the same in memory in the DNS cache so there is no space difference at all here.

Hope this explanation helps put your complaint into perspective.

Best regards,
Harm
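
The load path described in the email above (read each line, tokenize it, parse the address, store the mapping in binary form) can be sketched like this. It is an illustration only and not the DNSAPI code: `load_hosts` is a hypothetical name, the standard `socket.inet_aton` call stands in for the address parsing, and keeping the first mapping seen for a repeated name is an assumption.

```python
import socket

def load_hosts(text):
    """Parse hosts-file text into {hostname: 4-byte address}."""
    cache = {}
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()     # drop comments, tokenize on whitespace
        if len(fields) < 2:
            continue                              # blank or incomplete line
        try:
            packed = socket.inet_aton(fields[0])  # "0" and "0.0.0.0" both give 4 zero bytes
        except OSError:
            continue                              # not an IPv4 address (e.g. an IPv6 entry)
        for name in fields[1:]:
            cache.setdefault(name.lower(), packed)  # assumption: first mapping for a name wins
    return cache

cache = load_hosts("0 ads.example\n0.0.0.0 tracker.example  # same thing in memory\n")
print(cache["ads.example"] == cache["tracker.example"])
```

Both entries end up as the same four bytes, so the only place the two spellings differ is in the text file on disk.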

Name: VIPPER 2009-03-03 16:38

FORMATTING TEST

This is what I emailed him. I think IHBT for sure, but I don't care as it was somewhat educational and interesting working this out:

Are you really serious about having 650 thousand lines in your hosts file? I can't imagine why you'd need that many. It also has a crippling effect on one's computer.

To test this, I created a sample copy of a hosts file with that many entries, using the "0" shorthand for IP address and a randomized hostname of average 32 characters. Total size of this file is 22855 kilobytes, and after an hour the DNS cache had only loaded a third of it in. This is primarily due to the choice of algorithm used by the DNS cache service - it wasn't designed for tens of thousands of hosts file entries to be stored, so it uses a rather inefficient method of growing its storage, one that involves copying huge swathes of data around for each new entry. It also blocked any name lookups while loading the file.

So instead of this, I tried with only 65k entries, and made three copies of this file. Each had an identical list of hostnames, but used "0", "0.0.0.0" and "127.0.0.1" respectively. The DNS cache now took 1 minute 55 seconds to load each one; the choice of IP address style didn't make any difference to the loading time as the bulk of the processing was in inserting new entries as described in the paragraph above. Name resolution was at normal speed after that, though. Searching in-cache - even for such a large set of data - added no discernible penalty.

I decided to try with the DNS cache disabled. This isn't a good idea, as it forces uncached name resolution to be done for every single lookup. This is indeed what it did, and the original 650,000-entry hosts files added around 3 seconds onto every single name lookup, the cumulative effect of which slowed general Internet access down considerably. Unlike the DNS cache loading, this time there was a slight difference in loading times between the different hosts files - this was expected, as it was reading the entire file each time so that became the bottleneck.

Finally, to address your last question: every IPv4 address is stored in the cache using the same size of four bytes, e.g. both "0" and "0.0.0.0" become 00 00 00 00, both "127.1" and "127.0.0.1" become 7F 00 00 01, and so on. This is consistent with the binary format used in the sockets API.

In conclusion, using the hosts file to store tens of thousands of entries has a negative effect on the performance of Windows' name resolution. You should really consider another option to filter all those hostnames.

Regards
   

----- Original Message -----
From: "Alexander Peter Kowalski" <apk4776239@hotmail.com>;
To: "            " <        @     .   >
Sent: Monday, March 02, 2009 1:07 PM
Subject: Re: hosts file APK REPLY #1, Thanks + IMPORTANT QUESTION... apk

Thanks,

Still, it does adversely affect speed, nevertheless... you stated:

> "And parsing "0" versus "0.0.0.0" is such an insignificant difference it's not even worth talking about. Even on thousands of entries any speed impact would be barely measurable and certainly not noticeable."

BUT, it is quite noticeable... humanly noticeable, & here is why/how/when etc.:

There IS a decent gain, using PLAIN 0 vs. 0.0.0.0 or 127.0.0.1 when you have as many lines in your HOSTS file as I do @ 650,000++ of them!

Simply because an extra 5 bytes per line using 0.0.0.0 or an extra 7 bytes using 127.0.0.1, vs plain 0, begins to show its face & compounds itself further the more lines your HOSTS file has.

That's where the problem is really. You noted it as did I below in your reply, No questions asked!

Doing the math alone (theory) shows that, &, as you noted, the loading speed of the IP stack hauling the HOSTS file in shows it, & also via unloading/flushing the local DNS cache via ipconfig /flushdns...

This simple test can illustrate it for you, as a test if you wish in fact.

I am more concerned on memory usage, & it appears that the use of IPConfig /flushdns increases using larger Blocking IP Addresses than 0 (& those of course being 0.0.0.0 &/or 127.0.0.1) & your explanation of this took care of that for me (I knew that HOSTS files did not load repeated entries upon load into the local DNS cache, but, not that 0 converted to 0.0.0.0).

QUESTION:  What about 127.0.0.1 vs. 0.0.0.0?

127.0.0.1 doesn't resolve to 0.0.0.0, so it IS bigger in RAM once the HOSTS is loaded into a local DNS cache though, where 0 is same as 0.0.0.0 once loaded into the local DNS cache, correct??

Thanks for the answer...

APK



----- Original Message -----
From: "            " <        @     .   >
To: "Alexander Peter Kowalski" <apk4776239@hotmail.com>;
Sent: Sunday, March 01, 2009 9:20 PM
Subject: hosts file

Hello,

In regards to your recent postings about how integer IP addresses such as "0" no longer work in the hosts file - the fact of the matter is that the difference this makes to the speed of one's system is insignificant, so there is no reason to just change the hosts file to use "0.0.0.0" instead.

The reason is this: the DNS Client service - assuming it is running, which it should be - reads the hosts file in once when it starts and only reads it in again on the rare occurrence that it changes. Using the DNSAPI library, it reads in each line, tokenizes it and parses it. Integers are converted during the parsing stage to their equivalent IP addresses, e.g. "0" becomes 0.0.0.0. These host to address mappings are then stored in their normalized (i.e. binary) form in the DNS cache.

Therefore, the only possible saving of speed would be in the initial loading of the hosts file, which happens rarely. And parsing "0" versus "0.0.0.0" is such an insignificant difference it's not even worth talking about. Even on thousands of entries any speed impact would be barely measurable and certainly not noticeable.

The only saving of space would be in the hosts file itself - again, a negligible difference. Addresses stored as either "0" or "0.0.0.0" in the hosts file on disk are stored precisely the same in memory in the DNS cache so there is no space difference at all here.

Hope this explanation helps put your complaint into perspective.

Best regards,
   

Name: VIPPER 2009-03-03 16:41

FORMATTING TEST

This is what I emailed him. I think IHBT for sure, but I don't care as it was somewhat educational and interesting working this out:

Are you really serious about having 650 thousand lines in your hosts file? I can't imagine why you'd need that many. It also has a crippling effect on one's computer.

To test this, I created a sample copy of a hosts file with that many entries, using the "0" shorthand for IP address and a randomized hostname of average 32 characters. Total size of this file is 22855 kilobytes, and after an hour the DNS cache had only loaded a third of it in. This is primarily due to the choice of algorithm used by the DNS cache service - it wasn't designed for tens of thousands of hosts file entries to be stored, so it uses a rather inefficient method of growing its storage, one that involves copying huge swathes of data around for each new entry. It also blocked any name lookups while loading the file.

So instead of this, I tried with only 65k entries, and made three copies of this file. Each had an identical list of hostnames, but used "0", "0.0.0.0" and "127.0.0.1" respectively. The DNS cache now took 1 minute 55 seconds to load each one; the choice of IP address style didn't make any difference to the loading time as the bulk of the processing was in inserting new entries as described in the paragraph above. Name resolution was at normal speed after that, though. Searching in-cache - even for such a large set of data - added no discernible penalty.

I decided to try with the DNS cache disabled. This isn't a good idea, as it forces uncached name resolution to be done for every single lookup. This is indeed what it did, and the original 650,000-entry hosts files added around 3 seconds onto every single name lookup, the cumulative effect of which slowed general Internet access down considerably. Unlike the DNS cache loading, this time there was a slight difference in loading times between the different hosts files - this was expected, as it was reading the entire file each time so that became the bottleneck.

Finally, to address your last question: every IPv4 address is stored in the cache using the same size of four bytes, e.g. both "0" and "0.0.0.0" become 00 00 00 00, both "127.1" and "127.0.0.1" become 7F 00 00 01, and so on. This is consistent with the binary format used in the sockets API.

In conclusion, using the hosts file to store tens of thousands of entries has a negative effect on the performance of Windows' name resolution. You should really consider another option to filter all those hostnames.

Regards
    

----- Original Message -----
From: "Alexander Peter Kowalski" <apk4776239@hotmail.com>;
To: "            " <        @     .   >
Sent: Monday, March 02, 2009 1:07 PM
Subject: Re: hosts file APK REPLY #1, Thanks + IMPORTANT QUESTION... apk

Thanks,

Still, it does adversely affect speed, nevertheless... you stated:

"And parsing "0" versus "0.0.0.0" is such an insignificant difference it's not even worth talking about. Even on thousands of entries any speed impact would be barely measurable and certainly not noticeable."

BUT, it is quite noticeable... humanly noticeable, & here is why/how/when etc.:

There IS a decent gain, using PLAIN 0 vs. 0.0.0.0 or 127.0.0.1 when you have as many lines in your HOSTS file as I do @ 650,000++ of them!

Simply because an extra 5 bytes per line using 0.0.0.0 or an extra 7 bytes using 127.0.0.1, vs plain 0, begins to show its face & compounds itself further the more lines your HOSTS file has.

That's where the problem is really. You noted it as did I below in your reply, No questions asked!

Doing the math alone (theory) shows that, &, as you noted, the loading speed of the IP stack hauling the HOSTS file in shows it, & also via unloading/flushing the local DNS cache via ipconfig /flushdns...

This simple test can illustrate it for you, as a test if you wish in fact.

I am more concerned on memory usage, & it appears that the use of IPConfig /flushdns increases using larger Blocking IP Addresses than 0 (& those of course being 0.0.0.0 &/or 127.0.0.1) & your explanation of this took care of that for me (I knew that HOSTS files did not load repeated entries upon load into the local DNS cache, but, not that 0 converted to 0.0.0.0).

QUESTION:  What about 127.0.0.1 vs. 0.0.0.0?

127.0.0.1 doesn't resolve to 0.0.0.0, so it IS bigger in RAM once the HOSTS is loaded into a local DNS cache though, where 0 is same as 0.0.0.0 once loaded into the local DNS cache, correct??

Thanks for the answer...

APK



----- Original Message -----
From: "            " <        @     .   >
To: "Alexander Peter Kowalski" <apk4776239@hotmail.com>;
Sent: Sunday, March 01, 2009 9:20 PM
Subject: hosts file

Hello,

In regards to your recent postings about how integer IP addresses such as "0" no longer work in the hosts file - the fact of the matter is that the difference this makes to the speed of one's system is insignificant, so there is no reason to just change the hosts file to use "0.0.0.0" instead.

The reason is this: the DNS Client service - assuming it is running, which it should be - reads the hosts file in once when it starts and only reads it in again on the rare occurrence that it changes. Using the DNSAPI library, it reads in each line, tokenizes it and parses it. Integers are converted during the parsing stage to their equivalent IP addresses, e.g. "0" becomes 0.0.0.0. These host to address mappings are then stored in their normalized (i.e. binary) form in the DNS cache.

Therefore, the only possible saving of speed would be in the initial loading of the hosts file, which happens rarely. And parsing "0" versus "0.0.0.0" is such an insignificant difference it's not even worth talking about. Even on thousands of entries any speed impact would be barely measurable and certainly not noticeable.

The only saving of space would be in the hosts file itself - again, a negligible difference. Addresses stored as either "0" or "0.0.0.0" in the hosts file on disk are stored precisely the same in memory in the DNS cache so there is no space difference at all here.

Hope this explanation helps put your complaint into perspective.

Best regards,
    

