Someone who is working on his Masters in programming told me an interesting story about estimated times. He said that programmers report a time estimate, regardless of its accuracy, so that the end user will think something is happening and won't bail out.
While some may do that, I tend to take one of two approaches.
Approach 1 is a dumb average. Record the start time, then periodically calculate the average speed as (amount downloaded / time spent downloading so far) and the remaining time as (remaining size / average speed). This is the easiest thing to do, but it doesn't react well to changes in network speed. If you were downloading at 1 MB/sec and suddenly your line became congested and you are now downloading at 50 KB/sec, the average speed will remain far above your current speed and will keep telling you "we're almost done". Some ISPs will give transfers a high "initial burst speed" so you can get small stuff fast. But if they see you sucking on something for a while, they'll throttle you down so others get their little stuff faster, figuring you might be at it for a while and they'd rather lag one big download than a hundred little ones.
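A minimal sketch of approach 1 in Python (the class and method names are just my own illustration, not from any particular library):

```python
import time

class SimpleAverageEstimator:
    """Approach 1: one overall average speed, measured from the start of the download."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.start_time = time.monotonic()

    def eta_seconds(self, bytes_downloaded):
        """Estimated seconds remaining, or None if there's nothing to average yet."""
        elapsed = time.monotonic() - self.start_time
        if elapsed <= 0 or bytes_downloaded <= 0:
            return None
        avg_speed = bytes_downloaded / elapsed            # bytes per second since the start
        remaining = self.total_bytes - bytes_downloaded   # bytes still to fetch
        return remaining / avg_speed
```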
Approach 2 is a running average. Instead of recording the initial start time, the start time of the last window is recorded, along with the download progress at that point. The next sample may be taken only one second later; the amount downloaded in that interval and the time that elapsed are used to calculate an instantaneous speed. That speed is placed at the end of a circular queue that may cover, say, 20 seconds (20 samples at one per second, 40 samples at one every 1/2 second, etc.). The queue is then averaged, and the remaining data to download is divided by that average speed to generate an estimate of the remaining time. You can see this is quite a bit more complicated, but it reacts quickly to changes in download speed, and it fully corrects in the time it takes to make one pass around the queue (20 seconds in this case). Shorter queues improve reaction time but can be unnecessarily sensitive to small hiccups in speed.
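Here is a rough sketch of approach 2, again in Python with made-up names; the deque plays the role of the circular queue of per-sample speeds:

```python
import time
from collections import deque

class RunningAverageEstimator:
    """Approach 2: average the instantaneous speed over a sliding window of samples."""

    def __init__(self, total_bytes, window_samples=20):
        self.total_bytes = total_bytes
        self.samples = deque(maxlen=window_samples)   # circular queue of recent speeds
        self.last_time = time.monotonic()
        self.last_bytes = 0

    def update(self, bytes_downloaded):
        """Record one sample; call this roughly once per second."""
        now = time.monotonic()
        elapsed = now - self.last_time
        if elapsed > 0:
            # instantaneous speed over just this sampling interval
            speed = (bytes_downloaded - self.last_bytes) / elapsed
            self.samples.append(speed)
        self.last_time = now
        self.last_bytes = bytes_downloaded

    def eta_seconds(self):
        """Remaining bytes divided by the windowed average speed."""
        if not self.samples:
            return None
        avg_speed = sum(self.samples) / len(self.samples)
        if avg_speed <= 0:
            return None
        return (self.total_bytes - self.last_bytes) / avg_speed
```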
As the download gets very close to completing, both approaches tend to become very accurate. Transfers that vary wildly in speed (up, down, up, down) will produce rollercoaster estimates using approach 2, but will be fairly smooth and accurate using approach 1, so your specific conditions will determine which approach ends up giving the most accurate numbers through the bulk of the download. To that end, I sometimes implement a hybrid: I take BOTH approaches and divide the remaining data by the average of approach 1's and approach 2's speeds. This seems to be the best practical approach if you are willing to deal with the added complexity of approach 2. It doesn't swing too wildly if conditions are all over the map, but it provides more accurate estimates mid-download if the speed changes for the better or worse. Weighted averages are also possible, taking approach 1 less into account and approach 2 more into account as the download approaches the end.
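The hybrid can be as simple as blending the two speeds before doing the division; something along these lines (the weighting parameter is my own addition, included only to show the weighted variant):

```python
def hybrid_eta(overall_speed, windowed_speed, remaining_bytes, weight_windowed=0.5):
    """Blend the speeds from approach 1 and approach 2, then divide the
    remaining bytes by the blended speed. Raising weight_windowed as the
    download nears completion gives the weighted variant described above."""
    blended = (1.0 - weight_windowed) * overall_speed + weight_windowed * windowed_speed
    if blended <= 0:
        return None
    return remaining_bytes / blended
```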
It's actually a fairly interesting problem to work on.