Originally Posted By: CharlesS
I suppose there's a small possibility that the paths could be using up all the memory available to the process, but since 150 KB is hardly anything, I doubt that. It could be that it's trying to fork a process with too much VM space, but the fork seems to be succeeding since the ls program is actually getting run. The argument list could be too long, except that it isn't — ARG_MAX is 262144 bytes (256 KiB), which is longer than 150 KB.

Well, 256 is only slightly more than 150... if, say (just speculating), it needed two copies of the data: one to buffer the input it's reading, and another to store the output it's constructing (be it for ASCII-betical sorting, column-width calculation, etc.), then 150 x 2 = 300, which would indeed blow past that 256 KiB ceiling. [At this point i presume my "alert the operator" guess was probably incorrect... just pointing out that it wasn't necessarily totally ridiculous either, since (so far) there doesn't seem to be an actual memory error.]
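
For anyone wanting to double-check that limit on their own box, either of these should report it directly (262144 being the figure quoted above):

getconf ARG_MAX
sysctl kern.argmax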


Originally Posted By: CharlesS
If you really want to make the find command run faster by consolidating arguments, the best way to do it is probably just to pipe to xargs. That way, you'll get the additional benefit that if the argument list does get longer than ARG_MAX, it'll split it up for you without running the command once for each line.

find <path> <search criteria> | xargs ls -l@

Reading those statements, it seems apparent you were not yet aware of find's new {} + syntax (if Leopard 10.5.0 can still be called "new", that is), which i briefly mentioned earlier in this very thread. Its express intent is to emulate xargs, and in some of the measurements i've taken on occasion since Fall 2007, it actually surpasses xargs (not by much, mind you, but still impressive).
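
To spell out the equivalence (just a sketch, reusing the placeholders from your example), these two should batch the pathnames into roughly the same small number of invocations:

find <path> <search criteria> -print0 | xargs -0 ls -l@
find <path> <search criteria> -exec ls -l@ {} +

(The -print0/-0 pair only guards against whitespace in names while piping; the -exec form needs no such precaution, since the paths never pass through a pipe at all.)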

Here are some timed runs (please note how the argument-list limit for du does get exceeded, as evidenced by the multiple "total" lines in the output):

time du -sh /usr
544M /usr

real 0m0.154s
user 0m0.024s
sys 0m0.130s



time find /usr -type f -print0 |xargs -0 du -sc |grep $'\ttotal' |
awk 'siz+=$1; END { print siz/2000, "megabytes" }'

566704 total
155432 total
119456 total
32144 total
101280 total
96488 total
19928 total
545.716 megabytes

real 0m0.293s
user 0m0.079s
sys 0m0.248s



time find /usr -type f -exec du -sc {} + |grep $'\ttotal' |
awk 'siz+=$1; END { print siz/2000, "megabytes" }'

537488 total
61696 total
97040 total
62248 total
73424 total
23528 total
22960 total
20784 total
105616 total
46936 total
40480 total
546.1 megabytes

real 0m0.285s
user 0m0.062s
sys 0m0.234s

^ Note that the newer find syntax actually beat xargs by a whisker, despite using 4 additional calls to du (11 versus 7). shocked
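
(Incidentally, a quick way to reproduce those invocation counts, since each du -c batch prints exactly one "total" line, so counting the matches counts the batches:

find /usr -type f -print0 |xargs -0 du -sc |grep -c $'\ttotal'
find /usr -type f -exec du -sc {} + |grep -c $'\ttotal'

On the runs above those come back as 7 and 11 respectively, though the exact split can drift if /usr changes between runs.)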


By way of comparison, the old find syntax is glacially slow and abysmally inefficient (calling du for every single hit):

time find /usr -type f -exec du -sc {} \; |grep $'\ttotal' |
awk 'siz+=$1; END { print siz/2000, "megabytes" }'

48 total
240 total
152 total
0 total
48 total
:
: # 29,394 lines with a "total" (on my system)
:
8 total
8 total
176 total
8 total
554.184 megabytes

real 0m39.716s
user 0m8.726s
sys 0m28.341s

Interestingly enough, a Finder Get Info window on /usr claims a calculated 1.22 GB on disk (an error due to multi-linked files in /usr perhaps... or did they simply forget that "one block" != 1K?).
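
(One quick way to gauge that multi-link theory, just as a sketch: count the regular files under /usr carrying more than one hard link, i.e. the candidates a naive traversal could tally twice:

find /usr -type f -links +1 |wc -l

A nontrivial count there would lend some weight to the hard-link explanation.)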



Originally Posted By: CharlesS
There's clearly something wrong, though, since I'm getting the same error on my machine, and my list of zero-files in /usr, /bin, and /sbin is only 2924 bytes long. There's nothing too long about that list of arguments at all, and I can even copy and paste the list straight into ls -l@ at the shell, and it works fine. But the find command does not.

Something's returning ENOMEM in the code, which most likely means that something is using up the process's available memory. Or it could be a bug in the code. Who knows.

I don't know either, but I vote bug... and i think this particular sequence illustrates that possibility rather nicely (and/or contains a useful clue perhaps):

find /usr -type d -exec stat -f '%N' {} + |wc -l
   1509

find /usr -type f -exec stat -f '%N' {} + |wc -l
   29394

find /usr -type l -exec stat -f '%N' {} + |wc -l
find: fts_read: Cannot allocate memory
   3054

Now why would 3,000 symlinks cause a problem, when -exec stat previously processed nearly 30,000 files without blinking? confused

Here are three alternative methods of counting links:

find /usr -type l -exec stat -f '%N' {} \; |wc -l
   3054

find /usr -type l -print0 |xargs -0 stat -f '%N' |wc -l
   3054

find /usr -type l |wc -l
   3054


So it seems the "error" (in this case) shows up only when that newer find syntax encounters some particular condition (as yet unknown). Note that it is the *message itself* which seems to be the actual error here, since 3054 is indeed the correct answer.
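
(One more data point that might be worth collecting, sketched here: check whether find itself flags a failure when that message appears, by inspecting its exit status:

find /usr -type l -exec stat -f '%N' {} + >/dev/null; echo $?

Whether that echoes 0 or not would tell us if find is actually reporting an error, or merely printing a spurious complaint alongside correct results.)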

But it seems to turn up in other situations as well:



Originally Posted By: CharlesS
Beats me, but it's definitely not the file list being too long.

Having studied it further, i'll have to agree with you. But at the time, it didn't seem like the worst guess in the world. In a similar vein, i have some doubts about "means exactly what it says" at this juncture as well.

Last edited by Hal Itosis; 06/05/10 08:13 AM.