An open community 
of Macintosh users,
for Macintosh users.

#10270 - 05/30/10 11:16 PM Re: du won't run? [Re: artie505]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
BTW, it occurs to me that, while the partial path '*/groff/*' was a convenient point for me to filter out those files which normally are zero bytes, it is probably too high up in that hierarchy. I.e., there may be *other* stuff inside groff which was not supposed to be zero (and, if such was your case, then those too got skipped).

IOW, this (original):

find /usr -type f -size -1c -not -path '*/groff/*'

should be replaced by this:

find /usr -type f -size -1c -not -path '*/mm/*locale'

for a more accurate results list.
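A minimal sketch of the difference between the two filters, run against a throwaway directory rather than /usr. The directory layout and file names below are made up for illustration; only the two find expressions come from the post above.

```shell
# Scratch tree standing in for /usr (hypothetical layout and names).
tmp=$(mktemp -d)
mkdir -p "$tmp/groff/tmac/mm" "$tmp/bin"
: > "$tmp/groff/tmac/mm/locale"       # empty on purpose
: > "$tmp/groff/tmac/mm/se_locale"    # empty on purpose
: > "$tmp/groff/tmac/broken.tmac"     # empty, but should not be
echo data > "$tmp/bin/tool"           # non-empty, never reported

# Broad filter: hides everything under groff, including the suspicious file.
broad=$(find "$tmp" -type f -size -1c -not -path '*/groff/*')

# Narrow filter: hides only the files that are expected to be empty.
narrow=$(find "$tmp" -type f -size -1c -not -path '*/mm/*locale')

echo "broad:  ${broad:-<nothing>}"
echo "narrow: ${narrow:-<nothing>}"
rm -rf "$tmp"
```

With this layout the broad filter reports nothing, while the narrow one still surfaces the unexpectedly empty file under groff.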

#10276 - 05/31/10 08:48 AM Re: du won't run? [Re: Hal Itosis]
CharlesS Offline


Registered: 05/30/10
Those may still turn up false positives if other files show up that have a legitimate reason to be zero-length, which can happen, particularly in /usr/local where you don't know what third-party software might have been installed. In addition, if you want to scan for other damaged files outside /usr (which you want to do — although the damage usually seems to be in /usr, it's not yet proven that this will always be the case, and the entire file system is full of these HFS+ compressed files), it will turn up a lot of false positives, since every resource fork file, alias, .webloc file, etc. is going to appear as having a size-0 data fork and extended attributes. What you want to be doing is to specifically test for the com.apple.decmpfs attribute, which will only appear if the file is damaged (for a normal HFS+ compressed file, both com.apple.decmpfs and the resource fork are hidden to the normal interface).

I have written a tool in C to scan for HFS+ damaged files. I'm using it to scan affected people's machines and send the results to Apple, in the hope that they will figure out the issue and fix it. Because this tool is written in C, it will be much faster than the shell, and in addition it will not be dependent on any shell tools which could conceivably be damaged. You can run it with no options to do a scan of the entire hard drive, or send a path as an argument to have it only scan that particular path (but be aware that supplying a path which is not the root to an HFS+ volume will result in a much slower type of search being used). There is also an optional --hidePermissionWarnings flag that you can use to suppress the "Permission Denied" errors and avoid running as root.

Here is the URL.

http://www.charlessoft.com/find_sl_damaged_files.zip


Edited by CharlesS (05/31/10 08:49 AM)

#10289 - 05/31/10 06:28 PM Re: du won't run? [Re: CharlesS]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
Excellent!

#10292 - 05/31/10 10:04 PM Re: du won't run? [Re: Hal Itosis]
artie505 Online


Registered: 08/04/09
> If that adds up to more than 138 +1803 (or 1941), then yeah... something got dropped.

Code:
Artie's-MacBook:~ artie$ find /usr -type f -size -1c -not -path '*/groff/*' |wc -l
    1941
Artie's-MacBook:~ artie$ 
_________________________
The new Great Equalizer is the SEND button.

In Memory Of Harv: Those who can make you believe absurdities can make you commit atrocities. ~Voltaire

#10294 - 05/31/10 10:33 PM Re: du won't run? [Re: Hal Itosis]
artie505 Online


Registered: 08/04/09
Quote:
IOW, this (original): [...] should be replaced by this:

find /usr -type f -size -1c -not -path '*/mm/*locale'

for a more accurate results list.

Code:
Arties-MacBook:~ artie$ find /usr -type f -size -1c -not -path '*/mm/*locale'
/usr/share/man/man3/bn_internal.3ssl.gz
Arties-MacBook:~ artie$

Same file as I mentioned at the very bottom of this post; I couldn't locate it with Pacifist and have no idea what it may be. (I Googled it but didn't get a single hit that wasn't incomprehensible to me.)
#10296 - 05/31/10 11:28 PM Re: du won't run? [Re: artie505]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
Originally Posted By: artie505
> If that adds up to more than 138 +1803 (or 1941), then yeah... something got dropped.

Code:
Artie's-MacBook:~ artie$ find /usr -type f -size -1c -not -path '*/groff/*' |wc -l
    1941  

:)

It may be then that the "memory alloc" message is simply to alert the operator that more than one call to the -exec utility was needed. I.e., it manages to process everything... but just alerts the operator that it wasn't all done within a single call. That could be important if -exec was calling du (ironically), because then the "total" at the end would be incorrect... and the user would need to scroll back and find where any previous du totals got printed out, and then add all the totals. . . for example.

Just to clarify a bit (hopefully):

the -exec utility {} +  syntax attempts to process all results simultaneously into a single "utility" call (i.e., xargs emulation).

the -exec utility {} \; syntax simply calls a separate instance of "utility" for each and every found item, one at a time.

So while the former might produce output equivalent to:

ls -ldF@ file1 file2 file3 file4

...the latter does this:

ls -ldF@ file1
ls -ldF@ file2
ls -ldF@ file3
ls -ldF@ file4

Obviously when it comes to multiple arguments, there may be a limit as to how long the arg list can be. [but it is far more efficient than making multiple calls, and thus much faster as well as producing nicer output in almost every case.]
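The difference between the two forms can be observed directly. The sketch below is only an illustration on a scratch directory: the inline sh stands in for "utility" and prints how many arguments it received each time it was started.

```shell
tmp=$(mktemp -d)
touch "$tmp/file1" "$tmp/file2" "$tmp/file3" "$tmp/file4"

# One line of output per invocation; each line is that invocation's argument count.
batched=$(find "$tmp" -type f -exec sh -c 'echo "$#"' sh {} +)
single=$(find "$tmp" -type f -exec sh -c 'echo "$#"' sh {} \;)

echo "batched form, arguments per call:"
echo "$batched"
echo "per-file form, arguments per call:"
echo "$single"
rm -rf "$tmp"
```

For four files, the batched form starts the utility once with all four paths, and the per-file form starts it four times with one path each.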



Originally Posted By: artie505
Same file as I mentioned at the very bottom of this post; I couldn't locate it with Pacifist and have no idea what it may be. (I Googled it but didn't get a single hit that wasn't incomprehensible to me.)

Well, it's just something to do with a "man page" anyway... so, it won't affect operation in the least (unless you try to read that page someday. No biggie, fortunately).

BTW, i ran Charles' scan tool on my MBP (two OS partitions). Nary a peep here.


Edited by Hal Itosis (06/01/10 12:13 AM)

#10352 - 06/02/10 11:22 AM Re: du won't run? [Re: Hal Itosis]
CharlesS Offline


Registered: 05/30/10
Originally Posted By: Hal Itosis
It may be then that the "memory alloc" message is simply to alert the operator that more than one call to the -exec utility was needed. I.e., it manages to process everything... but just alerts the operator that it wasn't all done within a single call.

Nope, it means exactly what it says: fts_read ran out of memory when trying to read a directory. My guess is that you've either got a really big directory somewhere on the disk, or that the disk is nearly full and there isn't much VM space.


Edited by CharlesS (06/02/10 11:23 AM)

#10359 - 06/02/10 09:08 PM Re: du won't run? [Re: CharlesS]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
Originally Posted By: CharlesS
Originally Posted By: Hal Itosis
It may be then that the "memory alloc" message is simply to alert the operator that more than one call to the -exec utility was needed. I.e., it manages to process everything... but just alerts the operator that it wasn't all done within a single call.

Nope, it means exactly what it says: fts_read ran out of memory when trying to read a directory. My guess is that you've either got a really big directory somewhere on the disk, or that the disk is nearly full and there isn't much VM space.

No guess needed... man3 is well-known for its size.

But —pray tell —what are the useful consequences of “means exactly what it says” then?

And why doesn't that same message appear when running the same find command on the same hierarchy and only swapping {} \; for {} + ?

And BTW, i never said it didn't 'mean what it says'... so when you reply with "nope", it sounds as if you're disagreeing with something, but you: a) misrepresented (or misunderstood) what I was saying in the first place... and b) offered no alternative of your own which explains why both methods produce the same answer.

I was trying to explain what that message actually meant in terms of what happened. I.e., the command's activity within the filesystem, and the results we got. Though I admit my conclusion was based on some speculation, you have yet to prove anything I said was actually wrong, and/or offer any "right" answer.

I.e., if something ran out of memory, then why/how does it produce the same results as other methods which didn't run out of memory? So far, my explanation makes sense... and you haven't even provided one which might be worthy of either debate or agreement.

/posted from my iPad.


Edited by Hal Itosis (06/02/10 10:22 PM)
Edit Reason: style

#10384 - 06/04/10 08:00 AM Re: du won't run? [Re: Hal Itosis]
CharlesS Offline


Registered: 05/30/10
fts_read is a function that iterates through a directory. "Cannot allocate memory" is error 12, ENOMEM. If you're getting that, it usually means malloc is failing to allocate memory for a variable somewhere. I suppose there's a small possibility that the paths could be using up all the memory available to the process, but since 150 KB is hardly anything, I doubt that. It could be that it's trying to fork a process with too much VM space, but the fork seems to be succeeding since the ls program is actually getting run. The argument list could be too long, except that it isn't — ARG_MAX is 262144 bytes (256 KiB), which is longer than 150 KB.

There's clearly something wrong, though, since I'm getting the same error on my machine, and my list of zero-files in /usr, /bin, and /sbin is only 2924 bytes long. There's nothing too long about that list of arguments at all, and I can even copy and paste the list straight into ls -l@ at the shell, and it works fine. But the find command does not.

Something's returning ENOMEM in the code, which most likely means that something is using up the process's available memory. Or it could be a bug in the code. Who knows. If you really want to make the find command run faster by consolidating arguments, the best way to do it is probably just to pipe to xargs. That way, you'll get the additional benefit that if the argument list does get longer than ARG_MAX, it'll split it up for you without running the command once for each line.

find <path> <search criteria> | xargs ls -l@
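One caveat with the plain pipe: xargs splits its input on whitespace, so a path containing a space arrives as two arguments. Both BSD and GNU versions of find and xargs support a NUL-delimited hand-off that avoids this. A small sketch on a scratch directory (the file names are illustrative, and the inline sh merely counts the arguments it receives):

```shell
tmp=$(mktemp -d)
touch "$tmp/plain" "$tmp/has space"

# NUL-delimited hand-off: each path arrives as exactly one argument.
count=$(find "$tmp" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

echo "arguments received: $count"
rm -rf "$tmp"
```

In the command above, that would mean `find <path> <search criteria> -print0 | xargs -0 ls -l@`.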


Edited by CharlesS (06/04/10 08:12 AM)

#10385 - 06/04/10 02:26 PM Re: du won't run? [Re: CharlesS]
Virtual1 Offline


Registered: 08/04/09
Loc: Iowa
recursion maybe? symbolic links? directory damage causing physical recursion?
_________________________
I work for the Department of Redundancy Department

#10387 - 06/04/10 08:46 PM Re: du won't run? [Re: Virtual1]
CharlesS Offline


Registered: 05/30/10
Beats me, but it's definitely not the file list being too long. I just created a folder with 5,923 files in it, all with long names. The total length of the file list is 276,718 bytes, but having find iterate them all and pass them to ls -l@ {} + does not result in the ENOMEM error. It does split the list up and launches the ls program several times, though, so this would actually be a decent substitute for xargs if it worked right, if less efficient (xargs launches the ls program twice for my test folder, whereas find -exec + launches it four times). This also means that the file list can't end up exceeding ARG_MAX, though. It's something else.

#10391 - 06/05/10 12:07 AM Re: du won't run? [Re: CharlesS]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
Originally Posted By: CharlesS
I suppose there's a small possibility that the paths could be using up all the memory available to the process, but since 150 KB is hardly anything, I doubt that. It could be that it's trying to fork a process with too much VM space, but the fork seems to be succeeding since the ls program is actually getting run. The argument list could be too long, except that it isn't — ARG_MAX is 262144 bytes (256 KiB), which is longer than 150 KB.

Well, 256 is only slightly more than 150... if say perhaps (just speculating) it needed two copies of the data: one to buffer the input it's reading, and another to store the output it's constructing (be it ASCII-betical sorting or column-width calculating, etc.), then 150 x 2 = 300. [at this point i presume my "alert the operator" guess was probably incorrect... just pointing out that it wasn't necessarily totally ridiculous either, since (so far) there doesn't seem to be an actual memory error.]


Originally Posted By: CharlesS
If you really want to make the find command run faster by consolidating arguments, the best way to do it is probably just to pipe to xargs. That way, you'll get the additional benefit that if the argument list does get longer than ARG_MAX, it'll split it up for you without running the command once for each line.

find <path> <search criteria> | xargs ls -l@

Reading those statements, it seems apparent you were not yet aware of find's new {} + syntax (if Leopard 10.5.0 can still be called "new" that is), which i briefly mentioned earlier in this very thread. Its express intent is to emulate xargs. And —in some of the measurements i've taken on occasion, since Fall 2007 —it actually surpasses xargs (...not by much mind you, but still impressive).

Here are some timed runs (please note how the arg list for du does get exceeded, as evidenced by multiple "total" lines in the output):

time du -sh /usr
544M /usr

real 0m0.154s
user 0m0.024s
sys 0m0.130s



time find /usr -type f -print0 |xargs -0 du -sc |grep $'\ttotal' |
awk 'siz+=$1 END { print siz/2000, "megabytes" }'

566704 total
155432 total
119456 total
32144 total
101280 total
96488 total
19928 total
545.716 megabytes

real 0m0.293s
user 0m0.079s
sys 0m0.248s



time find /usr -type f -exec du -sc {} + |grep $'\ttotal' |
awk 'siz+=$1 END { print siz/2000, "megabytes" }'

537488 total
61696 total
97040 total
62248 total
73424 total
23528 total
22960 total
20784 total
105616 total
46936 total
40480 total
546.1 megabytes

real 0m0.285s
user 0m0.062s
sys 0m0.234s

^ Note there that the newer find syntax actually beat xargs by a whisker, despite it having used up 4 additional calls to du (11 versus 7).
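A note on the arithmetic, as a sketch rather than a correction: assuming du is reporting its BSD default of 512-byte blocks (no -k and no BLOCKSIZE set), one MiB is 2048 blocks, so dividing by 2000 is a close approximation that reads slightly high. The snippet below re-sums the seven "total" lines from the xargs run as fixed sample data and converts with 2048.

```shell
# The seven "total" values from the xargs run, used as fixed sample data.
totals='566704
155432
119456
32144
101280
96488
19928'

# Sum the block counts, then convert assuming 512-byte blocks (2048 per MiB).
mib=$(printf '%s\n' "$totals" | awk '{ blocks += $1 } END { printf "%.1f", blocks / 2048 }')

echo "$mib MiB"
```

The same block total divided by 2000 gives the 545.716 figure shown in the run.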


By way of comparison, the old find syntax is glacially slow and abysmally inefficient (calling du for every single hit):

time find /usr -type f -exec du -sc {} \; |grep $'\ttotal' |
awk 'siz+=$1 END { print siz/2000, "megabytes" }'

48 total
240 total
152 total
0 total
48 total
:
: # 29,394 lines with a "total" (on my system)
:
8 total
8 total
176 total
8 total
554.184 megabytes

real 0m39.716s
user 0m8.726s
sys 0m28.341s

Interestingly enough, using a Finder Get Info window to measure /usr claims a calculated 1.22 GB on disk. (an error due to multi-linked files in /usr perhaps... or did they simply forget that "one block" != 1K ?)



Originally Posted By: CharlesS
There's clearly something wrong, though, since I'm getting the same error on my machine, and my list of zero-files in /usr, /bin, and /sbin is only 2924 bytes long. There's nothing too long about that list of arguments at all, and I can even copy and paste the list straight into ls -l@ at the shell, and it works fine. But the find command does not.

Something's returning ENOMEM in the code, which most likely means that something is using up the process's available memory. Or it could be a bug in the code. Who knows.

I don't know either, but I vote bug... and i think this particular sequence illustrates that possibility rather nicely (and/or contains a useful clue perhaps):

find /usr -type d -exec stat -f '%N' {} + |wc -l
   1509

find /usr -type f -exec stat -f '%N' {} + |wc -l
   29394

find /usr -type l -exec stat -f '%N' {} + |wc -l
find: fts_read: Cannot allocate memory
   3054

Now why would 3,000 symlinks cause a problem, when -exec stat previously processed nearly 30,000 files without blinking?

Here are three alternative methods of counting links:

find /usr -type l -exec stat -f '%N' {} \; |wc -l
   3054

find /usr -type l -print0 |xargs -0 stat -f '%N' |wc -l
   3054

find /usr -type l |wc -l
   3054
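For anyone who wants to repeat the cross-check somewhere other than /usr, here is a small sketch on a scratch directory. The three links are created just for the test, and ls -d is used in place of stat -f so the commands are not tied to the BSD stat.

```shell
tmp=$(mktemp -d)
touch "$tmp/target"
for n in 1 2 3; do ln -s target "$tmp/link$n"; done

# Same count three ways: one ls per link, batched through xargs, and plain find.
per_item=$(find "$tmp" -type l -exec ls -d {} \; | wc -l | tr -d ' ')
via_xargs=$(find "$tmp" -type l -print0 | xargs -0 ls -d | wc -l | tr -d ' ')
plain=$(find "$tmp" -type l | wc -l | tr -d ' ')

echo "$per_item $via_xargs $plain"
rm -rf "$tmp"
```

If the three numbers ever disagree, the method that differs is the one to look at.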


So it seems the "error" (in this case) shows up only when that newer find syntax encounters some particular condition (as yet unknown). Note that: it is the *message itself* which seems to be the actual error here, as 3054 is indeed the correct answer.

But it seems to turn up in other situations as well:



Originally Posted By: CharlesS
Beats me, but it's definitely not the file list being too long.

Having studied it further, i'll have to agree with you. But at the time, it didn't seem like the worst guess in the world. In a similar vein, i have some doubts about "means exactly what it says" at this juncture as well.


Edited by Hal Itosis (06/05/10 01:13 AM)

#10399 - 06/05/10 05:23 AM Re: du won't run? [Re: Hal Itosis]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
Two other crazy clues:

If we start here with this error...
Originally Posted By: Hal Itosis
find /usr -type l -exec stat -f '%N' {} + |wc -l
find: fts_read: Cannot allocate memory
   3054
...and then walk down the hierarchy, it turns out that (unsurprisingly) man3 must be within the encompassing path for the error to happen:

find /usr/share/man/man3 -type l -exec ls -d {} + |wc

find: fts_read: Cannot allocate memory
   1335   1335   58596

(i used ls instead of stat there, but... same difference).

Now watch this: take out the -type l parameter, and poof: no more error...

find /usr/share/man/man3 -exec ls -d {} + |wc

   6803   6804   285957

...despite the large amount of data (almost 286K worth of pathnames, involving three calls to ls as opposed to one).



Another odd couple, this time using the -size primary:

find /usr/share/man/man3 -size -24c -exec ls -d {} + |wc

   842   843   30658

find /usr/share/man/man3 -size -25c -exec ls -d {} + |wc

find: fts_read: Cannot allocate memory
   919   920   33604

Looks like the character count (of the pathname data) crosses the 32K boundary there... but i'm not sure if that was the tipping point or just coincidence.

I think the authors of find and fts need to meet each other. :)


Edited by Hal Itosis (06/05/10 05:32 AM)

#10403 - 06/05/10 05:59 AM Re: du won't run? [Re: Hal Itosis]
artie505 Online


Registered: 08/04/09
Talk about a thread evolving!
#10407 - 06/05/10 09:35 AM Re: du won't run? [Re: Hal Itosis]
CharlesS Offline


Registered: 05/30/10
Originally Posted By: Hal Itosis
Well, 256 is only slightly more than 150...

If by "slightly" we mean "almost double," then you're right.

Quote:
if say perhaps (just speculating) it needed two copies of the data: one to buffer the input it's reading, and another to store the output it's constructing (be it ASCII-betical sorting or column-width calculating, etc.), then 150 x 2 = 300. [at this point i presume my "alert the operator" guess was probably incorrect... just pointing out that it wasn't necessarily totally ridiculous either, since (so far) there doesn't seem to be an actual memory error.]

But since it's already been established that -exec + is cutting off and starting a new command line every 128 KiB or so, similar to xargs, that's clearly not the issue.

If this were designed to "alert the operator", then it would simply give an error such as "argument list too long" or something actually appropriate. This is not the case.


Quote:
Reading those statements, it seems apparent you were not yet aware of find's new {} + syntax (if Leopard 10.5.0 can still be called "new" that is), which i briefly mentioned earlier in this very thread. Its express intent is to emulate xargs. And —in some of the measurements i've taken on occasion, since Fall 2007 —it actually surpasses xargs (...not by much mind you, but still impressive).

Instead of posting snotty comments like this, you could just think about it and realize the problem with -exec + is that it evidently DOESN'T WORK RIGHT, whereas xargs does. Given the choice between taking a fraction of a second longer and actually working versus failing slightly faster, I'd go with the one that works (and if speed is what you're after, why are you using the shell anyway?).

And since -exec + cuts off the command line at around 128 KiB whereas xargs cuts off at 256 KiB, if the command line you're running takes a non-trivial time to execute, you might make back that 0.15 seconds anyway. I'd use xargs for now, until they fix find.

You were suggesting to use ; instead of + to work around this error, which would of course be quite a bit slower. I pointed out that xargs is a better solution for this case. Chill.
Quote:
bunch of fiddling with command line switches to attempt to isolate the issue

I've actually issued the same command line multiple times, and gotten failures sometimes and not others. At this point, I think that there is probably a random element involved, depending on the state of memory at the time the code is run (which is often the case with memory-related issues), and thus the only really effective way to troubleshoot it would be to go through the source code to the find tool itself with a debugger and a fine-toothed comb. If the bug could be isolated, then it could even be reported to Apple, along with a patch to fix the issue.

This would be a legitimate thing to investigate; however, I'm losing interest in this. I posted in this thread because artie505 asked for help; I did so by posting a C-based tool which should scan for HFS+ damaged files significantly faster and more accurately than using the find tool, as well as being more reliable due to the lack of reliance on possibly damaged shell tools. However, I'm starting to think it was a mistake to come here. If the thread's going to be about defending your "This error was designed to inform the operator about something that the tool automatically takes care of anyway, by reporting a completely different error" statement ad nauseam, then I'm out.

#10420 - 06/06/10 09:20 AM Re: du won't run? [Re: CharlesS]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
Originally Posted By: CharlesS
But since it's already been established that -exec + is cutting off and starting a new command line every 128 KiB or so, similar to xargs, that's clearly not the issue.

Hold on a second... “128 KiB or so” now is it? Earlier you used 256 as some official size, in the midst of dismissing my comment about 150K as being above the threshold. And now we're saying that 128 is what's “already been established”? :)

As far as “that's clearly not the issue” goes, didn't you see what i wrote above??? Here... i'll repeat a line from my previous reply, in 11 point size:
Originally Posted By: Hal Itosis
at this point i presume my "alert the operator" guess was probably incorrect.
So why are you getting all worked up over something i've already withdrawn?



Originally Posted By: CharlesS
If this were designed to "alert the operator", then it would simply give an error such as "argument list too long" or something actually appropriate. This is not the case.

Yep... but since i conceded that point already (twice in fact), your purpose in rehashing it is truly mysterious.



Originally Posted By: CharlesS
Instead of posting snotty comments like this, you could just think about it and realize the problem with -exec + is that it evidently DOESN'T WORK RIGHT, whereas xargs does. Given the choice between taking a fraction of a second longer and actually working versus failing slightly faster, I'd go with the one that works

>> snotty
Which part was snotty??? Seems like we're just trading shots here, whenever the other makes a slip.

>> DOESN'T WORK RIGHT
>> failing

Well, every example so far produces the right answer (despite the odd error message). Anyway, I totally followed the reasoning behind xargs being a worthy alternative. (in fact, xargs was long one of my favorite tools, and still is... so you're preaching to the choir there). My only objection was the way you initially worded the case for xargs, as it contained false (or misguided) reasoning. Now that you've revised the wording, i no longer object.

BTW (to all concerned), this "fts_read: Cannot allocate memory" error is apparently something new in Snow Leopard. I've never seen it in Leopard (and i use find a lot in my shell scripts, as well as on the command line).



Originally Posted By: CharlesS
At this point, I think that there is probably a random element involved, depending on the state of memory at the time the code is run (which is often the case with memory-related issues), -- I've actually issued the same command line multiple times, and gotten failures sometimes and not others.

Interesting... that i have not seen, after hours of tinkering. So —if you can post a particular command which will behave that way —please do, as i would find that clue quite compelling.



Originally Posted By: CharlesS
I posted in this thread because artie505 asked for help; I did so by posting a C-based tool which should scan for HFS+ damaged files significantly faster and more accurately than using the find tool, as well as being more reliable due to the lack of reliance on possibly damaged shell tools.

Understood and already acknowledged. I replied simply with an enthusiastic "Excellent!" — despite the fact that your post was couched with (subtle) criticisms of my efforts. I.e., yes, i already knew that IF we were to search other areas, then resource-fork-only items would produce false hits. But we weren't asked to search other areas... so the need for such a system-wide tool wasn't a consideration at the time. BTW, why is your binary so big? 50K seems huge for something doing such a basic task. Care to post the source?



Originally Posted By: CharlesS
However, I'm starting to think it was a mistake to come here. If the thread's going to be about defending your "This error was designed to inform the operator about something that the tool automatically takes care of anyway, by reporting a completely different error" statement ad nauseam, then I'm out.

Begging your pardon, but you have mischaracterized the matter entirely.
  • First off, my statement was never an assertion. You claim i said "was designed" -- but if you go back and read what i actually wrote, you'll see the phrase was "may be". So it never was anything to be defended. Your "mistake" was to dismiss it in such an offhanded manner, and then not supply anything useful to replace it which fit into the facts already seen: that there was no error in terms of items found. So you're the one who started the "nausea" here.

  • Second, twice now you have ignored what i wrote... so here is part 2 (in a large font size for easy reading):
    Originally Posted By: Hal Itosis
    Having studied it further, i'll have to agree with you.
    See that? It's from my previous reply. I.e., i dropped that theory twice (using plain English), but twice now you have picked it back up again.
Thus: a) you started it, and b) you have pursued it (without reason). My subsequent posts have been engaged in studying that fts error, not "defending" anything.


What hath Artie wrought this time?


Edited by Hal Itosis (06/06/10 09:55 AM)

#10423 - 06/06/10 12:32 PM Re: du won't run? [Re: Hal Itosis]
CharlesS Offline


Registered: 05/30/10
I'm not going to do this, sorry. I'll only answer the one question that contained actual technical content:

Originally Posted By: Hal Itosis
Originally Posted By: CharlesS
But since it's already been established that -exec + is cutting off and starting a new command line every 128 KiB or so, similar to xargs, that's clearly not the issue.

Hold on a second... “128 KiB or so” now is it? Earlier you used 256 as some official size, in the midst of dismissing my comment about 150K as being above the threshold. And now we're saying that 128 is what's “already been established”? :)

The "some official size" is ARG_MAX, which is 256 * 1024 bytes, or 256 KiB. It is well-known, and documented in /usr/include/sys/syslimits.h. If you don't have the developer tools installed, you can also use the sysctl tool to look it up. The xargs tool cuts off at ARG_MAX. -exec + seems to be cutting off at half that, hence 128 KiB, causing twice as many command lines to be executed. In neither case is the command line length able to exceed ARG_MAX.

By the way, I went through the source code to the find tool with a debugger, found the problem, and I think I've fixed it. As expected, it had nothing to do with the inputs given to the program except to the extent that random chance affected the code paths. The issue was that errno was being checked at the wrong time, thus causing spurious errors to be logged. This is a problem, though, because the code is bailing out as soon as it encounters the false error.

Oh, and it's been in there since Leopard. You probably just haven't noticed it because the conditions weren't just right to cause it.

Here is a build I made of the tool which should (hopefully) solve the issue. A patch has been sent to Apple, so if they accept the patch, then this might be fixed in some future version of OS X.

http://www.charlessoft.com/fixed_find.zip

P.S. The reason the tool is 50 KB is because it is a 64-bit universal binary, and thus contains three binaries — one for x86_64, one for i386, and one for ppc. 50 KB is not large, by the way — even Hello World is 37.2 KB if you compile it as a tri-binary.

#10427 - 06/06/10 08:39 PM Re: du won't run? [Re: CharlesS]
MacManiac Offline
Moderator

Registered: 08/04/09
Loc: Paradise....on the central Ore...
Charles,

Thank you for your continued participation. We all know how valuable your time is and when you choose to invest it here everyone benefits greatly.

I think it's ironic and insightful that through this thread we were able to discover / correct a bug in the source code which runs behind the scenes in Snow Leopard & Leopard. I hope that Apple gives credit when they incorporate the fix.
_________________________
Freedom is never free....thank a Service member today.

#10437 - 06/08/10 01:27 AM Re: du won't run? [Re: CharlesS]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
Originally Posted By: CharlesS
The "some official size" is ARG_MAX, which is 256 * 1024 bytes, or 256 KiB. It is well-known, and documented in /usr/include/sys/syslimits.h. If you don't have the developer tools installed, you can also use the sysctl tool to look it up.

And/or getconf(1) too:

$ getconf ARG_MAX
262144
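As a quick check of the figures being discussed, the limit and its half can be printed in KiB. This is only a convenience sketch; the value depends on the system it runs on, and 262144 is what the Mac OS X machines in this thread report.

```shell
# ARG_MAX in bytes, plus the limit and half of it expressed in KiB.
arg_max=$(getconf ARG_MAX)

echo "ARG_MAX: $arg_max bytes ($((arg_max / 1024)) KiB)"
echo "half:    $((arg_max / 2)) bytes ($((arg_max / 2048)) KiB)"
```

For 262144 that works out to 256 KiB and 128 KiB respectively.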



Originally Posted By: CharlesS
The xargs tool cuts off at ARG_MAX. -exec + seems to be cutting off at half that, hence 128 KiB, causing twice as many command lines to be executed. In neither case is the command line length able to exceed ARG_MAX.

The bit that caused me to smile back there was where you said 128 KiB had "already been established" —yet, that was the very first mention of "128" in this thread.


Originally Posted By: CharlesS
By the way, I went through the source code to the find tool with a debugger, found the problem, and I think I've fixed it. As expected, it had nothing to do with the inputs given to the program except to the extent that random chance affected the code paths. The issue was that errno was being checked at the wrong time, thus causing spurious errors to be logged. This is a problem, though, because the code is bailing out as soon as it encounters the false error.

Oh, and it's been in there since Leopard. You probably just haven't noticed it because the conditions weren't just right to cause it.

Here is a build I made of the tool which should (hopefully) solve the issue. A patch has been sent to Apple, so if they accept the patch, then this might be fixed in some future version of OS X.

http://www.charlessoft.com/fixed_find.zip

Outstanding. It seems to work properly. (thanks)
I take no credit for your passionate perseverance. ;)

#10442 - 06/08/10 06:46 AM Re: du won't run? [Re: Hal Itosis]
CharlesS Offline


Registered: 05/30/10
Originally Posted By: Hal Itosis
Originally Posted By: CharlesS
The xargs tool cuts off at ARG_MAX. -exec + seems to be cutting off at half that, hence 128 KiB, causing twice as many command lines to be executed. In neither case is the command line length able to exceed ARG_MAX.

The bit that caused me to smile back there was where you said 128 KiB had "already been established" —yet, that was the very first mention of "128" in this thread.

It had already been established that it was cutting off earlier than xargs, which was the point. Since it generated about twice as many command lines when run on a large number of files with approximately equal lengths, it implied it was cutting off at about half the limit. 256 / 2 == 128. A little bit of testing seemed to support this. Sorry if I wasn't clear.

If you want, you can easily verify that. Just create a small program (or script, that works too) that logs the length of the argument list it's given, and have find pass its arguments to it. When I did it, they all ended up coming in right around 131072 bytes, or 128 KiB. I don't actually know if it's a hard limit or not (maybe not, since some seem to go slightly over 131072 when the null byte at the end of each string is taken into account), but they all seem to hover around that value. I'd have to look at the code to see what exactly it's doing, but I don't have time, and it really doesn't matter much anyway.
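
A minimal sketch of such a script, in POSIX shell. The name `arg_bytes` is made up for illustration, and counting one extra byte per argument for the terminating null is an assumption about how the length is measured.

```sh
#!/bin/sh
# arg_bytes (hypothetical name): print the argument count and total size.
# The body runs in a subshell so the LC_ALL change stays local to the function.
arg_bytes() (
    LC_ALL=C    # make ${#arg} count bytes rather than multibyte characters
    total=0
    for arg in "$@"; do
        # +1 for the null byte that terminates each argument string
        total=$((total + ${#arg} + 1))
    done
    printf '%d args, %d bytes\n' "$#" "$total"
)

# When saved as a script, report on whatever arguments the script was given
arg_bytes "$@"
```

Saved as an executable file (say, a hypothetical ~/arg_bytes.sh), it can be run as `find /usr -type f -exec ~/arg_bytes.sh {} +` or as `find /usr -type f -print0 | xargs -0 ~/arg_bytes.sh`. Each output line corresponds to one command line that was built, so the two approaches can be compared directly.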

Anyway, I think this thread has served its purpose, so I'm done here.

#10761 - 06/27/10 09:36 AM Re: du won't run? [Re: CharlesS]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
As an addendum here, i have encountered a thread where a user's Terminal doesn't start up... and the window's titlebar indicates it's stuck on the login command. That particular thread hasn't played out as yet, but i can certainly envision a situation wherein the file /usr/bin/login itself is one of the corrupted items... which would render Terminal.app unusable.

In such cases, one could employ AppleScript Editor (until a self-contained app comes along):

do shell script "/full/path/to/find_sl_damaged_files > ~/Desktop/report.txt"

...where the user will need to supply the proper "/full/path/to" part.


EDIT: hmm, or perhaps Terminal's "New Command..." menu item would still be workable (without a login shell).


Edited by Hal Itosis (06/27/10 06:35 PM)
Edit Reason: on second thought . . .

#10768 - 06/27/10 08:27 PM Re: du won't run? [Re: Hal Itosis]
Virtual1 Offline


Registered: 08/04/09
Loc: Iowa
If login was broken, I would expect anything that used the shell, including AppleScript events, not to work. Anything that had to log in, anyway.

_________________________
I work for the Department of Redundancy Department

#10770 - 06/27/10 11:11 PM Re: du won't run? [Re: Virtual1]
Hal Itosis Offline


Registered: 09/03/09
Loc: 10.6.8 (build 10K549)
Not all shells are "login" shells (or interactive for that matter), but idunno.
Perhaps we can get Artie to rebork his files and try it. ;) Or heck...
we can just remove the -x bits* from /usr/bin/login and test it out.

Maybe later. (or will you do the honors now?... i see you're up late too).


EDIT: *yikes... that might require DURP to fix if you're right. Perhaps it's better to just move /usr/bin/login somewhere, as that would be easier to "undo". :D

--

Rats... i don't know how anything works. I just moved /usr/bin/login to my desktop and Terminal launches same as ever.

Proof:

$ type -a login
-bash: type: login: not found

I don't get it... if there's an alternate (builtin), why doesn't type -a list it?
Does the /bin/bash binary contain its own "internal" login perhaps? ::shrug::
[i guess we can agree if /bin/*sh were all corrupt, then it would be pretty hopeless]

Night-night.


Edited by Hal Itosis (06/27/10 11:33 PM)

#10773 - 06/28/10 12:35 AM Re: du won't run? [Re: Hal Itosis]
artie505 Online


Registered: 08/04/09
Originally Posted By: Hal Itosis
Perhaps we can get Artie to rebork his files and try it. ;)

One of the fascinating things about this issue is that there's no consistency to which files wind up borked after a clone; sometimes it's a handful, sometimes a bushel basket (almost invariably, different files are borked after each clone), and sometimes everything comes out as expected.

The bug in the applicable rsync code remains under investigation.
_________________________
The new Great Equalizer is the SEND button.

In Memory Of Harv: Those who can make you believe absurdities can make you commit atrocities. ~Voltaire


Moderator:  alternaut, dkmarsh, joemikeb