
Comments for For each File

20 Jul 2004 18:18 sunaku

Re: Advantages over find?


> This is a limitation of the GNU Bash shell as there is currently no way to
> express the ASCII NUL character '\0' in its scripting language. Also, such
> arbitrary filenames with newlines in them are a rather special case in most
> POSIX systems. Thus, in accordance with the common Engineering design rule:
> "make the common case fast", the handling of newlines in filenames is best
> left to the 'xargs' tool.

The above statement is false.

This tool can handle file names that contain newline characters, as was shown in a comp.unix.shell newsgroup post (Message-ID: <2m5ifuFjctd8U1@uni-berlin.de>). Furthermore, a Bash script can generate the NUL character with the printf builtin: printf 'NUL is \000'. (A Bash string cannot itself hold a NUL, so echo $'NUL is \000' stops short of it.)
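
The following is a minimal sketch, not taken from For each File itself, of how a Bash loop can consume NUL-delimited names so that embedded newlines survive. It assumes Bash (for read -d '' and process substitution) and a find that supports -print0; the function name count_files is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the regular files under "$1", reading
# NUL-delimited names so that a newline inside a name is not a separator.
count_files() {
    local f n=0
    while IFS= read -r -d '' f; do
        n=$((n + 1))
    done < <(find "$1" -type f -print0)
    printf '%d\n' "$n"
}

# Demo: one ordinary name and one name containing a newline.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/$(printf 'two\nlines.txt')"
count_files "$demo_dir"
rm -rf "$demo_dir"
```

A loop that split find's output on newlines would see three entries here instead of two.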

17 Jul 2004 13:01 sunaku

Re: Advantages over find?
My previous response to this question contains errors. Below is an updated and correct response from the FAQ (located at: http://ff-bash.sourceforge.net/docs/faq.html).

There is one advantage, which can be observed under the following condition: find -exec is used to invoke an intermediate shell, as in $ find . -exec sh -c '...' \;. In that case it expends more system resources than For each File, because for each file it handles, find -exec forks a child process (the intermediate shell) and thereby causes the operating system to perform a context switch. In contrast, For each File invokes a GNU Bash function that is already loaded into memory, and thus does not cause a context switch. This reduces the number of process forks and expends fewer system resources than find -exec under said condition.
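
The contrast described above can be sketched roughly as follows. This is an illustration under assumptions, not the tool's actual code: the names handle, per_file_shell and in_process are hypothetical, and the in-process variant only looks at the files directly inside the given directory, whereas find recurses.

```shell
#!/usr/bin/env bash
# Hypothetical per-file action: print the file's base name.
handle() {
    printf '%s\n' "${1##*/}"
}

# (a) find forks one intermediate shell for every file it handles.
per_file_shell() {
    find "$1" -type f -exec sh -c 'printf "%s\n" "${1##*/}"' sh {} \;
}

# (b) A single shell process calls an already-loaded function per file.
in_process() {
    local f
    for f in "$1"/*; do
        if [ -f "$f" ]; then
            handle "$f"
        fi
    done
}
```

Running both under the shell's time keyword on a large directory (for example, time per_file_shell some_dir versus time in_process some_dir) is one way to observe the difference on a given system.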

10 Jul 2004 14:52 loh

Re: Advantages over find?


> For example, renaming .tgz to .tar.gz
> can be done with a simple command line:
>
> find $some_path -type f -name \*.tgz | sed 's/\(.*\)\.tgz/mv & \1.tar.gz/' | sh

You could also use this command:

tgz -> tar.gz

for i in *tgz; do mv "$i" "$(echo "$i" | sed 's/tgz$/tar.gz/')"; done

tar.gz -> tgz

for i in *tar.gz; do mv "$i" "$(echo "$i" | sed 's/tar\.gz$/tgz/')"; done

or, if you want to use more complicated mechanisms to find files and therefore want to use the `find' command, use this (note that the loop splits find's output on whitespace, so it only suits names without spaces):

for i in `find . -type f [--further-options] -name "*tgz"`; do mv "$i" "$(echo "$i" | sed 's/tgz$/tar.gz/')"; done
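
As a possible alternative to piping each name through sed, the shell's own suffix-removal expansion can do the renaming. This is a sketch only; the function name rename_tgz is hypothetical and it handles just the files directly inside the given directory.

```shell
# Hypothetical helper: rename every *.tgz directly inside "$1" to *.tar.gz.
rename_tgz() {
    local f
    for f in "$1"/*.tgz; do
        [ -e "$f" ] || continue          # the glob matched nothing
        mv -- "$f" "${f%.tgz}.tar.gz"    # strip the .tgz suffix, add .tar.gz
    done
}
```

Because every expansion is quoted, names containing spaces are renamed correctly, and no external process other than mv is started per file.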

19 Jun 2004 15:04 sunaku

Re: Advantages over find?


> Why such -- rather complicated -- bash magic?

Making use of the shell's builtin constructs yields better performance than invoking external programs.

> What's so bad on just teaching people clever use of
> standard tools as sh(1), find(1), xargs(1), sed(1) etc?

I agree. Please note that this tool is not meant to replace any existing tools; it simply aims to facilitate alternative approaches to performing filesystem manipulation tasks.

> For example, renaming .tgz to .tar.gz can be done with a simple command line:
>
> find $some_path -type f -name \*.tgz | sed 's/\(.*\)\.tgz/mv & \1.tar.gz/' | sh

That's a good solution. Notice that you have also used an intermediate shell, 'sh', in solving the example problem. This observation is detailed in my response to the question at the root of this thread.

> When unsure, run that command without the trailing | sh to see what would
> happen. Or run it with | sh -x to see *what* actually happens.
>
> Note that this example is a *oneliner*, so with a little bit more magic, you
> could easily make it robust against weird filenames (containing whitespace,
> quotes and the like). However, neither ff nor a combination of find(1),
> sed(1) and sh(1) as above can handle filenames with newlines. You've to use
> find ... -print0 | xargs -0 for this, which unfortunately is a GNU extension.

This is a limitation of the GNU Bash shell as there is currently no way to express the ASCII NUL character '\0' in its scripting language. Also, such arbitrary filenames with newlines in them are a rather special case in most POSIX systems. Thus, in accordance with the common Engineering design rule: "make the common case fast", the handling of newlines in filenames is best left to the 'xargs' tool.

> Ciao,
> Kili

Thanks.

19 Jun 2004 03:33 mkilian

Re: Advantages over find?
Why such -- rather complicated -- bash magic?


What's so bad on just teaching people clever use of
standard tools as sh(1), find(1), xargs(1), sed(1) etc?


For example, renaming .tgz to .tar.gz can be done
with a simple command line:


find $some_path -type f -name \*.tgz | sed 's/\(.*\)\.tgz/mv & \1.tar.gz/' | sh


When unsure, run that command without the trailing |
sh to see what would happen. Or run it with | sh -x to
see *what* actually happens.
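
A small sketch of that dry run: the sed expression from the pipeline above is wrapped in a function (the name tgz_rename_commands is hypothetical) and fed two made-up paths, so the generated commands can be inspected without anything being executed.

```shell
# Hypothetical wrapper: turn each path on stdin into the mv command that
# the full pipeline would hand to sh.
tgz_rename_commands() {
    sed 's/\(.*\)\.tgz/mv & \1.tar.gz/'
}

# Dry run on two example paths; nothing is renamed.
printf '%s\n' ./a/foo.tgz ./bar.tgz | tgz_rename_commands
# prints:
#   mv ./a/foo.tgz ./a/foo.tar.gz
#   mv ./bar.tgz ./bar.tar.gz
```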


Note that this example is a *oneliner*, so with a little
bit more magic, you could easily make it robust
against weird filenames (containing whitespace,
quotes and the like). However, neither ff nor a
combination of find(1), sed(1) and sh(1) as above
can handle filenames with newlines. You've to use
find ... -print0 | xargs -0 for this, which unfortunately is
a GNU extension.


Ciao,


Kili

22 Mar 2004 18:53 sunaku

Re: Advantages over find?

> Of course looping in the shell isn't
> recursive and limited by the argument
> buffer (so is calling ff *.tgz though).

Good point.

A command line option, which allows a list of targets (non-directories or directories) to be read from a file, will be added in the next release. This will enable the user to bypass the size limit of the argument buffer.

Thanks.

22 Mar 2004 16:34 akhasha

Re: Advantages over find?
I don't think the argument for use of "ff" is so clear cut. I often just use:

for i in *.tgz; do tar xvzf "$i"; done

Or, depending on the argument processing, use find with xargs:

find . -name '*.txt' | xargs grep -i caveat

Xargs BTW can limit the number of simultaneous subprocesses with -P (which defaults to one) and the number of arguments passed to each instance with -n (without it, xargs packs as many arguments as fit onto each command line).
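
A quick sketch of how -n batches arguments, using echo as a stand-in command. The function name batch_args is hypothetical. The -P option is left out here because it is an extension that not every xargs provides.

```shell
# Hypothetical helper: run echo once per batch of at most "$1" arguments
# read from stdin.
batch_args() {
    xargs -n "$1" echo
}

printf '%s\n' a b c d e | batch_args 2
# prints:
#   a b
#   c d
#   e

# Without -n, all five arguments fit on one command line.
printf '%s\n' a b c d e | xargs echo
# prints:
#   a b c d e
```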

Of course looping in the shell isn't recursive and limited by the argument buffer (so is calling ff *.tgz though). While not so limited, the "find|xargs" solution spawns subprocesses so "ff" does seem to have a niche - but perhaps not so large as made out.

16 Mar 2004 10:46 sunaku

Re: Advantages over find?

> What's the advantage of using this
> instead of find -exec?

In comparison to using 'find -exec', key advantages are:
- better performance (less system resources used)
- flexibility (user can perform tasks without a separate script)
- simplicity (saves time and effort)

In order to justify these claims, consider the following cases.

(1) Case 1: Executing an existing script.

For every file it handles, 'find -exec' forks a separate child-process to execute the user's commands. On a large input size, this behavior yields poor performance.

This tool uses fewer system resources, and thereby yields better performance, by employing only one Bash process (in both recursive and non-recursive modes). Separate child processes are not forked because user commands are evaluated by the same Bash process.

This tool also provides preset scripting variables to save users' time. Although it is a trivial matter to assemble these variables within a script, it becomes wasteful and tedious for quick tasks.


(2) Case 2: Performing complex tasks at the terminal without a separate script.

As the complexity of the problem increases, using 'find -exec' to implement solutions at the terminal becomes cryptic and time consuming. One possibility is to write a separate script and use it with 'find -exec'. This tool, however, already puts the power of the Bash shell at the user's disposal, without the user having to manually create a separate script each time.

Using this tool, the user can, but is not forced to, enter scripting commands on one line. Entering commands is often made more efficient when this tool is launched with the editor option. This option allows the user to enter scripting commands via their favorite text editor, rather than fiddling with a one-line script at the terminal.

Consider the following examples.

(2a) Example A: We wish to extract all tar-ball files in the current working directory.

$ ls
bar.tgz foo.tgz


(2a-1) Using this tool:

$ ff *tgz
tar -zxf "$f"


(2a-2) Using 'find -exec':

$ find . -name '*tgz' -exec tar -zxf '{}' ';'


(2b) Example B: We wish to rename the file extension 'tgz' to 'tar.gz' and display the original filename.

$ ls some_tgz/tgz_dir/
bar.tgz b a z.tgz foo.tgz


(2b-1) Using this tool:

$ ff some_tgz/tgz_dir/*tgz
mv "$f" "$d/${fn/%tgz/tar.gz}"
echo "$fn"


(2b-2) Using 'find -exec':

$ find some_tgz/tgz_dir/ -name '*tgz' -exec sh -c 'mv "$1" "$(dirname "$1")/$(basename "$1" | sed "s/tgz/tar.gz/")"; echo "$1"' sh '{}' \;

Notice that we invoked a shell with 'find -exec' for this task. It is possible to perform the task without the aid of an intermediate shell, but one can imagine how time consuming and error prone that would be.


(3) In conclusion, this tool gives the user flexibility while saving them time and maintaining low usage of system resources.

14 Mar 2004 12:37 Ullerup

Advantages over find?
What's the advantage of using this instead of find -exec?
