
Shell Command Language

Migrating from the System V Shell to the POSIX Shell

This paper considers the effects of new features of the POSIX Shell command language included with XPG4 and the Single UNIX Specification. In most cases these offer opportunities for new applications to be written with more reliance on the shell itself and less on the utilities. In some cases, however, these new features require subtle changes to existing applications. Where appropriate, differences are highlighted between the POSIX Shell and the traditional System V shell that was previously included in XPG3.

Naming Considerations

Identifiers

The letters in portable names are restricted to those in the portable character set; this is not stated in the XPG3 description of the traditional System V Shell. Alias names can also include the characters:

! % , @

Implementations supporting additional characters document whether those alphabetics can be used in names and aliases.

Operators

The symbol (( is reserved as a control operator on some systems. Therefore, nested sub-shells that begin with (( must be separated. For example, convert:

((echo hello);(echo world))

to:

( (echo hello);(echo world))

Some systems (for example, those using the KornShell to implement the shell) may treat these as invalid arithmetic expressions instead of subshells. Note that this requirement does not force the separation of )) because the shell is able to distinguish the termination of arithmetic from that of nested subshells.

The ! character is now a reserved word that complements the results from a pipeline. This was previously a valid, although unportable, utility name.

The new redirection operators >| and <> are added, but these were not valid shell syntax previously.

The curly-brace { } characters are designated as possible control operators in a future issue; they are currently reserved words. Portable applications should begin quoting them now if they are to represent literal characters because this quoting may be required in the future. For example, (using one quoting mechanism), convert:

echo {Hi}

to:

echo \{Hi\}

The four words: function, select, [[ and ]] are reserved and cannot be used where a reserved word would be recognised, such as a command name.

Selecting Command Interpreters

Some systems have supported a kernel feature that caused special treatment of shell scripts beginning with the characters #!. This was used to select a command interpreter. For example, it is common to see a script beginning with:

#!/bin/sh

which means, "run this using the program /bin/sh, even if another shell is in use". This was never strictly portable (because the absolute pathname /bin/sh is not guaranteed on a system), and a fully portable program must not rely on it; the system may treat it only as a comment.

Aliases

Aliases are a new facility and should cause no forward compatibility problems.

However, if there is concern about the user setting up an environment where utility names do things unintended by the application, note that aliases can be brought into a shell script through common implementation extensions. A way to guard against command names expanding into aliases is to quote them. For example, a very common alias that an interactive user might set up is:

alias ls="ls -CF"

This would disrupt shell scripts such as:

ls | pax ...

The previous ls could be replaced by l\s or an unalias ls command could be issued.

Reserved Command Names

Because of new utilities and other changes, the following are no longer valid names for local commands, unless they are invoked with a pathname containing a slash:

Parameters, Variables and Word Expansions

IFS

The treatment of the value of the IFS variable and its use in field splitting are changed:

  1. The unset special built-in can unset IFS . If not set, IFS behaves as if it were:
    <space><tab><newline>

  2. When "$*" is expanded, the first character of IFS is used as a separator. Although this is as stated in the XPG3 description of the traditional System V Shell, historical systems actually used a single space character in this instance.

  3. If IFS is null, there is no separator, for example:
    
    
    $ IFS=""
    $ set a b c
    $ echo "$*"
    abc

    Once again, previous systems used a space character.

  4. When IFS is set to a value other than:
    <space><tab><newline>

    the characters within IFS that are not white space act as field separators. For example, if a file /etc/passwd contained the first line:

    root::0:0::/:/bin/sh

    The script:

    
    
    IFS=":" read a b c < /etc/passwd
    echo $b

    would have previously produced 0, but it now produces a null value, indicating that each instance of a colon delimited a field.

  5. Field splitting with IFS occurs only after parameter expansion, command substitution or as part of the read command. This prevents certain security loopholes. Previously, on some systems the following input:
    $ IFS=o
    $ violet

    would invoke vi to edit file let.

    No portable application should have relied on this behaviour.
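The joining and splitting rules above can be tried out directly. The following is a minimal sketch, not a definitive test: the function names join_args and split_passwd are hypothetical, introduced only for illustration, and each body runs in a subshell so that the caller's IFS is left untouched.

```shell
# join_args SEP ARG...: join the arguments using SEP as the first
# (and only) character of IFS. A null SEP yields no separator at all.
join_args() (
    IFS=$1
    shift
    printf '%s\n' "$*"
)

# split_passwd: read one colon-delimited line from standard input and
# show how it is split into three variables. Every colon delimits a
# field, so adjacent colons produce an empty field.
split_passwd() (
    IFS=: read -r a b c
    printf '[%s][%s][%s]\n' "$a" "$b" "$c"
)

join_args : a b c                                # a:b:c
join_args '' a b c                               # abc
printf 'root::0:0::/:/bin/sh\n' | split_passwd   # [root][][0:0::/:/bin/sh]
```

The last variable named to read receives the remainder of the line, which is why the third field still contains colons.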

Tilde Expansion

Previously, it was unportable but valid to have files named with a leading tilde character (~). Now, the use of such files in scripts should have quoting for the leading tildes, because the first component may match a login ID. For example, the first command should be converted to one of the following three:


cat ~jan ~feb ...
cat "~jan" "~feb" ...
cat \~jan \~feb ...
cat ./~jan ./~feb ...

Although tilde expansion was not used previously by portable applications, a common KornShell extension must be avoided:

PATH=~dwc/bin:~maw/bin

This does not expand the second tilde (because it does not start a word). Use one of the following instead:

PATH=~maw/bin
PATH=~dwc/bin:$PATH

PATH=$(printf %s ~dwc/bin : ~maw/bin)

Parameter Expansion

Five new forms of parameter expansion are added that yield string lengths and remove prefix or suffix patterns:

${#parameter}
String length.

${parameter%word}
Remove smallest suffix pattern.

${parameter%%word}
Remove largest suffix pattern.

${parameter#word}
Remove smallest prefix pattern.

${parameter##word}
Remove largest prefix pattern.

These can be used to replace some of the existing expr and sed calls in existing scripts and improve performance and readability in many cases.
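As a rough illustration of how these forms can stand in for calls to utilities such as dirname, basename and expr, the sketch below takes a pathname apart entirely within the shell. The pathname and the variable names are hypothetical.

```shell
path=/usr/local/lib/libfoo.so.1

dir=${path%/*}      # smallest suffix "/*" removed:  /usr/local/lib
base=${path##*/}    # largest prefix "*/" removed:   libfoo.so.1
stem=${base%%.*}    # largest suffix ".*" removed:   libfoo
ext=${base#*.}      # smallest prefix "*." removed:  so.1
len=${#base}        # string length:                 11

printf '%s\n' "$dir" "$base" "$stem" "$ext" "$len"
```

No process is created for any of these expansions, which is where the performance gain over expr and sed comes from.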

The rules for parameter expansion with double-quotes:

"${...}"

are changed to require that any single- or double-quotes must be paired within the curly-braces. A consequence of this rule is that single-quotes cannot be used to quote the } within:

"${...}"

For example:

unset bar
foo="${bar-'}'}"

is invalid because the:

"${...}"

substitution contains an unpaired unescaped single-quote. The backslash can be used to escape the } in this example to achieve the desired result:

unset bar
foo="${bar-\}}"

Some systems have allowed the end of the word to terminate the backquoted command substitution, such as in:

"`echo hello"

This usage is undefined; the matching backquote is required by the specification. The other undefined usage can be illustrated by the example:

sh -c '` echo "foo`'

Command Substitution

Backquoted command substitution must be terminated by a backquote:

`...`

Some shells allowed the end of a file or string silently to delimit the command substitution.

A new form of command substitution is introduced that makes using quoting and nesting rules easier than with the back-quote method:

$(...)

It is unnecessary for any scripts to be converted to this new form, but new script writers may find it easier and more logical to use the new form for any complex constructs.
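A small sketch of the difference in nesting: both assignments below compute the same value, but only the backquoted form needs the inner substitution escaped. The pathname is hypothetical.

```shell
# Old form: the inner backquotes must be escaped with backslashes.
old=`basename \`dirname /usr/local/bin/tool\``

# New form: the same nesting needs no extra escaping.
new=$(basename "$(dirname /usr/local/bin/tool)")

printf '%s %s\n' "$old" "$new"      # bin bin
```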

If the new form is used to execute a subshell, care must be taken to remove any ambiguity arising from arithmetic expansion. For example, if a utility named: 1+2 is written, the command:

$((1+2))

is then ambiguous. It must be written portably as:

$( (1+2))
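The same separation applies whenever a command substitution begins with a subshell. The sketch below, which assumes only the standard wc utility, keeps a space after the opening $( and then uses a genuine arithmetic expansion on the result to show the two constructs side by side.

```shell
# The space after "$(" makes this unambiguously a command substitution
# that starts with a subshell, not an arithmetic expansion.
count=$( (echo dog; echo cat; echo mouse) | wc -l )

# A real arithmetic expansion; it also drops any padding wc may add.
count=$(($count + 0))

printf '%s\n' "$count"      # 3
```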

Single-quotes cannot be used to quote the } within:

"${}"

Arithmetic Expansion

Arithmetic expansion is a new feature without forward compatibility problems. It can be used to simplify existing shell arithmetic that involves the expr utility.
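As an illustration, the loop below is a sketch of counter and accumulator arithmetic done with arithmetic expansion; the comments show the expr calls it would replace. The variable names are arbitrary.

```shell
i=0
total=0
while [ "$i" -lt 5 ]; do
    i=$(($i + 1))                 # previously: i=`expr $i + 1`
    total=$(($total + $i * 2))    # previously: total=`expr $total + $i \* 2`
done
printf '%s %s\n' "$i" "$total"    # 5 30
```

Besides avoiding a process per calculation, the new form does not need the multiplication operator to be escaped from pathname expansion.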

Redirection

Multi-digit file descriptors are now allowed syntactically, although a portable application cannot use numbers higher than 9, because the shell can reserve all higher numbers for its own use.

The new noclobber version of redirection can be used for creating lock files in applications or determining that a file can be created safely without replacing an existing file. For example, consider an application that wishes to save a copy of a file before it edits it:

cat $1 > $1.back

On a traditional file system with 14-byte filenames, if $1 is ten bytes or larger, this command could erase another file. If $1 is 14 bytes, it would erase the original file instead of copying it. Now, the following can be written:

set -C    # set noclobber mode
if > $1.back; then
    echo Backup copy can be created
else
    echo Backup attempt will fail
fi

The noclobber mode is even more useful for interactive users who wish to prevent inadvertent destruction of their files. They would then have to use the >| operator to overwrite a file deliberately.
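The lock-file use mentioned above might be sketched as follows. The function name acquire_lock and the lock path are hypothetical; the sketch assumes that the lock is released simply by removing the file.

```shell
# acquire_lock FILE: succeed only if FILE did not exist and was created.
# The subshell keeps noclobber mode from leaking into the caller.
acquire_lock() {
    ( set -C; : > "$1" ) 2>/dev/null
}

lockfile=${TMPDIR:-/tmp}/demo.lock.$$     # hypothetical lock path
rm -f "$lockfile"

if acquire_lock "$lockfile"; then
    first=acquired
else
    first=busy
fi
if acquire_lock "$lockfile"; then
    second=acquired
else
    second=busy
fi
printf '%s %s\n' "$first" "$second"       # acquired busy
rm -f "$lockfile"                         # release the lock
```

The second attempt fails because the file already exists and the redirection refuses to overwrite it.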

The new <> operator has been supported on many implementations, but not documented. It should cause no compatibility problems, but its use is rather specialised.

The XPG3 description of the traditional System V Shell allowed here-documents to be terminated by the end of the script file, as in the following example of a two-line file:

cat <<EOF
Hi

A fully portable script requires a proper delimiter.

The System V shell and the KornShell have differed historically on pathname expansion of an argument word; the former never performed it, the latter only when the result was a single field (file). As a compromise, it was decided that the KornShell capability was useful, but only as a shorthand device for interactive users. It is not reasonable to write a shell script such as:

cat foo > a*

Therefore this is not permitted.

Shell Commands

When a command names a utility that cannot be found, there is no assurance that this aborts a script. A portable script must be written to test the exit status of each command it considers critical before proceeding to the next step.

Historically, shells have returned an exit status of 128+n, where n represents the signal number. Since signal numbers are not standardised, there is no portable way to determine which signal caused the termination. Also, it is possible for a command to exit with a status in the same range of numbers that the shell would use to report that the command was terminated by a signal. Therefore, a portable script cannot rely on determining the exact cause of a command failure when a signal is received.

There is a historical difference in sh and ksh non-interactive error behaviour. When a command named in a script is not found, some implementations of sh exit immediately, but ksh continues with the next command. Thus, the specification says that the shell may exit in this case. This puts a small burden on the programmer, who has to test for successful completion following a command, when it is important that the next command not be executed if the previous is not found. When it is important for the command to be found, it is probably also important for it to complete successfully. The test for successful completion does not need to change.
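One way to make the test for successful completion explicit is a small wrapper. The following is a sketch under the assumption that any non-zero status should stop the sequence; the name run_critical is hypothetical.

```shell
# run_critical CMD [ARG...]: run a command and report failure explicitly
# instead of relying on the shell to abort the script.
run_critical() {
    "$@" || {
        status=$?
        echo "failed with status $status: $*" >&2
        return "$status"
    }
}

# The second step fails, so the sequence stops there.
if run_critical true && run_critical false 2>/dev/null; then
    result=continued
else
    result=stopped
fi
printf '%s\n' "$result"     # stopped
```

The wrapper reports only that a step failed and with which status; as noted above, it cannot portably say whether a signal was the cause.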

With the System V shell, all built-ins are treated as special built-ins, which causes them to exhibit the special behaviour listed in the specification, Section 2.14, Special Built-in Utilities (such as the difference in how variable assignments stay in effect). Earlier versions of System V and BSD systems did not implement the common echo, pwd and test as built-ins (regular or special), so these older systems are actually closer to the current specification. The differences between the behaviour of built-ins and other utilities are not documented in the SVID.

Command Search

The rules for command search make explicit the differences between special and regular built-ins. Previously, regular built-ins had different characteristics from file-system utilities, but the portable application could not predict which utilities were which. So, it was impossible to write a shell function with the name of a common utility because that utility might be built-in and the function would never be accessed.

In XPG4, an application cannot discern between regular built-ins and file-system utilities (unless it is able to check for performance differences). All utilities, other than the special built-ins, can be replaced with functions. All utilities, other than the special built-ins, can be used as if they were in the file system by commands such as:

nohup utility
find . -exec utility \;
ls * | xargs utility

It is important to understand that some utilities only affect or understand their own shell environment, not their parent's; commands such as:

(cd /tmp)
nohup kill %1
env wait

are valid, but not very useful.

The new command utility can be used to suppress function lookup.
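A sketch of the typical use: a function that replaces a utility of the same name and still needs to call the real one. The wrapper's output format is purely illustrative.

```shell
# A function that shadows the pwd utility. Without "command", the call
# inside the body would find the function again and recurse without end.
pwd() {
    printf 'cwd: '
    command pwd "$@"
}

pwd     # prints "cwd: " followed by the working directory
```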

Pipelines and Lists

A new reserved word, !, can be used to complement the exit status of a pipeline. For example:

if ! false; then
    echo True
fi
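A slightly more practical sketch uses ! to turn "grep found nothing" into success. The function name absent is hypothetical.

```shell
# absent PATTERN: succeed when no line of standard input matches PATTERN.
absent() {
    ! grep "$1" >/dev/null
}

if printf 'dog\ncat\n' | absent mouse; then
    verdict='no mouse'
else
    verdict='mouse found'
fi
printf '%s\n' "$verdict"    # no mouse
```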

Scripts assuming that a pipeline is (or is not) executed in a subshell must be modified to tolerate the pipeline being executed in a subshell, but not to depend on it. For example, the command:

echo dog cat mouse | read x y z
echo $x $y $z

does not work as expected on some systems because the read is invoked in a subshell and does not affect the variables in the current environment. This example could be written as:

read x y z <<eof
dog cat mouse
eof
echo $x $y $z

Scripts written assuming that the first example using read would not work are also not portable, because some shells, such as the KornShell, do run the final pipeline stage in the current environment. Such a script could use:

echo dog cat mouse | (read x y z)
echo $x $y $z

to be sure read would not affect the current environment.
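When the data comes from a command rather than from literal text, one possible approach (a sketch, not the only way) is to capture the output first and then pass it to read through a here-document, so that read runs in the current environment on any conforming shell. The variable name animals is arbitrary.

```shell
# Capture the producer's output first, then feed it to read through a
# here-document so that read runs in the current shell environment.
animals=$(echo dog cat mouse)
read x y z <<EOF
$animals
EOF
printf '%s %s %s\n' "$x" "$y" "$z"     # dog cat mouse
```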

Portable applications should avoid using the AND and OR operators, && and ||, in complex constructs without using { } or ( ) groupings to show the precedence desired. The precedence of these operators, strictly left-to-right, is different from most programming languages, where AND has higher precedence, and confusion may result. So, for example, the following three commands are equivalent (assuming the subshell effects in the second are not relevant), but the second or third is preferred:

a || b && c
(a || b) && c
{ a || b; } && c
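The difference from the usual programming-language reading can be seen in a short sketch. The first command is evaluated left to right; the second uses braces to force the grouping that a reader used to AND binding more tightly might have assumed.

```shell
# Left to right: parsed as (true || echo b) && echo c, so "c" is printed.
flat=$(true || echo b && echo c)

# Explicit grouping for the other reading, true || (echo b && echo c):
# the group is skipped entirely, so nothing is printed.
grouped=$(true || { echo b && echo c; })

printf '[%s] [%s]\n' "$flat" "$grouped"    # [c] []
```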

Pattern Matching

Pattern matching is expanded with the internationalisation features described for bracket expressions.

A period in a bracket expression may now match a leading dot in a filename. Previously this was never portable and a portable application still cannot use this form.

A leading circumflex in a bracket expression must be quoted. For example, to list filenames beginning with ^ or a, use one of the following:

ls [a^]*
ls [\^a]*
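The escaped form can be checked with a sketch like the following, which builds a scratch directory (the path is hypothetical) containing three files and counts how many names the pattern matches.

```shell
# Work in a scratch directory holding three empty files.
scratch=${TMPDIR:-/tmp}/caret_demo.$$
mkdir "$scratch"
: > "$scratch/^x"
: > "$scratch/ab"
: > "$scratch/zz"

# The escaped circumflex is an ordinary list member, so the pattern
# matches the names that start with "^" or "a".
matches=$(cd "$scratch" && set -- [\^a]* && echo $#)
printf '%s\n' "$matches"    # 2

rm -r "$scratch"
```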

Any of the shell special characters used in a pattern must be quoted or escaped. In most cases, this was already necessary to prevent their misinterpretation by the shell. For example:

ls a(b*

never worked. However, the command:

find . -name 'a(b*' -print
did work.

Now, this form requires escaping of the shell character to avoid side effects from implementation extensions:

find . -name 'a\(b*' -print

Special Built-ins

Special built-ins have special properties for error conditions, variable assignments and accessibility via the exec functions and certain commands (such as nohup). It is now possible for the application to predict these effects because the list of special built-ins is specified; although any of the standard utilities could be implemented as a regular built-in, none of them can be special built-ins.

dot

Some older implementations searched the current directory for the file, even if the value of PATH disallowed it. This behaviour is now prohibited due to concerns about introducing susceptibility to trojan horses, which the user might be trying to avoid by leaving dot out of PATH .

exec

Most historical implementations were not conformant in that:

foo=bar exec cmd

did not pass foo to cmd. It is unlikely that any application ever relied on it not being passed.

Applications relying on file descriptors > 2 being automatically closed or left open following an exec must be recoded to force the desired result.
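Forcing the desired result means opening and closing such descriptors explicitly. The sketch below assumes descriptor 3 is free for the script's use and that the scratch file path, which is hypothetical, is writable.

```shell
log=${TMPDIR:-/tmp}/fd_demo.$$      # hypothetical scratch file

exec 3> "$log"          # open descriptor 3 for writing
echo 'first line' >&3
exec 3>&-               # close it explicitly rather than relying on a default

# Writing to the descriptor now fails, confirming that it is closed.
if ( echo again >&3 ) 2>/dev/null; then
    state=open
else
    state=closed
fi

contents=$(cat "$log")
printf '%s / %s\n' "$contents" "$state"     # first line / closed
rm -f "$log"
```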

exit

In applications, the reserved exit status values 126 and 127 should be avoided, except as described in the XCU specification. Values greater than 128 should be reserved for signal terminations.

export

Instances without arguments that expect a specific portable output format must be recoded as:

export -p

Applications relying on previous output formats are not portable.

readonly

Instances without arguments that expect a specific portable output format have to be recoded as:

readonly -p

Applications relying on previous output formats are not portable.

return

The behaviour of return when not in a function or dot script differs between shells. In some shells this is an error; in others, and in XPG4, the effect is the same as exit.

The exit value given to exit cannot exceed 255.

set

In some previous shells:

set --

only unset parameters if there was at least one argument; the only way to unset all parameters was to use shift. Using the new XPG4 version should not affect existing scripts because there should be no reason deliberately to issue it without arguments; if it is issued as:

set -- "$@"

and there are in fact no arguments resulting from $@, unsetting the parameters would achieve the same effect.

The use of set + without other arguments (which is similar to set with no arguments, except that the values of the variables are not reported) is not documented in the XPG3 description of the traditional System V Shell and is no longer supported. An application requiring this could substitute:

"set | sed"

to suppress the variable values.

The -k and -t options are no longer supported and should be removed from portable scripts.

trap

Applications should be migrated to the symbolic signal names.
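A sketch of the migrated form follows. The function name make_report and the scratch file path are hypothetical; the comments show the numeric conditions that the symbolic names replace.

```shell
# make_report runs in a subshell (note the parentheses around the body),
# so the traps below apply only to it.
make_report() (
    scratch=${TMPDIR:-/tmp}/trap_demo.$$    # hypothetical scratch file
    trap 'rm -f "$scratch"' EXIT            # rather than: trap '...' 0
    trap 'exit 1' HUP INT TERM              # rather than: trap '...' 1 2 15
    echo draft > "$scratch"
    cat "$scratch"
)

make_report     # prints "draft"; the scratch file is removed on exit
```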

Scripts relying on a specific system's trap output format have to be recoded.

unset

Applications must be recoded to use unset with either -f or -v to be fully portable.
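The distinction matters when a variable and a function share a name. In the sketch below the name greeting is hypothetical, and the check with command -v is just one possible way to see whether the function still exists.

```shell
greeting=hello                  # a variable ...
greeting() { echo hi; }         # ... and a function with the same name

unset -v greeting               # removes only the variable
var_state=${greeting+set}       # empty: the variable is gone
fn_output=$(greeting)           # hi: the function is still defined

unset -f greeting               # removes only the function
if command -v greeting >/dev/null 2>&1; then
    fn_state=defined
else
    fn_state=gone
fi
printf '[%s] [%s] [%s]\n' "$var_state" "$fn_output" "$fn_state"   # [] [hi] [gone]
```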



UNIX is a registered trademark of The Open Group.

Copyright © 1997-1999 , The Open Group.