Linux shell - function

Roger · June 8, 2024 · About 21 min

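When writing shell scripts, you'll often find the same piece of code used in several places. For a small snippet that hardly matters, but rewriting large chunks of code many times in one script gets tiring.

The bash shell solves this with user-defined functions. You can wrap shell script code in a function and then use it as many times as you like, anywhere in the script.

Basic script functions

As you start writing more complex shell scripts, you'll find yourself reusing code that performs a specific task. Sometimes the code is simple, such as displaying a text message and getting an answer from the script user. Sometimes it's more involved and is needed several times as part of a larger process.

In these cases, writing the same code over and over is tedious. It would be much nicer to write the code once and refer to it whenever and wherever it's needed.

The bash shell provides exactly this feature. A function is a block of script code that you name and can reuse anywhere in the script. Whenever you need that block, you just write the function name (this is called calling the function).

Creating a function

There are two syntaxes for creating a function in a bash shell script. The first uses the function keyword, followed by the name assigned to the block of code: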

function name {
  commands
}

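name defines the unique name of the function. Each function in a script needs its own name.

commands is one or more bash shell commands that make up the function. When you call the function, the bash shell runs these commands in order, just as it would in a normal script.

The second syntax for creating a function in a bash shell script looks more like function definitions in other programming languages: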

name() {
  commands
}

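The empty parentheses after the function name indicate that a function is being defined. The naming rules are the same as for the first syntax.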
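As a minimal sketch, the two forms can be placed side by side. The names greet_kw and greet_paren are hypothetical and exist only for this illustration. Both definitions behave the same in bash; the keyword form is not part of POSIX sh, so the parentheses form is the safer choice if a script may run under another shell.

```shell
#!/bin/bash
# Two equivalent ways to define a function in bash.
# greet_kw and greet_paren are hypothetical names used only for illustration.

# Form 1: the function keyword (not part of POSIX sh).
function greet_kw {
  echo "Hello from the keyword form"
}

# Form 2: the name followed by empty parentheses (also accepted by POSIX shells).
greet_paren() {
  echo "Hello from the parentheses form"
}

# Both are called the same way: by writing the name.
greet_kw
greet_paren
```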

Using functions

To use a function in a script, write its name on a line just as you would any other shell command:

function func1 {
  echo "This is an example of a function"
}

count=1
while [ $count -le 5 ]
do
  func1
  count=$[ $count + 1 ]
done
echo "This is the end of the loop"
func1
echo "Now this is the end of the script"

./test1
# This is an example of a function
# This is an example of a function
# This is an example of a function
# This is an example of a function
# This is an example of a function
# This is the end of the loop
# This is an example of a function
# Now this is the end of the script

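Each time the name func1 is referenced, the bash shell finds the definition of func1 and runs the commands defined in it.

A function definition doesn't have to come at the very start of the script, but be careful. If you try to call a function before it has been defined, you'll get an error message: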

count=1
echo "This line comes before the function definition"

function func1 {
  echo "This is an example of a function"
}

while [ $count -le 5 ]
do
  func1
  count=$[ $count + 1 ]
done
echo "This is the end of the loop"
func2
echo "Now this is the end of the script"

function func2 {
   echo "This is an example of a function"
}

./test2
# This line comes before the function definition
# This is an example of a function
# This is an example of a function
# This is an example of a function
# This is an example of a function
# This is an example of a function
# This is the end of the loop
# ./test2: func2: command not found
# Now this is the end of the script

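The first function, func1, is defined a few statements into the script, which is perfectly fine. By the time the script calls func1, the shell knows where to find it.

The script, however, tries to call func2 before it has been defined. Because func2 doesn't exist yet at that point, the call produces an error message.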
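One way to guard against this ordering problem is to check whether a function exists before calling it. The sketch below relies on declare -F, which succeeds only when the given name is a currently defined function. The helper call_if_defined is a hypothetical name, not something bash provides.

```shell
#!/bin/bash
# call_if_defined is a hypothetical helper, not a bash builtin.
# declare -F NAME succeeds only when NAME is a currently defined function.
call_if_defined() {
  local name=$1
  shift
  if declare -F "$name" > /dev/null
  then
    "$name" "$@"
  else
    echo "function '$name' is not defined yet" >&2
    return 1
  fi
}

call_if_defined func2     # fails: func2 has not been defined at this point

function func2 {
  echo "This is an example of a function"
}

call_if_defined func2     # succeeds now that the definition has been read
```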

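Pay attention to function names as well. Each name must be unique, or you'll run into trouble. If you define a function with a name that already exists, the new definition replaces the original one without any error message: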

function func1 {
  echo "This is the first definition of the function name"
}

func1

function func1 {
  echo "This is a repeat of the same function name"
}

func1
echo "This is the end of the script"

./test3
# This is the first definition of the function name
# This is a repeat of the same function name
# This is the end of the script

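The original definition of func1 works fine, but after the function is redefined, every later call uses the second definition.

Returning a value from a function

The bash shell treats a function like a mini-script that returns an exit status when it finishes. There are three ways to generate an exit status for a function.

The default exit status

By default, the exit status of a function is the exit status of the last command in the function. After the function finishes, you can use the standard $? variable to check it: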

func1() {
  echo "trying to display a non-existent file"
  ls -l badfile
}

echo "testing the function: "
func1
echo "The exit status is: $?"

./test4
# testing the function:
# trying to display a non-existent file
# ls: badfile: No such file or directory
# The exit status is: 1

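The exit status of the function is 1 because the last command in the function failed. (The exact non-zero value depends on the ls implementation; GNU ls, for example, returns 2 when it cannot access a file named on the command line.) You have no way of knowing, though, whether the other commands in the function succeeded. Look at this example: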

func1() {
  ls -l badfile
  echo "This was a test of a bad command"
}
echo "testing the function:"
func1
echo "The exit status is: $?"

./test4b
# testing the function:
# ls: badfile: No such file or directory
# This was a test of a bad command
# The exit status is: 0

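This time the function ends with an echo command that succeeds, so the exit status is 0 even though an earlier command failed. Relying on a function's default exit status is risky. Fortunately, there are a couple of ways around the problem.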
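One common idiom for not losing an earlier failure is to leave the function as soon as a command fails. The sketch below is only an illustration: show_file is a hypothetical function, and the pattern used is "command || return", where return with no argument passes on the status of the command that just failed.

```shell
#!/bin/bash
# show_file is a hypothetical function; the point is the "|| return" idiom.
show_file() {
  # If ls fails, leave the function at once with the exit status of ls.
  ls -l "$1" || return
  echo "Listing succeeded"
}

show_file /nonexistent/badfile 2> /dev/null
echo "The exit status is: $?"    # non-zero, because ls failed
```

With this pattern the trailing echo can no longer mask the failure, because it is never reached when ls fails.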

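Using the return command

The return command exits a function with a specific exit status. It lets you specify a single integer value as the function's exit status, which gives you a simple way to set the status programmatically: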

function dbl {
  read -p "Enter a value: " value
  echo "doubling the value"
  return $[ $value * 2 ]
}

dbl
echo "The new value is $?"

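The dbl function doubles the integer that the user enters into the $value variable and returns the result with the return command. The script then displays the result using the $? variable.

Be careful when returning values from a function this way. Keep these two tips in mind to avoid problems.

Retrieve the return value as soon as the function completes.
An exit status must be in the range 0 to 255.

If you run any other command before reading the function's value from $?, the return value is lost. Remember that $? holds the exit status of the most recently executed command.

The second tip limits the range of return values. Because an exit status must be less than 256, the function result must also be an integer below 256. Any larger value is reduced modulo 256, so the caller sees the wrong number. In the run below, 400 comes back as 144: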

./test5
# Enter a value: 200
# doubling the value
# The new value is 144

If you need to return a larger integer or a string, the return method won't work. The next section describes another approach.
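The wrap-around is easy to reproduce without any user input. In this small sketch, big_return and fits_in_status are hypothetical functions: the first shows a value above 255 being reduced modulo 256, and the second shows the kind of job return is well suited to, namely signaling success or failure.

```shell
#!/bin/bash
# big_return is a hypothetical function that returns a value above 255.
big_return() {
  return 400
}

big_return
status=$?
echo "return 400 is seen by the caller as $status"   # 400 modulo 256 is 144

# fits_in_status is a hypothetical yes/no check. The status of the last
# test becomes the function's exit status: 0 for yes, non-zero for no.
fits_in_status() {
  [ "$1" -ge 0 ] && [ "$1" -le 255 ]
}

if fits_in_status 400
then
  echo "400 fits in an exit status"
else
  echo "400 does not fit in an exit status"
fi
```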

Using function output

Just as you can capture the output of a command in a shell variable, you can capture the output of a function:

result=$(dbl)

This command assigns the output of the dbl function to the $result variable. Here's an example:

function dbl {
  read -p "Enter a value: " value
  echo $[ $value * 2 ]
}
result=$(dbl)
echo "The new value is $result"

./test5b
# Enter a value: 200
# The new value is 400

./test5b
# Enter a value: 1000
# The new value is 2000

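The new function uses an echo statement to display the result of the calculation. The script captures the output of the dbl function instead of looking at the exit status.

This example demonstrates a subtle point. The dbl function actually emits two messages. The read command prints a short prompt asking the user for a value. That prompt does not end up in the variable because read -p writes it to STDERR rather than STDOUT, and command substitution captures only STDOUT. If you had used an echo statement to produce the prompt, it would have been captured into the shell variable along with the result.

This technique also works for floating-point values and strings, which makes it a versatile way to return values from functions.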
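The same separation can be applied to your own messages: anything that is not part of the result can be redirected to STDERR with >&2. The following is a rough sketch; dbl_quiet is a hypothetical variant of dbl that takes its input as an argument so that it runs without a terminal.

```shell
#!/bin/bash
# dbl_quiet is a hypothetical variant of dbl that takes its input as an
# argument instead of prompting for it.
dbl_quiet() {
  echo "doubling the value" >&2   # progress message goes to STDERR
  echo $(( $1 * 2 ))              # only the result goes to STDOUT
}

result=$(dbl_quiet 200)           # captures STDOUT only
echo "The new value is $result"
```

The progress message still appears on the terminal, but $result contains nothing except the number.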

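Using variables in functions

You may have noticed that the test5 example used a variable named $value inside the function to hold the value being processed. When you use variables in functions, you need to be careful about how you define and handle them. This is a common source of errors in shell scripts. The following sections describe some techniques for handling variables inside and outside shell script functions.

Passing parameters to a function

The bash shell treats a function like a mini-script. This means you can pass parameters to a function just as you would to a regular script.

Functions can use the standard positional parameters to access any arguments passed to them. The arguments are stored in $1, $2, and so on, and the special variable $# gives the number of arguments passed to the function. Note that in bash, $0 still holds the name of the script inside a function; the name of the running function is available in ${FUNCNAME[0]}.

When calling a function in a script, you must put the arguments on the same line as the function name, like this: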

func1 $value1 10

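The function can then retrieve the argument values through the positional parameters. Here's an example that passes arguments to a function this way: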

function addem {
  if [ $# -eq 0 ] || [ $# -gt 2 ]
  then
    echo -1
  elif [ $# -eq 1 ]
  then
    echo $[ $1 + $1 ]
  else
    echo $[ $1 + $2 ]
  fi
}

echo -n "Adding 10 and 15: "
value=$(addem 10 15)
echo $value
echo -n "Let's try adding just one number: "
value=$(addem 10)
echo $value
echo -n "Now try adding no numbers: "
value=$(addem)
echo $value
echo -n "Finally, try adding three numbers: "
value=$(addem 10 15 20)
echo $value

./test6
# Adding 10 and 15: 25
# Let's try adding just one number: 20
# Now try adding no numbers: -1
# Finally, try adding three numbers: -1

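The addem function in the test6 script first checks how many arguments the script passed to it. With no arguments, or with more than two, addem outputs -1. With one argument, addem adds the argument to itself. With two arguments, addem adds them together.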
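To see what a function actually receives, it can help to print the parameters it is given. In the sketch below, show_args is a hypothetical function; it reports its own name through bash's FUNCNAME array, the argument count through $#, and each argument through a loop over "$@".

```shell
#!/bin/bash
# show_args is a hypothetical function that reports what it was given.
show_args() {
  echo "function name: ${FUNCNAME[0]}"
  echo "argument count: $#"
  local arg
  for arg in "$@"          # quoting "$@" keeps each argument intact
  do
    echo "argument: $arg"
  done
}

show_args one "two words" three
```

Because "$@" is quoted, "two words" arrives as a single argument and the count is 3 rather than 4.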

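Because a function uses the positional parameters for its own arguments, it cannot directly access the script's command-line parameters. The following example fails: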

function badfunc1 {
  echo $[ $1 * $2 ]
}

if [ $# -eq 2 ]
then
  value=$(badfunc1)
  echo "The result is $value"
else
  echo "Usage: badtest1 a b"
fi

./badtest1
# Usage: badtest1 a b
./badtest1 10 15
# ./badtest1: *  : syntax error: operand expected (error token is "*
# ")
# The result is

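Although the function uses the $1 and $2 variables, they are not the same $1 and $2 as in the main part of the script. To use the script's command-line parameters in a function, you must pass them in explicitly when you call the function: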

function func7 {
  echo $[ $1 * $2 ]
}

if [ $# -eq 2 ]
then
  value=$(func7 $1 $2)
  echo "The result is $value"
else
  echo "Usage: test7 a b"
fi

./test7
# Usage: test7 a b
./test7 10 15
# The result is 150

Once $1 and $2 have been passed to the function, they are available inside it just like any other variables.
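When a function should receive all of the script's parameters, "$@" forwards them without listing each one. The following is a sketch under stated assumptions: multiply and main are hypothetical names, and main stands in for the script body so that the example runs without real command-line arguments.

```shell
#!/bin/bash
# multiply and main are hypothetical names. main plays the role of the
# script body, so its arguments stand in for the command-line parameters.
multiply() {
  echo $(( $1 * $2 ))
}

main() {
  if [ $# -eq 2 ]
  then
    # "$@" forwards every argument unchanged, one word per argument.
    echo "The result is $(multiply "$@")"
  else
    echo "Usage: main a b"
  fi
}

main 10 15
main
```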

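Handling variables in a function

One thing that causes trouble for shell script programmers is the scope of a variable. The scope is the region in which the variable is valid. Variables defined in functions can have a different scope than regular variables; that is, they can be hidden from the rest of the script.

Functions use two types of variables.

Global variables
Local variables

The following sections describe how to use both types in functions.

Global variables

Global variables are valid anywhere in the shell script. If you define a global variable in the main part of the script, you can read its value inside a function. Likewise, if you define a global variable inside a function, you can read its value in the main part of the script.

By default, any variable you define in a script is global. Variables defined outside a function can be accessed inside it without any extra work: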

function dbl {
  value=$[ $value * 2 ]
}

read -p "Enter a value: " value
dbl
echo "The new value is: $value"

./test8
# Enter a value: 450
# The new value is: 900

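The $value variable is defined and assigned outside the function. When dbl is called, the variable and its value are still valid inside the function. When the function assigns a new value to the variable, that new value is still there when the script references the variable afterward.

This can be dangerous, though, especially if you intend to use the function in different shell scripts. It requires you to know exactly which variables the function uses, including any that are used only to calculate intermediate values and are never returned. Here's an example of how things can go wrong: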

function func1 {
  temp=$[ $value + 5 ]
  result=$[ $temp * 2 ]
}

temp=4
value=6

func1
echo "The result is $result"
if [ $temp -gt $value ]
then
  echo "temp is larger"
else
  echo "temp is smaller"
fi

./badtest2
# The result is 22
# temp is larger

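Because the function uses the $temp variable, its value in the main script is changed as a side effect, producing a result you may not have intended. There's an easy fix for this problem in functions: local variables.

Local variables

Instead of using global variables, any variable that a function uses internally can be declared as local. To do that, put the local keyword in front of the variable declaration: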

local temp

You can also use the local keyword in an assignment statement:

local temp=$[ $value + 5 ]

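The local keyword ensures that the variable is valid only within the function. If a variable with the same name exists outside the function, the shell keeps the two values separate. That means you can easily keep function variables apart from script variables and share only the ones you want to share: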

function func1 {
  local temp=$[ $value + 5 ]
  result=$[ $temp * 2 ]
}

temp=4
value=6

func1
echo "The result is $result"
if [ $temp -gt $value ]
then
  echo "temp is larger"
else
  echo "temp is smaller"
fi

./test9
# The result is 22
# temp is smaller

Now when you use the $temp variable inside func1, it no longer affects the value assigned to $temp in the main script.
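The same idea in a smaller sketch: one variable is kept private with local, and another is deliberately left global to carry the result back. The names scale, factor, and total are hypothetical.

```shell
#!/bin/bash
# scale, factor and total are hypothetical names used for illustration.
scale() {
  local factor=3               # private to scale; the global factor is untouched
  total=$(( $1 * factor ))     # total is global on purpose: it carries the result
}

factor=10
scale 5
echo "total=$total factor=$factor"   # the script's factor is still 10
```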

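Array variables and functions

Using array variables with functions is a little tricky and needs some special consideration.

Passing arrays to functions

The way to pass an array variable to a script function can be confusing. If you try to pass the array variable as a single argument, it doesn't work: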

function testit {
  echo "The parameters are: $@"
  thisarray=$1
  echo "The received array is ${thisarray[*]}"
}

myarray=(1 2 3 4 5)
echo "The original array is: ${myarray[*]}"
testit $myarray

./badtest3
# The original array is: 1 2 3 4 5
# The parameters are: 1
# The received array is 1

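If you try to pass the array variable itself as a function argument, the function picks up only the first element of the array.

To solve this, you must first break the array variable into its individual elements and pass those elements as function arguments. Inside the function, you then reassemble all the arguments into a new array variable. Here's an example: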

function testit {
  local newarray
  newarray=("$@")
  echo "The new array value is: ${newarray[*]}"
}

myarray=(1 2 3 4 5)
echo "The original array is ${myarray[*]}"
testit ${myarray[*]}

./test10
# The original array is 1 2 3 4 5
# The new array value is: 1 2 3 4 5

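The script expands ${myarray[*]} into the individual array elements and passes them to the function as arguments. The function then rebuilds an array variable from those arguments. Inside the function, the array can be used like any other array: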

function addarray {
  local sum=0
  local newarray
  newarray=("$@")
  for value in ${newarray[*]}
  do
    sum=$[ $sum + $value ]
  done
  echo $sum
}
myarray=(1 2 3 4 5)
echo "The original array is: ${myarray[*]}"
arg1=$(echo ${myarray[*]})
result=$(addarray $arg1)
echo "The result is $result"

./test11
# The original array is: 1 2 3 4 5
# The result is 15

The addarray function iterates over the array elements and adds them together. You can put any number of values in the myarray array variable, and addarray will add them all up.
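The unquoted ${myarray[*]} expansion used above works for simple numbers, but it splits elements that contain spaces. A hedged sketch of the usual remedy is to pass "${array[@]}" in double quotes. The names count_items and files are hypothetical.

```shell
#!/bin/bash
# count_items is a hypothetical function that rebuilds an array from its
# arguments and prints how many elements it received.
count_items() {
  local items=("$@")
  echo "${#items[@]}"
}

files=("report one.txt" "notes.txt")

count_items ${files[*]}       # unquoted: word splitting yields 3 arguments
count_items "${files[@]}"     # quoted [@]: exactly 2 arguments, spaces preserved
```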

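Returning arrays from functions

Passing an array variable from a function back to the shell script uses a similar technique. The function uses an echo statement to output the individual array elements in the proper order, and the script then reassembles them into a new array variable: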

function arraydblr {
  local origarray
  local newarray
  local elements
  local i
  origarray=("$@")
  newarray=("$@")
  elements=$[ $# - 1 ]
  for (( i = 0; i <= $elements; i++ ))
  do
    newarray[$i]=$[ ${origarray[$i]} * 2 ]
  done
  echo ${newarray[*]}
}

myarray=(1 2 3 4 5)
echo "The original array is: ${myarray[*]}"
result=($(arraydblr ${myarray[*]}))
echo "The new array is: ${result[*]}"

./test12
# The original array is: 1 2 3 4 5
# The new array is: 2 4 6 8 10

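The script passes the array elements to the arraydblr function by expanding ${myarray[*]} on the function's command line. The arraydblr function reassembles the arguments into a new array variable and makes a copy of it. It then iterates over the elements, doubles each value, and stores the result in the copy inside the function.

The arraydblr function then uses an echo statement to output the value of each array element. The script uses that output to assemble a new array variable.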
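Rebuilding the array with result=($(...)) splits on whitespace, so it only suits elements without spaces. As an alternative sketch, the function can print one element per line and the caller can read the lines with mapfile, which assumes bash 4.0 or later. The names mark_copies and names are hypothetical.

```shell
#!/bin/bash
# mark_copies is a hypothetical function. It prints one element per line so
# that the caller can rebuild the array with mapfile (bash 4.0 or later).
mark_copies() {
  local item
  for item in "$@"
  do
    printf '%s (copy)\n' "$item"
  done
}

names=("ada lovelace" "alan turing")

# -t strips the trailing newline from each line before storing it.
mapfile -t result < <(mark_copies "${names[@]}")

echo "Element count: ${#result[@]}"
echo "First element: ${result[0]}"
```

This approach still assumes that no element contains a newline.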

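Function recursion

One feature of local function variables is self-containment. Apart from the arguments it receives, a self-contained function doesn't need any outside resources.

This feature allows a function to be called recursively, which means the function calls itself to reach an answer. A recursive function usually has a base value that it eventually works down to. Many advanced mathematical algorithms use recursion to reduce a complex equation one level at a time until the base value is reached.

The classic example of a recursive algorithm is calculating a factorial. The factorial of a number is the product of all the positive integers up to and including that number. So, to find the factorial of 5, you would evaluate:

5! = 1 * 2 * 3 * 4 * 5 = 120

Using recursion, the equation reduces to this form:

x! = x * (x-1)!

In other words, the factorial of x equals x times the factorial of x-1. This can be expressed in a simple recursive script: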


function factorial {
  if [ $1 -eq 1 ]
  then
    echo 1
  else
    local temp=$[ $1 - 1 ]
    local result=$(factorial $temp)
    echo $[ $result * $1 ]
  fi
}

The factorial function uses itself to calculate the value of the factorial:

function factorial {
  if [ $1 -eq 1 ]
  then
    echo 1
  else
    local temp=$[ $1 - 1 ]
    local result=$(factorial $temp)
    echo $[ $result * $1 ]
  fi
}

read -p "Enter value: " value
result=$(factorial $value)
echo "The factorial of $value is: $result"

./test13
# Enter value: 5
# The factorial of 5 is: 120
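The recursive function above stops only when its argument reaches exactly 1, so an input of 0 or a negative number would recurse without end. The sketch below is one possible hardening, not the original author's version: factorial_safe is a hypothetical name, it treats 0 and 1 as base cases, and it rejects anything that is not a non-negative integer.

```shell
#!/bin/bash
# factorial_safe is a hypothetical variant of factorial. It treats 0 and 1
# as base cases and rejects anything that is not a non-negative integer.
factorial_safe() {
  case $1 in
    ''|*[!0-9]*)
      echo "factorial_safe: expected a non-negative integer" >&2
      return 1
      ;;
  esac
  if [ "$1" -le 1 ]
  then
    echo 1
  else
    local rest
    rest=$(factorial_safe $(( $1 - 1 )))   # recursive call on x-1
    echo $(( $1 * rest ))
  fi
}

factorial_safe 5
factorial_safe 0
factorial_safe -3 || echo "rejected"
```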

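The factorial function isn't hard to write. Once you've created a function like this, you may want to use it in other scripts too. The next section looks at how to do that effectively.

Creating a library

It's easy to see how functions save typing within a single script. But what if you use the same block of code in several scripts? Defining the same function in every script just to use it once in each is clearly a nuisance.

There's a solution to this problem. The bash shell lets you create a function library file and then reference that file from as many scripts as you like.

The first step is to create a common library file containing the functions your scripts need. Here's a library file called myfuncs that defines three simple functions: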

function addem {
  echo $[ $1 + $2 ]
}

function multem {
  echo $[ $1 * $2 ]
}

function divem {
  if [ $2 -ne 0 ]
  then
    echo $[ $1 / $2 ]
  else
    echo -1
  fi
}

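The second step is to include the myfuncs library file in the script files that need these functions. This is where things get tricky.

The problem lies in the scope of shell functions. As with ordinary (unexported) shell variables, shell functions are valid only in the shell session in which they are defined. If you run the myfuncs script from the command line, the shell creates a new shell and runs the script in it. The three functions are defined in that new shell, and they aren't available when you run another script that needs them.

The same applies to scripts. If you try to run the library file as a regular script file, the three functions won't appear in your script: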

./myfuncs

result=$(addem 10 15)
echo "The result is $result"

./badtest4
# ./badtest4: addem: command not found
# The result is

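The key to using function libraries is the source command. The source command executes commands within the current shell context instead of creating a new shell to run them. You can use source to run the library file inside your script, which makes the library's functions available to the script. The source command has a shorthand, the dot operator: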

. ./myfuncs

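This example assumes that the myfuncs library file is in the same directory as the shell script. If it isn't, use the appropriate path to reach the file. Here's an example of a script that uses the myfuncs library file: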

. ./myfuncs

value1=10
value2=5
result1=$(addem $value1 $value2)
result2=$(multem $value1 $value2)
result3=$(divem $value1 $value2)
echo "The result of adding them is: $result1"
echo "The result of multiplying them is: $result2"
echo "The result of dividing them is: $result3"

./test14
# The result of adding them is: 15
# The result of multiplying them is: 50
# The result of dividing them is: 2

The script successfully uses the functions defined in the myfuncs library file.
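The sourcing step can be tried in isolation with a throwaway library. The following sketch writes a tiny library file to a temporary directory, checks that it is readable, and sources it. The file name mylib and the function triple are hypothetical.

```shell
#!/bin/bash
# Self-contained demo: write a tiny library to a temporary directory, then
# source it. The file name mylib and the function triple are hypothetical.
libdir=$(mktemp -d)
cat > "$libdir/mylib" <<'EOF'
triple() {
  echo $(( $1 * 3 ))
}
EOF

# Check that the library is readable before sourcing it.
if [ -r "$libdir/mylib" ]
then
  . "$libdir/mylib"
else
  echo "cannot read $libdir/mylib" >&2
fi

triple 7
rm -r "$libdir"     # the function stays defined after the file is removed
```

In a real script, one option is to build the library path from the script's own location, for example with "$(dirname "${BASH_SOURCE[0]}")/myfuncs", so that the script works regardless of the current directory.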

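Using functions on the command line

You can use script functions to perform some fairly complex operations, but sometimes it's also useful to run these functions directly at the command-line prompt.

Just as you can use a script function as a command in a shell script, you can use it as a command at the command line. This is a nice feature because once you define a function in the shell, you can use it from any directory on the system without worrying about whether it's in your PATH environment variable. The trick is getting the shell to recognize the function. There are a couple of ways to do that.

Creating functions on the command line

Because the shell interprets commands as you type them, you can define a function directly on the command line. There are two ways to do that.

The first is to define the function on a single line: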

function divem { echo $[ $1 / $2 ]; }
divem 100 5
# 20

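When you define a function on the command line this way, you must put a semicolon after each command so that the shell knows where one command ends and the next begins: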

function doubleit { read -p "Enter value: " value; echo $[ $value * 2 ]; }

doubleit
# Enter value: 20
# 40

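The second way is to define the function across multiple lines. While you type, the bash shell uses the secondary prompt to ask for more commands. With this method, you don't need a semicolon at the end of each command; just press Enter: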

function multem {
  echo $[ $1 * $2 ]
}
multem 2 5
# 10

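Once you type the closing brace, the shell knows you've finished defining the function.

Be especially careful when creating functions on the command line. If you give a function the same name as a built-in command or another command, the function overrides the original command.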
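If a function does end up shadowing a command, bash offers ways to reach the original and to remove the function. The following sketch shadows the pwd builtin purely for illustration, then uses builtin to bypass the function and unset -f to delete it.

```shell
#!/bin/bash
# Shadow the pwd builtin with a function of the same name (illustration only).
pwd() {
  echo "this is the function, not the builtin"
}

pwd                  # runs the function
builtin pwd          # bypasses the function and runs the real builtin
unset -f pwd         # removes the function definition
pwd                  # the builtin is reachable by its plain name again
```

For an external program rather than a builtin, the command builtin (as in command ls) skips function lookup in the same way.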

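Defining functions in the .bashrc file

The obvious downside of defining shell functions directly on the command line is that they disappear when you exit the shell. For complex functions, that's a real nuisance.

There's a simple fix: define the function in a place where it is reloaded every time a new shell starts.

The .bashrc file is the best place for that. Bash reads this file from your home directory each time it starts an interactive non-login shell, and on most distributions the login startup files source it as well.

Directly defining functions

You can define functions directly in the .bashrc file in your home directory. Most Linux distributions already define some things in this file, so be careful not to delete them. Just add your functions at the end of the file. Here's an example: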

# Source global definitions
if [ -r /etc/bashrc ]; then
  . /etc/bashrc
fi

function addem {
  echo $[ $1 + $2 ]
}

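The function takes effect the next time you start a new bash shell. After that, you can use it from anywhere on the system.

Sourcing function files

Just as in a shell script, you can use the source command (or its alias, the dot operator) in the .bashrc script to add the functions from an existing library file: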

# Source global definitions
if [ -r /etc/bashrc ]; then
  . /etc/bashrc
fi

. /home/roger/libraries/myfuncs

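Make sure the path to the library file is correct so that the bash shell can find it. The next time you start a shell, all the functions in the library are available at the command line: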

addem 10 5
# 15
multem 10 5
# 50
divem 10 5
# 2

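Functions defined in the shell are available to its subshells, such as command substitutions. A script that you start as a separate bash process sees them only if they have been exported, for example with export -f addem multem divem placed after the definitions in .bashrc. With the functions exported, you can write a script that uses them without defining or sourcing them: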

value1=10
value2=5
result1=$(addem $value1 $value2)
result2=$(multem $value1 $value2)
result3=$(divem $value1 $value2)
echo "The result of adding them is: $result1"
echo "The result of multiplying them is: $result2"
echo "The result of dividing them is: $result3"

./test15
# The result of adding them is: 15
# The result of multiplying them is: 50
# The result of dividing them is: 2

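Provided the functions have been exported from the parent shell, they work in the shell script without the library file being sourced.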
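Whether a child bash process can see a function depends on whether the function was exported. The sketch below illustrates the difference using export -f; addem is defined inline only so that the demo is self-contained, and bash -c stands in for running a separate script.

```shell
#!/bin/bash
# addem is defined here only so that the demo is self-contained.
addem() {
  echo $(( $1 + $2 ))
}

# Without export -f, a separately started bash process does not see addem.
bash -c 'addem 10 5' 2> /dev/null || echo "child shell: addem not found"

# export -f places the function in the environment of child bash processes.
export -f addem
bash -c 'addem 10 5'
```

Exported functions are passed only to bash child processes started from this shell, so the export needs to happen in the shell that launches the script.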