Have you ever been bitten by sh -c 'command'?

阿白不想努力了

Last modified 2022-06-27 12:34:30

Tags: C, bash, linux

First published 2022-06-27 12:31:33

Article link: https://blog.csdn.net/tykit/article/details/125481689

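This article takes a detailed look at the ONESHOT optimization that shells apply when executing commands on Linux, using examples to analyze how `sh -c 'command'` behaves differently in different situations. It lays out the rules behind the optimization (the command must be a simple command, must not be running in a handler, and must have no redirections, among others) and explains how the optimization affects efficiency. It also discusses how bash and dash differ on ONESHOT and why Dash on Ubuntu disables the optimization.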

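Spotting the problem

A small question: "How do you get a shell to run one line of commands that prints which shell it is?"

We usually reach the shell through sh, and sh is a symbolic link, so reading the path of the file behind it should tell us which shell it is, right? I found a command along those lines: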

root@ubuntu:~# sh -c 'readlink /proc/$$/exe'
/usr/bin/readlink

The output isn't what I expected. Why is it the path of readlink? Let's check sh first:

root@ubuntu:~# readlink $(which sh)
bash
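readlink without options only follows one hop of a symlink. As a small sketch (assuming a readlink that supports -f, as the GNU coreutils one does), the whole chain behind sh can be resolved like this; the variable names are made up for illustration:

```shell
# Locate sh on PATH, then follow every symlink hop to the real file.
sh_path=$(command -v sh)
sh_real=$(readlink -f "$sh_path")
echo "sh is $sh_path -> $sh_real"
```

On the machine above this would be expected to end at the bash binary, but the result depends entirely on how the system is set up.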

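The default shell on this machine has been changed to BASH. Ubuntu normally ships two shells, BASH and Dash, and both have served as the system default sh: BASH at first, Dash later. Let's compare the two: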

# BASH
root@ubuntu:~# bash -c 'readlink /proc/$$/exe'
/usr/bin/readlink

# Dash
root@ubuntu:~# dash -c 'readlink /proc/$$/exe'
/usr/bin/dash

Is this Schrödinger's shell? Same code, two results.

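Think about the command again: $$ is a special variable holding the PID of the current shell, and /proc/$$/exe points to the current shell's executable, so this line should read the path of the BASH executable, shouldn't it???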
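One way to see what readlink is actually handed is to put echo in front of it. This is only a sketch: the shell expands $$ before the command runs, so readlink receives a plain number, and whatever process owns that PID when the path is opened is what gets reported.

```shell
# echo prints the argument list exactly as the shell would pass it to readlink.
expanded=$(sh -c 'echo readlink /proc/$$/exe')
echo "$expanded"
```

The output has the form `readlink /proc/<PID>/exe`, where the PID is that of the sh started by -c.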

Fruitless probing

Let's add ps -f to look at the processes:

# Earlier history
root@ubuntu:~# bash -c 'readlink /proc/$$/exe'
/usr/bin/readlink

# PID of the current shell
root@ubuntu:~# echo $$
564505

# This attempt
root@ubuntu:~# bash -c 'echo $$; ps -f; readlink /proc/$$/exe; ps -f'
577450
UID          PID    PPID  C STIME TTY          TIME CMD
root      564505  564399  0 17:07 pts/0    00:00:00 -bash
root      577450  564505  0 18:16 pts/0    00:00:00 bash -c echo $$; ps -f; readlink /proc/$$/exe; ps -f
root      577451  577450  0 18:16 pts/0    00:00:00 ps -f
/usr/bin/bash
UID          PID    PPID  C STIME TTY          TIME CMD
root      564505  564399  0 17:07 pts/0    00:00:00 -bash
root      577450  564505  0 18:16 pts/0    00:00:00 bash -c echo $$; ps -f; readlink /proc/$$/exe; ps -f
root      577453  577450  0 18:16 pts/0    00:00:00 ps -f

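Good grief, it got it right: it printed /usr/bin/bash.

Let's walk through what the commands mean:

The standalone echo $$ gives the PID of the current interactive shell, 564505. ps later shows two bash processes, and this PID lets us tell them apart.

When the current interactive shell (PID: 564505) runs bash -c 'echo $$; ps -f; readlink /proc/$$/exe; ps -f', it starts a non-interactive shell (PID: 577450) and hands it everything between the single quotes after -c to execute.

That non-interactive shell (PID: 577450) starts three child processes: first ps (PID: 577451), then readlink (PID: 577452), then another ps (PID: 577453). The two ps calls sandwich readlink mainly so we can see the state before and after it runs.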

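Neither ps shows readlink, but the output tells us that readlink read the executable path of the shell it ran under (PID: 577450) and then exited; the 577452 missing between the PIDs of the two ps calls is its PID.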
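The parent/child relationship can also be checked without ps. In this sketch the outer sh prints its own PID and an inner sh prints its parent's PID; the trailing true is there only so the inner sh is not the last command in the list. The variable names are made up for illustration.

```shell
# Line 1: PID of the shell started by -c. Line 2: parent PID seen by its child.
out=$(sh -c 'echo "$$"; sh -c "echo \$PPID"; true')
shell_pid=$(printf '%s\n' "$out" | sed -n 1p)
parent_pid=$(printf '%s\n' "$out" | sed -n 2p)
echo "shell PID: $shell_pid, child's parent PID: $parent_pid"
```

Both numbers match, which is the same picture the ps listing gives: commands in the middle of the list run as children of the -c shell.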

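The previous section showed readlink getting it wrong; now we know it is sometimes right and sometimes wrong. Why the split personality? Let's simplify the command: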

root@ubuntu:~# bash -c 'echo $$; readlink /proc/$$/exe'
772062
/usr/bin/bash
root@ubuntu:~# bash -c '; readlink /proc/$$/exe'
bash: -c: line 0: syntax error near unexpected token `;'
bash: -c: line 0: `; readlink /proc/$$/exe'

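With a ; in there it comes out right. What is that? It's a BASH command list, whereas the version without ; in the previous section is a simple command! We may be getting close to the truth, but then I switched to another machine: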

root@ubuntu-2:~# bash -c 'echo $$; readlink /proc/$$/exe'
12405
/usr/bin/readlink
root@ubuntu-2:~# bash -c '; readlink /proc/$$/exe'
bash: -c: line 1: syntax error near unexpected token `;'
bash: -c: line 1: `; readlink /proc/$$/exe'

So simply not being a simple command isn't enough to get the right answer. Let's check the versions:

root@ubuntu:~# bash --version
GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
...

root@ubuntu-2:~# bash --version
GNU bash, version 5.1.0(1)-release (aarch64-unknown-linux-gnu)
...

So the version matters too. What are the rules here?
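To compare machines without relying on /proc, the probe can be an inner bash that prints its own PID next to the PID of the outer bash -c shell. This is a sketch, not the commands used above: true stands in for the echo $$ in front of the list, and equal PIDs mean the outer shell replaced itself with the probe instead of forking.

```shell
# Record the bash version alongside the behaviour of both command shapes.
version=$(bash -c 'echo "$BASH_VERSION"')
# Inner bash prints "<its own PID> <outer PID>"; \$\$ is left for the inner shell.
simple=$(bash -c 'bash -c "echo \$\$ $$"')
list=$(bash -c 'true; bash -c "echo \$\$ $$"')
echo "bash $version"
echo "simple command: inner/outer PIDs = $simple"
echo "after 'true;':  inner/outer PIDs = $list"
```

Running this on both machines gives a like-for-like record of version and behaviour; going by the transcripts above, the second line would be expected to differ between 5.0.17 and 5.1.0.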

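Effective effort

I skimmed the BASH 5.1 source and found some interesting things. The BASH source isn't the focus of this article, so many details are left out and only simplified code for the main flow is shown: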

// File: config-top.h

// ...
/* Define ONESHOT if you want sh -c 'command' to avoid forking to execute
   `command' whenever possible.  This is a big efficiency improvement. */
#define ONESHOT
// ...

Above is the definition of ONESHOT. Defining or undefining this macro switches BASH's fork optimization on or off.

// File: shell.c

// ...
int
main (argc, argv, env)
     int argc;
     char **argv, **env;
// ...
{
// ...
  if (command_execution_string)
    {
      startup_state = 2;
// ...
#if defined (ONESHOT)
      executing = 1;
      run_one_command (command_execution_string);
      exit_shell (last_command_exit_value);
#else /* ONESHOT */
      with_input_from_string (command_execution_string, "-c");
      goto read_and_execute;
#endif /* !ONESHOT */
    }
// ...
#if !defined (ONESHOT)
 read_and_execute:
#endif /* !ONESHOT */
// ...
  reader_loop ();
  exit_shell (last_command_exit_value);
}
// ...
#if defined (ONESHOT)
// ...
static int
run_one_command (command)
     char *command;
{
// ...
   return (parse_and_execute (savestring (command), "-c", SEVAL_NOHIST|SEVAL_RESETLINE));
}
#endif /* ONESHOT */
// ...

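Above is the main flow when running bash -c 'command'. Once BASH has parsed the command-line arguments, the pointer command_execution_string points to 'command'. If ONESHOT is off, BASH reads and executes it line by line with reader_loop, just as it does for bash 'filename'; otherwise it parses and executes the contents of 'command' with parse_and_execute.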
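For contrast, the same kind of probe can be fed to bash on stdin instead of through -c, which does not take the command_execution_string branch. This is a sketch under that assumption: if the stdin path really ends up in reader_loop without the ONESHOT shortcut, as the excerpt suggests, the inner and outer PIDs should differ.

```shell
# The probe line is read from stdin; the inner bash prints "<its PID> <outer PID>".
pids=$(printf '%s\n' 'bash -c "echo \$\$ $$"' | bash)
inner=${pids% *}
outer=${pids#* }
echo "inner bash PID: $inner, outer bash PID: $outer"
```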

// File: evalstring.c

// ...
int
parse_and_execute (string, from_file, flags)
     char *string;
     const char *from_file;
     int flags;
{
// ...
#if defined (ONESHOT)
// ...
	      if (should_suppress_fork (command))
		{
		  command->flags |= CMD_NO_FORK;
		  command->value.Simple->flags |= CMD_NO_FORK;
		}
// ...
	      else if (command->type == cm_connection && can_optimize_connection (command))
		{
		  command->value.Connection->second->flags |= CMD_TRY_OPTIMIZING;
		  command->value.Connection->second->value.Simple->flags |= CMD_TRY_OPTIMIZING;
		}
#endif /* ONESHOT */
// ...
}
// ...

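Above, once BASH has parsed the command string into a command structure, it calls should_suppress_fork to decide whether to apply the fork optimization. If it can, the command gets the CMD_NO_FORK flag; if it can't, the else if branch gives it one more chance. If the command is a command list and can_optimize_connection approves, the right-hand command of the connection gets the CMD_TRY_OPTIMIZING flag. When execution reaches that last command of the list, should_suppress_fork is consulted again, and if the optimization applies, CMD_NO_FORK is set there as well.

// File: execute_cmd.c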

// ...
static int
execute_connection (command, asynchronous, pipe_in, pipe_out, fds_to_close)
     COMMAND *command;
     int asynchronous, pipe_in, pipe_out;
     struct fd_bitmap *fds_to_close;
{
// ...
  switch (command->value.Connection->connector)
    {
// ...
    case ';':
// ...
      optimize_fork (command);
      exec_result = execute_command_internal (command->value.Connection->second,
				      asynchronous, pipe_in, pipe_out,
				      fds_to_close);
      executing_list--;
      break;
// ...
    case AND_AND:
    case OR_OR:
// ...
      if (((command->value.Connection->connector == AND_AND) &&
	   (exec_result == EXECUTION_SUCCESS)) ||
	  ((command->value.Connection->connector == OR_OR) &&
	   (exec_result != EXECUTION_SUCCESS)))
	{
	  optimize_fork (command);

	  second = command->value.Connection->second;
	  if (ignore_return && second)
	    second->flags |= CMD_IGNORE_RETURN;

	  exec_result = execute_command (second);
	}
      executing_list--;
      break;
// ...
    }
// ...
}
// ...

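In between, execution enters execute_command_internal, which dispatches on the command type to functions such as execute_simple_command and execute_connection; there is a lot of code along the way, so I won't paste it. The thing to note is the execute_connection function above: the command on the right-hand side of a connection always gets a chance for optimize_fork to decide whether to optimize it, using much the same method as can_optimize_connection.

Now fast-forward to execute_disk_command, the function that runs external commands:

// File: execute_cmd.c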
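The three connectors handled in execute_connection can be compared from the outside with a small helper. This is a rough sketch: probe is a hypothetical function, not part of BASH, and it reports "exec in place" when the inner bash ends up with the same PID as the outer bash -c shell, and "forked" otherwise.

```shell
# probe PREFIX: run PREFIX followed by an inner bash under `bash -c`.
# The inner bash prints "<its own PID> <outer PID>" for comparison.
probe() {
    pids=$(bash -c "$1"'bash -c "echo \$\$ $$"')
    if [ "${pids% *}" = "${pids#* }" ]; then
        echo "exec in place"
    else
        echo "forked"
    fi
}

semi=$(probe 'true; ')
and_and=$(probe 'true && ')
or_or=$(probe 'false || ')
echo "';'  -> $semi"
echo "'&&' -> $and_and"
echo "'||' -> $or_or"
```

Which answer each connector gives depends on the BASH version in use, so the script prints the result rather than assuming one.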

// ...
static int
