FastCGI: segmentation fault caused by changing the environment variable environ (Part 3): using the getenv function

vitalma

Last modified 2024-02-24 05:51:53

Tags: linux, c++

First published 2024-02-24 05:11:25

Article link: https://blog.csdn.net/vitalma/article/details/136266762

Keywords: fcgi, environ, getenv, segmentation fault (segfault), FCGI_Accept

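Articles in this series:

FastCGI: segmentation fault caused by changing the environment variable environ (Part 1): analysis of the fcgi library code

FastCGI: segmentation fault caused by changing the environment variable environ (Part 2): fixing the getenv segmentation fault by modifying the libfcgi source code

FastCGI: segmentation fault caused by changing the environment variable environ (Part 3): using the getenv function

FastCGI: segmentation fault caused by changing the environment variable environ (Part 4): debugging an fcgi process with gdb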

The usage of getenv is documented in its manual page, which can be viewed with the command man getenv:

mxl@os:~ $ man getenv
NAME
       getenv, secure_getenv - get an environment variable

SYNOPSIS
       #include <stdlib.h>
       char *getenv(const char *name);
       char *secure_getenv(const char *name);
   Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
       secure_getenv(): _GNU_SOURCE

DESCRIPTION
       The getenv() function searches the environment list to find the environment variable name, and returns a pointer to the corresponding value string.

       The GNU-specific secure_getenv() function is just like getenv() except that it returns NULL in cases where "secure execution" is required.  Secure execution is required if one of the following conditions was true when the program run by the calling process was loaded:

       *  the process's effective user ID did not match its real user ID or the process's effective group ID did not match its real group ID (typically this is the result of executing a set-user-ID or set-group-ID program);
       *  the effective capability bit was set on the executable file; or
       *  the process has a nonempty permitted capability set.
       Secure execution may also be required if triggered by some Linux security modules.
       The secure_getenv() function is intended for use in general-purpose libraries to avoid vulnerabilities that could occur if set-user-ID or set-group-ID programs accidentally trusted the environment.

RETURN VALUE
       The getenv() function returns a pointer to the value in the environment, or NULL if there is no match.
VERSIONS
       secure_getenv() first appeared in glibc 2.17.
ATTRIBUTES
       For an explanation of the terms used in this section, see attributes(7).
       ┌──────────────────────────┬───────────────┬─────────────┐
       │Interface                 │ Attribute     │ Value       │
       ├──────────────────────────┼───────────────┼─────────────┤
       │getenv(), secure_getenv() │ Thread safety │ MT-Safe env │
       └──────────────────────────┴───────────────┴─────────────┘
CONFORMING TO
       getenv(): POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
       secure_getenv() is a GNU extension.
NOTES
       The strings in the environment list are of the form name=value.
       As typically implemented, getenv() returns a pointer to a string within the environment list.  The caller must take care not to modify this string, since that would change the environment of the process.
       The implementation of getenv() is not required to be reentrant.  The string pointed to by the return value of getenv() may be statically allocated, and can be modified by a subsequent call to getenv(), putenv(3), setenv(3), or unsetenv(3).
       The "secure execution" mode of secure_getenv() is controlled by the AT_SECURE flag contained in the auxiliary vector passed from the kernel to user space.

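The usage of getenv is also described on the web page for getenv (the text quoted below comes from the POSIX specification of getenv):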

The getenv() function is inherently not thread-safe because it returns a value pointing to static data.

Conforming applications are required not to directly modify the pointers to which environ points, but to use only the setenv(), unsetenv(), and putenv() functions, or assignment to environ itself, to manipulate the process environment. This constraint allows the implementation to properly manage the memory it allocates. This enables the implementation to free any space it has allocated to strings (and perhaps the pointers to them) stored in environ when unsetenv() is called. A C runtime start-up procedure (that which invokes main() and perhaps initializes environ) can also initialize a flag indicating that none of the environment has yet been copied to allocated storage, or that the separate table has not yet been initialized. If the application switches to a complete new environment by assigning a new value to environ, this can be detected by getenv(), setenv(), unsetenv(), or putenv() and the implementation can at that point reinitialize based on the new environment. (This may include copying the environment strings into a new array and assigning environ to point to it.)

In fact, for higher performance of getenv(), implementations that do not provide putenv() could also maintain a separate copy of the environment in a data structure that could be searched much more quickly (such as an indexed hash table, or a binary tree), and update both it and the linear list at environ when setenv() or unsetenv() is invoked. On implementations that do provide putenv(), such a copy might still be worthwhile but would need to allow for the fact that applications can directly modify the content of environment strings added with putenv(). For example, if an environment string found by searching the copy is one that was added using putenv(), the implementation would need to check that the string in environ still has the same name (and value, if the copy includes values), and whenever searching the copy produces no match the implementation would then need to search each environment string in environ that was added using putenv() in case any of them have changed their names and now match. Thus, each use of putenv() to add to the environment would reduce the speed advantage of having the copy.

Performance of getenv() can be important for applications which have large numbers of environment variables. Typically, applications like this use the environment as a resource database of user-configurable parameters. The fact that these variables are in the user's shell environment usually means that any other program that uses environment variables (such as ls, which attempts to use COLUMNS), or really almost any utility (LANG, LC_ALL, and so on) is similarly slowed down by the linear search through the variables.

An implementation that maintains separate data structures, or even one that manages the memory it consumes, is not currently required as it was thought it would reduce consensus among implementors who do not want to change their historical implementations.

Discussion of getenv thread safety

The glibc bug report "13271 – getaddrinfo is not thread safe against concurrent setenv" contains a discussion of whether getenv is thread-safe:

Description, Jan David Mol, 2011-10-07 08:26:00 UTC

The getaddrinfo() can trigger the following sequence of function calls:

getaddrinfo()

gaih_inet()

__gethostbyname2_r()

_nss_dns_gethostbyname2_r()

_nss_dns_gethostbyname3_r()

__res_hostalias()

getenv("HOSTALIASES")

The getenv() function is not thread-safe however, so getaddrinfo() should not call it. I suspect other getenv() functions can be reached as well, but I lack enough glibc knowledge to provide a good list so I won't try.

getaddrinfo() does call getenv(); the following gdb backtrace shows the call path:

(gdb) bt

#0 0xb687bf90 in getenv () from /data/mxlApp/libs/libc.so.6

#1 0xb693e150 in __resolv_conf_load () from /data/mxlApp/libs/libc.so.6

#2 0xb69404ec in __resolv_conf_get_current () from /data/mxlApp/libs/libc.so.6

#3 0xb693ec98 in __res_vinit () from /data/mxlApp/libs/libc.so.6

#4 0xb693fc90 in maybe_init () from /data/mxlApp/libs/libc.so.6

#5 0xb693fe10 in __resolv_context_get () from /data/mxlApp/libs/libc.so.6

#6 0xb69073bc in gaih_inet.constprop () from /data/mxlApp/libs/libc.so.6

#7 0xb6908574 in getaddrinfo () from /data/mxlApp/libs/libc.so.6

Rich Felker's reply on the usage of getenv

Rich Felker 2011-10-07 12:28:26 UTC

Not a bug, as far as I can tell.

2.9.1 Thread-Safety

Since multi-threaded applications are not allowed to use the environ variable to access or modify any environment variable while any other thread is concurrently modifying any environment variable, any function dependent on any environment variable is not thread-safe if another thread is modifying the environment; see XSH exec.

(Author's note: in other words, one thread must not call getenv to read the environment, extern char **environ, while another thread is modifying the environment at the same time.)

And the cross-referenced text (from exec):

Conforming multi-threaded applications shall not use the environ variable to access or modify any environment variable while any other thread is concurrently modifying any environment variable. A call to any function dependent on any environment variable shall be considered a use of the environ variable to access that environment variable.

The only way the issue you reported *may* be a bug is in that getaddrinfo is not documented by the standard to use any environment variables; this is a behavior specific to most implementations including glibc. I'm not sure how cases like this should be treated, but I think as long as the implementation documents the use of the environment it's probably technically okay..

Ulrich Drepper 2011-10-15 13:58:25 UTC

I've already explained that multi-threaded programs are not allowed to modify the environment. Stop reopening and educate yourself.

Comment 9, Rich Felker, 2011-10-15 20:55:55 UTC

Please educate yourself:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_01

exec

Source code of the getenv function

The getenv function is implemented in the glibc source file glibc-2.32/stdlib/getenv.c. Its full contents are as follows:

/* Copyright (C) 1991-2020 Free Software Foundation, Inc.
   This file is part of the GNU C Library.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, see
   <https://www.gnu.org/licenses/>.  */

#include <endian.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>


/* Return the value of the environment variable NAME.  This implementation
   is tuned a bit in that it assumes no environment variable has an empty
   name which of course should always be true.  We have a special case for
   one character names so that for the general case we can assume at least
   two characters which we can access.  By doing this we can avoid using the
   `strncmp' most of the time.  */
char *
getenv (const char *name)
{
  size_t len = strlen (name);
  char **ep;
  uint16_t name_start;

  if (__environ == NULL || name[0] == '\0')
    return NULL;

  if (name[1] == '\0')
    {
      /* The name of the variable consists of only one character.  Therefore
	 the first two characters of the environment entry are this character
	 and a '=' character.  */
#if __BYTE_ORDER == __LITTLE_ENDIAN || !_STRING_ARCH_unaligned
      name_start = ('=' << 8) | *(const unsigned char *) name;
#else
      name_start = '=' | ((*(const unsigned char *) name) << 8);
#endif
      for (ep = __environ; *ep != NULL; ++ep)
	{
#if _STRING_ARCH_unaligned
	  uint16_t ep_start = *(uint16_t *) *ep;
#else
	  uint16_t ep_start = (((unsigned char *) *ep)[0]
			       | (((unsigned char *) *ep)[1] << 8));
#endif
	  if (name_start == ep_start)
	    return &(*ep)[2];
	}
    }
  else
    {
#if _STRING_ARCH_unaligned
      name_start = *(const uint16_t *) name;
#else
      name_start = (((const unsigned char *) name)[0]
		    | (((const unsigned char *) name)[1] << 8));
#endif
      len -= 2;
      name += 2;

      for (ep = __environ; *ep != NULL; ++ep)
	{
#if _STRING_ARCH_unaligned
	  uint16_t ep_start = *(uint16_t *) *ep;
#else
	  uint16_t ep_start = (((unsigned char *) *ep)[0]
			       | (((unsigned char *) *ep)[1] << 8));
#endif

	  if (name_start == ep_start && !strncmp (*ep + 2, name, len)
	      && (*ep)[len + 2] == '=')
	    return &(*ep)[len + 3];
	}
    }

  return NULL;
}
libc_hidden_def (getenv)