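Once an Erlang application has been started, it usually has to be stopped at some point, and this is normally done with application:stop/1. If we want to do some work while the application is stopping, such as persisting state or cleaning up data, we need the terminate callback. Without further ado, here is the code: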
start() ->
    {ok, _} = supervisor:start_child(test_sup, []).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

init([]) ->
    {ok, #state{}}.

%% Cleanup hook; the sleep makes it easy to see whether it ran to completion.
terminate(Reason, _State) ->
    io:format("i'm terminate:~p~n", [Reason]),
    timer:sleep(10000),
    io:format("~s", ["end"]),
    ok.

handle_info({'EXIT', _, Reason}, State) ->
    io:format("exit:~p~n", [Reason]),
    {stop, normal, State};
handle_info(_Info, State) ->
    %% Ignore any other message.
    {noreply, State}.
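The code above is a gen_server process. test_sup is a one_for_one supervisor; the child's restart strategy is permanent and its shutdown strategy is infinity. When I stopped the application with application:stop/1, I found that terminate was never executed. The source is the best tool, so here is the source: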
%%
%% Terminate this server.
%%
-spec terminate(term(), state()) -> 'ok'.
terminate(_Reason, #state{children=[Child]} = State) when ?is_simple(State) ->
    terminate_dynamic_children(Child, dynamics_db(Child#child.restart_type,
                                                  State#state.dynamics),
                               State#state.name);
terminate(_Reason, State) ->
    terminate_children(State#state.children, State#state.name).
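This is the supervisor's terminate function. You can see that the supervisor handles the simple_one_for_one type as a special case. Here we care about terminate_children/2, which hands off to the three-argument clauses below.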
terminate_children([Child = #child{restart_type=temporary} | Children], SupName, Res) ->
    do_terminate(Child, SupName),
    terminate_children(Children, SupName, Res);
terminate_children([Child | Children], SupName, Res) ->
    NChild = do_terminate(Child, SupName),
    terminate_children(Children, SupName, [NChild | Res]);
terminate_children([], _SupName, Res) ->
    Res.

do_terminate(Child, SupName) when is_pid(Child#child.pid) ->
    case shutdown(Child#child.pid, Child#child.shutdown) of
        ok ->
            ok;
        {error, normal} when Child#child.restart_type =/= permanent ->
            ok;
        {error, OtherReason} ->
            report_error(shutdown_error, OtherReason, Child, SupName)
    end,
    Child#child{pid = undefined};
do_terminate(Child, _SupName) ->
    Child#child{pid = undefined}.
terminate_children is tail recursive. All it does is call do_terminate/2 on every child, and do_terminate/2 is simple: it just calls shutdown/2.
shutdown(Pid, brutal_kill) ->
    case monitor_child(Pid) of
        ok ->
            exit(Pid, kill),
            receive
                {'DOWN', _MRef, process, Pid, killed} ->
                    ok;
                {'DOWN', _MRef, process, Pid, OtherReason} ->
                    {error, OtherReason}
            end;
        {error, Reason} ->
            {error, Reason}
    end;
shutdown(Pid, Time) ->
    case monitor_child(Pid) of
        ok ->
            exit(Pid, shutdown), %% Try to shutdown gracefully
            receive
                {'DOWN', _MRef, process, Pid, shutdown} ->
                    ok;
                {'DOWN', _MRef, process, Pid, OtherReason} ->
                    {error, OtherReason}
            after Time ->
                    exit(Pid, kill), %% Force termination.
                    receive
                        {'DOWN', _MRef, process, Pid, OtherReason} ->
                            {error, OtherReason}
                    end
            end;
        {error, Reason} ->
            {error, Reason}
    end.
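shutdown/2 treats the brutal_kill shutdown strategy as a special case: after monitor_child/1 it kills the process outright. Keep in mind that once exit(Pid, kill) has been called, the target process cannot catch the exit signal whether or not it traps exits, so terminate will not run. If you need to do any work on exit, do not use brutal_kill. For any other shutdown value, after monitor_child/1 the supervisor calls exit(Pid, shutdown) to stop the process, so in this case terminate only runs if the process has trap_exit set to true.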
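To make the difference between the two signals concrete, here is a minimal sketch using plain processes rather than a supervisor. The module name exit_signal_demo and the helpers start_worker/1 and probe/1 are made up for this illustration; only process_flag/2, exit/2 and erlang:monitor/2 are real BIFs.

```erlang
-module(exit_signal_demo).
-export([run/0]).

%% Spawn a worker that traps exits and reports back any 'EXIT'
%% message it manages to receive. (Hypothetical helper.)
start_worker(Owner) ->
    spawn(fun() ->
                  process_flag(trap_exit, true),
                  Owner ! {ready, self()},
                  receive
                      {'EXIT', _From, Reason} ->
                          Owner ! {trapped, self(), Reason}
                  end
          end).

%% Send Signal to a fresh trapping worker and report what happened:
%% either the worker saw the signal as a message, or it simply died.
probe(Signal) ->
    Pid = start_worker(self()),
    %% Wait until trap_exit is really set before sending the signal.
    receive {ready, Pid} -> ok end,
    MRef = erlang:monitor(process, Pid),
    exit(Pid, Signal),
    receive
        {trapped, Pid, Reason} ->
            erlang:demonitor(MRef, [flush]),
            {trapped, Reason};
        {'DOWN', MRef, process, Pid, Reason} ->
            {died, Reason}
    end.

run() ->
    [{shutdown, probe(shutdown)}, {kill, probe(kill)}].
```

Calling exit_signal_demo:run() should return [{shutdown,{trapped,shutdown}},{kill,{died,killed}}]: the shutdown signal is turned into a message the worker can react to, while kill ends the worker with reason killed before it can run any code.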
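I added trap_exit to the test code from the beginning of this post (process_flag(trap_exit, true) in init/1) and stopped the application again. The output shows that terminate now runs, but the handle_info clause is still never reached. Why? The answer lies in gen_server.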
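As a rough sketch of that change, the module below is a cut-down version of the test server with trap_exit switched on. The module name trap_exit_worker, the Observer argument and stop_like_supervisor/0 are assumptions added for this illustration so the result can be checked without a real supervisor; stop_like_supervisor/0 only imitates the monitor, exit(Pid, shutdown), wait-for-'DOWN' sequence from shutdown/2.

```erlang
-module(trap_exit_worker).
-behaviour(gen_server).

-export([start_link/1, stop_like_supervisor/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

start_link(Observer) ->
    gen_server:start_link(?MODULE, Observer, []).

init(Observer) ->
    %% The one-line fix: without this, exit(Pid, shutdown) kills the
    %% process immediately and terminate/2 never runs.
    process_flag(trap_exit, true),
    {ok, Observer}.

handle_call(_Request, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.
handle_info(_Info, State) -> {noreply, State}.

terminate(Reason, Observer) ->
    io:format("i'm terminate:~p~n", [Reason]),
    %% Tell the observer that cleanup ran (illustration only).
    Observer ! {terminate_ran, Reason},
    ok.

%% Hypothetical stand-in for the supervisor's shutdown/2.
stop_like_supervisor() ->
    %% Like a supervisor, the caller traps exits so that the linked
    %% child's exit does not take it down as well.
    Old = process_flag(trap_exit, true),
    {ok, Pid} = start_link(self()),
    MRef = erlang:monitor(process, Pid),
    exit(Pid, shutdown),
    Down = receive
               {'DOWN', MRef, process, Pid, R} -> R
           after 5000 -> timeout
           end,
    Ran = receive
              {terminate_ran, Reason} -> {true, Reason}
          after 0 -> false
          end,
    %% Drain the link's 'EXIT' message and restore the caller's flag.
    receive {'EXIT', Pid, _} -> ok after 1000 -> ok end,
    process_flag(trap_exit, Old),
    {Down, Ran}.
```

trap_exit_worker:stop_like_supervisor() should return {shutdown,{true,shutdown}}: the process exits with reason shutdown, and terminate/2 ran first.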
loop(Parent, Name, State, Mod, hibernate, Debug) ->
    proc_lib:hibernate(?MODULE,wake_hib,[Parent, Name, State, Mod, Debug]);
loop(Parent, Name, State, Mod, Time, Debug) ->
    Msg = receive
              Input ->
                  Input
          after Time ->
                  timeout
          end,
    decode_msg(Msg, Parent, Name, State, Mod, Time, Debug, false).

wake_hib(Parent, Name, State, Mod, Debug) ->
    Msg = receive
              Input ->
                  Input
          end,
    decode_msg(Msg, Parent, Name, State, Mod, hibernate, Debug, true).

decode_msg(Msg, Parent, Name, State, Mod, Time, Debug, Hib) ->
    case Msg of
        {system, From, Req} ->
            sys:handle_system_msg(Req, From, Parent, ?MODULE, Debug,
                                  [Name, State, Mod, Time], Hib);
        {'EXIT', Parent, Reason} ->
            terminate(Reason, Name, Msg, Mod, State, Debug);
        _Msg when Debug =:= [] ->
            handle_msg(Msg, Parent, Name, State, Mod);
        _Msg ->
            Debug1 = sys:handle_debug(Debug, fun print_event/3,
                                      Name, {in, Msg}),
            handle_msg(Msg, Parent, Name, State, Mod, Debug1)
    end.
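A gen_server process is really just running loop/6 over and over. When a message arrives, it calls decode_msg/8 to decode it. As you can see, decode_msg/8 treats a message of the form {'EXIT', Parent, Reason} as a special case and calls terminate/6 directly. So for a gen_server, an EXIT message from the parent process is never handled by handle_info; it goes straight to terminate. That is why the test code never reached handle_info when the application was stopped.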
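The routing described above can be observed with a small experiment. This is only a sketch: the module name exit_routing_demo, the {event, ...} messages and the helper functions are invented for the illustration. It assumes, as in the gen_server source shown, that the parent is the process that called gen_server:start_link.

```erlang
-module(exit_routing_demo).
-behaviour(gen_server).

-export([run/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

init(Observer) ->
    process_flag(trap_exit, true),
    {ok, Observer}.

handle_call(_Request, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.

%% Only 'EXIT' messages from processes other than the parent get here.
handle_info({'EXIT', _From, Reason}, Observer) ->
    Observer ! {event, handle_info, Reason},
    {stop, normal, Observer};
handle_info(_Info, State) ->
    {noreply, State}.

terminate(Reason, Observer) ->
    Observer ! {event, terminate, Reason},
    ok.

%% Gather callback events in arrival order, up to the terminate event.
collect() ->
    receive
        {event, handle_info, Reason} -> [{handle_info, Reason} | collect()];
        {event, terminate, Reason}   -> [{terminate, Reason}]
    after 5000 ->
        [timeout]
    end.

%% Start a server with the caller as parent, provoke an exit signal
%% via Trigger, and return the callbacks that fired.
scenario(Trigger) ->
    {ok, Server} = gen_server:start_link(?MODULE, self(), []),
    Trigger(Server),
    Events = collect(),
    %% Drain the 'EXIT' message the caller gets through its own link.
    receive {'EXIT', Server, _} -> ok after 1000 -> ok end,
    Events.

run() ->
    Old = process_flag(trap_exit, true),
    %% A linked process that is not the parent dies with reason boom.
    Other = scenario(fun(S) -> spawn(fun() -> link(S), exit(boom) end) end),
    %% The parent itself asks the server to shut down.
    Parent = scenario(fun(S) -> exit(S, shutdown) end),
    process_flag(trap_exit, Old),
    [{other_process, Other}, {parent, Parent}].
```

exit_routing_demo:run() should return [{other_process,[{handle_info,boom},{terminate,normal}]},{parent,[{terminate,shutdown}]}]. The exit of an unrelated linked process passes through handle_info first, while the parent's shutdown signal goes directly to terminate.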
Source code and documentation really are the best tools.