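Contents:
1. Background
2. System framework
3. Code flow
4. Code walkthrough: from app to driver
4.1 App module code
4.2 PowerManager
4.3 ThermalManagerService
4.4 Google Pixel Thermal HAL
4.5 android.hardware.thermal library
4.6 Thermal Driver
5. Core API summary
6. Temperature value tuning
7. Thermal control policy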
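1. Background
Android introduced a thermal framework that abstracts the interfaces to thermal subsystem hardware, including the temperature sensors and thermistors for the device skin, battery, GPU, CPU, and USB port. With this framework, device manufacturers and app developers can actively query the temperature data of these hardware components, or receive high-temperature notifications through a callback registered with the PowerManager class, and then adjust system and app behavior to reduce load when the device starts to overheat. For example, when the system temperature is high, thermal mitigation policies such as lowering the CPU frequency or shutting down sub-devices can bring the temperature down.
Two concepts related to device thermal management:
Thermal Event: an event triggered when the device or system detects a temperature change or the temperature reaches a threshold.
Thermal Status: the current thermal state of the device.
Android provides the Pixel Thermal HAL 2.0, which monitors information such as the temperature of each underlying hardware node; internal system components and third-party apps can obtain this information by calling the relevant interfaces.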
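2. System framework
PowerManager: exposes a listener interface to third-party apps for Thermal Status changes;
ThermalManagerService: provides Thermal Status interfaces to system apps, system components, and the PowerManager module; communicates with the Thermal HAL over either AIDL or HIDL Binder;
Thermal HAL: provides the Binder API to ThermalManagerService and listens for thermal status events from the kernel thermal driver;
Thermal Driver: reads the temperature of each device sensor and sends thermal status events to user space.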
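To make the bottom layer concrete, here is a minimal sketch of how user space could read the nodes the thermal driver exposes. It relies on the thermal_zone*/type and thermal_zone*/temp file names that appear in the Pixel HAL constants later in this article. The class name ThermalZoneReader is hypothetical, and the conversion assumes the kernel reports millidegrees Celsius, which is the usual sysfs convention but may differ per vendor.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative helper (not an Android API): lists thermal zones under a sysfs-style root. */
class ThermalZoneReader {

    /** Returns zone type -> temperature in degrees Celsius, assuming "temp" holds millidegrees. */
    static Map<String, Double> readZones(Path root) throws IOException {
        Map<String, Double> zones = new TreeMap<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(root, "thermal_zone*")) {
            for (Path zone : dirs) {
                try {
                    String type = readTrimmed(zone.resolve("type"));
                    long milliCelsius = Long.parseLong(readTrimmed(zone.resolve("temp")));
                    zones.put(type, milliCelsius / 1000.0);
                } catch (IOException | NumberFormatException e) {
                    // A zone may be disabled or unreadable; skip it rather than fail the scan.
                }
            }
        }
        return zones;
    }

    private static String readTrimmed(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
    }
}
```

On a device, the root would typically be /sys/devices/virtual/thermal; passing the root as a parameter keeps the helper testable against a temporary directory.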
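3. Code flow
4. Code walkthrough: from app to driver
4.1 App module code
Obtain a PowerManager instance and call addThermalStatusListener to register a listener, passing an implementation of PowerManager's nested interface OnThermalStatusChangedListener. Override the interface's onThermalStatusChanged method (PowerManager invokes it as a callback) to handle the app's business logic.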
// Obtain the PowerManager instance
PowerManager pm = (PowerManager) getSystemService(Context.POWER_SERVICE);
// Pass an implementation of PowerManager's nested OnThermalStatusChangedListener interface;
// PowerManager calls back onThermalStatusChanged when the status changes
pm.addThermalStatusListener(new PowerManager.OnThermalStatusChangedListener() {
    @Override
    public void onThermalStatusChanged(int status) {
        // Handle the new status here
    }
});
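What goes inside onThermalStatusChanged is up to the app. The following is a minimal sketch of one possible mapping from status to action. The class ThermalPolicy and the actions it returns are purely hypothetical; only the numeric ordering of the status values (0 for NONE up to 6 for SHUTDOWN) comes from the platform constants.

```java
/** Hypothetical app-side policy: maps a thermal status value to a mitigation step. */
class ThermalPolicy {
    // Values mirror PowerManager.THERMAL_STATUS_* (0 = NONE ... 6 = SHUTDOWN).
    static final int STATUS_LIGHT = 1;
    static final int STATUS_MODERATE = 2;
    static final int STATUS_SEVERE = 3;

    /** Returns a description of what the app would do at the given status. */
    static String actionFor(int status) {
        if (status <= STATUS_LIGHT) {
            return "full quality";          // NONE or LIGHT: no visible change
        } else if (status == STATUS_MODERATE) {
            return "reduce frame rate";     // MODERATE: cheap mitigation first
        } else if (status == STATUS_SEVERE) {
            return "reduce resolution";     // SEVERE: stronger mitigation
        }
        return "pause heavy work";          // CRITICAL and above
    }
}
```

Inside the listener, the app would call something like ThermalPolicy.actionFor(status) and apply the result on its own threads.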
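4.2 PowerManager
PowerManager exposes the registration method that third-party apps use to listen for Thermal Status changes, together with the nested interface through which apps are called back. Internally it creates an IThermalStatusListener object and passes it to ThermalManagerService's registration method; ThermalManagerService calls that object's onStatusChange method, which in turn calls the app's onThermalStatusChanged method.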
// Nested interface that apps implement to listen for changes
public interface OnThermalStatusChangedListener {
/**
* Called when overall thermal throttling status changed.
* @param status defined in {@link android.os.Temperature}.
*/
void onThermalStatusChanged(@ThermalStatus int status); // apps override this method
}
// Registration method exposed to third-party apps
public void addThermalStatusListener(@NonNull @CallbackExecutor Executor executor, @NonNull OnThermalStatusChangedListener listener) {
Objects.requireNonNull(listener, "listener cannot be null");
Objects.requireNonNull(executor, "executor cannot be null");
Preconditions.checkArgument(!mListenerMap.containsKey(listener),
"Listener already registered: %s", listener);
// The thermal service calls back internalListener's onStatusChange method
IThermalStatusListener internalListener = new IThermalStatusListener.Stub() {
@Override
public void onStatusChange(int status) {
final long token = Binder.clearCallingIdentity();
try {
executor.execute(() -> listener.onThermalStatusChanged(status));
} finally {
Binder.restoreCallingIdentity(token);
}
}
};
try {
// Call the registration method, passing internalListener to the thermal service
if (mThermalService.registerThermalStatusListener(internalListener)) {
// Store the app's listener together with PowerManager's internalListener in the map
mListenerMap.put(listener, internalListener);
} else {
throw new RuntimeException("Listener failed to set");
}
} catch (RemoteException e) {
throw e.rethrowFromSystemServer();
}
}
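The wrapping pattern above (one internal listener per app listener, with delivery hopping onto the caller's executor) can be tried without Android. This is a simplified sketch: FakeThermalService and ListenerRegistry are hypothetical stand-ins for ThermalManagerService and PowerManager, and IntConsumer replaces both listener interfaces.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executor;
import java.util.function.IntConsumer;

/** Stand-in for the service side: keeps internal listeners and broadcasts a status. */
class FakeThermalService {
    private final List<IntConsumer> internal = new ArrayList<>();

    boolean register(IntConsumer listener) { return internal.add(listener); }

    boolean unregister(IntConsumer listener) { return internal.remove(listener); }

    void broadcast(int status) {
        // Iterate over a copy so listeners may unregister during delivery.
        for (IntConsumer listener : new ArrayList<>(internal)) {
            listener.accept(status);
        }
    }
}

/** Stand-in for the PowerManager side: wraps each app listener and remembers the pairing. */
class ListenerRegistry {
    private final FakeThermalService service;
    private final Map<IntConsumer, IntConsumer> listenerMap = new HashMap<>();

    ListenerRegistry(FakeThermalService service) { this.service = service; }

    void add(Executor executor, IntConsumer appListener) {
        if (listenerMap.containsKey(appListener)) {
            throw new IllegalArgumentException("Listener already registered");
        }
        // The wrapper is what the service sees; it forwards on the app's executor.
        IntConsumer wrapper = status -> executor.execute(() -> appListener.accept(status));
        if (!service.register(wrapper)) {
            throw new IllegalStateException("Listener failed to set");
        }
        listenerMap.put(appListener, wrapper);
    }

    void remove(IntConsumer appListener) {
        IntConsumer wrapper = listenerMap.remove(appListener);
        if (wrapper != null) {
            service.unregister(wrapper);
        }
    }
}
```

Keeping the app-listener-to-wrapper map is what makes removal possible later: the service only knows the wrapper, so the registry has to translate.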
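4.3 ThermalManagerService
1) ThermalManagerService supports both HIDL and AIDL Binder communication with the thermal HAL service;
2) ThermalManagerService provides a callback interface (for passively listening to Thermal Status) as well as interfaces for actively querying Thermal Status and device temperatures;
3) Third-party apps are notified through a oneway callback interface. They can also query the current status with PowerManager.getCurrentThermalStatus(), but they cannot read raw device temperatures, which the service restricts to privileged callers;
4) It communicates with the Thermal HAL over AIDL or HIDL to register callbacks and query device temperatures;
5) It receives data from Thermal HAL callbacks or Binder calls and passes it on to PowerManager and other modules.
The callback interface is defined in IThermalStatusListener.aidl and is declared oneway.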
// frameworks/base/core/java/android/os/IThermalStatusListener.aidl
// Callback interface implemented by PowerManager and other modules
oneway interface IThermalStatusListener {
/**
* Called when overall thermal throttling status changed.
* @param status defined in {@link android.os.Temperature#ThrottlingStatus}.
*/
void onStatusChange(int status);
}
private final RemoteCallbackList<IThermalStatusListener> mThermalStatusListeners =
new RemoteCallbackList<>();
// Callback registration method used by PowerManager
public boolean registerThermalStatusListener(IThermalStatusListener listener) {
synchronized (mLock) {
// Notify its callback after new client registered.
final long token = Binder.clearCallingIdentity();
try {
if (!mThermalStatusListeners.register(listener)) {
return false;
}
// Notify its callback after new client registered.
// Initial callback with the current status
postStatusListener(listener);
return true;
} finally {
Binder.restoreCallingIdentity(token);
}
}
}
private void postStatusListener(IThermalStatusListener listener) {
final boolean thermalCallbackQueued = FgThread.getHandler().post(() -> {
try {
// Calls back into PowerManager
listener.onStatusChange(mStatus);
} catch (RemoteException | RuntimeException e) {
Slog.e(TAG, "Thermal callback failed to call", e);
}
});
if (!thermalCallbackQueued) {
Slog.e(TAG, "Thermal callback failed to queue");
}
}
/* HwBinder callback **/
// Callback invoked by thermal HAL 2.0
private void onTemperatureChangedCallback(Temperature temperature) {
final long token = Binder.clearCallingIdentity();
try {
onTemperatureChanged(temperature, true);
} finally {
Binder.restoreCallingIdentity(token);
}
}
abstract static class ThermalHalWrapper {
......
@FunctionalInterface
interface TemperatureChangedCallback {
void onValues(Temperature temperature);
}
/** Temperature callback. */
protected TemperatureChangedCallback mCallback;
......
}
// ThermalManagerService sets the callback on the Thermal HAL service wrapper
ThermalManagerService(Context context, @Nullable ThermalHalWrapper halWrapper) {
super(context);
mHalWrapper = halWrapper;
if (halWrapper != null) {
// thermal HAL 2.0 will call back onTemperatureChangedCallback
halWrapper.setCallback(this::onTemperatureChangedCallback);
}
mStatus = Temperature.THROTTLING_NONE;
}
private void onActivityManagerReady() {
synchronized (mLock) {
// Connect to HAL and post to listeners.
boolean halConnected = (mHalWrapper != null);
if (!halConnected) {
// Try to reach the Thermal HAL over AIDL first
mHalWrapper = new ThermalHalAidlWrapper(this::onTemperatureChangedCallback);
halConnected = mHalWrapper.connectToHal();
}
if (!halConnected) {
// Fall back to HIDL (2.0, then 1.1, then 1.0)
mHalWrapper = new ThermalHal20Wrapper(this::onTemperatureChangedCallback);
halConnected = mHalWrapper.connectToHal();
}
if (!halConnected) {
mHalWrapper = new ThermalHal11Wrapper(this::onTemperatureChangedCallback);
halConnected = mHalWrapper.connectToHal();
}
if (!halConnected) {
mHalWrapper = new ThermalHal10Wrapper(this::onTemperatureChangedCallback);
halConnected = mHalWrapper.connectToHal();
}
if (!halConnected) {
Slog.w(TAG, "No Thermal HAL service on this device");
return;
}
List<Temperature> temperatures = mHalWrapper.getCurrentTemperatures(false,
0);
final int count = temperatures.size();
if (count == 0) {
Slog.w(TAG, "Thermal HAL reported invalid data, abort connection");
}
for (int i = 0; i < count; i++) {
onTemperatureChanged(temperatures.get(i), false);
}
onTemperatureMapChangedLocked();
mTemperatureWatcher.updateSevereThresholds();
mHalReady.set(true);
}
}
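The repeated "if not connected, try the next wrapper" blocks in onActivityManagerReady amount to a first-success search over an ordered list. A minimal sketch of that idea follows; HalConnector, its nested HalWrapper interface, and the fake factory are hypothetical names for illustration, not the AOSP classes.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of the "newest HAL first, fall back to older ones" connection order. */
class HalConnector {
    interface HalWrapper {
        boolean connectToHal();
        String name();
    }

    /** Creates candidates lazily in order and returns the first that connects, or null. */
    static HalWrapper connectFirst(List<Supplier<HalWrapper>> candidates) {
        for (Supplier<HalWrapper> candidate : candidates) {
            HalWrapper wrapper = candidate.get();
            if (wrapper.connectToHal()) {
                return wrapper;
            }
        }
        return null; // corresponds to "No Thermal HAL service on this device"
    }

    /** Test double whose connection result is fixed up front. */
    static HalWrapper fake(final String name, final boolean available) {
        return new HalWrapper() {
            @Override public boolean connectToHal() { return available; }
            @Override public String name() { return name; }
        };
    }
}
```

Using suppliers means an older wrapper is never constructed once a newer one has connected, which matches the order of the if-blocks above.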
private void onTemperatureChanged(Temperature temperature, boolean sendStatus) {
shutdownIfNeeded(temperature);
synchronized (mLock) {
Temperature old = mTemperatureMap.put(temperature.getName(), temperature);
if (old == null || old.getStatus() != temperature.getStatus()) {
notifyEventListenersLocked(temperature);
}
if (sendStatus) {
// Calls back PowerManager's onStatusChange method
onTemperatureMapChangedLocked();
}
}
}
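onTemperatureMapChangedLocked itself is not shown in this article. The sketch below illustrates one plausible way to derive a single overall status from the per-sensor map, under the assumption that the overall status is the highest status among skin-type sensors. That rule, and the class OverallStatus with its Reading type, are assumptions made for illustration and should be checked against the actual source.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: keep the latest reading per sensor, derive one overall status. */
class OverallStatus {
    static final int TYPE_SKIN = 3; // same numeric value as TemperatureType SKIN

    static final class Reading {
        final String name;
        final int type;
        final int status;

        Reading(String name, int type, int status) {
            this.name = name;
            this.type = type;
            this.status = status;
        }
    }

    private final Map<String, Reading> latest = new HashMap<>();
    private int overall = 0; // 0 = no throttling

    /** Stores the reading and returns true when the overall status changed. */
    boolean update(Reading reading) {
        latest.put(reading.name, reading);
        int newStatus = 0;
        for (Reading r : latest.values()) {
            // Assumption: only skin sensors contribute to the overall status.
            if (r.type == TYPE_SKIN && r.status > newStatus) {
                newStatus = r.status;
            }
        }
        boolean changed = newStatus != overall;
        overall = newStatus;
        return changed;
    }

    int overall() { return overall; }
}
```

A listener notification would only be needed when update returns true, which keeps repeated readings at the same level from flooding clients.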
protected void setCallback(TemperatureChangedCallback cb) {
mCallback = cb;
}
// The callback is stored in a field when the ThermalHal20Wrapper object is constructed
ThermalHal20Wrapper(TemperatureChangedCallback callback) {
mCallback = callback;
}
// Register the callback when ThermalManagerService connects to thermal HAL 2.0
protected boolean connectToHal() {
synchronized (mHalLock) {
try {
mThermalHal20 = android.hardware.thermal.V2_0.IThermal.getService(true);
mThermalHal20.linkToDeath(new DeathRecipient(), THERMAL_HAL_DEATH_COOKIE);
mThermalHal20.registerThermalChangedCallback(mThermalCallback20, false,
0 /* not used */);
Slog.i(TAG, "Thermal HAL 2.0 service connected.");
} catch (NoSuchElementException | RemoteException e) {
Slog.e(TAG, "Thermal HAL 2.0 service not connected.");
mThermalHal20 = null;
}
return (mThermalHal20 != null);
}
}
/** HWbinder callback for Thermal HAL 2.0. */
// The mThermalCallback20 object is what gets registered; the HAL invokes its notifyThrottling method
private final android.hardware.thermal.V2_0.IThermalChangedCallback.Stub
mThermalCallback20 = new android.hardware.thermal.V2_0.IThermalChangedCallback.Stub() {
// Callback invoked by thermal HAL 2.0
@Override
public void notifyThrottling(
android.hardware.thermal.V2_0.Temperature temperature) {
// Wrap the thermal HAL 2.0 callback data in a Temperature object
Temperature thermalSvcTemp = new Temperature(
temperature.value, temperature.type, temperature.name,
temperature.throttlingStatus);
final long token = Binder.clearCallingIdentity();
try {
// Calls back onTemperatureChangedCallback
mCallback.onValues(thermalSvcTemp);
} finally {
Binder.restoreCallingIdentity(token);
}
}
};
// Actively query device temperatures; there may be several sensors, so a list is returned
protected List<Temperature> getCurrentTemperatures(boolean shouldFilter, int type) {
synchronized (mHalLock) {
List<Temperature> ret = new ArrayList<>();
if (mThermalHal20 == null) {
return ret;
}
try {
// Call the thermal HAL 2.0 interface
mThermalHal20.getCurrentTemperatures(shouldFilter, type,
(status, temperatures) -> {
if (ThermalStatusCode.SUCCESS == status.code) {
for (android.hardware.thermal.V2_0.Temperature temperature : temperatures) {
if (!Temperature.isValidStatus(
temperature.throttlingStatus)) {
Slog.e(TAG, "Invalid status data from HAL");
temperature.throttlingStatus =
Temperature.THROTTLING_NONE;
}
// Wrap each result in a Temperature object and add it to the list
ret.add(new Temperature(
temperature.value, temperature.type,
temperature.name,
temperature.throttlingStatus));
}
} else {
Slog.e(TAG,
"Couldn't get temperatures because of HAL error: "
+ status.debugMessage);
}
});
} catch (RemoteException e) {
Slog.e(TAG, "Couldn't getCurrentTemperatures, reconnecting...", e);
connectToHal();
}
// Return the Temperature list
return ret;
}
}
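The Temperature class used by ThermalManagerService implements Parcelable and defines the seven thermal throttling levels, the device types, and so on.
// frameworks/base/core/java/android/os/Temperature.java
// The Temperature class wraps the thermal status, value, type, and related data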
/**
* Temperature values used by IThermalService.
*
* @hide
*/
public final class Temperature implements Parcelable {
/** Temperature value */
private final float mValue;
/** A Temperature type from ThermalHAL */
private final int mType;
/** Name of this Temperature */
private final String mName;
/** The level of the sensor is currently in throttling */
private final int mStatus;
// Thermal status is divided into seven levels
@IntDef(prefix = { "THROTTLING_" }, value = {
THROTTLING_NONE,
THROTTLING_LIGHT,
THROTTLING_MODERATE,
THROTTLING_SEVERE,
THROTTLING_CRITICAL,
THROTTLING_EMERGENCY,
THROTTLING_SHUTDOWN,
})
// Device types
@IntDef(prefix = { "TYPE_" }, value = {
TYPE_UNKNOWN,
TYPE_CPU,
TYPE_GPU,
TYPE_BATTERY,
TYPE_SKIN,
TYPE_USB_PORT,
TYPE_POWER_AMPLIFIER,
TYPE_BCL_VOLTAGE,
TYPE_BCL_CURRENT,
TYPE_BCL_PERCENTAGE,
TYPE_NPU,
TYPE_TPU,
TYPE_DISPLAY,
TYPE_MODEM,
TYPE_SOC
})
public Temperature(float value, @Type int type,
@NonNull String name, @ThrottlingStatus int status) {
Preconditions.checkArgument(isValidType(type), "Invalid Type");
Preconditions.checkArgument(isValidStatus(status) , "Invalid Status");
mValue = value;
mType = type;
mName = Preconditions.checkStringNotEmpty(name);
mStatus = status;
}
......
}
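The seven levels are consecutive integers starting at 0, so validation and naming reduce to a range check and a table lookup. The sketch below mirrors the isValidStatus check used in the Temperature constructor; the class ThrottlingLevels and the nameOf helper are hypothetical conveniences, not part of the platform class.

```java
/** Hypothetical helper mirroring the seven THROTTLING_* levels (0 = NONE ... 6 = SHUTDOWN). */
final class ThrottlingLevels {
    private static final String[] NAMES = {
        "NONE", "LIGHT", "MODERATE", "SEVERE", "CRITICAL", "EMERGENCY", "SHUTDOWN"
    };

    private ThrottlingLevels() {}

    /** True when the value is one of the seven defined levels. */
    static boolean isValidStatus(int status) {
        return status >= 0 && status < NAMES.length;
    }

    /** Returns the level name, rejecting out-of-range values like the constructor above does. */
    static String nameOf(int status) {
        if (!isValidStatus(status)) {
            throw new IllegalArgumentException("Invalid Status: " + status);
        }
        return NAMES[status];
    }
}
```

A helper like this is mostly useful for logging, where a bare integer such as 3 is harder to read than SEVERE.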
UML:
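4.4 Google Pixel Thermal HAL
The Thermal HAL consists of the thermal service, the thermal helper, and the thermal watcher, which relate to one another as follows:
1) Thermal service
// Thermal.cpp
// When the thermal helper object is created, sendThermalChangedCallback is registered as its callback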
Thermal::Thermal()
: thermal_helper_(
std::bind(&Thermal::sendThermalChangedCallback, this, std::placeholders::_1)) {}
// Binder interface called by upper layers such as ThermalManagerService to get the temperature and thermal status for a given device type
// A single device may have several sensors, so a vector is returned
// Binder API
ndk::ScopedAStatus Thermal::getTemperaturesWithType(TemperatureType type,
std::vector<Temperature> *_aidl_return) {
return getFilteredTemperatures(true, type, _aidl_return);
}
ndk::ScopedAStatus Thermal::getFilteredTemperatures(bool filterType, TemperatureType type,
std::vector<Temperature> *_aidl_return) {
*_aidl_return = {};
if (!thermal_helper_.isInitializedOk()) {
return initErrorStatus();
}
// Read the device temperature values
if (!thermal_helper_.fillCurrentTemperatures(filterType, false, type, _aidl_return)) {
return readErrorStatus();
}
return ndk::ScopedAStatus::ok();
}
// Callback that notifies the upper-layer ThermalManagerService; used for passively monitoring device temperature and thermal status
// The thermal watcher module polls the device's thermal events at regular intervals
void Thermal::sendThermalChangedCallback(const Temperature &t) {
ATRACE_CALL();
std::lock_guard<std::mutex> _lock(thermal_callback_mutex_);
LOG(VERBOSE) << "Sending notification: "
<< " Type: " << toString(t.type) << " Name: " << t.name
<< " CurrentValue: " << t.value
<< " ThrottlingStatus: " << toString(t.throttlingStatus);
// Iterate over every callback object in callbacks_ and invoke its notifyThrottling method
callbacks_.erase(std::remove_if(callbacks_.begin(), callbacks_.end(),
[&](const CallbackSetting &c) {
if (!c.is_filter_type || t.type == c.type) {
::ndk::ScopedAStatus ret =
c.callback->notifyThrottling(t);
if (!ret.isOk()) {
LOG(ERROR) << "a Thermal callback is dead, removed "
"from callback list.";
return true;
}
return false;
}
return false;
}),
callbacks_.end());
}
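sendThermalChangedCallback notifies and prunes in a single pass: the erase/remove_if predicate both delivers the event and reports whether the client turned out to be dead. The same idea in Java is a removeIf whose predicate has a side effect. This is a sketch with hypothetical names (CallbackList, ThrottlingCallback); a false return value stands in for a failed Binder call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of "notify every matching callback, drop the ones that fail". */
class CallbackList {
    interface ThrottlingCallback {
        /** Returns false when delivery failed, for example because the client died. */
        boolean notifyThrottling(String sensorName, int severity);
    }

    private static final class Entry {
        final ThrottlingCallback callback;
        final boolean filterType;
        final int type;

        Entry(ThrottlingCallback callback, boolean filterType, int type) {
            this.callback = callback;
            this.filterType = filterType;
            this.type = type;
        }
    }

    private final List<Entry> callbacks = new ArrayList<>();

    void add(ThrottlingCallback callback, boolean filterType, int type) {
        callbacks.add(new Entry(callback, filterType, type));
    }

    /** Delivers the event to unfiltered or type-matching callbacks and removes failed ones. */
    void send(String sensorName, int type, int severity) {
        callbacks.removeIf(entry -> {
            if (!entry.filterType || entry.type == type) {
                return !entry.callback.notifyThrottling(sensorName, severity);
            }
            return false; // not interested in this type: keep it, do not call it
        });
    }

    int size() { return callbacks.size(); }
}
```

One consequence worth noting: a dead client that filters on a different type is only discovered the next time an event of its type is sent.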
2) Thermal helper
// Thermal sysfs node paths
constexpr std::string_view kThermalSensorsRoot("/sys/devices/virtual/thermal");
constexpr std::string_view kSensorPrefix("thermal_zone");
constexpr std::string_view kCoolingDevicePrefix("cooling_device");
constexpr std::string_view kThermalNameFile("type");
constexpr std::string_view kSensorPolicyFile("policy");
constexpr std::string_view kSensorTempSuffix("temp");
constexpr std::string_view kSensorTripPointTempZeroFile("trip_point_0_temp");
constexpr std::string_view kSensorTripPointHystZeroFile("trip_point_0_hyst");
constexpr std::string_view kUserSpaceSuffix("user_space");
constexpr std::string_view kCoolingDeviceCurStateSuffix("cur_state");
constexpr std::string_view kCoolingDeviceMaxStateSuffix("max_state");
constexpr std::string_view kCoolingDeviceState2powerSuffix("state2power_table");
constexpr std::string_view kConfigProperty("vendor.thermal.config");
constexpr std::string_view kConfigDefaultFileName("thermal_info_config.json");
constexpr std::string_view kThermalGenlProperty("persist.vendor.enable.thermal.genl");
constexpr std::string_view kThermalDisabledProperty("vendor.disable.thermalhal.control");
// Callback type alias
using NotificationCallback = std::function<void(const Temperature &t)>;
// Constructing a ThermalHelper initializes the ThermalWatcher object and registers thermalWatcherCallbackFunc as its callback
// The NotificationCallback passed in is sendThermalChangedCallback
ThermalHelper::ThermalHelper(const NotificationCallback &cb)
: thermal_watcher_(new ThermalWatcher(
std::bind(&ThermalHelper::thermalWatcherCallbackFunc, this, std::placeholders::_1))),
cb_(cb),
cooling_device_info_map_(ParseCoolingDevice(
"/vendor/etc/" +
android::base::GetProperty(kConfigProperty.data(), kConfigDefaultFileName.data()))),
sensor_info_map_(ParseSensorInfo(
"/vendor/etc/" +
android::base::GetProperty(kConfigProperty.data(), kConfigDefaultFileName.data()))) {
for (auto const &name_status_pair : sensor_info_map_) {
sensor_status_map_[name_status_pair.first] = {
.severity = ThrottlingSeverity::NONE,
.prev_hot_severity = ThrottlingSeverity::NONE,
.prev_cold_severity = ThrottlingSeverity::NONE,
.prev_hint_severity = ThrottlingSeverity::NONE,
};
}
{...}
// This is called in the different thread context and will update sensor_status
// uevent_sensors is the set of sensors which trigger uevent from thermal core driver.
// Parse the sensor events out of uevent_sensors
std::chrono::milliseconds ThermalHelper::thermalWatcherCallbackFunc(
const std::set<std::string> &uevent_sensors) {
std::vector<Temperature> temps;
std::vector<std::string> cooling_devices_to_update;
boot_clock::time_point now = boot_clock::now();
auto min_sleep_ms = std::chrono::milliseconds::max();
bool power_data_is_updated = false;
ATRACE_CALL();
for (auto &name_status_pair : sensor_status_map_) {
bool force_update = false;
bool force_no_cache = false;
Temperature temp;
TemperatureThreshold threshold;
SensorStatus &sensor_status = name_status_pair.second;
const SensorInfo &sensor_info = sensor_info_map_.at(name_status_pair.first);
// Only handle the sensors in allow list
if (!sensor_info.is_watch) {
continue;
}
ATRACE_NAME(StringPrintf("ThermalHelper::thermalWatcherCallbackFunc - %s",
name_status_pair.first.data())
.c_str());
std::chrono::milliseconds time_elapsed_ms = std::chrono::milliseconds::zero();
auto sleep_ms = (sensor_status.severity != ThrottlingSeverity::NONE)
? sensor_info.passive_delay
: sensor_info.polling_delay;
if (sensor_info.virtual_sensor_info != nullptr &&
!sensor_info.virtual_sensor_info->trigger_sensors.empty()) {
for (size_t i = 0; i < sensor_info.virtual_sensor_info->trigger_sensors.size(); i++) {
const auto &trigger_sensor_status =
sensor_status_map_.at(sensor_info.virtual_sensor_info->trigger_sensors[i]);
if (trigger_sensor_status.severity != ThrottlingSeverity::NONE) {
sleep_ms = sensor_info.passive_delay;
break;
}
}
}
// Check if the sensor needs to be updated
if (sensor_status.last_update_time == boot_clock::time_point::min()) {
force_update = true;
} else {
time_elapsed_ms = std::chrono::duration_cast<std::chrono::milliseconds>(
now - sensor_status.last_update_time);
if (uevent_sensors.size()) {
if (sensor_info.virtual_sensor_info != nullptr) {
for (size_t i = 0; i < sensor_info.virtual_sensor_info->trigger_sensors.size();
i++) {
if (uevent_sensors.find(
sensor_info.virtual_sensor_info->trigger_sensors[i]) !=
uevent_sensors.end()) {
force_update = true;
break;
}
}
} else if (uevent_sensors.find(name_status_pair.first) != uevent_sensors.end()) {
force_update = true;
force_no_cache = true;
}
} else if (time_elapsed_ms > sleep_ms) {
force_update = true;
}
}
{
std::lock_guard<std::shared_mutex> _lock(sensor_status_map_mutex_);
if (sensor_status.emul_setting != nullptr &&
sensor_status.emul_setting->pending_update) {
force_update = true;
sensor_status.emul_setting->pending_update = false;
LOG(INFO) << "Update " << name_status_pair.first.data()
<< " right away with emul setting";
}
}
LOG(VERBOSE) << "sensor " << name_status_pair.first
<< ": time_elapsed=" << time_elapsed_ms.count()
<< ", sleep_ms=" << sleep_ms.count() << ", force_update = " << force_update
<< ", force_no_cache = " << force_no_cache;
if (!force_update) {
auto timeout_remaining = sleep_ms - time_elapsed_ms;
if (min_sleep_ms > timeout_remaining) {
min_sleep_ms = timeout_remaining;
}
LOG(VERBOSE) << "sensor " << name_status_pair.first
<< ": timeout_remaining=" << timeout_remaining.count();
continue;
}
std::pair<ThrottlingSeverity, ThrottlingSeverity> throttling_status;
// Read the device temperature
if (!readTemperature(name_status_pair.first, &temp, &throttling_status, force_no_cache)) {
LOG(ERROR) << __func__
<< ": error reading temperature for sensor: " << name_status_pair.first;
continue;
}
if (!readTemperatureThreshold(name_status_pair.first, &threshold)) {
LOG(ERROR) << __func__ << ": error reading temperature threshold for sensor: "
<< name_status_pair.first;
continue;
}
{
// writer lock
std::unique_lock<std::shared_mutex> _lock(sensor_status_map_mutex_);
if (throttling_status.first != sensor_status.prev_hot_severity) {
sensor_status.prev_hot_severity = throttling_status.first;
}
if (throttling_status.second != sensor_status.prev_cold_severity) {
sensor_status.prev_cold_severity = throttling_status.second;
}
if (temp.throttlingStatus != sensor_status.severity) {
temps.push_back(temp);
sensor_status.severity = temp.throttlingStatus;
sleep_ms = (sensor_status.severity != ThrottlingSeverity::NONE)
? sensor_info.passive_delay
: sensor_info.polling_delay;
}
}
if (!power_data_is_updated) {
power_files_.refreshPowerStatus();
power_data_is_updated = true;
}
if (sensor_status.severity == ThrottlingSeverity::NONE) {
thermal_throttling_.clearThrottlingData(name_status_pair.first, sensor_info);
} else {
// update thermal throttling request
thermal_throttling_.thermalThrottlingUpdate(
temp, sensor_info, sensor_status.severity, time_elapsed_ms,
power_files_.GetPowerStatusMap(), cooling_device_info_map_);
}
thermal_throttling_.computeCoolingDevicesRequest(
name_status_pair.first, sensor_info, sensor_status.severity,
&cooling_devices_to_update, &thermal_stats_helper_);
if (min_sleep_ms > sleep_ms) {
min_sleep_ms = sleep_ms;
}
LOG(VERBOSE) << "Sensor " << name_status_pair.first << ": sleep_ms=" << sleep_ms.count()
<< ", min_sleep_ms voting result=" << min_sleep_ms.count();
sensor_status.last_update_time = now;
}
if (!cooling_devices_to_update.empty()) {
updateCoolingDevices(cooling_devices_to_update);
}
// If any sensor's severity changed (its temperature crossed a threshold from the JSON config file), invoke the callback
if (!temps.empty()) {
for (const auto &t : temps) {
if (sensor_info_map_.at(t.name).send_cb && cb_) {
// Invokes sendThermalChangedCallback through the stored callback
cb_(t);
}
if (sensor_info_map_.at(t.name).send_powerhint && isAidlPowerHalExist()) {
sendPowerExtHint(t);
}
}
}
int count_failed_reporting = thermal_stats_helper_.reportStats();
if (count_failed_reporting != 0) {
LOG(ERROR) << "Failed to report " << count_failed_reporting << " thermal stats";
}
return min_sleep_ms;
}
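The sleep interval returned by thermalWatcherCallbackFunc is the result of a vote: each watched sensor proposes either its own delay (if it was just updated) or its remaining time, and the smallest proposal wins. The following is a simplified sketch of that vote that ignores uevents and emulation settings. PollingScheduler and Sensor are hypothetical names; the choice between passive and polling delay follows the code above.

```java
import java.util.List;

/** Hypothetical sketch of the min-sleep vote across watched sensors. */
class PollingScheduler {
    static final class Sensor {
        final long pollingDelayMs; // used while severity is NONE
        final long passiveDelayMs; // used while the sensor is throttling
        final boolean throttling;
        final long elapsedMs;      // time since this sensor was last updated

        Sensor(long pollingDelayMs, long passiveDelayMs, boolean throttling, long elapsedMs) {
            this.pollingDelayMs = pollingDelayMs;
            this.passiveDelayMs = passiveDelayMs;
            this.throttling = throttling;
            this.elapsedMs = elapsedMs;
        }
    }

    /** Returns how long the watcher may sleep before some sensor needs attention. */
    static long nextSleepMs(List<Sensor> sensors) {
        long minSleep = Long.MAX_VALUE;
        for (Sensor s : sensors) {
            long sleep = s.throttling ? s.passiveDelayMs : s.pollingDelayMs;
            // A sensor that is overdue gets refreshed now and votes its full delay;
            // otherwise it votes the time remaining until it is due.
            long vote = (s.elapsedMs > sleep) ? sleep : sleep - s.elapsedMs;
            minSleep = Math.min(minSleep, vote);
        }
        return minSleep;
    }
}
```

Because throttling sensors use the shorter passive delay, a single hot sensor is enough to speed up the whole loop.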
// Get the device temperature and thermal status
bool ThermalHelper::readTemperature(
std::string_view sensor_name, Temperature *out,
std::pair<ThrottlingSeverity, ThrottlingSeverity> *throttling_status,
const bool force_no_cache) {
// Return fail if the thermal sensor cannot be read.
float temp;
std::map<std::string, float> sensor_log_map;
auto &sensor_status = sensor_status_map_.at(sensor_name.data());
// Read the temperature by sensor name and store it in temp
if (!readThermalSensor(sensor_name, &temp, force_no_cache, &sensor_log_map)) {
LOG(ERROR) << "readTemperature: failed to read sensor: " << sensor_name;
return false;
}
// Fill in out
const auto &sensor_info = sensor_info_map_.at(sensor_name.data());
out->type = sensor_info.type;
out->name = sensor_name.data();
out->value = temp * sensor_info.multiplier; // the raw sensor value times a multiplier defined in the JSON config file
std::pair<ThrottlingSeverity, ThrottlingSeverity> status =
std::make_pair(ThrottlingSeverity::NONE, ThrottlingSeverity::NONE);
// Only update status if the thermal sensor is being monitored
if (sensor_info.is_watch) {
ThrottlingSeverity prev_hot_severity, prev_cold_severity;
{
// reader lock, readTemperature will be called in Binder call and the watcher thread.
std::shared_lock<std::shared_mutex> _lock(sensor_status_map_mutex_);
prev_hot_severity = sensor_status.prev_hot_severity;
prev_cold_severity = sensor_status.prev_cold_severity;
}
status = getSeverityFromThresholds(sensor_info.hot_thresholds, sensor_info.cold_thresholds,
sensor_info.hot_hysteresis, sensor_info.cold_hysteresis,
prev_hot_severity, prev_cold_severity, out->value);
}
if (throttling_status) {
*throttling_status = status;
}
if (sensor_status.emul_setting != nullptr && sensor_status.emul_setting->emul_severity >= 0) {
std::shared_lock<std::shared_mutex> _lock(sensor_status_map_mutex_);
out->throttlingStatus =
static_cast<ThrottlingSeverity>(sensor_status.emul_setting->emul_severity);
} else {
out->throttlingStatus =
static_cast<size_t>(status.first) > static_cast<size_t>(status.second)
? status.first
: status.second;
}
if (sensor_info.is_watch) {
std::ostringstream sensor_log;
for (const auto &sensor_log_pair : sensor_log_map) {
sensor_log << sensor_log_pair.first << ":" << sensor_log_pair.second << " ";
}
// Update sensor temperature time in state
thermal_stats_helper_.updateSensorTempStatsBySeverity(sensor_name, out->throttlingStatus);
LOG(INFO) << sensor_name.data() << ":" << out->value << " raw data: " << sensor_log.str();
}
return true;
}
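readTemperature delegates to getSeverityFromThresholds, which is not shown in this article. The sketch below illustrates the general idea of threshold hysteresis on the hot side only: severity rises as soon as a threshold is reached, but falls only after the temperature drops below the threshold minus its hysteresis. It is a simplified assumption about how such a function can work, not the HAL's implementation, and HysteresisSeverity is a hypothetical name.

```java
/** Hypothetical hot-side severity lookup with hysteresis; index = severity level. */
class HysteresisSeverity {

    /**
     * thresholds[i] is the temperature at which level i starts (NaN = level unused);
     * hysteresis[i] is how far below thresholds[i] the temperature must fall to leave level i.
     */
    static int hotSeverity(float[] thresholds, float[] hysteresis, int previous, float temp) {
        int rising = 0;  // highest level whose threshold is reached
        int holding = 0; // highest level that is still inside its hysteresis band
        for (int i = thresholds.length - 1; i > 0; i--) {
            if (Float.isNaN(thresholds[i])) {
                continue;
            }
            if (rising == 0 && temp >= thresholds[i]) {
                rising = i;
            }
            if (holding == 0 && temp > thresholds[i] - hysteresis[i]) {
                holding = i;
            }
        }
        // When cooling down, stay at the held level, but never report above the previous level.
        return rising < previous ? Math.min(holding, previous) : rising;
    }
}
```

With thresholds of 40, 45, and 50 degrees and a hysteresis of 2, a sensor that reached level 2 at 45 degrees stays at level 2 until it cools to 43 degrees or below, which avoids flapping around the threshold.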
// Read the temperature of a device sensor
bool ThermalHelper::readThermalSensor(std::string_view sensor_name, float *temp,
const bool force_no_cache,
std::map<std::string, float> *sensor_log_map) {
float temp_val = 0.0;
std::string file_reading;
boot_clock::time_point now = boot_clock::now();
ATRACE_NAME(StringPrintf("ThermalHelper::readThermalSensor - %s", sensor_name.data()).c_str());
if (!(sensor_info_map_.count(sensor_name.data()) &&
sensor_status_map_.count(sensor_name.data()))) {
return false;
}
const auto &sensor_info = sensor_info_map_.at(sensor_name.data());
auto &sensor_status = sensor_status_map_.at(sensor_name.data());
{
std::shared_lock<std::shared_mutex> _lock(sensor_status_map_mutex_);
if (sensor_status.emul_setting != nullptr &&
!isnan(sensor_status.emul_setting->emul_temp)) {
*temp = sensor_status.emul_setting->emul_temp;
return true;
}
}
// Check if thermal data need to be read from cache
if (!force_no_cache &&
(sensor_status.thermal_cached.timestamp != boot_clock::time_point::min()) &&
(std::chrono::duration_cast<std::chrono::milliseconds>(
now - sensor_status.thermal_cached.timestamp) < sensor_info.time_resolution) &&
!isnan(sensor_status.thermal_cached.temp)) {
*temp = sensor_status.thermal_cached.temp;
(*sensor_log_map)[sensor_name.data()] = *temp;
ATRACE_INT((sensor_name.data() + std::string("-cached")).c_str(), static_cast<int>(*temp));
return true;
}
// Reading thermal sensor according to it's composition
if (sensor_info.virtual_sensor_info == nullptr) {
if (!thermal_sensors_.readThermalFile(sensor_name.data(), &file_reading)) {
return false;
}
if (file_reading.empty()) {
LOG(ERROR) << "failed to read sensor: " << sensor_name;
return false;
}
*temp = std::stof(::android::base::Trim(file_reading));
} else {
for (size_t i = 0; i < sensor_info.virtual_sensor_info->linked_sensors.size(); i++) {
float sensor_reading = 0.0;
// Get the sensor reading data
if (!readDataByType(sensor_info.virtual_sensor_info->linked_sensors[i], &sensor_reading,
sensor_info.virtual_sensor_info->linked_sensors_type[i],
force_no_cache, sensor_log_map)) {
LOG(ERROR) << "Failed to read " << sensor_name.data() << "'s linked sensor "
<< sensor_info.virtual_sensor_info->linked_sensors[i];
}
if (std::isnan(sensor_info.virtual_sensor_info->coefficients[i])) {
return false;
}
float coefficient = sensor_info.virtual_sensor_info->coefficients[i];
switch (sensor_info.virtual_sensor_info->formula) {
case FormulaOption::COUNT_THRESHOLD:
if ((coefficient < 0 && sensor_reading < -coefficient) ||
(coefficient >= 0 && sensor_reading >= coefficient))
temp_val += 1;
break;
case FormulaOption::WEIGHTED_AVG:
temp_val += sensor_reading * coefficient;
break;
case FormulaOption::MAXIMUM:
if (i == 0)
temp_val = std::numeric_limits<float>::lowest();
if (sensor_reading * coefficient > temp_val)
temp_val = sensor_reading * coefficient;
break;
case FormulaOption::MINIMUM:
if (i == 0)
temp_val = std::numeric_limits<float>::max();
if (sensor_reading * coefficient < temp_val)
temp_val = sensor_reading * coefficient;
break;
default:
break;
}
}
*temp = (temp_val + sensor_info.virtual_sensor_info->offset);
}
(*sensor_log_map)[sensor_name.data()] = *temp;
ATRACE_INT(sensor_name.data(), static_cast<int>(*temp));
{
std::unique_lock<std::shared_mutex> _lock(sensor_status_map_mutex_);
sensor_status.thermal_cached.temp = *temp;
sensor_status.thermal_cached.timestamp = now;
}
auto real_temp = (*temp) * sensor_info.multiplier;
// Update the thermal temperature statistics
thermal_stats_helper_.updateSensorTempStatsByThreshold(sensor_name, real_temp);
return true;
}
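Summary: users can tune the multiplier in the thermal_info_config.json configuration file (not shown here) to suit their hardware, so that the reported device temperature is more accurate.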
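The virtual-sensor branch of readThermalSensor combines several linked readings with one of four formulas and then adds an offset. The sketch below restates that arithmetic in isolation. VirtualSensor and combine are hypothetical names, and the sample coefficients in the usage note are made up rather than taken from any real configuration file.

```java
/** Hypothetical restatement of the four virtual-sensor formulas shown above. */
class VirtualSensor {
    enum Formula { COUNT_THRESHOLD, WEIGHTED_AVG, MAXIMUM, MINIMUM }

    /** Combines linked sensor readings with their coefficients, then adds the offset. */
    static float combine(Formula formula, float[] readings, float[] coefficients, float offset) {
        float acc = 0f;
        for (int i = 0; i < readings.length; i++) {
            float reading = readings[i];
            float c = coefficients[i];
            switch (formula) {
                case COUNT_THRESHOLD:
                    // Negative coefficient: count readings below |c|; otherwise count readings >= c.
                    if ((c < 0 && reading < -c) || (c >= 0 && reading >= c)) {
                        acc += 1;
                    }
                    break;
                case WEIGHTED_AVG:
                    acc += reading * c;
                    break;
                case MAXIMUM:
                    if (i == 0) {
                        acc = -Float.MAX_VALUE;
                    }
                    acc = Math.max(acc, reading * c);
                    break;
                case MINIMUM:
                    if (i == 0) {
                        acc = Float.MAX_VALUE;
                    }
                    acc = Math.min(acc, reading * c);
                    break;
            }
        }
        return acc + offset;
    }
}
```

For example, two raw readings of 40000 and 50000 with coefficients of 0.5 each and an offset of 1000 combine to 46000 under WEIGHTED_AVG; a multiplier of 0.001 would then turn that into 46 degrees Celsius.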
3) Thermal Watcher
// Callback type alias
using WatcherCallback = std::function<std::chrono::milliseconds(const std::set<std::string> &name)>;
// The constructor initializes the callback and the looper
// The WatcherCallback passed in is thermalWatcherCallbackFunc
explicit ThermalWatcher(const WatcherCallback &cb)
: Thread(false), cb_(cb), looper_(new ::android::Looper(true)) {}
bool ThermalWatcher::threadLoop() {
LOG(VERBOSE) << "ThermalWatcher polling...";
int fd;
std::set<std::string> sensors;
auto time_elapsed_ms = std::chrono::duration_cast<std::chrono::milliseconds>(boot_clock::now() -
last_update_time_);
if (time_elapsed_ms < sleep_ms_ &&
looper_->pollOnce(sleep_ms_.count(), &fd, nullptr, nullptr) >= 0) {
ATRACE_NAME("ThermalWatcher::threadLoop - receive event");
if (fd != uevent_fd_.get() && fd != thermal_genl_fd_.get()) {
return true;
} else if (fd == thermal_genl_fd_.get()) {
parseGenlink(&sensors);
} else if (fd == uevent_fd_.get()) {
parseUevent(&sensors);
}
// Ignore cb_ if uevent is not from monitored sensors
if (sensors.size() == 0) {
return true;
}
}
// Call back the thermal helper's thermalWatcherCallbackFunc
// sensors holds the names of the sensors that raised events
sleep_ms_ = cb_(sensors);
last_update_time_ = boot_clock::now();
return true;
}
4.5 android.hardware.thermal library
android.hardware.thermal@1.0 library:
// hardware/interfaces/thermal/1.0/Android.bp
hidl_interface {
name: "android.hardware.thermal@1.0",
root: "android.hardware",
srcs: [
"types.hal",
"IThermal.hal",
],
interfaces: [
"android.hidl.base@1.0",
],
gen_java: true, // generate the corresponding Java files for callers in the Java layer
gen_java_constants: true,
}
// IThermal.hal
package android.hardware.thermal@1.0;
interface IThermal {
/**
* Retrieves temperatures in Celsius.
*
* @return status Status of the operation. If status code is FAILURE,
* the status.debugMessage must be populated with the human-readable
* error message.
* @return temperatures If status code is SUCCESS, it's filled with the
* current temperatures. The order of temperatures of built-in
* devices (such as CPUs, GPUs and etc.) in the list must be kept
* the same regardless the number of calls to this method even if
* they go offline, if these devices exist on boot. The method
* always returns and never removes such temperatures.
*
*/
// Get temperatures
@callflow(next={"*"})
@entry
@exit
getTemperatures()
generates (ThermalStatus status, vec<Temperature> temperatures);
/**
* Retrieves CPU usage information of each core: active and total times
* in ms since first boot.
*
* @return status Status of the operation. If status code is FAILURE,
* the status.debugMessage must be populated with the human-readable
* error message.
* @return cpuUsages If status code is SUCCESS, it's filled with the current
* CPU usages. The order and number of CPUs in the list must be kept
* the same regardless the number of calls to this method.
*
*/
// Get per-core CPU usage: active and total time in ms since first boot
@callflow(next={"*"})
@entry
@exit
getCpuUsages() generates (ThermalStatus status, vec<CpuUsage> cpuUsages);
/**
* Retrieves the cooling devices information.
*
* @return status Status of the operation. If status code is FAILURE,
* the status.debugMessage must be populated with the human-readable
* error message.
* @return devices If status code is SUCCESS, it's filled with the current
* cooling device information. The order of built-in cooling
* devices in the list must be kept the same regardless the number
* of calls to this method even if they go offline, if these devices
* exist on boot. The method always returns and never removes from
* the list such cooling devices.
*
*/
// Get cooling device information
@callflow(next={"*"})
@entry
@exit
getCoolingDevices()
generates (ThermalStatus status, vec<CoolingDevice> devices);
};
// The package is android.hardware.thermal@1.0; it provides three hwbinder interfaces: getTemperatures(), getCpuUsages(), and getCoolingDevices().
// types.hal
package android.hardware.thermal@1.0;
/** Device temperature types */
@export
enum TemperatureType : int32_t { // device temperature types: CPU, GPU, battery, and skin (device surface)
UNKNOWN = -1,
CPU = 0,
GPU = 1,
BATTERY = 2,
SKIN = 3,
};
enum CoolingType : uint32_t {
/** Fan cooling device speed in RPM. */
FAN_RPM = 0,
};
struct Temperature {
/**
* This temperature's type.
*/
TemperatureType type;
/**
* Name of this temperature.
* All temperatures of the same "type" must have a different "name",
* e.g., cpu0, battery.
*/
string name;
/**
* Current temperature in Celsius. If not available set by HAL to NAN.
* Current temperature can be in any units if type=UNKNOWN.
*/
float currentValue; // current temperature
/**
* Throttling temperature constant for this temperature.
* If not available, set by HAL to NAN.
*/
float throttlingThreshold; // throttling threshold
/**
* Shutdown temperature constant for this temperature.
* If not available, set by HAL to NAN.
*/
float shutdownThreshold; // shutdown threshold
/**
* Threshold temperature above which the VR mode clockrate minimums cannot
* be maintained for this device.
* If not available, set by HAL to NAN.
*/
float vrThrottlingThreshold;
};
// Cooling device; the only type defined in HAL 1.0 is a fan (FAN_RPM)
struct CoolingDevice {
/**
* This cooling device type.
*/
CoolingType type;
/**
* Name of this cooling device.
* All cooling devices of the same "type" must have a different "name".
*/
string name;
/**
* Current cooling device value. Units depend on cooling device "type".
*/
float currentValue;
};
// Per-CPU usage information
struct CpuUsage {
/**
* Name of this CPU.
* All CPUs must have a different "name".
*/
string name;
/**
* Active time since the last boot in ms.
*/
uint64_t active; // time the CPU has been active
/**
* Total time since the last boot in ms.
*/
uint64_t total;
/**
* Is set to true when a core is online.
* If the core is offline, all other members except |name| should be ignored.
*/
bool isOnline; // whether the CPU is online
};
enum ThermalStatusCode : uint32_t {
/** No errors. */
SUCCESS = 0,
/** Unknown failure occured. */
FAILURE = 1
};
/**
* Generic structure to return the status of any thermal operation.
*/
struct ThermalStatus {
ThermalStatusCode code;
/**
* A specific error message to provide more information.
* This can be used for debugging purposes only.
*/
string debugMessage;
};
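The 1.0 types above state that currentValue and the thresholds are set to NAN when they are not available, so a consumer has to check for NAN before comparing. The following is a minimal sketch of such a check in plain C++. The struct only mirrors a subset of the HIDL Temperature fields, and ThermalLevel and classify() are hypothetical names introduced here for illustration; they are not part of the HAL.

```cpp
#include <cmath>
#include <string>

// Plain C++ mirror of a subset of the 1.0 Temperature struct (illustrative only).
struct Temperature {
    std::string name;
    float currentValue;         // NAN if not available
    float throttlingThreshold;  // NAN if not available
    float shutdownThreshold;    // NAN if not available
};

// Hypothetical classification, not defined by the HAL.
enum class ThermalLevel { UNKNOWN, NORMAL, THROTTLING, SHUTDOWN };

// Classifies a reading while treating NAN fields as "not available".
ThermalLevel classify(const Temperature &t) {
    if (std::isnan(t.currentValue)) {
        return ThermalLevel::UNKNOWN;  // no reading, nothing to compare
    }
    // Comparisons against NAN are always false, but the explicit checks
    // make the "threshold not available" case visible to the reader.
    if (!std::isnan(t.shutdownThreshold) && t.currentValue >= t.shutdownThreshold) {
        return ThermalLevel::SHUTDOWN;
    }
    if (!std::isnan(t.throttlingThreshold) && t.currentValue >= t.throttlingThreshold) {
        return ThermalLevel::THROTTLING;
    }
    return ThermalLevel::NORMAL;
}
```

With this sketch, a reading whose thresholds are all NAN is classified as NORMAL, and a reading whose currentValue is NAN is classified as UNKNOWN.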
The android.hardware.thermal@2.0 library:
// hardware/interfaces/thermal/2.0/Android.bp
hidl_interface {
name: "android.hardware.thermal@2.0",
root: "android.hardware",
srcs: [
"types.hal",
"IThermal.hal",
"IThermalChangedCallback.hal",
],
interfaces: [
"android.hardware.thermal@1.0",
"android.hidl.base@1.0",
],
gen_java: true,
}
---------------------------------------------------------
// hardware/interfaces/thermal/2.0/IThermal.hal
package android.hardware.thermal@2.0;
import android.hardware.thermal@1.0::IThermal;
import android.hardware.thermal@1.0::ThermalStatus;
import IThermalChangedCallback;
interface IThermal extends @1.0::IThermal {
/**
* Retrieves temperatures in Celsius.
*
* @param filterType whether to filter the result for a given type.
* @param type the TemperatureType such as battery or skin.
*
* @return status Status of the operation. If status code is FAILURE,
* the status.debugMessage must be populated with a human-readable
* error message.
*
* @return temperatures If status code is SUCCESS, it's filled with the
* current temperatures. The order of temperatures of built-in
* devices (such as CPUs, GPUs and etc.) in the list must be kept
* the same regardless of the number of calls to this method even if
* they go offline, if these devices exist on boot. The method
* always returns and never removes such temperatures.
*/
// Retrieve current temperatures, optionally filtered by device type
getCurrentTemperatures(bool filterType, TemperatureType type)
generates (ThermalStatus status, vec<Temperature> temperatures);
/**
* Retrieves static temperature thresholds in Celsius.
*
* @param filterType whether to filter the result for a given type.
* @param type the TemperatureType such as battery or skin.
*
* @return status Status of the operation. If status code is FAILURE,
* the status.debugMessage must be populated with a human-readable error message.
* @return temperatureThresholds If status code is SUCCESS, it's filled with the
* temperatures thresholds. The order of temperatures of built-in
* devices (such as CPUs, GPUs and etc.) in the list must be kept
* the same regardless of the number of calls to this method even if
* they go offline, if these devices exist on boot. The method
* always returns and never removes such temperatures. The thresholds
* are returned as static values and must not change across calls. The actual
* throttling state is determined in device thermal mitigation policy/agorithm
* which might not be simple thresholds so these values Thermal HAL provided
* may not be accurate to detemin the throttling status. To get accurate
* throttling status, use getCurrentTemperatures or registerThermalChangedCallback
* and listen to the callback.
*/
// Retrieve temperature thresholds, optionally filtered by device type
getTemperatureThresholds(bool filterType, TemperatureType type)
generates (ThermalStatus status, vec<TemperatureThreshold> temperatureThresholds);
/**
* Register an IThermalChangedCallback, used by the Thermal HAL
* to receive thermal events when thermal mitigation status changed.
* Multiple registrations with different IThermalChangedCallback must be allowed.
* Multiple registrations with same IThermalChangedCallback is not allowed, client
* should unregister the given IThermalChangedCallback first.
*
* @param callback the IThermalChangedCallback to use for receiving
* thermal events (nullptr callback will lead to failure with status code FAILURE).
* @param filterType if filter for given sensor type.
* @param type the type to be filtered.
*
* @return status Status of the operation. If status code is FAILURE,
* the status.debugMessage must be populated with a human-readable error message.
*/
// Register an IThermalChangedCallback; the Thermal HAL uses it to deliver thermal events to the client when the thermal mitigation status changes
registerThermalChangedCallback(IThermalChangedCallback callback,
bool filterType,
TemperatureType type)
generates (ThermalStatus status);
/**
* Unregister an IThermalChangedCallback, used by the Thermal HAL
* to receive thermal events when thermal mitigation status changed.
*
* @param callback the IThermalChangedCallback used for receiving
* thermal events (nullptr callback will lead to failure with status code FAILURE).
*
* @return status Status of the operation. If status code is FAILURE,
* the status.debugMessage must be populated with a human-readable error message.
*/
unregisterThermalChangedCallback(IThermalChangedCallback callback)
generates (ThermalStatus status);
/**
* Retrieves the cooling devices information.
*
* @param filterType whether to filter the result for a given type.
* @param type the CoolingDevice such as CPU/GPU.
*
* @return status Status of the operation. If status code is FAILURE,
* the status.debugMessage must be populated with the human-readable
* error message.
* @return devices If status code is SUCCESS, it's filled with the current
* cooling device information. The order of built-in cooling
* devices in the list must be kept the same regardless of the number
* of calls to this method even if they go offline, if these devices
* exist on boot. The method always returns and never removes from
* the list such cooling devices.
*/
// Retrieve cooling devices, optionally filtered by cooling type
getCurrentCoolingDevices(bool filterType, CoolingType type)
generates (ThermalStatus status, vec<CoolingDevice> devices);
};
// hardware/interfaces/thermal/2.0/IThermalChangedCallback.hal
package android.hardware.thermal@2.0;
import android.hardware.thermal@2.0::Temperature;
/**
* IThermalChangedCallback send throttling notification to clients.
*/
interface IThermalChangedCallback {
/**
* Send a thermal throttling event to all ThermalHAL
* thermal event listeners.
*
* @param temperature The temperature associated with the
* throttling event.
*/
// Notify all clients when the thermal throttling severity changes
oneway notifyThrottling (Temperature temperature);
};
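The registration rules documented in IThermal.hal (a null callback fails, the same callback cannot be registered twice, and a callback may be filtered by temperature type) can be modeled without any HIDL dependency. The sketch below is only an illustration of those semantics in plain C++. CallbackRegistry, ThermalChangedCallback and CountingCallback are hypothetical stand-ins, the Temperature struct is simplified, and the enum mirrors the 1.0 TemperatureType values shown earlier rather than the full 2.0 list.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class TemperatureType { UNKNOWN = -1, CPU = 0, GPU = 1, BATTERY = 2, SKIN = 3 };

// Simplified event payload (illustrative, not the HIDL struct).
struct Temperature {
    TemperatureType type;
    std::string name;
    float value;
};

// Hypothetical stand-in for IThermalChangedCallback.
struct ThermalChangedCallback {
    virtual ~ThermalChangedCallback() = default;
    virtual void notifyThrottling(const Temperature &temperature) = 0;
};

// Example client that just counts the events it receives.
struct CountingCallback : ThermalChangedCallback {
    int count = 0;
    void notifyThrottling(const Temperature &) override { ++count; }
};

class CallbackRegistry {
    struct Entry {
        std::shared_ptr<ThermalChangedCallback> cb;
        bool filterType;
        TemperatureType type;
    };
    std::vector<Entry> entries_;

  public:
    // Fails for a null callback or for a callback that is already registered.
    bool registerCallback(std::shared_ptr<ThermalChangedCallback> cb, bool filterType,
                          TemperatureType type) {
        if (!cb) return false;
        for (const Entry &e : entries_) {
            if (e.cb == cb) return false;
        }
        entries_.push_back(Entry{std::move(cb), filterType, type});
        return true;
    }

    // Fails if the callback was never registered.
    bool unregisterCallback(const std::shared_ptr<ThermalChangedCallback> &cb) {
        for (auto it = entries_.begin(); it != entries_.end(); ++it) {
            if (it->cb == cb) {
                entries_.erase(it);
                return true;
            }
        }
        return false;
    }

    // Delivers the event to every callback whose filter accepts its type.
    void notifyAll(const Temperature &t) const {
        for (const Entry &e : entries_) {
            if (!e.filterType || e.type == t.type) e.cb->notifyThrottling(t);
        }
    }
};
```

A client that registered with filterType set to true and type CPU only sees CPU events in this model, while a client registered with filterType set to false sees every event.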
4.6 Thermal Driver
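The Linux Thermal framework can be divided into five parts: Thermal Core, Thermal Governor, Thermal Cooling, Thermal Driver, and Thermal Device Tree.
During kernel boot, the user space governor is registered with the Thermal Core, so the Linux Thermal framework supports the user space governor policy.
The ThermalWatcher module in the Thermal HAL receives the kernel thermal events and parses them.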
// gov_user_space.c
/**
* notify_user_space - Notifies user space about thermal events
* @tz: thermal_zone_device
* @trip: trip point index
*
* This function notifies the user space through UEvents.
*/
// When the temperature of a device in a thermal zone changes, the user space governor is invoked to notify user space through a uevent
// User space listens for the uevent
static int notify_user_space(struct thermal_zone_device *tz, int trip)
{
char *thermal_prop[5];
int i;
mutex_lock(&tz->lock);
thermal_prop[0] = kasprintf(GFP_KERNEL, "NAME=%s", tz->type);
thermal_prop[1] = kasprintf(GFP_KERNEL, "TEMP=%d", tz->temperature);
thermal_prop[2] = kasprintf(GFP_KERNEL, "TRIP=%d", trip);
thermal_prop[3] = kasprintf(GFP_KERNEL, "EVENT=%d", tz->notify_event);
thermal_prop[4] = NULL;
// Send the uevent with the properties built above
kobject_uevent_env(&tz->device.kobj, KOBJ_CHANGE, thermal_prop);
for (i = 0; i < 4; ++i)
kfree(thermal_prop[i]);
mutex_unlock(&tz->lock);
return 0;
}
static struct thermal_governor thermal_gov_user_space = {
.name = "user_space",
.throttle = notify_user_space,
};
// kobject_uevent.c
/**
* kobject_uevent_env - send an uevent with environmental data
*
* @kobj: struct kobject that the action is happening to
* @action: action that is happening
* @envp_ext: pointer to environmental data
*
* Returns 0 if kobject_uevent_env() is completed with success or the
* corresponding error when it fails.
*/
int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
char *envp_ext[])
{
struct kobj_uevent_env *env; // environment buffer used to build the uevent
const char *action_string = kobject_actions[action];
const char *devpath = NULL;
const char *subsystem;
struct kobject *top_kobj;
struct kset *kset;
const struct kset_uevent_ops *uevent_ops;
int i = 0;
int retval = 0;
/*
* Mark "remove" event done regardless of result, for some subsystems
* do not want to re-trigger "remove" event via automatic cleanup.
*/
if (action == KOBJ_REMOVE)
kobj->state_remove_uevent_sent = 1;
pr_debug("kobject: '%s' (%p): %s\n",
kobject_name(kobj), kobj, __func__);
/* search the kset we belong to */
top_kobj = kobj;
while (!top_kobj->kset && top_kobj->parent)
top_kobj = top_kobj->parent;
if (!top_kobj->kset) {
pr_debug("kobject: '%s' (%p): %s: attempted to send uevent "
"without kset!\n", kobject_name(kobj), kobj,
__func__);
return -EINVAL;
}
kset = top_kobj->kset;
uevent_ops = kset->uevent_ops;
/* skip the event, if uevent_suppress is set*/
if (kobj->uevent_suppress) {
pr_debug("kobject: '%s' (%p): %s: uevent_suppress "
"caused the event to drop!\n",
kobject_name(kobj), kobj, __func__);
return 0;
}
/* skip the event, if the filter returns zero. */
if (uevent_ops && uevent_ops->filter)
if (!uevent_ops->filter(kset, kobj)) {
pr_debug("kobject: '%s' (%p): %s: filter function "
"caused the event to drop!\n",
kobject_name(kobj), kobj, __func__);
return 0;
}
/* originating subsystem */
if (uevent_ops && uevent_ops->name)
subsystem = uevent_ops->name(kset, kobj);
else
subsystem = kobject_name(&kset->kobj);
if (!subsystem) {
pr_debug("kobject: '%s' (%p): %s: unset subsystem caused the "
"event to drop!\n", kobject_name(kobj), kobj,
__func__);
return 0;
}
/* environment buffer */
env = kzalloc(sizeof(struct kobj_uevent_env), GFP_KERNEL);
if (!env)
return -ENOMEM;
/* complete object path */
devpath = kobject_get_path(kobj, GFP_KERNEL);
if (!devpath) {
retval = -ENOENT;
goto exit;
}
/* default keys */
retval = add_uevent_var(env, "ACTION=%s", action_string);
if (retval)
goto exit;
retval = add_uevent_var(env, "DEVPATH=%s", devpath);
if (retval)
goto exit;
retval = add_uevent_var(env, "SUBSYSTEM=%s", subsystem);
if (retval)
goto exit;
/* keys passed in from the caller */
if (envp_ext) {
for (i = 0; envp_ext[i]; i++) {
retval = add_uevent_var(env, "%s", envp_ext[i]);
if (retval)
goto exit;
}
}
/* let the kset specific function add its stuff */
if (uevent_ops && uevent_ops->uevent) {
retval = uevent_ops->uevent(kset, kobj, env);
if (retval) {
pr_debug("kobject: '%s' (%p): %s: uevent() returned "
"%d\n", kobject_name(kobj), kobj,
__func__, retval);
goto exit;
}
}
switch (action) {
case KOBJ_ADD:
/*
* Mark "add" event so we can make sure we deliver "remove"
* event to userspace during automatic cleanup. If
* the object did send an "add" event, "remove" will
* automatically generated by the core, if not already done
* by the caller.
*/
kobj->state_add_uevent_sent = 1;
break;
case KOBJ_UNBIND:
zap_modalias_env(env);
break;
default:
break;
}
mutex_lock(&uevent_sock_mutex);
/* we will send an event, so request a new sequence number */
retval = add_uevent_var(env, "SEQNUM=%llu", ++uevent_seqnum);
if (retval) {
mutex_unlock(&uevent_sock_mutex);
goto exit;
}
// Broadcast the uevent
retval = kobject_uevent_net_broadcast(kobj, env, action_string,
devpath);
mutex_unlock(&uevent_sock_mutex);
#ifdef CONFIG_UEVENT_HELPER
/* call uevent_helper, usually only enabled during early boot */
if (uevent_helper[0] && !kobj_usermode_filter(kobj)) {
struct subprocess_info *info;
retval = add_uevent_var(env, "HOME=/");
if (retval)
goto exit;
retval = add_uevent_var(env,
"PATH=/sbin:/bin:/usr/sbin:/usr/bin");
if (retval)
goto exit;
retval = init_uevent_argv(env, subsystem);
if (retval)
goto exit;
retval = -ENOMEM;
info = call_usermodehelper_setup(env->argv[0], env->argv,
env->envp, GFP_KERNEL,
NULL, cleanup_uevent_env, env);
if (info) {
retval = call_usermodehelper_exec(info, UMH_NO_WAIT);
env = NULL; /* freed by cleanup_uevent_env */
}
}
#endif
exit:
kfree(devpath);
kfree(env);
return retval;
}
EXPORT_SYMBOL_GPL(kobject_uevent_env);
static int uevent_net_broadcast_tagged(struct sock *usk,
struct kobj_uevent_env *env,
const char *action_string,
const char *devpath)
{
struct user_namespace *owning_user_ns = sock_net(usk)->user_ns;
struct sk_buff *skb = NULL;
int ret = 0;
skb = alloc_uevent_skb(env, action_string, devpath);
if (!skb)
return -ENOMEM;
/* fix credentials */
if (owning_user_ns != &init_user_ns) {
struct netlink_skb_parms *parms = &NETLINK_CB(skb);
kuid_t root_uid;
kgid_t root_gid;
/* fix uid */
root_uid = make_kuid(owning_user_ns, 0);
if (uid_valid(root_uid))
parms->creds.uid = root_uid;
/* fix gid */
root_gid = make_kgid(owning_user_ns, 0);
if (gid_valid(root_gid))
parms->creds.gid = root_gid;
}
// The uevent is finally broadcast over netlink; user-space clients can receive it
ret = netlink_broadcast(usk, skb, 0, 1, GFP_KERNEL);
/* ENOBUFS should be handled in userspace */
if (ret == -ENOBUFS || ret == -ESRCH)
ret = 0;
return ret;
}
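Summary:
Uevent uses sockets and Netlink for communication between the kernel and user space:
1) The kernel provides the uevent service by creating and managing a NETLINK_KOBJECT_UEVENT netlink socket for each network namespace (tracked in a uevent_sock structure).
2) In user space, applications (such as the siengine thermal hal service) use the socket and Netlink APIs to communicate with the kernel and receive uevent messages. Typically a raw Netlink socket is used.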
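On the receiving side, a uevent message read from the netlink socket is a sequence of NUL-terminated strings: an "ACTION@DEVPATH" header followed by KEY=VALUE records such as the NAME, TEMP, TRIP and EVENT properties built in notify_user_space. The sketch below shows one way to split such a buffer in C++. parseUevent is a hypothetical helper written for this article, not the Pixel ThermalWatcher implementation, and it assumes the buffer was already filled by a recv() call on a NETLINK_KOBJECT_UEVENT socket.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Splits a raw uevent payload into KEY=VALUE pairs.
// The payload is a sequence of NUL-terminated strings. The first one is the
// "ACTION@DEVPATH" header; it contains no '=' and is therefore skipped.
std::map<std::string, std::string> parseUevent(const char *buf, std::size_t len) {
    std::map<std::string, std::string> fields;
    std::size_t pos = 0;
    while (pos < len) {
        // Each record ends at the next NUL byte (or at the end of the buffer).
        std::size_t end = pos;
        while (end < len && buf[end] != '\0') ++end;
        std::string record(buf + pos, end - pos);
        std::size_t eq = record.find('=');
        if (eq != std::string::npos) {
            fields[record.substr(0, eq)] = record.substr(eq + 1);
        }
        pos = end + 1;  // step over the NUL separator
    }
    return fields;
}
```

A thermal watcher built on this helper would typically first check that SUBSYSTEM equals "thermal" and then use NAME to decide which sensor to re-read. That filtering step is an assumption about how a client might use the fields, not something the kernel requires.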
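5. Core API Overview
Interface | Purpose |
---|---|
getCurrentTemperatures(bool filterType, TemperatureType_2_0 type, getCurrentTemperatures_cb _hidl_cb) | Retrieves the temperature entries for the device, including the temperature value and the throttling severity |
registerThermalChangedCallback(const sp<IThermalChangedCallback> &callback, bool filterType, TemperatureType_2_0 type, registerThermalChangedCallback_cb _hidl_cb) | Registers a listener from the upper layer; the callback is invoked when the device's thermal status changes |
sendThermalChangedCallback(const std::vector<Temperature_2_0> &temps) | Called by the thermal service to invoke the registered callbacks and notify the upper layer |
getCurrentCoolingDevices(bool filterType, CoolingType type, getCurrentCoolingDevices_cb _hidl_cb) | Retrieves cooling device information, including device type, name, and current state value (e.g. CPU frequency) |
registerFilesToWatch(const std::set<std::string> &sensors_to_watch, bool uevent_monitor) | Stores the sensor names, opens the uevent socket in non-blocking mode, and adds the uevent fd to the Looper |
thermalWatcherCallbackFunc | Callback that the Thermal helper registers when the thermal watcher object is created. It is triggered by uevents from the thermal core driver and notifies Thermal HAL 2.0 to update sensor_status |
notifyThrottling(Temperature temperature) | Client-side callback interface |
notify_user_space(struct thermal_zone_device *tz, int trip) | When the temperature of a device in a thermal zone changes, the thermal driver core calls this user space governor hook to notify user-space processes |
kobject_uevent_net_broadcast(struct kobject *kobj, struct kobj_uevent_env *env, const char *action_string, const char *devpath) | Packs the uevent into a message and broadcasts it through netlink_broadcast; user-space clients can listen for the uevent |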
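6. Temperature Value Optimization
The temperature reported by a sensor may deviate from the device's actual temperature, so a sensor multiplier (an error factor) is used to correct it. However, the error does not follow a linear trend. For example, if the actual temperature is 60 °C and the sensor reports 50 °C, the multiplier can in theory be set to 1.2, i.e. out->value = temp * sensor_info.multiplier. But when the actual temperature is 100 °C the sensor may well report 70 °C, in which case out->value = temp * sensor_info.multiplier yields only 70 * 1.2 = 84 °C, which does not match the actual temperature.
Therefore, one option is to enumerate all the factors that affect the temperature sensor and fit them together, linearly or non-linearly, to improve accuracy.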
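As a minimal sketch of the non-linear idea, a piecewise-linear calibration table can replace the single multiplier. The code below uses the two sample pairs from the example above (sensor 50 to actual 60, sensor 70 to actual 100). CalibrationPoint, correctWithMultiplier and correctPiecewise are hypothetical names, the table is assumed to be sorted by strictly increasing sensor value, and a real device would need measured calibration data covering all relevant factors.

```cpp
#include <cstddef>
#include <vector>

// One calibration sample: what the sensor reported vs. the measured value.
struct CalibrationPoint {
    float sensor;
    float actual;
};

// Single-multiplier correction, as in out->value = temp * multiplier.
float correctWithMultiplier(float sensorTemp, float multiplier) {
    return sensorTemp * multiplier;
}

// Piecewise-linear correction over calibration points sorted by `sensor`
// (strictly increasing). Readings outside the table are extrapolated
// from the nearest segment.
float correctPiecewise(float sensorTemp, const std::vector<CalibrationPoint> &table) {
    if (table.empty()) return sensorTemp;  // no data: leave the reading as is
    if (table.size() == 1) {
        // A single point only supports a constant offset.
        return sensorTemp + (table[0].actual - table[0].sensor);
    }
    // Find the segment [hi - 1, hi] that contains sensorTemp.
    std::size_t hi = 1;
    while (hi + 1 < table.size() && sensorTemp > table[hi].sensor) ++hi;
    const CalibrationPoint &a = table[hi - 1];
    const CalibrationPoint &b = table[hi];
    float slope = (b.actual - a.actual) / (b.sensor - a.sensor);
    return a.actual + slope * (sensorTemp - a.sensor);
}
```

With the table {{50, 60}, {70, 100}}, correctPiecewise maps a sensor reading of 70 to 100, whereas correctWithMultiplier(70, 1.2f) gives roughly 84, which is the mismatch described above.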
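7. Thermal Control Strategy
The source code of the related thermal control solution cannot be published for other reasons, so only the rough idea of such a scheme is given here:
1) Read the CPU and GPU temperatures in kernel space.
2) Apply thermal policies tied to core hardware directly in kernel space, such as lowering CPU, GPU and DDR frequencies, turning off the display, turning off the GPU, or shutting down.
3) Control other cooling devices, for example the fan speed.
4) Report policies related to software and basic hardware configuration from kernel space to user space for implementation, such as killing background processes, killing low-priority programs, lowering screen brightness, lowering the screen refresh rate, and restricting Camera, Media and Modem operations.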
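The user-space part of step 4 can be pictured as a table that maps a throttling severity to a set of actions. The sketch below is purely illustrative: the severity names follow ThrottlingSeverity from android.hardware.thermal@2.0, but userSpaceActions and the level at which each action starts are assumptions made for this example, not the undisclosed policy mentioned above.

```cpp
#include <string>
#include <vector>

// Severity levels as named in android.hardware.thermal@2.0 ThrottlingSeverity.
enum class ThrottlingSeverity { NONE, LIGHT, MODERATE, SEVERE, CRITICAL, EMERGENCY, SHUTDOWN };

// Hypothetical user-space policy: returns the mitigation actions to apply at
// a given severity. Each level keeps the actions of the levels below it.
// The thresholds chosen here are assumptions for illustration only.
std::vector<std::string> userSpaceActions(ThrottlingSeverity severity) {
    std::vector<std::string> actions;
    if (severity >= ThrottlingSeverity::LIGHT) {
        actions.push_back("lower screen brightness");
    }
    if (severity >= ThrottlingSeverity::MODERATE) {
        actions.push_back("lower screen refresh rate");
    }
    if (severity >= ThrottlingSeverity::SEVERE) {
        actions.push_back("kill background and low-priority processes");
    }
    if (severity >= ThrottlingSeverity::CRITICAL) {
        actions.push_back("restrict Camera, Media and Modem operations");
    }
    return actions;
}
```

Keeping the policy as a pure function of the severity makes it easy to call from a thermal status listener and to test without hardware.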