Datawhale X ModelScope AI Summer Camp, Session 5, Task 1: Running the Mobile-Agent Demo

Installing Android Studio

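At the SDK Components step, the installer offered two options that do not appear in the tutorial's screenshot: Intel HAXM and an optimization tool. The Intel HAXM installation failed. It can be installed separately later.
Intel® HAXM component installation failed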

Running Intel® HAXM installer
Intel HAXM installation failed!
For more details, please check the installation log: C:\Users\Jin\AppData\Local\Temp\haxm_install-20240828_1054.log
Intel® HAXM installation failed. To install Intel® HAXM follow the instructions found at: https://github.com/intel/haxm/wiki/Installation-Instructions-on-Windows
Running Android Emulator hypervisor driver installer
[SC] StartService failed due to error 4294967201.

Creating a Virtual Phone

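Following the tutorial, I created the virtual phone without any trouble. However, only the left half of the phone screen was displayed, and the screen could not be moved.
Phone screen not fully displayed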

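  • Phone screen cannot be moved: delete the device and create a new virtual device.
  • Phone screen not fully displayed: adjust the display scaling. My scaling was originally 250%; at 225% the whole screen fit.
    Display scaling setting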

Running the Demo

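The agent executes an instruction in the order planning → decision → reflection, and it queries and updates its memory while working through multiple instructions. The log at the end of this article shows the complete process. On one run of run.py, only the command line kept scrolling while nothing happened on the phone screen; after I restarted the phone and ran it again, the problem went away.
Agent instruction execution process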
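The planning, decision, and reflection stages described above, with memory updated along the way, can be pictured as a simple loop. The sketch below is only a toy illustration of that control flow and is not Mobile-Agent's actual code: `AgentState`, `run_agent`, and the three callback parameters are hypothetical names I made up, and the stub functions in `demo` stand in for the real model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class AgentState:
    """Hypothetical container for what the agent carries between steps."""
    instruction: str
    plan: str = ""
    memory: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)


def run_agent(
    state: AgentState,
    plan_fn: Callable[[AgentState], str],
    decide_fn: Callable[[AgentState], str],
    reflect_fn: Callable[[AgentState, str], Tuple[bool, str]],
    max_steps: int = 5,
) -> AgentState:
    """Run planning -> decision -> reflection until "Stop" or max_steps."""
    for _ in range(max_steps):
        # Planning: summarize progress and what to do next.
        state.plan = plan_fn(state)
        # Decision: choose one concrete action for this step.
        action = decide_fn(state)
        if action == "Stop":
            state.history.append(action)
            break
        # Reflection: judge whether the action worked; keep a note in memory.
        ok, note = reflect_fn(state, action)
        if note:
            state.memory.append(note)
        state.history.append(action if ok else f"retry: {action}")
    return state


def demo() -> AgentState:
    """Drive the loop with stub callbacks instead of real model calls."""
    steps = iter(["Open app (Play Store)", "Stop"])
    state = AgentState(instruction="Open Play Store")
    return run_agent(
        state,
        plan_fn=lambda s: f"step {len(s.history) + 1} toward: {s.instruction}",
        decide_fn=lambda s: next(steps),
        reflect_fn=lambda s, a: (True, f"done: {a}"),
    )
```

Calling `demo()` returns a state whose `history` lists the actions taken and whose `memory` holds the notes written during reflection. In the real demo each callback would be a model call that looks at a fresh screenshot.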
Screenshot of executing the instruction "Read the Screen, tell me what day it is today. Then open Play Store.":
output screenshot

The complete log of this run follows:

(moblieagent) D:\MobileAgent\MobileAgent_V2_Demo_qwenVL>python run.py
2024-08-28 13:34:00,107 - modelscope - INFO - PyTorch version 2.4.0 Found.
2024-08-28 13:34:00,113 - modelscope - INFO - TensorFlow version 2.9.1 Found.
2024-08-28 13:34:00,114 - modelscope - INFO - Loading ast index from C:\Users\Jin\.cache\modelscope\ast_indexer
2024-08-28 13:34:00,384 - modelscope - INFO - Loading done! Current index file version is 1.15.0, with md5 6b851b62bcd334f665326ea126fb4d92 and a total number of 980 components indexed
2024-08-28 13:34:02.300045: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2024-08-28 13:34:02.306971: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2024-08-28 13:34:07,182 - modelscope - INFO - Use user-specified model revision: v1.0.0
C:\Users\Jin\.cache\modelscope\modelscope_modules\GroundingDINO\groundingdino\models\GroundingDINO\ms_deform_attn.py:33: UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
  warnings.warn('Failed to load custom C++ ops. Running on CPU mode Only!')
UserWarning: Failed to load custom C++ ops. Running on CPU mode Only!
2024-08-28 13:34:09,409 - modelscope - WARNING - ('PIPELINES', 'grounding-dino-task', 'Groundingdino-generation-pipe') not found in ast index file
UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:3610.)
final text_encoder_type: C:\Users\Jin\.cache\modelscope\hub\AI-ModelScope\GroundingDINO/bert-base-uncased
FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
2024-08-28 13:34:11,052 - modelscope - WARNING - No preprocessor field found in cfg.
2024-08-28 13:34:11,053 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-08-28 13:34:11,054 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\\Users\\Jin\\.cache\\modelscope\\hub\\AI-ModelScope\\GroundingDINO'}. trying to build by task and model information.
2024-08-28 13:34:11,054 - modelscope - WARNING - No preprocessor key ('Groundingdino', 'grounding-dino-task') found in PREPROCESSOR_MAP, skip building preprocessor.
2024-08-28 13:34:13,161 - modelscope - WARNING - Model revision not specified, use revision: v1.0.0
2024-08-28 13:34:13,998 - modelscope - INFO - initiate model from C:\Users\Jin\.cache\modelscope\hub\damo\cv_resnet18_ocr-detection-line-level_damo
2024-08-28 13:34:13,999 - modelscope - INFO - initiate model from location C:\Users\Jin\.cache\modelscope\hub\damo\cv_resnet18_ocr-detection-line-level_damo.
2024-08-28 13:34:14,011 - modelscope - WARNING - No preprocessor field found in cfg.
2024-08-28 13:34:14,011 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-08-28 13:34:14,012 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\\Users\\Jin\\.cache\\modelscope\\hub\\damo\\cv_resnet18_ocr-detection-line-level_damo'}. trying to build by task and model information.
2024-08-28 13:34:14,012 - modelscope - WARNING - Find task: ocr-detection, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2024-08-28 13:34:14,016 - modelscope - INFO - loading model from dir C:\Users\Jin\.cache\modelscope\hub\damo\cv_resnet18_ocr-detection-line-level_damo
2024-08-28 13:34:14.020400: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-28 13:34:14.095212: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2024-08-28 13:34:14.108969: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2024-08-28 13:34:14.121922: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2024-08-28 13:34:14.135102: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2024-08-28 13:34:14.147524: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2024-08-28 13:34:14.160301: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
2024-08-28 13:34:14.173582: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2024-08-28 13:34:14.186835: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2024-08-28 13:34:14.193233: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
UserWarning: `layer.apply` is deprecated and will be removed in a future version. Please use `layer.__call__` method instead.
WARNING:tensorflow:From C:\ProgramData\anaconda3\envs\moblieagent\lib\site-packages\modelscope\pipelines\cv\ocr_utils\ops.py:744: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version.Instructions for updating:
tf.py_func is deprecated in TF V2. Instead, there are two
    options available in V2.
    - tf.py_function takes a python function which manipulates tf eager
    tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
    an ndarray (just call tensor.numpy()) but having access to eager tensors
    means `tf.py_function`s can use accelerators such as GPUs as well as
    being differentiable using a gradient tape.
    - tf.numpy_function maintains the semantics of the deprecated tf.py_func
    (it is not differentiable, and manipulates numpy arrays). It drops the
    stateful argument making all functions stateful.

2024-08-28 13:34:17,281 - modelscope - INFO - loading model from C:\Users\Jin\.cache\modelscope\hub\damo\cv_resnet18_ocr-detection-line-level_damo\tf_ckpts\checkpoint-80000
2024-08-28 13:34:17.664734: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
2024-08-28 13:34:20,816 - modelscope - WARNING - Model revision not specified, use revision: v2.4.0
2024-08-28 13:34:21,332 - modelscope - INFO - initiate model from C:\Users\Jin\.cache\modelscope\hub\damo\cv_convnextTiny_ocr-recognition-document_damo
2024-08-28 13:34:21,332 - modelscope - INFO - initiate model from location C:\Users\Jin\.cache\modelscope\hub\damo\cv_convnextTiny_ocr-recognition-document_damo.
2024-08-28 13:34:21,337 - modelscope - INFO - initialize model from C:\Users\Jin\.cache\modelscope\hub\damo\cv_convnextTiny_ocr-recognition-document_damo
FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
2024-08-28 13:34:21,677 - modelscope - INFO - loading model from dir C:\Users\Jin\.cache\modelscope\hub\damo\cv_convnextTiny_ocr-recognition-document_damo
2024-08-28 13:34:21,682 - modelscope - INFO - loading model done
2024-08-28 13:34:26.351851: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2024-08-28 13:34:28,978 - modelscope - WARNING - task grounding-dino-task input definition is missing
FutureWarning: The `device` argument is deprecated and will be removed in v5 of Transformers.
UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants.
UserWarning: None of the inputs have requires_grad=True. Gradients will be None
FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
SupervisionWarnings: annotate is deprecated: `BoxAnnotator` is deprecated and will be removed in `supervision-0.22.0`. Use `BoundingBoxAnnotator` and `LabelAnnotator` instead
2024-08-28 13:34:49,715 - modelscope - WARNING - task grounding-dino-task output keys are missing
################################################## Decision ##################################################
Thought: I should now find the 'Play Store' app using the search bar since there isn't any specific button for opening it directly.
Action: Open app (Google Play)
Operation: To access the Play Store, I will navigate to the search bar located near the bottom of the screen and enter the term "Play Store." Once found, I'll click on the relevant result to launch the app.
##############################################################################################################      
################################################## Memory ##################################################
Thought: There is already a play store icon present on the homescreen so I don't need to look up anything else.     
Action: Home
Operation: Navigate back to the home screen without taking any additional steps.
############################################################################################################        
SupervisionWarnings: annotate is deprecated: `BoxAnnotator` is deprecated and will be removed in `supervision-0.22.0`. Use `BoundingBoxAnnotator` and `LabelAnnotator` instead
################################################## Reflcetion ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Play Store)
Operation: Click on the Play Store app icon to open it.
################################################################################################################    
################################################## Planning ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to open it.
##############################################################################################################      
################################################## Decision ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to open it.
##############################################################################################################      
################################################## Memory ##################################################
### Thought ###
Based on the user's request, they wish to open the Play Store app.
### Action ###
Opening app (Google Play)
### Operation ###
Click on the Google Play app icon to initiate launching the app.
############################################################################################################        
SupervisionWarnings: annotate is deprecated: `BoxAnnotator` is deprecated and will be removed in `supervision-0.22.0`. Use `BoundingBoxAnnotator` and `LabelAnnotator` instead
################################################## Reflcetion ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to open it.
################################################################################################################    
################################################## Planning ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to open it.
Important content: None.
##############################################################################################################      
################################################## Decision ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to open it.
##############################################################################################################      
################################################## Memory ##################################################
### Important content ###
None

### Summary ###
To fulfill the user's request, we opened the Play Store app by clicking on its app icon. No important content was identified on the current page.
############################################################################################################        
SupervisionWarnings: annotate is deprecated: `BoxAnnotator` is deprecated and will be removed in `supervision-0.22.0`. Use `BoundingBoxAnnotator` and `LabelAnnotator` instead
################################################## Reflcetion ##################################################
Thought: User wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to open it.
################################################################################################################    
################################################## Planning ##################################################
Thought: User wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to open it.
##############################################################################################################      
################################################## Decision ##################################################
Thought: The user asked to open the Play Store app, but there was already a Play Store app shortcut on their home screen. So, I clicked on the existing app shortcut instead of searching for it again.
Action: Open app (Google Play)
Operation: Clicked on the Google Play app icon to open it.
##############################################################################################################      
################################################## Memory ##################################################
### Important content ###
None

Since the home screen already had a Play Store app shortcut available, there wasn't any content specifically related to opening the Play Store on this particular page. Therefore, I didn't see any important content associated with the user's request.
############################################################################################################        
SupervisionWarnings: annotate is deprecated: `BoxAnnotator` is deprecated and will be removed in `supervision-0.22.0`. Use `BoundingBoxAnnotator` and `LabelAnnotator` instead
################################################## Reflcetion ##################################################
Thought: The user requested to open the Play Store app.
Action: Open app (Google Play)
Operation: Tapped on the Google Play app icon to launch it.
################################################################################################################    
################################################## Planning ##################################################
Thought: The user requested to open the Play Store app.
Action: Open app (Google Play)
Operation: Tapped on the Google Play app icon to launch it.
##############################################################################################################      
################################################## Decision ##################################################
Thought: The user instructed to open the Play Store app, which is currently running.
Action: None - No new app needs to be opened.
Operation: Continue interacting with the active Play Store app.
##############################################################################################################      
################################################## Memory ##################################################
### Important content ###
Calendar
### Thought ###
User wants to view events on his/her calendar.
### Action ###
Tap ('28', 1186)
### Operation ###
Navigate to the event list section of the Google Calendar app.
### Response requirements ###
App opens successfully.
### Output format ###
Thought: User wants to see upcoming events on their calendar.
Action: Tap ('28', 1186)
Operation: View event details
Response: Successfully launched Google Calendar app.
############################################################################################################
SupervisionWarnings: annotate is deprecated: `BoxAnnotator` is deprecated and will be removed in `supervision-0.22.0`. Use `BoundingBoxAnnotator` and `LabelAnnotator` instead
################################################## Reflcetion ##################################################
Thought: The user instructed to open the Play Store app, which is currently running.
Action: None - No new app needs to be opened.
Operation: Continue interacting with the active Play Store app.

Thought: User wants to view events on his/her calendar.
Action: Tap ('28', 1186)
Operation: View event details
Response: Successfully launched Google Calendar app.
################################################################################################################    
################################################## Planning ##################################################
Thought: The user instructed to open the Play Store app, which is currently running.
Action: None - No new app needs to be opened.
Operation: Continue interacting with the active Play Store app.

Thought: User wants to view events on his/her calendar.
Action: Tap ('28', 1186)
Operation: View event details
Response: Successfully launched Google Calendar app.
##############################################################################################################      
################################################## Decision ##################################################
Thought: The user requested to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to launch the app.
Response: Successfully launched Google Play app.
##############################################################################################################      
################################################## Memory ##################################################
Important content: None

Thought: The user did not specify any particular task regarding the calendar app.
Action: Stop
Operation: Terminate the operation process after confirming that the user does not require any further assistance.  
Response: Task completed.
############################################################################################################        
SupervisionWarnings: annotate is deprecated: `BoxAnnotator` is deprecated and will be removed in `supervision-0.22.0`. Use `BoundingBoxAnnotator` and `LabelAnnotator` instead
################################################## Reflcetion ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to launch the app.
Response: Successfully launched Google Play app.
Thought: The user wanted to check their calendar.
Action: Tap ('28', 1186)
Operation: Go to the event list in the Google Calendar app.
Response: Successfully launched Google Calendar app.
Thought: The user doesn't mention any specific tasks involving the calendar app.
Action: Stop
Operation: End the operation once confirmed that the user requires nothing more.
Response: Complete.
################################################################################################################    
################################################## Planning ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to launch the app.
Response: Successfully launched Google Play app.
Thought: The user wanted to check their calendar.
Action: Tap ('28', 1186)
Operation: Go to the event list in the Google Calendar app.
Response: Successfully launched Google Calendar app.
Thought: The user doesn't mention any specific tasks involving the calendar app.
Action: Stop
Operation: End the operation once confirmed that the user requires nothing more.
Response: Task accomplished.
##############################################################################################################      
################################################## Decision ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to launch the app.
Response: Successfully launched Google Play app.
##############################################################################################################      
################################################## Memory ##################################################
### Important content ###
Successfully launched Google Play app.

### Memory ###
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to launch the app.
Response: Successfully launched Google Play app.
############################################################################################################        
SupervisionWarnings: annotate is deprecated: `BoxAnnotator` is deprecated and will be removed in `supervision-0.22.0`. Use `BoundingBoxAnnotator` and `LabelAnnotator` instead
################################################## Reflcetion ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to launch the app.
Response: Successfully launched Google Play app.
Memory:
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to launch the app.
Response: Successfully launched Google Play app.
Progress: Completed the task of opening the Play Store app according to the user's request.
Thought: The user does not specify any specific tasks regarding the calendar app.
Action: Stop
Operation: Complete the operation when confirming that the user does not require any further assistance.
Response: Task accomplished.
################################################################################################################    
################################################## Planning ##################################################
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to launch the app.
Response: Successfully launched Google Play app.
Memory:
Thought: The user wants to open the Play Store app.
Action: Open app (Google Play)
Operation: Click on the Google Play app icon to launch the app.
Response: Successfully launched Google Play app.
Progress: Completed the task of opening the Play Store app according to the user's request.
Thought: The user does not specify any specific tasks regarding the calendar app.
Action: Stop
Operation: Complete the operation when confirming that the user does not require any further assistance.
Response: Task accomplished.
##############################################################################################################      
################################################## Decision ##################################################
Thought: The user wants to check their calendar for upcoming events.
Action: Tap ('28', 1186)
Operation: Navigate to the event list section of the Google Calendar app.
Response: Successfully launched Google Calendar app.
##############################################################################################################      
################################################## Memory ##################################################
Important content: Successfully launched Google Calendar app.
Memory:
Thought: There is already a play store icon present on the homescreen so I don't need to look up anything else.     
Action: Home
Operation: Navigate back to the home screen without taking any additional steps.
Thought: The user wants to see upcoming events on their calendar.
Action: Tap ('28', 1186)
Operation: View event details
Response: Successfully launched Google Calendar app.
############################################################################################################        
Traceback (most recent call last):
  File "D:\MobileAgent\MobileAgent_V2_Demo_qwenVL\run.py", line 390, in <module>
    x, y = int(coordinate[0]), int(coordinate[1])
ValueError: invalid literal for int() with base 10: "'28'"
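
The traceback above comes from calling `int()` on the string `'28'` (quotes included) after the model produced the action `Tap ('28', 1186)`. My guess is that the quoted `'28'` is text read off the screen rather than an x coordinate, so stripping the quotes would probably tap the wrong place. A more cautious option is to validate the action before converting it. The sketch below is my own assumption about how that could look; `parse_tap` is a hypothetical helper, not something from run.py.

```python
import re
from typing import Optional, Tuple

# Accept only "Tap (x, y)" where both values are plain integers.
_TAP_RE = re.compile(r"Tap\s*\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)")


def parse_tap(action: str) -> Optional[Tuple[int, int]]:
    """Return (x, y) for a well-formed Tap action, or None otherwise."""
    match = _TAP_RE.search(action)
    if match is None:
        # Malformed output such as "Tap ('28', 1186)" ends up here
        # instead of raising ValueError.
        return None
    return int(match.group(1)), int(match.group(2))
```

With this, `parse_tap("Tap (540, 1186)")` gives `(540, 1186)`, while `parse_tap("Tap ('28', 1186)")` gives `None`, so the caller can skip the step or ask the model again instead of crashing.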