Project Introduction
This project is a tutorial on efficiently deploying a large language model on CPU, using Tongyi Qianwen Qwen-7B Chat on the ModelWhale platform. It uses the Intel Extension for Transformers toolkit to set up the environment quickly, greatly improving online deployment efficiency and delivering fast model inference.
Compute Resources and Environment
Compute resource: Tencent Cloud (Nanjing), 16-core CPU, 64 GB RAM
Environment: Python 3.11.8 data-science image
Note: pulling compute resources from the cloud provider takes 5–10 minutes, and a deposit of half an hour's resource price (in whale coins) is pre-charged. If the resource fails to start, the pre-charged fee is refunded within five minutes of closing the coding page, so there is no need to worry.
Wait, what is Intel Extension for Transformers?
Intel® Extension for Transformers is a toolkit from Intel that significantly accelerates Transformer-based large language models (LLMs) on Intel® architecture platforms, especially 4th Gen Intel® Xeon® Scalable processors (codename Sapphire Rapids, SPR).
It helps developers and researchers run and optimize Transformer-based LLMs more efficiently on Intel hardware.
Much like fitting a turbocharger to a car, the toolkit acts as a "turbocharger" for large language models, making them run faster and more efficiently on Intel CPUs and GPUs.
What are the main features of Intel Extension for Transformers?
Its main features include:
(1) A seamless model-compression experience, achieved by extending the Hugging Face transformers API and leveraging Intel® Neural Compressor;
(2) An LLM inference runtime with low-bit quantization kernels (NeurIPS 2023: Efficient LLM Inference on CPUs), supporting common LLMs such as Falcon, LLaMA, MPT, Llama2, BLOOM, OPT, ChatGLM2, GPT-J-6B, Baichuan-13B-Base, Baichuan2-13B-Base, Qwen-7B, Qwen-14B, and Dolly-v2-3B;
(3) Advanced compression-aware runtimes (NeurIPS 2022: Fast Distillation on CPUs and QuaLA-MiniLM: Quantized Length-Adaptive MiniLM; NeurIPS 2021: Prune Once for All, on sparse/pruned pre-trained language models).
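The low-bit kernels behind feature (2) rest on weight-only quantization: store each weight as a small integer plus a shared scale, and dequantize on the fly during inference. The following is a minimal pure-Python sketch of symmetric round-to-nearest INT4 quantization to illustrate the idea; it is not the ITREX implementation, which uses per-group scales and packed tensor storage.

```python
# Illustrative sketch: symmetric round-to-nearest INT4 quantization of a
# weight vector, the basic idea behind weight-only low-bit LLM kernels.

def quantize_int4(weights):
    """Map floats to integers in [-8, 7] using a single shared scale."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # avoid a zero scale
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the INT4 codes."""
    return [v * scale for v in q]

w = [0.12, -0.53, 0.88, -0.07, 0.31]
q, s = quantize_int4(w)
w_hat = dequantize(q, s)
# Round-to-nearest bounds the reconstruction error by half a scale step.
err = max(abs(a - b) for a, b in zip(w, w_hat))
```

Storing 4-bit codes plus a scale cuts weight memory roughly 4× versus FP16, which is what lets a 7B-class model run comfortably in 64 GB of CPU RAM.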
Without further ado, let's move on to the hands-on part and see how just two lines of code can deliver efficient LLM inference:
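As a preview of where the setup below is heading, ITREX exposes a drop-in replacement for the Hugging Face loader, where low-bit quantization is a single flag. This is a hedged sketch only: it is not executed here, since it needs the intel-extension-for-transformers package and a multi-gigabyte weight download, and the model identifier and flags shown are the commonly documented ones rather than code from this tutorial.

```python
# Sketch of the ITREX loading path (imports kept inside the function so this
# cell can be read and run without the package or the model weights present).
def load_quantized(model_name="Qwen/Qwen-7B-Chat"):
    """Load a causal LM with INT4 weight-only quantization for CPU inference."""
    from intel_extension_for_transformers.transformers import AutoModelForCausalLM
    return AutoModelForCausalLM.from_pretrained(
        model_name,
        load_in_4bit=True,       # weight-only INT4 quantization
        trust_remote_code=True,  # Qwen ships custom modeling code
    )

# Intended usage (downloads several GB of weights, so not run in this cell):
#   model = load_quantized()
#   ...followed by the usual tokenizer + model.generate(...) calls
```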
Setting Up the Environment
In [1]:
# Install the system packages that ITREX needs in a CPU environment
!sudo apt-get update
!sudo apt-get install -y ffmpeg
!sudo apt-get install -y libgl1-mesa-glx libgl1-mesa-dev
!sudo apt-get install -y libsm6 libxext6

(Output trimmed: apt fetched the Ubuntu jammy package lists; ffmpeg, libsm6, and libxext6 were already the newest versions, and 22 OpenGL/X11 development packages were downloaded and set up successfully.)
In [2]:
# Install the required third-party libraries: environment dependencies plus optional packages for accelerated computation
!pip install torch==2.3.0+cpu -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install cmake -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install ninja -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install neural_speed -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install intel-extension-for-transformers -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install modelscope -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install transformers -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install pyOpenSSL --upgrade -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install sentencepiece --upgrade -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install xformers -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install accelerate -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install tiktoken -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install transformers_stream_generator -i https://mirrors.cloud.tencent.com/pypi/simple

(Output trimmed: every requirement except transformers_stream_generator was already satisfied — including torch 2.3.0+cpu, intel-extension-for-transformers 1.4.2, neural_speed 1.0, transformers 4.41.1, and modelscope 1.14.0; transformers-stream-generator 0.0.5 was downloaded and built from source.)
done Requirement already satisfied: transformers>=4.26.1 in /opt/conda/lib/python3.11/site-packages (from transformers_stream_generator) (4.41.1) Requirement already satisfied: filelock in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (3.14.0) Requirement already satisfied: huggingface-hub<1.0,>=0.23.0 in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (0.23.2) Requirement already satisfied: numpy>=1.17 in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (1.26.4) Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (24.0) Requirement already satisfied: pyyaml>=5.1 in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (6.0.1) Requirement already satisfied: regex!=2019.12.17 in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (2024.5.15) Requirement already satisfied: requests in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (2.31.0) Requirement already satisfied: tokenizers<0.20,>=0.19 in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (0.19.1) Requirement already satisfied: safetensors>=0.4.1 in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (0.4.3) Requirement already satisfied: tqdm>=4.27 in /opt/conda/lib/python3.11/site-packages (from transformers>=4.26.1->transformers_stream_generator) (4.66.2) Requirement already satisfied: fsspec>=2023.5.0 in /opt/conda/lib/python3.11/site-packages (from huggingface-hub<1.0,>=0.23.0->transformers>=4.26.1->transformers_stream_generator) (2024.2.0) Requirement already satisfied: typing-extensions>=3.7.4.3 in 
/opt/conda/lib/python3.11/site-packages (from huggingface-hub<1.0,>=0.23.0->transformers>=4.26.1->transformers_stream_generator) (4.10.0) Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/lib/python3.11/site-packages (from requests->transformers>=4.26.1->transformers_stream_generator) (3.3.2) Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.11/site-packages (from requests->transformers>=4.26.1->transformers_stream_generator) (3.6) Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/lib/python3.11/site-packages (from requests->transformers>=4.26.1->transformers_stream_generator) (2.2.1) Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.11/site-packages (from requests->transformers>=4.26.1->transformers_stream_generator) (2024.2.2) Building wheels for collected packages: transformers_stream_generator Building wheel for transformers_stream_generator (setup.py) ... done Created wheel for transformers_stream_generator: filename=transformers_stream_generator-0.0.5-py3-none-any.whl size=12425 sha256=a32bee62b0602b8b226c3b18e93f1d1568900c4135c1ed2ac80837f363bd9402 Stored in directory: /home/mw/.cache/pip/wheels/c0/9f/f6/f8573ca658852aa7cdde5a0e2717f767ac9b2dd19a7d2897b9 Successfully built transformers_stream_generator Installing collected packages: transformers_stream_generator Successfully installed transformers_stream_generator-0.0.5In [3]:
# Convert and store the CPU version of the model under the temp directory
%cd ..
%cd temp

/home/mw
/home/mw/temp

Automated Model Deployment & Inference
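The next cell loads the model with `load_in_4bit=True`, and the runtime log reports weight-only quantization with the RTN (round-to-nearest) algorithm. As a rough, library-independent illustration of the RTN idea — each group of float weights is scaled onto a small signed-integer grid and rounded to the nearest point — consider this toy sketch (the `rtn_quantize` helper is hypothetical, not the Neural Speed kernel):

```python
def rtn_quantize(weights, bits=4):
    """Toy round-to-nearest (RTN) weight-only quantization sketch.

    Scales the group so its largest magnitude maps onto the signed
    `bits`-bit integer range, then rounds each weight to the nearest
    grid point. Illustrative only -- not the Neural Speed implementation.
    """
    qmax = 2 ** (bits - 1) - 1          # 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / qmax
    quantized = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale

# A small sample group of float weights
group = [-0.016845703125, -0.00958251953125, 0.0081787109375, 0.0162353515625]
q, scale = rtn_quantize(group)
dequantized = [qi * scale for qi in q]  # the approximation used at inference
```

The quantization error per weight is bounded by half the grid step (`scale / 2`), which is why 4-bit weight-only schemes preserve accuracy well while cutting memory traffic — the main bottleneck for LLM inference on CPUs.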
In [4]:
# Load the converted model with itrex and run inference automatically
from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

prompt = "Once upon a time, there existed a little girl,"
model_dir = '/home/mw/input/qwen7bchat3536'

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
streamer = TextStreamer(tokenizer)

model = AutoModelForCausalLM.from_pretrained(model_dir, load_in_4bit=True)
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)

2024-05-29 15:13:06 [INFO] cpu device is used.
2024-05-29 15:13:06 [INFO] Applying Weight Only Quantization.
2024-05-29 15:13:06 [INFO] Quantize model by Neural Speed with RTN Algorithm.
cmd: ['python', PosixPath('/opt/conda/lib/python3.11/site-packages/neural_speed/convert/convert_qwen.py'), '--outfile', 'runtime_outs/ne_qwen_f32.bin', '--outtype', 'f32', '--model_hub', 'huggingface', '/home/mw/input/qwen7bchat3536']
Warning: please make sure that you are using the latest codes and checkpoints, especially if you used Qwen-7B before 09.25.2023.
Loading model: /home/mw/input/qwen7bchat3536
Loading checkpoint shards: 100%|██████████| 8/8 [01:34<00:00, 11.81s/it]
Model loaded: /home/mw/input/qwen7bchat3536
(model config dump truncated: QWenLMHeadModel, vocab_size 151936, hidden_size 4096, intermediate_size 22016, 32 hidden layers, 32 attention heads, seq_length 8192, rotary embeddings with dynamic NTK, tokenizer_class QWenTokenizer)
(per-tensor conversion log truncated: for each transformer layer, the ln_1, attn.c_attn weight/bias, attn.c_proj, ln_2, and mlp.w1/w2/c_proj tensors are converted to float32 and written to runtime_outs/ne_qwen_f32.bin before quantization)
transformer.h.10.mlp.c_proj.weight transformer.h.10.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[0.0179443359375, -0.00897216796875, -0.00799560546875], [0.006866455078125, -0.033447265625, -0.0098876953125], [0.01416015625, -0.02978515625, -0.00457763671875]] b'transformer.h.10.mlp.c_proj.weight' transformer.h.11.ln_1.weight -> transformer.h.11.ln_1.weight transformer.h.11.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.8046875, 0.75, 0.640625] b'transformer.h.11.ln_1.weight' transformer.h.11.attn.c_attn.weight -> transformer.h.11.attn.c_attn.weight transformer.h.11.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) [[0.01251220703125, -0.0029449462890625, 0.01495361328125], [-0.015380859375, -0.0010528564453125, 0.008056640625], [-0.00958251953125, -0.0244140625, 0.00095367431640625]] b'transformer.h.11.attn.c_attn.weight' transformer.h.11.attn.c_attn.bias -> transformer.h.11.attn.c_attn.bias transformer.h.11.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [0.036865234375, 0.0303955078125, -0.055908203125] b'transformer.h.11.attn.c_attn.bias' transformer.h.11.attn.c_proj.weight -> transformer.h.11.attn.c_proj.weight transformer.h.11.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[0.0218505859375, -0.021484375, 0.0213623046875], [0.015625, -0.013671875, 0.0079345703125], [-0.00933837890625, -0.00469970703125, -0.00543212890625]] b'transformer.h.11.attn.c_proj.weight' transformer.h.11.ln_2.weight -> transformer.h.11.ln_2.weight transformer.h.11.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.546875, 0.56640625, 0.51953125] b'transformer.h.11.ln_2.weight' transformer.h.11.mlp.w1.weight -> transformer.h.11.mlp.w1.weight transformer.h.11.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.017333984375, 0.0177001953125, 0.04296875], [-0.0233154296875, 0.008544921875, 0.01507568359375], [-0.003631591796875, 0.009765625, 0.0030517578125]] 
b'transformer.h.11.mlp.w1.weight' transformer.h.11.mlp.w2.weight -> transformer.h.11.mlp.w2.weight transformer.h.11.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.00506591796875, 0.01458740234375, 0.0155029296875], [0.0093994140625, -0.017333984375, -0.0079345703125], [-0.00311279296875, -0.004730224609375, -0.010986328125]] b'transformer.h.11.mlp.w2.weight' transformer.h.11.mlp.c_proj.weight -> transformer.h.11.mlp.c_proj.weight transformer.h.11.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[5.4836273193359375e-05, -0.0020751953125, -0.013671875], [0.0111083984375, -0.01220703125, 0.01055908203125], [0.0196533203125, 0.0013427734375, -0.0400390625]] b'transformer.h.11.mlp.c_proj.weight' transformer.h.12.ln_1.weight -> transformer.h.12.ln_1.weight transformer.h.12.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.79296875, 0.78515625, 0.69140625] b'transformer.h.12.ln_1.weight' transformer.h.12.attn.c_attn.weight -> transformer.h.12.attn.c_attn.weight transformer.h.12.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) [[0.0218505859375, -0.017822265625, -0.021484375], [0.0107421875, -0.006134033203125, 0.003936767578125], [0.0017852783203125, -0.00067138671875, -0.0189208984375]] b'transformer.h.12.attn.c_attn.weight' transformer.h.12.attn.c_attn.bias -> transformer.h.12.attn.c_attn.bias transformer.h.12.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [0.04052734375, -0.060791015625, 0.049560546875] b'transformer.h.12.attn.c_attn.bias' transformer.h.12.attn.c_proj.weight -> transformer.h.12.attn.c_proj.weight transformer.h.12.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[0.00244140625, 0.01165771484375, 0.01385498046875], [0.00933837890625, -0.0019378662109375, -0.0057373046875], [-0.0125732421875, 0.0084228515625, -0.0172119140625]] b'transformer.h.12.attn.c_proj.weight' transformer.h.12.ln_2.weight -> transformer.h.12.ln_2.weight 
transformer.h.12.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.55078125, 0.5859375, 0.53125] b'transformer.h.12.ln_2.weight' transformer.h.12.mlp.w1.weight -> transformer.h.12.mlp.w1.weight transformer.h.12.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[0.0185546875, 0.00762939453125, 0.00665283203125], [-0.005706787109375, -0.033935546875, 0.017822265625], [-0.000911712646484375, -0.0042724609375, -0.01458740234375]] b'transformer.h.12.mlp.w1.weight' transformer.h.12.mlp.w2.weight -> transformer.h.12.mlp.w2.weight transformer.h.12.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.0167236328125, 0.00848388671875, -0.020263671875], [-0.023193359375, -0.008544921875, 0.03759765625], [0.0024261474609375, 0.0133056640625, 0.03564453125]] b'transformer.h.12.mlp.w2.weight' transformer.h.12.mlp.c_proj.weight -> transformer.h.12.mlp.c_proj.weight transformer.h.12.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[0.006134033203125, 0.037109375, 0.00244140625], [-0.00445556640625, -0.03515625, -0.0019378662109375], [-0.00165557861328125, -0.018798828125, -0.03662109375]] b'transformer.h.12.mlp.c_proj.weight' transformer.h.13.ln_1.weight -> transformer.h.13.ln_1.weight transformer.h.13.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.8046875, 0.765625, 0.703125] b'transformer.h.13.ln_1.weight' transformer.h.13.attn.c_attn.weight -> transformer.h.13.attn.c_attn.weight transformer.h.13.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) [[-0.020263671875, -0.015625, 0.00360107421875], [-0.016845703125, 0.0126953125, -0.005157470703125], [0.00020313262939453125, -0.0091552734375, -0.0120849609375]] b'transformer.h.13.attn.c_attn.weight' transformer.h.13.attn.c_attn.bias -> transformer.h.13.attn.c_attn.bias transformer.h.13.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [0.0019073486328125, -0.00958251953125, -0.03369140625] b'transformer.h.13.attn.c_attn.bias' 
transformer.h.13.attn.c_proj.weight -> transformer.h.13.attn.c_proj.weight transformer.h.13.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[-0.0152587890625, -0.01470947265625, -0.000446319580078125], [-0.004058837890625, -0.00677490234375, 0.0032501220703125], [-0.01904296875, 0.0079345703125, -0.01007080078125]] b'transformer.h.13.attn.c_proj.weight' transformer.h.13.ln_2.weight -> transformer.h.13.ln_2.weight transformer.h.13.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.54296875, 0.57421875, 0.5390625] b'transformer.h.13.ln_2.weight' transformer.h.13.mlp.w1.weight -> transformer.h.13.mlp.w1.weight transformer.h.13.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[9.679794311523438e-05, 0.010009765625, 0.0262451171875], [0.00640869140625, 0.0263671875, -0.0027923583984375], [0.01202392578125, 0.01171875, -0.033447265625]] b'transformer.h.13.mlp.w1.weight' transformer.h.13.mlp.w2.weight -> transformer.h.13.mlp.w2.weight transformer.h.13.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.006103515625, -0.0013885498046875, -0.022216796875], [-0.00122833251953125, -0.00927734375, -0.003631591796875], [-0.029296875, 0.003814697265625, -0.003814697265625]] b'transformer.h.13.mlp.w2.weight' transformer.h.13.mlp.c_proj.weight -> transformer.h.13.mlp.c_proj.weight transformer.h.13.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[-0.004150390625, -0.03173828125, 0.03515625], [0.017822265625, 0.035888671875, -0.006927490234375], [0.0103759765625, -0.0120849609375, 0.005096435546875]] b'transformer.h.13.mlp.c_proj.weight' transformer.h.14.ln_1.weight -> transformer.h.14.ln_1.weight transformer.h.14.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.85546875, 0.796875, 0.7578125] b'transformer.h.14.ln_1.weight' transformer.h.14.attn.c_attn.weight -> transformer.h.14.attn.c_attn.weight transformer.h.14.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) 
[[0.0198974609375, -0.0126953125, -0.00885009765625], [0.0057373046875, 0.00732421875, 0.00141143798828125], [0.0140380859375, 0.0084228515625, 0.000705718994140625]] b'transformer.h.14.attn.c_attn.weight' transformer.h.14.attn.c_attn.bias -> transformer.h.14.attn.c_attn.bias transformer.h.14.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [0.03271484375, -0.024658203125, 0.01055908203125] b'transformer.h.14.attn.c_attn.bias' transformer.h.14.attn.c_proj.weight -> transformer.h.14.attn.c_proj.weight transformer.h.14.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[0.00567626953125, 0.021728515625, 0.0126953125], [0.0120849609375, -0.0296630859375, 0.0152587890625], [-0.0159912109375, -0.02294921875, 0.009765625]] b'transformer.h.14.attn.c_proj.weight' transformer.h.14.ln_2.weight -> transformer.h.14.ln_2.weight transformer.h.14.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.55859375, 0.58984375, 0.55078125] b'transformer.h.14.ln_2.weight' transformer.h.14.mlp.w1.weight -> transformer.h.14.mlp.w1.weight transformer.h.14.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.0274658203125, -0.018310546875, 0.035400390625], [0.00921630859375, -0.016357421875, -0.00537109375], [0.0220947265625, 0.01348876953125, 0.031494140625]] b'transformer.h.14.mlp.w1.weight' transformer.h.14.mlp.w2.weight -> transformer.h.14.mlp.w2.weight transformer.h.14.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[0.005584716796875, -0.00921630859375, -0.005889892578125], [0.0223388671875, 0.02392578125, -0.0233154296875], [-0.021484375, -0.03564453125, 0.004730224609375]] b'transformer.h.14.mlp.w2.weight' transformer.h.14.mlp.c_proj.weight -> transformer.h.14.mlp.c_proj.weight transformer.h.14.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[0.004974365234375, -0.0091552734375, 0.0262451171875], [-0.01513671875, -0.01324462890625, -0.00665283203125], [0.019287109375, 0.029296875, 
0.0135498046875]] b'transformer.h.14.mlp.c_proj.weight' transformer.h.15.ln_1.weight -> transformer.h.15.ln_1.weight transformer.h.15.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.80859375, 0.78125, 0.76171875] b'transformer.h.15.ln_1.weight' transformer.h.15.attn.c_attn.weight -> transformer.h.15.attn.c_attn.weight transformer.h.15.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) [[0.0096435546875, -0.00732421875, -0.0262451171875], [0.0015869140625, 0.0234375, -0.00714111328125], [0.001983642578125, -0.033203125, 0.0208740234375]] b'transformer.h.15.attn.c_attn.weight' transformer.h.15.attn.c_attn.bias -> transformer.h.15.attn.c_attn.bias transformer.h.15.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [-0.10302734375, -0.0771484375, -0.018310546875] b'transformer.h.15.attn.c_attn.bias' transformer.h.15.attn.c_proj.weight -> transformer.h.15.attn.c_proj.weight transformer.h.15.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[-0.0234375, -0.010009765625, -0.00579833984375], [-0.034423828125, 0.034912109375, 0.002044677734375], [0.04248046875, 0.01409912109375, 0.0078125]] b'transformer.h.15.attn.c_proj.weight' transformer.h.15.ln_2.weight -> transformer.h.15.ln_2.weight transformer.h.15.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.5703125, 0.6015625, 0.5625] b'transformer.h.15.ln_2.weight' transformer.h.15.mlp.w1.weight -> transformer.h.15.mlp.w1.weight transformer.h.15.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[0.016845703125, 0.0203857421875, 0.0299072265625], [0.04541015625, -0.030029296875, -0.008544921875], [0.007598876953125, -0.0224609375, 0.01031494140625]] b'transformer.h.15.mlp.w1.weight' transformer.h.15.mlp.w2.weight -> transformer.h.15.mlp.w2.weight transformer.h.15.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.00372314453125, 0.016845703125, 0.0018310546875], [0.0069580078125, 0.0087890625, 0.0174560546875], [0.0157470703125, 
-0.03515625, 0.004638671875]] b'transformer.h.15.mlp.w2.weight' transformer.h.15.mlp.c_proj.weight -> transformer.h.15.mlp.c_proj.weight transformer.h.15.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[-0.010986328125, -0.01080322265625, 0.00787353515625], [-0.03369140625, 0.0069580078125, -0.0076904296875], [-0.0198974609375, -0.021240234375, -0.00933837890625]] b'transformer.h.15.mlp.c_proj.weight' transformer.h.16.ln_1.weight -> transformer.h.16.ln_1.weight transformer.h.16.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.8359375, 0.8203125, 0.79296875] b'transformer.h.16.ln_1.weight' transformer.h.16.attn.c_attn.weight -> transformer.h.16.attn.c_attn.weight transformer.h.16.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) [[-0.003387451171875, -0.0101318359375, -0.0172119140625], [0.0223388671875, 0.0322265625, 0.016845703125], [0.01336669921875, 0.0072021484375, -0.0004596710205078125]] b'transformer.h.16.attn.c_attn.weight' transformer.h.16.attn.c_attn.bias -> transformer.h.16.attn.c_attn.bias transformer.h.16.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [0.00628662109375, 0.0306396484375, 0.019287109375] b'transformer.h.16.attn.c_attn.bias' transformer.h.16.attn.c_proj.weight -> transformer.h.16.attn.c_proj.weight transformer.h.16.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[0.00469970703125, 0.0179443359375, 0.025634765625], [0.0079345703125, 0.0018463134765625, 0.0361328125], [0.00604248046875, 0.01220703125, -0.00555419921875]] b'transformer.h.16.attn.c_proj.weight' transformer.h.16.ln_2.weight -> transformer.h.16.ln_2.weight transformer.h.16.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.58203125, 0.625, 0.5859375] b'transformer.h.16.ln_2.weight' transformer.h.16.mlp.w1.weight -> transformer.h.16.mlp.w1.weight transformer.h.16.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[0.0133056640625, -0.004791259765625, -0.03662109375], 
[0.04248046875, -0.048583984375, 0.0247802734375], [0.046875, 0.003326416015625, 0.0115966796875]] b'transformer.h.16.mlp.w1.weight' transformer.h.16.mlp.w2.weight -> transformer.h.16.mlp.w2.weight transformer.h.16.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.031005859375, -0.022705078125, 0.01556396484375], [0.017333984375, 0.04248046875, -0.0126953125], [0.009765625, -0.00183868408203125, 0.00372314453125]] b'transformer.h.16.mlp.w2.weight' transformer.h.16.mlp.c_proj.weight -> transformer.h.16.mlp.c_proj.weight transformer.h.16.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[-0.006256103515625, 0.0107421875, -0.010009765625], [-0.00311279296875, -0.00084686279296875, -0.0091552734375], [-0.0140380859375, -0.00726318359375, -0.006134033203125]] b'transformer.h.16.mlp.c_proj.weight' transformer.h.17.ln_1.weight -> transformer.h.17.ln_1.weight transformer.h.17.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.81640625, 0.7890625, 0.7734375] b'transformer.h.17.ln_1.weight' transformer.h.17.attn.c_attn.weight -> transformer.h.17.attn.c_attn.weight transformer.h.17.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) [[-0.018798828125, -0.013916015625, -0.02978515625], [0.017333984375, 0.00421142578125, -0.000431060791015625], [0.0101318359375, -0.02197265625, -0.01171875]] b'transformer.h.17.attn.c_attn.weight' transformer.h.17.attn.c_attn.bias -> transformer.h.17.attn.c_attn.bias transformer.h.17.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [0.005706787109375, -0.1015625, 0.0303955078125] b'transformer.h.17.attn.c_attn.bias' transformer.h.17.attn.c_proj.weight -> transformer.h.17.attn.c_proj.weight transformer.h.17.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[0.011474609375, 0.00457763671875, 0.025146484375], [-0.0031890869140625, -0.031494140625, -0.00494384765625], [0.01123046875, 0.037841796875, -0.02392578125]] b'transformer.h.17.attn.c_proj.weight' 
transformer.h.17.ln_2.weight -> transformer.h.17.ln_2.weight transformer.h.17.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.60546875, 0.6328125, 0.61328125] b'transformer.h.17.ln_2.weight' transformer.h.17.mlp.w1.weight -> transformer.h.17.mlp.w1.weight transformer.h.17.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[0.015869140625, -0.0034942626953125, -0.010498046875], [0.0191650390625, 0.0196533203125, -0.00408935546875], [-0.018310546875, 0.014892578125, 0.005859375]] b'transformer.h.17.mlp.w1.weight' transformer.h.17.mlp.w2.weight -> transformer.h.17.mlp.w2.weight transformer.h.17.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.00494384765625, 0.02734375, -0.00567626953125], [0.0022430419921875, -0.005767822265625, -0.00958251953125], [-0.01129150390625, 0.025146484375, 0.0234375]] b'transformer.h.17.mlp.w2.weight' transformer.h.17.mlp.c_proj.weight -> transformer.h.17.mlp.c_proj.weight transformer.h.17.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[-0.0181884765625, -0.029296875, -0.007476806640625], [0.01123046875, 0.006500244140625, 0.013427734375], [-0.0022125244140625, 0.021728515625, 0.04248046875]] b'transformer.h.17.mlp.c_proj.weight' transformer.h.18.ln_1.weight -> transformer.h.18.ln_1.weight transformer.h.18.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.859375, 0.82421875, 0.8203125] b'transformer.h.18.ln_1.weight' transformer.h.18.attn.c_attn.weight -> transformer.h.18.attn.c_attn.weight transformer.h.18.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) [[-0.010009765625, -0.017578125, 0.0079345703125], [0.00262451171875, 0.0142822265625, 0.00518798828125], [-0.00909423828125, -0.00555419921875, -0.01806640625]] b'transformer.h.18.attn.c_attn.weight' transformer.h.18.attn.c_attn.bias -> transformer.h.18.attn.c_attn.bias transformer.h.18.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [-0.107421875, 0.177734375, -0.05029296875] 
b'transformer.h.18.attn.c_attn.bias' transformer.h.18.attn.c_proj.weight -> transformer.h.18.attn.c_proj.weight transformer.h.18.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[0.007598876953125, -0.0059814453125, 0.00506591796875], [0.005615234375, 0.0186767578125, 0.0098876953125], [-0.00347900390625, -0.01251220703125, -0.018310546875]] b'transformer.h.18.attn.c_proj.weight' transformer.h.18.ln_2.weight -> transformer.h.18.ln_2.weight transformer.h.18.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.6171875, 0.65625, 0.640625] b'transformer.h.18.ln_2.weight' transformer.h.18.mlp.w1.weight -> transformer.h.18.mlp.w1.weight transformer.h.18.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[0.01953125, -0.0037841796875, -0.00653076171875], [-0.00213623046875, -0.01025390625, 0.00750732421875], [-0.01361083984375, 0.0147705078125, 0.01080322265625]] b'transformer.h.18.mlp.w1.weight' transformer.h.18.mlp.w2.weight -> transformer.h.18.mlp.w2.weight transformer.h.18.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[0.004608154296875, 0.004058837890625, 0.014404296875], [0.007659912109375, 0.00689697265625, -0.005584716796875], [-0.003631591796875, 0.020751953125, -0.0081787109375]] b'transformer.h.18.mlp.w2.weight' transformer.h.18.mlp.c_proj.weight -> transformer.h.18.mlp.c_proj.weight transformer.h.18.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[0.00579833984375, -0.00130462646484375, 0.01324462890625], [0.00787353515625, -0.00677490234375, -0.0052490234375], [0.00171661376953125, 0.0140380859375, -0.00168609619140625]] b'transformer.h.18.mlp.c_proj.weight' transformer.h.19.ln_1.weight -> transformer.h.19.ln_1.weight transformer.h.19.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.828125, 0.83203125, 0.81640625] b'transformer.h.19.ln_1.weight' transformer.h.19.attn.c_attn.weight -> transformer.h.19.attn.c_attn.weight transformer.h.19.attn.c_attn.weight 2 (12288, 4096) 
Converting to float32 (12288, 4096) [[0.01141357421875, 0.0013275146484375, -0.005584716796875], [-0.00023651123046875, -0.003997802734375, -0.0050048828125], [-0.0106201171875, -0.00927734375, 0.003173828125]] b'transformer.h.19.attn.c_attn.weight' transformer.h.19.attn.c_attn.bias -> transformer.h.19.attn.c_attn.bias transformer.h.19.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [-0.14453125, -0.087890625, 0.041015625] b'transformer.h.19.attn.c_attn.bias' transformer.h.19.attn.c_proj.weight -> transformer.h.19.attn.c_proj.weight transformer.h.19.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[0.020263671875, 0.028564453125, -0.012939453125], [0.009765625, 0.00811767578125, 0.034912109375], [-0.017822265625, 0.000751495361328125, 0.015869140625]] b'transformer.h.19.attn.c_proj.weight' transformer.h.19.ln_2.weight -> transformer.h.19.ln_2.weight transformer.h.19.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.65625, 0.6796875, 0.67578125] b'transformer.h.19.ln_2.weight' transformer.h.19.mlp.w1.weight -> transformer.h.19.mlp.w1.weight transformer.h.19.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.0028533935546875, -0.01556396484375, -0.042236328125], [0.0068359375, -0.0018310546875, -0.001312255859375], [0.0089111328125, 0.0234375, 0.0255126953125]] b'transformer.h.19.mlp.w1.weight' transformer.h.19.mlp.w2.weight -> transformer.h.19.mlp.w2.weight transformer.h.19.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.0006561279296875, 0.01165771484375, 0.002471923828125], [0.018798828125, 0.01904296875, 0.013427734375], [0.0059814453125, -0.012451171875, -0.01239013671875]] b'transformer.h.19.mlp.w2.weight' transformer.h.19.mlp.c_proj.weight -> transformer.h.19.mlp.c_proj.weight transformer.h.19.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[-0.00537109375, 0.00738525390625, -0.008056640625], [0.023681640625, 0.0022430419921875, -0.0111083984375], 
[0.01953125, -0.0001735687255859375, -0.030029296875]] b'transformer.h.19.mlp.c_proj.weight' transformer.h.20.ln_1.weight -> transformer.h.20.ln_1.weight transformer.h.20.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.87890625, 0.828125, 0.8203125] b'transformer.h.20.ln_1.weight' transformer.h.20.attn.c_attn.weight -> transformer.h.20.attn.c_attn.weight transformer.h.20.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) [[0.0033111572265625, 0.010009765625, 0.0089111328125], [-0.0152587890625, 0.00096893310546875, -0.0003814697265625], [0.01519775390625, 0.01361083984375, -0.014404296875]] b'transformer.h.20.attn.c_attn.weight' transformer.h.20.attn.c_attn.bias -> transformer.h.20.attn.c_attn.bias transformer.h.20.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [0.0078125, 0.00185394287109375, 0.030029296875] b'transformer.h.20.attn.c_attn.bias' transformer.h.20.attn.c_proj.weight -> transformer.h.20.attn.c_proj.weight transformer.h.20.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[0.0068359375, -0.00762939453125, -0.0179443359375], [0.0118408203125, 0.001953125, 0.015869140625], [0.0011138916015625, 0.01116943359375, 0.0130615234375]] b'transformer.h.20.attn.c_proj.weight' transformer.h.20.ln_2.weight -> transformer.h.20.ln_2.weight transformer.h.20.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.6875, 0.72265625, 0.71875] b'transformer.h.20.ln_2.weight' transformer.h.20.mlp.w1.weight -> transformer.h.20.mlp.w1.weight transformer.h.20.mlp.w1.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[-0.02099609375, 0.0191650390625, 0.033447265625], [0.021484375, 0.0074462890625, -0.0194091796875], [0.0001811981201171875, 0.0242919921875, -0.0029296875]] b'transformer.h.20.mlp.w1.weight' transformer.h.20.mlp.w2.weight -> transformer.h.20.mlp.w2.weight transformer.h.20.mlp.w2.weight 2 (11008, 4096) Converting to float32 (11008, 4096) [[0.011962890625, 0.001434326171875, -0.00445556640625], 
[0.017333984375, -0.002166748046875, -0.02392578125], [0.00518798828125, 0.0172119140625, 0.013916015625]] b'transformer.h.20.mlp.w2.weight' transformer.h.20.mlp.c_proj.weight -> transformer.h.20.mlp.c_proj.weight transformer.h.20.mlp.c_proj.weight 2 (4096, 11008) Converting to float32 (4096, 11008) [[-0.00830078125, -0.000919342041015625, -0.0108642578125], [0.02294921875, -0.00372314453125, 0.00396728515625], [0.02001953125, 0.00579833984375, 0.013671875]] b'transformer.h.20.mlp.c_proj.weight' transformer.h.21.ln_1.weight -> transformer.h.21.ln_1.weight transformer.h.21.ln_1.weight 1 (4096,) Converting to float32 (4096,) [0.91796875, 0.85546875, 0.84765625] b'transformer.h.21.ln_1.weight' transformer.h.21.attn.c_attn.weight -> transformer.h.21.attn.c_attn.weight transformer.h.21.attn.c_attn.weight 2 (12288, 4096) Converting to float32 (12288, 4096) [[0.00836181640625, -0.00494384765625, -0.01092529296875], [0.0201416015625, 0.0157470703125, 0.02587890625], [-0.003875732421875, -0.0032806396484375, -0.017333984375]] b'transformer.h.21.attn.c_attn.weight' transformer.h.21.attn.c_attn.bias -> transformer.h.21.attn.c_attn.bias transformer.h.21.attn.c_attn.bias 1 (12288,) Converting to float32 (12288,) [-0.17578125, 0.076171875, -0.1611328125] b'transformer.h.21.attn.c_attn.bias' transformer.h.21.attn.c_proj.weight -> transformer.h.21.attn.c_proj.weight transformer.h.21.attn.c_proj.weight 2 (4096, 4096) Converting to float32 (4096, 4096) [[0.0233154296875, -0.01080322265625, 0.00872802734375], [-0.032958984375, -0.015869140625, 0.0177001953125], [-0.0003948211669921875, -0.004425048828125, -0.044677734375]] b'transformer.h.21.attn.c_proj.weight' transformer.h.21.ln_2.weight -> transformer.h.21.ln_2.weight transformer.h.21.ln_2.weight 1 (4096,) Converting to float32 (4096,) [0.734375, 0.7734375, 0.75390625] b'transformer.h.21.ln_2.weight' transformer.h.21.mlp.w1.weight -> transformer.h.21.mlp.w1.weight transformer.h.21.mlp.w1.weight 2 (11008, 4096) Converting to 
float32 (11008, 4096) [[0.01123046875, -0.0272216796875, -0.0123291015625], [0.01446533203125, 0.0283203125, -0.04296875], [-0.013427734375, 0.024658203125, 0.01263427734375]] b'transformer.h.21.mlp.w1.weight'
... (conversion log truncated: the same "Converting to float32" entries repeat for every tensor of transformer.h.21 through transformer.h.31) ...
transformer.ln_f.weight -> transformer.ln_f.weight transformer.ln_f.weight 1 (4096,) Converting to float32 (4096,) [3.21875, 3.578125, 3.921875] b'transformer.ln_f.weight'
lm_head.weight -> lm_head.weight lm_head.weight 2 (151936, 4096) Converting to float32 (151936, 4096) [[0.00982666015625, -0.00750732421875, -0.0125732421875], [0.01507568359375, -0.0140380859375, 0.00061798095703125], [-0.0006103515625, 0.015869140625, 0.00885009765625]] b'lm_head.weight'
Done. Output file: runtime_outs/ne_qwen_f32.bin
model.cpp: loading model from runtime_outs/ne_qwen_f32.bin
model.cpp: saving model to runtime_outs/ne_qwen_q_nf4_bestla_cfp32_g32.bin
ne_ftype: 10
Loading the bin file with NE format...
load_ne_hparams 0.hparams.n_vocab = 151936
load_ne_hparams 1.hparams.n_embd = 4096
load_ne_hparams 2.hparams.n_mult = 22016
load_ne_hparams 3.hparams.n_head = 32
load_ne_hparams 4.hparams.n_head_kv = 0
load_ne_hparams 5.hparams.n_layer = 32
load_ne_hparams 6.hparams.n_rot = 128
load_ne_hparams 7.hparams.ftype = 0
load_ne_hparams 8.hparams.max_seq_len = 8192
load_ne_hparams 9.hparams.alibi_bias_max = 0.000
load_ne_hparams 10.hparams.clip_qkv = 0.000
load_ne_hparams 11.hparams.par_res = 0
load_ne_hparams 12.hparams.word_embed_proj_dim = 0
load_ne_hparams 13.hparams.do_layer_norm_before = 0
load_ne_hparams 14.hparams.multi_query_group_num = 0
load_ne_hparams 15.hparams.ffn_hidden_size = 11008
load_ne_hparams 16.hparams.inner_hidden_size = 0
load_ne_hparams 17.hparams.n_experts = 0
load_ne_hparams 18.hparams.n_experts_used = 0
load_ne_hparams 19.hparams.n_embd_head_k = 0
load_ne_hparams 20.hparams.norm_eps = 0.000001
load_ne_hparams 21.hparams.freq_base = 10000.000
load_ne_hparams 22.hparams.freq_scale = 1.000
load_ne_hparams 23.hparams.rope_scaling_factor = 0.000
load_ne_hparams 24.hparams.original_max_position_embeddings = 0
load_ne_hparams 25.hparams.use_yarn = 0
load_ne_vocab 26.vocab.bos_token_id = 151643
load_ne_vocab 27.vocab.eos_token_id = 151643
load_ne_vocab 28.vocab.pad_token_id = -1
load_ne_vocab 29.vocab.sep_token_id = -1
[ 1/ 259] transformer.wte.weight - 4096 x 151936, type = f32, 0_0_32_0_0, quantizing .. GGML size = 2374.00 MB -> 333.84 MB
[ 2/ 259] transformer.h.0.ln_1.weight - 4096, type = f32, 7_0_32_0_0, size = 0.016 MB
[ 3/ 259] transformer.h.0.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2, quantizing .. BesTLA size = 192.00 MB -> 30.00 MB
[ 4/ 259] transformer.h.0.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0, size = 0.047 MB
[ 5/ 259] transformer.h.0.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2, quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB
[ 6/ 259] transformer.h.0.ln_2.weight - 4096, type = f32, 7_0_32_0_0, size = 0.016 MB
[ 7/ 259] transformer.h.0.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2, quantizing .. BesTLA size = 172.00 MB -> 26.95 MB
[ 8/ 259] transformer.h.0.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2, quantizing .. BesTLA size = 172.00 MB -> 26.95 MB
[ 9/ 259] transformer.h.0.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2, quantizing .. BesTLA size = 172.00 MB -> 27.09 MB
... (quantization log truncated: entries [ 10/ 259] through [ 113/ 259] repeat the same per-layer pattern for transformer.h.1 through transformer.h.13) ...
[ 114/ 259] transformer.h.14.ln_1.weight - 4096, type = f32, 7_0_32_0_0, size = 0.016 MB
[ 115/ 259] transformer.h.14.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2, quantizing .. BesTLA size = 192.00 MB -> 30.00 MB
[ 116/ 259] transformer.h.14.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0, size = 0.047 MB
[ 117/ 259] transformer.h.14.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2, quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB [ 118/ 259] transformer.h.14.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 119/ 259] transformer.h.14.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 120/ 259] transformer.h.14.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 121/ 259] transformer.h.14.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 122/ 259] transformer.h.15.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 123/ 259] transformer.h.15.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 124/ 259] transformer.h.15.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 125/ 259] transformer.h.15.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 64.00 MB -> 10.08 MB [ 126/ 259] transformer.h.15.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 127/ 259] transformer.h.15.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 128/ 259] transformer.h.15.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 129/ 259] transformer.h.15.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 130/ 259] transformer.h.16.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 131/ 259] transformer.h.16.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 132/ 259] transformer.h.16.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 133/ 259] transformer.h.16.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB [ 134/ 259] transformer.h.16.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 135/ 259] transformer.h.16.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 136/ 259] transformer.h.16.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 137/ 259] transformer.h.16.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 138/ 259] transformer.h.17.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 139/ 259] transformer.h.17.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 140/ 259] transformer.h.17.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 141/ 259] transformer.h.17.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 64.00 MB -> 10.08 MB [ 142/ 259] transformer.h.17.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 143/ 259] transformer.h.17.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 144/ 259] transformer.h.17.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 145/ 259] transformer.h.17.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 146/ 259] transformer.h.18.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 147/ 259] transformer.h.18.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 148/ 259] transformer.h.18.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 149/ 259] transformer.h.18.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB [ 150/ 259] transformer.h.18.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 151/ 259] transformer.h.18.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 152/ 259] transformer.h.18.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 153/ 259] transformer.h.18.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 154/ 259] transformer.h.19.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 155/ 259] transformer.h.19.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 156/ 259] transformer.h.19.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 157/ 259] transformer.h.19.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 64.00 MB -> 10.08 MB [ 158/ 259] transformer.h.19.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 159/ 259] transformer.h.19.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 160/ 259] transformer.h.19.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 161/ 259] transformer.h.19.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 162/ 259] transformer.h.20.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 163/ 259] transformer.h.20.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 164/ 259] transformer.h.20.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 165/ 259] transformer.h.20.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB [ 166/ 259] transformer.h.20.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 167/ 259] transformer.h.20.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 168/ 259] transformer.h.20.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 169/ 259] transformer.h.20.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 170/ 259] transformer.h.21.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 171/ 259] transformer.h.21.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 172/ 259] transformer.h.21.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 173/ 259] transformer.h.21.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 64.00 MB -> 10.08 MB [ 174/ 259] transformer.h.21.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 175/ 259] transformer.h.21.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 176/ 259] transformer.h.21.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 177/ 259] transformer.h.21.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 178/ 259] transformer.h.22.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 179/ 259] transformer.h.22.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 180/ 259] transformer.h.22.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 181/ 259] transformer.h.22.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB [ 182/ 259] transformer.h.22.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 183/ 259] transformer.h.22.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 184/ 259] transformer.h.22.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 185/ 259] transformer.h.22.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 186/ 259] transformer.h.23.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 187/ 259] transformer.h.23.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 188/ 259] transformer.h.23.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 189/ 259] transformer.h.23.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 64.00 MB -> 10.08 MB [ 190/ 259] transformer.h.23.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 191/ 259] transformer.h.23.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 192/ 259] transformer.h.23.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 193/ 259] transformer.h.23.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 194/ 259] transformer.h.24.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 195/ 259] transformer.h.24.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 196/ 259] transformer.h.24.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 197/ 259] transformer.h.24.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB [ 198/ 259] transformer.h.24.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 199/ 259] transformer.h.24.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 200/ 259] transformer.h.24.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 201/ 259] transformer.h.24.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 202/ 259] transformer.h.25.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 203/ 259] transformer.h.25.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 204/ 259] transformer.h.25.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 205/ 259] transformer.h.25.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 64.00 MB -> 10.08 MB [ 206/ 259] transformer.h.25.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 207/ 259] transformer.h.25.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 208/ 259] transformer.h.25.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 209/ 259] transformer.h.25.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 210/ 259] transformer.h.26.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 211/ 259] transformer.h.26.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 212/ 259] transformer.h.26.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 213/ 259] transformer.h.26.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB [ 214/ 259] transformer.h.26.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 215/ 259] transformer.h.26.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 216/ 259] transformer.h.26.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 217/ 259] transformer.h.26.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 218/ 259] transformer.h.27.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 219/ 259] transformer.h.27.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 220/ 259] transformer.h.27.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 221/ 259] transformer.h.27.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 64.00 MB -> 10.08 MB [ 222/ 259] transformer.h.27.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 223/ 259] transformer.h.27.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 224/ 259] transformer.h.27.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 225/ 259] transformer.h.27.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 226/ 259] transformer.h.28.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 227/ 259] transformer.h.28.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 228/ 259] transformer.h.28.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 229/ 259] transformer.h.28.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB [ 230/ 259] transformer.h.28.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 231/ 259] transformer.h.28.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 232/ 259] transformer.h.28.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 233/ 259] transformer.h.28.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 234/ 259] transformer.h.29.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 235/ 259] transformer.h.29.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 236/ 259] transformer.h.29.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 237/ 259] transformer.h.29.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 64.00 MB -> 10.08 MB [ 238/ 259] transformer.h.29.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 239/ 259] transformer.h.29.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 240/ 259] transformer.h.29.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 241/ 259] transformer.h.29.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 242/ 259] transformer.h.30.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 243/ 259] transformer.h.30.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 244/ 259] transformer.h.30.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 245/ 259] transformer.h.30.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. 
BesTLA size = 64.00 MB -> 10.08 MB [ 246/ 259] transformer.h.30.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 247/ 259] transformer.h.30.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 248/ 259] transformer.h.30.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 249/ 259] transformer.h.30.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 250/ 259] transformer.h.31.ln_1.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 251/ 259] transformer.h.31.attn.c_attn.weight - 4096 x 12288, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 192.00 MB -> 30.00 MB [ 252/ 259] transformer.h.31.attn.c_attn.bias - 12288, type = f32, 7_0_32_0_0,size = 0.047 MB [ 253/ 259] transformer.h.31.attn.c_proj.weight - 4096 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 64.00 MB -> 10.08 MB [ 254/ 259] transformer.h.31.ln_2.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 255/ 259] transformer.h.31.mlp.w1.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 256/ 259] transformer.h.31.mlp.w2.weight - 4096 x 11008, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 26.95 MB [ 257/ 259] transformer.h.31.mlp.c_proj.weight - 11008 x 4096, type = f32, 4_0_32_1_2,quantizing .. BesTLA size = 172.00 MB -> 27.09 MB [ 258/ 259] transformer.ln_f.weight - 4096, type = f32, 7_0_32_0_0,size = 0.016 MB [ 259/ 259] lm_head.weight - 4096 x 151936, type = f32, 4_0_32_1_2,quantizing .. 
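The per-tensor size reductions in the quantization log are easy to sanity-check. The run quantizes to 4-bit weights with group size 32 (the output file below is named ne_qwen_q_nf4_bestla_cfp32_g32.bin), so a rough estimate — and it is only an estimate; the exact BesTLA on-disk layout is an assumption here — is 0.5 bytes per weight plus one fp32 scale per 32-weight group:

```python
def q4_group32_size_mb(rows, cols, group_size=32):
    """Approximate size (MiB) of a 4-bit group-quantized weight tensor:
    two 4-bit weights packed per byte, plus one fp32 scale per group."""
    n = rows * cols
    packed_bytes = n // 2                 # 4 bits per weight
    scale_bytes = (n // group_size) * 4   # one fp32 scale per group of 32
    return (packed_bytes + scale_bytes) / (1024 ** 2)

# Compare with the log:
print(q4_group32_size_mb(4096, 12288))  # 30.0   (log: 192.00 MB -> 30.00 MB)
print(q4_group32_size_mb(4096, 11008))  # 26.875 (log: 172.00 MB -> 26.95 MB)
print(q4_group32_size_mb(4096, 4096))   # 10.0   (log: 64.00 MB -> 10.08 MB)
```

The small gap on some tensors (26.875 vs. the logged 26.95 MB) is presumably per-tensor metadata in the BesTLA container, but the estimate tracks the log closely, and the overall reduction the runtime reports below (29454.52 MB down to 4581.63 MB, roughly 6.4x) is consistent with this ~4.5-bits-per-weight scheme.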
BesTLA size = 2374.00 MB -> 371.02 MB
model_quantize_internal: model size = 29454.52 MB
model_quantize_internal: quant size = 4581.63 MB
AVX:1 AVX2:1 AVX512F:1 AVX_VNNI:0 AVX512_VNNI:1 AMX_INT8:0 AMX_BF16:0 AVX512_BF16:0 AVX512_FP16:0
beam_size: 1, do_sample: 0, top_k: 40, top_p: 0.950, continuous_batching: 0, max_request_num: 1, early_stopping: 0, scratch_size_ratio: 1.000
Loading the bin file with NE format... model.cpp: loading model from runtime_outs/ne_qwen_q_nf4_bestla_cfp32_g32.bin
init: n_vocab = 151936 init: n_embd = 4096 init: n_mult = 22016 init: n_head = 32 init: n_head_kv = 0 init: n_layer = 32 init: n_rot = 128 init: ftype = 0 init: max_seq_len= 8192 init: n_ff = 11008 init: n_parts = 1
load: ctx size = 4581.78 MB load: scratch0 = 4096.00 MB load: scratch1 = 2048.00 MB load: scratch2 = 4096.00 MB load: mem required = 14821.78 MB (+ memory per state)
model_init_from_file: support_bestla_kv = 0 model_init_from_file: kv self size = 256.00 MB

Once upon a time, there existed a little girl, who was very curious and adventurous. She loved to explore the world around her, and her parents often worried about her safety. One day, while playing in the park, the little girl stumbled upon a mysterious door that she had never seen before. The door was made of a strange, shimmering material, and it seemed to glow with an otherworldly light. The little girl was intrigued, and she pushed the door open, stepping through into a magical world. In this world, the little girl discovered that anything was possible. She could fly on the back of a giant dragon, ride a unicorn through a rainbow-colored forest, and even talk to the animals. She spent hours exploring this magical world, making new friends and having the time of her life. But as the day wore on, the little girl realized that she needed to go home. 
She said goodbye to her new friends and stepped back through the door, returning to the real world. From that day on, the little girl knew that there was so much more to the world than she had ever imagined. She continued to explore and discover new things, always keeping an open mind and an adventurous spirit. And she never forgot the magical world she had discovered, a place where anything was possible. The end.<|im_end|> <|endoftext|>
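The "mem required" figure in the load log is simply additive: the quantized model context plus the three scratch buffers, and the logged numbers add up exactly. This is also why the 16-core / 64 GB CPU instance used in this tutorial is comfortably sufficient — the working set is under 15 GB:

```python
# Figures taken from the load log above (all in MB)
ctx_size = 4581.78  # quantized weights + metadata
scratch0 = 4096.00
scratch1 = 2048.00
scratch2 = 4096.00

mem_required = ctx_size + scratch0 + scratch1 + scratch2
print(f"mem required = {mem_required:.2f} MB")  # -> mem required = 14821.78 MB, matching the log
```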