Build a Chatbot System in 10 Minutes with Intel Extension for Transformers

Project Overview

This project is a tutorial on efficiently deploying a large language model on CPU on the ModelWhale platform, based on Tongyi Qianwen's Qwen-7B-Chat. It uses the Intel Extension for Transformers toolkit to set up the environment quickly, which greatly speeds up online deployment and delivers efficient model inference.

Compute Resources and Environment

Compute: Tencent Cloud (Nanjing region), 16-core CPU, 64 GB RAM
Environment: Python 3.11.8 data-science image

Note:
Because compute is provisioned from the cloud vendor, startup takes 5–10 minutes, and half an hour's worth of resource cost is pre-charged in whale coins. If the resource fails to start, the pre-charged fee is refunded within five minutes of closing the coding page, so there is no need to worry.

Wait, what is Intel Extension for Transformers?

Official GitHub repo: GitHub - intel/intel-extension-for-transformers: ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Intel® Extension for Transformers (ITREX) is an innovative toolkit from Intel that significantly accelerates Transformer-based large language models (LLMs) on Intel® architecture platforms, especially 4th-generation Intel® Xeon® Scalable processors (codenamed Sapphire Rapids, SPR).
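How much of that acceleration you actually see depends on which instruction sets the CPU exposes (AMX and AVX-512 on Sapphire Rapids). A quick shell check on the rented machine; the flag names below are whatever the Linux kernel reports:

# List the AVX/AMX instruction-set flags this CPU exposes;
# AMX/AVX-512 entries mean ITREX's fastest kernels can be used.
!grep -m1 -o 'avx[^ ]*\|amx[^ ]*' /proc/cpuinfo | sort -u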

It helps developers and researchers run and optimize Transformer-based LLMs more efficiently on Intel hardware.

Much like fitting a turbocharger to a car to boost its performance, this toolkit is a "turbocharger" for large language models, making them run faster and more efficiently on Intel CPUs and GPUs.
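In practice, "drop-in" is literal: ITREX ships its own AutoModelForCausalLM that mirrors the Hugging Face class, so swapping a single import is enough. A minimal sketch (the model id here is a placeholder; the full runnable version appears later in this notebook):

# Swap the Hugging Face import for the ITREX one and enable 4-bit weights.
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

# Placeholder model id; any LLM from the supported list works the same way.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B-Chat", load_in_4bit=True)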

What are the main features of Intel Extension for Transformers?

Its main features include:

(1) A seamless model-compression experience, achieved by extending the Hugging Face transformers API and leveraging Intel® Neural Compressor;

(2) An LLM inference runtime with low-bit quantization kernels (NeurIPS 2023: Efficient LLM Inference on CPUs), supporting common LLMs such as Falcon, LLaMA, MPT, Llama2, BLOOM, OPT, ChatGLM2, GPT-J-6B, Baichuan-13B-Base, Baichuan2-13B-Base, Qwen-7B, Qwen-14B, and Dolly-v2-3B (see the quantization sketch after this list);

(3) An advanced compression-aware runtime (NeurIPS 2022: Fast DistilBERT on CPUs and QuaLA-MiniLM: a Quantized Length Adaptive MiniLM; NeurIPS 2021: Prune Once for All: Sparse Pre-Trained Language Models).
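The load_in_4bit=True shortcut used later in this notebook maps onto the low-bit runtime from point (2); the logs below report "Quantize model by Neural Speed with RTN Algorithm". For finer control there is an explicit config form. A hedged sketch, assuming the RtnConfig class exposed by recent ITREX releases (parameter names may differ between versions):

# Assumed API: RtnConfig configures round-to-nearest weight-only quantization.
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, RtnConfig

woq_config = RtnConfig(bits=4)  # 4-bit weights, RTN algorithm
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B-Chat",
                                             quantization_config=woq_config)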

Without further ado, let's get hands-on and see how just two lines of code enable efficient large-model inference:

Set Up the Environment

In [1]:

# Install the system packages that ITREX needs in a CPU environment
!sudo apt-get update
!sudo apt-get install -y ffmpeg
!sudo apt-get install -y libgl1-mesa-glx libgl1-mesa-dev
!sudo apt-get install -y libsm6 libxext6
Get:1 http://archive.ubuntu.com/ubuntu jammy InRelease [270 kB]
...(apt-get update output truncated)
Fetched 31.9 MB in 16s (1960 kB/s)
Reading package lists... Done
ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 20 not upgraded.
...(download/unpack/setup output for the 22 new GL/X11 dev packages truncated)
0 upgraded, 22 newly installed, 0 to remove and 20 not upgraded.
libsm6 is already the newest version (2:1.2.3-1build2).
libxext6 is already the newest version (2:1.3.4-1build1).
0 upgraded, 0 newly installed, 0 to remove and 20 not upgraded.

In [2]:

# Install the required third-party libraries: core dependencies plus optional packages for faster computation
!pip install torch==2.3.0+cpu -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install cmake -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install ninja -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install neural_speed -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install intel-extension-for-transformers -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install modelscope -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install transformers -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install pyOpenSSL --upgrade -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install sentencepiece --upgrade -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install xformers -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install accelerate -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install tiktoken -i https://mirrors.cloud.tencent.com/pypi/simple
!pip install transformers_stream_generator -i https://mirrors.cloud.tencent.com/pypi/simple
Looking in indexes: https://mirrors.cloud.tencent.com/pypi/simple
Requirement already satisfied: torch==2.3.0+cpu in /opt/conda/lib/python3.11/site-packages (2.3.0+cpu)
...(pip output truncated; every requirement except transformers_stream_generator was already satisfied, including intel-extension-for-transformers 1.4.2, neural_speed 1.0, transformers 4.41.1, modelscope 1.14.0, accelerate 0.30.1, and xformers 0.0.26.post1)
Collecting transformers_stream_generator
Building wheels for collected packages: transformers_stream_generator
Successfully built transformers_stream_generator
Successfully installed transformers_stream_generator-0.0.5
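Optionally, before moving on, a quick sanity check confirms the CPU-only torch build and the installed versions:

# Optional sanity check of the freshly installed environment.
from importlib.metadata import version
import torch

print("torch:", torch.__version__)                   # expect 2.3.0+cpu
print("cuda available:", torch.cuda.is_available())  # expect False on this CPU image
print("itrex:", version("intel-extension-for-transformers"))
print("neural_speed:", version("neural_speed"))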

In [3]:

# Convert and store the CPU version of the model under the temp directory
%cd ..
%cd temp
/home/mw
/home/mw/temp
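The conversion step below first dumps a float32 copy of the weights to runtime_outs/ne_qwen_f32.bin inside the current directory; for a 7B-parameter model that is roughly 7e9 × 4 bytes ≈ 28 GB, so it is worth confirming that temp has enough free space first:

# The fp32 dump of a 7B model needs about 28 GB (7e9 params x 4 bytes).
import shutil

total, used, free = shutil.disk_usage(".")
print(f"free space: {free / 2**30:.1f} GiB")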

Automated Model Deployment & Inference

In [4]:

# Load the model with ITREX, which converts and quantizes it automatically, then run inference
from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

prompt = "Once upon a time, there existed a little girl,"
model_dir = '/home/mw/input/qwen7bchat3536'
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)  # Qwen ships custom tokenizer code
inputs = tokenizer(prompt, return_tensors="pt").input_ids
streamer = TextStreamer(tokenizer)  # stream generated tokens to stdout as they arrive
model = AutoModelForCausalLM.from_pretrained(model_dir, load_in_4bit=True)  # 4-bit weight-only quantization
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)
2024-05-29 15:13:06 [INFO] cpu device is used.
2024-05-29 15:13:06 [INFO] Applying Weight Only Quantization.
2024-05-29 15:13:06 [INFO] Quantize model by Neural Speed with RTN Algorithm.
cmd: ['python', PosixPath('/opt/conda/lib/python3.11/site-packages/neural_speed/convert/convert_qwen.py'), '--outfile', 'runtime_outs/ne_qwen_f32.bin', '--outtype', 'f32', '--model_hub', 'huggingface', '/home/mw/input/qwen7bchat3536']
Warning: please make sure that you are using the latest codes and checkpoints, especially if you used Qwen-7B before 09.25.2023.请使用最新模型和代码,尤其如果你在9月25日前已经开始使用Qwen-7B,千万注意不要使用错误代码和模型。
Loading model:  /home/mw/input/qwen7bchat3536
Loading checkpoint shards: 100%|██████████| 8/8 [01:34<00:00, 11.81s/it]
Model loaded:  /home/mw/input/qwen7bchat3536
{'vocab_size': 151936, 'hidden_size': 4096, 'intermediate_size': 22016, 'num_hidden_layers': 32, 'num_attention_heads': 32, 'emb_dropout_prob': 0.0, 'attn_dropout_prob': 0.0, 'layer_norm_epsilon': 1e-06, 'initializer_range': 0.02, 'scale_attn_weights': True, 'use_cache': True, 'max_position_embeddings': 8192, 'bf16': False, 'fp16': False, 'fp32': True, 'kv_channels': 128, 'rotary_pct': 1.0, 'rotary_emb_base': 10000, 'use_dynamic_ntk': True, 'use_logn_attn': True, 'use_flash_attn': False, 'no_bias': True, 'use_cache_quantization': False, 'use_cache_kernel': False, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': None, 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': False, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': False, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': None, 'architectures': ['QWenLMHeadModel'], 'finetuning_task': None, 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'}, 'label2id': {'LABEL_0': 0, 'LABEL_1': 1}, 'tokenizer_class': 'QWenTokenizer', 'prefix': None, 'bos_token_id': None, 'pad_token_id': None, 'eos_token_id': None, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': '/home/mw/input/qwen7bchat3536', 'transformers_version': '4.41.1', 'auto_map': {'AutoConfig': 'configuration_qwen.QWenConfig', 'AutoModelForCausalLM': 'modeling_qwen.QWenLMHeadModel'}, 'model_type': 'qwen', 'onnx_safe': None, 'seq_length': 8192}
transformer.wte.weight  ->  transformer.wte.weight
transformer.wte.weight 2 (151936, 4096)
  Converting to float32 (151936, 4096) [[-0.016845703125, -0.00958251953125, 0.0081787109375], [0.0029296875, 0.0096435546875, -0.00604248046875], [0.0162353515625, -0.0224609375, -0.01019287109375]]
b'transformer.wte.weight'
transformer.h.0.ln_1.weight  ->  transformer.h.0.ln_1.weight
transformer.h.0.ln_1.weight 1 (4096,)
  Converting to float32 (4096,) [0.09765625, 0.08837890625, 0.10498046875]
b'transformer.h.0.ln_1.weight'
transformer.h.0.attn.c_attn.weight  ->  transformer.h.0.attn.c_attn.weight
transformer.h.0.attn.c_attn.weight 2 (12288, 4096)
  Converting to float32 (12288, 4096) [[0.029541015625, -0.008544921875, 0.0361328125], [0.0023345947265625, -0.0035858154296875, -0.048095703125], [0.0302734375, -0.02392578125, -0.00750732421875]]
b'transformer.h.0.attn.c_attn.weight'
transformer.h.0.attn.c_attn.bias  ->  transformer.h.0.attn.c_attn.bias
transformer.h.0.attn.c_attn.bias 1 (12288,)
  Converting to float32 (12288,) [-0.9453125, 1.8828125, -0.74609375]
b'transformer.h.0.attn.c_attn.bias'
transformer.h.0.attn.c_proj.weight  ->  transformer.h.0.attn.c_proj.weight
transformer.h.0.attn.c_proj.weight 2 (4096, 4096)
  Converting to float32 (4096, 4096) [[-0.007232666015625, -0.002655029296875, -6.437301635742188e-05], [0.00732421875, 0.0037384033203125, -0.01104736328125], [-0.0299072265625, 0.00836181640625, -0.00604248046875]]
b'transformer.h.0.attn.c_proj.weight'
transformer.h.0.ln_2.weight  ->  transformer.h.0.ln_2.weight
transformer.h.0.ln_2.weight 1 (4096,)
  Converting to float32 (4096,) [0.1767578125, 0.171875, 0.16796875]
b'transformer.h.0.ln_2.weight'
transformer.h.0.mlp.w1.weight  ->  transformer.h.0.mlp.w1.weight
transformer.h.0.mlp.w1.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[-0.027587890625, 0.0123291015625, -0.0299072265625], [-0.003570556640625, -0.00604248046875, 0.0062255859375], [-0.0012969970703125, -0.0003643035888671875, 0.0213623046875]]
b'transformer.h.0.mlp.w1.weight'
transformer.h.0.mlp.w2.weight  ->  transformer.h.0.mlp.w2.weight
transformer.h.0.mlp.w2.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[-0.0263671875, -0.004364013671875, 0.0159912109375], [0.021728515625, 0.00970458984375, -0.035888671875], [0.0191650390625, 0.01397705078125, -0.01318359375]]
b'transformer.h.0.mlp.w2.weight'
transformer.h.0.mlp.c_proj.weight  ->  transformer.h.0.mlp.c_proj.weight
transformer.h.0.mlp.c_proj.weight 2 (4096, 11008)
  Converting to float32 (4096, 11008) [[-0.00421142578125, -0.0111083984375, -0.0012969970703125], [0.0191650390625, 0.0130615234375, -0.008056640625], [0.0029754638671875, 0.01092529296875, 0.00665283203125]]
b'transformer.h.0.mlp.c_proj.weight'
transformer.h.1.ln_1.weight  ->  transformer.h.1.ln_1.weight
transformer.h.1.ln_1.weight 1 (4096,)
  Converting to float32 (4096,) [0.134765625, 0.09130859375, 0.1044921875]
b'transformer.h.1.ln_1.weight'
transformer.h.1.attn.c_attn.weight  ->  transformer.h.1.attn.c_attn.weight
transformer.h.1.attn.c_attn.weight 2 (12288, 4096)
  Converting to float32 (12288, 4096) [[0.044921875, -0.00823974609375, 0.021484375], [-0.0225830078125, 0.0029449462890625, -0.0037994384765625], [-0.033447265625, 0.007659912109375, -0.017822265625]]
b'transformer.h.1.attn.c_attn.weight'
transformer.h.1.attn.c_attn.bias  ->  transformer.h.1.attn.c_attn.bias
transformer.h.1.attn.c_attn.bias 1 (12288,)
  Converting to float32 (12288,) [0.41796875, -1.1171875, -1.109375]
b'transformer.h.1.attn.c_attn.bias'
transformer.h.1.attn.c_proj.weight  ->  transformer.h.1.attn.c_proj.weight
transformer.h.1.attn.c_proj.weight 2 (4096, 4096)
  Converting to float32 (4096, 4096) [[0.0018310546875, 0.01239013671875, 0.0074462890625], [-0.0031280517578125, -0.028564453125, 0.0115966796875], [0.002838134765625, -0.001129150390625, 0.008544921875]]
b'transformer.h.1.attn.c_proj.weight'
transformer.h.1.ln_2.weight  ->  transformer.h.1.ln_2.weight
transformer.h.1.ln_2.weight 1 (4096,)
  Converting to float32 (4096,) [0.244140625, 0.244140625, 0.2392578125]
b'transformer.h.1.ln_2.weight'
transformer.h.1.mlp.w1.weight  ->  transformer.h.1.mlp.w1.weight
transformer.h.1.mlp.w1.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[0.010498046875, 0.0303955078125, -0.021240234375], [-0.0064697265625, 0.0008544921875, -0.02685546875], [-0.022216796875, -0.036865234375, -8.440017700195312e-05]]
b'transformer.h.1.mlp.w1.weight'
transformer.h.1.mlp.w2.weight  ->  transformer.h.1.mlp.w2.weight
transformer.h.1.mlp.w2.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[0.031005859375, 0.01953125, 0.00689697265625], [0.0036773681640625, -0.0201416015625, -0.004364013671875], [0.01202392578125, 0.032470703125, -0.020263671875]]
b'transformer.h.1.mlp.w2.weight'
transformer.h.1.mlp.c_proj.weight  ->  transformer.h.1.mlp.c_proj.weight
transformer.h.1.mlp.c_proj.weight 2 (4096, 11008)
  Converting to float32 (4096, 11008) [[0.0013275146484375, 0.0286865234375, -0.0101318359375], [-0.0084228515625, 0.0220947265625, -0.02001953125], [-0.0128173828125, -0.0098876953125, 0.0242919921875]]
b'transformer.h.1.mlp.c_proj.weight'
transformer.h.2.ln_1.weight  ->  transformer.h.2.ln_1.weight
transformer.h.2.ln_1.weight 1 (4096,)
  Converting to float32 (4096,) [0.244140625, 0.201171875, 0.1943359375]
b'transformer.h.2.ln_1.weight'
transformer.h.2.attn.c_attn.weight  ->  transformer.h.2.attn.c_attn.weight
transformer.h.2.attn.c_attn.weight 2 (12288, 4096)
  Converting to float32 (12288, 4096) [[-0.00016307830810546875, 0.0206298828125, -0.01025390625], [-0.01324462890625, 0.0096435546875, 0.0205078125], [-0.016357421875, 0.00885009765625, 0.02294921875]]
b'transformer.h.2.attn.c_attn.weight'
transformer.h.2.attn.c_attn.bias  ->  transformer.h.2.attn.c_attn.bias
transformer.h.2.attn.c_attn.bias 1 (12288,)
  Converting to float32 (12288,) [-0.2216796875, 0.412109375, 0.34375]
b'transformer.h.2.attn.c_attn.bias'
transformer.h.2.attn.c_proj.weight  ->  transformer.h.2.attn.c_proj.weight
transformer.h.2.attn.c_proj.weight 2 (4096, 4096)
  Converting to float32 (4096, 4096) [[-0.01019287109375, -0.01446533203125, 0.0147705078125], [0.007049560546875, 0.00469970703125, -0.006011962890625], [-0.00970458984375, -0.006805419921875, 0.015869140625]]
b'transformer.h.2.attn.c_proj.weight'
transformer.h.2.ln_2.weight  ->  transformer.h.2.ln_2.weight
transformer.h.2.ln_2.weight 1 (4096,)
  Converting to float32 (4096,) [0.30078125, 0.287109375, 0.283203125]
b'transformer.h.2.ln_2.weight'
transformer.h.2.mlp.w1.weight  ->  transformer.h.2.mlp.w1.weight
transformer.h.2.mlp.w1.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[-0.046142578125, 0.0115966796875, 0.012451171875], [0.0283203125, -0.00531005859375, 0.0191650390625], [-0.0142822265625, -0.01055908203125, -0.0198974609375]]
b'transformer.h.2.mlp.w1.weight'
transformer.h.2.mlp.w2.weight  ->  transformer.h.2.mlp.w2.weight
transformer.h.2.mlp.w2.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[0.04345703125, 0.0155029296875, 0.0032196044921875], [-0.010009765625, 0.0250244140625, 0.0029144287109375], [0.00701904296875, -0.00726318359375, -0.0390625]]
b'transformer.h.2.mlp.w2.weight'
transformer.h.2.mlp.c_proj.weight  ->  transformer.h.2.mlp.c_proj.weight
transformer.h.2.mlp.c_proj.weight 2 (4096, 11008)
  Converting to float32 (4096, 11008) [[-0.0174560546875, 0.006134033203125, -0.034912109375], [0.00933837890625, -0.02880859375, 0.000946044921875], [0.00640869140625, 0.0159912109375, -0.002685546875]]
b'transformer.h.2.mlp.c_proj.weight'
transformer.h.3.ln_1.weight  ->  transformer.h.3.ln_1.weight
transformer.h.3.ln_1.weight 1 (4096,)
  Converting to float32 (4096,) [0.51171875, 0.46875, 0.486328125]
b'transformer.h.3.ln_1.weight'
transformer.h.3.attn.c_attn.weight  ->  transformer.h.3.attn.c_attn.weight
transformer.h.3.attn.c_attn.weight 2 (12288, 4096)
  Converting to float32 (12288, 4096) [[-0.02099609375, 0.005126953125, -0.0194091796875], [0.0128173828125, 0.0142822265625, 0.018310546875], [-0.0172119140625, 0.016357421875, 0.0047607421875]]
b'transformer.h.3.attn.c_attn.weight'
transformer.h.3.attn.c_attn.bias  ->  transformer.h.3.attn.c_attn.bias
transformer.h.3.attn.c_attn.bias 1 (12288,)
  Converting to float32 (12288,) [0.1962890625, -0.1494140625, -0.224609375]
b'transformer.h.3.attn.c_attn.bias'
transformer.h.3.attn.c_proj.weight  ->  transformer.h.3.attn.c_proj.weight
transformer.h.3.attn.c_proj.weight 2 (4096, 4096)
  Converting to float32 (4096, 4096) [[0.00732421875, -0.0172119140625, 0.01007080078125], [-0.009765625, 0.028564453125, 0.00762939453125], [0.0184326171875, -0.006500244140625, 0.01904296875]]
b'transformer.h.3.attn.c_proj.weight'
transformer.h.3.ln_2.weight  ->  transformer.h.3.ln_2.weight
transformer.h.3.ln_2.weight 1 (4096,)
  Converting to float32 (4096,) [0.326171875, 0.333984375, 0.3125]
b'transformer.h.3.ln_2.weight'
transformer.h.3.mlp.w1.weight  ->  transformer.h.3.mlp.w1.weight
transformer.h.3.mlp.w1.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[0.00750732421875, 0.0203857421875, 0.025634765625], [0.032470703125, 0.0111083984375, -0.012939453125], [-0.00037384033203125, 0.01025390625, -0.007781982421875]]
b'transformer.h.3.mlp.w1.weight'
transformer.h.3.mlp.w2.weight  ->  transformer.h.3.mlp.w2.weight
transformer.h.3.mlp.w2.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[0.0002689361572265625, -0.0091552734375, -0.0250244140625], [0.0113525390625, -0.022216796875, 0.0262451171875], [0.0172119140625, 0.0040283203125, 0.026123046875]]
b'transformer.h.3.mlp.w2.weight'
transformer.h.3.mlp.c_proj.weight  ->  transformer.h.3.mlp.c_proj.weight
transformer.h.3.mlp.c_proj.weight 2 (4096, 11008)
  Converting to float32 (4096, 11008) [[0.0150146484375, 0.0166015625, -0.00958251953125], [0.006744384765625, -0.01043701171875, -0.02734375], [-0.0072021484375, -0.01611328125, -0.037841796875]]
b'transformer.h.3.mlp.c_proj.weight'
transformer.h.4.ln_1.weight  ->  transformer.h.4.ln_1.weight
transformer.h.4.ln_1.weight 1 (4096,)
  Converting to float32 (4096,) [0.625, 0.55078125, 0.57421875]
b'transformer.h.4.ln_1.weight'
transformer.h.4.attn.c_attn.weight  ->  transformer.h.4.attn.c_attn.weight
transformer.h.4.attn.c_attn.weight 2 (12288, 4096)
  Converting to float32 (12288, 4096) [[0.0089111328125, -0.016357421875, -0.0135498046875], [0.002532958984375, -0.0162353515625, 0.006744384765625], [-0.01336669921875, -0.0255126953125, 0.0027008056640625]]
b'transformer.h.4.attn.c_attn.weight'
transformer.h.4.attn.c_attn.bias  ->  transformer.h.4.attn.c_attn.bias
transformer.h.4.attn.c_attn.bias 1 (12288,)
  Converting to float32 (12288,) [-0.1328125, -0.3828125, -0.359375]
b'transformer.h.4.attn.c_attn.bias'
transformer.h.4.attn.c_proj.weight  ->  transformer.h.4.attn.c_proj.weight
transformer.h.4.attn.c_proj.weight 2 (4096, 4096)
  Converting to float32 (4096, 4096) [[-0.005401611328125, 0.002655029296875, -0.0019073486328125], [0.0003986358642578125, -0.0021514892578125, 0.00689697265625], [-0.01043701171875, -0.0040283203125, -0.0013427734375]]
b'transformer.h.4.attn.c_proj.weight'
transformer.h.4.ln_2.weight  ->  transformer.h.4.ln_2.weight
transformer.h.4.ln_2.weight 1 (4096,)
  Converting to float32 (4096,) [0.380859375, 0.37890625, 0.361328125]
b'transformer.h.4.ln_2.weight'
transformer.h.4.mlp.w1.weight  ->  transformer.h.4.mlp.w1.weight
transformer.h.4.mlp.w1.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[0.01483154296875, 0.001861572265625, 0.0042724609375], [0.00787353515625, 0.02001953125, -0.00836181640625], [0.007476806640625, -0.02783203125, -0.048583984375]]
b'transformer.h.4.mlp.w1.weight'
transformer.h.4.mlp.w2.weight  ->  transformer.h.4.mlp.w2.weight
transformer.h.4.mlp.w2.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[0.006378173828125, 0.000682830810546875, 0.0133056640625], [-0.013916015625, -0.0272216796875, -0.01275634765625], [0.021728515625, -0.0322265625, -0.00183868408203125]]
b'transformer.h.4.mlp.w2.weight'
transformer.h.4.mlp.c_proj.weight  ->  transformer.h.4.mlp.c_proj.weight
transformer.h.4.mlp.c_proj.weight 2 (4096, 11008)
  Converting to float32 (4096, 11008) [[0.0106201171875, 0.0064697265625, 0.01068115234375], [-0.00799560546875, -0.045654296875, -0.01092529296875], [0.01251220703125, 0.000843048095703125, -0.03271484375]]
b'transformer.h.4.mlp.c_proj.weight'
... (identical conversion entries for transformer.h.5 through transformer.h.28 omitted for brevity: each layer's ln_1.weight, attn.c_attn.weight/bias, attn.c_proj.weight, ln_2.weight, and mlp.w1/w2/c_proj.weight tensors are converted to float32 in exactly the same way) ...
transformer.h.29.ln_1.weight  ->  transformer.h.29.ln_1.weight
transformer.h.29.ln_1.weight 1 (4096,)
  Converting to float32 (4096,) [1.171875, 1.0625, 1.0703125]
b'transformer.h.29.ln_1.weight'
transformer.h.29.attn.c_attn.weight  ->  transformer.h.29.attn.c_attn.weight
transformer.h.29.attn.c_attn.weight 2 (12288, 4096)
  Converting to float32 (12288, 4096) [[-0.0093994140625, 0.00433349609375, 0.022216796875], [0.004547119140625, -0.016845703125, 0.0140380859375], [0.0015716552734375, 0.013427734375, -0.0166015625]]
b'transformer.h.29.attn.c_attn.weight'
transformer.h.29.attn.c_attn.bias  ->  transformer.h.29.attn.c_attn.bias
transformer.h.29.attn.c_attn.bias 1 (12288,)
  Converting to float32 (12288,) [0.34375, -0.33984375, -0.1337890625]
b'transformer.h.29.attn.c_attn.bias'
transformer.h.29.attn.c_proj.weight  ->  transformer.h.29.attn.c_proj.weight
transformer.h.29.attn.c_proj.weight 2 (4096, 4096)
  Converting to float32 (4096, 4096) [[-0.0135498046875, 0.00762939453125, -0.041015625], [0.00738525390625, 0.015869140625, -0.022705078125], [0.0038604736328125, -0.004119873046875, 0.021484375]]
b'transformer.h.29.attn.c_proj.weight'
transformer.h.29.ln_2.weight  ->  transformer.h.29.ln_2.weight
transformer.h.29.ln_2.weight 1 (4096,)
  Converting to float32 (4096,) [1.0625, 1.0703125, 1.0625]
b'transformer.h.29.ln_2.weight'
transformer.h.29.mlp.w1.weight  ->  transformer.h.29.mlp.w1.weight
transformer.h.29.mlp.w1.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[-0.002685546875, -0.033203125, 0.0101318359375], [-0.003021240234375, -0.01312255859375, -0.017822265625], [-0.00106048583984375, 0.0103759765625, -0.004150390625]]
b'transformer.h.29.mlp.w1.weight'
transformer.h.29.mlp.w2.weight  ->  transformer.h.29.mlp.w2.weight
transformer.h.29.mlp.w2.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[0.0291748046875, -0.03662109375, 0.00604248046875], [-0.002349853515625, -0.012939453125, -0.0146484375], [-0.01226806640625, 0.04833984375, -0.00830078125]]
b'transformer.h.29.mlp.w2.weight'
transformer.h.29.mlp.c_proj.weight  ->  transformer.h.29.mlp.c_proj.weight
transformer.h.29.mlp.c_proj.weight 2 (4096, 11008)
  Converting to float32 (4096, 11008) [[0.00775146484375, 0.01202392578125, -0.000583648681640625], [0.0162353515625, -0.0244140625, -0.017333984375], [-0.03857421875, -0.019287109375, 0.005584716796875]]
b'transformer.h.29.mlp.c_proj.weight'
transformer.h.30.ln_1.weight  ->  transformer.h.30.ln_1.weight
transformer.h.30.ln_1.weight 1 (4096,)
  Converting to float32 (4096,) [1.1484375, 1.078125, 1.0859375]
b'transformer.h.30.ln_1.weight'
transformer.h.30.attn.c_attn.weight  ->  transformer.h.30.attn.c_attn.weight
transformer.h.30.attn.c_attn.weight 2 (12288, 4096)
  Converting to float32 (12288, 4096) [[-0.01806640625, 0.0015869140625, -0.004852294921875], [0.031005859375, 0.020751953125, -0.0322265625], [-0.013916015625, -0.023193359375, 0.002899169921875]]
b'transformer.h.30.attn.c_attn.weight'
transformer.h.30.attn.c_attn.bias  ->  transformer.h.30.attn.c_attn.bias
transformer.h.30.attn.c_attn.bias 1 (12288,)
  Converting to float32 (12288,) [0.10791015625, 0.1171875, 0.052001953125]
b'transformer.h.30.attn.c_attn.bias'
transformer.h.30.attn.c_proj.weight  ->  transformer.h.30.attn.c_proj.weight
transformer.h.30.attn.c_proj.weight 2 (4096, 4096)
  Converting to float32 (4096, 4096) [[0.020751953125, -0.004638671875, -0.01202392578125], [-0.0123291015625, 0.045654296875, 0.009765625], [-0.026123046875, -0.000621795654296875, 0.00439453125]]
b'transformer.h.30.attn.c_proj.weight'
transformer.h.30.ln_2.weight  ->  transformer.h.30.ln_2.weight
transformer.h.30.ln_2.weight 1 (4096,)
  Converting to float32 (4096,) [1.0546875, 1.09375, 1.078125]
b'transformer.h.30.ln_2.weight'
transformer.h.30.mlp.w1.weight  ->  transformer.h.30.mlp.w1.weight
transformer.h.30.mlp.w1.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[-0.005615234375, -0.0120849609375, -0.0069580078125], [0.0079345703125, -0.003265380859375, 0.02783203125], [0.01373291015625, -0.0072021484375, 0.020263671875]]
b'transformer.h.30.mlp.w1.weight'
transformer.h.30.mlp.w2.weight  ->  transformer.h.30.mlp.w2.weight
transformer.h.30.mlp.w2.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[0.0096435546875, 0.0047607421875, -0.016845703125], [-0.00011205673217773438, -0.0230712890625, -0.0225830078125], [0.00494384765625, 0.007781982421875, -0.0034332275390625]]
b'transformer.h.30.mlp.w2.weight'
transformer.h.30.mlp.c_proj.weight  ->  transformer.h.30.mlp.c_proj.weight
transformer.h.30.mlp.c_proj.weight 2 (4096, 11008)
  Converting to float32 (4096, 11008) [[-0.04541015625, -0.004852294921875, 0.00604248046875], [-0.016845703125, -0.00677490234375, -0.0026397705078125], [0.02978515625, -0.003082275390625, 0.012451171875]]
b'transformer.h.30.mlp.c_proj.weight'
transformer.h.31.ln_1.weight  ->  transformer.h.31.ln_1.weight
transformer.h.31.ln_1.weight 1 (4096,)
  Converting to float32 (4096,) [1.0546875, 0.984375, 0.94140625]
b'transformer.h.31.ln_1.weight'
transformer.h.31.attn.c_attn.weight  ->  transformer.h.31.attn.c_attn.weight
transformer.h.31.attn.c_attn.weight 2 (12288, 4096)
  Converting to float32 (12288, 4096) [[0.036865234375, 0.00872802734375, 0.01513671875], [0.00225830078125, -0.0174560546875, 0.00113677978515625], [0.007598876953125, 0.0012664794921875, 0.0263671875]]
b'transformer.h.31.attn.c_attn.weight'
transformer.h.31.attn.c_attn.bias  ->  transformer.h.31.attn.c_attn.bias
transformer.h.31.attn.c_attn.bias 1 (12288,)
  Converting to float32 (12288,) [0.05224609375, -0.005523681640625, 0.0225830078125]
b'transformer.h.31.attn.c_attn.bias'
transformer.h.31.attn.c_proj.weight  ->  transformer.h.31.attn.c_proj.weight
transformer.h.31.attn.c_proj.weight 2 (4096, 4096)
  Converting to float32 (4096, 4096) [[0.00022411346435546875, 0.002197265625, -0.023193359375], [0.005218505859375, 0.0057373046875, -0.02294921875], [-0.0027618408203125, -0.01324462890625, -0.000934600830078125]]
b'transformer.h.31.attn.c_proj.weight'
transformer.h.31.ln_2.weight  ->  transformer.h.31.ln_2.weight
transformer.h.31.ln_2.weight 1 (4096,)
  Converting to float32 (4096,) [1.1328125, 1.140625, 1.15625]
b'transformer.h.31.ln_2.weight'
transformer.h.31.mlp.w1.weight  ->  transformer.h.31.mlp.w1.weight
transformer.h.31.mlp.w1.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[-0.0205078125, 0.018798828125, -0.01171875], [0.018798828125, -0.02880859375, -0.012939453125], [-0.0078125, -0.000926971435546875, 0.0245361328125]]
b'transformer.h.31.mlp.w1.weight'
transformer.h.31.mlp.w2.weight  ->  transformer.h.31.mlp.w2.weight
transformer.h.31.mlp.w2.weight 2 (11008, 4096)
  Converting to float32 (11008, 4096) [[-0.01373291015625, -0.0185546875, -0.011474609375], [0.0277099609375, -0.02001953125, 0.003936767578125], [-0.00750732421875, -0.0228271484375, 0.0380859375]]
b'transformer.h.31.mlp.w2.weight'
transformer.h.31.mlp.c_proj.weight  ->  transformer.h.31.mlp.c_proj.weight
transformer.h.31.mlp.c_proj.weight 2 (4096, 11008)
  Converting to float32 (4096, 11008) [[-0.006378173828125, -0.04345703125, 0.00439453125], [0.006256103515625, 0.00885009765625, -0.00946044921875], [0.00543212890625, -0.029296875, 0.00823974609375]]
b'transformer.h.31.mlp.c_proj.weight'
transformer.ln_f.weight  ->  transformer.ln_f.weight
transformer.ln_f.weight 1 (4096,)
  Converting to float32 (4096,) [3.21875, 3.578125, 3.921875]
b'transformer.ln_f.weight'
lm_head.weight  ->  lm_head.weight
lm_head.weight 2 (151936, 4096)
  Converting to float32 (151936, 4096) [[0.00982666015625, -0.00750732421875, -0.0125732421875], [0.01507568359375, -0.0140380859375, 0.00061798095703125], [-0.0006103515625, 0.015869140625, 0.00885009765625]]
b'lm_head.weight'
Done. Output file: runtime_outs/ne_qwen_f32.bin
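
The converter has now written a full-precision NE checkpoint, and the next block of output shows the runtime re-quantizing it to 4-bit. For reference, the scheme encoded in the output filename ne_qwen_q_nf4_bestla_cfp32_g32.bin (NF4 weights, BesTLA kernels, fp32 compute dtype, group size 32) can also be requested explicitly through the ITREX API. Below is a minimal sketch, assuming the WeightOnlyQuantConfig class of ITREX ~1.3 (renamed RtnConfig in later releases); treat the exact class and argument names as version-dependent:

from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import (
    AutoModelForCausalLM,
    WeightOnlyQuantConfig,  # assumption: newer ITREX releases call this RtnConfig
)

model_name = "Qwen/Qwen-7B-Chat"

# Mirror the scheme in runtime_outs/ne_qwen_q_nf4_bestla_cfp32_g32.bin:
# NF4 weight dtype, fp32 compute dtype, quantization group size 32
woq_config = WeightOnlyQuantConfig(
    weight_dtype="nf4",
    compute_dtype="fp32",
    group_size=32,
)

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=woq_config,
    trust_remote_code=True,
)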

model.cpp: loading model from runtime_outs/ne_qwen_f32.bin
model.cpp: saving model to runtime_outs/ne_qwen_q_nf4_bestla_cfp32_g32.bin
ne_ftype: 10
Loading the bin file with NE format...
load_ne_hparams  0.hparams.n_vocab = 151936                        
load_ne_hparams  1.hparams.n_embd = 4096                          
load_ne_hparams  2.hparams.n_mult = 22016                         
load_ne_hparams  3.hparams.n_head = 32                            
load_ne_hparams  4.hparams.n_head_kv = 0                             
load_ne_hparams  5.hparams.n_layer = 32                            
load_ne_hparams  6.hparams.n_rot = 128                           
load_ne_hparams  7.hparams.ftype = 0                             
load_ne_hparams  8.hparams.max_seq_len = 8192                          
load_ne_hparams  9.hparams.alibi_bias_max = 0.000                         
load_ne_hparams  10.hparams.clip_qkv = 0.000                         
load_ne_hparams  11.hparams.par_res = 0                             
load_ne_hparams  12.hparams.word_embed_proj_dim = 0                             
load_ne_hparams  13.hparams.do_layer_norm_before = 0                             
load_ne_hparams  14.hparams.multi_query_group_num = 0                             
load_ne_hparams  15.hparams.ffn_hidden_size = 11008                         
load_ne_hparams  16.hparams.inner_hidden_size = 0                             
load_ne_hparams  17.hparams.n_experts = 0                             
load_ne_hparams  18.hparams.n_experts_used = 0                             
load_ne_hparams  19.hparams.n_embd_head_k = 0                             
load_ne_hparams  20.hparams.norm_eps = 0.000001                      
load_ne_hparams  21.hparams.freq_base = 10000.000                     
load_ne_hparams  22.hparams.freq_scale = 1.000                         
load_ne_hparams  23.hparams.rope_scaling_factor = 0.000                         
load_ne_hparams  24.hparams.original_max_position_embeddings = 0                             
load_ne_hparams  25.hparams.use_yarn = 0                             
load_ne_vocab    26.vocab.bos_token_id = 151643                        
load_ne_vocab    27.vocab.eos_token_id = 151643                        
load_ne_vocab    28.vocab.pad_token_id = -1                            
load_ne_vocab    29.vocab.sep_token_id = -1                            
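
These hyperparameters match the published Qwen-7B configuration: a 151936-token vocabulary, 4096 hidden size, 32 heads, 32 layers, and an 8192-token maximum sequence length. If you want to cross-check them against the Hugging Face config, here is a quick optional sketch (the attribute names are assumed from Qwen's remote-code config):

from transformers import AutoConfig

# Optional cross-check of the hparams printed above (attribute names assumed
# from Qwen's remote-code QWenConfig)
cfg = AutoConfig.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
print(cfg.vocab_size, cfg.hidden_size, cfg.num_hidden_layers, cfg.seq_length)
# Expect: 151936 4096 32 8192, matching n_vocab, n_embd, n_layer, max_seq_len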
[   1/ 259]               transformer.wte.weight -    4096 x 151936, type =    f32, 0_0_32_0_0,quantizing .. GGML size =  2374.00 MB ->   333.84 MB

[   2/ 259]          transformer.h.0.ln_1.weight -             4096, type =    f32, 7_0_32_0_0,size =    0.016 MB

[   3/ 259]   transformer.h.0.attn.c_attn.weight -     4096 x 12288, type =    f32, 4_0_32_1_2,quantizing .. BesTLA size =   192.00 MB ->    30.00 MB

[   4/ 259]     transformer.h.0.attn.c_attn.bias -            12288, type =    f32, 7_0_32_0_0,size =    0.047 MB

[   5/ 259]   transformer.h.0.attn.c_proj.weight -     4096 x  4096, type =    f32, 4_0_32_1_2,quantizing .. BesTLA size =    64.00 MB ->    10.08 MB

[   6/ 259]          transformer.h.0.ln_2.weight -             4096, type =    f32, 7_0_32_0_0,size =    0.016 MB

[   7/ 259]        transformer.h.0.mlp.w1.weight -     4096 x 11008, type =    f32, 4_0_32_1_2,quantizing .. BesTLA size =   172.00 MB ->    26.95 MB

[   8/ 259]        transformer.h.0.mlp.w2.weight -     4096 x 11008, type =    f32, 4_0_32_1_2,quantizing .. BesTLA size =   172.00 MB ->    26.95 MB

[   9/ 259]    transformer.h.0.mlp.c_proj.weight -    11008 x  4096, type =    f32, 4_0_32_1_2,quantizing .. BesTLA size =   172.00 MB ->    27.09 MB

  ... (entries 10-257 repeat the layer-0 pattern for blocks h.1 through h.31: per block, the ln_1/ln_2 weights stay f32 at 0.016 MB, the c_attn.bias stays f32 at 0.047 MB, the c_attn.weight quantizes 192.00 MB -> 30.00 MB, the attn c_proj.weight 64.00 MB -> 10.08 MB, the mlp w1/w2 weights 172.00 MB -> 26.95 MB each, and the mlp c_proj.weight 172.00 MB -> 27.09 MB) ...

[ 258/ 259]              transformer.ln_f.weight -             4096, type =    f32, 7_0_32_0_0,size =    0.016 MB

[ 259/ 259]                       lm_head.weight -    4096 x 151936, type =    f32, 4_0_32_1_2,quantizing .. BesTLA size =  2374.00 MB ->   371.02 MB

model_quantize_internal: model size  = 29454.52 MB
model_quantize_internal: quant size  =  4581.63 MB
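
The summary lines show roughly a 6.4x reduction (29454.52 MB down to 4581.63 MB), which is exactly what NF4 with group size 32 predicts: 4 bits per weight plus one fp32 scale per 32-weight group comes to about 5 effective bits, versus 32 bits for fp32. A quick sanity check:

# Sanity-check the ratio reported by model_quantize_internal
fp32_mb, quant_mb = 29454.52, 4581.63
print(f"measured: {fp32_mb / quant_mb:.2f}x")    # ~6.43x

# NF4 with group size 32: 4-bit weights plus one fp32 scale per 32 weights
bits_per_weight = 4 + 32 / 32                    # ~5 effective bits per weight
print(f"expected: {32 / bits_per_weight:.2f}x")  # 6.40x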
AVX:1 AVX2:1 AVX512F:1 AVX_VNNI:0 AVX512_VNNI:1 AMX_INT8:0 AMX_BF16:0 AVX512_BF16:0 AVX512_FP16:0
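
The flags line shows which SIMD instruction sets the runtime detected on this instance: AVX-512F and AVX512_VNNI are available while AMX is not, so the BesTLA kernels dispatch to the AVX-512 code paths. To see what your own VM exposes, one quick illustrative check:

!lscpu | grep -oE '(avx512|avx|amx)[a-z0-9_]*' | sort -u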
beam_size: 1, do_sample: 0, top_k: 40, top_p: 0.950, continuous_batching: 0, max_request_num: 1, early_stopping: 0, scratch_size_ratio: 1.000
Loading the bin file with NE format...
Once upon a time, there existed a little 
model.cpp: loading model from runtime_outs/ne_qwen_q_nf4_bestla_cfp32_g32.bin
init: n_vocab    = 151936
init: n_embd     = 4096
init: n_mult     = 22016
init: n_head     = 32
init: n_head_kv  = 0
init: n_layer    = 32
init: n_rot      = 128
init: ftype      = 0
init: max_seq_len= 8192
init: n_ff       = 11008
init: n_parts    = 1
load: ctx size   = 4581.78 MB
load: scratch0   = 4096.00 MB
load: scratch1   = 2048.00 MB
load: scratch2   = 4096.00 MB
load: mem required  = 14821.78 MB (+ memory per state)
......................................................................................
model_init_from_file: support_bestla_kv = 0
model_init_from_file: kv self size =  256.00 MB
girl, who was very curious and adventurous. She loved to explore the world around her, and her parents often worried about her safety. One day, while playing in the park, the little girl stumbled upon a mysterious door that she had never seen before.
The door was made of a strange, shimmering material, and it seemed to glow with an otherworldly light. The little girl was intrigued, and she pushed the door open, stepping through into a magical world.
In this world, the little girl discovered that anything was possible. She could fly on the back of a giant dragon, ride a unicorn through a rainbow-colored forest, and even talk to the animals. She spent hours exploring this magical world, making new friends and having the time of her life.
But as the day wore on, the little girl realized that she needed to go home. She said goodbye to her new friends and stepped back through the door, returning to the real world.
From that day on, the little girl knew that there was so much more to the world than she had ever imagined. She continued to explore and discover new things, always keeping an open mind and an adventurous spirit. And she never forgot the magical world she had discovered, a place where anything was possible. The end.<|im_end|>
<|endoftext|>
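
That completes the run: the runtime reloads the 4-bit checkpoint (the load: lines report about 14.8 GB of memory required, comfortably within this 64 GB instance) and greedily decodes the story above (beam_size: 1, do_sample: 0). The generation call itself has the usual transformers shape; below is a minimal sketch reusing the tokenizer and model from the quantization sketch earlier, with a hypothetical max_new_tokens cap:

from transformers import TextStreamer

prompt = "Once upon a time, there existed a little"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Stream tokens to stdout as the BesTLA runtime produces them
streamer = TextStreamer(tokenizer)
outputs = model.generate(input_ids, streamer=streamer, max_new_tokens=300)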