
[Original] Low Level Programming: Some Basic Concepts

(mainly from Gemini, sources linked) Manage system resources and run applications. Stored on storage media. While these three terms are often used in the context of computing, they have distinct roles and functions. 1. Driver: 1. What is a Driver? - Windows drive

2024-12-04 11:09:37 971

[Repost] Descriptive Statistics

easures o。

2024-08-01 15:39:09 95

[Repost] Primer on Python Decorators

A great Python decorator introduction

2024-07-21 15:28:48 117

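[Original] (WIP) Network Paradigm Fundamentals and Comparison

In distributed storage networks, the protocols we use are RoCE, InfiniBand (IB), and TCP/IP. RoCE and IB are both RDMA (Remote Direct Memory Access) technologies. How do they differ from traditional TCP/IP? A detailed comparison follows.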

2024-07-19 10:20:48 1094

[Original] IEEE Floating Point Rounding

How to round to nearest, ties to even, IEEE-style

2024-07-18 14:50:52 1125
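
For the rounding entry above, a minimal sketch of round-to-nearest, ties-to-even on an integer significand may help make the rule concrete. This is not taken from the post itself: the helper name `round_shift_rne` and the choice to model rounding as dropping low-order bits of a non-negative integer are assumptions made for illustration.

```python
def round_shift_rne(value: int, shift: int) -> int:
    """Drop the low `shift` bits of a non-negative integer,
    rounding to nearest with ties going to the even result.

    Hypothetical helper for illustration only.
    """
    if shift <= 0:
        return value
    kept = value >> shift                   # bits that survive
    dropped = value & ((1 << shift) - 1)    # bits being rounded away
    half = 1 << (shift - 1)                 # exactly halfway point
    # Round up when above halfway, or when exactly halfway and
    # the kept part is odd (so the result becomes even).
    if dropped > half or (dropped == half and (kept & 1) == 1):
        kept += 1
    return kept


if __name__ == "__main__":
    for v in (0b1001, 0b1010, 0b1011, 0b1110):
        print(f"{v:04b} -> {round_shift_rne(v, 2):b}")
```

Python's built-in `round()` follows the same tie rule for exactly representable halves, so `round(2.5)` and `round(1.5)` both give 2.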

[Repost] Large sequence models for software development activities

uses。

2024-07-09 12:13:43 89

[Original] Power Distribution Network (PDN) and Chip Packaging

[1] short description: Power Delivery Network (PDN) - Semiconductor Engineering [2] PDN on PCB: Power Distribution Network in PCB Design: Ensuring Stable Power Delivery [3] lecture slides on PDN: https://pages.hmc.edu/harris/cmosvlsi/4e/lect/lect21.pdf [4] (RL

2024-06-28 17:39:23 678

[Original] LLM Benchmarks

We very often see a menagerie of performance benchmarks listed in LLM papers to showcase the "breakthroughs", and very likely know very little about the specifics of each particular test suite. There, then, lies a danger of being misled and manipulated b

2024-04-08 11:47:21 733

[Original] 1 bit LLM and 1 trit LLM

In light of NV's recent addition of fp4, I'm once again curious about the bottom line for LLM, at least for inference; let's go back to this BitNet paper from Microsoft, featuring 1 bit LLM, with 1-bit weights trained from scratch, and later on another feat

2024-03-22 18:17:10 1111

[Original] SORA: text-to-video Generator by OpenAI

sources: 1. OpenAI's blog piece: Video generation models as world simulators 2. DiTs (Diffusion Transformers): Scalable Diffusion Models with Transformers. This is so far the most contentious point for SORA, regarding whether it is "learning" physics and gene

2024-02-24 21:47:08 1180

[Repost] Cache Invalidation

Learn how。

2024-01-30 16:19:17 222

[Original] Direct vs Indirect Branching

[Code] Direct vs Indirect Branching.

2024-01-30 12:35:06 1024

[Repost] Layer Normalization (LN)

From here on I use the Pinecore article as the main source, as it's the more comprehensive and readable one. As you can read from the abstract of the original paper, LN is proposed as an alternative or complement to BN, hence it's best to start with a solid unders

2024-01-19 12:29:04 135

[Repost] 12 Software Architecture Pitfalls and How to Avoid Them

Good luck!your。

2023-12-26 12:29:23 118

[Repost] vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention

paper: https://arxiv.org/pdf/2309.06180.pdf repo: GitHub - vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs highlights blog by authors: vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention | vLLM Blog LLMs

2023-12-05 20:49:41 288

[Repost] Trailing Comma "Feature" in Python

[Code] Trailing Comma "Feature" in Python.

2023-11-25 17:44:43 135

[Original] DeConvolution (Transposed Convolution)

DeConv fundamentals

2023-11-09 20:47:31 219

[Repost] Understanding Gated Recurrent Unit (GRU) in Deep Learning

Source. GRU stands for Gated Recurrent Unit, which is a type of recurrent neural network (RNN) architecture that is similar to LSTM (Long Short-Term Memory). Like LSTM, GRU is designed to model sequential data by allowing information to be selectively remembe

2023-11-07 19:01:17 171

[Original] The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

paper: https://owainevans.github.io/reversal_curse.pdf blog with interactions with the authors: Paper: LLMs trained on "A is B" fail to learn "B is A" - LessWrong. This is a linkpost for https://owainevans.github.io/reversal_curse.pdf This post is the copy of

2023-09-28 18:07:11 498

[Repost] Illustrated Stable Diffusion

AI image generation is the most recent AI capability blowing people’s minds (mine included). The ability to create striking visuals from text descriptions has a magical quality to it and points clearly to a shift in how humans create art.

2023-08-17 14:02:34 251

[Original] FlashAttention-2

FlashAttention is a fusion trick that merges multiple operational steps (ops) in the attention layers of transformer networks to achieve better end-to-end results; the performance gain comes mainly from better memory reuse, given the vanilla version being memory

2023-07-29 12:01:51 521

[Original] Automatic Differentiation

For beginners, the most daunting aspect of deep learning algorithms is perhaps back-propagation (BP), which requires derivations of some highly complex mathematical expressions. Luckily, when actually implementing BP, we do not have to rely on smmary symbolic

2023-07-28 13:46:54 258

[Original] CPP: Reference is a type

[Code] CPP: Reference is a type.

2023-05-16 14:53:43 163

[Original] Introduction to Verilog

Sources: Edited and padded GPT content; if you prefer human sources: Verilog Data Types. This article focuses on Verilog as a programming language, i.e. the simulation part is not covered. Verilog is C-like with a few quirks tweaked for the HDL side of things. V

2023-05-06 16:32:27 540

[Original] FlashAttention

paper: https://arxiv.org/abs/2205.14135 an informal talk by the author Tri Dao: https://www.youtube.com/watch?v=FThvfkXWqtE code repo: GitHub - HazyResearch/flash-attention: Fast and memory-efficient exact attention introduction to transformer: Transformer a

2023-05-03 18:30:32 1470

[Original] Cross Domain Signal Integrity in Asynchronous Designs

Conventional two flip-flop synchronizer, from Synchronizer Techniques for Multi-Clock Domain SoCs & FPGAs - EDN. In general, a conventional two flip-flop synchronizer is used for synchronizing a single bit level signal. As shown in Figure 1 and Figure 2, fli

2023-04-22 18:50:56 630

[Original] Introduction to Perl

=>

2023-04-18 21:26:46 494

[Repost] Moore vs. Mealy Machine

slides from https://inst.eecs.berkeley.edu/~cs150/fa05/Lectures/07-SeqLogicIIIx2.pdf ==> output is not dependent on input, but next state still is. With the merging rule stated in the beginning: (starting from the Moore diagram, but change it to a Mealy first,

2023-04-13 17:24:40 128

[Original] Initial Block and Testbenches in Verilog

[Code] Initial Block and Testbenches in Verilog.

2023-04-11 16:22:36 471

[Repost] Hierarchical Clustering: Agglomerative and Divisive

efficientaccurate。

2023-04-04 18:05:53 177

[Repost] Data Compression: Entropy Encoding and Run Length Encoding

GPT3.5]

2023-04-03 18:01:27 144

[Repost] Common architectures in convolutional neural networks

from: https://www.jeremyjordan.me/convnet-architectures/#lenet5 ==> most of the graphs cannot be copied to this platform, so just check the linked original. In this post, I'll discuss commonly used architectures for convolutional networks. As you'll see, almo

2023-02-22 18:56:56 195

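[Repost] Domain Specific Compiling: The Past and Present of Domain-Specific Compilers • Compilation Techniques for AI

About the author: Zhang Shuoming (张朔铭) is a PhD student pursuing a doctorate in computer architecture at the Institute of Computing Technology, Chinese Academy of Sciences, under the supervision of Researcher Cui Huimin (崔慧敏), with AI compilation as the current main research direction. zhangshuoming17@mails.ucas.ac.cn This article has two parts. The first part is a survey (The Past and Present of Domain-Specific Compilers • Survey); this part focuses on compilation techniques for the AI domain. 0. Preface: With the arrival of the AI era, the emergence of a large number of AI applications has also driven the development of domain-specific compilation, most visibly in the spread and adoption of a variety of AI compilers. Several important characteristics of the AI domain present AI compilers with many new opportunities and challenges: first, programming in the AI domain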

2023-02-21 18:59:50 696

[Original] Process Corners: Terminology and Introduction

process corners

2023-02-18 18:00:25 1449

[Repost] Power Integrity

breakdown Decouple! Provide Use!

2022-11-10 17:17:44 375

[Repost] C vs. Python Operator Precedence: Beware of (Bitwise) Logical Op.

Comparison operators in Python and C differ in precedence relative to bitwise and logical operators; beware.

2022-10-13 10:16:12 184
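
For the precedence entry above, a small sketch of the best-known case: in Python `&` binds tighter than `==`, while in C `==` binds tighter than `&`. The variable names are illustrative, and the C behavior is emulated here with explicit parentheses rather than run through a C compiler.

```python
x = 2

# Python parses this as (x & 1) == 0, i.e. "is x even?"
python_grouping = x & 1 == 0

# C would parse the same tokens as x & (1 == 0);
# the parentheses reproduce that grouping in Python.
c_style_grouping = x & (1 == 0)

print(python_grouping)   # result of (2 & 1) == 0
print(c_style_grouping)  # result of 2 & False
```

Writing the parentheses explicitly in both languages avoids depending on either precedence table.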

[Repost] Python: Function Annotation and "inspect" module

https://peps.python.org/pep-3107/ This PEP introduces a syntax for adding arbitrary metadata annotations to Python functions [1]. Because Python’s 2.x series lacks a standard way of annotating a function’s parameters and return values, a variety of tools and

2022-09-14 18:48:53 190

[Repost] "context" in C and Python

if”)..

2022-09-14 14:52:05 148

[Repost] Python Multi-level Import with '.'

[Code] Python Multi-level Import.

2022-09-14 13:54:36 145

[Repost] Linux: find file by content

[Code] Linux: find file by content.

2022-09-08 10:43:41 235
