  • Blog (467)
  • Resources (2)
  • Favorites
  • Following

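[Original] DeepSpeed Tutorial: Distributed Training for Very Large Models

DeepSpeed tutorial. Project link. Introduction: DeepSpeed is Microsoft's new tool for distributed training of large-scale models, built specifically for training very large models. It is claimed to train models with 10B parameters, 10 times larger than the best current models, with 10 times faster training. It is compatible with PyTorch models and needs only minimal code changes. The figure below shows the time needed to train BERT, which is roughly linear in the number of GPUs. Installation: download the code (0.3.0) with git clone https://github.com/microsoft/DeepSpeed.git. When installing the Python environment, pay attention to the PyTorch cud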

2020-09-20 11:44:20 8806 3

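[Original] Big Bird: Transformers for Longer Sequences, Paper Walkthrough

Contents: the Big Bird model; the problem the paper addresses; the solution; random attention; fixed-window attention; global attention; complexity analysis; experiments; ablation of the three attention types; language-model comparison with RoBERTa and Longformer; comparison with Longformer on QA; long-text classification; summarization; genomic language-model experiments. The Big Bird model. The problem the paper addresses: as the figure below shows, in the Transformer the memory and time of the Q·K dot product in attention grow quadratically with the sequence length. For long text this complexity is unacceptable. The usual workaround is to cut the input into blocks of 512, which loses the information between blocks; in QA, for example, the Q must see all of A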
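The quadratic cost mentioned in the excerpt can be made concrete with a little arithmetic. The sketch below is only an illustration, not code from the post: the function names are hypothetical, and it simply counts query-key pairs for full attention versus independent 512-token blocks.

```python
def full_attention_pairs(n: int) -> int:
    """Query-key pairs scored by full self-attention over n tokens."""
    return n * n


def chunked_attention_pairs(n: int, block: int = 512) -> int:
    """Pairs scored when the text is cut into independent blocks.

    Each block attends only to itself, so cross-block pairs are never
    scored; that is the information loss the excerpt describes.
    """
    full_blocks, rest = divmod(n, block)
    return full_blocks * block * block + rest * rest


if __name__ == "__main__":
    for n in (512, 4096, 16384):
        print(n, full_attention_pairs(n), chunked_attention_pairs(n))
```

At 4096 tokens, full attention scores eight times as many pairs as the chunked version, and the gap widens linearly as the text gets longer.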

2020-09-20 09:44:20 3036

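[Original] Longformer Paper Notes: The Long-Document Transformer

Contents: what problem Longformer solves; how Longformer solves it; fixed window; dilated sliding window; global attention; implementation details; experimental results; experiments; autoregressive language model; ablation on text8; ablation on WikiHop; QA tasks. What problem Longformer solves: the original Transformer has O(n^2) time and space complexity (by the attention formula, the Query at every position attends to the Key at every position), where n is the length of the input text. For long documents this complexity is too high. The usual approach is to split the document (each chunk limited to 512); the chunks exchange no information with each other, which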
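The fixed-window and global-attention ideas listed in the contents can be sketched as a boolean mask. This is a minimal illustration under stated assumptions, not the Longformer implementation: the helper names are hypothetical, the window is symmetric, and a real model would never materialize the full n x n mask.

```python
def sliding_window_mask(n, window):
    """mask[i][j] is True when position i may attend to position j."""
    half = window // 2
    return [[abs(i - j) <= half for j in range(n)] for i in range(n)]


def add_global_tokens(mask, global_positions):
    """Let the given positions attend everywhere and be attended by all."""
    n = len(mask)
    out = [row[:] for row in mask]
    for g in global_positions:
        for k in range(n):
            out[g][k] = True
            out[k][g] = True
    return out


def count_pairs(mask):
    """Number of allowed query-key pairs."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    local = sliding_window_mask(8, window=2)
    print(count_pairs(local))                          # local pairs only
    print(count_pairs(add_global_tokens(local, [0])))  # plus one global token
```

With a fixed window, the number of allowed pairs grows roughly as n times the window size rather than n squared.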

2020-09-13 09:13:48 2429

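[Original] Reformer Paper Notes (The Efficient Transformer)

Reformer paper notes (The Efficient Transformer). The problems Reformer addresses: the memory and compute cost of attention is quadratic in the text length L, i.e. O(L * L), because in self-attention every position attends to every other position in the sequence; this is unacceptable for very long text such as whole articles. Conventional Transformers usually split the input into blocks of length 512, which loses the mutual information between blocks. Training a vanilla Transformer needs memory proportional to the number of layers, because backpropagation must store each layer's outputs to compute the gradient of the error. The feed-forward layer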

2020-09-12 17:04:17 2625

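[Original] Machine Learning Interview Notes

Sigmoid function: the derivative of sigmoid is at most 0.25. Cross-entropy loss. Softmax. Why softmax + cross entropy: cross-entropy works better than mean squared error (with MSE, even when the prediction at the index of the 1 in the label is correct, the distribution of the other predicted values still affects the loss, which is not what we expect from a classification loss). From the likelihood-estimation view, cross-entropy is the negative log-likelihood of the sample, and it is equivalent to the KL divergence, also called relative entropy. softmax+cr...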
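Both claims in the excerpt can be checked numerically. The sketch below is an illustration with hypothetical helper names and made-up probability vectors: it evaluates the sigmoid derivative on a grid, then compares cross-entropy and MSE on two predictions that agree on the true class but differ elsewhere.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative s * (1 - s); its maximum, 0.25, is reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class only."""
    return -math.log(probs[label])


def mse(probs, label):
    """Mean squared error against the one-hot label."""
    return sum(
        (p - (1.0 if i == label else 0.0)) ** 2 for i, p in enumerate(probs)
    ) / len(probs)


if __name__ == "__main__":
    print(max(sigmoid_grad(x / 10) for x in range(-100, 101)))
    # Two illustrative predictions with the same true-class probability.
    a, b = [0.6, 0.2, 0.2], [0.6, 0.4, 0.0]
    print(cross_entropy(a, 0), cross_entropy(b, 0))  # identical
    print(mse(a, 0), mse(b, 0))                      # different
```

Cross-entropy depends only on the probability assigned to the true class, while MSE changes with how the remaining mass is spread, which is the behaviour the excerpt objects to.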

2019-11-14 15:23:05 726

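[Original] foj2204 Problem 2204 7 dp

Problem 2204 7. Accept: 50 Submit: 142 Time Limit: 2000 mSec Memory Limit : 65536 KB. Problem Description: n labeled balls are arranged in a circle. Each ball is colored either black or white. Count the colorings in which there are never 7 consecutive white balls or 7 consecutive black balls. Input: there are multiple test cases. The first line contains T, the number of cases. (T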
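For small n the statement can be checked by exhaustive enumeration, which is useful as a reference when testing a dp. This brute force is not the post's dp solution; the function name is hypothetical, and it assumes that a circle with fewer than 7 balls cannot contain a run of 7 (the excerpt is cut off before the constraints).

```python
from itertools import product


def count_colorings(n, run=7):
    """Count 2-colorings of n labeled balls on a circle with no `run`
    consecutive balls of the same color. Exponential; small n only."""
    if n < run:
        # Assumption: fewer than `run` balls can never form a run.
        return 2 ** n
    total = 0
    for colors in product((0, 1), repeat=n):
        has_run = any(
            all(colors[(start + k) % n] == colors[start] for k in range(run))
            for start in range(n)
        )
        if not has_run:
            total += 1
    return total


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, count_colorings(n))
```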

2015-10-08 23:40:46 798

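[Original] foj2200 Problem 2200 cleaning dp

Problem 2200 cleaning. Accept: 36 Submit: 56 Time Limit: 1000 mSec Memory Limit : 65536 KB. Problem Description: N people sit in a circle discussing a big cleaning and need to pick K of them. Each person is in conflict with the people at distance 2 from them, so no two of the K chosen people may be at distance 2. They want to know how many ways there are. Inp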
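A direct enumeration of the selections gives a reference answer for small inputs. This sketch is not the post's dp; the names are hypothetical, and it assumes "distance" means the shorter way around the circle, since the excerpt is truncated before the input format.

```python
from itertools import combinations


def circular_distance(i, j, n):
    """Shorter distance between seats i and j on a circle of n seats."""
    d = abs(i - j)
    return min(d, n - d)


def count_selections(n, k):
    """Ways to pick k of n people so that no two are at distance 2."""
    return sum(
        1
        for chosen in combinations(range(n), k)
        if all(circular_distance(a, b, n) != 2 for a, b in combinations(chosen, 2))
    )


if __name__ == "__main__":
    print(count_selections(5, 2))
    print(count_selections(4, 2))
```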

2015-10-07 12:42:52 658

[Original] Codeforces Round #324 (Div. 2) C. Marina and Vasya string processing

C. Marina and Vasya. Time limit per test: 1 second. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. Marina loves strings of the same length and Vasya loves when there is a

2015-10-07 09:26:46 731

[Original] Codeforces Round #324 (Div. 2) D. Dima and Lisa number theory, three-primes theorem

D. Dima and Lisa. Time limit per test: 1 second. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. Dima loves representing an odd number as the sum of multiple primes, and L
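The three-primes idea named in the title can be sketched as follows. The excerpt is cut off before the exact task, so this assumes the goal is to write an odd n >= 3 as a sum of at most three primes; the function names are hypothetical, and trial division is used only to keep the example short.

```python
def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def split_into_primes(n):
    """Return 1 to 3 primes summing to odd n >= 3, or None if none found."""
    if is_prime(n):
        return [n]
    if is_prime(n - 2):
        return [2, n - 2]
    # Take an odd prime p close to n; the remainder n - p is then a small
    # even number, which is split into two primes by direct search.
    for p in range(n - 4, 2, -2):
        if not is_prime(p):
            continue
        rest = n - p
        for q in range(2, rest // 2 + 1):
            if is_prime(q) and is_prime(rest - q):
                return [p, q, rest - q]
    return None


if __name__ == "__main__":
    print(split_into_primes(27))
```

Because primes are dense, the remainder stays small in practice, so the inner search is cheap even when n is large.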

2015-10-07 09:22:39 897

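[Original] foj2198 Problem 2198 快来快来数一数 (Come and Count) dp, matrix fast exponentiation

Problem 2198 快来快来数一数. Accept: 67 Submit: 194 Time Limit: 1000 mSec Memory Limit : 65536 KB. Problem Description: n hexagons are arranged in a row, with adjacent hexagons sharing one edge, as shown in the figure below. Let t(n) be the number of spanning trees of this figure (every edge is distinct, so isomorphism is not an issue). Then t(1)=6, t(2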
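Small values of t(n) can be checked independently with Kirchhoff's matrix-tree theorem. This is not the dp plus matrix-exponentiation approach named in the title. The graph construction below is an assumption (each hexagon shares opposite edges with its neighbours), and the helper names are hypothetical.

```python
from fractions import Fraction


def hexagon_chain_edges(n):
    """Edges of n hexagons in a row; rungs (a_k, b_k) are the shared edges."""
    edges = [(("a", k), ("b", k)) for k in range(n + 1)]
    for k in range(n):
        edges.append((("a", k), ("u", k)))      # top path a_k - u_k - a_{k+1}
        edges.append((("u", k), ("a", k + 1)))
        edges.append((("b", k), ("v", k)))      # bottom path b_k - v_k - b_{k+1}
        edges.append((("v", k), ("b", k + 1)))
    return edges


def count_spanning_trees(edges):
    """Determinant of the Laplacian with one row and column removed."""
    vertices = sorted({v for e in edges for v in e})
    index = {v: i for i, v in enumerate(vertices)}
    size = len(vertices)
    lap = [[Fraction(0)] * size for _ in range(size)]
    for x, y in edges:
        i, j = index[x], index[y]
        lap[i][i] += 1
        lap[j][j] += 1
        lap[i][j] -= 1
        lap[j][i] -= 1
    m = [row[:-1] for row in lap[:-1]]
    det = Fraction(1)
    for col in range(len(m)):
        # Exact Gaussian elimination with rational arithmetic.
        pivot = next((r for r in range(col, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            for c in range(col, len(m)):
                m[r][c] -= factor * m[col][c]
    return int(det)


if __name__ == "__main__":
    print(count_spanning_trees(hexagon_chain_edges(1)))
    print(count_spanning_trees(hexagon_chain_edges(2)))
```

For a single hexagon this reproduces t(1)=6 from the statement, since removing any one of the six edges of a cycle leaves a spanning tree.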

2015-10-07 00:25:23 592

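[Original] foj2202 Problem 2202 犯罪嫌疑人 (The Suspect)

Problem 2202 犯罪嫌疑人. Accept: 30 Submit: 69 Time Limit: 1000 mSec Memory Limit : 65536 KB. Problem Description: Sherlock Holmes is a great detective who is always solving difficult cases, and this case is no exception. It goes like this: there are N suspects numbered 1 to N, and one of them committed the crime. Each suspect is then asked, "Which person committed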

2015-10-06 20:55:12 573

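[Original] Foj 2203 Problem 2203 单纵大法好 (Line Ahead Is the Way) STL usage

Problem 2203 单纵大法好. Accept: 20 Submit: 57 Time Limit: 5000 mSec Memory Limit : 65536 KB. Problem Description: Man acts and heaven watches; line ahead without flinching keeps you safe. All the ship girls come back through reincarnation, daring to send POI into night battle. Don't blame the battleships for being dim; aren't the carriers just the same? A fake fall is nothing to fear with damage control on board; advancing while heavily damaged is the truth! Lao S has recently become fond of a warship-collecting game, and this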

2015-10-06 20:50:00 671

[Original] Codeforces Round #323 (Div. 1) A. GCD Table STL usage

A. GCD Table. Time limit per test: 2 seconds. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. The GCD table G of size n × n for an array of positive integers a of length n

2015-10-05 18:13:28 798

[Original] Codeforces Round #323 (Div. 1) B. Once Again... longest non-decreasing subsequence

B. Once Again… Time limit per test: 1 second. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. You are given an array of positive integers a1, a2, …, an × T of length n ×
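The building block named in the title, the longest non-decreasing subsequence, can be computed in O(m log m) with binary search. This sketch covers only that building block on an explicit list; it is not a full solution to the problem, whose statement is truncated here, and the function name is hypothetical.

```python
from bisect import bisect_right


def longest_non_decreasing(seq):
    """Length of the longest non-decreasing subsequence of seq."""
    tails = []  # tails[k]: smallest possible tail of a subsequence of length k+1
    for x in seq:
        pos = bisect_right(tails, x)  # bisect_right lets equal values extend
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return len(tails)


if __name__ == "__main__":
    print(longest_non_decreasing([3, 1, 4, 2] * 3))
```

Using `bisect_right` rather than `bisect_left` is what makes the subsequence non-strict: an element equal to the current tail extends the run instead of replacing it.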

2015-10-05 18:07:05 1204

[Original] Codeforces Round #321 (Div. 2) D. Kefa and Dishes bitmask dp (state compression)

D. Kefa and Dishes. Time limit per test: 2 seconds. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. When Kefa came to the restaurant and sat

2015-10-01 21:52:36 468

[Original] Codeforces Round #320 (Div. 2) [Bayan Thanks-Round] D. "Or" Game greedy

D. "Or" Game. Time limit per test: 2 seconds. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. You are given n numbers a1, a2, ..., an. You

2015-09-18 10:22:09 813

[Original] Codeforces Round #320 (Div. 2) [Bayan Thanks-Round] C. A Problem about Polyline precision control

C. A Problem about Polyline. Time limit per test: 1 second. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. There is a polyline going throug

2015-09-18 10:08:48 623

[Original] hiho一下 Week 62, Problem 1: Browser Caching STL usage

Problem 1: Browser Caching. Time limit: 10000ms. Per-case time limit: 1000ms. Memory limit: 256MB. Description: When you browse the Internet, browser usually caches some documents to reduce the time cost of fetching them from

2015-09-06 01:14:36 670

[Original] hiho一下 Week 61, Problem 1: Combination Lock segment tree, range update

Time limit: 10000ms. Per-case time limit: 1000ms. Memory limit: 256MB. Description: Finally, you come to the interview room. You know that a Microsoft interviewer is in the room though the door is locked. There is a combinati

2015-09-04 09:47:59 721

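[Original] hihoCoder Challenge 14, Problem 2: 赛车 (Racing) tree properties

Problem 2: 赛车. Time limit: 20000ms. Per-case time limit: 1000ms. Memory limit: 256MB. Description: Gensokyo has a racetrack with N locations, connected by one-way roads. The roads give the racetrack the structure of an out-tree: they connect the N locations into a rooted tree in which every edge points from parent to child. Because Yuuka likes thrills, every time she goes to the racetrack she starts from the root and chooses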

2015-08-30 21:12:53 878

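[Original] hihoCoder Challenge 14, Problem 1: 不等式 (Inequalities)

Time limit: 10000ms. Per-case time limit: 1000ms. Memory limit: 256MB. Description: given n inequalities in X, find the maximum number that can hold. Each inequality has one of the following forms: X < C, X <= C, X = C, X > C, X >= C. Input: the first line contains an integer n. Each of the next n lines contains one inequality. Data range: 1... Output: one line with a single integer, the maximum number of inequalities that can hold simultaneously.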
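One simple way to approach the statement above is to test a small set of candidate values for X. The sketch below is an illustration, not the post's solution: it assumes every C is an integer (the data range is lost in the excerpt), so that C - 0.5, C, and C + 0.5 together cover every distinct region of the number line, and it takes the inequalities as already-parsed (operator, C) pairs.

```python
import operator

OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
}


def max_satisfied(inequalities):
    """inequalities: list of (op, c) pairs meaning `X op c`, with integer c."""
    candidates = set()
    for _, c in inequalities:
        # Assumption: integer constants, so half-steps fall strictly between them.
        candidates.update((c - 0.5, c, c + 0.5))
    return max(
        (sum(OPS[op](x, c) for op, c in inequalities) for x in candidates),
        default=0,
    )


if __name__ == "__main__":
    print(max_satisfied([("<", 5), (">", 3), ("=", 4)]))
```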

2015-08-30 21:09:14 1357

[Original] Codeforces Round #318 [RussianCodeCup Thanks-Round] (Div. 1) A. Bear and Poker gcd

A. Bear and Poker. Time limit per test: 2 seconds. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. Limak is an old brown bear. He often play

2015-08-30 10:05:06 923

[Original] Codeforces Round #318 [RussianCodeCup Thanks-Round] (Div. 1) B. Bear and Blocks dp

B. Bear and Blocks. Time limit per test: 1 second. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. Limak is a little bear who loves to play.

2015-08-30 10:01:35 838

[Original] hdu 5423 Rikka with Tree tree properties

Rikka with Tree. Time Limit: 2000/1000 MS (Java/Others). Memory Limit: 65536/65536 K (Java/Others). Total Submission(s): 165. Accepted Submission(s): 85. Problem Description: As we know, Rikka

2015-08-29 23:38:44 1368

[Original] hdu 5424 Rikka with Graph II Hamiltonian path

Rikka with Graph II. Time Limit: 2000/1000 MS (Java/Others). Memory Limit: 65536/65536 K (Java/Others). Total Submission(s): 367. Accepted Submission(s): 90. Problem Description: As we know, R

2015-08-29 23:34:27 1316

[Original] Codeforces Round #281 (Div. 2) E. Vasya and Polynomial math, thinking problem

E. Vasya and Polynomial. Time limit per test: 2 seconds. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. Vasya is studying in the last class

2015-08-29 15:55:26 956

[Original] Codeforces Round #317 [AimFund Thanks-Round] (Div. 1) C. CNF 2 cycle finding in an undirected graph

C. CNF 2. Time limit per test: 1 second. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. 'In Boolean logic, a formula is in conjunctive normal form (CNF) or clausal norma

2015-08-28 23:19:35 989

[Original] Codeforces Round #317 [AimFund Thanks-Round] (Div. 1) B. Minimization greedy, dp

B. Minimization. Time limit per test: 2 seconds. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. You've got array A, consisting of n integer

2015-08-26 21:19:35 753

[Original] Codeforces Round #317 [AimFund Thanks-Round] (Div. 1) A. Lengthening Sticks case analysis

A. Lengthening Sticks. Time limit per test: 1 second. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. You are given three sticks with positi

2015-08-26 17:52:32 746

[Original] topcoder SRM 666 DIV2 CollectingTokens tree dp

Problem Statement: Surya has a tree with n nodes, numbered 1 through n. Each node contains some arbitrary nonnegative number of tokens. Surya sometimes goes for a walk on the tree. He h

2015-08-26 17:11:15 1005

[Original] Codeforces Round #284 (Div. 2) D. Name That Tune probability dp

D. Name That Tune. Time limit per test: 1 second. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. It turns out that you are a great fan of r

2015-08-25 23:48:53 775

[Original] Round A APAC Test 2016 Problem D. gSnake snake game, STL usage

Problem D. gSnake. This contest is open for practice. You can try every problem as many times as you like, though we won't keep track of which problems you solve. Read the Quick-Start Guide to

2015-08-25 00:42:56 1296

[Original] Round A APAC Test 2016 Problem B. gCube

Problem B. gCube. This contest is open for practice. You can try every problem as many times as you like, though we won't keep track of which problems you solve. Read the Quick-Start Guide to

2015-08-24 19:43:21 1191

[Original] Round A APAC Test 2016 Problem A. Googol String

Problem A. Googol String. This contest is open for practice. You can try every problem as many times as you like, though we won't keep track of which problems you solve. Read the Quick-Start G

2015-08-24 19:41:29 1687 1

[Original] Round A APAC Test 2016 Problem C. gCampus shortest path

Problem C. gCampus. This contest is open for practice. You can try every problem as many times as you like, though we won't keep track of which problems you solve. Read the Quick-Start Guide t

2015-08-24 19:27:26 1336

[Original] hiho一下 Week 60, Problem 1: String Matching Content Length dp, longest common subsequence

Problem 1: String Matching Content Length. Time limit: 10000ms. Per-case time limit: 1000ms. Memory limit: 256MB. Description: We define the matching contents in the strings of strA and strB as common substrings of the two str

2015-08-23 08:05:28 954

[Original] Personal ACM Template

#pragma comment(linker, "/stack:20000000")
#define _CRT_SECURE_NO_WARNINGS
#include "stdio.h"
#ifndef DEBUG
#include <iostream>
#include <cmath>
#include <algorithm>
#include <cstdio>
#include <cstring

2015-08-22 22:17:40 729

[Original] hdu 5418 Victor and World bitmask dp, SPFA shortest path, Floyd shortest path

Victor and World. Time Limit: 4000/2000 MS (Java/Others). Memory Limit: 262144/131072 K (Java/Others). Total Submission(s): 132. Accepted Submission(s): 66. Problem Description: After trying h

2015-08-22 21:49:46 913

[Original] hdu 5419 Victor and Toys segment tree, range update

Victor and Toys. Time Limit: 2000/1000 MS (Java/Others). Memory Limit: 262144/131072 K (Java/Others). Total Submission(s): 156. Accepted Submission(s): 54. Problem Description: Victor has n

2015-08-22 21:40:23 1196

[Original] Good Bye 2014 D. New Year Santa Network tree dp

D. New Year Santa Network. Time limit per test: 2 seconds. Memory limit per test: 256 megabytes. Input: standard input. Output: standard output. New Year is coming in Tree World!

2015-08-22 16:19:40 573

Database Application Example

This software is called the Intelligence Tester and mainly applies database knowledge.

2013-09-08

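HDU OJ and POJ Problem Code with Detailed Explanations

AC code and detailed explanations for many classic HDU OJ and POJ problems. Every solution is accepted, with absolutely no incorrect code.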

2013-09-08
