ARTS, Week 2

 

 


From 2019/6/24 To 2019/6/30  

Type | Completed Date | Contents | Finish
Algorithm | 2019-06-29 | Longest Substring Without Repeating Characters | [x]
Review | 2019-06-30 | CS:APP: A Tour of Computer Systems | [x]
Tips | 2019-07-01 | Get Depth, PathIndex, NumericalMapping of the Tree Table (Id, ParentId) | [x]
Share | 2019-07-02 | Review Computer Composition: Mapping Between Memory and Cache | [x]
Summary | | |
Improvement | | |

 

 


Algorithm:

https://leetcode.com

3. Longest Substring Without Repeating Characters

Medium

 

 

 

 

 

 

Given a string, find the length of the longest substring without repeating characters.

Example 1:

Input: "abcabcbb"
Output: 3

Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: "bbbbb"
Output: 1

Explanation: The answer is "b", with the length of 1.

Example 3:

Input: "pwwkew"
Output: 3

Explanation: The answer is "wke", with the length of 3.

Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

 

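C# code

// requires: using System;
/// <summary>
/// Returns the length of the longest substring of s that has no repeated character.
/// </summary>
public static int LengthOfLongestSubstring(string s)
{
    // newstr is the current window: the longest repeat-free substring ending at the current character.
    string newstr = string.Empty;
    int maxlen = 0;
    foreach (char item in s)
    {
        int index = newstr.IndexOf(item);
        if (index >= 0)
        {
            // Drop everything up to and including the earlier occurrence of item.
            // Substring(index + 1) returns an empty string when index is the last position.
            newstr = newstr.Substring(index + 1);
        }
        newstr += item;
        maxlen = Math.Max(maxlen, newstr.Length);
    }
    return maxlen;
}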

 

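I misread the problem at first, because the examples listed under the statement are misleading. The statement says: Given a string, find the length of the longest substring without repeating characters. I took that to mean "return the longest run of consecutive characters in the string", when all that is required is the length of the longest substring that contains no repeated character. Also, my C# submission ran faster than 60% of the other submissions, which probably just means that few people use C#. Or perhaps LeetCode runs C# on Mono? That would be a legacy of Microsoft's own history: .NET was not cross-platform, so Mono would be the obvious choice. The submission took 92 ms and 24 MB, so I measured time and memory locally in Visual Studio: only 4 ms and 10 MB.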

 

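The first version creates a new string on every Substring call and every += (strings in C# are immutable), so it allocates far too often. A better approach keeps two indexes, the left and right ends of the current window, to compute the length, and uses a hash table to record, for each character, the position just past its most recent occurrence. (LeetCode calls this technique Sliding Window, which reminds me of the sliding-window protocol in TCP/IP, the generalization of stop-and-wait. The better you learn the fundamentals, the more easily everything else falls into place.)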

 

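// requires: using System; using System.Collections;
public static int LengthOfLongestSubstring4(string s)
{
    int n = s.Length, ans = 0;
    // Maps each character to (index of its most recent occurrence + 1).
    // The non-generic Hashtable boxes the char keys and int values;
    // Dictionary<char, int> would avoid that.
    Hashtable map = new Hashtable();
    // The window is s[i..j]: i is the left edge, j the right edge.
    for (int j = 0, i = 0; j < n; j++)
    {
        if (map.Contains(s[j]))
            // Move the left edge past the previous occurrence, never backwards.
            i = Math.Max((int)map[s[j]], i);

        ans = Math.Max(ans, j - i + 1);
        map[s[j]] = j + 1;
    }
    return ans;
}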

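The solutions page also shows a final optimization. The hash table still costs time and memory, so it can be replaced with something simpler: a plain array indexed by character code. The catch is that the size of the character set must be known in advance (128 below, which covers ASCII only).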

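// requires: using System;
public static int LengthOfLongestSubstring(string s)
{
    int n = s.Length, ans = 0;
    // arrays[c] = (index of the most recent occurrence of c) + 1; 0 means "not seen yet".
    // 128 slots cover ASCII only: a character above 127 throws IndexOutOfRangeException.
    int[] arrays = new int[128];
    for (int j = 0, i = 0; j < n; j++)
    {
        // Move the left edge past the previous occurrence of s[j], never backwards.
        i = Math.Max(arrays[s[j]], i);
        ans = Math.Max(ans, j - i + 1);
        arrays[s[j]] = j + 1;
    }
    return ans;
}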
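The sliding-window idea described above can also be sketched in a few lines of Python. This is only an illustration, not one of the submitted solutions: the function name length_of_longest_substring and the dictionary last_seen are hypothetical names chosen here, and a dictionary stands in for the hash table or array.

```python
def length_of_longest_substring(s: str) -> int:
    """Sliding window: [left, right] is always free of repeated characters."""
    last_seen = {}  # character -> index just past its most recent occurrence
    left = 0        # left edge of the current window
    best = 0
    for right, ch in enumerate(s):
        # If ch was seen inside the window, jump the left edge past it.
        left = max(left, last_seen.get(ch, 0))
        best = max(best, right - left + 1)
        last_seen[ch] = right + 1
    return best


for text in ("abcabcbb", "bbbbb", "pwwkew"):
    print(text, length_of_longest_substring(text))
```

Taking the maximum with the current left edge is what keeps the window from moving backwards when a character's earlier occurrence lies outside the window, as in "abba".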

 

Review:

Chapter 1 A Tour of Computer Systems

1.1 Information Is Bits + Context
1.2 Programs Are Translated by Other Programs into Different Forms
1.3 It Pays to Understand How Compilation Systems Work
1.4 Processors Read and Interpret Instructions Stored in Memory
1.5 Caches Matter
1.6 Storage Devices Form a Hierarchy
1.7 The Operating System Manages the Hardware
1.8 Systems Communicate with Other Systems Using Networks
1.9 Important Themes
1.10 Summary

Bibliographic Notes
Solutions to Practice Problems

 

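Chapter 1 A Tour of Computer Systems (my translation)

1.1 Information is bits in context
1.2 Programs are translated by other programs into different formats
1.3 It is worth understanding how compilation systems work
1.4 Processors read and interpret instructions stored in memory
1.5 Caches matter
1.6 Storage devices form a hierarchy
1.7 The operating system manages the hardware
1.8 Systems communicate with each other over networks
1.9 Important themes
1.10 Summary

Bibliographic notes
Solutions to the practice problems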

 

 

A computer system consists of hardware and systems software that work together to run application programs. Specific implementations of systems change over time, but the underlying concepts do not. All computer systems have similar hardware and software components that perform similar functions. This book is written for programmers who want to get better at their craft by understanding how these components work and how they affect the correctness and performance of their programs.

You are poised for an exciting journey. If you dedicate yourself to learning the concepts in this book, then you will be on your way to becoming a rare "power programmer," enlightened by an understanding of the underlying computer system and its impact on your application programs.

 

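My translation: A computer system is made up of hardware and systems software that together run application programs. The concrete implementations of systems change over time, but the underlying concepts do not. All computer systems have similar hardware and software components that perform similar functions. This book is for programmers who want to understand more clearly how these components work and how they affect the correctness and performance of their programs. You are ready for an exciting journey. If you commit to learning every concept in this book, you will be on the way to becoming one of the rare "power programmers", with an understanding of the underlying computer system and of how it affects your application programs.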

 

You are going to learn practical skills such as how to avoid strange numerical errors caused by the way that computers represent numbers. You will learn how to optimize your C code by using clever tricks that exploit the designs of modern processors and memory systems. You will learn how the compiler implements procedure calls and how to use this knowledge to avoid the security holes from buffer overflow vulnerabilities that plague network and Internet software. You will learn how to recognize and avoid the nasty errors during linking that confound the average programmer. You will learn how to write your own Unix shell, your own dynamic storage allocation package, and even your own Web server. You will learn the promises and pitfalls of concurrency, a topic of increasing importance as multiple processor cores are integrated onto single chips.

 

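My translation: You will learn practical skills, such as how to avoid strange numerical errors caused by the way computers represent numbers. By learning how modern processors and memory systems are designed, you will be able to optimize your C programs with tricks that exploit them. You will learn how the compiler implements procedure calls, and how to use that knowledge to avoid the security holes from buffer overflow vulnerabilities that plague network and Internet software. You will learn to recognize and avoid the nasty linking errors that trouble most programmers. You will learn how to write your own Unix shell, your own dynamic storage allocation package, and even your own Web server. You will learn the promises and pitfalls of concurrency, which matters more and more as multiple processor cores are integrated onto a single chip.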

 

In their classic text on the C programming language [61], Kernighan and Ritchie introduce readers to C using the hello program shown in Figure 1.1. Although hello is a very simple program, every major part of the system must work in concert in order for it to run to completion. In a sense, the goal of this book is to help you understand what happens and why when you run hello on your system.

We begin our study of systems by tracing the lifetime of the hello program, from the time it is created by a programmer, until it runs on a system, prints its simple message, and terminates. As we follow the lifetime of the program, we will briefly introduce the key concepts, terminology, and components that come into play. Later chapters will expand on these ideas.

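My translation: In their classic C textbook, Kernighan and Ritchie introduce readers to C with the hello program. Although the program is very simple, every major part of the computer system must work together for it to run to completion. In a sense, the goal of this book is to help you understand what happens, and why, when you run this program. We study the computer system by tracing the lifetime of the program, from the moment a programmer writes it, through running on a system and printing "hello, world", until it terminates. Following that lifetime, we briefly introduce the key concepts, terminology, and components involved. Later chapters expand on these ideas.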

 

 

[Figure 1.1: The hello program (code/intro/hello.c). Listing not preserved.]

[Figure 1.2: The ASCII text representation of hello.c. Figure not preserved.]

 

1.1 Information Is Bits + Context

Our hello program begins life as a source program (or source file) that the programmer creates with an editor and saves in a text file called hello.c. The source program is a sequence of bits, each with a value of 0 or 1, organized in 8-bit chunks called bytes. Each byte represents some text character in the program.

Most computer systems represent text characters using the ASCII standard that represents each character with a unique byte-size integer value (see note 1). For example, Figure 1.2 shows the ASCII representation of the hello.c program.

1. Other encoding methods are used to represent text in non-English languages. See the aside on page 50 for a discussion on this.

 

The hello.c program is stored in a file as a sequence of bytes. Each byte has an integer value that corresponds to some character. For example, the first byte has the integer value 35, which corresponds to the character '#'. The second byte has the integer value 105, which corresponds to the character 'i', and so on. Notice that each text line is terminated by the invisible newline character '\n', which is represented by the integer value 10. Files such as hello.c that consist exclusively of ASCII characters are known as text files. All other files are known as binary files.
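The byte values quoted above can be checked with a short Python sketch. The listing of hello.c is not preserved on this page, so the HELLO_C string below is an assumption: the classic K&R-style program that prints "hello, world". Only its first two characters and its newline terminators matter for the check, and the helper name ascii_bytes is hypothetical.

```python
# Assumed contents of hello.c (the original listing is missing from this page).
HELLO_C = (
    '#include <stdio.h>\n'
    '\n'
    'int main()\n'
    '{\n'
    '    printf("hello, world\\n");\n'
    '    return 0;\n'
    '}\n'
)


def ascii_bytes(text):
    """Return the integer value of each byte in the ASCII encoding of text."""
    return list(text.encode("ascii"))


values = ascii_bytes(HELLO_C)
print(values[0], values[1])  # 35 105, the bytes for '#' and 'i'
print(values[18])            # 10, the newline that ends the first line
```

Encoding with "ascii" raises an error on any character outside ASCII, which matches the definition of a text file given above.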

 

The representation of hello.c illustrates a fundamental idea: All information in a system, including disk files, programs stored in memory, user data stored in memory, and data transferred across a network, is represented as a bunch of bits. The only thing that distinguishes different data objects is the context in which we view them. For example, in different contexts, the same sequence of bytes might represent an integer, floating-point number, character string, or machine instruction.
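To make "bits + context" concrete, here is a minimal Python sketch that reads the same four bytes in three different contexts. The byte values are an arbitrary choice made for this illustration, and the little-endian layout is an assumption about how the bytes are ordered.

```python
import struct

raw = bytes([0x00, 0x00, 0x80, 0x3F])  # four bytes chosen for this illustration

as_int = struct.unpack("<i", raw)[0]    # context: little-endian 32-bit signed integer
as_float = struct.unpack("<f", raw)[0]  # context: little-endian IEEE 754 single precision
as_bytes = list(raw)                    # context: four separate unsigned byte values

print(as_int)    # 1065353216
print(as_float)  # 1.0
print(as_bytes)  # [0, 0, 128, 63]
```

Nothing in the bytes themselves says which reading is the right one; the format string passed to struct.unpack plays the role of the context.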

 

Aside: Origins of the C programming language

C was developed from 1969 to 1973 by Dennis Ritchie of Bell Laboratories. The American National Standards Institute (ANSI) ratified the ANSI C standard in 1989, and this standardization later became the responsibility of the International Standards Organization (ISO). The standard defines the C language and a set of library functions known as the C standard library. Kernighan and Ritchie describe ANSI C in their classic book, which is known affectionately as "K&R" [61]. In Ritchie's words [92], C is "quirky, flawed, and an enormous success." So why the success?

  • C was closely tied with the Unix operating system. C was developed from the beginning as the system programming language for Unix. Most of the Unix kernel (the core part of the operating system), and all of its supporting tools and libraries, were written in C. As Unix became popular in universities in the late 1970s and early 1980s, many people were exposed to C and found that they liked it. Since Unix was written almost entirely in C, it could be easily ported to new machines, which created an even wider audience for both C and Unix.

  • C is a small, simple language. The design was controlled by a single person, rather than a committee, and the result was a clean, consistent design with little baggage. The K&R book describes the complete language and standard library, with numerous examples and exercises, in only 261 pages. The simplicity of C made it relatively easy to learn and to port to different computers.

  • C was designed for a practical purpose. C was designed to implement the Unix operating system. Later, other people found that they could write the programs they wanted, without the language getting in the way.

C is the language of choice for system-level programming, and there is a huge installed base of application-level programs as well. However, it is not perfect for all programmers and all situations. C pointers are a common source of confusion and programming errors. C also lacks explicit support for useful abstractions such as classes, objects, and exceptions. Newer languages such as C++ and Java address these issues for application-level programs.

As programmers, we need to understand machine representations of numbers because they are not the same as integers and real numbers. They are finite approximations that can behave in unexpected ways. This fundamental idea is explored in detail in Chapter 2.
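A few lines of Python are enough to see machine numbers behaving as finite approximations. This is a sketch for illustration only: Python integers are unbounded, so the hypothetical helper to_int32 simulates a 32-bit two's-complement int by hand, and the floating-point lines rely on Python floats being IEEE 754 doubles.

```python
def to_int32(value):
    """Wrap an arbitrary integer to a signed 32-bit two's-complement value."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


# Integer arithmetic: the true product 12,000,000,000 does not fit in 32 bits.
product = to_int32(200 * 300 * 400 * 500)
print(product)  # -884901888

# Floating-point arithmetic: 0.1 and 0.2 have no exact binary representation.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)  # False

# Floating-point addition is not associative.
left_first = (3.14 + 1e20) - 1e20
right_first = 3.14 + (1e20 - 1e20)
print(left_first, right_first)  # 0.0 3.14
```

The integer result is wrong but predictable (it wraps modulo 2^32), while the floating-point results depend on the order of operations.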

 

Tips:

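Indexes and numbering for a tree-structured table

/*
   Create a tree-structured table keyed by id, with a parent pointer (parentid).
   Derived columns:
     depth             level of the node (root = 0)
     pathindex         position of the node among its siblings, ordered by id
     numericalmapping  outline-style number built from the path, e.g. 2.1.3
*/
CREATE TABLE tree(
       id int NOT NULL,
       parentid int NULL,
       name nvarchar(300) NOT NULL,
       depth int NULL,
       pathindex int NULL,
       numericalmapping nvarchar(300) NULL
);

-- Set the depth of the root node --
UPDATE tree SET depth = 0
WHERE parentid IS NULL;

-- Set the depth of every other node, one level per pass --
-- (assumes every non-root node can reach a root; an orphaned row would keep the loop running)
WHILE EXISTS (SELECT * FROM tree WHERE depth IS NULL)
    UPDATE T SET T.depth = P.depth + 1
    FROM tree AS T INNER JOIN tree AS P
        ON (T.parentid = P.id)
    WHERE P.depth >= 0
      AND T.depth IS NULL;

-- Set the index and number of the root node --
UPDATE tree SET pathindex = 0, numericalmapping = '0.0'
WHERE parentid IS NULL;

-- Set the index of every other node: its rank among its siblings --
WITH x AS
(
    SELECT id, rank() OVER (PARTITION BY parentid ORDER BY id) AS pathindex
    FROM tree
    WHERE parentid IS NOT NULL
)
UPDATE tree
SET pathindex = x.pathindex
FROM x
WHERE tree.id = x.id;

-- Set the number of the first-level nodes (depth = 1) --
UPDATE tree
SET numericalmapping = CAST(pathindex AS nvarchar(300))
WHERE depth = 1;

-- Set the number of the remaining nodes: parent's number + '.' + own index --
WHILE EXISTS (SELECT * FROM tree WHERE numericalmapping IS NULL)
    UPDATE T SET T.numericalmapping = CAST(P.numericalmapping AS nvarchar(300)) + '.' +
                                      CAST(T.pathindex AS nvarchar(300))
    FROM tree AS T INNER JOIN tree AS P
        ON (T.parentid = P.id)
    WHERE P.numericalmapping IS NOT NULL
      AND T.numericalmapping IS NULL;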
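For comparison, the same three columns can be computed in memory with one breadth-first pass. The sketch below is in Python rather than T-SQL and is only an illustration under assumptions: the function name annotate_tree and the sample rows are hypothetical, and it follows the conventions of the script above (the root gets depth 0, pathindex 0 and '0.0'; first-level nodes are numbered by their sibling rank alone; ids are unique).

```python
from collections import defaultdict, deque


def annotate_tree(rows):
    """rows: iterable of (id, parentid) pairs; parentid is None for a root.

    Returns {id: (depth, pathindex, numericalmapping)}.
    """
    parent_of = dict(rows)
    children = defaultdict(list)
    for node_id, parent_id in parent_of.items():
        if parent_id is not None:
            children[parent_id].append(node_id)

    result = {}
    queue = deque()
    for root in sorted(n for n, p in parent_of.items() if p is None):
        result[root] = (0, 0, "0.0")
        queue.append(root)

    while queue:
        node = queue.popleft()
        depth, _, mapping = result[node]
        # pathindex is the 1-based rank of a child among its siblings, by id.
        for rank, child in enumerate(sorted(children[node]), start=1):
            number = str(rank) if depth == 0 else mapping + "." + str(rank)
            result[child] = (depth + 1, rank, number)
            queue.append(child)
    return result


sample = [(1, None), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3)]
for node_id, info in sorted(annotate_tree(sample).items()):
    print(node_id, info)
```

Unlike the WHILE loops in the SQL script, this pass simply leaves an orphaned row out of the result instead of looping forever.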

 

Share:

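Revisiting Computer Organization: How Main Memory Maps to the Cache

       While instructions execute, main-memory accesses are slow enough that the CPU has to wait. To raise processor utilization, a cache (a small high-speed buffer memory) is placed between main memory and the CPU: frequently used constants, data, and instructions are kept in the cache, which reduces memory access time, shortens instruction execution time, and so improves the performance of the computer. A cache is usually built from fast SRAM, which costs more than main memory, but because the cache is far smaller than main memory it resolves the tension between speed and cost.

Translating a main-memory address into a cache address is called address mapping. There are several mapping schemes: direct mapping (a fixed mapping), fully associative mapping (a highly flexible mapping), and set-associative mapping (a compromise between the two). The running example below uses an 8K-word cache, a 1M-word main memory, a block size of 512 words, and the target memory address 0240CH.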

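1. Direct mapping

In direct mapping, as the name suggests, each main-memory block maps directly to one fixed cache line. Main memory has far more blocks than the cache has lines, so the two are matched with the following rule.

Cache line i = main-memory block j % number of cache lines n, so a main-memory address = tag + cache line number + offset within the block.

With an 8K-word cache, a 1M-word main memory, and 512-word blocks, the cache has 16 lines (8K = 2^13 words = 2^4 lines * 512 words/line), so the line number field (Line Number) of the address needs 4 bits.

1M = 2^20 words = 2^11 blocks * 512 words/block = 2^7 block groups * 2^4 blocks/group * 512 words/block, so the tag field (Tag) needs 7 bits, and the remaining offset field (Line Offset) needs 9 bits (20 = 7 + 4 + 9).

Memory address 0240CH in binary is 0000 0010 0100 0000 1100:

Tag (7) | Line Number (4) | Line Offset (9)
0000001 | 0010 | 000001100

In other words, 0240CH is in cache line 2 and in main-memory block 18 (line 2 of block group 1).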
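The field split worked out above can be reproduced with a few bit operations. The sketch below is an illustration only: the function name split_direct_mapped and its parameters are hypothetical, sizes are counted in words as in the example, and every size is assumed to be a power of two.

```python
def split_direct_mapped(address, memory_words=1 << 20, cache_words=8 << 10, block_words=512):
    """Split a word address into (tag, line, offset, block) for a direct-mapped cache."""
    num_lines = cache_words // block_words          # 16 lines
    offset_bits = block_words.bit_length() - 1      # 9
    line_bits = num_lines.bit_length() - 1          # 4
    address_bits = memory_words.bit_length() - 1    # 20
    tag_bits = address_bits - line_bits - offset_bits  # 7

    offset = address & (block_words - 1)  # low 9 bits
    block = address >> offset_bits        # main-memory block number
    line = block % num_lines              # cache line i = block j % n
    tag = block >> line_bits              # block group number
    return {"tag": tag, "line": line, "offset": offset, "block": block,
            "bits": (tag_bits, line_bits, offset_bits)}


fields = split_direct_mapped(0x0240C)
print(fields["block"], fields["tag"], fields["line"], fields["offset"])  # 18 1 2 12
print(format(fields["tag"], "07b"), format(fields["line"], "04b"),
      format(fields["offset"], "09b"))  # 0000001 0010 000001100
```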

 

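Let us look at the strengths and weaknesses of direct mapping.

Strength: it is simple to implement; a single modulo operation gives the mapping.

Weakness: if the addresses that the CPU or external devices access all fall in the same congruence class (the same remainder), the cache hit rate drops. In the example above, main-memory blocks 0, 16, and 32 (the cache has 16 lines) are congruent and all map to cache line 0, so when the CPU alternates between them, line 0 has a low hit rate and its contents have to be replaced again and again.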
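The conflict described above is easy to reproduce with a toy simulation. This is a minimal sketch under assumptions: the function name simulate_direct_mapped is hypothetical, the trace lists block numbers rather than addresses, and the cache starts empty.

```python
def simulate_direct_mapped(block_trace, num_lines=16):
    """Count (hits, misses) for a sequence of main-memory block numbers."""
    lines = [None] * num_lines  # which block each cache line currently holds
    hits = 0
    for block in block_trace:
        index = block % num_lines
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block  # miss: replace whatever the line held
    return hits, len(block_trace) - hits


conflicting = [0, 16, 32] * 4  # blocks 0, 16, 32 all map to line 0
spread = [0, 1, 2] * 4         # blocks 0, 1, 2 map to lines 0, 1, 2

print(simulate_direct_mapped(conflicting))  # (0, 12)
print(simulate_direct_mapped(spread))       # (9, 3)
```

Both traces touch only three distinct blocks and the cache has 16 lines, yet the congruent trace never hits, because the three blocks keep evicting each other from line 0.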

 

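2. Fully associative mapping

Fully associative mapping addresses the congruence problem of direct mapping by assigning cache lines flexibly: a main-memory block can be loaded into any cache line.

The main-memory address then needs only two fields to find a match: tag + offset within the block.

With an 8K-word cache, a 1M-word main memory, and 512-word blocks, 1M = 2^20 words = 2^11 blocks * 512 words/block, so the tag field (Tag) needs 11 bits and the remaining offset field (Line Offset) needs 9 bits (20 = 11 + 9).

Memory address 0240CH in binary is 0000 0010 0100 0000 1100:

Tag (11) | Line Offset (9)
00000010010 | 000001100

Let us look at the strengths and weaknesses of fully associative mapping.

Strength: any cache line can hold any main-memory block, which is very flexible.

Weakness: to find out whether an address is in the cache, every access has to compare the tag against every line in the cache, which is expensive (in time, or in comparator hardware when the comparison is done in parallel) and more complex to implement.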

 

 

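3. Set-associative mapping

Because the strengths and weaknesses of direct mapping and fully associative mapping are exact opposites, a third scheme combines the two: set-associative mapping.

The main idea is to divide the cache lines into 2^q sets of equal size, each with 2^s lines. Blocks map to sets by direct mapping, and within a set a block can go into any line, as in fully associative mapping.

The main-memory address then consists of tag + cache set number + offset within the block, where cache set number = main-memory block number % number of cache sets.

With an 8K-word cache, a 1M-word main memory, and 512-word blocks, if s = 1 (2-way set-associative), then

the cache has 8 sets (8K = 2^13 words = 2^3 sets * 2^1 lines/set * 512 words/line), so the set number field (labelled Line Number below) needs 3 bits.

1M = 2^20 words = 2^11 blocks * 512 words/block = 2^8 set groups * 2^3 blocks/group * 512 words/block, so the tag field (Tag) needs 8 bits, and the remaining offset field (Line Offset) needs 9 bits (20 = 8 + 3 + 9).

Memory address 0240CH in binary is 0000 0010 0100 0000 1100:

Tag (8) | Line Number (3) | Line Offset (9)
00000010 | 010 | 000001100

In other words, 0240CH is in cache set 2 and in main-memory block 18 (set group 2).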

 

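Set-associative mapping combines the strengths of direct mapping and fully associative mapping. When q = 0 (there is only one set), it is fully associative mapping.

When s = 0 (each set has only one line), it is direct mapping.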
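The two limiting cases can be checked with one general address splitter, where the only thing that changes between the three schemes is the number of set bits q. The sketch is an illustration under assumptions: the function name split_set_associative and its parameters are hypothetical, and the defaults (20 address bits, 9 offset bits) come from the running example with its 16-line cache, so q + s = 4.

```python
def split_set_associative(address, set_bits, offset_bits=9, address_bits=20):
    """Split a word address into (tag_bits, tag, set_index, offset) for 2^set_bits sets."""
    tag_bits = address_bits - set_bits - offset_bits
    offset = address & ((1 << offset_bits) - 1)
    block = address >> offset_bits               # main-memory block number
    set_index = block & ((1 << set_bits) - 1)    # block number % number of sets
    tag = block >> set_bits
    return tag_bits, tag, set_index, offset


address = 0x0240C
print(split_set_associative(address, set_bits=4))  # q=4, s=0: direct mapping   -> (7, 1, 2, 12)
print(split_set_associative(address, set_bits=3))  # q=3, s=1: 2-way            -> (8, 2, 2, 12)
print(split_set_associative(address, set_bits=0))  # q=0, s=4: fully associative -> (11, 18, 0, 12)
```

The three results match the Tag, Line Number, and Line Offset values in the three worked examples above; with q = 0 the set index is always 0 and the tag is the whole block number.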

 

 

 

 

 

 

 
