Lab2 Keyword extract

Contents

Problem

PSP Table

Basic Idea

Design & Implement

Choice

Level 3 & 4

Enum

Skill

Key Point

Diagram(Flow Chart)

Code description

Github source code

Description

UML

Unit Test

Testing Version

Optimization Version

Complexity Test

Summary


The Link Your Class: https://bbs.csdn.net/forums/MUEE308FZ?category=0
The Link of Requirement of This Assignment: https://bbs.csdn.net/topics/600798588
The Aim of This Assignment:

1. Implement a program that extracts keywords at different levels from the C or C++ code files it reads in.

2. Write a blog to record your work content and process.

MU STU ID and FZU STU ID: 19105690 & 831902209

Problem

Implement a program that extracts keywords at different levels from the C or C++ code files it reads in.

1. Basic requirement: output "keyword" statistics

2. Advanced requirement: output the number of "switch case" structures, and output the number of "case" corresponding to each group

3. Uplifting requirement: output the number of "if else" structures

4. Ultimate requirement: output the number of "if, else if, else" structures

Before completing the more difficult requirements, you need to complete the lower ones.

PSP Table

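| Stages | Estimate (min) | Real (min) |
| --- | --- | --- |
| Planning | 30 | 31 |
| Estimate | 900 | 1052 |
| Development | 60 | 40 |
| Analysis | 5 | 11 |
| Design Spec | 15 | 45 |
| Design Review | 15 | 10 |
| Coding Standard | 5 | 2 |
| Design | 180 | 191 |
| Coding | 180 | 392 |
| Code Review | 20 | 40 |
| Test | 180 | 189 |
| Reporting | 18 | 61 |
| Test Report | 2 | 20 |
| Size Measurement | 5 | 15 |
| Postmortem & Process Improvement Plan | 20 | 5 |
| Total | 900 | 1052 |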

Basic Idea

  1. First of all, we need to care about keywords that appear inside annotations (comments) and inside code text (strings, variable names); these situations have to be detected and excluded.
  2. After searching the Internet, I decided to use the re module in Python for the basic matching.
  3. After completing level 1, the higher levels can be added easily if the logic is designed clearly.
  4. For the worst case, code without annotations also has to be handled (for example, a structure written on one line).
  5. Processing the file line by line is the easy way to think about it, but it is not good for complexity.
  6. For multi-line strings, I searched the Internet in order to deal with the first two situations. Reference: "Writing long C++ strings across multiple lines", GetRekt's blog on CSDN.
  7. For the variable-name situation in item 1, delete all variable names, but not numbers.
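Items 1 and 6 above can be sketched as a pre-processing step. The snippet below is a minimal sketch, not the code from the repository: the names NON_CODE and strip_comments_and_strings and the pattern itself are assumptions made for illustration. It blanks out comments, string literals (including ones continued over several lines with a backslash) and character literals, so that later keyword matching only sees real code.

```python
import re

# Hypothetical pattern: the alternatives are tried at each position, so
# whichever construct starts first (comment or literal) wins.
NON_CODE = re.compile(
    r'//[^\n]*'              # line comment up to the end of the line
    r'|/\*.*?\*/'            # block comment, possibly spanning lines
    r'|"(?:\\.|[^"\\])*"'    # string literal, honouring escapes such as \"
    r"|'(?:\\.|[^'\\])*'",   # character literal
    re.DOTALL,               # let '.' cross newlines (block comments, "\" continuations)
)


def strip_comments_and_strings(source):
    """Replace every comment and literal with a single space."""
    return NON_CODE.sub(" ", source)


if __name__ == "__main__":
    sample = 'int x = 0; // if else\nchar *s = "switch case"; /* while */ return x;'
    print(strip_comments_and_strings(sample))
```

Replacing with a space instead of an empty string keeps neighbouring tokens from being glued together.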

I used XMind to draw a mind map to help me think.

Design & Implement

Choice

I read the file and process it line by line. This is not the best way to do it, but it is the simplest to think about.

're' is the regular expression module in Python; I use it for the basic matching.
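A line-by-line keyword count with the re module could look roughly like this. It is a sketch under assumptions, not the repository's implementation: the names KEYWORDS, WORD and count_keywords are hypothetical, the list is the 32 keywords of C89 in alphabetical order, and the input lines are assumed to be already free of comments and strings.

```python
import re
from collections import Counter

# The 32 keywords of C89, in alphabetical order (assumed keyword list).
KEYWORDS = (
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof",
    "static", "struct", "switch", "typedef", "union", "unsigned", "void",
    "volatile", "while",
)
KEYWORD_SET = frozenset(KEYWORDS)

# An identifier-like word: a letter or underscore followed by word characters.
WORD = re.compile(r"[A-Za-z_]\w*")


def count_keywords(lines):
    """Count C keywords in an iterable of source lines."""
    counts = Counter()
    for line in lines:
        for word in WORD.findall(line):
            if word in KEYWORD_SET:
                counts[word] += 1
    return counts


if __name__ == "__main__":
    demo = ["int main() {", "    return 0;", "}"]
    total = count_keywords(demo)
    print("total num:", sum(total.values()))
```

Because whole words are compared, a variable such as `format` is not counted as `for`.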

Level 3 & 4

The first idea is the stack method.

Because the structures can be nested, the general thought is that a stack fits this problem, since we need to deal with a first-in, last-out (FILO) situation. We can design two stacks: one for brackets and one for state.
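The stack idea can be sketched as follows. This is a minimal sketch under stated assumptions rather than the repository's code: it keeps a single stack holding one state per brace depth instead of the two stacks mentioned above, it assumes comments and strings were removed beforehand, and it assumes every if / else if / else body is wrapped in braces. The names TOKEN and count_if_chains are hypothetical.

```python
import re

# "else if" must be tried before "if" and "else" so that it wins the match.
TOKEN = re.compile(r"\belse\s+if\b|\bif\b|\belse\b|[{}]")


def count_if_chains(code):
    """Return (number of if-else, number of if-else if-else) structures."""
    if_else = 0
    if_elif_else = 0
    states = [None]  # one pending-chain state per brace depth
    for tok in TOKEN.findall(code):
        if tok == "{":
            states.append(None)          # entering a nested block
        elif tok == "}":
            if len(states) > 1:
                states.pop()             # leaving the nested block
        elif tok == "if":
            states[-1] = "if"            # a new chain starts at this depth
        elif tok == "else":
            if states[-1] == "if":
                if_else += 1
            elif states[-1] == "elif":
                if_elif_else += 1
            states[-1] = None            # the chain is closed
        else:                            # an "else if" token
            if states[-1] is not None:
                states[-1] = "elif"
    return if_else, if_elif_else
```

The state of an outer chain survives while its bodies are processed, because each body pushes and pops its own entry.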

Enum

if = 1, else = 2, else if = 3

{ = True, } = False

Then, on second thought:

'}' just finishes the statement.

So 0 is used to represent '}'.

In well-formatted code, if / else / else if can be combined with '{',

so a single string is enough to do this work!

Skill

Use bit operations, and pop when a match is found.

Use string operations (find and replace) instead of a stack.

Key Point

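  • An 'if' can be eliminated when the next 'if' or '}' is met.
  • An if-else structure is counted when the 'else' is met, i.e. the pattern 10 20 matches.
  • An if, else if, else structure matches the pattern 10 30*n 20 (30 repeated n times).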
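The key points above can be illustrated with a find-and-replace loop over an encoded string. This is only a sketch of the idea, not the code in the repository: if, else, else if and '}' are encoded as 1, 2, 3 and 0 as described, while the extra code 4 for a '{' that does not belong to an if / else block is an assumption added here so that loops and function bodies can be reduced too. It also assumes comments and strings are already removed and that every if / else if / else body uses braces. The names TOKEN, encode and count_structures are hypothetical.

```python
import re

TOKEN = re.compile(r"\belse\s+if\b|\bif\b|\belse\b|[{}]")
CODES = {"if": "1", "else": "2"}  # any other keyword token is "else if" -> 3


def encode(code):
    """Encode the source as a string of digits (4 is an added assumption)."""
    out = []
    owned = False  # True while the next '{' belongs to an if / else if / else
    for tok in TOKEN.findall(code):
        if tok == "{":
            if not owned:
                out.append("4")      # brace of a loop, function, switch, ...
            owned = False
        elif tok == "}":
            out.append("0")
        else:
            out.append(CODES.get(tok, "3"))
            owned = True
    return "".join(out)


def count_structures(code):
    """Return (number of if-else, number of if-else if-else) structures."""
    s = encode(code)
    if_else = if_elif_else = 0
    while True:
        before = s
        s, n = re.subn(r"10(?:30)+20", "", s)   # 10 30*n 20
        if_elif_else += n
        s, n = re.subn(r"1020", "", s)          # 10 20
        if_else += n
        s = s.replace("40", "")                 # an already reduced plain block
        # an if (or if / else if) without else, followed by if, '}', '{' or the end
        s = re.sub(r"10(?:30)*(?=[014]|$)", "", s)
        if s == before:
            break
    return if_else, if_elif_else
```

Innermost structures disappear first, so an outer structure turns into a plain 10 20 or 10 30*n 20 pattern on a later pass.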

Diagram(Flow Chart)

 

Code description

Github source code

Lazer2077/C-Keyword-Extract (github.com): https://github.com/Lazer2077/C-Keyword-Extract

Description

The code can be divided into four parts, as the diagram shows.

UML

Too hard to do for now, but this section is kept as a placeholder 🤪

Unit Test

Testing Version

Beta 0.2

initial version after 20 min of coding; debug input built

Beta 0.3

bug fixed: strings and multi-line strings are now ignored

Beta 0.4

bug fixed: 'do' was matched again inside 'double'

key_word[7]-=key_word[8]
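The effect of this fix can be shown on a single line. The sketch below uses hypothetical helper names and is not taken from the repository: it compares the subtraction used above with a word-boundary pattern, which avoids the overlap in the first place.

```python
import re


def count_do_by_subtraction(line):
    """Substring counts, corrected the way key_word[7] -= key_word[8] does."""
    do_count = line.count("do")          # also counts the "do" inside "double"
    double_count = line.count("double")
    return do_count - double_count


def count_do_by_boundary(line):
    """Count 'do' only when it stands alone as a whole word."""
    return len(re.findall(r"\bdo\b", line))


if __name__ == "__main__":
    line = "double d; do { d += 1; } while (d < 3);"
    print(count_do_by_subtraction(line), count_do_by_boundary(line))
```

The subtraction only repairs the do / double pair; the word-boundary version also protects against identifiers such as `done`.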

Beta 0.5

bug fixed: annotations (comments) are now ignored

shiled_word=['//','/\*','\*/','\"','\\\\n\\\\']

Beta 0.6

feature added: input layer finished

file_path=input("Welcome to C keyword statistic!\n Please input file Path:")
level=int(input("Please input level:"))

Beta 1.0

feature added: simple implementation of level 1; debug output layer finished

ver 2.0

feature added: simple implementation of level 2

ver 2.5

feature added: formatted 'switch num' output
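For level 2, one simple way to get the number of switch structures and the cases in each group is to split the source at every `switch`. This is a sketch under assumptions, not the repository's code: the name count_switch_cases is hypothetical, switch statements are assumed not to be nested, comments and strings are assumed to be removed already, and the "case num" label in the demo is a guess at the output format.

```python
import re


def count_switch_cases(code):
    """Return a list with the number of 'case' labels in each switch."""
    # Everything before the first 'switch' is dropped; each remaining part
    # starts right after one 'switch' and runs until the next one.
    parts = re.split(r"\bswitch\b", code)
    return [len(re.findall(r"\bcase\b", part)) for part in parts[1:]]


if __name__ == "__main__":
    demo = "switch (a) { case 1: break; case 2: break; default: break; }"
    cases = count_switch_cases(demo)
    print("switch num:", len(cases))
    print("case num:", " ".join(str(n) for n in cases))
```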

ver 3.0

feature added: logic for level 3 implemented; simple and still incorrect

ver 4.0

reworked the method for levels 3 & 4 so that it basically matches the problem requirements; removed the debug output

Thank you for reading this far. There are still bugs; I will keep updating until the deadline and as problems are found.

Unit test for Level 3 and 4

Optimization Version

ver 4.6

bug fixed:

  • ignore variable names
  • optimize the output layer
  • ignore #include and #define

Bugs still exist; you are welcome to point them out in a comment or a private message.

Complexity Test

Using 

python -m cProfile -s cumulative Keyword_extract.py

to test the complexity; the result is shown below.
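The same kind of report can also be produced from inside Python with the standard cProfile and pstats modules, which is handy for profiling a single function. The sketch below is illustrative only: profile_call and workload are hypothetical names, and workload is a stand-in rather than the keyword extractor itself.

```python
import cProfile
import io
import pstats


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Same ordering as "-s cumulative" on the command line; show 10 entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


def workload(n):
    """Stand-in for the real work, e.g. processing a source file."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(workload, 100000)
    print(report)
```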

Summary

This assignment made me familiar with Python. I went through the full development procedure by myself this time, which was an excellent experience, especially because I reviewed "Algorithm & Data Structure" while developing.

On the other hand, due to my limited ability, the real time was far from my estimate, the method is not perfect, and the result still leaves some situations unfinished.
