CDC SFE coding guideline and common design
Overview
This document is a supplementary document to the exiting coding guideline/standard in ebay. It will not focus on basic coding style such as indent, bracket and layout, etc, which are already documented in existing coding standard . It will provide additional guideline which addressing few of the most common code quality issues. Of cause it is not an all-in-one.
This document summarizes coding style and common design to be reused firstly in CDC SFE team. Please try to follow the coding style and reuse the design in this document as working on project. Your input and review will be appreciated.
And we have examples for your reference on how to practice this document.
Coding style
Overview
Coding style is a very important aspect of code quality. A consistent coding style will help on bringing same code quality among team. The qualities can be measured by the following:
- Easy to be understood
- Easy to be modified
- Easy to be extended
Coding style can be measured from several aspects; these aspects are not really independent. They impact each other in various ways. The styles are outlined below:
- Code length
- Class style
- Method style
- Maintain style
- Misc
Code Length
The length of the code has very big impacts on understanding a code. Although it is easy to be judged against a number, the length criterion is more than just setup a threshold. Whether a piece of code is too long depends on the complexity of the code under analysis. So along with the suggested number, there will also be some exceptional cases. The code length can be gauged from the flowing levels.
Class
A class should be as small as possible with the integrity and purpose not harmed. A heuristic number of the length is less than 500 lines of code.
A class can be long when:
- It is a Manager class as discussed in Class Style
- It is a Utility class as discussed in Class Style.
- It has heavy initialization. The style of this kind of class is like the first one. E.g. IFTSearchOptionTestCaseUtil
Method
A method should be viewed within one browse window with extra space before and after the method body. The extra space will help to visually isolate the method from the context, so that reader can easily focus on it. A heuristic number is 30 lines at the most.
A method can be longer than that when;
- The overall structure of the method is sequential and most code is doing the similar task in very simple form (in one line). This means no part of the code needs to pay extra attention. The sequence of the code can even be changed without impact the functionality of the method. E.g. SearchFavoriteNormalizer
- There is clear and simple pattern in the method. Again, each part of the method is doing similar thing. E.g. FindingStateProvider
Although there are exceptions, method length should be limited whenever possible. A long method is a good candidate for code refactoring. Extract method is usually the best way of doing that.
Block
A block is a structure like if/if-else, while/do-while/for, {…} etc. As block makes method, a heuristic number will be no more than 10 lines of code. Usually there should be no exceptional case for a long block. If the code is complex, extract it into a method unless each line is at the same abstraction level and make the task complete.
Class style
Defining class is to distribute and partition responsibility among classes. Just like organization structure, the key is to clearly define which class is managing and which class is managed. The class style is to decide which kind of class a class is and how it interacts with other classes. In most of the case, there would be three kinds of class.
Manager class
This kind of class usually has only one methods to provide a clearly defined service. The method can be implemented in two ways:
- By invoking the other private methods defined within same class. The main method should be kept shortly, the method body should be very easy to understand since it is purely delegate to private methods. If not, use extract method to create new private method and keep the main method neat.
- By delegating to Worker classes. There are two kinds of delegation.
- Delegate to only one worker class, this is state and strategy
- Delegate to a list of worker classes in organized in a chain.
. E.g. CustomizationModelBuilder. SearchFavoriteNormalizer
Worker class
This kind of class usually works with Manager Classes to handle part of the all task. It usually implements an interface or extends from some abstract or adapter classes. There is usually single direction reference from manager class to worker class, so that worker class can not invoke manager class. The manager and worker class should count most of the classes in a system.
Utility class
Utility class is a class that groups similar/related services across multiple domains. It is likely to evolve (change) frequently in the long run. It usually has some overloaded methods for same kind of work but different use cases. These methods usually digest various incoming parameter specific to different domain. But they rely on few private methods to centralize common logic and decouple domain dependency. It is very important to have domain independent private methods because they are the building block.
When there is new use cases comes out, the best way to do is create new overloaded method and combine existing private methods for it. The domain specific logic can be well encapsulated in the newly introduced method. If there is anything in common with existing code, extract them into new private method.
DO NOT try to support new case in existing method by adding new parameter. This is all nightmare begins. A simple Boolean variable can double the size in the worst case. But if we have case specific method, we will need much less parameter than the giant one because the assumption it relies on.
E.g. TrackingenhancementUtil
Relationship between classes
Besides the kinds of classes we classified, the relationship between them should also follow certain guideline:
- Single direction reference. Classes should not refer to each other. We should keep single direction reference whenever possible. Bi-direction reference will introduce unnecessary complexity into system.
- Use composite instead of template. In case you really want to use template, you should keep the other method not accessible or override from subclass (decorated by final, static or private).
- Interface and Abstract class. There are cases that some logic is shared between implementations, in that case we can define use abstract class to make it easier to be used. But the shared method should never be override by subclass. common used method
Method style
It is usually not a big problem when it comes to method style. As discussed before, we should keep it short. Beside that, the following guideline should also be considered:
Limit the parameter number
We should keep parameter as less as possible. Too many parameters will make it hard to understand and keep track what is used. A heuristic number is 3. When there might be too many parameter, please try to break down the method into more specific methods working on less parameters or create a Value object to group the parameter. Value object is especially useful when the parameters need to change constantly in the life.
Write simple code
Try to write the code using the most understandable way as if your readers are all idiots. If you have more than 4 parameters for a method; sequentially list the lines for calculating each of them instead of putting everything into on code line. On long code line can be hard to understand.
Sort parameter
The order of parameter in the declare block usually implies how important the parameter is in the method. The most important one (depends on context) should be put in front of those less important ones. And we should avoid declare parameters with same Type besides each other. That will increase the error possibility when invoke such a method because the compiler can not distinguish wrong position of parameter by its type.
Have return value
It is preferred to use method with return value than void method. This follows the classic IPO model that clearly separate input, process and output. This allows invoking method easier control the data creation and consumption. Thus make both caller and called method clearly partition the responsibility.
Limit nested level
This is applied for any block, not only for the same kind. E.g. if-if-if, for-if-for, etc. The start level should be count from the method body. It is much easier to work on the ground than climbing mountain. So is the code. Code with several levels of nesting block is just like mountain. The unnecessary complex will make life harder. The code can be much more flat by the flowing two methods:
- Return fast. Using “if” wisely.
- Extract method. This makes the logic in main flow neat.
Common flow
A method should first check/validate whether it should be executed or not. It includes parameter validation, member variable checking etc.
Then it should do the processing. First it needs to initialize variables. A variable should be kept within the minimal range that it is being accessed.
Lastly it will do the returning. It is highly recommended to use convertor than processor as converter is much clear than processor.
There are two kinds of methods: hub and end point.
Maintain style
Nearly in all projects we have to modify existing code to plug our feature into current system. While doing this, we should keep in mind to minimize the impact to existing system. To do this, we should know which kind of maintenance we are performing. Each type will have different solution that should be followed.
Plug into existing framework
When there is some framework that allow new feature to be plugged in. The new feature should implement certain interface or extends from certain parent class. Do not violate or by pass the current design by insert code randomly. E.g. CustomizationModelBuilder. SearchFavoriteNormalizer
Add feature for one class
When an existing feature needs to be modified and there is no existing framework to support the plug & play extending. As the function is only needed for this one place, a new method should be added into existing class. The method will encapsulate the logic and get invoked from proper location. Do not insert the code randomly into existing code body, like if() {..} something. This is very bad practice and the main reason the code deformed.
Add feature for multiple classes
When the same feature is required from different classes, instead of copy/paste same code into them, we should create a Utility class to centralize all the shared methods for related/similar purpose. Then for each point we need to use the service, the utility class should be invoked instead of same block of code. And it is preferred to use public static methods to simplify the clause. E.g. IFTSearchOptionTestCaseUtil
Add new member variable(s)
Usually we need to add additional member variable(s) into existing VO or other classes. In this case, we should considering the following before we add it:
- Is there any other place need to be added too? If the same variable is needed by multiple classes, make sure you are using the same variable name whenever possible.
- How many new variables need to be added? If there are more than 1 variables to be added, and they can be logically grouped, create a new VO to hold all of them and only add that VO into existing classes.
Fix bug
It is also a kind of maintain. When fixing bug, pay much attention to timing and change risk, do not do last minute dirty fix when the regression window is closing. And the style should follow what mentioned above. Each time you check in code, be sure to run Findbugs first, sometimes some serious (red) bugs can be found from the new adding code-lines.
Misc
There are some points that are hard to be categorized are listed here. The overall goal is to make the code easy to be understood. Some of these misc things are actually talk about the same thing but from different perspective.
Use comments only when necessary
Try to avoid comments as much as possible. Comments are neither structured nor concrete; it is a totally different language than code. While code is under constant maintenance, comments are usually out of date and misleading. Most comments are just reiterating the method name or class name. A well designed system can document itself just by code. If you really have something to say, say it in ERD or wiki.
Structure is most important
Structure is the way to aggregate parts of the system. Structure here means a clean and clear way of organize parts. It can be:
- A block of code that talks about the similar task at same abstraction level.
- An initialization block that composite parts into Chain or Branch
Write to the essential
When writing high-level method, make sure every line is doing a clear task and moving things forward. If not, consider extract related code into method and give it a meaningful name. We should keep high-level structure clear and simple and natural.
Don’t mix the role
Partition responsibility between classes wisely, always thinks about the roles of the classes. Do not have high-level classes doing the low-level concrete task, the high-level should doing dispatching, controlling, managing, etc and delegate low-level task to proper classes.
Common best practice
Although already document in the coding standard; the followings are very important so that they are mentioned again here for quick reference.
- Use interface than implementation. Avoid to use “cast” “instanceof” along with multiple if-else branch as much as possible. When instantiate an implementation class, always assign it to the interface it implements instead of the concrete class itself.
- Optimize import statement. Be sure to “Ctrl-Alt-O” before check in, otherwise too many unused imports.
- Try to use “final” in the params’ list when applicable
Common design
Overview
Design is the blueprint which talks about how to build a system. No matter how different the system is, there are common designs that can be applied at any level – application, sub-component, external services, etc. The <<design pattern>> is a book that can be referred to some of the common designs. But it is too complicated and some of them are not about design actually. Design is very simple if you know what the factors that builds up a design.
In most of the case, a particular design can be decided by the following two factors:
- How behavior is organized:
- Chain
- Branch
- How information is processed:
- Processor
- Convertor
So, let’s do a very simple math, the common designs are the combination of these two factors:
- Processor chain
- Convertor chain
- Processor branch
- Convertor branch
Any design is one of these 4 designs at a given level. A common mistake is there is no one single clear design at a level, people put too many things into design, so the class diagram is just bloated with classes, and it is impossible for you to get the big picture.
It is not important if the design is using the 4 names above. It depends on the designer’s favorite. But know the essential behind various name is the key to master design.
The following of this section will talks about this two factors, it is your application decides which ones to use. And the complete design may include different sub designs at different level or sub system.
Chain
When a bunch of logic works together in a sequential way to fulfill a task, we can partition the logic into items with proper size, chain the items together. Chain is the most common way to bring structure into a system.
Chain is build up by items in it. Most of the execution on items in the chain is just like the sequential execution of the code for a piece of program. In most of the cases:
- The items in the Chain are ordered. The sequence of the item in the Chain decides the sequence of the execution of the item.
- Every item in the Chain may be executed per request.
- Add, remove or replace any item in the Chain will not impact other items. (Orthogonality)
- Item can also be a Chain or Branch. (Composite)
- Items may not belong to a specific Chain. That is an item can be reused in other Chain.
The following patters are belongs to Chain:
- Visitor. Typical chain, all items will be executed per request in predefined order.
- Chain of responsibility. Special chain, the execution on an item depends on current processing status.
- Decorator. A special chain with FILO execution sequence.
Usually, if a request needs multiple steps to finish, we can use chain.
Branch
When there are multiple choices to do a task, we can partition these choices into items and use a branch to hold the items and manage the decision making.
- The item in the Branch is not ordered, instead, they are identified by some kind of identifier.
- One or none item will be executed per request.
- Add, remove or replace any item in the Branch will not impact other items. (Orthogonality)
- Item can also be a chain or Branch. (Composite)
- Items may not belong to a specific Branch. That is an item can be reused in other Branch.
The following patters are belongs to Branch:
- State
- Strategy
Processor
A processor has not output, it just do something based on the input. It can change some of the attribute of the input, or change the state of the system by manipulating something outside of the input.
Convertor
A convertor has both input and output. It converts input to output in the traditional IPO way. It is the general form of Processor.