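These are purely personal notes.
1. APWG: the Anti-Phishing Working Group, an international anti-phishing organization that publishes quarterly statistics and analysis on phishing attacks worldwide.
2. Microsoft Computing Safer Index Report: describes the annual financial losses caused by phishing attacks.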
3. Phishing URL Detection with ML
A phisher has full control over the sub-domain portion of a URL and can set it to any value. The URL may also have path and file components, which the phisher can likewise change at will.
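To make the parts of a URL concrete, here is a minimal sketch that splits a URL into its sub-domain, registered domain, and path. The function name `split_url` and the sample URL are made up for illustration, and the sketch assumes the registered domain is simply the last two host labels, which is not true for suffixes such as "co.uk".

```python
from urllib.parse import urlsplit

def split_url(url):
    """Split a URL into the parts a phisher can and cannot freely choose."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    labels = host.split(".") if host else []
    # Naive assumption: the registered domain is the last two labels.
    # Multi-label suffixes (e.g. "co.uk") would need a Public Suffix List.
    registered_domain = ".".join(labels[-2:])
    subdomain = ".".join(labels[:-2])
    return {
        "subdomain": subdomain,                  # freely chosen by the phisher
        "registered_domain": registered_domain,  # fixed once registered
        "path": parts.path,                      # freely chosen by the phisher
    }

info = split_url("http://paypal.com.secure-login.example.net/signin/index.html")
print(info["subdomain"])          # paypal.com.secure-login
print(info["registered_domain"])  # example.net
print(info["path"])               # /signin/index.html
```

In this hypothetical URL the brand name appears only in the sub-domain, which the owner of example.net can set to anything.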
Attack methods:
1)The sub-domain portion of the URL can be controlled by the phisher and set to any value.
2)The path and file components of the URL can be changed by the phisher at will.
3)The attacker can register any domain name that has not been registered before. The phisher can change the FreeURL at any time to create a new URL. Security defenders struggle to detect phishing domains because of the unique part of the website domain.
4)The phisher tries to make the domain look like the domain of the legitimate URL.
5)Other methods often used by attackers are cybersquatting and typosquatting.
Cybersquatting (also known as domain squatting) is registering, trafficking in, or using a domain name with bad-faith intent to profit from the goodwill of a trademark belonging to someone else.
That is to say, the phisher can register domains similar to your company's URL. (For example, if your company is named "abcompany" and you register abcompany.com, phishers can register abcompany.net, abcompany.org, and abcompany.biz and use them for fraudulent purposes.)
Typosquatting, also called URL hijacking, is a form of cybersquatting that relies on mistakes such as typographical errors made by Internet users when typing a website address into a web browser, or on typographical errors that are hard to notice when reading quickly.
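One possible way to flag both kinds of look-alike domain is sketched below. It is an illustration, not a method taken from these notes: the names `edit_distance` and `squatting_kind`, the labels it returns, and the threshold of one edit are all assumptions. A candidate with the same name under a different TLD is treated as a cybersquatting-style variant, and a candidate whose name is within a small Levenshtein distance of the brand name as a typo variant.

```python
def edit_distance(a, b):
    """Levenshtein distance (insert, delete, substitute) using one rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete from a
                           cur[j - 1] + 1,       # insert into a
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]

def squatting_kind(candidate, brand_domain, max_distance=1):
    """Classify a candidate domain relative to a known brand domain."""
    candidate, brand_domain = candidate.lower(), brand_domain.lower()
    if candidate == brand_domain:
        return "same"
    # Split "name.tld" at the last dot (assumes a single-label TLD).
    cand_name, _, _ = candidate.rpartition(".")
    brand_name, _, _ = brand_domain.rpartition(".")
    if cand_name == brand_name:
        return "tld-variant"   # e.g. abcompany.net vs abcompany.com
    if edit_distance(cand_name, brand_name) <= max_distance:
        return "typo-variant"  # e.g. abcompanv.com vs abcompany.com
    return "unrelated"

for domain in ["abcompany.net", "abcompanv.com", "example.org"]:
    print(domain, squatting_kind(domain, "abcompany.com"))
```

Plain Levenshtein distance counts a swap of two adjacent letters as two edits, so catching transposition typos would need a larger threshold or a Damerau-style distance.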
Features Used for Phishing Domain Detection
1)URL-Based Features
2)Domain-Based Features
3)Page-Based Features
4)Content-Based Features
URL-Based Features
Digit count in the URL
Total length of URL
Checking whether the URL is typosquatted or not
Checking whether it includes a legitimate brand name or not
Number of sub-domains in URL
Is the top-level domain (TLD) one of the commonly used ones?
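As a rough sketch, most of the URL-based features listed above can be computed from the URL string alone. The function name `url_features`, the `COMMON_TLDS` set, and the `BRANDS` set are placeholders chosen for illustration; a real system would use much larger lists. The typosquatting check is left out here because it needs a list of legitimate domains to compare against.

```python
from urllib.parse import urlsplit

# Assumed, illustrative lists; not taken from the notes.
COMMON_TLDS = {"com", "org", "net", "edu", "gov"}
BRANDS = {"paypal", "abcompany"}

def url_features(url):
    """Compute simple lexical features from a URL string."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".") if host else []
    return {
        "digit_count": sum(ch.isdigit() for ch in url),
        "url_length": len(url),
        # Naive: everything left of the last two labels counts as sub-domains.
        "subdomain_count": max(len(labels) - 2, 0),
        "common_tld": bool(labels) and labels[-1] in COMMON_TLDS,
        "has_brand_name": any(brand in url.lower() for brand in BRANDS),
    }

features = url_features("http://paypal.com.secure-login.example.net/signin/index1.html")
for name, value in features.items():
    print(name, value)
```

Each value can be fed to a classifier as one column of the feature vector.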
Domain-Based Features
Is the domain name or its IP address in the blacklists of well-known reputation services?
How many days have passed since the domain was registered?
Is the registrant name hidden?
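The three domain-based questions above could be turned into features roughly as follows. This sketch assumes the registration date and registrant name have already been obtained (for example from a WHOIS lookup, which is not shown), and it uses a tiny in-memory `BLACKLIST` as a stand-in for a real reputation service. The function name, the blacklist entries, and the "redacted"/"privacy" keywords are all hypothetical.

```python
from datetime import date

# Hypothetical local blacklist standing in for a reputation service.
BLACKLIST = {"bad-domain.example", "203.0.113.7"}

def looks_hidden(registrant_name):
    """Guess whether a registrant name is missing or masked by a privacy service."""
    if not registrant_name:
        return True
    lowered = registrant_name.lower()
    return "redacted" in lowered or "privacy" in lowered

def domain_features(domain, ip, registered_on, registrant_name, today=None):
    """Build domain-based features from data that was looked up beforehand."""
    today = today or date.today()
    return {
        "blacklisted": domain in BLACKLIST or ip in BLACKLIST,
        "age_days": (today - registered_on).days,
        "registrant_hidden": looks_hidden(registrant_name),
    }

print(domain_features("bad-domain.example", "198.51.100.1",
                      date(2024, 1, 1), "REDACTED FOR PRIVACY",
                      today=date(2024, 1, 31)))
```

Passing `today` explicitly keeps the age calculation reproducible.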
Page-Based Features
Global Page rank
Country Page rank
Position in the Alexa Top 1 Million Sites list
Estimated Number of Visits for the domain on a daily, weekly, or monthly basis
Average Page views per visit
Average Visit Duration
Web traffic share per country
Count of references from social networks to the given domain
Category of the domain
Similar websites, etc.
Content-Based Features
Page Titles
Meta Tags
Hidden Text
Text in the Body
Images, etc.
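The content-based items above can be pulled out of a page's HTML with the standard-library parser, as in the sketch below. The class and function names are invented for illustration. The sketch treats text as hidden only when an enclosing element has a `hidden` attribute or an inline `display:none` / `visibility:hidden` style, so text hidden through external CSS or scripts is not detected.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they are not tracked on the open-element stack.
VOID_TAGS = {"meta", "img", "br", "hr", "input", "link"}

class ContentFeatureParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.hidden_text = []
        self.body_text = []
        self.image_count = 0
        self._open = []  # stack of (tag, is_hidden) for open elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""
        if tag == "img":
            self.image_count += 1
        if tag in VOID_TAGS:
            return
        style = (attrs.get("style") or "").replace(" ", "").lower()
        hidden = ("hidden" in attrs or "display:none" in style
                  or "visibility:hidden" in style)
        inherited = bool(self._open) and self._open[-1][1]
        self._open.append((tag, hidden or inherited))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        tags = [t for t, _ in self._open]
        if not text or "script" in tags or "style" in tags:
            return
        if "title" in tags:
            self.title += text
        elif self._open and self._open[-1][1]:
            self.hidden_text.append(text)
        elif "body" in tags:
            self.body_text.append(text)

def content_features(html):
    """Return title, meta tags, hidden text, body text, and image count."""
    parser = ContentFeatureParser()
    parser.feed(html)
    parser.close()
    return {"title": parser.title, "meta": parser.meta,
            "hidden_text": parser.hidden_text,
            "body_text": parser.body_text,
            "image_count": parser.image_count}
```

The resulting strings would still need to be converted to numeric features (for example word counts or brand-name matches) before being passed to a classifier.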