It classifies many kinds of phishing websites and outputs the results by type, region, industry, and other dimensions.
Domain names are analyzed intelligently through an autonomous learning process:
1) Domain name feature analysis and assessment: extracts and analyzes domain name semantics and page structure from the page code;
2) Feature training and reinforcement: uses historical data for unsupervised training and reinforcement of domain name features;
3) Self-updating mechanism: adds the trained features to the system so that they are used in subsequent detection.
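The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the product's implementation: the names `tokenize`, `learn_tokens`, `FeatureStore`, and `min_support` are hypothetical, the sample domains are made up, and simple token-frequency mining stands in for the unsupervised training, whose actual algorithm the text does not specify.

```python
import re
from collections import Counter

def tokenize(domain):
    """Split a domain name into alphabetic tokens (hypothetical helper).

    The top-level label is ignored and tokens shorter than 3 letters are dropped.
    """
    name = domain.lower().rsplit(".", 1)[0]
    return [t for t in re.split(r"[^a-z]+", name) if len(t) >= 3]

def learn_tokens(historical_domains, min_support=2):
    """Unsupervised step: keep tokens seen in at least `min_support` domains."""
    counts = Counter()
    for domain in historical_domains:
        counts.update(set(tokenize(domain)))  # count each token once per domain
    return {tok for tok, n in counts.items() if n >= min_support}

class FeatureStore:
    """Holds the learned tokens used by later detection (hypothetical)."""

    def __init__(self):
        self.tokens = set()

    def update(self, new_tokens):
        # Self-updating step: merge newly learned tokens into the live set.
        self.tokens |= new_tokens

    def score(self, domain):
        # Number of learned tokens that occur in the domain name.
        return sum(1 for t in tokenize(domain) if t in self.tokens)

# Made-up historical phishing domains, for illustration only.
history = [
    "secure-login-paypa1.example",
    "login-verify-bank.example",
    "account-verify-secure.example",
]
store = FeatureStore()
store.update(learn_tokens(history))
```

With this sample history, `store.score("verify-login-now.example")` counts two learned tokens, while an unrelated name such as `weather.example` scores zero. A real system would presumably use richer features than tokens, including the page structure mentioned in step 1.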
Comprehensive, generalized blacklists and whitelists are supported to meet different identification and analysis needs.
Whitelist: includes the domain names of mainstream websites as well as their subdomains; because these sites serve dynamic and static content from separate domains, the domains that host their images, audio, and video are also added to the whitelist.
Blacklist: includes the domain names of illegal and malicious websites, together with their ISP, domain registrant, domain contact, IP address, and geographic location.
Grey list: domains are placed on the grey list based on an access threshold, and are then analyzed and assessed using their domain name characteristics and any sensitive content in the page source.
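The three lists can be illustrated with a short Python sketch. It rests on assumptions the text does not settle: the blacklist is checked before the whitelist, a listed domain also covers its subdomains, and an unlisted domain becomes grey once its access count reaches a threshold. The names `matches`, `classify`, and `grey_threshold` are hypothetical.

```python
def matches(domain, listed):
    """True if `domain` equals a listed domain or is one of its subdomains."""
    domain = domain.lower().rstrip(".")
    return any(domain == d or domain.endswith("." + d) for d in listed)

def classify(domain, whitelist, blacklist, access_count, grey_threshold=100):
    """Return 'black', 'white', 'grey', or 'unlisted' for a domain.

    The check order and the meaning of the threshold are assumptions.
    """
    if matches(domain, blacklist):
        return "black"
    if matches(domain, whitelist):
        return "white"
    if access_count >= grey_threshold:
        return "grey"  # queued for feature and content analysis
    return "unlisted"

whitelist = {"example.com"}
blacklist = {"bad.example"}
```

For example, `classify("img.example.com", whitelist, blacklist, 5)` yields `"white"` because the subdomain inherits the whitelist entry, whereas `notexample.com` does not match `example.com` because the suffix test requires a leading dot.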
1) Intelligent analysis and assessment
Intelligently analyzes and assesses trends in malicious websites and tracks the global network security situation, addressing the low efficiency of traditional website security detection products.
2) Focus on business
To overcome the passive nature of traditional website detection and identification, the product uses machine learning and other technologies to perform in-depth security inspection and awareness of the URLs in traffic, taking their business attributes and importance into account.
3) Unified management
Integrates the threat intelligence database, manages URL features in one place, and builds a website threat view, to solve the problem that threat information on malicious websites is not widely used.
4) Support multi-source data
Supports IDC data, metropolitan area network data, CDN data, Internet logs, and other data sources.
5) Flexible, configurable architecture
The overall big data architecture is built on Hadoop plus stream processing. The functional modules are loosely coupled, and the product supports local or cloud deployment.
6) Intelligent multilayer filtering
Supports blacklist and whitelist matching, DGA (domain generation algorithm) detection, and domain name feature recognition.
7) Autonomous learning mechanism
Provides mechanisms for feature extraction, analysis and assessment of key information, feature reinforcement training, and autonomous updating.
8) Multidimensional result output
Supports result output by type, industry, and region.
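Item 6 above names three filtering layers. A minimal sketch of how they might be chained follows, assuming exact-match lists and a Shannon-entropy heuristic standing in for DGA detection. The text does not describe the actual detection algorithm, and the function names and thresholds (`min_length`, `min_entropy`) are illustrative assumptions.

```python
import math
from collections import Counter

def shannon_entropy(label):
    """Shannon entropy of a string, in bits per character."""
    if not label:
        return 0.0
    total = len(label)
    counts = Counter(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_generated(domain, min_length=10, min_entropy=3.0):
    """Heuristic DGA check on the leftmost label (thresholds are assumptions)."""
    label = domain.lower().split(".")[0]
    return len(label) >= min_length and shannon_entropy(label) >= min_entropy

def filter_domain(domain, whitelist, blacklist):
    """Apply the layers in order and return the first verdict reached."""
    if domain in whitelist:
        return "pass"
    if domain in blacklist:
        return "block"
    if looks_generated(domain):
        return "suspect-dga"
    # Final layer: hand over to domain name feature recognition.
    return "needs-feature-analysis"
```

Ordering the cheap list lookups first keeps the more expensive checks off most traffic. Entropy alone misclassifies some legitimate long names and misses dictionary-based DGAs, so it is best read as one signal among several.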
The core functions of this product are as follows:
1) Data preprocessing
The massive volumes of ingested data are cleaned, deduplicated, aligned, cached, and stored in a standardized form; data sources such as online logs, real-time traffic, system logs, and CDN caches are supported.
2) Online quick analysis
With Storm stream processing at its core and combined with the threat intelligence database, the product identifies and analyzes the links found in massive volumes of data.
3) Offline intelligent learning
Builds a multi-layer machine learning model; quantifies the features of the grey-listed data awaiting inspection; combines natural-language semantics, page elements, and other signals for feature extraction and rule generation; and implements a "self-updating, self-learning" detection mechanism.
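As an illustration of what quantifying features from page elements could look like, the sketch below turns a URL and its page source into a small numeric feature dictionary using only the Python standard library. The chosen features and the names `PageFeatureParser` and `quantify` are assumptions made for the example; the text does not list the features the product actually uses.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class PageFeatureParser(HTMLParser):
    """Counts a few page elements that phishing analysis often examines."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host
        self.forms = 0
        self.password_inputs = 0
        self.external_links = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms += 1
        elif tag == "input" and (attrs.get("type") or "").lower() == "password":
            self.password_inputs += 1
        elif tag == "a":
            # Relative links have no hostname and are not counted as external.
            host = urlparse(attrs.get("href") or "").hostname
            if host and host != self.page_host:
                self.external_links += 1

def quantify(url, html):
    """Turn a URL and its page source into a numeric feature dict."""
    host = urlparse(url).hostname or ""
    parser = PageFeatureParser(host)
    parser.feed(html)
    return {
        "host_length": len(host),
        "host_digits": sum(c.isdigit() for c in host),
        "host_hyphens": host.count("-"),
        "forms": parser.forms,
        "password_inputs": parser.password_inputs,
        "external_links": parser.external_links,
    }
```

A dictionary like this could then feed a model or a generated rule, for instance flagging pages that contain a password field on a host with several digits and hyphens.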
With big data security analysis and artificial intelligence at its core, the product discovers and identifies phishing websites more proactively and intelligently. The core technologies are as follows.
1) Big data security analysis technology
Uses big data analysis to detect and correlate phishing website content across the various types of traffic and logs;
Incorporates business scenarios to achieve comprehensive awareness of phishing websites;
Correlates deeply with threat intelligence to identify advanced persistent threats (APTs) in connections to external websites.
2) Artificial intelligence technology
Uses artificial intelligence to continuously discover and extract the characteristics of phishing websites;
Uses these characteristics to continuously iterate on and optimize the analysis and recognition rules, meeting the security requirements of various business scenarios;
Combines with big data security analysis to provide intelligent security analysis products and security services.
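The correlation of logs with threat intelligence described under item 1 can be pictured as a join on the destination host. The sketch below assumes a simple shape for both inputs (log entries as dicts with `src_ip` and `url`, intelligence as a dict keyed by host name); the function name `correlate`, the field names, and the sample records are hypothetical.

```python
from urllib.parse import urlparse

def correlate(log_entries, intel):
    """Attach threat intelligence records to log entries by destination host.

    `log_entries` is an iterable of dicts with 'src_ip' and 'url' keys;
    `intel` maps a host name to an indicator record. Both shapes are assumed.
    """
    hits = []
    for entry in log_entries:
        host = urlparse(entry["url"]).hostname or ""
        record = intel.get(host)
        if record is not None:
            # Merge the log entry, the matched host, and the intel record.
            hits.append({**entry, "host": host, **record})
    return hits

# Made-up sample data, for illustration only.
intel = {"c2.bad.example": {"category": "apt", "source": "feed-a"}}
logs = [
    {"src_ip": "10.0.0.5", "url": "http://c2.bad.example/beacon"},
    {"src_ip": "10.0.0.6", "url": "https://www.example.org/"},
]
hits = correlate(logs, intel)
```

Here only the first log entry matches, so `hits` holds a single record that identifies `10.0.0.5` as the internal address contacting a host tagged `apt`. At production scale the same join would presumably run inside the stream or batch engine rather than in a Python loop.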