This article provides an in-depth analysis of the log correlation technology implemented by OSSIM.

1. Challenges in Network Security Management

Many organizations already run firewalls, intrusion detection, antivirus systems, and network management software, yet network managers still face the following challenges in security tracing and security management:

1. Security devices and network applications generate a huge number of security events, with a serious false-positive problem. A single IDS can generate tens of thousands of security events a day, and typically 99% of them are false positives. Truly threatening events are buried in this mass of information and are hard to identify.

2. The horizontal and vertical relationships between security events (different spatial sources, time sequences, and so on) are not analyzed comprehensively, which leads to serious under-reporting and makes real-time prediction impossible. One attack is often followed by another, with the earlier attack creating the conditions for the later one; a single attack generates security events on several security devices; and events from several different sources may in fact belong to one coordinated attack. All of these relationships lack effective combined analysis.

3. Security managers cannot perceive the security situation of the whole network in real time. Security devices work in isolation, and the huge volume of redundant logs they produce is scattered and unconnected, so real threats are hard to tell from false ones. Such logs clearly cannot be used directly as the basis for responding to security incidents.

Table 1 shows the number of events that network security products generate every day in a medium-sized enterprise with 500 computers.

Table 1 Analysis of log output of typical enterprises
Collecting these logs with existing log collection technology is not difficult. What is difficult is for security personnel to spot, in a short time, the hidden links between massive numbers of security events, and correlating the logs by hand is harder still. What we need is a technology that processes and analyzes massive log data and automatically raises network anomaly alarms, so that security teams can discover network threats more efficiently.

2. Network Correlation Analysis Technology

These problems can be addressed with a correlation analysis platform, which gives an overview of every aspect of network security. It also provides real-time monitoring, event correlation, risk analysis, reporting, and notification, so that security affairs within the enterprise can be managed and audited comprehensively and network security risks can be identified far more effectively. Such a platform is built on correlation analysis as its core technology and focuses on three aspects: log collection, log formatting (also called normalization), and log correlation analysis.

1. Log collection: Whether a SIM product has an advantage depends on its log collection: how many log types it supports, how easily it can be extended, and whether it can automatically identify and support logs from unknown devices. The sources and protocols that need to be supported include syslog, SNMP trap, Windows event logs, databases, files, XML, SOAP, and so on.

2. Log formatting: After logs are collected, they must be converted to a unified standard format. This format adds labels such as risk value, reliability, and priority to the original logs in preparation for the correlation analysis that follows. If the formatting is not standardized, correlation analysis cannot be performed.
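The formatting step described above can be illustrated with a minimal Python sketch. It is not OSSIM's actual plugin code: the regular expression, the `NormalizedEvent` field names, and the default label values are all assumptions chosen for illustration. It parses one typical OpenSSH "Failed password" syslog line and attaches priority and reliability labels to it.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern for a typical OpenSSH "Failed password" syslog line. This is an
# assumption about the input format; real deployments see many variants.
FAILED_LOGIN = re.compile(
    r"^(?P<date>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\ssshd\[\d+\]:\s"
    r"Failed password for (?:invalid user )?(?P<user>\S+)\s"
    r"from (?P<src_ip>\S+) port (?P<src_port>\d+)"
)


@dataclass
class NormalizedEvent:
    """Hypothetical unified event format (field names are illustrative)."""
    date: str
    sensor: str
    src_ip: str
    src_port: int
    username: str
    event_type: str
    priority: int = 1      # illustrative default label
    reliability: int = 1   # illustrative default label


def normalize(line: str) -> Optional[NormalizedEvent]:
    """Turn one raw sshd log line into a NormalizedEvent, or None if unknown."""
    m = FAILED_LOGIN.match(line)
    if m is None:
        return None
    return NormalizedEvent(
        date=m["date"],
        sensor=m["host"],
        src_ip=m["src_ip"],
        src_port=int(m["src_port"]),
        username=m["user"],
        event_type="ssh_failed_password",
    )


raw = ("Apr 25 09:51:29 target sshd[2314]: Failed password for root "
       "from 192.168.183.158 port 40122 ssh2")
event = normalize(raw)
print(event)
```

Lines that do not match return `None`, which leaves room for other parsers to try them; a real collector would chain many such patterns, one per log source.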
[Figure: the normalized event format]

[Figure: comparison of the original SSH log and the normalized log]

The comparison shows that the normalized SSH brute-force event contains many more fields than the original log. These additional fields prepare the event for later correlation analysis.

3. Log Correlation Analysis

Building on normalized logs, correlation algorithms (context correlation, attack scenario correlation, and so on) take events from multiple data sources, assign risk values to the different logs, and combine asset vulnerability information with inbound and outbound traffic to calculate the overall degree of threat to network assets. Finally an alarm is issued to alert the administrators. This greatly reduces the effort administrators must spend on analyzing massive logs.

Correlating security events requires a good event processing mechanism, such as the normalization described above, and good correlation methods. One method is not enough; it is better to combine several real-time correlation methods. After a large number of normalized events are sent to the correlation engine, they pass through event classification, aggregation, cross-correlation, heuristic correlation, and other methods. The system then performs statistical classification based on the security events in the database. Consider the following correlation analysis scenarios:

(1) The VPN server log shows that Zhang San logged into the intranet from the external network at 3:00, logged into the FTP server at 3:05, and downloaded a file from it. The access control system log, however, shows that Zhang San had entered the office area only shortly before. These three logs can be correlated into one security event.

(2) A company deployed a firewall in front of its core database. One day the security system detected that Zhang San had logged into the MySQL database server, but the firewall log contained no access record for Zhang San. This means Zhang San may have bypassed the firewall and logged into the database server directly.

(3) OpenVAS scanned a Linux host in the network and found an Apache 2.2.x Scoreboard (local security restriction bypass) vulnerability. At the same time, the NIDS detected an attempted attack on that vulnerability. If the Linux server has been patched, the correlation result is a low risk value and no alarm is triggered. If the patch has not been applied, the audit system triggers an alarm.

Each managed system has its own security protection measures, but each is an isolated security island. There are nevertheless connections between them, and analyzing their logs together is exactly the correlation analysis technology introduced above.

4. OSSIM Correlation Engine

1. Overview of Correlation Engine

The core of the correlation analysis technology implemented on the OSSIM platform is the correlation engine. After the OSSIM sensors send a large number of normalized events to the engine, the events pass through event classification, aggregation, cross-correlation, heuristic correlation, and other methods. The system performs statistical classification based on the security events in the database to find the sources that most often cause security incidents and the ports that are most frequently attacked. Alarms are generated at each of these stages. The security event correlation process module is shown in the figure below.

The event aggregation module in the management engine supplies high-quality security events to the later correlation stages and improves the efficiency of the correlation engine, while classification processing can promote known security events with high credibility and high attention directly to alarms.
In addition, statistical classification analyzes and counts the alarm database, log database, and knowledge base together to identify the most frequently attacked hosts and applications in the monitored network segment, and counts the security events that share the same data source and the same target port, so that network anomalies are reflected objectively.

In the OSSIM system, correlation analysis is likewise implemented by the correlation engine, whose policy files are located in /etc/ossim/server/. The data being analyzed is collected by the sensors (probes). A sensor collects thousands of events from the network every day, and reporting them directly without any processing would be meaningless. Before reporting, correlation analysis condenses (clusters) these thousands of events into a few dozen or even just a handful of confirmed events, which are displayed in the SIEM console of the Web front end. Put simply, OSSIM's correlation analysis separates true alarms from false ones among the alerts generated by open source security detection tools with different functions, and thereby digs out the real network attacks.

2. Main keywords

Risk assessment in the OSSIM system focuses on threats (Threat), vulnerabilities (Vulnerability), responses (that is, security measures), and assets (Asset). The risk assessment model is embedded in the OSSIM correlation engine. The engine combines the asset library, vulnerability library, and threat library in the knowledge base and weighs the three elements of risk together: assets, vulnerabilities, and threats.

3. Asset risk calculation

Risk = R(asset, vulnerability, threat)

Combining three parameters (asset value, priority, and reliability) gives a concise and effective way to calculate risk. The OSSIM system uses the following formula:

Risk = asset * priority * reliability / 25 (risk model formula 1)

The Risk value of each Alert event is calculated with formula (1), where:

(1) Asset (asset value) ranges from 0 to 5, with a default of 2. OSSIM divides the attention given to assets into 5 levels, with values 1, 2, 3, 4, and 5 from low to high. On the surface the number simply scales the Risk value in the formula, but it also has a deeper meaning. An ordinary workstation, for example, has an asset level of 1; when it suffers a DoS attack, it is enough to disconnect its network port. A database server has an asset level of 5, and the database service must stay online at all times. If it suffers a DoS attack it cannot be handled like a workstation; instead the backup IP address should be enabled automatically and the attack redirected to a network honeypot.

(2) Priority ranges from 0 to 5, with a default of 1. This parameter describes the degree of harm caused by a successful attack. The larger the value, the greater the harm.

(3) Reliability ranges from 0 to 10, with a default of 1. This parameter describes the probability that an attack succeeds. The highest value, 10, represents a probability of 100%. The higher the value, therefore, the more likely the attack is real; you can also read it as the likelihood that the host has been attacked.

When you operate OSSIM, open the SIEM console in the Web UI and inspect each Event, you will find that the risk value is determined by asset value, priority, and reliability. Every asset in the network has value, and the asset value parameter quantifies it. The default value of each asset is 2 (range 1~5).
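Formula 1 is simple enough to check by hand, and a short Python sketch makes the arithmetic concrete. The function name `risk` and the range checks are illustrative assumptions; only the formula, the value ranges, and the defaults come from the text above.

```python
def risk(asset: int = 2, priority: int = 1, reliability: int = 1) -> float:
    """Risk = asset * priority * reliability / 25 (risk model formula 1).

    Defaults follow the values quoted in the article: asset 2, priority 1,
    reliability 1. The range checks are an added safeguard, not OSSIM code.
    """
    if not 0 <= asset <= 5:
        raise ValueError("asset must be in 0..5")
    if not 0 <= priority <= 5:
        raise ValueError("priority must be in 0..5")
    if not 0 <= reliability <= 10:
        raise ValueError("reliability must be in 0..10")
    return asset * priority * reliability / 25


# The maximum of every parameter gives the top of the risk scale.
print(risk(asset=5, priority=5, reliability=10))   # 10.0
# All defaults: 2 * 1 * 1 / 25
print(risk())                                      # 0.08
# A default-value asset under a high-priority, fully reliable attack.
print(risk(asset=2, priority=5, reliability=10))   # 4.0
```

Dividing by 25 maps the largest possible product (5 * 5 * 10 = 250) onto a 0 to 10 scale, which is why an event with all default values lands very close to 0.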
The formula shows that asset value is deliberately not made the dominant factor in the risk result. If it were, a database server with a very high asset value would always produce a very large risk value, while a workstation with a small asset value would produce a small risk value even under severe attack. Such nodes would be overlooked and the result would no longer reflect reality. The asset value is therefore limited to the range 1~5, so that it has only a modest influence on the risk value.

Example scenario: if a host connects to port 445 on five different IP addresses in a VLAN, this may be normal network communication. If it connects to port 445 on 15 machines, this is more suspicious. If there are 150 such connections and they persist for a long time, the host has probably been infected by a worm and further verification is required.

4. Example of the relationship between Risk, Priority and Reliability

Anyone can apply the formula. But what is the internal relationship between the parameters in formula 1? Among the events we collected, repeated events made up a large proportion, and most of them had Risk=0. Take an ICMP event from a Cisco ASA as an example: with PRIO=1, REL=1, and ASET=2, the formula gives 2 * 1 * 1 / 25 = 0.08, which is approximately 0.

Analyzing the figure above, we can see that the system sets an initial reliability value for the event in the correlation rule. If this value were a constant, a large number of false alarms would be generated during dynamic network attacks. Once cross-correlation is introduced, each network security alarm is associated with the network services actually running on the target, its open ports, and the vulnerability information for those ports, in order to determine whether the attack is feasible. For this to work, the reliability value must be a variable.
Security events aimed at ports that do not exist on the target are discarded directly (the system's KDB holds a list that is updated in real time). The figure below shows how the reliability parameter changes within an OSSIM correlation rule.

5. Event Aggregation

Redundant alarms seriously interfere with an administrator's judgment of faults, so this information needs to be aggregated. Aggregating similar alarms out of the mass of alarms removes the redundant ones, reduces the number of alarms, and greatly reduces the workload of the correlation module. Merging redundant alarm information is considered from three aspects:

(1) Merge the redundant alarm events generated by the host-based monitor OSSEC.

(2) Merge the redundant alarm events generated by the network-based monitors Snort/Suricata.

(3) Combine the alarm information that the host monitor OSSEC and the network monitor Snort generate for the same attack.

Snort alarm attributes are relatively complete and have distinctive characteristics, so alarm events generated by Snort can be merged on the basis of attribute similarity. For OSSEC alarm events, OSSIM uses a method that combines the event ID with alarm category information. OSSEC currently has more than 900 detection rules, all stored in XML format. Statistical analysis groups them by alarm behavior into 80 major alarm categories, including Syslog, Firewall, IDS, Web, Squid, Windows, and so on. The event ID of an OSSEC alarm is used as the entry point, and the class and subclass are matched level by level to match and merge the alarms.

From the analysis of information sources, we can see that the objects of aggregation are the alarm data generated by the network-based Snort/Suricata and the host-based OSSEC security products.
Because Snort's alarm data (including elements such as IP address, attack content, and port) contains protocol attributes, it can be aggregated in a rule-based manner, and the clustering process can ensure that alarm information is classified correctly. Snort supports multiple protocols, so clustering and merging its alarm events saves a lot of time. And because its attributes are so distinctive, redundant information is merged on the basis of attribute similarity, so the merging error is small.

For the alarm information generated by OSSEC, the ID of the alarm event is used as the entry point and the alarms are aggregated step by step according to alarm category. The merging error is low. Merging by root category with the ID as the entry point saves the time that would otherwise be spent traversing all alarm information to cluster and merge it.

OSSIM uses the host monitoring software OSSEC to collect and analyze host audit records, system logs, and application logs. OSSEC supports file integrity monitoring, registry monitoring, port monitoring, rootkit detection, and partial process monitoring. Snort, meanwhile, captures packets on the network, performs real-time traffic analysis and inspects IP packets, and can also carry out protocol analysis and content searching and matching. It can detect a variety of attacks and probes (such as buffer overflow and ShellCode attacks).

If two events are highly similar, they are considered to be the same attack and are merged; otherwise they are considered unrelated attacks. We can start by calculating the similarity between two alarm messages. The first thing to consider is the attributes the two alarms have in common. These attributes include the attack source, the target (host IP and port), the attack type, the MAC address, time information, and the event ID.
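One way to turn these shared attributes into a merge decision is sketched below in Python. This is not OSSIM's implementation: the attribute names, the equal weighting of attributes, the 0.75 threshold, and the 60-second window are all assumptions made for illustration.

```python
# Attributes compared between two alarms (an illustrative subset).
ATTRS = ("src_ip", "dst_ip", "dst_port", "attack_type")


def similarity(a: dict, b: dict, attrs=ATTRS) -> float:
    """Fraction of the compared attributes on which two alarms agree."""
    matches = sum(1 for key in attrs if a.get(key) == b.get(key))
    return matches / len(attrs)


def merge_alarms(alarms, threshold=0.75, window=60):
    """Greedily fold similar alarms that occur close together in time.

    Each group keeps its first alarm as the representative, a count of the
    alarms merged into it, and the time the group was last seen.
    """
    groups = []
    for alarm in sorted(alarms, key=lambda a: a["time"]):
        for group in groups:
            close_in_time = alarm["time"] - group["last_seen"] <= window
            if close_in_time and similarity(group["alarm"], alarm) >= threshold:
                group["count"] += 1
                group["last_seen"] = alarm["time"]
                break
        else:
            # No existing group was similar enough: start a new one.
            groups.append({"alarm": alarm, "count": 1, "last_seen": alarm["time"]})
    return groups


alarms = [
    {"time": 0, "src_ip": "10.0.0.5", "dst_ip": "10.0.0.9",
     "dst_port": 22, "attack_type": "ssh-bruteforce"},
    {"time": 10, "src_ip": "10.0.0.5", "dst_ip": "10.0.0.9",
     "dst_port": 22, "attack_type": "ssh-bruteforce"},
    {"time": 20, "src_ip": "10.0.0.7", "dst_ip": "10.0.0.3",
     "dst_port": 445, "attack_type": "smb-scan"},
]
for group in merge_alarms(alarms):
    print(group["alarm"]["attack_type"], group["count"])
```

A production engine would likely weight attributes differently (for example, treating attack type as more significant than source port), but the structure of the decision stays the same: score the pair, compare with a threshold, merge or keep apart.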
On this basis we can abstract and formally model the attributes of alarm events and define them as an event attribute set. The merging method for redundant alarm events starts from the observation that alarm messages for the same attack behavior are highly similar, while alarms for different attack behaviors are less similar. When merging the redundant alarm information of each product, we must also take into account the alarms that the two security products generate for the same attack, and try to compress all duplicate alarm information effectively.

If Snort detects an attack against a target IP and that target host actually has the vulnerability, the reliability coefficient is 10 (that is, a 100% chance that the attack succeeds). In the figure below, the REL value is 10, meaning that the reliability coefficient at that moment is 10, so the risk value is 4. This value can also be modified in the data source. As this shows, the values of reliability and priority determine the size of the risk. The method for manually modifying REL/PRIO in the data source is shown in the figure below. Note that this modification takes effect globally, not for a single event.

Through the sensors, the data of each monitoring node is linked together to form a data chain. The correlation analysis engine and the rules then work together to analyze attacks, trace their source, and issue alarms. The principle by which alarms are generated is analyzed below.

Alarm events are generated by Correlation Directives and Rules. The warning information displayed under Alarm is obtained by matching a large number of events against correlation rules. From OSSIM 5 onward, the system displays alarms graphically, which makes it easy for administrators to filter the important items out of a large number of security events. The process of Alarm event generation is shown in the figure.
The specific steps for generating an alarm are as follows:

(1) Logs are collected into OSSIM;

(2) The logs are normalized and events are generated;

(3) The events are imported into the correlation engine;

(4) New events are matched against the correlation rules.

From the figure above, even a security novice can easily spot a brute-force attack on the SSH service. In this figure we take only the System Compromise→Worm infection→Internal Host scanning alarm as an example. Internal host scanning has appeared within a system compromise, and this kind of behavior is suspected to be a network worm scan. Scanning hosts for vulnerabilities is often a prerequisite for worm propagation. Worms typically probe with ICMP Ping packets and TCP SYN, FIN, RST, and ACK packets, and their targets are random. As the figure shows, the system defines the risk value of such events, the duration of the scan, the source address, destination address, source port, destination port, and the correlation level. OSSIM can discover this abnormal behavior easily.

5. Association rule instructions

The core of correlation analysis is carried out by one correlation directive or a group of them. The following example uses SSH brute force to illustrate. SSH brute force has been an attack technique for as long as the service has existed on UNIX systems. According to statistics, more than 50% of the user names tried are root. A successful login after 100 login attempts warrants a low reliability value, while a successful login after 10,000 attempts warrants a high one. The correlation engine can then judge the attack and respond by weighing together the number of logins within a given period, the different source addresses and the common target IP address seen during that period, and the countries these IP addresses come from and their reputation.
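The counting logic described above (failed logins from a source to a target within a time window, with reliability rising as the count grows) can be sketched in Python as follows. The class name, the escalation ladder in `LEVELS`, and the window length are hypothetical; OSSIM expresses the same idea declaratively in its XML directives rather than in code like this.

```python
from collections import defaultdict, deque

# Hypothetical escalation ladder: (failed attempts within the window, reliability).
LEVELS = [(10, 3), (100, 6), (1000, 8), (10000, 10)]


class BruteForceTracker:
    """Track failed logins per (source, target) pair in a sliding time window."""

    def __init__(self, window_seconds: int = 3600):
        self.window = window_seconds
        self.attempts = defaultdict(deque)

    def record_failure(self, src_ip: str, dst_ip: str, timestamp: float) -> int:
        """Record one failed login and return the current reliability (0..10).

        Timestamps are assumed to arrive in non-decreasing order.
        """
        queue = self.attempts[(src_ip, dst_ip)]
        queue.append(timestamp)
        # Drop attempts that have fallen out of the window.
        while queue and timestamp - queue[0] > self.window:
            queue.popleft()
        reliability = 0
        for count, level in LEVELS:
            if len(queue) >= count:
                reliability = level
        return reliability


tracker = BruteForceTracker(window_seconds=60)
for second in range(10):
    level = tracker.record_failure("192.168.183.158", "192.168.183.139", second)
print(level)  # reliability after ten failures inside one minute
```

Keying the window on the (source, target) pair is only one option; keying on the target alone would also catch distributed attempts from many source addresses against the same host.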
The correlation directive for detecting SSH brute-force attacks is written in XML. Its opening tag looks like this:

<directive id="50113" name="AV-FREE-FEED Bruteforce attack, SSH service authentication attack against DST_IP" priority="4">

Parameter explanation:

These directives are usually written in XML. If you do not understand what the attributes mean, you will have no idea where to start when you write or modify a directive yourself. The rule system has a complete set of rule attributes, with the following meanings:

Id: defines the unique identifier of the correlation directive. The numbering must follow the scheme published by OSSIM, so that the categorized display in the framework (the directives sub-menu of the correlation menu) works correctly. The numbered directives are available in this sub-menu.

Name: defines the name of the directive (displayed when the directive matches).

SRC: source IP address.

DST: destination IP address.

Port_from: source port.

Port_to: destination port.

Priority: defines the priority of the correlation directive.

Type: defines the rule type. There are only two types of rules. The Detector type uses information from detector components, which is contained in the server database.

Reliability (also called credibility): the larger this parameter is (the closer to 10), the more likely the alarm is real. This parameter is crucial in the correlation process. As rules are matched one after another, the probability that this group of alarms is a false alarm decreases, so the reliability can be raised in each higher-level rule. Subsequent rules can set their level relatively (for example, +3 means the global reliability rises by 3 levels relative to the previous rule) or absolutely (for example, 7 means the current reliability level is 7).

Occurrence: the number of times the event must occur.

Plugin_id: defines the source of the alarm expected by the rule. Each plugin has an ID that allows it to be referenced in the correlation rules.

Plugin_sid: defines the events associated with a plugin. All events received by OSSIM are indexed (according to their plugin) and configured. The plugin_sid can be configured by clicking on the required plugin_id in the Plugins submenu of the configuration menu. For example, the alarm identified by plugin_id 1501 and plugin_sid 400 is equivalent to "apache: Bad request". Together, these two attributes (plugin_id and plugin_sid) define exactly which events the rule expects.

Time_out: the timeout setting. It indicates how long to wait for an event that satisfies a given rule. If the event does not occur within the given time (expressed in seconds), the correlation directive ends and falls back to the result calculated by the previous rule. This attribute determines the time window within which the alarm (event) expected by the rule must appear.

Protocol: configures the type of network event expected by the rule. Three protocols can be defined: TCP, UDP, and ICMP. This attribute allows absolute references, which means the protocol type matched by an earlier rule can be reused. Writing protocol="1:PROTOCOL" states explicitly that the protocol of this rule is the same as the protocol matched by the first rule. To reuse the protocol matched by the second rule, write protocol="2:PROTOCOL".

From: indicates the source IP address of the alarm. It can be expressed in 6 different ways, including:

① ANY, indicating that any source address matches this attribute.

② An IP address in dotted form (x.x.x.x).

③ By reference, following the same principle as references in the protocol attribute.
(For example: 1:SRC_IP = the source address of the alarm matched by the first-level correlation rule; 2:DST_IP = the target address of the alarm matched by the second-level correlation rule.)

6. Association Rules in Action

After the rules are imported into the OSSIM correlation analysis engine, they are displayed as shown in the figure below.

With the preparation done, start testing. Start medusa on the attacking machine (Kali 2021, 192.168.183.158) to test the SSH port opened by the target machine (commands omitted). Then log in to the target machine (192.168.183.139) to view the SSH log:

The traditional way of analyzing these logs is time-consuming and laborious, and often cannot pinpoint the fault in time. Now let's look at the events after OSSIM has normalized them, as shown in the following figure.

Results of correlation analysis:

From the figure above, we can see that the risk value is 3, and the event can clearly be characterized as a brute-force attack on the SSH service. This process is completed automatically. If the rules are written correctly, no manual intervention is needed and the alarm is raised automatically.

A video is clearer than a thousand words: the following video is a practical clip of log correlation analysis.

https://mp.weixin.qq.com/s/BnedVIy8h4eCf80dNJG_Gw?source&ADUIN=1989365079&ADSESSION=1650851489&ADTAG=CLIENT.QQ.5887_.0&ADPUBNO=27211

7. Conclusion

Writing rules from a blank sheet of paper is somewhat difficult for novices, but it is quite feasible to start from the default rules supplied with the system. Whether or not the model is perfect, use it first and then refine it step by step. At a suitable time, run penetration tests against the system and monitor how the SIEM responds: observe whether it generates the correct alerts and whether the alerts carry enough information for the person responsible to work out what threat has occurred. If not, the rules need to be modified and the thresholds tuned continuously.

This article has briefly introduced network log correlation analysis on the OSSIM platform, in the hope that it will be of some help in daily network security operation and maintenance.