An Attention-based Deep Structured Semantic Model
Yes! But no!!!
Let's read this article as Computational Design Science!
Data collection, the apparatus, AND THE TESTING AND VALIDATION OF THAT APPARATUS!!!
WHAT CAN WE LEARN FROM IT FOR OUR DISSERTATIONS/THESES??
(Note: this article could also be read from an AI/Neural Networks perspective!!!)
As a specific genre of design science, CDS aims to develop novel computational algorithms and methods to solve business and societal problems with significant impact;
"How can we create a system that does X?" vs. "Why does X happen?"
Fang, X., Hu, P. J., Chau, M., and Chen, H. 2025. "Computational Design Science: A Critical Information Systems Research Area Contributing to Artificial Intelligence and Data Science," (February 01, 2025). Available at SSRN: https://ssrn.com/abstract=5455094 or http://dx.doi.org/10.2139/ssrn.5455094.
Rai, A. 2017. “Editor’s Comments: Diversity of Design Science Research,” MIS Quarterly, (41:1), pp. iii– xviii.
| Category | Metadata | Description |
|---|---|---|
| Description | Exploit Name | Exploit name that defines its function and target |
| Description | Author Name | Name of the hacker who posted the exploit |
| Description | Post Date | Date when the exploit was posted |
| Description | Exploit Category | Major category an exploit belongs to |
| Operation | Targeted Platform | Specific platform the exploit targets |
| Operation | Common Vulnerabilities and Exposures (CVE) | Standardized representation of a vulnerability |
| Operation | Verified Exploit | Verified by the community that the exploit is operational |
| Content | Exploit Description | Natural language explanation of the exploit |
| Content | Exploit Discussion | Discussions between forum members |
| Content | Exploit Content | Raw exploit source code |
| Risk Level | CVSS Score | Number of Vulnerability Listings | Number Amenable for Text Analytics |
|---|---|---|---|
| Critical | 9.0 – 10.0 | 8,355 | 8,170 |
| High | 7.0 – 8.9 | 24,098 | 23,897 |
| Medium | 4.0 – 6.9 | 28,707 | 28,674 |
| Low | 0.1 – 3.9 | 3,163 | 3,163 |
| Informational | 0.0 – 0.0 | 22,696 | 0 |
| Total: | - | 87,019 | 64,104 |
Deep Structured Semantic Model (DSSM): processes input texts separately until the final embedding comparison. As a result, it cannot capture global relationships across input texts during training to improve overall matching performance.
ATTENTION IS ALL YOU NEED: Attention mechanisms can be customized to focus on entire input sequences or portions, depending on the data characteristics and/or network architecture;
Instead of always mapping a word to the same vector (embedding), the neural network's attention layer REFINES that vector according to the context of the text. The same word gets different embeddings depending on its position and its "neighbors" in the corpus.
Pre-processing: All exploit and vulnerability names are stemmed, lowercased, and have stop words removed. Implementing these steps normalizes irregularities (e.g., capitalization) and follows common practice for hacker forum analysis;
Word Hashing: letter trigrams are extracted from pre-processed text;
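As a concrete illustration of the word hashing step, the sketch below extracts letter trigrams from a single pre-processed token, using '#' as a word-boundary marker as in the original DSSM; the function name and example token are ours, not the paper's.

```python
# Minimal word-hashing sketch: wrap the token in boundary markers and slide a
# three-character window over it to obtain its letter trigrams.
def letter_trigrams(token):
    marked = "#" + token + "#"  # '#' marks word boundaries, as in the original DSSM
    return [marked[i:i + 3] for i in range(len(marked) - 2)]

print(letter_trigrams("openssl"))
# ['#op', 'ope', 'pen', 'ens', 'nss', 'ssl', 'sl#']
```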
Bi-LSTM Processing: The standard DSSM uses a bag-of-trigrams representation of input texts and therefore does not capture sequential dependencies within text. Each Bi-LSTM time-step processes a letter trigram sequentially in both the forward and backward directions;
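A minimal Keras sketch of this step, assuming trigrams have already been mapped to integer indices; the vocabulary size, embedding dimension, and layer width are illustrative values, not the paper's hyperparameters.

```python
import tensorflow as tf

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 10000, 128, 30  # assumed, illustrative values

trigram_ids = tf.keras.Input(shape=(MAX_LEN,), dtype="int32")
embedded = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)(trigram_ids)
# return_sequences=True keeps one hidden state per trigram time-step, so the
# attention layers described next can attend over the whole sequence
hidden_states = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(64, return_sequences=True))(embedded)
bilstm_encoder = tf.keras.Model(trigram_ids, hidden_states)
```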
Context Attention Layer (Key/Query/Value + Scoring + Context Vector): operating in this fashion captures the relationships across exploit and vulnerability texts (i.e., global information) with the context vector, as well as information within the exploit texts (i.e., local information);
Self-Attention Layer: computes the attention weights assigned to each hidden state and summarizes the exploit text information according to the relationships across exploit and vulnerability texts and the relationships within the exploit texts;
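Both layers rest on the same mechanism; the sketch below is a generic scaled dot-product attention computation (queries, keys, values, scoring, softmax, context vector) in our own simplified notation, not the authors' exact layer definition.

```python
import numpy as np

def attention(Q, K, V):
    # score every query against every key, scale, and normalize with a softmax
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    return weights @ V  # weighted sum of values = context vectors

rng = np.random.default_rng(0)
exploit_hidden = rng.normal(size=(5, 128))  # 5 exploit trigram time-steps (illustrative)
vuln_hidden = rng.normal(size=(7, 128))     # 7 vulnerability trigram time-steps (illustrative)

# context attention: exploit states attend over vulnerability states (global information)
context = attention(exploit_hidden, vuln_hidden, vuln_hidden)
# self-attention: exploit states attend over themselves (local information)
summary = attention(exploit_hidden, exploit_hidden, exploit_hidden)
print(context.shape, summary.shape)  # (5, 128) (5, 128)
```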
DNN Processing with Shared Dense Layers: To facilitate embedding similarity calculation, we input both generated embeddings into shared dense layers to project them into the same embedding space;
Computing Embedding Similarity: Cosine similarity computes the similarity between the outputs of the previous layer. A softmax is used to obtain the conditional probability P(E|V) and, in the training phase, the loss is backpropagated to update network parameters using gradient-based methods;
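A small sketch of this final matching step, assuming the shared dense layers have already projected the vulnerability and the candidate exploit texts into the same embedding space; the smoothing factor gamma is borrowed from the standard DSSM formulation and its value here is only illustrative.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def p_exploit_given_vuln(vuln_emb, candidate_exploit_embs, gamma=10.0):
    # cosine similarity between the vulnerability embedding and each candidate exploit
    sims = np.array([cosine(vuln_emb, e) for e in candidate_exploit_embs])
    # softmax over the candidates yields P(E|V); gamma is a smoothing factor (assumed value)
    exp_sims = np.exp(gamma * sims)
    return exp_sims / exp_sims.sum()

rng = np.random.default_rng(1)
vuln = rng.normal(size=64)
exploits = rng.normal(size=(4, 64))  # four candidate exploits (illustrative)
print(p_exploit_given_vuln(vuln, exploits))  # probabilities summing to 1
```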
EVA-DSSM was implemented with the Keras, TensorFlow, Natural Language Toolkit (NLTK), numpy, pandas, gensim, and scikit-learn packages.
Coupling hacker exploit and vulnerability metadata based on EVA-DSSM’s output to create specialized severity (risk) scores can further create holistic CTI and facilitate enhanced device prioritization capabilities;
Device Vulnerability Severity Metric: encompasses the number of vulnerabilities in a device, each vulnerability’s severity, and the hacker exploit age for each vulnerability;
A device’s overall score is higher if it has more severe vulnerabilities or newer exploits for vulnerabilities.
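As a rough illustration only (the paper defines its own DVSM formula, which is not reproduced in these notes), the sketch below combines the three ingredients just listed: each vulnerability's CVSS severity is discounted by the age of its most relevant linked exploit, so more severe vulnerabilities and newer exploits both raise the device score. The 1/(1 + age) discount and the reference date are our assumptions.

```python
from datetime import date

def dvsm_sketch(vuln_exploit_pairs, today=date(2020, 1, 1)):
    """vuln_exploit_pairs: list of (cvss_severity, exploit_post_date) tuples for one device."""
    score = 0.0
    for severity, post_date in vuln_exploit_pairs:
        age_years = (today - post_date).days / 365.0
        # assumed recency discount, not the paper's exact formula
        score += severity / (1.0 + age_years)
    return score

# Hypothetical device with two linked exploit-vulnerability pairs
print(dvsm_sketch([(10.0, date(2014, 4, 8)), (5.0, date(2000, 11, 15))]))
```

Ranking devices by a score of this kind in descending order is what supports the prioritization use described later.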
| Feature Category | Feature | Justification for Inclusion | References |
|---|---|---|---|
| Vulnerability | Vulnerability severity (CVSS, 0.0–10.0) | A higher severity score indicates more severe consequences if the device is compromised. | Mell et al. 2007; Weidman 2014; Kennedy et al. 2011 |
| Vulnerability | Number of device vulnerabilities | Devices with more vulnerabilities have a higher exploit susceptibility. | |
| Hacker Exploit | # of exploits targeting vulnerabilities | More hacker exploits targeting a vulnerability increase the probability of harm to the device. | Friedman 2015; Robertson et al. 2017 |
| Hacker Exploit | Age of hacker exploits (i.e., forum post date) | Newer exploits are more valuable for CTI since there is less time to formulate defenses. | Shackleford 2016 |
Consistent with computational design science principles and DL fundamentals, we evaluated the proposed EVA-DSSM with three technical benchmark experiments: (1) EVA-DSSM vs Conventional Short Text Matching Algorithms, (2) EVA-DSSM vs Deep Learning-based Short Text Matching Algorithms, and (3) EVA-DSSM Sensitivity Analysis;
To validate the labels from the dataset, we recruited a security analyst from a well-known, international healthcare organization; We used Cohen’s kappa to compute the level of agreement between ratings;
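A minimal check of this inter-rater agreement using scikit-learn (already among the listed packages); the label vectors below are hypothetical placeholders for the dataset labels and the analyst's ratings.

```python
from sklearn.metrics import cohen_kappa_score

dataset_labels = [1, 0, 1, 1, 0, 1]   # hypothetical relevance labels from the dataset
analyst_labels = [1, 0, 1, 0, 0, 1]   # hypothetical labels from the security analyst
print(cohen_kappa_score(dataset_labels, analyst_labels))  # 1.0 = perfect agreement, 0.0 = chance level
```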
In this research, we employed three performance metrics that are commonly used to evaluate DSSMs: Normalized Discounted Cumulative Gain (NDCG); Mean Reciprocal Rank (MRR); and Mean Average Precision (MAP).
MRR is the “Frustration Metric”: If the first relevant result drops from rank 1 to rank 2, the score halves (from \(1\) to \(0.5\)). If it drops from 10 to 11, the score barely changes. It measures the “Instant Gratification” of your retriever.
MAP is the “Information Density Metric”: It tells you how “clean” your top results are. If your model returns 10 cases but only the 1st and 10th are relevant, MAP penalizes you heavily because the user had to sift through 8 irrelevant items.
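A quick check of that MAP intuition with a toy average-precision helper (our own function, not from the paper): only the 1st and 10th results relevant versus the 1st and 2nd.

```python
def average_precision(relevant_ranks, num_relevant):
    # precision at each relevant rank, averaged over all relevant items
    precisions = [(i + 1) / rank for i, rank in enumerate(sorted(relevant_ranks))]
    return sum(precisions) / num_relevant

print(average_precision([1, 10], 2))  # 0.6 -> heavily penalized for the gap
print(average_precision([1, 2], 2))   # 1.0 -> clean top results
```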
NDCG is the "Nuance Metric": the formula below accounts for the quality (graded relevance) of each match. It uses a logarithmic discount, meaning a "Perfect Match" at rank 4 is worth much less than a "Perfect Match" at rank 1.
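The standard NDCG@k formulation (the paper's notation may differ slightly) is:

\[
\mathrm{DCG@}k = \sum_{i=1}^{k} \frac{2^{rel_i} - 1}{\log_2(i+1)}, \qquad \mathrm{NDCG@}k = \frac{\mathrm{DCG@}k}{\mathrm{IDCG@}k},
\]

where \(rel_i\) is the graded relevance of the result at rank \(i\) and \(\mathrm{IDCG@}k\) is the DCG of the ideal (perfectly sorted) ranking.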
EVA-DSSM outperformed non-DL short text matching algorithms in NDCG (at all levels), MRR, and MAP.
These results suggest that EVA-DSSM's attention mechanisms, combined with feed-forward processing, backpropagation, and error correction, enable the model to identify finer-grained linguistic patterns within exploit and vulnerability names that benchmark methods miss.
The consistency of these issues across all four datasets indicates that simple matching approaches, while appearing to have some face validity for exploit-vulnerability matching due to overlapping technology names in exploit and vulnerability names, cannot capture the semantics or context of those technology names in the way EVA-DSSM can. (HERE THE AUTHOR SHOWS SOME INTERSECTIONS BETWEEN THE DATASETS.)
In Experiment 2, we evaluated the performance of EVA-DSSM against state-of-the-art DL-based short text matching algorithms;
Eleven models were selected for benchmarking; all models were evaluated based on MAP, MRR, and NDCG;
EVA-DSSM’s outperformance of both CNN and LSTM-based methods suggests that incorporating attention mechanisms can help capture global relationships and semantics across exploit and vulnerability short texts missed by prevailing approaches.
The base EVA-DSSM model, using letter trigrams, a one-layer Bi-LSTM, two dense layers, and the self-attention and context attention mechanisms, achieved the strongest performance;
Letter trigrams create a window large enough to capture the key three-letter acronyms that recur in exploit and vulnerability names;
When considering the Bi-LSTM sensitivity analysis, results indicated that using an LSTM that processes in only a single direction, rather than both directions, degraded performance. This is likely due to the nature of how sequential dependencies appear in exploit and vulnerability names;
Regarding the number of dense layers, performance increased with two layers as opposed to one, but the differences were negligible when a third layer was added. However, removing either attention mechanism from EVA-DSSM substantially reduced performance. This decrease was most pronounced when removing the context attention, where performance dropped by nearly 15% in some cases;
we used Nessus, a state-of-the-art vulnerability assessment tool, to discover the vulnerabilities of each device without port scanning and payload dropping. Scanning for vulnerabilities in this fashion has been noted in past literature to avoid adverse events;
After identifying vulnerabilities, EVA-DSSM determined the most relevant hacker exploit for each vulnerability;
After creating exploit-vulnerability links, we used the metadata from the exploit (post date) and vulnerability (CVSS score) for each exploit-vulnerability pair for each device. The DVSM score for each device is computed using these data. The final outputted DVSM values are ranked in descending order to help facilitate vulnerable device prioritization;
The exploit-vulnerability linkages identified by EVA-DSSM and the DVSM scores can offer cybersecurity experts an excellent starting point for their mitigation and remediation activities.
| Risk Level | Vulnerability Names (Severity) | Top Linked Exploit Name and its Post Date | # of Devices |
|---|---|---|---|
| Critical | “PHP Unsupported Version Detection” (10.0) | “phpshop 2.0 Injection Vulnerability” (1/14/2013) | 11 |
| Critical | “OpenSSL Unsupported” (10.0) | “OpenSSL TLS Heartbeat Extension - Memory Disclosure” (4/8/2014) | 7 |
| Critical | “Unix OS Unsupported Version Detection” (10.0) | “TCP/IP Invisible Userland Unix Backdoor with Reverse Shell” (6/30/2012) | 6 |
| High | “Multiple Apache Vulnerabilities” (8.3) | “Apache 2.4.17 - Denial of Service” (12/18/2015) | 17 |
| Medium | “HTTP TRACE / TRACK Methods Allowed” (5.0) | “traceroute Local Root Exploit” (11/15/2000) | 58 |
| Medium | “SSH Weak Algorithms” (4.3) | “OpenSSH attack DoS” (7/4/2010) | 55 |
I think that's it!
Thank you very much!!!
Metodologia Científica - Profa. Cristina - 2026.1