Academic publishing at its very best: "Our thesaurus is creative commons licensed, please fill out this form to get access." 🤯
https://www.ieee.org/publications/services/thesaurus-access-page
27.8.2025 20:37

"having considerable success exploiting publicly known common vulnerabilities and exposures (CVEs) and other avoidable weaknesses within compromised infrastructure [T1190].
Exploitation of zero-day vulnerabilities has not been observed to date."
So you're saying they're script kiddies and they're totally pwning the telecoms sector worldwide?
What about this is "advanced"?
https://www.ic3.gov/CSA/2025/250827.pdf
27.8.2025 18:28"having considerable success exploiting publicly known common vulnerabilities and exposures (CVEs) and other avoidable weaknesses within...And bonus side commentary for the Mastodon HOA:
There are other concerns with LLMs that this post doesn’t touch on for length reasons. Those include, but probably aren’t limited to: LLMs are trained on copyrighted data, including my books; environmental concerns over water and power; connection to and pride in work; workforce development; and software understanding. (Disclosure: I'm on IriusRisk's advisory board.)
(12/11)
26.8.2025 18:54

The only really effective tooling will come out of thoughtful investment, and how much it accelerates your program has to be evaluated relative to the threat modeling problem you want tooling to solve.
(11/11, full post, links, formatting at https://shostack.org/blog/mansplaining-your-threat-model-as-a-service/)
26.8.2025 18:54

If you get tremendous advantage from building your own LLM, then by all means, go for it. Otherwise, there’s a lot of fascinating commercial innovation. Exciting developments I’ve seen recently include Seezo, who has a very interesting approach to minimizing hallucinations; Prime Security, who I haven’t spoken to but who won the Black Hat Startup competition; and my colleagues at IriusRisk, who’ve built Jeff, which helps in diagram creation, and Bex, a conversational agent that interacts in Jira.
(10/11)
26.8.2025 18:54

As you get into build-versus-buy decisions about LLM development, the key question is “do we get enough overall business advantage from developing our own security LLM, versus other LLM or product work we could do?” When you develop (or tune) an LLM, you have the advantage that you can use your own threat models as part of your training data. This is most powerful if you have a lot of great threat models that you want to use to inform future work. Unfortunately, many orgs do not have a great collection of threat models.
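As a sketch of what using your own threat models as training data can look like, assuming instruction-style prompt/completion pairs in JSONL (a shape many tuning pipelines accept); the ThreatModel fields and prompt wording here are illustrative assumptions, not a specific vendor's format:

```python
# Sketch: convert a corpus of existing threat models into
# instruction-style fine-tuning pairs. JSONL with prompt/completion
# fields is a common tuning shape; the ThreatModel fields and the
# prompt wording are illustrative assumptions.
import json
from dataclasses import dataclass

@dataclass
class ThreatModel:
    system_description: str   # what we're working on
    threats: list[str]        # what can go wrong
    mitigations: list[str]    # what we're going to do about it

def to_training_pair(tm: ThreatModel) -> dict:
    """One training example: system in, threats and mitigations out."""
    return {
        "prompt": f"Threat model this system:\n{tm.system_description}",
        "completion": (
            "Threats:\n" + "\n".join(f"- {t}" for t in tm.threats)
            + "\nMitigations:\n" + "\n".join(f"- {m}" for m in tm.mitigations)
        ),
    }

def write_jsonl(models: list[ThreatModel], path: str) -> None:
    # One JSON object per line; most tuning pipelines accept this layout.
    with open(path, "w") as f:
        for tm in models:
            f.write(json.dumps(to_training_pair(tm)) + "\n")
```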
(9/11)
26.8.2025 18:53

My intuition is that as long as LLMs are “language models,” rather than “concept models,” they’ll be less good at threat modeling, and we’ll soon develop a better understanding of the limits of chain-of-thought, plan, expose-your-reasoning, and other techniques for helping LLMs get better at things that require more than simple token prediction. On the other hand, my intuition would not have anticipated all the things LLMs can do with token prediction.
(8/11)
26.8.2025 18:53

Adding RAG and tuning may dramatically improve the reliability of LLM-driven threat modeling, but “dramatically” is not a synonym for “sufficiently.” A recent paper, Potemkin Understanding in Large Language Models, raises a fascinating threat: “these benchmarks are only valid tests if LLMs misunderstand concepts in ways that mirror human misunderstandings. Otherwise, success on benchmarks only demonstrates potemkin understanding: the illusion of understanding driven by answers irreconcilable with how any human would interpret a concept.” It’s worth asking whether that model applies to LLM-driven threat modeling. I think it does. If it does, we need to ask what that means for when we use LLMs, how we evaluate their output, and what remains with the people, and my answer is: I don’t know.
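Since this leans on RAG, a minimal sketch of the pattern may help: retrieve the prior threat models most similar to the new system, then ground the prompt in them. The embed() and call_llm() functions are placeholders for whatever embedding model and LLM client you use, not a specific API.

```python
# Minimal RAG sketch: retrieve the prior threat model snippets most
# similar to the new system, then include them in the prompt as context.
# embed() and call_llm() are placeholders, not a specific vendor API.
import numpy as np

def embed(text: str) -> np.ndarray:
    raise NotImplementedError("wire up your embedding model here")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up your LLM client here")

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Return the k corpus entries nearest the query by cosine similarity."""
    q = embed(query)
    scored = []
    for doc in corpus:
        d = embed(doc)
        cos = float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))
        scored.append((cos, doc))
    return [doc for _, doc in sorted(scored, reverse=True)[:k]]

def rag_threat_model(system_description: str, prior_models: list[str]) -> str:
    # Ground the analysis in retrieved prior work rather than free recall.
    context = "\n---\n".join(retrieve(system_description, prior_models))
    prompt = (
        f"Reference threat models:\n{context}\n\n"
        f"Using the references above, analyze what can go wrong in:\n"
        f"{system_description}"
    )
    return call_llm(prompt)
```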
(7/11)
26.8.2025 18:53Adding RAG and tuning may dramatically improve the reliability of LLM-driven threat modeling, but “dramatically” is not a synonym for...It's worth looking at this academic paper, ACSE-Eval: Can LLMs threat model real-world cloud infrastructure?. They asked what's involved in assessing an LLM, and likely represents a person-year or more of effort. Some of that effort is re-usable, but it colors my thinking on the chatbot approaches to LLM threat modeling.
(6/11)
26.8.2025 18:53It's worth looking at this academic paper, ACSE-Eval: Can LLMs threat model real-world cloud infrastructure?. They asked what's involved in...We know that LLMs are very vulnerable to “pertubation attacks.” These use minor changes, not meaningful to humans, to dramatically alter the LLM’s behavior. These are important because interacting in models 1-4 inevitably invite accidental pertubations. Managing that requires that you develop tools, including evaluations, to test the tooling. It’s not a small investment.
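To make "accidental perturbations" concrete, here's a small stability-check sketch; call_llm() is again a placeholder for your client, and the perturbations are illustrative, not a test suite:

```python
# Sketch of a perturbation stability check: feed the model semantically
# equivalent variants of the same question and compare the answers.
# call_llm() is a placeholder for your LLM client, not a specific API.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up your LLM client here")

def perturb(prompt: str) -> list[str]:
    """Variants a human would read as the same question."""
    return [
        prompt,
        prompt + " ",               # trailing whitespace
        prompt.replace("?", " ?"),  # spacing before punctuation
        prompt.lower(),             # casing
    ]

def stability_report(prompt: str) -> dict[str, str]:
    """Map each perturbed prompt to the model's answer for review."""
    return {p: call_llm(p) for p in perturb(prompt)}

# If the answers differ materially across variants, your users will get
# different threat models depending on how they happen to phrase the ask.
```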
(5/11)
26.8.2025 18:53

A relatively straightforward way to start is with a tuned set of prompts, managed in a test system with evaluations; that’s going to do way better at analysis than randomly chatting with a bot. (Chatting with a bot has its charms, including focusing our attention in a way that a report doesn’t.) And if the chat is in a tool, it’s easier to use techniques like setting a low temperature to drive consistent answers. My skepticism over chatbots extends to the deeper LLM application patterns that use RAG, but those may change the equation pretty dramatically.
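A minimal sketch of "a tuned set of prompts, managed in a test system with evaluations," assuming a versioned prompt template run at temperature 0 and a tiny regression check; the template, the cases, and the call_llm() placeholder are all illustrative:

```python
# Sketch of a managed-prompt eval: a fixed prompt template, run at low
# temperature for consistency, checked against expected findings.
# call_llm() is a placeholder for your LLM client; the template and
# expectations are illustrative, not a recommended benchmark.

THREAT_PROMPT = (
    "You are assisting with threat modeling.\n"
    "System: {system}\n"
    "List what can go wrong, one finding per line."
)

def call_llm(prompt: str, temperature: float = 0.0) -> str:
    raise NotImplementedError("wire up your LLM client here")

# Each case: a system description and terms any answer should mention.
EVAL_CASES = [
    ("A login form that posts credentials over HTTP", ["credential", "TLS"]),
    ("An S3 bucket shared by all tenants", ["tenant", "access"]),
]

def run_eval() -> None:
    for system, must_mention in EVAL_CASES:
        answer = call_llm(THREAT_PROMPT.format(system=system), temperature=0.0)
        missed = [w for w in must_mention if w.lower() not in answer.lower()]
        status = "PASS" if not missed else f"MISSING {missed}"
        print(f"{system[:40]:40} {status}")
```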
(4/11)
26.8.2025 18:52

If people do read the output, we should ask if they can effectively judge it. That requires some skill in doing threat modeling work. LLMs are excellent at sounding confident, thus the title of this post. People tend to think that others wouldn’t speak confidently if they weren’t confident, and so LLM-displayed confidence is persuasive. Worse, LLM output is generally tedious, and so people get bored reading it. It’s a dangerous combination when we ask if people can evaluate it effectively.
In the first of this paired set of posts, I presented the following model of chatbots:
Standard chatbots
Standard chatbots with structured prompts, used manually (sketched in code after this list)
Security chatbots (Deep Hat)
Security chatbots, structured prompts (StrideGPT)
One time investment in RAG (etc) to provide structure
Ongoing product development effort
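To illustrate option 2 (structured prompts used manually with a standard chatbot), here's the shape of the idea, built on the four-question frame; the exact template is an illustrative assumption you'd tune and version yourself:

```python
# Sketch of a structured prompt used manually with a standard chatbot:
# instead of free-form chat, you paste a fixed template with slots.
# The template is illustrative; tune and version-control your own.

STRUCTURED_PROMPT = """\
We are threat modeling. Work through these four questions in order:
1. What are we working on? {system_description}
2. What can go wrong? Consider each STRIDE category per component.
3. What are we going to do about it? Propose mitigations per threat.
4. Did we do a good job? List assumptions and gaps in your analysis.
Answer as a numbered list, citing the component each threat applies to.
"""

print(STRUCTURED_PROMPT.format(
    system_description="A web app with a browser client, API, and Postgres DB"
))
```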
(3/11)
26.8.2025 18:52

The key question is: how can tooling effectively improve the security of delivered systems?
Let me start with accuracy. I am skeptical that we’re going to get decent threat modeling from chatbots anytime soon. There are a number of problems, including quality issues with training data and context windows/attention algorithms. Even with good structures, a lot of what’s called “intuition” in humans seems to be pattern recognition, and the patterns recognized may not align with the things that an LLM’s statistical model draws out. There’s also an anti-pattern: producing tedious pablum that no one wants to read.
(2/11)
New blog: Mansplaining your threat model, as a service
This is the second part of a short series. The first post looks at threat modeling tooling more broadly; this one is focused on LLMs in threat modeling.
It seems like you can’t turn around without another experiment in augmenting or replacing threat modeling with LLM tricks. It’s easy to fall into a Pollyanna or a Cassandra camp on this. Extremism doesn’t help us here; we should be both curious and cautious. It would be Pollyannaish not to worry about accuracy and completeness. Despite those concerns, when we compare LLM TM to not threat modeling at all, or to having untrained people threat model, using an LLM might deliver product security improvements.
(1/11, full post at https://shostack.org/blog/mansplaining-your-threat-model-as-a-service/)
26.8.2025 18:52

LLM support for threat modeling can fit into any of the tool types above (programmer, small team, or enterprise) or work in the general purpose tools. I don’t really want an LLM-enabled whiteboard, but here we are.
Scaling is a very common goal for tooling. If you’re thinking carefully about what I’m saying, you’ll see I’ve outlined at least two scaling challenges that tools can help address. The first is “more people threat modeling”; the second is “managing the process and outputs.” Being even more specific, more people threat modeling may involve help in creating diagrams, analyzing what can go wrong, or selecting or implementing mitigations. Being precise will make it easier to reach a goal and see that you’re reaching it.
(5/5, formatting, links and pix at https://is.gd/2CloIJ)
25.8.2025 16:53

When people talk about LLMs to help threat modeling, there are a few ways they might be working with them. Those are:
Standard chatbots
Standard chatbots with structured prompts, used manually
Security chatbots (Deep Hat (formerly Whiterabbit), StrideGPT)
Security chatbots, structured prompts
One time investment in RAG (etc) to provide structure
Ongoing product development effort
(4/5)
25.8.2025 16:52

As we consider threat modeling-specific tooling, we should consider who’ll use it. “Scaling” is a common goal, and usually involves more people threat modeling. Considering their learning curve and other elements of adoption can help us to think about what can go wrong and what we’re going to do about those things. Familiarity is valuable, but not the only value. It’s one of the things that keeps companies running on Excel long past the time when they should replace it.
I often use the metaphor of Excel versus SAP or Oracle Financials to illustrate the relationship between Microsoft’s TMT and IriusRisk. Microsoft TMT is the Excel in this metaphor. You can manage files and otherwise make it work as you grow, but it's not an enterprise tool with permissions, change management, approvals, project status, et cetera. There’s a tremendous amount of error-prone busywork in trying to scale Excel to running a business.
(3/5)
25.8.2025 16:52

This model foreshadows how LLMs fit in. (Spoiler: Any of these can add an LLM, which limits the value we get from talking about “AI supported threat modeling,” and which is why it’s helpful to start with this model of types of tools.)
Any threat modeling project or program has tools, even if they’re not specialized. Editors, drawing tools, and more get picked because they’re familiar and integrated into workflows, and they’re often insufficient because they don’t do things you hope they’d do, like analyze your diagram. But from whiteboards to Word to Miro, your existing tools can take you a long way — and leave you wanting more.
(2/5)
25.8.2025 16:52

New blog: Threat Modeling Tools https://is.gd/2CloIJ
People frequently ask me what threat modeling tooling they should use. My answer is always: The best threat modeling tool for you is the one that solves a specific problem that you can articulate. To help you articulate the problems, this is one part of a two-part series. The second post will dive deep into LLMs for threat modeling.
Threat modeling tools generally fall into four groups: programmer tools, small team tools, enterprise tools, and general purpose tools.
(1/5) https://is.gd/2CloIJ
25.8.2025 16:52

Some little Intel questions:
Does the US get a board seat? What's the board's responsibility to protect the interests of the United States?
22.8.2025 23:26