» Archive for February, 2008

More definitions

Monday, February 25th, 2008

Here are a couple more definitions for you all. These are not quite so security-specific, but are nevertheless quite applicable to information security. I was aware of both concepts, but didn’t know the exact terms for them.

First, the Dunning-Kruger Effect:

The Dunning-Kruger effect is the phenomenon wherein people who have little knowledge tend to think that they know more than they do, while others who have much more knowledge tend to think that they know less.

I always referred to this as “not knowing what they don’t know” or “doesn’t know enough to be ignorant.” Now that I’ve got a less-pejorative term, I can use it more ;-).

Second is the Lake Wobegon Effect:

Lake Wobegon effect is the human tendency to overestimate one’s achievements and capabilities in relation to others. It is named for the fictional town of Lake Wobegon from the radio series A Prairie Home Companion, where, according to the presenter, Garrison Keillor, “all the women are strong, all the men are good-looking, and all the children are above average.” In a similar way, a large majority of people claim to be above average; this phenomenon has been observed among drivers, CEOs, stock market analysts, college students, police officers and state education officials, among others. Experiments and surveys have repeatedly shown that most people believe that they possess attributes that are better or more desirable than average. The term is also used to describe a perceived tendency to treat children as “special” in order to boost their self-esteem, even though the children may only be average or even underperforming.

Like I said, I’d noticed both of these principles in practice, the first in security especially and the second in IT in general. I had a meeting earlier this morning, in fact, where I had to tell someone that even if they are a beautiful, unique snowflake, they are not so special that the rules don’t apply to them.

Happy Monday, everyone!

Monumentally stupid

Thursday, February 21st, 2008

This may be one of the worst ideas I’ve heard in a long time.

Cary Sherman of the RIAA…[is]…trying to convince other industries to step up and help the entertainment industry as well. His latest, as pointed out by Broadband Reports, is that one possibility would be for anti-spyware/anti-malware applications to also watch for the transfer of unauthorized copyright material. Sherman suggests that this would be one way to get around the question of people simply encrypting traffic to avoid ISP filters.

The original TechDirt piece does a fine job of explaining how it is not the job of others to break their products to help prop up a broken business model, and I wholeheartedly concur. As a general rule, if your business model needs people beyond your influence to change what they’re doing in a manner that’s not in their own best interest, then you’re the one with the broken model.

Fortunately, I think that the risk of this actually happening is close enough to zero that I can just laugh at the absurdity of it all and maybe have some fun batting it around like a cat with a toy mouse.

I mean, how much better an example could you provide of how not to solve a problem? Ignoring the fundamentally shifting business landscape for music (micro-targeting, the Internet breaking the radio+record company cartel, etc.) and instead trying to screw up the new distribution mechanisms is just silly.

All that tying an evil-and-unnecessary thing to an irritating-but-necessary thing (if you run Windows) does is reduce the effectiveness of the irritating-but-necessary thing, since you now create a strong disincentive for some of the most at-risk people (in this case, downloaders) to use the product.

BOTE analysis of DLP vs. full-disk encryption

Wednesday, February 20th, 2008

I did some Back-of-the-Envelope (BOTE) analysis yesterday to explain why I think that Digital Leakage Protection (DLP) is *not* where we need to be spending my company’s money right now. The overall analysis was much larger than this, but I did have a little lightweight numerical analysis which I found quite entertaining:

Using data from the notoriously-inaccurate-but-about-as-good-as-anything-else-out-there 2007 FBI/CSI study, I worked out that:
1) 194 respondents actually reported loss figures (derived by dividing the total loss by the average loss per respondent)

2) Two categories of identified losses could reasonably be argued to be preventable via DLP (assuming a number of other security management practices were in place):
- “all data losses but mobile devices”
- “Unauthorized access to information”
Totaling $6,727,700 in reported losses

3) Divided by 194 to get $34,678.87 (call it “under $35k”) in average losses per respondent.

Even when, just for grins, I decided to assume that only Large Enterprises (revenues > $1b/year, 36% of respondents) suffered data loss, the average annual loss only jumped to $96,330.18.

Not much of a business justification for a multi-million dollar product (and that’s just the technology–it ignores everything that has to come before and after to actually make it perform) for any enterprise without either a zero-tolerance for loss or extra-large business and/or regulatory risk associated with data leakage.
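The arithmetic above is simple enough to sketch in a few lines. This is only an illustration using the figures quoted in this post; the function and variable names are my own, and the "large enterprises only" case simply assumes that 36% of the 194 respondents absorbed all of the losses.

```python
# Figures quoted in the post (from the 2007 survey, as reported above).
DLP_LOSSES = 6_727_700          # total reported losses in the two DLP-relevant categories
RESPONDENTS = 194               # respondents who reported loss figures
LARGE_ENTERPRISE_SHARE = 0.36   # share of respondents with revenues > $1b/year


def average_loss(total_losses: float, respondents: float) -> float:
    """Average annual loss per respondent."""
    return total_losses / respondents


def average_loss_if_concentrated(total_losses: float, respondents: float, share: float) -> float:
    """Average loss if only `share` of the respondents suffered all of the losses."""
    return total_losses / (respondents * share)


per_respondent = average_loss(DLP_LOSSES, RESPONDENTS)
large_only = average_loss_if_concentrated(DLP_LOSSES, RESPONDENTS, LARGE_ENTERPRISE_SHARE)

print(f"Average per respondent:       ${per_respondent:,.2f}")
print(f"Large enterprises only (36%): ${large_only:,.2f}")
```

Swapping in a different loss total or respondent share is a one-line change, which is about all the rigor a back-of-the-envelope estimate deserves.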

Today, I decided to see how full-disk encryption of my laptops would stack up against the same analysis. Going back to the 2007 FBI-CSI survey, I came up with three categories of loss which would be addressed by disk encryption and remote wipe tools:
- Laptop or mobile hardware theft
- Theft of proprietary info from mobile device
- theft of confidential data from mobile device
Totalling $8,429,150 in reported losses

This gave me an annual average loss of $43,448.61 or $120,690 if I assumed a 100% weighting to large enterprises.
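For a side-by-side view, here is a rough sketch that runs both loss totals through the same calculation. The names are mine, not from the survey. One caveat: dividing the disk encryption total by exactly 194 gives roughly $43,449, within a dollar of the figure I quoted; I am assuming the small difference comes from rounding in the derived respondent count.

```python
RESPONDENTS = 194               # respondents who reported loss figures
LARGE_ENTERPRISE_SHARE = 0.36   # share of respondents with revenues > $1b/year

# Total reported losses addressed by each control, as quoted in the post.
losses = {
    "DLP": 6_727_700,
    "Full-disk encryption": 8_429_150,
}


def summarize(total: float) -> tuple[float, float]:
    """Return (average per respondent, average if only large enterprises lost data)."""
    per_respondent = total / RESPONDENTS
    large_only = per_respondent / LARGE_ENTERPRISE_SHARE
    return per_respondent, large_only


for name, total in losses.items():
    per_respondent, large_only = summarize(total)
    print(f"{name:22s} avg ${per_respondent:>12,.2f}   large-only ${large_only:>12,.2f}")

# How much more loss does disk encryption address than DLP?
ratio = losses["Full-disk encryption"] / losses["DLP"]
print(f"Disk encryption addresses about {ratio:.2f}x the reported losses of DLP")
```

By this crude measure, disk encryption addresses about a quarter more in reported losses than DLP does, before even considering the difference in cost and effort.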

Also on the upside, the list of supporting activities required for an effective rollout of full-disk encryption is a lot shorter. You just have to decide whose laptops get it and in what order, then do the deployments. The candidates can be pretty easily identified with nothing more than an org chart and some common sense (either “do all” or some picking & choosing: “HR? Yes. Sales? Yes. Media Relations? Probably not–we wish more people were reading the press releases,” etc.). What’s more, it’s only a few thousand devices and once it’s in place, the support and maintenance overhead is fairly minimal.

So when I start to look at my priorities, this becomes pretty much a no-brainer. DLP costs more, reduces risk less (including some specific, high-profile regulatory risks), is much harder to implement, much costlier to support, and at the end of all that, is less likely to actually make a difference in our losses (IMHO).

Definitions

Tuesday, February 19th, 2008

Since IanG has been wondering about it in comments, I thought I’d take a moment to follow up on the theme that Alex Hutton so nicely summarized in one of his comments on my post:

Governance and Compliance are priors for Risk Management, and not the other way around.

So, starting with my favourite data source, Wikipedia, Compliance is defined as:

conforming to a specification, standard or law that has been clearly defined.

Governance, similarly, is defined as:

In the case of a business or of a non-profit organization, governance relates to consistent management, cohesive policies, processes and decision-rights for a given area of responsibility. For example, managing at a corporate level might involve evolving policies on privacy, on internal investment, and on the use of data.

Governance and compliance are two sides of the same coin–compliance is about following the rules, governance is about making sure the rules are clearly, consistently defined and enforced.

I think that one mistake people tend to make is to confuse Risk Analysis, effectively the process of compressing the three axes of risk (impact and likelihood over time) into a single value, with Risk Management, the process of ensuring that risks are identified and kept at some desired level.

How is this accomplished? Well, first and foremost, we must define the level–that’s where the clearly defined policies, processes, etc. come in. Once we define the rules, we (try to) ensure that people are aware of them and following them, then enforce and update them over time.

When people are in compliance, they are implicitly at our accepted level of risk. If they get too far outside of tolerances, then we now have a risk that must be managed. But without knowing what our accepted level of risk is, we don’t know which risks can be accepted and which risks must be managed to that level.

Hence, Alex’s observation.

Measuring risk reduction

Monday, February 18th, 2008

Another thought on KPI #2, “Are we secure enough?”:

Once management agrees that the approach (tracking compliance, gaps, and exceptions, extrapolated for coverage) is valid, we can effectively calculate the cost-per-gap-closed of a particular mitigation approach.

I’ll use a trivial example to demonstrate what I mean.

Say I have 10 network-level exceptions related to systems on a particular network, say a production line in a factory. I want to mitigate the risk (really, partition the risk, but I’ll argue/assume that the effect on the aggregate network is mitigation). To do so, I need to demonstrate that firewalling off the network is not only effective, but also a cost-effective approach to the problem.

Suppose, also, that I know from my risk assessment efforts that I have 50 exceptions on the network and that 50% of my systems have been assessed for risk (based on best estimates from KPI #1, coverage and control). Extrapolating conservatively (the reality is that the assessed 50% of systems are probably somewhat better-than-average from a compliance perspective), I assume that I have at least 100 documented or potential exceptions.

Therefore, deploying the firewall to mitigate the risk of 10 of them will reduce risk in excess of tolerance by 10%. This means that I can now provide a cost-per-exception to mitigate of 1/10th the cost of the firewall.

If I have some estimate of the impact of the risk (lifted, say, from the BIA for the systems/applications), then I can determine if the firewall is a cost-effective approach to protect those systems, or if I need to come up with something cheaper. This also allows me to prioritize my risk reduction efforts to maximize efficiency, and also explain to others why I’ve ranked them in the order I have.
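The example above can be sketched as a few lines of four-function arithmetic. The exception counts and coverage come from the example; the firewall cost and the per-exception impact are purely hypothetical placeholders (in practice they would come from a vendor quote and the BIA), and the function names are my own.

```python
def extrapolate_exceptions(known_exceptions: int, coverage: float) -> float:
    """Scale known exceptions up to the whole environment, given assessed coverage (0-1)."""
    return known_exceptions / coverage


def cost_per_exception(control_cost: float, exceptions_mitigated: int) -> float:
    """Cost of the control divided by the number of exceptions it mitigates."""
    return control_cost / exceptions_mitigated


KNOWN_EXCEPTIONS = 50      # from the example
COVERAGE = 0.50            # 50% of systems assessed (KPI #1)
MITIGATED = 10             # exceptions addressed by the firewall

FIREWALL_COST = 50_000          # HYPOTHETICAL: not a figure from the post
IMPACT_PER_EXCEPTION = 20_000   # HYPOTHETICAL: would be lifted from the BIA

total_exceptions = extrapolate_exceptions(KNOWN_EXCEPTIONS, COVERAGE)
risk_reduction = MITIGATED / total_exceptions
unit_cost = cost_per_exception(FIREWALL_COST, MITIGATED)

print(f"Estimated total exceptions: {total_exceptions:.0f}")
print(f"Reduction in risk beyond tolerance: {risk_reduction:.0%}")
print(f"Cost per exception mitigated: ${unit_cost:,.0f}")
print("Cost-effective" if unit_cost < IMPACT_PER_EXCEPTION else "Look for something cheaper")
```

Running the same calculation for each candidate control gives a crude but defensible ranking by cost per exception mitigated.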

I’ve also managed to turn my risk assessment into dollars, and the dollar amounts all come from the people I’m managing risk for–no accusations by the “customer” of FUD’ing up my numbers, either.

So, no math that’s more complex than four-function arithmetic. It’s simple enough both to maintain over time and to explain to any half-way competent business or IT leader. What’s not to love? (I’m sure you’ll let me know in comments)

Zombie Preparedness

Friday, February 15th, 2008

It’s good to know that I’m not the only one who’s worried about zombies:

The U.S. Geological Survey in 2003 said there’s a 62 percent chance of a magnitude 6.7 or greater earthquake in the Bay Area in the next 30 years. I’d have to put a zombie invasion in the same time period somewhere around 90 percent. Make no mistake: Between the possibility of a rogue virus, an alien spore or there simply being no more room in hell, the dead are going to walk the Earth in your lifetime.

Not that our government is lifting a finger to help us prepare. The Federal Emergency Management Agency and the state Office of Emergency Services has detailed information for more than 15 different disasters on their Web sites - covering everything from wildfires to dams breaking - but not a single word about what a citizen should do during a zombie attack.

Can you tell it’s Friday?

KPI #2: How secure are we?

Thursday, February 14th, 2008

Ultimately, all security metrics are an attempt to answer the question of “how secure are we?”

But rather than killing myself trying to answer that question (the very fact that KPI #1 is relevant implies that my coverage & control is less than adequate), I’ll settle for targeting metrics that allow me to answer the question, “Are we secure enough?”

How much is “enough?” It depends. In the case of a corporation, “enough” is defined by its security policies. So I consider how we measure compliance to policies:

- Audit findings, which document non-compliance to policy, process, or standards (process and configuration compliance)
- Risk Assessments, which are effectively operational audits of the architectural decisions and work quality for an application (ensure IT teams are making design decisions which match the accepted level of risk)
- Documented Exceptions, which are where the application owners formally state that they’re ignoring policy (portion of the environment that’s knowingly non-compliant)
- Proportion of the IT environment participating in the policy regimen (coverage & control for extrapolation of above)

Now this does make one huge assumption–that our policies, procedures, standards and guidelines are, in fact, an accurate representation of the company’s risk tolerance. Assuring that to be the case is part of my job, which I do by measuring, analyzing, and doing any number of other activities.

From a KPI perspective, though, this is OK. It’s like the way that most drivers only care about the reading on their speedometer and maybe the gas gauge. They leave worrying about the rest of the things that go into making the car run to the engine’s computer and focus on getting the thing from A to B. They let my team be the engine’s computer, and just like with cars, it generally runs better that way.

And when it breaks, I’m also the mechanic they take it to to complain that it’s leaking oil, too slow off the line, or the air conditioning isn’t cold enough.

Getting back to what I can do with my KPI, though, if we want to go a little bit further, we can (and are going to) look at information such as how many Gaps identified through the risk assessment process are eventually closed versus converted into Exceptions as the application goes live. Initial Gaps are a good indicator of how {risk|security}-aware the system owners and implementers are. Go-live gaps (exceptions) are a good indicator of how serious they are about hitting the accepted level of risk as documented through policy.

Looking at the Gap-to-Exception flow through the projects’ System Development Lifecycle (SDLC), we now also have a first stab at a leading indicator of risk.
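As a rough illustration of the Gap-to-Exception flow, here is a minimal sketch. The project names and numbers are invented for the example, and the `ProjectGaps` structure is just one assumed way of recording the data, not a description of any real tracking system.

```python
from dataclasses import dataclass


@dataclass
class ProjectGaps:
    name: str
    initial_gaps: int         # gaps identified by the risk assessment
    go_live_exceptions: int   # gaps still open at go-live, documented as exceptions

    @property
    def closed(self) -> int:
        """Gaps that were fixed before go-live."""
        return self.initial_gaps - self.go_live_exceptions

    @property
    def conversion_rate(self) -> float:
        """Fraction of initial gaps that became exceptions (0.0 if there were no gaps)."""
        if self.initial_gaps == 0:
            return 0.0
        return self.go_live_exceptions / self.initial_gaps


# HYPOTHETICAL sample data, for illustration only.
projects = [
    ProjectGaps("order-portal", initial_gaps=12, go_live_exceptions=3),
    ProjectGaps("plant-scheduler", initial_gaps=8, go_live_exceptions=6),
    ProjectGaps("hr-reporting", initial_gaps=0, go_live_exceptions=0),
]

for p in projects:
    print(f"{p.name:16s} gaps={p.initial_gaps:2d} closed={p.closed:2d} "
          f"converted={p.conversion_rate:.0%}")

total_gaps = sum(p.initial_gaps for p in projects)
total_exceptions = sum(p.go_live_exceptions for p in projects)
overall_rate = total_exceptions / total_gaps if total_gaps else 0.0
print(f"Overall gap-to-exception rate: {overall_rate:.0%}")
```

A high initial gap count with a low conversion rate suggests a team that learns during the project; a high conversion rate is the early warning worth watching.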

When I combine this with the baseline level of control over the total environment, as measured by overall IT hygiene (primarily participation in centralized management (e.g. Active Directory), patch compliance, and basic secure configuration), we actually have a feel for “how secure am I?” which can be adjusted over time by setting the SLA’s for the various operations groups. This, in turn, means that we now have a metric that both resonates with Senior Leadership and can be translated into specific goals for the people doing the actual work–it passes both sides of the “so what?” test.

Additional metrics can be built by tuning the slicing & dicing of the information (sub-reports on “critical” applications, manufacturing, R&D, SoX, systems with some regulatory requirement, etc.), which we can then use to document actions which address those risks at a more macro level.

For example, I feel that we need to do some internal firewalling. No one disagrees, but thus far we have not been able to effectively document the expected benefit, which is important both to justify the effort and also to demonstrate how we expect it to improve things. Now, I expect to point to reduction of risk by addition of this compensating control, which will protect both the rest of the environment from manufacturing and the manufacturing environments from the rest of the environment.

This is a Good Thing since every manufacturing enterprise I’ve ever worked in has been (internally) famous for running outdated applications and platforms (NT4, anyone?), being unwilling to patch (“We know you have a patching maintenance window, but we use your patching window for other things!”), and being supremely high-impact if there is an outage–which they can measure to the minute (“If this factory goes down, we lose $4 million an hour!” Nice metric, btw–something that Andrew Jaquith also points out in his book). Similar rules and tales apply to the Mad Scientists in the R&D labs, too.

Eventually, if we agree that the model is valid, we might look at applying it to other environments, as well, based on agreement that risk is excessive and needs to be partitioned–we need only partition the measurement and agree accordingly.

It’s not the holy grail, but I’ll argue that it’s good enough for Senior Leadership to decide by, and in the KPI game, that’s all that really matters.

Search and Seizure

Friday, February 8th, 2008

When I read stories like this, I really begin to wonder if my country has gone irrevocably off the rails:

A few months earlier in the same airport, a tech engineer returning from a business trip to London objected when a federal agent asked him to type his password into his laptop computer. “This laptop doesn’t belong to me,” he remembers protesting. “It belongs to my company.” Eventually, he agreed to log on and stood by as the officer copied the Web sites he had visited, said the engineer, a U.S. citizen who spoke on the condition of anonymity for fear of calling attention to himself.

I guess that the Fourth Amendment is the latest member of the Bill of Rights to be put explicitly out-of-scope at airports.

The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no warrants shall issue, but upon probable cause, supported by oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized.

KPI #1: Knowing what we don’t know

Friday, February 8th, 2008

As G.I. Joe taught us many moons ago, knowing is half the battle.

To test this theory, I’ve been testing potential KPI’s by mentioning the issues or concerns that the potential KPI’s represent with relevant parties in various conversations. Hands-down, the one that has gotten the most interest is how little we actually know about our risk landscape.

From this exercise, it has become obvious to me that my first Security Metrics KPI must be related to Coverage and Control:

percentage of internal hosts which are centrally managed and protected.

No matter what else I might try to tell people about our risk profile, I look like either Chicken Little or a buffoon if I don’t know how much of the total enterprise I’m actually speaking knowledgeably about.

And while this is really more of a table which rolls up to the KPI, and also while we can debate what exactly is required to be “centrally managed and controlled,” we cannot manage what we cannot control, and as such anything which is outside the framework (even if it meets compliance without our help) doesn’t matter in this case.

To know the percentage, I need to know the total number of active nodes on the internal network. From there, I can begin to provide detail around what type or level of control I have over those hosts. Things like:
- Number of Windows hosts that are members of Active Directory
- Number of Windows hosts with centrally-managed anti-virus
- Number of Linux/Unix hosts which are managed by IT
- Number of hosts which are patched by IT (and in keeping with our patching SLA’s–but that’s another metric for another day)

We have a fair amount of AS/400 out there, too, but from a host count perspective, it’s small and it’s all centrally-managed. How well-managed is another deal entirely. There just isn’t anyone who says, “We can just order an iSeries and turn it in on the corporate card as a team dinner.”

But once I have this, I can provide not only my KPI, but also a measurable definition of what comprises it and from that, provide the operational roadmap of what must be done in order to achieve the necessary level of control for our network, given the stated risk tolerance for host security.
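To make the roll-up concrete, here is a minimal sketch of how the table could feed the KPI. Everything in it is an assumption for illustration: the `Host` fields, the sample inventory, and especially the rule for what counts as “centrally managed,” which is exactly the definition that remains open to debate.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    os: str                          # e.g. "windows", "linux"
    in_active_directory: bool = False
    managed_antivirus: bool = False
    managed_by_it: bool = False
    patched_by_it: bool = False


def is_centrally_managed(host: Host) -> bool:
    """ASSUMED definition: Windows needs AD + managed AV + IT patching;
    everything else needs IT management + IT patching."""
    if host.os == "windows":
        return host.in_active_directory and host.managed_antivirus and host.patched_by_it
    return host.managed_by_it and host.patched_by_it


def coverage_pct(hosts: list[Host]) -> float:
    """Percentage of active internal hosts that are centrally managed."""
    if not hosts:
        return 0.0
    managed = sum(1 for h in hosts if is_centrally_managed(h))
    return 100.0 * managed / len(hosts)


# HYPOTHETICAL inventory, for illustration only.
inventory = [
    Host("ws-001", "windows", in_active_directory=True, managed_antivirus=True, patched_by_it=True),
    Host("ws-002", "windows", in_active_directory=True),
    Host("lx-001", "linux", managed_by_it=True, patched_by_it=True),
    Host("lab-007", "linux"),
]

print(f"Windows hosts in AD: {sum(h.os == 'windows' and h.in_active_directory for h in inventory)}")
print(f"Hosts patched by IT: {sum(h.patched_by_it for h in inventory)}")
print(f"KPI #1 coverage:     {coverage_pct(inventory):.0f}%")
```

The per-attribute counts are the table; the single percentage is the KPI, and the gap between the two is the operational roadmap.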

I would like to be able to do something similar for “applications,” but that creates a couple of problems which we can’t actually solve right now. First, IT can’t provide me with the inventory data that I would need to provide an accurate assessment. Second, finding “applications” is much more difficult than finding hosts on the network and determining simple characteristics like operating system and domain membership.