By Eric Vandenbroeck
and co-workers
How To Deal With Classified Documents
In August
2016, the United States suffered one of history's most cataclysmic leaks of
classified information. An anonymous entity calling itself “the Shadow Brokers”
exposed an arsenal of cyberweapons that the National Security Agency had
developed in great secrecy. The intelligence community sprang into
damage-control mode. Because the NSA’s hackers rely on plausible
deniability, disclosing such clandestine tools and their connection to the U.S.
government meant that the agency would be forced to devise new ones. But there
was also a more pressing danger: with the source code for these powerful
weapons now published on the Internet, any unscrupulous actor could deploy
them. It was the digital equivalent of “loose nukes.”
Practically
overnight, cybercriminals repurposed the NSA’s proprietary exploits to
launch audacious ransomware attacks, ultimately shutting down millions of
computers worldwide and paralyzing thousands of private businesses, from an
auto plant in France to a chocolate factory in Australia. Foreign governments
took advantage of the tools, as well. North Korea used the NSA’s malicious
code to attack the British healthcare system, forcing hospitals to turn away
patients. Iran used it to target airlines in the Middle East. Russia used it
against Ukraine.
Even as these
cyber-assaults proliferated, officials in Washington had no idea who was
responsible for the breach. They did not know whether it was a foreign
intelligence service that had compromised the NSA’s vaunted digital
defenses or some disillusioned agency coder gone rogue. As if to compound the
government’s humiliation and alarm, the Shadow Brokers taunted the agency in a
series of online posts, mocking the investigation in playfully broken English:
“Is NSA chasing shadows?”
In 2017, The New
York Times reported that after 15 months of investigation, authorities
were no closer to an answer. If they have since managed to identify the
perpetrator, then that, too, remains classified. But the whole debacle
highlights the subtle Achilles’ heel of government classification.
The NSA is famously secretive; as the old joke has it, its initials
stand for “no such agency.” Yet here was a massive leak in which some of the
nation’s most closely guarded secrets were spilled out for the world to see.
Nor was this the only recent jumbo leak of highly classified material: there
was the 2017 leak of CIA hacking tools by an agency software
engineer, Joshua Schulte; the 2013 leak of surveillance programs by
an NSA contractor, Edward Snowden; and the 2010 leak of cables and
videos by an army private, Chelsea Manning.
This, as Matthew
Connelly lays bare in his new book, The Declassification Engine, is
the paradox of contemporary government secrecy. For decades, blue-ribbon panels
and incoming presidents have observed with surprising unanimity that
overclassification has grown out of control—and vowed to fix it. Yet every
year, more new documents are marked “top secret,” and more realms of official
activity are placed beyond the scrutiny of citizens, journalists, and even
Congress. In 2017, the federal government spent over $18 billion maintaining this
classification system, almost double what it spent five years earlier. But
precisely because so much government work now transpires behind a veil of
secrecy, it is necessary to grant clearances to an ever-larger cadre of federal
employees. Some 1.3 million Americans now hold top-secret clearances, roughly
double the population of the District of Columbia.
The math is
simple. Combine the vast dimensions of the classified world with the enormous
numbers of people who need access to it to do their jobs and factor in the
increasing ease of copying and transferring enormous volumes of digital
information. It seems almost certain that wholesale leaks of classified data
will continue. Decades of bad habits practiced by government agencies hooked on
classification undermine transparency and democratic accountability, and this
impulse to classify indiscriminately is often justified by invoking national
security. But as Connelly points out, when everything is secret, nothing is
secret: the “very size of this dark state . . . has become its own security
risk.”
If the dangers of
excessive government secrecy are so widely acknowledged, why has nothing been
done about them? Connelly suggests that the authority to classify has become a
cherished prerogative of government power—a tool used by presidents, generals,
and various chieftains of lesser fiefs to enshroud their decisions in mystery
and ward off scrutiny or accountability. Reform efforts founder in the face of
bureaucratic recalcitrance. But another challenge is the sheer volume of
restricted documents: because the government classifies more quickly than it
declassifies, the amount keeps growing yearly. How do you begin to declassify
all this information, and if you cannot, what becomes of the historical record?
In his book, Connelly proposes what might be an inspired solution—but only if
the government takes him up on it.
Open And Shut
Connelly is a
historian at Columbia University, where he runs the History Lab, a group that
focuses on applying data science tools to the problem of overclassification.
When one considers the full sweep of American history, he argues, widespread
classification is not just a betrayal of the United States’ founding principles
but also a relatively recent anomaly. The first century and a half of the
republic was characterized by “radical transparency,” Connelly contends: when
the nation was at war, it engaged in espionage and secrecy, but during
peacetime, these practices receded. The United States had no
permanent intelligence agency until the Office of Naval Intelligence was
created in 1882. As late as 1912, Woodrow Wilson could remark, while
campaigning for president, “There ought to be no place where anything can be
done that everybody does not know about.”
Connelly demonstrates
the degree to which this ideal of accountability was explicitly linked to a
tradition of record-keeping and publicly accessible archives. In 1853, long
before President Donald Trump took to flushing official papers down a
White House toilet, it was declared a felony to destroy any federal records. A
century and a half before WikiLeaks published purloined State Department
cables, the department began publishing such records, voluntarily disclosing
volumes of letters recently received through embassies abroad. In one poignant
anecdote, Connelly recounts that when construction began on the Pentagon in
1941, President Franklin Roosevelt anticipated that the postwar
military establishment would be too small to fill it—and would vacate the
building when the fighting stopped so that it could be repurposed as an annex
to the National Archives.
It did not pan out
that way. Indeed, the rise of the permanent defense bureaucracy and the
military-industrial complex in the immediate aftermath of World War II gave
birth to the juggernaut of official classification. Rather than roll back the
culture and institutions of secrecy that had prevailed during wartime,
the Truman administration institutionalized them with the advent of
the Cold War. The creation of the CIA and other intelligence agencies and
the secrecy surrounding the United States’ growing nuclear arsenal accelerated
the professionalization of the classified state. “Our present security system
is a phenomenon of only the past decade,” Senator Hubert Humphrey remarked in
1955. “We have enacted espionage laws and tightened existing laws; we have
required investigation and clearance of millions of our citizens; we have
classified information and locked it in safes. . . . We have not paused in our
necessary, though frantic, quest for security to ask ourselves: What are we
trying to protect, and against what?”
In theory, the
passage of time should enable Americans to look back at the ostensible
rationale offered for classifying various government activities and determine,
in retrospect, whether all that secrecy was justified. Connelly and his fellow
scholars at Columbia are engaged in this sort of enterprise. But such a project
is frustrated in practice by the slow pace of declassification. Reams of
important historical documents remain classified more than half a century after
the events they describe. Even as the government spends more money classifying more
documents each year, funding for declassification efforts has steadily eroded.
The federal government now budgets only about $100 million annually for
declassification. As
Connelly dryly notes, “The Pentagon spends four times that just on military
bands.”
But Connelly and his
colleagues have developed an innovative solution, studying the records the
government has unsealed to see what they reveal about the dynamics of official
secrecy. Over the last decade, his researchers have assembled the world’s
largest database of declassified documents. Drawing on big data and machine
learning tools, they have developed a series of techniques to analyze this
archive for patterns and anomalies. When Connelly suggests that in some corners
of the federal bureaucracy, the devotion to secrecy has evolved from a culture
into “a cult,” it might seem hyperbolic. But consider that when he undertook
this academic project—scanning the redactions in declassified documents in
search of lessons about the pathologies of overclassification—the project was
perceived to be sufficiently threatening that former government lawyers advised
him and his team that they could be accused of violating the Espionage Act.
The Leakers
It should be no
surprise that the gatekeepers of the classified world might feel defensive
about such an inquiry. Even the staunchest critics of overclassification
generally acknowledge that the government must maintain some secrets.
Reasonable people can disagree about whether the NSA should be
developing an arsenal of cyberweapons. Still, most observers would concede that
such an arsenal, if it exists, should not be freely accessible to the public.
The same goes for sensitive details associated with nuclear weapons or the
names of people spying for the United States. (In the case of covert assets’
identities, there are compelling grounds for maintaining such secrets even
decades after the conduct in question since prospective spies abroad will be
less likely to betray their countries if they believe that the details of their
betrayals may be automatically declassified a mere 20 years later.)
If official
classification had been carefully confined to these sorts of tailored
categories, it would never have blossomed into such a rampant problem. But the
basis for most classifications is less coherent. At some point early in that
postwar expansion of government secrecy, the authority to mark something
classified gave rise to a bureaucratic reflex. For any government officer
making a quick decision during a busy workday, the penalties for
underclassifying are quite salient, whereas penalties for overclassifying do not
exist. One way of accounting for how the nation got to this juncture is to look
at the incentive structure for that officer deciding whether to classify a
single document and extrapolate outward to all the other functionaries invested
with the power to deem something “secret” in all the other agencies every day
of every year over the last eight decades. The problem has assumed proportions
that can be difficult to comprehend. In a single year, 2012, U.S. officials
classified information more than 95 million times, or roughly three times per
second.
But that version of
the story—in which genuine national security imperatives merged with
bureaucratic path dependence and risk aversion and snowballed—is the benign
interpretation. For Connelly, who has scrutinized actual classification
decisions made over those eight decades, the real explanation points to
something more pernicious. Classification is an exertion of power, he argues.
As such, it has often been motivated not by the dictates of national security
but by considerations of raw political or bureaucratic leverage.
“It turns out that,
from the very beginning, what’s secret has been whatever serves the interests
of the president and all those around him who are invested in executive power,”
he writes. In any bureaucracy, the ability to render something secret becomes
an irresistible trump card—a way to evade oversight, tout parochial priorities,
and obscure shortcomings. “After conjuring the power of secrecy and setting it
loose, presidents found that it had a power all its own,” Connelly continues.
“Thousands more people, many career civil servants, began creating their
secrets and jealously protecting them, making it harder to identify and protect
what mattered to the president personally. At the same time, they could leak whatever
they liked, undermining the president’s ability to manage the news cycle.”
Connelly is particularly scathing about the role of military leaders, such as
Douglas MacArthur and Curtis LeMay, who “employed leaks and spin no less than
secrecy to protect their perquisites and push their agendas,” lobbying to
expand military spending and outright defying civilian authority. In 1978, he
notes, the Joint Chiefs of Staff stopped preserving notes from their meetings,
“as if America’s most senior military leadership were running a numbers racket,
committing nothing to paper.”
In a system where so
much information ends up classified, selective leaking is a safety valve for
when certain matters of national importance need to get out. The legal scholar
David Pozen has argued that the “leakiness” of the
executive branch is not a sign of institutional failure but, on the contrary, a
strategic adaptation to prevailing realities, one that enables an administration
to send “messages about its activities to various domestic and international
audiences without incurring the full diplomatic, legal, or political risks that
official acknowledgment may entail.” As William Daley, President Barack Obama’s
chief of staff, once admitted, “I’m all for leaking when it’s organized.”
Every White House has
regularly leaked sensitive and often classified information to the press.
Whereas penalties for rank-and-file employees who make unauthorized disclosures
are often severe, consequences for deliberate leaks by highly placed officials
are practically unheard of. Consider the contrast between Reality Winner,
the NSA contractor who leaked an intelligence report about Russia’s
interference in the 2016 election, and David Petraeus,
the CIA director and four-star general who shared several notebooks
full of highly classified information with his biographer (who was also his
mistress) and then lied to federal investigators about it. Winner was
sentenced to five years and three months in prison; Petraeus received two
years’ probation and a fine. Connelly invokes a quip by Sir
Humphrey Appleby of the BBC sitcom Yes Minister: “The
Official Secrets Act is not to protect secrets. It is to protect officials.”
Locked In The Archives
What is maddening
about the lack of progress on overclassification is that anybody who has given
the issue serious consideration would likely agree with the broad contours of
Connelly’s arguments. Nearly two decades have elapsed since the 9/11 Commission
concluded that too much classification could jeopardize national security.
“Secrecy, while necessary, can also harm oversight,” the report argued, adding
that the “best oversight mechanism” in a democracy is “public disclosure.” But
it is one thing to acknowledge the problem and quite another to do something
meaningful about it. Obama came into office vowing to create “the
most open and transparent administration in history.” Yet, in the end, as
Connelly points out, “he presided over exponential growth in classified
information.” (He also initiated more criminal prosecutions of leakers than all
his predecessors combined.) When outside groups have tried to pressure the
federal government into greater transparency, they have aroused staunch
resistance and occasionally retaliation. Connelly relates one galling story: in
the 1980s, after the National Security Archive, a nonprofit group affiliated
with George Washington University, filed Freedom of Information Act requests
and initiated lawsuits to uncover abuses of government power by the Reagan
administration and the FBI, the FBI responded by placing the National
Security Archive itself under surveillance.
Meanwhile, the
daunting tonnage of classified documents has compounded every year. Even those
who earnestly want to do something about the problem fear that it may simply
have become unmanageable. By one estimate, it will take 250 years at the
government’s current processing rate to respond to the backlog of Freedom of
Information Act requests at the George W. Bush Library alone. No effective
system exists to automate declassification, and the relevant federal
agencies lack the personnel and resources to review and redact billions of
classified documents manually. “If instead these records were withheld
indefinitely or destroyed, it would be impossible to reconstruct what officials
did under the cloak of secrecy,” Connelly points out. Thus, a problem that on
its face might seem like a dry technocratic riddle—with billions of new
classified documents generated every year and no scalable method for safe and
reliable declassification, what happens to the historical record?—assumes an
existential urgency. If the U.S. government is “not even accountable in the
court of history,” Connelly writes, “it truly is accountable to no one.”
As it happens,
Connelly has a solution. Because the aggregate volume of still-classified
information is so overwhelming, the only way to tackle it will be to employ the
wizardry of big data. By scanning hundreds of thousands of declassified
documents (some still redacted, others not), Connelly and his colleagues could
search for specific words, themes, and connections to identify areas of
particular sensitivity. Comparing redacted and unredacted versions of the same
declassified documents from a given period, they compiled a jokey “America’s
Most Redacted” list of names most frequently blacked out (including Congolese
Prime Minister Patrice Lumumba and Iranian Prime Minister Mohammad Mosaddeq,
both targets of CIA operations). They devised a series of technological methods
to rapidly sort through extensive archives and select documents that met
certain criteria. If such techniques were harnessed for the
declassification effort, they realized, it might be possible “to
train algorithms to look for sensitive records requiring the closest scrutiny and
accelerate the release of everything else.” This is the “declassification
engine” of the book’s title: an ingenious technical solution to an impossible
bureaucratic problem.
For the moment, the
machine is still in its infancy, with a beta version concocted by the History
Lab at Columbia as proof of concept. To date, it has only worked with material
that has already been declassified. But Connelly and his team wanted to improve
its capability and accuracy by pilot-testing it on historical classified information,
and for that, they needed government buy-in. This should not have been difficult to
obtain. After all, the federal government has paid much lip service to the idea
that overclassification has reached crisis proportions. Here was a way of
solving it that would be cost-effective, especially compared to engaging human
reviewers to manually process old classified material before releasing it to
the public.
So Connelly and his
band of data scientists and mathematicians went to Washington to plead their
case. They met with the State Department, the National Declassification Center,
the CIA, the Public Interest Declassification Board, and the Office of the
Director of National Intelligence. There was certainly interest. At the State
Department, which produces more than two billion emails yearly, one official
informed them that the need for the technology they were offering was
“frighteningly clear.” But the department had no money to authorize a pilot
program or fund their research. Someone suggested Columbia students could be
enlisted to work on the initiative and paid in course credit. “I was struck by
the notion that declassification could be treated as a kind of school project,”
Connelly writes.
His group ended up in
a meeting at the Intelligence Advanced Research Projects Activity, which has
been tasked with working with the National Archives to explore technological
solutions to the problem of overclassification. After listening to their pitch,
an IARPA official told the visitors that she had been trying for
years to build a similar engine—not to declassify, but to classify. She found
their ideas intriguing but explained that making technology to help review and
release classified documents would represent an “insufficient return on
investment.”
It is a dispiriting coda
to Connelly’s fascinating and urgent book. One hopes that he and his colleagues
will ultimately find other, more hospitable points of entry in the federal
government that would allow them to test and improve their declassification
algorithms with actual classified raw data. If you believe in the founding
principles of the American form of government, then the stakes could scarcely
be higher. Connelly recalls thinking after being shown the door at IARPA, “We
cannot assign a dollar value to democratic accountability.”