Friday, May 30, 2014

OpenStack 05/30/2014 (p.m.)

  • What a giggle, public spying on NSA and its contractors using Big Data. The interactive network graph with its sidebar display of relevant data derived from LinkedIn profiles is just too delightful. 

    Tags: surveillance state, NSA, NSA-methods, spying-on-spies

    • NSA spies need jobs, too. And that is why many covert programs could be hiding in plain sight.

      Job websites such as LinkedIn and Indeed.com contain hundreds of profiles that reference classified NSA efforts, posted by everyone from career government employees to low-level IT workers who served in Iraq or Afghanistan. They offer a rare glimpse into the intelligence community's projects and how they operate. Now some researchers are using the same kinds of big-data tools employed by the NSA to scrape public LinkedIn profiles for classified programs. But the presence of so much classified information in public view raises serious concerns about security — and about the intelligence industry as a whole.

      “I’ve spent the past couple of years searching LinkedIn profiles for NSA programs,” said Christopher Soghoian, the principal technologist with the American Civil Liberties Union’s Speech, Privacy and Technology Project.

    • On Aug. 3, The Wall Street Journal published a story about the FBI’s growing use of hacking to monitor suspects, based on information Soghoian provided. The next day, Soghoian spoke at the Defcon hacking conference about how he uncovered the existence of the FBI’s hacking team, known as the Remote Operations Unit (ROU), using the LinkedIn profiles of two employees at James Bimen Associates, with which the FBI contracts for hacking operations.

      “Had it not been for the sloppy actions of a few contractors updating their LinkedIn profiles, we would have never known about this,” Soghoian said in his Defcon talk. Those two contractors were not the only ones being sloppy.

    • “I was, like, huh, maybe there’s more we can do with this — actually get a list of all these profiles that have these results and use that to analyze the structure of which companies are helping with which programs, which people are helping with which programs, try to figure out in what capacity, and learn more about things that we might not know about,” McGrath said.

      He set up a computer program called a scraper to search LinkedIn for public profiles that mention known NSA programs, contractors or jargon — such as SIGINT, the agency’s term for “signals intelligence” gleaned from intercepted communications. Once the scraper found the name of an NSA program, it searched nearby for other words in all caps. That allowed McGrath to find the names of unknown programs, too.

      Once McGrath had the raw data — thousands of profiles in all, with 70 to 80 different program names — he created a network graph that showed the relationships between specific government agencies, contractors and intelligence programs. Of course, the data are limited to what people are posting on their LinkedIn profiles. Still, the network graph gives a sense of which contractors work on several NSA programs, which ones work on just one or two, and even which programs military units in Iraq and Afghanistan are using. And that is just the beginning.

    • And there are many more. A quick search of Indeed.com using three code names unlikely to return false positives — Dishfire, XKeyscore and Pinwale — turned up 323 résumés. The same search on LinkedIn turned up 48 profiles mentioning Dishfire, 18 mentioning XKeyscore and 74 mentioning Pinwale. Almost all these people appear to work in the intelligence industry.

      Network-mapping the data

      Fabio Pietrosanti of the Hermes Center for Transparency and Digital Human Rights noticed all the code names on LinkedIn last December. While sitting with M.C. McGrath at the Chaos Communication Congress in Hamburg, Germany, Pietrosanti began searching the website for classified program names — and getting serious results. McGrath was already developing Transparency Toolkit, a Web application for investigative research, and knew he could improve on Pietrosanti’s off-the-cuff methods.

    • Click on the image to view an interactive network illustration of the relationships between specific national security surveillance programs in red, and government organizations or private contractors in blue.
  • If you think this isn't a tool for some very serious research, check the short descriptions of the modules here. https://github.com/transparencytoolkit I'll be installing this and doing some test-driving soon. From the source files, the glue for the tools seems to be Ruby on Rails. The development roadmap linked from the last word on this About page is also highly instructive. It ranks among the most detailed dev roadmaps I have ever seen. Notice that it is classified by milestones with scheduled work periods, giving specific date ranges for achievement. Even given the inevitable need to alter the schedule for unforeseen problems, this is a very aggressive (not quite the word I want) development plan and schedule. And the planned changes look to be super-useful, including a lot of "make it easier for the user" changes.   

    Tags: transparency, transparency-tools, research-resources

    • About Transparency Toolkit


      We need information about governments, companies, and other institutions to uncover corruption, human rights abuses, and civil liberties violations. Unfortunately, the information provided by most transparency initiatives today is difficult to understand and incomplete. Transparency Toolkit is an open source web application where journalists, activists, or anyone can chain together tools to rapidly collect, combine, visualize, and analyze documents and data.

      For example, Transparency Toolkit can be used to get data on all of a legislator’s actions in congress (votes, bills sponsored, etc.), get data on the fundraising parties a legislator attends, combine that data, and show it on a timeline to find correlations between actions in congress and parties attended. It could also be used to extract all locations from a document and plot them on a map where each point is linked to where the location was mentioned in the document.

    • Analysis Platform

      On the analysis platform, users can add steps to the analysis process. These steps chain together the tools, so someone could scrape data, upload a document, crossreference that with the scraped data, and then visualize the result all in less than a minute with little technical knowledge. Some of the tools allow users to specify input, but when this is not the case the output of the last step is the input of the next.


      Tools

      Existing and planned Transparency Toolkit tools include include scrapers and APIs for accessing data, format converters, extraction tools (for dates, names, locations, numbers), tools for crossreferencing and merging data, visualizations (maps, timelines, network graphs, maps), and pattern and trend detecting tools. These tools are designed to work in many cases rather than a single specific situation. The tools can be linked together on Transparency Toolkit, but they are also available individually. Where possible, we build our tools off of existing open source software.


      Road Map

      You can see the plans for future development of Transparency Toolkit here.


Posted from Diigo. The rest of Open Web group favorite links are here.

Post a Comment