March 16, 2014 by yorksranter

Overview

I am really impressed both by this OpenNews post about how to tackle a huge pile of documents, and by the tools it recommends. After all:

What I received a month later from Nash County, N.C., were two boxes filled with thousands of printed pages of emails. Double-sided.

One of the problems Overview solves is this: your filesystem is usually very, very good at finding files, on all kinds of criteria, and fast – just look at any unix/linux find examples page – but that presupposes that the information you have is broken out into files whose boundaries map roughly to a logical structure within the underlying data. A scanned box of double-sided printouts is not that.
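To make the filesystem point concrete, here is a minimal sketch, loosely in the spirit of a find one-liner, of searching a directory tree by extension, size and modification time. The function name `find_files` and its parameters are made up for illustration; nothing here comes from Overview or the OpenNews post.

```python
import time
from pathlib import Path


def find_files(root, suffix=".txt", newer_than_days=None, min_bytes=0):
    """Return files under root matching a suffix, a minimum size and a maximum age.

    Hypothetical helper for illustration only; it does not reproduce
    find's exact semantics for -size or -mtime.
    """
    cutoff = None
    if newer_than_days is not None:
        # Anything modified before this timestamp is too old.
        cutoff = time.time() - newer_than_days * 86400

    matches = []
    for path in Path(root).rglob("*" + suffix):
        if not path.is_file():
            continue
        info = path.stat()
        if info.st_size < min_bytes:
            continue
        if cutoff is not None and info.st_mtime < cutoff:
            continue
        matches.append(path)
    return sorted(matches)
```

All of these criteria are per-file metadata, which is exactly why this only helps once each email or memo is already its own file.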

One of the best things is also the simplest: Overview has a feature that pulls a randomly selected sample of documents.
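The idea is simple enough to sketch outside Overview. What follows is a rough illustration of drawing a random sample from a folder of documents, assuming one document per file; `sample_documents` and its arguments are hypothetical names, and this is not how Overview itself implements the feature.

```python
import random
from pathlib import Path


def sample_documents(folder, k=20, seed=None):
    """Pick up to k files at random, without replacement, from folder.

    Hypothetical helper: assumes one document per file. Passing a seed
    makes the draw repeatable, which helps when sharing a sample.
    """
    # Sort first so the same seed gives the same sample on every run.
    docs = sorted(p for p in Path(folder).rglob("*") if p.is_file())
    rng = random.Random(seed)
    return rng.sample(docs, min(k, len(docs)))
```

Reading a couple of dozen randomly chosen documents is a cheap way to get a feel for what is in the pile before committing to a search strategy.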

The blog is crazy good, too. Interestingly, I remember IBM announcing their big investment in big data the other year and giving “Computational Journalism” as one of the use cases.

Did I say the blog was good? The blog is good.

