Preserving digital data is a costly and time-consuming task, especially when the amount of data keeps growing exponentially. A new group is tackling the problem with a two-year grant totaling $525,000 from the National Science Foundation and the Andrew W. Mellon Foundation.
Called the Blue Ribbon Task Force on Sustainable Digital Preservation, the group is holding meetings in Washington, D.C., this week to discuss what it means to have an economically sustainable model for digital preservation, and various ways to achieve this.
The group is expected to issue two reports, one at the end of this year and one in 2009. Brian Lavoie, an economist who is a research scientist at the OCLC Online Computer Library Center, and Francine Berman, director of the San Diego Supercomputer Center at the University of California at San Diego, are heading up the task force.
They said Monday that they hoped to have an outline for the first report by the end of this week. Ms. Berman said she was particularly concerned about the lack of attention paid to preserving data from research financed by the federal government.
The task force is working with the Library of Congress, Britain’s Joint Information Systems Committee, the Council on Library and Information Resources, and the National Archives and Records Administration.
Andrea L. Foster




8 Responses to Academic Group Convenes to Tackle Archiving of Digital Data
amahoney - September 19, 2011 at 8:37 am
With a teeny bit more work, you can cast out elevens, too: if my number is abcd, then take
d – c + b – a
to get the remainder you’d get if you divided by 11. For example, 1357 gives 7 – 5 + 3 – 1 = 4, and, as it happens, 1357 = 11 * 123 + 4. You can use this the same way as 9s. So if I’m adding 32189 + 87011, the first number is (9 – 8 + 1 – 2 + 3) = 3 modulo 11, and the second is (1 – 1 + 0 – 7 + 8) = 1 modulo 11, so my sum had better come out to 4. If I get 119200, I’m good, because (0 – 0 + 2 – 9 + 1 – 1) = -7. Wait, what’s that? It’s OK: just as, if you get something bigger than 9 above, you subtract 9 until you get a single digit, so, too, if you get something smaller than zero, you add 11 and get back home: 11 – 7 = 4.
This will catch different errors: if you’d transposed digits, 112900 is (0 – 0 + 9 – 2 + 1 – 1) = 7 modulo 11, *not* 4 — so you know it’s wrong.
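For readers who would rather let a machine do the alternating sum, here is a minimal Python sketch of the check described above. The function names are made up for illustration, and the sketch assumes non-negative whole numbers.

```python
def alternating_digit_sum_mod_11(n: int) -> int:
    """Remainder of n divided by 11, computed from the digits alone."""
    total = 0
    sign = 1  # the ones digit is added, the tens digit subtracted, and so on
    for ch in reversed(str(n)):
        total += sign * int(ch)
        sign = -sign
    # Python's % already folds a negative total such as -7 back to 4
    return total % 11


def sum_passes_elevens_check(a: int, b: int, claimed_sum: int) -> bool:
    """True if claimed_sum has the remainder that a + b must have modulo 11."""
    expected = (alternating_digit_sum_mod_11(a) + alternating_digit_sum_mod_11(b)) % 11
    return alternating_digit_sum_mod_11(claimed_sum) == expected


print(alternating_digit_sum_mod_11(1357))              # 4
print(sum_passes_elevens_check(32189, 87011, 119200))  # True
print(sum_passes_elevens_check(32189, 87011, 112900))  # False (transposed digits)
```

Passing the check does not prove the sum is right; it only means the error, if any, happens to be a multiple of 11.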
Why does it work? It’s a little messier, but the idea is similar. Suppose your number is
N = 1000 d3 + 100 d2 + 10 d1 + d0
as above. Then re-write as
N = (1001 d3 – d3) + (99 d2 + d2) + (11 d1 – d1) + d0
(note that 1001 = 11 * 143)
and re-combine:
N = (1001 d3 + 99 d2 + 11 d1) + d0 – d1 + d2 – d3
The bit in parentheses is a multiple of 11 and the rest gives the remainder. And it works even for larger numbers.
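One way to see why larger numbers behave the same way: each power of ten leaves a remainder of either 1 or 10 when divided by 11, and 10 is just -1 in disguise, so the signs alternate forever. The short Python sketch below (helper names are hypothetical) prints those remainders and brute-force checks the digit rule against ordinary division.

```python
def power_of_ten_remainders(count: int) -> list:
    """Remainders of 1, 10, 100, ... when divided by 11."""
    return [10**k % 11 for k in range(count)]


def remainder_from_digits(n: int) -> int:
    """Alternating digit sum, starting with a plus sign on the ones digit."""
    digits = [int(ch) for ch in reversed(str(n))]
    return sum(d if k % 2 == 0 else -d for k, d in enumerate(digits)) % 11


print(power_of_ten_remainders(6))  # [1, 10, 1, 10, 1, 10]

# Compare the digit rule with real division for every number up to 99,999.
assert all(remainder_from_digits(n) == n % 11 for n in range(100_000))
```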
Delighted to see a math-ish blog here!
Robert Talbert - September 19, 2011 at 10:48 am
Very cool. Thanks!
BTW, I’m researching a follow-up post to this about generalizing this to commutative rings. Turns out there was a neat paper from 1981 about this. Also, one can think about casting out different things if the number bases are different. So much math, so little time…
joneseagle - September 20, 2011 at 5:59 am
Growing up in the ’50s and ’60s, I kept hearing about this, but no one would ever tell me about it. Nice to see what it was in writing.
Makes sense.
mindnbodybuilding - September 20, 2011 at 8:02 am
Very clever!
lexalexander - September 20, 2011 at 8:33 am
I learned this method in fourth grade (around 1970), and it saved my butt on a number of occasions. But I hadn’t thought about it in years. I need to teach it to my kids.
katisumas - September 20, 2011 at 11:59 am
So neat! Much better than, in the case of an addition, simply subtracting one of the addends from the total or, in the case of a subtraction, doing the opposite…
But the way you describe numbers opens up a vista on their infinite complexities… thanks.
drburlbaw - September 21, 2011 at 9:41 am
I use a form of this when trying to figure out where I made an error when I get two different sums – balancing a checkbook is a good example, even when doing it electronically. If the difference is divisible by 9, I know the problem is a transposition – and the number of columns in the result tells me in which pair of columns to look for the transposition – tens/hundreds, etc.
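The reason this works: swapping two adjacent digits a and b changes the number by (a - b) × 9 × 10^k, where k is the position of the right-hand digit of the pair. A rough Python sketch of that hint follows. The function name is invented for illustration, and a difference divisible by 9 is only a strong hint of a transposition, since other errors can produce the same difference.

```python
def transposition_hint(difference: int):
    """Return (column, digit_gap) if the difference fits an adjacent swap, else None.

    column 0 means the ones/tens pair, 1 the tens/hundreds pair, and so on.
    digit_gap is how far apart the two swapped digits are.
    """
    d = abs(difference)
    if d == 0 or d % 9 != 0:
        return None
    quotient = d // 9
    column = 0
    while quotient % 10 == 0:  # each trailing zero moves one column left
        quotient //= 10
        column += 1
    if quotient > 9:  # two digits cannot differ by more than 9
        return None
    return column, quotient


print(transposition_hint(119200 - 112900))  # (2, 7): hundreds/thousands, digits 9 and 2
print(transposition_hint(54 - 45))          # (0, 1): ones/tens, digits 5 and 4
print(transposition_hint(10))               # None: not divisible by 9
```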
Raphael - September 23, 2011 at 5:33 pm
The downside of choosing a coined phrase as blog name is that it makes you hard to google.