ecosort.de
EcoSort
http://www.ecosort.de/index.html
Energy-efficient sorting with SSDs. 365 Orte im Land der Ideen. Efficient sorting of data is of central importance for search engines and databases, and is therefore an important research topic in both theoretical and practical computer science. In 2008, servers and data centers in Germany consumed 10.11 TWh. EcoSort is the current world record holder in energy-efficient sorting in 4 categories. Location: H I and foyer in the Hörsaalgebä...
bandb.blogspot.com
Build and Break: Understanding MapReduce Performance: Part 1
http://bandb.blogspot.com/2011/04/understanding-mapreduce-performance.html
We do not really know how something works until we have taken it to pieces and put it back together again. Thus gaining knowledge involves breaking things. I deal mostly with software. Thursday, April 28, 2011. Understanding MapReduce Performance: Part 1. That their MapReduce was 100 times slower than their database system. Searching the web finds many people complaining. At the same time there is plenty of evidence that MapReduce is no performance slouch. The Sort Benchmark. Understanding MapReduce perf...
cs.cmu.edu
FAWN: A Fast Array of Wimpy Nodes
http://www.cs.cmu.edu/~fawnproj
FAWN is a fast, scalable, and energy-efficient cluster architecture for data-intensive computing. A FAWN cluster links together a large number of "wimpy" nodes built using energy-efficient processors and small amounts of flash memory into an ensemble cluster that can perform the same amount of work as a traditional cluster but at a fraction of the power. Source code for Basic FAWN-KV is available below. San Jose, CA. October 2012. Small Cac...
blog.techottis.ch
cmd line | FM Techottis
https://blog.techottis.ch/category/it/cmd-line
Tech stuff from FM. October 15, 2016. Backup with rsync a la time machine: a proof. I have been struggling with backups forever, I assume just like anybody. For my personal machines I used rsync in combination with my script taritdate.sh. Then I kept reading online about using rsync for incremental backups, but I could not find any simple example. So I wrote this little script: Continue reading →. October 2, 2014. Less unix, more linguistic and phonetics: a script to automate praat. Continue reading →. There are...
cutting.wordpress.com
Joining Cloudera | Free Search
https://cutting.wordpress.com/2009/08/10/joining-cloudera
Ramblings about Lucene, Nutch, Hadoop and other stuff. I will be leaving Yahoo! at the end of this month to join Cloudera. About five years ago I was working with Mike Cafarella. An open-source web-search engine. Initially we were able to crawl and index on four machines in parallel, but with a lot of manual steps. Inspired by two Google. We implemented a distributed filesystem and MapReduce. To help make this happen. Then we set out to improve scalability, performance,...
linkedbigdata.com
LinkedBigData: January 2015
http://www.linkedbigdata.com/2015_01_01_archive.html
Tuesday, January 27, 2015. Links: January 27, 2015. Quasar (Java library that provides high-performance lightweight threads, Go-like channels and Erlang-like actors). BTrace (Java application tracing tool that works without a restart, uses Java syntax, and has many intentional restrictions). RocksDB (embeddable low-latency key-value store by Facebook, based on Google LevelDB, used by LinkedIn Samza). Tig (text-mode interface for Git; see also hub). Gitlet (implementation of Git in JavaScript). Reverse Engineering for Beginners.