crs4.it
Open Source Software – OSS | CRS4
http://www.crs4.it/results/open-source-software
Center for Advanced Studies, Research and Development in Sardinia. High Performance Computing and Networks. HPC for Energy & Environment. Next Generation Sequencing Core Facility. Open Source Software - OSS. Some examples of CRS4 OSS projects: a very efficient Python MapReduce and HDFS API for Hadoop; a Hadoop-based suite of tools for processing NGS data; an Open Framework for Shared Activity Spaces. Other OSS projects to which CRS4 contributes.
marktaviner.com
Machine Intelligence at Speed: Some Technical or Platform Notes | marktaviner
https://marktaviner.com/2015/07/03/machine-intelligence-at-speed-some-technical-or-platform-notes
Machine Intelligence at Speed: Some Technical or Platform Notes. July 3, 2015. This post looks at some of the underlying technologies, tools, platforms and architectures that are now enabling ‘Machine Intelligence at Speed’. Speed as a concept is closely related to both Scale and Scalability. For my convenience, and to try to organise things, by this I mean applications that are built on or involve ‘Big Data’ architecture, tools and technologies. Here is the familiar Gartner Tech Hype Cycle.
bionics.it
Random links from the Hadoop NGS Workshop | Bionics IT
http://bionics.it/posts/hadoop-ngs-workshop
Random links from the Hadoop NGS Workshop. Share this on → Twitter. Posted on: 19 Feb '15. Some random links from the Hadoop for Next-Gen Sequencing workshop. Held at KTH in Kista, Stockholm in February 2015. UPDATE: Slides and Videos now available. By Big Data Genomics. Tweet by Frank Nothaft on common workflow def. Part of Global Alliance for . Another link is ga4gh.org. Does support multiple outputs etc. Black-box vs. White-box. Workflow dependency graph can be dynamically built up while you're running.
tom-e-white.com
Tom White: January 2015
http://www.tom-e-white.com/2015_01_01_archive.html
Problems worthy of attack prove their worth by hitting back. —Piet Hein. Friday, 16 January 2015. Some of the largest datasets are generated by the sciences. For example, the Large Hadron Collider produces around 30PB of data a year. I'm interested in the technologies and tools for analyzing these kinds of datasets, and how they work with Hadoop, so here's a brief post. Amazon S3 seems to be emerging as the de facto. Hosts a 200TB dataset on S3. Notebooks have been around in the scientific community for a...
tom-e-white.com
Tom White: Hadoop for Science
http://www.tom-e-white.com/2015/01/hadoop-for-science.html
Problems worthy of attack prove their worth by hitting back. —Piet Hein. Friday, 16 January 2015. Some of the largest datasets are generated by the sciences. For example, the Large Hadron Collider produces around 30PB of data a year. I'm interested in the technologies and tools for analyzing these kinds of datasets, and how they work with Hadoop, so here's a brief post. Amazon S3 seems to be emerging as the de facto. Hosts a 200TB dataset on S3. Notebooks have been around in the scientific community for a...
github.com
Release 0.16.0 · bigdatagenomics/adam · GitHub
https://github.com/bigdatagenomics/adam/releases/tag/adam-parent-0.16.0
Feb 18, 2015. 389 commits to master since this release. ADAM IS BETA SOFTWARE AND SHOULD NOT BE USED IN PRODUCTION ENVIRONMENTS. We are working very hard toward a 1.0 release that is production ready. Documentation can be found attached below in both PDF and HTML format. Release Artifacts on Maven Central. To get started: 1. Download your favorite pre-built Apache Spark 1.2.0 release. 2. Extract the pre-built Apache Spark binaries to a location on your computer. 3. Set an environment variable.
marktaviner.com
Uncategorized | marktaviner
https://marktaviner.com/category/uncategorized
Machine Intelligence at Speed: Some Technical or Platform Notes. July 3, 2015. This post looks at some of the underlying technologies, tools, platforms and architectures that are now enabling ‘Machine Intelligence at Speed’. Speed as a concept is closely related to both Scale and Scalability. For my convenience, and to try to organise things, by this I mean applications that are built on or involve ‘Big Data’ architecture, tools and technologies. Here is the familiar Gartner Tech Hype Cycle.
uppsala-bioinformatics.se
Thoughts | … on bioinformatics and technology
http://uppsala-bioinformatics.se/thoughts
Genomics data – distributed and searchable by content. Recently there has been a lot of interesting activity around MinHash sketches on Titus Brown's blog. To quickly recap: a MinHash sketch is a data structure that allows approximate set comparisons in constant memory. In essence, it stores a number of outputs from hash functions, e.g. 1000 such hashes, and then compares two sets by comparing the overlap of the hashes rather than the overlap of the original sets. Combining these two technolog...
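The MinHash recap above can be illustrated with a minimal Python sketch. This is not the code from the post or from any particular MinHash library; the function names (`minhash_sketch`, `estimate_jaccard`, `kmers`), the seeded BLAKE2b hashing, and the default of 128 hash functions are all assumptions chosen for illustration. It uses the "one minimum per hash function" variant: the sketch has a fixed size regardless of how large the input set is, and the fraction of matching positions estimates the Jaccard similarity of the two sets.

```python
import hashlib


def _seeded_hash(item: str, seed: int) -> int:
    """Hypothetical helper: a 64-bit hash of `item` under hash function number `seed`."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_sketch(items, num_hashes=128):
    """Return a fixed-size sketch: the minimum hash value of the set under each hash function."""
    unique = set(items)
    if not unique:
        raise ValueError("cannot sketch an empty set")
    return [min(_seeded_hash(x, seed) for x in unique) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """Estimate Jaccard similarity as the fraction of positions where the two sketches agree."""
    if len(sketch_a) != len(sketch_b):
        raise ValueError("sketches must be built with the same number of hash functions")
    matches = sum(1 for a, b in zip(sketch_a, sketch_b) if a == b)
    return matches / len(sketch_a)


def kmers(sequence, k=4):
    """Split a sequence into its set of overlapping k-mers (an illustrative way to build sets)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


if __name__ == "__main__":
    a = minhash_sketch(kmers("ACGTACGTTGCAACGTAGCT"))
    b = minhash_sketch(kmers("ACGTACGTTGCAACGTAGGA"))
    print(f"estimated similarity: {estimate_jaccard(a, b):.2f}")
```

More hash functions give a tighter estimate at the cost of a larger sketch; comparing two sketches never requires access to the original sets.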
amplab.cs.berkeley.edu
Software | AMPLab – UC Berkeley
https://amplab.cs.berkeley.edu/software
AMP Lab – UC Berkeley. BDAS, the Berkeley Data Analytics Stack, is an open source software stack that integrates software components being built by the AMPLab to make sense of Big Data. BDAS consists of the components shown below. Components shown in Blue or Green are available for download now. Click on a title to go to that project’s homepage. HDFS, S3, Ceph. In addition to BDAS, the AMPLab has released additional software components useful for processing data. The Mesos meetup group.