  • On How Computers Did (or Didn't) Break Science

    Recently I came across an article (How computers broke science – and what we can do to fix it by Ben Marwick) that argues that electronic computers are breaking science. Namely, computers are blamed for:

  • The National Academies of Science and Engineering and ACM Definitions of Reproducibility

    In November of last year, I attended the SC19 conference, which brings together an assortment of computer scientists, systems administrators, and vendors. One of the birds of a feather sessions I attended (The National Academies’ Report on Reproducibility and Replicability in Science: Inspirations for the SC Reproducibility Initiative) discussed the issue of reproducibility in great detail. The big news at this session was that several operationalizations of the concept of reproducibility, developed independently by the ACM (see here) and the National Academies (see here), were being harmonized across the two organizations.

  • A Bigger Threat than Unfriendly AI

    If you sample the discourse of futurists and transhumanists, you’ll quickly discover the following common talking point: if humanity develops a form of artificial intelligence that equals (via strong AI) or even exceeds (via a technological singularity) the intelligence that natural selection has endowed us with, what prevents this artificial intelligence from being actively hostile towards humanity? Such a so-called “unfriendly AI” would strive to blow us all to dust, much like Skynet from the Terminator film franchise.

  • Using the Gzip File Format as a Metadata Container

    It’s been a while since I last posted, but I’ve been itching to commit to words a few thoughts I’ve been kicking around regarding one particular approach to adding metadata to arbitrary file formats. In the distant past, I was heavily involved in a community that was developing a standard for enriching neuroimaging datasets with metadata. I’m not going to rehash all of the advantages that creating such a standard confers upon the neuroscience community, since others have done so extensively elsewhere, but the short of it is that the standard, the Brain Imaging Data Structure (BIDS), has done much to increase both the discoverability and reusability of fMRI/EEG/MEG datasets that otherwise would have remained “hidden” in a proverbial file drawer somewhere. If you’re interested in learning more about BIDS, there is a paper that goes into more detail. For this post, I’m going to focus on one particular technical choice that was made about how BIDS stores supplementary data (such as what instrumentation was used and how equipment was configured).

  • Meditations on the 'Archivability Crisis' in Science and the Long-Term Reproducibility of Scientific Analyses

    This post is a response to C. Titus Brown’s How I learned to stop worrying and love the coming archivability crisis in scientific software, informed both by Emulation & Virtualization as Preservation Strategies by David S. H. Rosenthal and past experiences attending Vintage Computer Festival East.

  • Why I Support the Common Workflow Language

    I’ve been wanting to write a post about Common Workflow Language (CWL) for a while now and, realizing that if I don’t do so now I likely never will, have decided to attempt to articulate why I support this project. For those who are unfamiliar with CWL, it is essentially a simple YAML-based syntax for expressing input-output relations between programs in a workflow. This is similar to the concept of piping inputs between commands in a Unix shell, or defining the steps needed to compile a program using a makefile. I’ve been following it sporadically since I stopped working in science, because it isolates the pipeline definition functionality of other flow-based tools used by scientists, such as Nipype or Galaxy, in a platform- and field-agnostic way.

  • 2018: A Digitization and Data Migration Odyssey

    Recently I journeyed into the hinterlands of upstate New York to visit my mother for the entire week of Memorial Day. This was partially to be a good son and keep my mother company, partially to escape the air and noise pollution of New York City for a world of grass and open spaces, and partially to help with another large family project: a general cleanup and decluttering of my mom’s house. Since my dad’s passing, it’s become increasingly obvious that my childhood home is too packed with odd objects and artifacts that needlessly complicate my mom’s life, and I wanted to do my part to get rid of some of those bits and pieces.

  • On the Use of Distributed Databases for File Format Identification

    A perennial issue in the field of digital preservation is how to unambiguously identify an incoming file that is being stored for long-term archival. The Unix file command uses magic numbers stored in a text file to determine what format a file is, but this text file might not be uniform across Unix/Linux installations in use by libraries, and it is tedious to maintain across multiple institutions. Additionally, DOS/Windows-based files rely on file extensions for identification.

  • Scientific Shower Thoughts - The Holocaust, Contextual Psycholinguistics and Holograms

    I recently came across an interesting article in the New York Times discussing the Holocaust, the increasing ignorance amongst members of my generation about certain key facts, and the looming issue wherein concentration camp survivors are dying off due to old age, making it impossible to continue to hear their stories firsthand. I myself was fortunate enough to hear from a local survivor, Helen Sperling, when I was in high school, and was always struck by the intimacy of being in the same room as someone who had lived through an indisputably horrific experience. My most vivid memory of Helen’s story was how her best friend rapidly came to perceive Helen as “dirty” due to her Jewishness (an event summarized here in some detail).

  • The Immortality of Writers

    I have a post in the works for this blog (I swear!) although it’s not quite ready yet. In the meantime, I’m going to leave a few words of wisdom that will hopefully inspire me to actually write:

  • Posts I Want To Write

    I’m using my inaugural post as a convenient index of topics I’d like to write about. Listed in no particular order, these are: