tiflolinux.org - GNU Social
  • Login

Bienvenido

  • Public

    • Public
    • Network
    • Groups
    • Popular
    • People

Conversation

Notices

  1. Pierre Lindenbaum (yokofakun@genomic.social)'s status on Friday, 23-Dec-2022 12:50:34 CET Pierre Lindenbaum Pierre Lindenbaum

    https://www.biorxiv.org/content/10.1101/2022.12.21.521373v1 "BioNumPy: Fast and easy analysis of biological data with #Python" "BioNumPy is able to efficiently load biological datasets (e.g. FASTQ-files, BED-files and BAM-files) into NumPy-like data structures, so that NumPy operations like indexing, vectorized functions and reductions can be applied to the data. " #bioinformatics

    In conversation Friday, 23-Dec-2022 12:50:34 CET from genomic.social permalink

    Attachments

    1. BioNumPy: Fast and easy analysis of biological data with Python
      Python is a popular and widespread programming language for scientific computing, in large part due to the powerful array programming library NumPy, which makes it easy to write clean, vectorized and efficient code for handling large datasets. A challenge with using array programming for biological data is that the data is often non-numeric and variable-length (such as DNA sequences), inhibiting out-of-the-box use of standard array programming techniques. Thus, a tradition in bioinformatics has been to use low-level languages like C and C++ to write efficient code. This makes the tools less transparent to the average computational biologist - making them harder to understand, modify and contribute to. We here present a new Python package BioNumPy, which adds a layer on top of NumPy in order to enable intuitive array programming on biological datasets. BioNumPy is able to efficiently load biological datasets (e.g. FASTQ-files, BED-files and BAM-files) into NumPy-like data structures, so that NumPy operations like indexing, vectorized functions and reductions can be applied to the data. We show that BioNumPy is considerably faster than vanilla Python and other Python packages for common bioinformatics tasks, and in many cases as fast as tools written in C/C++. BioNumPy thus bridges a long-lasting gap in bioinformatics, allowing the same programming language (Python) to be used across the full spectrum from quick and simple scripts to computationally efficient processing of large-scale data. ### Competing Interest Statement The authors have declared no competing interest.

    Feeds

    • Activity Streams
    • RSS 2.0
    • Atom
    • Help
    • About
    • FAQ
    • TOS
    • Privacy
    • Source
    • Version
    • Contact

    tiflolinux.org - GNU Social is a social network, courtesy of tiflolinux.org. It runs on GNU social, version 2.0.1-beta0, available under the GNU Affero General Public License.

    Creative Commons Attribution 3.0 All tiflolinux.org - GNU Social content and data are available under the Creative Commons Attribution 3.0 license.