[Neurodebian-users] NIH Request for Information (RFI): Input on Development of Analysis Methods and Software for Big Data

Yaroslav Halchenko debian at onerussian.com
Thu Aug 8 17:01:35 UTC 2013


----- Forwarded message from "Glanzman, Dennis (NIH/NIMH) [E]" <dglanzma at mail.nih.gov> -----

Date: Thu, 8 Aug 2013 15:13:03 +0000
From: "Glanzman, Dennis (NIH/NIMH) [E]" <dglanzma at mail.nih.gov>
To: "comp-neuro at neuroinf.org" <comp-neuro at neuroinf.org>, "connectionists at mailman.srv.cs.cmu.edu" <connectionists at mailman.srv.cs.cmu.edu>
Cc: 
Subject: [Comp-neuro] Request for Information (RFI): Input on Development of Analysis Methods and Software for Big Data

   Request for Information (RFI): Input on Development of Analysis Methods
   and Software for Big Data

   --------------------------------------------------------------------------

   Notice Number: NOT-HG-13-014

   Key Dates
   Release Date: August 8, 2013
   Response Due Date: September 6, 2013

   Issued by
   National Human Genome Research Institute ([1]NHGRI)

   Purpose

   This Request for Information (RFI) is to solicit comments and ideas for
   the development of analysis methods and software tools, as part of the
   overall Big Data to Knowledge (BD2K) Initiative.  Specifically, this RFI
   solicits input on needs for software and analysis methods related to data
   compression/reduction, data visualization, data provenance, and data
   wrangling.

   Background

   Biomedical research is becoming more data-intensive as researchers are
   generating and using increasingly large, complex, and diverse datasets.
   This era of 'Big Data' in biomedical research taxes the ability of many
   researchers to release, locate, analyze, and interact with these data and
   associated software due to the lack of tools, accessibility, and
   training.  In response to these new challenges in biomedical research, and
   in response to the recommendations of the Data and Informatics Working
   Group (DIWG) of the Advisory Committee to the NIH Director
   ([2]http://acd.od.nih.gov/diwg.htm), NIH has launched the trans-NIH Big
   Data to Knowledge (BD2K) Initiative ([3]www.bd2k.nih.gov).

   The long-term goal of the NIH BD2K Initiative is to support advances in
   data science, other quantitative sciences, policy, and training that are
   needed for the effective use of Big Data in biomedical research.  (The
   term "biomedical" is used here in the broadest sense to include
   biological, biomedical, behavioral, social, environmental, and clinical
   studies that relate to understanding health and disease).  The term 'Big
   Data' refers to datasets that are increasingly large and complex, and
   that exceed the ability of currently used approaches to manage and
   analyze them.  "Big Data" is also meant to capture the opportunities and
   address the challenges facing all biomedical researchers in accessing,
   managing, analyzing and integrating large datasets of diverse data types. 
   Such data types may include imaging, phenotypic, molecular (including
   –omics), clinical, environmental, behavioral, and many other types of
   biological and biomedical data.  "Big Data" also includes data generated
   for other purposes (e.g. social media, search histories, cell phone data)
   when they are repurposed and applied to address health research
   questions.  Biomedical Big Data primarily emanate from three sources: (1)
   a small number of groups that produce very large amounts of data, usually
   as part of projects specifically funded to produce important resources for
   use by the research community at large, or large collections of electronic
   health records; (2) individual investigators who produce large datasets
   for their own projects, but which might be broadly useful to the research
   community at large; (3) an even greater number of investigators who each
   produce small datasets whose value can be amplified by aggregating or
   integrating them with other data.

   One of the DIWG recommendations was to support the development,
   implementation, evaluation, maintenance and dissemination of informatics
   methods and applications. NIH supports a wide range of bioinformatics and
   computational science through efforts such as the Biomedical Science and
   Technology Initiative funding opportunities and through programs supported
   by individual NIH institutes and centers.  NIH is now considering
   supporting the development of analytical methods and software tools and
   will focus initially on four targeted areas to begin to address critical
   current and emerging needs of the research community for using, managing,
   and analyzing more complex and larger data sets: data
   compression/reduction, visualization, provenance, and wrangling.

   An NIH BD2K Working Group charged with exploring the development of
   informatics methods and tools seeks input from the biomedical research
   communities on the four targeted areas listed above to ensure that
   research resources generated will have the highest impact and value to the
   research community. NIH has determined that guidance is needed from the
   broad scientific community in the following areas:

   Data Compression/Reduction
   While data compression is important in BD2K since it helps reduce resource
   usage, most compression techniques involve trade-offs among various
   factors, including the degree of compression, the amount of distortion
   induced and the computational resources required to compress and
   decompress the data.
   Data reduction aims to reduce the data volume more dramatically while
   also reducing the complexity/dimensionality of the data for easier
   analysis. It usually involves processing and/or reorganization of data to
   minimize redundancy, eliminate noise, and preserve signal and data
   integrity. 

   Data Visualization
   Data visualization permits researchers to communicate information through
   graphical and interactive means and enables them to explore and gain
   insight/knowledge from the data. The challenge in the Big Data era lies in
   interpreting complex, high-throughput data, especially in the context of
   other relevant, but often orthogonal, data. 

   Data Provenance
   Provenance of digital scientific data is useful for determining
   attribution, identifying relationships between objects, tracking back
   differences in similar results, guaranteeing the reliability of the data,
   and allowing researchers to determine whether a particular dataset can be
   used in their research (by providing lineage information about the data).

   Data Wrangling
   Data wrangling is a term that is applied to the conversion, formatting,
   and mapping of data that enables researchers to more easily submit data to
   a database and expose data to the internet, and that makes data more
   easily accessible and shareable. Researchers who generate datasets that, in
   aggregate, become "Big Data" often find it difficult to submit data, even
   when standards are well-established. Specialized informatics skills are
   often needed, for example, to format data, apply metadata, fill gaps, use
   ontologies, capture provenance, annotate features, and apply other
   functions to reformat, manipulate, transform, or process data.

   Information Requested

   To maximize the impact of these valuable research resources and tools
   (informatics methods and tools) and facilitate their use by scientists with
   a broad range of expertise, we seek input from scientific and informatics
   research and user communities in identifying and prioritizing needs and
   gaps in the four focus areas outlined above.

   Submitting a Response

   All responses must be submitted via email to [4]BD2KSoftware at mail.nih.gov
   by Friday, September 6, 2013.  Please include the Notice number in the
   subject line. Response to this RFI is voluntary. Responders are free to
   address any or all of the categories listed above. The submitted
   information will be reviewed by the NIH staff.

   This request is for information and planning purposes only and should not
   be construed as a solicitation or as an obligation on the part of the
   Federal Government. The NIH does not intend to make any awards based on
   responses to this RFI or to otherwise pay for the preparation of any
   information submitted or for the Government's use of such information.

   The NIH will use the information submitted in response to this RFI at its
   discretion and will not provide comments to any responder's submission.
   However, responses to the RFI may be reflected in future funding
   opportunity announcements. The information provided will be analyzed and
   may appear in reports. Respondents are advised that the Government is
   under no obligation to acknowledge receipt of the information received or
   provide feedback to respondents with respect to any information
   submitted.  No proprietary, classified, confidential, or sensitive
   information should be included in your response. The Government reserves
   the right to use any non-proprietary technical information in any
   resultant solicitation(s).

   Inquiries

   Please direct all inquiries to:

   Jennifer Couch, Ph.D.
   National Cancer Institute
   Telephone: 240-276-6210
   Email: [5]Jennifer_Couch at nih.gov
   Website: [6]http://bd2k.nih.gov/#sthash.i3bBBRHF.dpbs

   --------------------------------------------------------------------------

   [7]Weekly TOC for this Announcement
   [8]NIH Funding Opportunities and Notices

   --------------------------------------------------------------------------

   [9]NIH Office of Extramural Research
   [10]Department of Health and Human Services (HHS)
   [11]USA.gov - Government Made Easy

   NIH... Turning Discovery Into Health®

--------------------------------------------------------------------------------

                                        

 Note: For help accessing PDF, RTF, MS Word, Excel, PowerPoint, Audio or Video
                     files, see [12]Help Downloading Files.

    

References

   Visible links
   1. http://www.nhgri.nih.gov/
   2. http://acd.od.nih.gov/diwg.htm
   3. http://www.bd2k.nih.gov/
   4. mailto:BD2KSoftware at mail.nih.gov
   5. mailto:Jennifer_Couch at nih.gov
   6. http://bd2k.nih.gov/#sthash.i3bBBRHF.dpbs
   7. http://grants.nih.gov/grants/guide/WeeklyIndex.cfm?WeekEnding=08-09-13
   8. http://grants.nih.gov/grants/guide/index.html
   9. http://grants.nih.gov/grants/oer.htm
  10. http://www.hhs.gov/
  11. http://www.usa.gov/
  12. http://grants.nih.gov/grants/edocs.htm

_______________________________________________
Comp-neuro mailing list
Comp-neuro at neuroinf.org
http://www.neuroinf.org/mailman/listinfo/comp-neuro


----- End forwarded message -----

-- 
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate,     Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        


