Presenting my research – neuroscience & my interest in RDA

27 Mar 2020

Since I am new to the RDA network, I wanted to present my research area, which is neuroscience. I am working as a postdoctoral researcher and senior teaching assistant at the Faculty of Medicine, Osijek, Croatia. My research interests are chronic stress, neurodegeneration and sex-specific differences in neuronal and metabolic diseases.

I am starting my first small independent scientific project, which tries to connect chronic stress to neurodegeneration through mechanisms beyond genetics: epigenetic modifications. This requires big data analysis, an area in which I am not very experienced, but I am determined to improve my knowledge.

I am also involved in a study of sex-specific regulation of diabetes, which shows different responses between the sexes and between two different interventions in an animal model of diabetes. One of the goals of the project is to develop a mathematical model for analyzing glucose and insulin tolerance tests, which are often used as clinical and preclinical tests for diagnosing diabetes. These mathematical models produce data that are suitable for open sharing within the medical and biomedical community, in order to develop better tests and more precise, personalized diagnostics for diabetic patients.


Name & surname: 
Marta Balog
Scientific Discipline / Research Area: 
Medical and Health Sciences/Basic medicine
Department of Medical Biology and Genetics, Faculty of Medicine, Osijek, Croatia
Poster_Marta_Balog.pdf (1.1 MB)
  • Author: Christian Pagé

    Date: 30 Mar, 2020

    Hi Marta,

    Thanks for this nice poster! It is very interesting from a scientific point of view.

    For the data and RDA related aspects, I have a few questions:

    1- What is the expected data volume? Is all the data stored locally too?

    2- In your community, do you already have some standards for data formats and metadata schema?

    3- Also, when processing data, do you record provenance and lineage information? If so, do you use W3C-PROV?


  • Author: Marta Balog

    Date: 31 Mar, 2020

    Dear Christian, 

    thank you for your interest in my poster and for your questions. I am very inexperienced in big data analysis – I just started a pilot project that is supposed to generate big data; therefore, I am sorry that I cannot answer all your questions in detail.

    Here are the answers:

    1. The data will come from a study performed in an animal model (rats). There were 80 brain samples in total. Each sample will give information on many different genes, which we are selecting right now. Each gene can show multiple sites of mutations or epigenetic changes. Since this is my first "big data" project, I am not sure about the data volume, but I expect it to be quite demanding. The study was divided into two parts, one performed locally and the other in Hungary.

    2. We usually collect data in Excel spreadsheets. However, the sequencing data from this project will be provided in the output formats of the company that will do the sequencing, so I don't know yet which formats those will be. The other data on the animals were already collected in classic Excel spreadsheets or simple .txt files that can easily be imported into R or Python for further analysis.

    3. Previously we had no such data, so we have not used W3C-PROV yet. For the above-mentioned project, data provenance and lineage were recorded in Excel spreadsheets. We are planning a project that will collect data from patients in different countries, and for that purpose we will record provenance and lineage more rigorously and probably use W3C-PROV.
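To make the W3C-PROV option mentioned above more concrete, here is a minimal sketch of what one provenance record could look like. It loosely follows the PROV-JSON layout (entities, activities, agents and the relations between them) using only the Python standard library. The file names, the activity name, the person identifier and the `ex:` namespace are all hypothetical placeholders, and the output has not been validated against any PROV tool.

```python
import json


def build_prov_record(raw_file, derived_file, activity, person):
    """Return a PROV-JSON-style dict for one processing step.

    All identifiers are hypothetical and live in the example namespace "ex".
    """
    raw = f"ex:{raw_file}"
    derived = f"ex:{derived_file}"
    act = f"ex:{activity}"
    agent = f"ex:{person}"
    return {
        "prefix": {"ex": "http://example.org/"},
        # The things being described: an input file and an output file.
        "entity": {raw: {}, derived: {}},
        # The processing step that turned one into the other.
        "activity": {act: {}},
        # The person responsible for the step.
        "agent": {agent: {}},
        # Relations between the items above.
        "used": {"_:u1": {"prov:activity": act, "prov:entity": raw}},
        "wasGeneratedBy": {"_:g1": {"prov:entity": derived, "prov:activity": act}},
        "wasDerivedFrom": {
            "_:d1": {"prov:generatedEntity": derived, "prov:usedEntity": raw}
        },
        "wasAssociatedWith": {"_:a1": {"prov:activity": act, "prov:agent": agent}},
    }


if __name__ == "__main__":
    record = build_prov_record(
        "animals_raw.xlsx", "animals_clean.txt", "cleaning", "researcher_1"
    )
    print(json.dumps(record, indent=2))
```

A record like this can be stored next to each derived file, so that the lineage currently kept in a spreadsheet becomes machine-readable.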

  • Author: Rob Hooft

    Date: 31 Mar, 2020

    Hi Marta,

    When reading the poster I was already wondering whether you had any specific experience with data interoperability, and this question is strengthened by looking at your earlier answer about using EXCEL!

    Excel spreadsheets, even if they have a header, are very bad at defining exactly what data is given. For example, a column may be labeled "Gender", but this (a) does not clarify whether the column is filled with m/f/unknown strings or with a 1/2 encoding, maybe even following an ISO standard, and (b) does not define what kind of gender is registered: chromosomal, self-reported, or characteristics at birth.

    Especially when you need to join data from different tables, you really need to know these details. Are these issues you are encountering? And if yes, how do you deal with them?
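The encoding problem described above can be illustrated with a small sketch. It assumes two hypothetical tables, one coding sex as m/f/unknown strings and one using the numeric codes of ISO/IEC 5218 (0 = not known, 1 = male, 2 = female, 9 = not applicable). The table names, column names and codebooks are invented for illustration; the point is only that each table's encoding is written down explicitly and harmonized before the join.

```python
# Hypothetical data dictionary: each source declares how it encodes sex.
SEX_CODEBOOKS = {
    "lab_a": {"m": "male", "f": "female", "unknown": "unknown"},
    # Numeric codes as defined in ISO/IEC 5218.
    "lab_b": {"0": "unknown", "1": "male", "2": "female", "9": "not applicable"},
}


def harmonize_sex(rows, source):
    """Translate the 'sex' column of each row into a shared vocabulary."""
    codebook = SEX_CODEBOOKS[source]
    harmonized = []
    for row in rows:
        raw = str(row["sex"]).strip().lower()
        if raw not in codebook:
            # Fail loudly instead of silently guessing a meaning.
            raise ValueError(f"{source}: unknown sex code {row['sex']!r}")
        harmonized.append({**row, "sex": codebook[raw]})
    return harmonized


def join_on_animal_id(left, right):
    """Inner-join two harmonized tables and check that they agree on sex."""
    right_by_id = {row["animal_id"]: row for row in right}
    joined = []
    for left_row in left:
        right_row = right_by_id.get(left_row["animal_id"])
        if right_row is None:
            continue  # animal missing from the second table
        if left_row["sex"] != right_row["sex"]:
            raise ValueError(f"sex mismatch for {left_row['animal_id']}")
        joined.append({**right_row, **left_row})
    return joined


if __name__ == "__main__":
    table_a = [{"animal_id": "R01", "sex": "M", "weight_g": 310}]
    table_b = [{"animal_id": "R01", "sex": 1, "glucose_mmol_l": 6.2}]
    merged = join_on_animal_id(
        harmonize_sex(table_a, "lab_a"), harmonize_sex(table_b, "lab_b")
    )
    print(merged)
```

The codebook only settles point (a), the encoding. Point (b), what kind of sex or gender was registered, still has to be documented in the metadata by whoever collected the data.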

  • Author: Marta Balog

    Date: 02 Apr, 2020

    Dear Rob, thank you for your questions and comments.

    You are completely right. So far, I have had no experience with data interoperability, and since my data were not so demanding (coming from one experiment), Excel spreadsheets worked out just fine. However, I am aware that adjustment will be necessary when switching to bigger data volumes. Our projects are evolving towards omics data, together with the equipment producing big data, so we have to adjust. That is why I am interested in RDA.
