Curating Historical Statistics Data on Baltic Countries in 1897-1939: Providing Data with Rich Metadata

29 Jan 2020

With the expansion of information and communication technologies and the more recent surge in computing power and big data, the research community has focused its attention on collecting and analyzing vast amounts of textual, numeric and visual data available in social media, business processes, health records, etc. At the same time, interest in other types of data has waned somewhat. To a certain extent this also applies to collections of historical data. To be sure, several databases provide information on manifold aspects of societies in historical perspective (in some cases spanning more than two centuries). Some of them provide raw data, visualizations and quite extensive metadata. However, historical statistics are, in general, rather sparse (data are available for a limited number of countries and for limited periods of time) and poorly documented (sources are described only in general terms). Historical researchers usually have to rely on data collected in the “Western world” or analyze rather short time series (covering only 30-50 years).

Historical statistics on the Baltic countries are rather scarce, for two main reasons. First, these countries were occupied by Russia, Germany and the Soviet Union for extended periods of time. Historical data are therefore simply not available for those periods, as the Baltic states did not exist as separate countries. However, this situation could be remedied by collecting sub-national data on the provinces and other territorial units of the occupying countries that coincide (more or less) with the boundaries of the currently independent Baltic states. Second, historical statistics were collected and published during the interwar period, when the countries gained independence; however, these data have not yet been digitized and so are not readily available for research purposes (in data table format). Again, this situation could be changed by researchers collecting and digitizing the available data sources.

With these two possible sources of historical data in mind, a project was initiated by Vilnius University researchers. One of the several activities in the project is the systematic collection of historical statistics on the comparative social and economic development of the Baltic States in 1897-1939 and their subsequent publication in the Lithuanian Data Archive for Social Sciences and Humanities (LiDA), hosted by the Kaunas University of Technology. Two major objectives were set for data collection and publishing: 1) user-friendly visualizations of time series and regional data, and 2) detailed documentation of all the data sources and cells in the table. Thus, in this poster we present an attempt to develop a data documentation model for historical statistics that includes metadata not only at the variable (column) level (which is quite usual for metadata standards such as DDI) but also at the data table cell level (as well as at the case (row) level). We also present an implementation of historical statistics data visualization (including detailed metadata) using Shiny applications.
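The three documentation levels described above (variable, case and cell) can be pictured with a minimal Python sketch. It is an illustration under stated assumptions and does not reproduce the model or the software used in the project: the names `Metadata`, `DocumentedTable`, `set_value` and `describe` are hypothetical, and all labels and numbers are placeholders rather than real statistics.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Metadata:
    """One documentation record (hypothetical structure)."""
    source: str      # where the information was taken from
    note: str = ""   # caveats, e.g. an estimate or a boundary change


CellKey = Tuple[str, str]  # (row label, column name)


@dataclass
class DocumentedTable:
    """A data table carrying metadata at the column, row and cell level."""
    columns: Dict[str, Metadata] = field(default_factory=dict)    # variable level
    rows: Dict[str, Metadata] = field(default_factory=dict)       # case level
    values: Dict[CellKey, float] = field(default_factory=dict)    # the data
    cells: Dict[CellKey, Metadata] = field(default_factory=dict)  # cell level

    def set_value(self, row: str, column: str, value: float,
                  meta: Optional[Metadata] = None) -> None:
        """Store a value; the row and column must already be documented."""
        if row not in self.rows or column not in self.columns:
            raise KeyError(f"undocumented row or column: {row!r}, {column!r}")
        self.values[(row, column)] = value
        if meta is not None:
            self.cells[(row, column)] = meta

    def describe(self, row: str, column: str) -> List[Metadata]:
        """Return the metadata applying to one cell, most specific first."""
        found = [self.cells.get((row, column)),
                 self.rows.get(row),
                 self.columns.get(column)]
        return [m for m in found if m is not None]


# Placeholder content only: none of these labels or numbers are real data.
table = DocumentedTable()
table.columns["population"] = Metadata("placeholder census volume")
table.rows["Region A, 1897"] = Metadata("placeholder yearbook",
                                        "province of an occupying country")
table.rows["Region B, 1923"] = Metadata("placeholder yearbook")

# A cell with its own source and caveat.
table.set_value("Region A, 1897", "population", 1000.0,
                Metadata("placeholder yearbook, p. 12", "estimate"))
# A cell documented only through its row and column.
table.set_value("Region B, 1923", "population", 2000.0)

for meta in table.describe("Region A, 1897", "population"):
    print(meta.source, "|", meta.note)
```

Keeping cell-level records in a separate mapping means that only cells needing their own source or caveat carry extra documentation, while every other cell falls back on its row and column records.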


Name & surname: Vaidas Morkevičius, Giedrius Žvaliauskas
Scientific Discipline / Research Area: Social Sciences/Sociology
Affiliation: Kaunas University of Technology, Vilnius University
Poster (PDF): histstatsbaltic_a2_hor_vm&gz.pdf (951.97 KB)