Work Package 8 - Data harmonization and linkage
The first objective of WP8 is to define the space of possible data joins and, where appropriate, default join behavior. The focus is on identifying relevant datasets (based on input from other WPs, mainly WPs 2-4) and the opportunities to merge them. We will compile a detailed inventory of relevant datasets, covering their geographical units, time periods, and the concepts they measure. We will then outline the opportunities to combine datasets and evaluate them based on (a) the overlap in the concepts measured; and (b) their theoretical usefulness and the specific research questions that can be answered by combining the data.
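As an illustration of how such an inventory could be screened for join opportunities, the sketch below lists every pair of datasets that shares at least one geographical unit and one year, together with the concepts both measure. It is a minimal sketch under stated assumptions, not the WP8 procedure: the `DatasetInfo` fields, the dataset names, and the inventory entries are all hypothetical, and it covers only criterion (a). Criterion (b), theoretical usefulness, requires researcher judgment and is not computed here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class DatasetInfo:
    """One inventory entry (all field names are illustrative assumptions)."""
    name: str
    geo_units: frozenset
    first_year: int
    last_year: int
    concepts: frozenset


def join_candidates(datasets):
    """Return every pair of datasets that overlaps in geography and time."""
    found = []
    for a, b in combinations(datasets, 2):
        shared_geo = a.geo_units & b.geo_units
        start = max(a.first_year, b.first_year)
        end = min(a.last_year, b.last_year)
        if not shared_geo or start > end:
            continue  # no common units or no common period: not joinable
        found.append({
            "pair": (a.name, b.name),
            "geo_units": sorted(shared_geo),
            "period": (start, end),
            "concepts": sorted(a.concepts & b.concepts),
        })
    return found


# Hypothetical inventory, invented purely for illustration.
INVENTORY = [
    DatasetInfo("news", frozenset({"NL", "DE"}), 2000, 2020,
                frozenset({"topic", "sentiment"})),
    DatasetInfo("manifestos", frozenset({"NL", "BE"}), 2010, 2022,
                frozenset({"topic", "position"})),
    DatasetInfo("surveys", frozenset({"FR"}), 2015, 2018,
                frozenset({"position"})),
]

candidates = join_candidates(INVENTORY)
for candidate in candidates:
    print(candidate)
```

In this invented inventory only the news and manifesto datasets qualify, because the survey dataset shares no geographical unit with the others.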
The second objective of WP8 is to define a flexible and extensible taxonomy of commonly used concepts (such as topics, frames, positions, and sentiment) and their operationalizations. Based on the possible data joins defined under the first objective, we will compile an inventory of how key concepts are measured, assess the degree to which different measurements result in data that cannot be matched, and define a procedure for those instances where concepts are measured differently yet can still be combined.
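One possible shape for such a taxonomy is sketched below: each concept keeps a registry of its operationalizations, and each operationalization knows how to map its values onto a common scale. This is a minimal sketch, assuming that the measurements to be combined are numeric and can be rescaled linearly to the interval from 0 to 1. The class names, the two sentiment scales, and the rescaling rule are hypothetical choices made for illustration, and measurements that are not registered are treated as unmatchable.

```python
from dataclasses import dataclass, field


@dataclass
class Operationalization:
    """A numeric measurement scale for a concept (illustrative)."""
    name: str
    minimum: float
    maximum: float

    def to_common_scale(self, value):
        # Linear rescaling to [0, 1]; assumes minimum < maximum.
        return (value - self.minimum) / (self.maximum - self.minimum)


@dataclass
class Concept:
    """A taxonomy entry that can be extended with new operationalizations."""
    name: str
    operationalizations: dict = field(default_factory=dict)

    def register(self, operationalization):
        self.operationalizations[operationalization.name] = operationalization

    def harmonize(self, operationalization_name, value):
        if operationalization_name not in self.operationalizations:
            # Unknown measurements are treated as unmatchable in this sketch.
            raise KeyError(
                f"{operationalization_name!r} is not registered for {self.name!r}"
            )
        scale = self.operationalizations[operationalization_name]
        return scale.to_common_scale(value)


# Hypothetical example: sentiment measured on two different scales.
sentiment = Concept("sentiment")
sentiment.register(Operationalization("polarity", -1.0, 1.0))
sentiment.register(Operationalization("five_point", 1.0, 5.0))

print(sentiment.harmonize("polarity", 0.0))
print(sentiment.harmonize("five_point", 3))
```

A neutral score on either scale maps to the same harmonized value, which is what makes the two measurements combinable under this assumption.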
The third objective of WP8 is to develop a prototype tool that facilitates and partly automates linkage between the data provided by other WPs. Once we have established the potential data joins, in terms of both matchable datasets and matchable concepts, we will develop a tool that supports and (partly) automates the data merging process. Based on input from the researcher (the data to be merged, the variables on which to match, the time granularity, and any recoding), the tool merges the data and creates a dataset that contains the relevant variables from two or more original datasets. We will conduct a case study to demonstrate the usefulness and applicability of the tool.
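The researcher inputs listed above can be illustrated with a small merge routine. This is a minimal sketch, not the planned prototype: the function names, the record layout, and the example data are hypothetical, and an inner join (keeping only records that match in both datasets) is assumed as the join behavior.

```python
from datetime import date


def time_key(day, granularity):
    """Truncate a date to the requested time granularity."""
    if granularity == "year":
        return (day.year,)
    if granularity == "month":
        return (day.year, day.month)
    if granularity == "day":
        return (day.year, day.month, day.day)
    raise ValueError(f"unsupported granularity: {granularity!r}")


def apply_recoding(row, recode):
    """Return a copy of row with the listed variables recoded."""
    out = dict(row)
    for variable, mapping in recode.items():
        if variable in out:
            out[variable] = mapping.get(out[variable], out[variable])
    return out


def merge_datasets(left, right, match_on, date_field, granularity, recode=None):
    """Inner-join two lists of records on match_on plus a truncated date."""
    recode = recode or {}

    def key(row):
        values = tuple(row[variable] for variable in match_on)
        return values + time_key(row[date_field], granularity)

    # Index the recoded right-hand records by their match key.
    index = {}
    for row in right:
        recoded = apply_recoding(row, recode)
        index.setdefault(key(recoded), []).append(recoded)

    merged = []
    for row in left:
        recoded = apply_recoding(row, recode)
        for match in index.get(key(recoded), []):
            combined = dict(match)
            combined.update(recoded)  # on a name clash, the left value is kept
            merged.append(combined)
    return merged


# Hypothetical input data, invented purely for illustration.
news = [
    {"country": "Netherlands", "date": date(2019, 3, 14), "sentiment": 0.2},
    {"country": "Germany", "date": date(2019, 5, 2), "sentiment": -0.1},
]
polls = [
    {"country": "NL", "date": date(2019, 3, 1), "support": 31.5},
    {"country": "DE", "date": date(2019, 6, 1), "support": 28.0},
]
country_codes = {"country": {"Netherlands": "NL", "Germany": "DE"}}

merged = merge_datasets(news, polls, match_on=["country"], date_field="date",
                        granularity="month", recode=country_codes)
print(merged)
```

With monthly granularity only the Dutch records match; switching `granularity` to `"year"` would also pair the German records, which shows how the chosen granularity determines what counts as a match.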