Abstract:
Qualitative data analysis (QDA) is a time-consuming and potentially unreliable research activity. In qualitative research, a number of tasks must be repeated for every new research case, even when the cases are closely related or fall within the same area of study.
Existing QDA applications provide a variety of tools and features that help researchers manipulate qualitative data, and using them offers a clear advantage over completing the same tasks manually. However, the available QDA tools are little more than a digital pencil and paper: they are not equipped with any automatic processing features.
A computer-assisted framework was developed to help researchers conduct qualitative data analysis. The framework leveraged the GATE platform, together with Natural Language Processing and Knowledge Extraction, to build an automatic text annotation and summarization system. A performance model, developed from the literature on lean manufacturing implementation strategies, was converted to an ontology. A lexicon database for lean implementation practices was also developed. A dataset from a previous research study focusing on lean implementation practices was used for development and testing. Several summarization techniques were developed and tested, and a customized sensitivity analysis method was developed and used to compare the summarization algorithms systematically. For the best summarization algorithm, an average F-score of 0.6567 was recorded, with a recall of 0.85 and a precision of 0.55, demonstrating the feasibility of automatic processing on an unstructured qualitative dataset.
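
As an illustrative sketch (not part of the original study), the balanced F-score is the harmonic mean of precision and recall. The function name f_score and the beta parameter below are assumptions made for this example. Applied to the rounded precision and recall quoted above, it gives about 0.668, slightly higher than the reported average of 0.6567. A gap of this size would be expected if the reported figure is a mean of per-case F-scores rather than the F-score of the averaged precision and recall, but that is an assumption, since the averaging procedure is not described here.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1.0 gives the balanced F1 score.

    Hypothetical helper for illustration only.
    """
    # Avoid division by zero when both inputs are zero.
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded values quoted in the abstract.
f1 = f_score(precision=0.55, recall=0.85)
print(f"F1 from rounded precision and recall: {f1:.4f}")
```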